TA225 The Technology of Music
Sound Processes
Chapter 1 Desktop Sound
This publication forms part of an Open University course, TA225 The Technology of
Music. Details of this and other Open University courses can be obtained from the
Course Information and Advice Centre, PO Box 724, The Open University, Milton Keynes
MK7 6ZS, United Kingdom: tel. +44 (0)1908 653231, email general-enquiries@open.ac.uk
Alternatively, you may visit the Open University website at http://www.open.ac.uk
where you can learn more about the wide range of courses and packs offered at all
levels by The Open University.
To purchase a selection of Open University course materials visit the webshop at
www.ouw.co.uk, or contact Open University Worldwide, Michael Young Building,
Walton Hall, Milton Keynes MK7 6AA, United Kingdom for a brochure. tel. +44 (0)1908
858785; fax +44 (0)1908 858787; email ouwenq@open.ac.uk
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system,
transmitted or utilised in any form or by any means, electronic, mechanical, photocopying,
recording or otherwise, without written permission from the publisher or a licence from the
Copyright Licensing Agency Ltd. Details of such licences (for reprographic reproduction) may be
obtained from the Copyright Licensing Agency Ltd of 90 Tottenham Court Road, London W1T 4LP.
Open University course materials may also be made available in electronic formats for use by
students of the University. All rights, including copyright and related rights and database
rights, in electronic course materials and their contents are owned by or licensed to The Open
University.
In using electronic course materials and their contents you agree that your use will be solely
for the purposes of following an Open University course of study or otherwise as licensed by
The Open University.
Except as permitted above you undertake not to copy, store in any medium (including electronic
storage or use in a website), distribute, transmit or retransmit, broadcast, modify
or show in public such electronic materials in whole or in part without the prior written consent
of The Open University or in accordance with the Copyright, Designs and Patents Act 1988.
Printed in the United Kingdom by The Burlington Press, Foxton, Cambridge CB2 6SW.
Chapter 1
Desktop Sound
CONTENTS
Aims of Chapter 1
1 Introduction
Workstation
2.1.4 Metering
2.3.2 S/PDIF
2.4.1 Cables
2.4.2 Connectors
2.5.1 Inputs
2.5.2 Outputs
3 Storing sound
3.4.1 AU
3.4.2 AIFF
4 Editing
4.1.3 Normalisation
5 Mixing
6 Adding effects
6.1 Equalisation
6.2.1 Echo
6.2.2 Reverberation
6.3.1 Flanging
6.3.2 Chorus
6.5.1 Invert
6.5.3 Vocoder
6.6.4 Summary
7 External control
8 Summing up
Summary of Chapter 1
Appendices
Acknowledgements
AIMS OF CHAPTER 1
1 INTRODUCTION
This chapter will look at the technology behind the processes involved
in recording musical performances. A case study approach using a
commercial desktop sound device will be used so that you can see how
the technology described is used in practice. In addition, through the
computer activities associated with this chapter, you will get practical
experience of making recordings.
The chapter contains specific details of a number of real devices and
also details of a number of different audio standards. These details are
given for comparative/illustrative purposes and you are not expected
to learn the details of any of them. However, if presented with the
same or similar details in an assessment question, you should be able
to understand and answer questions involving them.
The ‘output’ of this chapter is a master recording of a musical
performance. Chapter 4 in this block will take this master recording as
its starting point and show the processes and technologies involved in
the mass distribution of such a master recording. I shall consider a
‘master’ recording to be a single stereo recording, in other words a
master recording will always comprise two sound channels (the left
hand channel and the right hand channel) which are separate in terms
of the sound they carry, but linked in that they are always stored and
played back together.
Can you think of any technical developments that might have had an
influence on bringing about desktop sound?
Comment
Some possible technical developments are:
• developments in technology that allow more electronic circuitry to
be concentrated into a smaller space, and therefore permitting
more complex operations to be performed;
• the development of desktop computers that can operate fast
enough to be able to process digital sound data in real time;
• the development of sophisticated techniques for processing and
manipulating digital sound;
• the development of digital storage devices (computer hard disks,
CDs, DVDs, etc.) to enable practical lengths of sound to be stored
in digital form.
Desktop computers and dedicated audio processors are now being used
since they are available at a price that is not beyond the individual
with a need for, or a serious interest in, making professional quality
recordings.
Whichever desktop system is used (computer or dedicated audio
processor), the basic processes involved in making a master recording
are the same. These processes are: assembling/recording the sounds
that are to be used, editing, mixing and adding effects, as illustrated
in Figure 1. In this section I will outline these processes, and in
later sections I will expand on them and introduce the technologies
that are involved. Then, in Chapter 4 of this block some of the more
commercial aspects of the production sequence will be described.

Figure 1 Producing a master recording [flow diagram: analogue and digital input sounds enter the sound recording device, where the sounds are assembled, edited, mixed and have effects added to produce the final stereo master recording]
Comment
My list of the main considerations is given below, but you may have
thought of some others.
• Does the desktop sound device have facilities to record multiple
tracks?
• If the device can record multiple tracks, are there sufficient
available to record all the sound elements of the piece of music?
• How easy (or expensive) is it to recreate the original recording set-
up if a re-recording becomes necessary at a later date?
• Are all the elements of the recording available at the same time?
• Can any ‘effects’ be applied to individual sound element(s) as the
recording is being made?
You may not have thought of this last point, but as you will see as you
study this chapter, some effects can involve a great deal of processing
and they may not be able to be done fast enough to keep up with the
progress of the recording.
For all but the simplest recordings, such as where a stereo recording
is being made using just a pair of microphones, and assuming suitable
equipment is available, it is better to record the elements separately
and then combine them later. This is because the blending can be done
at leisure, away from the pressures of time that a recording session
brings, and so hopefully produce a better result. In addition, if there
is a musical or technical problem with one sound element that cannot be
corrected later, it might be possible to re-record this one element on
its own and so save the cost of having to re-record the whole piece.
Assembling the sound sources involves input and output connections,
signal levels, cables and connectors, and this is the subject of Section 2. In
addition, the sound elements have to be recorded (or stored)
somewhere, and this aspect is studied in Section 3.
You have already carried out some simple editing operations using
your computer earlier in the course.
This stage may also incorporate some alteration of the overall levels or
dynamic range of individual tracks or groups of tracks to make best use
of the available dynamic range and so to maximise the signal-to-noise
ratio.
Increasingly common now is the ability to create a scene list or edit list
which contains a list of the editing actions that are to be carried out in
real time as the recording is being played. For example, the edit list
may contain instructions to switch on and switch off different tracks at
predefined points – a process known as punch in and punch out
respectively.
An advantage of using an edit list is that the original tracks are usually
not altered, and so re-editing can be carried out as many times as
required without the original material being changed or destroyed – a
process known as non-destructive editing. Another advantage is that
re-editing can be done very quickly, and the new edit can easily be
compared with the previous one. Edit lists also allow effects to be
switched on and off and other operations to be carried out in real time
as playback proceeds.
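To make the idea of an edit list concrete, here is a minimal sketch in Python (the names and structure are invented for illustration and are not those used by the course's software) of how punch-in and punch-out actions might be represented without ever altering the original tracks:

```python
from dataclasses import dataclass

@dataclass
class EditAction:
    time_s: float   # position in the recording, in seconds
    track: int      # which track the action applies to
    action: str     # 'punch_in' (switch track on) or 'punch_out' (switch off)

# The edit list is just an ordered list of actions; the original audio
# is never modified, so the list can be re-edited as often as required.
edit_list = [
    EditAction(0.0,  1, 'punch_in'),
    EditAction(12.5, 2, 'punch_in'),   # bring in track 2 part-way through
    EditAction(30.0, 2, 'punch_out'),  # and drop it again
]

def track_enabled(track: int, t: float) -> bool:
    """Replay the edit list up to time t to decide if a track is audible."""
    enabled = False
    for a in sorted(edit_list, key=lambda a: a.time_s):
        if a.track == track and a.time_s <= t:
            enabled = (a.action == 'punch_in')
    return enabled

print(track_enabled(2, 20.0))  # True: between the punch in and punch out
```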
[Figure: the front panel of the audio workstation, showing the display, the data entry/control section, and the work navigate and quick navigate sections]
Follow the steps associated with this activity, which you will find in
the Block 3 Companion. These will give you a basic familiarisation with
the course's music recording and editing software.
The r.m.s. amplitude is a form of signal amplitude averaged over time;
for a sine wave of peak amplitude Vp, the r.m.s. amplitude is 0.71 × Vp.
Although this formula for the r.m.s. value is only exact for sine wave
signals, it is sufficiently accurate to be used as a means of specifying
signal level.

Figure 3 (a) Peak and peak-to-peak values of a sine wave; (b) r.m.s. value of a general sound signal
Using the term volume implies some sort of average level, e.g. the
volume setting on your hi-fi when listening to a piece of music – there
will be loud passages and soft passages, but there is an overall volume
control setting that gives a comfortable listening level. Thus input
sensitivities are usually specified in terms of r.m.s. values as
illustrated in Figure 3(b).
However, as you will see later, peak amplitudes are also important, as
they give a measure of the largest instantaneous signal level that needs
to be accommodated.
These reference levels are all around the level of a line signal (see
under Section 2.1 above) which means that low-level signals from
devices such as microphones have negative decibel figures (i.e. they
have a lower value than the reference level) and signals with
amplitudes greater than ‘line’ level (e.g. the sort of levels required by
loudspeakers) have positive decibel values. In contrast, for the SPL
scale where 0 dB is specified as the quietest sound pressure level, any
audible sound will always have a positive decibel value.
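As an illustration of these measures, the following Python sketch (my own, not part of the course materials) computes the r.m.s. value, the peak value and the level in dBu of a block of samples, using the standard 0 dBu reference of 0.775 V r.m.s.:

```python
import math

DBU_REF_VOLTS = 0.775  # 0 dBu reference level: 0.775 V r.m.s.

def rms(samples):
    """True r.m.s. value of a block of samples (exact for any waveform)."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def peak(samples):
    """Largest instantaneous magnitude in the block."""
    return max(abs(s) for s in samples)

def volts_to_dbu(v_rms):
    return 20 * math.log10(v_rms / DBU_REF_VOLTS)

# A sine wave with peak amplitude Vp = 1.0 V:
N = 1000
sine = [math.sin(2 * math.pi * n / N) for n in range(N)]
print(f"r.m.s. = {rms(sine):.3f}  (0.71 x peak, as in Figure 3)")
print(f"peak   = {peak(sine):.3f}")
print(f"level  = {volts_to_dbu(rms(sine)):+.2f} dBu")
```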
2.1.4 Metering
Audio devices, then, usually provide a number of inputs with different
sensitivities, and each usually has an associated input level control for
fine adjustment of the level. I mentioned that one of the reasons for
providing inputs with different sensitivities is to prevent overloading
of the input circuitry, which causes distortion of the signal. Overload
must therefore be avoided at all costs – particularly with digital signals
where even small amounts of overload are very noticeable.
However, in order to maximise the signal-to-noise ratio the input level
control should be set as high as possible so as to use as much of the
dynamic range of the device as possible. How then can the optimum
input level be set? The answer is to provide some sort of sound level
indication.
To monitor the audio level two basic metering systems are used: VU
(volume unit) and PPM (peak programme meter).
VU (volume unit) is a metering system which gives an indication of
the signal level which is roughly proportional to the perceived volume.
However, because the meter does not show short large transients in the
signal (short loud passages), these may cause audio distortion without
there being any indication on the meter. This makes the VU meter of
limited value, especially for use on digital systems.
Sometimes a large signal transient can be so short that no distortion is
audible. PPMs (peak programme meters) therefore are designed to give an
indication only if the transient is large and long enough to be likely to
produce audible distortion. PPMs were originally designed for analogue
signals, but for digital signals a much simpler peak value indication
(however short) is usually all that is needed. Any type of peak level
indication though is a much more useful indication of sound level from
which to set an input level control, particularly where the signal is or will
be digitised. Several different PPM scales exist, but two of the more
common are the British Broadcasting Corporation's (PPM (BBC)), which is
scaled 0 to 7, and the European Broadcasting Union's (PPM (EBU)).
There are many different ways of displaying sound levels, from analogue
meters to strips of lights with varying numbers of lights in the strip,
but most are based on average (VU) or peak (PPM) measurement of the
audio signal. Sometimes a meter might indicate both types of measurement
by having the peak sound level retained on the display either for a
short time or until reset manually. Figure 4 shows some of the more
common metering scales for line-level signals compared with a dBu scale.

Figure 4 Some common metering scales for line-level signals compared with the dBu scale: (a) VU, (b) PPM (BBC) and (c) PPM (EBU)
[Figure: a microphone's audio signal superimposed on the constant +48 V phantom power voltage]
This constant voltage has no effect on the audio signal as it simply gives
it a constant offset which is removed in the audio device before the
signal is processed. (Of course the voltage of the phantom power must
be absolutely constant and not be contaminated with noise or other
electrical interference or these will be fed straight into the input along
with the wanted microphone signal.)
Care must be taken to turn the phantom power off when using a micro-
phone that does not require it otherwise the microphone might be damaged.
[Figure: a waveform clipped where it exceeds the maximum positive and negative output voltages]
In this activity you will use the course’s sound editing software to
investigate what a signal sounds like when it becomes clipped. As this
involves a feature of the editor you have not used before, you will find
detailed steps for this activity in the Block 3 Companion.
Run the course’s sound editing software, load the sound file associated
with this activity and then follow the steps associated with this
activity in the Block 3 Companion.
Comment
I hope you heard how clipping of the signal produces quite audible
distortion in the form of additional harmonics.
Comment
There are a number of problems that can occur, my initial list is given
below, but you may have thought of other ones:
• the rate at which the individual bits of the serial data are sent is
not known;
• the ‘sense’ of the digital data is not known (i.e. the receiver might
interpret a binary 1 as a 0 and vice versa);
• the receiver does not know how to detect when the bits for one
sound sample end and bits for the next begin.
What is the data rate of a serial connection carrying a stereo digital
sound signal that has been sampled at 44.1 kHz (44 100 Hz) and quantised
using 16 binary bits? Ignore any additional synchronisation or control data.
Comment
If the sample rate is 44.1 kHz, then a new sample from both sound
channels must be sent every 1/(44.1 × 10³) s ≈ 23 µs (23 millionths of a
second).
Each sample contains 16 bits and there are two sound channels, so
within the above time, 32 bits of data need to be sent along the serial
connection. Thus, in order to send this number of bits within a 23 µs
period, each bit must be sent within about 23/32 ≈ 0.71 µs. Therefore the
data rate is 1/(0.71 × 10⁻⁶) ≈ 1.4 million bits per second (more
precisely, 44 100 × 32 = 1 411 200 bits per second).
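The same calculation can be checked with a few lines of Python (an illustrative sketch only):

```python
sample_rate = 44_100    # samples per second, per channel
bits_per_sample = 16
channels = 2            # stereo

# Each sample period must carry one sample from every channel:
bits_per_frame = bits_per_sample * channels         # 32 bits
data_rate = sample_rate * bits_per_frame            # bits per second
print(data_rate)                                    # 1411200 bits/s

# Equivalently, via the time available per bit:
sample_period_us = 1e6 / sample_rate                # about 22.7 us
bit_time_us = sample_period_us / bits_per_frame     # about 0.71 us
print(f"{sample_period_us:.1f} us per frame, {bit_time_us:.2f} us per bit")
```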
As you should be able to appreciate from the above activity, the data
rate required means that the bandwidth of the connection must be
much higher than for a single analogue audio channel (i.e. 20 kHz).
This means that special cable and connectors may need to be used
instead of ordinary analogue audio ones.
The second aspect mentioned above is sometimes called bit
synchronisation as it refers to making sure each individual binary bit
is received correctly. This is addressed in the AES/EBU specification
by specifying that the digital signal contains at least one transition
between the logic 0 voltage level and the logic 1 voltage level for every
bit of the sound data as illustrated in Figure 9. By doing this, even if
the data stream contains a long stream of all ones or all zeros, and/or
has been degraded through a long connecting lead, the receiver can
recreate the original stream of binary digits. How does it do this?
Figure 9 (a) Original stream of binary ones and zeros that form the
digital sound data (the example shown is 1 0 0 1 1 0); (b) the signal
that is transmitted; (c) the degraded signal that is received by the
receiver, showing the crossing points the receiver uses to recreate
the original data stream
The receiver uses timing rather than detecting voltage levels to recreate
the original digital signal. The receiver simply has to measure the time
between each crossing of the signal between the two voltage levels.
When it does this, it should find there are two different times – one
being roughly twice the time of the other. From Figure 9(c) you should
see that two short times in succession will indicate a binary 1 and a
single double-length time a binary 0.
Using this system also means that the data rate does not have to be
fixed and can be set depending on the number of channels and the
number of bits per sample – remember that the data must be sent in
real time, so the more channels and/or bits per sample, the more data
that has to be sent within the time for a single sample. Such a system
as this is known as a self-clocking system as it does not require any
separate synchronising signal to be able to detect the bits correctly.
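The following Python sketch (invented for illustration; real receivers do this in hardware) shows the timing-based decoding just described: the receiver classifies each interval between transitions as short or long, reading two short intervals in succession as a binary 1 and one double-length interval as a binary 0. Because only the ratio of the intervals matters, the same decoder works whatever the data rate:

```python
def decode_intervals(intervals, short):
    """Recover bits from the measured times between signal transitions.

    'short' is the nominal half-bit time. Two short intervals in a row
    represent a binary 1; one double-length interval a binary 0 (the
    self-clocking scheme described in the text).
    """
    bits = []
    i = 0
    while i < len(intervals):
        if intervals[i] > 1.5 * short:   # one long interval -> 0
            bits.append(0)
            i += 1
        else:                            # two short intervals -> 1
            bits.append(1)
            i += 2
    return bits

# One bit period = 2 'short' units: a 1 gives short,short; a 0 gives long.
# The stream 1 0 0 1 1 0 therefore arrives as these intervals:
intervals = [1, 1, 2, 2, 1, 1, 1, 1, 2]
print(decode_intervals(intervals, short=1))   # [1, 0, 0, 1, 1, 0]
```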
The third aspect mentioned above – that of determining where the bits
for one sound sample end and those for the next begin – is achieved by
dividing the data into different sections called sub-frames, frames and
blocks, and ensuring that each of these are identified by a unique
pattern of bits. Such a procedure is sometimes referred to as word
synchronisation or frame synchronisation, as it refers to the ability of
the receiver to detect specific sections of the data – this of course is in
addition to bit synchronisation which permits the receiver to detect
each individual bit correctly in the first place.
In the AES/EBU specification a sub-frame contains the data for a
single digital audio sample for one sound channel plus some
associated synchronisation and auxiliary data (Box 5). A frame
consists of one sub-frame from each of the sound channels strung
together one after the other – normally there are only two channels
(for stereo), so a frame would contain two sub-frames (Box 6). Frames
are sent at the rate of the original sample rate (i.e. one frame is sent
within the time interval between each sound sample) so the data
transfer can operate in real time.
So that the receiver can obtain information about the form of the sound
signal (e.g. how many channels, how many bits per sample), there
must be some further information contained in the data stream. Each
sub-frame contains a few spare bits which can be used for this
purpose; however, the number of these bits is insufficient to carry all
the information that the receiver requires. Unlike the sound data, this
type of information does not change from sample to sample (i.e.
between sub-frames), so the spare bits in each of 192 consecutive
frames are collected together to form a special set of data called the
channel status block (see Box 7). The amount of data that can be
carried in this block is then sufficient to provide not only information
about the form of the audio data, but also some additional user-defined
information as well. Each sound channel has its own channel status
block, but of course much of the information between channels will be
the same and will be duplicated in each block.
Figure 10 The AES/EBU sub-frame format (the preamble, the auxiliary and main sound data bits, then the validity flag, user data, channel status and parity bits)
The first four bits of the sub-frame are called the preamble and they
contain a unique sequence of bits that identify not only the start of a
sub-frame, but also its type. There are three different preambles,
usually denoted X, Y and Z. The X preamble indicates the start of a
sub-frame containing a sample from sound channel A (the left-hand
channel in a stereo system), the Y preamble indicates the start of a
sub-frame containing a sound sample from channel B (the right-hand
channel) and the Z preamble indicates not only the start of a sub-frame
containing a sample from sound channel A, but also the start of a
192-frame block. Since the first frame in a block always contains a
sample from channel A, once in every 192 frames the X preamble is
replaced by a Z preamble to indicate the start of a block of data.

In order to separate the data stream into sound samples, therefore, the
receiver simply looks for these preambles and from these it can identify
not only the start of each sub-frame, but also the start of each frame
and of each block. How is this done? The answer is that each of the
three preambles has a unique data stream with special longer and shorter
zero crossing times than occur during normal data, as illustrated in
Figure 11. Notice though that the overall length of each unique preamble
section still occupies four bit times.

Figure 11 Data streams for the X, Y and Z preambles (each occupying four one-bit periods)

After the preamble, the next 24 bits in the serial stream contain the
actual binary data for one sound sample. This allows sound data
quantised with up to 24 bits to be sent, but it is more usual to reserve
the first 4 bits for an auxiliary sound channel, and the remaining
20 bits for the main sound signal. However many bits are used for the
sound data, there must always be a total of 24 bits in this section
(unused bits are padded out with 0s).

The auxiliary sound channel is a low-quality voice-grade sound channel
that can be used in a studio situation to provide voice communication
without the need for any additional cables. This channel consists of a
digital sound stream quantised with up to 12 bits at a sample rate of
one third of the sample rate of the main sound channel (the 12 bits of
each sample are separated into three 4-bit sections and sent using the
4 auxiliary channel bits in three consecutive sub-frames of the main
channel – hence the sample rate needs to be one third that of the main
signal).

The last 4 bits of the 32-bit long sub-frame are used individually as
follows:
• bit 28, the V bit, is a validity flag which indicates whether the
sound data is reliable and is suitable for conversion to an analogue
signal or not. If this flag is set, then the data is either erroneous,
or is not sound data at all (for example it could be computer or
textual data for use in CD-I players);
• bit 29 is the U user data bit – the U bits from each sub-frame are
collected together and can be used for auxiliary user data;
• bit 30 is the C channel status data bit – the C bits from each
sub-frame are collected together to form the information that tells the
receiver the form of the sound data that is being sent;
• bit 31 is the parity bit – this bit provides a simple method of
checking for errors in the data stream. Parity will be considered
further in Chapter 4 of this block. For reasons beyond the scope of
this course, inclusion of the parity bit also improves the receiver's
ability to detect the preambles.
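To pull the sub-frame layout together, here is a simplified Python sketch (the function name and representation are invented for illustration; in the real interface the preambles are special waveform patterns and all bits are sent with the self-clocking coding described earlier) of how the 32 bit positions of a sub-frame are assigned:

```python
def make_subframe(sample_20bit, channel, frame_number,
                  aux_4bit=0, v=0, u=0, c=0):
    """Assign the 32 bit positions of one AES/EBU-style sub-frame:
    4 preamble positions, 4 auxiliary bits, 20 main sample bits, then
    the V, U, C and parity bits (bit ordering simplified)."""
    if channel == 'A':
        # Channel A starts each frame; every 192nd frame starts a block.
        preamble = 'Z' if frame_number % 192 == 0 else 'X'
    else:
        preamble = 'Y'
    aux = [(aux_4bit >> i) & 1 for i in range(4)]
    audio = [(sample_20bit >> i) & 1 for i in range(20)]
    payload = aux + audio + [v, u, c]
    parity = sum(payload) % 2   # even parity over the payload bits
    return [preamble] * 4 + payload + [parity]

sub = make_subframe(0x7FFFF, channel='A', frame_number=0)
print(len(sub))    # 32 bit positions in all
print(sub[:4])     # ['Z', 'Z', 'Z', 'Z']: the first frame of a block
```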
2.3.2 S/PDIF
The Sony/Philips digital interface (S/PDIF) standard is fully
compatible with the AES/EBU standard described in Section 2.3.1.
The form of the digital sound data and the sub-frame, frame and block
structure are exactly as described in the AES/EBU specification; the
only difference is in the interpretation of the channel status
information that is accumulated over one block of 192 sound samples.
The first bit in the channel status information (i.e. the channel status
bit from the first sub-frame in a block) is used to specify whether the
sound data is for consumer or professional use. If this bit is set, then
the channel status data is to be interpreted as described in the AES/
EBU specification. If this bit is zero, the channel status bits are to be
interpreted as described in the S/PDIF specification (see Box 8).
Important points to note about this specification are:
• serial copy management system (SCMS) is incorporated (this is a
method of preventing multiple copies of a recording being made
which will be described in Chapter 4 of this block);
• the specification caters for 2 or 4 channel sound;
• the source of the sound can be specified (e.g. CD, DAT);
• CD subcode data can be incorporated;
• there is much scope for future expansion.
As mentioned in Box 8, there is a significant amount of redundancy
built into the specification of the channel status data to allow for
future expansion in the capabilities of the interconnection. The uses to
which this interconnection is put, and the types of devices which need
to be interconnected is continually evolving, and the specification can
therefore be enhanced to cater for new applications, whilst still
retaining compatibility with all previous uses.
Why is the bit rate of a serial AES/EBU data stream related to the
number of sound channels it contains?
The problem with increasing the bit rate too much is that signal loss
and interference on the interconnecting lead increase, and the maximum
length of lead that can be used eventually reduces to an impractical
value.
To solve this problem, Sony, Mitsubishi, Neve and SSL have jointly
produced a specification based on the AES/EBU specification called
the multi-channel audio digital interface (MADI) for transferring up to
56 simultaneous digital sound channels in real time on a single
connecting wire.
The details of MADI are beyond the scope of this course, but in
essence, a sample from each sound channel is formed into an AES/EBU
sub-frame, and 56 of these subframes are strung together to form a
frame that is transmitted within a single sample period. However, to
allow the receiver to decode the data stream correctly and to enable the
required amount of data to be sent using a practical bit rate:
• the bit rate is set to a constant value (100 Mbits per second);
• a different coding scheme is used for the individual bits which
means that a separate timing signal is necessary to ensure the data
can be decoded correctly by the receiver.
To enable a constant data rate to be achieved, sample data for all 56
channels is always sent, but for unused channels it is set to zero. The
channel status data indicates how many of the channels are in use.
As mentioned above, this is only important for long lead lengths and/or
high-frequency signals such as those used in digital audio connections,
where cables with a specific impedance, designed for such uses, are
available. Figure 13 shows in more detail than Figure 5 the
construction of a typical screened cable that can be used for both
analogue and digital audio.
[Figure 13: construction of a screened cable, showing the inner conductor and its insulation]
The digital systems mentioned in Section 2.3 can all be used with an
optical connection, and such inputs and outputs are increasingly being
provided. In this type of connection, electrical signals and wires are
replaced by light and a fibre-optic cable. By a process called total
internal reflection, light can be transmitted down a flexible
transparent tube – see Figure 14.
[Figure 14: light transmitted along a fibre-optic cable by total internal reflection]
2.4.2 Connectors
One of the most common connectors for consumer audio is the phono
plug (Figure 15(a)), but although cheap, these are not very robust, and
do not provide a very secure electrical or mechanical connection (there
is no locking mechanism, for example, that prevents the plug being
accidentally pulled out of its socket). Also very common is the jack plug
which is available in a variety of different sizes. For professional audio,
the 1/4 inch jack plug is the most common connector (Figure 15(b)).
Figure 15 Common audio connectors: (a) phono; (b) 1/4" jack; (c) TRS jack;
(d) XLR; (e) BNC; (f) DNP optical
For connections where there are two signal wires and the common
ground connection (for balanced connections and where a stereo pair
of audio channels is combined in one connection), a jack plug with an
additional ring connection is used (Figure 15(c)). In this case, the
two signal wires are connected to the tip and the ring, and the ground
connection to the sleeve (where there is only one signal wire it is
always connected to the tip, and for a stereo connection the left
channel is always connected to the tip). This type of jack plug is
sometimes referred to as a TRS (Tip, Ring, Sleeve) connector.
For professional applications another common connector is the
three pin XLR connector (Figure 15(d)). These connectors are very
robust, and lock together so that they cannot accidentally be pulled
apart. Pin 1 is always the earth or ground connection, and pins 2
and 3 are the balanced signal connections (pin 2 only is used where
there is a single signal connection). XLR connectors are always used
for microphones that require a +48 volt phantom power supply (see
Section 2.1.6).
At present there is no standard connector for digital links (AES/EBU etc.)
although one of the above types is usually used. In consumer devices a
phono plug is commonly used, whereas an XLR connector is common
with professional devices. For high frequency digital connections
(e.g. MADI), a BNC connector is used to minimise losses and
reflections caused by impedance mismatching (Figure 15(e)).
For optical digital links, a special connector must be used that efficiently
transfers the light between the end of the optical fibre and the light
source/receiver. There are a variety of different connectors in use, but a
common one in the consumer field is the DNP connector shown in
Figure 15(f). DNP stands for ‘dry non-polish’ and these connectors are
designed for easy attachment of the optical fibre without glue (‘dry’) or
the need to polish the end of the fibre after cutting. Of course the
consequence of this easy connection system is more light loss at the
connection, but this is not significant given the intended use of these
connectors for short cables in consumer applications.
What type of cable and connector is suitable to use for the connection
between a microphone that supplies a balanced output and the
microphone input on an audio device, assuming the microphone is
being used ‘on location’? Explain your answer.
2.5.1 Inputs
There are eight analogue combined microphone and line inputs, which
all offer balanced connections. Since they are dual purpose inputs
(they can be used for both low-level and line-level signals), they must be
able to cope with a wide range of signal levels without causing
distortion. To do this, each input has an associated level control
that can be adjusted to provide a suitable signal level for the main
recording and mixing sections of the device – indeed these controls are
marked ‘LINE’ and ‘MIC’ to indicate the setting for each type of input
(Figure 17). In other devices there may be a switch to select the input
sensitivity, or sometimes two sets of inputs are provided, one for low-
level and one for line-level signals.
Inputs 3 to 8 all use 1/4 inch jack plugs with the tip–ring–sleeve (TRS)
arrangement to provide a balanced 2-wire plus ground connection.
Input 8 has an additional jack connector with a high impedance input.
This can be used for high impedance audio transducers such as some
types of guitar pickup and piezoelectric microphones as mentioned in
Section 2.1.3.
All eight inputs have a specified input level range from –46 dBu to
+ 4 dBu and can therefore cater for a wide range of signal sources from
a balanced condenser microphone that requires phantom power
through to a line-level signal from a synthesiser. Remember though
that none of these are stereo inputs, so two channels must be used if a
stereo source is to be connected. This means that the device is limited
to just four stereo signal sources at any one time.
2.5.2 Outputs
The AW16G provides three stereo analogue unbalanced outputs:
• a main line-level output (nominal level –10 dBV) on separate left
and right jack connectors;
• an auxiliary line-level output (nominal level –10 dBV) on separate
left and right jack connectors designed for monitoring purposes or
for adding effects;
• a headphone output, with a nominal power output of 100 mW,
provided by a single TRS jack connector.
This is a fairly basic range of outputs, but again provided to satisfy the
intended uses of the unit. A more sophisticated mixer unit would
probably provide additional outputs such as a line-level output for
each input – or more probably a combined input–output connection
using a TRS jack plug that can be used to add or punch in effects to a
particular input channel.
In this activity you will use the course’s music recording and editing
software to make a recording and so learn a little more about the
program’s features and operation. You will find the steps associated
with this activity in the Block 3 Companion.
3 STORING SOUND
ACTIVITY 17 (EXPLORATORY)
Comment
First of all, for reasons of quality as will be explained in Chapter 4, the
MiniDisc is not recommended for use in any of the mastering stages
(although, many people do use MiniDiscs in such situations, and do
obtain good results).
For the initial recording of the performance, some form of multitrack
recorder is needed. This limits the choice to DVD-audio, ADAT and
hard disk since, apart from solid state memory, all the others can only
record one stereo channel. Solid state memory is a definite possibility
for the future, but at present the practical capacity of a solid state
digital recorder is not sufficient to record 45 minutes of high
quality 6-track sound. In addition, some types of solid state memories
lose their contents when the power is switched off which makes
recordings rather vulnerable.
For the mix-down stage, the type of recorder used to record the
multitrack originals would have to be used again, but the mixed-down
version could be recorded on any of the above systems. However, this
assumes the mix-down stage does not involve any editing of the
material. If this is needed as well, then a hard disk or solid state
recorder would be the ideal choice as they both allow instant access to
any part of the recording. Again though, fully solid state recorders are
not at present a practical proposition.
The final master stereo recording needs to be stored in a permanent
form where there is no loss of quality, either through the recording
system, or over time. The choices here are any writeable version of CD
(including SACD or DVD-A if writeable versions exist), digital
magnetic tape and possibly hard disk (or any type of backup storage
designed for computer data).
From the above activity, it is clear that hard disk recording is one of
the most useful formats to use in all the stages of the production of a
stereo master recording. In addition, although use of electronic
memory for the storage of complete recordings is not practical at
present (in 2004), such memory is used in conjunction with hard disk
stores (and indeed with most other digital storage methods) for the
temporary storage of digital audio data during processing. So it is
worth looking at both of these types in a little more detail here.
Discussion of the other systems will be left until Chapters 4 and 5.
ACTIVITY 18 (SELF-ASSESSMENT)
As your answer to the above activity should have told you, a 64 Gbyte
hard disk can store over 100 hours of stereo sound, or 50 hours
of 4-track sound, 25 hours of 8-track sound, etc. (all assuming the
standard CD format of 44 100 samples per second for each channel,
each quantised using 16 bits).
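These figures can be checked with a short calculation (a Python sketch, assuming decimal gigabytes):

```python
bytes_per_second = 44_100 * 2 * 2    # sample rate x 2 bytes x 2 channels
disk_bytes = 64 * 10**9              # a 64 Gbyte disk

hours = disk_bytes / bytes_per_second / 3600
print(f"stereo:  {hours:.0f} hours")           # about 100 hours
print(f"4-track: {hours * 2 / 4:.0f} hours")   # half that, and so on
print(f"8-track: {hours * 2 / 8:.0f} hours")
```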
Box 9 gives some brief details about the construction and operation of
a hard disk unit often called a hard disk drive (HDD).
One of the major advantages of hard disks, particularly for editing, is
that the time needed for the hard disk to retrieve any section of a
recording is roughly the same, and this can dramatically reduce editing
time (this is known as a random access device). This is different from
a tape system where it clearly takes longer to access, say, the end
section of a recording when the tape is positioned at the start than it
would do to access a point half way through.
The other aspect of hard disks that has to be considered is their read/
write speed and, coupled with this, the time between instructing the
disk to read or write a particular section, and the time when the data
actually starts to be stored or read. There are two components to this
To aid location of the required data, the data is divided into radial sections called
sectors as illustrated in Figure 19. To read the data, the read/write head is
positioned at the correct track, and the magnetic areas on the disk under the
head cause small electrical signals to be generated in the head as the disk rotates
under it. Re-recording data is done simply by overwriting old data with the new
data. All the read/write heads are connected to a single actuator, so they move in
and out of the disk together. They can be used simultaneously, but to reduce the
amount of electrical circuitry required, they are usually only used one at a time so
that the read/write circuitry can be shared between all the heads. This means
that data is recorded only in a single serial stream of bits.
Figure 19 Layout of data on the surface of a hard disk: each recording surface is divided into tracks, and the tracks into sectors
To minimise the access time for data, the disks rotate continuously as long as
power is applied – this means that there is no delay while the disk(s) reach the
correct rotational speed. To reduce wear on the disk, the heads are kept out of
direct contact with the disk, and they float on a cushion of air of typically 20 µm
(20 millionths of a metre) thick. If a head should touch the surface of a disk
through wear, a fault or excessive shock, then this can damage both the head
and the disk causing a disk crash to occur. Often this is fatal, and the whole
disk unit will have to be discarded. Although it is sometimes possible to recover
some data, inevitably some will be lost.
time – the time needed for the read/write heads to move to the correct
track, called the seek time, and the subsequent time spent waiting for
the required sector to appear under the read/write head as the disk
rotates, called the access time. Sometimes, the access and seek times
can cause problems, particularly for simultaneous multitrack recording.
This is because data is not necessarily stored in consecutive sectors or
tracks on each disk, and so during recording the heads will need to be
repositioned a number of times.
To overcome this problem, and as long as the disk write speed is
sufficiently fast, a hard disk audio recorder will contain some
temporary memory for data in transit consisting of computer-type
solid state memory. This memory is often made quite large so that it
can store temporarily a significant amount of sound (minutes rather
than seconds). For recording, this allows the recorder to take in the
regularly occurring sound samples, store them in the
temporary memory, and then send the samples to the hard disk in
bursts whenever a reasonably sized block of data has been accumulated
as illustrated in Figure 20. This provides spare time to cater for the
hard disk’s seek and access times, but it is only possible if the write
data rate is sufficiently fast to allow this to happen. The reverse
occurs on playback.
Figure 20 Using temporary solid state memory to cater for hard disk seek and access times [diagram: samples from the analogue-to-digital converter are held in temporary memory and written to the hard disk in bursts]
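The following Python sketch (all names invented; a real recorder does this in dedicated hardware or firmware) illustrates the scheme of Figure 20: samples arrive one at a time at the sample rate, but the disk is only written in large bursts:

```python
import io
import queue

BLOCK = 64 * 1024         # write to disk in reasonably sized bursts
buffer = queue.Queue()    # stands in for the temporary solid state memory
pending = bytearray()

def on_sample(sample_bytes):
    """Called regularly, once per sample period, by the A-to-D converter."""
    buffer.put(sample_bytes)

def disk_task(disk_file):
    """Runs independently of the sample clock: drains the buffer and
    writes a burst once a block has accumulated, leaving spare time
    to cover the disk's seek and access delays."""
    while not buffer.empty():
        pending.extend(buffer.get())
    if len(pending) >= BLOCK:
        disk_file.write(pending[:BLOCK])
        del pending[:BLOCK]

disk = io.BytesIO()       # stands in for the hard disk
for _ in range(40_000):
    on_sample(b'\x01\x02\x03\x04')   # 16-bit stereo: 4 bytes per sample
    disk_task(disk)
print(len(disk.getvalue()) // BLOCK, "bursts written")
```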
Now that hard disk units have become so compact and robust and are
able to store practical lengths of multichannel sound in real time, they
are being used throughout the recording, editing and mastering
processes. There is however the disadvantage that the disk units in
audio recorders are usually permanently installed in the device and
are not designed to be changed by the user. So, once a disk is full, the
data must be transferred elsewhere before more sound can be recorded
(assuming the original recordings cannot just be overwritten).
Electronic memory is attractive because of its speed of operation and
the fact that the access time for any part of the memory is the same –
see Box 10.
Random access memory (RAM) is electronic solid state memory that usually
loses its contents when the power is switched off and so cannot be used for
long-term data storage. Such memory is called volatile memory and is in contrast
to hard disk storage which retains its contents when the power is turned off.
The advantage of RAM is that not only, as its name implies, is it random access
(i.e. the access time for any part of the memory is the same), but the read and
write speed is extremely fast – much faster than for a hard disk – and there are
no additional seek or access delays. RAM is now available in capacities of
gigabytes but, bit-for-bit, RAM is much more expensive than hard disk storage.
In fact, some RAM memory will be present in all digital recorders and
audio workstations and it is just the same sort of memory found in
desktop computers. In audio workstations, and when computers and
their hard disks are used to record or play back sound, it is simply used
as short term intermediate storage as described in Section 3.2 above. In
audio workstations (and desktop computers), because of its very high
read/write speed, RAM is also used as the working memory for editing
operations, with sections of sound being swapped between the main
permanent storage medium (hard disk, digital tape etc.) and RAM as
and when required.
The solution to the problem of RAM being volatile is either to use a
battery back-up supply, or to use special non-volatile RAM, but this is
more expensive than volatile RAM, and may also take longer to write
data although read times are usually similar.
However, a relatively new type of solid-state memory called flash
memory is now becoming very popular.
Why is RAM not suitable for use in a portable solid state audio player?
3.4.1 AU
The AU (AUdio) file format is probably the simplest of formats for the
storage of digital sound samples. It was originally developed by Sun
Microsystems for use on computers using the Unix operating system
and it is commonly used for storing sound in a compressed form
suitable for transmission over the Internet as the compression results
in a very compact file.
An AU file is split into three sections:
• a header section that contains information about the form of the
digital sound samples – Box 12 gives brief details about the data
contained in this section;
• an optional comment section that can contain general textual
information (e.g. name, copyright details and so on);
• the sound data itself, stored in the form indicated by the parameters
in the header section.
If there is more than one track (e.g. two tracks forming a stereo
channel) the sound data from each track is interleaved. So, for example,
if there are 2 tracks, sample 1 from track A will be stored first, then
sample 1 from track B; this is followed by sample 2 from track A and
then sample 2 from track B, and so on.
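To make the layout concrete, here is a minimal Python sketch (the helper and file names are invented) that writes a two-track AU file with the standard AU header fields (magic number, data offset, data size, encoding, sample rate, number of channels, all big-endian) and the samples interleaved as just described:

```python
import struct

def write_au(path, left, right, sample_rate=44_100):
    """Write a minimal two-track (stereo) AU file with 16-bit linear PCM.

    'left' and 'right' are equal-length lists of 16-bit sample values.
    """
    # Interleave: sample 1 of track A, sample 1 of track B, sample 2 of A...
    data = b''.join(struct.pack('>hh', l, r) for l, r in zip(left, right))
    header = struct.pack('>4sIIIII',
                         b'.snd',      # magic number
                         24,           # offset to the sound data (no comment)
                         len(data),    # size of the data section in bytes
                         3,            # encoding: 16-bit linear PCM
                         sample_rate,
                         2)            # number of channels (tracks)
    with open(path, 'wb') as f:
        f.write(header + data)

write_au('test.au', left=[0, 1000, -1000], right=[0, -1000, 1000])
```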
Can you think of a reason why the data from each track should be
interleaved rather than storing them as separate blocks of data?
Comment
Interleaving the track data allows the sound stored in the file to be
played as it is being read. If the tracks were separated, then the reading
device would have to continually jump between sections of the file to
read individual samples. This may slow down the reading process so
much that playback in real time is not possible unless the data from all
the tracks is first read into temporary RAM memory.
3.4.2 AIFF
The Audio Interchange File Format (AIFF) is based on a standard file
format called the Interchange File Format (IFF) originally developed
by a company called Electronic Arts. AIFF is the version of this format
developed in the late 1980s for audio data. In Chapter 3 of this block you
will be introduced to the MIDI or music code version of the IFF format.
The chunk size follows, which for this chunk is always 18.
The ‘data’ part of this chunk contains the following information about the
sound data: the number of channels (tracks), the number of sample frames,
the number of bits per sample and the sample rate.
If there is more than one sound track, the data for each is interleaved
within this data chunk. Figure 22 shows the form of an AIFF sound
file and illustrates how the common and data chunks are contained
within the header chunk.
Figure 22 The form of an AIFF sound file: the AIFF form chunk (‘FORM’, chunk size, ‘AIFF’) contains the COMM chunk (‘COMM’, chunk size, sound parameter data) and the SSND chunk (‘SSND’, chunk size, sound sample data)
In addition to the above basic three chunk types, there are a number of
optional chunks that can give more information about the form and use
of the sound data. These are outlined below.
• Marker chunk. This chunk allows one or more particular points in
the sound data to be marked and given a name. This information
can be used for any purpose, but a common use is for specifying
loop points where the sound data is to be used as wave data for a
synthesiser (see Chapter 8 in Block 2).
• Instrument chunk. This chunk is used in conjunction with the
marker chunk to specify further details about how a synthesiser
should play back the sound – of course assuming the sound data
chunk contains wave data. Details such as tuning, note range, key
velocity and volume can be specified with this chunk.
• MIDI chunk. This chunk contains MIDI data, but not usually actual
note data. More usually this is used to store control (system
exclusive) data for an electronic instrument. (Note, the MIDI system
will be explained in detail in Chapter 3 of this block, so do not
worry if you do not understand anything about this chunk now –
the details are given here just for completeness.)
• Audio recording chunk. This chunk allows the channel status
information and user data contained in an AES/EBU or S/PDIF
signal to be included in the sound file (see Section 2.3 earlier).
• Application specific chunk. This chunk can be used for any
purpose – specific to one type of device or more general
information. For example this chunk could be used by a sound
editor program to store an edit list (see Section 1.2.2 above) and
other program-specific data.
[Figure: the form of a RIFF WAVE sound file: the RIFF WAVE chunk (‘RIFF’, chunk size, ‘WAVE’) contains the FORMAT chunk (‘FMT ’, chunk size, sound parameter data) and the DATA chunk (‘DATA’, chunk size, sound sample data)]
Following this is the chunk size which gives the number of bytes in the chunk,
excluding the 8 bytes used by the chunk type and size data.
The ‘data’ part of this chunk contains the following information about the
sound data: the format of the samples (e.g. uncompressed PCM), the number of
channels (tracks), the sample rate, the average number of bytes per second,
the number of bytes occupied by one sample from all channels and the number
of bits per sample.
The chunk size which follows specifies the number of bytes contained in the
chunk (excluding the 8 bytes required by the chunk type and size).
The ‘data’ portion of the chunk contains the actual sound data. As with AIFF,
if there is more than one sound track, the data for each track is interleaved.
Each sample uses an integral number of bytes, with unused bits being padded
with 0s.
Now this is where the situation gets a little bit more complicated!
Instead of using one data chunk the sound samples can be divided up
into a number of chunks within another type of chunk called a ‘LIST’
chunk. There are two possibilities for audio data chunks within a LIST
chunk – a DATA chunk as described in Box 18 and a ‘SLNT’ or silent
chunk. This chunk indicates a section of silence and it contains a single
parameter indicating how many ‘sample times’ of silence are to occur.
The idea behind this is to reduce the file size by not having to incorporate
a large string of zero samples for long periods of silence.
To complicate the situation even further, other chunk types are allowed
inside a LIST chunk, but any further discussion of this is beyond the
scope of the course.
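Ahead of the activity below, the following Python sketch (invented for illustration, not part of the course software) walks the top-level chunks of an AIFF or RIFF WAVE file and prints their IDs and sizes. Note that it does not descend into LIST chunks:

```python
import struct

def list_chunks(path):
    """Print the chunk IDs and sizes in an AIFF or RIFF WAVE file.

    Both formats start with a container chunk ('FORM' or 'RIFF') whose
    data begins with a form type ('AIFF' or 'WAVE') followed by the
    inner chunks. AIFF sizes are big-endian, RIFF sizes little-endian.
    """
    with open(path, 'rb') as f:
        container = f.read(4)
        endian = '>' if container == b'FORM' else '<'
        size = struct.unpack(endian + 'I', f.read(4))[0]
        print(container.decode(), size, f.read(4).decode())
        while True:
            header = f.read(8)
            if len(header) < 8:
                break
            cid, csize = struct.unpack(endian + '4sI', header)
            print(' ', cid.decode('ascii', 'replace'), csize)
            f.seek(csize + (csize & 1), 1)  # chunks are padded to even sizes

list_chunks('test.wav')   # or an .aif file saved from the editor
```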
In this activity you will use the course’s sound editing software to
create a short AIFF file and a short RIFF WAVE file. You will then
examine these to see if you can identify the various chunks that have
been described above.
Run the course’s sound editing software and follow the steps
associated with this activity in the Block 3 Companion.
Comment
Unfortunately, without the use of a program that displays each file’s
data in an understandable form (and a detailed knowledge of the file
formats which is not given here), you cannot confirm that each file
conforms exactly to its specification. Such an investigation is beyond
the scope of this course, but I hope that this activity has demonstrated
that actual sound files do seem to use the various chunks of data
described above.
Figure 24 Photograph of the inside of the AW16G audio workstation showing the hard disk drive, the optional CD drive and two RAM memory devices
4 EDITING
We have now looked at the basics of getting sound into and out of an
audio recording device (or computer), and how it can be stored within
it. In most cases, once the raw material is stored inside the device, it
will need to be processed in some way before a master recording can be
produced. At its simplest this might just be tidying up the lead in and
lead out sections (fading in and fading out), but most likely the
following processes will need to be carried out between making the
recording of a performance and the subsequent production of a final
master recording.
• Some editing of the various ‘takes’ will need to be done.
• For multitrack recordings, the tracks will have to be combined to
form a stereo channel.
• There may be a need to add various effects to the sound (e.g.
reverberation).
This section and the next two sections therefore will look at each of
these three processes, but do note that often the processes may be
combined, or may be carried out in a different order.
An important point to keep in mind during the discussion is that
sometimes a task has to be done in real time (i.e. as the sound is playing),
and at other times the task can be done at leisure, or in small sections.
This has a bearing on the speed at which each task must operate.
As in previous sections, the facilities provided in the AW16G Audio
Workstation will be used as an example, and you will get practical
experience of some of the processes with the course’s music recording
and editing software.
Although most of the discussion will be on how the processing is
done using digital audio, for completeness and comparison, analogue
methods will be noted where appropriate.
[Figure: input–output characteristic of a compressor/limiter, showing the threshold and compression ratios of 1:1, 2:1, 3:1, 4:1 and 20:1 (limiting)]
In the digital domain, the magnitude of each sound sample is compared
with the threshold to determine whether it is greater or less than the
threshold sample value. If it is below, then the sample is unaltered.
If it is above the threshold, then the difference between the sample
value and the threshold value is simply divided by the compression
ratio. For example a compression ratio of 2:1 will result in the
difference between the sample and threshold values being
halved. Notice that I have described the process in terms of the
magnitude of the sound sample (i.e. ignoring the sign). This is because
a positive value has to be made less positive and a negative value less
negative, and so the use of the term magnitude covers both situations.
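A minimal Python sketch of this sample-by-sample process (ignoring the attack and decay times that a real compressor would also apply; the names are my own) might look like this:

```python
def compress(samples, threshold, ratio):
    """Apply the level mapping described above.

    Sample magnitudes below the threshold are unaltered; the part of
    the magnitude above the threshold is divided by the ratio. The
    sign is handled separately, so positive and negative samples are
    treated alike.
    """
    out = []
    for s in samples:
        mag = abs(s)
        if mag > threshold:
            mag = threshold + (mag - threshold) / ratio
        out.append(mag if s >= 0 else -mag)
    return out

print(compress([0.2, 0.9, -0.9], threshold=0.5, ratio=2.0))
# [0.2, 0.7, -0.7]: the 0.4 excess above the threshold is halved to 0.2
```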
An analogue compressor/limiter may only provide a limited range of
settings for the above parameters whereas a digital device or the
equivalent software version found in audio processing programs will
often allow the user an almost infinite variation in the basic
parameters.
If the sound comes from a stereo source, then some devices will allow
the amount of compression to be controlled by the sound levels from
both channels, and the same amount of compression then affects both
channels at the same time. This prevents the compression causing
violent swings in the apparent position of the sound between the left
and right loudspeakers.
4.1.2 Expansion/gating
Audio expansion and audio gating are the opposite processes to audio
compression and audio limiting respectively and are similarly
implemented in both the analogue and digital domains. In audio
expansion, the sound is amplified less as the sound level is reduced.
In other words as the sound level is reduced, it appears to get even
softer, and thus the overall dynamic range is increased. Audio gating
occurs when below a particular sound level the signal is switched off
completely. Again there are the same parameters associated with these
forms of processing as for compression/limiting, i.e. threshold, ratio,
attack time and decay time.
Unlike compression/limiting which is primarily used to reduce the
dynamic range and make the recording as audible as possible,
expansion/gating is normally used to produce special effects.
Sometimes gating is used to cut out unwanted background noise
during, for example, speech, but this can give a very unnatural effect
when there is complete silence in the pauses. In the music world,
expansion and gating are commonly used for drums, with the bass drum
of a drum kit in particular being gated to provide the effect of
damping.
4.1.3 Normalisation
Normalisation is the process whereby the whole sound is changed in
level until the highest peak is at a predefined ‘normal’ level. Unlike the
processes described above, all signal levels are altered by the same
amount. Usually normalisation involves amplifying the sound by a
constant factor such that the highest amplitude peak is raised to just
below the maximum level the device can cope with (but this might not
always be the case). Thus, the amount of normalisation applied must
be determined from the peak value of the signal, not the average level,
otherwise normalisation might cause high-level peaks to be clipped
and introduce distortion.
Figure 26(a) shows the waveform of a section of a sound recording
before normalisation, and Figure 26(b) shows the same signal after
normalisation.

Figure 26 (a) The waveform of a section of a sound recording before normalisation; (b) the same signal after normalisation
If the sound were amplified by too large a factor, some samples would
end up higher than this maximum, i.e. the signal would be clipped. However, it
is an easy matter for a digital audio processor to scan through a
recording to find the peak value, and then to work out a multiplication
factor which will result in this peak value just reaching the maximum
quantisation value – or more usually a headroom value which is just
below this. This multiplication factor can then be applied to every
sample and the result stored as the normalised version. As before, this
is a purely mathematical process and usually does not need to be done
in real time.
Remember that this normalisation process works according to the
highest level of the sound over a whole recording (or a section of a
recording). Most of the time the average level will be much less than
this value, and in the extreme, a recording might contain a single large
peak which may even have come from an unwanted click which causes
the normalisation process to have little effect. In such cases, some
editing of these peaks may need to be done first to remove or reduce
them so that normalisation will have the desired effect.
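The whole process can be sketched in a few lines of Python (an illustration only, not the course software's implementation):

```python
def normalise(samples, max_level=32767, headroom=0.99):
    """Scale a whole recording so that its largest peak just reaches a
    headroom value below the maximum quantisation level.

    Every sample is multiplied by the same factor, so relative levels
    (the dynamic range) are unchanged. This scan-then-scale approach
    is done offline; it does not need to run in real time.
    """
    peak = max(abs(s) for s in samples)
    factor = (max_level * headroom) / peak
    return [round(s * factor) for s in samples]

quiet = [100, -250, 4000, -3000]    # peak is 4000 of a possible 32767
print(normalise(quiet))             # peak raised to about 32440
```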
In this activity you will apply normalisation to the recording you made
in Activity 16. Follow the steps associated with this activity which
you will find in the Block 3 Companion.
All in all, it is much better to work with the original recordings, but of
course in doing so the original will be permanently changed
(destructive editing), and if an edit goes wrong, then restoring the
original will be at best difficult or at worst impossible.
In the very early days of sound recording, editing was not possible – if
a recording went wrong, it had to be discarded and a whole new take
done. When analogue magnetic tape systems appeared, editing could
be accomplished by splicing (physically cutting and joining sections
using thin sticky-backed plastic tape called splicing tape). To obtain a
smoother result, the tape was cut at an angle rather than at right angles
across the tape (Figure 27). However, splicing only provides a straight
switch between the two recordings; cross-fades or fading in and fading
out cannot be done by this method.
In this activity you will carry out some simple editing operations and
learn more about the course’s music recording and editing software.
You will find the steps for this activity in the Block 3 Companion. I
In this activity you will carry out two simple editing operations;
one is destructive (i.e. the sound is permanently altered) and one is
non-destructive and uses the ‘edit list’ facilities of the course’s
music recording and editing software. You will find the steps for
this activity in the Block 3 Companion. I
5 MIXING
Mixing of audio signals is the process that occurs when two or more
sources need to be combined in various and possibly varying
proportions to produce a new combined recording. The audio sources
could come from a number of microphones recording a single
performance – either live or from a multitrack recorder, or from a
variety of different sources linked to a single performance (e.g. a
combination of signals from microphones and electronic instruments),
or sources from more than one performance (e.g. adding background
effects such as atmospheric sounds etc.). In fact the fading out and fading
in mentioned above is just a special form of mixing over a
short period of time, where one of the ‘signals’ to be mixed is silence.
As I mentioned in Section 1.2.2, mix-down or simply the mix is the
term usually applied to this stage in the mastering process.
When recording a live performance, often the recording engineer will
try to record each performer or group of performers separately using
individual microphones recorded to individual tracks on a multitrack
recorder. This means that a master stereo recording can be produced
with the best balance of the various sources at leisure at a later date. If
the mixing is done at the time of the recording there is little chance of
changing the balance afterwards. Sometimes the multi-track ideal is
not practical or possible, and in such cases it is very important that the
mix of the various sources is set up with care during rehearsals.
In the discussion below it does not matter whether the sound is
coming from live or from recorded sources or a mixture of both, except
that you should bear in mind that it is vital that all the sources are
synchronised – a topic that will be considered further in Chapter 3 of
this block.
Any sound mixing unit will have a number of inputs and a number of
outputs. The individual sound inputs are mixed in the required
configuration and proportions and the results sent to the required
outputs. Within a mixer, the actual ‘mixing’ is done with the use of
one or more buses. In this context an audio bus is the place where the
signals to be mixed are fed, and from where the result is sent to the
mixer output. In analogue mixers a bus would physically comprise an
electrical wire to which the various inputs were connected. In digital
mixers there is no physical equivalent component, but it is still useful
to retain the idea of buses.
[Figure: an analogue mixer – each of four analogue sound sources passes through a level control that multiplies the signal by a factor set either directly or by a variable control circuit driven by ‘remote’ control signals (digital or analogue) via synchronised electronic selector switches; the scaled signals are summed to give the mixed sound output]
Today’s desktop computers operate so fast that they are also able to
carry out the mixing process in real time, although of course here the
mixing is done by the computer program rather than by a dedicated
electronic device (though the computer itself may contain additional
circuits to speed up mathematical processes such as multiplication).
The solution in both cases is to make sure the levels of the individual
sound sources are such that their sum is within the maximum signal
level. Sometimes the dynamic range of the mixing stage and subsequent
stages is increased to address this problem. This is especially true in
digital mixers, where the internal processing is often
carried out using extra bits to allow a larger range of sample values to
be accommodated. Even if this is done within the digital mixer, the
number of bits must be reduced to the original number by a process
called downsizing when the master recording is made.
Comment
Downsizing involves adjusting each sample so that it can be stored using
fewer bits. However, before any bits are removed, it must be ensured
that they do not contain any needed data. This is done by multiplying the
value of each sample by the ratio of the new number of levels to the old
number of levels, and then storing the resulting value using only the
reduced number of bits. This ratio turns out to be the equivalent of
dividing the sample by 2ⁿ, where n is the difference between the
numbers of bits used before and after the downsize operation.
So, for example, if a 16 bit digital sound signal needs to be downsized
to 14 bits, each sample value needs to be divided by 2¹⁶⁻¹⁴ = 2² = 4. I
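Both ideas can be sketched briefly in Python – mixing as a weighted sum of
one sample from each source, and downsizing as division by 2ⁿ; the sample
values and mix levels below are invented for illustration.

    # Mixing: each output sample is a weighted sum of one sample per source.
    def mix(sources, levels):
        return [sum(level * s for level, s in zip(levels, frame))
                for frame in zip(*sources)]

    # Downsizing: divide each sample by 2**n, n being the number of bits removed.
    def downsize(samples, old_bits=16, new_bits=14):
        return [s // 2 ** (old_bits - new_bits) for s in samples]

    a = [1000, -2000, 3000]                  # 16-bit sample values (invented)
    b = [500, 500, -500]
    mixed = mix([a, b], [0.5, 0.5])          # levels chosen so the sum cannot overflow
    print(downsize([int(s) for s in mixed])) # [187, -188, 312]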
[Figure: the AW16G’s internal signal routing between the input jacks, the digital stereo in/out, the quick loop sampler pads, the CD-RW drive, the hard disk recorder, the mixer’s input/track/return channels, its stereo, aux and effect buses, the two internal effects units, and the stereo/aux, monitor and phones outputs]
As facilities such as the quick loop sampler pads are not central to the
mixing process in the AW16G, I will not consider them further. The effect
units provide effects such as reverberation and I will look at these in
more detail in the next section on adding effects.
So, in the AW16G, ignoring the pad tracks etc., the main inputs to the
mixer section are:
• 8 inputs from the device’s microphone/line input connectors;
• one stereo input from the digital S/PDIF connector;
• 16 track inputs from the hard disk (these are arranged as 8 separate
tracks and 4 paired stereo channels);
• a further stereo input from the hard disk’s stereo channel (as
mentioned in Section 3.5);
• two stereo channels from the effects units;
• a stereo input from the CD drive (if fitted).
Associated with each of these inputs are a number of controls and a
number of level monitoring facilities. The main controls are a level
or volume control that adjusts the amount of the signal that is added to
the mix, and a pan control that allows a single track to be sent to the
left and right channels of a stereo bus in varying proportions.
The mixer section contains a number of audio buses to which each of
the inputs can be assigned (i.e. connected to):
• a stereo bus (i.e. separate left and right buses);
• two auxiliary buses that can be used when an external effect unit is
required, or when a special monitor mix is required for a performer;
• two buses that are used to supply mixes of sound to the effects
units – there is one bus for each unit;
• a general-purpose stereo bus that can be used for partial mixes, or
for intermediate mixes.
Direct mode
This mode is used to record inputs individually on separate hard disk tracks
so that they can be mixed down to a master stereo channel at a later time. The
relevant mixer connections are set up by selecting the DIRECT mode record
screen on the display and then pressing input channel and track channel
buttons on the front panel to make the connections which then appear on the
screen. Figure 31 shows a typical direct mode setup where 2 microphones
and a synthesiser with a stereo output are being recorded on 4 separate hard
disk tracks and Figure 32 shows the direct mode display for this set up.
Mixed mode
This mode uses the general-purpose bus to allow an intermediate mix to be
made in order to save track usage. Here, the inputs are assigned to the bus,
and the bus output is assigned to 2 hard disk tracks (one for the left channel
and one for the right). This saves tracks, but of course the mix of the inputs
cannot be altered at a later date (neither can any effects be later added to an
individual input). Figure 33 shows the MIXED mode set up for the same
inputs as described in the DIRECT mode example above, and Figure 34
shows the corresponding MIXED mode display where the bus outputs are
connected to hard disk tracks 1 and 2. Note that the two microphone inputs
are connected to both the left and right buses (so that the pan controls can be
used to position the sound from each microphone anywhere between the left
and right ends of the stereo sound field), but the synthesiser left and right
inputs are only connected to their appropriate bus lines.
[Figure 31 A setup using the AW16G’s DIRECT recording mode: two microphones and a synthesiser/rhythm machine feed the mic/line input jacks, and mixer input channels 1–4 are recorded directly on recorder tracks 1–4]
[Figure 32 The AW16G’s screen display for the setup in Figure 31]
[Figure 33 A setup using the AW16G’s MIXED recording mode: the same sources feed mixer input channels 1–4, which are mixed on the L/R bus and recorded on recorder tracks 1 and 2]
Bounce mode
Bounce or ping-pong mixing or recording is a generally used term that
describes the situation where a number of individual already recorded
tracks are combined and stored as two new tracks, with or without
new live material being added. This is an intermediate stage that can
again be used to reduce the number of tracks, but with the same
provisos in terms of getting the mix right and the addition of effects to
individual channels as mentioned above for the mixed mode.
Figure 35 shows a BOUNCE set-up where 8 previously recorded tracks
are combined to form a single stereo channel which is then stored in
two unused hard disk tracks (tracks 9 and 10). Figure 36 shows the
bounce screen display for this set-up.
[Figure 35 A setup using the AW16G’s BOUNCE recording mode: track channels 1–8 feed the stereo bus, the stereo output channel is recorded on tracks 9 and 10, and the result is also fed to the stereo/aux, monitor and phones outputs]
Mixdown mode
This mode is where the final mix is made to the stereo bus which is
then recorded onto the hard disk’s stereo channel. The mixdown can
involve live inputs, tracks previously recorded on the hard disk and
other extras such as sounds from the quick loop sample pads.
Figure 37 shows a typical MIXDOWN set-up which mixes 4 live inputs,
4 previously recorded hard disk tracks and sounds from the four quick
loop samplers to create a final single stereo channel. Figure 38 shows
the mixdown screen display for this set-up.
[Figure 37 A setup using the AW16G’s MIXDOWN recording mode: four mic/line inputs, recorder tracks 1–4 and the four quick loop sampler pads feed the stereo bus, and the stereo output channel is fed to the stereo/aux, monitor and phones outputs]
For all of these four modes, adjustments can be made to some of the
parameters as the recording progresses. In particular the front panel
slider level controls can be used to alter the mix during recording, and
channels can be switched on and off (punch in and punch out). Using
an edit list (called a scene memory in the AW16G), all the settings for
each channel, and any adjustments that need to be made to them as
the recording progresses, can be stored and recalled.
6 ADDING EFFECTS
6.1 Equalisation
This is one of the most common and most basic effects, and is concerned
with altering the frequency spectrum of the signal in some way.
Equalisation, or EQ as it is sometimes called, obtains its name from the
process of boosting frequency ranges that have been attenuated (reduced)
through a transmission medium or audio recorder in order to ‘equalise’
signal frequencies in this range back to their original levels.
Today, equalisation covers not only boosting certain frequency ranges,
but also cutting them, and doing this with many different, and
sometimes very small, frequency ranges.
Sometimes equalisation is fixed and is inbuilt into a system. For
example a type of equalisation called pre-emphasis is used in vinyl
records, and Dolby B is commonly used in compact cassettes to
increase the signal-to-noise ratio. You will come across these and other
examples of fixed equalisation in Chapter 5 of this block.
However, equalisation is often used subjectively to affect the character
of a sound: for example, boosting the upper frequencies a little can
make speech more intelligible by emphasising the consonants, and mains
hum can be reduced by turning down the lower frequencies.
A parametric equaliser allows not only variation of the amount
of cut or boost (Figure 41(a)) and the
centre frequency (Figure 41(b)), but
it also allows the bandwidth of the
frequencies that are affected to be
varied (the bandwidth setting is
sometimes called the ‘Q’) as shown
in Figure 41(c).
[Figure 41 Parametric equalisation: (a) varying the amount of boost/cut; (b) varying the centre frequency; (c) varying the bandwidth or Q]
[Figure: a graphic equaliser’s controls – a row of bands centred on frequencies f1–f8, each of which can be boosted or cut]
All types of equalisation use one of three types of electronic filter – high
pass, low pass and band pass – which were introduced in Chapter 5 of
Block 1. The creation of effective electronic filters, whether implemented
using analogue or digital techniques, is complex and beyond the scope
of this course. However, it is instructive to note that digital filters cannot
work with single sound samples in isolation, since one sample does not
give any information about the instantaneous frequency content of
the signal. To create a digital filter requires proportions of a number
of consecutive samples to be added to produce a single new ‘filtered’
sample, as outlined in Box 23.
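As a rough illustration of the idea (Box 23 itself is not reproduced here),
the following Python sketch computes each output sample as a weighted sum
of the current and previous input samples; the three equal coefficients are
an arbitrary choice giving a crude low-pass (averaging) filter, not a
properly designed one.

    # FIR filter sketch: each output sample is a weighted sum of the current
    # and preceding input samples. Equal weights give a crude low-pass filter.
    def fir_filter(samples, coefficients=(1/3, 1/3, 1/3)):
        output = []
        for n in range(len(samples)):
            acc = 0.0
            for k, c in enumerate(coefficients):
                if n - k >= 0:               # use earlier samples where they exist
                    acc += c * samples[n - k]
            output.append(acc)
        return output

    # A rapid sample-to-sample alternation (a high frequency) is smoothed away:
    print(fir_filter([1.0, -1.0, 1.0, -1.0, 1.0, -1.0]))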
In this activity you are supplied with a piece of music that has an
annoying continuous mains-frequency hum. Your task is to introduce
suitable parametric equalisation to remove as much as possible of this
hum without affecting the music. You will find the steps associated
with this activity in the Block 3 Companion. I
6.2.1 Echo
Adding echo to a recording is simply a matter of delaying the sound by a
fixed amount and then adding a proportion of this delayed sound to the
original sound. This only gives one echo, so if multiple echoes are needed,
the delayed sound is also fed back to the input of the delay again. This
is fairly simple to achieve using both analogue techniques (see Box 24)
and digital techniques, although digital techniques give a better result
and offer much more control over the echo as explained in Box 25.
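In digital form the whole process amounts to a delay buffer and two
multiplications. The following Python sketch is one possible arrangement,
assuming floating-point samples at 44.1 kHz; the delay, mix and feedback
values are invented for illustration.

    # Echo sketch: delay the sound, add a proportion of the delayed sound to
    # the original, and feed the delayed sound back for repeating echoes.
    def echo(samples, delay_samples=22050, mix_level=0.5, feedback=0.4):
        buffer = [0.0] * delay_samples        # circular (FIFO) delay line
        output = []
        for i, s in enumerate(samples):
            delayed = buffer[i % delay_samples]    # the sound from delay_samples ago
            output.append(s + mix_level * delayed)
            # Store input plus some delayed signal so each echo repeats,
            # dying away by the feedback factor each time round.
            buffer[i % delay_samples] = s + feedback * delayed
        return output

With 22 050 samples of delay at a 44.1 kHz sample rate, each echo arrives
half a second after the last.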
[Figures: an analogue delay implemented as a sampler feeding a chain of charge storage units (more units give a longer delay); a digital delay implemented as a memory buffer whose size sets the delay, with an input pointer marking the last sample stored]
6.2.2 Reverberation
Simulating reverberation is more difficult to achieve effectively than
echo. Why might this be so? Hopefully the following revision activity
will give you a clue.
Comment
If you had difficulty answering this activity, you may like to have a
quick look back at the material on reverberation in Sections 3 and 4 of
Chapter 4 in Block 1 to refresh your memory before carrying on. I
[Figure: the response of a room to a hand clap – the direct sound followed by reflections spread over time – and a reverberation unit built from serial delays]
6.3.1 Flanging
Flanging or phasing was first mentioned in Chapter 1 of Block 1, and
is a synthetic effect that occurs when the frequency response of the
system contains a number of equally spaced notches (a notch is a small
frequency range where the signal is reduced) and where the frequency
of these notches varies slowly up and down the audible spectrum.
This introduces a characteristic sweeping effect to the sound.
To remind you what flanging sounds like, listen to the audio track
associated with this activity which is a repeat of the track used in
Activity 25 in Chapter 1 of Block 1 that demonstrates flanging. I
The notches are produced by delaying the signal by a small amount and
adding it to the original signal. This has the effect of cancelling some
frequencies and boosting others. Figure 54 illustrates this effect for two
special cases where the delay is exactly one half of the cycle time of the
sound frequency (Figure 54(a), (b) and (c)) and exactly equal to the cycle
time of the sound frequency (Figure 54(d), (e) and (f)). These special
cases where the resultant signal is either completely cancelled or exactly
doubled will occur at frequencies of 3, 5, 7, etc. times or 2, 4, 6, etc.
times the original frequencies respectively (i.e. the odd harmonics and
the even harmonics respectively). Viewed as a frequency spectrum, these
special cases appear as a set of equally spaced notches (where the delay
causes cancellation) and peaks (where the delay causes reinforcement)
in the frequency spectrum. A device that has this effect on a signal is
known as a comb filter.
[Figure 54 (a)–(c) a signal of frequency f1 added to a version of itself delayed by half its cycle time – the result is no signal; (d)–(f) a signal of frequency f2 added to a version of itself delayed by its full cycle time (delay = 1/f2) – the result is f2 at twice the amplitude]
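A digital flanger of the kind just described can be sketched in Python as a
delay swept slowly up and down by a low-frequency sine wave; the sweep rate
and maximum delay below are illustrative values only.

    import math

    # Flanging sketch: add to each sample a copy of the signal delayed by a
    # small amount that is slowly swept up and down.
    def flange(samples, rate=44100, max_delay_ms=3.0, sweep_hz=0.5):
        max_delay = int(rate * max_delay_ms / 1000)
        output = []
        for n, s in enumerate(samples):
            # The delay sweeps between 1 sample and max_delay samples.
            sweep = 0.5 * (1 + math.sin(2 * math.pi * sweep_hz * n / rate))
            d = 1 + int(sweep * (max_delay - 1))
            delayed = samples[n - d] if n - d >= 0 else 0.0
            output.append(0.5 * (s + delayed))   # equal mix gives the deepest notches
        return output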
6.3.2 Chorus
Chorus is an effect that is designed to simulate the sound of a number of
players (or singers) from just one or a few musicians. Its effect is really
only useful for music and in particular for strings and singers as you have
heard in Chapters 6 and 7 in Block 2. As explained in Block 2, when a
number of musicians play the same part (or singers sing in unison) small
varying differences in their pitch cause multiple and changing phase
differences between the individual notes and overall these combine to
sound as random small changes in both volume and pitch and result in a
full, rich sound. Box 30 outlines how this effect may be synthesised.
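Along the lines just described (Box 30 is not reproduced here), a chorus can
be sketched in Python as several copies of the signal, each passed through
its own slowly and independently varying delay; the number of voices, delay
times and sweep rates are invented for illustration.

    import math

    # Chorus sketch: mix the original with several delayed copies whose delays
    # drift independently, so each 'voice' wavers slightly in pitch and timing.
    def chorus(samples, rate=44100, voices=3, base_delay_ms=20.0, depth_ms=5.0):
        output = []
        for n, s in enumerate(samples):
            mixed = s
            for v in range(voices):
                # Each voice sweeps at a slightly different rate and phase.
                sweep = math.sin(2 * math.pi * (0.2 + 0.07 * v) * n / rate + v)
                d = int(rate * (base_delay_ms + depth_ms * sweep) / 1000)
                if n - d >= 0:
                    mixed += samples[n - d] / voices
            output.append(mixed / 2)     # keep the overall level in range
        return output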
Listen to the audio track associated with this activity. It is a short piece of
music played using a ‘strings’ patch on a synthesiser. The music is played
three times, first without chorus, second with chorus and finally with
both chorus and reverberation added. I
In this activity you will use the course’s music recording and editing
software to add effects to some music. You will find the steps
associated with this activity in the Block 3 Companion. I
Think of the sound of a car heard by a listener some distance away: the
sound arrives delayed by its travel time, but if its pitch is compared with
the pitch at the source, the two pitches will be found to be the same (assuming the car and
listener are stationary). However, if the delay is varied (the car moves
towards or away from the listener) then the pitch of the delayed signal
will appear to the listener to change – a reducing delay (the car is moving
towards the listener) raises the pitch (the sound waves are ‘squashed’)
and a lengthening delay (the car is moving away from the listener) lowers
the pitch (the sound waves are ‘stretched’). BUT this only occurs as long
as the delay is changing (i.e. the car is moving). Immediately the delay is
constant (the car stops), the pitches of the delayed and undelayed sounds
become the same again. Box 31 explains how this can be achieved without
having to resort to moving either the microphone or the performer!
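In code, the ‘moving car’ becomes a read position that advances through the
stored samples slightly faster or slower than they were recorded – a
steadily changing delay. The Python sketch below uses simple linear
interpolation and an illustrative pitch ratio; it is not any particular
product’s method, and the ever-growing drift of the read position is exactly
why practical pitch shifters need the more elaborate techniques referred to
below.

    # Pitch-change sketch: read through the samples at a rate other than the
    # rate at which they were written. While the read position drifts (a
    # steadily changing delay), the pitch is shifted.
    def repitch(samples, ratio=1.05):           # >1 raises pitch, <1 lowers it
        output = []
        position = 0.0
        while position < len(samples) - 1:
            i = int(position)
            frac = position - i
            # Linear interpolation between neighbouring samples.
            output.append((1 - frac) * samples[i] + frac * samples[i + 1])
            position += ratio                   # the 'moving listener'
        return output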
As you can see from Box 32, the whole process of pitch and time
shifting can become very complicated, particularly if good results are
to be obtained for sound sources that contain frequencies over the
whole audible range, and where large variations in tempo and/or pitch
are required. Any further discussion of the techniques that are
employed is beyond the scope of this course; however, I hope this
section has given you an overall idea of how pitch and tempo changing
can be achieved and has shown you the problems that have to be
overcome when trying to implement such effects.
In this activity you are supplied with a piece of music where one part
is played in the wrong key. Your task is to alter the pitch of the music
so that it is in the correct key. Carry out the steps associated with this
activity which you will find in the Block 3 Companion. I
6.5.1 Invert
This is a simple effect that inverts the sound waveform – positive
sound values (either analogue voltages or digital sample values)
become negative and negative ones become positive. This is also
known as changing the phase, and if the inverted signal is added to the
original signal, they will cancel out leaving no signal at all. Note that
this is not the same as delaying the signal as described in Box 26: in
that situation, cancellation only occurs at the specific frequency whose
cycle time is twice the delay, and at its odd harmonics.
Inversion can be used to simulate stereo images from a mono signal (as
described in the next section), and it can also be used to correct the
phase of a microphone signal if the microphone is found to have been
connected the wrong way round (see Box 33).
6.5.3 Vocoder
A vocoder produces an effect that makes a non-musical sound appear
to speak or sing. This is an example of a multi-track effect where the
effect is produced by the interaction of one sound track with another.
Analogue implementations of the vocoder effect have been in existence
for some time, and this is where the term phase vocoder originates (see
Box 32) – although the effects produced and processes involved are not
closely related.
In the case of the vocoder, one track is used to amplitude modulate the
other (amplitude modulation or AM was explained in Chapter 8 of
Block 2). So, for example, the sound of a telephone ringing can be
simulated by using a vocoder on two sound tracks, one containing the
continuous sound of a bell and the other containing a person saying
“ring ring”.
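Following the simplified description above (not the filter-bank design of
classic vocoders), this effect can be sketched in Python by deriving the
amplitude envelope of one track and multiplying the other track by it; the
smoothing constant is an illustrative value.

    # Vocoder-style amplitude modulation sketch: the level envelope of one
    # track (e.g. a voice saying "ring ring") modulates another (a bell).
    def envelope(samples, smoothing=0.999):
        level, env = 0.0, []
        for s in samples:
            level = max(abs(s), level * smoothing)   # simple peak follower
            env.append(level)
        return env

    def modulate(carrier, modulator):
        env = envelope(modulator)
        return [c * e for c, e in zip(carrier, env)]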
In this activity you will investigate the effects on the stereo sound field of
feeding the left and right signals with different proportions of in-phase
and out-of-phase signals from a monophonic source. You will find the
steps associated with this activity in the Block 3 Companion. I
[Figure: an AW16G input channel – the mic/line input jack feeds the input level control and EQ, with a speaker simulator effect whose settings are drawn from a library]
6.6.4 Summary
As you can see, the AW16G offers a wide range of different effects,
with a large number of preset settings held in libraries. Most of the
time there will be a preset setting that is suitable. However,
individual parameters can be adjusted if a special setting needs to be
created, or if a preset needs some fine tuning. Unfortunately, as with
the AW16G’s editing operations, setting up these bespoke settings can
be a little tedious because of the small display and the lack of a
pointing device (mouse) such as would be available when using a
sound processing package on a desktop computer.
7 EXTERNAL CONTROL
8 SUMMING UP
In this chapter you have looked at the whole process of making a
master recording, from the forms of audio signal that may need to
be recorded, through the cables, connectors and inputs by which
the signals are input to the recording device to the methods of
recording the sound and the subsequent stages of editing, mixing
and adding effects. Later in this block you will continue on from
where this chapter left off and look at the processes, systems and
equipment that are involved with the copying and distribution of a
master recording.
Throughout the chapter, you have seen how the processes of producing
a master recording are achieved in a real digital hardware device –
the Yamaha AW16G – as well as getting practical experience of
them through the course’s music recording and editing software.
However, before leaving this chapter, I want you to have a go at
producing a real master recording that is more substantial than the
ones you have tackled so far.
Figure 64 The TA225 Course Tune Copyright © 2004 The Open University
The TA225 Course Tune has also been harmonised and there are some
specially written words for it as well. In the final activity of this
chapter, you will have the chance to work with this tune to produce
your own master recording.
The TA225 Course Tune will be used again in Chapter 3 of this block
where you will work with the MIDI version of the tune and you will
also see what a professional musician and composer can do when
given just the tune to work with.
SUMMARY OF CHAPTER 1
Desktop sound is the equivalent, for sound, of desktop publishing of
textual material: it is the process of producing a fully edited and mixed
master recording to professional standards using equipment that
conveniently fits onto a desktop. Desktop sound became a reality in the
1990s because of advances in digital audio and storage technologies.
(Section 1.1)

Producing a master recording first involves assembling all the raw
elements of the recording. This may include making acoustic recordings
of live performances, recording directly from electronic instruments, or
obtaining already recorded material either in analogue or digital form.
(Section 1.2.1)

For all but the simplest of recordings, some editing and mixing of the
various sound elements will then be needed. Mixing involves adding
proportions of the various sound sources to create the required overall
sound. Sometimes mixing needs to be done in stages. Editing involves
cutting out, inserting, swapping or moving sections of sound. Editing of
individual sound sources or of the overall mixed sound may need to be
carried out, even if it only involves fading in and fading out at the
start/end of the recording. An edit list is a list containing a record of the
editing/mixing operations and can be used to speed up the editing/mixing
process; it can also allow non-destructive editing to be carried out,
whereby the original sound elements are not altered. (Section 1.2.2)

Finally there is a large range of effects that can be applied to individual
sound sources or to the mixed sound, from commonly used ones like
equalisation, reverberation and chorus to more specialised effects.
(Section 1.2.3)

The Yamaha AW16G Professional Audio Workstation is a desktop sound
device that incorporates a digital sound mixer and a multitrack hard disk
recorder. It includes all the features and facilities needed to be able to
create a professional-quality master stereo recording, including editing
and effects facilities. (Section 1.3)

There are two main types of analogue sound inputs – high sensitivity for
microphones and other sound sources that have a low electrical output,
and low sensitivity or line level for devices that have a higher level
sound output. Input sensitivities are usually given in terms of r.m.s.
values and are often specified in decibels. (Sections 2.1 and 2.1.1)

Three decibel scales are used for specifying input sensitivities or
amplitudes of sound signals in their analogue electronic form – dBV, dBu
and dBm, where 0 dBV is 1 volt r.m.s. and 0 dBu is 0.775 V r.m.s. The dBm
scale is a power ratio scale, but can often be approximated to the dBu
scale. (Section 2.1.2)

Input sensitivities vary widely, but a typical value for a high sensitivity
analogue audio input is 1 mV r.m.s., and perhaps 20 mV r.m.s. for a line
level input. Sometimes the impedance of the input can have an effect on
signal levels, interference and noise. Electrical impedance is a frequency
dependent quantity that represents the resistance to flow of electricity.
It is measured in ohms. (Section 2.1.3)

In order to monitor signal levels, various metering systems are used, the
common ones being an indication of the average signal level and an
indication of the peak level. (Section 2.1.4)

A balanced input uses two wires for the sound signal – one wire contains
an inverted version of the signal on the other wire. Such an arrangement
makes the signal less susceptible to external interference because, as any
such interference is likely to affect both signal wires equally, it will be
cancelled out in the receiving audio device. (Section 2.1.5)

In order to power condenser microphones, high sensitivity analogue inputs
sometimes allow a steady state voltage called phantom power to be added
to the signal wire. (Section 2.1.6)

Analogue sound outputs on audio equipment are usually at line level
unless they are special outputs such as those designed for headphones or
loudspeakers. Loudspeakers can require quite high voltage signal levels.
(Section 2.2)

In order successfully to implement digital inputs and outputs, there must
not only be a specification for the physical form of the data, but there
must also be an agreed protocol that determines how the data is to be
interpreted. The AES/EBU, S/PDIF and MADI specifications are all related
and contain protocols to specify how digital sound samples can be sent in
real time as a serial stream of digital data. (Section 2.3)

In the AES/EBU specification, the digital sound samples are sent serially
along the connection. Samples for each track (if there is more than one)
are sent interleaved. The data rate is variable, but is set such that one
sample for each and every track is sent within the sample period. The data
is sent in a form that enables the receiver to recreate the original bit
stream by measuring the time between zero level crossings of the signal.
This allows the receiver to recognise individual bits correctly and more
accurately than if amplitude levels were measured. In order for the
receiver to decode individual sound samples from the stream of bits,
synchronisation data is added. This is done by assembling the sound
samples into sub-frames, frames and blocks. A sub-frame contains a
special preamble set of bits followed by the bits for one sample from one
track and some additional control/status bits called channel status
information. A frame consists of one sub-frame for each sound track. A
block consists of a set of 192 frames and provides a means whereby the
few status/control bits in each sub-frame can be collected together to
provide important information about the form of the sound samples. Such
information includes the sample rate, the number of bits per sample and
other control data. (Section 2.3.1)

The S/PDIF standard is fully compatible with the AES/EBU standard in
terms of the physical form of the signal and the sub-frame/frame/block
format. A bit in the channel status information indicates whether the
signal conforms to the S/PDIF standard or the AES/EBU standard. In the
S/PDIF format, the channel status bits have slightly different
interpretations, and there is a substantial amount of unspecified data
that can be used for future enhancements to the specification.
(Section 2.3.2)

The multi-channel audio digital interface (MADI) is a multi-track version
of the AES/EBU system that allows up to 56 tracks to be transferred
simultaneously. It uses a much higher data rate, a slightly different
coding scheme, and a separate timing signal to ensure individual bits are
decoded correctly. (Section 2.3.3)

Increasingly now, digital sound data is being transferred using the
connection systems used in computers, particularly USB and FireWire.
Sometimes these connection systems are just used as a transport means for
AES/EBU or S/PDIF formatted sound data, but they can also be used to
transfer sound samples in their raw form without forming them into
sub-frames and frames, etc. (Section 2.3.4)

The loss of signal level, reflections of the signal at each end, and the
addition of interference and noise can all be affected by the type and
construction of the cables used to carry sound signals. Loss of signal
depends mainly on the length of the cable, but can also be affected by its
characteristic impedance, as can the effect of any reflections that occur
at the ends. Interference can be reduced by using a screened cable,
whereby the signal wire is enclosed in a sheath that is connected to the
equipment’s ground or earth connection. Digital connections can use an
optical method whereby the signal is sent down an optical fibre as light.
This type of connection can easily cope with the much higher frequencies
required by a digital sound signal and also provides electrical isolation
between the sending device and the receiver. (Section 2.4.1)

There are a variety of different connectors that are commonly used for
audio signals. Some provide a locking mechanism to prevent accidental
disconnection, and some provide connections for two signals (together with
a common earth or ground connection) that enable them to be used with a
balanced signal or a stereo signal. Connectors for optical connections
must be constructed so as to minimise light loss at the connection.
(Section 2.4.2)

The AW16G offers eight combined mic/line balanced analogue inputs and a
S/PDIF optical digital input. The device provides three analogue stereo
outputs – a main output, an auxiliary output and a headphone output – and
a S/PDIF optical digital output. (Section 2.5)

There are a number of different methods of digital sound recording; at
present the most useful of these for desktop sound is the multitrack hard
disk recorder. (Section 3.1)

A hard disk unit contains one or more rotating disks which are coated on
both sides with a magnetic material. The data is stored/read by a set of
read/write heads that move radially across the disk. The disks rotate
continuously and are sealed during manufacture to minimise the effects of
dust. The data is stored in concentric circles called tracks, which are
divided radially into sectors. The access time of an item of data on a
hard disk has two components – the time for the read/write head to move to
the required track and the time for the required sector to appear under
the head. The major advantages of using hard disks in the recording and
mastering processes are the small access time and the multitrack
capability. However, even though hard disks have a very small access time,
some solid state memory needs to be used as a temporary store as well.
Modern desktop computers with suitable software are able to carry out most
if not all of the operations required for desktop sound; however, a
dedicated device will often be more compact and portable and will be
designed specifically for sound recording and processing, and so may well
have a better overall performance than a desktop computer. (Section 3.2)

Solid state memory has no moving parts and has a very short access time.
Random access memory (RAM) is usually volatile – it loses its contents
when the power is removed – and so can only be used for temporary storage
of sound data. It is also more expensive and not available in such large
storage sizes as hard disk units. Flash memory is a form of non-volatile
solid state memory. However, once data has been stored, it has to be
erased in blocks before new data can be stored. Also, flash memory can
only be erased and reprogrammed a finite number of times. (Section 3.3)

Sound data is stored in a computer in a number of different file formats.
Three common formats are AU, AIFF and RIFF WAVE. In all the formats,
additional data has to be included to indicate the sample rate, the number
of quantisation levels of the sound data and the number of tracks, as well
as other information about the sound data that is stored. Where there is
more than one track, the samples from each track are interleaved to enable
the sound to be replayed as it is read from the file. The AU format is the
simplest format and consists of a header section, an optional comment
section and the sound data section. The AIFF format uses a number of
self-contained sections called chunks – the basic chunks used are the
header, common and data chunks. Similarly, the WAVE format uses chunks,
the basic ones being the header, format and data chunks. There are a
number of additional optional chunks available with both the AIFF and WAVE
formats. (Section 3.4)

The AW16G workstation contains a hard disk unit that allows simultaneous
recording of 8 tracks and a stereo channel and simultaneous playback of 16
tracks and a stereo channel. In addition there can be up to 8 virtual
tracks associated with each track that can be used for different ‘takes’
of the same recording. The unit also contains solid state memory for use
as temporary storage, and an optional CD drive is available. (Section 3.5)

Normalisation is the process of adjusting the level of a digital sound
signal to use the full dynamic range. Audio compression is used to reduce
the dynamic range, and audio limiting is used to prevent distortion
occurring from overload. Expansion increases the dynamic range, and gating
switches off a signal when it reduces below a certain level. There are a
number of parameters associated with compression, limiting, expansion and
gating – threshold, amount, attack time and decay time. (Section 4.1)

Editing is the process of adding, deleting, merging or swapping sections
of a recording. Analogue techniques involved cutting and joining sections
of magnetic tape to avoid reducing the quality through making multiple
generations of copies. Digital techniques do not have this problem, and
carrying out editing is usually just a matter of reorganising the sound
samples. Fading in, fading out or cross-fading between sections involves
carrying out a large number of simple numerical calculations on sound
samples. Non-destructive editing occurs when the original material is not
altered. An edit list is sometimes used to automate the editing process.
(Section 4.2)

The AW16G has facilities to edit tracks, and all the common editing
operations are available. However, the lack of a large display screen and
pointing device can make editing quite tedious and time-consuming.
(Section 4.3)

Mixing is the process of combining individual sound sources to create the
final required overall sound. Mixing using analogue techniques involves
adding proportions of each analogue source. For digital sources, mixing is
done by calculating fractions of each sample from each source and adding
the results together. For both analogue and digital mixing, care must be
taken to ensure the amplitude of the mix is not so large that it causes
distortion. To cater for this, mixer units often have a larger dynamic
range than that of the individual sources. (Sections 5.1, 5.2 and 5.3)

The mixer section of the AW16G contains inputs from the audio inputs and
the hard disk track recorder as well as from two effects sections. The
mixed sound is output from the mixer on one or more audio buses that can
be fed to the device’s outputs as well as to the hard disk recorder and
the effects units. Various mixing modes are available that are designed to
be used at different stages of the mixdown process. (Section 5.4)

There is a large range of effects that can be applied both to individual
tracks and to final mixes. Equalisation is one of the most common effects,
and involves adjusting the level of the sound components in one or more
frequency bands. Simple equalisation consists of treble and bass controls,
but more elaborate equalisation can involve one or more mid-range
frequency bands with full control over the centre frequencies, the width
of the frequency bands (the Q) and the amount of boost or cut.
(Section 6.1)

Echo and in particular reverberation are widely used effects to simulate
the acoustics of a room or building. Creation of echo involves adding a
delayed proportion of the signal to the original signal. Generating a
delayed signal with analogue techniques is not easy if quality is not to
be compromised; however, with digital signals either a temporary buffer
called a FIFO can be used to delay the sound samples, or a special device
that contains a string of storage elements can be used. Reverberation is
more complicated to create as it involves creating not only the early
reflections, but also the multi-reflections that form the actual
reverberation. Analogue reverberation can be simulated using a special
room, a metal plate or coiled springs, but the results are never
particularly good, and only minimal adjustment of the parameters is
possible. Digital techniques for reverberation involve combining a number
of individual delay units, each with differing delays. (Section 6.2)

Flanging is an effect that is created by adding a delayed version of a
signal to the original signal, which has the effect of reinforcing some
frequencies and cancelling out others. By making the delay variable, the
characteristic flanging sound is produced. Flanging was originally
produced using two identical but not synchronised tape recorders to create
slightly different and varying delays. Digital simulation is created by
using a delay unit with a varying delay. Chorus is an effect that is
designed to simulate the sound of an ensemble of the same types of
instrument playing together. Analogue generation is complicated, but
digital techniques are more straightforward and involve the use of a
number of varying delays connected in parallel. (Section 6.3)

Changing pitch without varying the tempo and vice versa are useful effects
to have available. Pitch can be changed by creating a reducing or
increasing delay. Tempo can be changed by slicing the sound into short
overlapping sections and then sliding these sections together or apart in
time. Both effects need great care to implement satisfactorily,
particularly for large variations. (Section 6.4)

Another common effect is invert, where the sound waveform is inverted.
This can be used to correct for incorrect signal connections and to
provide stereo imaging effects. A stereo channel can be thought of as
being composed of a sum and difference signal rather than a left and right
signal. Individual sound sources have a localisation and an image within
the stereo sound field. A vocoder is an effect created by amplitude
modulating one sound source by another. An envelope follower also uses
amplitude modulation, but here it is the amplitude of one signal that
changes the amplitude of the other. (Section 6.5)

The AW16G contains a large range of effects that can be applied to
individual channels and to mixes of sources. There are a number of effects
libraries that allow common settings to be set up quickly. Bespoke
settings can also be created and stored in the libraries. Some effects
like pitch and time changing cannot be carried out in real time.
(Section 6.6)

External control of audio devices can allow the device to be remotely
controlled, to be controlled by a computer, or to enable the settings to
be stored. There are a number of common interconnection methods; commonly
USB, FireWire and MIDI are used. (Section 7.1)

The AW16G uses a MIDI interface to give remote access. This provides
facilities to enable the AW16G to synchronise its operations with other
devices (in both master and slave modes), to allow ‘scenes’ to be
recalled, to allow recording/playback to be started and stopped remotely,
to enable the workstation’s settings to be backed up, and to allow the
device’s front panel slider controls to be used to control a remote
device. (Section 7.2)
APPENDICES
The tables in these appendices are given for information only as an illustration of the
types of settings and adjustment parameters that a typical sound processing device (or
computer program) might provide.
(continued ...)
Activity 3
Non-destructive editing is the process whereby a sound is edited in some
way without the original sound being altered. This can be done by
using an edit list. (It may also be done by making copies of the original
sound and editing the copy rather than the original; however, this is
not possible with analogue devices unless quality is compromised, and
may not be possible with dedicated desktop sound devices.)
Activity 5
(a) If the peak-to-peak amplitude is 20 mV, then the peak amplitude
will be 20/2 = 10 mV. Thus the r.m.s. voltage will be 10 × 0.71
= 7.1 mV.
(b) If the r.m.s. amplitude is 71 mV, then the peak value will be 71 ÷ 0.71
= 100 mV.
Activity 6
(a) –6 dB represents a halving of a quantity, so the sound will have a
level of 40 – 6 = 34 dB.
(b) +40 dB represents a multiplication of 100 times, so the sound will
have a level of 40 + 40 = 80 dB.
Activity 7
(a) 2 V (+6 dB is a doubling and 0 dBV is 1 V)
(b) 0.0775 V or 77.5 mV (–20 dB is a tenth and 0 dBu is 0.775 V)
(c) 0.05 V or 50 mV (–26 dB can be thought of as –20 dB, or a tenth,
followed by –6 dB or a halving with 0 dBV being 1 V)
(d) 0.001 V or 1 mV (–60 dB can be thought of as (–20) + (–20) + (–20)
or one tenth times one tenth times one tenth or one thousandth
with 0 dBV being 1 V)
Activity 8
If 0 dBV represents a signal level of 1 V r.m.s., then 20 mV can be
thought of as being composed of a doubling of this reference level (2 V)
followed by a hundredth of this value (2 V ÷ 100 = 0.02 or 20 mV).
Since +6 dB represents a doubling and –40 dB represents one
hundredth, so 20 mV is represented by 0 + 6 – 40 = –34 dBV.
There are other ways of explaining the result, for example taking one
hundredth first and then doubling, or dividing up the decibel value into
–40 and +6 first and then seeing what this means in terms of voltages.
Activity 12
In the AES/EBU system, a single digital sound sample from one sound
channel is sent in one sub-frame. For a stereo system, two sub-frames
will be transmitted for every sound sample. Since a sub-frame consists of
32 bits, between each sound sample the transmitter must send 32 × 2 =
64 bits of data along the serial interface. The data rate of the serial
interface must therefore be 64 times the original audio sample rate, or
64 × 44.1 × 10³ = 2822.4 × 10³, or about 2.8 Mbits per second.
Activity 13
The system works in real time such that one sample from each sound
channel is transmitted within the sampling interval of the original
digital sound data. Thus, the more channels, the more digital data that
has to be sent within a sample period, and so the higher the bit rate
needs to be to incorporate this data.
Activity 14
If the microphone has a balanced output, a screened cable with a
twisted pair of signal wires needs to be used. The screened
construction is needed to minimise interference particularly since
low-level microphone signals are being carried, and there need to be
two signal wires to carry the balanced microphone signal.
The connectors need to have 3 connections – 2 for the signal wires and
one for the screen of the cable. Both TRS jack connectors and XLR
connectors could therefore be used, but because the microphones are
being used on location, XLR types are to be preferred as they are robust
and have a locking mechanism to prevent accidental disconnection.
Activity 15
–46 dBu can be thought of as –40 dBu and –6 dBu. –40 dB represents
one hundredth and –6 dB is a halving. Therefore, if 0 dBu is 0.775 V,
–40 dBu is 0.775/100 = 0.00775 V, and –46 dBu is 0.00775/2 = 0.003875 V
or about 3.9 mV.
Activity 18
One Gbyte is the same as 1024 Mbytes, so if one CD can store 640 Mbytes
of audio data, then 64 Gbytes can store 64 × 1024 ÷ 640 ≈ 102 CDs
worth of stereo digital sound.
Activity 19
RAM is not suitable because it is volatile which means that if the
battery in the audio player runs flat or needs to be changed, any sound
that is stored will be lost (but note that the device will contain some
RAM that is used as temporary storage during playing or recording).
Activity 21
Each sample requires 16 bits which uses 2 bytes. There are 2 tracks
(the left and right stereo tracks), so four bytes are needed to specify the
amplitude values for each sample point. If there are 44 100 samples
every second (sample rate is 44.1 kHz), then the total number of bytes
that must be read every second is 44 100 × 4 = 176 400 bytes.
Activity 23
The description is of the process of gating a digital sound signal, since
if the sound level (sample magnitude) is below a certain point (the
threshold magnitude), then the sound level is set to zero.
Activity 25
(a) If the sample rate is 48 000 samples per second, then within the
fade-in period there will be 48 000 × 2 = 96 000 sound samples.
At a point one quarter of the way into this period (i.e. after one half
a second), the sample number will be 96 000 ÷ 4 = 24 000.
(b) If the fade in period is linear, then after one quarter of the fade in
period, the sound amplitude should be one quarter of its final
level. Thus the multiplication factor for this sound sample should
be 0.25.
Activity 29
The difference in the number of bits is 24 – 16 = 8. Hence the number
that each sample has to be divided by is 2⁸ or 256.
Activity 32
From your study of Section 3 in Chapter 4 of Block 1, you should
remember that the reverberation consists of the direct sound, followed
by the early reflections and then the multiple reverberations. Figure 65
shows the typical form of this for the hand clap in a reverberant room.
[Figure 65 Answer to Activity 32: the response to a hand clap plotted as amplitude against time – the direct sound, followed by the early reflections and then the reverberation]
Activity 34
Cancellation first occurs at a frequency where the delay time is equal
to one half the cycle time of the signal. In this case, if the delay time is
1 ms, the cycle time of the signal must be 2 ms for cancellation to first
occur. This corresponds to a frequency of 1/0.002 s, which is 500 Hz.
The same situation occurs for every odd harmonic, i.e. 1.5 kHz, 2.5 kHz,
3.5 kHz, etc.
Activity 39
Compression is being carried out. The easy explanation is to say that
the dynamics type is ‘comp’ standing for compression! However,
without this indication, the graph indicates that for low input levels,
the output level rises linearly with input level (the graph is a straight
line at 45° to the axes). However, above a certain point, the output level
rises less than the equivalent input level rise. This has the effect of
compressing the dynamic range of the input signal. (Compare this
graph to the one shown in Figure 25.)
LEARNING OUTCOMES
Acknowledgements
Grateful acknowledgement is made to the following sources for
permission to reproduce material in this chapter:
Alistair Jones and Peter Peck of Yamaha-Kemble Music (UK) Ltd for
help with the AW16G case study material.
Figures 2(b), 30–38 and 60–63: ‘Yamaha AW16G Manual’ Yamaha-
Kemble Music (UK) Ltd.
TA225
Block 3 Sound processes
Chapter 2
Notation and
Representation
CONTENTS
Aims of Chapter 2
1 Introduction
4 Printing
Acknowledgements
AIMS OF CHAPTER 2
1 INTRODUCTION
One of the topics in Chapter 1 of this block was the storage of sound and,
in particular, the storage of music. Storage in that case depended on
representing the pressure wave that the listener hears in a permanent
form. The pressure wave in the air was picked up by microphones and
an electrical representation created. The electrical representation was
used to create a record in a permanent medium. In this instance, a digital
format was used, although analogue representation could have been used.
Another method of storing music is in terms of ‘codes’ or ‘instructions’
that represent how the music should sound or be played. Such methods
are not concerned with accurately recording a pressure wave, but with
storing information that will enable the pressure wave to be re-created
by an instrument. There are several methods by which this can be
done, from the simple example of the pins on a rotating cylinder in
a musical box to today’s MIDI system. Conventional music notation
too can be regarded as a system of codes or instructions for recording
and recreating music – with the proviso that there is an element of
approximation in the way that notation represents music, and a
degree of latitude in how it should be interpreted.
In this chapter we will be looking exclusively at conventional music
notation, and in Chapter 3 of this block you will find out about MIDI
and other forms of ‘coded music’. The purpose of this chapter, though,
is not to teach you how to use notation (that is, how to read it or how
to transcribe music into notation) but to look briefly at some of the
interactions between technology and notation. Much of the material in
this chapter will be presented using a number of video sequences,
rather than printed text.
A striking characteristic of Western art music is the way the word
‘music’ has almost become synonymous with music notation. This
dual meaning of the word is nicely demonstrated by the story of a
young music student who was about to perform to some examiners.
Before beginning, the student asked, ‘Do you mind if I play without the
music?’, to which an examiner is alleged to have answered, ‘By all
means dispense with the notation, but please let us have the music.’
This dual meaning of the term ‘music’ is understandable: it is almost
impossible to imagine how a Wagner opera or a Mahler symphony
could be performed without notation, or how they could have been
composed without the use of notation. This is not to say, however, that
Western art music has always been accurately notated in all respects,
nor that Western art music is the only kind that uses notation. In
earlier times, many of the details of rhythm, ornamentation and
dynamics in Western music were often left unnotated because it
was understood that the performer would supply what was missing.
Outside art music, in jazz, folk and popular music, where improvisation
is usual, notation is quite often used, though generally in a simplified
form as a reminder – a skeleton of the tune or the chord progression –
rather than as an encapsulation of the finished work. Indeed, Western
notation appears to have begun as just such an aide-mémoire, intended
for people who were already familiar with the music. In non-Western
music, notation is found in the musics of (for example) China, India,
Music notation has developed a range of uses beyond the obvious one of
transmitting a musical work from a composer to a performer and,
ultimately, to an audience, and in this section I want to look briefly at
some of these other functions.
One consequence of the development of notation has been the development
of a body of works (or canon) that are thought to be of special standing
historically or aesthetically (or both). For instance, the repertory of Western
music in the Mediaeval period is largely dominated by beautifully written
volumes of Latin church music, which were the repositories of specially
valued pieces. There are hints that notation played a part in regulation of
liturgical practices – the copies promoted the ‘correct’ musical forms for
church services. Similarly, composers of the nineteenth and early twentieth
centuries attempted to control performances of their music by making the
notation as detailed as possible.
Although many Mediaeval volumes were no doubt put to practical
use, some of the more elaborate ones were more probably library,
archive or presentation copies. Here notation was a way of ‘keeping’
music, an ephemeral art in performance, for eternity – or at least for
the next generations. At nearly all periods in Western history there seem to
have been collectors of music, a few of them probably not even able to make
sense of the notation themselves. (This was the case even at times when the
composers themselves had no thought of their music being performed by
posterity.) The possession of notated music was sometimes an end in itself
– a sign of culture and status – but more often it was specifically tied to the
owner’s affection for, or feeling of duty towards, a particular repertory.
Before the advent of musical recording, a volume of notation fulfilled much
the same function as a record – it was music waiting to be brought to life.
The origins of neumes are much disputed; their most likely provenance
was the grammatical signs that were used in Classical Greek, derivations
of which are found in many European languages (for example the acute,
grave and circumflex accents in French and Portuguese). Figure 1 shows
an example of a text with neumes marked on it and an enlargement.
Figure 1 A gradual notated in Breton neumes from the late ninth century
Listen to the audio track associated with this activity where you can
hear the hymn that supplied the syllables ut, re, mi, etc. for successive
notes of the scale. Listen for the rising scale created by the syllables
ut, re, mi, etc. I
Two composers particularly associated with rhythmic notation were
Léonin and Pérotin, names linked with the Notre Dame school. Léonin
(c. 1163–90) composed only two-part organa, whilst Pérotin
(c. 1160–1240) included parts for a third and fourth voice. Both
composers based their rhythmic notation on the ‘long’ and the ‘breve’
(‘breve’ meaning ‘short’), shown in Figure 4. These were the only two
rhythmic units used at the time. Notice that the notes are ‘filled’
rather than white; also, in this notation the stem indicated a longer
note, which is the opposite of modern usage.
Figure 4 Long and short notes in organum
Listen to the audio track for this activity which is an excerpt from
Pérotin’s organum Viderunt Omnes. I
4 PRINTING
4.1 Introduction
Following the development of methods for printing text and pictures in
the fifteenth century, ways of printing music notation were developed.
These methods aimed initially at emulating handwritten notation, but
many features of handwritten music notation did not lend themselves
to printed reproduction. Consequently printed notation developed
features that were specially adapted to printing, and in some cases
these found their way back into handwritten notation.
Printing certainly did not make handwritten music obsolete.
Until well into the twentieth century it was quite common for
professional performers to use handwritten parts under certain
circumstances. For instance, at a session for recording film music
it would not be economic to use anything other than handwritten (and
possibly photocopied) parts. Once the recording was made, the notated
music would have little further interest, and in many cases would be
discarded or lost. However, these are rather specialist circumstances,
and for most musicians nowadays, notated music nearly always means
printed music.
Printing is essentially the bulk reproduction of an image or text.
For many centuries an image or text for printing had to be created in a
special way so that it could be used as a means for applying ink to paper.
In looking at printing, therefore, we generally need to be concerned
with two aspects of the process:
1 The creation of the ‘master’ copy, which will be reproduced
identically in bulk.
2 The techniques by which an image of the master copy is
transferred to paper.
In the early days of printing, these two aspects were closely connected.
However, techniques of printing developed in the twentieth century
led to the separation of these two aspects, to the extent that, in modern
printing, the creation of the master image is completely separate from
the business of applying ink to paper.
In this section of the chapter you will be looking at some video sequences
under the title ‘Music printing’ which are concerned almost entirely with
the first of the two aspects given above, that is, the creation of the musical
master copy. These video sequences are introduced in the sections below
by very brief descriptions of the associated methods of transferring
images to paper – the second of the two aspects listed above.
4.2 Letterpress
One of the earliest methods of printing used the letterpress technique,
in which the image or text to be printed is created in relief, that is, as a
raised surface. Figure 6 shows a single letterpress character, or type,
for the letter n. Pieces of type such as this were almost invariably made
of metal, and created by casting (pouring molten metal into a mould).
A complete piece of text for printing is created by combining such
pieces of type into words (and spaces), and locking them solidly into a
frame. By passing an inked roller over the top of an assembly of such
characters, the printing surface gains a layer of ink. When a piece of
paper is pressed onto the top, it receives an impression (in reverse) of
the text, or image.
Figure 6 A letterpress character
When it came to adapting this technique to music, there were several
problems to overcome. One was that the characters of music notation sit
on continuous stave lines, rather than being isolated by white space, as
happens with individual letters of text. In addition, whereas in text all
the letters in a line sit on a common base line, the characters in music
can be on any line or space of the staff, or on ledger lines above or
below the staff.
One solution is to treat the piece of music as a single image and to
create a printing surface in relief by carving away extraneous material,
as happens in the creation of a woodcut or linocut. Another possibility
is to use printing to create the stave lines only, and to add the notes by
hand. Yet another possibility is double-impression printing, in which
the stave lines are printed first, and the notes printed afterwards. All
these systems were used for music, but none was entirely satisfactory,
and they all missed the benefit that comes from using separate pieces
of type, which is that the pieces of type can be disassembled after the
printing and re-used to print something else. The video section in the
following activity shows how these problems were overcome to allow
music printing by separate pieces of type.
4.3 Engraving
Engraving is a way of creating a printing surface that uses the intaglio
printing technique. Intaglio is the reverse of relief printing: instead
of using a raised image, intaglio uses a sunken image (Figure 7).
The image is created by engraving it into a metal plate with an engraving
tool. Alternatively, in etching, a wax-coated metal plate has the image
incised into the wax with a sharp tool. This exposes the underlying
metal, and when the plate is immersed in acid, metal is removed in the
places where the wax has been removed.
The image is printed by first wiping an inky cloth or roller over the
plate, which fills the recesses of the image with ink, and then wiping a
clean cloth over the plate, which removes ink from the non-engraved
area. When a piece of paper is applied to the plate, it takes the ink from
the engraved image, leaving a printed impression (again as a mirror
image) on the paper.
A single engraved plate would normally not carry just a single page of
music, but several pages (typically a multiple of four). During printing,
a piece of paper would be printed on both sides (one after the other),
using a different plate for each side. By folding and cutting, the piece
of paper becomes a series of pages, which are stitched or glued with
others to create the final book.
The term ‘engraved’ in a musical context has come to mean notation that
has the appearance of printed notation rather than handwritten notation,
irrespective of whether engraving was actually used to produce it.
Engraving was widely used from the Baroque era through to the mid-
twentieth century for the printing of music.
Watch the DVD video sequence 3 ‘The Halstan Process’. The Halstan
process was a proprietary stencil process used by the printer Halstan,
based in Buckinghamshire, which specialises in music printing. The
video section shows you how the master image was created, but does
not show any of the subsequent processes by which the litho plate is
produced. I
The following questions relate to the Halstan process shown in the last
activity.
(a) How were corrections made?
(b) How were words added for vocal items?
(c) Why was the master image larger than the printed size? I
What drawbacks do the Finn brothers see with the use of systems such
as Sibelius? I
There are several ways of getting the music notation from a computer
program such as Sibelius onto a piece of paper that a performer can
play from. One way is simply to output the music to a printer attached
to the computer. This is suitable if only one or a few copies are
required. For longer print runs, it becomes more economic to use print
technology. High-quality output from a good laser printer can be used
as the master image for photocopying or, if still longer print runs are
needed, for photolithographic printing. Alternatively, a computer file
can be used for laser-processing of a litho plate.
Watch the DVD video sequence 5 ‘The Future’. Jonathan and Ben Finn
speculate about the future of computer systems such as Sibelius. I
SUMMARY OF CHAPTER 2
Notation has served many functions in addition to that of conveying a
composition from a composer to a performer. Other important functions
include a mnemonic function, a regulating function, an organising
function during the compositional process, and an aid to analysis.
(Section 2)
Western musical notation evolved from a relatively imprecise method
(using neumes) for indicating pitch changes in plainchant. Neumatic
notation consisted of small lines placed over a written text. It did not
indicate rhythm. (Section 3)
The use of stave lines is associated with Guido d’Arezzo (though he may
not have been their originator). Guido d’Arezzo introduced other
innovations such as an early form of tonic solfa and the use of the
Guidonian hand as a teaching aid. Rhythmic symbols were first
consistently used in the notation of organum (which also used stave
lines). (Section 3)
Tablatures are systems of notation that indicate how notes should be
produced on particular instruments rather than what the notes sound
like. In lute tablatures, a set of horizontal lines represent the courses of
the instrument, and letters above each line indicate the fretting
positions. Rhythm is indicated by flags above notes to show their
relative duration. (Section 3)
Letterpress printing uses a raised printing surface. The term letterpress
is also associated with the use of separate pieces of type, which are
combined to make the printing surface. In music printing by
letterpress, separate pieces of music type had short sections of stave
lines and a musical character on a line or space. When pieces of type
were combined, neighbouring sections of stave line joined up, giving
the appearance of continuous stave lines. (Section 4.2)
In letterpress printing, it was not possible to beam notes together.
(Section 4.2)
Engraving is an intaglio process in which the image is recessed into a
metal plate. Early forms of engraving were freehand processes (apart
from the ruling of stave lines). Later forms used die punches which
were hammered into the plate to create a recessed impression of each
character. Notes could be beamed together in engraving. Corrections
were made by hammering on the back of the plate to remove the
recessed characters. (Section 4.3)
Etching, like engraving, is an intaglio process. A waxed metal plate had
wax selectively removed using a sharp tool, thereby exposing parts of
the underlying metal. When the plate was immersed in an acid bath,
exposed metal was etched away, creating a recessed version of the
image. (Section 4.3)
Photolithography uses a plane litho plate as a printing surface (not a
raised or recessed surface). The master image is transferred
photographically to a light-sensitive coating on the plate. This
chemically changes parts of the coating, so that non-image areas
become soluble. During processing, the coating in the non-image areas
is removed. The residual coating on the litho plate, representing the
master image, retains ink, whereas non-image areas repel ink.
(Section 4.4)
The Halstan process is a method of creating a master image of a piece
of music intended for printing by photolithography. Stencils are used
to create an inked image of the music on paper (after the paper has
first had stave lines ruled and the intended positions of musical
characters faintly marked). The image is created larger than the printed
size, and reduced photographically. Correction in the Halstan process
is much simpler than in engraving. (Section 4.4)
Computerised music setting systems (such as Sibelius) have turned
music setting into a desk-top process. Music notation can easily be
created, edited, transposed and played back. Parts can easily be
extracted from a score. (Section 4.5)
The final, corrected version of the music can be printed out on a
high-quality laser printer and used as the master for photolithographic
printing. Alternatively, a computer file can be used for laser-processing
of a litho plate. (Section 4.5)
Activity 5
(a) Because each note is a separate piece of type, beams cannot be
created that will join all the separate pieces of type in a run.
(b) Beaming enables notes to be grouped into beats, which makes
them easier to read. When the notes are separated, as in
letterpress, it is not so easy to sort them into beats at a glance.
For instance, in Figure 8, (b) is much easier to interpret than (a),
although the two pieces of notation represent the same thing.
Activity 7
In the ‘old style’, all characters except the stave lines were engraved
freehand. In the ‘new style’, die punches were used for all characters
except beams, ledger lines and long slurs.
Activity 9
(a) Corrections were made by simply painting over the errors with
typewriter correction fluid and re-creating the character.
(b) Words for vocal items were typeset separately as strings of text,
which were cut up and stuck down beneath the notes as required.
(c) Two principal advantages were claimed. Reducing the size
photographically gave a sharper image and made any corrections
less conspicuous.
Activity 11
They say these systems are not as flexible as pen-and-ink when it
comes to very old or very new music, which often uses non-standard
notations.
The brothers point out that traditional music engraving was a highly
skilled job, and engravers had to serve a long apprenticeship. Too
many computer users think high-quality music setting is easy, or that
the computer can do everything.
LEARNING OUTCOMES
After studying this chapter, and the associated DVD video sections,
you should be able to:
1 Explain correctly the meaning of the emboldened terms in the main
text and use them correctly in context.
2 Discuss some of the functions that music notation serves.
3 Summarise briefly the evolution of music notation from neumes to
the modern Western system (including tablature).
4 Outline the basic principles of letterpress printing.
5 Describe briefly the process of music printing by letterpress,
explaining how it relates to text printing by letterpress and what
problems music presents for the letterpress process. (Activity 5)
6 Describe briefly the process of music printing by engraving, and
discuss its advantages over letterpress printing. (Activity 7)
7 Outline briefly the photolithographic method of printing and
how a master music image may be created (and edited) for
lithographic printing by stencil methods and by computer programs.
(Activities 9 and 11)
8 Discuss some of the benefits and problems of using computer-based
music setting systems. (Activity 11)
Acknowledgements
Grateful acknowledgement is made to the following sources for
permission to reproduce material within this chapter.
Figure 1: MSS47, folio 34 verso by courtesy of the Bibliothèque
Municipale de Chartres; Figure 2: Copyright © Bibliothèque
Municipale de Valenciennes; Figure 3: The Bodleian Library,
University of Oxford, MS. Canon. Liturg. 216, fol.168r.
TA225
Block 3 Sound processes
Chapter 3
Carillon to MIDI
CONTENTS
Aims of Chapter 3 130
1 Introduction 131
2.1 Barrel orchestrions 133
2.2 Music in the street 134
2.3 Cylinder musical boxes 138
2.4 Disc musical boxes 139
3 Cardboard books and paper rolls 140
7.1 A simple MIDI set-up 155
7.2 MIDI channels 156
7.3 Real time operation 156
7.4 MIDI messages 157
7.5 Specification components 157
10.1 Channel messages 165
10.2 System messages 169
10.3 Running status 172
10.4 Message coding 174
11 More MIDI features 178
12.1 MIDI equipment 196
12.1.1 MIDI generators 197
12.2 MIDI in computers 205
12.2.1 Hardware 205
12.3 MIDI in film and TV music 208
12.4 MIDI limitations and improvements 210
12.5 MIDI and the TA225 Course Tune 211
Summary of Chapter 3 214
Appendices
Appendix 1 – Table of General MIDI pitched sounds 218
Acknowledgements 227
AIMS OF CHAPTER 3
1 INTRODUCTION
*Ord-Hume, A.W.J.G. (1978) Barrel Organ, George Allen & Unwin, London, p. 407.
Watch the DVD video sequence 2 ‘Eisbout’s Carillon’ which shows the
operation of Eisbout’s Carillon, which forms part of a clock chiming
mechanism. The hammers are operated by electric solenoids which
receive their instructions from a MIDI program running on a personal
computer. These instructions are stored on a floppy disk. I
Watch the DVD video sequence 3.1 ‘Pin barrel orchestrion’ which
shows a café pin barrel orchestrion chosen to demonstrate how a range
of different instruments can be controlled from a single pin barrel. As
the barrel rotates you can observe how the pins on the barrel engage
with the levers on the key frame to operate the various instruments.
During the sequence the operator shows how a different tune can be
selected. The barrel is turned by a substantial clockwork motor.
You can watch and listen to a complete performance of a piece of
music played on this pin barrel orchestrion in the performance section
of the video sequences in sequence 6.1 ‘Clockwork barrel orchestrion’.
Comment
Although this orchestrion is over 100 years old and in need of some
restoration, it nevertheless ably demonstrates a pin barrel in action. I
[Figure: the course tune set out as pins on a barrel – the position of each pin determines the note (related to a piano keyboard representation, e.g. C4 and C5) and the type of pin distinguishes crotchet from quaver notes.]
Watch the DVD video sequence 3.2 ‘Street piano’ which shows a
hand-operated street piano mounted on a carriage so that it could be
wheeled about the streets – in the case of this piano, the streets of
Warwick. Observe the crude nature of the mechanism which controls
the hammers. See also how different tunes are selected by using a
lever to move the barrel horizontally. I
Watch the DVD video sequence 3.3 ‘Barrel organ’ which shows a small
hand-powered barrel organ. The compressed air comes from bellows
situated in the base of the instrument and pumped by a crank attached
to the barrel. Note particularly how the different effects of sustain,
vibrato and trill can be generated from differently shaped pins. I
Pin barrel instruments vary both in size and in operation, from portable
20-note street organs, such as that featured in Activity 5, to large
orchestrions similar to the one in Activity 3.
The world of the hurdy-gurdy man, the name commonly given to the
operator of street organs in Europe, appears a sad and lonely one, as
may be seen from the contemporary illustration in Figure 7, where the
man is shown operating a portable street organ which is covered to
protect it from the weather. This loneliness is reinforced in the poem
Der Leiermann by Wilhelm Müller. Activity 6 gives you the opportunity
to listen to this poem.
Figure 7 A contemporary drawing of a 19th-century hurdy-gurdy man
In Great Britain the person operating the street piano was traditionally
known as an organ-grinder and would often be accompanied by a
pet monkey which would shake a coin tin at passers-by to attract their
attention and, more importantly, their money. Street pianos often played
out of tune, largely because damp weather affected the wooden piano
frame which held the strings. The action of the barrel pins operating a
simple mechanism to strike the strings afforded a very crude sound.
This gave the street piano a bad reputation, with the unsubtle,
ill-tuned music becoming a curse to many city-dwellers who were daily
subjected to this ‘entertainment’. It is said that many organ-grinders
were paid to go away rather than being rewarded for their performance!
In the middle of the 19th century the English mathematician and pioneer
of the modern computer, Charles Babbage, went as far as to appeal to
Parliament to get organ-grinders and their instruments banned from
London streets. Although quantity rather
than quality was offered, street pianos remained popular and were still
to be found in the streets of large cities in Great Britain up to the
outbreak of the Second World War in 1939.
The teeth form a comb, with each tooth cut to a slightly different length
to create a musical scale. Usually the cylinder was turned by a clockwork
motor. The pins plucked the teeth directly; there was no intermediate
mechanism except to operate the bells and small drums which were used
to augment the tunes. This may be seen in Activity 8.
Watch the DVD video sequence 3.4 ‘Musical box’ which shows a
musical box in operation. Note how automata* are used to ring the
bells under the control of the six pins at the right-hand side of the
cylinder. These pins are highlighted in the video sequence. Automata
appeared only in the finest musical boxes. I
of tunes to be played.
Watch the DVD video sequence 3.5 ‘Pin disc musical box and
polyphon’. The video opens with a domestic disc musical box and
then shows a large Polyphon. Note how the mechanism differs from
the cylinder musical box with an intermediate ‘star-wheel’ operating
the teeth. I
*Automata are moving figures of humans or animals that function while the
mechanism is playing and indeed may ‘play’ instruments such as bells.
I am sure you can appreciate the ease with which the disc can be fitted.
It occurs to me that this offers advantages to both the user and the
manufacturer. Can you suggest what they might be?
Comment
The advantage to the user is that new tunes could easily be purchased
or borrowed, making the pin disc musical box much more flexible than
the cylinder box. They were also less expensive to buy.
The advantage to the manufacturer is that there are future sales in discs
offering the latest tunes. I
Whilst pin discs overcame many of the limitations of cylinders, they
were developed too late. Phonograph cylinders and gramophone records
were making inroads into the mechanical music business of the early
20th century. Eventually they were to take the majority of sales and
put the mechanical music industry into permanent decline. The next
section introduces an alternative instruction system which superseded
the pin barrel mechanism and, for a time, even withstood the onslaught
of the gramophone.
From what you have read and observed in the preceding section, make
a list of any drawbacks you think the use of pin barrels for storing
music may have. To think about this you might find it useful to replay
the Section 3 video sequences on pin barrel instruments. I
Table 1 A comparison of the number of pianos in the UK and USA around 1900
Source: After Ord-Hume, A.W.J.G. (1984) Pianola, George Allen & Unwin, London, p. 124.
In that case, an enjoyable evening spent listening to the latest tunes
and singing along to old favourites could only be had when friends or
relatives who played came to visit.
One way to get more use out of the piano was to purchase Edwin
Votey’s Pianola, or ‘pushup’ as it became known (for you had to
push it up to the piano keyboard in order for it to work), shown in
Figure 10.
Watch the DVD video sequence 4.1 ‘Reiterating piano’ which shows a
small Italian boudoir card book player piano. Playing instructions are
contained on a cardboard book system. Note particularly how the
mechanism is restrained by the card and how the operator is able to
modulate the sound by use of a lever. I
Figure 13 A small reed organ being played by Paul Camps assisted by Course
Team member Richard Seaton
Here the paper acted as a valve to restrict the flow of air to the reed
which made the sound. (This is an example of a reed organ, where the
sound is created by vibrations induced in the reed by the flow of air.)
When the handle on the organette was turned, bellows in the base
created a vacuum. Any hole in the paper would cause air to be drawn
across the reed due to the suction created by the bellows. Acting as an
air valve caused less stress on the paper than would be created by
directly operating a mechanical system. As the paper was moved
forward, by the same motion that operated the bellows, the tune was
played. The organette may be seen playing in Activity 13.
Watch the DVD video sequence 4.2 ‘Organette’ which shows a small
domestic organette. This instrument has only fourteen notes but is
still capable of producing a good tune. The paper would normally be
stored as a roll. I
Comment
Advantages include the need for less storage space, as paper is thinner
and can be rolled rather than just folded. Also, paper is easier to
perforate than card, so smaller holes are possible.
The main disadvantage is that paper is more fragile and tears easily.
Mechanisms must thus be made to put as little stress as possible on
the paper, possibly making them more complicated, as you will see in
the next section. I
[Figure: the paper roll passing over the tracker bar, with suction applied through the holes in the paper.]
Run the computer animation for this activity which shows how a note
operates in the reproducing piano. I
Global sales of player pianos reached their peak in 1923 but by this
time both radio broadcasting and gramophone records were becoming
rival sources of entertainment in the home. By 1940 nearly all
manufacturers had either ceased production or had gone out of
business despite the fact that the player piano was, and still is, capable
of giving very fine musical performances as may be enjoyed in the next
activity.
Watch the DVD video sequence 4.3 ‘Reproducing player piano’ which
shows such an instrument in action. Notice the mechanism, but also
enjoy the superb sound of a truly fine musical instrument.
You can watch and listen to a complete performance of Thurlow
Lieurance’s By the Waters of Minnetonka played on this reproducing
player piano in the performance section of the video sequences in
sequence 6.2 ‘Reproducing player piano’. I
Can you think what might have happened if a pianist played a wrong
note whilst recording a piano roll?
Comment
If a wrong note was evident when the master piano roll was replayed
the offending hole was covered with sticky paper and a new hole cut in
the correct place using a hand punch. In this way, as with recordings
today, every blemish could be covered up and a perfect recording
would result. Even touch and tempo could be reworked. The composer
Percy Grainger was reported as saying that the piano roll reproduced
him not merely as he did play but as he “would like to play”*. I
*Ibid, p. 35.
Listen to the two audio tracks associated with this activity. On the first
track you will hear an excerpt of George Gershwin’s Rhapsody in Blue
made for the reproducing piano by the composer in 1925. In addition
to playing the solo music Gershwin added a piano reduction of the
accompaniment passages normally played by an orchestra. As it would
not have been possible to play both the solo and accompaniment
passages at the same time when recording the original piano roll, a
second pass was made to add the additional notes in a manner similar
to ‘over-dubbing’ on a tape recorder.
On the second track you will hear a similar excerpt from Rhapsody in
Blue, again played by George Gershwin but this time accompanied by
the Columbia Jazz Band conducted by Michael Tilson Thomas. This
recording was made in 1976 even though George Gershwin died in
1937! For this recording the accompaniment in the piano roll was
painstakingly removed by covering the holes corresponding to each
note of the reduction leaving just the solo piano passages. I
ACTIVITY 19 (REVISION)
Name some instruments where the note has to be formed before it can
be played. I
Unfortunately, despite being a wonder of its age, the electric café violin
is really an example of ambition over achievement, for the violin is not
well suited to mechanised operation.
Watch the DVD video sequence 5.2 ‘Banjo orchestrion’ which shows
the mechanical banjo orchestrion built in the early 1990s by Ramey &
Co of Detroit to a much earlier design. The banjo is accompanied by a
range of percussive instruments. The instrument is controlled by a
conventional piano roll which was prepared using the MIDI
technology that will be discussed later in this chapter.
You can watch and listen to a complete performance of a piece of
music played on this banjo orchestrion in the performance section of
the video sequences in sequence 6.3 ‘Banjo orchestrion’. I
Watch the DVD video sequence 5.3 ‘Electric orchestrion’ which shows
the Grand Electric Orchestra. It is a true tour de force of mechanical
entertainment incorporating a wide range of musical instruments and
special lighting effects – all controlled from instructions contained on
a roll of paper. I
Listen to the audio track associated with this activity. This is Conlon
Nancarrow’s Study for Player Piano No 49a played on a 1927 modified
Ampico player piano. Notice that the opening sounds like a normal
piano piece that could be played by any competent pianist. After about
20 seconds you will begin to realise that something else is going on
and after a few more bars I hope you are left with the impression that
either several pianists or multitrack recording is being employed.
However, the finale is so fast that only a mechanical piano could cope.
Speeds of up to 50 notes a second are quite normal in works written
by Nancarrow. I
Listen to the audio track associated with this activity. You will hear an
excerpt from Conlon Nancarrow’s Player Piano Study No. 11
beautifully played on a conventional Steinway grand piano by Joanna
MacGregor. It is a multitrack recording as the piece requires eight
hands to play it! I
A simple pin barrel musical box contains only one tune. Can you think
of two possible methods of altering the pitch of the tune without
altering its tempo?
Comment
The easiest method of altering the pitch is to simply slide the pin
barrel to the left or right by the required number of notes – assuming
the mechanics of the musical box permit this.
Another more time-consuming method would be to reposition every
pin on the barrel individually by the required number of notes. I
Listen to the audio track associated with this activity. This is J.S. Bach’s
‘Sinfonia’ from Cantata No. 29 played by Wendy Carlos on the Moog
synthesiser. This is another track from the Switched-on Bach album
that was featured in the Moog synthesiser video sequence in
Activity 22 in Chapter 8 of Block 2. As you listen to the music
remember that this was created part by part and sound by sound
using a monophonic synthesiser and multitrack recording techniques,
and was certainly not something that could have been done in a live
performance. I
These early synthesisers were still monophonic devices, but they were
all controlled by analogue electronic voltages which were supplied
from the keyboard. So by designing the individual synthesisers to
work with the same values of control voltage, it was possible to
connect one keyboard to a number of separate synthesisers – just like
the organ example above.
However, when polyphonic synthesisers appeared, this method of
control no longer worked as a single control voltage cannot easily be
used to indicate more than one key being pressed at a time. Within
these polyphonic devices, the keyboard would be scanned
electronically, and key presses would be communicated to the sound
generating circuitry by numerical codes.
So, theoretically, even with these early polyphonic synthesisers it
would have been possible to separate the keyboard from the sound
generating circuitry. Equally, the sound generating section could thus
be remotely controlled by codes which may not have come from a
keyboard – they might have come from an electronic store that sent
the correct codes at the right time to produce the required music.
Indeed, manufacturers did sometimes use this feature to produce
devices called sequencers that were able to store the key codes for one
or more pieces of music.
The problem was that manufacturers kept their key codes secret, and in
any case one manufacturer’s set of codes was incompatible with
another’s. This is quite understandable: if manufacturers provided a
remote control connection on a synthesiser, they did not want
customers to buy a competitor’s keyboard or sequencer to control
it with.
However, as long as there was no standard method of connecting
different types and makes of synthesiser and keyboard together, the
problem of producing multiple different sounds at the same time
without multiple keyboards would remain.
Comment
I said this because each manufacturer risked a loss of sales since, by
agreeing on a common interface, they were opening up the possibility
of customers buying equipment made by their competitors. I
The main limitations of the MIDI system that you should keep in mind
during the following discussion are:
• MIDI only contains codes for musical sounds (i.e. sounds with a
definite pitch), although there is provision for percussion and a few
other sounds. However, it cannot be used (yet!) for arbitrary
sounds.
• MIDI does not contain the actual sound you hear, it only contains
codes that instruct a MIDI device to produce musical sounds.
• Only the discrete pitches of the 12 pitch classes on a standard
keyboard can be transmitted – intermediate frequencies
cannot be represented or transmitted (yet!), although MIDI does
cater for the use of pitch benders, which can address this situation
somewhat.
• Only a small number of the many nuances of playing an instrument
can be accommodated.
• MIDI does not define the exact sound that should be heard,
although this aspect is now being addressed with General MIDI and
the new downloadable sounds specification.
Note my bracketed comments in the first and third items above; future
enhancements to the MIDI specification are likely to address these
limitations – indeed at the time of writing (2004) manufacturers are
starting to use special MIDI code sequences to address the pitch
limitation mentioned in item 3.
7 MIDI BASICS
In this section, I will outline the basics of the MIDI system, and this
will serve as an introduction to the more detailed examination in later
sections.
8 MIDI HARDWARE
[Figures: the 5-pin DIN plugs and sockets used for MIDI connections (pins numbered 1–5, with the cable shield carried on one pin), and a MIDI device showing its three connectors, labelled MIDI IN, MIDI OUT and MIDI THRU.]
With the connections as shown in Figure 18, the MIDI signal sent by
the keyboard when it is played is received by the first synthesiser.
Within this synthesiser, the keyboard signal is relayed to the second
synthesiser via the MIDI THRU connection.
If, on the other hand, the second synthesiser’s MIDI IN connector was
connected to the MIDI OUT connector of the first synthesiser, then the
MIDI signals from the keyboard would not be received by the second
synthesiser. But now if the first synthesiser also incorporated a
keyboard, then playing this keyboard would cause the notes to be
sounded on the second synthesiser (as well as the first synthesiser).
To confuse the situation even further, some synthesisers that
incorporate a keyboard have a ‘remote’ or ‘local off’ setting whereby the
keyboard is effectively disconnected from the sound generating section
of the device. If this is done, and the alternative connection scheme
mentioned above is used, then the second synthesiser will be played
by the keyboard of the first synthesiser and the first synthesiser will
be played by the separate MIDI keyboard!
Box 6 Optoisolator
An optoisolator is a small electronic component that contains a light source in close optical
contact to a light sensor (Figure 21). Both devices are enclosed in a light-proof moulding
so that they are not affected by external light. The light source is usually a component
called a light emitting diode (LED) which requires a current of about 5 mA to produce
sufficient light. There is no electrical connection between the LED and the light sensor.
Optoisolators work best with digital signals, and there are various
types of device designed to operate at different maximum data rates
(the higher the data rate, the more expensive the device).
[Figure 21: an optoisolator – a light source (LED) and a light detector in a light-proof encapsulation, with input and output connections.]
[Figure: the MIDI IN and MIDI THRU interface circuits. Serial MIDI data from the 5-pin MIDI IN connector passes through protection resistors and an opto-isolator; an output driver re-transmits the serial MIDI data on the MIDI THRU connector. The +5 V supply and device earth connections are marked.]
Note here that because no current flowing indicates a binary 1 data bit,
the idle state (i.e. the binary state when no data is being transmitted, or
when the MIDI input is disconnected) must also be interpreted as a
binary 1. This sounds a little odd, but this approach makes the MIDI
signal compatible with the serial system that has been used for many
years in computers, where the convention is that the idle state is a
binary 1. This also allows standard, readily available computer
components to be used to convert the serial data into MIDI bytes, as
you will see in the next section.
Figure 23 Parallel-to-serial conversion, where the least significant bit is sent first
If the data rate of a MIDI signal is 31 250 bits/s, what is the maximum
number of MIDI bytes that can be sent in a second? I
Run the computer animation for this activity. This is a simple animation
that shows how a MIDI byte is converted to serial form with the two extra
bits added, and how this data stream is decoded at the receiving end.
The simulation shows how the receiver looks for the start of a binary 0
(i.e. a 1 to 0 transition) and when it finds this it waits for a period
equal to one half of the time for a complete bit (1/31 250 s × ½ = 16 µs)
before sampling the data stream again. If the result is still a binary 0,
the receiver assumes this is a valid start bit (and not just some
noise or interference), and it proceeds to sample the data stream
every 32 µs (the time for one bit at a bit rate of 31 250 bits per
second). Each time it samples the data stream it notes the binary
state of the signal, and from this forms the 8 bits that form the
MIDI byte (the least significant bit of the byte is always sent first).
As a final check, the receiver checks that the signal is a binary 1 at one
bit time after the most significant bit has been received (i.e. detection
of the stop bit, although often receivers do not bother to do this).
Finally the start and stop bits are removed leaving the original eight
data bits. The receiver then waits for the next binary 0 and the
process starts again for the next MIDI byte.
Comment
You may wonder how the receiver detects the start bit of a MIDI byte
if the MIDI connection is made in the middle of a transmission.
In this case, it is possible that some bytes will be wrongly received, but
at some point the receiver will not receive either a start bit or a stop bit
when it is expecting one, and so eventually synchronisation will be
achieved. I
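The decoding procedure described in the animation can be summarised
in a few lines of code. The following Python fragment is my own
illustrative sketch, not the course animation: it assumes the incoming
signal has already been sampled once per bit period (so the half-bit
start-bit check is omitted), and assembles each byte from a start bit,
eight data bits sent least significant first, and a stop bit.

def receive_bytes(bits):
    # bits: a list of 0/1 line samples, one per bit period (idle state = 1)
    received, i = [], 0
    while i < len(bits):
        if bits[i] == 0:                           # candidate start bit found
            frame = bits[i + 1:i + 10]             # 8 data bits plus stop bit
            if len(frame) == 9 and frame[8] == 1:  # complete frame, valid stop bit
                byte = sum(bit << n for n, bit in enumerate(frame[:8]))
                received.append(byte)
            i += 10                                # start + 8 data + stop bits
        else:
            i += 1                                 # line is idle, keep waiting
    return received

# The byte 1001 0000 framed as start (0), LSB-first data, stop (1):
print(receive_bytes([1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1]))  # prints [144]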
10 MIDI MESSAGES
Having looked at the physical aspects of how MIDI bytes are transmitted
between two or more MIDI-equipped devices, we can now look at how
MIDI messages are formed from these bytes.
As I mentioned earlier, MIDI messages are composed of one or more
MIDI bytes and there are two types of byte – status and data. The
MIDI status byte is an instruction to do something and a MIDI data
byte provides any data that is needed before the instruction can be
carried out.
In this section, I will be introducing the features of the original MIDI
specification, and in later sections I will describe the major enhancements
that have been made to this specification over the years, that I mentioned
in Section 7.5.
There are two basic classes of MIDI message – channel and system.
Channel messages are the main set of messages that are used to
communicate music instructions, so I will look at these first.
Note On
This is the most commonly used MIDI message and is sent whenever a
note is to be played (e.g. when a note is pressed on the keyboard). The
Note On message requires three MIDI bytes – the ‘Note On’ status byte
and two data bytes which specify the pitch of the note to be sounded,
and the velocity with which the note has been played.
The pitch is specified as a number in the range 0 to 127 for each semitone
on a keyboard. Middle C (C4) is pitch number 60, so the C# above this
is 61, the next D is 62, and an octave above middle C (C5) is 72 (since
there are 12 semitones in an octave).
The velocity is a measure of how hard the note has been played, and
therefore indicates how loud the note should sound. On a piano, if a
loud note is required, the player will strike the key with a lot of force,
and the amount of force can be determined by measuring the speed at
which the key is pressed. Thus in a MIDI keyboard there is circuitry
for each key that measures the speed at which it is pressed, and from
this the required MIDI velocity value can be determined. This value,
like pitch, is a number in the range 0 to 127. High velocity values
indicate hard key presses and therefore loud notes.
Even with instruments that are not ‘touch sensitive’, such as a pipe
organ or some of the cheaper electronic keyboards, a velocity value
must still be included, so in these situations a constant mid-range
value of 64 is usually used.
Note that a velocity value of 0 is interpreted as zero velocity, or a note
played with a zero volume level. This is usually treated as the
equivalent of a Note Off message (see below), and this fact is put to
good use when running status is used (as will be explained later).
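To make the structure of the message concrete, here is a small Python
sketch of my own (Python is used purely for illustration; it is not part
of the course software). It builds the three bytes of a Note On message;
the status value 0x90 (144) for Note On on channel 1 follows from the
message coding rules discussed in Section 10.4.

def note_on(pitch, velocity, channel=1):
    # Build a three-byte Note On message: status byte, pitch, velocity.
    assert 0 <= pitch <= 127 and 0 <= velocity <= 127
    status = 0x90 | (channel - 1)  # message type in bits 4-6, channel in bits 0-3
    return [status, pitch, velocity]

print(note_on(60, 64))    # middle C (C4) at a mid-range velocity: [144, 60, 64]
print(note_on(72, 100))   # C5, an octave (12 semitones) above middle C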
Note Off
This MIDI status message turns a particular note off. As with the Note On
message, this message consists of a status byte (containing the message
type and MIDI channel) and two data bytes – the pitch of the note to be
turned off (0–127 as for the note on message), and a ‘release velocity’
(again a number in the range 0 to 127). Of course, this message will
simply be ignored if the specified note has not previously been turned
on, but the converse is not true, i.e. sending a Note On message when
the note is already playing will cause the note to be sounded again
with (possibly) a new velocity value.
The release speed of a note seems a rather odd parameter to send, and
indeed very few keyboards bother to measure this speed. In any case, it
is not clear what the release speed is supposed to indicate in terms of
the sound heard. In fact, the Note Off message is not used very much, as
more often notes are turned off by turning them on with a velocity of
zero, as mentioned above.
Remember also, that, depending on the particular instrument allocated
to the channel, the actual sound of a note may have disappeared long
before the MIDI signal contains an instruction to turn the note off. For
example, a piano-sounding note will decay over a fairly short time.
Conversely, if a note is played using an instrument that does not decay
(e.g. an organ sound), then this note will continue to sound until it is
turned off. This can sometimes lead to problems of ‘stuck’ notes in the
event of a fault, or if the receiver of the MIDI signal cannot process the
MIDI commands sufficiently quickly.
Aftertouch
Some electronic keyboards have a pressure sensor under the keys so
that in addition to measuring the speed at which a key is pressed, any
additional pressure that the player gives to the key whilst it is
depressed can be determined. This pressure is called aftertouch.
Aftertouch can be used to modify some aspect of the note being played
such as its volume, vibrato or timbre. There are two types of
aftertouch, one that affects each note individually, and one that affects
all notes currently being played. The former is usually only found in
the more expensive synthesisers, but it is common for a device that
does not have this facility on its keyboard nevertheless to respond in
some way to aftertouch MIDI messages.
Control change
Control change messages are used to indicate some sort of
modification to the sound through the use of a controller such as a
piano sustain pedal.
These messages contain the control change status byte (which includes
the channel number) and two data bytes representing the controller
type and the controller value. There is provision for a large number of
controller types – such as a modulation wheel, foot controller (a pedal
that controls the volume or some other parameter), master volume,
sustain pedal, reverberation level and stereo panning – and there are many
undefined controllers that can be used for future enhancements.
Indeed there have already been a number of new controller types
added since the original specification was prepared.
Some controllers require a simple on/off indication, but others are
continuous and may need a large number of different data values to
indicate a particular setting. Where the data value only requires two
states (e.g. on or off, as for a sustain pedal), then in general all values
between 0 and 63 indicate the control should be switched off, and values
between 64 and 127 are treated as indicating the control should be
turned on.
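As a small illustration (my own sketch, not taken from the MIDI
specification itself), the on/off convention for a switch-type controller
amounts to a single comparison in Python:

def pedal_is_down(value):
    # Switch-type controllers: data values 0-63 mean off, 64-127 mean on.
    return value >= 64

print(pedal_is_down(0), pedal_is_down(127))   # prints: False True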
The specification also caters for a ‘double precision’ data value to be
used for a continuous controller that requires more than 128 different
steps. If this is necessary, then two complete MIDI messages (two sets
of three MIDI bytes) are sent: the first contains the most significant
part of the controller value, and the second contains the least
significant part. In this situation, both status messages are identical
control change messages, but the controller type data values are
different (but related) to indicate which part of the controller’s value is
being sent. In this way, controller values of between 0 and 16 383 can
be communicated.
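In code, combining the two 7-bit parts amounts to a shift and an add,
as this illustrative Python fragment shows (the function name is my
own):

def combine(msb, lsb):
    # 'Double precision' controller value: 128 x msb + lsb, giving 0 to 16383.
    return (msb << 7) | lsb

print(combine(127, 127))   # 16383, the maximum value
print(combine(64, 0))      # 8192, the mid-point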
Channel mode
Channel mode messages affect how the receiving MIDI device is to be
configured. They comprise the status byte (which includes the channel
number) and two data bytes – the mode type and the mode data.
Program change
This MIDI message is used to select different sounds, sometimes called
patches or programs, for a particular MIDI channel. This is the
equivalent of changing the stops on a pipe organ. On most electronic
keyboards or synthesisers, there is a set of pre-assigned sounds, and
there is often a facility for having additional sets of user-defined
sounds. Take care here not to confuse MIDI channels and synthesiser
programs. Programs are the particular sounds that a synthesiser can
produce; any of these can be assigned to one or more MIDI channels
with the MIDI program change message.
In addition to the status byte, the Program change message contains one
data byte that defines the new program number (0–127). When MIDI
was first specified in 1983, it was thought unlikely that any electronic
keyboard would have more than 128 different sounds, but this has
proved to be a false assumption, and today keyboards can contain
many hundreds of sounds.
Note that this says nothing about the actual sound that the synthesiser
will produce (e.g. piano, strings, woodwind etc.); this depends on what
happens to be stored in the relevant program when the MIDI message
is received. The General MIDI enhancement to the original
specification that I will discuss later has attempted to address this
problem.
Pitch bend
Many synthesisers and electronic keyboards have a pitch bend wheel
that allows the player to raise or lower the pitch of all the notes
currently being played. When released, the pitch bend wheel generally
returns to its central, no-bend, position.
The MIDI Pitch bend channel message is used to indicate this action to
other MIDI devices, so when the player moves the pitch bend control,
a whole stream of MIDI Pitch bend messages is generated. As well as
the status byte (which includes the channel number), the pitch bend
message contains two data bytes that allow a maximum of 16 384 pitch
steps to be specified, where 0 = maximum pitch lowering, 8192 = no
pitch change, 16 383 = maximum pitch raising.
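The following Python sketch (my own illustration) shows how a
receiver might turn the 14-bit value into a pitch offset. Note that the
bend range in semitones is a receiver setting rather than part of the
message, so a range of plus or minus two semitones is simply assumed
here.

def bend_in_semitones(value, bend_range=2.0):
    # value: 0-16383, where 8192 means no pitch change.
    return (value - 8192) / 8192 * bend_range

print(bend_in_semitones(8192))    # 0.0 - no pitch change
print(bend_in_semitones(16383))   # just under +2.0 - maximum pitch raising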
The problem here is that the message contains no detail as to the size
of the pitch change each individual value represents (although it is
possible to specify this beforehand using special control change
messages). For example, the maximum bend might be a semitone, or an
octave or more, depending on the receiver’s setting. Another problem is
that this affects all notes on the specified channel. So, trying to
simulate a sound such as a violinist sliding up one string while
playing a constant note on another is not possible unless each string is
assigned its own MIDI channel.
In this activity you will use a ‘MIDI Demonstrator’ program that the
Course Team has produced to experiment with some simple MIDI
channel messages. Carry out the steps associated with this activity
which you will find in the Block 3 Companion. I
Data byte: pitch of first note
Data byte: 0 (switch on with zero velocity – i.e. silence the note)
Data byte: pitch of second note
Data byte: 0 (switch on with zero velocity – i.e. silence the note)
Data byte: pitch of third note
Data byte: 0 (switch on with zero velocity – i.e. silence the note)
As you can imagine, where there are large numbers of notes being
transmitted on the same channel, running status saves a significant
number of bytes, since a long run of notes can be transmitted using
just one Note On status byte at the start. Even real-time system
messages can be interleaved without the need for a new status byte.
However, remember that this situation will only occur if only one
type of message is being sent on a single channel; if the message type
or the channel changes, then a new status byte will have to be sent
each time such a change occurs, so little is saved if many such changes
happen in a short time.
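The saving is easy to see in code. This Python fragment (an illustrative
sketch of my own, covering only Note On events on a single channel)
emits the status byte once and then sends each note as a bare
pitch/velocity pair:

NOTE_ON = 0x90   # Note On status byte for channel 1

def encode_with_running_status(notes):
    # notes: a list of (pitch, velocity) pairs, all on the same channel.
    stream, status_sent = [], False
    for pitch, velocity in notes:
        if not status_sent:               # the status byte is needed only once
            stream.append(NOTE_ON)
            status_sent = True
        stream.extend([pitch, velocity])  # two data bytes per note
    return stream

# Sound middle C, then silence it by 'switching it on' with zero velocity:
print(encode_with_running_status([(60, 64), (60, 0)]))  # [144, 60, 64, 60, 0]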
Determine the denary value of the following binary numbers (the left-
most bit is always the most significant bit).
(a) 0000 1010
(b) 0011 0000
(c) 1100 1001
Comment
If you had difficulty in answering this activity, you should revise
Section 5.5.2 of Chapter 6 in Block 1 before continuing. I
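If you want to check such conversions, Python (used here purely for
illustration, not part of the course software) can convert a binary
string directly; the example byte below is deliberately not one of those
in the activity.

print(int('10010000', 2))   # prints 144 - the Note On status value met later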
So, in the MIDI system, how are these 256 denary values or MIDI codes
allocated? In fact the bits are used individually or in groups of two,
three or four, as you will see, but in each case the resulting MIDI byte
can be represented by one number within the range 0 to 255.
The MIDI specification states that any byte that has its most significant
bit set to 1 (the left-most bit when written down, or the bit that is sent
last when it is transmitted over the serial connection) should be
treated as a status byte; any byte with its most significant bit set to 0
is a data byte.
Ignoring the upper 4 bits, what MIDI channel are the following two
MIDI status bytes referring to?
(a) 1011 1100
(b) 1001 0110
What makes these bytes MIDI status bytes as opposed to MIDI data
bytes? I
So, in any MIDI status byte, bits 0 to 3 are used to specify the channel
and bit 7 is used to indicate a status byte. This leaves three bits
available (bits 4–6), or up to 8 different combinations, in which to
specify a particular MIDI message. Table 2 relates the eight possible
states of these three bits to the MIDI message that is represented.
When these bits are combined with the most significant bit and bits
0 to 3, a range of denary values for each MIDI message can be
determined, as shown in Table 3.
Table 3 MIDI message value ranges
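Since the value ranges in Table 3 follow mechanically from the bit
fields just described, the decoding can also be written out as a short
Python sketch (my own illustration; the table of message names
reflects the standard MIDI assignments, and channels are shown using
the usual 1–16 convention):

MESSAGE_TYPES = {                      # bits 4-6 of a status byte
    0b000: 'Note Off', 0b001: 'Note On', 0b010: 'Polyphonic aftertouch',
    0b011: 'Control change / Channel mode', 0b100: 'Program change',
    0b101: 'Channel aftertouch', 0b110: 'Pitch bend', 0b111: 'System message',
}

def decode_status(byte):
    assert byte & 0x80, 'a status byte must have bit 7 set to 1'
    message = MESSAGE_TYPES[(byte >> 4) & 0x07]   # bits 4-6 select the message
    channel = (byte & 0x0F) + 1                   # bits 0-3, displayed as 1-16
    return message, channel

print(decode_status(0x92))   # prints ('Note On', 3)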
(a) Work out the MIDI message and channel number represented by
each of the following MIDI status byte values.
(i) 195 (ii) 137
(b) What status byte value is required for the following MIDI messages?
(i) Pitch bend on channel 3
(ii) Channel aftertouch on channel 5 I
In Table 3 you may have noticed that the Control change and Channel
mode messages both use the same set of status values. How are the two
differentiated?
Comment
The difference occurs in the data byte that follows. Data values of
0–119 are used for Control change messages and values between 120
and 127 are used for Channel mode messages, as shown in Table 4.
Remember that data bytes must have their most significant bit (bit 7)
set to 0, so the maximum data value is 127. I
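A short illustrative Python check (my own sketch) makes the rule
explicit; controller number 123 is the standard ‘All notes off’ Channel
mode message:

def classify(first_data_byte):
    # Status values 176-191: the first data byte decides the message class.
    return 'Channel mode' if first_data_byte >= 120 else 'Control change'

print(classify(64))    # Control change (e.g. the sustain pedal controller)
print(classify(123))   # Channel mode ('All notes off')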
Follow the steps associated with this activity which you will find in
the Block 3 Companion. These contain details on setting up the
course’s music recording and editing software for MIDI operation and
some simple introductory MIDI exercises using the program. I
Having looked at the basic way in which MIDI operates, I want now to
introduce you to some of the major enhancements to the original
specification that have been introduced over the years. As you will see,
some of these enhancements go well beyond the original intention of
MIDI.
Since the details of many of these enhancements are quite complicated,
I will not be looking at them in as much detail as I have done for the
basic MIDI specification. However, for illustration and reference only I
will sometimes include tables of the relevant byte values that are used.
[Figure 26 MIDI connections for a sample dump using the handshaking mode: the computer’s MIDI OUT is connected to the synthesiser’s MIDI IN, and the synthesiser’s MIDI OUT back to the computer’s MIDI IN.]
Note that the loop point request/transmit message is used to save time when editing.
Box 8 (continued)
Since a wavetable can contain a large number of data values, having to
continually reload the data for the whole waveform just to try out an edit to
a small part can be time-consuming. This feature allows just the part of the
waveform that needs to be edited to be transferred, and so reduces the amount
of data that is transferred.
Note also that some synthesisers and sound generators just use their
manufacturer’s ID (and sometimes a sub-ID) instead of the second and third
data bytes listed above. They do, however, use the same protocol for
transferring data, but the data itself may not be wavetable data but the device’s
own particular configuration and patch data.
What’s gone wrong? Well, as I mentioned earlier, and as you should now
be able to appreciate, there is nothing in the basic MIDI specification that
indicates what actual sound is heard when a particular sound or patch
is selected. So for example in my scenario above, the synthesiser on
which the song was composed may have had the new piano sound
stored in its patch 60 and the new string sound in patch 124 – because
they happened to be spare patches at the time – and thus the MIDI
codes stored in the sequencer would have allocated these patches to
the MIDI channels used for the notes of the song using program change
messages. However, the computer used for playback had a trumpet
sound in patch 60 and the bird sounds in patch 124 (even though the
computer may well have perfectly decent piano and string sounds
available on other patches).
Given this situation, and as manufacturers started to adopt standard
sets of patches, the MMI stepped in and defined a set of sounds called
General MIDI (GM). Whilst doing this, the MMI took the opportunity
also to include some other components in the GM specification to
ensure compatibility as detailed for reference only in Box 9.
When a program change message is used to select one of the GM sounds,
the actual message must contain a data byte value of one less than the
required patch number, since the patches are numbered from 1 to 128
whereas the program change data byte has a range from 0 to 127.
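As an illustration only, the following Python sketch (my own, not part of any particular MIDI software) builds the two bytes of a program change message, assuming channels numbered 1 to 16 and patches numbered 1 to 128:

    def program_change_bytes(channel, patch):
        # Status byte: 0xC0 (program change) with the channel number less
        # one in bits 0 to 3.
        status = 0xC0 | (channel - 1)
        # Data byte: one less than the required patch number (range 0-127).
        return bytes([status, patch - 1])

So selecting patch 60 on channel 1 gives the two bytes 0xC0 and 59.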
Many synthesisers contain additional banks of sound patches beyond
the basic 128 GM set. To cater for this, there is an additional SysEx
message to switch GM sounds on or off. There is also an additional
master volume SysEx message which controls the overall volume – not
just the volume of the sounds from one MIDI channel.
Follow the steps associated with this activity which you will find in
the Block 3 Companion. These contain some simple experiments with
General MIDI sounds. I
How is the SMPTE code used in audio? Clearly when the code is used
it must in some way be locked to an audio recording so that for
evermore the code represents the absolute time interval from the start of
the material. If the code and the audio material were to get out of lock,
then the code would become useless in providing a means to identify
individual parts of the recording. Locking the code to the recording
means that on replay, not only can the replay speed be checked for
accuracy, but where the replay needs to be synchronised to another
sound, its speed can be dynamically adjusted so that the time codes
keep in step with the master time code generator.
The problem here is that audio is continuous whereas video and film
are discrete, i.e. they are composed of individual frames. Sound does
not have any such discrete 'frames'. The answer is simply to 'mark' the
audio signal at the regular SMPTE frame rate intervals.
So, for an original audio recording, a separate audio track is used to
store the SMPTE time code. This can be done before, during or after
the recording itself.
How are the MIDI Song pointer and Timing clock related to the music
being played? (You may need to refer to the information given on system
real-time MIDI messages in Section 10.2 to answer this question.) I
Full messages
For cueing purposes, it is convenient to be able to send a complete
SMPTE time code so that a slave device can move directly to a
particular position in a song.
To cater for this, MTC defines a universal real time SysEx sequence
that contains the complete SMPTE time code. Also included in this
message is a SysEx channel number so that individual slaves can be
addressed. One of these channels is reserved to indicate that all slaves
in the MIDI chain should respond.
When this message is received, the only action that a slave device
should take is to move to the defined place in the current song. Playing
should not start until a MIDI start message is received, or MTC quarter
frame messages start.
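For illustration and reference only, the following Python sketch shows one way of assembling such a full message; the byte layout is my reading of the MTC specification (a universal real-time SysEx message with two sub-identification bytes) and should be treated as an assumption rather than as a definitive reference:

    def mtc_full_message(device, hours, minutes, seconds, frames):
        # F0 7F <device> 01 01 <hr> <mn> <sc> <fr> F7
        # A device value of 0x7F is the reserved 'all slaves' channel.
        # The SMPTE frame-rate variation is carried in the top two bits of
        # the hours byte; the 24 frames-per-second variation (00) is
        # assumed here for simplicity.
        return bytes([0xF0, 0x7F, device, 0x01, 0x01,
                      hours, minutes, seconds, frames, 0xF7])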
User bits
As mentioned earlier, the SMPTE time code contains some user bits
that can be used for date/take details etc. Since this data tends not to
vary with each individual time code, MTC does not code it with the
individual quarter frame or full messages, but provides another SysEx
message for this purpose.
Notation information
Sometimes a device needs specific musical notation information; this is
particularly the case where a musician needs to be informed of the
number of beats in a bar and when each bar starts. So MTC has another
SysEx message that allows the master device to send details about the
current time signature and also a bar marker that indicates the start of
a bar. Such information allows a slave device to produce a click or
illuminate a set of lights in sequence and in time with the MIDI
messages so that a musician can play in time.
Cueing messages
Finally, MTC provides a further range of SysEx messages that can
inform the slave devices of a number of different ‘events’, and when
these events should be carried out. There are messages that set up
these ‘events’, and create a list of actions that should be carried out by
the slave as the song is played. These messages can be generated from
the master device’s edit list and used to inform a slave device of the
editing processes the slave needs to carry out as the song is played.
Enter Standard MIDI files (SMF). These are special computer files
that not only store the MIDI messages but also retain the timing
information that is needed to enable each message to be sent at the
required time. On a desktop computer, MIDI files usually have the
extension .MID or .SMF.
Basic format
SMF uses the Interchange File Format (IFF) that is used for storing
digital sound samples using the AIFF and RIFF WAVE formats.
These were introduced in Chapter 1 of this block.
As with the AIFF and RIFF WAVE formats, the basic building block of
an SMF is the chunk which, as explained in Chapter 1, allows other
types of data (e.g. digital sound samples) to be included in a file of
MIDI data and also caters for future expansion by allowing new chunk
types to be specified.
As you may recall from Chapter 1, every chunk contains three sections:
• the chunk identification (a four-character word);
• the chunk size in terms of the number of bytes of data that follow;
• the actual chunk data (which sometimes starts with a chunk type).
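To make this concrete, here is a small Python sketch (mine, for illustration only) that steps through the chunks of an SMF; note that, as in AIFF, the chunk size is held as a four-byte big-endian number:

    import struct

    def read_chunks(filename):
        # Yield (identification, data) for each chunk in a Standard MIDI file.
        with open(filename, 'rb') as f:
            while True:
                header = f.read(8)
                if len(header) < 8:
                    break
                # A four-character identification ('MThd' or 'MTrk') followed
                # by a four-byte count of the data bytes in the chunk.
                ident, size = struct.unpack('>4sI', header)
                yield ident, f.read(size)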
Chunk types
For files containing MIDI data, there are two defined chunk types –
header (chunk identification ‘MThd’) and track (chunk identification
‘MTrk’). The file must start with a header chunk and this is followed
by one or more track chunks.
The interpretation of the word ‘track’ in this context needs some
explanation. A track describes any set of MIDI messages held in a
single track chunk. It could apply to the MIDI messages for all the
instruments, channels and other information used in a song (i.e. there
is only one track chunk), it could comprise the MIDI messages for just
one or a small number of MIDI channels (in which case there would be
one track chunk for each channel or channel set used), or it might
comprise a number of different songs or other MIDI data that are
conveniently stored in the same file – or any combination of these.
Box 11 gives some brief details about the information contained in the
header chunk that must be included at the start of an SMF file.
Why are SMF file sizes likely to be much larger if a fixed number of
bytes is used to specify the delta time? I
Well, I'm sure you guessed it: MIDI has now branched out well beyond
the vision of the original designers into a more general control system
that can be used to control not only sound devices but also lights,
staging and effects such as smoke machines.
MIDI machine control (MMC) was the first step in this direction, and
it used MIDI SysEx messages specifically to control the emerging new
hard disk digital recorders (see Box 14).
MMC has now largely been superseded and incorporated into a much
wider control specification called MIDI show control (MSC), which
specifies commands to control a wide range of devices that can be
associated with and/or may need to be synchronised with music.
Such devices include lights, stage machinery, video and film equipment
and special effects. The specification is aimed principally at the
professional user in a theatrical or similar environment, but simpler
implementations can be used by smaller concerns and amateurs in
applications such as disco lights.
Through this brief look at MIDI show control, I hope you can envisage
how MSC can be used to control the technical aspects of a complete
performance from a single control device, which may not be much more
than a desktop computer.
The final major addition to the original MIDI specification has been
developed for just the above sort of situation.
The MIDI Downloadable Sounds (DLS) enhancement to the original
specification defines an industry standard approach to storing sound
sets or patches that use sample-based wavetable synthesis (as described
in Chapter 8 of Block 2) and transferring them between synthesisers.
By using DLS, precise specifications for how a particular patch should
sound can now not only be stored on a computer, they can also be
transferred between DLS-compatible synthesisers even over the
Internet and even if the synthesisers have been manufactured by
different companies.
12 MIDI IN ACTION
Watch the DVD video sequence 1 ‘MIDI equipment’. This first sequence
in the ‘MIDI in action’ set of sequences contains four short sections.
In Section 1.1 ‘Simon’s background’, Simon Whiteside introduces
himself and tells us a bit about his background. In Section 1.2 ‘Simon’s
studio’ Simon gives us a tour of his small studio, and explains the
various items of equipment he has and their uses.
In the final two sections 1.3 ‘MIDI basics’ and 1.4 ‘General MIDI’ Simon
gives us a broad overview of the MIDI system and General MIDI from
his own perspective.
As you watch the video sequence, make a few notes about the important
points Simon makes about his equipment, and his use of MIDI and
General MIDI. I
other controls, and the keys usually include aftertouch. Quite often
though, many of these MIDI functions can be generated using a
synthesiser or electronic keyboard, where the device’s MIDI OUT
signal is used as the ‘MIDI keyboard’. In this situation, if some
processing of the MIDI signal is required before it is fed to the device’s
sound generating section, then this will not be possible unless the
synthesiser has a local on/off mode.
What is meant by a local on/off mode, and how can a local off mode be
used to solve the situation outlined above? I
As useful as a MIDI keyboard is, it is not the only device that a musician
can use to generate MIDI.
Another common MIDI generator is the MIDI drum controller. Drum
sounds have always been popular in MIDI as they allow people who
are unable to play drums or other percussion instruments the chance
to add drum sounds, and therefore rhythm, to their music using a
MIDI keyboard to generate the required MIDI messages.
On the other hand percussionists also like to use MIDI to enhance
the range of sounds they can produce. So there is available a large
range of MIDI drum kits that can be played just like a conventional kit,
but instead of producing drum sounds, just produce MIDI messages.
Some of these devices are designed to be played with the fingers
rather than using sticks, but others are full sized drum kits that can be
played just like their conventional equivalents – but of course they
make little noise (Figure 28). This aspect is a distinct advantage to a
percussionist wishing to practise without disturbing the neighbours!
Another possibility is to place a microphone near
a real drum and feed the microphone’s signal
to a device called a MIDI trigger. This device
causes MIDI messages to be sent in response to
the sound signal coming from the microphone.
Of course once the drum ‘sounds’ are in the form
of MIDI messages, they can be used in any way
the musician wishes – for example to change the
sound from a snare drum to a timpani or even to
control pitched sounds such as strings or piano.
It’s not only percussionists that like to get in on
the MIDI act. Players of other instruments would
sometimes like to be able to use the facilities
that MIDI provides – even if it’s only to be able
to practise at home with headphones on at two
in the morning! Thus manufacturers have come
up with a number of novel ‘MIDI instruments’
that mimic their conventional counterparts, but
provide a MIDI output signal. Sometimes these
are actual instruments that produce sound but
have a special pickup, but others produce no
sound.
Figure 28 A MIDI drum set
Figure 31 A keyboardless MIDI sound generator
Finally there are what one might call speciality musical MIDI devices
like the MIDI melodeon in Activity 48, and it is here that we come
full circle from the original code-operated musical instruments that I
introduced in Section 2. I would like to mention just two of these
‘speciality’ devices – one is the MIDI-driven carillon that I have
already talked about, and the other is a MIDI controlled piano which is
the modern equivalent of the barrel and player pianos.
Piano manufacturers are now producing modern 'player pianos'. These
can not only be played just like a normal piano, but can also be
controlled by MIDI messages rather than by a paper roll, complete with
the keys moving just like the original paper-roll instruments. Of course
they also provide a MIDI OUT signal when played, and will incorporate
some form of storage device and/or computer interface. Interestingly,
the top-of-the-range models are purchased mainly by the rich and famous
and used with MIDI files to provide player-less background music at
parties etc. rather than being used for serious music or MIDI work.
This is just the same situation as when pianos became a fashion item in
the late 1890s and early 1900s, as I mentioned earlier in Section 3.
Play the MIDI file associated with this activity using either your
normal MIDI-playing software or the course’s music recording and
editing software. This is just a few bars of Thurlow Lieurance’s Indian
love song By the waters of Minnetonka. This is the piece of music that
you heard the player piano playing from a piano roll in Activity 16.
Unfortunately an exact MIDI copy of the piano roll could not be found,
so this version has been generated specially for the course by the
Course Team. I
12.2.1 Hardware
The hardware of a computer refers to the physical bits and pieces that
make up the computer and include not only the electronics inside the
main box, but also the extra devices or peripherals that are needed
such as the computer’s keyboard, display and mouse. All desktop
computers have some sort of hardware ‘sound card’ which acts as a
sound interface and provides analogue audio inputs and outputs and
which contains analogue-to-digital and digital-to-analogue converters
that enable sound input to and output from the computer. The ‘sound
card’ may be integrated into the computer’s main circuitry or it may be
contained on a separate electronic circuit board (but still within the
main computer box).
Today most of these sound interfaces also contain a MIDI input and a
MIDI output (although as mentioned earlier in this chapter, an adapter
lead is often needed if the standard 5-pin DIN sockets are needed – see
Figure 19). Some interfaces even contain sophisticated sound generators
and can interpret MIDI messages themselves to produce an analogue
music output directly. Indeed such is the progress of technology these
days that it is possible to integrate a complete synthesiser on a computer
sound card that provides General MIDI as well as user programmable
and downloadable patch capabilities.
An example of such a sound card is shown in Figure 33.
Figure 33 The Yamaha SW1000XG PCI sound card, which contains a complete
synthesiser with full MIDI facilities. The card provides an analogue audio
input, left and right analogue audio outputs, an S/PDIF digital output and
a MIDI interface.
12.2.2 Software
Software refers to the programs that, when executed (or run), cause a
computer to carry out some function. Some software is provided to
help execute more complex programs – the Windows® operating
system is an example of this type of program, and is often called
system software. In fact Windows itself is composed of many smaller
programs that do specific jobs, e.g. looking after reading from and
writing to secondary memory (the hard disk), and organising the loading
and execution of the user's programs, which are known as applications.
All desktop computers will have the necessary programs to provide basic
sound input and output, and to interface to the MIDI connections.
However, these are of limited use without applications that can facilitate
making music with MIDI.
Many of these application programs provide a complete MIDI
environment that includes a multitrack sequencer, MIDI editing, time
code generation, and may even provide a General MIDI synthesiser.
More sophisticated programs will integrate MIDI tracks with sound tracks
and provide a complete sound creation and editing environment.
The course’s sound recording and editing software is an example of
this type of program.
An important point to note is the idea of a software MIDI synthesiser.
It is quite possible to implement General MIDI using a program rather
than an actual electronic device such as those found in synthesisers.
12.2.3 Latency
One problem area when using a computer simultaneously for both MIDI
and audio is that of making sure the sounds from the MIDI data are
heard at the correct time in relation to the analogue sound created
from the digital audio data. Any delay of either the digital sound or
the MIDI sound is known as the latency. Why might this occur?
If you think about the situation of a computer playing back a song in real
time that uses both digital sound and MIDI data, then it is reasonable
to suppose that the following situations might occur:
• the sound samples get delayed because perhaps they have to be
sent to an external sound unit;
• the MIDI sound is delayed because the computer is using a
software MIDI sound generator that takes some time to respond to
each MIDI message and generate the required sound.
There are many factors that can affect the synchronisation between digital
sound and MIDI sound, and you may have found in the practical work
associated with this chapter that you had to adjust the settings of the
course’s sound recording and editing software to compensate for the
latency on your computer.
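The compensation itself is conceptually simple. The following Python fragment is an illustrative sketch only (the 40 ms figure is an invented example; in practice the offset is measured or entered in the software's settings) showing the idea of scheduling one stream's events early to bring the two back into step:

    MEASURED_LATENCY_SECONDS = 0.040   # invented example figure

    def compensate(event_times, latency=MEASURED_LATENCY_SECONDS):
        # Schedule each event early by the measured latency so that the
        # sound it produces is heard in step with the other stream.
        return [max(0.0, t - latency) for t in event_times]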
Latency can also occur in a recording where the performer is playing
along to a previously recorded sound and there is a delay between the
performer’s recording and the existing recording. This means that on
playback the two sounds do not sound together as they should.
However, it is important not to confuse latency with inadequate
performance. Latency just means that a sound or a MIDI signal
gets delayed in relation to the other signals it is supposed to be
synchronised with. It does not mean that the computer cannot cope
with the speed or amount of data that it has to process in order to
provide the sounds in real time. If performance is the problem, then
this is likely to manifest itself in unpredictable effects such as a
variable delay or corruption of the audio and/or MIDI sounds.
Watch the DVD video sequence 2 ‘MIDI in film and TV music’ from
the set of video sequences collectively entitled ‘MIDI in Action’.
This sequence contains six short sections.
There are a number of general points that come out of the video sequence
in Activity 52 – some of the more important ones are mentioned below,
but you will probably have noted a number of other points as well.
• The original video footage has the SMPTE time code ‘burnt in’ to it
to provide the means of synchronisation.
• The mood and pace of the music should be such as to enhance and
not distract from the viewing experience.
• There are a number of ways of entering the notes of the music into
the computer – playing them directly on a music keyboard, placing
the notes on the score display or entering them individually into a
MIDI list or ‘piano roll’ display.
• The finished music is usually supplied as sound (rather than MIDI
codes) on digital audio tape (DAT) or CD. This is because General
MIDI cannot be relied upon to produce the sounds sufficiently
closely, and also there are many better sounds available that are not
covered by General MIDI. However, the use of DLS, whereby wavetable
data is supplied along with the MIDI codes, may allow MIDI files
rather than digital audio files to be delivered in such instances in
the future.
• The stability of the playback of today’s digital sound and video
recording systems is such that continuous synchronisation is
not needed for short pieces of music, particularly as here where
very precise synchronisation (such as lip-sync) is not required.
Only a few specific synchronisation points are necessary for
synchronising the sound to the picture.
• Using live musicians brings a number of problems to the process
in terms of having to manually adjust the scores the computer
generates and also the legal problems and costs of recording
rights.
Listen to the two audio tracks associated with this activity. Both tracks
contain the final full piece of music that Simon Whiteside composed
for the section of the ‘Leonardo’ programme that was the subject of the
video sequences in Activity 52. One of the tracks is the version using
live musicians and the other uses only sounds from Simon’s various
synthesisers and samplers, and is the version used for the ‘international’
version of the programme.
Which do you think is the one that uses live musicians?
Comment
You should not have found it too difficult to detect that the first
version is the one that uses live musicians. However, I hope you will
agree with me that the synthesised version is extremely convincing,
and that in context it is unlikely that a viewer who is concentrating on
the action on screen would be aware that the music is in fact not
generated with live musicians. I
In this activity you will experiment with the score, list and ‘piano roll’
MIDI editors in the course’s sound recording and editing software.
You will find the steps for this activity in the Block 3 Companion. I
As you have seen in this chapter, MIDI is certainly a very useful tool
in the generation and distribution of music. But even though the MIDI
specification has been and is still being enhanced to incorporate more
features, it is after all basically a method for coding music and as such
will always have its limitations – just as conventional music notation
and piano rolls do.
There are two areas where MIDI has its main problems – functional
and physical. The functional problems include musical considerations
such as catering for different temperaments, timbres and the nuances
of live players, and the physical problems include the speed of MIDI
messages and the connection system.
However, as you will see in the next activity, Simon Whiteside sees a
further limitation of MIDI as being how to get certain nuances of
performance into MIDI codes even though the MIDI system does
already have facilities to record such features.
Clearly then MIDI has been and still is a very robust system that works
well, and has certainly brought music creation to many people who
cannot read music or play an instrument.
Why should the speed of MIDI messages cause a problem with dance
music where there is a lot of percussion? I
In this final section of the chapter, you will work again with the
TA225 Course Tune, but this time you will incorporate MIDI sounds.
In fact, the MIDI version of the tune was produced before any of the
live versions were recorded since all the performers in the live versions
played along to the MIDI version with a click added. They used
headphones to listen to the tune so that it and the click didn't get recorded
with their performance. You will see Simon Whiteside doing just this
in the final set of video sequences in this chapter.
The Course Team gave Simon a copy of just the melody line of the
TA225 Course Tune with no indication of harmony or tempo and
asked him to produce three arrangements of the tune – one that used
General MIDI, one that used synthesised sounds and one free choice
version which we hoped would include some live music. In the next
activity you will see what Simon did with the tune.
Watch the DVD video sequence 4 ‘MIDI and the course tune’. This
sequence contains five short sections.
As before you should make some brief notes about the points Simon
makes concerning the sounds he used, and the way he went about
producing the three course tune versions.
Comment
In the score displays shown during the General MIDI version, notice
again the crude layout of the notes. As Simon mentioned in an earlier
sequence, if the score was needed for live musicians, then there would
need to be some human intervention to get the score in a form that
musicians would comfortably be able to play from.
Make sure you understand the distinction between sampled sounds
and synthesised sounds. Both are controlled by MIDI codes, but
sampled sounds are generated originally from live instruments using
looping techniques as explained in Chapter 8 of Block 2 whereas
synthesised sounds are sounds produced totally electronically. A set
of General MIDI sounds may well use both sampled and synthesised
sounds – the method of production of the sound is not laid down, only
that the sound produced should be like the instrument named in the
specification.
Comment
Whichever version you personally prefer, I hope you will appreciate
first of all how different all three are from the original versions of the
TA225 Course Tune that you have worked with. The reasons for this
are not just in the slower speed of the tune that Simon decided to use,
but also in the harmonies and counterpoint he used and in the
‘embellishments’ he incorporated (the rhythmic patterns and the
glissandi in the harp part for example). All of these aspects combine to
create interest and variation in the pieces.
Notice also how each version presents its own technical challenges.
The General MIDI version uses a large number of MIDI channels and
has lots of different ‘instruments’ playing. The dance version requires
the use of synthesised sounds which need to be carefully selected, and
the music itself is difficult to input to the computer as some of the
parts cannot be played directly and need to be input note by note.
From a musical point of view the jazz version is quite straightforward
for a musician who is used to playing jazz music (as Simon is).
However, as well as the complication of adding the live melody horn
sound, in order for the result to sound convincing, sampled sounds
had to be used. Notice in particular here that the drum sound Simon
used was created not from the sample of a single brush hit (i.e. using
brushes as a side drum beater as mentioned in Activity 1 of Chapter 5
in Block 2), but from a sample of the complete basic rhythmic pattern
of the piece. This sample lasts a full bar of the music, and at the start
of every bar there is a simple MIDI message to start the sample playing
(two MIDI messages are in fact used as you will see in Activity 61). I
Simon has provided the Course Team with MIDI files for all three of
his arrangements of the TA225 Course Tune. In this activity you will
use the course’s recording and editing software to examine and play
these three files, and in the case of the dance and jazz versions, to try
to edit them to produce acceptable results. You will find the steps for
this activity in the Block 3 Companion.
Comment
I hope this activity has demonstrated the advantages of General MIDI
in enabling people to create MIDI files that they can be sure will
always sound similar to how they intended – as long, of course, as
General MIDI is used when they are played! I
In this final activity of the chapter, you will return to the original
version of the TA225 Course Tune, this time incorporating MIDI into
the mix. Like the last activity in Chapter 1, this is an open-ended
activity, and you can spend as much or as little time on it as you wish.
The Block 3 Companion contains some more information about this
activity and an outline of the procedure you should follow. I
In this chapter and Chapter 1 of this block you have learned about the
technology behind the recording, editing and mixing of both digital
audio and MIDI. In addition I hope you feel you have gained some
valuable practical skills in these areas and that these will spur you on
to experiment with creating your own music using these skills and the
software provided with the course. Good luck!
SUMMARY OF CHAPTER 3
Pin barrels or cylinders store music as pins around the circumference.
When the barrel is rotated the pins engage with the mechanism of the
instrument to play the tune. The position and size of the pin determines
the note. The speed of rotation then determines the tempo. Instruments
include carillons, organs, pianos and orchestrions. Drawbacks to using a
barrel include the expense of manufacture and limitations in the number
of tunes offered. Barrels are also expensive to replace. The pin-disc,
used in Polyphons and the like, overcame many of these drawbacks but was
superseded by the gramophone record. (Section 2)

Paper rolls and to a lesser extent cardboard books overcame the drawbacks
of cylinders. These use the Jacquard concept whereby holes in the paper or
cardboard carry the instructions for the instrument. The hole causes a note
to be played and the length of the hole determines the length of the note.
The speed of the paper affects the tempo. Used mainly to control pianos,
the original Pianola or pushup involved adding a mechanism to the front to
play the keys of a standard piano. Later the player piano was produced that
incorporated the mechanism in the piano. (Section 3)

Many novel mechanical devices have been produced to play a variety of
conventional musical instruments such as the violin and banjo. Some of
these even controlled lighting and other effects, all from instructions on
a perforated paper roll. Many well-known pianists recorded performances on
piano rolls, and some composers composed music specifically for mechanical
instruments, and in particular the player piano. The popularity of the
player piano was affected both by the improvements in the gramophone and
by radio broadcasting. (Section 4)

A look at mechanical music shows that coded music can only contain
information about a limited number of aspects of the music. Continuous
variations in quantities such as pitch, timbre and dynamics are difficult
if not impossible to code satisfactorily. To ensure the music is reproduced
from the music codes as intended, there is a need for standards; even so
there is no guarantee that the music will sound exactly as intended.
(Section 5)

The Musical Instrument Digital Interface (MIDI) is a system for
communicating music as a series of codes and was originally designed to
enable one or more sound generators to be controlled from a single
keyboard. In the early days of electronic music, sounds were produced
using analogue techniques. They were monophonic and were controlled by
varying voltages. When polyphonic digitally controlled synthesisers
appeared, variable voltages could not be used to control the sound
generating devices, and this led to the development of the MIDI system
which allowed one keyboard to control a number of synthesisers,
independently and simultaneously. MIDI has become so successful that it is
now being used in applications which have little to do with the original
concept of music codes. (Section 6.1)

As useful and popular as MIDI is, it is not a universal solution for all
music applications. There are many facets of music that MIDI cannot cater
for. However, it is likely that future enhancements will bring the MIDI
system ever closer to this universal goal. (Section 6.2)

In a basic MIDI system, there is one master controller that generates MIDI
messages. The controller is connected to one or more devices that respond
to these messages. (Section 7.1)

Each one of the basic set of MIDI music codes can be associated with up to
16 separate channels. Receiving devices can be set up to respond to one or
more of these channels thus allowing the controller to control the devices
individually if required. (Section 7.2)

MIDI messages are sent along a single cable serially, i.e. one after the
other. The speed of transfer of individual messages is such that usually
there is no perceptible relative delay in the music that is generated from
these messages. However, the finite speed of transfer can produce problems
in situations where there is a lot of MIDI activity. For such situations an
additional MIDI connection is often used. (Section 7.3)

A MIDI byte consists of eight bits of data, and is the smallest unit of
MIDI data. MIDI messages are made up of one or more MIDI bytes.
(Section 7.4)

The MIDI specification comprises details on the hardware and electrical
signals to be used and how the data is to be interpreted. (Section 7.5)

MIDI messages are carried along a one-way communication path using a
balanced cable. In the original specification the connector at each end is
a standard 5-pin DIN plug. A full MIDI port consists of three 5-pin DIN
socket connectors – MIDI IN, MIDI OUT and (optionally) MIDI THRU. The MIDI
IN connector is used to receive MIDI messages, the MIDI OUT connector is
used to output MIDI messages, and the MIDI THRU connector outputs a relay
of the messages received on the MIDI IN connector. Because of the physical
size of the 5-pin DIN connector, other types of connection are often used
with computers, and this necessitates the use of an adapter unit or cable
to provide the standard MIDI connectors. (Section 8)

A MIDI connection uses an optoisolator in the receiver to electrically
isolate the items of MIDI equipment. An optoisolator is an electronic
component that uses light to transfer a digital signal from its input to
its output. The MIDI specification therefore is given in terms of the
current needed to drive an optoisolator device – current flowing indicates
a binary 0, no current flowing a binary 1. (Section 9.1)

MIDI bytes are sent using asynchronous serial transfers which involve
adding a start and a stop bit to the eight data bits. The addition of these
bits allows the receiver to receive first the individual bits and then each
MIDI byte correctly. (Section 9.2)

There are two classes of MIDI bytes – status and data. MIDI status bytes
are instructions to do something and MIDI data bytes are the data that the
instruction requires (if any). MIDI status bytes are of two types, channel
and system. (Section 10)

Channel messages contain a channel designation and are the messages that
carry the main sets of music codes. There are seven possible channel
messages – Note On, Note Off, Aftertouch (polyphonic and channel), Control
Change, Program Change and Pitch Bend. Note On and Note Off messages are
the basic instructions to play notes. Each requires two MIDI data bytes – a
pitch specification and a velocity value. The data values range between 0
and 127 with 60 being the pitch value for middle C (C4), and 127 being the
highest velocity value. Aftertouch refers to additional pressure that the
player applies to the keys after pressing the notes. Aftertouch can affect
individual notes (polyphonic aftertouch), or all notes (channel
aftertouch). Control Change messages refer to a modification of the sound
either through the use of a controller such as a sustain pedal, or through
effects such as overall volume and reverberation. There are many MIDI
Control Change codes unallocated that can be used for future enhancements.
Channel mode messages indicate how the receiver should be configured –
polyphonic or monophonic, local control on or off, omni mode on or off
(i.e. respond to all MIDI channels or not) and switch all notes off.
Program change messages allow the selection of different patches (different
sounds) for note messages on a particular MIDI channel. Pitch bend allows
the pitch of a note to be varied as it sounds. (Section 10.1)

MIDI system messages are of three types – common, real time and exclusive.
System common messages contain a number of instructions concerned with
playing a pre-recorded set of MIDI codes that form a song. System real-time
messages are instructions that must be acted upon straight away. As well as
transport controls (start, stop etc.) there is a MIDI clock message to aid
synchronisation, a message to indicate that the transmitter is still
connected and operating and a system reset command. System exclusive
(SysEx) messages contain arbitrary device- or manufacturer-specific data.
Manufacturers have to register with the MIDI Manufacturers Association
(MMA) to obtain a system exclusive identification code. System exclusive
messages can be designated as being real-time (i.e. they must be acted upon
immediately), or non-real-time. (Section 10.2)

In order to reduce the amount of data that needs to be transmitted, running
status may be used whereby a string of MIDI messages which have the same
status byte can be sent with the status byte being sent only once at the
start of the string. (Section 10.3)

MIDI status bytes are indicated by having their most significant bit
(bit 7) set to 1. This means that MIDI data values must be in the range 0
to 127. Bits 4, 5 and 6 of a status byte are used to determine the status
type, and bits 0 to 3 are used to indicate the MIDI channel. (Section 10.4)

Over the years since the original MIDI specification was published, a
number of enhancements have been added. Sample dump is an enhancement that
allows wavetable or patch data to be transferred to and from a synthesiser.
This is achieved using special SysEx messages and there are two methods of
sending the data: one uses a one-way connection with no handshaking and the
other uses handshaking and requires a two-way connection using two MIDI
leads. The data is transferred in packets of 120 bytes. (Section 11.1)

The General MIDI enhancement was brought in in response to the problems
people were having because there was no common set of sounds. General MIDI
(GM) details a set of 128 different musical sound patches and 47 percussion
sounds that a compliant device must be able to produce. In addition GM
specifies at least a 24-note polyphony capability and that the GM device
must be able to respond to Note On velocity values, Channel Aftertouch and
a number of specific controller status messages. General MIDI 2 (GM2)
further enhances GM to include 32-note polyphony, the ability to produce 16
different musical sounds and 2 different percussion sounds simultaneously
and the ability to respond to a number of additional control change
messages and some new SysEx messages concerned with overall effects such as
reverberation and chorus. General MIDI Lite and Scaleable Polyphony General
MIDI are subsets of GM designed for use primarily in mobile telephone ring
tones. Scaleable polyphony GM places a priority on MIDI channels so that a
less-capable device can ignore the lower priority channels it is unable to
cope with. (Section 11.2)

MIDI time code (MTC) is an enhancement that provides a constant time
interval reference for synchronisation purposes. This is in contrast to the
MIDI timing clock message that is related to the tempo of the music. MTC is
compatible with the SMPTE time code that has been in use in the film and
television industries for many years. The SMPTE time code was originally
based on the number of picture frames that had occurred since the start of
the film or video. Different frame rates are catered for with different
variations in the SMPTE time code. When applied to sound, the SMPTE time
code indicates the time that a particular part of a recording should occur
relative to the start of the piece. MTC uses a previously undefined MIDI
status byte to provide a special time code message that is sent at a rate
of between 96 and 120 times each second (depending on the SMPTE frame rate
variation used). There are also a number of additional SysEx messages to
give additional timing and associated data information. (Section 11.3)

Standard MIDI files (SMF) are used to store sequences of MIDI messages.
These files use the same basic Interchange File Format (IFF) as used for
AIFF and RIFF WAVE digital audio files. SMF uses two chunk types – a header
chunk that contains information on the form of the MIDI data, and one or
more track chunks that contain the MIDI data itself. The time relationship
of the MIDI messages is retained by adding an item of data called the delta
time before each MIDI message (or event) that indicates how long after the
previous one the next MIDI message is to occur. A delta time of 0 means
there should be no time interval between messages and that they should be
sent one directly after the other. To cater for very long and very short
delta times, and to minimise file sizes, the delta time is specified using
a variable number of bytes. SysEx messages can be incorporated, and there
are a number of other items of information called meta events that can also
be stored. One of these is a lyric that can allow words to be attached to
associated MIDI music messages. (Section 11.4)

MIDI machine control (MMC) and MIDI show control (MSC) are enhancements
that in the main use special SysEx messages to control a wide range of
devices that may be associated with or may need to be synchronised with
MIDI music codes, audio signals and video. MMC is designed for controlling
hard disk recorders and similar audio equipment in order to synchronise
audio and MIDI. MSC incorporates and extends MMC to include control of a
whole host of different devices including stage machinery, video and film,
lighting and special effects. Through the use of MSC, all aspects of a
complete stage performance are able to be controlled using MIDI messages
sent along standard MIDI connections. (Section 11.5)

MIDI Downloadable Sounds (DLS) allows wavetable data to be sent along a
MIDI connection or stored in a MIDI file. This means that the precise
sounds a song requires can be sent or stored along with the music data so
that when the song is played on a DLS-equipped wavetable synthesiser the
sound should be heard exactly as was originally intended. DLS provides for
communication/storage of the basic digital samples forming the wavetable
data, details about any loop points that should be used for the
steady-state sound, information on any vibrato or tremolo that should be
added, descriptions of how the volume and pitch of the sound is to vary
through the start-up, steady-state and release phases of the notes and
details of the behaviour of the sound in response to pitch bend and MIDI
Control Change messages. (Section 11.6)

The uses of MIDI now stretch far beyond those originally envisaged, and its
shortcomings in terms of the physical connections, the data rate and the
limited number of channels are becoming more evident. In the future it is
likely that a new specification will be produced that will address these
shortcomings whilst still providing backwards compatibility with the vast
numbers of existing MIDI devices. (Section 11.7)

There are three main categories of MIDI equipment – those that generate
MIDI messages, those that manipulate them and those that interpret MIDI
messages to produce sound. The main MIDI generator is the MIDI keyboard,
but there are other common devices such as MIDI drum sets and wind
controllers. There are also a number of novel MIDI instruments – either new
devices or conventional instruments with a MIDI interface attached. Devices
that manipulate MIDI messages include sequencers, expanders, filters,
mappers, patchbays, multiport interfaces, patch editors and librarians and
diagnostic devices. MIDI sound generators include electronic keyboards,
synthesisers – with or without keyboards – drum machines and conventional
instruments fitted with a MIDI interface. A MIDI implementation chart is a
chart with a standard layout that indicates the MIDI capabilities of a
device. (Section 12.1)

The functions of MIDI generator, manipulator and sound generator are all
available in today's desktop computer, although generating MIDI music codes
may be easier with the use of an attached MIDI music keyboard, and better
sounds may be obtained by using an external synthesiser. For a more
sophisticated set-up, a desktop computer may be augmented with a number of
peripherals such as a multiport interface, mixing units and sound cards
incorporating a hardware synthesiser. Sophisticated software for desktop
computers is available that provides a complete integrated MIDI and audio
environment. Latency is the term used to describe any delay in an audio or
MIDI path that causes two sounds to become out of synchronism by a fixed
amount of time. The delays most commonly occur in desktop computers either
due to delays in the computer processing the sound data (MIDI or audio) or
through delays in generation of the sounds. Most computer sound programs
have facilities to cater for latency. Latency should not be confused with
delays arising from inadequate performance that can produce variable
synchronisation problems as well as problems with the sounds themselves.
(Section 12.2)

The study of a professional musician/composer at work shows that MIDI is an
invaluable tool in the creation of background music for film and television
programmes. MIDI offers great flexibility in terms of being able to try out
ideas, the fine adjustment of tempo, the ease of entering and editing the
music, and even, if necessary, in producing the final music without the
need to use live musicians. Synchronisation between music and pictures is
easily maintained by using a desktop computer as a central controller. The
computer controls the playback/recording of the sounds (either as MIDI
codes or digital audio) as it replays the pictures which have previously
been transferred to the computer from video tape. The final sounds are sent
to the production company as digital audio, and the stability of digital
audio and video playback is such that continuous synchronisation between
the two is not necessary, and only a small number of specific
synchronisation points need to be specified. (Section 12.3)

Although MIDI is a robust system that works well and is now an essential
tool in music making, it does have its limitations. Some of these are
fundamental in that using a system of music codes can never exactly
represent all aspects of music. MIDI also has some physical/electrical
problems in terms of the data speed and connection system. Another problem
is that of capturing the nuances of live performers even when the MIDI
system is capable of representing such nuances. In the future the MIDI
system is likely to be further enhanced in terms of both its functionality
and physical/electrical aspects. (Section 12.4)

The creation of different types of music using MIDI codes requires
different approaches. Sampled sounds are used when sounds that mimic those
of real instruments as closely as possible are needed. Sometimes samples of
whole phrases or rhythm patterns rather than just of a single note are the
best way to obtain a realistic effect. Synthesised sounds are more often
used when 'new' sounds are required. The use of MIDI enables passages of
music to be created that would not otherwise be able to be played in real
time. (Section 12.5)
APPENDICES
The tables in these appendices are given for reference only and should
be used to obtain a general idea of the range of sounds that General MIDI
provides for and the types of equipment and operations that MIDI
Show Control includes.
Value Command
0 Reserved for future extensions
1 Go
2 Stop
3 Resume
4 Timed go
5 Load
6 Set
7 Fire
8 All off
9 Restore settings
10 Reset
11 Timed stop
Activity 7
As you will recall from Chapters 1 and 2 in Block 2, the pitch of the
note produced increases as the length of a pipe is reduced.
Miniaturisation meant that pitches of notes produced by the
mechanisms would be unrealistically high.
Activity 11
There are at least three major drawbacks. Firstly, no tune can be longer
than the time it takes for the barrel to complete a single revolution.
Tunes had to be arranged to fit this time which of course varied
between instrument models and makers. Secondly, the barrel is
difficult, although not impossible, to change so the range of tunes
offered was limited and difficult to update. Finally, each barrel
had to be hand-made and so was very expensive to buy.
Other problems that you may have considered include the size of the
barrels, the fact they are quite fragile (a single broken pin would spoil
the music and be difficult to repair), and the poor action that led to
mediocre piano performance.
Activity 19
Most stringed instruments other than pianos and harpsichords require
the player to form the note. Violins, cellos, double basses, guitars,
banjos, etc. all rely on fingering to make the note on the string
before playing it.
Activity 25
The pin barrel system reproduced the music as written, but the
performance was dependent upon the skill of the artisan who made
the pin barrel. The selection and placement of the pins would
directly affect the resulting performance.
The piano roll was a faithful reproduction of the performance of the
original artist (with minor corrections!). Naturally the quality of the
player piano could affect the sound but the performance was that of
the artist. So, in Activities 23 and 24, when you listened to Rhapsody
in Blue you were hearing George Gershwin himself play perhaps his
most famous work.
Activity 29
Chapter 1 of this block mentioned that the use of a twisted pair
balanced connection reduced the effects of interference from other
electrical sources. A screened cable also helps with this.
The greater the length of lead, the more chance there is of interference
becoming a problem. So for cable lengths of 15 m it is reasonable to
suppose that some protective measures, like using a balanced signal
via a twisted pair of wires and screening, might need to be used.
Activity 30
The interconnection diagram is shown in Figure 34.
Activity 31
The serial conversion process involves the addition of two extra
bits to the eight MIDI bits, giving a total of 10 bits for each MIDI byte.
The maximum number of bytes is achieved by sending each
byte straight after the previous one with no intervening time gap, so if
31 250 bits can be sent each second, a maximum of 31 250 ÷ 10 = 3125
MIDI bytes can be sent.
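You can take this arithmetic one step further. The short Python calculation below (an illustration of my own) shows the maximum number of complete three-byte messages, such as Note On, that can be carried each second:

    BITS_PER_SECOND = 31250
    BITS_PER_BYTE = 10      # 8 data bits plus a start and a stop bit

    bytes_per_second = BITS_PER_SECOND // BITS_PER_BYTE    # 3125
    # A Note On message is three bytes (status, pitch, velocity), so at
    # most 3125 // 3 = 1041 complete messages fit into one second
    # (ignoring running status).
    print(bytes_per_second, bytes_per_second // 3)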
Activity 33
(a) The A above middle C (A4).
(b) The C an octave below middle C (C3).
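A small Python function (my own sketch, using the convention from Section 10.1 that note number 60 is middle C, written C4) can be used to check answers of this kind:

    NOTE_NAMES = ['C', 'C#', 'D', 'D#', 'E', 'F',
                  'F#', 'G', 'G#', 'A', 'A#', 'B']

    def note_name(note_number):
        # Note 60 is C4, so the octave number changes at each C.
        octave = note_number // 12 - 1
        return NOTE_NAMES[note_number % 12] + str(octave)

For example, note_name(69) gives 'A4', the A above middle C.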
Activity 36
(a) Starting on the right, the weightings of the bits that are 1 are:
4 + 8 = 12
(b) Starting on the right, the weightings of the bits that are 1 are:
16 + 32 = 48
(c) Starting on the right, the weightings of the bits that are 1 are:
1 + 8 + 64 + 128 = 201
Activity 37
Looking at the weightings of the lowest four significant bits only:
(a) 4 + 8 = 12, so MIDI channel 13 is being referred to.
(b) 2 + 4 = 6, so MIDI channel 7 is being referred to.
Both bytes have their most significant bit set to 1, which makes them
both MIDI status bytes.
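These two decodings can be combined, as the following Python sketch (again my own, for illustration) shows; it extracts the status type from bits 4 to 6 and the MIDI channel from bits 0 to 3:

    def decode_status_byte(status):
        # Status bytes have their most significant bit (bit 7) set to 1.
        if status & 0x80 == 0:
            raise ValueError('this is a data byte, not a status byte')
        status_type = (status >> 4) & 0x07   # bits 4, 5 and 6
        channel = (status & 0x0F) + 1        # bits 0 to 3, channels 1 to 16
        return status_type, channel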
Activity 38
(a) (i) Program change on MIDI channel 4
(ii) Note Off on MIDI channel 10
(b) (i) 226
(ii) 212
Activity 42
The MIDI clock is sent at a rate dependent on the crotchet (quarter note)
speed of the music being played. In turn, the song pointer is related to
the number of MIDI clocks. So both of these will vary with the tempo
and content of the music, and are not fixed to absolute time.
Activity 43
If a fixed number of bytes were to be used, the actual number must
cater for the largest possible delta time, which could be a long time and
therefore require many bytes. However, it is likely that many
individual MIDI messages will need to occur at the same time, or
within a very short time interval, i.e. have a delta time of 0 or a small
number. Having to use up many bytes of data in every such case is
clearly going to add a large overhead to the amount of data that needs
to be stored, and therefore will increase SMF file sizes.
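As an illustration of the variable-length scheme (a sketch of my own, following the seven-bits-per-byte layout that SMF delta times use), the following Python function encodes a delta time so that small values need only a single byte:

    def encode_delta_time(value):
        # Split the value into 7-bit groups, least significant group first.
        groups = [value & 0x7F]
        value >>= 7
        while value:
            # Every byte except the last has its top bit set to indicate
            # that more bytes follow.
            groups.append((value & 0x7F) | 0x80)
            value >>= 7
        # The groups are stored most significant first.
        return bytes(reversed(groups))

A delta time of 0 encodes as the single byte 0x00, whereas a fixed four-byte field would have to store four bytes for every such event.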
Activity 46
A single MIDI port can only carry 16 channels, so if more than this
number are needed, then additional MIDI ports need to be used (i.e.
additional sets of connections). Modern samplers/synthesisers now
often incorporate two or more MIDI ports (i.e. more than one set of
DIN connectors – see Figure 17). In addition, a separate MIDI interface
box is used to allow MIDI data to be routed from and to specific devices.
Activity 47
A local on/off mode is where the synthesiser or MIDI device has
the facility to disconnect its keyboard circuitry from its sound
generation circuitry. Thus in local off mode, when the keyboard is
played, the MIDI OUT signal contains the MIDI messages
corresponding to the key presses, but no sound is generated. However,
in this mode, MIDI messages fed to the MIDI IN input do get
interpreted by the sound production circuits.
So, in the situation where some processing of the MIDI signal needs to
be done before the signal is used to generate any sound, the
synthesiser is put in local off mode, the device’s MIDI OUT signal is
fed to the MIDI manipulation device, and this device’s MIDI OUT
signal is fed to the synthesiser’s MIDI IN.
Activity 50
(a) The D-50 transmits MIDI note numbers from 24 to 108. If 60
represents the pitch C4, then 24 represents C1 and 108 represents
C8. Thus the pitch range is C1 to C8.
(b) In the MIDI implementation chart, key aftertouch (polyphonic
aftertouch) has a cross in both transmit and receive columns
indicating it does not implement this feature. However, the
channel aftertouch row has an asterisk in both columns indicating
that this feature can be switched on or off under user control, and
that the setting is memorised. Thus the D-50 does implement
channel aftertouch.
Activity 51
For processing digital sound data the main requirement of the
computer is to be able to store large quantities of data (i.e. the digital
sound samples) and to move them around very quickly. Thus the
critical aspects will be the speed of operation of the computer, the
amount of main memory and possibly the size of the secondary
memory (hard disk).
Activity 53
The main advantages of using MIDI are:
• the music can be entered in a number of ways to suit the composer's
ability and choice; some of these do not require the music to be played
in real time or the composer to be able to read conventional
music notation;
• the timing of the music can be continuously adjusted to fit the video;
• the music can easily be cut up, copied and pasted;
• a good idea of how the final music will sound can be obtained by
using General MIDI sounds even though eventually special sounds
and/or live musicians will be used.
Activity 57
There are two aspects to dance music where there is a lot of percussion
that contribute to causing a problem with the speed of MIDI messages.
First the General MIDI specification allocates channel 10 to percussion
sounds, and since a MIDI device often sends MIDI note messages in
channel order, the data for channels 1–9 must be sent before the
percussion data is sent. This means that there may be a variable delay in
the percussion sounds sounding. Since these sounds usually contain the
pulse of the music, any delay, and particularly a varying delay, may cause
the beat of the music to appear to vary and also cause the percussion to
perhaps not sound in synchronism with the other instruments.
The other problem is the quantity of messages that percussion sounds
use. Consider a drum roll. Each individual hit of the drum in a roll has to
be indicated by a separate MIDI message. Put these all together to create a
roll and you get a huge rush of MIDI messages in a very short time. The
quantity of these messages can sometimes cause the MIDI transmission
system to overload resulting in delays to other MIDI messages and/or an
uneven drum roll.
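To put some illustrative numbers on this (an example of my own, using the byte rate worked out in Activity 31):

    # A one-second drum roll of 32 hits needs a Note On and a Note Off per
    # hit, each of three bytes (ignoring running status).
    hits = 32
    bytes_needed = hits * 2 * 3                # 192 bytes
    seconds_of_bus_time = bytes_needed / 3125  # roughly 0.06 s
    print(bytes_needed, seconds_of_bus_time)

So this one roll alone occupies the MIDI connection for around 60 ms, during which every other message has to queue.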
Activity 59
In order to synchronise the melody horn to the MIDI version of the
jazz arrangement, Simon needed to listen to the MIDI sounds and also
to hear the pulse of the music as an audible click while he played.
These sounds must not themselves be recorded along with the melody
horn, so Simon had to listen on headphones. However, unless the
headphones are of the ‘enclosed’ type, it is quite possible that the
unwanted sounds would still be heard in the background of the melody
horn recording. (In fact, many of the cheaper unenclosed types of
headphone are very bad at containing sound, which leaks into the area
surrounding the listener – how many times have you heard quite clearly
what the person sitting next to you on the train is listening to on
their personal jukebox!)
LEARNING OUTCOMES
After studying this chapter you should be able to:
Acknowledgements
Grateful acknowledgement is made to the following sources for
permission to reproduce material in this chapter:
Figures 28 and 29: Yamaha-Kemble Music (UK) Ltd; Figure 32:
Roland (UK) Ltd; ‘Music in Code’ video sequences: Paul Camps,
Janet Whitehead and the late Graham Whitehead, founder of the
nickelodeon collection at Ashorne Hall, Warwickshire; ‘Music in Action’
video sequences: Simon Whiteside; translation of Der Leiermann in
Activity 6: Janet Seaton.
Acknowledgement
Cover image: © 1997 Photodisc, Inc.