Ebook TA225 Block 3 Part 2 ISBN0749258993 L3

1 TA225 BLOCK 3 SOUND PROCESSES CHAPTER 4 MUSIC DISTRIBUTION 1
TA225 The Technology of Music
Sound
Processes
Chapter 4 Music Distribution page 3
Chapter 5 The Music Business page 75
Index page 137
This publication forms part of an Open University course, TA225 The Technology of
Music. Details of this and other Open University courses can be obtained from the
Course Information and Advice Centre, PO Box 724, The Open University, Milton Keynes
MK7 6ZS, United Kingdom: tel. +44 (0)1908 653231, email general-enquiries@open.ac.uk
Alternatively, you may visit the Open University website at http://www.open.ac.uk
where you can learn more about the wide range of courses and packs offered at all
levels by The Open University.
To purchase a selection of Open University course materials visit the webshop at
www.ouw.co.uk, or contact Open University Worldwide, Michael Young Building,
Walton Hall, Milton Keynes MK7 6AA, United Kingdom for a brochure. tel. +44 (0)1908
858785; fax +44 (0)1908 858787; email ouwenq@open.ac.uk
The Open University
Walton Hall, Milton Keynes
MK7 6AA
First published 2004
Copyright © 2004 The Open University
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system,
transmitted or utilized in any form or by any means, electronic, mechanical, photocopying,
recording or otherwise, without written permission from the publisher or a licence from the
Copyright Licensing Agency Ltd. Details of such licences (for reprographic reproduction) may be
obtained from the Copyright Licensing Agency Ltd of 90 Tottenham Court Road, London W1T 4LP.
Open University course materials may also be made available in electronic formats for use by
students of the University. All rights, including copyright and related rights and database
rights, in electronic course materials and their contents are owned by or licensed to The Open
University, or otherwise used by The Open University as permitted by applicable law.
In using electronic course materials and their contents you agree that your use will be solely
for the purposes of following an Open University course of study or otherwise as licensed by
The Open University or its assigns.
Except as permitted above you undertake not to copy, store in any medium (including
electronic storage or use in a website), distribute, transmit or re-transmit, broadcast, modify
or show in public such electronic materials in whole or in part without the prior written consent
of The Open University or in accordance with the Copyright, Designs and Patents Act 1988.
Edited, designed and typeset by The Open University.
Printed in the United Kingdom by The Burlington Press, Foxton, Cambridge CB2 6SW.
ISBN 0 7492 5899 3
1.1
TA225 Block 3 Sound processes
Chapter 4
Music Distribution
CONTENTS
Aims of Chapter 4
1 Introduction
2 Stamping the record
2.1 Introduction
2.2 The recording production sequence
2.2.1 Planning
2.2.2 Session recording
2.2.3 Mix-down
2.2.4 Post production
2.2.5 Manufacture
2.2.6 Distribution
3 Audio system characteristics
3.1 Dynamic range
3.2 Bandwidth, frequency response and distortion
3.3 Signal-to-noise ratio
3.4 Summary of Section 3
4 Digital audio
4.1 Introduction
4.2 Digital storage techniques
4.2.1 A digital recording system
4.2.2 Digital bandwidth
4.2.3 Signal-to-noise in digital audio systems
4.2.4 Digital dynamic range
5 Digital audio tape recording
5.1 Introduction
5.2 Rotary head tape recorders
5.2.1 Digital audio tape recorders
5.2.2 The Alesis multitrack digital audio tape recorder
5.3 Stationary head tape recorders
5.4 Tape versus disc
6 Digital disc systems
6.1 Introduction
6.2 The audio compact disc system
6.3 MiniDisc
6.4 Advanced disc-based systems
7 Digital disc technologies
7.1 Oversampling
7.2 Single bit conversion
7.3 Correcting media faults
7.3.1 Error detection
7.3.2 Error correction
7.3.3 Data interleaving
7.3.4 Error concealment
7.4 Copy protection
8 Digital audio transmission
8.1 Introduction
8.2 Channel bandwidth
8.3 Digital audio compression
8.3.1 Introduction
8.3.2 Lossy or lossless?
8.3.3 Lossless compression
8.3.4 Lossy compression
8.3.5 Lossy coders and master recordings
8.4 Digital audio and the Internet
8.4.1 Internet radio
8.5 Digital Broadcasting
8.5.2 Digital Audio Broadcasting
8.5.3 Digital Radio Mondiale
Summary of Chapter 4
Answers to self-assessment activities
Learning outcomes
Acknowledgements
AIMS OF CHAPTER 4
To describe the processes involved in making a commercial record
from a master recording of a musical performance.
To show how the qualities of a recording may be improved by using
digital audio technologies.
To demonstrate the benefits of using digital technologies for audio
storage.
To explain the main digital recording and playback technologies
currently used within the music industry.
To demonstrate how to deal with errors in digital media.
To explain the technology of digital audio data compression.
To explain the methods and advantages of using digital data for
audio transmission.
1 INTRODUCTION
The first three chapters of this block discussed ways in which musical
performances may be recorded and stored. These include a written
score, a set of coded instructions, or an audio recording of the actual
sounds. This chapter takes the last of these methods and describes
both the events leading up to the recording and the subsequent
manufacturing and distribution processes.
At the time of writing (2004) the recorded medium is most likely to be
the digital audio compact disc (correctly abbreviated to CD-DA but
generally referred to as the CD, as I will here). Available since October
1982, the CD revolutionised the music industry and so part of this
chapter is devoted to CD technology. Other technologies for recording
and distributing music in the digital domain, including digital tape,
alternative disc systems, the Internet and wireless broadcasting, will
also be discussed.
2 STAMPING THE RECORD
2.1 Introduction
Music companies such as Universal Music Group, Sony Music, EMI,
Warner Music and BMG are responsible for all aspects of record
production from deciding which works merit recording to the
appearance and sound of the final product.
A commercial recording is the outcome of hours of work involving
many people. Look closely at the notes that accompany any
commercial recording and you should find somewhere a list of people
involved with the production. The booklet that came with my copy of
the Virgin Classics recording A Venetian Coronation 1595, illustrated
in Figure 1, lists 63 names.
Figure 1 The booklet from the Virgin Classics recording of A Venetian Coronation 1595
(Virgin Veritas 59006)
Aside from the performers and the conductor, Paul McCreesh, the
names listed include the executive producer, the music editors, the
balance engineer, the cover designer, the photographer and several
friends and advisers to the conductor. Even the organ tuner gets a
credit! Not included in the list, but equally important, are all those
responsible for organising the venue (it was recorded at Brinkburn
Priory, Northumberland, UK), providing the hospitality during the
recording, undertaking post production after the recording,
manufacturing the CD, printing the booklet, etc., etc. In reality
hundreds of people will have been involved in making this record.
ACTIVITY 1 (PRACTICAL, EXPLORATORY)
Take a few moments to look at some of the booklets from your own
recordings and see how many people are involved in producing them.
Do you find that each record has a similar list? You may find that
some record labels make more information available than others.
Typically CDs tend to show more information than either audio
cassettes or vinyl LPs.
Note down any job titles that recur. Usually the job title will be
accompanied by the name of a person. Are you able to classify any of
these jobs into either artistic or engineering rôles?
Comment
Some of the jobs that I discovered by looking at labels from a random
selection of my own CDs include: producer; remix engineer;
production assistant; A&R co-ordinator (A&R is explained later in this
section); director; balance engineer; tonmeister (literally sound-
master); designer; recording engineer; digital remix engineer; executive
producer.
The job functions within the record side of the music industry may be
classified into two rôles, the artistic and the engineering, as listed in
Table 1.
Table 1 Rôles within the music industry
Artistic rôle           Engineering rôle
Producer                Remix engineer
Production assistant    Recording engineer
Director                Tonmeister
Designer                Balance engineer
Executive producer      Digital remix engineer
A&R co-ordinator        Post production engineer
In general, work within the music industry calls for an appreciation of
both artistic and engineering functions. The engineer must have some
understanding of the music as an art-form and the producer must be
aware of the recording technologies. Aspects of these functions will be
considered in the next section where the recording production
sequence will be described.
2.2 The recording production sequence

So far you have discovered that making a commercial recording
involves lots of people, many with very specific skills. This section
describes a typical sequence for the production of a commercial
recording as shown in Figure 2.
I am assuming here that the recording involves many performers and is to
take place on location, rather than a recording that involves a small
number of people (e.g. a pop group) just performing in a studio.
Figure 2 The six stages of the production sequence: 1. planning;
2. session recording; 3. mix-down; 4. post production; 5. manufacture;
6. distribution. (Other labels in the diagram: session takes; master
recording; other recordings; additional track information; packaging and
programme notes; CD.)
2.2.1 Planning
Record producers, the people who are responsible for bringing the
recording to the market place, take on two specific tasks. First comes the
creative process, i.e. finding and selecting the music, choosing
arrangements, getting the desired sound, planning the cover and insert
notes, etc.
This is followed by a number
of administrative tasks, i.e.
booking the musicians, agreeing the recording venue, selecting the
support staff, balancing budgets and preparing reports. Producers are
supported by A&R (Artists and Repertoire) co-ordinators who are
responsible for finding new songs and signing new artists to record
labels, overseeing projects and matching artists to producers.
For a record to be made the producer must ask and satisfactorily
answer the five basic questions listed in Figure 3.
Figure 3 Five questions to ask before a record is produced: what? why?
where? who? when?
The central question must be why the recording is to be made. Commercial
aspects cannot be ignored because ultimately music companies must make
money. However, this need not be the over-riding issue in every case. All
record companies have their best sellers and their loss-leaders. Not every
record can be expected to win a Gold Disc (awarded for sales of 500 000
records in the USA).
Indeed Columbia Records waited 30 years before their seminal recording
of Miles Davis’s Kind of Blue received this award. But the question as
to whether the recording is likely to meet favour with the record-buying
public must be asked and satisfactorily answered. Then the budget for
the production may be agreed, performers and engineers booked and a
suitable venue secured along with all the necessary support services.
2.2.2 Session recording

On the day, the performers and engineers meet up with all the support
staff to make a session recording. In fact, as you may well have realised
from preceding chapters, with multitrack recording facilities, it is not
always necessary to get everyone together at one time. However for
many performances the interaction between all the players is vital.
The music can be discussed and rehearsed by the musicians while the
recording engineer (known also as the balance engineer or tonmeister)
chooses and tests appropriate microphones, trying them in various
locations so as to pick up the wanted sounds and disregard any
extraneous noise. Several recordings or takes will be made. Different
microphone set-ups and positioning of artists will be tried and in every
case the take will be numbered and logged as it is important to keep good
documentation. The final recording from the session contains all the
takes, for nothing is discarded at this time. The producer, engineer and
performers will listen to sections of the recording but the final editing
will be left to a later time. Figure 4 shows a recording session made
by representatives of the BBC for the Open University course A207.
Figure 4 A BBC recording session for the Open University course A207
Not all artists expect to spend a lot of time in the recording studio.
In 1942 the American singer Bing Crosby made a recording of Irving
Berlin’s song White Christmas in eighteen minutes with only two
takes. This recording has now sold over 31 million copies.
2.2.3 Mix-down
Once the recording session is over the producer and engineers work on
the mix-down of the session recording to make the master recording.
The mix-down will usually be to two channel stereo although now
surround sound formats may also be made, depending on the content
of the recording and how it is to be distributed. Judgements as to the
ultimate sound will be made by the producer and engineers on an
artistic basis. In well-designed digital audio systems there should be
no loss in quality or added noise on any copies made from the session.
Digital editing allows very small blemishes in the sound to be corrected to
achieve a near-perfect recording from the original performance. (You may
recall from Chapter 3 in this block that piano rolls could be similarly
corrected.) Figure 5 shows a page from the editing notes of a recording of
the cadenza from the Sibelius Violin Concerto in D minor. The numbers
at bars, e.g. 205 and 118, represent recording takes. You can see the
reference to noise from a passing motor car (‘filter car’) during take 120
which needs attention. Note also the mixing of takes from 118 to 120, back
to 118, etc. gradually building a new performance from a collection of
takes. The really important point about this example is that the editing is
not being used to piece together an acceptable performance from an
incompetent artist (although that possibility could exist). Instead editing
is being used to create a performance that is better than even this highly
competent player would be likely to achieve in a single take or at a live
performance.
Once the mix-down is complete the master recording will be stored on a
digital recording medium such as an 8 mm data cartridge, digital audio tape
(DAT) or CD. (These systems will be introduced later in this chapter.)
Figure 5 Part of the editing notes for editing a musical performance
2.2.4 Post production

Post production takes a master recording and adapts it to the chosen
distribution medium. As I mentioned earlier the medium would
probably be CD although two new standards, the Super Audio
Compact Disc (SACD) and the Audio Digital Versatile Disc (DVD-A)
are making in-roads into the recorded media market. Distribution by
wireless or over the Internet also requires post production. However, I
will focus here on post production for CDs.
Post production offers a final opportunity for limited changes to be
made to the master recording before manufacture and distribution.
A decision may be taken by the marketing department to put an extra
track onto the record to make it more attractive to the public. This
track may be from a completely different source with a quite different
sound, in which case the post production engineer would have to
apply equalisation and normalisation to this track to try to make it
compatible with the existing material.
ACTIVITY 2 (REVISION)
What do you understand by the terms equalisation and normalisation,
explained originally in Chapter 1 of this block? How would these
techniques be applied to the addition of an extra track onto a
compilation CD?
Data specific to the medium is also added at the post production stage.
For a CD this would include information about track separation, track
numbers and the length of each track. Once all the digital audio and
associated data is finalised the master recording is stored onto digital
media such as the Exabyte 8 mm tape cartridge illustrated in Figure 6
using a special disc description protocol (DDP) file format. Exabyte
tape is a specially formulated, substantially bit-error free, tape
manufactured primarily for archiving computer data. This is preferred
by engineers as the audio data will be substantially free from errors.
DAT and CD-R may also be used but are not considered ideal as they have
relatively high bit-error rates. The post production master tape may be
referred to as the EQ’d-master to identify the fact that it has been made
for a specific medium.
Figure 6 An Exabyte tape cartridge
2.2.5 Manufacture
Today’s music industry mass-produces recordings in enormous
quantities. For example nearly ten billion CDs were manufactured
world-wide between 1999 and 2002. And yet the duplication process,
which produces records that are identical to the master, still uses a
process similar to that invented by Emile Berliner in 1894. (You will
read more about Berliner in the final chapter of this block.) His process
used ‘stampers’, which were reverse or negative representations of a
master disc, to press a groove into a hard rubber compound. The fully
automated manufacturing process for CDs using a similar principle
but with different compounds is described in Box 1 ‘The compact disc
manufacturing process’.
2.2.6 Distribution
Record distributors are either owned by music companies or
independent. The independent distributors handle the smaller labels
which offer the more specialist genres. Distributors look after
shipping, warehousing, inventory control and have a sales force who
get the records into the marketplace by selling to the record stores.
ACTIVITY 3 (PRACTICAL, EXPLORATORY)
Take a look again at some of the inserts from recordings that you own but
now look for the record company and label. Make a list of the company
and label. Do you see that a record company can own several labels?
Comment
Table 2 lists the major music companies with examples of their record
labels. Unfortunately, the distributor’s name does not feature on the
notes for the record. To find a distributor you would need to search
the Web or refer to one of the many record magazines.
Table 2 The major record companies and examples of their labels
Record company                   Label
Universal Music Group            Decca, Deutsche Grammophon, Island, Mercury, Philips, Verve
Sony Music Entertainment Inc.    CBS, Columbia, Epic, Sony Classical
EMI Records Ltd.                 EMI Classics, HMV, Parlophone
Warner Music Group               Atlantic, Elektra, WEA
BMG Entertainment                Arista, Conifer, RCA
Publicity, both for specialist magazines and the record retailers, will be
made available prior to the release date and advance copies of the
recording will be sent to reviewers and broadcasting organisations.
Finally, enough records should be manufactured to ensure those of us
who want a copy are not kept waiting!
ACTIVITY 4 (SELF-ASSESSMENT)
Describe in your own words the six stages necessary to get a CD into
the shops.
Box 1 The compact disc manufacturing process

The complete CD manufacturing process is shown diagrammatically in Figure 7.
Disc mastering
The disc mastering process (Figure 7(a)) starts with a specially made flat
glass disc (1) which is coated with a photo resist – a substance that
changes its characteristics when exposed to light. The disc is placed into
a laser beam recorder (2) where laser light writes a spiral of audio and
associated data, in the form of a series of ‘marks’, onto the photo resist.
Writing the data is fully automatic, using computer control, taking data
from the EQ’d-master. When the exposure is complete the disc is developed
(3) using a special fluid, which leaves a spiral groove consisting of areas
of land and pits in the photo resist. The thickness of the photo resist
determines the depth of the pits that will be pressed into the completed
disc.
A fine coating of silver metal (4) is then deposited over the disc’s
surface. The disc is placed in an electroplating tank where a coating of
nickel (5) is deposited onto the silver. When the nickel coating is thick
enough to be handled the silvered-nickel disc is parted from the glass (6)
and cleaned to remove any particles of photo resist from the silvered
surface.
This metal disc, which is a negative copy of the original master disc, is
called the metal-master or ‘father’. The electroplating process is again
used to create positive copies of the original glass master, known as
‘mothers’, from the metal-master (7). These are used to make the negative
stampers, known as ‘sons’ (8). These stampers are used to press the data
into the plastic medium to form an exact copy of the original glass master.
After many pressings new stampers will be made from mothers. Eventually, if
there is a very high demand for copies of the CD, new mothers may have to
be created from the original metal-master.
Disc duplication
Figure 7(b) shows the duplication process. The disc is made from clear
polycarbonate plastic in an injection moulding machine. The stamper presses
the data into the plastic material and forms the disc (9). (It takes about
5 seconds to press each one.) Then a reflective layer of aluminium is
deposited over the data surface (10) using an evaporation or ‘sputtering’
technique. (This takes about 3 seconds.) Finally a layer of lacquer is
sprayed over the aluminium (11) to protect it from abrasion and oxidation
and a label is printed onto the lacquered surface. Following quality
control checks the discs are packaged and boxed for distribution.
Figure 7 The manufacturing steps to make a CD based on Berliner’s original
process. (a) Mastering: 1 glass plate coated with photoresist; 2 laser
cutting (pits exposed by laser beam); 3 developing process (pits formed);
4 silver plating (silver deposit); 5 nickel plating (metal master);
6 father; 7 mother (made by electroplating). (b) Duplication: 8 son
(stamper); 9 moulding (polycarbonate plastic disc substrate);
10 metallisation (reflective aluminium); 11 lacquering (protective layer of
lacquer)
3 AUDIO SYSTEM CHARACTERISTICS
This section will review the essential characteristics of an audio
system, i.e. dynamic range, bandwidth, signal-to-noise ratio and distortion,
which were introduced in Chapter 6 of Block 1. The importance of
each characteristic will be discussed as it is necessary to appreciate
them in order to understand why it became necessary for the music
industry to encompass digital audio technologies.
3.1 Dynamic range

When you attend a live concert you might experience sound levels
varying from the barely audible to the down-right uncomfortable.
These differences in sound levels may be regarded as the dynamic range
of the concert. In order that an audio system can give a realistic experience
of a recording of the same concert it ought to be capable of reproducing a
similar dynamic range. You read previously that the sound pressure level
detected by the ear varies from the threshold of hearing (e.g. demonstrated
by hearing a pin drop) to the threshold of pain (e.g. within 100 metres
of a jumbo jet taking off). In fact the ear is capable of detecting a
dynamic range of over 120 dB, as shown in Figure 8, where the sound
pressure level (SPL) varies from above 120 dB-SPL down to 0 dB-SPL
which is described as the minimum audible field.
Figure 8 The response of the ear over the audible frequency range. Each graph
is a line of perceived equal-loudness. Notice the general lack of sensitivity of
the ear in the low frequency range at all sound amplitudes. (Axes: sound
pressure level in dB, reference 20 µPa, from 20 to 120; frequency in Hz,
from 10 to 10 000.)
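The relationship between sound pressure and dB-SPL can be made concrete with a short calculation. The following is a minimal Python sketch, assuming the usual definition of sound pressure level as twenty times the base-10 logarithm of the pressure ratio, with the 20 µPa reference shown on the axis of Figure 8. The function name and the example pressures are illustrative only and are not taken from the course materials.

```python
import math

# Reference pressure for dB-SPL: 20 micropascals (the reference on the Figure 8 axis).
P_REF = 20e-6


def spl_db(pressure_pa):
    """Return the sound pressure level in dB-SPL for a pressure given in pascals."""
    return 20 * math.log10(pressure_pa / P_REF)


# Illustrative values: the reference pressure itself, and a pressure
# one million times larger.
quietest = spl_db(20e-6)   # the reference pressure gives 0 dB-SPL
loudest = spl_db(20.0)     # a pressure ratio of 1 000 000 gives about 120 dB-SPL

print(round(quietest), round(loudest), round(loudest - quietest))
```

The difference between the two levels is the dynamic range in decibels, so a range of 120 dB corresponds to a pressure ratio of a million to one.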
When reproducing sounds the dynamic range of an audio system is
limited by two characteristics. The lowest sound level is determined
by the noise in the system, i.e. when the lowest signal level becomes
enveloped by electronic noise, as shown in Figure 9(a). The highest
sound level is limited by the system. A distorted or clipped output,
shown in Figure 9(b), is produced when a sound above the designed
maximum level is input.
Figure 9 (a) an audio signal enveloped in noise, (b) a clipped audio signal.
(Both panels plot amplitude against time; panel (b) marks the maximum
signal level.)
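The two limits described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the system is taken to clip at an amplitude of 1.0 and to have a noise amplitude of 0.001, both invented figures, and the dynamic range is taken as the ratio of those two amplitudes expressed in decibels. The function names are hypothetical.

```python
import math


def clip(sample, max_level):
    """Hard-limit a sample to the range -max_level..+max_level."""
    return max(-max_level, min(max_level, sample))


def dynamic_range_db(max_level, noise_level):
    """Ratio of the largest undistorted amplitude to the noise amplitude, in dB."""
    return 20 * math.log10(max_level / noise_level)


# Assumed (invented) system limits.
MAX_LEVEL = 1.0
NOISE_LEVEL = 0.001

# One cycle of a sine wave whose peak (1.5) exceeds the maximum level.
too_loud = [1.5 * math.sin(2 * math.pi * n / 16) for n in range(16)]
clipped = [clip(s, MAX_LEVEL) for s in too_loud]

print(max(too_loud), max(clipped))
print(round(dynamic_range_db(MAX_LEVEL, NOISE_LEVEL)))
```

The clipped list has its peaks flattened at the maximum level, as in Figure 9(b), while any signal smaller than the noise amplitude would be buried as in Figure 9(a).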
A studio microphone is capable of achieving a dynamic range of around
90 dB. Unfortunately, in the recording process this dynamic range
is reduced by the various production and manufacturing stages. For
early analogue systems (you will read more about these in Chapter 5)
the dynamic range could be as low as 30 dB and even the very best
analogue audio systems were capable of no more than 70 dB.
ACTIVITY 5 (EXPLORATORY)
What would be the ideal dynamic range of an audio system?
Comment
The ideal dynamic range would be one that matched the human ear,
i.e. 130 dB. However, given that studio microphones limit dynamic
range to around 90 dB this would appear to be a suitable value for
consumer audio systems. Of course in an average domestic
environment it would be impossible to achieve a dynamic range
approaching this value – just think what the neighbours might say!
An ideal audio recording and playback system would impose no
limitations on the signal output from a microphone, so the
reproduction should sound as if the microphone was connected
directly to the loudspeaker. Right from their outset recording
technologies caused sound degradation by adding noise to the signal,
by limiting the frequency response of the sound and by imposing
limits on the dynamic range.
3.2 Bandwidth, frequency response and distortion

In Chapter 6 of Block 1 you learned that the bandwidth of an audio
system is specified by two frequencies known as the cut-off frequencies
between which the frequency response is substantially flat. This is
found by inputting a range of frequencies to a system and measuring
the corresponding output. The two points at which the sound level has
dropped by 3 dB determine the frequency response. Figure 10 shows
an example of the frequency response for an audio amplifier, which is
from 30 Hz to 30 kHz. The sound level output from the system for any
frequency between the two cut-off frequencies must be the same for a
constant input amplitude to ensure the relationship between the
fundamental frequencies and the harmonics that comprise the sound
is preserved. Any variation will corrupt this relationship and cause
audible distortion which can change the timbre of the sound.
Figure 10 The frequency response of a typical audio amplifier (a repeat of
Figure 26 in Block 1 Chapter 6). (Axes: gain from 30 dB to 90 dB; frequency
from 10 Hz to 100 000 Hz.)
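The measurement procedure described above can be sketched in Python. This is a minimal illustration, assuming a single pass band and a list of measured output levels. The measurement values below are invented so that the result resembles the 30 Hz to 30 kHz response quoted for Figure 10; they are not read from the figure, and the function name is hypothetical.

```python
def cutoff_frequencies(response, drop_db=3.0):
    """Return the lowest and highest measured frequencies whose level is
    within drop_db of the highest level in the response.

    response is a list of (frequency_hz, level_db) pairs sorted by frequency.
    """
    reference = max(level for _, level in response)
    in_band = [freq for freq, level in response if level >= reference - drop_db]
    return in_band[0], in_band[-1]


# Invented measurements: a flat 60 dB pass band that is 3 dB down
# at 30 Hz and at 30 kHz.
measured = [
    (10, 50.0), (20, 55.0), (30, 57.0), (100, 60.0), (1000, 60.0),
    (10000, 60.0), (30000, 57.0), (100000, 45.0),
]

low, high = cutoff_frequencies(measured)
print(low, high, high - low)
```

The sketch only picks out measured points, so its accuracy depends on how closely spaced the test frequencies are; a real measurement would interpolate between points.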
ACTIVITY 6 (SELF-ASSESSMENT)
A microphone with a frequency response that is substantially flat
between 80 Hz and 20 kHz is used to record an organ recital which
includes the Toccata from Charles-Marie Widor’s Fifth Organ
Symphony. The organ on which it will be played has 32 foot pipes
capable of producing notes below 20 Hz. What effect will using this
microphone have on the recorded sound?
Typically we can hear frequencies between 20 Hz and 20 kHz. However,
bandwidths vary depending on the nature of the audio signal and the
requirements of the audio system. For example, because speech is still
intelligible when the bandwidth is reduced to between 300 Hz and 3.4
kHz, a telephone channel carrying speech can have a narrower
bandwidth than that for a high quality radio channel as indicated in
Table 3. In general the wider the bandwidth the more signal
information conveyed and the better the fidelity of the reproduced
sound. Some newer digital media such as SACD and DVD-A are
capable of offering bandwidths well in excess of 20 kHz as detailed
later in this chapter.
Table 3 Examples of bandwidths for analogue audio channels
Audio source    Approximate frequency response    Approximate bandwidth
Telephone       300 – 3 400 Hz                    3.1 kHz
AM radio        50 – 6 000 Hz                     5.95 kHz
FM radio        30 – 15 000 Hz                    14.97 kHz
ACTIVITY 7 (COMPUTER)
Run the course’s sound editing software and open the computer sound
file associated with this activity. Use the program’s graphic equaliser
filter to create, in turn, each of the three bandwidths given in Table 3.
What effect on the sound does each setting of the filter have compared
to the unfiltered sound?
Comment
I hope you noticed how the timbre of the musical instrument changed
especially when you used the very narrow bandwidth setting. You
probably noticed a lack of high frequency with the AM setting and
really very little difference at all with the FM setting when compared
to the unfiltered sound.
Recent research has shown that increasing the bandwidth beyond the
audible range may contribute to a more realistic reproduction
especially of hall acoustics. Some domestic audio systems now offer
bandwidths up to 80 kHz to support SACDs and DVD-As.
Audio systems may be described by a transfer function, introduced in
Chapter 6 of Block 1, which relates the characteristics of the output
signal to those of the input. The transfer function may also be represented
on a spectral graph where the frequency content of the input and the
output can be compared. Unless the transfer function is linear as shown
in Figure 11(a), the output signal will be distorted due to the creation
of additional harmonics as shown in Figure 11(b). The additional
harmonics, including sub-harmonics (harmonics below the lowest
frequency in the input signal), cause distortion in the output signal.
Figure 11 (a) Linear and (b) non-linear transfer functions. (Each panel
shows an input spectrum and an output spectrum, plotted as voltage against
frequency; in (b) the output contains unwanted additional frequencies
causing distortion.)
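The creation of additional harmonics by a non-linear transfer function can be demonstrated numerically. The following Python sketch is an illustration under stated assumptions: the linear system is taken to be a simple gain of 2, and the non-linear system adds a small squared term (output = x + 0.2x²). Both are invented examples, not models of any particular equipment. The spectrum is found with a direct discrete Fourier transform written out in plain Python.

```python
import math


def harmonic_amplitudes(samples, max_harmonic):
    """Amplitude at each whole-number multiple of the fundamental (bin 0 is
    the constant offset), computed with a direct discrete Fourier transform."""
    n = len(samples)
    amplitudes = []
    for k in range(max_harmonic + 1):
        re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(samples))
        im = sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(samples))
        scale = 1 / n if k == 0 else 2 / n
        amplitudes.append(scale * math.hypot(re, im))
    return amplitudes


def linear(x):
    """Assumed linear transfer function: a plain gain of 2."""
    return 2.0 * x


def non_linear(x):
    """Assumed non-linear transfer function with a small squared term."""
    return x + 0.2 * x * x


N = 64
# One complete cycle of a sine wave, so the input occupies harmonic 1 only.
tone = [math.sin(2 * math.pi * i / N) for i in range(N)]

linear_out = harmonic_amplitudes([linear(x) for x in tone], 3)
bent_out = harmonic_amplitudes([non_linear(x) for x in tone], 3)

print([round(a, 3) for a in linear_out])
print([round(a, 3) for a in bent_out])
```

With the linear function only the original frequency appears at the output, at twice the amplitude. With the squared term a new component appears at twice the input frequency, together with a constant offset, neither of which was present in the input.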
3.3 Signal-to-noise ratio

One of the limits to the dynamic range of a system is extraneous noise
which has always affected sound recordings. Indeed, even with the
latest audio systems, the battle of the wanted audio signal versus
unwanted noise is still being fought, for as the background noise
becomes nearly silent, the ear becomes ever more alert to unwanted
sounds. Noise, both in the form of electrical interference and audible
disturbance, comes from everywhere: mains distribution wiring;
electronic circuits; cables and connectors; system mechanics;
mechanical and electromechanical transducers; recording media; air
conditioning; cooling fans; even human breathing. And all of these
sources become part of the audio signal to which we listen.
Figure 12 Noise added to an analogue audio signal becomes
integral to the final signal
You may recall from Block 1, Chapter 6 that the signal-to-noise ratio is
the average audio signal power divided by the average noise power and
is usually expressed in decibels. Whilst it is usual to specify the
largest possible value, in reality the signal-to-noise ratio is very
dependent on the sound levels and the environment in which they are
being heard. For example a noisy amplifier may be adequate to feed a
loudspeaker, but totally unacceptable for use with headphones. The
former would need a much higher sound level thus masking the noise
from the amplifier.
An audio system has an average power output of 100 W with an
average noise power of 10 µW. Calculate the signal-to-noise ratio of the
system in decibels. Note, you may need to refer to the table of
amplitude and power decibel ratios in Table 1 of Block 1, Chapter 6 to
answer this question.
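The definition recalled above (average signal power divided by average noise power, expressed in decibels) can be sketched in a few lines of Python. This is only an illustration: the function name `snr_db` is made up for this example, and the two power values are the ones given in the question.

```python
import math


def snr_db(signal_power_w: float, noise_power_w: float) -> float:
    """Signal-to-noise ratio in decibels from two average powers in watts."""
    # Power ratios use 10*log10; amplitude ratios would use 20*log10.
    return 10 * math.log10(signal_power_w / noise_power_w)


# Figures from the question: 100 W of signal and 10 microwatts of noise.
result = snr_db(100.0, 10e-6)
print(f"{result:.1f} dB")
```

The power ratio here is ten million to one, which the function reports as 70 dB.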
3.4 Summary of Section 3

Three main characteristics determine the quality of the audio system:
dynamic range, audio bandwidth and signal-to-noise ratio. These
characteristics are to an extent interrelated and dependent upon the
technology used by the audio system. An ideal system would offer no
constraints to the original sound but this is impossible. The best that
can be offered is to match or exceed the characteristics of the ear, but
this is not possible with analogue systems where the dynamic range is
limited to, at best, 70 dB. The industry therefore had to look to new
technologies to gain improvements.
Describe in your own words why the transfer function of an audio
system should be as linear as possible.
4 DIGITAL AUDIO
4.1 Introduction
You may recall from Chapter 6 of Block 1 that a technique known as
pulse code modulation (PCM) can be used to code analogue audio in
the form of a stream of electrical pulses. The principle used is to
sample the analogue level at regular intervals, assign a binary code to
represent each level and convert the resulting codes to a serial stream
of binary bits. The advantages of using a digital code to convey and
store the audio signals will be examined in this section.
Although PCM was invented in 1937 by an Englishman, Alec Reeves, it
took 30 years for technology to develop to the point where the first
digital audio recording using PCM could be demonstrated by the
Japanese Broadcasting Corporation.
The general impression gained at that time was that the sound
reproduced by the digital audio tape recorder could not be matched by
any analogue machine of the day. The bit pattern recovered by playing
back the tape corresponded exactly to the bit pattern originally coded
from the analogue signal and recorded onto tape. The record and
playback systems apparently caused no audible artefacts to be added to
the digital signal – it was described at the time as an ‘ideal’ recording
system.*
Over the next few years evolving electronic and computer technologies
allowed the development of the circuits and storage methods needed to
cope with the complexities of digital audio. In 1982, fifteen years after
the first demonstration of PCM recording, Philips and Sony jointly
introduced the compact disc to the world – consumer digital audio
systems had become possible.
4.2 Digital storage techniques

The main reason for using digital audio technology is that it makes
the wanted signal immune to noise generated during playback or caused
by mishandling or by electrical and mechanical interference. (I’m
thinking here of how the signal from a vinyl LP disc is so easily
corrupted by surface scratches, dust and pickup noise or the so-called
‘tape hiss’ that is so intrusive in analogue tape recording.)
Unfortunately this immunity from noise comes at a price – namely the
complexities of handling the high data rates and storing the large
volumes of data generated by the PCM converters. When the first
digital tape recorder was developed (1967) the computer industry was
still in its infancy. Computer memories were measured in tens of
kilobytes (a kilobyte is 1024 bytes or 8192 bits) and hard disk drives
were the physical size of today’s automatic washing machines and
stored a few megabytes of data. The next activity will remind you of
the volumes of data generated when converting analogue audio signals
to the digital domain.
*Baert, L., Theunissen, L., Vergult, G., Maes, J. and Arts, J. (1995) Digital
Audio and Compact Disc Technology, Oxford: Focal Press, Butterworth-
Heinemann, p. 7.
Calculate the number of bits required to store 1 second of stereo digital
audio data if each channel of the analogue signal is sampled at 44.1 kHz
and 16 bits are used to code each sample.
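The arithmetic behind this calculation is a straightforward product of sampling rate, bits per sample and number of channels. The short Python sketch below shows one way of setting it out; the function name `pcm_bits` is hypothetical and overheads such as error correction data are ignored.

```python
def pcm_bits(sample_rate_hz: int, bits_per_sample: int, channels: int,
             seconds: int = 1) -> int:
    """Number of bits produced by uncompressed PCM, ignoring any overheads."""
    return sample_rate_hz * bits_per_sample * channels * seconds


# One second of stereo audio at the CD sampling rate with 16-bit samples.
cd_bits_per_second = pcm_bits(44_100, 16, 2)
print(cd_bits_per_second)
```

This gives 1 411 200 bits for each second of stereo audio, or roughly 1.4 megabits.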
4.2.1 A digital recording system

A digital recording system will comprise three separate parts:
1 A coder which converts the analogue audio signal to a binary
digital signal, provides additional data so that any errors occurring
in the digital data when it is recovered can be corrected and
interfaces with the recording device.
2 A decoder which carries out the reverse operation to the coder by
interfacing with the recording device, recovering the digital data,
correcting any errors in the digital data and reconstructing the
analogue signal.
3 A digital storage medium, which can be magnetic tape, a computer
hard disk or various types of magneto-optical disc, which I will
return to later in this chapter. Memory cards, which were
mentioned in Chapter 1 of this block, may also be used although, at
the time of writing, they still have limited storage capacities in
comparison to the other devices.
Note that the coder and the decoder are often referred to together as a
codec, i.e. the combination of a coder and a decoder.
Whilst the storage devices may vary enormously, the recording and
playback process is basically the same with just the interface to the
device changing. A description of the recording and playback process
is given in Box 2.
Describe in your own words the stages necessary to record an analogue
audio signal output from a stereo pair of microphones onto a suitable
digital storage medium. Assume that the storage medium accepts the
digital audio data as a serial bit stream.
How does the performance of a digital audio system compare with that
of an analogue system in terms of bandwidth, signal-to-noise ratio and
dynamic range? Well, to put it simply, as long as the digital audio system
offers a ‘window’ that is wider than necessary for the analogue signal
in all three of these aspects then no loss of quality will occur when the
digital system conveys the analogue signal. For this to happen the
digital clipping level must be above the largest analogue signal and the
digital noise level has to be below any noise in the analogue signal.
Also the high and low frequency response of the digital channel must
exceed the range of frequencies in the analogue signal. An example of
this is a recording on a CD which was mastered on an analogue tape
recorder. All the artefacts of the master tape are reliably reproduced by
the CD and nothing is added by the digital system.
Box 2 The digital audio recording and playback process
Recording
The block diagram in Figure 13 shows (a) the coder and (b) the decoder used for recording and
playing back audio signals. The analogue signal input to the coder is amplified and bandwidth limited
by an anti-aliasing filter. This is a low-pass filter designed to stop high frequencies, i.e. those
above half the sampling rate, being passed into the quantiser (the analogue-to-digital converter).
For a typical high-quality analogue audio signal the sampling rate needs to be at least 40
kHz (twice the audio bandwidth of 20 kHz). In practice rates slightly higher than twice the audio
bandwidth are chosen to ensure the filter is able to stop unwanted frequencies and so prevent aliasing.
(This was introduced in Section 5.3 in Chapter 6 of Block 1.)
Figure 13 (a) A digital recording system and (b) a digital playback system
(Coder signal path: analogue audio input, amplifier, anti-aliasing filter, A/D converter, error
correction added and second channel combined if stereo, parallel-to-serial conversion, storage
medium. Decoder signal path: storage medium, serial-to-parallel conversion, error correction,
second channel separated if stereo, D/A converter, reconstruction filter, analogue audio output.)
The quantiser generates a parallel binary number, often called a code word, which represents the
amplitude value for each sample. You read in Block 1 that the greater the number of quantisation bits
used by the quantiser the better the quality of the reconstructed sound. This is because the
binary code may not be exactly the value of the sample, but the closest approximation to it.
(The greatest difference is half the interval between two quantisation levels.) The difference
between the actual and the quantised value is termed the quantisation error which leads to
quantisation noise, a form of distortion in the reconstructed analogue signal, shown in Figure 14. The
parallel code word output from the quantiser is combined with the code words from the second
channel, if two channels are being used, and error correction data and other information to
suit the digital storage medium is added. Finally the data is converted into a serial bit
stream using a parallel-to-serial converter so that it can be stored or transmitted.
Figure 14 Quantisation noise is the sum of the audio signal and the quantisation error
(Panels: original signal, quantisation error, quantised signal.)
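The behaviour of the quantiser described above can be illustrated with a small Python sketch. It is not a model of any particular converter: the function `quantise`, the 1024 mV input range and the choice of reconstructing each sample at the middle of its quantisation interval are all assumptions made for the illustration.

```python
def quantise(value_mv: float, n_bits: int, full_scale_mv: float):
    """Return (code word, reconstructed value) for a uniform quantiser.

    Assumes the input lies between 0 and full_scale_mv and that each code
    word is reconstructed at the centre of its quantisation interval.
    """
    levels = 2 ** n_bits
    step = full_scale_mv / levels
    code = min(levels - 1, max(0, int(value_mv // step)))
    reconstructed = (code + 0.5) * step
    return code, reconstructed


# A 300 mV sample coded with 3 bits over a 1024 mV range.
code, approx = quantise(300, 3, 1024.0)
print(format(code, "03b"), approx, "mV, error", abs(300 - approx), "mV")

# The largest error over the whole range is half a quantisation interval.
worst_error = max(abs(v - quantise(v, 3, 1024.0)[1]) for v in range(1024))
print("worst-case error:", worst_error, "mV")
```

With these assumed figures the quantisation interval is 128 mV, so the worst-case quantisation error is 64 mV, in line with the statement that the greatest difference is half the interval between two quantisation levels.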
Playback
The digital audio data is recovered as a serial stream of bits from the digital storage medium shown
in Figure 13(b). It is converted back to parallel form and then processing, including error correction,
is carried out before the two channels are separated. The binary code words are then input to a
digital-to-analogue converter. The reconstructed samples are converted at the original sampling
rate. The converter’s output is passed into a low-pass reconstruction filter which forms the final
analogue audio signal. The cut-off frequency of this filter will be slightly less than half the sampling
rate of the D/A converter. You should note here that this is a very generalised description and in
reality some of the operations may be carried out in a different order – in particular, serial-to-parallel
conversion may be performed at an earlier stage. A similar situation would also exist in the coder.
4.2.2 Digital bandwidth

The bandwidth of a digital signal is basically determined by the bit
rate of the digital data. The bit rate is dependent upon two factors, the
sampling rate and the number of quantisation bits (i.e. the number of
binary bits that are used to code the level of each sample). The former
is dependent on the bandwidth requirements of the analogue audio
signal, the latter on the required quality of the audio system. Table 4
shows examples of bit rates for various digital systems. As you can see
the sampling rate varies for different systems but is always a minimum
of twice the analogue bandwidth (remember the sampling theorem).
High quality audio reproduction requires the sampling rate to be at
least 40 kHz. In reality this rate is somewhat higher and newer
systems, such as SACD and DVD-A, are using considerably higher
rates although the necessity for this is somewhat controversial.
Table 4 Typical sampling rates of some digital audio systems
Digital audio   Audio       Sampling   Number of           Single channel
system          bandwidth   rate       quantisation bits   data rate*
Telephone       3.4 kHz     8 kHz      8                   64 kbit/s
FM radio        15 kHz      32 kHz     16                  512 kbit/s
Audio CD        20 kHz      44.1 kHz   16                  705 kbit/s
Studio sound    20 kHz      48 kHz     16                  768 kbit/s
SACD            80 kHz      192 kHz    24                  4608 kbit/s
DVD-A           80 kHz      192 kHz    24                  4608 kbit/s
* Doubled for a stereo pair
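The single channel data rates in Table 4 follow directly from multiplying the sampling rate by the number of quantisation bits. The Python sketch below reproduces that arithmetic; the dictionary `systems` and the function `data_rate_kbit_s` are names invented for this illustration, and the sampling rates and bit depths are those listed in the table.

```python
# (sampling rate in Hz, number of quantisation bits), taken from Table 4
systems = {
    "Telephone": (8_000, 8),
    "FM radio": (32_000, 16),
    "Audio CD": (44_100, 16),
    "Studio sound": (48_000, 16),
    "192 kHz, 24-bit": (192_000, 24),
}


def data_rate_kbit_s(sample_rate_hz: int, bits: int, channels: int = 1) -> float:
    """Raw PCM data rate in kilobits per second (1 kbit = 1000 bits)."""
    return sample_rate_hz * bits * channels / 1000


for name, (rate, bits) in systems.items():
    print(f"{name:16s} {data_rate_kbit_s(rate, bits):8.1f} kbit/s per channel")
```

Passing `channels=2` doubles each figure, which is the stereo case mentioned in the table footnote.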
ACTIVITY 12 (LISTENING)
Listen to the audio track associated with this activity. You will hear
two versions of the same recording. I hope you can hear a difference
between the two. Can you suggest what is wrong with the second
version? (Hint: Think about sampling rates.)
Comment
The first version is the recording of a piano using the standard CD
sampling rate of 44.1 kHz. The second version is the same piece but
recorded using a sampling rate of 11.025 kHz but without lowering the
frequency of the anti-aliasing filter. (Note, a quarter of the necessary
sampling rate was chosen to emphasise the problem of aliasing).
The effect of using a reduced sampling rate is to introduce unwanted
‘alias’ frequencies into the audio signal. These arise because there are
too few samples to be able later to reconstruct the original signal from
the digital data. Instead of the high frequencies being reproduced a
false low frequency tone is reconstructed from the samples, which
distorts the sound.
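The frequency of the false tone can be predicted. The sketch below, in Python, folds a pure tone back into the range between zero and half the sampling rate, which is where it will appear if no anti-aliasing filter removes it first. The function name and the 8 kHz test tone are assumptions chosen for illustration; they are not taken from the activity's recording.

```python
def alias_frequency_hz(tone_hz: float, sample_rate_hz: float) -> float:
    """Apparent frequency of a pure tone after sampling without filtering."""
    folded = tone_hz % sample_rate_hz
    # Anything above half the sampling rate is mirrored back below it.
    return min(folded, sample_rate_hz - folded)


# An 8 kHz tone is reproduced correctly at the CD sampling rate ...
print(alias_frequency_hz(8_000.0, 44_100.0))
# ... but reappears as a lower, false tone when sampled at 11.025 kHz.
print(alias_frequency_hz(8_000.0, 11_025.0))
# A 4 kHz tone lies below half of 11.025 kHz, so it is unaffected.
print(alias_frequency_hz(4_000.0, 11_025.0))
```

At the reduced sampling rate the 8 kHz tone comes back as a tone at 3025 Hz, which is not harmonically related to the original and so is heard as distortion.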
As well as the need for a sampling rate of at least twice the audio
bandwidth, the number of quantisation bits also has an effect on the
quality of the reconstructed signal as the next section will explain.
4.2.3 Signal-to-noise ratio in digital audio systems

Noise in digital audio systems comes from the quantisation process.
You may recall that in digitally coding the analogue signal the resulting
code word does not necessarily represent the exact value of the sample
level but the closest approximation to it. The difference between the
original value and the quantised value is the quantisation error which
causes a form of distortion known as quantisation noise.
The number of quantisation levels is determined by the number of
quantisation bits. Box 3 ‘Digital signal-to-noise ratio’ shows that for n
quantisation bits the signal-to-noise ratio for the system approximates
to 6n dB. The next activity demonstrates the advantage of using as
many bits as practical in a digital system.
Box 3 Digital signal-to-noise ratio

Consider a digital audio system using 3-bit code words, which gives 2³ or 8
quantisation levels, and assume this covers an input voltage of 1024 mV (1.024 V).
Assuming all the quantisation intervals are equal, the voltage difference between
each quantisation level will be 1024/8 mV = 128 mV, so the maximum
quantisation error is half this or 128/2 mV = 64 mV.
Now, if the number of quantisation levels is doubled to 16 (i.e. the number of
bits in the code word is increased to 4), the voltage difference between
quantisation levels will now be 1024/16 mV = 64 mV. The maximum quantisation
error is still half the quantisation voltage which is now 64/2 mV = 32 mV.
The change in quantisation noise between using three bits and four bits is
thus 64:32 or 2:1, which is 6 dB.
So doubling the number of quantisation levels (i.e. adding an additional
quantisation bit) improves the signal-to-noise ratio by 6 dB.
As a general principle therefore the signal-to-noise ratio for any digital system
relates directly to the number of quantisation bits, and for n quantisation bits
the signal-to-noise ratio is approximately 6n dB.
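The reasoning in Box 3 can be checked numerically. The following Python sketch uses the same 1024 mV input range; the function names are made up for the illustration, and the 6 dB per bit figure is the rounded rule of thumb from the box rather than an exact result.

```python
import math


def max_quantisation_error_mv(n_bits: int, full_scale_mv: float = 1024.0) -> float:
    """Half of one quantisation interval for equally spaced levels."""
    step = full_scale_mv / 2 ** n_bits
    return step / 2


def ratio_db(a: float, b: float) -> float:
    """Express the amplitude (voltage) ratio a:b in decibels."""
    return 20 * math.log10(a / b)


def approx_snr_db(n_bits: int) -> int:
    """Rule of thumb from Box 3: about 6 dB for each quantisation bit."""
    return 6 * n_bits


err3 = max_quantisation_error_mv(3)  # 64 mV
err4 = max_quantisation_error_mv(4)  # 32 mV
improvement = ratio_db(err3, err4)   # a 2:1 voltage ratio
print(err3, err4, round(improvement, 2))
print(approx_snr_db(16), "dB for 16 bits")
```

The 2:1 voltage ratio works out at 6.02 dB, which is why each extra bit is usually quoted as being worth 6 dB.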
Calculate the approximate signal-to-noise ratio for a domestic CD
player with 16 quantisation bits.
Comment
Using the expression derived in Box 3, for the domestic CD player with
16 quantisation bits the signal-to-noise ratio equals 6 × 16 = 96 dB.
Table 5 gives a comparison of the signal-to-noise ratios for typical
analogue and digital systems.
Table 5 Comparison of signal-to-noise ratios for analogue and digital audio systems
Signal-to-noise ratio   Audio medium                        Type
30–40 dB                Coarse-groove disc (a shellac 78)   analogue
50–65 dB                Micro-groove disc (a vinyl LP)      analogue
65–75 dB                Studio analogue tape recorder       analogue
up to 96 dB             Studio digital tape recorder        digital
96 dB                   Domestic CD player                  digital
120 dB                  SACD                                digital
144 dB                  DVD-A                               digital
Quantisation noise results from the stepped effect caused by the
conversion process. This causes little disturbance to high level audio
signals, sounding similar to the noise in analogue systems since the
quantisation errors occur in an uncorrelated, i.e. an apparently
random, fashion. However, with low level audio signals the
quantisation noise more directly relates to the signal and is said to be
correlated, i.e. related to the pattern of the audio signal. For example
consider a very low level audio signal that, when quantised, causes a
change of a single bit as illustrated in Figure 15. In the reconstructed
signal, this very low-level signal will appear as a square wave which
will generate audible harmonics causing distortion. One way of
overcoming the problem of quantising this low level signal would be to
increase the number of quantisation levels so that the signal will be
coded into several levels and will thus not generate a square wave at the
previous audible level. Unfortunately this brings about other problems
which you will see as you attempt Activities 14 and 15 below.
Figure 15 Quantising a low-level signal that causes only a single bit change
(Both panels plot amplitude, on code levels 0100 to 1000, against time.)
Calculate how many additional quantisation bits will be required to
increase the signal-to-noise ratio of a domestic CD player from its
existing 96 dB to 144 dB.
Comment
Start from the expression for the theoretical signal-to-noise ratio as
derived in Box 3:
signal-to-noise ratio = 6n dB
where n is the number of quantisation bits
So, for a given signal-to-noise ratio, n = signal-to-noise ratio/6. In this
case the required signal-to-noise ratio is 144 dB, therefore the number
of quantisation bits required is 144/6 = 24.
As a domestic CD player has 16 quantisation bits, 8 additional
quantisation bits are therefore necessary to increase the signal-to-noise
ratio from 96 to 144 dB.
As you can see adding extra bits is necessary but costly in terms of
increased storage capacity. The technology to provide this increase in
storage capacity did not exist at the time when the CD was developed.
The next activity considers storage capacity of a CD when additional
quantisation bits are used.
ACTIVITY 15 (EXPLORATORY)
Given the playing time of a standard CD is approximately 70 minutes
calculate the reduction in playing time that would result from
accommodating the eight extra quantisation bits. Ignore any overheads
due to formatting and error correction data.
Comment
For each sample the number of bits to be stored is increased from 16 to
24. If overheads are ignored then the playing time will be reduced by
the ratio of 16:24 which gives a new playing time of:
70 × (16/24) = 46.7 minutes
Again you can see that using more quantisation levels would not be a
very good idea for CDs due to the reduction in playing time. However,
adding more quantisation bits is exactly what has happened with the
higher capacity storage technologies used for SACD and DVD-A systems.
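The two calculations in Activities 14 and 15 can be combined in a short Python sketch. It relies on the approximate 6n dB rule from Box 3 and ignores formatting and error correction overheads, as the activity does; the function names are hypothetical.

```python
import math


def bits_for_snr(snr_db: float) -> int:
    """Quantisation bits needed for a target signal-to-noise ratio (6n dB rule)."""
    return math.ceil(snr_db / 6)


def playing_time_min(original_min: float, old_bits: int, new_bits: int) -> float:
    """Playing time on the same disc when the bits per sample change."""
    return original_min * old_bits / new_bits


cd_bits = 16
needed_bits = bits_for_snr(144)
extra_bits = needed_bits - cd_bits
new_time = playing_time_min(70, cd_bits, needed_bits)
print(extra_bits, "extra bits;", round(new_time, 1), "minutes")
```

The trade-off is a direct one: the disc holds a fixed number of bits, so playing time falls in the same proportion as the number of bits per sample rises.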
Another way to ensure that the distortion resulting from quantising a
low level signal is not generated is to add a low level ‘spoiling’ signal.
This is an idea taken from video recording technology where low level
random noise is used to improve picture quality. If a very small
amount of random noise, known as dither noise, is added before the
analogue-to-digital conversion process then the sound is improved.
This may seem quite strange, but although the overall signal-to-noise
ratio is decreased slightly, the effect of the additional noise is to
decorrelate the quantisation error from the signal with the result that
the audible distortion is effectively reduced. This then allows the
digital system to code
signal amplitudes of less than the amplitude of one quantisation
interval and still keep the noise levels low, as shown in Figure 16.
Figure 16 A dithered input signal and the resultant waveform
(Both panels plot amplitude, on code levels 0100 to 1000, against time.)
Often dither noise need not be added to an analogue signal prior to
quantisation as usually there is sufficient noise already in the signal
from the preceding analogue circuits. Most computer-based digital
editing systems can generate dither noise if desired.
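The effect of dither can be demonstrated with a small simulation. The Python sketch below is a simplified illustration, not a description of how any real converter or editing package generates dither: it assumes a quantisation interval of 1 unit, a test signal that swings by less than one interval, and random noise spread evenly over one interval. All names in it are invented for the example.

```python
import math
import random


def quantise_to_step(value: float) -> int:
    """Round to the nearest whole quantisation level (interval = 1 unit)."""
    return math.floor(value + 0.5)


def quantise_with_dither(value: float, rng: random.Random) -> int:
    """Add random noise spanning one quantisation interval, then quantise."""
    return quantise_to_step(value + rng.uniform(-0.5, 0.5))


rng = random.Random(0)  # fixed seed so the run is repeatable

# One cycle of a low-level sine wave centred between two levels.
samples = [0.5 + 0.4 * math.sin(2 * math.pi * (k + 0.5) / 16) for k in range(16)]

# Without dither the sine wave collapses to a square wave of 1s and 0s.
undithered = [quantise_to_step(s) for s in samples]
print(undithered)

# With dither each output is still a whole level, but the average of many
# dithered conversions follows the original low-level signal.
trials = 4000
dithered_mean = [
    sum(quantise_with_dither(s, rng) for _ in range(trials)) / trials
    for s in samples
]
worst_difference = max(abs(m - s) for m, s in zip(dithered_mean, samples))
print(round(worst_difference, 3))
```

The undithered output carries only the sign of the signal, which is the square wave of Figure 15. The dithered output is noisier sample by sample, but its error no longer follows the pattern of the signal, so on average the original shape is preserved.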
ACTIVITY 16 (LISTENING)
The audio track for this activity contains two examples of the same
recording. The first has been made without the use of dither noise.
You will be able to hear audible distortion particularly as the notes
fade away. In the second example dither noise has been added
effectively reducing the distortion. In both cases the effects have been
deliberately emphasised to demonstrate how the addition of dither
noise can improve the sound of a digital audio recording. In normal
cases the dither noise is so low that it is inaudible as you will see in
the following activity.
In this activity you will use the course’s sound editing software to
both see and hear the amount of dither noise that could be added to a
digital audio signal on a CD. As the activity uses features of the sound
editor you have not previously used, you will find detailed steps on
how to carry out this activity in the Block 3 Companion.
Comment
In this activity you were able to see that a very low level of dither noise
is added to the signal to avoid any audible distortion when converting
from a higher to a lower number of quantisation levels. This level of
noise is inaudible but, as Activity 16 demonstrated, it removes the
audible distortion even though it is added at a very low level.
4.2.4 Digital dynamic range

The improved signal-to-noise ratio possible in digital systems means that
the lowest operating level is reduced so increasing the dynamic range as
compared with an analogue system. The upper operating level in an
analogue system has always been set below the level that would start to
cause distortion due to the non-linear response characteristic at high
levels. The difference between this upper operating level and the point at
which distortion occurs is known as the headroom. It is a safety margin to
allow for the occurrence of unexpectedly high audio levels which could
cause some audible distortion if this margin was not used. Digital systems
do not need to allow for a safety margin as no distortion whatsoever will
occur until the maximum code-word value is reached. At this point the
signal is clipped and gross distortion occurs. So, to reduce the possibility
of this clipping an artificial headroom of 18 dB below the maximum
possible coding level has been suggested by the European Broadcasting
Union in their Technical Recommendation R68–2000.
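To give a feel for what 18 dB of headroom means in terms of code words, the Python sketch below converts the figure into an amplitude ratio. It assumes 16-bit signed code words with a maximum positive value of 32 767, which is an assumption made for this illustration rather than part of the recommendation.

```python
def db_to_amplitude_ratio(db: float) -> float:
    """Convert a level difference in decibels to an amplitude (voltage) ratio."""
    return 10 ** (db / 20)


max_code_16_bit = 2 ** 15 - 1  # assumed: 16-bit signed code words
headroom_db = 18

ratio = db_to_amplitude_ratio(-headroom_db)
peak_code_at_operating_level = max_code_16_bit * ratio
print(round(ratio, 4), round(peak_code_at_operating_level))
```

A signal peaking 18 dB below the maximum coding level therefore uses only about one eighth of the available code-word range, leaving the remainder as a margin against clipping.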
Why do you think increasing the number of quantisation bits will
improve the dynamic range of a digital audio system?
Comment
Increasing the number of quantisation bits improves the signal-to-
noise ratio so reducing the lower operating level of the audio signal.
As a result of increasing the number of quantisation levels there are
more code-word values available before the maximum code-word value
is reached and clipping occurs. Thus the dynamic range is extended.
Digital audio systems are capable of improvements over analogue systems
in terms of signal-to-noise ratio, bandwidth and dynamic range. However,
this comes at the cost of generating prodigious quantities of digital data
which have to be processed, stored or transmitted. The remainder of this
chapter looks at the technologies developed to handle all this digital data.
The analogue to digital conversion processes add a particular form of
noise to the audio signal. Describe how this noise occurs and how it
may be kept to a minimum.
5 DIGITAL AUDIO TAPE RECORDING
5.1 Introduction
From Activity 10 you found that a typical stereo digital audio signal
generates nearly 1.5 megabits of data every second. To store data at this
rate, recording equipment with a bandwidth approaching 2 MHz is
required which is a hundred times greater than the bandwidth of an
analogue audio recorder. Fortunately, in the early days of digital audio,
the newly developed video tape recorder (VTR), which had been
designed for recording the high bandwidth signals that television
pictures require, was able to be adapted to recording digital audio data.
In this section then we will look at the basic technology behind digital
audio recorders that use magnetic tape as the recording medium. This
will highlight some of the disadvantages of using tape, particularly in
the consumer area. Section 6 then will look at disc-based systems such
as the CD and MiniDisc, that overcome some of these disadvantages.
5.2 Rotary head tape recorders

In 1969 Nippon-Columbia developed a prototype digital audio recorder
which used a PCM digital audio processor that connected directly to a
standard VTR. Subsequently, digital audio processing devices, such as
the Sony Corporation’s PCM-1610, became available. These provided
the necessary digital conversion, data processing and interfacing to the
VTR. These digital audio recorders, described in Box 4 ‘Rotary head
tape recorders’, became known as PCM-VTR systems. The PCM-1610
was often used with Sony’s U-matic VTR (a high quality professional
VTR system), becoming a de facto standard for two channel digital
audio production and CD mastering for many years.
5.2.1 Digital audio tape recorders

The digital audio tape (DAT) system was introduced by Sony in 1987.
It used the same helical recording head principle as the VCR, but with
a specially designed tape cassette which was much smaller than the
ones used in VCRs. The mechanism is shown in Figure 18, and Box 5
outlines its operation. Each tape can record up to two hours of
16-bit stereo digital audio using a sampling rate of 48 kHz. This
system has an impressive specification with the possibility of
recording to a higher quality than CDs. Box 6 ‘A DAT recorder
specification’ gives details of a typical DAT recorder for
reference only.
Figure 18 A DAT cassette tape transport
Box 4 Rotary head tape recorders

Recording high frequency signals such as video or digital audio data onto magnetic tape requires
a head-to-tape speed of around 10 metres per second (m/s) or roughly 400 inches per second (ips). For
comparison a typical head-to-tape speed for an analogue audio studio tape recorder is 38 cm/s or
15 ips. Running tape at a speed of 10 metres per second over a fixed head proved totally impractical
as not only were huge reels containing prodigious lengths of tape necessary, but also the tape
heads suffered extreme wear and frequent breaks in the tape occurred. In America the Ampex
Corporation developed a rotary head recording system where the tape moved at a relatively
slow speed around a drum which contained spinning record/play heads. This gave the equivalent
high head-to-tape speed necessary for recording video signals without the drawbacks of moving
the tape at high speed. It did, however, increase the complexity of the tape heads.
In the rotary head system, the tape is wrapped around just over half of the head-drum which
contains two record-replay heads* that revolve at very high speed in the same direction as the
tape moves. As shown in Figure 17(a), the tape is slanted across the drum forming a helix, which
causes a series of diagonal magnetic tracks to be recorded on the tape as illustrated in
Figure 17(b). This pattern of tracks gives rise to the term helical tape recording. Head-to-tape
contact is continually maintained, but a very short break in the signal path occurs during the
electronic switching from one head to the other. This break is made to coincide with the end of
the previous frame and start of the next frame of the television picture so the interruption is
not obvious**. A fixed erase head mounted before the head-drum assembly covering the full width
of the tape removes any previous recordings. Additional fixed analogue heads mounted after the
head-drum assembly record synchronisation and analogue audio signals along the two edges of
the tape.
Figure 17 The rotary-head tape recorder showing (a) the helical path of the tape across the head
drum and (b) the diagonal tracks
(Labels in (a): video track, tape movement, spinning head. Labels in (b): direction of tape
travel, control track, video track, audio track.)
The digital audio processor unit designed to interface with the above recording system is a device
that first converts the analogue audio to a digital PCM signal. It then splits the digital data into
discrete blocks (the gaps between the data blocks coinciding with the head switching occurring at
either 25 or 30 times a second depending on the television standard) and finally forms these
blocks into a signal that looks to a VCR like any normal video signal (i.e. the part of the signal that
usually contains the picture information contains the digital audio data). Fourteen-bit quantisation
was used initially but this was later increased to 16-bit to match CD systems. Error detection and
correction, discussed later, ensured that the recovered audio signal was free from any errors
introduced by the VCR.
Helical recording is used in consumer video cassette recorders (VCRs), camcorders and many
digital audio tape recorders where the helical recording system is integrated in the device and a
separate video recorder is not used.
*Modern video tape recorders usually use four heads.

**Television pictures are made up from individual frames – 25 per second in Europe and 30 per
second in North America. The rotary head must switch heads in the gap between the frames
otherwise part of the picture will be lost during head switching. In normal video recordings, each
diagonal track contains one picture frame.
Box 5 The DAT recorder

The rotating recording method is used to achieve the necessary recording bandwidth but the tape
is not wound as far around the spinning tape head as in VCRs in order to reduce tape wear.
Figure 19 compares head-tape wrap angles for DAT and VCRs.
Figure 19 Comparison of tape–drum wrap angles between (a) a VCR and (b) a DAT recorder
(The tape wraps 180° around the drum in the VCR and 90° in the DAT recorder. Labels: tape,
guides, drum, heads (H), erasure head, control head.)
To maximise tape usage, there is no guard band (space) between adjacent diagonal tracks on the
tape which is normally needed to avoid cross-talk between tracks. Each track partially overlaps
the adjacent track – a principle known as overwrite recording, as shown in Figure 20. However,
the angle of the two heads to the tape (known as the azimuth angle) is set at 20° difference to
minimise interference between the tracks. Textual data (track timings, table of contents,
etc.) – called subcode data – is included with the digital audio data. Also there is no need to
have a separate erase head as the special recording method means a new recording will simply
overwrite the old one.
Figure 20 Overwrite recording
(Labels: direction of tape travel, rotary head movement, head 1, head 2, azimuth angle +/– 20°.)
There is no separate linear control track along the length of the tape to provide
synchronisation and track location information – a special ‘automatic track following’
signal is recorded with the digital audio data.
The system offers a number of different recording modes including a long play mode and
a high speed (high quality) mode, which, for half the recording time, offers 16-bit digital
data at a sample rate of 96 kHz.
DAT recorders are available in both professional and non-professional specifications as
portable and non-portable machines. Serial copying management system (SCMS), explained
in a later section, is included in non-professional recorders.
Box 6 A DAT recorder specification (for reference only)

Mode                   Standard   Long-play   Pre-recorded
Tape width (mm)        3.81       3.81        3.81
Recording time (min)   120/60     240         80
Tape speed (mm/s)      8.15       4.075       12.225
Head speed (rpm)       2000       1000        2000
Sample rate (kHz)      32/48/96   32          44.1
Lossless coding        16-bit     n/a         16-bit
Lossy coding           12-bit     12-bit      n/a
No of channels         2/4        2/4         2
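One way to see how the modes in Box 6 relate to each other is to work out the length of tape each one uses, which is simply tape speed multiplied by recording time. The Python sketch below does this with the figures from the box; for the standard mode it assumes the 120-minute recording time, and the names used are invented for the illustration.

```python
# (tape speed in mm/s, recording time in minutes), taken from Box 6
modes = {
    "Standard": (8.15, 120),
    "Long-play": (4.075, 240),
    "Pre-recorded": (12.225, 80),
}


def tape_length_m(speed_mm_s: float, minutes: float) -> float:
    """Length of tape passing the head drum in the given time, in metres."""
    return speed_mm_s * minutes * 60 / 1000


lengths = {name: tape_length_m(speed, mins) for name, (speed, mins) in modes.items()}
for name, length in lengths.items():
    print(f"{name:13s} {length:.2f} m")
```

All three modes come out at the same length of tape, a little under 59 m, so the different recording times are obtained purely by changing the tape speed.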
Can you suggest a reason why the low sample rate of 32 kHz is
included in the specification of the DAT recorder?
Comment
Broadcasters use a 15 kHz bandwidth for analogue radio
transmissions. The lower DAT recorder sampling rate exceeds this
bandwidth on replay making it suitable for broadcast use without the
need for re-sampling. Being portable the DAT recorder is ideal for
location recordings, especially interviews.
5.2.2 The Alesis multitrack digital audio tape recorder

One of the drawbacks of rotary head digital audio recorders is their
inability to offer true multitrack recording facilities that were available
on multitrack analogue tape recorders. The Alesis multitrack digital
audio recorder (ADAT), was introduced by the Alesis Corporation in
1991 to address this problem. An outline specification for this
recorder is given in Box 7 for reference only. ADAT recorders use high
quality Super Video Home System (S-VHS) tapes with a modified VCR.
Although primarily designed for professional use, they are available
for the serious amateur, but the system has not made real inroads into
the consumer sector.
Box 7 Specification for an ADAT multitrack digital audio recorder (for reference only)

Frequency response: 20 – 20 000 Hz +/–0.5 dB
Audio channels: 8
Sample rate: 44.1 or 48 kHz
Dynamic range: 97 dB
Harmonic distortion: 0.009%
Wow and flutter: unmeasurable
Synchronisation: ADAT sync input and output
Recording time: 63 minutes on a 182 minute S-VHS video cassette
tape (about 35% of the recording time available when
the tape is used for normal video recording)
Digital input/output: Optical ADAT digital data input and output
The use of high quality S-VHS video tape and a faster than normal tape
speed means that 8 channels of digital sound sampling at either 44.1 or
48 kHz using 16, 20, or 24 quantisation bits can be recorded. In
addition the system uses a special encoding system that includes
synchronisation/identification signals that not only aid editing, but
also allow recorders to be cascaded to allow the possibility of 16 or
more tracks.
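To see why a faster tape speed is needed, it helps to work out the raw audio data rate for eight channels. The sketch below is only an estimate of the audio payload: it ignores the error correction and synchronisation data that the recorder adds, and the function name is illustrative.

```python
def audio_data_rate_mbit(channels, sample_rate_hz, bits_per_sample):
    """Raw audio payload in Mbit/s (no error correction or sync data)."""
    return channels * sample_rate_hz * bits_per_sample / 1e6

# Two-channel 16-bit audio at 48 kHz, for comparison with a stereo recorder
print(f"stereo reference: {audio_data_rate_mbit(2, 48_000, 16):.3f} Mbit/s")

# The eight-channel combinations mentioned in the text
for bits in (16, 20, 24):
    for rate in (44_100, 48_000):
        mbit = audio_data_rate_mbit(8, rate, bits)
        print(f"8 channels, {bits} bits, {rate / 1000:.1f} kHz: {mbit:.3f} Mbit/s")
```

Eight channels need four times the payload of a stereo recording at the same sample rate and word length.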
The ADAT system remains popular and there is a wide range of
recorders, editors and other devices available that are compatible with
the ADAT digital input and output format and synchronisation
signals. Indeed there are now ADAT hard disk recorders that use a
computer hard disk instead of magnetic tape as the storage medium.
5.3 Stationary head tape recorders

Analogue multitrack tape recorders were able to record as many as 32
individual tracks which was often sufficient to allow part
performances to be recorded on different occasions if all the musicians
were not available at one time. Subsequent studio mix-down creates
the final sound. Most rotary head recorders (apart from ADAT) could
only record two channel data making multitrack techniques and
surround sound recording formats impossible.
Why do you think the multitrack tape recorder was so popular with
recording companies?
Comment
The multitrack facility allows many microphones to be used to record
individual artists and instruments. The final mix-down can then be
created from the best possible tracks in the quiet of the studio and
away from all the tensions of the recording session. Different mixes
may be experimented with which would not be possible with only a
two track recording.
The multitrack digital stationary head recorder, illustrated in
Figure 21, was developed to satisfy the needs of the professional
studio. It offers facilities similar to the earlier analogue machines
but with a digital audio quality to match the PCM-VTR systems.
Development of new coding systems, special heads and magnetic tapes
capable of supporting high data rates at realistic tape speeds enabled
the Digital Audio Stationary Head (DASH) format, outlined in Box 8, to
become a very successful system.
Figure 21 A multitrack DASH recorder
Digital tape recorders are gradually being superseded by high capacity
hard disk recorders which were introduced in Chapter 1 of this block.
It is only since it has been possible to manufacture very high capacity
hard disks at reasonable cost that tape storage has been replaced for the
production of session and master recordings. By using appropriate
computer editing software such as that which accompanies this course,
or a dedicated studio system, high numbers of individual digital tracks
may be made available.
Box 8 An outline of the DASH recording system

The standard recording speed for a DASH recorder is 76 cm/s (30 ips) for a 48
kHz sampling rate and 70 cm/s (27.5 ips) at a 44.1 kHz rate. Special tape
heads, derived from integrated circuit manufacturing technology, enable digital
multitrack recording onto tape without the need for the helical recording
technique used in VTRs. Typically 24 tracks are provided across a 12.7 mm
(half-inch) tape although the DASH II standard offers 48 tracks on similar tape
making track widths under 0.25 mm (0.01 inch). Powerful error correction
techniques allow traditional cut and splice edits to be made in a similar way to
those on analogue recorders as discussed in Chapter 1.
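The two tape speeds quoted in Box 8 are related in a simple way. The sketch below assumes that the tape speed is scaled in direct proportion to the sample rate, so that the density of data along the tape stays the same; this is an assumption made for illustration, not something stated in the box.

```python
CM_PER_INCH = 2.54

def scaled_speed(speed_cm_s, from_rate_khz, to_rate_khz):
    """Scale a tape speed in proportion to the change in sample rate."""
    return speed_cm_s * to_rate_khz / from_rate_khz

speed_at_48k = 30 * CM_PER_INCH                      # 30 ips in cm/s
speed_at_44k1 = scaled_speed(speed_at_48k, 48.0, 44.1)

print(f"48 kHz:   {speed_at_48k:.1f} cm/s")
print(f"44.1 kHz: {speed_at_44k1:.1f} cm/s "
      f"({speed_at_44k1 / CM_PER_INCH:.2f} ips)")
```

Under that assumption the lower speed comes out close to the 70 cm/s quoted for the 44.1 kHz rate.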
5.4 Tape versus disc

Digital tape recorders offered the first opportunity for recording and
storing digital audio. Whilst eminently suitable for studio use, tape
has not proved to be a popular medium for consumer audio products.
This is in direct opposition to video consumer products where the
tape-based video home system (VHS) has become a global standard.
Why is tape more acceptable for video use? The reason is in the way
we use the medium. Films and television programmes tend to be
watched in a linear fashion, i.e. starting at the beginning and running
through to the end, albeit with breaks. However, given the choice of
songs on a record album, favourite tracks tend to get played over and
over again whilst others are ignored. With tape this would involve
spooling to-and-fro, which is time consuming, but with discs
accessing any track takes the same time and can be easily mechanised.
In the next section therefore the development of digital audio disc
technology will be described.
Why do you think tape is not an ideal medium for domestic audio use
compared with disc?

Digital audio technology enabled the music industry to improve the
specification available from analogue audio systems. Dynamic range,
audio bandwidth and signal-to-noise ratio are all improved. The
problem of the enormous quantities of data generated by the digital
conversion process was solved initially by using existing rotary-head
video recording technology as it was able to offer the necessary
bandwidth to capture the digital stream. Problems with tape editing
made this system unpopular in recording studios. As tape media and
technology improved it became possible to use fixed-head multitrack
recorders enabling traditional editing techniques to be employed.
Consumer digital recording systems were based on domestic VCRs.
However, tape has never been the most satisfactory medium for
widespread domestic audio use.
6 DIGITAL DISC SYSTEMS
6.1 Introduction
A number of different disc storage media may be used for digital audio
data, some of which were originally developed for use in the computer
industry. Table 6 gives examples of the more popular disc media, not
all of which currently offer recording facilities.
Table 6 Comparison of storage methods
Disc storage medium                   Typical capacity   Playing time
Compact disc CD                       650 Mbytes         74 min
Mini disc MD                          200 Mbytes         74 min
Super audio compact disc SACD         4.7 Gbytes         74 min*
Audio digital versatile disc DVD-A    4.7 Gbytes         74 min*
Hard disk**                           40 Gbytes          4,160 min
* The playing time is similar to that of CD but offers higher quality sound. Additional
features such as multi-channel surround sound and limited video are also possible
** Whilst not originally designed for audio use, hard disks are becoming very popular
as domestic audio storage devices now that very high capacity drives are available at
relatively low cost. The storage capacity shown is for uncompressed WAV files.
6.2 The audio compact disc system

Because of the disadvantages of a tape-based system for consumer
audio a disc-based medium was favoured when development of a
digital audio consumer device commenced in the 1970s.
By 1977 a Japanese consortium comprising Hitachi, Mitsubishi and
Sony had demonstrated a digital audio disc based upon the existing
30 cm video optical disc technology illustrated in Figure 22(a).
Meanwhile the Dutch Philips Company, working along similar lines,
had developed a system based on a 12 cm disc illustrated in
Figure 22(b) and capable of holding roughly the same amount of music
as a vinyl LP. Eventually Philips demonstrated their product to the
Japanese Sony Corporation who agreed to work with Philips to develop
the product further. The result of this joint development was today’s
digital audio compact disc (CD). A maximum storage capacity of
74 minutes was specified as this was the approximate performance time
of Beethoven’s Ninth Symphony – a favourite work of the Sony chairman!
Figure 22 (a) 30 cm and (b) 12 cm optical storage discs compared
Full technical specifications for the CD are found in a document called
the Red Book, so named, legend has it, because the original working
documentation was stored in a binder with red covers. Subsequent
compact disc specifications have been indicated by their variously
coloured covers as listed for reference only in Table 7.
Table 7 Book colours for compact disc specifications (for reference only)

Book colour   Specification
Red           CD (Digital Audio compact disc)
Yellow        CD ROM (Computer data compact disc)
Orange        CD-R and CD-RW (Recordable compact disc)
Green         CD-i (Interactive compact disc)
Blue          E-CD (Enhanced compact disc)
White         VCD (Video compact disc)
Scarlet       SACD (Super Audio Compact Disc)
Table 8 compares some of the physical and technical characteristics of
the CD and the vinyl LP record it superseded. The following
parameters are of particular relevance:
• the signal-to-noise ratio is improved both because of the number of
quantisation bits and because of the high tolerance to dust and
minor scratches on the surface of the CD. The relatively soft vinyl
LP groove was very intolerant of dust particles and easily scratched
by mishandling;
• the dynamic range is increased because of the lower noise level
and the ability to reproduce high sound levels without distortion.
High sound levels on LP discs, especially at low frequencies, could
cause the pickup to jump out of the groove and damage the disc.
Table 8 Comparison between CDs and LP discs (for reference only)

Parameter               CDs                          LP discs
Disc diameter           12 cm                        30 cm
Maximum playing time    74 min*                      34 min per side
Rotational speed        568 – 228 rpm                33 1/3 rpm
Linear speed            1.2 or 1.4 m/s               0.528 – 0.211 m/s
Speed variation         Below measurable levels      0.03%
Frequency response      20 Hz – 20 kHz +/– 0.5 dB    30 Hz – 20 kHz +/– 3 dB
Signal-to-noise ratio   96 dB                        60 dB
Dynamic range           90 dB                        65 dB (at 1 kHz)
Channel separation      90 dB                        25 – 30 dB
*This is the Red Book specification. CDs now play for over 80 minutes.
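The 96 dB signal-to-noise figure for the CD in Table 8 is consistent with the common rule of thumb of roughly 6 dB for each quantisation bit. The sketch below simply expresses the ratio between the largest and smallest values a binary word can distinguish in decibels; it is an idealised estimate, and measured figures for real equipment depend on how signal and noise are defined.

```python
import math

def quantisation_range_db(bits):
    """Ratio of full scale to one quantisation step, expressed in dB."""
    return 20 * math.log10(2 ** bits)

for bits in (12, 16, 20, 24):
    print(f"{bits} bits: {quantisation_range_db(bits):.1f} dB "
          f"({quantisation_range_db(bits) / bits:.2f} dB per bit)")
```

For 16 bits this gives a little over 96 dB, matching the value in the table.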
To give you an idea of the difference in the physical attributes of the
two media, Figure 23 shows a photomicrograph comparing the tracks
on a microgroove vinyl LP to those of a CD. There are approximately
60 optical tracks to every LP track. Whereas the analogue signal is
stored as undulations in a continuous track or groove the digital audio
data is stored as a spiral of pits which are detected by the reflection of a
very fine laser beam onto an optical sensor.
Figure 23 A comparison of track widths between a vinyl LP (top) and a CD (bottom)
A CD consists of a clear polycarbonate disc either 8 cm or 12 cm in
diameter and 1.2 mm thick. The data is pressed into one side to form
the data layer which is aluminised giving it a reflective layer. A coat of
lacquer protects the aluminium surface and the disc label is printed
onto the lacquer.
Infrared light from a miniature laser light source is focused through
the polycarbonate from the underside of the disc to form a 1.7 µm spot
on the data layer (1 µm is one millionth of a metre). Because the data
layer is over 1 mm below the outer surface of the disc any dirt or
surface aberration will not interfere with the light path as
illustrated in Figure 24.
Figure 24 A dirt particle on the disc surface does not affect the laser beam focus
(Figure labels: label, lacquer, reflective surface, pit, land, 1.2 mm disc substrate, dirt, laser beam focuses through substrate)
The digital audio data is stored as a series of pits and lands on the
data layer of the disc (see Box 9 ‘Pits or bumps?’). Pits vary in
length from 0.833 µm to 3.56 µm depending on the value of the data
they represent and are approximately 0.5 µm wide. They form part of a
spiral track spaced 1.6 µm apart. Pits are 0.390 µm deep (which is
half the wavelength of the infrared laser light).
Box 9 Pits or bumps?

The digital audio data on a CD is stored as areas of land and pits. These
are formed into the top or label side of the CD during the manufacturing
process. Because the data is read from the underneath the pits appear as
raised areas or bumps standing out above the land. However, because they
start life as pits in the original glass master they are still referred to as
pits in most literature, and here – even though they are really bumps!
The light spot reflected from the land will have a particular intensity
as measured by an optical pickup. When a data pit is encountered the
reflected light will be scattered, so that the intensity measured by the
optical pickup will be less. These two levels of intensity are used to
retrieve the data, with low intensity being logic zero. As you can
probably imagine the optics and electronics are extremely complex
with a servo-mechanism ensuring the light spot is kept focused all the
time the disc is playing.
The data on the disc is not stored in a straight PCM format because the
bit patterns created by this coding method are not the most efficient
way to store the data. Efficient serial data storage systems, such as
those found on CDs, require the data to:
• be self-clocked (i.e. the data pattern provides its own timing
information thus eliminating the need for a separate timing signal
to decode the signal on replay so simplifying the interface);
• use the lowest possible number of bit changes (i.e., 0 to 1, or 1 to 0)
in order to keep the bandwidth as low as possible;
• equalise overall the number of zeros and ones so as to ensure that
the content of the signal does not disturb the servo-mechanism that
controls the focus.
These characteristics are addressed by employing some quite
complicated methods of encoding the basic digital audio data, the
details of which are beyond the scope of this course. However, Box 10
outlines the techniques that are used.
So that you can get an idea of how a CD player reads the data on a CD
and converts it to an analogue sound signal, Box 11 outlines the
various functional blocks that a typical CD player contains.
Initially, consumers were reluctant to buy into the CD system, largely
because of the lack of recordings and the high cost of both the playback
equipment and the discs, inevitable in any new format. However, once
their advantages were recognised, CDs quickly became established and
sales of vinyl LPs plummeted. Interestingly, sales of compact cassettes
were less affected, owing to their use with car radios. CDs are now
firmly established as the major way of distributing recorded music,
although formats such as SACD and DVD-A, both discussed later, are
gaining some following.
For reference only, the specification of a typical compact disc player is
given in Box 12. This specification is typical for those of many devices
in that certain terms such as oversampled and error correction are used
without any explanation. Both these terms represent important advances
in digital audio technology and will be discussed in the section on
digital disc technologies.
Box 10 Data coding on a CD

CDs use eight-to-fourteen modulation (EFM), termed an 8,14 code, a
technique that transposes a standard 8-bit code into a 14-bit code.
(Note that prior to this stage the original digital audio data has
undergone various stages of processing, some of which are discussed in
later sections and this results in the data being collected into blocks
of 8-bit data.) It may not seem very obvious but in fact increasing the
number of bits necessary to code the data solves all of the three
characteristics required for efficient storage of the audio data.
EFM takes each of the 256 possible 8-bit codes and generates an
equivalent 14-bit code from a possible 16,384 codes using a look-up
table. Table 9 shows examples of 8,14 code conversions. In each case
the 14-bit code is chosen to have a minimum of two zero bits between
each one bit in order to lower the digital bandwidth and a maximum of
ten in order to avoid synchronisation problems. A change from ones to
zeros or vice versa at least in every 10 bits is sufficient to maintain
synchronisation.

Table 9 Examples from the 8,14 code look-up table

Denary value   8-bit word   14-bit word
100            01100100     01000100100010
101            01100101     00000000100010
102            01100110     01000000100100
103            01100111     00100100100010

Following EFM coding a further 3 ‘merging’ bits are added to form a
17-bit code word. Their value depends on adjacent 14-bit code words and
are used to equalise on average the number of ones and zeros. This is
done so that the average voltage of the signal over time is zero
(assuming the binary ones and zeros are represented by equal positive
and negative voltages). The addition of the three merging bits is quite
complex adding to the cost of the encoders. Fortunately this problem
only occurs in manufacturing the disc and because the decoding is very
straightforward the 8,17 code was included in the original Red Book CD
specification and has been used ever since. (Today, technology has
progressed such that the complexity of adding the additional bits is
not significant any more.)
Using EFM to overcome the problems listed above actually allows more
data to be stored on the disc than would have been possible if the data
was retained in its original format. This may seem strange as more bits
are being used. However, because the data patterns are simplified the
pit and land areas representing the zero and one data bits can be
physically smaller, thus taking up less space on the disc than would
otherwise be necessary.
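The run-length rule described in Box 10 can be checked directly against the four entries of Table 9. The sketch below is an illustration only: the helper names are hypothetical, and it tests individual 14-bit words rather than implementing the full look-up table or the merging bits.

```python
# Entries copied from Table 9: denary value -> (8-bit word, 14-bit word)
TABLE_9 = {
    100: ("01100100", "01000100100010"),
    101: ("01100101", "00000000100010"),
    102: ("01100110", "01000000100100"),
    103: ("01100111", "00100100100010"),
}

def zero_runs_between_ones(code):
    """Lengths of the runs of zeros lying between consecutive ones."""
    inner = code.strip("0")                 # ignore leading/trailing zeros
    return [len(run) for run in inner.split("1")[1:-1]]

def obeys_run_length_rule(code, min_zeros=2, max_zeros=10):
    """True if ones are at least min_zeros apart and no zero run is too long."""
    longest_run = max(len(run) for run in code.split("1"))
    between = zero_runs_between_ones(code)
    return longest_run <= max_zeros and all(r >= min_zeros for r in between)

for value, (word8, word14) in TABLE_9.items():
    assert int(word8, 2) == value           # the 8-bit word is the denary value
    print(value, word14, zero_runs_between_ones(word14),
          obeys_run_length_rule(word14))
```

Runs of zeros at the two ends of a word are only checked against the maximum here, because what happens across the boundary between words is the job of the merging bits.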
Box 11 The CD player

A block diagram showing the major components of a CD player is given in Figure 25.
Figure 25 Block diagram of a CD player
(Blocks shown: compact disc with pits, optical head, motor, tracking
servo, focus servo, motor drive servo, disc in place interlock,
controls, control system, display, data separator, clock drive, data
memory, decoding and error correction, digital filters, and a D/A
converter and low pass filter for each of the L and R channel outputs)
A number of sub-systems are shown to: rotate the disc, position and
focus the optical pickup, recover and check the accuracy of the data,
convert the data to an analogue signal and finally give the user
operational control and feedback.
The system may be divided into two main parts: the servo mechanisms
controlling the mechanical operations and the audio reconstruction
circuits which generate the analogue audio signal.

Servo mechanisms
Once the CD is placed into the player and the door closed a safety
interlock operates and the infrared laser beam is turned on. Safety
precautions are necessary because even though the infrared laser beam
is invisible it can cause serious damage if exposed to our eyes. Once
the laser beam is focused, which can only happen if the CD is inserted
the correct way round, a motor spins the disc up to speed. The
rotational speed is controlled by a drive servo which locks to
synchronisation data read directly from the disc. The optical pickup is
moved to find the lead-in track* (positioned at the centre of the disc
immediately before the first digital audio track). It is then stepped
forward one track* and the digital audio circuits are switched on.
Appropriate information about the CD is sent to the display to indicate
number of tracks, playing time, track number, time remaining, etc. User
controlled switches signal to the mechanisms to provide start, stop,
track switching, step forward and back, pause and disc eject.
* This is a ‘data track’ similar to the audio ‘tracks’ on the CD, it is
not a separate physical track – physically, the CD only contains a
single spiral track of pits and lands.

Audio reconstruction
The truly revolutionary feature of the CD playback system is that the
optical pickup makes no actual contact with the surface of the medium.
Both LP discs and magnetic tape suffer signal degradation and media
wear due to contact with the pickup so limiting the total number of
possible plays.
The digital audio data is stored as a series of pits moulded into the
data land of the disc. When the laser beam, which is brought to a focus
spot on the land is reflected back the intensity detected by the light
sensor is a maximum indicating a data bit value of one, as shown in
Figure 26(a). When the laser beam crosses a pit the reflected light is
scattered, as shown in Figure 26(b), and the sensor detects a lower
light intensity. This registers as a data bit value of zero. The serial
data is read from the disc at a constant linear speed so the rotational
speed of the disc varies between 568 and 228 rpm as the head moves from
the centre of the disc to the periphery. The data memory is used to
unscramble the data into data words, which are checked for errors,
separated into two or more channels, filtered and passed to the
digital-to-analogue (D/A) converters. The reconstructed analogue signal
for each channel is passed through a low-pass filter before final
amplification and output to the audio system.
Figure 26 Laser light is scattered when a pit is encountered
(Figure panels: (a) beam on land, (b) beam on a pit)
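The link between a constant linear speed and a varying rotational speed can be checked with a little arithmetic. The sketch below assumes a linear speed of 1.4 m/s (one of the two values listed in Table 8) and assumes that the recorded area runs from a radius of about 23.5 mm to about 58.5 mm; these radii are not given in the text and are used here only for illustration.

```python
import math

LINEAR_SPEED_M_S = 1.4    # one of the two linear speeds listed in Table 8
INNER_RADIUS_MM = 23.5    # assumed start of the recorded area
OUTER_RADIUS_MM = 58.5    # assumed end of the recorded area

def rpm_at_radius(linear_speed_m_s, radius_mm):
    """Rotational speed needed to keep the track moving at a fixed speed."""
    circumference_m = 2 * math.pi * radius_mm / 1000
    return linear_speed_m_s / circumference_m * 60

inner_rpm = rpm_at_radius(LINEAR_SPEED_M_S, INNER_RADIUS_MM)
outer_rpm = rpm_at_radius(LINEAR_SPEED_M_S, OUTER_RADIUS_MM)
print(f"inner edge: {inner_rpm:.0f} rpm, outer edge: {outer_rpm:.0f} rpm")
```

With these assumed radii the results fall close to the 568 and 228 rpm quoted above: the disc must slow down as the pickup moves outwards because each revolution covers a longer length of track.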
Box 12 A CD player specification (for reference only)

The data below is taken from the instruction book for the Quad 67 CD player.
Frequency response: 20 – 20 000 Hz ± 0.1 dB
Signal-to-noise ratio: 100 dB
Total harmonic distortion: 0.002% at 1 kHz
Wow and flutter: below measurable levels
D/A conversion: 18 bit 64 times oversampled converter
Error correction system: Cross Interleaved Reed Solomon
Optical readout system: Laser semiconductor
Sampling frequency: 44.1 kHz
Audio output: 2 V r.m.s. max. 300 mV on normal programme material
6.3 MiniDisc
Once CD technology had become accepted by consumers the Sony
Corporation decided to attack the compact cassette market. The development
of recordable CD technologies allowed them to design a portable disc-based
system called the MiniDisc (MD). Sony released the MD system at around
the same time that Philips brought out their abortive digital compact cassette
(DCC), which attempted to provide consumers with the digital equivalent of
a standard analogue compact cassette. The two systems were in direct
competition; however, Sony’s aggressive marketing of their product together
with the advantages of a disc system for instant access and the inclusion of
additional text information caused the MD to win the day. MD uses a 6.4 cm
diameter disc (CDs are 12 cm in diameter) with pre-recorded discs using the
same recording technology as CDs. The recordable disc uses a magneto-
optical method as outlined in Box 13.
A controversial feature of the MD is that in order to use a small diameter
disc whilst still giving the same record and playback times as a CD, audio
data compression had to be used. Called ATRAC (Adaptive TRansform
Acoustic Coding), this coding system allows about five times more audio
data to be stored on the disc than would be possible without its use by
only coding sounds the ear can actually hear. The ideas behind audio data
compression will be discussed in a later section. Although the first version
of ATRAC was perceived as not being up to high fidelity standards,
subsequent versions have shown noticeable improvements, and the
current version (in 2004) is of such a quality that it needs highly trained
ears or special test sounds to distinguish any quality difference between
MD and CD recordings. In 1999 Sony produced a long playing version
(termed MDLP) which uses an ATRAC variant called ATRAC3 which,
for a reduced quality, allows double the recording time. A monophonic
single channel mode is also available in both standard and LP variants.
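A rough calculation shows the effect of the ‘about five times’ figure quoted above. The sketch below assumes CD-style audio (two channels of 16-bit samples at 44.1 kHz) and treats the compression ratio as exactly five, which is only an approximation; the names are illustrative.

```python
SAMPLE_RATE_HZ = 44_100
BYTES_PER_SAMPLE = 2        # 16-bit samples
CHANNELS = 2
PLAYING_TIME_MIN = 74
COMPRESSION_RATIO = 5       # "about five times", taken as exact here

def uncompressed_mbytes(minutes):
    """Raw audio data for the given playing time, in millions of bytes."""
    bytes_per_second = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS
    return bytes_per_second * minutes * 60 / 1e6

raw = uncompressed_mbytes(PLAYING_TIME_MIN)
compressed = raw / COMPRESSION_RATIO
print(f"uncompressed: {raw:.0f} Mbytes, after compression: {compressed:.0f} Mbytes")
```

On these assumptions the compressed audio fits within the 200 Mbytes listed for the MiniDisc in Table 6. The raw figure is higher than the 650 Mbytes listed there for CD because that capacity is the one usually quoted for computer data, where more of the disc is given over to error correction.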
The MD system also caters for textual information about the sound data
to be stored. As well as a table of contents, and associated directory
structure, details of the music tracks can also be stored. In addition, MD
players usually contain some simple editing facilities such as moving
and deleting sections of sound, dividing up a track into two or more
separate tracks and combining tracks. A typical specification for a
MiniDisc recorder is given in Box 14 for reference only.
ACTIVITY 23 (SELF ASSESSMENT)
How is it possible for both MiniDiscs and CDs to offer up to 74 minutes
playing time whilst the MiniDisc at 6.4 cm in diameter is physically
smaller than the 12 cm diameter CD?
Box 14 A MiniDisc recorder specification (for reference only)

The data below is taken from the operating instructions for the Sony MZ-R30 Portable MiniDisc Recorder.
Frequency response: 20 – 20 000 Hz ± 0.1 dB
Wow and flutter: below measurable limits
Error correction system: Advanced Cross Interleaved Reed Solomon (ACIRC)
Coding: ATRAC
Optical readout system: Laser semiconductor, wavelength 780 nm
Sampling frequency: 44.1 kHz
Sampling rate converter: 32/44.1/48 kHz
Audio output: 194 mV on normal programme material
Box 13 MiniDisc system

A magneto-optical recordable MiniDisc (MD) utilises both magnetic and
optical technologies to store and playback digital audio. The
recordable disc still has a reflecting layer as with the CD, but
between this reflecting layer and the polycarbonate disc there is an
additional layer of a special magnetic material. This material has the
property that at normal room temperatures it is not affected by a
magnetic field. However, in the presence of a magnetic field, if this
material is heated to beyond a certain temperature (called the Curie
point) and then cooled, it will become permanently magnetised (until
heated to beyond the Curie point again). Since only digital data needs
to be recorded, the zeros and ones can be represented by different
magnetic polarisations (north and south).

Recording
Each bit of the data to be recorded is fed to a small magnetic
recording head on the underside of the disc. This head produces a
‘north’ or a ‘south’ magnetic field depending on the logic level to be
recorded. The laser that is used for playback is then supplied with a
high power pulse that causes a minute area on the disc to be heated up
to beyond the Curie point. As this point cools, the sense of the
magnetic field is stored in this spot as a tiny permanent magnet. Using
this system rather than a traditional magnetic material means not only
are other areas of the disc not affected since they have not been
heated to a sufficiently high temperature (allowing a high density of
data storage), but the disc is not susceptible to erasure by stray
magnetic fields. As the disc rotates and the laser beam travels
outwards, a spiral track of these tiny permanent magnets is produced in
the magnetic layer. For re-recording, there is no need to erase the old
information first, the new data simply overwrites the existing data.

Playback
How is the data read? Playback uses another physical phenomenon called
the Kerr effect. This effect is that the polarisation of light is
changed when it passes through a magnetised medium. In normal light,
the light ‘wave’ is vibrating in all directions perpendicular to the
direction of travel. However, polarised light is only vibrating in one
direction. The laser beam therefore is used on playback since it
provides a source of polarised light (but with reduced power so that it
does not heat up the magnetic layer). As the laser beam travels through
the magnetised layer and is reflected back again by the reflecting
layer, the polarisation of the reflected light is changed – the
direction of the change is dependent on the sense of the magnetism
stored on the disc’s magnetic layer (north or south). The polarisation
of reflected light is detected by a special optical system and a set of
light detectors. Fortunately this detector arrangement can also be used
to read pre-recorded discs that have pits. Figure 27 illustrates the
playback process from an MD.
For both pre-recorded and recordable disks, the device that detects the
reflected laser light from the disc contains a number of small
detectors. These and the associated optical system are arranged such
that not only can the digital data be detected for both types of disc,
but also signals can be derived that are used to keep the laser spot
correctly focused and correctly tracked along the spiral data track.
There is one problem with recording a new, unused disc – how is a
correct spiral track created? Consider the situation when, during
recording the recorder was subjected to a jolt which caused the
recording head to vibrate. How can normal recording be recovered? To
solve this, the required spiral track is physically marked out by a
permanent groove (rather like a record) which additionally has a
22.05 kHz sine wave wobble applied to it. This ‘wobble groove’ is also
frequency modulated with address information. (Frequency modulation was
introduced in Chapter 8 of Block 2 and is the method used to carry
sound signals on a radio wave in FM radio, where the frequency of the
radio wave is varied fractionally by the much lower frequency sound
signal.) Thus all recordable MDs contain permanent addressing and track
information which the recorder can use to find a particular place on
the disc. This feature is called address in pre-groove (ADIP).

Figure 27 Playback from a recordable MD showing the polarisation
rotation of the laser beam caused by the magnetism in the magnetic layer
(Figure labels: polarisation axis, magnetic direction S, magnetic
direction N, data bits 1 1 0 0 0 1 0 1 0 1, disc rotation, recordable
minidisc cross section, objective lens, analyser, laser, optical
pickup 1, optical pickup 2, data decode and audio out conversion)
6.4 Advanced disc-based systems

In the late 1990s, the major trade associations in the music industry
formed the International Steering Committee (ISC) to review and
comment on proposals for any new systems that could replace the
standard audio CD.
The ISC came up with a list of recommendations for any new format
which included:
• provision to prevent illegal copying yet retaining archiving and
master transfers without loss of quality;
• compatibility with conventional CDs;
• audio, video and data storage;
• high quality with multi-channel possibilities;
• must not require a caddy or cartridge;
• durable (more resistant to scratches than CDs);
• single-sided 12 cm diameter disc preferred.
In response to these recommendations, Sony and Philips again joined
forces to develop a new system called the Super Audio Compact Disc
(SACD) with a specification which surpasses that of conventional
CDs. Introduced in late 1999, an important feature of SACD is that it
can be, in its hybrid form, backwards compatible with CDs and can
therefore be used with existing CD players. However, to exploit the
full features such as an extended frequency response and surround
sound a special player is required. Box 15 ‘Super Audio Compact Disc’
gives outline details of the playback system.
Meanwhile the audio working group of the Digital Versatile Disc
Forum, composed mainly of consumer hardware manufacturers and
chaired by the Japanese Victor Company (JVC), developed the
audio-DVD. This format uses the digital versatile disc (DVD),
which is a development of the original CD for video use. By using
the large storage capacity of DVDs, a special high quality audio
recording system has been specified called the audio digital
versatile disc (DVD-A). At the time of writing DVD-A is not
compatible with CD but a double-sided disc, offering both DVD-A and
video DVD formats, is available. Box 16 ‘Audio Digital Versatile Disc’
gives brief details of the DVD-A format.
To give you a feel for how the SACD and DVD-A systems compare
with the original CD system, Table 10 gives, for reference only, a list of
the main specification items for each of the three systems.
Like the original CD, these two new systems are presently available
only as players. A specification of a ‘universal player’ capable of
playing all current audio formats is given in Box 17 for reference only.
Recordable DVDs are becoming available (both write once and write
many) offering the possibility of providing a system for high quality
multi-channel recording.
Box 15 Super Audio Compact Disc

The Super Audio Compact Disc (SACD) system uses the same size of disc
as a standard 12 cm CD, but the pit size and track spacing are both
reduced giving a storage capacity of 4.7 Gbytes. This gives the ability
to store both conventional stereo and surround sound on the same disc
allowing users to choose the most suitable playback mode for their
system.
Unlike conventional CDs, SACD discs can contain a second
semi-reflective layer between the normal reflective data surface and
the plastic disc as illustrated in Figure 28. Either by changing focus
or by using a second laser beam with a different wavelength (depending
on the disc format listed below), data from the intermediate layer can
be read. SACDs are available in three different formats:
• a single layer disc with a 4.7 Gbytes capacity;
• a dual layer disc with a 9.4 Gbytes capacity;
• a hybrid two-layer disc where the bottom reflective layer contains
standard CD data that can be played on an ordinary CD player, and a
semi-transparent layer that contains SACD data.
SACD uses a very different method of analogue-to-digital conversion
and coding from any of the other systems described in this chapter
called direct stream digital (DSD). Rather than sampling the analogue
sound signal and representing the size of each sample as a numerical
value, the difference between consecutive sample values is determined.
In addition, the sample rate is made so high compared with the
frequency of the top of the audible range that the difference between
consecutive samples can never be more than one. I will explain more
about this in Section 7.2.
A final novel aspect of SACD is the copyright protection it affords.
There are two types of protection available – a visible watermark,
which is a special faint image, printed onto the reflecting surface in
such a way that it does not disturb the laser beam, and an invisible
watermark which is data written on the disc itself.
[Figure 28: cross-section of a hybrid disc. The CD layer (0.6 mm) is
entirely reflective. The SACD layer (0.6 mm) reflects the 650 nm
wavelength and is penetrated by 780 nm laser rays. The SACD pickup
(wavelength 650 nm) is focused only on the SACD layer; the CD pickup
(wavelength 780 nm) is focused only on the CD layer.
1 nm = 1 nanometre = 10^–9 m or one thousand millionth of a metre.]
Figure 28 The hybrid SACD with two data layers
Box 16 Audio Digital Versatile Disc

The Audio Digital Versatile Disc (DVD-A) uses conventional PCM digital-to-
analogue conversion as with CDs, but to increase the amount of audio data
that can be stored (either resulting from higher sampling frequencies or more
bits per sample or both), DVD-A uses a lossless compression system called
Meridian Lossless Packing (MLP) that reduces the amount of data by up to 50%
without losing any information. This means that a single layer DVD-A disc is
capable of storing up to nearly 6 hours of stereo sound or up to 111 minutes of
surround sound. In reality, as with SACDs, the use of higher sampling rates and
more quantisation bits means that DVD-A recordings last approximately the
same time as conventional CDs.
The possibility of adding an additional layer increases this capacity much further.
At the time of writing (2004) hybrid DVD-As that play on conventional CD
players are not available, but the audio DVD can be played back on a conventional
DVD video player although not at very high sampling rates.
Table 10 Comparison of the SACD, DVD-A and original CD systems (for reference only)
                                 CD           SACD                  DVD-A
Capacity (Gbytes)                0.65         4.7 (single layer),   4.7
                                              9.4 (dual layer)
Disc size (diameter) (cm)        12           12                    12
Audio channels                   2            up to 6               up to 6
Frequency response (kHz)         0.020 – 20   0 – >100              0 – 96
Sampling frequency (kHz)         44.1         2,822.4*              44.1, 88.2, 48, 96 or 192
                                                                    (192 not in multi-channel mode)
Theoretical dynamic range (dB)   96           120                   144
Recording time (minutes)         74           110 (2 channels,      74 or more
                                              single layer)
Additional features              text         text, graphics,       text, still images
                                              video
* 1-bit sampling rate as explained in Box 15.
Box 17 A universal audio player (for reference only)

The data below is taken from the operating instructions for the Pioneer DV-656A
universal audio player.
Audio formats: SACD, DVD-Audio, CD-DA
Frequency response: 4 – 44 000 Hz (sampling frequency = 96 kHz)
4 – 88 000 Hz (sampling frequency = 192 kHz)
Signal-to-noise ratio: 118 dB
Dynamic range: 108 dB
Total harmonic distortion: 0.0014%
Wow and flutter: below measurable limits
Audio output: 200 mV r.m.s. (1 kHz, –20 dB)
7 DIGITAL DISC TECHNOLOGIES
A number of special technologies have been employed with digital
audio discs to overcome fundamental problems associated with digital
storage on an optical medium.
7.1 Oversampling
Referring throughout this section to Figure 29, constructing an analogue
signal from a digital signal that has been sampled at 44.1 kHz, shown as
fs1, can cause technical difficulties. To ensure none of the unwanted
frequencies which are produced during the reconstruction process reach
the analogue output, complex anti-aliasing or reconstruction low pass
filters are necessary. These unwanted frequencies are generated by
the digital-to-analogue conversion process and occur in the range
(fs1 – fm) to (fs1 + fm) and if not removed can cause harmonic distortion.
To do this a reconstruction filter with an attenuation in the order of
–80 dB at 22.05 kHz (i.e. half the sampling frequency) but still flat
at fm (i.e. 20 kHz) is needed.
Filters with this sharp response are capable of being manufactured but
tend to be very complicated and can themselves actually introduce
harmonic distortion. Oversampling allows the use of much simpler
filters with a much less sharp response by doubling the conversion
frequency of the digital-to-analogue converter to 88.2 kHz, shown as fs2.
This is achieved by interpolating sample level points in between each
of the original (44.1 kHz) sample points. The unwanted frequencies
(fs2 – fm and above) are now well away from the wanted frequencies
and can easily be filtered out of the audio signal.
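The effect of doubling the conversion frequency can be checked with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the straight-line midpoint used to create the extra sample points is the simplest possible interpolation, assumed here for clarity rather than taken from any particular player design.

```python
FM_HZ = 20_000  # top of the wanted audio band, fm


def unwanted_band(fs_hz, fm_hz=FM_HZ):
    """Return the (low, high) edges of the unwanted band around fs."""
    return fs_hz - fm_hz, fs_hz + fm_hz


def filter_room(fs_hz, fm_hz=FM_HZ):
    """Gap between the top of the audio band and the first unwanted band."""
    return unwanted_band(fs_hz, fm_hz)[0] - fm_hz


def oversample_2x(samples):
    """Insert one midpoint between each pair of original samples."""
    out = []
    for a, b in zip(samples, samples[1:]):
        out += [a, (a + b) / 2]
    out.append(samples[-1])
    return out


for fs in (44_100, 88_200):
    low, high = unwanted_band(fs)
    print(f"fs = {fs} Hz: unwanted band {low} to {high} Hz, "
          f"room for the filter {filter_room(fs)} Hz")
```

With fs1 = 44.1 kHz the filter has only 4.1 kHz in which to fall away, whereas with fs2 = 88.2 kHz it has 48.2 kHz, which is why a much gentler filter will do.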
[Figure 29: frequency axis marked 0, fm (20 kHz), fs1 – fm, fs1 (44.1 kHz),
fs1 + fm, fs2 – fm, fs2 (88.2 kHz) and fs2 + fm. It shows the wanted audio
frequency band from 0 to fm, the unwanted frequency bands around fs1 and
fs2, a sharp filter response and a normal filter response.]
Figure 29 Using a 2 times frequency (88.2 kHz) oversampling
digital-to-analogue converter allows the use of a normal low-pass filter
7.2 Single bit conversion

Single bit conversion, mentioned in Box 15, uses a subtly different
technique to perform the analogue-to-digital and digital-to-analogue
processes and yet offers a high degree of precision, usually only
available from more costly conventional converters. Rather than each
sample consisting of 16 bits representing the signal level of the
analogue sound signal, the sample consists of just 1 bit which
represents the difference between consecutive sample values. To
achieve this the sample rate is made so high compared with the
frequency of the top of the audible range that the difference between
consecutive samples can never be more than one bit (i.e. the change in
the analogue signal level between consecutive samples is never more
than a single quantisation level). Hence the system is known as single
bit sampling. In practice sampling rates of 64 or 128 times the
necessary minimum rate of 44.1 kHz are used giving sampling
frequencies of 2.8224 or 5.6448 MHz respectively. Careful circuit
design is necessary to ensure no radio frequency (RF) interference is
caused to the audio circuits from use of these very high sampling rates.
As with the oversampling method simple low-pass filters can be used
when recovering the analogue audio signal.
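The idea that one bit per sample is enough when the sampling rate is very high can be tried out with a few lines of code. The sketch below implements plain delta modulation, which matches the description above (each bit says whether the signal has gone up or down by one step). It is a simplification: commercial single bit converters use more elaborate modulators, and the step size and function names here are assumptions chosen for illustration.

```python
import math


def delta_encode(samples, step):
    """Emit 1 when the input is at or above the running estimate, else 0."""
    bits, estimate = [], 0.0
    for x in samples:
        bit = 1 if x >= estimate else 0
        estimate += step if bit else -step  # estimate moves one step per sample
        bits.append(bit)
    return bits


def delta_decode(bits, step):
    """Rebuild a staircase by adding or subtracting one step per bit."""
    out, estimate = [], 0.0
    for bit in bits:
        estimate += step if bit else -step
        out.append(estimate)
    return out


FS = 64 * 44_100   # 2 822 400 samples per second, as in the text
STEP = 0.005       # hypothetical size of one quantisation level

# One cycle of a 1 kHz tone with a peak level of 1.0
tone = [math.sin(2 * math.pi * 1_000 * n / FS) for n in range(2_822)]
bits = delta_encode(tone, STEP)
staircase = delta_decode(bits, STEP)
worst = max(abs(x - y) for x, y in zip(tone, staircase))
print(f"{len(bits)} one-bit samples, worst tracking error {worst:.4f}")
```

Because the tone changes by less than one step between consecutive samples at this rate, the staircase never falls more than one step behind the signal. A simple low-pass filter would then smooth the staircase back into the analogue waveform.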
7.3 Correcting media faults

One of the greatest advantages of digital audio technology is the ability
to correct errors caused by faults before they become audible. Whilst
we can tolerate faults in pictures (even to the extent of our brains
replacing missing information to complete the image), listening to a
performance accompanied by a regular chorus of clicks and crackles
can become very tedious. Methods of detecting and correcting errors in
digital data are well established both in computer and
telecommunications systems where even a single bit in error could be
disastrous, and where the streams of digital data are constantly
monitored and errors corrected.
Errors in digital media are caused by imperfections in the medium,
damage through mishandling, mechanical instability and electrical
noise and interference. These are classified as follows:
• Dropouts, which are errors caused by random faults in the medium
plus fingerprints, scratches and dust on its surface;
• Jitter, which causes random bit errors due to timing imperfections
in the electronic circuits and the transport mechanism;
• Interference between adjacent bits;
• Electronic noise or electrical interference, which causes effects
similar to dropouts.
These faults may be classified either as random errors or burst errors.
The former refers to a single bit in error whereas the latter affects
groups of bits.
Complete Table 11 classifying the different types of errors possible

from digital media as random errors, burst errors or both.
Table 11 Classification of digital media error types
Error Random Burst

1 Dropouts
2 Jitter
3 Interference
4 Electronic noise
Comment
1 Dropouts are random faults of variable size and so cause both
random errors and burst errors.
2 Jitter causes random single bit errors rather than burst errors.
3 Interference between adjacent bits causes random errors rather
than burst errors.
4 Electronic noise affects groups of bits causing burst errors.
When an error occurs the error detection and correction system
attempts to trap it and then fix it, as shown in Figure 30. Should the
error fail to be detected then a disruption to the audio signal will
occur and audible distortion will be (or may be) heard.
[Figure 30: flowchart. A code error occurs. Is the error detected? If
no, the result is an audible disturbance. If yes, is the error
corrected? If yes, the result is a corrected signal. If no, error
concealment is applied and the result is an inaudible disturbance.]
Figure 30 The outcomes from the occurrence of an error
7.3.1 Error detection

Error detection is the process that attempts to identify if an error has
occurred. If this is successful an appropriate strategy can be used
either to correct the error, in which case the digital code will be
repaired and there will be absolutely no effect on the sound, or to
conceal it making the fault cause as little audible disturbance as
possible. Two coding techniques for error detection are used, parity
and cyclic redundancy checking. Both methods require extra bits to be
added to the digital audio data and this is a necessary and fundamental
feature of any digital system employing error correction.
Parity
The simplest and therefore the most common method of error
detection is parity checking. By adding an additional bit called a
parity bit to a code word it is possible to determine if a single bit
random error has occurred. The value of the parity bit (i.e. a zero or a
one) is determined by a rule (known as a protocol) that is applied to
the binary code. Two types of parity exist: even parity, in which the
total number of one-bits in the code word (including the parity bit) is
even, and odd parity, in which the total number of one-bits is odd.
Even parity examples: Odd parity examples:
Data = 1001001 Parity bit = 1 Data = 1001001 Parity bit = 0
Data = 1100110 Parity bit = 0 Data = 1100110 Parity bit = 1
Unfortunately with parity checking there is no way of finding which
bit in the code word is in error. Also, if an even number of bits in the
same code word is affected (happily a surprisingly rare occurrence)
then the error will go undetected. Nevertheless parity checking is still
widely used in the computer industry where its simplicity allied to
the knowledge that an error has occurred is sufficient to employ a
correction strategy.
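The parity rule is easy to express in code. The sketch below (function names are illustrative, not from any standard library) generates the parity bit for the two example data words given above and shows why a second bit in error hides the fault.

```python
def parity_bit(data, even=True):
    """Return the parity bit (0 or 1) for a string of binary digits."""
    bit = data.count("1") % 2          # 1 if the data has an odd number of ones
    return bit if even else 1 - bit


def passes_even_parity(code_word):
    """True if the code word, parity bit included, has an even number of ones."""
    return code_word.count("1") % 2 == 0


for data in ("1001001", "1100110"):
    print(data, "even:", parity_bit(data), "odd:", parity_bit(data, even=False))

sent = "1001001" + str(parity_bit("1001001"))   # 10010011
one_error = "00010011"                            # first bit inverted
two_errors = "01010011"                           # first two bits inverted
print(passes_even_parity(sent), passes_even_parity(one_error),
      passes_even_parity(two_errors))
```

The single bit error fails the check, but the word with two bits in error passes it, because inverting two bits leaves the count of ones either unchanged or altered by two.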
(a) Which of the following data words have errors? You may assume
that even parity has been used and that only single bit errors have
occurred.
1 1000 0111
2 1100 0010
3 1010 1010
4 1111 0001
5 0000 0000
6 1100 1101
7 1101 1111
8 0001 0000
9 1111 1111
(b) Why will an even number of bits in error go undetected if simple

parity checking is used?
Cyclic redundancy check code

Digital audio disc systems have adopted the cyclic redundancy check
code (CRC) technique to detect errors. Here a complex mathematical
calculation (the method is beyond the scope of this course) is carried
out on a group of digital data, called a data block. The result of this
calculation is included at the end of the data block as an extra code
word when the data is written to the disc. After the data is read back
the same mathematical operation is carried out on the data block
(ignoring the CRC code word) and the new result compared with the
original CRC code word. If the two values differ then an error has
occurred. As with parity checking the position of the fault in the data
block cannot be identified but detection probabilities in the order of
99.9985% are achieved and if an even number of bits are in error the
fault is still detected as the CRC code word will still be different.
Electronic circuits designed to make fast and reliable CRC code
calculations are used.
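Although the mathematics is beyond the scope of the course, the write-then-check procedure itself can be sketched briefly. The example below uses the 16-bit CRC provided by Python's standard `binascii.crc_hqx` function purely as a stand-in; which CRC a particular disc format uses is not stated here, and the helper names and sample data are hypothetical.

```python
import binascii


def append_crc(block: bytes) -> bytes:
    """Calculate the CRC of a data block and add it as an extra code word."""
    crc = binascii.crc_hqx(block, 0)
    return block + crc.to_bytes(2, "big")


def crc_ok(stored: bytes) -> bool:
    """Recalculate the CRC on read-back and compare it with the stored one."""
    block, crc_word = stored[:-2], stored[-2:]
    return binascii.crc_hqx(block, 0) == int.from_bytes(crc_word, "big")


stored = append_crc(bytes([0x12, 0x34, 0x56, 0x78]))   # made-up data block

damaged = bytearray(stored)
damaged[0] ^= 0b00000011        # invert two adjacent bits (an even number)

print(crc_ok(stored), crc_ok(bytes(damaged)))
```

The undamaged block passes, while the block with two bits inverted is rejected, an error that a single parity bit would have missed. As the text says, the check reveals that something is wrong but not where.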
Once an error has been identified a strategy must be used to correct or
conceal it before the audio signal is heard otherwise distortion occurs.
7.3.2 Error correction

Once an error is found one of two approaches to correcting it may be
used. In the first a request is made to send the data again, a technique
known as backward error correction. This method assumes not only
that the error is non-repeatable or transitory but also that the data is
able to be re-sent in the first place and within a time such that there is
no break in the resulting audio. In the second method an additional
error correction code is added to the original data to allow errors to be
corrected immediately (the so-called ‘on-the-fly’ technique). This is
known as forward error correction and is used by digital audio discs
for two reasons. Firstly, a scratch is a hard error which cannot be
corrected by requesting the data again as the fault will always be re-sent.
Secondly, the technology available when the Red Book specification
was written could not cope with the time delay needed to re-send the
data. However a time delay has been fitted to portable digital disc
players to overcome the transitory errors generated when listening on
the move. These so-called ‘jog-proof’ systems use backward error
correction techniques which rely on buffering the audio data in a
random access memory whilst error correction strategies put things
right, assuring a continuous error-free stream of sound data. Memory
buffering still cannot cope with hard errors though as is demonstrated
by the next activity.
ACTIVITY 26 (EXPLORATORY, PRACTICAL)
Describe what you think would happen if the backward error correction
method encounters a permanent error in the data layer of a CD.
If you don’t think media errors exist try holding a CD up to a light
source and looking through it. (You’ll have to choose one with a see-
through label.) You will need to use a bright light but DO NOT, under
any circumstances, look directly at the light source. You may be able to
see a number of tiny pin holes in the reflective surface any of which
might cause a permanent error.
Comment
When the backward error correction method encounters an error a
request is made to send the same data again. If the error is transitory
then it will not be there the next time it is sent and so all will be well.
However, sending the same data which includes a permanent error
simply results in the error occurring again. The error correction
mechanism will be in a continuous loop. This makes this method
unsuitable for correcting errors in CDs.
To correct permanent errors forward error correction must be used.

Extra data is added to the original digital audio data in order that the
error can be corrected. The next activity demonstrates the problem of
adding extra data to the audio data.
A CD digital audio signal consists of two sound channels sampled at

44.1 kHz and quantised using 16 bits of data per sample.
(a) What will be the overall effect on the data rate of adding error
correction data to the digital audio signal data?
(b) Assuming that the error correction data adds an overhead of one
third to the digital audio data calculate the required data rate.
Comment
(a) The data rate will be increased as more data bits are effectively
used to store the digital audio signal.
(b) The data rate prior to adding correction data is 44 100 × 16
= 705 600 bit/s per channel or 1.4112 Mbit/s in total for the two
channels. Assuming the additional data adds an extra third then the
new data rate will be 1.4112 + (1.4112/3) = 1.8816 Mbit/s
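The arithmetic for a two-channel, 44.1 kHz, 16-bit signal carrying a one-third overhead can be checked with a short calculation. This is only a sketch and the function name is illustrative.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Bit rate of plain PCM audio in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


per_channel = pcm_bit_rate(44_100, 16, 1)
audio_only = pcm_bit_rate(44_100, 16, 2)
with_correction = audio_only + audio_only // 3   # one third extra

print(per_channel, audio_only, with_correction)
```

Working in whole bits per second avoids rounding: the three values are 705 600 bit/s per channel, 1 411 200 bit/s for both channels, and 1 881 600 bit/s once the error correction data has been added.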
Once an error is found it is a simple operation to correct it – the value
of the bit is simply inverted (the bit is changed from a one to a zero or
vice versa). The trick is to find the position of the bit in error. Many
strategies exist to find errors, but they are extremely complex
especially if they are to deal with multi-bit burst errors. However one
method, combinational parity as described in Box 18, is relatively
straightforward and serves to demonstrate how additional data can be
used to identify a single bit error in a block of data.
The following (20,12) data stream, using an even parity combinational

parity checking code, has been received:
10100 10011 00101 00110
Using the example in Table 13 as a guide check whether any single bit
errors have occurred and if found indicate which bit should be
inverted. (Note: No multiple errors exist.)
Many error correcting codes, similar in operation to the combinational

parity code, are used. They offer the ability to correct multiple bits in
error (burst errors) by using complex mathematical techniques.
Box 18 Combinational parity

Combinational parity code is used both to detect and to correct any
single bit error in a block of data. To see how this works consider
applying even parity bits to each code word in a block of 16 bits,
arranged in a 4 by 4 matrix, to both the horizontal rows and the
vertical columns. Table 12 shows the result with the parity bits in
blue.
This version of a combinational parity code is known as a (25,16) code
because a total of 25 bits are contained in the complete data block
but only 16 are used for the audio data. The remaining 9 are used for
detecting (and correcting) any single bit in error in the data block.
Table 12 Combinational parity checking using a (25,16) code with the
even parity bits shaded
Code    Parity
1001    0
1110    1
0001    1
1111    0
1001    0
Any single bit in error will be detected at the intersection of the
horizontal and the vertical parity checks.
As an example, consider the (25,16) bit stream: 01110 01100 11101 00000
11011 shown in Table 13 as a 5 by 5 matrix. The following four steps
are used to check the data and find any single bit errors:
1 Check the parity bit in row 5 and column 5. This parity bit shown in
Table 13 is correct for even parity so the parity bits themselves are
correct.
2 Check the parity in each row in turn. The data in row 1 shows an odd
number of ones, but the parity bit is zero indicating an even parity
error. All other rows are correct.
3 Check the parity in each column in turn. The data in columns 1, 2
and 4 all have an odd number of ones but are made even by the parity
bits. Column 3 also has an odd number of ones but the parity bit is
zero indicating an error in this column.
4 The intersection of row 1 and column 3 points to the bit that is in
error and this bit should be inverted from a one to a zero making the
parity correct.
Table 13 The bit in error is highlighted
Code    Parity
0111    0
0110    0
1110    1
0000    0
1101    1
The corrected bit stream with the bit that was in error in blue is:
01010 01100 11101 00000 11011.
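The four steps in Box 18 can be followed mechanically. The sketch below is one possible way of writing them down; the function name and the choice to leave the data untouched when the pattern is not a single bit error are assumptions made for illustration.

```python
def correct_single_bit(stream, width=5):
    """Find and invert a single bit in error using even row and column parity."""
    bits = [int(b) for b in stream.replace(" ", "")]
    rows = [bits[i:i + width] for i in range(0, len(bits), width)]

    # A row or column fails the even parity check if it holds an odd
    # number of ones (parity bits included).
    bad_rows = [r for r, row in enumerate(rows) if sum(row) % 2]
    bad_cols = [c for c in range(width) if sum(row[c] for row in rows) % 2]

    # Exactly one failing row and one failing column pinpoint the bit.
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        rows[bad_rows[0]][bad_cols[0]] ^= 1

    return " ".join("".join(str(b) for b in row) for row in rows)


received = "01110 01100 11101 00000 11011"   # the bit stream of Table 13
print(correct_single_bit(received))
```

For the Table 13 bit stream only row 1 and column 3 fail, so the bit at their intersection is inverted and the result is 01010 01100 11101 00000 11011. A block with no errors, such as the one in Table 12, passes through unchanged.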
7.3.3 Data interleaving

If code words are written consecutively in the blocks of data then a
burst error could destroy a whole block of digital audio making it
almost impossible to avoid sound disturbance. To alleviate this
problem consecutive code words are not written next to each other but
are spaced apart, with data words from other blocks of data in
between, at such a distance that a typical burst error would destroy
only one code word from each digital audio block rather than a whole
block, as shown in Figure 31. This method, called interleaving,
ensures that burst errors which would be too big to correct become
random errors that are capable of being corrected, and is common to
all digital audio media.
To cope with both random and burst errors Philips and Sony jointly
developed the cross interleave Reed-Solomon error correction code
(CIRC). CIRC is a combination of a set of very powerful error
correction methods which involves both ‘scrambling’ (i.e. interleaving)
the 16-bit code words sampled by the digital converter and also
applying two layers of complex combinational parity.
[Figure 31:
(a) W1 W4 W7 W10 (1st interleave block), W2 W5 W8 W11 (2nd interleave
block), W3 W6 W9 W12 (3rd interleave block), with one burst error
marked;
(b) W1 W2 W3 W4 W5 W6 W7 W8 W9 W10 W11 W12, with four separate errors
marked.]
Figure 31 Example of data interleaving: (a) the interleaved block of sound data is
subjected to a burst error that affects four consecutive code words; (b) these four
words become random errors when the code words are placed in their correct order
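The word order of Figure 31 can be reproduced with a few lines of code to show how a burst is broken up. This sketch covers the interleaving step only, not CIRC itself; the function names are illustrative and the position of the burst is an assumption made for the example.

```python
def interleave(words, depth):
    """Write every depth-th word together, as in Figure 31(a)."""
    return [w for start in range(depth) for w in words[start::depth]]


def deinterleave(stored, depth):
    """Put the words back into their original order, as in Figure 31(b)."""
    out = [None] * len(stored)
    i = 0
    for start in range(depth):
        for pos in range(start, len(stored), depth):
            out[pos] = stored[i]
            i += 1
    return out


words = [f"W{n}" for n in range(1, 13)]
stored = interleave(words, 3)
print(stored)

# Assume a burst error wipes out four consecutive stored words
# (here the whole of the 2nd interleave block).
stored[4:8] = ["error"] * 4

recovered = deinterleave(stored, 3)
hit = [i + 1 for i, w in enumerate(recovered) if w == "error"]
print(hit)
```

After the words are put back in order the damaged ones are W2, W5, W8 and W11. No two of them are neighbours, so each can be treated as an isolated error.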
7.3.4 Error concealment

What happens when the error is recognised but is too large or not able to
be corrected or both? If nothing is done then this could spoil the sound, as
is the case when an error causes the optical tracking mechanism on a
CD to fail, producing clicks in between snatches of sound. An error
concealment, or hiding, strategy can be used to minimise the effect on the
sound. Figure 32 shows an audio signal where an error has been detected
for sample g. There are three possible ways of concealing this error.
Figure 32(a) illustrates muting where the value of the digital audio
code in error is set to zero effectively inserting silence. As silence
is audible, if that is not a contradiction in terms, this is a rather
crude method and is seldom used.
Figure 32(b) illustrates previous data word holding, whereby the value
of the previous data word is held over to replace the data word with
the error. In effect the sound is repeated which can give poor results,
especially with high frequency sounds where there are fewer samples
per cycle.
Figure 32(c) illustrates linear interpolation. Here the average values
of the samples surrounding the data word in error are used to replace
it. This has proved the most successful concealment method.
[Figure 32: three plots of the sample sequence a to l in which sample g
is in error: (a) g = 0; (b) g = f; (c) g = (f + h)/2.]
Figure 32 Examples of error concealment; (a) muting, (b) previous
data word holding, (c) linear interpolation
Linear interpolation may be further improved by using many previous

and future samples to provide greater accuracy, but this is more
complicated to implement.
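The three concealment methods of Figure 32 differ only in the value substituted for the faulty sample. A minimal sketch, assuming the faulty sample has a good neighbour on each side and using made-up sample values and an illustrative function name:

```python
def conceal(samples, i, method):
    """Return a replacement value for the faulty sample at index i."""
    if method == "mute":
        return 0                                       # g = 0
    if method == "hold":
        return samples[i - 1]                          # g = f
    if method == "interpolate":
        return (samples[i - 1] + samples[i + 1]) / 2   # g = (f + h)/2
    raise ValueError(f"unknown method: {method}")


# Hypothetical sample values for e, f, g, h, i; g (index 2) is in error.
signal = [9, 8, None, 4, 1]
for method in ("mute", "hold", "interpolate"):
    print(method, conceal(signal, 2, method))
```

With f = 8 and h = 4, muting gives 0, holding gives 8 and interpolation gives 6, the last being the closest to where a smoothly falling signal would be expected to lie.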
Describe the circumstances which would cause error concealment to be

used and how the most commonly used method of error concealment
works.
7.4 Copy protection

Music companies have long been concerned about consumers making
multiple copies of their recordings. In the days of vinyl records, this
was not really a problem since it was impossible for the consumer to
copy a record other than by using an expensive magnetic tape recorder.
When audio cassettes came along, the problem of making multiple
copies became of greater concern, but because the quality of a copied
audio cassette diminishes rapidly as further generations of copy are
made, the record companies were content to put up with the situation.
However, with the advent of digital audio the music industry decided that
there had to be some controls to prevent people making illegal multiple
copies of their recordings. This is because unlike copying analogue media,
where the quality of the recording is reduced, copying using a digital
transfer method does not (or should not) produce any degradation in the
original recording at all. Thus it is possible to make copy after copy and
still end up with a recording that sounds as good as the original.
The serial copy management system (SCMS) was developed to prevent
copies of audio recordings being used to make further copies. In some
countries SCMS has to be incorporated by law in any device that can
record, send or receive digital audio. In essence, a recording is marked as
either copyrighted or not, and if it is copyrighted, whether it is an original
recording or a copy. SCMS allows one copy of an original copyrighted
recording to be made so that consumers can make up compilations of
favourite tracks and/or make a backup of the original. When a recording
device receives a digital signal it will only allow a recording to be made if
the material is not copyrighted or is from the original version of a
copyrighted recording. If the latter is the case, the recording is marked as
being both copyrighted and copy protected to prevent any further copies
being made. Most consumer digital recording formats now incorporate
SCMS data to indicate the copyright/copy status of the recording. However
the system is not without problems, particularly where consumers wish to
make copies of their own original non-copyright recordings.
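The decision a recording device makes under SCMS, as described above, can be modelled in a few lines. This is a simplified sketch of the rule only: the class, field and function names are hypothetical and do not represent the actual status data carried in any digital audio format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recording:
    copyrighted: bool
    is_copy: bool = False   # only meaningful for copyrighted material


def scms_record(source):
    """Return the new recording, or None if the recorder must refuse."""
    if not source.copyrighted:
        return Recording(copyrighted=False)        # copying is unrestricted
    if source.is_copy:
        return None                                # no copies of copies
    return Recording(copyrighted=True, is_copy=True)


original = Recording(copyrighted=True)
first_copy = scms_record(original)
second_copy = scms_record(first_copy)
print(first_copy, second_copy)
```

A copy can be made from the original, and it is marked as a copy, but an attempt to record from that copy is refused.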
Copying copyrighted material is now a hot topic in the music industry,
particularly now that it is so easy to send sound data over the Internet,
and this subject will be explored further in the last chapter of this block.

This section has studied the technologies developed for use with digital
discs. Oversampling avoids problems associated with filtering the
output from the digital-to-analogue converter. Using a sampling
frequency of 44.1 kHz means a very complex filter is required to
remove unwanted frequencies above half the sampling rate (22.05 kHz)
whilst maintaining an audio bandwidth right up to 20 kHz. By
increasing the sampling rate by a factor of two or more a less complex
filter may be used. Single bit conversion uses very high sampling rates,
typically 128 times the original, ensuring there is no more than a one
quantisation level change between samples. Again circuit
simplification is the goal as simple filtering is all that is needed to
recover the analogue audio.
Storing the audio as digital data with associated error detection and
correction data means that faults due to mishandling and media
imperfections can be tolerated, overcoming many of the problems
experienced with analogue technologies.
The music industry has always been concerned over loss of revenue
from copying but took few steps to prevent it as analogue tape copies
were always worse than the original. Copying record discs has never
been possible and it was only the advent of home tape recording that
caused concern. However with digital technology multiple bit-for-bit
copies can be produced which is causing great concern. SCMS was
developed to prevent wholesale copying whilst still allowing the
owner of the medium to make a single ‘back-up’ copy for personal use.
Describe a strategy used by a CD player to detect and correct a random

error. What effect would the error have on the sound?
8 DIGITAL AUDIO TRANSMISSION
8.1 Introduction
Digital audio transmission involves sending a signal carrying digital
audio data from one location to another. The distance between the two
locations may vary from a few centimetres to many thousands of
kilometres and the various ways the data can flow are described in Box 19.
Box 19 The directions of data

Data may flow both ways at the same time along a transmission link. An example
would be a telephone call where you can still hear the other person even when you
are talking, in which case the transmission is said to be full-duplex.
Alternatively the data may flow in just one direction, as for example with a
radio or TV broadcast, in which case the transmission link is described as
simplex. Finally an intermediate case exists where data flows both ways, but
not at the same time, which is called half-duplex. An example of this is 2-way
personal radios where, while one person speaks the other listens, and vice versa.
All signals are subjected to attenuation, which is a reduction in signal

strength due to physical effects of the transmission medium, and
interference due to disturbance from electrical and atmospheric
sources. The link medium for transmission may be electrical cable
(copper wire), optical fibre or radio frequency waves (wireless).
Careful design of signal cables and placing of radio transmitters can
minimise electrical interference, but attenuation can only be overcome
by systematic amplification over the length of the link to recover the
original signal level. Unfortunately each stage of amplification
boosts the accumulated noise along with the signal, which leads to a
lowered signal-to-noise ratio that is particularly noticeable on
analogue signals. Digital transmission overcomes these problems as
digital regeneration (the equivalent of amplifying the analogue
signal) has no effect on the signal-to-noise ratio and errors due to
interference may be corrected by the receiver. This is of particular
value to the broadcast industry, which is able to transmit
comparatively noise-free signals over long distances on radio bands
which are subject to high levels of interference.
Optical fibre, a medium that allows light to travel with comparatively
little attenuation or interference, has become standard for professional
transmission. Also increasing use is being made of digital wireless
network services such as Bluetooth and WiFi.
List as many types of analogue and digital signal transmission

methods as you can. Try to suggest the medium that the transmission
uses. How do you know whether the transmission is analogue or
digital? Can you make a comment about the methods of transmission?
Comment
Analogue transmissions include:
• Fixed telephone line (although it is converted to a digital signal at
the telephone exchange) which uses copper wires.
• AM and FM radio broadcasts using radio waves.
• Analogue TV broadcasts also using radio waves.
Digital transmissions include:
• Mobile telephone using radio waves.
• Computer networks using copper wire.
• Digital audio broadcasting (DAB) using radio waves.
• Terrestrial digital television (DVB) again using radio waves.
• Satellite television also using radio waves.
• Cable digital television using copper coaxial cable.
I suppose I know which is analogue and which is digital either by how
long I have been using the product or because I was sold the product
because it was digital or because I work in the field.
8.2 Channel bandwidth

The path over which a signal is transmitted is usually referred to as a
channel. The channel bandwidth limits the bit rate of the digital signal
that the channel can accommodate which in turn limits the fidelity of
the analogue signal.
Calculate the bit rate necessary for a stereo radio broadcast using PCM
with 16 bits for each sample per channel. Use a sampling rate of 32 kHz.
Table 14 shows that the digital bit rates necessary to transmit high quality
audio and video signals are very much higher than the equivalent
analogue bandwidths. To take advantage of the benefits offered by digital
transmission it is necessary to reduce the data rates whilst maintaining
the sound quality.
Table 14 Comparison of audio bandwidths and data rates
Channel medium            Audio bandwidth    Data rate
Telephone line            3.4 kHz            64 kbit/s
Stereo radio broadcast    15 kHz             1 Mbit/s
Television broadcast      8 MHz              160 Mbit/s
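The audio data rates in Table 14 are consistent with straightforward PCM arithmetic, as the sketch below suggests. The function name is illustrative. The stereo radio figures (32 kHz sampling, 16 bits) come from the activity above; the telephone figures of 8 kHz sampling with 8 bits per sample are not given in this section and are assumed here from standard digital telephony practice.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Bit rate of uncompressed PCM in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


stereo_radio = pcm_bit_rate(32_000, 16, 2)
telephone = pcm_bit_rate(8_000, 8, 1)      # assumed telephony parameters

print(f"Stereo radio: {stereo_radio / 1e6:.3f} Mbit/s")
print(f"Telephone:    {telephone / 1e3:.0f} kbit/s")
```

The stereo radio result of 1.024 Mbit/s rounds to the 1 Mbit/s shown in the table, roughly 68 times the 15 kHz analogue bandwidth it replaces.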
Calculate the maximum bandwidth of an audio signal that can be sent

digitally over a telephone line that has a maximum available bit rate of
56 kbits/s. Assume 16 bits are used for each sample as in the previous
activity.
8.3 Digital audio compression

8.3.1 Introduction
The traditional idea of compression in audio terms is that of reducing
the dynamic range of the sound. Broadcasters often use compression to
make music more audible in noisy environments, e.g. when listening
to music whilst travelling in a car. Audio compression makes the
quieter sounds louder without increasing the overall level of the sound.
Digital audio compression is very different and is concerned with
reducing the amount of digital data rather than altering the dynamic
range of the audio. It uses two phenomena, one developed by the
computer industry involving recording the data, the other due to the
way our brain recognises sounds. Data compression is used by the
computer industry where there is a practical limit on the capacity of a
storage device or the channel bandwidth carrying the data. It works by
reducing the number of bits needed to carry the data. The reasons why
data compression is popular for digital audio include:
• an extended playing time on a given medium;
• smaller consumer units for portable use;
• faster data transfers for a given bandwidth.
Figure 33 shows the audio data rate being reduced by the compressor
or coder (in this case by a factor of 10), then passed over a channel
with a suitable bandwidth and restored by an expander or decoder*.
[Block diagram: digital audio in (n bits/s), coder, transmission channel (n/10 bits/s), decoder, digital audio out (n bits/s)]
Figure 33 A data compression system
*Do not confuse these devices with digital recording coders and decoders
mentioned in Section 4.2.1. Indeed compression codecs could well be contained
within digital audio codecs.
Note that because of the compression factor of (in this case) 10, the
channel bandwidth can also be reduced by a factor of 10 without
affecting the digital data. Alternatively, ten times the amount of digital
data can be sent through the channel if its bandwidth is not reduced.
8.3.2 Lossy or lossless?

There are two fundamentally different ways of compressing digital
audio data – lossless and lossy.
A lossless audio codec compresses all the information that arrives at
its input in a digital form, carries it over the link and expands it back
at its output in a bit-for-bit format. Nothing is taken away or lost – as
the name implies.
In contrast, with a lossy audio codec some information in the digital
audio signal is removed before compressing the data in the same way as
the lossless codec. This information cannot be recovered, for once it is
removed it is lost forever. The main reason for doing this is that if
some information is removed before the audio data is compressed there
is less data to carry across the link. There is a great deal of controversy
as to whether a lossy audio codec produces any effects on the sound
quality. It really depends on what is taken away and how the final
sound is reproduced whether lossy compression is acceptable or not.
So any form of processing that reduces the amount of data is known
as data compression, but remember that reducing the amount of
data does not necessarily mean that it is a lossy system. There are a
number of compression techniques that reduce the quantity of data
without the loss of any information. An example in the computer field
is the ZIP file which contains a compressed form of one or more
computer files – but when decompressed, the exact original files are
recreated. The term lossless compression is often used to describe
such systems. However, lossy codecs will always be able to produce
more compact digital audio data than a lossless codec.
Can you think of one example where it is essential to keep all the
information and an example where some data may be lost without
losing the integrity of the information?
Comment
An example of lossless compression would be a computer program.
Every single bit in a computer program is essential for its operation.
The loss of a single bit of data from the program could cause the
program to fail.
An example of lossy compression could be the storage of photographs.
Much of the information may be removed from a picture and yet the
subject remains recognisable.
8.3.3 Lossless compression

When characteristics about the information to be compressed are
known, a prediction as to its content may be made and commonly
occurring patterns may be recoded to shorter values, with longer
values applied to infrequent patterns. An example of this type of
predictive or entropy coding is Morse Code. Common letters in the
English language, such as ‘e’, ‘t’ and ‘a’ are allocated codes of dot, dash
and dot-dash respectively, whereas less commonly used letters such as
‘q’ (dash-dash-dot-dash) and ‘z’ (dash-dash-dot-dot) are given longer, more
complex codes. Huffman code, which works in a similar manner on
data, is commonly used to compress computer data and is the coding
scheme used in fax machines. An extension to the Huffman Code is
the Lempel-Ziv-Welch (LZW) code whereby unique code length
conversion tables are constructed from the information and transmitted
along with the compressed code. This gives greater compression by
tailoring the conversion codes to the information much in the same
way as Morse code was originally customised to the English language.
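The idea of giving short codes to common symbols and longer codes to rare ones can be sketched in a few lines of Python. The example below builds a Huffman code for a short message using the standard textbook construction (repeatedly merging the two least frequent groups); the message and function names are illustrative only and are not taken from any particular codec.

```python
import heapq
from collections import Counter


def huffman_codes(text):
    """Return a dict mapping each symbol in text to a binary code string."""
    counts = Counter(text)
    if len(counts) == 1:  # degenerate case: only one distinct symbol
        return {symbol: "0" for symbol in counts}
    # Heap entries: (frequency, unique tie-breaker, {symbol: code so far})
    heap = [(freq, i, {sym: ""}) for i, (sym, freq) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        f1, _, group1 = heapq.heappop(heap)  # least frequent group
        f2, _, group2 = heapq.heappop(heap)  # next least frequent group
        merged = {s: "0" + c for s, c in group1.items()}
        merged.update({s: "1" + c for s, c in group2.items()})
        heapq.heappush(heap, (f1 + f2, next_id, merged))
        next_id += 1
    return heap[0][2]


def huffman_decode(bits, codes):
    """Decode a bit string; works because no code is a prefix of another."""
    reverse = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in reverse:
            out.append(reverse[current])
            current = ""
    return "".join(out)


message = "the sea sees these trees"
codes = huffman_codes(message)
encoded = "".join(codes[ch] for ch in message)
print(len(encoded), "bits instead of", 8 * len(message))
```

In this message ‘e’ is the most common letter and so receives one of the shortest codes, just as it does in Morse Code.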
Another technique for compressing binary data, called run-length coding,
is described in Box 20. In this compression method ‘runs’ of similar
characters, i.e. groups of zeros or ones, are recognised and replaced
with a number indicating how many times the same bits are repeated.
Box 20 Run-length coding

An example of the use of run length coding might be found in the 99-bit
binary code 00011….11, where the dots stand for a continuous run of 92 ones
(making 96 ones in total). Using run length coding, this sequence can be
represented as (3)0(96)1. This indicates that it consists of 3 zeros followed by
96 ones. In binary this becomes 110 110 0000 1, which is much shorter than
the original 99 bits. For ease of decoding, the numbers representing 3 (11 in
binary) and 96 (110 0000 in binary) would usually be represented by a standard
number of bits, usually 8. The compressed binary code then becomes 0000
0011 0 0110 0000 1, a total of 18 bits instead of the original 99. This gives a
compression ratio (i.e. the number of bits in the original binary code divided
by the number in the compressed binary code) of 99/18 which equals 5.5.
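The scheme in Box 20 can be tried out with a short Python sketch. It assumes the fixed-width form described in the box (an 8-bit count followed by the repeated bit); splitting runs longer than 255 into several counts is an extra assumption made here so that the functions work for any input.

```python
from itertools import groupby


def rle_encode(bits, count_width=8):
    """Encode a string of '0'/'1' characters as (count, bit) pairs."""
    max_run = 2**count_width - 1
    out = []
    for value, group in groupby(bits):
        run = len(list(group))
        while run > 0:  # split runs too long for one count field
            chunk = min(run, max_run)
            out.append(format(chunk, f"0{count_width}b") + value)
            run -= chunk
    return "".join(out)


def rle_decode(code, count_width=8):
    """Reverse rle_encode: read a count, then the bit to repeat."""
    out = []
    step = count_width + 1
    for i in range(0, len(code), step):
        count = int(code[i:i + count_width], 2)
        out.append(code[i + count_width] * count)
    return "".join(out)


original = "0" * 3 + "1" * 96          # the 99-bit example from Box 20
compressed = rle_encode(original)
print(compressed)                       # 000000110011000001
print(len(original) / len(compressed))  # 5.5
```

Decoding the 18-bit result with `rle_decode` returns the original 99 bits exactly, which is what makes the method lossless.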
In general lossless systems are not very efficient with digital audio as
the data does not lend itself well to the deterministic characteristics
these systems require. However lossless coding is used for high
quality digital audio transmission and storage in order to provide a bit-
for-bit reconstruction of the original data. For example coding in the
CD digital audio format achieves a lossless compression ratio of
around two and a half times, a much lower value than that achieved by
the lossy systems discussed in the next section.
8.3.4 Lossy compression

Applications using digital audio over narrow bandwidth links only
become feasible if the data rate can be drastically reduced. Of course
this may be achieved by reducing the sampling rate and/or number of
quantisation levels but only with serious effects to the signal quality as
illustrated by the next activity.
Without altering the anti-aliasing filter or sound level, what effect does
a reduction in a) the sampling rate, and b) the number of quantisation
levels have on sound quality? (Hint: You may like to listen to the
audio track associated with Activity 12 again.)
The Moving Pictures Experts Group (MPEG, pronounced ‘em-peg’)
developed a series of standard compression methods for both pictures
and sound based on lossy compression codes. These are discussed, for
reference only, in Box 21.
Box 21 Moving Pictures Experts Group (for reference only)

The Moving Pictures Experts Group (MPEG) was formed by the International
Standards Organisation (ISO) to set video and audio compression and
transmission standards. Unusually, MPEG defines the way the decoder interprets
the bit stream and any decoder which has this capability is termed compliant.
A coder must generate a compliant bit-stream for it to work with decoders,
but the coder specification is left to the designers.
At the time of writing (2004) three MPEG standards exist and these are
summarised in Table 15 (note that there is no MPEG-3 standard).
Table 15 Summary of MPEG standards
MPEG standard    Use
MPEG-1           Provides video and audio on compact disc.
                 Only moderate video quality.
MPEG-2           Used for high quality video and audio for both
                 DVD disc and Digital Video Broadcasting (DVB).
MPEG-4           Provides higher compression factors than MPEG-2
                 and is expected to be used in computer graphics
                 and Internet applications.
Lossy codecs do not provide a bit-for-bit reconstruction of the original
data but rely upon factors of human perception to allow the removal of
information from the data. For this reason the term perceptual coding
may be used to describe lossy compression methods. Audio perceptual
systems use psychoacoustics to remove the parts of the signal that have
been found to be inaudible to the listener (see Box 22 ‘Psychoacoustics
and lossy compression’). By analysing the frequency and amplitude
contents of a digital audio signal and comparing it to a model of human
auditory perception the coder removes ‘inaudible’ sounds thus lowering
the overall bit rate for transmission or storage as shown in Figure 34(a).
The decoder, shown in Figure 34(b), then reconstructs the signal which
should be perceived by the listener to be the same as the original signal.
[Block diagram (a), coder: a digital audio signal (PCM) enters; blocks labelled frequency/amplitude analysis, psychoacoustic model, frame construction and packing; a low bit rate bitstream leaves.]
[Block diagram (b), decoder: a low bit rate bitstream enters; blocks labelled unpacking, frame reconstruction and frequency/amplitude recreation; a digital audio signal (PCM) leaves.]
Figure 34 A model of a perceptual coder (a) and decoder (b)

Box 22 Psychoacoustics and lossy compression

The study of psychoacoustics was introduced in Chapter 5 of Block 1 and it
explains how we respond subjectively to sounds. Principles found from research
in psychoacoustics are used in lossy compression systems to reduce the amount
of information a sound signal contains without affecting the perceived sound.
The sensitivity of the human ear varies according to the frequency of the
sound. For example one sound at a given frequency and level may be heard
whereas a sound at a different frequency, but at a higher level, cannot be heard.
Thus it is possible to define so called ‘hearing thresholds’ for the full audible
range of frequencies, illustrated earlier in Figure 8. By detecting which parts
of the signal are below the threshold and removing these components, the
amount of information can be reduced.
Also consider two sound sources emitting sounds at the same time. If one sound
is much louder than the other, and if they are of a similar frequency, then the
louder sound may totally mask the softer one, as illustrated in Figure 35. As an
example of this phenomenon, consider two people having a conversation. If an
aircraft passes close overhead, it is quite likely that for a certain time the
range of frequencies emitted by the aircraft’s engines will mask the conversation.
Lossy digital audio compression can use this effect by removing the parts of a
sound signal that are more than a certain level below a loud sound, and thus
would not be heard, so reducing the amount of information that the sound
signal contains.
[Graph: sound pressure level (dB, 0 to 80) against frequency (kHz, 0.02 to 20), showing the threshold in quiet, a masker tone with its masking threshold, and a nearby softer sound that is inaudible because it lies within the masking threshold.]
Figure 35 The masking effect of a loud tone on a near frequency softer tone
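The first idea in Box 22, discarding components that fall below the hearing threshold, can be illustrated with a small Python sketch. The threshold curve used here is a widely quoted published approximation to the threshold in quiet (usually attributed to Terhardt); it is an assumption brought in for illustration and is not part of the course text. Masking by louder sounds is not modelled, and the list of components is invented.

```python
import math


def threshold_in_quiet_db(freq_hz):
    """Approximate hearing threshold in quiet (dB SPL) at a given frequency.

    Assumed formula (Terhardt's approximation), used only for illustration.
    """
    f = freq_hz / 1000.0  # frequency in kHz
    return (3.64 * f ** -0.8
            - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)


def drop_inaudible(components):
    """Keep only (frequency in Hz, level in dB SPL) pairs at or above the threshold."""
    return [(f, level) for f, level in components
            if level >= threshold_in_quiet_db(f)]


# Hypothetical spectral components of a signal: (frequency in Hz, level in dB SPL)
components = [(50, 20.0), (1000, 20.0), (3500, 0.0), (16000, 30.0)]
print(drop_inaudible(components))  # [(1000, 20.0), (3500, 0.0)]
```

The very low and very high frequency components are removed even though one of them has a higher level than a component that is kept, which is the behaviour the box describes.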
You will notice from Figure 34(a) that the coder appears more
complicated than the decoder in Figure 34(b). This is an important
feature introduced by the MPEG group. Can you suggest why this
should be the case?
Comment
Every listener to compressed audio data will require a decoder whilst
coders will only be needed by those involved in the production of the
compressed audio material, which will be fewer in number. Thus by
making the decoder comparatively simple the consumer saves money.
Many software decoders are either free or merely carry a licence fee.
MP3 audio compression

The title MPEG and the details shown in Table 15 rather give the
impression that the standards are wholly concerned with video
compression. However, an MPEG Audio group was also convened and
developed three different levels or layers of audio compression which are
rather confusingly entitled MPEG-2 layers 1, 2 and 3. Of the three, it is
layer 3 which has become the most used and is colloquially known as
MP3, explained in Box 23. Table 16 summarises the three MPEG-2
audio layers. Layer 2 sets the standard for audio compression and is based
on the work of two independent groups who developed audio
compression codes for digital audio broadcasting (MUSICAM*) and
telecommunications transmission (ASPEC**).
Box 23 MPEG-2, layer 3

MPEG-2, layer 3, universally known as MP3, is one of the most popular
compression coding methods used within digital audio. It offers acceptable
audio quality with a high compression ratio of 11:1. The main components
of an MP3 coder are shown in Figure 36.
A PCM digital audio signal is input to the coder where a filter bank splits
the audio into 32 sub-bands to match the action within the cochlea of the
inner ear, described in Block 1 Chapter 5 Section 3. The spectral or
frequency-dependent content of each sub-band is analysed and coded using a
psychoacoustic algorithm to generate the lowest possible bit-rate for the
given content. This allows sounds that cannot be heard, such as those masked
by louder ones and those below the hearing threshold, to be removed. As the
ear cannot detect the direction of frequencies below 100 Hz the stereo
information for those frequencies is also discarded.
To further reduce bit rates in the case of a stereo signal a joint stereo
format is applied whereby the content of one channel that is transmitted
carries the information that is identical to both channels and the other
channel carries the difference information. Variable bit-rate encoding
allocates more bits to complex musical sounds such as those of an orchestra
and fewer to a less complex sound such as a solo vocalist, adding further to
the coding efficiency. The bits carrying the digital audio are packed into
frames along with embedded data about the music. Finally, Huffman coding is
used to reduce the overall data bit rate.
Decoding requires the frames to be unpacked so that the spectral data can be
reconstructed and used to rebuild the original waveform.
[Block diagram: uncompressed audio (PCM 44.1 kHz) enters; blocks labelled filter, psychoacoustic algorithm, encoding and frame packing; ancillary data (lyrics, graphics, hyperlinks, etc.) is added; compressed audio (MP3) leaves.]
Figure 36 Simplified block diagram of an MPEG-layer 3 (MP3) audio coder
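The joint stereo idea described in Box 23, sending what the two channels have in common together with their difference, can be sketched with simple sum-and-difference arithmetic. This is only a minimal illustration of the principle on a few invented sample values; it is an assumption that this mid/side form is a fair stand-in, and a real MP3 coder applies the idea to frequency-domain data rather than raw samples.

```python
def encode_mid_side(left, right):
    """Convert left/right samples to common (mid) and difference (side) signals."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def decode_mid_side(mid, side):
    """Recover the left and right channels from mid and side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


# Hypothetical stereo samples in which the two channels are nearly identical
left = [100, 200, -50]
right = [100, 180, -50]

mid, side = encode_mid_side(left, right)
print(side)  # [0.0, 10.0, 0.0]
```

When the channels are similar the difference signal is mostly zero or small, so it needs far fewer bits than a second full channel.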
Each layer currently allows input sampling rates of 32, 44.1 and 48 kHz
and can support output bit rates of 32, 48, 56, 64, 96, 112, 128, 192, 256
and 384 kbit/s. The audio can be mono, dual channel (e.g. two different
languages) or stereo.
Table 16 Comparison of MPEG-2 audio layers
Layer   Bit-rate     Compression   Features
1       384 kbit/s   3.6:1         High quality at low complexity. A simplified version of MUSICAM.
2       256 kbit/s   5.5:1         High quality sound at moderate data rates.
                                   Identical to MUSICAM. Used for DAB and DVB audio.
3       128 kbit/s   11:1          Moderate quality sound at low data rates.
                                   Uses the best features of MUSICAM and ASPEC.
*MUSICAM, Masking pattern adapted Universal Sub-band Integrated Coding And Multiplexing.
**ASPEC, Adaptive Spectral Perceptual Entropy Coding.
Ogg Vorbis
Ogg Vorbis is an open source non-proprietary compression format that
offers similar functionality to MP3. Developed by the Xiph.Org
Foundation, it is becoming widely supported by common media
players. If aggressive payment methods are ever sought by the MP3
patent holders, current users may well turn to alternative, royalty-free
formats such as this.
8.3.5 Lossy coders and master recordings

A word of caution is necessary when using lossy coders for the
production of a master recording. Consider the situation where a
recording is assembled from a number of different sources (separate
tracks on a multitrack recorder for instance), which is then re-recorded
with some additional material. One of the advantages of a lossless
digital system is that a sound should be able to be re-recorded any
number of times without degradation – assuming of course that the
signal always remains in the digital domain and the sampling rate and
quantisation levels are not changed. Once a sound has been recorded
using a lossy compression system unwanted parts of the signal will
have been removed. When this signal is replayed, mixed with other
sound sources or re-recorded, no further parts of the signal
should be removed. However, it is likely that playing back and re-
recording a digitised sound a number of times using a lossy
compression algorithm will cause audible degradation. Furthermore,
mixing sounds that have come from lossy recording systems is
unlikely to produce a composite sound that, when recorded again with
a lossy system, will not lose even more information. There is also a
potential further problem when combining sounds that have been
subjected to different lossy coding systems. So the use of lossy
compression techniques for original, high quality recordings should be
avoided. Sometimes, recording engineers may use a lossy digital
recording system such as a MiniDisc as a secondary recording machine
in addition to a main lossless digital recorder because of its
convenience and the ready availability of systems and media. This
backup can be used for preview purposes, but will only be used as a
master source in the last resort if the main recording is damaged.
What is the essential difference between lossless and lossy
compression?
8.4 Digital audio and the Internet

Access to the Internet, that global network of computer networks, is
available to anyone with access to a desktop computer and, via a
telephone line or cable connection, to an Internet service provider
(ISP). But why is the Internet so important to music, and why have the
traditional record companies been so worried about us using it?
One of the main reasons for the Internet’s importance comes from the
universal appeal of music and the availability somewhere on the
Internet of exactly the music to which we want to listen. National and
local radio and television broadcasters try to cater for the needs of the
many so mostly play material that attracts the highest possible
audience. Commercial broadcasters must do this to ensure sufficient
advertising revenue. Minority audiences are often poorly catered for
even from the public broadcasters. The Internet has none of the
constraints of the broadcasting companies and for very little cost any
form of music may be made available and we can use powerful search
engines to seek out and listen to the music we want.
However there is a difficulty with providing music over the Internet.
For many people accessing the Internet from home gives an incoming
bandwidth of no better than 45 kbit/s or about 2.5 Mbits per minute.
As you have already found out, PCM digital audio needs about 80
Mbits per minute which is over 30 times faster than the home data rate
so real-time listening is impossible. Even downloading music can be
time consuming as demonstrated in the next activity.
(a) Calculate the time required to copy a 3 minute song from the
Internet onto a computer if the download speed is 45 kbit/s.
Assume the song is stored in the standard CD format (2 channels,
44.1 kHz sampling rate, 16-bits per sample).
(b) Can you suggest ways to reduce the download time?
Comment
(a) One second of standard CD-format sound will generate 2 × 44 100 ×
16 = 1 411 200 bits. The song is 3 minutes long so the total size of
the data to download is 3 × 60 × 1 411 200 = 254 016 000 bits.
The download speed is 45 kbit/s or 45 × 1024 bit/s = 46 080 bit/s.
So the time to download the 3 minute song will be 254 016 000/
46 080 = 5512.5 seconds or 92 minutes which is over 1.5 hours –
just for a 3 minute song!
(b) The download time can be reduced in two ways. Firstly the data
rate could be increased by using a broadband connection.
Secondly the amount of data could be reduced, using data
compression, before downloading.
A broadband telecommunications connection overcomes the problems
associated with download times. However, data file compression and
in particular the MP3 format has reduced the download times to
acceptable levels even for a traditional low-speed dial-up connection.
For example MP3 coding, with a compression ratio of about 11:1, will
allow a 3 minute song to be downloaded in under 9 minutes, a far
more realistic time. And, of course, it also needs less storage space
than the original sound data.
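The download times quoted above can be reproduced with a short Python sketch. It follows the same conventions as the worked comment (1 kbit taken as 1024 bits, and MP3 treated as a flat 11:1 reduction of the PCM data); the function names are illustrative.

```python
def pcm_size_bits(duration_s, sampling_rate_hz=44_100, bits_per_sample=16, channels=2):
    """Size in bits of uncompressed CD-format audio of the given duration."""
    return duration_s * sampling_rate_hz * bits_per_sample * channels


def download_time_s(size_bits, link_kbit_s, bits_per_kbit=1024):
    """Time in seconds to transfer size_bits over a link of link_kbit_s."""
    return size_bits / (link_kbit_s * bits_per_kbit)


song_bits = pcm_size_bits(3 * 60)                # 254016000 bits
pcm_time = download_time_s(song_bits, 45)        # 5512.5 seconds
mp3_time = download_time_s(song_bits / 11, 45)   # 11:1 compression assumed

print(f"PCM: {pcm_time / 60:.1f} minutes")
print(f"MP3: {mp3_time / 60:.1f} minutes")
```

If 1 kbit is instead taken as 1000 bits the times come out about 2% longer, which does not change the conclusion.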
Another important factor that has made the Internet popular has been
the availability of low cost or free digital audio player software to
organise and either output music through the computer sound card or
store it onto writeable CD media (CD-R or CD-RW) for subsequent
playing through stand-alone audio equipment. Portable ‘MP3-players’
have also proliferated. They store music files on memory cards and
with no moving parts they are ideal for use whilst taking exercise or
travelling. Pocket PCs and organisers also have similar music facilities.
As well as being attractive to the listener, the Internet offers unique
facilities to musicians. Distribution costs are minimised and the
recording is never ‘out-of-stock’. Many new recording labels have
appeared giving new performers and music access to the public that
would be denied through traditional routes. Musicians can distribute
their own music although they may lack the publicity and expertise
available through the Internet distributors.
8.4.1 Internet radio

Broadcast radio offers little interactivity apart from radio phone-ins,
where listeners have the opportunity to appear ‘on-air’ and state their
case. Web sites, accessed via the Internet, offer the ideal medium for
interactivity and this has spawned a growth in streaming audio or
Internet radio stations. The music industry, through the Digital
Millennium Copyright Act (to be discussed in Chapter 5 of this block),
ensures that only official licensed sites broadcast music and that the
rules of the Act are rigorously enforced.
Why do you think the music industry is concerned about Internet radio?
Comment
They are concerned that record sales will be affected. Thus they not
only prosecute illegal sites under the Act but also have an input into
what music can be played and the frequency of playing.
Internet radio is made possible through the use of streaming audio
facilities which require a computer with digital audio capabilities and
special ‘receiver’ software as outlined in Figure 37.
[Block diagram showing an audio file held on a streaming audio server, the Internet, and a buffer feeding the streaming audio player software]
Figure 37 The concept of streaming audio
When an Internet broadcast is selected the whole audio data file is not
downloaded and stored on the computer before the sound is output.
Instead a measurement of the line speed is taken and a temporary local
store, called a buffer, is opened. Audio data is streamed into the buffer
and when sufficiently full the audio output is started. As long as the
download data rate is maintained the buffer will not empty and the
audio is heard as a continuous stream. If the data stream is interrupted
for too long the buffer will empty and the audio will be muted until
the buffer is filled to a suitable level. Sound quality is variable and is
often likened to that of AM radio as you will see in Section 8.5.
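The behaviour of the buffer can be illustrated with a simple second-by-second simulation in Python. This is a minimal sketch under stated assumptions: the playback rate, the level at which playback starts and the pattern of arriving data are all invented figures, and the initial line-speed measurement mentioned above is not modelled.

```python
def simulate_stream(arrivals_kbit, playback_kbit_s, start_level_kbit):
    """Return 'buffering' or 'playing' for each one-second step.

    arrivals_kbit: data received from the network in each second (kbit).
    Playback starts (or restarts) once the buffer holds start_level_kbit.
    """
    buffer_level = 0.0
    playing = False
    states = []
    for received in arrivals_kbit:
        buffer_level += received
        if not playing and buffer_level >= start_level_kbit:
            playing = True               # buffer sufficiently full
        if playing and buffer_level < playback_kbit_s:
            playing = False              # buffer has emptied: audio is muted
        if playing:
            buffer_level -= playback_kbit_s
            states.append("playing")
        else:
            states.append("buffering")
    return states


# Hypothetical link delivering 40 kbit each second, with a 3-second interruption
arrivals = [40, 40, 40, 0, 0, 0, 40, 40, 40]
print(simulate_stream(arrivals, playback_kbit_s=32, start_level_kbit=64))
```

In this run the audio keeps playing for the first second of the interruption because the buffer still holds data, then mutes until the buffer has refilled to the starting level.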
What do you think is the most important requirement for successful
reception of Internet radio?
Comment
A reliable and preferably fast link to the Internet is the most important
requirement for successful reception. The original specification for
Internet radio needed an ISDN 128 kbit/s link. Improvements in
compression codecs have meant that lower bit-rates can provide
satisfactory reception, but popular sites get very busy, slowing
download rates.
ACTIVITY 41 (ON-LINE) (Optional)
You may like to use your Internet connection to try Internet radio.
Details on how to do this are given in the Block 3 Companion.
8.5 Digital broadcasting

8.5.1 Introduction
Amplitude modulation (AM) broadcasts, found on the long-, medium-
and short-wave frequencies (0.1 – 30 MHz) of the radio spectrum, deliver
a low quality audio signal mainly because the audio bandwidth is
restricted to 6 kHz. The signal-to-noise ratio is poor and many broadcasters
use audio compression to limit the dynamic range. However, the
strengths of AM lie in its broadcasting coverage, especially on short wave bands
(1.8 – 30 MHz) where very long-distance transmissions are possible.
Frequency Modulation (FM) broadcasts offer higher quality analogue
audio, but with restricted coverage due to the necessary use of high
frequencies in the radio spectrum (88 – 108 MHz) to provide an audio
bandwidth of 15 kHz.
A good FM radio receiver is capable of receiving a 15 kHz bandwidth audio
signal with a signal-to-noise ratio in excess of 70 dB. The FM broadcast
specification was developed in the early 1950s, a time when the ‘wireless’,
similar to the one shown in Figure 38, was a fixed installation in the
corner of the living room connected to an external aerial.
There were no expectations to broadcast signals suitable for mobile
and portable reception thus transmitter powers and locations were
specified with only fixed receivers in mind.
Figure 38 A 1940s ‘wireless’
FM stereo broadcasts were introduced in 1966 using the American
‘Zenith-GE Pilot-Tone System’. This allowed suitably equipped FM
receivers to output a two channel stereo signal whilst ensuring legacy
mono FM radios still worked satisfactorily. Today, whilst CD portable
audio systems are able to offer excellent performance, FM radio reception
is less acceptable due to the aforesaid configuration of the transmitter
network. In addition manually tuning an FM radio is difficult, the
aerials are directional causing unstable reception especially for stereo
reception and signals are prone to automobile ignition interference.
The Radio Data Service (RDS) was introduced by the BBC to improve
usability. Here, extra data is transmitted with the analogue signal which
is used by the receiver to tune automatically to the strongest signal,
thereby ensuring the best possible reception at all times and also giving
station identification. Radio manufacturers use RDS to improve the
functionality of FM radios but at an increased cost to the consumer.
8.5.2 Digital Audio Broadcasting

In the early 1990s a European consortium developed a specification for a
new digital broadcasting system that would be capable of offering a
radio sound quality comparable with CDs. The result was the Eureka 147
specification for Digital Audio Broadcasting (DAB), also called Digital
Radio (but not to be confused with analogue radios using digital
displays and tuning). The DAB service offers:
• a variable mixture of high and low quality stations;
• sound quality that can be comparable with CDs;
• unimpaired mobile reception even at high speeds;
• efficient utilisation of the frequency spectrum;
• transmission capacity for ancillary data;
• low transmitter powers;
• terrestrial, cable and satellite delivery;
• user-friendly receivers;
• European or better world-wide standardisation.
The main advantage of digital broadcasting is that it makes very efficient
use of the radio frequency (RF) spectrum by fitting more stations into a
given bandwidth. The current FM network requires nearly 3 MHz of
bandwidth in the RF spectrum per station to avoid interference between
transmitters (so-called co-channel interference). On the other hand a
DAB station requires under 200 kHz of bandwidth in the RF spectrum
and the transmission mode used is not affected by co-channel interference.
A system called coded orthogonal frequency division multiplex (COFDM)
has been adopted. It is inherently reliable because rather than having to
receive a continually varying analogue signal the receiver has only to select
between a few levels representing the data bits. Also error detection,
correction and concealment methods improve the stability of the audio
signal. To keep the data bit-rate low the lossy compression format MPEG-2
layer 2 (MP2) is used. Station tuning is much easier than with AM or FM
systems as a single frequency is used for each station. Interference due to
reflections off buildings (a major problem for mobile reception in towns and
cities), illustrated in Figure 39, and from aircraft flying in the reception
path is eliminated.
Figure 39 Reflections interfere with reception
Unfortunately reception is not compatible with conventional analogue
receivers and a new, currently more costly, DAB receiver, similar to the
one shown in Figure 40, must be purchased. Box 24 outlines how
DAB is organised.
Figure 40 A portable DAB receiver
Box 24 Organisation of DAB

The digital audio and associated data is carried in main service channels (MSCs)
or multiplexes, each of which can carry a number of individual audio and/or
data services. Currently the UK government has allocated seven multiplexes
within Band III of the radio frequency spectrum, between 217.5 and 230.0 MHz.
There are two national and up to five local multiplexes available from any
given location providing a mixture of stereo, mono and data services. For
example the BBC multiplex (in 2004) provides twelve digital audio channels
(i.e. stations) and one data stream (for web pages), illustrated in Figure 41.
Each multiplex provides an average bit rate of 1.152 Mbit/s for these services.
The audio and associated data for each service is formed into frames which
are transmitted every 24 milliseconds. Each service occupies part of the overall
bit rate. The allocation of so many services has given rise to some controversy
as providers appear to be offering quantity (more stations) at the expense
of quality (reducing the bit rate for each station). Currently the BBC allocates
192 kbit/s (kbs) for its stereo classical music station, Radio 3, but only 64 kbit/s
for predominantly speech-only stations.
[Chart of bit rate allocations within the multiplex:
Radio 1            128 kbs
Radio 2            128 kbs
Radio 3            192 kbs
Radio 4            80 kbs (128 kbs when Radio 5 SportX is not being broadcast)
Radio 5 Live       80 kbs
Radio 5 SportX     64 kbs
1Xtra              128 kbs
BBC 6 Music        128 kbs
BBC 7              80 kbs
Asian Network      64 kbs
World Service      64 kbs
Web page           16 kbs]
Figure 41 The BBC’s multiplex allocation in January 2004
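The way the services share the multiplex can be checked with a few lines of Python. The allocations below are read from Figure 41 (with Radio 5 SportX on air); the variable names are illustrative, and the frame-size figure simply applies the 24 millisecond frame period given in Box 24 to the whole multiplex.

```python
# Bit rate allocations in kbit/s, as read from Figure 41 (January 2004)
allocation_kbit_s = {
    "Radio 1": 128,
    "Radio 2": 128,
    "Radio 3": 192,
    "Radio 4": 80,
    "Radio 5 Live": 80,
    "Radio 5 SportX": 64,
    "1Xtra": 128,
    "BBC 6 Music": 128,
    "BBC 7": 80,
    "Asian Network": 64,
    "World Service": 64,
    "Web page (data)": 16,
}

FRAME_MS = 24  # frame period in milliseconds

total_kbit_s = sum(allocation_kbit_s.values())
# kbit/s multiplied by milliseconds gives bits
bits_per_frame = total_kbit_s * FRAME_MS
radio3_share = allocation_kbit_s["Radio 3"] / total_kbit_s

print(total_kbit_s)    # 1152 kbit/s, i.e. 1.152 Mbit/s
print(bits_per_frame)  # 27648 bits carried in each 24 ms
print(f"Radio 3 uses {radio3_share:.1%} of the multiplex")
```

The allocations add up exactly to the 1.152 Mbit/s capacity, which makes clear that giving any one station a higher bit rate means taking capacity from another.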
Can you think of reasons why broadcasters should favour broadcasting
a large number of stations, including some new stations not available
on existing analogue services, at the expense of absolute audio quality?
Comment
As with any new service which has needed an investment from both
the suppliers and the users, digital audio broadcasting must appeal to
as wide a range of the population as possible in order for it to be a
success. The necessity for the public to purchase new equipment
means they must get value for their investment. Additional services,
rather than absolute audio quality, appear to be the route taken by the
broadcasters to attract users. It should be borne in mind that one of the
goals of the Eureka 147 consortium was to improve mobile and
portable reception. In both these cases absolute audio quality is not
paramount as the listening conditions will be far from ideal.
Text information to identify the station and programme content is
broadcast within the digital audio stream which can be output on the
display of the receiver. Experiments with transmitting web pages to be
displayed by a computer’s Internet browser, and computer program
downloads have also been successfully undertaken. Programme guides
giving the user the ability to time recordings are also becoming
available and receivers with memory card storage are expected to be
available during 2004.
8.5.3 Digital Radio Mondiale

Both FM and DAB have restricted geographical coverage because they
are transmitted at very high frequencies. Both also have relatively
high bandwidth requirements, which makes them unsuitable for use
on the existing AM bands. In October 2002 the International
Telecommunication Union (ITU) endorsed the use of Digital Radio
Mondiale (DRM) on existing AM frequencies. Developed by a European
consortium, DRM provides quality digital audio and data services
within the existing 9 kHz bandwidth AM channels. This efficiency of
bandwidth is achieved in three ways. Firstly, the DRM channel carries
only one station, although it contains programme data and text
along with the audio data. Secondly, very high compression based
on MPEG 2 advanced audio coding (AAC) is used. Finally, very low
bit rates, of the order of 20 kbit/s, are employed. As with
DAB, COFDM transmission is used and so again a special radio
receiver is necessary, which unfortunately is not compatible with
current DAB radios. A software program which runs on desktop
computers, shown in Figure 42, enables DRM to be decoded from a
suitably modified AM radio receiver. At the time of writing (2004)
DRM is still undergoing reception trials but has the potential to offer
good quality audio free from the interference that plagues the current
analogue medium- and short-waveband broadcasts. This is demonstrated
in the next activity.
Figure 42 A DRM software receiver displaying the digital signal,
programme information and data
The three audio tracks associated with this activity demonstrate
similar broadcasts using, in order, analogue AM, digital DAB and
digital DRM stations. Note particularly the extreme interference on
the AM broadcast compared with both of the digital broadcasts. The
DAB channel used a 64 kbit/s data rate whereas DRM used only
20.9 kbit/s.
List the main reasons for the Internet becoming a major source of
music.

Digital audio can offer an opportunity for both artists and listeners to
enjoy a much greater quality and a wider choice of musical performances.
Both Internet audio and DAB are able to offer the consumer a wider
choice of music genres. However, to persuade consumers to purchase
additional equipment to exploit digital reception new and innovative
services have had to be established. This has led to a tension between
quality, which requires high bit rates, and the number of services, which
restricts the available bit rates. At the time of writing, the number of
DAB services appears to be winning out over quality. That said, DAB
reception is still superior to that offered by AM transmissions, although
with good reception and appropriate receivers analogue FM transmission
still offers superior quality. The future may well bring improved digital
audio compression technologies that further improve the audio quality
for a given bit rate.
SUMMARY OF CHAPTER 4
In planning a recording the record producer will decide what should be
recorded, who should perform the work, and where and when it should be
made. The decision as to why the recording is made should be on a
financially sound basis, though this need not always be the case.
(Section 2.2.1)

The session recording will contain all the takes and re-takes of the
performance, all carefully documented. Whilst the recording may be
replayed, no editing will be done at this time. (Section 2.2.2)

The session recording is mixed down, usually to a two-channel master
recording. Documentation from the recording session will be used at this
time. (Section 2.2.3)

Post production is the final opportunity to add additional material or
make minor changes. The data and formatting specific to the distribution
medium are also added at this stage. The final result is an EQ’d master.
(Section 2.2.4)

Manufacture of digital audio discs involves the creation of a glass master
which is used to make stampers which press the data into the plastic
medium. Once labelled and boxed, the discs are sold through record
distributors. (Sections 2.2.5 and 2.2.6)

The important characteristics of any audio system are dynamic range,
frequency response and signal-to-noise ratio. Dynamic range determines
the loudest and quietest sounds that may be reproduced by the audio
system. An ideal value would match the human ear. In reality the range
for analogue systems is much lower. (Section 3.1)

Frequency response is determined by measurement to find the range of
frequencies where the output response is flat for a constant input level.
The frequency response for audio systems should be between 20 Hz and
20 kHz. (Section 3.2)

Signal-to-noise ratio compares the wanted signal to the unwanted noise
that is added to the audio signal by system imperfections and external
factors. The signal-to-noise ratio is given by the wanted signal power
divided by the unwanted noise power and is usually expressed in decibels.
(Section 3.3)

Pulse code modulation (PCM) is used to code an analogue audio signal to a
digital form prior to storage or transmission. The minimum sampling rate
needs to be twice the signal bandwidth. The number of quantisation levels
is determined by the required quality of the audio signal when it is
reconstructed. The product of the sampling rate and the number of bits
used to code each sample (which sets the number of quantisation levels)
determines the basic bit rate of the digital PCM signal. (Section 4.2)

The bandwidth required for a PCM digital audio signal is much higher than
the equivalent bandwidth necessary for the original analogue signal and
is determined by the bit rate. (Section 4.2.2)

Within the digital domain, signal-to-noise ratio is governed by the number
of quantisation levels. The greater the number of levels the better the
signal-to-noise ratio. (Section 4.2.3)

Signal conversion error causes distortion due to the approximation of the
coded values to the actual signal levels. At low signal levels this may
become noticeable to the listener as the error is correlated to the audio
signal. By adding a very small level of noise (called dither noise) the
errors become decorrelated and in consequence less noticeable.
(Section 4.2.3)

Increasing the number of quantisation levels improves the digital dynamic
range because of an improved signal-to-noise ratio and the ability to
support higher sound levels without distortion. (Section 4.2.4)

Video tape recorders using rotary recording and playback heads were used
as the first digital audio storage devices as they could offer sufficient
bandwidth to store the high data rates of PCM signals. Later, dedicated
audio devices such as DAT and ADAT recorders used similar rotary head
technology. Hard disks are rapidly superseding digital magnetic tape
recorders. (Section 5.2)
Stationary head digital tape recorders eventually became available when
technology allowed suitable heads and tapes to be made. These machines
provide multitrack recording and tape editing facilities similar to
earlier analogue machines. (Section 5.3)

Disc storage is preferred by consumers as the random access nature of
discs more closely matches the way they listen to music. (Section 5.4)

Digital audio compact discs replaced analogue discs because they could
support an improved frequency response, dynamic range and signal-to-noise
ratio. They also offered an increased playing time. CDs use a spiral track
of lands and pits on a plastic disc coated with a reflective layer to store
the digital data. The data is read using a laser beam and an optical pickup
that measures the reflected laser light from the pits and lands on the CD.
This means that playback involves no physical contact with the medium,
with no resultant wear through repeated playing. The discs are not prone
to sound degradation due to physical faults, dust and mishandling. CD
audio data uses a 44.1 kHz sampling frequency with 16-bit quantisation.
At least 74 minutes of stereo sound can be stored. A CD can store
650 Mbytes of data. (Section 6.2)

The MiniDisc was designed to replace the analogue compact cassette system.
It was the first disc-based digital audio system to use magneto-optical
recording technology. The recordable disc has optical and magnetic layers.
A laser beam heats a spot on the magnetic layer which can be magnetically
polarised to store the value of the bit as a tiny magnet. Old data is
simply overwritten. On playback the magnetic polarisation affects the
reflection of a (less intense) laser beam allowing the value of the bit to
be read. To give MDs the same playing time as CDs (74 minutes) data
compression (called ATRAC) is used. The sampling frequency and
quantisation levels are the same as for the CD. (Section 6.3)

Super Audio Compact Disc (SACD) and Audio DVD (DVD-A) systems use 12 cm
high density discs (4.7 Gbytes) to offer improved audio quality by
increasing the sampling frequency and quantisation levels. They also offer
multi-channel sound formats as well as stereo and, in the case of the DVD,
video sequences are possible. Copy protection technology inhibits illegal
copying. Hybrid SACDs can be played on standard CD players but without the
advantages of the SACD technology. Universal players have been developed
to play all digital audio disc standards in use. (Section 6.4)

Oversampling increases the sampling frequency by two or more times in
order to allow the use of simple anti-aliasing filters when converting the
digital audio data back to its analogue form. This is usually achieved by
interpolating sample level points between the original 44.1 kHz sample
points. (Section 7.1)

Single bit conversion samples the data at 64 or 128 times the normal
sampling frequency. At these very high rates the difference between
samples will only be one quantisation level at most. This allows the
recovery of the audio signal just by simple filtering. (Section 7.2)

The ability to detect and correct or conceal media faults caused by
drop-outs, jitter, interference and noise ensures the sound is not
disturbed when an error occurs. If the error cannot be corrected then it
can be concealed. If errors are not corrected or concealed, audio
disturbance occurs, which may or may not be audible. (Section 7.3)

Parity is the simplest way of detecting a bit in error. It cannot detect
an even number of bits in error. Cyclic redundancy check code will detect
multiple bit errors. (Section 7.3.1)

Forward error correction, which adds additional bits to the digital audio
data, is used to correct the bit errors on digital audio discs. Additional
backward error correction can be used to protect against shock errors in
jog-proof systems. Combinational parity is an example of forward error
correction. (Section 7.3.2)

To ensure that consecutive blocks of data are not destroyed by a media
fault which might not be capable of being corrected, the data blocks are
interleaved in such a way that, whilst several blocks may be affected, the
errors are able to be corrected because they become random errors rather
than burst errors. (Section 7.3.3)
Errors that are too gross to be corrected may be concealed. This can be
achieved by muting, using the previous data word or, most commonly, by
interpolating the data word value from the values of the surrounding data
words. (Section 7.3.4)

Copies of audio stored as digital data are perfect. The industry has
developed ways of protecting itself from the manufacture of multiple
copies. Serial copy management system (SCMS) allows users to make a single
digital copy whilst preventing multiple copies of copyright material being
made. (Section 7.4)

The path over which a signal is transmitted is usually referred to as a
channel. The channel bandwidth puts a limit on the bit rate of the digital
signal and therefore the quality of the analogue signal. (Section 8.2)

Large volumes of digital audio require high data rates for transmission
and large storage capacities. Both of these problems are alleviated by
using data compression techniques. (Section 8.3.1)

Lossless methods allow bit-for-bit retrieval of the digital data but do
not have the high compression ratios of lossy systems. Entropy codes and
run length coding are examples of lossless compression. Lossless systems
are not very efficient at compressing audio data due to the random nature
of sound. (Sections 8.3.2 and 8.3.3)

Lossy methods rely on psychoacoustic algorithms to identify and remove
sounds that would not be heard by the listener. This means the
decompressed version is different to the original. Lossy methods are much
more efficient at compressing audio data. Examples of these include MP3
and Ogg Vorbis. (Section 8.3.4)

Systems employing lossy data compression should not be used for mastering
recordings as audible degradation may occur when mixed and re-recorded
with other sources. (Section 8.3.5)

The Internet is an ever expanding source of music. Streaming audio, whilst
offering lower quality audio due to the need for very high data
compression, allows a wide range of musical genres to be heard by
consumers at very low production costs. (Section 8.4)

Broadcasting digital audio has the potential for offering CD quality sound
from radio receivers. In reality Digital Audio Broadcasting in the UK
currently offers more stations with less bandwidth than are available in
the existing AM or FM bands, but this places a restriction on the bit rates
available and therefore limits the audio quality of the stations.
Programme information and text services are available. DAB technology
overcomes many problems associated with analogue AM and FM systems but is
incompatible with them, so the listener needs to purchase new receivers.
(Sections 8.5.1 and 8.5.2)

Digital Radio Mondiale exploits the wide coverage of the medium- and
short-wavebands to broadcast highly compressed low bit-rate digital audio
signals without the problems of interference associated with analogue AM
radio broadcasts. Again new receivers will be required. (Section 8.5.3)
ANSWERS TO SELF-ASSESSMENT ACTIVITIES
Activity 2
Equalisation enables the engineer to control the relative levels of various
frequencies in the audio bandwidth. It can vary the harmonic content or
timbre of the final recorded sound.
Normalisation ensures that the sound makes best use of the available
dynamic range.
In the case of the additional track the nature of the sound of the other
tracks would be analysed before a decision was made as to whether
changes were necessary. Particularly the track preceding the new track
must be checked for level and sound as it would be unfortunate if the
listener had to make adjustments to the replay equipment just for this
additional track.
Activity 4
Music companies control both the artistic and the technical elements of
the recording. In the planning stage (1) a production team led by a
producer will make proposals on the music, artists, engineers, venue,
recording dates, etc. based around agreed budgets. The performance
generates a session recording (2) which is subsequently edited in the
mix-down stage (3) to make the recording master. This is used in the
post production stage (4) to generate an appropriately EQ’d master for
the chosen delivery medium. This master version is used in the
manufacturing process (5) to make the final product which is then
marketed and sold through distributors (6) to the record stores.
Activity 6
The effect of using a microphone with insufficient bandwidth to record
the organ would be to alter the timbre of the sound. The relationship
between the fundamental frequency and harmonics would be changed
giving a different sound to the original. It is vital to ensure any equipment
used for recording and playback has sufficient bandwidth to capture the
full range of frequencies at their correct levels.
Activity 8
The ratio of the average output signal power to noise power is:
$$\frac{100}{10 \times 10^{-6}} = \frac{100}{0.000\,01} = 10\,000\,000 = 10^{7}$$
which corresponds to 70 dB since, in terms of power, a ratio of 10 is
10 dB, 100 is 20 dB, 1000 is 30 dB, etc.
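The conversion from a power ratio to decibels used in this answer can be checked with a few lines of Python. This is only an illustrative sketch: the function name power_ratio_to_db is made up for this example, and the two powers are assumed to be expressed in the same units.

```python
import math

def power_ratio_to_db(signal_power, noise_power):
    """Express the ratio of two powers (same units) in decibels."""
    if signal_power <= 0 or noise_power <= 0:
        raise ValueError("powers must be positive")
    # For powers the decibel value is 10 times the base-10 logarithm of the ratio.
    return 10 * math.log10(signal_power / noise_power)

# Figures from this answer: signal power 100, noise power 10 x 10^-6.
print(round(power_ratio_to_db(100, 10e-6), 1))

# The rule of thumb quoted above: ratios of 10, 100 and 1000.
for ratio in (10, 100, 1000):
    print(ratio, round(power_ratio_to_db(ratio, 1), 1))
```

Each factor of 10 in the power ratio adds 10 dB, which is why a ratio of 10 000 000 comes out at 70 dB.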
Activity 9
In order to avoid the generation of any additional harmonics which would
create distortion of the output signal, the transfer function should be linear
as illustrated in Figure 11(a). A non-linear transfer function, illustrated in
Figure 11(b), adds unwanted frequencies to the original signal.
Activity 10
With a sampling frequency of 44.1 kHz, 16 bits of digital audio data will
be stored 44 100 times a second for each channel.
Thus for 1 second of audio, 16 × 44 100 bit/s = 705 600 bits per
channel are needed or 2 × 705 600 = 1 411 200 bits for the pair of
stereo channels.
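The same arithmetic can be written as a short Python function. This is a minimal sketch for uncompressed PCM only; the name pcm_bit_rate and its parameters are chosen for this illustration and ignore any error-correction or formatting overhead added by a real storage medium.

```python
def pcm_bit_rate(sampling_rate_hz, bits_per_sample, channels=1):
    """Basic PCM bit rate in bits per second (no overhead, no compression)."""
    return sampling_rate_hz * bits_per_sample * channels

# CD figures used in this answer: 44.1 kHz sampling, 16 bits per sample.
per_channel = pcm_bit_rate(44_100, 16)
stereo_pair = pcm_bit_rate(44_100, 16, channels=2)
print(per_channel, stereo_pair)
```

Changing any one of the three inputs scales the bit rate in direct proportion, which is why halving the sampling rate or dropping to mono are the simplest ways to reduce the data rate.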
Activity 11
The analogue signal is passed through an anti-aliasing filter to limit
the bandwidth to half the sample frequency. An analogue-to-digital
converter converts each sampled level of the analogue signal to a
binary code word using the number of binary bits determined by the
required number of quantisation levels. The second channel is added
and then each code word has the error correcting code added and is
converted to a serial form before being sent to the storage medium.
Activity 19
The conversion process adds a quantisation error to the analogue
signal which is termed quantisation noise. This error is the difference
between the actual analogue level at the time the signal is sampled and
the voltage level the nearest code word represents. This difference can
be up to half the minimum quantisation interval.
Quantisation noise can be reduced by increasing the number of bits in
the code word used to represent the level, thus reducing the minimum
quantisation interval.
Unlike the random nature of the noise experienced in analogue
systems quantisation noise is correlated to the signal, in other words it
forms a distinct pattern that is related to the signal. The ear is
sensitive to correlated noise and so another way of reducing the effect
of quantisation error is to add a small amount of noise, called dither
noise, to the original signal. This decorrelates the quantisation noise
from the digital signal making it less audible.
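The claim that the quantisation error is at most half the quantisation interval, and that it shrinks as bits are added, can be illustrated with a simple uniform quantiser. This is a sketch under stated assumptions: the signal is taken to lie between -1 and +1, the helper names are invented for this example, and dither is not modelled.

```python
def quantisation_step(bits, full_scale=1.0):
    """Size of one quantisation interval for a signal spanning -full_scale..+full_scale."""
    return 2 * full_scale / (2 ** bits)

def quantise(value, bits, full_scale=1.0):
    """Return the level of the nearest code word for an analogue value."""
    step = quantisation_step(bits, full_scale)
    index = round(value / step)
    # Keep the code word inside the range that 'bits' binary digits can represent.
    lowest, highest = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    index = max(lowest, min(highest, index))
    return index * step

sample = 0.3  # an arbitrary analogue level, assumed for illustration
for bits in (4, 8, 16):
    step = quantisation_step(bits)
    error = abs(quantise(sample, bits) - sample)
    print(bits, step, error, error <= step / 2)
```

For a value inside the coding range the error never exceeds half a step, and every extra bit halves the step size.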
Activity 22
With tape, the random selection of tracks involves searching by spooling
from one end to the other and then rewinding once the track is played.
This is time consuming, wears the tape, and is difficult to automate.
Discs enable fast access to tracks involving little wear on the medium
and the system is easier to automate.
Activity 23
MiniDiscs use a data compression method called ATRAC to compress
the audio data by about five times. This allows the physically smaller
MiniDisc to play back for the same time as a CD.
Activity 25
(a) Parity errors occur in 2, 4, 6, 7 and 8. There are an odd number of
1s in each.
(b) An even number of bits in error will disguise the fault as the total
number of 1s will still be even. For example changing any two bits
in the code word 1110 1000 will still keep the total number of 1s
even, e.g. 1101 1000, where the third and fourth bits are the two in error.
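The weakness described in part (b) is easy to demonstrate. The sketch below assumes even parity, as in this answer; the function name has_even_parity and the single-bit example word are made up for illustration.

```python
def has_even_parity(word):
    """True if the code word (a string of 0s and 1s, spaces ignored) has an even number of 1s."""
    return word.replace(" ", "").count("1") % 2 == 0

original = "1110 1000"        # code word from this answer
two_bit_error = "1101 1000"   # third and fourth bits changed
one_bit_error = "1100 1000"   # third bit only changed (hypothetical example)

for label, word in (("original", original),
                    ("two bits in error", two_bit_error),
                    ("one bit in error", one_bit_error)):
    print(label, word, "passes" if has_even_parity(word) else "fails")
```

The word with two bits in error still passes the check, so the fault goes unnoticed, whereas the single-bit error is caught.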
Activity 28
Step 1 is to show the data and parity bits of the (20,12) code format.
The parity bits, printed in blue in the original, are the final column and
the bottom row:
1010 0
1001 1
0010 1
0011 0
Totalling the ones in the bottom parity check row (row 4) shows the
parity to be correct (as there is an even number of ones). Performing
the same action on the other three rows shows that row 2 has an error in
it as there is an odd number of ones (there should be an even
number). Similarly totalling the ones in each column shows that the
parity check column is correct, but it also shows that column 3 has an
error (an odd number of ones again), so the bit in error is at the
intersection of row 2 and column 3, marked with square brackets below:
1010 0
10[0]1 1
0010 1
0011 0
Once this bit is changed all is well and the corrected code is:
10100 10111 00101 00110
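The row-and-column check used in this answer can be sketched in Python. This is an illustrative implementation only: the function name correct_single_error is invented here, even parity is assumed for every row and column, and the routine handles only the single-bit case worked through above.

```python
def correct_single_error(rows):
    """Locate and flip one bit in error in a block protected by row and column parity.

    'rows' is a list of equal-length bit strings; the last column and the last
    row hold the (even) parity bits. Returns the corrected rows and the
    1-based (row, column) position of the corrected bit, or None if no error.
    """
    grid = [[int(bit) for bit in row] for row in rows]
    bad_rows = [r for r, row in enumerate(grid) if sum(row) % 2]
    bad_cols = [c for c in range(len(grid[0]))
                if sum(row[c] for row in grid) % 2]

    if not bad_rows and not bad_cols:
        position = None
    elif len(bad_rows) == 1 and len(bad_cols) == 1:
        r, c = bad_rows[0], bad_cols[0]
        grid[r][c] ^= 1              # flip the bit at the intersection
        position = (r + 1, c + 1)
    else:
        raise ValueError("more errors than this simple scheme can correct")

    return ["".join(str(bit) for bit in row) for row in grid], position

received = ["10100", "10011", "00101", "00110"]
corrected, position = correct_single_error(received)
print(position)
print(" ".join(corrected))
```

A single failing row together with a single failing column pins the error to one bit; any other pattern of failures means more than one bit is wrong and the block has to be handled some other way.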
Activity 29
Error concealment is used when a data error is detected but cannot
be corrected, for example because it is too big. Rather than ignore the
error, which would cause audio distortion, the error is hidden by
interpolation: the value of the data word in error is replaced by a value
derived from the data either side of the error, as shown in Figure 32(c).
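A minimal sketch of concealment by interpolation is given below. It assumes the simplest possible case, a single bad sample replaced by the average of its two neighbours; the function name and the sample values are made up for illustration, and real players may use more elaborate schemes.

```python
def conceal_by_interpolation(samples, bad_index):
    """Replace one uncorrectable sample with the mean of its two neighbours."""
    if not 0 < bad_index < len(samples) - 1:
        raise ValueError("need a valid sample on each side of the error")
    repaired = list(samples)
    repaired[bad_index] = (samples[bad_index - 1] + samples[bad_index + 1]) / 2
    return repaired

# A steadily rising signal in which the third sample has been corrupted.
samples = [100, 120, -32768, 160, 180]
print(conceal_by_interpolation(samples, 2))
```

The replacement value is only an estimate of the lost sample, but it avoids the loud click that the corrupted value would otherwise produce.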
Activity 30
A random error would be a single bit error, which would be corrected
using the forward error correction data. There would be no disturbance
to the sound, which would appear as a perfect recording.
Activity 32
With a sampling rate of 32 kHz and a 2 channel stereo signal where
each channel is quantised using 16 bits, the bit rate will be
32 000 × 2 × 16 = 1 024 000 bits/s ≈ 1 Mbit/s
Activity 33
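For a bit rate of 56 kbit/s and using 16 bit quantisation the maximum
sampling rate would be 56 000/16 = 3500 samples per second for mono
sound. As the sampling rate must be at least twice the signal bandwidth,
the maximum available bandwidth would be 1.75 kHz for mono sound (or
875 Hz for stereo sound). Hardly suitable for high quality reproduction
but then neither is the existing analogue telephone network!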
Activity 35
(a) The necessary sampling rate is determined by the bandwidth of the
audio signal, so if the sampling rate is lowered the high frequency
components of the signal will not be reproduced and aliasing will
occur giving a very unpleasant and distorted sound.
(b) Reducing the number of quantisation levels reduces the dynamic
range (by 6 dB for every bit reduction in the number of bits used to
code the samples) and increases distortion by adding quantisation
noise.
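The figure of 6 dB per bit in part (b) follows from the fact that each extra bit doubles the number of quantisation levels. The sketch below uses the common approximation that the dynamic range is the ratio of the largest to the smallest codable level; the function name is chosen for this illustration.

```python
import math

def dynamic_range_db(bits):
    """Approximate dynamic range of an n-bit quantiser, in decibels.

    The ratio of levels is 2**bits, and level (amplitude) ratios use
    20 * log10 rather than the 10 * log10 used for power ratios.
    """
    return 20 * math.log10(2 ** bits)

for bits in (8, 15, 16):
    print(bits, round(dynamic_range_db(bits), 1))

# The change caused by removing a single bit:
print(round(dynamic_range_db(16) - dynamic_range_db(15), 2))
```

Doubling the number of levels adds 20 log 2, which is about 6.02 dB, so a 16 bit system has a dynamic range of roughly 96 dB.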
Activity 37
The main difference between lossless and lossy compression is that in
the former no data is lost in the compression process. Lossy
compression always loses data, but the idea is that the brain has no
perception that any of the original information is missing when the
data is reconstructed.
Activity 44
The reasons for the rise in use of the Internet for music distribution
are as follows:
• the use of MP3 compression software;

• the relatively low cost of publishing original music on the Internet;
• the ability to find any particular genre using powerful search
engines.
LEARNING OUTCOMES
After studying this chapter you should be able to:
1 Explain correctly the meaning of the emboldened terms in the main

text and use them correctly in context.
2 State the activities and people involved in producing a recording of
a musical performance. (Activities 1, 4 and 6)
3 Describe the basic characteristics of an audio system in terms of
frequency response, signal-to-noise ratio and dynamic range.
(Activities 5, 7, 8 and 9)
4 Explain how the above characteristics may be improved by employing
digital audio techniques. (Activities 11, 14, 16, 17, 18 and 19)
5 Outline the development of the various technologies used for storing
digital audio data and describe the most important characteristics
and overall operation of the processes that have been introduced in
the main text. (Activity 22)
6 Categorise the various types of flaws in digital audio data and describe
how they may be corrected. (Activities 24, 25, 26, 28, 29 and 30)
7 Explain the operation of the methods used in transmitting digital
audio introduced in the main text. (Activity 33)
8 Describe why digital audio data compression is used and explain
the fundamentals of how the methods introduced in the main text
operate. (Activities 37 and 38)
9 Describe the ways in which digital audio transmission is increasing
the availability of musical genres and outline the techniques involved
in the transmission of digital audio that have been introduced in
the main text. (Activities 40, 43 and 44)
Acknowledgements
Grateful acknowledgement is made to the following sources for
permission to reproduce material in this chapter:
Figure 1: Notes from A Venetian Coronation 1595. Virgin Classics
Limited. By kind permission of EMI Music Publishing Ltd;
Figure 5: Sibelius Op. 47 Violin Concerto. Ernst Eulenburg Ltd., London;
Figures 18 & 21: Sony Service Centre (Europe) NV, 1988, 1992, 1995.
Thanks also to Bill Strang for playing the piano on the recording for
Activities 7, 12 and 16 (and not minding what the Course Team did
with it).
Every effort has been made to trace all the copyright owners, but if any
has been inadvertently overlooked, the publishers will be pleased to
make the necessary arrangements at the first opportunity.
TA225 Block 3 Sound processes
Chapter 5
The Music Business
CONTENTS
Aims of Chapter 5
1 Introduction
2 Revolutions in sound recording – a short history
2.1 Introduction
2.2 Cylinders or plates?
2.2.1 Edison starts with cylinders
2.2.2 Bell and Tainter improve the phonograph
2.2.3 Berliner experiments with plates
2.2.4 Making multiple copies
2.2.5 Turning the handle
2.2.6 Music matters
2.2.7 Good times and bad
2.3 Sounds from magnets
2.3.2 Recording on the wire
2.3.3 Magnetic tape recorders
2.3.4 Compact cassettes
2.3.5 Studio tape recorders
3 Restoring recordings
3.1 Introduction
3.2 The life span of CDs
3.3 Audio restoration processes
3.3.2 Selecting the performance
3.3.3 Converting to digital
3.3.4 Making the joins
3.3.5 Removing unwanted artefacts
3.3.6 Application of equalisation
4 Digital personal jukeboxes
4.2 The development of digital jukeboxes
4.3 Case study: Apple iPod
4.3.1 iPod hardware
4.3.2 iPod software
4.3.3 iPod digital audio storage
5 Surrounded by sound
6 Protecting the copyright
6.2 What is copyright?
6.3 Protecting the work
6.3.1 Using the law
6.3.2 Using technology
6.4 The current situation
7 Summing up
Summary of Chapter 5
Answers to self-assessment activities
Learning outcomes
Acknowledgements
AIMS OF CHAPTER 5
• To outline the history of the recording industry and the evolution
of the associated technology.
• To describe processes involved in reproducing an analogue disc
recording of a musical performance and the characteristics of the
technologies used.
• To describe the processes involved in reproducing a magnetic tape
recording of a musical performance and the characteristics of the
technologies used.
• To appreciate the need for the audio restoration of historic archived
recordings.
• To outline the processes involved in audio restoration, and to give
practical experience of these processes.
• To highlight the progress of new technologies in the music
business.
• To describe the technologies involved in protecting the rights of
musicians and performers.
1 INTRODUCTION
Welcome to Chapter 5 which looks at the ways in which technology

has influenced the music industry and how this has begun to change
the way we listen to music and buy records.
The photograph in Figure 1 shows a collection of the different media
used to distribute recordings. They are, in historical order, a 78 rpm
disc, a long-playing disc (LP), a 45 rpm disc, a quarter inch magnetic
tape, a compact cassette and finally a compact disc (CD). What is
slightly unusual about the picture is that each format shown contains
the same song from one of two recorded performances of At the Drop
of a Hat by Michael Flanders and Donald Swann made between 1957
and 1959. I’m using this collection here simply to demonstrate the
many formats which have been used to make recordings available to
consumers. Each format has, in its turn, offered the consumer some
perceived advantage in terms of cost, convenience, sound quality or
capacity (playing-time), though not necessarily all at the same time.
Figure 1 The same performance on a variety of media (items labelled in the
photograph: 78, LP, 45, magnetic tape, compact cassette and CD)
This chapter opens with a brief history of the recording industry from
its beginnings at the end of the nineteenth century. Step changes in
technology will be highlighted in a story which is often as much about the
people who built the industry and the recordings they made as about the
technologies that were developed and used. During this discussion you
will be asked to appraise the different audio technologies mentioned by
comparing their frequency responses, signal-to-noise ratios and dynamic
ranges. You will also consider convenience of use of the various media
types shown in Figure 1. This will allow you to develop an insight into
why some technologies have succeeded and others fallen by the wayside.
You also will discover that absolute audio fidelity is not always
uppermost in the minds of either designers or the consumers.
Changes in technology not only affect the recording media and equipment
but also the way in which we listen to the music. A case study of a
personal jukebox will be used to illustrate this. This device offers the
listener immediate access to hundreds of hours of recordings as opposed
to between five and sixty minutes offered by the formats mentioned
above. There is also a brief look at surround sound technologies.
The ability to record sound and in particular digital recording
technologies have had an enormous impact upon the legal aspects of the
music industry. Since the invention of the tape recorder consumers
have been able to make copies of original recordings for themselves
and others, but the relatively high cost of the equipment combined
with inferior reproduction did not overly worry the music industry’s
executives and lawyers. However, as digital copies can be near perfect
facsimiles of the original digital master recording and made with
relatively low copying equipment costs, the music industry has had to
develop ways to prevent us from making too many copies. The last
section of this chapter considers some of the legal aspects of relevance
to the music industry and introduces some of the initiatives that are
being considered and implemented to control our access to the
recording. The fluid nature of these initiatives means that, to ensure
currency, this discussion will be led through a selection of articles.
2 REVOLUTIONS IN SOUND RECORDING – A SHORT HISTORY
2.1 Introduction
In Chapter 7 of Block 2, you made various recordings of yourself
making various vowel sounds. But did you really listen to your voice
rather than concentrating on the vowel sound, formants etc. that you
were producing?
In this first activity, I want you to make a short recording of your voice.
ACTIVITY 1 (PRACTICAL, COMPUTER)
Using one of the course’s sound software packages, whichever you

find easier, make a recording of yourself reciting the following well-
known children’s nursery rhyme:
“Mary had a little lamb, Its fleece was white as snow.
And everywhere that Mary went The lamb was sure to go.”
The reason for this choice of rhyme will become clear in a moment.
Play the recording back to yourself. How do you sound? Do you think
your voice sounds like you?
Comment
Well, do you think your voice sounded like you? Probably not. As you
may recall from Section 5.4 in Chapter 7 of Block 2, the directivity of
the voice means that you do not hear it as others do, and so it
sounds unrealistic to you in a sound recording. However, because you
know that the technology you are using is capable of accurately
reproducing sounds I hope you have confidence that what you are
hearing accurately represents how you sound.
The first sound recording of a human voice, actually reciting the nursery
rhyme Mary had a little lamb, was made over 125 years ago. Just imagine
back then the reaction of people the first time they heard the sound of a
human voice coming from a machine – especially if it was theirs!
“The phonograph [recording machine] was remarkable partly
because it did not look human – it spoke just like a person, but it
looked like a machine, a simple cylinder of tinfoil.”*
So great was this invention and so insatiable was (and still is) our need to
hear recorded sounds – especially music – that within 25 years sound
recording had become a global industry.
Before sound recording was possible few people had the opportunity
to hear music in the way we take for granted today. Apart from
expensive musical boxes and the mechanical music players that were
introduced in Chapter 3 of this block, the only way music could be
heard was in live performances. Take a moment to think how your life
would be without being able to listen to music from CDs, records,
tapes, radio, television or even the Web. How often would you listen to
music if you could only hear it by attending live performances or
making it yourself? The following activity asks you to think about
listening to music before sound recording was invented.
Think of how people listened to music before the advent of sound
recording. Try to put yourself in their place and make a list of the
various ways in which you might hear music. Is there a common
thread that you can discover about the experience?
Comment
I thought of the following:
• places of religious worship (singing hymns, listening to the organ, etc.);
• at school (nursery rhymes, group songs and dance);
• in the home (barrel organ, musical box or player piano);
• live concerts (listening to the band in a local park, going to the
music hall, a classical concert or a musical theatre performance);
• dancing (to music from local bands).
A common thread that occurs to me is that on many occasions music
was created by people (amateurs rather than professionals) meeting
together – at church, school or the local public house for example.
Most of the music was live, with just the possibility of hearing a
mechanical instrument such as a barrel organ.
Another very significant factor is that, as you have just seen in
Activity 1, recordings have allowed performers to hear what they
sound like to other people. In his book**, Robert Philip says:
“Musicians who first heard their own recordings in the early years of the
twentieth century were often taken aback by what they heard, suddenly
being made aware of inaccuracies and mannerisms they had not suspected.”
*Wood, G. (2002) Living Dolls, London, Faber and Faber, p. 121.
**Philip, R. (2004) Performing Music in the Age of Recording, Yale University
Press, p. 25.
He goes on to argue that the effect of recording has been to change
many features of general performing style. Before recording was
possible, when a single performance of a work might be the only
chance many listeners would have to hear a particular work,
performers tended to play in a way that would now seem exaggerated
and mannered, such was their determination to underline the music’s
expressivity and its major structural features. This style of performing
survived into the twentieth century, and can be heard in many early
gramophone recordings. However, the spread of recordings, and the
opportunity they afford for the listener to become familiar with a large
body of repertoire works, has fostered a performing style that is less
concerned with pointing up salient details of the work (which many
listeners now know from recordings) than with eliminating
mannerisms and imperfections from the performance. Philip argues
that in the age of recording, performers have become especially
concerned with precision and accuracy, whereas performers (and
listeners) of the past placed less value on these qualities.
You may recall from the very beginning of the course, in Chapter 1 of
Block 1, that sounds from any source are conveyed to our ears by small
variations in air pressure. These variations, which are captured by the
eardrum, cause impulses to be sent to the brain allowing us to make
sense of the sound. We cannot store the sound but our memory allows
us to recognise it if it reoccurs. From Chapter 6 of Block 1 you may also
remember that to capture sound we use a microphone. This also contains
a membrane, called a diaphragm, that responds to the variations in air
pressure by generating tiny electrical impulses which may be amplified
and recorded (or stored) by a suitable medium. Playing back the
recording, again by using a diaphragm but this time a cone in a
loudspeaker, regenerates these variations in air pressure with sufficient
energy so as to act on the eardrum in the same way as the original
sound. This makes the recognition of the sound possible, as if it came
from the original source. In order for us to believe we are listening to
the original sound the recording and playback systems must not
distort the original signal in any way. The next activity revises the
three main parameters that determine audio system quality, and these
will be used throughout this chapter as a means of comparing the
quality of the various different audio systems that are discussed.
As you have found out in earlier parts of the course, bandwidth,
dynamic range and signal-to-noise ratio are three parameters that can
be used as a measure of the quality of an audio system. What do you
understand by each of these terms, and what is considered a rough
acceptable value of each of them for a good quality audio system
designed to play CDs?
Hint: If you are unsure of any of the rough values, you may like to refer
back to Chapter 6 of Block 1.
As you will discover, the technologies used within the record industry
have not always been capable of delivering sounds from systems with
ideal characteristics for recording and playback. In fact, user convenience
can be as important a consideration as the fidelity of the reproduced
sound.
2.2 Cylinders or plates?

“I had a little gramophone; I’d wind it round and round, and with
a sharpish needle it made a cheerful sound.”*
2.2.1 Edison starts with cylinders

In 1877 the young American inventor Thomas Alva Edison finally
completed development of an invention capable of capturing, recording
and playing back sounds. Edison called it the phonograph, from the Greek
meaning ‘sound-writer’ and it is pictured with the inventor in Figure 2.
Figure 2 Edison with his phonograph
As is often the case with truly great inventions Edison was not the
only inventor working independently on recording sounds. In April
1877 a sealed letter was deposited at the Académie des Sciences in
Paris by an impoverished French poet and amateur scientist, Charles
Cros. The contents described an apparatus that:
“consists in obtaining traces of the movements to and fro of a
vibrating membrane and in using this tracing to reproduce the same
vibrations, with their intrinsic relations of duration and intensity,
either by means of the same membrane or some other one equally
adapted to produce the sounds which result from this series of
movements.”**
Unfortunately Cros could not afford to patent his idea and it was Edison
who, in the late autumn of 1877, filed for a US patent on his phonograph.
The two inventions differed in their details: for example, Cros
proposed a glass disc whilst Edison actually used a tin-foil cylinder.
*Flanders, M. and Swann, D. (1977) ‘The Song of Reproduction’ from The
Songs of Michael Flanders and Donald Swann, London, Elm Tree Books
and St George’s Press, p. 99.
**Gelatt, R. (1977) The Fabulous Phonograph, London, Cassell & Company, p. 23.
Listen to the audio track associated with this activity. It is a recording
of Edison speaking the nursery rhyme Mary had a little lamb made in
1927, 50 years after he made the original recording in 1877. None of
his original 1877 recordings have survived.
The sound quality in the clip of Edison speaking you have just
listened to is not very good in comparison to what we have come to
expect today. This is because the system used an acoustic recording
method, described in Box 1 ‘Sounds on cylinders’.
Box 1 Sounds on cylinders

Edison’s method of recording and playback used an acoustic (or
mechanical) process to record sounds onto tin-foil as illustrated in
Figure 3. A stylus, a small pointed stem of diamond or sapphire, was
coupled to a diaphragm that together comprised the sound-box. A conical
horn, attached to the sound-box, amplified the sound vibrations. The
sideways movement of the sound-box was controlled by a feed-screw that
turned when the cylinder was rotated. To record a message the cylinder
was turned while you shouted into the horn. Sounds with sufficient energy
caused the stylus to vibrate vertically and cut a groove with a profile
that undulated in sympathy with the vibrations. On playback a stylus,
again controlled by a feed-screw, followed the original track. The
undulations in the groove picked up by the stylus set the diaphragm
vibrating which, once magnified by the horn, recreated the sounds.
Unfortunately tin foil was so soft that replaying the message destroyed
the undulations in the groove so the sounds could only be played back
once.
Figure 3 The acoustic recording and playback process
Run the computer animation associated with this activity. This animation
demonstrates Edison’s mechanical recording and playback process.
It is based on his original design for the phonograph which was
patented in 1878.
What does the fact that a person had to shout into the horn of the
recording machine, as described in Box 1, tell you about the sensitivity
of Edison’s apparatus?
Comment
The fact that a person had to shout indicates that the recording machine
was very insensitive. This was due to the mechanical stiffness (inertia)
of the mechanism that cut the groove into the recording medium, which in
this case was tin-foil. This had a direct effect on the frequency response
and dynamic range of mechanical recording machines.
An article in the Scientific American of 22nd December 1877 described
a visit by Edison to their New York office with his phonograph.
“Mr. Thomas A. Edison recently came into this office, placed a
little machine on our desk, turned a crank, and the machine
inquired into our health, asked how we liked the phonograph,
informed us that it was very well, and bid us a cordial good night.
These remarks were not only perfectly audible to ourselves, but to
a dozen or more persons gathered around…”
By mid-1878 Edison had produced several versions of his phonograph,
even experimenting with tin-foil discs, which he abandoned as he
found the quality of reproduction deteriorated towards the centre.
Although the invention was still far from perfect, the North American
Review of June 1878 printed Edison’s ten uses for it.
1 Letter writing and all kinds of dictation without the aid of a
stenographer.
2 Phonographic books which will speak to blind people without
effort on their part.
3 The teaching of elocution.
4 Music – the phonograph will undoubtedly be liberally devoted to
music.
5 The family record – preserving the sayings, the voices and the last
words of the dying members of the family, as of great men.
6 Musical boxes, toys, etc. – a doll which may speak, sing, cry, or laugh
may be promised our children for the Christmas holidays ensuing.
7 Clocks that should announce in speech the hour of the day, call you
to luncheon and send your lover home at ten!
8 Preservation of language by reproduction of our Washingtons, our
Lincolns, our Gladstones.
9 Educational purposes – such as preserving the instructions of a
teacher so that the pupil can refer to them at any moment; or learn
spelling lessons.
10 The perfection or advancement of the telephone’s art by the
phonograph – making that instrument an auxiliary in the
transmission of permanent records.
All of the above ideas have been developed in one way or another.
Which of the above do you consider have benefited the most from the
advances in digital audio technologies described in Chapter 4 of this
block?
Comment
Although all the ideas have benefited in one way or another, the
advances have been in different ways. Not all these ideas require the
wide frequency response, high dynamic range and high signal-to-noise
ratio demanded from high quality audio systems for music
reproduction. Digital voice recorders for dictation, talking books,
and multimedia in general (particularly in language education) have
exploited the convenience of digital audio compression techniques rather
than striving for ultimate sound quality. The same trade-off has been
exploited in telephone answering machines and voice storage systems,
where audio quality is not as important as storage capacity.
Perhaps surprisingly, a miniature phonograph was developed and installed
inside a toy doll as mentioned in Item 6 above. Figure 4(a) is a photograph
of such a phonograph beside the doll in which it was used. Surely this
must be the forerunner of the talking greetings card which contains a
small circuit board and loudspeaker as shown in Figure 4(b).
Figure 4 (a) A miniature phonograph alongside a toy doll in which it was used; (b) the digital
audio circuit board and loudspeaker from a modern talking greetings card
Strangely, Edison did not progress the development of his phonograph
at this time. Following a visit to view an eclipse of the sun in the state of
Wyoming, USA in 1878, he cast aside his work on recording sounds and
devoted all his energies to the development of the electric lamp. (I wonder
if this was due to the darkness created by the eclipse?)
2.2.2 Bell and Tainter improve the phonograph

If Edison was not willing to continue development of the phonograph
then others were. Alexander Graham Bell, who had risen to prominence
through his invention of the telephone, took a great interest in recording
sounds even suggesting to Edison that they might collaborate. Edison
refused so Bell set about developing a recording machine with the
assistance of his cousin Chichester Bell, a chemical engineer, and
Charles Tainter, a scientist and instrument maker. By 1887 Bell and
Tainter had succeeded in producing a recording machine which they
called the graphophone (meaning sound-pencil). The graphophone
was largely similar to the phonograph but in place of tin-foil they used
cylinders of hard wax coated onto cardboard sleeves as the recording
medium. This was a great technical advance for, not only did it give
much greater quality of reproduction, but it also allowed the recording
to be replayed many times. Unfortunately, the low sound level when
played back necessitated the use of ear tubes, as illustrated in Figure 5.
Figure 5 A graphophone being used for
dictation by a 19th century ‘audio-typist’
Given the fact that wax was easier to cut than tin-foil why do you think
wax cylinders offered an improved sound quality?
Comment
The sound quality was improved because there was less stiffness
(inertia) in the recording system. This gave greater sensitivity as well
as an improved frequency response and a higher signal-to-noise ratio.
These improvements were only marginal compared with today’s
technologies but nevertheless were a distinct advance.
With his work on the electric lamp successfully completed, Edison
returned to what he always called his favourite invention, insisting
that the phonograph had never been far from his thoughts. What
actually came out of his laboratory, which he called his ‘Perfected
Phonograph’, looked remarkably similar to the graphophone. However
Edison used a solid wax cylinder rather than a wax-coated cardboard
sleeve. This worthwhile improvement allowed the surface of the
cylinder to be shaved, so erasing the original recording to allow the
cylinder to be used again. Bell and Tainter quickly adopted solid wax
for their graphophone. The litigious business society sat back, rubbed
their hands, and waited for a bitter fight between the two companies
over the patent rights!
However, the expected law suits did not arise. Firstly, the wording of the
two patents was different. Edison described his recording technique as
‘embossing or indenting’ the medium, whereas the Bell–Tainter patent
portrayed their way of recording as ‘engraving’, implying a different
approach. Secondly, a businessman named Jesse H Lippincott, who
was looking to invest cash in a new venture, offered to invest in
both inventions thus securing sole rights to sell recording machines
through his ‘North American Phonograph Company’.
The whole enterprise nearly failed for, just like Edison and Bell,
Lippincott saw the future of the recording machine for dictation in
businesses such as government bureaux and offices. Actually
shorthand-typists did not appreciate this use of the machine, seeing
it as a threat to their jobs. They even went as far as sabotaging the
machines, making them useless for dictation. After two years of
unsuccessful trading Lippincott, now in ill-health and with poor
finances, lost control of his company to Edison who still only saw
business potentials for his invention.
“He could not or would not countenance the potentialities of the
phonograph as a medium for entertainment.”*
Fortunately for Edison the Columbia Phonograph Company, one of his
subsidiaries, recorded popular songs of the day on cylinders. They
could be played back using specially adapted ‘coin-in-the-slot’
phonographs which were situated in public places such as drugstores and
saloons. Popular songs could be heard ‘for a nickel a time’, as
illustrated in Figure 6. Their popularity demonstrated a public demand
for recorded music.
Figure 6 Entertainment in 1891, ‘for a nickel a time’
Edison was finally convinced of the phonograph’s possibilities for
musical reproduction, but as ever he wanted to do it his own way, and his
first move was to liquidate his existing phonograph companies. This
cleared the way for a coalition between two rivals, Bell and Tainter’s
American Graphophone Company and the now independent Columbia
Phonograph Co. Working together they were soon able to offer a clockwork
powered graphophone for $50 with a catalogue of over a thousand
pre-recorded cylinders. By Christmas 1897 they were selling a $10
clockwork powered graphophone reproducing music “as loudly and
brilliantly as the highest price machine”. One similar to the original
graphophone is illustrated in Figure 7.
To compete Edison offered his clockwork powered ‘Home Phonograph’
for $20 (see Figure 8) which, apart from minor modifications, sold for
over 30 years.
* Gelatt (1977), p. 44. See page 81.

Figure 7 A clockwork cylinder player
Figure 8 The Edison Phonograph
In this activity you will assess the ‘quality’ of a hard wax cylinder
recording made in 1902 by looking at its bandwidth, dynamic range
and signal-to-noise ratio. This is the first of a number of computer
activities where you will use the course’s sound editing software to do
this. As each of the activities follows the same procedure and involves
features of the sound editor you have not used before, you will find a
single set of steps on how to carry them out in the Block 3 Companion.
You should therefore run the course’s sound editing software, open the
computer sound file associated with this activity and follow the steps in
the Block 3 Companion on how to determine the quality of a recording.
Comment
I obtained the following characteristics for this recording:
Characteristic           Estimated value
Bandwidth                200 Hz – 3.5 kHz
Dynamic range            30 dB
Signal-to-noise ratio    32 dB
Compared with the values of a modern sound system that you should
have suggested in Activity 3, these values are poor. Do bear in mind,
however, that the users had no other systems by which to make
comparisons. Also, as a new technology, the phonograph was still one
of the miracles of the age.
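To get a feel for what the decibel values in the table mean, it can help to turn them back into plain ratios. The short sketch below is not part of the course software; it simply applies the standard decibel definition (20 log10 of an amplitude ratio) to the two figures estimated above. The function name is illustrative only.

```python
def db_to_amplitude_ratio(level_db):
    """Convert a level difference in decibels to an amplitude ratio.

    Uses the standard definition dB = 20 * log10(amplitude ratio).
    """
    return 10 ** (level_db / 20)


# Values estimated for the 1902 wax cylinder in the comment above.
measured_db = {
    "dynamic range": 30,
    "signal-to-noise ratio": 32,
}

for name, level_db in measured_db.items():
    ratio = db_to_amplitude_ratio(level_db)
    print(f"{name}: {level_db} dB is an amplitude ratio of about {ratio:.0f}:1")
```

So a 30 dB dynamic range means the loudest recordable sound had only about 32 times the amplitude of the quietest usable one.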
2.2.3 Berliner experiments with plates

Emile Berliner was a young German immigrant to the USA with an
interest in science. Whilst working in several menial jobs he educated
himself in basic physics and chemistry eventually building a small
laboratory at his boarding house. Experiments with electricity and
acoustics led to his invention of a new telephone transmitter which he
sold, enabling him to set up as a full-time inventor. He became
interested in recording sound through studying a device called the
phonoautograph. This apparatus inscribed sound vibrations as a
lateral trace onto lamp-blacked paper using a diaphragm and stylus.
Berliner thought that this lateral motion could offer superior recording
quality to Edison’s vertical method (see Box 2 ‘Cutting the groove’). He
also decided to use a disc, which he called a plate, rather than a
cylinder as the recording medium. The plate was placed onto a
turntable over a central spindle that fitted into a hole in the middle of
the plate. The turntable was then rotated at a fixed speed. This overall
design was sufficiently different from the phonograph to allow it to be
patented in 1887 using the name gramophone.*
Berliner made his plates from a tough rubber-based compound called
vulcanite. This allowed the groove to be deep enough to guide the sound
box, eliminating the need for the phonograph’s
complex feed-screw mechanism. The deep groove also allowed cheap,
replaceable steel needles to be used in place of the delicate jewel
stylus found in the cylinder machines. This made the gramophone,
illustrated in Figure 9, cheaper to manufacture than the competition.
In 1894 Berliner’s ‘United States Gramophone Company’ released
their first single-sided (which means the sound was recorded on just one
side) 7 inch (18 cm) diameter disc. An important point to note was that
unlike its rivals, the gramophone had no means of recording sounds –
it was designed from the outset only to play back pre-recorded sounds.
This demonstrated a high degree of faith by Berliner that people would
be happy just to listen to sounds (and music in particular) in their
own homes.
Figure 9 The Berliner ‘Seven Inch’ hand-cranked gramophone
(a) From reading Box 2, what restrictions do you consider have been
placed on the ultimate sound quality of cylinder and disc
recordings in terms of bandwidth and dynamic range?
(b) Why were these restrictions employed?
*In the USA, phonograph is the generic name given to all record playing
equipment. In the United Kingdom gramophone is more generally used,
although phonograph may be used when referring to cylinder players.
Box 2 Cutting the groove

The vertical (up and down) cutting method, which was nicknamed
‘hill-and-dale’, shown in Figure 10(a) was invented by Edison. The lateral
motion (side to side), developed by Berliner is shown in Figure 10(b).
In both cases the undulations in the groove are directly analogous to the
sound vibrations.
Figure 10 Stylus motion in a recording, (a) vertical, (b) lateral
The phonograph pickup was positioned by a feed-screw so if the groove
disappeared, as it sometimes did, the position of the pickup was not
affected. On the other hand the gramophone pickup was simply guided by
the groove which in consequence had to be much deeper to avoid the pickup
skating across the disc’s surface. Figure 11 compares an actual vertical
cut phonograph cylinder in (a) with an old lateral cut 78 shellac disc in
(b) and a vinyl LP disc in (c). Note the difference between the vertical
and horizontal undulations and also the depth of the groove in (a) and (b).
Figure 11 The grooves on (a) an Edison cylinder, ca. 1910, (b) a 78 rpm
shellac disc, ca. 1930, (c) a vinyl LP disc, ca. 1970 (all media to the
same magnification)
When cutting the groove in the recording medium the level of the
captured sound directly affects the deflection of the cutter. A loud sound
results in a large deflection causing a deep cut using vertical recording
or a wide horizontal deflection for lateral recording. When recording
on discs the engineer had to ensure that loud passages did not cause the
lateral cutter to make such a wide deflection that it broke through the
groove wall of an earlier part of the recording. The groove spacing
would be set sufficiently wide to ensure that this would not happen.
However this caused other problems because the wider spacing reduced the
recording time. In the early days of disc recording, singers might be
asked to turn their heads away from the horn or microphone during loud
passages to ensure their voice kept the cutting stylus displacement to a
reasonable level. This could be thought of as a very crude example of
compression, the technique introduced in Chapter 1 of this block, which
is used to ensure loud sounds do not cause distortion.
Cylinders had a standard groove pitch of either 100 or 200 tracks to the
inch (40 or 80 per cm) depending on the playing-time. Because lateral cut
groove spacing depends upon the amplitude of the sounds to be recorded as
explained above, the spacing had to be individually set for each disc but
on average it was 70 tracks per inch (30 per cm).
Later record cutters offered automatic dynamically variable spacing (the
groove spacing was varied within a disc as the sound got louder or
softer). This allowed a closer groove pitch for quiet sounds whilst
avoiding the likelihood of a loud sound breaking the wall of the groove
and therefore maximises the playing time. This can be seen in spacing of
the grooves of the LP shown in Figure 11(c).
As the recording medium (i.e. the cylinder or disc) is turned a linear
groove is cut. The maximum linear speed (surface speed) of the medium
relative to the stationary cutting head determines the highest frequency
that can be recorded without distortion. The cutter makes a wave with a
length that is a function of the linear speed and sound frequency. The
linear speed of a cylinder remains constant at 44 cm/s. However, because
a groove around the outside of a disc is much longer than a groove
towards its centre, the linear speed varies over the surface of a 78 rpm
disc from 120 cm/s at the edge down to 44 cm/s at the middle (remember
that the rotational speed is constant). The corresponding wavelength of
a 1 kHz sine wave recorded on a 78 rpm disc thus varies from 1.2 mm down
to 0.44 mm. The diameter of the pickup stylus or needle also determines
the upper frequency response. This is because a blunt needle will not sit
right at the bottom of the V-shaped disc groove, and will therefore not
follow very high frequency ‘wiggles’ in the groove, whereas a sharper
needle will fit right into the bottom of the groove, and so will be able
to follow the fast lateral groove movements much better.
Comment
(a) The linear speed controls the upper bandwidth frequency. If the
cylinder or record had revolved faster, the linear speed
would have increased and with it the upper frequency response.
In the case of discs the track spacing depends on the dynamic
range, as louder sounds require a greater deviation of the groove
and therefore an increased track spacing.
(b) In both cases the recording-time (capacity) of the recording
medium would be decreased so shortening the playing-time. A
trade-off between playing-time and sound quality had to be
made.
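The relationship between linear speed, frequency and recorded wavelength described in Box 2 can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are made up for this example, and the radii it prints are derived from the quoted surface speeds using the circumference formula rather than taken from the text. The 5 kHz tone is likewise an arbitrary illustrative value.

```python
import math


def groove_wavelength_mm(linear_speed_cm_s, frequency_hz):
    """Length of one recorded cycle along the groove, in millimetres."""
    return linear_speed_cm_s / frequency_hz * 10.0  # cm -> mm


def radius_for_speed_cm(linear_speed_cm_s, rpm):
    """Radius at which a disc turning at `rpm` has the given surface speed."""
    revolutions_per_second = rpm / 60.0
    return linear_speed_cm_s / (2.0 * math.pi * revolutions_per_second)


# Surface speeds quoted in Box 2 for a 78 rpm disc.
positions = (("edge", 120.0), ("middle", 44.0))

for label, speed in positions:
    radius = radius_for_speed_cm(speed, 78)
    print(f"{label}: {speed:.0f} cm/s, radius about {radius:.1f} cm")
    for frequency in (1000.0, 5000.0):  # 5 kHz is an illustrative value
        wavelength = groove_wavelength_mm(speed, frequency)
        print(f"  {frequency:.0f} Hz -> wavelength {wavelength:.2f} mm")
```

For the 1 kHz tone this reproduces the 1.2 mm and 0.44 mm figures given in Box 2, and it shows how much shorter the 'wiggles' become at higher frequencies, particularly near the middle of the disc.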
Listen to the two audio tracks associated with this activity. The first track
contains an original recording by Emile Berliner and this is followed in
the second track by a repeat of Edison’s recording from Activity 4.
How do you think Berliner’s recording compares with that of Edison? Can
you think why both men should have chosen to recite nursery rhymes?
Comment
The recording by Berliner was taken from an original 5 in (13 cm)
diameter vulcanite disc made in 1889. I think you will agree that the
reproduction is poor compared with the recording of Edison. However
remember that Edison’s recording was made in 1927 as a demonstration
of his original invention. I wonder why both Edison and Berliner recited
nursery rhymes? Perhaps neither could think of anything more propitious
to say at the time, but bearing in mind the poor quality of the sound, it
may have been that using well-known rhymes would make the recording
more comprehensible. Maybe it was an early marketing ploy.
(a) Which cutting method, vertical or lateral, would give the recording
engineer the opportunity to record the louder sound?
(b) Why was it advantageous to record sounds at the highest possible
level in these early recordings?
2.2.4 Making multiple copies

Berliner was aware that Edison had problems duplicating cylinders.
Initially copies were made from a master cylinder using a mechanical
engraving process. Unfortunately this method caused the master
cylinder to wear out after making just a few copies so performers had
to be asked to record several masters to ensure enough cylinders could
be duplicated. An improved recording system allowed multiple master
cylinders to be made by feeding several recording phonographs from
one horn, but the cylinder copying process was still far from satisfactory.
It took Berliner six years to perfect disc duplication but it was time well
spent for the principles are still used today to manufacture compact discs.
Box 3 compares the manufacturing processes for cylinders and discs.
Box 3 Cylinder and disc manufacture

A process to mass-produce cylinders was finally developed in 1901. Cylinders
were moulded using a hard black wax medium which reduced playback wear.
The process was known as ‘Gold-Moulded’ because a gold coloured vapour was
given off during the process. Sub-master moulds were created from the master
cylinder, and the wax cylinders manufactured from these moulds. This meant
that the artist only had to record a single master cylinder. About 150 cylinders
a day could be produced from a single mould.
Disc manufacture using the Berliner process started with the creation of a
master disc, known as a matrix. This was a wax coated metal disc into which
the artist cut the recording. Each master disc was inscribed with an identification
number, known as the matrix number, which appears near the centre of every
record near the label. The master matrix was then used to make moulding
tools, known as matrix stampers, using an electroplating process which
deposited a layer of metal onto the master. When separated from the master
the stamper became a negative replica of the master. The stamper was fitted
into a hydraulic press, along with an identification label and the disc material.
Once perfected up to three discs per second (180 per minute) could be pressed
in this fully automated process. Early discs were stamped just on one side but
eventually double-sided discs were developed which played on both sides.
Unfortunately Berliner’s discs, one of which is illustrated in Figure 12, were of
variable quality tending to have flat-spots in the groove which caused the
pickup to skate across the disc surface (remember the groove positioned the
pickup). From 1897 a hard-wearing compound of shellac*, slate powder and
carbon-black was used. This improved the quality and lowered production costs.
The abrasive nature of the slate dust sharpened the playing needle to ensure a
continued good fit in the groove (so maintaining the high frequency response)
but at the expense of users regularly having to replace worn-down needles –
another source of income to record manufacturers! Shellac discs gave an
acceptable surface noise and were easily mouldable at relatively low
temperatures. At room temperatures shellac discs were brittle and shattered
if dropped. Eventually, a relatively unbreakable plastic material called vinyl
(short for polyvinyl chloride) was used to manufacture discs.
Figure 12 An example of Berliner’s vulcanite disc
*Shellac is a resin derived from the secretion of the Lac beetle (Coccus lacca)
found in Malaysia.
When a recording company makes a decision to issue a CD of a recording
originally released on a 78 rpm disc, the remastering engineer prefers
to get hold of the matrix and make the transfer from that rather than a
normal copy of the released disc. Why should this be the case?
Comment
As you will recall from Chapter 4 of this block, every time an analogue
recording is copied there is additional noise added to the original sound,
lowering the signal-to-noise ratio. Using the original matrix will ensure
the best possible sound. Unfortunately access to the original matrix is
not always possible and many reissues of original 78 recordings use
normal mass-produced discs. The topic of remastering is covered later
in this chapter.
2.2.5 Turning the handle

The owners of the original hand-cranked gramophones were instructed
that the standard velocity for ‘seven-inch plates’ was about 70 revolutions
per minute. The owner was also warned that failure to turn the plate at
the correct speed would lead to a lowering of the pitch if turned too
slow, or a raising of the pitch if turned too fast. It is doubtful if true
reproduction of the recorded sound was ever achieved by the owners
of these machines! A better power source was needed and as electric
motors were far too costly, a suitably powerful and inexpensive
clockwork motor was used. It was designed and built by Eldridge Johnson, a
craftsman with a passion for the gramophone who would later form
‘Victor Talking Machines’ and
‘Victor Records’, which would
become ‘RCA-Victor’. The clockwork
motor proved an immediate success
with Christmas 1896 seeing the
‘Berliner Gramophone Company’ of
Washington, DC ahead of all the
competition, with the factory being
unable to keep up with the demand.
By mid-1897 the ‘Improved
Gramophone’, with a new Johnson-
designed motor and sound box was
launched. This model was destined
to become one of the most familiar
icons in the recorded music field for
it was immortalised, along with a
small black-and-white fox terrier dog
called Nipper, in a painting by
Francis Barraud, shown in Figure
13. This painting was to become the
trade mark of the ‘HMV (His
Master’s Voice) Company’ in Europe
and ‘Victor Records’ in the USA.
Figure 13 Barraud with his painting of Nipper
entitled ‘His Master’s Voice’
The recording and playback speed would ultimately be standardised at
78 revolutions per minute (rpm), as explained in Box 4. Actually this
can never be taken for granted because recording speeds varied from
under 70 to over 80 rpm. To cater for these differences a speed
controller was fitted to most gramophones.
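The effect of playing a disc at the wrong speed can be put into numbers. The short sketch below assumes that pitch simply scales with the ratio of playback speed to recording speed and expresses the shift in cents (100 cents to a semitone). The function name is my own and is only illustrative.

```python
import math

def pitch_shift_cents(playback_rpm, recorded_rpm):
    """Pitch change in cents when a disc cut at recorded_rpm is played at playback_rpm.

    Assumes pitch is directly proportional to turntable speed.
    Negative values mean the pitch is lowered, positive values that it is raised.
    """
    return 1200 * math.log2(playback_rpm / recorded_rpm)

# A disc cut at 78 rpm played at the two extremes mentioned in the text.
too_slow = pitch_shift_cents(70, 78)
too_fast = pitch_shift_cents(80, 78)
print(f"70 rpm: {too_slow:.0f} cents, 80 rpm: {too_fast:.0f} cents")
```

Played at 70 rpm the music drops by nearly two semitones, which is why a speed controller was worth fitting.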
Disc diameters also varied but 7-inch (18 cm) playing for 2 minutes,
10-inch (25 cm) playing for 3 minutes, and eventually 12-inch (30 cm)
playing for up to 5 minutes became standard. Eventually recordings
were put on both sides of the disc, known then, as now, as the A and B
sides, offering better value and greater convenience to users.
Box 4 Why was 78 chosen?

Discs revolving at 100 rpm would have given a better sound through improved
bandwidth but would have shortened the playing time of the disc to less than
Edison’s original 2-minute cylinder. 40 rpm could have increased the playing
time but would have offered a poor sound quality compared with cylinders. 70
to 80 rpm was a compromise offering a reasonable sound and an acceptable
playing time. With the introduction of mains powered synchronous electric
motors a speed of 78 rpm became standard. This is because for these types of
motor, the rotational speed is dependent on the mains supply frequency. With
78 rpm, simple reduction gearing could be used between the motor and the
turntable that required minimal changes when converted from the European
50 Hz to the North American 60 Hz mains frequency supply or vice versa.
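The gearing argument in Box 4 can be illustrated with a little arithmetic. The sketch below assumes a two-pole synchronous motor (3600 rpm on a 60 Hz supply, 3000 rpm on 50 Hz) and uses reduction ratios of 46:1 and 77:2. These ratios are commonly quoted figures, not values taken from this chapter, so treat them as assumptions.

```python
from fractions import Fraction

def synchronous_rpm(mains_hz, poles=2):
    """Speed of a synchronous motor in rpm: 120 x supply frequency / number of poles."""
    return Fraction(120 * mains_hz, poles)

def turntable_rpm(mains_hz, gear_ratio, poles=2):
    """Turntable speed after a reduction gear of the given ratio (motor turns per turntable turn)."""
    return synchronous_rpm(mains_hz, poles) / gear_ratio

# Assumed (illustrative) gear ratios for the two mains frequencies.
north_american = turntable_rpm(60, Fraction(46, 1))
european = turntable_rpm(50, Fraction(77, 2))
print(f"60 Hz supply: {float(north_american):.2f} rpm")
print(f"50 Hz supply: {float(european):.2f} rpm")
```

Under these assumptions both supplies give a turntable speed within about half a percent of 78 rpm, so only the gear ratio has to change between markets.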
In this activity you will look at the ‘quality’ of a 1902 gramophone
recording of Enrico Caruso by determining its bandwidth, dynamic
range and signal-to-noise ratio just as you did with the 1902
phonograph recording in Activity 9.
Run the course’s sound editing software, open the computer sound file
associated with this activity and follow the steps in the Block 3
Companion on how to determine the quality of a recording.
Comment
I obtained the following characteristics for this recording:

Bandwidth 250 Hz – 4 kHz
Dynamic range 30 dB
These values are very similar to those of Activity 9 which used
Edison’s cylinder system. The mechanical recording and replay
technology limited both frequency response and dynamic range and
provided a poor signal-to-noise ratio. I
2.2.6 Music matters

As you have discovered from Activities 9 and 14 there was little
difference in sound quality between the phonograph cylinder and the
gramophone disc. The limited frequency response of the acoustic
recording and playback process restricted the sounds that could be
reproduced. Instruments tended to be limited to brass and piano and
middle register voices (alto and tenor) were the most suitable. So why
did the disc succeed over the cylinder? The answer has little to do
with technology and much more to do with the tenor Enrico Caruso
and the entrepreneurship of one man, as you can read in Box 5.
Box 5 Recording Caruso

On the 18th March 1902 Fred Gaisberg, a senior representative of The
Gramophone Company, and the ‘father’ of recorded sound, set up an improvised
recording studio in a bedroom at the Hotel di Milano in Milan, Italy. The studio
was little more than a piano on packing cases and a gramophone recorder. In
the afternoon ten arias were recorded by a young and relatively unknown Italian
tenor, Enrico Caruso. His agent demanded £100 for the recordings which the
entrepreneur Gaisberg paid out of his own pocket. The Gramophone Company
had refused to pay, sending Gaisberg the following reply to his cable requesting
permission to record Caruso: “FEE EXORBITANT FORBID YOU TO RECORD”*.
It is generally agreed that these were the first completely satisfactory
recordings to be made. They were sold on premium 10-inch diameter ‘Red Label’
discs at ten shillings (50p) each. They were an immediate success. Caruso’s
voice had an ideal quality for the recording technology of the time and he is
considered to be the first serious musician to appreciate the value of
recordings. His later record of Vesti la giubba from Leoncavallo’s I Pagliacci,
made in November 1902, sold over a million copies. Figure 14 shows a self
caricature of Caruso – note the Gramophone Company’s logo on the wall!
Figure 14 Recording Enrico Caruso – a self caricature
Other singers were encouraged to record for the Gramophone Company
including, in early 1903, the tenor Francesco Tamagno, for whom Verdi had
created the operatic role of Otello. Tamagno insisted that his twelve-inch (30 cm)
records sold for £1 each, had a special ‘Tamagno Label’, and he received a 10
percent royalty on each record sold (Tamagno was the first artist to receive royalty
payments for recordings). This was followed a year later by two famous female
singers, Nellie Melba and Adelina Patti, who both insisted on their own labels,
mauve and pink respectively, and a premium selling price of one guinea (£1.05).
(At this time 30 to 35 shillings was a good weekly wage for a skilled craftsman.)
*Northrop Moore, J. (1999) Sound Revolutions, London, Sanctuary Publishing,
p. 92.
Listen to the audio track associated with this activity. It is a recording
of Enrico Caruso singing ‘Questa o quella’ from Verdi’s opera
Rigoletto. This was the second of ten recordings made by Fred
Gaisberg in April 1902. You may notice that Caruso clears his throat at
the end of the first verse – no editing facilities were available in 1902!
This recording has been restored by Ward Marston whom you will
read about later in this chapter. I
The reason why the records produced from the Caruso recordings were
so popular was that he was singing popular music of the time and that
the quality of his voice suited the technology. You may recall from
Chapter 7 in Block 2 that a trained tenor uses a singer’s formant to
emphasise voice partials between 1 and 3 kHz, a range that sits squarely
within the frequency response of a mechanical gramophone – as you found out in
Activity 14.
Edison had little to offer in the way of competition to Berliner’s ever
growing catalogue. He failed to make inroads into Europe and hence
record the popular artists of the time, who tended to live and perform
in that part of the world. Although the United States saw the origins of
the talking machine, Europe really transformed it into a musical
instrument by the choice of music and performers. Finally, in 1913
Edison introduced a disc, shown together with some of his cylinders
in Figure 15. Typically, it had the same vertical cut groove method
used on his cylinders, which made it unusually thick (6.5 mm) and of
course it needed a special Edison disc player. Despite offering a
superior sound Edison’s disc was too late – Berliner’s gramophone
records were too well established by virtue of the material they offered.
The Edison Company continued to supply cylinders and discs until
1929 when manufacture finally ceased.
Figure 15 Examples of Edison’s cylinders and discs
Suggest some reasons why the acoustic recording process limited the
types of instruments and voices used? I
2.2.7 Good times and bad

The music industry, like any other large industrial business, had good
times and bad times. By 1924 the burgeoning of radio broadcasting in
the United States of America caused a severe down-turn in record and
equipment sales leading to amalgamations and bankruptcies of many of
the record companies. Actually radio broadcast studio technology
proved of great importance to the record industry. The sensitive
microphones and electronic amplifiers used in broadcast studios offered
improved characteristics which were exploited in the record industry by
the development of an electro-magnetic cutting head by the American
company Western Electric. Unfortunately, they agreed to sell electric
recording technology (as it was known) only to American record
companies. Not to be out-done the far-sighted managing director of the
British Columbia Record Company went to America and bought the then
ailing ‘States-Side’ company of the same name along with an agreement
to use the new recording equipment, thereby securing the technology for
use in Europe. To capitalise on the new technology, players with electro-
magnetic pickups, using an opposite technology to the cutter, were
developed. Electric players rapidly replaced acoustic machines as they
were able to exploit the improved characteristics of the electric recordings.
In particular the new recordings were able to use electronic filtering or
equalisation to improve the replayed sound quality, as described in Box 6.
Box 6 Electro-magnetic pickups and equalisation

Audio signals are recorded as a lateral displacement or ‘wiggle’ in a linear spiral
groove cut into the disc, as described earlier. A sinusoidal wave in the groove, as
shown in Figure 16(a), will cause a voltage to be generated in the electro-magnetic
pickup coil, illustrated in Figure 16(b), due to the lateral motion of the stylus
following the wiggles in the groove. (The stylus in the pickup moves very freely
compared with the arm that is supporting it.)
Figure 16 (a) Sinusoidal signal in groove, (b) magnetic pickup (labels: groove wall, audio signal, magnetic material, coil of wire, magnet, stylus, groove)
The level of the generated voltage depends on how fast the stylus moves from
side to side. For a given signal level the amplitude of the groove ‘wiggle’ is
therefore inversely proportional to the frequency recorded in the groove, so as
the frequency rises the amplitude falls. The
maximum recorded level (the maximum amplitude of the groove ‘wiggle’) is
fixed firstly by the groove spacing (if the amplitude is too high the groove wall breaks) and
secondly by the ability of the stylus to follow the lateral movement in the groove
(too large a ‘wiggle’ at too high a frequency and the stylus will be unable to
follow the groove). The minimum recorded level is set by the noise in the
system (mainly the noise generated from roughness in the disc material). To
ensure that the amplitude of the groove ‘wiggle’ is reasonably constant over
all audio frequencies, the lower frequencies (below 500 Hz) are reduced and
the higher frequencies (above 2 kHz) are boosted when the groove is cut. This
means that when the record is played back the opposite characteristics must
be applied, i.e. the lower frequencies output from the pickup must be boosted
and the higher frequencies reduced. In the late 1950s an international agreement
was reached for the frequency characteristics of recording discs based on a
specification from the Recording Industry Association of America (RIAA).
Activity 3 should have reminded you that the frequency characteristics
of an audio system should be substantially flat for frequencies between
30 Hz and 30 kHz. If a disc was replayed through an amplifier with a
flat response, what would the sound be like?
Comment
The sound would be very tinny, thin and lacking in bass notes. This is
because when the disc is replayed by an electro-magnetic pickup the
voltage output at low frequencies is reduced in level but at high
frequencies is boosted. So to play any record using an electro-magnetic
pickup it is first necessary to set the replay amplifier characteristics to
match the RIAA equalisation. I
The ideal characteristic for such an RIAA replay amplifier is shown in Figure 17. The gain for frequencies
above the low corner frequency is reduced until the middle corner frequency is reached. The gain then
remains level until the high corner frequency is reached when the gain is further reduced. Since it is
not possible in practice to produce a response that has straight lines and sharp corners, Figure 17
also shows the actual response of a practical amplifier alongside this ideal response. This filtering
should therefore restore or equalise the levels of the audio frequencies to those of the original
performance.
Figure 17 Response of an RIAA replay amplifier (gain in dB plotted against frequency f/Hz from 20 Hz to 20 kHz, showing the ideal and actual responses and the low, middle and high corner frequencies)
Prior to the RIAA agreement manufacturers specified their own equalisation, examples of which
are included, along with the RIAA characteristics, in Table 1. The use of the RIAA characteristic
also produces a useful improvement in signal-to-noise ratio. By attenuating the higher frequencies
‘surface noise’ in the pickup output caused by dust and groove wear is reduced. However boosting
the lower frequencies means mechanical noise from the turntable and external electrical noise
(e.g. mains hum) can be increased.
Table 1 Equalisation characteristics required for 78 rpm and LP disc media
Recording system | Low corner frequency | Middle corner frequency | High corner frequency | Level at 50 Hz* | Level at 10 kHz*
HMV 78 | 50 Hz | 250 Hz | n/s** | +12 dB | n/s**
Columbia 78 | n/s** | 300 Hz | 1.6 kHz | +14 dB | –16 dB
RIAA LP | 50 Hz | 500 Hz | 2.12 kHz | +17 dB | –13.6 dB
* These figures are relative to the amplifier gain at a frequency of 1 kHz.
** These figures are not specified.
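The RIAA row of Table 1 can be checked with a short calculation. The sketch below models the replay characteristic as three simple first-order sections placed at the tabulated corner frequencies. That is an assumption about the shape of the practical curve, not a description of any particular amplifier, and the function names are my own.

```python
import math

# Corner frequencies from the RIAA LP row of Table 1, in Hz.
LOW_CORNER = 50.0
MIDDLE_CORNER = 500.0
HIGH_CORNER = 2120.0

def replay_gain(frequency_hz):
    """Linear gain of an idealised replay filter built from first-order sections (assumed model)."""
    fall_from_low = math.sqrt(1 + (frequency_hz / LOW_CORNER) ** 2)
    level_at_middle = math.sqrt(1 + (frequency_hz / MIDDLE_CORNER) ** 2)
    fall_from_high = math.sqrt(1 + (frequency_hz / HIGH_CORNER) ** 2)
    return level_at_middle / (fall_from_low * fall_from_high)

def gain_db_relative_to_1khz(frequency_hz):
    """Replay gain in dB, referred to the gain at 1 kHz as in Table 1."""
    return 20 * math.log10(replay_gain(frequency_hz) / replay_gain(1000.0))

for frequency in (50, 1000, 10000):
    print(f"{frequency} Hz: {gain_db_relative_to_1khz(frequency):+.1f} dB")
```

With this model the levels at 50 Hz and 10 kHz come out close to the +17 dB and –13.6 dB listed in the table, which suggests the tabulated corner frequencies and levels are consistent with one another.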
The computer sound file associated with this activity contains a
recording of a sound as though it had been taken straight off a disc
without RIAA equalisation being applied.
Run the course’s sound editing software, open the file and play it.
Notice how thin the sound is with the bass reduced and high
frequencies emphasised. You are now going to apply RIAA
equalisation to the sound as shown in Figure 17(b). To help you do
this, follow the steps associated with this activity in the Block 3
Companion. I
In this activity you will look at the ‘quality’ of an experimental 1932
disc recording just as you have done in Activities 9 and 14. This
illustrates the wide frequency range possible using electric recording
technologies and includes equalisation. The recording medium is
unknown but is probably shellac.
Run the course’s sound editing software, open the computer sound file
associated with this activity and follow the steps in the Block 3
Companion on how to determine the quality of a recording.
Comment
I obtained the following characteristics for this exceptional recording:
Frequency response 50 Hz – 9 kHz although there is evidence
of harmonics above this frequency
Dynamic range 50 dB
You will notice a considerable improvement in quality due to the use
of the electric recording process. Electro-magnetic transducers are
much more sensitive to higher frequencies as demonstrated by the
improved frequency response compared to the mechanical systems
used in Activities 9 and 14. I
The Great Depression of 1929 caused considerable losses with sales
dropping to a tenth of their previous value. One by one the record
companies went bankrupt or were taken over. In the UK in 1931 The
Gramophone Company and Columbia merged to become Electrical and
Musical Industries (EMI). By the end of the 1930s the market had begun to
rally, with the realisation that radio broadcasting could be used to
advantage through record promotions. In America jukeboxes, similar to
the one shown in Figure 18, flourished and by 1939 over 13 million
discs were sold just to stock the nation’s 225,000 jukeboxes!
By this time also many recordings now considered historically (and
musically) important had been made by composers such as Elgar,
Stravinsky and Rachmaninov. Many of the world’s finest performers
had also made recordings.
Following the Second World War (1939–1945)
demand for records increased dramatically
(supplies of shellac had been diverted to the
war effort so creating shortages of gramophone
records – indeed old records were recycled).
An example of the upsurge is demonstrated by
the figures for sales of an early recording of a
popular piano concerto which sold 102 copies
in 1935 and 62,756 copies in 1946.
Record popularity was due in no small part to
improvements in recording techniques. For
example an engineer at The Decca Company in
England developed an extended frequency
response as part of the war effort. The ‘full
frequency-range recording’ (ffrr) technique gave
a bandwidth from 30 Hz to 14 kHz ensuring
sounds of instruments included hitherto
unheard harmonics giving a much fuller sound.
Figure 18 A jukebox from 1939
Still not everyone was happy with a technology that, apart from improvements
in fidelity, had remained substantially unchanged since the early 1900s.
Record consumers were no longer satisfied with excerpts of symphonies or
musical works shortened to fit on one or two sides of a disc. Full scale
symphonies and choral works were
available as sets. For example Bach’s St. Matthew Passion (approximately
3 hours of music) came on eighteen double sided 12 inch records, but
listening to this work involved changing records 36 times, hardly
convenient or indeed conducive to a fine musical experience!
In 1948 Peter Goldmark, head of research at Columbia Records in America
demonstrated a 12-inch (30 cm) non-breakable microgroove vinyl disc
capable of playing 23 minutes each side. Columbia called it the LP (for
Long Playing) disc. It revolved at 33⅓ rpm with up to 300 tracks to
the inch (120 per cm). The rival company RCA-Victor seemed not to
be impressed with the LP. They responded with a 7-inch (18 cm)
microgroove vinyl disc which revolved at 45 rpm, the so-called 45, with a
similar track pitch to the LP and which played for up to 4 minutes. The
‘Battle of the Speeds’ commenced.
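The figures quoted for the two microgroove formats can be related by simple arithmetic. The sketch below assumes the groove is cut at a constant pitch equal to the quoted maximum of 300 turns to the inch and ignores lead-in and run-out grooves. It estimates how wide a band of the disc surface each side needs.

```python
def revolutions(playing_minutes, rpm):
    """Number of turns of the spiral groove needed for the given playing time."""
    return playing_minutes * rpm

def recorded_band_inches(playing_minutes, rpm, grooves_per_inch):
    """Radial width of the recorded band, assuming a constant groove pitch."""
    return revolutions(playing_minutes, rpm) / grooves_per_inch

lp_band = recorded_band_inches(23, 100 / 3, 300)   # LP: 23 minutes at 33 1/3 rpm
single_band = recorded_band_inches(4, 45, 300)     # 45: 4 minutes at 45 rpm
print(f"LP side: {lp_band:.2f} inches, 45 side: {single_band:.2f} inches")
```

On these assumptions an LP side needs a band about two and a half inches wide, which fits comfortably within the 6-inch radius of a 12-inch disc, while a 45 needs well under an inch.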
Can you suggest how the record buying public responded to these two
new ‘standards’?
Comment
The immediate response from the record-buying public was to stop
purchasing records until the outcome of the battle was decided! I
Fortunately for all the record companies a truce was declared by 1950,
with the 78 rpm disc the loser. The LP was adopted for classical
recordings and the 45 for popular music. In Europe the change took a
little longer, but by the end of 1952 LPs were available from European
manufacturers.
The LP is not quite the end of the story of the gramophone record. As
far back as 1931 the British engineer Alan Blumlein designed and
patented a stereophonic (from the Greek meaning solid sound)
recording system which used two sound channels to create a virtual
sound ‘stage’ where an individual sound source (instrument, voice,
etc.) could be located at any point between two loudspeakers placed at
the front left and front right of the listener. The location of the source
is determined by the relative intensity in the two channels. The patent
covered two possible ways of cutting the groove in the record to allow
two separate channels to be recorded. The V-L (vertical-lateral) method
combined hill-and-dale and lateral cutting systems whereas the 45/45
technique was similar except the cutter was tilted at an angle of 45° to
the surface of the disc putting the stereo signal into the groove walls,
as illustrated in Figure 19.
Figure 19 The V-L and 45/45 recording techniques compared (panels labelled V-L and 45/45, each with a 90° angle marked)
In 1958 the 45/45 standard was adopted by the industry, having been
patented in the United States by Westrex/Bell as early as 1937.
Can you think of a problem that existing record users might find
with the introduction of stereo discs?
Comment
Non-compatibility with existing monophonic (single channel) systems
meant the need to produce both mono and stereo discs. Retailers
would have to stock both versions of the record. A stereo disc could be
damaged if played on a mono pickup which was not designed to be
compatible with stereo discs. This is because the stereo pickup needs
to move in both the horizontal and vertical directions to cope with the
45/45 movement whereas the monophonic pickup only moved
horizontally and this could potentially cause damage to the stereo
groove wall. Also not all the music information could be recovered by
having a monophonic pickup. ‘Stereo compatible’ monophonic
pickups were eventually manufactured to overcome this problem
allowing the production of monophonic discs to be phased out. I
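The point about a monophonic pickup recovering only part of the information can be shown with a small sketch. It assumes the usual description of the 45/45 system, in which the sum of the two channels appears as lateral stylus motion and their difference as vertical motion. The scaling and sign conventions here are my own assumptions for illustration.

```python
import math

ROOT_TWO = math.sqrt(2)

def to_lateral_vertical(left, right):
    """Resolve the two 45-degree groove-wall signals into lateral and vertical stylus motion."""
    lateral = (left + right) / ROOT_TWO
    vertical = (left - right) / ROOT_TWO
    return lateral, vertical

def to_left_right(lateral, vertical):
    """What a stereo pickup does: recover both channels from the two motions."""
    left = (lateral + vertical) / ROOT_TWO
    right = (lateral - vertical) / ROOT_TWO
    return left, right

# A sound placed mostly in the left channel.
lateral, vertical = to_lateral_vertical(0.8, 0.2)
print("stereo pickup recovers:", to_left_right(lateral, vertical))
print("mono pickup senses lateral motion only:", lateral)
```

A sound recorded equally in both channels produces no vertical motion at all, so it plays correctly on a lateral-only pickup. Anything that differs between the channels ends up in the vertical motion, which a monophonic pickup cannot follow.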
This brings to a close the story of the record (cylinder and disc), the
main source of recorded sound for nearly a hundred years. Apart from
refining manufacturing techniques, little change to the technology took
place. There is still a demand for vinyl discs from audiophiles, who
believe the analogue sound cannot be surpassed. But it is DJs, who
have made ‘turntablism’ an art form in its own right by creating new
music by ‘scratching’ tracks from vinyl discs, that are keeping disc
record playing alive. This demonstrates a use of the phonograph not
even imagined by Edison!
Why are the following factors important in the quality of disc
reproduction? What aspects of the quality of the reproduced sound do
they affect?
(a) The hole is exactly in the centre of the disc.
(b) The disc lies flat on the turntable, not warped in any way.
(c) The disc is made from a smooth material. I
Your answers to the first two parts of the above activity should have
described an effect known as wow. Wow is a low frequency pitch
variation which, in discs, is caused not only by the spindle hole in the
disc not being exactly centred, or by the disc being warped, but can also be
caused by slow variations in the disc motor speed. There is a second
related effect called flutter which is a higher frequency pitch variation
caused mainly by faster variations in the speed of the disc motor.
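To get a feel for the size of wow from an off-centre hole, here is a rough first-order estimate. It assumes the pitch swings by a fraction equal to the offset of the hole divided by the groove radius, once per revolution. The 0.5 mm offset and 100 mm radius are made-up example values.

```python
import math

def wow_frequency_hz(rpm):
    """An off-centre disc produces one pitch swing per revolution."""
    return rpm / 60

def peak_pitch_deviation_cents(offset_mm, groove_radius_mm):
    """First-order estimate: fractional speed change = offset / groove radius."""
    fractional_change = offset_mm / groove_radius_mm
    return 1200 * math.log2(1 + fractional_change)

rate = wow_frequency_hz(100 / 3)                      # an LP at 33 1/3 rpm
deviation = peak_pitch_deviation_cents(0.5, 100.0)    # hypothetical 0.5 mm offset at 100 mm radius
print(f"wow rate {rate:.2f} Hz, peak deviation about {deviation:.1f} cents")
```

The same offset matters more near the end of a side, because the groove radius is smaller there.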
2.3 Sounds from magnets

“I’ve an opera here you shan’t escape – on miles and miles of
recording tape.”
(from ‘The song of reproduction’, see page 81)
2.3.1 Introduction
Sounds, pictures, measurement data, financial statistics, personal details,
etc., can all be recorded and stored on magnetic media, i.e. materials
that are able to be magnetised to store information for future retrieval.
Construct a table of all the different types of magnetic media you think
you may have used and what you kept on each type of media. Were
you able to put any of the media to more than just one use, i.e. store
different sorts of things on that media? Do not worry at this time if you
are uncertain as to what I mean by magnetic media.
Comment
The media types I thought of are shown in Table 2. I
Table 2 Various types and uses of magnetic media
Magnetic media type | Use
Audio reel-to-reel tape | Music and speech
Audio cassette tape | Speech, music, and computer data
VHS tape | Videos for television and digital audio
DV (digital video) tape | Home movies
Hard disk* | Computer data, music, speech, videos
Floppy disk | Computer data and music
* There are two forms of the spelling. My convention will be that discs originally
designed for music storage will be spelt with a ‘c’, whereas disks used for computer
storage will be spelt with a ‘k’.
Did you notice from my list in the above activity that most magnetic
media can have more than one use? I doubt that the designers of the
compact cassette ever imagined it being used to store computer data
(as was the case in the 1980s when cassettes were used with home
computers). Magnetic media is incredibly versatile for recording and
storing information due to its convenience of use, low cost, reusability
and reliability, although not all these qualities are necessarily exploited.
For example audio and video recordings are often made only once and
kept indefinitely, whereas data on a computer disk may be changed
by the minute, as with the word-processor data file I am updating as
I write (and rewrite!) this section of the course.
2.3.2 Recording on the wire

A paper published by Oberlin Smith in 1888 discussed the possibilities
for recording sound using the property of magnetism. He envisaged a
cotton thread impregnated with steel dust passing through a coil carrying
a current controlled by a microphone. Variations in the strength of the
current, following the sound, would cause corresponding magnetic
fluctuations in the magnetic medium. Unfortunately he dismissed
his idea because he thought that ‘the magnetic influence would
probably distribute along the wire in a most totally depraved way.’
Smith’s ideas remained theoretical as he never performed any experiments.
However, by the end of the 19th century Valdemar Poulsen, a Danish electrical
engineer, had demonstrated Smith’s hypothesis. Poulsen’s ‘Telegraphone’, shown
in Figure 20, was patented in 1898. It used steel wire wrapped around a brass
cylinder as the magnetic medium. At the Paris Exposition of 1900 he recorded
Emperor Franz Josef of Austria; this is the oldest magnetic recording now in
existence.
Figure 20 Poulsen’s original wire recorder
The lack of appropriate technology meant that the telegraphone
could not compete with the gramophone. The development of an
electronic amplifier using the thermionic valve (vacuum tube)
enabled the tiny magnetic fluctuations in the steel wire to be
magnified to a usable level. By 1924 a German engineer, Dr. Curt
Stille, had developed a machine which could record sounds on a
steel tape. The BBC (British Broadcasting Company) showed great
interest for, at this time, they used disc recorders for pre-recording
programmes and talks which were cut into acetate discs, replayed
maybe twice and then discarded. So they sent two engineers to
Berlin for a demonstration. They offered to buy the machine but
were refused and so returned empty-handed. In 1931 Louis Blattner
purchased a Stille machine, shipped it to England and renamed it
the Blattnerphone, illustrated in Figure 21. It used ¼ inch (6 mm)
wide flat steel tape and could record for up to 20 minutes.
The BBC evaluated it but were
unhappy with the signal-to-noise
ratio due to a constant background
hiss, caused by the physical qualities
of the steel tape. Blattner eventually
sold out to the Marconi Company
who in conjunction with Dr. Stille
further developed the recording
machine. In order to provide a
suitable audio bandwidth it was
found necessary to run the now
one inch wide steel tape at a rate
of 60 inches per second (150 cm
per second). This meant that nearly
2 miles (3 km) of metal tape was
required for a half-hour programme!
Such was the pressure for an easy record and playback system that the BBC
used steel tape recorders for a while as demonstrated in the next activity.
Figure 21 A steel tape recording machine
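The length of steel tape needed for a half-hour programme follows directly from the tape speed. The sketch below simply multiplies the quoted speed of 60 inches per second by the running time. The function and variable names are my own.

```python
INCHES_PER_MILE = 63360
CM_PER_INCH = 2.54

def tape_length_inches(speed_ips, minutes):
    """Length of tape that passes the heads in the given time."""
    return speed_ips * minutes * 60

length_in = tape_length_inches(60, 30)
length_miles = length_in / INCHES_PER_MILE
length_km = length_in * CM_PER_INCH / 100_000
print(f"{length_miles:.2f} miles ({length_km:.2f} km)")
```

The result is a little over 1.7 miles, or about 2.7 km, in line with the ‘nearly 2 miles (3 km)’ quoted.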
Listen to the two audio tracks associated with this activity. The first
track contains a recording of the famous statement made by the Rt. Hon.
Neville Chamberlain on his return from Munich. This was recorded on
a steel tape recorder in September 1938. As a comparison, the second
track contains a copy of the 1932 experimental disc recording that
you analysed in Activity 19.
Notice the difference in background noise between the magnetic steel
tape recording and the disc recording of 6 years earlier. I
Poor signal-to-noise ratio meant that steel tape was eventually discarded
but one of the first home magnetic recording machines, the Webster
wire recorder, described in Box 7, used thin steel wire, echoing
Poulsen’s idea.
Box 7 The Webster wire recorder

The Webster wire recorder was introduced in 1946
and remained popular with amateurs until the late
1950s.
The quality of this recorder was, for the time,
surprisingly good – this was perhaps in part due to
the fast wire speed of 30 ips (from which all the
subsequent standard tape speeds – 15, 7½, 3¾, and
1⅞ ips – were derived). Figure 22 is a photograph of
the Webster wire recorder. The steel wire was
0.0036 inch in diameter and reels provided up to an
hour of recording time.
Figure 22 Photograph of a Webster Wire Recorder

2.3.3 Magnetic tape recorders

Experiments showed that the use of paper tape coated with iron oxide
particles significantly improved the signal-to-noise ratio and enabled a
lower tape speed to be used. A plastic-based version of this magnetic
tape, developed by the German company BASF, led to the development of
a commercial tape recorder with audio characteristics that could nearly
match that of the gramophone record, but not at an economical price.
Secret work on tape recorders was undertaken by the Germans throughout
the Second World War. This was revealed when Allied forces captured
the Radio Luxembourg studios in 1944 and discovered machines capable of
out-performing discs in both sound quality and playing time. In America
the Minnesota Mining and Manufacturing Company (3M Co.) further refined
the tape (their experience with adhesive tapes proved advantageous) while
the Ampex Corporation developed a machine with a frequency response
of 30 to 15,000 Hz at a tape speed of 7½ ips (19 cm/s) and a signal-to-noise
ratio of 50 dB, certainly equalling, if not bettering, the characteristics of
discs at that time. The tape speeds used for different recording
characteristics are discussed in Box 8.
Box 8 Tape speed

The audio bandwidth of a tape recorder is determined to an extent by the
selection of the tape speed, i.e. the rate at which the tape is drawn across the
record and play heads, shown in Figure 23. The wavelength of the audio signal
recorded onto the magnetic tape is proportional to the tape speed. As the tape
speed is increased, each cycle of the audio signal occupies a greater length of
tape, allowing higher frequencies to be retained on the tape. Because high
tape speeds use more tape, tape recorders had speed controls
to allow users to select the tape speed to suit the audio quality required, as
detailed in Table 3.
Figure 23 The tape path on a tape recorder (labels: pay-off reel, take-up reel, erase, record and play heads, tape, guides)
Table 3 Tape speed vs. bandwidth
Tape speed | Bandwidth | Use
38 cm/s (15 ips) | 20 Hz – 20 kHz | Studio recording
19 cm/s (7½ ips) | 30 Hz – 15 kHz | High quality home recording
9.5 cm/s (3¾ ips) | 40 Hz – 13 kHz | General domestic music and speech
4.8 cm/s (1⅞ ips) | 50 Hz – 6 kHz | Recording speech (dictation)
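The link between tape speed and recorded wavelength described in Box 8 can be worked through for the rows of Table 3. The sketch below uses the relationship wavelength = tape speed / frequency to find the shortest wavelength each speed has to hold, taking the top of each quoted bandwidth as the highest frequency. The names are my own.

```python
# (tape speed in cm/s, highest frequency in Hz) taken from Table 3.
TABLE_3 = [(38.0, 20_000), (19.0, 15_000), (9.5, 13_000), (4.8, 6_000)]

def recorded_wavelength_um(speed_cm_per_s, frequency_hz):
    """Length of tape occupied by one cycle of the signal, in micrometres."""
    return speed_cm_per_s / frequency_hz * 10_000

for speed, top_frequency in TABLE_3:
    wavelength = recorded_wavelength_um(speed, top_frequency)
    print(f"{speed} cm/s at {top_frequency} Hz: {wavelength:.1f} micrometres")
```

At every speed the shortest recorded wavelength is only a few micrometres to a few tens of micrometres, which gives an idea of how fine the magnetic pattern on the tape has to be.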
Soon tape recorders were in use by the American radio networks for
pre-recording their broadcasts, the entertainer Bing Crosby being one
of the greatest proponents of the technology. Recording companies
were also quick to embrace the benefits of tape – especially the ease by
which mistakes could be edited and retakes inserted. Also the ability
to record for longer periods meant less need for recording sessions to
be split into short takes. Early domestic recorders were used primarily
for playing stereo recordings, but they were costly, both in terms of the
hardware and the media: a pre-recorded stereo tape cost five times as much
as the equivalent mono LP disc. The sales of pre-recorded tape
plummeted once stereo LPs became available in 1958. From that point
on domestic tape recorders, similar to the one illustrated in Figure 24,
were used mainly by enthusiasts for home recording.
Figure 24 An enthusiast’s home tape recorder
By referring to the information given earlier, construct a table that
compares the frequency response and playing time of the newly
developed magnetic tape with that of 78 rpm discs of the same period
(1945). I
2.3.4 Compact cassettes

Magnetic tape has always been somewhat problematic for home use.
Whilst it offers several advantages over discs, being
capable of high quality sound, substantially free from surface noise
and able to make personal recordings, tape never became so
popular as to make any serious inroads into the sales of discs. Why
should this be the case? The answer is one of convenience, for
magnetic tape has always been difficult to handle compared with discs
– threading the tape through the machine and onto the take-up spool
was a fiddly process, and the tape could easily get damaged or snap.
Many companies developed tape cassette systems based on standard
quarter-inch tape but none succeeded in gaining acceptance by consumers.
The compact cassette system, shown in Figure 25, was developed by
Philips Gloeilampenfabrieken in 1963 for recording speech (shades of
Edison!). Philips called their cassettes compact to distinguish their
system from other audio cassette systems. They made no pretence of
achieving high quality sound, deciding to use a slow tape speed (1⅞ ips)
and a new narrow one-eighth-inch wide tape to keep the whole system
as small as possible. The convenience of slotting cassettes into the
machine rather than having to thread tape around guides and tape
heads made this format much more suitable for consumers.
Figure 25 A Philips audio cassette recorder with a compact cassette
To use the compact cassette system in place of long-playing records
(LPs) necessitated overcoming two obstacles: first, the limited
bandwidth due to the low speed and, second, the poor signal-to-noise
ratio because of the low signal level output from the narrow tape. The
bandwidth was increased to a degree by the use of special magnetic
tape formulations, including high density ferric oxide, chromium
dioxide and pure metal compounds. The signal-to-noise ratio was also
improved because these tapes allowed signals to be recorded at higher
levels. However, the poor signal-to-noise ratio was really only solved
by Dolby Laboratories, which developed and licensed a consumer
version of its professional noise reduction system, Dolby A. The Dolby
B noise reduction system described in Box 9 dramatically improved
the sound quality on compact cassette tapes, enabling them to rival
discs. Remember that although Dolby B encoding can reduce tape hiss
it cannot be used to improve the quality of the original recorded sound.
The ability to make Dolby encoded home recordings was a very
attractive feature of the system and certainly contributed to the wide
acceptance of the compact cassette. This was exploited particularly in
automobile audio systems where a copy of an LP or CD could be
played whilst driving. Sales of classical music compact cassettes
overtook LPs by 1983 but were themselves overtaken by CDs in 1988.
By 1994 classical CDs took 78% of the market, compact cassettes 21%
and LPs a mere 1%.*
* Data from the Statistics Handbook (1995), The British Phonographic Industry,
London, p. 21.
Box 9 The Dolby B noise reduction system

Dolby Laboratories developed their Dolby B noise reduction system to
improve both the frequency response and the signal-to-noise ratio of the
compact cassette system. Magnetic tape can only hold so much signal;
beyond this it will saturate (i.e. the magnetic particles on the tape
cannot be magnetised any more), so the louder the signal being recorded,
the closer the tape becomes to being ‘saturated’. For a particular tape
recorder and tape the amount of tape hiss is constant. Thus the louder
the wanted sound, the less obtrusive the hiss will be. However, if loud
(high level) signals are recorded and at the same time boosted
significantly for the purpose of noise reduction, the tape would saturate
and the recording would become distorted.

The basis of the Dolby noise reduction system is that low-level high
frequency signals are boosted when the recording is made, and the
opposite process is carried out on replay. The process is only applied to
high frequency signals as this is the frequency range where the hiss is
most obtrusive.

To boost the noise reduction effect further, the Dolby B system uses a
sliding range for the frequency where the signal boost starts to happen.
When the sound signal is very low or contains few upper frequencies, the
boost start point slides to its lowest frequency, giving on replay a
maximum of 10 dB noise reduction above 4 kHz. As the sound level
increases and/or there are more higher frequencies in the signal, the
start point frequency rises and so the perceived reduction in noise on
replay is reduced, but of course because the signal is louder in the
upper frequencies, the hiss is less noticeable.

The key to successful operation of this system is in the positioning of
the sliding bands and the ability of the decoder in the replay machine to
track these changes in frequency and so reproduce exactly the original
signal. If, for some reason, the frequency response of the encoded signal
is changed before it reaches the decoder, mis-tracking of the sliding
bands will occur. How audible this becomes has to do with several
factors, including the nature of the music, the listening conditions, and
the sensitivity of the listener. However, audible effects of mis-tracking
are minimised by limiting the overall boost range to 10 dB.
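The boost-on-record, cut-on-replay principle in Box 9 can be illustrated numerically. The sketch below is a deliberately simplified model and not the Dolby B circuit: it applies a fixed 10 dB boost to the whole of a quiet signal, with no sliding band and no frequency selectivity, and models tape hiss as random noise added at a constant level. The signal level, hiss level and function names are assumptions chosen for illustration.

```python
import math
import random

BOOST_DB = 10.0  # the maximum boost quoted in Box 9


def db_to_gain(db):
    """Convert a level change in decibels to an amplitude ratio."""
    return 10 ** (db / 20)


def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def record_and_replay(signal, hiss, use_noise_reduction):
    """Boost before 'recording', add tape hiss, then cut by the same amount."""
    gain = db_to_gain(BOOST_DB) if use_noise_reduction else 1.0
    on_tape = [gain * s + h for s, h in zip(signal, hiss)]
    return [x / gain for x in on_tape]


def snr_db(signal, replayed):
    """Signal-to-noise ratio of the replayed sound, in decibels."""
    noise = [r - s for r, s in zip(replayed, signal)]
    return 20 * math.log10(rms(signal) / rms(noise))


random.seed(0)
n = 8000
# A quiet 5 kHz tone sampled at 44.1 kHz, plus constant-level hiss.
signal = [0.01 * math.sin(2 * math.pi * 5000 * t / 44100) for t in range(n)]
hiss = [random.gauss(0.0, 0.001) for _ in range(n)]

plain = snr_db(signal, record_and_replay(signal, hiss, False))
encoded = snr_db(signal, record_and_replay(signal, hiss, True))
improvement_db = encoded - plain
print(f"Improvement in signal-to-noise ratio: {improvement_db:.1f} dB")
```

Because the replay cut exactly undoes the record boost, the wanted signal is unchanged while the hiss added in between is reduced by the full boost amount. A real system must also avoid boosting loud signals into saturation, which is the job of the level-dependent behaviour described in the box.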
Disc recordings have always had a specific advantage over tape when it
comes to accessing a particular part of a recording. By describing the
different technologies used to store the sound, can you suggest what
that advantage might be?
Listen to the audio track associated with this activity. You will hear an
original live digital recording made at the Open University especially
for this activity. During the recording four different formats were
used: direct digital, an original compact cassette tape without noise
reduction, the same tape with Dolby B noise reduction and a metal
compound cassette tape again with noise reduction. Listen carefully to
the differences in the quality of the recorded sound.
Note, the differences are quite subtle and you may find them easier to
distinguish using headphones.
Comment
I expect you noticed the very intrusive tape-hiss after about 18 seconds.
I am sure you agree this is unacceptable for most music recordings,
although it could be tolerated for speech. The tape hiss is reduced to an
acceptable listening level after a further 25 seconds by using a Dolby B
noise reduction processor. This cuts the tape-hiss by about 10 dB but
maintains the tonal balance of the sound. This would not be possible
using conventional filters (e.g. treble-cut tone controls) which,
although they would suppress the tape-hiss, would also affect the tonal
balance of the sound. Finally, after a further 30 seconds the tape-hiss
becomes almost inaudible through the use of a metal compound cassette
tape and Dolby B noise reduction. This combination cuts the hiss by a
further 10 dB, making the overall sound quality very close to the
original digital recording at the beginning of the track.
In this activity you will view the noise levels of the recording in
Activity 27 using the course’s sound editing software.
Run the course’s sound editing software and open the computer sound
file associated with this activity. Open a linear frequency analysis
window and play the sound clip. Can you see the noise levels
increasing and decreasing as explained in Activity 27? Do the levels
decrease by the amount suggested in that activity? What happens at the
very end of the recording? Note, you may need to play the sound two
or three times to answer these questions.
Comment
As the sound played, I hope you could see on the frequency analysis
display the general noise level (sometimes called the noise floor)
increasing and decreasing as explained in the comment to Activity 27.
Looking at the graph scale and estimating the actual values of the noise
level you should also see that the noise floor does vary by roughly the
amounts suggested in the previous activity. At the very end, did you
notice the noise floor reduce to the low level it was at the start of the
recording, indicating a rather sneaky final switch back to the original
digital sound?
2.3.5 Studio tape recorders

The importance of tape recording to record production cannot be
overemphasised. From its development until the coming of digital tape
recorders in the late 1970s the analogue tape recorder was at the heart
of the professional music recording studio. Initially, the full width of
the standard quarter-inch wide tape was used for making monophonic
recordings. Stereo needed two tracks – one for each channel. Rather than
doubling the tape width a decision was made to halve the track width by
incorporating two discrete heads one above the other in a single head
assembly. This reduced the signal-to-noise ratio by 3 dB because of the
reduced output signal from the replay head. As technology advanced more
tracks could be added whilst keeping the noise to an acceptable level.
By also widening the tape, even more tracks could be incorporated, so
allowing individual instruments to be recorded on separate tracks for
down-mixing at a later date. Figure 26 shows a professional 24 track
analogue tape recorder using special 5 cm (2 inch) wide tape.
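The 3 dB figure for halving the track width can be checked with a short calculation. The sketch below rests on a common simplifying assumption that is not stated in the text: the replay signal voltage is taken to be proportional to the track width, while the tape noise power (not voltage) is taken to be proportional to the track width, because noise from different parts of the tape adds incoherently. The function name is illustrative.

```python
import math


def snr_change_db(track_width_ratio):
    """Change in signal-to-noise ratio when the track width is scaled.

    Assumes signal voltage is proportional to track width and noise
    power is proportional to track width.
    """
    signal_change_db = 20 * math.log10(track_width_ratio)  # voltage ratio
    noise_change_db = 10 * math.log10(track_width_ratio)   # power ratio
    return signal_change_db - noise_change_db


print(f"Half-width track: {snr_change_db(0.5):+.1f} dB")
print(f"Quarter-width track: {snr_change_db(0.25):+.1f} dB")
```

Under these assumptions each halving of the track width costs about 3 dB, which is why adding more tracks depended both on wider tape and on improvements elsewhere in the recording chain.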
Figure 26 A professional 24 track analogue tape recorder
These complex machines are capable of reproducing high quality
sound for each track with a bandwidth equalling that of the average
human ear and they represent the pinnacle of analogue multitrack tape
recorders.
Why is it important that the plastic backing material of magnetic tape
be as resistant as possible to stretching?
Comment
The effect of stretching the tape is to make it longer. This means
that it takes more time for the original length of tape to pass the
head, effectively slowing the tape speed. The pitch of the signal in
the original recording will consequently be lowered.
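The size of the pitch change caused by stretching can be estimated as follows. This is a minimal sketch that assumes the tape stretches uniformly after recording and is replayed at the nominal speed, so that every recorded wavelength is lengthened by the same fraction. The function name and the 1% example are illustrative.

```python
import math


def pitch_shift_cents(stretch_fraction):
    """Pitch change in cents when a recorded tape is stretched uniformly.

    A stretch of 0.01 means the tape is 1% longer than when recorded.
    There are 100 cents in a semitone; a negative result is a drop in pitch.
    """
    frequency_ratio = 1 / (1 + stretch_fraction)
    return 1200 * math.log2(frequency_ratio)


one_percent = pitch_shift_cents(0.01)
print(f"1% stretch: {one_percent:.1f} cents")
```

Even a small amount of stretching therefore produces a measurable drop in pitch, and uneven stretching along the tape would make the pitch waver.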

Sound recording really took off once the public’s demand for
recorded music had been acknowledged. The choice of technology,
cylinder or disc, was determined more by the selection of the artist
and material than the quality of the sound. Development of disc
technology was slow due to the lack of better alternatives,
remaining substantially unchanged for over fifty years. The
development of radio broadcasting caused a slump in the record
industry but eventually provided not only improvements in
recording technology, by replacing acoustic recording with electrical
methods, but also a shop window for records. Once perfected,
magnetic tape offered a superior sound quality to 78 rpm records
and spurred the record industry into developing the long-play vinyl
disc which improved the quality of sound and, most importantly,
increased the playing time. Difficulties in operating conventional
tape recorders led to the development of the compact cassette, sales
of which overtook LPs once the sound quality had been improved
by the Dolby B noise reduction system.
“Then I never did care for music much – It’s the High Fidelity!”
(from ‘The song of reproduction’, see page 81)
To round off this section on the history of sound reproduction on a
lighter note, listen to the audio track associated with this activity.
This is the complete version of Michael Flanders and Donald
Swann’s ‘The song of reproduction’ performed by the composers in
a live recording from the late 1950s of their popular At the drop of a
hat comedy show.
3 RESTORING RECORDINGS
3.1 Introduction
You have seen that for over 125 years it has been possible to make
and store sound recordings and you have had the opportunity to
listen to some recordings from over 100 years ago. Over that time an
extraordinary number of recordings have been made. Just as an
example, way back in 1930 the UK Columbia Record Catalogue ran
to over 300 pages, and Columbia was only one of several major record
companies. Many original masters or at least very good copies of
earlier recordings still exist.
Think about why it might be advantageous for record companies to
re-use archived recordings. Try to make a list of possible reasons for
restoring recordings of musical performances.
Comment
I thought of the following list of possible reasons, but this is not
exhaustive:
• a particularly fine performance;
• a performance involving the composer;
• a previously unreleased recording;
• a first performance of a particular work;
• the last performance of a noted artist;
• cheaper than recording a new performance.
An example of a historic recording is that made on the 14th and
15th July 1932 of Sir Edward Elgar’s Violin Concerto in B minor by
the 16-year-old violinist Yehudi Menuhin, conducted by the
composer. This recording has been in the record catalogue ever since
and is still available on CD. The Penguin Guide to Compact Discs
describes the recording:
“…on this newly remastered CD emerges with a fine sense of
presence and plenty of body to the sound. As for the performance,
its classical status is amply confirmed; in ways no one has ever
matched – let alone surpassed – the sixteen-year-old Menuhin in
this work.”*
An original boxed set of the recording on six 78 rpm discs is shown
next to a copy of the remastered CD in Figure 27. If you were to listen
to the original recording you would have to change the disc every five
minutes – which is hardly conducive to an enjoyable listening
experience.
*March, I., Greenfield, E. and Layton, R. (2001) The Penguin Guide to Compact
Discs 2002 Edition, London, Penguin, p. 448.
Listen to the two audio tracks associated with this activity which
demonstrate what can be achieved with the restoration of an old recording
using modern digital techniques. The tracks contain an excerpt from the
same recording of Elgar’s Violin Concerto in B minor but taken from
different media. The first track is from the original 78 rpm disc and the
second track is the same section of the piece from the recently remastered
CD. Notice how the restoration process has reduced the background noise
without affecting the sounds of the instruments.
Remastering, often known as audio restoration, has become of great
importance to music companies. The opportunity to raid their archives,
coupled with the desire of record collectors to replace their beloved
vinyl records with digital media, has led to a plethora of re-releases
of historic performances. Before I go on to discuss the various stages
of audio restoration I want to take a moment to consider the life span
of recording media.
Figure 27 ‘Boxed sets’ of the same recording, but the media are seventy years apart!
3.2 The life span of CDs

As I mentioned earlier, in the latter part of 1897 Berliner switched to a
shellac compound manufactured by The Durinoid Company of New
Jersey, USA, in place of the problematic vulcanite then used to
manufacture discs. This shellac compound has turned out to be a near
ideal choice of recording medium for, over time, it has proved to be
very stable and relatively easy to store with the result that we are still
able to listen to recordings made over 100 years ago. Certainly the
audio quality is poor by today’s standards, but that we can hear the
sounds at all is remarkable.
Magnetic tape has not proved nearly so dependable. Problems with the
oxide flaking off the backing, the layers of tape sticking together and
‘print through’ (pre-echo or post-echo caused by magnetism from one
section of tape transferring over time to the section of tape next to it
on the spool) have been experienced in tapes less than 50 years old.
Indeed, the British Library in London is busy researching ways to
salvage recordings from damaged magnetic tapes. Machines to play back
many of the early tapes are also becoming rare. In the future will CDs
prove equally problematic?
ACTIVITY 33 (READING)
Read the article associated with this activity which you will find in the
Block 3 Reader. The article is taken from the August 2002 edition of
the Gramophone magazine. It looks at some of the problems already
experienced with storing CDs.
Comment
The article raises a number of problems regarding the life-span of CDs.
To prevent some of these occurring, archivists now create discs using
precious metals such as silver or gold to try to avoid the problems
associated with aluminium. There is also a plan to put audio data
directly onto computer networked servers to make it available via the
Internet as well as providing archive resources.
What might be the dangers of using current technologies to archive
audio recordings?
3.3 Audio restoration processes
3.3.1 Introduction
The aim of the audio restoration engineer is to produce a recording which
offers to the listener an experience as close to the original performance as
possible. The problem however is that the expectations of the consumer
have risen to well beyond what can be offered by most early recordings.
Nearly silent backgrounds coupled with huge variations in sound levels
have become the norm on CDs. Due to the nature of the original recording
methods and the media used, even with the best replay equipment
available, it is impossible to reproduce the original performances
without degradation of the sound introduced by the ravages of time.
Degradation of the sound may be classified into two broad areas, global
and localised. Global degradation includes background noise, wow and
flutter, and distortion. Localised degradation covers clicks, crackles
and deep scratches.
To restore a recording the following processes are used:
1 Select the best source or sources of the original performance, using
the original master recordings whenever possible.
2 Convert the recording to the digital domain applying appropriate
playback characteristics.
3 Join sections together, adjusting speed and editing as necessary.
4 Remove unwanted sounds such as clicks, crackles, buzzes
and hiss (collectively known as unwanted artefacts).
5 Apply equalisation (EQ).
6 Make the master record.
Actually there is a stage missing between each of the steps. The most
important aspect of the restoration process is that of listening. So the ears
are the most important instrument the restoration engineer possesses.
Whilst there are plenty of ways of measuring objectively the results of
the restoration process, the problem is that they may not tell you what
you want to know about the sound. So, inevitably, the final result is a
subjective one based upon the experience and musical ability of the
restoration engineer. Too much processing and the quality of the
music is lost. Too little and the unwanted artefacts intrude into the
performance. However, the most important point for the restoration
engineer to remember is that the original recording made at the time
cannot be improved. Indeed it is the engineer’s job to get back, as far as
is possible, to that original recording.
3.3.2 Selecting the performance

The sources may be from a variety of different media types such as
cylinders, discs, tape, etc. If the restoration is to be undertaken on behalf
of the company who owns the original recording then the master disc or
tape may be available. If not, a good copy or preferably copies will be
necessary. There may be a choice of master recordings, as often two
separate recordings of the same performance were made, in which case
both should be auditioned and the best selected. Box 10 ‘Kind of Blue’
demonstrates the value of having both a master recording and a
back-up recording made at the same time on two separate recorders.
Written details from the conductor and soloists, made at the time of
recording, may exist. Standard rotational speeds and pitch (e.g. 440 Hz
for A4) cannot be relied upon any more than can the composer’s key
signature. Variance by a semitone or more would not be unusual if it
suited a soloist’s voice on the day.
Box 10 Kind of Blue

The LP disc entitled Kind of Blue was recorded by the Miles Davis Sextet
in 1959 at Columbia Records’ 30th Street Studios in New York, USA. It was
released later that year to immediate critical and public acclaim and
became one of the best-selling jazz recordings of all time. It has been
issued in most formats: LP, 45 disc, reel-to-reel tape, audio cassette
and CD. By 1993 it had sold 500 000 copies. And yet every single copy
sold was flawed.

In Block 2, Chapter 4 you had the opportunity to listen to an audio clip
of the first track, ‘So What’, from the latest CD of this recording. It
was a well known fact amongst jazz musicians that you could not jam (play
along) to the first side of Kind of Blue; the tuning seemed wrong. The
second side was fine – any budding musician could join in and play
along – in perfect tune. And yet no one had complained to Columbia and
none of their engineers or members of the Miles Davis Sextet had spotted
a problem.

In 1992 Columbia’s remix engineer Mark Wilder was asked to prepare a new
version of Kind of Blue for release on a CD as part of Columbia’s ‘Legacy
Jazz’ series. Mark Wilder recounts the story:

“The reissues had always been done with the ‘C’ master reel, so I said,
“Let’s use the [safety] ‘D’ reels since they’ve been played less.” I
always check the tapes against a copy of the album anyway, and the safety
reels sounded different. I called in a trumpet player I knew to listen to
Miles’s solos and he confirmed what I heard – there was about a quarter
tone difference.”*

The tape recorder used to record the original master tape had a problem.
Its drive motor was running slightly slow and so when played back at the
correct speed of 15 ips the pitch was slightly sharp. As further
confirmation of the difference between the two tapes, the first track of
the original CD release of Kind of Blue, which used the same master tape
as the original LP, runs for 9 minutes and 4 seconds, whereas the same
track on the later remastered release lasts 9 minutes and 27 seconds. The
original tape recorder was running just under 5% slow.

*Kahn, A. (2000) Kind of Blue, London, Granta Publications, p. 125.
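The running times quoted in Box 10 allow the size of the speed error to be estimated. The sketch below assumes that the whole difference in running time is due to tape speed, which may not be exactly true since two CD transfers can also differ in, for example, the amount of silence at the start and end of a track. The function and variable names are illustrative.

```python
import math

# Running times of the first track, as quoted in Box 10.
original_cd_s = 9 * 60 + 4    # master recorded on the slow machine
remastered_s = 9 * 60 + 27    # taken from the correctly running safety reel


def playback_speed_ratio(fast_duration_s, correct_duration_s):
    """How many times too fast the faulty recording plays back."""
    return correct_duration_s / fast_duration_s


def ratio_to_cents(frequency_ratio):
    """Convert a frequency ratio to cents (100 cents = 1 semitone)."""
    return 1200 * math.log2(frequency_ratio)


ratio = playback_speed_ratio(original_cd_s, remastered_s)
recorder_slow_fraction = 1 - 1 / ratio   # speed deficit of the recorder
pitch_error_cents = ratio_to_cents(ratio)

print(f"Recorder ran about {recorder_slow_fraction:.1%} slow")
print(f"Pitch on replay is about {pitch_error_cents:.0f} cents sharp")
```

A quarter tone is 50 cents, so a result of this order is consistent with a pitch difference that a trained musician could hear but that a casual listener might never notice.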
Listen to the audio track associated with this activity. This is a short
section from three different sources of Miles Davis’s Kind of Blue. In
order, the sources are: an early LP, the first CD to be released and the
latest issue of the CD. At the end of the track there is a single trumpet
note taken from each of the three sources, played in quick succession.
Comment
Could you hear the difference between the three sources? Remember
that no measuring equipment spotted the mistake – just a keen pair of
ears. Listening, which requires much practice, is all important when
restoring musical performances.
3.3.3 Converting to digital

In some cases the original recording and playback apparatus may be
available, otherwise suitably adapted equipment will have to be
employed. For example, 78 rpm records may be played back at a lower
speed (e.g. 45 rpm) and then ‘speeded-up’ using the time-stretch
facility in a computer-based audio editor. (Some of the 78 rpm disc
excerpts for this chapter were made in this way.) The playback
equipment will be set up to match as closely as possible the
characteristics of the original equipment. Each major record company
used its own recording characteristics to obtain the best possible
sound with the equipment available at the time.
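The arithmetic behind playing a disc at the wrong speed and correcting it afterwards can be sketched as follows. The sketch assumes nominal turntable speeds of exactly 45 and 78 rpm and a capture sampling rate of 44.1 kHz; both are illustrative assumptions, and the function names do not belong to any particular audio editor.

```python
def speed_up_factor(played_rpm, intended_rpm):
    """Factor by which the captured audio must be speeded up."""
    return intended_rpm / played_rpm


def corrected_duration_s(captured_duration_s, played_rpm, intended_rpm):
    """Duration of the recording once it runs at its intended speed."""
    return captured_duration_s / speed_up_factor(played_rpm, intended_rpm)


def equivalent_sample_rate_hz(capture_rate_hz, played_rpm, intended_rpm):
    """Sampling rate at which the captured samples would play at the
    intended speed if they were simply replayed faster."""
    return capture_rate_hz * speed_up_factor(played_rpm, intended_rpm)


factor = speed_up_factor(45, 78)
rate = equivalent_sample_rate_hz(44_100, 45, 78)
print(f"Speed-up factor: {factor:.3f}")
print(f"Equivalent sampling rate: {rate:.0f} Hz")
```

Speeding the captured audio up by this factor raises the pitch and shortens the duration together, which is what is needed here because both were lowered by the slow playback.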
High quality amplifiers from the 1950s included equalisation circuits
(filters) to match the characteristics of different disc media. Where such
facilities are not available, equalisation by varying the bass and treble tone
controls can be tried. Today, once the recording has been converted
into the digital domain, digital equalisation can be employed using
audio editing software with information similar to that contained in
Table 1 (in Box 6 on pages 96 and 97). The transfer is most likely to
be to a hard disk and digitisation should use the highest possible
sampling rates and numbers of quantisation levels, irrespective of the
quality of the original sound. The aim is to get the best possible quality
of transfer from the original medium.
3.3.4 Making the joins

Consider the case where a musical work was recorded onto two
double-sided 78 rpm discs. Once the disc transfer is complete the four
sides must be joined together to make the complete performance. This
can be fraught with problems. The speeds of the two discs may vary
very slightly, which would not necessarily be noticed when played
individually, but when joined becomes obvious. In such a case
computer processing will be needed to match the speeds. Then the
actual performance may include small alterations by the conductor or
producer who may have chosen to add interpretations to the music to
complete a side in a neat fashion. There may be a small repeated
section at the beginning of each new side to ‘restart’ the performance
which sounded fine on the original but would now need editing.
Finally the two records could have been recorded at different times so
there may be changes in performance sound and temperament, about
which little can be done. Careful listening is vital at this stage to
ensure the most coherent sound.
3.3.5 Removing unwanted artefacts from recordings

Any analogue recording will have artefacts, such as surface noise or
tape hiss, which will degrade the original sound. Fortunately a great
deal of research dealing with the restoration of recordings has been
undertaken and ways of solving noise problems are now available.
Recordings may be degraded by:
• clicks – short bursts of interference which occur at random both in
amplitude and time.
• low frequency noise transients – due to scratches and breaks in the
recording medium. They are larger and of lower frequency than
clicks.
• broadband noise – the hiss common to all recordings.
• wow and flutter – which cause pitch variations due to variations in
motor speed or eccentric discs (i.e. the spindle hole is not centred).
• distortion – due to wear in the groove, or recording at too high a
level (causing clipping).
Because the artefacts have become part of the source signal it is
difficult to suppress them simply by filtering without affecting the
overall sound quality. Many audio systems incorporate bass and treble
controls which are capable of providing tolerable short-term listening
experiences but these would not be appropriate or acceptable for audio
restoration.
Computer-based noise reduction systems try to identify the artefacts
and then use a number of techniques to remove the unwanted noise
from the audio signal. These techniques are not without their
problems and great care must be taken to ensure the final sound does
not become ‘over-processed’. CEDAR (standing for Computer
Enhanced Audio Restoration), a company based in Cambridge, UK, has
developed a range of products that use powerful digital signal
processing techniques. These are capable of identifying and removing
artefact groups by applying complex mathematical and statistical
algorithms to the digital audio data. A description of these algorithms
is beyond the scope of this course, and besides, their application is
simply performed by using the appropriate processor to remove the
artefact group. The user listens to the sound output by the CEDAR
equipment and adjusts the degree of processing to obtain the best
sound. Many record companies and broadcasters use this equipment
for audio restoration.
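To give a flavour of what identifying and removing an artefact involves, here is a deliberately simple click remover. It is not the method used by CEDAR or by any commercial product, whose algorithms are far more sophisticated; it merely flags any sample that differs from the median of its neighbours by more than a chosen threshold and replaces it with that median. The threshold, window size and test signal are all assumptions made for illustration.

```python
import math
import statistics


def remove_clicks(samples, threshold, half_window=2):
    """Replace samples that differ from the local median by more than
    the threshold with that local median."""
    repaired = list(samples)
    for i, value in enumerate(samples):
        lo = max(0, i - half_window)
        hi = min(len(samples), i + half_window + 1)
        local_median = statistics.median(samples[lo:hi])
        if abs(value - local_median) > threshold:
            repaired[i] = local_median  # patch the click
    return repaired


# Hypothetical test signal: a slow sine wave with one click added.
clean = [0.5 * math.sin(2 * math.pi * n / 100) for n in range(200)]
dirty = list(clean)
dirty[50] += 0.8

repaired = remove_clicks(dirty, threshold=0.3)
print(f"Sample 50 before: {dirty[50]:.3f}, after: {repaired[50]:.3f}")
```

The threshold plays the role of the 'degree of processing' control: set too low, it starts to alter genuine musical transients; set too high, clicks pass through untouched. This is why the result must always be judged by ear.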
Read the article associated with this activity which you will find in the
Block 3 Reader. The article is taken from the February 2003 edition of
the Gramophone magazine and describes the work of Ward Marston,
one of the most highly regarded restoration producers. This article
brings together the points I have mentioned above.
From your reading of the article in Activity 36, what does the restoration
producer consider to be his greatest asset?
Many computer-based audio editors contain tools to assist with the
restoration of sound recordings. These may include means of
eliminating clicks, reducing hiss, repairing distorted waveforms and
reducing background noises.
In this activity you will have the opportunity to try your hand at
restoring a short section of a recording from a 78 rpm disc using the
facilities provided by the course’s sound editing software. Since this
activity involves features of the program you have not used before, you
should run the course’s sound editing software, open the computer
sound file associated with this activity and then follow the steps for
this activity which you will find in the Block 3 Companion.
Comment
Before leaving this activity, make sure you have saved your finished
work as you will need it in the next activity.
3.3.6 Application of equalisation

Once the recording has been processed, the engineer can try applying
equalisation to further improve the sound (this is in addition to any
‘in-built’ equalisation such as RIAA equalisation). A rule of thumb in
the recording studio is that no more than 3 dB of boost or cut should be
applied. However, in the case of restoration, the original sound is often
so poor that steps must be taken to extract every possible detail whilst
hiding any unwanted sounds.
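To put the 3 dB rule of thumb into perspective, decibel values can be converted into amplitude ratios using the standard relationship, ratio = 10^(dB/20). The short sketch below does this; the function name is illustrative.

```python
def db_to_amplitude_ratio(gain_db):
    """Convert a gain in decibels to an amplitude (voltage) ratio."""
    return 10 ** (gain_db / 20)


boost = db_to_amplitude_ratio(3.0)
cut = db_to_amplitude_ratio(-3.0)
print(f"+3 dB multiplies the amplitude by {boost:.3f}")
print(f"-3 dB multiplies the amplitude by {cut:.3f}")
```

A 3 dB boost therefore raises the amplitude in the affected frequency range by roughly 40%, which corresponds to a doubling of power.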
In this activity you will add equalisation to the 78 rpm disc sound clip
that you restored in Activity 38. Since this activity involves a number of
detailed steps, you should run the course’s sound editing software, open
the computer sound file you saved in Activity 38 and then follow the
steps for this activity which you will find in the Block 3 Companion.

The remastering or the audio restoration of a recording is an attempt to
reproduce the sound of the original recording to a standard that is
acceptable to today’s consumer. The preservation of historic
performances demands that artefacts due to time, use and imperfect
technology are reduced to a minimum whilst ensuring the original
sounds are maintained. Artefacts may be classified and strategies used
to reduce their effects on the final sound. The most important tool
available to the restoration engineer is a keen sense of hearing.
4 DIGITAL PERSONAL JUKEBOXES
4.1 Introduction
The front cover of this part of Block 3 shows an Apple iPod which is
an example of an audio device called a personal jukebox. A personal
jukebox offers a portable music environment for anyone wanting to
listen to music. The iPod gives immediate access to nearly 500 hours
of (compressed) digital audio files (depending on the model). Digital
audio data, either copied from the user’s own sources or (legally)
downloaded from the Internet, is stored in a preferred format in a
database on a personal computer and is then transferred to iPod. By
accessing a similar database on iPod any file may be replayed in high
quality sound in a way similar to that of the original jukeboxes
illustrated in Figure 18.
4.2 The development of digital jukeboxes

The Diamond Multimedia Rio PMP300, shown in Figure 28, was the
first consumer device dedicated to playing music from compressed
(MP3) digital audio files downloaded from a computer.
Figure 28 Diamond Rio PMP300
Fearing an increased spread of copied music, the Recording Industry
Association of America (RIAA) asserted that converting digital audio
files to MP3 format was illegal. In June 1999 the Judge in the case of
RIAA versus Diamond Multimedia stated that the ‘doctrine of fair use’
(part of the US copyright law) allows users to make MP3 files from their
own recordings as long as they are for their own (non-commercial) use.
A plethora of digital players followed this judgement. To placate the
RIAA, copying files between players was made difficult by only
allowing file transfer from the computer to the player and not back
again. The music files were also encrypted so it was not possible to
transfer files between different players. Some players had fixed
memory; others used memory cards. Because of their then limited
storage capacity only relatively short playing times were available, as
demonstrated in Activity 40.
Activity 40
Chapter 4 of this block mentioned that MP3 processing gives a digital
audio file a compression ratio of about 11:1 when compared with an
uncompressed audio file such as that found on a CD. Calculate how many
minutes of music, compressed using MP3, may be stored on a 64 Mbyte
memory card if the original source is a CD.
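The arithmetic behind Activity 40 can be sketched in a few lines of Python. This is only an illustration, not the course's model answer: it assumes the standard CD parameters (44.1 kHz sampling, 16 bits per sample, two channels), and the function name and the choice between 1 000 000 and 1 048 576 bytes per Mbyte are assumptions made for the example.

```python
# Standard CD audio parameters (assumed here; not restated in this section).
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 2


def mp3_minutes(card_mbytes, compression_ratio=11.0, bytes_per_mbyte=1_000_000):
    """Estimate the playing time, in minutes, of MP3 audio on a memory card."""
    # Uncompressed CD data rate in bytes per second (176 400 bytes/s).
    cd_bytes_per_second = SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS / 8
    # MP3 at about 11:1 needs roughly one eleventh of that rate.
    mp3_bytes_per_second = cd_bytes_per_second / compression_ratio
    seconds = card_mbytes * bytes_per_mbyte / mp3_bytes_per_second
    return seconds / 60


if __name__ == "__main__":
    decimal_minutes = mp3_minutes(64)                         # 1 Mbyte = 10^6 bytes
    binary_minutes = mp3_minutes(64, bytes_per_mbyte=2**20)   # 1 Mbyte = 2^20 bytes
    print(f"64 Mbyte card: about {decimal_minutes:.0f} to {binary_minutes:.0f} minutes")
```

Whichever definition of a Mbyte is used, the result is only a little over an hour of music, which is why the early memory-card players offered such short playing times.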
Generally, the RIAA’s fears have proved to be unfounded as MP3
players were not overly popular due to their relatively high cost
compared with other portable players. Also the Sony Corporation have
been particularly aggressive in marketing their MiniDisc (MD) format
which offers very similar facilities to digital players with significantly
lower storage costs, although the players are rather more bulky.
To overcome limitations in playing times caused by relatively small
memory sizes, manufacturers have chosen to use a range of high
capacity miniature hard disk drives (originally designed for use in PC
cards for laptop computers) in place of exchangeable memory cards.
This allows hundreds of hours of music files to be stored and has
given rise to the term personal jukebox.
4.3 Case study: Apple iPod

Apple Computer, Inc. was one manufacturer that chose to use a
hard disk drive to store music. iPod was released in October 2001.
By virtue of a novel design, the size of a small bar of soap, and with
an excellent user interface, iPod soon became a ‘must-have’
accessory. Initially, it was only supported by Apple’s Macintosh range
of computers, with a Windows version following in 2002. By mid-2003
an updated version that works with both types of computer,
illustrated in Figure 29, had been released. The iPod specification
at the time of writing (2004) may be found in Box 11, but note that
this is liable to change if new models are brought out. The details in
the box are explained in the following text.
Figure 29 The 2004 Apple iPod and docking station (known as the Dock)
Box 11 iPod specification (for reference only)

Below for reference only is the outline specification for iPod in 2004. Some of the terms may be
unfamiliar to you. Most of these are explained in the accompanying text, but you should not worry
about the others.
Storage: Hard disk drive with capacities of 15 GB, 20 GB or 40 GB
Battery type: Lithium-ion-polymer
Battery life: Over 8 hours
Battery charge time: 3 hours (1-hour fast charge to 80% capacity)
Skip protection: Up to 25 minutes (32 Mbyte)
Dimensions: 10.5 cm by 6 cm by 1.6 cm
Display: 2-inch (diagonal) greyscale LCD, 160 × 120 pixels with LED back-light
Outputs: Stereo headphones and combined wired remote control; line out when
used with the Dock (see Figure 29)
Weight: 160 g
Audio formats:
  Apple Macintosh computer: AAC (16 to 320 Kbit/s), MP3 (32 to 320 Kbit/s),
  MP3 VBR, Audible, AIFF, WAV
  Windows computer: MP3 (32 to 320 Kbit/s), MP3pro, WAV
Computer interface:
  Apple Macintosh computer: FireWire port; Mac OS X v10.1.5 or later;
  CD includes iTunes for Mac OS X
  Windows computer: Windows Me, Windows 2000 or Windows XP;
  CD includes MUSICMATCH Jukebox Plus 7.5 software. (iTunes for a Windows
  computer may be obtained from the Apple website.)
4.3.1 iPod hardware

Figure 30 shows the hardware components inside an iPod. A main
electronic circuit board contains a microprocessor to control the operation
of iPod. Also included are digital-to-analogue converters, memory for the
‘skip protection’ feature (see below), and controllers for the computer
interfaces. A daughter board interfaces to the liquid crystal display (LCD)
and the touch controls. The hard disk drive stores the digital audio
data and a lithium-ion (Li-ion) battery provides the power.
Figure 30 Inside an iPod. Labelled components: electromagnetic
interference (EMI) shield; hard disk drive connector (not shown);
Li-ion battery; Cypress Semiconductor CY7C68013 (USB); display
connector; PortalPlayer PP5002D_C; daughter card display connectors;
Samsung K4S561633C-RL75 (memory); TI TSB43AA82 (FireWire interface);
Synaptics, Inc. touchpad and display daughter card; hard disk drive
Microprocessor
The PortalPlayer PP5002 ‘Superintegration System-On-Chip’
microprocessor is a complete ‘computer on a chip’ designed specifically
to support hard-disk audio jukeboxes by providing the following facilities:
• decoding of various standard digital audio data formats;
• audio processing effects including a 5-band graphic equaliser,
preset listening modes, bass boost, etc.;
• low power consumption and power management features;
• control of the hard disk drive;
• interfaces for the LCD panel, the keypad touch controller and the
on/off switch called the ‘hold’ switch.
Skip protection
Any rotating disk system requires the section of the disk containing
the data to be accessed by a read head which is positioned by an
actuator. A physical shock can move the head from its current
position, causing a break in the data being read and therefore a
disturbance to the sound output. To avoid this problem the digital
audio data from the disk is stored in a temporary memory, or buffer,
prior to conversion. A shock will cause a break and/or error in the
data entering the buffer, but because the buffer is filled more quickly
than it is emptied, any break or error can be corrected by reading the
data from the disk again. The buffer also acts as a power-saving
device: once it is filled the hard disk may be switched off until the
buffer has emptied to a predetermined low point, when the disk is
switched on again and the buffer topped up. The iPod skip memory is
32 Mbytes, providing an average of 25 minutes of playing time. At the
time of writing memory chips are still much more expensive per
megabyte than hard disk storage, hence the need for both types of
memory in iPod.
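The relationship between buffer size, bit rate and playing time can be checked with a short calculation. The sketch below is illustrative only: the function names are invented for the example, 1 Mbyte is taken as 1 000 000 bytes, and the bit rates used are assumptions rather than figures given for the iPod buffer.

```python
def buffer_minutes(buffer_mbytes, bit_rate_kbits, bytes_per_mbyte=1_000_000):
    """Minutes of audio a buffer of the given size holds at a given bit rate."""
    buffer_bits = buffer_mbytes * bytes_per_mbyte * 8
    return buffer_bits / (bit_rate_kbits * 1000) / 60


def implied_bit_rate_kbits(buffer_mbytes, minutes, bytes_per_mbyte=1_000_000):
    """Average bit rate (kbit/s) implied by a buffer size and a playing time."""
    buffer_bits = buffer_mbytes * bytes_per_mbyte * 8
    return buffer_bits / (minutes * 60) / 1000


if __name__ == "__main__":
    # A 32 Mbyte buffer quoted as 25 minutes implies about 170 kbit/s on average.
    print(f"Implied average rate: {implied_bit_rate_kbits(32, 25):.0f} kbit/s")
    # At an assumed 128 kbit/s the same buffer would last a little over half an hour.
    print(f"At 128 kbit/s: {buffer_minutes(32, 128):.1f} minutes")
```

The quoted '25 minutes' is therefore best read as an average: files encoded at lower bit rates would last longer in the buffer, and uncompressed files much less.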
LCD display
Liquid crystal displays do not generate light but rely on turning
opaque and so interrupting incident light reflected from behind the
display or from a back-light which provides illumination when the
ambient lighting is too low. LCDs have a very low power consumption
except when using the back-light. The quality of any display depends
on its resolution, and the iPod display (160 × 120 pixels, see Box 11)
has a high enough resolution to show detailed graphics as well as
text, as shown in Figure 31.
Storage
The hard disk drive has a 1.8 inch format (this
refers to the diameter of the rotating magnetic
disk inside the drive assembly – the overall
width of the drive is slightly bigger than this). In
2004 the disk in iPod is available in three sizes:
15, 20 and 40 Gbytes. Directory information
reduces these sizes by about eight percent,
giving the 15 Gbyte model a total storage
capacity of 13.8 Gbytes for digital audio data.
Figure 31 Close up of iPod’s display
Activity 41
Assuming that an average song stored in an MP3 format takes 4 Mbytes
of storage space on the iPod hard disk drive, estimate the number of
songs that can be stored on a 40 Gbyte model.
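One way to set out the estimate is sketched below in Python. It is an illustration rather than the course's model answer: the function names are invented, the eight percent directory overhead is taken from the Storage paragraph, and 1 Gbyte is assumed to be 1000 Mbytes (integer arithmetic is used so that the 15 Gbyte case reproduces the 13.8 Gbytes quoted in the text).

```python
def usable_mbytes(nominal_gbytes, overhead_percent=8, mbytes_per_gbyte=1000):
    """Disk space left for audio after directory information, in Mbytes."""
    return nominal_gbytes * mbytes_per_gbyte * (100 - overhead_percent) // 100


def song_count(nominal_gbytes, song_mbytes=4, overhead_percent=8):
    """Whole number of songs of a given size that fit on the disk."""
    return usable_mbytes(nominal_gbytes, overhead_percent) // song_mbytes


if __name__ == "__main__":
    # Check against the text: a 15 Gbyte disk leaves 13.8 Gbytes (13 800 Mbytes).
    print(usable_mbytes(15))
    for size in (15, 20, 40):
        print(f"{size} Gbyte model: roughly {song_count(size)} songs")
```

Ignoring the directory overhead gives a round figure of 10 000 songs for the 40 Gbyte model; allowing for it gives a slightly smaller number. Either way the library is far too large to manage by hand.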
Your answer to Activity 41 should have shown you that the iPod can
store a huge number of songs. This will require careful organising,
preferably by suitable database software. Various programs exist, for
example iTunes. Textual information about songs can be embedded in
files storing MP3-processed sounds and this information is used to
organise files by genre, composer or artist. iPod software will be
discussed later in this section.
Battery
The size and weight of portable equipment is governed to a large
extent by the battery and its ability to supply sufficient energy to
meet the needs of the user. This is usually given as the number of
hours of continuous operation. Using the latest battery technology
allowed the iPod designers to keep the size of the player to a
minimum. The rechargeable lithium-ion (Li-ion) battery used in
iPod supplies 3.7 volts with a storage capacity of 850 milliampere
hours (i.e. it can theoretically supply 0.85 amps continuously for
one hour). iPod will run for about 8 hours between charges in normal
use and the battery should have an operational life of 2–3 years
depending on usage.
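The battery figures above can be turned into rough estimates of stored energy and average consumption. The following Python sketch is only indicative: it assumes the whole 850 mAh is usable, that the voltage stays at 3.7 V throughout the discharge, and that the player draws a steady current, none of which holds exactly for a real battery. The function names are invented for the example.

```python
def battery_energy_wh(voltage_v, capacity_mah):
    """Stored energy in watt-hours, treating the voltage as constant."""
    return voltage_v * capacity_mah / 1000


def average_current_ma(capacity_mah, hours):
    """Average current drawn if the full capacity lasts the given time."""
    return capacity_mah / hours


def average_power_w(voltage_v, capacity_mah, hours):
    """Average power consumption over the given running time."""
    return battery_energy_wh(voltage_v, capacity_mah) / hours


if __name__ == "__main__":
    print(f"Energy stored:   {battery_energy_wh(3.7, 850):.2f} Wh")
    print(f"Average current: {average_current_ma(850, 8):.0f} mA over 8 hours")
    print(f"Average power:   {average_power_w(3.7, 850, 8):.2f} W")
```

On these assumptions the player averages a little over 100 mA, well under half a watt, which is consistent with the hard disk being switched off for most of the time while the skip buffer supplies the audio data.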
Connections
The following connections are supplied as standard:
• a dual-purpose connector for connecting to the host desktop computer
which, depending on the connecting lead used, provides either a
USB interface or a fast serial interface called FireWire;
• a combined stereo headphone and wired remote control output;
• a stereo line out connection but only when attached to its Dock.
4.3.2 iPod software

iPod must be supported by a desktop computer and application
software. iPod is supplied with iTunes for both Apple and Windows
computers (iTunes for Windows is illustrated in Figure 32). This
program allows:
• easy management of music files in a music library database;
• conversion of digital audio data to compressed data formats;
• playing, copying and making CDs;
• adding details about the title, artist(s), composer and genre to music
files either manually or automatically from online CD databases;
• downloading of music files from the library database to iPod.
Music can come from many sources, although most likely from CDs.
However, Apple, along with other music distributors, is selling songs
on the Internet for a nominal charge. At the time of writing (2004)
iTunes download services are only available in the USA, but a
well-known soft-drinks manufacturer is hosting a download site in the
UK, although not for iPod.
Figure 32 iTunes for Windows
4.3.3 iPod digital audio storage

iPod supports a range of coding standards and sampling rates,
including the compressed and uncompressed digital audio files detailed
in Table 4. However, iPod itself does not have the ability to compress
files; it can only decompress already compressed files for playing
purposes. Thus an uncompressed file cannot be compressed once it
has been downloaded into iPod, and any compression of files must be
done by the desktop computer before downloading.
Table 4 Digital audio file types stored on iPod
Data type       File format         File format
                (Apple computer)    (Windows computer)
Uncompressed    AIFF                WAV
Compressed      AAC                 MP3, MP3 pro
Most of the file formats in Table 4 have been described earlier in the
course, but the AAC and MP3 pro formats may be unfamiliar to you. In
essence both systems offer greater compression than existing methods by
using lower bit rates, typically 96 kbit/s or less as opposed to 128 kbit/s
with MP3, while still preserving audio quality. Advanced Audio Coding
(AAC) was developed by a group of organisations that included the
Fraunhofer Institute, as a successor to the MP3 format. MP3 pro was
developed by Thomson Multimedia.
MP3 pro decoders can decode standard MP3 files.
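To see what these bit rates mean in practice, they can be compared with the data rate of uncompressed CD audio. The Python sketch below is an illustration under stated assumptions: it uses the standard CD parameters (44.1 kHz, 16 bits, two channels), takes 1 Mbyte as 1 000 000 bytes, and the function names are invented for the example.

```python
# Uncompressed CD data rate in kbit/s: 44.1 kHz x 16 bits x 2 channels = 1411.2.
CD_BIT_RATE_KBITS = 44.1 * 16 * 2


def compression_ratio(bit_rate_kbits):
    """Compression ratio relative to uncompressed CD audio."""
    return CD_BIT_RATE_KBITS / bit_rate_kbits


def mbytes_per_minute(bit_rate_kbits):
    """Storage needed for one minute of audio, in Mbytes of 10^6 bytes."""
    return bit_rate_kbits * 1000 * 60 / 8 / 1_000_000


if __name__ == "__main__":
    for rate in (128, 96):
        print(f"{rate} kbit/s: about {compression_ratio(rate):.1f}:1, "
              f"{mbytes_per_minute(rate):.2f} Mbytes per minute")
```

At 128 kbit/s the ratio comes out at about 11:1, matching the figure quoted for MP3 earlier in this section, while 96 kbit/s corresponds to nearly 15:1, so the same disk holds about a third more music.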
Describe in your own words why the simple MP3 players were
relatively unsuccessful compared with personal jukeboxes. Use the
iPod as an example of the latter.
5 SURROUNDED BY SOUND
The goal of inventors such as Edison and Berliner was to reproduce
sounds as accurately as their systems allowed. Advancements in
recording technologies increased the fidelity of the sound to such a
degree that not only could the performance be reproduced faithfully
but the ambience of the surroundings in which the performance was
taking place could also be captured.
Stereophonic systems give accurate spatial information to the listener
by positioning the performers and instruments between the two
loudspeakers. However, to experience the true effect the listening
position is quite critical and not suitable for a group of listeners. To
get a true effect the recorded sound needs to envelop the listeners
giving an aural experience close to that of being in a concert hall. A
quadraphonic system, with four independent sound sources, against
the two sources in stereo, was introduced to consumers in the late
1960s. Unfortunately the disc-based technology of the time was not
adequate and there were a number of conflicting standards which led
to the system being a commercial disaster.
Since the 1950s suitably equipped cinemas have offered 4 and 6
channel sound in the wide screen movie presentations offered by
CinemaScope and Todd-AO film formats. Simple two-channel sound
would not localise the dialogue for audiences sitting off-centre leading
to an unrealistic appearance to the performers.
Initially these multi-channel sound systems were analogue, using
magnetic tracks along the edge of the film. In the early 1970s Dolby
Laboratories developed an analogue system that used light which
passed through special optical tracks at the edge of the film onto
sensors to generate four surround sound channels, left front, centre
front, right front and rear, plus two extra low frequency effects (LFE)
channels for the bangs and crashes heard in many films.
By 1992 Dolby Laboratories had developed Dolby Digital for cinemas
which provided six discrete sound channels in what is known as a
‘5.1’ format. This provides five full-bandwidth sound channels for
front left, centre and right, plus rear left and right (these are the ‘5’)
plus a sixth limited-bandwidth LFE channel (the ‘.1’ channel).
The digital audio data is contained between the sprocket holes of the
film as illustrated in Figure 33.
Figure 33 5.1 sound data is stored between the sprocket holes of the
35 mm cinema film (labels: digital sound track, picture area, analogue
sound track)
To meet the needs of the consumer Dolby adapted their ‘5.1’ system for
DVD video systems under the name Dolby Digital. It uses a digital data
compression technique based on psychoacoustic coding, called AC3.
DVD-Audio discs also support Dolby Digital sound. Figure 34 shows a
home with a typical ‘5.1’ installation. To ensure films encoded with
earlier surround systems still provide a surround sound experience in
the home, Dolby developed the four channel Pro Logic and more
recently the six channel Pro Logic II decoders, the latter able to decode
stereo into a pseudo surround sound (i.e. altering the phases of the
two channels to create sounds coming apparently from outside the
space between the two speakers as discussed in Chapter 1 of this
block).
Figure 34 5.1 surround sound installed at home (loudspeakers shown:
left, centre, right, subwoofer, left surround and right surround, fed
from a DVD player)
Surround sound is also available using the Digital Theater Systems
(DTS) format, which uses less compression and has the potential to
offer higher quality sound reproduction. However, its use is less
widespread than Dolby Digital at present.
The number of channels is not restricted to six. At the time of writing
(2004) both ‘6.1’ and ‘7.1’ formats have been developed to improve
quality of sound from the rear and sides respectively. In every case of
surround sound an amplifier and loudspeaker are necessary for each
surround channel, plus an amplifier and a special low frequency
loudspeaker called a sub-woofer for the LFE channel.
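The naming convention for these formats can be expressed as a small piece of Python. This is a sketch of the convention only: the function name and the dictionary keys are invented for the example, and it counts channels without saying where each loudspeaker is placed.

```python
def parse_surround_format(label):
    """Split a label such as '5.1' into full-bandwidth and LFE channel counts."""
    main_part, lfe_part = label.split(".")
    full_bandwidth = int(main_part)   # the digit before the point
    lfe = int(lfe_part)               # the digit after the point
    return {
        "full_bandwidth": full_bandwidth,
        "lfe": lfe,
        "total": full_bandwidth + lfe,
        # One amplifier and loudspeaker per channel, as described above.
        "amplifiers_needed": full_bandwidth + lfe,
    }


if __name__ == "__main__":
    for label in ("5.1", "6.1", "7.1"):
        info = parse_surround_format(label)
        print(f"{label}: {info['full_bandwidth']} full-bandwidth + "
              f"{info['lfe']} LFE = {info['total']} channels")
```

The number before the point counts the full-bandwidth channels and the number after it counts the limited-bandwidth LFE channels, so the total is simply their sum.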
You have an audio system that supports ‘7.1’ surround sound. How
many audio channels will you have and what does each one do?
6 PROTECTING THE COPYRIGHT
6.1 Introduction
Who owns the music you listen to? You may think you do; after all,
you bought the disc and you can listen to the contents as often as
you want. However, although you own the medium you don’t own the
right to do whatever you want with the contents. In the UK you may
make a copy of the music for your own use – after all you are unlikely to
play the two recordings at the same time so you are simply transferring
the contents to another medium for your own convenience. For example,
if you only have a compact cassette player in your car you may copy all
or part of the contents of a CD onto compact cassette for your personal
use when in the car. However you may not copy all or part of the contents
of the disc for anyone else to listen to, nor may you borrow a disc and
copy any part of it for your own personal use.
Can you legally listen to a CD borrowed from a friend?
Comment
It is quite legal to listen to a CD that you have borrowed from a friend
as long as it is the original CD and not a copy. The right to listen to the
disc has been temporarily transferred to you by the owner, who cannot
listen to the disc whilst you have it.
6.2 What is copyright?

A legal definition of copyright is ‘a limited duration monopoly’.
This means that if you create something you have exclusive rights
to do what you like with it for a limited time. After all if anybody
could freely do whatever they liked with your work then you and
others like you would not take the trouble to invent or create anything.
For something to be able to be copyrighted it must be original and
contain sufficient material to constitute a work. Unfortunately there is
no definition as to what constitutes a work and each must be tested on
a case-by-case basis. For example, consider the five notes played by the
spaceship in the film Close Encounters of the Third Kind which you
listened to at the start of Chapter 8 in Block 2. Despite their brevity
these five notes were able to be copyrighted because of their originality.
To get something copyrighted it must be in a tangible form, i.e. recorded
in a way that recognises and confirms ownership.
You have what you believe to be an original musical theme running
around in your head. A month later you hear a similar theme on the
radio. Can you claim ownership of that theme?
Comment
No, not unless you wrote down, or in some way recorded the theme
before you heard it, otherwise you have no rights to it at all.
6.3 Protecting the work

There are two main methods that can be used to protect a work from
illegal copying – using the law, or using technology. In this section I
will look briefly at both of these methods.
6.3.1 Using the law

As there is no statutory mechanism for registering your copyright
in the UK (it is different in the USA), two organisations have been
created to control the licensing and use of material. The British
Phonographic Industry (BPI) is the organisation that deals with the
record companies. The Mechanical Copyright Protection Society
(MCPS) represents the owners and creators of musical works. The MCPS
issues licences, known as mechanicals, which protect the copyright
holder by controlling the rights of others to use the material. The
royalties paid are based on the selling price of the record.
Performing works in public is also protected. In the UK the Performing
Right Society (PRS) allocates licences for music to be played or
performed in all areas to which the public has access. A scale of fees is
published and the monies collected are paid as royalties to qualifying
members. A qualifying member must have had three works commercially
recorded or broadcast within the previous two years.
List the places in which a work could be performed in public.
Comment
A work can be performed on broadcast systems such as television,
cable and radio, in pubs, clubs, factories, shopping malls and discos or
at live concerts.
Copyright has a limited life. The exact length of time depends upon
the nature of the material, but for recorded sound it is (in 2004) 50 years
from the end of the calendar year in which the work was recorded, or
from the end of the year in which it was released if the release date
is within 50 years of it being recorded. Once
copyright expires the work goes into the public domain which means
it may be freely used by anyone. However, discovering whether a work
is truly in the public domain is sometimes quite difficult.
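The 50-year rule for recorded sound described above can be written as a simple calculation. The Python sketch below is a simplified reading of the rule as stated in this section for 2004, not legal advice: the function names are invented, the years in the example are hypothetical, and real cases involve further conditions that are not modelled here.

```python
def sound_recording_expiry_year(recorded_year, released_year=None, term_years=50):
    """Last calendar year of copyright under the simplified 50-year rule.

    The term runs from the end of the year of release if the recording was
    released within `term_years` of being made, otherwise from the end of
    the year of recording.
    """
    released_in_time = (
        released_year is not None
        and recorded_year <= released_year <= recorded_year + term_years
    )
    start_year = released_year if released_in_time else recorded_year
    return start_year + term_years


def in_public_domain(current_year, recorded_year, released_year=None):
    """True once the final year of protection has ended."""
    return current_year > sound_recording_expiry_year(recorded_year, released_year)


if __name__ == "__main__":
    # Hypothetical recording made in 1950 and first released in 1960.
    print(sound_recording_expiry_year(1950, 1960))
    # The same recording, had it never been released.
    print(sound_recording_expiry_year(1950))
    print(in_public_domain(2004, 1950, 1960))
```

A delayed release therefore extends protection, which is one reason why establishing whether a particular recording is in the public domain can be difficult.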
6.3.2 Using technology

Philips and Sony Corp. developed CD technology at a time when
the technology for digital copying, including data compression
methods such as MP3, was not available. The only copy protection
system available then was the Serial Copy Management System (SCMS)
described earlier in this block. This was designed to control copying to
Sony’s newly developed DAT recorder that was described in Chapter 4
of this block.
SACD and DVD-A technologies have employed copy protection
strategies from their outset, giving music publishers and record
manufacturers good reasons for promoting these new systems. At the
time of writing the high cost of both the SACD and DVD-A hardware
and software, alongside the confusion over having two competing
standards, is preventing their immediate acceptance by consumers.
So-called off-disc systems, which use hardware circuits and associated
software to prevent copying are also being explored. The overall rider
is that the cost of implementing a copy protection system must
compare favourably with its efficiency in preventing copying.
Unfortunately the Internet is a big source of illegal music files.
Despite the demise of the original Napster system which allowed
Internet users to share music files, many user-to-user file transfer
programs still exist and are suspected to be involved in fraudulent
activities. To combat the fraudsters the music industry has
developed so-called ‘digital decoys’. These are spoof web sites,
apparently offering the latest music files, set up and maintained by
representatives of the record manufacturers. When these sites are
accessed files containing noise or loops of sounds are downloaded
by unsuspecting users who will be frustrated when they come to
listen to their latest acquisition. While this may appear to be easily
foiled, especially as anti-spoof programs are available allowing
fraudsters to recognise and avoid spoof sites, research shows that
frustrated users are being persuaded to join legitimate music
subscription clubs.
Digital audio broadcasting (DAB) may be of concern to the music
industry in the future, as potentially high quality copies could be
made off-air. At present, however, this does not appear to concern the
music industry, owing to the low take-up of digital audio broadcasting
and its use of low bit rates coupled with compressed digital audio
data, as discussed in Chapter 4 of this block.
6.4 The current situation

Copyright with respect to music and the making of illegal copies of
recorded music are hot topics and the situation is changing rapidly
as new technologies and recording systems come on stream. To ensure
currency the Course Team has collected together a set of articles
that should give you up-to-date information about both the legal
and technological aspects of copyright with respect to music
recordings. In the next activity you will be asked to read these
articles.
Read the articles and any surrounding text associated with this activity
which you will find detailed in the Block 3 Reader. These articles may
be provided in the reader booklet, or they may be found on-line –
details about this will be found in the reader. The articles contain up-
to-date information about the copyright situation with regard to music
recordings. With each article the Course Team may include a short
introduction, a list of any new terms used in the article, a conclusion
and one or more self-assessment activities.

Copy protection systems are here to stay. All new digital standards
will employ them, as demonstrated by the current SACD and DVD-A
standards. They will become ever more sophisticated in their
operation in an attempt to defeat the pirates. Our concern, as
consumers, should be as to whether our existing rights are likely to be
diminished by new copyright laws and copy prevention technologies.
Ultimately, whatever protection is applied in the digital domain,
conversion to and from the analogue domain will always be an option.
This ‘analogue hole’ will remain for the foreseeable future since high
quality digital-to-analogue and analogue-to-digital converters are
capable of re-digitising the signal without protection and without any
significant loss of quality, so providing the means for the pirates to
continue to operate.
7 SUMMING UP
In this final chapter of Block 3, you have looked briefly at the
fascinating history of sound recording and have seen how this has had
as much to do with the people involved and the commercial
considerations as it has with the technologies used.
After over 100 years of sound recordings, the deficiencies of the
various methods of recording in terms of their longevity are now
coming to light, and this has led to the development of a range of
techniques to restore old recordings before they are lost forever
through the ravages of time.
Today’s digital techniques have produced a range of personal
jukeboxes which are pocket-sized devices that allow vast numbers of
songs to be stored and played. One of these devices is the Apple iPod
which, mainly due to its design and ease of use, became a ‘must have’
item on its release.
Advancements in systems are not confined to two channel (stereo)
recordings. Surround sound systems have been in use in cinema
sound for some time; however, such systems are now appearing in
consumer systems.
As making digital copies of audio becomes easier and easier, resorting
to the law to prevent illegal copying becomes impractical for all but the
largest of infringements. Thus the music industry is looking at more
and more sophisticated ways of protecting copyright through
technology.
Both in this chapter and in Chapter 4 of Block 2 on stringed
instruments, snippets of Miles Davis’s classic recording Kind of Blue
have been used. As a conclusion to this chapter the Course Team
thought you might like to listen to the whole of one track from this
album, not only hopefully to enjoy the music, but also to compare the
quality of different recording media.
Listen to the two audio tracks associated with this activity. Both tracks
contain the same performance of ‘So What’ from Miles Davis’s Kind of
Blue album. The first track is the version taken from a vinyl LP record,
and the second is from the remastered CD where the pitch has been
corrected (see Box 10). The tracks each last around 9 minutes, so do
not feel you have to listen to all of both tracks unless you want to.
However, you should listen critically to at least some of both
recordings, comparing in particular the quality of the recordings in
terms of frequency response, noise level, distortion and dynamic range.
Do remember, though, that in both cases the original recording was made
on a professional analogue tape recorder of the late 1950s, so the
quality can possibly never match that of today’s totally digital
recordings.
SUMMARY OF CHAPTER 5
The history of the recorded music industry is as much about the people
involved, the recordings they made and the prevailing times as it is
about the technology involved. Changes in the technology of recorded
music have not only affected the way in which we listen to music but
they have also had an enormous impact on the legal aspects of
copyright. (Section 1)

Before sound could be recorded, people’s experience of listening to
music was mainly confined to live concerts, music-making in the home
and mechanical music such as a barrel organ. With the advent of
recorded music, not only could performers hear how they sounded to
others, but it has shifted performer’s emphasis in performance from
immediacy to perfection. (Section 2.1)

Recording sound involves capturing variations in air pressure and
storing them onto an appropriate medium. Playing back a recording
recreates vibrations in the air so that the sound can be heard.
(Section 2.1)

Cylinders were the first recording medium. They used purely mechanical
recording and playback methods whereby a stylus vibrating vertically
in response to the sound inscribed a groove in a tin-foil disc. At the
time, speech rather than music, was the perceived application.
(Section 2.2.1)

Rivals spurred development of wax cylinder and disc recordings. Wax
cylinders gave better quality reproduction than tin-foil, allowed the
recording to be replayed a number of times and also allowed re-use by
shaving off a layer of wax. However, the replay sound level was rather
low. Recordings of musical performances were eventually produced using
cylinders and a clockwork-operated graphophone. (Section 2.2.2)

Meanwhile the gramophone disc recording system was being developed
which used lateral rather than vertical motion of the stylus. This
system allowed the use of cheap steel needles and did not require a
feed-screw mechanism to move the stylus across the record. However,
the lateral system did have the problem that large deflections could
cause breakthrough between adjacent tracks. (Section 2.2.3)

One of the problems with the cylinder system was that it was difficult
to make copies: the numbers of copies able to be made from a master
cylinder was limited and so multiple master recordings had to be made.
However, even when a manufacturing process was developed, it was found
that multiple copies of discs were much easier to produce than
cylinders. The basic processes used to duplicate the early discs are
still used in today’s manufacture of CDs. (Section 2.2.4)

The gramophone disc system was enhanced by the use of a clockwork
turntable motor, a standardised speed of 78 rpm and double-sided
discs. (Section 2.2.5)

Discs eventually succeeded over cylinders mainly due to the choice of
artists and music that was made available on this medium rather than
the quality of sound. (Section 2.2.6)

The development of discs continued with the incorporation of electric
recording technologies which used electromagnetic groove cutters and
pickups. This increased sensitivity and with the use of electronic
equalisation techniques substantially improved the dynamic range and
noise levels of disc recordings. Further development occurred with the
introduction of discs running at 45 and 33 1/3 rpm and the use of
vinyl as the disc material. Stereophonic recordings were also
introduced using the 45/45 technique whereby one of the sound channels
is contained in each of the groove walls. (Section 2.2.7)

Magnetic sound recording could not be implemented using mechanical or
acoustic techniques, and so awaited development of suitable electronic
technologies. The first magnetic sound recorders used steel wire or
tape as the recording medium. (Sections 2.3.1 and 2.3.2)

Magnetic recording only really took off with the introduction of
magnetic tape which increased playing time and sound quality and
enabled recordings to be edited by splicing and joining sections of
tape, a property especially valued by broadcast and recording studios.
(Section 2.3.3)

The reluctance of consumers to use tape because of its fragility and
the fiddly process of threading tape reels was overcome with the
introduction of the compact cassette. Sound quality of cassettes was
improved by improved tape compounds and the Dolby B noise reduction
system which used a dynamic process to maximise the reduction in tape
noise. (Section 2.3.4)

High quality analogue multitrack magnetic tape recorders have had a
huge impact on the recording industry, and have only recently been
superseded by equivalent digital recorders. (Section 2.3.5)

Many early recordings need to be preserved as they have historical
importance and restoration of recordings has become an important part
of the recorded music industry. (Section 3.1)

Cylinders and discs have proved to be very stable media compared with
magnetic tape which suffers over time from a number of problems from
flaking to print through. The reliability of CDs has still to be
proved, but already there are signs that CDs deteriorate faster than
might be expected. (Section 3.2)

The audio restoration process uses both analogue and digital
technologies with the aim of extracting the best quality of the sound
from the source(s) available. The best sources are first selected, and
converted to digital form using the highest possible sample rate and
number of quantisation levels. Editing of sections is then carried out
and unwanted artefacts are removed. Finally equalisation may be added
to further improve the result. A good ear is vital in every stage of
the restoration process. (Section 3.3)

The miniaturisation of electronic circuitry, the emergence of small
digital memories for laptop computers and the development of
sophisticated sound compression systems such as MP3 have all helped in
the development of small pocket-sized devices called personal
jukeboxes which are able to store a large number of songs.
(Section 4.2)

The Apple iPod is an example of one of these personal jukeboxes. This
device uses a miniature hard disk to store a huge quantity of sound in
a variety of formats and is designed to be used in conjunction with a
desktop computer. Its popularity stems from its novel design, friendly
user interface and incorporation of easy-to-use database software for
the management of the songs stored. The iPod is an example of how
audio and computer technologies are converging. (Section 4.3)

Surround sound systems enable listeners to experience sounds as if
they were positioned within the recording environment. In general the
number of channels is given by the nomenclature in the description,
e.g. ‘5.1’ indicates 5 surround channels plus one low frequency
effects channel. Leading the development of surround sound for
consumer systems are the Dolby Laboratories who have much experience
in cinema surround sound systems. (Section 5)

Copyright protects peoples’ creations or ‘works’ from exploitation by
others, and is of great concern to the music industry now that digital
technology has made it easy for people to make multiple copies of
recordings without any degradation of the sound. Protection of a work
can be achieved by legal and/or technical means. The law provides
protection and payments to creators and performers through a system of
licences and royalties. Technology can also secure rights to creators
and artists by restricting copying of digital media. However, the
situation is continually changing with new legal and technological
measures being introduced to try to keep up with the progress of sound
recording technology. (Section 6)
ANSWERS TO SELF-ASSESSMENT ACTIVITIES
Activity 3
The bandwidth of a system is the range of frequencies over which an
audio device responds equally (i.e. has a flat response). In the case of
an audio CD system the frequency response would need to be flat over
the range of frequencies the CD contains, i.e. 20 Hz to 20 kHz.
Dynamic range is derived from the amplitude range between the
loudest level that can be reproduced without producing distortion and
the softest level that can be reproduced without being enveloped in
noise. Dynamic range is the ratio between these two values and is
usually expressed in decibels. In the case of an audio CD system the
dynamic range needs to be at least 90 dB.
Signal-to-noise ratio gives an indication of the noise in the system.
It is expressed as a ratio of the wanted signal power to the noise power
in the system, and is usually expressed in decibels. A typical value for
digital audio equipment is 100 dB.
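The decibel figures quoted in this answer can be related back to plain ratios with a short calculation. The following is only an illustrative sketch, assuming the usual conventions of 20 log10 for amplitude ratios and 10 log10 for power ratios; the function names are invented for this example and do not come from the course text.

```python
import math


def amplitude_ratio_to_db(ratio: float) -> float:
    """Express an amplitude ratio (e.g. loudest to softest level) in decibels."""
    return 20.0 * math.log10(ratio)


def power_ratio_to_db(ratio: float) -> float:
    """Express a power ratio (e.g. signal power to noise power) in decibels."""
    return 10.0 * math.log10(ratio)


def db_to_amplitude_ratio(db: float) -> float:
    """Inverse of amplitude_ratio_to_db."""
    return 10.0 ** (db / 20.0)


def db_to_power_ratio(db: float) -> float:
    """Inverse of power_ratio_to_db."""
    return 10.0 ** (db / 10.0)


# A 90 dB dynamic range corresponds to an amplitude ratio of roughly 31 623:1.
print(round(db_to_amplitude_ratio(90.0)))

# A 100 dB signal-to-noise ratio corresponds to a signal-to-noise power ratio of about 10**10.
print(db_to_power_ratio(100.0))

# The ratio between the largest and smallest steps of a 16-bit code (65 536:1)
# comes to about 96 dB, which is consistent with the 'at least 90 dB' figure.
print(round(amplitude_ratio_to_db(2 ** 16), 1))
```

The last line is an aside rather than part of the answer: it simply shows that a 16-bit code spans a range slightly larger than the 90 dB quoted for the CD system.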
Activity 12
(a) Edison’s vertical (hill and dale) method used on phonograph
cylinders was more suitable for recording loud sounds as a greater
depth of deflection of the diaphragm was possible. Berliner’s
lateral groove on the gramophone disc limited deflection of the
diaphragm otherwise the groove wall would be broken. So, in
principle vertical recording gave a better dynamic range.
(b) The larger the movement of the diaphragm in the phonograph
pickup the louder the sound and so the signal-to-noise ratio was
increased thus making the system more suitable for mechanical
reproduction.
Activity 16
The frequency range of many instruments was outside that of the
mechanical transducer. Timpani and large stringed instruments just
would not be heard. Similarly, many of the small woodwind family
suffered in this way. Further, the acoustic qualities of many
instruments were too delicate to be reproduced above the surface noise
of the disc. An alto or tenor voice with either a piano or brass
instruments offered the best opportunity to obtain a good recording.
Activity 22
(a) If the hole is not in the centre the disc will turn eccentrically
causing variations in the linear speed and consequently changes in
the pitch of the replayed sound.
(b) If the disc is not flat there will be an audible noise every time the
‘bump’ is encountered. This will reduce the signal-to-noise ratio.
(c) The smoother the material the better the signal-to-noise ratio as
any ‘roughness’ in the groove will be reproduced as noise. (This is
apparent when comparing the signal-to-noise ratio of 78 rpm discs
made of shellac and slate dust which is quite rough against the
smooth plastic material of the LP.)
Activity 25
Table 5 compares a 78 rpm disc with magnetic tape media.

Table 5 Comparison of 78 rpm disc and magnetic tape media

Characteristic        78 rpm disc           Magnetic tape
Frequency response    30 Hz – 8 kHz         30 Hz – 15 kHz
Playing time          5 minutes per side    30 minutes (minimum)

Tape could offer twice the bandwidth and six times the playing time.
The frequency response of magnetic tape was between 30 Hz and 15 kHz,
and the playing time was up to 30 minutes. The frequency response of a
78 rpm record was 30 Hz to 8 kHz. The playing time was up to 5 minutes
for each side of a 12-inch (30 cm) disc.
Activity 26
On the tape the sound is recorded as a series of magnetic fluctuations
along its length. In order to get to a particular part of the recording the
tape must be wound forwards or backwards. This may take several
minutes especially if the required sound is at the other end of the tape.
On discs the sound is also recorded serially as a single spiral track.
However, finding the equivalent sound on a disc is relatively quick:
the pickup is simply placed at the appropriate place on the disc
surface. This takes the same time wherever the sound is on the disc.
The speed and ease of access to particular songs has always given
an advantage to the disc over tape as a commercial replay medium.
Activity 34
The main problem with using any medium for archiving is that of it
deteriorating. In the case of CDs, this includes flaking or oxidisation of
the reflective layer, fogging up of the plastic disc and damage due to
mishandling or bad storage. In addition, as technology continues to
change rapidly, unless a programme of constant updating to new
media is employed, suitable replay devices will have to be kept
working to ensure the data on any of today’s media will be able to be
recovered in the future.
Activity 37
A good sense of hearing!
Activity 40
A stereo CD stores 2 channels of music using 16 bits per sample (65 536
quantisation levels) at a 44.1 kHz sampling rate which gives:
2 × 16 × 44 100 = 1 411 200 bits/s.
Compressed by a ratio of 11:1 gives a bit rate of: 1 411 200/11 =
128 291 bits/s.
The memory card stores 64 Mbytes of data which is 64 × 1 048 576 × 8
= 536 870 912 bits. Thus with MP3 compression the memory card can
store 536 870 912/128 291 = 4185 seconds or 69.75 minutes of sound.
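The arithmetic in this answer can be checked step by step. The sketch below is illustrative only: it assumes 16 bits per sample and 1 Mbyte = 1 048 576 bytes, as the answer does, and the constant and function names are invented for the example.

```python
# Figures taken from the Activity 40 answer.
CHANNELS = 2
BITS_PER_SAMPLE = 16
SAMPLE_RATE_HZ = 44_100
COMPRESSION_RATIO = 11
CARD_MBYTES = 64
BYTES_PER_MBYTE = 1_048_576  # 1 Mbyte taken as 2**20 bytes, as in the answer


def pcm_bit_rate(channels: int, bits_per_sample: int, sample_rate_hz: int) -> int:
    """Bit rate of uncompressed PCM audio in bits per second."""
    return channels * bits_per_sample * sample_rate_hz


def playing_time_seconds(capacity_bits: int, bit_rate: float) -> float:
    """Playing time that fits in a store of the given size at the given bit rate."""
    return capacity_bits / bit_rate


uncompressed_rate = pcm_bit_rate(CHANNELS, BITS_PER_SAMPLE, SAMPLE_RATE_HZ)
compressed_rate = uncompressed_rate / COMPRESSION_RATIO
card_bits = CARD_MBYTES * BYTES_PER_MBYTE * 8  # convert bytes to bits
seconds = playing_time_seconds(card_bits, compressed_rate)

print(f"Uncompressed bit rate: {uncompressed_rate} bits/s")
print(f"Compressed bit rate:   {compressed_rate:.0f} bits/s")
print(f"Card capacity:         {card_bits} bits")
print(f"Playing time:          {seconds:.0f} s ({seconds / 60:.2f} min)")
```

Note that the card capacity has to be expressed in bits, not bytes, before dividing by the bit rate; dividing the byte count by the bit rate would give a playing time eight times too short.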
Activity 41
The usable storage capacity of a HDD is about 8% less than its total
capacity. So the usable capacity of a 40 Gbyte HDD is
40 – ((40 × 8)/100) = 36.8 Gbytes or approximately 36 800 Mbytes. (This is
an approximation because 1 Gbyte is actually 1024 and not 1000 Mbytes.)
If an average song needs 4 Mbytes then the total number of songs that
can be stored is approximately 36 800/4 = 9 200 songs.
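The same estimate can be written as a small calculation, which also shows the effect of the 1000 versus 1024 approximation mentioned in the answer. This is a sketch only: the function names are invented for the example, integer arithmetic is used to avoid rounding surprises, and applying the 8% allowance to the 1024-based figure is an assumption made purely for comparison.

```python
def usable_mbytes(nominal_gbytes: int,
                  mbytes_per_gbyte: int = 1000,
                  overhead_percent: int = 8) -> int:
    """Usable capacity in Mbytes after deducting the stated overhead."""
    nominal_mbytes = nominal_gbytes * mbytes_per_gbyte
    return nominal_mbytes * (100 - overhead_percent) // 100


def song_count(capacity_mbytes: int, song_mbytes: int = 4) -> int:
    """Number of whole songs of the given average size that fit."""
    return capacity_mbytes // song_mbytes


# Using the answer's approximation of 1 Gbyte = 1000 Mbytes.
approx_capacity = usable_mbytes(40)
print(approx_capacity, song_count(approx_capacity))

# Using 1 Gbyte = 1024 Mbytes instead (assumption for comparison only).
binary_capacity = usable_mbytes(40, mbytes_per_gbyte=1024)
print(binary_capacity, song_count(binary_capacity))
```

With the 1000-based figure the result matches the 9 200 songs in the answer; the 1024-based figure gives a slightly larger count, which is why the answer describes its own value as an approximation.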
Activity 42
MP3 players were fitted with small memories allowing playing times
of around an hour. Additional memory cards could be fitted to some
players but these were relatively expensive and only held a similar
amount of music. Exchanging music files was controlled to an extent
by encrypting the data to a specific player.
Personal jukeboxes overcome the limited playing time by using
miniature hard disk drives. This gives a much larger storage ability but
requires careful management from database software on a desktop
computer.
Activity 43
‘7.1’ surround sound has eight audio channels. As with all surround
sound systems, the ‘.1’ channel is the low frequency effects channel.
The other seven are front left, front centre, front right, side left,
side right, rear left and rear right.
LEARNING OUTCOMES
After studying this chapter you should be able to:
1 Explain correctly the meaning of the emboldened terms in the main
text and use them correctly in context.
2 Give a brief account of the history of the record industry.
3 Describe the methods used for storing analogue audio recordings
introduced in the main text, highlighting their technological
aspects. (Activities 12, 16, 22 and 25)
4 Make informed judgements as to the quality of a sound recording
through analysis of the audio signal. (Activities 9, 14, 19 and 28)
5 Describe the stages in the restoration of archived recordings.
(Activities 34 and 37)
6 Demonstrate some of the techniques used to restore archived
recordings. (Activities 38 and 39)
7 Explain the development of personal jukeboxes and outline their
typical features and facilities and the technologies they use.
(Activities 40, 41 and 42)
8 Outline current consumer surround sound systems and their
emergence from cinema systems. (Activity 43)
9 Be aware of legal aspects regarding the copyright law as applied to
evolving technologies in the audio recording field. (Activity 47)
10 Identify instances of both legal digital copying of protected material
and of copyright infringements. (Activity 47)
11 Outline the techniques that are used to protect digital recordings
from unlawful copying. (Activity 47)
Acknowledgements
Grateful acknowledgement is made to the following sources for
permission to reproduce material in this chapter:
Figures 8 & 9: The Royal Scottish Museum, Edinburgh;
Figure 14: Ampex GB Limited.
We also thank Nigel Bewley (British Library Sound Archive),
Daniel Leech-Wilkinson (Kings College, London) and Robert Philip
(Open University) for advice on the restoration of recordings;
Stephen Potter for loan of the 78 rpm disc set of the Elgar Violin
Concerto and Bill Strang for playing the piano for Activities 17 and 26.
Every effort has been made to trace all the copyright owners, but if
any has been inadvertently overlooked, the publishers will be
pleased to make the necessary arrangements at the first opportunity.
Acknowledgement
Cover image: © 1997 Photodisc, Inc.
