Contents
3. Microphone Technology
Copyright Notice
You are licensed to make as many copies as you reasonably require for
your own personal use.
Chapter 1: Microphone Technology
Now, if you want to talk about something that really will make or break
the end product, that is how microphones are used. Two sound engineers
using the same microphones will instinctively position and direct them
differently and there can be a massive difference in sound quality. Give
these two engineers other mics, whose characteristics they are familiar
with, and the two sounds achieved will be identifiable according to
engineer, and not so much according to microphone type.
Microphone Construction
Piezoelectric
Dynamic
The dynamic mic produces a signal that is healthy in both voltage and
current. Remember that it is possible to exchange voltage for current, and
vice versa, using a transformer. All professional dynamic mics
incorporate a transformer that gives them an output impedance of
somewhere around 200 ohms. This is a fairly low output impedance that
can drive a cable of 100 meters or perhaps even more with little loss of
high frequency signal (the resistance of a cable attenuates all frequencies
equally, the capacitance of a cable provides a path between signal
conductor and earth conductor through which high frequencies can
‘leak’). It is not necessary therefore to have a preamplifier close to the
microphone, neither does the mic need any power to operate. Examples
of dynamic mics are the famous Shure SM58 and the Electrovoice RE20.
The characteristics of the dynamic mic are primarily determined by the
weight of the coil slowing down the response of the diaphragm. The
sound can be good, particularly on drums, but it is not as crisp and clear
as it would have to be to capture delicate sounds with complete accuracy.
Dynamic microphones have always been noted for providing good value
for money, but other types are now starting to challenge them on these
grounds.
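The point above about a 200 ohm output driving 100 meters of cable can be checked with a rough calculation. The output impedance and the cable capacitance form a simple low-pass filter, and the sketch below estimates where the high frequencies start to 'leak' away. The cable capacitance of 100 pF per meter and the 20 kilohm comparison source are assumed, illustrative values, not figures from this text.

```python
import math

def cable_cutoff_hz(source_impedance_ohms, cable_length_m, capacitance_pf_per_m):
    """-3 dB frequency of the low-pass filter formed by the source
    impedance and the total capacitance of the cable."""
    total_capacitance_f = cable_length_m * capacitance_pf_per_m * 1e-12
    return 1.0 / (2.0 * math.pi * source_impedance_ohms * total_capacitance_f)

# Assumed cable capacitance (hypothetical, typical order of magnitude only).
CABLE_PF_PER_M = 100

# 200 ohm microphone output, as described in the text.
low_z_cutoff = cable_cutoff_hz(200, 100, CABLE_PF_PER_M)

# Hypothetical high impedance source, for comparison only.
high_z_cutoff = cable_cutoff_hz(20_000, 100, CABLE_PF_PER_M)

print(f"200 ohm source, 100 m cable: cutoff about {low_z_cutoff / 1000:.1f} kHz")
print(f"20 kohm source, 100 m cable: cutoff about {high_z_cutoff:.0f} Hz")
```

Under these assumptions the low impedance source keeps its cutoff well above the audio band, while the high impedance source would lose most of its high frequencies over the same cable.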
Ribbon Mic
Capacitor
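The charge Q on a capacitor, its capacitance C and the voltage V between
its plates are related by:
Q = C × V
or:
V = Q/C
or:
C = Q/V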
Now the tricky part: capacitance varies according to the distance between
the plates of the capacitor. The charge, as long as it is either continuously
topped up or not allowed to leak away, stays constant. Therefore as the
distance between the plates is changed by the action of acoustic
vibration, the capacitance will change and so must the voltage between
the plates. Tap off this voltage and you have a signal that represents the
sound hitting the diaphragm of the mic.
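The relationship V = Q/C can be tried out numerically. The sketch below treats the capsule as an ideal parallel-plate capacitor with an air gap, holds the charge constant, and shows the voltage following the plate spacing. The plate area, gap and 48 V rest voltage are hypothetical values chosen only for illustration.

```python
EPSILON_0 = 8.854e-12  # permittivity of free space, farads per meter

def capsule_capacitance(plate_area_m2, gap_m):
    """Parallel-plate capacitance, C = epsilon_0 * A / d (air gap assumed)."""
    return EPSILON_0 * plate_area_m2 / gap_m

def capsule_voltage(charge_c, plate_area_m2, gap_m):
    """Voltage between the plates for a fixed charge, V = Q / C."""
    return charge_c / capsule_capacitance(plate_area_m2, gap_m)

# Hypothetical capsule: 1 square centimeter plates, 25 micrometer gap,
# charged to 48 V at rest. None of these figures come from the text.
AREA = 1e-4
REST_GAP = 25e-6
REST_VOLTAGE = 48.0

# The charge is fixed once, at rest, and then held constant.
charge = REST_VOLTAGE * capsule_capacitance(AREA, REST_GAP)

for change in (-0.01, 0.0, 0.01):
    gap = REST_GAP * (1 + change)
    volts = capsule_voltage(charge, AREA, gap)
    print(f"gap {gap * 1e6:.2f} micrometers -> {volts:.2f} V")
```

In this idealized model a 1% change in spacing gives a 1% change in voltage, which is the signal that is tapped off.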
Sennheiser MKH 40
Old capacitor mics used to have bulky and inconvenient power supplies.
These mics are still in widespread use so you would expect to come
across them from time to time. Modern capacitor mics use phantom
power. Phantom power places +48 V on both of the signal carrying
conductors of the microphone cable actually within the mixing console or
remote preamplifier, and 0 V on the earth conductor. So, simply by
connecting a normal mic cable, phantom power is connected
automatically. That's why it is called ‘phantom’ – because you don't see
it! In practice this is no inconvenience at all. You have to remember to
switch it on at the mixing console but that's pretty much all there is to it.
Dynamic mics of professional quality are not bothered by the presence of
phantom power in any way. One operational point that is important
however is that the fader must be all the way down when a mic is
connected to an input providing phantom power, or when phantom power
is switched on. Otherwise a sharp crack of speaker-blowing proportions
is produced.
Electret
All of this is nice in theory, but is almost never borne out in practice.
Take a nominally cardioid mic for example. It may be an almost perfect
cardioid at mid frequencies, but at low frequencies the pattern will spread
out into omni. At high frequencies the pattern will tighten into
hypercardioid. The significant knock-on effect of this is that the
frequency response off-axis – in other words any direction but head on –
is never flat. In fact the off-axis response of most microphones is nothing
short of terrible and the best you can hope for is a smooth roll-off of
response from LF to HF. Often though it is very lumpy indeed. We will
see how this affects the use of microphones at another time.
Omnidirectional
Figure-of-Eight
AKG C414
Special Microphone Types
Stereo Microphone
Two capsules may be combined into a single housing so that one mic can
capture both left and right sides of the sound field. This is much more
convenient than setting two mics on a stereo bar, but obviously less
flexible. Some stereo mics use the MS principle where one cardioid
capsule (M) captures the full width of the sound stage while the other
figure-of-eight capsule (S) captures the side-to-side differences. The MS
output can be processed to give conventional left and right signals.
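The processing of MS into left and right is commonly done with a sum-and-difference matrix, and a minimal sketch of that is given here. The text does not say which matrix or gain scaling a particular microphone or decoder uses, so the unscaled form below (left = M + S, right = M - S) is an assumption, and the sample values are made up.

```python
def ms_to_lr(mid, side):
    """Convert mid and side sample lists to left and right
    using the sum-and-difference matrix (no gain scaling assumed)."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right

# Made-up sample values: the first sample is a centered sound (no side
# signal), the second is a sound arriving entirely from the left.
mid_samples = [1.0, 0.5]
side_samples = [0.0, 0.5]

left, right = ms_to_lr(mid_samples, side_samples)
print("left: ", left)
print("right:", right)
```

A sound with no side component ends up equal in both channels, while raising or lowering the level of S before decoding would widen or narrow the stereo image.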
The original boundary effect microphone was the Crown PZM (Pressure
Zone Microphone) so the boundary effect microphone is often referred to
generically as the PZM. In this mic, the capsule is mounted close to a flat
metal plate, or inset into a wooden or metal plate. Instead of mounting it
on a stand, it is taped to a flat surface. One of the main problems in the
use of microphones is reflections from nearby flat surfaces entering the
mic. By mounting the capsule within around 7 mm of the surface,
these reflections add to the signal in phase rather than interfering with it.
The characteristic sound of the boundary effect microphone is therefore
very clear (as long as there are no other nearby reflecting surfaces). It can
be used for many types of recording, and can also be seen in police
interview rooms where obviously a clear sound has to be captured for the
interview recording. The polar response is hemispherical.
Miniature Microphone
Beyerdynamic MCE5
Vocal Microphone
Microphone Accessories
• What is a pad?
• Audio book
• Radio presentation, interview or discussion
• Television presentation, interview or discussion
• News reporting
• Sports commentary
• Film and television drama
• Theatre
• Conference
‘A pleasing tone of voice’? Well, first choose your voice talent. Second,
it is a fact that some microphones flatter the voice. Some work
particularly well for speech, and there are some classic models such as
the Electrovoice RE20 that are commonly seen in this application.
Generally, one would be looking for a large-diaphragm capacitor
microphone, or a quality dynamic microphone for natural or pleasing
speech for audio books or radio broadcasting.
News Reporting
Sports Commentary
Theatre
Conference
You will have noticed that in this context microphones are often used in
pairs. There are two schools of thought on this issue. One is that the
microphones should point inwards from the front corners of the lectern.
This allows the speaker to turn his or her head and still receive adequate
pickup. Unfortunately, as the head moves, both microphones can pick up
the sound while the sound source – the mouth – is moving towards one
mic and away from the other. The Doppler effect comes into play and
two slightly pitch shifted signals are momentarily mixed together. It
sounds neither pleasant nor natural. The alternative approach is to mount
both microphones centrally and use one as a backup. The speaker will
learn, through not hearing their voice coming back through the PA
system, that they can only turn so far before useful pickup is lost.
It is worth saying that in this situation, the person speaking must be able
to hear their amplified voice at the right level. If their voice seems too
loud, to them, they will instinctively back away from the mic. If they
can’t hear their amplified voice they will assume the system isn’t
working. I once saw the chairman of a large and prestigious organisation
stand away from his microphone because he thought it wasn’t working. It
had been, and at the right level for the audience. But unfortunately, apart
from the front few rows, they were unable to hear a single unamplified
word he said.
Use of Microphones for Music
The way in which microphones are used for music varies much more
according to the instrument than it possibly could for speech where the
source of sound is of course always the human mouth. First, some
scenarios:
• Recording
• Broadcast
• Public address
• Recording studio
• Location recording
• Concert hall
• Amplified music venue
• Theatre
Point the microphone at the sound source from the direction of the best
natural listening position.
So, wherever you would normally choose to listen from is the right
position for the microphone, except that the microphone has to be closer
because it can’t discriminate direct sound from reflected sound in the
way the human ear/brain can. It is always a good starting point to follow
these two rules, but of course it may not always be possible, practical, or
a natural sound may not be wanted for whatever reason. Broadcasters, by
the way, tend to place the microphone closer than recording engineers.
They need to get a quick, reliable result, and a close mic position is
simply safer for this purpose. Ultimate sound quality is not of such
importance.
In theatre musicals, the best option for the lead performers is to use
miniature microphones with radio transmitters. The placement of the mic
is significant. The original ‘lavalier’ placement, named for Mme Lavalier
who reportedly wore a large ruby hanging from her neck, has long gone. The
chest position is great for newsreaders but it suffers from the shadow of
the chin and boominess caused by chest resonance. The best place for a
miniature microphone is on a short boom extending from behind the ear.
Mics and booms are available in a variety of flesh colours so they are not
visible to the audience beyond the second or third row. If a boom is not
considered acceptable, then the mic may protrude a short distance from
above the ear, or descending from the hairline. This actually captures a
very good vocal sound. It has to be tried to be believed. One of the
biggest problems with miniature microphones in the theatre is that they
become ‘sweated out’ after a number of performances and have to be
replaced. Still, no-one said that it was easy going on stage. For the
orchestra in a theatre musical, clip on mics are good for string
instruments. Wind instruments are generally loud enough for
conventional stand mics, closely placed. So-called ‘booth singers’ can
use conventional mics.
Stereo Microphone Techniques
Separating the mics by around 10 cm tears the theory into shreds, but it
sounds a whole lot better.
ORTF
Mercury Living Presence was one of the early stereo techniques of the
1950s, used for classical music recordings on the Mercury label. If you
imagine trying to figure out how to make a stereo recording when there
was no-one around to tell you how to do it, you might work out that one
microphone pointing left, another pointing center and a third pointing
right might be the way to do it. Record each to its own track on 35mm
magnetic film, as used in cinema audio, and there you have it! Nominally
omnidirectional microphones were used, but of course the early omni
mics did become directional at higher frequencies. Later recordings were
made to two-track stereo. These recordings stand up remarkably well
today. They may have a little noise and distortion, but the sound is
wonderfully clear and alive.
The same can be said of the Decca tree, used by the Decca record
company. This is not dissimilar from the Mercury Living Presence
system but baffles were used between the microphones in some instances
to create separation, and additional microphones might be used where
necessary, positioned towards the sides of the orchestra.
Decca tree
Instruments
Saxophone
There are two fairly obvious ways a saxophone can be close miked. One
is close to the mouthpiece, another is close to the bell. The difference in
sound quality is tremendous. The same applies to all close miking. Small
changes in microphone position can affect the sound quality enormously.
There are many books and texts that claim to tell you how and where to
position microphones for all manner of instruments, but the key is to
experiment and find the best position for the instrument – and player –
you have in front of you. Experience, not book learning, leads to success.
Of the two saxophone close miking positions, neither will capture the
natural sound of the instrument, if that’s what you want. Close mic
positions almost never do. If you move the mic further away, up to
around a meter, you will be able to capture the sound of the whole of the
instrument, mouthpiece, bell, the metal of the instrument, and the holes
that are covered and uncovered during the normal course of playing. Also
as you move away you will capture more room ambience, and that is a
compromise that has to be struck. Natural sound against room ambience.
It’s subjective.
Piano
Drums
The conventional setup is one mic per drum, a mic for the hihat perhaps,
and two overhead mics for the cymbals. Recording drums is an art form
and experience is by far the best guide. There are some points to bear in
mind:
The mics have to be placed where the drummer won’t hit them, or the
stands.
Dynamic mics generally sound better for drums, capacitor mics for
cymbals.
The kick drum should have its front head removed, or there should be a
large hole cut out so that a damping blanket can be placed inside.
Otherwise it will sound more like a military bass drum than the dull thud
that we are used to. The choice of beater – hard or soft – is important, as
is the position of the kick drum mic either just outside, or some distance
inside the drum.
The snares on the underside of the snare drum may rattle when other
drums are being played. Careful adjustment of the tension of the snares is
necessary, and perhaps even a little damping.
Listen!
Check Questions
• Write down, copy if you wish, the two golden rules for
microphone positioning
• What is stereo?
• Method of operation:
• Moving coil
• Electrostatic
• Direct radiator
• Horn
• Function:
• Domestic
• Hi-fi
• Studio
• PA
In this context we will use ‘PA’ to mean concert public address rather
than announcement systems, which are beyond the scope of this text.
The moving coil loudspeaker, or I should say ‘drive unit’ as this is only
one component of the complete system, is the original and still most
widely used method of converting an electric signal to sound. The
components consist of a magnet, a coil of wire (sometimes called the
‘voice coil’) positioned within the field of the magnet and a diaphragm
that pushes against the air. When a signal is passed through the coil, it
creates a magnetic field that interacts with the field of the permanent
magnet causing motion in the coil and in turn the diaphragm. It is
probably fair to say that 99.999% of the loudspeakers you will ever come
across use moving coil drive units.
Perhaps the best place to start is a 200 mm drive unit intended for low
and mid frequency reproduction. This isn't the biggest drive unit
available, so why are larger drive units ever necessary? The answer is to
achieve a higher sound level. A 200 mm drive unit only pushes against so
much air. Increase the diameter to 300 mm or 375 mm and many more
air molecules feel the impact. The next question would be, why are 300
mm or 375 mm drive units not used more often, when space is available?
The answer to that is in the behavior of the diaphragm:
The diaphragm could be flat and still produce sound. However, since the
motor is at the center and vibrations are transmitted to the edges, the
diaphragm needs to be stiff. The cone shape is the best compromise
between stiffness and large diameter.
High frequencies will tend to bend the diaphragm more than low
frequencies. It takes a certain time for movement of the coil to propagate
to the edge of the diaphragm. Fairly obviously, at high frequencies there
isn't so much time and at some frequency the diaphragm will start to
deviate from the ideal rigid piston.
It might be stating the obvious at this stage, but a low frequency drive
unit is commonly known as a woofer, and a high frequency drive unit as
a tweeter.
Damage
There are two ways in which a moving coil drive unit may be damaged.
One is to drive it at too high a level for too long. The coil will get hotter
and hotter and eventually will melt at one point, breaking the circuit
(‘thermal damage’). The drive unit will entirely cease to function. The
other is to ‘shock’ the drive unit with a loud impulse. This can happen if
a microphone is dropped, or placed too close to a theatrical pyrotechnic
effect. The impulse won't contain enough energy to melt the coil, but it
may break apart the turns of the coil, or shift it from its central position
with respect to the magnet (‘mechanical damage’). The drive unit will
still function, but the coil will scrape against the magnet producing a very
harsh distorted sound. Many drive units can be repaired, but of course
damage is best avoided in the first place. The trick is to listen to the
loudspeaker. It will tell you when it is under stress if you listen carefully
enough.
Impedance
Drive units and complete loudspeaker systems are also rated in terms of
their impedance. This is the load presented to the amplifier, where a low
impedance means the amplifier will have to deliver more current, and
hence ‘work harder’. A common nominal impedance is 8 ohms.
‘Nominal’ means that this is averaged over the frequency range of the
drive unit or loudspeaker, and you will find that the actual impedance
departs significantly from nominal according to frequency. Normally this
isn't particularly significant, except in two situations:
At some frequency the impedance drops well below the nominal
impedance. The power amplifier will be called upon to deliver perhaps
more power than it is capable of, causing clipping, or perhaps the
amplifier might even go into protection mode to avoid damage to itself.
To be honest, the above points are not always at the forefront of the
working sound engineer's mind, but they are significant and worth
knowing about.
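To put numbers on the idea of the amplifier 'working harder', the sketch below applies Ohm's law to a fixed output voltage. The 20 V RMS level and the 3 ohm dip are hypothetical figures, and a real loudspeaker impedance is also partly reactive, which this simple resistive model ignores.

```python
def amplifier_load(voltage_rms, impedance_ohms):
    """Current and power an amplifier must deliver into a purely
    resistive load at a given RMS output voltage."""
    current_a = voltage_rms / impedance_ohms
    power_w = voltage_rms ** 2 / impedance_ohms
    return current_a, power_w

OUTPUT_VOLTS = 20.0  # hypothetical RMS output voltage

# 8 ohms is the nominal figure from the text; 3 ohms is an assumed dip.
for label, ohms in (("nominal 8 ohms", 8.0), ("dip to 3 ohms", 3.0)):
    current, power = amplifier_load(OUTPUT_VOLTS, ohms)
    print(f"{label}: {current:.2f} A, {power:.1f} W")
```

For the same voltage, the lower impedance demands well over twice the current and power, which is where clipping or protection circuits can come into play.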
Check Questions
Cabinet (Enclosure)
The moving coil drive unit is as open to the air at the rear as it is to the
front, hence it emits sound forwards and backwards. The backward-
radiated sound causes a problem. Sound diffracts readily, particularly at
low frequencies, and much of the energy will 'bend' around to the front.
Since the movement of the diaphragm to the rear is in the opposite
direction to the movement to the front, this leaked sound is inverted (or
we can say 180 degrees out of phase) and the combination of the two will
tend to cancel each other out. This occurs at frequencies where the
wavelength is larger than the diameter of the drive unit. For a 200 mm
drive unit the frequency at which cancellation would start to become
significant is 1700 Hz, the cancellation getting worse at lower
frequencies.
The simple solution to this is to mount the drive unit on a baffle. A baffle
is simply a flat sheet of wood with a hole cut out for the drive unit.
Amazingly, it works. But to work well down to sufficiently low
frequencies it has to be extremely large. The wavelength at 50 Hz, for
example, is almost 7 meters. The baffle can be folded around the drive
unit to create an open back cabinet, which you will still find in use for
electric guitar loudspeakers. The drawback is that the partially enclosed
space creates a resonance that colors the sound.
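The figures quoted above (1700 Hz for a 200 mm drive unit, and almost 7 meters at 50 Hz) both come from the relationship between wavelength and frequency. The sketch below reproduces them, assuming a rounded speed of sound of 340 meters per second, which is the value that matches the figures in the text.

```python
SPEED_OF_SOUND = 340.0  # meters per second, rounded (assumed)

def wavelength_m(frequency_hz):
    """Wavelength in air of a sound at the given frequency."""
    return SPEED_OF_SOUND / frequency_hz

def frequency_for_wavelength_hz(wavelength):
    """Frequency whose wavelength in air equals the given length in meters."""
    return SPEED_OF_SOUND / wavelength

# Frequency at which the wavelength equals the diameter of a 200 mm drive unit.
cancellation_start = frequency_for_wavelength_hz(0.2)

# Wavelength a baffle has to cope with at 50 Hz.
wavelength_50 = wavelength_m(50.0)

print(f"200 mm drive unit: cancellation below about {cancellation_start:.0f} Hz")
print(f"Wavelength at 50 Hz: {wavelength_50:.1f} m")
```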
The logical extension of the baffle and open back cabinet is to enclose
the rear of the drive unit completely, creating an infinite baffle. It would
now seem that the rear radiation is completely controlled. However, there
are problems:
The diaphragm now has to push against the air 'spring' that is trapped
inside the cabinet. This presents significant opposition to the motion of the
diaphragm.
The cabinet will itself vibrate and is highly unlikely to operate anything
like a rigid piston or have a flat frequency response. (Of course, this
happens with the open back cabinet too).
At this point it is worth saying that the bare drive unit is often used in
theater sound systems where there is a need for extreme clarity in the
human vocal range. Low frequencies can be bolstered with conventional
cabinet loudspeakers.
Despite these problems, careful design of the drive unit to balance the
springiness of the trapped air inside the cabinet against the springiness of
the suspension can work wonders. The infinite baffle, properly designed,
is widely regarded as the most natural sounding type of loudspeaker
(electrostatics excepted). The only real problem is that the compromises
that have to be made to make this design work result in poor low
frequency response.
Points of order:
The next step in cabinet design is the bass reflex enclosure. You will
occasionally hear of this as a ported or vented cabinet.
The bass reflex cabinet borrows the theory of the Helmholtz resonator. A
Helmholtz resonator is nothing more than an enclosed volume of air
connected to the outside world by a narrow tube, called the port. The port
can stick out of the enclosure as in a beer bottle - a perfect example of the
principle - or inwards. The small plug of air in the port bounces against
the compliance of the larger volume of air inside and resonates readily.
Try blowing across the top of the beer bottle (when empty) and you will
see.
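For the curious, the resonant frequency of a Helmholtz resonator can be estimated from the standard textbook formula f = (c / 2π) × √(A / (V × L)), where A is the cross-sectional area of the port, L its length and V the enclosed volume. The sketch below applies it to a bottle. The bottle dimensions are hypothetical, and the formula as used here ignores the end correction normally added to the port length, so treat the result as a rough guide only.

```python
import math

def helmholtz_frequency_hz(port_area_m2, port_length_m, cavity_volume_m3,
                           speed_of_sound=343.0):
    """Approximate resonant frequency of a Helmholtz resonator,
    neglecting the end correction on the port length."""
    return (speed_of_sound / (2.0 * math.pi)) * math.sqrt(
        port_area_m2 / (cavity_volume_m3 * port_length_m)
    )

# Hypothetical bottle: neck radius 10 mm, neck length 80 mm, volume 0.5 liter.
NECK_AREA = math.pi * 0.010 ** 2
NECK_LENGTH = 0.080
BOTTLE_VOLUME = 0.5e-3

bottle = helmholtz_frequency_hz(NECK_AREA, NECK_LENGTH, BOTTLE_VOLUME)
bigger = helmholtz_frequency_hz(NECK_AREA, NECK_LENGTH, 2 * BOTTLE_VOLUME)

print(f"0.5 liter bottle: about {bottle:.0f} Hz")
print(f"1.0 liter bottle: about {bigger:.0f} Hz")
```

The useful observation is the trend: a larger enclosed volume, or a longer or narrower port, lowers the resonant frequency.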
There are other cabinet designs, notably the transmission line, but these
are not generally within the scope of the professional sound engineer so they
will be excluded from this text.
Horns
Whereas a direct radiator drive unit may be only 1% efficient (i.e. 100 W
of electrical power converts to just 1 W of sound power), a horn drive
unit may be up to 5% efficient.
The air in the throat of the horn becomes so compressed at high levels
that significant distortion is produced. However, some people - including
the writer of this text! - can on occasion find the distortion quite pleasant.
Crossover
Crossovers have two principal parameter sets: the cut off frequencies of
the bands, and the slopes of the filters. It is impractical, and actually
undesirable, to have a filter that allows frequencies up to, say, 4 kHz to
pass and then cut off everything above that completely. So frequencies
beyond the cutoff frequency (where the response has dropped by 3 dB
from normal) are rolled off at a rate of 6, 12, 18 or 24 dB per octave. In
other words, in the band of frequencies where the slope has kicked in, as
the frequency doubles the response drops by that number of decibels. The
slopes mentioned are actually the easy ones to design. A filter with a
slope of, say, 9 dB per octave would be much more complex.
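The meaning of 'dB per octave' can be illustrated with a short calculation. The sketch below uses a straight-line approximation of a low-pass filter: flat up to the cutoff, then falling at the stated slope. It ignores the 3 dB drop at the cutoff itself and the gentle curve of a real filter around that point, so the figures are approximate, and the function name is purely illustrative.

```python
import math

def rolloff_db(frequency_hz, cutoff_hz, slope_db_per_octave):
    """Approximate attenuation of a low-pass filter, using a straight-line
    model: no attenuation up to cutoff, then the stated slope per octave."""
    if frequency_hz <= cutoff_hz:
        return 0.0
    octaves_above_cutoff = math.log2(frequency_hz / cutoff_hz)
    return slope_db_per_octave * octaves_above_cutoff

CUTOFF = 4000.0  # the 4 kHz example from the text

for slope in (6, 12, 18, 24):
    one_octave = rolloff_db(8000.0, CUTOFF, slope)
    two_octaves = rolloff_db(16000.0, CUTOFF, slope)
    print(f"{slope} dB/octave: {one_octave:.0f} dB down at 8 kHz, "
          f"{two_octaves:.0f} dB down at 16 kHz")
```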
• Inexpensive
• Convenient
• Usually matched by the loudspeaker manufacturer to the
requirements of the drive units
• And the disadvantages:
• Not practical to produce a 24 dB/octave slope
• Can waste power
• Not always accurate & component values can change over time
• Accurate
• Cutoff frequency and slope can be varied
• Power amplifier connects directly to drive unit - no wastage of
power & better control over diaphragm motion
• Limiters can be built into each band to help avoid blowing drive
units
• Expensive
• It is possible to connect the crossover incorrectly and send LF to
the HF driver and vice versa.
• A third-party unit would not compensate for any deficiencies in the
driver units.
• Crossover
• Equalizer to correct the response of each drive unit
• Sensing of voltage (and sometimes current) to ensure that each
drive unit is maximally protected
Use of Loudspeakers
The most fascinating use of loudspeakers is the near field monitor. Near
field monitors are now almost universally used in the recording studio for
general monitoring purposes and for mixing. This would seem odd
because twenty-five years ago anyone in the recording industry would
have said that studio monitors have to be as good as possible so that the
engineer can hear the mix better than anyone else ever will. That way, all
the detail in the sound can be assessed properly and any faults or
deficiencies picked up. Mixes were also assessed on tiny Auratone
loudspeakers just to make sure they would sound good on cheap
domestic systems, radios or portables.
That was until the arrival of the Yamaha NS10 - a small domestic
loudspeaker with a dreadful sound. It must have found its way into the
studio as a cheap domestic reference. A slightly upmarket Auratone if you
like. However, someone must have used it as a primary reference for a
mix, and found that by some magical and indefinable means, the NS10
made it easier to get a great mix - and not only that but a mix that would
'travel well' and sound good on any system. The NS10 and later NS10M
are now no longer in production, but every manufacturer has a nearfield
monitor in their range. Some actually now sound very good, although
their bass response is lacking due to their small size. The success of
nearfield monitoring is something of a mystery. It shouldn't work, but the
fact is that it does. And since so little is quantifiable, the best
recommendation for a nearfield monitor is that it has been used by many
engineers to mix lots of big-selling records. That would be the Yamaha
NS10 then!
Check Questions
• What is a baffle?
• What is 'compliance'?
History
Magnetic tape recording was invented in the early years of the Twentieth
Century and became useful as a device for recording speech, but simply
for the information content, as in a dictation machine - the sound quality
was too poor. In essence, a tape recorder converts an electrical signal to a
magnetic record of that signal. Electricity is an easy medium to work in,
compared to magnetism. It is straightforward to build an electrical device
that responds linearly to an input. As we saw earlier, 'linear' means
without distortion - like a flat mirror (linear) compared to a funfair mirror
(non-linear). Magnetic material does not respond linearly to a
magnetizing force. When a small magnetizing force is applied, the
material hardly responds at all. When a greater magnetizing force is
applied and the initial lack of enthusiasm to become magnetized has been
overcome, then it does respond fairly linearly, right up to the point where
it is magnetized as much as it can be, when we say that it is 'saturated'.
Unfortunately, no-one has devised a way of applying negative feedback
to analog recording, which in an electrical amplifier reduces distortion
tremendously.
• Distortion
• Noise
• Modulation noise
• Distortion
The invention that transformed the analog tape recorder from a dictation
machine to a music recording device, during the 1940s, was AC bias.
Since the response of tape to a small magnetizing force is very small, and
the linear region of the response only starts at higher magnetic force
levels, a constant supporting magnetic force, or bias, is used to overcome
this initial resistance. Prior to AC bias, DC bias was used courtesy of a
simple permanent magnet. However, considerable distortion remained.
AC bias uses a high frequency (~100 kHz) sine wave signal mixed in
with the audio signal to 'help' the audio signal get into the linear region
which is relatively distortion-free. This happens inside the recorder and
no intervention is required on the part of the user. However the level of
the bias signal has to be set correctly for optimum results. In traditional
recording, this is the job of the recording engineer before the session
starts. It has to be said that line-up is an exacting procedure and many
modern recording engineers have so much else to think about (their
digital transfers!) that line-up is better left to specialists.
Noise
Modulation Noise
There have been digital 'analog simulators', but to my ears, unless this
aspect of the character of analog recorders is simulated, they just don't
sound the same. Modulation noise is noise that changes as the signal
changes, and has two causes. One is Barkhausen noise which is produced
by quantization of the magnetic domains (a gross over-simplification of a
phenomenon that would take too much understanding for the working
sound engineer to bother with). The other - more significant - cause of
modulation noise is irregularities in the speed of tape travel. These
irregularities are themselves caused by eccentricity and roughness in the
bearings and other rotating parts, and by the tape scraping against the
static parts. We sometimes hear of the term 'scrape flutter', which creates
modulation noise, and the 'flutter damper roller', which is a component
used to minimize the problem.
If a 1 kHz sine wave tone is recorded onto analog tape, the output will
consist of 1 kHz plus two ranges of other frequencies, some strong and
consistent, others weaker and ever-changing due to random variations.
These are known in radio as 'sidebands' and the concept has exactly the
same meaning here.
• Three motors, one each for the supply reel, take-up reel and
capstan. The take-up reel motor provides sufficient tension to
collect the tape as it comes through. It does not itself pull the tape
through. The supply reel motor is energized in the reverse direction
to maintain the tension of the tape against the heads.
• The capstan provides the motive force that drives the tape at the
correct speed.
• The erase head wipes the tape clean of any previous recording.
• The record head writes the magnetic signal to the tape. It can also
function as a playback head, usually with reduced high frequency
response.
Magnetic Tape
Magnetic tape comprises a base film, upon which is coated a layer of iron
oxide. Oxide of iron is sometimes, in other contexts, known as 'rust'. The
oxide is bonded to the base film by a 'binder', which also lubricates the
tape as it passes through the recorder. Other magnetic materials have
been tried, but none suits analog audio recording better than iron, or more
properly 'ferric' oxide. There are two major manufacturers of analog tape
(there used to be several): Quantegy (formerly known as Ampex) and
Emtec (formerly known as BASF).
Tape is manufactured in a variety of widths. (It is also manufactured in
two thicknesses - so-called 'long play' tape can fit a longer duration of
recording on the same spool, at the expense of certain compromises.)
The widths in common use today are two-inch and half-inch. Oddly
enough, metrication doesn't seem to have reached analog tape and we
tend to avoid talking about 50 mm and 12.5 mm. Other widths are still
available, but they are only used in conjunction with 'legacy' equipment
which is being used until it wears out and is scrapped, and for replay or
remix of archive material. Quarter-inch tape was in the past very widely
used as the standard stereo medium, but there is now little point in using
it as it has no advantages over other options that are available.
The speed at which the tape travels is significant. Higher speeds are
better for capturing high frequencies as the recorded wavelength is
physically longer on the tape. However, there are also irregularities
(sometimes known as 'head bumps', or as 'woodles') in the bass end. The
most common tape speed in professional use used to be 15 inches per
second (38 cm/s), but these days it is more common to use 30 ips (76
cm/s), and not care about the massive cost in tape consumption! At 30
ips, a standard reel of tape costing up to $150 lasts about sixteen minutes.
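Both effects of tape speed mentioned above, the longer recorded wavelength and the heavier tape consumption, follow from simple arithmetic, sketched below. The reel length of 2500 feet is an assumption (the text does not state it), chosen because it is consistent with the quoted running time of about sixteen minutes at 30 ips.

```python
INCHES_PER_FOOT = 12

def recorded_wavelength_inches(tape_speed_ips, frequency_hz):
    """Physical length on tape of one cycle of the recorded signal."""
    return tape_speed_ips / frequency_hz

def reel_duration_minutes(reel_length_ft, tape_speed_ips):
    """Running time of a reel at the given tape speed."""
    seconds = reel_length_ft * INCHES_PER_FOOT / tape_speed_ips
    return seconds / 60.0

REEL_LENGTH_FT = 2500  # assumed reel length, not stated in the text

for speed in (15, 30):
    wavelength = recorded_wavelength_inches(speed, 15000.0)
    minutes = reel_duration_minutes(REEL_LENGTH_FT, speed)
    print(f"{speed} ips: 15 kHz occupies {wavelength * 1000:.1f} thousandths "
          f"of an inch, reel lasts {minutes:.1f} minutes")
```

Doubling the speed doubles the recorded wavelength of every frequency and halves the running time of the reel.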
Analog Recorders in Common Use
There have been many manufacturers of analog tape recorders, but the
top three historically have been Ampex, Otari and Studer. In the US, you
will commonly find the Ampex MM1200 and occasionally the Ampex
ATR124, which is often regarded as the best analog multitrack ever
made, but Ampex only made fifty of them. All over the world you will
find the Otari MTR90 (illustrated with autolocator) which is considered
to be a good quality workhorse machine, and is still available to buy. The
Studer range is also well respected. The Studer A80 represents the
coming of age of analog multitrack recording in the 1970s. It has a sound
quality which is as good as the best within a very fine margin, but
operational facilities are not totally up to modern standards. For example,
it will not drop out of record mode without stopping the tape. The Studer
A800 is still a prized machine and is fully capable, sonically and
operationally, of work to the highest professional standard. The more
recent A827 and A820 are also very good, but sadly no longer
manufactured.
Editing can also be used to improve a performance by cutting out the bad
and splicing in the good. Even two-inch tape can be edited; in fact it is
normal to record three or four takes of the backing tracks of a song, and
splice together the best sections. The tape is placed in a special precision-
machined aluminum editing block, and cut with a single-sided razor
blade, guided by an angled slot. Splicing tape is available with exactly
the right degree of stickiness to join the tape back together. When the edit
is done in the right place (usually just before a loud sound), it will be
inaudible. It takes courage to cut through a twenty-four track two-inch
tape though.
Maintenance
Demagnetizing the heads: After a while, the metal parts will collect a
residual magnetism that will partially erase any tape that is played on the
machine. A special demagnetizer is used to remove it, but proper training
is necessary, otherwise the condition can be made even worse.
Line-up: Line up, or alignment, has two functions - one is to get the best
out of the machine and the tape; the other is to make sure that a tape
played on one recorder will play properly on any other recorder. The
following parameters are aligned to specified or optimum values:
Playback level - the 1 kHz tone on a special calibration tape is played and
the output aligned to the studio's electrical standard level.
• Give two reasons why analog recorders are still in use in top
professional studios.
• Why is the supply reel motor driven in the opposite direction to the
actual rotation of the reel?
• What is 'bouncing'?
Why digital? Why wasn't analog good enough? The answer starts with
the analog tape recorder which plainly isn't good enough in respect of
signal to noise ratio and distortion performance. Many recording
engineers and producers like the sound of analog now, because it is a
choice. In the days before digital, analog recording wasn't a choice - it
was a necessity. You couldn't get away from the problems. Actually you
could. With Dolby A and subsequently SR noise reduction, noise
performance was vastly improved, to the point where it wasn't a problem
at all. And if you don't have a problem with noise, you can lower the
recording level to improve the distortion performance of analog tape. A
recording well made with Dolby SR noise reduction can sound very good
indeed. Some would say better than 16-bit digital audio, although this is
from a subjective, not a scientific, point of view. Analog recording also had
the problem that when a tape was copied, the quality would deteriorate
significantly. And often there were several generations of copies between
original master and final product. Digital audio can be copied identically
as many times as necessary, although this doesn't always work as well as
you might expect (more on this in another module).
In the domestic domain, before CD there was only the vinyl record. Well
there was the compact cassette too, but that never sounded good,
even with Dolby B noise reduction. (Some people say that they don't like
Dolby B noise reduction. The problem is that they are usually comparing
an encoded recording with decoding switched on and off. The extra
brightness of the Dolby B encoded - but not decoded - sound
compensates for dirty and worn heads and the decoded version sounds
dull in comparison!). People with long memories will know that they
used to yearn for a format that wasn't plagued with the clicks, pops and
crackles of vinyl. The release of the CD format was eagerly anticipated,
and of course the CD has become a great success.
Having established the reasons we have digital audio, let's see how it
works...
Digital Theory
Digital systems analyze the original in two ways: firstly by 'sampling' the
signal a number of times every second. Any changes that happen
completely between sampling periods are ignored, but if the sampling
periods are close enough together, the ear won't notice. The other is by
'quantizing' the signal into a number of discrete - separately identifiable -
levels. The smoothly changing analog signal is therefore turned into a
stair-step approximation, since digital audio knows no 'in-between' states.
As you can see, the digital signal here is only a crude approximation of
the original, but it can be made better by increasing the sampling
frequency (sampling rate), and by increasing the number of quantization
levels. Let's go deep...
To reduce the quantization error between the digital signal and the
original analog, more quantization levels must be used. Compact disc and
DAT both use 65,536 levels. This, in digital terms, is a nice round
number corresponding to 16 bits. Without going into binary arithmetic,
each bit provides roughly 6 dB of signal to noise ratio. Therefore a digital
audio system with 16-bit resolution has a signal to noise ratio (at least in
theory) of 96 dB.
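The figures quoted above follow from two rules of thumb: each extra bit doubles the number of levels, and each bit is worth roughly 6 dB. A minimal sketch, with hypothetical function names and the 6 dB per bit approximation taken from the text rather than an exact formula:

```python
def quantization_levels(bits):
    """Number of discrete levels available with the given word length."""
    return 2 ** bits


def approx_snr_db(bits, db_per_bit=6.0):
    """Theoretical signal to noise ratio using the rough 6 dB per bit rule."""
    return bits * db_per_bit


print(quantization_levels(16))  # 65536
print(approx_snr_db(16))        # 96.0
```

These are theoretical ceilings only; real converters fall short of them.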
The question will arise, what happens if a digital system is presented with
a frequency higher than half the sampling frequency? The answer is that
a phenomenon known as aliasing will occur. What happens is that these
higher frequencies are not properly encoded and are translated into
spurious frequencies in the audio band. These are only distantly related to
the input frequencies and absolutely unmusical (unlike harmonic
distortion, which can be quite pleasant in moderation). The solution is not
to allow frequencies higher than half the sampling rate (in fact less, to
give a margin of safety) into the system. Therefore an 'anti-aliasing' filter
is used just after the input. Filter design is complex, particularly for
filters with the steep slopes needed to maximize frequency response
without wasting storage or bandwidth on a sampling rate that is
unnecessarily high. The design of the filters is one of the distinguishing
points that make different digital systems actually sound different.
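To see where an unfiltered high frequency would end up, the sketch below folds an input frequency back into the band below half the sampling rate. It assumes an idealized sampler with no anti-aliasing filter at all; the function name is hypothetical.

```python
def alias_frequency(input_hz, sample_rate_hz):
    """Frequency at which input_hz appears after ideal sampling."""
    folded = input_hz % sample_rate_hz
    # Anything above half the sampling rate is mirrored back below it.
    return min(folded, sample_rate_hz - folded)


print(alias_frequency(10_000, 44_100))  # 10000: below half the rate, unchanged
print(alias_frequency(30_000, 44_100))  # 14100: a spurious in-band tone
```

A 30 kHz input sampled at 44.1 kHz comes out as 14.1 kHz, which has no harmonic relationship to the original. That is why aliasing sounds so unmusical.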
Once the signal has been filtered, sampled and quantized, it must be
coded. It might be possible to record the binary digits directly but that
wouldn't offer the best advantage, and indeed might not work. In the
compact disc system, the tiny pits in the aluminized audio layer
themselves form the spiral that the laser follows from the start of the
recording to the end. A binary '1' is coded by a transition from 'land' - the
level surface - to a pit or vice-versa. A binary '0' is coded by no
transition. But what if the signal was stuck on '0' for a period of time - the
spiral would disappear! Hence a system of coding is used that rearranges
the binary digits in such a way that they are forced to change every so
often, simply to make a workable system. There are other such
constraints that we need not go into here.
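The transition rule described above, and the trouble caused by long runs of '0', can be illustrated in a few lines. This sketch covers only the rule that a '1' is a change of surface and a '0' is no change. It is not the actual channel code used on compact disc, and the function names are hypothetical.

```python
def bits_to_surface(bits, start="land"):
    """Turn a bit sequence into a land/pit sequence: a 1 toggles the surface."""
    surface = []
    state = start
    for bit in bits:
        if bit == 1:
            state = "pit" if state == "land" else "land"
        surface.append(state)
    return surface


def longest_run_without_transition(bits):
    """Length of the longest stretch of consecutive 0 bits."""
    longest = run = 0
    for bit in bits:
        run = run + 1 if bit == 0 else 0
        longest = max(longest, run)
    return longest


print(bits_to_surface([1, 0, 0, 1]))  # ['pit', 'pit', 'pit', 'land']
print(longest_run_without_transition([1, 0, 0, 0, 0, 0, 0, 1]))  # 6
```

A practical coding scheme has to keep the second figure below some limit, so that the surface keeps changing often enough for the player to follow it.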
Additionally there is the need for error correction. In any storage medium
there are physical defects that would damage the data if nothing were
done to prevent such damage. So additional data is added to the raw
digital signal, firstly to check on replay whether the data is valid or
erroneous, secondly to add a backup data stream so that if a section of
data is corrupted, it can be reconstituted from other data nearby. Adding
error correction involves a compromise between preserving the integrity
of the digital signal, and not adding any more extra data than necessary.
It is fair to say that the error correction system on CD, and on DAT, is
very good. But as in all things, more modern digital systems are cleverer,
and better.
Muting: in this case the error is so bad that the system shuts down
momentarily rather than output what could be an exceedingly loud glitch.
Bandwidth
24/96
The quest for ever better sound quality leads us to want to increase both
the sampling rate and the resolution. 24-bit resolution will in theory give
a signal to noise ratio of 144 dB. This will never happen in practice, but
the real achievable signal to noise ratio is probably as good as anyone
could reasonably ask for. Of course, some of the available dynamic range
may be used as additional headroom, to play safe while recording, but
even so the resulting recording will be remarkably quiet. Also, even
though most of us cannot even hear up to 20 kHz, a frequency which is
perfectly well catered for these days by a 44.1 or 48 kHz sampling rate,
there is always a nagging doubt that this is only just good enough, and it
would be worthwhile to have a really high sampling rate to put all doubt
at an end.
Digital Interconnection
AES/EBU
S/PDIF
• Two types:
• Electrical
• Uses 75 Ohm unbalanced coaxial cable with RCA phono
connectors
• Cable lengths limited to 6 meters.
• Optical
• TOSLINK - Uses plastic fiber optic cable and same connectors as
Lightpipe (below). TOSLINK is an optical data transmission
technology developed by Toshiba. TOSLINK does not specify the
protocol to be used
• ST-type - Glass fiber can be used for longer lengths (1 kilometer).
• Meant for consumer products but may be seen on professional
equipment
• Supports up to 24-bit/48 kHz sampling rate
• Self-clocking
• It ought to be necessary to use a format converter when connecting
with AES/EBU since the electrical level is different (0.5 V) and
the format of the data is different also. However, some AES/EBU
inputs can recognise an S/PDIF signal
• Some of the bits within the Channel Status blocks are used for
SCMS (Serial Copy Management System), to prevent consumer
machines from making digital copies of digital copies.
MADI
ADAT Optical
• Sometimes known as 'Lightpipe'
• Implemented on the Alesis ADAT MDM and digital devices such
as mixing consoles, synthesizers and effects units
• Supports up to 24-bit/48 kHz sampling rate
• Transmits 8 channels serially on fiber-optic cable
• Distance limited to 10 meters, or up to 30 meters with glass fiber
cable
• Data transmission at 48 kHz is 12 Megabit/s
• Self-clocking
• Channels can be reassigned (digital patchbay function)
• In relation to the question above, why was this the most pressing
need?
• What is 'aliasing'?
• Describe quantization.
Why, in a hard disk recording system, is it likely that fewer tracks can be
replayed simultaneously at the 24-bit/96 kHz standard, than at the CD-
quality 16-bit/44.1 kHz standard?
Chapter 7: Digital Audio Tape Recording
The signal that is recorded on the tape is of course digital, and very
dissimilar to either analog audio or video signals. As you know, the
standard DAT format uses 16-bit resolution at a sampling frequency of 48
kHz. This converts the original analog audio signal to a stream of binary
numbers representing the changing level of the signal. But since the
dimensions of the actual recording on the tape are so small, there is a lot
of scope for errors to be made during the record/replay process, and if the
wrong digit comes back from the tape it is likely to be very much more
audible than a drop-out would be on analog tape. Fortunately DAT, like
the Compact Disc, uses a technique called Double Reed-Solomon
Encoding which duplicates much of the audio data, in fact 37.5%, in such
a way that errors can be detected, then either corrected completely or
concealed so that they are not obvious to the ear. If there is a really huge
drop-out on the tape, then the DAT machine will simply mute the output
rather than replay digital gibberish. As an extra precaution against
dropouts, another technique called interleaving is employed which
scatters the data so that if one section of data is lost, then there will be
enough data beyond the site of the damage which can be used to
reconstruct the signal.
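The idea behind interleaving can be shown with a simple block interleaver. This is a sketch of the general principle only, not the scheme DAT actually uses; the function names and the choice of twelve samples in four columns are assumptions made for illustration.

```python
def interleave(data, depth):
    """Lay data out in rows of `depth` items, then read it column by column."""
    if len(data) % depth:
        raise ValueError("data length must be a multiple of depth")
    rows = len(data) // depth
    return [data[r * depth + c] for c in range(depth) for r in range(rows)]


def deinterleave(data, depth):
    """Undo interleave(), putting every item back in its original position."""
    rows = len(data) // depth
    restored = [None] * len(data)
    i = 0
    for c in range(depth):
        for r in range(rows):
            restored[r * depth + c] = data[i]
            i += 1
    return restored


samples = list(range(12))
on_tape = interleave(samples, 4)
on_tape[3:6] = [None] * 3          # a drop-out wipes three adjacent items
replayed = deinterleave(on_tape, 4)
print(replayed)  # [0, None, 2, 3, 4, None, 6, 7, 8, None, 10, 11]
```

The three adjacent losses on tape come back as three isolated gaps, each with good data on either side, which is far easier to correct or conceal than one long hole.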
The pulse code modulated audio data is recorded in the centre section of
each diagonal track across the tape. There is other data too:
DASH Operation
The first thing you are likely to want to do with your new DASH
machine is of course to make a recording with it, but it would be
advisable to read the manual before pressing record and play. Some of
the differences between digital and analog recording stem from the fact
that the heads are not in the same order. On an analog recorder we are
used to having three heads: erase, record and play. DASH doesn't need an
erase head because the tape is always recorded to a set level of
magnetism which overwrites any previous recordings without further
intervention. So the first head that the tape should come across should be
the record head. Right?
Since there are different ways to format a tape and make recordings, the
3324S has three different recording modes: Advance, Insert and
Assemble. Advance mode is as explained above. Insert is for when you
have recorded or formatted the full duration of the material and you want
to go back and re-record some sections. Assemble is when you want to
put the tape on, record a bit, play it back, record a bit more etc, as would
typically happen in classical sessions.
Converter Delay
The main text deals with some of the implications of delays caused
by the process of recording digital signals onto tape and playing
them back again. There is another problem caused by delays in the
A/D conversion itself. The converters used in the Sony 3324S, for
example, while being very high quality, have an inherent delay of
about 1.7 milliseconds.
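To put that delay in perspective, it can be expressed as a number of samples. The sketch below assumes a 48 kHz sampling rate, which is an illustrative choice and not a figure given for this machine; the function name is hypothetical.

```python
def delay_in_samples(delay_ms, sample_rate_hz):
    """Convert a delay in milliseconds to a (possibly fractional) sample count."""
    return delay_ms / 1000 * sample_rate_hz


samples = delay_in_samples(1.7, 48_000)
print(f"{samples:.1f} samples")  # roughly 82 samples at 48 kHz
```

A delay of that size is inaudible on its own, but it matters when the delayed signal is mixed with an undelayed version of the same sound.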
Editing
Maintenance
Although an analog recorder can be, and should be, cleaned by the
recording engineer in the normal course of studio activities, a DASH
machine should only be cleaned by an expert, or thousands of dollars
worth of damage can be caused. The heads can be cleaned with a special
chamois-leather cleaning tool, wiping in a horizontal motion only. Cotton
buds, as used for analog recorders, will clog a DASH head with their fibers.
Likewise, an analog recorder can be aligned by a knowledgeable engineer,
but alignment of a DASH machine is something that is done every six
months or so by a suitably qualified engineer carrying a portable PC and
a special test jig in his tool box. The PC runs special service software
which can interrogate just about every aspect of the DASH machine
checking head hours, error rates, remote ports, sampler card etc etc. With
the aid of its human assistant it can even align the heads and tape tension.
Current significance
The original modular digital multitrack was the Alesis ADAT (below
left). On its introduction it was considered a triumph of engineering to an
affordable price point. The ADAT (Alesis Digital Audio Tape) was
closely followed by the Tascam DTRS (Digital Tape Recording System)
format (below right).
Alesis ADAT-XT
Tascam DA98-HR
One further difference is that it is probably fair to say that the ADAT has
reached the end of its product life-cycle, although there are undoubtedly
still plenty of them around and in use. DTRS however is still useful as a
tape-based system offering a standard format and cheap storage.
Check Questions
• What is SCMS?
• What is 'interleaving'?
• In DASH, why does a playback head come before the record head
in the tape path?
Level
0 dBm = 1 mW
0 dBu = 0.775 V
0 dBv = 0.775 V
0 dBV = 1 V
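These reference points can be turned into actual voltages and powers with the usual decibel formulas (20 log for voltage, 10 log for power). A minimal sketch using the reference values listed above; the function names are hypothetical, and the 600 ohm load in the last function is an assumption, chosen because it is the load at which 1 mW corresponds to about 0.775 V.

```python
import math


def dbu_to_volts(dbu):
    """Voltage for a level in dBu (0 dBu = 0.775 V)."""
    return 0.775 * 10 ** (dbu / 20)


def dbv_to_volts(dbv):
    """Voltage for a level in dBV (0 dBV = 1 V)."""
    return 1.0 * 10 ** (dbv / 20)


def dbm_to_milliwatts(dbm):
    """Power for a level in dBm (0 dBm = 1 mW)."""
    return 1.0 * 10 ** (dbm / 10)


def dbm_to_volts(dbm, load_ohms=600):
    """Voltage across an assumed load for a level in dBm."""
    watts = dbm_to_milliwatts(dbm) / 1000
    return math.sqrt(watts * load_ohms)


print(f"+4 dBu  = {dbu_to_volts(4):.3f} V")
print(f"-10 dBV = {dbv_to_volts(-10):.3f} V")
print(f"0 dBm into 600 ohms = {dbm_to_volts(0):.3f} V")
```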
All of the above (with the exception of dBFS) refer to electrical levels.
We also need levels for magnetic tape and other media. Analog recording
on magnetic media is still commonplace in top level music recording,
and outside of the developed countries of the world. Magnetic level is
measured in nWb/m (nanowebers per meter). ‘Nano’ is the prefix
meaning ‘one thousandth of a millionth’. The weber (Wb) is the unit of
magnetic flux. Wb/m expresses the flux per meter of track width, which in
tape recording is loosely called ‘flux density’. Wilhelm Weber the person
(pronounced with a ‘v’ sound in
Europe, with a ‘w’ sound in North America), by the way, is to magnetism
what Alessandro Volta is to electricity.
It’s worth noting that none of these reference levels is better than any
other, but NAB and DIN are the most used in North America and Europe
respectively.
Operating Level
Most brands of tape can give good clean sound up to 8 dB above 200
nWb/m and even beyond, although distortion increases considerably
beyond that.
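It can help to see what '8 dB above 200 nWb/m' means as an actual flux figure. The sketch below treats flux like a voltage, so a level difference in decibels is 20 times the log of the ratio; the function name is hypothetical.

```python
def flux_at_offset(reference_nwb_per_m, offset_db):
    """Magnetic level in nWb/m that lies offset_db above the reference."""
    return reference_nwb_per_m * 10 ** (offset_db / 20)


print(f"{flux_at_offset(200, 8):.0f} nWb/m")  # roughly 500 nWb/m
```

In other words, peaks can reach about two and a half times the reference flux before distortion starts to rise sharply.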
Gain
Frequency Response
20 Hz to 20 kHz +0 dB/-2 dB
or this:
20 Hz to 20 kHz ±1 dB
Of course the actual numbers are just examples, but the concept of
defining the allowable bounds of deviation from ruler-flatness is the key.
Q
It may be evident from this that Q is a ratio and has no units. Q doesn't
stand for anything either, it’s just a letter. Whether you need to use a low
Q setting or a high Q setting depends on the nature of the problem you
want to solve. If there is a troublesome frequency, for example acoustic
guitars sometimes have an irritating resonance somewhere around 150
Hz to 200 Hz, then a high Q setting of 4 or 5 will allow you to home in
on the exact frequency and deal with it without affecting surrounding
frequencies too much. If it is more a matter of shaping the spectrum of a
sound to improve it or allow it to blend better with other signals, then a
low Q of perhaps 0.3 would be more appropriate. The range of Q in
common use in audio is from 0.1 up to around 10, although specialist
devices such as feedback suppressors can vastly exceed this.
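The effect of different Q settings is easier to judge when it is turned into a band of frequencies. The sketch below assumes the common definition of Q as center frequency divided by bandwidth, with the band edges placed symmetrically on a logarithmic scale. Equalizer manufacturers do not all define Q in exactly the same way, so treat the numbers as indicative; the function name and the 175 Hz example are illustrative choices.

```python
import math


def band_edges(center_hz, q):
    """Lower and upper edge of the band affected, for a given center and Q."""
    half = 1 / (2 * q)
    root = math.sqrt(1 + half * half)
    return center_hz * (root - half), center_hz * (root + half)


for q in (4, 0.3):
    low, high = band_edges(175, q)
    print(f"Q = {q}: about {low:.0f} Hz to {high:.0f} Hz")
```

At a Q of 4 the band is only a few tens of hertz wide around the guitar resonance, while a Q of 0.3 spreads over several octaves, which is why it suits broad tonal shaping.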
Noise
Signal to noise ratio is one measure of how noisy a piece of equipment is.
We said earlier that a common operating level is +4 dBu. If all signal
were removed and the noise level at the output of the console measured,
we might obtain a reading somewhere around –80 dBu. This would mean
that the signal to noise ratio is 84 dB. In analog equipment, a signal to
noise ratio of 80 dB or more is considered good. The worst piece of
equipment as far as noise is concerned is the analog tape recorder, which
can only turn in a signal to noise ratio of around 65 dB. The noise is quite
audible behind low-level signals. Outside of the professional domain, a
compact cassette recorder without noise reduction can only manage
around 45 dB. This is adequate only for recordings valued purely for their
information content, for instance in a dictation machine, or for music which is loud all
the time and therefore masks the noise.
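The signal to noise figures above are simply the difference between two levels in decibels. A minimal sketch of that subtraction, with a second function showing the same thing as a voltage ratio; the function names are hypothetical.

```python
def signal_to_noise_db(operating_level_dbu, noise_floor_dbu):
    """Signal to noise ratio as the gap between operating level and noise."""
    return operating_level_dbu - noise_floor_dbu


def db_to_voltage_ratio(db):
    """How many times larger the signal voltage is than the noise voltage."""
    return 10 ** (db / 20)


snr = signal_to_noise_db(4, -80)
print(snr)  # 84
print(f"about {db_to_voltage_ratio(snr):,.0f} to 1")
```

Running the second function on 65 dB and 45 dB shows how much closer the noise sits to the signal on analog tape and on cassette.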
Modulation Noise
It is worth saying that signal to noise ratio should be measured with any
noise reduction switched out, otherwise the comparison between peak or
operating level and the artificially lowered noise floor when signal is
absent gives an unfairly advantageous figure unrepresentative of the
subjective sound quality of the equipment in question.
Distortion
Headroom
The era of wow and flutter is probably coming to an end, but it hasn't
quite got there yet so we need some explanation. Wow and flutter are
both caused by irregularities in mechanical components of analog
equipment such as tape recorders and record players. Wow causes a long-
term cyclic variation in pitch that is audible as such. Flutter is a faster
cyclic variation in pitch that is too fast to be perceived as a rise and fall in
pitch. Wow is just plain unpleasant. You will hear it most often, and at its
worst, on old-style juke boxes that still use vinyl records. Flutter causes a
‘dirtying’ of the sound, which used to be thought of as wholly
unwelcome. Now, when we can have flutter-free digital equipment any
time we want it, old-style analog tape recorders that inevitably suffer
from flutter to some extent have a characteristic sound quality that is
often thought to be desirable. Wow and flutter are measured in
percentage, where less than 0.1% is considered good.
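A percentage is not a very intuitive measure of pitch, so it can be useful to convert it into cents (hundredths of a semitone). The sketch below assumes the percentage is a simple deviation in playback speed; standard wow and flutter measurements apply weighting that is ignored here, and the function name is hypothetical.

```python
import math


def speed_deviation_to_cents(percent):
    """Pitch shift in cents caused by a speed error of the given percentage."""
    return 1200 * math.log2(1 + percent / 100)


print(f"{speed_deviation_to_cents(0.1):.2f} cents")  # under 2 cents
```

A deviation of 0.1% is therefore a pitch change of less than two cents, which explains why it is regarded as a good figure.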
Check Questions
• Which has the greater heating effect: 100 V RMS or 100 V DC?
• What is clipping?
• What is headroom?