
_________________________________

Audio Engineering Society

Convention Paper 5565


Presented at the 112th Convention
2002 May 10–13 Munich, Germany

This convention paper has been reproduced from the author's advance manuscript, without editing, corrections, or consideration
by the Review Board. The AES takes no responsibility for the contents. Additional papers may be obtained by sending request
and remittance to Audio Engineering Society, 60 East 42nd Street, New York, New York 10165-2520, USA; also see www.aes.org.
All rights reserved. Reproduction of this paper, or any portion thereof, is not permitted without direct permission from the
Journal of the Audio Engineering Society.

_________________________________

In the Light of 5.1 Surround: "Why AB-Polycardioid Centerfill" (AB-PC) is Superior for Symphony-Orchestra Recording

Edwin Pfanzagl-Cardone
Salzburg Festival, Salzburg, 5020, AT
e.pfanzagl@salzburgfestival.at

ABSTRACT
The reappearance of surround sound in the form of the DVD's 5.1 format has led sound engineers to re-evaluate
current microphone techniques used for stereo recording.
The author has measured various 2-channel main-microphone signals in order to prove that their correlation is
strongly frequency dependent. Due to properties of the human hearing mechanism, as well as of standard
loudspeaker playback arrangements, frequencies below approximately 700 Hz are particularly critical for
faithful spatial reproduction in stereo as well as 5.1 surround environments and therefore deserve special
attention already during the recording process. Conclusions are drawn from the measurements, and a
microphone system ("AB-Polycardioid Centerfill") well suited for 5.1 surround is proposed.

0. INTRODUCTION

Stereo sound recording and reproduction is by now more than 100 years old; however, the "one and only"
method of capturing a musical performance perfectly has not been found, because too many variables are
involved: the size of the hall and its individual room acoustics, the size of the ensemble, and the type of
event (a live performance or a recording session) all enter into the equation. At the end of the day it is
also simply a matter of taste which microphone technique one prefers for capturing and reproducing a
musical event.

Simply put, in order to achieve decorrelation of the stereophonic sound signal we need either sufficient
spacing between two omnidirectional transducers, or coincident or near-coincident techniques with
directional microphones.

Apparently there is disagreement about the amount of capsule spacing needed for "correct spatial
reproduction" of a sound event. On the one hand, important representatives of the academic world [1,2,3]
seem convinced that only small microphone spacings ("small AB"), based on psychoacoustic principles, are
able to provide correct localisation. On the other hand, a large percentage of practicing sound engineers
favour largely spaced AB techniques ("large AB") [4], or at least use supplementary "outriggers" (largely
spaced omnidirectional microphones in front of the orchestra), because of the more "open sound" they
provide.

In addition, RCA's "Living Stereo" [5] series of stereophonic recordings from the 1950s and 1960s (which
used a large AB/Centerfill microphone technique) is still cherished by music lovers worldwide.
PFANZAGL-CARDONE AB - POLYCARDIOID CENTERFILL (AB-PC)

Fig. 1: RCA's "A/B - Centerfill" microphone scheme (1959) [5]

The above mentioned discrepancy led the author to re-evaluate the sonic characteristics of various stereo
main-microphone systems through subjective listening tests. The majority of current microphone techniques
used or proposed for 5.1 surround recording are in some way derived from standard 2-channel stereophonic
microphone techniques. A more detailed understanding of the sonic characteristics of stereo
main-microphone systems could therefore also provide conclusions for corresponding 5.1 surround
techniques.

1. PROBLEMS RELATED TO PERCEPTION OF LOW FREQUENCIES

1.1 Playback Considerations
As practicing sound engineers commonly know, some microphone techniques are better suited for headphone
reproduction, while others work better with replay via loudspeakers.
Listening to the signal of a "small AB" microphone recording via headphones, one gets a rather realistic
impression of the sound event, as long as the capsule spacing is in the order of the diameter of the human
head. (This is also true for some other near-coincident techniques.)
However, when listening to the same stereo signal via loudspeakers, the experienced listener hears that,
while at high frequencies there is a convincing stereo-panoramic impression of localization and depth of
the sound field, this impression is increasingly lost towards lower frequencies.
Alan D. Blumlein mentioned this effect already in his 1931 patent:
"With two microphones correctly spaced and the two channels entirely separate it is known that this
directional effect can also be obtained for example in a studio, but if the channels are not kept separate
(for example by replacing the headphones by two loud speakers) the effect is largely lost." [6]
He then proceeds to propose an electronic compensation circuit ("Blumlein Shuffler"), as well as
head-related recording techniques with obstacles between two pressure transducers and his famous
"Blumlein Pair" composed of crossed figure-of-eight microphones, among a plethora of other inventions
relating to stereophonic sound.

The above mentioned effect, being quite obvious acoustically, needs some empirical as well as theoretical
explanation. The change in spatial representation towards lower frequencies can be checked by listening
selectively to isolated frequency bands to detect which room impression is provided by a main-microphone
signal in various frequency ranges. In this respect the frequency range below approx. 700-800 Hz is of
primary interest, since the human head is not yet effective as a baffle in this frequency band and, due to
signal diffraction, sounds "bend" around it. Above about 800Hz the shadowing capacity of the head becomes
more and more important, and the human hearing mechanism is therefore mainly based on interaural
differences in signal loudness and frequency response, while, simply put, in the low frequency region it
is mainly based on detecting interaural phase and timing differences (see also Fig. 2 in the appendix,
from [7]).

Research by Yost [8] in 1971 has shown that the low frequency components of transient sounds in a binaural
signal are of greatest importance for localisation: high-pass filtering of clicks, so as to include only
energy above 1500 Hz, resulted in a deterioration of localisation ability, but the same clicks low-pass
filtered to include only energy up to 1500Hz resulted in little change.

Research by Hirata [9] deals with the phenomenon of localization deterioration for low frequency
components of a stereo signal that is replayed via loudspeakers. He proposes a "perceptual interaural
cross-correlation coefficient" (PICC) as follows:

    PICC = D*R0 + (1-D)*RE                    eqn. (1)

with
D ... definition (directness) of sound
R0 ... the interaural cross-correlation coefficient of the direct sound (unity for normal incidence)
RE ... the interaural cross-correlation coefficient of reverberant (incoherent) sound, expressed by:

    RE = sin(k*r(f)) / (k*r(f))               eqn. (2)

where
    k = 2πf/c ... the wave number             eqn. (3)
c ... speed of sound, and
r(f) ... the effective acoustic distance between the ears, which is approximately 30cm. [10, 11]

He also defines an index of acoustic spatial impression ASI, expressed by

    ASI = (1-D)*100 [%]                       eqn. (4)

Full spatial impression is indicated by ASI=100% and no spatial impression by ASI=0%.

Fig. 3: PICC curves for stereo sound reproduction in a listening room of reverberation time TL (0 to 1s)
show small ASI in the low frequency band compared with an ASI=60% for the middle seat of a concert hall.
Broken line shows TL=0.3s. (from [9])
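Eqns. (1) to (4) can be evaluated numerically. The following is a minimal sketch, not Hirata's own implementation: it assumes r(f) can be treated as a constant 0.30 m and c = 343 m/s, and the function names are chosen here for illustration only. The example value D = 0.4 corresponds to the ASI = 60% quoted for the concert hall seat.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s at 20 degrees C (assumed value)
EAR_DISTANCE = 0.30     # m; r(f) treated as constant (simplifying assumption)


def reverberant_iacc(f_hz, r_m=EAR_DISTANCE, c=SPEED_OF_SOUND):
    """R_E = sin(k*r) / (k*r) with k = 2*pi*f/c, eqns. (2) and (3)."""
    kr = 2.0 * math.pi * f_hz / c * r_m
    return 1.0 if kr == 0.0 else math.sin(kr) / kr


def picc(d, f_hz, r0=1.0):
    """PICC = D*R0 + (1-D)*R_E, eqn. (1); r0 = 1 means normal incidence."""
    return d * r0 + (1.0 - d) * reverberant_iacc(f_hz)


def asi_percent(d):
    """ASI = (1-D)*100 in percent, eqn. (4)."""
    return (1.0 - d) * 100.0


if __name__ == "__main__":
    d = 0.4  # gives ASI = 60 %
    for f in (100, 200, 400, 800, 1600):
        print(f"{f:5d} Hz  PICC = {picc(d, f):+.3f}  ASI = {asi_percent(d):.0f} %")
```

With these assumptions R_E stays close to 1 well below a few hundred Hz, so the PICC remains high in the low frequency band regardless of how much reverberant energy is present.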

AES 112TH CONVENTION, MUNICH, GERMANY, 2002 MAY 10–13 2



Fig. 3 shows that in a standard listening room (RT60=0.3s) the ASI in the stereophonic sound field is small
at frequencies below 800Hz and large at frequencies above 800Hz, in comparison with that in the concert
hall, where ASI=60%.

Research by Griesinger on envelopment [12] has shown that while for higher frequencies the ideal
loudspeaker placement moves towards the median plane, for frequencies below 700Hz the ideal placement is
the one which provides the most separation between the radiators, i.e. at the sides of the listener at
+/- 90 deg.

The research by Hirata and Griesinger indicates that the standard loudspeaker placement for stereo (as
well as for 5.1, see the International Telecommunication Union's ITU-R guidelines [BS.775-1]) with
speakers at +/- 30 deg. provides less than perfect conditions for low frequency spatial reproduction.

1.2 Recording Considerations

Research on concert hall acoustics by Morimoto and Maekawa [13] has shown that spaciousness is strongly
related to signal content in the frequency range below 500Hz. The lower frequency components and the IACC
(Inter-Aural Cross-Correlation Coefficient) of a stimulus affect spaciousness independently; the
100–200Hz frequency range proved to be of particular importance.
Additionally, research by Hidaka, Beranek and Okano [14] on acoustical quality in concert halls found that
changes in low frequency levels caused greater changes in Apparent Source Width (ASW, traditionally
linked to "spaciousness") than changes in the high frequency levels.

This shows that for reproduction, as well as recording, fidelity (amplitude- and phase-wise) of the low
frequency signal components is of paramount importance in order to achieve a sonic impression as natural
as possible.
The author is convinced that only microphone techniques which provide sufficient decorrelation (not only
for mid and high frequencies, but also for low frequencies) are well suited for this purpose.
This essentially rules out quite a number of well-established microphone techniques, among them
"small AB", as will be explained below.

1.2.1 "Small AB" vs. "Large AB" in Orchestral Recording
While it may be possible to achieve a good sounding recording of a small to medium size instrumental
ensemble with a small AB system, this technique is not adequate for capturing the full range sound and
spatial width of a large symphony orchestra.

The narrow spacing of two omnidirectional microphones has one major disadvantage: the stereo microphone
signal is not sufficiently decorrelated for low frequencies. In other words, the stereo signal gets more
and more monophonic as frequency decreases.
The correlation of the stereo signal is of course determined by the microphone spacing: a spacing of
25 cm, for example, leads to negative correlation at around 800 Hz, and below this frequency the
correlation increases steeply (see Fig. 4).

The graph in Fig. 4 is derived from a calculation based on integrating plane waves arriving from all
directions around a set of spherical coordinates surrounding a microphone pair separated by a distance of
choice.

Fig. 4: The amount of correlation between two omnidirectional microphones separated by 25cm in a
reverberant field. Calculated in three dimensions. Note the high correlation at 100Hz, and the negative
correlation at 800Hz. The separation and the frequency vary inversely, so a pair separated by 2.5m would
have a negative correlation at 80Hz (but only if the reverberation radius were greater than 2.5 meters).
(from [15])

Once the microphone spacing is in the order of half the acoustic wavelength at a given frequency, sound
(from the side) can still be picked up 180° out of phase. Below that frequency sound will necessarily be
picked up with less than 180° phase offset, which means essentially that the correlation coefficient can
only get more positive. In other words: as frequency drops below this "critical frequency" the signal
becomes more and more monophonic.

As also proposed by Hecker [16], the frequency below which the correlation increases towards 1 can be
calculated approximately as:

    f = c/(2*d)                               eqn. (5)

with:
f ... frequency in [Hz]
c ... speed of sound in [m/s] (i.e. 343 m/s at 20°C)
d ... microphone spacing in [m]

As an example, a "small AB" microphone spacing of 50cm will deliver accurate spatial reproduction (for
replay via loudspeakers) only down to approx. 340 Hz (or rather 400Hz, according to Griesinger's
simulation in Fig. 4).

Therefore, in order to achieve spatial fidelity down to 40Hz, two omnidirectional capsules need to be
spaced by at least 4.3 meters:

    d = c/(2*f) = 343/(2*40) = 4.29 m

This spacing is still narrow enough to work well with small to medium size orchestras.

The already very high degree of correlation of a small AB microphone arrangement will be made even higher
by the low frequency "spatial deterioration" effect of a standard stereo loudspeaker setup (see Hirata).

At this point the author would like to introduce the term "Critical Frequency", referring to the lowest
frequency in a spaced AB pair recording which is captured with sufficient decorrelation to provide
fidelity in low frequency spatial reproduction.
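The arithmetic behind eqn. (5) and the 4.3 m example can be checked with a few lines of code. The sketch below additionally evaluates the textbook closed form sin(kd)/(kd) for the correlation between two omnidirectional microphones in an ideal three-dimensional diffuse field. That this closed form corresponds to the calculation behind Fig. 4 is an assumption made here, as are the function names. Its first zero lies at f = c/(2d), which coincides with eqn. (5).

```python
import math

SPEED_OF_SOUND = 343.0  # m/s at 20 degrees C


def critical_frequency(spacing_m, c=SPEED_OF_SOUND):
    """Eqn. (5): f = c / (2*d)."""
    return c / (2.0 * spacing_m)


def spacing_for_frequency(f_hz, c=SPEED_OF_SOUND):
    """Eqn. (5) solved for the spacing: d = c / (2*f)."""
    return c / (2.0 * f_hz)


def diffuse_field_correlation(f_hz, spacing_m, c=SPEED_OF_SOUND):
    """sin(kd)/(kd) for two omnis in an ideal 3-D diffuse field (assumed model)."""
    kd = 2.0 * math.pi * f_hz * spacing_m / c
    return 1.0 if kd == 0.0 else math.sin(kd) / kd


if __name__ == "__main__":
    print(f"0.5 m spacing -> {critical_frequency(0.5):.0f} Hz")
    print(f"40 Hz target  -> {spacing_for_frequency(40.0):.2f} m")
    for f in (100, 400, 800):
        print(f"{f} Hz, 0.25 m: {diffuse_field_correlation(f, 0.25):+.2f}")
```

Under this model a 25 cm pair is highly correlated at 100 Hz and slightly negatively correlated at 800 Hz, in line with the behaviour described for Fig. 4.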


2. MEASURES RELATED TO SPACIOUSNESS

Researchers have proposed various measures for acoustic parameters related to spatial impression and
envelopment:

Traditionally spaciousness is regarded as linked to "apparent source width" (ASW) and "listener
envelopment" (LEV). The research of Marshall [17] and Barron [18] has proved the importance of lateral
reflections for spatial impression in concert halls. Lateral Energy Fraction (LF) has been used as a
measure of apparent source width (ASW) by Barron and Marshall [19]:

    (equation not reproduced in this copy)    eqn. (6)

where θ is the lateral angle (where θ=0 is 90° from straight ahead), and p(t) is the sound pressure
(measured by a nondirectional microphone).

The "Interaural Cross-Correlation Function" IACFt(τ) is a binaural measure of the difference in sound at
the two ears and, hence, of lateralness:

    (equation not reproduced in this copy)    eqn. (7)

where PL and PR designate the sound pressure level at the entrances to the left and right ear,
respectively.

If we select the maximum value of this equation we get the "Interaural Cross-Correlation Coefficient"
(IACC):

    IACCt = |IACFt(τ)|max   for -1 ms < τ < +1 ms    eqn. (8)

In 1996 Griesinger [20] proposed the "Lateral Early Decay Time" (LEDT) as a measure for spaciousness:

    (equation not reproduced in this copy)    eqn. (9)

where S(t) is the Schroeder integral of the impulse response, and SD(t) is the Schroeder integral of the
Interaural Difference, the IAD.

Many factors contribute to spatial impression: more recent research by Griesinger [21] has shown that
fluctuations in the Interaural Time Delay (ITD) and the Interaural Intensity Difference (IID) contribute
significantly to the spatial properties of a sound event.

While ASW and IACC are useful in concert hall measurement, they do not work as well in smaller rooms.
Therefore Griesinger has proposed two new measures [22]:
The "Diffuse Field Transfer Function" (DFT) as a measure of envelopment, which is useful both in small and
large rooms, and
the "Average Interaural Time Delay" (AITD) as a measure of "externalisation", a sonic property unique to
small rooms.

3. MEASUREMENTS

The technically most objective way to measure how well a particular microphone technique is able to
translate an acoustic event to the listener at home with the least amount of alteration is probably the
following:
1. make a recording of the (orchestral) event with the microphone technique of your choice
2. at the same time make a second recording at the "best seat in the house" position of a concert hall
with a binaural artificial head (dummy head)
3. while reproducing the first recording through a stereo or 5.1 loudspeaker system make a third recording
(using the same dummy head) in the sweet spot of the listening room
4. measure the correlation of the two dummy head recordings.
Of course one has to be aware that with this method signal distortion (concerning amplitude, frequency and
phase) introduced by the replay system and the acoustics of the listening room will be included in the
evaluation process and might bias the results.
However, the microphone technique which produces the highest correlation (over the entire frequency
range) between the original sound event and the re-recorded reproduction would appear to be the one with
the highest fidelity.

The measurements for this paper are based on a much simpler approach: phase correlation over frequency is
evaluated for various 2-channel stereo main-microphone techniques.

3.1 Program Material and Venue Acoustics

One set of recordings used for the measurements consists of a chamber music ensemble in a hall, recorded
simultaneously with 5 different 2-channel main-microphone systems (small AB, ORTF, XY, MS,
Sphere-Microphone) (see Fig. 5-9).
Microphones used: see figures

The second set of recordings used for the measurements was made at the "Grosse Festspielhaus, Salzburg" in
Austria, a concert hall classified as B+ (excellent to good) in terms of acoustics according to research
carried out by Hidaka, Beranek and Okano [14].
The microphone techniques employed for these recordings were: ORTF, AB (0.2m, 0.4m, 0.8m, 1.3m, 3.2m, 7.2m
and 12.0m) and "AB-PC" (12m) (see Fig. 11-18).
Sound sources: various symphony orchestras
Microphones used: Schoeps CMC4 (cardioid) and Schoeps CMC3 (omnidirectional with diffuse-field
compensation).

Additional locations where single recordings have been carried out by the author were: a small church in
London (Fig. 22, 23) (rather reverberant acoustics) and a large TV studio in London (Fig. 26) (very dry
acoustics).
Sound sources: small orchestra (church), instrumental ensemble (8 players; TV studio)
Microphones used: Neumann U-87 in figure-of-eight and cardioid mode (ORTF, Blumlein Pair; church), Neumann
U-87 in cardioid and figure-of-eight mode (MS; TV studio)

In addition another set of recordings of a woodwind trio in a studio environment was measured, to examine
the sonic properties of two microphone techniques (XY 90 deg. cardioids and Blumlein Pair) used with
rather dry acoustics. (See Fig. 24, 25) [23]
Microphones used: information not available


Finally, a dummy-head recording of the organ at the "Alte Gewandhaus", Leipzig (Germany) has been included
in the measurements as well. (Fig. 10)

3.2 Main Microphone Systems – Sonic Properties

In [24] Streicher and Dooley have given a concise description of the fundamental characteristics of
various stereophonic main-microphone systems (using from 2 to 4 capsules).
In the following sections the author will partly cite from their publication.

3.2.1 Small AB:
Two omnidirectional microphones spaced apart by 0.2m to about 1m, mostly around 0.5m. Some sound engineers
and academics are convinced that only small AB (as opposed to largely spaced omnidirectional
AB-microphones, "large AB", with a microphone spacing of up to a few meters) enables correct localization
and spatialisation of a broad sound source such as a symphony orchestra. [2, 3]
It is claimed that the main reason why only "small AB" provides correct localization lies in the
psychoacoustic properties of the human hearing mechanism.
Since the amplitude differences of a sound arriving at closely spaced microphones are usually not very
large, it is mainly the difference in arrival time which is responsible for accurate localization of the
sound source: a time-delay of around 1.1ms (or an intensity difference of about 15dB) leads to localizing
a source entirely in one of the two loudspeakers of a stereo-playback arrangement.

3.2.2 Large AB:
Largely spaced omnidirectional microphones are frequently used for orchestra recordings, with spacings in
the order of up to one third or half of the orchestra width. In this case the spacing may be anywhere in
the range from 3 to 10m or more. Proponents of this technique appreciate the "open and rich" sound of the
system, the "air" it provides. This property is also the reason why largely spaced omnidirectional
microphones (often with diffuse-field compensation) are used as "outriggers" in combination with other
main-microphone techniques to add a feeling of "space" and openness which the other systems are not able
to provide.
(see, for example, the "Decca-Tree" technique of the 1950s; [5])
Critics point out the notorious "hole in the middle" effect, which is the result of attenuation of sound
sources in the center.

3.2.3 ORTF
Two cardioid microphones spaced 17cm apart, at an angle of 110 degrees. Listening tests with 64
participants conducted by Carl Ceoen in the early seventies [25] found that the ORTF technique was chosen
as the best-sounding "compromise" among the six systems which had been compared (XY, MS, Stereosonic
[=Blumlein Pair], ORTF, NOS and 5 pan-potted omnidirectional microphones).

3.2.3 XY
Two microphones with cardioid pattern at an included angle of between 60 and 120°. Academics tend to
favour this coincident arrangement of microphones since its sonic properties are easier to grasp in terms
of mathematical simulation than those of non-coincident microphone techniques. As long as
mono-compatibility was still an issue with broadcast audio and cutting of vinyl discs, sound engineers
appreciated the fact that XY provides a mono-compatible stereo signal (mainly due to strong correlation at
low frequencies, as will be shown later).

3.2.4 Blumlein-Pair
This crossed pair of figure-of-eight microphones is usually oriented at an included angle of 90°.
Developed in the early 1930s by Alan Dower Blumlein, chief engineer at EMI's Abbey-Road Studios, it forms
part of his patent on stereophonic sound-recording techniques [6].
The system has a unique property: it can be shown that, "when the listener is at such a distance that the
loudspeakers subtend an angle of 120° (sic), the apparent angle of a sound source is very close to the
true angle (at the recording location) up to +/- 35°. When the listener is at other distances from the
loudspeakers, but still on the center line, the apparent source remains at the same fraction of the total
loudspeaker spacing, thus showing that a correctly proportioned sound picture is presented although the
angular scale has been altered. The angular distortion is such that the apparent source appears to be
somewhat nearer to the center than it should be over most of the range." [26]
Placement of the Blumlein pair has been found "to be critical in order to maintain a proper
direct-to-reverberant-sound ratio and to avoid strong out-of-phase components." On the other hand "it is
often commented that this configuration produces a very natural sound." [24]

3.2.5 MS
The MS technique combines a forward facing microphone of either cardioid or omnidirectional
characteristic with a figure-of-eight microphone which is oriented sideways (with the null plane
bisecting the sound source). While keeping the M and S (mid and side component) signals separate during
the recording process gives the mixing engineer complete control over the width of the stereo image,
careful balancing is required. Noted sound engineer Michael Bishop, who often uses a laterally oriented
double MS setup in combination with a forward facing dummy head for his 5.1 surround recordings of
classical music, states: "When I matrix the M&S I may have to pan the cardioid microphone to fill the
sides. It's very touchy to get the panning and imaging correct, especially with panning across the sides.
Prior to the recording, I'll have an assistant go out of the room and walk around the microphone array
while I listen to the decoded MS-matrix. Set-up of the matrix is very difficult to get right. In order to
get the surround microphones to breathe, I place them perhaps a few feet back or even further if the
acoustics of the hall call for it." [27]
The author has had similar experiences in respect to balancing with a surround test-setup using a Schoeps
KFM 360 sphere microphone with added figure-of-eight microphones (see also section below).

3.2.6 Sphere-Microphone
In the first series of measurements under discussion a Schoeps KFM-6U ("Kugelflaechen-Mikrofon") has been
used. Two omnidirectional microphones are integrated at the sides of a sphere (of 20cm diameter). For
surround applications two figure-of-eight microphones can be positioned just above or beneath these
pressure transducers, which results in two MS-pairs of microphones, which can be matrixed accordingly.
With later designs (KFM360) the diameter of the sphere has been reduced to 18cm in order to achieve a
recording angle of about 120°.
However, the recording used for the measurements uses the "old" 20cm sphere in a mere stereo application.
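The matrixing of M and S signals referred to in sections 3.2.5 and 3.2.6 is conventionally a sum/difference decode, L = M + S and R = M - S. The following is a minimal sketch of that decode, assuming the positive lobe of the figure-of-eight faces left; the `width` gain on the S signal is a hypothetical parameter added here to illustrate the control over stereo width mentioned in the text, and it is not taken from the paper.

```python
def ms_decode(mid, side, width=1.0):
    """Decode M/S sample lists to left/right: L = M + w*S, R = M - w*S.

    Assumes the positive lobe of the figure-of-eight points to the left.
    width = 0 collapses the image to mono; larger values widen it.
    """
    left = [m + width * s for m, s in zip(mid, side)]
    right = [m - width * s for m, s in zip(mid, side)]
    return left, right


if __name__ == "__main__":
    mid = [1.0, 0.5, 0.0]
    side = [0.25, -0.5, 0.5]
    for w in (0.0, 1.0):
        left, right = ms_decode(mid, side, width=w)
        print(f"width={w}: L={left} R={right}")
```

With `width=0.0` both outputs equal the M signal, which is the mono-compatible case; raising the S gain increases the out-of-phase content between L and R.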


3.3 Aim of experiment

Measurements are to be carried out on prerecorded material of acoustic music (mainly orchestral) in order
to verify whether, and to what extent, the correlation of the electrical L- and R-channel signals of a
2-channel stereo recording can be related to generally agreed sonic properties of the main-microphone
systems under test.
Therefore only recordings which make use of stereo main-microphone systems (without the addition of spot
microphones) are to be used.

3.3.1 Measurement considerations
As has been pointed out, a faithful spatial reproduction of the acoustical performance over the entire
frequency range should be the main goal of a good recording. To achieve this, the low-frequency components
of the 2-channel stereo (music) signal have to be captured with the same amount of correlation (or
decorrelation) as the mid and high frequencies.
Preferably the measurement procedure should be technically easy to set up and fast to carry out.
In order to see whether the correlation of signals produced by most of the common main-microphone systems
is frequency dependent, it is necessary to measure correlation over the frequency range of interest
(auditory band). For this purpose the author has decided to use frequency dependent level-attenuation as a
measure of correlation.
In a separate process the correlation coefficient (CC) of the electrical signal (output signal) produced
by the microphone systems under test has been measured:
Averaged over the length of the sound recording used for the respective measurement:
CCLF ... the correlation coefficient of the low-frequency part of the stereo signal from 700Hz downwards.
CCHF ... the correlation coefficient of the high-frequency part of the stereo signal from 1000Hz upwards.
CCtot ... the correlation coefficient over the entire frequency range (20Hz to 20kHz, "total signal").
Not averaged:
CCmin ... the smallest (most negative) correlation coefficient that was measured for the signal material
under test.
CCmax ... the largest (highest) correlation coefficient that was measured for the signal material under
test.

3.3.2 Measurement procedure
The measurement employed digital tape-based recorders with 16bit resolution (as recording and playback
sources), a 1/3rd octave bandpass-based digital implementation of a realtime spectrum analyser (RTA) with
snapshot memory and peak-hold function for all 31 bands, and an analog mixing desk with phase-reversal
function on the line inputs.
The length of music passages used for the measurements was usually between 30s and 2min.
The RTA was set up to measure with a short integration time of 15ms and peak characteristics (instead of
averaging mode), in order to optimize accuracy towards detail.
For the first measurement the signals of the L and R channel of the stereo recording were summed and the
resulting mono signal ("L+R") was sent to the RTA, which was set to "peak hold" mode (with τ=∞; i.e. the
peaks are held indefinitely, until the machine is reset).
Therefore, at the end of the first run-through the peak-hold values for each of the 31 center frequencies
represented the absolute peak level which had been measured within the respective frequency band. The
peak-hold values were stored to one of the snapshot memories.

For the second measurement, the phase of the R channel was inverted by use of the phase reversal switch,
which resulted in an "L-R" signal upon summation. The peak-hold function was reset and the second
run-through (playback of the same music passage) was carried out. The result was stored in another memory
location for later data retrieval and comparison.

Admittedly, the principle behind this method of measurement is somewhat rough; however, it produces valid
results. In addition it has the advantage that the necessary equipment can often be found in
sound-recording studios, so it is easy for a sound engineer to test his own microphone technique in
respect to correlation over frequency.

For the measurement of the correlation coefficient a digital implementation of a correlation meter was
used. In order to gain sufficient frequency-band limiting when measuring the low- and high-frequency part
of the stereo signal, four cascaded digital implementations of 4th-order Linkwitz-Riley high- and low-pass
filters were used. This resulted in a theoretical cut-off slope of 96dB/octave (4 x 24dB/octave) in the
stop-band beyond the cut-off frequencies of 700Hz (Low-Pass) and 1000Hz (High-Pass), respectively.

3.3.3 Analysis of measurements

The idea behind these measurements is that, as has already been explained in more detail above (see
section 1.1), low frequency content in a stereo recording will be less and less affected by the shadowing
capacity of the human head, which is effective as a baffle for higher frequencies. Therefore, at low
frequencies, the sonic impression will tend to be a monophonic one, unless the signal content is largely
decorrelated ("wide stereo"), as is usually the case with high frequencies for almost all 2-channel
main-microphone techniques.
Comparing the difference signal to the summation signal shows how much of the signal in the L and R
channel has been identical: the respective signal components get canceled out in the L-R signal. An
attenuation of 6 dB in comparison to the L+R signal seems to indicate that half of the signal material in
the L and R channel was identical (CC=0.5).

As will be shown later, the exact amount of correlation cannot be deduced from this simple method, since
the measurement provides only the peak-hold value and not the value (averaged over time) of the
correlation coefficient in each frequency band.
However, the results of the measurements are indicative of the overall behavior in respect to
decorrelation of the various main-microphone techniques.
The results of the measurements with the correlation meter are compared with the
"frequency-dependent-level-attenuation" based evaluation method.
Research by Yamamoto and Nagata at NHK [28] has shown that the definition D of a recorded source for
symphonic music is 0.5.
Measurement of common values of correlation coefficients for standard recordings, conducted by the author,
showed that the averaged CC was around 0.5 (CCmin=0.0, CCmax=+0.7) for recordings that used a lot of spot
microphones, while recordings with AB-technique and few spot microphones had an average value of around
+/-0.0 (CCmin=-0.4, CCmax=+0.4).
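Both evaluation methods of section 3.3 can be approximated in software. The sketch below is not the author's measurement chain. It assumes that the correlation meter computes the normalised zero-lag cross-correlation of L and R, and it compares RMS (not peak-hold) energies of L-R and L+R over the whole band. The function names are illustrative, and neither the 1/3-octave RTA nor the Linkwitz-Riley band limiting is modelled.

```python
import math


def correlation_coefficient(left, right):
    """Normalised zero-lag cross-correlation of two equal-length sample lists."""
    cross = sum(l * r for l, r in zip(left, right))
    energy = sum(l * l for l in left) * sum(r * r for r in right)
    return cross / math.sqrt(energy) if energy > 0.0 else 0.0


def difference_attenuation_db(left, right):
    """Energy of L-R relative to L+R in dB (negative values mean attenuation)."""
    diff = sum((l - r) ** 2 for l, r in zip(left, right))
    summ = sum((l + r) ** 2 for l, r in zip(left, right))
    if summ == 0.0:
        return float("inf")
    if diff == 0.0:
        return float("-inf")
    return 10.0 * math.log10(diff / summ)


def expected_attenuation_db(cc):
    """RMS relation for equal-power channels: 10*log10((1-CC)/(1+CC)), -1 < CC < 1."""
    return 10.0 * math.log10((1.0 - cc) / (1.0 + cc))


if __name__ == "__main__":
    n = 1000
    a = [math.sin(2.0 * math.pi * k / 100.0) for k in range(n)]
    b = [math.cos(2.0 * math.pi * k / 100.0) for k in range(n)]
    # Right channel shares half of its amplitude with the left channel.
    right = [0.5 * x + math.sqrt(0.75) * y for x, y in zip(a, b)]
    print(f"CC = {correlation_coefficient(a, right):.3f}")
    print(f"L-R vs L+R = {difference_attenuation_db(a, right):.2f} dB")
```

Under the equal-power assumption this RMS relation gives about 4.8 dB of attenuation for CC=0.5, while 6 dB corresponds to a CC of roughly 0.6. Peak-hold readings, as used in the procedure above, need not follow this relation.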


3.3.3.1 The first set of measurements: Chamber Music Ensemble in a Hall

Fig. 5: The small AB arrangement used two omnidirectional
microphone capsules spaced apart by 50cm. The L and R channel
signals are decorrelated for high frequencies; below 250Hz, level
attenuation in the order of 8-12dB, due to correlation, becomes
apparent in the L-R signal.
LF(<700Hz): CC=+0.5 (averaged) (CCmin=+0.2, CCmax=+0.6)
HF(>1000Hz): CC=+0.3 (averaged) (CCmin=+0.2, CCmax=+0.7)
full-bw signal: CCtot=+0.45 (averaged) (CCmin=+0.3, CCmax=+0.6)
Microphones: Bruel & Kjaer 4006

Fig. 6: The sphere-microphone's L-R signal shows similar behavior:
level attenuation starts already at higher frequencies, at around
300Hz, with an attenuation of about 6-8dB. Below 160Hz the
attenuation is significantly higher, usually around 14-16dB.
LF(<700Hz): CC=+0.6 (averaged) (CCmin=+0.25, CCmax=+0.7)
HF(>1000Hz): CC=+0.4 (averaged) (CCmin=+0.2, CCmax=+0.6)
full-bw signal: CCtot=+0.5 (avgd) (CCmin=+0.4, CCmax=+0.6)
Microphone: Schoeps KFM-6U

Fig. 7: The ORTF L-R signal exhibits level attenuation already from
600Hz downwards, but to a smaller extent: it is usually in the range
of 10dB or less.
LF(<700Hz): CC=+0.7 (averaged) (CCmin=+0.2, CCmax=+0.9)
HF(>1000Hz): CC=+0.4 (avgd) (CCmin=+0.25, CCmax=+0.6)
full-bw signal: CCtot=+0.6 (avgd) (CCmin=+0.3, CCmax=+0.8)
Microphones: Schoeps CMC 54

Fig. 8: The measurements of the XY system (microphone angle 120°)
present level attenuation already from well above 4kHz. Starting
from around a few dB it reaches a peak of about 14dB in the lower
mid-frequency region (600Hz). In the low frequency region the
attenuation is mainly around 10dB.
LF(<700Hz): CC=+0.6 (averaged) (CCmin=+0.2, CCmax=+0.8)
HF(>1000Hz): CC=+0.4 (avgd) (CCmin=+0.25, CCmax=+0.5)
full-bw signal: CCtot=+0.7 (avgd) (CCmin=+0.4, CCmax=+0.8)
Microphones: Schoeps CMC 54

Fig. 9: The MS system, being composed of an omnidirectional
capsule in combination with a figure-of-eight, exhibits a quite
different pattern: the level attenuation within the L-R signal is more
or less evenly distributed over the entire frequency range and has a
very small value, mostly in the range of 2-3dB. This shows that the L
and R signals of the MS system are largely decorrelated over the
entire frequency range. This might be one of the reasons why some
people are praising the sonic result of using MS-microphone systems
also in a surround context: " ... Danmark Radio's use (of back-to-back
MS-systems) eclipses everything else in terms of localisation
and spatialisation". [29]
LF(<700Hz): CC=+0.2 (averaged) (CCmin=-0.4, CCmax=+0.7)
HF(>1000Hz): CC=+0.25 (avgd) (CCmin=+0.0, CCmax=+0.5)
full-bw signal: CCtot=+0.3 (avgd) (CCmin=+0.25, CCmax=+0.7)
Microphone: Neumann USM69
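The LF, HF and full-bandwidth correlation coefficients listed with each
figure were read from a correlation meter. As a rough numerical
counterpart, the sketch below computes a zero-lag correlation coefficient
over successive windows and reports its mean, minimum and maximum. It is
an assumption-laden illustration, not the author's meter: NumPy is assumed,
the 0.5s window length and the brick-wall band limits at 700Hz and 1000Hz
are choices made here, and the function names are hypothetical.

```python
import numpy as np


def band_limit(x, fs, f_lo=None, f_hi=None):
    """Brick-wall band limiting via the FFT (illustrative, zero phase)."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    keep = np.ones(len(freqs), dtype=bool)
    if f_lo is not None:
        keep &= freqs >= f_lo
    if f_hi is not None:
        keep &= freqs <= f_hi
    return np.fft.irfft(np.where(keep, spectrum, 0.0), n=len(x))


def correlation_stats(left, right, fs, window_s=0.5):
    """Mean, minimum and maximum zero-lag correlation over windows."""
    n = int(round(window_s * fs))
    values = []
    for start in range(0, len(left) - n + 1, n):
        l = left[start:start + n]
        r = right[start:start + n]
        norm = np.sqrt(np.sum(l * l) * np.sum(r * r))
        if norm > 0.0:  # skip windows of digital silence
            values.append(float(np.sum(l * r) / norm))
    return np.mean(values), np.min(values), np.max(values)


def cc_report(left, right, fs):
    """CC statistics for the three ranges quoted with each figure."""
    bands = {
        "LF(<700Hz)": (None, 700.0),
        "HF(>1000Hz)": (1000.0, None),
        "full-bw": (None, None),
    }
    return {
        name: correlation_stats(band_limit(left, fs, lo, hi),
                                band_limit(right, fs, lo, hi), fs)
        for name, (lo, hi) in bands.items()
    }
```

Calling cc_report(left, right, fs) on two channel arrays returns one
(average, CCmin, CCmax) triple per range, in the same layout as the value
lists under the figures.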


Fig. 10: The dummy head organ recording exhibits high amounts of
correlation below 200Hz, as was to be expected. Level attenuation in
this frequency range reaches values of 15dB and more. With a
CC=+0.75 it has one of the highest values of low frequency
correlation. High frequency content of the stereo signal appears to be
relatively decorrelated (CC=+0.25).
Overall, the characteristics of the signal in respect to correlation
are relatively close to small AB (50cm) and the sphere-microphone.
LF(<700Hz): CC=+0.75 (averaged) (CCmin=+0.5, CCmax=+0.8)
HF(>1000Hz): CC=+0.25 (avgd) (CCmin=+0.2, CCmax=+0.3)
full-bw signal: CCtot=+0.3 (avgd) (CCmin=+0.25, CCmax=+0.5)
Dummy head microphones: information not available

3.3.3.2 The second set of measurements: Orchestral Music in a Hall
AB technique recordings with microphone spacings of 20, 40, 80,
130, 320, 720 and 1200cm, as well as in ORTF-technique, have been
made for the purpose of measurement.
These microphone systems were all positioned as main-systems, i.e.
about 1-2 meters back from the conductor, at a height of about 3.5
to 4.5 meters (unless noted otherwise in the text).

Fig. 11: The small AB-system with 20cm spacing exhibits clear level
attenuation below 400Hz, which usually is in the order of 12-14dB.
LF(<700Hz): CC=+0.85 (averaged) (CCmin=+0.7, CCmax=+0.9)
HF(>1000Hz): CC=+/-0.0 (avgd) (CCmin=-0.15, CCmax=+0.2)
full-bw signal: CCtot=+0.7 (avgd) (CCmin=+0.4, CCmax=+0.8)

Fig. 12: The 40cm AB system (the recording of which was done at a
different occasion) shows a similar behavior: attenuation starts to
establish itself clearly from the 400Hz region downwards. Again,
attenuation levels are mainly around 12-14dB.
LF(<700Hz): CC=+0.7 (averaged) (CCmin=+0.4, CCmax=+0.8)
HF(>1000Hz): CC=+/-0.0 (avgd) (CCmin=-0.1, CCmax=+0.2)
full-bw signal: CCtot=+0.4 (avgd) (CCmin=+0.3, CCmax=+0.7)

Fig. 13: With the 80cm AB-System level attenuation has clearly been
shifted down to the 200Hz region and below. Also, attenuation levels
have clearly been reduced, due to increased decorrelation of the L
and R channel. At a spacing of 130cm (not shown), this effect
becomes even more pronounced: level attenuation in the range of
2-3dB gets shifted to the frequency range below 100Hz.
LF(<700Hz): CC=+0.5 (averaged) (CCmin=+0.2, CCmax=+0.7)
HF(>1000Hz): CC=+/-0.0 (avgd) (CCmin=-0.2, CCmax=+0.1)
full-bw signal: CCtot=+0.25 (avgd) (CCmin=+0.25, CCmax=+0.4)

Fig. 14: The AB 320cm spacing already exhibits much less
correlation; level attenuation in the L-R signal drops to values in the
range of about 5dB.
(Remark: the peak at about 8kHz is a triangle hit)
LF(<700Hz): CC=+/-0.0 (averaged) (CCmin=-0.3, CCmax=+0.5)
HF(>1000Hz): CC=+0.1 (avgd) (CCmin=-0.2, CCmax=+0.2)
full-bw signal: CCtot=+0.1 (avgd) (CCmin=-0.2, CCmax=+0.3)


Fig. 15: The 720cm spacing looks more decorrelated again than the
320cm spacing, as can be seen mainly in the region between 100 and
200Hz, as well as from the values of attenuation from 40Hz
downwards.
LF(<700Hz): CC=±0.0 (averaged) (CCmin=-0.3, CCmax=+0.4)
HF(>1000Hz): CC=±0.0 (avgd) (CCmin=-0.1, CCmax=+0.1)
full-bw signal: CCtot=±0.0 (avgd) (CCmin=-0.25, CCmax=+0.3)

Fig. 16: The recording of the 1200cm spacing was again taken at a
different performance. Judged visually, decorrelation does not seem
to have been increased in comparison to the 720cm spacing, which
might have to do with the fact that both systems already have
spacings which exceed the critical distance (reverberation radius) of
the hall, which has a value of 5.8m. (Volume: 15500 cubic meters,
2158 seats, medium RT60=1.5s [occupied])
LF(<700Hz): CC=±0.0 (averaged) (CCmin=-0.25, CCmax=+0.2)
HF(>1000Hz): CC=-0.1 (avgd) (CCmin=-0.15, CCmax=+0.1)
full-bw signal: CCtot=±0.0 (avgd) (CCmin=-0.25, CCmax=+0.1)

Fig. 17: The measurement of the ORTF signal exhibits level
attenuation from about 2kHz downwards. However, above approx.
700Hz the human head becomes increasingly effective as a
baffle, therefore mainly the level attenuation in the region below is of
interest. At around 100Hz level attenuation starts to reach a value of
about 10dB, which also corresponds well with the measured
CCLF=+0.8.
LF(<700Hz): CC=+0.8 (averaged) (CCmin=+0.2, CCmax=+0.8)
HF(>1000Hz): CC=±0.0 (avgd) (CCmin=-0.2, CCmax=+0.2)
full-bw signal: CCtot=+0.6 (avgd) (CCmin=+0.25, CCmax=+0.7)

Fig. 18: AB-PC system with AB 12m and ORTF-centerfill
Finally, the measurement of the AB-PC main microphone system
(see section 6) displays the advantage of a largely spaced AB
microphone system: the L and R channel signals remain decorrelated
also towards the deepest frequencies. There is only a little bit of
"turbulence" (attenuation of varying extent) in the frequency range
between 125-800Hz, probably due to the influence of the "Centerfill"
microphone system (an ORTF system was used in this recording).
LF(<700Hz): CC=+0.3 (averaged) (CCmin=+0.1, CCmax=+0.4)
HF(>1000Hz): CC=+0.2 (avgd) (CCmin=±0.0, CCmax=+0.2)
full-bw signal: CCtot=+0.25 (avgd) (CCmin=+0.2, CCmax=+0.3)

3.3.3.3 Measuring with "colored noise"
During the course of the measurements the author found that the
"colored noise" signal, produced by the audience before the
performances and in the intermissions, turned out to be a valuable
test signal.
Due to its spectral characteristics, sum and difference signal could be
compared "instantaneously" by switching the R channel (of the above
described measurement setup) in and out of phase. The results for a
20, 40 and 80cm spacing can be seen in Fig. 19-21:

Fig. 19: Small AB (20cm) measurement with "audience noise"
LF(<700Hz): CC=+0.8 (averaged) (CCmin=+0.7, CCmax=+0.8)
HF(>1000Hz): CC=-0.1 (avgd) (CCmin=-0.25, CCmax=+0.1)
full-bw signal: CCtot=+0.6 (avgd) (CCmin=+0.5, CCmax=+0.7)


Level attenuation in the L-R signal seems to have a 9dB/Octave
slope from the "critical frequency" downwards. With the 20cm
spacing, frequencies up to 800Hz were clearly affected; for the 40cm
spacing the "critical frequency" dropped to the 500Hz region, and
with the 80cm spacing it shifted to the 250Hz region.

Fig. 20: Small AB (40cm) measurement with "audience noise"
LF(<700Hz): CC=+0.7 (averaged) (CCmin=+0.6, CCmax=+0.75)
HF(>1000Hz): CC=+/-0.0 (avgd) (CCmin=-0.1, CCmax=+0.1)
full-bw signal: CCtot=+0.4 (avgd) (CCmin=+0.3, CCmax=+0.5)

Fig. 21: Small AB (80cm) measurement with "audience noise"
LF(<700Hz): CC=+0.2 (avgd) (CCmin=+/-0.0, CCmax=+0.25)
HF(>1000Hz): CC=+/-0.0 (avgd) (CCmin=-0.1, CCmax=+/-0.0)
full-bw signal: CCtot=+0.1 (avgd) (CCmin=-0.05, CCmax=+0.15)

It can be clearly seen from these "colored noise" measurements that
the correlation of small AB systems increases steadily as frequency
drops below the "critical frequency". The colored noise
measurements display much higher values of level-attenuation than
the previous measurements with the same microphone spacings
reported above, and seem to give more accurate figures due to the
difference of the noise stimulus.
However, it has to be noted that due to the relative distances of the
(small) AB systems and the sound sources, the AB microphones are
certainly way out in the diffuse field in the case of the "colored
noise" measurements, while they are still in the direct sound field (at
least for part of the orchestra) in the case of a musical performance.
This most likely contributes, apart from the described alteration in
the measurement procedure, to the differences in level attenuation of
the various documented examples.
Finally, it can be noted that the values for the "Critical Frequency" of
microphone spacings of 20, 40 and 80cm taken from the "ambience
noise" measurements coincide reasonably well with results calculated
from eqn. (5): 856Hz, 428Hz and 214Hz, respectively.

3.3.3.4 Cross-referencing
For reasons of cross-referencing, recordings made at acoustically
quite different locations have been measured as well.
The ORTF recording of an orchestra in a small church (Fig. 22)
showed a similar pattern as the one carried out at the Salzburg
Festival Hall (Fig. 17): level attenuation starts with a few dB in the
1kHz region and increases towards lower frequencies to values of
about 10-12dB for the Salzburg Festival Hall and around 14dB for
the church.

Fig. 22: ORTF recording in church
LF(<700Hz): CC=+0.5 (avgd) (CCmin=+0.25, CCmax=+0.75)
HF(>1000Hz): CC=+/-0.0 (avgd) (CCmin=-0.2, CCmax=+0.2)
full-bw signal: CCtot=+0.5 (avgd) (CCmin=+0.25, CCmax=+0.7)
At the same church and with the same orchestra another recording
has taken place, making use of a Blumlein-Pair microphone system
(Fig. 23). These measurements display much smaller levels of
attenuation, mainly in the range of 3-4dB, starting from about 1kHz
downwards.

Fig. 23: Blumlein-Pair recording at church
LF(<700Hz): CC=+0.4 (averaged) (CCmin=+0.2, CCmax=+0.6)
HF(>1000Hz): CC=+0.2 (avgd) (CCmin=+0.1, CCmax=+0.3)
full-bw signal: CCtot=+0.4 (avgd) (CCmin=+0.2, CCmax=+0.5)

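As a cross-check on the "critical frequency" values quoted in section
3.3.3.3 for spacings of 20, 40 and 80cm: eqn. (5) is not reproduced in
this part of the paper, but the three quoted results are consistent with
the half-wavelength relation f = c/(2d) for a speed of sound of about
342.4m/s. Both the form of the relation and that value of c are
assumptions made here for illustration; the function name is hypothetical.

```python
# Assumed speed of sound in m/s; chosen because it reproduces the quoted values.
SPEED_OF_SOUND = 342.4


def critical_frequency_hz(spacing_m, c=SPEED_OF_SOUND):
    """Frequency at which the capsule spacing equals half a wavelength."""
    return c / (2.0 * spacing_m)


# Regions read off the "audience noise" plots, in Hz, per spacing in metres.
observed_hz = {0.20: 800, 0.40: 500, 0.80: 250}

for spacing, seen in observed_hz.items():
    calc = critical_frequency_hz(spacing)
    print(f"{spacing * 100:.0f}cm: calculated {calc:.0f}Hz, observed ~{seen}Hz")
```

Each doubling of the spacing halves the calculated frequency, which matches
the downward shift of the attenuation region seen from Fig. 19 to Fig. 21.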
Fig. 24: Blumlein-Pair recording in studio


On the above mentioned test CD (see section 3.1) recordings of a
woodwind trio can be found, executed in relatively dry studio
acoustics. Taking a look at the measurement of the Blumlein-Pair
recording of the trio gives a similar result: from about 1kHz
downwards level attenuation becomes apparent with relatively small
values of just a few dB.
LF(<700Hz): CC=+0.25 (averaged) (CCmin=-0.2, CCmax=+0.7)
HF(>1000Hz): CC=+0.1 (avgd) (CCmin=-0.25, CCmax=+0.2)
full-bw signal: CCtot=+0.25 (avgd) (CCmin=+0.1, CCmax=+0.7)

Fig. 25: XY-pair (90 deg.) recording in studio
LF(<700Hz): CC=+0.9 (averaged) (CCmin=+0.5, CCmax=+0.9)
HF(>1000Hz): CC=+0.6 (avgd) (CCmin=+0.3, CCmax=+0.75)
full-bw signal: CCtot=+0.8 (avgd) (CCmin=+0.6, CCmax=+0.95)

The 90° XY-microphone system recording on the same CD brought a
little surprise (Fig. 25): probably due to the dry studio acoustics (lack
of diffuse field) the L-R signal displays almost constant level
attenuation also at higher frequencies. For frequencies below 1kHz
the measurements show a similar behavior as with the already
examined XY-recording (see Fig. 8): significant amounts of level
attenuation in the L-R signal, mainly in the range of 12-14dB.
Remark: The somewhat strange "hump" at frequencies below 40Hz
with the last two recordings from the test CD is probably the result of
a lack of sound-proofing (mechanical de-coupling) in that frequency
range against exterior noise sources.

Fig. 26: MS recording (with cardioid capsule) in studio
LF(<700Hz): CC=+0.6 (averaged) (CCmin=+0.3, CCmax=+0.8)
HF(>1000Hz): CC=+0.3 (avgd) (CCmin=+0.1, CCmax=+0.7)
full-bw signal: CCtot=+0.6 (avgd) (CCmin=+0.4, CCmax=+0.8)

The MS recording displays an interesting characteristic: in contrast
to the MS system measurement of Fig. 9, the MS recording from the
studio exhibits clear level attenuation throughout the whole
frequency range. Two plausible explanations come to mind:
First, the L-channel signal in an MS system is essentially of the form
(M+S), while the R-channel signal consists of (M-S). Therefore, our
L+R and L-R signal processing operation within the measurements
yields "2M" and "2S". Since the L-R (=2S) signal in Fig. 26 is always
weaker than the L+R signal, this means essentially that the side
signal (S) of the MS system had less level than the forward facing
cardioid capsule (M).
This can be due to either
a) lack of side reflections (dry studio acoustics), and / or
b) the side signal was simply included in the mix with several dB less
than the M signal.
However, it is to be noted that once we compensated the lack of level
in the L-R signal of Fig. 26 by about +6 to +10dB, the overall level
attenuation would drop significantly, displaying a reasonably
decorrelated stereo signal. It is mainly the frequency range below
100Hz which displays level attenuation in excess of 10dB (before
level compensation).

4. CONSIDERATIONS ON SOUND AESTHETICS

Concerning the recording of acoustic instruments in an orchestral
context Jean-Marie Porcher of Radio France stated that "... spectral
fidelity is the first priority, followed by localization." at the Paris
AES Convention's surround-workshop in 2000. [29]

The author is of contrary opinion, since the perception of the spectral
characteristics of an instrument within the orchestra will depend on
the position of the listener in the hall, due to the complex radiation
characteristics of acoustic music instruments interacting with
shadowing effects of adjacent musicians. In order to achieve a more
naturalistic impression of the sound event it is necessary to achieve
proper localization (in the sense of spatialisation, positioning of
instruments in relation to sound stage depth, etc.). The use of too
many spot microphones (for the sake of sonic "brightness") however
certainly "blurs" the integrity of the overall sonic picture of the
orchestra in respect to spectral fidelity as well as spatialisation.
"Single point" recordings in the form of coincident or near coincident
techniques have one major disadvantage in comparison to wide AB:
They almost always provide a perspective "from the inside" of the
orchestra, meaning from somewhere along the center line that splits
the orchestra in half left and right from the conductor's position.
Usually the microphones are also quite close to the orchestra,
especially to the string instruments next to the conductor's podium.
Only XY techniques with crossed cardioids or figures-of-eight
(Blumlein-Pair) might allow moving the microphones half the stage
width or even more out in the hall. For most of the other
arrangements the microphones stay inside the critical distance
(reverberation radius), usually slightly back from the conductor's
position and between 3 and 5 meters high.

Large AB on the other hand captures the orchestra more from the
"outside". Since the recording normally gets played back on home
systems with loudspeaker spacings of much smaller dimension, it
also seems to be the right approach to try to capture the orchestra in
its full width, which will suffer "downscaling" on playback in any
case. It is the author's experience (with orchestra as well as opera
recording) that large spaced main-microphone techniques which
make use of only very few spot-microphones preserve the sense of
"space" much better when played back on low-fi sound systems with
very small speaker spacings (tv-monitors, for example) than most
other techniques.


A look at figures 27 and 28 shows another advantage of large AB:

Fig. 27: Large AB spacing ~12m, critical distance (reverberation
radius) ~5.8m (drawing not to scale)
With a large AB system about half of the orchestra instruments are
within the critical distance of the two microphones A and B.
In addition, the majority of the rear instrumental sections of the
orchestra are quite loud and have highly directional characteristics,
which helps in terms of localization.
Apart from better L/R amplitude separation (due to the level
attenuation with distance according to the inverse square law),
time-of-arrival differences in a large AB system are much bigger in
comparison to a small AB system (see below).

Fig. 28: Small AB spacing ~50cm, critical distance (reverberation
radius) ~5.8m (drawing not to scale)
With a small AB technique, sound sources outside the critical
distance from the main system will not be localized properly since
the diffuse field dominates the direct sound. As a consequence
there is a need for spot-microphones.
In case of a large AB system amplitude differences as well as the
much larger time-of-arrival differences allow proper localization
even for sound sources outside of the reverberation radius.

The "hole in the middle" effect, inherent to the large AB technique,
can easily be compensated with an appropriate centerfill system (see
section 6).
With large AB there may be a certain degree of localization
distortion, as some critics are claiming (even though the author
himself has not experienced anything like that in his recordings).
Most sources refer to theoretical calculations based solely on
loudness differences, as published in the original paper about the
experiments at the Bell Laboratories from 1934 [7]. One of the
authors of that paper has revised these statements in a later
publication [30], in which also arrival-time differences and quality
differences (referring to the amplitude-frequency spectrum) were
taken into consideration.
Even if misrepresentations in terms of localization were an integral
part of the large AB technique, the author would like to voice his
opinion that such a distortion introduced to the overall sonic
character is by far smaller than the distortion in respect to
sound-stage depth, spaciousness, clarity of the recording and sonic
character of the various instruments, which usually comes along with
the use of spot-microphones.
Looking at common working practices in the industry, the author has
experienced sound-engineers deliberately "panning" musicians to a
different position in the sonic panorama of a stereo-mix, in order to
achieve better sonic separation between instruments, for the sake of
the listener. This happened despite the fact that the audio was
supposed to accompany a visual representation of the performance.
How much less relevant is the exact positioning of, for example, the
players of the woodwind section, in an "audio-only" context?

5. WORK PRACTICE


In my work as sound engineer I have found the following guidelines,
which I consider essential for making a good recording of an
ensemble of acoustic instruments:
1. Try to achieve a "natural perspective" in the recording; i.e. capture
the rear parts of an orchestra in a manner which lets them appear to
be further away from the listener than the ones in front;
2. Use as many microphones as necessary, but as few as possible;
3. Capture the acoustics of the room and the instruments within with
a microphone technique which works as "linearly" as possible (in
respect to localisation and depth) over the audio frequency range
(20Hz – 20kHz);
4. The "sonic picture" created through the recording should emanate
from one plane ("zero delay plane"), therefore the use of spot
microphones should be restricted as much as possible (with the
exception of vocal and instrumental soloists);
5. Microphones are preferably pointing from the direction of the
audience towards the orchestra ("audience perspective"), in order to
achieve more natural tonal colors;
6. The use of artificial reverb or room simulation should be kept to a
minimum.
At first glance these guidelines seem to be pretty much common
knowledge of practicing sound engineers; however, following them
strictly leads to the exclusion of various well-established
main-microphone techniques.

6. PROPOSAL FOR A "DECORRELATED" MICROPHONE SYSTEM: "AB - POLYCARDIOID CENTERFILL" (AB-PC)

As has been shown, sufficient decorrelation of a main-microphone
signal also at low frequencies is necessary for better spatial
reproduction. Therefore a "large AB" (as opposed to "small AB")
microphone setting should be used for orchestral recording.
The acoustic "hole in the middle" effect, considered by many people
to be the pitfall of large AB, turns out to be its biggest advantage:
in recording situations where a soloist (instrumental or vocal)
performs with an orchestra, this soloist is almost always represented
too loud or too close in the conventional coincident or near coincident
main-microphone arrangements for stereo, as well as 5.1 surround
recording. With large AB, the "hole in the middle" gives the sound
engineer the freedom to employ additional microphones which can be
optimized (in terms of position as well as level) to suit the soloist,
and which (with a "purist" recording approach in mind, minimizing
the number of microphones used) can often serve the orchestra at
the same time.


The proposed AB-PC microphone setup consists of a large AB setup
with a microphone spacing of about half the orchestra-width. The
two omnidirectional capsules are preferably diffuse-field
compensated and are positioned about 1 to 2m in front of the
orchestra, at a height between 3.5 and 4.5m. (Exact positioning of
course depends on the hall acoustics (reverberation radius), size of
the orchestra, etc.; the values cited above refer to a full size
symphonic orchestra.) As "Centerfill" various systems can be
employed: the author's preference goes towards the use of one or
several cardioid microphones, since they provide sufficient channel
separation (rejection of sound-sources "outside" of the microphone
pattern), while exhibiting acceptable amounts of off-axis coloration.
(Microphone patterns with higher directivity usually suffer from much
stronger sound-coloration, which is detrimental to the overall sonic
impression achieved in the final mix.)
A standard AB-PC configuration for a plain symphony-orchestra
recording will use a large AB setup with an ORTF-system as
centerfill.

Fig. 29: AB-PC with ORTF-Triple as centerfill and soloist mic

For better coverage of the middle section of the orchestra a third
cardioid microphone can be added to the middle of the ORTF system,
an arrangement which the author refers to as the "ORTF-Triple", or
"ORTF-T" (see Fig. 30).
The third microphone of the ORTF-Triple is usually pointing at the
woodwind section of the orchestra (regular orchestra setup
presumed).
Single standing soloists (a violin, for example) are usually
captured sufficiently by the ORTF system; however, adding a
dedicated soloist microphone has the following advantages:
sound pick-up of the microphone can be optimized in respect to the
soloist's position on stage, and in the mix the microphone-signal can
be used to (virtually) pull the soloist towards the center, if that is
desired. This works very well in a 5.1 surround context, where the
soloist microphone may be routed to the center channel directly,
which has the added advantage that this signal cannot cause phase
cancellation when being mixed with the stereo-signals of the AB-
and ORTF-System in the electrical domain.
For a conventional stereo-mix the sound-engineer has to evaluate
carefully whether mixing in the soloist spot microphone actually
enhances or disturbs the sonic integrity of the mix. He might then
also consider leaving the third microphone of the ORTF-Triple out
of the mix and instead try to optimize the position of the soloist
microphone to capture the soloist in the foreground, as well as the
woodwind section in the background, instead of risking unnecessary
phase cancellation when using both of them in the mix (this
applies for 2-channel stereo, as well as 5.1 surround).

An alternative approach is to try to separate the centerfill and soloist
microphones, as can be seen in Fig. 31.

Fig. 30: Schoeps microphones configured as "ORTF-Triple"

Fig. 31: AB-PC with ORTF-Triple and spot microphones for piano

Due to the acoustic dominance of the grand piano the ORTF-Triple
centerfill system has been moved more towards the inside of the
orchestra: it is suspended above the piano (at about 3.5 to 4.5m
height), slightly angled down towards the woodwinds and adjacent
instrument sections. The ORTF-T's position is optimized to minimize
sound pickup from the piano.

In this position the ORTF-Triple becomes some sort of stereo
spot-microphone for the rear section of the orchestra. Arrival-time
differences in relation to the AB-main system (distance "d") have to
be accounted for by delaying the signals of the three ORTF-Triple
microphones for mixdown, otherwise some unnecessary localization
distortion, due to the precedence effect, would be the result.

As can be expected, the displacement of the ORTF-T centerfill
system from its normal position out in the hall has sonic
consequences: a setup like in Fig. 31 will usually not provide the
same sonic transparency in respect to spatialisation as does the one
displayed in Fig. 29.
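The arrival-time compensation mentioned above for the setup of Fig. 31
(delaying the ORTF-Triple signals according to their distance "d" from the
AB main system) amounts to dividing that distance by the speed of sound.
The following sketch shows the arithmetic only; the speed of sound of
343m/s, the 48kHz sample rate, the 3m example distance and the function
names are assumptions for illustration and are not taken from the paper.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed (air at roughly 20 degrees Celsius)


def delay_seconds(distance_m, c=SPEED_OF_SOUND):
    """Travel time of sound over the distance d between the two systems."""
    return distance_m / c


def delay_samples(distance_m, sample_rate_hz, c=SPEED_OF_SOUND):
    """Delay rounded to whole samples for a fixed-sample-rate mixdown."""
    return round(delay_seconds(distance_m, c) * sample_rate_hz)


# Hypothetical example: centerfill system 3m away from the AB main system.
d = 3.0
print(f"d = {d}m -> {delay_seconds(d) * 1000:.2f}ms, "
      f"{delay_samples(d, 48000)} samples at 48kHz")
```

As a rule of thumb under these assumptions, each meter of offset
corresponds to a little under 3ms of delay.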


One of the main ideas behind the purist AB-Polycardiod Centerfill to create envelopment, it is important to capture or add rear signal
approach is to preferably picture the entire orchestra from a so-called information of decorrelated nature.It is also the author's experience
"Zero-Delay Reference-Plane" (see also Theile [31]), which is that, if no particular sound event at the back of the hall apart from the
oriented at the AB microphones. An offset of the centerfill section of acoustic response of the room is to be captured, two cardioid
about +/- 1m from the plane seems to be uncritical in respect to microphones, pointing towards the rear of the hall, separated by and
localization distortion (depending on the size of the orchestra and at a distance from the orchestra of at least the reverberation radius
exact position of the microphones), bigger values have to be (critical distance) will deliver appropriate rear channel signals.
compensated for. In general it can be said that the ORTF-Triple may
be positioned further back in the hall in comparison with the AB
6.2 AB-PC in relation to other 5.1 systems
microphones to achieve a better balance between the front and rear
Since the above proposed AB-PC technique, which eliminates as far
parts of the orchestra's middle section. However, since it is also used
as possible the use of spot-microphones, may seem odd to some of
as some sort of spot-microphone for the first desks of the string
the readers, the author was happy to find that another professional in
sections left and right of the conductor it will probably be positioned
the field uses a very similar technique: As described in [32]
nearer to the orchestra than if it was used as a traditional main
Tomlinson Holman uses an arrangement in which he combines an
microphone system.
ORTF-system, panned center-L and center-R , but uses cardioids for
the AB-pair, which are each fully routed to the L and R channel. To
Also, it has to be pointed out that the L and R channel signal of the
this front-channel arrangement he adds delayed spot-microphones.
ORTF-microphone will not be panned fully L and R as usual, but
The information for the rear channels is derived from a mix of
instead panned towards the center (L channel at "10-11a.m.", R
cardioid and omnidirectional microphones positioned in the hall.
channel at "1-2p.m.") according to its function as centerfill system.
Sonic Time-of-arrival differences between the front and rear
microphones are compensated for by advancing the rear microphone
signals in time via shifting of tracks on a harddisk editing system.

On the one hand, quite a number of proposals for 5.1 surround
microphone systems make use of relatively closely spaced
omnidirectional as well as directional microphones. On the other hand,
a recent survey [33] of techniques applied by practicing sound
engineers in the field has revealed a discrepancy: the majority of
sound engineers actually use proprietary techniques for 5.1 recording,
which are derived from previous 2-channel stereo techniques, usually
with larger microphone spacings in the range of 1.5 to 4 m. There is
also a trend towards using more omnidirectional than directional
transducers; the majority of engineers in this survey used 3-5
omnidirectional microphones. The advantage of these "free" microphone
arrangements is that they can easily be adjusted according to the
requirements of the piece, hall acoustics, etc.

The conclusion of a project on surround recording of orchestral music,
carried out by a group of students at the "Hochschule der Künste" in
Berlin in 1997, was that the front channel signals (at least the L and
R channels) should preferably be decorrelated, which could be achieved
by using three omnidirectional microphones at a spacing of at least
1.5 m each [34].

A comparative study of statistically evaluated listening tests by
Hildebrandt and Braun [35] also found that for broad sound sources,
like an orchestra, microphone techniques using widely spaced
omnidirectional microphones (in that particular case 5 pan-potted
pressure transducers) were preferred for their superior spatial
reproduction and localization qualities (for off-axis listening
positions, in the second case).

All the above-mentioned studies show that widely spaced systems
provide superior spatial reproduction of sound events, at least for big
sound sources like orchestras.

Fig. 32: AB-PC with "Decca-Triangle" style centerfill system

Figure 32 displays an alternative centerfill system: a
"Decca-Triangle" style arrangement with 3 cardioids. The center
microphone C is pointing towards the woodwinds and is routed directly
to the center channel in a 5.1 system. There are significant
differences, though, in comparison to a traditional Decca-Triangle
setup: the centerfill microphones D and E are not assigned fully to the
L and R channels, but are panned "ad libitum" center-L and center-R.
The microphones might be angled slightly outwards for better acoustic
separation (important, since their signals are panned in relative
vicinity on the L/R stereo bus).
Again, the center microphone C should be compensated in terms of
arrival-time differences to optimize localization. Replacing the single
C microphone with an ORTF-Triple is an alternative if better coverage
of the rear orchestra parts seems necessary.

All the above explanations might sound quite theoretical; therefore
the author would like to emphasize that these microphone setups are the
result of tried-and-tested work practice, which has already achieved
superior sonic results in respect to transparency and spatial
reproduction. In connection with the AB-Polycardioid Centerfill system
described above, the author has coined the term "Natural Perspective",
because this is what the minimalist microphone technique aims to
achieve.

6.1 The AB-PC arrangement as a 5.1 system


In the previous section various versions of an AB-Polycardioid
Centerfill system have been explained in detail in respect to front-
signal assignment in a 5.1 context. The capture of signals for the L
and R surround channels has not been dealt with so far.
Griesinger has shown in various of his papers [15, 22] that in order


7. CONCLUSIONS

It has been shown that not all of the standard main-microphone
systems for 2-channel stereo recording are equally suited to capturing
sound events faithfully in respect to spatial reproduction, due to
their frequency-dependent correlation characteristics. Ideally,
microphone systems should also be able to capture low-frequency signal
content with sufficient decorrelation.
An AB-pair of omnidirectional microphones will provide correct
spatial reproduction only down to the "critical frequency", which can
be calculated from the spacing between the capsules.
Based on the measurements, a ranking of main-microphone systems
in respect to low-frequency decorrelation is attempted:
Large AB, AB-PC, Blumlein-Pair and MS seem to be more
decorrelated, followed by ORTF, small AB, sphere microphone,
dummy head and XY.
Therefore small AB and XY seem less well suited for the recording
of large sound masses (symphonic orchestra), if replay via
loudspeakers is intended. Due to their nature, dummy head and
sphere microphone are most likely better used with headphone
reproduction.

A microphone system ("AB-PC") with sufficient decorrelation, also in
respect to low frequencies, has been proposed, which is well suited
for 2-channel stereo as well as 5.1 surround recording. It has the
added advantage of being equally suited for the recording of
symphonic orchestras and of soloists accompanied by an orchestra or
instrumental ensemble, an application where many of the conventional
main-microphone systems fail.
In addition, the sonic characteristics of this microphone system
translate well to low-fi replay systems with small speaker spacings,
such as TV monitors.

General suggestions in respect to recording practice in order to
achieve a "natural perspective" and better spatial reproduction have
been made.

8. ACKNOWLEDGEMENTS

The author would like to thank Dr. David Griesinger, Eng. Günther
Harner, Prof. Jürg Jecklin, Dr. Christoph Schüller and Mag. Markus
Waldner.

9. REFERENCES

[1] Rumsey F., Segar P., "Optimisation and Subjective Assessment
of Surround Sound Microphone Arrays", paper #5368 presented at
the 110th AES Convention, Amsterdam, May 2001

[2] Sengpiel E., lecture notes at the "Hochschule der Künste", Berlin

[3] Wuttke J., "The Microphone between Physics and Emotion",
paper presented at 20. Tonmeistertagung des VDT, Karlsruhe 1998,
pp. 460

[4] La Grou J., "Orchestral Recording", Mix, Feb. 1994, pp. 32

[5] Valin J., "The RCA Bible – A Compendium of Opinion on RCA
Living Stereo Records", Second Edition, The Music Lovers Press
(1994), pp. 123, 124

[6] Blumlein A., "Improvements in and relating to Sound-
transmission, Sound-recording and Sound-reproducing Systems",
British patent 394,325, Dec. 14 1931 (reprinted in the "Anthology of
Stereophonic Techniques", AES, 1986, pp. 32-40)

[7] Steinberg J. C., Snow W. B., "Auditory Perspective – Physical
Factors", Electrical Engineering, vol. 53, no. 1, Jan. (1934),
pp. 12-15

[8] Yost W. A., Wightman F. L., Green D. M., "Lateralisation of
filtered clicks", Journal of the Acoust. Soc. of America, 50,
pp. 1526-1531, (1971)

[9] Hirata Y., "Improving stereo at L.F.", Wireless World, pp. 60,
Oct. 1983

[10] Yanagawa H., Higashi H., Mori S., "Interaural correlation
coefficients of the dummy head and the feeling of wideness", A.lS. J.,
Tech. Rep. H-35-1, (1976)

[11] Suzuki A., Tohyama M., "Interaural cross-correlation coefficient
of KEMAR head and torso simulator", IECE Japan, Tech. Rep. EA80-78,
(1981)

[12] Griesinger D., "Objective Measures of Spaciousness and
Envelopment", paper at the AES 16th Int. Conf. on Spatial Sound
Reproduction

[13] Morimoto M., Maekawa Z., "Effects of Low Frequency
Components on Auditory Spaciousness", Acustica, vol. 66 (1988)

[14] Hidaka T., Beranek L., Okano T., "Interaural cross-
correlation, lateral fraction, and low- and high-frequency sound
levels as measures of acoustical quality in concert halls", J. Acoust.
Soc. Am. 98 (2), August (1995)

[15] Griesinger D., "The Theory and Practice of Perceptual
Modeling – How to use Electronic Reverberation to Add Depth and
Envelopment Without Reducing Clarity", paper presented at the 21.
Tonmeistertagung of the VDT, Hannover 2000, pp. 766-795 (see also
www.world.std.com/~griesngr)

[16] Hecker P., "The Decision of the Microphone Spacing and its
Creative Benefit", paper (in German) presented at the 21.
Tonmeistertagung of the VDT, Hannover 2000, pp. 796-804,
Proceedings (ISBN 3-598-20362-4)

[17] Marshall A. H., "Acoustical Determinants for the Architectural
Design of Concert Halls", Arch. Sci. Rev., Australia 11, 1968,
pp. 81-87

[18] Barron M., "The Subjective Effects of First Reflections in
Concert Halls – The need for Lateral Reflections", J. Sound Vib., 15,
1971, pp. 475-494

[19] Barron M., Marshall A. H., "Spatial impression due to early
lateral reflections in concert halls", J. Sound Vib. 77, 1981,
pp. 211-232

[20] Griesinger D., "Spaciousness and Envelopment in Musical
Acoustics", paper presented at the "19th International Convention on
Sound Design" of the VDT in Karlsruhe, 1996, Proceedings (ISBN
3-598-20360-8), pp. 375-391

[21] Griesinger D., "Spatial Impression and Envelopment in Small
Rooms", paper presented at the 103rd AES Convention in New York,
Sept. 1997, preprint #4638 (H-2)


[22] Griesinger D., "General overview of spatial impression,
envelopment, localization and externalization", Proc. 15th Int.
Conf. AES on Small Rooms Acoustics, pp. 136-149, Denmark,
Oct/Nov 1998 (see also www.world.std.com/~griesngr)

[23] "Best of Chesky Classics & Jazz and Audiophile Test Disk
Volume 3", Chesky Records, JD111D, Tracks 15, 18

[24] Streicher R., Dooley W., "Basic Stereo Microphone
Perspectives – A Review", JAES, vol. 33, no. 7/8, pp. 548-556,
July/Aug. 1985

[25] Ceoen C., "Comparative Stereophonic Listening Tests", Journal
of the AES, vol. 20, no. 1, Jan./Feb. (1972), pp. 19-27

[26] Clark H. A. M., Dutton G. F., Vanderlyn P. B., "The
'Stereosonic' Recording and Reproduction System", JAES, vol. 6,
no. 2, April 1958, pp. 102-117

[27] Mitchell D., "Tracking for 5.1 – Surround-Field Recording
Techniques", Audio Media, Nov. 1999, pp. 100-105

[28] Yamamoto T., Nagata M., "Acoustical characteristics at
microphone positions in music studios", NHK Techn. Rep., vol. 22,
pp. 475-89, (1970)

[29] Nelson T., "Multichannel Mayhem", Studio Sound, June 2000,
pp. 69-72

[30] Snow W. B., "Basic Principles of Stereophonic Sound",
Journal of the SMPTE, vol. 61, Nov. (1953), pp. 567-589

[31] Theile G., "Microphone and Mixing Concepts for 5.1 Music
Recordings", paper presented at the "21st International Audio
Convention" of the VDT in Hannover, 2000, Proceedings
(ISBN 3-598-20362-4), pp. 384

[32] Holman T., "Mixing the Sound (Part 2): Perspective – where do
the sounds go?", Surround Professional, May/June 2001, pp. 35

[33] Betz G., "Surround Recordings – Practical Experiences", paper
(in German) presented at the 21. Tonmeistertagung of the VDT,
Hannover 2000, Proceedings (ISBN 3-598-20362-4), pp. 485-494

[34] Goßmann J., "Die Wirkung von Laufzeit- und Pegeldifferenzen
bei 5-Kanal Stereophonie", paper from the report of the "20th Int.
Convention on Sound Design" of the VDT in Karlsruhe, 1998,
Proceedings (ISBN 3-598-20361-6), pp. 1233-1237

[35] Hildebrandt A., Braun D., "3/2 Stereo: Investigation on the
Center Channel", paper presented at the "21st International Audio
Convention" of the VDT in Hannover, 2000, Proceedings
(ISBN 3-598-20362-4), pp. 455


10. APPENDIX

Fig. 2: Variation in loudness level as a sound source is rotated in a horizontal plane around the head (from [7])

