ENGINEERING REPORTS

High-Resolution Subjective Testing Using a Double-Blind Comparator*
DAVID CLARK
ABX Company, Troy, MI 48099, USA
A system for the practical implementation of double-blind audibility tests is described. The controller is a self-contained unit, designed to provide setup and operational convenience while giving the user maximum sensitivity to detect differences. Standards for response matching and other controls are suggested as well as statistical methods of evaluating data. Test results to date are summarized.

* Presented at the 69th Convention of the Audio Engineering Society, Los Angeles, 1981 May 12-15; revised 1982 February 22.
© 1982 Audio Engineering Society, Inc. 0004-7554/82/050330-09$00.75. J. Audio Eng. Soc., Vol. 30, No. 5, 1982 May.

0 INTRODUCTION

Listening tests used to evaluate audio equipment can seldom be considered scientific tests. For example, extraneous factors which could influence the listener's decision are not eliminated or held constant. A common failing is lack of a double-blind procedure because of its inconvenience.

When scientific tests have been performed, listeners' audibility thresholds have appeared to be poorer by orders of magnitude compared to casual tests. It has been argued that the methods and equipment used in the scientific test have inhibited the listener's discriminatory ability.

A system of double-blind comparison testing, referred to as the A/B/X method, is described. This system consists of:
1) Techniques to maximize discriminatory ability
2) Procedures and standards for maintaining validity
3) Commercially available double-blind test equipment.

It is the intent of this system to make practical and acceptable the widespread use of scientific double-blind listening tests.

After nearly five years of using the A/B/X method, there have been no lasting complaints of inhibited discriminatory ability by listeners. Some experiences and results from these tests are discussed.

1 TESTING

Subjective testing of audio equipment will always be necessary because audio's end result is a subjective experience. Obtaining useful data from subjective tests is usually difficult, expensive, and time consuming. Fortunately the results of subjective testing can often be related to the results of an objective measurement. The more convenient objective test can then be used in further development or evaluation work.

The usefulness of data obtained from subjective testing depends on the resolution (what degree of detail) and the validity (trustworthiness) that it can provide. Obviously poetic descriptions of subtle shadings of distortion are detailed but useless if they are not heard by anyone else. Likewise if a number of people can only agree that the sound is bad when it is voltage clipped at least 50% of the time, the results may be valid, but not detailed enough to be useful.

A scientific test is designed to obtain valid data. The key to such a test is to hold constant all factors which may affect the outcome, except the one being studied. Thus variation in results can be attributed to the one factor being varied. It is widely recognized that subjective tests, in order to be scientific, must be performed double blind [1]. Double blind means that no one in a position to influence the outcome knows how the test factor is being varied.

An important part of scientific testing is the control experiment. This is a test of the test itself in which the factor under study is eliminated. This can establish:
1) Random variations in results due to experimental technique, the "noise floor."
2) A reference point for judging the magnitude of the results. For example, articulation loss is measured before and after a sound-reinforcement system is improved to find the amount of improvement.

The noise floor in subjective testing is frequently so high that achieving meaningful resolution is an exercise in extracting signal (useful data) from noise (test uncertainty). This can be accomplished using statistical analysis. Positive results are inherently deterministic or signal because they are the result of a variation in the factor being studied. The test uncertainty is random or at least unrelated to the studied factor. Statistics can separate the two: the larger the number of experiments, the greater the ability to extract signal from noise.

The question of why one bothers searching for such minuscule audibility differences when such comparisons are not made in real-world usage is often raised. One reason is that the record-reproduction chain typically involves a long series of devices through which the audio must pass. If a similar fault appears in ten devices in series, the compound fault may be clearly audible. Also a number of different "less than audible" cleanups may result in an audible improvement.

A less obvious reason arises from the theory of signal detectability applied to the human observer [2]. The theory describes a human as a mathematically perfect detector, except for a constant efficiency factor. The implication is that a human observer would be able to detect any difference between two signals when given enough time. Since the total time spent in real-world listening to a device is likely to be much greater than the total test time, an enhanced real-world sensitivity is possible.

The most common type of listening test incorporating a control is the A/B test. Two components, A and B, are switched into the audio chain in turn so that a comparison can be made. One component can be considered the factor under study and the other the control. One component is preferably a wire bypass so that any differences heard can be presumed to be distortion added by the real component [3].

In a casual audio salon type of A/B test, a difference between components A and B is almost always heard. When the test is made more scientific, by eliminating extraneous factors such as level mismatch and the subject's knowledge of which is playing, the audible differences begin to disappear. When the test is rigorous and double blind, audible differences between components become scarce [3]-[5].

A question becomes obvious: Does casual testing invent audible differences or does scientific rigor inhibit audibility? Casual and scientific testing are not mutually exclusive. The approach used to produce an answer to the question is to make the scientific rigor as transparent to the listener as possible.

Level and response matching, polarity consistency, elimination of extraneous pops, hums, etc., can be performed beforehand and need be of no concern to the listener. Frequently, however, there are time limits, restricted program material, and other pressures in the usual double-blind comparison test. This is because the experiment is of fixed duration by design, or the administrators create a time pressure by their presence in running the experiment. There is, however, no inherent reason why a scientifically rigorous double-blind comparison cannot be carried out over a period of weeks or more in the listener's home or preferred listening environment. In short, all of the appearances of a casual test may be maintained.

The remainder of this report describes an electronic double-blind A/B comparator which will enable an individual or group to perform rigorous tests in a casual manner. Operating controls and data analysis are designed to maximize the chance of detecting small audible differences. Suggestions are made for the degree of elimination of extraneous factors to preserve validity.

2 REFINEMENTS TO THE A/B TEST

The author's first experience with double-blind audibility testing was as a member of the SMWTMS Audio Club in early 1977. A button was provided which would select at random component A or B. Identifying one of these, the X component, was greatly hampered by not having the known A and B available for reference. This was corrected by using three interlocked pushbuttons, A, B, and X. Once an X was selected, it would remain that particular A or B until it was decided to move on to another random selection.

However, another problem quickly became obvious. There was always an audible relay transition time delay when switching from A to B. When switching from A to X, however, the time delay would be missing if X was really A and present if X was really B. This extraneous cue was removed by inserting a fixed length dropout time when any change was made. The dropout time was selected to be 50 ms which produces a slight consistent click while allowing subjectively instant comparison.

When differences are small, a large number of responses is necessary to achieve a high statistical probability that differences are audible. One way to accomplish this is to use few trials but a large number of listeners. If only one listener is used, however, a large number of trials is necessary. Small numbers greatly penalize the listener for a single mistake, but also increase the pure chance level of a perfect score.

A good minimum number of trials is 12 to 16 for an individual. The present comparator provides up to 100 trials, but this has never been approached in practice. If an attempt is made to accumulate responses by using
multiple test sequences, listeners are likely to start over if they see that they have made errors in the first few trials. This effectively throws out certain data and makes subsequent statistical analysis invalid.

In the present design, obtaining the answer ends the test sequence by disconnecting both components and disabling the A, B, and X buttons. Going back to the test mode enters new random data into the memory for a new test.

The design philosophy was not to attempt to enforce a valid test, but to encourage it. For instance, the listener uses a hand-held control module for switching trials and A, B, and X, but the answer and reprogramming controls are located at the display. Typically, to see the answers (and end the test) the listener must get up and go to the display. This does not prevent short tests or cheating, but it makes the action more obvious. Likewise, the control buttons are easy to operate, yet electrically interlocked to prevent tricking the logic or timing circuitry.

It would have been possible to include a memory for the listener's responses. Automatic readout of score, and even statistical analysis could then be performed by the unit. However, it was decided to provide only those functions that the operator cannot do alone. This decision minimizes cost and prevents obsolescence if a slightly different procedure is used. The comparator can easily be interfaced to a general-purpose microcomputer for data storage or analysis.

3 MAXIMIZING RESOLUTION

The following specific operational and procedural considerations maximize resolution of the A/B/X test:
1) Instant juxtaposition of compared signals
2) Test asks for differences only, not qualitative judgments
3) No time limits
4) Forced decision
5) "Sameness" or "differentness" comparison possible
6) Test is listener controlled
7) Training signals or enhanced difference can be used
8) Exact repetition of signals possible
9) Statistical methods used to analyze group data.

To explain in more detail:
1) Many aspects of sound quality are short lived in memory. A delay of only 50 ms between compared signals allows these qualities to be compared.
2) The simplest possible judgment is: they are the same or they are different. If the listener wishes to make a quality judgment also, this should be done in writing. A good score in picking A or B qualifies the listener's opinion.
3) It can be argued that the stressful conditions of a test do not allow the listener to be sensitive to certain sonic parameters. There is no inherent time limit to either the on time of A and B or the entire test. The listener can simply wait for the proper sensitive attitude to make the choice.
4) The listener knows that A and B are different and that X is either A or B, so there is a correct answer. Lack of a "no difference" option encourages the listener to muster all available auditory powers.
5) Random access A/B/X switching permits comparing A to X for sameness or differentness. Also the transition direction can be reversed (A to X and X to A). The listener may temporarily "wear out" one detection mechanism, but can listen in a different way, hearing another manifestation of the difference.
6) If desired, the test can be performed individually. Switch points and listening times then are customized to maximize the individual's sensitivity.
7) Great improvements in resolution can be achieved if the listener knows what to listen for. Sensitizing tests can use pink noise, sine waves, or pulses as appropriate to hear a difference. Sometimes an artificially enhanced distortion can be produced by reducing feedback or connecting multiple devices in series for distortion buildup. The listener is then more able to hear the difference on music.
8) A tape loop or other means can be used to listen to exactly the same passage of music on A as on B.
9) Statistical analysis is used to detect any shift from pure chance responses. The probability that a particular score from an A/B/X test is due to chance is given exactly by the binomial distribution [6]. Instead of evaluating the formula each time, a simple look-up table is provided with the test form. An expanded version is provided in the owner's manual.

In pair comparison tests, the threshold of hearing is traditionally considered to be at the point of 75% correct responses. This may indeed correspond to a subjective transition from not hearing to hearing, but the sounds must be affecting the listener differently when the correct responses are between 50 and 75%. Our method utilizes this prethreshold information to indicate audibility if it is statistically significant.

As an example of how this increases sensitivity, consider the studies of optimal bandwidth by Plenge et al. [7]. Seven antialiasing filters of differing order, alignment, and frequency were auditioned by a pair-comparison test. The totaled results for all subjects were less than the 75% threshold for every filter. Applying the binomial distribution formula, however, it is extremely probable that four of the seven filters were audible.

Various temporal methods of presenting the sound with the factor under study and the control are in use. They each have advantages in simplicity, repeatability, resolution, and test time. The common audio salon A/B test suffers most from the extraneous variable of criterion dependence. An instruction like "amplifier A has high-speed circuitry. Listen for the faster sound of the highs," is an example of the worst kind of criterial biasing of a test. Generally the less instruction and the more time symmetrical the presentations, the less the criterial influence.

The presentation of a pair of sounds is used extensively in clinical and scientific studies of the human subject
and has also been used to study the sound stimulus. Each sound in the pair is usually presented for a fixed interval and separated by a short fixed interval. Simple instructions and short test times are advantages. Table 1 summarizes the qualities for various pair presentation tests.

Table 1. Comparison of common pair-comparison tests. The A/B/X method achieves the best results but takes more time.

Test               Selection criteria           Criterion      Validity/      Resolution  Simplicity  Time
                                                independence   repeatability
ABX                Is X really A or B?          Excellent      Excellent      Excellent   Fair        Long
A/B (yes-no)       Is signal clean or dirty?    Poor           Fair           Poor        Excellent   Varies
Random pair        Are they same or different?  Fair           Good           Poor        Fair        Short
Random A/B pair*   Which one has distortion?    Excellent      Excellent      Good        Fair        Short

* This widely used test is known as "two-alternative temporal forced choice" (2ATFC).

4 MAINTAINING VALIDITY

Writing down one's answers while performing the test is mandatory with the A/B/X method. Memory just does not serve well enough when scoring time arrives. A form is available for this (Fig. 1). Usually 16 trials is a good maximum for one sitting. Program material used should be noted. Source material which makes certain distortions apparent may be of value for other tests. A brief table of the scores necessary to achieve 95% confidence level is provided. When filled out fully and signed, this form documents a scientific test. If it demonstrates unusual findings, others will attempt to duplicate the experiment.

Fig. 1. Test form. Writing down responses is essential. Sonic evaluation is only meaningful if the listener could hear a difference.

It is generally agreed that levels should be matched in comparison testing. However, assume that a small high-frequency response rise exists in a particular phono pickup cartridge which is to be compared to one known to be flat. A comparison test would likely reveal a difference, and perhaps the flatter one would be preferred. The value of this result may be trivial, however, because it is really a test of frequency response. A minor tone-control high-frequency reduction is now applied to the nonflat cartridge to make the frequency responses nearly the same. The comparison test may now disclose a more subtle difference. The previously nonflat one may even be preferred.

Trivial and easily removed response errors, like level error, should obviously be corrected before testing. In the case of loudspeakers, however, complete removal of response errors would be not only impractical but inappropriate because narrow-band aberrations may be just the factor that is being sought. For low- to high-cost high-fidelity systems, tone controls to octave-band equalizers would be appropriate for response compensation. If a difference can still be heard after compensation with these devices, a preference statement is appropriate even if it is based on frequency response. For high-quality professional equipment, equalization to one-third-octave resolution would be appropriate. As equalization practices evolve, it would be appropriate to revise these criteria.

Fig. 2 shows the degree of level match for various bandwidths and center frequencies necessary to eliminate audible frequency response effects for most music sources. The curves are compiled from fairly limited double-blind testing of a limited number of individuals. The level used was approximately 85 dB unweighted. These curves are in general agreement with the findings of others [2], [4]. In a double-blind test, response differences greater than those allowed by the curves are likely to be responsible for audible differences.

The audibility of absolute phase or polarity is still in contention, so to be on the conservative side, polarity should be maintained [8].

There are an endless number of hums, pops, clicks, mechanical noises, and other extraneous factors which may influence a test. The comparison testing equipment, no matter how well designed, cannot guarantee a valid test. Someone involved must assume the responsibility of eliminating these influences. One check is to switch between components in a normal manner but with no signal present. Many times X can be identified 100% with no sound. External decoupling capacitors or drain resistors sometimes have to be added.

5 HARDWARE

The ABX comparator (Fig. 3) consists of three parts,
the hand-held control module, the logic/display module, and the relay module. Typically the logic/display module would be placed between or on one of the loudspeakers. The control module is operated from the listening position. The relay module would be placed near the components to be switched to enable the shortest audio cable runs.

Fig. 2. Frequency response matching criteria for A and B components. For narrow-band errors and at frequency extremes, matching is less critical. (Plot of level match versus frequency, 20 Hz to 20 kHz, with one curve per bandwidth.)

Fig. 3. ABX logic/display and control modules. Any type of component switching device can be interfaced.

Fig. 4 is a block diagram of the logic/display module. For about 1 s after power is turned on, a pseudorandom-noise generator provides a data input for the memory which is held in the write mode. The noise clock advances the decade counter through all steps, thus filling the memory with random 1s and 0s.

The logic/display module has two outputs for controlling external A and B relays. Interlocked buttons on the control module operate these two relays. The X button assigns the memory's current 1 or 0 to operate relay A for 0 and B for 1. Up and down control module buttons select different trial numbers each of which accesses a different memory location for a random value of X.

When the listener has determined the A or B identity for all Xs that are to make up the test, the answer button is pushed. A flip-flop now holds the circuit in the answer mode, allowing the memory's contents to be read out at each trial number. Both A and B relays are dropped out at this time.

Fig. 4. Block diagram of logic/display and control modules.

This completes the test sequence. To begin another test, the power is turned off and back on to reload the memory.

It is conceivable that a listener might spend days of listening to determine the memory contents, only to have a momentary power failure cause the answers to vanish. Battery backup to keep the memory alive for a few hours is included to prevent this.

The schematic in Fig. 5 shows that the circuit is primarily TTL with discrete transistors performing a few odd jobs.

Fig. 5. Schematic diagram of logic/display and control modules. Output will operate 5-V or 12-V relays directly.

It is important that the relays or other means of accomplishing the switching do not produce extraneous hums or switching noise. The hum or switching noise can be distracting but the major problem is that the character of interference with the two components may be different, thus allowing identification on an irrelevant basis. A simple relay system, Fig. 6, consisting of a pair of double-pole reed relays can accomplish a great deal of common comparison testing, however. The simple relay module can assign an output to two inputs for testing amplifier/loudspeaker systems. It can also select from two inputs for comparing microphone/preamplifiers, cartridge/preamplifiers, tape machines and other sources. The requirements are that common grounding be acceptable, and that the level is reasonably high with
moderately low input and output impedances.

Fig. 6. The simple unbalanced relay module and the more complex balanced relay module. The RM-2 handles signals from phonograph cartridge to loudspeaker level.

Perhaps the most difficult switching problem is silently switching preamplifier/amplifier combinations. The more complex relay module in Fig. 7 accomplishes this task. The low level section is fully balanced and shielded with a minimum of stray capacitance. The relays are sealed and rated down to dry circuit switching as approximated by a phonograph cartridge output. Coil voltages are ramped up and down to avoid feedthrough into the audio. The loudspeaker level section is rated at 30 amperes and introduces less than 10 mΩ of additional resistance. Both high and low sides are switched. All relays are controlled by an eight-phase timing sequence that disconnects and reconnects outputs, inputs, high sides, and low sides in the proper order for a silent exchange of components within 50 ms. The relay assembly is elastically mounted to minimize acoustical noise.

Fig. 7. Block diagram of the balanced relay module. Coil voltages are ramped and sequenced to eliminate clicks. (Timing diagram: drop-out and pull-in sequence for relay groups I, II, and III; time axis 0 to 100 ms.)

Since audio must be routed through additional connectors and relay contacts, there is a possibility that the sound will be audibly affected. This is a frequent criticism of switching comparisons of audio equipment. Surely there must be some way of handling signals in a manner that will not degrade them audibly. The relays and connectors used here are of a very high quality. They have passed stringent listening tests as well as measuring up to the standards of conventional chassis wiring. There is no reason, however, why any type of switching mechanism cannot be controlled and used in the A/B/X scheme.

6 SOME RESULTS AND APPLICATIONS

One's first use of the double-blind comparator can be distressing because one expects to be able to immediately sort between components, and one is not able to do so. This has led many reviewers and designers to dismiss this test as, in some way, masking the sonic identities [9]. With persistence, however, human hearing limitations are accepted and useful testing begins.

Frequency response or level differences are by far the most common elements influencing a comparison. Warmth, air, and many other illusive qualities come and go with 1-dB response changes. If a preference for wide or narrow loudspeaker dispersion is to be determined,
for example, equalization must be used. Failure to do so will undoubtedly result in a preference based on the dominant difference: frequency response.

Most of the author's testing has been done in conjunction with the SMWTMS group [10]. We have not found any two preamplifiers or amplifiers that have sounded different from each other when responses were matched. All units were of medium or higher quality and not operated in clipping. Other precautions were taken, as listed earlier. This result is in agreement with Lipshitz et al. [11], Baxandall [12], and others.

To find out what amount of distortion was audible, a distortion generator was developed (Fig. 8). Nicknamed the "Grunge box," it generates even-order harmonic components that are independent of level. The rms output also remains nearly constant as percent total harmonic distortion is varied. Distortion can be heard on high- or low-level music passages, and there are no level-set problems. No two real-world nonlinearities are the same, and the "Grunge box" is yet another one, but its sound is somehow typical and useful.

The best done so far is 3%, but with carefully selected material (such as a flute solo) 2% or 1% might be possible. 1% is easy with sine waves.

The worst of medium-quality electronics only approaches 0.5% total harmonic distortion midband when driven near clipping. It is not surprising, then, that no differences were heard.

Fig. 8. One channel of calibrated distortion generator. Three percent THD is difficult to detect in music.

"Grunge box" (A1 to A4: 1/4 TL074)
Distortion is independent of level.
Distortion is independent of frequency.
Constant rms out as percent total harmonic distortion is varied.
All even-order harmonic components.

R          Total harmonic distortion (%)
0 Ω        37
1 kΩ       32
2 kΩ       28
5 kΩ       21
10 kΩ      14
20 kΩ      9
50 kΩ      4
100 kΩ     2.2
200 kΩ     1.1
500 kΩ     0.44
1 MΩ       0.22
2 MΩ       0.11
5 MΩ       0.04
10 MΩ      0.02
Open       <0.02

At this time SMWTMS has not been able to detect differences in pickup cartridges. This testing has been limited by its difficulty. It requires careful equalization, identical stamper number pressings in perfect condition, and synchronized turntables.

Differences between loudspeakers are almost always audible. Equalization, however, allows one to concentrate on imaging and other fine points. Frequently accurate equalization is attainable at only one listening location. A 12-bit companded digital delay line was just audible. A 16-bit linear system was not. Audibility of absolute phase (polarity) was statistically confirmed but difficult to hear. More tests should be done on this subject.

The following are some areas where additional subjective research would be useful:
1) Filters: Phase, bandwidth, and cutoff characteristics
2) Noise reduction units: compandors
3) Nonlinear versus linear digital encoding
4) Compression-limiter comparisons
5) Tolerable clipping or overmodulation
6) Need for phase coherency in loudspeakers.

It is recognized that much good research has been done in all these areas, but new products and techniques are constantly being developed. Even if more testing just confirms "what we knew all along," it is useful.

The double-blind comparator can be an effective teaching machine. By listening for differences, one learns to quantify differences. The recording/mixing engineer in particular needs to have hearing that detects a small difference and tells what to do about it. Is that brightness due to high-frequency distortion or frequency response? A spectrum analyzer will not give the answer for music, but the trained ear can. Some other examples:
1) Loudness ratios: how many dB?
2) Spectrum analysis: center frequency, bandwidth
3) Nonlinear distortion: tape saturation, clipping
4) Miking techniques: compare microphones, positions
5) Rooms: hearing direct/reflected sound ratios
6) Honest hearing: is there really a difference?

7 CONCLUSION

Audibility testing, while subjective, can be done scientifically. Methods of maintaining a casual test appearance to the listener while eliminating extraneous factors and providing double-blind control were presented. Generally the results of such testing confirm findings of previous rigidly scientific tests rather than those of casual, noncontrolled tests.

Specific test equipment, procedures for maximizing sensitivity, and conditions for maintaining validity were presented.

8 ACKNOWLEDGMENT

The author wishes to acknowledge his coworkers in this project, A. B. Krueger, D. Carlstrom, B. F. Muller, A. Greenia, and F. James.

9 REFERENCES

[1] N. Agnew and S. Pyke, The Science Game, 2nd ed. (Prentice-Hall, Englewood Cliffs, NJ, 1978).
[2] D. M. Green and J. A. Swets, Signal Detection Theory and Psychophysics (Wiley, New York, 1966).
[3] S. P. Lipshitz and J. Vanderkooy, "The Great Debate: Subjective Evaluation," J. Audio Eng. Soc., vol. 29, pp. 482-491 (1981 July/Aug.).
[4] D. L. Clark, A. B. Krueger, B. Muller, and D. Carlstrom, "Lipshitz/Jung Forum," Audio Amateur, vol. 10, pp. 56-57 (1979 Oct.).
[5] D. A. Spiegel, "Subjective Listening Tests Performed Objectively," Audio General Inc., 1631 Easton Rd., Willow Grove, PA 19090.
[6] P. G. Hoel and R. J. Jessen, Basic Statistics for Business and Economics (Wiley, New York, 1971, pp. 92-98).
[7] G. Plenge, H. Jakubowski, and P. Schöne, "Which Bandwidth Is Necessary for Optimal Sound Transmission?" J. Audio Eng. Soc., vol. 28, pp. 114-119 (1980 Mar.).
[8] G. J. Holt, "Absolute Phase: Fact or Fallacy?" Stereophile, vol. 4 (1980 Oct.).
[9] W. Jung and C. Hollander, "Modifying the Marantz 7C or St. Pooge and the DRIAAGON," Audio Amateur, vol. 12, p. 20 (1981 Jan.).
[10] Southeastern Michigan Woofer and Tweeter Marching Society, 10155 Lincoln Dr., Huntington Woods, MI 48070. (Summary of tests is available.)
[11] S. Lipshitz, J. Vanderkooy, and P. Young, "Letters," Audio Amateur, vol. 10, pp. 53-54 (1979).
[12] P. Baxandall, "Audible Amplifier Distortion Is Not a Mystery," Wireless World (1977 Nov.).
THE AUTHOR

David Clark has operated his own company, DLC Design, since 1977. He performs a wide range of professional audio services including the design of analog and digital circuits, building acoustics, high output loudspeakers, sound-reinforcement systems, film theaters and recording studios.
Mr. Clark's involvement with the ABX Company began in 1979 when he founded it with five other members of the SMWTMS, a Detroit area audio club. The club's adoption of double-blind listening tests was found to be a dramatic aid to component evaluation. The ABX Company was established to refine the apparatus and procedures and make them available to the audio community. His first work in the audio field was for the University of Michigan where he was a student. He provided technical services to a number of sponsored research projects in areas of language teaching and hearing research. He spent more than eight years working for recording studios: first at Motown Record Corporation as a project engineer, then at HDH Sound Studios as chief engineer.
In 1974, Mr. Clark returned to school full-time and in 1977 received a B.S. degree in electrical engineering from Lawrence Institute of Technology. He was one of the founders of the Detroit Section of the Audio Engineering Society and served as its chairman in 1981. Still active in the section, he now holds the office of secretary.
338 J. Audio Eng. Soc., Vol. 30, No. 5, 1982 May

