
Audio Engineering Society

Conference
Paper
Presented at the Conference on
Audio for Virtual and Augmented Reality
2016 September 30–October 1, Los Angeles, CA, USA
This conference paper was selected based on a submitted abstract and 750-word precis that have been peer reviewed by at
least two qualified anonymous reviewers. The complete manuscript was not peer reviewed. This conference paper has been
reproduced from the author’s advance manuscript without editing, corrections, or consideration by the Review Board. The
AES takes no responsibility for the contents. This paper is available in the AES E-Library (http://www.aes.org/e-lib), all rights
reserved. Reproduction of this paper, or any portion thereof, is not permitted without direct permission from the Journal of the
Audio Engineering Society.

Adjustment of the Direct-to-Reverberant-Energy-Ratio to Reach Externalization within a Binaural Synthesis System
Stephan Werner1, Florian Klein1, and Thomas Sporer2
1 Electronic Media Technology Group, Technische Universität Ilmenau, Ilmenau, Germany
2 Fraunhofer Institute for Digital Media Technology, Ilmenau, Germany
Correspondence should be addressed to Stephan Werner (stephan.werner@tu-ilmenau.de)

ABSTRACT
The contribution presents an experiment to use the room acoustic parameter Direct-to-Reverberant-Energy-Ratio
(DRR) to solve the room divergence effect in binaural listening via headphones. Perceived externalization of
auditory events is decreased if acoustic divergence between the listening room and the resynthesized room
occurs. The DRR is used to push the synthesis towards the listening room to increase externalization. The
listeners adjust the DRR of the synthesis on the expected DRR of the listening room until congruence between
synthesis and listening room is perceived. The results show that the DRR is a suitable acoustic parameter. The
listeners are able to reliably adjust the DRR of the listening room only by their expectations and no explicit
external reference. A subsequent quality test shows that the congruent DRR conditions have only a minor effect
on the increase of externalization using divergent room conditions.

1 Introduction
The perception of the quality of a spatial audio reproduction can be hypothesized as the creation of quality of experience (QoE) [1]. This process includes a comparison and judgement between desired and perceived quality features [2, 3]. This model can be extended with the assumption that the multimedia system with its context of use influences the build-up of QoE [4]. One relevant context-dependent quality parameter is, for example, the acoustic divergence between the synthesized room and the listening room. Perceived externalization of auditory events is decreased if divergence occurs; this is described as the room divergence effect [5, 6, 7, 8]. Feedback mechanisms to the audio reproduction system can be applied to adapt the system, e.g. a binaural synthesis system, to the context of use, for example to achieve a plausible auditory illusion. The basic idea of the presented work is to adjust the room acoustic parameter Direct-to-Reverberant-Energy-Ratio (DRR) to
Werner, Klein, and Sporer DRR and Externalization in Binaural Synthesis

achieve congruence between the listening room and the resynthesized room and to increase externalization. The DRR is investigated as one possible acoustic parameter which can be used for adaptation of the system to reach high QoE. Former studies show that the adjustment of the DRR can increase the perceived congruence between synthesized and listening room [9]. That experiment also shows a similarity between the values of the inter-quartile distances (IQDs) and reported just-noticeable differences (JNDs) for the perception of DRR. It is conjectured from that work that the adjustment of the DRR is a valid method to adapt a binaural synthesis to the context parameter listening room, because of the high inter-rater reliability visible in the small IQDs. This contribution repeats the DRR adjustment with more test persons and evaluates the effect on perceived externalization of auditory events.

2 Binaural Synthesis System
The binaural synthesis system consists of a headphone system using binaural recordings of individual and artificial binaural room impulse responses (BRIRs; the artificial ones are measured with a KEMAR head and torso simulator) for the selected rooms, sound sources, and positions. A non-dynamic system with no head tracking is used to prevent dynamic cues from resolving perceptional ambiguities like front-back confusions and in-head localization. A customizable binaural system is used to increase the fidelity of the simulation compared to real loudspeakers [11]. The usage of individual BRIRs reduces confusion errors within and outside the cone of confusion [10]. Different rooms have been chosen to include different room acoustic properties like reverberation and source-receiver distances. Reverberation encourages the perception of externalization of an auditory illusion and the impression of distance [12]. The headphones are equalized using individual headphone transfer functions (HPTFs) if individual BRIRs are used. HPTFs from the head-and-torso simulator are used if artificial BRIRs are used. In-ear microphones are used to measure individual BRIRs and HPTFs at the entrance of the blocked ear canal of each test person [13]. The microphones are not removed between the BRIR and HPTF measurements. The inverse of an HPTF is calculated by a least-squares method with minimum phase inversion [14]. A BK211 extra-aural headphone [15], which fulfills the requirements for open headphones [13], is used for playback. The distortion of the sound incidence from the real sources in the room to the listener's ears is minimized, especially for an extra-aural headphone. This allows a test design with listening to real loudspeakers and to the resynthesis of those loudspeakers via headphones.

3 Direct-to-Reverberant-Energy-Ratio
Sound waves are emitted in multiple directions from a sound source depending on frequency and directivity of the source. The expansion of these waves is likely to be disturbed by boundary surfaces of the room. A superposition of sound waves is measurable and perceivable at the listening point. Only the amount of sound energy which reaches the listening point undisturbed is defined as direct sound. The sound waves reflected at the room surfaces are called reverberant sound. This sound is delayed and reaches the listening point via several paths. In a binaural context we define the direct sound as sound which reaches the ears up to 1.5 ms after the undisturbed sound. This time span includes monaural and binaural cues resulting from reflections and diffractions of the outer ear, head, and torso but not from the room. The reverberant sound is defined as sound reaching the listening point after the direct sound. The energy ratio between direct and reverberant sound is named DRR. The DRR is strongly dependent on the distance between source and receiver. In addition, the DRR is a room specific acoustic parameter for a fixed source to receiver distance. The DRR can be calculated as:

$$\mathrm{DRR} = 10 \log_{10} \left( \frac{\int_{0}^{T} h^{2}(t)\,dt}{\int_{T}^{\infty} h^{2}(t)\,dt} \right) \qquad (1)$$

with h(t) as impulse response between two points in an enclosure and T=1.5 ms to separate direct sound and room reflections.

Several experiments have been conducted in the past to determine the JND of DRR perception at the 50%-discrimination point of the psychometric function. JND values between 2 dB and 20 dB are reported depending on test design and DRR magnitude. A minimum of 2 dB JND at 0 dB DRR rising up to 20 dB JND at +/- 20 dB DRR is reported in [16] for loudspeaker listening. For virtual acoustics, JNDs of 5 dB to 6 dB are assessed in [17] and JNDs of 2.4 dB to 8.7 dB are mentioned in [18].

4 Perceptual Evaluation
This section describes the apparatus and the methodologies of the experiment. The experiment consists of an adjustment session, a quality rating session in two rooms with different room acoustic characteristics, and a preceding pre-test session. Twenty-two test persons, inexperienced in terms of binaural perceptional experiments and with a mean age of 29 years, participate in the listening test. A speech shaped noise signal, a male speech signal

AES Conference on Audio for Virtual and Augmented Reality, Los Angeles, CA, USA.
2016 September 30–October 1,
Page 2 of 7

(Dutch speaker; no test person speaks Dutch), and a short part of a saxophone play are used as audio signals with a duration of four, five, and nine seconds.

a. Quality Feature of Investigation
The focus of the experiment lies on the quality feature externalization as one feature to describe the perception of an auditory event. Externalization describes the perception of the position of an auditory event outside or inside the head of the listener [5, 7, 19]. This feature is crucial to reach a plausible spatial auditory illusion with binaural headphone systems. The dichotomous quality feature externalization is counted as the index of the ratings on a three-point scale. In addition to the characteristics “in-head” and “outside the head”, a transition point “outside but close to the head” is used. This scale is motivated by the individual mapping to a scale of the percept of externalization for every test person. Only the scale point “outside the head” is counted as an externalized auditory event in further analysis. We define the perception of an event very close to the head or ears as in-head-localized or non-externalized. We suppose that this conservative approach maps the ratings in a reliable way with regard to the resynthesis of the real loudspeakers with their positions (and distances) in the room. The goal is to minimize the confusion between distance perception and externalization for farther distances (e.g. > 2 m). An externalization index is calculated as the number of externalized stimuli divided by the number of all stimuli in the test.

b. Objects of Investigations
Six discrete sound source positions with a distance of 2.2 m to the listening point are used within the experiment. Genelec 1030A loudspeakers are used as sound sources to measure the BRIRs. The source directions (0°, 30°, 60°, 180°, 240°, 300° with counter-clockwise orientation) are chosen to include a surrounding setup and to include directions with assumed front-back and cone-of-confusion localization errors. Former studies reveal a negative correlation between the amount of localization errors like front-back and cone-of-confusion errors and the perceived externalization of an auditory event [20]. The sound sources are resynthesized using individual BRIRs and HPTFs of the single test persons for two rooms and source positions. Additionally, BRIRs and HPTFs of a KEMAR head-and-torso simulator (45BA) are recorded. Both types of BRIRs are used to create test stimuli for the binaural synthesis.

Acoustic divergence between the listening room and the resynthesized room is the main object of investigation. A listening lab (naming: HL; Rec. ITU-R BS.1116-1, V=179 m³, RT60=0.3 s) and an empty seminar room (naming: SR; V=182 m³, RT60=2.0 s [21]) are used as recording and test rooms. Playback with convergent and divergent combinations of the synthesized room and the listening room is available.

Within the adjustment session (see section 4c.) the test persons are able to adjust the DRR of the synthesized scene until perceptional congruence between the synthesis and the real listening room appears. The internal reference of the test person is used as reference for this adjustment. The adjustment is done in the two rooms with resynthesis of the two rooms. The adjustment of the DRR of the synthesis leads to further combinations of listening and synthesized room. The names for the room conditions follow the nomenclature “synthesized room (DRR adjustment on room)_listening room”. Figure 1 gives an overview of the different room combinations with the congruence of the DRR between the DRR adjusted synthesized room and the listening room.

[Figure 1: DRR congruence (0 or +1) per room condition. Congruence 0: SR(HL)_SR, HL(SR)_HL, SR(SR)_HL, HL(HL)_SR. Congruence +1: SR(HL)_HL, HL(SR)_SR, SR(SR)_SR, HL(HL)_HL.]
Figure 1. Congruence of the Direct-to-Reverberant-Energy-Ratio (DRR) between the DRR adjusted synthesized room and the listening room for the different room combinations; room name in brackets indicates the DRR adjustment on this room.

Table 1 gives an overview of the different room combinations concerning the convergence or divergence between the synthesized room and listening room and the original measured or DRR adjusted BRIRs. Note that the conditions “SR(SR)” and “HL(HL)” are the recorded BRIRs from the respective room.

BRIR             Convergence              Divergence
Original         SR(SR)_SR, HL(HL)_HL     SR(SR)_HL, HL(HL)_SR
Adjusted (DRR)   SR(HL)_HL, HL(SR)_SR     SR(HL)_SR, HL(SR)_HL

Table 1. Combinations of synthesized room and listening room concerning convergence or divergence and original measured or adjusted DRR.
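The condition names used in Figure 1 and Table 1 can be decoded mechanically. The following is a minimal sketch of that decoding, assuming only the two room labels SR and HL from this paper; the names `RoomCondition` and `parse_condition` are hypothetical helpers for illustration and are not part of the test software.

```python
import re
from dataclasses import dataclass

# Labels follow "synthesized room (DRR adjustment on room)_listening room",
# e.g. "SR(HL)_HL". Only the rooms SR and HL are assumed to exist.
_LABEL = re.compile(r"^(SR|HL)\((SR|HL)\)_(SR|HL)$")


@dataclass(frozen=True)
class RoomCondition:
    synthesized: str  # room whose BRIRs are resynthesized
    drr_target: str   # room the DRR of the synthesis is adjusted on
    listening: str    # room in which the test person listens

    @property
    def is_original_brir(self) -> bool:
        # "SR(SR)" and "HL(HL)" are the recorded, unscaled BRIRs.
        return self.synthesized == self.drr_target

    @property
    def room_convergent(self) -> bool:
        # Synthesized room and listening room are the same room.
        return self.synthesized == self.listening

    @property
    def drr_congruence(self) -> int:
        # +1 if the DRR of the synthesis matches the listening room, else 0.
        return 1 if self.drr_target == self.listening else 0


def parse_condition(label: str) -> RoomCondition:
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"not a valid room condition label: {label!r}")
    return RoomCondition(*match.groups())


for name in ("SR(SR)_SR", "SR(HL)_SR", "HL(HL)_SR", "HL(SR)_SR"):
    cond = parse_condition(name)
    print(name, cond.room_convergent, cond.drr_congruence)
```

Keeping room convergence and DRR congruence as separate properties makes conditions such as “HL(SR)_SR” explicit: the rooms diverge while the DRR is congruent with the listening room.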


c. Test Procedure
The experiment consists of an adjustment session, a quality rating session in two rooms with different room acoustic characteristics, and a preceding pre-test session.

The pre-test session is included in the individual BRIR recording session for the seminar room (SR). The motivation for the pre-test is to check the binaural synthesis system, the loudspeaker system, and the used anchor signals. Dummy head BRIRs of the seminar room (SR) are used for binaural resynthesis of the six source positions. The same seminar room is used as listening room. Anchor signals with low spatial quality are created by using room impulse responses from the single loudspeaker positions to a microphone with omnidirectional characteristic at the listening position. The signals include room information but no spectral cues resulting from the pinna or head. The different sound directions are created by stereo-intensity panning between a left and a right channel.

The DRR adjustment session gives the test persons the possibility to adjust the room acoustic parameter DRR of the synthesized scene until perceptional congruence between the synthesis and the real listening room appears. The internal reference of the test person is used as reference for this adjustment. Listening to the real loudspeakers in the listening room precedes this session to support the build-up of the internal reference [22]. The DRR is changed by an amplification or damping of the reverberant part relative to the direct sound of a measured BRIR. The change of the BRIR is conducted 3 ms after the direct sound, avoiding effects on the head-related part. The DRRs are calculated for 70 steps: 46 damping steps at normalized amplitudes from zero to one, 23 amplification steps at normalized amplitudes from one to 1.3, and the original DRR. Figure 2 shows the different DRRs for the two rooms, using the 240° direction and the right ear of all test persons as an example. The full set of the DRR-scaled BRIRs is also available for download [23].

Figure 2. Adjustable steps of the DRR in dB as median and 0.25/0.75 quantile from the BRIR measurements of all test persons' right ear and 240° source direction; left=listening lab, right=seminar room; step 47 is the measured BRIR.

DRR step 1 closely corresponds to an anechoic condition (DRR > 100 dB). DRR step 47 is the measured ratio and DRR step 70 is the most reverberant condition used in the test. The overall sound pressure level of the playback can be adjusted by the test person individually.

Within the quality session the test persons have to rate perceived externalization of the auditory scenes in both rooms. The presented scenes are the convergent and divergent room conditions and the DRR-adjusted room conditions (see table 1). The test design is a single stimulus test with randomized order of the test stimuli for each test person. The rating interface is shown in Fig. 3. The perceived angle of incidence is rated by choosing the respective discrete direction on a top-down view. Externalization is rated by choosing the midpoint, inner circle, or outer circle on the rating sheet. The grading and definition of the quality feature externalization is motivated by Hartmann and Wittenberg [19]. The following definitions are used in the test:
a) midpoint: “The auditory event is entirely in my head and very diffuse.”
b) inner circle (in head): “The auditory event is entirely in my head and easy to locate.”
c) middle circle: “The auditory event is external but it is next to my ears or head.”
d) outer circle: “The auditory event is external and easy to locate.”
e) outer cloud: “The auditory event is external and very diffuse.”
A replay of the current stimulus is possible for the test participants in both experiments.

Figure 3. Graphical user interface (GUI) for ratings in the listening test.

5 Ratings
The analysis of the ratings includes the DRR adjustment and the externalization ratings. The localization ratings, which are also available, are not analysed in this contribution for reasons of clarity. Furthermore, the ratings for the three audio signals are combined in the presented analysis. No significant differences between the signals are found for the DRR and externalization indices (significance level p<.05; Fisher's exact test for externalization and Wilcoxon signed-rank test for DRR).
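For reference, Eq. (1) of section 3 and the DRR scaling described in section 4c can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the processing used for the published BRIR set [23]: the direct-sound arrival is taken as the strongest sample of the impulse response, the reverberant part is scaled with a hard switch (no crossfade), and the function names `direct_sound_index`, `drr_db`, and `scale_reverberant_part` are hypothetical.

```python
import math


def direct_sound_index(h):
    """Index of the strongest sample, taken here as the direct-sound arrival.

    This onset detection is an assumption made for the sketch.
    """
    return max(range(len(h)), key=lambda i: abs(h[i]))


def drr_db(h, fs, split_ms=1.5):
    """Discrete version of Eq. (1): energy up to T over energy after T, in dB.

    T is counted from the direct-sound arrival (T = 1.5 ms in section 3).
    """
    split = direct_sound_index(h) + round(split_ms * 1e-3 * fs)
    direct = sum(x * x for x in h[:split])
    reverberant = sum(x * x for x in h[split:])
    if reverberant == 0.0:
        return math.inf  # no reverberant energy at all (anechoic limit)
    return 10.0 * math.log10(direct / reverberant)


def scale_reverberant_part(h, fs, gain, start_ms=3.0):
    """Damp (gain < 1) or amplify (gain > 1) everything later than
    start_ms after the direct sound; the head-related part stays untouched."""
    start = direct_sound_index(h) + round(start_ms * 1e-3 * fs)
    return list(h[:start]) + [gain * x for x in h[start:]]


# Synthetic impulse response for illustration only (not measured data):
# a unit direct-sound peak followed by a late, flat reverberant tail.
fs = 48000
h = [0.0] * 10 + [1.0] + [0.0] * 200 + [0.1] * 100
for gain in (0.5, 1.0, 1.3):
    scaled = scale_reverberant_part(h, fs, gain)
    print(f"gain {gain}: DRR = {drr_db(scaled, fs):.2f} dB")
```

When all reverberant energy lies after the 3 ms mark, a gain g on the reverberant part shifts the DRR by -20 log10(g) dB, so the normalized amplitudes between zero and 1.3 map to DRRs from the near-anechoic case down to about 2.3 dB below the measured value.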


The ratings from the pre-test show that the real loudspeakers in the seminar room are rated with externalization indices close to one (see figure 4). However, not all loudspeaker test stimuli are rated as externalized, which is especially visible in the front and back directions. The binaurally resynthesized loudspeakers using artificial head BRIRs tend to be rated with lower externalization indices than the real loudspeakers. Significant differences (p<.05) are only found for the 0° and 30° directions. The lowest indices are also visible for front and back directions. The non-binaural anchor signal is rated with externalization indices slightly above 0.2. The pre-test shows that the used signals and systems are valid for the further experiment.

[Figure 4: externalization index (0.0 to 1.0) over direction (0°, 30°, 60°, 180°, 240°, 300°) for the conditions loudspeaker, binaural, and anchor.]
Figure 4. Externalization indices from the pre-test with 95% binomial confidence intervals; resynthesis of a seminar room in the same room using dummy head BRIRs.

a. DRR Adjustment
Figure 5 shows the adjusted DRR for the resynthesis of the seminar room or listening lab in the seminar room or listening lab. A dashed line indicates the DRR of the listening room. A coincidence between the median of the DRR adjustment and the DRR of the listening room is visible for the congruent room conditions “HL in HL” and “SR in SR”. Higher DRRs of about 6.8 dB (mean over all directions) are adjusted for the divergent room condition “SR in HL”. The test persons chose a less reverberant resynthesis of the seminar room than the listening room. Slightly higher DRRs are also chosen for the divergent room condition “HL in SR”. This effect can be explained with the DRR steps available in the test. Figure 2 shows that the most reverberant possible DRR (step 70) of the listening lab is approx. 3 dB higher than the original measured DRR of the seminar room (step 53 in figure 2 right).

[Figure 5: four panels (top: “SR in HL”, “HL in HL”; bottom: “SR in SR”, “HL in SR”); absolute DRR in dB (0 to 40) over direction (0°, 30°, 60°, 180°, 240°, 300°).]
Figure 5. Adjusted DRR in dB of the test persons as boxplots with 95% conf. int. as notch; dashed line indicates the DRR of the listening room; circles indicate outliers; top: listening lab (HL) as listening room, bottom: seminar room (SR) as listening room.

Nevertheless, the analysis of the adjustment shows that the test persons are able to adjust the DRR of the divergent and convergent room conditions to the DRR of the listening room. The accuracy of the adjustment, represented by the IQDs, is within the range of the JND of DRR perception (see section 3). The mean IQDs over all directions are: 9.3 dB for “SR in HL”, 4.0 dB for “HL in HL”, 4.7 dB for “SR in SR”, and 2.9 dB for “HL in SR”.

b. Externalization rating
Figures 6 and 7 show the ratings of the quality test for the different room conditions and adjusted room conditions. The externalization indices and the 95% binomial confidence intervals are shown for the different directions. Significances and effect sizes (odds ratios) are calculated using Fisher's exact test. The room conditions in question are highlighted with respective asterisks and numbers. Figure 6 shows the ratings for the more reverberant seminar room as listening room while figure 7 shows the listening lab.
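The quantities used in this analysis, the externalization index of section 4a and the comparison of two conditions with Fisher's exact test and an odds ratio, can be sketched as follows. This is an illustrative sketch only: the rating labels, the function names, and the choice of a two-sided test and of the sample odds ratio are assumptions, since the paper does not state these details. An unbounded odds ratio is returned as infinity; the "(max)" label in the figures presumably marks that case.

```python
from math import comb, inf, nan


def externalization_index(ratings):
    """Share of stimuli rated 'outside' on the three-point scale.

    Labels 'in-head', 'close', 'outside' are hypothetical names for
    "in-head", "outside but close to the head", and "outside the head".
    """
    return sum(r == "outside" for r in ratings) / len(ratings)


def odds_ratio(a, b, c, d):
    """Sample odds ratio of the 2x2 table [[a, b], [c, d]].

    Rows: two room conditions; columns: externalized / not externalized.
    """
    if b * c == 0:
        return inf if a * d > 0 else nan
    return (a * d) / (b * c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test: sum of the probabilities of all tables
    with the same margins that are at most as likely as the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2
    lo, hi = max(0, col1 - row2), min(row1, col1)
    probs = {
        x: comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)
        for x in range(lo, hi + 1)
    }
    limit = probs[a] * (1 + 1e-7)  # small tolerance for floating-point ties
    return sum(p for p in probs.values() if p <= limit)


# Made-up counts for illustration (not data from the listening test):
# condition A: 18 of 22 stimuli externalized, condition B: 9 of 22.
p_value = fisher_exact_two_sided(18, 4, 9, 13)
print(f"p = {p_value:.4f}, odds ratio = {odds_ratio(18, 4, 9, 13):.1f}")
```

Counting only the "outside" ratings as externalized mirrors the conservative definition in section 4a, and it reduces each comparison to a 2x2 table, which is the form Fisher's exact test operates on.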


[Figure 6: six panels (0°, 30°, 60°, 180°, 240°, 300°); externalization index (0.0 to 1.0) over room condition (SR(SR)_SR, SR(HL)_SR, HL(HL)_SR, HL(SR)_SR). Annotations per panel in the order shown, significance with odds ratio in brackets: 0°: *** (142.3), 0.75 (0.8); 30°: *** (max), 0.15 (1.6); 60°: *** (max), 0.10 (1.9); 180°: *** (20.6), 0.13 (2.2); 240°: *** (19.0), 0.07 (2.5); 300°: *** (142.0), 0.22 (1.4).]
Figure 6. Externalization indices and 95% bin. conf. intervals for the seminar room (SR) as listening room; dashed lines guessing range; *** sig. difference p<.001 or the given p-value using Fisher test; number in brackets effect size as odds ratio.

A DRR divergence between the resynthesized scene and the listening room (“SR(HL)_SR”) leads to a significant decrease (p<.001) of the externalization indices compared to the room convergent and DRR congruent condition “SR(SR)_SR”. This observation is in line with literature on the influence of reverberation on externalization [10, 11]. The DRR congruent but room divergent condition “HL(SR)_SR” does not lead to a significant increase of externalization compared to “HL(HL)_SR”. Only a tendency is visible (with p>.1) that DRR congruence leads to an increase of perceived externalization.

[Figure 7: six panels (0°, 30°, 60°, 180°, 240°, 300°); externalization index (0.0 to 1.0) over room condition (SR(SR)_HL, SR(HL)_HL, HL(HL)_HL, HL(SR)_HL). Annotations per panel, significance with odds ratio in brackets: 0°: *** (7.5); 30°: *** (max); 60°: 0.10 (5.5); 180°: *** (6.3); 240°: 0.11 (2.6); 300°: *** (9.5).]
Figure 7. Externalization indices and 95% bin. conf. intervals for the listening lab (HL) as listening room; dashed lines guessing range; *** sig. difference p<.001 or the given p-value using Fisher test; number in brackets effect size as odds ratio.

The DRR congruent but room divergent condition “SR(HL)_HL” leads to lower externalization indices compared to the condition “SR(SR)_HL” (significant at p<.001 or at least with p<.11). This observation also conforms to the influence of reverberation on externalization. An increase of the amount of reverberation for the resynthesis of the listening lab in the listening lab (“HL(SR)_HL”) does not significantly (p<.2) increase or decrease the externalization indices compared to the congruent condition “HL(HL)_HL”.

6 Conclusion
The contribution presents a proposal of a context-dependent adaptive binaural resynthesis system of real rooms. The acoustic parameter DRR is used to adapt the binaural playback of the resynthesized scene to the listening room.

The adjustment of the DRR shows that the listeners are reliably able to change the DRR of the resynthesis until perceptional congruence between the synthesis and the listening room occurs. No explicit external reference but the expectation (internal reference) of the individual listener is used for this adjustment.

The analysis of the externalization ratings confirms earlier research that the convergent room conditions are rated with high externalization compared to the divergent room conditions [5, 6, 7, 8]. It is also shown that the externalization tends to increase if the divergent room condition is DRR-adapted to the listening room compared to the original divergent room condition. This effect is observed with minor effect size and without reaching significance (p>.05). It is concluded that the DRR is a valid room acoustic parameter in spatial listening but it has only a minor influence on the elimination of the room divergence effect. The change of the DRR has no influence on the time structure of the single room reflections and patterns. It is conjectured that the divergence between the expected room and the synthesized room also depends on the reflection patterns, as described in the literature for the Clifton effect [24] and derivatives. Further studies are proposed to investigate the room divergence effect and to adapt the synthesis to the context of use. Several interpolation methods in time and/or frequency domain are investigated.

7 Acknowledgement
We thank all test participants for the participation in the tests and interest in research. Furthermore, we thank the master course Advanced Psychoacoustics at TU Ilmenau. This work is supported by grants of the Deutsche Forschungsgemeinschaft (DFG Grant BR 1333/14-1) and by Thüringer Aufbaubank (2015FGR0090) and the European Social Fund.


References
[1] “Qualinet White Paper on Definitions of Quality of Experience (2012)”, European Network on Quality of Experience in Multimedia Systems and Services (COST Action IC 1003), Patrick Le Callet, Sebastian Möller and Andrew Perkis, eds., Lausanne, Switzerland, Version 1.2, March 2013.
[2] Jekosch, U.: “Voice and Speech Quality Perception – Assessment and Evaluation”, Springer Series in Signal and Communication Technology, Berlin, 2005.
[3] Raake, A.: “Speech Quality of VoIP – Assessment and Prediction”, John Wiley & Sons, Chichester, West Sussex, 2006.
[4] Werner, S., Klein, F., and Brandenburg, K.: “Influence of Spatial Complexity and Room Acoustic Disparity on Perception of Quality Features using a Binaural Synthesis System”, in Proc. of 7th International Workshop on Quality of Multimedia Experience (QoMEX), Greece, 2015.
[5] Plenge, G.: “[The problem of in-head localization] Über das Problem der Im-Kopf-Lokalisation”, Acustica, 26, Nr. 5, pp. 241–252, 1972.
[6] Werner, S. and Siegel, A.: “Effects of binaural auralization via headphones on the perception of acoustic scenes”, Proc. of 3rd International Symposium on Auditory and Audiological Research (ISAAR), pp. 215-222, Denmark, 2011.
[7] Udesen, J., Piechowiak, T., Gran, F.: “Vision Affects Sound Externalization”, 55th AES Conf. Spatial Audio, Helsinki, Finland, 2014.
[8] Werner, S., Klein, F., Mayenfels, T., and Brandenburg, K.: “A Summary on Acoustic Room Divergence and its Effect on Externalization of Auditory Events”, in Proc. of 8th International Conference on Quality of Multimedia Experience (QoMEX), Portugal, 2016.
[9] Werner, S. and Liebetrau, J.: “Adjustment of Direct-to-Reverberant-Energy-Ratio and the Just-Noticable-Difference”, 6th International Workshop on Quality of Multimedia Experience (QoMEX), Singapore, 2014.
[10] Møller, H., Sørensen, M. F., Jensen, C. B., and Hammershøi, D.: “Binaural technique: Do we need individual recordings?”, J. Audio Eng. Soc., 44, pp. 451-469, 1996.
[11] Begault, D. R., Wenzel, E. M.: “Direct comparison of the impact of head tracking, reverberation, and individualized head-related transfer functions on the spatial perception of a virtual speech source”, J. Audio Eng. Soc., 49, pp. 904-916, 2001.
[12] Laws, P.: “[Auditory distance perception and the problem of ‘in-head localization’ of sound images] Entfernungshören und das Problem der Im-Kopf-Lokalisiertheit von Hörereignissen”, Acustica, 29, pp. 243-259 (NASA Technical Translation TT-20833), 1973.
[13] Møller, H.: “Fundamentals of Binaural Technology”, Applied Acoustics, 36, 1992.
[14] Schärer, Z., Lindau, A.: “Evaluation of equalisation methods for binaural signals”, 126th AES Conv., preprint 7721, 2009.
[15] Erbes, V. et al.: “An extraaural headphone system for optimized binaural reproduction”, in Proc. of 38. DAGA, Germany, pp. 313-314, 2012.
[16] Reichard, W., Schmidt, W.: “[The perceivable steps of room impression listening to music] Die hörbaren Stufen des Raumeindrucks bei Musik”, Acustica, 17, pp. 175-179, 1966.
[17] Zahorik, P.: “Direct-to-reverberant energy ratio sensitivity”, J. Acoust. Soc. Am., 112, pp. 2110-2117, 2002.
[18] Larsen, E. et al.: “On the minimum audible difference in direct-to-reverberant energy ratio”, J. Acoust. Soc. Am., 124, pp. 450-461, 2008.
[19] Hartmann, W. M., Wittenberg, A.: “On the externalization of sound images”, J. Acoust. Soc. Am., 99, pp. 3678-3688, 1996.
[20] Werner, S., Rekitt, M., and Klein, F.: “Distribution of Quadrant Errors in Auditory Localization using a Binaural Headphone system”, in Proc. of 41st annual convention for acoustics, DAGA, Nürnberg, Germany, 2015.
[21] Akustik Bureau Dresden: “Messbericht ABD 22049-02/15”, report, 2015.
[22] Klein, F., Werner, S., Mayenfels, T.: “Influences of training on externalization in binaural synthesis in situations of room divergence”, to be published in Journal of the Audio Engineering Society, 2016.
[23] Werner, S.: “DRR-scaled Individual BRIRs”, http://dx.doi.org/10.5281/zenodo.61072, 2016.
[24] Clifton, R. K., Freyman, R. L., Litovsky, R. Y., and McCall, D.: “Listeners’ expectations can raise or lower echo threshold”, J. Acoust. Soc. Am., 95, pp. 1525-1533, 1994.
reverberation, and individualized head-
