
Large-scale auralised sound localisation experiment

Enzo De Sena (1), Neofytos Kaplanis (2,3), Patrick A. Naylor (4), and Toon van Waterschoot (1)

(1) KU Leuven   (2) Bang & Olufsen   (3) Aalborg University   (4) Imperial College London

AES 60th International Conference, 3 Feb. 2016


Outline

1 Motivation
2 Experiment
3 Data analysis
4 Summary


Objective

- The objective of this study is to investigate binaural sound localisation performance in an informal setting and with little training
- There is a vast literature on localisation, but mostly with few, highly trained participants
- Study conditions closer to those of consumer applications
- Design an experiment that is both suitable and entertaining

Participants

- Conducted at the Summer Science Exhibition of the Royal Society (July 2015)
- ≈ 15,000 visitors over the course of a week
- 893 people participated in the test

Apparatus

- Rotating platform
- Auralised experiment
- Bang & Olufsen BeoPlay H6 headphones
- iPad mounted at eye level
- Rotation measured using the iPad gyroscope
- Participants were asked to look at the iPad

Procedure

- Visitors were introduced to the apparatus and the task
- Self-paced and self-controlled via a custom GUI on the iPad
- Choice of 3 conditions: anechoic, reverberant, and close to wall
- Task: rotate the platform until the sound source appeared to be in front

Sound stimuli

- Two anechoic sound samples:
  - Speech
  - African percussion
- Two HRTF datasets:
  - KEMAR MIT
  - CIPIC (the measurement with anthropometric features closest to the average)
- Conditions:
  - Anechoic: source in free field at the same height as the listener
  - Reverberant: room simulated in real time (two source directions)

    Condition      Width Lx   Length Ly   Height Lz   T60      DRR
    Typical room   7.35 m     5.33 m     2.5 m       0.30 s   1.0 dB
    High ceiling   7.35 m     5.33 m     8.0 m       0.45 s   4.5 dB
    High reverb.   7.35 m     5.33 m     2.5 m       0.45 s   0.2 dB

- Initial look direction randomised

Source of anechoic samples: Bang and Olufsen, Music for Archimedes, CD B&O 101, 1992.
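
As a rough, hypothetical cross-check of the room table (not a computation from the paper), Sabine's formula T60 ≈ 0.161·V / (S·ᾱ) links the tabulated dimensions and reverberation times to an implied average absorption coefficient; the function name below is an assumption:

```python
# Back-of-the-envelope Sabine estimate (illustrative only, not the
# parameterisation used in the study):
#   T60 ~= 0.161 * V / (S * alpha)  ->  alpha ~= 0.161 * V / (S * T60)

def implied_absorption(Lx, Ly, Lz, t60):
    """Average absorption coefficient implied by room size and T60 (Sabine)."""
    V = Lx * Ly * Lz                           # volume [m^3]
    S = 2.0 * (Lx * Ly + Lx * Lz + Ly * Lz)    # total wall surface [m^2]
    return 0.161 * V / (S * t60)

# "Typical room" condition: 7.35 x 5.33 x 2.5 m, T60 = 0.30 s
alpha = implied_absorption(7.35, 5.33, 2.5, 0.30)   # ~0.37
```
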

Scattering Delay Network (SDN)

- Network of delay lines connected at scattering points on the walls
- First-order reflections are correct in delay, amplitude and HRTF weighting
- Coarser approximation of higher-order reflections
- Explicit control of model parameters (e.g. room size, absorption, etc.)

De Sena et al., "Efficient Synthesis of Room Acoustics via Scattering Delay Networks," IEEE TrASLP, 2015.
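
The "correct first-order" bullet can be made concrete with a minimal, hypothetical mono sketch: the direct path plus the six first-order image sources, each rendered as a pure delay with 1/r attenuation and a flat wall gain. The function name, sampling rate, and wall gain `beta` are assumptions; HRTF weighting and the recursive network that approximates higher orders are omitted:

```python
import numpy as np

def first_order_ir(src, lis, room, fs=44100, c=343.0, beta=0.8):
    """Impulse response with the direct path and the six first-order
    image sources: correct delays and 1/r gains (the part SDN renders
    exactly); higher orders and HRTFs are intentionally left out."""
    paths = [(np.array(src, float), 1.0)]          # direct path, unit gain
    for axis in range(3):                          # mirror across each wall pair
        for wall in (0.0, room[axis]):
            img = np.array(src, float)
            img[axis] = 2.0 * wall - src[axis]     # image-source position
            paths.append((img, beta))              # one bounce -> wall gain beta
    dists = [np.linalg.norm(p - np.array(lis, float)) for p, _ in paths]
    h = np.zeros(int(np.ceil(fs * max(dists) / c)) + 1)
    for (pos, g), d in zip(paths, dists):
        h[int(round(fs * d / c))] += g / d         # delay + spherical spreading
    return h

h = first_order_ir(src=(2.0, 2.0, 1.5), lis=(5.0, 3.0, 1.5),
                   room=(7.35, 5.33, 2.5))         # "typical room" footprint
```
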

Summary

[Block diagram: anechoic input and gyroscope data feed the SDN room simulator; first-order reflections 1…N are combined into left-ear and right-ear signals]

Anechoic condition

[Histogram: localisation error [deg], from -180 to 180, vs. frequency [percent]]

- Bi-modal distribution (front and back image) plus a uniform component (errors)
- People did remarkably well:
  - 22% of errors < ±2.5 deg
  - 52% of errors < ±7.5 deg
  - 12% of errors > ±152.5 deg, i.e. front/back reversals
- Slight rightward bias
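
The percentages above are fractions of wrapped angular errors falling in given bins; a small sketch on synthetic responses shaped like the described distribution (the mixture, sample sizes and names are all assumptions, not the study's data):

```python
import numpy as np

def wrap_deg(err):
    """Wrap angular errors to [-180, 180) degrees."""
    return (np.asarray(err, float) + 180.0) % 360.0 - 180.0

rng = np.random.default_rng(0)
# Synthetic stand-in: a sharp frontal mode, a back mode (reversals), and
# a uniform error floor -- the bi-modal + uniform shape described above.
err = wrap_deg(np.concatenate([
    rng.normal(0.0, 5.0, 700),        # frontal image
    rng.normal(180.0, 5.0, 100),      # front/back reversals
    rng.uniform(-180.0, 180.0, 100),  # guesses
]))
within_7_5 = np.mean(np.abs(err) < 7.5)    # cf. the 52% figure
reversals = np.mean(np.abs(err) > 152.5)   # cf. the 12% figure
```
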

Mean analysis

- Mean error is -2.0 deg, i.e. to the left
- But the mean of a linear variable is misleading for angles
  - E.g. a distribution that is uniform beyond ±90 deg is concentrated at the back, yet has zero mean
- Common approach in the literature: keep only "genuine errors" < ±45 deg
- The mean is then +1.84 deg (binomial test: p < 0.001)
- Percussion: +1.08 deg; speech: +2.57 deg (Mann-Whitney: p = 0.031)
- KEMAR: +2.36 deg; CIPIC: +1.35 deg (Mann-Whitney: p = 0.004)
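
The ±90-deg example above can be verified numerically: the circular mean (the angle of the mean resultant vector) reports "behind" exactly where the linear mean reports "in front". A sketch on synthetic angles (names are assumptions):

```python
import numpy as np

def circular_mean_deg(angles_deg):
    """Angle of the mean resultant vector, in degrees."""
    z = np.exp(1j * np.radians(np.asarray(angles_deg, float)))
    return np.degrees(np.angle(z.mean()))

# Errors uniform beyond +/-90 deg: every sample points backwards.
rng = np.random.default_rng(1)
err = rng.uniform(90.0, 270.0, 100_000)
err = (err + 180.0) % 360.0 - 180.0   # wrap to [-180, 180)

linear = err.mean()                   # ~0 deg: misleadingly "in front"
circular = circular_mean_deg(err)     # ~ +/-180 deg: correctly "behind"
```
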

[Bar chart: mean error [deg], from -8 to +8, per condition (typical room, high ceiling, high reverb.) for source positions close to the wall and far from the wall]


Some more data analysis

- New results since paper submission (data available online)
- Not just final angles, but full trajectories etc.
- Most front/back confusions occur when starting around the back
  [Scatter plot: front/back confusions vs. initial angle [deg]]
- Time to complete the test is an indication of localisation uncertainty:
  - Percussion: 22.6 s; speech: 28.8 s (t-test: p < 0.001)
  - Anechoic: 28.0 s; typical room: 25.0 s; high ceiling: 27.2 s; high reverb.: 26.6 s
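
The percussion-vs-speech timing comparison is a two-sample test on completion times; below is a minimal sketch of Welch's t statistic on synthetic times whose means match the slide (the spread, sample sizes and names are assumptions, not the study's data):

```python
import numpy as np

def welch_t(a, b):
    """Welch's t statistic for two independent samples (unequal variances)."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    se2 = a.var(ddof=1) / a.size + b.var(ddof=1) / b.size
    return (a.mean() - b.mean()) / np.sqrt(se2)

rng = np.random.default_rng(2)
# Synthetic completion times centred on the slide's means (spread assumed)
percussion = rng.normal(22.6, 8.0, 450)
speech = rng.normal(28.8, 8.0, 450)
t = welch_t(percussion, speech)   # large negative t: percussion is faster
```
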

Summary

- Studied localisation performance in an informal setting and with little training
- Designed a suitable and engaging experiment
- Subjects performed remarkably well (52% of errors < ±7.5 deg)
- The MIT database biases responses more to the right than CIPIC
- Speech biases responses more to the right than percussion
- Most front/back confusions occur when starting around the back
- Decisions were faster for percussion than for speech
- Decisions were faster for the typical room than for the anechoic condition

- Please visit the demo tomorrow!


Acknowledgments

The authors would like to thank Niccolò Antonello, Naveen Desiraju, Clement Doire, Christine Evers, Sina Hafezi, Mathieu Hu, Hamza Javed, Ante Jukić, Adam Kuklasinski, Alastair Moore, Pablo Peso, Richard Stanton, Giacomo Vairetti, and Costas Yiallourides for helping carry out the experiment; Benjamin Cauchi, Clement Doire, and Mathieu Hu for helping set up the experiment; Ray Thompson for designing and building the structure of the rotating platform; and all the subjects for taking part in the experiment.

THANKS!

Questions?
