
3D Audio Systems and Acoustic Environmental Modeling
Nagaraju P.
Dept. of E&C, SSIT

Abstract

An individual always prefers a natural environment to an artificial one; man's affection towards nature is beyond words. A technological touch that recreates a natural environment, creating an atmosphere of natural hearing, is the focus of this paper. Sound heard from existing sound systems fails to provide a natural hearing experience, since sounds from individual sources located at different places in reality lose the sense of their place of presence when reproduced by normal speakers.

3D audio effects are a group of sound effects that attempt to widen the stereo image produced by two loudspeakers or stereo headphones, or to create the illusion of sound sources placed anywhere in three-dimensional space, including behind, above, or below the listener.

There are several types of 3D audio effects:

Those that only widen the stereo image by modifying phase information.

Those that can place sounds outside the stereo basis.

Those that include a complete 3D simulation.

I. Introduction

3D audio effects create the illusion of sound sources placed anywhere in 3D space. One way to achieve 3D audio is to use multiple speakers placed at strategic locations. The sound is faded between the front and rear channels, allowing us to 'place' a sound at any location in the space enclosed by the loudspeakers. A second method achieves 3D audio using a pair of headphones or a regular stereo system. Using binaural synthesis, it is possible to place sounds at any position relative to the listener. Multispeaker systems may be unable to place sounds above or below the listener; true 3D audio systems are able to overcome this problem.
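The front/rear fading mentioned above can be sketched as an equal-power crossfade. This is an illustrative assumption: the paper does not specify a panning law, and the function name is mine.

```python
import math

def front_rear_gains(position):
    """Equal-power crossfade between front and rear channels.

    position: 0.0 = fully front, 1.0 = fully rear.
    Returns (front_gain, rear_gain) whose squares sum to 1,
    so the perceived loudness stays constant as the sound moves.
    """
    angle = position * math.pi / 2
    return math.cos(angle), math.sin(angle)

# A sound halfway between front and rear gets equal gains.
front, rear = front_rear_gains(0.5)
```

Multiplying a source by these two gains before mixing it into the front and rear speaker feeds moves it smoothly through the space between them.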
II. How We Find the Direction of Sound

The ear acts like a complicated tone control mechanism that is direction dependent. We unconsciously use the time delay, amplitude difference, and tonal information at each ear to determine the location of a sound. A delay exists because sound waves reach the ear nearer to the source first and the other ear slightly later, which we sense unconsciously. These indicators are called sound localization 'cues'.

III. How Does a 3D Audio System Work?

A 3D audio system works by mimicking the process of natural hearing, essentially reproducing the sound localization cues at the ear of the listener. This is most easily done by using measured HRTFs (Head-Related Transfer Functions) as a specification for digital audio filters (equalizers). When a sound signal is processed by the digital filters and listened to over headphones, the sound localization cues for each ear are reproduced, and the listener should perceive the sound at the location specified by the HRTFs. This process is called binaural synthesis (binaural signals are defined as the signals at the ears of a listener).

IV. Localization of Sound

It has been said that "the purpose of the ears is to point the eyes." While the ability of the auditory system to localize sound sources is just one component of our perceptual systems, it has high survival value, and living organisms have found many ways to extract directional information from sound. Although perceptual mysteries remain, the major cues have been known for a long time, and careful psychological studies have established how accurately we can make localization judgments.

A. Azimuth Cues

One of the pioneers in spatial hearing research was John Strutt, better known as Lord Rayleigh. About 100 years ago, he developed his so-called Duplex Theory. According to this theory, there are two primary cues for azimuth: Interaural Time Difference (ITD) and Interaural Level Difference (ILD).
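The Duplex Theory cues, ITD and ILD, can be estimated directly from a pair of ear signals. A minimal sketch (the function and the toy signals are illustrative assumptions, not from the paper): the ITD is taken as the cross-correlation lag that best aligns the two signals, and the ILD as their level ratio in decibels.

```python
import numpy as np

def estimate_cues(left, right, sample_rate):
    """Estimate the Duplex Theory cues from a pair of ear signals.

    Returns (itd_seconds, ild_db): itd > 0 means the sound reached
    the right ear first; ild > 0 means it is louder at the right ear.
    """
    # Lag at which the cross-correlation peaks = delay of left vs. right.
    corr = np.correlate(left, right, mode="full")
    lag = np.argmax(corr) - (len(right) - 1)
    itd = lag / sample_rate
    rms = lambda x: np.sqrt(np.mean(x ** 2))
    ild = 20.0 * np.log10(rms(right) / rms(left))
    return itd, ild

# A click that hits the right ear 3 samples before the left one,
# and arrives quieter at the (shadowed) left ear:
fs = 44100
right = np.zeros(32); right[0] = 1.0
left = np.zeros(32); left[3] = 0.5
itd, ild = estimate_cues(left, right, fs)
```

At 44.1 kHz a 3-sample lag is roughly 68 microseconds, well inside the sub-millisecond range of real interaural delays.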

Fig. 1 3D audio synthesis using HRTFs

Fig. 2 Interaural Time Difference (ITD)

Lord Rayleigh had a simple explanation for the ITD. Sound travels at a speed c of about 343 m/s. Consider a sound wave from a distant source that strikes a spherical head of radius a from a direction specified by the azimuth angle θ. Clearly the sound arrives at the right ear before the left, since it has to travel an extra distance to reach the left ear. Dividing that extra distance by the speed of sound, we obtain the following simple (and surprisingly accurate) formula for the interaural time difference:

ITD = (a/c)(θ + sin θ),  for −90° ≤ θ ≤ +90°

B. Elevation Cues

The elevation cues stem from the fact that our outer ear, or pinna, acts like an acoustic antenna. Its resonant cavities amplify some frequencies, and its geometry leads to interference effects that attenuate other frequencies. Moreover, its frequency response is directionally dependent.

Measured frequency responses for two different directions of arrival show that in each case there are two paths from the source to the ear canal: a direct path and a longer path following a reflection from the pinna. At moderately low frequencies, the pinna essentially collects additional sound energy, and the signals from the two paths arrive in phase. However, at high frequencies, the delayed signal is out of phase with the direct signal, and destructive interference occurs.

C. Range Cues

When it comes to localizing a source, we are best at estimating azimuth, next best at estimating elevation, and worst at estimating range. In a similar fashion, the cues for azimuth are quite well understood, the cues for elevation are less well understood, and the cues for range are least well understood. The following cues for range are frequently mentioned:

Loudness

Motion parallax

V. Generation of Head-Related Transfer Functions

The transformation of sound from a point in space to the ear canal can be measured accurately. These measurements are called Head-Related Transfer Functions (HRTFs). The measurements can be made by inserting miniature microphones into the ear canal of a human subject or a manikin. A measurement signal is played by a loudspeaker and recorded by the microphones.

The recorded signals are then processed to derive a pair of HRTFs (for the left and right ears) corresponding to the sound source location. Each HRTF, typically consisting of several hundred numbers, describes the time delay, amplitude, and tonal transformation for the particular sound source location to the left or right ear of the subject. The measurement procedure is repeated for many locations of the sound source relative to the head, resulting in a database of hundreds of HRTFs that describe the sound transformation characteristics of a particular head.

VI. Errors with the Method of Synthesis

3D audio synthesis works extremely well when the listener's own HRTFs are used to synthesize the localization cues. However, measuring HRTFs is a complicated procedure, so 3D audio systems typically use a single set of HRTFs previously measured from a particular human or manikin subject. Localization performance generally suffers when a listener listens to cues synthesized from HRTFs measured from a different head, called non-individualized HRTFs. Human heads are all different sizes and shapes, and there is also great variation in the size and shape of individual pinnae. This means that every individual has a different set of directional cues. The greatest differences are in the tonal transformations at high frequencies caused by the pinna. It is clear that we become accustomed to localizing with our own ears, and thus our localization abilities are diminished when listening through another person's ears.

Our uniqueness as individuals is the source of the greatest limitation of 3D technology. The use of non-individualized HRTFs results in two particular kinds of localization errors commonly seen with 3D audio systems: front/back confusions and elevation errors. A front/back confusion results when the listener perceives the sound to be in front when it should be in back, and vice versa.
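Binaural synthesis with a measured HRTF pair, as described in Sections III and V, amounts to filtering the mono source once per ear. A minimal sketch follows; the two toy impulse responses are illustrative stand-ins (real measured HRIRs run to several hundred taps).

```python
import numpy as np

def binaural_synthesis(mono, hrir_left, hrir_right):
    """Render a mono signal at the position encoded by a pair of
    head-related impulse responses (time-domain HRTFs), by
    convolving the source once for each ear."""
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    return left, right

# Toy HRIRs for a source on the listener's right: the right ear
# hears the sound immediately, the left ear hears it two samples
# later and attenuated by the head shadow.
hrir_r = np.array([1.0, 0.3])
hrir_l = np.array([0.0, 0.0, 0.6, 0.2])
mono = np.array([1.0, 0.5, 0.25])
left, right = binaural_synthesis(mono, hrir_l, hrir_r)
```

Played over headphones, the delay and level difference between the two outputs reproduce the ITD and ILD cues for that source position.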

VII. Acoustic Environmental Modeling

Acoustic environmental modeling refers to combining 3D spatial location cues with distance, motion, and ambience cues to create a complete simulation of an acoustic scene. By simulating the acoustical interactions that occur in the natural world, we can achieve stunningly realistic recreations, above and beyond what is possible with 3D positional control alone.

Main considerations of environmental modeling: Reverberation, Distance Cues, Doppler Motion Effect, Air Absorption.

VIII. Signal Routing

Signal routing is required to combine 3D audio with environmental modeling. One possible routing scheme is shown in the figure below. The signal routing is conceptually similar to the routing seen in multichannel mixing consoles: input signals are individually processed, mixed to a set of shared signal busses, and then the bus signals are processed and output. In the figure, the input signals shown at the top represent the individual object sounds that are to be spatially processed to create the scene. The input signals are monophonic; once processed, they yield 3D audio output.

Fig. 3 Signal routing to generate realistic 3D audio

IX. Virtual Speakers

Using virtual speakers, it is easy to convert conventional stereo sound into immersive 3D sound. This can be done by assigning the left channel input to a virtual left speaker positioned far to the left of the actual left speaker, and similarly assigning the right channel input to a virtual right speaker positioned far to the right of the actual right speaker. A virtual speaker is like a stationary sound object; it is fixed in space and assigned a sound.

Fig. 4 Virtual Speaker input processing
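The virtual-speaker widening above can be sketched as routing each input channel through a binaural renderer fixed at an extreme azimuth. The `render` interface and the level-panning stub standing in for real HRTF filtering are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def widen_stereo(left_in, right_in, render):
    """Stereo widening with virtual speakers: route each input channel
    to a fixed virtual source placed far outside the physical speakers,
    then mix the two binaural renderings into one stereo output.

    render(signal, azimuth_deg) is assumed to return a (left, right)
    pair of processed signals; a real system would apply the HRTF
    pair measured at that azimuth."""
    ll, lr = render(left_in, -90.0)   # virtual speaker far left
    rl, rr = render(right_in, +90.0)  # virtual speaker far right
    return ll + rl, lr + rr

# Stub renderer: simple level panning stands in for HRTF filtering.
def pan(signal, azimuth_deg):
    g = (azimuth_deg + 90.0) / 180.0  # 0 = hard left, 1 = hard right
    return (1.0 - g) * signal, g * signal

out_l, out_r = widen_stereo(np.ones(4), np.zeros(4), pan)
```

Because each virtual speaker is a stationary sound object, the same routing works for any fixed source: assign it a position once, then feed it samples.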

CONCLUSION

The combination of 3D audio and acoustic environmental modeling can be used to create stunningly real auditory experiences. With computing power getting less expensive, these technologies are making their way to the common user. Nearly all movies now come with some form of multichannel encoded audio to make use of multichannel setups; this adds realism and makes for a more immersive experience. With a better understanding of the human hearing system, we will be able to design better and more effective systems for 3D audio.
