Psychoacoustics Tutorial

Structure of the Human Ear
The ear is the organ of the auditory and vestibular systems. Its main function
is to receive sound energy and convert it into chemical and electrical impulses that
travel to the brain. Fig.1 shows the structure of the human ear schematically. The ear
can be divided into 3 main parts, namely:
- Outer ear;
- Middle ear;
- Inner ear.
Fig.1. Structure of the Human Ear

Source: http://www.dasp.uni-wuppertal.de/index.php?id=57
Outer Ear
The outer ear collects sound energy and transmits it through the outer ear canal to
the ear drum. The outer ear canal has two advantages: first, it protects the ear drum
and the middle ear from damage; second, it enables the inner ear to be positioned very
close to the brain, which shortens the nerves and thus the travel time of the electrical
impulses in the nerve.
The outer ear canal exerts a strong influence on the frequency response of the hearing
organ. It acts like a pipe, open at the outside and closed by the ear drum, with a length
of about 2 cm, which corresponds to a quarter of the wavelength of frequencies near
4 kHz. The outer ear canal is therefore responsible for the high sensitivity of our
hearing organ in this frequency range, indicated by the dip of the threshold in quiet
around 4 kHz. This high sensitivity, however, is also the reason why hearing is so
easily damaged in the region around 4 kHz.
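The quarter-wavelength argument above can be checked with a few lines of Python. This is only a sketch: the speed of sound (343 m/s, air at about 20 °C) and the function name are assumptions that do not appear in the tutorial, and the canal is idealized as a uniform pipe closed at one end.

```python
def quarter_wave_resonance_hz(length_m: float, speed_of_sound_m_s: float = 343.0) -> float:
    """Lowest resonance of a pipe closed at one end: f = c / (4 * L)."""
    if length_m <= 0:
        raise ValueError("length_m must be positive")
    return speed_of_sound_m_s / (4.0 * length_m)


# Ear canal length of about 2 cm, as stated in the text.
canal_resonance_hz = quarter_wave_resonance_hz(0.02)
print(f"Estimated ear canal resonance: {canal_resonance_hz / 1000:.1f} kHz")
```

With these assumed values the estimate lands a little above 4 kHz, which agrees with the frequency range named in the text.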
Middle Ear
The sound reaching the outer ear consists of oscillations of air particles. The inner
ear contains fluids that surround the sensory cells. To excite these cells, it is
necessary to produce oscillations in the fluids. The oscillations of air particles, with
small forces but large displacements, have to be converted into motions of the
water-like fluids, with large forces but small displacements. This conversion is the
task of the middle ear.
The middle ear is closed off from its surroundings by the eardrum on one side and the
Eustachian tube on the other. The Eustachian tube, which is connected to the upper
throat region, opens briefly when swallowing. External influences such as mountain
climbing, riding an elevator, flying, or diving can produce a large increase or decrease
in pressure, which shifts the resting position of the eardrum and reduces hearing
sensitivity. Swallowing equalizes the air pressure in the middle ear with that of the
environment.
Inner Ear
The inner ear (cochlea), shown in Fig.2, is shaped like a snail and is embedded in the
extremely hard temporal bone. The cochlea is filled with a watery liquid, the perilymph,
which moves in response to the vibrations coming from the middle ear via the oval
window; this motion propagates to the apex (the top or center of the spiral). As the
fluid moves, the cochlear partition (basilar membrane and organ of Corti) moves;
thousands of hair cells sense this motion and convert it into chemical and electrical
signals, which our brain interprets as sound.
Fig.2. Inner Ear

Source: http://bio1152.nicerweb.com/Locked/media/ch50/helmwr.gif
Fig.3 Example of how the human ear works
Source: http://147.162.36.50/cochlea/cochleapages/overview/history.htm
Sound Preprocessing in the Peripheral System
From the point of view of sound preprocessing, the cochlea works as a bank of
bandpass filters: different parts of the cochlea respond to different frequencies
(Fig.4). Low frequencies produce oscillations of the basilar membrane near the apex,
and high frequencies near the base (oval window).
Fig.4 Frequency selectivity of the basilar membrane

Source: http://cochlearimplanthelp.com/journey/choosing-a-cochlear-implant/electrodes-and-channels/
Fig.5 shows how the traveling waves evoked by the components of a complex tone, which
are presented simultaneously, propagate along the basilar membrane from the oval
window toward the apex.
Fig.5 Behaviors of the traveling waves along the basilar membrane

Source: Yuli You, “Audio Coding: Theory and Applications”
Each tone causes a different region of the basilar membrane to vibrate: low
frequencies mostly affect the part of the membrane near the apex, and high frequencies
the part near the oval window.
The amplitude of a traveling wave gradually increases from the oval window in the
direction of the apex, reaches a maximum, and fades quite rapidly beyond this
maximum.
Thus the inner ear performs the very important task of frequency separation: the sound
energy of a complex tone is distributed along the basilar membrane, with each
frequency component concentrated at a different place.
Frequency and Human Hearing Range
Physically, sound can be described by the frequency of the sound wave; the sound
pressure, which is the force of the sound per unit surface area; and the sound
intensity, which is the sound power per unit area. The lowest sound pressure that can
be heard is approximately 2*10^-5 Pa. It is therefore convenient to express sound
pressure on a logarithmic decibel scale referred to this lowest audible pressure, so
that 2*10^-5 Pa corresponds to 0 dB.
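The decibel scale described above can be sketched in Python as follows. The function and constant names are illustrative assumptions; the factor 20 (rather than 10) is used because the level is computed from pressure, and intensity is proportional to pressure squared.

```python
import math

# Reference pressure from the text: the lowest audible sound pressure, 2*10^-5 Pa.
P_REF_PA = 2e-5


def spl_db(pressure_pa: float) -> float:
    """Sound pressure level in dB relative to P_REF_PA."""
    if pressure_pa <= 0:
        raise ValueError("pressure_pa must be positive")
    return 20.0 * math.log10(pressure_pa / P_REF_PA)


def pressure_from_spl(level_db: float) -> float:
    """Inverse conversion: sound pressure in Pa for a given level in dB."""
    return P_REF_PA * 10.0 ** (level_db / 20.0)


for p in (2e-5, 2e-3, 2.0):
    print(f"{p:g} Pa -> {spl_db(p):.0f} dB")
```

On this scale the reference pressure itself maps to 0 dB, and every tenfold increase in pressure adds 20 dB.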
Fig.6 shows the human hearing area, a plane in which audible sounds are displayed in
terms of frequency versus sound pressure level and sound intensity.
Fig.6 Hearing area

Source: Zwicker&Fastl “Psychoacoustics Facts and Models”
The human ear can perceive sounds from about 16 Hz up to 20 kHz. The actual hearing
area lies between the threshold in quiet (the limit towards low levels) and the threshold
of pain (the limit towards high levels).
The threshold in quiet is the sound pressure level, as a function of frequency, below
which the human ear cannot perceive a sound. The second important border is the
limit of damage risk: sound above this level can damage our ears. For example, in
factories where the noise reaches 5 dB below this limit, workers have to use ear plugs
to protect their ears.
The third main threshold is the threshold of pain.
Between these thresholds lies the hearing area, which contains normal speech,
music and other everyday sounds.
As mentioned earlier, the length of the ear canal corresponds to a quarter of the
wavelength of frequencies near 4 kHz. The dotted curve marks exactly the region that
is easily damaged because of the very high sensitivity of the ear at these frequencies.
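The dip of the threshold in quiet can be illustrated numerically. The sketch below uses a widely quoted analytic approximation attributed to Terhardt; this formula is an assumption brought in for illustration and is not part of the tutorial, and it describes an average young listener rather than any individual.

```python
import math


def threshold_in_quiet_db(freq_hz: float) -> float:
    """Approximate threshold in quiet (dB SPL), Terhardt-style formula (assumed here)."""
    f_khz = freq_hz / 1000.0
    return (
        3.64 * f_khz ** -0.8
        - 6.5 * math.exp(-0.6 * (f_khz - 3.3) ** 2)
        + 1e-3 * f_khz ** 4
    )


# Scan the audible range in 50 Hz steps and find the most sensitive frequency.
frequencies_hz = range(100, 16001, 50)
most_sensitive_hz = min(frequencies_hz, key=threshold_in_quiet_db)
print(f"Lowest threshold near {most_sensitive_hz} Hz")
```

Under this approximation the lowest threshold falls between 3 and 4 kHz, in line with the ear canal resonance described in the text.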
Hearing ability declines with age: the threshold of hearing rises, so that, for
example, a whisper we could hear at 20 years old may be inaudible at 60-70. Moreover,
this threshold is not absolute; it varies from person to person and with age.
Terminology
Loudness is a measure of auditory sensation and is measured in sone. It
depends on sound pressure and frequency. A loudness of 1 sone corresponds to the
loudness of a pure sinusoidal tone of 1 kHz with a sound pressure of 2 mPa (40 dB).
The loudness level of a sound is the sound pressure level of the 1 kHz tone that is as
loud as that sound; it is measured in phon.
Fig.7 shows the standardized family of equal-loudness curves for pure tones in a free
sound field. Each curve gives the sound pressure level as a function of frequency at a
fixed loudness level. In other words, you can determine the loudness level of a pure
tone if you know its frequency and sound pressure level.
You can test your hearing abilities at http://newt.phys.unsw.edu.au/jw/hearing.html
Fig.7 The family of loudness curves

Source: Suzuki et al., “Precise and Full-range Determination of Two-dimensional Equal Loudness Contours”, 2003.
A widely used "rule of thumb" is that the intensity of a sound must be increased by a
factor of ten (10 dB) for it to be perceived as twice as loud. In terms of the loudness
level (LN, in phon) and the loudness (N, in sone), and with 1 sone defined at 40 phon,
this can be written as:
$N = 2^{(L_N - 40)/10}$
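The rule of thumb can be turned into a small conversion between loudness level and loudness. The sketch assumes that 1 sone corresponds to 40 phon (a 1 kHz tone at 2 mPa, as defined in the Terminology section) and that loudness doubles for every 10 phon; this relation is usually considered valid only above roughly 40 phon. The function names are illustrative.

```python
import math


def phon_to_sone(loudness_level_phon: float) -> float:
    """Loudness N in sone from loudness level LN in phon (doubling per 10 phon)."""
    return 2.0 ** ((loudness_level_phon - 40.0) / 10.0)


def sone_to_phon(loudness_sone: float) -> float:
    """Inverse conversion: loudness level in phon from loudness in sone."""
    if loudness_sone <= 0:
        raise ValueError("loudness_sone must be positive")
    return 40.0 + 10.0 * math.log2(loudness_sone)


for level in (40, 50, 60, 70):
    print(f"{level} phon -> {phon_to_sone(level):g} sone")
```

Each step of 10 phon in the loop doubles the loudness in sone, which is exactly the "factor of ten in intensity" rule stated above.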
Critical Bands
As mentioned above, different places on the basilar membrane respond to different
sound frequencies. A critical band is a range of frequencies that excite the same
region of the basilar membrane and are therefore processed together by the hearing
system.
Source: http://cochlearimplanthelp.com/journey/choosing-a-cochlear-implant/electrodes-and-channels/
Looking at the critical bandwidth as a function of frequency (Fig.8), we can see that
at low frequencies the critical bandwidth stays constant (about 100 Hz) up to 500 Hz.
Above 500 Hz the critical bandwidth starts to grow.
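The behaviour of the critical bandwidth can be written as a simple piecewise rule. The sketch below combines the 100 Hz value given here with the approximation of 0.2*f above 500 Hz that the tutorial uses later for narrow-band noise; the function name is an assumption, and the rule is only a rough approximation of the measured curve.

```python
def critical_bandwidth_hz(center_freq_hz: float) -> float:
    """Rough critical bandwidth: 100 Hz up to 500 Hz, then 20 % of the center frequency."""
    if center_freq_hz <= 0:
        raise ValueError("center_freq_hz must be positive")
    if center_freq_hz <= 500.0:
        return 100.0
    return 0.2 * center_freq_hz


for f in (100, 250, 500, 1000, 4000):
    print(f"{f} Hz -> about {critical_bandwidth_hz(f):.0f} Hz wide")
```

The two branches meet at 500 Hz, where 0.2*f is also 100 Hz, so the rule has no jump at the transition.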
Fig.8 Frequency Grouping Bandwidth

Source: Zwicker & Fastl “Psychoacoustics Facts and Models”, p.159
The Bark scale (Fig.8) is a non-linear frequency scale that models the resolution of
the human hearing system. It ranges from 1 to 24 Bark, and one Bark corresponds to
one critical band.
...
Fig.8 The Bark Scale
Source: https://en.wikipedia.org/wiki/Bark_scale
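A conversion from frequency to the Bark scale can be sketched with the analytic approximation commonly attributed to Zwicker and Terhardt. The formula is an assumption added for illustration; the tutorial itself only describes the scale in words and in the figure.

```python
import math


def hz_to_bark(freq_hz: float) -> float:
    """Approximate critical-band rate in Bark for a frequency in Hz."""
    if freq_hz < 0:
        raise ValueError("freq_hz must be non-negative")
    return 13.0 * math.atan(0.00076 * freq_hz) + 3.5 * math.atan((freq_hz / 7500.0) ** 2)


for f in (100, 500, 1000, 4000, 15500):
    print(f"{f} Hz -> {hz_to_bark(f):.1f} Bark")
```

The mapping is nearly linear at low frequencies and compresses high frequencies, so the whole audible range fits into about 24 Bark.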
The position of tones relative to the critical bands affects their combined loudness.
If two tones lie within one critical band, the total perceived loudness is only slightly
greater than that of a single tone. If the tones lie below the threshold in quiet,
however, their intensities add up and the sum can be audible.
If the distance between two tones is larger than a critical bandwidth and they have
equal loudness, the total is about twice as loud.
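The difference between the two cases can be made concrete with a simplified calculation. The sketch rests on several assumptions that are not in the tutorial: intensities add within one critical band, loudness values in sone add across widely separated bands, the tones are near 1 kHz so that level in dB equals loudness level in phon, and loudness doubles per 10 phon above 40 phon.

```python
import math


def combined_level_db(levels_db: list[float]) -> float:
    """Level of several incoherent sounds whose intensities add."""
    return 10.0 * math.log10(sum(10.0 ** (level / 10.0) for level in levels_db))


def phon_to_sone(loudness_level_phon: float) -> float:
    """Rule of thumb: 1 sone at 40 phon, doubling for every 10 phon."""
    return 2.0 ** ((loudness_level_phon - 40.0) / 10.0)


single_tone_sone = phon_to_sone(60.0)
# Same critical band: intensities add first (+3 dB), then loudness is evaluated.
same_band_sone = phon_to_sone(combined_level_db([60.0, 60.0]))
# Widely separated bands: the two loudness contributions are simply summed.
separate_bands_sone = 2.0 * phon_to_sone(60.0)

print(f"single tone:    {single_tone_sone:.2f} sone")
print(f"same band:      {same_band_sone:.2f} sone")
print(f"separate bands: {separate_bands_sone:.2f} sone")
```

In this idealized model two equal tones in one critical band are only modestly louder than one tone, while two tones in separate bands are twice as loud.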
Source: http://hyperphysics.phy-astr.gsu.edu/hbase/sound/loud.html
Masking
There are limits to how well our ears can differentiate between sounds occupying
similar frequencies. Masking occurs when two or more sounds occupy exactly the same
frequencies or, more precisely, frequencies within one critical band.
The signal that cannot be heard is called the maskee; the signal that is perceived is
called the masker.
There are two ways to overcome masking: wait until the masker disappears, or increase
the power of the masked signal. The level above which the masked signal becomes
audible again is called the masking threshold.
In Fig.9 a strong 1 kHz signal acts as the masker. Since the three other signals lie
below the masking threshold, they cannot be heard.
Fig.9 The Masking effect

Source: http://www.nptel.ac.in/courses/117105083/pdf/ssg_m9l28.pdf
Several types of Masking

Masking of Pure Tones by Broad-Band Noise
Broad-band noise here means white noise in the range from 20 Hz up to 20 kHz. Since
white noise covers the whole audible frequency range, it can mask any sound.
Fig.10 depicts masking thresholds for pure tones masked by broad-band noise of
different density levels. For example, with white noise at a density level of 10 dB,
a 1 kHz pure tone has to exceed 30 dB to be perceived.
Fig.10 Masking thresholds for pure tones masked by broad band noise of different density levels lWN
Fig: Zwicker, Fastl, “Psychoacoustics - Facts and Models”, 2nd Edition, 1999.
Masking of Pure Tones by Noise - Narrow-Band Noise (Fig.11)

Narrow-band noise is noise with a bandwidth equal to or smaller than the critical
bandwidth (about 100 Hz below 500 Hz and about 0.2*f above 500 Hz).
Fig.11 shows the masking thresholds of pure tones masked by narrow-band noise at
centre frequencies of 0.25, 1, and 4 kHz. The level of each masking noise is 60 dB and
the corresponding bandwidths of the noises are 100, 160, and 700 Hz, respectively.
The maximum of the masked threshold tends to be lower for higher centre frequencies
of the masker, although the level of the narrow-band masker is 60 dB at all centre
frequencies.
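The noise bandwidths quoted above (100, 160, and 700 Hz) are close to the critical bandwidths at the three centre frequencies. This can be checked with the analytic approximation commonly attributed to Zwicker and Terhardt, which is an assumption added here for illustration and is smoother than the rough 100 Hz / 0.2*f rule.

```python
def critical_bandwidth_zt_hz(center_freq_hz: float) -> float:
    """Approximate critical bandwidth in Hz (Zwicker-Terhardt style formula, assumed)."""
    if center_freq_hz < 0:
        raise ValueError("center_freq_hz must be non-negative")
    f_khz = center_freq_hz / 1000.0
    return 25.0 + 75.0 * (1.0 + 1.4 * f_khz ** 2) ** 0.69


# Centre frequencies and noise bandwidths as quoted in the text for Fig.11.
quoted_bandwidths_hz = {250: 100, 1000: 160, 4000: 700}
for centre_hz, quoted_hz in quoted_bandwidths_hz.items():
    estimate_hz = critical_bandwidth_zt_hz(centre_hz)
    print(f"{centre_hz} Hz: quoted {quoted_hz} Hz, approximation {estimate_hz:.0f} Hz")
```

The approximation stays within a few percent of the quoted values, which suggests the maskers in Fig.11 are each about one critical band wide.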
Fig.11 Thresholds of pure tones masked by narrow-band noise for different centre frequencies
Fig: Zwicker, Fastl, “Psychoacoustics - Facts and Models”, 2nd Edition, 1999.
Pure tones can also be masked by low-pass or high-pass noise, by other pure tones,
and by complex tones. More information can be found in Zwicker & Fastl,
“Psychoacoustics Facts and Models”.
Temporal masking effects (Fig.12) are a special case of masking.
Forward masking (post-masking) lasts for about 100 ms after the masker is
switched off.
Backward masking (pre-masking) occurs about 1-5 ms before the masker is
switched on.
Since temporal masking does not extend far in time, simultaneous masking is the
more important phenomenon for audio signal processing.
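The time windows described above can be sketched as a small classifier. The window lengths (5 ms before onset, 100 ms after offset) are taken from the approximate figures in the text; the function and constant names are hypothetical, and the sketch only says in which window a probe sound falls, not whether it is actually masked, since that also depends on the levels involved.

```python
PRE_MASKING_S = 0.005   # upper end of the "about 1-5 ms" stated in the text
POST_MASKING_S = 0.100  # "about 100 ms" stated in the text


def masking_region(probe_time_s: float, masker_on_s: float, masker_off_s: float) -> str:
    """Name the masking window in which a short probe sound falls."""
    if masker_off_s < masker_on_s:
        raise ValueError("masker_off_s must not be earlier than masker_on_s")
    if masker_on_s <= probe_time_s <= masker_off_s:
        return "simultaneous"
    if masker_on_s - PRE_MASKING_S <= probe_time_s < masker_on_s:
        return "pre-masking"
    if masker_off_s < probe_time_s <= masker_off_s + POST_MASKING_S:
        return "post-masking"
    return "none"


# Masker is on from 0.2 s to 1.0 s; probe sounds at several instants.
for t in (0.100, 0.197, 0.500, 1.050, 1.500):
    print(f"probe at {t:.3f} s: {masking_region(t, 0.2, 1.0)}")
```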
Fig.11: Masking thresholds for pure tones masked by broad band noise of different density levels lWN
Source: Zwicker&Fastl “Psychoacoustics Facts and Models”
