
Signal Processing – Modelling an Auditory System

An auditory system mimics the behaviour of a biological cochlea found in humans and other
mammals. The system converts a 1D discrete-time audio signal to a 2D time-frequency signal
called an auditory spectrogram. From this spectrogram, audio information can be extracted
as shown in Table 1. Its applications include hearing aids, speech and musical information
retrieval, audio multimedia systems, and brain modelling.

The objective of this project is to model an auditory system using MATLAB. Note that this
project is an individual task.

Index   Audio Information   Responsible For
1       Intensity           Sound loudness.
2       Direction           Indicates where a sound is coming from.
3       Pitch               Difference between musical notes and also male and female voices.
4       Timbre              Sound colour and shape indicating from which specific source a sound is coming.
Table 1: Information extractable from an auditory spectrogram.

To convert a one-dimensional (1D) sound signal into a two-dimensional (2D) time-frequency
representation, a cochlear filterbank is used. A cochlear filterbank comprises multiple
gammatone filters either in parallel or cascaded. The bandwidth of each gammatone filter
increases with increasing frequency so that a high centre frequency filter has a higher
bandwidth than a filter with low centre frequency as shown in Figure 1(a).
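
The bandwidth trend can be illustrated numerically. The sketch below assumes the equivalent rectangular bandwidth (ERB) formula of Glasberg and Moore, which this handout does not state, so the sample code may use different constants. The centre frequencies are illustrative values only.

```matlab
% Assumption: ERB formula of Glasberg and Moore; not given in this handout.
fc  = [100 500 1000 4000];            % illustrative centre frequencies in Hz
erb = 24.7 * (4.37 * fc / 1000 + 1);  % equivalent rectangular bandwidth in Hz

% Print one line per centre frequency to show the bandwidth growing with fc.
for k = 1:numel(fc)
    fprintf('fc = %5d Hz -> ERB = %6.1f Hz\n', fc(k), erb(k));
end
```

Under this assumption the bandwidth grows roughly linearly with centre frequency, which matches the widening curves in Figure 1(a).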

A gammatone filter behaves like a bandpass filter. Each gammatone filter is tuned to a specific
centre frequency so that it responds mainly to frequencies close to that centre frequency. When
the input signal contains energy close to the centre frequency of the filter, the filter outputs a
signal that resonates at its centre frequency. Hence, a filterbank is a bank of gammatone filters
whose centre frequencies are tuned from low to high to cover the entire spectrum of a sound signal.
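
The bandpass behaviour of a single filter can be checked with a short numerical experiment. This is a minimal sketch, assuming the commonly used gammatone impulse response form and an ERB-based bandwidth. Neither is defined in this handout, so the equation in the course slides takes precedence. The sampling rate, centre frequency, and order are illustrative.

```matlab
% Assumptions (not from this handout): the standard gammatone form
%   g(t) = t^(n-1) * exp(-2*pi*b*t) * cos(2*pi*fc*t)
% with bandwidth b = 1.019 * ERB(fc). All values are illustrative.
sr    = 16000;                                  % sampling rate in Hz
fc    = 1000;                                   % centre frequency in Hz
order = 4;                                      % filter order
b     = 1.019 * 24.7 * (4.37 * fc / 1000 + 1);  % bandwidth in Hz

t = (0:round(0.05 * sr) - 1) / sr;              % 50 ms of time samples
g = t.^(order - 1) .* exp(-2 * pi * b * t) .* cos(2 * pi * fc * t);
g = g / max(abs(g));                            % normalise peak amplitude to 1

% Locate the peak of the magnitude spectrum; it should sit near fc.
nfft    = 4096;
G       = abs(fft(g, nfft));
f       = (0:nfft / 2) * sr / nfft;             % frequency axis up to Nyquist
[~, ix] = max(G(1:nfft / 2 + 1));
fprintf('Peak gain at %.1f Hz (fc = %d Hz)\n', f(ix), fc);
```

Changing `fc` moves the peak of the gain response, which is how a bank of such filters covers the spectrum from low to high frequencies.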

Figure 1: Increasing bandwidth with increasing centre frequency in the gain response of gammatone filters. (a) x-axis is
linearly scaled where intervals between frequencies are the same; (b) x-axis is logarithmically-scaled where intervals between
frequencies are nonlinear.

Ideally, each filter reacts to a different range of frequencies in the input signal, so the
filterbank outputs multiple signals, one per filter. These signals are then half-wave rectified,
where all negative values are set to 0 and only positive values are maintained. They can be
visualised as a 2D image known as an auditory spectrogram, as shown in Figure 2.
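
Half-wave rectification as described above can be sketched in one line. The sample values are made up for illustration; the same expression works element-wise on a matrix of filterbank outputs.

```matlab
x = [0.5 -0.2 0.0 -1.0 0.8];   % illustrative filter output samples
y = max(x, 0);                 % half-wave rectification: negatives become 0
disp(y);
```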

An alternative way of obtaining a spectrogram is to calculate the short-time Fourier transform
(STFT) of a sound signal.
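
The STFT can be sketched with base MATLAB only, by windowing the signal frame by frame and taking an FFT of each frame. This is an illustrative sketch and not the method used in the sample code. The test tone, window length, hop size, and FFT length are all assumed values. The Signal Processing Toolbox function `spectrogram` offers a ready-made alternative.

```matlab
% Illustrative values only; none of these settings come from the handout.
sr   = 16000;                                   % sampling rate in Hz
x    = sin(2 * pi * 1000 * (0:sr - 1) / sr);    % 1 s test tone at 1 kHz
nwin = 400;                                     % window length (25 ms)
hop  = 160;                                     % hop size (10 ms)
nfft = 512;                                     % FFT length

w       = 0.5 - 0.5 * cos(2 * pi * (0:nwin - 1)' / nwin);  % periodic Hann window
nframes = floor((numel(x) - nwin) / hop) + 1;
S       = zeros(nfft / 2 + 1, nframes);         % rows: frequency, columns: time

for m = 1:nframes
    seg     = x((m - 1) * hop + (1:nwin)).';    % current frame as a column
    X       = fft(seg .* w, nfft);              % zero-padded FFT of the frame
    S(:, m) = abs(X(1:nfft / 2 + 1));           % keep non-negative frequencies
end

F2 = (0:nfft / 2)' * sr / nfft;                 % frequency axis in Hz
[~, ix] = max(S(:, 1));
fprintf('Size %d x %d, peak near %.1f Hz\n', size(S, 1), size(S, 2), F2(ix));
```

Each column of `S` is the magnitude spectrum of one frame, so the matrix has frequency along its rows and time along its columns, like the auditory spectrogram.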

Figure 2: Block diagram of auditory system

Assessment 1 – Assignment 1 (25%)
Design an auditory system in MATLAB.
1. Build a cochlear filterbank using the sample code in gammatonegram.tgz. Select your
specifications from Table 2 based on the right-most digit of your student number.
Note that the gammatone filter order column in Table 2 is meant for assignment 2.
After your changes are introduced, ensure the following:
a) The heights of the two spectrograms are the same as the number of channels in
your setting;
b) The lowest centre frequency in your gain response display should be within ±8 Hz
of your lowest centre frequency setting.
Right-most index of   Gammatone filter with     Number of channels    Gammatone
your student number   lowest centre frequency   (gammatone filters)   filter order
0                     60 Hz                     90                    2
1                     70 Hz                     92                    3
2                     80 Hz                     94                    4
3                     90 Hz                     96                    5
4                     100 Hz                    98                    6
5                     110 Hz                    100                   2
6                     120 Hz                    102                   3
7                     130 Hz                    104                   4
8                     140 Hz                    106                   5
9                     150 Hz                    108                   6
Table 2: Cochlear filterbank specification

c) Generate a time vector (a vector is known alternatively as an array) 𝑡1 that
contains numbers from [0 to (𝑇 − 1)]/𝑠𝑟. Ensure the division by 𝑠𝑟 is done after
generating the vector 0 to 𝑇 − 1. Note that 𝑇 is the length (in number of samples,
not time duration) of the sound signal stored in sa2.wav that is found in
gammatonegram.tgz and 𝑠𝑟 is the sampling rate of the sound signal.
d) Use the sound signal found in sa2.wav provided in gammatonegram.tgz, as input
to your model. In MATLAB, display the waveform of the sound signal with respect
to time vector 𝑡1 in figure 1 and label the x-axis to reflect time in seconds and
y-axis to reflect amplitude (unitless).
e) Generate an auditory spectrogram by convolving the sound signal from c) with a
gammatone filterbank based on your custom setting from Table 2. Display the
auditory spectrogram in a separate figure 2.
f) Perform a short-time Fourier transform (STFT) on the sound signal from c) to
generate an STFT spectrogram. Display the STFT spectrogram below the auditory
spectrogram generated in part e).
g) Calculate and plot the average power (in Watts) of the STFT and auditory
spectrograms at the top half and bottom half, respectively, in MATLAB figure 3.
Here, average power is to be computed independently for each column of the two
spectrograms. Label the axes and title the graphs. Hint: See the online MathWorks
help page on the bandpower command. Also, the 𝑡2 time vector from assignment 2
task 4 is helpful for display of the graphs and axis labels.
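
The time vector in c) and the per-column average power in g) can be sketched as follows. The sketch uses random stand-in data rather than sa2.wav, and the sampling rate and matrix sizes are hypothetical. The mean of squared values per column is used here as a toolbox-free stand-in for what `bandpower` is documented to return for a matrix input; check the MathWorks help page before relying on that equivalence.

```matlab
% Stand-in data; in the assignment the signal comes from sa2.wav.
sr = 16000;                 % hypothetical sampling rate in Hz
x  = randn(1, 2 * sr);      % stand-in for the loaded sound signal
T  = numel(x);              % length in samples, not seconds

t1 = (0:T - 1) / sr;        % build 0..T-1 first, then divide by sr

% Stand-in spectrogram: rows are channels, columns are time frames.
S = abs(randn(64, 100));
P = mean(S .^ 2, 1);        % average power of each column (1 x 100)

fprintf('t1 has %d samples, last value %.6f s\n', numel(t1), t1(end));
fprintf('P has %d values, one per spectrogram column\n', numel(P));
```

Because `t1` starts at 0, its last value is (𝑇 − 1)/𝑠𝑟 rather than the full duration 𝑇/𝑠𝑟.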

Add comments to the code you have modified or introduced in MATLAB. Prepare a progress
report describing what tasks you have completed. Include an introduction, completed tasks
description (and/or any working tasks), challenges experienced, conclusion, and references.
Include figures in your report where necessary.

Use any online English grammar and vocabulary checking application to ensure that your
report is coherent and clear, e.g. Grammarly – marks will be given if you are able to convey
your ideas clearly and concisely. Your report should be at least 2 pages. On the due date of
16 Apr 2021 (11.59pm), submit only the MATLAB script files that you have modified and your
progress report (via Turnitin) on vUWS submission link under “Assessment 1”.

Assessment 3 – Assignment 2 (25%)


1. In Figure 1 above, the gain response of every 5th channel of the gammatone filterbank
is displayed. Generate and display the gain response 𝑔1 (the equation has already
been implemented for you in the second argument of the plot line in
demo_gammatone.m) of all the channels in the gammatone filterbank on a linearly
scaled x-axis and the same response on a logarithmically scaled x-axis in figure 4 in
MATLAB. Plot the linearly-scaled gain response on top of MATLAB figure 4 and the log-
scaled gain response below it. In the graph, your settings from Table 2 can be checked
by inspecting the peak of the first filter (left-most curve). This value should be within
±8 Hz of your setting from Table 2. The peak of the last filter (right-most curve) should
be close to but less than 8 kHz.
2. Generate two temporal profiles – one from the auditory spectrogram and another
from the STFT spectrogram generated from assignment 1. A temporal profile can be
generated by summing all the rows of a spectrogram.
3. Generate two spectral profiles – one from the auditory spectrogram and another from
the STFT spectrogram from assignment 1. A spectral profile can be generated by
summing all the columns of a spectrogram.
4. Generate a time vector 𝑡2 that contains 𝑛 number of samples in the range from 0 to
the time duration of the sound signal in sa2.wav. Here, 𝑛 is a fixed number dependent
on the length (number of samples) of the auditory spectrogram generated in
assignment 1.
5. Display two temporal profiles and two spectral profiles in figure 5. The x-axis of each
temporal profile should be displayed with respect to 𝑡2 (in seconds). The x-axis of each
spectral profile should be displayed with respect to 𝐹 and 𝐹2 vectors (in Hertz) that
correspond to the auditory spectrogram and STFT spectrogram respectively – these
vectors have been automatically generated for you in demo_gammatone.m in
assignment 1. The amplitude (y-axis) for all four graphs is unitless. Display:
a. The spectral profile from the STFT spectrogram on the top-left corner in
MATLAB figure 5;

b. The spectral profile from the auditory spectrogram on the top-right corner in
MATLAB figure 5;
c. The temporal profile from the STFT spectrogram on the bottom-left corner in
MATLAB figure 5.
d. The temporal profile from the auditory spectrogram on the bottom-right
corner in MATLAB figure 5.
6. Use the 2D correlation coefficient (CC) to show the quantitative difference between the
following comparisons (note that only one CC should be generated per comparison).
Use fprintf to display the comparisons below one line at a time in your command
window.
a. Auditory spectrogram versus STFT spectrogram generated in assignment 1.
b. Auditory spectrogram bandpower versus STFT spectrogram bandpower
generated in assignment 1.
c. Auditory spectrogram temporal profile versus STFT spectrogram temporal
profile.
d. Auditory spectrogram spectral profile versus STFT spectrogram spectral
profile.
7. Use symbolic variables to display the impulse response of an 𝑛th-order gammatone
filter, where 𝑛 is found from Table 2 based on the right-most digit of your
student number. The impulse response equation is defined by g[n] in the Auditory
Signal Processing.pdf slides.
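
Tasks 2 to 4 and 6 can be sketched on small stand-in matrices. This sketch reads "summing all the rows" as adding the rows together, which gives one value per time frame, and "summing all the columns" as adding the columns together, which gives one value per frequency channel. That reading, the matrices, and the duration are assumptions for illustration. The correlation coefficient is written out in full so that no toolbox is required; `corr2` from the Image Processing Toolbox computes the same quantity.

```matlab
% Stand-in spectrograms of equal size (rows: frequency, columns: time).
A = [1 2 3; 4 5 6];                        % hypothetical auditory spectrogram
B = [1 2 4; 4 6 6];                        % hypothetical STFT spectrogram

temporalA = sum(A, 1);                     % add rows together: one value per frame
spectralA = sum(A, 2);                     % add columns together: one value per channel

duration = 2.0;                            % hypothetical signal duration in seconds
t2 = linspace(0, duration, size(A, 2));    % one time stamp per spectrogram column

% 2D correlation coefficient between two equally sized matrices.
a  = A - mean(A(:));                       % remove the overall mean of A
b  = B - mean(B(:));                       % remove the overall mean of B
cc = sum(a(:) .* b(:)) / sqrt(sum(a(:) .^ 2) * sum(b(:) .^ 2));

fprintf('Auditory vs STFT spectrogram: CC = %.4f\n', cc);
```

The two inputs must have the same size for this comparison, so spectrograms with different dimensions would need to be brought to a common size first.
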
Add comments to the code you have modified or introduced in MATLAB. On the due date of
04 Jun 2021 (11.59pm), submit only the MATLAB script files that you have modified on vUWS
submission link under “Assessment 3” as well as your final report detailed below.

Assessment 3 – Final Report (25%)


Prepare the final report by combining the results of both assignments 1 and 2 using a standard
format. The final report should include the images from assignments 1 and 2 as well as the
correlation coefficient results and the gammatone filter impulse response (screen capture –
do not use your phone to capture any images). Use any online English grammar and
vocabulary checking application to ensure that your report is coherent and clear, e.g.
Grammarly – marks will be given if you are able to convey your ideas clearly and concisely.
The submission deadline is 04 Jun 2021, by 11:59pm.

A suggested structure for the final report is:


1. Introduction. Objectives – alternatively, you can include a motivation statement on
why this project is important.
2. Components of the auditory model.
3. Modelling the auditory model in MATLAB, with the specification from Table 2
clearly described. Also mention the filter order from Table 2 that is required to
show the gammatone impulse response.
4. Results (screen capture of all the figures from assignments 1 and 2; screen capture of
MATLAB command window showing correlation coefficients (CC) and the symbolic
equation from assignment 2). Comment on the CC results to indicate the degree of
difference between pairs of vectors and matrices in assignment 2, task 6.
a. Address which CC result is highest and thus, most similar.
b. Conversely, address which CC result is lowest and thus, least similar.
5. Conclusion (discuss your experience in using MATLAB for modelling of the auditory
model, its usefulness, and difficulties).
6. References. IEEE-style referencing preferred – See the last slide in Auditory Signal
Processing.pdf as an example.

Please submit your final report using the Turnitin link in vUWS under “Assessment 3”.

Resources
• Signal Processing Toolbox.
• Audio Toolbox.
• Auditory Filterbank Sample Code.
• Auditory Filterbank Documentation.
