net/publication/352491950
1 author:
Arpita Das
Chittagong University of Engineering & Technology
All content following this page was uploaded by Arpita Das on 27 June 2021.
Date – 29.10.2018
Submitted to
Submitted by
2) To be able to detect the pitch and fundamental frequency of a signal from an audio file
THEORY:
PITCH: The quality of a sound governed by the rate of the vibrations producing it; the degree
of highness or lowness of a tone. It is the tone that is perceived by the listener.
METHODS:
CEPSTRUM: A cepstrum is the result of taking the inverse Fourier transform (IFT) of
the logarithm of the estimated spectrum of a signal. There is a complex cepstrum,
a real cepstrum, a power cepstrum, and a phase cepstrum. The power cepstrum in particular
finds applications in the analysis of human speech. The cepstrum is computed by taking the Fourier
transform, then its magnitude, then the logarithm, and finally the inverse Fourier transform.
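The four steps above can be written compactly for a discrete signal. The symbols x[n], X[k], c[n] and N are notation introduced here for illustration and do not appear elsewhere in this report; the expression is the real cepstrum of a length-N sequence.

```latex
X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-j 2\pi k n / N},
\qquad
c[n] = \frac{1}{N} \sum_{k=0}^{N-1} \log\lvert X[k] \rvert \, e^{j 2\pi k n / N}
```

The index n of c[n] is the quefrency in samples; dividing it by the sampling rate gives the quefrency in seconds.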
FFT: A Fast Fourier Transform (FFT) is an algorithm that samples a signal over a period of
time (or space) and divides it into its frequency components. These components are single
sinusoidal oscillations at distinct frequencies each with their own amplitude and phase. An
FFT algorithm computes the discrete Fourier transform (DFT) of a sequence, or its inverse
(IFFT). Fourier analysis converts a signal from its original domain to a representation in
the frequency domain and vice versa.
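As a small illustration of the idea, the sketch below applies the FFT to a synthetic tone and reads the strongest frequency component off the magnitude spectrum. The sampling rate, signal length and tone frequency are assumed values chosen for the example, not properties of the recordings used in this report.

```matlab
% Illustrative values only; none of them are taken from the recordings
fs = 8000;                    % sampling rate in Hz
N  = 4096;                    % number of samples and FFT length
t  = (0:N-1)/fs;              % time axis in seconds
f0 = 200;                     % frequency of the test tone in Hz
x  = sin(2*pi*f0*t);          % test signal

X    = abs(fft(x, N));        % magnitude of the DFT
half = 1:N/2;                 % indices of the non-negative frequencies
f    = (half - 1)*fs/N;       % frequency of each retained bin in Hz

[~, k] = max(X(half));        % strongest spectral component
fPeak  = f(k);
fprintf('Strongest component near %.1f Hz (bin spacing %.2f Hz)\n', fPeak, fs/N);
```

The estimate can only fall on a multiple of fs/N, so a longer FFT gives a finer frequency grid.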
Developed Code:
%% Clearing and closing previous files and/or variables
%%
clc;
clearvars;
close all;
%%
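% xa1, xe1, xu1: recorded vowel signals; fs: their sampling rate in Hz
NFFT = 4096;                               % FFT length
xaF = fftshift(abs(fft(xa1, NFFT)));       % two-sided magnitude spectrum
f = (-1/2:1/NFFT:1/2-1/NFFT)*fs;           % frequency axis in Hz, centred on 0
figure, plot(f, xaF)
hold on;
% The threshold of 10 should be varied if the fundamental frequencies of signals a, e, u are not the same
[pka, lka] = findpeaks(xaF, 'MinPeakHeight', 10);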
plot(f(lka), xaF(lka), 'o');
title('Signal "a" in frequency domain');
xlabel('Frequency(Hz)');
ylabel('Amplitude');
ffa=min(abs(f(lka)));
fprintf('fundamental frequency of signal "a" is: %f Hz\n', ffa);
xeF = fftshift(abs(fft(xe1,NFFT)));
f=(-1/2:1/NFFT:1/2-1/NFFT)*fs;
figure, plot(f,xeF(1:end))
hold on;
% The threshold of 10 should be varied if the fundamental frequencies of signals a, e, u are not the same
[pke, lke] = findpeaks(xeF, 'MinPeakHeight', 10);
plot(f(lke), xeF(lke), 'o');
title('Signal "e" in frequency domain');
xlabel('Frequency(Hz)');
ylabel('Amplitude');
ffe=min(abs(f(lke)));
fprintf('fundamental frequency of signal "e" is: %f Hz\n', ffe);
xuF = fftshift(abs(fft(xu1,NFFT)));
f=(-1/2:1/NFFT:1/2-1/NFFT)*fs;
figure, plot(f,xuF(1:end))
hold on;
% The threshold of 10 should be varied if the fundamental frequencies of signals a, e, u are not the same
[pku, lku] = findpeaks(xuF, 'MinPeakHeight', 10);
plot(f(lku), xuF(lku), 'o');
title('Signal "u" in frequency domain');
xlabel('Frequency(Hz)');
ylabel('Amplitude');
ffu=min(abs(f(lku)));
fprintf('fundamental frequency of signal "u" is: %f Hz\n\n', ffu);
%% PSD analysis
%%
% spectrum.welch and psd are legacy Signal Processing Toolbox interfaces;
% pwelch, used further below, is the function-based equivalent.
h = spectrum.welch; % or, h = spectrum.periodogram
xapsd = psd(h, xa1, 'fs', fs);
figure; plot(xapsd);
xepsd = psd(h, xe1, 'fs', fs);
figure; plot(xepsd);
xupsd = psd(h, xu1, 'fs', fs);
figure; plot(xupsd);
% The highest point of the PSD is taken as the pitch of the signal and is marked on the plot.
[Pxxe,Fxe] = pwelch(xe1,length(xe1),0,NFFT,fs);
figure, plot(Fxe, Pxxe);
hold on;
[~,Ie] = max(Pxxe);
ffreqe = abs(Fxe(Ie));
fprintf('Pitch of signal "e" is: %f Hz\n', ffreqe);
plot(Fxe(Ie), Pxxe(Ie), 'o');
title('PSD of Signal "e"');
xlabel('Frequency(Hz)');
ylabel('Power/Frequency');
[Pxxu,Fxu] = pwelch(xu1,length(xu1),0,NFFT,fs);
figure, plot(Fxu, Pxxu);
hold on;
[~,Iu] = max(Pxxu);
ffrequ = abs(Fxu(Iu));
fprintf('Pitch of signal "u" is: %f Hz\n\n', ffrequ);
plot(Fxu(Iu), Pxxu(Iu), 'o');
title('PSD of signal "u"');
xlabel('Frequency(Hz)');
ylabel('Power/Frequency');
% Autocorrelation of each signal; plot(re) uses the lag index in samples as its horizontal axis
re = xcorr(xe1, xe1);
figure, plot(re);
title('Auto-correlated Sound "e" in Time Domain');
xlabel('Lag index (samples)');
ylabel('Amplitude');
ru = xcorr(xu1, xu1);
figure, plot(ru)
title('Auto-correlated Sound "u" in Time Domain');
xlabel('Lag index (samples)');
ylabel('Amplitude');
%% Cepstrum
%%
% Working with a section of signal
dt = 1/fs;
I0 = round(0.1/dt);
Iend = round(0.2/dt);
xac = xa1(I0:Iend);
figure, plot(xac)
title('Working with a section of sound signal "a"');
xlabel('Time');
ylabel('Amplitude');
xec = xe1(I0:Iend);
figure, plot(xec);
title('Working with a section of sound signal "e"');
xlabel('Time');
ylabel('Amplitude');
xuc = xu1(I0:Iend);
figure, plot(xuc);
title('Working with a section of sound signal "u"');
xlabel('Time');
ylabel('Amplitude');
% Real cepstrum of each section; plot(ca) uses the quefrency index in samples as its horizontal axis
ca = rceps(xac);
figure, plot(ca);
title('Cepstrum of signal "a"');
xlabel('Quefrency (samples)')
ylabel('Amplitude');
ce = rceps(xec);
figure, plot(ce)
title('Cepstrum of signal "e"');
xlabel('Quefrency (samples)')
ylabel('Amplitude');
cu = rceps(xuc);
figure, plot(cu)
title('Cepstrum of signal "u"');
xlabel('Quefrency (samples)')
ylabel('Amplitude');
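% The definitions of trng and crng_* are not part of this listing. The lines below are an
% assumed reconstruction based on the 170-240 Hz search band stated in the outputs section.
tq = (0:length(ca)-1)*dt;              % quefrency axis in seconds
keep = tq >= 1/240 & tq <= 1/170;      % quefrencies corresponding to 170-240 Hz
trng = tq(keep);
crng_a = ca(keep);
crng_e = ce(keep);
crng_u = cu(keep);
[~,Ia] = max(crng_a);                  % cepstral peak; 1/trng(Ia) is the F0 estimate
figure, plot(trng, crng_a);
hold on;
plot(trng(Ia), crng_a(Ia), 'o');
title('Real Cepstrum F0 Estimation of signal "a"');
xlabel('Quefrency (s)');
ylabel('Amplitude');
[~,Ie] = max(crng_e);
figure, plot(trng, crng_e);
hold on;
plot(trng(Ie), crng_e(Ie), 'o');
title('Real Cepstrum F0 Estimation of signal "e"');
xlabel('Quefrency (s)');
ylabel('Amplitude');
[~,Iu] = max(crng_u);
figure, plot(trng, crng_u);
hold on;
plot(trng(Iu), crng_u(Iu), 'o');
title('Real Cepstrum F0 Estimation of signal "u"');
xlabel('Quefrency (s)');
ylabel('Amplitude');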
Outputs:
The three audio inputs were recordings of my own voice sustaining three different sounds
(aaaaaa…, eeeeee… and uuuuuu…). They were supplied to MATLAB as .wav files.
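A recording of this kind can be brought into the workspace roughly as sketched below. The file name is hypothetical (a temporary file is created so that the example runs on its own), and taking the first channel is an assumption about how a variable such as xa1 may have been obtained.

```matlab
% Create a stand-in recording so the sketch runs without the original files
fsDemo  = 16000;                          % assumed sampling rate in Hz
tDemo   = (0:fsDemo-1)'/fsDemo;           % one second of samples
demo    = 0.5*sin(2*pi*200*tDemo);        % synthetic tone in place of a voice
wavFile = [tempname '.wav'];              % hypothetical file name
audiowrite(wavFile, demo, fsDemo);

% Read the recording back: samples in the columns of xa, sampling rate in fs
[xa, fs] = audioread(wavFile);
xa1 = xa(:, 1);                           % keep the first channel only
fprintf('%d samples at %d Hz (%.2f s)\n', numel(xa1), fs, numel(xa1)/fs);

delete(wavFile);                          % remove the temporary file
```

Keeping a single channel means the later FFT, PSD and cepstrum steps operate on a plain vector.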
2) Frequency domain plot:
3) Pitch Estimation (Using pwelch function of MATLAB):
Hence, the curves show that pitch estimation using the pwelch function of MATLAB gives:
All of these values will later be validated by analytic estimation as well.
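The peak-picking step behind this estimate can be sketched on a synthetic signal. The tone frequencies, noise level, segment length and overlap below are assumed values for illustration; the report's own code uses a single segment spanning the whole signal.

```matlab
fs = 8000;                                 % assumed sampling rate in Hz
t  = (0:fs-1)/fs;                          % one second of samples
% Synthetic "voiced" signal: 200 Hz fundamental, weaker 400 Hz harmonic, some noise
x  = sin(2*pi*200*t) + 0.4*sin(2*pi*400*t) + 0.1*randn(size(t));

% Welch PSD: 1024-sample segments, 512-sample overlap, 4096-point FFT
[Pxx, F] = pwelch(x, 1024, 512, 4096, fs);

[~, iMax] = max(Pxx);                      % highest point of the PSD
fPitch = F(iMax);                          % frequency at that point in Hz
fprintf('Strongest PSD component: %.1f Hz\n', fPitch);
```

Taking the global maximum assumes that the fundamental is the strongest component. If a harmonic carries more power than the fundamental, this rule returns the harmonic instead.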
4) Auto correlated signal:
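Here autocorrelation is used to reduce the influence of noise: uncorrelated noise contributes mainly
at zero lag, while the periodic part of the signal repeats at multiples of its period. We have
plotted the signals after autocorrelation in the time domain.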
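One common way to read a period off such an autocorrelation is sketched below. This is an illustration on a synthetic noisy tone and not a step performed in this report; the 150-400 Hz search band and all signal parameters are assumptions.

```matlab
fs = 8000;                                 % assumed sampling rate in Hz
f0 = 200;                                  % frequency of the synthetic tone in Hz
t  = (0:fs/2-1)/fs;                        % half a second of samples
x  = sin(2*pi*f0*t) + 0.3*randn(size(t));  % tone buried in white noise

[r, lags] = xcorr(x);                      % autocorrelation and its lags in samples

% Search only the lags that correspond to the assumed 150-400 Hz band
minLag = floor(fs/400);
maxLag = ceil(fs/150);
sel    = lags >= minLag & lags <= maxLag;
rSel   = r(sel);
lagSel = lags(sel);

[~, i] = max(rSel);                        % lag of the strongest repetition
f0Est  = fs/lagSel(i);                     % period in samples converted to Hz
fprintf('Autocorrelation F0 estimate: %.1f Hz\n', f0Est);
```

Restricting the lag range keeps the search away from the zero-lag peak, where the noise energy is concentrated.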
5) Pitch Estimation (via PERIODOGRAM):
Hence, the curves show that pitch estimation from the PERIODOGRAM gives:
All of these values will later be validated by analytic estimation as well.
6) Cepstrum:
The cepstrum is computed using the rceps() function, which returns the real part of the inverse
Fourier transform of the logarithm of the magnitude of the Fourier transform.
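The same quantity can be written out with base MATLAB functions, as in the sketch below. The test sequence is arbitrary, and the result is expected to agree with the first output of rceps up to rounding, although that comparison is not made here.

```matlab
x = [1 0.5 -0.3 0.8 0.2 -0.6 0.4 0.1];     % short arbitrary test sequence

X = fft(x);                                % Fourier transform
c = real(ifft(log(abs(X))));               % real cepstrum, step by step

disp(c);
```

For a real input the log-magnitude spectrum is even, so the resulting cepstrum is real and symmetric about quefrency zero.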
7) Real cepstrum (Estimation of Fundamental Frequency):
Assuming a fundamental frequency range (for women, usually between 170 and 240 Hz), we
plotted the cepstrum over the corresponding time (quefrency) range and marked the time at which
it reaches its maximum. The inverse of that time gives the lowest frequency component, that is,
the fundamental frequency.
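The procedure described above can be sketched end to end on a synthetic signal. The impulse-train model, the one-pole filter, the sampling rate and the section length are all assumptions made for the example; only the 170-240 Hz search band comes from the text.

```matlab
fs = 8000;                                 % assumed sampling rate in Hz
N  = 2048;                                 % length of the analysed section
f0 = 200;                                  % true F0 of the synthetic signal in Hz

pulses = zeros(1, N);
pulses(1:round(fs/f0):N) = 1;              % impulse train with a 5 ms period
x = filter(1, [1 -0.9], pulses);           % one-pole filter as a crude vocal-tract stand-in
w = 0.54 - 0.46*cos(2*pi*(0:N-1)/(N-1));   % Hamming window written out by hand

% Real cepstrum; eps guards against log(0)
c = real(ifft(log(abs(fft(x.*w)) + eps)));

tq   = (0:N-1)/fs;                         % quefrency axis in seconds
keep = tq >= 1/240 & tq <= 1/170;          % quefrencies of the 170-240 Hz band
trng = tq(keep);
crng = c(keep);

[~, I] = max(crng);                        % cepstral peak inside the band
f0Est  = 1/trng(I);                        % inverse of the peak quefrency
fprintf('Cepstral F0 estimate: %.1f Hz\n', f0Est);
```

Because quefrency is sampled in steps of 1/fs, the estimate is quantised: neighbouring candidates near 200 Hz are roughly 5 Hz apart at this sampling rate.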
Results:
Figure 08: Results showing fundamental frequency before and after the tested operation
Conclusion:
Our main target in this project was to detect pitch and fundamental frequency, and we have learned
several ways to do so. We detected the fundamental frequency from both the fast Fourier transform
and the cepstrum, and found that the two values were not equal. Noise in the recordings may be one
reason for this discrepancy. We also tried to detect pitch from the periodogram and from the pwelch
function of MATLAB, and in this case too we did not get the same values. The audio was saved in
.wav format, which is uncompressed, so the file format itself is unlikely to explain these
differences. Finally, there may be some errors in the code, as we are new to this kind of
implementation, and we hope that our audience will be considerate about this.