
VOWEL IDENTIFICATION OF SPEECH SIGNAL

BY T RAJA SAI SUNESH KUMAR
1391082007
FOR COMPLETION OF SSP LAB
INSTITUTE OF TECHNICAL & EDUCATION RESEARCH

CONTENTS

SL NO  TOPICS

1  INTRODUCTION
2  TECHNIQUES TO IDENTIFY VOWELS
3  PROPERTIES
4  DATABASE
5  RESULT AND DISCUSSION
6  CONCLUSION
7  APPENDIX

INTRODUCTION

• A speech signal is a quasi-periodic signal; here it is sampled at 8000 Hz.

• A vowel is a sound pronounced with an open vocal tract, so that the tongue does not touch the lips, teeth, or roof of the mouth.

• A consonant is a speech sound that is articulated with complete or partial closure of the vocal tract.

DIFFERENCES BETWEEN VOWELS AND CONSONANTS

VOWELS
• Vowels are pronounced with an open mouth, so there are no trapped sounds.
• Vowels are always voiced.
• The vowels in English are a, e, i, o, u.

CONSONANTS
• Consonants are pronounced with trapped sounds.
• Consonants can be voiced or unvoiced.
• The consonants in English are b, c, d, f, g, etc.

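TECHNIQUE USED TO IDENTIFY VOWELS

• Short-time energy in the time domain: serves to differentiate voiced and unvoiced sounds in speech from silence (background signal).
• The natural definition of the energy of the windowed signal is:

$$E_n = \sum_{m=-\infty}^{\infty} \left[ x(m)\, w(n-m) \right]^2 = \sum_{m=-\infty}^{\infty} x^2(m)\, h(n-m), \qquad h(n) = w^2(n)$$

• Block diagram representation: x(n) → squaring → x^2(n) → filter h(n) → E_n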
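
The definition above can be sketched in MATLAB as follows. This is only an illustration under stated assumptions, not the project's code: the synthetic test signal, the frame length N = 160 and the frame shift of 80 samples are assumed values chosen to suit an 8000 Hz recording, and the Hamming window is computed by hand so that no toolbox is required.

```matlab
% Short-time energy of a synthetic signal (illustrative sketch).
fs  = 8000;                       % sampling frequency in Hz (as in the slides)
N   = 160;                        % frame length: 20 ms at 8000 Hz (assumed)
hop = 80;                         % frame shift: 50 % overlap (assumed)

% Assumed test signal: 0.1 s of low-level noise followed by 0.1 s of a
% louder 200 Hz tone, standing in for silence followed by a voiced sound.
t = (0:fs/10-1).' / fs;
x = [0.01*randn(numel(t),1); 0.5*sin(2*pi*200*t)];

n = (0:N-1).';
w = 0.54 - 0.46*cos(2*pi*n/(N-1));   % Hamming window w(n)

numFrames = floor((numel(x) - N)/hop) + 1;
E = zeros(1, numFrames);
for k = 1:numFrames
    frame = x((k-1)*hop + (1:N));    % N samples of frame k
    E(k)  = sum((frame .* w).^2);    % E_n = sum over m of [x(m) w(n-m)]^2
end

% Frames taken from the tone should carry far more energy than noise frames.
noiseMean = mean(E(1:floor(numFrames/2)));
toneMean  = mean(E(ceil(numFrames/2)+1:end));
fprintf('mean energy, noise frames: %g\n', noiseMean);
fprintf('mean energy, tone frames:  %g\n', toneMean);
```

Thresholding E between the two printed levels is one simple way to separate the louder segment from the background.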

PROPERTIES

• With h(n) = w^2(n), the short-time energy is the squared signal filtered by h(n). If the window w(n) is very long, the energy changes very little with time.
• A very long window corresponds to a narrowband filter.
• We use two windows:
1) Rectangular window: h(n) = 1 for 0 <= n <= N-1, and 0 otherwise
2) Hamming window: h(n) = 0.54 - 0.46 cos(2πn/(N-1)) for 0 <= n <= N-1, and 0 otherwise
• The rectangular window gives equal weight to all N samples in the window (n-N+1, ..., n).
• The Hamming window gives most weight to the middle samples and tapers off strongly at the beginning and the end of the window.
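
To make the difference between the two windows concrete, the sketch below builds both from the definitions above and applies them to the same frame. The window length N = 160 and the constant test frame are assumptions made for illustration; the Hamming window is written out by hand rather than taken from a toolbox function.

```matlab
N = 160;                                  % window length (assumed)
n = (0:N-1).';

hRect = ones(N, 1);                       % rectangular window: 1 for 0 <= n <= N-1
hHamm = 0.54 - 0.46*cos(2*pi*n/(N-1));    % Hamming window from its definition

% The Hamming window equals 0.08 at both ends and is close to 1 in the middle.
fprintf('Hamming: first = %.2f, middle = %.2f, last = %.2f\n', ...
        hHamm(1), hHamm(N/2), hHamm(N));

% Same constant frame, two windows: tapering lowers the measured energy.
frame = ones(N, 1);
eRect = sum((frame .* hRect).^2);
eHamm = sum((frame .* hHamm).^2);
fprintf('energy with rectangular window: %.2f\n', eRect);
fprintf('energy with Hamming window:     %.2f\n', eHamm);
```

Because the two windows scale the energy differently, energies are only comparable when every recording is processed with the same window.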

DATABASE

• WaveSurfer was used for recording, with the sampling frequency set to 8000 Hz and the voice recorded in mono mode.
• Each of the five vowels was recorded 5 times and saved with the .wav extension.
• All the consonants were recorded and saved with the .wav extension.
• Built-in MATLAB commands such as audioread and audiowrite were used to read and listen to the required vowels and consonants.
• We used the Hamming window because it gives greater attenuation outside the passband than the rectangular window.
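
As a rough illustration of the file handling described above, the sketch below writes a mono 8000 Hz .wav file and reads it back. The sine tone is an assumed stand-in for a real recording, and the file name demo_vowel.wav is hypothetical; the actual recordings were made in WaveSurfer.

```matlab
fs = 8000;                                   % sampling frequency used in the slides
t  = (0:fs-1).' / fs;                        % one second of samples
y  = 0.5 * sin(2*pi*220*t);                  % assumed stand-in for a recorded vowel

fname = fullfile(tempdir, 'demo_vowel.wav'); % hypothetical file name
audiowrite(fname, y, fs);                    % save as a mono .wav file

[x, fsRead] = audioread(fname);              % read samples and sampling rate back
info = audioinfo(fname);                     % file metadata (channels, duration)

fprintf('fs = %d Hz, channels = %d, duration = %.2f s\n', ...
        fsRead, info.NumChannels, info.Duration);
% sound(x, fsRead);                          % uncomment to listen to the signal
```

Checking fsRead and the channel count after loading is a cheap way to confirm that a recording really is 8000 Hz mono before computing its energy.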

RESULT AND DISCUSSION
• Energy values recorded for the vowels:
a=249;
e=312;
i=281;
o=331;
u=331;

• Energy values recorded for the consonants:
b=137;
c=137;
d=87;

• Average energy of all the vowels is about 300 (1504/5 = 300.8).

DISCUSSION

We calculated the energies of the vowels and consonants both theoretically and practically, averaged the vowel energies, and compared that average with the energy of a consonant to get the required output.
For example:
Sum of the energies of all the vowels, i.e. A = 249+312+281+331+331 = 1504
Average energy of the vowels, i.e. B = A/5 = 300.8, taken as 300
Energy of consonant b, i.e. C = 137
Now we compare the two using an if condition:
if (B >= C)
then the output is vowel,
otherwise
the output is consonant.
Since B = 300 is greater than C = 137, the condition holds and the output is vowel.
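
The comparison can be checked numerically with a few lines of MATLAB. The energy values are the ones reported in the results; the variable names vowelEnergy and label are introduced here only for illustration.

```matlab
vowelEnergy = [249 312 281 331 331];   % a, e, i, o, u (values from the results)
A = sum(vowelEnergy);                  % sum of the vowel energies
B = A / numel(vowelEnergy);            % average vowel energy
C = 137;                               % energy reported for consonant b

if B >= C
    label = 'vowel';
else
    label = 'consonant';
end
fprintf('A = %d, B = %.1f, C = %d -> %s\n', A, B, C, label);
```

With these values B is well above C, so the condition B >= C is satisfied.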

CONCLUSION

• We have successfully completed the project. We calculated the energies of vowels and consonants and compared them to distinguish vowels from consonants in a speech signal: if the average vowel energy is greater than or equal to the consonant energy, the output is labelled as a vowel; otherwise it is labelled as a consonant.

• Theoretically and practically, consonants have lower energy than vowels.

APPENDIX

clc;
close all;
clear;

% Frames of 160 samples (20 ms at 8000 Hz) with an overlap of 80 samples.
% buffer and hamming require the Signal Processing Toolbox.
h = hamming(160);                     % Hamming window applied to every frame

% energy of a
[a, fs] = audioread('a.wav');
a1 = buffer(a, 160, 80, 'nodelay');   % one frame per column
a2 = diag(sparse(h)) * a1;            % multiply each frame by the window
energy1 = sum(a2.^2, 1);              % short-time energy of each frame

% energy of e
[e, fs] = audioread('e.wav');
e1 = buffer(e, 160, 80, 'nodelay');
e2 = diag(sparse(h)) * e1;
energy2 = sum(e2.^2, 1);

% energy of i
[i, fs] = audioread('i.wav');
i1 = buffer(i, 160, 80, 'nodelay');
i2 = diag(sparse(h)) * i1;
energy3 = sum(i2.^2, 1);              % this line was missing in the original listing

% energy of o
[o, fs] = audioread('o.wav');
o1 = buffer(o, 160, 80, 'nodelay');
o2 = diag(sparse(h)) * o1;
energy4 = sum(o2.^2, 1);

% energy of u
[u, fs] = audioread('u.wav');
u1 = buffer(u, 160, 80, 'nodelay');   % frames of u (the original buffered o here)
u2 = diag(sparse(h)) * u1;
energy5 = sum(u2.^2, 1);

% average of all the vowel energies (values reported in the results)
A = (249 + 312 + 281 + 331 + 331) / 5;

% energy of consonant b
[b, fs] = audioread('b.wav');
b1 = buffer(b, 160, 80, 'nodelay');
b2 = diag(sparse(h)) * b1;
energy6 = sum(b2.^2, 1);

% energy of consonant c
[c, fs] = audioread('c.wav');
c1 = buffer(c, 160, 80, 'nodelay');
c2 = diag(sparse(h)) * c1;
energy7 = sum(c2.^2, 1);

% energy of consonant d
[d, fs] = audioread('d.wav');
d1 = buffer(d, 160, 80, 'nodelay');
d2 = diag(sparse(h)) * d1;
energy8 = sum(d2.^2, 1);

% average of all the consonant energies (values reported in the results)
g = (137 + 137 + 87) / 3;

% compare the average vowel energy with the average consonant energy
if A >= g
    disp('it is a vowel.')
else
    disp('it is not a vowel')
end
THANK YOU

THIS PRESENTATION IS PRESENTED
BY T RAJA SAI SUNESH KUMAR
DUAL DEGREE ECE (DSIP)
1391082007
