Published by: Rakeshconclave on Feb 15, 2012
Copyright:Attribution Non-commercial
International Journal of Advances in Science and Technology, Vol. 4, No. 1, 2012
Novel Speech Processing Methodology to Shrink Spectral Masking for Hearing Impaired
Jayant Chopade, Pravin Dhulekar, Dr. S. L. Nalbalwar and Dr. D. S. Chaudhari
Department of Electronics and Telecommunication, SNJB's College of Engineering, Chandwad, Maharashtra, India
Department of Electronics and Telecommunication, Dr. B. A. T. University, Lonere, Maharashtra, India
Department of Electronics and Telecommunication, Govt. College of Engg., Jalgaon, Maharashtra, India
Abstract: Auditory masking occurs when the perception of one sound is affected by the presence of another sound, such as noise or an unwanted sound of the same duration as the original sound. Earlier studies have shown that binaural dichotic presentation, using critical-bandwidth-based spectral splitting with perceptually balanced comb filters, helps reduce the effect of auditory masking for persons with moderate bilateral sensorineural hearing impairment. In the present study, speech signals are spectrally split using modified wavelet packets, a combination of the discrete wavelet transform for the first level of decomposition and wavelet packets for the second level, and presented dichotically: odd frequency bands are delivered to the right ear and even bands to the left ear simultaneously. This showed a significant reduction in auditory masking compared to earlier methods. The performance of the proposed method is experimentally evaluated with speech signals of vowel-consonant-vowel syllables for fifteen English consonants.
Keywords: Auditory Masking, Binaural Dichotic Presentation, Sensorineural Hearing Impairment, Modified Wavelet Packets, Cochlea.
1. Introduction
If two sounds of different frequencies (pitches) are played at the same time, two separate sounds can often be heard rather than a combination tone. This ability is known as frequency resolution or frequency selectivity. It is thought to arise from filtering within the cochlea, the hearing organ of the inner ear; the bandwidths of these auditory filters are known as critical bandwidths. A complex sound is split into different frequency components, and each component causes a peak in the pattern of vibration at a specific place on the cilia inside the basilar membrane within the cochlea. These components are then coded independently on the auditory nerve, which transmits sound information to the brain. This individual coding only occurs if the components are sufficiently different in frequency; otherwise they are coded at the same place and are perceived as one sound instead of two [1].

Auditory masking is categorized according to when the masker occurs. One category is non-simultaneous masking, which occurs when the signal and masker are not presented at the same time. It can be split into forward masking, in which the masker is presented first and the signal follows it, and backward masking, in which the signal precedes the masker. The other category is simultaneous masking, the frequency-domain counterpart of temporal masking, which tends to occur between sounds of similar frequencies: a sound is made inaudible by a masker, a noise or unwanted sound of the same duration as the original sound [1]. Masking is greatest when the masker and the signal have the same frequency, and it decreases as the signal frequency moves further away from the masker frequency. This phenomenon is called on-frequency masking and occurs because the masker and signal fall within the same auditory filter. Simultaneous masking reduces
the frequency resolution significantly, so it is more severe than non-simultaneous masking. Auditory masking occurs because the original neural activity caused by the first signal is reduced by the neural activity of the other sound [2].

The objective of our investigation is to split speech signals with the help of modified wavelet packets to form complementary bands that are presented dichotically (presenting two different signals to the two ears is referred to as dichotic presentation), which should considerably reduce the problem of auditory masking compared to earlier methods [3].

The discrete wavelet transform divides the signal spectrum into frequency bands that are narrow at the lower frequencies and wide at the higher frequencies. This limits how wavelet coefficients in the upper half of the signal spectrum are classified. Wavelet packets divide the signal spectrum into frequency bands that are evenly spaced and of equal bandwidth, and will be explored for use in identifying transient and quasi-steady-state speech [4].

The processing schemes were developed as spectral splitting with modified wavelet packets based on ten frequency bands, since the performance of hearing-impaired subjects saturated at around eight channels, while that of normal-hearing subjects was sustained up to 12 to 16 channels in higher background noise [5]. Three different Simulink models were developed based on modified wavelet packets with Daubechies, Symlets and Biorthogonal wavelet functions. The inverse wavelet packet transform was used to synthesize the speech components from the wavelet coefficients of the wavelet packet representation. Table 1 shows the frequency-ordered nodes that correspond to natural order for decomposition levels 0 to 2, whereas Table 2 and Table 3 show the frequency bands based on quasi-octaves.
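The difference between the two band layouts, and the odd/even assignment of bands to the ears, can be sketched as follows. This is only an illustration under stated assumptions: the filters are treated as ideal brick-wall filters, the 10 kHz sampling rate is a hypothetical value, and the function names `dwt_bands`, `wavelet_packet_bands` and `dichotic_split` are invented for this sketch. It does not reproduce the ten-band layout of Tables 2 and 3.

```python
def dwt_bands(fs, depth):
    """Ideal band edges (Hz) of a DWT with `depth` levels, lowest band first.

    Only the approximation (low) half is split again at each level, so the
    bands are narrow at low frequencies and wide at high frequencies.
    """
    upper = fs / 2  # Nyquist frequency
    bands = []
    for _ in range(depth):
        bands.append((upper / 2, upper))  # detail band kept at this level
        upper /= 2
    bands.append((0.0, upper))  # remaining approximation band
    return sorted(bands)


def wavelet_packet_bands(fs, depth):
    """Ideal band edges (Hz) of a full wavelet packet tree: equal bandwidths."""
    width = (fs / 2) / 2 ** depth
    return [(i * width, (i + 1) * width) for i in range(2 ** depth)]


def dichotic_split(bands):
    """Assign odd-numbered bands (1, 3, ...) to the right ear, even to the left."""
    right = bands[0::2]
    left = bands[1::2]
    return left, right


fs = 10_000  # hypothetical sampling rate in Hz (assumption, not from the paper)
print(dwt_bands(fs, 2))             # three bands of unequal width
print(wavelet_packet_bands(fs, 2))  # four bands of equal width
left, right = dichotic_split(wavelet_packet_bands(fs, 2))
print("left ear:", left)
print("right ear:", right)
```

With one DWT level followed by a wavelet packet split of both halves, the resulting layout matches `wavelet_packet_bands(fs, 2)`. Note that the sketch lists bands in frequency order; the terminal nodes of a real wavelet packet tree in natural order are not necessarily frequency ordered.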
Table 1. Frequency ordered terminal nodes for depths 0 to 2.
0 1 2 3 4 5 6 7
Table 2. Ten frequency bands for spectral splitting with compression (for left ear).
Filter for left ear
Band    Centre frequency (kHz)    Pass band frequency (kHz)
Table 3. Ten frequency bands for spectral splitting with compression (for right ear).
Filter for right ear
Band    Centre frequency (kHz)    Pass band frequency (kHz)
During the process of frequency transformation, compression was achieved as the poles were changed, which is useful to hearing-impaired persons with high-frequency impairment. Changes in acoustic attributes such as the averaged power spectrum and formant transitions were also observed [6].
2. Materials and methods
The speech material
Earlier studies have used CV, VC, CVC, and VCV syllables. It has been reported that greater masking takes place in intervocalic consonants due to the presence of vowels on both sides [7]. Since our primary objective is to study the improvement in consonant identification due to a reduction in the effect of masking, VCV syllables are used.

For the evaluation of the speech processing strategies, a set of fifteen nonsense syllables in VCV context with the consonants / p, b, t, d, k, g, m, n, s, z, f, v, r, l, y / and the vowel /  / as in farmer were used. The features selected for study were voicing (voiced: / b d g m n z v r l y / and unvoiced: / p t k s f /), place (front: / p b m f v /, middle: / t d n s z r l /, and back: / k g y /), manner (oral stop: / p b t d k g l y /, fricative: / s z f v r /, and nasal: / m n /), nasality (oral: / p b t d k g s z f v r l y / and nasal: / m n /), frication (stop: / p b t d k g m n l y / and fricative: / s z f v r /), and duration (short: / p b t d k g m n f v l / and long: / s z r y /).
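The feature groupings listed above can be written down as a lookup table, which makes it easy to check that every feature assigns each of the fifteen consonants to exactly one class. The following is a minimal sketch; the names `FEATURES`, `feature_value` and `is_partition` are hypothetical and are not taken from the study.

```python
CONSONANTS = set("p b t d k g m n s z f v r l y".split())

# Feature classes exactly as listed in the text above.
FEATURES = {
    "voicing": {"voiced": "b d g m n z v r l y", "unvoiced": "p t k s f"},
    "place": {"front": "p b m f v", "middle": "t d n s z r l", "back": "k g y"},
    "manner": {"oral stop": "p b t d k g l y", "fricative": "s z f v r", "nasal": "m n"},
    "nasality": {"oral": "p b t d k g s z f v r l y", "nasal": "m n"},
    "frication": {"stop": "p b t d k g m n l y", "fricative": "s z f v r"},
    "duration": {"short": "p b t d k g m n f v l", "long": "s z r y"},
}


def feature_value(feature, consonant):
    """Return the class of `consonant` under `feature`, e.g. 'back' for place of 'k'."""
    for value, members in FEATURES[feature].items():
        if consonant in members.split():
            return value
    raise KeyError(consonant)


def is_partition(feature):
    """True if the classes of `feature` cover all consonants without overlap."""
    groups = [set(members.split()) for members in FEATURES[feature].values()]
    covered = set().union(*groups)
    return covered == CONSONANTS and sum(len(g) for g in groups) == len(CONSONANTS)


for name in FEATURES:
    print(name, is_partition(name))
print(feature_value("place", "k"))
```

A table of this kind could then be used to group stimulus and response consonants by feature when scoring identification results, although the scoring procedure itself is not described in this part of the text.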
The speech processing strategies
For many signals, the low-frequency content is the most important part. It is what gives the signal its identity. The high-frequency content, on the other hand, imparts flavor or nuance. Consider the human voice. If you remove the high-frequency components, the voice sounds different, but you can still tell what is being said. However, if you remove enough of the low-frequency components, you hear gibberish.

In the basic filtering process, the original signal passes through two complementary filters and emerges as two signals. Unfortunately, if we actually perform this operation on a real digital signal, we wind up with twice as much data as we started with. Suppose, for instance, that the original signal consists of 1000 samples of data. Then the resulting signals will each have 1000 samples, for a total of 2000. These signals, A and D, are interesting, but we get 2000 values instead of 1000.
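The doubling of data can be seen in a small sketch. It assumes the simplest possible complementary pair, a two-tap moving average (low-pass) and a two-tap moving difference (high-pass) with periodic extension; the test signal and the name `complementary_filters` are illustrative assumptions and not the filters used in the study.

```python
import math


def complementary_filters(x):
    """Split x into a low-pass signal A and a high-pass signal D.

    A[n] = (x[n] + x[n-1]) / 2 and D[n] = (x[n] - x[n-1]) / 2, with the
    signal treated as periodic, so that A[n] + D[n] == x[n].
    """
    A = [(x[n] + x[n - 1]) / 2 for n in range(len(x))]
    D = [(x[n] - x[n - 1]) / 2 for n in range(len(x))]
    return A, D


# Hypothetical test signal: a slow sine plus a fast alternating component.
x = [math.sin(2 * math.pi * 5 * n / 1000) + 0.1 * (-1) ** n for n in range(1000)]
A, D = complementary_filters(x)

print(len(x), len(A) + len(D))  # 1000 input samples become 2000 output values
```

Because the two filters are complementary, adding A and D sample by sample recovers the original signal, which shows that half of the 2000 values are redundant.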
Multilevel decomposition by DWT
There exists a more subtle way to perform the decomposition using wavelets. By looking carefully at the computation, we may keep only one point out of two in each of the two 1000-length signals and still retain the complete information. This is the notion of downsampling. The process, which includes downsampling, produces two sequences called cA and cD, the DWT coefficients [8]. The decomposition process can be iterated, with successive approximations being decomposed in turn, so that one signal is broken
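Downsampling and iterated decomposition can be sketched with the Haar wavelet, the simplest member of the Daubechies family. This is an assumption made for brevity; the function names `haar_dwt`, `haar_idwt` and `multilevel_dwt` are hypothetical, and the signal length is assumed to be even at every level.

```python
import math


def haar_dwt(x):
    """One DWT level with downsampling: returns (cA, cD), each half as long as x."""
    s = math.sqrt(2)
    cA = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    cD = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return cA, cD


def haar_idwt(cA, cD):
    """Inverse of haar_dwt: rebuilds the signal from cA and cD."""
    s = math.sqrt(2)
    x = []
    for a, d in zip(cA, cD):
        x.append((a + d) / s)
        x.append((a - d) / s)
    return x


def multilevel_dwt(x, levels):
    """Iterate the decomposition on successive approximations."""
    approx, details = list(x), []
    for _ in range(levels):
        approx, d = haar_dwt(approx)
        details.append(d)
    return approx, details


x = [math.sin(2 * math.pi * 5 * n / 1000) for n in range(1000)]
cA, cD = haar_dwt(x)
print(len(cA), len(cD))  # 500 and 500: as many values as the input

approx, details = multilevel_dwt(x, 3)
print(len(approx), [len(d) for d in details])
```

After one level, cA and cD together hold as many values as the input, and `haar_idwt(cA, cD)` rebuilds the signal up to rounding error. Each further level halves the length of the approximation again.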