
Indian J Otolaryngol Head Neck Surg

https://doi.org/10.1007/s12070-021-02757-9

ORIGINAL ARTICLE

Acoustic Voice Analysis of Normal and Pathological Voices in Indian Population Using Praat Software
Prashant Hippargekar1 • Sudhir Bhise1 • Shankar Kothule1 • Sharad Shelke1

Received: 26 April 2021 / Accepted: 4 July 2021


© Association of Otolaryngologists of India 2021

Correspondence: Sharad Shelke, sbshelke996@gmail.com

1 Swami Ramanand Teerth Rural Govt. Medical College, Ambajogai, Maharashtra, India

Abstract Acoustic voice analysis is still a valuable technique which enables voice clinicians to compare voices and differentiate them into normal and abnormal. The present study was undertaken to standardize acoustic voice parameters in normal healthy adult individuals, to compare them between genders, and to analyze the acoustic parameters of pathological voices and compare them with normal healthy voices. Voice samples of vowels /a/, /i/ and /u/ of 80 normal healthy adults (males = 40, females = 40) of the control group and 40 patients with dysphonic voice of the case group were collected, and acoustic voice parameters were extracted by using Praat software. There were statistically significant higher values of fundamental frequency (F0) in females, while jitter local (%), shimmer local (%) and harmonic to noise ratio (HNR) had no gender differences in normal healthy voices. Pathological voices of case group subjects with laryngeal pathologies had statistically significant higher values of jitter local (%) and shimmer local (%) and lower values of HNR as compared to normal healthy voices of the control group. Objective voice analysis by using Praat software is a convenient, reliable and cost-effective method. This study establishes normative acoustic voice parameters in normal healthy adults. There are no gender differences in adult healthy voices except fundamental frequency (F0), which is higher in females. Patients with dysphonic voices due to laryngeal pathologies had altered values of acoustic parameters compared to normophonic adults, and clinicians can precisely differentiate pathological voices from normophonic voices.

Keywords Acoustic voice analysis · Fundamental frequency (F0) · Jitter · Shimmer · Harmonics-to-noise ratio (HNR) · Praat software

Introduction

Voice can be defined as the laryngeal modulation of the pulmonary air stream, which is then further modified by the configuration of the vocal tract. [1] Since speech is the most important means of communication and expression, any voice disorder can have deep implications for social life. [2] Assessment of voice can be done with qualitative and quantitative measures. Qualitative assessment is done subjectively, either by the patient or by professionals. The Voice Handicap Index (VHI) and other such grading systems are subjective measures of voice analysis by the patient. The American Speech Language and Hearing Association (ASHA) has developed the Consensus Auditory Perceptual Evaluation of Voice (CAPE-V) for perceptual assessment of voice by professionals. [1] Objective voice assessment methods analyze voices by devices which are capable of measuring several acoustic parameters, as stated by Almeida. [3] Many software packages have been developed for this purpose, namely Praat, [4] Dr. Speech [5] and Multidimensional Voice Program. [6] Praat (Boersma & Weenink 1992–2013) is a general set of tools for analysing, synthesizing and manipulating speech and other sounds, bundled into a single integrated computer program. Many published studies on voice conditions in the international literature have demonstrated the functionality of the Praat software for differentiating pathological from normal voice. Since it is free and easily used software for all current major computer platforms


(nowadays MacOS, Windows, Linux) and continually updated to accommodate new operating system developments and new analysis methods, it has been applied in voice research worldwide. There is an online discussion group on its website to foster debate and resolve doubts among users and software creators. [7]

Currently, the acoustic voice parameters most commonly used in applications of acoustic analysis, and the most referenced in the literature, are the fundamental frequency (F0), jitter, shimmer and harmonic to noise ratio (HNR). [8, 9] Acoustic voice analysis is an objective, non-invasive, easy-to-use assessment method that offers indirect data on vocal function through specific measures, in normal and pathological conditions, assisting in the diagnostic process and in the monitoring of the treatment of vocal alterations. [10] Standardization of the acoustic data has significant implications for voice clinicians, students of speech and language pathology and manufacturers of instrumentation. Research in this field is still in its early stages in India, and no study was found that standardizes the acoustic parameters by using Praat software. The current study was designed to standardize the acoustic voice parameters in healthy adult individuals in India and to find out differences in the acoustic parameters of pathological voices and their characteristics.

Materials and Methods

Study Design and Ethical Aspects

This was an observational, cross-sectional study approved by the Research Ethics Committee on Human Beings of the home institution, performed after all the participants signed the Informed Consent Form.

Research Subjects

Voice differs between the genders and undergoes changes throughout the lifetime, with greater vocal stability observed in adulthood. [10] The target population for this study comprised both men and women, with a minimum age of 21 years, as voice alterations commonly occur during puberty. The maximum age was 50 years because the voice is also modified by aging of the vocal mechanism. The present study comprises two groups: a control group having healthy voices (40 males and 40 females) and a case group having pathological voices (23 males and 17 females). The study participants were interviewed at the time of voice recording by a speech-language therapist and answered a questionnaire containing questions related to general health conditions, menopause (in the case of women), vocal habits, as well as complaints related to speech, voice and hearing disorders.

Inclusion Criteria for Control Group

Participants of both genders, within the established age range (21–50 years), without any voice related complaints. All the participants underwent auditory perceptual evaluation by one speech language therapist. The speech language therapist also rated each subject’s voice according to the GRBAS scale [11] for sustained vowel /a/, and those subjects having a score of zero were included. The GRBAS (Grade, Roughness, Breathiness, Asthenia and Strain) scheme is probably the most widely used, partly due to its relative simplicity. Each dimension (overall Grade of voice, Roughness, Breathiness, Asthenia and Strain of voice) is rated on a four-point scale by a speech and language therapist, where 0 = no perceived abnormality, 1 = mild, 2 = moderate and 3 = severe abnormality, and finally the total GRBAS score is calculated by averaging all scores. In addition to this, only subjects having a normal laryngeal examination on video directed laryngoscopy (VDL) done by an ENT surgeon were included in the study.

Exclusion Criteria for Control Group

Participants who had a past history of neurological or pulmonary diseases, head and neck surgery, smoking, or speech or hearing complaints according to the questionnaire, who reported cold or allergic respiratory conditions on the day of collection of the voice sample, or who were unable to perform the emission required for the recording of their voices were excluded from the study.

Inclusion Criteria for Case Group

Participants of both genders, within the established age range (21–50 years), who had dysphonia for more than 3 months. All the participants underwent auditory perceptual evaluation by one speech language therapist. Participants having a score of more than or equal to 1 on the GRBAS scale for sustained vowel /a/ were included. In addition to this, each participant underwent VDL examination by an ENT surgeon, and those having benign lesions of the vocal cord (vocal cord nodule, polyps, cysts, Reinke’s edema) or larynx (chronic laryngitis) were included in the study.

Exclusion Criteria for Case Group

Participants having a past history of head and neck surgeries or neurological or pulmonary diseases. Those who had malignant diseases of the larynx or vocal cord paralysis due to any cause, or, on the day of sample collection,


if the participant was suffering from an allergic respiratory condition or common cold, were excluded.

Sample Collection

In a period of one year from 1st June 2019 to 31st May 2020, a single speech language therapist recorded voice samples of all participants (n = 120) in a sound treated room at a tertiary care hospital. The task was explained to each participant until he/she understood it well. Each participant was asked to perform a test trial before the final recording. The microphone (dynamic cardioid) was positioned at 45 degrees laterally to the mouth, to avoid disturbance of the speech sample by nasal breaths, and 10 cm away from the mouth, fixed with the help of a specialized head band to keep a constant distance. The microphone was connected to a Lenovo laptop (Windows 10), where the voice samples were recorded directly in Praat software (version 6.0.43) at a sampling frequency of 44,000 Hz, keeping in view the specifications of the microphone. Each participant was seated comfortably on a chair and requested to produce the sustained vowel /a/ for a minimum of 6 s at comfortable, habitual pitch and loudness, after deep inhalation. Objective measures such as jitter, shimmer and harmonic to noise ratio (HNR) are typically analyzed on long-sustained vowels [6]. At least three final trials were recorded for each participant. If a sample was not recorded appropriately, more trials were carried out. The audio recordings of the sustained emission of the vowel /a/ were imported and edited in the Praat software, discarding the beginning 0.5 s and the end of the recording and selecting the most stable portion of the emission, with a mean duration of 3 s, so as to avoid irregular patterns observed at the start and end of voice signals. In a similar way, voice samples for vowels /i/ and /u/ were recorded. By using Praat software, average acoustic voice parameters, namely fundamental frequency (F0), local jitter (%), local shimmer (%) and harmonic to noise ratio (HNR), from the final three trials were calculated for each vowel of each participant.

Acoustic Voice Parameters

Fundamental Frequency (F0)

F0 is defined as the number of times a sound wave produced by the vocal cords repeats during a given time period. It is also the number of cycles of opening/closure of the glottis. The unit of F0 is Hertz (Hz).

Jitter

Jitter is the measure of cycle-to-cycle variations of the fundamental glottal period. Jitter perturbation can be given by four related parameters: absolute jitter (jitta), the local or relative jitter (jitt), relative average perturbation (rap) and five-point period perturbation quotient (ppq5). In the present study, local/relative jitter (jitt) was used for analysis. Jitter (local/jitt): average absolute difference between consecutive periods, divided by the average period, in percentage [12].

Shimmer

Shimmer is a variation of amplitudes of consecutive periods. For shimmer, there are four related measures: the absolute shimmer, that is the absolute difference in a logarithmic domain (ShdB), given in dB; the local shimmer (Shim), in percentage of the average amplitude; the three-point Amplitude Perturbation Quotient (apq3), also in percentage; and the five-point Amplitude Perturbation Quotient (apq5), also in percentage. In the current study, local shimmer (shim) was used for vocal analysis. Shimmer (local/shim): average absolute difference between the amplitudes of consecutive periods, divided by the average amplitude. This is expressed in percentage (%).

Harmonic to Noise Ratio (HNR)

HNR provides an indication of the overall periodicity of the voice signal by quantifying the ratio between the periodic (harmonic) and aperiodic (noise) components.

Analysis of Result

A two-level analysis was done in this study. In the first part, acoustic voice parameters of normal healthy adult males and females of the control group for vowels /a/, /i/ and /u/ were calculated and compared with each other to find out gender differences in voice. In the second part, average acoustic parameters (jitter local (%), shimmer local (%) and HNR) for normal healthy voices of participants of the control group for vowels /a/, /i/ and /u/ were calculated and compared with the pathological voice parameters of the case group.

Acoustic Parameters of Normal Healthy Voices and Gender Comparison

Mean values of acoustic voice parameters for vowels /a/, /i/ and /u/ of both genders of the control group, with their standard deviation (SD) and p-value, are described in Table 1.
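The definitions of local jitter and local shimmer given under Acoustic Voice Parameters can be illustrated with a minimal Python sketch. It is not Praat's implementation: the function names and the sample period values are hypothetical, and the decibel conversion used for HNR (10 times the base-10 logarithm of the harmonic-to-noise energy ratio) is an assumption that the article does not state explicitly.

```python
import math


def local_perturbation_percent(values):
    """Mean absolute difference between consecutive values,
    divided by the mean value, expressed in percent."""
    if len(values) < 2:
        raise ValueError("need at least two consecutive cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value


def local_jitter_percent(periods_s):
    """Local jitter (%): applied to glottal period durations in seconds."""
    return local_perturbation_percent(periods_s)


def local_shimmer_percent(peak_amplitudes):
    """Local shimmer (%): applied to per-cycle peak amplitudes."""
    return local_perturbation_percent(peak_amplitudes)


def hnr_db(harmonic_energy, noise_energy):
    """HNR in dB, assuming the usual 10*log10 energy-ratio convention."""
    return 10.0 * math.log10(harmonic_energy / noise_energy)


# Hypothetical periods around 7.65 ms (roughly 131 Hz), for illustration only.
example_periods = [0.0076, 0.0077, 0.0076, 0.0077]
example_jitter = local_jitter_percent(example_periods)
```

A perfectly regular signal gives 0% for both measures, and under the assumed convention a signal with 99% harmonic and 1% noise energy has an HNR of about 20 dB.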


Table 1 Acoustic voice parameters of normal healthy adults of control group (mean ± SD)

Vowel  Parameter            Male (n = 40)     Female (n = 40)    P value
/a/    F0 (Hz)              131 ± 9.58        226 ± 17           < 0.0001
       Jitter local (%)     0.30 ± 0.12       0.37 ± 0.15        0.350
       Shimmer local (%)    3.15 ± 0.76       3.31 ± 1.56        0.773
       HNR (dB)             21.60 ± 1.71      22.18 ± 2.01       0.499
/i/    F0 (Hz)              130 ± 5.83        238 ± 16.19        < 0.0006
       Jitter local (%)     0.29 ± 0.11       0.21 ± 0.09        0.083
       Shimmer local (%)    2.45 ± 0.58       2.59 ± 0.65        0.619
       HNR (dB)             22.82 ± 2.66      23.62 ± 2.19       0.469
/u/    F0 (Hz)              131.70 ± 7.10     251.70 ± 16.34     < 0.0003
       Jitter local (%)     0.31 ± 0.12       0.24 ± 0.11        0.196
       Shimmer local (%)    2.80 ± 0.47       2.78 ± 0.88        0.935
       HNR (dB)             25.33 ± 3.56      26.05 ± 4.15       0.679
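The study reports using an unpaired Student t-test for the gender comparison. As a rough illustration of how such a test statistic is formed from summary values, the sketch below computes a pooled-variance t statistic and its degrees of freedom from means, SDs and group sizes. The function name is hypothetical, and because it works from rounded summary values rather than the raw recordings, it is not expected to reproduce the published p-values.

```python
import math


def pooled_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Unpaired Student t statistic (pooled variance) from summary statistics.

    Returns (t, degrees_of_freedom).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / standard_error, df


# F0 for vowel /a/ from Table 1: female 226 ± 17 vs male 131 ± 9.58, n = 40 each.
t_f0, df_f0 = pooled_t_from_summary(226.0, 17.0, 40, 131.0, 9.58, 40)
```

For 78 degrees of freedom the two-sided critical value at the 95% confidence level is about 1.99, so a statistic of this size corresponds to a very small p-value, in line with the reported F0 difference.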

Observations and Results

Gender Wise Distribution of Acoustic Voice Parameters of Normal Healthy Adults

There were statistically significant higher values of fundamental frequency (F0) in females, while jitter local (%), shimmer local (%) and harmonic to noise ratio (HNR) had no gender differences in normal healthy voices, as shown in Table 1.

The fundamental frequency (F0) range for both genders is given in Table 2.

An unpaired Student t-test was applied as the test of significance with a 95% confidence limit. There was a statistically significant gender difference in fundamental frequency (F0) for all three vowels /a/, /i/ and /u/, with females having higher values than males, as shown in Fig. 1.

Fig. 1 Gender comparison of F0 in control group (bar chart of mean F0 for vowels /a/, /i/ and /u/; male: 131, 130, 131; female: 226, 238, 251)

Table 2 Range of fundamental frequency (F0, Hz) for control group

Vowel  Male       Female
/a/    121–153    194–256
/i/    124–161    218–276
/u/    121–157    223–280

Pathological Voice Acoustic Parameters Comparison with Normal Healthy Voices

As already observed, there was a significant gender difference in F0, so F0 was not used for comparison with pathological voices. Average acoustic parameters (jitter local (%), shimmer local (%) and HNR) for the control group (n = 80) were calculated by averaging the parameter values of males (n = 40) and females (n = 40) in the control group. Pathological acoustic voice parameters were compared with normal control parameters, as shown in Table 3. An unpaired Student t-test was used for statistical significance with a 95% confidence limit.

In the present study, the pathological acoustic voice parameters jitter local (%) and shimmer local (%) had statistically significant higher values as compared to the control group for all three vowels /a/, /i/ and /u/. HNR of pathological voices had statistically significant lower values than normal healthy voices for vowels /a/, /i/ and /u/.

Discussion

Voice is a personal feature; no voice is perfectly equal to any other. There are well known differences between male and female voices and across languages and ethnological backgrounds. Dysphonia is a descriptive medical term meaning disorder of voice. Diseases that affect the larynx


Table 3 Comparison of pathological voice with normal healthy voice (mean ± SD)

Vowel  Parameter            Control value (n = 80)   Case value (n = 40)   P value
/a/    Jitter local (%)     0.34 ± 0.13              0.89 ± 0.87           0.0041
       Shimmer local (%)    3.23 ± 1.20              6.11 ± 2.09           0.0001
       HNR (dB)             21.89 ± 1.84             14.80 ± 6.56          < 0.0007
/i/    Jitter local (%)     0.25 ± 0.10              1.23 ± 1.07           0.002
       Shimmer local (%)    2.52 ± 0.60              4.86 ± 2.15           0.0006
       HNR (dB)             23.22 ± 2.40             16.98 ± 6.76          0.0006
/u/    Jitter local (%)     0.27 ± 0.12              1.30 ± 0.60           < 0.00001
       Shimmer local (%)    2.79 ± 0.68              6.59 ± 1.99           < 0.00001
       HNR (dB)             25.69 ± 3.78             14.29 ± 4.82          < 0.00001
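The control values in Table 3 are described as averages of the male and female values (n = 40 each). The following minimal sketch, with hypothetical variable and function names, combines the vowel /a/ group means from Table 1 by sample-size weighting; with equal group sizes this reduces to a simple average of the two means.

```python
def combined_mean(mean_a, n_a, mean_b, n_b):
    """Sample-size-weighted mean of two group means."""
    return (mean_a * n_a + mean_b * n_b) / (n_a + n_b)


# Vowel /a/ group means taken from Table 1 (male n = 40, female n = 40).
male_means = {"jitter_local_pct": 0.30, "shimmer_local_pct": 3.15, "hnr_db": 21.60}
female_means = {"jitter_local_pct": 0.37, "shimmer_local_pct": 3.31, "hnr_db": 22.18}

control_means = {
    name: combined_mean(male_means[name], 40, female_means[name], 40)
    for name in male_means
}
```

The resulting shimmer and HNR means agree with the control values listed for /a/ in Table 3, and the jitter mean agrees to within rounding.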

cause changes in the patient’s vocal quality. The most common signs that may indicate changes in the larynx are hoarseness, breathiness and roughness. Transient hoarseness may result from abuse of the voice or a casual flu, but when hoarseness persists and becomes characteristic of the voice, it is indicative of pathology of the larynx [13, 14]. Hoarseness can also be an early symptom of cancer of the larynx, as noted by Teixeira et al. [15] The most common pathologies affecting voice are vocal nodules, laryngitis, paralysis of the vocal cord, polyps, cysts and Reinke’s edema. Other pathologies of the larynx that may lead to dysphonic speech are contact ulcers, as noted by Lopes. [9] With the existence of normative databases characterizing voice quality, or using intelligent tools combining several parameters, it is intended to distinguish between normal and pathological voice or even identify or suggest the pathology. [9]

Acoustic analysis has been reported in the literature as a useful tool to evaluate and characterize pathological vocal signals [16, 17] and to show statistically significant differences with respect to normal subjects. [18, 19] Lieberman recorded 23 voices from speakers who had pathologic growths on their vocal cords. It was found that they had larger perturbations than did normal speakers with the same median fundamental periods. This may be related to the size of the pathological growth. [20] This showed that the study of perturbation measures is important for assessment of voice. Hence voice parameters like jitter [20], shimmer [21] and harmonics-to-noise ratio [22] were extensively studied. Many authors studied both normal and pathological voices using acoustic parameters such as fundamental frequency (F0), jitter, shimmer and HNR.

The F0 is one of the measures most frequently used by clinicians to characterize human voice; it yields cues about age, sex and individual height and is related to mechanisms such as vocal fold length, mass and strain. Thus, lengthening the vocal folds will cause the glottic cycles to occur faster, yielding higher resulting frequencies. [7] In the present study, normal healthy females had higher values of F0 than their male counterparts, which is consistent with reports in the literature. [6] The gender differences in F0 can be explained by marked anatomic differences in the larynges of men and women. A male larynx appears to be approximately 40% bigger than that of a female, and the male vocal folds consist of a thicker mass [6].

In the present study, we found that local jitter (%) for vowels /a/, /i/ and /u/ had no gender differences in the control group but had statistically significant higher values in the case group as compared to controls; this result is consistent with the study done by Paulo et al. [12] Jitter is affected mainly by lack of control of vibration of the cords, and the voices of patients with pathologies often have higher values of jitter [8]; similar findings are reported in this study for vowels /a/, /i/ and /u/.

Shimmer changes with the reduction of glottal resistance and with mass lesions on the vocal cords, and it is correlated with the presence of noise emission and breathiness. Patients with pathologies are expected to have higher values of shimmer [12]; similar results were observed in the current study for relative shimmer (%) for vowels /a/, /i/ and /u/.

The HNR parameter is usually measured as an overall characteristic of the signal, and not as a function of frequency. The overall value of the HNR of the signal varies because different vocal tract configurations involve different amplitudes for the harmonics. Analyzing the results, we found no statistical difference between genders in the control group for HNR for vowels /a/, /i/ or /u/, which is not supported by Ambreen et al. [6] but is supported by the study done by Toran et al. [23], while the pathological case group had lower values compared to the control group for vowels /a/, /i/ and /u/.

Limitation of this study: As this study primarily focuses on evaluating voices by the objective method of acoustic voice analysis using Praat software, we did not evaluate voices completely/thoroughly by perceptual assessment methods


and/or other objective voice analysis methods other than acoustic voice analysis using Praat software. The GRBAS scale was used just for inclusion and exclusion of the subjects, but it was not further analysed for voice analysis.

Conclusion

This study establishes normative acoustic voice parameters for both genders. Normal healthy female voices have a higher fundamental frequency value as compared to male voices. Patients with dysphonic voices due to laryngeal pathologies had altered values of acoustic voice parameters compared to normophonic adults, and health professionals can precisely differentiate pathological voices from normophonic voices.

Author Contributions Additional declarations for articles in life science journals that report the results of studies involving humans and/or animals: no use of animals in this study.

Funding Nil. No funds, grants, or other support was received.

Declaration

Conflict of interest The authors have no relevant financial or non-financial interests to disclose.

Consent to Participate Informed consent was obtained from all individual participants included in the study.

Consent to Publication The participant has consented to the submission of the case report to the journal.

Ethical Approval Approved by the Research Ethics Committee on Human Beings of the home institution. All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards.

References

1. Sataloff RT, Benninger MS (eds) (2016) Sataloff’s comprehensive textbook of otolaryngology: head and neck surgery (laryngology), vol 4. JAYPEE. https://doi.org/10.5005/jp/books/12711
2. Lara E, Tavares M, De Labio RB, Helena R, Martins G (2010) Normative study of vocal acoustic parameters from children from 4 to 12 years of age without vocal symptoms. A pilot study. Braz J Otorhinolaryngol 76(4):485–490
3. Almeida N (2010) Sistema Inteligente para Diagnostico da Patologias na Laringe Utilizando Maquinas de Vetor de Suporte. MSc, Universidade Federal Rio Grande do Norte, Natal, Brasil
4. Boersma P, Weenink D (2007) Praat: doing phonetics by computer (Version 4.5) [Computer program]. Retrieved from http://www.praat.org/
5. Smits I, Ceuppens P, De Bodt MS (2005) A comparative study of acoustic voice measurements by means of Dr. Speech and computerized speech Lab. J Voice 19(2):187–196
6. Ambreen S, Bashir N, Tarar SA, Kausar R (2017) Acoustic analysis of normal voice patterns in Pakistani adults. J Voice pp 1–28
7. Finger LS, Cielo CA, Schwarz K (2009) Acoustic vocal measures in women without voice complaints and with normal larynxes. Braz J Otorhinolaryngol 75(3):432–440. https://doi.org/10.1016/S1808-8694(15)30663-7
8. Paulo J, Odete P (2015) Acoustic analysis of vocal dysphonia. Procedia Comput Sci 64:466–473
9. Teixeira JP, Oliveira C, Lopes C (2013) Vocal acoustic analysis - Jitter, Shimmer and HNR parameters. Procedia Technol 9:1112–1122
10. Spazzapan EA, Cardoso VM, Fabron EMG, Berti LC, Brasolotto AG, de Castro Marino VC (2018) Acoustic characteristics of healthy voices of adults: from young to middle age. CoDAS [Internet] 30(5):e20170225
11. Hirano M, McCormick KR (1986) Clinical examination of voice. J Acoust Soc Am 80(4):1273
12. Paulo J, Gonçalves A (2014) Accuracy of jitter and shimmer measurements. Procedia Technol 16:1190–1199
13. Teixeira JP, Goncalves A (2016) Algorithm for jitter and shimmer measurement in pathologic voices. Procedia Comput Sci 100:271–279
14. Teixeira JP, Fernandes PO (2014) Jitter, Shimmer and HNR classification within gender, tones and vowels in healthy voices. Procedia Technol 16:1228–1237
15. Teixeira JP, Ferreira D, Carneiro S (2011) Análise acústica vocal - determinação do Jitter e Shimmer para diagnóstico de patalogias da fala. In 6º Congresso Luso-Moçambicano de Engenharia. Maputo, Moçambique
16. Shao J, MacCallum JK, Zhang Y, Sprecher A, Jiang JJ (2010) Acoustic analysis of the tremulous voice: assessing the utility of the correlation dimension and perturbation parameters. J Commun Disord 43:35–44
17. Barsties B, Bodt MD (2015) Assessment of voice quality: current state-of-the-art. Auris Nasus Larynx 42:183–188
18. Sussman JE, Tjaden K (2012) Perceptual measures of speech from individuals with Parkinson’s disease and multiple sclerosis: intelligibility and beyond. J Speech Lang Hear Res 55:1208–1219
19. Rusz J, Cmejla R, Ruzickova H, Ruzicka E (2011) Quantitative acoustic measurements for characterization of speech and voice disorders in early untreated Parkinson’s disease. J Acoust Soc Am 129:350–367
20. Lieberman P (1963) Some acoustic measures of fundamental periodicity of normal and pathologic larynges. J Acoust Soc Am 35:344
21. Horii Y (1979) Fundamental frequency perturbation observed in sustained phonation. J Speech Hearing Res 22(1):5–19
22. Yumoto E, Sasaki Y, Okamura H (1984) Harmonics-to-noise ratio and psychophysical measurement of the degree of hoarseness. J Speech Hear Res 27(1):2–6
23. Toran KC, Lal BK (2009) Objective analysis of voice in normal young adults. Kathmandu Univ Med J 7(4):374–377

Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
