

Original Article
Annals of Otology, Rhinology & Laryngology

External Validation of the Acoustic Voice Quality Index Version 03.01 With Extended Representativity

1–13
© The Author(s) 2016
Reprints and permissions: sagepub.com/journalsPermissions.nav
DOI: 10.1177/0003489416636131
aor.sagepub.com

Ben Barsties, BHth1,2 and Youri Maryn, PhD1,3,4

Abstract
Objectives: The Acoustic Voice Quality Index (AVQI) is an objective method to quantify the severity of overall voice
quality in concatenated continuous speech and sustained phonation segments. Recently, AVQI was successfully modified
to be more representative and ecologically valid because the internal consistency of AVQI was balanced out through equal
proportion of the 2 speech types. The present investigation aims to explore its external validation in a large data set.
Methods: An expert panel of 12 speech-language therapists rated the voice quality of 1058 concatenated voice samples
varying from normophonia to severe dysphonia. The Spearman rank-order correlation coefficients (r) were used to
measure concurrent validity. The AVQI’s diagnostic accuracy was evaluated with several estimates of its receiver operating
characteristics (ROC).
Results: Eight of the 12 experts were retained on the basis of the reliability criteria. A strong correlation was identified
between AVQI and the auditory-perceptual rating (r = 0.815, P < .001), indicating that 66.4% of the variation in the
auditory-perceptual rating was explained by AVQI. Additionally, the ROC results again showed the best diagnostic outcome at a
threshold of AVQI = 2.43.
Conclusions: This study highlights external validation and diagnostic precision of the AVQI version 03.01 as a robust and
ecologically valid measurement to objectify voice quality.

Keywords
voice, acoustic analysis, voice disorders, larynx, laryngology, otolaryngology

1 Faculty of Medicine and Health Sciences, University of Antwerp, Belgium
2 Medical School, Hochschule Fresenius University of Applied Sciences, Hamburg, Germany
3 European Institute for ORL, Sint-Augustinus Hospital, Antwerp, Belgium
4 Faculty of Education, Health & Social Work, University College Ghent, Belgium

Corresponding Author:
Ben Barsties, University of Antwerp, Universiteitsplein 1, WILRIJK (Antwerp), 2610, Belgium.
Email: ben.barsties@t-online.de

Introduction

Voice quality assessments are standard tools in clinical practice and research. There are several methods to determine voice quality perceptually, acoustically, or aerodynamically.1 The auditory-perceptual judgment is the most popular and most used method to evaluate voice quality.2 Furthermore, it is the logical candidate for a gold standard assessment. First, the voice sound is perceptual in nature, and judging voice quality is the response of the brain to specific acoustic features in the voice sound.2 Second, all accepted definitions about voice quality are based on perceptual explanations.3

Unfortunately, there are several limitations associated with auditory-perceptual judgment that have clearly affected clinical utility. These limitations can be divided into 3 categories: listener, stimulus, and scale.1 All of these affecting factors influence the validity and reliability of this evaluation method. Therefore, it should be essential to focus on the development of objective tools with sufficient reliability and validity. Acoustic analysis of the voice signal is an obvious candidate because acoustic measurements are the most used diagnostic instruments to identify voice disorders in research.4

Recently, overall voice quality has been able to be acoustically judged for continuous speech and sustained phonation with sufficient diagnostic accuracy and reliability with, for example, the Acoustic Voice Quality Index (AVQI) proposed by Maryn et al.5 The AVQI is a multivariate construct based on linear regression analysis that combines several acoustic markers to yield a single score. This acoustic index correlates reasonably well with the auditory-perceptual judgment of overall voice quality. The analyses in AVQI are based on the concatenation of continuous speech and sustained vowel segments by quantifying 1 score for the whole voice sample. Concatenated voice samples for AVQI analyses were used for the following reasons. First, it seemed essential that the evaluation of voice quality be based on continuous

Downloaded from aor.sagepub.com at UNIV CALIFORNIA SAN DIEGO on March 14, 2016

speech and sustained phonation if it is to be considered ecologically valid.5 Sustained vowels and continuous speech yielded significant differences in their ratings of degree of voice quality. Furthermore, both represent a different focus of voice/speech context (ie, specific vocal functions caused by controlled, reasonably stable, and sustained phonation and characteristics in natural speech varying voicing patterns and sounds, respectively).5,6 Second, the auditory-perceptual judgment of both speech types with 1 average score showed a strong proportional relationship without significant differences to the post hoc average score of the single ratings on continuous speech and sustained phonation and to a bivariate model that weighed the separate speech type ratings.6

Further investigations about AVQI showed consistent and acceptable diagnostic precision,5,7-11 consistent and high concurrent validity,5,7-12 robust inter-language phonetic differences,8-12 and high sensitivity in voice changes through voice therapy.7 In the majority of all these investigations, the programs Speech Tool13 and Praat14 were used to analyze AVQI. Recently, the smoothed cepstral peak prominence (ie, the main factor in the multivariate AVQI model and analyzed with Speech Tool) has been implemented in Praat, and thus, the use of Speech Tool might be expendable. A current investigation revealed that the outcomes of the original AVQI version with the 2 programs and the second AVQI version only in Praat are highly comparable in AVQI results.15

The next step in the AVQI development was to establish equal proportion of the continuous speech and sustained vowel to reach higher ecological validity and balanced out internal consistency.16 Therefore, the duration from continuous speech was expanded from a range of 17 to 22 syllables10 to around 34 syllables16 because the length of continuous speech is significantly lower for the analysis, after separating voice to voiceless segments, than the constant duration of sustained vowel (Figure 1). Although the evaluation from Barsties and Maryn16 was found successfully in 60 voice samples for a new weighted AVQI model with extended representativity (ie, AVQI version 03.01), it is essential for the results of any single study to be replicated with alternative samples in a larger sample set. Therefore, this investigation aimed to explore external validity (ie, the ability to reproduce results with alternative subjects and in settings outside the initial study) of the new weighted equation in the AVQI version 03.01 with a completely new and independent large set of normophonic and dysphonic voice samples and an associated group of auditory-perceptual judges.

Methods

Subjects

All subjects were recruited from the ENT caseload of the Sint-Jan General Hospital in Bruges, Belgium. The group consisted of 970 participants with dysphonia and 88 healthy subjects without any reported voice complaints and voice disorders. The dysphonia group presented various organic and nonorganic etiologies and various degrees in dysphonia severity. Table 1 summarizes further subject details, such as gender, age, and voice disorder.

This study consisted of a retrospective and non-interventional re-analysis of earlier recordings, and therefore no advice/consent of our Ethics Committee was needed.17

Voice Samples

Every voice sample from each subject contained 2 kinds of speech types. Both of them were recorded with comfortable pitch and loudness. First, the subject had to sustain the vowel [a:] longer than 3 seconds, and for the analyses, a selection of 3 seconds of the mid-vowel portion of the vowel [a:] was used. Second, a read aloud Dutch phonetically balanced text “Papa en Marloes”18,19 was used. All recordings were conducted in a soundproof booth with an AKG C420 head-mounted condenser microphone digitized at 44 100 samples per second,20 that is, a sampling rate of 44.1 kHz, and 16 bits of resolution using the Kay Pentax Computerized Speech Lab model 4500.

The signal-to-noise ratio (SNR) by Deliyski et al21,22 was used to verify post hoc the level of environmental noise of the voice recordings. All voice samples were consistent with the recommended SNR norm for acceptable circumstances of acoustic recordings and analysis. The results showed a mean SNR of 38.56 dB and SD of 3.78 dB.

The analyzed voice samples of both recorded speech types were concatenated, and they were constituted as recommended from Barsties and Maryn16 as follows: First, the continuous speech part was trimmed to the first 34 syllables of the text. Second, the 3 seconds of the sustained vowel [a:] segment was appended. Third, every voice sample was saved as a single sound wave in WAV-format.

Auditory-Perceptual Evaluation

The procedure of auditory-perceptual judgment was identical as described by Barsties and Maryn16 using only a new and extended panel of raters. In the present investigation, 12 native Dutch speech-language therapists rated overall voice quality. The panel consisted of 9 females and 3 males who specialized in voice disorders and had professional experience in auditory-perceptual judgment ranging from 4 to 41 years (mean = 22.3 years, SD = 11.4 years). Each listener rated overall voice quality of each concatenated voice sample with 1 severity degree for the whole sample. They used the grade (G) from the GRBAS scale,23 which, according to Hirano,23 represents the degree of hoarseness or voice abnormality, and the grade concurred with overall voice quality. As recommended by Wuyts et al,24 the judges used the ordinal 4-point equal-appearing interval scale (ie, 0 = normal/absence of hoarseness, 1 = slightly


Figure 1.  Oscillogram examples of the voice samples for (A) AVQI 02.02 as in Maryn and Weenink15 or earlier5,7-12 and (B) AVQI
03.01 as in Barsties and Maryn16: (A) Upper left: connected speech with 17 syllables, upper right: 3-second sustained vowel [a:], and
lower: concatenation of these 2 sound files in which the continuous speech is already separated into concatenated voiced segments.
(B) Upper left: connected speech with 34 syllables, upper right: 3-second sustained vowel [a:], and lower: concatenation of these 2
sound files in which the continuous speech is already separated into concatenated voiced segments.

hoarse, 2 = moderately hoarse, 3 = severely hoarse). All voice samples were provided in a quiet room with a low ambient noise level lower than 40 dBA, measured with a calibrated PCE-322A sound level meter. They were presented to each listener individually at a comfortable loudness level through an external soundcard from Creative Soundblaster x-fi 5.1 USB and a Beyerdynamic DT 770 PRO 80Ω headphone. Every listener was allowed to repeat each voice sample as often as necessary to make a final decision of judgment.

All voice samples were judged randomly in several sessions. Every rating session contained about 250 voice samples. Furthermore, all judges were blinded regarding the identity, diagnosis, and disposition of the voice samples. To assess intrarater reliability, 104 voice samples, approximately 10% of the 1058 voice samples, were selected randomly. These voice samples were repeated a second time at the end of the perceptual judgment without informing the listeners that stimuli were repeated.

To control internal factors such as fatigue, attention, and low concentration level as described by Kreiman et al,25 a short break after every twenty-fifth rating was used. Furthermore, anchor voices were used to putatively increase the reliability of listener ratings.26 In total, 6 concatenated voice samples from the database of previously perceptually judged investigations were selected as anchor voices. The selection criteria of these voices were based on prior unanimous agreement across judges adhering to the 3 severity degrees of slightly, moderately, and severely hoarse. This high consensus in the specific severity degrees of hoarseness from different raters enables the use of these voices as reference patterns of a specific level of G. In total, 2 sets of continuously increasing hoarseness level (ie, 3 samples per set) were provided for the listeners as anchors. The 2 sets distinguished between 2 chief subtypes of hoarseness (ie, breathiness and roughness) recognized in various scientific


papers.27-29 Each listener heard these 2 sets at the beginning and after the break of every twenty-fifth rating.

Table 1.  Descriptive Data of Dichotomous Factors of the 1058 Subjects.

Variable                                        Results
Gender
  Male, No.                                     386
  Female, No.                                   672
Female age, y (mean ± SD)                       40.31 ± 19.36
Male age, y (mean ± SD)                         44.41 ± 23.15
Normal, No.                                     88
Voice disorder, No.
  Functional dysphonia                          221
  Nodules                                       201
  Paralysis/paresis                             132
  Polypoid mucosa (edema)                       73
  Cyst                                          35
  Reflux laryngitis                             28
  Polyp                                         28
  Presbylarynx                                  26
  Tumor                                         23
  Chronic laryngitis                            22
  Post-phonosurgery                             17
  Thyroidectomy                                 15
  Sulcus vocalis                                14
  Trauma                                        12
  Ventricular hypertrophy                       12
  Acute laryngitis                              11
  Leukoplakia                                   10
  Post-radiotherapy                             9
  Vocal fold atrophy                            9
  Granuloma                                     7
  Hyperkeratosis                                6
  Spasmodic dysphonia                           6
  Mutational falsetto                           6
  Neurological disorder                         6
  Dysarthrophonia                               5
  Vocal fold scar                               5
  Papillomatosis                                5
  Preoperative baseline before thyroidectomy    4
  Laryngectomy                                  3
  Fibromyalgia                                  3
  Post-transoral robot surgery                  2
  Web                                           2
  Transgender                                   2
  Genetic disorders                             2
  Pseudopolyp                                   1
  Tumorectomy head-neck                         1
  Pachydermy                                    1
  Hemorrhage                                    1
  Corticopathology                              1
  Postoperative neurological surgery            1
  Tracheitis                                    1
  Stenosis                                      1

Acoustic Measures

All acoustic analyses were applied to only voiced segments of continuous speech as determined by the automated detection Praat script of Maryn et al5 and an appendage of a 3-second [a:] segment to this chain of voiced text segments. The following acoustic analyses were performed on the entire segments of only-voiced continuous speech and sustained phonation.

For calculating AVQI, 6 acoustic parameters were determined on these concatenated sound files with the software Praat: smoothed cepstral peak prominence (CPPs), harmonics-to-noise ratio (HNR), shimmer local (Shim), shimmer local dB (ShdB), general slope of the spectrum (Slope), and tilt of the regression line through the spectrum (Tilt). The smoothed CPP is the distance between the first rahmonic’s peak and the point with equal quefrency on the regression line through the smoothed cepstrum. The HNR is the base-10 logarithm of the ratio between the periodic energy and the noise energy, multiplied by 10. The Shim is the absolute mean difference between the amplitudes of successive periods, divided by the average amplitude. The ShdB is the base-10 logarithm of the difference between the amplitudes of successive periods, multiplied by 20. The general Slope is the difference between the energy in 0 to 1000 Hz and the energy in 1000 to 10 000 Hz of the long-term average spectrum. The Tilt is the difference between the energy in 0 to 1000 Hz and the energy in 1000 to 10 000 Hz of the trendline through the long-term average spectrum.

To calculate AVQI with the software Praat, the following equation was used according to Barsties and Maryn16:

\[
\mathrm{AVQI}_{03.01} = \bigl[\,4.152 - (0.177 \times \mathrm{CPPs}) - (0.006 \times \mathrm{HNR}) - (0.037 \times \mathrm{Shim}) + (0.941 \times \mathrm{ShdB}) + (0.01 \times \mathrm{Slope}) + (0.093 \times \mathrm{Tilt})\,\bigr] \times 2.8902
\]

Furthermore, the complete AVQI script is attached in the appendix to run the AVQI version 03.01 in Praat.

Statistics

The statistical analyses for concurrent validity and diagnostic accuracy were completed using SPSS for Windows version 22.0 (IBM Corp, Armonk, New York, USA). The rater reliability was analyzed using the software package of r-Studio v. 3.0.1 (R Core Team, Vienna, Austria).

Intrarater Reliability

The intrarater reliability of the 12 raters was assessed using the Cohen’s kappa coefficient (Ck). This statistic is a chance


corrected index of the agreement between the ratings of 2 judges or 2 ratings, yielding values of Ck = 1 for perfect agreement and Ck = 0 when agreement is no better than that by chance.30 Ck was considered reasonably reliable from kappa = 0.41 because this kappa value was evaluated as moderate in strength of agreement.31

Furthermore, significant changes (ie, considered statistically significant at P ≤ .01) in Ck values were tested using bootstrapping with 10 000 replications based on a script by Van Belle.32

Interrater Reliability

To assess the agreement of interrater reliability among the 12 judges, we computed the kappa coefficient according to Fleiss33 (Fk), who extended the Ck for more than 2 judges. Fk was considered reasonably reliable from kappa = 0.41 because this kappa value was evaluated as moderate in strength of agreement.31 Furthermore, significant changes (ie, considered statistically significant at P ≤ .01) in Fk values were tested using bootstrapping with 10 000 replications based on a script by Van Belle.32

Selection of the Rater Group

To establish a group of raters with homogeneous and representative reliability, the following criteria were applied (in addition to their longstanding experience in clinically rating voice quality as speech-language therapists).

First, no significant differences (ie, considered statistically significant at P ≤ .01) were found in intrarater Ck results using bootstrapping with 10 000 replications between all pairs of raters. Second, each rater approached intrarater reliability with a level of Ck ≥ 0.41 because the strength of agreement was evaluated as minimally moderate.31 Third, all remaining raters with representative and comparably high intrarater reliability were analyzed to find a homogeneous rater group with interrater reliability of Fk ≥ 0.41. This level of kappa values is related to a minimum of moderate strength of agreement.31 Therefore, no significant changes (ie, considered statistically significant at P ≤ .01) were found between the Fk for all tested raters and the Fk for 1 excluded rater of the group. If the Fk result is significantly better by excluding a rater, the rater with the highest significant value has to be excluded for the next round. Thus, in each round we used a backward stepwise method to exclude the rater with the highest significant kappa value in comparison with the Fk for all tested raters. This procedure was repeated until a minimum kappa value of ≥0.41 was achieved without significantly better Fk results for 1 rater of the group who was excluded in comparison with the Fk for all tested raters.

Concurrent Validity

In order to investigate the criterion-related concurrent validity of AVQI version 03.01, the correlation between the Gmean (ie, average G score over the raters with the best reliability) and AVQI was calculated with the Spearman rank-order correlation coefficient (r) and the coefficient of determination (r²). Interpretation guidelines for r were provided by Frey et al.34

Diagnostic Accuracy

Several estimates were calculated to evaluate the diagnostic accuracy of AVQI version 03.01. Diagnostic precision of a measure is commonly evaluated by its sensitivity (ie, the proportion of hoarse voices that correctly test positive on AVQI) and specificity (ie, the proportion of normal voices that correctly test negative on AVQI). However, depending on the AVQI’s threshold chosen to define a positive result, its sensitivity and specificity can vary. This trade-off between sensitivity and specificity can be graphically produced by generating the receiver operating characteristic (ROC) curve. The ROC curve of AVQI is created by plotting points of AVQI thresholds. The true positive rate (ie, sensitivity) is shown on the ordinate and the false positive rate (ie, 1−specificity) on the abscissa. As mentioned by Barsties and Maryn,16 the absence of hoarseness in a voice sound was considered when modal agreement between the judges rated the voice as normal (ie, Gmean < 0.5). A hoarse voice was considered as soon as the mean rating was rounded off to about Gmean ≥ 0.50. Thus, hoarseness ratings ranged from Gmean ≥ 0.50 to ≤ 3.

To estimate the discrimination power of AVQI by assessing between normal and hoarse voices, the area under ROC curve (ie, AROC) was used. An AROC = 1.0 (ie, AROC = 100%) is found for measures that perfectly distinguish between normal and hoarse voices. An AROC = 0.5 (ie, AROC = 50%) corresponds with chance-level diagnostic accuracy.35 Furthermore, likelihood ratios also had to be calculated to provide additional evidence regarding the value of a diagnostic measure and to help reduce problems with sensitivity/specificity related to the base rate differences in the samples (ie, the uneven percentages of 8% normophonia and 92% dysphonia in the 1058 voice samples).36 The likelihood ratio for a positive result (LR+) yields information regarding how the odds of the disease increase when the test is positive. The LR+ provides information regarding the likelihood that an individual is hoarse when testing positive. The likelihood ratio for a negative result (LR−) is an estimate that helps to determine if an individual does not have a particular disorder when they test negative on the diagnostic test. The LR− provides information regarding the likelihood that an individual has a normal voice quality when testing negative. As a general guideline, the diagnostic


Figure 2.  Frequency distribution of the mean auditory-perceptual overall voice quality ratings (average of G scores of the 8 identified
judges) of the 1058 concatenated voice samples.

value of a measure is considered to be high when LR+ ≥10 and LR− ≤0.1.36 Using LR statistics is more appropriate in choosing the optimal AVQI threshold than using ROC statistics alone because LR statistics consider sensitivity and specificity simultaneously. Due to this, LR statistics are less vulnerable to sample size characteristics and base rate differences in the sample between normal and hoarse voices.36

Results

Intrarater reliability showed no significant differences in Ck values (t = 12.824, P = .306) between all 12 raters, but 1 rater did not reach the minimum of acceptable reliability level (Ck = 0.32) and had to be excluded. The remaining 11 raters had a range of Ck between 0.41 and 0.58.

Interrater reliability was executed on the remaining 11 raters that reached an Fk = 0.39, and 5 raters showed a significantly better Fk result if they were excluded in comparison to the Fk of all tested 11 raters (t = 18.985, P < .001 to t = 7.576, P = .006). After the fourth round, an Fk = 0.43 was found with a group of 8 raters and simultaneously showed no significantly better Fk results if 1 rater of this group was excluded (t = 7.25, P = .011 to t = 0.757, P = .384). Therefore, all analyses of perceptual Gmean ratings were conducted on the panel of the particular 8 raters mentioned previously. Figure 2 shows a distribution of the 1058 evaluated voice samples by this panel.

Figure 3.  Scatterplot and linear regression line illustrating the proportional relationship between AVQI version 03.01 and Gmean (the 2 lines above and under the regression fit line delineate the upper and lower boundaries, respectively, of the 95% prediction interval).

The results indicated significant and marked34 concurrent validity between the AVQI values and Gmean ratings for those same samples as provided by the selected judges (r = 0.815, P < .001; Figure 3), indicating that 66.4% (ie, r² = 0.664) of the variance in Gmean was accounted for by AVQI.
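
The link between the reported correlation and the explained variance can be checked in a few lines of Python. The sketch below is not the authors' SPSS procedure: it is a minimal pure-Python Spearman implementation that uses average ranks for ties, and the six AVQI/Gmean pairs are hypothetical toy values chosen only to exercise the function. The only figure taken from the study is r = 0.815.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data (NOT from the study), for illustration only.
toy_avqi = [1.2, 2.0, 2.9, 4.1, 5.5, 7.3]
toy_g_mean = [0.0, 0.25, 0.75, 0.5, 1.9, 2.6]
toy_rho = spearman(toy_avqi, toy_g_mean)

# Coefficient of determination implied by the correlation reported in the study.
reported_r = 0.815
reported_r_squared = round(reported_r ** 2, 3)
print(f"toy rho = {toy_rho:.3f}, reported r^2 = {reported_r_squared}")
```

Squaring the reported coefficient reproduces the 66.4% of explained variance given in the text.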


Figure 4.  Receiver operating characteristic curve illustrating the diagnostic accuracy of AVQI version 03.01.

To evaluate AVQI’s potential to distinguish subjects with normal voice quality from abnormal voice quality, a ROC curve was constructed (Figure 4). The AROC was 0.923 (ie, 92.3%) and confirmed the excellent discriminatory power of AVQI in differentiating between normal and hoarse voices. Furthermore, the AVQI threshold of 2.43 showed the optimal demarcation point between normal and hoarse voices, which concurs with the previous result from Barsties and Maryn.16 At this threshold, the best balance between sensitivity and specificity was achieved with respectable sensitivity = 0.785 and excellent specificity = 0.932. Additionally, at this threshold the likelihood ratios met the statistical criterion for LR+ for the first time (LR+ = 11.54), together with a respectable LR− = 0.23.

Discussion

The present study aimed to explore external validation and diagnostic precision of the AVQI version 03.01. The results indicate that the new extended AVQI successfully corresponds to perceptual ratings of overall voice quality. Furthermore, the AVQI version 03.01 has a strong external validity as an effective correlate with high diagnostic accuracy of perceptual evaluation of overall voice quality with alternative voice samples and judges. The intra- and interrater reliability reached moderate strength of agreement, and the results are comparable to other studies that investigate the reliability of concatenated voice samples.3,5,8,16,37

Although the present study has first, an extremely large data set of more than 1000 analyzed voice samples; second, more exact selection of the rater panel based on the knowledge of several affecting factors disturbing the perceived judgment1; and third, more critical statistical selection criteria in rater reliability, the results are comparable to the previous results about AVQI’s version 03.01. Thus, the concurrent validity (ie, r = 0.815 in the current study vs r = 0.929 in the study by Barsties and Maryn16) and diagnostic accuracy with the threshold of 2.43 (ie, sensitivity = 0.785 and specificity = 0.932 in the present study vs sensitivity = 0.936 and specificity = 1 in the study by Barsties and Maryn16) are comparable.
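
As a rough illustration of the quantities discussed here, the following Python sketch evaluates the AVQI 03.01 equation from the Methods and derives the likelihood ratios from the sensitivity and specificity reported at the 2.43 threshold. It assumes the standard definitions LR+ = sensitivity / (1 − specificity) and LR− = (1 − sensitivity) / specificity; the function names and the six acoustic input values are hypothetical and serve only to show the arithmetic, not to represent a real voice sample.

```python
def avqi_03_01(cpps, hnr, shim, shdb, slope, tilt):
    """AVQI version 03.01 score from the six acoustic measures in the Methods."""
    return (4.152
            - 0.177 * cpps
            - 0.006 * hnr
            - 0.037 * shim
            + 0.941 * shdb
            + 0.01 * slope
            + 0.093 * tilt) * 2.8902


def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) using the standard textbook definitions."""
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


# Hypothetical acoustic measures (NOT from the study), for illustration only.
example_score = avqi_03_01(cpps=14.0, hnr=20.0, shim=3.0,
                           shdb=0.3, slope=-25.0, tilt=-10.0)

# Sensitivity and specificity reported in the study at AVQI = 2.43.
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.785, specificity=0.932)

print(f"example AVQI = {example_score:.3f}")
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```

With the reported sensitivity and specificity, these definitions give LR+ = 11.54 and LR− = 0.23, matching the values stated in the Results.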


The analyzed data pool15 of 507 subjects across 5 studies5,7-10 evaluated the initial AVQI model and auditory-perceptual judgment of overall voice quality. The results showed a homogeneous weighted mean correlation of r = 0.790.15 By comparison, under the same conditions of statistical analysis for the AVQI version 03.01 including 1118 subjects across 2 studies, these results showed not only a homogeneous weighted correlation but a slightly improved weighted mean r = 0.821. To compare the weighted mean correlation (ie, concurrent validity) of AVQI with other comparable multivariate indices to objectify hoarseness, we used the same statistical analysis as published in Maryn and Weenink.15 First, we analyzed the validity between the Dysphonia Severity Index (DSI)38 and auditory-perceptual judgment of overall voice quality. The DSI is a multivariate method to provide an objective and quantitative measure of vocal function including component measures of jitter, maximum phonation time, lowest vocal intensity, and highest phonational frequency. A data pool of 490 subjects across 5 studies39-43 was used. The results showed a heterogeneous weighted mean correlation of r = 0.524. Second, the validity was evaluated between the Cepstral Spectral Index of Dysphonia (CSID)44 and the auditory-perceptual judgment of overall voice quality. The CSID is composed of 2 multiple regression-based mathematical estimates of dysphonia severity analyzing sustained phonation and continuous speech separately, which use several cepstral- and spectral-based measures. We selected a data pool of 310 subjects across 3 studies44-46 for the sustained vowel and 656 subjects across 4 studies44-47 for the continuous speech. The results showed a homogeneous weighted mean correlation of r = 0.788 for the sustained vowel and a homogeneous weighted mean correlation of r = 0.748 for the continuous speech. Based on all these results, it can be concluded that first, the development of the AVQI model is a steady and robust objective method in the evaluation of voice quality. AVQI has improved in ecological validity, concurrent validity, and diagnostic accuracy. Second, the new version of AVQI 03.01 reached the highest concurrent validity in comparison to the earlier version of AVQI and other acoustic multiparametric indices to objectify hoarseness.

Conclusions

The results confirm AVQI as a robust and ecologically valid measurement to objectify overall voice quality. In the present large data set, the development to the AVQI version 03.01 demonstrates high validity and acceptable diagnostic accuracy in a representative voice clinic population, reflecting different ages, genders, different types and degrees of voice quality, and including nonorganic as well as organic laryngeal pathologies.

The independent external validation of the AVQI version 03.01 provided by this study accomplishes an important step in making practical, reliable, and reproducible objective voice assessments available to non-experts or professionals to support their clinical decision in practice or research in voice-disordered patients.

Appendix

# TITLE OF THE SCRIPT: ACOUSTIC VOICE QUALITY INDEX (AVQI) v.03.01
# Form for introduction and/or parameterization
form Acoustic Voice Quality Index v.03.01
comment >>> It is advocated to estimate someone’s dysphonia severity in both
comment continuous speech (i.e., ‘cs’) and sustained vowel (i.e., ‘sv’) (Maryn et al.,
comment 2010). This script therefore runs on these two types of recordings, and it is
comment important to name these recordings ‘cs’ and ‘sv’, respectively.
comment >>> This script automatically (a) searches, extracts and then concatenates
comment the voiced segments of the continuous speech recording to a new sound; (b)
comment concatenates the sustained vowel recording to the new sound, (c) determines
comment the Smoothed Cepstral Peak Prominence, the Shimmer Local, the Shimmer
comment Local dB, the LTAS-slope, the LTAS-tilt and the Harmonics-to-Noise Ratio of
comment the concatenated sound signal, (d) calculates the AVQI-score based on
comment the equation of Barsties & Maryn (2015), and draws the oscillogram, the narrow-
comment band spectrogram with LTAS and the power-cepstrogram with power-
comment cepstrum of the concatenated sound signal to allow further interpretation.
comment >>> To be reliable for the AVQI analysis, it is imperative that the sound recordings
comment are made in an optimal data acquisition conditions.
comment >>> There are two versions in this script: (1) a simple version (only AVQI with
comment data of acoustic measures), and (2) an illustrated version (AVQI with data of
comment acoustic measures and above-mentioned graphs).
choice version: 1
button simple
button illustrated
comment >>> Additional information (optional):
sentence name_patient
sentence left_dates_(birth_-_assessment)


sentence right_dates_(birth_-_assessment)
comment
comment Script credits: Youri Maryn (PhD), Paul Corthals (PhD), and Ben Barsties
endform
Erase all
Select inner viewport... 0.5 7.5 0.5 4.5
Axes... 0 1 0 1
Black
Text special... 0.5 centre 0.6 half Helvetica 12 0 Please wait an instant. Depending on the duration and/or the sample rate of the recorded
Text special... 0.5 centre 0.4 half Helvetica 12 0 sound files, this script takes more or less time to process the sound and search for the AVQI.
#--------------------------------------------------------------------------------------------
# PART 0:
# HIGH-PASS FILTERING OF THE SOUND FILES.
#--------------------------------------------------------------------------------------------
select Sound cs
Filter (stop Hann band)... 0 34 0.1
Rename... cs2
select Sound sv
Filter (stop Hann band)... 0 34 0.1
Rename... sv2
#--------------------------------------------------------------------------------------------
# PART 1:
# DETECTION, EXTRACTION AND CONCATENATION OF
# THE VOICED SEGMENTS IN THE RECORDING
# OF CONTINUOUS SPEECH.
#--------------------------------------------------------------------------------------------
select Sound cs2
Copy... original
samplingRate = Get sampling frequency
intermediateSamples = Get sampling period
Create Sound... onlyVoice 0 0.001 'samplingRate' 0
select Sound original
To TextGrid (silences)... 50 0.003 -25 0.1 0.1 silence sounding
select Sound original
plus TextGrid original
Extract intervals where... 1 no "does not contain" silence
Concatenate
select Sound chain
Rename... onlyLoud
globalPower = Get power in air
select TextGrid original
Remove
select Sound onlyLoud
signalEnd = Get end time
windowBorderLeft = Get start time
windowWidth = 0.03
windowBorderRight = windowBorderLeft + windowWidth
globalPower = Get power in air
voicelessThreshold = globalPower*(30/100)
select Sound onlyLoud
extremeRight = signalEnd - windowWidth
while windowBorderRight < extremeRight
  Extract part... 'windowBorderLeft' 'windowBorderRight' Rectangular 1.0 no
  select Sound onlyLoud_part
  partialPower = Get power in air
  if partialPower > voicelessThreshold
    call checkZeros 0
    if (zeroCrossingRate <> undefined) and (zeroCrossingRate < 3000)
      select Sound onlyVoice
      plus Sound onlyLoud_part
      Concatenate
      Rename... onlyVoiceNew
      select Sound onlyVoice
      Remove
      select Sound onlyVoiceNew
      Rename... onlyVoice
    endif
  endif
  select Sound onlyLoud_part
  Remove
  windowBorderLeft = windowBorderLeft + 0.03
  windowBorderRight = windowBorderLeft + 0.03
  select Sound onlyLoud
endwhile
select Sound onlyVoice
procedure checkZeros zeroCrossingRate
  start = 0.0025
  startZero = Get nearest zero crossing... 'start'
  findStart = startZero
  findStartZeroPlusOne = startZero + intermediateSamples
  startZeroPlusOne = Get nearest zero crossing... 'findStartZeroPlusOne'
  zeroCrossings = 0
  strips = 0
  while (findStart < 0.0275) and (findStart <> undefined)
    while startZeroPlusOne = findStart
      findStartZeroPlusOne = findStartZeroPlusOne + intermediateSamples
      startZeroPlusOne = Get nearest zero crossing... 'findStartZeroPlusOne'
    endwhile
    afstand = startZeroPlusOne - startZero
    strips = strips +1
    zeroCrossings = zeroCrossings +1
    findStart = startZeroPlusOne
  endwhile
  zeroCrossingRate = zeroCrossings/afstand
endproc


#--------------------------------------------------------------------------------------------
# PART 2:
# DETERMINATION OF THE SIX ACOUSTIC MEASURES
# AND CALCULATION OF THE ACOUSTIC VOICE QUALITY INDEX.
#--------------------------------------------------------------------------------------------
select Sound sv2
durationVowel = Get total duration
durationStart=durationVowel-3
if durationVowel>3
  Extract part... durationStart durationVowel rectangular 1 no
  Rename... sv3
elsif durationVowel<=3
  Copy... sv3
endif
select Sound onlyVoice
durationOnlyVoice = Get total duration
plus Sound sv3
Concatenate
Rename... avqi
durationAll = Get total duration
minimumSPL = Get minimum... 0 0 None
maximumSPL = Get maximum... 0 0 None
# Narrow-band spectrogram and LTAS
To Spectrogram... 0.03 4000 0.002 20 Gaussian
select Sound avqi
To Ltas... 1
minimumSpectrum = Get minimum... 0 4000 None
maximumSpectrum = Get maximum... 0 4000 None
# Power-cepstrogram, Cepstral peak prominence and Smoothed cepstral peak prominence
select Sound avqi
To PowerCepstrogram... 60 0.002 5000 50
cpps = Get CPPS... no 0.01 0.001 60 330 0.05 Parabolic 0.001 0 Straight Robust
To PowerCepstrum (slice)... 0.1
maximumCepstrum = Get peak... 60 330 None
# Slope of the long-term average spectrum
select Sound avqi
To Ltas... 1
slope = Get slope... 0 1000 1000 10000 energy
# Tilt of trendline through the long-term average spectrum
select Ltas avqi
Compute trend line... 1 10000
tilt = Get slope... 0 1000 1000 10000 energy
# Amplitude perturbation measures
select Sound avqi
To PointProcess (periodic, cc)... 50 400
Rename... avqi1
select Sound avqi
plus PointProcess avqi1
percentShimmer = Get shimmer (local)... 0 0 0.0001 0.02 1.3 1.6
shim = percentShimmer*100
shdb = Get shimmer (local_dB)... 0 0 0.0001 0.02 1.3 1.6
# Harmonic-to-noise ratio
select Sound avqi
To Pitch (cc)... 0 75 15 no 0.03 0.45 0.01 0.35 0.14 600
select Sound avqi
plus Pitch avqi
To PointProcess (cc)
Rename... avqi2
select Sound avqi
plus Pitch avqi
plus PointProcess avqi2
voiceReport$ = Voice report... 0 0 75 600 1.3 1.6 0.03 0.45
hnr = extractNumber (voiceReport$, "Mean harmonics-to-noise ratio:")
# Calculation of the AVQI
avqi = (4.152-(0.177*cpps)-(0.006*hnr)-(0.037*shim)+(0.941*shdb)+(0.01*slope)+(0.093*tilt))*2.8902
#--------------------------------------------------------------------------------------------
# PART 3:
# DRAWINGS ALL THE INFORMATION AND THE GRAPHS.
#--------------------------------------------------------------------------------------------
# Title and patient information
Erase all
Solid line
Line width... 1
Black
Helvetica
Select inner viewport... 0 8 0 0.5
Font size... 1
Select inner viewport... 0.5 7.5 0.1 0.15
Axes... 0 1 0 1
Text... 0 Left 0.5 Half Script: Youri Maryn (PhD) and Paul Corthals (PhD)
Font size... 12
Select inner viewport... 0.5 7.5 0 0.5
Axes... 0 1 0 1
Text... 0 Left 0.5 Half ##ACOUSTIC VOICE QUALITY INDEX (AVQI) v.03.01#
Font size... 8
Select inner viewport... 0.5 7.5 0 0.5
Axes... 0 1 0 3
Text... 1 Right 2.3 Half %%'name_patient$'%
Text... 1 Right 1.5 Half %%°'left_dates$'%
Text... 1 Right 0.7 Half %%'right_dates$'%
# Simple version
if version = 1
  # Data
  Font size... 10
  Select inner viewport... 0.5 7.5 0.5 2
  Axes... 0 7 6 0
  Text... 0.05 Left 0.5 Half Smoothed cepstral peak prominence (CPPS): ##'cpps:2'#


  Text... 0.05 Left 1.5 Half Harmonics-to-noise ratio: ##'hnr:2' dB#
  Text... 0.05 Left 2.5 Half Shimmer local: ##'shim:2' \% #
  Text... 0.05 Left 3.5 Half Shimmer local dB: ##'shdb:2' dB#
  Text... 0.05 Left 4.5 Half Slope of LTAS: ##'slope:2' dB#
  Text... 0.05 Left 5.5 Half Tilt of trendline through LTAS: ##'tilt:2' dB#
  Select inner viewport... 0.5 3.8 0.5 2
  Draw inner box
  Font size... 7
  Arrow size... 1
  Select inner viewport... 4 7.5 1.25 2
  Axes... 0 10 1 0
  Paint rectangle... green 0 2.43 0 1
  Paint rectangle... red 2.43 10 0 1
  Draw arrow... avqi 1 avqi 0
  Draw inner box
  Marks top every... 1 1 yes yes no
  Font size... 16
  Select inner viewport... 4 7.5 0.5 1.15
  Axes... 0 1 0 1
  Text... 0.5 Centre 0.5 Half AVQI: ##'avqi:2'#
  # Copy Praat picture
  Select inner viewport... 0.5 7.5 0 2
  Copy to clipboard
# Illustrated version
elsif version = 2
  # Oscillogram
  Font size... 7
  Select inner viewport... 0.5 5 0.5 2.0
  select Sound avqi
  Draw... 0 0 0 0 no Curve
  Draw inner box
  One mark left... minimumSPL no yes no 'minimumSPL:2'
  One mark left... maximumSPL no yes no 'maximumSPL:2'
  Text left... no Sound pressure level (Pa)
  One mark bottom... 0 no yes no 0.00
  One mark bottom... durationOnlyVoice no no yes
  One mark bottom... durationAll no yes no 'durationAll:2'
  Text bottom... no Time (s)
  # Narrow-band spectrogram
  Select inner viewport... 0.5 5 2.3 3.8
  select Spectrogram avqi
  Paint... 0 0 0 4000 100 yes 50 6 0 no
  Draw inner box
  One mark left... 0 no yes no 0
  One mark left... 4000 no yes no 4000
  Text left... no Frequency (Hz)
  One mark bottom... 0 no yes no 0.00
  One mark bottom... durationOnlyVoice no no yes
  One mark bottom... durationAll no yes no 'durationAll:2'
  Text bottom... no Time (s)
  # LTAS
  Select inner viewport... 5.4 7.5 2.3 3.8
  select Ltas avqi
  Draw... 0 4000 minimumSpectrum maximumSpectrum no Curve
  Draw inner box
  One mark left... minimumSpectrum no yes no 'minimumSpectrum:2'
  One mark left... maximumSpectrum no yes no 'maximumSpectrum:2'
  Text left... no Sound pressure level (dB/Hz)
  One mark bottom... 0 no yes no 0
  One mark bottom... 4000 no yes no 4000
  Text bottom... no Frequency (Hz)
  # Power-cepstrogram
  Select inner viewport... 0.5 5 4.1 5.6
  select PowerCepstrogram avqi
  Paint... 0 0 0.00303 0.01667 0 0 no
  Draw inner box
  One mark left... 0.00303 no yes no 0.003
  One mark left... 0.01667 no yes no 0.017
  Text left... no Quefrency (s)
  One mark bottom... 0 no yes no 0.00
  One mark bottom... durationOnlyVoice no no yes
  One mark bottom... durationAll no yes no 'durationAll:2'
  Text bottom... no Time (s)
  # Power-cepstrum
  Select inner viewport... 5.4 7.5 4.1 5.6
  select PowerCepstrum avqi_0_100
  Draw... 0.00303 0.01667 0 0 no
  Draw tilt line... 0.00303 0.01667 0 0 0.00303 0.01667 Straight Robust
  Draw inner box
  One mark left... maximumCepstrum no yes no 'maximumCepstrum:2'
  Text left... no Amplitude (dB)
  One mark bottom... 0.00303 no yes no 0.003
  One mark bottom... 0.01667 no yes no 0.017
  Text bottom... no Quefrency (s)
  # Data
  Font size... 10
  Select inner viewport... 0.5 7.5 5.9 7.4
  Axes... 0 7 6 0
  Text... 0.05 Left 0.5 Half Smoothed cepstral peak prominence (CPPS): ##'cpps:2'#
  Text... 0.05 Left 1.5 Half Harmonics-to-noise ratio: ##'hnr:2' dB#
  Text... 0.05 Left 2.5 Half Shimmer local: ##'shim:2' \% #
  Text... 0.05 Left 3.5 Half Shimmer local dB: ##'shdb:2' dB#
  Text... 0.05 Left 4.5 Half Slope of LTAS: ##'slope:2' dB#
  Text... 0.05 Left 5.5 Half Tilt of trendline through LTAS: ##'tilt:2' dB#
  Select inner viewport... 0.5 3.8 5.9 7.4


  Draw inner box
  Font size... 7
  Arrow size... 1
  Select inner viewport... 4 7.5 6.75 7.4
  Axes... 0 10 1 0
  Paint rectangle... green 0 2.43 0 1
  Paint rectangle... red 2.43 10 0 1
  Draw arrow... avqi 1 avqi 0
  Draw inner box
  Marks top every... 1 1 yes yes no
  Font size... 16
  Select inner viewport... 4 7.5 5.9 6.65
  Axes... 0 1 0 1
  Text... 0.5 Centre 0.5 Half AVQI: ##'avqi:2'#
  # Copy Praat picture
  Select inner viewport... 0.5 7.5 0 7.4
  Copy to clipboard
endif
# Remove intermediate objects
select all
minus Sound sv
minus Sound cs
minus Sound avqi
Remove

Acknowledgments

The authors thank Jopie Kuiper, Timmy Hartman, Rudi Verfaillie, Prof Dr Marc De Bodt, Paulien Keim, Dr Leo Meulenbroek, Gerti te Walvaart, Tinka Thede, Gertie Savelkoul, Jessica Fremdgen, Bertine Lefers, and Kim Rutten for their contributions in the perceptual judgment of the many concatenated voice samples. Special thanks to Cas Kruitwagen for statistical support in the analysis of R-studio.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

References

1. Barsties B, De Bodt M. Assessment of voice quality: current state-of-the-art. Auris Nasus Larynx. 2015;42:183-188.
2. Oates J. Auditory-perceptual evaluation of disordered voice quality: pros, cons and future directions. Folia Phoniatr Logop. 2009;61:49-56.
3. Barsties B, Maryn Y. The influence of voice sample duration in the auditory-perceptual judgment of overall voice quality. unpublished data.
4. Roy N, Barkmeier-Kraemer J, Eadie T, et al. Evidence-based clinical voice assessment: a systematic review. Am J Speech Lang Pathol. 2013;22:212-226.
5. Maryn Y, Corthals P, Van Cauwenberge P, Roy N, De Bodt M. Toward improved ecological validity in the acoustic measurement of overall voice quality: combining continuous speech and sustained vowels. J Voice. 2010;24:540-555.
6. Maryn Y, Roy N. Sustained vowels and continuous speech in the auditory-perceptual evaluation of dysphonia severity. J Soc Bras Fonoaudiol. 2012;24:107-112.
7. Maryn Y, De Bodt M, Roy N. The Acoustic Voice Quality Index: toward improved treatment outcomes assessment in voice disorders. J Commun Disord. 2010;43:161-174.
8. Barsties B, Maryn Y. Der Acoustic Voice Quality Index in Deutsch: Ein Messverfahren zur allgemeinen Stimmqualität [The Acoustic Voice Quality Index. Toward expanded measurement of dysphonia severity in German subjects]. HNO. 2012;60:715-720.
9. Reynolds V, Buckland A, Bailey J, et al. Objective assessment of pediatric voice disorders with the acoustic voice quality index. J Voice. 2012;26:672.e1-7.
10. Maryn Y, De Bodt M, Barsties B, Roy N. The value of the Acoustic Voice Quality Index as a measure of dysphonia severity in subjects speaking different languages. Eur Arch Otorhinolaryngol. 2014;271:1609-1619.
11. Kankare E, Barsties B, Maryn Y, et al. A preliminary study of the Acoustic Voice Quality Index in Finnish speaking population. Paper presented at: 11th Pan European Voice Conference; September 2015; Florence, Italy.
12. Maryn Y, Kim HT, Kim J. Auditory-perceptual and acoustic methods in measuring dysphonia severity of Korean speech [published online August 24, 2015]. J Voice.
13. Hillenbrand J. SpeechTool, Version 1.65 [computer program] (October 2008). http://homepages.wmich.edu/~hillenbr/. Accessed February 20, 2011.
14. Boersma P, Weenink D. [computer program]. Praat: Doing Phonetics by Computer (Version 5.3.57). Amsterdam, The Netherlands: Institute of Phonetic Sciences (October 2013).
15. Maryn Y, Weenink D. Objective dysphonia measures in the program Praat: smoothed cepstral peak prominence and acoustic voice quality index. J Voice. 2015;29:35-43.
16. Barsties B, Maryn Y. The improvement of internal consistency of the Acoustic Voice Quality Index. Am J Otolaryngol. 2015;36:647-656.
17. Federal Agency for Medicines and Health Products, and Belgian Advisory Committee on Bioethics. Guide for Non-interventional Studies. Brussels: Author; 2007.
18. Van de Weijer JC, Slis IH. Nasaliteitsmeting met de nasometer. Tijdschrift voor Logopedie en Foniatrie. 1991;63:97-101.
19. Van Lierde K. Nasalance and Nasality in Clinical Practice [dissertation]. University of Ghent, Ghent, Belgium; 2001.
20. Roark RM. Frequency and voice: perspectives in the time domain. J Voice. 2006;20:325-354.
21. Deliyski DD, Shaw HS, Evans MK. Adverse effects of environmental noise on acoustic voice quality measurements. J Voice. 2005;19:15-28.
22. Deliyski DD, Shaw HS, Evans MK, Vesselinow R. Regression tree approach to studying factors influencing acoustic voice analysis. Folia Phoniatr Logop. 2006;58:274-288.
23. Hirano M. Psycho-acoustic evaluation of voice. In: Arnold GE, Winckel F, Wyke BD, ed. Disorders of Human Communication 5. Clinical Examination of Voice. Vienna, Austria: Springer-Verlag; 1981:81-84.


24. Wuyts FL, De Bodt MS, Van de Heyning PH. Is the reliability of a visual analog scale higher than an ordinal scale? An experiment with the GRBAS scale for the perceptual evaluation of dysphonia. J Voice. 1999;13:508-517.
25. Kreiman J, Gerratt BR, Kempster GB, Erman A, Berke GS. Perceptual evaluation of voice quality: review, tutorial, and a framework for future research. J Speech Lang Hear Res. 1993;36:21-40.
26. Chan KM, Yiu EM. The effect of anchors and training on the reliability of perceptual voice evaluation. J Speech Lang Hear Res. 2002;45:111-126.
27. Anders LC, Hollien H, Hurme P, Sonninen A, Wendler J. Perception of hoarseness by several classes of listeners. Folia Phoniatr Logop. 1988;40:91-100.
28. Dejonckere PH. Principal components in voice pathology. Voice. 1995;4:96-105.
29. Shrivastav R. Evaluating voice quality. In: Ma EPM, Yiu EML, ed. Handbook of Voice Assessments. San Diego, CA: Singular Publishing Group; 2011:305-318.
30. Everitt BS. The Cambridge Dictionary of Statistics. 2nd ed. New York: Cambridge University Press; 2002.
31. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33:159-174.
32. Van Belle S. Agreement Between Raters and Groups of Raters [dissertation]. University of Liège, Liège, Belgium; 2009.
33. Fleiss JL. Measuring nominal scale agreement among many raters. Psychol Bull. 1971;76:378-382.
34. Frey LR, Botan CH, Friedman PG, Kreps GL. Investigating Communication: An Introduction to Research Methods. Englewood Cliffs, NJ: Prentice-Hall; 1991.
35. Portney LG, Watkins MP. Foundations of Clinical Research, Applications to Practice. 2nd ed. Englewood Cliffs, NJ: Prentice-Hall; 2000.
36. Dollaghan CA. The Handbook for Evidence-based Practice in Communication Disorders. Baltimore, MD: Brookes; 2007.
37. Barsties B, Beers M, Ten Cate L, et al. The effect of visual feedback and training in auditory-perceptual judgment of voice quality [published online November 2, 2015]. Logoped Phoniatr Vocol.
38. Wuyts FL, De Bodt MS, Molenberghs G, et al. The dysphonia severity index: an objective measure of vocal quality based on a multiparameter approach. J Speech Lang Hear Res. 2000;43:796-809.
39. Jayakumar T, Savithri SR. Assessment of voice quality in monozygotic twins: qualitative and quantitative measures. JAIISH. 2009;28:8-13.
40. Henry LR, Helou LB, Solomon NP, et al. Functional voice outcomes after thyroidectomy: an assessment of the Dysphonia Severity Index (DSI) after thyroidectomy. Surgery. 2010;147:861-870.
41. Aboras Y, El-Banna M, El-Magraby R, Ibrahim A. The relationship between subjective self-rating and objective voice assessment measures. Logoped Phoniatr Vocol. 2010;35:34-38.
42. Smehák G. Complex voice measurement panel for the assessment of the functional evaluation of the laryngeal surgical interventions [dissertation]. University of Szeged, Szeged, Hungary; 2010.
43. Hussein Gaber AG, Liang FY, Yang JS, Wang YJ, Zheng YQ. Correlation among the dysphonia severity index (DSI), the RBH voice perceptual evaluation, and minimum glottal area in female patients with vocal fold nodules. J Voice. 2014;28:20-23.
44. Awan SN, Roy N, Jetté ME, Meltzner GS, Hillman RE. Quantifying dysphonia severity using a spectral/cepstral-based acoustic index: Comparisons with auditory-perceptual judgements from the CAPE-V. Clin Linguist Phon. 2010;24:742-758.
45. Awan SN, Solomon NP, Helou LB, Stojadinovic A. Spectral-cepstral estimation of dysphonia severity: external validation. Ann Otol Rhinol Laryngol. 2013;122:40-48.
46. Peterson EA, Roy N, Awan SN, Merrill RM, Banks R, Tanner K. Toward validation of the cepstral spectral index of dysphonia (CSID) as an objective treatment outcomes measure. J Voice. 2013;27:401-410.
47. Watts CR, Awan SN. An examination of variations in the cepstral spectral index of dysphonia across a single breath group in connected speech. J Voice. 2015;29:26-34.
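
For readers who want to check an AVQI value outside Praat, the equation in PART 2 of the appendix script can be restated as a minimal Python sketch. The coefficients and the 2.43 boundary are copied from the script (the boundary is where its green and red rectangles meet). The function and constant names are hypothetical and not part of the published script. Treating scores above 2.43 as the red zone is an assumption, because the script does not say how a score of exactly 2.43 is classified.

```python
# Sketch only: restates the AVQI v03.01 equation from PART 2 of the appendix script.
# All names below are hypothetical; they do not appear in the published script.

AVQI_THRESHOLD = 2.43  # boundary between the green and red rectangles drawn by the script


def avqi_0301(cpps, hnr, shim, shdb, slope, tilt):
    """Return the AVQI v03.01 score from the six acoustic measures.

    cpps  : smoothed cepstral peak prominence
    hnr   : mean harmonics-to-noise ratio (dB)
    shim  : shimmer local (%), i.e. percentShimmer * 100 in the script
    shdb  : shimmer local dB
    slope : slope of the long-term average spectrum
    tilt  : tilt of the trend line through the long-term average spectrum
    """
    return (4.152 - 0.177 * cpps - 0.006 * hnr - 0.037 * shim
            + 0.941 * shdb + 0.01 * slope + 0.093 * tilt) * 2.8902


def in_red_zone(avqi, threshold=AVQI_THRESHOLD):
    """Assumption: a score strictly above the threshold falls in the red zone."""
    return avqi > threshold
```

With all six measures set to zero the expression reduces to 4.152 × 2.8902, about 12.00, which is a quick way to confirm that the coefficients were transcribed correctly. The sketch does not compute the six measures themselves; those still come from the Praat analysis of the concatenated sound.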
