DOI: 10.1177/0003489416636131 (Barsties and Maryn, 2016)
Original Article
Annals of Otology, Rhinology & Laryngology
Abstract
Objectives: The Acoustic Voice Quality Index (AVQI) is an objective method to quantify the severity of overall voice
quality in concatenated continuous speech and sustained phonation segments. Recently, AVQI was successfully modified
to be more representative and ecologically valid because the internal consistency of AVQI was balanced out through equal
proportion of the 2 speech types. The present investigation aims to explore its external validation in a large data set.
Methods: An expert panel of 12 speech-language therapists rated the voice quality of 1058 concatenated voice samples
varying from normophonia to severe dysphonia. The Spearman rank-order correlation coefficients (r) were used to
measure concurrent validity. The AVQI’s diagnostic accuracy was evaluated with several estimates of its receiver operating
characteristics (ROC).
Results: Of the 12 experts, 8 were retained on the basis of reliability criteria. A strong correlation was identified
between AVQI and auditory-perceptual rating (r = 0.815, P < .001), indicating that 66.4% of the variation in the
auditory-perceptual rating was explained by AVQI. Additionally, the ROC results again showed the best diagnostic
outcome at a threshold of AVQI = 2.43.
Conclusions: This study highlights external validation and diagnostic precision of the AVQI version 03.01 as a robust and
ecologically valid measurement to objectify voice quality.
Keywords
voice, acoustic analysis, voice disorders, larynx, laryngology, otolaryngology
Downloaded from aor.sagepub.com at UNIV CALIFORNIA SAN DIEGO on March 14, 2016
speech and sustained phonation if it is to be considered ecologically valid.5 Sustained vowels and continuous speech yielded significant differences in their ratings of degree of voice quality. Furthermore, both represent a different focus of voice/speech context (ie, specific vocal functions caused by controlled, reasonably stable, and sustained phonation and characteristics in natural speech varying voicing patterns and sounds, respectively).5,6 Second, the auditory-perceptual judgment of both speech types with 1 average score showed a strong proportional relationship without significant differences to the post hoc average score of the single ratings on continuous speech and sustained phonation and to a bivariate model that weighed the separate speech type ratings.6

Further investigations about AVQI showed consistent and acceptable diagnostic precision,5,7-11 consistent and high concurrent validity,5,7-12 robust inter-language phonetic differences,8-12 and high sensitivity in voice changes through voice therapy.7 In the majority of all these investigations, the programs Speech Tool13 and Praat14 were used to analyze AVQI. Recently, the smoothed cepstral peak prominence (ie, the main factor in the multivariate AVQI model and analyzed with Speech Tool) has been implemented in Praat, and thus, the use of Speech Tool might be expendable. A current investigation revealed that the outcomes of the original AVQI version with the 2 programs and the second AVQI version only in Praat are highly comparable in AVQI results.15

The next step in the AVQI development was to establish equal proportion of the continuous speech and sustained vowel to reach higher ecological validity and balanced out internal consistency.16 Therefore, the duration from continuous speech was expanded from a range of 17 to 22 syllables10 to around 34 syllables16 because the length of continuous speech is significantly lower for the analysis, after separating voice to voiceless segments, than the constant duration of sustained vowel (Figure 1). Although the evaluation from Barsties and Maryn16 was found successfully in 60 voice samples for a new weighted AVQI model with extended representativity (ie, AVQI version 03.01), it is essential for the results of any single study to be replicated with alternative samples in a larger sample set. Therefore, this investigation aimed to explore external validity (ie, the ability to reproduce results with alternative subjects and in settings outside the initial study) of the new weighted equation in the AVQI version 03.01 with a completely new and independent large set of normophonic and dysphonic voice samples and an associated group of auditory-perceptual judges.

Methods

Subjects
All subjects were recruited from the ENT caseload of the Sint-Jan General Hospital in Bruges, Belgium. The group consisted of 970 participants with dysphonia and 88 healthy subjects without any reported voice complaints and voice disorders. The dysphonia group presented various organic and nonorganic etiologies and various degrees in dysphonia severity. Table 1 summarizes further subject details, such as gender, age, and voice disorder.

This study consisted of a retrospective and non-interventional re-analysis of earlier recordings, and therefore no advice or consent of our Ethics Committee was needed.17

Voice Samples
Every voice sample from each subject contained 2 kinds of speech types. Both of them were recorded with comfortable pitch and loudness. First, the subject had to sustain the vowel [a:] longer than 3 seconds, and for the analyses, a selection of 3 seconds of the mid-vowel portion of the vowel [a:] was used. Second, a read aloud Dutch phonetically balanced text “Papa en Marloes”18,19 was used. All recordings were conducted in a soundproof booth with an AKG C420 head-mounted condenser microphone digitized at 44 100 samples per second,20 that is, a sampling rate of 44.1 kHz, and 16 bits of resolution using the Kay Pentax Computerized Speech Lab model 4500.

The signal-to-noise ratio (SNR) by Deliyski et al21,22 was used to verify post hoc the level of environmental noise of the voice recordings. All voice samples were consistent with the recommended SNR norm for acceptable circumstances of acoustic recordings and analysis. The results showed a mean SNR of 38.56 dB and SD of 3.78 dB.

The analyzed voice samples of both recorded speech types were concatenated, and they were constituted as recommended from Barsties and Maryn16 as follows: First, the continuous speech part was trimmed to the first 34 syllables of the text. Second, the 3 seconds of the sustained vowel [a:] segment was appended. Third, every voice sample was saved as a single sound wave in WAV-format.

Auditory-Perceptual Evaluation
The procedure of auditory-perceptual judgment was identical as described by Barsties and Maryn16 using only a new and extended panel of raters. In the present investigation, 12 native Dutch speech-language therapists rated overall voice quality. The panel consisted of 9 females and 3 males who specialized in voice disorders and had professional experience in auditory-perceptual judgment ranging from 4 to 41 years (mean = 22.3 years, SD = 11.4 years). Each listener rated overall voice quality of each concatenated voice sample with 1 severity degree for the whole sample. They used the grade (G) from the GRBAS scale,23 which, according to Hirano,23 represents the degree of hoarseness or voice abnormality, and the grade concurred with overall voice quality. As recommended by Wuyts et al,24 the judges used the ordinal 4-point equal-appearing interval scale (ie, 0 = normal/absence of hoarseness, 1 = slightly
Figure 1. Example oscillograms of the voice samples for (A) AVQI 02.02 as in Maryn and Weenink15 or earlier5,7-12 and (B) AVQI
03.01 as in Barsties and Maryn16: (A) Upper left: connected speech with 17 syllables, upper right: 3-second sustained vowel [a:], and
lower: concatenation of these 2 sound files in which the continuous speech is already separated into concatenated voiced segments.
(B) Upper left: connected speech with 34 syllables, upper right: 3-second sustained vowel [a:], and lower: concatenation of these 2
sound files in which the continuous speech is already separated into concatenated voiced segments.
hoarse, 2 = moderately hoarse, 3 = severely hoarse). All voice samples were provided in a quiet room with a low ambient noise level lower than 40 dBA, measured with a calibrated PCE-322A sound level meter. They were presented to each listener individually at a comfortable loudness level through an external soundcard from Creative Soundblaster x-fi 5.1 USB and a Beyerdynamic DT 770 PRO 80Ω headphone. Every listener was allowed to repeat each voice sample as often as necessary to make a final decision of judgment.

All voice samples were judged randomly in several sessions. Every rating session contained about 250 voice samples. Furthermore, all judges were blinded regarding the identity, diagnosis, and disposition of the voice samples. To assess intrarater reliability, 104 voice samples, approximately 10% of the 1058 voice samples, were selected randomly. These voice samples were repeated a second time at the end of the perceptual judgment without informing the listeners that stimuli were repeated.

To control internal factors such as fatigue, attention, and low concentration level as described by Kreiman et al,25 a short break after every twenty-fifth rating was used. Furthermore, anchor voices were used to putatively increase the reliability of listener ratings.26 In total, 6 concatenated voice samples from the database of previous auditory-perceptual investigations were selected as anchor voices. The selection criteria of these voices were based on prior unanimous agreement across judges adhering to the 3 severity degrees of slightly, moderately, and severely hoarse. This high consensus in the specific severity degrees of hoarseness from different raters enables the use of these voices as reference patterns of a specific level of G. In total, 2 sets of continuously increasing hoarseness level (ie, 3 samples per set) were provided for the listeners as anchors. The 2 sets distinguished between 2 chief subtypes of hoarseness (ie, breathiness and roughness) recognized in various scientific
papers.27-29 Each listener heard these 2 sets at the beginning and after the break of every twenty-fifth rating.

Table 1. Descriptive Data of Dichotomous Factors of the 1058 Subjects.
Variable                                      Results
Gender
  Male, No.                                   386
  Female, No.                                 672
Female age, y (mean ± SD)                     40.31 ± 19.36
Male age, y (mean ± SD)                       44.41 ± 23.15
Normal, No.                                   88
Voice disorder, No.
  Functional dysphonia                        221
  Nodules                                     201
  Paralysis/paresis                           132
  Polypoid mucosa (edema)                     73
  Cyst                                        35
  Reflux laryngitis                           28
  Polyp                                       28
  Presbylarynx                                26
  Tumor                                       23
  Chronic laryngitis                          22
  Post-phonosurgery                           17
  Thyroidectomy                               15
  Sulcus vocalis                              14
  Trauma                                      12
  Ventricular hypertrophy                     12
  Acute laryngitis                            11
  Leukoplakia                                 10
  Post-radiotherapy                           9
  Vocal fold atrophy                          9
  Granuloma                                   7
  Hyperkeratosis                              6
  Spasmodic dysphonia                         6
  Mutational falsetto                         6
  Neurological disorder                       6
  Dysarthrophonia                             5
  Vocal fold scar                             5
  Papillomatosis                              5
  Preoperative baseline before thyroidectomy  4
  Laryngectomy                                3
  Fibromyalgia                                3
  Post-transoral robot surgery                2
  Web                                         2
  Transgender                                 2
  Genetic disorders                           2
  Pseudopolyp                                 1
  Tumorectomy head-neck                       1
  Pachydermy                                  1
  Hemorrhage                                  1
  Corticopathology                            1
  Postoperative neurological surgery          1
  Tracheitis                                  1
  Stenosis                                    1

Acoustic Measures
All acoustic analyses were applied to only voiced segments of continuous speech as determined by the automated detection Praat script of Maryn et al5 and an appendage of a 3-second [a:] segment to this chain of voiced text segments. The following acoustic analyses were performed on the entire segments of only-voiced continuous speech and sustained phonation.

These concatenated sound files for calculating AVQI consisted of 6 acoustic parameters: smoothed cepstral peak prominence (CPPs), harmonics-to-noise ratio (HNR), shimmer local (Shim), shimmer local dB (ShdB), general slope of the spectrum (Slope), and tilt of the regression line through the spectrum (Tilt) with the software Praat. The smoothed CPP is the distance between the first rahmonic’s peak and the point with equal quefrency on the regression line through the smoothed cepstrum. The HNR is the base-10 logarithm of the ratio between the periodic energy and the noise energy, multiplied by 10. The Shim is the absolute mean difference between the amplitudes of successive periods, divided by the average amplitude. The ShdB is the base-10 logarithm of the difference between the amplitudes of successive periods, multiplied by 20. The general Slope is the difference between the energy in 0 to 1000 Hz and the energy in 1000 to 10 000 Hz of the long-term average spectrum. The Tilt is the difference between the energy in 0 to 1000 Hz and the energy in 1000 to 10 000 Hz of the trendline through the long-term average spectrum.

To calculate AVQI with the software Praat, the following equation was used according to Barsties and Maryn16:

$$\mathrm{AVQI}_{03.01} = \bigl[\,4.152 - (0.177 \times \mathrm{CPPs}) - (0.006 \times \mathrm{HNR}) - (0.037 \times \mathrm{Shim}) + (0.941 \times \mathrm{ShdB}) + (0.01 \times \mathrm{Slope}) + (0.093 \times \mathrm{Tilt})\,\bigr] \times 2.8902$$

Furthermore, the complete AVQI script is attached in the appendix to run the AVQI version 03.01 in Praat.

Statistics
The statistical analyses for concurrent validity and diagnostic accuracy were completed using SPSS for Windows version 22.0 (IBM Corp, Armonk, New York, USA). The rater reliability was analyzed using the software package of r-Studio v. 3.0.1 (R Core Team, Vienna, Austria).

Intrarater Reliability
The intrarater reliability of the 12 raters was assessed using the Cohen’s kappa coefficient (Ck). This statistic is a chance
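As an illustration of the chance-corrected agreement statistic named above, the following Python sketch computes an unweighted Cohen's kappa for one listener's first and repeated G ratings. The function name and the example ratings are hypothetical, and whether the authors used a weighted variant is not stated in the text available here, so the unweighted form is an assumption.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Unweighted Cohen's kappa for two equally long rating sequences."""
    if len(first) != len(second) or not first:
        raise ValueError("rating sequences must be non-empty and equally long")
    n = len(first)
    # Proportion of samples that received the same rating both times
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Agreement expected by chance, from the marginal rating frequencies
    counts_first = Counter(first)
    counts_second = Counter(second)
    expected = sum(counts_first[c] * counts_second[c] for c in counts_first) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (observed - expected) / (1.0 - expected)


# Hypothetical G ratings (0 to 3) given twice by the same listener
first_pass = [0, 0, 1, 1, 2, 2, 3, 3]
second_pass = [0, 0, 1, 2, 2, 2, 3, 3]
kappa = cohen_kappa(first_pass, second_pass)
print(f"kappa = {kappa:.3f}")
```

In this made-up example the listener changes one rating out of eight, so observed agreement is 7/8 while chance agreement is 1/4.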
Figure 2. Frequency distribution of the mean auditory-perceptual overall voice quality ratings (average of G scores of the 8 identified judges) of the 1058 concatenated voice samples.

Results
Intrarater reliability showed no significant differences in Ck values (t = 12.824, P = .306) between all 12 raters, but 1 rater did not reach the minimum of acceptable reliability level (Ck = 0.32) and had to be excluded. The remaining 11 raters had a range of Ck between 0.41 and 0.58.

Interrater reliability was executed on the remaining 11 raters that reached an Fk = 0.39, and 5 raters showed a significantly better Fk result if they were excluded in comparison to the Fk of all tested 11 raters (t = 18.985, P < .001 to t = 7.576, P = .006). After the fourth round, an Fk = 0.43 was found with a group of 8 raters and simultaneously showed no significantly better Fk results if 1 rater of this group was excluded (t = 7.25, P = .011 to t = 0.757, P = .384). Therefore, all analyses of perceptual Gmean ratings were conducted on the panel of the particular 8 raters mentioned previously. Figure 2 shows a distribution of the 1058 evaluated voice samples by this panel.

Figure 3. Scatterplot and linear regression line illustrating the proportional relationship between AVQI version 03.01 and Gmean (the 2 lines above and under the regression fit line delineate the upper and lower boundaries, respectively, of the 95% prediction interval).

The results indicated significant and marked34 concurrent validity between the AVQI values and Gmean ratings for those same samples as provided by the selected judges (r = 0.815, P < .001; Figure 3), indicating that 66.4% (ie, r² = 0.664) of the variance in Gmean was accounted for by AVQI.
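The step from the correlation coefficient to the reported share of explained variance is a simple squaring, which the following Python sketch reproduces. The function name is hypothetical; the value r = 0.815 is taken from the text, and reading the squared rank correlation as explained variance follows the article's own interpretation.

```python
def explained_variance(r):
    """Coefficient of determination for a correlation coefficient r."""
    return r * r


r = 0.815  # correlation between AVQI and Gmean reported in the text
share = explained_variance(r)
print(f"r^2 = {share:.3f}, ie, {share * 100:.1f}% of the variance in Gmean")
```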
Figure 4. Receiver operating characteristic curve illustrating the diagnostic accuracy of AVQI version 03.01.

To evaluate AVQI’s potential to distinguish subjects with normal voice quality from abnormal voice quality, a ROC curve was constructed (Figure 4). The AROC was 0.923 (ie, 92.3%) and confirmed the excellent discriminatory power of AVQI in differentiating between normal and hoarse voices. Furthermore, the AVQI threshold of 2.43 showed the optimal demarcation point between normal and hoarse voices that concurs with the previous result from Barsties and Maryn.16 At this threshold, the best balance between sensitivity and specificity was achieved with respectable sensitivity = 0.785 and excellent specificity = 0.932. Additionally, the likelihood ratios at this threshold met the statistical criteria for the first time, with LR+ = 11.54 and a respectable LR− = 0.23.

Discussion
The present study aimed to explore external validation and diagnostic precision of the AVQI version 03.01. The results indicate that the new extended AVQI successfully corresponds to perceptual ratings of overall voice quality. Furthermore, the AVQI version 03.01 has a strong external validity as an effective correlate with high diagnostic accuracy of perceptual evaluation of overall voice quality with alternative voice samples and judges. The intra- and interrater reliability reached moderate strength of agreement, and the results are comparable to other studies that investigate the reliability of concatenated voice samples.3,5,8,16,37

Although the present study has first, an extremely large data set of more than 1000 analyzed voice samples; second, more exact selection of the rater panel based on the knowledge of several affecting factors disturbing the perceived judgment1; and third, more critical statistical selection criteria in rater reliability, the results are comparable to the previous results about AVQI’s version 03.01. Thus, the concurrent validity (ie, r = 0.815 in the current study vs r = 0.929 in the study by Barsties and Maryn16) and diagnostic accuracy with the threshold of 2.43 (ie, sensitivity = 0.785 and specificity = 0.932 in the present study vs sensitivity = 0.936 and specificity = 1 in the study by Barsties and Maryn16) are comparable.
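The likelihood ratios reported in the Results follow directly from the sensitivity and specificity at the 2.43 threshold. The Python sketch below uses the standard definitions LR+ = sensitivity / (1 − specificity) and LR− = (1 − sensitivity) / specificity; the function name is hypothetical and the inputs are the values quoted in the text.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a dichotomous test."""
    if not (0.0 < specificity < 1.0) or not (0.0 <= sensitivity <= 1.0):
        raise ValueError("sensitivity must lie in [0, 1] and specificity in (0, 1)")
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


# Values reported for the AVQI threshold of 2.43 in the present study
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.785, specificity=0.932)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```

A specificity of exactly 1, as reported for the earlier study by Barsties and Maryn, leaves LR+ undefined, which is why the sketch rejects that input rather than dividing by zero.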
The analyzed data pool15 of 507 subjects across 5 studies5,7-10 evaluated the initial AVQI model and auditory-perceptual judgment of overall voice quality. The results showed a homogeneous weighted mean correlation of r = 0.790.15 By comparison, under the same conditions of statistical analysis for the AVQI version 03.01 including 1118 subjects across 2 studies, these results showed not only a homogeneous weighted correlation but a slightly improved weighted mean r = 0.821. To compare the weighted mean correlation (ie, concurrent validity) of AVQI with other comparable multivariate indices to objectify hoarseness, we used the same statistical analysis as published in Maryn and Weenink.15 First, we analyzed the validity between the Dysphonia Severity Index (DSI)38 and auditory-perceptual judgment of overall voice quality. The DSI is a multivariate method to provide an objective and quantitative measure of vocal function including component measures of jitter, maximum phonation time, lowest vocal intensity, and highest phonational frequency. A data pool of 490 subjects across 5 studies39-43 was used. The results showed a heterogeneous weighted mean correlation of r = 0.524. Second, the validity was evaluated between the Cepstral Spectral Index of Dysphonia (CSID)44 and the auditory-perceptual judgment of overall voice quality. The CSID is composed of 2 multiple regression-based mathematical estimates of dysphonia severity analyzing sustained phonation and continuous speech separately, which use several cepstral- and spectral-based measures. We selected a data pool of 310 subjects across 3 studies44-46 for the sustained vowel and 656 subjects across 4 studies44-47 for the continuous speech. The results showed a homogeneous weighted mean correlation of r = 0.788 for the sustained vowel and a homogeneous weighted mean correlation of r = 0.748 for the continuous speech. Based on all these results, it can be concluded that first, the development of the AVQI model is a steady and robust objective method in the evaluation of voice quality. AVQI has improved in ecological validity, concurrent validity, and diagnostic accuracy. Second, the new version of AVQI 03.01 reached the highest concurrent validity in comparison to the earlier version of AVQI and other acoustic multiparametric indices to objectify hoarseness.

Conclusions
The results confirm AVQI as a robust and ecologically valid measurement to objectify overall voice quality. In the present large data set, the development to the AVQI version 03.01 demonstrates high validity and acceptable diagnostic accuracy in a representative voice clinic population, reflecting different ages, genders, different types and degrees of voice quality, and including nonorganic as well as organic laryngeal pathologies.

The independent external validation of the AVQI version 03.01 provided by this study accomplishes an important step in making practical, reliable, and reproducible objective voice assessments available to non-experts or professionals to support their clinical decision in practice or research in voice-disordered patients.

Appendix
# TITLE OF THE SCRIPT: ACOUSTIC VOICE QUALITY INDEX (AVQI) v.03.01
# Form for introduction and/or parameterization
form Acoustic Voice Quality Index v.03.01
comment >>> It is advocated to estimate someone's dysphonia severity in both
comment continuous speech (i.e., 'cs') and sustained vowel (i.e., 'sv') (Maryn et al.,
comment 2010). This script therefore runs on these two types of recordings, and it is
comment important to name these recordings 'cs' and 'sv', respectively.
comment >>> This script automatically (a) searches, extracts and then concatenates
comment the voiced segments of the continuous speech recording to a new sound; (b)
comment concatenates the sustained vowel recording to the new sound, (c) determines
comment the Smoothed Cepstral Peak Prominence, the Shimmer Local, the Shimmer
comment Local dB, the LTAS-slope, the LTAS-tilt and the Harmonics-to-Noise Ratio of
comment the concatenated sound signal, (d) calculates the AVQI-score based on
comment the equation of Barsties & Maryn (2015), and draws the oscillogram, the narrow-
comment band spectrogram with LTAS and the power-cepstrogram with power-
comment cepstrum of the concatenated sound signal to allow further interpretation.
comment >>> To be reliable for the AVQI analysis, it is imperative that the sound recordings
comment are made in an optimal data acquisition conditions.
comment >>> There are two versions in this script: (1) a simple version (only AVQI with
comment data of acoustic measures), and (2) an illustrated version (AVQI with data of
comment acoustic measures and above-mentioned graphs).
choice version: 1
button simple
button illustrated
comment >>> Additional information (optional):
sentence name_patient
sentence left_dates_(birth_-_assessment)
Text... 0.05 Left 1.5 Half Harmonics-to-noise ratio: ##'hnr:2' dB#
Text... 0.05 Left 2.5 Half Shimmer local: ##'shim:2' \% #
Text... 0.05 Left 3.5 Half Shimmer local dB: ##'shdb:2' dB#
Text... 0.05 Left 4.5 Half Slope of LTAS: ##'slope:2' dB#
Text... 0.05 Left 5.5 Half Tilt of trendline through LTAS: ##'tilt:2' dB#
Select inner viewport... 0.5 3.8 0.5 2
Draw inner box
Font size... 7
Arrow size... 1
Select inner viewport... 4 7.5 1.25 2
Axes... 0 10 1 0
Paint rectangle... green 0 2.43 0 1
Paint rectangle... red 2.43 10 0 1
Draw arrow... avqi 1 avqi 0
Draw inner box
Marks top every... 1 1 yes yes no
Font size... 16
Select inner viewport... 4 7.5 0.5 1.15
Axes... 0 1 0 1
Text... 0.5 Centre 0.5 Half AVQI: ##'avqi:2'#
# Copy Praat picture
Select inner viewport... 0.5 7.5 0 2
Copy to clipboard
# Illustrated version
elsif version = 2
# Oscillogram
Font size... 7
Select inner viewport... 0.5 5 0.5 2.0
select Sound avqi
Draw... 0 0 0 0 no Curve
Draw inner box
One mark left... minimumSPL no yes no 'minimumSPL:2'
One mark left... maximumSPL no yes no 'maximumSPL:2'
Text left... no Sound pressure level (Pa)
One mark bottom... 0 no yes no 0.00
One mark bottom... durationOnlyVoice no no yes
One mark bottom... durationAll no yes no 'durationAll:2'
Text bottom... no Time (s)
# Narrow-band spectrogram
Select inner viewport... 0.5 5 2.3 3.8
select Spectrogram avqi
Paint... 0 0 0 4000 100 yes 50 6 0 no
Draw inner box
One mark left... 0 no yes no 0
One mark left... 4000 no yes no 4000
Text left... no Frequency (Hz)
One mark bottom... 0 no yes no 0.00
One mark bottom... durationOnlyVoice no no yes
One mark bottom... durationAll no yes no 'durationAll:2'
Text bottom... no Time (s)
# LTAS
Select inner viewport... 5.4 7.5 2.3 3.8
select Ltas avqi
Draw... 0 4000 minimumSpectrum maximumSpectrum no Curve
Draw inner box
One mark left... minimumSpectrum no yes no 'minimumSpectrum:2'
One mark left... maximumSpectrum no yes no 'maximumSpectrum:2'
Text left... no Sound pressure level (dB/Hz)
One mark bottom... 0 no yes no 0
One mark bottom... 4000 no yes no 4000
Text bottom... no Frequency (Hz)
# Power-cepstrogram
Select inner viewport... 0.5 5 4.1 5.6
select PowerCepstrogram avqi
Paint... 0 0 0.00303 0.01667 0 0 no
Draw inner box
One mark left... 0.00303 no yes no 0.003
One mark left... 0.01667 no yes no 0.017
Text left... no Quefrency (s)
One mark bottom... 0 no yes no 0.00
One mark bottom... durationOnlyVoice no no yes
One mark bottom... durationAll no yes no 'durationAll:2'
Text bottom... no Time (s)
# Power-cepstrum
Select inner viewport... 5.4 7.5 4.1 5.6
select PowerCepstrum avqi_0_100
Draw... 0.00303 0.01667 0 0 no
Draw tilt line... 0.00303 0.01667 0 0 0.00303 0.01667 Straight Robust
Draw inner box
One mark left... maximumCepstrum no yes no 'maximumCepstrum:2'
Text left... no Amplitude (dB)
One mark bottom... 0.00303 no yes no 0.003
One mark bottom... 0.01667 no yes no 0.017
Text bottom... no Quefrency (s)
# Data
Font size... 10
Select inner viewport... 0.5 7.5 5.9 7.4
Axes... 0 7 6 0
Text... 0.05 Left 0.5 Half Smoothed cepstral peak prominence (CPPS): ##'cpps:2'#
Text... 0.05 Left 1.5 Half Harmonics-to-noise ratio: ##'hnr:2' dB#
Text... 0.05 Left 2.5 Half Shimmer local: ##'shim:2' \% #
Text... 0.05 Left 3.5 Half Shimmer local dB: ##'shdb:2' dB#
Text... 0.05 Left 4.5 Half Slope of LTAS: ##'slope:2' dB#
Text... 0.05 Left 5.5 Half Tilt of trendline through LTAS: ##'tilt:2' dB#
Select inner viewport... 0.5 3.8 5.9 7.4
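For readers who want to cross-check a score outside Praat, the weighted AVQI 03.01 equation given under Acoustic Measures can be sketched in Python. This is only an illustration: the function names and the input values are hypothetical and are not measurements from this study, and treating only scores strictly above 2.43 as beyond the threshold is an assumption about the boundary case.

```python
AVQI_THRESHOLD = 2.43  # demarcation point between normal and hoarse reported in the text


def avqi_0301(cpps, hnr, shim, shdb, slope, tilt):
    """AVQI version 03.01 from the six acoustic measures (weights from the article)."""
    weighted_sum = (
        4.152
        - 0.177 * cpps   # smoothed cepstral peak prominence
        - 0.006 * hnr    # harmonics-to-noise ratio
        - 0.037 * shim   # shimmer local
        + 0.941 * shdb   # shimmer local dB
        + 0.01 * slope   # general slope of the long-term average spectrum
        + 0.093 * tilt   # tilt of the trendline through the spectrum
    )
    return weighted_sum * 2.8902


def exceeds_threshold(score, threshold=AVQI_THRESHOLD):
    """True when the score lies above the threshold (boundary handling is assumed)."""
    return score > threshold


# Hypothetical measures, chosen only to exercise the formula
score = avqi_0301(cpps=15.0, hnr=20.0, shim=3.0, shdb=0.3, slope=-25.0, tilt=-10.0)
print(f"AVQI = {score:.2f}, above threshold: {exceeds_threshold(score)}")
```

The Praat script marks the same 2.43 boundary graphically, painting the range from 0 to 2.43 green and the range from 2.43 to 10 red.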
24. Wuyts FL, De Bodt MS, Van de Heyning PH. Is the reliability of a visual analog scale higher than an ordinal scale? An experiment with the GRBAS scale for the perceptual evaluation of dysphonia. J Voice. 1999;13:508-517.
25. Kreiman J, Gerratt BR, Kempster GB, Erman A, Berke GS. Perceptual evaluation of voice quality: review, tutorial, and a framework for future research. J Speech Lang Hear Res. 1993;36:21-40.
26. Chan KM, Yiu EM. The effect of anchors and training on the reliability of perceptual voice evaluation. J Speech Lang Hear Res. 2002;45:111-126.
27. Anders LC, Hollien H, Hurme P, Sonninen A, Wendler J. Perception of hoarseness by several classes of listeners. Folia Phoniatr Logop. 1988;40:91-100.
28. Dejonckere PH. Principal components in voice pathology. Voice. 1995;4:96-105.
29. Shrivastav R. Evaluating voice quality. In: Ma EPM, Yiu EML, ed. Handbook of Voice Assessments. San Diego, CA: Singular Publishing Group; 2011:305-318.
30. Everitt BS. The Cambridge Dictionary of Statistics. 2nd ed. New York: Cambridge University Press; 2002.
31. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33:159-174.
32. Van Belle S. Agreement Between Raters and Groups of Raters [dissertation]. University of Liège, Liège, Belgium; 2009.
33. Fleiss JL. Measuring nominal scale agreement among many raters. Psychol Bull. 1971;76:378-382.
34. Frey LR, Botan CH, Friedman PG, Kreps GL. Investigating Communication: An Introduction to Research Methods. Englewood Cliffs, NJ: Prentice-Hall; 1991.
35. Portney LG, Watkins MP. Foundations of Clinical Research, Applications to Practice. 2nd ed. Englewood Cliff, NJ: Prentice-Hall; 2000.
36. Dollaghan CA. The Handbook for Evidence-based Practice in Communication Disorders. Baltimore, MD: Brookes; 2007.
37. Barsties B, Beers M, Ten Cate L, et al. The effect of visual feedback and training in auditory-perceptual judgment of voice quality [published online November 2, 2015]. Logoped Phoniatr Vocol.
38. Wuyts FL, De Bodt MS, Molenberghs G, et al. The dysphonia severity index: an objective measure of vocal quality based on a multiparameter approach. J Speech Lang Hear Res. 2000;43:796-809.
39. Jayakumar T, Savithri SR. Assessment of voice quality in monozygotic twins: qualitative and quantitative measures. JAIISH. 2009;28:8-13.
40. Henry LR, Helou LB, Solomon NP, et al. Functional voice outcomes after thyroidectomy: an assessment of the Dsyphonia Severity Index (DSI) after thyroidectomy. Surgery. 2010;147:861-870.
41. Aboras Y, El-Banna M, El-Magraby R, Ibrahim A. The relationship between subjective self-rating and objective voice assessment measures. Logoped Phoniatr Vocol. 2010;35:34-38.
42. Smehák G. Complex voice measurement panel for the assessment of the functional evaluation of the laryngeal surgical interventions [dissertation]. University of Szeged, Szeged, Hungary; 2010.
43. Hussein Gaber AG, Liang FY, Yang JS, Wang YJ, Zheng YQ. Correlation among the dysphonia severity index (DSI), the RBH voice perceptual evaluation, and minimum glottal area in female patients with vocal fold nodules. J Voice. 2014;28:20-23.
44. Awan SN, Roy N, Jetté ME, Meltzner GS, Hillman RE. Quantifying dysphonia severity using a spectral/cepstral-based acoustic index: Comparisons with auditory-perceptual judgements from the CAPE-V. Clin Linguist Phon. 2010;24:742-758.
45. Awan SN, Solomon NP, Helou LB, Stojadinovic A. Spectral-cepstral estimation of dysphonia severity: external validation. Ann Otol Rhinol Laryngol. 2013;122:40-48.
46. Peterson EA, Roy N, Awan SN, Merrill RM, Banks R, Tanner K. Toward validation of the cepstral spectral index of dysphonia (CSID) as an objective treatment outcomes measure. J Voice. 2013;27:401-410.
47. Watts CR, Awan SN. An examination of variations in the cepstral spectral index of dysphonia across a single breath group in connected speech. J Voice. 2015;29:26-34.