Article
Purpose: This study examined acoustic predictors of speech intelligibility in speakers with several types of dysarthria secondary to different diseases and conducted classification analysis solely by acoustic measures according to 3 variables (disease, speech severity, and dysarthria type).

Method: Speech recordings from 107 speakers with dysarthria due to Parkinson’s disease, stroke, traumatic brain injury, and multiple system atrophy were used for acoustic analysis and for perceptual judgment of speech intelligibility. Acoustic analysis included 8 segmental/suprasegmental features: 2nd formant frequency slope, articulation rate, voiceless interval duration, 1st moment analysis for fricatives, vowel space, F0, intensity range, and Pairwise Variability Index.

Results: The results showed that (a) acoustic predictors of speech intelligibility differed slightly across diseases and (b) classification accuracy by dysarthria type was typically worse than by disease type or severity.

Conclusions: These findings were discussed with respect to (a) the relationship between acoustic characteristics and speech intelligibility and (b) dysarthria classification.

Key Words: acoustic measures, dysarthria, classification
One of the most important developments in the understanding of dysarthria was the introduction of a classification system by Darley, Aronson, and Brown (1969a, 1969b). This system (hereafter, the Mayo System) is widely used for research and clinical purposes, but questions persist concerning its reliability and validity for different groups of raters. Zyski and Weisiger (1987), who reported on ratings by graduate students and experienced clinicians, expressed doubts about the suitability of the system for clinical purposes, noting low reliability and low rates of accurate classification. More recently, Fonville et al. (2008) reported on classification accuracy for neurologists and neurology trainees, and Van der Graaf et al. (2009) reported on classification accuracy for neurologists, residents in neurology, and speech therapists. In these studies, the rate of correct classification ranged from about 35% to 40%. Both Fonville et al. and Van der Graaf et al. concluded that classification by perceptual judgment alone is not adequate and that professionals should rely on other sources of information. On the other hand, Guerra and Lovely (2003) reported promising results for classification of dysarthria type using nonlinear self-organizing maps operating on a combination of acoustic and perceptual data. A recent study by Liss et al. (2009) reported that rhythm metrics are able to distinguish dysarthric speech from controls and even between dysarthria subtypes, when the subtypes are chosen to be highly representative of their classical description. Forty years after publication of the Mayo System, it seems fair to say that its classification accuracy, either by perceptual or instrumental measures, has not been adequately established in the research literature.
a Louisiana State University, Baton Rouge
b University of Wisconsin–Madison
Correspondence to Yunjung Kim: ykim6@lsu.edu
Editor: Anne Smith
Associate Editor: Wolfram Ziegler
Received January 26, 2010
Accepted August 30, 2010
DOI: 10.1044/1092-4388(2010/10-0020)
Journal of Speech, Language, and Hearing Research • Vol. 54 • 417–429 • April 2011 • © American Speech-Language-Hearing Association

Severity of Dysarthria

A potentially complicating factor in the classification of dysarthria type is speech severity. The Mayo Clinic studies that provided the empirical basis for classification of dysarthria did not control for severity. Rather, severity was allowed to vary within each of the disease groups studied (Darley, Aronson, & Brown, 1975). There is no standard measure of speech severity in dysarthria,
but estimates of speech intelligibility are often used to index the extent to which neurological disease affects the speech mechanism (R. D. Kent et al., 1989). The main barrier to separating effects of severity from types of dysarthria, or other potential classification variables (see the Discussion section), has been the lack of relevant analyses from a sufficiently large number of speakers with different dysarthria types and various levels of severity. The present study draws from a relatively large dysarthria database generated by collaboration between the Mayo Clinic and the University of Wisconsin–Madison to investigate the distribution of selected speech acoustic characteristics within and across dysarthria types and levels of severity, the latter based on estimates of speech intelligibility.

Reviews of dysarthria symptoms at both the articulatory and acoustic levels of analysis (Weismer, 1997; Weismer & Kim, 2010) identify much that is common to speakers with dysarthria, regardless of type. Moreover, the typically large interspeaker variability observed in almost any study of a particular dysarthria type is likely due, in part, to variations in severity of speech involvement. Even in the original Mayo Clinic studies (Darley et al., 1969a, 1969b), there was substantial overlap across dysarthria types in perceptual characteristics (see discussion in Weismer, 1997). Possibly, variation in speech severity within a dysarthria type explains as much variance in physiological, acoustic, and/or perceptual data as variation across dysarthria type. If this is the case, classification of speakers according to speech severity level might be expected to be roughly as accurate as classification by dysarthria type, when properly selected acoustic measures are used as input variables.

The issue of classification accuracy can be expanded to include other potential classification variables. One such variable is disease type. In the original Mayo Clinic studies, the groups were formed based on disease or groups of related diseases such as Parkinson’s disease, cerebellar disease, and so forth (Darley, Aronson, & Brown, 1975). In many cases, there is good correspondence between disease type and dysarthria type, but in other cases, the correspondence is weaker, with multiple dysarthria types being possible for a single disease type because some diseases can affect more than one component of the motor system (Duffy, 2006, pp. 31–33, Table 2-3 and associated text). In the current study, classification according to disease type was also investigated. A reasonable initial hypothesis derived from the Mayo System and its presumed neuropathological basis is that classification according to dysarthria type will be better (i.e., more accurate) than classification based on disease type or severity of speech involvement. This prediction is reasonable because a single disease may be associated with more than one dysarthria type, as noted previously, and severity plays no classification role in the Mayo System.

The classification issue in dysarthria is important for several reasons. First, a complete understanding of any disorder and the development of a scientific program to investigate its characteristics depend on a sound theoretical basis. The theoretical basis of the distinctions between dysarthria types is an assumed feature of the Mayo System but has rarely been investigated when other classification options (such as severity or disease type) are considered. In other words, there is little empirical basis for the classification scheme specified by Darley et al. (1969a, 1969b), as compared with other potential classification approaches. Challenges to the Mayo System based on perceptual judgments have been described previously in this article. Second, investigations in dysarthria often involve groups of participants classified as having a common dysarthria and who are compared with a control group. Much more rarely, multiple groups of participants with perceptually judged, varying dysarthria types may be compared with each other (and with a control group) for distinguishing characteristics (e.g., Liss et al., 2009). Investigation of alternate classification schemes may inform scientists concerning the relative homogeneity of participant groups defined in different ways.

Acoustic Methods

The current classification analyses were based on acoustic measures thought to reflect salient aspects of speech production in persons with motor speech disorders (Kent & Kim, 2008). The advantage of an acoustic approach to understanding motor speech disorders has been noted (Ansel & Kent, 1992; Weismer, 1984). Previous efforts have pursued two main goals: (a) identification of the signal properties underlying intelligibility deficits in dysarthria and (b) identification of acoustic characteristics of specific dysarthria types. Studies have been conducted to (a) identify acoustic measures that predict speech intelligibility scores and (b) investigate the physical correlates of perceptual features in selected types of dysarthria. Examples of acoustic parameters that predict speech intelligibility in speakers with dysarthria include acoustic vowel space (Liu, Tseng, & Tsao, 2000; McRae, Tjaden, & Schoonings, 2002; Tjaden & Wilding, 2004; Weismer, Jeng, Laures, Kent, & Kent, 2001); second formant frequency (F2) slope (J. F. Kent et al., 1992; R. D. Kent et al., 1989; Kim, Weismer, Kent, & Duffy, 2009; Mulligan et al., 1994; Weismer, Martin, Kent, & Kent, 1992); and voice onset time (VOT; Liu et al., 2000).

In addition, acoustic measurements have been used to investigate characteristics of specific types of dysarthria. For example, ataxic dysarthria has been characterized by slow speaking rate, relatively great variability in VOT, a tendency toward equalized vowel/syllable durations within utterances, and an unusually large fundamental frequency (F0) range across utterances (Ackermann
& Hertrich, 1997; Chiu, Chen, & Tseng, 1996; R. D. Kent et al., 2000; Stuntebeck, 2002). Salient acoustic features of hypokinetic dysarthria, as another example, have included either normal or faster-than-normal speaking rates, relatively high mean F0, decreased F2 extents and slopes, and decreased F0 variability (Canter, 1963; Forrest, Weismer, & Turner, 1989; Goberman, Coelho, & Robb, 2005; Solomon & Hixon, 1993; Weismer, 1984, 1991). Other studies have investigated selected acoustic characteristics in spastic (Ozawa, Shiromoto, Ishizaki, & Watamori, 2001; Ozsancak, Auzou, Jan, & Hannequin, 2001; Ziegler & von Cramon, 1986), hyperkinetic (Ackermann, Hertrich, & Hehr, 1995; Hertrich & Ackermann, 1994; Ludlow, Connor, & Bassich, 1987), and flaccid (Morris, 1989) dysarthria, as well as in mixed dysarthrias (Liss et al., 2009; Wang, Kent, Duffy, & Thomas, 2005).

Interestingly, although the Mayo System is widely accepted in clinical practice and for the definition of presumably homogeneous participant groups in research studies (see Duffy, 2005), very few studies other than Liss et al. (2009) have examined the possibility of classification of dysarthria on the basis of acoustic attributes. The unique clusters of perceptual characteristics thought to be the core of the Mayo System (Darley et al., 1975) prompt the hypothesis that properly selected and combined acoustic measures can discriminate among dysarthria types with a reasonable degree of accuracy. As described previously, the same hypothesis of classification from acoustic variables can be entertained for the classification variables of severity and neurological disease type (and, possibly, other variables not discussed here).

Apart from Liss et al. (2009), there is no study of classification of dysarthria in which a single protocol has been used with a relatively large number of speakers having different dysarthria types, varying severities of speech involvement, and different underlying diseases. In general, studies in which instrumental measures of any type have been used to differentiate the dysarthrias are relatively rare and, when available, are usually based on a single measure (or derivatives of that measure). For example, Nishio and Niimi (2001) measured speaking rate and measures associated with it (articulation rate, pause time) in seven different dysarthria types; these measures were not particularly effective in distinguishing among the different dysarthrias. Similarly, Morris (1989) showed long-lag VOTs to be shorter than normal in talkers with five different types of dysarthria. Formant frequency measures, formant transition rates, and even lip/jaw motions have all been shown to be similar across different types of dysarthria (see Ackermann & Hertrich, 1997; see also reviews in Weismer, 1997; Weismer & Kim, 2010). It is unknown, however, whether combinations of measures reflecting different aspects of articulatory behavior (or the speech acoustic signal) would distinguish among different dysarthria types.

To begin the process of exploring the classification of dysarthria based on multiple acoustic measures, the following questions were asked. First, are some or all of the selected acoustic measures correlated with a measure of speech severity, which in this case was operationalized with a speech intelligibility measure? Second, how well do combinations of acoustic measures classify (a) dysarthria types, (b) disease type, and (c) speech severity as indexed by speech intelligibility measures?

Method

Speakers

One hundred and seven subjects with dysarthria consequent to Parkinson’s disease (males [M] = 29, females [F] = 10), stroke (M = 21, F = 18), multiple system atrophy (M = 11, F = 6), and traumatic brain injury (M = 7, F = 5) were selected for the present study from the University of Wisconsin–Madison Mayo Clinic dysarthria database, which consists of digital speech recordings obtained at the Mayo Clinic in Rochester, Minnesota. Parkinson’s disease, stroke, and traumatic brain injury groups were chosen for this study because they are the most frequent etiologies associated with dysarthria in the United States (Centers for Disease Control and Prevention, 2003; Duffy, 2005; National Institute of Neurological Disorders and Stroke, 2001; National Stroke Association, 2002); the group of speakers with multiple system atrophy was chosen because of the availability of a relatively large number of speakers with this diagnosis and the association of this disease with different types of dysarthria. Across all groups of speakers, participants ranged between 20 and 91 years of age (Mdn = 64.5 years). The dysarthria diagnoses were made by a widely acknowledged expert in the area (Duffy, 2005) who was aware of each patient’s medical diagnosis (if it was known at the time of the classification) but who identified the dysarthria type largely, if not entirely, on perceptual evaluation of a patient’s speech (J. R. Duffy, personal communication, June 2, 2010). The dysarthria types he identified included ataxic, spastic, hypokinetic, flaccid, hyperkinetic, Unilateral Upper Motor Neuron (UUMN), and mixed. Table 1 shows subject information, including the distribution of dysarthria types for these 107 participants; the mixed category pools the different combinations (e.g., spastic–flaccid, spastic–ataxic) used in the Mayo System. Potential participants with language disorders, apraxia of speech, or aprosodia were excluded from the current study.

Note. PD = Parkinson’s disease; TBI = traumatic brain injury; MSA = multiple system atrophy; UUMN = Unilateral Upper Motor Neuron; M = male; F = female.

Procedure

The speech samples used in this study included word and sentence recitations. Participants were asked
to produce each of six words (hail, shoot, sigh, sip, ship, and wax) 10 times in a row. These words were selected because of their acoustic characteristics and previous demonstrations of their sensitivity to varying severities of dysarthria (R. D. Kent et al., 1989; Weismer, Kent, Hodge, & Martin, 1988). The vocalic nuclei of some of these words require relatively extensive changes in vocal tract shape that were consistent with the sensitivity of formant transitions to speech symptoms in dysarthria (Kim et al., 2009). Other words were chosen for a specific type of analysis appropriate to analysis of consonant production (e.g., sip vs. ship for first-moment (M1) analysis; acoustic measures are explained later in this article). Among the 10 repetitions for each word, the middle eight of the repetition string were taken for analyses to eliminate possible initial and final location effects.

For the sentence recitation task, all participants recited the following five sentences in response to a live-voice demonstration by the examiner: (a) “Put the high stack of cards on the table,” (b) “Combine all the ingredients in a large bowl,” (c) “The blue spot is on the key,” (d) “The potato stew is in the pot,” and (e) “The boiling tornado clouds moved swiftly.” These sentences were used to obtain speech intelligibility data as well as acoustic data. Three listeners who had a background in speech-language pathology, but who were not expert listeners of speech disorders in general or dysarthria specifically (in the sense defined by Monsen, 1983, for listeners of speakers with profound hearing impairment), made judgments of speech intelligibility. Intelligibility data were obtained using a direct magnitude estimation technique (Gescheider, 1976). Inter- and
intralistener variability in selection of number scale ranges was eliminated by following a procedure described by Engen (1971).

The speech samples were collected in a quiet room with a high-quality microphone (SHURE SM 58) and a digital audiotape recorder (DAT; TASCAM DA-P1) at a sampling rate of 44.1 kHz and with 16-bit quantization. After the utterances had been recorded on DAT, they were analyzed using the speech analysis program TF32 (Milenkovic, 2001).

All experimental procedures were approved by the University of Wisconsin–Madison Human Subject Committee.

Acoustic Analysis

Acoustic measurements were made of (a) sentence duration, (b) vowel duration, (c) voiceless interval duration, (d) first and second formant frequencies from four corner vowels, (e) M1 for fricatives (/s/ and /ʃ/) during three 50-ms-long windows approaching the vocalic nucleus (25-ms overlap between adjacent windows), (f) transition duration and extent for F2 transitions, (g) F0 contour, and (h) root-mean-square (RMS) intensity contour. Among these measures, the voiceless interval duration was used as measured; the remaining measures were used to derive the following variables for further statistical analysis: (a) articulation rate, (b) Pairwise Variability Index (PVI), (c) acoustic vowel space, (d) M1 difference between /s/ and /ʃ/, (e) F2 slope, (f) F0 range (maximum–minimum) of utterance, and (g) RMS intensity range of utterance. Except for the voiceless interval duration, these acoustic variables were selected based on previous studies that have reported them to be useful either for predicting intelligibility scores or characterizing the production deficit in dysarthric speech. Voiceless interval duration was investigated as an alternative for VOT because of recent concerns about the interpretation of the latter measure (Auzou et al., 2000) and its sensitivity to age-related effects on speech production (Weismer & Fromm, 1983). The PVI served as an estimate of the degree of scanning speech (see Low, Grabe, & Nolan, 2000; Stuntebeck, 2002). Word materials were used to derive M1 for the fricative pair (sip vs. ship) and F2 slopes from vocalic nuclei (hail, wax, sigh, and shoot), whereas sentence duration, vowel duration, voiceless interval duration, acoustic vowel space, overall F0, and RMS intensity range were measured from sentence materials.

Tracking errors observed for formant trajectories and F0 values (usually due to poor voice quality or sudden phonation change) were manually modified using the interactive editor in TF32. Considering possible outliers in F0 and RMS intensity contours, the interquartile range was used to estimate the range of variation for F0 and intensity contours.

Ten subjects were randomly selected in order to examine intrajudge reliability for acoustic measurements. The correlation coefficient computed between the original and remeasured data (after 9 months) across all acoustic variables was .95, which suggested that the measurements were repeatable and were made under a replicable set of criteria.

Results

Speech Intelligibility Scores

The modulus-equalized speech intelligibility score across the 107 speakers with dysarthria ranged from 0.58 to 1.63 (Mdn = 1.40). If listeners were actually doing ratio scaling, this suggests that intelligibility among these speakers varied over a roughly 2.8:1 range (i.e., the most intelligible speaker was scaled as approximately 2.8 times more intelligible than the least intelligible speaker). Speech intelligibility scores for individual disease groups are displayed in Figure 1. The Parkinson’s disease group had the highest mean intelligibility of the four groups, and the stroke group had the lowest; t test results (adjusted per comparison for an overall alpha level of .05) revealed that there were no significant differences in speech intelligibility among the clinical groups.

Figure 1. Box-and-whisker plot of intelligibility scores, plotted as modulus-equalized values, for speakers with dysarthria in the four disease groups. Mean values of each group are indicated by the dotted line within the box and median by the solid line. The left (lower value) edge of each box is the 25th percentile of the distribution; the right (upper value) edge is the 75th percentile; the whiskers show the lowest and highest nonoutlier observations, and the individual plotted points are outliers. PD = Parkinson’s disease; TBI = traumatic brain injury; MSA = multiple system atrophy.
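As a rough illustration of three of the derived variables named in the Acoustic Analysis section (the interquartile range used as an outlier-resistant range estimate, the PVI, and acoustic vowel space), the following Python sketch may help. It is not the authors' procedure: the function names are hypothetical, the PVI is written in its normalized form (the article does not state which variant was used), vowel space is assumed to be the area of the polygon formed by the corner-vowel (F1, F2) points, and all input values are invented for illustration.

```python
from statistics import quantiles


def iqr_range(contour):
    """Interquartile range (Q3 - Q1) of a contour, e.g. F0 samples in Hz."""
    q1, _median, q3 = quantiles(contour, n=4, method="inclusive")
    return q3 - q1


def normalized_pvi(durations):
    """Normalized PVI: mean of |d_k - d_(k+1)| / mean(d_k, d_(k+1)), times 100."""
    terms = [abs(a - b) / ((a + b) / 2) for a, b in zip(durations, durations[1:])]
    return 100 * sum(terms) / len(terms)


def vowel_space_area(corners):
    """Area of the polygon with (F1, F2) vertices given in order (shoelace formula)."""
    closed = corners[1:] + corners[:1]
    twice_area = sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(corners, closed))
    return abs(twice_area) / 2


# Hypothetical values, for illustration only (not data from the study).
f0_contour_hz = [100, 110, 120, 130, 300]  # 300 Hz mimics a tracking outlier
vowel_durations_ms = [100, 300]
corner_vowels_hz = [(300, 2300), (700, 1800), (750, 1100), (350, 900)]

print(iqr_range(f0_contour_hz))
print(normalized_pvi(vowel_durations_ms))
print(vowel_space_area(corner_vowels_hz))
```

In this sketch the single outlying F0 sample leaves the interquartile range unchanged, whereas a maximum-minus-minimum range would be dominated by it, which is the motivation the text gives for using the interquartile range.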
Figure 2. The relationships between second formant frequency (F2) slope of the word wax and speech intelligibility scores: male (M) speakers
with dysarthria (left) and female (F) speakers with dysarthria (right). **p < .01.
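The relationship plotted in Figure 2 pairs one F2 slope per speaker with that speaker's intelligibility score. A minimal Python sketch of the two quantities involved is given here, assuming that F2 slope is the transition extent divided by the transition duration (Hz/ms) and that the strength of the relationship is summarized with a Pearson correlation. The function names and all numbers are hypothetical and are not taken from the study.

```python
import math


def f2_slope(extent_hz, duration_ms):
    """F2 slope in Hz/ms: transition extent divided by transition duration."""
    return extent_hz / duration_ms


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Invented (extent in Hz, duration in ms) pairs and intelligibility scores.
transitions = [(300, 120), (450, 110), (600, 100), (700, 95)]
intelligibility = [0.7, 1.0, 1.3, 1.5]

slopes = [f2_slope(extent, duration) for extent, duration in transitions]
print(slopes)
print(pearson_r(slopes, intelligibility))
```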
and 51.5% with their original severity groups. Among the eight acoustic variables, seven showed the greatest number of subjects classified identically when severity was the classification variable. When all eight acoustic variables were combined for the discriminant function, however, the results were slightly different (see Table 4). When the stepwise discriminant function was performed by pooling male and female talkers, subjects were significantly classified with their original group by etiology (68.6%), F(6, 190) = 9.97, p < .01; severity (54.9%), F(2, 97) = 8.74, p < .01; and type of dysarthria (31.7%), F(6, 93) = 4.55, p < .01. For male speakers, the greatest number of subjects was classified with their original severity group, whereas for female speakers, the greatest number of patients was classified with their etiology group. For acoustic variables that are obviously affected by gender (e.g., formant frequencies), both regression and discriminant analyses were conducted separately for male and female speakers.

Figure 3. Scatter plot of articulation rate and speech intelligibility scores. **p < .01.

Although the focus of this study was on classification accuracy, and not on the individual acoustic variables contributing to the significant discriminant functions, the analysis showed that different sets of acoustic variables contributed to the classification of etiology, type, and severity (see Table 4). Articulation rate, voiceless interval duration, and intensity range contributed to etiology classification, whereas articulation rate and F0 range contributed to type classification. For severity classification, F2 slope, F0 range, and vowel space made significant contributions to the discriminant function.

Classification analysis was performed a final time with only three types of dysarthria (hypokinetic, ataxic, and UUMN; 59 of 107 speakers) because of reservations about unbalanced number of categories (for severity, etiologies, and types of dysarthria), as well as the mixed type of dysarthria that was heterogeneous with respect to dysarthria type. Results showed that the classification rate by types of dysarthria improved when the number of types was reduced from seven to three. However, more subjects were still classified correctly when coded by severity (62.2%) than when coded by type of dysarthria (59.3%) or etiology (56.1%).
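The percentages reported for the classification analyses are the share of speakers that a discriminant function assigns back to their original group. The Python sketch below shows one way such a figure can be obtained. It is a simplified stand-in under stated assumptions, not the stepwise procedure used in the study: it fits an ordinary linear discriminant model to two features with a pooled covariance matrix and scores it on the same cases it was fitted to (resubstitution). The function names, group labels, and feature values are all invented.

```python
import math
from collections import defaultdict


def fit_lda(features, labels):
    """Fit a linear discriminant model for two features with a pooled covariance."""
    groups = defaultdict(list)
    for row, label in zip(features, labels):
        groups[label].append(row)
    n, k = len(features), len(groups)
    means = {g: (sum(r[0] for r in rows) / len(rows),
                 sum(r[1] for r in rows) / len(rows)) for g, rows in groups.items()}
    # Pooled within-group sums of squares and cross-products, divided by n - k.
    sxx = sxy = syy = 0.0
    for g, rows in groups.items():
        mx, my = means[g]
        for x, y in rows:
            sxx += (x - mx) ** 2
            sxy += (x - mx) * (y - my)
            syy += (y - my) ** 2
    sxx, sxy, syy = sxx / (n - k), sxy / (n - k), syy / (n - k)
    det = sxx * syy - sxy ** 2
    inverse = ((syy / det, -sxy / det), (-sxy / det, sxx / det))
    priors = {g: len(rows) / n for g, rows in groups.items()}
    return means, inverse, priors


def classify(model, point):
    """Assign a point to the group with the largest linear discriminant score."""
    means, inv, priors = model

    def score(g):
        mx, my = means[g]
        wx = inv[0][0] * mx + inv[0][1] * my
        wy = inv[1][0] * mx + inv[1][1] * my
        return point[0] * wx + point[1] * wy - 0.5 * (mx * wx + my * wy) + math.log(priors[g])

    return max(means, key=score)


def percent_correct(features, labels):
    """Percentage of cases assigned to their original group (resubstitution)."""
    model = fit_lda(features, labels)
    hits = sum(classify(model, row) == label for row, label in zip(features, labels))
    return 100 * hits / len(labels)


# Made-up, unitless feature pairs for two hypothetical groups (not study data).
features = [(1.0, 2.0), (1.2, 1.9), (0.9, 2.2), (1.1, 2.1),
            (3.0, 0.5), (3.2, 0.4), (2.9, 0.7), (3.1, 0.6)]
labels = ["group_a"] * 4 + ["group_b"] * 4
print(percent_correct(features, labels))
```

When comparing such percentages across classification variables, it helps to keep the number of categories in mind, since chance-level agreement is higher with three categories than with seven; this is the concern about unbalanced numbers of categories that motivated the final three-type analysis.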
%          F2          VS          M1     VID    AR     dB     F0     PVI
Etiology   All 36.3    All 35.0    38.6   51.0   49.5   33.0   33.0   32.0
           M   47.6    M   17.2
           F   56.4    F   38.5
Type       All 36.3    All 22.3    37.6   17.6   27.2   19.4   31.1   19.4
           M   34.9    M   23.4
           F   53.8    F   20.5
Severity   All 58.8    All 45.6    58.4   46.1   51.5   42.7   41.7   32.0
           M   66.7    M   50.0
           F   56.4    F   46.2

Note. Numbers in the table indicate the percentage of subjects who were correctly identified by discriminant analysis as belonging to their original categories. For second formant slope (F2) and vowel space (VS), results are reported separately for male and female speakers. Each acoustic variable is indicated in abbreviated form. M1 = first-moment difference; VID = voiceless interval duration; AR = articulation rate; dB = intensity range; F0 = fundamental frequency range; PVI = Pairwise Variability Index.

Discussion

One goal of this study was to identify a set of acoustic variables that predict speech intelligibility for diverse types of dysarthria and diseases that cause dysarthria
and to examine whether different predictors are required for different diseases associated with dysarthria. A second goal was to assess classification accuracy of speakers with dysarthria using acoustic measures as input variables and three different classification variables. The relationship of the findings of this study to these two goals is discussed in the section that follows.

Table 4. Classification analysis result for combination of eight variables.

%          All     Male    Female   Contributing factors to classification
Etiology   68.6    56.3    73.7     AR, VID, dB
Type       31.7    39.1    41.0     AR, F0
Severity   54.9    68.3    53.8     F2 slope, F0, VS

Which Acoustic Variables Predict Speech Intelligibility?

All segmental variables showed significant relationships with intelligibility. This is consistent with the only other study known to us that has compared the impact of segmental and suprasegmental variables on speech intelligibility (de Bodt, Hernández-Díaz, & Van de Heyning, 2002). Among the six variables that had a significant relationship with intelligibility when regression analyses were performed with all speakers pooled, only F2 slope appeared to be significantly regressed on speech intelligibility when analyses were performed within etiology groups (Parkinson’s disease, stroke, traumatic brain injury, and multiple system atrophy). Not only was F2 slope significant for each disease group, but its effect size (i.e., the variance accounted for) was the greatest of all acoustic variables. This is consistent with the results of previous studies that have reported F2 slope as one of the most sensitive indices of vocal tract function for speech production, as measured by intelligibility ratings, in neurodegenerative diseases such as amyotrophic lateral sclerosis (R. D. Kent et al., 1989; Weismer et al., 1992) and Parkinson’s disease (Weismer, 1991). F2 slope, therefore, seems to be an indicator of dysarthria severity, as indexed by speech intelligibility scores, for speech motor control deficits in general (i.e., independent of dysarthria type and etiology; see Weismer & Kim, 2010).

In contrast to F2 slope, an exemplar of a “nonuniversal” predictor is articulation rate. Articulation rate was a strong predictor of intelligibility in the pooled analysis but was not a significant predictor of speech intelligibility for speakers with Parkinson’s disease, who were exclusively diagnosed with hypokinetic dysarthria. As reported in several studies, speakers with Parkinson’s disease (or speakers with hypokinetic dysarthria) can have a speech rate similar to or faster than (and sometimes slower than) that of healthy speakers. This may create a ceiling effect on a possible relationship between articulation rate and intelligibility in this group (Goberman & McMillan, 2005; Weismer, 1984). In other words, there is more margin for the relationship between the two factors to vary on the low side of typical articulation rate than to vary on the high side (Turner & Weismer, 1993). Perhaps a reasonable explanation is that faster-than-typical articulation rates do not associate with lowered intelligibility in the same way as do slower-than-typical rates, the latter of which clearly covaried with speech intelligibility in the current study. Alternatively, or in addition to this, because so many studies
have found essentially normal rates in Parkinson’s disease, the measure may not be expected to predict variation in speech intelligibility for this group.

F2 slope and articulation rate as predictors of speech severity are also different in another way. Both variables are connected to overall speech severity, but the relationship between articulation rate and speech severity, even excluding speakers with Parkinson’s disease, may be slightly more complicated. Whereas articulation rate decreases with speech severity, as indexed by intelligibility, it may also decrease as a result of speaker compensation to achieve greater intelligibility. In this sense, slowing of articulation rate may reflect influences with opposite effects on speech intelligibility: one effect due to severity, the other due to attempted “correction” for the effects of severity. Compensation for shallow F2 slopes, on the other hand, reflects an attempt to move articulatory behavior in the same direction as lesser severity: Increased transition “speed” is both compensatory and associated with lesser severity (greater intelligibility). Because of these considerations, theories of motor speech disorders and clinical practice may need to treat F2 slope and articulation rate differently. Although there is evidence that speakers with dysarthria can change articulation rate on command (e.g., Van Nuffelen, de Bodt, Vanderwegen, Van de Heyning, & Wuyts, 2010), a similar demonstration of voluntary control over F2 slope for syllables with large vocalic transitions does not exist for speakers with dysarthria.

It is not surprising that the majority of acoustic variables covaried with speech intelligibility scores in this study, especially considering the wide range of speech intelligibility scores for the speakers with dysarthria and the large number of participants in the study. Across speakers, it is clearly a matter of the degree to which an acoustic variable is affected by dysarthria, rather than whether the variable is affected. The fact that all variables are expected to covary with speech intelligibility, especially when intelligibility varies over a wide range, does not necessarily mean the covariation would be the same for different disease groups or dysarthria types. The current analyses were restricted to disease type because similar analyses using dysarthria type would have been partially (and in some cases largely) redundant because of the substantial overlap between disease

Classification of Dysarthria

Since the appearance of the Mayo System, most acoustic studies have sought to discover acoustic correlates that quantify perceptual features of dysarthria as described by Darley et al. (1969a, 1969b) within a single disease or dysarthria type. More specifically, acoustic studies have been conducted to establish a set of acoustic characteristics that describe overall speech characteristics pertaining to a certain disease or dysarthria type, rather than comparing these characteristics across disease or dysarthria types. However, as a substantial amount of speech acoustic data have been collected from persons with various types of dysarthria, it is apparent that a common set of acoustic characteristics is exhibited in multiple types of dysarthria. The presence of acoustic commonalities is not surprising because many of the perceptual dimensions in the Darley et al. studies are shared across dysarthria types, as documented by Darley et al. (1969a) and subsequent studies (Zeplin & Kent, 1996). It is the clusters of dimensions that are thought to distinguish among types of dysarthria, not individual perceptual dimensions. Perceptual clusters are likely to be associated with multiple acoustic indices, and perhaps it is not surprising that a particular acoustic measure would show severity-related variations within several different types of dysarthria. A single measure may be common to different dysarthria types even though its combination with other measures may be a unique feature of a particular dysarthria type.

For example, studies have shown more shallow F2 slopes for transitions of speakers with motor speech disorders, regardless of the etiology and type of dysarthria (Kim et al., 2009). This phenomenon is interesting considering the dissimilarity of the typical underlying neuropathology in diseases such as Parkinson’s disease and stroke. Acoustic similarity across etiologies and types of dysarthria has also been found for vowel space, speaking rate, VOT, and even tone production in Mandarin (evidence reviewed by Weismer, 2006). The current study adds two more acoustic similarities across the four etiology groups in this investigation, including reduced intensity range and the M1 difference between /s/ and /ʃ/. Accumulating data support the view that different neuropathophysiologies may give rise to similar man-
and dysarthria type in many (but not all) cases (Duffy, ifestations at the speech acoustic surface and, by
2006). Of course, it is possible that an analysis of acous- inference, at the level of neuromotor control of speech
tic predictors of speech intelligibility within a group of production (presumably including motor commands and
patients having a single disease but multiple types of resulting movements of speech mechanism structures).
dysarthria (e.g., head injury or corticobasal degeneration; Even with different underlying neuropathologies, there
see Duffy, 2006) would reveal dysarthria-type–specific are probably a constrained number of ways in which
predictors of intelligibility. The present results cannot an- performance of the speech mechanism can deteriorate
swer this question definitively but only suggest that the when affected by neurological disease. These constraints
results would not differ much from those reported here. presumably result in the core set of acoustic features
426 Journal of Speech, Language, and Hearing Research • Vol. 54 • 417–429 • April 2011
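The point made above, that an acoustic variable may covary with intelligibility in every disease group without the covariation being the same across groups, can be illustrated with a minimal sketch. This is not the analysis used in this study: the function names (`pearson_r`, `correlation_by_group`), the group labels, and the toy values are all hypothetical and serve only to show the computation.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def correlation_by_group(records):
    """Compute one correlation per group.

    records: iterable of (group_label, acoustic_value, intelligibility_score).
    Returns a dict mapping each group label to its Pearson r.
    """
    grouped = {}
    for group, acoustic_value, intelligibility in records:
        xs, ys = grouped.setdefault(group, ([], []))
        xs.append(acoustic_value)
        ys.append(intelligibility)
    return {group: pearson_r(xs, ys) for group, (xs, ys) in grouped.items()}


# Made-up toy values, NOT data from this study: an acoustic measure
# (arbitrary units) paired with an intelligibility score (percent).
toy_records = [
    ("group_A", 1.0, 20.0), ("group_A", 2.0, 40.0),
    ("group_A", 3.0, 60.0), ("group_A", 4.0, 80.0),
    ("group_B", 1.0, 50.0), ("group_B", 2.0, 70.0),
    ("group_B", 3.0, 55.0), ("group_B", 4.0, 75.0),
]

by_group = correlation_by_group(toy_records)
for group, r in sorted(by_group.items()):
    print(f"{group}: r = {r:.2f}")
```

In this toy case both hypothetical groups show a positive relation, but it is much stronger in one group than in the other, which is the kind of difference a pooled correlation across all speakers would hide.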
(2007) combines an expert system and a pretrained multilayer perceptron. This system produced good classification results under certain conditions. Second, the results of the different classification analyses reported in Tables 3 and 4 cannot be regarded as fully independent estimates of different classification performance. After all, dysarthria type is often (but not always) linked with disease (etiology) type, and different diseases may be more or less likely to produce greater speech severity in the kinds of utterances from which the acoustic measures were extracted (e.g., persons with Parkinson’s disease are often highly intelligible in prepared utterances even when not so intelligible in spontaneous speech, whereas persons with amyotrophic lateral sclerosis may be equally intelligible in both forms of utterance). Finally, the acoustic measures used in the current investigation were relatively extensive and were selected for their known sensitivity to dysarthria but clearly were not broad enough in scope to rule out the possibility that a critical classifying measure was omitted from the analyses. Perhaps a more extensive set of measures would capture specific speech production phenomena that are the basis for the presumed validity and reliability of the dysarthria classification system originally advanced by Darley et al. (1969a). For example, Liss et al. (2009) recently demonstrated that acoustic metrics of speech rhythm classified persons having four different types of dysarthria with an impressive degree of accuracy (roughly 70%–80% correct classification, depending on the stringency of the classification criteria). The current study did not include the variety of rhythm measures used in the Liss et al. classification exercise, which may explain the better classification performance in the latter study. It must be pointed out, however, that in Liss et al., the dysarthria-type classifications of participants were chosen specifically from among a much larger group of potential participants to have rhythm characteristics typical of the cardinal dysarthria types presumed to be associated with four different diseases (see Liss et al., 2009, p. 1336, and J. M. Liss, personal communication, December 29, 2009).

Participants in the current study were chosen only because they had neuromotor diseases known to produce dysarthria. The classification success in Liss et al. (2009) is, at best, roughly 10% better than the classification by disease in the current study and may decrease for a less selected set of participants. It is also possible that the classification results of Liss et al. were better than those of the present study because they used a normalized version of the PVI (see White & Mattys, 2007), in contrast to the raw PVI measure of the current investigation. Clearly, additional investigations with participants having different disease types, and possibly different dysarthria types, as well as additional measures are required to make progress on the issue of what makes speakers with dysarthria similar to one another and what makes them different from each other.

Acknowledgment

This study was supported by National Institute on Deafness and Other Communication Disorders Grant DC00319 and 2006 New Century Scholarships from the American Speech-Language-Hearing Foundation. We thank Joseph R. Duffy of the Mayo Clinic, Rochester, Minnesota, for performing the dysarthria-type classifications used in this study.

References

Ackermann, H., & Hertrich, I. (1997). Voice onset time in ataxic dysarthria. Brain and Language, 56, 321–333.
Ackermann, H., Hertrich, I., & Hehr, T. (1995). Oral diadochokinesis in neurological dysarthrias. Folia Phoniatrica et Logopaedica, 47, 15–23.
Ansel, B. M., & Kent, R. D. (1992). Acoustic–phonetic contrasts and intelligibility in the dysarthria associated with mixed cerebral palsy. Journal of Speech and Hearing Research, 35, 296–308.
Auzou, P., Ozsancak, C., Morris, R. J., Jan, M., Eustache, F., & Hannequin, D. (2000). Voice onset time in aphasia, apraxia of speech and dysarthria: Review. Clinical Linguistics & Phonetics, 14, 131–150.
Canter, G. (1963). Speech characteristics of patients with Parkinson’s disease. I. Intensity, pitch, and duration. Journal of Speech and Hearing Disorders, 28, 221–229.
Carmichael, J. N. (2007). Introducing objective acoustic metrics for the Frenchay Dysarthria Assessment procedure. (Unpublished doctoral dissertation). University of Sheffield, United Kingdom.
Centers for Disease Control and Prevention. (2003). Traumatic brain injury. Retrieved from http://www.cdc.gov/ncipc/factsheets/tib.htm.
Chiu, M. J., Chen, R. C., & Tseng, C. Y. (1996). Clinical correlates of quantitative acoustic analysis in ataxic dysarthria. European Neurology, 46, 310–314.
Darley, F. L., Aronson, A. E., & Brown, J. R. (1969a). Differential diagnostic patterns of dysarthria. Journal of Speech and Hearing Research, 12, 246–269.
Darley, F. L., Aronson, A. E., & Brown, J. R. (1969b). Clusters of deviant speech dimensions in the dysarthrias. Journal of Speech and Hearing Research, 12, 462–496.
Darley, F. L., Aronson, A. E., & Brown, J. R. (1975). Motor speech disorders. Philadelphia, PA: W. B. Saunders.
de Bodt, M. S., Hernández-Díaz, H. M., & Van de Heyning, P. H. (2002). Intelligibility as a linear combination of dimensions in dysarthric speech. Journal of Communication Disorders, 35, 283–292.
Duffy, J. R. (2005). Motor speech disorders: Substrates, differential diagnosis, and management. St. Louis, MO: Mosby.
Duffy, J. R. (2006). History, current practice, and future trends and goals. In G. Weismer (Ed.), Motor speech disorders (pp. 7–56). San Diego, CA: Plural.
Van Nuffelen, G., de Bodt, M., Vanderwegen, J., Van de Heyning, P., & Wuyts, F. (2010). Effect of rate control on speech production and intelligibility in dysarthria. Folia Phoniatrica et Logopaedica, 62, 110–119.
Wang, Y.-T., Kent, R. D., Duffy, J. R., & Thomas, J. E. (2005). Dysarthria associated with traumatic brain injury: Speaking rate and emphatic stress. Journal of Communication Disorders, 38, 231–260.
Weismer, G. (1984). Articulatory characteristics of Parkinsonian dysarthria. In M. R. McNeil, J. C. Rosenbek, & A. Aronson (Eds.), The dysarthrias: Physiology–acoustic–perception–management (pp. 101–130). San Diego, CA: College-Hill.
Weismer, G. (1991). Assessment of articulatory timing. In J. Cooper (Ed.), NIDCD Monograph #1: Assessment of speech and voice production: Research and clinical applications (pp. 83–95). Bethesda, MD: National Institutes of Health.
Weismer, G. (1997). Motor speech disorders. In W. J. Hardcastle & J. Laver (Eds.), The handbook of phonetic sciences (pp. 191–219). Cambridge, MA: Blackwell.
Weismer, G. (2006). Philosophy of research in motor speech disorders. Clinical Linguistics & Phonetics, 20, 315–349.
Weismer, G., & Fromm, D. (1983). Acoustic analysis of geriatric utterances: Segmental and nonsegmental characteristics that relate to laryngeal function. In M. Bless & J. H. Abbs (Eds.), Vocal fold physiology: Contemporary research and clinical issues (pp. 317–332). San Diego, CA: College-Hill.
Weismer, G., Jeng, J., Laures, R., Kent, R., & Kent, J. (2001). Acoustic and intelligibility characteristics of sentence production in neurogenic speech disorders. Folia Phoniatrica et Logopaedica, 53, 1–18.
Weismer, G., & Kim, Y.-J. (2010). Classification and taxonomy of motor speech disorders: What are the issues? In B. Maassen & P. H. H. M. van Lieshout (Eds.), Speech motor control: New developments in basic and applied research (pp. 229–241). Oxford, England: Oxford University Press.
Weismer, G., Martin, R., Kent, R. D., & Kent, J. F. (1992). Formant trajectory characteristics of males with amyotrophic lateral sclerosis. The Journal of the Acoustical Society of America, 91, 1085–1098.
Weismer, G., Kent, R. D., Hodge, M., & Martin, R. (1988). The acoustic signature for intelligibility test words. The Journal of the Acoustical Society of America, 84, 1281–1291.
White, L., & Mattys, S. (2007). Calibrating rhythm: First language and second language studies. Journal of Phonetics, 35, 501–522.
Zeplin, J., & Kent, R. D. (1996). Reliability of auditory–perceptual scaling of dysarthria. In D. A. Robin, K. Y. Yorkston, & D. R. Beukelman (Eds.), Disorders of motor speech (pp. 145–154). Baltimore, MD: Brooks.
Ziegler, W., & von Cramon, D. (1986). Spastic dysarthria after acquired brain injury: An acoustic study. British Journal of Disorders of Communication, 21, 173–187.
Zyski, B. J., & Weisiger, B. E. (1987). Identification of dysarthria types based on perceptual analysis. Journal of Communication Disorders, 20, 367–378.