Professional Documents
Culture Documents
Research Report Assessing Disordered Speech and Voice in Parkinson's Disease: A Telerehabilitation Application
Research Report Assessing Disordered Speech and Voice in Parkinson's Disease: A Telerehabilitation Application
Research Report
Assessing disordered speech and voice in Parkinson’s disease: a telerehabilitation
application
Gabriella Constantinescu†, Deborah Theodoros†, Trevor Russell‡, Elizabeth Ward†, Stephen Wilson§
and Richard Wootton{
†Division of Speech Pathology, University of Queensland, St Lucia, Brisbane, QLD, Australia
‡Division of Physiotherapy, University of Queensland, St Lucia, Brisbane, QLD, Australia
§School of Information Technology and Electrical Engineering, University of Queensland, St Lucia, Brisbane, QLD, Australia
{Scottish Centre for Telehealth, Aberdeen, UK
Int J Lang Commun Disord Downloaded from informahealthcare.com by University of Queensland
Abstract
Background: Patients with Parkinson’s disease face numerous access barriers to speech pathology services for
appropriate assessment and treatment. Telerehabilitation is a possible solution to this problem, whereby
rehabilitation services may be delivered to the patient at a distance, via telecommunication and information
technologies. A number of studies have demonstrated the capacity of telerehabilitation to provide reliable and valid
assessments of speech, voice and language. However, no studies have specifically focused on assessing patients with
Parkinson’s disease.
Aims: To investigate the validity and reliability of a telerehabilitation application for assessing the speech and voice
For personal use only.
Keywords: Parkinson’s disease, telerehabilitation, Internet-based assessment, speech and voice disorder.
Address correspondence to: Gabriella Constantinescu, Division of Speech Pathology, University of Queensland, St Lucia, Brisbane, QLD
4067, Australia; e-mail: gabriella@hearandsaycentre.com.au
International Journal of Language & Communication Disorders
ISSN 1368-2822 print/ISSN 1460-6984 online q 2010 Royal College of Speech & Language Therapists
http://www.informahealthcare.com
DOI: 10.3109/13682820903470569
2 Gabriella Constantinescu et al.
ment associated with Parkinson’s disease, is character- The disparity between the supply and demand of
ized by monotony of pitch and loudness, reduced speech pathology services for people with Parkinson’s
loudness and stress, imprecise articulation, variable rate disease suggests the need for an additional or alternate
and short rushes of speech, inappropriate silences, and a mode of service delivery for this population. One
harsh and breathy voice (Darley et al. 1969). The possible solution is the use of telerehabilitation,
incidence of the speech disorder occurs in as many as whereby telecommunication and information technol-
50– 90% of individuals during the course of their ogies are used in the delivery of healthcare at a distance.
disease (Ramig et al. 2004), with the severity of the Telerehabilitation is an emerging field in speech
dysarthria increasing with disease progression (Hartelius pathology and research has included various communi-
and Svensson 1994). Impaired speech intelligibility may cation technologies for the assessment and treatment of
result in decreased involvement in communicative motor speech, fluency, voice and language disorders.
exchanges, isolation within the family and community, The earlier use of the telephone, closed-circuit
and a subsequent degradation in the person’s quality of television and satellite-based videoconferencing, are
life (Oxtoby 1982). gradually being replaced by Internet-based videocon-
Patient access to speech-language pathology services ferencing via a personal computer which are now
for appropriate assessment and treatment of this accessible to many individuals.
condition is limited. Grimm et al. (2004) in a survey Regardless of the technology used, valid and reliable
of 250 people with Parkinson’s disease across Queens- assessment procedures need to be established to ensure
land, Australia, identified certain barriers impacting on effective telerehabilitation services. Recent studies using
service access. The barriers included: low priority placed Internet-based videoconferencing via a personal com-
on speech pathology services in the public health puter have highlighted the benefits of telerehabilitation
system; limited availability of speech – language path- for the assessment of motor speech performance, story
ologists (SLPs) trained to administer the effective Lee retell, language comprehension and expression for
Silverman Voice Treatmentw (LSVT) for Parkinson’s adults with neurological impairments (Brennan et al.
disease; physical incapacity of the individuals; difficul- 2004, Georgeadis et al. 2004, Hill et al. 2006, 2008a,
ties with transport and travel, and the large distances to 2008b, Palsbo 2007, Theodoros et al. 2008). The
the health service facilities. Of the people with studies commonly reported high levels of agreement
Parkinson’s disease who were surveyed, only 36.7% between the online and face-to-face ratings for the
reported having access to speech pathology services. majority of parameters investigated on the informal and
Online speech assessment of Parkinson’s disease 3
standardized assessments. High participant satisfaction Fifty-seven of the participants had been diagnosed with
with the online environment was also obtained Idiopathic Parkinson’s disease, of which nine participants
(Brennan et al. 2004, Georgeadis et al. 2004, Hill had undergone surgical treatment for Parkinson’s disease
et al. 2008a, 2008b, Theodoros et al. 2008). including deep brain stimulation (seven participants) and
Furthermore, it was encouraging to note that severity pallidotomy (two participants). The remaining four
of aphasia and apraxia did not significantly impact on participants in the cohort had been diagnosed with
the accuracy of the online assessments. However, the Parkinson-plus syndromes, including progressive supra-
occasional audio-visual disturbances online caused by nuclear palsy (three participants) and multiple system
heavy traffic on the network did make it more difficult atrophy (one participant). Stages of Parkinson’s disease as
for the SLPs to conduct the apraxia assessment for per the Hoehn and Yahr (1967) scale for the participants
the more severe participants, and rate the aphasia ranged from I to IV with 48 participants rated at Stages I
parameters of naming and paraphasia (Hill et al. 2008a, and II, and 13 participants rated at Stages III and IV. For
2008b). The ratings of some motor speech parameters all participants, an overall severity level for hypokinetic
(palatal movement in speech, laryngeal volume, tongue dysarthria was determined by the investigators from
elevation and lateral tongue movements), were also clinical judgement. The dysarthria levels ranged from
Int J Lang Commun Disord Downloaded from informahealthcare.com by University of Queensland
more difficult online for all participants involved, mild to severe, with 41% of participants considered as
and this was a result of audio-visual difficulties, mild, 48% of participants as moderate and 11% of
lighting, camera positioning and lack of zoom focus participants as severe. Participants were recruited from
(Hill et al. 2006). various support groups of Parkinson’s Queensland
As previous studies have demonstrated the capacity Incorporated, public hospitals and from private neurol-
of telerehabilitation to provide reliable and valid ogists in Brisbane, Australia. Proficiency in the use of
assessment of speech, voice and language, it was computers was not a requirement for inclusion in the
proposed that an Internet-based application to provide study as all aspects of the online assessment delivery were
services to people with Parkinson’s disease might lessen performed by the online assessing SLP. Exclusion criteria
the access issues that exist for this population. In order included a speech and/or language disturbance or a co-
For personal use only.
to ensure that valid assessments underpin treatment existing neurological disorder inconsistent with Parkin-
programs, the current study aimed to investigate the son’s disease, a severe uncorrected auditory and/or
validity and reliability of an Internet-based assess- visual disturbance, a cognitive disturbance inconsistent
ment protocol specifically designed to evaluate the with the capacity to provide informed consent, a
speech and voice disturbances associated with Parkin- respiratory dysfunction unrelated to the neurological
son’s disease, by comparison with clinical face-to-face disorder and a positive history of alcohol abuse. The
assessment. It was hypothesized that online assessment primary mode of assessment (online or face-to-face led)
of the speech and voice disturbances in Parkinson’s was randomly selected for each participant and 31
disease can be achieved to a level comparable with assessments were led face-to-face and 30 led online.
standard face-to-face assessment. The current study A computerized random-number generator was used
forms the first validation stage to determine the for the randomization.
feasibility of online delivery as a complete assessment
and treatment service delivery model for Parkinson’s
disease. Assessors
Three SLPs experienced in the assessment of motor
speech disorders and Parkinson’s disease took part in the
Method study. Assessments were conducted at The University of
Queensland. Prior to the commencement of the study,
Participants
SLP training was conducted in a 3-hour session which
Before commencement of the study, ethical clearance covered the administration of all assessments in both the
was obtained from the Behavioural and Social Sciences online and face-to-face assessment environments. The
Ethical Review Committee of The University of SLPs were deemed competent with online adminis-
Queensland, Brisbane. Sixty-one participants with tration when they could adequately deliver a mock
Parkinson’s disease and hypokinetic dysarthria (42 session within a 1-hour time frame and also agree on the
males, 19 females) aged between 52 and 89 years level of severity of five dysarthric speakers who were not
(mean ¼ 69.23 years; standard deviation (SD) ¼ 8.60) involved in the study. The speakers were judged on the
volunteered for the study. Participants were diagnosed as perceptual measures of voice, overall articulatory
having Parkinson’s disease by a neurologist experienced in precision, overall speech intelligibility in conversation
movement disorders. Time post-diagnosis ranged from and oromotor function. During the study, two of the
6 months to 30 years (mean ¼ 6.52 years; SD ¼ 6.53). three SLPs took part in each assessment session, where
4 Gabriella Constantinescu et al.
one SLP led the session, while the second SLP acted as a Oromotor function
silent rater and did not interact with the participant.
An informal assessment of non-speech oromotor
One SLP assessed the participant in the face-to-face
function was developed to evaluate specific parameters
environment (within the same room as the participant),
using a five-point rating scale (1 ¼ normal, 5 ¼
while the second SLP conducted the assessment in
severely impaired). The parameters included masked
the online environment, through a videoconferencing
facial expression, lip movement (retraction, pucker,
link via the Internet. The SLPs were also randomized
seal, alternate movement), tongue movement (sym-
to the assessment environments and were blind to
metry, protrusion, elevation/depression, lateral and
the participant and their level of hypokinetic
alternate movement), breath support and diadochoki-
dysarthria prior to assessment. In total, SLP 1 took
netic (DDK) rates (alternate motion rate [AMR] /p^
part in 25 of the online assessments (16 as leader and 9
p^/and sequential motion rate [SMR] /p^t^k^/).
as silent rater) and 20 of the face-to-face assessments
(9 as leader and 11 as silent rater); SLP 2 took part in 21
online sessions (7 as leader and 14 as silent rater) and 24 Overall articulatory precision
face-to-face assessments (11 as leader and 13 as silent
Int J Lang Commun Disord Downloaded from informahealthcare.com by University of Queensland
rater); and SLP 3 took part in 14 online assessments A perceptual rating of each participant’s articulatory
(7 as leader and 7 as silent rater) and 18 face-to-face precision was made from the speech sample obtained
assessments (12 as leader and 6 as silent rater). during the reading of The Grandfather Passage.
Articulatory precision was rated on a five-point scale
(1 ¼ normal, precise production of sounds, 5 ¼ severe
distortion or imprecision that interferes with speech
Assessment battery
intelligibility).
Each participant underwent a 1-hour assessment on one
occasion on a battery of perceptual and acoustic
measures specifically designed for this study. The Measures of speech intelligibility
For personal use only.
battery consisted of perceptual ratings of voice and The Assessment of Intelligibility of Dysarthric Speech
oromotor parameters, overall articulatory precision, (ASSIDS) (Yorkston and Beukelman 1981) was used to
speech intelligibility in reading and conversation, and measure speech intelligibility at the single word and
an instrumental evaluation of sound pressure levels, sentence level, as well as communication efficiency.
duration of vowel prolongation and pitch range. For this task, participants read or repeated a series of
These measures were chosen for the study as they are 50 words and 22 sentences of increasing length. The
commonly used to diagnose and define the level of words and sentences had been randomly generated prior
severity of hypokinetic dysarthria associated with to the assessment, in accordance with test procedure. The
Parkinson’s disease. Furthermore, as this study forms reading material was displayed on the participant’s screen
part of a larger validation trial that also evaluates online or presented as per the test booklet, depending on the
treatment, the measures were chosen as they have been assessment environment. Copyright approval was
used in the LSVTw literature as sensitive predictors of obtained from the publishers (PRO-ED, Austin, TX,
treatment change (Ramig et al. 1995a). USA) to enable conversion of test materials to an online
format. Audio recordings of participant speech samples
were made in both environments. Following assessments
Perceptual measures of all participants, the speech samples obtained face-to-
face and online were numerically coded, randomized and
Perceptual voice parameters
saved to CD for analysis and scoring. Two independent
In the absence of a standardized voice assessment SLPs who did not participate in the study and who were
available at the time of the study, vocal parameters were blinded to the assessment environment and participants
evaluated using a five-point rating scale developed for transcribed the speech samples obtained in each
the study (1 ¼ normal, 5 ¼ severely impaired). The environment. Following the ASSIDS ratings, the values
reading of a standard passage, The Grandfather Passage given by the two SLPs from the online and face-to-face
(Darley et al. 1975) was used for perceptual ratings of recordings were averaged to express a single mean value for
breathiness, roughness (lack of clarity), strain-strangled each sample obtained in that environment. Scores for the
vocal quality, vocal tremor, pitch and phonation breaks. word and sentence intelligibility tasks were expressed as
Modal pitch and loudness levels and pitch and loudness per cent correct. The communication efficiency ratio was
variability were rated on a 30 s conversational determined by dividing the participant’s rate of
monologue about a topic of interest such as family, intelligible speech (intelligible words per minute) by
hobbies or a recent holiday trip. the mean rate of intelligible words per minute for
Online speech assessment of Parkinson’s disease 5
normal speakers (190 words per minute) (Yorkston and Assessment environment
Beukelman 1981).
An additional rating of the participant’s overall speech Online environment
intelligibility in conversation was made from the 30 s Two personal computer-based videoconferencing
monologue sample using a five-point scale (1 ¼ normal, systems developed at The University of Queensland
completely intelligible speech, 5 ¼ severely unintelligible were used for the online assessment. The applications
speech with difficulties deciphering many words). For this operated on a 128 kbit/s Internet connection which was
task, the rating was made by the two SLPs who took part the minimum connection speed available in Queens-
in the online and face-to-face assessment. land’s public health systems at the time of the study.
Videoconferencing at 320 £ 240 pixel resolution was
Acoustic measures conducted between the online SLP’s computer and that
of the participant. Additional features of the system
The LSVTw Evaluation Protocol (Ramig et al. 1995b) which were used in this study included the ability: to
was used to assess the participant’s sound pressure levels display printed material and instructional video clips on
(SPL), duration of vowel prolongation and pitch range the participant’s screen; to control the remote camera
Int J Lang Commun Disord Downloaded from informahealthcare.com by University of Queensland
during several speaking tasks. This protocol has been with the use of a robotic arm and adjust its alignment
widely used in the LSVTw literature as a routine for optimal viewing of the participant’s head and upper
assessment. torso; to capture high-quality video (640 £ 480 pixel
resolution compressed with the windows media video
CODEC Version 8 at 384 kbit/s) and audio record-
Sound pressure levels and duration of vowel prolongation
ings (windows media audio CODEC Version 8 at
The SPLs (dB-C) of the participant’s speech were 368 kbit/s) independent of videoconferencing for the
recorded during six maximum sustained vowel oromotor tasks, and then to store-and-forward these
phonation of /a/, readings of the Rainbow Passage audio and video files back to the online SLP for later
(Fairbanks 1960) and The Grandfather Passage, and review.
For personal use only.
during a 30 s conversational monologue. The duration For tasks requiring acoustic measures, both the
of each vowel phonation was also measured in seconds. online and face-to-face SLPs were able to view and
For all tasks, the participants were instructed to speak in sample real-time calibrated average recordings of SPL
a comfortable voice and no reference was made to their (dB-C), peak frequency (Hz) and duration (s) data via
loudness level. Following the assessment, the SPL and the system’s acoustic speech processor specifically
duration levels were then averaged to provide mean developed for this study. The validity of the speech
levels for each participant. processor as an acoustic measurement device was
examined in a series of calibration trials. These trials
involved the generation of pure-tones by a Function
Pitch range
Generator (Topward Electronic Instruments Model
Each participant performed a series of six vocal glides, TFG-462) at varying levels of SPL (55 – 95 dB) and
reaching their highest and lowest pitch levels respect- pitch (100– 975 Hz), and comparing measures from
ively. No reference was made to their loudness level. the acoustic speech processor with those of the
The average highest and lowest frequency levels (Hz) commercially available Visi-Pitch II (Kay Elemetrics
obtained for each participant were then converted to a Model No. 3300). Statistical analyses using paired
maximum range in semitones (ST) (de Pijper 2007). t-tests revealed no significant differences ( p . 0.05) in
SPL and pitch measures for pure-tones between the two
devices. Furthermore, verification trials comparing SPL
Participant satisfaction questionnaire
measures using the speech processor with those from a
The 30 participants in the online-led assessments Digital Sound Level Meter (Radio Shackw Model
completed a brief questionnaire. On a five-point scale, No. 23-553) using voice samples (66 – 91 dB) also
the questionnaire evaluated the level of participant revealed no significant differences ( p . 0.05) between
satisfaction with: (1) the online assessment sessions the two devices (table 1). To standardize the acoustic
(possible responses ranging from would not participate measures across the two assessment environments, the
again to would prefer these types of sessions to face-to-face system’s acoustic speech processor was used as the
sessions); (2) the audio and video quality during the objective measurement tool in both the online and face-
sessions (responses ranging from poor to excellent); and to-face environments for all acoustic measures.
(3) overall satisfaction with the online modality (ranging During the online assessment, the online SLP wore
from not at all satisfied to very satisfied). Please refer to the a headset microphone attached to the telerehabilitation
Appendix. system for communication with the participant.
6 Gabriella Constantinescu et al.
Table 1. Calibration trial for measures of pitch and sound pressure levels
Note: Dashes ( – ) correspond to data not obtained. Speech processor is the system’s online acoustic speech processor. Visi-Pitch II (Kay Elemetrics Model No. 3300). SLM is a Digital
Sound-Level Meter (Radio Shackw Model No. 23-553). MAD is maximum average difference. Pure-tone pitch is the pitch calibration trial of the speech processor with the Visi-Pitch II
using pure-tones. Pure-tone SPL is the sound pressure level calibration trial of the speech processor with the Visi-Pitch II using pure-tones. Voice samples SPL is the sound pressure level
verification trial comparing sound pressure level measures using the speech processor with the Digital Sound Level Meter for voice samples. Measurements are in Hertz. dB-C is
measurements in decibels-C weighted. SD, standard deviation.
Int J Lang Commun Disord Downloaded from informahealthcare.com by University of Queensland
The SLP controlled all displays on the participant’s displayed on the participant’s screen by the online SLP.
screen, without the need for the participant to operate Throughout the online-led assessment, the face-to-face
the system. For standardization purposes, the participant SLP acted as the silent rater at the participant site. The
was seated in front of the system at a distance of face-to-face SLP wore headphones and was able to
approximately 50 cm from the monitor and wore a follow the assessment instructions given to the
headset microphone to enable interaction with the participant. The online assessment environment is
online SLP during videoconferencing. The microphone represented in figures 1a and 1b.
distance was set at 5 cm from the corner of the
participant’s mouth in order to reduce sound distortion,
maximize visibility of the participant’s face, and allow for Face-to-face environment
For personal use only.
Note: ASSIDS ¼ assessment of intelligibility of dysarthric speech; FTF, face-to-face assessment environment; n.a., not applicable.
intelligibility in conversation), percent close agreement beyond chance between the online and face-to-face
For personal use only.
(PCA) and the quadratic weighted Kappa (kw) statistic measures. The kw assigned weights to the observed and
(Landis and Koch 1977) were calculated. Analysis of chance agreement and presented levels of agreement
the ASSIDS and acoustic parameters (SPL tasks, where kw less than 0.20 is interpreted as poor;
duration of vowel phonation and pitch range) were 0.21– 0.40 is fair; 0.41– 0.60 is moderate; 0.61– 0.80 is
performed using the Bland and Altman (1986) ‘limits good, and 0.81– 1.00 indicates very good agreement
of agreement’ method for continuous data. (Landis and Koch 1977). For this study, the clinical
criterion for an acceptable level of agreement was set at
kw . 0.6 (good agreement).
Percent close agreement
PCA was chosen as it is commonly used to quantify
agreement in perceptual ratings of dysarthria. PCA was Bland and Altman (1986) limits of agreement
also selected for the present study to further verify kw This statistic establishes the limits of agreement (LA)
as it has been reported in some instances that non- within which 95% of differences between the two
linear distribution of data can negatively impact on kw environments are predicted to lie. If the LA are found to
creating a paradox (Cicchetti and Feinstein 1990). PCA be within a predetermined clinical criterion, the new
expressed the percentage of ratings where differences method can be considered an acceptable measurement
were within ^1 scale point on the perceptual rating tool and the two methods can be used interchangeably
scales (Kearns and Simmons 1988). In keeping with (Bland and Altman 1986). In the absence of a reported
previous studies that examined dysarthric speech using minimal, clinically important difference for word and
perceptual rating scales, the clinical criterion for an sentence intelligibility of the ASSIDS assessment, the
acceptable level of agreement in the present study was clinical criterion was established on the test – retest
considered to be equal to or greater than 80% agreement variability reported in the manual for dysarthric
within ^1 scale point (Kearns and Simmons 1988). speakers assessed in the face-to-face environment
(Yorkston and Beukelman 1981). Values of ^3.2%
and ^8.6% were set for these respective measures. In
Quadratic weighted Kappa statistic
addition, the clinical criterion for the communication
The kw is widely used in telerehabilitation studies for efficiency ratio was set at ^0.27 which was consistent
ordinal data and provides an indication of agreement with the criterion determined by Hill et al. (2006) for
between raters (Landis and Koch 1977). In the present dysarthric speakers. Furthermore, as the assessment
study, the statistic provided a measure of agreement battery used in the study was designed ultimately to
8 Gabriella Constantinescu et al.
Int J Lang Commun Disord Downloaded from informahealthcare.com by University of Queensland
Figure 1a. Online-led assessment by online SLP and equipment at site. Note (1) the videoconferencing system displaying the participant;
(2) pitch (Hz) and SPL (dB-C) data via the system’s acoustic speech processor; and (3) the web camera.
determine treatment outcome, the clinical criteria for (Ramig et al. 1995a). The clinical criterion for the
SPL, vowel duration and pitch range tasks were set at duration of vowel phonation was set at ^ 3 s, as the
For personal use only.
levels below the minimal improvement expected minimal change in phonation time expected with
following the LSVTw. For all SPL tasks (maximum the LSVTw has been reported to be a mean of 3.72 s
vowel phonation, reading and monologue loudness), (Ramig et al. 1995a). For measures of pitch range, the
the clinical criterion was set at ^4 dB difference clinical criterion was set at ^3 ST, which was below the
between the two raters, a level below the 4.5 –14.03 dB 4 ST minimum improvement in fundamental frequency
mean level of improvement reported with the LSVTw range post-LSVTw (Ramig et al. 1994). This clinical
Figure 1b. Online-led assessment at participant site with face-to-face SLP as the silent rater (left). Note (1) the videoconferencing system
displaying the online SLP; (2) the web cameras; (3) the video camera; and (4) pitch (Hz) and SPL (dB-C) data via the system’s acoustic speech
processor displayed for the face-to-face clinician.
Online speech assessment of Parkinson’s disease 9
criterion was also in keeping with the level of subject breaks, phonation breaks, modal pitch and loudness
variability in healthy adults that can range from 2 to 4 ST variability) were below the clinical criterion of good
(Gelfer 1986). agreement (kw . 0.6).
from each of the three SLPs in the particular 8.77%) and sentence intelligibility reading tasks
environment. ICC values below 0.40 corresponded to (LA ¼ 25.59% to 6.16%), and the communication
poor-to-fair reliability; between 0.40 and 0.75 to efficiency ratio (LA ¼ 20.12 to 0.10). The LA for
moderate-to-good reliability; and values above 0.75 sentence intelligibility and communication efficiency
represented very good reliability (Fleiss 1981). ratio were within the respective clinical criterion
(^8.6% and ^0.27), while the word intelligibility
LA fell outside of the clinical criterion of ^3.2%. In
Results addition, perceptual ratings of overall speech intellig-
ibility in conversation were within the clinical criteria
Perceptual measures
for PCA (98.36) and kw (0.79 good agreement).
Perceptual voice parameters
Table 4. Perceptual oromotor parameters between face-to-face
All individual voice parameters met the predetermined and online environments
clinical criterion of 80% agreement for PCA (table 3).
However, when using kw, seven of the ten parameters Oromotor parameters PCA kw
(breathiness, roughness, strained-strangled, pitch Breath support 96.72 0.83 (very good)
Masked facial expression 86.89 0.31 (fair)*
Table 3. Perceptual voice parameters between face-to-face and Lip movement
online environments Retraction 98.36 0.56 (moderate)*
Pucker 98.36 0.77 (good)
Voice parameters PCA kw
Seal 95.08 0.66 (good)
Breathiness 91.80 0.36 (fair)* Alternate 100 0.95 (very good)
Roughness 95.08 0.33 (fair)* Tongue movement
Strained-strangled 95.08 0.41 (moderate)* Symmetry 100 0.66 (good)
Vocal tremor 100 0.69 (good) Protrusion 100 0.94 (very good)
Pitch breaks 96.66 0.11 (poor)* Elevation/depression 98.36 0.93 (very good)
Phonation breaks 95.00 0.37 (fair)* Lateral 100 0.89 (very good)
Modal pitch 96.66 0.38 (fair)* Alternate 100 0.85 (very good)
Pitch variability 96.72 0.63 (good) Diadochokinetic (DDK)
Loudness level 100 0.69 (good) AMR /p^p^/ 100 0.75 (good)
Loudness variability 98.36 0.49 (moderate)* SMR /p^t^k^/ 100 0.87 (very good)
Note: kw, quadratic weighted Kappa statistic: *achieved lower than the clinical criterion Note: kw ¼ quadratic weighted Kappa statistic: *achieved lower than the clinical
of kw . 0.6; PCA, per cent close agreement. criterion of kw . 0.6; PCA, per cent close agreement.
10 Gabriella Constantinescu et al.
(a)
–1.97 vowel 1.35
– 15 – 10 –5 0 5 10 15
CC CC –5 –3 –1 1 3 5
CC CC
Limits of Agreement (%)
Limits of Agreement (dB)
(b)
Figure 3. Bland and Altman (1986) limits of agreement for sustained
vowel phonation, reading of the Rainbow and Grandfather Passages
Int J Lang Commun Disord Downloaded from informahealthcare.com by University of Queensland
–5.59 6.16
(LA ¼ 21.07 to 0.81 dB). For all SPL tasks, the LA
were within the predetermined clinical criterion of
^4 dB. Similarly, the LA for the duration of sustained
vowel phonation task (LA ¼ 22.74 to 2.70 s) were
– 10 –5 0 5 10 within the clinical criterion of ^3 s for online and face-
CC CC to-face ratings (figure 4a).
Limits of Agreement (%)
For personal use only.
(c)
Pitch range
Figure 4b represents the LA for the pitch range (LA ¼
22.03 to 2.19 ST) that were within the clinical
– 0.12 0.10 criterion of ^3 ST.
Reliability
Intra-class correlations ranged from moderate to very
– 0.3 – 0.2 – 0.1 0 0.1 0.2 0.3 good intra-rater reliability in both assessment environ-
CC CC ments (ICC ¼ 0.43–0.99 face-to-face; ICC ¼ 0.48–
Limits of Agreement
0.99 online), indicating comparable intra-rater reliability
Figures 2a– c. Bland and Altman (1986) limits of agreement for between environments. Inter-rater reliability was also
word intelligibility, sentence intelligibility, and communication found to be comparable between environments, with
efficiency ratio relative to the clinical criteria: (a) word intelligibility: reliability values between moderate to very good for the
clinical criterion (CC) ¼ ^ 3.2 percentage points; (b) sentence
intelligibility: clinical criterion (CC) ¼ ^ 8.6 percentage points;
majority of face-to-face (ICC ¼ 0.43–0.99) and online
and (c) communication efficiency ratio: clinical criterion (ICC ¼ 0.48–0.99) assessments (table 5).
(CC) ¼ ^0.27.
Table 5. Intra-class correlation (ICC) values for intra-rater and inter-rater reliability for online and face-to-face ratings
Note: CER, communication efficiency ratio; FTF, face-to-face environment; OAP, overall articulatory precision; OIC, overall speech intelligibility in conversation; ASSIDS, assessment
of intelligibility of dysarthric speech; SI, percentage sentence intelligibility; WI, percentage word intelligibility; ICC values below 0.40 ¼ poor-to-fair; from 0.40 to 0.75 ¼ moderate-
to-good, above 0.75 ¼ very good reliability. Reliability was calculated for three assessors.
Int J Lang Commun Disord Downloaded from informahealthcare.com by University of Queensland
Two of the 14 variables (masked facial expression and consistent with a previous study by Hill et al. (2006)
lip retraction), however, were below the clinical criterion where 89.47% consensus agreement was achieved
using kw. As noted with the vocal parameters, a certain between online and face-to-face ratings for consonant
level of variability inherent in perceptual ratings may have precision. Direct comparison of outcomes between the
contributed to the lower ratings for these oromotor two studies is not possible, however, due to the different
parameters. Previous face-to-face evaluations of facial assessment procedures used by Hill et al. (2006)
expression in Parkinson’s disease using a range of rating including the use of a four-point rating scale, different
scales and statistical analyses have shown varying levels of evaluation criteria and the non-simultaneous assess-
intra- and inter-rater reliability including fair, moderate ment of participants in the online and face-to-face
For personal use only.
and substantial agreement (Goetz et al. 1995, Martinez- environments. In the present study, the high level of
Martin et al. 1994). The authors of these studies agreement between online and face-to-face ratings
attributed some level of the variability to the subjective and the comparable intra- and inter-rater reliability
interpretation of severity levels, the possible inexperience values (moderate agreement) between environments
of a few of the raters, and some variability in consensus lends further support to the validity of an online
prior to rating. application.
Similarly, labial judgements which have been
investigated predominantly in the cleft palate literature,
have been associated with rater variability (Morrant and Measures of speech intelligibility
Shaw 1996, Ritter et al. 2002). Face-to-face evaluations The Bland and Altman (1986) LA were used for online
of lip retraction in participants with repaired unilateral and face-to-face scores of the ASSIDS assessment
cleft lip have shown poor (Morrant and Shaw 1996) and (reading tasks). For the sentence tasks, both the LA for
moderate levels of inter-rater agreement (Ritter et al. sentence intelligibility and communication efficiency
2002). The subjective interpretation of severity levels has ratio were within the clinical criterion, indicating that
also been reported to affect rater agreement in these comparable measures of speech intelligibility can be
studies (Morrant and Shaw 1996, Ritter et al. 2002). achieved between the online store-and-forward method
Analyses of the oromotor parameters including and traditional face-to-face audio recordings using this
masked facial expression and lip retraction using kw assessment. Hill et al. (2006) similarly reported
may also require a level of cautious interpretation due to comparable values for the communication efficiency
the non-linear distribution of the data, and results may ratio in their study, while sentence intelligibility was
need to be interpreted alongside PCA. On the whole, just outside the clinical criterion. In the present study,
the high intra- and inter-rater reliability obtained for the LA for word intelligibility were outside the clinical
the oromotor parameters collectively, and the compar- criterion of ^3.2 percentage points between the
able levels between the two environments were very environments. It is possible that speaker severity may
encouraging. have influenced these results. Differences of three or
more words between raters, which were outside the
clinical criterion, occurred predominantly for partici-
Overall articulatory precision
pants with moderate and severely reduced intelligibility
The complete agreement obtained between the two (66.66% of the time), as identified on the overall speech
assessment environments for articulatory precision is intelligibility in conversation rating scale. The reduced
Online speech assessment of Parkinson’s disease 13
speaker intelligibility may have contributed to the within a clinically acceptable level, and the subjective
differences in ratings between the two environments. element of selecting a sample from a section of the vocal
Yorkston and Beukelman (1981) acknowledge that glide for analysis did not impact greatly on the results.
transcription of word tasks (as used in this study) in the Moreover, reliability measures revealed very good intra-
traditional face-to-face environment is difficult with and inter-rater reliability for the pitch task between the
more severely dysarthric speakers. For all the ASSIDS two environments. Collectively, the comparable values
tasks, intra- and inter-rater reliability was largely obtained for all acoustic measures suggested that the
comparable between the two environments, and showed acoustic speech processor used in the online environ-
very good agreement overall. These values are in keeping ment is a sensitive assessment tool that can be used to
with the very good reliability measures reported for the detect minimal changes in SPL, duration and pitch in a
ASSIDS assessment (Yorkston and Beukelman 1981). Parkinson’s disease assessment battery.
Comparable levels between assessment environments and
previous literature lend further support to the use of this
assessment in an online application. Audio-visual challenges and the online environment
For the additional ratings of overall speech intellig- Although the current trials largely support the feasi-
Int J Lang Commun Disord Downloaded from informahealthcare.com by University of Queensland
ibility in conversation, PCA within the clinical criterion bility of an online assessment, it is acknowledged that a
was achieved. This finding is in keeping with previous number of challenges were experienced with this
reports of high inter-rater agreement within one scale modality. Firstly, the assessments were conducted over a
point for overall speech intelligibility in traditional face- 128 kbit/s Internet videoconferencing connection,
to-face ratings (Sheard et al. 1991). The kw further which at this bandwidth, made the real-time evaluation
reflected good agreement within the clinical criterion. of a number of assessment items difficult. This included
Together with the comparable reliability measures, these the detection of fine motor movements and precision
findings suggest that ratings of overall speech intelligibility on the informal oromotor assessment due to the frame
in conversation can be made reliably online. rate and pixelated image, especially with movement.
The more subtle features of speech production on the
For personal use only.
not report that this delay impacted greatly on the therapy. Telemedicine Journal and e-Health, 10, 147– 154.
assessment. Overall, the online assessments were rated CICCHETTI, D. V. and FEINSTEIN, A. R., 1990, High agreement but
low kappa: II. Resolving the paradoxes. Journal of Clinical
favourably by the participants, the application was user- Epidemiology, 43, 551– 558.
friendly, and the features of the application were DARLEY, F. L., ARONSON, A. E. and BROWN, J. R., 1969, Clusters of
conducive to assessment. deviant speech dimensions in the dysarthrias. Journal of
speech and hearing research, 12, 462– 496.
DARLEY, F. L., ARONSON, A. E. and BROWN, J. R., 1975, Motor
Conclusion Speech Disorders (Philadelphia, PA: W. B. Saunders).
DE PIJPER, J. R., 2007, Semitone conversions (available at: http://
The comparable ratings achieved for the majority of users.utu.fi/jyrtuoma/speech/semitone.html) (accessed 1
parameters between the online and face-to-face environ- March 2007).
For personal use only.
ments and high rater reliability have demonstrated the DE RIJK, M. C., TZOURIO, C., BRETELER, M. M., DARTIGUES, J. F.,
AMADUCCI, L., LOPEZ-POUSA, S., MANUBENS-BERTRAN, J. M.,
validity and reliability of the online assessment tool for ALPEROVITCH, A. and ROCCA, W. A., 1997, Prevalence of
evaluating the speech and voice parameters associated parkinsonism and Parkinson’s disease in Europe: The EUR-
with hypokinetic dysarthria and Parkinson’s disease. The OPARKINSON collaborative study. European community
telerehabilitation application described in this study concerted action on the epidemiology of Parkinson’s disease.
provides a basis for the delivery of online assessment for Journal of Neurology Neurosurgery and Psychiatry, 62, 10–15.
FAIRBANKS, G., 1960, Voice and Articulation Drill Book (New York,
the dysarthric speech and voice disorder associated with NY: Harper & Brothers).
Parkinson’s disease. Online service delivery may prove to FLEISS, J. L., 1981, Statistical Methods for Rates and Proportions
be a necessary alternative or addition for people with (New York, NY: Wiley).
Parkinson’s disease, whereby lessening the difficult access GELFER, M. P., 1986, The stability of total phonational frequency
issues that exist for this population. Further research range. Journal of the Acoustical Society of America, 79, 83.
GEORGEADIS, A., BRENNAN, D., BARKER, L. M. and BARON, C.,
should involve analyses of online rater variability 2004, Telerehabilitation and its effect on story retelling by
involving a larger group of assessors and possible effects adults with neurogenic communication disorders. Aphasiology,
on assessment outcomes; provide a comprehensive 18, 639–652.
analyses of participant and SLP satisfaction with the GOETZ, C. G., STEBBINS, G. T., CHMURA, T. A., FAHN, S.,
online modality; examine the effects of dysarthria KLAWANS, H. L. and MARSDEN, C. D., 1995, Teaching tape
for the motor section of the unified Parkinson’s disease rating
severity and variance of Parkinson’s disease on the scale. Movement Disorders, 10, 263– 266.
online assessments; and provide an in-depth cost analysis GRIMM, N., PAUL, J. and WAKEHAM, A., 2004, Access to speech
of the use of telerehabilitation applications in the pathology services by people with Parkinson’s disease in
assessment and treatment of people with Parkinson’s Queensland, Australia (Unpublished report: University of
disease. Queensland, Australia).
HARTELIUS, L. and SVENSSON, P., 1994, Speech and swallowing
symptoms associated with Parkinson’s disease and multiple
sclerosis: a survey. Folia Phoniatrica et Logopaedica, 46,
Acknowledgements 9 –17.
HILL, A., THEODOROS, D. G., RUSSELL, T. G., CAHILL, L. M.,
This research was funded by a National Health and Medical WARD, E. C. and CLARK, K. M., 2006, An Internet-based
Research Council Project (Grant Number 301029). The research telerehabilitation system for the assessment of motor speech
forms part of the doctoral thesis of Gabriella Constantinescu at disorders: a pilot study. American Journal of Speech –
The University of Queensland, Australia. The authors would like Language Pathology, 15, 45 – 56.
to acknowledge Parkinson’s Queensland Incorporated, all HILL, A., THEODOROS, D., RUSSELL, T. and WARD, E., 2008a,
participants, Roy Anderson, Anne Hill, Monique Waite, Jasmin Using telerehabilitation to assess apraxia of speech in adults.
Online speech assessment of Parkinson’s disease 15
International Journal of Language and Communication RITTER, K., TROTMAN, C. A. and PHILLIPS, C., 2002, Validity
Disorders, 1 –17. of subjective evaluations for the assessment of lip scarring
HILL, A., THEODOROS, D., RUSSELL, T., WARD, E. and WOOTTON, and impairment. Cleft Palate-Craniofacial Journal, 39,
R., 2008b, The effects of aphasia severity on the ability to 587– 596.
assess language disorders via telerehabilitation. Aphasiology, SHEARD, C., ADAMS, R. D. and DAVIS, P. J., 1991, Reliability and
[epub ahead of print] 1– 16. agreement of ratings of ataxic dysarthric speech samples with
HOEHN, M. and YAHR, M., 1967, Parkinsonism: onset, progression varying intelligibility. Journal of Speech and Hearing Research,
and mortality. Neurology, 17, 427– 442. 34, 285– 293.
KEARNS, K. P. and SIMMONS, N. N., 1988, Interobserver reliability THEODOROS, D., HILL, A., RUSSELL, T., WARD, E. and WOOTTON,
and perceptual ratings: more than meets the ear. Journal of R., 2008, Assessing acquired language disorders in adults via
Speech and Hearing Research, 31, 131– 136. the Internet. Telemedicine Journal and e-Health,, 14,
KREIMAN, J. and GERRATT, B. R., 1998, Validity of rating scale 552– 559.
measures of voice quality. Journal of the Acoustical Society of YORKSTON, K. M. and BEUKELMAN, D. R., 1981, Assessment of
America, 104, 1598– 1608. Intelligibility of Dysarthric Speech (Austin, TX: PRO-ED).
LANDIS, J. R. and KOCH, G. G., 1977, The measurement of
observer agreement for categorical data. Biometrics, 33,
159– 174.
MARTINEZ-MARTIN, P., GIL-NAGEL, A., GRACIA, L. M., GOMEZ,
Int J Lang Commun Disord Downloaded from informahealthcare.com by University of Queensland