Kent - Et - Al Acoustic in SSD

05_Flipsen_93-114 5/4/09 8:42 AM Page 93
CHAPTER 5
Children’s Speech
Sound Disorders:
An Acoustic Perspective
RAYMOND D. KENT, LUCIANA PAGAN-NEVES, KATHERINE
C. HUSTAD, AND HAYDEE FISZBEIN WERTZNER
Introduction

Speech sound disorders are almost always identified and described by their auditory-perceptual properties, as determined by adult listeners. Given that auditory-perceptual properties are extracted from the acoustic signal of speech, it follows that acoustic methods should be highly suited to the study of these disorders. But the capability for doing something is not the same as the necessity or even desirability for doing so. What does acoustic analysis offer for the assessment, treatment, and understanding of developmental speech disorders? This chapter takes the view that acoustic analysis is a valuable complement and co-referent to perceptual analysis. The advantages that acoustic analysis offers to the understanding of children’s speech sound disorders are primarily objectivity, quantification, and sensitivity. Each of these advantages is discussed, with examples from the literature. Also included are suggestions to (a) improve the acoustic analysis of children’s speech, and (b) apply acoustic methods to clinical assessment and treatment.

An Envisioned Future

A hopeful view of the future application of acoustic analysis to the clinical assessment and treatment of speech disorders includes the routine use of computer-based methods to record, display, analyze, and store information about speech sound patterns (also see Chapter 6). In fact, these functions have been available for some time, so this view of the future is not especially bold or revolutionary. But the operative word is “routine.” Despite the general availability of
94 SPEECH SOUND DISORDERS IN CHILDREN
computer-based acoustic analysis at relatively low cost, the use of such analysis tools is by no means routine. Are there real prospects for routine clinical application? And what needs to be done to bring these prospects to reality? This chapter addresses these questions and, in so doing, reviews major accomplishments in the acoustic analysis of speech disorders. The emphasis is on speech disorders in children, but occasional reference is made to disorders in adults as they help to reveal potential clinical tools for children’s speech.

The pivotal technology is digital signal processing, which enables a user to record samples of speech as a digital file, display this file as a waveform or other pattern, select and edit parts of the saved file, conduct various types of analysis (e.g., waveform, spectrogram, spectrum, fundamental-frequency contour, intensity envelope, some of which are shown in Figure 5–1), play all or selected parts of the file, and save the results of analysis. The basic methods are found in several different systems that are available commercially at varying costs (Ingram, Bunta, & Ingram, 2004; Read, Buder, & Kent, 1990) or as free downloads (such as the computer analysis program Praat [Dutch for talk] developed by Paul Boersma and David Weenink at the University of Amsterdam). Certainly, cost is not an obstacle to performing fairly sophisticated operations in the acoustic analysis and synthesis of speech. Perhaps a greater obstacle is a limited understanding of how these analyses can be used in the practice of speech-language pathology. Relevant discussions are available in articles and books explaining how these digital methods can be applied to speech and language disorders (Ingram et al., 2004; Kent & Read, 2002). Proficiency with acoustic methods may be the single most important factor that will lead to the increased use of these methods for clinical purposes. Acoustic analysis, like any laboratory tool, requires practice in its use (see Figure 5–1).

Figure 5–1. Screen display of a waveform (Panel A), pitch trace (Panel B), and spectrogram (Panel C) in TF32. Speech sample is from a three-year-old typically developing child producing the phrase “cowboy boots.”

Speech Is More Than What Meets the Ear

The human auditory system is remarkable in its ability to segregate the speech signal from noise and to achieve a phonetic interpretation of that signal. The robustness of this process necessarily discards a fair amount of detail. A primary advantage of acoustic analysis is that it permits the detection of acoustic properties that may not be detected by auditory means. The ear is necessarily an informational filter that attends to certain aspects of sound and ignores others. Especially because of biases introduced through phonetic experience with a given language, human audition discards or neglects much of the acoustic signal of speech. Speech-language clinicians are taught to listen carefully to acoustic variations that the layperson may not hear at all. A recent study comparing the perception of correct and incorrect Brazilian Portuguese liquids /l/, /ɾ/, /ʎ/ by undergraduate and graduate students demonstrated a better performance of the undergraduate students, indicating that sometimes expert listeners can be more influenced by their knowledge of the pathology than by what they are really listening to (Pagan & Wertzner, 2007a). Furthermore, even the experts’ ears fail in comparison to acoustic analysis, which is free of phonetic biases and other influences that inevitably affect perception. This conclusion has been reached for several aspects of speech (Kent, 1996).

Toward a Pediatric Speech Science

Most of the literature on acoustic theory, methods of analysis, and acoustic databases pertains to normal adult speech, and especially to the speech of men. Gradually, theory, methods, and databases are becoming more comprehensive to include women and children, that is, the community of speakers. The analysis of children’s speech, in particular, needs to take account of various factors that can complicate the analysis task. Some of the major factors are as follows:

1. Because children have shorter vocal tracts than adults, children’s speech sounds (both vowels and consonants) have energy at higher frequencies than those observed for adults. One consequence is that the total frequency range of analysis may need to be extended for satisfactory results with children’s speech. For example, the spectral energy associated with infants’ fricatives may reach as high as 16 kHz (Kent & Read, 2002). Fortunately, most contemporary systems for recording and analyzing speech permit a total bandwidth of about 20 kHz. Increases in computer memory accommodate such extended bandwidths.
2. The precision of formant estimation varies with fundamental frequency (F0). Voices with high F0 (generally the case for children) are more challenging when it comes to estimating formant frequencies. The limitation is basically one of sampling. With higher F0 values, the harmonics of the laryngeal source are farther apart, and this makes it more difficult to estimate the formant locations in the spectrum (Huggins, 1980; Kent & Read, 2002; Vorperian & Kent, 2007).
3. Children often are variable in their phonatory patterns, which may include transient or long-term features such as breathiness, roughness, pitch shift, and even register change (e.g., between chest and vocal fry registers). In contrast, adults tend to have fairly uniform phonatory patterns so that one set of analysis parameters generally is suitable for an entire utterance.
4. Velopharyngeal function may differ between children and adults. Although the precise maturational pattern is not well established, it appears that typically developing children may achieve speech-adequate control at about the same time as canonical babbling appears (Thom, Hoit, Hixon, & Smith, 2006). However, some children may show variable or unusual patterns of velopharyngeal function, which can complicate acoustic analysis.
5. Eccentric or idiosyncratic acoustic-phonetic patterns may appear. Because children are learning language, including its phonological and phonetic aspects, at the same time they are learning the motor skills of speech, they may exhibit behaviors that are seldom, if ever, observed in adult speech. Some of these behaviors may be highly transient, but others may persist over a substantial period of time.
6. The development of the vocal tract reflects a complicated interaction of the growth of its constituent structures, and this interaction is poorly understood (Kent & Hustad, 2009; Kent & Tilkens, 2007; Kent & Vorperian, 1995).
7. The acoustic database for children’s speech is incomplete. The database is growing slowly, but it is not adequate for all purposes. Clinical interpretation depends critically on a secure knowledge of normative behavior.

These comments are not intended to discourage the use of acoustic analysis, but rather to forewarn those who attempt these analyses of the complications that lie in the path of discovery and application. Similar precautions could be issued on the use of phonetic transcription and physiologic analyses. These difficulties notwithstanding, there is no good reason why acoustics should not be a working partner with auditory-perceptual methods in the understanding of children’s speech disorders.

Prosodic Patterns

Depending on the definition that is used, prosody can embrace a number of phenomena including intonation, tempo (pause and lengthening), vocal effort, and loudness. These are suprasegmental aspects of speech, meaning that their effects typically extend over two or more phonetic segments. It is not possible to offer an extensive review of prosody in this chapter, and the emphasis is on the tractability of an acoustic analysis of prosody in children. In one view of prosody that was designed expressly for application to language development (Gerken & McGregor, 1998), prosody was conceptualized as three general types of phenomena
in language: phrasal stress, boundary cues, and meter. Each of these is elaborated in the following.

Phrasal Stress

Phrasal stress is the phenomenon of word prominence in a phrase. Stress is conveyed by adjustments of duration, fundamental frequency, and intensity. Children begin to regulate the acoustic cues of stress (fundamental frequency, amplitude, duration) as early as 18 to 30 months of age (Kehoe, Stoel-Gammon, & Buder, 1995). In a study of linguistic stress produced by 5 children with suspected developmental apraxia of speech (sDAS) and 5 children with phonological disorder, Munson, Bjorum, and Windsor (2003) reported that the children with sDAS were judged to be less successful than the children with phonological disorder in producing target stress contours. However, acoustic studies showed that the children with sDAS produced acoustic differences between stressed and unstressed syllables that apparently were not consistently detected by the listeners who made the stress judgments.

Boundary Cues

Boundary cues are pauses, adjustments in duration, or variations in pitch that mark the ends of language units. A well-known example of a boundary cue is phrase-final lengthening, in which a word or syllable that precedes the end of a major syntactic unit is lengthened. Phrase-final lengthening often is accompanied by a falling tone, and these two features are effective cues for a major constituent unit. Figure 5–2 illustrates both final syllable lengthening and falling tone. They also appear relatively early in speech-language development (Snow, 1994) and are robust in the face of speech or language disorder (Snow, 1998; Wang, Kent, Duffy, & Thomas, 2005). According to Snow’s (1994) data on children aged 16 to 25 months, intonation is acquired earlier than final syllable timing. As Snow pointed out, one implication of this result is that final lengthening is a learned prosodic feature.

Meter or Rhythm

Meter (or rhythm) is the pattern of stressed and unstressed syllables for words and phrases. In American English, syllables usually have a strong-weak (SW) alternation, and this alternation defines the rhythm of the language. The SW pattern is linked to a stress unit called the foot, which is an SW syllable pair. Low, Grabe, and Nolan (2000) introduced a measure called the Pairwise Variability Index (PVI), which seems to be a useful measure of a speaker’s adherence to the normal stressed-unstressed alternation in English. PVI is an index of changes in successive vowel length over an utterance, and it is not affected by speaking rate. It is computed as follows:

\[
\mathrm{PVI} = 100 \times \left[ \sum_{k=2}^{m} \left| \frac{d_k - d_{k-1}}{(d_k + d_{k-1})/2} \right| \Big/ (m - 1) \right]
\]

where m equals the number of vowels (or syllables) in an utterance and d_k is the duration of the kth vowel (syllable).

PVI has only recently been applied to the study of speech disorders (Henrich et al., 2006; Wang, Kent, Duffy, Thomas, & Fredericks, 2006), and, to our knowledge, has not been applied to the study of typical speech development.

The main conclusion is that acoustic correlates exist for prosodic constituents,
Figure 5–2. Screen display of a waveform (Panel A), pitch trace (Panel B), and spectro-
gram (Panel C) in TF32. Speech sample is from a three-year-old typically developing child
producing the phrase “cook big hot dogs.” Note the falling intonation pattern shown on the
pitch trace (Panel B) and the syllable-final lengthening indicated by the arrow on the spec-
trogram (Panel C).
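
The Pairwise Variability Index introduced under Meter or Rhythm lends itself to a short worked example. The following is a minimal sketch in Python, assuming the usual reading of the formula from Low, Grabe, and Nolan (2000): the absolute difference between each pair of successive durations is divided by the mean of that pair, the results are averaged over the m − 1 pairs, and the average is multiplied by 100. The function name and the sample durations are illustrative assumptions, not taken from the chapter.

```python
def pairwise_variability_index(durations):
    """Normalized PVI for a sequence of vowel (or syllable) durations.

    `durations` may be in any consistent unit (e.g., msec); the
    normalization by the pair mean makes the result unit-free.
    """
    m = len(durations)
    if m < 2:
        raise ValueError("PVI needs at least two durations")
    total = 0.0
    # Walk through successive pairs (d_{k-1}, d_k).
    for prev, curr in zip(durations, durations[1:]):
        pair_mean = (curr + prev) / 2.0
        total += abs((curr - prev) / pair_mean)
    return 100.0 * total / (m - 1)


# Hypothetical vowel durations in msec for a strong-weak alternation.
alternating = [120, 60, 120, 60]
# The same pattern spoken at half the rate (all durations doubled).
slower = [240, 120, 240, 120]

print(round(pairwise_variability_index(alternating), 2))
print(round(pairwise_variability_index(slower), 2))
```

Because each difference is divided by the mean of the pair, doubling every duration leaves the index unchanged, which is the sense in which PVI is not affected by speaking rate. A sequence of equal durations yields a PVI of 0.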
and these correlates are appropriate means for the study of speech and language development in children, or for disorders in development.

Segmental Analysis

Temporal Patterns

The temporal pattern of speech is determined by multiple influences, ranging from prosodic patterns (considered in the previous section) to intrinsic segment durations (Klatt, 1976). As children gain language proficiency and motor skill, the temporal pattern of their speech increasingly conforms to the adult standard in the language. At the segmental level, temporal measurement applies to the intrinsic duration of phonetic elements or to the effects of the immediate phonetic context. A number of generalizations have been established, including the following: (1) short or lax vowels have a briefer duration than long vowels; and (2) vowels preceding voiced consonants are longer than vowels before voiceless consonants. Other generalizations apply to segments in clusters or in word-sized units: (1) a singleton consonant has a longer duration than the same sound in a consonant cluster; (2) the base form of a word has a shorter duration as prefixes or suffixes are combined with it; and (3) new or novel words are produced with a longer duration than familiar words. These are regularities of American English, and children learn to incorporate them in their speech patterns. Their developmental appearance has clinical relevance. For example, Schwartz (1995) concluded that word familiarity is associated with shorter word duration, and he explained this outcome as evidence of word-specific motor maturation. An implication is that word duration can be used as a clinical index of familiarity or motor maturation.

Munson (2004) examined the mean duration of /s/ frication and its variability in adults and in three groups of children (mean ages of 3;11; 5;04; and 8;04). Children had a larger temporal variability than adults. Weismer and Elbert (1982) studied the temporal characteristics of /s/ production in normally speaking adults, normally speaking children, and children with /s/ misarticulations. The /s/ durations of the misarticulating children were significantly more variable than those for the other two groups. This result was explained in terms of differences in speech motor control capabilities. It appears that temporal variability reflects both maturation and disorder (or perhaps only a single factor if it can be shown that disorder is equivalent to delayed maturation).

Figure 5–3 gives a comparison of typical and atypical (disordered) productions of a simple phrase. The atypical production is noticeably longer, with lengthening of phonetic segments and phrases.

One of the most frequently studied temporal features is the voicing contrast for word-initial stop consonants. These sounds are associated with a sequence of acoustic events, including a transient or burst (a pulse of energy that occurs with the initial release of the constriction), a frication interval (a period of turbulence noise generated as the constriction is progressively opened), and onset of voicing (the initiation of vocal fold vibration for the following vowel). An interval of aspiration typically occurs between the frication and the onset of voicing, so that word-initial voiceless stops in English are aspirated. The interval between the burst and the onset of voicing is called the voice onset time (VOT). VOT has a range of values that are often classified as voicing lead or prevoicing (voicing begins before the stop is released), simultaneous voicing (onset of voicing is simultaneous with the transient), short lag (onset of voicing occurs shortly after the transient), and long lag (onset of voicing occurs well after the transient). In short, VOT is a continuous variable on which various phonetic categories of voicing can be mapped, and these vary across languages. Perceptual studies have shown that listeners are generally oblivious to small differences within a voicing category. For example, a short-lag VOT of 5 msec cannot be distinguished from a short-lag VOT of 15 msec. As young children learn to control the production of VOT, they often begin with a preference for prevoicing or short-lag. Adults will tend to perceive both of these as voiced stops in American English. Macken and Barton (1980) reported that children produced small differences in VOT for voiced and voiceless cognates that were not perceived by adults. In an acoustic study of phonologically disordered children, Catts and Jensen (1983) concluded that some phonologically disordered children may have less mature speech timing control. A recent study with Brazilian Portuguese-speaking children aged between 6 and 10 years old (Gurgueira, 2006) demonstrated that voiced stops are always produced with prevoicing, which is also true
Figure 5–3. Screen displays of waveform, pitch trace, and spectrogram in TF32. Panel A
shows the speech of a 5-year-old boy who is typically developing. Panel B shows the speech
of a 5-year-old boy with apraxia of speech and mild dysarthria. Both boys are producing
the phrase “five more cookies.” Note the overall duration difference for the two productions
and the increased length of individual words and pauses for the child with the speech
disorder (Panel B).
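
The voice onset time categories described under Temporal Patterns (voicing lead, simultaneous voicing, short lag, long lag) can be made concrete with a small sketch. It assumes that the burst time and the voicing-onset time have already been marked by hand on a waveform or spectrogram. The tolerance for "simultaneous" and the short-lag/long-lag boundary used here are hypothetical values chosen for illustration only; the chapter gives no numeric boundaries, and appropriate values differ across languages and studies.

```python
def voice_onset_time_ms(burst_s, voicing_onset_s):
    """VOT in msec: voicing onset minus burst, both given in seconds.

    Negative values mean voicing began before the release (prevoicing).
    """
    return (voicing_onset_s - burst_s) * 1000.0


def classify_vot(vot_ms, simultaneous_tol_ms=2.0, long_lag_ms=30.0):
    """Map a VOT value onto the four descriptive categories.

    Both thresholds are illustrative assumptions, not normative values.
    """
    if vot_ms < -simultaneous_tol_ms:
        return "voicing lead"
    if vot_ms <= simultaneous_tol_ms:
        return "simultaneous"
    if vot_ms < long_lag_ms:
        return "short lag"
    return "long lag"


# Hypothetical measurements: (burst time, voicing onset time) in seconds.
tokens = {
    "prevoiced stop": (1.000, 0.920),
    "short-lag stop": (1.000, 1.015),
    "long-lag stop": (1.000, 1.060),
}
for label, (burst, onset) in tokens.items():
    vot = voice_onset_time_ms(burst, onset)
    print(label, round(vot, 1), classify_vot(vot))
```

Keeping the measured VOT alongside the category label matters clinically: two tokens that fall in the same category, and so sound alike to a listener, may still differ reliably in their measured values.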
for Spanish, Italian, and French (Borden, Harris, & Raphael, 1994).

The differences in VOT that can be registered by acoustic means could have implications for treatment. Tyler, Edwards, and Saxman (1990) used both phonological and acoustic analyses to describe the speech of four children with a phonological disorder. The acoustic analyses indicated that three of the children produced significant, although
frequently imperceptible, differences in VOT for a given stop when it represented different stops in adult speech. These small differences can be taken as evidence of productive phonological knowledge, and it was shown that such knowledge facilitated rapid generalization of correct production of the treated contrast. But when such knowledge was not evident in acoustic analysis, treatment over a longer period was needed to achieve production accuracy on the same treated contrast. But it should be noted that the voicing contrast can be based on several cues, not VOT alone. Forrest and Rockman (1988) suggested that a matrix of acoustic cues is needed to explain the perception of word-initial voicing in the speech of phonologically disordered children. In addition to VOT, these cues include fundamental frequency and F1 frequencies at the onset of voicing, and the amplitude of the burst and aspiration relative to the amplitude of the vowel onset.

Spectral Patterns

Formant descriptions (typically F1-F2 or F1-F2-F3, where Fn is a formant) are low-dimensional descriptions of vowel sounds. One advantage of a formant specification is that a fairly systematic relationship holds between formant pattern and vowel articulation (i.e., the acoustic-to-articulatory conversion). In the classic F1-F2 formant plot, the F1 and F2 frequencies are related principally to tongue height and advancement, respectively. Alternatively, the F2-F1 difference can be interpreted as tongue advancement/retraction. Formant patterns are readily observed in spectrograms or spectra and are among the most salient acoustic properties of speech.

The size of the vowel space, as typically displayed in an F1/F2 plot, is a potential index of the capacity for intelligible speech. Data on the acoustic vowel space in typically developing children have been summarized by Vorperian and Kent (2007). Data for children with speech disorders have been reported for several conditions including dysarthria (Higgins & Hodge, 2001; Liu, Tsao, & Kuhl, 2005), hearing loss (Kent, Osberger, Netsell, & Hustedde, 1987; Liker, Mildner, & Sindija, 2007; Rvachew, Slawinski, Williams, & Green, 1996; Schenk, Baumgartner, & Hamzavi, 2003), and various developmental disorders (Moura et al., 2008). Unusually small areas of the acoustic vowel space are correlated with reduced intelligibility, but it should be noted that some speakers maintain a fairly high level of intelligibility even with a compressed vowel space, so long as other acoustic cues are preserved. Furthermore, vowel-specific formant-frequency differences may have value in characterizing the vocal tract features of particular syndromes (Moura et al., 2008).

Among the most important noise events in speech are the bursts associated with stops and the frication intervals associated with fricatives and affricates. Generally, noise events in speech are characterized by diffuse spectra that possess varying degrees of resonant shaping. Without question, these events carry a great deal of phonetic information. What is less certain is how these acoustic intervals should be characterized. A valuable source of normative data for adults is the article by Jongman, Wayland, and Wong (2000). Some possibilities for the analysis of children’s fricative sounds are considered next.

The earliest analyses used spectrograms and spectral analyses to characterize the noise energy in various /s/ distortions (Daniloff, Wilcox, & Stephens, 1986). One outcome of this work was recognition of the large inter- and intraspeaker variability for children who misarticulated the /s/ sound.
Daniloff et al. concluded that /s/ has a wide range of permissible acoustic allophonic variants, and that this sound accommodates a considerable variation in the upper and lower cutoff frequencies of the major noise energy, and the frequency and amplitude of major spectral peaks. An implication of this conclusion is that it may not be worthwhile to focus on fine spectral details for clinical purposes, but rather to emphasize major regions of noise energy. Taking together the results of the Daniloff et al. (1986), Weismer and Elbert (1982), and Munson (2004) studies reviewed earlier, it appears that both temporal and spectral variability are to be expected in children’s misarticulated /s/. The variability is at once an interesting feature of misarticulated speech and a challenge to researchers and clinicians who would examine this sound.

Spectral moments were introduced as a speech analysis method by Forrest, Weismer, Milenkovic, and Dougall (1988) who treated FFTs as random probability distributions for which the first four moments (mean, variance, skewness, and kurtosis) were computed. The first spectral moment is the mean or center of gravity of the spectrum. The second moment is the distribution of energy around the mean, typically expressed as the variance or standard deviation. The third moment is skewness, which may appear as the degree of spectral tilt (although its exact meaning depends on the overall shape of the spectrum). The fourth moment is kurtosis, which is often defined as the degree of peakedness of the distribution or spectrum. Figure 5–4 illustrates the use of spectral moments for characterizing two fricatives produced by a child. It should be noted that these descriptions are most valid when the underlying distribution has the shape of the normal probability distribution. In fact, acoustic spectra rarely have that shape. The four moments are not uniform in their value in characterizing noise spectra, and a major goal of the ensuing discussion is to identify the moments that hold particular value in spectral description.

Spectral moments have been used to describe fricatives in typically and atypically developing speech. Normative data on /s/ production were reported for 26 children aged 9 to 15 years by Flipsen, Shriberg, Weismer, Karlsson, and McSweeny (1999). It was concluded that /s/ can be characterized satisfactorily by data for the midpoint of the /s/ frication presented in a linear scale (as opposed to the Bark scale), with preference for the 1st and 3rd spectral moments. In addition, the authors noted that the data should be referenced to individual linguistic-phonetic contexts. Rather different conclusions were reached by Nissen and Fox (2005), in a study of adults and children aged 3 to 6 years. Their results indicated that spectral slope and variance, usually neglected in earlier studies of child speech, contributed importantly to the differentiation and classification of the voiceless fricatives. The only measure that separated all four places of fricative articulation was spectral variance. Interestingly, it was also reported that /s/ and /ʃ/ were distinguished more sharply by adults than by children, with a remarkable change in several spectral parameters occurring at about 5 years of age. Munson (2004) compared spectral variability in /s/ production for adults and three groups of children (mean ages of 3;11; 5;04; and 8;04). Spectral variability was defined as changes in the spectral mean (first spectral moment) through the interval of frication noise. Adults produced the /s/ with less variability than the children’s groups, who did not differ from one another. In view of the lack of effects of phonetic context on spectral variability, Munson concluded that the differences between adults and children reflected a “subtle variability in place
Figure 5–4. Spectral moments display from a 4-year-old child, using a 20-msec window at
the temporal midpoint of /s/ (Panel A) and /sh/ (Panel B). Note the difference in skewness
between the two productions.
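
The four spectral moments shown in displays such as Figure 5–4 can be computed directly once a spectrum is treated as a probability distribution over frequency. The sketch below is a minimal illustration in plain Python, assuming that a power spectrum (one value per frequency bin) has already been obtained from a short analysis window. The function name and the sample spectra are hypothetical, and the choice to report kurtosis as excess kurtosis (subtracting 3, so that a normal distribution scores 0) is one common convention rather than a detail confirmed by the chapter.

```python
import math


def spectral_moments(freqs_hz, power):
    """Return (mean, variance, skewness, excess kurtosis) of a spectrum.

    The power values are normalized to sum to 1 so that they can be
    treated as probabilities attached to the bin frequencies.
    """
    total = sum(power)
    if total <= 0:
        raise ValueError("spectrum has no energy")
    probs = [p / total for p in power]

    mean = sum(f * p for f, p in zip(freqs_hz, probs))
    variance = sum((f - mean) ** 2 * p for f, p in zip(freqs_hz, probs))
    if variance == 0:
        raise ValueError("all energy lies in a single bin")
    sd = math.sqrt(variance)

    third = sum((f - mean) ** 3 * p for f, p in zip(freqs_hz, probs))
    fourth = sum((f - mean) ** 4 * p for f, p in zip(freqs_hz, probs))
    skewness = third / sd ** 3
    kurtosis = fourth / variance ** 2 - 3.0  # excess kurtosis (assumed convention)
    return mean, variance, skewness, kurtosis


# Hypothetical spectra: one symmetric, one with energy concentrated
# at the low end and a tail toward higher frequencies.
symmetric = spectral_moments([1000, 2000, 3000], [1, 2, 1])
low_heavy = spectral_moments([1000, 2000, 3000, 4000], [4, 2, 1, 1])
print(symmetric)
print(low_heavy)
```

In this formulation a spectrum whose energy is concentrated at low frequencies with a tail toward high frequencies has positive skewness, and the symmetric example has a skewness of zero. As the text cautions, these summary values are easiest to interpret when the spectrum is roughly bell-shaped, which real fricative spectra often are not.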
of articulation for /s/ in the children’s productions” (p. 58). It should also be noted that children’s speech may differ from adults’ speech in respect to the relative amplitude of its high-frequency components. Short-term spectra of children’s speech sounds have been reported to have reduced amplitudes for /s/ and /ʃ/ and for vowel energy above kHz compared to the same speech sounds produced by adults (Pittman, Stelmachowicz, Lewis, & Hoover, 2003). These differences in relative amplitude are highly relevant to understanding the perception and transcription of children’s speech.

Spectral analyses also have been reported for the burst of stop consonants,
especially the voiceless stops /p t k/. In one of the earliest studies, spectral moments were calculated for word-initial /t/ and /k/ produced by both typically developing children and by children with phonological disorder (Forrest, Weismer, Hodge, Dinnsen, & Elbert, 1990). Using a discriminant function analysis, Forrest et al. achieved 82% correct classification of the two stops using the first, third, and fourth moments. When the discriminant function developed for the normally speaking children was applied to the phonologically disordered children, no distinction could be made between /t/ and /k/. Bunnell, Polikoff, and McNicholas (2004) compared spectral moments and Bark cepstral analyses for classification of children’s word-initial voiceless stops. A better classification rate was achieved for the Bark cepstral analysis. For both types of analysis, four time frames that sampled the initial 40 msec of each burst were needed for the highest rates of correct classification. It is premature to recommend either spectral analysis or Bark cepstral analysis as the preferred method for inspecting stop bursts.

An example that demonstrates both clinical application and the sensitivity of acoustic analysis is for a common type of speech sound error in early speech development, omission or deletion of a segment. This is usually a conspicuous error, readily perceived by adult listeners. However, Weismer (1984) reported that in some cases of an ostensible deletion, acoustic analyses showed that the supposedly deleted consonant had formant transitions appropriate to its phonetic properties. The acoustic cue was not detected by listeners. In another study of apparent omission of word-final stops (Weismer, Dinnsen, & Elbert, 1981), it was shown that two of three children with the omission pattern produced vowel duration differences that were suited to the voicing characteristic of the omitted stop (i.e., longer vowels before voiced stops). Apparently, these two children preserved the stop-voicing feature in their speech, even though listeners judged the stop to be deleted. According to data reported by Krause (1982), the vowel duration cue for voicing appears at least by the age of 3 years. She described the early pattern of development as involving both exaggerated vowel lengthening (before voiced stops) and exaggerated vowel shortening (before voiceless stops).

Spectrotemporal Patterns for Liquids and Glides

The liquids in American English are the lateral /l/ and the rhotic /r/, both of which can be problematic for children acquiring speech. Acoustically, liquids are characterized especially by their formant pattern (/r/) or formant-antiformant pattern (/l/). Acoustic analyses for /r/ are illustrated in Figure 5–5. The glides in American English are the palatal /j/ and the labiovelar /w/ (and its voiceless allophone, which may not be used by all speakers). The glides are associated with a relatively gradual formant transition into the following vowel. Acoustic data on correctly produced /w, r, l/ in both children and adults were reported by Dalston (1975). Chaney (1988) studied three groups of children: a group that correctly produced /w, r, l, j/, a group with developmental w/r and w/l substitutions, and a group of articulation-impaired children who had w/r and w/l substitutions. The children with /w, r/ errors produced the glide /j/ with acoustic properties similar to those seen in the control group, but neither of the groups with errors differentiated among /w, r, l/ by either formant frequencies or transition rate. Interestingly, the /w/ produced for target /w/ and in substitution for /r/ and /l/ by some of the children with errors did not
Figure 5–5. Screen display of a waveform and spectrogram in TF32. Panel A shows a pro-
duction of the word “rock” from a three-year-old typically developing child, where the /r/
phoneme is distorted. Panel B shows a production of the same word by a 6-year-old typically
developing child, where the /r/ phoneme is produced appropriately. Note the difference in
the 3rd formant frequency (as indicated by the arrows in each panel).
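
The F3 difference pointed out in the caption can, in principle, also be read off numerically. What follows is a minimal sketch, not the procedure used by TF32 or by any study cited in this chapter: it builds two synthetic vowel-like tokens with known resonances and recovers them by linear prediction (autocorrelation method), taking the peaks of the LPC envelope as formant estimates. The function names, the 10-kHz sampling rate, the 100-Hz bandwidth, and the formant values are all illustrative assumptions rather than measurements from the figure. Real recordings of children's speech would additionally need pre-emphasis, windowing, and a higher LPC order.

```python
import cmath
import math


def all_pole_coeffs(formants_hz, bandwidth_hz, fs):
    """Build A(z) = 1 + a1*z^-1 + ... with one resonance per formant (assumed values)."""
    a = [1.0]
    for f in formants_hz:
        r = math.exp(-math.pi * bandwidth_hz / fs)   # pole radius from bandwidth
        theta = 2.0 * math.pi * f / fs               # pole angle from frequency
        section = [1.0, -2.0 * r * math.cos(theta), r * r]
        product = [0.0] * (len(a) + 2)
        for i, ai in enumerate(a):
            for j, sj in enumerate(section):
                product[i + j] += ai * sj
        a = product
    return a


def impulse_response(a, n_samples):
    """Response of the all-pole filter 1/A(z) to a single unit impulse."""
    y = []
    for n in range(n_samples):
        acc = 1.0 if n == 0 else 0.0
        for k in range(1, min(n, len(a) - 1) + 1):
            acc -= a[k] * y[n - k]
        y.append(acc)
    return y


def lpc(signal, order):
    """Autocorrelation-method LPC via Levinson-Durbin; returns [1, a1, ..., a_order]."""
    n = len(signal)
    r = [sum(signal[i] * signal[i + lag] for i in range(n - lag))
         for lag in range(order + 1)]
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                                # reflection coefficient
        updated = a[:]
        for j in range(1, i):
            updated[j] = a[j] + k * a[i - j]
        updated[i] = k
        a = updated
        err *= 1.0 - k * k
    return a


def envelope_peaks(a, fs, step_hz=10):
    """Frequencies (Hz) at which the LPC envelope 1/|A(e^jw)| has a local maximum."""
    freqs = list(range(0, fs // 2 + 1, step_hz))
    mags = []
    for f in freqs:
        w = 2.0 * math.pi * f / fs
        a_of_w = sum(ak * cmath.exp(-1j * w * k) for k, ak in enumerate(a))
        mags.append(1.0 / abs(a_of_w))
    return [freqs[i] for i in range(1, len(freqs) - 1)
            if mags[i - 1] < mags[i] >= mags[i + 1]]


def estimate_formants(true_formants_hz, fs=10000, bandwidth_hz=100.0):
    """Synthesize a token with known resonances, then re-estimate them with LPC."""
    token = impulse_response(all_pole_coeffs(true_formants_hz, bandwidth_hz, fs), 512)
    return envelope_peaks(lpc(token, order=2 * len(true_formants_hz)), fs)


if __name__ == "__main__":
    # Hypothetical tokens: one with a high F3, one with the lowered F3 typical of /r/.
    for label, formants in [("higher F3", [500, 1300, 2600]),
                            ("lower F3", [500, 1300, 1900])]:
        print(label, estimate_formants(formants))
```

Because the synthetic tokens are exact all-pole signals, an LPC order of twice the number of resonances is enough here. That choice would not carry over to natural speech, where the order is usually set from the sampling rate and the expected number of formants.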
match the acoustic pattern of /w/ as produced by the children without errors. Shuster (1996) showed how speech resynthesis based on linear prediction coding (LPC) can be used to modify disordered productions of /r/ so that they approximate correct productions. Flipsen et al. (2001) suggested that speech-genetics research would be enhanced
by the availability of acoustic phenotypes, such as residual distortions of rhotic sounds.

Pagan and Wertzner (2007b) studied the acoustic patterns of two of the three Brazilian Portuguese liquids (/r/, a voiced alveolar tap, and /l/, a voiced alveolar lateral) as produced by typically developing children and by children with a phonological disorder who had r/l substitutions. It was found that the /l/ produced as a substitution for /r/ was different from both the /l/ correctly produced by the phonologically disordered children and the /l/ produced by the control group. The /l/ substituting for /r/ had a longer duration, different steady-state values, and a smaller formant slope. This result is another example of an acoustic differentiation for sounds that are judged to be the same by listeners.

Coarticulation

Coarticulation, or the simultaneous adjustment of the articulators to two or more phones, is a basic characteristic of competent adult speech. In forward or anticipatory coarticulation, a phonetic property of a given phone is assumed earlier in the phonetic string. For example, lip rounding for the vowel in the word appears during the initial consonant /s/. In backward or retentive coarticulation, a phonetic property of a given phone is retained to a later position in the phonetic string. An example is the nasalization of the vowel in the word no.

The development of coarticulation is not well understood, and rather different conclusions have been reached from research on children’s speech. In several studies, young children were observed to show more extensive coarticulation than adults (Nittrouer, Studdert-Kennedy, & McGowan, 1989; Nittrouer, Studdert-Kennedy, & Neely, 1996; Siren & Wilcox, 1995), whereas other studies showed no developmental difference, variable patterns across sounds, or greater coarticulation in adults than children (Flege, 1988; Katz, Kripke, & Tallal, 1991; Kent, 1983; Kuijpers, 1993; Repp, 1986; Sereno, Baum, Marean, & Lieberman, 1987; Sussman et al., 1996; Turnbaugh, Hoffman, Daniloff, & Absher, 1985). The different results probably can be explained by reference to the different methods of analysis, and the phonetic properties of the speech material interacting with the maturational status of the child. It appears that there is no single maturational pattern of coarticulation for various sounds, and that a child seeks to balance coarticulatory adjustments against contrastive distinctiveness (Gibson & Ohde, 2007; Nittrouer, 1993; Sussman, Duder, Dalston, & Cacciatore, 1999; Sussman, Hoemeke, & McCaffrey, 1992).

Acoustic Correlates of Speaker Intelligibility

A long-term goal in the application of acoustics is to determine the acoustic correlates of intelligibility (see also Chapter 10). Research on this topic is hindered by the potentially large number of acoustic features that can be considered, and also by the fact that speakers can deploy acoustic cues in various combinations to achieve a satisfactory degree of intelligibility. It seems safe to conclude on the basis of available evidence that the same general acoustic properties are relevant to both adult and child speech (Hazen & Markham, 2004).

Research on “clear” versus “conversational” speech holds value in understanding the acoustic bases of speech intelligibility (Picheny et al., 1985, 1986, 1989). Acoustic analyses of the two forms of speech have
shown consistent differences, thereby laying a foundation for a general understanding of the acoustic correlates of intelligibility (Picheny et al., 1985, 1986, 1989). As compared to “clear” speech, “conversational” speech tends to have modified or reduced vowels, nonreleased word-final stops, and reduced intensities for obstruents. Although “clear” speech typically is slower than “conversational” speech, it is important to note that enhancements of intelligibility can be achieved even at rapid speaking rates (Krause & Braida, 1995). The acoustic differences between “clear” and “conversational” speech may explain intrinsic intelligibility differences among individual speakers. Bond and Moore (1994) studied the acoustic-phonetic differences between a talker with relatively high intelligibility and two talkers with relatively low intelligibility. The high-intelligibility talker had many acoustic-phonetic properties similar to those described for “clear” speech. In a similar study of individual differences in intelligibility, Bradlow, Torretta, and Pisoni (1996) concluded that global characteristics (e.g., speaking rate and mean F0 level) did not correlate strongly with intelligibility, but the fine-grained characteristics (F0 and F1 variation, formant frequency range for vowels, intersegmental timing) did correlate. The profile of a highly intelligible speaker was one who produced sentences with a relatively wide range of F0, a relatively expanded vowel space that includes a substantial F1 variation, precise articulation of the point vowels, and a high precision of intersegmental timing. Therefore, there is an important linkage between two general approaches to the study of intelligibility differences in normal speakers.

Similar results can be seen in studies of dysarthric and deaf speakers that have established fine-grained acoustic characteristics relating to differences in speaker intelligibility (Kent et al., 1989; Metz et al., 1985; Monsen, 1976; Weismer & Martin, 1992). In the main, the results from dysarthric and deaf speech agree with the results reviewed above for normal speech. That is, the differences in intelligibility appear to be rooted in a common set of fine-grained acoustic measures including vowel formant frequencies and intersegmental timing.

If these results can be generalized to speech development and to developmental speech disorders, then the implication is that fine-grained acoustic properties are the key to understanding differences in speech intelligibility.

Variability as an Index of Precision and Maturation

As earlier sections of this chapter make clear, variability has been a particular focus of research on both typical and atypical speech development. Across motor skills, it is generally presumed that increasing accuracy is a characteristic of skill maturation. One way of gauging accuracy is to determine the variability in a motor response, or, in the case of speech, the acoustic consequences of that motor response. In one of the earliest studies to address this issue, Eguchi and Hirsh (1969) reported that there were nearly continuous decreases in the variability of both F1 and F2 frequencies from 3 to 11 years of age in typically developing children. One interpretation is that motor skill for speech improves with age, and acoustic measures of formant structure reflect this improvement up until the age of puberty. But other acoustic data point to a different conclusion. Nittrouer (1993) reported that the variability in F1 frequency was minimal by the age of 3 years whereas variability in F2 frequency continued to decrease beyond that age. The early accuracy
in F1 frequency was related to an early maturation of jaw movement control, given that jaw movement has a strong effect on F1 frequency. In fact, the maturation of motor control over different oral structures is open to discussion. Although it has been reported that children’s jaw movements are less variable than lip movements (Green, Moore, & Reilly, 2002; Walsh & Smith, 2002), it also has been shown that there are parallel decreases in the variability of jaw and lip movements with maturation (Walsh & Smith, 2002).

Variability in the temporal patterns of speech also has been examined. The general conclusion is that variability declines with age until late childhood, puberty, or adolescence (Kent & Forner, 1978; Lee, Potamianos, & Narayanan, 1999; Lehman & Sharf, 1989; Munson, 2004; Smith & Kenney, 1999). However, changes in variability are not necessarily uniform across different segments (Kent & Forner, 1978; Smith & Kenney, 1999).

Variability is actually relevant to several developmental issues, including the following:

(a) Estimates of the variability of either spectral or temporal features have been proposed as an index of the maturation of speech motor control, as noted above. Variability, commonly expressed as a standard deviation, is considered as an estimate of precision of articulation. This approach requires analysis of multiple tokens of a given speech target. It is assumed that the speaker is able to create a stable representation of the target behavior from which motor commands to the articulators can be formulated.

(b) In general, the variability in temporal segments is related to speaking rate, such that a slow rate is associated with greater variability. Because children typically have a slower rate of speech than adults, speaking rate is confounded with the maturational factor mentioned in (a) above (Kent & Forner, 1978). Children with a speech-language disorder may have even slower speaking rates than typically developing children. This slow rate may be related to the combined effects of development and disorder.

(c) Variability may be a gauge of category breadth or coarticulatory range. For example, as children add elements to their vowel systems, the allowable range for any one vowel may be adjusted to accommodate the insertion of new vowel sounds. Similarly, variability in producing a particular word may be related to the lexical density for that word. Presumably, a word with a high neighborhood density would be produced more accurately than a word with a low neighborhood density.

(d) Variability in a spectral or temporal feature (or a spectrotemporal property) may be an indication of destabilizing forces. In a dynamic systems perspective, periods of destabilization may be optimum times for intervention.

(e) Measures of temporal pattern are not as sensitive to age and gender variables as are measures of formant or general spectral pattern. Reliability estimates of temporal measures are reviewed in Kent and Read (2002).

Obviously, a simple interpretation of variability is not likely to be correct unless this list can be pruned to one or two applicable alternatives. Unfortunately, many developmental studies were not designed to address each of these factors in an empirical fashion that allows their separation.

The clinical implication is not necessarily that a clinician will record ten or more tokens of a sound pattern and then calculate standard deviations for a selected measurement. Such a procedure may be forbiddingly tedious for both the child and the clinician. Rather, the object is more likely to be to ascertain the stability of production
in relation to a clinical objective. Say, for example, that the objective is treatment of a speech sound disorder. Frequently, clinicians want to establish a degree of stability in production of a certain sound pattern before introducing a change of some kind, such as working on another target sound, changing the phonetic context of sound production, or varying prosodic features such as speaking rate or stress.

Sensitivity

Acoustic analysis is capable of resolving fine differences in the timing and spectra of speech sounds. Differences that are not detected by the ear can be detected by suitable acoustic analyses that are performed in the time domain (waveform), frequency domain (FFT or LPC spectrum, cepstrum, or other analysis), or the time-frequency domain (spectrogram or other running spectral display). The issue here is not necessarily quantification, as important as that may be, but identifying the sheer presence or absence of an acoustic property. Examples from the literature are discussed below to illustrate the concept for both segmental and suprasegmental aspects of speech.

The sensitivity of acoustic analysis does not necessarily depend on quantification. Sometimes, simply observing the presence or absence of an acoustic phenomenon is sufficient. In some examples given earlier in this chapter, measurements were not always needed. Rather, the person performing the analysis used acoustics as a kind of alternative visual display (a highly sensitive one) to the analysis performed by the ear. This approach made it possible to detect (a) acoustic differences between stressed and unstressed syllables that were not consistently perceived by adults (Munson, Bjorum, & Windsor, 2003), (b) small differences in VOT even for stimuli that were not distinguished by adult listeners (Macken & Barton, 1980; Tyler, Edwards, & Saxman, 1990), and (c) acoustic evidence of a phonetic feature of a speech sound that was supposedly omitted (Weismer, 1984; Weismer, Dinnsen, & Elbert, 1981).

This “look and listen” strategy can be quite powerful, as it enables the observer (e.g., clinician, researcher) to observe the visual display of speech and to reconcile it with what is heard. Qualitative analysis has much to recommend it. As noted by Liss and Weismer (1992), “traditional acoustic measures of temporal and spectral characteristics of normal speech may not necessarily reveal the inherently ‘important’ aspects of disordered speech production” (p. 2984). This is not to assert that quantitative analyses are irrelevant to the study of disordered speech, but rather to say that qualitative analyses are a valuable complement to quantitative methods. For additional discussion of this issue, see Weismer and Liss (1991).

Each individual clinician must ask herself or himself whether acoustic tools will make for better clinical services. Technology is only as useful as the use to which it is put. The dramatic progress in speech technology (automatic speech recognition, speech synthesis, no-cost or low-cost speech analysis software) presents a powerful set of tools for the future practice of speech-language pathology.

Conclusion

The first author was a long-term faculty colleague of Larry Shriberg at the University of Wisconsin-Madison. We co-authored a text, Clinical Phonetics, now in its third edition.
In preparing the text and the accompanying audiotapes, we accomplished a kind of mutual calibration of our “phonetic” ears as we listened (repeatedly) to samples of children’s speech disorders. Our labors began in the pre-DSP days, which meant that speech samples existed physically as pieces of audiotape (analog recordings). I recall seeing strips of tape hanging around the room where we worked. These were eventually assembled by tape-splicing methods into tapes for auditory exercises for phonetic transcription. If we undertook that effort today, it would be very different. We would use digital signal processing to record, store, and analyze the samples. Rather than compare notes strictly on our respective auditory impressions (which differed now and then) of each sample, we would examine visual displays of acoustic information. Would this information be helpful? I have no doubt that it would.

References

Boersma, P., & Weenink, D. (2008). PRAAT (computer program). University of Amsterdam; available at http://www.fon.hum.uva.nl/praat .
Bond, Z. S., & Moore, T. J. (1994). A note on the acoustic-phonetic characteristics of inadvertently clear speech. Speech Communication, 14, 325–337.
Borden, S. J., Harris, K. S., & Raphael, L. J. (1994). Speech science primer: Physiology, acoustics and perception of speech. Baltimore: Williams & Wilkins.
Bradlow, A. R., Torretta, G. M., & Pisoni, D. B. (1996). Intelligibility of normal speech. I. Global and fine-grained acoustic-phonetic talker characteristics. Speech Communication, 20, 255–272.
Bunnell, H. T., Polikoff, J., & McNicholas, J. (2004). Spectral moment vs. Bark cepstral analysis of children’s word-initial voiceless stops. Available at http://www.asel.udel.edu/speech/reports/icslp04/ICSLP04_paper.pdf .
Catts, H. W., & Jensen, P. J. (1983). Speech timing of phonologically disordered children: Voicing contrast of initial and final stop consonants. Journal of Speech and Hearing Research, 26, 501–510.
Chaney, C. (1988). Acoustic analysis of correct and misarticulated semivowels. Journal of Speech and Hearing Research, 31, 275–287.
Dalston, R. M. (1975). Acoustic characteristics of English /w, r, l/ spoken correctly by young children and adults. Journal of the Acoustical Society of America, 57, 462–469.
Daniloff, R. G., Wilcox, K., & Stephens, M. I. (1980). An acoustic-articulatory description of children’s defective /s/ productions. Journal of Communication Disorders, 13, 347–363.
Eguchi, S., & Hirsh, I. J. (1969). Development of speech sounds in children. Acta Otolaryngologica, Supplementum, 257, 1–51.
Flege, J. E. (1988). Anticipatory and carry-over nasal coarticulation in the speech of children and adults. Journal of Speech and Hearing Research, 31, 525–536.
Flipsen, P., Jr., Shriberg, L., Weismer, G., Karlsson, H., & McSweeny, J. (1999). Acoustic characteristics of /s/ in adolescents. Journal of Speech, Language, and Hearing Research, 42, 663–677.
Flipsen, P., Jr., Shriberg, L., Weismer, G., Karlsson, H., & McSweeny, J. (2001). Acoustic phenotypes for speech-genetics studies: Reference data for residual /Er/ distortion. Clinical Linguistics and Phonetics, 15, 603–630.
Forrest, K., & Rockman, B. K. (1988). Acoustic and perceptual analysis of word-initial stop consonants in phonologically disordered children. Journal of Speech, Language, and Hearing Research, 31, 449–459.
Forrest, K., Weismer, G., Hodge, M., Dinnsen, D. A., & Elbert, M. (1990). Statistical analysis of word-initial K and T produced by normal and phonologically disordered children. Clinical Linguistics and Phonetics, 4, 327–340.
Forrest, K., Weismer, G., Milenkovic, P., & Dougal, R. N. (1988). Statistical analysis of word-initial voiceless obstruents: Preliminary data. Journal of the Acoustical Society of America, 84, 115–123.
Gerken, L., & McGregor, K. (1998). An overview of prosody and its role in normal and disordered child language. American Journal of Speech-Language Pathology, 7, 38–48.
Gibson, T., & Ohde, R. N. (2007). F2 locus equations: Phonetic descriptors of coarticulation in 17- to 22-month-old children. Journal of Speech, Language, and Hearing Research, 50, 97–108.
Green, J. R., Moore, C. A., & Reilly, K. U. (2002). The sequential development of jaw and lip control in speech. Journal of Speech, Language, and Hearing Research, 45, 66–79.
Gurgueira, A. L. (2006). Estudo acústico dos fonemas surdos e sonoros do português do Brasil, em crianças com distúrbio fonológico apresentando o processo fonológico de ensurdecimento. [Acoustic study of the voice onset time (VOT) and the vowel duration for the distinction of voicing of the stop sounds of the Brazilian Portuguese in children with typical development and with phonological disorders]. Doctoral dissertation. Department of Linguistics of the Faculty of Philosophy, Sciences and Literature and Languages from the University of São Paulo, Brazil.
Hazen, V., & Markham, D. (2004). Acoustic-phonetic correlates of talker intelligibility for adults and children. Journal of the Acoustical Society of America, 116, 3108–3118.
Henrich, J., Lowit, A., Schalling, E., & Mennen, I. (2006). Rhythmic disturbances in ataxic dysarthria: A comparison of different measures and speech tasks. Journal of Medical Speech-Language Pathology, 14, 291–296.
Higgins, C. M., & Hodge, M. M. (2001). F2/F1 vowel quadrilateral area in young children with and without dysarthria. Canadian Acoustics, 29, 66–68.
Huggins, A. W. F. (1980). Better spectrograms from children’s speech: A research note. Journal of Speech and Hearing Research, 23, 19–27.
Ingram, K., Bunta, F., & Ingram, D. (2004). Digital data collection and analysis. Language, Speech, and Hearing Services in Schools, 35, 112–121.
Jongman, A., Wayland, R., & Wong, S. (2000). Acoustic characteristics of English fricatives. Journal of the Acoustical Society of America, 108, 1252–1263.
Katz, W. F., Kripke, C., & Tallal, P. (1991). Anticipatory coarticulation in the speech of adults and young children. Journal of Speech and Hearing Research, 34, 1222–1232.
Kehoe, M., Stoel-Gammon, C., & Buder, E. H. (1995). Acoustic correlates of stress in young children’s speech. Journal of Speech and Hearing Research, 38, 338–350.
Kent, R. D. (1976). Anatomical and neuromuscular maturation of the speech mechanism: Evidence from acoustic studies. Journal of Speech and Hearing Research, 19, 421–447.
Kent, R. D. (1983). Segmental organization of speech. In P. F. MacNeilage (Ed.), The production of speech. New York: Springer Verlag.
Kent, R. D. (1996). Hearing and believing: Some limits to the auditory-perceptual assessment of speech and voice disorders. American Journal of Speech-Language Pathology, 7, 7–23.
Kent, R. D., & Forner, L. L. (1980). Speech segment durations in sentence recitations by children and adults. Journal of Phonetics, 8, 157–168.
Kent, R. D., & Hustad, K. C. (2009). Speech production, development. In L. Squire, T. Albright, F. Bloom, F. Gage, & N. Spitzer (Eds.), New encyclopedia of neuroscience. Oxford, UK: Elsevier.
Kent, R. D., & Murray, A. D. (1982). Acoustic features of infant vocalic utterances. Journal of the Acoustical Society of America, 72, 353–365.
Kent, R. D., Osberger, M. J., Netsell, R., & Hustedde, C. G. (1987). Phonetic development in twins who differ in auditory function. Journal of Speech and Hearing Disorders, 52, 64–75.
Kent, R. D., & Read, C. (2002). The acoustic analysis of speech (2nd ed.). Albany, NY: Singular/Thomson Learning.
Kent, R. D., & Tilkens, C. (2007). Oral motor foundations of speech. To appear in S. McLeod (Ed.), The international guide to speech acquisition. Albany, NY: Delmar.
Kent, R. D., & Vorperian, H. K. (1995). Anatomic development of the craniofacial-oral-laryngeal systems: A review. Journal of Medical Speech-Language Pathology, 3, 145–190. (Also published as a monograph (1995) San Diego, CA: Singular.)
Kent, R. D., Weismer, G., Kent, J. F., & Rosenbek, J. C. (1989). Toward phonetic intelligibility testing in dysarthria. Journal of Speech and Hearing Disorders, 54, 482–499.
Klatt, D. H. (1976). Linguistic uses of segmental duration in English: Acoustic and perceptual evidence. Journal of the Acoustical Society of America, 59, 1208–1221.
Krause, J. C., & Braida, L. D. (1995). The effects of speaking rate on the intelligibility of speech for various speaking modes. Journal of the Acoustical Society of America, 98, 2982.
Krause, S. E. (1982). Developmental use of vowel duration as a cue to postvocalic stop consonant voicing. Journal of Speech and Hearing Research, 25, 388–393.
Kuijpers, C. (1993). Temporal aspects of the voiced-voiceless distinction in speech development of young Dutch children. Journal of Phonetics, 21, 313–327.
Lee, S., Potamianos, A., & Narayanan, S. (1999). Acoustics of children’s speech: Developmental changes of temporal and spectral parameters. Journal of the Acoustical Society of America, 105, 1455–1468.
Lehman, M. E., & Sharf, D. J. (1989). Perception/production relationships in the development of the vowel duration cue to final consonant voicing. Journal of Speech and Hearing Research, 32, 803–815.
Liker, M., Mildner, V., & Sindija, B. (2007). Acoustic analysis of the speech of children with cochlear implants: A longitudinal study. Clinical Linguistics and Phonetics, 21, 1–11.
Liss, J. M., & Weismer, G. (1992). Qualitative acoustic analysis in the study of motor speech disorders. Journal of the Acoustical Society of America, 92, 2984–2987.
Liu, H. M., Tsao, F. M., & Kuhl, P. K. (2005). The effect of reduced vowel working space on speech intelligibility in Mandarin-speaking young adults with cerebral palsy. Journal of the Acoustical Society of America, 117, 3879–3889.
Low, E. L., Grabe, E., & Nolan, F. (2000). Quantitative characterizations of speech rhythm: Syllable-timing in Singapore English. Language and Speech, 43, 377–401.
Macken, M. A., & Barton, D. (1980). The acquisition of the voicing contrast in English: Study of voice onset time in word-initial stop consonants. Journal of Child Language, 7, 41–74.
Metz, D., Samar, V., Schiavetti, N., Sitler, R., & Whitehead, R. (1985). Acoustic dimensions of hearing-impaired speakers’ intelligibility. Journal of Speech and Hearing Research, 28, 345–355.
Monsen, R. B. (1976). Normal and reduced phonological space: The productions of English vowels by deaf adolescents. Journal of Phonetics, 4, 189–198.
Moura, C. P., Cunha, L. M., Vilarinho, H., Cunha, M. J., Freitas, D., Palha, M., et al. (2008). Voice parameters in children with Down syndrome. Journal of Voice, 22, 34–42.
Munson, B. (2004). Variability in /s/ production in children and adults: Evidence from dynamic measures of spectral mean. Journal of Speech, Language, and Hearing Research, 47, 58–69.
Munson, B., Bjorum, E. M., & Windsor, J. (2003). Acoustic and perceptual correlates of stress in nonwords produced by children with suspected developmental apraxia of speech and children with phonological disorder. Journal of Speech, Language, and Hearing Research, 46, 189–202.
Nissen, S. L., & Fox, R. A. (2005). Acoustic and spectral characteristics of young children’s fricative productions: A developmental perspective. Journal of the Acoustical Society of America, 118, 2570–2578.
Nittrouer, S. (1993). The emergence of mature gestural patterns is not uniform: Evidence from an acoustic study. Journal of Speech and Hearing Research, 36, 959–972.
Nittrouer, S., Studdert-Kennedy, M., & McGowan, R. S. (1989). The emergence of phonetic segments: Evidence from the spectral structure of fricative-vowel syllables spoken by children and adults. Journal of Speech and Hearing Research, 32, 120–132.
Nittrouer, S., Studdert-Kennedy, M., & Neely, S. (1996). How children learn to organize their speech gestures: Further evidence from fricative-vowel syllables. Journal of Speech and Hearing Research, 39, 379–389.
Pagan, L. O., & Wertzner, H. F. (2007a). Perceptual analysis of the three Brazilian Portuguese liquid sounds. Poster to be presented at the 27th World Congress of the International Association of Logopedics and Phoniatrics, Copenhagen, Denmark.
Pagan, L. O., & Wertzner, H. F. (2007b). Description of the acoustic characteristics of the liquid sounds /l/ and /r/ in phonologically disordered children. Poster presented at the 2nd International Association of Logopedics and Phoniatrics Composium, Sao Paulo, Brazil, March 24–25.
Picheny, M. A., Durlach, N. I., & Braida, L. D. (1985). Speaking clearly for the hard of hearing I: Intelligibility differences between clear and conversational speech. Journal of Speech and Hearing Research, 28, 96–103.
Picheny, M. A., Durlach, N. I., & Braida, L. D. (1986). Speaking clearly for the hard of hearing II: Acoustic characteristics of clear and conversational speech. Journal of Speech and Hearing Research, 29, 434–446.
Picheny, M. A., Durlach, N. I., & Braida, L. D. (1989). Speaking clearly for the hard of hearing III: An attempt to determine the contribution of speaking rate to difference in intelligibility between clear and conversational speech. Journal of Speech and Hearing Research, 32, 600–603.
Pittman, A. L., Stelmachowicz, P. G., Lewis, D. E., & Hoover, B. M. (2003). Spectral characteristics of speech at the ear. Journal of Speech, Language, and Hearing Research, 46, 649–657.
Read, C., Buder, E., & Kent, R. D. (1990). Speech analysis systems: A survey. Journal of Speech and Hearing Research, 33, 363–374.
Repp, B. (1986). Some observations on the development of anticipatory coarticulation. Journal of the Acoustical Society of America, 79, 1616–1619.
Rvachew, S., Slawinski, E. G., Williams, M., & Green, C. L. (1996). Formant frequencies of vowels produced by infants with and without early onset otitis media. Canadian Acoustics/Acoustique Canadienne, 24, 19–28.
Schenk, B. S., Baumgartner, W. D., & Hamzavi, J. S. (2003). Changes in vowel quality after cochlear implantation. ORL Journal of Otorhinolaryngology Related Specialties, 65, 184–188.
Schwartz, G. R. (1995). Effect of familiarity on word duration in children’s speech: A preliminary investigation. Journal of Speech and Hearing Research, 38, 76–84.
Sereno, J. A., Baum, S. R., Marean, G. C., & Lieberman, P. (1987). Acoustic analyses and perceptual data on anticipatory coarticulation in adults and children. Journal of the Acoustical Society of America, 81, 512–519.
Shuster, L. I. (1996). Linear predictive coding parameter manipulation/synthesis of incorrectly produced /r/. Journal of Speech and Hearing Research, 39, 827–832.
Siren, K. A., & Wilcox, K. A. (1995). Effects of lexical meaning and practiced productions on coarticulation in children’s and adults’ speech. Journal of Speech and Hearing Research, 38, 351–359.
Smith, B. L., & Kenney, M. K. (1999). A longitudinal study of the development of temporal properties of speech production: Data from 4 children. Phonetica, 56, 73–102.
Snow, D. (1994). Phrase-final syllable lengthening and intonation in early child speech. Journal of Speech, Language, and Hearing Research, 37, 831–840.
Snow, D. (1998). Prosodic markers of syntactic boundaries in the speech of 4-year-old children with normal and disordered language development. Journal of Speech, Language, and Hearing Research, 41, 1158–1170.
Sussman, H. M., Duder, C., Dalston, E., & Cacciatore, A. (1999). An acoustic analysis of the development of CV coarticulation. Journal of Speech, Language, and Hearing Research, 42, 1080–1096.
Sussman, H. M., Hoemeke, K. A., & McCaffrey, H. A. (1992). Locus equations as index of coarticulation for place of articulation distinctions in children. Journal of Speech and Hearing Research, 35, 769–781.
Sussman, H. M., Minifie, F. D., Buder, E. H., Stoel-Gammon, C., & Smith, J. (1996). Consonant-vowel interdependencies in babbling and early words. Journal of Speech and Hearing Research, 39, 424–433.
Thom, S., Hoit, J., Hixon, T., & Smith, A. (2005). Velopharyngeal function during vocalization in infants. The Cleft Palate-Craniofacial Journal [published online 15 November; doi: 10.1597/05-113].
Turnbaugh, K., Hoffman, P., Daniloff, R. G., & Absher, R. (1985). Stop-vowel coarticulation in 3-year-olds, 5-year-olds, and adults. Journal of the Acoustical Society of America, 77, 1256–1258.
Tyler, A. A., Edwards, M. L., & Saxman, J. H. (1990). Acoustic validation of phonological knowledge and its relationship to treatment. Journal of Speech and Hearing Disorders, 55, 251–261.
Vorperian, H. K., & Kent, R. D. (2007). Vowel acoustic space development in children: A synthesis of acoustic and anatomic data. Journal of Speech, Language, and Hearing Research, 50, 1510–1545.
Walsh, B., & Smith, A. (2002). Articulatory movements in adolescents: Evidence for protracted development of speech motor control processes. Journal of Speech, Language, and Hearing Research, 45, 1119–1133.
Wang, Y.-T., Kent, R. D., Duffy, J. R., & Thomas, J. E. (2005). Dysarthria in traumatic brain injury: A breath group and intonational analysis. Folia Phoniatrica et Logopaedica, 57, 59–89.
Wang, Y.-T., Kent, R. D., Duffy, J. R., Thomas, J. E., & Fredericks, G. V. (2006). Dysarthria following cerebellar mutism secondary to resection of a fourth ventricle medulloblastoma: A case study. Journal of Medical Speech-Language Pathology, 14, 109–122.
Weismer, G. (1984). Acoustic analysis strategies for the refinement of phonological analyses. In M. Elbert, D. A. Dinnsen, & G. Weismer (Eds.), Phonological theory and the misarticulation child (pp. 30–52). ASHA Monographs (No. 22). Rockville, MD: American Speech-Language-Hearing Association.
Weismer, G., Dinnsen, D., & Elbert, M. (1981). A study of the voicing distinction associated with omitted, word-final stops. Journal of Speech and Hearing Disorders, 46, 320–328.
Weismer, G., & Elbert, M. (1982). Temporal characteristics of “functionally” misarticulated /s/ in 4- to 6-years-old children. Journal of Speech and Hearing Research, 25, 275–287.
Weismer, G., & Liss, J. M. (1991). Acoustic/perceptual taxonomies of disordered speech. In C. Moore, K. Yorkston, & D. Beukelman (Eds.), Dysarthria and apraxia of speech: Perspectives on management (pp. 245–270). Baltimore: Brookes.
Weismer, G., & Martin, R. (1992). Acoustic and perceptual approaches to the study of intelligibility. In R. Kent (Ed.), Intelligibility in speech disorders (pp. 67–118). Philadelphia: John Benjamins.
