
Factors affecting the detection of mispronunciations

R. A. Cole and J. A. Jakimik

Citation: The Journal of the Acoustical Society of America 60, S27 (1976); doi: 10.1121/1.2003252
View online: https://doi.org/10.1121/1.2003252
View Table of Contents: https://asa.scitation.org/toc/jas/60/S1
Published by the Acoustical Society of America

92nd Meeting: Acoustical Society of America S27

initial stop, liquid or aspirate induces no appreciable lengthening, other segments in the syllable being sharply reduced; and initial /s/ lengthens the syllable breath group by an amount approaching the duration of the fricative segment, the durations of the other acoustic segments being slightly reduced. It is hoped that these results will contribute to more adequate synthesis by rule. [Work supported by NIH and VA.]
*Also Department of Linguistics, University of Connecticut.

2:30

M4. Use of nonsense-syllable mimicry in the study of prosodic phenomena. Mark Y. Liberman and Lynn A. Streeter (Bell Laboratories, Murray Hill, NJ 07974)

The technique of nonsense-syllable mimicry of natural utterances [used by Lindbloom and Rapp, Publ. No. 21, Institute of Linguistics, University of Stockholm (1973) (unpublished)] has many advantages in the study of prosodic phenomena, especially duration. In analytic studies, the elimination of segmental effects as a factor makes data collection much more efficient, and requires only one segmentation criterion. In perceptual studies, the technique eliminates lexical information without unnatural distortions of the signal. In a series of validation experiments, we have found that: (a) the patterns of duration obtained by using this technique were stable and reproducible within and across speakers; (b) mimicry of different natural models with identical stress patterns and constituent structures produced indistinguishable nonsense-syllable duration patterns; (c) obtained duration patterns correspond closely to results of work on natural speech by ourselves and others.

2:40

M5. Syntactic boundaries in the timing of trochaic speech. William E. Cooper, Steven G. Lapointe, and Jeanne M. Paccia (Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, MA 02139)

The duration of a stressed syllable is shortened when it is immediately followed by an unstressed syllable. Previous work showed that this effect operates across word boundaries but is diminished in magnitude by the presence of an intervening syntactic boundary. In this study, the durations of key segments within stressed syllables were measured in sentence pairs containing a matched phonetic environment. The results for ten speakers showed that the shortening effect was blocked in the presence of a number of major syntactic boundaries, including the boundaries between a noun phrase (NP) and a prepositional phrase (PP), between two PP's, and between a NP or PP and a separate clause. Lesser syntactic boundaries, including the boundaries between a verb and NP direct object and between an NP direct object and NP indirect object, did not block the shortening rule. The magnitude of the blocking effect did not depend on the transformational history or internal structure of the syntactic constituents as much as on the boundary type.

2:50

M6. Syntactic determinants of word duration in speech. Richard Goldhor (Research Laboratory of Electronics, 36-575, Massachusetts Institute of Technology, Cambridge, MA 02139)

A preliminary identification of the primary syntactic determinants of duration in English declarative sentences has been made. A hundred declarative sentences were generated from a small set of words, and the durational variations of those words which appeared in more than six sentences were studied. The three syntactic determinants which were found to most strongly and consistently affect word duration are: (1) the length of the phrase in which a word appears; (2) the position of the phrase in its dominating clause; and (3) the number and type of clauses in the sentence. Additionally, these determinants were found to affect phrase-final words differently than phrase-nonfinal words. These findings have been incorporated into a simple prosodic durational algorithm suitable for use in a text-to-speech system.

3:00

M7. Reading speed as a clue to text structure. M.O. Harris (Acoustics Research Department, Bell Laboratories, Murray Hill, NJ 07974)

Text consists of structured content. Do readers reflect that structure in their production? A system is under design to map the content pattern of a given text, sentence by sentence, thus locating major and lesser topic boundaries. The system depends on word frequency and distribution in that text, and on the recognition of synonyms and other forms of reference, which establish links between text segments. Acoustic tools help to identify significant variations in reading performance, and these can then be interpreted in terms of the text structure. Material for the present paper consists primarily of four passages, each read by three speakers: about an hour's reading in all. The passages are structurally independent excerpts from much larger works. All of the sentences were randomized together and read as one list, and then each passage was read in original form. A study based on duration measurements indicates clear agreement among speakers as to the locations where reading speed changes occur: e.g., in sentences of transition between major topics, and at points of salient semantic shift. However, the nature of the change (increase or decrease in speed) is highly speaker dependent.

3:10

M8. Perceptual phonetics of coarticulation. J.G. Martin, R.H. Meltzer, and C.B. Mills (Department of Psychology, University of Maryland, College Park, MD 20742)

In experiments reported earlier at ASA meetings (November, 1975) and elsewhere, reaction time (RT) of subjects monitoring stop-consonant phoneme targets in tape-recorded sentences was observed. RT was compared when the target (a) was carried by the normal, intact sentence version or (b) was temporally displaced by experimental intervention, that is, separated from prior sentence context by addition of 200 msec to the normal pre-stop-consonant silent interval. Faster RT to temporally displaced than to normal targets was interpreted in terms of coarticulatory cues to target existing in the speech interval preceding the intervention which were used to anticipate the target across the intervention interval. In further analysis, data were separated on the basis of four classes of pretarget phonetic context: stop, fricative, sonorant, and vowel. All classes produced coarticulatory effects (relatively faster RT to displaced compared to normal targets), some more than others. Additional analysis indicated similar effects in the normal sentence versions also. Discussion concerns the mapping of perceptual results onto acoustic and articulatory data. [Work supported by NIMH, ARIBSS.]

3:20

M9. Factors affecting the detection of mispronunciations. R.A. Cole and J.A. Jakimik (Department of Psychology, Carnegie-Mellon University, Pittsburgh, PA 15213)

In a series of experiments, subjects were presented with sentences or short stories, and required to press a response key whenever they detected a mispronounced word. Mispronunciations were produced by changing one phonetic segment in a word to produce a nonword. Detection of mispronunciations was affected by factors at several linguistic levels: (a) the type of phonetic change, (b) the position of the changed segment in the word, (c) the stress of the syllable (e.g., "contain" versus "contour", where /k/ was pronounced /g/ in each word), (d) the syntactic structure of the sentence, and (e) whether the word containing the mispronunciation occurred in a previous sentence. The results will be summarized and discussed.

3:30

M10. Word-letter phenomenon with speech stimuli: a word-segment effect. Joseph Rogers (Department of Psychology, University of California at San Diego, La Jolla, CA 92093)
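
As an illustration of the stimulus construction described in abstract M9 (changing one phonetic segment in a word to produce a nonword), the following Python sketch may help. It is not the authors' procedure: the ARPAbet-like transcriptions, the toy lexicon, and the function name `mispronounce` are all assumptions introduced here for illustration.

```python
from typing import Sequence, Set, Tuple

Phonemes = Tuple[str, ...]

# Hypothetical toy lexicon in ARPAbet-like notation. The transcriptions are
# illustrative assumptions and do not come from the abstract.
LEXICON: Set[Phonemes] = {
    ("K", "AH", "N", "T", "EY", "N"),  # "contain" (stress on second syllable)
    ("K", "AA", "N", "T", "UH", "R"),  # "contour" (stress on first syllable)
    ("K", "OW", "L", "D"),             # "cold"
    ("G", "OW", "L", "D"),             # "gold"
}


def mispronounce(
    word: Sequence[str],
    position: int,
    new_segment: str,
    lexicon: Set[Phonemes] = LEXICON,
) -> Phonemes:
    """Replace exactly one segment and require that the result is a nonword."""
    if not 0 <= position < len(word):
        raise IndexError("position is outside the word")
    if word[position] == new_segment:
        raise ValueError("the segment must actually change")
    changed = tuple(word[:position]) + (new_segment,) + tuple(word[position + 1:])
    if changed in lexicon:
        # e.g. "cold" -> "gold" yields a real word, so it is not usable
        raise ValueError("the change produced a real word, not a nonword")
    return changed


if __name__ == "__main__":
    # /k/ pronounced as /g/ in the initial segment of both example words
    print(mispronounce(("K", "AH", "N", "T", "EY", "N"), 0, "G"))
    print(mispronounce(("K", "AA", "N", "T", "UH", "R"), 0, "G"))
```

The lexicon check reflects the one constraint the abstract states, namely that the altered form must be a nonword. Factors such as stress, position, and syntactic context would be properties of the chosen word and sentence rather than of this substitution step.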

J. Acoust. Soc. Am., Vol. 60, Suppl. No. 1, Fall 1976
