To cite this article: Jennifer M. Oates & Robert J. Kirkby (1979) Acoustic Investigations of
Abnormal Voice Quality — A Review, Australian Journal of Human Communication Disorders,
7:1, 4-15, DOI: 10.3109/asl2.1979.7.issue-1.03
Jennifer M. Oates
School of Communication Disorders
Lincoln Institute
Robert J. Kirkby
Department of Behavioural Sciences
Lincoln Institute
Downloaded by [University of Lethbridge] at 19:58 16 May 2016
The present paper was concerned with acoustic investigations of abnormal voice quality
in clinical patients and subjects trained to simulate abnormal voice qualities. A review
of studies relating to the acoustic components of hoarseness, harshness, the dysphonia
of subjects with various laryngeal pathologies, and simulated roughness indicated that
these conditions were characterised by very similar acoustic correlates. It appeared that
this lack of acoustic differentiation reflected a number of problems, including: the
semantic confusions of voice quality terminology; the failure of previous studies to
control for the type of laryngeal pathology underlying perceptual categories; and, a
lack of consistency in instrumentation, measurement techniques, and interpretation of
acoustic findings. The authors have suggested that these problems must be substantially
reduced in future studies in order that complete physiological and acoustic data for
voice quality abnormalities can be delineated, and an empirically-based terminology
for the perceptual description of these disorders developed.
In regard to the research aspect, acoustic studies could provide the speech
pathologist with further data concerning the acoustic and inferred physiological
components of normal voice, and the voices of patients with particular laryngeal
pathologies. A terminology could then be based on this objective data, which would
allow a more consistent description of the nature and severity of the quality disorder.
Until this physical data is delineated and an empirically-based terminology
developed, the possibility of inappropriate treatment planning and inaccurate
evaluation of a patient's progress remains as a potential problem for the speech pathologist.
In the present paper the authors have reviewed previous acoustic studies of abnormal
voice quality and presented suggestions which could substantially reduce problems of
these earlier investigations.
Sonagrams were made of the series of vowels using a narrow band filter. Sections
were taken at approximate mid-points of the vowels and four discriminable acoustic
patterns were delineated as follows:
Type I: Regular harmonic components were mixed with noise components, chiefly
in the formant regions.
Type II: Noise components in the second formants predominated over the
harmonics. Slight additional noise components were present in the high frequency
region above 3,000 Hz.
Type III: Second formants were completely replaced by noise components and the
additional noise above 3,000 Hz was further intensified and increased in range.
Type IV: Second formants were completely replaced by noise, and the first
formants lost periodic components. More intense high frequency noise was seen.
Yanagihara (1967a) concluded that the noise components and changes of
harmonic structure were positively correlated with the perceptual degree of
hoarseness. It was suggested that the noise may have originated in turbulent airflow, due to
incomplete glottal closure, or to irregular vibratory patterns. The loss of high
frequency harmonics was attributed to an incomplete or shorter closing phase of the
glottis. Yanagihara (1967a) pointed out that these two acoustic features were
probably not the only components of hoarseness; he noted that aperiodicity of
fundamental frequency could also be a relevant factor.
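
The four spectrographic types described above form an ordered scale of severity, and one way to make that ordering explicit is as a small decision rule. The sketch below is illustrative only: the yes/no features and the names `SectionFeatures` and `yanagihara_type` are assumptions introduced here, not part of Yanagihara's procedure, which relied on visual inspection of narrow band sections.

```python
from dataclasses import dataclass


@dataclass
class SectionFeatures:
    """Hypothetical yes/no readings from one narrow band spectrogram section."""
    noise_in_formants: bool      # noise mixed with harmonics in the formant regions
    f2_noise_dominant: bool      # noise predominates over harmonics in the second formant
    f2_replaced_by_noise: bool   # second formant completely replaced by noise
    f1_lost_periodicity: bool    # first formant has lost its periodic components


def yanagihara_type(f: SectionFeatures) -> str:
    """Return 'I' to 'IV', or 'none' if no noise is seen.

    The most severe pattern is tested first, so a section that meets
    several descriptions is assigned the highest type it satisfies.
    """
    if f.f2_replaced_by_noise and f.f1_lost_periodicity:
        return "IV"
    if f.f2_replaced_by_noise:
        return "III"
    if f.f2_noise_dominant:
        return "II"
    if f.noise_in_formants:
        return "I"
    return "none"
```

The intensity and range of the noise above 3,000 Hz, which also separates Types II to IV in the original description, is left out of this sketch because the source gives no numerical criteria for it.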
In a second and more comprehensive study, Yanagihara (1967b) investigated
results from spectrographic analysis of the five cardinal vowels, high speed
cinematographic analysis of glottal area function during phonation of /e/, and
measures of airflow rate during phonation of /a/. Ten adult subjects, seven males
and three females, participated in all three examinations. Each subject had a degree
of hoarseness due to some laryngeal pathology. A comparison of the spectrographic
findings with the results of glottal area wave analyses indicated that the degree of
abnormality in the acoustic findings was closely related to the extent of cycle-to-cycle
changes in the shape, amplitude, and periodicity of the glottal area waves. That is,
the subjects with extreme irregularity in glottal area waves tended to show more
changes in harmonic structure and more noise components in their spectrograms.
Consistent with his earlier finding, the results of Yanagihara's (1967b) study
indicated that the noise components were stronger than the harmonic components
when the hoarseness became more severe. This phenomenon began at higher
frequencies and extended into lower frequencies.
From the results of this study, Yanagihara (1967b) concluded that while minor
aperiodicity in the glottal area wave would affect the higher frequency components,
lower frequencies would also alter when aperiodicity increased. Hoarseness was thus
seen as a multidimensional disorder with perturbation of fundamental frequency,
changes in harmonic structure, and additional noise components as the acoustic
correlates. These cycle-to-cycle and overall changes in the voice spectrum had a
direct effect on the perceptual judgement of the severity of hoarseness.
Yanagihara's (1967a, 1967b) findings were consistent with those of an earlier and
less extensive study by Moore and Thomson (1965). In the Moore and Thomson
study, glottal changes were recorded by ultra high-speed cinematography in two
adult males, one with severe and one with moderate hoarseness. From the analysis of
motion pictures of sections of 25 consecutive cycles of vibration, it was found that
random variability in the opening and closing phases of vocal fold vibration was
more extreme for the severely hoarse subject. Oscillographic analysis of 5 to 15 sec.
segments of the vocalisations of the subject with severe hoarseness revealed that no
more than two consecutive waves were the same length.
In addition to aperiodicity of the fundamental frequency, Yanagihara (1967b)
found that cycle-to-cycle changes in the amplitude of the glottal area wave were
related to the degree of hoarseness. This supported Wendahl's (1966) observation
that the sensation of hoarseness may be caused by cycle-to-cycle amplitude variations
of the impulse. Wendahl (1966) labelled the hoarseness caused by this amplitude
change as "shimmer". Laryngeal analog synthesis has established a linear relationship
between shimmer and listener judgements of hoarseness, with as little as 1 dB of
shimmer between adjacent cycles being sufficient for subjective detection of roughness
in the signal (Wendahl, 1966). Wendahl (1963) had also suggested that hoarseness
could be due to aperiodic changes in the fundamental frequency from cycle-to-cycle,
in which case it was experienced as "jitter". In an earlier study, Wendahl
(1961) used an analog of the larynx to study the effect of these aperiodic wave
periods on judgement of hoarseness. He found that small amounts of aperiodicity, as
little as 1 Hz around a median frequency of 100 Hz, were detectable and that
deviations of ±10 Hz were judged as extremely rough. As Wendahl (1961) pointed
out, it was almost impossible to differentiate perceptually between hoarseness due to
jitter and hoarseness due to shimmer.
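
The two perturbation measures discussed above can be illustrated with a short calculation. The sketch below assumes that per-cycle fundamental frequencies and per-cycle peak amplitudes have already been extracted from the waveform; the function names and the choice of a mean absolute difference between adjacent cycles are assumptions made here for illustration, not the measures used in the studies reviewed.

```python
import math


def jitter_hz(f0_per_cycle):
    """Mean absolute cycle-to-cycle change in fundamental frequency, in Hz."""
    diffs = [abs(b - a) for a, b in zip(f0_per_cycle, f0_per_cycle[1:])]
    return sum(diffs) / len(diffs)


def shimmer_db(peak_amplitudes):
    """Mean absolute amplitude change between adjacent cycles, in dB."""
    steps = [abs(20.0 * math.log10(b / a))
             for a, b in zip(peak_amplitudes, peak_amplitudes[1:])]
    return sum(steps) / len(steps)


def period_ms(f0_hz):
    """Duration of one cycle in milliseconds for a given frequency in Hz."""
    return 1000.0 / f0_hz


# Made-up values: frequency wandering by 1 Hz around 100 Hz,
# and amplitude alternating by a factor corresponding to 1 dB.
f0 = [100, 101, 100, 99, 100]
amps = [1.0, 10 ** (1 / 20), 1.0, 10 ** (1 / 20)]

print(jitter_hz(f0))                      # mean cycle-to-cycle change in Hz
print(shimmer_db(amps))                   # mean adjacent-cycle change in dB
print(period_ms(100) - period_ms(101))    # period difference for a 1 Hz change at 100 Hz
```

The last line shows why such small deviations are demanding to measure: at 100 Hz a 1 Hz change alters the period by only about 0.1 ms.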
Other studies of the acoustic correlates of hoarseness were carried out by Iwata
and von Leden (1970) who used the Voice Print, a modification of the Sonograph,
which provided a contour display spectrogram. Five adult subjects with various
laryngeal pathologies, but all with hoarse voices, were required to sustain the vowel
/a:/ for several seconds at comfortable pitch and loudness levels. Voice prints with
a normal scale up to 8,000 Hz, and a print with an expanded scale up to 2,000 Hz
to show details of energy in the first formant region, were obtained. The voice prints
allowed a detailed spectral view of the amplitude characteristics in terms of frequency
and time. Iwata and von Leden (1970) found that the acoustic energy distribution in
the first formant was important in detecting laryngeal pathology. As the degree of
hoarseness increased, the formant bars widened and became irregular in frequency
and time. These changes were related to the increase of noise components and to the
fluctuation of fundamental frequency.
Two of the subjects in the Iwata and von Leden study had laryngeal neoplasms and
severe hoarseness. The voice prints for these subjects showed irregularity in the
acoustic energy for the entire frequency range, particularly in the lower frequencies.
Extensions of high energy distribution from the first formant coalesced with the
adjacent fundamental and second formant bars, with disintegration of the formant
pattern. However, the subjects with laryngeal paralysis presented different acoustic
results even though their voices were rated as hoarse. The first formant bar indicated
a much lower energy distribution with less overlap between adjacent formants.
According to Iwata and von Leden (1970) these results suggested that hoarseness
was not a single quality disorder, but a pathology comprised of discrete subgroups.
Lowered fundamental frequency has been cited often as an acoustic correlate of
hoarseness. In an investigation of fundamental frequency and hoarseness, Cooper
(1974) measured fundamental frequency from narrow band spectrograms of the
voices of adult patients with a variety of laryngeal pathologies. The severity of
hoarseness was judged using Yanagihara's (1967a) classification of noise in relation
to the harmonic structure of the voice wave. Cooper found that a lower fundamental
frequency was present with more severe hoarseness and a higher fundamental
frequency was associated with less hoarseness.
Shipp and Huntington (1965) investigated the voices of 26 adults with acute
laryngitic hoarseness. The results, obtained from oscillographic analysis of readings
of a sentence from a standard passage, indicated no significant differences between
the mean fundamental frequencies of normal and hoarse voices. This inconsistency
between the findings of Shipp and Huntington and those of Cooper may have arisen
through differences in the laryngeal pathologies underlying the voice quality disorders
investigated. Shipp and Huntington studied a specific pathology (acute laryngitic
hoarseness), whereas Cooper (1974) studied a variety of laryngeal pathologies; thus
the acoustic components of the voices corresponding to the different pathologies
would be expected to differ.
Summary
Harshness
"Harshness" is a perceptual classification of voice quality often regarded as
typically lower in pitch than the normal voice (Moore, 1971b; van Riper and Irwin,
1958). Michel and Hollien (1968) investigated differences in mean fundamental
frequency between the normal voice, vocal fry, and harshness. In their study, 10
adult males with normal voices read a phonetically balanced passage, once in their
normal voice, and then once using vocal fry. Ten adult males with voices judged
perceptually as harsh also recorded the passage. The fundamental frequencies of the
subjects' voices were acoustically determined by a fundamental frequency indicator
and a phonellogram. No significant difference in mean fundamental frequency was
found between the samples and Michel and Hollien (1968) suggested that fundamen-
tal frequency was not an appropriate parameter to differentiate normal and harsh
voices. However, these investigators noted that there was more variability among the
individual harsh fundamental ranges than among the vocal fry or normal voice
ranges. This suggested that the range of fundamental frequency variation was a
possible parameter to discriminate harsh from normal voices.
Bowler (1964) also investigated the fundamental frequency of perceptually
delineated harsh voices. Oscillograms were made of readings of a phonetically
balanced passage by 44 adult subjects. The recordings of all subjects were judged to
contain examples of harshness. It was found that the distribution of fundamental
frequency for the harsh portions was markedly skewed towards the lower end of the
distribution, whereas the distributions for the normal samples were relatively
symmetrical. These findings did not agree with Michel and Hollien's (1968) results,
possibly because of a difference in the method of voice sampling in the two studies.
Michel and Hollien's samples of harshness were obtained from subjects who had
habitual harsh voices, whereas Bowler's samples were isolated examples of harshness
from subjects whose voices were normal.
Bowler (1964) also found that the range of fundamental frequency means for the
harsh sections of the samples was twice that for the non-harsh sections. Abrupt
changes of the fundamental frequency, typically one octave in upward or downward
directions, occurred from wave to wave in the harsh samples. A downward break,
followed almost immediately by an upward break returning approximately to the
point at which the downward break began, was the most common pattern for the
harsh samples. Michel and Hollien's (1968) finding of more variability among the
fundamental frequency ranges of harsh voices was in accord with Bowler's (1964)
results. Zemlin (1968) also reported that harshness seemed to be related to
fundamental frequency differences between cycles of vocal fold vibration, in addition
to noise factors.
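
The octave breaks described above can be located in a sequence of per-cycle fundamental frequencies by looking for wave-to-wave ratios close to 2:1 or 1:2. The following is a minimal sketch under stated assumptions: the function name `octave_breaks`, the tolerance of 0.15 octave, and the sample values are all hypothetical, and Bowler's own analysis was made by hand from oscillograms.

```python
import math


def octave_breaks(f0_per_cycle, tolerance_octaves=0.15):
    """Return (index, direction) for each wave-to-wave jump of about one octave.

    index is the position of the cycle that follows the jump; direction is
    'up' or 'down'. tolerance_octaves is an assumed margin around exactly
    one octave, not a value taken from the literature.
    """
    breaks = []
    pairs = zip(f0_per_cycle, f0_per_cycle[1:])
    for i, (previous, current) in enumerate(pairs, start=1):
        step = math.log2(current / previous)  # signed jump size in octaves
        if abs(abs(step) - 1.0) <= tolerance_octaves:
            breaks.append((i, "up" if step > 0 else "down"))
    return breaks


# Made-up sequence: a downward break followed at once by an upward return,
# the pattern reported as most common in the harsh samples.
print(octave_breaks([120, 121, 60, 119, 120]))
```

Working in octaves (a base-2 logarithm of the frequency ratio) rather than in Hz keeps the rule independent of the speaker's habitual pitch.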
Summary
Harshness has been associated with abrupt cycle-to-cycle changes in fundamental
frequency and with noise factors. As with hoarseness, there have been conflicting
findings concerning fundamental frequency: for harsh voices, this variable has been
reported to be no different from the normal voice or to be lowered.
On the basis of the recorded amplitudes Koike suggested that, although his technique
might distinguish between laryngeal paralysis and neoplasms, considerably more
work needed to be carried out before a valid and reliable tool for general clinical use
could be developed.
Summary
The presence of various laryngeal pathologies, then, has been associated with the
acoustic parameters of cycle-to-cycle irregularity in the fundamental frequency,
weakness of high frequency harmonics, changes in amplitude modulation, and
harmonic and resonance discontinuity.
literature are consistent with Lively and Emanuel's (1970) explanation (see, for
example, Iwata and von Leden, 1970; Moore and Thomson, 1965; Wendahl, 1963,
1966). However, Yanagihara (1967a) has outlined two additions to this explanation.
Yanagihara hypothesised that noise components were due to turbulent airflow caused
by incomplete glottal closure and that loss of high frequency harmonics was the result
of incomplete or shorter closing phases of the glottis.
DISCUSSION
Although investigations of perceptual classifications of quality disorders have
involved studies of hoarseness, roughness, and harshness, as well as studies of the
general dysphonia of patients with different laryngeal pathologies, very similar
acoustic results have been found for all conditions. These common findings and their
physiological interpretations have been summarised in Table 1.
It appears that the small number of differences in acoustic findings that have been
reported probably reflect differences in the severity of a "general" quality disorder
rather than differences between various perceptual classifications. The differentiation
was established in studies of both hoarseness and roughness, where it was found that
the severity of perceived disturbance varied with changes in the intensity and range of
spectral noise. The lack of any further reports of differences across perceptual
categories could reflect a number of problems.
Firstly, the problem of nomenclature: the voice qualities which were classified
under different perceptual terms, such as hoarse and harsh, may in fact represent a
very similar vocal disorder (Lively and Emanuel, 1970). This again points to the
semantic confusions prevalent in this area and to the dangers of using perceptual
terms which are often ambiguous and inconsistent in meaning.
• Points 6 and 7 list as yet unverified proposals concerning the supraglottal tract resonance function and its
relation to voice quality abnormality.
Secondly, the problem related to a failure to control for the type of laryngeal
pathology underlying the perceptual category: the studies of Cooper (1974), Iwata
and von Leden (1970) and Yanagihara (1967a, 1967b) investigated subjects whose
voices were classified within a specific perceptual category, or merely as dysphonic,
but each sample comprised subjects with several different laryngeal conditions.
Differences which may have been detected if the sample involved a single patho-
logical condition may have been obscured through findings based on a blurred
collection of pathologies. Evidence has been provided by Koike (1969) and Iwata
and von Leden (1970), that differences can exist between different pathologies which
are assigned the same perceptual classification. These investigators found that
differentiation between laryngeal paralysis and neoplasms could be made on the basis
of intensity variations and the range of spectral noise, when both pathologies gave
rise to a voice quality labelled as "hoarseness".
Thus, it seems that acoustic studies of voice quality could be more sensitive if a
specific laryngeal pathology, rather than a perceptually classified group with several
underlying pathological conditions, was studied in detail.
Thirdly, there has been a lack of consistency in the type and use of instrumenta-
tion, and the measurement and interpretation of acoustic findings, across and within
studies. Because of these differences, and the possible error associated with some of
the measurement techniques used (see, for example, Blake, 1970; Fant, 1956;
Lindblom, 1962; Peterson and Barney, 1952) the accuracy of the earlier studies of
abnormal voice quality is questionable.
Clearly, this review indicates a need for speech pathologists to undertake studies
where these problems of nomenclature, mixing of laryngeal pathologies, instrumentation,
and measurement are substantially reduced (see, for example, Oates, 1978).
REFERENCES
MONCUR, J. P., BRACKETT, I. P. Modifying Vocal Behaviour, New York: Harper and Row
(1974).
MOORE, G. P. Voice disorders organically based. In Travis, L. E. (Ed.) Handbook of Speech
Pathology and Audiology, New York: Appleton-Century Crofts (1971).
MOORE, G. P., THOMSON, C. L. Comments on the physiology of hoarseness. Archives of
Otolaryngology, 81,97-102 (1965).
MOORE, G. P., VON LEDEN, H. Dynamic variation of the vibratory pattern in the normal
larynx. Folia Phoniatrica, 10, 205-238 (1958).
OATES, J. M. An Acoustic study of voice quality disorders in children with vocal nodules.
Masters Thesis: Lincoln Institute (1978).
PERELLO, J., TOSI, O. Phonogram. Folia Phoniatrica, 26 (4), 289-290 (1974).
PETERSON, G. E. Parameters of vowel quality. Journal of Speech and Hearing Research, 4,
10-29 (1961).
PETERSON, G. E., BARNEY, H. L. Control methods used in a study of the vowels. Journal of
the Acoustical Society of America, 24, 175-184 (1952).
POTTER, R. K., KOPP, G. A., GREEN, H. G. Visible Speech, New York: Van Nostrand
(1947).
REES, M. Some variables affecting perceived harshness. Journal of Speech and Hearing
Research 1,155-168 (1958).
SHERMAN, D., LINKE, E. The influence of certain vowel types on degree of harsh voice
quality. Journal of Speech and Hearing Disorders, 17,401-408 (1952).
SHIPP, T., HUNTINGTON, D. A. Some acoustic and perceptual factors in acute-laryngitic
hoarseness. Journal of Speech and Hearing Disorders, 30 (4), 350-359 (1965).
VAN RIPER, C., IRWIN, J. V. Voice and Articulation. Englewood Cliffs, N.J.: Prentice-Hall
Inc. (1958).
VON LEDEN, H., KOIKE, Y. Detection of laryngeal disease by computer technique. Archives
of Otolaryngology, 91 (1), 3-11 (1970).
WENDAHL, R. W. A photophonellographic analysis of hoarse voice quality. Proceedings 4th
International Congress of Phonetic Sciences, Helsinki: 307-310 (1961).
WENDAHL, R. W. Laryngeal analog synthesis of harsh voice quality. Folia Phoniatrica, 15,
241 (1963).
WENDAHL, R. W. Some parameters of auditory roughness. Folia Phoniatrica, 18, 26 (1966).
YANAGIHARA, N. Significance of harmonic changes and noise components in hoarseness.
Journal of Speech and Hearing Research, 10, 531-541 (1967a).
YANAGIHARA, N. Hoarseness: Investigation of the physiological mechanisms. Annals of
Otology, Rhinology and Laryngology, 16,472-488 (1967b).
ZEMLIN, W. R. Speech and Hearing Science-Anatomy and Physiology, New Jersey: Prentice-
Hall Inc. (1968).