
Human Speech: A Restricted Use of the Mammalian Larynx
*,†Ingo R. Titze, *Salt Lake City, Utah, and †Iowa City, Iowa

Summary: Purpose. Speech has been hailed as unique to human evolution. Although the inventory of distinct sounds producible with vocal tract articulators is a great advantage in human oral communication, it is argued here that the larynx as a sound source in speech is limited in its range and capability because a low fundamental frequency is ideal for phonemic intelligibility and source-filter independence.
Method. Four existing data sets were combined to make an argument regarding exclusive use of the larynx for speech: (1) range of fundamental frequency, (2) laryngeal muscle activation, (3) vocal fold length in relation to sarcomere length of the major laryngeal muscles, and (4) vocal fold morphological development.
Results. Limited data support the notion that speech tends to produce a contracture of the larynx. The morphological design of the human vocal folds, like that of primates and other mammals, appears to be optimized for vocal communication over distances for which higher fundamental frequency, higher intensity, and fewer unvoiced segments are used.
Conclusion. The positive message is that raising one’s voice to call, shout, or sing, or executing pitch glides to stretch the vocal folds, can counteract this trend toward a contracted state.
Key Words: speech–larynx–contracture–singing–muscles.

INTRODUCTION
Mammals, birds, amphibians, and reptiles communicate with sound produced in respiratory airways, using either a larynx or a syrinx (birds) as a sound source. Sonic, infrasonic, or ultrasonic vibration is sustainable by nonlinear interaction between the airstream and a collapsible (soft tissue) segment of the airway wall, producing a fundamental frequency (fo) and a spectrum of higher frequencies. The vibrating tissue disturbs the airstream so that acoustic waves are propagated in the airways. These waves travel along the slowly moving airstream, with a small portion of the sound being radiated from the mouth, beak, or nostrils into free space to the listener. A much larger portion of the sound is retained in the airway in the form of multiple reflections from irregular boundaries, producing standing waves that form distinct classes of sound. Amplitude and frequency modulations of the fo and higher partials are added to allow rhythmic and melodic patterns to be produced.
Communication strategies vary across species.1 A single tone with constant fo and amplitude, considered the carrier of vocal communication, reveals the presence and location of an animal, and possibly its size and some degree of identity. Modulations of this carrier are needed to create a sufficient inventory of sounds for all of the animal’s communication needs. In analogy with radio communication theory, the carrier is of a much higher frequency than the modulations, which are in the form of variations in amplitude, fundamental frequency, duration, or frequency spectrum. Ideally, the modulations are not so large that the carrier (voicing) is interrupted. For short-distance communication, however, unvoiced segments have become part of the sound inventory. Prosodic variations, such as melody and accent, do not need to interrupt the carrier and are therefore good candidates for long-range vocal communication. They are used by most species that vocalize. Range of frequency and amplitude, speed of change of frequency and amplitude, and frequency spectrum of the overtones of the carrier frequency, become important criteria for a large inventory of vocal signals that a species can produce.
With the evolution of speech on the order of 100,000 years ago,2 humans discovered that a larger inventory of modulations could be produced by changing the airway structures rather than simply the sound-source characteristics. Moving the tongue, the lips, the jaw, or the velum allowed the frequency spectrum of the carrier overtones to be modulated amply to produce vowels and consonants. Articulation became the dominant modulation in vocal communication, so much so that whispered speech, or speech with a buzzer held against the neck, is viable today for close-range communication. Source modulations are of secondary importance. The invention of electronic amplification makes source modulations with wide frequency and amplitude ranges even less essential for speech.
The purpose of this paper is to put forth an argument, with existing data never presented in combination, that adaptation of the mammalian larynx for long-range unamplified vocal communication could eventually be reversed with excessive or exclusive use of speech over short distances. Calling, with its many variations (howling, shouting, hooting, roaring, screaming, chanting), is practiced so little that a predisposition to motor control problems in the larynx may exist. Furthermore, infrequent mechanical stretching of laryngeal tissues may diminish the need for a multilayered vocal fold morphology. Fragmentary evidence is given here in the nature of acoustic requirements, vocal fold morphology, vocal fold posturing for speech, and approaches used for voice therapy and training. The methods are in the form of corroborating data sets produced over several decades.

Accepted for publication June 13, 2016.
From the *National Center for Voice and Speech, The University of Utah, Lead Institution, Salt Lake City, Utah; and the †Department of Communication Sciences and Disorders, The University of Iowa, Iowa City, Iowa.
National Center for Voice and Speech, The University of Utah, 136 South Main Street, Suite 320, Salt Lake City, UT 84101-3306. E-mail: ingo.titze@ncvs2.org
Journal of Voice, Vol. 31, No. 2, pp. 135–141
0892-1997
© 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
http://dx.doi.org/10.1016/j.jvoice.2016.06.003
136 Journal of Voice, Vol. 31, No. 2, 2017

METHODS
Acoustic requirements for speech
The ideal sound carrier for speech articulation has a low fo. A logical argument is that intelligibility of phonemes is improved if source harmonics are closely spaced, given that vocal tract resonances can then be sampled with a greater number of frequencies. With identical formant structure, male speech should by this argument be more intelligible than female speech. This has not been reported, however. In fact, the evidence is somewhat in the opposite direction.3,4 The answer may lie in the fact that formant frequencies are higher in women than in men, which tends to equalize the sampling issue. At female fundamental frequencies beyond those typically used in speech (ie, singing), it has been shown that intelligibility is indeed reduced.5–7 There is poorer sampling of the resonances of the vocal tract. For example, because most vowels are identified by the first two resonant frequencies of the supraglottal airway (the exception being rhotic vowels), and because these frequencies center around 500 Hz and 1500 Hz, respectively, a fo that is not well below 500 Hz provides too few harmonics to sample the resonance peaks effectively. The same can be said for voiced consonants, which also have low resonance frequencies. Thus, there is a strong tendency for humans to keep fo low during speech. Unvoiced consonants use secondary sound sources with greater spectral density (hisses, clicks, and pops), but these sounds are not carried over distances more than a few meters. They are generally not effective for long-range vocal communication unless amplification is used.
Figure 1 shows the range of sound pressure level (SPL) plotted against range of fo, known as a voice range profile. The curves labeled loud and soft show the nonspeech range of SPL obtainable 30 cm from the mouth over the entire fo range of 10 men.9,10 Superimposed are four speech contours from male classroom teachers who speak daily on the order of 6 hours.8 Note that 70% of the speech fo falls below 200 Hz, whereas fundamental frequencies beyond 500 Hz are available from the male larynx with equal or greater SPL variation.
Aside from the requirement of low fo for vowel and voiced consonant intelligibility, strong nonlinear coupling between the source and the filter is avoided when fo is low. This creates a stability region for source harmonics in speech. It has been shown that low harmonics of the source spectrum (in particular fo or two fo) passing through airway resonances can destabilize vocal fold vibration.11 Avoiding these crossings keeps the source spectrum more constant (less registered in voice science terminology, or less bifurcated in nonlinear dynamics terminology) so that modulation with articulation is not confused with spectral uncertainty at the source. Hence, a linear source-filter theory of speech production that separates the sound source from the resonator has been the hallmark of speech science and technology for more than half a century, notably based on the work of Fant.12
A current trend in speech is to use vocal fry, also known as pulse register.13,14 A subharmonic series (period doubling or tripling) is often produced, which increases the density of source frequencies in the spectrum. This would appear to increase speech intelligibility if the source were to remain stable. Vowels and consonants are extremely well sampled, but overall vocal intensity is low and the voice quality is rough. With amplification or small speaker-listener distances, low intensity does not seem to be an issue. However, prosodic variations (ie, intonation that takes the voice into and out of fry without linguistic intent) appear to be more difficult to produce. Without ample prosodic variation to communicate mood, personality, or other paralinguistic characteristics, speech can degenerate to vocal texting.
Before acquisition of the major articulatory components of speech, human infants vocalize at high fo (around 400–500 Hz). Mothers encourage high fo vocalization with “motherese” that modulates the high fo carrier with lots of prosodic variation.15 The fo steadily drops in the first 3 years (to around 300 Hz) as more and more articulation develops.16 Calling, shouting, laughing, and crying continue toward puberty but are often suppressed thereafter for social reasons. Even singing, which in some cultures is a daily family or school activity, is becoming less habilitated with electronic amplification that can extend the acoustic dimensions of the voice artificially. People who sing appear to retain physiologically younger voices into advanced age. Their voices are louder and their fo is categorically higher.17
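The harmonic-sampling argument made earlier in this section can be illustrated with a small numerical sketch. It counts how many source harmonics fall in the band that contains the first two vocal tract resonances, and how far the nearest harmonic lies from each resonance. The resonance values of 500 Hz and 1500 Hz come from the text; the 2000 Hz band limit, the fo values tried, and the function names are assumptions chosen only for illustration.

```python
def harmonics_up_to(fo_hz, limit_hz):
    """Return the source harmonics k*fo (k = 1, 2, ...) at or below limit_hz."""
    if fo_hz <= 0:
        raise ValueError("fo_hz must be positive")
    return [k * fo_hz for k in range(1, limit_hz // fo_hz + 1)]


def nearest_harmonic_miss(fo_hz, resonance_hz):
    """Return the distance in Hz from a resonance to the closest harmonic of fo."""
    if fo_hz <= 0:
        raise ValueError("fo_hz must be positive")
    top = resonance_hz // fo_hz + 1  # one harmonic above the resonance is enough
    return min(abs(k * fo_hz - resonance_hz) for k in range(1, top + 1))


RESONANCES_HZ = (500, 1500)  # typical first two resonances cited in the text
BAND_LIMIT_HZ = 2000         # assumed upper edge of the band of interest

if __name__ == "__main__":
    for fo in (100, 200, 400, 700):  # illustrative integer fo values in Hz
        count = len(harmonics_up_to(fo, BAND_LIMIT_HZ))
        misses = [nearest_harmonic_miss(fo, r) for r in RESONANCES_HZ]
        print(f"fo={fo} Hz: {count} harmonics at or below {BAND_LIMIT_HZ} Hz, "
              f"distance to nearest harmonic at each resonance: {misses} Hz")
```

With these assumed values, a 100 Hz source places 20 harmonics in the band and lands exactly on both resonances, whereas a 700 Hz source places only two harmonics in the band and misses the first resonance by 200 Hz. This is a counting exercise only; it does not model perception or the nonlinear source-filter coupling discussed above.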

FIGURE 1. Voice range profile with superimposed speech range contours of school teachers (after Hunter and Titze8).

Vocal fold morphology
Mammalian vocal folds (or vocal cords) are generally multilayered in their tissue construct (Figure 2). An epithelium (skin) encapsulates a soft-tissue structure known as the lamina propria, which in itself can have one, two, or three compartments: a superficial layer, an intermediate layer, and a deep layer.19 These layers consist of a matrix of elastin and collagen fibers with interstitial fluids (proteoglycans and glycoproteins).20 Lateral to the deep layer, and firmly attached to it, is the thyroarytenoid (TA) muscle.18
The design is functional in that the superficial layer (under the skin) needs to be pliable, like a gel, to support surface modes of vibration.21 These surface modes allow energy to be transferred from the airstream to the tissue. An alternating convergent
Ingo R. Titze Human Speech: Restricted Use of Mammalian Larynx 137

and divergent glottal duct is produced between the vocal folds in every cycle of vibration with these surface modes, such that a push-pull action is exerted on the vocal fold surface by the airstream.21 Deformation of the superficial layer is a requirement for the alternating shape and its resulting push-pull action. Thus, a single gel-like superficial layer underneath the epithelium is a minimum requirement for oscillation. Human infants begin with this morphology, that is, a thick single-layer mucosa.19 The deeper layers develop around ages 3–4, but a full adult-like lamination may not be achieved until the age of puberty.22

FIGURE 2. Layered structure of human vocal folds (after Hirano18).

The intermediate and deep layers of the lamina propria form a ligament, with dense collagen and elastin fibers aligned in parallel, like a string. The alignment and strength of the fibers are functionally driven. It has been shown that children whose vocal folds were never vibrated, owing to cerebral palsy, did not develop a vocal ligament.23 Likewise, a human adult who never phonated past 64 years of age owing to a cerebral hemorrhage lost the ligament architecture. The vocal fold morphology returned to its infant-like state, with a uniform rather than a layered structure.24 From these studies, it is reasoned that evolution has provided the cell structure (stellate cells at the end points of the vocal folds) for multiple layer development, but tissue stress from vocalization is needed to build and maintain the structure.
A major benefit of multiple layers is the possibility of a large fo range. A homogeneous (nonlayered) mucosa cannot be stiffened to produce a wide fo range while at the same time maintaining the surface pliability requirement for energy transfer by surface waves, which sustains oscillation. A fiber-gel structure is ideal for the combined requirement of string-like tension and surface pliability. As the vocal folds grow in length, however, fo drops because the natural frequencies in the fibers are lowered. This lowering of natural frequencies with laryngeal growth can be countered only by a nonlinear stress-strain relation in the tissue fibers,25 or by active contraction of muscle fibers. For a moderate range of fo, a two-layer system (mucosa and muscle) suffices. TA muscle fibers can stiffen to provide a workable range of fo. Such an architecture has been developed by some domestic canines, who bark in a low to medium fo range.
A much wider pitch range, however, is achieved with development of a vocal ligament in a trilayer system (mucosa, ligament, muscle). Stiffness in muscle fibers can then gradually be shifted to stiffness in ligament fibers as vocal fold length is increased with cricothyroid (CT) muscle activation. The human vocal folds evolved and developed into a trilayer system, presumably for calling over long distances, combat shouting, screaming, imitating infants or animal sounds, or chanting and singing. Pigs have a ligament for squealing, Rocky Mountain elk for extremely high-pitched “bugling,” and chimpanzees for screaming and pant-hooting. For humans who do not vocalize out of the context of speech, however, the vocal ligament may become superfluous. Surgeons have found that removal of the ligament (known as a cordectomy) may not have a profound effect on conversational speech.26,27 Caution should be used, however, in applying this perspective to current surgical management of the vocal ligament. Successful voice outcomes from a few subligamentous cordectomies could be based on secondary (compensatory) effects. At this stage, therefore, it is risky to consider the vocal ligament superfluous in reconstructive surgery for speech. For singing and calling, on the other hand, one can say categorically that ligament removal would likely result in a dramatic impairment.

Physiology of posturing with laryngeal muscles
Muscles that are used for posturing usually work as an agonist-antagonist pair. One muscle adducts while the other abducts, or one elongates while the other shortens. For optimal strength and control, it is desirable for both muscles to be operating near their sarcomere length (the length at which the actin-myosin overlap produces the maximum contractile force). The in situ resting length is often near the sarcomere length.28,29 In this resting position, between two boundary attachments, the muscle is stretched slightly by end point tendons. From the sarcomere length, the muscle can be shortened or elongated in comparable amounts.
In speech, however, there is a tendency to perpetually shorten the vocal folds to keep fo low. The principle of least effort in speech production30–33 may not be observed for the larynx if articulation is of higher priority than range of fo. The process of adduction of the vocal processes of the arytenoid cartilages is a rocking action on top of the cricoid cartilage34 that moves the processes ventrally, caudally, and medially (forward, downward, and inward). This shortens the vocal folds 10%–30% from the neutral resting length for a range of low (speech-like) fundamental frequencies.35 Although the vocal folds can also be elongated about 30% from the resting length to tense the tissue fibers, this lengthening rarely happens in speech.
Figure 3 shows a muscle activation plot for fo change produced by two intrinsic muscles of the larynx, the CT and the TA. Electromyography was conducted on a male subject with hooked-wire electrodes.36 Six fundamental frequencies were elicited with multiple tokens of vowel phonations. The muscle activities, ranging between 0.0 and 1.0, were normalized to maximum values of contraction by calibration maneuvers (a high

squeal for maximum CT and a forceful adduction [as in heavy lifting] for maximum TA). Note that typical male speech fundamental frequencies (98 Hz and 131 Hz) are all in the lower left quadrant of the muscle activation plot. The subject used mainly TA activity to increase fo in this region, moving from the lowest contour line to the second from the bottom. This range of fo agrees with the speech range shown in Figure 1. It also agrees with small vocal fold lengths (lower third of the range reported by Nishizawa et al35) for corresponding frequencies.

FIGURE 3. Muscle activation plot (MAP) for a human male subject (after Titze et al36).

The stress in the tissue fibers of the TA muscle can be estimated from the frequency of a string under tension, fixed at both ends,

$$f_o = \frac{1}{2L}\sqrt{\frac{\sigma}{\rho}} \qquad (1)$$

For a vocal fold length L of 1.0 cm, a fo of 150 Hz, and a density ρ of 1.04 g/cm³, the stress σ in the muscle fibers is predicted to be 9 kPa. The exact range of stress in human muscle fibers is not known, but it has been measured in vitro in canine TA muscle to be 0–100 kPa.37 Assuming similar muscle fibers for human and canine TA muscles, the required stress for speech in humans would be less than 10% of the maximum active stress achievable at sarcomere length. Alipour-Haghighi and Titze37 showed that the maximum stress occurs at 20%–30% elongation of the muscle, rather than at shortening.
Chhetri et al38 used anesthetized canines (larynx size comparable with humans) to stimulate the motor nerves of the five intrinsic laryngeal muscles (CT, TA, lateral cricoarytenoid [LCA], posterior cricoarytenoid [PCA], and interarytenoid). The vocal fold was shortened by 2% at 200 Hz phonation and 12% at 100 Hz phonation. This study, together with the study of Alipour-Haghighi and Titze37 on the same species, suggests that the TA muscle in humans may also be biased far from its sarcomere length in the 100–200 Hz speech fo range. The TA muscle needs to be opposed by the CT muscle to stretch to sarcomere length. Both canines and humans have a strong CT muscle. Howling by dogs and wolves may accomplish this lengthening if the CT muscle stretches the TA muscle passively. Some head lifting with neck stretching may help in lengthening the vocal folds for howling. Humans can rely on the CT muscle to stretch both the TA muscle fibers and the vocal ligament, but only with high-pitched phonations like singing and calling, yet these vocalizations are not generally part of speech.
A secondary fo-raising mechanism involves lung pressure. Experiments on excised canine larynges39 have shown that a range of 0–2.5 kPa of lung pressure can change fo by 100–200 Hz, depending on the vocal fold length (Figure 4). This change is attributed to a dynamic stretch increase in the TA fibers with increasing vibrational amplitude. Thus, canines can reach their maximum barking fo, and perhaps sarcomere length, with raised lung pressure. For human speech, a two-layer system (without a ligament) would also appear to be sufficient if “loud” phonation were practiced, as in calling, but calling is practiced mainly at construction sites, at sports events, or in playgrounds. Loud speech, as in noisy restaurants or moving vehicles, is not an effective exercise because it retains the articulatory constraints mentioned earlier.

FIGURE 4. Increase in fo achievable with increased lung pressure. Data are from canine larynges (after Titze39).

To achieve the full physiological fo range shown in Figures 1 and 3, a vocal ligament is available to humans. Collagen fiber stress can be much higher than the active or passive stress in the TA muscle, but it requires substantial vocal fold lengthening (rather than shortening). Figure 5 shows the stress-strain curves for three layers of vocal fold tissue: mucosa, ligament, and 10% contracted TA muscle. Min et al41 measured the ligament stress in humans. For negative strain, the ligament tissue stress is similar to the mucosal layer stress. The activated TA muscle alone can control fo, as discussed above. However, if the CT muscle is activated enough to lengthen the vocal folds to the point where the ligament curve crosses the TA curve, the TA can operate near the sarcomere length (strain near 0.2–0.3, or 20%–30%) and the ligament is stretched as well. This requires fo to be in the 300–400 Hz range for men (near the top of the voice range profile in Figure 1).
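The worked numbers given with Equation 1 (L = 1.0 cm, fo = 150 Hz, tissue density 1.04 g/cm³, which is 1040 kg/m³ in SI units) can be checked with a few lines of code. The sketch below inverts the ideal-string relation to obtain the fiber stress and compares it with the 100 kPa upper value reported for canine TA muscle. The function names are hypothetical, and the ideal-string model is the simplification used in the text, not a full tissue model.

```python
import math


def string_fo(length_m, stress_pa, density_kg_m3):
    """Fundamental frequency of an ideal string fixed at both ends (Equation 1)."""
    return math.sqrt(stress_pa / density_kg_m3) / (2.0 * length_m)


def fiber_stress(length_m, fo_hz, density_kg_m3):
    """Invert Equation 1: stress = density * (2 * L * fo)^2, in pascals."""
    return density_kg_m3 * (2.0 * length_m * fo_hz) ** 2


LENGTH_M = 0.01            # vocal fold length of 1.0 cm, from the text
FO_HZ = 150.0              # speech-like fo, from the text
DENSITY_KG_M3 = 1040.0     # 1.04 g/cm^3 converted to SI units
MAX_STRESS_PA = 100e3      # upper value measured in canine TA muscle, from the text

if __name__ == "__main__":
    stress = fiber_stress(LENGTH_M, FO_HZ, DENSITY_KG_M3)
    print(f"predicted fiber stress: {stress / 1000:.1f} kPa")   # roughly 9.4 kPa
    print(f"fraction of maximum:    {stress / MAX_STRESS_PA:.1%}")
    # Round trip: feeding the stress back into Equation 1 recovers fo.
    print(f"recovered fo:           {string_fo(LENGTH_M, stress, DENSITY_KG_M3):.1f} Hz")
```

The result, a little over 9 kPa, is consistent with the rounded value quoted in the text and stays below 10% of the 100 kPa maximum, provided the assumption of similar human and canine muscle fibers holds.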

FIGURE 5. Stress-strain curves for human vocal ligament and canine mucosa and muscle tissue layers (after Min et al41 and Alipour-Haghighi and Titze37,40).

A good exercise for the laryngeal musculature, and perhaps for the entire framework of the larynx, is to stretch the vocal folds often by raising fo above speech levels. In human phonation today, calling over long distances for identity and location, as well as uttering intense vocal expression for emotion, tends to be minimized. Singing appears to be one of a few viable modalities for exercising the CT/TA muscle pair. As cited earlier, Brown et al17 showed that singers maintained a higher fo than nonsingers over their entire lifespan.
A second postural issue in speech is frequent and aggressive adduction of the vocal folds. This again involves an imbalance in the use of agonist-antagonist muscles, in this case the LCA/TA combination for adduction and the PCA for abduction. It has been shown that adduction from the neutral position and return to the neutral position occur on the order of 10,000 times per day in classroom teachers.42 This adduction, often quite forceful in speech if a syllable needs to be stressed, or if the teacher needs to increase vocal effort in the presence of classroom noise, involves mostly TA and LCA activation, leaving the antagonist PCA dormant. Rapid inhalation or sniffing, which would engage the PCA, is a relatively infrequent laryngeal activity. At high fo, however, when the vocal folds need to be stretched, PCA activation helps to stabilize the position of the arytenoid cartilages on the cricoid cartilage, thereby contributing to the vocal fold stretch produced by CT activity. Thus, the CT/PCA combination may need to be activated often to oppose the LCA/TA combination if the entire musculature is to make use of its physiologic range. But again, if fo is kept perpetually and chronically low, neither CT nor PCA will ever be well exercised.

DISCUSSION ON VOICE THERAPY AND VOCAL TRAINING
Traditionally, voice disorders have been broadly characterized as either organic or functional. Organic disorders have an identified lesion, whereas functional disorders exhibit voice limitations with a visually intact system. Spasmodic dysphonia, paradoxical vocal fold movement, muscle tension dysphonia, and the laryngeal component of stuttering show no structural disorder in the larynx, but the muscular system produces abnormal vocal fold posturing and movement. There is as yet no direct link of this abnormal posturing or movement to a condition one might call laryngeal muscle contracture, but many therapy techniques for functional voice disorder embrace the idea of stretching the vocal folds with fo glides,43,44 reducing forceful adduction of the vocal folds,45,46 and stretching of muscles and connective tissue by laryngeal massage.47 Muscle contracture in the limbs has been reported as a result of agonist-antagonist imbalance (eg, paralysis of one or the other) or as a result of abnormal flexing or shortening over extended periods of time. The general impression by clinicians is that vocal fold movement in functional disorders is restricted, locked up, or even paradoxical. Laryngeal massage, chanting, and singing are generally considered to be “unlocking” activities. Patients with Parkinson disease have overcome vocal limitations with therapy geared toward louder, more energetic voice production.48 Severely symptomatic spasmodic dysphonia patients can often sing with ease. It is not yet known to what extent laryngeal muscle contracture can be reversed (temporarily or permanently) by therapy.
In addition to these motor control issues, it is known that mechanical stress causes a remodeling of soft tissue. Fibrocytes respond to stress by replenishing structural proteins. If fibers are not frequently stretched, they will weaken. Exercises known as semi-occluded vocal tract exercises or vocal function exercises (nonspeech exercises) have been shown to accomplish simultaneous stretching and unpressing of the vocal folds (Stemple et al,43 Titze,49 Kapsner-Smith et al50), thereby alleviating vocal fatigue.
Clinicians are not totally in agreement about the context in which vocal habilitation and rehabilitation should take place.

General physical exercise principles are specificity (building specific muscles for the context in which they will be used) and overload (exercising muscles slightly beyond their normal limits for a short period of time). Training in the context of speech, however, is not likely to make ideal use of these principles. As has been shown, low fo will not elicit an overload on the CT muscle. However, if extreme fo glides, vocal fold adductory glides, or long sustained phonation are part of the habilitative process, some training within the context of speech is defensible and perhaps ideal. Given the restricted range of muscle use in speech, however, nonspeech exercises alone may accomplish the task effectively, with the hope that the results will transfer to speech. Current use of semi-occluded vocal tract exercises or vocal function exercises for laryngeal muscle training is becoming widespread as a means of “taking the voice to the gym” several times a day for a few minutes.

CONCLUSION
Using small and fragmentary data sets, arguments have been presented that current trends in the use of the human larynx for speech run counter to its fundamental design. Low fo, low intensity, and frequent adduction for accent in speech prevent laryngeal muscles from being exercised with the appropriate ranges of motion and optimal lengthening. A very recent cross-species study51 has demonstrated that two factors account for a wide fo range: the achievable fiber stress in the vocal ligament and the ability to elongate the vocal folds. Here it has been suggested that voice disorders such as muscle tension dysphonia, spasmodic dysphonia, or other larynx-based dysfluencies in speech may be linked to the mismatch between mammalian design and contemporary use of the larynx. It would appear, however, that regular exercise out of the context of speech could have a health benefit. For some people, this may parallel the need to exercise the spine and postural muscles of the body to overcome the effects of excessive bending, sitting, or otherwise distorting natural equilibria. Vocology, the science and practice of voice habilitation, is a new field that addresses these challenges.

Acknowledgement
Support for this research comes from grant number 5 R01DC012045-04 by the National Institute on Deafness and Other Communication Disorders.

REFERENCES
1. Taylor AM, Reby D. The contribution of source-filter theory to mammal vocal communication research. J Zool. 2009;280:221–236.
2. Lieberman P. Uniquely Human: The Evolution of Speech, Thought, and Selfless Behavior. Cambridge, MA: Harvard University Press; 1991.
3. Byrd D. Relation of sex and dialect to reduction. Speech Commun. 1994;15:39–54.
4. Bradlow AR, Toretta GM, Pisoni DB. Intelligibility of normal speech I: global and fine-grained acoustic-phonetic talker characteristics. Speech Commun. 1997;20:255–272.
5. Joliveau E, Smith J, Wolfe J. Tuning of vocal tract resonances by sopranos. Nature. 2004;427:p116.
6. Collister LB, Huron D. Comparison of word intelligibility in spoken and sung phrases. Empir Musicol Rev. 2008;3:109–125.
7. Sundberg J, Ternstrom S. Commentary on “Comparison of word intelligibility in spoken and sung phrases” by Lauren Collister and David Huron. Empir Musicol Rev. 2008;3:215–217.
8. Hunter EJ, Titze IR. Variation of intensity, fundamental frequency, and voicing for teachers in occupational versus non-occupational settings. J Speech Lang Hear Res. 2010;53:862–875.
9. Gramming P. The Phonetogram: An Experimental and Clinical Study. Malmo, Sweden: Malmo General Hospital; 1988.
10. Gramming P, Sundberg J. Spectrum factors relevant to phonetogram measurement. J Acoust Soc Am. 1989;83:2352–2360.
11. Titze IR, Riede T, Popolo P. Nonlinear source-filter coupling in phonation: vocal exercises. J Acoust Soc Am. 2008;123:1902–1915.
12. Fant G. The Acoustic Theory of Speech Production. The Hague: Moulton; 1960.
13. Wolk L, Abdelli-Beruh NB, Slavin D. Habitual use of vocal fry in young adult female speakers. J Voice. 2012;26:e111–e116.
14. Abdelli-Beruh NB, Wolk L, Slavin D. Prevalence of vocal fry in young adult male American English speakers. J Voice. 2014;28:185–190.
15. Kuhl PK. A new view of language acquisition. Proc Natl Acad Sci. 2000;97:11850–11857.
16. Kent RD. Anatomical and neuromuscular maturation of the speech mechanism evidence from acoustic studies. Am J Anat. 1976;151:11–20.
17. Brown W, Morris RJ, Hollien H, et al. Speaking fundamental frequency characteristics as a function of age and professional singing. J Voice. 1991;5:310–315.
18. Hirano M. Phonosurgery: basic and clinical investigations. Otologia (Fukuoka). 1975;21:239–440.
19. Hirano M, Sato K. Histological Color Atlas of the Human Larynx. San Diego: Singular Publishing Group, Inc; 1993.
20. Gray S, Titze IR, Chan R, et al. Vocal fold proteoglycans and their influence on biomechanics. Laryngoscope. 1999;109:845–854.
21. Titze IR. The physics of small-amplitude oscillation of the vocal folds. J Acoust Soc Am. 1988;83:1536–1552.
22. Sato K, Nakashima T, Nonaka S, et al. Histopathologic investigations of the unphonated human vocal fold mucosa. Acta Otolaryngol. 2008;128:694–701.
23. Sato K, Umeno H, Nakashima T, et al. Histopathologic investigations of the unphonated human child vocal fold mucosa. J Voice. 2012;26:37–43.
24. Sato K, Umeno H, Ono T, et al. Histopathologic study of human vocal fold mucosa unphonated over a decade. Acta Otolaryngol. 2011;131:1319–1325.
25. Titze IR. Principles of Voice Production. Denver, CO: National Center for Voice and Speech; 2000.
26. Hillel AT, Johns MM III, Hapner ER, et al. Voice outcomes from subligamentous cordectomy for early glottic cancer. Ann Otol Rhinol Laryngol. 2013;122:190–196.
27. Mau T, Palaparthi A, Riede T, et al. Effect of resection depth of early glottic cancer on vocal outcome: an optimized finite element simulation. Laryngoscope. 2015;125:1892–1899.
28. Rack PM, Westbury DR. The effects of length and stimulus rate on tension in the isometric cat soleus muscle. J Physiol. 1969;204:443–460.
29. Vaz MA, de la Rocha FC, Leonard T, et al. The force-length relationship of the cat soleus muscle. Muscles Ligaments Tendons J. 2012;2:79–84.
30. Traunmuller H, Eriksson A. The frequency range of the voice fundamental in the speech of male and female adults. 1995. Unpublished Manuscript.
31. Cielo CA, Elias VS, Brum DM, et al. Thyroarytenoid muscle and vocal fry: a literature review. Rev. soc. bras. fonoaudiol. 2010;16:362–369.
32. Holmberg EB, Perkell JS, Hillman RE, et al. Individual variation in measures of voice. Phonetica. 1994;51:30–37.
33. Lindblom B. Emergent phonology. In Proceedings of the 25th Annual Meeting of the Berkeley Linguistics Society, 2000. pp. 195–209.
34. Selbie WS, Zhang L, Levine WS, et al. Using joint geometry to determine the motion of the cricoarytenoid joint. J Acoust Soc Am. 1998;103:1115–1127.
35. Nishizawa N, Sawashima M, Yonemoto K. Vocal fold length in vocal pitch change. In: Fujimura O, ed. Vocal Fold Physiology: Voice Production, Mechanisms and Functions. New York, NY: Raven Press; 1988:75–82.
36. Titze IR, Luschei ES, Hirano M. Role of the thyroarytenoid muscle in regulation of fundamental frequency. J Voice. 1989;3:213–224.

37. Alipour-Haghighi F, Titze IR. Viscoelastic modeling of canine vocalis muscle in relaxation. J Acoust Soc Am. 1985;78:1939–1943.
38. Chhetri DK, Neubauer J, Berry DA. Neuromuscular control of fundamental frequency and glottal posture at phonation onset. J Acoust Soc Am. 2012;131:1401–1412.
39. Titze IR. On the relation between subglottal pressure and fundamental frequency in phonation. J Acoust Soc Am. 1989;85:901–906.
40. Alipour-Haghighi F, Titze IR. Elastic models of vocal fold tissues. J Acoust Soc Am. 1991;90:1326–1331.
41. Min YB, Titze IR, Alipour-Haghighi F. Stress-strain response of the human vocal ligament. Ann Otol Rhinol Laryngol. 1995;104:563–569.
42. Titze IR, Hunter EJ, Švec JG. Voicing and silence periods in daily and weekly vocalizations of teachers. J Acoust Soc Am. 2007;121:469–478.
43. Stemple JC, Lee L, D’Amico B, et al. Efficacy of vocal function exercises as a method of improving voice production. J Voice. 1994;8:271–278.
44. Titze IR, Verdolini-Abbott K. Vocology: The Science and Practice of Voice Habilitation. Salt Lake City, UT: National Center for Voice and Speech; 2012.
45. Boone DR. The Voice and Voice Therapy. 5th ed. Englewood Cliffs, NJ: Prentice-Hall, Inc; 1996.
46. Colton RH, Casper J. Understanding Voice Problems: A Physiological Perspective for Diagnosis and Treatment. 2nd ed. Baltimore, MD: Williams & Wilkins; 1996.
47. Roy N, Bless DM. Manual circumlaryngeal techniques in the assessment and treatment of voice disorders. Curr Opin Otolaryngol Head Neck Surg. 1998;6:151–155.
48. Ramig LO, Bonitati CM, Lemke JH, et al. Voice treatment for patients with Parkinson disease: development of an approach and preliminary efficacy data. J Med Speech Lang Pathol. 1994;2:191–209.
49. Titze IR. Voice training and therapy with semi-occluded vocal tracts: rationale and scientific underpinnings. J Speech Lang Hear Res. 2006;49:448–459.
50. Kapsner-Smith M, Hunter EJ, Kirkham K, et al. A randomized-controlled trial of two semi-occluded vocal tract voice therapy protocols. J Speech Lang Hear Res. 2015;58:535–549.
51. Titze IR, Riede T, Mau T. Predicting fundamental frequency ranges in vocalization across species. PLoS Comput Biol. 2016.
