

Steven Keyes
24.915 Final Paper
16 December 2013

Comparison of Acoustic Models of Overtone Singing

1. Introduction
Overtone singing, also called throat singing, is a singing technique that produces the
perceptual effect of a singer chanting a low drone note as well as a high, whistle-like melody.
This technique is used in a variety of regions and cultures and is common in the music of Tuva,
Mongolia, and Tibet among others. Although it is sometimes confusingly described as singing
two notes at once, the production of overtone singing can be described by a number of models
incorporating vocal cavity resonances and voicing production. In this paper, we will compare
and summarize some of these models.
In overtone singing, the singer first produces a voiced sound that is low in pitch and
rich in harmonics. These harmonics are components of the waveform whose frequencies are integer
multiples of the fundamental frequency. For example, if the fundamental or first harmonic of a
waveform is 150 Hz, then the second harmonic will be at 300 Hz, the third at 450 Hz, and so on,
such that the nth harmonic is at 150 ⋅ n Hz. After producing this tone, overtone singers then manipulate the
resonances of their vocal tract to enhance a specific, high-frequency harmonic. Because pitch is
perceived on a roughly logarithmic scale, listeners hear high harmonics as being close together.
For example, the 6th through 12th harmonics span one octave and approximate several notes of a
scale, which gives singers potential for generating melodies. The enhanced harmonic is narrow and
prominent, lies a few octaves above the fundamental, and may move across a range of several
different harmonics while the fundamental remains fixed. These properties may contribute to why
listeners perceive it as a second tone.
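
As a rough illustration of the harmonic arithmetic above, the following sketch lists harmonics 6 through 12 of the 150 Hz example and their distance above the fundamental in cents (1200 cents per octave). The function name `harmonic_table` and its default arguments are illustrative choices, not part of any cited model.

```python
import math

def harmonic_table(f0_hz, first=6, last=12):
    """Return (harmonic number, frequency in Hz, cents above the fundamental)."""
    rows = []
    for n in range(first, last + 1):
        freq = f0_hz * n                 # the nth harmonic is n times the fundamental
        cents = 1200 * math.log2(n)      # interval between harmonic n and the fundamental
        rows.append((n, freq, cents))
    return rows

# 150 Hz is the example fundamental used in the text.
for n, freq, cents in harmonic_table(150.0):
    octaves, remainder = divmod(cents, 1200)
    print(f"harmonic {n:2d}: {freq:6.1f} Hz, "
          f"{int(octaves)} octave(s) + {remainder:6.1f} cents above the fundamental")
```

Since the 12th harmonic has twice the frequency of the 6th, the two are exactly one octave (1200 cents) apart, and the harmonics between them fall at progressively smaller intervals.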

2. Model Overview

In general, we will rely on a source-filter model of the vocal tract. The vocal folds, vibrating in some mode chosen by the singer, produce a source waveform at the start of the vocal tract. The rest of the vocal tract can be modeled as a resonator, which emphasizes or deemphasizes certain formant frequencies of the source signal. A simple approximation of the vocal tract is a tube or a series of tubes connected in series with open or closed ends depending on the location of constrictions in the throat and mouth. The nasal cavity adds poles and zeroes to this transfer function, as mentioned later. Finally, in addition to resonances of this system, various factors dampen the system, increasing the bandwidth of resonant poles.

3. Effects of Formants in Overtone Singing

Phoneticians posit various mechanisms as to why the emphasized harmonic in overtone singing is so prominent. Based on a case study of an overtone singer, Bloothooft et al (1992) developed formant spectra for singing a range of overtone frequencies. From these spectra, Bloothooft describes 2 modes of throat singing.
In mode 1, for harmonics less than 800 Hz, the harmonic emphasized by the singer, roughly between 500 and 800 Hz, is the first formant. For this mode, he notes a dip in the formant spectra around 400 Hz, indicating nasalization. For this explanation, the dip at 400 Hz corresponds to the nasal zero, and the lower resonance at 200-350 Hz corresponds to the nasal pole. He suggests this helps separate the overtone resonance from the lower harmonics. Bloothooft elaborated that two possible mechanisms could give rise to this effect. In one, the singer could articulate a nasalized back vowel and manipulate jaw height and lip roundedness along the /ɔ/-/ɑ/ vowel range. This would change the length of the vocal tract, allowing the singer to manipulate the first formant frequency. Alternatively, the emphasized harmonic could be the second formant. To accomplish this method, the singer could keep the lips and jaw fixed but move the tongue slightly forward and back to manipulate the second formant frequency. Based on observations of the mouth of the singer, Bloothooft concluded that the first method was more likely.
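
The source-filter idea can be sketched numerically. The example below is a minimal illustration under stated assumptions, not a model taken from any of the cited studies: the source is a harmonic series whose level falls by 12 dB per octave (a common textbook approximation), each formant is a two-pole resonator whose bandwidth stands in for damping, and the formant values (500 Hz and 1800 Hz) are made up for the demonstration. The names `resonator_gain` and `harmonic_levels_db` are hypothetical.

```python
import math

def resonator_gain(freq_hz, center_hz, bandwidth_hz):
    """Magnitude response of one two-pole resonator, normalized to gain 1 at 0 Hz."""
    s = complex(0.0, 2 * math.pi * freq_hz)
    # Pole location: the real part sets the bandwidth (damping), the imaginary part the center.
    pole = complex(-math.pi * bandwidth_hz, 2 * math.pi * center_hz)
    return abs(pole) ** 2 / abs((s - pole) * (s - pole.conjugate()))

def harmonic_levels_db(f0_hz, formants, n_harmonics=20, tilt_db_per_octave=-12.0):
    """Return (harmonic number, frequency in Hz, level in dB) after filtering.

    formants is a list of (center_hz, bandwidth_hz) pairs; all values are assumptions.
    """
    rows = []
    for k in range(1, n_harmonics + 1):
        freq = f0_hz * k
        level = tilt_db_per_octave * math.log2(k)   # assumed source spectrum tilt
        for center_hz, bandwidth_hz in formants:
            level += 20 * math.log10(resonator_gain(freq, center_hz, bandwidth_hz))
        rows.append((k, freq, level))
    return rows

# Hypothetical formants: a broad one near 500 Hz and a narrow one tuned to harmonic 12.
for k, freq, level in harmonic_levels_db(150.0, [(500.0, 80.0), (1800.0, 60.0)]):
    print(f"harmonic {k:2d}  {freq:7.1f} Hz  {level:7.1f} dB")
```

In this sketch, narrowing a resonator's bandwidth raises its peak gain, so a lightly damped resonance centered on one harmonic lifts that harmonic well above its neighbors.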

In mode 2, in which singers produce harmonics greater than 800 Hz, no nasal dip at 400 Hz is produced. In this mode, Bloothooft suggests there may be a nasal zero around 800-1000 Hz, but the effect is not very strong. If this were the case, the corresponding nasal pole could be conflated with the first formant. As mode 1 only corresponds to about 3 harmonics, I will mainly consider mode 2 for other examples.
Bloothooft also suggests that the second formant combines with the third formant for a prominent peak. Mouth shapes that lend to convergence of F2 and F3 are mechanisms that support this explanation. For example, Bloothooft suggests that overtone singing involves a retroflex /r/-like mouth shape, and Hai (1991) confirms this shape. This is support for the combination of F2 and F3 because r-colored vowels have lower F3s. This can be seen in Figure 1, which compares a recording of a regular vowel and r-colored vowel. Chernov and Maslov (1987, 1991), using x-ray pictures to study the shape of the mouth when performing overtone singing, found tongue positions that resembled the articulation of /l/ and /li/, with a retroflex tongue position. This tongue position implies the mouth may be divided into a front cavity and a back cavity, each with a resonance that contributes to the final waveform. In fact, many studies of overtone singing suggest that the combination of these cavities, tuned such that they have the same resonance, may contribute to the prominence of the emphasized harmonic.

Figure 1: The formants of a regular vowel [ə] and r-colored vowel [ɚ] demonstrate how F3 is lowered when the tongue moves upward into a retroflex position. Formants other than F3 remain in approximately the same position. The lines, from bottom to top, are F1, F2, F3, and F4 of a 2005 recording demonstrating r-colored vowels, measured using Praat.

This lowering effect is achieved by positioning the tongue near an appropriate node, as explained by Stevens (1989), especially if the lips are rounded and protruded, increasing the size of the cavity in front of the constriction and thus further decreasing the F3 frequency. Lowering F3 to approach the F2 frequency by positioning the tongue in a retroflex position could account for the production of throat singing. Furthermore, this effect is observable when overtone singers transition from a regular sung vowel to an overtone sung vowel, as shown in Figure 2.

Figure 2: In this excerpt from “Borbanngadyr”, a piece performed by a Tuvan-style overtone singer, the singer starts by singing a /ɑ/-like vowel but then transitions to singing in overtone style. F1, F2, and F3 are visible at the beginning of the clip, but in the middle F2 and F3 clearly converge by F2 raising and F3 lowering. They form a combined peak roughly at 1750 Hz, the 12th partial of the measured fundamental.

Bloothooft as well as Klingholz (1993), who performed another case study on a recording of overtone singing, also observed that overtone singers realize their fundamental frequency in a very stable manner with respect to pitch, and they suggest this helps them align their formants with the harmonics of their voice. Alternatively, they suggest this allows another parameter of creativity because modulating their pitch, such as by vibrato, would cause a corresponding effect on the intensity of the harmonics as they oscillate around the resonant point of the vocal tract.
Also, Klingholz pointed out that the bandwidth of the formant used by the singer relative to its amplitude is very narrow. A normal formant spans multiple harmonics, but the emphasized formant of overtone singers generally spans mostly a single harmonic. Klingholz claims this is because singers tense the cheeks, reducing damping by as much as 40%. Also, a small mouth opening further reduces damping, further increasing amplitude relative to its bandwidth. This can be seen in a wide-window spectrogram of overtone singing, as in Figure 3. Finally, the formant is very prominent for other reasons discussed earlier, such as the combination of F2 and F3.

Figure 3: This wide window (15 ms) spectrogram makes harmonics easily observable as the lines that span the graph, starting with the 1st harmonic, the fundamental. In this excerpt from “Borbanngadyr”, the overtone singer creates a melody by alternating between the 12th, 10th, and 9th harmonics. F1 is visible as the slight darkening on the 1st and 2nd harmonics. The overtone harmonic, however, is very prominent and narrow; it significantly darkens roughly one harmonic at a time. Listeners perceive this as a distinct tone.

4. Summary of sources for source filter model

In addition to the formant strength, the source signal has an effect on the distinct sound of overtone singing, and there is some debate on the exact production of the source signal. Large and Murry (1981) suggested Tibetan chant was in vocal fry register. However, Bloothooft disagrees for their case study, writing that the signal of the singer he measured was too regular and periodic. Instead, Bloothooft suggests the use of the modal register but with a relatively long glottal closure. Bloothooft also qualitatively observed damping in the voice signal, which he thought corresponded with the effects of long glottal closure. Finally, Dmitriev et al (1983) and further Chernov and Maslov (1987, 1991) found the false vocal folds may have some involvement in Touvinian “double-voice singing” in addition to the glottis. This would result in muscular hypertension in the pharyngeal region, which was observed by Dmitriev et al (1983) in Touvinian singers.

5. Conclusion

In general, the source filter model and formant model provide a strong explanation for the production of overtone singing. The convergence of F2 and F3 as well as other factors related to damping are evident in singers of this style.

References

Bloothooft, G., Bringmann, E., Van Cappellen, M., Van Luipen, J. B., and Thomassen, K. P. (1992) “Acoustics and perception of overtone singing.” Journal of the Acoustical Society of America 92(4), pp. 1827–35.
Chernov, B. and Maslov, V. (1987) “Larynx double sound generator.” Proc. XI Congress of Phonetic Sciences, Tallinn, Vol. 6, pp. 40-43.
Chernov, B. and Maslov, V. (1991) “Phonation without text. From whistle to speech.” Proc. XII Congress of Phonetic Sciences, Aix-en-Provence, Vol. 3, pp. 370-373.
Dmitriev, L., Chernov, B., and Maslov, V. (1983) “Functioning of the Voice Mechanism in Double-Voice Touvinian Singing.” Fol. Phoniatr. 35, 193-197.
Hai, T. Q. (1991) “New experiments about the Overtone Singing Style.” Proc. Conference “New ways of the voice,” Besançon, 61.
Huun-Huur-Tu. (1994) “Borbanngadyr” Recording.
Klingholz, F. (1993) “Overtone Singing: Productive Mechanisms and Acoustic Data.” Journal of Voice Vol. 7, No. 2, pp. 118-122.
Large, J. and Murry, T. (1981) “Observations on the nature of Tibetan chant.” J. Exp. Res. Singing 5, 22-28.
“Regular and r-colored vowels” (2005) Recording submitted by user Denelson83. en.wikipedia. [...] .ogg
Stevens, K. N. (1989) “On the Quantal Nature of Speech.” Journal of Phonetics 17, pp. 3-45.