
Voice Source Variation Between Vowels in Male Opera Singers

*,†Johan Sundberg, ‡Filipa M. B. Lã, and §Brian P. Gill, *†Stockholm, Sweden, ‡Aveiro, Portugal, and §Bloomington, Indiana

Summary: Objectives. The theory of nonlinear source-filter interaction predicts that the glottal voice source should be affected by the frequency relationship between formants and partials. An attempt to experimentally verify this theory is presented.
Study design. Glottal voice source and electrolaryngograph (ELG) signal differences between vowels were analyzed in vowel sequences, sung at four pitches with the same degree of vocal loudness by professional opera singers. In addition, the relationships between such differences and the frequency distance between the first formant (F1) and its closest partial were examined.
Methods. A digital laryngograph microprocessor was used to simultaneously record audio and ELG signals. The former was inverse filtered, and voice source parameters and formant frequencies were extracted. The amplitude quotient of the derivative of the ELG signal (AQdELG) and the contact quotient were also compared.
Results. A one-way repeated-measures ANOVA revealed significant differences between vowels, for contact quotient at four pitches and for maximum flow declination rate (MFDR) at three pitches. For other voice source parameters, differences were found at one or two pitches only. No consistent correlation was found between MFDR and the distance between F1 and its closest partial.
Conclusions. The glottal voice source tends to vary between vowels, presumably because of source-filter interaction, but the variation does not seem to be dependent on the frequency distance between F1 and its closest partial.
Key Words: Sung vowels–Inverse-filtering–Voice source–Formant frequencies–Electrolaryngograph.

Accepted for publication July 22, 2015.
This work was presented at the Annual Symposium Care of the Professional Voice, Philadelphia, June 2014.
From the *Department of Speech, Music and Hearing, School of Computer Science and Communication, KTH, Stockholm, Sweden; †University College of Music Education, Stockholm, Sweden; ‡Department of Communication and Arts, INET-MD, University of Aveiro, Aveiro, Portugal; and the §Department Jacobs School of Music, Indiana University, Bloomington, Indiana.
Address correspondence and reprint requests to Johan Sundberg, Department of Speech, Music and Hearing, School of Computer Science and Communication, KTH, SE-10044 Stockholm, Sweden. E-mail: pjohan@speech.kth.se
Journal of Voice, Vol. 30, No. 5, pp. 509-517
0892-1997/$36.00
© 2016 The Voice Foundation
http://dx.doi.org/10.1016/j.jvoice.2015.07.010

INTRODUCTION
According to classical singing pedagogy, some vowels can be produced more easily than others at a given pitch.1 This seems to contradict the classical source-filter theory of voice production, if it is assumed to predict that the glottal airflow is independent of vocal tract resonances, that is, formants.2 Rather, it supports the assumption that the glottal airflow is affected by the formants because of nonlinear source-filter interaction.3,4

The theory of nonlinear source-filter interaction in voice production has been developed over the last decades.5 It predicts that when the first formant (F1) coincides with or crosses over one of the lower spectrum partials, voice instabilities may occur, for example, fundamental frequency (F0) jumps, subharmonic frequencies, and changes in the amplitude of the voice source fundamental.5,6 Under certain conditions, such feedback may facilitate vocal fold oscillation, that is, elicit a more efficient conversion of aerodynamic to acoustic energy.3 More specifically, the sound pressure level (SPL) of a vowel may increase by as much as 10 dB if one of the lowest harmonics is just below the first formant frequency. On the other hand, it may be weakened if one of those partials is located just above the first formant frequency.5 Considering the dependence on the proximity between lower harmonics and the first formant frequency, this interaction should be milder for male speech and greater for female and child voices. In male singing, however, an interaction should be likely to occur in and above the passaggio, that is, E4 (≈330 Hz) to G4 (≈400 Hz).7

The theory of nonlinear source-filter interaction has been tested and confirmed in experiments using physical models, computer simulation,8 excised larynges,9,10 and voice source analysis in a single speaker.11 For example, in model experiments with a simplified two-mass model connected to a straight tube, subharmonic vibrations and deterministic chaos were observed when F0 and F1 coincided.12 The theory has also been tested in experiments with human subjects. For example, Titze et al had 18 subjects, none of whom had extensive vocal training, perform vocal exercises where F1 was passed by a partial. In many cases, various types of F0 disturbances, such as pitch jumps and bifurcations, were observed when a partial was close to F1. Moreover, using an electrolaryngograph (ELG), a noninvasive tool for documenting vocal fold contact,13 differences have been observed in contacting and decontacting events between different spoken vowels; both the open quotient and the speed quotient were affected.14

In singing, control of the vocal output is crucial, so uncontrolled pitch jumps and other instabilities would be totally unacceptable. One way to circumvent them would be to avoid the situation that a partial is just above F1. However, the effects of the frequency relations between F1 and its closest partial have not been measured in singers, either with respect to the flow glottogram of different vowels or with respect to the ELG waveform. Hence, it seemed worthwhile (1) to compare voice source parameters between vowels and (2) to investigate whether vowel differences between such parameters could be explained by source-filter interaction. In particular, we tested if the vocal tract excitation, that is, the maximum flow
declination rate (MFDR), was greater when the frequency distance between F1 and its closest partial, henceforth Min(F1 − [n × F0]), was positive, that is, when F1 was just above the closest partial, and smaller when it was negative, that is, when F1 was just below its closest partial.

METHODS
Eight male classically trained singers, 23–42 years old (mean 31.1, SD 6.9) with varying levels of professional expertise, volunteered as subjects (Table 1). They were asked to sing a sequence of the vowels /i, e, a, o, u/ on each of the pitches E3, G3, A3, and C4, keeping vocal loudness constant. Each task was repeated once. The pitches were given to the subjects by means of the custom-made MADDE software (by Svante Granqvist; KTH, Stockholm, Sweden).

TABLE 1.
Participants' Age, Voice Classification, and Working Experience

Singer  Age (yrs.)  Classification  Experience
1       30          Tenor           Internationally touring
2       32          Baritone        Nationally touring
3       23          Baritone        Graduate student
4       25          Tenor           Graduate student
5       24          Baritone        Graduate student
6       42          Tenor           Internationally touring
7       38          Baritone        Nationally touring
8       35          Baritone        Graduate student

All recordings were made in a sound-treated studio in the Steinhardt School of Culture, Education, and Human Development at New York University. A Laryngograph microprocessor (Laryngograph Ltd, London, UK) was used to record audio and ELG simultaneously. The former was picked up by a head-mounted omnidirectional electret microphone (Knowles EK3132, Knowles Corporation, Itasca, IL) placed at a mouth-to-microphone distance of approximately 15 cm. The sound level was calibrated by means of a 1-kHz sine wave, the SPL of which was measured next to the recording microphone by means of a sound level meter. The value observed was announced in the recording. Both signals were recorded using Laryngograph Speech Studio (Laryngograph Ltd, London, UK) software and stored as WAV files.

The voice source was analyzed in terms of flow glottograms derived from the audio signal after integration and inverse filtering. Inverse filtering is a classical method in voice analysis.15,16 The strategy is to eliminate the influence of the vocal tract resonance characteristics on the radiated sound. This is realized by filtering the signal by a set of filters representing the inverse of the transfer function of the vocal tract. The method offers information on both the glottal airflow waveform (flow glottogram) and on the formant frequencies and bandwidths. The accuracy is particularly high in cases where a partial is close to a formant. This is illustrated in Figure 1, showing the effects of setting the F1 filter 4% above and 4% below the correct value.

Samples of the different vowels were analyzed using the custom-made Decap software for inverse filtering (Svante Granqvist; KTH, Stockholm, Sweden). This program can be set to display waveform and spectrum in separate windows, as described in detail elsewhere.17 The frequencies and bandwidths of the inverse filters are set manually, and the classical equations are applied for calculating the transfer function that

FIGURE 1. Example of the effects on the flow glottogram of mistuning the inverse filter for F1 by −20 Hz and +20 Hz, corresponding to ±4% of F1 (left and right panels, respectively). The middle panel represents the result of a correct filter setting. The positions of the arrows along the horizontal axis show the frequencies of the F1 inverse filter. Their positions on the vertical scale represent their bandwidths in an arbitrary scale, and the curves show the range of typical bandwidth values. (Color version of figure available online.)
Johan Sundberg, et al Voice Source and Vowels in Male Singers 511

FIGURE 2. Examples of audio signal, flow glottogram, and dELG (top, middle, and bottom curves) of the indicated vowels sung in the same vowel sequence by singer 1 on the pitch E3 (F0 ≈ 165 Hz).

corresponds to the chosen combination of formant frequencies and bandwidths. The software can display the filtered voice source waveform and the spectrum in quasi-real time. The formant frequencies and bandwidths can be saved in a formant data file. Provided that the filters are correctly set, the output displays the waveform and spectrum of the transglottal airflow, also including effects of nonlinear source-filter interaction, if any. It should be noted that inverse filtering yields a representation of glottal flow, but not of glottal area, because glottal area is nonlinearly related to glottal flow.11,18 Thus, only the effects of the vocal tract transfer function are eliminated from the input signal. The program can also display an additional signal, such as ELG or its derivative (dELG), which can be delayed so as to compensate for the time lag between the audio and the ELG signals.

For the inverse filtering, the formant frequencies and bandwidths were adjusted according to three criteria: (1) ripple-free closed phase; (2) voice source spectrum envelope as void of peaks and valleys near the formant frequencies as possible; and (3) synchrony between the positive peak of the dELG and the MFDR. The last mentioned criterion is based on the fact that vocal fold contact must cause a sudden decrease of glottal airflow.19

As mentioned, there are reasons to expect that the sound level produced is affected by the frequency distance between F1 and the harmonic lying closest to it, the Min(F1 − [n × F0]). This sound level depends on the strength by which the vocal tract is excited by the voice source, and that strength is determined by MFDR. Therefore, this voice source parameter and F0 were measured using the "Glottal flow parameter measurement" tool contained in the custom-made Sopran software (available at www.tolvan.com, last inspected June 2015). Other flow glottogram parameters measured by the same software were pulse amplitude, normalized amplitude quotient (NAQ), level difference between the first and the second harmonic of the source spectrum (H1 − H2), and the closed quotient (Qclosed). In addition, the contact quotient (Qcontact) and the amplitude quotient of the derivative of ELG (AQdELG) were determined from the ELG signal. The last mentioned measure was calculated by a novel tool, developed as a script in the Sopran software by Svante Granqvist. It is defined as the smoothed ratio between the amplitude of the positive peak of the dELG signal and the amplitude of the negative peak of the dELG signal multiplied by −1. Thus, it reflects how much steeper the maximum contacting speed is than the maximum decontacting speed. This measure is closely related to the EGG speed quotient used by

TABLE 2.
Results of One-Way Within-Subjects Repeated-Measures ANOVA for Each of the Four Pitches Analyzed, Run on Z-Score Values for Both Flow Glottogram and ELG Parameters and on Raw Data Values for Min(F1 − [n × F0])

Parameter Pitch /i/ Mdn /i/ IQR /e/ Mdn /e/ IQR /a/ Mdn /a/ IQR /o/ Mdn /o/ IQR /u/ Mdn /u/ IQR P Value
Pulse amplitude E3 0.46 1.13 0.60 2.00 0.33 1.37 0.20 0.68 1.01 1.87 0.24
G3 0.34 1.11 0.51 2.40 0.06 0.79 0.37 1.41 1.11 1.19 0.075
A3 0.23 0.87 0.65 1.51 1.12 1.28 0.20 0.62 0.92 0.87 0.03
C4 0.72 2.22 0.44 1.77 0.16 0.91 0.05 1.06 1.03 1.27 0.036
MFDR E3 0.57 1.53 0.02 1.07 0.09 1.09 0.02 1.26 0.02 1.78 0.699
G3 0.81 0.72 0.67 1.25 0.48 1.91 0.38 0.71 0.55 1.69 0.013
A3 0.97 0.59 0.80 0.96 0.64 0.36 0.72 1.17 1.12 0.35 <0.05
C4 0.17 1.22 0.48 0.93 0.74 1.17 0.13 1.46 1.10 0.97 0.019
NAQ E3 0.68 1.67 0.08 0.77 0.15 1.56 0.57 1.01 0.63 2.43 0.215
G3 0.94 1.26 0.43 1.05 0.34 1.01 0.36 0.76 0.31 2.42 0.091
A3 0.60 1.10 0.38 1.14 0.15 1.45 1.13 1.34 0.52 0.91 0.01
C4 1.24 1.04 0.27 0.87 0.62 1.16 0.09 1.40 0.05 1.26 0.215
H1 − H2 E3 1.19 1.11 0.47 0.33 0.16 1.41 0.64 1.30 0.06 1.03 0.046
G3 0.48 1.98 0.06 1.77 0.29 1.48 0.16 1.74 0.09 1.36 0.861
A3 0.65 1.91 0.35 1.45 0.14 1.83 0.02 0.51 0.57 0.92 0.42
C4 0.73 1.42 0.92 0.76 0.45 0.47 0.83 1.27 0.71 1.47 0.006
Qclosed E3 0.72 1.81 0.36 1.34 0.04 1.37 0.32 0.93 0.63 1.87 0.24
G3 0.31 2.11 0.19 1.32 0.14 1.78 0.40 0.76 0.50 2.06 0.894
A3 0.64 2.07 0.12 1.42 0.17 1.99 0.57 0.52 0.18 1.31 0.195
C4 1.06 0.72 1.11 1.20 0.27 1.43 0.21 0.70 0.42 1.37 0.007
AQdELG E3 0.57 1.86 0.43 1.02 0.65 1.25 0.17 1.07 1.15 1.18 0.027
G3 0.15 1.62 0.01 0.81 0.60 1.71 0.32 1.34 0.98 0.48 0.037
A3 0.76 1.19 0.20 1.15 0.30 1.58 0.10 0.76 0.85 2.06 0.199
C4 0.82 1.67 0.30 1.63 0.37 1.26 0.13 0.91 0.89 1.58 0.258
QContact E3 0.29 1.52 0.20 0.80 0.71 1.72 0.54 0.67 0.51 2.75 0.023
G3 1.07 1.14 0.47 0.85 0.34 1.42 0.57 0.61 0.27 1.96 0.033
A3 0.78 1.46 0.14 0.34 0.46 1.20 1.00 0.61 0.51 1.83 0.001
C4 0.84 1.48 0.21 1.00 0.79 0.71 0.75 1.54 0.17 1.83 0.033
Notes: The columns list the median (Mdn) and interquartile range (IQR) for the different vowels.
Statistically significant differences between vowels (Friedman test, P < 0.05) are in bold.

Marasek,14 which, however, is based on a linear approximation of the EGG waveform.

Flow glottogram and ELG measures may differ considerably between individuals; for example, the same subglottal pressure would obviously produce larger pulse amplitudes in individuals with long vocal folds than with short vocal folds because longer vocal folds produce a larger glottal area. Therefore, all glottal parameters were converted to z scores. Such values are calculated as the difference between the value and the subject's average of all values, divided by the standard deviation. To test if the voice source parameters and the ELG measures varied significantly between vowels, a one-way within-subjects repeated-measures ANOVA was run on these z score values.

RESULTS
Many flow glottograms deviated considerably from the classical form with skewed triangle-shaped pulses separated by horizontal portions that represent the closed phase. Mostly, the deviations differed between the vowels in the same sequence, as can be seen in the examples shown in Figure 2. The tilt of the closed phase varied considerably and sometimes contained a ripple or a bump that was impossible to cancel with realistic formant frequencies. In some cases, a bump appeared in the pulse, for example, as in the vowel /o/ in Figure 2.

The result of the ANOVA is summarized in Table 2. Qcontact showed a significant variation for all four pitches analyzed and MFDR for three pitches. Pulse amplitude, H1 − H2, and AQdELG showed a significant variation between vowels for two pitches, whereas NAQ and Qclosed differed significantly only for one pitch.

The frequencies used for the inverse filters for F1 and F2 are shown in Figure 3. The subjects showed a common pattern varying with vowels in the expected manner. No systematic variation with pitch can be observed, except for F2, which was higher for the pitch C4 than for the lower pitches in some cases.

FIGURE 3. F1 and F2 used in the inverse filtering of the vowels sung at the indicated pitches by the singers. (Color version of figure available
online.)

According to the theory of nonlinear source-filter interaction, the frequency distance between F1 and its closest partial affects the intensity that this partial has in the voice source; a formant just above the partial will boost its intensity, whereas a formant just below a partial will attenuate it.6 The partial closest to F1 mostly has the highest amplitude in the spectrum, and the SPL tends to be entirely determined by the intensity of that partial.20,21 The SPL, in turn, is strongly dependent on MFDR, which determines the strength with which the vocal tract is being excited by the voice source. In other words, according to the theory of nonlinear source-filter interaction, MFDR should vary systematically depending on Min(F1 − [n × F0]). When Min(F1 − [n × F0]) is small and positive, that is, when F1 is just above the closest partial, the amplitude of this partial should be increased. Conversely, when Min(F1 − [n × F0]) is small and negative, that is, when F1 is just below the closest partial, its amplitude should be attenuated.4

The previous section suggests that singers would prefer to tune F1 to a frequency just above its nearest partial. Figure 4 shows Min(F1 − [n × F0]) for the different vowels and pitches. Positive values refer to cases where the F1 was higher than its closest partial, and vice versa. If the singers preferred to place F1 just above its closest partial, there would be a greater number of positive than negative values. Moreover, as the effect would be greatest when Min(F1 − [n × F0]) is both positive and small, there should be a great number of low positive values. Such effects were not found; the number of positive values was greater than the number of negative values in five singers, and lower in three singers. Furthermore, few values were positive and small, and there was no clear difference between cases where the second and the third partial was closest to F1. Thus, there seemed to be no preference among these singers to tune F1 to a frequency just above a partial.

Figure 5 shows each singer's MFDR, in l/s², for the different vowels and pitches. For all singers, MFDR differed considerably between vowels sung on a given pitch. The variation was particularly great for singer 3 and small for singer 8. In many cases, MFDR was lower for /i/ and /u/ than for /a/.

As mentioned, the theory of nonlinear source-filter interaction predicts that vowels with a low positive Min(F1 − [n × F0]) will be associated with higher MFDR values than vowels with a low negative Min(F1 − [n × F0]). In both cases, the effect should decrease when the frequency separation between F1 and the partial increases. This was tested by plotting MFDR as function of Min(F1 − [n × F0]), Figure 6. In each panel of the figure, the correlation coefficients between MFDR and the Min(F1 − [n × F0]) are listed for negative values and positive values of Min(F1 − [n × F0]) separately. Thus, in each panel, the midline represents the case when the formant coincides with a partial.

FIGURE 4. Frequency distance between F1 and its closest partial, Min(F1 − [n × F0]), for the different vowels sung at the indicated pitches by the eight singers.

In some cases, the values to the left of the midline (open circles) are lower than most of those to the right of the midline (filled circles). This means that, in these cases, MFDR tended to be lower when the formant was below its closest partial. This is in accordance with the theory of nonlinear source-filter interaction. This theory further predicts that the closer the partial is to the formant, the stronger the effects should be; in other words, when the formant is higher than its closest partial, that is, for Min(F1 − [n × F0]) > 0, the highest MFDR values should occur close to the midline in the graph. Conversely, when the formant is lower than its closest partial, that is, for Min(F1 − [n × F0]) < 0, the MFDR values close to the midline should be low. This means that the trend lines should have a negative slope both to the left and to the right of the midline. In other words, one would expect the MFDR values to the left of the midline in each panel to decrease with decreasing distance to the midline and the values to the right of this line should be high close to the midline and decrease with increasing distance from this line. Only singers 3 and 5 showed a clear negative correlation for negative values of Min(F1 − [n × F0]). For singers 3 and 7, MFDR increased with increasing positive Min(F1 − [n × F0]). For positive values, none of the singers showed a negative correlation. These findings suggest that nonlinear source-filter interaction did not explain why MFDR differed significantly between vowels.

One might assume that AQdELG, that is, the speed ratio between vocal fold contacting and decontacting, would be greater when F1 is just above its closest partial, that is, when Min(F1 − [n × F0]) is small and positive. This assumption was tested by analyzing the correlation between the two. The result is shown in Figure 7. The correlation coefficients for negative and positive values of Min(F1 − [n × F0]) are shown in the top of the panels. There is an indication of some relationship between AQdELG and Min(F1 − [n × F0]) for singers 4, 5, and 6. However, these correlations differ in sign and refer to negative values of Min(F1 − [n × F0]) for singers 4 and 5 and to positive values for singer 6. Hence, although AQdELG differed significantly between vowels at two pitches, it does not seem to be related to the distance between F1 and its closest partial.

DISCUSSION
The present investigation relies on inverse filtering. The basic assumptions underlying this method are (1) that the voice source is the transglottal airflow and (2) that the sound transfer in the vocal tract is linear. If, by contrast, the voice source is defined as the oscillating glottal area, inverse filtering could not be used to recover it from the radiated sound because the relationship between glottal area and glottal flow is nonlinear.22 With regard to the second assumption, there is no reason to doubt its validity; it is a well-established fact that the vocal tract sound transfer is linear. Therefore, inverse filtering can be used for analyzing the transglottal airflow, that is, the voice source.

As all methods, inverse filtering has limitations. Nasalization produces an extremely complicated transfer function, the inverse of which cannot be accurately generated by the inverse filter.23–25 However, for velopharyngeal openings smaller than 1 cm², the effect on flow glottogram parameters is small and

FIGURE 5. MFDR for the different vowels sung by the subjects at the indicated pitches.

limited to the very final part of the closing phase, the so-called return phase.26 This part of the flow glottogram cannot affect MFDR appreciably because the MFDR reflects the steepness at the inflection point of the flow, whereas the return phase reflects the abruptness of the subsequent corner of the flow signal.

Unexpectedly, flow glottograms have recently been found to be quite sensitive to sound reflections in the recording room; depending on the distance to the reflecting object, they may cause disturbances such as a tilt and also a ripple in the closed phase (Svante Granqvist, personal communication). Many examples of such disturbances were found in the flow glottograms, although the recordings were made in a sound-treated studio, and the microphone distance to the mouth was only approximately 15 cm. A much shorter microphone distance would have reduced or eliminated the influence of room reflections but could not be used because it caused clipping of the audio signal by the equipment available for the experiment.

Our results concerned vowels sung in a sequence by male professional baritone and tenor singers. It seems reasonable to assume that the singers did not intentionally change their voice source within such a sequence. Nevertheless, our results showed voice source differences between vowels, and significant differences were found for all flow glottogram parameters analyzed, at least for one of the four pitches examined. With respect to MFDR and Qcontact, significant differences were found for three and four pitches, respectively. Thus, the voice source varied systematically with vowel.

We expected this variation to depend mainly on the Min(F1 − [n × F0]), but the results failed to confirm this expectation. A relevant question then is what may have caused the variation. Examining the data listed in Table 2 suggests that some of the variation is correlated with F1 and F2. For example, Qclosed showed a slight tendency to decrease and MFDR to increase with rising F1. (The authors are indebted to an anonymous reviewer for pointing this out to us.) These relationships should be worthwhile to study in detail in the future.

The bump in the pulse, illustrated in the case of the vowel /o/ in Figure 2, shows a striking similarity with the flow glottogram shape discussed by Gunnar Fant (1986)18 and by Martin Rothenberg.27 Both ascribed this effect to source-filter interaction. In the case shown in the figure, F1 and F2 were 517 Hz and 1009 Hz, respectively, and F0 was 168 Hz. Hence, the third partial was 3 × 168 = 504 Hz, that is, just 13 Hz below F1; the sixth partial was 6 × 168 = 1008 Hz, that is, just 1 Hz from

FIGURE 6. MFDR as function of the Min(F1 − [n × F0]). At the bottom of each panel, the left and right values correspond to the linear correlations for negative and positive values of Min(F1 − [n × F0]), open and filled circles, respectively. (Color version of figure available online.)

F2. In this case, then, it seems likely that the bump in the pulse was caused by source-filter interaction.

Lim et al28 (2006) studied the relationship between jaw opening and the EGG speed quotient in different vowels pronounced by speakers. They found that this quotient was significantly lower in the vowel /u/ than in the vowels /a, e, i, o/.28 The speed quotient should be closely related to the AQdELG. We found significant variation of this parameter between vowels for two pitches. Thus, similar to the results reported by Lim et al, our findings indicate that the contacting speed of the vocal folds sometimes differs between vowels.

The subjects used in the present study were classically trained singers. Such subjects generally have much better control over their voices than untrained subjects, that is, the influence of random factors on phonation has been minimized. According to the theory of nonlinear source-filter interaction, a negative Min(F1 − [n × F0]), which is close to zero, should expose the singer to the risk of instabilities and uncontrolled variation of MFDR, and hence vocal loudness. For example, when performing unfamiliar vocal tasks such as wide glissandos, untrained voices have been found to produce pitch jumps and other uncontrolled vocal events when a partial is passing F1. Such events are unlikely to occur in singers. One would then expect that trained singers avoid the situation that F1 is just below its closest partial. However, the singers in this study did not avoid this situation. Apparently, they had found another method to circumvent such problems. What this method may be is a question that will be analyzed in a future investigation.

Summarizing, although no pitch instabilities were found, several flow glottogram and ELG differences were observed between vowels. Such differences may be caused by source-filter interaction; but, according to these findings, they do not seem to be due to a specific formant/partial relationship. Flow glottogram differences in, for example, pulse amplitude or Qclosed would be less apparent than pitch instabilities, making them more tolerable, even in professional singing.

CONCLUSIONS
Our results have shown that there are substantial flow glottogram and ELG differences between vowels sung in a sequence by professional classically trained singers. Although these differences may occur because of source-filter interaction, our analyses failed to support the assumption that the variation of MFDR was related to the Min(F1 − [n × F0]). Thus, much of the cause for the variation of the voice source between vowels remains an open question.

Acknowledgments
The authors gratefully acknowledge the kind cooperation of the singers who volunteered to participate. The authors are indebted to the Schering Health Care Ltd. and Bayer Portugal for providing the means to acquire the equipment.

FIGURE 7. AQdELG as function of Min(F1 − [n × F0]). In each panel, the left and right values at the top show the correlations for negative and positive values of Min(F1 − [n × F0]), open and filled circles, respectively. (Color version of figure available online.)
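
AQdELG, the quantity plotted in Figure 7, is defined in the Methods as the smoothed ratio between the positive peak of the dELG signal and the negative peak multiplied by −1. The Python sketch below illustrates that definition only and is not the Sopran script: the function names are hypothetical, the signal is assumed to be already segmented into glottal cycles, and averaging over cycles is an assumed stand-in for the smoothing, which the text does not specify.

```python
def aq_delg_cycle(delg_cycle):
    """Amplitude quotient of one glottal cycle of the dELG signal.

    Ratio of the positive (contacting) peak to the negative
    (decontacting) peak multiplied by -1, so the result is positive.
    """
    positive_peak = max(delg_cycle)
    negative_peak = min(delg_cycle)
    if positive_peak <= 0 or negative_peak >= 0:
        raise ValueError("cycle needs one positive and one negative peak")
    return positive_peak / (-negative_peak)


def aq_delg(cycles):
    """Mean of the per-cycle quotients.

    Assumption: a plain average over cycles stands in for the
    unspecified smoothing.
    """
    ratios = [aq_delg_cycle(cycle) for cycle in cycles]
    return sum(ratios) / len(ratios)


# Illustrative samples only, not data from the study.
print(aq_delg_cycle([0.0, 4.0, 0.5, -1.0, -2.0, 0.0]))  # prints 2.0
```

A value above 1 means that the maximum contacting speed exceeds the maximum decontacting speed.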

REFERENCES
1. Coffin B. Overtones of Bel Canto. New Brunswick, NJ: Scarecrow Press; 1980.
2. Fant G. Acoustic Theory of Speech Production. 2nd ed. The Hague, The Netherlands: Mouton; 1960.
3. Titze I. A theoretical study of F0-F1 interaction with application to resonant speaking and singing voice. J Voice. 2004;18:292–298.
4. Titze I. Nonlinear source-filter coupling in phonation: theory. J Acoust Soc Am. 2008;123:2733–2749.
5. Titze I, Riede T, Popolo P. Nonlinear source–filter coupling in phonation: vocal exercises. J Acoust Soc Am. 2008;123:1902–1915.
6. Titze IR, Worley AS. Modeling source-filter interaction in belting and high-pitched operatic male singing. J Acoust Soc Am. 2009;126:1530–1540.
7. Doscher B. The Functional Unity of the Singing Voice. London, UK: The Scarecrow Press, Inc.; 1994.
8. Alipour F, Montequin D, Tayama N. Aerodynamic profiles of a hemilarynx with a vocal tract. Ann Otol Rhinol Laryngol. 2001;110:550–555.
9. Chan R, Titze I. Dependence of phonation threshold pressure on vocal tract acoustics and vocal fold tissue mechanics. J Acoust Soc Am. 2006;119:2351–2362.
10. Zhang Z, Neubauer J, Berry D. The influence of subglottal acoustics on laboratory models of phonation. J Acoust Soc Am. 2006;120:1558–1569.
11. Rothenberg M. Acoustic interaction between the glottal source and the vocal tract. In: Stevens K, Hirano M, eds. Vocal Fold Physiology. Tokyo: University of Tokyo Press; 1981:305–328.
12. Haralambos A, Tecumseh F, HP H. Voice instabilities due to source-tract interactions. Acta Acust United Acust. 2006;92:468–475.
13. Baken R, Orlikoff R. Clinical Measurement of Speech and Voice. 2nd ed. San Diego: Singular Publishing Group Thomson Learning; 2000.
14. Marasek K. Glottal correlates of the word stress and the tense/lax opposition in the German vowels. Proc ICSLP-96. 1996;1573–1577.
15. Fant G. A new anti-resonance circuit for inverse filtering. STL–QPSR. 1961;4:1.
16. Alku P. Glottal wave analysis with pitch synchronous iterative adaptive inverse filtering. Speech Comm. 1992;11:109–118.
17. Sundberg J, Lã F, Gill B. Formant tuning strategies in professional male opera singers. J Voice. 2013;27:278–288.
18. Fant G. Glottal flow: models and interaction. J Phon. 1986;14:393–399.
19. Lã FMB, Sundberg J. Contact quotient versus closed quotient: a comparative study on professional male singers. J Voice. 2014;29:148–154.
20. Gramming P, Sundberg J. Spectrum factors relevant to phonetogram measurement. J Acoust Soc Am. 1988;83:2352–2360.
21. Titze I. Acoustic interpretation of the voice range profile (phonetogram). J Speech Hear Res. 1991;35:21–34.
22. Rothenberg M. Cosi fan tutte and what it means, or nonlinear source-tract acoustic interaction in the soprano voice and some implications for the definition of vocal efficiency. In: Baer T, Sasaki C, Harris K, eds. Laryngeal Function in Phonation and Respiration. Boston: College-Hill Press; 1987:254–269.
23. Bavegard M, Fant G, Gauffin J, Liljencrants J. Vocal tract sweeptone data and model simulations of vowels, laterals and nasals. KTH, Stockholm. Q Prog Status Rep. 1993;34:43–76.
24. Granqvist S, Hertegård S, Larsson H, Sundberg J. Simultaneous analysis of vocal fold vibration and transglottal airflow: exploring a new experimental set-up. J Voice. 1998;17:319–330.
25. Stevens KN. Acoustic Phonetics. Cambridge: MIT Press; 1998.
26. Gobl C, Mahshi J. Inverse filtering of nasalized vowels using synthesized speech. J Voice. 2013;27:155–169.
27. Rothenberg M. Cosi fan tutte and what it means or nonlinear source-tract acoustic interaction in the soprano voice and some implications for the definition of vocal efficiency. In: Baer T, Sasaki C, Harris KS, eds. Laryngeal Function in Phonation and Respiration (Proc. Vocal Fold Physiology Conf. 1985). San Diego: Singular Publishing Group; 1987:254–269.
28. Lim M, Lin E, Bones P. Vowel effect on glottal parameters and the magnitude of jaw opening. J Voice. 2004;20:46–54.
