
ARTICLE IN PRESS

Long-Term Average Spectrum in Screening of Voice Quality in Speech: Untrained Male University Students

Timo Leino

Summary: Voice quality has mainly been studied in trained speakers, singers, and dysphonic patients. Few studies have concerned ordinary untrained university students’ voices. In light of earlier studies of professional voice users, it was hypothesized that good, poor, and intermediate voices would be distinguishable on the basis of long-term average spectrum characteristics. In the present study, voice quality of 50 Finnish vocally untrained male university students was studied perceptually and using long-term average spectrum analysis of text reading samples of one minute duration. Equivalent sound level (Leq) of text reading was also measured. According to the results, the good and ordinary voices differed from the poor ones in their relatively higher sound level in the frequency range of 1–3 kHz and a prominent peak at 3–4 kHz. Good voices, however, did not differ from the ordinary voices in terms of the characteristics of the long-term average spectrum (LTAS). The strength of the peak at 3–4 kHz and the voice-quality scores correlated weakly but significantly. Voice quality and alpha ratio (level difference above and below 1 kHz) correlated likewise. Leq was significantly higher in the students with good and ordinary voices than in those with poor voices. The connections between Leq, voice quality, and the formation of the peak at 3–4 kHz warrant further studies.
Key Words: Voice quality—LTAS.

Accepted for publication March 26, 2008.
Department of Speech Communication and Voice Research, University of Tampere, Tampere, Finland.
Corresponding author. Department of Speech Communication and Voice Research, University of Tampere, Kalevantie 4, FIN-33014 Tampere, Finland. E-mail: timo.leino@uta.fi
Journal of Voice, Vol. -, No. -, pp. -
0892-1997/$34.00
© 2008 The Voice Foundation
doi:10.1016/j.jvoice.2008.03.008

INTRODUCTION
Sound quality is defined as “the attribute of sound that makes two sounds of the same pitch and loudness dissimilar.”1,2 By definition, then, quality, loudness, and pitch are distinct perceptual attributes even though a change in one may affect another. The physical correlate of sound quality is sound energy distribution along the frequency range. Long-term average spectrum (LTAS) analysis provides a means of viewing the average frequency distribution of the sound energy in a continuous speech sample. The method yields information of the general voice quality, if the duration of the sample is long enough to avoid the effects of individual speech sounds. Li et al3 regard a sample of 30–40 seconds long enough to study voice quality, whereas Majewski et al4 recommend a sample of at least 1 minute. LTAS has been used to study differences between normal and pathological voices5–8 and between various vocal pathologies.9–13 Among normal-voiced subjects LTAS has been used to study individual and gender, age and language related differences.4,14–20 Characteristics of vocal expression of emotions21,22 and acoustic differences between specific voice qualities23–25 have also been studied with this method. Moreover, LTAS has been used to investigate singing voice,


for example, differences between voice categories and various singing styles.26,27
The suitability of LTAS for vocal screening has also been questioned. In the study by Kitzing,24 the differences between the LTAS of perceptually very different voice qualities were in the magnitude of only 3%–6%. Nor did Nolan find LTAS to discriminate satisfactorily between various articulatory settings.23 He concluded that LTAS may show glottal characteristics of voice quality better than resonatory ones. Löfqvist and Mandersson28 found substantial differences in the LTAS recorded from the same subjects during the same day, which they supposed to impair the usefulness of LTAS for screening purposes. However, their recordings were made before and after a working day requiring a lot of speaking. Substantial differences in LTAS have been reported after vocal loading.29–31 Encouraging results on the suitability of LTAS for differentiation of various voice qualities have been reported for both normal and disordered voices.11,13,25
To obtain some objectivity in setting goals for vocal training and for evaluating its effects, Leino used LTAS to study the differences between good and poor speaking voices of vocally trained professional male actors.32,33 According to the results, LTAS distinguished between good, fairly good, rather poor, and poor voice qualities. Poor speaking voices had relatively lower sound level in the frequency range of 1–3 kHz, and good voice quality differed from fairly good and rather poor voice qualities by having a more prominent peak between 3 and 4 kHz. This peak seemed to be a kind of an actor’s formant (cf. singer’s formant34). Perceptual evaluation of voice quality correlated with the level of the peak.
The present study investigates the voice quality of normal-voiced, vocally untrained male university students.

MATERIALS AND METHODS
Subjects and recording
Text reading samples of 50 Finnish male normal-voiced vocally untrained university students (20–27 years, mean 22 years; no acting students included) were recorded in the same sound-treated studio using a digital recorder and a Bruel and Kjaer microphone (4165) at a distance of 40 cm from the subject’s mouth. The speakers were asked to read the same prose extract in their habitual way, using natural loudness, that is, neither too soft nor with extra effort, and with neutral expression; thus, no artistic performances were allowed. No external monitoring was provided to control for F0 or sound level, because this would have impaired the naturalness of the speech. The recordings were perceptually monitored by the researchers, and the subjects were asked to repeat the task if some of the requirements reported above were not met. The duration of each sample was over one minute. The samples were calibrated for sound pressure level measurements using a sine wave generator and a level meter (Bruel and Kjaer Frequency Analyzer 2120).

Listening evaluation
The reading samples were evaluated by two experienced professional voice trainers and five students of vocology. The listeners rated the voice quality using a scale −3, −2, −1, 0, +1, +2, +3 (−3 = very poor, +3 = excellent). All listeners had training in perceptual analysis of voices, which had been given in the same department. According to the terminology used in the perceptual training, voice quality is one of the three basic characteristics of voice together with pitch and loudness. Listening tests were carried out in a sound-treated studio. The samples were played back with a high-quality loudspeaker (Genelec Biamp 1019A). The listeners participated in the test in two groups to allow each listener to be seated sufficiently close (at 2 m) to the loudspeaker, directly in front of it. The whole 1-minute samples were presented. Leq was set to be the same in each sample on the listening tape. The Intelligent Speech Analyser (ISA) signal analysis system, developed by Raimo Toivonen, M.Sc. Eng., was used in this standardization. First the true Leq of each sample was measured. Then one of the samples was chosen as the reference, and the Leq of all the other samples was set to be the same as that of the reference by increasing or decreasing their level by the amount needed.

Spectrum analysis
LTAS was made with a real-time spectrum analyzer (Hewlett-Packard signal analyzer 3561A). A


400-point linear narrow band fast Fourier transform (FFT) analysis was used. Earlier findings33 have shown that broadband analysis does not differentiate sufficiently between voice qualities.
The frequency span was 10 kHz; fast display mode was used since it provides real-time registration up to 7.5 kHz. Voiceless segments were left out of the analysis by using an analog s-gate. Pauses were excluded by using the amplitude trigger of the analyzer. The time record length was 40 milliseconds, and the overlap was 15%. The display resolution was 25 Hz. The flat top window with the frequency bandwidth of 98 Hz was used because it is practically ripple free (<0.01 dB) in the passband, and thus gives the best amplitude accuracy. This characteristic makes the flat top window suitable, for instance, for calibration purposes.
The LTAS were normalized for the strongest peak and compared to each other according to the level differences between frequency ranges. The fact that relative levels were compared instead of absolute levels made the subjects comparable to each other regardless of the absolute Leq values of their samples. In addition, the level difference between the first formant region and the fundamental frequency range was measured by the subtraction L1–L0 (see Figure 1). This level difference has been found to correlate negatively with the perception of breathiness (hypofunctional voice production35). That level difference may also be affected by intensity and F0. In addition to LTAS measurements, alpha ratio9 was also calculated with Intelligent Speech Analyser by subtracting the Leq of the range of 50 Hz–1 kHz from that of the range of 1–5 kHz.
The text reading samples were divided into different voice quality classes according to the mean points given in the listening evaluation (poor: −3 to −1.2; good: +0.8 to +2.6; intermediate: −0.8 to +0.4). The statistical significance of the differences between the voice quality groups was studied by carrying out Student’s unpaired t-tests and nonparametric Mann–Whitney U tests. To illustrate the results, the LTAS of individual samples were averaged over voice quality classes with a custom-made computer program (developed by Heikki Alatalo, DSP-Systems).

FIGURE 1. Schematic picture of measurement of the level difference L1–L0 between the F1 variation range (L1) and the F0 variation range (L0) in LTAS. In this case the difference is +2.5 dB.

RESULTS
Test results
Interrater and intrarater reliability of the listeners’ evaluations was satisfactory (Cronbach’s alpha = 0.89, Spearman’s correlation r = 0.75, P < 0.001, respectively). There was a positive correlation between the points given for voice quality and the average sound energy level in different frequency ranges (Spearman correlation coefficient r = 0.39, P = 0.005 for 1–2 kHz, r = 0.40, P = 0.004 for 2–3 kHz, and r = 0.33, P = 0.018 for 3–4 kHz). The L1–L0 difference did not correlate significantly with the perceptual evaluation. Alpha ratio also correlated with voice quality (r = 0.31, P = 0.032).
The average points given in the listening evaluation allowed a grouping of the voice samples into 14 good voices, 14 poor voices, and 22 intermediate voices. Figure 2 compares the average LTAS of the students’ good, poor, and intermediate voices. The poor voices differed from the good voices by having relatively less sound energy at the range of 1–3 kHz. The good voices also had a more prominent peak at 3–4 kHz. The intermediate voices did not differ from the good ones in terms of the LTAS characteristics. The analyses were made using a 10 kHz frequency range, but in figures the spectra are compared in the range of 0–5 kHz as no significant differences were found above 5 kHz.
Table 1 shows the statistical results. It can be seen that the difference between the good and the


poor voices was significant in all ranges studied except for 2–3 kHz and 4–5 kHz. The difference between the good and the intermediate voices, instead, was nonsignificant in all ranges. The L1–L0 level difference did not distinguish the three voice quality groups either (good voices: mean 0.64 dB, SD 3.6; poor voices: mean −2.11 dB, SD 3.6; intermediate voices: mean 0.27 dB, SD 4.2; differences between the groups were nonsignificant).
Table 2 summarizes the results of Leq measurements. Students with good voices had significantly higher Leq than those with poor voices, while students with intermediate voices did not differ significantly from those with good voices.

FIGURE 2. Average LTAS for the male university students’ good voices (N = 14, dark line), poor voices (N = 14, grey line), and intermediate voices (N = 22, dotted line).

DISCUSSION
Differences in the LTAS of the voice quality groups among students were basically similar to those reported earlier for professional actors.33 Good voices were characterized by a less steep spectral slope and a more prominent peak at 3–4 kHz. This peak, however, was not as prominent in the good students’ voices as in the best voices of the actors.33 For the students the level of the 3.5 kHz peak was on average −28 dB relative to the strongest peak in the LTAS. According to earlier results, the level of this peak was on average −20 dB for the best voices of the actors. The correlation between the strength of the peak and the perceptual voice quality was low in students, whereas according to earlier results it was moderate in actors.33 Good voices did not differ from the ordinary voices among the students. This acoustic finding is in accordance with voice trainers’ perception that clearly good voices are quite rare among vocally untrained university students. Among young students this may be due to social pressure not to be different from the others.
LTAS illustrates both phonatory and resonatory characteristics. However, Nolan23 has found that LTAS is more capable of distinguishing between phonatory than resonatory differences in voice

TABLE 1. Differences between the average LTAS and alpha ratio (level difference above and below 1 kHz: Leq (1–5 kHz) − Leq (50 Hz–1 kHz)) of good voices (N = 14), poor voices (N = 14), and intermediate voices (N = 22) of Finnish male university students (N = 50 in total)

                      0–1 kHz     1–2 kHz     2–3 kHz     3–4 kHz     4–5 kHz     Alpha
Good
  Mean                −7.6 dB     −18.5 dB    −27.5 dB    −32.2 dB    −42.4 dB    −11.3
  SD                  1.8         3.3         4.0         4.1         2.8         (−15.3 to −8.1)
Poor
  Mean                −9.8 dB     −22.0 dB    −30.0 dB    −36.5 dB    −44.5 dB    −13.3
  SD                  2.1         3.5         2.5         4.1         5.0         (−18.3 to −8.7)
Intermediate
  Mean                −8.3 dB     −18.2 dB    −26.5 dB    −31.6 dB    −42.3 dB    −10.5
  SD                  2.1         4.3         5.7         6.0         3.8         (−15.5 to −5.9)
Significance of differences
  Good/poor           P = 0.006   P = 0.012   NS          P = 0.015   NS          P = 0.043
  Good/intermediate   NS          NS          NS          NS          NS          NS

Mean and standard deviations are given for the averages of the normalized amplitude values in different frequency ranges. Statistical significance: Mann–Whitney U test.
NS = nonsignificant = P > 0.05.
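
The band levels and the alpha ratio reported in Table 1 can be illustrated with a minimal sketch. It is not the analyzer or ISA procedure used in the study: it assumes a digitized signal, uses Welch averaging from SciPy with the settings named in the Spectrum analysis section (40 ms time record, 15% overlap, flat top window), and omits the gating of voiceless segments and pauses. The function names and the synthetic test signal are hypothetical.

```python
import numpy as np
from scipy.signal import welch


def ltas(x, fs, record_s=0.040, overlap=0.15):
    """Long-term average power spectrum via Welch averaging (assumed stand-in)."""
    nperseg = int(round(record_s * fs))         # 40 ms time record
    noverlap = int(round(overlap * nperseg))    # 15 % overlap
    freqs, power = welch(x, fs=fs, window="flattop", nperseg=nperseg,
                         noverlap=noverlap, scaling="spectrum")
    return freqs, power


def normalized_level_db(power):
    """Level in dB relative to the strongest spectral peak (peak = 0 dB)."""
    power = np.maximum(power, 1e-30)            # guard against log10(0)
    return 10.0 * np.log10(power / power.max())


def band_mean_db(freqs, level_db, lo_hz, hi_hz):
    """Mean of the normalized dB values in the band [lo_hz, hi_hz)."""
    mask = (freqs >= lo_hz) & (freqs < hi_hz)
    return float(level_db[mask].mean())


def alpha_ratio_db(freqs, power, lo_hz=50.0, split_hz=1000.0, hi_hz=5000.0):
    """Level of the 1-5 kHz band minus level of the 50 Hz-1 kHz band."""
    low = power[(freqs >= lo_hz) & (freqs < split_hz)].sum()
    high = power[(freqs >= split_hz) & (freqs <= hi_hz)].sum()
    return float(10.0 * np.log10(high / low))


# Hypothetical test signal: a strong 200 Hz tone, a 2 kHz tone 20 dB weaker,
# and a little noise. With fs = 20 kHz the spectrum spans 0-10 kHz in 25 Hz steps.
fs = 20000
t = np.arange(2 * fs) / fs
rng = np.random.default_rng(0)
signal = (np.sin(2 * np.pi * 200 * t) + 0.1 * np.sin(2 * np.pi * 2000 * t)
          + 1e-3 * rng.standard_normal(t.size))

freqs, power = ltas(signal, fs)
levels = normalized_level_db(power)
for lo, hi in [(0, 1000), (1000, 2000), (2000, 3000), (3000, 4000), (4000, 5000)]:
    print(f"{lo}-{hi} Hz: {band_mean_db(freqs, levels, lo, hi):.1f} dB")
print(f"alpha ratio: {alpha_ratio_db(freqs, power):.1f} dB")
```

Because every level is expressed relative to the strongest peak, the band means are negative, and a less steep spectral slope shows up as band means and an alpha ratio closer to zero.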


TABLE 2. Differences between Leq of the students with good voices (N = 14), poor voices (N = 14), and intermediate voices (N = 22)

            Good        Intermediate    Poor
Mean        72.8 dB     70.9 dB         67.7 dB
SD          2.7         3.6             4.0

Significance of differences
  Good/poor            P = 0.001
  Good/intermediate    NS
  Intermediate/poor    P = 0.019

Statistical significance: Mann–Whitney U test.
NS = nonsignificant = P > 0.05.
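
The Leq measurement, the level matching of the listening samples, and the group comparison behind Table 2 can be sketched as follows. This is an illustration under stated assumptions, not the ISA implementation: Leq is taken as the mean-square level of a digitized sample relative to an arbitrary reference, and the group values in the example are hypothetical numbers, not the study's data. The function names are also hypothetical; `mannwhitneyu` is the SciPy routine.

```python
import numpy as np
from scipy.stats import mannwhitneyu


def leq_db(x, ref=1.0):
    """Equivalent continuous level: 10*log10 of the mean square re ref**2."""
    return float(10.0 * np.log10(np.mean(np.square(x)) / ref ** 2))


def match_leq(x, target_db, ref=1.0):
    """Scale x so that its Leq equals target_db (raise or lower as needed)."""
    gain_db = target_db - leq_db(x, ref)
    return x * 10.0 ** (gain_db / 20.0)


def compare_groups(a, b):
    """Two-sided Mann-Whitney U test; returns the P value."""
    return float(mannwhitneyu(a, b, alternative="two-sided").pvalue)


# Level matching: one sample is the reference, the other is brought to its Leq.
t = np.arange(8000) / 8000.0
reference = np.sin(2 * np.pi * 100 * t)
quieter = 0.5 * reference                      # about 6 dB below the reference
matched = match_leq(quieter, leq_db(reference))
print(round(leq_db(reference) - leq_db(quieter), 2),
      round(leq_db(reference) - leq_db(matched), 2))

# Group comparison on hypothetical Leq values in dB (illustration only).
good = [72.0, 73.5, 74.1, 75.0, 71.8, 73.0]
poor = [66.2, 67.9, 68.4, 69.0, 65.5, 70.1]
print(compare_groups(good, poor) < 0.05)
```

The group means in Table 2 give the differences quoted in the Discussion directly: 72.8 − 70.9 = 1.9 dB between good and intermediate voices, and 72.8 − 67.7 = 5.1 dB between good and poor voices.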

qualities. The level differences between frequency ranges in the LTAS reflect the slope of the source spectrum, which in turn is related to the glottal closing speed. Increased closing speed gives a less tilting slope.36 The spectral tilt is known to decrease as vocal intensity and effort increase. This is due to the fact that the higher overtones gain more in intensity than the lower ones when the overall intensity is increased.37 In the present study the students with good and ordinary voices used somewhat more intense voice than the students with poor voices (mean difference 1.9 dB between good and intermediate voices and 5.1 dB between good and poor voices).
Sound level and voice quality seem to be interrelated. In general, sound level tends to be somewhat higher after vocal exercising or a longer voice training period.38,39 This may suggest that subglottic pressure is higher after training. We may ask whether the only difference between speakers with good voices and those with not so good voices is related to the habitual loudness of speech, and could thus be abolished by merely asking the speakers with poor voices to speak more loudly. On the other hand, somewhat higher sound level could also be a sign of improved vocal function per se. For instance, there are results suggesting that the glottal spectrum tilts less for trained singers than for untrained subjects.38 This could result from a higher glottal closing speed or a more favorable input impedance of the vocal tract, for example, epilaryngeal narrowing.40 A faster glottal closing speed could show a more adequate adduction and/or improved mobility of the vocal folds. The latter could possibly result from a more economic activity ratio between cricothyroid and thyroarytenoid muscles.41 The strength of the 3–4 kHz peak compared to the strongest spectral peak (F0 or F1) is naturally to some extent related to overall spectral slope, but the prominence of that peak in the good voices of the actors does not seem to be merely a result of a less tilting spectral slope. According to the earlier studies by Leino,33 it was not the level differences between frequency ranges but the prominence of the 3.5 kHz peak in the average LTAS that distinguished the actors’ good voices from the fairly good and rather poor voices. Below 3 kHz the relative sound level of the rather poor voices was the same and that of the fairly good voices was even higher than that of the good voices. In individual LTAS, it has been possible to see that the peak at 3.5 kHz, the actor’s formant, can be almost as strong or in some cases even stronger than the preceding peak in the range of 2–3 kHz.42 This suggests that the peak is not only a result of strong voice source overtones in general but of a special resonance type, most likely related to the actors’ technique of “projecting.” The region above 3 kHz is supposed to be more prone to show resonatory than phonatory characteristics.24 The results of Hurme43 also showed that when the speakers increased voice intensity (55–65–75 dB) in text reading (30 seconds), the level differences between frequency ranges in the LTAS (strongest peak set as zero) changed substantially more below 3 kHz than above it. At 4 kHz the change was nonsignificant. This was also demonstrated in Nordenberg and Sundberg.44
The clear valleys separating the 3.5 kHz peak from its surroundings suggest that it is formed by a formant or a cluster of formants. This peak may basically be related to the fourth formant.


According to the results of Iivonen and Laukkanen,45 the fourth formant for a male speaker of Finnish was located at the range 3,360–3,800 Hz depending on both the vowel and its duration. Thus, the vowel-related changes in the frequency of F4 were only minor. This suggests that F4 is not very heavily dependent on articulation and thus most likely is not very language dependent either. A fairly strong peak at 3.5 kHz can also be seen in the LTAS of speakers of different languages: Dejonckere published such an LTAS in a French speaking subject,6 Frøkjær-Jensen and Prytz9 obviously from samples in Danish and Nolan from a sample in English.23 Nawka found this peak in good voices among German speakers.46 This peak, however, is not an absolute prerequisite of a good voice quality, and sometimes it can also be seen in not so good voices,47 for example, in those with vocal fry.23
The level difference L1–L0 has been found to correlate negatively with the perception of breathiness (hypofunctional voice production35). Furthermore, the level difference has been found to rise after voice training while the perceptual voice quality also became “tighter” (ie, more hyperfunctional).39 According to the results of the present study, the L1–L0 difference was on average positive for the good and ordinary voices and negative for the poor voices. This suggests that the relative level of the F0 was lower and voice production thus most likely tighter in the students with good and ordinary voices compared to those with poor voices. This may be related to greater loudness or otherwise tighter adduction. Bele has also found a greater L1–L0 and stronger sound level in actors compared to teachers.47 In the present study, however, the L1–L0 difference did not differentiate significantly between voice qualities of normal voiced untrained students. This may be due to the fact that a not so good voice may be either more hypofunctional or more hyperfunctional than the optimum, and thus at group level the differences in this parameter will cancel each other out.

CONCLUSIONS
In vocally untrained male students’ speaking samples, those which were perceived as representing a good voice quality differed from those perceived as representing a poor voice quality by having a relatively higher sound level between 1 and 3 kHz in the LTAS and a more prominent peak at 3–4 kHz. Good voices did not differ from intermediate voices. Leq was significantly lower in samples evaluated as poor in voice quality. The level difference between the regions of the fundamental and the first formant did not differentiate between voice qualities.
The peak at 3–4 kHz (actor’s formant) cannot result from louder voice alone but seems to imply some resonatory phenomena as well. The effects of Leq and resonance characteristics on the LTAS characteristics of trained and untrained “good voices” need to be studied further.

Acknowledgments: The author is grateful to Mr. Jussi Helin for technical assistance in the analyses and to Mrs. Virginia Mattila, M.A., for language correction of the manuscript. Professors Johan Sundberg and Anne-Maria Laukkanen are acknowledged for their valuable comments.

REFERENCES
1. ANSI. USA standard: Acoustical terminology (S1.1). New York: American National Standards Institute, Inc.; 1960.
2. Titze IR. Principles of voice production. Englewood Cliffs, N.J: Prentice Hall; 1994.
3. Li K-P, Hughes GW, House AS. Correlation characteristics and dimensionality of speech spectra. J Acoust Soc Am. 1969;46:1019–1025.
4. Majewski W, Rothman H, Hollien H. Acoustic comparisons of American English and Polish. J Phon. 1977;5:247–251.
5. Gauffin J, Sundberg J. Clinical applications of acoustic voice analysis: acoustical analysis, results and discussion. In: Buch NH, ed. Proceedings of the International Association of Logopedics and Phoniatrics Congress 15–18 August 1977, Copenhagen, Denmark. Herning: Organizing Committee of the Congress; 1978:489–502.
6. Formby C, Monsen RB. Long-term average speech spectra for normal and hearing-impaired adolescents. J Acoust Soc Am. 1982;71:196–202.
7. Hurme P, Sonninen A. Normal and disordered voice quality: listening tests and long-term spectrum analyses. In: Hurme P, ed. Papers in Speech Research, Vol. 6. Jyväskylä: Department of Communication, University of Jyväskylä; 1985:49–72.
8. Dejonckere PH. Acoustic analysis of voice production. Essai de synthèse dans une optique clinique. Acta Oto-Rhino-Laryngologica Belgica. 1986;40:377–385.


9. Frøkjær-Jensen B, Prytz S. Registration of voice quality. Brüel Kjær Tech Rev. 1976;3:3–17.
10. Prytz S. Long-time-average-spectra (LTAS) analyses of normal and pathological voices. In: Buch NH, ed. Proceedings of the International Association of Logopedics and Phoniatrics Congress 15–18 August 1977, Copenhagen, Denmark. Herning: Organizing Committee of the Congress; 1978:459–475.
11. Wendler J, Doherty ET, Hollien H. Voice classification by means of long-term speech spectra. Folia phon. 1980;32:51–60.
12. Hammarberg B, Fritzell B, Gauffin J, Sundberg J, Wedin L. Perceptual and acoustic correlates of abnormal voice qualities. Acta otolaryngologica. 1980;90:441–451.
13. Dejonckere PH, Villarosa D. Long-term spectrum analysis of the voice. Comparaison de voix normales et de voix altérées par différentes catégories de pathologies laryngées. Acta Oto-Rhino Laryngologica Belgica. 1986;40:426–435.
14. Benson R, Hirsh I. Some variables in audio spectrometry. J Acoust Soc Am. 1953;25:499–505.
15. Pruzansky S. Pattern matching procedure for automatic talker recognition. J Acoust Soc Am. 1963;35:354–358.
16. Niemøller A, McCormick L, Miller J. On the spectrum of spoken English. J Acoust Soc Am. 1974;55:461.
17. Kiukaanniemi H, Siponen P, Mattila P. Individual differences in the long-term speech spectrum. Folia phoniatrica. 1982;34:21–28.
18. Harmegnies B, Landercy A. Intra-speaker variability of the long term speech spectrum. Speech Commun. 1988;7:81–86.
19. Pavlovic V, Rossi M, Espesser R. Statistical distributions of speech of various languages. J Acoust Soc Am. 1990;88(Suppl 1):176.
20. Byrne D, Dillon H, Tran K, et al. An international comparison of long-term average speech spectra. J Acoust Soc Am. 1994;96:2108–2120.
21. Williams CE, Stevens KN. Emotions and speech: some acoustical correlates. J Acoust Soc Am. 1972;52:1238–1250.
22. Pittam J, Gallois C, Callan V. The long-term spectrum and perceived emotion. Speech Commun. 1990;9:177–187.
23. Nolan F. The Phonetic Bases of Speaker Recognition. Cambridge, MA: Cambridge University Press; 1983.
24. Kitzing P. LTAS criteria pertinent to the measurement of voice quality. J Phon. 1986;14:477–482.
25. Pittam J. Discrimination of five voice qualities and prediction to perceptual ratings. Phonetica. 1987;44:38–49.
26. Dmitriev L, Kiselev A. Relationship between the formant structure of different types of singing voices and the dimension of supraglottal cavities. Folia phoniatr. 1979;31:238–241.
27. Rossing T, Sundberg J, Ternström S. Acoustic comparison of voice use in solo and choir singing. J Acoust Soc Am. 1986;79:1975–1981.
28. Löfqvist A, Mandersson B. Long-time average spectrum of speech and voice analysis. Folia Phoniatr. 1987;39:221–229.
29. Novak A, Dlouha O, Capoca B, Vohradnik M. Voice fatigue after theatre performance in actors. Folia Phoniatr. 1991;43:74–78.
30. Rantala L, Paavola L, Körkkö P, Vilkman E. Working-day effects on the spectral characteristics of teaching voice. Folia Phoniatr Logop. 1998;50:205–211.
31. Jónsdottir V, Laukkanen A-M, Siikki I. Changes in teachers’ voice quality during a working day with and without electric sound amplification. Folia Phoniatr Logop. 2003;55:267–280.
32. Leino T. The spectral characteristics of good voice. In Finnish. Licentiate Thesis in Logopedics. Helsinki: University of Helsinki, Department of Phonetics; 1976.
33. Leino T. Long-term average spectrum study on speaking voice quality in male actors. In: Friberg A, Iwarsson J, Jansson E, Sundberg J, eds. SMAC93, Proceedings of the Stockholm Music Acoustics Conference, July 28–August 1, 1993, No 79. Stockholm: The Royal Swedish Academy of Music; 1994:206–210.
34. Sundberg J. Articulatory interpretation of the singing formant. J Acoust Soc Am. 1974;55:838–844.
35. Laukkanen A-M, Vintturi J, Vilkman E, Sala E, Siikki I, Lukkarila P. Perceptual, acoustic and self-reported correlates of vocal loading. In: Proceedings of the XXVth world congress of the International Association of Logopedics and Phoniatrics in Montreal.
36. Gauffin J, Sundberg J. Data on the glottal voice source behavior in vowel production. In: Speech Transmission Laboratory, Quarterly Progress and Status Report, 2–3. Stockholm: Royal Institute of Technology; 1980:61–70.
37. Fant G. Speech Sounds and Features. Cambridge, MA: The MIT (Massachusetts Institute of Technology) Press; 1973.
38. Wedin S, Leanderson R, Wedin L. Evaluation of voice training. In: Buch NH, ed. Proceedings of the International Association of Logopedics and Phoniatrics Congress 15–18 August 1977, Copenhagen, Denmark. Herning: Organizing Committee of the Congress; 1978:361–381.
39. Laukkanen A-M, Syrjä T, Laitala M, Leino T. Effects of two-month vocal exercising with and without spectral biofeedback on student actors’ speaking voice. Log Phon Vocol. 2004;29:66–76.
40. Titze IR, Story BH. Acoustic interactions of the voice source with the lower vocal tract. J Acoust Soc Am. 1997;101(4):2234–2243.
41. Titze IR, Talkin DT. A theoretical study of the effects of various laryngeal configurations on the acoustics of phonation. J Acoust Soc Am. 1979;66(1):60–74.
42. Leino T, Kärkkäinen P. On the effects of vocal training on the speaking voice quality of male student actors. In: Elenius K, Branderud P, eds. Proceedings of the XIIIth International Congress of Phonetic Sciences, Stockholm, Sweden 13–19 August, 1995, Vol. 3 of 4. Stockholm: Department of Speech Communication and Music Acoustics, Royal Institute of Technology and the Department of Linguistics, Stockholm University; 1995:496–499.


43. Hurme P. Acoustic Studies of Voice Variation. Doctoral thesis. In: Jyväskylä Studies in Communication, 7. Jyväskylä: University of Jyväskylä; 1996.
44. Nordenberg M, Sundberg J. Effect on LTAS of vocal loudness variation. Logop Phoniatr Vocol. 2004;29:183–191.
45. Iivonen A, Laukkanen A-M. Explanations of the qualitative variation of Finnish vowels. In: Iivonen A, Lehtihalmes M, eds. Studies in Logopedics and Phonetics 4, Series B: Phonetics, Logopedics and Speech Communication, 5. Helsinki: Department of Phonetics, University of Helsinki; 1993:29–54.
46. Nawka T, Anders LC, Cebulla M, Zurakowski D. The speaker’s formant in male voices. J Voice. 1997;11(4):422–428.
47. Bele IV. Professional Speaking Voice: A Perceptual and Acoustic Study of Male Actors’ and Teachers’ Voices. Doctoral dissertation. University of Oslo; 2002.
