Acoust. Sci. & Tech. 43, 3 (2022)
after recording each emotion many times, the singer selected the most appropriate version of each emotion by listening and comparing. When singing the major scale with structure notes, the singer could still adjust her performance in the first few whole tones, whereas the 8th whole tone was the most stable and the most emotionally expressive. This note was therefore selected for investigating the acoustic features, and the selected notes of the twenty-four kinds of emotions were at the same position with comparable intonation.

Procedures: For acoustic measurement, the last note at the end of the continuous scale was analysed. The vibrato was extracted using the spectrogram in PRAAT [12], with a sampling rate of 48 kHz and a Fast Fourier Transform (FFT) window size of 2,048 samples. If the difference between adjacent pitch peaks was larger than a predefined threshold (set to 6 Hz), the position was determined to exhibit vibrato. The vibrato extent formula of Zhang [13] was used for the calculation.

Figure 1 shows the F0 of “Anger” when singing the end of the note, where the positions of the note start, the first vibrato peak, and the last vibrato peak are illustrated. It also shows the vibrato interval, which is determined by the positions of the red spots. We analysed the characteristic quantities of the following seven acoustic features. Parentheses ( ) indicate units, and square brackets [ ] indicate abbreviations.

Note Duration (ms) [DURATION]: From the beginning to the end of a note.

Vibrato Rate (Hz) [RATE]: The vibrato rate was measured by identifying each complete vibrato period, defined by consecutive peaks; the total number of these periods was then divided by their total duration, giving a vibrato rate in hertz.

Vibrato Extent (cents) [EXTENT]: The vibrato extent was estimated by reading the difference between the frequencies of adjacent peaks in the continuous vibrato area. In Fig. 1, the positions of the red points are taken as the peaks (the mean values of vibrato rate and extent were measured from the first peak to the final peak of the vibrato cycle). In Eq. (1), pk is the extent value of each peak point, and J is the total number of peak values. The FM vibrato extent is expressed in cents.

$$\mathrm{Extent}_{FM} = \frac{1}{2}\,\frac{1}{J-1}\sum_{j=1}^{J-1} 1200\,\log_2\left|pk_{j+1} - pk_{j}\right| \qquad (1)$$

Vibrato Onset Delay (ms) [DELAY]: The delay was measured from the initiation of the note until the first conclusive peak of the vibrato cycle.

Mean Energy Intensity (dB) [INTENSITY]: Energy intensity listed in PRAAT is the RMS amplitude of the signal, which is related to (but different from) the perceived “loudness.” The mean value was measured from the beginning to the end of the note.

Mean Pitch (Hz) [PITCH]: The mean pitch value was measured from the beginning to the end of a note.

Spectral Centroid (Hz) [CENTROID]: It is calculated as the weighted mean of the frequencies present in the signal, where the weights are the normalized energy of each frequency component in that sub-band. It indicates the location of the centroid of the spectrum, which is related to the impression of brightness in perception.

3. Results

Table 1 shows the acoustic features of the twenty-four emotional voices. For vibrato, “RATE” had M = 5.45 Hz, SD = 0.75, n = 24, and “EXTENT” had M = 64.5 cents, SD = 30.90, n = 24; “RATE” ranges from 3.20 to 7.35 Hz, and “EXTENT” ranges from 20 to 151 cents. Spectral “CENTROID” (M = 1,566.37 Hz, SD = 6.64, n = 24) ranges from 923.6 to 2,110.9 Hz. The average “RATE” and “EXTENT” of vibrato are similar to those of previous studies [3–5]. Among them, the lower “RATE” and “EXTENT” values of “Serenity” (3.20 Hz, 20 cents) may be caused by the longer “DELAY” of vibrato (1,641 ms). As shown in Fig. 2, the vibrato for “Serenity” is unstable, and is irregular or non-existent.

The average note “DURATION” is 2,151 ms, and the average vibrato starting time (onset “DELAY”) is 595.5 ms. Notably, the note “DURATION” of the “Joy” segment is much longer (3,936 ms), and the corresponding vibrato starting time is 703 ms. In terms of the ratio of vibrato starting time to note duration, the ratio for “Joy” is 703/3,936 = 0.179, which means that, relative to the note duration, the vibrato of the “Joy” note began earlier than average; this could also be interpreted as the singer having prolonged the singing in the emotion of “Joy.” The ratio of average vibrato starting time to note duration is 0.269.

The data analysis examines the importance of the seven acoustic features across the twenty-four emotions using a principal component analysis (PCA). The biplot of the first and second principal components (varimax rotation) is shown in Fig. 3. The black points represent the distribution of each emotion, the arrows represent the seven features’ radiation directions, and the colour represents the degree of cos² (squared cosine, squared coordinates). The ellipses represent different groupings.

The first principal component has a variance of 2.37,
explaining 33.9% (2.37/7) of the total variance. The second
principal component has a variance of 1.95, explaining 27.9% (1.95/7) of the
total variance. More than 90.8% of the variance is contained
in the first four principal components.
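
As a worked check of the percentages above, the share of variance explained by a component can be sketched as its variance divided by the total variance. The sketch assumes that the seven features were standardized before the PCA, so that the total variance equals 7; this is suggested by the divisor used in the text but is not stated there. The function name is hypothetical.

```python
def explained_variance_percent(component_variance, n_features):
    """Percentage of total variance explained by one principal component.

    Assumption: features are standardized, so total variance == n_features.
    """
    if n_features <= 0:
        raise ValueError("n_features must be positive")
    return 100.0 * component_variance / n_features


# Variances of the first two components as reported in the text.
pc1 = explained_variance_percent(2.37, 7)
pc2 = explained_variance_percent(1.95, 7)
print(f"PC1: {pc1:.1f}%  PC2: {pc2:.1f}%  PC1+PC2: {pc1 + pc2:.1f}%")
```

Under this assumption the first two components together account for about 61.7% of the variance, which is the portion of the data that a two-dimensional biplot can display.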
Fig. 1 Vibrato interval from the emotion “Anger.”

The results show that “EXTENT-DELAY” is the first component. The positive direction of the horizontal axis is shown by “EXTENT,” whereas “DELAY” indicates the “Negative” direction. “INTENSITY” is the second component. “INTENSITY” and “CENTROID” were closely similar. The expression of “Joy, Sadness, Anger, Rage” can be observed to a more significant extent. The “INTENSITY” is
J.Y. LIU et al.: ACOUSTIC EXPRESSION OF EMOTIONAL SINGING
Table 1 One-note analyses of singer’s duration, vibrato onset delay, rate and extent, intensity, pitch, and spectral centroid
for twenty-four kinds of emotions, which can be divided into three groups as neutral, positive and negative from top to
bottom in the table.
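
The vibrato columns of Table 1 (DELAY, RATE and EXTENT) can be illustrated with a minimal sketch that starts from vibrato peaks that have already been detected. It is an illustration under stated assumptions, not the authors' PRAAT or Matlab procedure. The function names and inputs are hypothetical. RATE is taken as the number of complete peak-to-peak periods divided by their total duration. EXTENT is taken as half the mean absolute interval, in cents, between adjacent extrema (alternating peaks and troughs), which is one common reading of a half peak-to-trough extent.

```python
import math


def vibrato_rate_hz(peak_times_s):
    """Complete peak-to-peak periods divided by their total duration (Hz)."""
    if len(peak_times_s) < 2:
        raise ValueError("need at least two peaks to form one period")
    n_periods = len(peak_times_s) - 1
    return n_periods / (peak_times_s[-1] - peak_times_s[0])


def vibrato_extent_cents(extrema_hz):
    """Half the mean absolute interval (cents) between adjacent extrema.

    Assumption: extrema_hz alternates between F0 peaks and troughs.
    """
    if len(extrema_hz) < 2:
        raise ValueError("need at least two extrema")
    intervals = [
        abs(1200.0 * math.log2(nxt / cur))
        for cur, nxt in zip(extrema_hz, extrema_hz[1:])
    ]
    return 0.5 * sum(intervals) / len(intervals)


def onset_delay_ms(note_start_s, first_peak_s):
    """Time from note onset to the first vibrato peak, in milliseconds."""
    return (first_peak_s - note_start_s) * 1000.0


# Synthetic example: peaks every 0.2 s, F0 swinging +/- 50 cents around 440 Hz.
peak_times = [0.6, 0.8, 1.0, 1.2]
high = 440.0 * 2 ** (50 / 1200)
low = 440.0 * 2 ** (-50 / 1200)
extrema = [high, low, high, low, high]

rate = vibrato_rate_hz(peak_times)
extent = vibrato_extent_cents(extrema)
delay = onset_delay_ms(0.0, peak_times[0])
print(rate, extent, delay)
```

The synthetic values are chosen only to make the arithmetic easy to follow; they are not measurements from the study.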
stronger, and the “Neutral” group had a longer “DELAY” and a narrower “EXTENT.”

4. Discussion

Figure 3 illustrates the relationship between the seven acoustic features. It shows that “EXTENT” and “PITCH” were positively correlated, pointing in the same direction. In contrast, “DELAY” and the vibrato features (“EXTENT” and “RATE”) point in opposing directions. This indicates that when the vibrato onset “DELAY” was later, the “RATE” and “EXTENT” were smaller.

The method applied in this study, which collects the voice of only one singer or a few singers, is far from sufficient to establish how emotion is expressed vocally. However, analysing the same singer makes it easier to control the variables and identify the differences in the acoustic characteristics.

5. Conclusion

Our study analysed twenty-four emotions expressed by one vocalist in recordings of a single note with vibrato. In addition, an acoustic analysis of the vocalist’s singing during the performance was carried out to study the vibrato parameters of the singer at the same pitch and pronunciation, as well as the differences between the acoustic features.

The analysis conducted in this study led to the following conclusions:

1. Through the analysis of our study, it was found that when a singer expresses different emotions at the same pitch,

References

[1] M. Sherman, “Emotional character of the singing voice,” J. Exp. Psychol., 11, 495–497 (1928).
[2] J. Sundberg and T. D. Rossing, “The science of the singing voice,” J. Acoust. Soc. Am., 87, 462–463 (1990).
[3] C. E. Seashore, The Vibrato, Studies in the Psychology of Music (University of Iowa, Iowa City, 1932), pp. 30–37.
[4] M. Baroni and L. Finarelli, “Emotions in spoken language and in vocal music,” Proc. 3rd Int. Conf. Music Perception and Cognition, pp. 343–345 (1994).
[5] E. Prame, “Measurement of the vibrato rate of ten singers,” J. Acoust. Soc. Am., 96, 1979–1984 (1994).
[6] R. Miller, Singing Schumann: An Interpretive Guide for Performers (Oxford University Press, Oxford, 1999).
[7] D. Katok, The Versatile Singer: A Guide to Vibrato & Straight Tone (City University of New York, New York, 2016).
[8] J. Sundberg, “Acoustic and psychoacoustic aspects of vocal vibrato,” in Vibrato, P. Dejonckere, M. Hirano and J. Sundberg, Eds. (Singular Publishing, San Diego, 1995), pp. 35–62.
[9] T. Saito and T. Nakamura, “Hierarchical structure of the categories of Japanese emotion,” Kyushu Univ. Psychol. Res., 4, 95–99 (2003).
[10] D. L. Robinson, “Brain function, emotional experience and personality,” Neth. J. Psychol., 64, 152–167 (2009).
[11] L. Johnson-Read, “Performing lieder: Expert perspectives and comparison of vibrato and singer’s formant with opera singers,” J. Voice, 29, 645.e15–32 (2015).
[12] P. Boersma, “PRAAT, a system for doing phonetics by computer,” Glot Int., 5, 341–345 (2002).
[13] M. Zhang, “A Matlab-based signal processing toolbox for characterization and analysis of musical vibrato,” J. Audio Eng. Soc., 65, 408–422 (2017).