
Biomedical Signal Processing and Control 62 (2020) 102148


Biomedical Signal Processing and Control


journal homepage: www.elsevier.com/locate/bspc

Lung volume affects the decay of oscillations at the end of a vocal emission
P.H. DeJonckere a,*, J. Lebacq b
a Federal Agency for Occupational Risks, Brussels, Belgium
b Institute of Neurosciences, University of Louvain, Brussels, Belgium

ARTICLE INFO

Keywords: Lung volume; Damping; Vocal folds; Photoglottography; Fundamental frequency

ABSTRACT

At the end of a vocal emission, when the voicing is not interrupted by a laryngeal closure, a damped oscillatory motion of each vocal fold can be observed after the last contact phase of the two fold edges on the midline. It can be precisely analysed using a measure of transglottal light intensity (photoglottography). During modal phonation, the vocal oscillator mainly comprises two components: the vocal folds themselves and the vibrating air mass. A simple calculation suggests that the internal air mass set into vibration is larger than the vocal fold mass. In order to investigate the effect of the vibrating air mass, a voicing protocol was devised for validly measuring and comparing damping characteristics in two conditions: at high and at low lung volume, ceteris paribus. Glottal area, intraoral pressure, electroglottogram and sound were recorded simultaneously. The voicing protocol consisted of series of fast repetitions (3–4 s⁻¹) of the vowel /ε/, each vocalization being followed by an abrupt bilabial occlusion with complete airflow interruption. The average difference in lung volume between the two conditions is approximately 2410 mL. The results show that the decay of vocal fold oscillation is influenced by the amount of lung air that is set into oscillation. A reduction of the air volume leads to a significant increase in the rate of decay; thus voicing at low lung volume requires more energy, which is of importance for voice hygiene.

1. Introduction

At the end of a vocal emission, when the voicing is not interrupted by a laryngeal closure and the airway remains open, a damped oscillatory movement of each vocal fold (VF) can be observed after the last contact phase of the two fold edges on the midline. This phenomenon results from frictional forces, which cause a decrease of the energy content of the oscillating system, reducing the amplitude of the oscillations as soon as the driving force disappears [1]. The transition is very brief and obviously occurs at a level beyond the scrutiny of traditional videolaryngostroboscopic examination, but it can be followed precisely using a measurement of the intensity of transglottal light (photoglottography) [1–3]. The damping characteristics constitute an important mechanical property of the oscillating system, particularly related to the energy required for voice production. In concrete terms, the amplitude decrement from cycle to cycle reflects the energy input required to maintain a steady state oscillation, since in the steady state situation the work input from the driving source (lung pressure) exactly compensates for the energy lost in friction.

So, a first question is: what does the oscillating system actually consist of? The few existing experimental data about the damping characteristics of the vocal folds outside a phonation context, either in vivo [4] or in excised larynges [5,6], indicate a high damping ratio after an external impulse (oscillation stops after 2 cycles). This strongly contrasts with observations of phonation offsets, as recorded e.g. using high speed film: Fig. 1 shows a videokymogram (single line scan) at four levels (from ventral to dorsal) of the vibrating glottis obtained from high speed video [1]. The recording was made at the end of a sustained /a:/ in a healthy male subject. Due to persistence of some airflow, the total damping transient spans at least 20 cycles, starting with a progressive shortening of the closed phase of the vibrating cycle. Hence the voicing context appears to play a crucial role.

In a previous work, we emphasized, for a voicing offset in modal physiological conditions, the importance of the persistence of some transglottal flow (i.e. the driving force), resulting from the timing dynamics of the expiratory pressure (muscular/elastic, depending on lung volume) with respect to the opening speed of the glottis. A persisting transglottal flow after the last VF contact clearly slows the damping [1]. Hence the airflow interruption is a crucial methodological issue.

Another parameter that may intervene is of morphological nature:

* Corresponding author.
E-mail addresses: ph.dejonckere@outlook.com (P.H. DeJonckere), jean.lebacq@uclouvain.be (J. Lebacq).

https://doi.org/10.1016/j.bspc.2020.102148
Received 20 April 2020; Received in revised form 3 August 2020; Accepted 7 August 2020
Available online 21 August 2020
1746-8094/© 2020 Elsevier Ltd. All rights reserved.
P.H. DeJonckere and J. Lebacq Biomedical Signal Processing and Control 62 (2020) 102148

ideally, the morphology of the oscillator should remain constant during the damping phase. Beyond a certain degree of abduction, however, the morphology of the oscillating masses changes considerably, the lip-like shape of the VF disappearing and the VF ‘flattening’ laterally. The extent of this non-linear change seems to depend mainly on the degree of abduction. Damping was also recently observed in inspiratory phonation [7]: the characteristics are similar to those of expiratory voicing.

A fast repetition (3 to 4 s⁻¹) of a vowel followed by an abrupt bilabial occlusion (e.g. /εpεpεpεpεpεp/) at comfortable pitch and loudness seems to be the most convenient method: it is very close to physiological speech, a complete interruption of airflow is achieved and can be controlled, and a significant inflation of the supraglottal vocal tract between vocalizations is prevented, as well as an extended VF abduction. An example of an audio signal of /εpεpεpεpεpεp/ is shown in Fig. 2: microphone signal, with simultaneous display of F0 and SPL measures, as obtained with PRAAT (Computer programme by Boersma, Paul & Weenink, David (2020). Version 6.1.16, retrieved 16 June 2020 from http://www.praat.org/).

Furthermore, it is possible, for a trained vocalist, to introduce transorally, without hindering lip closure, the thin rod of a small laryngoscopic mirror equipped with the photoglottographic diode deeply into the pharynx. The rod is held between the teeth, which helps to keep the mouth volume constant. The Millar pressure transducer is placed into the mouth via the labial commissure. The EGG-electrodes and the microphone are external. Fast repetition also contributes to standardizing several parameters.

This protocol is of interest, but it requires a trained vocalist, and seems, as such, unsuited for clinical application.

The oscillating system itself consists of two components: the two VFs and the air mass of the lower and upper airways. The size of the vibrating mass of the VF tissue can be roughly estimated on the basis of X-ray imaging. Fig. 3 shows a frontal view of a normal male larynx during modal phonation. A contrast agent clearly defines the contours of the soft tissues. Thickness and width of each vibrating fold can be estimated at 4 and 5 mm respectively. The vibrating length, as seen on videolaryngoscopic images, is around 16 mm (male subject, modal register, comfortable pitch and loudness). Considering that the vibrating mass narrows in the ventral commissure, 0.5 g is a reasonable upper limit estimate of the total mass of vibrating tissue in vivo (2 VF). In a female subject, one may expect 0.35 g. Tanabe et al. made a lower estimate on an autopsy case (120 mg per VF) [6].

A rough assumption is that modal speech occurs with an average lung volume slightly above the upper limit of the tidal volume (‘mid lung volume range’) [8] (see Fig. 4). Hence the internal air volume set into vibration consists of about 50% of the vital capacity (i.e. half of 3000–4500 mL), to which has to be added a probably large part of the residual volume (on average 1.2 L in males, 1.1 L in females) [9,10] and the supraglottal vocal tract (around 75 mL) [11]. In a healthy adult, the alveolar dead space can be considered negligible [12]. Globally, the mass of the vibrating air can be estimated at around 2.7 to 3.7 g (1.14 g/L), clearly larger than even the high estimate of the VF mass.

Varying the air volume set into vibration would allow checking its importance for the mechanical properties of the oscillating system, particularly the damping characteristics. However, in order to avoid any persistence of transglottal flow after an attempted interruption, it is essential to achieve the change in air volume upstream from the glottis, while the volume of the vocal tract is minimized and kept constant. This is possible by comparing two conditions: voicing with respectively high and low lung volume while the above-mentioned protocol is applied. Our hypothesis is that an increase of the air volume (of about 2.5 L) put into vibration by the VFs should improve the mechanical quality of the global oscillating system, which should be reflected in a lower damping when the driving force is abruptly suppressed.

Moreover, as the current study makes it possible to precisely measure cycle duration during the offset phase, the progress of F0 in the last identifiable cycles will also be considered.
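The order-of-magnitude comparison between tissue mass and air mass made in the Introduction can be retraced with a few lines of arithmetic. The following is only a sketch: the dimensions and volumes are those quoted in the text, whereas the soft-tissue density, the reading of the air density as about 1.14 g/L, and the share of the residual volume taken to vibrate (`RESIDUAL_SHARE`) are assumptions introduced here for illustration and are not values given by the authors.

```python
# Back-of-envelope masses of the two components of the vocal oscillator.
# Dimensions and volumes are the figures quoted in the text; the tissue
# density and the residual-volume share are ASSUMPTIONS of this sketch.

TISSUE_DENSITY_G_PER_CM3 = 1.04  # assumed soft-tissue density, not from the text
AIR_DENSITY_G_PER_L = 1.14       # assumed reading of the air density quoted in the text
RESIDUAL_SHARE = 0.7             # assumed value for "a probably large part"


def vf_block_mass_g(thickness_mm=4.0, width_mm=5.0, length_mm=16.0, n_folds=2):
    """Mass of n_folds rectangular tissue blocks; ignores the ventral narrowing."""
    volume_cm3 = thickness_mm * width_mm * length_mm / 1000.0  # mm^3 -> cm^3
    return n_folds * volume_cm3 * TISSUE_DENSITY_G_PER_CM3


def vibrating_air_mass_g(vital_capacity_l, residual_volume_l=1.2, vocal_tract_l=0.075):
    """Mass of half the vital capacity, part of the residual volume and the vocal tract."""
    volume_l = (0.5 * vital_capacity_l
                + RESIDUAL_SHARE * residual_volume_l
                + vocal_tract_l)
    return volume_l * AIR_DENSITY_G_PER_L


vf_mass = vf_block_mass_g()
air_low = vibrating_air_mass_g(3.0)   # vital capacity of 3000 mL
air_high = vibrating_air_mass_g(4.5)  # vital capacity of 4500 mL
print(f"VF tissue (block estimate): {vf_mass:.2f} g")
print(f"Vibrating air: {air_low:.2f} to {air_high:.2f} g")
```

The rectangular-block estimate comes out somewhat above 0.5 g because it ignores the narrowing towards the ventral commissure, which is consistent with treating 0.5 g as an upper limit. With the assumed residual-volume share, the air mass falls in the range quoted in the text.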

Fig. 1. Videokymogram at four levels (from ventral to dorsal) of the vibrating glottis obtained from high speed video. Left corresponds to the more dorsal part of the
glottis and right to the more ventral part. Healthy male subject. End of a sustained /a:/. Due to persistence of some airflow, the total damping phase spans at least
20 cycles.


Fig. 2. Example of an audio signal of /εpεpεpεpεpεp/: microphone signal, and simultaneous F0 and SPL measures, as obtained with PRAAT (Computer programme by
Boersma, Paul & Weenink, David (2020). Version 6.1.16, retrieved 6 June 2020 from http://www.praat.org/).

2. Material and methods

2.1. Signals

2.1.1. Glottal area (light flow)


The glottal area was derived from a photometric record obtained by transilluminating the trachea. The light source for this transillumination was a tungsten filament light bulb driven by a constant ripple-free current source. The light flux was detected by a photovoltaic transducer positioned as dorsally as possible in the pharynx (photoglottography) [1–3,13,14]. The light signal is the most important one, as it serves to compute the damping. The transducer, a BP104 silicon photodiode (Vishay Precision Group, Malvern, PA), was glued onto a small laryngoscopic mirror (nr. 3), the handle of which was held between the teeth. The current produced by the photodiode was preamplified by a current-to-voltage converter with a linear response up to 2 kHz. The calibration procedure has been described previously [3,13]. The measured glottal area at maximal glottal opening can be related to the peak of the photodiode current. Since the precise position of the photodiode cannot be reproduced from record to record, in each record the amplitude of the light signal was normalized and expressed, in the damping phase, as a fraction of the amplitude of the first ‘free oscillation’ after the last closed plateau.
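The normalization step can be illustrated with a minimal sketch. It assumes that the amplitude of a cycle is taken as the difference between its maximum and minimum on the light trace, which is an interpretation of the procedure rather than a detail stated here; the millivolt readings and the function names are hypothetical.

```python
def peak_to_peak(maxima_mv, minima_mv):
    """Peak-to-peak amplitude of each cycle from paired maxima and minima (mV)."""
    if len(maxima_mv) != len(minima_mv):
        raise ValueError("each cycle needs one maximum and one minimum")
    return [hi - lo for hi, lo in zip(maxima_mv, minima_mv)]


def normalize_to_first(amplitudes):
    """Express every amplitude as a fraction of the first free oscillation."""
    first = amplitudes[0]
    if first <= 0:
        raise ValueError("first amplitude must be positive")
    return [a / first for a in amplitudes]


# Hypothetical readings for four free oscillations (not measured data).
maxima = [100.0, 70.0, 50.0, 38.0]
minima = [20.0, 30.0, 34.0, 36.0]

relative = normalize_to_first(peak_to_peak(maxima, minima))
print([f"{100 * r:.1f}%" for r in relative])
```

Because every record is divided by its own first free oscillation, the result no longer depends on the absolute light level, which varies with the position of the photodiode.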
High speed video provides an adequate global view of the moving vocal folds and the changing glottal shape, but image processing from high-speed video [15] or even videokymography [16,17] is limited by the number of pixels (resolution) and by the frequency of the measurement moments (frame rate), as has been demonstrated by the recent (2019) experiments of Horacek et al. [18]. This of course also limits the sensitivity for detection of very small oscillations at the end of the damping phase, which are crucial in our experiments. Resolution and sampling rate do not apply as limiting factors for the photometric measurements.

Fig. 3. Frontal X-ray imaging of the larynx during modal phonation, with a contrast agent that clearly defines the contours of the soft tissues. The thickness of the vocal fold can be estimated at 4 mm.
2.1.2. Pressure
The intraoral pressure was measured by means of a Millar Mikro-Tip catheter (Model SPC-751, Millar Instruments, Inc. Houston, USA). The pressure signal makes it possible to identify precisely the moment of lip opening (pressure drop). When the lips are closing, the intraoral pressure increases up to nearly the level of the lung pressure, which remains


Fig. 4. Spirographic diagram, with personalized values, showing the traditional lung volume compartments, and the situation of the two zones (of 500 mL each)
wherein the sequences of interrupted vocalizations were produced and recorded. The two zones correspond to the ‘high lung volume’ and ‘low lung volume’
conditions respectively. The difference in lung volume between the two zones is approximately 2410 mL.

approximately constant. During phonation, the intraoral pressure signal is affected by the fluctuations of acoustic pressure (see Fig. 5). In this way, our protocol (fast repetitions of /pεp/ during a single expiration) with monitoring of intraoral pressure provides a good estimate of the subglottal pressure during phonation. This method [19,20] was used for indirect measurement of subglottic pressure, particularly the Phonation Threshold Pressure (PTP). Hertegard et al. [19] found that the indirect measures of subglottic pressure, obtained using the short flow interruption method, were strongly correlated with the direct measurements obtained by tracheal puncture.

2.1.3. VF contact
The electroglottographic (EGG) signal, used as a reference for monitoring the changes in contact surface of the VF, was detected using a portable electroglottograph (Laryngograph Ltd, London, UK) Model EG90. Electroglottography [21] measures the transglottic electrical impedance using an AC current at a frequency above 100 kHz and monitors the changes in contact surface of the VF. The method does not interfere with vocalization. It allows precise phonetic tasks, with acoustic control. However, the sensitivity for detecting very small transglottic impedance variations depends on the design of the electronic circuit. The original design of Fourcin and Abberton [22] has been superseded by more recent devices using a higher carrier-wave frequency, a more efficient feed-back control of the oscillator, and multipole filters with sharper cut-off and flat bandwidth response (e.g. F–J Electronics, Denmark; Laryngograph, UK; Synchrovoice Research, USA; etc.). As a result, a better signal-to-noise ratio and a higher sensitivity are achieved with a larger bandwidth and better linearity [23]. The EGG-signal however fails to show the final phase of the damping, since there is no contact between the VF during this phase. The last sinusoidal EGG-cycles probably correspond to small (reduced amplitude) impedance fluctuations at the level of the ventral commissure. The start of free oscillations of the VF is indicated by a strong reduction in amplitude of the EGG-signal (see e.g. Fig. 5).

2.1.4. Sounds
Sounds were detected by a Sennheiser MD 421 U microphone at 10 cm from the mouth.
All signals were recorded by means of a 4-channel PicoScope 3403D module (Pico Technology Ltd, St Neots, England, UK) driven by the PicoScope 6 programme, and stored in a computer.

2.2. Methodology and vocalizations

The subject was a healthy trained male vocalist, experienced in controlling voicing parameters [1,3,13].
During three sessions, a total of 227 recordings of series of short repetitive vocal /pεp/ emissions were made with the photoglottograph and pressure sensor in situ. As mentioned above, the small laryngoscopic mirror (Nr. 3), with the photodiode glued onto the mirror and pointing towards the glottis, was introduced transorally as deeply as possible into the pharynx, the thin rod being held between the teeth in order to keep the mouth volume constant. The intraoral Millar pressure transducer was placed in the oral cavity via the labial commissure, without contact with the tissues. Neither of the instruments hindered lip closure. EGG-electrodes and microphone were external.
The vocalist made series of fast repetitions (3 to 4 s⁻¹) of the vowel /ε/ (determined by mechanical constraints of the experimental procedure), each vocalization being followed by an abrupt bilabial occlusion (/εpεpεpεpεpεp/) at comfortable pitch and loudness (105–130 Hz, corresponding to the average speaking frequency of the subject, and 63–68 dBA at 10 cm from the lips). The fast repetition is essential in this context, as (1) it contributes to standardizing several parameters (pitch, loudness, oro-pharyngeal configuration…); (2) it prevents a volume increase at the level of the vocal tract after each vocalization, and an unwanted persistence of transglottal airflow: after a single /pεp/, when


Fig. 5. Global view of a polygraphic recording of a single vocalization /pεp/ in the ‘high lung volume’ condition. The /pεp/ is extracted from a /εpεpεpεpεp…/ sequence at a rhythm of three to four vocalizations per s. The vowel /ε/ is determined by the constraints of the oral and pharyngeal sensors. F0 is around 130 Hz and intensity around 64 dB (10 cm). Subglottal pressure (estimated) is 4.9 hPa. Seven free oscillations can be identified after the last closed plateau. The adequate identification of the last closed plateau requires inspection of an enlargement of the picture (expanding the time scale), but the loss of VF contact on the midline is also recognizable on the EGG-trace (strong amplitude reduction).

lips are closed, the (although closed) supraglottal tract (mouth and pharynx) can increase in volume, e.g. by puffing out the cheeks or lowering the mandible, which would allow a briefly persisting transglottal airflow. This is prevented by fast repetition of the /pεp/; (3) it allows collecting short series of recordings at the predefined high or low level of lung volume; (4) it also prevents strong VF abduction and lateral flattening of the lip-like VF shape, which changes to some extent the morphology of the oscillating masses; (5) it avoids any opening of the velopharyngeal sphincter, as /ε/ is a non-nasal phoneme. Finally, it is very close to physiological speech and it does not require any effort.

These series of fast repetitions of the vowel /ε/ were carried out in two lung volume conditions: high and low lung volume. Fig. 4 shows a spirographic diagram with personalized values for the subject, showing the traditional lung volume compartments, and the situation of the two zones (of 500 mL each) in which the sequences of interrupted vocalizations were produced and recorded. The two zones correspond to the ‘high’ and ‘low’ lung volume conditions respectively. The difference in lung volume between the two zones is approximately 2410 mL.

A total corpus of 105 selected polygraphic recordings corresponding to the condition ‘high lung volume’ (54) and to the condition ‘low lung volume’ (51) was created. Criteria for selection were (1) full screen display of all four traces; (2) sufficient amplitude of the light signal (depending on positioning of the photodiode in the pharynx); (3) limited drift of the light trace (glottal area) in order to allow valid amplitude measurements. An example is given in Fig. 5.

Counting the number of free oscillations on the glottal area trace started just after the last closed plateau. However, identifying this last closed plateau requires expanding the time scale. For example, it is not possible to identify a plateau in the light trace (glottal area) of Fig. 5, because the time-base is too long. To identify a plateau, it is necessary to expand the signal horizontally by speeding up the time-base, as is done in Fig. 6. This is what we did for analyzing each utterance and counting the number of free oscillations. On a faster time-base, the plateau could be properly identified, at least in all the cases we took into account for this study. Moreover, the EGG-signal is considerably reduced in amplitude as soon as no contact occurs between the VF (see Fig. 5). Counting was done blindly, i.e. the rater was unaware of the condition (high or low lung volume).

Measurement of amplitude decay was done by first identifying, after strong enlargement of the Pico picture (vertical expansion), the successive maximum and minimum of each cycle. The programme automatically displays the value in mV when clicking on any point of the tracing. The same applies to period measurements (horizontal expansion, values in ms).

2.3. Statistics

Statistical computations and graphs were made using the Statistica software (Statsoft Inc., Tulsa, USA). The comparison of the numbers of cycles was made with a Mann-Whitney U test. For correlations between period duration and cycle number, Spearman’s correlation coefficient Rho was used.
The relative amplitudes of oscillations, cycle by cycle, as well as the logarithmic decrement values were compared between the two conditions using the Mann-Whitney U test.

3. Results

Fig. 5 shows a global view of a polygraphic recording of a single vocalization /pεp/ in the ‘high lung volume’ condition. The /pεp/ is extracted from a /εpεpεpεpεp…/ sequence at a rhythm of three to four


Fig. 6. Example of a voicing offset in the ‘high lung volume’ condition. Seven free oscillations can be identified on the glottal area trace after the last closed plateau.

Fig. 7. Enlargement of a part of Fig. 5, to show how the last cycle with a closed plateau can be identified on the glottal area trace.


vocalizations per s. The vowel /ε/ is determined by the constraints of the oral and pharyngeal sensors. F0 is around 130 Hz and intensity around 64 dB (at 10 cm). Subglottal pressure (estimate) is 4.9 hPa.

Fig. 6 focuses on the voicing offset in an example of the ‘high lung volume’ condition: on the glottal area trace, seven free oscillations can be identified after the last closed plateau. Fig. 7 shows that expanding the time scale clearly differentiates the last cycle with a closed plateau from the first free oscillation on the glottal area trace.

Fig. 8 shows an example of a voicing offset in the ‘low lung volume’ condition. The last closed plateau is again easily recognized, even with moderate expansion. In this case, five free oscillations can be identified after the last closed plateau on the glottal area trace.

Average counts (blinded for condition) of the numbers of ‘free oscillation’ cycles after the last VF contact were 4.89 ± 0.79 in the ‘high lung volume’ condition and 3.65 ± 0.72 in the ‘low lung volume’ condition. The difference is highly significant (p < 0.0001) (Fig. 9). This is confirmed by the superimposed histograms with Gaussian fits (Fig. 10).

The cycle by cycle decay of the normalized amplitude after the last closed plateau for the two conditions is shown in Fig. 11. Cycle #1 is the first free oscillation, defining 100% amplitude. The decay is stronger and faster in the ‘low lung volume’ condition. The difference in amplitude mainly appears in cycles #2 and #3. In cycle #4, the difference is smaller although still just significant, but there are only a few cases for the ‘low lung volume’ condition. For cycles #6 and #7, there are only data for the ‘high lung volume’ condition.

The logarithmic decrement is defined as the natural log of the ratio of the amplitudes of any two successive positive peaks: ln(x_n / x_{n+1}). Table 1 shows, for each pair of adjacent cycles and in both conditions, the mean and SD of the logarithmic decrement value. The global average logarithmic decrement is 0.72 ± 0.31 in the ‘high lung volume’ condition (n = 212 logarithmic decrements) and 0.88 ± 0.26 in the ‘low lung volume’ condition (n = 133 logarithmic decrements). This difference is highly significant (p < 0.001). Table 1 also gives, for each pair of adjacent cycles and in both conditions, the decay values (mean and SD) when the frequency is taken into account: ln(x_n / x_{n+1}) / F0, which is relevant for modelers. Indeed, F0 slightly changes in the last cycles (see below). The global average values for the ratio (logarithmic decrement / frequency) become 5.96 × 10⁻³ s for high lung volume and 7.29 × 10⁻³ s for low lung volume. The difference remains highly significant (p < 0.001). If frequency is expressed as angular frequency (rad·s⁻¹), the ratio values become 0.95 × 10⁻³ s for high lung volume and 1.15 × 10⁻³ s for low lung volume respectively.

The tracings of voicing offsets allow a precise quantification of the period duration of the last free oscillations. The periods of the last four free oscillating cycles could be reliably measured in N = 53 and 48 decays for the high and low lung volume conditions, respectively. In the plots of Figs. 12 and 13, a slight increase in F0 is noticeable: as damping progresses, a small shortening of the period is observed. Positive correlations of F0 with cycle # are weak but significant (Rho = 0.24 and 0.21, for high and low lung volumes respectively; p < 0.05).

4. Discussion

4.1. The method for standardizing the damping characteristics

In a previous study [1], we were already confronted with the issue of the reliability of measurements pertaining to damping, because of the variability related to the laryngeal and respiratory behaviour at the end of a vocal emission. The airflow interruption is the crucial methodological issue, as any persisting transglottal flow after the last VF contact strongly slows the damping. Abruptly interrupting airflow at a subglottic level is practically ruled out in vivo. The airflow can be interrupted downstream, either artificially by an inflatable balloon within a pneumotachograph, or physiologically by bilabial occlusion (e.g. in /ip/). In the case of an artificial abrupt interruption during a sustained vowel, some

Fig. 8. Example of a voicing offset in the ‘low lung volume’ condition. Five free oscillations can be identified on the glottal area trace after the last closed plateau. To
be compared with Fig. 6.


Fig. 9. Result of counting (blinded for condition) the number of identifiable free oscillations after the last closed plateau. In the ‘low lung volume’ condition (n = 51), the average is 3.65 free oscillations, and in the ‘high lung volume’ condition (n = 54), the average is 4.89 free oscillations. The difference is highly significant (p < .0001).

Fig. 10. Histogram of the number of cycles (free oscillations) that can be identified after the last closed plateau (= last contact between vocal fold edges on the
midline). Black = high lung volume; grey / dotted = low lung volume. Gaussian fits. N = 54 and 51. The average number of cycles is highly significantly lower in the
case of low lung volume (p < .0001).
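The comparison of cycle counts between the two conditions relies on the Mann-Whitney U test. The sketch below shows one common way of computing it, using a normal approximation with tie and continuity corrections; this is an assumption made for illustration, not a description of what the Statistica software does internally, and the counts in the example are hypothetical rather than the recorded data.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U_x, U_y, p_value).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y], key=lambda t: t[0])

    # Assign average ranks to tied values and accumulate the tie correction.
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        average_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = average_rank
        size = j - i
        tie_term += size ** 3 - size
        i = j

    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    u_y = n1 * n2 - u_x

    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, u_y, 1.0
    z = max(abs(u_x - mean_u) - 0.5, 0.0) / math.sqrt(var_u)  # continuity correction
    p_value = math.erfc(z / math.sqrt(2.0))  # two-sided tail of the normal law
    return u_x, u_y, p_value


# Hypothetical free-oscillation counts, for illustration only.
high_volume = [5, 4, 6, 5, 4, 5, 6, 4]
low_volume = [4, 3, 4, 3, 4, 5, 3, 4]
u_high, u_low, p = mann_whitney_u(high_volume, low_volume)
print(f"U = {min(u_high, u_low):.1f}, two-sided p = {p:.4f}")
```

A rank-based test suits these data because the counts are small integers with many ties, so no assumption of normality is needed.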

transglottal airflow can persist by limited inflation of the upper vocal tract, upstream of the occlusion.

A fast repetition (3 to 4 s⁻¹) of a vowel followed by an abrupt bilabial occlusion (like /εpεpεpεpεpεp/) at comfortable pitch and loudness seems the most convenient method: it is very close to physiological speech, and complete interruption of the airflow is achieved and can be controlled. The fast repetition rate prevents significant inflation of the supraglottal vocal tract between vocalizations as well as extended VF abduction that would alter the VF morphology. Furthermore, with a trained vocalist, it is possible to introduce the light and pressure transducers into the mouth and pharynx. The teeth gently hold the thin rod equipped with the photodiode, and the pressure transducer is introduced laterally via the labial commissure, without hindering lip closure. Practically all jaw and mandible movements are precluded. The tongue must also remain immobile in a position that avoids any contact of the Millar pressure transducer with the mucosa. In all, fast repetition contributes to standardizing many parameters, without affecting the lung volume.

As damping characteristics of the oscillating system are objective data directly related to the amount of energy required for voicing, this protocol could conceivably be of use in the fields of vocal hygiene, education and pathology. However, in its current setting, it requires a trained vocalist and seems unsuitable for clinical or medicolegal application. High speed / high resolution glottal imaging via a noninvasive transnasal fiberscope and automatic extraction by dedicated software of glottal area and period measurements with sufficient precision (pixels / frames per s) is a possible solution for the near future [1].

In the present study, all observations deal with phonation at comfortable pitch and loudness. Changes in intensity and F0 are also likely to influence the damping characteristics, particularly in the cases of shouting or register change.

Another, totally different possibility for observing the damping phenomenon would be an external mechanical impulsion on the larynx in


Fig. 11. Comparison of amplitudes between the ‘high lung volume’ and the ‘low lung volume’ conditions for each successive free oscillation. The amplitude of the first identifiable free oscillation is set at 100% in both the ‘high lung volume’ and the ‘low lung volume’ condition (normalization). The difference in amplitude mainly appears in cycles #2 and #3. In cycle #4 the difference is still just significant, but there are only a few cases for the ‘low lung volume’ condition. For cycles #6 and #7, there are only data for the ‘high lung volume’ condition.

the absence of phonation. However, it would require an adequate positioning of the VF at the moment of impulsion, and any recording would be disturbed by the mechanical artifact. Furthermore, an external impulsion will elicit a laryngeal reflex (with a delay of the order of 10–40 ms [24,25]) which will interfere with the damping. For these reasons, this procedure is unlikely to produce valid results.

4.2. The effect of lung volume on damping characteristics

In phonation physiology, the concept of ‘vocal oscillator’ cannot be limited to the VF: it also includes the internal air volume set into motion by the lung pressure and into vibration by the VF. The mass of the air appears to be around sevenfold that of the VF tissue. During speech and singing, after the subject has taken a small or a larger deep breath, this volume progressively declines, and this in turn influences the physical properties of voice production and the energy it requires. At high lung volume, the elastic effect of a larger vibrating air mass reduces the rate of decay of the glottal oscillations.

In our experiments, the calculated global average logarithmic decrement is 0.72 ± 0.31 in the ‘high lung volume’ condition and 0.88 ± 0.26 in the ‘low lung volume’ condition. For comparison, the logarithmic decrement computed on a graph made by Tanabe and Isshiki [5] and based on high speed cinematography of an autopsy larynx is clearly higher: 1.65 (oscillation stops after 2 cycles).

Table 1
Mean and SD of the logarithmic decrement value for each pair of adjacent cycles and in both conditions. Idem for the ratio (logarithmic decrement / frequency).

| # of cycles | High lung volume: log. decrement | High lung volume: ratio log. decrement / frequency (s) | Low lung volume: log. decrement | Low lung volume: ratio log. decrement / frequency (s) |
|---|---|---|---|---|
| 1-2 | 0.56 ± 0.25 | 4.90 × 10⁻³ ± 2.19 × 10⁻³ | 0.74 ± 0.24 | 6.70 × 10⁻³ ± 2.17 × 10⁻³ |
| 2-3 | 0.56 ± 0.25 | 4.93 × 10⁻³ ± 2.20 × 10⁻³ | 1.05 ± 0.23 | 9.44 × 10⁻³ ± 2.07 × 10⁻³ |
| 3-4 | 0.98 ± 0.20 | 8.72 × 10⁻³ ± 1.78 × 10⁻³ | 0.89 ± 0.21 | 7.50 × 10⁻³ ± 1.69 × 10⁻³ |
| 4-5 | 0.85 ± 0.35 | 7.41 × 10⁻³ ± 3.05 × 10⁻³ | 0.69 ± 0.20 | 5.51 × 10⁻³ ± 1.60 × 10⁻³ |
| 5-6 | 0.50 ± 0.24 | 4.30 × 10⁻³ ± 2.07 × 10⁻³ | – | – |
| 6-7 | 0.69 ± 0.20 | 5.52 × 10⁻³ ± 1.60 × 10⁻³ | – | – |
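To make the logarithmic decrement values more tangible, the following minimal Python sketch converts a decrement into the amplitude remaining after one cycle, an equivalent damping ratio, and the fraction of stored energy lost per cycle. It assumes the standard definition (natural logarithm of the ratio of two successive peak amplitudes), an underdamped linear oscillator, and energy proportional to amplitude squared; these are modelling assumptions, not statements about the authors' computation. The function names are illustrative, and the mean decrements are taken here as 0.72, 0.88 and 1.65.

```python
import math


def log_decrement(a_n, a_next):
    """Logarithmic decrement from two successive peak amplitudes (assumed definition)."""
    if a_n <= 0 or a_next <= 0:
        raise ValueError("amplitudes must be positive")
    return math.log(a_n / a_next)


def amplitude_ratio(delta):
    """Fraction of the amplitude remaining after one full cycle."""
    return math.exp(-delta)


def damping_ratio(delta):
    """Damping ratio of an underdamped linear oscillator with decrement delta."""
    return delta / math.sqrt(4.0 * math.pi ** 2 + delta ** 2)


def energy_loss_per_cycle(delta):
    """Fraction of stored energy lost per cycle, assuming energy ~ amplitude squared."""
    return 1.0 - math.exp(-2.0 * delta)


# Mean decrements as read from the text (assumed values for illustration).
conditions = [
    ("high lung volume", 0.72),
    ("low lung volume", 0.88),
    ("autopsy larynx [5]", 1.65),
]

for label, delta in conditions:
    print(
        f"{label}: amplitude kept per cycle = {amplitude_ratio(delta):.2f}, "
        f"damping ratio = {damping_ratio(delta):.3f}, "
        f"energy lost per cycle = {energy_loss_per_cycle(delta):.2f}"
    )
```

Under these assumptions, a larger decrement means that a larger share of the oscillator's energy must be replaced in each cycle to sustain vibration, which is one way to read the higher energy demand at low lung volume.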

Fig. 12. Progress of F0 in the last four identifiable cycles (free oscillations) during voicing offset in the ‘high lung volume’ condition (mean values, SE and SD; linear
regression line). Positive correlation of F0 with cycle # is weak but significant: F0 tends to increase in the last cycles.


Fig. 13. Progress of F0 in the last four identifiable cycles (free oscillations) during voicing offset in the ‘low lung volume’ condition (mean values, SE and SD; linear
regression line). Positive correlation of F0 with cycle # is weak but significant: similarly to the ‘high lung volume’ condition, F0 tends to increase in the last cycles.
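The kind of trend analysis summarized in these captions can be sketched as follows: F0 is derived from the duration of each free oscillation cycle, and a least-squares line is fitted against cycle number. This is a minimal illustration only; the cycle durations below are hypothetical values, not data from the study, the function names are invented for the example, and no significance test is included.

```python
import math
from statistics import mean


def f0_from_periods(periods_s):
    """Convert cycle durations (seconds) to fundamental frequency (Hz)."""
    return [1.0 / p for p in periods_s]


def linear_fit(x, y):
    """Ordinary least-squares fit; returns slope, intercept and Pearson r."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical durations of the last four identifiable cycles (seconds).
cycles = [1, 2, 3, 4]
periods_s = [0.0090, 0.0089, 0.0087, 0.0086]

f0 = f0_from_periods(periods_s)
slope, intercept, r = linear_fit(cycles, f0)
print(f"slope = {slope:.2f} Hz per cycle, r = {r:.3f}")
```

A positive slope corresponds to F0 rising over the last cycles; with real data, many offsets would be pooled and the correlation tested for significance.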

4.3. Implications of lung volume for voice hygiene and functional voice disorders

Dysphonia related to vocal abuse, or hyperfunctional dysphonia, has long been suspected to be associated with deviant speech breathing [26].

Subglottal pressure is determined by active forces produced by the expiratory muscles and passive forces produced by gravity and the elasticity of the breathing apparatus. Elasticity is generated by the lungs and the rib cage, and varies with lung volume. At high lung volumes, elasticity produces an exhalatory force that may amount to 30 hPa or more [27]. Conversely, at low volumes, elasticity contributes an inhalatory force. In conversational speech, about 15–20% of the total lung capacity is used. Hence elasticity forces change from exhalatory (high lung volumes) to inhalatory (low lung volumes): an equilibrium is reached at a certain lung volume, the functional residual capacity (i.e. the volume in the lungs at the end-expiratory position) [27]. The tendency of the chest wall to expand is equal and opposite to the tendency of the lungs to collapse at a lung volume of 38% of the vital capacity [28].

Lowell [29] and Lowell et al. [30] compared, during teaching-related speaking tasks, teachers with voice problems (in the absence of laryngeal lesions) with ‘healthy’ teachers, and observed decreased levels of lung volume initiation and termination in the former with respect to the latter. Indeed, teachers frequently have to speak at increased loudness levels while teaching. At higher lung volume initiation levels, greater respiratory recoil forces are available for expiratory speech [28,31]. By starting their breath groups at higher levels, teachers with healthy voices capitalize on these passive recoil forces. Initiating breath groups at a higher volume facilitates an increased lung pressure and consequently a louder voice. Also, by ending their breath groups at higher levels, they avoid the muscle effort required for producing speech below the resting respiratory level.

Similarly, Schaeffer et al. [32] compared patients with abuse-related dysphonia with a normal control group in a reading task of a 60-syllable paragraph: significant results indicated that the end-expiratory lung volume levels of the dysphonic group were further below the resting expiratory level than those of the control group. In a later study, Schaeffer [33] showed that a significant improvement in speech breathing data (higher end-expiratory levels) could be obtained by voice therapy, with a reduction of perceived dysphonia. The average termination of speech relative to the resting respiratory level was −0.224 L before therapy and +0.063 L after therapy.

Along the same line, Iwarsson & Sundberg [34], using respiratory inductive plethysmography, investigated female voice patients with vocal fold nodules, which are generally considered to result from a tissue reaction to repeated localized mechanical stress or insult to the VF tissues [35]. They concluded that “females with vocal nodules were shown to inhale more often, and, when shouting, initiated phrases at lower lung volume levels than females without nodules, thus refraining from taking advantage of the increased recoil contributions to subglottal pressure associated with high lung volumes.”

Our experiments point to an additional mechanism to this physiological rationale: speech at low lung volume requires significantly more energy for voicing due to the enhanced damping of the oscillating system.

4.4. Evolution of fundamental frequency during the last cycles

It has been shown that, during a breathy onset, before the first closed plateau is reached, there is a clear trend towards a slight progressive decrease of F0 of the VF oscillation [2]. This seems to show that when the mass of vibrating tissue is limited to a very thin strip of tissue along the VF edges, and concomitantly the vibrating air mass is also reduced, the vibration frequency is higher than when a more substantial part of the VF mass and a more substantial air mass are involved. A mirror phenomenon can be observed during voicing offset, whatever the lung volume condition (Figs. 12 & 13): F0 tends to increase in the last cycles, and the underlying explanation appears to be similar.

5. Conclusion

With an adequate methodology, it is possible to control, to standardize and to quantify the damping characteristics of the oscillating system (vocal fold tissue and air mass) during a physiological voicing offset with abrupt interruption of the airflow. This makes it possible to investigate specifically the role of lung volume. The mechanical quality of the oscillating system appears to be, to a non-negligible extent, determined by the lung volume that is set into oscillation; a reduction of the air volume leads to a significant increase in the rate of decay of oscillations, resulting in a higher energy demand for voicing. Moreover, during the last cycles of a voicing offset, when a smaller part of the vocal fold mass is involved, a trend towards a slight progressive increase of the F0 of the VF oscillation is clearly observed.

CRediT authorship contribution statement

P.H. DeJonckere: Conceptualization, Methodology, Investigation,

Writing - original draft, Writing - review & editing. J. Lebacq: Conceptualization, Methodology, Investigation, Writing - original draft, Writing - review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

[1] P.H. DeJonckere, J. Lebacq, Damping of vocal fold oscillation at voice offset, Biomed. Signal Process. Control 37 (2017) 92–99.
[2] J. Lebacq, P.H. DeJonckere, The dynamics of vocal onset, Biomed. Signal Process. Control 49 (2019) 528–539.
[3] P.H. DeJonckere, J. Lebacq, I.R. Titze, Dynamics of the driving force during the normal vocal fold vibration cycle, J. Voice 31 (2017) 649–661.
[4] J.G. Svec, J. Horacek, F. Sram, J. Vesely, Resonance properties of the vocal folds: in vivo laryngoscopic investigation of the externally excited laryngeal vibrations, J. Acoust. Soc. Am. 108 (2000) 1397–1407.
[5] M. Tanabe, N. Isshiki, Rheological characteristics of the vocal cord, Stud. Phonol. Kyoto 13 (1979) 18–22.
[6] M. Tanabe, N. Isshiki, M. Sawada, Damping ratio of the vocal cord, Folia Phoniatr. (Basel) 31 (1979) 27–34.
[7] F. Vanhecke, J. Lebacq, M. Moerman, C. Manfredi, G.W. Raes, P.H. DeJonckere, Physiology and acoustics of inspiratory phonation, J. Voice 30 (2016) 769.e9–769.e18.
[8] J.E. Huber, E.T. Stathopoulos, Speech breathing across the life span and in disease, in: M.A. Redford (Ed.), The Handbook of Speech Production, John Wiley & sons Inc., Chichester; UK, 2015.
[9] Lung volumes, Physiopedia, 2019. September 22. Retrieved December 28, 2019 from https://www.physio-pedia.com/index.php?title=Lung_volumes&oldid=223401.
[10] J. Stocks, P.H. Quanjer, Reference values for residual volume, functional residual capacity and total lung capacity, Eur. Respir. J. 8 (1995) 492–506.
[11] N. Yan, L.N.G. Manwa, M.K. Man, T.H. To, Vocal tract dimensional characteristics of professional male and female singers with different types of singing voices, Int. J. Speech. Pathol. 15 (2013) 484–491.
[12] S. Intagliata, A. Rizzo, W.G. Gossman, Physiology, Lung Dead Space, StatPearls Publishing, Treasure Island (FL), 2019. Jan. NCBI Bookshelf National Center for Biotechnology Information, U.S. National Library of Medicine 8600 Rockville Pike, Bethesda MD, 20894 USA. www.ncbi.nlm.nih.gov 〉 books 〉 NBK482501.
[13] P.H. DeJonckere, J. Lebacq, In vivo quantification of the intraglottal pressure: modal phonation and voice onset, J. Voice (2019), https://doi.org/10.1016/j.jvoice.2019.01.00 in press.
[14] P.H. DeJonckere, J. Lebacq, Intraglottal aerodynamics at vocal fold vibration onset, J. Voice (2019), https://doi.org/10.1016/j.jvoice.2019.08.002 in press.
[15] O. Köster, B. Marx, P. Gemmar, M. Hess, J. Künzel, Qualitative and quantitative analysis of voice onset by means of a multidimensional voice analysis system (MVAS) using high-speed imaging, J. Voice 13 (1999) 355–374.
[16] P.H. DeJonckere, J. Lebacq, L. Bocchi, S. Orlandi, C. Manfredi, Automated tracking of quantitative parameters from single line scanning of vocal folds: a case study of the ‘messa di voce’ exercise, Logop. Phoniatr. Vocol. 40 (2015) 44–54.
[17] C. Manfredi, L. Bocchi, G. Cantarella, G. Peretti, Videokymographic image processing: objective parameters and user-friendly interface, Biomed. Signal Process. Control 7 (2012) 192–201.
[18] J. Horáček, V. Radolf, V. Bula, A.M. Laukkanen, Experimental modelling of glottal area declination rate in vowel and resonance tube phonation, in: Models and Analysis of Vocal Emissions for Biomedical Applications: 11th International Workshop, December, 17-19, 2019, Claudia Manfredi (Ed.), ©, FUP, CC BY 4.0 International, Published by Firenze University Press, 2019, pp. 205–207. ISSN 2704-5846 (online), ISBN (online PDF) 978-88-6453-961-4, www.fupress.com.
[19] S. Hertegard, J. Gauffin, P.A. Lindestadt, A comparison of subglottal and intraoral pressure measurements during phonation, J. Voice 9 (1995) 149–155.
[20] J. Jiang, T. O’Mara, D. Coley, D. Hanson, Phonation threshold pressure measurements during phonation by airflow interruption, Laryngoscope 109 (1999) 425–432.
[21] P.H. DeJonckere, Instrumental methods for assessment of laryngeal phonatory function, in: A. am Zehnhoff-Dinnesen, B. Wiskirska-Woznica, K. Neumann, T. Nawka (Eds.), European Manual of Medicine. Phoniatrics, Vol I, Springer-Verlag, Berlin Heidelberg, 2020.
[22] A. Fourcin, E. Abberton, First applications of a new laryngograph, Volta Rev. 69 (1972) 507–508.
[23] J.N. Sarvaiya, P.C. Pandey, V.K. Pandey, An impedance detector for glottography, IETE J. Res. 55 (2011) 100–105.
[24] P.H. DeJonckere, EMG of the Larynx, Press Productions, Liège, 1987, 340 pp. ISBN 2-87211-000-003.
[25] I.R. Titze, B. Story, M. Smith, R. Long, A reflex model of vocal vibrato, J. Acoust. Soc. Am. 111 (2002) 2272–2282.
[26] T.J. Hixon, A.H.B. Putnam, Voice disorders in relation to respiratory kinematics, Semin. Speech Lang. 4 (1983) 217–231.
[27] J. Sundberg, The singing voice, in: R.D. Kent (Ed.), The MIT Encyclopedia of Communication Disorders, The MIT Press, Cambridge, Massachusetts, London, England, 2004, pp. 51–54.
[28] R.D. Kent, The Speech Sciences, Singular Publishing Group, Inc., San Diego London, 1997.
[29] S.Y. Lowell, Respiratory and Laryngeal Function During Spontaneous Speaking in Teachers With Voice Disorders. PhD Dissertation, The University of Arizona, 2005, 180 pp. repository@u.library.arizona.edu.
[30] S.Y. Lowell, J.M. Brakmeyer-Kraemer, J.D. Hoit, B.H. Story, Respiratory and laryngeal function during spontaneous speaking in teachers with voice disorders, J. Speech Lang. Hear. Res. 51 (2008) 333–349.
[31] T.J. Hixon, M.D. Goldman, J. Mead, Kinematics of the chest wall during speech production: volume displacements of the rib cage, abdomen and lung, J. Speech Hear. Res. 16 (1973) 78–115.
[32] N. Schaeffer, S.A. Cavallo, M. Wall, C. Diakow, Speech breathing behavior in normal and moderately to severely dysphonic subjects during connected speech, J. Med. Speech Lang. Pathol. 10 (2002) 1–19.
[33] N. Schaeffer, Speech breathing behavior and vocal fold function in dysphonic participants before and after therapy during connected speech: preliminary observations, Contemp. Issues Commun. Sci. Disord. 34 (2007) 61–72.
[34] J. Iwarsson, J. Sundberg, Breathing behaviors during speech in healthy females and patients with vocal fold nodules, Logoped. Phoniatr. Vocol. 24 (1999) 154–169.
[35] P.H. DeJonckere, M. Kob, Pathogenesis of vocal fold nodules. New insights from a modelling approach, Folia Phoniatr. Logop. 61 (2009) 171–179.
