
Australian Journal of Electrical and Electronics Engineering

ISSN: (Print) (Online) Journal homepage: https://www.tandfonline.com/loi/tele20

Diagnostic feasibility of time domain features for detecting and characterizing cry cause factors - an investigation

Arun P, Madhukumar S, Vishnu K Kumar, Ron S, Neha M & Princy E

To cite this article: Arun P, Madhukumar S, Vishnu K Kumar, Ron S, Neha M & Princy E (2022)
Diagnostic feasibility of time domain features for detecting and characterizing cry cause factors -
an investigation, Australian Journal of Electrical and Electronics Engineering, 19:4, 340-348, DOI:
10.1080/1448837X.2022.2068486

To link to this article: https://doi.org/10.1080/1448837X.2022.2068486

Published online: 27 Apr 2022.

Arun P, Madhukumar S, Vishnu K Kumar, Ron S, Neha M and Princy E
Department of Electronics and Communication Engineering, St. Joseph’s College of Engineering and Technology, Palai, India

ABSTRACT

The very first cry of an infant gives vital information about the infant's health, and as infants grow the acoustics of the cry change with the development of the vocal tract. Understanding the factors that cause an infant to cry could therefore have a considerable impact in both medical and household settings. Infant cry recordings are frequently used for non-invasive infant health inspection and monitoring. Automated approaches for forecasting health status, however, depend heavily on the features extracted. In this paper, the diagnostic feasibility of time domain features for detecting and discriminating the various cry-cause factors of cry signals is investigated. The features employed are mean, peak value, RMS, crest factor, impulse factor, shape factor, energy, and clearance factor. Among the features investigated, RMS is found to be more effective than all the others in detecting cry-cause factors, with a probability value (P) of 2.23307 × 10⁻⁶, an accuracy of 91.67%, a sensitivity of 90%, and a specificity of 93.33%.

ARTICLE HISTORY
Received 2 December 2020
Accepted 24 February 2022

KEYWORDS
Cry-cause factors; infant cry analysis; performance parameters; RMS; statistical significance; time domain features

1. Introduction

New-born crying has different meanings for different experts. For a physiologist, crying is a behaviour that expresses an emotional state and also conveys the needs of the infant. For a neonate, crying is the means of communicating physical discomforts such as hunger, pain, and a wet diaper. From a linguistic perspective, a baby's crying is the beginning of vocalisation and a step towards learning a language that will allow the baby to communicate with the entire world (Nor and Ab-Rashid 2018).

For a medical expert, new-born crying is an indication of the proper functioning and coordination of various organs at the time of birth, and also serves as an alarm signal. For an engineer, it is acoustic data that carries a record of the infant's needs and physical status in the form of acoustic signatures such as melody, loudness, timbre, pitch, intonation, and rhythm. The newborn cry is therefore a multidisciplinary subject that experts from various backgrounds examine from different perspectives. These non-verbal communication data are an important input to automated analysis, which aims to identify the cry-cause factors from features extracted from the cry signals using signal processing techniques (Ruíz Díaz et al. 2012). The features may be in the time, frequency, or time-frequency domain (Arun, Abraham Lincon, and Prabhakaran 2018). This non-verbal crying information is mainly available as acoustic data samples acquired by various sophisticated data acquisition systems, and useful information can be extracted from these recordings.

A few approaches that incorporate various features to analyse cry signals are presented in the literature (Xie 1993; Chittora and Patil 2016; Jeyaraman et al. 2018; Ramesh et al. 2019; Diaz et al. 2012; Orlandi et al. 2018; Sahin et al. 2017; Etz et al. 2013; Hariharan et al. 2012). Xie (1993) proposed a method to analyse the cry signal using spectral analysis, examining energy features derived from the fundamental frequency components of the pre-processed cry signal. The strength of the cry signal in various frequency sub-bands and the voicing present in the cry were also reported. The modified autocorrelation method was used to extract the fundamental frequency. The study concluded that fundamental frequency features are unimportant in the analysis of the new-born cry, and that the occurrence of voicing in the cry varies with central nervous system (CNS) age and is a crucial discriminatory feature in the study of the new-born cry. The average proportion of voicing is reported as 67.7% in birth cries, compared to 84.4% in normal infants (20 days–3 months). The birth cry thus contains far less voicing and, as a result, fewer vocal fold vibrations. Chittora and Patil (2016) used the short-time

CONTACT Arun P link2arun.p@gmail.com Department of Electronics and Communication Engineering, St. Joseph’s College of Engineering and
Technology, Palai-686 579, India
© Engineers Australia

Fourier transform (STFT) of infant cry signals to construct energy-based features, which were then fed into a support vector machine (SVM). They concluded that the energy distribution in the 0–1 kHz region was a potential feature for categorising infant crying.

Jeyaraman et al. (2018) noted that time domain features are commonly used in the classification of neonate cries because of their good detection efficiency in low-noise conditions and their low computational complexity. They also stated that linear prediction coefficients (LPC) and mel-frequency cepstral coefficients (MFCC) have specific acoustic properties that researchers prefer in infant cry classification for determining the nature of the information-rich neonatal cry signal. Ramesh et al. (2019) proposed a feature extraction approach that recovered pitch-related parameters, MFCCs, and short-time energy characteristics from the cry signal. The K-NN algorithm was used to classify the signal, and it performed well even at a low signal-to-noise ratio (SNR). Diaz et al. (2012) recommended a scheme to analyse the qualitative features of infant cry recordings, as they can provide vital information for classifying normal and pathological cases. For this, they selected only the substantial cry segments, with a duration greater than 200 ms, from the pre-processed cry records. From these records they calculated parameters such as the start and end of each cry unit, the period, the number of cry units in the sample, and the fundamental frequency of each segment, as these are useful to the clinician in further assessment.

Orlandi et al. (2018) proposed a technique for categorising full-term and preterm infant cry records using parameters including the duration and the mean, median, standard deviation, minimum and maximum values of the fundamental and resonance frequencies of the pre-processed cry signals. These features were fed into various classifiers. Sahin et al. (2017) developed a scheme for estimating the gestational age of a new-born based on cry data obtained during routine blood sampling. They calculated the mean fundamental frequency, jitter percentage, shimmer percentage, noise-to-harmonic ratio, intensity, time interval between the painful stimulus and the initial cry signal, total cry length, and mean utterance duration time (MUDT) of the pre-processed cry fragments. A stepwise multiple linear regression analysis was also performed to assess the impact of these features. Etz et al. (2013) introduced a model for detecting spontaneous and pain-induced cries using characteristics such as fundamental frequency, intensity, cry length, formants, vocal fold micro-variability, and harmonics-to-noise ratio of the pre-processed cry signals.

Hariharan et al. (2012) suggested a feature extraction method based on time-frequency analysis using the short-time Fourier transform (STFT) for the investigation of infant cry signals. These features were applied to general regression neural networks, probabilistic neural networks, a time-delay neural network (TDNN), and a multilayer perceptron (MLP). It was reported that the first two neural networks outperform the others.

The main drawback of frequency domain analysis is that its computation is more complex than that of time domain analysis (Xie 1993; Ramesh et al. 2019; Diaz et al. 2012; Orlandi et al. 2018; Sahin et al. 2017; Etz et al. 2013). There is also a chance of losing the transient information present in the data, because the analysis compares the data with perfect sinusoids. Moreover, in frequency domain analysis the temporal information may be lost in the transformation from the time to the frequency domain. The features used for cry segment analysis should be able to efficiently translate the qualitative attributes of the spectrum seen through visual examination (Xie 1993; Orlandi et al. 2018) into a limited set of numerical indices. The shortcoming of the STFT is that once a time window size is set, it remains the same across all frequencies (Kim and In 2012).

In the majority of the above-mentioned studies, the features were used directly to train the statistical learning tool, rather than first determining their statistical significance and then training the model according to the level of significance. Furthermore, the separability and inter-class variability of the features must be validated before they can be used in real-world applications. Because no domain transformation is necessary in time domain analysis, the interpretation is straightforward, and the computational complexity is lower than that of spectral and spectro-temporal features.

In this work, the diagnostic feasibility of time domain features of the cry signal for detecting and discriminating different types of cry-cause factors is investigated. The features used are energy, mean, RMS, crest factor (CRF), impulse factor (IF), peak amplitude (PA), shape factor (SHF), and clearance factor (CLF). The selected features account directly for the qualitative behaviour of the cry signal, which is primarily non-stationary in nature, and do not involve any domain transformation. They effectively reflect the qualitative attributes of the signal via numerical indices. The majority of the selected features exclusively reflect the pattern of variation of the signal amplitude over time. A further attraction is that these features are analytically simple compared to composite frameworks involving signal boundary detection and linear predictive coefficients. Hence, the response of the proposed method may be better than that of other methods.

The highlights of this work are as follows: (a) the proposed time domain features are relatively simple and computationally less complicated; (b) the features are examined for statistical significance using Kruskal–Wallis one-way ANOVA; and (c) the separability among the features is assessed using the Box-Whisker plot.

The rest of the paper is organised as follows. Section 2 describes the cry signal used in this study as well as the mathematical description of the time domain features used. Section 3 shows the statistical significance of the features in detecting and discriminating different types of cry-cause factors (hunger, tired, belly pain, and discomfort), as well as the performance of each feature in terms of metrics such as accuracy, sensitivity, and specificity.

2. Methodology

The cry signals analysed in this work are taken from the donateacry-corpus database (Ji, Mudiyanselage, and Gao 2021), in which the cry signals were recorded on Android as well as iOS devices. The data repository provides noise-eliminated signals. The available signal classes are hunger, tired, belly pain, and discomfort, each kept in a separate folder. Thirty signals are selected from each category, so a total of 120 samples is considered for analysis. All the records are available as ‘.wav’ files with a duration of 6 seconds and a sampling frequency of 8 kHz. The repository includes the cry signals of babies aged 4 weeks to 2 years. According to the database, all the records were filtered using

Figure 1. Block diagram of procedures involved in the new-born cry examination.

Figure 2. Wave patterns of cry signal of different classes. (a) Tired (b) Discomfort (c) Hunger (d) Belly pain.

a band pass filter with cut-off frequencies of 250–600 Hz. Hence, no separate filtering is applied in the proposed work.

Figure 1 shows the block diagram of the procedures involved in the infant cry analysis. Prior to feature extraction and evaluation, the cry signal is pre-processed by offset elimination, which removes the unwanted DC component present in the signal.

The wave shapes of the signals corresponding to tired, discomfort, hunger, and belly pain are shown in Figure 2(a–d). The wave patterns of the four classes are all different: they are distinct in terms of amplitude and randomness. The signal associated with belly pain, for example, has a more erratic pattern and a greater average magnitude than the other groups. The magnitude of the cry signal caused by hunger is lower than that of the cry signals of the other classes.

The mathematical formulation of the pre-processing and of the features (Prabhakaran, Lincon, and Arun 2018) is given in Equations (1)–(9). The cry signal after offset elimination is given as

\[ X_0(t) = X(t) - \frac{1}{N}\sum_{t=1}^{N} X(t) \tag{1} \]

where $X(t)$ is the cry signal and $N$ is the total number of samples. The mean value $\mu_{X_{ft}}$ of the signal after offset elimination and filtering, $X_{ft}$ (Arun, Abraham Lincon, and Prabhakaran 2018), is given as

\[ \mu_{X_{ft}} = \frac{1}{N}\sum_{t=1}^{N} X_{ft}(t) \tag{2} \]

The peak amplitude of the discrete-time signal is given by

\[ \text{Peak Amplitude (PA)} = X_{Peak} = \max\left(\left|X_{ft}\right|\right) \tag{3} \]

where $X_{Peak}$ is the peak amplitude value of the stochastic signal. The remaining features are defined as

\[ \text{Root Mean Square value (RMS)} = \sqrt{\frac{1}{N}\sum_{t=1}^{N} X_{ft}(t)^{2}} \tag{4} \]

\[ \text{Crest Factor (CRF)} = \frac{X_{Peak}}{\mathrm{RMS}} \tag{5} \]

\[ \text{Impulse Factor (IF)} = \frac{X_{Peak}}{\mu_{X_{ft}}} \tag{6} \]

\[ \text{Shape Factor (SHF)} = \frac{\mathrm{RMS}}{\mu_{X_{ft}}} \tag{7} \]

\[ \text{Energy} = \frac{1}{N}\left[\sum_{t=1}^{N}\sqrt{\left|X_{ft}(t)\right|}\right]^{2} \tag{8} \]

\[ \text{Clearance Factor (CLF)} = \frac{X_{Peak}}{\mathrm{Energy}} \tag{9} \]

The statistical significance of the features, i.e. their ability to identify the reason for the cry, is assessed using Kruskal–Wallis one-way ANOVA. The separability of the features is evaluated qualitatively using the Box-Whisker plot. Matlab® is used for all mathematical computations, signal pre-processing, feature extraction, and statistical evaluation.

3. Results and discussions

This section presents the analysis of the features on the cry records: their numerical values, range, statistical significance, separability, and performance parameters. The numerical values and the range of the features for the signals of the four cry-cause factors are given in Tables 1 and 2, respectively.

From Table 1, the values of all features except PA and CRF are distributed over both positive and negative regions, whereas the values of PA and CRF are located on the positive side only. For CRF and PA, hunger exhibits less deviation than the other classes. For RMS, the values for belly pain are distributed over positive and negative regions. The majority of the feature values of IF and SHF are very close to zero and are hence too difficult to interpret manually. In the case of energy, the energy of the hunger signal exceeds that of the other classes.
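The pre-processing step and the eight features of Equations (1)–(9) can be sketched in Python as follows. This is a minimal illustration, not the authors' Matlab® code: the function name `time_domain_features`, the helper `_ratio`, and the choice to return infinity when a denominator is exactly zero are assumptions made here, and Equation (8) is read as the squared sum of the square roots of the absolute sample values divided by N.

```python
import math


def _ratio(num, den):
    # Hypothetical helper: avoid ZeroDivisionError when the denominator is exactly zero.
    return num / den if den != 0.0 else math.inf


def time_domain_features(x):
    """Compute the eight time-domain features for one cry record (list of samples)."""
    n = len(x)
    offset = sum(x) / n
    x_ft = [v - offset for v in x]                            # Eq. (1): offset elimination
    mean = sum(x_ft) / n                                      # Eq. (2): mean value
    peak = max(abs(v) for v in x_ft)                          # Eq. (3): peak amplitude
    rms = math.sqrt(sum(v * v for v in x_ft) / n)             # Eq. (4): RMS
    energy = sum(math.sqrt(abs(v)) for v in x_ft) ** 2 / n    # Eq. (8), as read here
    return {
        "Mean": mean,
        "PA": peak,
        "RMS": rms,
        "CRF": _ratio(peak, rms),       # Eq. (5): crest factor
        "IF": _ratio(peak, mean),       # Eq. (6): impulse factor
        "SHF": _ratio(rms, mean),       # Eq. (7): shape factor
        "Energy": energy,
        "CLF": _ratio(peak, energy),    # Eq. (9): clearance factor
    }


# Toy record with a DC offset of 0.5 (synthetic values, not taken from the database).
features = time_domain_features([1.5, -0.5, 1.5, -0.5])
```

Because the mean after offset elimination is zero up to floating-point rounding, IF and SHF computed this way divide by a value near zero, which is consistent with the very large magnitudes listed for these two features in Table 1.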

Table 1. Numerical values of features of cry signal corresponding to belly pain, hunger, tired and discomfort.
Sl. No Feature Tired Discomfort Hunger Belly Pain
1 Mean −2.09E-17 ± 8.23E-17 6.65E-17 ± 2.46E-16 −1.11E-16 ± 4.48E-16 2.11E-18 ± 7.44E-18
2 PA 0.5085 ± 0.2684 0.5804 ± 0.3083 0.7454 ± 0.1923 0.4462 ± 0.2528
3 RMS 0.0710 ± 0.0621 0.0102 ± 0.0997 0.1278 ± 0.054315 0.0514 ± 0.0496
4 CRF 8.6841 ± 2.5361 9.11 ± 5.7207 6.4794 ± 2.0084 10.5835 ± 5.7398
5 IF −1.59E+18 ± 9.89E+18 −6.37E+16 ± 7.67E+17 −2.65E+17 ± 1.57E+18 −2.65E+17 ± 1.57E+18
6 SHF −7.49E+16 ± 6.07E+17 −2.33E+15 ± 6.62E+16 −4.45E+16 ± 2.43E+17 −1.85E+14 ± 1.99E+17
7 Energy 1370 ± 1803.08 2270 ± 2745.74 2790 ± 1720.132 986 ± 1242.151
8 CLF 6.54E-04 ± 0.0006 5.89E-04 ± 0.0005 0.0003 ± 0.0002 6.87E-04 ± 0.0004

Table 2. The range of features of cry signal corresponding to belly pain, hunger, tired and discomfort.
Sl. No | Feature | Tired | Discomfort | Hunger | Belly Pain
1 | Mean | −3.11E-16 to 1.00E-16 | −3.17E-16 to 9.74E-16 | −2.33E-15 to 6.75E-17 | −3.42E-18 to 3.97E-17
2 | PA | 0.0217 to 1 | 0.0626 to 1.0001 | 0.2415 to 1 | 0.0672 to 1.002
3 | RMS | 0.0018 to 0.3266 | 0.006 to 0.426 | 0.0346 to 0.2337 | 0.0092 to 0.2272
4 | CRF | 3.0619 to 16.6087 | 2.43 to 28.70 | 3.0936 to 11.1262 | 4.2625 to 24.9028
5 | IF | −2.92E+18 to 5.38E+19 | −3.30E+18 to 1.69E+18 | −6.90E+18 to 2.72E+18 | −3.95E+18 to 5.07E+18
6 | SHF | −5.01E+17 to 3.24E+18 | −2.34E+17 to 1.31E+17 | −1.10E+18 to 3.98E+17 | −4.45E+17 to 5.74E+17
7 | Energy | 44.50 to 9900 | 69.6 to 11,400 | 572 to 6440 | 189 to 5700
8 | CLF | 1.01E-04 to 3.40E-03 | 8.75E-05 to 2.50E-03 | 0.0001 to 0.0013 | 1.70E-04 to 1.70E-03

From the ranges of the features for the different classes shown in Table 2, it can be inferred that the features PA, RMS, CRF, Energy and CLF are positive for all cry-cause factors, whereas the other three features (Mean, IF and SHF) range between negative and positive values for all cry inputs. The feature ‘energy’ shows a wider range than the other features.

From the details given in Tables 1 and 2, it is noted that the numerical indices are somewhat closely spaced for the feature RMS. The disparities between the numerical values and the ranges of the feature values recovered from cry records relating to belly pain, hunger, tiredness, and discomfort support the feasibility of time domain features for cry signal analysis.

The Kruskal–Wallis test is used to assess the statistical significance of features such as peak amplitude, RMS, impulse factor, crest factor, energy, clearance factor and shape factor in distinguishing cries caused by belly pain, hunger, tiredness, and discomfort. Kruskal–Wallis ANOVA (KWA) is a non-parametric hypothesis test used to quantify how much the feature values retrieved from the various classes differ from one another. Unlike one-way ANOVA, it does not assume that the data to be tested are normally distributed. The ANOVA values reported in this work are generated using Matlab®, and the ANOVA tables of the test are given in Table 3(a–h). The Kruskal–Wallis test yielded Chi-Square values (H) of 8.57, 19.03, 29.01, 19.29, 2.68, 1.97, 24.88, and 15 for Mean, Peak value, RMS, CRF, IF, SHF, Energy, and CLF, respectively, as shown in Table 3(a–h). The same features (in the same order) for the input signals acquired from hunger, tired, belly pain, and discomfort vary with probability values of 0.0356, 0.0003, 2.23307 × 10⁻⁶, 0.0002, 0.4442, 0.5785, 1.63537 × 10⁻⁵, and 0.0018, respectively.

Class separability, which is based on distance measures, is another metric that can be used to rank features and measure the discriminative power of a feature. The idea behind using distance measurements is that good features should place items of the same class close together for all classes in the dataset (i.e. small intraclass variability), and should also place items of different classes far apart (i.e. large interclass variability).

In this work, the Box-Whisker plot is used to measure the separability offered by the features. This plot gives a clear picture of how far the extreme values are from most of the data. A box plot is constructed from five values: the minimum value, the first quartile, the median, the third quartile, and the maximum value. These values are used to compare how close the other data values are to them.

The Box-Whisker plots of the features corresponding to the cry states hunger, tired, discomfort, and belly pain are shown in Figure 3(a–h).

In Figure 3(a, e and f), the boxes of all the classes (tired, hunger, belly pain and discomfort) are largely similar. In the Box-Whisker plots of RMS and Energy (Figure 3(c and g)), the boxes of the classes do not lie sufficiently far apart and look somewhat alike. In the plot for peak amplitude (Figure 3(b)), the box corresponding to ‘discomfort’ mostly overlaps with all the other boxes; that is, the separability offered by this feature is too low to discriminate the cry-cause factors. In the box plot shown in Figure 3(h), the clearance factor of ‘hunger’ offers more separability than that of the other three classes. Taken together, the Box-Whisker plots show that the separability offered by the time domain features used in this paper for discriminating the various cry-cause factors is low, as most of the boxes overlap. Moreover, the overlap is also evident in the numerical values of the features and in the ‘P’ values obtained from the Kruskal–Wallis test.

Metrics such as accuracy, sensitivity, and specificity (Balakrishnan and Sathiyasekar 2015) are used in this work to examine the effectiveness of each feature in differentiating the various cry-cause factors.

A diagnostic test's sensitivity refers to its capacity to correctly detect patients with the medical problem, i.e. the percentage of persons who are correctly diagnosed as unwell. It is the percentage of true positives that the test correctly recognises, calculated as:

\[ \text{Sensitivity} = \frac{\text{True positives}}{\text{True positives} + \text{False negatives}} \tag{10} \]

Table 3. The Kruskal–Wallis ANOVA (KWA) tables of the different features.

(a): The KWA table corresponding to the feature ‘Mean’
Sources | Sum of Squares (SS) | Degrees of freedom (DF) | Mean Squares (MS) | Chi-Squared value (H) | Probability value (P)
Column | 10,369.1 | 3 | 3456.38 | 8.57 | 0.0356
Error | 133,607.4 | 116 | 1151.79 | |
Total | 143,976.5 | 119 | | |
(b): The KWA table corresponding to the feature ‘Peak Value’
Sources | Sum of Squares (SS) | Degrees of freedom (DF) | Mean Squares (MS) | Chi-Squared value (H) | Probability value (P)
Column | 23,020.3 | 3 | 7673.43 | 19.03 | 0.0003
Error | 120,911.7 | 116 | 1042.34 | |
Total | 143,932 | 119 | | |
(c): The KWA table corresponding to the feature ‘Root Mean Square’
Sources | Sum of Squares (SS) | Degrees of freedom (DF) | Mean Squares (MS) | Chi-Squared value (H) | Probability value (P)
Column | 35,092.1 | 3 | 11,697.4 | 29.01 | 2.23E-06
Error | 108,877.4 | 116 | 938.6 | |
Total | 143,969.5 | 119 | | |
(d): The KWA table corresponding to the feature ‘Crest Factor’
Sources | Sum of Squares (SS) | Degrees of freedom (DF) | Mean Squares (MS) | Chi-Squared value (H) | Probability value (P)
Column | 23,344.1 | 3 | 7781.36 | 19.29 | 0.0002
Error | 120,633.9 | 116 | 1039.95 | |
Total | 143,978 | 119 | | |
(e): The KWA table corresponding to the feature ‘Impulse Factor’
Sources | Sum of Squares (SS) | Degrees of freedom (DF) | Mean Squares (MS) | Chi-Squared value (H) | Probability value (P)
Column | 3238.5 | 3 | 1079.49 | 2.68 | 0.4442
Error | 140,739.5 | 116 | 1213.27 | |
Total | 143,978 | 119 | | |
(f): The KWA table corresponding to the feature ‘Shape Factor’
Sources | Sum of Squares (SS) | Degrees of freedom (DF) | Mean Squares (MS) | Chi-Squared value (H) | Probability value (P)
Column | 2384.5 | 3 | 794.82 | 1.97 | 0.5785
Error | 141,593.5 | 116 | 1220.63 | |
Total | 143,978 | 119 | | |
(g): The KWA table corresponding to the feature ‘Energy’
Sources | Sum of Squares (SS) | Degrees of freedom (DF) | Mean Squares (MS) | Chi-Squared value (H) | Probability value (P)
Column | 30,103.1 | 3 | 10,034.4 | 24.88 | 1.64E-05
Error | 113,874.9 | 116 | 981.7 | |
Total | 143,978 | 119 | | |
(h): The KWA table corresponding to the feature ‘Clearance Factor’
Sources | Sum of Squares (SS) | Degrees of freedom (DF) | Mean Squares (MS) | Chi-Squared value (H) | Probability value (P)
Column | 18,146.1 | 3 | 6048.69 | 15 | 0.0018
Error | 125,826.4 | 116 | 1084.71 | |
Total | 143,972.5 | 119 | | |
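The way an H statistic and its probability value of the kind listed in Table 3 arise can be illustrated with a short Python sketch. It is not the Matlab® routine used in the paper: the function names are hypothetical, no correction for tied values is applied, and the closed-form tail probability below is valid only for 3 degrees of freedom, which is the case for four classes.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0          # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (without tie correction) for a list of samples."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n * (n + 1)) * total - 3.0 * (n + 1)


def chi2_sf_3dof(h):
    """Upper-tail probability of a chi-squared variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(h / 2.0)) + math.sqrt(2.0 * h / math.pi) * math.exp(-h / 2.0)


# Synthetic feature values for three classes (placeholders, not data from the paper).
h_toy = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Tail probability for the H value reported for RMS in Table 3(c).
p_rms = chi2_sf_3dof(29.01)
```

Evaluating the chi-squared tail at the reported H values gives probabilities in line with Table 3, for example about 2.2 × 10⁻⁶ for H = 29.01, which suggests that the tabulated P values correspond to a chi-squared distribution with 3 degrees of freedom.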

Figure 3. The Box-Whisker plot of the features of tired, hunger, belly pain, discomfort. (a) Mean; (b) Peak value; (c) RMS; (d) crest
factor; (e) impulse factor; (f) Shape factor; (g) Energy; and (h) Clearance factor.

In general, true positives and false positives refer to subjects who have been correctly and mistakenly identified, respectively. True and false negatives are the subjects that were correctly and incorrectly rejected.

The ability of a test to reliably identify persons without the ailment is known as specificity. It is the proportion of healthy people who are correctly identified as such, i.e. the percentage of true negatives that the test correctly identifies:

\[ \text{Specificity} = \frac{\text{True negatives}}{\text{False positives} + \text{True negatives}} \tag{11} \]

The accuracy of a test refers to its ability to correctly discriminate between normal and abnormal situations. It is estimated as the fraction of true positives (TP) and true negatives (TN) among all analysed cases, where FP and FN denote the false positives and false negatives:

\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{12} \]

The other two main diagnostic accuracy measures are the positive predictive value (PPV) and the negative predictive value (NPV). They are linked to sensitivity and specificity by the disease prevalence $\Pi$. The PPV is the probability that the disease is present given a positive test result and is calculated as follows:

\[ \text{PPV} = \frac{\text{sensitivity} \times \Pi}{\text{sensitivity} \times \Pi + (1 - \text{specificity}) \times (1 - \Pi)} \tag{13} \]

Table 4. Performance metrics of the time domain features to distinguish different cry-cause factors.
Features State Sensitivity (%) Specificity (%) Accuracy(%) Threshold
Mean TH 60 43.33 51.67 −8.00E-19
TB 76.66 63.33 70 −2.72E-19
TD 66.66 63.33 65 −2.72E-19
HB 70 53.33 61.67 −1.29E-19
HD 50 73.33 61.67 1.24E-18
BD 46.66 56.66 51.67 1.37E-18
PA TH 70 73.33 71.67 0.6316
TB 66.66 73.33 70 0.4555
TD 56.66 56.66 56.67 0.5175
HB 93.33 80 86.67 0.4940
HD 70 66.66 68.33 0.6333
BD 66.66 66.66 66.67 0.4442
RMS TH 73.33 76.66 75 0.0884
TB 56.66 80 68.33 0.0547
TD 50 60 55 0.0734
HB 90 93.33 91.67 0.0603
HD 73.33 66.66 70 0.0884
BD 60 86.66 73.33 0.0576
CRF TH 76.66 80 78.33 7.2266
TB 46.66 56.66 51.67 8.7879
TD 33.33 70 51.67 9.4639
HB 80 80 80 7.2317
HD 76.66 60 68.33 6.4737
BD 60 63.33 61.67 7.7754
IF TH 66.66 40 53.33 −9.52E+16
TB 56.66 36.66 46.67 −6.06E+16
TD 36.66 56.66 46.67 4.0117E+15
HB 60 70 65 6.87E+16
HD 56.66 53.33 55 2.9336E+14
BD 63.33 63.33 63.33 2.39E+16
SHF TH 66.66 40 53.33 −1.29E+16
TB 56.66 36.66 46.67 −6.68E+15
TD 36.66 56.66 46.67 5.9183E+14
HB 63.33 66.66 65 5.27E+15
HD 56.66 53.33 55 2.9624E+13
BD 60 66.66 63.33 9.81E+15
Energy TH 73.33 80 76.67 1867.95
TB 63.33 66.66 65 807.4493
TD 56.66 70 63.33 1299.7
HB 73.33 93.33 83.33 1624.65
HD 70 66.66 68.33 2103.85
BD 56.66 93.33 75 1306.5
CLF TH 66.66 80 73.33 0.0004
TB 40 73.33 56.67 0.0006
TD 66.66 66.66 66.67 0.0004
HB 86.66 63.33 75 0.0003
HD 60 63.33 61.67 0.0003
BD 73.33 53.33 63.33 0.0003
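One way to obtain a row of Table 4 for a pair of classes is to scan candidate thresholds and keep the one whose operating point lies closest to (0, 1) in the plane of (1 − specificity, sensitivity). The Python sketch below illustrates that idea under stated assumptions: the function name `closest_to_corner_threshold` is hypothetical, values at or above the threshold are assigned to the "positive" class, and the candidate thresholds are taken from the observed feature values. The paper does not specify these details.

```python
import math


def closest_to_corner_threshold(negatives, positives):
    """Return (threshold, sensitivity, specificity) nearest to the (0, 1) corner.

    negatives, positives: feature values of the two classes being compared.
    A value >= threshold is classified as positive (an assumption of this sketch).
    """
    best = None
    for thr in sorted(set(negatives) | set(positives)):
        tp = sum(v >= thr for v in positives)
        tn = sum(v < thr for v in negatives)
        sensitivity = tp / len(positives)
        specificity = tn / len(negatives)
        # Distance from (1 - specificity, sensitivity) to the ideal point (0, 1).
        dist = math.hypot(1.0 - specificity, 1.0 - sensitivity)
        if best is None or dist < best[0]:
            best = (dist, thr, sensitivity, specificity)
    return best[1], best[2], best[3]


# Synthetic, perfectly separable feature values (placeholders, not data from the paper).
thr, sens, spec = closest_to_corner_threshold([0.1, 0.2, 0.3], [0.4, 0.5, 0.6])
```

When two thresholds are equally close to the corner, this sketch keeps the smaller one; a different tie-breaking rule would be equally defensible.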

Similarly, the NPV is defined as the probability that the disease is not present given a negative test result:

\[ \text{NPV} = \frac{\text{specificity} \times (1 - \Pi)}{\text{specificity} \times (1 - \Pi) + (1 - \text{sensitivity}) \times \Pi} \tag{14} \]

Table 4 shows the sensitivity, specificity, and accuracy of all of the features used to categorise tired, discomfort, hunger, and belly pain. As already stated, eight features are employed in this paper, and the performance of each feature is evaluated on six pairs of states: TH (tired-hunger), TB (tired-belly pain), TD (tired-discomfort), HB (hunger-belly pain), HD (hunger-discomfort) and BD (belly pain-discomfort).

The threshold is chosen as the point closest to (0, 1), the ideal corner of the receiver operating characteristic (ROC) curve, in order to maximise both sensitivity and specificity (Habibzade, Habibzadeh, and Yadollahie 2016). From Table 4, it is noted that the TH states can be readily distinguished by the feature energy, with a sensitivity of 73.33%, a specificity of 80% and an accuracy of 76.67% at a threshold value of 1867.95. The TB states can be readily distinguished by the feature mean, with a sensitivity of 76.66%, a specificity of 63.33% and an accuracy of 70% at a threshold value of −2.72E-19. The TD states can be readily distinguished by the feature clearance factor, with sensitivity, specificity, and accuracy all about 66.66% (threshold value 0.0004). The HB states can be readily distinguished by the feature RMS, with a sensitivity, specificity, and accuracy of 90%, 93.33% and 91.67%, respectively, at a threshold value of 0.0603. The HD states can be readily distinguished by the feature RMS, with a sensitivity of 73.33%, a specificity of 66.66% and an accuracy of 70% at a threshold value of 0.0884. The BD states are better distinguished by the feature energy, with a sensitivity of 56.66%, a specificity of 93.33% and an accuracy of 75% at a threshold value of 1306.5. The RMS feature thus distinguishes hunger from belly pain more effectively than the other features, with a sensitivity, specificity, and accuracy of 90%, 93.33%, and 91.67%, respectively, at a threshold value of 0.0603. As a result, a non-invasive procedure that relies on RMS to determine the cry-cause factor from the cry signals appears to be more promising than one based on any of the other features presented in this paper.

4. Conclusions

The diagnostic feasibility of time domain features for detecting and differentiating different types of cry-cause factors of cry signals was reviewed and discussed in this paper. A case study using sample cry records was provided for discussion. The features employed are mean, peak value, RMS, shape factor, energy, crest factor, clearance factor and impulse factor. The probability values (P) obtained from the Kruskal–Wallis test are 2.23307 × 10⁻⁶, 0.0356, 0.5785, 1.63537 × 10⁻⁵, 0.4442, 0.0018, 0.0003, and 0.0002 for RMS, mean, shape factor, energy, impulse factor, clearance factor, peak value and crest factor, respectively. RMS was found to be more successful than all the other features in detecting cry-cause factors, with an accuracy, sensitivity, and specificity of 91.67%, 90%, and 93.33%, respectively. Using the statistically significant features, an automated non-invasive method can also be implemented to distinguish various cry-cause factors. As future work, the time domain features used in this work can be fed into various classifiers to examine their potential to discriminate the cry-cause factors. The ability of frequency domain and time–frequency domain features can also be investigated as a future enhancement.

Disclosure statement

No potential conflict of interest was reported by the author(s).

References

Arun, P., S. Abraham Lincon, and N. Prabhakaran. 2018. “Detection and Characterization of Bearing Faults from the Frequency Domain Features of Vibration.” IETE Journal of Research 64: 634–647. doi:10.1080/03772063.2017.1369369.
Balakrishnan, P., and K. Sathiyasekar. 2015. “Wavelet and Kernel Principal Component Analysis Based Fuzzy-Neuro Technique to Detect and Classify Power Transmission System Faults.” Australian Journal of Electrical and Electronics Engineering 12 (Nov): 1–12. doi:10.7158/E13-189.2015.12.1.
Chittora, A., and H. A. Patil. 2016. “Spectral Analysis of Infant Cries and Adult Speech.” International Journal of Speech Technology 19 (4, Oct): 841–856. doi:10.1007/s10772-016-9375-z.
Diaz, M., C. Garcia, L. Robles, J. Altamirano, and A. Mendoza. 2012. “Automatic Infant Cry Analysis for the Identification of Qualitative Features to Help Opportune Diagnosis.” Biomedical Signal Processing and Control 7 (Jan): 43–49.
Etz, T., H. Reetz, C. Wegener, and F. Bahlmann. 2013. “Infant Cry Reliability: Acoustic Homogeneity of Spontaneous Cries and Pain-Induced Cries.” Speech Communication 58 (Nov): 91–100. doi:10.1016/j.specom.2013.11.006.
Habibzade, F., P. Habibzadeh, and M. Yadollahie. 2016. “On Determining the Most Appropriate Test Cut-off Value: The Case of Tests with Continuous Results.” Biochemia medica 26 (Oct): 297–30. doi:10.11613/BM.2016.034.

Hariharan, M., J. Saraswathy, R. Sindhu, W. Khairunizam, and S. Yaacob. 2012. “Infant Cry Classification to Identify Asphyxia Using Time-frequency Analysis and Radial Basis Neural Network.” Expert Systems with Applications 39 (Aug): 9515–9523. doi:10.1016/j.eswa.2012.02.102.
Jeyaraman, S., H. H. Muthusamy, W. Khairunizam, S. Jeyaraman, T. Nadarajaw, S. Yaacob, and S. Nisha. 2018. “A Review: Survey on Automatic Infant Cry Analysis and Classification.” Health and Technology 8 (July): 391–404. doi:10.1007/s12553-018-0243-5.
Ji, C., T. B. Mudiyanselage, and Y. Gao. 2021. “A Review of Infant Cry Analysis and Classification.” EURASIP Journal on Audio, Speech, and Music Processing 8 (Feb): 1–17.
Kim, S., and F. 2012. “An Introduction to Wavelet Theory in Finance: A Wavelet Multiscale Approach.” World Scientific Publishing Company.
Nor, N. M., and R. Ab-Rashid. 2018. “A Review of Theoretical Perspectives on Language Learning and Acquisition.” Kasetsart Journal of Social Sciences 39 (April): 161–167. doi:10.1016/j.kjss.2017.12.012.
Orlandi, S., C. A. R. Garcia, A. Bandini, G. Donzelli, and C. Manfredi. 2018. “Application of Pattern Recognition Techniques to the Classification of Full-Term and Preterm Infant Cry.” Biocybernetics and Biomedical Engineering 30 (May): 656–653.
Prabhakaran, N., S. A. Lincon, and P. Arun. 2018. “Non-intrusive Detection and Characterization of Bearing Faults from the Temporal Features of Vibration.” Australian Journal of Mechanical Engineering 2 (June): 1–8.
Ramesh, S., M. Handia, B. Padma, S. Madhushree, et al. 2019. “A Smart Baby Cradle.” Global Journal of Computer Science and Technology 19 (May): 1–10.
Ruíz Díaz, A., C. A. Reyes García, L. C. Altamirano Robles, J. E. Xalteno Altamirano, A. Verduzco Mendoza, et al. 2012. “Automatic Infant Cry Analysis for the Identification of Qualitative Features to Help Opportune Diagnosis.” Biomedical Signal Processing and Control 7 (1, Jan): 43–49. doi:10.1016/j.bspc.2011.06.011.
Sahin, M., S. Sahin, F. N. Sari, E. C. Tatar, N. Uras, S. S. Oguz, and M. H. Korkmaz. 2017. “Utilizing Infant Cry Acoustics to Determine Gestational Age.” Journal of Voice 31 (July): 506. doi:10.1016/j.jvoice.2016.10.005.
Xie, Q. 1993. “Automatic Infant Cry Analysis and Recognition.” Ph.D. thesis, University of British Columbia.
