
J Fail. Anal. and Preven.

https://doi.org/10.1007/s11668-024-01883-0

ORIGINAL RESEARCH ARTICLE

Diagnosis of Bearing Faults Using Temporal Vibration Signals: A Comparative Study of Machine Learning Models with Feature Selection Techniques
Alaa Abdulhady Jaber

Submitted: 21 November 2023 / in revised form: 19 January 2024 / Accepted: 23 January 2024
© ASM International 2024

A. A. Jaber (corresponding author)
Mechanical Engineering Department, University of Technology-Iraq, Baghdad, Iraq
e-mail: alaa.a.jaber@uotechnology.edu.iq

Abstract  Accurately identifying bearing defects is crucial for guaranteeing the dependability and effectiveness of industrial systems. Although the use of vibration signals for diagnosing bearing faults is highly important, persistent obstacles remain, especially in detecting minor damage in its early stages. Relying solely on time-domain analysis for statistical feature extraction in complex multi-fault scenarios may lack robustness, while the computational demands of frequency-domain and time–frequency approaches impede the real-time identification of emerging faults, despite their effectiveness. Although machine learning holds potential, its reliance on attributes that are not generated from the time domain poses a difficulty. Therefore, the main aim of this study is to fill these gaps by examining the utilization of temporal vibration signals for diagnosing bearing defects. The vibration signals originated from Case Western Reserve University. A total of fourteen time-domain features were derived from the vibration signal, encompassing root mean square, kurtosis, and skewness. The study employed two feature selection strategies, specifically Information Gain and Fast Correlation-Based Filter (FCBF), to identify the seven most important features for training machine learning models, including k-Nearest Neighbor (kNN), Support Vector Machines, and Naïve Bayes. Based on the acquired data, the kNN-based FCBF model (kNN-FCBF) exhibited superior classification outcomes in comparison to the alternative methods. The evaluation metrics, including Area Under the Curve (AUC), Accuracy (AC), F1-score, Precision, and Recall, demonstrated robust performance: the AUC attained 99.1%, AC was 97%, the F1-score reached 96%, Precision was 96%, and Recall scored 95.7%. The benefits of the kNN-FCBF model were emphasized by a comparative analysis with prior studies. The kNN-FCBF algorithm provides a straightforward and precise approach that is less intricate and computationally affordable, while still achieving high levels of accuracy.

Keywords  Bearing fault diagnosis · Condition monitoring · Vibration analysis · Feature selection · Machine learning

Introduction

Today, rotating machines play a significant role in industrial systems and are perhaps among the most crucial equipment in various industrial applications, such as the petrochemical, aviation, chemical, and household appliance sectors. These rotating machines have many components, including shafts, gears, and bearings. Bearings play a key role, as they guide and support the shafts in rotating machinery and frequently operate in harsh environments. However, problems in their design, production, assembly, and operation eventually manifest as bearing performance deterioration and machine failure [1, 2]. Many studies have shown that bearing failure is the primary cause of most mechanical failures in rotating equipment. In practical applications, the failure of rotating machinery's bearings poses a major danger to its dependability and safety and might result in production and equipment losses. Although the onset of early bearing faults may not cause an instant breakdown, their growth over time will result in severe machine failure that involves costly maintenance [3].

In recent years, a great deal of attention has been paid to bearing health monitoring, and several measures have been undertaken to avoid bearing failure. As a result, monitoring the health condition and diagnosing faults in bearings have been considered a crucial area of industrial research. Various monitoring methods for bearing health have been introduced in recent years, including noise, temperature, current, and vibration analysis [4–7].

Vibration signals are one of the most valuable and informative sources for understanding phenomena related to bearing issues [3, 8]. They can offer information on the actual operating condition of the equipment at any given time without halting the manufacturing line. Vibration monitoring is widely regarded as the most effective strategy because of its capability to identify, locate, and differentiate various kinds of defects from their onset, before they become severe and harmful. The approach for vibration monitoring of rotary equipment comprises a sensor that transmits the machine vibration signature to a data-gathering system interfaced with a computer [9]. Nevertheless, the efficiency of vibration monitoring depends on the signal processing approach used to extract fault diagnostic features from the collected data. This has led several researchers to suggest various diagnosis techniques that examine the recorded signal using different approaches, such as temporal/time-domain analysis, frequency analysis, and time–frequency analysis. Owing to its significant findings in various health monitoring applications, temporal/statistical analysis is frequently employed [10, 11]. It uses different time-based indicators such as root mean square [9, 12], kurtosis [12], peak value, and skewness [13]. Simplified and shorter computations are a benefit of time-domain analysis; however, its main shortcomings are its insensitivity to early faults and to deeply dispersed defects.

Spectrum or frequency analysis is the most traditional way to find faults in rotating machinery. This method converts the temporal signal to the frequency domain and, compared to time-domain analysis, provides thorough and early information on the machine's status. As a result, many techniques, including the bearing defect frequencies analysis technique [14], the envelope spectrum method [15], and the Hilbert transformation method [16], are based on exploiting the spectrum of the temporal signal. This approach is accurate and reliable for locating damage and identifying bearing faults; however, its effectiveness depends on the bearing's size and rotational speed. Additionally, as high-energy noise frequently masks the fault information, all approaches that use the frequency domain need careful consideration when choosing the frequency band. As a result, the fault may not be detectable in the spectra using traditional frequency-based techniques. The key benefit of the time–frequency-domain analysis approach over temporal and frequency-based techniques is its ability to provide valuable information for both stationary and non-stationary signals [17]. Numerous time–frequency analysis techniques, including the short-time Fourier transform (STFT) [18, 19], wavelet transforms [20, 21], and empirical mode decomposition [22], have been developed. It should be emphasized that in many instances, and especially with variable speed and load systems, a simple assessment of the monitoring index does not offer trustworthy information about the machine's condition [3].

Due to the rapid growth of artificial intelligence (AI) technology, machine learning techniques are often employed in the mechanical component failure diagnostic process [17]. Rolling bearings display unique vibration characteristics depending on the working environment, and the dependability and stability of the whole machine are directly impacted by how the bearings operate. For this reason, several researchers have suggested various intelligent diagnosis techniques based on machine learning or artificial intelligence models to detect bearing defects.

Patil et al. [21] proposed a fault identification and classification system for induction motor bearing health monitoring based on vibration signal analysis. The discrete wavelet transform (DWT), considering different wavelet families such as DB4, DB8, and Sym5, was applied first for signal analysis. Different statistical features, including RMS, kurtosis, and skewness, were then extracted from the third level of the DWT. These features were fed to a three-layer artificial neural network (ANN) model for fault classification. The developed system was capable of classifying ball, inner race, and outer race bearing faults with 98.7% accuracy. In similar research, Deák and Kocsis [23] used the Support Vector Machine (SVM) method for classifying features related to different bearing faults extracted from vibration signals using the DWT. However, the authors used an approach based on the energy-to-Shannon-entropy ratio to enhance the feature extraction step and select the best wavelet family.

Another study suggested using k-Nearest Neighbor (kNN), Decision Tree (DT), and SVM to diagnose different induction motor faults, including bearing faults [16]. The researchers adopted current signal analysis based on the Hilbert–Huang transform (HHT) to extract fault-related features. Then, the most salient features were selected using different dimensionality reduction algorithms and provided to the employed machine learning approaches. The general conclusion of this research is that the feature selection step can considerably enhance the classification accuracy of the fault diagnosis system. Furthermore, for the identification and localization of faults in roller bearings, several statistical attributes of vibration signals in the time and frequency domains were extracted in [24].

For obtaining better classification accuracy, these attributes were then concatenated and fed to three machine learning algorithms, namely kNN, SVM, and kernel linear discriminant analysis (KLDA). Based on the simulation results, KLDA achieved the best classification accuracy (99.13%). Li et al. [25] investigated the application of a convolutional neural network (CNN) based on raw time-domain vibration signal analysis for bearing fault diagnosis. The aim was to avoid the time-consuming feature extraction step by using a data enhancement method. The suggested method's recognition accuracy on the CWRU bearing database exceeds 96%. In [26], a monitoring approach for bearing defect diagnosis was proposed that considers the identification of both distributed faults, such as roughness, and local defects, such as single-point ball and raceway faults. First, the approach evaluates the most relevant statistical time characteristics computed from the vibration signal. Then, the most significant features that best discriminate among faults are selected. In the final classification step, a hierarchical neural network topology is utilized. The efficacy of this condition-monitoring technique has been validated by experimental findings gathered under various operating situations.

Previous research into diagnosing bearing faults using vibration signals has shown effectiveness through frequency-domain, time–frequency, and time-domain analyses. However, these methods face limitations: frequency-domain and time–frequency approaches can be computationally intensive [27, 28], impeding real-time diagnosis and the detection of subtle early-stage faults. While time-domain statistical features are computationally simple, relying solely on them may lack robustness in complex multi-fault scenarios with noise interference. Moreover, many machine learning-based approaches have predominantly utilized features derived from frequency or time–frequency transforms rather than optimizing time-domain features [29, 30].

To overcome these problems, the current study proposes a comprehensive framework that combines effective time-domain feature extraction with enhanced machine-learning methods that include essential feature selection. The objective of this strategy is to address previous studies' limitations by developing a system that achieves precise fault classification while remaining simple enough for the prompt detection of early damage. The incorporation of feature selection aims to improve model performance by eliminating irrelevant and duplicated features that may impair classification. The suggested methodology offers a computationally efficient and robust way to diagnose bearing problems. The efficacy of the suggested approach is evaluated in terms of its capacity to detect incipient bearing abnormalities, employing refined time-domain analysis and machine learning modeling.

The rest of this paper is organized as follows. The following section provides an overview of the machine learning algorithms employed, including the kNN, SVM, and Naive Bayes classifiers. The "Experimental Setup" section describes the experimental apparatus and the CWRU bearing dataset used in this research. The process of extracting time-domain features is discussed in the "Feature Extraction" section, and the "Feature Selection" section investigates the considered techniques for selecting features. The evaluation methodology for the machine learning models is described in the "Machine Learning Models Evaluation" section. The "Results and Discussion" section presents the results and discussion, including feature importance rankings, classification results, confusion matrix analysis, and a comparative evaluation. The conclusions and contributions of this investigation are summarized in the "Conclusion" section.

Considered Machine Learning Algorithms

k-Nearest Neighbor

The k-Nearest Neighbor (kNN) algorithm is a fundamental and widely used machine learning technique. It involves calculating the distance between data points and assumes that similar entities are in close proximity to each other, as depicted in Fig. 1. This nonparametric method is popular due to its simplicity and straightforward implementation [31, 32].

Fig. 1  An illustration of the kNN algorithm [adapted from Ref 32]
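Before the formal treatment that follows, the neighbour-voting idea can be illustrated with a minimal sketch using plain NumPy on synthetic data (not the CWRU features); the function name and example values are illustrative only.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_query, k=5):
    """Classify one query point by majority vote of its k nearest neighbours."""
    # Euclidean distance from the query to every training sample (cf. Eq 1 below)
    distances = np.sqrt(np.sum((X_train - x_query) ** 2, axis=1))
    # Indices of the k closest training samples
    nearest = np.argsort(distances)[:k]
    # Majority vote over the neighbours' class labels
    return Counter(y_train[nearest]).most_common(1)[0][0]

# Synthetic illustration: two clusters standing in for two bearing states
rng = np.random.default_rng(0)
X_train = np.vstack([rng.normal(0.0, 0.1, (20, 2)), rng.normal(1.0, 0.1, (20, 2))])
y_train = np.array([0] * 20 + [1] * 20)          # 0 = "healthy", 1 = "faulty"
print(knn_predict(X_train, y_train, np.array([0.9, 1.1])))   # expected: 1
```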


kNN employs various distance calculation techniques, including the Euclidean and Manhattan distances, to determine the similarity or proximity between two points. Among these methods, the Euclidean distance is the most commonly used and can be expressed by Eq 1 [32]:

D(X, Y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}    (Eq 1)

In the given context, D represents the distance between two points, X and Y denote two points in an n-dimensional Euclidean space, and x_i and y_i represent the values of the two corresponding data points.

When a new data point is provided, the algorithm computes the distance between the new data point and its closest neighbors, with the number of nearest neighbors (k) predetermined. By calculating the distances and considering the class labels of the k nearest neighbors, the algorithm determines the class of the new data point. The choice of k is crucial: a higher value reduces the impact of noise but may blur the class boundaries, leading to an underfitted model. Conversely, a smaller k imposes stricter classification criteria, resulting in faster computation but risking overfitting. Therefore, optimizing the value of k is essential for enhancing the performance of the kNN algorithm by decreasing the prediction error.

Support Vector Machine

Support Vector Machine (SVM) is a widely used classifier that offers significant advantages in addressing pattern recognition problems involving limited samples, nonlinearity, and high dimensions [33–35]. As depicted in Fig. 2, the primary concept of SVM involves finding an optimal hyperplane in the feature space to effectively separate the dataset. In the illustration, H symbolizes the classification boundary, and the parallel lines H1 and H2 pass through the data points closest to H from the two classes. These specific data points, termed support vectors, play a crucial role. The distance between H1 (or H2) and H is known as the geometric margin, and the key goal during training of the SVM model is to maximize this geometric margin.

Fig. 2  Illustration depicting the underlying principle of SVM classification [adapted from Ref 33]

Consider a training dataset D = \{(x_i, y_i),\ i = 1, \ldots, n\}, where x_i is a K-dimensional column vector (x \in R^{K \times 1}), y_i denotes the category of x_i, and y_i \in \{-1, 1\}. The overall decision function can be expressed as follows [32]:

y_i = \mathrm{sgn}((w \cdot x) + b)    (Eq 2)

In the equation, sgn is the sign function, the weight vector is denoted by w, and b represents the bias. Consequently, maximizing the geometric margin can be converted into solving a quadratic programming problem as follows:

\min \; \frac{1}{2}\lVert w \rVert^2    (Eq 3)

subject to the constraint [32, 33, 36]:

y_i (w \cdot x_i + b) \ge 1, \quad i = 1, \ldots, n    (Eq 4)

The Lagrange optimization function is introduced for the above problem:

L(w, b, \alpha) = \frac{1}{2}\lVert w \rVert^2 - \sum_{i=1}^{n} \alpha_i y_i (w \cdot x_i + b) + \sum_{i=1}^{n} \alpha_i    (Eq 5)

The Lagrange multipliers \alpha_i must be non-negative (\alpha_i \ge 0). Hence, the overall expression for the decision function is as follows:

f(x) = \mathrm{sgn}\left(\sum_{i=1}^{n} \alpha_i y_i (x_i \cdot x) + b\right)    (Eq 6)

However, it is common for the samples in the original space to exhibit nonlinearity and inseparability. Therefore, it is imperative to incorporate a suitable kernel function. Ultimately, this enables the derivation of the fundamental SVM model as:

f(x) = \mathrm{sgn}\left(\sum_{i=1}^{n} \alpha_i y_i K(x_i, x) + b\right)    (Eq 7)

In this context, K denotes the kernel function, which can take various forms such as linear, polynomial, sigmoid, or Gaussian radial basis functions.
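As an illustration of the kernel formulation above, the short sketch below fits an RBF-kernel SVM with scikit-learn on synthetic two-class data; it is a generic example rather than the configuration used later in the paper, and the array shapes are assumptions.

```python
import numpy as np
from sklearn.svm import SVC

# Synthetic two-class data standing in for time-domain feature vectors
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.2, (50, 7)), rng.normal(1.0, 0.2, (50, 7))])
y = np.array([0] * 50 + [1] * 50)

# SVC realizes the decision function of Eq 7; kernel="rbf" selects the Gaussian
# radial basis kernel K(x_i, x) = exp(-gamma * ||x_i - x||^2)
clf = SVC(kernel="rbf", C=1.0, gamma="scale")
clf.fit(X, y)

print(clf.support_vectors_.shape)   # the support vectors found during training
print(clf.predict(X[:3]))           # sign-based class decisions for new samples
```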


Naïve Bayes

The Naïve Bayes algorithm is founded on the assumption that all attribute variables are conditionally independent of one another given the classification variable; that is, the attributes X_1, \ldots, X_n are conditionally independent of each other given Y [37, 38]. This assumption has significant value, as it greatly simplifies the representation of P(X|Y) and the challenge of estimating it from the training data. To illustrate, consider a specific scenario where the vector X is composed of attributes X_1 and X_2; hence [37, 38],

P(X|Y) = P(X_1, X_2|Y) = P(X_1|X_2, Y)\,P(X_2|Y) = P(X_1|Y)\,P(X_2|Y)    (Eq 8)

In a more general context, if X consists of n attributes that are conditionally independent of each other given Y, then:

P(X_1, \ldots, X_n|Y) = \prod_{i=1}^{n} P(X_i|Y)    (Eq 9)

When X_i and Y are Boolean variables, only 2n parameters are required to define P(X_i = x_{ik} | Y = y_j) for the appropriate i, j, and k. This is a significant reduction compared to the 2(2^n - 1) parameters needed to describe P(X|Y) without assuming conditional independence. Now, let us derive the Naïve Bayes algorithm in a general context, where Y can be any discrete-valued variable and the attributes X_1, \ldots, X_n can be either discrete or real valued. The objective here is to train a classifier that provides a probability distribution over the potential values of Y for each new instance X that requires classification. Following Bayes' rule, the expression for the probability that Y assumes its kth possible value is as follows [37, 38]:

P(Y = y_k | X_1, \ldots, X_n) = \frac{P(Y = y_k)\,P(X_1, \ldots, X_n | Y = y_k)}{\sum_j P(Y = y_j)\,P(X_1, \ldots, X_n | Y = y_j)}    (Eq 10)

where the sum runs over all possible values y_j of the variable Y. Given the assumption that the X_i are conditionally independent given Y, Eq 10 can be rewritten as follows:

P(Y = y_k | X_1, \ldots, X_n) = \frac{P(Y = y_k)\,\prod_i P(X_i | Y = y_k)}{\sum_j P(Y = y_j)\,\prod_i P(X_i | Y = y_j)}    (Eq 11)

Equation 11 is the key equation for the Naïve Bayes classifier. When the classifier is provided with a new instance X_{new} = (X_1, \ldots, X_n), it shows how to compute the probability of Y adopting a specific value, based on the observed attribute values of X_{new} and the distributions P(Y) and P(X_i|Y) estimated from the training data.

Experimental Setup

The vibration signals utilized in this research for bearing fault detection were obtained from the Bearing Data Centre at Case Western Reserve University (CWRU); these signals were used as a benchmark dataset [39]. This dataset is widely acknowledged for its easy accessibility and direct applicability to real-world industrial scenarios, which makes it a significant resource given the limited availability of actual industrial data in the public domain. The fundamental configuration of the test rig employed to obtain these signals is shown in Fig. 3. It consists of a shaft powered by a 2 hp Reliance Electric motor equipped with a torque transducer and an encoder. Vibration signals were collected using accelerometers mounted on the motor's supporting base plate and on the bearing housings, which hold SKF deep groove ball bearings (6205-2RS JEM and 6203-2RS JEM). A 16-channel digital audio tape (DAT) recorder captured the datasets [29]. The bearing vibration signals were collected under various health states (normal and faulty) and motor speeds (1730, 1750, 1772, and 1797 rpm), with faults induced using the electro-discharge machining (EDM) technique. Faults on the inner race, outer race, and balls had specific dimensions (widths of 0.007, 0.014, and 0.021 inches and a depth of 0.011 inches). Experiments were conducted under four load conditions (0 hp, 1 hp, 2 hp, and 3 hp) and different sampling frequencies (12 kHz or 48 kHz), with each recording lasting 10 seconds. Table 1 summarizes the ten data categories used in this research.

In this research, 40 datasets are selected from the CWRU database, namely the vibration signals captured by the fan-end bearing accelerometer at a sampling rate of 12 kHz. The dataset files come in MATLAB format with a .mat extension. Of these 40 datasets, 4 are of normal bearings, and 36 are of faulty bearings with flaws in various bearing components at different loading and speed conditions, as shown in Table 1.

Fig. 3  Bearing experimental setup of CWRU [adapted from Refs 1, 40]
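For readers who wish to inspect these records outside of LabVIEW, one of the .mat files can be read with SciPy as sketched below. The file name and the "*_FE_time" key pattern for the fan-end accelerometer channel reflect the usual CWRU file layout and are assumptions here, not details stated in the paper.

```python
from scipy.io import loadmat

# Hypothetical file name; actual files are downloaded from the CWRU Bearing Data Center
mat = loadmat("97.mat")

# CWRU records typically store channels under keys such as "X097_DE_time" (drive end)
# and "X097_FE_time" (fan end); pick the fan-end channel used in this study.
fe_keys = [k for k in mat.keys() if k.endswith("_FE_time")]
signal = mat[fe_keys[0]].squeeze()   # 1-D vibration time series

fs = 12_000  # fan-end sampling rate used in this study, in Hz
print(len(signal), "samples =", len(signal) / fs, "seconds")
```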


Table 1  Information on the CWRU rolling bearing vibration signals used

Health condition  | Fault diameter (in) | Speed (rpm)         | Load (hp)
Healthy           | …                   | 1797/1772/1750/1730 | 0/1/2/3
Ball fault        | 0.007               | 1797/1772/1750/1730 | 0/1/2/3
Ball fault        | 0.014               | 1797/1772/1750/1730 | 0/1/2/3
Ball fault        | 0.021               | 1797/1772/1750/1730 | 0/1/2/3
Inner race fault  | 0.007               | 1797/1772/1750/1730 | 0/1/2/3
Inner race fault  | 0.014               | 1797/1772/1750/1730 | 0/1/2/3
Inner race fault  | 0.021               | 1797/1772/1750/1730 | 0/1/2/3
Outer race fault  | 0.007               | 1797/1772/1750/1730 | 0/1/2/3
Outer race fault  | 0.014               | 1797/1772/1750/1730 | 0/1/2/3
Outer race fault  | 0.021               | 1797/1772/1750/1730 | 0/1/2/3

In order to process and evaluate the CWRU data, a specialized LabVIEW code was developed to extract the time-domain features. The decision to utilize the LabVIEW programming environment was based on its inclusion of specialized toolkits designed for vibration analysis and data processing. The CWRU datasets consisted of lengthy raw signal files. These files were initially partitioned into segments of equal length, each containing 4096 samples, using a LabVIEW VI created for this purpose. This approach guaranteed that each individual signal segment contained an adequate number of data points for accurately characterizing the bearing condition while also keeping the processing times within reasonable limits. Figure 4 shows samples of these signals; it can be noticed that the vibration amplitude of the faulty bearings becomes higher and exhibits more pulses as the fault severity increases. The LabVIEW code was also used to automate the calculation of the 14 time-domain statistical features, as described in the "Feature Extraction" section, for every pre-processed vibration signal segment. Subsequently, the obtained feature matrices were exported to Excel to facilitate subsequent analysis in the Orange data mining software. By utilizing the LabVIEW platform, the raw CWRU datasets were processed effectively, enabling reading, segmentation, and feature extraction to be carried out with ease. The subsequent creation and assessment of the machine learning models relied heavily on this vibration signal preparation.

Feature Extraction

The feature extraction procedure is critical to vibration signal analysis, as it extracts the key fault classification characteristics. Many methods can be used to extract relevant features from random signals, enabling defect diagnosis and categorization. These methods include temporal (time) and spectral (frequency) domain analysis, each revealing different signal dynamics. Time-domain feature extraction has many advantages over the other methods: it enables direct monitoring of the temporal properties of the vibration signal, helping identify fault patterns and abnormalities [41, 42].

In contrast with frequency-domain analysis, which requires transforming signals into the frequency spectrum [43], time-domain characteristics have the advantages of being interpretable and computationally efficient. Hence, they are well suited for applications that involve real-time fault detection and condition monitoring, where being able to make decisions and take action promptly is extremely important. The time-domain features capture the strength, distribution, and patterns of the signals, which helps identify spikes, differences in amplitude, and changes in the signal over time. The utilization of time-domain features in machine learning algorithms allows classifiers to learn how to differentiate between conditions and various types of faults. This discrimination of bearing defects is attained with the help of the unique temporal signatures present in the vibration signals. The time-domain methodology is therefore an effective approach to classify bearing defects, as it helps in identifying definite fault patterns linked to outer race, inner race, and rolling element defects. In turn, this contributes to more effective maintenance methods, increased system stability, and reduced downtime. Using time-domain features to classify faults is thus a practical and efficient method for assessing the status of bearings and monitoring their health in different sectors. Accordingly, a total of 14 time-domain characteristics are derived using the equations presented in Table 2, utilizing the LabVIEW program developed in this research [17, 41, 42], where x_i is a sample in the acquired signal and N is the total number of samples.
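Although the segmentation and feature computation were implemented in LabVIEW in this work, the same steps can be sketched in Python for illustration. The 4096-sample window length follows the text above, while the function names and the synthetic signal are assumptions; only a subset of the Table 2 indicators is shown.

```python
import numpy as np

def segment(signal, length=4096):
    """Split a long 1-D vibration record into equal, non-overlapping windows."""
    n = len(signal) // length
    return signal[: n * length].reshape(n, length)

def time_domain_features(x):
    """A subset of the Table 2 indicators for one segment."""
    mean = x.mean()
    rms = np.sqrt(np.mean(x ** 2))
    std = x.std()
    peak = np.max(np.abs(x))
    return {
        "rms": rms,
        "std": std,
        "variance": x.var(),
        "kurtosis": np.mean((x - mean) ** 4) / std ** 4,
        "skewness": np.mean((x - mean) ** 3) / std ** 3,
        "peak_to_peak": x.max() - x.min(),
        "crest_factor": peak / rms,
    }

# Example with a synthetic signal standing in for one CWRU record
rng = np.random.default_rng(2)
signal = rng.normal(0.0, 1.0, 120_000)
features = [time_domain_features(seg) for seg in segment(signal)]
print(len(features), "segments,", features[0])
```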


Fig. 4 Samples of CWRU vibration signals for different faults at 1730 (rpm) and 3 (hp)


Table 2  Mathematical expressions of the extracted time-domain features [10, 41]

Feature            | Equation                                              | Feature           | Equation
Mean               | x_m = (1/N) \sum_{i=1}^{N} x_i                        | Minimum           | x_min = min(x_i)
Root mean square   | x_rms = \sqrt{(1/N) \sum_{i=1}^{N} x_i^2}             | Peak-to-Peak      | x_pp = max(x_i) - min(x_i)
Standard deviation | x_std = \sqrt{(1/N) \sum_{i=1}^{N} (x_i - x_m)^2}     | Peak              | x_p = max|x_i|
Variance           | x_var = (1/N) \sum_{i=1}^{N} (x_i - x_m)^2            | Crest factor      | CF = x_p / x_rms
Kurtosis           | x_kur = (1/(N x_std^4)) \sum_{i=1}^{N} (x_i - x_m)^4  | Impulse factor    | IF = x_p / ((1/N) \sum_{i=1}^{N} |x_i|)
Skewness           | x_ske = (1/(N x_std^3)) \sum_{i=1}^{N} (x_i - x_m)^3  | Shape factor      | SF = x_rms / ((1/N) \sum_{i=1}^{N} |x_i|)
Maximum            | x_max = max(x_i)                                      | Clearance factor  | CLF = x_p / ((1/N) \sum_{i=1}^{N} \sqrt{|x_i|})^2

Feature Selection

In fault diagnosis, features collected from the dataset are often utilized as input to the classification process. Some applications have only a small number of features, while others may have too many [44]. The features extracted from each data item are saved in a feature matrix; therefore, the size of the matrix, and hence the length of the process, depends on both the size of the dataset and the number of features. However, not all features hold equal importance for a specific task. Some may be redundant or even irrelevant, and better performance can be achieved by eliminating such features. Therefore, feature selection plays a crucial role in distinguishing between significant and insignificant features. The curse of dimensionality affects identification algorithms, as the presence of irrelevant, redundant, and noisy attributes leads to poor predictive performance and increased computational requirements. Hence, feature selection is vital to enhance the efficiency of fault identification algorithms. The objective of feature subset selection is to extract features with low dimensionality while preserving sufficient information and improving feature separability in the feature space. In the proposed methodology, two ranking methods, namely Information Gain and the Fast Correlation-Based Filter, are utilized for feature ranking.

Information Gain

Entropy, represented by the symbol H, is a metric that quantifies the impurity within a given dataset. Building upon this notion, an Information Gain (IG) measure can be defined, which captures the additional information about class Y contributed by feature X [45]. This measure indicates the extent to which the entropy of Y decreases as a result of considering X. The formula for calculating this measure is as follows:

IG = H(Y) - H(Y|X) = H(X) - H(X|Y)    (Eq 12)

Fast Correlation-Based Filter

The Fast Correlation-Based Filter (FCBF) is a feature selection method that uses symmetric uncertainty (SU), which assesses information based on the average content within a message, as a replacement for Information Gain [16]. One notable advantage of FCBF is its ability to eliminate redundant features. During the screening process, FCBF compares pairs of features and retains the one that exhibits a stronger correlation with the target. By leveraging the features with higher correlations, the screening is effectively completed. This approach reduces time complexity, enhances computational efficiency, accelerates calculations, and simultaneously improves recognition rates. Figure 5 demonstrates the process involved in this method for removing redundant features. In this particular illustration, F1, F2, and F4 are considered similar features, with F1 demonstrating a stronger connection to the target; as a result, F2 and F4 are deemed redundant. Building upon the same assessment, F3 enables the removal of F6 and F7.

Fig. 5  FCBF feature selection approach [adapted from Ref 16]
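A schematic sketch of the two ranking criteria is given below, assuming the features have been discretized into a small number of bins (Orange performs an equivalent discretization internally); the helper functions are illustrative and are not the reference implementations of IG or FCBF.

```python
import numpy as np

def entropy(labels):
    """Shannon entropy H of a discrete variable, in bits."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def info_gain(x, y):
    """IG = H(Y) - H(Y|X) for a discretized feature x and class labels y (Eq 12)."""
    h_y_given_x = 0.0
    for v in np.unique(x):
        mask = x == v
        h_y_given_x += mask.mean() * entropy(y[mask])
    return entropy(y) - h_y_given_x

def symmetric_uncertainty(x, y):
    """SU(X, Y) = 2 * IG / (H(X) + H(Y)), the quantity FCBF ranks and filters on."""
    return 2.0 * info_gain(x, y) / (entropy(x) + entropy(y))

# Toy example: a feature that partially tracks the class versus a random one
rng = np.random.default_rng(3)
y = rng.integers(0, 4, 500)                   # four bearing health classes
x_good = (y + rng.integers(0, 2, 500)) % 4    # informative feature
x_noise = rng.integers(0, 4, 500)             # irrelevant feature
print(symmetric_uncertainty(x_good, y), symmetric_uncertainty(x_noise, y))
```

In the full FCBF procedure, a feature is additionally discarded when its SU with an already selected feature exceeds its SU with the class, which is the redundancy screening illustrated in Fig. 5.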


Machine Learning Models Evaluation

To assess the effectiveness of the various classification methods, performance is evaluated using selected indicators, including Accuracy (AC), F1-score, Precision, and Recall [43, 46]. These evaluation metrics are shown in Table 3 [47]. AC quantifies the proportion of accurate predictions out of the total number of instances assessed. Precision measures the correct identification of positive patterns from the total predicted patterns in the positive class. Recall assesses the fraction of positive patterns that are accurately classified. The F1-score provides a balanced evaluation of Precision and Recall by calculating their harmonic mean, effectively combining the two metrics into a single value that attains its highest value at 1 and its lowest at 0. In the formulas of Table 3, TP and TN stand for the numbers of true positives and true negatives, respectively, whereas FP and FN represent the numbers of false positives and false negatives, respectively.

Additionally, the area under the ROC (receiver operating characteristic) curve, or AUC, is used to determine the ability of a classifier to distinguish between different classes. The ROC curve is a widely used statistical tool that provides a comprehensive assessment of sensitivity and specificity. The AUC metric measures the quality of classification by relating sensitivity on one axis to specificity on the other; an AUC value closer to 1 indicates better classifier performance. The confusion matrix was an additional evaluation tool employed in this study. It is an important tool in machine learning because it provides a thorough representation of the contingency table for a given classifier and effectively summarizes the classifier's efficacy on a particular set of test data [48, 49]. The matrix arrangement organizes predictions in a tabular format, wherein each column corresponds to the instances classified into a predicted class, and each row corresponds to the instances belonging to an actual class. The confusion matrix provides a concise representation of the number of true positives, false positives, true negatives, and false negatives, facilitating a straightforward understanding of the outcomes [50]. The indicators mentioned above can be easily derived from the values present in the confusion matrix.

Table 3  Considered evaluation indices for the machine learning models

Indicator     | Mathematical expression
Accuracy (AC) | (TP + TN) / (TP + FP + TN + FN)
Precision     | TP / (TP + FP)
Recall        | TP / (TP + FN)
F1-score      | (2 × Precision × Recall) / (Precision + Recall)

Orange Software for Machine Learning Models Design

Orange is an open-source data mining software package for machine learning applications; it has impressive data visualization capabilities and is suitable for both beginners and experienced users [37]. Scientists at the University of Ljubljana in Slovenia developed it in 1997 using the Python, Cython, C++, and C programming languages [44]. The Python and Qt libraries were used to build the software's graphical user interface and environment. The most recent version, Orange 3.35.0, was released in March 2023, and it includes a straightforward user interface where users can arrange graphical elements, called widgets, to build a data analysis process. The user-friendly widgets provide access to fundamental activities such as reading data, presenting data tables, choosing features, selecting learning estimators, comparing learning methods, and visualizing data items. The user may also view the outcomes graphically. One of the key benefits of this application is the comparison of several algorithms using different criteria during the performance assessment phase. A dataset with a .tab extension may be used with Orange, but it can also open widely used dataset file formats, including .txt, .basket, .csv, and .arff.

The developed Orange data processing workflow is shown in Fig. 6. It starts by loading the data, namely the 14 features extracted using the developed LabVIEW code, from an Excel sheet file saved on the PC. The data are then visualized in table format and sent to a ranking widget to select the seven most important features using IG and FCBF. The selected features are passed to the data sampler widget, which incorporates various sampling techniques and produces two datasets: one used for training and another containing the instances from the original dataset not included in the sampled dataset, used for testing. However, this research adopted a 10-fold cross-validation criterion: 9 folds for training and 1 for testing. The learning algorithm testing widget (Test and Score) enables the evaluation of the various learning algorithms, in this research kNN, SVM, and Naïve Bayes. The widget performs two main functions. Firstly, it displays a table presenting performance measures of the different classifiers, such as classification accuracy and area under the curve. Secondly, it generates evaluation results that other widgets can utilize for further analysis of classifier performance, such as the ROC Analysis or Confusion Matrix widgets. The kNN widget applies the kNN algorithm with the number of nearest neighbors (k) set to 5 and the Euclidean distance. Similarly, the SVM and Naïve Bayes algorithms are applied using the SVM and Naïve Bayes widgets, where the Gaussian radial basis function (RBF) has been chosen as the preferred kernel option to obtain high accuracy.
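For readers without Orange, the Test and Score protocol described above can be approximated with scikit-learn as sketched below. The placeholder matrix X (segments × 7 selected features) and label vector y stand in for the data exported from LabVIEW, and the settings mirror those stated in the text (k = 5 with Euclidean distance, an RBF-kernel SVM, 10-fold cross-validation); this is an equivalent sketch, not the Orange workflow itself.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_validate

# Placeholder data: replace with the exported feature matrix and fault labels
rng = np.random.default_rng(4)
X = rng.normal(size=(1000, 7))
y = rng.integers(0, 4, 1000)          # healthy, ball, inner race, outer race

models = {
    "kNN": KNeighborsClassifier(n_neighbors=5, metric="euclidean"),
    "SVM": SVC(kernel="rbf", probability=True),
    "Naive Bayes": GaussianNB(),
}

scoring = ["accuracy", "f1_macro", "precision_macro", "recall_macro", "roc_auc_ovr"]
for name, model in models.items():
    cv = cross_validate(model, X, y, cv=10, scoring=scoring)
    summary = {m: cv[f"test_{m}"].mean() for m in scoring}
    print(name, summary)
```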


Fig. 6  Developed Orange workflow for feature ranking and classification

The performance of the different machine learning algorithms is evaluated using a confusion matrix; the Confusion Matrix widget shows each algorithm's numbers of true positives, false positives, true negatives, and false negatives. The Scatter Plot widget incorporates advanced exploratory analysis features and intelligent data visualization enhancements.

Results and Discussion

Feature Ranking Results

Figure 7 shows the feature ranking results, from which the importance of the 14 considered features can be clearly observed. The features with the highest IG scores (Figure 7a) are "Peak-to-Peak," "Variance," "STD," and "RMS"; such high-scoring features are important for the classification task and provide valuable information. The features "Min," "Peak," and "Max" also exhibit relatively high IG scores, yet slightly lower than the previously mentioned features. The features "Shape factor," "Clearance factor," "Kurtosis," "Impulse factor," "Crest factor," "Skewness," and "Mean" have smaller IG scores, indicating that they provide relatively less information compared to the higher-ranked features.

According to the FCBF rankings (Figure 7b), the "Peak-to-Peak" feature is considered the most significant. Based on their FCBF scores, "Skewness," "Mean," "Variance," "STD," and "RMS" are deemed relatively significant, while "Min," "Peak," "Max," "Shape factor," "Clearance factor," "Kurtosis," "Impulse factor," and "Crest factor" have lower FCBF scores, suggesting that they are less significant for the classification task in comparison to the other features.

The presence of "Peak-to-Peak" as a consistent top feature in both rankings indicates its importance for classification, and the presence of "Variance," "STD," and "RMS" in both rankings further emphasizes their significance. The rankings of certain features, such as "Skewness," "Mean," and "Min," differ between the FCBF and IG results; in the FCBF rankings, these features occupy higher positions than in the IG rankings. Based on the feature rankings obtained from IG and FCBF, it is proposed to prioritize the top seven features from each method when training the machine learning models.

Fig. 7  Feature ranking results: (a) feature ranking based on IG; (b) feature ranking based on FCBF

Classification Results

Table 4 presents the classification results obtained from the machine learning algorithms (kNN, SVM, Naive Bayes) combined with the considered feature ranking methods (IG, FCBF). Performance evaluation utilizes various measures, including AUC, AC, F1, Precision, and Recall. In general, the performance of all three machine learning algorithms is satisfactory, even without feature selection; however, the level of success varies across the different performance measures. The kNN algorithm demonstrates superior performance with the highest AUC score, suggesting its effectiveness in accurately differentiating between classes. The Naive Bayes algorithm demonstrates strong performance across various evaluation metrics, including high AC, F1-score, Precision, and Recall. The performance of SVM is comparatively lower than that of kNN and Naive Bayes, as it exhibits lower scores across most measures.

Applying the IG method for feature selection leads to an enhancement in the overall performance of the models, with a notable improvement observed for SVM and Naive Bayes. The consistently high performance of kNN across various measures indicates that the initial set of features already captured significant discriminatory information. The SVM model significantly improves its AUC, AC, and Precision metrics, suggesting that the feature selection process positively impacts the model's classification accuracy and Precision. Improvements in all metrics likewise demonstrate the impact of IG-based feature selection on Naive Bayes performance.

Although the FCBF performance is comparable to that of the IG-based feature selection technique, there are some differences. With FCBF, the SVM technique greatly enhances several performance indicators, such as AUC, AC, F1-score, and Precision, and the kNN method consistently exhibits robust performance, showing marginal improvements in metrics such as AUC, AC, F1-score, and Recall. The results also show that the use of FCBF leads to improved SVM classification. The Naive Bayes approach displays improvements in all the metrics, similar to the IG-based feature selection method, which suggests that FCBF significantly improves the effectiveness of Naive Bayes. Based on a comprehensive evaluation of performance across multiple criteria, it can be inferred that the integration of kNN with FCBF (kNN-FCBF) feature selection exhibits significant promise.


The kNN algorithm frequently exhibits robust performance in diverse settings. Additionally, its classification performance is further enhanced when combined with the FCBF feature selection technique, showing the best evaluation results, as indicated by the bold-highlighted values in Table 4. Furthermore, it is worth noting that SVM with FCBF feature selection demonstrates significant enhancements, suggesting that it has the potential to be a powerful combination.

Table 4  Evaluation measures of the used machine learning models with and without feature selection

Model | AUC | AC | F1 | Precision | Recall
Without feature selection
kNN | 0.988 | 0.951 | 0.95 | 0.951 | 0.95
SVM | 0.898 | 0.734 | 0.726 | 0.733 | 0.734
Naïve Bayes | 0.976 | 0.83 | 0.833 | 0.852 | 0.83
With feature selection using IG
kNN | 0.989 | 0.957 | 0.951 | 0.952 | 0.952
SVM | 0.965 | 0.89 | 0.885 | 0.907 | 0.89
Naïve Bayes | 0.941 | 0.81 | 0.814 | 0.84 | 0.81
With feature selection using FCBF
kNN | 0.991 | 0.97 | 0.96 | 0.96 | 0.957
SVM | 0.936 | 0.837 | 0.832 | 0.84 | 0.838
Naïve Bayes | 0.975 | 0.859 | 0.86 | 0.87 | 0.857

The confusion matrices of the kNN model are represented in Fig. 8. The first confusion matrix (Figure 8a) corresponds to a kNN model without significant feature selection, while the second (Figure 8b) represents a kNN model combined with the FCBF method for significant feature selection.

In the first matrix, the model correctly predicted 314 out of 360 instances of ball fault but incorrectly classified 11 instances as healthy, 34 as inner race fault, and 1 as outer race fault. The classification of each of the 160 instances of healthy bearings was accurate. The model accurately predicted 305 instances of inner race fault but misclassified 6 instances as ball fault and 49 as outer race fault. The model correctly identified 251 out of 360 instances of outer race defect; however, 14 instances were misclassified as ball faults and 92 as healthy. In the second matrix, 335 out of 360 instances of ball defect were correctly classified, with 19 instances misclassified as inner race fault and 6 as outer race fault. All 160 instances of healthy bearings were classified accurately. The model accurately predicted 356 instances of inner race defect out of 360, with four instances incorrectly classified as ball defect. The model accurately identified 339 out of 360 instances of outer race defect; twenty instances were incorrectly classified as ball faults and one as inner race fault.

Both models performed well in classifying instances as healthy, with 100 percent accuracy. The second model (kNN-FCBF) predicted the various fault categories more accurately than the first model (kNN without feature selection). For the ball fault and inner race fault classes, the second model was more precise and made fewer misclassifications than the first. In contrast, the first model was more accurate for the outer race fault class and made fewer errors than the second. The findings demonstrate the effect of feature selection (FCBF) on the kNN model's general performance for bearing defect diagnosis. The second model was more accurate and had fewer misclassifications for certain fault categories due to selecting and using only the important features.

Fig. 8  kNN confusion matrices: (a) without important feature selection; (b) with important feature selection using FCBF
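As a short check, the per-class recall implied by the counts reported above for the second matrix (kNN-FCBF, Fig. 8b) can be recovered directly; the matrix is entered manually here because the paper reports it as a figure, and the class ordering is an assumption.

```python
import numpy as np

# Rows = actual class, columns = predicted class, ordered as
# [ball fault, healthy, inner race fault, outer race fault] (kNN-FCBF, Fig. 8b)
cm = np.array([
    [335,   0,  19,   6],
    [  0, 160,   0,   0],
    [  4,   0, 356,   0],
    [ 20,   0,   1, 339],
])

# Per-class recall: correctly predicted instances divided by all actual instances
recall = cm.diagonal() / cm.sum(axis=1)
for name, r in zip(["ball", "healthy", "inner race", "outer race"], recall):
    print(f"{name}: {r:.3f}")   # the healthy class is classified perfectly, as noted above
```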


However, it is important to note that each model may have unique strengths and limitations depending on the fault class.

The scatter plot in Fig. 9 displays the distribution of the four bearing health states in a two-dimensional space, as obtained from the confusion matrix of the enhanced model (kNN-FCBF). In Fig. 9a, the x-axis corresponds to the standard deviation, while in Fig. 9b it represents the peak-to-peak value; the y-axis in both figures represents the skewness value. The selection of these axes was arbitrary and aimed at presenting the data based on their ranking by the significant feature selection approaches. The four health states are visually distinguished by distinct colors: red for ball fault, green for healthy, blue for inner race fault, and yellow for outer race fault. The scatter plot illustrates distinct separation among the four health states. Specifically, the ball fault bearings tend to cluster in the lower-left corner, the healthy bearings in the upper-right corner, the inner race fault bearings in the lower-right corner, and the outer race fault bearings in the upper-left corner. This finding implies that the chosen features successfully differentiate among the four states of bearing health. The notable degree of distinctiveness among the four health states is a positive outcome, as it suggests that the improved model is expected to be capable of accurately classifying the various states of bearing health.

Fig. 9  Features clustering of the studied cases: (a) Peak-to-Peak with skewness; (b) standard deviation with skewness

Comparative Analysis

Table 5 provides a comparison summary of the best model obtained in this study, kNN-FCBF, with a number of previous studies. These studies also used the CWRU dataset and, like the present study, focused on categorizing four health classes: healthy bearing, ball fault, inner race fault, and outer race fault. The comparative parameters include the classifier type, the number of features used for classifier training, and the classification accuracy.

Table 5  Performance comparison of machine learning models for bearing fault diagnosis

Author                          | Classifier                               | Number of features | Accuracy (%)
Grover and Turk [22]            | Rule-based classifiers                   | 3                  | 93.82
Huang, et al. [51]              | BPNN (back-propagation neural network)   | 4                  | 91.6
Cascales-Fulgencio, et al. [52] | SVM                                      | 16                 | 84.7
Jamil, et al. [32]              | kNN                                      | 9                  | 96.2
Alonso-Gonzalez, et al. [53]    | Kernel Naive Bayes                       | 5                  | 94.4
Rajput, et al. [54]             | Fuzzy-CNN (convolutional neural network) | 16                 | 99.87
Current work                    | FCBF-kNN                                 | 7                  | 97

Various approaches were adopted across these studies in terms of classifier complexity and computational effort. Rule-based classifiers, as utilized by Grover and Turk [22], offer lower complexity and computational demands due to their predefined rules, which can result in decent accuracy with a small number of features. Back-propagation neural networks (BPNN), such as that described in Huang et al. [51], can manage complex patterns but may require more computational resources for training due to their multiple layers and connections. SVM, as employed by Cascales-Fulgencio et al. [52], is effective at managing high-dimensional data, but its computational complexity can increase as the number of features increases. kNN, utilized by Jamil et al. [32], is a relatively simple and intuitive algorithm with minimal computational complexity during training, but larger datasets may require additional computational resources during the prediction phase. Naive Bayes classifiers, including the Kernel Naive Bayes employed by Alonso-Gonzalez et al. [53], are known for their simplicity and speed, resulting in high computational efficiency. Fuzzy-CNN (convolutional neural network), as applied by Rajput et al. [54], is a powerful model for complex pattern recognition; nevertheless, its computational complexity can be high due to its multiple layers and convolution operations.

In this study, the kNN-FCBF model, which incorporates seven unique time-domain features, was benchmarked against these diverse machine learning algorithms used for industrial machine fault diagnostics. As shown in Table 5, the kNN-FCBF model achieved an accuracy of 97%, ranking second. In particular, the Fuzzy-CNN model with 16 features achieved the most accurate result at 99.87%, but that model requires substantial computational power, which reduces its practicality for widespread implementation. On the other hand, the kNN-FCBF model is quite simple and less computationally expensive, while still providing a high degree of correctness, indicating its adequacy for fault diagnosis. The efficiency and ease of use of the kNN-FCBF strategy facilitate real-time machine health monitoring through embedded electronics. Embedded electronic systems, however, have limitations such as restricted processing power, memory, and energy capabilities. The proposed approach effectively addresses these problems because its low computational cost allows for real-time detection of faults using cheap embedded processors. Furthermore, the small number of features used (only 7) imposes low memory and data transfer requirements, which is very important in embedded implementations with limited resources. In contrast, larger and more resource-intensive approaches may not be practically feasible in embedded monitors with strict constraints.


However, the demonstrated efficiency of the kNN-FCBF technique attains high accuracy while adhering to such constraints. This makes it well suited for integration into embedded condition monitoring systems, allowing real-time alerting of machinery faults directly on-device during operation. This on-equipment deployment potential holds the promise of significantly enhancing maintenance practices.

In conclusion, various diagnostic methods for bearing faults have been utilized across the investigations discussed above, each with its own tradeoffs in terms of classifier complexity, computational effort, accuracy, and the number of features employed. The current work's combination of the FCBF feature selection method with the kNN classifier improved accuracy while reducing the number of features, establishing a balance between computational efficiency and classification performance.

Conclusion

The focus of this research was the application of kNN, SVM, and Naïve Bayes classifiers for bearing fault diagnosis using time-domain vibration signals. It was observed from the results that, although all three classifiers produced good results in bearing fault detection, kNN and SVM produced better results. IG and FCBF were also evaluated for selecting the best features, and both techniques were found to enhance the performance of all three classifiers. The greatest improvement was observed with the kNN algorithm, where the AUC rose from 0.988 to 0.991 when feature selection was taken into consideration. In general, this study proved the effectiveness of machine learning classifiers, including kNN and SVM, in bearing fault diagnosis with time-domain vibration signals and feature selection. The AC improved from 0.951 to 0.97, the F1-score from 0.95 to 0.96, and Recall from 0.95 to 0.957. The data presented in the confusion matrices demonstrated that kNN-FCBF outperformed kNN without feature selection.

The suggested framework, which integrates optimized time-domain feature extraction with machine learning models and feature selection, yielded promising outcomes in bearing fault diagnosis. A key advantage is the computational efficiency achieved by directly analyzing vibration signals in the time domain without transformation. Incorporating dual feature selection further enhanced model performance by eliminating irrelevant inputs. However, it is important to acknowledge the potential limitations of this study. The vibration signals utilized (CWRU dataset) may not accurately reflect the entirety of bearing fault conditions encountered in real-world scenarios. Hence, further testing on varied datasets representing different fault severities and operating conditions would help validate the robustness and generalizability of the proposed method. Also, the efficacy of the used classifiers can fluctuate based on factors such as the specific type of bearing fault, the extent of the fault, and the overall quality of the vibration signals. Therefore, it is imperative to acknowledge the constraints of this study in subsequent research endeavors. This can be achieved by employing a dataset that is more inclusive and reflective of the population; ensemble or deep learning approaches may provide better diagnostic ability and allow assessing the classifiers' efficacy on previously unseen data. Additionally, exploring alternative feature selection techniques would be imperative to enhance the efficacy of bearing fault diagnosis.

Funding  This research received no external funding.

Data Availability  Not applicable.

Conflict of interest  The author declares no conflict of interest.

References

1. A. Boudiaf, A. Moussaoui, A. Dahane, I. Atoui, A comparative study of various methods of bearing faults diagnosis using the Case Western Reserve University data, (in English). J. Fail. Anal. Prev. 16(2), 271–284 (2016)
2. J. Liu, L. Xue, L. Wang, Z. Shi, M. Xia, A new impact model for vibration features of a defective ball bearing. ISA Trans. 142, 465–477 (2023)
3. C. Abdelkrim, M.S. Meridjet, N. Boutasseta, L. Boulanouar, Detection and classification of bearing faults in industrial geared motors using temporal features and adaptive neuro-fuzzy inference system, (in English). Heliyon. 5(8), e02046 (2019)
4. P. Guo, J. Fu, X. Yang, Condition monitoring and fault diagnosis of wind turbines gearbox bearing temperature based on Kolmogorov–Smirnov test and convolutional neural network model, (in English). Energies. 11(9), 2248 (2018)
5. M. Irfan et al., A comparison of machine learning methods for the diagnosis of motor faults using automated spectral feature extraction technique, (in English). J. Nondestruct. Eval. 41(2), 31 (2022)
6. H. Shi, Y. Li, X. Bai, K. Zhang, X. Sun, A two-stage sound-vibration signal fusion method for weak fault detection in rolling bearing systems, (in English). Mech. Syst. Signal Process. 172, 109012 (2022)
7. H. Zhao, H. Liu, Y. Jin, X. Dang, W. Deng, Feature extraction for data-driven remaining useful life prediction of rolling bearings. IEEE Trans. Instrum. Meas. 70, 1–10 (2021)
8. R. Liu, B. Yang, E. Zio, X. Chen, Artificial intelligence for fault diagnosis of rotating machinery: a review, (in English). Mech. Syst. Signal Process. 108, 33–47 (2018)
9. V.V. Rao, C. Ratnam, Estimation of defect severity in rolling element bearings using vibration signals with artificial neural network, (in English). Jordan J. Mech. Ind. Eng. 9(2), 113–120 (2015)


10. J. Liu, Z. Xu, L. Zhou, W. Yu, Y. Shao, A statistical feature investigation of the spalling propagation assessment for a ball bearing. Mech. Mach. Theory. 131, 336–350 (2019)
11. A.A.F. Ogaili, A. Abdulhady Jaber, M.N. Hamzah, Statistically optimal vibration feature selection for fault diagnosis in wind turbine blade. Int. J. Renew. Energy Res. 13(3), 1082–1092 (2023)
12. M. Savolainen, A. Lehtovaara, Development of damage detection parameters over the lifetime of a rolling element bearing, (in English). Tribologia. 38(3–4), 61–71 (2021)
13. S. Agrawal, V.K. Giri, Improved mechanical fault identification of an induction motor using Teager–Kaiser energy operator, (in English). J. Electr. Eng. Technol. 12(5), 1955–1962 (2017)
14. H. Shi, Z. Liu, X. Bai, Y. Li, Y. Wu, A theoretical model with the effect of cracks in the local spalling of full ceramic ball bearings, (in English). Appl. Sci. (Switzerland). 9(19), 4142 (2019)
15. P. Wang et al., Vibration characteristics of rotor-bearing system with angular misalignment and cage fracture: simulation and experiment, (in English). Mech. Syst. Signal Process. 182, 109545 (2023)
16. C.Y. Lee, W.C. Lin, Induction motor fault classification based on ROC curve and t-SNE, (in English). IEEE Access. 9, 56330–56343 (2021)
17. X. Liu, H. Huang, J. Xiang, A personalized diagnosis method to detect faults in a bearing based on acceleration sensors and an FEM simulation driving support vector machine. Sensors. 20(2), 420 (2020)
18. H. Liu, L. Li, J. Ma, Rolling bearing fault diagnosis based on STFT-deep learning and sound signals. Shock. Vib. 2016, 6127479 (2016)
19. C.T. Alexakos, Y.L. Karnavas, M. Drakaki, I.A. Tziafettas, A combined short time Fourier transform and image classification transformer model for rolling element bearings fault diagnosis in electric motors. Mach. Learn. Knowl. Extr. 3(1), 228–242 (2021)
20. F. He, Q. Ye, A bearing fault diagnosis method based on wavelet packet transform and convolutional neural network optimized by simulated annealing algorithm, (in English). Sensors. 22(4), 1410 (2022)
28. H. Zhao et al., Intelligent diagnosis using continuous wavelet transform and gauss convolutional deep belief network. IEEE Trans. Reliab. 72(2), 692–702 (2023)
29. H. Zhao, H. Liu, J. Xu, W. Deng, Performance prediction using high-order differential mathematical morphology gradient spectrum entropy and extreme learning machine. IEEE Trans. Instrum. Meas. 69(7), 4165–4172 (2020)
30. L.A. Al-Haddad, A.A. Jaber, Improved UAV blade unbalance prediction based on machine learning and ReliefF supreme feature ranking method. J. Brazil. Soc. Mech. Sci. Eng. 45(9), 463 (2023)
31. S. Mochammad, Y.J. Kang, Y. Noh, S. Park, B. Ahn, Stable hybrid feature selection method for compressor fault diagnosis. IEEE Access. 9, 97415–97429 (2021)
32. M.A. Jamil, M.A. Khan, S. Khanam, Feature-based performance of SVM and KNN classifiers for diagnosis of rolling element bearing faults. Vibroeng. Procedia. 39, 36–42 (2021)
33. L. Yuan, D. Lian, X. Kang, Y. Chen, K. Zhai, Rolling bearing fault diagnosis based on convolutional neural network and support vector machine. IEEE Access. 8, 137395–137406 (2020)
34. J. Zhang, X. Hu, X. Zhong, H. Zhou, Fault diagnosis of axle box bearing with acoustic signal based on chirplet transform and support vector machine. Shock. Vib. 2022, 9868999 (2022)
35. M.A.S. Al Tobi, K.P. Ramachandran, S. Al-Araimi, R. Pacturan, A. Rajakannu, C. Achuthan, Machinery faults diagnosis using support vector machine (SVM) and Naïve Bayes classifiers. Int. J. Eng. Trends Technol. 70(12), 26–34 (2022)
36. A.A.F. Ogaili, A.A. Jaber, M.N. Hamzah, A methodological approach for detecting multiple faults in wind turbine blades based on vibration signals and machine learning. Curved Layer. Str. 10(1), 20220214 (2023)
37. K. Vernekar, H. Kumar, K.V. Gangadharan, Engine gearbox fault diagnosis using empirical mode decomposition method and Naïve Bayes algorithm. Sadhana Acad. Proc. Eng. Sci. 42(7), 1143–1153 (2017)
38. V. Muralidharan, V. Sugumaran, A comparative study of Naïve Bayes classifier and Bayes net classifier for fault diagnosis of monoblock centrifugal pump using wavelet analysis. Appl. Soft Comput. J. 12(8), 2023–2029 (2012)
39. Case Western Reserve University Bearing Data Center (2023, January). Seeded fault test data. Available: https://engineering.case.edu/bearingdatacenter
21. A.B. Patil, J.A. Gaikwad, J.V. Kulkarni, Bearing fault
diagnosis using discrete Wavelet Transform and Artificial Neural 40. W.A. Smith, R.B. Randall, Rolling element bearing diagnostics
Network, 2017, pp 399-405: Institute of Electrical and Elec- using the Case Western Reserve University data: a benchmark
tronics Engineers Inc study, (in English). Mech. Syst. Signal Process. 64–65, 100–131
22. C. Grover, N. Turk, Rolling element bearing fault diagnosis using (2015)
empirical mode decomposition and hjorth parameters. Procedia 41. B. Cui, Y. Weng, N. Zhang, A feature extraction and machine
Comput. Sci. 167, 1484–1494 (2020) learning framework for bearing fault diagnosis. Renew. Energy.
23. K. Deák, I. Kocsis, Support vector machine with wavelet 191, 987–997 (2022)
decomposition method for fault diagnosis of tapered roller 42. J. J. Saucedo-Dorantes, I. Zamudio-Ramirez, J. Cureno-Osornio,
bearings by modelling manufacturing defects, (in English). Per- R. A. Osornio-Rios, and J. A. Antonino-Daviu, ‘‘Condition
iod. Polytech. Mechan. Eng. 61(4), 276–281 (2017) monitoring method for the detection of fault graduality in outer
24. M. Altaf, T. Akram, M.A. Khan, M. Iqbal, M.M.I. Ch, C.H. Hsu, race bearing based on vibration-current fusion, statistical features
A new statistical features based approach for bearing fault and neural network,’’ (in English), Applied Sciences (Switzer-
diagnosis using vibration signals, (in English). Sensors. 22(5), land), Article vol. 11, no. 17, 2021, Art. no. 8033.
2012 (2022) 43. V. Vakharia, V.K. Gupta, P.K. Kankar, A comparison of feature
25. M. Li, Q. Wei, H. Wang, X. Zhang, Research on fault diagnosis ranking techniques for fault diagnosis of ball bearing. Soft.
of time-domain vibration signal based on convolutional neural Comput. 20(4), 1601–1619 (2016)
networks, (in English). Syst. Sci. Control Eng. 7(3), 73–81 (2019) 44. M. Peker, O. Özkaraca, and A. Şaşar, Use of orange data mining
26. M.D. Prieto, G. Cirrincione, A.G. Espinosa, J.A. Ortega, H. toolbox for data analysis in clinical decision making: the diag-
Henao, Bearing fault detection by a novel condition-monitoring nosis of diabetes disease, in expert system techniques in
scheme based on statistical-time features and neural networks, (in biomedical science practice: IGI Global, 2018, pp 143-167
English). IEEE Trans. Ind. Electr. 60(8), 3398–3407 (2013) 45. R.V. Sánchez, P. Lucero, R.E. Vásquez, M. Cerrada, J.C.
27. X. Li, H. Zhao, L. Yu, H. Chen, W. Deng, W. Deng, Feature Macancela, D. Cabrera, Feature ranking for multi-fault diagnosis
extraction using parameterized multisynchrosqueezing transform. of rotating machinery by using random forest and KNN. J. Intell.
IEEE Sens. J. 22(14), 14263–14272 (2022) Fuzzy Syst. Conf. Paper. 34(6), 3463–3473 (2018)

123
J Fail. Anal. and Preven.

46. J. Li, X. Yao, X. Wang, Q. Yu, Y. Zhang, Multiscale local fea- frequency-domain features enhanced using cepstrum pre-
tures learning based on BP neural network for rolling bearing whitening: A ML- and DL-based classification. Appl. Sci.
intelligent fault diagnosis. Measurement J. Int. Measurement (Switzerland). 12(21), 10882 (2022)
Confed. 153, 107419 (2020) 53. M. Alonso-González, V.G. Dı́az, B.L. Pérez, B.C. G-Bustelo, J.P.
47. M. Hossin, M.N. Sulaiman, A review on evaluation metrics for Anzola, Bearing fault diagnosis with envelope analysis and
data classification evaluations. Int. J. Data Min. Knowl. Manag. machine learning approaches using CWRU dataset. IEEE Access.
Process. 5(2), 1 (2015) 11, 57796–57805 (2023)
48. S. Visa, B. Ramsay, A. Ralescu, E. Van Der Knaap, Confusion 54. D.S. Rajput, G. Meena, M. Acharya, K.K. Mohbey, Fault pre-
matrix-based feature selection. CEUR Workshop Proc. 710, 120– diction using fuzzy convolution neural network on IoT
127 (2011) environment with heterogeneous sensing data fusion. Measure-
49. L.A. Al-Haddad, A.A. Jaber, An intelligent fault diagnosis ment: Sens. 26, 100701 (2023)
approach for multirotor UAVs based on deep neural network of
multi-resolution transform features. Drones. 7(2), 82 (2023) Publisher’s Note Springer Nature remains neutral with regard to
50. R.H. Hadi, H.N. Hady, A.M. Hasan, A. Al-Jodah, A.J. Humaidi, jurisdictional claims in published maps and institutional affiliations.
Improved fault classification for predictive maintenance in
industrial IoT based on AutoML: a case study of ball-bearing Springer Nature or its licensor (e.g. a society or other partner) holds
faults. Processes. 11(5), 1507 (2023) exclusive rights to this article under a publishing agreement with the
51. M. Huang, Z. Liu, Y. Tao, Mechanical fault diagnosis and pre- author(s) or other rightsholder(s); author self-archiving of the
diction in IoT based on multi-source sensing data fusion. Simul. accepted manuscript version of this article is solely governed by the
Model Practice Theory. 102, 101981 (2020) terms of such publishing agreement and applicable law.
52. D. Cascales-Fulgencio, E. Quiles-Cucarella, E. Garcı́a-Moreno,
Computation and statistical analysis of bearings’ time- and

123

You might also like