
Received September 18, 2017, accepted October 23, 2017, date of publication November 21, 2017,
date of current version February 28, 2018.

Digital Object Identifier 10.1109/ACCESS.2017.2773460
Rolling Bearing Fault Diagnosis Using Modified LFDA and EMD With Sensitive Feature Selection
XIAO YU 1,2,3, FEI DONG 1,2, ENJIE DING 1,2, SHOUPENG WU 1,2, AND CHUNYANG FAN 1,2
1 School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221000, China
2 IOT Perception Mine Research Center, China University of Mining and Technology, Xuzhou 221000, China
3 School of Medicine Information, Xuzhou Medical University, Xuzhou 221000, China
Corresponding author: Enjie Ding (enjied@cumt.edu.cn)

This work was supported by the National Key Research and Development Program of China under Grant 2017YFC0804400 and
Grant 2017YFC0804401 and in part by the National Key Basic Research Program of China (973 Program) under Grant 2014CB046300.
ABSTRACT To improve the accuracy of fault diagnosis for bearings, one of the most crucial components of rotating machinery, a novel features extraction procedure incorporating an improved features dimensionality reduction method is proposed. In the first step, the original statistical characteristics are calculated from the intrinsic mode functions of the vibration signal, which are obtained by the empirical mode decomposition method. Because the original statistical characteristics contain redundant information, this paper presents a novel features extraction method that combines the K-means method and standard deviation to select the most sensitive characteristics. Furthermore, a modified features dimensionality reduction method is proposed to obtain low-dimensional representations of the high-dimensional feature space. Finally, the performance of the fault diagnosis model is evaluated on vibration signals covering 12 bearing fault conditions, provided by the Bearing Data Center of Case Western Reserve University. Experimental results show that the proposed fault diagnosis model can serve as an effective and adaptive bearing fault diagnosis system.
INDEX TERMS Fault diagnosis, features extraction, features reduction, sensitive features.
2169-3536 © 2017 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information. VOLUME 6, 2018.

I. INTRODUCTION
In industrial settings, rotating machinery is widely used in almost all applications. Its health state not only affects the safe and stable operation of the device itself, but also has a direct impact on industrial production [1], [2]. With the rapid development of science and technology, modern rotating machinery equipment is becoming larger, more complex, and more precise [3]. If no effective actions are taken, device faults will inevitably occur, and such faults may lead to serious casualties and enormous pecuniary loss [4]. Among the frequently encountered components in the vast majority of rotating machinery, bearings are one of the most crucial elements [5], [6] and account for approximately 40-50% of all failures of rotating machinery [7]–[9]. Thus, identifying bearing faults is important for maintaining the safety of the device.
In recent years, with the rapid development of signal processing, data mining, and artificial intelligence technology, data-driven methods are becoming more important in the fault diagnosis of rolling bearings. Four main steps are necessary for these methods: signal processing, features extraction, features reduction, and patterns recognition [10], [11]. The first three steps are the foundation of the patterns recognition.
In the phase of signal processing and features extraction, due to the complexity of equipment structure and variety of operation conditions [4], the signals collected from rolling bearings often exhibit strong non-linearity and non-stationarity. Therefore, time-domain and frequency-domain analysis approaches are not effective for the signals collected from rolling bearings [12]. For these kinds of signals, time-frequency analysis can provide an effective method for features extraction. Representative and commonly used time-frequency analysis methods include the short-time Fourier transform (STFT), Wigner-Ville distribution (WVD), wavelet transform (WT), and empirical mode decomposition (EMD) [13].
STFT is an effective time and frequency localized analysis method. It can be considered as a local spectrum of the signal in a fixed window [14], [15]. It divides the entire time domain into numerous segments of the same length. Each time period is an approximately stationary process [14]. In [15], an adaptive short-time Fourier transform that makes the analysis window width equal to the local stationary length was presented. In [16], the STFT was employed for features extraction of varying-speed rotary machinery. Although there are tremendous achievements of STFT in bearing fault
diagnosis, its effectiveness is still hampered by the limitation of a single trigonometric basis [17], [18]. Wavelet analysis is another important method for non-linear and non-stationary signals. In [19], wavelet filtering to detect periodical impulse components from vibration signals was presented. In [20], the discrete wavelet transform for extracting the rotor bar fault features was studied. Although WT uses a scalable analysis window, which is different from STFT, it still has problems. These include the signal-independent resolution and the unsatisfactory concentration of the time-frequency representation, so that it cannot characterize rapidly time-varying components [18]. Furthermore, a very obvious disadvantage of STFT, WT, and WVD is their very high computational complexity [22]. Unlike the methods mentioned above, EMD can automatically decompose non-stationary, non-linear signals into several different intrinsic mode functions (IMFs), each of which has a different physical meaning. In view of this strength, the EMD method has received wide attention and has been applied successfully in the fault diagnosis field [23], [24]. In [25], EMD was used to extract the underlying trends of signals. In [26], a novel method that integrates EMD and a modified SVM was proposed to improve the performance of conventional EMD for bearing fault diagnosis. In [27], an improved EMD method based on multi-objective optimization was applied to extract the fault features of rolling bearings. In [1], the recent research on EMD in fault diagnosis of rotating machinery was summarized and the possible trends were discussed. Nevertheless, the time-frequency analysis methods mentioned above can produce a high-dimensional feature vector, which can be a primary reason for fault-classification accuracy degradation [28]. Thus, features selection or dimensionality reduction is needed to find the most useful fault features that maintain intrinsic information about the defects.
Generally, the statistical properties of the signal in the time, frequency, and time-frequency domains are extracted to represent features information, such as peak value (PV), root mean square (RMS), variance (V), skewness (Sw), and kurtosis (K). In [29], 21 time-domain statistical characteristics were extracted from different IMFs as the feature vectors. Then, principal component analysis (PCA) was employed to extract the dominant components from the statistical characteristics for gear fault detection. In [30], 16 time-domain statistical characteristics and 13 frequency-domain statistical characteristics were calculated from IMFs of the vibration signal, on which distance evaluation techniques were used to select the salient features and improve the classification accuracy for gear case abnormalities. In [31], two time-domain and two frequency-spectrum statistical characteristics were selected as the features to train an SVM with a novel hybrid parameter optimization algorithm for fault diagnosis of rolling element bearings. In [32], the statistical parameters of the wavelet coefficients in 1-64 scales were calculated for the vibration signal. However, considering the complex mapping relations between some bearing faults and their signs, it is often difficult to determine which statistical property is worthy of reflecting the fault nature from the high-dimensional feature space, and a high-dimensional feature set may easily produce redundant features and lead to a decline in the accuracy and efficiency of fault diagnosis. Therefore, the selection of the feature subset is a crucial step in the classification process. Previous studies have also shown that features extraction appears to be an important prerequisite to achieve the expected diagnostic accuracies [13], [33]. Accordingly, how to select the fault-sensitive statistical characteristics as the basis of subsequent fault analysis has garnered considerable attention and is studied further here. A features extraction method, features selection by adjusted rand index and standard deviation ratio (FSASR), is proposed. FSASR combines the K-means method and the standard deviation (STD) of feature data to select the sensitive statistical characteristics for fault analysis.
If high-dimensional statistical characteristics data are used directly in fault classification, the result is very high computational complexity and degraded fault classification accuracy. Therefore, features dimensionality reduction is another crucial stage in the fault diagnosis process. Dimensionality reduction can reduce the computational complexity of learning algorithms, save time, and improve performance by reducing the dimensionality of the input features to machine learning tools before performing fault identification and classification [22]. Recently, dimension reduction algorithms for machinery fault diagnosis have been intensively investigated [34], [35], and many classical methods have been proposed [36]. Principal component analysis (PCA) and linear discriminant analysis (LDA), as two classical linear dimensionality reduction methods, have been widely used for linear data. When the distribution of a dataset is non-linear, PCA and LDA may be invalid [37]. Therefore, some non-linear dimensionality reduction methods, namely kernel principal components analysis (KPCA), Isomap, Laplacian eigenmaps (LE), and local linear embedding (LLE), have been presented to provide a valid solution for the dimensionality reduction of non-linear data [24]. Although non-linear dimensionality reduction methods have been successfully applied in many fields, they also have some problems in practical applications, such as the "out-of-sample" problem caused by the lack of an explicit mapping matrix [38], the problem of overlearning of locality [39], and high computational complexity. Locality preserving projections (LPP), a novel manifold learning method, is a linear approximation of LE that replaces the non-linear mapping relation to achieve dimensionality reduction [24]. Due to its sound workability and fast computation, LPP has received increasing attention in the fault diagnosis field [40], [41]. However, LPP is an unsupervised dimensionality reduction method: it can preserve the local geometry of the data and works well with multimodal data, but it does not use label information. LDA, by contrast, is a supervised dimensionality reduction method and takes the label information into account in features reduction. Based on the respective attributes of LPP and LDA, a novel dimensionality reduction method, local
Fisher discriminant analysis (LFDA), was proposed in [42], by which the label information can be taken into consideration and the local structure of the data can be preserved. However, this method only considers the neighboring relationships between samples of the same class. The neighboring relationships between samples of different classes are not considered. To address this problem, a novel dimensionality reduction method, support margin local Fisher discriminant analysis (SM-LFDA), is proposed in this paper, in which the neighboring relationships between samples of different classes are considered.
The contribution of this paper is the development of an intelligent fault diagnosis system for rolling bearings based on multi-domain features, systematically combining statistical analysis methods with artificial intelligence techniques. FSASR, a novel features extraction method, is proposed to select the fault-sensitive statistical characteristics as the basis of subsequent fault analysis. A modified features reduction method, SM-LFDA, is proposed to excavate abundant and valuable information with low dimensionality. The execution of the proposed bearing fault diagnosis method is divided into four steps: signal processing, features extraction, features reduction, and fault patterns identification. In the first step, vibration signals collected from bearings are decomposed into several different IMFs by EMD, and multi-domain features are calculated from the first four IMFs of the vibration signal. In the second step, the adjusted rand index (ARI) criterion of the clustering method and the standard deviation (STD) of samples are used to select fault-sensitive statistical characteristics, which can represent the fault peculiarity under different working conditions. Furthermore, due to the information redundancy and the high-dimensional dataset, in the third step, SM-LFDA is applied to determine a new lower-dimensional space in which the newly constructed features are obtained by transformations of the original higher-dimensional features, such that certain properties are preserved. Finally, two cases with 12 working conditions are employed to verify the effectiveness, adaptability, and superiority of the proposed method for the identification and classification of bearing faults, where the vibration signals were collected from an experimental bench of rolling element bearings. The analysis results for the vibration signals of roller bearings under different working conditions show the effectiveness, adaptability, and superiority of the proposed fault diagnosis approach.
The rest of this paper is organized as follows. In section II, the theoretical background of EMD, the LDA technique, the LFDA technique, and SVM is summarized. In section III, a description of the proposed diagnosis technique is given, and the system framework of the proposed method is illustrated. In section IV, the proposed method is investigated on experimental bearing fault data taken from Case Western Reserve University (CWRU) [43]. Finally, some conclusions are drawn in section V.

II. THEORETICAL BACKGROUND
A. BEARING FAULT EFFECTS ON VIBRATION IN FREQUENCY DOMAIN
Among the main parts of the bearing, the inner race, outer race, ball, and cage (all of which are placed in the space between the rings and make rotating possible [44]) are precisely machined [45]. However, due to inappropriate lubrication of the bearing rolling elements, inadequate bearing selection, improper mounting, indirect failure, material defects, and manufacturing errors, various defects can occur [22]. When a localized fault appears on the bearing, cyclical impulsive vibration emerges. Therefore, the frequency of cyclical impulsive vibration contains bearing fault information, which can be applied to bearing fault diagnosis. The frequency of cyclical impulsive vibration is known as the fault frequency [46]. During the presence of a failure, the value of the fault frequency depends on the fault size, rotational speed, and damage location.

FIGURE 1. Structure of a ball bearing.

For different bearing components (i.e., outer race, inner race and ball, as shown in Fig. 1), the mechanical characteristic frequency $f_{car}$ (when the outer ring is fixed) is one of the following:

$$ f_c = \frac{f_r}{2}\left(1 - \frac{d}{D_m}\cos\alpha\right) = \frac{f_{be}}{N_b} \tag{1} $$

$$ f_{be} = \frac{f_r}{2} N_b \left(1 - \frac{d}{D_m}\cos\alpha\right) \tag{2} $$

$$ f_{bi} = \frac{f_r}{2} N_b \left(1 + \frac{d}{D_m}\cos\alpha\right) \tag{3} $$

$$ f_b = \frac{f_r D_m}{2d}\left(1 - \left(\frac{d}{D_m}\cos\alpha\right)^{2}\right) \tag{4} $$

where $f_r$ is the shaft rotational frequency, $f_c$ is the cage fault frequency, $f_{bi}$ is the inner raceway fault frequency, $f_{be}$ is the outer raceway fault frequency, $f_b$ is the ball/roller fault frequency, $d$ is the ball/roller diameter, $D_m$ is the pitch diameter, $N_b$ is the number of rolling elements, and $\alpha$ is the ball contact angle (zero for rollers) [22]. Therefore, significant research work has been conducted based on vibration signals for bearing fault analysis.
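As a minimal sketch, the characteristic frequencies of Eqs. (1)–(4) can be evaluated numerically as shown here. The function name `bearing_fault_frequencies` and the geometry values in the usage lines are illustrative assumptions and are not taken from the paper; `f_r` is assumed to be the shaft rotational frequency in Hz, the outer ring is assumed fixed, and the contact angle is given in radians.

```python
import math


def bearing_fault_frequencies(f_r, d, D_m, N_b, alpha=0.0):
    """Evaluate the characteristic fault frequencies of Eqs. (1)-(4).

    f_r   : shaft rotational frequency in Hz (outer ring assumed fixed)
    d     : ball/roller diameter (same length unit as D_m)
    D_m   : pitch diameter
    N_b   : number of rolling elements
    alpha : contact angle in radians (zero for rollers)
    """
    ratio = (d / D_m) * math.cos(alpha)
    f_c = 0.5 * f_r * (1.0 - ratio)                       # cage, Eq. (1)
    f_be = 0.5 * f_r * N_b * (1.0 - ratio)                # outer raceway, Eq. (2)
    f_bi = 0.5 * f_r * N_b * (1.0 + ratio)                # inner raceway, Eq. (3)
    f_b = (f_r * D_m / (2.0 * d)) * (1.0 - ratio ** 2)    # ball/roller, Eq. (4)
    return {"f_c": f_c, "f_be": f_be, "f_bi": f_bi, "f_b": f_b}


# Illustrative (hypothetical) geometry: 9 rolling elements, d = 8 mm,
# D_m = 40 mm, zero contact angle, shaft turning at 30 Hz.
freqs = bearing_fault_frequencies(f_r=30.0, d=8.0, D_m=40.0, N_b=9)
for name, value in freqs.items():
    print(f"{name} = {value:.2f} Hz")
```

In practice, spectral peaks of the vibration signal (or of its envelope) near these values would point to the corresponding damaged component; the sketch only covers the frequency calculation itself.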

B. EMPIRICAL MODE DECOMPOSITION
Based on the local characteristics of signals at different time scales, EMD decomposes a signal into a set of complete and nearly orthogonal IMFs, each IMF corresponding to the vibration mode of a specific signal at a discrete frequency [47]. To handle non-stationary signals, an IMF is defined as a function that satisfies two conditions:
(1) In the entire data set, the number of extrema and the number of zero crossings must either be equal or differ at most by one.
(2) At any point, the mean value of the envelope defined by the local maxima and the envelope defined by the local minima is zero [24].
The specific description of EMD for $x(t)$ is presented as follows [47]:
(1) Obtain the local maxima and minima of $x(t)$.
(2) Produce the upper and lower envelopes in accordance with the local maxima and the local minima of $x(t)$.
(3) The mean of the two envelopes is designated as $m_1(t)$, and the difference between $x(t)$ and $m_1(t)$ is the first component $h_1(t)$, $h_1(t) = x(t) - m_1(t)$.
(4) If $h_1(t)$ satisfies the requisite conditions for an IMF, $h_1(t)$ can be treated as the first IMF component of $x(t)$. If $h_1(t)$ does not satisfy the conditions for an IMF, $h_1(t)$ is treated as a new original signal. By repeating the above steps (1)–(3), we obtain $m_{11}(t)$ and $h_{11}(t) = h_1(t) - m_{11}(t)$. We repeat this sifting procedure $k$ times, until $h_{1k}(t)$ is an IMF, and set $c_1(t) = h_{1k}(t)$. Next, isolate $c_1(t)$ from $x(t)$ by $r_1(t) = x(t) - c_1(t)$, where $r_1(t)$ is the residue, and treat the residue as new data, that is, $x(t) = r_1(t)$.
(5) Repeat the above steps (1)–(4) until the original signal is decomposed into $n$ IMFs and the residue $r_n(t)$ becomes smaller than the predetermined value or becomes a monotonic function. Thus, the EMD process is completed.
After decomposition, $x(t)$ can be expressed as:

$$ x(t) = \sum_{i=1}^{n} c_i(t) + r_n(t) \tag{5} $$

C. LINEAR DISCRIMINANT ANALYSIS (LDA) AND LOCAL FISHER DISCRIMINANT ANALYSIS (LFDA)
The LDA proposed by Fisher [48] for dimension reduction of binary classification problems (and further extended to multi-class cases by Rao [49]) is one of the most popular methods for dimension reduction in statistics research [50], [51]. LDA not only finds an embedding transformation such that the between-class scatter is maximized and the within-class scatter is minimized, but also helps to improve the efficiency and accuracy of classification algorithms.
The objective of the original Fisher's LDA, namely Fisher's criterion, is to maximize the ratio of the between-class scatter matrix $S_b$ to the within-class scatter matrix $S_W$:

$$ J(u) = \max_{u} \frac{\left|u^{T} S_b u\right|}{\left|u^{T} S_W u\right|} \tag{6} $$

where $u$ is a vector, $u^{T} S_b u$ and $u^{T} S_W u$ are two scalars, and $|\cdot|$ is the absolute value operator. However, a large number of state classes are usually present for identification and classification of different bearing faults. Hence, the multi-class LDA is more desired [22].
Let $x_i \in \mathbb{R}^{d}$ $(i = 1, 2, \ldots, n)$ be $d$-dimensional samples and $y_i \in \{1, 2, \ldots, c\}$ be the associated class labels, where $n$ is the number of samples and $c$ is the total number of classes. Let $n_l$ be the number of samples in class $l$. When $r > 1$, where $r = c - 1$, a projection matrix $U$ is needed. Both $U^{T} S_b U$ and $U^{T} S_W U$ are $r \times r$ matrices, and their ratio cannot be computed directly. The determinant ratio is used:

$$ J(U) = \max_{U} \frac{\left|U^{T} S_b U\right|}{\left|U^{T} S_W U\right|} \tag{7} $$

where the between-class scatter matrix $S_b$ and the within-class scatter matrix $S_W$ are defined as follows:

$$ S_b = \sum_{l=1}^{c} n_l (\mu_l - \mu)(\mu_l - \mu)^{T} \tag{8} $$

$$ S_W = \sum_{l=1}^{c} \sum_{i: y_i = l} (x_i - \mu_l)(x_i - \mu_l)^{T} \tag{9} $$

$\mu_l$ is the mean of the samples in class $l$, and $\mu$ is the mean of all samples:

$$ \mu_l = \frac{1}{n_l} \sum_{i: y_i = l} x_i \tag{10} $$

$$ \mu = \frac{1}{n} \sum_{i=1}^{n} x_i = \frac{1}{n} \sum_{l=1}^{c} n_l \mu_l \tag{11} $$

The between-class scatter matrix $S_b$ and the within-class scatter matrix $S_W$ also have an equivalent pairwise form [42]:

$$ S_b = \frac{1}{2} \sum_{i,j=1}^{n} p_{ij}^{b} (x_i - x_j)(x_i - x_j)^{T} = X(D^{b} - P^{b})X^{T} \tag{12} $$

$$ S_W = \frac{1}{2} \sum_{i,j=1}^{n} p_{ij}^{W} (x_i - x_j)(x_i - x_j)^{T} = X(D^{W} - P^{W})X^{T} \tag{13} $$

where

$$ p_{ij}^{b} = \begin{cases} 1/n - 1/n_l, & y_i = y_j = l \\ 1/n, & y_i \neq y_j \end{cases} \tag{14} $$

$$ p_{ij}^{W} = \begin{cases} 1/n_l, & y_i = y_j = l \\ 0, & y_i \neq y_j \end{cases} \tag{15} $$

$P^{b} = [p_{ij}^{b}]_{n \times n}$ and $P^{W} = [p_{ij}^{W}]_{n \times n}$ are weight matrices, and $D^{b}$ and $D^{W}$ are diagonal matrices. $d_i^{b}$ is the $i$th diagonal element of $D^{b}$ and equals the sum of the elements of the $i$th row of $P^{b}$; $d_i^{W}$ is the $i$th diagonal element of $D^{W}$ and equals the sum of the elements of the $i$th row of $P^{W}$. The solution to minimize the within-class scatter variance and maximize the between-class
variance is obtained by an eigenvalue decomposition of $S_W^{-1} S_b$, taking the eigenvectors corresponding to the largest eigenvalues.
LFDA is a linear supervised dimensionality reduction method proposed by Sugiyama [42]. LFDA not only helps to simultaneously maximize between-class separability and preserve within-class local manifold structure in a reduced dimensional space, but also inherits an excellent property from LDA; that is, it has an analytic form of the embedding matrix, and the solution can be easily computed by solving a generalized eigenvalue problem [42].
Compared with LDA, LFDA incorporates local information into the definition of the weights. Furthermore, LFDA and LDA have the same optimization framework $J(U)$. However, $S_b$ has been replaced by $\tilde{S}_b$ and $S_W$ has been replaced by $\tilde{S}_W$. Let $\tilde{S}_b$ and $\tilde{S}_W$ be the local between-class scatter matrix and local within-class scatter matrix:

$$ \tilde{S}_b = \frac{1}{2} \sum_{i,j=1}^{n} \tilde{p}_{ij}^{b} (x_i - x_j)(x_i - x_j)^{T} = X(\tilde{D}^{b} - \tilde{P}^{b})X^{T} \tag{16} $$

$$ \tilde{S}_W = \frac{1}{2} \sum_{i,j=1}^{n} \tilde{p}_{ij}^{W} (x_i - x_j)(x_i - x_j)^{T} = X(\tilde{D}^{W} - \tilde{P}^{W})X^{T} \tag{17} $$

where

$$ \tilde{p}_{ij}^{b} = \begin{cases} A_{ij}(1/n - 1/n_l), & y_i = y_j = l \\ 1/n, & y_i \neq y_j \end{cases} \tag{18} $$

$$ \tilde{p}_{ij}^{W} = \begin{cases} A_{ij}/n_l, & y_i = y_j = l \\ 0, & y_i \neq y_j \end{cases} \tag{19} $$

$A_{ij} \in [0, 1]$ is the affinity between $x_i$ and $x_j$ given by

$$ A_{ij} = \exp\left(-\frac{\left\|x_i - x_j\right\|^{2}}{\gamma_i \gamma_j}\right) \tag{20} $$

where $\gamma_i$ is the local scaling around $x_i$, defined by $\gamma_i = \|x_i - x_i^{k}\|$, and $x_i^{k}$ is the $k$th nearest neighbor of $x_i$. $A_{ij}$ is large if $x_i$ and $x_j$ are close to each other in the feature space; otherwise it is small. The parameter $k$ is a tuning factor and is a function of the embedding space [42], [52]. According to the affinity $A_{ij}$, the values for the sample pairs in the same class are weighted. This means that sample pairs located far apart in the same class have less influence on $\tilde{S}_b$ and $\tilde{S}_W$. The values for the sample pairs in different classes, in contrast, are not weighted. The LFDA transformation matrix $U_{LFDA}$ can also be formed by solving a generalized eigenvalue problem of $\tilde{S}_b$ and $\tilde{S}_W$.

D. SUPPORT VECTOR MACHINE (SVM)
The support vector machine (SVM) is a statistical classification method based on the structural risk minimization approach proposed by Vapnik et al. [53]. The basic principle of SVM is that it determines the optimal separating hyperplane that minimizes the upper bound of the generalization error, by maximizing the margin between the separating hyperplane and the nearest sample points.
Consider a set of $N$ training data points $\{(x_i, y_i) \mid x_i \in \mathbb{R}^{n}, y_i \in \{-1, +1\}\}$, $i = 1, \ldots, N$, where $x_i$ is the input vector, $y_i$ is the label, and $N$ is the number of data samples. The sample space can be mapped onto a high-dimensional feature space by the non-linear mapping function $\varphi(x)$, and the maximum-margin separating hyperplane can be written as $w \cdot \varphi(x) + b = 0$, where $w$ is the normal direction of the separating plane and $b$ is a scalar. The distance between the closest sample points and the separating plane is $1/\|w\|$. Thus, maximizing $1/\|w\|$ is equivalent to minimizing $\|w\|$. The problem of constructing an optimal hyperplane can be transformed into the following quadratic optimization problem:

$$ \min_{w, b} \; \frac{1}{2}\|w\|^{2} + C \sum_{i=1}^{N} \xi_i \tag{21} $$

subject to

$$ y_i \left(w \cdot \varphi(x_i) + b\right) \geq 1 - \xi_i \quad \text{and} \quad \xi_i \geq 0 \quad \text{for } i = 1, \ldots, N \tag{22} $$

where $\xi_i$ are non-negative slack variables that are necessary to allow misclassification, and $C$ imposes a trade-off between training error and generalization. By using the duality theory of optimization, the final decision function can be presented as:

$$ f(x) = \operatorname{sgn}\left(\sum_{x_i \in SVs} y_i \alpha_i \langle \varphi(x_i), \varphi(x) \rangle + b\right) = \operatorname{sgn}\left(\sum_{x_i \in SVs} y_i \alpha_i K(x_i, x) + b\right) \tag{23} $$

where $\alpha_i$ are the Lagrange multipliers, which are determined during the optimization process, and the sums run over the support vectors (SVs). $K(x_i, x)$ is a kernel function, which allows access to spaces of high dimension without the need to explicitly know the mapping function. Typical kernel functions [13] include the linear kernel, polynomial kernel, radial basis function (RBF) kernel, and sigmoid kernel.
For roller element bearings, fault detection is a multi-class pattern recognition task, which can generally be solved by decomposing the multi-class problem into several binary class problems [54]. In [55], the multi-class patterns recognition was handled by the "one-against-one" approach. In this paper, we select the polynomial kernel to solve the multi-class pattern-recognition task.

III. PROPOSED METHOD AND THE SYSTEM FRAMEWORK
A. FEATURES SELECTION BY ADJUSTED RAND INDEX AND STANDARD DEVIATION RATIO (FSASR)
In this paper, we suggest that the most sensitive statistical characteristics should be selected before the implementation
of the fault patterns recognition technique. For this reason, the K-means method and STD are applied to a dataset that includes different statistical characteristics for all cases of bearing conditions. In FSASR, each kind of statistical characteristic is clustered by the K-means method, and the adjusted rand index (ARI) of the clustering result becomes an evaluation index of that statistical characteristic. For each kind of statistical characteristic, we also compute the STD of the characteristic samples in each bearing condition and sum the STD values over all bearing conditions. For each statistical characteristic, the higher the ARI, the greater the class discriminative degree of the characteristic will be. The lower the value of the STD, the greater the class cohesion of the characteristic will be. Therefore, the ratio of ARI to STD is selected to indicate the sensitivity of the statistical characteristic. The implementation details are described as follows.
Step 1: In the training dataset, there are $M$ kinds of bearing fault types, $N$ vibration signal samples in each type of bearing fault pattern, and $K$ kinds of statistical characteristics. By processing the vibration signals, we can obtain the characteristics sets $[CS_1, CS_2, \ldots, CS_K]$, where $CS_k$ can be expressed by:

$$ CS_k = \begin{pmatrix} S_{11}^{k} & S_{12}^{k} & \cdots & S_{1N}^{k} \\ S_{21}^{k} & S_{22}^{k} & \cdots & S_{2N}^{k} \\ \vdots & \vdots & \ddots & \vdots \\ S_{M1}^{k} & S_{M2}^{k} & \cdots & S_{MN}^{k} \end{pmatrix} \tag{24} $$

where $S_{ij}^{k}$ is the $k$th statistical characteristic of the $j$th sample in the $i$th kind of bearing fault type.
Next, $CS_k$ can be classified into $M$ clustering partitions using the K-means method. The ARI of the clustering partitions can be calculated to judge the accuracy of the clustering results [56], [57].
Given a set of $n$ objects $X = \{x_1, x_2, \ldots, x_n\}$, suppose $P = \{p_1, p_2, \ldots, p_n\}$ and $Q = \{q_1, q_2, \ldots, q_n\}$ represent the classes assigned to the objects by the K-means algorithm and the real class memberships, respectively. The ARI is then defined as [56], [57]:

$$ ARI = \frac{a - \dfrac{(a + c)(a + b)}{a + b + c + d}}{\dfrac{(a + c) + (a + b)}{2} - \dfrac{(a + c)(a + b)}{a + b + c + d}} \tag{25} $$

where:
a is the number of object pairs $\{x_i, x_j\}$ belonging to the same class in $Q$ and the same class in $P$.
b is the number of object pairs $\{x_i, x_j\}$ belonging to the same class in $Q$ and different classes in $P$.
c is the number of object pairs $\{x_i, x_j\}$ belonging to different classes in $Q$ and the same class in $P$.
d is the number of object pairs $\{x_i, x_j\}$ belonging to different classes in $Q$ and different classes in $P$.
ARI measures the degree of similarity between the obtained partition and the true clustering structure underlying the data. Its maximum value, 1, indicates complete agreement, and values near 0 indicate agreement no better than chance. Necessarily, the greater the value of the ARI, the better the clustering performance will be.
Once clustering analysis is performed for the characteristics sets $[CS_1, CS_2, \ldots, CS_K]$, the sequence $ARI = \{ARI(1), ARI(2), \ldots, ARI(K)\}$ can be obtained. In this paper, we presume that the greater the value of $ARI(k)$, the greater the class discriminative degree of the characteristic will be.
Step 2: The standard deviation of the characteristic samples of a statistical characteristic in each type of bearing condition is calculated, that is, the standard deviation of the elements of each row of the matrix $CS_k$. Therefore, we can obtain the standard deviation sets $[STD_1^{k}, STD_2^{k}, \ldots, STD_M^{k}]$, where $STD_i^{k}$ can be expressed by:

$$ STD_i^{k} = \sqrt{\frac{1}{N - 1} \sum_{j=1}^{N} \left(S_{ij}^{k} - \bar{S}_i^{k}\right)^{2}} \tag{26} $$

where

$$ \bar{S}_i^{k} = \frac{1}{N} \sum_{j=1}^{N} S_{ij}^{k} \tag{27} $$

Next, we can obtain $SSTD(k)$, which is the sum of the standard deviations of the characteristic samples of the $k$th statistical characteristic over all bearing conditions:

$$ SSTD(k) = \sum_{i=1}^{M} STD_i^{k} \tag{28} $$

The standard deviation is a measure that is used to quantify the amount of variation or dispersion of a set of data values [58]. The standard deviation sequence $SSTD = \{SSTD(1), SSTD(2), \ldots, SSTD(K)\}$ becomes another evaluation index for features extraction. In this paper, we presume that the lower the value of $SSTD(k)$, the greater the class cohesion of the characteristic.
Step 3: Obtain a new sequence, $ASR = \{ASR(1), ASR(2), \ldots, ASR(K)\}$, where $ASR(k)$ is defined as follows:

$$ ASR(k) = \frac{ARI(k)}{SSTD(k)} \tag{29} $$

In this paper, we presume that the greater the value of $ASR(k)$, the better the sensitivity of the corresponding statistical characteristic. Therefore, the sorted ratio sequence of the ARI and standard deviation (SASR) can be obtained by sorting the ASR in descending order.

B. PROPOSED SUPPORT MARGIN LOCAL FISHER DISCRIMINANT ANALYSIS (SM-LFDA)
Although LFDA preserves the local geometry of data, it only considers the neighboring relationships between samples of the same class. The neighboring relationships between samples of different classes are not considered. We propose support margin local Fisher discriminant analysis (SM-LFDA). For SM-LFDA, it naturally inherits the merits of LFDA, and
the neighboring relationships between samples of different

classes are considered. The underlying idea of solving the
problem above is that the desired projection should make the
distance between the nearest neighbors of different classes
the furthest.
The objective of LFDA, namely the LFDA criterion, is to simultaneously maximize between-class separability and preserve the within-class local manifold structure in a reduced dimensional space. Compared with LFDA, SM-LFDA refines the local information used in the definition of the weights. SM-LFDA and LFDA share the same optimization framework J(U), but $S_b$ and $S_W$ are replaced by $\tilde{S}^{Sb}$ and $\tilde{S}^{SW}$, respectively. Based on the definitions of LDA and LFDA in Section II, we define the local between-class scatter matrix $\tilde{S}^{Sb}$ and the local within-class scatter matrix $\tilde{S}^{SW}$ as follows:

$$\tilde{S}^{Sb} = \frac{1}{2}\sum_{i,j=1}^{n} \tilde{p}^{Sb}_{ij}(x_i - x_j)(x_i - x_j)^T = X(\tilde{D}^{Sb} - \tilde{P}^{Sb})X^T \tag{30}$$

$$\tilde{S}^{SW} = \frac{1}{2}\sum_{i,j=1}^{n} \tilde{p}^{SW}_{ij}(x_i - x_j)(x_i - x_j)^T = X(\tilde{D}^{SW} - \tilde{P}^{SW})X^T \tag{31}$$

where

$$\tilde{p}^{Sb}_{ij} = \begin{cases} A_{ij}(1/n - 1/n_l), & y_i = y_j = l, \\ n_l/n, & y_i \neq y_j \ (j \in Nst(i)), \\ 1/n, & \text{else}, \end{cases} \tag{32}$$

$$\tilde{p}^{SW}_{ij} = \begin{cases} A_{ij}/n_l, & y_i = y_j = l, \\ 0, & y_i \neq y_j, \end{cases} \tag{33}$$

where $\tilde{P}^{Sb} = [\tilde{p}^{Sb}_{ij}]_{n \times n}$ and $\tilde{P}^{SW} = [\tilde{p}^{SW}_{ij}]_{n \times n}$ are weight matrices, and $\tilde{D}^{Sb}$ and $\tilde{D}^{SW}$ are diagonal matrices. $\tilde{d}^{Sb}_i$, the ith diagonal element of $\tilde{D}^{Sb}$, is the sum of the elements of the ith row of $\tilde{P}^{Sb}$, and $\tilde{d}^{SW}_i$, the ith diagonal element of $\tilde{D}^{SW}$, is the sum of the elements of the ith row of $\tilde{P}^{SW}$. In $\tilde{p}^{Sb}_{ij}$, the condition $y_i \neq y_j \ (j \in Nst(i))$ means that j is the nearest neighbor of i and the two samples belong to different classes. In this case, the weight value is set to $n_l/n$.

It follows that the steps of the SM-LFDA algorithm generally agree with those of LFDA; only the weight values that change the local geometry between nearest neighbors of different classes are modified, according to (32).

For the n × d dimensional feature matrix X, n is the number of samples. (In (30) and (31), X collects the samples as columns; with the samples stored as rows, the same matrices read $X^T(\tilde{D} - \tilde{P})X$.) The goal of dimensionality reduction is to find the d × k dimensional projection matrix U that yields the low-dimensional representation Z, where Z is an n × k dimensional matrix and k (k < d) is the total number of projection vectors. The detailed procedure of SM-LFDA is as follows:
Step 1: Compute the d-dimensional mean vectors for the different classes from the dataset.
Step 2: Compute the between-class scatter matrix $\tilde{S}^{Sb}$ and the within-class scatter matrix $\tilde{S}^{SW}$.
Step 3: Compute the eigenvectors and corresponding eigenvalues for the scatter matrix $\tilde{S}^{Sb}$ and the scatter matrix $\tilde{S}^{SW}$; thus, the eigenvectors $V_1, V_2, \ldots, V_d$ and the corresponding eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_d$ are obtained.
Step 4: Sort the eigenvectors by decreasing eigenvalue and choose the k eigenvectors with the largest eigenvalues to form the d × k dimensional projection matrix U.
Step 5: Compute Z = X × U. The d-dimensional samples are thereby transformed into k-dimensional samples, and the dimensionality reduction procedure is complete.

Finally, with SM-LFDA, the low-dimensional feature matrices of the training and testing datasets are obtained with more sensitive and less redundant information for bearing fault identification and classification.

FIGURE 2. Experimental test stand [43].

C. SYSTEM FRAMEWORK
The implementation of the proposed diagnostic model is shown in Fig. 3, where the statistical analysis approach and the artificial intelligence approach are systematically blended to detect and diagnose rolling element bearing faults. The entire fault diagnosis procedure is divided into four steps: signal processing, features extraction, features reduction, and patterns recognition.

In the first step, vibration signals collected from bearings are decomposed into several IMFs by EMD. In the second step, the original vibration signals and the IMFs of the training samples are used to generate statistical characteristics. With the proposed features extraction approach FSASR, the most sensitive statistical characteristics are selected to construct the feature vector for the diagnostic model. The sequence numbers of the most sensitive statistical characteristics are applied directly to feature extraction for the testing samples. For features reduction, the low-dimensional training feature space is obtained by SM-LFDA, which generates a projection that can also be used to reduce the dimensionality of the testing feature space. For the second and third steps, the SASR and the projection matrix are obtained by processing the training set and can be used directly by the testing set. In the last step, the low-dimensional training feature set, together with the corresponding fault types, is used to train the classifier, and the trained classifier is then employed to conduct fault patterns recognition using the low-dimensional feature set of the testing data. Finally, the
proposed method will output the fault identification and classification accuracy.

FIGURE 3. Implementation of the proposed fault diagnostic technique.

IV. EXPERIMENTS AND ANALYSIS RESULTS
A. EXPERIMENTAL SETUP AND CASES
To validate the proposed bearing fault diagnosis method, we use the bearing vibration test data made freely available by the Bearing Data Center of Case Western Reserve University (CWRU) [43]. Fig. 2 shows the system used for measuring the data, which includes an electric motor (left), a torque transducer/encoder (center), a dynamometer (right), and control circuitry (not shown). The bearings used in this work are deep-groove ball bearings of the type 6205-2RS JEM SKF at DE, of which the specifications are listed in Table 1. Single faults (ball fault, inner race fault, and outer race fault) were separately seeded on the normal bearing with different defect sizes (0.007in, 0.014in, 0.021in, and 0.028in) using electro-discharge machining [24].

The vibration signals were collected using accelerometers under different motor loads of 0-3 hp (motor speeds of 1730 to 1797 rpm). To evaluate the effectiveness, adaptability, and robustness of the bearing fault diagnosis method, the data recorded at 2 hp and 3 hp with different fault types and degrees were employed. The detailed information of the vibration dataset used is presented in Table 1, where the ball and inner race faults each have four fault degrees and the outer race fault has three fault degrees. For each case, there is also a normal condition. Therefore, there are 12 working conditions, and these conditions correspond to 12 fault patterns in each case. For each fault pattern, 60 samples are acquired from the time-domain vibration signals, and each sample contains 2000 continuous data points. The 60 samples of each fault pattern were collected from the bearings installed at the drive end of the motor housing, with a sampling frequency of 12 kHz.

To verify the adaptability of the proposed diagnosis method, the samples of a fixed motor load are selected as training samples and the samples of different motor loads are selected as testing samples. This experimental setup differs from the setups employed in previous research [4], [22], [52]. Therefore, two cases are employed in the experiments. In case 1, 40 random samples of 3 hp are selected as testing samples. In case 2, 40 random samples of 2 hp are selected as testing samples. For training, both cases use the same remaining 20 samples of 3 hp.

TABLE 1. The detail information of the used vibration dataset.
FIGURE 4. The original ORF (outer race fault) vibration signal.
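The segmentation and the two train/test cases described above can be sketched as follows. This is only an illustration under stated assumptions: the samples are taken as consecutive, non-overlapping windows (the text does not say whether windows overlap), synthetic noise stands in for the CWRU records, and the helper names `segment` and `split_cases` are hypothetical.

```python
import numpy as np

SAMPLE_LEN = 2000        # data points per sample (from the text)
N_SAMPLES = 60           # samples per fault pattern (from the text)
N_TRAIN, N_TEST = 20, 40 # training / testing samples per fault pattern

def segment(signal, n_samples=N_SAMPLES, sample_len=SAMPLE_LEN):
    """Cut a 1-D vibration record into consecutive, non-overlapping samples."""
    signal = np.asarray(signal, dtype=float)
    needed = n_samples * sample_len
    if signal.size < needed:
        raise ValueError(f"record too short: {signal.size} < {needed}")
    return signal[:needed].reshape(n_samples, sample_len)

def split_cases(samples_3hp, samples_2hp, seed=0):
    """Return the shared training set and the testing sets of case 1 and case 2."""
    rng = np.random.default_rng(seed)
    perm_3hp = rng.permutation(len(samples_3hp))
    test_idx_3hp = perm_3hp[:N_TEST]                # case 1: 40 random 3 hp samples
    train_idx = perm_3hp[N_TEST:N_TEST + N_TRAIN]   # remaining 20 samples of 3 hp
    test_idx_2hp = rng.permutation(len(samples_2hp))[:N_TEST]  # case 2
    return {
        "train": samples_3hp[train_idx],
        "test_case1": samples_3hp[test_idx_3hp],
        "test_case2": samples_2hp[test_idx_2hp],
    }

# Synthetic records stand in for one fault pattern at each motor load.
rng = np.random.default_rng(1)
record_3hp = rng.standard_normal(121_000)
record_2hp = rng.standard_normal(121_000)
split = split_cases(segment(record_3hp), segment(record_2hp))
```

The same split would be repeated for each of the 12 fault patterns, so that the classifier is trained only on 3 hp data and tested on either 3 hp (case 1) or 2 hp (case 2) data.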
B. ANALYSIS RESULTS
According to the system framework shown in Fig. 3, the first step is signal processing, in which vibration signals collected from bearings are decomposed into several IMFs by EMD. One vibration signal sample from the training set and the 8 IMFs obtained from it are presented in Fig. 4 and Fig. 5. The bearing running frequency is about 28.8 Hz at the machine running speed of 1730 r/min, while the bearing ORF characteristic frequency is 103.4 Hz, which can be calculated from the SKF-6205-2RS bearing parameters and the theoretical formula for the fault characteristic frequency. Fig. 6(a)–(f) shows the HES of IMF1–IMF6. In Fig. 6(a)–(d), the HES of IMF1–IMF4 show explicit spectral lines near the running frequency (29.3 Hz) and double the running frequency. Similarly, there are explicit spectral lines near the theoretical fault characteristic frequency (105.5 Hz) and double the fault frequency (205.1 Hz). The HMS expresses the cumulative frequency amplitude across the entire measured time period and contains the frequency characteristics of each IMF component. Therefore, the HMS presents different frequency amplitude distribution characteristics under different fault conditions. The HMS of the ORF vibration signal sample is shown in Fig. 7. Therefore, in this paper, IMF1-IMF4 are selected to calculate the HMS (HHT marginal spectrum) and HES (HHT envelope spectrum) [59], which are used to generate statistical characteristics.

FIGURE 5. IMF1-IMF8 of ORF vibration signal.
FIGURE 6. HES of IMF1-IMF6 of ORF vibration signal (a) IMF1; (b) IMF2; (c) IMF3; (d) IMF4; (e) IMF5; (f) IMF6.
FIGURE 7. The HMS of ORF vibration signal.
TABLE 2. Statistical parameters.
FIGURE 8. Two time-domain statistical characteristics of training samples.

After the first step of the proposed bearing fault diagnosis method, features extraction is conducted, in which multi-domain statistical characteristics are calculated and the proposed FSASR is employed to select the sensitive statistical characteristics for constructing multi-domain feature vectors. For each sample, the four IMFs, the four HES, and the HMS generate 81 statistical characteristics by using the 9 statistical parameters shown in Table 2. The class discriminative degree differs among these 81 statistical characteristics, which can be seen in Figs. 8 and 9. In this paper, we provide four examples, of which two are time-domain characteristics (mean value and skewness) and two are HES statistical characteristics (standard deviation and shape factor). Because the class discriminative degree differs from one characteristic to another, the features extraction step should be employed to select the characteristics that are more suitable for fault diagnosis.
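The construction of the 81-dimensional original feature set (9 statistical parameters for each of 4 IMFs, 4 HES, and 1 HMS) can be sketched as below. Table 2 is not reproduced in this text, so the particular set of nine parameters used here (mean, standard deviation, RMS, peak, skewness, kurtosis, crest factor, shape factor, impulse factor) is an assumption chosen for illustration; only the mean, skewness, standard deviation, and shape factor are named in the text. The helper names are hypothetical, and random arrays stand in for the real IMFs and spectra.

```python
import numpy as np

def nine_statistics(x):
    """Return nine statistical parameters of a 1-D signal (assumed set, see lead-in)."""
    x = np.asarray(x, dtype=float)
    mean = x.mean()
    std = x.std()                      # population standard deviation
    rms = np.sqrt(np.mean(x ** 2))
    peak = np.max(np.abs(x))
    abs_mean = np.mean(np.abs(x))
    centered = x - mean
    skewness = np.mean(centered ** 3) / std ** 3
    kurtosis = np.mean(centered ** 4) / std ** 4
    crest = peak / rms
    shape = rms / abs_mean
    impulse = peak / abs_mean
    return np.array([mean, std, rms, peak, skewness, kurtosis, crest, shape, impulse])

def feature_vector(imfs, hes, hms):
    """Stack the statistics of 4 IMFs, 4 HES and 1 HMS into one 81-dimensional vector."""
    signals = list(imfs) + list(hes) + [hms]
    return np.concatenate([nine_statistics(s) for s in signals])

# Synthetic stand-ins for the IMFs, envelope spectra and marginal spectrum of one sample.
rng = np.random.default_rng(0)
imfs = rng.standard_normal((4, 2000))
hes = np.abs(rng.standard_normal((4, 1000)))
hms = np.abs(rng.standard_normal(1000))
features = feature_vector(imfs, hes, hms)
```

A constant signal would give a zero standard deviation and therefore undefined skewness and kurtosis; a practical implementation would need to guard against that case.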
When the original feature set has been obtained, the features selection method FSASR is employed to select the most sensitive statistical characteristics as the input feature vectors for training the classifier. In this procedure, a new evaluation index, ASR, which evaluates the sensitivity of statistical characteristics for bearing fault diagnosis, is obtained by combining ARI and SSTD. The ARI, SSTD, and ASR of the 81 statistical characteristics of the training samples are presented in Figs. 10, 11, and 12, respectively. In Fig. 10, the horizontal axis represents the index of the statistical characteristic. Indices 1-9, 10-18, 19-27, and 28-36 represent the time-domain characteristics of IMF1-IMF4, respectively. Indices 37-45, 46-54, 55-63, and 64-72 represent the HES characteristics of IMF1-IMF4, respectively. Indices 73-81 represent the HMS characteristics of a sample. Fig. 12 shows that different statistical characteristics have different sensitivity.

After the features extraction step, we obtain a training feature set for training the classifier. Because of the dimensionality of the training feature set, the proposed dimensionality reduction approach, SM-LFDA, is applied to reduce the dimensionality of the input feature vectors to the classifier before performing fault identification and classification. For the SVM classifier used in this article, two parameters need to be determined, c and g. Therefore, we use the training samples and the PSO method, combined with cross-validation, to
obtain the optimal parameters. The SVM used in all of the following experiments uses the same parameter optimization method to obtain the optimal parameters.

To verify the effectiveness and adaptability of the proposed bearing fault diagnosis method, a series of comparative experiments are conducted. These experiments are divided into two groups, which are employed to verify the effectiveness and adaptability of the proposed FSASR and SM-LFDA. They are described in detail below.

In the first group, FSASR is not applied, and the 81 statistical characteristics, named the original feature set (OFS), are directly processed by dimensionality reduction methods, such as principal component analysis (PCA), LDA, LFDA, and SM-LFDA. OFS-SVM is an SVM-based diagnosis model in which the OFS is the input to the SVM. Table 3 presents the diagnosis results of the testing set. The testing accuracy of case 1 is obviously higher than that of case 2. OFS-PCA-SVM is an SVM-based diagnosis model that uses PCA. Table 4 presents the diagnosis results of the two cases with different dimensionality reduction sizes for PCA. The table shows that, as the size of dimensionality reduction increases, the diagnosis accuracies tend to increase. However, the maximum testing accuracy of the OFS-PCA-SVM model for case 2 is only 62.92%. Table 5 presents the diagnosis results of the two cases with different dimensionality reduction sizes obtained by LDA, LFDA, and SM-LFDA. According to the experiment results in Table 5, for case 1, all three diagnosis methods achieve desirable performance. For case 2, compared with OFS-SVM and OFS-PCA-SVM, the performance of these diagnosis models (OFS-LDA-SVM, OFS-LFDA-SVM, and OFS-(SM-LFDA)-SVM) improves obviously; among them, OFS-(SM-LFDA)-SVM performs better than the other two models, with a testing accuracy of up to 92.29%. In the experiments mentioned above, case 1 and case 2 are tested with various approaches. We can summarize that, since LDA, LFDA, and SM-LFDA are supervised dimensionality reduction methods that consider the label information, they achieve preferable prediction effects and generalization ability.

FIGURE 9. Two HES statistical characteristics of training samples.
FIGURE 10. The ARI of 81 statistical characteristics of training samples.
FIGURE 11. The SSTD of 81 statistical characteristics of training samples.
FIGURE 12. The ASR of 81 statistical characteristics of training samples.
TABLE 3. Bearing fault diagnosis results obtained by the OFS-SVM.
TABLE 4. Bearing fault diagnosis results obtained by the OFS-PCA-SVM.
TABLE 5. Bearing fault diagnosis results obtained by the OFS-(LDA/LFDA/SM-LFDA)-SVM.
TABLE 6. Bearing fault diagnosis results obtained by the OFS-FSASR-SVM.
FIGURE 13. The diagnosis results with different number of sfn.
TABLE 7. Bearing fault diagnosis results obtained by the OFS-FSASR-PCA-SVM (dimension size is 20).

In the second group, FSASR is applied to select the most sensitive statistical characteristics before the implementation of feature reduction and fault diagnosis. The 81 statistical characteristics are therefore processed by FSASR, and different numbers of selected characteristics are used as the input to dimensionality reduction methods, such as PCA, LDA, LFDA, and SM-LFDA. OFS-FSASR-SVM is an SVM-based diagnosis model in which the OFS is processed by FSASR and the characteristics selected by FSASR are input to the SVM without any dimensionality reduction method. Table 6 presents the testing results of case 1 and case 2. The sfn is the number of characteristics selected by FSASR. According to the results in Table 6, the testing accuracies of case 1 and case 2 reach 97.92% and 93.33%, respectively. Compared with the results in Table 3, it is evident that the use of FSASR improves the testing accuracies of case 1 and case 2. The results also indicate that the number of sensitive characteristics affects the accuracy of bearing fault diagnosis. Fig. 13 plots the diagnosis results presented in Table 6. OFS-FSASR-PCA-SVM is an SVM-based diagnosis model that uses FSASR and PCA. With the dimension size kept at 20, Table 7 presents the testing results of case 1 and case 2. Compared with the results in Table 6, the performance of the OFS-FSASR-PCA-SVM model shows no obvious improvement. Furthermore, Fig. 14 presents the diagnosis results of the two cases with different dimensionality reduction sizes for PCA. For comparisons,

three other models, OFS-FSASR-LDA-SVM, OFS-FSASR-LFDA-SVM, and OFS-FSASR-(SM-LFDA)-SVM, are evaluated for case 1 and case 2 based on LDA, LFDA, and SM-LFDA, respectively. With the dimension size kept at 11, the diagnosis results of the testing set for these three dimensionality reduction methods are presented in Tables 8, 9, and 10, respectively. Fig. 15 plots the results presented in Tables 8, 9, and 10. According to Table 8 (LDA), the maximum testing accuracies of case 1 and case 2 are 100% and 92.71%, respectively. In Table 9 (LFDA), the maximum testing accuracies of case 1 and case 2 are 100% and 96.67%, respectively. In Table 10 (SM-LFDA), the maximum testing accuracies of case 1 and case 2 are 100% and 97.71%, respectively. To compare the experimental results of the two testing sets with different dimensionality reduction methods and different dimension sizes, Figs. 16, 17, and 18 plot the testing accuracies obtained by LDA, LFDA, and SM-LFDA for dimension sizes of 5, 7, and 9, respectively. The dimension size is indicated in each figure. Figs. 19 and 20 present the testing accuracies of case 1 and case 2 with the different dimensionality reduction methods in the second group of experiments. The results show that different values of sfn lead to different results. When a suitable sfn is selected, the diagnosis accuracy improves significantly. Comparing the three fault diagnosis models (OFS-FSASR-LDA-SVM, OFS-FSASR-LFDA-SVM, and OFS-FSASR-(SM-LFDA)-SVM), OFS-FSASR-LFDA-SVM and OFS-FSASR-(SM-LFDA)-SVM perform better than OFS-FSASR-LDA-SVM, and OFS-FSASR-(SM-LFDA)-SVM performs best. According to Figs. 15-20, the fault diagnosis attains good performance over a relatively wide range of sfn. For example, the OFS-FSASR-(SM-LFDA)-SVM model attains good performance when sfn ranges from 25 to 50. On the one hand, this verifies the validity of the design of the correlation parameter. On the other hand, it verifies that the proposed bearing fault diagnosis algorithm has great adaptability.

FIGURE 14. The diagnosis results obtained by OFS-FSASR-PCA-SVM with different number of dimension sizes for PCA. (The PC represents the number of dimension size.)
TABLE 8. Bearing fault diagnosis results obtained by the OFS-FSASR-LDA-SVM (dimension size is 11).
TABLE 9. Bearing fault diagnosis results obtained by the OFS-FSASR-LFDA-SVM (dimension size is 11).
TABLE 10. Bearing fault diagnosis results obtained by the OFS-FSASR-(SM-LFDA)-SVM (dimension size is 11).
FIGURE 15. The diagnosis results with different number of sfn and different dimensionality reduction methods. (Dimension size is 11.)
FIGURE 16. The diagnosis results with different number of sfn and different dimensionality reduction methods. (Dimension size is 5.)
FIGURE 17. The diagnosis results with different number of sfn and different dimensionality reduction methods. (Dimension size is 7.)
FIGURE 18. The diagnosis results with different number of sfn and different dimensionality reduction methods. (Dimension size is 9.)
FIGURE 19. The diagnosis results of testing set of case 1 with the use of FSASR and different dimensionality reduction methods. (The output dimension sizes of PCA, LDA, LFDA, and SM-LFDA are 10, 11, 11, and 11, respectively.)
FIGURE 20. The diagnosis results of testing set of case 2 with the use of FSASR and different dimensionality reduction methods. (The output dimension sizes of PCA, LDA, LFDA, and SM-LFDA are 10, 11, 11, and 11, respectively.)

V. CONCLUSION
This paper proposed a novel procedure to identify and classify different bearing fault conditions. The proposed procedure, which systematically blends the statistical analysis approach with artificial intelligence, is developed using EMD as the multi-domain features generation approach, FSASR as the most sensitive features extraction method, SM-LFDA as a feature dimensionality reduction technique, and SVM as an automated fault patterns recognition system. The experimental data, which contains different bearing fault conditions such as ball fault, inner race fault, and

outer race fault at different motor loads of 3 hp and 2 hp (motor speeds of 1730 and 1750 rpm) for different defect sizes (0.007in, 0.014in, 0.021in, 0.028in), was obtained from the Case Western Reserve University Bearing Data Center website.

According to the results, the proposed bearing fault diagnosis method has great potential to be an effective and adaptable tool for precise identification and classification of bearing faults for a variety of bearing conditions, fault locations, and fault types. In the experiments, we use samples of the same motor load of 3 hp as the training set. The samples of motor loads of 3 hp and 2 hp are then selected as the testing sets of case 1 and case 2, respectively. For case 1, the proposed procedure provides a classification accuracy of 100% using FSASR and SM-LFDA. For case 2, the proposed procedure provides a classification accuracy of 97.71% using FSASR and SM-LFDA. Compared with the traditional, single-step diagnosis model, the final analysis results show the effectiveness and adaptability of the proposed model, which makes it a promising basis for an intelligent bearing fault diagnosis system for practical applications.

This work is supported by the National Key R&D Program of China (NO. 2017YFC0804400, NO. 2017YFC0804401) and the National Key Basic Research Program of China (973 Program, NO. 2014CB046300).

REFERENCES
[1] Y. Lei, J. Lin, Z. He, and M. J. Zuo, "A review on empirical mode decomposition in fault diagnosis of rotating machinery," Mech. Syst. Signal Process., vol. 35, nos. 1–2, pp. 108–126, 2013.
[2] N. A. Othman, N. S. Damanhuri, and V. Kadirkamanathan, "The study of fault diagnosis in rotating machinery," in Proc. IEEE Int. Colloquium Signal Process. ITS Appl., Mar. 2009, pp. 69–74.
[3] H. Shao, H. Jiang, F. Wang, and Y. Wang, "Rolling bearing fault diagnosis using adaptive deep belief network with dual-tree complex wavelet packet," ISA Trans., vol. 69, pp. 187–201, Jul. 2017.
[4] X. Xue and J. Zhou, "A hybrid fault diagnosis approach based on mixed-domain state features for rotating machinery," ISA Trans., vol. 66, pp. 284–295, Jan. 2017.
[5] L. Zhang, G. Xiong, H. Liu, H. Zou, and W. Guo, "Bearing fault diagnosis using multi-scale entropy and adaptive neuro-fuzzy inference," Expert Syst. Appl., vol. 37, no. 8, pp. 6077–6085, 2010.
[6] N. Gebraeel, M. Lawley, R. Liu, and V. Parmeshwaran, "Residual life predictions from vibration-based degradation signals: A neural network approach," IEEE Trans. Ind. Electron., vol. 51, no. 3, pp. 694–700, Jun. 2004.
[7] O. V. Thorsen and M. Dalva, "Failure identification and analysis for high-voltage induction motors in the petrochemical industry," IEEE Trans. Ind. Appl., vol. 35, no. 4, pp. 810–818, Jul. 1999.
[8] R. N. Bell et al., "Report of large motor reliability survey of industrial and commercial installations, part I," IEEE Trans. Ind. Appl., vol. IA-21, no. 4, pp. 865–872, Jul. 1985.
[9] A. Garcia-Perez, R. de J. Romero-Troncoso, E. Cabal-Yepez, and R. A. Osornio-Rios, "The application of high-resolution spectral analysis for identifying multiple combined faults in induction motors," IEEE Trans. Ind. Electron., vol. 58, no. 3, pp. 2002–2010, May 2011.
[10] Z. Gao, C. Cecati, and S. X. Ding, "A survey of fault diagnosis and fault-tolerant techniques—Part I: Fault diagnosis with model-based and signal-based approaches," IEEE Trans. Ind. Electron., vol. 62, no. 3, pp. 3757–3767, Jun. 2015.
[11] Z. Gao, C. Cecati, and S. X. Ding, "A survey of fault diagnosis and fault-tolerant techniques—Part II: Fault diagnosis with knowledge-based and hybrid/active approaches," IEEE Trans. Ind. Electron., vol. 62, no. 6, pp. 3768–3774, Jun. 2015.
[12] M. Kang, J. Kim, L. M. Wills, and J.-M. Kim, "Time-varying and multiresolution envelope analysis and discriminative feature analysis for bearing fault diagnosis," IEEE Trans. Ind. Electron., vol. 62, no. 12, pp. 7749–7761, Dec. 2015.
[13] Z. Feng, M. Liang, and F. Chu, "Recent advances in time–frequency analysis methods for machinery fault diagnosis: A review with application examples," Mech. Syst. Signal Process., vol. 38, no. 1, pp. 165–205, 2013.
[14] X. Jiao, B. Jing, Y. Huang, J. Li, and G. Xu, "Research on fault diagnosis of airborne fuel pump based on EMD and probabilistic neural networks," Microelectron. Rel., vol. 75, pp. 296–308, Aug. 2017.
[15] J. Zhong and Y. Huang, "Time-frequency representation based on an adaptive short-time Fourier transform," IEEE Trans. Signal Process., vol. 58, no. 10, pp. 5118–5128, Oct. 2010.
[16] E. Cabal-Yepez, A. G. Garcia-Ramirez, R. J. Romero-Troncoso, A. Garcia-Perez, and R. A. Osornio-Rios, "Reconfigurable monitoring system for time-frequency analysis on industrial equipment through STFT and DWT," IEEE Trans. Ind. Informat., vol. 9, no. 2, pp. 760–771, May 2013.
[17] Y. Yang, X. J. Dong, Z. K. Peng, W. M. Zhang, and G. Meng, "Vibration signal analysis using parameterized time–frequency method for features extraction of varying-speed rotary machinery," J. Sound Vibrat., vol. 335, pp. 350–366, Jan. 2015.
[18] J. Chen et al., "Wavelet transform based on inner product in fault diagnosis of rotating machinery: A review," Mech. Syst. Signal Process., vols. 70–71, pp. 1–35, Mar. 2016.
[19] H. Qiu, J. Lee, J. Lin, and G. Yu, "Wavelet filter-based weak signature detection method and its application on rolling element bearing prognostics," J. Sound Vibrat., vol. 289, nos. 4–5, pp. 1066–1090, 2006.
[20] H. Talhaoui, A. Menacer, A. Kessal, and R. Kechida, "Fast Fourier and discrete wavelet transforms applied to sensorless vector control induction motor for rotor bar faults diagnosis," ISA Trans., vol. 53, no. 5, pp. 1639–1649, 2014.
[21] Y. Wang, G. Xu, L. Liang, and K. Jiang, "Detection of weak transient signals based on wavelet packet transform and manifold learning for rolling element bearing fault diagnosis," Mech. Syst. Signal Process., vols. 54–55, pp. 259–276, Mar. 2015.
[22] I. Attoui, N. Fergani, N. Boutasseta, B. Oudjani, and A. Deliou, "A new time–frequency method for identification and classification of ball bearing faults," J. Sound Vibrat., vol. 397, pp. 241–265, Jun. 2017.
[23] L. Saidi, J. B. Ali, F. Fnaiech, and B. Morello, "Bi-spectrum based-EMD applied to the non-stationary vibration signals for bearing faults diagnosis," in Proc. IEEE 6th Int. Conf. Soft Comput. Pattern Recognit., Aug. 2014, pp. 25–30.
[24] L. Lu, J. Yan, and C. W. de Silva, "Dominant feature selection for the fault diagnosis of rotary machines using modified genetic algorithm and empirical mode decomposition," J. Sound Vibrat., vol. 344, pp. 464–483, May 2015.
[25] Z. Yang, B. W.-K. Ling, and C. Bingham, "Trend extraction based on separations of consecutive empirical mode decomposition components in Hilbert marginal spectrum," Measurement, vol. 46, no. 8, pp. 2481–2491, 2013.
[26] X. Liu, L. Bo, and H. Luo, "Bearing faults diagnostics based on hybrid LS-SVM and EMD method," Measurement, vol. 59, pp. 145–166, Jan. 2015.
[27] T. Guo and Z. Deng, "An improved EMD method based on the multi-objective optimization and its application to fault feature extraction of rolling bearing," Appl. Acoust., vol. 127, pp. 46–62, Dec. 2017.
[28] M. Kang, J. Kim, J.-M. Kim, A. C. C. Tan, E. Y. Kim, and B.-K. Choi, "Reliable fault diagnosis for low-speed bearings using individually trained support vector machines with kernel discriminative feature analysis," IEEE Trans. Power Electron., vol. 30, no. 3, pp. 2786–2797, May 2015.
[29] C. Y. Yang and T. Y. Wu, "Diagnostics of gear deterioration using EEMD approach and PCA process," Measurement, vol. 61, pp. 75–87, Feb. 2015.
[30] Z. Shen, X. Chen, X. Zhang, and Z. He, "A novel intelligent gear fault diagnosis model based on EMD and multi-class TSVM," Measurement, vol. 45, pp. 30–40, Jan. 2012.
[31] X. Zhang, D. Qiu, and F. Chen, "Support vector machine with parameter optimization by a novel hybrid method and its application to fault diagnosis," Neurocomputing, vol. 149, pp. 641–651, Feb. 2015.
[32] H. H. Bafroui and A. Ohadi, "Application of wavelet energy and Shannon entropy for feature extraction in gearbox fault detection under varying speed conditions," Neurocomputing, vol. 133, pp. 437–445, Jun. 2014.

[33] H. Sun et al., "Multiwavelet transform and its applications in mechanical fault diagnosis—A review," Mech. Syst. Signal Process., vol. 43, nos. 1–2, pp. 1–24, 2014.
[34] F. Ferracuti, A. Giantomassi, S. Iarlori, G. Ippoliti, and S. Longhi, "Electric motor defects diagnosis based on kernel density estimation and Kullback–Leibler divergence in quality control scenario," Eng. Appl. Artif. Intell., vol. 44, pp. 25–32, Sep. 2015.
[35] S. Yin, X. Zhu, and O. Kaynak, "Improved PLS focused on key-performance-indicator-related fault diagnosis," IEEE Trans. Ind. Electron., vol. 62, no. 3, pp. 1651–1658, Mar. 2015.
[36] F. Nie, D. Xu, I. W. Tsang, and C. Zhang, "Flexible manifold embedding: A framework for semi-supervised and unsupervised dimension reduction," IEEE Trans. Image Process., vol. 19, no. 7, pp. 1921–1932, Jul. 2010.
[37] Y. Liu, Y. Zhang, Z. Yu, and M. Zeng, "Incremental supervised locally linear embedding for machinery fault diagnosis," Eng. Appl. Artif. Intell., vol. 50, pp. 60–70, Apr. 2016.
[38] Y. Bengio, J.-F. Paiement, P. Vincent, O. Delalleau, N. L. Roux, and M. Ouimet, "Out-of-sample extensions for LLE, isomap, MDS, eigenmaps, and spectral clustering," in Proc. Int. Conf. Neural Inf. Process. Syst., 2004, pp. 177–184.
[39] N. Vlassis, Y. Motomura, and B. Kröse, "Supervised dimension reduction of intrinsically low-dimensional data," Neural Comput., vol. 14, no. 1, pp. 191–215, 2002.
[40] J.-B. Yu, "Bearing performance degradation assessment using locality preserving projections," Expert Syst. Appl., vol. 38, no. 3, pp. 7440–7450, 2011.
[41] B. S. J. Costa, P. P. Angelov, and L. A. Guedes, "Fully unsupervised fault detection and identification based on recursive density estimation and self-evolving cloud-based classifier," Neurocomputing, vol. 150, pp. 289–303, Feb. 2015.
[42] M. Sugiyama, "Dimensionality reduction of multimodal labeled data by local fisher discriminant analysis," J. Mach. Learn. Res., vol. 8, pp. 1027–1061, May 2007.
[43] Case Western Reserve University. (2014). [Online]. Available: http://csegroups.case.edu%2fbearingdatacenter%2fpages%2fdownload-data-file/RK=2/RS=kGDBzlCSmdjv0gV3BUuZvZictsM-
[44] F. Immovilli, F. Immovilli, R. Rubini, and C. Tassoni, "Diagnosis of bearing faults in induction machines by vibration or current signals: A critical comparison," IEEE Trans. Ind. Appl., vol. 46, no. 3, pp. 1350–1359, Jul./Aug. 2010.
[45] STI, "Rolling element bearings," STI Vibration Monitoring Inc., League City, TX, USA, Appl. Note, 2017. [Online]. Available: http://www.stiweb.com/kb_results.asp?ID=53
[46] M. M. Tahir, A. Q. Khan, N. Iqbal, A. Hussain, and S. Badshah, "Enhancing fault classification accuracy of ball bearing using central tendency based time domain features," IEEE Access, vol. 5, no. 3, pp. 72–83, Mar. 2017.
[47] N. E. Huang et al., "The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis," Proc. Roy. Soc. A, Math. Phys. Sci., vol. 454, no. 1971, pp. 903–995, 1998, doi: 10.1098/rspa.1998.0193.
[56] L. Hubert and P. Arabie, "Comparing partitions," J. Classification, vol. 2, no. 1, pp. 193–218, 1985.
[57] R. J. G. B. Campello, "A fuzzy extension of the Rand index and other related indexes for clustering and classification assessment," Pattern Recognit. Lett., vol. 28, pp. 833–841, May 2007.
[58] J. M. Bland and D. G. Altman, "Statistics notes: Measurement error," BMJ, vol. 312, no. 7047, p. 1654, 1996, doi: 10.1136/bmj.313.7059.744.
[59] Y. Xiao, E. Ding, C. Chen, X. Liu, and L. Li, "A novel characteristic frequency bands extraction method for automatic bearing fault diagnosis based on Hilbert Huang Transform," Sensors, vol. 15, no. 11, pp. 27869–27893, 2015.

XIAO YU received the B.S. degree in information and communication engineering from the China University of Mining and Technology in 2010. He is currently pursuing the Ph.D. degree with the School of Information and Control Engineering, China University of Mining and Technology. His research interests include signal processing, fault diagnosis, and machine learning.

FEI DONG received the B.S. degree in communication engineering from Huaibei Normal University, China, in 2015. He is currently pursuing the M.S. degree with the School of Information and Control Engineering, China University of Mining and Technology. His research interests include signal processing, fault diagnosis, and machine learning.

ENJIE DING has been a Professor with the School of Information and Control Engineering, China University of Mining and Technology, since 1999. He is an Executive Deputy Director of the Department of IoT Perception Mine Research Center, China University of Mining and Technology. He received the Ministry of Education Science and Technology Progress Award 2016, the Sixth Production Safety Science and Technology Achievement Award 2015 and many other awards. He received the Ph.D. degree in information and communication engineering from the China University of Mining and Technology in 1999. His current research interests include Internet of things, signal processing, fault diagnosis, wireless sensor networks, and coal rock interface recognition.
[48] R. A. Fisher, ‘‘The use of multiple measurements in taxonomic problems,’’
Ann. Hum. Genet., vol. 7, no. 2, pp. 179–188, 1936. SHOUPENG WU received the B.S. degree in
[49] C. R. Rao, ‘‘The utilization of multiple measurements in problems of information engineering from the China Univer-
biological classification,’’ J. Roy. Statist. Soc. Ser. B, Methodol., vol. 10, sity of Mining and Technology, China, in 2016.
no. 2, pp. 159–203, 1948. He is currently pursuing the M.S. degree with the
[50] C. Yao, Z. Lu, J. Li, Y. Xu, and J. Han, ‘‘A subset method for improving School of Information and Control Engineering,
linear discriminant analysis,’’ Neurocomputing, vol. 138, pp. 310–315, China University of Mining and Technology. His
Aug. 2014. research interests include machine learning and
[51] A. K. Jain, R. P. W. Duin, and J. C. Mao, ‘‘Statistical pattern recogni- signal processing.
tion: A review,’’ IEEE Trans. Pattern Anal. Mach. Intell., vol. 22, no. 1,
pp. 4–37, Jan. 2000.
[52] M. Van and H.-J. Kang, ‘‘Bearing defect classification based on individual
wavelet local fisher discriminant analysis with particle swarm optimiza- CHUNYANG FAN received the B.S. degree
tion,’’ IEEE Trans. Ind. Informat., vol. 12, no. 3, pp. 124–135, Feb. 2016. in electrical engineering and automation from
[53] V. Vapnik, The Nature of Statistical Learning Theory. New York, NY, USA: Jiangsu Normal University, China, in 2016. She
Springer-Verlag, 1995. is currently pursuing the M.S. degree with the
[54] X. Zhang, Y. Liang, and J. Zhou, ‘‘A novel bearing fault diagnosis model School of Information and Control Engineering,
integrated permutation entropy, ensemble empirical mode decomposition China University of Mining and Technology. Her
and optimized SVM,’’ Measurement, vol. 69, pp. 164–179, Jun. 2015. research interests include signal processing, fault
[55] C.-W. Hsu and C.-J. Lin, ‘‘A comparison of methods for multiclass support diagnosis, and machine learning.
vector machines,’’ IEEE Trans. Neural Netw., vol. 13, no. 4, pp. 415–425,
Mar. 2002.
3730 VOLUME 6, 2018

