Professional Documents
Culture Documents
1 Introduction
Induction motors have undeniable presences in pivotal industrial systems such as wind
turbine, aircraft, automotive, and power generator [1]. A motor is composed of bearing,
stator and rotor, however bearing is the most frequent failing component, and
responsible for 50% of all failures [1]. Thus, fault diagnosis of bearings is paramount
importance to keep the machines operating normally and reduce breakdown time.
Fortunately, the rapid development of data mining, data acquisition technique, and
machine learning technique since 1990 has provided the ability collect and store a vast
amount process data to extract useful information inherent in such a significant amount
of recorded data (e.g., vibration, acoustic emission and current) for the purpose of
reliable fault diagnosis in the large-scale industries [2, 3]. Therefore, it is needed to
develop fault diagnosis methods that can efficiently process massive data to self-learn
fault features and intelligently obtain accurate diagnosis results. Traditional data-driven
based fault diagnosis methods consist of two main categories: manual fault feature
© Springer International Publishing AG, part of Springer Nature 2018
E. Bagheri and J. C. K. Cheung (Eds.): Canadian AI 2018, LNAI 10832, pp. 144–155, 2018.
https://doi.org/10.1007/978-3-319-89656-4_12
Motor Bearing Fault Diagnosis Using Deep CNN 145
image pixels. Furthermore, we apply an adaptive learning rate for training ADCNN.
The proposed diagnosis methodology (CSM+ADCNN) is validated on publicly
available benchmark bearing vibration data from Case Western Reserve University
(CWRU) lab [12, 13].
The remainder of this paper is structured as follows. Section 2 provides proposed
fault diagnosis scheme including data acquisition system, proposed cycle spectrum
map (CSM), and offered ADCNN architecture for fault classification. Section 3 vali-
dates the proposed method’s effectiveness and compares its performance with the
state-of-the-art methods. Finally, Sect. 4 summarizes the conclusion of this paper.
2 The Methodology
The main steps of the proposed ADCNN-based bearing fault diagnosis are shown in
Fig. 1. The proposed method consists of three significant steps. After data acquisition,
the first step is to obtain 2D cycle spectrum map (CSM) using an improved cyclo-
stationary analysis to pre-process the vibration data. The main advantage of using such
pre-processing is that the valuable information about how rotating components are
distributed within vibration spectrum. The second step is to feed the pre-processed 2D
matrix (CSM) into an ADCNN for an optimized deep learning model. The final step is
to diagnose faulty bearings using that optimized model.
Fig. 1. An overall structure of the ADCNN-based method for bearing fault diagnosis
Fig. 2. The seeded bearing test ring for recording fault data
To reduce the shortcomings of CA, cycle spectrum map (CSM) is proposed based
on short-time envelope analysis. To calculate the Fourier transform of an input signal
x½n, we define X½N as follows,
X
K 1
x½ne2pj K
kn
X½N ¼ ð1Þ
n¼0
Instead of the power signal calculation in [8], we take the envelope spectrum of the
signal, x[n], which is defined as follows,
y ¼ j að x Þ j 2 ð2Þ
Where, pffiffiffiffiffiffiffi
aðtÞ ¼ xðtÞ þ ixðtÞ; i¼ 1 ð3Þ
Z 1
xðsÞ
and; x ¼ ; ð4Þ
1 s
t
Here, Sx is the spectrogram in time, and t represents the short-time spectral com-
ponent of the signal (i.e., envelope spectral frequency). This CSM is highly effective to
detect induction motor bearing faults.
In this study, we use the back-propagation (BP) algorithm [14] to update all
trainable parameters on all layers in the proposed ADCNN. In the proposed architec-
ture, convolutional layers are placed subsequently with subsampling layers to build up
spatial invariance progressively and to reduce computational complexity, but fully
connected layers are placed in the last stage of the architecture to make consistency
with the generic structure, as can be seen in Fig. 3.
Fig. 3. The proposed ADCNN architecture used to classify different fault types. A CSM
containing the bearing health state feeds the deep neural network as input.
On a convolution layer, the nearby receptive fields of the previous layer’s feature
maps are convolved with learnable kernels, and the result is fed to an activation
function to generate the output feature map. Each output feature map can be created by
combining the convolution implementations of multiple input maps. Let, l is the current
convolutional layer in a network. In general, the output feature map in that layer is
calculated as follows:
!
X
zlj ¼ f zl1
j kijl þ blj ; ð6Þ
i2Mj
Where, j is the jth output feature map in a convolutional layer, i is the ith input
feature map, Mj defines a selection of input feature maps in the layer l 1, kj represents
a convolutional kernel of feature map jth , and f is an activation function. b is an additive
basis for each output map. With different output feature maps, the input feature maps
are convolved with different kernels k.
The pooling layer aids to reduce the spatial size of the representation, to steadily
reduce the number of parameters, amount of computation in the network, and also to
control overfitting. The pooling layer is defined as follows:
zlj ¼ f ulj ssðzl1
j Þ þ bj ;
l
ð7Þ
Where ssð:Þ defines a subsampling function. Typically, this function computes the
max of each distinct 2-by-2 block in the input feature map, so the output feature map is
two times smaller than the number of rows and columns in the input feature map. Each
output map is given its trainable coefficient, ðuÞ and trainable bias, (b).
150 M. M. M. Islam and J.-M. Kim
The final step of the proposed ADCNN architecture is fully connected (FC) layer:
the last few layers (closest to the outputs) are FC one-dimensional (1D) layers. The
output of this layer is:
yl ¼ f zl wl þ bl ; ð8Þ
Where, the output activation function, f ðÞ, is a logistic function, and w is trainable
weights.
BP algorithm is applied to train ADCNN, so the gradients of the loss function for
all trainable weights in all layers are calculated during BP operation. However, it is
vital to define an appropriate objective function for BP algorithm. Thus, a squared-error
loss function is used to address the objective function. Equation (9) defines the loss
function as follows:
1X m 2
cos tðwÞ ¼ yl til ; ð9Þ
2 i¼1 i
Where, til represents the target output value of the ith pattern. Suppose a point w to
find the next weight point (w + 1) to find a minimizer, and it is started from w, and
@
moves by a @w cos tðwÞ as in Eq. (10), where a is a definite scalar step size.
@
w :¼ w a cos tðwÞ ð10Þ
@w
That weight update process in Eq. (10) is called a stochastic gradient descent
(SGD) algorithm [14]. It is supposed to assume that the gradient in SGD will vary as
the search continues and tends toward zero as it approaches the minimizer. So, the steps
size can be either small or large. The first approach (i.e., a small step size) is com-
putationally complicated to reach a minimizer; however, the second approach (i.e., a
large step size) could result in a more zigzag path to the minimizer. This approach is
computationally inexpensive and straightforward, and also it could be trapped in a local
minimizer that means there is no optimum value of step size.
To elevate the above issues in SGD, this study applies an adaptive moment esti-
mation that combines the advantages of the adaptive gradient (AdaGrad) — working
well with sparse gradients — and root-mean-square propagation (RMSProp) —
working well in non-stationary settings [15]. The main idea is that it maintains
exponential moving averages of the gradient and its square in each update proportional
to the average gradient and its square as follows:
Mt
w :¼ w a pffiffiffiffiffiffiffiffiffiffiffiffiffiffi ð11Þ
Rt þ 2
@
where; Mt ¼ b1 Mt1 þ ð1 b1 Þ cos tðwÞ ð12Þ
@w
Motor Bearing Fault Diagnosis Using Deep CNN 151
@2
and; Rt ¼ b1 Rt1 þ ð1 b2 Þ cos tðwÞ ð13Þ
@w2
Where, Mt is the 1st-moment bias correction, Rt is the 2nd-moment bias correction,
and the decay rates are small (i.e., b1 and b2 are close to 1). Weight update process in
Eq. (11) provides a signification result about near optimum learning rate selection.
To investigate the effect of CSM-based health state visualization of bearing fault and its
application to fault diagnosis using ADCNN are presented in this section.
To validate the performance, this paper utilizes benchmark dataset with four fault
types namely ORF, IRF, BRF, and HBC at various operating conditions (see Table 1).
Figure 4 illustrates recorded example vibration signals of each bearing conditions. As
this study provides an improved health state visualization using cyclic spectrum map
(CSM), Fig. 5 presents the results of CSM four bearing health conditions. According to
the results in Fig. 5, it is seen that proposed CSM represents a unique health condition
for each fault types. An interesting point to note that outer fault defect frequency
(103 Hz in this paper for 1730 rpm), inner fault defect frequency (157 Hz) and ball
fault defect frequency (136 Hz), and up to 2nd harmonics each of them can be seen in
Fig. 5(a), (b), and (c) respectively. Furthermore, it is evident that there is no defect
frequency found for healthy bearing condition (HBC) as in Fig. 5(d).
Fig. 4. Raw time-domain vibration signals of each bearing condition, (a) ORF, (b) IRF,
(c) BRF, and (d) HBC bearings, respectively
152 M. M. M. Islam and J.-M. Kim
Fig. 5. Cyclic spectrum maps (CSMs) of each bearing fault condition, (a) ORF, (b) IRF,
(c) BRF, and (d) HBC, in which bearing conditions are unique depending on fault type.
As CSM is highly capable of representing bearing health state, this study further
utilizes CSM for fault classification. The main difference between the CNN-based
method and traditional analysis-based methods is the manner for extracting features.
To validate the efficiency of the proposed method (CMS+ADCNN), we conducted
experiments in comparison with two state-of-the-art fault diagnosis methods: first
method [4] that extract features by calculating statistical time and frequency mea-
surements, along with complex envelope power spectrum features. The extracted
features are then evaluated and selected by a feature selection algorithm. Selected
features are finally used to detect bearing defects using a k-NN classifier. Our second
comparative model is deep learning–based CNN that classifies faults using the 1D raw
signal [6].
In ADCNN, an appropriate learning rate is a crucial parameter in training stage
optimization. It is essential to report the performance of the optimization scheme in
proposed ADCNN to ensure a higher classification accuracy. Figure 6 presents the
convergence rate of the proposed ADCNN and conventional deep learning with a fixed
learning rate. The result is significant because the proposed scheme is converged
almost to zero, which is expected to forecast a better classification accuracy.
To calculate the classification accuracy, we divide the dataset into two groups: 50
signals for training and 60 signals for testing for each faults type (see Table 1).
Therefore, we have 200 samples training and 240 samples for testing for four fault
Motor Bearing Fault Diagnosis Using Deep CNN 153
1.4
1.2
0.6
ADCNN
0.4
CNN with constant learning rate
0.2
0
0 200 400 600 800 1000 1200
Number of epoch
Fig. 6. Cost function convergence rate of conventional CNN and ADCNN at an initial learning
rate of 1e2 .
types (e.g., ORF, IRF, BRF, and HBC). The training samples are kept lower than that
of testing to ensure the reliability of the generalized performance. For a reliable
comparison of classification accuracy, we use sensitivity and average classification
accuracy (ACA) as in [4] as follows.
NTrue positive
Sensitivity ¼ 100ð%Þ; ð14Þ
NTrue positive þ NFalse negative
where NTrue positive is the total number of samples in class l that are correctly classified
as a class l, and NFalse negative is the number of samples within the class l that are not
recognized as a class l.
PL !
j¼1 NTrue positive
ACA ¼ 100ð%Þ; ð15Þ
Nsamples
Where L defines the number of fault classes (i.e., 4 in this paper) and Nsamples is the
total number of samples represented in a particular testing subset.
Table 2 presents the classification accuracy in terms of sensitivity and ACA. The
proposed method delivers better classification performance than the methods from [4, 6].
It yielded a classification accuracy of 98%, 93%, 95% and 97% for ORF, IRF, BRF, and
HBC respectively. According to result in Table 2, the proposed method outperforms two
state-of-the-art algorithms, yielding an enhancement of 8.25% to 13.75% in ACA com-
pared with the methods in [4, 6], respectively.
Also, the confusion matrices for the proposed framework and referenced methods
are presented in Fig. 7. The confusion matrix is a robust technique that provides a
visualization of the performance of a classifier algorithm where actual versus the
predicted deviation can be observed. According to the results in Fig. 7(a), the proposed
method can correctly identify all fault types with a small misclassification rate in
comparison with its two counterparts: [6] in Fig. 7(b) and in [4] Fig. 7(c).
154 M. M. M. Islam and J.-M. Kim
Fig. 7. The confusion matrices for showing classification results of the (a) proposed, (b) [6], and
(c) [4] for each bearing condition.
4 Conclusion
This paper proposed a reliable bearing fault diagnosis scheme using a two-dimensional
visualization tool for bearing health state and an adaptive deep convolutional neural
network. First, we applied an improved cyclostationary analysis method for cyclic
spectrum map (CSM) for representing bearing health condition. CMSs were highly
effective that represent unique fault information about the bearing health state. Once we
obtained the CMS, we fed it into an ADCNN. Using CSM as input in ADCNN, useful
bearing fault features were then self-learned automatically and the bearing faults were
diagnosed with a high precession. The presented method was validated with benchmark
vibration data collected from bearing tests. The results showed that the proposed
approach outperforms its referenced methods regarding classification accuracy. Thus,
the combined effects of CMS-based bearing health state and adaptive learning in
ADCCN contributed to achieving higher precision.
Acknowledgements. This work was supported by the Korea Institute of Energy Technology
Evaluation and Planning (KETEP) and the Ministry of Trade, Industry & Energy (MOTIE) of the
Republic of Korea (Nos. 20162220100050, 20161120100350, 20172510102130). It was also
funded in part by The Leading Human Resource Training Program of Regional Neo-Industry
through the National Research Foundation of Korea (NRF) funded by the Ministry of Science,
ICT and Future Planning (NRF-2016H1D5A1910564), and in part by the Basic Science
Research Program through the National Research Foundation of Korea (NRF) funded by the
Ministry of Education (2016R1D1A3B03931927).
Motor Bearing Fault Diagnosis Using Deep CNN 155
References
1. Kang, M., Kim, J., Kim, J.M.: High-performance and energy-efficient fault diagnosis using
effective envelope analysis and denoising on a general-purpose graphics processing unit.
IEEE Trans. Power Electron. 30, 2763–2776 (2015)
2. Dai, X., Gao, Z.: From model, signal to knowledge: a data-driven perspective of fault
detection and diagnosis. IEEE Trans. Industr. Inform. 9, 2226–2238 (2013)
3. Islam, M.M.M., Kim, J.-M.: Reliable multiple combined fault diagnosis of bearings using
heterogeneous feature models and multiclass support vector machines. Reliab. Eng. Syst.
Saf. (2018)
4. Islam, R., Khan, S.A., Kim, J.-M.: Discriminant feature distribution analysis-based hybrid
feature selection for online bearing fault diagnosis in induction motors. J. Sens. 2016, 16
(2016)
5. Jack, L.B., Nandi, A.K.: Fault detection using support vector machines and artificial neural
networks, augmented by genetic algorithms. Mech. Syst. Signal Process. 16, 373–390
(2002)
6. Janssens, O., Slavkovikj, V., Vervisch, B., Stockman, K., Loccufier, M., Verstockt, S., Van
de Walle, R., Van Hoecke, S.: Convolutional neural network based fault detection for
rotating machinery. J. Sound Vib. 377, 331–345 (2016)
7. Islam, M.M.M., Islam, M.R., Kim, J.-M.: A hybrid feature selection scheme based on local
compactness and global separability for improving roller bearing diagnostic performance. In:
Wagner, M., Li, X., Hendtlass, T. (eds.) ACALCI 2017. LNCS (LNAI), vol. 10142,
pp. 180–192. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-51691-2_16
8. Antoni, J.: Cyclostationarity by examples. Mech. Syst. Signal Process. 23, 987–1036 (2009)
9. Antoni, J.: Fast computation of the kurtogram for the detection of transient faults. Mech.
Syst. Signal Process. 21, 108–124 (2007)
10. Wang, D., Tse, P.W., Tsui, K.L.: An enhanced kurtogram method for fault diagnosis of
rolling element bearings. Mech. Syst. Signal Process. 35, 176–199 (2013)
11. LeCun, Y.: LeNet-5, Convolutional neural networks (2015). http://yann.lecun.com/exdb/
lenet
12. Case Western Reserve University. Seeded Fault Test Data. http://csegroups.case.edu/
bearingdatacenter/home
13. Haidong, S., Hongkai, J., Xingqiu, L., Shuaipeng, W.: Intelligent fault diagnosis of rolling
bearing using deep wavelet auto-encoder with extreme learning machine. Knowl. Based
Syst. 140, 1–14 (2018)
14. LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521, 436–444 (2015)
15. Kingma, D., Ba, J.: Adam: A Method for Stochastic Optimization. arXiv preprint arXiv:
1412.6980 (2014)