Measurement
journal homepage: www.elsevier.com/locate/measurement
ARTICLE INFO

Keywords:
Rolling bearings
Intelligent fault diagnosis
Small samples
Structurally similar generative adversarial networks (SSGAN)
Improved MobileNetv3 convolutional neural networks (IMCNN)

ABSTRACT

Diagnosing faults automatically and efficiently from measured vibration signals is challenging when only small samples are available. A new intelligent fault diagnosis method for rolling bearings with small samples is proposed, based on a structural similarity generative adversarial network (SSGAN) and an improved MobileNetv3 convolutional neural network (IMCNN). Firstly, the wavelet transform (WT) is applied to the signal to obtain a 2D wavelet image with time–frequency characteristics. Then, SSGAN is constructed to produce high-quality generated samples that expand the small training set. Finally, IMCNN is proposed to extract feature information from the expanded samples by using a self-attention mechanism instead of the original lightweight attention mechanism, and the classification results are used for fault recognition. The experimental results show that the proposed SSGAN-IMCNN method can effectively expand small sample sets and automatically detect rolling bearing faults with high classification accuracy.
* Corresponding author.
E-mail address: ncepuxl@163.com (L. Xiang).
https://doi.org/10.1016/j.measurement.2022.111899
Received 13 July 2022; Received in revised form 22 August 2022; Accepted 4 September 2022
Available online 10 September 2022
0263-2241/© 2022 Elsevier Ltd. All rights reserved.
X. Yang et al. Measurement 203 (2022) 111899
under various work conditions. Yan et al. [16] presented a deep regularized variational autoencoder that solves the overfitting problem of the original model and recognizes fault severity better. They also proposed a multiscale cascading deep belief network for fault diagnosis, which can extract the rich features hidden in the signal [17]. Jia et al. [18] provided a connection network using a standard sparse autoencoder, which was verified to be effective in intelligent fault detection. Wang et al. [19] presented a new method using adaptive maximum cyclostationarity blind deconvolution, which was validated for bearing fault detection. A new deep transfer learning model for fault diagnosis under various working conditions with unequal sample quantities was reported in [20].

Deep learning models achieve higher diagnostic accuracy than traditional methods for single and compound faults. Undeniably, these models have had some success in rolling bearing fault detection, but most of them still need sufficient training samples to complete the fault diagnosis task. Sufficient fault samples are very difficult to acquire in practice, because a rolling bearing may fail quickly after a fault occurs. With small samples, the diagnostic model is therefore prone to overfitting, low recognition accuracy, and difficulty in fault feature extraction. The generative adversarial network (GAN) [21], first proposed in 2014, has been widely applied to data enhancement, sample expansion, and fault detection. A GAN can effectively learn the distribution features of the raw dataset and then generate novel artificial data with a similar distribution. Mao et al. [22] addressed imbalanced fault diagnosis of rolling bearings by using the Fourier transform and a GAN to synthesize the datasets. Han et al. [23] built a new adversarial learning layer in a DCNN for intelligent fault detection, and the model performs better when data are insufficient.

These methods have achieved some success in fault diagnosis with small samples, but there is still no concrete framework for selecting the data source and the feature extraction model. It is also important to consider both the time and frequency characteristics of the signal, and a two-dimensional feature image helps a generative adversarial network learn the distribution characteristics. Two-dimensional time–frequency signals are therefore better suited to generative adversarial networks. Following these ideas, a small-sample bearing fault diagnosis method combining SSGAN and an improved MobileNetv3 convolutional neural network (IMCNN) is proposed in this paper. WT is used to obtain the time–frequency features from the one-dimensional original vibration signal in the form of two-dimensional images carrying fault information. SSGAN is constructed to generate high-quality samples that expand the training set. IMCNN is proposed to extract the deep features and conduct the final diagnosis. Two different experiments are designed, on the CWRU dataset and on our laboratory equipment. The results not only verify the quality of the SSGAN synthetic samples, but also demonstrate the effectiveness of IMCNN fault classification.

2. Deep learning model

2.1. Related work

2.1.1. Wavelet transform
The essence of the wavelet transform (WT) is function decomposition, which represents the original signal as a linear combination of wavelet basis functions. WT overcomes the drawback that the window size cannot change with frequency: it provides a “time-frequency” window that changes with frequency and therefore has a strong time–frequency feature extraction capability. The choice of wavelet basis function profoundly affects the result of the wavelet transform. The Morlet wavelet is a bilateral exponentially decaying cosine signal, which closely resembles the fault pulse generated by a rolling bearing. The Cmor wavelet is the complex form of the Morlet wavelet and has strong adaptability, so the Cmor wavelet is chosen as the basis function of the wavelet transform. It is expressed as Eq. (1):

$$\psi_r(t) = \frac{1}{\sqrt{\pi f_b}} \exp\left(-t^2 / f_b\right) \cos(2\pi f t) \tag{1}$$

Its corresponding Fourier representation is:

$$\psi(af) = e^{-\pi^2 f_b (af - f_c)^2} \tag{2}$$

In Eq. (1), $t$ denotes time, $f$ denotes frequency, and $f_b$ is the shape parameter, which determines how fast the waveform oscillation decays. In Eq. (2), $a$ is the transformation scale and $f_c$ is the center frequency, which determines the oscillation frequency of the waveform.

2.1.2. Structural similarity generative adversarial networks (SSGAN)
Structural similarity (SSIM) [24] is a measure of similarity between images, and the similarity index ranges from 0 to 1. The larger the index, the higher the similarity. The principle is shown in Eq. (3):

$$\mathrm{SSIM}(x, y) = [l(x, y)]^{\alpha} \cdot [c(x, y)]^{\beta} \cdot [s(x, y)]^{\gamma} \tag{3}$$

Luminance, contrast and structure are the three components of the measurement system. Eq. (4) is the comparison function of luminance, Eq. (5) is the comparison function of contrast, and Eq. (6) is the comparison function of structure.
$$l(x, y) = \frac{2\mu_x \mu_y + C_1}{\mu_x^2 + \mu_y^2 + C_1} \tag{4}$$

$$c(x, y) = \frac{2\sigma_x \sigma_y + C_2}{\sigma_x^2 + \sigma_y^2 + C_2} \tag{5}$$

$$s(x, y) = \frac{\sigma_{xy} + C_3}{\sigma_x \sigma_y + C_3} \tag{6}$$

In the above equations, $\mu_x$ is the mean of $x$, $\mu_y$ is the mean of $y$, $\sigma_x^2$ is the variance of $x$, $\sigma_y^2$ is the variance of $y$, and $\sigma_{xy}$ is the covariance of $x$ and $y$. $C_1$, $C_2$ and $C_3$ are three constants, with $C_1 = (k_1 L)^2$, $C_2 = (k_2 L)^2$ and $C_3 = C_2 / 2$. By default, $k_1 = 0.01$, $k_2 = 0.03$ and $L = 2^B - 1$, where $B$ is the bit depth. $\alpha$, $\beta$ and $\gamma$ are constants greater than 0; in practical engineering calculations they are generally set to $\alpha = \beta = \gamma = 1$.

Table 1
Structure of proposed model (IMCNN).

Input dimension   Kernel size     Attention mechanism   Activation function   Step length
224² × 3          Conv2d, 1 × 1   –                     ReLU                  2
112² × 16         3 × 3           –                     Sigmoid               1
112² × 16         3 × 3           –                     Sigmoid               2
56² × 24          3 × 3           –                     Sigmoid               1
56² × 24          5 × 5           √                     Sigmoid               2
28² × 40          5 × 5           √                     Sigmoid               1
28² × 40          5 × 5           √                     Sigmoid               1
28² × 40          3 × 3           –                     ReLU                  2
14² × 80          3 × 3           –                     ReLU                  1
14² × 80          3 × 3           –                     ReLU                  1
14² × 80          3 × 3           –                     ReLU                  1
14² × 80          3 × 3           √                     ReLU                  1
14² × 112         3 × 3           √                     ReLU                  1
14² × 112         5 × 5           √                     ReLU                  1
7² × 160          5 × 5           √                     ReLU                  2
7² × 160          5 × 5           √                     ReLU                  1
7² × 160          Conv2d, 1 × 1   –                     ReLU                  1
7² × 960          pool, 7 × 7     –                     –                     1
1² × 960          Conv2d, 1 × 1   –                     ReLU                  1
1² × 1280         Conv2d, 1 × 1   –                     –                     1

A generative adversarial network (GAN) consists primarily of a generator and a discriminator. The generative model uses random noise Z of known distribution to learn the real image distribution, which makes its generated image G(Z) more realistic. The discriminative model then judges whether a sample in the obtained data is real or generated. The training process is equivalent to a game between the two models: over the course of alternating training, the generative and discriminative models constantly compete against each other and eventually reach a Nash equilibrium. At that point the generative model produces images close to the true image distribution, while the discriminator is unable to tell real samples from generated ones. The loss function is given as Eq. (7):

$$\min_G \max_D V(D, G) = E_{X \sim P_{data}(x)}[\log D(x)] + E_{Z \sim P_z(z)}[\log(1 - D(G(z)))] \tag{7}$$
where $E(\cdot)$ denotes the mathematical expectation, $X$ is the pixel value of the wavelet image, $P_{data}(x)$ is the real data distribution, $D(x)$ is the output of the discriminative model, $Z$ is the noise data, $P_z(z)$ denotes the noise data distribution, and $G(z)$ is the data generated by the generative model.

Table 2
Sample status of rolling bearing fault dataset.

Working conditions   Type of fault
                     Normal   OR    IR    BF
0hp                  200      200   200   200
1hp                  100      100   100   100
2hp                  100      100   100   100
3hp                  100      100   100   100

Table 3
Sample status of rolling bearing fault datasets.

Type of fault   Real samples   Auxiliary sample generation
BF

SSIM is combined with GAN, and SSIM is used for secondary screening of the generated samples. After the structural similarity [...] training set, the average SSIM value is taken, and the generated samples are ranked from highest to lowest according to the SSIM value. The generated samples with relatively large differences are removed according to a certain proportion, and the remaining generated samples are added to the training set to complete its expansion. The network structure is displayed in Fig. 1, and the network is improved as follows:

(1) A two-dimensional convolution layer is used instead of the coding layer;
(2) Normalization layers are added layer by layer to accelerate network convergence;
(3) The generative model uses ReLU as the activation function, except where Tanh is used;
(4) The discriminative model uses LeakyReLU as the activation function, except for the output layer, which uses the Sigmoid function;
(5) SSIM is combined to re-screen the generated samples and improve the auxiliary sample quality.
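The SSIM screening step described above can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the authors' implementation: it evaluates a global (single-window) SSIM from Eqs. (3)–(6) with α = β = γ = 1, using the standard structure term (σ_xy + C3)/(σ_x σ_y + C3) with C3 = C2/2, averages each generated sample's SSIM over the real training images, ranks the samples from highest to lowest, and drops the lowest-scoring share. The function names, the flattened-list image representation and the `drop_ratio` value are hypothetical; the text only states that "a certain proportion" is removed. Practical SSIM implementations usually average the index over local windows, which is omitted here for brevity.

```python
import math


def ssim_global(x, y, bit_depth=8, k1=0.01, k2=0.03):
    """Global SSIM of two equally sized, flattened grayscale images."""
    if len(x) != len(y) or not x:
        raise ValueError("images must be non-empty and of equal size")
    n = len(x)
    dynamic_range = 2 ** bit_depth - 1          # L = 2^B - 1
    c1 = (k1 * dynamic_range) ** 2
    c2 = (k2 * dynamic_range) ** 2
    c3 = c2 / 2
    mu_x, mu_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    sd_x, sd_y = math.sqrt(var_x), math.sqrt(var_y)
    luminance = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    contrast = (2 * sd_x * sd_y + c2) / (var_x + var_y + c2)
    structure = (cov + c3) / (sd_x * sd_y + c3)
    return luminance * contrast * structure    # alpha = beta = gamma = 1


def screen_generated(generated, real, drop_ratio=0.2):
    """Rank generated images by mean SSIM against the real set, drop the worst.

    drop_ratio is an assumed value; the removed proportion is not specified.
    Returns (mean_ssim, image) pairs, highest score first.
    """
    scored = []
    for image in generated:
        mean_ssim = sum(ssim_global(image, r) for r in real) / len(real)
        scored.append((mean_ssim, image))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    n_keep = len(scored) - int(len(scored) * drop_ratio)
    return scored[:n_keep]
```

The images kept by `screen_generated` would then be appended to the real training set to complete the expansion.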
lessened. The layer-by-layer (depthwise) convolution sub-module focuses on the feature information at the channel level: the three channels of data p1, p2, p3 are processed with different convolution kernels and passed on to j1, j2, j3 respectively. A separable two-dimensional convolution structure is used to perform the convolution operation on the different channels and extract the feature information of each channel. It is worth noting that the computation occurs only within each channel and the resulting feature information is not fused. The point-by-point (pointwise) convolution focuses more on the feature information at the spatial level, and a 1 × 1 convolution kernel linearly combines the information obtained from each channel to realize the feature fusion. Finally, the fused features are extracted using the same-size convolutional kernels Q1, Q2, Q3, Q4 respectively to obtain the deep features.

2.2.3. Inverse residual module
The improved MobileNetv3 network mainly consists of an inverse (inverted) residual structure and a stack of convolutional blocks. The inverse residual structure expands the data to a higher dimension and then reduces it again, enhancing the nonlinear variation of features. Fig. 4 shows the inverse residual structure, which mainly contains two channels. First, the input passes through a 1 × 1 convolutional layer that raises the data dimension to complete the channel expansion. Then, 3 × 3 depth separable convolutional layers are used to double the number of channels, during which a self-attention mechanism is added to obtain fault features. Finally, a 1 × 1 convolutional layer reduces the data dimension to obtain intermediate features. The self-attention model consists of three main fully connected layers with layers of feature extraction. The model extracts the information of the significant features among the stage features by dimensional transformation of the feature layers, and convolutional kernels of different scales are used to learn the feature parameters and complete the stage-wise feature screening. Several inverse residual modules form the improved IMCNN network.

3. Case study

Considering the small fault sample problem and the difficulty of fault feature extraction, a small-sample bearing fault detection model named SSGAN-IMCNN has been proposed. The method first uses WT to process the bearing vibration signal and obtain a 2D wavelet image with the time–frequency features of the signal. Then the small training set is input to SSGAN, which generates additional samples to complete the sample expansion. The expanded training set is input to IMCNN for training, and the diagnosis results are obtained. The flow chart of the proposed model is shown in Fig. 5.

3.1. Case 1: Fault bearing data from CWRU

3.1.1. Experimental setup
In order to verify the validity of the proposed method, the Case Western Reserve University bearing dataset is utilized. Four types of experimental samples, including inner ring failure signal (IR), rolling
Table 6
Diagnostic accuracy of rolling bearing fault data.

Experiment number   Methods   Auxiliary training samples for each fault type at 0hp   Test sample conditions   Average accuracy

Fig. 9. Experiment setup of bearing fault.
Table 8
Sample status of rolling bearing fault data set.

Fault type   Original signal   Real samples   Auxiliary sample generation
0

and the step is 600. The size of each generated wavelet image is 224 × 224. For each fault type, 200 samples are collected at 0hp and divided 1:1 into the training set and the test set, and 100 samples are collected under each of the other operating conditions as the test set. The details of the experimental samples are shown in Table 2.
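As a rough sketch of this preprocessing, the code below cuts a vibration signal into overlapping segments with a step of 600 points and computes the magnitude of a complex Morlet (Cmor) wavelet transform for one segment; the real part of the wavelet follows Eq. (1). The segment length, sampling rate, analysis frequencies and the values f_b = 1.5 and f_c = 1.0 are placeholders chosen for illustration, since they are not given in the text, and resizing the result to a 224 × 224 image is omitted. The direct summation is written for clarity, not speed; in practice a library routine such as `pywt.cwt` from PyWavelets would normally replace the loops.

```python
import math


def sliding_windows(signal, window_len, step=600):
    """Cut a 1-D signal into overlapping segments (one per wavelet image).

    window_len is a hypothetical parameter; only the step of 600 is stated.
    """
    return [signal[start:start + window_len]
            for start in range(0, len(signal) - window_len + 1, step)]


def cmor(t, fb, fc):
    """Complex Morlet wavelet; its real part matches Eq. (1) with f = fc."""
    envelope = math.exp(-t * t / fb) / math.sqrt(math.pi * fb)
    phase = 2.0 * math.pi * fc * t
    return complex(envelope * math.cos(phase), envelope * math.sin(phase))


def scalogram(segment, fs, freqs, fb=1.5, fc=1.0):
    """Return |CWT| as rows (one per analysis frequency) by columns (time).

    fs is the sampling rate in Hz; fb and fc are assumed wavelet parameters.
    """
    n = len(segment)
    rows = []
    for f in freqs:
        a = fc / f                      # scale whose center frequency is f Hz
        row = []
        for b in range(n):              # b: time shift in samples
            acc = 0j
            for k in range(n):
                t = (k - b) / fs / a    # dimensionless wavelet time
                acc += segment[k] * cmor(t, fb, fc).conjugate()
            row.append(abs(acc) / (fs * math.sqrt(a)))
        rows.append(row)
    return rows
```

For a pure 50 Hz tone, the row computed at 50 Hz carries far more energy than the rows at 25 Hz or 100 Hz, which is the frequency localization that the time–frequency image relies on.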
The number of training epochs for the model is set to 5000, and the batch size is 16. In total, 2000 auxiliary training samples are generated. Table 8 shows the comparison of the auxiliary samples with the real samples. It can be seen from the table that the auxiliary generated samples are very similar but not identical to the real samples, indicating that the auxiliary samples are not only of high quality but also diverse.
To test the quality of the samples generated by SSGAN, different numbers of real samples are selected as the input of SSGAN, and the generated samples are compared with the real samples in the experiments shown in Table 9. When the number of samples involved in SSGAN training is below 100, the quality of the generated samples decreases considerably. Therefore, to ensure the quality of the generated samples, all subsequent experiments are conducted with a training set of 100 samples.
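For orientation, the quantity that the SSGAN training above optimizes is the value function of Eq. (7), estimated on each minibatch. The sketch below is an assumed, framework-free illustration (the helper name and the `eps` guard are not from the paper): it evaluates the minibatch estimate from discriminator outputs for a batch of 16 and shows the well-known reference point that, when the discriminator outputs 0.5 for every sample, the value equals −log 4.

```python
import math


def value_function(d_real, d_fake, eps=1e-12):
    """Minibatch estimate of V(D, G) in Eq. (7).

    d_real: discriminator outputs D(x) on real images, each in (0, 1)
    d_fake: discriminator outputs D(G(z)) on generated images
    eps:    small guard against log(0); an implementation detail assumed here
    """
    real_term = sum(math.log(p + eps) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p + eps) for p in d_fake) / len(d_fake)
    return real_term + fake_term


batch_size = 16  # batch size stated in the text

# Discriminator that cannot tell real from generated: D = 0.5 everywhere.
v_equilibrium = value_function([0.5] * batch_size, [0.5] * batch_size)

# Confident discriminator early in training: high on real, low on generated.
v_confident = value_function([0.9] * batch_size, [0.1] * batch_size)

print(round(v_equilibrium, 3))  # -log(4), about -1.386
print(v_confident > v_equilibrium)
```

The discriminator update raises this value while the generator update lowers it, which is the alternating game described in Section 2.1.2.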
Fig. 14. Classification confusion matrix graph based on SSGAN image generation method.

4. Conclusion

To address the problems of scarce fault data and difficult feature extraction, a novel method has been proposed for small-sample bearing fault diagnosis based on SSGAN and an improved MobileNetv3 convolutional neural network (IMCNN). The superiority and stability of the proposed method are verified on the standard dataset and on laboratory bearing data. The main conclusions are as follows:

(1) Morlet wavelets are used to transform the data dimension and extract two-dimensional time–frequency image features from the one-dimensional original signal, which makes full use of the powerful time–frequency feature extraction capability of the wavelet transform.
(2) Combining GAN with SSIM fully exploits the advantages of GAN in image generation. The quality of the auxiliary training samples is greatly improved after SSIM eliminates the generated samples with large differences from the real samples.
(3) By improving the MobileNetv3 convolutional network and introducing an early-stopping mechanism, the training time of the network can be shortened. By replacing the lightweight attention mechanism with the self-attention mechanism, 100 % accuracy is achieved in small-sample bearing fault diagnosis, which effectively improves the classification accuracy of the proposed model.

Short training time and high diagnostic accuracy have been constant pursuits in the field of fault diagnosis. Although the method proposed in this paper performs well in classification, its diagnostic accuracy under multiple working conditions still needs to be improved. The next step is to further adjust the parameters and structure of the network, shorten its training time, and improve the performance of IMCNN in multi-condition fault diagnosis.

CRediT authorship contribution statement

Xin Yang: Investigation, Formal analysis, Writing – original draft. Bing Liu: Writing – original draft, Investigation, Software. Ling Xiang: Conceptualization, Methodology, Writing – review & editing. Aijun Hu: Resources, Writing – review & editing, Supervision. Yonggang Xu: Visualization, Data curation.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

Data will be made available on request.

Acknowledgment

This work was supported by the National Natural Science Foundation of China (52075170 and 52175092), Hebei Natural Science Foundation (A2022502005) and Post-Graduate’s Innovation Fund Project of Hebei Province (CXZZBS2022154).

References

[1] S. Lu, Q. He, J. Wang, A review of stochastic resonance in rotating machine fault detection, Mech. Syst. Signal Process. 116 (2019) 230–260.
[2] H. Su, L. Xiang, A. Hu, et al., A novel hybrid method based on KELM with SAPSO for fault diagnosis of rolling bearing under variable operating conditions, Measurement 177 (2021), 109276.
[3] P. Ma, H. Zhang, W. Fan, C. Wang, A diagnosis framework based on domain adaptation for bearing fault diagnosis across diverse domains, ISA Transactions 99 (2020) 465–478.
[4] Z. Sheng, Y. Xu, K. Zhang, Applications in bearing fault diagnosis of an improved Kurtogram algorithm based on flexible frequency slice wavelet transform filter bank, Measurement 174 (2021), 108975.
[5] X. Fan, M.J. Zuo, Gearbox fault detection using Hilbert and wavelet-packet transforms, Mech. Syst. Signal Process. 20 (4) (2006) 966–982.
[6] H. Zhao, S. Zuo, M. Hou, W. Liu, L. Yu, X. Yang, W.u. Deng, A novel adaptive signal processing method based on enhanced empirical wavelet transform technology, Sensors 18 (10) (2018) 3323, https://doi.org/10.3390/s18103323.
[7] L. Xiang, X. Yang, A. Hu, et al., Condition monitoring and anomaly detection of wind turbine based on cascaded and bidirectional deep learning networks, Applied Energy 305 (2022), 117925.
[8] C. Lu, Z.-Y. Wang, W.-L. Qin, J. Ma, Fault diagnosis of rotary machinery components using a stacked denoising autoencoder-based health state identification, Signal Process. 130 (2017) 377–388.
[9] K. Yu, T. Lin, H. Ma, et al., A multi-stage semi-supervised learning approach for intelligent fault diagnosis of rolling bearing using data augmentation and metric learning, Mech. Syst. Signal Process. 146 (2021), 107043.
[10] H.e. Zhiyi, S. Haidong, J. Lin, C. Junsheng, Y. Yu, Transfer fault diagnosis of bearing installed in different machines using enhanced deep auto-encoder, Measurement 152 (2020) 107393, https://doi.org/10.1016/j.measurement.2019.107393.
[11] X. Zhu, X. Luo, J. Zhao, Research on deep feature learning and condition recognition method for bearing vibration, Appl. Acoust. 168 (2020), 107435.
[12] H. Shao, H. Jiang, X. Zhang, et al., Rolling bearing fault diagnosis using an optimization deep belief network, Meas. Sci. Technol. 26 (2015), 115002.
[13] O. Janssens, V. Slavkovikj, B. Vervisch, et al., Convolutional neural network based on fault detection for rotating machinery, J. Sound Vib. 377 (2016) 331–345.
[14] D. Yao, H. Liu, J. Yang, et al., A lightweight neural network with strong robustness for bearing fault diagnosis, Measurement 159 (2020), 107756.
[15] W. Zhao, Z. Wang, Multiscale inverted residual convolutional neural network for intelligent diagnosis of bearings under variable load condition, Measurement 188 (2022), 110511.
[16] X. Yan, D. She, Y. Xu, M. Jia, Deep regularized variational autoencoder for intelligent fault diagnosis of rotor-bearing system within entire life-cycle process, Knowledge-Based Syst. 226 (2021) 107142, https://doi.org/10.1016/j.knosys.2021.107142.
[17] X. Yan, Y. Liu, M. Jia, Multiscale cascading deep belief network for fault identification of rotating machinery under various working conditions, Knowledge-Based Syst. 193 (2020), 105484.
[18] F. Jia, Y. Lei, L. Guo, J. Lin, S. Xing, A neural network constructed by deep learning technique and its application to intelligent fault diagnosis of machines, Neurocomputing 000 (2017) 1–10.
[19] Z. Wang, J. Zhou, Y. Lei, Bearing fault diagnosis method based on adaptive maximum cyclostationarity blind deconvolution, Mech. Syst. Signal Process. 162 (2022), 108018.
[20] H. Su, X. Yang, L. Xiang, et al., A novel method based on deep transfer unsupervised learning network for bearing fault diagnosis under variable working condition of unequal quantity, Knowledge-Based Syst. 242 (2022), 108381.
[21] I.J. Goodfellow, et al., Generative adversarial nets, Proc. Adv. Neural Inf. Process. Syst. 3 (2014) 2672–2680.
[22] W. Mao, Y. Liu, L. Ding, Y. Li, Imbalanced fault diagnosis of rolling bearing based on generative adversarial network: a comparative study, IEEE Access 7 (2019) 9515–9530.
[23] T.e. Han, C. Liu, W. Yang, D. Jiang, A novel adversarial learning framework in deep convolutional neural network for intelligent diagnosis of mechanical faults, Knowledge-Based Syst. 165 (2019) 474–487.
[24] Z. Wang, A.C. Bovik, H.R. Sheikh, et al., Image quality assessment: from error visibility to structural similarity, IEEE Trans. Image Process. 13 (2004) 1–15.
[25] S. Gao, Z. Pei, Y. Zhang, T. Li, Bearing fault diagnosis based on adaptive convolutional neural network with nesterov momentum, IEEE Sensors J. 21 (7) (2021) 9268–9276.
[26] G. Jin, T. Zhu, M.W. Akram, Y.i. Jin, C. Zhu, An adaptive anti-noise neural network for bearing fault diagnosis under noise and varying load conditions, IEEE Access 8 (2020) 74793–74807.
[27] W. Yu, C. Zhao, Broad convolutional neural network based industrial process fault diagnosis with incremental learning capability, IEEE Trans. Ind. Electron. 67 (6) (2020) 5081–5091.
[28] Y. Wang, Z. Liu, Y. Xia, C. Zhu, D. Zhao, Spatiotemporal module for video saliency prediction based on self-attention, Image Vis. Comput. 112 (2021) 104216, https://doi.org/10.1016/j.imavis.2021.104216.