
Measurement 203 (2022) 111899

Contents lists available at ScienceDirect

Measurement
journal homepage: www.elsevier.com/locate/measurement

A novel intelligent fault diagnosis method of rolling bearings with small samples
Xin Yang a, Bing Liu a, Ling Xiang a, *, Aijun Hu a, Yonggang Xu b
a Hebei Key Laboratory of Electric Machinery Health Maintenance & Failure Prevention, North China Electric Power University, Baoding 071003, China
b Beijing University of Technology, Beijing 100124, China

ARTICLE INFO

Keywords:
Rolling bearings
Intelligent fault diagnosis
Small samples
Structurally similar generative adversarial networks (SSGAN)
Improved MobileNetv3 convolutional neural networks (IMCNN)

ABSTRACT

Diagnosing faults automatically and efficiently from measured vibration signals is challenging when only small samples are available. A new intelligent fault diagnosis method for rolling bearings with small samples is proposed based on a structural similarity generative adversarial network (SSGAN) and an improved MobileNetv3 convolutional neural network (IMCNN). Firstly, the wavelet transform (WT) is performed on the signal to obtain a 2D wavelet image with time–frequency characteristics. Then, SSGAN is constructed to obtain high-quality generated samples for expanding the small training sets. Finally, the IMCNN is proposed to extract feature information from the expanded samples by using a self-attention mechanism instead of the original lightweight attention mechanism, and the classification results are acquired for fault recognition. The experimental results show that the proposed SSGAN-IMCNN method can effectively expand the small samples and automatically detect rolling bearing faults with high classification accuracy.

1. Introduction

As a critical element of rotary equipment, rolling bearings play an important role, and any fault can lead to potential failure hazards and unexpected safety problems in the equipment [1–2]. It is necessary to employ advanced and intelligent fault detection methods to ensure the safe operation of rolling bearings and to minimize economic losses [3]. However, the harsh working environment of rolling bearings brings many problems, such as few fault samples and difficulty in fault feature extraction. Moreover, the actually collected data are very complex and often contain multiple noise sources, which makes it challenging to extract effective fault features. The wavelet transform [4], a time–frequency analysis algorithm, provides a “time-frequency” window that changes with frequency. The window can effectively reduce the interference of noise on the signal. The wavelet transform is widely applied in various fields owing to its multi-resolution analysis ability. Fan et al. [5] applied the wavelet transform and the Hilbert transform to obtain a time–frequency distribution for detecting early gear faults. The wavelet transform has also been utilized to recognize faults for failure mode classification [6]. With the mature application of artificial intelligence methods in the industrial field, equipment fault diagnosis is gradually entering the era of intelligence, and the massive data tags make it increasingly difficult for traditional fault diagnosis methods to meet the requirements of modern intelligent fault diagnosis.

Deep learning methods have the advantages of fast processing speed, high detection accuracy, and no reliance on expert knowledge; they have gradually become the mainstream approach to fault diagnosis and have received more and more attention from researchers [7]. Lu et al. [8] adopted a graduated structure by using an autoencoder learning network to detect the condition of rotating equipment. Yu et al. [9] presented a multi-stage semi-supervised network to detect bearing faults by utilizing dataset generation and metric learning. He et al. [10] proposed an enhanced deep autoencoder for the case of insufficient data. The source model was passed to the target function after correction of the loss function, and the target model was matched after fine-tuning. The method had good transfer diagnosis performance. Zhu et al. [11] reported a connecting network for CNN to reduce the dataset complexity for intelligent bearing fault diagnosis. Shao et al. [12] applied a particle swarm model to optimize a deep belief network and improve its classification accuracy. Janssens et al. [13] proposed a CNN-based feature learning method, which was superior to the existing model. Yao et al. [14] presented a lightweight neural network to diagnose bearing faults. Zhao et al. [15] presented a multi-scale inverted residual CNN for intelligent bearing detection

* Corresponding author.
E-mail address: ncepuxl@163.com (L. Xiang).

https://doi.org/10.1016/j.measurement.2022.111899
Received 13 July 2022; Received in revised form 22 August 2022; Accepted 4 September 2022
Available online 10 September 2022
0263-2241/© 2022 Elsevier Ltd. All rights reserved.
X. Yang et al. Measurement 203 (2022) 111899

Fig. 1. The schematic diagram of SSGAN.

under various work conditions. Yan et al. [16] presented a deep regularized variational autoencoder to solve the overfitting problem of the original model, with better recognition of fault severity. Yan also proposed a multi-scale CDB network for fault diagnosis, which could extract rich features hidden in the signal [17]. Jia et al. [18] provided a connection network using a standard sparse autoencoder, which was effectively verified in intelligent fault detection. Wang et al. [19] presented a new method using adaptive maximum cyclostationarity blind deconvolution, which was validated for bearing fault detection. A new deep transfer learning model for fault diagnosis under various working states with unequal sample quantities was reported in [20].

Deep learning models have higher diagnostic accuracy than traditional methods for single and compound faults. Undeniably, these models have achieved some success in rolling bearing fault detection, but most of them still need sufficient training samples to complete the fault diagnosis task. Sufficient fault samples are very difficult to obtain in actual signal acquisition, because a rolling bearing may fail quickly after a fault occurs. Therefore, with small samples, the diagnostic model is prone to overfitting, low recognition accuracy, and difficulty in fault feature extraction. The generative adversarial network (GAN) [21], first proposed in 2014, has been widely applied to data enhancement, sample expansion, and fault detection. A GAN can effectively learn the data distribution of the raw dataset and then generate new artificial data with a similar distribution. Mao et al. [22] detected unbalance faults of rolling bearings by using the Fourier transform and a GAN to synthesize the datasets. Han et al. [23] built a new adversarial learning layer in a DCNN for intelligent fault detection, and the model performs better in the case of insufficient data.

Although the above methods have achieved some success in small-sample fault diagnosis, there is still no concrete framework for selecting the data source and the feature extraction model. It is also important to consider both the time and frequency characteristics of the signal, and a two-dimensional feature image helps generative adversarial networks learn the distribution characteristics; two-dimensional time–frequency signals are therefore better suited to generative adversarial networks. Following these ideas, a small-sample bearing fault diagnosis method combining SSGAN and an improved MobileNetv3 convolutional neural network (IMCNN) is proposed in this paper. WT is used to obtain the time–frequency features from the one-dimensional original vibration signal, in the form of two-dimensional images carrying fault information. SSGAN is constructed to generate high-quality samples for expanding the training set. IMCNN is proposed to extract the deep features and conduct the final diagnosis. Two different experiments are designed, one on the CWRU dataset and one on our laboratory equipment. The results not only verify the quality of the SSGAN synthetic samples, but also demonstrate the effectiveness of IMCNN fault classification.

2. Deep learning model

2.1. Related work

2.1.1. Wavelet transform

The essence of the wavelet transform (WT) is function decomposition, which represents the original signal as a linear combination of wavelet basis functions. WT overcomes the drawback that the window size cannot change with frequency: it provides a “time-frequency” window that changes with frequency and thus has strong time–frequency feature extraction capability. The choice of wavelet basis function profoundly affects the result of the wavelet transform. The Morlet wavelet is a bilateral exponentially decaying cosine signal, which is very similar to the fault pulse generated by a rolling bearing. The Cmor wavelet is the complex form of the Morlet wavelet and has strong adaptability, so the Cmor wavelet is chosen as the basis function of the wavelet transform. It is expressed as Eq. (1):

$$\psi_r(t) = \frac{1}{\sqrt{\pi f_b}} \exp\left(-t^2/f_b\right) \cos(2\pi f t) \tag{1}$$

Its corresponding Fourier representation is:

$$\psi(af) = e^{-\pi^2 f_b (af - f_c)^2} \tag{2}$$

In Eq. (1), $t$ denotes time, $f$ denotes frequency, and $f_b$ is the shape parameter, which determines how fast or slow the waveform oscillation decays. In Eq. (2), $a$ is the transformation scale, and $f_c$ is the center frequency, which determines the oscillation frequency of the waveform.

2.1.2. Structural similarity generative adversarial networks (SSGAN)

Structural similarity (SSIM) [24] is a measure of similarity between images, and the similarity index ranges from 0 to 1. The larger the index is, the higher the similarity is. The principle is shown in Eq. (3):

$$\mathrm{SSIM}(x, y) = [l(x, y)]^{\alpha} \cdot [c(x, y)]^{\beta} \cdot [s(x, y)]^{\gamma} \tag{3}$$

Luminance, contrast and structure are the three important modules in the measurement system. Eq. (4) is the comparison function of luminance, Eq. (5) is the comparison function of contrast, and Eq. (6) is the comparison function of structure.


$$l(x, y) = \frac{2\mu_x \mu_y + C_1}{\mu_x^2 + \mu_y^2 + C_1} \tag{4}$$

$$c(x, y) = \frac{2\sigma_x \sigma_y + C_2}{\sigma_x^2 + \sigma_y^2 + C_2} \tag{5}$$

$$s(x, y) = \frac{\sigma_{xy} + C_3}{\sigma_x \sigma_y + C_3} \tag{6}$$

In the above equations, $\mu_x$ is the mean of $x$, $\mu_y$ is the mean of $y$, $\sigma_x^2$ is the variance of $x$, $\sigma_y^2$ is the variance of $y$, and $\sigma_{xy}$ is the covariance of $x$ and $y$. $C_1$, $C_2$ and $C_3$ are three constants with $C_1 = (k_1 L)^2$, $C_2 = (k_2 L)^2$ and $C_3 = C_2/2$; by default $k_1 = 0.01$, $k_2 = 0.03$ and $L = 2^B - 1$ ($B$ is the bit depth). $\alpha$, $\beta$ and $\gamma$ are constants greater than 0. In practical engineering calculations, $\alpha = \beta = \gamma = 1$ is generally used.

A generative adversarial network (GAN) consists primarily of a generator and a discriminator. The generator uses random noise $z$ of known distribution to learn the real image distribution, which makes its generated image $G(z)$ more realistic. The discriminator then judges whether a sample in the obtained data is real or generated. The training process is equivalent to a game between the two models: in alternate training they compete against each other and eventually reach a Nash equilibrium, at which the generator produces images close to the true image distribution and the discriminator is unable to distinguish real samples from generated ones. The loss function is given as Eq. (7):

$$\min_G \max_D V(D, G) = E_{x \sim P_{data}(x)}[\log D(x)] + E_{z \sim P_z(z)}[\log(1 - D(G(z)))] \tag{7}$$

where $E(\cdot)$ denotes the mathematical expectation, $x$ is the pixel value of the wavelet image, $P_{data}(x)$ is the real data distribution, $D(x)$ is the output of the discriminator, $z$ is the noise data, $P_z(z)$ denotes the noise data distribution, and $G(z)$ is the data generated by the generator.

Table 1
Structure of proposed model (IMCNN).

Input dimension   Kernel size     Attention mechanism   Activation function   Step length
224² × 3          Conv2d, 1 × 1   –                     ReLU                  2
112² × 16         3 × 3           –                     Sigmoid               1
112² × 16         3 × 3           –                     Sigmoid               2
56² × 24          3 × 3           –                     Sigmoid               1
56² × 24          5 × 5           √                     Sigmoid               2
28² × 40          5 × 5                                 Sigmoid               1
28² × 40          5 × 5           √                     Sigmoid               1
28² × 40          3 × 3           –                     ReLU                  2
14² × 80          3 × 3           –                     ReLU                  1
14² × 80          3 × 3           –                     ReLU                  1
14² × 80          3 × 3                                 ReLU                  1
14² × 80          3 × 3           √                     ReLU                  1
14² × 112         3 × 3           √                     ReLU                  1
14² × 112         5 × 5           √                     ReLU                  1
7² × 160          5 × 5           √                     ReLU                  2
7² × 160          5 × 5           √                     ReLU                  1
7² × 160          Conv2d, 1 × 1   –                     ReLU                  1
7² × 960          pool, 7 × 7     –                     –                     1
1² × 960          Conv2d, 1 × 1   –                     ReLU                  1
1² × 1280         Conv2d, 1 × 1   –                     –                     1

Fig. 2. Self-attention submodule.

Fig. 3. Depthwise separable convolution structure.

Fig. 4. Inverse residual module structure diagram.

Fig. 5. Flow chart for SSGAN-IMCNN model.

Table 2
Sample status of rolling bearing fault dataset.

Working conditions   Type of fault
                     Normal   OR    IR    BF
0hp                  200      200   200   200
1hp                  100      100   100   100
2hp                  100      100   100   100
3hp                  100      100   100   100

Table 3
Sample status of rolling bearing fault datasets (image table; columns: type of fault, real samples, auxiliary sample generation; rows: BF, IR, OR, Normal).

SSIM is combined with GAN, and SSIM is used for secondary screening of the generated samples. After the structural similarity analysis between the generated samples and the real samples in the training set, the average SSIM value is taken, and the generated samples are ranked from highest to lowest according to the SSIM value. The generated samples with relatively large differences are removed according to a certain proportion, and the remaining generated samples are added to the training set to complete its expansion. The network structure is displayed in Fig. 1, and it is improved as follows:

(1) A two-dimensional convolution layer is used instead of the coding layer;
(2) Normalization layers are added layer by layer to accelerate network convergence;
(3) The generator uses ReLU as the activation function, except for the layer that uses Tanh;
(4) The discriminator uses LeakyReLU as the activation function, except for the output layer, which uses the Sigmoid function;
(5) SSIM is used to re-screen the generated samples and improve the auxiliary sample quality.
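
As a rough illustration of the SSIM-based secondary screening described above, the following Python sketch computes a single-window (global) SSIM with α = β = γ = 1 and keeps the generated images that rank highest by mean SSIM against the real training images. It is only a sketch under stated assumptions: images are flattened lists of pixel values, the structure term uses the standard SSIM form with C3 = C2/2, `keep_ratio` stands in for the unspecified "certain proportion", and the names `ssim` and `screen_generated` are illustrative rather than taken from the paper.

```python
import math


def ssim(x, y, bit_depth=8, k1=0.01, k2=0.03):
    """Single-window SSIM of two equally sized images given as flat pixel lists."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("images must be non-empty and equally sized")
    dynamic_range = 2 ** bit_depth - 1  # L = 2^B - 1
    c1 = (k1 * dynamic_range) ** 2
    c2 = (k2 * dynamic_range) ** 2
    c3 = c2 / 2
    mu_x, mu_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    sd_x, sd_y = math.sqrt(var_x), math.sqrt(var_y)
    luminance = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    contrast = (2 * sd_x * sd_y + c2) / (var_x + var_y + c2)
    structure = (cov_xy + c3) / (sd_x * sd_y + c3)
    return luminance * contrast * structure  # alpha = beta = gamma = 1


def screen_generated(generated, real, keep_ratio=0.8):
    """Rank generated images by mean SSIM against the real set; keep the top share.

    keep_ratio is an assumed placeholder for the unspecified proportion.
    """
    scored = []
    for image in generated:
        mean_ssim = sum(ssim(image, r) for r in real) / len(real)
        scored.append((mean_ssim, image))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    n_keep = max(1, int(len(scored) * keep_ratio))
    return [image for _, image in scored[:n_keep]]
```

The retained images would then be appended to the real training set. A windowed SSIM (averaged over local patches) is the more common choice in image libraries; the global form is used here only to keep the example short.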

2.2. Proposed model

2.2.1. Improved MobileNetv3 convolutional neural network (IMCNN)

The MobileNetv3 network is a lightweight convolutional network model that enables tasks such as target detection and image classification on mobile devices and in embedded applications. The optimized MobileNetv3 network improved the recognition accuracy and shortened the training time [25–26]. The core of the optimization is to use a depthwise separable convolutional layer instead of a general convolutional layer to filter channel and spatial features by changing the data dimension, which effectively reduces the amount of computation and the number of parameters. In this paper, the inverse residual structure is designed to mitigate the vanishing gradient problem during convolution, and a self-attention mechanism is introduced into this structure to highlight important features and suppress useless ones, which further improves the network performance. The network structure is shown in Table 1.
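
To make the saving from depthwise separable convolution concrete, the sketch below counts the weights of a standard convolution and of its depthwise separable replacement. The layer size used in the example (a 3 × 3 kernel mapping 16 channels to 24) is chosen only for illustration, biases are ignored, and the helper names are hypothetical.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights of a standard k x k convolution (bias ignored)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel, c_in, c_out):
    """Weights of a k x k depthwise convolution followed by a 1 x 1 pointwise one."""
    depthwise = kernel * kernel * c_in  # one k x k filter per input channel
    pointwise = c_in * c_out            # 1 x 1 convolution that fuses the channels
    return depthwise + pointwise


if __name__ == "__main__":
    std = standard_conv_params(3, 16, 24)
    sep = depthwise_separable_params(3, 16, 24)
    # The ratio equals 1/c_out + 1/kernel**2 for any layer size.
    print(std, sep, sep / std)
```

For this assumed layer the separable form needs 528 weights instead of 3456, about 15 % of the original count.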
In recent years, with the deepening of artificial intelligence research, the attention mechanism has been widely applied in various learning fields [27]. The earliest attention mechanisms were mainly applied in visual image processing and natural language processing. At present, introducing the attention mechanism into fault diagnosis can effectively increase the diagnostic accuracy [28]. In the proposed model, the self-attention module is introduced into the depthwise separable convolutional network to enhance the significant features and suppress the useless features by compressing the global spatial features. An additional fully-connected layer is added to further improve the ability of the self-attention mechanism to extract fault features and to improve the nonlinear expression capability of the features. The specific structure of the self-attention module is shown in Fig. 2. The first fully-connected layer reduces the initial feature dimension to 1/16, and the second fully-connected layer increases the data dimension to 1/4. The final fully-connected layer returns to the initial dimension and outputs it again to complete feature extraction. Such a design can increase the non-linear fitting ability of the self-attention module to the fault features in wavelet images. In addition, the Sigmoid function is utilized to activate the last fully-connected layer and obtain weight values in [0,1], which could lessen the parameter number and computational effort in the computing process. Finally, the channel weights are multiplied with the original features to complete the feature extraction.

Table 4
Generated sample quality under different real sample numbers.

Number of training samples (sheets)   Peak Signal to Noise Ratio (PSNR)   Euclidean distance (ED)
200                                   39.104                              0.2950
150                                   39.583                              0.3840
100                                   38.377                              0.5920
50                                    32.916                              0.7970

Table 5
Rolling bearing fault diagnosis accuracy under multiple working conditions.

Experiment number   Auxiliary training samples for each fault type at 0hp   Test sample working conditions   Accuracy (%)
1                   0                                                       0hp                              97.50
2                   500                                                     0hp                              100
3                   1000                                                    0hp                              100
4                   1000                                                    1hp                              98.15
5                   1000                                                    2hp                              97.25
6                   1000                                                    3hp                              91.50

2.2.2. Depthwise separable convolution

The depthwise separable convolution module is composed of a depthwise (layer-by-layer) convolution sub-module with strong spatial feature extraction capability and a pointwise (point-by-point) convolution sub-module that performs linear combination. Its structure is shown in Fig. 3. When the traditional convolutional layer is replaced with the depthwise separable convolutional module, the theoretical computation and the parameter number of the model are effectively


Fig. 6. Confusion matrix graph (training sample → test sample).

lessened. The depthwise convolution sub-module focuses on the feature information at the channel level: the three channels of data p1, p2, p3 are processed with different convolution kernels and passed on to j1, j2, j3 respectively. A separable two-dimensional convolution structure is used to perform the convolution operation on the different channels and extract the feature information of each channel. It is worth noting that the computation occurs only within each channel and the resulting feature information is not fused. The pointwise convolution focuses more on the feature information at the spatial level, and a 1 × 1 convolution kernel linearly combines the information obtained from each channel to realize the feature fusion. Finally, the fused features are extracted using the same-size convolutional kernels Q1, Q2, Q3, Q4 respectively to obtain the deep features.

2.2.3. Inverse residual module

The improved MobileNetv3 network mainly consists of an inverse residual structure and a stack of convolutional blocks. The inverse residual structure allows the data dimension to be expanded to a high dimension and then reduced, enhancing the nonlinear variation of the features. Fig. 4 shows the inverse residual structure, which mainly contains two channels. First, the input is passed through a 1 × 1 convolutional layer that raises the data dimension to complete the channel expansion. Then, 3 × 3 depthwise separable convolutional layers are used to double the number of channels, during which a self-attention mechanism is added to obtain the fault features. Finally, a 1 × 1 convolutional layer is used to reduce the data dimension and obtain the intermediate features. The self-attention module consists of three main fully connected layers that extract features layer by layer. The model extracts the significant features among the stage features by dimensional transformation of the feature layers, and convolutional kernels of different scales are utilized to learn the feature parameters and complete the stage-wise feature screening. Several inverse residual modules form the improved IMCNN network.

Fig. 7. Visualization of different test sets with training set of 0HP.

Fig. 8. Comparison chart of accuracy under different methods.

Table 6
Diagnostic accuracy of rolling bearing fault data.

Experiment number   Methods          Auxiliary training samples for each fault type at 0hp   Test sample conditions   Average accuracy (%)
1                   WT-ACGAN-IMCNN   1000                                                    0                        100
2                   WT-ACGAN-IMCNN   1000                                                    1                        89.77
3                   WT-ACGAN-IMCNN   1000                                                    2                        91.50
4                   WT-ACGAN-IMCNN   1000                                                    3                        87.94
5                   WT-SSGAN-IMCNN   1000                                                    0                        100
6                   WT-SSGAN-IMCNN   1000                                                    1                        98.15
7                   WT-SSGAN-IMCNN   1000                                                    2                        97.25
8                   WT-SSGAN-IMCNN   1000                                                    3                        94.50

Fig. 9. Experiment setup of bearing fault.

Table 7
Sample status of rolling bearing fault data.

Condition     Fault category    Training set samples   Test set samples   Label
Condition 1   Rolling element   100                    100                0
Condition 1   Inner ring        100                    100                1
Condition 1   Outer ring        100                    100                2
Condition 1   Normal            100                    100                3
Condition 2   Rolling element   100                    100                4
Condition 2   Inner ring        100                    100                5
Condition 2   Outer ring        100                    100                6

Table 8
Sample status of rolling bearing fault data set (image table; columns: fault type, original signal, real samples, auxiliary sample generation; rows: fault types 0, 1, 2, 3).

3. Case study

Considering the problem of small fault samples and the difficulty of fault feature extraction, a small-sample bearing fault detection model named SSGAN-IMCNN has been proposed. The method first uses WT to process the bearing vibration signal and obtain a 2D wavelet image with the time–frequency features of the signal. Then the small-sample training set is input to SSGAN to generate more samples and complete the sample expansion. The expanded training set is input to IMCNN for training, and the diagnosis results are obtained. The flow chart of the proposed model is shown in Fig. 5.

3.1. Case 1: Fault bearing data from CWRU

3.1.1. Experimental setup

In order to verify the validity of the proposed method, the Case Western Reserve University bearing dataset is utilized. Four types of experimental samples, including inner ring failure signal (IR), rolling element failure signal (BF), outer ring failure signal (OR) and normal bearing signal (Normal), are selected at 12 kHz sampling frequency. These samples cover four operating conditions from 0 to 3 hp, and all have a fault size of 0.007 in. (0.18 mm). A sliding window sampling method is used. Each sample contains 2000 sampling points, and the step is 600. The size of each generated wavelet image is 224 × 224. 200 samples are collected for each fault type at 0hp and divided 1:1 into the training set and the test set, and 100 samples are collected for each of the other operating conditions as the test set. The details of the experimental samples are shown in Table 2.
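
The sliding-window sampling described above (2000 points per sample, step 600) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code; the function names are hypothetical and the signal is assumed to be a plain one-dimensional sequence.

```python
def sliding_windows(signal, window=2000, step=600):
    """Cut a 1-D signal into overlapping segments of `window` points."""
    last_start = len(signal) - window
    return [signal[start:start + window] for start in range(0, last_start + 1, step)]


def min_signal_length(n_samples, window=2000, step=600):
    """Shortest signal that yields n_samples windows."""
    return window + (n_samples - 1) * step


if __name__ == "__main__":
    demo = list(range(5000))              # stand-in for a vibration record
    segments = sliding_windows(demo)
    print(len(segments))                  # number of overlapping samples
    print(min_signal_length(200))         # points needed for 200 samples
```

With these settings consecutive samples overlap by 1400 points, and 200 samples require a record of at least 2000 + 199 × 600 = 121,400 points.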

3.1.2. Auxiliary sample generation

Four types of fault data at 0hp are selected as the input to the SSGAN network, with the number of epochs set to 3000 and the batch size set to 64, to generate 1000 auxiliary training samples. The generated auxiliary sample images are shown in Table 3. It can be seen from the table that the auxiliary generated samples are very similar but not identical to the real samples, indicating that the auxiliary samples are not only of high quality but also diverse. The auxiliary training samples can enrich the training set and improve the robustness and diagnostic accuracy of the classification model.

To test the quality of the samples generated by the SSGAN network, different numbers of real samples at 0hp are selected as the input, and the generated samples are compared with the real samples. The experiments are designed as shown in Table 4. Euclidean distance (ED) and peak signal to noise ratio (PSNR) are chosen to determine the similarity between the generated samples and the original samples. ED is the distance between image distributions: the more similar the two distributions are, the smaller the value is. PSNR is an error-sensitive image quality measure: the less distorted the sample is, the larger the value is. As seen in Table 4, when the number of samples involved in SSGAN training is below 100, the quality of the generated samples decreases to a large extent. Therefore, to ensure the quality of the generated samples, all subsequent experiments are conducted with a training set of 100 samples.

3.1.3. Diagnosis results

The 200 real samples of each of the 4 running states at 0hp were divided into a training set and a test set in a 1:1 ratio. The training set data at 0hp are used as input to SSGAN, and 1000 auxiliary training samples are obtained for each type of fault. The auxiliary training samples are added to the 0hp dataset for the training of the IMCNN. The test data under the 0hp, 1hp, 2hp and 3hp conditions are used to obtain the results. To avoid the influence of random factors on the experimental results, each group of experiments is conducted ten times and the average accuracy is reported. The experiment design and the diagnosis accuracy are presented in Table 5.

It can be seen from Table 5 that the fault diagnosis accuracy of the proposed method increases from the initial 97.50 % to 100 % under the same working condition. When the test condition is 1hp, the fault diagnosis accuracy is 98.15 %. When the test condition is 2hp, the fault diagnosis accuracy is 97.25 %, and when the test condition is 3hp, the fault diagnosis accuracy is 91.50 %. This shows that the proposed IMCNN is capable of mining deeper features in the samples and has better diagnostic performance. The classification results are visualized in Fig. 6. With the same number of auxiliary training samples added, the upper-left panel uses working condition 0 for both the training and the test samples, the upper-right panel uses working condition 0 for training and working condition 1 for testing, the lower-left panel uses working condition 0 for training and working condition 2 for testing, and the lower-right panel uses working condition 0 for training and working condition 3 for testing. The confusion matrix verifies the effectiveness of the proposed SSGAN-IMCNN method for rolling bearing fault diagnosis.

To further verify the diagnosis of the proposed method, the results of Experiment 3, Experiment 4, Experiment 5, and Experiment 6 in Table 5 are visualized using t-distributed stochastic neighbor embedding (t-SNE). Fig. 7 shows the two-dimensional visualization of the classification results. From the figure, it can be seen that the four operating states are well distinguished after principal component analysis (PCA). ACGAN-IMCNN is applied as a comparison to highlight the accuracy of the proposed method under a variety of operating conditions. To minimize specificity and chance, each experiment is repeated ten times, and the network structure does not change during the experiment. The learning rate is set to 0.0002, and the training and test sets are kept consistent. The comparison results are shown in Fig. 8 and Table 6. From Table 6, the average accuracies obtained by SSGAN-IMCNN reach 100 % (condition 0), 98.15 % (condition 1), 97.25 % (condition 2) and 94.50 % (condition 3), the last three of which are higher than those of the ACGAN-IMCNN method, 89.77 % (condition 1), 91.50 % (condition 2) and 87.94 % (condition 3). Therefore, the proposed method performs strongly on datasets from different working conditions for bearing fault diagnosis.

Table 9
Comparison of generated sample quality under different real sample numbers.

Number of training samples   Peak Signal to Noise Ratio (PSNR)   Euclidean distance (ED)
100                          35.78471                            0.7116
80                           33.1606                             0.8013
60                           22.0718                             0.8812
40                           20.6659                             0.9743
20                           21.1524                             1.3931

Table 10
Comparison of diagnostic accuracy under different data sets.

Experiment serial number   Number of real samples of each type of failure   Number of auxiliary training samples for each type of fault   Total test sample size   Accuracy (%)
1                          100                                              100                                                           700                      53.95
2                          100                                              500                                                           700                      86.20
3                          100                                              1000                                                          700                      99.96
4                          100                                              2000                                                          700                      100.00
5                          80                                               100                                                           700                      51.75
6                          80                                               500                                                           700                      81.65
7                          80                                               1000                                                          700                      99.45
8                          80                                               2000                                                          700                      100.00
9                          60                                               100                                                           700                      45.50
10                         60                                               500                                                           700                      73.60
11                         60                                               1000                                                          700                      98.20
12                         60                                               2000                                                          700                      99.31
13                         40                                               100                                                           700                      39.50
14                         40                                               500                                                           700                      71.34
15                         40                                               1000                                                          700                      94.35
16                         40                                               2000                                                          700                      98.50
17                         20                                               100                                                           700                      35.20
18                         20                                               500                                                           700                      65.31
19                         20                                               1000                                                          700                      90.94
20                         20                                               2000                                                          700                      94.30
21                         0                                                100                                                           700                      14.59
22                         0                                                500                                                           700                      45.32
23                         0                                                1000                                                          700                      78.94
24                         0                                                2000                                                          700                      90.75

Fig. 10. Classification accuracy bar charts of data.

Fig. 11. Two-dimensional visualization of classification results.

3.2. Case 2: Fault bearing data from lab

3.2.1. Experimental setup

The experiments are carried out using the rolling bearing experimental setup shown in Fig. 9, which consists of the accelerometer, motor, test bearing and loading module. The vibration signal is gathered using accelerometers with a sampling frequency of 12.8 kHz. The experimental bearing is a 6205 deep groove ball bearing, and the faults are created on the bearing using EDM technology. The signals are collected at 1425 rpm (operating condition 1) and 1470 rpm (operating condition 2) respectively. The datasets include normal signals (Normal), inner ring faults (IR), which are divided into training and test sets at 1:1, as shown in Table 7.

Faced with the problem of few fault data, the small samples of wavelet images are input to SSGAN for adversarial training, and the generated high-quality image samples enrich the training set and improve the robustness and diagnostic accuracy of the classification.

Fig. 12. 3D visualization of classification results.


The number of training epochs for the model is set to 5000, and the batch size is 16. 2000 auxiliary training samples are generated. Table 8 shows the comparison of the auxiliary samples with the real samples. It can be seen from the table that the auxiliary generated samples are very similar but not identical to the real samples, indicating that the auxiliary samples are not only of high quality but also diverse.

To test the quality of the samples generated by SSGAN, different numbers of real samples are selected as the input of SSGAN, and the generated samples are compared with the real samples in the experiments shown in Table 9. When the number of samples involved in SSGAN training is below 100, the quality of the generated samples decreases to a large extent. Therefore, to ensure the quality of the generated samples, all subsequent experiments are conducted with a training set of 100 samples.
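
For readers who want to reproduce this kind of quality check, the two measures reported in Tables 4 and 9 can be sketched as follows. This is a minimal illustration under assumptions: images are flattened lists of pixel values, PSNR uses the usual definition 10·log10(MAX²/MSE), and the Euclidean distance is taken directly between the two vectors. The paper does not state how its ED values are normalized, so the numbers produced here are not expected to match the tables; the function names are illustrative.

```python
import math


def psnr(reference, generated, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    if len(reference) != len(generated) or not reference:
        raise ValueError("images must be non-empty and equally sized")
    mse = sum((a - b) ** 2 for a, b in zip(reference, generated)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: no distortion
    return 10 * math.log10(max_value ** 2 / mse)


def euclidean_distance(p, q):
    """Euclidean distance between two equally sized vectors (assumed unnormalized)."""
    if len(p) != len(q):
        raise ValueError("vectors must be equally sized")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))
```

A higher PSNR and a smaller distance both indicate that a generated image is closer to its real counterpart, which is the direction of the trend reported as the number of real training samples grows.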

3.2.2. Diagnosis results


It can be seen from Table 10 that when the number of real samples is 100,
the addition of auxiliary samples increases the classification accuracy
from 53.95 % to 100 %. When the number of real samples is 80, 60, 40, 20,
or 0, adding auxiliary samples also increases the accuracy of fault
detection to varying degrees. The classification results are illustrated
in Fig. 10. It is clear from Fig. 10 that the number of auxiliary training
samples significantly affects the accuracy and stability of the
diagnostic model.

To further validate the proposed IMCNN model, the experiment results
in Table 10 are visualized using t-Distributed Stochastic Neighbor
Embedding (t-SNE). The visualization diagrams are displayed in Fig. 11
and Fig. 12, which present the classification results respectively. The
seven operation states are well distinguished after PCA (principal
component analysis). The results show that the 2D-SSGAN image generation
model proposed in the paper can improve the quality of sample
generation, which enables the IMCNN model to mine the features in the
data more effectively and further improve the fault diagnosis accuracy.

The classification confusion matrix obtained with the proposed IMCNN is
displayed in Fig. 13. The fault classification accuracy reaches 100 %
after the auxiliary sample expansion is performed, which verifies the
superiority of the proposed method in bearing fault diagnosis. To
further verify the feasibility of the proposed method, the number of
samples for auxiliary training under working condition 1 is limited and
the test set of working condition 2 is used to evaluate the trained
network. The classification result is presented in Table 11.

Fig. 13. Classification confusion matrix based on 2D-SSGAN image generation method.

Table 11
Comparison of diagnostic accuracy under different datasets.

Experiment number | Number of auxiliary training samples for various faults under 0hp | Test sample conditions | Accuracy (%)
1 | 0 | 1hp | 84.5
2 | 500 | 1hp | 93.5
3 | 1000 | 1hp | 97.5

As can be seen from Table 11, when the test set is taken from working
condition 2, the classification accuracy increases from 84.5 % to 97.5 %
with the addition of auxiliary samples, a gain of 13 percentage points
over the accuracy before the addition of auxiliary samples. It is clear
that the number of auxiliary training samples has a significant effect on
the accuracy of the diagnostic model, and the feasibility of the proposed
method is verified for bearing fault diagnosis. The classification
confusion matrix diagram is displayed in Fig. 14.
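
As a minimal sketch of how an accuracy figure relates to a confusion matrix, and of the arithmetic behind the gain reported for Table 11, the following Python fragment may help. The function names and the toy labels are hypothetical; only the values 84.5 and 97.5 come from Table 11.

```python
def confusion_matrix(true_labels, predicted_labels, n_classes):
    """Rows index the true class, columns index the predicted class."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


# Toy example with three classes and one misclassified sample.
toy_matrix = confusion_matrix([0, 1, 2, 2], [0, 1, 2, 1], n_classes=3)
toy_accuracy = accuracy(toy_matrix)

# Accuracies from Table 11 without and with 1000 auxiliary samples.
baseline, augmented = 84.5, 97.5
gain_points = augmented - baseline             # absolute gain, percentage points
relative_gain = gain_points / baseline * 100   # gain relative to the baseline, %
```

The absolute gain is 13 percentage points, which corresponds to a relative improvement of roughly 15 % over the baseline accuracy.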

4. Conclusion

To address the problems of scarce fault data and difficult feature
extraction, a novel method has been proposed for small-sample bearing
fault diagnosis based on SSGAN and an improved MobileNetv3 convolutional
neural network (IMCNN). The superiority and stability of the proposed
method are verified by analyzing the standard dataset and laboratory
bearing data. The main conclusions are summarized as follows:

(1) Morlet wavelets are used to transform the data dimension and
extract two-dimensional time–frequency image features from the
one-dimensional original signal, which fully utilizes the
powerful time–frequency feature extraction capability of the wavelet
transform.

Fig. 14. Classification confusion matrix graph based on SSGAN image generation method.


(2) Combining GAN with SSIM, the advantages of GAN in image
generation can be maximized. The quality of the auxiliary
training samples is greatly improved after SSIM eliminates the
generated samples with large differences from the real samples.
(3) By improving the MobileNetv3 convolutional network and
introducing Early-stop mechanism, the training time of the
network can be minimized. By replacing the lightweight focus
mechanism with the self-focus mechanism, 100 % accuracy is
achieved in small sample bearing fault diagnosis, which effectively
improves the classification accuracy of the proposed
model.
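
The early-stop mechanism mentioned in item (3) can be sketched as follows. This is a generic patience-based rule on the validation loss, given only as an illustration: the class name, the `patience` and `min_delta` parameters, and the loss values are assumptions, since the stopping criterion actually used for IMCNN is not specified in this section.

```python
class EarlyStopping:
    """Signal a stop once the validation loss has failed to improve
    for `patience` consecutive epochs."""

    def __init__(self, patience=5, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best_loss = float("inf")
        self.stale_epochs = 0

    def should_stop(self, val_loss):
        # An epoch counts as an improvement only if it beats the best
        # loss seen so far by more than min_delta.
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
        return self.stale_epochs >= self.patience


def epochs_trained(val_losses, patience=5, min_delta=0.0):
    """Return how many epochs run before the rule halts training."""
    stopper = EarlyStopping(patience, min_delta)
    for epoch, loss in enumerate(val_losses, start=1):
        if stopper.should_stop(loss):
            return epoch
    return len(val_losses)


# Hypothetical validation losses: the best value is reached at epoch 2.
losses = [1.0, 0.8, 0.81, 0.82, 0.83, 0.5]
stopped_at = epochs_trained(losses, patience=3)
```

With a patience of 3, training halts after epoch 5, so the remaining epochs are never run; this is how such a rule shortens training time, at the cost of possibly missing a later improvement.
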

Short training time and high diagnostic accuracy have been the
constant pursuit in the field of fault diagnosis. Although the proposed
method in this paper has good performance in classification results, its
diagnostic accuracy in multiple working conditions still needs to be
improved. The next step is to further adjust the parameters and structure
in the network, shorten the training time of the network, and improve
the performance of IMCNN in multi-conditions fault diagnosis.

CRediT authorship contribution statement

Xin Yang: Investigation, Formal analysis, Writing – original draft.
Bing Liu: Writing – original draft, Investigation, Software. Ling Xiang:
Conceptualization, Methodology, Writing – review & editing. Aijun Hu:
Resources, Writing – review & editing, Supervision. Yonggang Xu:
Visualization, Data curation.

Declaration of Competing Interest

The authors declare that they have no known competing financial
interests or personal relationships that could have appeared to influence
the work reported in this paper.

Data availability

Data will be made available on request.

Acknowledgment

This work was supported by the National Natural Science Foundation
of China (52075170 and 52175092), Hebei Natural Science Foundation
(A2022502005) and Post-Graduate’s Innovation Fund Project of Hebei
Province (CXZZBS2022154).
