
Estimation of Remaining Useful Life of Bearings

Using Sparse Representation Method

Yuting Nie
Department of Automation
Beijing University of Aeronautics and Astronautics
Beijing, China
nieyuting@buaa.edu.cn

Jiuqing Wan
Department of Automation
Beijing University of Aeronautics and Astronautics
Beijing, China
wanjiuqing@gmail.com

Abstract-Prognostics and health management (PHM) plays an important role in improving the reliability and safety of industrial systems. This paper presents a method that uses sparse representation to address the PHM problem posed in the IEEE 2012 PHM Data Challenge Competition. We find that the sparse representation score can be a better health indicator for ball bearings than the raw acceleration signal or extracted features such as energy, root mean square, maximum peak value and skewness. The experimental datasets, from seventeen ball bearings, are provided by the FEMTO-ST Institute for the competition. The data consist of six groups of bearing acceleration data for algorithm training and another eleven groups of bearing acceleration data for testing. The training groups are used to build the full score curve model, which is then used to estimate the remaining useful life of the truncated test groups. The result is compared with that obtained using a mixture of traditional features such as energy, maximum peak value, root mean square and skewness. The results of the sparse representation method are relatively promising and reflect the states of the rotating ball bearings well.

Keywords- ball bearings; sparse representation; RUL

I. INTRODUCTION

Bearings are critical components in rotating machines. The degradation of bearings over time is one of the most important causes of machine breakdown, so it is meaningful to estimate the remaining useful life (RUL) of bearings in advance. However, useful information is hard to obtain, and many approaches have been tried. In this work, the horizontal and vertical acceleration signals of the bearings are used.

There are currently two main approaches to predicting the RUL of rotating machines: physical-model-based and data-driven. Physical models need a lot of knowledge about the system in order to build an analytical model of the degradation mechanism. The advantage of this approach is that it can provide precise results. However, the system is often nonlinear, and degradation mechanisms are generally hard to express as analytical models [1]. Data-driven models are relatively easier to build and can give better results [2]. In order to make full use of the data, a method based on sparse representation is used to classify the states of the bearings and thereby predict their RUL. First, a normal feature dictionary is built based on sparse representation, which can represent the healthy states of ball bearings effectively. Then the other feature vectors of the ball bearings are represented over the sparse dictionary, which yields a reconstruction cost curve that reflects the degradation trend. The trend of the curve is used to divide the states of the ball bearings into different regions in order to predict the RUL. The result is compared with a mixture of traditional features under the same experimental conditions. The experimental data are from the IEEE Reliability Society and the FEMTO-ST Institute. The datasets consist of six training sets obtained from run-to-failure experiments and eleven test sets containing truncated experimental data. Details about the data can be found on the FEMTO-ST Institute's website [3].

The paper is organized as follows. Section 2 describes traditional feature extraction methods. Section 3 explains the sparse representation method for characterizing the states of the bearings. Section 4 shows the experimental results of the sparse representation method and of the traditional features method. Finally, Section 5 concludes the paper and outlines future work.

II. RELATED WORK

To describe the remaining useful life of rotating bearings accurately, it is important to extract useful characteristics of the bearings. The raw acceleration signals contain a lot of information about the bearings, and many kinds of features have been extracted from them to describe the degradation process.

Mosallam et al. [4] extracted peak-to-peak, maximum peak value, root mean square, kurtosis and skewness from the ball bearings to obtain a health indicator, but they presented only the predicted trends of the bearings, not the RUL estimates we need. Sutrisno [5] used the average of the five highest absolute acceleration values to describe the states of the bearings, but this involved considerable human judgement of the states, which is not easy to reproduce. Yu [6] used a Hidden Markov Model based Mahalanobis distance to provide an indication for quantifying machine health states, which verifies the effectiveness of the state classification. Others have investigated classification techniques based on machine learning. Tamilselvan et al. [7] used deep belief learning to judge the health state. Mosallam et al. [8] developed a two-phase data-driven

978-1-4673-8554-1/151$31.00 2015 IEEE RPOl53 2015 Prognostics and System Health Management Conference-Beijing
(2015 PHM-Beijing)
method for RUL prediction, which includes an offline phase and an online phase. Wang and Su [9] used wavelet packet sample entropy to forecast the fault trend of rolling element bearings. Qian [10] presented an integrated approach that combines recurrence quantification analysis with an autoregressive model to evaluate bearing performance degradation. Li [11] used an autoregressive model to separate the original vibration signal of the bearing into random and deterministic parts, and then used these parts to calculate an energy ratio as the fault indicator. Li and Wang [12] used root mean square (RMS) and wavelet entropy as characteristic parameters in the time domain, and then built a logistic regression model to estimate rolling bearing reliability.

Among all these methods, we found an interesting one in anomaly detection for image processing. Cong et al. [13] used sparse representation to detect abnormal events in crowded scenes, with good results. Given a collection of normal training examples, they proposed the sparse reconstruction cost over the normal dictionary to measure the normalness of a testing sample. The sparse reconstruction cost is high if the testing sample is abnormal and low if it is normal. Inspired by the method of Cong et al., we propose to apply sparse reconstruction over the normal bases to the estimation of the remaining useful life of bearings.

III. OUR METHOD

The method is composed of three parts: the formation of the health feature vector, the classification of the states of ball bearings and the estimation of the remaining useful life of ball bearings.

A. The Formation of the Health Feature Vector

1) The Sparse Dictionary

A sparse dictionary is a feature vector set $A = [a_1, a_2, \ldots, a_n] \in \mathbb{R}^{m \times n}$ composed of feature vectors from the acceleration signals of healthy ball bearings. Each column vector $a_i \in \mathbb{R}^m$ denotes a normal feature, which in our method is built from the Fast Fourier transform of the acceleration data and represents the shape feature of the bearing in the frequency domain. Fig. 1 shows the process from the original shape feature to the denoised shape feature of a normal observation, obtained by extracting the low-frequency coefficients of the multi-scale wavelet transform. Fig. 2 shows the same process for an abnormal observation. The figures show that the shapes of the normal feature and the abnormal feature are quite different. We choose the sparse reconstruction cost to represent this difference and intend to use its trend to help estimate the RUL of the bearings. The goal of the wavelet packet decomposition of the initial Fast Fourier transform feature is to reduce the noise and extract the main shape information in the frequency domain, as can be clearly seen in the right part of Fig. 1 and Fig. 2. We choose the first 20% of the features of a group of bearing acceleration data after wavelet decomposition to create the normal sparse dictionary.

Figure 1. The normal feature and the normal feature after wavelet transform.

Figure 2. The abnormal feature and the abnormal feature after wavelet transform.

2) Sparse Representation

After building the normal sparse dictionary, we use it to compute the sparse representation of the test samples. An improved algorithm based on the Orthogonal Matching Pursuit algorithm, adapted to the limited data, is used to obtain the sparse representation of a testing sample over the sparse dictionary. It is a greedy algorithm for approximating the solution of the $P_0^{\epsilon}$ problem:

$$(P_0^{\epsilon}): \quad \min_x \|x\|_0 \quad \text{subject to} \quad \|y - Ax\|_2 \le \epsilon_0 \qquad (1)$$

where $y$ is the test sample, $A$ is the sparse matrix, $x$ is the sparse coefficient vector and $\epsilon_0$ is an error tolerance set in advance. The unknown $x$ is composed of two parts to be found: the support of the solution and the non-zero values over the support. Our goal is to minimize the number of non-zero values in $x$ under the condition that $\|y - Ax\|_2 \le \epsilon_0$. If the whole dictionary has been traversed and the error still does not satisfy $\epsilon \le \epsilon_0$, the loop stops. A test vector that needs to traverse the whole dictionary has a high probability of being abnormal. The sparse representation $x \in \mathbb{R}^n$ of each testing sample over the sparse dictionary is obtained with this algorithm, which is summarized as follows:

Algorithm 1: Orthogonal-Matching-Pursuit
1 Input: sparse matrix $A$, test vector $y$
2 Output: sparse coefficient vector $x$
3 Initialization: $x^0 = 0$, $r^0 = y - Ax^0 = y$, $S^0 = \emptyset$
4 Repeat: Compute $\epsilon(j) = \min_{z_j} \|a_j z_j - r^{k-1}\|_2^2$
5 Update: $r^k = y - Ax^k$, $S^k = S^{k-1} \cup \{j_0\}$
6 Until convergence.

Here $r$ is the residual, $z_j$ is the sparse vector under the current loop and $S$ is the solution support. If $\forall j \notin S^{k-1}$, $\epsilon(j_0) \le \epsilon(j)$, namely a minimizer $j_0$ is found, the support is updated as $S^k = S^{k-1} \cup \{j_0\}$. The stopping rule is $\|r^k\|_2 < \epsilon_0$ or the whole dictionary having been traversed.

3) The Sparse Reconstruction Cost

After obtaining the sparse representation $x \in \mathbb{R}^n$ of each testing sample, the next step is to calculate the sparse reconstruction cost of the testing samples of a set of bearing data. The goal is to find out whether a testing sample $y$ is normal or not, namely its degree of abnormality. If the testing sample $y$ is close to the healthy state, the score will be low. If the testing sample $y$ is far from the healthy state, or even in the near failure state, the score will be high. Given a testing sample $y$, we design a Sparse Reconstruction Cost (SRC) using the minimal objective function value of the following formula to detect its abnormality:

[equation lost in the source text] (2)

where $A = [a_1, a_2, \ldots, a_n] \in \mathbb{R}^{m \times n}$, $x \in \mathbb{R}^n$, and the 2-norm $\|x\|_2$ is defined as $\|x\|_2 = \left(\sum_i x_i^2\right)^{1/2}$.

A high SRC value implies a high reconstruction cost and a higher probability of being close to the failure state of the ball bearing. The cost reflects the normalness of the testing samples. Each testing sample gets a reconstruction cost, and these costs form a cost curve of one bearing over all its testing samples. In fact, two reconstruction cost curves can be obtained for one bearing, one from the vertical and one from the horizontal acceleration signal.

4) The Formation of the Health Feature Vector

A five-dimensional vector is formed to be the health feature vector. The first two dimensions are the vertical and the horizontal sparse reconstruction cost respectively, the third dimension is the root mean square of the two sparse reconstruction costs, the fourth dimension is the maximum of the two costs and the last is the time, which ensures the continuity of the clustering result in time. All five dimensions show rather monotonous trends for all the bearing sets.

B. The Classification of the States of Ball Bearings

As all health feature vectors show a trend of monotone increase or oscillation, we assume that a bearing goes through at least three states: the healthy state, the transition state and the near failure state. K-means, an unsupervised learning method [15], is used for clustering. The input is the set of health feature vectors $\{x_1, x_2, \ldots, x_n\} \in \mathbb{R}^{5 \times n}$, and the number of clusters is set to three as presumed. The starting points of the K-means clustering are fixed using prior knowledge in order to get relatively stable results about the states of the bearing. The algorithm is given below:

Algorithm 2: Clustering algorithm: K-means
1 Input: health feature vectors $\{x_1, x_2, \ldots, x_n\} \in \mathbb{R}^{5 \times n}$, number of clusters $k$, the $k$ cluster centroids $u_1, u_2, u_3 \in \mathbb{R}^5$
2 Repeat:
3 $c_i = \arg\min_j \|x_i - u_j\|^2$, $\forall i \in \{1, 2, \ldots, n\}$, $j \in \{1, 2, 3\}$
4 $u_j = \dfrac{\sum_{i=1}^{n} 1\{c_i = j\}\, x_i}{\sum_{i=1}^{n} 1\{c_i = j\}}$
5 Until convergence.

Here $k$ is the number of clusters set in advance, $c_i \in \{1, 2, 3\}$ is the cluster to which sample $i$ belongs, namely the one whose centroid is closest among the $k$ clusters, and $u_j$ is the center of mass of cluster $j$. The algorithm iterates until the variation of $u_j$ is less than a threshold.

C. The Estimation of Remaining Useful Life (RUL) of Ball Bearings

The last part is to estimate the remaining useful life. To predict failure, a rate method is used. As the reconstruction score curves of all the test bearings present a rising trend, which is similar to the near failure state of the training bearings, we assume that all the test bearings have already entered the near failure state. The rate is the ratio between the combined length of the transition and near failure states and the length of the transition state, as in (3). Once the states of a bearing have been divided, we get the first point, at which the bearing enters the transition state, and the second point, at which the bearing enters the near failure state. Given a test bearing such as bearing1-3, we find the first and second points of the state division of the training bearings under the corresponding condition, such as bearing1-1 and bearing1-2. Letting $X$ be the estimated end of the test bearing, we get

$$\frac{end - point_1}{point_2 - point_1} = \frac{X - point_1^*}{point_2^* - point_1^*} \qquad (3)$$

where $point_1$ and $point_2$ belong to the training bearing under the same condition, $end$ is the real end of the training bearing, and $point_1^*$ and $point_2^*$ belong to the test bearing. $X$ is the estimated end of the test bearing. Once we have $X$, we have the estimated RUL of the test bearing. The final result is the average over the training bearings.

IV. EXPERIMENT AND RESULT

A. Experiment Data

The experiment datasets are provided by the FEMTO-ST Institute. The data are collected under three different conditions: first operating condition (1800 rpm and 4000 N), second operating condition (1650 rpm and 4200 N) and third operating condition (1500 rpm and 5000 N) [16]. There are two
complete groups of run-to-failure datasets under each condition, which belong to the learning set. Five groups of truncated datasets are provided for each of condition 1 and condition 2, and one group of truncated data for condition 3, all belonging to the test set. The sampling frequency is 25.6 kHz, and 2560 samples (i.e. 1/10 s) are recorded every 10 seconds. Each such recording is an observation. For all the truncated test datasets, the time of the last recording is given. Details of the datasets are shown in Table I and Table II.

TABLE I. EXPERIMENT DATASETS

Datasets       Operating Conditions
               Conditions 1   Conditions 2   Conditions 3
Learning set   Bearing1-1     Bearing2-1     Bearing3-1
               Bearing1-2     Bearing2-2     Bearing3-2
Test set       Bearing1-3     Bearing2-3     Bearing3-3
               Bearing1-4     Bearing2-4
               Bearing1-5     Bearing2-5
               Bearing1-6     Bearing2-6
               Bearing1-7     Bearing2-7

TABLE II. ACTUAL RULS TO BE ESTIMATED

Test set     Actual RUL
Bearing1-3   5730 s
Bearing1-4   2890 s
Bearing1-5   1610 s
Bearing1-6   1460 s
Bearing1-7   7570 s
Bearing2-3   7530 s
Bearing2-4   1390 s
Bearing2-5   3090 s
Bearing2-6   1290 s
Bearing2-7   580 s
Bearing3-3   820 s

[Plot: Construction score of three states, versus Observation]
Figure 3. The three states of bearing1-1

We view each group of 2560 samples as an observation. For each observation, we take the Fourier transform of the 2560 samples as a vector. We then take the first 20% of all the observations and use them to build the sparse dictionary $A = [a_1, a_2, \ldots, a_n] \in \mathbb{R}^{m \times n}$, where each column vector $a_i$ denotes a normal feature. Fig. 1 and Fig. 2 show the normal feature of an observation and the abnormal feature of an observation respectively. The construction score describes the difference between the normal feature and the abnormal feature.

B. Criteria

The results of the experiment are measured using the percentage error. Let $RUL_i$ and $ActRUL_i$ denote respectively the estimated remaining useful life of the bearing and the actual RUL to be predicted. The percentage error on an experiment is defined by:

$$\%Er_i = 100 \times \frac{ActRUL_i - RUL_i}{ActRUL_i}, \quad i \in [1, 11] \qquad (4)$$

C. Results

As described in Section III, the first 20% of the observations of a bearing are used as the normal observations, and the sparse reconstruction scores of the remaining observations are computed on both the horizontal and the vertical acceleration data of the bearing. In this way we get two dimensions of our health indicator. We then combine the two scores to get their maximum value and their root mean square, which form two further dimensions of the vector. The last dimension of the health indicator is the time. The K-means method is then used to cluster the health vectors. We get the six rates of the learning sets and, for the eleven test sets, the two points at which the bearing enters the transition state and the near failure state. Using this information, we can estimate the RUL of the test sets.

[Plot: Construction score of three states, versus Observation]
Figure 4. The three states of bearing1-3

Since the test bearing1-3 and the learning bearings bearing1-1 and bearing1-2 are under the same condition and have similar degradation trends, we can estimate the RUL of bearing1-3 based on the rates of bearing1-1 and bearing1-2. We take this group as an example. The five dimensions of the health feature vector are divided into three states by the K-means clustering method as in Fig. 3. The first time, when the bearing enters the period of transition, is observation #893. The second time
when the bearing enters the near failure period is observation #1839. The duration of the second and third periods, relative to the duration of the second period, gives the state duration ratio for bearing1-1:

$$\text{State Ratio} = \frac{2803 - 893}{1839 - 893} = 2.02 \qquad (5)$$

Fig. 4 shows the five dimensions of the health feature vector and the three states after K-means clustering on the vectors of bearing1-3. The first point, at which the bearing enters the period of transition, and the second point, at which the bearing enters the near failure period, are observations #697 and #1357 respectively. The end of bearing1-1 is observation #2803. Using formula (3), $X$ is 2030.2 based on the model of bearing1-1, so the estimated remaining useful life of bearing1-3 is 2272 s ((2030.2 - 1803) × 10 = 2272 s). From Table II, the actual RUL of bearing1-3 is 5730 s, so by formula (4) the error is 60%.

Another comparison has been made between bearing1-3 and bearing1-2, because they are under the same condition. We take the mean value of the two comparisons and get the final error of 57.5%. Among all the test bearings, the best error is 2.4%, for bearing2-5 under condition 2. All results are in Table III. It can be seen that the method has estimated the RUL of the bearings in general and that no over-estimation appears. Moreover, the results have been compared with those of the traditional features under the same experimental conditions and show better accuracy.

TABLE III. RECONSTRUCTION SCORE ERROR

Condition 1         Condition 2         Condition 3
Bearing   Error%    Bearing   Error%    Bearing   Error%
1-3       57.7      2-3       84.8      3-3       44.5
1-4       100       2-4       45.5
1-5       118       2-5       2.4
1-6       108       2-6       51.5
1-7       92.8      2-7       72.5

Table IV shows the errors obtained when the health indicator is replaced by the traditional features under the same method of RUL estimation. The traditional features are a mixture of energy, root mean square, maximum peak value and skewness. Under the same method of RUL estimation, the errors of the bearings using our health indicator are clearly smaller than the errors using the traditional features. This demonstrates that the health indicator containing the construction score information better reflects the state of the bearing.

TABLE IV. TRADITIONAL FEATURE ERROR

Condition 1         Condition 2         Condition 3
Bearing   Error%    Bearing   Error%    Bearing   Error%
1-3       115       2-3       210       3-3       121
1-4       111       2-4       230
1-5       866       2-5       555
1-6       498       2-6       108
1-7       151       2-7       198

V. CONCLUSION

A method that uses the sparse representation score as a feature for estimating the remaining useful life of ball bearings has been presented. The reconstruction score represents the states in the degradation of the bearing relatively well. An abnormal observation differs from a normal one: as we use preset normal observations as the dictionary, the reconstruction score of any other observation will be high if that observation is abnormal. We applied this method to the PHM 2012 challenge data and found that it represents the state of the bearings better than traditional features such as energy, root mean square, maximum peak value and skewness. Further work should be done to use this sparse representation method to improve the accuracy of the estimation of the remaining useful life of ball bearings.

ACKNOWLEDGMENT

The authors would like to thank the FEMTO-ST Institute for providing the experimental data.

REFERENCES

[1] A. K. Jardine, D. Lin, and D. Banjevic, "A review on machinery diagnostics and prognostics implementing condition-based maintenance," Mechanical Systems and Signal Processing, vol. 20, no. 7, pp. 1483-1510, 2006.
[2] M. Lebold and M. Thurston, "Open standards for condition-based maintenance and prognostic systems," Maintenance and Reliability Conference (MARCON), 2001.
[3] FEMTO-ST, "IEEE PHM 2012 Data Challenge," online website, http://www.femto-st.fr/en/Research-departments/AS2M/Research-groups/PHM/IEEE-PHM-2012-Data-challenge.php, last accessed on May 31, 2012.
[4] A. Mosallam, K. Medjaher, et al., "Nonparametric Time Series Modelling for Industrial Prognostics and Health Management," The International Journal of Advanced Manufacturing Technology, vol. 69, pp. 1685-1699, 2013.
[5] E. Sutrisno, "Estimation of remaining useful life of ball bearings using data driven methodologies," Prognostics and Health Management (PHM), 2012 IEEE Conference on, USA, 2012.
[6] J. Yu, "Health condition monitoring of machines based on hidden markov model and contribution analysis," IEEE Trans. Instrum. Meas., vol. 61, no. 8, pp. 2200-2211, Aug. 2012.
[7] P. Tamilselvan and P. Wang, "Failure diagnosis using deep belief learning based health state classification," Reliability Engineering and System Safety, vol. 115, pp. 124-135, July 2013.
[8] A. Mosallam, K. Medjaher, et al., "Data-driven prognostic method based on Bayesian approaches for direct remaining useful life prediction," Journal of Intelligent Manufacturing, Springer Verlag (Germany), pp. 1-20, 2014.
[9] F. Wang and W. Su, "Application of Wavelet Packet Sample Entropy in the Forecast of Rolling Element Bearing Fault Trend," International Conference on Multimedia and Signal Processing, 2011.
[10] Y. Qian, "Bearing Performance Degradation Evaluation Using Recurrence Quantification Analysis and Auto-regression Model," IEEE International Instrumentation and Measurement Technology Conference, 2013.
[11] R. Li, "Fault features extraction for bearing prognostics," Journal of Intelligent Manufacturing, vol. 23, pp. 313-321, April 2012.
[12] H. Li and Y. Wang, "Rolling Bearing Reliability Estimation Based on Logistic Regression Model," International Conference on Quality, Reliability, Risk, Maintenance, and Safety Engineering, 2013.
[13] Y. Cong, J. Yuan and J. Liu, "Abnormal event detection in crowded scenes using sparse representation," Pattern Recognition, vol. 46, pp. 1851-1864, July 2013.
[14] M. Elad, "Sparse and Redundant Representations," Springer New York Dordrecht Heidelberg London.
[15] G. Shi, "The optimized K-means algorithms for improving randomly initialed midpoints," 2nd International Conference on Measurement, Information and Control, 2013.
[16] P. Nectoux, R. Gouriveau, K. Medjaher, et al., "PRONOSTIA: An experimental platform for bearings accelerated degradation tests," IEEE International Conference on Prognostics and Health Management, PHM'12, June 2012.
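
The state classification of Section III.B can be sketched in Python as K-means with fixed starting centroids applied to five-dimensional health feature vectors. The sketch below makes several assumptions that the paper does not specify: the time dimension is rescaled to [0, 1], every dimension is standardised before clustering, and the "prior knowledge" seeds are simply the first, middle and last observations. The cost curves are synthetic, and the names `kmeans` and `health_features` are hypothetical.

```python
import numpy as np

def kmeans(X, centroids, tol=1e-6, max_iter=100):
    """Lloyd's K-means. X is (n, d); centroids is (k, d) initial guesses."""
    centroids = np.asarray(centroids, dtype=float).copy()
    for _ in range(max_iter):
        # Assignment step: c_i = argmin_j ||x_i - u_j||^2
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Update step: u_j = mean of the samples assigned to cluster j
        new_centroids = centroids.copy()
        for j in range(len(centroids)):
            members = X[labels == j]
            if len(members) > 0:         # keep the old centroid if a cluster empties
                new_centroids[j] = members.mean(axis=0)
        shift = np.linalg.norm(new_centroids - centroids)
        centroids = new_centroids
        if shift < tol:
            break
    return labels, centroids

def health_features(cost_v, cost_h):
    """Stack the five dimensions named in Section III.A.4 into an (n, 5) array."""
    t = np.linspace(0.0, 1.0, len(cost_v))   # time, rescaled to [0, 1] (assumption)
    rms = np.sqrt((cost_v ** 2 + cost_h ** 2) / 2.0)
    return np.column_stack([cost_v, cost_h, rms, np.maximum(cost_v, cost_h), t])

# Synthetic cost curves with three plateaus (healthy, transition, near failure).
cost_v = np.concatenate([np.full(50, 0.01), np.full(30, 0.05), np.full(20, 0.20)])
cost_h = 0.8 * cost_v
X = health_features(cost_v, cost_h)
X = (X - X.mean(axis=0)) / X.std(axis=0)     # standardise each dimension (assumption)
init = X[[0, len(X) // 2, len(X) - 1]]       # hypothetical "prior knowledge" seeds
labels, centroids = kmeans(X, init)
point1 = int(np.argmax(labels == 1))         # first observation in the transition state
point2 = int(np.argmax(labels == 2))         # first observation in the near failure state
print(point1, point2)
```

The two indices play the role of the first and second points used by the rate method in Section III.C. On real score curves the result will depend on the scaling and the seeds, which is presumably why the paper fixes the starting points.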

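
As a check on the arithmetic of the rate method, the following Python sketch solves formula (3) for the estimated end X and evaluates formula (4), using the values quoted in Section IV.C for training bearing1-1 and test bearing1-3. The function names are hypothetical, and the last recorded observation of bearing1-3 (1803) is taken from the subtraction shown in that section. The sketch keeps the state ratio unrounded, so its numbers differ slightly from the paper's, which uses the ratio rounded to 2.02.

```python
def estimate_end(train_point1, train_point2, train_end, test_point1, test_point2):
    """Solve formula (3) for X, the estimated end of the test bearing (in observations)."""
    rate = (train_end - train_point1) / (train_point2 - train_point1)
    return test_point1 + rate * (test_point2 - test_point1)

def percent_error(actual_rul, estimated_rul):
    """Formula (4): percentage error of an RUL estimate."""
    return 100.0 * (actual_rul - estimated_rul) / actual_rul

SECONDS_PER_OBSERVATION = 10   # one observation is recorded every 10 s

# Values quoted in Section IV.C for training bearing1-1 and test bearing1-3.
x_end = estimate_end(893, 1839, 2803, 697, 1357)
last_observation = 1803        # last recorded observation of bearing1-3
rul_seconds = (x_end - last_observation) * SECONDS_PER_OBSERVATION
error = percent_error(5730, rul_seconds)
print(round(x_end, 1), round(rul_seconds), round(error, 1))
```

With the unrounded ratio the estimated end is about observation 2029.6, the estimated RUL about 2266 s and the error about 60.5%, close to the 2030.2, 2272 s and 60% reported in the text.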