You are on page 1of 15

Int J Adv Manuf Technol (2013) 69:16851699

DOI 10.1007/s00170-013-5065-z

ORIGINAL ARTICLE

Nonparametric time series modelling for industrial


prognostics and health management
Ahmed Mosallam Kamal Medjaher
Noureddine Zerhouni

Received: 2 January 2013 / Accepted: 13 May 2013 / Published online: 3 July 2013
Springer-Verlag London 2013

Abstract Prognostics and health management (PHM) in machinery health prognostics. The goal of this work is to
methods aim at detecting the degradation, diagnosing the develop a method that can extract features representing the
faults and predicting the time at which a system or a com- nominal behaviour of the monitored component, and from
ponent will no longer perform its desired function. PHM these features, smooth trends are extracted to represent the
is based on access to a model of a system or a component critical components health evolution over the time. The pro-
using one or combination of physical or data-driven models. posed method starts by multidimensional feature extraction
In physical-based models, one has to gather a lot of knowl- from machinery sensory signals. Then, unsupervised feature
edge about the desired system and then build an analytical selection on the features domain is applied without mak-
model of the system function of the degradation mechanism ing any assumptions concerning the number of the extracted
that is used as a reference during system operation. On the features. The selected features can be used to represent the
other hand, data-driven models are based on the exploitation nominal behaviour of the system and hence detect any devi-
of symptoms or indicators of degradations using statistical ation. Then, empirical mode decomposition algorithm is
or artificial intelligence methods on the monitored system applied on the projected features with the purpose of follow-
once it is operational and learn the normal behaviour. Trend ing the evolution of data in a compact representation over
extraction is one of the methods used to extract important time. Finally, ridge regression is applied to the extracted
information contained in the sensory signals, which can be trend for modelling and can be used later for the remain-
used for data-driven models. However, extraction of such ing useful life prediction. The method is demonstrated on
information from the collected data in a practical working accelerated degradation data set of bearings acquired from
environment is always a great challenge as sensory signals PRONOSTIA experimental platform and another data set
are usually multidimensional and obscured by noise. Also, downloaded from NASA repository where it is shown to be
the extracted trends should represent the nominal behaviour able to extract signal trends.
of the system as well as the health status evolution. This
paper presents a method for nonparametric trend modelling Keywords Feature extraction Health indicator Trend
from multidimensional sensory data so as to use such trends construction Health state detection Prognostics

1 Introduction
A. Mosallam K. Medjaher () N. Zerhouni
FEMTO-ST Institute, AS2M Department,
University of Franche-Comte/CNRS/ENSMM/UTBM, The degradation of machine critical components is one of
24 rue Alain Savary, 25000 Besancon, France the main reasons of machine breakdown. Moreover, faulty
e-mail: kamal.medjaher@ens2m.fr components can also affect other components in the sys-
A. Mosallam tem which might lead to severe consequences. Effective
e-mail: Ahmed.Mosallam@femto-st.fr condition monitoring and health assessment of machinery
N. Zerhouni deterioration is therefore crucial for reducing the downtime
e-mail: Nourredine.Zerhouni@ens2m.fr and the costs while increasing the availability [20].
1686 Int J Adv Manuf Technol (2013) 69:16851699

Diagnostics
Fault
Prognostics
detection

Health Decision
assessment support
Data &
information
exchange
Data User
manipulation interface

Data
acquisition

System with Fig. 3 Example of raw signal extracted from bearing test bid
sensors
The process of extracting representative health progres-
Fig. 1 PHM general scheme
sion data from sensory time series data is known as trend
extraction [3]. It can be performed using multiple parame-
Maintenance strategies can be classified into (1) break- ters, where relations between parameters can be utilised to
down maintenance, (2) preventive maintenance and (3) represent the deterioration. However, sensory signals usu-
condition-based maintenance (CBM) [16]. In breakdown ally contain tremendous amount of oscillations and partly or
maintenance, no actions are taken to maintain the equipment completely hidden by noise which can be very challenging
until it breaks down. Preventive maintenance only includes to process and to extract informative representation of the
setting periodic intervals for machine inspection. CBM machine status. Also, the extracted trends should be used
analyses the condition measurements that do not interrupt as a reference for the nominal behaviour and to predict the
normal operations. Compared to other maintenance strate- health status in the future.
gies, CBM attempts to avoid unnecessary tasks by taking In this work, time and frequency domain features have
actions only when there is evidence of abnormal behaviour been extracted from two sensory time series. The fea-
in the machine. Thus, CBM can produce significant savings tures have been grouped using unsupervised feature selec-
through more convenient scheduling and therefore has been tion algorithm based on symmetrical uncertainty measure,
widely explored in research and industry lately [36]. which can be used as a reference model for the nominal
One of the most important CBM activities is prognostics behaviour. Then, each group of features has been pro-
and health management (PHM) which can be defined as the jected into a compact two-dimensional representation using
process of detecting abnormal conditions, diagnosis of the principle component analysis (PCA) [19]. In order to get
fault and their cause and prognostics of future fault progres- monotonic-like signals, the final trends were extracted from
sion (see [12, 31]). PHM research has been shifting lately each projected group using empirical mode decomposi-
from fault detection and diagnostics to prognostics. Prog- tion (EMD) algorithm [14] and used to build a regression
nostics can be defined as the estimation of time to failure model of the machinery health progression. These models
and risk for one or more existing or future failure modes can be used later for predicting the remaining useful life of
[15]. It usually requires the past history of the machinery the system.
condition sensory data [34]. The machinery health progres- This paper is organised as follows: A detailed litera-
sion can be then extracted from the acquired data. The ture review is depicted in Section 2. Section 3 explains the
extracted progression data are then used as input for predic- method in details. The experimental results are shown in
tion or estimation algorithms to predict the future machine Section 4. Finally, the conclusion of this paper and the future
status [22]. work are depicted in Section 5.

Fig. 2 The proposed method


Unsupervised
Feature Feature Trend Health state
feature
extraction compression extraction modeling
selection
Int J Adv Manuf Technol (2013) 69:16851699 1687

2.5

1.5
RMS

0.5

0
0 2 4 6 8 10 12 14 16 18
Time (hours)

Fig. 4 Example of RMS feature extracted from sensory bearing test Fig. 5 Example of group of selected feature compressed using PCA
bid

2 Background
when a fault/outlier has occurred. This section is dedi-
Data-driven PHM methods aim at building degradation cated to review some of the research works conducted
models from sensory data acquired from sensors attached for this task.
to critical components (see Fig. 1). These sensory data con- An online anomaly detection algorithm that is not based
tain valuable information about several aspects of health on any learning algorithm has been proposed in [8]. The
progression, which has proved to be very challengeable. algorithm sets different operation modes once it starts run-
In this section, we review state-of-the-art preprocessing ning on N observations. Any change on the number of the
approaches for the main PHM activities. clusters or the behaviour of the new data in an already
defined cluster will be considered as suspicious behaviour.
2.1 Fault detection The proposed method is divided into two main parts: initial-
isation and monitoring. For initialisation, 27 features were
Fault or anomaly detection is the simplest part of PHM, extracted; however, the authors did not specify exactly all
which can be defined as the process of identifying the features. Then, data standardisation has been performed
by applying unit variance and mean centering. PCA has
been used for data dimensionality reduction. Finally, the
Table 1 Summary of features extracted from raw signals
greedy expectation maximisation clustering algorithm was
Feature Formula used to estimate the unknown numbers of operating mode
(OM) clusters. For the monitoring part, the first feature
Peak-to-peak mean(upperpks ) + mean(lowerpks ) extraction has been applied. Then, calculating the online
Maximum peak value max(find peaks(signal))
 data and updating the modelled mean and variance of the
n (x1 + x2 ... + xn )
1 2 2 2
Root mean square old data set have been performed. Data standardisation for
E(x)4
Kurtosis 4
the online readings was performed as in the initialisation

N
part and also dimensionality reduction. Then, OM clusters
(xi x)3
i=1
Skewness (N 1) 3
were updated using Mahalanobis distance measure. Finally,
Mutual information I (X, Y ) = H (X) + H (Y ) H (X, Y ) OM was tracked using evolving TakagiSugeno model in

Entropy H (X) = xX p(x) log p(x) order to predict the dynamics of the data within each cluster.
1 
Arithmetic mean of PSD 20 log10 n
abs(ff t (x)) A comparison between PCA and partial least square (PLS)
105

n for fault detection has been presented in [33]. Moreover,
Line integral i= abs(xi+1 xi )
i=0 the authors have used PLS for prognostics. The proposed

p
method is divided into three parts: feature extraction, mod-
Autoregressive model xt = c + i xti + t
i=1 elling and deviation detection. For feature extraction, 16

n
Energy e= xi2 signals have been measured from moving gate-type inciner-
i=0
ator. The model on healthy data sets was built using PCA
1688 Int J Adv Manuf Technol (2013) 69:16851699

Fig. 6 Example of 5 5
decomposition process using 4 4
EMD. a First IMF, b third IMF, 3 3
c sixth IMF, and d residual 2 2
1 1

IMF1

IMF3
0 0
1 1
2 2
3 3
4 4
5 5
0 5 10 15 20 25 30 0 5 10 15 20 25 30
Time (hours) Time (hours)
(a) First IMF (b) Third IMF
1.5
20

1 15

0.5 10

Residual
IMF6

0 5

0.5 0

1 5

1.5 10
0 5 10 15 20 25 30 0 5 10 15 20 25 30
Time (hours) Time (hours)

(c) Sixth IMF (d) Residual

and PLS. Finally, T square and Q statistics have been pro- and monitoring temporal evolution of aircraft engine health
posed to detect faults. The results show that both PCA is presented in [7]. The environmental variables and engine
and PLS performed fairly well in fault detection, and PLS effects are removed from rough measurements using gen-
was also good for prognostics. An algorithm for machin- eral linear model. The residuals of the regression are used
ery health monitoring and early fault detection has been after that for training self-organised maps (SOMs) which
proposed in [23]. The method decomposes the input sig- shows the evolution of the motor status. Two methods for
nal into intrinsic mode functions (IMFs) using EMD and detecting anomalies in time series data sets acquired from
then combined mode function is applied to mix neighbour- different machinery sensors have been proposed in [1]. The
ing IMFs to obtain the best signal. Feature extraction is then first approach uses entropy analysis over the entire set of
performed using Fourier transform on the acquired signal. sensors at once to detect anomalies that have broad system-
Finally, the method uses hidden Markov models (HMMs) wide impact. It starts by smoothing and normalising time
to build a model of the normal condition of the gearbox, series data for each sensor. Then, the time series values
and average probability index is constructed as an index for were sampled using uniformly sized bins. Finally, Shannon
machinery health status. A visual tool for fault detection entropy is computed for each time step as a reference model

Fig. 7 PRONOSTIA Data


experimentation platform acquisition
module
Load
module

Rotating
module

Tested
bearing
Int J Adv Manuf Technol (2013) 69:16851699 1689

Fig. 8 a, b PRONOSTIA first 14 0.6


monotonic data set
12 0.4
10
0.2
8

Residual

Residual
6 0

4 -0.2
2
-0.4
0

-2 -0.6

-4 -0.8
0 1 2 3 4 0 1 2 3 4
Time (hours)
Time (hours)
(a) (b)

for the system. The second approach uses automated cluster- health monitoring for diesel engines has been proposed in
ing of sensors combined with intra-cluster entropy analysis [28]. The method is based on building a model using inde-
to detect anomalies and faults that have more local impact. pendent component analysis. Probabilistic outlier detection
For each of the n sensors, the method computes the Pear- algorithm has been also proposed for the detection of
son correlation between each pair of sensors and form an anomalies.
n n distance matrix. Then, the method clusters the sensors
by performing a graph partitioning of the adjacency graph 2.2 Diagnostics
and performs a time-windowed correlation within each sen-
sor cluster. Finally, the entropy of the m discrete values Another important aspect in PHM systems is diagnostics
has been computed to provide the cluster entropy for the and evaluation of wear advancements, and it is presented
system. The strength points in this work were no parame- in this section. A multidimensional diagnostic approach for
ters to tune, and they are easy to implement. However, the mechanical systems has been presented in [4]. Vibration
entropy does not differentiate between anomalies or noise signals were acquired from diesel engine, heavy fan and
which make it difficult for the method to detect different rubbing blades in a turbo set to validate the approach. The
types of faults. A method for fault detection for electrical authors applied mean centering and normalisation for the
machines has been proposed in [32]. The method learns the signals and then using singular value decomposition (SVD),
normal behaviour over the time and detects any changes the multidimensional data set was reduced to a lower dimen-
between the signals due to the degradation of the system. sion. Finally, health evolution indexing and fault diagnostics
The proposed method selects interesting features from the have been proposed by choosing the most informative SVD
measured signals from the system using pairwise similar- indexes. The same authors also proposed PCA instead of
ity measure algorithm and uses Gaussian mixture models SVD for its computational efficiency in [5]. SOMs have
for relation description. Finally, a distance measure between been proposed in [25] for fault diagnostics and assessment
different signals was calculated as indicator for the system of fault severity. Kurtosis and line integral of acceleration
deviation. A method for unsupervised change detection and signal have been extracted from bearing vibration signals

Fig. 9 a, b PRONOSTIA first 8 1500


nonmonotonic data set
6
1000

4
100
Residual

Residual

0
0

500
-2

-4 1000
0 1 2 3 4 0 1 3 4 5
Time (hours) Time (hours)
(a) (b)
1690 Int J Adv Manuf Technol (2013) 69:16851699

Fig. 10 a, b PRONOSTIA
second monotonic data set

for different faults. Then, a SOM has been trained using the The method shows that using feature selection to select
extracted features. EMD was also proposed in many works smaller set of features yields in more effective results. The
as it returns smooth monotonic-like signals. A comparison proposed method is divided into three parts: feature extrac-
study has been performed on the performance of four sta- tion, feature selection and tool state learning. For feature
tistical indexes such as crest factor, kurtosis, skewness and extraction, 13 features have been extracted from acoustic
beta distribution function [13]. The study shows that there is emission (AE) signals acquired from computer numerically
no significant advantage in using beta function compared to controlled (CNC) machine. Feature selection has been done
using kurtosis and crest factor for detecting and identifying using automatic relevance determination to select features
different machinery defects. The paper also shows that the which appear to have more potential use. Finally, the tool
statistical parameters are affected by the shaft speed. The state learning has been conducted using the support vec-
proposed method is only based on extracting four features tor machine (SVM). The performance of four classification
mentioned earlier from vibration and sound signals acquired algorithms for fault diagnostics has been compared in [9].
from test rig. Two temporal models, HMM and autore- Sixteen features were extracted from force sensors attached
gressive moving model with exogenous input (ARMAX), to a cutting machine. Then, three features were automat-
have been used for diagnostics in [10]. In this work, 16 ically selected from the extracted features using genetic
features have been extracted from force signals acquired algorithms. A segmentation algorithm for health monitoring
from the cutter milling machine. Selecting dominant fea- and signal trending has been proposed in [6]. It is conducted
tures has been applied using least multicollinearity along in four steps: (1) online segmentation of data into linear seg-
with a high R 2 . Finally, building models for different tool ments, (2) classification of the segmented lines into seven
wears using ARMAX and HMM. A method for diagnos- shapes, (3) transformation of the obtained shapes into three
tics and tool state recognition has been proposed in [18]. main shapes and (4) aggregation of the current episode with

Fig. 11 a, b PRONOSTIA
second nonmonotonic data set
Int J Adv Manuf Technol (2013) 69:16851699 1691

Fig. 12 a, b PRONOSTIA third


monotonic data set

the previous ones to form the signal trend. The performance 2.3 Prognostics
of SVM versus artificial neural network (ANN) for gear
fault diagnostics has been compared in [37]. The method is Prognostics can be defined as the prediction when a fail-
based on selecting important features from a larger feature ure might take place. Prognostics has recently attracted a lot
set, and it shows that this selection increases the classifi- of research interest due to the need of models for accurate
cation accuracy. Finally, the authors used data sets for nine estimation of the remaining useful life (RUL) for different
different gear fault classes. For each class, 40 measures applications.
were recorded. The comparison shows that SVM outper- A method for prognostics and health assessment has been
forms ANN. An unsupervised feature selection method for proposed in [39]. The method starts by extracting 16 time
deciding the optimal depth for wavelet packet decomposi- and frequency features from double suction pump vibration
tion (WPD) has been presented in [35]. The paper shows signals. The method then uses PCA to merge features and
that using feature selection to determine the depth for WPD to project the multidimensional feature vector into a com-
led to a model which can retain almost all the accuracy of pact indicator. Finally, fault threshold has been calculated
models built by a much deeper transformation. The pro- using best efficiency point. Autoregression (AR) filter has
posed method determines the appropriate level using Chi been used to model vibration signal from bearing test rig
square and information gain and then builds a diagnostics in [21]. The model parameters have been estimated using
model using a naive Bayes classifier. A method shows that if LevinsonDurbin recursion. Then, the energy ratio between
the size of the SOM is chosen judiciously, then it is possible the random parts and the original signal was used as fault
to monitor and identify different range of faults which has indicator. The algorithm is simple to implement and suit-
been presented in [17]. The method proposes an empirically able for highly accelerated signals. An integrated framework
derived equation that governs the proper network size for for fault detection, diagnosis and prognosis using HMM has
efficient diagnostics. been presented in [40]. The proposed framework starts the

Fig. 13 a, b PRONOSTIA third


nonmonotonic data set

(a) (b)
1692 Int J Adv Manuf Technol (2013) 69:16851699

Fig. 14 a, b NASA first


monotonic data set

(a) (b)

data preprocessing by frame blocking, frequency spectral ratio and Gaussian mixture model clustering algorithm, the
analysis and noise filtering. Then, using PCA, the dimen- important features have been selected. A semi-supervised
sionality of the data set has been reduced. Next, the health feature selection algorithm for selecting dominant features
status estimator has been built using HMM and health index for later wear status monitoring has been proposed in [41].
interpolation using Paris formula. A probabilistic model for The method extracts 16 features from the raw force sig-
online health indicator of bearing wear using wavelet pack- nals acquired from cutting tools. Then, SVD was applied
age decomposition and HMM has been proposed in [27]. on the feature space, and the n most dominant compo-
The method is divided into three parts: the vibration sig- nents were selected manually. K-means clustering was
nal is divided into equal-sized signal epochs, the nth level applied on the feature space using n clusters, and the set of
WPD was applied to these signal epochs and finally the closest m features to the cluster centroids has been iden-
probabilities of the HMM for the normal condition are cal- tified. Finally, multiple regression model has been built
culated which can be used as the health indicator. The main using the selected set of m features. The authors reported
contribution in this method was the proposition of a new the quality of the regression results using different sets of
probabilistic model for machinery health monitoring using selected features and also against different feature selec-
HMM. A single hidden semi-Markov model for prognos- tion approaches. SOM was also proposed for prognostics.
tics has been proposed in [11]. In this work, seven signals, A method for prognostics and trend analysis using modi-
three force, three vibration and one acoustic, have been fied SOM has been proposed in [38]. The paper proposes
acquired from the CNC milling machine. Then, 16 statis- unequal scaling method for improving the performance of
tical features were extracted from the three force acquired SOM. It shows that the SOM outperforms constrained topo-
signals. Wavelet feature extraction from the force signals logical mapping on estimation of an unknown function with
and discrete Meyer wavelet have been applied to vibra- multiple indices. Particle filter has been widely used for
tion and AE signals. Finally, using Fishers discriminant fault progression modelling and estimation specially for

Fig. 15 a, b NASA first


nonmonotonic data set

(a) (b)
Int J Adv Manuf Technol (2013) 69:16851699 1693

Fig. 16 a, b NASA second


monotonic data set

nonlinear systems [2]. The particle filter does not assume component and detecting any abnormal behaviour. Then,
a general analytic form for the state space probability from these features, the method extracts smooth trends to
density function (PDF). The extended Kalman filter is the represent critical components health evolution over the
most popular solution to the recursive nonlinear state esti- time. The extracted trends will be used to build reference
mation problem. However, the desired PDF is approximated models that could be used for health status prediction.
by a Gaussian, which may have significant deviation from
the true distribution causing the filter to diverge. In contrast,
for the particle filter approach, the PDF is approximated 3 The method
by a set of particles representing sampled values from
the unknown state space and a set of associated weights The idea is to extract parameters from successive mul-
denoting discrete probability masses. The particles are gen- tidimensional features acquired from machinery sensory
erated and recursively updated from a nonlinear process signals, without making any assumptions concerning the
model that describes the evolution in time of the system source of the signals and the number of the extracted fea-
under analysis, a measurement model, a set of available tures, which reflect the machinery deterioration over time.
measurements and an a priori estimate of the state PDF. The assumptions taken in this work can be summarised as
Particle filter has been proposed in [30] and [29] for RUL follows:
estimation.
From the review, it can be seen that the processing of 1. The input to the proposed method is in general multi-
sensory signals differs according to the task to be done, i.e. dimensional time series sensory signals acquired from
fault detection, diagnostics or prognostics. The goal of this critical components.
work is to develop a method that can be used to extract fea- 2. The time series signals should capture the health status
tures representing the nominal behaviour of the monitored evolution through the time.

Fig. 17 a, b NASA second


nonmonotonic data set
1694 Int J Adv Manuf Technol (2013) 69:16851699

3. There are no assumptions about the nature and number 3.3 Feature compression
of the sensors.
4. Historical data sets should be complete in order to build In order to follow the trajectories of selected features over
reference model(s). time, the number of features has to be reduced to a com-
pact form. The goal in this step is to compress the n
The algorithm consists of five main phases as shown in
features selected in the previous step onto one-dimensional
Fig. 2.
space. One way to compress the variables, i.e. reducing the
number of their dimensions, without much loss of infor-
3.1 Feature extraction
mation is using PCA. PCA projects data from features
to principal component domain while keeping the greatest
Gathered signals from machine components contain gener-
variance by any projection of the data on the first prin-
ally immense number of data which require a large amount
cipal component and the second greatest variance on the
of memory and computation power to be analysed as shown
second principle and so on. PCA can be used for data com-
in Fig. 3.
pression of the input data set using one or more of the
In order to transform the raw input data into reduced
principle components. In this way, the method can com-
informative representation, different features can be
press the n input features, selected in the previous step,
extracted from raw signals. These features can be derived
into single trend to represent the health evolution over the
from time domain, frequency domain or joint time
time.
frequency domain. Figure 4 depicts an example of root
In this work we use the standard PCA method where
mean square (RMS) feature extracted from the raw signal.
eigenvalues i and eigenvectors vi have been calculated for
In this work, features from both time and frequency
covariance matrix C of the selected features:
domains have been extracted from raw signals and are listed
in Table 1.
Ci = i vi . (2)
3.2 Unsupervised feature selection
Then, the first component is used to represent the health
Not all extracted features acquired in the previous step are status evolution with respect to time as shown in Fig. 5.
interesting, that is, containing information about the degra- The compressed features are then further processed to get a
dation of the system. We are interested in signals that have smooth trend of the health status.
nonrandom relationships and consequently contain infor-
mation about system degradation. To select such signals,
3.4 Signal trend extraction
an unsupervised feature selection algorithm [24] based on
information theory is applied. The algorithm first calculates
So far, the reduced signals exhibit high level of oscillations
the pairwise symmetrical uncertainty for all the input signals
and noise. Such oscillations make the signal hard for visu-
defined by the following:
alisation and superfluous for model building tasks. In this
section, the internal structure of the data is extracted in a
I (X, Y ) way which best explains the degradation in a simple mono-
SU(X, Y ) = 2 , (1)
H (X) + H (Y ) tonic signal, and the rest of oscillations are discarded using
EMD algorithm.
where I(X,Y) is the mutual information between two ran- EMD, which was originally proposed by Huang et al.
dom variables X and Y ; H(X) and H(Y) are information [14], is a way for signal decomposition into a successive
entropy of a random variable. Then, the algorithm mea- IMF, such as depicted in Fig. 6, and is composed of the
sures the distance between all the pairs using hierarchical following steps:
clustering. The algorithm finally ranks the resulting clusters
according to the quality of the included signals in represent- Find all the local maxima and minima of the input sig-
ing interesting relationships, that is, containing information nal and compute the corresponding upper and lower
about machinery degradation using normalised SOM distor- envelopes using cubic spline, respectively
tion measure. The selected features now can represent the Subtract the mean value of the upper and lower
nominal behaviour of the critical component. Any change envelopes from the original signal.
in the relations between selected features can indicate a Repeat until the signal remains nearly unchanged and
potential fault in the system. The selected features are then obtain IMFi .
compressed into lower dimension using PCA to represent Remove IMFi from the signal and repeat if it is neither
the critical components health evolution over the time. a constant nor a trend.
Int J Adv Manuf Technol (2013) 69:16851699 1695

3.5 Health status modelling In a practical application, we need to determine the


values of the regression parameter , and the principle
Machine learning provides large methodologies such as objective in doing so is usually to achieve the best predictive
simple regression algorithms which are used to fit models performance on new data. The performance on the train-
through training data. In case of regression, we assume that ing set is not a good indicator of predictive performance on
the real underlying function is the sum of our estimation unseen data due to the problem of over-fitting. Here, we use
and an error, i.e. the error can be estimated by a difference cross-validation to take the available data and partition it
between the real underlying function and our estimate: into a training set used to determine the coefficients w and
a validation set used to optimize the model complexity (in
y = wx + = y wx. (3) this case finding a proper value for ).
The goal is to use this to find a vector w for which the
regression model best fits the training data, which corre-
4 Experiments
sponds to finding the minimum value of the error. Usually,
this procedure called training the model is done by min-
This work has been verified using two different data sets,
imising the sum-of-squares error function for each data
PRONOSTIA and NASA data sets. First, the trends have
point:
been extracted from the raw data and then a selected trend
has been modelled for RUL estimation. The details of the

N
E = SSE = {y wx}2 . (4) data sets and the experiments are explained in the following
n=1 subsection.

There are many linear regression methods which can 4.1 Trend extraction
be used to learn the degradation process. In standard
linear regression, the pseudo-inverse of the input train- 4.1.1 PRONOSTIA data set
ing data is used to compute the coefficients for a linear
model. As the name suggests, this involves to invert a PRONOSTIA (see Fig. 7) is an experimentation platform
matrix and which might not always be possible because dedicated to test and validate bearing fault detection, diag-
of zero eigenvalues of the matrix. Ridge regression tries nostic and prognostic approaches. It is composed of four
to cope with this by perturbation of the diagonal entries main parts: a rotating part, a degradation generation part
of the matrix. (with a radial force applied on the tested bearing), a mea-
If we expand the square in Eq. (4) and then differentiate surement part and test bearings [26]. In this work, three
the expansion with respect to w, we end up with the equation data sets acquired from this platform have been used to val-
below from where we can isolate w as a function of known idate the algorithm, where only the acceleration data have
data: been used.
For the first data set, the algorithm generates nine signals,
XT Xw = XT y w = (XT X)1 XT y. (5) each of that represents evolution of the bearing degrada-
tion over the time with different trajectories. Figure 8 shows
The major weakness of Eq. (5) is that the X is two selected monotonic health indices for the first exper-
a nonsquare matrix, and thus, it is not invertible. To iment. As can be seen from the figure, the trends show
solve this problem pseudo-inverse generalisation of the smooth increasing signals over the time which can be used
notion of matrix inverse to nonsquare matrices has to deduce the health status of the machinery. The plot on
been applied: the left side is the result of fusing two AR parameters for
the first sensor and one parameter for the second sensor.
The plot on the right side shows another indicator which
w = (XT X)1 XT y = X y. (6)
was a result of fusing also three features, namely skewness
for the first sensor along with the second parameter of AR
Nevertheless, if XT X in Eq. (6) has zero eigenvalues,
model and line integral feature extracted from the second
it is not invertible and said to be singular. Ridge regres-
sensor signal.
sion tries to cope with this by adding a term I to XT X
Figure 9 shows two selected nonmonotonic health
which perturbs the diagonal entries of XT X to encourage
indices for the same experiment. The plot on the left side
nonsingularity:
is the result of fusing two features, skewness of the second
sensor and energy of the first sensor. The plot on the right
XT X (XT X + I) w = (XT X + I)1 XT y. (7) side shows another indicator which was a result of fusing
1696 Int J Adv Manuf Technol (2013) 69:16851699

Fig. 18 a, b NASA third


nonmonotonic data set

also two features, namely kurtosis and skewness extracted for the aforementioned experiment. The plot on the left
from the second sensor. side, which is almost a linear function, is the result of
For the second data set, the algorithm generates ten sig- fusing three features, peak-to-peak, maximum peak value
nals, each of that indicates the health degradation over and AR parameter, where all the features have been
the time. Figure 10 shows two selected monotonic health extracted from the first sensor. The plot on the right side
indices for the first experiment. The plot on the left side, shows a smooth decreasing monotonic function. It is the
which is almost a linear function, is the result of fusing two result of fusing three features, arithmetic mean of the
features, maximum peak value and AR parameter; both have power spectral density (PSD) extracted from both sensors
been extracted from the first sensor. The plot on the right and kurtosis which has been extracted from the second
side shows a smooth monotonic function. It is the result also sensor.
of fusing two features, RMS and line integral acquired from Figure 13 shows two selected nonmonotonic health
the second sensor. indices for the same experiment. The plot on the left
Figure 11 shows two selected nonmonotonic health side is the result of fusing two features, skewness which
indices for the same experiment. The plot on the left side has been extracted from the first sensor and entropy of
is the result of fusing three features, entropy of both sen- the first sensor. The plot on the right side shows another
sors and AR model parameter for the first sensor. The plot indicator which was the result of fusing two features,
on the right side shows another indicator which was a result namely kurtosis and line integral acquired from the second
of fusing four features, namely peak-to-peak, maximum sensor.
peak value, energy for the second sensor and the mutual
information between both sensors. 4.1.2 NASA data set
For the third data set, the algorithm generates nine sig-
nals that indicate the health degradation over the time. This data set is fully described in NASAs website
Figure 12 shows two selected monotonic health indices (ti.arc.nasa.gov/tech/dash/pcoe/). In this work, only the first

Fig. 19 Selected trends for


modelling. Training
PRONOSTIA (a) and NASA (b)
data sets
Int J Adv Manuf Technol (2013) 69:16851699 1697

Fig. 20 Error minimisations for


PRONOSTIA (a) and NASA (a)
data sets

two sensor readings have been used to validate the algo- For the final data set, the algorithm generates ten trends
rithm. For the first data set, the algorithm generates ten from the raw data, and none of them was monotonic. Fig-
trends from the raw data. Figure 14 shows two selected ure 18 shows two selected nonmonotonic health indices for
monotonic health indices for the aforementioned experi- the same experiment. The plot on the left side is the result
ment. The plot on the left side shows the result of fusing four of fusing two features, peak-to-peak acquired from the first
features, peak-to-peak, maximum peak value and energy sensor and entropy of both sensors. The plot on the right
acquired from the second sensor and entropy from the side shows another indicator which was the result of fusing
first sensor. The plot on the right side shows a smooth two features, namely root mean square and AR coefficient
decreasing monotonic function. It is the result of fusing acquired from the second sensor.
two features, mutual information between both sensors
and entropy which has been extracted from the second 4.2 Ridge regression
sensor.
Figure 15 shows two selected nonmonotonic health From NASA and PRONOSTIA data sets, two trends have
indices for the same experiment. The plot on the left side been selected for this step. One trend is used for building
is the result of fusing two features, maximum peak value the regression and the other trend is used for testing. The
which has been extracted from the first sensor and skew- experiments have been done on the same time t. Figure 19a,
ness which has been extracted from the second sensor. The b shows the original trends for PRONOSTIA and NASA,
plot on the right side shows another indicator which was respectively.
the result of fusing two features, namely kurtosis and line
integral acquired from the first sensor.
For the second data set, the algorithm generates six
10
trends from the raw data. Figure 16 shows two selected Original
Fit
monotonic health indices for the aforementioned experi-
8
ment. The plot on the left side shows the result of fusing
two features, maximum peak value acquired from the sec-
ond sensor and energy acquired from the first sensor. The 6

plot on the right side shows a smooth decreasing monotonic


Residual

function. It is the result of fusing two features, entropy of 4


the first sensor and arithmetic mean of PSD acquired from
the second sensor.
2
Figure 17 shows two selected nonmonotonic health
indices for the same experiment. The plot on the left side is
0
the result of fusing two features, peak-to-peak and energy
which have been extracted from the second sensor. The
plot on the right side shows another indicator which was 0 1 2 3 4
Time (hours)
the result of fusing two features, namely peak-to-peak and
entropy acquired from the first sensor. Fig. 21 Prediction error for PRONOSTIA data set.
1698 Int J Adv Manuf Technol (2013) 69:16851699

applied for the generated trends for modelling the degra-


dation. These models can be used to identify the current
machinery health status and predict the RUL of the machine.
The algorithm is demonstrated on accelerated degradation
data set of bearings acquired from PRONOSTIA experi-
mental platform and NASA data set where it is shown to
be able to extract interesting signal trends. The results show
that the algorithm is generic and can extract the progression
of the machinery health status in a compact form. The gen-
erated trends have been used also for model building and
validation.

References

1. Agogino A, Tumer K (2006) Entropy based anomaly detec-


tion applied to space shuttle main engines. IEEE aerospace
Fig. 22 Prediction error curves for NASA data set. conference
2. Brown D, Georgoulas G, Chen R, Ho YH, Tannenbaum
G, Schroeder J (2009) Particle filter based anomaly detec-
tion for aircraft actuator system. IEEE aerospace conference,
We first calculate the coefficients with minimum error Monatan
as shown in the previous section. Figure 20a, b shows the 3. Camci F, Medjaher K, Zerhouni N, Nectoux P (2012) Feature
effect of using regression parameter where the minimum evaluation for effective bearing prognostics. Quality and reliabil-
error has been selected for the regression modelling for both ity engineering international 29. Article first published online: 16
Mar 2012. doi:10.1002/qre.1396
PRONOSTIA and NASA, respectively. 4. Cempel C (2003) Multidimensional condition monitoring of
Finally, we test the model using test trends for both data mechanical systems in operation. Mech Syst Signal Process
sets and calculate the mean absolute error (8). 17:12911303
5. Cempel C (2003) Multifault condition monitoring of mechanical
mean(abs(Expected Real)). (8) systems in operation, vol 17. IMEKO XVII, Croatia, pp 14
6. Charbonnier S, Garcia-Beltan C, Cadet C, Gentil S (2005) Trends
extraction and analysis for complex system monitoring and deci-
Figure 21 shows the results of PRONOSTIA predicted sion support. Eng Appl Artif Intel 18:2136
trends. The results show that the estimation was so close to 7. Cottrell M, Gaubert P, Eloy C, Franscois D, Hallaux G, Lacaille J,
the test values with absolute error = 0.4165. Verleysen M (2009) Fault prediction in aircraft engines using self-
organizing maps. Proceedings of the 7th international workshop
Figure 22 shows the results of NASA predicted trends.
on advances in self-organizing maps, pp 3744
The results show that the estimation was so close to the test 8. Filev DP, Tseng F (2006) Novelty detection based machine health
values with absolute error = 0.0751. prognostics. Int Symp Evolving Fuzzy Syst 193199
9. Geramifard O, Xu JX, Pang CK, Zhou JH, Li X (2010) Data-
driven approaches in health condition monitoringa comparative
study. 8th IEEE international conference on control and automa-
5 Conclusion tion, pp 16181622
10. Geramifard O, Xu JX, Zhou JH, Li X (2010) Continuous health
In this work, a health estimation model based on unsu- assessment using a single hidden Markov model. 11th interna-
pervised trend extraction algorithm from sensory data has tional conference on control automation robotics and vision, pp
13471352
been proposed. The generated trends showed a smooth
11. Geramifard O, Xu JX, Zhou JH, Li X (2011) Continuous
representation of the progression of machinery health health condition monitoring: a single hidden semi-Markov model
status and have been used for health degradation modelling. approach. IEEE conference on prognostics and health manage-
The proposed method is based on extracting successive ment, pp 110
12. Heng A, Zhang S, Tan ACC, Mathew J (2009) Rotating machinery
multidimensional features from machinery sensory signals
prognostics: state of the art, challenges and opportunities. Mech
acquired from machines critical components. Then, unsu- Syst Signal Process 23:724739
pervised feature selection on the domain features is applied 13. Heng R, Nor M (1998) Statistical analysis of sound and vibration
without making any assumptions concerning the source of signals for monitoring rolling element bearing condition. Appl
the signals and the number of the extracted features. An Acoust 53:211226
14. Huang NE, Shen Z, Long SR, Wu MC, Shih HH, Zheng Q, Yen
EMD algorithm is applied on the projected features with the NC, Tung CC, Liu HH (1998) The empirical mode decomposition
purpose of following the evolution of data in a compact rep- and the Hilbert spectrum for nonlinear and non-stationary time
resentation over time. Finally, a ridge regression has been series analysis. Proc R Soc London 454:903995
Int J Adv Manuf Technol (2013) 69:16851699 1699

15. ISO, Condition Monitoring and Diagnostics of Machines - 29. Saha B, Goebel K, Poll S, Christophersen J (2009) Prognostics
Prognostics part 1: General Guidelines, (2004) In: ISO13381 methods for battery health monitoring using a Bayesian frame-
1:2004(E). Volume: ISO/IEC Directives Part 2, Page 14 work. IEEE Trans Instrum Meas 58:291296
16. Jardine AKS, Lin D, Banjevic D (2006) A review on machin- 30. Saha B, Goebel K, Poll S, Christopherson J (2007) A Bayesian
ery diagnostics and prognostics implementing condition-based framework for remaining useful life estimation. Proceedings Fall
maintenance. Mech Syst Signal Process 20:14831510 AAAI symposium: AI for prognostics. Arlington
17. Jiang H, Penman J (1993) Using Kohonen feature maps to moni- 31. Schwabacher M (2005) A survey of data-driven prognostics.
tor the condition of synchronous generators. Workshop on neural AIAA InfoTech Aerospace
network applications and tools, pp 8994 32. Svensson M, Rognvaldsson T, Byttner S, West M, Andersson B
18. Jie S, Hong GS, Rahman M, Wong YS (2002) Feature extraction (2011) Unsupervised deviation detection by GMMa simulation
and selection in tool condition monitoring system. Advances in study. IEEE international symposium on diagnostics for electric
artificial intelligence, pp 487497 machines, power electronics and drives, pp 5154
19. Jolliffe IT (2002) Principal component analysis and factor anal- 33. Tavares G, Zsigraiova Z, Semiao V, da Graca Carvalho M (2011)
ysis, chap 7. Principal component analysis. Springer series in Monitoring, fault detection and operation prediction of MSW
statistics, pp 150166. Springer, New York incinerators using multivariate statistical methods. Waste Manag
20. Kothamasu R, Huang SH, VerDuin WH (2006) System health 31:16351644
monitoring and prognostics: a review of current paradigms and 34. Tobon-Mejia D, Medjaher K, Zerhouni N, Tripot G (2012) A data-
practices. Int J Adv Manuf Technol 28:10121024 driven failure prognostics method based on mixture of Gaussians
21. Li R, Sopon P, He D (2012) Fault features extraction for bearing hidden Markov models. IEEE Trans Reliab 61:491503
prognostics. J Intell Manuf 313321 35. Wald R, Khoshgoftaar TM, Sloan JC (2011) Using feature selec-
22. Medjaher K, Tobon-Mejia D, Zerhouni N (2012) Remaining use- tion to determine optimal depth for wavelet packet decomposition
ful life estimation of critical components with application to of vibration signals for ocean system reliability. IEEE 13th inter-
bearings. IEEE Trans Reliab 61:292302 national symposium on high-assurance systems engineering, pp
23. Miao Q, Wang D, Pecht M (2010) A probabilistic description 236243
scheme for rotating machinery health evaluation. J Mech Sci 36. Ying P, Ming D, Ming JZ (2010) Current status of machine prog-
Technol 24:24212430 nostics in condition-based maintenance: a review. Int J Adv Manuf
24. Mosallam A, Byttner S, Svensson MTR (2011) Nonlinear relation Technol 50:297313
mining for maintenance prediction. IEEE aerospace conference, 37. Yang Z, Zhong J, Wong SF (2011) Machine learning method
pp 19. doi:10.1109/AERO.2011.5747581 with compensation distance technique for gear fault detec-
25. Moshou D, Kateris D, Sawalhi N, Loutridis S, Gravalos I (2010) tion. 9th world congress on intelligent control and automation,
Fault severity estimation in rotating mechanical systems using pp 632637
feature based fusion and self-organizing maps. Artificial Neural 38. Zhang S (1994) Function estimation for multiple indices trend
Networks, pp 410413 analysis using self-organizing mapping. IEEE symposium on
26. Nectoux P, Gouriveau R, Medjaher K, Ramasso E, Chebel- emerging technologies and factory automation, pp 160165
Morello B, Zerhouni N, Varnier C (2012) PRONOSTIA: an exper- 39. Zhang S, Hodkiewicz M, Ma L, Mathew J, Kennedy J,
imental platform for bearings accelerated degradation tests. IEEE Tan A, Anderson D (2006) Machinery condition prognosis
international conference on prognostics and health management, using multivariate analysis. Engineering asset management, pp
PHM12, Denver 847854
27. Ocak H, Loparo KA, Discenzo FM (2007) Online tracking of 40. Zhang X, Xu R, Kwan C, Liang S, Xie Q, Haynes L (2005) An
bearing wear using wavelet packet decomposition and probabilis- integrated approach to bearing fault diagnostics and prognostics.
tic modeling: a method for bearing prognostics. J Sound Vib Proceedings of the 2005 American control conference, pp 2750
302:951961 2755
28. Pontoppidan NH, Larsen J (2003) Unsupervised condition change 41. Zhou JH, Pang CK, Lewis FL, Zhong ZW (2009) Intelligent
detection in large diesel engines. IEEE 13th workshop on neural diagnosis and prognosis of tool wear using dominant feature
networks for signal processing, pp 565574 identification. IEEE Trans Ind Inform 454464

You might also like