Chapter 14

Prognostics and Health Management

M.G. Pecht

Abstract Knowledge of an LED's life-cycle loading conditions, geometry, and
material properties is needed to identify potential failure mechanisms and
estimate its remaining useful life. The physics-of-failure (PoF) approach
considers qualification an integral part of design and development: it involves
identifying the root causes of failure and developing qualification tests that
focus on those particular issues. PHM-based qualification, combined with the
PoF qualification process, can enhance the evaluation of LED reliability under
actual life-cycle conditions by assessing degradation, detecting early failures,
estimating lifetime, and mitigating the risks of LED-based products. Aging test
conditions designed with PHM-based qualification are also more representative
of the LEDs' final usage conditions.

14.1 Introduction

This section introduces prognostics and health management as a means to improve
LED reliability and qualification techniques. Prognostics and health management
(PHM) comprises health management and prognostics. Health management is based
on health monitoring, defined as the ability to sense the instantaneous
condition of the product, that is, in situ performance monitoring. Prognostics
is defined as the ability to extrapolate forward to predict remaining useful
life (RUL). The purpose of developing PHM is to assess the degree of deviation
or degradation from an expected normal operating condition for electronics.

M.G. Pecht (*)


Center for Advanced Life Cycle Engineering (CALCE), University of Maryland,
College Park, MD 20742, USA
Center for Advanced Life Cycle Engineering (CALCE), Engineering Lab, University
of Maryland, Room S1103, Building 089, College Park, MD 20742, USA
e-mail: pecht@calce.umd.edu

The goals of PHM include [1]:
• Providing advance warning of failures
• Minimizing unscheduled maintenance, extending the maintenance cycle, and
performing repair actions effectively
• Reducing the life-cycle costs of equipment
• Improving qualification and aiding the design and logistical support of
future products
Prognostics requires a sensing capability to monitor the history of stress
exposures throughout the life cycle, as well as a model-based capability and/or
other suitable method to assess the life consumed and the life remaining.
Approaches to prognostics are classified into PoF-based prognostics
(quantitative and proactive), data-driven prognostics, and fusion prognostics,
which combines the advantages of the PoF and data-driven approaches.
Data-driven prognostics uses statistics and probability to analyze current and
historical data to estimate RUL.

14.2 PoF-Based Prognostics

PoF-based prognostics utilize knowledge of a product’s life cycle loading


conditions, geometry, material properties, and failure mechanisms to estimate its
remaining useful life. PoF utilization in PHM includes the following [2]:
• Virtual life assessment with design data and expected life-cycle conditions
• Identification of critical failure mechanisms (through FMMEA: failure modes,
mechanisms, and effects analysis)
• Selection of precursor parameters to monitor
• Development and implementation of canaries
• Calculation of remaining useful life (RUL)
Based on the monitored operational and environmental data, the health status of
an electronic product can be assessed, and the damage to parts or the product
can be evaluated with PoF-based physical models to obtain the RUL. The
PoF-based PHM methodology is summarized in Fig. 14.1.
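To make this concrete, the following is a minimal sketch of PoF-based RUL
estimation, assuming a Coffin-Manson-type fatigue model combined with Miner's
linear damage accumulation rule (a common pairing for thermal-cycling fatigue).
The model constants and the monitored load history are illustrative
placeholders, not values from this chapter.

# PoF-based RUL sketch: Coffin-Manson fatigue model + Miner's rule
def cycles_to_failure(delta_T, C=1e5, m=2.0):
    # Coffin-Manson-type life model: Nf = C * delta_T**(-m)
    # (C and m are illustrative, not calibrated constants)
    return C * delta_T ** (-m)

def remaining_useful_life(observed_cycles, future_delta_T):
    # observed_cycles: list of (delta_T, n_cycles) from in situ monitoring
    # Miner's rule: accumulate fractional damage n / Nf per load level
    damage = sum(n / cycles_to_failure(dT) for dT, n in observed_cycles)
    if damage >= 1.0:
        return 0.0  # failure criterion already reached
    # cycles remaining until accumulated damage reaches 1.0
    return (1.0 - damage) * cycles_to_failure(future_delta_T)

history = [(40.0, 5000), (60.0, 1200)]  # (delta_T in K, cycle count)
print(remaining_useful_life(history, future_delta_T=50.0))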
Canary birds were historically used in coal mines to detect the presence of
hazardous gases. Because canaries are more sensitive to these gases than
humans, the death or sickening of a canary served as an early warning to miners
to leave the shaft. In PHM, a canary refers to an embedded device used to
predict degradation and provide early warning of impending failure of its host.
Canary devices sense stress conditions in the host and degrade faster than the
host system, so that impending catastrophic failure can be anticipated and
preempted before it occurs.
Reliability is a foremost concern for many companies, especially in the
aerospace, medical, and military industries, because product failure during
operation can be catastrophic, and regular maintenance is not always safe or
economical. In this setting, the benefits of canary devices are:

Fig. 14.1 PoF-based PHM methodology

• A physical mechanism that directly measures cumulative environmental exposure
indicates when a system may soon fail.
• Canaries store the environmental life history of equipment for
troubleshooting and repair.
• Canaries provide information on suitable qualification test levels.
• Canaries offer data that can be used to make real-time adjustments to other
predictive methods, such as PoF and empirical approaches.
Expendable canaries can be divided into overstress canaries and wear-out
canaries. Overstress failure occurs when stress exceeds strength; overstress
failures include dielectric breakdown, electrostatic discharge (ESD), and die
fracture. Overstress canaries are developed for large stress events that can
cause latent damage and subsequent premature failure, or are designed to act as
a sacrificial element that eliminates the stress-flow path before the
overstress event can damage costly functional elements. Wear-out failure is
caused by a gradual increase of cumulative damage; examples of wear-out failure
are electromigration, interconnect fatigue, Sn whisker growth, corrosion, and
time-dependent dielectric breakdown caused by tunneling mechanisms. Wear-out
canaries are developed for accelerated tracking of cumulative damage under
life-cycle stresses.
Technically, a canary can be any device that wears out faster than the actual
product. The approach for controlled error-seeding in canaries includes three
interrelated techniques that can be used individually or synergistically to
enhance the damage accumulation rates in the canaries: geometric error-seeding,
material error-seeding, and load error-seeding.
• Geometry error-seeding: the canary geometry is designed to increase stress
conditions at the failure site beyond the levels experienced in corresponding
functional elements. Canary solder joints can be designed with a lower height
than normal ones to attain faster degradation rates, and canaries for
electrochemical migration are designed with closer spacing to increase
degradation rates.
• Material error-seeding: the composition and microstructure of the canary can
be tailored to alter material properties such as dielectric constant,
dielectric strength, glass-transition temperature, diffusivity, creep
resistance, ductility, and fracture toughness. Preliminary concepts are being
explored for tin whisker canaries using compositional gradient libraries
deposited on glass substrates.
• Load error-seeding: the canary will be subjected to higher load levels than
functional elements. Canaries for conductive filament formation in metal traces
will be subjected to higher voltage gradients than normal. Electromigration
canaries in solder and die metallization will be subjected to higher current
densities than normal. Microvia fatigue canaries will be subjected to higher
current swings.
The design steps for expendable canaries include the following (a minimal
sketch of the last step is given after the list):
• Identify the failure mechanisms of the host system.
• Determine which governing parameters or equations (material properties,
physical size, usage, and environmental conditions) affect these failure
mechanisms.
• Design canaries with adjusted governing parameters.
• Determine the appropriate equipment for (a) measuring these governing
parameters and (b) applying accelerated or in situ loading stress.
• Conduct experiments to find the coefficients in the governing equations.
• Develop a model that correlates the failure of the canaries with that of the
host system, so that RUL can be quantified based on the health state of the
canaries.
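As a hedged illustration of that last step, the sketch below assumes the
calibration experiments yield a single constant acceleration factor between
canary and host damage rates; real canary-to-host models are generally
mechanism-specific.

# Canary-to-host RUL sketch under an assumed constant acceleration factor
def host_rul_at_canary_failure(canary_failure_time, acceleration_factor):
    # If the canary accumulates damage AF times faster than the host,
    # the host is expected to fail near AF * t_canary, leaving
    # (AF - 1) * t_canary of remaining life when the canary fails.
    if acceleration_factor <= 1.0:
        raise ValueError("a canary must degrade faster than its host")
    return (acceleration_factor - 1.0) * canary_failure_time

# Example: canary solder joints fail at 2,000 h; calibration gave AF = 3
print(host_rul_at_canary_failure(2000.0, acceleration_factor=3.0))  # 4000.0 h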
Sensory canaries are inspired by biological systems, focusing on self-cognizant
systems with in situ canary capabilities to look, listen, smell, and feel for
signs of degradation and impending failure. Guidelines for sensory canaries are
being developed to make the canary approach generic for both new and legacy
information systems:
• Infrared canaries look for degradation in microprocessors based on changes in
thermal dissipation.
• Impedance spectroscopy and time-domain reflectometry listen for defects in
signal traces and wiring harnesses.
• Acoustic sensors listen for delamination and cracking.
• MEMS-based chemical canaries smell for out-gassing products.
• Piezoelectric or piezoresistive canaries touch and feel for signs of
delamination.
Conjugate-stress canaries can be developed to provide prognostic assessments
based on simultaneous identification of conjugate-stress pairs (e.g., stress
and strain; temperature gradient and heat flux; voltage and charge flux
density; and magnetic field and magnetic induction), using novel dual-field
detector pair concepts. These canaries provide model-based fusion prognostic
assessments of RUL by:
• Providing stress histories for damage accumulation models
• Monitoring intrinsic changes in material properties due to damage (e.g.,
stiffness, thermal/electrical conductivity, and dielectric constants)
• Monitoring other damage metrics, e.g., hysteretic energy dissipation at the
failure site
Canaries built into the same system can be interconnected to form a built-in
canary network using wireless, wired, or optical-fiber communication systems.
A canary network has advantages over an individual canary because it can cover
a much wider area of communication and provide distributed early warnings of
failures.
In summary, PHM is attracting increasing attention from industry due to the
demand for reliable products from both consumers and critical applications such
as military, aerospace, and nuclear power plants. As a PHM approach, the canary
has an intrinsic capability of providing advance warning of host-system failure
and prediction of its health state, by accelerating the degradation rates
within the canary and providing more information about the actual life-cycle
stresses at potential failure sites. Canaries should degrade faster than their
host systems under the same loading conditions.

14.3 Data-Driven Approaches for PHM

Data-driven techniques (also known as empirical approaches) use historical infor-


mation to statistically and probabilistically determine anomalies and make
predictions about the RUL of systems [3]. Data-driven techniques are needed for
the following reasons:
• As systems become increasingly complex, performing PHM efficiently and
cost-effectively becomes a challenge.
• Conducting FMMEA may not be cost-effective for a complex system.
• The only information available regarding the system may be performance data.
• Data-driven approaches are useful for complex systems where knowledge of the
underlying physics is absent and when the health of large multivariate systems
is to be assessed.
• Data-driven techniques are capable of intelligently detecting and assessing
correlated trends in the system dynamics to estimate the current and future
health of the system.
Prognostics includes the steps of anomaly detection, diagnosis, and prognosis,
as shown in Fig. 14.2. In the anomaly detection step, an anomaly in the system
of interest is detected. The goal of anomaly detection is to extract underlying
structural information from the data, to define the normal structure, and to
identify departures from that normal structure [4].

Fig. 14.2 PHM cycle

In the diagnosis step, the fault is identified and isolated. The prognosis step
predicts a failure. The prediction can be based on a comparison of the current
state of the system with the expected normal state, in addition to the
continued tendency of the system to deviate from the expected normal state.
Statistical methods are composed of parametric methods and nonparametric
methods [5]. Parametric methods assume that the data are drawn from a certain
distribution (for example, the Gaussian distribution) whose parameters (such as
the mean and the standard deviation) are calculated from the data.
Nonparametric methods do not make any assumptions regarding the underlying
distribution of the data; they draw their strength from the data and its
inherent features (e.g., the Mahalanobis distance).
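As an illustration of such a feature, the sketch below scores test observations
by their Mahalanobis distance from a healthy baseline data set; the synthetic
data and the threshold are illustrative assumptions.

# Mahalanobis-distance anomaly scoring against a healthy baseline
import numpy as np

def mahalanobis_scores(baseline, observations):
    mu = baseline.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(baseline, rowvar=False))
    diff = observations - mu
    # per-row sqrt(d' * S^-1 * d)
    return np.sqrt(np.einsum("ij,jk,ik->i", diff, cov_inv, diff))

rng = np.random.default_rng(0)
healthy = rng.normal(0.0, 1.0, size=(500, 3))    # baseline (healthy) data
test = np.vstack([rng.normal(0.0, 1.0, (5, 3)),  # normal instances
                  rng.normal(6.0, 1.0, (2, 3))]) # shifted (anomalous) ones
print(mahalanobis_scores(healthy, test) > 3.5)   # illustrative threshold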
Machine learning (ML) algorithms recognize patterns in data and make decisions
on the state of the system based on the data [6]. The general procedure for
learning algorithms is shown in Fig. 14.3. The three types of learning
algorithms are supervised, semi-supervised, and unsupervised techniques.
The translation from raw data to meaningful information may be achieved using
techniques such as classification, clustering, regression, and ranking. ML
based on statistical methods is well suited for PHM because it is capable of
actively learning about the system and its dynamics, faults, and failures. ML
techniques can handle the increasing complexity of system information, and ML
is useful for real-time analysis.

Fig. 14.3 Machine learning algorithms

Prognostic measurements are processed by identifying new nonzero states,
changes in state probabilities, changes in the amount of time a system can stay
in a state, changes in the time and probability to reach a particular state,
and the time to reach a particular state (a sketch of the last measure
follows). An example of data-driven prognostics is shown in Fig. 14.4.
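The expected time to reach a particular (failure) state can be computed from a
state-transition model. The sketch below assumes a discrete-time Markov chain
with an absorbing failure state; the transition probabilities are illustrative,
not from this chapter.

# Expected time to reach an absorbing failure state in a Markov model
import numpy as np

P = np.array([[0.95, 0.04, 0.01],   # healthy  -> healthy/degraded/failed
              [0.00, 0.90, 0.10],   # degraded -> ...
              [0.00, 0.00, 1.00]])  # failed is absorbing

# For the transient states, the expected steps-to-absorption vector t
# solves (I - Q) t = 1, the standard absorbing-chain result.
Q = P[:2, :2]
t = np.linalg.solve(np.eye(2) - Q, np.ones(2))
print(t)  # expected steps to failure from the healthy and degraded states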
Data-driven algorithms used at the Center for Advanced Life Cycle Engineering
(CALCE) for prognostics include [3]:
• Mahalanobis distance clustering
• Principal component analysis (PCA)
• Support vector machines (SVM)
• Sequential probability ratio test (SPRT)
• Gaussian processes (GPs)
• Bayesian support vector machines (BSVM)
• Neural networks (NN)
• Self-organizing maps (SOM)
• Particle filtering (PF)
These algorithms are not covered in this chapter; the reader is referred to
Prognostics and Health Management of Electronics (M.G. Pecht, Wiley, 2008)
[1–3].
Anomaly detection is required to perform the data-driven PHM techniques shown
in Fig. 14.5. Data-driven PHM is performed in the following steps: collection
of raw data, feature selection, anomaly detection, diagnostics, and
prognostics.
Input data can be classified into categorical data and real-valued data, as
shown in Fig. 14.6. Categorical data is the part of an observed dataset that
consists of categorical variables (variables assessed on a nominal scale) or
data that has been converted into that form (e.g., grouped data) [4].
Fig. 14.4 Example of data-driven technique

Fig. 14.5 Data-driven PHM flow

Fig. 14.6 Nature of input data



Real-valued (continuous) measurements are collected from sensors that measure
physical properties such as voltage, current, and speed. They have
traditionally been the primary data source for monitoring applications because
they allow one to trend subtle changes over time. Categorical data can include
error logs, fault messages, and warnings that are either textual in nature or
binary flags. Some fault messages can be triggered, for example, when
real-valued measurements exceed certain thresholds or, more generally, when the
subsystem behaves outside preset operating parameters. Real-valued data are
often preprocessed prior to their usage to enhance their usefulness in
prognostic applications. Understanding the data requires acquiring the
following information:
• Meaning of each variable
• Data formatting (software reads correctly)
• Ranges of variables
• Duplications
• Outliers (e.g., errors)
• Graphics and summaries
• Domain knowledge
Data preparation needs:
• Choice of variables
• Choice of scales (continuous/categorical)
• Binning
• Missing values
– Extent/type
– Drop observations or drop variables (replace with dummy)
– Impute (mean, regression, more advanced methods)
• Explanatory vs. predictive
• Creating derived variables
Some preprocessing techniques, including outlier removal, noise reduction, and
transformation into other domains, are used to select features of the data.
Examples of outlier removal, filtering, and domain transformation are shown in
Fig. 14.7. An outlier is a value far away from most others in a set of data [5]
(for example, a temperature reading of 2,000 °C in a computer). An anomaly is
defined as a deviation or departure from the normal order.

Fig. 14.7 Outlier removal, filtering, and transformation of domain for data
preprocessing
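A minimal sketch of these preprocessing steps follows: z-score outlier removal,
a moving-average noise filter, and transformation into the frequency domain.
The thresholds and window sizes are illustrative assumptions.

# Feature-selection preprocessing: outlier removal, filtering, transform
import numpy as np

def remove_outliers(x, z_max=4.0):
    # drop points more than z_max standard deviations from the mean
    z = np.abs(x - x.mean()) / x.std()
    return x[z < z_max]

def moving_average(x, window=5):
    # simple noise-reduction filter
    return np.convolve(x, np.ones(window) / window, mode="valid")

rng = np.random.default_rng(1)
signal = 25.0 + rng.normal(0.0, 0.5, 256)  # e.g., a temperature channel
signal[100] = 2000.0                       # the 2,000 °C outlier from the text
cleaned = moving_average(remove_outliers(signal))
spectrum = np.abs(np.fft.rfft(cleaned))    # transformation into another domain
print(cleaned.shape, spectrum.shape)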
Anomaly detection is finding patterns in data that do not conform to expected
behavior. Anomalies in data provide significant, and often critical, information in a
wide variety of application domains. Examples of applications are [4]:
• Fault detection (spacecraft, airplanes, and laptop computers)
• Fraud detection in credit cards, insurance, or health care
• Medical diagnosis and public safety (disease outbreaks)
• Intrusion detection (cyber security)
• Military surveillance
Fig. 14.8 Example of point anomalies

Fig. 14.9 Example of contextual anomalies

Types of anomalies can be divided into point anomalies, contextual anomalies,
and collective anomalies [4]. A point anomaly is an individual data instance
that is anomalous compared to the rest of the data, as shown in Fig. 14.8. A
contextual anomaly is a data instance that is anomalous only in a particular
context, as shown in Fig. 14.9: a high temperature in the month of January is
anomalous, although the same high temperature in the month of July is not. A
collective anomaly is a collection of related data instances that is anomalous,
as shown in Fig. 14.10; the individual data instances may not be anomalous by
themselves.
Fig. 14.10 Example of collective anomalies

Fig. 14.11 Hypothetical example

Machine learning techniques can be divided into supervised, semi-supervised,
and unsupervised algorithms [6]. Supervised learning techniques require a
training data set that has labeled data for the normal as well as the anomaly
classes. Semi-supervised learning techniques can use training data that has
labeled instances only for the normal class. Unsupervised learning techniques
may not require training data; they assume that normal instances are more
frequent than anomalies. Machine learning techniques can handle the increasing
complexity of system information; in other words, machine learning for PHM can
actively learn the system and its dynamics, faults, and failures.
Techniques for point anomaly detection include classification-based, nearest
neighbor, clustering, statistical (e.g., hypothesis testing), and spectral
techniques [4]. Input data can be collected by building a matrix whose columns
contain variables and whose rows contain instances. An example, temperature as
a function of acceleration for some system, is shown in Fig. 14.11.

Fig. 14.12 Multi-class anomaly detection

Fig. 14.13 One-class anomaly detection

Classification-based anomaly detection builds a classification model for normal
and anomalous events based on labeled training data, and uses it to classify
each test instance. The assumption is that a classifier that can distinguish
between the normal and anomalous classes can be learned with a given training
set. There are two classification-based techniques in terms of the training
data available:
• Multi-class training is capable of operating in semi-supervised or supervised
mode.
• One-class training can operate in semi-supervised or unsupervised mode.
The multi-class technique assumes the training data contain instances belonging
to multiple normal classes; test data are anomalous if they belong to none of
the normal classes, as shown in Fig. 14.12. The one-class technique assumes all
training data belong to only one normal class, as shown in Fig. 14.13.

Fig. 14.14 Support vector machines

The algorithms used in classification-based techniques are neural
network-based, Bayesian network-based, support vector machine (SVM), and
rule-based algorithms. An example of an SVM is shown in Fig. 14.14. The neural
network-based algorithm works in both multi-class and one-class settings, in
two steps:
• First, a neural network is trained on the normal training data to learn the
different classes.
• Second, each test instance is provided as an input; if the network accepts
the test input, it is normal.
The Bayesian network-based algorithm works in the multi-class setting. It
estimates the probability that the test instance belongs to the normal or the
anomaly class label, and it assumes independence between the different
attributes. An SVM creates a boundary around the region containing the training
data, determines whether a test instance falls within that boundary, and
declares it anomalous if it does not (a hedged one-class sketch follows this
list). The rule-based algorithm works in multi-class as well as one-class
settings, in two steps:
• Learn rules regarding the normal behavior of a system from training data
(e.g., by using decision trees).
• Find the rule that best captures each test instance.
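The following is a minimal sketch of the one-class setting described above,
assuming scikit-learn is available; the training data and parameters are
illustrative.

# One-class SVM: learn a boundary around normal data, flag outsiders
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(2)
normal_train = rng.normal(0.0, 1.0, size=(300, 2))  # labeled normal only

clf = OneClassSVM(kernel="rbf", nu=0.05).fit(normal_train)

test = np.array([[0.1, -0.2],   # inside the learned boundary -> normal
                 [5.0, 5.0]])   # outside the boundary -> anomalous
print(clf.predict(test))        # +1 = normal, -1 = anomaly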
Nearest neighbor-based anomaly detection assumes that normal data instances
occur in dense neighborhoods, while anomalies occur far from their closest
neighbors [7]. The concept is shown in Fig. 14.15, where each circle
corresponds to a group of nearest neighbors. Nearest neighbor-based anomaly
detection utilizes a distance/similarity measure between data instances. The
two-step approach includes:

Fig. 14.15 Nearest neighbor

• Compute the neighborhood for each data record.
• Analyze the neighborhood to determine whether the data instance is an anomaly
or not.
This can result in misclassification if normal instances do not have sufficient
neighbors or if anomalies have close neighbors.
Nearest neighbor-based techniques are categorized into kth-nearest-neighbor and
relative-density-based techniques. In the kth-nearest-neighbor technique (a
minimal sketch follows below),
• the distance of the test instance to its kth nearest neighbor is calculated,
and
• to determine whether the test instance is anomalous, a threshold value is
chosen based on experience.
In the relative-density-based technique,
• the density of the neighborhood of each data instance is estimated, and
• a test instance in a low-density neighborhood is declared anomalous, while an
instance that lies in a dense neighborhood is declared normal.
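A minimal sketch of the kth-nearest-neighbor variant follows; the value of k
and the decision threshold are experience-based assumptions.

# kth-nearest-neighbor anomaly score: distance to the kth closest point
import numpy as np

def knn_score(train, x, k=5):
    d = np.sort(np.linalg.norm(train - x, axis=1))
    return d[k - 1]  # distance to the kth nearest neighbor

rng = np.random.default_rng(3)
train = rng.normal(0.0, 1.0, size=(200, 2))
print(knn_score(train, np.array([0.0, 0.0])))  # small -> dense region
print(knn_score(train, np.array([6.0, 6.0])))  # large -> likely anomalous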
Clustering-based anomaly detection primarily utilizes unsupervised or
semi-supervised techniques to group similar data instances into clusters
[4, 7]. It is distinct from the nearest neighbor-based technique in that it
evaluates each instance with respect to the cluster it belongs to, whereas the
nearest neighbor-based technique analyzes each instance with respect to its
local neighborhood. Several techniques are effective only when the anomalies do
not form significant clusters among themselves. Three categories of detection
are used, with different assumptions (a sketch based on the category 2
assumption follows the list):
• Assumption of category 1: normal data instances belong to a cluster in the
data, while anomalies do not.
• Assumption of category 2: normal data instances lie close to their nearest
cluster centroid, while anomalies are far away.
• Assumption of category 3: normal data instances belong to large and dense
clusters, while anomalies belong to either small or sparse clusters.
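As a hedged sketch of the category 2 assumption, the code below clusters
(assumed normal) training data with k-means and scores a test instance by its
distance to the nearest centroid; the cluster count and data are illustrative,
and scikit-learn is assumed available.

# Clustering-based detection: distance to the nearest cluster centroid
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(4)
data = np.vstack([rng.normal(-3.0, 0.5, (100, 2)),
                  rng.normal(+3.0, 0.5, (100, 2))])
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(data)

def centroid_distance(x):
    return np.linalg.norm(km.cluster_centers_ - x, axis=1).min()

print(centroid_distance(np.array([3.1, 2.9])))  # near a centroid -> normal
print(centroid_distance(np.array([0.0, 8.0])))  # far away -> anomalous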
Statistical methods rest on the principle that an anomaly is an observation
suspected of being partially or wholly irrelevant because it is not generated
by the assumed statistical distribution [4, 5]. The assumption is that normal
data instances occur in the high-probability regions of the distribution, while
anomalies occur in its low-probability regions. Statistical methods fit a
statistical model to the given data (usually for normal behavior) and apply a
statistical inference test to determine whether a test instance belongs to this
model. The confidence interval associated with anomalies can be used as
additional information when making a decision. The two categories are:
• Parametric techniques
– Assumption: normal data are generated by a parametric distribution with
parameters Θ and probability density function f(x; Θ), where x is an
observation.
– The parameters are estimated from the given data, and a statistical
hypothesis test is used for anomaly detection (a minimal sketch follows this
list).
• Nonparametric techniques
– The data structure is not defined a priori but is instead determined from the
given data.
– These typically make fewer assumptions regarding the data.
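A minimal parametric sketch follows: a Gaussian is fitted to (assumed normal)
training data, and a test instance is flagged when a two-sided test places it
in the distribution's low-probability tails. The cutoff is an illustrative
assumption, and SciPy is assumed available.

# Parametric statistical detection: Gaussian fit + tail-probability test
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
train = rng.normal(50.0, 2.0, 1000)    # normal operating data
mu, sigma = train.mean(), train.std()  # estimated parameters

def is_anomalous(x, alpha=0.001):
    # two-sided tail probability of observing something as extreme as x
    p = 2 * stats.norm.sf(abs(x - mu) / sigma)
    return p < alpha

print(is_anomalous(51.0))  # False: high-probability region
print(is_anomalous(65.0))  # True: low-probability region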
Spectral anomaly detection techniques assume that the data can be embedded into
a lower-dimensional subspace in which normal instances and anomalies appear
significantly different. The approach is to determine the subspaces in which
the anomalous instances can be easily identified. For example, principal
component analysis (PCA) can be used to find the projections along subspaces
that separate the anomalies based on variance. Spectral techniques can also
serve as a preprocessing step, with an existing anomaly detection technique
applied in the transformed space.
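The sketch below illustrates the spectral idea under a simple assumption:
normal data lie near a low-dimensional principal subspace, so the
reconstruction error in the discarded directions serves as the anomaly score.
The component count and data are illustrative.

# Spectral (PCA) detection: residual distance from the principal subspace
import numpy as np

def pca_residual_scores(train, test, n_components=1):
    mu = train.mean(axis=0)
    _, _, Vt = np.linalg.svd(train - mu, full_matrices=False)
    V = Vt[:n_components].T        # retained principal directions
    centered = test - mu
    recon = centered @ V @ V.T     # projection onto the subspace
    return np.linalg.norm(centered - recon, axis=1)

rng = np.random.default_rng(6)
t = rng.normal(0.0, 1.0, 300)
train = np.column_stack([t, 2.0 * t + rng.normal(0.0, 0.1, 300)])
test = np.array([[1.0, 2.0],    # on the main variance direction -> low score
                 [1.0, -2.0]])  # off the subspace -> high score
print(pca_residual_scores(train, test))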
Examples of problem settings for different data sets are discussed here. In
data set 1, shown in Fig. 14.16, normal data are generated from a Gaussian
distribution, and anomalies are generated from another Gaussian distribution
whose mean is far from the first; a training data set drawn from the normal
data is available. In this case, all of the discussed anomaly detection
techniques are able to detect the anomalies. In data set 2, shown in
Fig. 14.17, normal data are generated by a large number of Gaussian
distributions. The one-class classification technique fails to detect the
anomalies, while the multi-class classification technique will detect them;
clustering-based, nearest neighbor-based, and spectral techniques will also
detect these anomalies. In data set 3, shown in Fig. 14.18, anomalous instances
form a tight cluster of significant size at the center. Clustering-based and
nearest neighbor-based techniques will treat these anomalies as normal; a
spectral technique will perform better in detecting them.

Fig. 14.16 Data set 1

Fig. 14.17 Data set 2

Classification-based techniques require labeled training data for both the
normal and anomaly classes [8]. Nearest neighbor- and clustering-based
techniques suffer when the number of dimensions is high. When identifying a
good distance measure is difficult, classification-based and statistical
techniques are better choices. Statistical techniques are effective with
low-dimensional data and when the statistical assumptions hold true. Spectral
techniques are good only if the anomalies are separable from the normal states
in the projected subspaces.
The previous techniques primarily focus on detecting point anomalies.
Contextual anomaly detection applies where data instances tend to be similar
within a context.

Fig. 14.18 Data set 3

Contextual anomaly detection techniques are able to detect anomalies that might
not be detected by point anomaly detection techniques, which take a global view
of the data. It is applicable only when a context can be defined. There are two
methods of handling contextual anomalies: conversion to a point anomaly
detection problem and utilization of the structure of the data.
• Conversion to a point anomaly problem (see the sketch after this list):
– Split the data into different contexts or attributes.
– Use point anomaly detection techniques on each of the attributes within a
context.
• Utilization of the structure of the data:
– Used when the data cannot be split into contexts.
– A model is learned from the training data that can predict the expected
behavior with respect to a given context.
– An anomaly is declared if the expected behavior is significantly different
from the observed behavior.
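A minimal sketch of the conversion method follows, using the chapter's
temperature example: readings are split by month (the context) and a
per-context z-score test is applied. The historical data and threshold are
illustrative assumptions.

# Contextual anomaly via conversion to per-context point detection
import numpy as np

history = {  # context -> past normal readings (°C), assumed available
    "January": np.array([-2.0, 0.5, 1.0, -1.5, 0.0]),
    "July": np.array([28.0, 30.5, 29.0, 31.0, 30.0]),
}

def contextual_anomaly(month, reading, z_max=3.0):
    ref = history[month]
    return abs(reading - ref.mean()) / ref.std() > z_max

print(contextual_anomaly("July", 30.0))     # False: normal in this context
print(contextual_anomaly("January", 30.0))  # True: anomalous in January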
Collective anomalies are subsets of instances that occur together as a
collection [4]. Handling collective anomalies is more challenging than point
and contextual anomaly detection. The data are presented as a set of sequences,
and the primary requirement is the presence of a relationship between data
instances. Collective anomalies are detected mostly by building models using
sequential training data. Sequential anomaly detection detects anomalous
sequences or subsequences in a database of sequences. To handle collective
anomalies, the sequences are transformed to a finite feature space; the
sequences may or may not be of the same length. Sequential rules are generated
from a set of normal sequences. The test sequence is compared to the rules, and
an anomaly is declared if it contains patterns for which no rules have been
generated (a minimal sketch follows). For long sequences, one can assume that
normal behavior follows a defined pattern; if a subsequence within the long
sequence does not conform to the pattern, it is declared anomalous.
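A minimal sketch of this rule/pattern approach is given below: sequences are
mapped to fixed-length windows (a finite feature space), and a test sequence is
declared anomalous if it contains a window never seen in the normal training
sequences. The event names and window length are illustrative.

# Sequential (collective) anomaly detection with fixed-length windows
def windows(seq, w=3):
    # all length-w subsequences of a sequence
    return {tuple(seq[i:i + w]) for i in range(len(seq) - w + 1)}

normal = ["login", "open", "read", "write", "close", "logout"]
normal_patterns = windows(normal)  # "rules" generated from normal data

def is_anomalous_sequence(seq):
    # anomalous if any window matches no generated pattern
    return not windows(seq) <= normal_patterns

print(is_anomalous_sequence(["login", "open", "read", "write"]))    # False
print(is_anomalous_sequence(["login", "delete", "read", "write"]))  # True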
The challenges in anomaly detection are:
• It is difficult to define a normal (healthy) operating region that
encompasses every possible normal behavior of the system.
• The boundary between normal and anomalous behavior is often not precise.
• Normal behavior changes with time.
• The definition of an anomaly is application specific (e.g., fluctuations in
body temperature).
• Noise in the data introduces uncertainties that make analysis difficult.
• The availability of labeled data for training and validation of the models
used by anomaly detection techniques is usually a major issue.

14.4 Fusion Prognostics

PoF-based prognostics involve the use of representative models that allow
estimation of damage and degradation in critical components as a function of
the life-cycle loads. The PoF approach utilizes knowledge of a product's
life-cycle loading conditions and material properties to identify critical
failure mechanisms and estimate RUL. The advantages and limitations of
PoF-based prognostics are:
• Advantages:
– Provide estimate of damage and RUL for given loading conditions and failure
modes or mechanisms (in operating and nonoperating state).
– Identify critical components and parameters to be monitored.
– Provide information regarding failure modes and mechanisms that are useful
for root cause analysis.
• Limitations:
– Development of models of the degradation process in a complex system may
be practically infeasible.
– System specific knowledge is necessary to create and use the system models
which may not always be available.
– It is hard for PoF models to detect intermittent failures.
The data-driven approach derives features from product performance data using
statistical and machine learning techniques to estimate deviations of the product
from its healthy state. The advantages and limitations of data-driven prognostics are:
• Advantages:
– Do not require system specific knowledge (i.e., material properties, geometry,
or failure mechanisms).

Fig. 14.19 Fusion prognostics approach

– Can detect intermittent failures.


– Capable of capturing complex relationships (between subsystems and
environment), reduce dimensionality and thus can be used for complex
systems.
• Limitations:
– In some cases, reliable training data is required to create a baseline.
– Cannot identify failure mechanisms.
– It is difficult to estimate RUL without complete historical knowledge (run-
to-failure data) of system parameters.
The conceptual explanation of fusion prognostics is depicted in Fig. 14.19. For
a complex system, the set of parameters that can be monitored may be
high-dimensional, and not all of the parameters are related to anomalies or
failures of the system. PoF methods can assist in parameter identification: the
potential failure modes, causes, mechanisms, and models of a product under
given environmental and operational conditions can be identified by a PoF
method (e.g., failure modes, mechanisms, and effects analysis (FMMEA)). The
parameters to monitor and the sensing locations can then be identified based on
the failure mechanisms and models. PoF methods may not, however, identify all
the parameters related to anomalies or failures.
Data-driven methods can identify the remaining parameters. The relationships
(e.g., correlation or covariance) between parameters and the principal
parameters related to anomalies can be identified by data-driven methods, and
anomaly detection can also be performed by data-driven methods. Features of the
monitored data can be extracted, for example:

• Statistical characteristics, e.g., range, mean, standard deviation, and
histogram
• Similarity and distance measures, e.g., Euclidean distance and Mahalanobis
distance
• Relationships between parameters, e.g., correlation and covariance
• Residuals, e.g., between the actual measurement and the estimation
Mathematical tools can be used to detect anomalies by analyzing the extracted
features; such tools include the sequential probability ratio test (SPRT), PCA,
neural networks, and support vector machines (SVM).
Failure can be predicted by PoF models assisted by data-driven methods: the
parameters responsible for the anomalies or failures can be isolated by
data-driven methods (e.g., PCA), an appropriate PoF model can be extracted from
a database, and failure can be predicted by the extracted model. Failure can
also be predicted by data-driven methods, with mathematical tools conducting
trending or regression based on the features of the isolated parameters.
Failure criteria can be obtained from standards, PoF models, historical
databases, or expert knowledge. Decision making is performed when multiple
predictions are available; examples include choosing the most conservative
prediction or utilizing methods such as the Dempster-Shafer method and fuzzy
fusion (a minimal sketch follows).
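As a hedged sketch of this decision-making step, the function below either
takes the conservative (minimum) of two RUL predictions or a simple weighted
combination; the weights are illustrative, not a prescribed fusion method.

# Decision making over multiple RUL predictions (illustrative)
def fuse_rul(rul_pof, rul_dd, mode="conservative", w_pof=0.5):
    if mode == "conservative":
        return min(rul_pof, rul_dd)  # safest of the two predictions
    return w_pof * rul_pof + (1.0 - w_pof) * rul_dd  # weighted fusion

print(fuse_rul(1200.0, 900.0))                              # 900.0
print(fuse_rul(1200.0, 900.0, mode="weighted", w_pof=0.6))  # 1080.0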
The capabilities of fusion prognostics are: it aggregates the strengths of the
PoF and data-driven approaches to improve the capability of PHM for system
health assessment and prognostics; it is capable of detecting intermittent
failures; and it can provide information about the failure modes and mechanisms
occurring in the system, which can be used for root cause analysis.

References

1. Pecht MG (2008) Prognostics and health management of electronics, chap. 1. Wiley, Hoboken,
NJ, pp 3–4
2. Pecht MG (2008) Prognostics and health management of electronics, chap. 4. Wiley, Hoboken,
NJ, pp 73–84
3. Pecht MG (2008) Prognostics and health management of electronics, chap. 3. Wiley, Hoboken,
NJ, pp 47–72
4. Chandola V, Banerjee A, Kumar V (2009) Anomaly detection: a survey. ACM
Comput Surv 41(3), Article 15:1–15:58
5. Markou M, Singh S (2003) Novelty detection: a review-part 1: statistical approaches. Signal
Process 83:2481–2497
6. Nilsson NJ. Introduction to machine learning. http://ai.stanford.edu/~nilsson/mlbook.html
7. Tran TN, Wehrens R, Buydens LMC (2006) KNN-kernel density-based clustering for high-
dimensional multivariate data. Comput Stat Data Anal 51(2):513–525
8. Xu R (2005) Survey of clustering algorithms. IEEE Trans Neural Netw 16(3):645–678
