
Accepted Manuscript

An ensemble of dynamic neural network identifiers for fault detection and isolation of gas turbine engines

M. Amozeghar, K. Khorasani

PII: S0893-6080(16)00004-6
DOI: http://dx.doi.org/10.1016/j.neunet.2016.01.003
Reference: NN 3580

To appear in: Neural Networks

Received date: 16 October 2015
Accepted date: 13 January 2016

Please cite this article as: Amozeghar, M., & Khorasani, K. An ensemble of dynamic neural network identifiers for fault detection and isolation of gas turbine engines. Neural Networks (2016), http://dx.doi.org/10.1016/j.neunet.2016.01.003

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

An Ensemble of Dynamic Neural Network Identifiers for Fault Detection and Isolation of Gas Turbine Engines

M. Amozeghar and K. Khorasani

Abstract—In this paper, a new approach for accomplishing gas turbine engine Fault Detection and Isolation (FDI) is proposed by developing an ensemble of dynamic neural network identifiers. For monitoring the health of the gas turbine engine, its dynamics is first identified by constructing three separate or individual dynamic neural network architectures. Specifically, a dynamic multi-layer perceptron (MLP), a dynamic radial-basis function (RBF) neural network, and a dynamic support vector machine (SVM) are trained to individually identify and represent the gas turbine engine dynamics. Next, three ensemble-based techniques are developed to represent the gas turbine engine dynamics, namely, two heterogeneous ensemble models and one homogeneous ensemble model. It is first shown that all the ensemble approaches significantly improve the performance and accuracy of the system identification goal when compared to each of the stand-alone solutions. The best stand-alone model (i.e., the dynamic RBF network) and the best ensemble architecture (i.e., the heterogeneous ensemble), in terms of their performance in achieving an accurate system identification, are then selected for accomplishing the FDI task and objective. The required residual signals are generated by using both a single model-based solution and an ensemble-based solution under various gas turbine engine health conditions. Our extensive simulation studies demonstrate that the fault detection and isolation task achieved by using the residuals that are obtained from the dynamic ensemble scheme results in a significantly more accurate and reliable performance, as illustrated through detailed quantitative confusion matrix analysis and comparative studies.

Index Terms—Ensemble learning; Fault detection and isolation; System identification; Dynamic neural networks; Gas turbine engines.

Department of Electrical and Computer Engineering, Concordia University, Montreal, Canada (email: kash@ece.concordia.ca, tel: 1-514-848-2424 x3086, fax: 1-514-848-2424). This publication was made possible by NPRP grant No. 4-195-2-065 from the Qatar National Research Fund (a member of Qatar Foundation). The statements made herein are solely the responsibility of the authors.

I. INTRODUCTION

Fault Detection and Isolation (FDI) of complex systems has captured a wide range of attention in various industries, including the aerospace industry, among others. FDI plays an important role in increasing safety and reducing the operational costs of an aircraft. This is applicable to different subsystems of an aircraft, which also includes its engine. Early diagnosis of gas turbine engine faults reduces both the operational and maintenance costs of an aircraft.

Various schemes have been proposed for fault detection and isolation in a diverse range of applications. At the high level, these schemes can be categorized into two main classes, namely: model-based and computational intelligence-based methods. The complications associated with the availability of a highly reliable mathematical representation of a system are the most significant obstacle in using model-based approaches, as obtaining an accurate model will in general be quite a challenging and expensive task to accomplish. The Kalman filter is a well-established model-based approach that has been extensively applied to gas turbine engine fault diagnosis [1], [2].

Computational intelligence-based methods, on the other hand, do not require the availability of a mathematical model and can be trained by using only the available input-output data from the system [3], [4]. Various computational intelligence approaches have been presented for FDI in the literature, among which neural networks are the most well-known. Neural networks have been widely used in the application of gas turbine engine fault diagnosis. The use of dynamic neural networks for gas turbine engine fault diagnosis is reported in [3]. Feed-forward neural networks are another widely used neural network for gas turbine engine fault diagnosis [5]–[8]. The use of radial basis function (RBF) neural networks for gas turbine engine fault diagnosis is also reported in [5], [9], [10].

The main drawback of standard neural network methods is their lack of quantitative a priori confidence on their generalization performance, since their knowledge is distributed over a set of neurons (unlike the model-based approaches where the knowledge is centralized in the mathematical model). To respond to this challenge, one can propose a fault detection and isolation scheme that is based on an ensemble of neural network architectures. The agreement among the ensemble members reduces the chance of error while increasing the overall decision making reliability and confidence [11]. Ensemble learning has proven to improve the individual learner's generalization capability and performance [12], [13], and reduces the chance of selecting a learner with weak performance capability. Ensemble learning has captured a lot of attention in the computer science and engineering communities under various names [14], including: bagging [15], boosting [16], [17], mixture of experts [18], and neural network ensembles [13].

The use of ensemble methods for tackling the FDI problem has been reported in several publications. In [6], Xiao et al. developed an ensemble classifier for fault diagnosis of an aircraft engine using linear regression (LR), multilayer perceptron (MLP), likelihood ratio test (LRT) and robust ratio thresholding (RRT). Yan et al. [8] introduced an ensemble classifier for gas turbine engine fault diagnosis by using support vector machine (SVM), MLP and decision tree (DT). Xiao et al. [19] designed an ensemble classifier for fault diagnosis of gas turbines with generalized regression neural network (GRNN), logistic regression (LoR) and random forest (RF). Varma et al. [20] used rough sets (RS) and self-organizing maps (SOM) for the anomaly detection problem of gas turbines. Donat et al. [21]

presented five fault classifiers based on the nearest neighbor classifier (K-NN), SVM, Gaussian mixture models (GMM), probabilistic neural network (PNN) and principal component analysis (PCA). Lu et al. [9] presented an ensemble system based on the MLP and radial basis function (RBF) neural networks to improve the diagnostic accuracy and reduce the rate of misdiagnosis for aircraft engine gas path faults. Huang et al. [22] proposed a multiple classifier fusion using within-class decision support for fault diagnosis where the base classifiers selected are K-NN and the orthogonal quadratic discriminant function (OQDF). Amanda et al. [23] as well as Sharkey et al. [24] used an ensemble of MLP networks for fault diagnosis of a diesel engine. Lei et al. [25] presented a multiple classifier system (MCS) for fault detection of a gearbox by combining MLP, RBF and K-NN. Oza et al. [10] presented an ensemble of MLP and RBF for aircraft health monitoring. Ren et al. [26] combined three classifiers, MLP, fuzzy logic (FL) and human-machine interaction (HI), to solve the fault diagnosis problem of an aero-engine.

Based on all the above literature review, the use of ensemble learning for gas turbine engine fault diagnosis through system identification has not yet been reported. The work in this paper presents an ensemble of dynamic neural networks for fault detection and isolation of a gas turbine engine through system identification. Several prior research works highlighted above have employed ensemble learning to evaluate the residual signals to detect or isolate a fault, but no research has addressed the possible use of ensemble learning identifiers for generating the residuals. In other words, all previous work has developed ensembles of classifiers that receive and operate based on the residual signals; however, no research has addressed the use of ensembles of identifiers or regressors to identify the gas turbine engine dynamics and to generate the ensemble of dynamically generated residuals for accomplishing FDI.

The objectives of this work are therefore to develop an ensemble-based methodology for accomplishing the fault detection and isolation of gas turbine engines and also to compare the results with conventional single neural network-based FDI solutions. It will be shown that by integrating stand-alone neural networks, more accurate ensemble models can be constructed and designed to identify and represent the gas turbine engine dynamics without requiring the ad-hoc fine tuning procedures necessary in single neural network-based solutions. It should be noted that the main motivation, justification, and argument for developing a more accurate ensemble is to have a large number of ensemble members. In theory, the accuracy of an ensemble model can be improved arbitrarily by increasing the number of ensemble members, without requiring very accurate individual ensemble members.

For the purpose of performing the gas turbine engine health monitoring, first the gas turbine engine dynamics is identified by using three different stand-alone learning algorithms. Specifically, the MLP-NARX, the RBF-NARX, and the SVM-NARX models (NARX denotes Nonlinear AutoRegressive eXogenous) are trained to individually model the gas turbine engine output measurements. A separate model was trained for each engine output by using individual learning algorithms. The parameters of the individual learning algorithms (e.g., the number of neural network neurons) are optimized by performing several experimentations. It will be shown that the RBF-NARX model shows a better modeling and representation performance (in terms of the modeling identification accuracy) among the stand-alone models.

The first ensemble model that attempts to model the gas turbine engine dynamics is a homogeneous ensemble with bagging, where several RBF-NARX models are trained to model the gas turbine engine dynamics by using various subsets of the training data that are generated by bootstrap sampling. The effect of the number of models in an ensemble on its accuracy will also be studied. It is observed that by increasing the number of models in an ensemble the prediction error in general decreases. It is also observed that homogeneous ensemble models based on the MLP, RBF, and SVM structures outperform the stand-alone models in terms of the modeling and identification accuracy.

Next, the system identification is accomplished by developing a heterogeneous ensemble where the members are combined by using weighted averaging, where the weights are optimized by using a gradient descent method. For the second heterogeneous ensemble approach, Forward Sequential Selection (FSS) pruning is utilized. We first train a pool of stand-alone models to identify the gas turbine engine output measurements. The ensemble initially utilizes the model yielding the best performance (in terms of the modeling and identification accuracy), and then the other models are added to the ensemble based on their contribution to improving the overall ensemble performance.

The selected heterogeneous ensemble with the FSS pruning (given that it will be shown to perform the best among the other ensembles), and the RBF-NARX as a stand-alone neural network model, are shown to yield the best performance for solving the fault detection problem. Different fault scenarios are considered based on the fault type, the fault severity, and the gas turbine engine's input profile (the fuel flow rate). Our experiments will demonstrate that the fault detection achieved by using the residuals that are obtained from the ensemble models results in a more accurate fault detection performance.

Finally, the fault isolation is accomplished by evaluating the variations in the ensemble residual signals before and after a fault detection flag is issued by the detection filters. Eight classes corresponding to single faults and ten classes corresponding to multiple concurrent faults are defined (according to the fault type and severity). Our goal is to demonstrate that the ensemble-based fault isolation approach will result in a more promising performance as compared to the individual neural network-based classifiers.

It should be pointed out that the improved performance and accuracy that are achieved by using the ensemble methods will be slightly more computationally costly (in terms of training multiple system identification models as opposed to only one model) as compared to stand-alone neural network methods; however, this additional effort is compensated for and justified, as the ensemble methods do not require the time-consuming and labor-intensive ad-hoc fine tunings that are necessary for single neural network-based solutions.

Based on the above discussion, the main contributions of this work can therefore be summarized as follows:

1) A novel approach is proposed for identification of the gas turbine engine dynamics based on an ensemble of dynamic identifiers and learners. This paper does report

the first use of a dynamic ensemble learning methodology for identifying and representing nonlinear systems.

2) An extensive comparative study is conducted to verify and validate that the proposed ensemble-based dynamic system identification can reduce the gas turbine engine modeling error by up to 67% as compared with single neural network-based solutions. Consequently, one can expect to guarantee a more accurate residual signal generation process that can be utilized for accomplishing the fault detection and isolation (FDI) objectives.

3) Novel FDI methodologies are proposed by utilizing the ensemble dynamic identifiers of the gas turbine engine by using both homogeneous and heterogeneous ensemble architectures. Various ensemble schemes are studied to determine the strategy that yields the maximal improvement as compared with single dynamic neural network-based solutions. The constructed residuals are shown to yield ensemble-based gas turbine engine fault detection results that are up to 5% more accurate compared to the single neural network-based solutions. Moreover, it is shown that one can achieve up to 12% improvement in the fault isolation correct classification rates.

The remainder of this paper is organized as follows. Section II presents the required background on ensemble learning and the gas turbine engine model, as well as the nature of the considered faults. Section III presents our proposed ensemble-based dynamic system identification methodology. Section IV describes our developed and proposed ensemble-based methodology for the fault detection problem. Section V presents our proposed single fault isolation ensemble methodology, and Section VI presents our proposed multiple faults isolation ensemble approach. The proposed homogeneous and heterogeneous dynamic neural network ensemble identifiers and simulation results corresponding to an engine are presented in Section VII. The FDI simulation results are presented in Section VIII. Section IX concludes the paper.

II. BACKGROUND INFORMATION

This section contains three parts. The first part provides a brief overview of ensemble learning. The second part presents preliminaries on dynamic neural network learning. Finally, the third part is an overview of the single-spool gas turbine engine model and the component faults that are considered and investigated for the FDI problem.

A. Ensemble Learning

The error in any learning problem is composed of two components, bias and variance [13], [14], [27], between which there is a trade-off relationship. Generally, bias would be large if a learning method produces models that are consistently wrong. Assume that a learning problem is solved several times using the same learning algorithm; then bias is indicated by the difference between the prediction and the expected values taken over different trained models [28]. Variance would be large if choosing different training sets results in varying predictions (assuming that a learning problem is solved several times using the same learning algorithm but different training data) [28]. When one compares different learners, in most cases comparisons show that one method has a higher bias and lower variance and another method has a lower bias and higher variance [27]. The decision for selecting one method versus another is not simply a matter of selecting the one that has a small variance or the one that has a small bias; instead, the goal should be to weigh the respective merits of the bias and variance and then properly choose accordingly [29]. Ensemble learning methods have been developed to improve a methodology's accuracy by reducing its variance, while maintaining a low bias for the learner [14].

Diversity is key in designing ensemble learning systems. Clearly, there is nothing to be gained from combining several identical models. It should be noted that the association and link between diversity and the bias-variance trade-off is that diversity of models boils down to their variances. Therefore, the ensemble members should be different from each other while each must maintain an acceptable level of accuracy.

The specific method for creating diversity plays an important role in the training of the ensemble model. At the high level, two different methodologies can be considered for creating diversity among the ensemble members. The first method uses learners having different architectures (e.g., by using different types of neural networks), and the second methodology trains different learners on different sets of training data [30]. These two methodologies are referred to as heterogeneous and homogeneous ensembles, respectively.

The source of diversity in heterogeneous ensembles is due to the inherent properties of different learning schemes. On the other hand, the source of diversity in homogeneous ensembles is due to the use of different subsets of the available training data for training the individual models. Homogeneous ensembles have been widely studied in the literature [30], whereas the number of works on heterogeneous ensembles is relatively smaller [31], and thus they deserve more attention.

B. Neural Networks for Dynamical System Identification

Basic neural network architectures are capable of learning static nonlinear maps between the inputs and outputs of a system. In static systems the output at any instant n, that is y(n), depends only on the input u(n) at the same instant through a potentially nonlinear map that is given by y(n) = f(u(n)). Therefore, static neural networks can be used for modeling such systems. However, in dynamical systems, the output at the present time depends not only on the present input, but also on a certain number of past instances of the inputs and outputs. Such systems should be represented by dynamical neural network structures for mapping representation and system identification.

The main characteristic of a dynamic neural network is that it is embedded with memory, which makes it a suitable framework for modelling highly complex nonlinear systems, such as gas turbine engines. We will show that dynamic neural networks are a promising tool for generating residuals, and hence for fault diagnosis. Dynamic neural networks have recently been employed in achieving FDI of nonlinear systems. In [32], a dynamic neural network was used to detect actuator faults in the attitude control subsystem of a satellite. Valdes et al. [33] used a dynamic neural network for fault detection and isolation of thrusters in satellites. The authors in [3], [34], [35] have applied the dynamic neural networks that were developed in [36] for fault detection of aircraft jet engines.
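The bias-variance trade-off discussed above can be illustrated numerically (a toy sketch, not an experiment from this paper): train the same learner many times on freshly drawn noisy data sets, estimate the squared bias and the variance of its predictions, and then observe that averaging several independently trained models leaves the bias essentially unchanged while shrinking the variance term, which is the mechanism that ensemble learning exploits. The sine target, the cubic polynomial learner, and all constants below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_f(x):
    return np.sin(2 * np.pi * x)

def fit_poly(deg, n=30):
    """Train one model: fit a degree-`deg` polynomial to a fresh noisy sample."""
    x = rng.uniform(0, 1, n)
    y = true_f(x) + rng.normal(0, 0.3, n)
    return np.polyfit(x, y, deg)

# Train 200 independent models and evaluate them on a common test grid.
x_test = np.linspace(0, 1, 50)
preds = np.array([np.polyval(fit_poly(deg=3), x_test) for _ in range(200)])

bias2 = np.mean((preds.mean(axis=0) - true_f(x_test)) ** 2)  # squared bias
var = preds.var(axis=0).mean()                               # variance

# Averaging an ensemble of m independently trained models leaves the bias
# unchanged but divides the variance term by roughly m.
m = 10
ens_preds = preds.reshape(20, m, -1).mean(axis=1)  # 20 ensembles of 10 models
ens_var = ens_preds.var(axis=0).mean()
print(f"bias^2 = {bias2:.4f}, single-model var = {var:.4f}, "
      f"{m}-member ensemble var = {ens_var:.4f}")
```

The printed ensemble variance is close to the single-model variance divided by the ensemble size, consistent with the variance-reduction argument of [14].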
Fig. 1. Gas turbine engine modules and the information flow chart [2].

TABLE I
GAS TURBINE ENGINE COMPONENT FAULT INDICATIONS

Component Fault      | Indication                                      | Symbol
Compressor fouling   | Decrease in the compressor flow capacity (ṁC)  | Fmc
Compressor erosion   | Decrease in the compressor efficiency (ηC)     | Fec
Turbine fouling      | Decrease in the turbine flow capacity (ṁT)     | Fmt
Turbine erosion      | Decrease in the turbine efficiency (ηT)        | Fet

C. Gas Turbine Engine Model

A MATLAB Simulink model of a gas turbine engine is developed in [2], based on the available literature [37], [38], and is used in this paper to generate the required data. Alternatively, data from a real engine can be used. However, it should be noted that using a simulation model is of special interest as it easily allows one to generate faulty data, given that for practical reasons it may not always be possible to actually inject faults in a real operational gas turbine engine.

The following set of nonlinear equations describes the single-spool gas turbine engine dynamics [2]:

ṪCC = (1/(cν mCC)) [cP TC ṁC + ηCC Hu ṁf − cP TCC ṁT − cν TCC (ṁC + ṁf − ṁT)]

Ṅ = [ηmech ṁT cP (TCC − TT) − ṁC cP (TC − Td)] / [J N (π/30)²]

ṖT = (R TMi / VMi) [ṁT + (β/(1+β)) ṁC − ṁn]

ṖCC = (PCC/TCC) ṪCC + (γ R TCC / VCC) (ṁC + ṁf − ṁT)

where TCC denotes the combustion chamber temperature, N denotes the rotational speed, PT denotes the turbine pressure, PCC denotes the combustion chamber pressure, mCC denotes the mass flow in the combustion chamber, cν denotes the specific heat at constant volume, cP denotes the specific heat at constant pressure, J denotes the rotor moment of inertia, R denotes the gas constant, VCC denotes the combustion chamber volume, γ denotes the heat capacity ratio, TC denotes the compressor temperature, ṁC denotes the compressor mass flow rate, ṁT denotes the turbine mass flow rate, ηCC denotes the combustion chamber efficiency, Hu denotes the fuel specific heat, ṁf denotes the fuel mass flow rate, ηmech denotes the mechanical efficiency, Td denotes the diffuser temperature, ṁn denotes the nozzle mass flow rate, β denotes the bypass ratio, TMi denotes the mixer temperature, and VMi denotes the mixer volume.

Figure 1 depicts the main engine components and their interdependencies. The state variables in a single-spool gas turbine engine are selected as x = [TCC, N, PT, PCC]ᵀ. The output measurements in a single-spool gas turbine engine are selected as z = [PC, TC, N, PT, TT], where PC denotes the compressor pressure and TT denotes the turbine temperature. The control input of the gas turbine engine is the power level angle (PLA), which is adjusted by the pilot and is related to the fuel mass flow rate (ṁf) through a variable gain.

The faults in the gas turbine engine are categorized into the following commonly occurring types, namely component faults, actuator faults, and sensor faults. This paper considers only the component faults that are listed in Table I. The component faults are modeled as a decrease in the engine health parameters, that is, the efficiency and the mass flow rates of the compressor and the turbine components.

III. ENSEMBLE-BASED DYNAMIC SYSTEM IDENTIFICATION

A. The Gas Turbine Engine Model Identification: Single Neural Network-based Approach

System identification plays an important role in fault detection schemes. One always requires a reference healthy model to generate the expected system outputs that correspond to the healthy process. The residual signals are then constructed by comparing the output of the actual system, which could be subjected to faults, with that of the identified healthy reference model. If the residual is within certain a priori selected bounds, the system under study is said to be healthy; however, if the residual signal exceeds these bounds, then a fault has occurred in the system and the fault flag is declared. In this work, various neural network architectures are trained to identify the dynamics of a healthy gas turbine engine. These models are subsequently integrated together to construct an ensemble model representation for the healthy engine. This section describes the identification of the engine dynamics by using individual neural network learning models, whereas Section III-B provides the identification of the engine dynamics by using an ensemble of the learning models.

The gas turbine engine dynamics is identified by utilizing the Nonlinear AutoRegressive eXogenous (NARX) architecture, which is commonly used in the system identification domain [39]. The NARX approach relates the current output of the to-be-identified system to its previous inputs and outputs. A general nonlinear system can be represented by an input-output map according to the following nonlinear autoregressive moving average (NARMA) model:

y(k) = f(y(k − 1), ..., y(k − dy), u(k), ..., u(k − du))   (1)

where u(k) and y(k) denote the input and the output vectors of the system at the discrete-time instant k, respectively, and f is an unknown nonlinear function that is to be identified and estimated by the dynamic neural network. The parameters dy and du represent the order of the delays in the output and input channels of the system, respectively. When a system is identified by the NARX model, the representation is then expressed by a nonlinear function f̂(·) as follows:

ŷ(k) = f̂(y(k − 1), ..., y(k − d̂y), u(k), ..., u(k − d̂u))   (2)
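The NARX identification of equation (2) can be illustrated with a minimal sketch, under stated assumptions: a small synthetic nonlinear plant stands in for the engine, and a linear-in-parameters least-squares regressor stands in for the MLP/RBF/SVM networks used in this paper. During training, the measured past outputs are used as regressors, whereas during recall the model's own past predictions are fed back.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical nonlinear plant standing in for the engine dynamics
# (illustrative only; not the model of Section II-C).
def plant_step(y1, y2, u1):
    return 0.6 * y1 - 0.1 * y2 + 0.5 * np.tanh(u1)

# Generate input/output data; the training phase uses measured outputs,
# i.e. the series-parallel structure behind equation (2).
N = 500
u = rng.uniform(-1, 1, N)
y = np.zeros(N)
for k in range(2, N):
    y[k] = plant_step(y[k - 1], y[k - 2], u[k - 1]) + rng.normal(0, 0.01)

# Regressor phi(k) built from delayed outputs and inputs (dy=2, du=1).
def features(y1, y2, u1):
    return np.array([y1, y2, u1, np.tanh(u1), 1.0])

Phi = np.array([features(y[k - 1], y[k - 2], u[k - 1]) for k in range(2, N)])
theta, *_ = np.linalg.lstsq(Phi, y[2:], rcond=None)

# Recall phase: the parallel structure feeds the model's OWN past outputs back.
y_hat = np.zeros(N)
for k in range(2, N):
    y_hat[k] = features(y_hat[k - 1], y_hat[k - 2], u[k - 1]) @ theta

rmse = np.sqrt(np.mean((y[2:] - y_hat[2:]) ** 2))
print(f"parallel-recall RMSE = {rmse:.4f}")
```

Because the plant is stable and the regressor contains the true basis functions, the parallel (closed-loop) recall error stays near the noise floor; with a neural network model the same two-phase procedure applies but the feature map is learned.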

TC (k )
y (k ) PC (k )
m f (k ) Gas Turbine Engine m f (k )
Gas Turbine
(i.e. TC , TT , N , PT , PC ) Engine
N (k )
TT (k )
PT (k )
TˆC (k )
TDL EnsembleTC
...

TDL yˆ(k ) PˆC (k )


Learning EnsemblePC
...
Algorithm
(i.e. TˆC , TˆT , Nˆ , PˆT , PˆC )
Nˆ (k )
EnsembleN
y (k )
m f (k ) Gas Turbine Engine
(i.e. TC , TT , N , PT , PC ) TˆT (k )
EnsembleTT

PˆT (k )
TDL EnsemblePC
...
Ensemble model of gas turbine engine
TDL yˆ(k )
Learning TC (k )
...
Algorithm m f (k ) PC (k )
(i.e. TˆC , TˆT , Nˆ , PˆT , PˆC ) GasTurbine
Engine
N (k )
TT (k )
PT (k )
Fig. 2. The gas turbine engine identification process by using the NARX
TˆC (k )
methodology that is used during the training (top plot) and the recall (bottom EnsembleTC RESTC (k )
plot) phases. The block T DL represents the Tapped Delay Lines module.
PˆC (k )
EnsemblePC RES PC (k )

where ŷ(k) denotes the estimate of the actual output and


the time delays dˆy and dˆu should be approximated such that EnsembleN
Nˆ (k )
RES N (k )
dˆy ≥ dy and dˆu ≥ du [36]. There are two possible structures
TˆT (k )
that one can utilize for the NARX model. During the training EnsembleTT RESTT (k )
phase, the so-called series-parallel NARX structure is used for
identification of the system dynamics. In this methodology the PˆT (k )
EnsemblePC RES PT (k )
actual inputs and outputs of the gas turbine engine are fed
to the NARX identification model. During the recall phase, Ensemble model of gas turbine engine

the so-called parallel NARX structure is used for verification


and validation of the trained NARX representation and model. These two structures are shown in Figure 2.

Given that the gas turbine engine is a bounded input bounded output (BIBO) stable system, all the signals used in the identification process are bounded. This guarantees that the identified model will also remain BIBO stable. Once it can be ensured that ŷ(k) ≈ y(k), the series-parallel model will then be replaced by a parallel model. Three separate learning algorithms are now considered for this framework to identify the engine dynamics, as described below.

The first learning algorithm uses the MLP neural network in the NARX structure to result in what we designate as the MLP-NARX for accomplishing the system identification objective. This structure has been reported in the literature [40]. The second learning algorithm uses the RBF neural network in the NARX structure to result in what we designate as the RBF-NARX. This structure has also been reported in the literature [41]. The third learning algorithm is the SVM that is used in the NARX structure to result in what we designate as the SVM-NARX. This structure has also been reported in the literature [42].

Fig. 3. The architecture of the gas turbine engine ensemble learning model during the learning (top figure) and the recall (bottom plot) phases.

B. The Gas Turbine Engine Model Identification: Ensemble-based Approach

This section develops and presents our proposed methodology for identifying the gas turbine engine dynamics by using ensemble methods. Similar to the single model-based approach presented in Section III-A, a series-parallel architecture is also used for training an ensemble model, as shown in Figure 3. Once the training process is completed, where the trained model outputs replicate the outputs of the actual engine, the series-parallel architecture is replaced with a parallel architecture, as shown in Figure 3. Figure 4 depicts the internal structure of each ensemble learning model. Note that each model has its own specifically optimized parameters, such as the number of neurons and the number of tapped delay lines (TDLs).

Training of an ensemble learning architecture can generally be accomplished through three steps [43]. The first step is the ensemble generation, during which a set of individual models is constructed. The second step is to trim the set of generated models, during what is known as the ensemble pruning stage, so that the performance of the ensemble learning model is optimized in terms of the identification accuracy. Finally, the selected models are combined together in the ensemble integration step, where the final ensemble learning model is constructed. Note that constructing the ensemble model is an iterative procedure, as it requires selecting various subsets of the generated models (i.e., in the pruning step) and combining them (i.e., in the integration step) in order to achieve the best possible generalization performance capabilities. These steps are now described in detail below.
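The series-parallel (training) versus parallel (recall) operation referred to above can be illustrated with a short sketch. Here `f` stands in for any trained one-step predictor (MLP-, RBF- or SVM-NARX), and the two-delay regressor is an illustrative assumption rather than the paper's exact structure:

```python
import numpy as np

# Sketch (not the paper's code): series-parallel vs. parallel operation of a
# NARX model. `f` is any one-step predictor; two input and two output delays
# are assumed purely for illustration.

def narx_series_parallel(f, u, y):
    """Training mode: past *measured* outputs y feed the regressor."""
    y_hat = np.zeros_like(y)
    y_hat[:2] = y[:2]
    for k in range(2, len(u)):
        y_hat[k] = f(u[k - 1], u[k - 2], y[k - 1], y[k - 2])
    return y_hat

def narx_parallel(f, u, y0):
    """Recall mode: past *predicted* outputs are fed back instead."""
    y_hat = np.zeros(len(u))
    y_hat[:2] = y0  # initial conditions taken from measurements
    for k in range(2, len(u)):
        y_hat[k] = f(u[k - 1], u[k - 2], y_hat[k - 1], y_hat[k - 2])
    return y_hat
```

During training the regressor is filled with the measured outputs (teacher forcing), so learning reduces to a static approximation problem; during recall the model's own outputs are fed back, which is why ŷ(k) ≈ y(k) must be ensured before making the switch.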
Fig. 4. The internal structure of a given ensemble learning model (refer to Figure 3).
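The integration step mentioned above combines the member outputs through averaging weights. A minimal sketch of fitting such weights by gradient descent on the ensemble RMSE; the prediction matrix `F` and the routine below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

# Sketch: optimize the averaging weights alpha so that the weighted sum of
# member predictions (columns of F) matches the target y in the RMSE sense.

def fit_weights(F, y, gamma=0.01, iters=5000):
    n_samples, n_models = F.shape
    alpha = np.ones(n_models) / n_models      # start from plain averaging
    for _ in range(iters):
        err = F @ alpha - y                   # ensemble prediction error
        rmse = np.sqrt(np.mean(err ** 2))
        grad = F.T @ err / (n_samples * max(rmse, 1e-12))  # d RMSE / d alpha
        alpha -= gamma * grad
    return alpha
```

A fixed step size gamma is used here; as discussed later in the integration step, the step size can also be made adaptive.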
1) Ensemble Generation: Ensemble generation approaches are divided into (a) homogeneous schemes, where the models are all generated by using the same learning algorithm, and (b) heterogeneous schemes, where different learning algorithms are used for training each member of the ensemble. As discussed in Section II, diversity among the models is a key requirement and is essential for constructing an ensemble of learning models. The source of diversity in the homogeneous ensemble generation is accomplished by using different training data, whereas the source of diversity in the heterogeneous ensemble is accomplished through the inherent properties of the different learning algorithms. Homogeneous ensemble generation is widely addressed and investigated in the literature [30]; in contrast, the number of publications on heterogeneous ensemble generation is relatively quite limited [31]. In this work, one homogeneous and two heterogeneous ensemble learning models of the gas turbine engine are developed by using the MLP-NARX, the RBF-NARX and the SVM-NARX structures as candidate ensemble learning machines.

Homogeneous ensemble generation is the best covered area of ensemble learning in the literature [30]. In this approach, ensemble members are generated by using the same learning algorithm (e.g., the same kind of neural networks), and diversity among them is ensured by altering the training data. Alternatively, one may diversify homogeneous models by using the same learning method subject to different sets of parameters (e.g., neural networks with different numbers of hidden layers and neurons). Comparative studies between these two approaches have concluded that altering the training data is generally more effective than altering the network parameters [44]. Several approaches have been suggested in the literature for training ensemble systems that manipulate the training data. Bagging (bootstrap aggregating) has been extensively studied as a homogeneous ensemble method [31]. In this approach, the original training data is re-sampled by bootstrap sampling in order to obtain several training sets corresponding to a given training data set. The authors in [15] and [45] have provided valuable insights as to why bagging works.

Boosting is another approach that works by manipulating the training data to ensure diversity. Similar to bagging, several training data sets are generated by re-sampling the corresponding training data. However, unlike bagging, the probability of being selected is not necessarily the same for different samples. In fact, the probability of being selected is initially the same for all the samples; however, in subsequent iterations the samples that lead to more inaccurate predictions will be given a higher probability of being selected. Boosting was originally developed for classification problems. Although several modifications to it have been proposed in the literature for regression problems, none has been demonstrated to be as promising as the bagging method [46].

In the heterogeneous ensemble, on the other hand, distinct learning models are trained by using the same training data. There are very few works in the literature that use different architectures for the ensemble learning systems [31]; this approach is studied much less in the literature, although some results are reported by using heterogeneous ensembles [47]. The diversity in this approach is ensured by the inherent properties of the different learning algorithms. The main challenge is the lack of control on the diversity of the ensemble during the generation phase.

2) Ensemble Pruning: Ensemble pruning refers to the procedure for trimming the set of trained models with the goal of improving the generalization error of the resulting ensemble. It is also used to reduce the complexity of the overall ensemble system. Several pruning methods have been proposed and compared in the literature, including (a) ranking based on the accuracy, (b) the forward selective search (FSS) algorithm, (c) the backward selective search (BSS), and (d) BSS with Ranking (BSSwR). It has been reported in [43], [48] that the FSS demonstrates a better performance as compared to the other approaches. In this paper a heterogeneous ensemble with the FSS as the pruning algorithm will be used to identify the gas turbine engine dynamics. First, several models are trained using each of the MLP-NARX, the RBF-NARX, and the SVM-NARX structures to model, represent, and identify the single input multiple output dynamics of the engine. To limit the complexity of the solution to the problem, a subset of the trained models is selected based on their performance (i.e., the 10 best RBF-NARX models, the 10 best MLP-NARX models, and the 10 best SVM-NARX models are selected out of a total of 105 models each). The members of the ensemble learning system are then selected by using the FSS algorithm.
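The greedy FSS selection just described can be sketched as follows. For brevity, plain averaging is used as the combiner here, whereas the paper re-optimizes the averaging weights each time a model is added:

```python
import numpy as np

# Sketch of Forward Sequential Selection (FSS) pruning with simple averaging
# as the combiner (illustrative; the paper refits gradient-descent weights
# at every step).

def rmse(y, y_hat):
    return np.sqrt(np.mean((y - y_hat) ** 2))

def fss(preds, y):
    """preds: list of per-model predictions on the training set."""
    remaining = list(range(len(preds)))
    first = min(remaining, key=lambda i: rmse(y, preds[i]))
    selected, best = [first], rmse(y, preds[first])
    remaining.remove(first)
    improved = True
    while remaining and improved:
        improved = False
        # test every candidate and keep the one improving the ensemble most
        cand = min(remaining, key=lambda i: rmse(
            y, np.mean([preds[j] for j in selected + [i]], axis=0)))
        score = rmse(y, np.mean([preds[j] for j in selected + [cand]], axis=0))
        if score < best:
            selected.append(cand)
            remaining.remove(cand)
            best, improved = score, True
    return selected, best
```

The loop stops as soon as no remaining candidate improves the ensemble, which is what makes FSS an iterative, greedy procedure.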
The FSS algorithm is initialized by using the model that yielded the best performance in the set. Each time a new model is added to the ensemble, all the candidates are tested and the model with the maximal improvement is then added as the next model. It should be pointed out that the FSS is an iterative procedure that requires combining various subsets of the generated models in order to construct the ensemble with the best achievable generalization performance.

3) Ensemble Integration: Ensemble integration combines the identifications that are made by the various models into a module for generating and constructing the final ensemble. For system identification and regression problems the integration mechanism combines the models by using a linear combination of the identifiers, as represented by [31]:

    f_ensemble(x) = Σ_{i=1}^{n} α_i f_i(x)

where f_ensemble(x) represents the output of the ensemble learning model for the input pattern x, f_i(x) denotes the output of the i-th model corresponding to the input pattern x, α_i denotes the averaging weight for the i-th model, and n denotes the number of selected ensemble members. In other words, the ensemble combination for a system identification or a regression problem can be restated as that of optimizing the averaging weights α_i.

Merz et al. [49] conducted a comparative study to determine the most effective ensemble combination technique. Several ensemble combination techniques were studied: (i) the Generalized Ensemble Method (GEM) [50], (ii) the Basic Ensemble Method (BEM) [50], (iii) the Linear Regression (LR), (iv) the Gradient Descent, and (v) the Exponential Gradient Descent [51]. Among these approaches the gradient descent method was found to demonstrate a better generalization performance as compared to the other approaches.

The objective function that is minimized by using the gradient descent approach is the RMSE of the ensemble, that is:

    min_{α} RMSE_ensemble(α)

with the updating rule:

    α^{k+1} = α^k − γ ∇RMSE_ensemble(α^k)

where RMSE_ensemble(α) is computed from Σ_{i=1}^{n} α_i f_i(x_training) − f(x_training), α = [α_1, ..., α_n]^T, f_i(x_training) denotes the prediction of the i-th model corresponding to the training data x_training, f(x_training) denotes the target or the actual value corresponding to the training data x_training, α^k denotes the value of α at the k-th iteration, ∇RMSE_ensemble(α^k) denotes the gradient of RMSE_ensemble at the k-th iteration, and γ denotes the step size. The step size, as is well known in optimization techniques, should be carefully selected, given that a very large step size may lead to divergence of the gradient descent, while a very small step will take a long time for the iterations to converge to a fixed point. It should be pointed out that the step size can be either fixed or adaptive (i.e., adjusted or changed at each iteration).

IV. FAULT DETECTION METHODOLOGY

Our proposed fault detection methodology consists of two phases. The first phase is to identify the dynamics of the engine by using both individual models as well as by integrating them to construct ensemble learning models. The identified model under healthy operation of the engine is then used to generate the residual signals to be analyzed for detection of faults that may be present in the engine. The residual signals are evaluated to determine the health status of the engine. Our proposed fault detection methodology is given below.

A. Fault Detection Logic

As stated above, the first step in the engine fault detection is the identification phase that was developed and described in Section III. The engine dynamics is identified by using both the ensemble-based and the individual learning models. A separate model (corresponding to both the ensemble-based and the individual learning models) is developed for each of the engine's five measurable outputs, namely the variable z as defined in Section II-C. The trained models are then utilized for generating the residual signals RES by comparing the actual output of the engine with that of the trained model representing the behavior of the healthy engine, as shown in Figure 5.

Fig. 5. The schematic for generating the residual signals.

The residual signals that are constructed are then utilized as indicators of the gas turbine engine health status, given that the residuals change before and after the occurrence of a fault. Consequently, by selecting a proper threshold band the engine faults can be detected by monitoring the variations in the residuals. To generate the threshold bands, the mean (µ) and the standard deviation (σ) of the residuals are obtained when the engine is operating under the healthy condition and under various fuel operating conditions and profiles. The threshold bands are then specified according to t.h.upper = µ + zσ and t.h.lower = µ − zσ, corresponding to the upper and the lower bands, respectively. By assuming a normal distribution associated with the residuals, a 99% confidence interval can be determined by selecting z = 2.6. A fault in the gas turbine engine would then be detected if any of the five residual signals passes its corresponding thresholds that are defined by the band [t.h.lower, t.h.upper].

V. FAULT ISOLATION METHODOLOGY

This section provides our proposed methodology for evaluating the residuals, that were utilized in the previous section for detecting the presence of a fault, in order to now solve the fault isolation problem. This is achieved by using a static ensemble of neural network classifiers. The use of static MLP neural networks for fault isolation of a gas turbine engine is reported in several publications, including but not limited to [5], [6], [8], [23], [24]. A similar methodology is used here for isolating the engine faults, but instead we utilize an ensemble of the neural network models that we designed in Section IV for performing the fault detection task.
The inputs to the MLPs in this case should be spatial data, as opposed to the residual signal time-series data that were used for the fault detection task. Therefore, the residuals are pre-processed to make them suitable as inputs to the static neural network structures.

We consider only two fault severity ranges, for the sake of simplicity and without loss of generality. Namely, we consider (a) severities less than or equal to 3%, and (b) severities greater than 3%. It should be emphasized that this is for illustration purposes only, as more severity ranges can be easily considered by incorporating additional fault class labels. The class labels that are considered for a single fault scenario are listed in Table II.

TABLE II
THE CONSIDERED SINGLE FAULT CLASSES.

Description                                                               | Symbol
Class 1: Decrease in the compressor flow capacity (ṁC) with severity ≤ 3% | Fmc ≤ 3%
Class 2: Decrease in the compressor flow capacity (ṁC) with severity > 3% | Fmc > 3%
Class 3: Decrease in the compressor efficiency (ηC) with severity ≤ 3%    | Fec ≤ 3%
Class 4: Decrease in the compressor efficiency (ηC) with severity > 3%    | Fec > 3%
Class 5: Decrease in the turbine flow capacity (ṁT) with severity ≤ 3%    | Fmt ≤ 3%
Class 6: Decrease in the turbine flow capacity (ṁT) with severity > 3%    | Fmt > 3%
Class 7: Decrease in the turbine efficiency (ηT) with severity ≤ 3%       | Fet ≤ 3%
Class 8: Decrease in the turbine efficiency (ηT) with severity > 3%       | Fet > 3%

The designed neural network fault classifier receives the variations of the residual signals before and after the fault occurrence, and returns a fault label corresponding to the isolated fault. This is the function of the residual evaluation block that is shown in Figure 6. This block continuously compares the residual signals with their corresponding threshold bands. Once at least one of the residuals has exceeded its threshold band, the presence of a fault is detected. The residual evaluation block then computes the variations of the residual signals before and after the fault detection declaration. This is then used as an input to the neural network for performing the fault classification. In other words, the output of the residual evaluation block can be defined and specified according to:

    ΔRES = [ΔRES_TC, ΔRES_PC, ΔRES_N, ΔRES_TT, ΔRES_PT]

It should be emphasized that the fault isolation neural network is only utilized when a fault is detected and the ΔRES vector is defined. Note that the value of ΔRES is only defined for t ≥ tD, and is not defined otherwise, where tD denotes the fault detection time communicated by the fault detection module.

The fault isolation task will be performed by using both the residuals that are obtained from the single model-based solutions as well as those that are obtained from the ensemble-based solutions.

VI. MULTIPLE FAULTS ISOLATION

In the previous section, the fault isolation problem in the presence of only a single fault at any given time was considered. In this section, the goal is to isolate simultaneously occurring faults. Isolating multiple faults is a complex problem. Therefore, in order to limit the complexity of the proposed solution, it is assumed that only two concurrent faults may occur. Extensions to more than two concurrent faults are straightforward and are not included here. We assume that the first fault occurs at t1 = 20 sec and the second fault occurs at t2 = 30 sec. The fault scenarios investigated are as follows.

Classes 1-4: These classes are associated with single Fec, Fet, Fmc, and Fmt reductions with severities between 1% to 6%. Class 5 ({Fec, Fmc}): This class is associated with simultaneous 1% to 6% faults in both the compressor efficiency (Fec) and the compressor mass flow rate (Fmc). Class 6 ({Fec, Fet}): This class is associated with simultaneous 1% to 6% faults in both the compressor efficiency (Fec) and the turbine efficiency (Fet). Class 7 ({Fec, Fmt}): This class is associated with simultaneous 1% to 6% faults in both the compressor efficiency (Fec) and the turbine mass flow rate (Fmt). Class 8 ({Fmc, Fet}): This class is associated with simultaneous 1% to 6% faults in both the compressor mass flow rate (Fmc) and the turbine efficiency (Fet). Class 9 ({Fmc, Fmt}): This class is associated with simultaneous 1% to 6% faults in both the compressor mass flow rate (Fmc) and the turbine mass flow rate (Fmt). Class 10 ({Fet, Fmt}): This class is associated with simultaneous 1% to 6% faults in both the turbine efficiency (Fet) and the turbine mass flow rate (Fmt).

Fig. 7. The schematic of the proposed neural network for achieving multiple fault isolation.

The structure of the ensemble MLP classifier that is used for the multiple fault isolation is shown in Figure 7. The neural network receives the set of variations in the residuals before and after the fault detection decision.

VII. HOMOGENEOUS AND HETEROGENEOUS DYNAMIC NEURAL NETWORK ENSEMBLE IDENTIFIERS

A. Dynamic Neural Network System Identifiers

The data used for identification of the engine dynamics is obtained from the model provided in Section II-C. It is assumed that the engine is operating for one hour (3600 sec). A total of 3601 input-output data samples are collected. The generated data contains the five (5) measured outputs as well as the engine input represented by the fuel flow rate.

Empirically, it is observed that normalization of the data improves the performance of the ensemble learners. Therefore, the following min-max normalization function is applied as a pre-processing step to the data, that is, the normalized data is computed from Xn = 2 × (Xmax − X)/(Xmax − Xmin). During the construction of each learning model the generated data is divided into the training, the testing, and the cross-validation sets.

1) The Gas Turbine Engine Dynamic Identification using MLP-NARX:

Five MLP-NARX structures are trained to model the engine output measurements PT, TT, PC, TC and N. The networks are denoted by MLP_PT, MLP_TT, MLP_PC, MLP_TC, and MLP_N, respectively.
Fig. 6. The schematic of the fault isolation methodology.
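The residual evaluation stage appearing in Fig. 6 first applies the detection test of Subsection IV-A. A sketch of that test, under the paper's assumption that the band statistics come from a healthy-engine run (function and signal names here are illustrative):

```python
import numpy as np

# Sketch of the Subsection IV-A detection logic: bands are mu +/- z*sigma of
# the healthy-engine residual, and a fault is declared when any residual
# leaves its band. z = 2.6 follows the paper.

def threshold_band(healthy_residual, z=2.6):
    mu, sigma = healthy_residual.mean(), healthy_residual.std()
    return mu - z * sigma, mu + z * sigma

def detect_fault(residuals, bands):
    """residuals: dict name -> signal; bands: dict name -> (lo, hi).
    Returns the earliest sample index at which any residual exits its
    band, or None if no fault is detected."""
    times = []
    for name, r in residuals.items():
        lo, hi = bands[name]
        out = np.where((r < lo) | (r > hi))[0]
        if out.size:
            times.append(out[0])
    return min(times) if times else None
```

On detection, the residual evaluation block of Fig. 6 then forms the ΔRES variations that feed the fault classifier.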

Constructing the MLP-NARX structures requires that one determines the network parameters, such as (i) the number of hidden neurons, (ii) the number of hidden layers, (iii) the number of delays, as well as (iv) the size of the training, testing and validation sets. To limit the network complexity, the number of hidden layers is limited to one. The networks are then constructed such that the generalization performance of the trained networks is maximized. The Root Mean Squared Error (RMSE) is applied to evaluate the training and the generalization performance of the trained networks. The Mean of Absolute Error (µ_ae) and the Standard Deviation of Absolute Error (σ_ae) are also evaluated, given that the error is expected to be randomly distributed around zero for an appropriately constructed network.

The construction of the MLP_PT, MLP_TT, MLP_PC, MLP_TC, and MLP_N networks is repeated several times by using various parameters and different combinations of the percentage of the training and the testing data. For example, for the optimal MLP_PC the number of delays is 7 and the number of neurons is 11, whereas for the optimal MLP_N the number of delays is 6 and the number of neurons is 10, to name a couple of structures. Consequently, a set of trained models is obtained that will be used later for constructing the ensemble models of the engine. After various experimentations it was determined that 40% of the available training data could be used for the cross-validation stage to construct the most suitable networks.

2) The Gas Turbine Engine Dynamic Identification using RBF-NARX:

Five RBF-NARX structures are trained to model the engine output measurements PT, TT, PC, TC and N. The networks are denoted by RBF_PT, RBF_TT, RBF_PC, RBF_TC, and RBF_N, respectively. Constructing the RBF-NARX structure requires that one determines the network parameters, such as (i) the number of RBF neurons, (ii) the number of delays, and (iii) the size of the training, testing, and validation data. After experimenting with different parameter values, the ones that yielded the best generalization performance are selected.

The construction of the RBF_PT, RBF_TT, RBF_PC, RBF_TC, and RBF_N networks is repeated several times by using various parameters and different combinations of the percentage of the training and the testing data. Consequently, a set of trained models is obtained that will be used later for constructing the ensemble models of the gas turbine engine. For example, for the optimal RBF_PC the number of delays is 10 and the number of neurons is 7, whereas for the optimal RBF_N the number of delays is 8 and the number of neurons is 11, to name a couple of structures.

3) The Gas Turbine Engine Dynamic Identification using SVM-NARX:

Five SVM-NARX structures are trained to model the engine output measurements. The construction of the SVM_PT, SVM_TT, SVM_PC, SVM_TC, and SVM_N networks is repeated several times by applying different network parameters and different percentage combinations of the training and testing data. The results produced a set of trained models that will be used subsequently for constructing the ensemble learning models of the gas turbine engine.

B. Ensemble I: Heterogeneous Ensemble with Ranked Pruning

Heterogeneous ensembles with ranked pruning have been reported in the literature, as in [52]. In this approach, first a pool of individual learners is trained by using different learning algorithms. The most accurate model is then selected for each learning algorithm to be aggregated and to generate the final ensemble learning model. The only source of diversity in this approach is the use of heterogeneous learners (that is, the use of different kinds of neural network algorithms).

In this work, the above methodology is first used to identify the engine dynamics. As discussed in the previous subsection, several machine learners are trained from the MLP-NARX, RBF-NARX, and SVM-NARX architectures to identify the engine dynamics. For each learning algorithm (e.g., the MLP-NARX) the identification architecture with the best performance is selected from the pool of individual trained learning models. The selected identified models are then combined by using the weighted averaging technique. Two combination methods are used to determine the optimal averaging weights, namely the generalized ensemble method and the gradient descent.

The performance of the heterogeneous ensemble with ranked pruning and the generalized ensemble method as the integration methodology was determined.
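The repeated-construction procedure of Subsection VII-A, training one candidate per choice of delays and neurons and keeping the structure with the lowest test RMSE, can be sketched as follows; `train_model` is a hypothetical fitting routine, not part of the paper:

```python
import itertools
import numpy as np

# Sketch of the structure-selection sweep: fit one candidate network per
# (delays, neurons) pair and keep the best-performing structure.
# `train_model` is assumed to return (model, y_test, y_hat).

def rmse(y, y_hat):
    return float(np.sqrt(np.mean((np.asarray(y) - np.asarray(y_hat)) ** 2)))

def select_structure(train_model, data, delays_grid, neurons_grid):
    best = None
    for d, h in itertools.product(delays_grid, neurons_grid):
        model, y_test, y_hat = train_model(data, delays=d, neurons=h)
        score = rmse(y_test, y_hat)
        if best is None or score < best[0]:
            best = (score, d, h, model)
    return best  # (test RMSE, delays, neurons, trained model)
```

The same loop applies unchanged to the MLP-, RBF- and SVM-NARX identifiers; only the fitting routine differs.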

TABLE III
SUMMARY OF THE HETEROGENEOUS ENSEMBLE TRAINING WITH RANKED PRUNING.
1: Several models are trained by using the MLP-NARX, the RBF-NARX, and the SVM-NARX architectures corresponding to the five gas turbine engine outputs.
2: For each architecture, the best trained model is selected for the subsequent integration.
3: The generalized ensemble method and the gradient descent algorithms are used to determine the averaging weights.
4: The initial condition of the gradient descent algorithm is selected such that the best learner in the pool has the maximum contribution.

TABLE IV
SUMMARY OF THE HETEROGENEOUS ENSEMBLE TRAINING WITH THE FORWARD SEQUENTIAL SELECTION (FSS) PRUNING ALGORITHM.
1: Several models are trained with the MLP-NARX, RBF-NARX, and SVM-NARX architectures to identify the engine's five output measurements.
2: A subset of the 10 best RBF-NARX models, the 10 best MLP-NARX models, and the 10 best SVM-NARX models is selected from the pool of models trained in Step 1 in order to reduce the complexity of the solution.
3: Corresponding to each gas turbine engine output (i.e., PC, TC, N, PT, TT), the FSS algorithm is initialized with the best trained model.
4: Each time a new model is added to the ensemble, all the candidates are tested and the model with the maximal improvement is then added as the next model.
5: Each time a new model is added to the ensemble, the optimal combining weights are re-calculated by using the gradient descent algorithm as discussed in Subsection III-B3.
6: All the evaluations for the FSS algorithm are performed on the training set.

The performance of the heterogeneous ensemble with ranked pruning and the gradient descent as the integration method was also determined. Based on the extensive simulations that were conducted, it was concluded that the gradient descent approach has a much better performance in terms of the generalization error when compared with the generalized ensemble method. Detailed comparative studies between the ensemble and the individual learners are provided in the following subsections. Table III summarizes the main procedure for the heterogeneous ensemble training with ranked pruning.

C. Ensemble II: Heterogeneous Ensemble using the Forward Sequential Selection (FSS)

The use of the heterogeneous ensemble with the Forward Sequential Selection (FSS) as a pruning algorithm has been reported in several publications in the literature, including but not limited to [43], [48], and [50]. Forward selection is initialized with an empty set, and models are iteratively added with the aim of decreasing the expected prediction error. Two different versions of the FSS are presented in the literature, namely the Forward Sequential Selection with Ranking (FSSwR) and the Forward Sequential Selection (FSS). The FSSwR ranks all the candidates with respect to their performance on a given training set. It selects the candidate at the top of the list until the performance of the ensemble decreases. In the FSS algorithm, each time a new candidate is added to the ensemble, all the candidates are tested and the one that leads to the maximal improvement in the ensemble performance is selected. In [53] the FSS was modified by adding a diversity measure. In this approach, the criterion for inclusion is a diversity measure, and the new model should be diverse from the previously selected models.

In this work, a heterogeneous ensemble with the FSS as the pruning algorithm is utilized for identifying the gas turbine engine dynamics. Several models are trained by employing the MLP-NARX, the RBF-NARX, and the SVM-NARX architectures for modeling the five output measurements of the engine dynamics. As stated earlier, in order to limit the complexity of the resulting solution, here a subset of the trained models is selected based on their achieved performance (i.e., the 10 best RBF-NARX models, the 10 best MLP-NARX models, and the 10 best SVM-NARX models are selected). The members of the ensemble are then selected by invoking the FSS algorithm. The FSS algorithm is initialized by using the model with the best performance in the pool. Each time a new model is added to the ensemble, all the candidates are tested and the model with the maximal improvement is then added as the next model. In each iteration, all the selected models are aggregated by using the gradient descent algorithm. It should be noted that the generalized ensemble method is not employed due to its poor performance, as pointed out in the previous subsection. Table IV summarizes the main procedure for the construction of the heterogeneous ensemble system.

Fig. 8. The schematic of the homogeneous ensemble learning with bagging.

D. Ensemble III: Homogeneous with Bagging

Bootstrap sampling, or bagging, is one of the most extensively used techniques for the manipulation of the training data [31]. Empirical studies have shown that bagging is a simple and effective method for reducing the prediction error in both classification and regression problems [45]. The main idea is to train a learning model by using different subsets of the training data that are generated by the bootstrap sampling procedure. In the bagging method, given a training set of size s, several bootstrap replicates of it are constructed by taking s samples out of it with replacement. Thus, a new training set with the same size is generated, where each of the samples in the original training set may appear once, more than once, or may not appear at all [45]. The learning algorithm then uses this new training set. This procedure is repeated several times, and all the models are then aggregated to generate the final ensemble. Figure 8 depicts the main components of the bagging procedure.

In this subsection, a homogeneous ensemble is trained by using bagging for modeling each of the five gas turbine engine outputs. As shown below, the RBF-NARX architecture outperforms the MLP-NARX and SVM-NARX models for identifying the dynamics of the five gas turbine engine outputs. Therefore, the RBF-NARX model is used to form the homogeneous ensemble. Corresponding to the network parameters (i.e., the number of neurons, the number of time delays, and the size of the training data), those corresponding to the models with the best RMSE performance are selected.

Consequently, several RBF-NARX models having exactly the same parameters are trained. The only factor that is different is the training data, where for the different homogeneous models the training data is obtained by the bootstrap sampling as indicated above.
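The bootstrap replication behind Ensemble III can be sketched as follows (row-wise resampling of already-formed regressor/target pairs; an illustrative sketch, not the authors' code):

```python
import numpy as np

# Sketch of bagging-style bootstrap replication: each replicate has the same
# size as the original training set and is drawn with replacement, so a given
# sample may appear once, several times, or not at all.

def bootstrap_replicates(X, y, n_replicates, seed=0):
    rng = np.random.default_rng(seed)
    s = len(X)
    reps = []
    for _ in range(n_replicates):
        idx = rng.integers(0, s, size=s)  # s draws with replacement
        reps.append((X[idx], y[idx]))
    return reps
```

One RBF-NARX model would then be trained per replicate, and the resulting models aggregated by weighted averaging.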
As previously stated, the number of models in an ensemble plays an important role in its performance [13]. Thus, in order to have a fair comparison between the heterogeneous ensembles (i.e., the ensembles I and II) and the homogeneous ensemble (the ensemble III), the same number of models is selected for all of them. A weighted averaging is used to integrate the ensemble models. The weights are optimized by using the gradient descent algorithm with the objective of minimizing the RMSE on the training data.

TABLE V
COMPARISON OF VARIOUS IDENTIFICATION METHODS FOR THE COMPRESSOR PRESSURE.

              RMSE_total  RMSE_train  RMSE_test  µ_ae      σ_ae
MLP-NARX      0.0531      0.060215    0.040119   0.033786  0.040972
RBF-NARX      0.026612    0.027382    0.026087   0.021582  0.015573
SVM-NARX      0.041185    0.037001    0.046766   0.028081  0.03013
Ensemble I    0.024672    0.024926    0.024285   0.02894   0.02103
Ensemble II   0.023135    0.023115    0.023166   0.023113  0.0010121
Ensemble III  0.031684    0.027609    0.042234   0.030476  0.031543

TABLE VI
THE GRADIENT DESCENT COEFFICIENTS FOR INTEGRATION OF THE ENSEMBLE LEARNING MODELS.

      α_MLP          α_RBF    α_SVM
PC    9.5212 × 10^−5  1.0001   9.4425 × 10^−5
TC    0.0646          0.8164   0.1190
N     0.0685          1.7562   −0.8261
PT    0.0048          0.8974   0.109
TT    −0.0995         1.086    0.0131

TABLE VIII
COMPARISON OF SINGLE LEARNING MODELS AND THE ENSEMBLE METHOD FOR IDENTIFICATION OF THE COMPRESSOR PRESSURE.

             RMSE_total  RMSE_train  RMSE_test  µ_ae     σ_ae
MLP          0.053       0.060215    0.040119   0.03378  0.04097
RBF          0.02661     0.027382    0.026087   0.02158  0.01557
SVM          0.04118     0.037001    0.046766   0.02808  0.0301
Ensemble II  0.02313     0.023115    0.023166   0.02311  0.001012

TABLE IX
COMPARISON OF SINGLE LEARNING MODELS AND THE ENSEMBLE METHOD FOR IDENTIFICATION OF THE ROTATIONAL SPEED.

             RMSE_total  RMSE_train  RMSE_test  µ_ae     σ_ae
MLP          22.7076     24.7451     21.2417    19.597   11.4732
RBF          17.8349     19.7104     15.7359    14.2826  10.6831
SVM          24.6728     26.5164     22.6791    21.236   12.562
Ensemble II  6.8672      6.7947      6.9391     4.5636   3.6543

E. Comparative Results

Table V shows the comparisons between the various methods for modeling only the compressor pressure, as a typical illustration. It can be concluded that the heterogeneous ensemble with the FSS pruning scheme (ensemble II) has a better performance in modeling and identifying the unknown engine dynamics. Moreover, the stand-alone RBF-NARX model has the best performance among the stand-alone trained models.

The gradient descent method (as previously stated in Subsection III-B) is used to minimize the RMSE by optimizing the averaging weights of the learning models. The α_i's that are obtained for integrating the three individual learners are given in Table VI. A summary of the ensemble system performance for identification of each of the five gas turbine engine outputs

VIII. FAULT DETECTION AND ISOLATION CASE STUDIES

As described earlier, the engine FDI task consists of two phases. In the first phase, the engine dynamics is identified by using both single model-based and ensemble-based learning approaches, and corresponding to each methodology residuals are generated. In the second phase, the obtained residuals from both methodologies are evaluated to detect and isolate the presence of a fault. A comparative study is conducted between the two approaches to evaluate and demonstrate the advantages and capabilities of the ensemble-based methodology.

A. Fault Detection Results

Four engine component faults are considered in this work, as described in Table I. Multiple fault scenarios are considered that vary in terms of (a) the fault type, (b) the fault severity or magnitude, and (c) the fuel flow rate. The simulations are conducted where the fuel flow rate varies between ṁf = 0.7, 0.75, 0.8, 0.85 of its maximum rate. The residual signals are generated by using (a) the heterogeneous ensemble model II, and (b) the individual RBF-NARX model. The residuals are evaluated against the computed thresholds as described in Subsection IV-A. A fault is detected if any of the residuals exceeds its corresponding determined threshold.
is given in Table VII.
B. Scenario I: Faults in the Compressor Efficiency
A comparative study of the performance of the heteroge-
neous ensemble II with each of the individual learners are Consider decrease in the efficiency of the compressor faults
presented in Tables VIII through X. To summarize, one can with severities 1%, 2%, 4%, 6%, and 8% in the engine.
observe that the ensemble model demonstrates a significantly The fuel flow rate is allowed to vary within the range of
better performance in modeling and identification of the gas 70% to 85% of its maximum rate. The instant where the
turbine engine dynamics. Also, it can be concluded that faults are injected is t = 20 sec. The residuals corresponding
the RBF-NARX model has the best performance among the to both the ensemble-based II and the single-model (RBF-
individual learning models. Consequently, in the remainder NARX) based methodologies are obtained. Figures 9 and 10
of this work, the trained heterogeneous ensemble model II show comparisons between the fault detection results of the
and the RBF-NARX model will be used for generating the ensemble approach (Figure 10) and the one corresponding
residuals for performing the FDI task. A comparative study to the single-model approach (Figure 9), which indicate an
will be conducted between the performance of the ensemble- improvement in the fault detection accuracy that is achieved
based and single-model based FDI methodologies. by using the ensemble learning scheme. Note that in Figure
9 only one residual has exceeded its threshold bands and
TABLE VII
T HE PERFORMANCE OF THE HETEROGENEOUS ENSEMBLE WITH THE FSS TABLE X
PRUNING . C OMPARISON OF SINGLE LEARNING MODELS AND THE ENSEMBLE
METHOD FOR IDENTIFICATION OF THE TURBINE TEMPERATURE .
RM SE µae σae
PC 0.023135 0.023113 0.00101 RMSEtotal RMSEtrain RMSEtest µae σae
TC 0.5049 0.3883 0.1045 MLP 41.9615 40.4995 43.3752 37.1441 19.5246
TT 10.139 7.848 3.820 RBF 13.4734 15.2044 12.1843 9.8925 9.1487
PT 0.016865 0.016841 0.000896 SVM 104.3983 107.0395 102.6002 90.6818 51.7326
N 6.8672 4.563 3.654 Ensemble II 10.1397 9.8846 10.3887 7.8483 3.821
12

Tc residual Pc residual Tc residual Pc residual


5 0.4 4 0.4

0.2 2 0.2

0 0 0 0

−0.2 −2 −0.2

−5 −0.4 −4 −0.4
10 20 30 40 10 20 30 40 10 20 30 40 10 20 30 40

N residual Tt residual N residual Tt residual


100 200 100 100
← Detection time = 20.16
← Detection time = 20.16
0 100 0 50

−100 0 −100 0

−200 −100 −200 −50

−300 −200 −300 −100


10 20 30 40 10 20 30 40 10 20 30 40 10 20 30 40
Pt residual Pt residual
0.2 0.6

0.1 0.4

0 0.2

−0.1 0

−0.2 −0.2
10 15 20 25 30 35 40 10 15 20 25 30 35 40

Residual signals compressor mass: fault magnitude = 2 Residual signals compressor mass: fault magnitude = 2

Fig. 9. The residuals that are generated by using the RBF-NARX model Fig. 11. The residuals that are generated by using the RBF-NARX model
subject to a 2% decrease in the compressor efficiency injected at t = 20 sec. subject to a 2% decrease in the turbine mass flow injected at t = 20 sec.

Tc residual Pc residual Tc residual Pc residual


5 0.4 10 0.5
0 0.2
← Detection time = 20.04 0 0
−5 0

−10 −0.2 −10 −0.5

−15 −0.4 −20 −1


10 20 30 40 10 20 30 40 10 20 30 40 10 20 30 40
N residual Tt residual N residual Tt residual
200 100 100 100
150 50 ← Detection time = 20.10 50
0
100 0 0
50 −50 −100
−50
0 −100 −200 −100
10 20 30 40 10 20 30 40 10 20 30 40 10 20 30 40
Pt residual Pt residual
0.2 0.2
0.1 0.1
0 0
−0.1 −0.1
−0.2 −0.2
10 15 20 25 30 35 40 10 15 20 25 30 35 40

Residual signals compressor mass: fault magnitude = 2 Residual signals compressor mass: fault magnitude = 2

Fig. 10. The residuals that are generated by using the ensemble II model Fig. 12. The residuals that are generated by using the ensemble II model
subject to a 2% decrease in the compressor efficiency injected at t = 20 sec. subject to a 2% decrease in the turbine mass flow injected at t = 20 sec.

remained consistently outside the bands, whereas in Figure based methodologies are obtained. The results (graphs are
10 two residuals have achieved this property, leading to a not shown) confirm that the ensemble approach yields more
more reliable decision in detection of the fault. Moreover, accurate detection performance.
the other residuals in Figure 10 practically remain at all
times within their bounds, whereas the residuals in Figure 9
oscillate consistently in and out of the threshold line generating E. Scenario IV: Faults in the Turbine Mass Flow
numerous undesirable false flags.
Consider decrease in the effectiveness of the turbine mass
flow rate faults with severities 1%, 2%, 4%, 6%, and 8% in
C. Scenario II: Faults in the Compressor Mass Flow Rate the engine. The fuel flow rate is allowed to vary between 70%
Consider decrease in the effectiveness of the compressor to 85% of its maximum rate. The instant where the faults are
mass flow rate faults with severities 1%, 2%, 4%, 6%, and 8% injected is t = 20 sec. The residual signals corresponding
in the engine. The fuel flow rate is allowed to vary between to both the ensemble-based II and the single-model (RBF-
70% to 85% of its maximum rate. The instant where the NARX) based methodologies are obtained. Figures 11 and
faults are injected is t = 20 sec. The residuals corresponding 12 show comparisons between the fault detection results of
to both the ensemble-based II and the single-model (RBF- the ensemble approach (Figure 12) and the one corresponding
NARX) based methodologies are obtained. The results (graphs to the single-model approach (Figure 11), which indicate an
are not shown) confirm that the ensemble approach yields improvement in the fault detection accuracy that is achieved
more accurate detection performance. by using the ensemble learning scheme. Note that in Figure
11 only one residual has exceeded its threshold bands and
remained consistently outside the bands, whereas in Figure
D. Scenario III: Faults in the Turbine Efficiency 12 three residuals have achieved this property, leading to a
Consider decrease in the turbine efficiency fault with sever- more reliable decision in detection of the fault. Moreover,
ities 1%, 2%, 4%, 6%, and 8% in the engine. The fuel flow the other residuals in Figure 12 practically remain at all
rate is allowed to vary within the range between 70% to times within their bounds, whereas the residuals in Figure 11
85% of its maximum rate. The instant where the faults are oscillate consistently in and out of the threshold line generating
injected is t = 20 sec. The residuals corresponding to both numerous undesirable false flags.
the ensemble-based II and the single-model (RBF-NARX)
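The detection rule described above (a fault is declared when any residual signal leaves its precomputed threshold band) can be sketched as follows. This is an illustrative sketch only: the signal names, threshold bands, and residual traces are hypothetical stand-ins, not the paper's actual data.

```python
# Sketch of the residual-evaluation rule: a fault is declared when any
# residual leaves its precomputed (lower, upper) threshold band.
def detect_fault(residuals, thresholds, times):
    """Return (detected, detection_time, signal) for the first signal whose
    residual violates its threshold band, or (False, None, None)."""
    for name, trace in residuals.items():
        lo, hi = thresholds[name]
        for t, r in zip(times, trace):
            if r < lo or r > hi:
                return True, t, name
    return False, None, None

# A hypothetical fault injected at t = 20 s drives the rotational-speed
# residual out of its band; the turbine-pressure residual stays inside.
times = [round(0.02 * k, 2) for k in range(2000)]        # 0.00 .. 39.98 s
residuals = {
    "N":  [0.0 if t < 20.0 else -150.0 for t in times],  # jumps at t = 20 s
    "Pt": [0.0 for _ in times],
}
thresholds = {"N": (-100.0, 100.0), "Pt": (-0.2, 0.2)}
print(detect_fault(residuals, thresholds, times))        # → (True, 20.0, 'N')
```

In practice the thresholds would be the per-signal bands computed in Subsection IV-A; here they are arbitrary values chosen only to exercise the rule.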
TABLE XI
THE FAULT DETECTION ACCURACY OF THE SINGLE MODEL-BASED RBF-NARX SOLUTION (FAULT SEVERITY = 1%).

           Fmc     Fec   Fmt   Fet
CCR        60%     90%   100%  90%
Precision  57.14%  83%   100%  83%
TPR        40%     80%   100%  80%
FPR        20%     0%    0%    0%
TNR        80%     100%  100%  100%
FNR        60%     20%   0%    20%

TABLE XII
THE FAULT DETECTION ACCURACY OF THE ENSEMBLE-BASED II SOLUTION (FAULT SEVERITY = 1%).

           Fmc   Fec   Fmt   Fet
CCR        70%   100%  100%  100%
Precision  100%  100%  100%  100%
TPR        100%  100%  100%  100%
FPR        60%   0%    0%    0%
TNR        40%   100%  100%  100%
FNR        0%    0%    0%    0%

TABLE XV
THE FAULT DETECTION ACCURACY OF THE SINGLE MODEL-BASED RBF-NARX SOLUTION (FAULT SEVERITY = 8%).

           Fmc   Fec   Fmt   Fet
CCR        90%   100%  100%  100%
Precision  100%  100%  100%  100%
TPR        100%  100%  100%  100%
FPR        20%   0%    0%    0%
TNR        80%   100%  100%  100%
FNR        0%    0%    0%    0%

TABLE XVI
THE FAULT DETECTION ACCURACY OF THE ENSEMBLE-BASED II SOLUTION (FAULT SEVERITY = 8%).

           Fmc   Fec   Fmt   Fet
CCR        100%  100%  100%  100%
Precision  100%  100%  100%  100%
TPR        100%  100%  100%  100%
FPR        0%    0%    0%    0%
TNR        100%  100%  100%  100%
FNR        0%    0%    0%    0%
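The entries in the detection-accuracy tables above are confusion-matrix rates following the definitions given in Subsection F. A minimal sketch of those definitions is shown below; note that Precision is computed as t.n./(t.n. + f.n.), exactly as the paper states, since that is the formula that reproduces the tabulated values. The example counts (10 faulty and 10 healthy cases) are an illustrative assumption consistent with the Fmc column of Table XI, not data reported in the paper.

```python
# Confusion-matrix rates as defined in the text. Note that "Precision" is
# computed as t.n./(t.n. + f.n.), following the paper's stated formula.
def detection_metrics(tp, fp, tn, fn):
    return {
        "CCR": (tp + tn) / (tp + tn + fp + fn),
        "Precision": tn / (tn + fn),
        "TPR": tp / (tp + fn),
        "FPR": fp / (fp + tn),
        "TNR": tn / (tn + fp),
        "FNR": fn / (tp + fn),
    }

# Hypothetical counts consistent with the Fmc column of Table XI
# (fault severity = 1%), assuming 10 faulty and 10 healthy test cases:
m = detection_metrics(tp=4, fp=2, tn=8, fn=6)
print(m["CCR"], m["TPR"], m["FPR"])   # → 0.6 0.4 0.2
# m["Precision"] = 8/14 ≈ 0.5714, i.e. the 57.14% reported in Table XI.
```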

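The fault isolation results in the subsections that follow are reported through multi-class confusion matrices and the CCR metric. A minimal sketch of assembling such a matrix from classifier outputs is given below; the class list and label sequences are hypothetical, not the paper's data.

```python
def confusion_matrix(actual, predicted, classes):
    """Rows index the predicted fault class, columns the actual fault class,
    following the convention of Tables XVII and XVIII."""
    idx = {c: i for i, c in enumerate(classes)}
    n = len(classes)
    cm = [[0] * n for _ in range(n)]
    for a, p in zip(actual, predicted):
        cm[idx[p]][idx[a]] += 1
    return cm

# Hypothetical single-fault labels; one Fet sample is misclassified as Fec.
classes   = ["Fec", "Fmc", "Fmt", "Fet"]
actual    = ["Fec", "Fec", "Fmc", "Fmt", "Fet", "Fet"]
predicted = ["Fec", "Fec", "Fmc", "Fmt", "Fet", "Fec"]
cm = confusion_matrix(actual, predicted, classes)

# CCR is the fraction of samples on the diagonal (correctly classified).
correct = sum(cm[i][i] for i in range(len(classes)))
total = sum(sum(row) for row in cm)
print(f"CCR = {correct / total:.2%}")                    # → CCR = 83.33%
```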
F. Fault Detection Performance Analysis with Confusion Matrix

The following metrics are now used to quantitatively evaluate the performance of our proposed fault detection methodologies, namely: Correct Classification Ratio (CCR) = (t.p. + t.n.) / (t.p. + t.n. + f.p. + f.n.), Precision = t.n. / (t.n. + f.n.), True Positive Rate (TPR) = t.p. / (t.p. + f.n.), False Positive Rate (FPR) = f.p. / (f.p. + t.n.), True Negative Rate (TNR) = t.n. / (t.n. + f.p.), and False Negative Rate (FNR) = f.n. / (t.p. + f.n.), where t.p. (true positive) denotes the number of cases that are classified as faulty when the engine is indeed faulty; f.p. (false positive) denotes the number of cases that are classified as faulty although the engine is healthy; t.n. (true negative) denotes the number of cases that are classified as healthy when the engine is indeed healthy; and f.n. (false negative) denotes the number of cases that are classified as healthy although the engine is faulty. Tables XI through XVI provide a complete comparative analysis between the ensemble-based (Ensemble II) fault detection methodology and the single model-based RBF-NARX fault detection scheme.

TABLE XIII
THE FAULT DETECTION ACCURACY OF THE SINGLE MODEL-BASED RBF-NARX SOLUTION (FAULT SEVERITY = 2%).

           Fmc   Fec   Fmt   Fet
CCR        70%   100%  100%  100%
Precision  60%   100%  100%  100%
TPR        67%   100%  100%  100%
FPR        25%   0%    0%    0%
TNR        75%   100%  100%  100%
FNR        33%   0%    0%    0%

TABLE XIV
THE FAULT DETECTION ACCURACY OF THE ENSEMBLE-BASED II SOLUTION (FAULT SEVERITY = 2%).

           Fmc   Fec   Fmt   Fet
CCR        80%   100%  100%  100%
Precision  80%   100%  100%  100%
TPR        80%   100%  100%  100%
FPR        20%   0%    0%    0%
TNR        80%   100%  100%  100%
FNR        20%   0%    0%    0%

G. Single Model-based Fault Isolation Results

The residuals that are generated corresponding to the single-model RBF-NARX approach are utilized for achieving the fault isolation task. The fault scenarios are described in the previous subsections, and the data samples associated with these scenarios are collected (a total of 100 scenarios are considered for the sake of illustration). Among the collected data we randomly select 50 samples (after experimenting with different training data sizes, namely 30, 40, 50, and 60 samples). Starting from a small structure, we construct a neural network classifier to perform the fault isolation task. We observe that an acceptable performance is achieved with a single hidden layer MLP having 15 hidden neurons. Table XVII provides the corresponding confusion matrix for the testing results of the available data. The obtained performance results in terms of the CCR metric are as follows: CCR_training = 88%, CCR_testing = 80%, and CCR_total = 84%.

TABLE XVII
THE TESTING DATA CONFUSION MATRIX BY USING THE SINGLE MODEL-BASED (RBF-NARX) FAULT ISOLATION (ROWS: PREDICTED FAULT CLASS; COLUMNS: ACTUAL FAULT CLASS).

         Fec>3%  Fec<3%  Fmc>3%  Fmc<3%  Fmt>3%  Fmt<3%  Fet>3%  Fet<3%
Fec>3%   7       0       0       0       1       0       1       0
Fec<3%   0       3       0       0       0       0       0       0
Fmc>3%   0       0       6       0       0       0       0       0
Fmc<3%   0       0       0       5       0       0       0       0
Fmt>3%   1       0       0       0       5       1       1       2
Fmt<3%   0       0       0       0       0       4       0       0
Fet>3%   0       0       0       0       0       0       5       0
Fet<3%   0       0       0       0       0       0       3       5

H. Ensemble-based Fault Isolation Results

The residuals that are generated corresponding to the ensemble-based (Ensemble II) approach are utilized for achieving the fault isolation task. The fault scenarios, the sample size of 50 data samples, and the MLP classifier structure are the same as in the previous subsection. Table XVIII provides the corresponding confusion matrix for the testing results of the available data. The performance results in terms of the CCR metric are as follows: CCR_training = 98%, CCR_testing = 100%, and CCR_total = 98.75%.

TABLE XVIII
THE TESTING DATA CONFUSION MATRIX BY USING THE ENSEMBLE-BASED II FAULT ISOLATION (ROWS: PREDICTED FAULT CLASS; COLUMNS: ACTUAL FAULT CLASS).

         Fec>3%  Fec<3%  Fmc>3%  Fmc<3%  Fmt>3%  Fmt<3%  Fet>3%  Fet<3%
Fec>3%   9       0       0       0       0       0       0       0
Fec<3%   0       6       0       0       0       0       0       0
Fmc>3%   0       0       4       0       0       0       0       0
Fmc<3%   0       0       0       5       0       0       0       0
Fmt>3%   0       0       0       0       5       0       0       0
Fmt<3%   0       0       0       0       0       4       0       0
Fet>3%   0       0       0       0       0       0       4       0
Fet<3%   0       0       0       0       0       0       0       3

I. Single Model-based Multiple Faults Isolation Results

The residuals that are generated corresponding to the single-model RBF-NARX approach are utilized for achieving the fault isolation of multiple concurrent faults. The fault scenarios, the sample size of 50 data samples, and the MLP classifier structure are the same as in the previous subsection. Table XIX provides the confusion matrix for the testing of the available data. The performance results in terms of the CCR metric are as follows: CCR_training = 98.33%, CCR_testing = 87.5%, and CCR_total = 94%.

TABLE XIX
THE TESTING DATA CONFUSION MATRIX BY USING THE SINGLE MODEL-BASED (RBF-NARX) FAULT ISOLATION (ROWS: ACTUAL FAULT CLASS; COLUMNS: PREDICTED FAULT CLASS).

         Fec  Fmc  Fmt  Fet  Fmc,Fec  Fmc,Fmt  Fmc,Fet  Fec,Fet  Fec,Fmt  Fmt,Fet
Fec      4    0    0    0    0        0        0        0        0        0
Fmc      0    11   0    0    0        2        0        0        0        0
Fmt      0    0    8    0    0        0        0        0        0        0
Fet      0    0    0    8    0        0        0        0        0        0
Fmc,Fec  0    0    0    0    6        0        0        0        0        0
Fmc,Fmt  0    0    0    0    0        5        0        0        0        0
Fmc,Fet  0    0    0    0    0        0        8        0        0        0
Fec,Fet  8    0    0    0    0        0        0        7        0        0
Fec,Fmt  0    0    0    0    0        0        0        0        5        0
Fmt,Fet  0    0    0    0    0        0        0        0        0        8

TABLE XX
THE TESTING DATA CONFUSION MATRIX BY USING THE ENSEMBLE-BASED II MULTIPLE FAULT ISOLATION (ROWS: ACTUAL FAULT CLASS; COLUMNS: PREDICTED FAULT CLASS).

         Fec  Fmc  Fmt  Fet  Fmc,Fec  Fmc,Fmt  Fmc,Fet  Fec,Fet  Fec,Fmt  Fmt,Fet
Fec      9    0    0    0    0        0        0        0        0        0
Fmc      0    11   0    0    0        2        0        0        0        0
Fmt      0    0    8    0    0        0        0        0        0        0
Fet      0    0    0    8    0        0        0        0        0        0
Fmc,Fec  0    0    0    0    6        0        0        0        0        0
Fmc,Fmt  0    0    0    0    0        5        0        0        0        0
Fmc,Fet  0    0    0    0    0        0        8        0        0        0
Fec,Fet  3    0    0    0    0        0        0        7        0        0
Fec,Fmt  0    0    0    0    0        0        0        0        5        0
Fmt,Fet  0    0    0    0    0        0        0        0        0        8

J. Ensemble-based Multiple Faults Isolation Results

The residuals that are generated corresponding to the ensemble-based (Ensemble II) approach are utilized for achieving the fault isolation of multiple concurrent faults. The fault scenarios, the sample size of 50 data samples, and the MLP classifier structure are the same as in the previous subsections. Table XX shows the confusion matrix for the testing of all the available

data. The performance results in terms of the CCR metric are as follows: CCR_training = 99.17%, CCR_testing = 93.75%, and CCR_total = 97%.

IX. CONCLUSION

In this paper, a new approach for the fault detection and isolation (FDI) of a gas turbine engine is proposed by using an ensemble of neural networks. The proposed ensemble methods integrate various identification models to ensure a reduction in the modeling error and an increase in the prediction accuracy. By combining individual neural network models, enhanced robustness and accurate representations are almost always achievable without the need for the ad hoc, labor-intensive, and time-consuming fine tuning that is required for single neural network solutions. To accomplish engine health monitoring, the engine dynamics is first identified and represented by using three different stand-alone neural network learning algorithms. Specifically, a dynamic MLP, a dynamic RBF neural network, and a dynamic SVM are trained to individually identify the engine dynamics. Three ensemble-based schemes are then proposed and developed for representing the gas turbine engine dynamics, namely two heterogeneous ensemble models and one homogeneous ensemble model. It is first concluded that all the heterogeneous ensemble models improve the system identification modeling accuracy when compared to the stand-alone solutions. The best stand-alone model (i.e., the dynamic RBF neural network) and the best ensemble model (i.e., a heterogeneous ensemble) in terms of the engine modeling accuracy are selected to perform the FDI task. The residuals are obtained by using both the single and the ensemble-based methodologies under various engine health conditions to detect the component faults. Our simulation results demonstrate that the residuals that are obtained from the ensemble approach result in a more accurate fault detection performance. The fault isolation task is then performed by evaluating variations in the ensemble residual signals (before and after a fault detection flag is issued) by using an MLP neural network classifier. As in the fault detection results, it is concluded through extensive simulation studies that the ensemble-based fault isolation methodology results in a more promising, accurate, and reliable performance.

REFERENCES

[1] T. Kobayashi and D. L. Simon, "Application of a bank of Kalman filters for aircraft engine fault diagnostics," Proceedings of the ASME Turbo Expo, pp. 461–470, 2003.
[2] E. Naderi, N. Meskin, and K. Khorasani, "Nonlinear fault diagnosis of jet engines by using a multiple model-based approach," Transactions of the ASME, Journal of Engineering for Gas Turbines and Power, vol. 134, pp. 319–329, 2012.
[3] S. S. Tayarani-Bathaie, Z. S. Vanini, and K. Khorasani, "Dynamic neural network-based fault diagnosis of gas turbine engines," Neurocomputing, vol. 125, pp. 153–165, 2014.
[4] V. Venkatasubramanian, R. Rengaswamy, K. Yin, and S. N. Kavuri, "A review of process fault detection and diagnosis: Part I: Quantitative model-based methods," Computers & Chemical Engineering, vol. 27, no. 3, pp. 293–311, 2003.
[5] I. Loboda, Y. Feldshteyn, and V. Ponomaryov, "Neural networks for gas turbine fault identification: Multilayer perceptron or radial basis network?," Proceedings of the ASME Turbo Expo, pp. 465–475, 2011.
[6] H. Xiao, N. Eklund, K. Goebel, and W. Cheetham, "Hybrid change detection for aircraft engine fault diagnostics," Proceedings of the IEEE Aerospace Conference, pp. 1–10, 2007.
[7] J. Zhang, "Improved on-line process fault diagnosis through information fusion in multiple neural networks," Computers & Chemical Engineering, vol. 30, pp. 558–571, 2005.
[8] W. Yan and F. Xue, "Jet engine gas path fault diagnosis using dynamic fusion of multiple classifiers," Proceedings of the IEEE World Congress on Computational Intelligence, pp. 1585–1591, 2008.
[9] F. Lu, T. B. Zhu, and Y. Q. Lv, "Data-driven based gas path fault diagnosis for turbo-shaft engine," Applied Mechanics and Materials, vol. 249, pp. 400–404, 2012.
[10] N. C. Oza, K. Tumer, I. Y. Tumer, and E. M. Huff, "Classification of aircraft maneuvers for fault detection," Multiple Classifier Systems, Lecture Notes in Computer Science, pp. 375–384, 2003.
[11] C. Zhang and Y. Ma, Ensemble Machine Learning: Methods and Applications. Springer, 2012.
[12] A. J. C. Sharkey and N. E. Sharkey, "Combining diverse neural nets," Knowledge Engineering Review, vol. 12, pp. 1–17, 1997.
[13] L. K. Hansen and P. Salamon, "Neural network ensembles," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 12, no. 10, pp. 993–1001, 1990.
[14] R. Polikar, "Ensemble learning," Ensemble Machine Learning, pp. 1–34, 2012.
[15] L. Breiman, "Bagging predictors," Machine Learning, vol. 24, pp. 123–140, 1996.
[16] R. Avnimelech and N. Intrator, "Boosting regression estimators," Neural Computation, vol. 11, pp. 491–513, 1999.
[17] H. Drucker, C. Cortes, L. D. Jackel, Y. LeCun, and V. Vapnik, "Boosting and other ensemble methods," Neural Computation, vol. 6, pp. 1289–1301, 1994.
[18] R. A. Jacobs, M. I. Jordan, S. J. Nowlan, and G. E. Hinton, "Adaptive mixtures of local experts," Neural Computation, vol. 3, pp. 79–87, 1991.
[19] H. Xiao, N. Eklund, and K. Goebel, "A data fusion approach for aircraft engine fault diagnostics," Proceedings of the ASME Turbo Expo, pp. 767–775, 2007.
[20] A. Varma, P. Bonissone, W. Yan, N. Eklund, K. Goebel, N. Iyer, and S. Bonissone, "Anomaly detection using non-parametric information," Proceedings of the ASME Turbo Expo, pp. 813–821, 2007.
[21] W. Donat, K. Choi, W. An, S. Singh, and K. Pattipati, "Data visualization, data reduction and classifier fusion for intelligent fault detection and diagnosis in gas turbine engines," Proceedings of the ASME Turbo Expo, pp. 883–892, 2007.
[22] J. Huang and M. Wang, "Multiple classifiers combination model for fault diagnosis using within-class decision support," Proceedings of Information Science and Management Engineering (ISME), pp. 226–229, 2010.
[23] A. J. C. Sharkey, "Types of multinet system," Multiple Classifier Systems, Lecture Notes in Computer Science, pp. 108–117, 2002.
[24] A. J. C. Sharkey, G. O. Chandroth, and N. E. Sharkey, "A multi-net system for the fault diagnosis of a diesel engine," Neural Computing and Applications, vol. 9, pp. 152–160, 2000.
[25] Y. Lei, M. J. Zuo, Z. He, and Y. Zi, "A multidimensional hybrid intelligent method for gear fault diagnosis," Expert Systems with Applications, vol. 37, pp. 1419–1430, 2010.
[26] C. Ren, J. F. Yan, and Z. H. Li, "Improved ensemble learning in fault diagnosis system," IEEE International Conference on Machine Learning and Cybernetics, vol. 1, pp. 54–60, 2009.
[27] C. M. Bishop, Pattern Recognition and Machine Learning. Springer, 2006.
[28] T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning. Springer, 2008.
[29] A. Krogh and J. Vedelsby, "Neural network ensembles, cross validation, and active learning," Advances in Neural Information Processing Systems, pp. 231–238, 1995.
[30] G. Brown, J. Wyatt, R. Harris, and X. Yao, "Diversity creation methods: a survey and categorisation," Information Fusion, vol. 6, pp. 5–20, 2005.
[31] J. Mendes-Moreira, C. Soares, A. M. Jorge, and J. F. D. Sousa, "Ensemble approaches for regression: A survey," ACM Computing Surveys (CSUR), vol. 45, pp. 101–114, 2012.
[32] I. A.-D. Al-Zyoud and K. Khorasani, "Detection of actuator faults using a dynamic neural network for the attitude control subsystem of a satellite," Proceedings of the International Joint Conference on Neural Networks, 2005.
[33] A. Valdes, K. Khorasani, and L. Ma, "Dynamic neural network-based fault detection and isolation for thrusters in formation flying of satellites," Advances in Neural Networks – ISNN 2009: 6th International Symposium on Neural Networks, 2009.
[34] R. Mohammadi, E. Naderi, K. Khorasani, and S. Hashtrudi-Zad, "Fault diagnosis of gas turbine engines by using dynamic neural networks," Proceedings of the ASME Turbo Expo, 2010.
[35] Z. N. S. Vanini, K. Khorasani, and N. Meskin, "Fault detection and isolation of a dual spool gas turbine engine using dynamic neural networks and multiple model approach," Information Sciences, vol. 259, pp. 234–251, 2014.
[36] A. Yazdizadeh and K. Khorasani, "Adaptive time delay neural network structures for nonlinear system identification," Neurocomputing, vol. 47, no. 4, pp. 207–240, 2002.
[37] S. M. Camporeale, B. Fortunato, and M. Mastrovito, "A modular code for real time dynamic simulation of gas turbines in Simulink," Journal of Engineering for Gas Turbines and Power, vol. 128, no. 3, pp. 506–517, 2006.
[38] V. Panov, "GasTurboLib: Simulink library for gas turbine engine modelling," Proceedings of the ASME Turbo Expo, vol. 1, 2009.
[39] S. A. Billings, Nonlinear System Identification: NARMAX Methods in the Time, Frequency, and Spatio-Temporal Domains. Wiley, 2013.
[40] S. Chen, S. A. Billings, and P. M. Grant, "Non-linear system identification using neural networks," International Journal of Control, vol. 51, pp. 1191–1214, 1990.
[41] S. Chen, X. X. Wang, and C. J. Harris, "NARX-based nonlinear system identification using orthogonal least squares basis hunting," IEEE Transactions on Control Systems Technology, vol. 16, pp. 78–84, 2008.
[42] R. Salat, M. Awtoniuk, and K. Korpysz, "Black-box system identification by means of support vector regression and imperialist competitive algorithm," Przegląd Elektrotechniczny, vol. 9, pp. 223–226, 2013.
[43] F. Roli, G. Giacinto, and G. Vernazza, "Methods for designing multiple classifier systems," Multiple Classifier Systems, vol. 2096, pp. 78–87, 2001.
[44] D. Opitz and R. Maclin, "Popular ensemble methods: An empirical study," Journal of Artificial Intelligence Research, vol. 11, pp. 169–198, 1999.
[45] P. Domingos, "Why does bagging work? A Bayesian account and its implications," International Conference on Knowledge Discovery and Data Mining, pp. 155–158, 1997.
[46] P. M. Granitto, P. F. Verdes, and H. A. Ceccatto, "Neural network ensembles: evaluation of aggregation algorithms," Artificial Intelligence, vol. 163, pp. 139–162, 2005.
[47] G. I. Webb and Z. Zheng, "Multistrategy ensemble learning: reducing error by combining ensemble learning techniques," IEEE Transactions on Knowledge and Data Engineering, vol. 16, pp. 980–991, 2004.
[48] G. P. Coelho and F. J. Von Zuben, "The influence of the pool of candidates on the performance of selection and combination techniques in ensembles," International Joint Conference on Neural Networks, pp. 10588–10595, 2006.
[49] C. J. Merz and M. J. Pazzani, "A principal components approach to combining regression estimates," Machine Learning, vol. 36, pp. 9–32, 1999.
[50] M. P. Perrone and L. N. Cooper, "When networks disagree: Ensemble methods for hybrid neural networks," Neural Networks for Speech and Image Processing, pp. 126–142, 1994.
[51] J. Kivinen and M. K. Warmuth, "Exponentiated gradient versus gradient descent for linear predictors," Information and Computation, vol. 132, pp. 1–63, 1997.
[52] S. B. Kotsiantis and P. E. Pintelas, "Selective averaging of regression models," Annals of Mathematics, Computing & Teleinformatics, vol. 1, pp. 65–74, 2005.
[53] W. Yates and D. Partridge, "Use of methodological diversity to improve neural network generalization," Neural Computing and Applications, vol. 4, pp. 114–128, 1996.
