Professional Documents
Culture Documents
9, 2018.
Digital Object Identifier 10.1109/ACCESS.2018.2806373
ABSTRACT Air handling systems are the key sub-systems of heating ventilation and air condition-
ing (HVAC) systems. They condition and deliver air to satisfy human thermal comfort requirements and
provide acceptable indoor air quality. Faults in their components and sensors may lead to high-energy
consumption, poor thermal comfort, and unacceptable indoor air quality. Additionally, new types of faults
may falsely be identified as known types. Identifying failure modes and their severities with low false identi-
fication rates is thus critical to know what faults occur and how severe they are. However, this is challenging,
since 1) classifying both failure modes and fault severities generates many categories of failures, leading to
high computational requirements; 2) updating model parameters to adapt to changing environments requires
accurate recursive equations that are hard to obtain; and 3) model errors and measurement noise may cause
high false identification rates in detecting new types of faults. In this paper, failure modes are identified by
hidden Markov models (HMMs) and fault severities are estimated by filtering methods, leading to a decrease
in the number of HMM states and low computational requirements. To adapt to changing environments,
a new online learning algorithm is developed. In this algorithm, HMM parameters are obtained based on their
posterior distributions given new observations, thereby avoiding the need for accurate recurrence equations.
To identify new fault types with low false identification rates, a robust statistical method is developed to
compare current HMM observations with those expected from existing states to obtain potential new types,
and then confirm new types by checking whether observations have a significant change. Physical knowledge
is then used to find the reason for the new fault type. Experimental results show that failure modes and fault
severities of both known and new types of faults are identified with high accuracy.
INDEX TERMS Fault diagnosis, HVAC air handling system, online learning algorithm, hidden Markov
model, new fault types.
I. INTRODUCTION from air, and cooling/heating coils condition the mixed air
Heating, Ventilation and Air-Conditioning (HVAC) systems to satisfy human thermal comfort requirements. The supply
account for 57% of energy used in the U.S. commercial and fan delivers conditioned air to Variable Air Volume (VAV)
residential buildings [1]. Air handling systems are key sub- boxes, which control temperatures and airflow rates deliv-
systems in HVACs. They condition and deliver air to rooms ered to rooms. The return fan then delivers return air from
to satisfy human thermal comfort and provide acceptable rooms to the outside and to the mixing box. In the air han-
indoor air quality. An air handling system consists of a mixing dling system, various sensors measure air/water flow rates,
box, filters, cooling/heating coils, ducts and fans, as shown temperatures and humidity ratios, including the supply air
in Fig. 1. The mixing box consists of an Exhausted Air (EA) temperature sensor, the supply air humidity ratio sensor, the
damper, a Return Air (RA) damper and an Outdoor Air (OA) supply air mass flow rate sensor, the return air temperature
damper; and controls the ratio of the return airflow to the sensor, the return air humidity ratio sensor and the return
outdoor airflow. Filters remove solid particulates such as dust air mass flow rate sensor, as shown in Fig .1. Given sensor
2169-3536
2018 IEEE. Translations and content mining are permitted for academic research only.
21682 Personal use is also permitted, but republication/redistribution requires IEEE permission. VOLUME 6, 2018
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Y. Yan et al.: Fault Diagnosis of Components and Sensors in HVAC Air Handling Systems
Section 3 cannot detect and find reasons of new fault types. Additionally, rules are usually fixed and may not adapt to
In Section 4, a method is developed to distinguish new fault changing environments, resulting in false identifications.
types from existing ones, and then find reasons of detected In model-based methods, models are established to cap-
faults and estimate their severities based on physical knowl- ture key features of systems, and can be generally cate-
edge as shown in Fig. 2. To detect new fault types, the gorized into three types, including (1) black-box models;
HMM-based method in Section 3 is improved by adding (2) statistical models; and (3) physics-based models. Black-
new HMM states dynamically to capture new fault types. box models are established only based on data without con-
Because of model errors and measurement noise, it is hard sidering physical knowledge. Black-model-based methods,
to detect new fault types with low false identification rates. e.g., Principal Component Analysis (PCA), Artificial Neu-
Considering that occurrence of a new type is a state tran- ral Networks (ANN), Support Vector Machine (SVM) and
sition, it causes large changes in observations. Addition- extreme learning machine, are widely used in fault diagno-
ally, observations corresponding to new types deviate from sis of air handling systems. For instance, PCA models are
those belonging to existing types. Thus a robust method is established for various sub-systems based on normal data.
developed to distinguish new types from known types in By comparing new data with these models, anomalies are
these two perspectives. To check changes in observations, detected and the fault source is identified according to mod-
Kullback-Leibler (KL) divergence is used, since it measures els [4]. In [10], a wavelet-PCA method was developed to
how the distribution diverges from that represented by previ- diagnose sudden and gradual faults of an AHU by removing
ous observations. To test whether current observations deviate influence of weather conditions. Compared with PCA, ANNs
from those of existing types, a robust Bayes-factor-based are good at classifying failure modes based on training data,
hypothesis testing is developed. To find the reason of the and are widely used for HVACs [11]–[13]. Based on single
detected fault, changes of fault-related sensor readings are hidden-layer feed-forward neural networks, extreme learning
analyzed based on physical knowledge to find possible fault machine was developed by randomly selecting features for
types as shown in Fig. 2. The new fault type is identified by the hidden units. It has much faster learning speed compared
eliminating known types from possible ones. with traditional ANNs, and was used to diagnose faults of
In Section 5, simulation data of a small building and AHUs [14]. Like ANNs, SVMs are also good at classifica-
ASHRAE-1312 data are used to test our method. Experimen- tion. They were applied to classify faults of AHUs based
tal results show that our method can identify known types and on estimates of fault-related model parameters [15]. These
new types of failure modes and their severities with low false methods classify normal and failure modes only based on
identification rates. sensor readings or features extracted from them. State evo-
lutions obtained from physical knowledge represent relation-
II. LITERATURE REVIEW ships among states, and thus help to identify current states.
To diagnose known types of faults in air handling systems, However, state evolutions are rarely considered in these meth-
many methods were developed. However, papers that inves- ods. Moreover, structures of these classifiers are usually fixed
tigate diagnosing new fault types in air handling systems are and are not updated, and thus need to be retrained when new
rare. fault types occur. Thus, they may not adapt to changing envi-
ronments, leading to false identifications. Most of existing
A. METHODS FOR DIAGNOSING KNOWN FAULT methods focus on identifications of failure modes, but few
TYPES OF COMPONENTS AND SENSORS of them consider estimating fault severities.
IN AIR HANDLING SYSTEMS Compared to black-box models, HMMs capture state evo-
To diagnose known types of faults, multiple methods lutions and distributions of sensor readings given states by
were developed, and are generally categorized into two statistical models. Conditions of components and sensors are
types: model-free and model-based. Model-free methods, considered as states of HMMs, and estimated to identify fail-
e.g., expert systems and decision trees, were developed based ure modes [16], [17]. However, if HMMs are used to identify
on physical knowledge and experience without establishing both failure modes and fault severities, many HMM states are
models. These methods infer faults by investigating cause- generated, leading to high computational requirements. Addi-
effect relationships between faults and their impacts. For tionally, structures and parameters of HMMs used for fault
example, expert systems employ physical knowledge and diagnosis are usually fixed, and thus cannot adapt to changing
experience to generate if-then-else rules, and are widely used environments [18]. To address this issue, several online algo-
in HVACs, and in particular, air handling systems [6]–[8]. rithms were developed by updating HMM parameters based
Decision trees established based on physical knowledge are on the Baum-Welch (BW) algorithm [19], [20]. In these algo-
also used. For instance, a decision tree was used to diagnose rithms, recurrence equations are developed based on homo-
sudden and gradual faults of an Air Handling Unit (AHU) geneity over a time window and large sample assumptions.
based on data from the ASHRAE project 1312-RP [9]. However, these assumptions may not always be satisfied.
These methods have good explanatory capabilities for fault Filtering methods, such as KFs and PFs capture both state
inference. However, developing rules for a specific system evolutions, and relationships between states and sensor read-
requires expertise and knowledge that may not be available. ings, by physics-based models. Fault-related parameters are
modeled as states and are estimated to check whether they A. HMMS TO IDENTIFY FAILURE MODES
fall outside their control limits or not. If failure modes have To identify failure modes and fault severities of compo-
different signatures, they are diagnosed [21]. However, exist- nents and sensors, it is important to estimate their states,
ing models may not contain enough fault-related parameters i.e., the normal condition or failure modes. States are esti-
to diagnose all faults considered. Additionally, estimates of mated based on certain fault-related sensor readings including
parameters are continuous. They can reflect fault severities (a) temperatures, humidity ratios and mass flow rates of
with higher resolution when compared to discrete state esti- outdoor air, supply air and return air; (b) temperatures and
mates obtained by HMMs. mass flow rates of chilled water; and (c) power of supply
and return fans. Additionally, some sensor readings cannot
B. METHODS FOR DIAGNOSING NEW FAULT TYPES track set-points when faults occur. Consequently, residuals
Diagnosing new fault types has two steps: detecting and between these sensor readings and set-points are also consid-
finding reasons. To detect new fault types, certain statistical ered. Relationships among sensor readings are represented by
hypothesis testing methods can be used. For instance, detect- physics-based/gray-box models. A sensor reading can be esti-
ing new types of faults is essentially equivalent to testing mated based on a model given other readings. Sensor readings
whether current sensor readings and previous ones are from and their estimates obtained from models are consistent under
different distributions. Bayes-factor-based hypothesis testing the normal condition, but not under faulty conditions. Resid-
can be used to make such inferences [22]. In this method, uals between sensor readings and their estimates represent
marginal probabilities of belonging to a known distribution the parity (consistency) relationships. Consequently, sensor
and that belonging to a different one are calculated. The readings and residuals mentioned above are considered as
ratio between the two probabilities is compared with one HMM observations. In this subsection, a cooling coil and
that is a constant threshold to check which case is more sensors related to return air are used as examples to show how
likely. Because of the constant threshold, model errors and to establish HMMs for components and sensors.
measurement noise may cause high false identification rates.
Additionally, analytical expressions of marginal probabilities
for certain distributions are hard to obtain. To detect new
fault types in high voltage electronic and power equipment,
a novel clustering method, integrating Gaussian mixture mod-
els and k-means, was developed [23]. New fault types are
detected based on confidence scores. In this method, no phys-
ical knowledge is considered for classification. Additionally,
the classification is only based on sensor readings and state
evolutions are not considered. Infinite HMMs, developed
using Dirichlet processes, can also be used to detect new FIGURE 3. A typical cooling coil.
fault types [24]. In these HMMs, the number of states is
determined based on data. They can capture new fault types
by adding new states. However, they are purely data-driven, 1) HMM OF COOLING COILS
and no physical knowledge is used. In our problem, as in A cooling coil is a coiled arrangement of tubes for heat
most engineering problems, physical knowledge and expe- transfer between chilled water and air as shown in Fig. 3.
rience with failure modes are available. It is important to Fins are used to increase the heat transfer area. Valves are
consider physical knowledge to improve robustness of the adjusted to control the chilled water mass flow rate. Chilled
method under model errors and measurement noise. Papers water flows through tubes, and air passes through fins. Air
that investigate identifying reasons of new fault types are rare. temperature is reduced through heat transfer between air and
chilled water.
III. IDENTIFICATION OF FAILURE MODES AND FAULT To identify failure modes, an HMM of cooling coils is
SEVERITIES FOR KNOWN FAULT TYPES established as shown in Fig. 4.
In this section, HMMs and filtering methods are developed For the cooling coil, four failure modes are considered,
to identify known types of failure modes and their sever- including (a) tube fouling; (b) dust on fins; (c) valve stuck
ities, respectively. In subsection III-A, HMMs of compo- closed; and (d) valve stuck open. Since failure modes may
nents and sensors are established while capturing coupling be concurrent, the HMM state is set as (fcc,tube , fcc,fin ,
among them. In subsection III-B, an online learning fcc,vlvs c , fcc,vlvs o ), where tube fouling is denoted by fcc,tube ;
algorithm is developed to estimate HMM states to dust on fins is denoted by fcc,fin ; valve stuck closed is denoted
identify failure modes while adapting to changing envi- by fcc,vlvs c ; and valve stuck open is denoted by fcc,vlvs o . State
ronments. In subsection III-C, fault-related parameters and Scc = 0 is equivalent to (fcc,tube , fcc,fin , fcc,vlvs c , fcc,vlvs o ) =
deviations of sensor readings from their normal values are ‘00000 which means that no faults present in the cooling coil.
estimated to identify fault severities of components and (fcc,tube , fcc,fin , fcc,vlvs c , fcc,vlvs o ) = ‘11110 means that all faults
sensors, respectively. have occurred. The HMM thus has 24 = 16 states denoted
ith VAV box; and M is the number of VAV boxes. The estimate used to deduce posterior distribution of HMM parameters
will deviate from ma,rn if a sensor fault occurs. Thus the based on updated observation sets and state estimates. HMM
residual m̂a,rn −ma,rn is used in the HMM. Similarly, estimate parameters are then drawn from their posterior distributions
of the return air temperature Ta,rn can also be calculated based to replace their previous values. The method of updating
on zone temperatures of VAV boxes as, HMM parameters is described below.
XM X
M
T̂a,rn = ( ṁa,vavi · Tzone,i ) ṁa,vavi , (6) 1) UPDATE OBSERVATION SETS OF HMMS
i=1 i=1
Assume that the current time is t + 1, observation sets for
where Tzone,i is the temperature of the ith zone. Residu- different states are available as well as state estimates from
als between estimates and measurements are considered as the beginning to time t. To update observation sets at time
HMM observations. For the return air humidity ratio sensor, t + 1, the new observation Ot+1 should be added in the
the humidity ratio is estimated as corresponding observation set. Thus it is important to know
X
XM M the state to which Ot+1 belongs. To estimate the state, the
Ŵa,rn = ( ṁa,vavi · Wzone,i ) ṁa,vavi , (7)
i=1 i=1 Viterbi algorithm is usually used since it gives the probability
of the most likely sequence of states. However, it is difficult
where Wzone,i is the humidity ratio of the ith zone. The residual
to use the algorithm in this case since HMM parameters at
between the estimate and the measurement is considered. For
time t + 1 are yet to be derived. Unlike using the Viterbi
each sensor, drift and bias are considered as failure modes,
algorithm, using the forward variable αt+1 (i) provides us
and thus the HMM of a sensor has 22 = 4 states. Sim-
with a probability of the partial observation sequence until
ilarly, HMMs of other sensors are established. Since each
time t + 1, with State i at time t + 1,
component or sensor has its own HMM, concurrent faults of
different components and sensors can be identified separately XN
based on corresponding HMMs. αt+1 (i) = [ αt (j)aji,t ]bi (Ot+1 ). (8)
j=1
2) UPDATE PARAMETERS OF HMMS AND ESTIMATE STATES where L is the window length. The computational complexity
After updating observation sets, Bayesian rule is used to infer of the Viterbi algorithm is O(N 2 T ), where N is the number
posterior distributions of HMM parameters given their prior of states and T is the length of the state sequence. In our
distributions. As discussed in [18], the prior probability of the method, for each t, the state sequence in the moving window
mean of observations corresponding to Scc = i and Ssf = j is needs to be estimated, thus the complexity is O(N 2 TL). The
normally distributed state estimate at the end of the sequence is considered as the
X−1 estimate at time t + 1.
µi |ssf =j . . . ∼ N (µ0,i , ). (9)
0,i
3) EVALUATE STATE ESTIMATES
Based on the Bayes rule, the posterior distribution is
To measure fault diagnosis accuracy, F-measure is used since
derived as
X it reflects the precision and recall (correct/false identification
µi |ssf =j Occ . . . ∼ N (µ̄i |ssf =j , |ssf =j ), (10) rates) [15]. To generate F-measures, four statistical mea-
i
sures related to identification rates are considered, including
with (a) true positive; (b) false positive; (c) true negative; and
X X−1 (d) false negative. True positive means that the normal condi-
µ̄i |ssf =j = (ni |ssf =j + )−1
i 0i tion is correctly estimated; true negative means that the failure
X−1 X−1 mode is correctly estimated; false positive means that the
·(ni |ssf =j ōi |ssf =j + ·µ0,i ), (11) normal condition is falsely estimated as a failure mode; and
i 0i
false negative means that the failure mode is falsely estimated
and
as the normal condition or other failure modes. The larger
ni
!,
X the F-measure is, the better the performance of a diagnosis
ōi |ssf =j = ok |ssf =j ni |ssf =j , (12)
method is.
k=1
where ni |ssf =j is the number of observations corresponding C. METHOD OF IDENTIFYING SEVERITIES
to State i of the cooling coil and State j of the supply fan; OF KNOWN FAULT TYPES
and ōi |ssf =j is the average value of the observations. The prior For components, deviations of fault-related parameters from
density of the covariance matrix of observations correspond- their normal values represent severities of fault impacts. For
ing to Scc = i and Ssf = j are assumed to follow an Inverse sensors, deviations of sensor readings from their actual values
Wishart (IW) distribution. The prior distribution is reflect fault severities. Filter-based methods including KF and
PF are developed to estimate fault severities of components
X
|ssf =j ∼ IW (9i , υi ) . (13)
i and sensors. For linear models, KF is used since it is an
Then, the posterior distribution is derived as, optimal linear filter. For the nonlinear case, PF is used since
X ni
X it is good at dealing with nonlinearities. Our methods are
|ssf =j Occ ∼ IW 9 + (ok |ssf =j − µi |ssf =j ) discussed below.
i
k=1
!
1) IDENTIFY FAULT SEVERITIES OF COMPONENTS
×(ok |ssf =j − µi |ssf =j ) , υi + ni |ssf =j .
T
As presented before, four faults in cooling coils are consid-
ered, including (a) tube fouling; (b) dust on fins; (c) valve
(14) stuck closed; and (d) valve stuck open. Tube fouling causes
As justified in [18], probabilities of state transitions follow a a decrease in the tube inside diameter dtube,in . Similarly,
Dirichlet distribution. The prior distribution is dust accumulation leads to a decrease in the fin outside
surface area Afin . Severities of these two faults are reflected
(ai,0 , . . . , ai,Ncc −1 )|ssf =j ∼ Dir(1, . . . , 1). (15) as the amount of decease in the two parameters. To esti-
mate the parameters, they are considered as parametric states
Given the number of visits to each state, the posterior distri-
xcc = [dtube,in Afin ]T . The state evolution is
bution is also Dirichlet and is given by:
(ai,0 , . . . , ai,Ncc −1 )|ssf =j xcc (t + 1) = xcc (t) + vcc (t), (17)
∼ Dir(ni0 |ssf =j + 1, . . . , ni,Ncc −1 |ssf =j + 1), (16) where process noise is denoted by vcc (t) = [vtube (t)vfin (t)]T .
For simplification, process noise is assumed white, zero mean
where ni0 |ssf =j is the number of transitions from State i and normally distributed, and vtube (t) and vfin (t) are uncorre-
to State 0 (normal condition) when the supply fan is in lated. To represent the relationship between the two paramet-
State j. These values are obtained by counting visits to each ric states and sensor readings, the measurement equation is
state given state estimates before time t + 1. Given updated derived from the physics-based model (1):
parameters, the Viterbi algorithm is used to estimate states in
the current moving window consisting of Ot−L+2 , . . . , Ot+1 , zcc (t) = hcc (xcc (t)) + wcc (t), with (18)
State i and µnew is that of observations in the current moving defined as the iterated integral of a function with respect
window. The mean µi is unknown. To simplify the problem, to each element of 6 [30]. Elements in 6 are denoted by
µi is approximated by the average value of observations σjk , j = 1, . . . , d and k = 1, . . . , d, where d is the observation
corresponding to State i. Thus the hypotheses are converted dimension. Then, (30) is converted into
as H0,i : µnew = µ̄i versus H1,i : µnew 6 = µ̄i . Based on Bayes
m0,i
rule, the probability that current observations belong to State i Z Z
is obtained 1
= ··· |6|−1−n/2 exp − tr (U + S)6 −1
P(onew |H0,i )P(H0,i ) 2
P(H0,i |onew ) = . (26) ×
Y
dσjk . (31)
P(onew )
Similarly, the probability that current observations do not j = 1, . . . , d
belong to State i, P(H1,i |onew ), is obtained. Since most of k = 1, . . . , d
failure modes are considered and covered by our HMMs, If the dimension of 6 is large, it is complex to calculate
occurrence of a new fault type is a small probability event. this integration. Since the probability density function of the
Prior probabilities for the two hypotheses are set as inverse matrix gamma distribution is
( Z ∞
P(H0,i ) = 0.99
(27) f (X , α, β, 9)dX
P(H1,i ) = 0.01 0
|9|α
Z ∞
1
Based on Bayes rule, the ratio of the two posterior probabili- = |X |−α−(d+1)/2
exp tr − 9X −1
dX= 1,
−∞ β 0d (α) β
dα
ties is
(32)
P(H0,i |onew ) P(onew |H0,i )P(H0 )
=
P(H1,i |onew ) P(onew |H1,i )P(H1 ) it follows,
m0 P(H0 ) P(H0 ) Z ∞
1 β dα 0d (α)
= · = Bf · , (28) |X |−α−(d+1)/2 exp − tr 9X −1 dX = .
m1 P(H1 ) P(H1 )
0 β |9|α
where Bf = m0 /m1 is the Bayes factor. To detect new fault (33)
types, control limits of this ratio are calculated. If the ratio
is lower than the lower bound of the control limits, it means It is evident that (33) is like (30). Letting α = −(d + 1)/2 +
that the probability of belonging to the existing State i is much n/2 + 1, β = 2 and 9 = U + S, (30) is converted to
lower than that of belonging to other states. Considering that 2d[−(d+1)/2+n/2+1] 0d(−(d + 1)/2 + n/2 + 1)
the ratio may be too large or too small, log of the ratio is m0,i = ,
|U + S|−(d+1)/2+n/2+1
considered. By assuming that log of the ratio follows a normal (34)
distribution, the control limits are obtained based on training
data as which can be easily calculated. The multivariate gamma
function is
[µ̄log _rto − 2σ̄log _rto , µ̄log _rto + 2σ̄log _rto ], (29) d
Y
where µ̄logr to is the average value of log ratios, and σ̄logr to 0d (a) = π d(d−1)/4 0(a + (1 − j)/2). (35)
is the standard deviation of log ratios. In (28), the posterior j=1
probability m0,i corresponding to State i is (34) is converted into
Z
m0,i = L(6)πa (6)d6 m0,i
d
2d[−(d+1)/2+n/2+1] 0(n/2 − d/2 + 0.5 + (1 − i)/2)
Z Q
n
= |6| −n/2
exp − (X̄ − µ̄i )0 6 −1 (X̄ − µ̄i ) i=1
2 = .
π −d(d−1)/4 |U + S|−(d+1)/2+n/2+1
1 −1
− tr S6 |6|−1 d6 (36)
2
Z
1 The analytical expression for m0,i is obtained. Compared with
−1−n/2 −1
= |6| exp − tr (U + S)6 d6. m0,i , m1,i is difficult to calculate. In [29], a recursive formula
2
was derived to calculate m1,i as
(30) Qd
j=1 0((n − 1)/2) · 2
(n−1)/2 Y
d−1 n −1/2 o
where U = n(X̄ −µ̄i )0 (X̄ −µ̄i ), S = Lj=1 (Xj − X̄ )0 (Xj − X̄ );
P
m1,i = Sj
the parameter L is the moving window length; Xj is the 2d (2π )(n−d)d/2 · nd/2 j=1
Sj−1 (n−1)/2
!
jth observation in the current moving window; X̄ is the aver-
1 Yd
age value of observations; and 6 is the covariance matrix of · (n−1)/2 . (37)
s j=2 Sj
observations. The prior πa = |6|−1 was considered [29]. 11
This is an integration with respect to a matrix 6, and is Thus, the Bayes factor BF is obtained.
FIGURE 12. State estimates of the return air temperature sensor obtained
FIGURE 10. Estimates of failure modes of the supply fan. by coupled online HMM.
FIGURE 13. Residuals between sensor readings and actual values of the
FIGURE 11. Residuals between estimates of the supply fan efficiency and return air temperature sensor.
its normal value.
FIGURE 18. EA damper stuck open (new fault type) detected by our
statistical method.
FIGURE 16. State estimates of the OA damper. I. COMPARISON BETWEEN OUR METHOD AND OTHERS
Many methods such as SVM, decision tree, Bayesian network
and ANN have been developed to diagnose faults in air
F. IDENTIFY KNOWN TYPES OF FAILURE handling systems. To evaluate our method, F-measures of
MODES IN COMPONENTS known fault types are compared with those obtained by using
To show the process of identifying the failure modes of other methods also based on the ASHRAE project 1312-RP
components, the EA damper and the OA damper are used data as shown in Table. 1.
as examples. By using the coupled online HMMs, fail- In this table, it can be seen that the average F-measure of
ure modes of the EA damper are estimated as shown in our method is better than that of the SVM [15] and the deci-
Fig. 15. In the figure, ‘1’ means that the damper is stuck sion tree [9]. Additionally, as mentioned in [15], the average
closed. Its F-measure is 0.966. Similarly, states of the OA F-measure of their method is 0.923 that is significantly better
damper are estimated as shown in Fig. 16. ‘1’ denotes the than other methods, e.g., LibSVM, Naïve Bayes, radial basis
damper leakage, and ‘2’ means that the damper is stuck function network, Bayesian network, NN and random forest
closed. F-measures of the two failure modes are 1 and 1, decision tree. Therefore, the performance of our method is
respectively. also better than these methods.
TABLE 1. F-measures of failure mode obtained by our method and others. [10] S. Li and J. Wen, ‘‘A model-based fault detection and diagnostic method-
ology based on PCA method and wavelet transform,’’ Energy Buildings,
vol. 68, pp. 63–71, Jan. 2014.
[11] W. Y. Lee, J. M. House, C. Park, and G. E. Kelly, ‘‘Fault diagnosis of an air-
handling unit using artificial neural networks,’’ ASHRAE Trans., vol. 102,
no. 1, pp. 540–549, 1996.
[12] Z. Du, B. Fan, X. Jin, and J. Chi, ‘‘Fault detection and diagnosis for build-
ings and HVAC systems using combined neural networks and subtractive
clustering analysis,’’ Building Environ., vol. 73, pp. 1–11, Mar. 2014.
[13] S. Sun, G. Li, H. Chen, Q. Huang, S. Shi, and W. Hu, ‘‘A hybrid
ICA-BPNN-based FDD strategy for refrigerant charge faults
in variable refrigerant flow system,’’ Appl. Therm. Eng.,
vol. 127, pp. 718–728, Dec. 2017. [Online]. Available:
http://dx.doi.org/10.1016/j.applthermaleng.2017.08.047
[14] K. Yan, Z. W. Ji, H. J. Lu, J. Huang, W. Shen, and Y. Xue, ‘‘Fast and accu-
rate classification of time series data using extended ELM: Application in
fault diagnosis of air handling units,’’ IEEE Trans. Syst., Man, Cybern.,
Syst., to be published, doi: 10.1109/TSMC.2017.2691774.
[15] T. Mulumba, A. Afshari, K. Yan, W. Shen, and L. K. Norford, ‘‘Robust
model-based fault diagnosis for air handling units,’’ Energy Buildings,
vol. 86, pp. 698–707, Jan. 2015.
VI. CONCLUSION [16] H. M. Ertunc, K. A. Loparo, and H. Ocak, ‘‘Tool wear condition monitoring
In this paper, a systematic method is developed to identify in drilling operations using hidden Markov models (HMMs),’’ Int. J. Mach.
Tools Manuf., vol. 41, no. 9, pp. 1363–1384, 2001.
known and new types of failure modes and their severities in [17] A. Kodali, K. Pattipati, and S. Singh, ‘‘Coupled factorial hidden Markov
air handling systems with low false identification rates. In this models (CFHMM) for diagnosing multiple and coupled faults,’’ IEEE
method, to identify known types of faults, an online learning Trans. Syst., Man, Cybern., Syst., vol. 43, no. 3, pp. 522–534, May 2013.
[18] Y. Yan, P. B. Luh, and K. R. Pattipati, ‘‘Fault diagnosis of HVAC air-
algorithm is developed to estimate the states of components handling systems considering fault propagation impacts among compo-
and sensors, while updating HMM parameters to adapt to nents,’’ IEEE Trans. Autom. Sci. Eng., vol. 14, no. 2, pp. 705–717,
changing environments. To identify new types of faults with Apr. 2017.
[19] G. Mongillo and S. Deneve, ‘‘Online learning with hidden Markov mod-
low false identification rates, a robust statistical method is els,’’ Neural Comput., vol. 20, no. 7, pp. 1706–1716, 2008.
developed to detect the new fault type, and the new fault type [20] T. Chis and P. G. Harrison, ‘‘Adapting hidden Markov models for online
learning,’’ Electron. Notes Theor. Comput. Sci., vol. 318, pp. 109–127,
is labeled based on physical knowledge (e.g., a fault tree or
Nov. 2015.
a human-in-the-loop). By adapting to changing environments [21] H. Yoshida, T. Iwami, H. Yuzawa, and M. Suzuki, ‘‘Typical faults of air-
and capturing coupling among components, our method per- conditioning systems, and fault detection by ARX Model and extended
Kalman filter,’’ ASHRAE Trans., vol. 102, no. 1, pp. 557–564, 1996.
forms better than others. [22] S. Goodman, ‘‘Toward evidence-based medical statistics. 1: The P value
fallacy,’’ Ann. Internal Med., vol. 130, no. 12, pp. 995–1004, 1999.
VII. ACKNOWLEDGMENT [23] H.-C. Yan, J.-H. Zhou, and C.-K. Pang, ‘‘New types of faults detection and
Any opinions, findings, and conclusions or recommendations diagnosis using a mixed soft & hard clustering framework,’’ in Proc. IEEE
expressed in this material are those of the author(s) and do 21st Int. Conf. Emerg. Technol. Factory Autom., Sep. 2016, pp. 1–6.
[24] M. J. Beal, Z. Ghahramani, and C. E. Rasmussen, ‘‘The infinite hid-
not necessarily reflect the views of the National Science den Markov model,’’ in Proc. Neural Inf. Process. Syst., vol. 14. 2002,
Foundation. pp. 577–584.
[25] Q. S. Yan, W. X. Shi, and C. Q. Tian, Refrigeration Technology for Air
REFERENCES Conditioning (in Chinese). Beijing, China: Construction Industry Press,
[1] (2009). U.S. Department of Energy, Building Energy Data Book. [Online]. 2010.
[26] Ernest Orlando Lawrence Berkeley National Laboratory.
Available: http://buildingsdatabook.eren.doe.gov
EnergyPlus Documentation, Engineering Reference. Accessed:
[2] Building Optimization and Fault Diagnosis Source Book, document
Oct. 1, 2013. [Online]. Available: http://apps1.eere.energy.gov/
ANNEX 25, IEA, Paris, France, 1996
buildings/energyplus/pdfs/engineeringreference.pdf
[3] S. Li and J. Wen, ‘‘Description of fault test in summer of 2007,’’ ASHRAE,
[27] F. P. Incropera and D. P. De Witt, Fundamentals of Heat and Mass Transfer.
Drexel Univ., Philadelphia, PA, USA, Tech. Rep. 1312, 2007.
4th ed. New York, NY, USA: Wiley, 1996.
[4] S. W. Wang and F. Xiao, ‘‘AHU sensor fault diagnosis using principal com- [28] Y. Yan, P. B. Luh, and K. R. Pattipati, ‘‘Fault diagnosis framework for
ponent analysis method,’’ Energy Buildings, vol. 36, no. 2, pp. 147–160, air handling units based on the integration of dependency matrices and
2004. PCA,’’ in Proc. IEEE Conf. Autom. Sci. Eng., Taipei, Taiwan, Aug. 2014,
[5] S. Wang and J.-B. Wang, ‘‘Robust sensor fault diagnosis and validation in pp. 1103–1108.
HVAC systems,’’ Trans. Inst. Meas. Control, vol. 24, no. 3, pp. 231–262, [29] D. Sun, ‘‘Objective priors for the multivariate normal model,’’ in Proc.
2002. Valencia/ISBA 8th World Meeting Bayesian Statist., Benidorm, Spain,
[6] S. Katipamula, R. G. Pratt, D. P. Chassin, Z. T. Taylor, K. Gowri, and Jun. 2006, pp. 1–38.
M. R. Brambley, ‘‘Automated fault detection and diagnostics for outdoor- [30] A. K. Gupta and D. K. Nagar, ‘‘Matrix variate distributions,’’ in Mono-
air ventilation systems and economizers: Methodology and results from graphs and Surveys in Pure and Applied Mathematics. Boca Raton, FL,
field testing,’’ ASHRAE Trans., vol. 105, no. 1, pp. 1–13, 1999. USA: Chapman & Hall, 2000.
[7] J. M. House, H. Vaezi-Nejad, and J. M. Whitcomb, ‘‘An expert rule set [31] D. I. Belov and R. D. Armstrong, ‘‘Distributions of Kullback-Leibler
for fault detection in air handling units,’’ ASHRAE Trans., vol. 107, no. 1, divergence and its application for the LSAT,’’ Law School Admission
pp. 858–871, 2001. Council, Newtown, PA, USA, Res. Rep. 09-02, Oct. 2009.
[8] J. Schein, S. T. Bushby, N. S. Castro, and J. M. House, ‘‘A rule-based fault [32] S. J. Press, ‘‘Linear combinations of non-central chi-square
detection method for air handling units,’’ Energy Buildings, vol. 38, no. 12, variates,’’ Annu. Math. Statist., vol. 37, no. 2, pp. 480–487,
pp. 1485–1492, 2006. 1966.
[9] R. Yan, Z. Ma, T. Zhao, and G. Kokogiannakis, ‘‘A decision tree [33] DesignBuilder Software. DesignBuilder 2.1 User’s Manual.
based data-driven diagnostic strategy for air handling units,’’ Energy Accessed: Oct. 2009. [Online]. Available: http://www.
Buildings, vol. 133, pp. 37–45, Dec. 2016. [Online]. Available: designbuildersoftware.com/docs/designbuilder/DesignBuilder_2.1_
http://dx.doi.org/doi:10.1016/j.enbuild.2016.09.039. Users-Manual_Ltr.pdf
YING YAN (S’12) received the B.S. degree in KRISHNA R. PATTIPATI (S’77–M’80–SM’91–
automation in 2007 and the M.S. degree in con- F’95) received the B.Tech. (Hons.) degree in elec-
trol theory and control engineering from Southeast trical engineering from IIT Kharagpur in 1975, and
University, Nanjing, China. He is currently pursu- the M.S. and Ph.D. degrees in systems engineering
ing the Ph.D. degree in the electrical and computer from the University of Connecticut (UCONN),
engineering with the University of Connecticut. Storrs, in 1977 and 1980, respectively. He was
His research interests include fault detection, diag- with Alphatech, Inc., Burlington, MA, USA, from
nosis and prognosis of heating ventilation, and air 1980 to 1986. He has been with the Department of
conditioning systems. Electrical and Computer Engineering, UCONN,
where he is currently the UTC Professor of sys-
tems engineering and serves as the Interim Director of the UTC Institute
for Advanced Systems Engineering. His current research activities are in
the areas of agile planning, diagnosis, and prognosis techniques for cyber-
physical systems, multiobject tracking, and combinatorial optimization.
PETER B. LUH (S’77–M’80–SM’91–F’95) A common theme among these applications is that they are characterized by a
received the B.S. degree from National Taiwan great deal of uncertainty, complexity, and computational intractability. He is
University, Taipei, Taiwan, the M.S. degree from the Co-Founder of Qualtech Systems, Inc., a firm specializing in advanced
MIT, Cambridge, MA, USA, and the Ph.D. from integrated diagnostics software tools (TEAMS, TEAMS-RT, TEAMS-RDS,
Harvard University, Cambridge, MA, USA. He and TEAMATE), and serves on the Board of Aptima, Inc. He is an Elected
has been with the Department of Electrical and Fellow of the Connecticut Academy of Science and Engineering. He was a
Computer Engineering, University of Connecticut, co-recipient of the Andrew P. Sage Award for the Best IEEE Systems, Man,
Storrs, CT, USA, since 1980. He is currently a and Cybernetics (SMC) Transactions Paper in 1999, the Barry Carlton Award
SNET Professor with Communications and Infor- for the Best AES Transactions Paper in 2000, the NASA Space Act Awards
mation Technologies. He is also a member of the for A Comprehensive Toolset for Model-Based Health Monitoring and
Chair Professors Group, Center for Intelligent and Networked Systems, Diagnosis, and Real-Time Update of Fault-Test Dependencies of Dynamic
Department of Automation, Tsinghua University, Beijing, China. He was Systems: A Comprehensive Toolset for Model-Based Health Monitoring and
on sabbatical leave with the Department of Electrical Engineering, National Diagnostics in 2002 and 2008, and the AAUP Research Excellence Award
Taiwan University, Taipei, Taiwan as a Chair Professor in 2015. He is a at UCONN in 2003. He received the Best Technical Paper Award at the
member of the Periodicals Committee of the IEEE Technical Activities 1985, 1990, 1994, 2002, 2004, 2005, and 2011 IEEE AUTOTEST Confer-
Board. He was the founding Editor-in-Chief of the IEEE TRANSACTIONS ON ences, and the 1997 and 2004 Command and Control Conference. He was
AUTOMATION Science and ENGINEERING, and an EiC of the IEEE TRANSACTIONS selected as the Outstanding Young Engineer by the SMC Society in 1984,
ON ROBOTICS AND AUTOMATION. His research interests include smart, green, and received the Centennial Key to the Future Award. He served as the
and safe buildings, smart grid, electricity markets, and effective renewable Editor-in-Chief of the IEEE TRANSACTIONS ON SYSTEMS, MAN AND,
integration to the grid, and intelligent manufacturing systems. He received CYBERNETICS-PART B from 1998 to 2001, the Vice-President of Techni-
the 2013 Pioneer Award of the IEEE Robotics and Automation Society for cal Activities of the IEEE SMC Society from 1998 to 1999, and the
his pioneering contributions to the development of near-optimal and efficient Vice-President of Conferences and Meetings of the IEEE SMC Society from
planning, scheduling, and coordination methodologies for manufacturing 2000 to 2001.
and power systems.