You are on page 1of 16

Energy and Built Environment 1 (2020) 149–164

Contents lists available at ScienceDirect

Energy and Built Environment


journal homepage: www.elsevier.com/locate/enbenv

A review of data mining technologies in building energy systems: Load


prediction, pattern identification, fault detection and diagnosis
Yang Zhao, Chaobo Zhang∗, Yiwen Zhang, Zihao Wang, Junyang Li
Institute of Refrigeration and Cryogenics, Zhejiang University, Hangzhou, China

a r t i c l e i n f o a b s t r a c t

Keywords: With the advent of the era of big data, buildings have become not only energy-intensive but also data-intensive.
Supervised data mining Data mining technologies have been widely utilized to release the values of massive amounts of building operation
Unsupervised data mining data with an aim of improving the operation performance of building energy systems. This paper aims at making
Big data
a comprehensive literature review of the applications of data mining technologies in this domain. In general,
Building energy efficiency
data mining technologies can be classified into two categories, i.e., supervised data mining technologies and
Building energy systems
unsupervised data mining technologies. In this field, supervised data mining technologies are usually utilized
for building energy load prediction and fault detection/diagnosis. And unsupervised data mining technologies
are usually utilized for building operation pattern identification and fault detection/diagnosis. Comprehensive
discussions are made about the strengths and shortcomings of the data mining-based methods. Based on this
review, suggestions for future researches are proposed towards effective and efficient data mining solutions for
building energy systems.

1. Introduction been recognized as one of the most promising tools to extract valuable
knowledge from massive amounts of building operation data. The ex-
The building sector has become the largest energy consumer in the tracted knowledge can be used for fault detection and diagnosis (FDD),
world [1]. It accounts for more than one third of the global energy con- operation optimization, predictive modeling and so on [9]. In general,
sumption [2]. Building energy systems are responsible for most of the data mining technologies can be classified into two categories, i.e., su-
energy consumption in buildings, such as heating, ventilation and air pervised data mining technologies and unsupervised data mining tech-
conditioning (HVAC) systems, lighting systems and so on [3]. Equip- nologies. Supervised data mining technologies aim at learning the com-
ment faults, equipment degradation and improper control are widely plicated relationship between multiple variables. It is usually utilized
existed in building energy systems, which result in serious energy waste for prediction tasks or classification tasks. Unsupervised data mining
[4]. Katipamula and Brambley estimated that about 15% to 30% of technologies are good at discovering intrinsic structures, relations, and
the energy consumed by commercial buildings was wasted due to poor interconnectedness of data. It is capable of discovering hidden operation
maintenance, equipment degradation, and improper control in building patterns of building energy systems from their operation data.
energy systems [5]. Lee and Cheng investigated a large amount of re- In the past decade, some literature reviews had been made about
searches related to building energy saving [6]. They discovered that, in the applications of data mining technologies in the field of buildings, as
average, 39.5% of the lighting energy consumption and 14.1% of the described in the Section 2. However, there is still a lack of comprehen-
energy consumption of HVAC systems could be saved if they were oper- sive literature reviews about the data mining technologies utilized for
ated properly. Roth et al. found that, in American commercial buildings, load prediction, fault detection/diagnosis and pattern identification in
about 4% to 18% of the energy consumed by lighting systems, HVAC the domain of building energy systems. It is in great need to summa-
systems and refrigeration systems was wasted due to thirteen key faults rize the lessons from the previous researches and find opportunities for
[7]. future researches. In this study, the most recent supervised and unsuper-
With the popularity of building automation systems, a tremendous vised data mining-based methods are reviewed in the Section 3.1 and
amount of building operation data has been stored [8]. These data can the Section 4.1 respectively. The strengths and shortcomings of the two
reflect the actual operation conditions of buildings, which is very useful types of data mining-based methods are discussed in the Section 3.2 and
for building energy management tasks. Data mining technologies have the Section 4.2 respectively. Some suggestions about future researches
are concluded in the Section 5.


Corresponding author.
E-mail address: chaobo.zhang@zju.edu.cn (C. Zhang).

https://doi.org/10.1016/j.enbenv.2019.11.003
Received 16 September 2019; Received in revised form 5 November 2019; Accepted 11 November 2019
Available online 16 November 2019
2666-1233/© 2019 Southwest Jiaotong University. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license.
(http://creativecommons.org/licenses/by-nc-nd/4.0/)
Y. Zhao, C. Zhang and Y. Zhang et al. Energy and Built Environment 1 (2020) 149–164

Abbreviation Historical
operation data
HVAC heating, ventilation and air conditioning
AHU air handling units
VAV variable air volume Data transformation
FDD fault detection and diagnosis
ANN artificial neural network
SVM support vector machine Feature extraction
SVR support vector regression
ARIMA autoregressive integrated moving average
Optimization of model
DNN deep neural network
parameters
ELM extreme learning machine
RF random forest
PENN probabilistic entropy-based neural network Model training
GRNN general regression neural network
RNN recurrent neural network
LSTM long short-term memory Building energy load
GRU gated recurrent units prediction
FCRBM factored conditional restricted boltzmann machine
CRBM conditional restricted boltzmann machine Fig. 1. A general process of regression-based building energy load prediction
MLR multiple linear regression methods.
GP genetic programming
ESN echo state network
BPNN traditional back propagation neural network As for the unsupervised data mining-based researches, Capozzoli
RBFNN radial basis function neural network et al. provided a review of discovering knowledge from building op-
LS-SVM least square support vector machine eration data using unsupervised data mining technologies [21]. Miller
MNR multiple nonlinear regression et al. reviewed the researches of unsupervised data mining technologies
AR auto regression in the field of energy consumption analysis of non-residential buildings
ARX autoregressive exogenous [22]. Similarly, Fan et al. made a review related to knowledge extrac-
PSO particle swarm optimization algorithm tion from building operation data based on unsupervised data mining
WNN wavelet neural network technologies [9].
RCMACNN recurrent cerebellar model articulation controller However, previous literature reviews usually focused on a specific
neural network application of a type of data mining technology in the building field.
PCA principal component analysis The discussions in previous reviews were usually too specific and ig-
CART classification and regression tree nored the relationship between different types of data mining technolo-
SVDD support vector data description gies. It is necessary to make a more comprehensive review which can
ARM association rule mining take both supervised and unsupervised data mining technologies into
SAX symbolic aggregate approximation account. New insights can be gained through the comparison of the
FP-growth frequent pattern tree strengths and shortcomings of the both types of technologies. Therefore,
DBSCAN density-based spatial clustering of applications with this literature review is of great value.
noise
3. Supervised data mining-based methods

2. Summary of previous literature reviews 3.1. Applications of supervised data mining technologies in building energy
systems
In the field of building energy systems, some literature reviews
had been reported to summarize the applications of data mining 3.1.1. Regression-based building energy load prediction
technologies in the past decade. Most of them are related to supervised Many regression algorithms had been applied for predicting build-
data mining technologies. Zhao and Magoulès provided an overview of ing energy loads successfully such as artificial neural network (ANN)
supervised data mining-based building energy consumption prediction [23], support vector regression (SVR) [24], autoregressive integrated
methods [10]. Ahmad et al. [11] and Chalal et al. [12] summarized the moving average (ARIMA) [25], deep neural network (DNN) [26], etc.
applications of two common supervised data mining algorithms, i.e., A general process of regression-based building energy load prediction
artificial neural network (ANN) and support vector machine (SVM), methods is as illustrated in Fig. 1. In general, it includes four steps,
in building energy consumption prediction. Similarly, Fumo reviewed i.e., data transformation, feature selection, optimization of model pa-
several common supervised data mining algorithms which had been rameters, and model training. In the step of data transformation, the
utilized for predicting building energy consumption [13]. Amasyali raw historical operation data are transformed into a normalized scale
et al. made a comprehensive overview of data-driven building energy to improve the accuracy of the prediction model. The step of feature
consumption prediction methods, which covered fourteen supervised extraction aims at extracting the most relevant variables affecting the
data mining algorithms [14]. The applications of the ensemble learn- target energy load. The extracted features are then utilized for model
ing algorithm in building energy consumption prediction were also training. The step of optimization of model parameters aims at opti-
reviewed by Wang and Srinivasan [15]. Similar reviews related to the mizing hyper-parameters of the model for obtaining the optimal model
supervised data mining-based building energy consumption prediction structure. Lastly, the coefficients of the model are tuned automatically
methods also can be found in [16–19]. Apart from building energy to obtain the final building energy load prediction model. The predicted
consumption prediction, Li et al. further reviewed the applications of building energy loads are very useful for optimal control, performance
some supervised data mining algorithms in FDD [20]. assessment, etc.

150
Y. Zhao, C. Zhang and Y. Zhang et al. Energy and Built Environment 1 (2020) 149–164

Variable 1

Variable 2
Cooling/heating/electricity load
...

Variable n

Input layer Hidden layers Output layer

Fig. 2. Illustration of a four-layer ANN-based energy load prediction method.

3.1.1.1. Artificial neural network-based methods. ANN is the most typi- Fig. 3. Illustration of SVR-based prediction models.
cal regression algorithm for building energy load prediction. In general,
an ANN is composed of one input layer, one or more hidden layers and
one output layer. Each layer has some artificial neurons which are con- and gated recurrent units (GRU), in 24 h ahead building cooling load
nected with the artificial neurons in adjacent layers. Each connection prediction [34]. The results showed that GRU had the best performance
has a weight which needs to be tuned in the step of model training. in cooling load prediction of an educational building in Hong Kong.
Fig. 2 illustrates a typical four-layer ANN model for building energy It provided a reference for the selection of a suitable type of RNN in
load prediction. building energy load prediction. Marino et al. also introduced LSTM to
ANN has been widely utilized to predict building electricity loads predict electricity load of a residential building [35]. They proposed an
successfully. For instance, Bagnasco et al. introduced ANN to predict improved LSTM architecture which showed higher prediction accuracy
the electricity loads of a hospital building in Italy [23]. In their work, than the standard LSTM. Mocanu et al. investigated the performance of
a three-layer ANN was trained using one-year operation data of a hos- two types of DNN algorithm, i.e., factored conditional restricted boltz-
pital. Three common kinds of variables were selected as model inputs mann machine (FCRBM) and conditional restricted boltzmann machine
including time variables, historical loads and outdoor environment vari- (CRBM), in electricity load prediction of residential buildings [36]. They
ables. The results showed that the prediction accuracy of ANN was discovered that the FCRBM outperformed CRBM, RNN, ANN, and SVR.
very high in this case. Similarly, Biswas et al. developed a three-layer Amber et al. also compared the electricity load prediction performance
ANN to predict daily total electrical consumption of a residential build- of DNN with that of four other regression algorithms, i.e., multiple lin-
ing [27]. Sareh et al. adopted the extreme learning machine (ELM) ear regression (MLR), genetic programming (GP), ANN, and SVR [37].
algorithm, i.e., a single layer feed-forward neural network, to predict The results showed that the prediction performance of DNN was not bet-
electrical consumption of residential buildings [28]. They further com- ter than ANN. Echo state network (ESN) is a new type of RNN. It had
pared the prediction accuracy of ELM with that of a three-layer ANN. been utilized by Shi et al. to predict hourly electricity load of an office
It was discovered that ELM was superior to the three-layer ANN. Mena building [38].
et al. combined ANN and the nonlinear autoregressive with exogenous
input algorithm to predict hourly electricity loads of a public build- 3.1.1.3. Support vector regression-based methods. SVR is widely utilized
ing [29]. Zhao et al. investigated the performance of three common for building energy load prediction. Fig. 3 illustrates a typical SVR
regression algorithms, i.e., ANN, SVR and ARIMA, in electricity load model. In essence, it aims at finding a hyperplane in a high dimensional
prediction of variable refrigerant volume systems in office buildings space which minimizes the prediction residuals of the points outside of
[25]. The results showed that ANN had the best performance among the margins. The width of the margins is a hyper-parameter which is
them. usually pre-defined.
Apart from the electricity load prediction, ANN have also been SVR had been widely utilized for building electricity load predic-
applied for cooling load and heating load prediction. For instance, tion. For instance, Dong et al. introduced a radial basis function-based
Ahmad et al. utilized ANN to predict hourly cooling loads and hourly SVR to predict electricity load of commercial buildings in Singapore
heating loads of a hotel building in Spain [30]. They compared the [24]. Four-year operation data collected from four commercial build-
performance of ANN with that of the random forest (RF) algorithm. The ings were utilized to verify the performance of SVR in building energy
results showed that the prediction accuracy of the ANN-based method load prediction. The results showed that the prediction accuracy of SVR
is higher than that of the RF-based method. Kwok et al. proposed an was very high. Fu et al. also proposed a SVR-based method to predict
improved ANN-based method, i.e., a probabilistic entropy-based neural electricity load of public buildings [39]. According the results in their
network (PENN), to predict cooling loads of an office building located cases, SVR had higher prediction accuracy than ANN. Liu et al. applied
in Hong Kong [31]. Hou et al. proposed a cooling load prediction SVR to predict the hourly energy consumption of office buildings [40].
method based on the rough sets theory and ANN [32]. The rough sets Lai et al. adopted SVR to predict the electricity load of a residential
theory was applied to extract the most relevant factors affecting cooling building in Japan [41]. SVR showed good prediction performance in the
loads as model inputs of ANN. Ben-Nakhi et al. also proposed a general cases in [40,41]. Massana et al. utilized SVR to predict electricity loads
regression neural network (GRNN)-based method for hourly cooling of a non-residential building [42]. The prediction performance of other
load prediction of office buildings [33]. two common regression algorithms, i.e., ANN and MLR, was compared
with that of SVR. The results showed that SVR had the best prediction
3.1.1.2. Deep neural network-based methods. Recently, DNN has been performance. Similarly, Li et al. discovered that SVR had better elec-
proved as an effective regression algorithm for building energy load tricity load prediction performance than those of three kinds of ANN
prediction [26]. Compared with ANN, DNN has more complex architec- algorithms, i.e., traditional back propagation neural network (BPNN),
tures. Recurrent neural network (RNN) is one of the most common types radial basis function neural network (RBFNN) and GRNN [43].
of DNN. Fan et al. investigated the prediction accuracy of three types SVR has also been utilized for the prediction of cooling and heat-
of RNN, i.e., the conventional RNN, long short-term memory (LSTM) ing loads of buildings successfully. For instance, Li et al. proposed an

151
Y. Zhao, C. Zhang and Y. Zhang et al. Energy and Built Environment 1 (2020) 149–164

Benchmarking value
Actual observation
Energy load

Confidence interval

Energy load
Time
A fault is detected.
Upper bound Lower bound
Predicted observation Actual observation Time

Fig. 4. Illustration of the prediction interval of predicted building energy loads. Fig. 5. Illustration of regression-based FDD methods.

hourly cooling load prediction method for office buildings based on that the prediction residuals of SVR obeyed the zero-mean Laplace dis-
the least square support vector machine (LS-SVM) regression algorithm tribution. The maximum likelihood algorithm was then utilized to esti-
[44]. They discovered that the LS-SVM regression algorithm had higher mate the upper and lower bounds of prediction intervals. It was a good
prediction accuracy than BPNN. Paudel et al. adopted SVR to predict attempt to combine prediction interval estimation with cooling load pre-
the heating loads of a residential building [45]. Zhao and Magoulès diction. Quan et al. proposed a prediction interval estimation method for
also applied SVR to predict the heating loads of multiple buildings [46]. electricity loads based on ANN and the particle swarm optimization al-
Similarly, Hou and Lian utilized SVR to predict the cooling loads of a gorithm (PSO) [53]. Similarly, Guan et al. proposed a wavelet neural
HVAC system [47]. The prediction performance of SVR was compared network (WNN)-based prediction interval estimation method for elec-
with that of ARIMA. The results showed that SVR performed better than tricity load prediction [54]. Dahl et al. proposed an ARX-based method
ARIMA. to estimate prediction intervals of heating loads of a district heating
system [55]. Both weather-based uncertainties and model uncertainties
3.1.1.4. Other regression algorithm-based methods. Apart from ANN, were taken into account in the proposed method.
DNN and SVR, other regression algorithms such as piece-wise linear re- All of the methods reviewed in this section are listed in Table 1.
gression, multiple nonlinear regression and ensemble learning had also
shown high building energy load prediction accuracy in some cases.
3.1.2. Supervised fault detection and diagnosis methods for building energy
For instance, Zheng et al. proposed a piece-wise linear regression-based
systems
method to predict the hourly electricity loads of campus buildings [48].
Supervised data mining technologies have also been utilized to detect
The proposed method showed high prediction accuracy on fifty campus
and diagnose device faults and sensor faults in building energy systems
buildings. Fan et al. developed a multiple nonlinear regression (MNR)-
successfully. The operation performance of building energy systems can
based method for hourly building cooling load prediction [49]. The re-
be improved significantly if the detected faults are removed in time.
sults showed that the proposed method had higher prediction accuracy
The supervised data mining-based FDD methods can be divided into
than MLR, auto regression (AR), autoregressive exogenous (ARX), and
two categories, i.e., regression-based methods and classification-based
BPNN. Ensemble learning is an advanced regression algorithm which
methods.
can benefit from multiple individual regression algorithms. In general,
it has better prediction performance than each individual regression al-
gorithm. Chou and Bui developed an ensemble prediction model based 3.1.2.1. Regression-based methods. Regression algorithms have been
on SVR and ANN to predict cooling and heating loads of buildings [50]. utilized to develop benchmarking models for FDD. As illustrated in
Similarly, Fan et al. also proposed an ensemble prediction model for the Fig. 5, the developed benchmarking models are adopted to calculate
prediction of electricity loads of a public building [51]. The ensemble benchmarking values firstly. Then, faults can be detected if actual mea-
model combined eight base regression algorithms, i.e., MLR, ARIMA, surements deviate from the benchmarking values significantly.
SVR, RF, ANN, boosting tree, multivariate adaptive regression splines ANN is the most widely utilized regression algorithms for developing
and k-nearest neighbors. The results demonstrated that the prediction benchmarking models. For instance, Lee et al. proposed an on-line FDD
accuracy of the ensemble model was significantly better than those of method for air handling units (AHU) based on GRNN [56]. Four GRNN-
base models. based benchmarking models were developed to estimate the bench-
marking values of supply-air temperature, mixed-air temperature, static-
3.1.1.5. Regression-based methods with prediction interval estimation. pressure and airflow rate respectively. Residuals between the bench-
Noise are widely existed in the collected building operation data, since marking values and actual values were utilized to detect faults and per-
sensors always have measurement errors in practice. Moreover, it is formance degradation of AHU. Wang and Jiang developed a recurrent
almost impossible to find optimal structures of regression models. It cerebellar model articulation controller neural network (RCMACNN) to
results in the uncertainties of predicted building energy loads. Recently, identify the performance degradation of coil valves in variable air vol-
researchers had realized that prediction interval estimation is one of ume (VAV) air conditioning systems [57]. Performance degradation was
the most effective methods to quantify the uncertainties. As illustrated detected if model residuals exceeded a threshold. Du et al. trained a com-
in Fig. 4, the prediction interval of predicted building energy loads bined neural network to detect sensor faults and valve faults of AHU
is defined as a range between the upper bound and the lower bound. [58,59]. The combined neural network was composed of the basic neu-
Actual observations are expected to fall within the range with a given ral network and the auxiliary neural network. Du et al. also proposed
probability. a WNN-based method to diagnose the sensor faults of AHU [60,61].
Zhang et al. proposed a prediction interval estimation method for Similar WNN-based FDD methods for AHU can be found in [62,63].
SVR-based building cooling load prediction tasks [52]. They assumed Mavromatidis et al. introduced ANN to provide benchmarking values

152
Y. Zhao, C. Zhang and Y. Zhang et al. Energy and Built Environment 1 (2020) 149–164

Table 1
A list of regression-based building energy load prediction methods.

Year Application Algorithm Ref.

2004 Cooling load prediction GRNN [33]


2006 ANN [32]
2009 LS-SVM regression∗ , BPNN [44]
2009 SVR∗ , ARIMA [47]
2011 PENN [31]
2017 DNN [26]
2019 RNN, LSTM, GRU∗ [34]
2019 MNR, MLR, AR, ARX, BPNN [49]
2005 Electricity load prediction SVR [24]
2008 SVR [41]
2010 SVR∗ , BPNN, RBFNN, GRNN [43]
2014 ANN [29]
2014 Ensemble learning [51]
2015 ANN [23]
2015 SVR [39]
2015 SVR [40]
2015 MLR, ANN, SVR∗ [42]
2016 ANN [27]
2016 ELM∗ , ANN [28]
2016 ANN∗ , SVR, ARIMA [25]
2016 LSTM [35]
2016 CRBM, FCRBM∗ , RNN, ANN, SVR [36]
2016 ESN [38]
2017 Piece-wise linear regression [48]
2018 MLR, GP, DNN, SVR, ANN∗ [37]
2010 Heating load prediction SVR [46]
2017 SVR [45]
2013 Electricity load prediction interval estimation WNN [54]
2014 ANN, PSO [53]
2014 Cooling & heating load prediction Ensemble learning [50]
2017 ANN, RF [30]
2017 Cooling load prediction & interval estimation SVR, maximum likelihood [52]
2017 Heating load prediction & interval estimation ARX [55]

Note: ∗ represents that the algorithm has the best performance compared with other algorithms.

of building energy consumption for detecting abnormal energy con- introduced an adaptive Gaussian mixture model regression algorithm to
sumption of supermarkets [64]. Tran investigated the performance of calculate the benchmarking values of performance indexes for FDD in
three regression algorithms, i.e., MLR, Kriging and radial basis func- HVAC systems [74].
tion network, in fault detection of centrifugal chiller systems [65]. They All of the methods reviewed in this section are listed in Table 2.
discovered that the radial basis function network had the best perfor-
mance. Swider et al. concluded that the generalized radial basis func-
tion neural network had good performance in detecting chiller faults 3.1.2.2. Classification-based methods. Classification algorithms can
[66]. learn the complicated relationship between faults and symptoms based
Other regression algorithms had also been utilized to estimate the on the data collected in different fault conditions. Then it can detect
benchmarking values such as SVR, Gaussian process regression, least which fault a new condition belongs to. Two kinds of classification
square regression, etc. Zhao et al. introduced SVR to estimate the bench- algorithms have been utilized for FDD, i.e., multiclass classification
marking values of performance indexes of HVAC subsystems for fault algorithms and one-class classification algorithms [75].
detection [67,68]. Faults such as heat transfer degradation and evapo- Multiclass classification-based FDD methods aim at classifying the
rator fouling were detected finally. Tran et al. further proposed a LS- operation data into several classes, including a normal class and several
SVM regression-based method to calculate the benchmarking values of faulty classes. SVM is one of the most widely utilized multiclass classifi-
performance indexes of chillers [69]. The proposed method obtained cation algorithms for FDD. Fig. 6 illustrates a two-class SVM classifier to
reliable detection results of six typical chiller faults using the experi- distinguish whether the data belong to the class of fault or not. With an
mental data of ASHRAE RP-1043 project. Van Every et al. adopted the aim of training the SVM model, an optimal hyperplane is found in a high
Gaussian process regression algorithm to estimate the mixed air tem- dimensional space which maximizes the margin between the data of dif-
perature, supply air temperature, and supply air flow rate of AHU [70]. ferent faults. The trained SVM is then utilized to determine whether the
The ratio between the prediction residual and its standard deviation unlabeled data belong to the class of fault or not. Liang and Du pro-
were then fed into a SVM classifier for FDD. Chung used the fuzzy lin- posed a four-layer SVM classifier for detecting and diagnosing faults of
ear regression algorithm to develop benchmarking models of the energy HVAC systems [76]. Four two-class SVM classifiers were developed. The
efficiency of commercial buildings [71]. The developed benchmarking first classifier was utilized to detect weather the systems existed faults
models were useful for FDD of commercial buildings. Wang and Gao or not. If faults occurred, the other three classifiers were used for di-
proposed a kernel-based partial least square regression-based method agnosing which fault occurred. Similarly, Dehestani et al. introduced
to detect faults and track performance degradation of heat exchangers SVM for FDD of HVAC systems [77]. Han et al. also proposed some sim-
in HVAC systems [72]. Similarly, Turner et al. utilized a recursive least ilar SVM-based FDD methods for chillers [78–80]. Four SVM classifiers
square regression algorithm for generating benchmarking values of per- were developed for detecting the fault-free condition and three differ-
formance indexes of HVAC systems [73]. Eight typical faults of HVAC ent faults respectively. In order to improve the performance of SVM-
equipment such as fan blockage and compressor failure were detected based FDD methods, some data preprocessing algorithms had been inte-
based on the generated benchmarking values. Karami and Wang also grated with SVM, such as principal component analysis (PCA) [81], ARX

153
Y. Zhao, C. Zhang and Y. Zhang et al. Energy and Built Environment 1 (2020) 149–164

Table 2
A list of regression-based fault detection and diagnosis methods.

Year Detection/diagnosis target Algorithm Ref.

2001 Faults of chillers Generalized radial basis function neural [66]


network
2016 MLR, Kriging, radial basis function network∗ [65]
2016 LS-SVM regression [69]
2004 Faults of AHU GRNN [56]
2008 WNN [61]
2009 WNN [60]
2010 WNN [63]
2012 WNN [62]
2014 Combined neural network [58,59]
2017 Gaussian process regression [70]
2004 Faults of VAV systems RCMACNN [57]
2012 Abnormal building energy consumption patterns Fuzzy linear regression [71]
2013 Faults of HVAC systems SVR [67,68]
2017 Recursive least square regression [73]
2018 Gaussian mixture model regression [74]
2017 Faults of heat exchangers Partial least square regression [72]

Note: represents that the algorithm has the best performance compared with other algorithms.

Support Hypersphere
vector The target class

Class A (Fault)
D(x)<R
Support vector
α
D(x)>R
Outliers
Class B (Normal)

Fig. 8. Illustration of SVDD-based FDD methods (adapted from [75]).


Fig. 6. Illustration of a two-class SVM classifier for FDD (adapted from [75]).

supply flow station faults and return flow station faults. Zhou et al.
combined the fuzzy theory and ANN to detect and diagnose the faults of
Variable 1 Normal a chiller [86]. There were three steps included in the proposed method,
i.e., data preprocessing, performance index generating, and fault
Variable 2 Refrigerant leak detection. The fuzzy theory was utilized to generate the standardized
quantitative performance indices. Then, ANN was utilized for detecting
...
...

faults based on the generated performance indices. Similarly, Kocyigit


proposed a FDD method for vapor compression refrigeration systems
based on fuzzy inference systems and ANN [87]. Sun et al. also trained
Variable n Condenser fouling a BPNN classifier based on labeled faulty data to diagnose refrigerant
charge faults [88]. Magoulès et al. proposed a FDD method for HVAC
systems based on the recursive deterministic perceptron neural network
Input layer Hidden layer Output layer algorithm [89]. Four kinds of equipment faults were detected using
the proposed method including fan faults, coil faults, pump faults and
Fig. 7. Illustration of a four-layer ANN-based FDD method of chillers. chiller faults. He et al. utilized an adaptive resonance theory neural net-
work classifier to diagnose the faults of solar hot water systems [90,91].
Guo et al. proposed a deep belief network-based method to detect and
[82], wavelet de-noising [83], back-tracing sequential forward feature diagnose the faults of variable flow refrigerant systems [92]. Classifica-
selection [84], etc. tion and regression tree (CART), another typical multiclass classification
Apart from SVM, ANN has also been utilized for detecting and diag- algorithm, had also been utilized to diagnose the faults of AHU [93].
nosing multi-faults. As shown in Fig. 7, ANN-based FDD methods aim at One-class classification algorithms assume that only one kind of la-
training an ANN model to classify the data into different classes of faults. bel is known in the data. It can identify a normal class or one kind of
Lee et al. trained an ANN classifier to diagnosis eight typical faults of fault class. Support vector data description (SVDD) is one of the most
HVAC systems [85]. Seven variables were selected as the model inputs typical one-class classification algorithms. As illustrated in Fig. 8, SVDD
such as the supply air temperature, the supply air pressure and the aims at finding a minimum-volume hypersphere in a high dimensional
cooling coil valve position. Eight faults were detected by the proposed space which can enclose most of data of the target class. Zhao et al. in-
method finally including supply fan faults, return fan faults, pump faults, troduced SVDD for chiller fault detection based on fault-free data [94].
cooling coli valve faults, thermocouple faults, pressure transducer faults, Fault-free data were utilized to train a SVDD model whose hypersphere

154
Y. Zhao, C. Zhang and Y. Zhang et al. Energy and Built Environment 1 (2020) 149–164

Table 3
A list of classification-based fault detection and diagnosis methods.

Year Fault detection/diagnosis target Algorithm Ref.

1996 HVAC systems ANN [85]


2007 SVM [76]
2011 SVM [77]
2013 Recursive deterministic perceptron neural network [89]
2009 Chillers ANN [86]
2011 SVM [78–80]
2013 SVDD [94]
2014 ARX, SVM [82]
2014 SVDD [95]
2014 PCA, one-class SVM [99]
2016 SVDD [96,97]
2017 Kalman filter, one-class SVM [98]
2018 Back-tracing sequential forward feature selection, SVM [84]
2010 Vapor compression refrigeration systems PCA, SVM [81]
2015 ANN [87]
2011 Solar hot water systems Adaptive resonance theory neural network [90,91]
2016 Variable refrigerant flow systems Wavelet de-noising, SVM [83]
2017 BPNN [88]
2018 Deep belief network [92]
2018 AHU CART [93]

could enclose most of the fault-free data. Faults were detected if the data • Most of the supervised data mining models are unexplainable. In
without labels were out of the hypersphere. Based on this work, Zhao general, they learn a statistical model between input variables and
et al. further proposed a SVDD-based FDD method for chillers [95]. A target variables without any physical meanings. Therefore, it is hard
fault-free SVDD model was developed to detect faults, and several fault for experts to understand these models.
SVDD models were developed to diagnose which fault occurred. Simi- • The performance of supervised data mining-based methods would be
larly, SVDD was utilized by Li et al. to detect faults of chillers [96,97]. very poor if the operation conditions are out of the range of train-
One-class SVM, another one-class classification algorithm, has also been ing conditions. Therefore, massive amounts of data are needed for
utilized for FDD. Yan et al. proposed a hybrid chiller FDD method based training a reliable model.
on the extended Kalman filter and recursive one-class SVM [98]. The ex- • The quality of training data has a great impact on the performance
tended Kalman filter was implemented to preprocess the fault-free data. of supervised data mining-based methods. However, in the domain
Then, the preprocessed fault-free data were utilized to train a one-class of building energy systems, the data always have missing values,
SVM classifier. Similarly, Beghi et al. introduced PCA and one-class SVM noises, outliers and dead values. They should be preprocessed care-
to detect faults of centrifugal chillers [99]. In the proposed method, PCA fully. In general, this process is very time-consuming.
was employed for dimensionality reduction. • Labeled data are necessary for supervised data mining-based classi-
All the methods reviewed in this section are listed in Table 3. fication methods such as FDD. However, high-quality labeled data
are quite rare in practices. It takes lots of time for experts to label
3.2. Strengths and shortcomings of supervised data mining algorithms the data manually.

3.2.1. Strengths 4. Unsupervised data mining-based methods


Based on the reviews, the major strengths of the supervised data
mining technologies in the domain of building energy systems are 4.1. Applications of unsupervised data mining technologies in building
summarized: energy systems

• Most of supervised data mining-based methods are pure data-driven. 4.1.1. Identification of typical building operation patterns
In general, they do not require a lot domain knowledge. There are Unsupervised data mining technologies are powerful at identifying
also abundant commercial software and open source software pack- operation patterns of buildings from operation data. Some identified
ages available. Therefore, it is more feasible to apply supervised data building operation patterns are useful for helping building managers
mining-based methods compared with the physical model-based to understand the operation performance, control strategies and energy
methods. consumption profiles. In the domain of building energy systems, the
• Compared with the knowledge driven-based methods, most of super- most widely employed unsupervised data mining-based methods are as-
vised data mining-based methods have very high accuracy in energy sociation rule mining (ARM)-based methods, clustering-based methods
load prediction and FDD. Theoretically, they can learn any relation- and motif detection-based methods.
ships between input variables and target variables if the data are
sufficient. 4.1.1.1. Association rule mining-based methods. Association rule mining
• It is common that the available sensors are usually very limited in is very powerful at extracting association rules among variables from
practices. The physical model-based methods usually cannot work massive amounts of operation data. An association rule is usually rep-
in such situations. The supervised data mining-based methods show resented in the form of “A → B”, where A is the antecedent and B is
advantages that they can still work even if some important variables the consequent. However, it requires domain experts to analyze the ex-
are unavailable. tracted association rules one by one to find valuable building operation
patterns.
3.2.2. Shortcomings Apriori is one of the most common ARM algorithms for identifying
Based on the reviews, the major shortcomings of the supervised typical building operation patterns. For instance, Li et al. proposed
data mining technologies in the domain of building energy systems are an ARM-based method to discover energy consumption patterns and
summarized: compressor control strategies of variable refrigerant flow systems under

155
Y. Zhao, C. Zhang and Y. Zhang et al. Energy and Built Environment 1 (2020) 149–164

Table 4
A list of association rule mining-based building operation pattern identification methods.

Year Identification task Algorithm Ref.

2015 Temporal operation patterns of building energy systems TRuleGrowth [107]


2017 Control strategies of building energy systems Weighted ARM [104]
2019 Weighted ARM [103]
2017 Typical operation patterns and control strategies of HVAC systems QuantMiner [105]
2018 QuantMiner [106]
2017 Energy consumption patterns and control strategies of variable refrigerant flow systems Apriori [100]
2017 Regulation strategies of district heating systems Apriori [101]
2018 Electricity consumption patterns of residual buildings Apriori [102]
2018 Gradual operation patterns of building energy systems ParaMiner [108]
2019 Typical operation patterns of HVAC systems CloseGraph [109,110]

various part loads [100]. Four steps were included in the proposed y Pattern B
method, i.e., data preprocessing, clustering analysis, data association
mining, and knowledge extraction. The step of data preprocessing
aimed at improving the quality of raw data. Then clustering algorithms
were utilized to identify operation conditions. In the step of data
association mining, the Apriori algorithm was utilized to generate
association rules under different operation conditions. Finally, the Pattern C
mined association rules were interpreted by domain experts in the step Pattern A
of knowledge extraction. Xue et al. also proposed a similar method to
identify regulation strategies of district heating systems [101]. Wang
et al. utilized the Apriori algorithm to reveal the electricity consumption
patterns of residual buildings [102].
Other association rule mining algorithms had also been utilized such
as weighted ARM, quantitative ARM and temporal ARM. For instance,
Li et al. proposed a weighted ARM-based method to identify control
strategies of building energy systems [103,104]. Lighting system control x
strategies, chiller sequencing control strategies, and coordinated con-
trol strategies between chillers and pumps were identified. Fan et al. Fig. 9. Illustration of the clustering-based pattern identification methods.
introduced a quantitative ARM algorithm to identify chilled water and
condensing water distribution patterns, chiller control strategies and
cooling tower control strategies of HVAC systems [105,106]. Compared They found that k-means clustering had better pattern identification per-
with conventional ARM algorithms, the quantitative ARM algorithm formance than other clustering algorithms. Similarly, Nikolaou et al.
can mine both numerical data and categorical data without data dis- discovered that k-means clustering was suitable for heating load clas-
cretization. Considering complicated dynamics in building operations, sification of buildings [112]. Wu and Clements-Croome introduced k-
Fan et al. proposed a temporal ARM-based method to discover temporal means clustering to identify indoor environment distribution patterns
operation patterns of building energy systems [107]. A similar method [113]. Three indoor environment variables, i.e., temperature, humidity
to discover dynamic operation patterns of building energy systems also and light, were selected to characterize the indoor environment. Four
can be found in [108]. Recently, researchers realized that the graph types of indoor environment distribution patterns were identified fi-
mining algorithm, i.e., a variation of ARM, were more effective in min- nally. Gao and Malkawi applied k-means clustering to classify the build-
ing multi-relational databases than conventional ARM algorithms. For ings into different clusters based on twelve building performance fea-
instance, Fan et al. proposed a graph mining-based method to reveal tures [114]. Miller et al. utilized k-means clustering to identify daily
typical operation patterns of HVAC systems [109,110]. Graphs are ca- building electricity consumption patterns [115,116]. Similarly, Lavin
pable of describing knowledge in a visual way. Therefore, the graph and Klabjan discovered daily electricity usage patterns of about 1000
mining-based methods can improve the interpretability of the mined commercial consumers using k-means clustering [117]. K-means clus-
knowledge. tering had also been utilized to identify energy consumption patterns of
The methods reviewed in this section are listed in Table 4. hotels [118,119]. Conventional k-means clustering algorithms can work
only when the number of clusters is given. However, this parameter is
4.1.1.2. Clustering-based methods. Clustering algorithms have also been hard to be pre-specified. Some adaptive k-means algorithms had been
widely utilized to identify typical building operation patterns from developed which could optimize this parameter automatically. For in-
building operation data such as building energy consumption patterns, stance, an adaptive k-means algorithm was proposed by Kwac et al. to
indoor environment distribution pattern and building energy system op- discover daily energy consumption patterns of residual buildings [120].
eration patterns. As illustrated in Fig. 9, clustering algorithms aim at The fuzzy c-means clustering had also been adopted to identify build-
classifying all the points in a data set into several clusters based on the ing operation patterns. For instance, Iglesias investigated the perfor-
statistical similarity between each pair of points. The points in the same mance of the fuzzy c-means clustering algorithm with four similar-
cluster have similar statistical characteristics. And the points in different ity measures (i.e., Euclidean distance, Mahalanobis distance, Pearson
clusters have significantly different statistical characteristics. In general, correlation-based distance and dynamic time warping distance) in iden-
different operation conditions have different statistical characteristics. tifying daily building electricity consumption patterns [121]. They dis-
They can be classified into different clusters as shown in Fig. 9. covered that the Euclidean distance had the best performance. Qin and
K-means clustering is one of the most popular clustering algorithms Zhang also utilized the fuzzy c-means clustering algorithm to identify en-
in the domain of building energy systems. For instance, Chicco made a ergy consumption patterns of office buildings [122]. Santamouris et al.
comprehensive comparison of the performances of several typical clus- utilized the fuzzy c-means clustering algorithm to classify 320 school
tering algorithms in identifying typical electricity load patterns [111]. buildings into five categories based on their energy consumption [123].

156
Y. Zhao, C. Zhang and Y. Zhang et al. Energy and Built Environment 1 (2020) 149–164

Table 5
A list of clustering-based building operation pattern identification methods.

Year Identification task Algorithm Ref.

2007 Indoor environment distribution patterns k-means clustering [113]


2007 Energy consumption patterns of school buildings Fuzzy c-means clustering [123]
2009 Electricity load patterns of non-residential customers Support vector clustering [124]
2012 k-means clustering [111]
2012 Energy consumption patterns of hotels k-means clustering [119]
2015 [118]
2012 Lighting energy consumption patterns of commercial buildings Expectation maximization clustering [125]
2012 Building heating load patterns k-means clustering [112]
2013 Daily building electricity consumption patterns Fuzzy c-means clustering [121]
2015 k-means clustering [115,116]
2014 Building energy performance benchmarking k-means clustering [114]
2014 Daily energy consumption patterns of residual buildings Adaptive k-means clustering [120]
2015 Daily electricity usage patterns of commercial consumers k-means clustering [117]
2017 Energy consumption patterns of office buildings Fuzzy c-means clustering [122]
2017 Power consumption patterns of variable refrigerant flow systems Decision tree-based clustering [126]

energy consumption patterns of building equipment in a robust form.


Miller et al. also introduced SAX to detect daily energy consumption
d profiles of buildings [115,116]. Fonseca et al. proposed an adaptive
d SAX-based method to identify daily electricity consumption patterns of
c buildings [131]. Generic algorithms were adopted to optimize two input
Energy load

c c parameters of SAX, i.e., the word size and the alphabet size. Similarly,
Reinhardt and Koessler proposed a SAX-based method to identify power
consumption patterns [132]. Habib and Zucker introduced SAX to iden-
b b b b
b tify common operation patterns of an absorption chiller [133]. Fan et al.
applied SAX to extract temporal energy consumption patterns of build-
a ings and frequent operation patterns of subsystems [107,108]. Kalluri
a et al. proposed a SAX-based time series subsequence mining method to
discover operation patterns of appliances in office buildings [134]. Chen
0:00 6:00 12:00 18:00 0:00 6:00 12:00 18:00 0:00 and Wen used SAX to identify operation conditions of HAVC systems
Day 1 Day 2 [135].
Mean value of the data in a subsequence The methods reviewed in this section are listed in Table 6.

Fig. 10. Illustration of the symbolic aggregate approximation conversion 4.1.2. Unsupervised fault detection and diagnosis methods for building
(adapted from [115]). energy systems
Unsupervised data mining technologies are also utilized for detecting
sensor faults, device faults and anomalies in building operations without
Other clustering algorithms were also utilized such as support vec- labeled faulty data. In general, there is no significant difference between
tor clustering, expectation maximization clustering and decision tree the anomaly detection methods and the fault detection methods. There-
clustering. For instance, Chicco and Ilie proposed a support vector fore, the anomaly detection is regarded as a research branch belonging
clustering-based method to detect electricity load patterns [124]. Ac- to FDD in this paper. The common unsupervised data mining-based FDD
cording to their results, the support vector clustering algorithm had methods can be categorized into ARM-based methods, clustering-based
good effectiveness when the number of clusters was low. Petcharat et al. methods, discord detection-based methods and PCA-based methods.
utilized the expectation maximization clustering algorithm to identify
lighting energy consumption patterns of commercial buildings [125]. 4.1.2.1. Association rule mining-based methods. ARM have also been uti-
Liu et al. proposed a decision tree-based clustering method to identify lized to detect faults of building energy systems from massive amounts
power consumption patterns of variable refrigerant flow systems [126]. of building operation data successfully. Apriori is the most common one
The methods reviewed in this section are listed in Table 5. in the domain of building energy systems. Xiao and Fan proposed a data
mining framework based the Apriori algorithm to detect deficit flow
4.1.1.3. Motif detection-based methods. Motif detection aims at discov- faults and abnormal operation patterns of HVAC systems [136]. The pro-
ering frequently occurring subsequences named motif in time-series posed framework consisted of five steps including data preprocessing,
data. The detected motifs are useful for discovering temporal operation clustering analysis, association rule mining, post-mining and knowledge
patterns of buildings. Symbolic aggregate approximation (SAX) is the application. It was also applicable for FDD of other building energy sys-
most widely utilized motif detection algorithm. In general, it includes tems such as lighting systems. Li et al. also discovered undercharge faults
three steps. Firstly, the time-series data is divided into a set of subse- of variable refrigerant flow systems using the Apriori algorithm [100].
quences. Secondly, each subsequence is converted into an alphabetic Cabrera and Zareipour applied the Apriori algorithm to reveal abnormal
symbol to represent the original subsequence, as illustrated in Fig. 10. operation patterns of lighting systems in an actual educational building
Thirdly, time series with the same alphabetic symbols were classified [137]. They evaluated that 70% of the lighting energy consumption of
into the same class as a motif if they occur frequently. the actual educational building could be saved if the detected abnormal
Patnaik et al. utilized SAX to extract operation modes of chillers operation patterns were removed. Similarly, Xue et al. detected remote
[127–129]. Based these works, Shao et al. further proposed a SAX-based flow meter faults and abnormal operation patterns of district heating
method to identify temporal power operation patterns of appliances systems based on the Apriori algorithm [101].
in commercial buildings [130]. According to the researches of Patnaik Frequent pattern tree (FP-growth) is another common ARM algo-
et al. [127–129] and Shao et al. [130], SAX was powerful to describe the rithm for this purpose [138]. In general, FP-growth preforms better than

157
Y. Zhao, C. Zhang and Y. Zhang et al. Energy and Built Environment 1 (2020) 149–164

Table 6
A list of motif detection-based building operation pattern identification methods.

Year Identification task Algorithm Ref.

2009 Operation patterns of chillers SAX [127]


2010 SAX [128]
2011 SAX [129]
2013 Temporal power operation patterns of appliances in SAX [130]
commercial buildings SAX
2015 Common operation patterns of an absorption chiller SAX [133]
2015 Temporal energy consumption patterns of buildings SAX [107]
2016 Operation patterns of appliances in office building SAX [134]
2014 Electricity consumption patterns of buildings SAX [132]
2015 SAX [115,116]
2017 SAX [131]
2017 Operation patterns of HAVC systems SAX [135]
2018 [108]

Table 7
A list of association rule mining-based fault detection and diagnosis methods. y
Year Fault detection/diagnosis target Algorithm Ref.
Fault B
2012 HVAC systems FP-growth [139]
2014 Apriori [136]
2015 QuantMiner [143]
2017 QuantMiner [142]
2019 FP-growth [140,141] Fault A
2019 Text mining [144] Fault-free
2013 Lighting systems in school buildings Apriori [137]
2015 Building energy systems TRuleGrowth [107]
2018 ParaMiner [108]
2019 CloseGraph [109,110]
2017 District heating systems Apriori [101]
2017 Variable refrigerant flow systems Apriori [100]

Apriori in mining massive amounts of data. Yu et al. introduced the FP-


growth algorithm to identify abnormal operation patterns and equip- x
ment faults of HVAC systems [139]. Zhang et al. also proposed a FP-
growth-based method to detect temperature sensor faults and abnormal Fig. 11. Illustration of the clustering-based fault detection method.
operation patterns of HVAC systems [140,141]. They developed an as-
sociation rule comparison-based post mining method to remove most of
worthless association rules automatically. According their results, about The fuzzy c-means clustering algorithm and the density-based spa-
half of the association rules were removed by the proposed post mining tial clustering of applications with noise algorithm (DBSCAN) algorithm
method. are two common clustering algorithms utilized for FDD. House et al. uti-
However, the Apriori algorithm and the FP-growth algorithm can lized the fuzzy c-means clustering algorithm to detect faults of an air-
work with categorical data only. Quantitative ARM algorithms can ex- handling unit [145]. The operation data of the air-handling unit were
tract association rules from numerical data. They had been utilized for classified into three types finally, i.e., normal, faulty and unknown. Zogg
FDD of building energy systems successfully. For instance, Xiao et al. et al. also proposed a fuzzy c-means clustering-based method to detect
utilized a quantitative ARM algorithm named QuantMiner to detect de- faults of heat pumps [146,147]. Capozzoli et al. investigated the perfor-
vice control faults [142]. Fan et al. also proposed a similar method to mance of two kinds of typical clustering algorithms in detecting abnor-
identify abnormal operation patterns and sensor faults of HVAC systems mal building energy consumption patterns, i.e., the k-means clustering
[143]. and DBSCAN [148,149]. They found that the k-means clustering was
Graph mining algorithms, text mining algorithms, and dynamic ARM unsuitable to detect faults of building energy consumption. And the DB-
algorithms have also been utilized for fault detection of building energy SCAN algorithm classified all the data of abnormal operation patterns
systems. Fan et al. proposed a graph mining-based method for detecting into a cluster. The DBSCAN algorithm was also utilized by Li and Hu to
abnormal operation patterns of building energy systems [109,110]. The detect and diagnose sensor faults of chiller systems [150].
raw data were transformed into a graph dataset. Then, the CloseGraph, a Other clustering algorithms have also been utilized, such as the
common graph mining algorithm, was adopted to extract valuable sub- k-means++ clustering, the subtractive clustering, the agglomerative
graphs from the graph dataset. Gunay et al. introduced text mining algo- clustering and the k-nearest neighbor clustering. Narayanaswamy et al.
rithms to detect equipment fault frequency of HVAC systems [144]. Fan applied k-means++ clustering to identify the faults of HVAC systems
et al. developed two dynamic ARM-based methods which can discover [151]. Du et al. introduced the subtractive clustering algorithm to
dynamic operation anomalies of building energy systems [107,108]. discover the clustering centers of normal operation data and faulty
The methods reviewed in this section are listed in Table 7. operation data [152]. The faults of a HVAC system were isolated accu-
rately finally. Wall et al. proposed an agglomerative clustering-based
4.1.2.2. Clustering-based methods. Clustering algorithms can classify method for fault detection of HVAC systems [153]. Jakkula and Cook
measurements into different groups. They have shown powerful ability introduced the k-nearest neighbor clustering to identify abnormal
of FDD. In general, measurements collected from different fault condi- power consumption patterns [154].
tions should be divided into different group as illustrated in Fig. 11. The methods reviewed in this section are listed in Table 8.

158
Y. Zhao, C. Zhang and Y. Zhang et al. Energy and Built Environment 1 (2020) 149–164

Table 8
A list of clustering-based fault detection and diagnosis methods.

Year Detection/diagnosis target Algorithm Ref.

1999 Faults of an air-handling unit Fuzzy c-means clustering [145]


2001 Faults of heat pumps Fuzzy c-means clustering [146]
2006 Fuzzy c-means clustering [147]
2010 Abnormal building energy k-nearest neighbor clustering [154]
2015 consumption patterns DBSCAN [148]
2013 DBSCAN [149]
2011 Faults of HVAC systems Agglomerative clustering [153]
2014 k-means++ clustering [151]
2014 Subtractive clustering [152]
2018 Faults of chiller systems DBSCAN [150]

Table 9 The PCA algorithm have been widely utilized for detecting faults
A list of discord detection-based fault detection and diagnosis methods. of AHU. For instance, Du and Jin introduced the PCA algorithm to de-
Year Detection/diagnosis task Algorithms Refs tect the faults of AHU [156]. Two kinds of FDD models were developed
including system-level PCA models and local-level PCA models. The
2015 Abnormal daily building energy consumption SAX [115]
system-level models were utilized to detect the abnormity of systems
2015 patterns SAX [116]
2017 Faults of building energy systems SAX [131] based on energy balance. Then, the local-level models were adopted to
2018 Abnormal patterns of building electricity loads SAX [155] further diagnose the type of faults. Five common faults of AHU were
detected by the proposed method including drifting bias, fixed bias,
complete failure of sensors, water valve stuck and air damper stuck.
Wang and Xiao also proposed a PCA-based method to detect and di-
agnose sensor faults of AHU under various operating conditions [157–
159]. Two PCA-based models were developed based on the pressure-
flow balance and heat balance respectively. The result showed that the
proposed method was robust in different control modes. Based on the
works of Wang and Xiao, a diagnostic tool was developed to assist build-
ing automation systems for online FDD of AHU [160]. Madhikermi et al.
applied PCA to diagnose the faults of heat recovery units in AHU [161].
Li and Wen proposed a PCA-based method combined with wavelet trans-
form for FDD of AHU [162]. It was found that the wavelet transform
could improve the sensibility and robustness of PCA. The faults of VAV
systems also can be detected using the PCA algorithm. For instance,
Du et al. proposed an improved PCA-based method to detect and diag-
nose sensor faults of VAV systems [163–166]. The joint angle analysis
plot was utilized for fault diagnosis in the improved method instead of
Fig. 12. Illustration of principal component analysis (adapted from [158]). the conventional contribution plot. This method was further utilized to
detect multiple faults occurred in VAV systems simultaneously [167].
Similarly, Du et al. proposed a FDD method based on the PCA algo-
4.1.2.3. Discord detection-based methods. Discord detection aims at dis- rithm and fisher discriminant analysis algorithm [168]. Wang and Qin
covering infrequently occurring subsequences named discord in time- applied the PCA algorithm to detect flow sensor faults of VAV terminals
series data. The infrequent time subsequences usually indicate faulty [169]. Apart from above two kinds of systems, PCA-based FDD meth-
operation patterns of building energy systems. SAX is the most widely ods have also been applied in heat pump systems [170,171], chiller
utilized algorithm to detect infrequent time subsequences. Miller et al. systems [150,172–175], variable refrigerant flow systems [176–178],
utilized the SAX algorithm for identifying infrequent daily energy con- vapor compression systems [179], HVAC subsystems [180,181] and so
sumption profiles of buildings [115,116]. The infrequent daily energy on.
consumption profiles usually resulted from hidden faults of building en- The methods reviewed in this section are listed in Table 10.
ergy systems. However, such kinds of method cannot diagnose the spe-
cific faults automatically. Experts need to further analyze the detected
4.2. Strengths and shortcomings of unsupervised data mining algorithms
infrequent daily energy consumption profiles with an aim of diagnos-
ing the specific faults. Similarly, Capozzoli et al. proposed a SAX-based
4.2.1. Strengths
method for discovering abnormal patterns of building electricity loads
Based on the reviews, the major strengths of the unsupervised data
[155]. Fonseca et al. also concluded that the SAX algorithm was promis-
mining technologies in the domain of building energy systems are sum-
ing for FDD of building energy systems [131].
marized:
The methods reviewed in this section are listed in Table 9.
• The unsupervised data mining-based methods do not require labeled
4.1.2.4. Principal component analysis-based methods. PCA is one of the data. High-quality labeled data are quite rare in practices. Therefore,
most widely utilized FDD algorithms in the building field. As illustrated the unsupervised data mining-based methods have potential to be
in Fig. 12, the PCA algorithm decomposes original building operation applied more widely.
measurement space into two orthogonal subspaces, i.e., principal com- • The discovered knowledge by unsupervised data mining technolo-
ponent subspace and residual subspace. The principal component sub- gies always has physical meanings. Domain experts can explain the
space reflects normal conditions. The residual subspace indicates faulty discovered knowledge easily based on their domain knowledge.
conditions. Faults of building energy systems can be detected based on • The unsupervised data mining technologies can discover hidden
the squared sum of the values in the residual subspace. knowledge from building operation data such as unknown faults and

159
Y. Zhao, C. Zhang and Y. Zhang et al. Energy and Built Environment 1 (2020) 149–164

Table 10
A list of Principal component analysis-based fault detection and diagnosis methods.

Year Fault detection/diagnosis target Algorithm Ref.

2004 AHU PCA [157,158]


2006 PCA [159]
2006 PCA [160]
2014 PCA, wavelet transform [162]
2005 VAV systems PCA [169]
2006 PCA [165]
2007 PCA [163]
2007 PCA [166]
2007 PCA [167]
2007 PCA, fisher discriminant analysis [168]
2009 PCA [164]
2005 Chillers PCA [174]
2008 PCA [173]
2015 PCA [175]
2016 PCA [172]
2018 PCA [150]
2005 HVAC subsystems PCA [181]
2010 PCA [180]
2009 Heat pump systems PCA [170]
2017 PCA [171]
2017 Variable refrigerant flow systems PCA [176]
2017 PCA [177]
2018 PCA [178]
2017 Vapor compression systems PCA [179]
2018 Heat recovery units in AHU PCA [161]

unknown operation patterns. They can provide building managers specific data mining-based methods for each building energy sys-
with new insights. tem. It is in great need to develop universal methods that can serve
various building energy systems with limited adjustments.
• Develop automatic data mining-based methods. It is very crucial to
4.2.2. Shortcomings
Based on the reviews, the major shortcomings of the unsupervised the applications of data mining technologies. Because most of tech-
data mining-based technologies in the domain of building energy sys- nicians are not familiar with data mining technologies. For instance,
tems are summarized: automatic data preprocessing methods is in great need to fill in miss-
ing values, remove outliers, reduce noises and extract effective fea-
• In general, the amounts of knowledge discovered by unsupervised tures. An ideal user-friendly data mining solution requires inputs
data mining-based methods are very large. Most of them are useless. of high level information such as the configuration of energy sys-
It is very time consuming to find valuable knowledge from all of the tems and physical meanings of every sensor. It provides outputs such
mined knowledge. as executable advises with detailed explanations and accurate con-
• The performance of the unsupervised data mining-based methods trol commands. For instance, automatic data preprocessing methods
depends on the quality of data significantly. The raw operation data should understand the physical meanings of the measured variables
collected from building energy systems are always of low-quality and learn their statistical characteristics from the data, so that they
due to missing values, noises, outliers and dead values. It is very can automatically determine whether the data need to be prepro-
time-consuming to preprocess the data. cessed.
• Compared with the supervised data mining-based methods, the unsu- • Develop domain knowledge-driven data mining-based methods.
pervised data mining technologies are not good at discovering com- Domain knowledge is crucial to the intelligence level of data
plex relationships between multiple variables. mining-based methods. Most of the existing methods did not
take domain knowledge into account properly. In most of time,
5. Suggestions for future research domain knowledge was utilized to develop data mining-based
methods by experts. The future data mining-based methods shall
According to the previous researches, there is no doubt that the data be domain knowledge-driven. They shall have features of expert
mining-based methods have great potential to release the power of data systems and even cognitive computing systems. For instance, con-
in the domain of building energy systems. However, most of the data ventional clustering-based building energy consumption analysis
mining-based methods are still not mature enough for practical appli- methods usually utilized the geometrical distance between energy
cations. Based on this review, here are some suggestions for future re- consumption-related variables to quantify the similarity between
searches: different buildings. However, the geometrical distance such as
Euclidean distance is a statistical index which has no physical
• Develop data mining-based methods which can inherit the strengths meanings. Physical principle-based similarity indexes should be
of both supervised data mining technologies and unsupervised data proposed for building energy consumption clustering.
mining technologies. The previous researches had demonstrated that
it is challenging to develop a complete solution based on a type of
data mining technology only. It would be better to introduce proper 6. Conclusions
data mining technologies of both categories to solve sub-problems,
and then to combine them in a framework as a complete solution. A comprehensive literature review is made in this study about
• Develop universal data mining-based methods. Building energy sys- the data mining-based methods in the domain of building energy sys-
tems are quite different with each other in sensor installations, con- tems. In general, the supervised data mining-based methods were good
trol strategies, occupant behaviors, etc. It is unrealistic to develop at building energy load prediction and FDD. The unsupervised data

160
Y. Zhao, C. Zhang and Y. Zhang et al. Energy and Built Environment 1 (2020) 149–164

mining-based methods were good at the extraction and identification of [16] M.A. Mat Daut, M.Y. Hassan, H. Abdullah, H.A. Rahman, M.P. Abdullah, F. Hussin,
operation patterns and faults. Building electrical energy consumption forecasting analysis using conventional and
artificial intelligence methods: a review, Renew. Sustain. Energy Rev. 70 (2017)
The supervised data mining-based methods show advantages in 1108–1118, doi:10.1016/j.rser.2016.12.015.
working without domain knowledge or physical models. Moreover, they [17] C. Deb, F. Zhang, J. Yang, S.E. Lee, K.W. Shah, A review on time series forecast-
have high prediction accuracy, and can still be acceptable even if some ing techniques for building energy consumption, Renew. Sustain. Energy Rev. 74
(2017) 902–924, doi:10.1016/j.rser.2017.02.085.
important variables are unknown. They were feasible to be utilized in [18] Y. Wei, X. Zhang, Y. Shi, L. Xia, S. Pan, J. Wu, et al., A review of data-driven ap-
practices. However, the models are always unexplainable and cannot proaches for prediction and classification of building energy consumption, Renew.
extrapolate beyond the range of the training data. The unsupervised Sustain. Energy Rev. 82 (2018) 1027–1047, doi:10.1016/j.rser.2017.09.108.
[19] M. Bourdeau, X. Zhai, E. Nefzaoui, X. Guo, P. Chatellier, Modeling and forecasting
data mining-based methods show advantages in the capacity of reveal-
building energy consumption: a review of data-driven techniques, Sustain. Cities
ing meaningful operation patterns. However, it is very challenging to Soc. 48 (2019) 101533, doi:10.1016/j.scs.2019.101533.
find valuable operation patterns manually from massive amounts of the [20] Z. Li, Y. Han, P. Xu, Methods for benchmarking building energy consumption
against its past or intended performance: an overview, Appl. Energy 124 (2014)
mined patterns. The preprocessing processes for both types of methods
325–334, doi:10.1016/j.apenergy.2014.03.020.
are always very time-consuming. [21] A. Capozzoli, T. Cerquitelli, M.S. Piscitelli, Chapter 11 - Enhancing energy effi-
In summary, data mining-based methods have shown great potential ciency in buildings through innovative data analytics technologies, in: C. Dobre,
in releasing the power of data in the domain of building energy systems. F. Xhafa (Eds.), Pervasive Computing, Academic Press, Boston, 2016, pp. 353–389,
doi:10.1016/B978-0-12-803663-1.00011-5.
However, most of the data mining-based methods are still not mature for [22] C. Miller, Z. Nagy, A. Schlueter, A review of unsupervised statistical learn-
practical applications. It is in great need to develop universal, automatic ing and visual analytics techniques applied to performance analysis of non-
and domain knowledge-driven data mining-based methods in the future. residential buildings, Renew. Sustain. Energy Rev. 81 (2018) 1365–1377,
doi:10.1016/j.rser.2017.05.124.
[23] A. Bagnasco, F. Fresi, M. Saviozzi, F. Silvestro, A. Vinci, Electrical consumption
Declaration of Competing Interest forecasting in hospital facilities: an application case, Energy Build. 103 (2015) 261–
270, doi:10.1016/j.enbuild.2015.05.056.
[24] B. Dong, C. Cao, S.E. Lee, Applying support vector machines to predict build-
The authors declare that there is no conflicts of interest. ing energy consumption in tropical region, Energy Build. 37 (2005) 545–553,
doi:10.1016/j.enbuild.2004.09.009.
Acknowledgements [25] D. Zhao, M. Zhong, X. Zhang, X. Su, Energy consumption predicting model of vrv
(variable refrigerant volume) system in office buildings based on data mining, En-
ergy 102 (2016) 660–668, doi:10.1016/j.energy.2016.02.134.
This study is supported by the National Natural Science Foundation [26] C. Fan, F. Xiao, Y. Zhao, A short-term building cooling load prediction
of China (Grant No. 51706197). method using deep learning algorithms, Appl. Energy 195 (2017) 222–233,
doi:10.1016/j.apenergy.2017.03.064.
[27] M.A.R. Biswas, M.D. Robinson, N. Fumo, Prediction of residential building
References energy consumption: a neural network approach, Energy 117 (2016) 84–92,
doi:10.1016/j.energy.2016.10.066.
[1] International Energy Agency (IEA), Transition to Sustainable [28] S. Naji, A. Keivani, S. Shamshirband, U.J. Alengaram, M.Z. Jumaat, Z. Mansor,
Buildings: Strategies and Opportunities to 2050, 27 Jun 2013, et al., Estimating building energy consumption using extreme learning machine
https://doi.org/10.1787/9789264202955-en. method, Energy 97 (2016) 506–516, doi:10.1016/j.energy.2015.11.037.
[2] J.R. Vázquez-Canteli, S. Ulyanin, J. Kämpf, Z. Nagy, Fusing tensorflow with build- [29] R. Mena, F. Rodríguez, M. Castilla, M.R. Arahal, A prediction model based on neural
ing energy simulation for intelligent energy management in smart cities, Sustain. networks for the energy consumption of a bioclimatic building, Energy Build. 82
Cities Soc. 45 (2019) 243–257, doi:10.1016/j.scs.2018.11.021. (2014) 142–155, doi:10.1016/j.enbuild.2014.06.052.
[3] V.S.K.V. Harish, A. Kumar, A review on modeling and simulation of build- [30] M.W. Ahmad, M. Mourshed, Y. Rezgui, Trees vs neurons: comparison between ran-
ing energy systems, Renew. Sustain. Energy Rev. 56 (2016) 1272–1292, dom forest and ANN for high-resolution prediction of building energy consumption,
doi:10.1016/j.rser.2015.12.040. Energy Build. 147 (2017) 77–89, doi:10.1016/j.enbuild.2017.04.038.
[4] C. Fan, Y. Sun, Y. Zhao, M. Song, J. Wang, Deep learning-based feature engineering [31] S.S.K. Kwok, A study of the importance of occupancy to building cooling load in
methods for improved building energy prediction, Appl. Energy 240 (2019) 35–45, prediction by intelligent approach, Energy Conv. Manag. 52 (2011) 2555–2564,
doi:10.1016/j.apenergy.2019.02.052. doi:10.1016/j.enconman.2011.02.002.
[5] S. Katipamula, M.R. Brambley, Review article: methods for fault detection, diag- [32] Z. Hou, Z. Lian, Y. Yao, X. Yuan, Cooling-load prediction by the combination of
nostics, and prognostics for building systems—A review, part I, HVAC&R Res. 11 rough set theory and an artificial neural-network based on data-fusion technique,
(2005) 3–25, doi:10.1080/10789669.2005.10391123. Appl. Energy 83 (2006) 1033–1046, doi:10.1016/j.apenergy.2005.08.006.
[6] D. Lee, C.C. Cheng, Energy savings by energy management systems: a review, Re- [33] A.E. Ben-Nakhi, M.A. Mahmoud, Cooling load prediction for buildings using gen-
new. Sustain. Energy Rev. 56 (2016) 760–777, doi:10.1016/j.rser.2015.11.067. eral regression neural networks, Energy Conv. Manag. 45 (2004) 2127–2141,
[7] K.W. Roth, D. Westphalen, P. Llana, M. Feng, The energy impact of faults in U.S. doi:10.1016/j.enconman.2003.10.009.
commercial buildings, in: Proceedings of the International Refrigeration and Air [34] C. Fan, J. Wang, W. Gang, S. Li, Assessment of deep recurrent neural network-based
Conditioning Conference, 2004, p. 665. http://docs.lib.purdue.edu/iracc/665. strategies for short-term building energy predictions, Appl. Energy 236 (2019) 700–
[8] Z. Yu, F. Haghighat, B.C.M. Fung, Advances and challenges in building engineering 710, doi:10.1016/j.apenergy.2018.12.004.
and data mining applications for energy-efficient communities, Sustain. Cities Soc. [35] D.L. Marino, K. Amarasinghe, M. Manic, Building energy load forecasting using
25 (2016) 33–38, doi:10.1016/j.scs.2015.12.001. deep neural networks, in: Proceedings of the IECON (Industrial Electronics Confer-
[9] C. Fan, F. Xiao, Z. Li, J. Wang, Unsupervised data analytics in mining big building ence), 2016, 2016, pp. 7046–7051, doi:10.1109/IECON.2016.7793413.
operational data for energy efficiency enhancement: a review, Energy Build. 159 [36] E. Mocanu, P.H. Nguyen, M. Gibescu, W.L. Kling, Deep learning for estimat-
(2018) 296–308, doi:10.1016/j.enbuild.2017.11.008. ing building energy consumption, Sustain. Energy Grids Netw. 6 (2016) 91–99,
[10] H.X. Zhao, F. Magoulès, A review on the prediction of building en- doi:10.1016/j.segan.2016.02.005.
ergy consumption, Renew. Sustain. Energy Rev. 16 (2012) 3586–3592, [37] K.P. Amber, R. Ahmad, M.W. Aslam, A. Kousar, M. Usman, M.S. Khan, Intelligent
doi:10.1016/j.rser.2012.02.049. techniques for forecasting electricity consumption of buildings, Energy 157 (2018)
[11] A.S. Ahmad, M.Y. Hassan, M.P. Abdullah, H.A. Rahman, F. Hussin, H. Abdul- 886–893, doi:10.1016/j.energy.2018.05.155.
lah, et al., A review on applications of ANN and SVM for building electrical en- [38] G. Shi, D. Liu, Q. Wei, Energy consumption prediction of office build-
ergy consumption forecasting, Renew. Sustain. Energy Rev. 33 (2014) 102–109, ings based on echo state networks, Neurocomputing 216 (2016) 478–488,
doi:10.1016/j.rser.2014.01.069. doi:10.1016/j.neucom.2016.08.004.
[12] M.L. Chalal, M. Benachir, M. White, R. Shrahily, Energy planning and fore- [39] Y. Fu, Z. Li, H. Zhang, P. Xu, Using support vector machine to predict next day
casting approaches for supporting physical improvement strategies in the electricity load of public buildings with sub-metering devices, Procedia Eng. 121
building sector: a review, Renew. Sustain. Energy Rev. 64 (2016) 761–776, (2015) 1016–1022, doi:10.1016/j.proeng.2015.09.097.
doi:10.1016/j.rser.2016.06.040. [40] D. Liu, Q. Chen, K. Mori, Time series forecasting method of building energy
[13] N. Fumo, A review on the basics of building energy estimation, Renew. Sustain. consumption using support vector regression, in: Proceedings of the IEEE In-
Energy Rev. 31 (2014) 53–60, doi:10.1016/j.rser.2013.11.040. ternational Conference on Information and Automation, 2015, pp. 1628–1632,
[14] K. Amasyali, N.M. El-Gohary, A review of data-driven building energy con- doi:10.1109/ICInfA.2015.7279546.
sumption prediction studies, Renew. Sustain. Energy Rev. 81 (2018) 1192–1205, [41] F. Lai, F. Magoulès, F. Lherminier, Vapnik’s learning theory applied to energy con-
doi:10.1016/j.rser.2017.04.095. sumption forecasts in residential buildings, Int. J. Comput. Math. 85 (2008) 1563–
[15] Z. Wang, R.S. Srinivasan, A review of artificial intelligence based build- 1588, doi:10.1080/00207160802033582.
ing energy use prediction: contrasting the capabilities of single and ensem- [42] J. Massana, C. Pous, L. Burgas, J. Melendez, J. Colomer, Short-term load forecasting
ble prediction models, Renew. Sustain. Energy Rev. 75 (2017) 796–808, in a non-residential building contrasting models and attributes, Energy Build. 92
doi:10.1016/j.rser.2016.10.079. (2015) 322–330, doi:10.1016/j.enbuild.2015.02.007.

161
Y. Zhao, C. Zhang and Y. Zhang et al. Energy and Built Environment 1 (2020) 149–164

[43] Q. Li, P. Ren, Q. Meng, Prediction model of annual energy consumption of resi- [71] W. Chung, Using the fuzzy linear regression method to benchmark the en-
dential buildings, in: Proceedings of the International Conference on Advances in ergy efficiency of commercial buildings, Appl. Energy 95 (2012) 45–49,
Energy Engineering, 2010, pp. 223–226, doi:10.1109/ICAEE.2010.5557576. doi:10.1016/j.apenergy.2012.01.061.
[44] X. Li, J.H. Lǔ, L. Ding, G. Xu, J. Li, Building cooling load forecasting model based on [72] P. Wang, R.X. Gao, Automated performance tracking for heat ex-
LS-SVM, in: Proceedings of the Asia-Pacific Conference on Information Processing, changers in HVAC, IEEE Trans. Autom. Sci. Eng. 14 (2017) 634–645,
1, 2009, pp. 55–58, doi:10.1109/APCIP.2009.22. doi:10.1109/TASE.2017.2666184.
[45] S. Paudel, M. Elmitri, S. Couturier, P.H. Nguyen, R. Kamphuis, B. Lacarrière, et al., [73] W.J.N. Turner, A. Staino, B. Basu, Residential HVAC fault detection us-
A relevant data selection method for energy consumption prediction of low energy ing a system identification approach, Energy Build. 151 (2017) 1–17,
building based on support vector machine, Energy Build. 138 (2017) 240–256, doi:10.1016/j.enbuild.2017.06.008.
doi:10.1016/j.enbuild.2016.11.009. [74] M. Karami, L. Wang, Fault detection and diagnosis for nonlinear systems: a new
[46] H.X. Zhao, F. Magoulès, Parallel support vector machines applied to the prediction adaptive Gaussian mixture modeling approach, Energy Build. 166 (2018) 477–488,
of multiple buildings energy consumption, J. Algorithm Comput. Technol. 4 (2010) doi:10.1016/j.enbuild.2018.02.032.
231–249, doi:10.1260/1748-3018.4.2.231. [75] Y. Zhao, T. Li, X. Zhang, C. Zhang, Artificial intelligence-based fault detec-
[47] Z. Hou, Z. Lian, An application of support vector machines in cooling load pre- tion and diagnosis methods for building energy systems: advantages, chal-
diction, in: Proceedings of the International Workshop on Intelligent Systems and lenges and the future, Renew. Sustain. Energy Rev. 109 (2019) 85–101,
Applications, 2009, pp. 1–4, doi:10.1109/IWISA.2009.5072707. doi:10.1016/j.rser.2019.04.021.
[48] Z. Zheng, Z. Zhuang, Z. Lian, Y. Yu, Study on building energy load pre- [76] J. Liang, R. Du, Model-based fault detection and diagnosis of HVAC systems us-
diction based on monitoring data, Procedia Eng. 205 (2017) 716–723, ing support vector machine method, Int. J. Refrigeration 30 (2007) 1104–1114,
doi:10.1016/j.proeng.2017.09.894. doi:10.1016/j.ijrefrig.2006.12.012.
[49] C. Fan, Y. Ding, Y. Liao, Analysis of hourly cooling load prediction accuracy with [77] D. Dehestani, F. Eftekhari, Y. Guo, S. Ling, S. Su, H. Nguyen, Online support vector
data-mining approaches on different training time scales, Sustain. Cities Soc. 51 machine application for model based fault detection and isolation of HVAC
(2019) 101717, doi:10.1016/j.scs.2019.101717. system, Int. J. Mach. Learn. Comput. 1 (2011) 66–72
[50] J.S. Chou, D.K. Bui, Modeling heating and cooling loads by artificial intelli- http://hdl.handle.net/10453/33168.
gence for energy-efficient building design, Energy Build. 82 (2014) 437–446, [78] H. Han, B. Gu, J. Kang, Z.R. Li, Study on a hybrid SVM model
doi:10.1016/j.enbuild.2014.07.036. for chiller FDD applications, Appl. Therm. Eng. 31 (2011) 582–592,
[51] C. Fan, F. Xiao, S. Wang, Development of prediction models for next-day building doi:10.1016/j.applthermaleng.2010.10.021.
energy consumption and peak power demand using data mining techniques, Appl. [79] H. Han, B. Gu, Y. Hong, J. Kang, Automated fdd of multiple-simultaneous faults
Energy 127 (2014) 1–10, doi:10.1016/j.apenergy.2014.04.016. (MSF) and the application to building chillers, Energy Build. 43 (2011) 2524–2532,
[52] C. Zhang, Y. Zhao, X. Zhang, C. Fan, T. Li, An improved cooling load prediction doi:10.1016/j.enbuild.2011.06.011.
method for buildings with the estimation of prediction intervals, Procedia Eng. 205 [80] H. Han, B. Gu, T. Wang, Z.R. Li, Important sensors for chiller fault detection and
(2017) 2422–2428, doi:10.1016/j.proeng.2017.09.967. diagnosis (FDD) from the perspective of feature selection and machine learning,
[53] H. Quan, D. Srinivasan, A. Khosravi, Uncertainty handling using neural network- Int. J. Refrigeration 34 (2011) 586–599, doi:10.1016/j.ijrefrig.2010.08.011.
based prediction intervals for electrical load forecasting, Energy 73 (2014) 916– [81] H. Han, Z. Cao, B. Gu, N. Ren, PCA-SVM-based automated fault detection and diag-
925, doi:10.1016/j.energy.2014.06.104. nosis (AFDD) for vapor-compression refrigeration systems, HVAC&R Res. 16 (2010)
[54] C. Guan, P.B. Luh, L.D. Michel, Z. Chi, Hybrid Kalman filters for very short-term 295–313, doi:10.1080/10789669.2010.10390906.
load forecasting and prediction interval estimation, IEEE Trans. Power Syst. 28 [82] K. Yan, W. Shen, T. Mulumba, A. Afshari, ARX model based fault detection and
(2013) 3806–3817, doi:10.1109/TPWRS.2013.2264488. diagnosis for chillers using support vector machines, Energy Build. 81 (2014) 287–
[55] M. Dahl, A. Brun, G.B. Andresen, Using ensemble weather predictions in dis- 295, doi:10.1016/j.enbuild.2014.05.049.
trict heating operation and load forecasting, Appl. Energy 193 (2017) 455–465, [83] K. Sun, G. Li, H. Chen, J. Liu, J. Li, W. Hu, A novel efficient SVM-
doi:10.1016/j.apenergy.2017.02.066. based fault diagnosis method for multi-split air conditioning system’s re-
[56] W.Y. Lee, J.M. House, N.H. Kyong, Subsystem level fault diagnosis of a building’s frigerant charge fault amount, Appl. Therm. Eng. 108 (2016) 989–998,
air-handling unit using general regression neural networks, Appl. Energy 77 (2004) doi:10.1016/j.applthermaleng.2016.07.109.
153–170, doi:10.1016/S0306-2619(03)00107-7. [84] K. Yan, L. Ma, Y. Dai, W. Shen, Z. Ji, D. Xie, Cost-sensitive and sequential feature
[57] S. Wang, Z. Jiang, Valve fault detection and diagnosis based on CMAC neural net- selection for chiller fault detection and diagnosis, Int. J. Refrigeration 86 (2018)
works, Energy Build. 36 (2004) 599–610, doi:10.1016/j.enbuild.2004.01.037. 401–409, doi:10.1016/j.ijrefrig.2017.11.003.
[58] Z. Du, B. Fan, X. Jin, J. Chi, Fault detection and diagnosis for buildings and hvac [85] W.Y. Lee, C. Park, J.M. House, G.E. Kelly, Fault diagnosis of an air-handling unit
systems using combined neural networks and subtractive clustering analysis, Build. using artificial neural networks, ASHRAE Trans. 102 (1996) 540–549.
Environ. 73 (2014) 1–11, doi:10.1016/j.buildenv.2013.11.021. [86] Q. Zhou, S. Wang, F. Xiao, A novel strategy for the fault detection and
[59] Z. Du, B. Fan, J. Chi, X. Jin, Sensor fault detection and its efficiency analysis in diagnosis of centrifugal chiller systems, HVAC&R Res. 15 (2009) 57–75,
air handling unit using the combined neural networks, Energy Build. 72 (2014) doi:10.1080/10789669.2009.10390825.
157–166, doi:10.1016/j.enbuild.2013.12.038. [87] N. Kocyigit, Fault and sensor error diagnostic strategies for a vapor compression
[60] Z. Du, X. Jin, Y. Yang, Fault diagnosis for temperature, flow rate and pressure refrigeration system by using fuzzy inference systems and artificial neural network,
sensors in VAV systems using wavelet neural network, Appl. Energy 86 (2009) Int. J. Refrigeration 50 (2015) 69–79, doi:10.1016/j.ijrefrig.2014.10.017.
1624–1631, doi:10.1016/j.apenergy.2009.01.015. [88] S. Sun, G. Li, H. Chen, Q. Huang, S. Shi, W. Hu, A hybrid ICA-BPNN-based FDD strat-
[61] Z. Du, X. Jin, Y. Yang, Wavelet neural network-based fault di- egy for refrigerant charge faults in variable refrigerant flow system, Appl. Therm.
agnosis in Air-Handling units, HVAC&R Res. 14 (2008) 959–973, Eng. 127 (2017) 718–728, doi:10.1016/j.applthermaleng.2017.08.047.
doi:10.1080/10789669.2008.10391049. [89] F. Magoulès, H.X. Zhao, D. Elizondo, Development of an rdp neural network
[62] Y. Zhu, X. Jin, Z. Du, Fault diagnosis for sensors in air handling unit based on for building energy consumption fault detection and diagnosis, Energy Build. 62
neural network pre-processed by wavelet and fractal, Energy Build. 44 (2012) 7– (2013) 133–138, doi:10.1016/j.enbuild.2013.02.050.
16, doi:10.1016/j.enbuild.2011.09.043. [90] H. He, D. Menicucci, T. Caudell, A. Mammoli, Real-time fault detection for solar hot
[63] B. Fan, Z. Du, X. Jin, X. Yang, Y. Guo, A hybrid FDD strategy for local system of water systems using adaptive resonance theory neural networks, in: Proceedings of
AHU based on artificial neural network and wavelet analysis, Build. Environ. 45 the 5th International Conference on Energy Sustainability, 2011, pp. 1059–1065,
(2010) 2698–2708, doi:10.1016/j.buildenv.2010.05.031. doi:10.1115/ES2011-54885.
[64] G. Mavromatidis, S. Acha, N. Shah, Diagnostic tools of energy performance for [91] H. He, T.P. Caudell, D.F. Menicucci, A.A. Mammoli, Application of Adap-
supermarkets using Artificial Neural Network algorithms, Energy Build. 62 (2013) tive Resonance Theory neural networks to monitor solar hot water systems
304–314, doi:10.1016/j.enbuild.2013.03.020. and detect existing or developing faults, Solar Energy 86 (2012) 2318–2333,
[65] D.A.T. Tran, Y. Chen, C. Jiang, Comparative investigations on reference models doi:10.1016/j.solener.2012.05.015.
for fault detection and diagnosis in centrifugal chiller systems, Energy Build. 133 [92] Y. Guo, Z. Tan, H. Chen, G. Li, J. Wang, R. Huang, et al., Deep learning-based fault
(2016) 246–256, doi:10.1016/j.enbuild.2016.09.062. diagnosis of variable refrigerant flow air-conditioning system for building energy
[66] D.J. Swider, M.W. Browne, P.K. Bansal, V. Kecman, Modelling of vapour- saving, Appl. Energy 225 (2018) 732–745, doi:10.1016/j.apenergy.2018.05.075.
compression liquid chillers with neural networks, Appl. Therm. Eng. 21 (2001) [93] R. Yan, Z. Ma, Y. Zhao, G. Kokogiannakis, A decision tree based data-driven
311–329, doi:10.1016/S1359-4311(00)00036-3. diagnostic strategy for air handling units, Energy Build. 133 (2016) 37–45,
[67] Y. Zhao, S. Wang, F. Xiao, A statistical fault detection and diagnosis method doi:10.1016/j.enbuild.2016.09.039.
for centrifugal chillers based on exponentially-weighted moving average control [94] Y. Zhao, S. Wang, F. Xiao, Pattern recognition-based chillers fault detection method
charts and support vector regression, Appl. Therm. Eng. 51 (2013) 560–572, using Support Vector Data Description (SVDD), Appl. Energy 112 (2013) 1041–
doi:10.1016/j.applthermaleng.2012.09.030. 1048, doi:10.1016/j.apenergy.2012.12.043.
[68] Y. Zhao, S. Wang, F. Xiao, A system-level incipient fault-detection method for HVAC [95] Y. Zhao, F. Xiao, J. Wen, Y. Lu, S. Wang, A robust pattern recognition-based fault
systems, HVAC&R Res. 19 (2013) 593–601, doi:10.1080/10789669.2013.789371. detection and diagnosis (FDD) method for chillers, HVAC&R Res. 20 (2014) 798–
[69] D.A.T. Tran, Y. Chen, H.L. Ao, H.N.T. Cam, An enhanced chiller FDD strategy based 809, doi:10.1080/10789669.2014.938006.
on the combination of the LSSVR-DE model and EWMA control charts, Int. J. Re- [96] G. Li, Y. Hu, H. Chen, H. Li, M. Hu, Y. Guo, et al., A sensor fault detection and
frigeration 72 (2016) 81–96, doi:10.1016/j.ijrefrig.2016.07.024. diagnosis strategy for screw chiller system using support vector data description-
[70] P.M. Van Every, M. Rodriguez, C.B. Jones, A.A. Mammoli, M. Martínez- based d-statistic and DV-contribution plots, Energy Build. 133 (2016) 230–245,
Ramón, Advanced detection of HVAC faults using unsupervised SVM novelty doi:10.1016/j.enbuild.2016.09.037.
detection and Gaussian process models, Energy Build. 149 (2017) 216–224, [97] G. Li, Y. Hu, H. Chen, L. Shen, H. Li, M. Hu, et al., An improved fault detection
doi:10.1016/j.enbuild.2017.05.053. method for incipient centrifugal chiller faults using the PCA-R-SVDD algorithm,
Energy Build. 116 (2016) 104–113, doi:10.1016/j.enbuild.2015.12.045.

162
Y. Zhao, C. Zhang and Y. Zhang et al. Energy and Built Environment 1 (2020) 149–164

[98] K. Yan, Z. Ji, W. Shen, Online fault detection methods for chillers combining ex- [125] S. Petcharat, S. Chungpaibulpatana, P. Rakkwamsuk, Assessment of potential en-
tended kalman filter and recursive one-class SVM, Neurocomputing 228 (2017) ergy saving using cluster analysis: a case study of lighting systems in buildings,
205–212, doi:10.1016/j.neucom.2016.09.076. Energy Build. 52 (2012) 145–152, doi:10.1016/j.enbuild.2012.06.006.
[99] A. Beghi, L. Cecchinato, C. Corazzol, M. Rampazzo, F. Simmini, G.A. Susto, [126] J. Liu, J. Wang, G. Li, H. Chen, L. Shen, L. Xing, Evaluation of the energy
A one-class SVM based tool for machine learning novelty detec- performance of variable refrigerant flow systems using dynamic energy bench-
tion in HVAC chiller systems, IFAC Proc. Vol. 47 (2014) 1953–1958, marks based on data mining techniques, Appl. Energy 208 (2017) 522–539,
doi:10.3182/20140824-6-ZA-1003.02382. doi:10.1016/j.apenergy.2017.09.116.
[100] G. Li, Y. Hu, H. Chen, H. Li, M. Hu, Y. Guo, et al., Data partitioning and as- [127] D. Patnaik, M. Marwah, R. Sharma, N. Ramakrishnan, Sustainable operation and
sociation mining for identifying vrf energy consumption patterns under various management of data center chillers using temporal data mining, in: Proceedings
part loads and refrigerant charge conditions, Appl. Energy 185 (2017) 846–861, of the 15th ACM SIGKDD International Conference on Knowledge Discovery and
doi:10.1016/j.apenergy.2016.10.091. Data Mining, 2009, pp. 1305–1314, doi:10.1145/1557019.1557159.
[101] P. Xue, Z. Zhou, X. Fang, X. Chen, L. Liu, Y. Liu, et al., Fault detection and operation [128] D. Patnaik, M. Marwah, R.K. Sharma, N. Ramakrishnan, Data mining for model-
optimization in district heating substations based on data mining techniques, Appl. ing chiller systems in data centers, in: P.R. Cohen, N.M. Adams, M.R. Berthold
Energy 205 (2017) 926–940, doi:10.1016/j.apenergy.2017.08.035. (Eds.), Advances in Intelligent Data Analysis IX, Springer, Berlin Heidelberg, 2010,
[102] F. Wang, K. Li, N. Duić, Z. Mi, B.-.M. Hodge, M. Shafie-khah, et al., Association rule pp. 125–136.
mining based quantitative analysis approach of household characteristics impacts [129] D. Patnaik, M. Marwah, R.K. Sharma, N. Ramakrishnan, Temporal data mining
on residential electricity consumption patterns, Energy Conv. Manag. 171 (2018) approaches for sustainable chiller management in data centers, ACM Trans. Intell.
839–854, doi:10.1016/j.enconman.2018.06.017. Syst. Technol. 2 (2011) 1–29, doi:10.1145/1989734.1989738.
[103] S. Qiu, F. Feng, Z. Li, G. Yang, P. Xu, Z. Li, Data mining based framework to identify [130] H. Shao, M. Marwah, N. Ramakrishnan, A temporal motif mining approach to unsu-
rule based operation strategies for buildings with power metering system, Build. pervised energy disaggregation: applications to residential and commercial build-
Simul. 12 (2019) 195–205, doi:10.1007/s12273-018-0472-6. ings, in: Proceedings of Twenty-Seventh AAAI Conference on Artificial Intelligence,
[104] F. Feng, Z. Li, A methodology to identify multiple equipment coordinated con- 2013, 2013, pp. 1327–1333.
trol with power metering system, Energy Procedia 105 (2017) 2499–2505, [131] J.A. Fonseca, C. Miller, A. Schlueter, Unsupervised load shape clustering for
doi:10.1016/j.egypro.2017.03.721. urban building performance assessment, Energy Procedia 122 (2017) 229–234,
[105] C. Fan, F. Xiao, Assessment of building operational performance using data doi:10.1016/j.egypro.2017.07.350.
mining techniques: a case study, Energy Procedia 111 (2017) 1070–1078, [132] A. Reinhardt, S. Koessler, PowerSAX: fast motif matching in distributed power
doi:10.1016/j.egypro.2017.03.270. meter data using symbolic representations, in: Proceedings of 39th Annual
[106] C. Fan, F. Xiao, Mining big building operational data for improving building en- IEEE Conference on Local Computer Networks Workshops, 2014, pp. 531–538,
ergy efficiency: a case study, Build. Serv. Eng. Res. Technol. 39 (2018) 117–128, doi:10.1109/LCNW.2014.6927699.
doi:10.1177/0143624417704977. [133] U. Habib, G. Zucker, Finding the different patterns in buildings data using bag
[107] C. Fan, F. Xiao, H. Madsen, D. Wang, Temporal knowledge discovery in big of words representation with clustering, in: Proceedings of 13th International
BAS data for building energy management, Energy Build. 109 (2015) 75–89, Conference on Frontiers of Information Technology (FIT), 2015, pp. 303–308,
doi:10.1016/j.enbuild.2015.09.060. doi:10.1109/FIT.2015.60.
[108] C. Fan, Y. Sun, K. Shan, F. Xiao, J. Wang, Discovering gradual patterns in building [134] B. Kalluri, A. Kamilaris, S. Kondepudi, H.W. Kua, K.W. Tham, Applicability of using
operations for improving building energy efficiency, Appl. Energy 224 (2018) 116– time series subsequences to study office plug load appliances, Energy Build. 127
123, doi:10.1016/j.apenergy.2018.04.118. (2016) 399–410, doi:10.1016/j.enbuild.2016.05.076.
[109] C. Fan, M. Song, F. Xiao, X. Xue, Discovering complex knowledge in massive build- [135] Y. Chen, J. Wen, Whole building system fault detection based on weather pattern
ing operational data using graph mining for building energy management, Energy matching and PCA method, in: Proceedings of the 3rd IEEE International Confer-
Procedia 158 (2019) 2481–2487, doi:10.1016/j.egypro.2019.01.378. ence on Control Science and Systems Engineering (ICCSSE), 2017, pp. 728–732,
[110] C. Fan, F. Xiao, M. Song, J. Wang, A graph mining-based methodology for discov- doi:10.1109/CCSSE.2017.8088030.
ering and visualizing high-level knowledge for building energy management, Appl. [136] F. Xiao, C. Fan, Data mining in building automation system for improv-
Energy 251 (2019) 113395, doi:10.1016/j.apenergy.2019.113395. ing building operational performance, Energy Build. 75 (2014) 109–118,
[111] G. Chicco, Overview and performance assessment of the clustering doi:10.1016/j.enbuild.2014.02.005.
methods for electrical load pattern grouping, Energy 42 (2012) 68–80, [137] D.F.M. Cabrera, H. Zareipour, Data association mining for identifying lighting en-
doi:10.1016/j.energy.2011.12.031. ergy waste patterns in educational institutes, Energy Build. 62 (2013) 210–216,
[112] T.G. Nikolaou, D.S. Kolokotsa, G.S. Stavrakakis, I.D. Skias, On the ap- doi:10.1016/j.enbuild.2013.02.049.
plication of clustering techniques for office buildings’ energy and ther- [138] J. Han, J. Pei, Y. Yin, Mining frequent patterns without candidate generation,
mal comfort classification, IEEE Trans. Smart Grid 3 (2012) 2196–2210, SIGMOD Rec. (ACM Spec. Interest Group Manag. Data) 29 (2) (2000) 1–12,
doi:10.1109/TSG.2012.2215059. doi:10.1145/335191.335372.
[113] S. Wu, D. Clements-Croome, Understanding the indoor environment through [139] Z. Yu, F. Haghighat, B.C.M. Fung, L. Zhou, A novel methodology for knowledge
mining sensory data—A case study, Energy Build. 39 (2007) 1183–1191, discovery through mining associations between building operational data, Energy
doi:10.1016/j.enbuild.2006.07.011. Build. 47 (2012) 430–440, doi:10.1016/j.enbuild.2011.12.018.
[114] X. Gao, A. Malkawi, A new methodology for building energy performance bench- [140] C. Zhang, Y. Zhao, X.J. Zhang, An improved association rule mining-based method
marking: an approach based on intelligent clustering algorithm, Energy Build. 84 for discovering abnormal operation patterns of HVAC systems, Energy Procedia
(2014) 607–616, doi:10.1016/j.enbuild.2014.08.030. 158 (2019) 2701–2706, doi:10.1016/j.egypro.2019.02.025.
[115] C. Miller, Z. Nagy, A. Schlueter, Automated daily pattern filtering of [141] C. Zhang, X. Xue, Y. Zhao, X. Zhang, T. Li, An improved association rule mining-
measured building performance data, Autom. Construct. 49 (2015) 1–17, based method for revealing operational problems of building heating, ventila-
doi:10.1016/j.autcon.2014.09.004. tion and air conditioning (HVAC) systems, Appl. Energy 253 (2019) 113492,
[116] C. Miller, A. Schlueter, Forensically discovering simulation feedback knowledge doi:10.1016/j.apenergy.2019.113492.
from a campus energy information system, in: Proceedings of the Symposium on [142] F. Xiao, S. Wang, C. Fan, Mining big building operational data for building cool-
Simulation for Architecture & Urban Design, 2015, pp. 136–143. ing load prediction and energy efficiency improvement, in: Proceedings of the
[117] A. Lavin, D. Klabjan, Clustering time-series energy data from smart meters, Energy IEEE International Conference on Smart Computing (SMARTCOMP), 2017, pp. 1–3,
Eff. 8 (2015) 681–689, doi:10.1007/s12053-014-9316-0. doi:10.1109/SMARTCOMP.2017.7947023.
[118] S.P. Pieri, Santamouris M. IoannisTzouvadakis, Identifying energy consump- [143] C. Fan, F. Xiao, C. Yan, A framework for knowledge discovery in massive building
tion patterns in the ATTICA hotel sector using cluster analysis techniques with automation data and its application in building diagnostics, Autom. Constr. 50
the aim of reducing hotels’ CO2 footprint, Energy Build. 94 (2015) 252–262, (2015) 81–90, doi:10.1016/j.autcon.2014.12.006.
doi:10.1016/j.enbuild.2015.02.017. [144] H.B. Gunay, W. Shen, C. Yang, Text-mining building maintenance work or-
[119] I. Farrou, M. Kolokotroni, M. Santamouris, A method for energy classifica- ders for component fault frequency, Build. Res Inf. 47 (2019) 518–533,
tion of hotels: a case-study of Greece, Energy Build. 55 (2012) 553–562, doi:10.1080/09613218.2018.1459004.
doi:10.1016/j.enbuild.2012.08.010. [145] J.M. House, W.Y. Lee, D.R. Shin, Classification techniques for fault detection and
[120] J. Kwac, J. Flora, R. Rajagopal, Household energy consumption segmen- diagnosis of an air-handling unit, ASHRAE J. (1999) 105.
tation using hourly data, IEEE Trans. Smart Grid 5 (2014) 420–430, [146] D. Zogg, E. Shafai, Geering HPBT-IIC on CA. a fault diagnosis system
doi:10.1109/TSG.2013.2278477. for heat pumps, in: Proceedings of the IEEE International Conference on
[121] F. Iglesias, W. Kastner, Analysis of similarity measures in times series cluster- Control Applications (CCA’01) (Cat. No.01CH37204), 2001, pp. 70–76,
ing for the discovery of building energy patterns, Energies 6 (2013) 579–597, doi:10.1109/CCA.2001.973840.
doi:10.3390/en6020579. [147] D. Zogg, E. Shafai, H.P. Geering, Fault diagnosis for heat pumps with param-
[122] J. Qin, J. Zhang, Sampling for building energy consumption with fuzzy theory, eter identification and clustering, Control Eng. Pract. 14 (2006) 1435–1444,
Energy Build. 156 (2017) 78–84, doi:10.1016/j.enbuild.2017.09.047. doi:10.1016/j.conengprac.2005.11.002.
[123] M. Santamouris, G. Mihalakakou, P. Patargias, N. Gaitani, K. Sfakianaki, [148] A. Capozzoli, F. Lauro, I. Khan, Fault detection analysis using data mining tech-
M. Papaglastra, et al., Using intelligent clustering techniques to classify the niques for a cluster of smart office buildings, Expert Syst. Appl. 42 (2015) 4324–
energy performance of school buildings, Energy Build. 39 (2007) 45–51, 4338, doi:10.1016/j.eswa.2015.01.010.
doi:10.1016/j.enbuild.2006.04.018. [149] I. Khan, A. Capozzoli, S.P. Corgnati, T. Cerquitelli, Fault detection analysis of build-
[124] G. Chicco, I. Ilie, Support vector clustering of electrical load pattern data, IEEE ing energy consumption using data mining techniques, Energy Procedia 42 (2013)
Trans. Power Syst. 24 (2009) 1619–1628, doi:10.1109/TPWRS.2009.2023009. 557–566, doi:10.1016/j.egypro.2013.11.057.

163
Y. Zhao, C. Zhang and Y. Zhang et al. Energy and Built Environment 1 (2020) 149–164

[150] G. Li, Y. Hu, Improved sensor fault detection, diagnosis and estimation for screw [166] Z. Du, X. Jin, Detection and diagnosis for sensor fault in HVAC systems, Energy
chillers using density-based clustering and principal component analysis, Energy Conv. Manag. 48 (2007) 693–702, doi:10.1016/j.enconman.2006.09.023.
Build. 173 (2018) 502–515, doi:10.1016/j.enbuild.2018.05.025. [167] Z. Du, X. Jin, Detection and diagnosis for multiple faults in VAV systems, Energy
[151] B. Narayanaswamy, B. Balaji, R. Gupta, Y. Agarwal, Data driven investigation of Build. 39 (2007) 923–934, doi:10.1016/j.enbuild.2006.09.015.
faults in HVAC systems with Model, Cluster and Compare (MCC), in: Proceedings [168] Z. Du, X. Jin, L. Wu, PCA-FDA-based fault diagnosis for sensors in VAV systems,
of 1st ACM Conference on Embedded Systems for Energy-Efficient Buildings, 2014, HVAC&R Res. 13 (2007) 349–367, doi:10.1080/10789669.2007.10390958.
pp. 50–59, doi:10.1145/2674061.2674067. [169] S. Wang, J. Qin, Sensor fault detection and validation of VAV terminals
[152] Z. Du, B. Fan, X. Jin, J. Chi, Fault detection and diagnosis for buildings and HVAC in air conditioning systems, Energy Conv. Manag. 46 (2005) 2482–2500,
systems using combined neural networks and subtractive clustering analysis, Build. doi:10.1016/j.enconman.2004.11.011.
Environ. 73 (2014) 1–11, doi:10.1016/j.buildenv.2013.11.021. [170] Y. Chen, L. Lan, A fault detection technique for air-source heat
[153] J. Wall, Y. Guo, J. Li, S. West, A dynamic machine learning-based technique for pump water chiller/heaters, Energy Build. 41 (2009) 881–887,
automated fault detection in HVAC systems, ASHRAE Trans. 117 (2011) 449–456. doi:10.1016/j.enbuild.2009.03.007.
[154] V. Jakkula, D. Cook, Outlier detection in smart environment structured power [171] D. Yu, J. Yu, F. Sun, Y. Deng, Q. Wu, G. Cong, Research on the PCA-based intelligent
datasets, in: Proceedings of the Sixth International Conference on Intelligent Envi- fault detection methodology for sewage source heat pump system, Procedia Eng.
ronments, 2010, pp. 29–33, doi:10.1109/IE.2010.13. 205 (2017) 1064–1071, doi:10.1016/j.proeng.2017.10.171.
[155] A. Capozzoli, M.S. Piscitelli, S. Brandi, D. Grassi, G. Chicco, Automated load pat- [172] Y. Hu, G. Li, H. Chen, H. Li, J. Liu, Sensitivity analysis for PCA-based
tern learning and anomaly detection for enhancing energy management in smart chiller sensor fault detection, Int. J. Refrigeration 63 (2016) 133–143,
buildings, Energy 157 (2018) 336–352, doi:10.1016/j.energy.2018.05.127. doi:10.1016/j.ijrefrig.2015.11.006.
[156] Z. Du, X. Jin, Multiple faults diagnosis for sensors in air handling unit us- [173] X. Xu, F. Xiao, S. Wang, Enhanced chiller sensor fault detection, diagnosis and
ing Fisher discriminant analysis, Energy Conv. Manag. 49 (2008) 3654–3665, estimation using wavelet analysis and principal component analysis methods, Appl.
doi:10.1016/j.enconman.2008.06.032. Therm. Eng. 28 (2008) 226–237, doi:10.1016/j.applthermaleng.2007.03.021.
[157] S. Wang, F. Xiao, AHU sensor fault diagnosis using principal component [174] S. Wang, J. Cui, Sensor-fault detection, diagnosis and estimation for centrifugal
analysis method, Energy Build. 36 (2004) 147–160, chiller systems using principal-component analysis method, Appl. Energy 82 (2005)
doi:10.1016/j.enbuild.2003.10.002. 197–213, doi:10.1016/j.apenergy.2004.11.002.
[158] S. Wang, F. Xiao, Detection and diagnosis of AHU sensor faults using princi- [175] A. Beghi, R. Brignoli, L. Cecchinato, G. Menegazzo, M. Rampazzo, A data-
pal component analysis method, Energy Conver. Manag. 45 (2004) 2667–2686, driven approach for fault diagnosis in HVAC chiller systems, in: Proceed-
doi:10.1016/j.enconman.2003.12.008. ings of IEEE Conference on Control Applications (CCA), 2015, pp. 966–971,
[159] S. Wang, F. Xiao, Sensor fault detection and diagnosis of air-handling units using doi:10.1109/CCA.2015.7320737.
a condition-based adaptive statistical method, HVAC&R Res. 12 (2006) 127–150, [176] Y. Guo, G. Li, H. Chen, Y. Hu, H. Li, L. Xing, et al., An enhanced PCA method
doi:10.1080/10789669.2006.10391171. with Savitzky-Golay method for VRF system sensor fault detection and diagnosis,
[160] F. Xiao, S. Wang, J. Zhang, A diagnostic tool for online sensor health Energy Build. 142 (2017) 167–178, doi:10.1016/j.enbuild.2017.03.026.
monitoring in air-conditioning systems, Autom. Constr. 15 (2006) 489–503, [177] Y. Guo, G. Li, H. Chen, Y. Hu, H. Li, J. Liu, et al., Modularized PCA
doi:10.1016/j.autcon.2005.06.001. method combined with expert-based multivariate decoupling for FDD in VRF
[161] M. Madhikermi, N. Yousefnezhad, K. Främling, Heat recovery unit failure detection systems including indoor unit faults, Appl. Therm. Eng. 115 (2017) 744–755,
in air handling unit, in: I. Moon, G.M. Lee, J. Park, D. Kiritsis, G. von Cieminski doi:10.1016/j.applthermaleng.2017.01.008.
(Eds.), Advances in Production Management Systems. Smart Manufacturing For [178] S. Shi, G. Li, H. Chen, Y. Hu, X. Wang, Y. Guo, et al., An efficient VRF
Industry 4.0, Springer International Publishing, 2018, pp. 343–350. system fault diagnosis strategy for refrigerant charge amount based on PCA
[162] S. Li, J. Wen, A model-based fault detection and diagnostic methodology and dual neural network model, Appl. Therm. Eng. 129 (2018) 1252–1262,
based on PCA method and wavelet transform, Energy Build. 68 (2014) 63–71, doi:10.1016/j.applthermaleng.2017.09.117.
doi:10.1016/j.enbuild.2013.08.044. [179] Z. Du, L. Chen, X. Jin, Data-driven based reliability evaluation for measure-
[163] Z. Du, X. Jin, L. Wu, Fault detection and diagnosis based on improved PCA ments of sensors in a vapor compression system, Energy 122 (2017) 237–248,
with JAA method in VAV systems, Build. Environ. 42 (2007) 3221–3232, doi:10.1016/j.energy.2017.01.055.
doi:10.1016/j.buildenv.2006.08.011. [180] S. Wang, Q. Zhou, F. Xiao, A system-level fault detection and diagnosis strat-
[164] Z. Du, X. Jin, X. Yang, A robot fault diagnostic tool for flow rate sen- egy for HVAC systems involving sensor faults, Energy Build. 42 (2010) 477–490,
sors in air dampers and VAV terminals, Energy Build. 41 (2009) 279–286, doi:10.1016/j.enbuild.2009.10.017.
doi:10.1016/j.enbuild.2008.09.007. [181] X. Hao, G. Zhang, Y. Chen, Fault-tolerant control and data recov-
[165] X. Jin, Z. Du, Fault tolerant control of outdoor air and AHU supply air temperature ery in HVAC monitoring system, Energy Build. 37 (2005) 175–180,
in VAV air conditioning systems using PCA method, Appl. Therm. Eng. 26 (2006) doi:10.1016/j.enbuild.2004.06.023.
1226–1237, doi:10.1016/j.applthermaleng.2005.10.039.

164

You might also like