
Burst Detection in District Metering Areas Using Deep Learning Method
Xiaoting Wang 1; Guancheng Guo 2; Shuming Liu, Aff.M.ASCE 3; Yipeng Wu 4; Xiyan Xu 5; and Kate Smith 6

Downloaded from ascelibrary.org by University of Birmingham on 04/16/20. Copyright ASCE. For personal use only; all rights reserved.

Abstract: Water loss reduction is important in sustainable water resource management. As one of the main water loss control methods, early
detection of hydraulic accidents in district metering areas (DMAs) has emerged as a research focus. This study presents a data-driven method
for burst detection which consists of three stages: prediction, classification and correction. A prediction stage is used to improve accuracy of
flow prediction, a classification stage utilizes multiple thresholds to make the method robust to time variation, and an outlier feedback
correction stage allows consecutive detection of outliers. The proposed method was capable of triggering burst alarms with 99.80% detection
accuracy (DA), 85.71% true-positive rate (TPR), and 0.14% false-positive rate (FPR) in simulated experiments, and 99.77% DA, 94.82%
TPR and 0.21% FPR in synthetic experiments over a 10-min detection time in a real-life DMA. The identifiable minimum burst rate was as
low as 2.79% of average DMA inflow. The proposed method outperformed the single threshold-based method, window size–based method,
and clustering-based method. It provides a sensitive and effective solution for burst detection in water distribution systems.
DOI: 10.1061/(ASCE)WR.1943-5452.0001223. © 2020 American Society of Civil Engineers.
Author keywords: Water distribution system; Water loss management; Leakage control; Long short-term memory model; Multithreshold
classification; Outlier feedback correction.

1 Ph.D. Student, Smart Water Research Center, School of Environment, Tsinghua Univ., Beijing 100084, China. Email: wang-xt15@mails.tsinghua.edu.cn
2 Ph.D. Student, Smart Water Research Center, School of Environment, Tsinghua Univ., Beijing 100084, China. Email: ggc16@mails.tsinghua.edu.cn
3 Professor, Smart Water Research Center, School of Environment, Tsinghua Univ., Beijing 100084, China (corresponding author). Email: shumingliu@tsinghua.edu.cn
4 Ph.D. Student, Smart Water Research Center, School of Environment, Tsinghua Univ., Beijing 100084, China. Email: wuyp17@mails.tsinghua.edu.cn
5 Postdoctoral Fellow, Smart Water Research Center, School of Environment, Tsinghua Univ., Beijing 100084, China. Email: xiyanxu@tsinghua.edu.cn
6 Postdoctoral Fellow, Smart Water Research Center, School of Environment, Tsinghua Univ., Beijing 100084, China. Email: katesmith@mails.tsinghua.edu.cn
Note. This manuscript was submitted on April 19, 2019; approved on January 7, 2020; published online on March 23, 2020. Discussion period open until August 23, 2020; separate discussions must be submitted for individual papers. This paper is part of the Journal of Water Resources Planning and Management, © ASCE, ISSN 0733-9496.

© ASCE 04020031-1 J. Water Resour. Plann. Manage.

J. Water Resour. Plann. Manage., 2020, 146(6): 04020031

Introduction

Water loss in water distribution systems (WDS) has received much attention over the last two decades because water reduction is of great importance in sustainable water resource management (Melgarejo-Moreno et al. 2019). Bursts are an important form of water loss (AWWA 2009; Yan et al. 2019) because they are a financial liability for water industries, and can represent a public health threat (Fox et al. 2016; Qi et al. 2018). The time between burst occurrence and identification by water utilities determines the volume of water lost (Bakker et al. 2014b). Generally, water utilities quickly identify serious bursts because of user hotline complaints, and are able to repair the burst pipeline in a short time. However, for small bursts or leakage, the time duration between occurrence and identification is much longer, causing a significant increase of water loss. Minimizing the unawareness time duration is a crucial task for water loss control in WDS.

Current global water industry practices show that district metering areas (DMAs) are effective ways to control water bursts (Zhang et al. 2019). In China, DMAs have been widely implemented following the implementation of the Urban Water Distribution Network District Metering Management Guidelines (MOHURD 2017). DMAs divide a water distribution network into a number of hydraulically isolated areas. The water utilities can detect water losses using flow meters installed in DMAs by analyzing the minimum night flow (MNF) in DMAs (Farley and Trow 2003). However, some bursts will last for days or weeks before being found by MNF analysis, which creates a serious time delay. Rapid developments in wireless sensors as well as supervisory control and data acquisition (SCADA) systems provide some new tools for leakage control in DMAs (Chen and Boccelli 2018). Real-time data-driven models are proposed to detect the pipe bursts. They provide information on DMA locations for manual leak detection in real time. Compared with traditional manual detection methods, the data-driven methods perform better, cost less, and reduce the time between leak occurrence and identification.

Previous studies of data-driven methods primarily focused on using a two-stage data-driven framework, which includes prediction and classification. These methods incorporate (1) the prediction of flow/pressure values based on data-driven models, and (2) the classification of residuals between predicted and observed values. Bursts can be identified by classifying residuals into normal and abnormal ones. The framework of these methods is quite clear, but there are still some issues to be addressed.

The first issue is how to ensure good performance in the prediction stage. Artificial neural networks (ANNs) are the most popular prediction method in the field of data-driven burst detection and have been widely used in the last decade (Mounce et al. 2002). As a nonlinear fitting algorithm (Li et al. 2015, 2017), the ANN model uses previous sequential flow and pressure data as inputs to predict future values (Romano et al. 2014a, b). However, for most two-stage methods, the performance of prediction methods rarely has
been discussed. Recently, cloud computing and the popularity of deep learning have led to a renaissance of neural networks (LeCun et al. 2015). The new concept of data mining has proven powerful for mining the intrinsic relationships among potentially correlated variables (Li et al. 2017). In the field of WDS, deep neural networks also show potential applications (Wu et al. 2015). For water demand prediction, the performance of the deep learning method has proven much better than traditional ANN or seasonal autoregressive integrated moving average (SARIMA) models (Guo et al. 2018).

The second issue is how to set proper thresholds in the classification stage. For classification in the two-stage methods, machine learning–based methods have been used for burst detection with different success rates, but they have the limitation of computational complexity (Mounce et al. 2003). Thus, statistical process control (SPC) with simple computation has become the most widely used method for residual classification (Jung et al. 2015; Palau et al. 2012). A previous study showed that residuals vary with flow 24 h/day (Hutton and Kapelan 2015a). Considering the variation of residuals, statistical theory can be used to calculate a variational threshold to detect abnormal patterns in residuals (Bakker et al. 2014b; Jung et al. 2015; Palau et al. 2012; Romano et al. 2014a, b). Similarly, a probabilistic approach was proposed by Hutton and Kapelan (2015b). These studies indicate that the variable threshold shows great potential for two-stage outlier identification.

The third issue is how to tackle the accumulation of prediction residuals. A burst is an accumulated water loss process. To transform it into a data-driven model, burst detection can be regarded as the continuous identification of outliers. Nevertheless, when the recognized outlier is input into a prediction model, it will cause a certain bias of the predicted value. This biased-prediction model will predict an abnormal flow when a burst is occurring and will reduce the values of subsequent prediction residuals. Consequently, in the second stage of burst detection, it is difficult to identify residuals continuously. Thus, avoiding the prediction bias caused by outliers and performing normal flow prediction are significant challenges. To overcome these problems, recent studies of burst detection have been carried out, by using weighted least-squares fitting and treating the Kalman filter as a prediction pattern to forecast normal flow in DMAs (Ye and Fenner 2011, 2014b). However, for most two-stage methods, the correction of abnormal flow to normal flow is rarely discussed.

To fill these knowledge gaps, the authors propose a novel data-driven algorithm for burst detection, including three steps: prediction, classification, and correction. It uses inflow data and detects bursts in DMAs. First, the observed flow is predicted by deep learning recurrent neural networks, and the residuals are applied to represent the difference between observed and predicted values. Second, a multithreshold stage that changes with time is introduced for outlier identification, mainly based on the characteristics in the residual data. Finally, the outlier feedback correction stage is used to return abnormal flow to normal flow for continuous outlier detection. Additionally, this proposed algorithm was validated using several burst tests in real DMAs.

Methodology

Feature Extraction

DMA flow is a type of time series data with tendency and periodicity. Some studies have considered the tendency and estimated the flow in DMAs through original time series analysis. However, it is difficult to detect outliers in the flow data because of nonlinear fluctuations (Ye and Fenner 2014b). To overcome this problem, some previous studies have considered the periodicity of flow and made daily time series analysis to describe a relatively stable flow pattern (Wu et al. 2018b; Wu et al. 2016; Ye and Fenner 2014b). This method has shown great potential for outlier identification, but the data are relatively isolated. The interval between adjacent data in a daily time sequence is given in terms of days, which is an interval of 24 h. Thus, the daily time sequence loses the adjacent minute-level intervals, and does not consider the tendency of flow time sequences. To reach a trade-off, original time series were selected for flow prediction, and daily time series were selected for prediction, classification, and correction.

To extract more features for flow prediction, this study established multiple time-series data sets to extract the features of tendency and periodicity. Data at each time point in the original time series were extracted to form the daily time series. The flow data were recorded every 5 min, so that each day had 288 flow readings, i.e., the original time series $\{f_1, f_2, \ldots, f_n\}$ was transformed to $\{f_1^1, f_2^1, \ldots, f_m^1\}, \{f_1^2, f_2^2, \ldots, f_m^2\}, \ldots, \{f_1^{288}, f_2^{288}, \ldots, f_m^{288}\}$, where $f$ represents the value of flow measurements, $n$ is the number of historical flow data in the original time series, and $m$ is the number of historical flow data in the daily time series. The periodicity of water flow was illustrated in daily time series, and the original time series was shown to describe the water flow varying with time sequence.

Flow Prediction

A recurrent neural network (RNN) is a neural network framework that uses self-connections from the previous time step as inputs. Hence the hidden state contains a dynamic history of the input features sequence instead of a fixed-size window (Schmidhuber 2015). It is difficult to train these networks, because they generally contain millions of parameters. Recent advances in neural network technologies, especially for long short-term memory (LSTM), have shown some promising results in machine learning tasks. LSTM has impressive performance in tasks as varied as image recognition, language translation, and natural language processing (Sainath et al. 2015).

LSTM has a strong ability to deal with nonlinear data, especially for sequence processing. Compared with conventional hidden units, using an LSTM unit ensures that the gradient does not vanish or increase sharply after a large number of iterations, which overcomes the difficulties encountered in traditional RNN training. Fig. 1 describes an LSTM unit.

The input sequence of LSTM is $x = \{x(1), x(2), \ldots, x(n)\}$, where $x(t)$ is an E-dimensional word vector in this study. The calculation process of LSTM can be briefly described by Eqs. (1)–(6)

$$g_t = \tanh(W_{xg} x_t + W_{hg} h_{t-1} + b_g) \tag{1}$$

$$i_t = \sigma(W_{xi} x_t + W_{hi} h_{t-1} + b_i) \tag{2}$$

$$f_t = \sigma(W_{xf} x_t + W_{hf} h_{t-1} + b_f) \tag{3}$$

$$s_t = s_{t-1} \odot f_t + i_t \odot g_t \tag{4}$$

$$o_t = \sigma(W_{xo} x_t + W_{ho} h_{t-1} + b_o) \tag{5}$$

$$h_t = o_t \odot \tanh(s_t) \tag{6}$$

where $h_t$ = output; $s_t$ = cell state; $\odot$ denotes element-wise multiplication; $g_t$, $i_t$, $f_t$, and $o_t$ = squeeze unit, input unit, forget unit, and output unit, respectively; $\sigma$ denotes the sigmoid function; $W_{xg}$, $W_{hg}$, $W_{xi}$, $W_{hi}$,

$W_{xf}$, $W_{hf}$, $W_{xo}$, and $W_{ho}$ = related weight matrices; and $b_g$, $b_i$, $b_f$, and $b_o$ = related biases.

Fig. 1. Illustration of an LSTM unit.

In this study, the LSTM model has a structure which consists of three LSTM layers and three fully connected layers. The LSTM unit is used as the core to build a flow prediction model, and the memory function of LSTM retains information on past flow and establishes interdependencies in flow for different periods.

Multithreshold Classification

LSTM is applied to predict normal flow. When a burst occurs, the abnormal flow of observed values differs from the predicted value. It can be calculated by the residuals $R_t$ between the observed value $X_t$ and the predicted value $\hat{X}_t$

$$R_t = X_t - \hat{X}_t \tag{7}$$

The residuals are dynamic but recurrent under different water flow because the volume of water used varies 24 h/day (Fig. 2) (Hutton and Kapelan 2015a). For instance, two bursts (A and B) are shown in Fig. 2. The size of A is smaller than that of B. A higher residual threshold [Fig. 2(a)] is able to detect the large Burst B, but the small Burst A is ignored, meaning that the small burst will not be reported. By contrast, a lower fixed bound [Fig. 2(b)] will cause a number of false alarms.

Fig. 2. Schematic diagram of threshold setting: (a) burst detection using a higher fixed bound; and (b) burst detection using a lower fixed bound. A and B are two bursts. The size of A is smaller than B. Shaded columns indicate the detected bursts.

Based on the characteristics of residuals, this study proposes a multithreshold stage applied to time variation to detect outliers. Residuals are first divided into different daily time series to overcome the temporal limitation. This is similar to the transformation of historical flow data. The original time series $\{r_1, r_2, \ldots, r_n\}$ was transformed into 288 daily time series $R_1$–$R_{288}$, which are $\{r_1^1, r_2^1, \ldots, r_m^1\}, \{r_1^2, r_2^2, \ldots, r_m^2\}, \ldots, \{r_1^{288}, r_2^{288}, \ldots, r_m^{288}\}$, where $r$ represents the value of the residual.

Here, the residuals of each series were analyzed by a z-score, which is a statistical indicator for outlier detection. It indicates how many standard deviations a data point is from the mean of the sample

$$z = \frac{r_t - \mu}{\sigma} \tag{8}$$

where $r_t$ = residual between predicted value and observed value at detection time, and $\mu$ and $\sigma$ = mean and standard deviation, respectively, of all residuals in the daily residual series to which $r_t$ belongs.

Rule-of-thumb values for the z-threshold $Z$ are 2, 2.56, 3, 3.5, or more standard deviations (Vaghefi et al. 2018). The mean value and standard deviation vary over the 288 daily residual series. Therefore, the 288 thresholds are calculated and treated as the multiple thresholds mentioned previously.

When the residual exceeds the threshold for its corresponding time point, this residual can be considered as an outlier. If multiple thresholds are rewritten in the form of intervals, they can be expressed as

$$
\begin{cases}
\text{outlier, high flow:} & r_t \ge \bar{R}_n + Z\sqrt{\mathrm{Var}[R_n]} \\
\text{normal flow:} & \bar{R}_n - Z\sqrt{\mathrm{Var}[R_n]} < r_t < \bar{R}_n + Z\sqrt{\mathrm{Var}[R_n]} \\
\text{outlier, low flow:} & r_t \le \bar{R}_n - Z\sqrt{\mathrm{Var}[R_n]}
\end{cases} \tag{9}
$$

where $R_n$ = daily residual series at time point $n$, which corresponds to time $t$.
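The per-time-point thresholds of Eqs. (8) and (9) can be illustrated with a minimal Python sketch. It assumes NumPy, a residual series that covers whole days at 5-min resolution, and the population standard deviation for the square root of Var[R_n]. The function names (`to_daily_series`, `fit_thresholds`, `classify`) are hypothetical and are not taken from the study.

```python
import numpy as np

STEPS_PER_DAY = 288  # one reading every 5 min, as stated in the text


def to_daily_series(series):
    """Reshape a 1-D series of whole days into shape (m days, 288 slots).

    Column k then holds the daily time series for time-of-day slot k.
    """
    series = np.asarray(series, dtype=float)
    if series.size % STEPS_PER_DAY != 0:
        raise ValueError("series must cover a whole number of days")
    return series.reshape(-1, STEPS_PER_DAY)


def fit_thresholds(residuals, z=3.0):
    """Return (lower, upper) arrays of 288 residual bounds, one per slot."""
    daily = to_daily_series(residuals)
    mean = daily.mean(axis=0)
    std = daily.std(axis=0)  # population std: an assumption of this sketch
    return mean - z * std, mean + z * std


def classify(residual, slot, lower, upper):
    """Label a residual observed at time-of-day slot `slot` (0..287)."""
    if residual >= upper[slot]:
        return "high"    # outlier, high flow
    if residual <= lower[slot]:
        return "low"     # outlier, low flow
    return "normal"
```

Because each slot has its own mean and spread, the same residual can be an outlier at a quiet time of day and normal at a time when residuals fluctuate more, which is the behavior a single fixed bound cannot provide.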

Specifically, for outliers, when the observed value is higher than the predicted value, it is larger than the flow under normal conditions. A potential burst or an instrument error may be present in the DMA. By contrast, when the observed value is lower than the predicted value, this indicates that the water consumption is lower than the normal case. Outliers also can be shaped by meter error or low water usage.

Outlier Feedback Correction

The identified outliers affect subsequent predicted values, which causes continuous miscalculation. LSTM attempts to restore the normal flow to ensure continuous difference between normal and abnormal conditions. When the residual is marked as an outlier, the appropriate data are selected as an alternative and fed back to the LSTM input layer as a normal flow. Therefore, this study proposes a method for rapid correction of identified outliers called outlier feedback correction (OFC).

Similar analysis was done in the short-term prediction of wind speed (Mi et al. 2017). The moving average method, horizontal–vertical comparison, and the probability method are widely used in outlier feedback correction. In this study, outlier correction was based on the predicted value and the residual correction value (or error correction value) (Ma and Liu 2016)

$$x_t = \hat{x}_t + \bar{R}_n \tag{10}$$

where $x_t$ = corrected value for identified outlier at time $t$; $\hat{x}_t$ = predicted value of LSTM at time $t$; and $\bar{R}_n$ = mean of residual correction value $R_n$.

To reduce false alarms and ensure the reliability of the model, many data-driven models raise an alarm when they detect two or more outliers (Loureiro et al. 2016; Wu and Liu 2017; Wu et al. 2018a). Based on this, the correction step triggers an alarm only when $Q$ ($Q \ge 2$) consecutive flow data are identified as outliers. Because recordings are made every 5 min, the time to detect (in minutes) can be given by

$$T_{\text{detect}} = 5 \times Q \tag{11}$$

Performance Evaluation

Four metrics were calculated to evaluate the effectiveness of burst detection: mean absolute percentage error (MAPE), true positive rate (TPR), false positive rate (FPR), and detection accuracy (DA). Among them, MAPE was used to directly evaluate the prediction effect of LSTM and to indirectly evaluate the effect of burst detection.

In statistics, the MAPE is a prediction metric that expresses accuracy in terms of a percentage

$$\text{MAPE} = \frac{100}{N} \sum_{t=1}^{N} \left| \frac{X_t - \hat{X}_t}{X_t} \right| \tag{12}$$

where $X_t$ = observed value; $\hat{X}_t$ = predicted value; and $N$ = number of predictions. A good prediction method should keep the MAPE value small (Guo et al. 2018).

TPR, FPR, and DA are three other metrics for calculating the effectiveness of the classification method, and are defined by the following three formulas:

$$\text{TPR} = \frac{TP}{TP + FN} \times 100\% \tag{13}$$

$$\text{FPR} = \frac{FP}{FP + TN} \times 100\% \tag{14}$$

$$\text{DA} = \frac{TP + TN}{TP + FN + TN + FP} \times 100\% \tag{15}$$

where TP = true positive, which means that a burst occurs and is detected correctly; FN = false negative, which means that a burst occurs but fails to be detected; TN = true negative, which means that a burst does not occur and is not detected; and FP = false positive, which means that a burst does not occur but is wrongly detected as having occurred.

The TPR is the proportion of burst data correctly identified as bursts under burst conditions, whereas the FPR is the proportion of normal data identified as bursts under normal conditions (Wu and Liu 2017). DA is the ratio of correctly classified data (bursts and normal data) under all conditions, which is a measure of composite burst detection ability. A good burst detection method has a high TPR, a small FPR, and a high DA.

Results and Discussion

Study Area and Datasets

To evaluate the efficiency of the proposed method, some examples were tested with the inflow data collected from a real-life DMA in the south of China. The same DMA was employed in previous work (Wu et al. 2016). It is a typical DMA in China with meters installed at each inlet. The service area of this metering area reaches 6.5 km², water is supplied to more than 25,000 connections by gravity from east to west, and two inlets are located in the east of the DMA. The average daily water consumption is more than 10,000 m³. The flows were recorded every 5 min, and the data were collected from the SCADA system over a period of 135 days, from July 20 to November 21, 2015. To fit the method to more standard DMAs, data from the two inflows were integrated as a single inflow DMA. The integrated flow data were cleaned by checking for repetitive values (i.e., repetitive flow data at the same time) and filling the missing values through interpolation.

The simulated and synthetic experiments in the DMA were used to develop and validate the proposed three-stage detection method. Three bursts were simulated by the water utility by opening a fire hydrant at different locations, and the burst records were provided by the water utility. The case was experimentally proved to be applicable when employed in a previous study (Wu et al. 2016). To further demonstrate the performance of the model, artificially synthesized outliers were added to the initial flow time series to form 96 sets of synthetic bursts. Seven-day flow data were selected randomly; flow data starting at 0:00, 6:00, 12:00, and 18:00 (representing peaks and troughs in water demand) were added with 5, 10, and 20 outliers. The value of the synthetic burst was in the range 15%–20% of the daily average flow. The occurrence time of bursts and corresponding outliers were used to evaluate the accuracy of the model.

Extracting Features and Model Establishment

To predict the flow at the time point t, original time series and daily time series were combined to form four input features: (1) the adjacent daily time sequence of t − 1, (2) the adjacent daily time sequence of t + 1, (3) the adjacent daily time sequence of t, and (4) the adjacent original time sequence of t. The periodicity of

water flow was illustrated in (1), (2), and (3) to consider both weekdays and weekends, and a typical time sequence was shown in (4) to describe the water flow varying with time sequence. Because flow patterns vary weekly, daily time series input used the seven nearest data, which represented the periodicity of an entire week. The seven nearest data of different features were selected as the cycle input of the LSTM model (Table 1), and the output was the flow of this DMA for the following time step.

Table 1. Features and inputs of LSTM model

Feature                                              Input
First period (daily interval sequence of t − 1)      t − 1 − (24 h) × 7, t − 1 − (24 h) × 6, t − 1 − (24 h) × 5, t − 1 − (24 h) × 4, t − 1 − (24 h) × 3, t − 1 − (24 h) × 2, and t − 1 − (24 h) × 1
Second period (daily interval sequence of t + 1)     t + 1 − (24 h) × 7, t + 1 − (24 h) × 6, t + 1 − (24 h) × 5, t + 1 − (24 h) × 4, t + 1 − (24 h) × 3, t + 1 − (24 h) × 2, and t + 1 − (24 h) × 1
Third period (daily interval sequence of t)          t − (24 h) × 7, t − (24 h) × 6, t − (24 h) × 5, t − (24 h) × 4, t − (24 h) × 3, t − (24 h) × 2, and t − (24 h) × 1
Fourth period (sampling interval sequence of t)      t − 35 min, t − 30 min, t − 25 min, t − 20 min, t − 15 min, t − 10 min, and t − 5 min

The LSTM model has four parameters to be optimized: (1) number of LSTM units; (2) activation functions [i.e., tanh, rectified linear unit (ReLU), and sigmoid]; (3) number of dense layers; and (4) initial learning rate. The optimal parameters were searched within certain ranges. The number of nodes in the three LSTM layers was set to 128, 64, and 48, and the activation function of the LSTM layers was tanh. The activation function of the first and second fully connected neural network layers was the rectified linear unit (Wu and Liu 2017), and the activation function of the third network layer (corresponding to the output layer) was linear. The learning rate was 0.002, and the batch size was 60.

The full data set was divided into training data (25,200 samples over 87 days) and testing data (5,184 samples over 18 days). The training data were randomly selected. The testing data were used to evaluate the proposed method. Previous studies showed that a sensitivity test is more valuable than a cross-validation method, with less computational cost (Li et al. 2019; Li and Zhang 2018). A sensitivity test that shuffled data into multiple training and testing runs was used in this study for model validation.

With regard to classification and correction, the thresholds of each time point were calculated through daily time series of residuals. As the preferred parameter of the model, Z was set to 3, which was obtained by evaluating the optimal point of a receiver operating characteristic (ROC) curve. More details are discussed in section "Performance Evaluation of Multithreshold-Based Classification Stage." To prevent false alarms caused by these noise data and to make the results of the normal flow prediction model align with our expectations, the outliers were abandoned and filled with a correction value. In this study, the method triggered an alarm only when two (alarm threshold Q = 2) consecutive flow data were identified as outliers. In other words, a burst was detected within 10 min. After this alarm has been sounded, the method continues to detect outliers and determines the duration of the burst.

Results of Burst Detection

The model was run 10 times to test for sensitivity. Fig. 3 shows the results of the sensitivity test for simulated burst detection. These metrics are within certain ranges, which indicates that the model is robust (Guo et al. 2018). One of these tests was selected at random for use in the following discussion of results.

Fig. 3. Results of sensitivity test for burst detection: (a) mean absolute percentage error of LSTM before outlier feedback correction; (b) mean absolute percentage error of LSTM after outlier feedback correction; (c) true positive rate of burst detection; (d) false positive rate of burst detection; and (e) detection accuracy of burst detection.

Table 2 shows multiple thresholds (high-flow thresholds) for 87 days. Previous studies proposed that the residual between predicted and measured value could be estimated as burst size (Bakker et al. 2014a; Wu and Liu 2017; Ye and Fenner 2014a). The purpose of multiple thresholds was to measure the confidence interval for residuals. In other words, the boundary (thresholds) of the confidence interval can be assumed to be the minimum detectable burst size. The identifiable minimum bursts in the DMA ranged from 2.79% to 13.51% of the average inflow.
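The correction and alarm logic configured above (Eq. (10) with an alarm threshold of Q = 2) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `predict` is a hypothetical callable standing in for the trained LSTM, the bounds and mean residuals are assumed to be precomputed per time-of-day slot, and both high-flow and low-flow outliers count toward the alarm. The name `detect_with_ofc` is not from the study.

```python
from collections import deque


def detect_with_ofc(flows, predict, lower, upper, mean_resid, history,
                    q=2, steps_per_day=288, start_slot=0):
    """Flag outliers, feed corrected values back, and raise alarms.

    flows      : observed flows in time order
    predict    : callable(list of recent flows) -> predicted flow
                 (hypothetical stand-in for the LSTM model)
    lower/upper: residual bounds per time-of-day slot
    mean_resid : mean residual per time-of-day slot
    history    : recent normal flows used to seed the predictor input
    Returns (labels, alarms): outlier flags and indices where an alarm fires.
    """
    window = deque(history, maxlen=len(history))
    labels, alarms = [], []
    run = 0  # length of the current run of consecutive outliers
    for i, observed in enumerate(flows):
        slot = (start_slot + i) % steps_per_day
        predicted = predict(list(window))
        residual = observed - predicted
        is_outlier = residual >= upper[slot] or residual <= lower[slot]
        if is_outlier:
            run += 1
            # Eq. (10): replace the outlier before it re-enters the input
            window.append(predicted + mean_resid[slot])
        else:
            run = 0
            window.append(observed)
        labels.append(is_outlier)
        if run == q:  # alarm once per run, after q consecutive outliers
            alarms.append(i)
    return labels, alarms
```

With 5-min readings and `q=2`, an alarm fires at the second consecutive outlier, that is, within 10 min of the first one. Because the corrected value rather than the observed outlier is appended to the input window, later predictions are not pulled toward the burst flow and the residuals stay large for as long as the burst lasts.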

Table 2. Efficiency of burst detection 0.21%, respectively. Based on the preceding test results, the pro-
Upper residual Identifiable minimum posed method can identify bursts in a pipeline network quickly
Detection time threshold (L/s) burst (% average inflow)a and accurately.
0:00 64.84 6.49
0:05 69.59 6.97 Performance Evaluation of LSTM Model
0:10 55.61 5.57
0:15 64.00 6.41 Flow prediction is the first stage of the model, and it is essential to
0:20 62.18 6.23 ensure the accuracy of burst detection. A more realistic flow state
.. .. ..
. . . can be described by a more accurate prediction, which is beneficial
3:30 27.86 2.79 in the second step of detection.
3:35 31.24 3.13 The prediction performance is summarized in Table 4. The per-
3:40 31.58 3.16 formance was obtained by using LSTM and ANN models on test-
3:45
.. 28.01
.. 2.80
.. ing data. The number of nodes in the ANN layers was set to 128,
. . .
Downloaded from ascelibrary.org by University of Birmingham on 04/16/20. Copyright ASCE. For personal use only; all rights reserved.

32, 1, and the activation function of these layers was set to tanh,
6:00 96.67 9.68
ReLU, and linear. The learning rate was 0.01, and the batch size
6:05 115.73 11.59
6:10 134.91 13.51 was 60. Four common metrics [MAPE, mean squared error (MSE),
6:15 119.28 11.94 mean absolute error (MAE), and relative error (RE)] were calcu-
.. .. ..
. . . lated to evaluate the effectiveness of the prediction. The results
15:00 56.78 5.68 showed that the best forecasting performance was obtained using
15:05
.. 69.06.. 6.91.. the LSTM model, because it had a higher prediction accuracy
. . . (e.g., MAPE ¼ 2.52%) than the ANN model (e.g., MAPE ¼
18:00 64.02 6.41
18:05 64.60 6.47
...
21:00 49.50 4.96
21:05 48.87 4.89
...
23:30 77.10 7.72
23:35 77.70 7.78
23:40 71.32 7.14
23:45 62.52 6.26
23:50 69.13 6.92
23:55 73.69 7.38
a Numbers in bold represent the maximum and minimum of identifiable minimum burst over 24 h.

Three simulated bursts are listed in Table 3. Bursts were identified in 10 min, and six of seven outliers were identified across the three experiments. Three simulated bursts were detected by using multiple thresholds (Fig. 4). The simulated flow developed more slowly than a real burst, so the initial burst points were smaller than the later ones. It was difficult to detect the first point at 2:20, and it was even harder to recognize the difference by using the original data. As shown in the enlarged burst diagram, OFC played an important role in restoring normal flow when bursts occurred, even for the relatively small flow at 2:45.

A total of 96 synthetic bursts were generated. Each burst had a separate data set. There were four sets of bars per day, which corresponded to the outlier starting times 0:00, 6:00, 12:00, and 18:00 (Fig. 5). Each set of bars consisted of three columns, which corresponded to the number of synthetic outliers (5, 10, 20). The model was run 96 times for each set of synthetic bursts. The TPR decreased as the number of outliers increased. The average TPR, DA, and FPR of the 96 tests were 94.82%, 99.77%, and

3.48%). It was found that 95% of the predicted relative errors were within 7.13% for the ANN model and 5.74% for the LSTM model. The relative errors of LSTM were smaller than for the ANN model. This result shows that the bias of relative errors for the LSTM model is lower than for the ANN model, which implies that the LSTM model is more stable.

The deep learning-based method proposed in this study can achieve accurate and reliable flow prediction. The ANN model connects all layers in a fully connected manner. A fully connected ANN with multiple layers requires a large number of parameters, which contributes to exploding and vanishing gradients, and therefore causes the update of parameters to fluctuate. For the vanishing gradient problem, one of the solutions is LSTM, which guarantees the passing of gradients through layers via the concatenation of long-term and short-term memory.

The main reason for this accuracy is that the LSTM model has a novel structure of six layers that perform complex nonlinear transformations of flow data. The LSTM unit is the key to the prediction model and it has a memory function that retains information on past water flow and establishes interdependencies of flows for different time periods. By contrast, the ANN model has only three layers of network to perform simple transformations, resulting in less-reliable predictions. The LSTM model was superior to the ANN model for water flow predictions (Fig. 6). The proposed model provides an accurate basis for the whole framework.

Performance Evaluation of Multithreshold-Based Classification Stage

To test the accuracy of multithreshold detection, four detection methods were compared: (1) single threshold; (2) the proposed multiple thresholds method; (3) window-size thresholds (Bakker et al. 2014b); and (4) the clustering method (Wu et al. 2016). For a single threshold, local representative thresholds were selected

Table 3. Detection results

Burst accident  Burst start    Burst detection  Recognition  Percentage of       Number of  Number of
number          and end time   duration         time (min)   average inflow (%)  outliers   detection-outliers
1               2:17–2:27      2:20–2:25        8            About 6             2          1
2               2:30–2:40      2:30–2:40        5            About 8             3          3
3               2:46–2:57      2:50–2:55        9            About 12            2          2


Fig. 4. Results of flow prediction and burst detection. The four graphs above represent the residual distribution at different times (0:00, 6:00, 12:00,
and 18:00), which is the basis of the multithreshold stage. The flow curve in the bottom section of the figure contains three bursts; the shaded column
indicates the detected bursts, and the three bursts are enlarged in the inset.

Fig. 5. Results of 96 incidences of synthetic burst detection: (a) true positive rate of 96 incidences of synthetic burst detection; and (b) false positive
rate of 96 incidences of synthetic burst detection.
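
The rates plotted in Fig. 5 can be reproduced from per-point labels with a short calculation. The sketch below is illustrative only: it assumes the usual confusion-matrix definitions (TPR = TP/(TP + FN), FPR = FP/(FP + TN), DA = (TP + TN)/N), and the function name and the example labels are hypothetical rather than taken from the study's data.

```python
def detection_metrics(truth, flagged):
    """Compute TPR, FPR and detection accuracy (DA) from boolean sequences.

    truth[i]   is True when point i really belongs to a burst.
    flagged[i] is True when the detector marked point i as an outlier.
    """
    pairs = list(zip(truth, flagged))
    tp = sum(1 for t, f in pairs if t and f)
    fn = sum(1 for t, f in pairs if t and not f)
    fp = sum(1 for t, f in pairs if not t and f)
    tn = sum(1 for t, f in pairs if not t and not f)
    return {
        "TPR": tp / (tp + fn),
        "FPR": fp / (fp + tn),
        "DA": (tp + tn) / len(pairs),
    }

# Hypothetical example: 7 true outliers, 6 of them detected, 1 false alarm.
truth = [True] * 7 + [False] * 93
flagged = [True] * 6 + [False] + [True] + [False] * 92
metrics = detection_metrics(truth, flagged)
print({name: round(value * 100, 2) for name, value in metrics.items()})
```

With six of seven outliers detected, the TPR evaluates to 6/7, which is the same ratio as the 85.71% reported for the simulated bursts.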

Table 4. Performance of models on testing data

Model  MAPE (%)  MSE        MAE     RE, 95% (%)
ANN    3.48      24,825.11  128.95  7.13
LSTM   2.52      9,325.44   75.30   5.74
Note: MSE = mean square error; MAE = mean absolute error; and RE, 95% = relative error (95%).

for comparison. These thresholds ranged from a maximum of 13.51% to a minimum of 2.79% and were divided into 10 parts. Each node was compared with a single threshold to evaluate the classification effect. For the proposed multiple thresholds, commonly used statistical measurements were selected. These were the 90% confidence threshold (Z = 1.645), 95% confidence threshold (Z = 1.96), 99% confidence threshold (Z = 2.576), 2-sigma threshold (Z = 2), 3-sigma threshold (Z = 3), 3.5-sigma threshold (Z = 3.5), 4-sigma threshold (Z = 4), 5-sigma threshold (Z = 5), and 6-sigma threshold (Z = 6). For window-size thresholds (Bakker et al. 2014b), the predicted values and the observed values were transformed into moving average values over timeframes of 5, 10, 15, 30, 60, 120, and 240 min (B1–B7, respectively). Based on Bakker's study, the 5% exceedance probability values were used to set the monitoring threshold values.

The parameters of the aforementioned methods were limited to certain ranges, so a complete ROC curve could not be drawn from the available parameter values. In this case, the TPR and FPR of different ranges were calculated, and performance curves were obtained by connecting these TPR–FPR points. These performance curves were part of the ROC curve [Fig. 7(a)]. The two endpoints of the line marked with solid triangles correspond to the maximum value (S10: threshold = 13.51% of the average daily flow rate) and the minimum value (S0: threshold = 2.79% of the average daily flow rate) of the single threshold, whereas the two endpoints of the line marked with solid circles correspond to the maximum Z (M9: Z = 6) and the minimum Z (M1: 90% confidence threshold, Z = 1.645) of the multiple thresholds. If the threshold value is too small (S0), the detection accuracy is improved, but false positives also increase. As a result, the method wastes time and resources, and loses credibility. If the threshold value is too large (S10), the false alarm rate is lower, but accuracy also is decreased, meaning that events can be missed. The point that represents the optimal balance between TPR and FPR is closest to the upper left-hand corner of Fig. 7(a) (Aggarwal and Ranganathan 2018).

Compared with the single-threshold method, the multiple-threshold method was more effective. It had a higher classification performance (e.g., M5: Z = 3, TPR = 85.71%, and FPR = 0.83%) than the single-threshold method (e.g., S1: TPR = 85.71% and FPR = 11.15%). The main reason is that the threshold with time attributes was refined, which reduced the impact of the peaks and troughs of residuals. Generally, it is hard for prediction models to predict inflection points. When the flow fluctuates strongly during periods of high water demand, more inflection points appear and the residual values increase. Hence, the outlier classification is impacted by both prediction accuracy and changes in time attributes of residuals.

Compared with the two previous studies, the curve of the multiple thresholds is clearly higher than the best detection point (TPR = 71.43% and FPR = 0.61%) of Wu et al. (2016), which indicates that our proposed method is more accurate. However, the multithreshold method performed worse than the window-size method. The multiple thresholds method [Fig. 6(a)] raises an alarm whenever it detects a single outlier. A burst is an accumulated water loss process. A single data anomaly is not sufficient evidence that a burst has really occurred, and it could lead to a false alarm caused by data noise. To reduce false alarms and make the burst detection system more reliable, many data-driven models raise an alarm when they detect two or more outliers (Loureiro et al. 2016; Wu and Liu 2017; Wu et al. 2018a). The window-size method requires the transformation of the predicted and observed values by the moving average. Therefore, the feedback correction step and the alarm threshold are essential.

The single-threshold classification relies on the accuracy of prediction, whereas multiple thresholds consider the time attribute of residuals on the basis of prediction. Based on the same LSTM prediction, the TPR of the single-threshold method and of the multithreshold stage both reached 85.71%. There are three stages to reduce false alarms [Fig. 7(b)]. The first column (FPR = 11.15%) represents FPR based on LSTM prediction, the second column (FPR = 0.83%) represents FPR based on LSTM and multiple thresholds, and the third column (FPR = 0.14%) represents the LSTM-based, multiple-threshold-based, and alarm-threshold-based FPR, which is named the three-stage burst detection method. Compared with the window-size method, the proposed three-stage method had a lower FPR (0.14% compared with 0.35%).
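
The classification logic discussed in this section can be illustrated with a minimal sketch. It assumes that residuals from normal historical data are grouped by time of day, that each time slot gets its own threshold equal to the slot mean plus Z times the slot standard deviation, and that an alarm is raised only after Q consecutive outliers. The function names, the data layout, the one-sided test on positive residuals, and the example values are all assumptions made for illustration; they are not the authors' code.

```python
from statistics import mean, stdev

def fit_time_thresholds(history, z=3.0):
    """history maps a time slot (e.g. "06:00") to residuals seen at that slot
    on normal days. Returns a dict of slot -> upper residual threshold."""
    return {slot: mean(res) + z * stdev(res) for slot, res in history.items()}

def classify(residuals, thresholds):
    """residuals is a time-ordered list of (slot, residual) pairs.
    Returns True where the residual exceeds the threshold of its own slot."""
    return [r > thresholds[slot] for slot, r in residuals]

def alarm_indices(flags, q=2):
    """Return the indices at which q consecutive outliers have just occurred."""
    alarms, run = [], 0
    for i, flag in enumerate(flags):
        run = run + 1 if flag else 0
        if run == q:
            alarms.append(i)
    return alarms

# Hypothetical residual history: small spread at night, large spread at the peak.
history = {
    "00:00": [-2.0, -1.0, 0.0, 1.0, 2.0],
    "06:00": [-20.0, -10.0, 0.0, 10.0, 20.0],
}
thresholds = fit_time_thresholds(history, z=3.0)

# The same residual size is an outlier at night but not at the morning peak.
stream = [("00:00", 6.0), ("00:00", 7.0), ("06:00", 6.0), ("06:00", 7.0)]
flags = classify(stream, thresholds)
alarms = alarm_indices(flags, q=2)
print(flags, alarms)
```

The example shows why a time-dependent threshold helps: a single global threshold loose enough for the 06:00 slot would miss the night-time outliers, and one tight enough for 00:00 would flag ordinary peak-hour residuals.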

Fig. 6. Comparison of flow prediction between LSTM and traditional ANN. Larger differences are magnified in the inset.


Fig. 7. Performance comparison of different detection methods: (a) enumeration of different classification methods [single-threshold method, multithreshold method, window-size thresholds method (Bakker et al. 2014b), and the best detection result of Wu et al. (2016)] versus true positive rate and false positive rate, with stars representing the optimal performance of these methods; and (b) optimal false positive rate of different detection methods (single-threshold, multithreshold, multithreshold and Q = 2, and window-size thresholds method) at TPR = 85.71%.

Performance Evaluation of OFC-Based Continuous Detection Stage

Outlier feedback correction can effectively reduce the influence of outliers in the prediction model and aid with burst detection continuity. To evaluate the continuous recognition performance of this model, 12 synthetic bursts were selected for validation and discussion. For each set of synthetic burst sequences, data processing was performed by an OFC method and a non-OFC method. The TPR was used to characterize the performance of the model recognition.

Table 5 summarizes the TPR and FPR of the 12 synthetic bursts. All 12 bursts were detected by the OFC method and the TPR value reached 100% in all cases. The detection method performed poorly when the OFC was not present. Without OFC, the method was relatively effective for detecting short abnormal segments (5 outliers), but could not accurately detect long abnormal segments (for example, 10 or 20 outliers). The non-OFC method can trigger an alarm quickly, so it is suitable for instantaneous detection (one outlier) rather than continuous detection (more than one outlier). As the duration of a burst increases, the effect of feedback correction becomes more obvious, which indicates that the OFC stage can continuously identify outliers caused by bursts. Compared with single-outlier judgment, OFC aided by multi-outlier identification can significantly reduce the misjudgment caused by instrument fault and transmission error. This part reduces the false alarm rate of bursts by continuous detection. Moreover, it can identify the burst duration and help water supply companies make intelligent decisions.

Table 5. True positive rate and false positive rate performances of OFC and non-OFC based detection during 12 synthetic burst cases

Number of            OFC                Non-OFC
outliers   Time      TPR (%)  FPR (%)   TPR (%)  FPR (%)
5          0:00      100.00   0.00      20.00    2.47
5          6:00      100.00   0.35      20.00    2.83
5          12:00     100.00   0.00      20.00    2.45
5          18:00     100.00   0.00      60.00    2.83
10         0:00      100.00   0.00      10.00    2.88
10         6:00      100.00   0.72      10.00    2.88
10         12:00     100.00   0.00      10.00    2.88
10         18:00     100.00   0.00      30.00    2.88
20         0:00      100.00   0.00      5.00     2.99
20         6:00      100.00   0.00      5.00     2.61
20         12:00     100.00   0.00      5.00     2.99
20         18:00     100.00   0.00      15.00    3.36
Average    -         100.00   0.09      17.50    2.84

Simultaneously, for the same burst size, the flow proportion varies over 24 h, and thus the difficulty of burst detection also changes. For the non-OFC method, the average TPR at 0:00 (water trough) was the highest and the average TPR at 6:00 and 12:00 (water peaks) was the lowest (Table 5), which fits the preceding conclusions. The TPR value of the OFC stage was 100% regardless of peaks or troughs in water demand, which proves that the OFC stage has a better detection performance.

OFC-based prediction can restore the normal flow (Fig. 8). The correction begins at the first outlier and returns to normal prediction at the end of the burst. The correction effect at 0:00 was the best, because the residual from the observed flow was less than 20, which was better than the effect at 6:00 (residual > 100). This also was caused by the smoothness of flow data. The flow at 0:00 was small and stable, so OFC-based prediction could accurately restore the sequence. By contrast, the flow at 6:00 was large and rising rapidly. The OFC-based prediction did not entirely restore the occurrence time of the crest [Figs. 8(e and f)]. Previous studies suggested replacing anomalies with data from the previous 2–4 days (Bakker et al. 2014a). However, due to the long interval (more than 24 h) between alternative data and abnormal data, the normal peak time cannot be accurately restored. For burst detection during peaks in demand, which are very hard to predict, attention was paid in this study to trend prediction and real-time feedback at the end of bursts. The three graphs of the most unfavorable situation [Figs. 8(d–f)] show that this method can predict the rising trend of water flow and stop the feedback and correction at the end of the burst. It mischaracterizes at most 1–2 points [Figs. 8(d–e)].

When a burst occurs in a segment in which the original flow is first stable and then drops sharply (for example, at 12:00), OFC-based


Fig. 8. OFC-based prediction of the synthetic burst sequence with synthetic burst cases: (a) 0:00, 5 outliers; (b) 0:00, 10 outliers; (c) 0:00, 20
outliers; (d) 6:00, 5 outliers; (e) 6:00, 10 outliers; (f) 6:00, 20 outliers; (g) 12:00, 5 outliers; (h) 12:00, 10 outliers; (i) 12:00, 20 outliers;
(j) 18:00, 5 outliers; (k) 18:00, 10 outliers; and (l) 18:00, 20 outliers.
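
The feedback loop behind the restored sequences in Fig. 8 can be sketched as follows. This is a simplified illustration, not the authors' implementation: a moving average stands in for the LSTM predictor, the threshold is fixed rather than time-dependent, and the corrected value is assumed to be the forecast plus the historical mean residual. All names and numbers are hypothetical.

```python
def detect(observed, predict, threshold, mean_residual=0.0, window=3, correct=True):
    """Flag outliers one step at a time, optionally with outlier feedback correction.

    observed: flow readings in time order; the first `window` readings are
              assumed normal and are used only to warm up the predictor.
    predict:  callable that maps the last `window` inputs to a forecast
              (a stand-in for the LSTM model).
    correct:  if True, a detected outlier is replaced by the corrected value
              before it is fed back as input for the next prediction.
    Returns (flags, inputs), where inputs is the sequence actually fed back.
    """
    inputs = list(observed[:window])
    flags = [False] * window
    for value in observed[window:]:
        forecast = predict(inputs[-window:])
        is_outlier = (value - forecast) > threshold
        flags.append(is_outlier)
        if is_outlier and correct:
            inputs.append(forecast + mean_residual)  # feed back corrected value
        else:
            inputs.append(value)                     # feed back raw observation
    return flags, inputs

def moving_average(values):
    return sum(values) / len(values)

# Hypothetical flow with a three-point burst of +20 on a baseline of 100.
flow = [100.0, 100.0, 100.0, 120.0, 120.0, 120.0, 100.0]
flags_ofc, fed_back = detect(flow, moving_average, threshold=10.0, correct=True)
flags_raw, _ = detect(flow, moving_average, threshold=10.0, correct=False)
print(flags_ofc)
print(flags_raw)
```

In this toy case the corrected run flags all three burst points, while the uncorrected run loses the third one because the burst values contaminate the forecast. That is the same qualitative behavior as the OFC and non-OFC columns of Table 5, although the numbers here are not comparable with the study's results.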

prediction can restore the flow rate better in the case of five outliers [Fig. 8(g)]. In the case of 10 outliers [Fig. 8(h)] and 20 outliers [Fig. 8(i)], although the inflection point of the flow rate cannot be restored, the downward trend of the flow can be restored. When a burst occurs at the moment when the original flow increases or decreases suddenly, it is difficult to restore such data using OFC-based prediction. For the data at 6:00 and 18:00, the OFC-based prediction was not as accurate as at a stationary time (such as 0:00), but it still restored the trend of normal flow.

After the first outlier is detected, it is replaced by the corrected value, which is fed back to the input layer of the LSTM for the next-step prediction. The corrected value is calculated using the predicted value and the statistical mean of the residual, which essentially is an average statistical value, so the inflection point value is more extreme than the corrected value. Therefore, when flow data change significantly (for example, when a sudden rise or fall occurs), the mild corrected values that are fed back produce a certain deviation in the restored flow. The results showed that OFC-based prediction can recover normal flow. The correction begins at the first outlier and returns to normal prediction at the end of the burst.

Applicability of the Method

By opening drain valves in networks and adding random flow to original monitoring data, the authors obtained experimental burst data and tested the proposed method in a real-life DMA. Although the method was proved to be effective in preceding experiments, it still has some limitations in real-life applications.

Only a few months' worth of data were available in the data set, so this work focused on daily and weekly variation of the DMA flow. Different prediction accuracies at different time points result in time-varying thresholds in the classification stage. Only when the burst size is greater than the threshold at a specific time point can a burst be identified. Consequently, the range of the identifiable minimum burst in the DMA varied from 2.79% to 13.51% of the average inflow at different time points. The proposed method detects bursts by identifying consecutive outliers, and the interval between two consecutive outliers is 5 min. Therefore, two consecutive events that occur within 5 min were considered as the same event in this study. Only experimental bursts were used in this study, and there is a difference between a real burst event and an experiment. The duration of real bursts is longer than that of the simulated experiments. However, because this method can detect leakage within 10 min, it can be expected to detect a real burst event of longer duration.

Future work should verify the applicability by using more real burst data in different DMAs, consider seasonal variation and different sizes of bursts, and improve the detection of smaller bursts or leakage.

Conclusions

This study proposes a sensitive data-driven algorithm for burst detection and presents its application in a real-life district metering area (DMA). The conclusions are as follows:
1. The analysis showed that the improvement in prediction and classification had a significant impact on the detection performance.

The method detected all simulated bursts with a true positive rate of 85.71%. The multithreshold stage reduced the false positive rate from 11.15% to 0.83%, and detection accuracy reached 99.8%. In addition, the average DA, TPR and FPR of the 96 synthetic tests were 99.77%, 94.82%, and 0.21%, respectively. Comparisons of long short-term memory and artificial neural networks, of multithreshold and single-threshold methods, and of previous studies and this study showed that the proposed method has good detection performance.
2. The outlier feedback correction stage with LSTM restored the normal flow trend when bursts occurred and realized continuous burst detection with 100% TPR for the 12 synthetic bursts. In the simulated bursts (with an open fire hydrant), OFC recovered normal flow and reduced the FPR to 0.14%. This indicates that the proposed method avoids prediction residual caused by the input of outliers and ensures the ability of the model to continuously identify outliers.
3. The current study achieved timely (within 10 min) and continuous burst detection with desirable effectiveness. It also provides a deep learning framework for the massive data stored by water utilities and an efficient idea for DMA-level water loss management. The results showed that this method has good prospects for applications in DMAs in water distribution systems.

Data Availability Statement

The following data and the model used in this study can be made available by the corresponding author on request: data of simulated experiments, data of synthetic experiments, and codes for the proposed method in Spyder 3.2.4.

Acknowledgments

This work was jointly supported by the National Natural Science Foundation of China (Grant No. 5187090620) and the China Postdoctoral Science Foundation (Grant No. 2018M631495).

References

Aggarwal, R., and P. Ranganathan. 2018. “Understanding diagnostic tests–Part 3: Receiver operating characteristic curves.” Perspect. Clin. Res. 9 (3): 145–148. https://doi.org/10.4103/picr.PICR_87_18.
AWWA (American Water Works Association). 2009. Water audits and loss control programs. Denver: AWWA.
Bakker, M., E. A. Trietsch, J. H. G. Vreeburg, and L. C. Rietveld. 2014a. “Analysis of historic bursts and burst detection in water supply areas of different size.” Water Sci. Technol. Water Supply 14 (6): 1035–1044. https://doi.org/10.2166/ws.2014.063.
Bakker, M., J. H. G. Vreeburg, M. Van De Roer, and L. C. Rietveld. 2014b. “Heuristic burst detection method using flow and pressure measurements.” J. Hydroinf. 16 (5): 1194–1209. https://doi.org/10.2166/hydro.2014.120.
Chen, J., and D. L. Boccelli. 2018. “Forecasting hourly water demands with seasonal autoregressive models for real-time application.” Water Resour. Res. 54 (2): 879–894. https://doi.org/10.1002/2017WR022007.
Farley, M., and S. Trow. 2003. Losses in water distribution networks: A practitioner’s guide to assessment, monitoring and control. London: International Water Association Publishing.
Fox, S., W. Shepherd, R. Collins, and J. Boxall. 2016. “Experimental quantification of contaminant ingress into a buried leaking pipe during transient events.” J. Hydraul. Eng. 142 (1): 04015036. https://doi.org/10.1061/(ASCE)HY.1943-7900.0001040.
Guo, G., S. Liu, Y. Wu, J. Li, R. Zhou, and X. Zhu. 2018. “Short-term water demand forecast based on deep learning method.” J. Water Resour. Plann. Manage. 144 (12): 04018076. https://doi.org/10.1061/(ASCE)WR.1943-5452.0000992.
Hutton, C., and Z. Kapelan. 2015b. “Real-time burst detection in water distribution systems using a Bayesian demand forecasting methodology.” Procedia Eng. 119: 13–18. https://doi.org/10.1016/j.proeng.2015.08.847.
Hutton, C. J., and Z. Kapelan. 2015a. “A probabilistic methodology for quantifying, diagnosing and reducing model structural and predictive errors in short term water demand forecasting.” Environ. Modell. Software 66 (Apr): 87–97. https://doi.org/10.1016/j.envsoft.2014.12.021.
Jung, D., D. Kang, J. Liu, and K. Lansey. 2015. “Improving the rapidity of responses to pipe burst in water distribution systems: A comparison of statistical process control methods.” J. Hydroinf. 17 (2): 307–328. https://doi.org/10.2166/hydro.2014.101.
LeCun, Y., Y. Bengio, and G. Hinton. 2015. “Deep learning.” Nature 521 (7553): 436–444. https://doi.org/10.1038/nature14539.
Li, H., F. D. Chen, K. W. Cheng, Z. Z. Zhao, and D. Z. Yang. 2015. “Prediction of zeta potential of decomposed peat via machine learning: Comparative study of support vector machine and artificial neural networks.” Int. J. Electrochem. Sci. 10 (8): 6044–6056.
Li, H., D. Yan, Z. Zhang, and E. Lichtfouse. 2019. “Prediction of CO2 absorption by physical solvents using a chemoinformatics-based machine learning model.” Environ. Chem. Lett. 17 (3): 1397–1404. https://doi.org/10.1007/s10311-019-00874-0.
Li, H., Z. Zhang, and Z. J. Liu. 2017. “Application of artificial neural networks for catalysis: A review.” Catalysts 7 (10): 306. https://doi.org/10.3390/catal7100306.
Li, H., and Z. E. Zhang. 2018. “Mining the intrinsic trends of CO2 solubility in blended solutions.” J. CO2 Util. 26 (Jul): 496–502. https://doi.org/10.1016/j.jcou.2018.06.008.
Loureiro, D., C. Amado, A. Martins, D. Vitorino, A. Mamade, and S. T. Coelho. 2016. “Water distribution systems flow monitoring and anomalous event detection: A practical approach.” Urban Water J. 13 (3): 242–252. https://doi.org/10.1080/1573062X.2014.988733.
Ma, Z. G., and W. Y. Liu. 2016. “Outlier correction method of telemetry data based on wavelet transformation and Wright criterion.” Multimedia Tools Appl. 75 (22): 14477–14489. https://doi.org/10.1007/s11042-015-3241-x.
Melgarejo-Moreno, J., M.-I. López-Ortiz, and P. Fernández-Aracil. 2019. “Water distribution management in South-East Spain: A guaranteed system in a context of scarce resources.” Sci. Total Environ. 648 (Jan): 1384–1393. https://doi.org/10.1016/j.scitotenv.2018.08.263.
Mi, X. W., H. Liu, and Y. F. Li. 2017. “Wind speed forecasting method using wavelet, extreme learning machine and outlier correction algorithm.” Energy Convers. Manage. 151 (Nov): 709–722. https://doi.org/10.1016/j.enconman.2017.09.034.
MOHURD (Ministry of Housing and Urban-Rural Development). 2017. Urban water distribution network district metering management guidelines. Beijing: MOHURD.
Mounce, S. R., A. J. Day, A. S. Wood, A. Khan, P. D. Widdop, and J. Machell. 2002. “A neural network approach to burst detection.” Water Sci. Technol. 45 (4–5): 237–246. https://doi.org/10.2166/wst.2002.0595.
Mounce, S. R., A. Khan, A. S. Wood, A. J. Day, P. D. Widdop, and J. Machell. 2003. “Sensor-fusion of hydraulic data for burst detection and location in a treated water distribution system.” Information Fusion 4 (3): 217–229. https://doi.org/10.1016/S1566-2535(03)00034-4.
Palau, C. V., F. J. Arregui, and M. Carlos. 2012. “Burst detection in water networks using principal component analysis.” J. Water Resour. Plann. Manage. 138 (1): 47–54. https://doi.org/10.1061/(ASCE)WR.1943-5452.0000147.
Qi, Z., F. Zheng, D. Guo, T. Zhang, Y. Shao, T. Yu, K. Zhang, and H. R. Maier. 2018. “A Comprehensive framework to evaluate hydraulic and water quality impacts of pipe breaks on water distribution systems.” Water Resour. Res. 54 (10): 8174–8195. https://doi.org/10.1029/2018WR022736.
Romano, M., Z. Kapelan, and D. A. Savic. 2014a. “Automated detection of pipe bursts and other events in water distribution systems.” J. Water Resour. Plann. Manage. 140 (4): 457–467. https://doi.org/10.1061/(ASCE)WR.1943-5452.0000339.

Romano, M., Z. Kapelan, and D. A. Savic. 2014b. “Evolutionary algorithm and expectation maximization strategies for improved detection of pipe bursts and other events in water distribution systems.” J. Water Resour. Plann. Manage. 140 (5): 572–584. https://doi.org/10.1061/(ASCE)WR.1943-5452.0000347.
Sainath, T. N., O. Vinyals, A. Senior, and H. Sak. 2015. “Convolutional, long short-term memory, fully connected deep neural networks.” In Proc., 2015 IEEE Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), 4580–4584. New York: IEEE.
Schmidhuber, J. 2015. “Deep learning in neural networks: An overview.” Neural Networks 61 (Jan): 85–117. https://doi.org/10.1016/j.neunet.2014.09.003.
Vaghefi, M., K. Mahmoodi, and M. Akbari. 2018. “A comparison among data mining algorithms for outlier detection using flow pattern experiments.” Scientia Iranica 25 (2): 590–605. https://doi.org/10.24200/SCI.2017.4182.
Wu, Y. P., and S. M. Liu. 2017. “A review of data-driven approaches for burst detection in water distribution systems.” Urban Water J. 14 (9): 972–983. https://doi.org/10.1080/1573062X.2017.1279191.
Wu, Y. P., S. M. Liu, K. Smith, and X. T. Wang. 2018a. “Using correlation between data from multiple monitoring sensors to detect bursts in water distribution systems.” J. Water Resour. Plann. Manage. 144 (2): 04017084. https://doi.org/10.1061/(ASCE)WR.1943-5452.0000870.
Wu, Y. P., S. M. Liu, and X. T. Wang. 2018b. “Distance-based burst detection using multiple pressure sensors in district metering areas.” J. Water Resour. Plann. Manage. 144 (11): 06018009. https://doi.org/10.1061/(ASCE)WR.1943-5452.0001001.
Wu, Y. P., S. M. Liu, X. Wu, Y. F. Liu, and Y. S. Guan. 2016. “Burst detection in district metering areas using a data driven clustering algorithm.” Water Res. 100 (Sep): 28–37. https://doi.org/10.1016/j.watres.2016.05.016.
Wu, Z. Y., M. El-Maghraby, and S. Pathak. 2015. “Applications of deep learning for smart water networks.” Procedia Eng. 119 (1): 479–485. https://doi.org/10.1016/j.proeng.2015.08.870.
Yan, H., Q. Wang, J. Wang, K. Xin, T. Tao, and S. Li. 2019. “A simple but robust convergence trajectory controlled method for pressure driven analysis in water distribution system.” Sci. Total Environ. 659 (Apr): 983–994. https://doi.org/10.1016/j.scitotenv.2018.12.374.
Ye, G., and R. A. Fenner. 2014a. “Study of burst alarming and data sampling frequency in water distribution networks.” J. Water Resour. Plann. Manage. 140 (6): 06014001. https://doi.org/10.1061/(ASCE)WR.1943-5452.0000394.
Ye, G. L., and R. A. Fenner. 2011. “Kalman filtering of hydraulic measurements for burst detection in water distribution systems.” J. Pipeline Syst. Eng. Pract. 2 (1): 14–22. https://doi.org/10.1061/(ASCE)PS.1949-1204.0000070.
Ye, G. L., and R. A. Fenner. 2014b. “Weighted least squares with expectation-maximization algorithm for burst detection in U.K. water distribution systems.” J. Water Resour. Plann. Manage. 140 (4): 417–424. https://doi.org/10.1061/(ASCE)WR.1943-5452.0000344.
Zhang, K., H. Yan, H. Zeng, K. Xin, and T. Tao. 2019. “A practical multiobjective optimization sectorization method for water distribution network.” Sci. Total Environ. 656 (Mar): 1401–1412. https://doi.org/10.1016/j.scitotenv.2018.11.273.
