You are on page 1of 5

Available online at www.sciencedirect.

com

ScienceDirect
Procedia Engineering 70 (2014) 1386 1390

12th International Conference on Computing and Control for the Water Industry, CCWI2013

Estimating demands with a Markov chain Monte Carlo approach


T. Qina, D.L. Boccellia*
a
Environmental Engineering Program, University of Cincinnati, Cincinnati, OH, 45221, US.

Abstract

The use of drinking water distribution system models has been around for decades, and demand estimation has
been a key component in the model development process. Existing techniques utilized by consultants/utilities have
been capable of representing the general hydraulics for assessing typical objectives such as maintaining adequate
pressure. However, these models are generally not appropriate, or simply do not work, when attempting to
recreate the observed hydraulics and operational conditions associated with a specific time horizon. In general,
these limitations in recreating demands have become even more important when attempting to assess the ability of
models to represent complex water quality dynamics. In this paper, a Bayesian demand estimation algorithm is
presented, and estimated water demand multipliers are compared with synthetically generated observed demand
multipliers in a water distribution system network. Using two different data scenarios an ideal case assuming
all flow and pressure data are available for the estimation process and a more realistic limited case using only
flow rates at the sources and tank levels the Markov chain Monte Carlo (MCMC) method demonstrated an
increase in demand estimate uncertainty for the limited data scenario. Using the limited data set only resulted
in a moderate increase in the underlying flow rate uncertainty. These results demonstrate the potential feasibility
of the MCMC algorithm for recreating adequate demand estimates that will provide an improved approach for
evaluating water quality models.


2013
2013The
TheAuthors.
Authors.Published
PublishedbybyElsevier Ltd.
Elsevier Ltd. Open access under CC BY-NC-ND license.
Selection
Selectionand
andpeer-review
peer-reviewunder
underresponsibility of of
responsibility thethe
CCWI2013
CCWI2013Committee.
Committee

Keywords: Distribution; demand; estimation; Markov chain; Monte Carlo

* Corresponding author. Tel.:+1-513-375-6901; fax:1-513-556-4162.


E-mail address: dominic.boccelli@uc.edu

1877-7058 2013 The Authors. Published by Elsevier Ltd. Open access under CC BY-NC-ND license.
Selection and peer-review under responsibility of the CCWI2013 Committee
doi:10.1016/j.proeng.2014.02.153
T. Qin and D.L. Boccelli / Procedia Engineering 70 (2014) 1386 1390 1387

1. Introduction

The use of drinking water distribution system models has been around for decades, and demand estimation has
been a key component in the model development process. Existing techniques utilized by consultants/utilities have
been capable of representing the general hydraulics for assessing typical objectives such as maintaining adequate
pressure. However, these models are generally not appropriate, or simply do not work, when attempting to
recreate the observed hydraulics and operational conditions associated with a specific time horizon. These
limitations in recreating demands have become even more important when attempting to assess the ability of
models to represent complex water quality dynamics (Alexander and Boccelli, 2010).
Our current objectives are to develop a demand estimation algorithm that provides the flexibility to estimate
(potentially) spatially correlated demands, and provide a trade-off between representing the hydraulic dynamics
and demand accuracy. The present research is focused on developing a Markov chain Monte Carlo (MCMC)
estimation algorithm, which showed promise in a previous study (Boccelli and Uber, 2005) and has been
successful in recreating the permeability field within a ground water transport problem (Lee et al, 2002). The
MCMC approach is a Bayesian parameter estimation technique that will derive the posterior distribution of
demand estimates based on maximizing the likelihood of the observed data, such as flow rate, pressure, and/or tank
levels. While all parameters are assigned a prior distribution to provide subjective information on the parameter
distribution, a Markov Random Field (MRF) prior will be specifically included to account for the potential spatial
correlation that might exist within the actual demands.

2. Methods

The current approach is developing a state estimation algorithm based on Markov chain Monte Carlo (MCMC)
a Bayesian parameter estimation technique that is linked with EPANET a widely used, free distribution
system network hydraulic and water quality solver (Rossman, 2000). The benefits for utilizing the MCMC
approach are that the algorithm estimates both the expected parameter values and the uncertainty distribution
without specifying a parametric statistical distribution. The parameters to be estimated, , are treated as random
variables that can be assigned some prior belief on their distribution,   , which represents our initial
understanding of the expected value and uncertainty of . After collecting data, x, the values of  can be updated
to reflect the information contained in the observed data x given by the posterior distribution    the
distribution of  given the observed data x using Bayes' Rule is        , where    represents
the likelihood of observing the data x for the parameter set . The MCMC estimation algorithm explores the
parameter space by randomly perturbing the current parameter set,  , to generate a new parameter set,  ,
where  represents the vector of parameters to be estimated. The updated parameter set is accepted/rejected using
the Metropolis-Hastings (M-H) algorithm using a fitness ratio that compares the product of the likelihood
function, prior distribution, and candidate generating function value for parameter sets  and  .
The likelihood function represents the error between the observed and model predicted data and is assumed to
be normally distributed    where 0 is the expected error and  is the user defined error variance. In general,
the x data represents the type of observed data (e.g., flow rate, pressure, water quality), and  represents the
parameters of interest (e.g., used to represent demand pattern multipliers). The prior distributions are used to
incorporate prior beliefs about the parameter values of . For individual demand pattern multipliers, the priors of
the parameters  will be assumed to be a uniform distribution to limit the multiplier values between zero and a pre-
determined positive number. To account for possible spatial relationships among the individual demand
multipliers, a Markov Random Field prior is utilized to account for possible spatial correlations within the demand
field. To explore the posterior space of our parameter distributions, new values of each parameter need to be
generated when moving from     , which is relatively straightforward using updating techniques that
include random sampling from uniform distributions centered on the current estimate of the individual parameters
(Spall, 2003).
1388 T. Qin and D.L. Boccelli / Procedia Engineering 70 (2014) 1386 1390

2.1. Simulation Study: Small-Network Model

Figure 1 shows the test network EPANET Example Network 3. This system consists of 92 nodes, 3 tanks, 2
reservoirs, and 2 pumps that operate periodically. The three tanks float on the system, meaning that the flows into
and out of that tanks are dependent on the source flow rates (inflows) and total system demand (outflows). In
order to evaluate the ability of the MCMC algorithm to recreate demand estimates associated with a spatially
correlated field, a Gaussian Process model was used to generate a spatially correlated set of demands.
When estimating the parameters, two different data scenarios were considered: 1) an ideal case where all
flows and pressures were observed; and 2) a limited case where only the flows from the two reservoirs and three
tank levels were observed. Finally, rather than attempt to estimate the demands at all 92 nodes, the network was
divided into 3  3 grids, of which one grid contained no demand nodes in order to reduce the parameterization of
the network model.

Grid8
Grid7 Nnodes: 16 Grid9
Nnodes: 2 Nnodes: 2

Grid4 Grid6
Nnodes: 3 Nnodes: 4
Grid5
Nnodes:
43

Grid1 Grid2 Grid3


Nnodes: 0 Nnodes: 17 Nnodes: 4

Figure 1. The test network and related divisions for the MCMC-MRF algorithm

3. Results

Figure 2 shows the mean estimated and base demand weighted multipliers for the two data scenarios: all and
limited. The mean estimated multipliers are the results from the MCMC algorithm, while the base demand
weighted multipliers represent the average multiplier value from the original network model weighted by the base
demand. The black symbols indicate that the mean estimated multipliers were within the min/max range of the
original multipliers within the grid, while the white symbols represent that the estimated multiplier lie outside the
min/max range of the original multipliers within the grid. From Figure 2, we can observe that when moving from
T. Qin and D.L. Boccelli / Procedia Engineering 70 (2014) 1386 1390 1389

all to limited data scenarios, the estimated demand uncertainty increased given the less amount of information
available for the estimation process. There was also an increase in the number of estimated multipliers that were
outside the min/max range of the original multipliers. However, these outliers were found in Grids 3, 6, and 9,
which had few nodes within the grid and were at the edges of the system. For the limited data case, these

Figure 2 Scatterplot of mean estimate and base demand weighted multipliers for the all and limited data case.

locations have little impact on the estimated flows and pressures. However, while the demand estimates are
important, assessing the impact of the potential demand estimates on the underlying hydraulics is important or
understanding the potential impacts on hydraulic transport and evaluation of water quality models. Thus, Figure 3
presents scatter plots of the estimated and actual flow rates for the all and limited data case. While there was
the expected increase in the error estimates associated with the flow rate as the estimation moves from the all to
the limited data case, the decrease in accuracy was relatively small. These results suggest that even utilizing a
sparse amount of data can still yield reasonable representation of the underlying hydraulics.

Figure 3 Scatterplot of estimated and actual flow rate for the all and limited data cases.

4. Summary

The ability to estimate consumptive demands and adequately represent the underlying hydraulics is important
for evaluating the performance of water quality models. This research presented a synthetic test case of a Markov
chain Monte Carlo demand estimation algorithm for re-creating a spatial demand field assuming ideal conditions
1390 T. Qin and D.L. Boccelli / Procedia Engineering 70 (2014) 1386 1390

of observing all possible flow and pressure measurements and a limited case that more realistically reflected the
amount of data collected by utilities. Overall, the MCMC algorithm appeared capable of recreating the spatially
distributed demand field. In particular, even under the limited data scenario, the underlying hydraulics were
reasonably represented with only moderate increases in the flow rate uncertainty. Future research efforts will
explore the impacts of demand parameterizations as well as an application to a real-world test case.

References

M. Alexander and D. L. Boccelli, A Field-Scale Evaluation of a Multi-Species Water Quality and Hydraulic Model, in Proc. Water Quality
Technology Conference, Savannah, GA: AWWA, 2010.
D. L. Boccelli and J. G. Uber, Incorporating Spatial Correlation in a Markov Chain Monte Carlo Approach for Network Model Calibration,
in Proc.World Water and Environmental Resources Congress, Anchorage, AK: ASCE, 2005.
H. Lee, D. Higdon, Z. Bi, M. Ferreira, and M. West, Markov Random Field Models for High-Dimensional Parameters in Simulations of Fluid
Flow in Porous Media, Technometrics, vol. 44, no. 3, March. 2002, pp. 230-241.
Rossman, L. A. 2000. EPANET2 Users Manual. Cincinnati, OH: Risk Reduction Engineering Laboratory, U. S. Environmental Protection
Agency.
Spall, J. C. 2003. Estimation via Markov chain Monte Carlo. IEEE Control Systems Magazine 23(2), 34-35.

You might also like