Abstract: This paper describes investigations into the development of a new application of neural networks (NN) for the prediction of pipeline failure. Results show higher correlations with recorded data when compared with the two existing statistical models. The shifted time power model gives results in total number of failures and the shifted time exponential model gives results in number of failures per year. The database was large but neither complete nor fully accurate. Factors influencing pipeline deterioration were missing from the database. Using the NN technique on this database produced models of pipeline failure, in terms of failures/km/year, that more closely matched the number of failures of a particular asset recorded for the period.
Downloaded from ascelibrary.org by New York University on 04/17/15. Copyright ASCE. For personal use only; all rights reserved.
DOI: 10.1061/(ASCE)1076-0342(2007)13:1(26)
CE Database subject headings: Water pipelines; Neural networks; Statistics; Predictions.
Introduction

City West Water Ltd. (CWW) is an Australian retail water company, located in Victoria, Australia, that provides high quality drinking water and sewerage services to a population of over 600,000 in Melbourne's central business district and the Western suburbs. CWW experiences one of the highest structural failure rates on water supply pipeline assets in Australia, so a main element of their strategy is to improve the capability to predict failure. Traditional statistical models used at CWW have been based on past failure histories and use regression analysis to predict future failure rates. The models are based on age only and are considered unsatisfactory, due to relatively low correlations between predicted and actual failure rates (Righetti 2001).

The possibility of using neural networks (NNs) was explored with the aim of improving failure predictions. NNs are often described as a network of interconnected processing units inspired by the neurons in the brain and may also be viewed as a generalization of any traditional statistical method such as nonlinear regression or classification (Ampazis et al. 1999). They can use large databases and are tolerant of missing values and "noise."

CWW Database

The information gathered in the CWW database for each pipe breakage was: asset ID number, location (using a geographic information system), material type, diameter, length, installation date, failure dates, replacement dates, failure cost parameters, and failure type. The current study uses 6 years of data extracted from the CWW database dealing with individual reticulation cast iron pipes with diameters <300 mm that experienced any failures from 1997 to 2002, inclusive. The basic data set of pipelines that registered failures between 1997 and 1999 is used here for initial analysis. As more data became available this was extended with another 3 years of registered failures between 2000 and 2002, inclusive.

Significant data auditing was undertaken within CWW to assess the accuracy of the records and eliminate unreliable records. The description of failure type was inconsistent, sometimes missing, and difficult to quantify, and hence was discarded.

Previous Modeling

Statistical models for predicting water pipe failure use historical data of past failures to identify pipe breakage patterns. Details about research carried out on structural deterioration of water mains using the traditional statistical methods can be found in the comprehensive review of Kleiner and Rajani (2001). The predictive models currently used by CWW are the shifted time power model (STPM) and the shifted time exponential model (STEM) for individual assets (Mavin 1996; Constantine et al. 1998), as described by Righetti (2001).

The models are based on age and failure histories and use a shifted time parameter or scale parameter and a rate variable parameter common to a set of pipelines. The STPM uses a rate variable parameter from modeling undertaken by Constantine et al. (1998), using a set of data from the Melbourne suburbs of Ringwood and Sunshine. The STPM model is represented by a power function or an exponential function increasing over time or pipe age.

The STPM equations are (Constantine et al. 1998)

H(t) = l(x)    (1)

where H(t) = expected number of total failures at pipe age x; l = pipe length;  = 2.063 = variable rate parameter; = shifted time parameter (or the scaling parameter); and x = asset age (years).

The rate of failure per year at age x is given by the derivative of this equation

1 Post Doctoral Fellow, School of Engineering and Science, Swinburne Univ. of Technology, John St., Hawthorn, Victoria, Australia 3122.
2 Senior Lecturer, School of Mathematics, Swinburne Univ. of Technology, John St., Hawthorn, Victoria, Australia 3122.
3 AM, Deputy Head, School of Engineering and Science, Swinburne Univ. of Technology, John St., Hawthorn, Victoria, Australia 3122 (corresponding author). E-mail: kmcmanus@swin.edu.au
Note. Discussion open until August 1, 2007. Separate discussions must be submitted for individual papers. To extend the closing date by one month, a written request must be filed with the ASCE Managing Editor. The manuscript for this paper was submitted for review and possible publication on December 14, 2004; approved on March 13, 2006. This paper is part of the Journal of Infrastructure Systems, Vol. 13, No. 1, March 1, 2007. ©ASCE, ISSN 1076-0342/2007/1-26–30/$25.00.
time bands. The STEM requires division of the data into four smaller groups of assets, by years of construction, and uses four different fixed parameters for each group of assets. In this case the predictive capability of the STEMs can significantly decrease.

The STEM is (Constantine et al. 1998)

Ht = lex    (3)

where Ht = expected number of total failures at pipe age x; l = pipe length;  = variable rate parameter; = scaling parameter (or the shifted time parameter); and x = asset age (years).

The rate of failure per year is given by the derivative of this equation

dHt/dx = lex    (4)

The exponential model did not fit the data sets as well as the power model, according to Constantine et al. (1998). Therefore  (the variable rate) could not be calculated for a common class of assets and CWW needs to calculate both and  variables in order to be able to use this model. This can be done by calculating and fitting the model for a class of assets to the cumulative sum of historical failures of all pipes using the technique of minimizing the sum of squares of the differences.

Constantine et al. (1998) recommend that the rate variable parameters in the STEM be calculated. Hence, STEM is not a true predictive model. It is not clear what covariates (e.g., soil type, bedding type, location, pipe diameter, pipe type, overhead traffic, ground cover, climatic data, presence of groundwater, and internal pressure) are included in the STEM model, let alone their significance.

Regression analysis, as described by Righetti (2001), was applied to analyze the strength of the CWW models. The rate variable parameter (for STPM) or parameters (for STEM) were supplied by Righetti. The shifted time parameter was fitted for both models, using the first 3 years of data (1997–1999), and then predictions were made for the next 3 years (2000–2002) and compared with observed failures.

The STPM model gives results in total number of failures and the STEM gives results in number of failures per year. However, for modeling purposes, a better unit is number of failures/km/year, since each asset has a different length. Taking this unit as a basis for comparison between the models, it was found that STPM gives a moderately high coefficient of determination (r²), equal to 0.437, and STEM gives a very low r², equal to 0.097, between the predicted and the observed value as illustrated in Fig. 1.

The main criticism of the statistical models, as reported by Righetti (2001), is that "whilst failures generally increase with pipe age, there is wide variation in performance of individual assets," where "links appear to exist with soil type and weather fluctuations." The statistical models used by CWW assume no end point in time, and the effect of repairs is ignored. Further development of these models by incorporating the effects of soil, weather, and repairs may improve their performance.

Description of Neural Networks Model

The mathematics of NNs are reviewed by Anderson (1995), Bishop (1996), Ripley (1996), Gurney (1997), and Beale and Jackson (1990). NNs are adaptive and can represent any complex nonlinear relationships between the input and the output variables. They are particularly good for modeling complex problems and can deal with relative ease with the combined effects of a large number of input variables. The multilayer perceptron (MLP) is the most commonly used neural network model. The MLP has a number of nodes (neurons) or units organized in input and output layers as well as a number of hidden layers.

A trainable nonzero bias term is used in each node to account for external influences. The NNs are flexible and "learn" through an iterative process of adjusting their weights and biases (Ampazis et al. 1999). The most common learning is supervised learning, which provides a response value for every set of input values and requires a known (input) target value that the response is trying to "guess." The difference between the response and the actual target gives the error value. The network weights are adjusted iteratively in accordance with the error value, in order to minimize the error.

Application of Neural Network Analysis

The neural network topology is represented by a number of nodes organized in input, hidden, and output layers. The NN architecture of an MLP with an input layer with six nodes, two hidden layers with eight nodes each, and an output layer with one node, is represented in Fig. 2.

The algorithm used for error minimization is the conjugate gradient method. The error function to be minimized through training is the mean square error. The algorithm finds the nearest local minimum of the mean square error for any given set of initial connection values. The network was initialized a number of times with random weights to find a good optimum. The sigmoid logistic activation function for neurons was found through trial and error.

In order to deal with nonlinearity and hence possible overfitting, a number of techniques have been used such as cross validation, bootstrapping, and random sampling in order to estimate the generalization error. Cross validation is applied in Neural con-
Software
Table 1. Coefficients of Determination (r²) for Predicted versus Observed Values from NN Experiments and Statistical Model

Experiment  Model  Hidden layers  Nodes  Variables  Pipe type  Data set   r²
1.          NN     2              9      6          CICL^a     Test97-99  0.679
            —      2              9      6          CICL       Run00-02   0.5423
2.          NN     2              9      5          CI^b       Test97-99  0.4825
            —      2              9      5          CI         Run00-02   0.3626
3.          STPM   —              —      —          CICL       Run00-02   0.4525
            STPM   —              —      —          CI         Run00-02   0.3728
4.          STEM   —              —      —          CICL       Run00-02   0.0946
            STEM   —              —      —          CI         Run00-02   0.1234

a CICL (cast iron cement lined): spun gray cast iron pipe with factory cement lining.
b CI (gray cast iron): horizontally cast, unlined pipe (possibly cement lined in situ).
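
The comparison unit (failures/km/year) and the coefficient of determination reported in Table 1 can be illustrated with a minimal Python sketch. It assumes that r² is the squared Pearson correlation between predicted and observed values, which is the usual reading for a simple linear regression but is not spelled out in the text. The function names `failures_per_km_per_year` and `r_squared` and all numbers are hypothetical illustrations, not CWW data or the software used in the study.

```python
def failures_per_km_per_year(n_failures, length_m, years):
    """Normalise a failure count by pipe length (in km) and observation period (in years)."""
    return n_failures / (length_m / 1000.0) / years


def r_squared(predicted, observed):
    """Squared Pearson correlation between predicted and observed values.

    Assumption: this is the definition of r² used for the comparison.
    """
    n = len(predicted)
    if n != len(observed) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    s_po = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    s_pp = sum((p - mean_p) ** 2 for p in predicted)
    s_oo = sum((o - mean_o) ** 2 for o in observed)
    if s_pp == 0 or s_oo == 0:
        raise ValueError("r² is undefined when one sequence is constant")
    return (s_po * s_po) / (s_pp * s_oo)


# Hypothetical assets: (failures recorded over 3 years, pipe length in metres).
assets = [(3, 500.0), (1, 1000.0), (6, 400.0), (2, 800.0)]
observed = [failures_per_km_per_year(n, length, 3) for n, length in assets]

# Hypothetical model output in the same unit (failures/km/year).
predicted = [1.8, 0.5, 4.2, 1.0]

print("r² =", round(r_squared(predicted, observed), 4))
```

Normalising by length before computing r² keeps long and short assets comparable, which is the reason given in the text for preferring failures/km/year over raw failure counts.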
Ampazis, N., Perantonis, S. J., and Taylor, J. G. (1999). "Dynamics of multi-layer networks in the vicinity of temporary minima." Neural Networks, 12, 43–58.
Anderson, A. (1995). An introduction to neural networks, MIT Press, Cambridge, Mass.
Beale, R., and Jackson, T. (1990). Neural computing.
Bishop, C. M. (1996). Neural networks for pattern recognition, Clarendon, Oxford, U.K.
Clark, R. M., Stafford, C. L., and Goodrich, J. A. (1982). "Water distribution systems: A spatial and cost evaluation." J. Water Resour. Plng.
Water Mains Research Rep. No. 114, Urban Water Research Association of Australia, Melbourne, Australia.
Neural connection, version 2.1, user's manual. (1998). Recognition Systems Ltd.
Righetti, B. (2001). "Cast iron condition assessment study." City West Water, Internal Rep., City West Water Pty Ltd, Melbourne, Australia.
Ripley, B. D. (1996). Pattern recognition and neural networks, Cambridge University Press, Cambridge, U.K.
Sain, S. (2005). "MATH 4820/5320: Introduction to mathematical statistics: Simple linear regression II." <http://math.cudenver.edu/~ssain/stat/lec18.pdf> (October 19, 2005).