Multi-objective optimization models for the renewal planning of multiple asset classes

Thomas Ying-Jeh Chen1,*; Eric Wang2; Nicole Pasch3; Amin Ganjidoost4

1 Senior Data Scientist, Xylem Inc, 8920 MD-108, Columbia, Maryland, 21045, USA.

2 Product Manager, Xylem Inc, 870 Market St., San Francisco, California, 94102, USA.

3 Client Solutions Manager, Xylem Inc, 11850 Sears Street, Ste A, Livonia, Michigan, 48150, USA.

4 Drinking Water Decision Science – Manager, Xylem Inc, 5055 Satellite Dr #7, Mississauga, Ontario, L4W 5K7, Canada.

* Corresponding author: Thomas Ying-Jeh Chen, Xylem Inc, 8920 MD-108, Columbia, Maryland, 21045, USA. Email: thomas.chen@xylem.com

Abstract

Managing the aging infrastructure of water distribution systems presents a challenge for many utilities. With various asset types competing for limited dollars, designing an effective asset management program is a resource allocation problem. Mobilization of equipment and crew is a significant cost (typically 2%–10%) within any capital improvement program. Therefore, selecting projects that target multiple asset classes together can reduce mobilization and help utilities stretch their budgets further. This research presents a process for modeling the joint renewal planning of multiple asset classes. The problem is framed as a dual-objective optimization, where the selection of project areas aims to maximize lead service line removal and water meter changeout together. A case study from a Midwest utility is presented, and empirical data suggest that the dual-objective approach effectively reduces duplicate interventions in the same regions. Equity considerations are also examined, where constraints are added to enforce system-wide project selection. Results show that the sensitivity of the objective to the equity constraints depends on the underlying spatial distribution of the target asset itself, where an uneven spread of the target asset leads to a greater negative impact on model performance.

KEYWORDS

asset management

drinking water systems

optimization

risk analysis

Article Impact Statement

A two-step method is presented for the replacement planning of water system assets located at individual households. The method reduces mobilization cost by addressing multiple asset types together.

1. INTRODUCTION

With various operational and regulatory objectives competing for the same dollars within a municipal budget, designing an asset management program is a resource allocation problem. For many utilities, having multiple asset types (mains, valves, meters, and service lines) that each need timely inspection and repair places a heavy burden on limited budgets and on the affordability of water service in their communities. In addition, public health regulations imposed by state and local agencies may require additional capital works, further stretching the staff and funding of the municipality. As a result, a cost-efficient and effective asset management program is critical for guiding utility operators to maximize their return on time and investment by first targeting the vulnerable assets most in need.

Risk assessments are useful in designing these programs because they provide a systematic process for quantifying vulnerability and ranking individual assets to guide the prioritization (American Water Works Association, 2010). For example, previous works have implemented analyses where every distribution main is ranked based on the future likelihood and consequence of failure, and asset management programs are designed to address the highest-risk assets first (Chen, Riley, et al., 2020; Chen, Washington, et al., 2020). See other examples in Ganjidoost et al. (2022), Vladeanu and Matthews (2019), Fontanazza et al. (2015), and Puleo et al. (2014), focusing on distribution mains, valves, and water meters.

While this is a useful starting point, a key limitation of these approaches is that they consider only a single asset class in the analysis. The assumption is that risk can be optimally reduced if capital is allocated to the highest-ranked assets. As discussed earlier, a water distribution system contains many different infrastructure classes, and an asset management program that accounts for the different categories is more effective. For example, identifying regions with high-risk water mains alone is good, but identifying regions with high-risk mains, service lines, and valves together can help the utility achieve greater economies of scale. The largest saving realized in this approach is the reduction in truck roll, or mobilization cost. Truck roll is broadly defined as the dispatch of crew and equipment into the field to perform any capital improvement work. It accounts for a significant portion of the cost of any replacement project, with estimates of $18,000 per pipe replacement project requiring road excavation (Chen et al., 2021) and approximately $500 for smaller projects at the household level (e.g., meter replacements). Depending on the crew and equipment needed, mobilization costs take up between 2% and 10% of a project's overall budget, a significant margin for utilities with stretched budgets. Designing programs that align the replacements of multiple infrastructure classes will reduce the cost of truck roll and help utilities do more with their limited budgets.

The objective of this research is to present a methodology for selecting household-level projects that optimize the joint replacement of two infrastructure classes: degraded water meters and lead service lines. The problem is framed as a dual-objective integer program: (1) maximize the number of lead service lines replaced, and (2) maximize the number of degraded meters replaced. Individual homes are aggregated into larger areas and scored based on the number of lead pipes and degraded meters contained in each. A single project is delineated based on these aggregated spaces, and the optimization aims to find a selection of project areas that jointly maximizes the replacement of these assets. An exact formulation of this model (one that guarantees the identification of all optimal solutions) is presented, and a solution approach is demonstrated to find the Pareto-efficient frontier (Berardi et al., 2009; Dridi et al., 2009). Each point on this frontier represents an optimal solution based on a different weighting of the two objectives. Utilities can choose between these solutions based on overall organizational and regulatory priorities.

The two main contributions of this research are as follows:

• The paper presents a methodology for optimizing the joint replacement planning of two infrastructure classes that are located at the premises of individual households. Single structures can be dispersed geographically and have the potential to quickly drive up mobilization costs. Based on our review of the literature, no previous work has demonstrated the application of optimization modeling and geospatial methods for the joint renewal planning of assets at the individual household resolution. While this research considers water meters and lead service lines as the case study, the methods presented can be generalized to plan the asset management of any infrastructure found inside the home.

• An exact mathematical model, one that is guaranteed to find all optimal solutions, is presented and framed as a dual-objective optimization problem. The output for this class of problems is a set of Pareto-efficient solutions representing different tradeoffs between the two objectives. We find no previous work that presents an exact model for joint (multi-asset) project selection in water systems. Providing multiple solutions to decision-makers is also an improvement because it can guide more effective planning based on different organizational and regulatory priorities.

Together, these two contributions advance the state of the art in modeling to assist in better asset management planning. It is important to note that the case study combining these two activities is meant for conceptual purposes only. The goal here is to provide an example of two infrastructure projects that both take place at individual structures, and the case study demonstrates the application of dual-objective modeling to select the best project portfolio. In practice, considerations beyond mobilization cost must be included when selecting which infrastructure projects should be aligned. Infrastructure projects requiring similar crew size, construction activities (need for specialized equipment), and level of disruption (excavation and restoration) are much better suited for alignment. Meter and service line replacements differ in their required level of effort, but they are selected here because data are available at the single-household resolution, which best supports the methods explored in this research.

2. LITERATURE REVIEW

Many studies are available on the risk assessment of water systems focusing on a single asset class. Water distribution mains have been a particular focus for many researchers (see reviews by Kleiner and Rajani (2001), Konstantinou and Stoianov (2020), and St. Clair and Sinha (2012)), where statistical models were developed to estimate risk for individual pipeline segments so that future projects could be planned by prioritizing the highest-risk assets first. The underlying models rely on historic records of asset failure to train and validate the predictions of which assets will fail in the future. For sewer mains, the literature reviews by Tscheikner-Gratl et al. (2019) and Ana and Bauwens (2007) summarize the state of the art in risk modeling. Sewer main risk assessments primarily (1) quantify the structural condition of the main, (2) estimate the degree of overflow during wet weather events, and (3) characterize the fiscal and environmental impacts of overflows and leaks. Similar works have also addressed water meter changeout programs (see Fontanazza et al. (2015), Yazdandoost and Izadi (2018), Yee (1999), and Mohanakrishnan et al. (2019)), presenting quantitative models that characterize the reliability of water meters. The meters most likely to fail or to record hydraulics inaccurately can be addressed first in a replacement program to maximize overall reliability.

Taking a broader view of infrastructure renewal planning, previous works have also discussed the benefits of bundling multiple infrastructure projects in the same location. The major benefit of selecting projects in this manner is to cut down on mobilization costs and realize greater economies of scale (Kerwin & Adey, 2020a). In the context of water systems, Kleiner et al. (2010) and Nafi and Kleiner (2010) present a methodology for selecting pipe replacement locations that are incentivized to align with future road-work locations. An objective function for total cost is presented, which includes mobilization, raw materials, crew, and expected repair of future breaks, and the optimization aims to minimize total cost. Mobilization costs are greatly reduced for streets with future road works already planned, such that selections that align with adjacent works are incentivized. Similar research by Kerwin and Adey (2020b), Muñuzuri et al. (2020), and Tscheikner-Gratl et al. (2016) demonstrates approaches for bundling water and sewer main replacement selections, where a risk index that encompasses the status of both asset classes is used to guide decision making. A case study by Carey and Lueke (2013) presents a framework for selecting infrastructure projects based on the combined criticality of the underlying roads, water mains, and sewer mains. A major limitation of these approaches is the need to assign weightings between the asset classes and the difficulty of normalizing failure outcomes (e.g., the failure cost of a road collapse is much larger than that of a pipe burst). Multi-criteria decision-making methods such as AHP and ELECTRE have been demonstrated to reflect these weightings from an operator's perspective (Tscheikner-Gratl et al., 2017), but they are often subject to human bias and difficult to scale. Despite the challenges posed by the application of these planning models, they can still be used in conjunction with the dual-objective model developed in this research to identify areas with greater project alignment. The model developed in this research focuses on infrastructure at the same spatial resolution (i.e., homes), but it can be combined with other planning models to consider assets at different spatial scales (e.g., streets, neighborhoods).

Optimization modeling is a useful tool for ingesting asset-level or location-level risk indices and formulating a program that maximizes risk reduction. The output from these models often specifies which assets to select, when to address them, and even the type of action to take (replace or repair) (L. Chen et al., 2019). Some examples that formulate and solve exact integer models include the following: inspection routing (Chen, Riley, et al., 2020; Chen, Washington, et al., 2020), replacement project selection (T.Y.J. Chen et al., 2021), sewer rehabilitation (de Monsabert et al., 1999), and sensor placement (Berry et al., 2005). Because these models are linear integer programs, exact methods such as branch and bound can identify the globally optimal solution. To add further complexity, overall pressure and flow conditions need to be considered when taking parts of a distribution system offline during projects. This can introduce non-linear constraints that affect computational tractability, as seen in examples by Pecci et al. (2015) and Naoum-Sawaya et al. (2015), where non-linear relaxation techniques are demonstrated to converge on local optima. Multi-objective formulations for water main replacement planning are presented in Dandy and Engelhardt (2006), Kim et al. (2004), and Osman et al. (2017); the model formulations in these works aim to maximize risk reduction along with other considerations such as cost, hydraulic reliability, and traffic impacts.

The work presented in this paper aims to extend the state of the art by formulating a multi-objective optimization model for the replacement of multiple water infrastructure assets at the location of individual households. From our review of the literature, we find no work that jointly addresses capital replacement planning at this spatial resolution and the economies of scale gained by bundling multiple replacements at the same location. Previous work can also be integrated into the formulation of the dual-asset replacement model. Because the proposed method relies on inputs to determine the reward for selecting a given asset, it is particularly suitable for using the existing state of the art as input. In this paper, we use the age of the meters and a predictive model for lead service line locations to generate inputs to the project selection model. In practice, however, more sophisticated methods that exist in the literature can be used to provide better-quality estimates of infrastructure conditions.

3. METHODOLOGY

This section outlines the two-step process for (1) aggregating adjacent households into larger areas more suitable for capital projects, and (2) identifying the collection of areas that maximizes the joint renewal of two different water infrastructure assets. This research considers the replacement planning of lead service lines and degraded water meters, since both are located at the premises of individual homes. The method can be generalized to consider any two asset classes, but other practical planning factors would need to be considered (e.g., project cost and time). Figure 1 summarizes the two-step process.

FIGURE 1. Project identification and optimization workflow.

3.1. Problem specification

The project area selection problem is similar to the knapsack problem (Martello & Toth, 1990), a well-studied problem in the field of combinatorial optimization. The goal is to select the combination of project areas that will be targeted for replacement (lead service lines and water meters) in the upcoming capital renewal program. The objective is to identify the selection of project areas that maximizes the number of lead service line and degraded meter replacements. A project area is simply an aggregation of addresses that are near each other, typically in a contiguous manner. The budget and resource limitations of the municipality are reflected by placing a ceiling on the total number of homes that can be included in the replacement program. This assumes that the cost of addressing each home is uniform; more complex cost models are possible but are left for future work.

3.2. Spatial aggregation of households

Each residential structure served by a utility can be delineated by its land parcel, and adjacent parcels can be aggregated into larger geographic areas for project planning purposes. This is done by loading the shapefiles of individual land parcels into the ArcGIS spatial software, using a spatial snap function to join adjacent land parcels together if needed, then merging all adjacent land parcels into a larger area. See Figure 2 for an example output of this spatial analysis. This approach is selected because it is simple to implement and effectively groups individual homes into larger areas for project selection. The spatial process is also consistent, in that most aggregated street blocks contain a similar number of homes (approximately 30 parcels per street block). In practice, the methodology to bundle individual homes can be generalized to any approach the utility sees as most appropriate, for example, aggregating homes along both sides of the same street. It is possible that other aggregation methods may be more realistic, but the method described here meets the needs of this research.

FIGURE 2. Spatial aggregation of land parcels to street block neighborhoods.

For further simplicity, only residential structures are considered in this research, since they comprise most of the buildings served by a municipality and are the primary target for lead service line removal and meter changeouts. Each land parcel is counted as a single residential structure and assumed to contain one meter and one service line (Hajiseyedjavadi et al., 2022). Each aggregated street block has an associated (1) count of the total number of homes, (2) count of homes with lead service lines, and (3) count of homes with degraded meters.
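As an illustration of the merging step, the following is a minimal Python sketch that groups parcels into street blocks by taking the connected components of an adjacency relation. It is not the ArcGIS workflow used in this research: the function name `aggregate_parcels`, the parcel identifiers, and the adjacency pairs are hypothetical, and in practice the adjacency pairs would be derived from the spatial snap and merge operations described above.

```python
from collections import defaultdict


def aggregate_parcels(parcel_ids, adjacent_pairs):
    """Group parcels into blocks: parcels linked by adjacency share a block.

    parcel_ids: iterable of hashable parcel identifiers (hypothetical).
    adjacent_pairs: iterable of (parcel_a, parcel_b) pairs that touch.
    Returns a sorted list of blocks, each a sorted list of parcel ids.
    """
    parcel_ids = list(parcel_ids)
    parent = {p: p for p in parcel_ids}

    def find(p):
        # Follow parent links to the block representative (with path halving).
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    # Union step: merge the blocks of every adjacent pair.
    for a, b in adjacent_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    blocks = defaultdict(list)
    for p in parcel_ids:
        blocks[find(p)].append(p)
    return sorted(sorted(members) for members in blocks.values())


# Hypothetical example: five parcels forming two street blocks.
example_blocks = aggregate_parcels(
    ["a", "b", "c", "d", "e"],
    [("a", "b"), ("b", "c"), ("d", "e")],
)
```

The three counts attached to each street block can then be obtained by summing over the parcels in each returned group.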

3.3. Neighborhood selection optimization

The spatial aggregation groups individual addresses located near each other into larger areas that better delineate potential project areas. The next step is to select the collection of project areas that will be included in the asset management program. Due to limited capital and labor resources, utilities need to prioritize neighborhoods to maximize return on investment and best meet regulatory and operational objectives. Budget limitations at the utility are reflected through a limit on the total number of households that can be included. The reward for selecting a project area is defined as (1) the expected sum of lead service lines contained in the boundary (verified plus unverified) and (2) the number of degraded meters contained in the boundary based on age. The cost of selecting a project area, on the other hand, is defined as the total number of structures contained in the boundary. The exact integer program for jointly optimizing the replacement of lead service lines and degraded meters is defined in the following subsections.

3.3.1. Decision variables

We first define the following decision variables and parameters for the optimization model.

• Let $i \in I$ be the index of candidate project areas.

• Let $X_i = 1$ if the candidate project area $i$ is selected to be included in the capital replacement plan, 0 otherwise.

• Let $L_i$ = the expected number of homes with lead or galvanized service lines within the candidate project area $i$. This is taken as the sum of homes where the pipe material is verified as lead (known based on historic inspection) and the individual likelihoods of containing lead if unverified (derived from a statistical model).

• Let $M_i$ = the expected number of degraded meters contained within the project area $i$. This is taken as the sum of the probability of meter failure across all homes contained within the project area; the probabilities are derived from a statistical model.

• Let $H_i$ = the total number of residential structures within the project area $i$.

• Let $T_U$ and $T_L$ be the upper and lower limits for the total residential structures that can be addressed within the planning cycle. These reflect the utility's allocated budget for the capital program.

For budget planning purposes, a municipality may need to submit a 3–10-year replacement plan to the city manager for approval. The limits for the total number of homes targeted in the capital renewal program ($T_U$ and $T_L$) should reflect the available equipment and labor at the municipality. For this research, we assume that a utility can target 1%–3% of the total homes served within the distribution area each year.
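To make the definitions above concrete, the following minimal Python sketch computes the reward and cost terms for one candidate project area and the annual household limits. The function names, dictionary keys, and example values are hypothetical, and rounding the 1% and 3% shares inward to whole homes is an assumption of this sketch rather than a rule stated in this research.

```python
import math


def block_inputs(homes):
    """Return (L_i, M_i, H_i) for one candidate project area.

    Each home is a dict with hypothetical keys:
      'lead_verified': True or False when inspected, None when unverified
      'lead_prob': modeled probability of lead (used only when unverified)
      'meter_fail_prob': modeled probability that the meter is degraded
    """
    lead = 0.0
    for home in homes:
        if home["lead_verified"] is None:
            lead += home["lead_prob"]      # unverified: add the likelihood
        elif home["lead_verified"]:
            lead += 1.0                    # verified lead: count the home
    meters = sum(home["meter_fail_prob"] for home in homes)
    return lead, meters, len(homes)


def annual_limits(total_homes, low_share=0.01, high_share=0.03):
    """Return (T_L, T_U) as whole homes, rounded inward (an assumption)."""
    return (math.ceil(low_share * total_homes),
            math.floor(high_share * total_homes))


# Hypothetical block of three homes.
example_block = [
    {"lead_verified": True, "lead_prob": 0.0, "meter_fail_prob": 0.5},
    {"lead_verified": False, "lead_prob": 0.0, "meter_fail_prob": 0.25},
    {"lead_verified": None, "lead_prob": 0.25, "meter_fail_prob": 0.25},
]
example_inputs = block_inputs(example_block)
```

In this example the block contributes one verified lead line plus a 0.25 likelihood from the unverified home, so the lead reward is fractional while the cost remains a whole count of homes.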

3.3.2. Problem formulation

The dual-objective optimization model for selecting project areas that jointly maximize the sum of both lead service lines and water meters is specified in model (1) as follows:

$$\max \sum_{i \in I} L_i X_i \tag{1a}$$

$$\max \sum_{i \in I} M_i X_i \tag{1b}$$

subject to:

$$T_L \le \sum_{i \in I} H_i X_i \le T_U \tag{1c}$$

$$X_i \in \{0, 1\}, \quad \forall i \in I \tag{1d}$$

The first objective function (1a) maximizes the total count of lead service lines replaced in the selected project areas. The second objective function (1b) maximizes the total count of degraded meters addressed. Constraint (1c) specifies that the total number of households in the selected areas is between the lower and upper limits. Constraint (1d) specifies the binary domain of the decision variable. Note that equations (1a)–(1d) form the basis of the project selection problem and closely resemble the knapsack problem (Martello & Toth, 1990). Solving this model can serve as a good starting point for planning purposes; however, there are many practical and political limitations not accounted for. We address two common examples here and demonstrate how these considerations can be included in the model: (1) requiring a minimum number of projects planned for each neighborhood or political boundary, and (2) the desire to select project areas that are as spatially compact as possible.

When allocating resources for lead service line replacements, there is often political pressure to ensure system-wide coverage during the capital program (Madrigal, 2019). However, it is well documented that at-risk populations for lead exposure are not evenly distributed across a city, often being concentrated in specific neighborhoods (Abernethy et al., 2016; Chojnacki et al., 2017). As a result, a more equitable program will allocate more resources towards neighborhoods with at-risk individuals, while still satisfying political pressures by ensuring projects are distributed across all neighborhoods. The following constraints enforce a minimum count of homes selected in every neighborhood. For generalizability, we use the terms "neighborhood," "ward," and "boundary" interchangeably; in practice, any geographic delineation of the distribution area can be used.

• Let $j \in J$ be the index of all neighborhood boundaries within the distribution area.

• Let $T_j$ = the minimum number of residential structures within the neighborhood boundary $j$ required to be included in the replacement plan.

• Let $D_j$ = the set of candidate project areas $i$ that are contained within the boundary $j$.

Note that the neighborhood-specific limits on houses selected need to correspond with the overall limits across the entire distribution area: $T_L \le \sum_{j \in J} T_j \le T_U$. Constraint (1e) can be included in model (1) to enforce a minimum selection threshold per neighborhood. It specifies that the total number of homes selected within each neighborhood is at least the minimum required amount.

$$\sum_{i \in D_j} H_i X_i \ge T_j, \quad \forall j \in J \tag{1e}$$
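The neighborhood minimum can be checked for any candidate selection with a few lines of Python. The sketch below is illustrative only: the function names, the neighborhood labels, and the numbers are hypothetical, and it verifies a given selection against constraint (1e) rather than solving the optimization.

```python
def meets_neighborhood_minimums(x, homes, areas_by_neighborhood, minimums):
    """Check constraint (1e) for a 0/1 selection vector x.

    x[i] and homes[i] play the roles of X_i and H_i; areas_by_neighborhood[j]
    plays the role of D_j; minimums[j] plays the role of T_j.
    """
    return all(
        sum(homes[i] * x[i] for i in areas) >= minimums[j]
        for j, areas in areas_by_neighborhood.items()
    )


def minimums_are_consistent(minimums, t_low, t_up):
    """Check that the sum of the T_j lies between T_L and T_U, as stated above."""
    return t_low <= sum(minimums.values()) <= t_up


# Hypothetical data: four project areas split across two neighborhoods.
homes = [30, 28, 31, 29]
areas_by_neighborhood = {"north": [0, 1], "south": [2, 3]}
minimums = {"north": 28, "south": 29}

spread_selection = [0, 1, 0, 1]      # one area chosen in each neighborhood
clustered_selection = [1, 1, 0, 0]   # both chosen areas are in "north"
```

A selection that concentrates all work in one neighborhood, such as `clustered_selection`, fails the check even if it satisfies the system-wide limits in (1c).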

Within each neighborhood, selecting projects that are as spatially compact as possible is desirable because it (1) further reduces mobilization, since less driving is required to reach all the individual homes, and (2) simplifies routing of the crew, since the selected projects are close together. To account for compactness, we specify a maximum distance that cannot be exceeded between any two project areas within a given neighborhood. To include considerations of compactness in model (1), we first define a few additional variables.

• Let $\varepsilon_j$ be the set of all possible pairs of project areas within the boundary $j$.

• Let $Y_{ik} = 1$ if both project areas $i$ and $k$ are selected for replacement, 0 otherwise.

• Let $D_{ik}$ = the distance between the project areas $i$ and $k$, defined as the Euclidean distance between the two centroids.

• Let $B_j$ = the maximum distance allowable between any selected pair of projects within the boundary $j$.

The following constraints, added to model (1), adjust the model to consider the degree of spatial spread of the selected project areas.

$$D_{ik} Y_{ik} \le B_j, \quad \forall (i, k) \in \varepsilon_j,\; j \in J \tag{1f}$$

$$Y_{ik} \le X_i, \quad \forall (i, k) \in \varepsilon_j,\; j \in J \tag{1g}$$

$$Y_{ik} \le X_k, \quad \forall (i, k) \in \varepsilon_j,\; j \in J \tag{1h}$$

$$Y_{ik} \ge X_i + X_k - 1, \quad \forall (i, k) \in \varepsilon_j,\; j \in J \tag{1i}$$

$$Y_{ik} \in \{0, 1\}, \quad \forall (i, k) \in \varepsilon_j,\; j \in J \tag{1j}$$

Equation (1f) enforces that the centroid distance between any selected pair of projects must not exceed the boundary-specific limit $B_j$. Equations (1g)–(1i) enforce the relationship between the indicator variables: $Y_{ik}$ can only be 1 if both $X_i$ and $X_k$ are 1, and it must be 1 when both are. Equation (1j) specifies the binary domain of the decision variable.
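The role of the three linking constraints can be verified by enumeration. The short Python sketch below lists, for each combination of the two selection variables, the values of the pair indicator that the linking constraints allow, and then applies the distance limit to a pair. The function names and the distances are hypothetical; the sketch checks the logic of the constraints and is not part of the solution method.

```python
from itertools import product


def feasible_pair_indicator(x_i, x_k):
    """Return every binary y allowed by y <= x_i, y <= x_k, y >= x_i + x_k - 1."""
    return [y for y in (0, 1) if y <= x_i and y <= x_k and y >= x_i + x_k - 1]


def pair_within_limit(distance, limit, x_i, x_k):
    """Check the distance constraint, with the pair indicator set to x_i * x_k."""
    y = x_i * x_k
    return distance * y <= limit


# Enumerate all four combinations of the two selection variables.
truth_table = {
    (x_i, x_k): feasible_pair_indicator(x_i, x_k)
    for x_i, x_k in product((0, 1), repeat=2)
}
```

In every case exactly one value of the indicator is feasible, and it equals the product of the two selection variables, so the distance limit binds only when both project areas in a pair are selected.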

3.3.3. Solution method

Model (1) is a dual-objective model, meaning the optimal solution is not a single unique selection of neighborhoods, but rather a set of solutions that are Pareto-efficient (or non-dominated). A set of Pareto-efficient solutions represents the optimal combinations of outcomes, where any improvement to objective (1a) comes at the expense of (1b), and vice versa. Pareto optimality enables all tradeoffs among optimal combinations of the two objectives to be considered (Muncie et al., 2013).

To solve model (1) and identify the Pareto-efficient frontier, this research uses the epsilon-constraint approach, since the closed-form specification is available. We refer the reader to Haimes et al. (1971) for full details on the epsilon-constraint method, as well as to Figure 3. To summarize, it involves first solving the model as a single-objective problem twice, considering (1a) alone and then (1b) alone. These two solutions initialize the Pareto-efficient set by defining its boundaries. The algorithm then iterates through the solution space between the two boundary points to identify all other Pareto-efficient solutions that may exist. This is done by converting one of the objective functions into a constraint, making the model a single-objective problem, and re-solving the optimization at different threshold values of the converted objective function. This process is repeated for every incremental value of ε, with each newly detected non-dominated solution being appended to the Pareto-efficient set.

FIGURE 3. Schematic of the epsilon-constraint algorithm for dual-objective maximization problems.

Since model (1) is binary (all decision variables are binary) and linear, each iteration of the epsilon-constraint method can be solved directly by using the branch and bound algorithm (Lawler & Wood, 1966), available in most commercial and open-source solvers. The spatial data of the case study was preprocessed using ESRI's ArcGIS software; all data processing, model formulation, and implementation of the epsilon-constraint algorithm were carried out with Python 3.7 and the package PuLP; the mathematical solver CPLEX 12.10.0 was used to identify the optimal solutions.
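The epsilon-constraint loop for the base model (1a)–(1d) can be sketched in a few lines of Python. The sketch below is not the implementation used in this research: to stay self-contained it replaces the PuLP and CPLEX solve with exhaustive enumeration, which is only practical for a handful of project areas, and it assumes integer-valued rewards so that a step of 1 in ε cannot skip a frontier point. The function names and the example data are hypothetical.

```python
from itertools import product


def solve_single(L, M, H, t_low, t_up, eps):
    """Maximize sum(L_i x_i) subject to (1c), (1d), and sum(M_i x_i) >= eps.

    Ties on the first objective are broken in favor of the second objective.
    Exhaustive enumeration stands in for a branch and bound solver here.
    Returns (lead_total, meter_total, x), or None if no selection is feasible.
    """
    best = None
    for x in product((0, 1), repeat=len(H)):
        total_homes = sum(h * xi for h, xi in zip(H, x))
        if not t_low <= total_homes <= t_up:
            continue                       # violates the household limits
        meter_total = sum(m * xi for m, xi in zip(M, x))
        if meter_total < eps:
            continue                       # violates the epsilon constraint
        lead_total = sum(l * xi for l, xi in zip(L, x))
        if best is None or (lead_total, meter_total) > best[:2]:
            best = (lead_total, meter_total, x)
    return best


def epsilon_constraint(L, M, H, t_low, t_up, step=1):
    """Collect the non-dominated solutions by raising eps after each solve."""
    frontier = []
    eps = 0
    while True:
        solution = solve_single(L, M, H, t_low, t_up, eps)
        if solution is None:
            break                          # no selection reaches this eps
        frontier.append(solution)
        eps = solution[1] + step           # demand strictly more meters next
    return frontier


# Hypothetical instance: four blocks of 30 homes, select one or two blocks.
L = [5, 1, 3, 0]   # lead service lines per block
M = [0, 4, 2, 5]   # degraded meters per block
H = [30, 30, 30, 30]
frontier = epsilon_constraint(L, M, H, t_low=30, t_up=60)
frontier_points = [(lead, meters) for lead, meters, _ in frontier]
```

Each pass returns the selection with the most lead service lines among those meeting the current meter threshold, so the list traces the frontier from the lead-focused end to the meter-focused end. With the fractional expected values used for the rewards in this research, the step size for ε would need to be chosen with more care than in this integer example.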

4. CASE STUDY

The two-step methodology to aggregate individual households and optimize the selection of projects is demonstrated on a real municipality. In this section, we describe the case study dataset and the methods used to generate the necessary inputs for the project optimization: (1) lead service line estimates, (2) failed meter estimates, and (3) location of larger neighborhoods.

4.1. Distribution system data

We partnered with the local utility in Dearborn (Michigan) to obtain spatial databases of the city's water distribution system and parcel tax assessment information. The water meter spatial layer and tax parcels are used to identify the set of active residential users. We first use tax assessment data from the year 2021 to filter out all buildings with a non-residential zoning classification (e.g., commercial, industrial, federal). Next, we spatially relate each meter to a land parcel based on its location, and then use the customer status in the meters shapefile to filter out locations where the meter is inactive. There are a total of 29,559 residential parcels served by the Dearborn water system, and we assume that each parcel contains one meter and one building. There are a total of 2,074 unique candidate project areas in the City of Dearborn after aggregating adjacent land parcels to street blocks. Figure 4 shows a map of all the residential land parcels served under the distribution system, along with a map of aggregated street blocks colored by the number of parcels contained within each boundary.

FIGURE 4. Dearborn residential land parcel and street block locations.

4.2. Lead service line probability

We consider the service line assets running from the distribution main to the household in this study. The portion of pipe between the water main and the stop box, curb stop, or shutoff valve is publicly owned by the City of Dearborn, and the rest of the pipeline running to the meter inside the home is owned by the homeowner. While two portions of pipe make up the service line connection, we treat the prevalence of lead as a binary response: whether or not any part of the pipeline contains lead. A “lead” response is encoded by the utility as “any portion lead,” whereas a “non-lead” response is encoded as “neither portion lead.” We consider the privately and publicly owned portions together because the incidence of lead on either part of the pipe necessitates a full replacement under the latest regulation (US Environmental Protection Agency, 2019). Therefore, the count of lead service lines across the entire system can be taken as the count of meter boxes connected to lead pipes.

The material information for the service line inventory is only partially complete for the City of Dearborn. Of the 29,559 active residential land parcels under consideration, only 11,692 (39.6%) contain material information that is verified based on historic inspections of replacement works. For this research, we use the data from the verified portion of service lines to train machine learning models to predict which unverified locations are most likely to contain lead. Past research (Chojnacki et al., 2017) has demonstrated that the XGBoost algorithm (T. Chen & Guestrin, 2016) is a strong predictor of lead service line locations, so we use it in this case study. We obtained parcel tax assessment records from the City of Dearborn to use for modelling, in combination with attribute information embedded within the service line shapefile. The tax assessment dataset identifies each parcel of land under the city’s jurisdiction and includes information on land value, building value, building age, and other relevant information on building construction. Table 1 summarizes the input data used for training the XGBoost algorithm.

TABLE 1. Machine learning variables for lead service line prediction.

Variable name         Description
Lead response         Binary response variable: “Positive” if the meter box is connected to a lead pipe, “Negative” otherwise.
Diameter              Size of the service line pipe connected to the meter box, reported in inches.
Install year          Install year of the service line.
Parcel age            Built year of the residential structure connected to the meter box.
Parcel floor area     Total floor area of the residential structure connected to the meter box, measured in square feet.
Parcel total area     Total square footage of the parcel in which the meter box is located.
Parcel land value     Assessed value of the land in the parcel, based on 2021 tax data.
Parcel total value    Total value of the parcel, including the land and the structure, based on 2021 tax data.

The trained XGBoost model is a classification model that predicts, for each unverified meter box, the likelihood of a lead service line being connected there. To estimate the total number of lead service lines that can be removed when selecting a given project area, we sum the number of verified locations with lead pipes and the predicted probabilities of lead for each unverified location. The distribution of estimated lead service lines per street block is shown in Figure 5.
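The aggregation described above can be illustrated with a minimal sketch. The parcel records, field names (`block`, `verified`, `is_lead`, `p_lead`), and probabilities below are hypothetical and are not taken from the Dearborn data; the sketch only assumes that each parcel carries either a verified material record or a model-predicted probability.

```python
from collections import defaultdict


def expected_lead_per_block(parcels):
    """Expected lead service line count per street block.

    Verified parcels contribute 1 (lead) or 0 (non-lead); unverified
    parcels contribute their predicted probability of lead.
    All field names are hypothetical, chosen for illustration only.
    """
    totals = defaultdict(float)
    for parcel in parcels:
        if parcel["verified"]:
            totals[parcel["block"]] += 1.0 if parcel["is_lead"] else 0.0
        else:
            totals[parcel["block"]] += parcel["p_lead"]
    return dict(totals)


# Illustrative records, not real data.
parcels = [
    {"block": "B1", "verified": True, "is_lead": True, "p_lead": None},
    {"block": "B1", "verified": False, "is_lead": None, "p_lead": 0.25},
    {"block": "B1", "verified": True, "is_lead": False, "p_lead": None},
    {"block": "B2", "verified": False, "is_lead": None, "p_lead": 0.5},
]
lead_by_block = expected_lead_per_block(parcels)
```

In this toy input, block B1 has one verified lead line plus one unverified parcel with probability 0.25, so its expected capture is 1.25 lines; these per-block totals play the role of the objective 1 coefficients.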

FIGURE 5. Expected lead service line capture per street block.

4.3. Probability of meter failure

To characterize the condition of the water meters, an age-based likelihood of failure model is implemented. More sophisticated modelling of water meter risk is beyond the scope of this research; our goal here is to have a method for estimating the number of prevented meter failures when selecting a replacement project area. A failure likelihood model is convenient because it is scaled between 0 and 1, and higher likelihoods directly translate to a higher risk value. The probability model we use is presented in the case study by Lund (1988), which uses an exponential distribution to characterize failure likelihood. The exponential model estimates the probability of failure of an asset, P(t), where t denotes the asset age measured in years. The model is given in Equation (2) as follows:

$$P(t) = 1 - e^{-0.01t} \qquad (2)$$

Equation (2) specifies that older meters are more likely to fail. The in-service date of individual meters is available as an embedded attribute in the water meter shapefile. We use this information to compute the age of each active residential water meter in years (t). To estimate the total number of failed meters that can be avoided when selecting a given project area, we sum the failure probabilities of all the individual meters contained in the area. The distribution of estimated failed meters per street block is shown in Figure 6.
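As a minimal sketch of this calculation, the snippet below evaluates Equation (2) and sums the result over the meters in one area. The function names and the meter ages are illustrative assumptions; only the rate constant 0.01 comes from Equation (2).

```python
import math

FAILURE_RATE = 0.01  # rate constant in Equation (2)


def failure_probability(age_years):
    """P(t) = 1 - exp(-0.01 t) for a meter of age t years."""
    return 1.0 - math.exp(-FAILURE_RATE * age_years)


def expected_failures(meter_ages):
    """Expected number of failed meters in an area: sum of P(t) over its meters."""
    return sum(failure_probability(t) for t in meter_ages)


# Hypothetical ages (years) of the meters in one street block.
block_ages = [5, 20, 40]
expected = expected_failures(block_ages)
```

A new meter (t = 0) has a failure probability of zero under this model, and the probability approaches 1 as age grows, which is why the per-block sum can be read as an expected count of failed meters.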

FIGURE 6. Expected meter failure count per street block.

4.4. Geographic neighborhoods

Census tracts are used to delineate the larger neighborhoods within which the street blocks are contained. They were selected because they are contiguous spaces that each encompass roughly the same population (1,200–8,000 people) and are large enough to divide the service area of Dearborn into a small number of discrete regions. The spatial database of census tracts is publicly available from the US Census Bureau (US Census Bureau, 2019). Figure 7 shows the boundary of each census tract overlaid with the street blocks. There are 24 unique census tracts, with an average of 86 street blocks contained within each.

FIGURE 7. Census tract (neighborhoods) locations for the City of Dearborn.

5. RESULTS AND DISCUSSION

In this section, we present the project area selection results for the City of Dearborn case study. To demonstrate how optimization models can incorporate varying degrees of practical planning constraints, three versions of model (1) are considered.

1) Baseline: This model consists of only Equations (1a)–(1d). The only constraint is the budget on the number of homes that can be selected as part of the capital program. No geographic restrictions are specified.

2) Geographic Minimums: This model adds Equation (1e) to the baseline. This specifies a minimum number of homes that must be selected for replacement in each census tract. In effect, this avoids spatial concentration of projects and ensures that selections are spread across the entire distribution area.

3) Geographic Minimums and Compactness: This is the full model described by Equations (1a)–(1j). Beyond enforcing a minimum number of homes per census tract, an additional constraint requires that the selected blocks in each tract all be within a certain distance of each other. In effect, this requires the model to identify a cluster of street blocks to address within each census tract to reduce mobilization.

The epsilon constraint method was implemented to identify the Pareto-efficient frontier for each model, and the results were compared. The three models progress in complexity, with more constraints added each time, so the overall performance of the model worsens with each step (fewer total assets removed). By contrasting the solutions identified by each model, we can also quantify the tradeoffs of including each consideration. This is done by measuring the degree to which the objectives worsen (how many fewer lead service lines are removed, how many fewer failed meters are replaced) in exchange for a more practical and/or lower-cost deployment of resources (selected areas cover the whole site, all areas are compact).
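To make the epsilon constraint idea concrete, the sketch below applies it to a toy instance with five hypothetical street blocks and a parcel budget. It is not the implementation used in this study: the block values are invented, and exhaustive enumeration stands in for the integer programming solver so that the example runs with the Python standard library alone. For each epsilon, it maximizes expected lead removal subject to expected meter removal being at least epsilon.

```python
from itertools import combinations

# Hypothetical blocks: (parcels, expected lead lines, expected failed meters).
BLOCKS = {
    "A": (4, 3.5, 0.5),
    "B": (3, 0.5, 2.0),
    "C": (5, 4.0, 1.0),
    "D": (2, 0.25, 1.5),
    "E": (3, 1.5, 1.5),
}
BUDGET = 8  # maximum number of parcels that may be selected (illustrative)


def feasible_selections():
    """Yield every combination of blocks that respects the parcel budget."""
    names = sorted(BLOCKS)
    for size in range(len(names) + 1):
        for combo in combinations(names, size):
            if sum(BLOCKS[b][0] for b in combo) <= BUDGET:
                yield combo


def objectives(combo):
    """Return (expected lead removal, expected meter removal) for a selection."""
    lead = sum(BLOCKS[b][1] for b in combo)
    meters = sum(BLOCKS[b][2] for b in combo)
    return lead, meters


def epsilon_constraint(epsilons):
    """Maximize lead removal subject to meter removal >= epsilon, per epsilon."""
    frontier = []
    for eps in epsilons:
        candidates = [c for c in feasible_selections() if objectives(c)[1] >= eps]
        if not candidates:
            continue  # no selection reaches this meter-removal level
        # Tuple comparison breaks ties on lead removal by meter removal,
        # so the chosen point is not dominated by another selection.
        point = objectives(max(candidates, key=objectives))
        if point not in frontier:
            frontier.append(point)
    return frontier


frontier = epsilon_constraint([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
```

On this toy instance the sweep returns three non-dominated points, moving from a lead-focused selection to a meter-focused one as epsilon increases. In a full-scale problem the enumeration step would be replaced by a solver call for each epsilon value.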

For simplicity, we only consider the threshold where a maximum of 3% of parcels can be selected. This roughly corresponds to a 3-year planning horizon for the city, assuming a 1% annual replacement rate. In practice, this threshold can be adjusted to the proportion of homes the utility plans to include in the capital program. The 3% limit corresponds to roughly 886 parcels in total over the selected street blocks; it is assumed that every home in a chosen street block will be included in the capital program. Similarly, we specify that a minimum of 30 homes should be selected per census tract for the geographic minimums constraint (1e) and an upper limit of 1,000 feet of separation between street blocks to enforce compactness in constraints (1f)–(1j). Again, these thresholds were selected as feasible and realistic values for a capital program and are used to highlight the use of multi-objective models for capital planning. In practice, these model parameters can be adjusted to accurately reflect local situations.
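The arithmetic behind these thresholds can be sketched as follows. The parcel total, share limit, tract minimum, and tract count are the values stated in this paper; the helper names and the small example selection are hypothetical.

```python
import math

TOTAL_PARCELS = 29559  # residential parcels (Section 4.1)
MAX_SHARE = 0.03       # at most 3% of parcels may be selected
TRACT_MINIMUM = 30     # minimum homes selected per census tract
N_TRACTS = 24          # census tracts (Section 4.4)


def parcel_budget(total_parcels, max_share):
    """Largest whole number of parcels that stays within the share limit."""
    return math.floor(total_parcels * max_share)


def check_selection(selected, all_tracts, budget, tract_minimum):
    """Check a selection against the budget and the per-tract minimum.

    `selected` is a list of (tract_id, parcel_count) pairs, one per chosen
    street block; this layout is a hypothetical choice for illustration.
    """
    per_tract = {tract: 0 for tract in all_tracts}
    for tract, parcels in selected:
        per_tract[tract] += parcels
    return {
        "within_budget": sum(per_tract.values()) <= budget,
        "meets_tract_minimum": all(n >= tract_minimum for n in per_tract.values()),
    }


budget = parcel_budget(TOTAL_PARCELS, MAX_SHARE)
reserved = N_TRACTS * TRACT_MINIMUM
toy = check_selection(
    [("T1", 18), ("T1", 14), ("T2", 25)], ["T1", "T2"], budget, TRACT_MINIMUM
)
```

With these values the budget is 886 parcels, and the 30-home minimum across 24 tracts commits 720 of them before any prioritization takes place, which gives a sense of how restrictive the geographic minimums constraint is.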

Figure 8 shows the identified Pareto-efficient frontiers for each of the solved models; the solution located at the midpoint of each frontier is bolded. The X-axis shows the optimal value of objective 1 (lead service line removal) and the Y-axis shows the optimal value of objective 2 (failed meter removal). It is evident that the baseline model with no geographic constraints far outperforms the other two models, as seen in the large gap between its Pareto frontier and the others. There are 36 identified solutions in the Pareto-efficient frontier of the “baseline” model, 25 solutions in the “geographic minimums” model, and 13 solutions in the full “geographic minimums and compactness” model. The solution set of each model is discussed individually below before a comparison across models.

FIGURE 8. Pareto-efficient frontier – objective value comparison with different constraints (midpoint solution bolded).

We first focus on the “baseline” model. The boundary points along the Pareto-efficient frontier represent the outcome when only one objective is considered in the optimization. The left-most point is the solution when only failed meter removal is considered (Equation (1b)) and the right-most point is the solution when only lead service lines are accounted for (Equation (1a)). Optimizing for lead service lines alone produces a selection of street blocks that removes 823 lead pipes and 528 failed meters, whereas optimizing for meters alone removes 687 lead pipes and 571 failed meters. To quantify the difference in the selected street blocks, we can treat the two optimal solutions as sets of street blocks and use the Jaccard similarity index. The Jaccard index measures the degree of overlap between two sets, scaled between 0 and 1, and is defined as the proportion of overlapping items relative to the total count of unique items (see Equation (3) below). Typically, Jaccard index values above 0.6 represent similar sets, and values below 0.4 represent dissimilar sets. The two boundary solutions have a Jaccard similarity of 0.39, illustrating the large difference in selected areas when only one objective is considered.

$$\text{Jaccard set similarity index, } J(A, B) = \frac{\text{Number of overlapping items across two sets } (A \cap B)}{\text{Total number of items across two sets } (A \cup B)} \qquad (3)$$
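A minimal sketch of the Jaccard similarity index as applied to two selections of street blocks is given below. The block identifiers are hypothetical, and returning 1.0 for two empty sets is a convention assumed here, not something specified in this paper.

```python
def jaccard(a, b):
    """Jaccard similarity: |A intersect B| / |A union B| for two collections."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 1.0  # assumed convention: two empty selections are identical
    return len(a & b) / len(union)


# Hypothetical street block selections from two single-objective runs.
lead_only = {"B01", "B02", "B03", "B04"}
meter_only = {"B03", "B04", "B05", "B06", "B07"}
similarity = jaccard(lead_only, meter_only)
```

Here two of the seven distinct blocks appear in both selections, so the index is 2/7, which would fall in the “dissimilar” range described above.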

Based on the two boundary points, the decrease in lead pipe capture (136, 16.5% lower) is much larger than the decrease in failed meters (43, 7.5% lower). This implies that improving objective 2 comes at a larger expense to objective 1, as objective 1 is more sensitive to performance decrease. The reason for the large tradeoff of objective 1 (lead pipes) relative to objective 2 (meters) is intuitive after comparing the spatial distributions of these assets in Figures 5 and 6. Lead service lines are spatially concentrated in just a few neighborhoods of the distribution area, whereas faulty meters are more evenly scattered throughout the system. This means that any deviation away from the regions where lead pipes are found will greatly decrease model performance, whereas many different geographic configurations of street blocks can produce similar meter removal. The mid-point of the Pareto frontier is a proxy for the outcome when the two objectives are evenly weighted; in this scenario the selection captures 768 lead pipes and 562 bad meters. Here the performance tradeoffs between the two objectives are much smaller: only 55 (6.7%) fewer lead pipes selected compared to the highest value possible, and 8 (1.4%) fewer bad meters. In terms of the Jaccard similarity index, the mid-point solution has a similarity of 0.571 to the case where only lead pipes are optimized and 0.703 to the case where only meters are optimized.

To provide practical context, given that we assume there is 1 meter and 1 service line per household, any difference in the count of asset removals is approximately the number of additional home visits the utility crew must make. For example, if one solution captures 100 fewer lead service lines, then the city must allocate an additional 100 removals at other homes to make up the difference. Therefore, the mid-point along the Pareto frontier represents a more cost-efficient program since it greatly reduces the truck rolls needed to remove a high number of both lead pipes and meters at the same time. If we only considered the single-objective outcomes, a utility would have to visit up to 136 additional homes to achieve optimal removal of lead pipes, versus just 55 additional homes if the mid-point outcome was used. Similarly, a utility would need 43 additional deployments at homes to achieve optimal meter removal, versus just 8 additional deployments with the mid-point outcome.
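The shortfall figures quoted for the baseline model can be reproduced with a short calculation. The removal counts below are the ones reported above for the boundary and mid-point solutions; the helper name `shortfall` is an illustrative choice.

```python
def shortfall(best, achieved):
    """Return (count, percent) by which `achieved` falls short of `best`.

    The percentage is taken relative to the best attainable value and
    rounded to one decimal place.
    """
    gap = best - achieved
    return gap, round(100.0 * gap / best, 1)


# Baseline model, lead-optimized (823 lead) versus meter-optimized (687 lead).
lead_gap = shortfall(823, 687)
# Baseline model, meter-optimized (571 meters) versus lead-optimized (528 meters).
meter_gap = shortfall(571, 528)
# Baseline model, mid-point lead capture (768) versus the lead-optimized value.
lead_mid = shortfall(823, 768)
```

Under the one-meter, one-service-line-per-household assumption, each count returned here can be read directly as the number of additional home visits needed to reach the single-objective optimum.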

Focusing on the “geographic minimums” model next, when lead service line removal is optimized alone, the optimal selection of street blocks captures 660 lead pipes and 507 bad meters. In contrast, when only faulty meter removal is optimized, the solution street blocks capture 469 lead pipes and 548 meters. The two solutions have a Jaccard similarity of 0.194, meaning that less than one-fifth of the solution street blocks overlap.

As in the “baseline” model, the removal of lead service lines is much more sensitive to drops in performance than the removal of faulty meters. The tradeoff between the two boundary points is 191 (28.9%) fewer lead pipes in exchange for 41 (7.6%) additional meters, and vice versa. The mid-point solution along the Pareto-efficient frontier results in a selection of blocks that removes 603 lead pipes (57, 8.6% fewer than the optimal) and 535 failed meters (13, 2.3% fewer than the optimal). The Jaccard similarity of this selection is 0.355 to the solution optimizing only lead pipes and 0.476 to the solution optimizing only meters. This represents a significant improvement, since the mid-point has over a third overlap with the lead-optimized solution and almost a half overlap with the meter-optimized one. The reasoning behind the difference in sensitivity is also the same: lead pipes are highly concentrated in just a small handful of census tracts. Reweighting the model to focus more on bad meters, which are much more evenly dispersed across the map, leads to large tradeoffs in the number of removed lead pipes. Translating from the objective values to crew mobilizations, the mid-point solution can potentially save the city 131 home mobilizations for the removal of lead pipes and 28 deployments for the removal of meters in order to achieve optimal removal.

Finally, turning to the “geographic minimums and compactness” model, the boundary points along the Pareto frontier indicate that a lead-optimized solution will select street blocks that remove 627 lead pipes and 501 meters. The meter-optimized solution, in contrast, will select street blocks that remove 469 lead pipes and 532 meters. This amounts to a 158 (25.2%) decrease in lead capture and a 31 (5.8%) decrease in failed meter removal when comparing the two single-objective solutions. The Jaccard similarity index between them is just 0.176. Turning to the mid-point solution, the block selections here can remove 603 lead pipes and 519 bad meters. This again represents a significantly smaller reduction than between the two boundary points: only 24 (3.8%) fewer than the lead-optimized count and 13 (2.7%) fewer than the meter-optimized count. It is interesting to note that, because blocks with high lead service line and bad meter counts are already geographically clustered, the compactness constraint does not change the solution performance by a large margin. Translating the changes in objective values to potential crew mobilizations, the mid-point solution can save the city up to 134 home deployments for lead pipes and 18 deployments for meters.

Figure 9 shows the selected street blocks under the different scenarios considered; the selection of areas representing the mid-point of each Pareto-efficient frontier is visualized. Table 2 compares the performance of these three scenarios.

FIGURE 9. Street block selections – solution comparison with different constraints.

TABLE 2. Street block selections – performance comparison with different constraints.

Scenario                               Number of parcels selected   Number of street blocks selected   Objective 1, lead service line removal   Objective 2, failed meter removal
Baseline                               886                          166                                768                                      562
Geographic minimums                    886                          111                                603                                      535
Geographic minimums and compactness    879                          78                                 603                                      518

One interesting observation is that the outcome in each scenario is at or just under the maximum number of allowable parcels (886); however, the number of street blocks is reduced as more constraints are added. The “baseline” model has no geographic constraints, and the solution under consideration selects 166 street blocks for inclusion in the capital program, whereas the “geographic minimums and compactness” scenario has less than half at just 78 street blocks. The performance across both objectives also decreases as more constraints are introduced. The “baseline” scenario far outperforms the other two scenarios in terms of lead removal with almost a 27% increase, and marginally improves on failed meter removal with a 5% increase. The reasoning behind these patterns was discussed earlier in this section: with lead service lines primarily concentrated in just a few areas of the city, imposing constraints on system-wide project selection greatly reduces performance.

Taking the trends in street block count and objective performance together, the empirical results suggest there is a tradeoff between the removal count of target assets (in turn, the effectiveness of a capital program) and cost. The “baseline” model is the most precise, selecting many smaller street blocks and removing many target assets, but the mobilization and truck roll costs of this program are much higher because the small blocks are spread out geographically. In contrast, the “geographic minimums and compactness” constraints reduce the mobilization cost because all selected projects within the same census tract are closely located. These types of projects can be completed most efficiently with fewer truck rolls, but at the expense of removing fewer bad assets. These findings highlight the power of optimization modelling to evaluate different scenarios in the context of program planning, as well as the ability of the proposed framework to bundle household replacement projects in a cost-efficient manner.

The findings of this section are summarized as follows:

• The optimal solution at the midpoint of the Pareto-efficient frontier represents an effective balance between the two objectives and does not result in substantial differences in the selected project areas relative to the single-objective cases. Using the solutions along the middle of the Pareto-efficient frontier can thus significantly reduce the number of potential home deployments needed to achieve optimal removal of both lead service line and degraded meter assets.

• The resulting number of lead service lines captured is much more sensitive to tradeoffs when balanced against the need for removing bad meters. This is because lead service lines are mostly clustered together in just a few areas, whereas faulty meters are much more evenly spread out across the system.

• Including the geographic minimum constraint greatly reduces the performance of both objectives. This is because there are census tracts with low counts of lead pipes and bad meters, but the model is still required to allocate a minimum selection there.

• The models are less sensitive to the compactness constraint. This is because street blocks with high lead and bad meter counts are already close to each other within a given neighborhood, so the optimization will naturally select proximate areas without needing a constraint.

• The empirical results suggest a tradeoff exists between program quality and cost. The baseline model removes the most assets of interest but selects a high number of small street blocks for the program, which can increase mobilization costs. In contrast, the model with geographic and compactness constraints selects less than half the street blocks, but the removal of target assets is also lower.

Possible extensions of the research presented in this paper include the application of more sophisticated methods for estimating risk levels of individual asset classes, for example, higher-accuracy models for locating degraded meters and the incorporation of demographic information to better identify populations at high risk of lead exposure. The optimization model can also be extended to reflect other planning considerations, for example, the existence of construction moratoriums in certain neighborhoods, or cost functions that vary by home age and size rather than assuming uniform cost. Additional considerations of equity can also be incorporated into the modelling to ensure that utility resources adequately target the most at-need demographics. As more constraints and variables are introduced to the integer programming formulation, the tradeoffs between computational tractability and model complexity need to be examined.

6. CONCLUSION

Managing the aging infrastructure of water distribution systems with limited funding means that many operators need to maximize the return on any capital investment. Designing an effective asset management program is, as a result, a resource allocation problem. The objective is to identify areas where vulnerable assets are located and target the deployment of capital dollars to address the problematic areas. The main issue for many municipalities is that there are many classes of infrastructure (water mains, valves, meters, service lines) that each need renewal, and it is cost-ineffective to plan replacement programs that target only one class. This is because targeting several classes together can greatly reduce truck roll (or mobilization), defined as the deployment of crew and equipment to a project area, which is a significant cost in any capital improvement program. Therefore, selecting capital improvement projects that target multiple asset classes together can reduce mobilization costs and help utilities stretch their limited budgets further.

To our knowledge, no previous work has demonstrated the application of optimization modelling and geospatial methods to the infrastructure project planning of assets at the individual household resolution. The problem is framed as a dual-objective integer programming model, where the selection of project areas aims to maximize lead service line removal and water meter changeout together. A case study for the joint renewal planning of service lines and water meters is presented. These two infrastructure classes were selected due to the availability of data. To best reduce mobilization cost in practice, projects with similar outage times and required crew/equipment are best suited for joint work (e.g., meter replacements and service line material inspections, service line and water main renewals). To provide additional efficiency and delineate individual project areas, we group adjacent land parcels together into larger street blocks. The selection of street blocks is more efficient than selecting individual homes since it allows more geographically compact replacement projects to be performed.

Since a multi-objective optimization model is specified, the optimal solution to the problem is represented by a Pareto-efficient frontier. Each point along the frontier represents a unique selection of street blocks to be included in a potential asset management program, and the different solutions represent different weightings of the two objectives. Empirical data suggest that the multi-objective approach can identify effective project selections that also significantly reduce duplicate visits to the same home. Furthermore, the sensitivity of a given objective to tradeoffs is highly dependent on the spatial distribution of the target assets. In our case study, lead service lines are spatially concentrated in only a limited number of areas and, as a result, are more sensitive to performance decreases when balanced against the competing objective to remove bad meters. The spatial concentration of lead pipes also means that any constraints to enforce broad spatial selection of projects, potentially due to political concerns, will greatly reduce the removal performance.

Altogether, our research demonstrates the potential of using dual-objective modelling to guide replacement planning of household-level water distribution assets and to generate more cost-efficient programs to protect critical water infrastructure. In practice, the application of the methods explored here needs to be incorporated within broader infrastructure decision frameworks that account for local regulations and the needs of all asset types. Water utilities own various infrastructure types (e.g., distribution mains, water towers, pumps, valves) that each have their own replacement and maintenance needs. The research here focuses specifically on the alignment of household-level projects, but the efficacy of capital programs can be greatly improved when assets of different resolutions are considered together.
For example, additional reduction in project mobilization cost can be achieved if service line replacements (household level) are aligned with adjacent distribution main renewals (street level), since both require excavation. Larger planning frameworks are needed to accurately weigh the needs of different infrastructure types. Once the best portfolio of assets to target is determined, it can be incorporated into downstream decision models that select areas with maximal alignment to reduce cost. Beyond the physical condition of the infrastructure itself, local regulations that may influence the geographic location of projects also need to be included in the decision framework. Common examples include project moratoriums to avoid frequent disruption of the same neighborhood, incentivized alignments with other departments in their project areas (e.g., water department main replacements combined with transportation department street resurfacing of the same road), and renewal targets specified by state agencies (e.g., a 5% replacement rate of lead service lines per year).

In the case of water meter and lead service line replacements, the methods developed here help prioritize neighborhoods for the joint renewal of these assets. However, state regulatory bodies may specify a minimum lead pipe replacement rate due to the urgent public health risk, and local regulation may prevent the excavation of underground pipes within 5 years of a previous project. These factors combined may skew the selected infrastructure projects to prioritize lead pipes over meters, and may also influence when certain neighborhoods can be addressed. To the best of our knowledge, broader planning frameworks like this are typically executed by relying on expert judgement, but this is an active area of research in the infrastructure planning domain. The exploration of methods to determine the best mix of asset types to target, and of similar optimization methods to align projects spanning different spatial resolutions, can provide value to utility decision-makers and presents a meaningful direction for future work.

Acknowledgements

We would like to thank the City of Dearborn, MI, for agreeing to the use and showcasing of its system data for the implementation of this research. We would also like to thank Mr. Eric Roggow, CMMS Program Manager for the Department of Public Works at the City of Dearborn, for his efforts in compiling the relevant datasets needed for this research. The models and risk results presented in this research are hypothetical, based on data provided by the city, and carry uncertainty; they do not necessarily reflect the true condition of the distribution system assets.

727 CONFLICT OF INTEREST STATEMENTConflict of Interest

This work was funded by Xylem Inc., which is developing products related to the research described in this paper. The independence of this work was reviewed and approved in accordance with Xylem Inc.'s policy on objectivity in research. The opinions and views expressed are those of the researchers and do not necessarily reflect those of the sponsors.

Data Availability Statement

The data that support the findings of this study are available from the City of Dearborn, MI. Restrictions apply to the availability of these data, which were used under license for this study. Data are available from the authors with the permission of the City of Dearborn, MI.
