Aws2 1336

Multi-objective Optimization Models for the Renewal Planning of Multiple Asset Classes

Thomas Ying-Jeh Chen1,* thomas.chen@xylem.com;
Eric Wang2;
Nicole Pasch3;
Amin Ganjidoost4

1
Senior Data Scientist
Xylem Inc
8920 MD-108, Columbia, MD, 21045, US.
2
Product Manager
Xylem Inc
870 Market St., San Francisco, CA, 94102, USA.
3
Client Solutions Manager
Xylem Inc
11850 Sears Street, Ste A, Livonia, MI, 48150, USA.
4
Drinking Water Decision Science – Manager
Xylem Inc
5055 Satellite Dr #7, Mississauga, ON, L4W 5K7, Ontario, CA.

*Corresponding author: Thomas.Chen@xylem.com

Abstract
Managing the aging infrastructure of water distribution systems presents a challenge for many utilities. With various asset types competing for limited dollars, designing an effective asset management program is a resource allocation problem. Mobilization of equipment and crew is a significant cost (typically 2–10%) within any capital improvement program. Therefore, selecting projects that target multiple asset classes together can reduce mobilization and help utilities stretch their budgets further. This research presents a process for modeling the joint renewal planning of multiple asset classes. The problem is framed as a dual-objective optimization, where selection of project areas aims to maximize lead service line removal and water meter changeout together. A case study from a Midwest utility is presented, and empirical data suggest the dual-objective approach effectively reduces duplicate interventions in the same regions. Equity considerations are also examined, where constraints are added to enforce system-wide project selection. Results show that the sensitivity of the objective towards equity is dependent on the underlying spatial distribution of the target asset itself, where uneven spread of the target asset leads to greater negative impact on model performance.

Keywords
Asset Management
Drinking Water Systems
Optimization
Risk Analysis

Article Impact Statement
A two-step method is presented for the replacement planning of water system assets located in individual households. The method reduces mobilization costs by addressing multiple asset types together.

1. INTRODUCTION
With various operational and regulatory objectives competing for the same dollars within a municipal budget, designing an asset management program is a resource allocation problem. For many utilities, having multiple asset types (mains, valves, meters, service lines) that each need timely inspection and repair places a heavy burden on their limited budgets and on the affordability of water service in their communities. In addition, public health regulations imposed by state and local agencies may require additional capital works to be performed, further stretching the staff and funding of the municipality. As a result, a cost-efficient and effective asset management program is critical for guiding utility operators to maximize their return on time and investment by targeting the vulnerable assets most in need first.
Risk assessments are useful in designing these programs because they provide a systematic process for quantifying vulnerability and ranking individual assets to guide the prioritization (American Water Works Association, 2010). For example, previous works have implemented analyses where every distribution main is ranked based on future likelihood and consequence of failure, and asset management programs are designed to address the highest-risk assets first (Chen et al., 2020). See other examples in (Ganjidoost et al., 2022), (Vladeanu & Matthews, 2019), (Fontanazza et al., 2015), and (Puleo et al., 2014) focusing on distribution mains, valves, and water meters.
While this is a useful starting point, a key limitation of these approaches is that they only consider a single asset class in the analysis. The assumption is that risk can be optimally reduced if capital is allocated based on the highest-ranked assets. As discussed earlier, a water distribution system contains many different infrastructure classes, and an asset management program that accounts for the different categories is more effective. For example, identifying regions with high-risk water mains alone is good, but identifying regions with high-risk mains, service lines, and valves together can help the utility achieve greater economies of scale. The largest saving realized in this approach is the reduction in truck roll, or mobilization cost. Truck roll is broadly defined as the dispatch of crew and equipment into the field to perform any capital improvement work. It accounts for a significant portion of the cost of any replacement project, with estimates of $18,000 per pipe replacement project requiring road excavations (Chen et al., 2021), and approximately $500 for smaller projects at the household level (e.g., meter replacements). Depending on the crew and equipment needed, mobilization costs take up anywhere from 2% to 10% of the project's overall budget, a significant margin for utilities with stretched budgets. Designing programs that align the replacements of multiple infrastructure classes together will reduce the cost of truck roll and help utilities do more with their limited budgets.
The objective of this research is to present a methodology for selecting household-level projects that optimizes the joint replacement of two infrastructure classes: degraded water meters and lead service lines. The problem is framed as a dual-objective integer program: 1) maximize the number of lead service lines replaced, and 2) maximize the number of degraded meters replaced. Individual homes are aggregated into larger areas and scored based on the number of lead pipes and degraded meters contained in each. A single project is delineated based on these aggregated spaces, and the optimization aims to find a selection of project areas that jointly maximizes the replacement of these assets. An exact formulation of this model (one that guarantees the identification of all optimal solutions) is presented, and a solution approach is demonstrated to find the Pareto-efficient frontier (Berardi et al., 2009; Dridi et al., 2009). Each point on this frontier represents an optimal solution based on different weightings of the two objectives. Utilities can choose between these solutions based on overall organizational and regulatory priorities.
Two main contributions of this research are summarized below.
• The paper presents a methodology for optimizing the joint replacement planning of two infrastructure classes that are located at the premises of individual households. Single structures can be dispersed geographically and have the potential to quickly drive up mobilization costs. Based on our review of the literature, no previous work has demonstrated the application of optimization modeling and geospatial methods for the joint renewal planning of assets at the individual household resolution. While this research considers water meters and lead service lines as the case study, the methods presented can be generalized to plan for the asset management of any infrastructure found inside the home.
• An exact mathematical model, one that is guaranteed to find all optimal solutions, is presented and framed as a dual-objective optimization problem. The output for this class of problems is a set of Pareto-efficient solutions representing different tradeoffs between the two objectives. We find no previous work that presents an exact model for joint (multi-asset) project selection of water systems. Providing multiple solutions to decision-makers is also an improvement because it can guide more effective planning based on different organizational and regulatory priorities.
Together these two contributions advance the state of the art in modeling to assist in better asset management planning. It is important to note that the case study of synthesizing these two activities together is meant for conceptual purposes only. The goal here is to provide an example of two infrastructure projects that both take place at individual structures, and the case study demonstrates the application of dual-objective modeling to select the best project portfolio. In practice, considerations beyond just mobilization cost must be included when selecting which infrastructure projects should be aligned. Infrastructure projects requiring similar crew size, construction activities (need for specialized equipment), and level of disruption (excavation and restoration) are much better suited for alignment. Meter and service line replacements differ in their required level of effort, but they are selected here because data are available at the single-household resolution, which best supports the methods explored in this research.
2. LITERATURE REVIEW
Many studies are available on the risk assessment of water systems focusing on a single asset class. Water distribution mains have been a particular focus for many researchers (see reviews by Kleiner & Rajani, 2001; Konstantinou & Stoianov, 2020; St. Clair & Sinha, 2012), where statistical models were developed to estimate risk for individual pipeline segments and future projects could then be planned by prioritizing the highest-risk assets first. The underlying models rely on historic records of asset failure to train and validate the predictions for which assets will fail in the future. On the other hand, see the literature reviews by (Tscheikner-Gratl et al., 2019) and (Ana & Bauwens, 2007), which summarize the state of the art in sewer main risk modeling. Sewer main risk assessments primarily 1) quantify the structural condition of the main, 2) estimate the degree of overflow during wet weather events, and 3) characterize the fiscal and environmental impacts of overflows and leaks. Similar works have also been performed that address water meter changeout programs (see works by Fontanazza et al., 2015; Yazdandoost & Izadi, 2018; Yee, 1999; Mohanarishnan et al., 2019), where quantitative models to characterize the reliability of water meters are presented. The meters most likely to fail or inaccurately record hydraulics can be addressed first in a replacement program to maximize overall reliability.
Taking a broader view of infrastructure renewal planning, previous works have also discussed the benefits of bundling multiple infrastructure projects together in the same location. The major benefit of selecting projects in this manner is to cut down on mobilization cost and realize greater economies of scale (S. Kerwin & Adey, 2020). In the context of water systems, works by (Kleiner et al., 2010; Nafi & Kleiner, 2010) present a methodology for selecting pipe replacement locations that are incentivized to align with future road-work locations. An objective function for total cost is presented, which includes mobilization, raw materials, crew, and expected repair of future breaks, and the optimization aims to minimize total cost. Mobilization costs are greatly reduced for streets with future road works already planned, such that selections that align with adjacent works are incentivized. Similar research by (Sean Kerwin & Adey, 2020; Muñuzuri et al., 2020; Tscheikner-Gratl et al., 2016) demonstrates approaches for bundling water and sewer main replacement selections together, where a risk index that encompasses the status of both asset classes is used to guide decision making. A case study by (Carey & Lueke, 2013) presents a framework for selecting infrastructure projects based on the combined criticality of the underlying roads, water mains, and sewer mains. A major limitation of these approaches is the need to assign weightings between the asset classes and the difficulty in normalizing failure outcomes (e.g., the failure cost of a road collapse is much larger than that of a pipe burst). Multi-criteria decision-making methods such as AHP and ELECTRE have been demonstrated to reflect these weightings from an operator's perspective (Tscheikner-Gratl et al., 2017), but they are often subject to human bias and difficult to scale. Despite the challenges posed by the application of these planning models, they can still be used in conjunction with the dual-objective model developed in this research to identify areas with greater project alignment. The model developed in this research focuses on infrastructure at the same spatial resolution (i.e., homes), but it can be combined with other planning models to consider other assets at different spatial scales (e.g., streets, neighborhoods).
Optimization modeling is a useful tool for ingesting asset-level or location-level risk indices and formulating a program that maximizes risk reduction. The output from these models often specifies which assets to select, when to address them, and even the type of action to take (replace or repair) (L. Chen & Bai, 2019). Some examples that formulate and solve exact integer models include the following: inspection routing (T.Y.J. Chen et al., 2020), replacement project selection (T.Y.J. Chen et al., 2021), sewer rehabilitation (de Monsabert et al., 1999), as well as sensor placement (Berry et al., 2005). Since these models are linear with integer variables, the globally optimal solution can be identified. To add further complexity, overall pressure and flow conditions need to be considered when taking parts of a distribution system offline during projects. This can introduce non-linear constraints which impact computational tractability, as seen in examples by (Pecci et al., 2015) and (Naoum-Sawaya et al., 2015). Here non-linear relaxation techniques are demonstrated to converge on local optima. Multi-objective formulations for water main replacement planning are presented in (Dandy & Engelhardt, 2006; Kim et al., 2004; Osman et al., 2017); the model formulations in these works aim to maximize risk reduction along with other considerations such as cost, hydraulic reliability, and traffic impacts.
The work presented in this paper aims to extend the state of the art by formulating a multi-objective optimization model for the replacement of multiple water infrastructure assets at the location of individual households. From our review of the literature, we find no work that jointly addresses capital replacement planning at this spatial resolution as well as the economies of scale when bundling multiple replacements at the same location. Another area where previous work can be integrated with the proposed model is during the formulation of the dual-asset replacement model. Because the proposed method relies on inputs to determine the reward for selecting a given asset, it is particularly suitable for using the existing state of the art as input. In this paper, we use the age of the meters and a predictive model for lead service line locations to generate inputs to the project selection model. However, in practice, more sophisticated methods that exist in the literature can be used to provide better-quality estimates of infrastructure conditions.

3. METHODOLOGY
This section outlines the two-step process for 1) aggregating adjacent households into larger areas more suitable for capital projects, and 2) identifying the collection of areas that maximizes the joint renewal of two different water infrastructure assets. This research considers the replacement planning of lead service lines and degraded water meters since both are located at the premises of individual homes. The method can be generalized to consider any two asset classes, but other practical planning factors would need to be considered (e.g., project cost and time). Figure 1 below summarizes the two-step process.

FIGURE 1. Project Identification and Optimization Workflow.

3.1. Problem Specification
The project area selection problem is similar to the knapsack problem (Martello & Toth, 1990), a well-studied problem in the field of combinatorial optimization. The goal is to select the combination of project areas that will be targeted for replacement (lead service lines and water meters) in the upcoming capital renewal program. The objective is to identify the selection of project areas that maximizes the number of lead service line and degraded meter replacements. A project area is simply an aggregation of addresses that are near each other, typically in a contiguous manner. The budget and resource limitations of the municipality are reflected by placing a ceiling on the total number of homes that can be included in the replacement program. This assumes that the cost of addressing each home is uniform; it is possible to include more complex cost models, but that is left for future work.
3.2. Spatial Aggregation of Households
Each residential structure served by a utility can be delineated by its land parcel, and adjacent parcels can be aggregated into larger geographic areas for project planning purposes. This is done by loading the shapefiles of individual land parcels into the ArcGIS spatial software, using a spatial snap function to join adjacent land parcels together if needed, then merging all adjacent land parcels into a larger area. See Figure 2 below for an example output of this spatial analysis. This approach is selected because it is simple to implement and effectively groups individual homes into larger areas for project selection. The spatial process is also consistent, in that most aggregated street blocks contain a similar number of homes (approximately 30 parcels per street block). In practice, the methodology to bundle individual homes can be generalized to any approach the utility sees as most appropriate, e.g., aggregating homes along both sides of the same street. It is possible that other aggregation methods may be more realistic, but the method described here meets the needs of this research.
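The merging step can be pictured as finding connected groups of parcels. The sketch below is a minimal illustration in plain Python, not the ArcGIS workflow used in the study. It assumes a hypothetical list of adjacent parcel pairs (which the GIS snap and adjacency analysis would supply) and groups parcels with a union-find structure; the function name `merge_adjacent` and the sample parcel IDs are invented for illustration.

```python
def merge_adjacent(parcel_ids, adjacent_pairs):
    """Group parcels into blocks: parcels linked by a chain of adjacencies share a block."""
    parent = {p: p for p in parcel_ids}

    def find(p):
        # Follow parent links to the block representative, shortening the path on the way.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for a, b in adjacent_pairs:
        parent[find(a)] = find(b)  # union the two blocks

    blocks = {}
    for p in parcel_ids:
        blocks.setdefault(find(p), []).append(p)
    return sorted(sorted(members) for members in blocks.values())


# Hypothetical input: parcels 1-2-3 touch in a row, parcels 4-5 touch, parcel 6 is isolated.
blocks = merge_adjacent([1, 2, 3, 4, 5, 6], [(1, 2), (2, 3), (4, 5)])
```

Each returned list plays the role of one candidate project area; in practice the adjacency pairs would come from the parcel geometry rather than being typed in by hand.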

FIGURE 2. Spatial Aggregation of Land Parcels to Street Block Neighborhoods.

For further simplicity, only residential structures are considered in this research since they comprise most of the buildings served by a municipality and are the primary target for lead service line removal and meter changeouts. Each land parcel is counted as a single residential structure and is assumed to contain one meter and one service line (Hajiseyedjavadi et al., 2022). Each aggregated street block has an associated 1) count of the total number of homes, 2) count of homes with lead service lines, and 3) count of homes with degraded meters.
3.3. Neighborhood Selection Optimization
The spatial aggregation groups individual addresses located near each other into larger areas that better delineate potential project areas. The next step is to select the collection of project areas that will be included in the asset management program. Due to limited capital and labor resources, utilities need to prioritize neighborhoods to maximize return on investment and best meet regulatory and operational objectives. Budget limitations at the utility are reflected through a limit on the total number of households that can be included. The reward for selecting a project area is defined as 1) the expected sum of lead service lines contained in the boundary (verified plus unverified) and 2) the number of degraded meters contained in the boundary based on age. On the other hand, the cost of selecting a project area is defined as the total number of structures contained in the boundary. The exact integer program for jointly optimizing the replacement of lead service lines and degraded meters is defined below.
3.3.1. Decision Variables
We first define the following indices, parameters, and decision variables for the optimization model.
• Let $i \in I$ be the index of candidate project areas.
• Let $X_i = 1$ if the candidate project area $i$ is selected to be included in the capital replacement plan, 0 otherwise.
• Let $L_i$ = the expected number of homes with lead or galvanized service lines within the candidate project area $i$. This is taken as the sum of homes where the pipe material is verified (known based on historic inspection) and the individual likelihoods of containing lead if unverified (derived from a statistical model).
• Let $M_i$ = the expected number of degraded meters contained within the project area $i$. This is taken as the sum of the probability of meter failure across all homes contained within the project area; the probabilities are derived from a statistical model.
• Let $H_i$ = the total number of residential structures within the project area $i$.
• Let $T_U$ and $T_L$ be the upper and lower limits for total residential structures that can be addressed within the planning cycle. This reflects the utility's allocated budget for the capital program.
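To make the definitions of $L_i$, $M_i$, and $H_i$ concrete, the following sketch aggregates parcel-level records into per-area inputs. It is only an illustration under stated assumptions: the record layout, the helper name `block_inputs`, and the numbers are hypothetical, and a verified parcel is assumed to contribute 1 when the inspection found lead and 0 otherwise.

```python
from collections import defaultdict

# Hypothetical parcel records:
# (area_id, verified_lead, lead_probability, meter_failure_probability)
# verified_lead is True/False when the material was inspected, None when unverified.
parcels = [
    ("A", True, None, 0.10),
    ("A", None, 0.70, 0.40),
    ("A", False, None, 0.05),
    ("B", None, 0.20, 0.90),
    ("B", None, 0.50, 0.30),
]


def block_inputs(records):
    """Return {area_id: (L_i, M_i, H_i)} aggregated from parcel records."""
    totals = defaultdict(lambda: [0.0, 0.0, 0])
    for area_id, verified_lead, lead_prob, meter_prob in records:
        entry = totals[area_id]
        # Verified parcels count as 1 or 0 (assumption); unverified ones add their likelihood.
        entry[0] += float(verified_lead) if verified_lead is not None else lead_prob
        entry[1] += meter_prob  # expected number of degraded meters
        entry[2] += 1           # one residential structure per parcel
    return {area: tuple(values) for area, values in totals.items()}


inputs = block_inputs(parcels)
```

Here area "A" has three homes with an expected lead count of about 1.7, which shows how verified and predicted information are mixed in a single reward value.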
For budget planning purposes, a municipality may need to submit a 3–10-year replacement plan to the city manager for approval. The limits for the total number of homes targeted in the capital renewal program ($T_U$ and $T_L$) should reflect the available equipment and labor at the municipality. For this research, we assume that a utility can target 1–3% of the total homes served within the distribution area each year.
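As a rough worked example of these limits, assume the 29,559 residential parcels reported for the case study in Section 4.1 and round to whole homes. The rounding, and the use of single-year limits, are illustrative assumptions rather than values reported by the study.

```latex
T_L \approx 0.01 \times 29{,}559 \approx 296 \text{ homes per year}, \qquad
T_U \approx 0.03 \times 29{,}559 \approx 887 \text{ homes per year}
```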
3.3.2. Problem Formulation
The dual-objective optimization model for selecting project areas that jointly maximize the sum of both lead service lines and water meters is specified in model (1) below.

$$\max \sum_{i \in I} L_i X_i \tag{1a}$$
$$\max \sum_{i \in I} M_i X_i \tag{1b}$$
Subject to:
$$T_L \le \sum_{i \in I} H_i X_i \le T_U \tag{1c}$$
$$X_i \in \{0, 1\}, \quad \forall i \in I \tag{1d}$$
The first objective function (1a) maximizes the total count of lead service lines replaced in the selected project areas. The second objective function (1b) maximizes the total count of degraded meters addressed. Constraint (1c) specifies that the total number of households in the selected areas is between the lower and upper limits. Constraint (1d) specifies the binary domain of the decision variable. Note that equations (1a–1d) form the basis of the project selection problem and closely resemble the knapsack problem (Martello & Toth, 1990). Solving this model can serve as a good starting point for planning purposes; however, there are many practical and political limitations not accounted for. We address two common examples here and demonstrate how these considerations can be included in the model: 1) requiring a minimum number of projects planned for each neighborhood or political boundary, and 2) the desire to select project areas that are as spatially compact as possible.
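Because (1a), (1c), and (1d) taken alone form a knapsack-type problem with integer home counts, a single objective can be solved exactly on small instances with dynamic programming. The sketch below is illustrative only: dynamic programming is swapped in for the branch and bound solver used in the study, and the function name `solve_single_objective` and the toy numbers are hypothetical.

```python
def solve_single_objective(rewards, homes, t_low, t_up):
    """Maximize sum(rewards[i] * x[i]) s.t. t_low <= sum(homes[i] * x[i]) <= t_up, x binary.

    homes must be non-negative integers. Returns (best_reward, selected_indices) or None.
    """
    neg_inf = float("-inf")
    # best[h] = (best reward using exactly h homes, indices chosen to reach it)
    best = [(neg_inf, ())] * (t_up + 1)
    best[0] = (0.0, ())
    for i, (reward, size) in enumerate(zip(rewards, homes)):
        # Walk totals downward so each project area is used at most once.
        for total in range(t_up, size - 1, -1):
            prev_reward, prev_sel = best[total - size]
            if prev_reward > neg_inf and prev_reward + reward > best[total][0]:
                best[total] = (prev_reward + reward, prev_sel + (i,))
    feasible = [entry for entry in best[t_low:] if entry[0] > neg_inf]
    return max(feasible) if feasible else None


# Toy data: four candidate areas, lead counts as rewards, 50 to 60 homes allowed.
lead = [12, 3, 9, 7]
homes = [30, 28, 31, 29]
result = solve_single_objective(lead, homes, t_low=50, t_up=60)
```

With these numbers the only feasible selections are pairs of areas, and the pair (0, 3) gives the largest lead count while staying inside the home limits.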
When allocating resources for lead service line replacements, there is often political pressure to ensure that there is system-wide coverage during the capital program (Madrigal, 2019). However, it is well documented that at-risk populations for lead exposure are not evenly distributed across a city, often being concentrated in specific neighborhoods (Abernethy et al., 2016; Chojnacki et al., 2017). As a result, a more equitable program will allocate more resources towards neighborhoods with at-risk individuals, while still satisfying political pressures by ensuring projects are distributed across all neighborhoods. The following constraints enforce a minimum count of homes selected in every neighborhood. For generalizability, we use the terms ‘neighborhood’, ‘ward’, and ‘boundary’ interchangeably; in practice, any geographic delineation of the distribution area can be used.
• Let $j \in J$ be the index of all neighborhood boundaries within the distribution area.
• Let $T_j$ = the minimum number of residential structures within the neighborhood boundary $j$ required to be included in the replacement plan.
• Let $D_j$ = the set of candidate project areas $i$ that are contained within the boundary $j$.
Note that the neighborhood-specific limits on houses selected need to correspond with the overall limits across the entire distribution area: $T_L \le \sum_{j \in J} T_j \le T_U$. The following constraint (1e) can be included in model (1) to enforce a minimum selection threshold per neighborhood. Equation (1e) specifies that the total number of homes selected within each neighborhood is at least the minimum required amount.
$$\sum_{i \in D_j} H_i X_i \ge T_j, \quad \forall j \in J \tag{1e}$$
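A candidate selection can be checked against constraint (1e) and the consistency condition on the thresholds with a few lines of code. This is a hedged sketch: the helper names, the two hypothetical boundaries "north" and "south", and all numbers are assumptions made for illustration.

```python
def satisfies_minimums(selected, homes, areas_by_boundary, minimums):
    """Check (1e): homes selected inside each boundary j reach the threshold T_j."""
    for boundary, area_ids in areas_by_boundary.items():
        chosen_homes = sum(homes[i] for i in area_ids if i in selected)
        if chosen_homes < minimums[boundary]:
            return False
    return True


def minimums_consistent(minimums, t_low, t_up):
    """Check the condition stated in the text: T_L <= sum of T_j <= T_U."""
    return t_low <= sum(minimums.values()) <= t_up


# Hypothetical data: four project areas split across two boundaries.
homes = {0: 30, 1: 28, 2: 31, 3: 29}
areas_by_boundary = {"north": [0, 1], "south": [2, 3]}
minimums = {"north": 25, "south": 25}

spread_ok = satisfies_minimums({0, 3}, homes, areas_by_boundary, minimums)
clustered_ok = satisfies_minimums({0, 1}, homes, areas_by_boundary, minimums)
```

The selection {0, 3} touches both boundaries and passes, while {0, 1} leaves the "south" boundary with no homes and fails, which is the kind of concentrated plan that (1e) rules out.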
Within each neighborhood, selecting projects that are as spatially compact as possible is desirable because it 1) further reduces mobilization since less driving is required to address all the individual homes, and 2) simplifies routing of the crew to address all the selected projects since they are close together. To account for compactness, we specify a maximum distance that cannot be exceeded between any two project areas within a given neighborhood. To include considerations of compactness in model (1), we first define a few additional variables.
• Let $\varepsilon_j$ be the set of all possible project pair combinations within the boundary $j$.
• Let $Y_{ij} = 1$ if both project areas $i$ and $j$ are selected for replacement, 0 otherwise.
• Let $D_{ij}$ = the distance between the project areas $i$ and $j$, defined as the Euclidean distance between the two centroids.
• Let $B_j$ = the maximum distance allowable between any selected pair of projects within the boundary $j$.
The following constraints added to model (1) will adjust the model to consider the degree of spatial spread of the selected project areas.
$$D_{ij} Y_{ij} \le B_j, \quad \forall (i, j) \in \varepsilon_j,\ j \in J \tag{1f}$$
$$Y_{ij} \le X_i, \quad \forall (i, j) \in \varepsilon_j,\ j \in J \tag{1g}$$
$$Y_{ij} \le X_j, \quad \forall (i, j) \in \varepsilon_j,\ j \in J \tag{1h}$$
$$Y_{ij} \ge X_i + X_j - 1, \quad \forall (i, j) \in \varepsilon_j,\ j \in J \tag{1i}$$
$$Y_{ij} \in \{0, 1\}, \quad \forall (i, j) \in \varepsilon_j,\ j \in J \tag{1j}$$
Equation (1f) enforces that the centroid distance between any selected pair of projects must not exceed the boundary-specific limit $B_j$. Equations (1g–1i) enforce the relationship between the indicator variables: $Y_{ij}$ equals 1 exactly when both $X_i$ and $X_j$ are 1. Equation (1j) specifies the binary domain of the decision variable.
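The role of (1g) to (1j) can be verified by brute force: for each binary combination of the two selection variables, only one value of the pair indicator survives, and it equals their product. The sketch below also checks (1f) directly on a chosen set of areas. The helper names and centroid coordinates are hypothetical and serve only as an illustration.

```python
from itertools import product
from math import hypot


def allowed_y(x_i, x_j):
    """Binary values of Y permitted by (1g), (1h), (1i), and (1j) for fixed X_i and X_j."""
    return [y for y in (0, 1) if y <= x_i and y <= x_j and y >= x_i + x_j - 1]


# For every binary pair, exactly one value survives: the product X_i * X_j.
linearization_ok = all(
    allowed_y(a, b) == [a * b] for a, b in product((0, 1), repeat=2)
)


def is_compact(selected, centroids, max_distance):
    """Check (1f) directly: every selected pair lies within max_distance (centroid to centroid)."""
    chosen = sorted(selected)
    for k, a in enumerate(chosen):
        for b in chosen[k + 1:]:
            (xa, ya), (xb, yb) = centroids[a], centroids[b]
            if hypot(xa - xb, ya - yb) > max_distance:
                return False
    return True


# Hypothetical centroid coordinates for three project areas in one boundary.
centroids = {0: (0.0, 0.0), 1: (3.0, 4.0), 2: (10.0, 0.0)}
```

With a limit of 5 distance units, areas 0 and 1 can be selected together, but adding area 2 breaks the limit because it is 10 units from area 0.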
3.3.3. Solution Method
Model (1) is a dual-objective model, meaning the optimal solution is not a single unique selection of neighborhoods, but rather a set of solutions that are Pareto-efficient (or non-dominated). A set of solutions that are Pareto-efficient represents the optimal combinations of outcomes where any improvement to objective (1a) will come at the expense of (1b), and vice versa. Pareto optimality enables all tradeoffs among optimal combinations of the two objectives to be considered (Muncie et al., 2013).
To solve model (1) and identify the Pareto-efficient frontier, this research uses the epsilon-constraint approach since the closed-form specification is available. We refer the reader to (Haimes et al., 1971) for full details on the epsilon-constraint method, as well as Figure 3 below. To summarize, it involves first solving the model as a single-objective problem twice, considering (1a) alone and then (1b) alone. These two solutions initialize the Pareto-efficient set by defining the boundaries. The algorithm then iterates through the solution space between the two boundary points to identify all other Pareto-efficient solutions that may exist. This is done by converting one of the objective functions into a constraint, making the model a single-objective problem, and re-solving the optimization at different threshold values of the converted objective function. This process is repeated for every incremental value of ε, with each newly detected non-dominated solution being appended to the Pareto-efficient set.
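The sweep described above can be sketched on a toy instance. This is a simplified variant offered under explicit assumptions, not the study's implementation: exhaustive enumeration stands in for the branch and bound solver (so it is only practical for a handful of areas), the rewards are integers so that a step of 1 in ε cannot skip a frontier point, and the sweep starts from the best-lead boundary point and tightens the meter threshold until the problem becomes infeasible. All names and numbers are hypothetical.

```python
from itertools import combinations


def solve(primary, secondary, homes, t_low, t_up, eps=None):
    """Maximize the primary total by enumeration, optionally requiring secondary total >= eps.

    Ties on the primary total are broken by the larger secondary total, which keeps
    weakly dominated selections out of the result.
    """
    n, best = len(homes), None
    for k in range(n + 1):
        for sel in combinations(range(n), k):
            if not t_low <= sum(homes[i] for i in sel) <= t_up:
                continue  # violates the home limits in (1c)
            p = sum(primary[i] for i in sel)
            s = sum(secondary[i] for i in sel)
            if eps is not None and s < eps:
                continue  # violates the epsilon constraint on the converted objective
            if best is None or (p, s) > best[:2]:
                best = (p, s, sel)
    return best


def pareto_frontier(lead, meters, homes, t_low, t_up, step=1):
    """Epsilon-constraint sweep: maximize lead (1a) under a rising floor on meters (1b)."""
    frontier = []
    point = solve(lead, meters, homes, t_low, t_up)  # boundary point with the best lead count
    while point is not None:
        frontier.append(point)
        # Demand strictly more meters than the last point found, then re-solve.
        point = solve(lead, meters, homes, t_low, t_up, eps=point[1] + step)
    return frontier


lead = [12, 3, 9, 7]
meters = [2, 10, 6, 8]
homes = [30, 28, 31, 29]
frontier = pareto_frontier(lead, meters, homes, t_low=50, t_up=60)
```

On this toy data the sweep returns four non-dominated selections, ranging from (19 lead lines, 10 meters) to (10 lead lines, 18 meters); the feasible pair with 15 lead lines and 12 meters is left out because the pair with 16 and 14 dominates it. With fractional expected counts, as in the model inputs, a finer step or a different update rule for ε would be needed.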

FIGURE 3. Schematic of the Epsilon-Constraint Algorithm for Dual-Objective Maximization Problems.

Since model (1) is binary (all decision variables are binary) and linear, each iteration of the epsilon-constraint method can be solved directly by using the branch and bound algorithm (Lawler & Wood, 1966) available in most commercial and open-source solvers. The spatial data of the case study were preprocessed using ESRI’s ArcGIS software; all data processing, model formulation, and implementation of the epsilon-constraint algorithm were carried out with Python 3.7 and the package PuLP; the mathematical solver CPLEX 12.10.0 was used to identify the optimal solutions.

4. CASE STUDY
The two-step methodology to aggregate individual households and optimize the selection of projects is demonstrated in a real municipality. In this section, we describe the case study dataset and the methods used to generate the necessary inputs for the project optimization: 1) lead service line estimates, 2) failed meter estimates, and 3) location of larger neighborhoods.
4.1. Distribution System Data
We partnered with the local utility in Dearborn (Michigan) to obtain spatial databases of the city’s water distribution system and parcel tax assessment information. The water meter spatial layer and tax parcels are used to identify the set of active residential users. We first use tax assessment data from the year 2021 to filter out all buildings with a non-residential zoning classification (e.g., commercial, industrial, federal). Next, we spatially relate each meter to a land parcel based on its location; then, using the customer status in the meters shapefile, we filter out locations where the meter is inactive. There are a total of 29,559 residential parcels served by the Dearborn water system, and we assume that each parcel contains one meter and one building. There are a total of 2,074 unique candidate project areas in the City of Dearborn after aggregating adjacent land parcels to street blocks. Figure 4 below shows a map of all the residential land parcels served under the distribution system, along with a map of aggregated street blocks colored by the number of parcels contained within each boundary.

FIGURE 4. Dearborn Residential Land Parcel and Street Block Locations.

4.2. Lead Service Line Probability
We consider the service line assets running from the distribution main to the household in this study. The portion of pipe between the water main and the stop box, curb stop, or shutoff valve is publicly owned by the City of Dearborn, and the rest of the pipeline running to the meter inside the home is owned by the homeowner. While there are two portions of pipe making up the service line connection, we consider the prevalence of lead as a binary response: does any part of the pipeline contain lead, or not. A ‘lead’ response in the data is encoded by the utility as ‘any portion lead’, whereas a ‘non-lead’ response is encoded as ‘neither portion lead’. We take this approach of considering the privately and publicly owned portions together because the incidence of lead on either part of the pipe necessitates a full replacement based on the latest regulation (US Environmental Protection Agency, 2021). Therefore, the count of lead service lines across the entire system can be taken as the count of meter boxes connected to lead pipes.
The material information for the service line inventory is only partially complete for the City of Dearborn. Of the 29,559 active residential land parcels under consideration, only 11,692 (39.6%) have material information that has been verified through historic inspection of replacement works. For this research, we use the data from the verified portion of service lines to train machine learning models to predict which unverified locations are most likely to contain lead. Past research (Chojnacki et al., 2017) has demonstrated that the XGBoost algorithm (T. Chen & Guestrin, 2016) is a strong predictor of lead service line locations, so we use it in this case study. We obtained parcel tax assessment records from the City of Dearborn for modelling, in combination with attribute information embedded within the service line shapefile. The tax assessment dataset identifies each parcel of land under the city’s jurisdiction and includes information on land value, building value, building age, and other relevant information on building construction. Table 1 below summarizes the input data used for training the XGBoost algorithm.
TABLE 1. Machine learning variables for lead service line prediction.

| Variable name | Description |
| --- | --- |
| Lead Response | Binary response variable: ‘Positive’ if the meter box is connected to a lead pipe, ‘Negative’ otherwise. |
| Diameter | Size of the service line pipe connected to the meter box, reported in inches. |
| Install Year | Install year of the service line. |
| Parcel Age | Built year of the residential structure connected to the meter box. |
| Parcel Floor Area | Total floor area of the residential structure connected to the meter box, measured in square feet. |
| Parcel Total Area | Total square footage of the parcel in which the meter box is located. |
| Parcel Land Value | Assessed value of the land in the parcel, based on 2021 tax data. |
| Parcel Total Value | Total value of the parcel, including the land and the structure, based on 2021 tax data. |

The trained XGBoost model is a classification model that predicts, for each unverified meter box, the likelihood that a lead service line is connected there. To estimate the total number of lead service lines that can be removed when selecting a given project area, we sum the number of verified locations with lead pipes and the predicted probabilities of lead for each unverified location in the area. The distribution of estimated lead service lines per street block is shown in Figure 5 below.
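
As a minimal sketch of the summation described above, the snippet below counts each verified lead location as 1 and adds the predicted probability at each unverified location, grouped by street block. The field names (`block_id`, `verified`, `is_lead`, `p_lead`) and the toy values are illustrative assumptions, not the utility's schema or actual model outputs.

```python
from collections import defaultdict

def expected_lead_per_block(parcels):
    """Sum verified lead counts and predicted lead probabilities by street block."""
    totals = defaultdict(float)
    for parcel in parcels:
        if parcel["verified"]:
            # Verified locations contribute a certain 1 (lead) or 0 (non-lead).
            totals[parcel["block_id"]] += 1.0 if parcel["is_lead"] else 0.0
        else:
            # Unverified locations contribute their predicted probability of lead.
            totals[parcel["block_id"]] += parcel["p_lead"]
    return dict(totals)

# Hypothetical parcels for two street blocks.
parcels = [
    {"block_id": "B1", "verified": True, "is_lead": True, "p_lead": None},
    {"block_id": "B1", "verified": False, "is_lead": None, "p_lead": 0.25},
    {"block_id": "B1", "verified": False, "is_lead": None, "p_lead": 0.5},
    {"block_id": "B2", "verified": True, "is_lead": False, "p_lead": None},
]
lead_by_block = expected_lead_per_block(parcels)
print(lead_by_block)
```

In this toy case, block B1 has an expected capture of 1.75 lead service lines (one verified plus 0.25 and 0.5 from the unverified parcels).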

FIGURE 5. Expected Lead Service Line Capture per Street Block.

4.3. Probability of meter failure
To characterize the condition of the water meters, an age-based likelihood-of-failure model is implemented. A more sophisticated model of water meter risk is beyond the scope of this research; our goal here is a method to estimate the number of prevented meter failures when selecting a replacement project area. A failure likelihood model is convenient because it is scaled between 0 and 1, and higher likelihoods directly translate to a higher risk value. The probability model we use is presented in the case study by Lund (1988), which uses an exponential distribution to characterize failure likelihood. The exponential model estimates the probability of failure of an asset, P(t), where t denotes the asset age measured in years. The model is presented in equation (2) below.
$$P(t) = 1 - e^{-0.01t} \tag{2}$$
Equation (2) specifies that older meters are more likely to fail. The in-service date of individual meters is available as an embedded attribute in the water meter shapefile. We use this information to compute the age of each active residential water meter in years (t). To estimate the total number of failed meters that can be avoided when selecting a given project area, we sum the failure probabilities of all the individual meters contained in the area. The distribution of estimated failed meters per street block is shown in Figure 6 below.
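
A short sketch of equation (2) and the per-area summation follows. The 0.01 rate comes from equation (2); the function names and the example meter ages are assumptions made for illustration only.

```python
import math

FAILURE_RATE = 0.01  # per-year rate in equation (2)

def failure_probability(age_years):
    """Probability of failure P(t) = 1 - exp(-0.01 t) for a meter of age t years."""
    return 1.0 - math.exp(-FAILURE_RATE * age_years)

def expected_failures(meter_ages):
    """Expected number of failed meters in an area: the sum of P(t) over its meters."""
    return sum(failure_probability(age) for age in meter_ages)

# Hypothetical meter ages (years) within one street block.
block_ages = [5, 20, 40]
print(round(expected_failures(block_ages), 3))
```

Because P(t) increases with age, blocks with older meters accumulate a higher expected failure count and become more attractive under the meter objective.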

FIGURE 6. Expected Meter Failure Count per Street Block.

4.4. Geographic neighborhoods
Census tracts are used to delineate the larger neighborhoods within which the street blocks are contained. They were selected because they are contiguous spaces that each encompass roughly the same population (1,200–8,000 people) and are large enough to divide the service area of Dearborn into a small number of discrete regions. The spatial database of census tracts is publicly available from the US Census Bureau (US Census Bureau, 2019). Figure 7 below shows the boundary of each census tract overlaid with the street blocks. There are 24 unique census tracts, each containing an average of 86 street blocks.

FIGURE 7. Census Tract (Neighborhoods) Locations for the City of Dearborn.

5. RESULTS AND DISCUSSION

In this section, we present the project area selection results for the City of Dearborn case study. To demonstrate how optimization models can incorporate varying degrees of practical planning constraints, three versions of model (1) are considered.
1) Baseline: This model consists of only equations 1(a)–1(d). The only constraint is the budget on the number of homes that can be selected as part of the capital program. No geographic restrictions are specified.
2) Geographic Minimums: This model adds equation 1(e) to the baseline, which specifies a minimum number of homes that must be selected for replacement in each census tract. In effect, this avoids spatial concentration of projects and ensures that selections are spread across the entire distribution area.
3) Geographic Minimums and Compactness: This is the full model described by equations 1(a)–1(j). Beyond enforcing a minimum number of homes per census tract, an additional constraint requires that the selected blocks in each tract all be within a certain distance of each other. In effect, this requires the model to identify a cluster of street blocks to address within each census tract to reduce mobilization.
The epsilon-constraint method was implemented to identify the Pareto-efficient frontier for each model, and the results were compared. The three models progress in complexity, with more constraints added each time, so the overall performance worsens with each step (fewer total assets removed). By contrasting the solutions identified by each model, we can also quantify the tradeoffs of including each consideration. This is done by measuring the degree to which the objectives worsen (how many fewer lead service lines removed, how many fewer failed meters replaced) in exchange for a more practical and/or lower-cost deployment of resources (selected areas cover the whole site, all areas are compact).
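
To illustrate the epsilon-constraint idea, the sketch below traces a frontier for a toy version of the baseline model: it repeatedly maximizes lead removal subject to the parcel budget and a floor (epsilon) on meter removal, raising the floor after each solve. Brute-force enumeration stands in here for the integer programming solver, which is only workable for a handful of blocks; the block data, field names, and step size are assumptions.

```python
from itertools import combinations

def best_selection(blocks, budget, min_meters=0):
    """Maximize lead removal subject to the budget and a floor on meter removal."""
    best = None
    ids = list(blocks)
    for r in range(len(ids) + 1):
        for combo in combinations(ids, r):
            homes = sum(blocks[b]["homes"] for b in combo)
            lead = sum(blocks[b]["lead"] for b in combo)
            meters = sum(blocks[b]["meters"] for b in combo)
            if homes > budget or meters < min_meters:
                continue
            # Ties on lead are broken by meters so weakly dominated points are skipped.
            if best is None or (lead, meters) > best[:2]:
                best = (lead, meters, frozenset(combo))
    return best

def epsilon_constraint_frontier(blocks, budget, step=1):
    """Raise the meter floor after each solve until no feasible selection remains."""
    frontier, eps = [], 0
    while True:
        solution = best_selection(blocks, budget, min_meters=eps)
        if solution is None:
            return frontier
        frontier.append(solution)
        eps = solution[1] + step  # demand strictly more meter removal next time

# Toy street blocks with integer objective values (made up for illustration).
blocks = {
    "A": {"homes": 10, "lead": 8, "meters": 2},
    "B": {"homes": 10, "lead": 5, "meters": 5},
    "C": {"homes": 10, "lead": 1, "meters": 7},
    "D": {"homes": 10, "lead": 6, "meters": 3},
}
frontier = epsilon_constraint_frontier(blocks, budget=20)
for lead, meters, chosen in frontier:
    print(lead, meters, sorted(chosen))
```

The first point on the returned frontier is the lead-optimized boundary solution and the last is the meter-optimized one, mirroring the boundary points discussed for each model.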
For simplicity, we only considered the threshold where a maximum of 3% of parcels can be selected. This roughly corresponds to a 3-year planning horizon for the city, assuming a 1% annual replacement rate. In practice, this threshold can be adjusted to the proportion of homes the utility plans to include in the capital program. The 3% limit corresponds to roughly 886 parcels in total across the selected street blocks; it is assumed that every home in a chosen street block will be included in the capital program. Similarly, we specify a minimum of 30 homes per census tract for the geographic minimums constraint 1(e) and an upper limit of 1,000 feet of separation between street blocks to enforce compactness in constraints 1(f)–1(j). These thresholds were selected as feasible and realistic values for a capital program and are used to highlight the use of multi-objective models for capital planning. In practice, these model parameters can be adjusted to reflect local conditions.
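
The thresholds above can be checked against a candidate selection with a few helper functions, sketched below. This is not the formulation of constraints 1(e)–1(j) itself: it assumes compactness is measured as straight-line distance between block centroids in feet, and all block data and names are hypothetical.

```python
import math
from itertools import combinations

def parcel_budget(total_parcels, fraction):
    """Largest whole number of parcels within the given fraction of the system."""
    return math.floor(total_parcels * fraction)

def meets_geographic_minimum(selected, blocks, tracts, min_homes):
    """True if every census tract receives at least min_homes selected homes."""
    homes = {tract: 0 for tract in tracts}
    for b in selected:
        homes[blocks[b]["tract"]] += blocks[b]["homes"]
    return all(count >= min_homes for count in homes.values())

def is_compact(selected, blocks, max_feet):
    """True if all selected blocks in the same tract lie within max_feet of each other."""
    by_tract = {}
    for b in selected:
        by_tract.setdefault(blocks[b]["tract"], []).append(blocks[b]["xy"])
    return all(
        math.dist(p, q) <= max_feet
        for points in by_tract.values()
        for p, q in combinations(points, 2)
    )

# Toy street blocks: census tract, home count, centroid in feet (all made up).
blocks = {
    "A": {"tract": "T1", "homes": 20, "xy": (0.0, 0.0)},
    "B": {"tract": "T1", "homes": 15, "xy": (600.0, 800.0)},
    "C": {"tract": "T2", "homes": 30, "xy": (5000.0, 0.0)},
    "D": {"tract": "T1", "homes": 12, "xy": (0.0, 1500.0)},
}
tracts = ["T1", "T2"]

budget = parcel_budget(29559, 0.03)  # 3% of the 29,559 residential parcels
print(budget)
print(meets_geographic_minimum({"A", "B", "C"}, blocks, tracts, min_homes=30))
print(is_compact({"A", "D", "C"}, blocks, max_feet=1000))
```

Applying the 3% fraction to the 29,559 parcels reproduces the budget of 886 parcels used in the case study.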
Figure 8 shows the identified Pareto-efficient frontiers for each of the solved models; the solution located at the midpoint of each frontier is bolded. The X-axis shows the optimal value of objective 1 (lead service line removal) and the Y-axis shows the optimal value of objective 2 (failed meter removal). It is evident that the baseline model with no geographic constraints far outperforms the other two models, as seen in the large gap between its Pareto frontier and the others. There are 36 identified solutions in the Pareto-efficient frontier of the ‘baseline’ model, 25 solutions in the ‘geographic minimums’ model, and 13 solutions in the full ‘geographic minimums and compactness’ model. The solution set of each model is discussed individually below before a comparison across models.
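
For readers unfamiliar with Pareto efficiency, the sketch below filters a list of (lead removal, meter removal) outcomes down to the non-dominated ones and picks a midpoint. Taking the middle element of the sorted frontier is an assumed definition of "midpoint". The first three candidate points are the baseline values reported in this section; the last two are made-up dominated points added for illustration.

```python
def pareto_frontier(points):
    """Keep the points not dominated when both objectives are maximized."""
    unique = set(points)
    frontier = [
        p for p in unique
        if not any(q != p and q[0] >= p[0] and q[1] >= p[1] for q in unique)
    ]
    return sorted(frontier)  # ascending lead removal, hence descending meter removal

def midpoint_solution(frontier):
    """Middle element of the sorted frontier (an assumed definition of 'midpoint')."""
    return frontier[len(frontier) // 2]

candidates = [(823, 528), (768, 562), (687, 571), (700, 540), (650, 500)]
frontier = pareto_frontier(candidates)
mid = midpoint_solution(frontier)
print(frontier)
print(mid)
```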

FIGURE 8. Pareto Efficient Frontier - Objective Value Comparison with Different Constraints (midpoint solution bolded).

We first focus on the ‘baseline’ model. The boundary points along the Pareto-efficient frontier represent the outcome when only one objective is considered in the optimization. The left-most point is the solution when only failed meter removal is considered (equation 1(b)) and the right-most point is the solution when only lead service lines are considered (equation 1(a)). Optimizing for lead service lines alone produces a selection of street blocks that removes 823 lead pipes and 528 failed meters, whereas optimizing for meters alone removes 687 lead pipes and 571 failed meters. To quantify the difference in the selected street blocks, we can treat the two optimal solutions as sets of street blocks and use the Jaccard similarity index. The Jaccard index measures the degree of overlap between two sets on a scale of 0–1, defined as the proportion of overlapping items relative to the total count of unique items (see equation (3) below). Typically, Jaccard index values above 0.6 represent similar sets and values below 0.4 represent dissimilar sets. The two boundary solutions have a Jaccard similarity of 0.39, illustrating the large difference in selected areas when only one objective is considered.
$$J(A, B) = \frac{|A \cap B|}{|A \cup B|} = \frac{\text{number of overlapping items across two sets}}{\text{total number of items across two sets}} \tag{3}$$
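
Equation (3) translates directly into set operations. The sketch below uses hypothetical street block identifiers; returning 1.0 for two empty sets is a convention chosen here, not something specified in the text.

```python
def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| between two collections of block IDs."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 1.0  # convention for two empty selections (an assumption)
    return len(a & b) / len(union)

# Hypothetical selections of street blocks from two single-objective solutions.
lead_optimized = {"B1", "B2", "B3"}
meter_optimized = {"B2", "B3", "B4", "B5"}
similarity = jaccard(lead_optimized, meter_optimized)
print(similarity)
```

Here two of the five distinct blocks are shared, giving a similarity of 0.4, which sits at the boundary of what the text treats as dissimilar sets.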
Based on the two boundary points, the decrease in lead pipe capture (136, 16.5% lower) is much larger than the decrease in failed meters (43, 7.5% lower). This implies that improving objective 2 comes at a larger expense to objective 1, as objective 1 is more sensitive to performance decreases. The reason for the large tradeoff in objective 1 (lead pipes) relative to objective 2 (meters) is intuitive after comparing the spatial distributions of these assets in Figure 5 and Figure 6. Lead service lines are spatially concentrated in just a few neighborhoods of the distribution area, whereas faulty meters are more evenly scattered throughout the system. As a result, any deviation away from the regions where lead pipes are found will greatly decrease model performance, whereas many different geographic configurations of street blocks can produce similar meter removal. The mid-point of the Pareto frontier is a proxy for the outcome when the two objectives are evenly weighted; in this scenario the selection captures 768 lead pipes and 562 bad meters. Here the performance tradeoffs between the two objectives are much smaller: only 55 (6.7%) fewer lead pipes selected compared to the highest value possible, and 8 (1.4%) fewer bad meters. In terms of the Jaccard similarity index, the mid-point solution has a similarity of 0.571 to the case where only lead pipes are optimized and 0.703 to the case where only meters are optimized.
To provide practical context, given that we assume there is one meter and one service line per household, any difference in the count of assets removed is approximately the number of additional home visits the utility crew must make. For example, if one solution captures 100 fewer lead service lines, then the city must allocate an additional 100 removals at other homes to make up the difference. Therefore, the mid-point along the Pareto frontier represents a more cost-efficient program since it greatly reduces the truck roll needed to remove a high number of both lead pipes and meters at the same time. If we only considered the single-objective outcomes, a utility would have to visit up to 136 additional homes to achieve optimal removal of lead pipe, versus just 55 additional homes if the mid-point outcome was used. Similarly, for bad meters a utility would need up to 43 additional deployments under the single-objective outcomes, versus just 8 under the mid-point outcome.
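
The home-visit arithmetic for the lead objective can be reproduced from the baseline values reported above. The helper name is hypothetical, and the one-visit-per-asset equivalence is the same simplifying assumption made in the text.

```python
def shortfall(best_value, achieved_value):
    """Return the gap to the single-objective optimum and its share in percent."""
    gap = best_value - achieved_value
    return gap, round(100.0 * gap / best_value, 1)

LEAD_OPTIMUM = 823           # lead pipes removed when only lead is optimized
LEAD_AT_METER_OPTIMUM = 687  # lead pipes removed when only meters are optimized
LEAD_AT_MIDPOINT = 768       # lead pipes removed at the frontier mid-point

gap_single, pct_single = shortfall(LEAD_OPTIMUM, LEAD_AT_METER_OPTIMUM)
gap_mid, pct_mid = shortfall(LEAD_OPTIMUM, LEAD_AT_MIDPOINT)
print(gap_single, pct_single)
print(gap_mid, pct_mid)
```

Each unit of shortfall corresponds to roughly one extra home visit, so the mid-point selection cuts the additional lead-related visits from 136 to 55.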
Focusing on the ‘geographic minimums’ model next, when lead service line removals are optimized alone, the optimal selection of street blocks captures 660 lead pipes and 507 bad meters. In contrast, when only faulty meter removal is optimized, the selected street blocks remove 469 lead pipes and 548 meters. The two solutions have a Jaccard similarity of 0.194, meaning that less than one-fifth of the solution street blocks overlap.
As in the ‘baseline’ model, the removal of lead service lines is much more sensitive to drops in performance than the removal of faulty meters. The tradeoff between the two boundary points is 191 (28.9%) fewer lead pipes in exchange for 41 (7.6%) additional meters, and vice versa. The mid-point solution along the Pareto-efficient frontier results in a selection of blocks that removes 603 lead pipes (57, or 8.6%, fewer than the optimum) and 535 failed meters (13, or 2.3%, fewer than the optimum). The Jaccard similarity of this selection is 0.355 to the solution optimizing only lead pipes and 0.476 to the solution optimizing only meters. This represents a significant improvement, since the mid-point has over a third overlap with the lead-optimized solution and almost a half overlap with the meter-optimized one. The reason for the difference in sensitivity is also the same: lead pipes are highly concentrated in just a small handful of census tracts. Reweighting the model to focus more on bad meters, which are much more evenly dispersed across the map, leads to large tradeoffs in the number of removed lead pipes. Translating from objective values to crew mobilizations, the mid-point solution can potentially save the city 131 home mobilizations to remove lead pipes and 28 deployments to remove meters in order to achieve optimal removal.
Finally, turning to the ‘geographic minimums and compactness’ model, the boundary points along the Pareto frontier indicate that a lead-optimized solution will select street blocks that remove 627 lead pipes and 501 meters. The meter-optimized solution, in contrast, will select street blocks that remove 469 lead pipes and 532 meters. This amounts to a decrease of 158 (25.2%) in lead capture and 31 (5.8%) in failed meter removal when comparing the two single-objective solutions. The Jaccard similarity index between them is just 0.176. At the mid-point solution, the block selections remove 603 lead pipes and 519 bad meters. This again represents a significantly smaller reduction than that between the two boundary points: only 24 (3.8%) fewer than the lead-optimized count and 13 (2.7%) fewer than the meter-optimized count. It is interesting to note that, because blocks with high counts of lead service lines and bad meters are already geographically clustered, the compactness constraint does not change solution performance by a large margin. Translating the changes in objective values to potential crew mobilizations, the mid-point solution can save the city up to 134 home deployments for lead pipes and 18 deployments for meters.
Figure 9 below shows the optimal selection of street blocks under the different scenarios considered; the selection representing the mid-point of the Pareto-efficient frontier is visualized for each. Table 2 compares the performance of these three scenarios.

FIGURE 9. Street Block Selections - Solution Comparison with Different Constraints.

TABLE 2. Street Block Selections - Performance Comparison with Different Constraints.

| Scenario | Number of parcels selected | Number of street blocks selected | Objective 1: lead service line removal | Objective 2: failed meter removal |
| --- | --- | --- | --- | --- |
| Baseline | 886 | 166 | 768 | 562 |
| Geographic Minimums | 886 | 111 | 603 | 535 |
| Geographic Minimums and Compactness | 879 | 78 | 603 | 518 |

One interesting observation is that the outcome in each scenario is at or just under the maximum number of allowable parcels (886); however, the number of street blocks decreases as more constraints are added. The ‘baseline’ model has no geographic constraints and the solution under consideration selects 166 street blocks for inclusion in the capital program, whereas the ‘geographic minimums and compactness’ scenario selects fewer than half as many, at just 78 street blocks. Performance across both objectives also decreases as more constraints are introduced. The ‘baseline’ scenario far outperforms the other two scenarios in terms of lead removal, with almost a 27% increase, and marginally improves on failed meter removal, with a 5% increase. The reasoning behind these patterns was discussed earlier in this section: with lead service lines primarily concentrated in just a few areas of the city, imposing constraints that force system-wide project selection will greatly reduce performance. Taking the trends in street block count and objective performance together, the empirical results suggest a tradeoff between the removal count of target assets (in turn, the effectiveness of a capital program) and cost. The ‘baseline’ model is the most precise, selecting many smaller street blocks and removing many target assets, but the mobilization and truck roll costs of this program are much higher because the small blocks are spread out geographically. In contrast, the ‘geographic minimums and compactness’ constraints reduce the mobilization cost because all selected projects within the same census tract are closely located. These types of projects can be completed most efficiently, with fewer truck rolls, but at the expense of removing fewer bad assets. These findings highlight the power of optimization modelling to evaluate different scenarios in the context of program planning, as well as the ability of the proposed framework to bundle household replacement projects in a cost-efficient manner.
The findings of this section are summarized below.
• The optimal solution at the midpoint of the Pareto-efficient frontier represents an effective balance between the two objectives and does not result in substantial differences in the selected project areas relative to the single-objective cases. Using the solutions along the middle of the Pareto-efficient frontier can thus significantly reduce the number of potential home deployments needed to achieve optimal removal of both lead service line and degraded meter assets.
• The resulting number of lead service lines captured is much more sensitive to tradeoffs when balanced against the need to remove bad meters. This is because lead service lines are mostly clustered together in just a few areas, whereas faulty meters are much more evenly spread out across the system.
• Including the geographic minimum constraint greatly reduces the performance of both objectives. This is because there are census tracts with low counts of lead pipes and bad meters, but the model is still required to allocate a minimum selection there.
• The models are less sensitive to the compactness constraint. This is because street blocks with high counts of lead pipes and bad meters are already close to each other within a given neighborhood, so the optimization naturally selects proximate areas without needing a constraint.
• The empirical results suggest a tradeoff exists between program quality and cost. The baseline model removes the most assets of interest but selects a high number of small street blocks for the program, which can increase mobilization costs. In contrast, the model with geographic and compactness constraints selects fewer than half as many street blocks, but the removal of target assets is also lower.
Possible extensions of the research presented in this paper include the application of more sophisticated methods for estimating the risk levels of individual asset classes. Examples include higher-accuracy models for locating degraded meters and the incorporation of demographic information to better identify populations at high risk of lead exposure. There are also possible extensions to the optimization model that would reflect other planning considerations, e.g., the existence of construction moratoriums in certain neighborhoods, or cost functions that vary by type of home based on age and size rather than assuming uniform cost. Additional considerations of equity can also be incorporated into the modelling to ensure that utility resources adequately target the demographics most in need. As more constraints and variables are introduced to the integer programming formulation, the tradeoffs between computational tractability and model complexity need to be examined.

6. CONCLUSION
Managing the aging infrastructure of water distribution systems with limited funding means that many operators need to maximize the return on any capital investment. Designing an effective asset management program is, as a result, a resource allocation problem. The objective is to identify areas where vulnerable assets are located and target the deployment of capital dollars to address the problematic areas. The main issue for many municipalities is that there are many classes of infrastructure (water mains, valves, meters, service lines) that each need renewal, and it is cost-ineffective to plan replacement programs that target only one class. Truck roll (or mobilization), defined as the deployment of crew and equipment to a project area, is a significant cost in any capital improvement program, and planning renewals jointly can greatly reduce it. Therefore, selecting capital improvement projects that target multiple asset classes together can reduce mobilization costs and help utilities stretch their limited budgets further.
To our knowledge, no previous work has demonstrated the application of optimization modelling and geospatial methods to the infrastructure project planning of assets at the individual household resolution. The problem is framed as a dual-objective integer programming model, where the selection of project areas aims to maximize lead service line removal and water meter changeout together. A case study for the joint renewal planning of service lines and water meters is presented. These two infrastructure classes were selected due to the availability of data. To best reduce mobilization costs in practice, projects with similar outage times and required crew/equipment are best suited for joint work (e.g., meter replacements and service line material inspections, or service line and water main renewals). To provide additional efficiency and delineate individual project areas, we group adjacent land parcels together into larger street blocks. Selecting street blocks is more efficient than selecting individual homes since it allows more geographically compact replacement projects to be performed.
Since a multi-objective optimization model is specified, the optimal solution to the problem is represented by a Pareto-efficient frontier. Each point along the frontier represents a unique selection of street blocks to be included in a potential asset management program, and the different solutions represent different weightings of the two objectives. Empirical data suggest that the multi-objective approach can identify effective project selections that also significantly reduce duplicate visits to the same home. Furthermore, the sensitivity of a given objective to tradeoffs is highly dependent on the spatial distribution of the target assets. In our case study, lead service lines are spatially concentrated in only a limited number of areas and, as a result, are more sensitive to performance decreases when balanced against the competing objective of removing bad meters. The spatial concentration of lead pipes also means that any constraints enforcing broad spatial selection of projects, potentially due to political concerns, will greatly reduce removal performance.
Altogether, our research demonstrates the potential of dual-objective modelling to guide replacement planning of household-level water distribution assets and to generate more cost-efficient programs that protect critical water infrastructure. In practice, the methods explored here need to be incorporated within broader infrastructure decision frameworks that account for local regulations and the needs of all asset types. Water utilities own various infrastructure types (e.g., distribution mains, water towers, pumps, valves) that each have their own replacement and maintenance needs. The research here focuses specifically on the alignment of household-level projects, but the efficacy of capital programs can be greatly improved when assets of different resolutions are considered together.

For example, an additional reduction in project mobilization cost can be achieved if service line replacements (household level) are aligned with adjacent distribution main renewals (street level), since both require excavation. Larger planning frameworks are needed to accurately weigh the needs of different infrastructure types. Once the best portfolio of assets to target is determined, it can be incorporated into downstream decision models that select areas with maximal alignment to reduce cost. Beyond the physical condition of the infrastructure itself, local regulations that may influence the geographic location of projects also need to be included in the decision framework. Common examples include project moratoriums to avoid frequent disruption of the same neighborhood, incentivized alignment with other departments in their project areas (e.g., water department main replacements combined with transportation department resurfacing of the same road), and renewal targets specified by state agencies (e.g., a 5% replacement rate of lead service lines per year).
In the case of water meter and lead service line replacements, the methods developed here help prioritize neighborhoods for the joint renewal of these assets. However, state regulatory bodies may specify a minimum lead pipe replacement rate due to the urgent public health risk, and local regulation may prevent excavation of underground pipes within 5 years of a previous project. These factors combined may skew the selected infrastructure projects to prioritize lead pipes over meters, and may also influence when certain neighborhoods can be addressed. To the best of our knowledge, broader planning frameworks like this are typically executed by relying on expert judgement, although this is an active area of research in the infrastructure planning domain. The exploration of methods to determine the best mix of asset types to target, and of similar optimization methods to align projects spanning different spatial resolutions, can provide value to utility decision-makers and presents a meaningful direction for future work.

Acknowledgements
We would like to thank the City of Dearborn, MI, for agreeing to the use and showcasing of its system data for the implementation of this research. We would also like to thank Mr. Eric Roggow, CMMS Program Manager for the Department of Public Works at the City of Dearborn, for his efforts in compiling the relevant datasets needed for this research. The models and risk results presented in this research are hypothetical, based on data provided by the city, and carry uncertainty; they do not necessarily reflect the true condition of the distribution system assets.

Conflict of Interest

This work was funded by Xylem, Inc., which is developing products related to the research described in this paper. The independence of this work is reviewed and approved in accordance with Xylem Inc.’s policy on objectivity in research. The opinions and views expressed are those of the researchers and do not necessarily reflect those of the sponsors.

Data Availability Statement
The data that support the findings of this study are available from the City of Dearborn, MI. Restrictions apply to the availability of these data, which were used under license for this study. Data are available from the authors with the permission of the City of Dearborn, MI.

References
Abernethy, J., Anderson, C., Rauh, A., Schwartz, E., Stroud, J., Tan, X., & Webb, J. (2016). Flint water crisis: Data-driven risk assessment via residential water testing. Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1407–1416.
Ana, E. V., & Bauwens, W. (2007). Sewer network asset management decision-support tools: A review. International Symposium on New Directions in Urban Water Management, September, 1–8. http://www2.gtz.de/Dokumente/oe44/ecosan/en-sewer-network-decision-making-tool-2007.pdf
The American Water Works Association (AWWA). (2010). J100-10 risk and resilience management of water and wastewater systems. Denver, CO.
Berardi, L., Giustolisi, O., Savic, D. A., & Kapelan, Z. (2009). An effective multi-objective approach to prioritisation of sewer pipe inspection. Water Science and Technology, 60(4), 841–850. https://doi.org/10.2166/wst.2009.432
Berry, J. W., Fleischer, L., Hart, W. E., Phillips, C. A., & Watson, J.-P. (2005). Sensor placement in municipal water networks. ASCE J. Water Resour. Plan. Manag., 131(3), 237–243. http://link.aip.org/link/?QWR/131/237/1
Carey, B. D., & Lueke, J. S. (2013). Optimized holistic municipal right-of-way capital improvement planning. Canadian Journal of Civil Engineering, 40(12), 1244–1251. https://doi.org/10.1139/cjce-2012-0183
Chen, T., & Guestrin, C. (2016). XGBoost: A scalable tree boosting system. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 785–794.
Chen, T. Y., Man, C., & Daly, C. M. (2021). Optimizing cluster selections for the replacement planning of water distribution systems. AWWA Water Science, 3(4). https://doi.org/10.1002/aws2.1230
Chen, T. Y., Washington, V. N., Aven, T., & Guikema, S. D. (2020b). Review and evaluation of the J100-10 risk and resilience management standard for water and wastewater systems. Risk Analysis, 40(3), 608–623. https://doi.org/10.1111/risa.13421
Chen, T. Y., Beekman, J. A., David Guikema, S., & Shashaani, S. (2019). Statistical modeling in absence of system specific data: Exploratory empirical analysis for prediction of water main breaks. Journal of Infrastructure Systems, 25(2). https://doi.org/10.1061/(ASCE)IS.1943-555X.0000482
Chen, T. Y., Riley, C. T., Van Hentenryck, P., & Guikema, S. D. (2020a). Optimizing inspection routes in pipeline networks. Reliability Engineering and System Safety, 195, 106700. https://doi.org/10.1016/j.ress.2019.106700
Chen, T. Y., Vladeanu, G., Yazdekhasti, S., & Daly, C. M. (2022). Performance evaluation of pipe break machine learning models using datasets from multiple utilities. Journal of Infrastructure Systems, 28(2). https://doi.org/10.1061/(asce)is.1943-555x.0000683
Chojnacki, A., Dai, C., Farahi, A., Shi, G., Webb, J., Zhang, D. T., Abernethy, J., & Schwartz, E. (2017). A data science approach to understanding residential water contamination in Flint. Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1407–1416. https://doi.org/10.1145/3097983.3098078
Dandy, G. C., & Engelhardt, M. O. (2006). Multi-objective trade-offs between cost and reliability in the replacement of water mains. Journal of Water Resources Planning and Management, 132(2), 79–88. https://doi.org/10.1061/(ASCE)0733-9496(2006)132:2(79)
de Monsabert, S., Ong, C., & Thornton, P. (1999). An integer program for optimizing sanitary
752 sSewer rRehabilitation oOver a pPlanning hHorizon. Water Environment Research,
753 71(7), 1292–1297. https://doi.org/10.2175/106143096x122429
27
754 Dridi, L., Mailhot, A., Parizeau, M., & Villeneuve, J. P. (2009). Multiobjective aApproach for pPipe
755 rReplacement bBased on Bayesian iInference of bBreak mModel pParameters. Journal
756 of Water Resources Planning and Management-Asce, 135(5), 344–354.
757 https://doi.org/10.1061/(ASCE)0733-9496(2009)135:5(344)
758 US Environmental Projection Agency (EPA). (2019). Revised Lead and cCopper rRule. Washington DC.
759 Fontanazza, C. M., Notaro, V., Puleo, V., & Freni, G. (2015). The apparent losses due to metering errors:
760 Aa proactive approach to predict losses and schedule maintenance. Urban Water
761 Journal, 12(3), 229–239. https://doi.org/10.1080/1573062X.2014.882363
762 Ganjidoost, A., Vladeanu, G., & Daly, C. M. (2022). Leveraging risk and data analytics for sustainable
763 management of buried water infrastructure. AWWA Water Science, 4(2).
764 https://doi.org/10.1002/aws2.1283
765 Haimes, Y. Y., Lasdon, L. S., & Wismer, D. A. (1971). On a bicriterion formulation of the problems of
766 integrated identification and system optimization. IEEE Transactions on Systems, Man
767 and Cybernetics, SMC-1(3), 296–297.
768 https://ieeexplore-ieee-org.afit.idm.oclc.org/stamp/stamp.jsp?tp=&arnumber=4308298
769 Hajiseyedjavadi, S., Karimi, H. A., & Blackhurst, M. (2022). Predicting lead water service lateral
770 locations: Geospatial data science in support of municipal programming. Socio-
771 Economic Planning Sciences. https://doi.org/10.1016/j.seps.2022.101277, 82, 101277
772 Kerwin, S., & Adey, B. T. (2020a). Pipes or pumps? The use of cost-benefit analysis in investment
773 decision-making for public water infrastructure. Life-Cycle Civil Engineering: Innovation,
774 Theory and Practice - Proceedings of the 7th International Symposium on Life-Cycle Civil
775 Engineering, IALCCE 2020, 1143–1150. https://doi.org/10.1201/9780429343292-151
776 Kerwin, Sean, & Adey, B. T. (2020b). Optimal iIntervention pPlanning: A bBottom-uUp aApproach to
777 rRenewing aAging wWater iInfrastructure. Journal of Water Resources Planning and
778 Management, 146(7). https://doi.org/10.1061/(asce)wr.1943-5452.0001217
779 Kim, J., Baek, C., Jo, D., Kim, E., & Park, M. (2004). Optimal planning model for rehabilitation of water
780 networks. Water Science and Technology, 4(3), 133–148.
781 Kleiner, Y., Nafi, A., & Rajani, B. (2010). Planning renewal of water mains while considering
28
782 deterioration, economies of scale and adjacent infrastructure. Water Science and
783 Technology: Water Supply, 10(6), 897–906. https://doi.org/10.2166/ws.2010.571
784 Kleiner, Y., & Rajani, B. (2001). Comprehensive rReview of sStructure dDeterioration of wWater
785 mMains: Statistical mModels. Urban Water, 3(3), 151–164.
786 Konstantinou, C., & Stoianov, I. (2020). A comparative study of statistical and machine learning
787 methods to infer causes of pipe breaks in water supply networks. Urban Water Journal,
788 17(6), 534–548. https://doi.org/10.1080/1573062X.2020.1800758
789 Lawler, E. L., & Wood, D. E. (1966). Branch-and-bound methods: A sSurvey. Operations Research, 14(4),
790 699–719. https://doi.org/10.1098/ROYAL/
791 Lund, J. R. (1988). Metering utility services: Evaluation and maintenance. Water Resources Research,
792 24(6), 802–816. https://doi.org/10.1029/WR024i006p00802
793 Madrigal, A. C. (2019). How a fFeel-gGood AI sStory wWent wWrong in Flint. The Atlantic, 1–14.
794 https://www.theatlantic.com/technology/archive/2019/01/how-machine-learning-
795 found-flints-lead-pipes/578692/?
796 utm_medium=offsite&utm_source=google&utm_campaign=newsstand-technology
797 Martello, S., & Toth, P. (1990). Knapsack problems: Algorithms and computer implementations. Wiley.
798 https://doi.org/10.1007/springerreference_5701
799 Mohanakrishnan, J., Boyle, C., & Poff, J. G. (2019). Detecting and rResolving aApparent lLoss wWith
800 dData sScience. Journal - American Water Works Association, 111(2), 13–17.
801 https://doi.org/10.1002/awwa.1230
802 Muncie, H. L., Sobal, J., & DeForge, B. (2013). Search mMethodologies: Introductory tTutorials for
803 oOptimization and dDecision sSupport tTechniques. In Journal of fFamily pPractice
804 (Second, Vol. 28, Issue 1). Springer. https://doi.org/10.1515/9780823274161-004
805 Muñuzuri, J., Ramos, C., Vázquez, A., & Onieva, L. (2020). Use of discrete choice to calibrate a
806 combined distribution and sewer pipe replacement model. Urban Water Journal, 17(2),
807 100–108. https://doi.org/10.1080/1573062X.2020.1748205
808 Nafi, A., & Kleiner, Y. (2010). Scheduling renewal of water pipes while considering adjacency of
809 infrastructure works and economies of scale. Journal of Water Resources Planning and
29
810 Management, 136(5), 519–530. https://doi.org/10.1061/(ASCE)WR.1943-5452.0000062
811 Naoum-Sawaya, J., Ghaddar, B., Arandia, E., & Eck, B. (2015). Simulation-optimization approaches for
812 water pump scheduling and pipe replacement problems. European Journal of
813 Operational Research, 246, 293–306. https://doi.org/10.1016/j.ejor.2015.04.028
814 Osman, H., Ammar, M., & El-Said, M. (2017). Optimal scheduling of water network repair crews
815 considering multiple objectives. Journal of Civil Engineering and Management, 23(1),
816 28–36. https://doi.org/10.3846/13923730.2014.948911
817 Pecci, F., Abraham, E., & Stoianov, I. (2015). Mathematical programming methods for pressure
818 management in water distribution systems. Procedia Engineering, 119(1), 937–946.
819 https://doi.org/10.1016/j.proeng.2015.08.974
820 Puleo, V., Fontanazza, C. M., Notaro, V., De Marchis, M., La Loggia, G., & Freni, G. (2014). Definition of
821 water meter substitution plans based on a composite indicator. Procedia Engineering,
822 70, 1369–1377. https://doi.org/10.1016/j.proeng.2014.02.151
823 St. Clair, A. M., & Sinha, S. (2012). State-of-the-technology review on water pipe condition,
824 deterioration and failure rate prediction models! Urban Water Journal, 9(2), 85–112.
825 https://doi.org/10.1080/1573062X.2011.644566
826 Tscheikner-Gratl, F., Caradot, N., Cherqui, F., Leitão, J. P., Ahmadi, M., Langeveld, J. G., Le Gat, Y.,
827 Scholten, L., Roghani, B., Rodríguez, J. P., Lepot, M., Stegeman, B., Heinrichsen, A.,
828 Kropp, I., Kerres, K., Almeida, M. do C., Bach, P. M., Moy de Vitry, M., Sá Marques, A.,
829 … Clemens, F. (2019). Sewer asset management–state of the art and research needs.
830 Urban Water Journal, 16(9), 662–675. https://doi.org/10.1080/1573062X.2020.1713382
831 Tscheikner-Gratl, F., Egger, P., Rauch, W., & Kleidorfer, M. (2017). Comparison of multi-criteria
832 decision support methods for integrated rehabilitation prioritization. Water, 9(2).
833 https://doi.org/10.3390/w9020068
834 Tscheikner-Gratl, F., Sitzenfrei, R., Rauch, W., & Kleidorfer, M. (2016). Integrated rehabilitation
835 planning of urban infrastructure systems using a street section priority model. Urban
836 Water Journal, 13(1), 28–40. https://doi.org/10.1080/1573062X.2015.1057174
837 US Census Bureau. (2019). TIGER/lLine with sSelected dDemographic and eEconomic dData. Washington
30
838 DC.
839 Vladeanu, G. J., & Matthews, J. C. (2019). Consequence-of-fFailure mModel for rRisk-bBased aAsset
840 Management of Wastewater Pipes Using AHP. Journal of Pipeline Systems Engineering
841 and Practice, 10(2), 1–12. https://doi.org/10.1061/(asce)ps.1949-1204.0000370
842 Yazdandoost, F., & Izadi, A. (2018). An asset management approach to optimize water meter
843 replacement. In Environmental mModelling and sSoftware (Vol. 104, pp. 270–281).
844 https://doi.org/10.1016/j.envsoft.2018.03.015
845 Yee, M. D. (1999). Economic analysis for replacing residential meters. Journal of American Water Works
846 Association, 91(7), 72–77. https://doi.org/10.1002/j.1551-8833.1999.tb08666.x
847
31
