Version 1.1 6-5-17 DRAFT DRAFT DRAFT

CaDC Statewide Efficiency Explorer Methodology
Context: CaDC staff is proud to announce the completion of its first rapid assessment of the
residential component of Governor Brown's statewide efficiency targets. This assessment integrates
publicly available evapotranspiration, land use, service area boundary, aerial imagery, population
and water production data to estimate residential water efficiency targets for 404 of California's
409 major urban water retailers reporting in the latest supplier report.

This assessment provides a marked improvement over the previous CaDC parcel-based
methodology, which was previously the best statewide approximation publicly available online.
The calculations are shown via an interactive tool in which water users can input and analyze
various policy scenarios. This tool was developed for planning and education purposes as a
public service to support the water community in navigating the rapidly evolving statewide
policy discussions.

As described in the original grant agreement with the Water Foundation, "This interactive
planning tool empowers the California water community to analyze the impact of those
prospective efficiency standards under user selected scenarios with varying indoor or outdoor
efficiency standards." The CaDC partnership does not take water policy positions, as described
in the CaDC in-depth principles.

Purpose: This document details the CaDC’s methodology in conducting the first assessment of
statewide efficiency targets described in Governor Brown’s May 2016 Executive Order on
“Making Conservation a California Way of Life” (EO).1 The key excerpt from the EO is copied
below:

“These water use targets shall be customized to the unique conditions of each water
agency, shall generate more statewide water conservation than existing requirements,
and shall be based on strengthened standards for:

A. Indoor residential per capita water use;
B. Outdoor irrigation, in a manner that incorporates landscape area, local climate,
and new satellite imagery data;
C. Commercial, industrial, and institutional water use; and
D. Water lost through leaks.”

This high-level executive order is further specified in the draft report from the Department of
Water Resources (DWR) and State Water Resources Control Board (SWRCB).2 This CaDC
methodology whitepaper details how the CaDC Efficiency Explorer tool operationalizes each of
those EO standards to calculate water use targets. Each section describes the methodology in
this first assessment as well as opportunities for improvement in future iterations.

1 https://www.gov.ca.gov/docs/5.9.16_Attested_Drought_Order.pdf
2 http://www.water.ca.gov/wateruseefficiency/conservation/docs/EO_B-37-16_Report.pdf

1. Calculating a Water Use Target
A. “Indoor residential per capita water use” calculation
B. “Outdoor irrigation in a manner that incorporates landscape area, local climate, and new
satellite imagery” calculation
B.1 Landscape Area Measurement Methodology
B.2 Evapotranspiration Calculation
C. Commercial, Industrial and Institutional Water Use
D. Water lost through leaks

2. Calculating Applicable Urban Water Use

1. Calculating a Water Use Target
These targets will be set using the EO standards for indoor and outdoor water use efficiency.
The State’s draft report illustrates how the volumetric water use target will be calculated based
on those standards as shown below:

This initial deployment of the CaDC Efficiency Explorer focuses on residential indoor and
residential outdoor water use to set an initial target. The CaDC Efficiency Explorer tool
incorporates this target calculation for any user-selected time period.

In addition, the CaDC Efficiency Explorer tool is policy-neutral and enables users to input an
indoor GPCD and/or outdoor ET adjustment factor of their choosing. The user can select an
arbitrary time period and explore how these estimated targets compare to actual use over time.

Note that the Efficiency Explorer shows data only for the dates available from the supplier report.
For instance, if a supplier did not report for July through September 2016 and the user selects
that time period, that supplier will be omitted from the tool display.

In each user-defined scenario, the aggregate statewide residential target for the previous twelve
months is shown against the existing SBx7-7 target and the total residential usage for that same
period.3

The estimated error associated with each target is shown in the tool to highlight the imprecision
of this first assessment.4 Some suppliers have uniquely challenging circumstances, and those
have been assessed using the standardized qualitative rubric shown in Appendix 1.

This section of the methodology documentation details those calculations along with
opportunities for improvement in all four standards in the Governor’s Executive Order.5 Section
2 explains how this calculation is aligned with publicly available water production and use data.

1.A. “Indoor residential per capita water use” calculation

This first assessment calculates the residential indoor water use budget for each month as

$$\text{Residential Indoor Budget} = \text{Population} \times \text{GPCD} \times \text{Days in Period}$$
where
● Population is the population data reported to the state in the monthly supplier report
● GPCD is the residential indoor standard, representing the gallons per person per day
allocated for reasonable, efficient indoor use
● Days in Period is the number of days in the reporting period
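
The indoor budget formula above can be sketched in a few lines of Python. This is a minimal illustration rather than the Efficiency Explorer's actual code: the function name and the example inputs (50,000 residents, a 55 GPCD standard, a 30-day period) are hypothetical values chosen only to show the arithmetic.

```python
def residential_indoor_budget(population, gpcd, days_in_period):
    """Return the indoor residential budget in gallons for one reporting period."""
    if population < 0 or gpcd < 0 or days_in_period < 0:
        raise ValueError("population, gpcd and days_in_period must be non-negative")
    # Budget = Population x GPCD x Days in Period
    return population * gpcd * days_in_period


# Hypothetical inputs: 50,000 residents, a user-selected 55 GPCD standard, 30-day period.
example_budget = residential_indoor_budget(population=50_000, gpcd=55, days_in_period=30)
print(f"Indoor budget: {example_budget:,} gallons")
```

Because GPCD is a plain argument, the same function covers any user-selected indoor standard.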

Opportunities for improvement

● There exist unique local circumstances that may need additional customization, such as
seasonal population shifts in high tourism areas:
○ Such suppliers may need additional customizations through an indoor variance
process or supplemental data collection. This variance process would need to
have the ability to deal with other local economic conditions to accurately
measure efficiency.

3 Note the SBx7-7 target is an aggregate utility wide number so that is adjusted by the percent of
residential usage for the utility service area obtained from the monthly supplier report.
4 The calculation for that error is shown here:
https://github.com/California-Data-Collaborative/statewide-efficiency-error-model
5 A mathematical supplement detailing the calculation of errors associated with these water use
targets is available upon request.

1.B. “Outdoor irrigation in a manner that incorporates landscape area, local
climate, and new satellite imagery” calculation

The outdoor irrigation budget will be calculated for each month as follows:

$$\text{Outdoor Irrigation Budget} = LA \times ET_0 \times \text{ET Adjustment Factor} \times 0.62$$
where
● $LA$ is the landscape area measured within the utility service area as described in section
1.B.1
● $ET_0$ is the reference evapotranspiration, calculated for each agency and month as
described in section 1.B.2
● ET Adjustment Factor is the outdoor standard representing a percent of measured
evapotranspiration
● 0.62 is the unit conversion factor.6
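
As a rough illustration of the outdoor budget formula, the sketch below multiplies the four terms together. It assumes landscape area is in square feet and reference evapotranspiration is in inches for the month, which is what the 0.62 conversion factor implies. The function name, constant name and example inputs are hypothetical.

```python
GALLONS_PER_SQFT_PER_INCH = 0.62  # converts inches of ET to gallons per square foot


def outdoor_irrigation_budget(landscape_area_sqft, et0_inches, et_adjustment_factor):
    """Return the outdoor irrigation budget in gallons for one month."""
    return (landscape_area_sqft * et0_inches * et_adjustment_factor
            * GALLONS_PER_SQFT_PER_INCH)


# Hypothetical inputs: 1,000,000 sq ft of landscape area, 6.0 inches of monthly
# reference ET, and a user-selected ET adjustment factor of 0.8.
example_outdoor = outdoor_irrigation_budget(1_000_000, 6.0, 0.8)
print(f"Outdoor budget: {example_outdoor:,.0f} gallons")
```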

This water budget calculation uses landscape area and evapotranspiration data as follows.

1.B.1 Landscape Area Measurement Methodology
For this initial iteration, CaDC staff collaborated with Andrew Marx from Claremont Graduate
University (CGU) to develop the initial landscape area data. CGU uses remote sensing
technology and National Agriculture Imagery Program (NAIP) data to develop landscape area
measurements statewide.

NAIP contracts aircraft to fly states at least every two years, making it an invaluable public
resource for updating land cover classifications. The 2016 NAIP imagery is at 60 centimeter
resolution, an improvement over the historical NAIP resolution of 1 meter. The NAIP imagery
provides four spectral bands: red, green, blue and, critically, near infrared, which enables
vegetation differentiation such as irrigated turf versus artificial turf.

This imagery has not traditionally been used in detailed landscape classification due to its
relatively coarse spatial resolution of one meter. However, improvements to the 2016 NAIP’s
spatial resolution, as well as new analytical methods developed by Dr. Marx, have produced an
inexpensive and accurate way to estimate land cover across California.

6 0.62 is the conversion from inches of ET to gallons per square foot.

Developed at the Claremont Graduate University, CILA (the California Irrigable Landscape
Algorithm) produces accurate landscape classifications derived from NAIP imagery.
Classifications include ‘Irrigated Turf’, ‘Bushes and Trees’ and surfaces that would not receive a
water budget such as surfaces impervious to water (concrete, roofs), bare earth and non-
photosynthetically-active or dead vegetation (NPAV). CILA functions by looking not only at the
reflectance of the pixel in the four spectral bands of the NAIP imagery (pixel-based analysis),
but also through how the groupings of pixels are shaped (object-based) and how different they
are from their neighbors (texture-based).

Note the CaDC Efficiency Explorer tool is ultimately agnostic to landscape area data source and
could input data from future improved iterations, other statewide data sources, or local
landscape area datasets. In fact, the Efficiency Explorer tool has a neighborhood-level
counterpart that is deployed for CaDC agencies.

This initial CGU landscape area iteration measures what is indisputably included in that
landscape area calculation: photosynthetically active vegetation (PSAV) turf and PSAV trees
and shrubs. PSAV describes vegetation that is actively conducting photosynthesis, that is, healthy,
green vegetation. That data has been generated at a parcel level statewide and aligns with the
(Draft) state landscape area categories as follows:

Generalized Land Cover Classes | CGU Categories               | (Draft) State Categories from December Workgroup Meeting
-------------------------------+------------------------------+----------------------------------------------------------
Irrigable: Irrigated           | ● PSAV Turf                  | ● Green, irrigated turf
                               | ● PSAV Trees/Shrubs          | ● Trees/Shrubs
                               |                              | ● Pools
Irrigable                      |                              | ● Brown lawns
Ambiguous Areas (Sample)       |                              | ● Urban wildland interface
                               |                              | ● Drought tolerant gardens
                               |                              | ● Fire suppression
                               |                              | ● Greenspace covered with an awning
Neither                        | ● Impervious and other areas | ● Impervious and other areas

Those landscape area measures are pruned to residential areas to align with available water
production data from the monthly supplier report (which measures residential production). That
is done on a parcel-by-parcel basis using the Office of Planning and Research (OPR) residential
parcel dataset.7

Note that publicly available parcel polygons sometimes do not cover street medians, so that
portion of landscape area is omitted. That is an area for future improvement. In other cases a
street parcel exists, so the street is also covered by a polygon and there is no "empty space"
between polygons.

Those parcel-level measurements are aggregated to a retail supplier level using service area
boundaries obtained from the Department of Water Resources (DWR)8 and California
Environmental Health Tracking Program.9 Several retail service area boundaries overlap and
those issues have been passed along to the relevant DWR working group.

7 http://services.gis.ca.gov/arcgis/rest/services/Boundaries/Parcels_Residential/MapServer
8 https://gis.water.ca.gov/app/boundaries/
9 http://cehtp.org/page/water/water_system_map_viewer

Comparison to CaDC Member Agency Data

This section details initial efforts to evaluate the accuracy of this CGU data. One way to gauge
the accuracy of the CGU landscape area measurements is to compare them with the
measurements currently being used by CaDC member agencies employing water budget rate
structures. We do this here by comparing landscape area at the parcel level and in aggregate
for two CaDC agencies known to have measured (as opposed to estimated) landscape area:
the Moulton Niguel Water District (MNWD) and the Eastern Municipal Water District (EMWD).

In both cases, only customers with validated assessor parcel numbers (APNs) were compared:
32,180 parcels in Moulton Niguel and 30,856 in Eastern Municipal. Two mismatches complicate
this direct comparison and should be considered.

1) MNWD and EMWD measure a wider range of classes as irrigable, including dead lawns,
pools and some drought tolerant landscaping, while the CGU method only measures
photosynthetically active turf, bushes or trees.

2) All CGU imagery is from July and August 2016, measuring landscaping in the driest
part of the year. MNWD and EMWD measurements were done as part of implementing
their water budget based rates, and many of those measurements predated 2016. This
mismatch means MNWD and EMWD will include a significant amount of non-irrigated,
natural vegetation in their measurements.

Since irrigated area is a subset of irrigable area, we would expect the CGU measurements to be
smaller than those used by the agencies for each parcel. That is what we see. CaDC staff has
performed a similar accuracy assessment with leading vendors.10 Furthermore, CaDC
investigated how that error broke out by subcategory. The CGU estimates are produced as the
sum of turf area and tree/shrub area. In order to determine how each of these two components
contributes to the observed differences, a regression analysis was conducted.

The analysis found that the error is minimized by inflating the estimated turf area (by a factor of
1.1 in MNWD and 2.2 in EMWD) and decreasing the estimated tree/shrub area (by a factor of
0.89 in MNWD and 0.94 in EMWD). This further aligns with the intuition that an aerial algorithm
for estimating PSAV would under-predict turf by excluding areas of brown lawn, and over-
predict tree/shrub by including tree area that covers impermeable surfaces.
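
The source does not specify the exact regression form, so the following is only a sketch under stated assumptions: an ordinary least-squares fit, with no intercept, of agency-measured area on CGU turf area and CGU tree/shrub area. The fitted coefficients play the role of the inflation and deflation factors described above. The function name and the four synthetic parcels are hypothetical; the synthetic areas are generated from the MNWD factors purely so the fit can be checked.

```python
def fit_area_factors(turf, tree, agency_area):
    """Least-squares fit of agency_area ~ a * turf + b * tree (no intercept).

    Solves the 2x2 normal equations directly and returns (a, b).
    """
    s_tt = sum(t * t for t in turf)
    s_ss = sum(s * s for s in tree)
    s_ts = sum(t * s for t, s in zip(turf, tree))
    s_ty = sum(t * y for t, y in zip(turf, agency_area))
    s_sy = sum(s * y for s, y in zip(tree, agency_area))
    det = s_tt * s_ss - s_ts * s_ts
    if det == 0:
        raise ValueError("turf and tree areas are collinear; factors are not identifiable")
    a = (s_ty * s_ss - s_sy * s_ts) / det
    b = (s_sy * s_tt - s_ty * s_ts) / det
    return a, b


# Synthetic parcels (square feet), not real agency data.
turf_sqft = [1000.0, 400.0, 2500.0, 800.0]
tree_sqft = [500.0, 1200.0, 300.0, 800.0]
agency_sqft = [1.1 * t + 0.89 * s for t, s in zip(turf_sqft, tree_sqft)]

turf_factor, tree_factor = fit_area_factors(turf_sqft, tree_sqft, agency_sqft)
print(f"turf factor: {turf_factor:.2f}, tree/shrub factor: {tree_factor:.2f}")
```

A turf factor above 1 and a tree/shrub factor below 1 would correspond to the under- and over-prediction pattern described in the text.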

10 CaDC Independent Landscape Area Classification Accuracy Assessment (September 2016). Please
inquire with CaDC staff for a copy.

Opportunities for improvement

● Certain areas of the state are more technically challenging. Remote sensing can be
used to flag challenging areas that need special attention.
○ Multiple images over different time periods can help disambiguate those areas.
○ Use human experts to manually inspect ambiguous areas and reconcile what
specific category the area falls into.
● Implement a variance process to procedurally provide allowances in areas that are
challenging to any aerial imaging measurement -- automated or manual.
● Improve parcel data to better align landscape area with customer type. OPR’s statewide
residential parcel data offers a great starting point that can be built upon.
● Match meters to parcels to better distinguish between areas with natural vegetation and
maintained landscapes that use retail water. The CaDC agencies have invested in
common data infrastructure to integrate that metered use and parcel data on a voluntary
basis. This “big data” approach has multiple benefits as discussed in section 2 below.

1.B.2 Evapotranspiration Calculation

The other key data point for assigning an outdoor water budget is evapotranspiration. This first
iteration uses the inverse distance-weighted average of evapotranspiration data from the 10
CIMIS stations nearest to each utility service area. Specifically, the $ET_0$ for a given utility and
month is calculated as:

$$ET_0 = \frac{\sum_{i=1}^{10} w_i \times ET_{0,i}}{\sum_{i=1}^{10} w_i}$$

$$w_i = 1 / d_i$$
where
● $ET_{0,i}$ is the reference evapotranspiration measured at CIMIS station $i$
● $d_i$ is the Euclidean distance from station $i$ to the centroid of the utility service area

A small number of the daily ET readings are missing values. The missing readings for each
station are estimated as the daily average for that station and month.

In this approach, the $ET_0$ for each utility is calculated as the weighted average of readings from
nearby stations, with more weight given to closer stations and less weight given to more distant
stations. The approach is imperfect because evapotranspiration can vary greatly across a service
area (which a single number will not reflect) and some service areas are far from any CIMIS
station. This speaks to the importance of 1) calculating the outdoor water budgets for statewide
efficiency targets at a parcel level and 2) having more geographically granular measurements of
ET.
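
The inverse distance weighting and missing-value handling described in this section can be sketched as follows. This is an illustrative sketch, not CaDC's implementation. It assumes station and centroid coordinates are in the same planar units, and that a station's monthly value is the sum of its daily readings, which the source does not state. The function names and the two-station example are hypothetical.

```python
import math
from statistics import mean


def fill_missing(daily_readings):
    """Replace None with the station's mean observed daily reading for the month."""
    observed = [r for r in daily_readings if r is not None]
    if not observed:
        raise ValueError("no observed readings for this station and month")
    station_mean = mean(observed)
    return [station_mean if r is None else r for r in daily_readings]


def idw_monthly_et0(centroid, stations, k=10):
    """Inverse distance-weighted monthly ET0 from the k nearest stations.

    centroid: (x, y); stations: list of (x, y, daily_readings).
    """
    if not stations:
        raise ValueError("at least one station is required")

    def distance(station):
        return math.hypot(station[0] - centroid[0], station[1] - centroid[1])

    nearest = sorted(stations, key=distance)[:k]
    weighted_sum = 0.0
    weight_total = 0.0
    for station in nearest:
        monthly_et0 = sum(fill_missing(station[2]))  # assumption: monthly = sum of daily
        d = distance(station)
        if d == 0:
            return monthly_et0  # centroid coincides with a station
        weight = 1.0 / d  # w_i = 1 / d_i
        weighted_sum += weight * monthly_et0
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical three-day "month" with two stations; station A has one missing reading.
example_stations = [
    (1.0, 0.0, [0.2, None, 0.2]),  # station A, distance 1 from the centroid
    (0.0, 2.0, [0.1, 0.1, 0.1]),   # station B, distance 2 from the centroid
]
example_et0 = idw_monthly_et0((0.0, 0.0), example_stations)
print(f"Utility ET0: {example_et0:.3f} inches")
```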

Opportunities for improvement

● This initial iteration uses a single, aggregate utility-wide estimate of evapotranspiration.
Many California water utilities have nontrivially different climates within their service
area. The most robust method would be to assign evapotranspiration to water use at the
parcel level to reflect the actual watering needs of that landscape area, even if the
ultimate target is to be aggregated at a utility level.
● CaDC staff is working to improve these initial evapotranspiration measurements and to
integrate DWR’s spatial CIMIS data to provide more robust evapotranspiration
measurements at a higher resolution.
● Lastly, according to the latest spatial CIMIS methodology, that measurement
incorporates only solar radiation in addition to local in situ sensor measurements. To
help improve on those measurements CaDC staff has partnered with researchers at
UCLA and NYU CUSP to integrate additional pertinent publicly available data sources
such as wind, precipitation and physiography. The latest work on that project is available
on the CaDC GitHub in the footnote below.11 Higher resolution ET data is foundational
to being able to accurately assign parcel level ET and thus calculate an accurate, parcel
customized water budget.

1.C. “Commercial, Industrial and Institutional Water Use”
Highlights the importance of more granular customer categorization

In October 2013, the DWR CII Task force produced a report for the California State Legislature
recommending the following:12

Recommendation 5 -7: DWR should work with the Association of California
Water Agencies (ACWA), CUWCC, California Urban Water Agencies
(CUWA), California Public Utilities Commission (CPUC), California Water
Association (CWA), and American Water Works Association (AWWA) to
develop a full-spectrum, water-centric water use standardized classification
system of customer categories. This classification system should include
consistent use of North American Industry Classification System (NAICS)
codes and assessor’s parcel numbers (APNs).

In line with those recommendations on the need for granular, water-centric customer categories,
CaDC staff has developed a unique applied R&D project with NYU CUSP. Working with
Professor Constantine Kontokosta of NYU's Center for Urban Science and Progress, the CaDC
is working towards benchmarking efficient water use across Commercial, Industrial, Institutional
and Multifamily Residential customer classes using our standardized data. CaDC staff deployed
this approach for NYC when they were part of Professor Kontokosta’s research group,
resulting in a Bloomberg Data for Good award-winning Energy & Water Performance Map13 and
associated journal paper.14 These approaches to classifying and resolving commercial entities
have economies of scale and would be much more difficult to deploy 411 unique times than on a
statewide basis through a single integrated database like CaDC’s SCUBA data infrastructure.

11 https://github.com/California-Data-Collaborative/Evapotranspiration
12 http://www.water.ca.gov/legislation/docs/CII%20Volume%20II%20july%202014.pdf

1.D. “Water lost through leaks”

Integrated, nonprofit water data infrastructure can help use statistical outlier detection to flag
anomalies and find areas for leak detection. Monthly metered use data needs to be
supplemented with SCADA flow data to detect leaks effectively. The underlying parsing and
SCUBA data infrastructure can be redeployed to integrate that SCADA flow data.
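
The source does not name a specific outlier detection method. As one possible illustration, the sketch below flags anomalous readings with a modified z-score based on the median absolute deviation (MAD), a common robust technique. The function name, the 3.5 threshold and the sample usage series are assumptions for illustration only.

```python
from statistics import median


def flag_outliers(values, threshold=3.5):
    """Return a list of booleans marking values whose modified z-score exceeds threshold."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No spread around the median: nothing can be scored as anomalous.
        return [False] * len(values)
    # 0.6745 scales the MAD so the score is comparable to a standard z-score.
    return [abs(0.6745 * (v - med) / mad) > threshold for v in values]


# Hypothetical monthly usage for one account; the last reading is a possible leak.
monthly_use = [10, 11, 9, 10, 12, 10, 40]
flags = flag_outliers(monthly_use)
print([use for use, flagged in zip(monthly_use, flags) if flagged])
```

The same scoring could be applied to higher-frequency SCADA flow series once they are integrated, which is where the text suggests leak detection becomes effective.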

2. Calculating Applicable Urban Water Use

California collects three publicly available sets of major statewide aggregate urban water use
data. Those sources' temporal frequency and key data fields are summarized below.

13 http://www1.nyc.gov/site/sustainability/codes/energy-benchmarking.page
14 http://cusp.nyu.edu/wp-content/uploads/2015/09/D4GX_Web_Based_Visualization_and_Predction_Kontokosta_Tull.pdf

These urban water demand data sources each present challenges in aligning with the water use
target methodology detailed in Section 1. Note that none of these aggregate water use data sources
explicitly breaks out indoor residential use and total outdoor water use.

This initial deployment of the CaDC Efficiency Explorer tool addresses this issue by focusing
initially on residential water use and highlighting opportunities for future improvement.

Opportunities for improvement

Currently reported California water data will need to be improved to implement the EO long-term
framework. For instance, mapping dedicated commercial areas with overlapping parcels and
meters will be key for distinguishing outdoor from indoor CII use. In addition, outdoor water
budgets can be more accurately assigned at the parcel level, as detailed in section 1.B.2, because
evapotranspiration can vary greatly across a utility’s service area. This meter-level approach
would also better account for unique local circumstances like recycled water delivered to
residential accounts. Lastly, an integrated database of metered water utility data helps streamline
the deployment of analytics to support water managers in meeting their targets and broader water
reliability objectives. The CaDC partnership has developed data infrastructure integrating
metered use data across participating agencies. Participation in the project is voluntary.

The CaDC has the tools to ingest meter data mapped to addresses, which can spatially identify
and rectify many of the issues noted by distinguishing between, for instance, commercial and
residential parcels. This tool provides an initial exploration to identify opportunities for mapping
measures of efficiency to water agencies and accounting for local conditions. A combination of
better data, consistent metadata and standard definitions is critical to providing an accurate
measure of efficiency. This data integration is exactly what the CaDC’s SCUBA data
infrastructure delivers.

This integrated and standardized metered use data provides multiple benefits which you can
learn about on the CaDC website here: CaliforniaDataCollaborative.com/analytics

Additionally, CaDC staff is actively working to improve other data quality aspects that will be
critical to the effective implementation of this framework:

● Evapotranspiration -- CaDC staff has created an open GitHub repository to supplement
DWR’s in situ CIMIS stations with additional characteristics (including variables beyond
what is included in spatial CIMIS).15
● Local water utility boundaries -- CaDC staff is collaborating with DWR to reconcile
overlapping boundaries for California water retailers. Publicly available water utility
boundary shapefiles were integrated from the Department of Water Resources and the
California Environmental Health Tracking Program (CEHTP). The CEHTP dataset is
notably indexed by public water system ID (PWSID).16 These data were processed to
correct for service areas with multiple water systems. Agencies’ most recently available
Urban Water Management Plans were consulted to validate these updates. Cases in
which there were overlaps between available boundaries were reconciled by assigning
conflicted parcel-level areas to the agency with less total area. This logic was
implemented to account for situations in which available agency boundaries reflected not
only retail service area, but also wholesale service area, resulting in misleading
overlaps. CaDC staff has made this data publicly available and appreciates corrections
from water utility staff at the link in the footnote below.17
● Local parcels -- CaDC staff has integrated publicly available parcels for 51 of 58
counties statewide (mandated per 2013 OC vs Sierra Club). The remaining 7 counties
could make their data available, and many counties that currently make their data
available could improve their parcel data quality. Three additional counties’ parcels were
obtained directly from their local assessors (Fresno, Lassen and San Luis Obispo). The
parcels were integrated through a spatial join with the aforementioned OPR land use
data to develop residential parcels statewide.
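
The boundary reconciliation rule in the list above (assign each conflicted parcel to the agency with less total area) can be sketched as below. This is a simplified illustration that assumes the spatial overlay has already been done, so each parcel arrives with the list of agencies whose boundaries contain it. The function name, identifiers and areas are hypothetical.

```python
def reconcile_overlaps(parcel_candidates, agency_total_area):
    """Assign each parcel to one agency.

    parcel_candidates: dict of parcel id -> list of agency ids whose boundary covers it.
    agency_total_area: dict of agency id -> total boundary area (any consistent unit).
    Parcels claimed by several agencies go to the agency with the smallest total area.
    """
    assignment = {}
    for parcel_id, agencies in parcel_candidates.items():
        if not agencies:
            continue  # parcel falls outside every available boundary
        # min() keeps the first candidate on ties; a real pipeline would need a tie rule.
        assignment[parcel_id] = min(agencies, key=lambda a: agency_total_area[a])
    return assignment


# Hypothetical case: a wholesaler boundary that also covers a retailer's service area.
example_candidates = {
    "parcel_1": ["retailer_a"],
    "parcel_2": ["wholesaler_b", "retailer_a"],
    "parcel_3": ["wholesaler_b"],
}
example_areas = {"retailer_a": 10.0, "wholesaler_b": 500.0}
example_assignment = reconcile_overlaps(example_candidates, example_areas)
print(example_assignment)
```

Preferring the smaller boundary reflects the rationale given in the text: a large boundary that overlaps a retailer often describes a wholesale service area rather than a retail one.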

15 https://github.com/California-Data-Collaborative/Evapotranspiration
16 http://cehtp.org/faq/water/water_boundary_tool_data_user_frequently_asked_questions
17 https://california-data-collaborative.carto.com/builder/45fc9a00-08ff-11e7-87a6-0e05a8b3e3d7/embed?state=%7B%22map%22%3A%7B%22ne%22%3A%5B33.137551192346145%2C-125.36499023437501%5D%2C%22sw%22%3A%5B38.70265930723801%2C-112.6318359375%5D%2C%22center%22%3A%5B35.96911507577482%2C-118.99841308593751%5D%2C%22zoom%22%3A7%7D%7D

APPENDIX 1 -- Qualitative Accuracy Assessment

In partnership with CaDC local utilities, staff reviewed each of the 404 estimated targets agency
by agency. The accuracy of each estimated target was assessed through the following
standardized rubric and evaluated based on evidence of systemic bias that would skew the
target away from ground truth.

Severe uncertainty with data quality
● CIMIS_Proximity: More than 20 kilometers from service area boundary to nearest CIMIS
station and hydrologic characteristics that make ET measurements doubtful.
● Rural_Residential_Prevalence: More than 10% of residential parcels appear to be large,
green, and un-irrigated.
● Residential_Parcel_Accuracy: Parcel data imply an unreasonable average number of people
per residential parcel (PpRP < 1 or PpRP > 15) in this service area.
● Census_Place_Coverage: Less than 70% of this service area's residential parcels are within a
designated census place.
● Service_Boundary: Overlap with neighboring boundaries or UWMP estimated at greater than
10% of total service area.

Material uncertainty with data quality
● CIMIS_Proximity: More than 20 kilometers from service area boundary to nearest CIMIS
station.
● Rural_Residential_Prevalence: 1-10% of residential parcels appear to be large, green, and
un-irrigated.
● Residential_Parcel_Accuracy: Parcel data imply an ambiguous average number of people per
residential parcel in this service area (1 <= PpRP < 2 or 12 < PpRP <= 15).
● Census_Place_Coverage: 70%-80% of this service area's residential parcels are within a
designated census place.
● Service_Boundary: Overlap with neighboring boundaries or UWMP estimated at 1-10% of total
service area.

No reported uncertainty evident with data quality
● CIMIS_Proximity: CIMIS stations within 20 kilometers.
● Rural_Residential_Prevalence: Most residential greenspace could be irrigated.
● Residential_Parcel_Accuracy: Parcel data imply a reasonable average number of people per
residential parcel (2 <= PpRP <= 12) in this service area.
● Census_Place_Coverage: More than 80% of this service area's residential parcels are within a
designated census place.
● Service_Boundary: No evident discrepancy in boundary in terms of overlap with neighbors or
UWMP plan. Or very minor overlap (e.g. rectification issue of a few meters).
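
The Residential_Parcel_Accuracy criterion is the one rubric column with fully numeric thresholds, so it can be sketched directly. The snippet below is an illustration only. It assumes PpRP is computed as service area population divided by the number of residential parcels, and the function names and the short labels "severe", "material" and "none" are hypothetical shorthand for the three rubric levels.

```python
def people_per_residential_parcel(population, residential_parcels):
    """Average number of people per residential parcel (PpRP)."""
    if residential_parcels <= 0:
        raise ValueError("residential_parcels must be positive")
    return population / residential_parcels


def classify_pprp(pprp):
    """Map a PpRP value to the rubric's uncertainty level."""
    if pprp < 1 or pprp > 15:
        return "severe"    # unreasonable average
    if 2 <= pprp <= 12:
        return "none"      # reasonable average
    return "material"      # ambiguous: 1 <= PpRP < 2 or 12 < PpRP <= 15


# Hypothetical service area: 60,000 people on 20,000 residential parcels.
example_pprp = people_per_residential_parcel(60_000, 20_000)
print(example_pprp, classify_pprp(example_pprp))
```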

These are synthesized into standardized assessments of the accuracy (or inaccuracy through
systematic bias) of the estimated targets as follows:

Data Uncertainty                       | Aggregate_Assessment
---------------------------------------+----------------------------------------------------------
Little to no evidence of systemic bias | Less than two material uncertainties with data quality
                                       | and no severe uncertainties with data quality in any
                                       | categories
Some evidence of systemic bias         | More than two material uncertainties with data quality
                                       | and no severe uncertainties with data quality in any
                                       | category
Substantial evidence of systemic bias  | Any severe uncertainties with data quality or persistent
                                       | material uncertainties with data quality across
                                       | assessment categories