Limnol. Oceanogr. 9999, 2024, 1–16

Published 2024. This article is a U.S. Government work and is in the public domain in the
USA. Limnology and Oceanography published by Wiley Periodicals LLC on behalf of
Association for the Sciences of Limnology and Oceanography.
doi: 10.1002/lno.12549

Deep learning of estuary salinity dynamics is physically accurate at a fraction of hydrodynamic
model computational cost
Galen Gorski,1* Salme Cook,2 Amelia Snyder,1 Alison P. Appling,3 Theodore Thompson,4
Jared D. Smith,1 John C. Warner,2 Simon N. Topp1
1 U.S. Geological Survey, Reston, Virginia, USA
2 U.S. Geological Survey, Woods Hole Coastal and Marine Science Center, Woods Hole, Massachusetts, USA
3 U.S. Geological Survey, State College, Pennsylvania, USA
4 U.S. Geological Survey, Lawrenceville, New Jersey, USA

Abstract
Salinity dynamics in the Delaware Bay estuary are a critical water quality concern as elevated salinity can
damage infrastructure and threaten drinking water supplies. Current state-of-the-art modeling approaches use
hydrodynamic models, which can produce accurate results but are limited by significant computational costs.
We developed a machine learning (ML) model to predict the 250 mg L⁻¹ Cl⁻ isochlor, also known as the “salt
front,” using daily river discharge, meteorological drivers, and tidal water level data. We use the ML model to
predict the location of the salt front, measured in river miles (RM) along the Delaware River, during the period
2001–2020, and we compare predictions of the ML model to the hydrodynamic Coupled
Ocean–Atmosphere-Wave-Sediment Transport (COAWST) model. The ML model predicts the location of the
salt front with greater accuracy (root mean squared error [RMSE] = 2.52 RM) than the COAWST model
does (RMSE = 5.36 RM); however, the ML model struggles to predict extreme events. Furthermore, we use
functional performance and expected gradients, tools from information theory and explainable
artificial intelligence, to show that the ML model learns physically realistic relationships between
the salt front location and drivers (particularly discharge and tidal water level). These results
demonstrate how an ML modeling approach can provide predictive and functional accuracy at a
significantly reduced computational cost compared to process-based models. In addition, these
results provide support for using ML models in operational forecasting, scenario testing, management
decisions, hindcasting, and resulting opportunities to understand past behavior and develop
hypotheses.

*Correspondence: ggorski@usgs.gov
This is an open access article under the terms of the Creative Commons Attribution License, which
permits use, distribution and reproduction in any medium, provided the original work is properly
cited.
Additional Supporting Information may be found in the online version of this article.
Author Contribution Statement: AA, JW, GG, and JS all helped conceive the study and secure support.
GG, SC, AS, and TT executed the study, and GG, SC, JS, AA, and ST analyzed and interpreted the
results. JS and ST made methodological contributions to the analysis. GG, SC, JS, and AA wrote the
manuscript, while AS, TT, ST, and JW contributed to the editing and revision of the manuscript and
figures.

In the mid-Atlantic region of North America, the Delaware River Basin supplies drinking water for
15 million people including nearly half of the population of New York City (Hutson et al. 2016).
Water resources in the basin are jointly managed by four states and the federal government to meet
demand while also maintaining downstream flow and water quality objectives (Mandarano and
Mason 2013). Within the Delaware River Basin, the main stem of the Delaware River provides
recreational opportunities, critical habitat to endemic species, and drinking water for several
urban areas including Philadelphia, Pennsylvania. Salinity concentrations within the lower reaches
of the Delaware Bay estuary vary throughout the year, but during periods of low river discharge,
ocean salinity can be driven upstream, threatening water quality. Increased salinity conditions,
which can be brought on by drought, sea level rise, and/or increased demand, can corrode critical
infrastructure, threaten species habitat, and necessitate further treatment or alternative sources
of drinking water. As such, salinity is a key constituent in the Delaware Bay estuary, and its
dynamics are tracked closely to ensure the protection of water quality. For example, managed
releases from upstream reservoirs, which are jointly managed for drinking water, flood control,
recreation, and other purposes, are used to meet minimum flow objectives and manage salinity
intrusion. However, these releases require the careful balancing of salinity mitigation with

19395590, 0, Downloaded from https://aslopubs.onlinelibrary.wiley.com/doi/10.1002/lno.12549 by East China Normal University, Wiley Online Library on [17/04/2024]. See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License
Gorski et al. Deep learning model of Delaware Bay

other beneficial uses, and there is considerable uncertainty surrounding the effect that the timing
and magnitude of releases have on downstream salinity.

Within the Delaware Bay estuary, salinity intrusion is measured using the location of the 7-d
average 250 mg L⁻¹ Cl⁻ isochlor, also referred to as the “salt front.” The location of the salt
front is reported in river miles (RM; 1 RM ≈ 1.6 km) along the central axis of the estuary, ranging
from 0 at the mouth of the estuary to 134 at Trenton, NJ, where the river is no longer influenced by
ocean tides. The location of the salt front, and salinity dynamics more broadly, in the Delaware Bay
estuary are primarily controlled by river discharge along the mainstem of the Delaware River and
from other major tributaries such as the Schuylkill River. However, other factors such as tidal
action, meteorological systems, and bathymetric and topographical features also have influence,
particularly during times of low discharge (Ross et al. 2015).

Current models of the Delaware Bay estuary primarily use numerical techniques to solve equations for
the fundamental physical processes which drive salinity, hydrologic, and other biogeochemical
dynamics (Galperin and Mellor 1990a,b; Whitney and Garvine 2006; Aristazabal and Chant 2013;
Philadelphia Water Department 2015). For example, the Coupled Ocean–Atmosphere-Wave-Sediment
Transport (COAWST) modeling system has been used to accurately simulate salinity dynamics in the
Delaware Bay (Cook et al. 2023).

While numerical hydrodynamic models such as COAWST can accurately simulate water level, temperature,
salinity, and other constituents at fine spatial and temporal resolutions (Warner et al. 2010),
there are substantial drawbacks including: (1) high computational costs in running the model, which
can lead to long run times, shorter simulation periods, and difficulty in using the model for
operations or short-term forecasting. For example, the COAWST model takes ~50 h of compute time on
180 processors to simulate 1 yr of conditions within the Delaware Bay estuary. This limits the
ability to test suites of scenarios simulating reservoir releases or potential future conditions. In
addition, given the computational constraints, producing model ensembles for uncertainty or using
ensemble-based data assimilation approaches to develop operational models for near-term forecasting
is not feasible; (2) substantial system-specific expertise required to define boundary conditions
and configure the model; (3) significant requirements for hosting and accessing input and output
data due to fine spatial and temporal resolutions of the model; for example, the COAWST model
outputs ~2 TB of data for a single year of simulations.

The rapid advances of machine learning (ML) have shown great promise in simulating dynamics in
complex environmental and hydrologic systems, including estuaries, at reduced computational costs
(Zaini et al. 2022; Bahari et al. 2023; Qi et al. 2023). For example, Zhou et al. (2021) used the
Cubist ML algorithm to estimate the impact of Mississippi River discharge on estuary salinity
dynamics along the Gulf Coast. Alizadeh et al. (2018) showed that ML models could accurately
forecast estuarine water quality constituents, including salinity, up to 2 h in the future
(Alizadeh et al. 2018). Other studies have demonstrated how ML models can reduce cost in data
collection and computation (Codden et al. 2021). The flexibility and computational efficiency of
such models raise exciting possibilities about their capacity to harness ever-growing data
availability to develop accurate models at speeds and resolutions not possible with traditional
modeling approaches. However, the complexity of ML models’ inner workings can make it difficult to
discern if the models are representing a system in a physically reasonable way. Moving beyond purely
predictive performance metrics, it is important to ensure that ML models are getting the right
answers for the right reasons (Kirchner 2006). This issue is particularly critical if models are to
be used for forecasting, testing of future scenarios, or extrapolations of any kind in space and/or
time.

This type of model accuracy is termed functional performance, which represents the ability of a
model to accurately reproduce coupling and sensitivities within the system (Ruddell et al. 2019).
A model with high functional performance should reproduce pairwise input–output relationships seen
in the observed data. Transfer entropy, a metric from information theory, has been proposed as a
method for assessing functional performance, as it provides a robust measure of non-linear, one-way
coupling between inputs and outputs (Schreiber 2000; Ruddell and Kumar 2009; Konapala et al. 2020).
The joint assessment of predictive and functional performance can provide a deeper understanding of
a model’s ability to represent a physical system and help identify potential weaknesses in the
model.

The ML model used in this study is a long short-term memory (LSTM) network (Hochreiter and
Schmidhuber 1997), a kind of temporally aware neural network that has shown skill in representing
hydrologic processes due to its ability to retain information over many time steps (Rahmani
et al. 2020; Frame et al. 2021; Zhi et al. 2021). As the name suggests, the network is engineered to
discern which information from the past is relevant to the current time step and to carry that
information forward. This behavior has natural resonance with hydrologic systems, where storage and
transfer can operate across a range of different scales. An advantage of LSTMs is that they can
identify these timescales from input data automatically (Tennant et al. 2020).

In this study, we develop an ML model to predict the 7-d backward-looking average salt front
location using daily observations of discharge, meteorological variables, and water levels. We train
the model using a record of the salt front location from 2001 to 2020 developed by the Delaware
River Basin Commission (Preucil and Reavy 2020) with data from 2001 to 2015 used for training and
2016 to 2020 used for testing. We compare the model results with modeled salt front location from
the COAWST hydrodynamic model for 3 yr during the holdout period for which we have results for both
models. In addition to comparing predictive performance of the two models, we use transfer entropy
between river discharge and salt front location to compare model functional performance at
identified critical time scales. As COAWST is a state-of-the-art
hydrodynamic model (Warner et al. 2010), it serves as a point of comparison for measuring the
predictive and functional performance of the ML model. We also investigate the ML model’s internal
gradients using a method called expected gradients (EGs) (Erion et al. 2021) to assess whether the
model is using input variable data in a physically consistent manner.

The reduced computational cost and increased flexibility of ML approaches raise the possibility of
quickly running ensembles of scenarios. These applications would have the potential to inform water
resources management both in real time and for the long term. For this and other future applications
to be realized, modeling frameworks need to be rigorously compared to both observational data and
the existing state-of-the-art modeling approaches to understand their strengths and weaknesses.
Although ML approaches have been applied to estuary salinity in the past, this work focuses on
explicitly comparing state-of-the-art process-based modeling to ML methods using a combination of
predictive and functional performance metrics. In addition, our use of EGs to examine the drivers of
model response through time is a unique application of this explainable artificial intelligence
technique to understand model behavior.

Materials

Geographical setting
The Delaware Bay estuary is a coastal plain estuary that drains an area of about 35,000 km² across
Pennsylvania, New Jersey, New York, Delaware, and Maryland (Hutson et al. 2016). The Delaware River,
its main tributary, contributes ~60% of the annual freshwater to the total discharge of the estuary,
with the Schuylkill River contributing an additional 15% and no other single tributary contributing
more than 1% (Sharp et al. 1986) (Fig. 1). Streamflow is highest in the spring and lowest in the
late summer and early fall (Supporting Information Fig. S1).

Salinity dynamics within the estuary have been characterized as having a weak response to variation
in streamflow compared to other estuary systems (Garvine et al. 1992; Aristazabal and Chant 2013).
This is in large part due to significant tidally driven advection dominated by the M2 principal
lunar constituent (Wong 1995), vertical mixing, and the shape of the estuary. Other factors such as
wind and gravitational flow due to density gradients formed by freshwater inputs can also affect
salinity dynamics (Galperin and Mellor 1990a). Throughout the estuary, lateral variations in
salinity are generally greater than vertical (Wong and Münchow 1995), with more saline water found
in the middle of the estuary. These lateral gradients are likely less important in the upper reaches
of the estuary, as it narrows considerably (Sharp et al. 1982). While lateral salinity gradients can
be important for driving salinity dynamics in certain parts of the estuary, they are beyond the
scope of this study.

Data sources
The ML model uses daily dynamic drivers to make predictions: discharge at two locations, four
meteorological variables (barometric pressure, air temperature, wind direction, and wind speed), and
four tidal signatures, for a total of 10 predictors. Daily river discharge data were acquired from
the Delaware River at Trenton, NJ (USGS Site 01463500) and Schuylkill River at Philadelphia (USGS
Site 01474500) using the USGS National Water Information System (U.S. Geological Survey 2022). Daily
meteorological data were downloaded from the NOAA National Estuarine Research Reserve System (NERRS)
at Saint Jones River (DELSJMET) (NOAA NERRS n.d.) (Fig. 1). We used the Saint Jones River data
because that station provided the only complete record of all variables of interest among stations
in the area, and the station record is more locally accurate than a gridded meteorological product.
Observed and predicted water level data from the mouth of the estuary were downloaded from the NOAA
National Ocean Service (NOS) Tides and Currents web portal for Lewes, DE (8557380) at a 6-min
resolution (NOAA National Ocean Services n.d.). Water levels are predicted for specific locations
along the coastline using celestial mechanics and past local observations (NOAA Center for
Operational Oceanographic Products and Services n.d.). Water level data were converted from 6-min
resolution to the daily time step by calculating the daily range, maximum height, difference between
predicted and actual height, and subtidal frequency fluctuations (Supporting Information Material
Data preprocessing).

The ML model target variable is the backward-looking 7-d average location of the 250 mg L⁻¹ Cl⁻
isochlor, known as the salt front. The salt front location in units of RM is a specific regulatory
metric; as such, our modeling and analysis are done in these units (Delaware River Basin
Commission 2019). The location of the salt front is estimated by the Delaware River Basin Commission
using daily observations of specific conductivity from four locations throughout the estuary: Reedy
Island (RM 54), Chester (RM 84), Fort Mifflin (RM 92), and the Ben Franklin Bridge (RM 100).
Specific conductivity is converted to Cl⁻ using a locally calibrated conversion, and a log-linear
interpolation method is used to estimate Cl⁻ concentration between observation points (Preucil and
Reavy 2020). When station data are missing due to a faulty sensor or routine maintenance, often in
the winter months, a correction is applied to interpolate between non-adjacent sensors. Uncertainty
estimates are not available for the salt front location. For the purposes of calculating the salt
front from observations, salinity is synonymous with Cl⁻ concentration.

The 7-d average salt front is similar to the position of the 2 g kg⁻¹ isohaline (X2) used as a
measure of salinity intrusion in other estuaries (Guerra-Chanis et al. 2019). It is an advantageous
target variable for several reasons. First, it is a management-relevant variable that is directly
referenced in decisions to mitigate salinity by releasing water from upstream reservoirs
(U.S. Geological Survey, Office of the Delaware River Master 2017). Second, in contrast to an
Eulerian frame of reference such as salinity concentration at single or multiple locations, a time
series of the salt front location describes the

Fig. 1. Map of study area. Orange bars indicate river mile intervals used for model performance analysis; yellow inverted triangles provide historical and
management context. Projection: World Geodetic System 1984. Basemap generated using the tigris and dataRetrieval packages in R (R Core Team 2022;
DeCicco et al. 2023; Walker 2023); depth data from NOAA National Centers for Environmental Information (2005).

directional history of salinity within the estuary; it conveys whether salinity is advancing inland
or being repelled downstream. Finally, the salt front location is a single value that provides a
succinct summary of salinity conditions in the estuary; more complex model output, such as maps of
salinity or salinity at several locations, would require post-processing to produce similarly useful
information. Although choosing an interpolated value for the modeling target introduces additional
uncertainty from the interpolation method that might be avoided if we were instead to model salinity
at several locations, we elected to select the salt front record as it is a target with more
management relevance.

ML model
To model the 7-d average location of the salt front we use the LSTM ML method (Hochreiter and
Schmidhuber 1997). The LSTM is a type of temporally aware neural network that has been used
successfully to model complex hydrologic systems (Kratzert et al. 2018; Read et al. 2019; Jia
et al. 2021; Sadler et al. 2022). Its success in hydrology over other ML methods is largely due to
its ability to represent system memory and antecedent conditions. An LSTM is a sensible choice in
this case as antecedent conditions and system memory have strong influence on the salt front
movement (Supporting Information Fig. S1). The fundamental unit of an LSTM is a cell or hidden unit,
which is made up of a series of gates through which incoming data pass. The gates control the flow
of information throughout the model: the input gate controls which information from the current time
step is added to the model’s memory, the forget gate controls which information from the previous
time step is removed from memory, and the output gate controls which information is used to make the
current time step’s prediction. The LSTM cells can be represented in a simplified way by:

$$h_t, c_t = f_{\mathrm{LSTM}}(h_{t-1}, c_{t-1}, x_t), \tag{1}$$

$$s_t = f_{\mathrm{linear}}(h_t), \tag{2}$$

where $h$ and $c$ are the hidden state and cell state, respectively, and $x_t$ represents a vector
of the model inputs at time $t$. The full set of equations describing the LSTM’s functions as well as
details describing the choice of modeling hyperparameters and number of replicates can be found in
the Supporting Information.

In Eq. 2, the LSTM is used to generate the prediction of the salt front location, $s_t$, using
linear regression. We employed a model with a single LSTM layer, 20 hidden units, a look back
sequence length of 365 d, dropout of 0.1, and a recurrent dropout of 0.3. The model was trained for
500 epochs with a learning rate of 0.003. The results presented are the average of 10 model
replicates differing only in the random initialization of model weights. All input data were z-score
normalized before modeling.

The modeling time period (2001–2020) was split into training (2001–2010), validation (2011–2015),
and testing (2016–2020) periods. To determine model hyperparameters a grid search was used, and
models were trained on data from the training period and assessed during the validation period
(Supporting Information Material). Once the hyperparameters had been selected, the final model was
trained using combined data from the training and validation periods (2001–2015) and assessed during
the testing period. Model performance results refer to the testing period unless otherwise stated.

COAWST hydrodynamic model
The COAWST modeling system (Warner et al. 2008, 2010) was used to model the processes controlling
salinity intrusion in the Delaware Bay. The Regional Ocean Modeling System (Haidvogel
et al. 2000, 2008; Shchepetkin and McWilliams 2005) was employed as the ocean circulation model.
River discharge was prescribed using the USGS gage at Trenton, NJ (USGS Site 01463500;
U.S. Geological Survey 2022). Flows at the Chesapeake and Delaware Canal were estimated from NOAA
water level and velocity measurements at Chesapeake City, MD (8573927; NOAA National Ocean
Services n.d.). Tidal forcing was from the Advanced Circulation Model (Mukai et al. 2001). Subtidal
forcing was generated using a 32-h low pass filter from water level observations interpolated from
Lewes, DE (8557380) and Cape May, NJ (8536110). Temperature and salinity boundary conditions at the
coastal ocean are provided by the COAWST forecast model (Warner and Kaira 2022). All model runs
included atmospheric dynamics that were forced by the European Center for Medium-Range Weather
Forecasts Reanalysis v5 (Hersbach et al. 2020) with a spatial resolution of 30 km and a temporal
resolution of 1 h. The COAWST model was calibrated using water level and salinity at four stations.
Details of the calibration scheme can be found in Cook et al. (2023).

The COAWST model simulates estuary water level, velocities, temperature, and salinity, among other
output variables, at a spatial resolution of 5–450 m. The model was simulated with 16 vertical
levels and run with a 2- to 10-s time step. Several methods for comparison of COAWST model output to
ML simulations were considered, including selecting COAWST salinity simulation from the locations
where specific conductivity observations were collected and post-processing to interpolate a
simulated salt front record. We elected to use the method that had previously been employed by Cook
et al. (2023) and was the most likely to be used in a decision-making context. To derive a time
series of the modeled salt front location from COAWST output, salinity is aggregated to the daily
timestep, and the bottom of the vertical cell equal to approximately 0.52 psu (250 mg L⁻¹ Cl⁻) is
selected (Philadelphia Water District 2019). Each model run consists of simulating a specific year,
and each year is run independently. Note that the COAWST model and the ML model are designed to
predict different quantities at different spatial and temporal resolutions and scales; therefore,
the models are not necessarily on equal footing for different prediction tasks. However, in this
study, we have designed the comparison to emphasize relevance from a management perspective.

Assessing model performance
The predictive performance of the hydrodynamic and ML models was assessed against the salt front
record using root mean squared error (RMSE) and Nash–Sutcliffe efficiency (NSE) (Nash and
Sutcliffe 1970). The model performance was assessed separately for a set of six RM intervals (< 58,
58–68, 68–70, 70–78, 78–82, > 82 RM) using geographic and bathymetric control points (Fig. 1). The
intervals do not have the same length but were chosen based on the physical structure of the system.
The features of the intervals are discussed in the Spatial Variability in Modeling Results section
of the Discussion.

The functional performance of the model is an indication of how well a model represents the
relationships between inputs and outputs compared to the observed relationships. To compute
functional performance, we used transfer entropy (TE), a concept from information theory that
quantifies the asymmetric flow of information between sources and sinks and can capture both linear
and non-linear interactions (Schreiber 2000). The $\mathrm{TE}_{X \rightarrow Y}$ is a measure of
the reduction in uncertainty in $Y$ (the sink variable) gained through knowledge of $X$ (the source
variable), conditioned on the history of $Y$ (Schreiber 2000; Ruddell and Kumar 2009). In our
analysis, we focused on discharge from the Delaware at Trenton, NJ and discharge from the Schuylkill
at Philadelphia as the source variables ($X_j$) for transfer entropy calculations, where $j$ is the
index of the source location, $j \in \{T, S\}$ for Trenton or Schuylkill, respectively. The sink
variable ($Y$) was the 7-d average salt front location. We calculated TE for each combination of
sources and sink for modeled and observed data. For more detail on the calculation of TE please see
Supporting Information Material.

Functional performance is defined as:

$$\text{Functional performance}_{X_j \rightarrow Y} = \mathrm{TE}_{X_j \rightarrow Y,\ \text{modeled}} - \mathrm{TE}_{X_j \rightarrow Y,\ \text{observed}}, \tag{3}$$

where functional performance values were calculated for each model, and for each individual
$X_j \rightarrow Y$ relationship.
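As an illustration of Eq. 3, the following is a minimal Python sketch of a histogram-based transfer entropy estimate and the resulting functional performance. It is not the implementation used in the study, whose details are in the Supporting Information. The function names, the equal-width binning, the number of bins, the one-step sink history, and the single source lag are all assumptions made for this example.

```python
import numpy as np


def _discretize(values, bins):
    """Map a series onto integer labels 0..bins-1 using equal-width bins (assumed scheme)."""
    values = np.asarray(values, dtype=float)
    inner_edges = np.linspace(values.min(), values.max(), bins + 1)[1:-1]
    return np.digitize(values, inner_edges)


def transfer_entropy(source, sink, lag=1, bins=8):
    """Plug-in estimate (bits) of TE from source to sink.

    Conditions on one previous step of the sink and uses the source value
    `lag` steps back. Both choices are simplifications for illustration.
    """
    if lag < 1:
        raise ValueError("lag must be >= 1")
    x = _discretize(source, bins)
    y = _discretize(sink, bins)
    y_now, y_past, x_past = y[lag:], y[lag - 1:-1], x[:-lag]

    # Joint distribution p(y_now, y_past, x_past) from counts.
    joint = np.zeros((bins, bins, bins))
    np.add.at(joint, (y_now, y_past, x_past), 1.0)
    joint /= joint.sum()

    p_ypast_xpast = joint.sum(axis=0)   # p(y_past, x_past)
    p_ynow_ypast = joint.sum(axis=2)    # p(y_now, y_past)
    p_ypast = joint.sum(axis=(0, 2))    # p(y_past)

    te = 0.0
    for i, j, k in zip(*np.nonzero(joint)):
        ratio = joint[i, j, k] * p_ypast[j] / (p_ypast_xpast[j, k] * p_ynow_ypast[i, j])
        te += joint[i, j, k] * np.log2(ratio)
    return float(te)


def functional_performance(source, modeled_sink, observed_sink, lag=1, bins=8):
    """Eq. 3: TE(source -> modeled sink) minus TE(source -> observed sink)."""
    return (transfer_entropy(source, modeled_sink, lag, bins)
            - transfer_entropy(source, observed_sink, lag, bins))


if __name__ == "__main__":
    # Synthetic stand-ins for discharge and salt front series (not real data).
    rng = np.random.default_rng(0)
    discharge = rng.normal(size=5000)
    observed = np.zeros(5000)
    observed[1:] = 0.8 * discharge[:-1] + 0.6 * rng.normal(size=4999)
    modeled = np.zeros(5000)
    modeled[1:] = 0.8 * discharge[:-1] + 0.05 * rng.normal(size=4999)
    print(functional_performance(discharge, modeled, observed))
```

In this synthetic case the modeled series follows the source more tightly than the observed series does, so the difference is positive. Plug-in estimates of this kind are biased upward for short records, so in practice the bin count and lag would need to be chosen with care.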


A functional performance equal to 0 is ideal, meaning that the Results


modeled salt front location (Y) has the same relationship to ML model results
the input (Xj ) as measured in the observed data. A functional The baseline ML model reproduces the structure of the salt
performance > 0 indicates that the relationship is over- front record well in both training and testing periods with
deterministic, while a functional performance < 0 indicates RMSEs of 1.25 RM and 2.52 RM, respectively (Fig. 2a–d). The
over-randomness. model shows worse performance when the salt front location
is low in the estuary (< RM 58; Fig. 2e), however, model per-
formance at this range is of lower importance to the modeling
ML model feature importance and EGs objectives which focus on salinity intrusions in the upper
To assess the importance of each input feature to the estuary, and the salt front location is not tracked below RM
accuracy of the predictions, we use permutation feature 54. When the salt front is high in the estuary (> RM 82), the
importance in which a feature is independently replaced model matches the observations well during the training
with a random sampling of values within the observed period (Fig. 2b), but underpredicts the two major peaks in
range of that feature’s values and the trained model is used 2016 and 2019 during the testing period (Fig. 2d). However,
to make predictions. The resulting error, measured as RMSE, the model did identify these as periods of salinity intrusion,
is compared to the error obtained using the observed and their relative timing is well matched. These conditions are
unpermuted data, and the process is repeated for each input relatively rare, representing 6.5% of the training period and
feature. This results in an estimate of the mean decrease in the performance metric resulting from information loss for each permuted feature, where larger decreases indicate that the model relies more on that feature to improve predictions.
EGs are a model-agnostic method for calculating a model’s local sensitivity to input variables at each individual time step of the model output. High positive or negative EG values indicate that the input variable strongly influences the model prediction at that time step, while EG values near zero indicate that the variable has little influence on the output. The method calculates the accumulated gradients of the change in an output with respect to a change in the input, $df(x)/dx$, as the input variable goes from some baseline value $x'$ to its true value $x$, that is, $\gamma(\alpha) = x' + \alpha(x - x')$, where $\alpha = 0$ at the baseline and $\alpha = 1$ at the true value of $x$, and $\gamma$ is a function describing the relationship between $x$ and $x'$ (Jiang et al. 2022) (Supporting Information Fig. S6). The baseline values $x'$ are generated by sampling from the range of the input values in the training data, given by $D$. For this study, we generated 200 random samples with replacement. Given a baseline distribution $D$, the EG of the $i$th input variable ($\Phi_i$) can be calculated as the following expectation (Erion et al. 2021):

$$\Phi_i(f, x) = \mathbb{E}\left[\left(x_i - x'_i\right)\frac{\partial f\left(x'_i + \alpha\left(x_i - x'_i\right)\right)}{\partial x'_i}\right], \quad x' \sim D,\ \alpha \sim U(0,1), \tag{4}$$

where $U(0,1)$ is the uniform distribution between 0 and 1. We calculate the EG for each replicate model run, differing only by random seed, and show the average EGs across model replicates for visualization purposes.
While both permutation feature importance and EGs can help assess the influence of input features globally and locally, respectively, neither method explicitly accounts for interaction between features.

6.1% of the testing period; however, they are of significant importance from a management perspective.
In general, the model performed best under median annual flow conditions. For example, the best performance in the testing set was observed in 2020 (RMSE = 1.38 RM), which was the year with the median cumulative discharge in the Delaware and Schuylkill Rivers during the entire modeling period. In contrast, 2016 was the year with the lowest cumulative flow in the modeling period and showed the worst performance (RMSE = 3.31 RM). Similarly, the model performs worse when the salt front is either low or high in the estuary (< 58 or > 78 RM) and performs better when the salt front is closer to its median value (Fig. 2e).
The ML model showed NSE = 0.85 during the testing period and NSE = 0.96 during the training period. Somewhat worse performance was reported by Meyer et al. (2020), who used process-based and empirical models to simulate specific conductivity at several locations within the Delaware Bay estuary. They report NSE = 0.706–0.834 for the process-based empirical model and 0.458–0.744 for a hydrodynamic model. The modeling scheme in that study did not include holdout data, so comparison to the training performance is more appropriate.

Comparison of ML and COAWST hydrodynamic model
The ML model showed consistent accuracy across all 3 yr in the testing set for which we have model results for both models (2016, 2018, and 2019), with the best performance in 2019 (RMSE = 2.53 RM) and worst performance in 2016 (RMSE = 3.31 RM) (Fig. 3a). In contrast to the ML model, the COAWST model showed a wider range of accuracy, with minimum RMSE in 2019 (2.69 RM) and maximum in 2018 (8.46 RM). The two models show very similar performance in 2016 and 2019, while the ML model outperformed COAWST in 2018 (Fig. 3a).
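As a rough illustration of the two attribution methods described in the Methods text (permutation feature importance and the expected gradients of Eq. 4), the following minimal sketch may help. It is not the code used in this study: the linear toy model, the function names, and the synthetic data are all hypothetical, and the gradient is taken with respect to the model input at the interpolated point, as in the usual formulation of Erion et al. (2021).

```python
import numpy as np


def rmse(pred, obs):
    """Root mean squared error between two 1-D arrays."""
    return float(np.sqrt(np.mean((np.asarray(pred) - np.asarray(obs)) ** 2)))


def permutation_importance(predict, X, y, rng, n_repeats=10):
    """Mean increase in RMSE after shuffling each feature column in turn.

    `predict` is any callable mapping an (n, p) array to n predictions.
    A larger increase in RMSE means the model relies more on that feature.
    """
    base = rmse(predict(X), y)
    increase = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        for _ in range(n_repeats):
            X_perm = X.copy()
            X_perm[:, j] = rng.permutation(X_perm[:, j])  # break the feature-target link
            increase[j] += rmse(predict(X_perm), y) - base
    return increase / n_repeats


def expected_gradients(grad_f, x, baselines, rng, n_samples=200):
    """Monte Carlo estimate of the expectation in Eq. 4 for one input vector x.

    `grad_f(z)` must return the gradient of the model output with respect to
    its inputs, evaluated at z. Baselines are drawn with replacement.
    """
    x = np.asarray(x, dtype=float)
    rows = rng.integers(0, len(baselines), size=n_samples)
    alphas = rng.uniform(0.0, 1.0, size=n_samples)
    total = np.zeros_like(x)
    for row, alpha in zip(rows, alphas):
        x_base = np.asarray(baselines[row], dtype=float)
        point = x_base + alpha * (x - x_base)  # gamma(alpha) on the straight path
        total += (x - x_base) * grad_f(point)
    return total / n_samples


# Hypothetical toy model: a linear function, so its gradient is constant.
weights = np.array([2.0, -1.0, 0.5])


def toy_predict(X):
    return X @ weights


def toy_grad(z):
    return weights


rng = np.random.default_rng(0)
X_toy = rng.normal(size=(50, 3))
y_toy = toy_predict(X_toy)
print(permutation_importance(toy_predict, X_toy, y_toy, rng))
print(expected_gradients(toy_grad, X_toy[0], X_toy, rng))
```

For a real network, `grad_f` would come from automatic differentiation rather than a closed-form gradient. The sample count of 200 mirrors the number of baseline samples stated in the text.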


Fig. 2. Machine learning model results for training period (a, b) and the testing period (c, d). Panel (e) shows the testing period model results broken
down by river mile interval, with the standard deviation among 10 replicates shown with error bars.

Figure 4a,c,e shows the annual time series with observed and modeled salt front location for both models. In 2016, both models capture the general structure of the salt front record but fail to capture the maximal extent of the salt front intrusion. In 2018, the COAWST model shows a consistent underprediction resulting in poor overall performance, but shows good sensitivity, matching the higher frequency fluctuations of the salt front record similar to the ML model (Fig. 4c). In 2019, both models capture the rising limb of the salt front peak similarly, but the COAWST model shows superior performance in matching the maximum salt front location (Fig. 4e).

Fig. 3. Comparison of ML model with COAWST model for each year in the testing set (a) and for each river mile interval (b). The standard deviation among 10 replicates for the ML model is shown with error bars.
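The comparisons above rest on RMSE, the Nash-Sutcliffe efficiency (NSE), and RMSE binned by river mile interval. A minimal sketch of those calculations is given below, assuming the bins are defined by the observed salt front location. The interval edges follow the river mile breaks discussed in the text, but the function names and the four data points are made up for illustration and are not results from this study.

```python
import numpy as np


def rmse(obs, pred):
    """Root mean squared error, in the units of the inputs (here river miles)."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return float(np.sqrt(np.mean((pred - obs) ** 2)))


def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return float(1.0 - np.sum((pred - obs) ** 2) / np.sum((obs - obs.mean()) ** 2))


def rmse_by_interval(obs, pred, edges):
    """RMSE within bins [edges[i-1], edges[i]) of the observed location.

    Bins with no observations are reported as NaN.
    """
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    which = np.digitize(obs, edges)  # bin index for each observation
    result = {}
    for i in range(1, len(edges)):
        mask = which == i
        key = (edges[i - 1], edges[i])
        result[key] = rmse(obs[mask], pred[mask]) if mask.any() else float("nan")
    return result


# Interval edges in river miles; the observations and predictions are invented.
RM_EDGES = [54, 58, 68, 70, 78, 82, float("inf")]
obs_rm = [60.0, 62.0, 75.0, 85.0]
pred_rm = [61.0, 61.0, 77.0, 82.0]

print(rmse(obs_rm, pred_rm), nse(obs_rm, pred_rm))
print(rmse_by_interval(obs_rm, pred_rm, RM_EDGES))
```

Repeating the binned calculation for each replicate model and taking the standard deviation across replicates would give error bars of the kind shown in the figures.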


Fig. 4. Comparison of predictions from COAWST (orange) and ML model (blue) for the salt front location for 2016 (a, b), 2018 (c, d), and 2019 (e, f).
Predictive performance for COAWST and ML models for years during the ML testing period (b, d, f) binned by performance at different river mile inter-
vals. The standard deviation among 10 replicates for the ML model is shown with error bars.

When aggregated across all 3 yr of comparison, the ML model outperforms the COAWST model across nearly all RM intervals (Fig. 3b); however, much of the poor performance in the COAWST model is driven by 2018, which was a very wet year with higher-than-average flows. The COAWST model shows no consistent performance across RM intervals, with worst performance at higher RM intervals in 2018 (Fig. 4d) and best performance in 2019 (Fig. 4f).

Comparison of functional performance
To assess functional performance of the ML model and COAWST, we focus on the relationship between 7-d average salt front location and discharge of the Delaware River measured at Trenton, NJ (Fig. 5a) at a time lag of 7 d, and the Schuylkill River at Philadelphia, PA (Fig. 5b) at a time lag of 8 d. For example, the salt front location on 10 October 2018 is the average salt front location from 03 October 2018 to 10 October 2018, and its relationship to Delaware discharge on 03 October 2018 is assessed (02 October 2018 for Schuylkill discharge). The time lags were chosen as they showed the strongest correlation between discharge at the two locations and salt front in the observational data (Supporting Information Fig. S1) and, as such, represent important time scales of information transfer.

Fig. 5. Functional performance for (a) the relationship between the discharge 7 d previously in the Delaware River at Trenton, NJ and the 7-d average salt front location and (b) the discharge 8 d previously in the Schuylkill River at Philadelphia, PA and the 7-d average salt front location plotted against predictive performance for each year (symbol) and model (color). A functional performance value of 0 indicates perfect fidelity to observed relationships.

The ML model results show over-random behavior (functional performance < 0) for 2016 and 2018 for both Delaware and Schuylkill discharge (Fig. 5a,b). Average functional performance for these 2 yr of −0.062 for the Delaware and −0.065
for the Schuylkill indicate that the ML model is underutilizing the information contained within the two discharge records by approximately 6%. However, the functional performance for 2019 shows near-perfect representation of the discharge-salt front relationship for the Delaware (0.002) and the Schuylkill (0.004).
Averaged across all years, the COAWST model shows more accurate functional performance (closer to zero), with mean functional performance 0.013 and 0.009 for the Delaware and Schuylkill, respectively (Fig. 5a,b). The ML model shows 0.049 and 0.052 for Delaware and Schuylkill, respectively. There is no consistent relationship between the functional performance and predictive performance (RMSE) for either model (Fig. 5), which suggests that there is not an obvious tradeoff between functional and predictive accuracy. For example, in 2018 the COAWST model shows poor predictive performance (RMSE = 8.46 RM), yet it also shows superior functional performance compared to the ML models. For both the ML and COAWST models, 2018 shows the least accurate functional performance. The year 2018 was particularly wet, with the highest annual cumulative flow during the modeling period in the Schuylkill and the second highest annual total in the Delaware, which represent conditions outside the range seen during ML training (U.S. Geological Survey 2022).

Feature importance and EGs
The permutation feature importance (Supporting Information Fig. S7) shows that the discharge of the Delaware and Schuylkill Rivers are the most and second most important variables, respectively, followed by the air temperature and wind speed. The feature importance calculation for every input variable besides the two discharge variables resulted in very small decreases in performance (< 0.1 RMSE), indicating that the model relies heavily on information from discharge to make its predictions. A version of the model with only Delaware and Schuylkill discharge as predictors showed an RMSE = 2.81 RM, compared to 2.52 RM for the full set of predictors, suggesting that even though individually the additional features contribute only a small amount to improve predictive performance, together they do improve the model.
The ML model EGs for 2019 are shown in Fig. 6b–d. The year 2019 was chosen to highlight this method as the shift from high discharge periods (Jan–Jul) to a prolonged summer low-flow period (Aug–Oct) showed the clearest pattern in EGs throughout the year. The highest magnitude EGs throughout the year are generally discharge from the Delaware and the Schuylkill Rivers (Fig. 6b). These gradients respond to the discharge record and attribute large decreases in the predicted salt front with high discharge events. For example, the large discharge event in January in the Delaware River (Fig. 6a; blue line) results in a large magnitude negative gradient (decreased salt front prediction) for Delaware discharge (Fig. 6b; red line). Similarly, a large discharge event in July in the Schuylkill River (Fig. 6a; green line) results in a large magnitude negative gradient (decreased salt front prediction) for Schuylkill discharge (Fig. 6b; gold line).

Fig. 6. Salt front observations and ML predictions plotted with discharge from the Delaware River at Trenton and the Schuylkill River for 2019 (a). Daily expected gradients for model predictor variables (b–d).

The EGs associated with discharge peaks are generally negative, reflecting the fact that discharge values greater than baseline generally tend to result in a decrease in the location of the predicted salt front. Similarly, generally positive EG values for discharge during the summer months and other low-flow periods show that the model is attributing an increase in the predicted salt front to below baseline low-flow conditions. Near the end of the summer low-flow period, EGs show the
model is attributing a decrease in the predicted salt front to an increase in discharge, such as the Delaware discharge in October and November that pushes the salt front downstream from its annual maximum. Supporting Information Fig. S6 shows a schematic representation of how the EGs are accumulated during discharge events and low-flow periods.
During the summer low-flow period, when there are fewer discharge events, the magnitudes of the discharge EGs are smaller than other parts of the year. During this period, the water level residuals (Fig. 6c) and the wind direction EGs (Fig. 6d) show greater magnitude, indicating that the model is relying more heavily on tidal and meteorological input data during periods of low discharge.

Discussion
Spatial variability in modeling results
The ML models showed consistent spatial variability in model predictive performance (Figs. 2e, 4b,d,f), with generally better performance when the salt front was in the intervals 58–78 RM, and worse above or below this range. This is likely a result of two factors: first, during the training and testing periods, 85% and 87% of the observations of the salt front location fall in the interval 58–82 RM, respectively. This results in fewer opportunities for the model to learn the dynamics of the more extreme values, leading the model to make predictions that are biased toward central values. Similarly, during the training period, the maximum salt front location was RM 89, while during the testing period, there were 13 d in which the salt front exceeded that value, indicating that the testing period represented an extrapolation into unseen conditions. These observations are consistent with other studies that have shown ML models (Frame et al. 2022; Kayalvizhi et al. 2022) and other modeling approaches (Brunner et al. 2021) can struggle to match extreme values even if they are seen during the training period.
The second factor that contributes to the spatial variability in model response is the physical dimensions of the estuary. The model performance was assessed separately in RM intervals using geographic and bathymetric control points. Hydrodynamic studies of the Delaware Bay estuary show that bathymetric control points can exert significant control on salinity dynamics (Cook et al. 2023).
RM 58 is the location of the Chesapeake and Delaware Canal, connecting the Chesapeake and Delaware Bays. Below this point, the estuary opens into the Delaware Bay where salinity dynamics are governed by tidal energy and belong to a distinct regime from the upper estuary. Any salt front locations below RM 54 were removed from the input data, but observations between 54 and 58 were retained to provide a softer lower limit for the model. The ML model generally over-predicts the salt front location over this interval, resulting in an RMSE = 3.84 RM (Fig. 2e). Some of the error in this region is likely due to the influence of the canal, which can have bi-directional flow throughout the year (Wong 2002) and was not included as input in either model.
Upstream of RM 58, the river tapers in width from 5 km at RM 58 to 2.8 km at RM 68. The interval from RM 68 to 70 represents a narrowing of the river below the mouth of the Christina River, which drains the urban and suburban areas surrounding Wilmington, DE. From RM 70 to 78, the river has a narrow, deep central channel. The model shows the best performance over these three intervals (RMSE = 1.92 RM for 58–68; RMSE = 1.68 RM for 68–70; and RMSE = 2.97 RM for 70–78).
Between RM 78 and 82, the deep central channel of the river widens, allowing for more lateral transport and mixing, leading to a more complex relationship between discharge and salt front location and likely also influencing performance (RMSE = 3.83 RM). Above RM 82 the river approaches the urban and suburban areas surrounding Philadelphia, PA and is influenced by discharge from Chester Creek and the Schuylkill River (RMSE = 3.86 RM). These intervals are more strongly influenced by urban runoff, which may complicate salinity dynamics further (Sharp et al. 2009). In addition, these areas contain groundwater wells that are used for drinking water abstraction from the underlying Potomac-Raritan-Magothy formation. Pumping in these wells can induce recharge from the river into the groundwater system, which represents an additional factor not accounted for by this modeling framework (Navoy et al. 2005).

How well is the ML model representing the physical system?
Consistent with other studies of salinity dynamics in the Delaware Bay, the ML model identified discharge in the main stem of the Delaware River as the most important driver of the salt front, followed by discharge in the Schuylkill River (Supporting Information Fig. S7) (Wong 1995; Sharp et al. 2009; Ross et al. 2015). Further analysis of the relationship between discharge and salt front indicates that the ML model is underutilizing the information contained within the two discharge inputs (Fig. 5a,b) during 2016 and 2018, while the functional performance for 2019 was near-perfect. These results suggest that under “normal” conditions, like 2019, the ML model does as well as the COAWST model in representing the functional relationship between discharge and salt front location.
In 2016 and 2018, COAWST shows better functional performance than the ML model. This result is not surprising, as the hydrodynamic model forces physically realistic flow at every time step. Even so, the EGs and feature importance together show that the ML model identified the most important drivers in the system and developed physically realistic relationships between inputs and outputs. Furthermore, it showed that those relationships could be used to predict the salt front location under a wide range of conditions, as 2016 had the lowest combined discharge (Delaware plus Schuylkill)
of any year in the modeling period. As noted, 2016 and 2018 represent low and high discharge end members in the modeling set, respectively, while 2019 represents a more typical discharge year in both the Delaware and Schuylkill Rivers.
The decreased functional performance in 2016 and 2018 may be a result of several factors. First, average conditions are better represented in the training set and therefore more accurately simulated in the testing set, similar to predictive performance results. In addition, the seven- and eight-day time lags used to assess functional performance from Delaware and Schuylkill discharge, respectively, represent the average maximum magnitude correlation between discharge and salt front location throughout the entire modeling time period. However, viewed on an annual basis, the time lag of the maximum magnitude correlation varies as a function of the cumulative discharge amount (Supporting Information Fig. S8). This serves to highlight the year-to-year variability in the duration and strength of system memory that is disrupted by frequent discharge events (e.g., 2018) and extended by prolonged periods of low flow (e.g., 2016). In general, drier years show longer time lags of maximum association between discharge and salt front location, indicating longer system memory under drier conditions. As such, the assessment of functional performance at the time lag of maximum magnitude correlation may be most relevant for median flow years, while greater (lesser) time lags are likely more important for drier (wetter) years. Similar relationships have been noted in other estuary systems; for example, Qiu and Wan (2013) defined a piece-wise autoregressive relationship between discharge and salinity dynamics for different discharge regimes in the Caloosahatchee River Estuary.
While LSTMs are temporally aware ML algorithms and would ideally identify these patterns directly from the data, a process-guided deep learning approach might be particularly useful in such a setting where critical time scales are variable and knowable (Read et al. 2019). To improve model functional performance at a particular time lag, the functional performance calculation could be incorporated into the ML model loss function. However, care must be taken to ensure that there is independent assessment of functional performance and that the modeling approach is not subject to kludging, where the model is tuned to fit an overly narrow set of performance metrics (Clark 1987; Ruddell et al. 2019).
In addition, the modeling target variable is calculated from observations. The interpolation naturally introduces uncertainty into the exact location of the salt front. This uncertainty could be reduced by predicting salinity at a specific location or several locations where observational data are available.

Leveraging the strengths of hydrodynamic and ML models together
The ML model presented in this study can be trained and used to make 20 years of predictions in a wall time of approximately 20 min on 16 2.60 GHz processors, while the COAWST model runs in approximately 50 h on 180 processors to produce a single year of predictions (4000 CPU hours per year). As noted previously, the COAWST model produces 2 TB of model output, with high-resolution estimates of water level, temperature, salinity, and other variables. For detailed investigations of estuarine circulation, the information gained is often worth additional computational cost. However, for prediction tasks such as the location of the salt front, or water quality constituent concentrations at various discrete locations, an ML approach such as the one shown here can produce similarly accurate results at a fraction of the computational cost. This has significant implications for the ability of such a model to help inform management decisions. The relative speed and flexibility of the ML approach mean that the model could be altered or improved more easily based on stakeholder suggestions. For example, in its current form the ML model could be adjusted to make predictions of salinity at several specific locations throughout the estuary, assuming that the training data at those locations were obtainable. In addition, the ML models can be used to run a wider suite of scenarios that might cover potential future climate conditions and management actions than what would be feasible using a hydrodynamic model.
Similar advantages have been noted in several other diverse estuarine systems. For example, in the Chesapeake Bay, salinity and temperature response to climate change scenarios were modeled using statistical trees, a data-driven technique similar to traditional tree-based ML methods (Muhling et al. 2018). In addition, estuary systems along the Gulf of Mexico, USA, the Pearl River in China, and the Danshui River in Taiwan have been successfully modeled using ML approaches with applications to forecasting and operational early warning systems (Chen et al. 2016; Zhou et al. 2021; Weng et al. 2024). In the Caloosahatchee River Estuary in Florida, USA, a statistical model was competitive with a three-dimensional hydrodynamic model and the statistical model was shown to be particularly useful for forecasting in support of upstream water resource decision-making (Qiu and Wan 2013).
Investigating the coupling of the ML model to upstream reservoir operations models to quickly run scenarios could reveal how different timing and magnitude of reservoir releases affect the movement of the salt front. Such information could have direct application to reservoir operation and water resource management and relies heavily on the rigorous model evaluation conducted in the current study to demonstrate the predictive and functional accuracy of the ML modeling approach.
Despite the clear computational advantages, the direct comparison of the ML models with COAWST showed that there were conditions in which the hydrodynamic model consistently outperformed the ML model. For example, the maximum salinity intrusion in 2019 was much better matched by COAWST than the ML model (Fig. 4e). As such, the ML model
should not be used in its current form to make predictions of the extent of salinity intrusion during extreme events when the salt front is ≥ 82 RM. This highlights the utility of a dual modeling approach, where process-based models can be used in conjunction with ML approaches to collectively improve the performance and extrapolative power of each approach. While we focus on a comparison of modeling approaches in this study, future work could be focused on conceptual and/or computational integration of the models, for example through alteration of the ML loss function to conserve mass in the system (Read et al. 2019), using an ML model to emulate COAWST output (Chen et al. 2018), or rewriting portions of the COAWST model in a differentiable language to employ a differentiable ML framework (Feng et al. 2022). Further, the current model could be improved by including additional observational or modeled input data to drive the model. For example, records of discharge and/or salinity from contributing tributaries could increase model performance. In addition, the incorporation of more historical data that represents extreme conditions may help to increase the model performance, particularly at the model extremes. For example, the records of discharge and salt front are available for the drought of record for the area in the 1960s and could be included in future training runs; however, complete meteorological observations are sparser historically, such that tradeoffs might include the need to interpolate input data and/or employ a simpler ML model.

Conclusions
In this study, we developed an ML model to make hindcast predictions of the 7-d average 250 mg L⁻¹ isochlor in the Delaware Bay estuary over the period 2001–2020. In comparison to results from a state-of-the-art hydrodynamic model, the ML model produces overall more accurate results and similarly accurate functional coupling between inputs and outputs at a drastically reduced computational cost. The EGs of the ML model reveal that it has learned physically reasonable relationships, and the model is able to make accurate predictions under a wide range of conditions. While there are areas of potential improvement (e.g., prediction of extreme values, quantification of uncertainty), the current approach already shows promise for flexible exploratory and predictive modeling of estuary salinity dynamics in support of management decisions.

Data availability statement
All data and code used in modeling are available at https://doi.org/10.5066/P9IK5Y45.

References
Alizadeh, M. J., M. R. Kavianpour, M. Danesh, J. Adolf, S. Shamshirband, and K.-W. Chau. 2018. Effect of river flow on the quality of estuarine and coastal waters using machine learning models. Eng. Appl. Comput. Fluid Mech. 12: 810–823. doi:10.1080/19942060.2018.1528480
Aristazabal, M., and R. Chant. 2013. A numerical study of salt fluxes in Delaware Bay Estuary. J. Phys. Oceanogr. 43: 1572–1588. doi:10.1175/JPO-D-12-0124.1
Bahari, N. A. A. B. S., A. N. Ahmed, K. L. Chong, V. Lai, Y. F. Huang, C. H. Koo, J. L. Ng, and A. El-Shafie. 2023. Predicting sea level rise using artificial intelligence: A review. Arch. Computat. Methods. Eng. 30: 4045–4062. doi:10.1007/s11831-023-09934-9
Brunner, M. I., L. Slater, L. M. Tallaksen, and M. Clark. 2021. Challenges in modeling and predicting floods and droughts: A review. WIREs Water 8: e1520. doi:10.1002/wat2.1520
Chen, W., W. Liu, W. Huang, and H. Liu. 2016. Prediction of salinity variations in a tidal estuary using artificial neural network and three-dimensional hydrodynamic models. Comput. Water Energy Environ. Eng. 6: 107–128. doi:10.4236/cweee.2017.61009
Chen, L., S. B. Roy, and P. H. Hutton. 2018. Emulation of a process-based estuarine hydrodynamic model. Hydrol. Sci. J. 63: 783–802. doi:10.1016/j.jhydrol.2022.127675
Clark, A. 1987. The kludge in the machine. Mind Lang. 2: 277–300. doi:10.1111/j.1468-0017.1987.tb00123.x
Codden, C. J., A. M. Snauffer, A. V. Mueller, C. R. Edwards, M. Thompson, Z. Tait, and A. Stubbins. 2021. Predicting dissolved organic carbon concentration in a dynamic salt marsh creek via machine learning. Limnol. Oceanogr.: Methods 19: 81–95. doi:10.1002/lom3.10406
Cook, S. E., J. C. Warner, and K. L. Russell. 2023. A numerical investigation of the mechanisms controlling salt intrusion in the Delaware Bay estuary. Estuar. Coast. Shelf Sci. 283: 108257. doi:10.1016/j.ecss.2023.108257
DeCicco, L., R. Hirsch, D. Lorenz, D. Watkins, and M. Johnson. 2023. dataRetrieval: R package for discovering and retrieving water data. U.S. Federal Hydrologic Web Services.
Delaware River Basin Commission. 2019. An overview of drought in the Delaware Basin. Delaware River Basin Commission. https://www.state.nj.us/drbc/library/documents/drought/DRBdrought-overview_feb2019.pdf
Erion, G., J. D. Janizek, P. Sturmfels, S. M. Lundberg, and S.-I. Lee. 2021. Improving performance of deep learning models with axiomatic attribution priors and expected gradients. Nat. Mach. Intell. 3: 620–631. doi:10.1038/s42256-021-00343-w
Feng, D., J. Liu, K. Lawson, and C. Shen. 2022. Differentiable, learnable, regionalized process-based models with multiphysical outputs can approach state-of-the-art hydrologic prediction accuracy. Water Resour. Res. 58: e2022WR032404. doi:10.1029/2022WR032404
Frame, J. M., F. Kratzert, A. Raney, M. Rahman, F. R. Salas, and G. S. Nearing. 2021. Post-processing the National Water Model with long short-term memory networks for
streamflow predictions and model diagnostics. J. Am. Water Resour. Assoc. 57: 885–905. doi:10.1111/1752-1688.12964
Frame, J. M., and others. 2022. Deep learning rainfall–runoff predictions of extreme events. Hydrol. Earth Syst. Sci. 26: 3377–3392. doi:10.5194/hess-26-3377-2022
Galperin, B., and G. L. Mellor. 1990a. Salinity intrusion and residual circulation in Delaware Bay during the drought of 1984, p. 469–480. In R. T. Cheng [ed.], Residual currents and long-term transport. Springer. doi:10.1007/978-1-4613-9061-9_32
Galperin, B., and G. L. Mellor. 1990b. A time-dependent, three-dimensional model of the Delaware Bay and river system. Part 2: Three-dimensional flow fields and residual circulation. Estuar. Coast. Shelf Sci. 31: 255–281. doi:10.1016/0272-7714(90)90104-Y
Garvine, R. W., R. K. McCarthy, and K.-C. Wong. 1992. The axial salinity distribution in the Delaware estuary and its weak response to river discharge. Estuar. Coast. Shelf Sci. 35: 157–165. doi:10.1016/S0272-7714(05)80110-6
Guerra-Chanis, G., M. Reyes-Merlo, M. Diez-Minguito, and A. Valle-Levinson. 2019. Saltwater intrusion in a subtropical estuary. Estuar. Coast. Shelf Sci. 217: 28–36. doi:10.1016/j.ecss.2018.10.016
Haidvogel, D. B., H. G. Arango, K. Hedstrom, A. Beckmann, P. Malanotte-Rizzoli, and A. F. Shchepetkin. 2000. Model evaluation experiments in the North Atlantic Basin: Simulations in nonlinear terrain-following coordinates. Dyn. Atmos. Oceans 32: 239–281. doi:10.1016/S0377-0265(00)00049-X
Haidvogel, D. B., and others. 2008. Ocean forecasting in terrain-following coordinates: Formulation and skill assessment of the regional ocean modeling system. J. Comput. Phys. 227: 3595–3624. doi:10.1016/j.jcp.2007.06.016
Hersbach, H., and others. 2020. The ERA5 global reanalysis. Q. J. Roy. Meteorol. Soc. 146: 1999–2049. doi:10.1002/qj.3803
Hochreiter, S., and J. Schmidhuber. 1997. Long short-term memory. Neural Comput. 9: 1735–1780. doi:10.1162/neco.1997.9.8.1735
Kayalvizhi, S., D. Kerins, C. Shen, and L. Li. 2022. Riverine nitrate concentrations predominantly driven by human, climate, and soil property in contiguous United States. Water Res. 226: 119295. doi:10.1016/j.watres.2022.119295
Kirchner, J. W. 2006. Getting the right answers for the right reasons: Linking measurements, analyses, and models to advance the science of hydrology. Water Resour. Res. 42: W03S04. doi:10.1029/2005WR004362
Konapala, G., S.-C. Kao, and N. Addor. 2020. Exploring hydrologic model process connectivity at the continental scale through an information theory approach. Water Resour. Res. 56: e2020WR027340. doi:10.1029/2020WR027340
Kratzert, F., D. Klotz, C. Brenner, K. Schulz, and M. Herrnegger. 2018. Rainfall–runoff modelling using long short-term memory (LSTM) networks. Hydrol. Earth Syst. Sci. 22: 6005–6022. doi:10.5194/hess-22-6005-2018
Mandarano, L. A., and R. J. Mason. 2013. Adaptive management and governance of Delaware River water resources. Water Policy 15: 364–385. doi:10.2166/wp.2012.077
Meyer, E. S., D. P. Sheer, P. V. Rush, R. M. Vogel, and H. E. Billian. 2020. Need for process based empirical models for water quality management: Salinity Management in the Delaware River Basin. J. Water Resour. Plan. Manag. 146: 05020018. doi:10.1061/(ASCE)WR.1943-5452.0001260
Muhling, B. A., C. F. Gaitan, C. A. Stock, V. S. Saba, D. Tommasi, and K. W. Dixon. 2018. Potential salinity and temperature futures for the Chesapeake Bay using a statistical downscaling spatial disaggregation framework. Estuar. Coasts 41: 349–372. doi:10.1007/s12237-017-0280-8
Mukai, A. Y., J. J. Westerink, R. A. Luettich, and D. Mark. 2001. Eastcoast 2001, a tidal constituent database for Western North Atlantic, Gulf of Mexico, and Caribbean Sea (ERDC/CHL TR-02-24). US Army Corps of Engineers.
Nash, J. E., and J. V. Sutcliffe. 1970. River flow forecasting through conceptual models part I—A discussion of principles. J. Hydrol. 10: 282–290. doi:10.1016/0022-1694(70)90255-6
Hutson, S. S., K. S. Linsey, B. Reyes, and J. L. Shourds. 2016.
Navoy, A. S., L. M. Voronin, and E. Modica. 2005. Vulnerability of production wells in the Potomac-Raritan-Magothy
Estimated use of water in the Delaware River basin in Dela- aquifer system to saltwater intrusion from the Delaware
ware, New Jersey, New York, and Pennsylvania, 2010 River in Camden, Gloucester, and Salem Counties, New Jersey
(Scientific Investigations Report 2015-5142). U.S. Geological (Scientific Investigations Report 2004-5096). U.S. Geological
Survey. doi:10.3133/sir20155142 Survey. doi:10.3133/sir20045096
Jia, X., and others. 2021. Physics-guided recurrent graph NOAA Center for Operational Oceanographic Products and
model for predicting flow and temperature in river net- Services. n.d. NOAA Tide predictions users guide. https://
works. Proceedings of the 2021 SIAM International Confer- tidesandcurrents.noaa.gov/PageHelp.html
ence on Data Mining (SDM): SIAM. doi:10.1137/1. NOAA National Centers for Environmental Information.
9781611976700.69 2005. Delaware Bay, DE/NJ (M090) bathymetric digital ele-
Jiang, S., Y. Zheng, C. Wang, and V. Babovic. 2022. Uncovering vation model (30 meter resolution) Derived from source
flooding mechanisms across the contiguous United States hydrographic survey soundings collected by NOAA.
through interpretive deep learning on representative catch- NOAA National Estuarine Research Reserve System (NERRS).
ments. Water Resour. Res. 58: e2021WR030185. doi:10. n.d. System-wide monitoring program.
1029/2021WR030185 NOAA National Ocean Services. n.d. Tides and currents.

Gorski et al. Deep learning model of Delaware Bay

Philadelphia Water Department. 2015. Green City, clean waters: Tidal waters water quality model–Bacteria and dissolved oxygen.
Philadelphia Water District. 2019. PWD water supply planning: Salinity intrusion in the Delaware: Presentation to the regulated flow advisory committee.
Preucil, A., and K. Reavy. 2020. Description and data source for the flow management snapshot. Delaware River Basin Commission. https://drbc.maps.arcgis.com/apps/dashboards/690464a9958b49e5b49550964641ffd7
Qi, S., and others. 2023. Salinity modeling using deep learning with data augmentation and transfer learning. Water 15: 2482. doi:10.3390/w15132482
Qiu, C., and Y. Wan. 2013. Time series modeling and prediction of salinity in the Caloosahatchee River estuary. Water Resour. Res. 49: 5804–5816. doi:10.1002/wrcr.20415
R Core Team. 2022. R: A language and environment for statistical computing. R Foundation for Statistical Computing.
Rahmani, F., K. Lawson, W. Ouyang, A. Appling, S. Oliver, and C. Shen. 2020. Exploring the exceptional performance of a deep learning stream temperature model and the value of streamflow data. Environ. Res. Lett. 16: 024025. doi:10.1088/1748-9326/abd501
Read, J. S., and others. 2019. Process-guided deep learning predictions of lake water temperature. Water Resour. Res. 55: 9173–9190. doi:10.1029/2019WR024922
Ross, A. C., R. G. Najjar, M. Li, M. E. Mann, S. E. Ford, and B. Katz. 2015. Sea-level rise and other influences on decadal-scale salinity variability in a coastal plain estuary. Estuar. Coast. Shelf Sci. 157: 79–92. doi:10.1016/j.ecss.2015.01.022
Ruddell, B. L., and P. Kumar. 2009. Ecohydrologic process networks: 1. Identification. Water Resour. Res. 45: W03419. doi:10.1029/2008WR007279
Ruddell, B. L., D. T. Drewry, and G. S. Nearing. 2019. Information theory for model diagnostics: Structural error is indicated by trade-off between functional and predictive performance. Water Resour. Res. 55: 6534–6554. doi:10.1029/2018WR023692
Sadler, J. M., A. P. Appling, J. S. Read, S. K. Oliver, X. Jia, J. A. Zwart, and V. Kumar. 2022. Multi-task deep learning of daily streamflow and water temperature. Water Resour. Res. 58: e2021WR030138. doi:10.1029/2021WR030138
Schreiber, T. 2000. Measuring information transfer. Phys. Rev. Lett. 85: 461–464. doi:10.1103/PhysRevLett.85.461
Sharp, J. H., C. H. Culberson, and T. M. Church. 1982. The chemistry of the Delaware estuary. General considerations. Limnol. Oceanogr. 27: 1015–1028. doi:10.4319/lo.1982.27.6.1015
Sharp, J. H., L. A. Cifuentes, R. B. Coffin, J. R. Pennock, and K.-C. Wong. 1986. The influence of river variability on the circulation, chemistry, and microbiology of the Delaware estuary. Estuaries 9: 261–269. doi:10.2307/1352098
Sharp, J. H., K. Yoshiyama, A. E. Parker, M. C. Schwartz, S. E. Curless, A. Y. Beauregard, J. E. Ossolinski, and A. R. Davis. 2009. A biogeochemical view of estuarine eutrophication: Seasonal and spatial trends and correlations in the Delaware Estuary. Estuar. Coasts 32: 1023–1043. doi:10.1007/s12237-009-9210-8
Shchepetkin, A. F., and J. C. McWilliams. 2005. The regional oceanic modeling system (ROMS): A split-explicit, free-surface, topography-following-coordinate oceanic model. Ocean Model. 9: 347–404. doi:10.1016/j.ocemod.2004.08.002
Tennant, C., L. Larsen, D. Bellugi, E. Moges, L. Zhang, and H. Ma. 2020. The utility of information flow in formulating discharge forecast models: A case study from an arid snow-dominated catchment. Water Resour. Res. 56: e2019WR024908. doi:10.1029/2019WR024908
U.S. Geological Survey. 2022. USGS Water Data for the Nation: U.S. Geological Survey National Water Information System Database. doi:10.5066/F7P55KJN
U.S. Geological Survey, Office of the Delaware River Master. 2017. Agreement for a flexible flow management program.
Walker, K. 2023. Tigris: Load Census TIGER/line Shapefiles.
Warner, J., C. Sherwood, R. Signell, C. Harris, and H. Arango. 2008. Development of a three-dimensional, regional, coupled wave, current, and sediment-transport model. Comput. Geosci. 34: 1284–1306. doi:10.1016/j.cageo.2008.02.012
Warner, J. C., B. Armstrong, R. He, and J. B. Zambon. 2010. Development of a coupled ocean-atmosphere-wave-sediment transport (COAWST) modeling system. Ocean Model. 35: 230–244. doi:10.1016/j.ocemod.2010.07.010
Warner, J. C., and T. S. Kaira. 2022. Collection of COAWST model forecast for the US East Coast and Gulf of Mexico.
Weng, P., Y. Tian, H. Zhou, Y. Zheng, and Y. Jiang. 2024. Saltwater intrusion early warning in Pearl River Delta based on the temporal clustering method. J. Environ. Manag. 349: 119443. doi:10.1016/j.jenvman.2023.119443
Whitney, M. M., and R. W. Garvine. 2006. Simulating the Delaware Bay buoyant outflow: Comparison with observations. J. Phys. Oceanogr. 36: 3–21. doi:10.1175/JPO2805.1
Wong, K.-C. 1995. On the relationship between long-term salinity variations and river discharge in the middle reach of the Delaware estuary. J. Geophys. Res.: Oceans 100: 20705–20713. doi:10.1029/95JC01406
Wong, K.-C. 2002. On the spatial structure of currents across the Chesapeake and Delaware Canal. Estuaries 25: 519–527. doi:10.1007/BF02804887
Wong, K.-C., and A. Münchow. 1995. Buoyancy forced interaction between estuary and inner shelf: Observation. Cont. Shelf Res. 15: 59–88. doi:10.1016/0278-4343(94)P1813-Q
Zaini, N., L. W. Ean, A. N. Ahmed, and M. A. Malek. 2022. A systematic literature review of deep learning neural network for time series air quality forecasting. Environ. Sci. Pollut. Res. 29: 4958–4990. doi:10.1007/s11356-021-17442-1
Zhi, W., D. Feng, W.-P. Tsai, G. Sterle, A. Harpold, C. Shen, and L. Li. 2021. From hydrometeorology to river water quality: Can a deep learning model predict dissolved
oxygen at the continental scale? Environ. Sci. Technol. 55: 2357–2368. doi:10.1021/acs.est.0c06783
Zhou, J., M. J. Deitch, S. Grunwald, E. J. Screaton, and M. Olabarrieta. 2021. Effect of Mississippi River discharge and local hydrological variables on salinity of nearby estuaries using a machine learning algorithm. Estuar. Coast. Shelf Sci. 263: 107628. doi:10.1016/j.ecss.2021.107628

Acknowledgments
GG, SC, AS, AA, TT, JS, and ST were funded by the USGS Water Availability and Use Science Program. SC and JW were additionally funded by USGS Coastal and Marine Hazards and Resources Program. The authors would like to thank Amy Shallcross and the staff of the Delaware River Basin Commission for thoughtful insight and productive discussion surrounding the framing of the issue and model application. In addition, the authors would like to thank Jeremy Diaz for critical contributions to analysis of ML models. Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the U.S. Government.

Conflict of Interest
The authors declare no conflicts of interest.

Submitted 27 January 2023
Revised 05 September 2023
Accepted 06 March 2024

Deputy Editor: Julia C. Mullarney
