
Journal of Hydrology 609 (2022) 127726


Research papers

Data-driven rapid flood prediction mapping with catchment generalizability
Zifeng Guo a,*, Vahid Moosavi a, João P. Leitão b
a Swiss Federal Institute of Technology Zurich (ETHZ), Switzerland
b Swiss Federal Institute of Aquatic Science and Technology (Eawag), Switzerland

A R T I C L E I N F O

This manuscript was handled by Emmanouil Anagnostou, Editor-in-Chief, with the assistance of Emad Hasan, Associate Editor

Keywords:
Pluvial flood prediction
Data-driven modeling
Surrogate flood modeling

A B S T R A C T

Data-driven and machine learning models have recently received increasing interest as a way to resolve the computational speed challenge faced by various physically-based simulations. A few studies have explored the application of these models to develop new, and fast, applications for fluvial and pluvial flood prediction, extent mapping, and flood susceptibility assessment. However, most studies have focused on model development for specific catchment areas, drainage networks or gauge stations. Hence, their results cannot be directly reused in other contexts unless extra data are available and the models are further trained. This study explores the generalizability potential of convolutional neural networks (CNNs) as flood prediction models. The study proposes a CNN-based model that can be reused in different catchment areas with different topography once the model is trained. The study investigates two options, patch- and resizing-based options, to process catchment areas of different sizes and different shapes. The results showed that the CNN-based model predicts accurately on "unseen" catchment areas with significantly less computational time when compared to physically-based models. The obtained results also suggest that the patch-based option is more effective than the resizing-based option in terms of prediction accuracy. In addition, all experiments have shown that the prediction of flow velocity is more accurate than that of water depth, suggesting that the water accumulation is more sensitive to global elevation information than flow velocity.

* Corresponding author.
E-mail address: guo@arch.ethz.ch (Z. Guo).

https://doi.org/10.1016/j.jhydrol.2022.127726
Received 16 September 2021; Received in revised form 25 January 2022; Accepted 10 March 2022
Available online 16 March 2022
0022-1694/© 2022 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

1. Introduction

Solving flow- and flood-related problems using data-driven and machine learning models has recently become a research field receiving growing attention. Compared to the conventional physically-based models governed by systems of differential equations, data-driven flood models can be considered "surrogates" that offer two major advantages. First, data-driven models can produce relatively accurate predictions without requiring full a priori knowledge of the phenomena. The accuracy of the model is related to the amount, quality, and diversity of data available. This feature suggests that, by learning from the observational data, we can bypass the complexity issue of physically-based models caused by the increasing number of influential factors (e.g., Wang et al., 2018; Liu et al., 2020). Second, data-driven flood models substitute the iterative process (i.e., the numerical integration of differential equations) with non-iterative operations (such as the forward propagation of neural networks). Therefore, data-driven models benefit from parallel computing techniques and significantly accelerate the computational process compared with physically-based models. This feature is extremely important for applications that require a considerable number of simulations or real-time predictions, for example, simulation-driven optimizations (e.g., Feng et al., 2016; Mustafa et al., 2018) and animations (e.g., Ladický et al., 2015).

1.1. Current status and challenges of data-driven flood modeling

A considerable number of studies have been conducted on data-driven flood modeling using different methods such as decision trees (Tehrany et al., 2013), logistic regressions (e.g., Ladický et al., 2015; Tehrany et al., 2017), support vector machines (e.g., Huang et al., 2014; Tehrany et al., 2019; Wang et al., 2020), and self-organizing maps (e.g., Zaghloul 2017; Leitão et al., 2018). These studies have shown that data-driven techniques are suitable for a wide range of flow-related problems with promising accuracy when sufficient data are available. However, these learning methods require feature extraction when dealing with raster data such as elevation maps. The feature extraction can be computationally expensive and may not be feasible for larger scale problems (e.g., Zaghloul 2017; Leitão et al., 2018).

Recently, neural networks (NNs) have become a dominant method for data-driven flow and flood modeling due to 1) the ability of NNs to approximate strong and non-linear correlations, 2) the ability to process raw raster data, and 3) the variety of NN architectures to explore. Many studies have reported that NN-based models outperformed other methods in terms of the prediction accuracy for the same task (e.g., Gebrehiwot et al., 2019; Bui et al., 2020; Zhao et al., 2020). The studies of NN-based models can be further differentiated into two directions. The first direction adopts fully-connected networks and convolutional neural networks (CNNs) for spatial flood predictions of catchment areas. This direction typically predicts one pixel at a time using the surrounding information of the pixels (e.g., Berkhahn et al., 2019; Zhao et al., 2019; Zhao et al., 2020; Bui et al., 2020; Wang et al., 2020), or predicts a large area at once using image-to-image modeling techniques (Guo et al., 2020a; Löwe et al., 2021). The second direction typically uses recurrent neural networks (RNNs) to conduct long-term predictions for drainage networks or catchment-level rainfall-runoff relations (e.g., Chang et al., 2004; Chang et al., 2014; Chen et al., 2013; Tan et al., 2018; Kratzert et al., 2019a; Kratzert et al., 2019b; Gude et al., 2020). An exception can be found in Kabir et al. (2020), who used CNNs to process temporal data that were encoded as matrices. The rows of the matrices represent the time series while the columns represent the features.

Recent studies have demonstrated the promising potential of data-driven flood modeling. However, most studies have focused on specific catchment areas or drainage systems (e.g., Chen et al., 2013; Chang et al., 2014; Bui et al., 2020; Kabir et al., 2020; Wang et al., 2020), and their results cannot be directly extrapolated and transferred to other locations without adding more data and further training the models. Data-driven flood models that can generalize to different terrain inputs are not fully explored. Regarding this problem, Mustafa et al. (2018) showed that a neural network trained with artificially generated terrain performed well on other "unseen" terrains that were produced by the same generator. However, the neural network was trained with the user-defined parameters of the generator rather than the raw elevation data. Thus, the result cannot be easily extended to other applications. Berkhahn et al. (2019) trained different models for each of the catchment areas and showed that the same neural network architecture has the potential to be used in different locations. A step further was made by Kratzert et al. (2019a) and Kratzert et al. (2019b), who investigated recurrent neural networks for predicting catchment-level rainfall-runoff relations. The networks were tested on basins that were not included in the training data and outperformed the calibrated traditional hydrology models. Löwe et al. (2021) trained a CNN model using raster data sampled from the same city. The network was based on the concept of U-net, which is a network architecture for image-to-image translation, and showed a promising accuracy on raster samples of the same city that were not used as the training data.

The lack of data-driven models extrapolatable to "unseen" terrains can be explained by two main reasons. First, many machine learning algorithms require the input data to have the same dimensionality, which means we need to develop a systematic representation for catchment areas of different sizes or drainage networks of different topologies. Second, such models require a large amount of flood data to be available as the training data. However, preparing such a dataset is computationally expensive due to the size and the spatial and temporal resolutions of urban-scale flood simulations.

Despite these recent works, other types of predictions, such as surface water depth and flow velocities of different catchment areas, have not yet been well studied. Therefore, researchers and urban planners still lack proper models for large-scale simulation-intensive applications. This situation emphasizes the need to explore data-driven flood models capable of accurate flood predictions on different terrain inputs.

1.2. Objective: investigating the terrain generalizability of data-driven flood models

In this study, we propose a data-driven pluvial flood prediction model that can generalize to different terrain inputs. In other words, once the model is trained, it can be applied to different catchment areas that are not included in the training data. The proposed model represents the pluvial flood prediction as an image-to-image translation task that can be handled by CNNs. Compared to other machine learning algorithms, CNNs offer two advantages: 1) the ability to learn spatial features from the input data, and 2) the ability to avoid the explosive growth of the model's complexity when handling raster (image) data. As CNNs were shown to generalize effectively on various rainfall events for urban-scale inundation prediction (Guo et al., 2020a), we mainly focus on the flood prediction of the same event in different catchment areas. Compared with other CNN-based studies, which focus on specific locations (e.g., Zhao et al., 2019; Zhao et al., 2020; Kabir et al., 2020) or process one raster pixel at a time (e.g., Zhao et al., 2019; Zhao et al., 2020; Khosravi et al., 2020; Wang et al., 2020), our study takes a step further by handling different locations and simplifying the prediction process (i.e., raster-to-raster instead of raster-to-pixel). The main contributions of our study can be summarized as follows:

1. A new data-driven flood prediction model capable of generalizing to different catchment areas, i.e., areas with different topography, and of generating flood predictions in a few seconds with a promising accuracy compared to physically-based simulation results.
2. Two different spatial discretization options to handle catchment areas of different sizes, which can be used as references for further research, and
3. A large pluvial flood dataset generated using a simplified physically-based flood model that can contribute to other related flood prediction studies.

2. Method and materials

2.1. Problem statement

This study focuses on the development of a data-driven flood prediction model capable of terrain generalizability, which means that once the model is trained, it can be used on different catchment areas that were not present in the training data set. As a first step, we concentrate on the generalizability to different terrain inputs and simplify the problem by 1) neglecting other contributing factors such as surface types and land uses to reduce the model's complexity and the amount of data required; 2) focusing on the maximum water depths and flow velocities as they are the key factors for risk assessment and urban planning; and 3) focusing on one rainfall event of approximately 100-year return period to study whether the model successfully transfers the information learnt from some catchments to other catchments.

Our study considers flood prediction as a supervised learning task in which the prediction model is trained using input-output pairs. The inputs are elevation rasters (i.e., spatial maps of terrain elevations) and the outputs are flooding rasters (i.e., spatial maps of maximum water depth and flow velocity). After the training step, the model can predict the maximum water depth and flow velocity from new elevation data that is fed as input. We implement such a prediction model using CNNs to utilize the spatial information of adjacent image pixels (raster cells). However, most existing CNN models have relatively small input sizes (e.g., 256 × 256 pixels) which are significantly smaller than the size of catchment areas (e.g., 3,000 × 3,000 pixels). As it is not feasible to increase the input size of CNNs to the size of catchment areas due to the model's complexity and the memory limitation of most graphics cards, we propose two options to balance between information loss and model implementation: the patch- and the resizing-based options.


Fig. 1. The receptive field of a layer is the number of pixels of the input layer that are “visible” from this layer. k represents kernel size, s is the stride size, and r is the
receptive field.
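The recurrence behind Fig. 1, given as Eq. (1) in Section 2.2, can be sketched in a few lines of Python. The helper names `receptive_field` and `encoder_layers` are hypothetical. The sketch assumes that the two latent convolutions use the kernel size given in the model name; under this assumption the computed values match the receptive fields listed in Table 3.

```python
def receptive_field(layers):
    """Receptive field after a stack of (kernel, stride) layers, per Eq. (1)."""
    field = 0
    jump = 1  # product of the strides of all previous layers
    for index, (kernel, stride) in enumerate(layers):
        if index == 0:
            field = kernel
        else:
            field += (kernel - 1) * jump
        jump *= stride
    return field


def encoder_layers(num_blocks, kernel):
    """Encoder as num_blocks x (conv, conv, max-pool) plus two latent convs."""
    block = [(kernel, 1), (kernel, 1), (2, 2)]  # max-pooling has k = s = 2
    # Assumption: the latent convolutions use the kernel size of the model name.
    return block * num_blocks + [(kernel, 1), (kernel, 1)]


variants = [("1024-k7", 6, 7), ("1024-k3", 6, 3),
            ("512-k7", 5, 7), ("512-k3", 5, 3),
            ("256-k7", 4, 7), ("256-k3", 4, 3)]
for name, blocks, kernel in variants:
    print(name, receptive_field(encoder_layers(blocks, kernel)))
```

The loop prints one receptive field per model variant, which can be compared directly with the "Receptive field" column of Table 3.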

Fig. 2. The prediction model. Note that not all layers are shown, for visualization purposes.
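The full layer sequences that Fig. 2 abbreviates are listed in Table 3. The sketch below regenerates them from the input size alone. The rule "number of encoder blocks = log2(input size) − 4" and the fixed 256 channels in the deepest block are patterns read off Table 3, not statements made by the authors, and `layer_sequence` is a hypothetical helper name.

```python
import math


def layer_sequence(input_size, deepest_channels=256, output_channels=2):
    """Layer names in the notation of Table 3 for a square input."""
    # Pattern inferred from Table 3: 1,024 -> 6 blocks, 512 -> 5, 256 -> 4.
    num_blocks = int(math.log2(input_size)) - 4
    first = deepest_channels // 2 ** (num_blocks - 1)
    encoder_channels = [first * 2 ** i for i in range(num_blocks)]
    encoder = [f"convp({c})" for c in encoder_channels]
    latent = [f"2 × conv({2 * deepest_channels})"]
    # The decoder mirrors the encoder; skip concatenations are not listed.
    decoder = [f"upconv({c})" for c in reversed(encoder_channels)]
    return encoder + latent + decoder + [f"conv({output_channels})"]


def latent_size(input_size):
    """Spatial size of the latent array after all max-pooling layers."""
    num_blocks = int(math.log2(input_size)) - 4
    return input_size // 2 ** num_blocks


for size in (1024, 512, 256):
    print(size, latent_size(size), "; ".join(layer_sequence(size)))
```

Under this reading, every variant compresses its input to a 16 × 16 latent array, so larger inputs need more encoder blocks rather than a larger bottleneck.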

2.1.1. Patch-based option

The patch-based option discretizes catchment areas into patches and assembles the patch-level outputs into the final predictions. This option has been shown to be effective for describing large original objects (e.g., Masci et al., 2015; Ronneberger et al., 2015). However, considering the information loss caused by the patch sampling, we chose a relatively large patch size of 1,024 × 1,024 to preserve as much global information as possible. Also, we over-sample the patches to reduce errors using overlapping pixels (Guo et al., 2020a). During the experiment, we also tested other patch sizes for comparison purposes.

2.1.2. Resizing-based option

The resizing-based option down-samples large catchment areas, and then up-samples the outputs to their original sizes. The purpose of this option is to study whether CNNs can effectively handle resized inputs and make accurate predictions as in other applications such as computer vision (e.g., Badrinarayanan et al., 2017). The resizing-based option preserves global elevation information but destroys local detailed patterns, although the lost details can be regenerated by synthetic up-sampling methods (e.g., Chu and Thuerey, 2017). We chose a large input size (1,024 × 1,024) to preserve as much local information as possible and to make the two options comparable.

2.2. Model design

The CNN model is designed based on the structure of U-Net (Ronneberger et al., 2015), a neural network architecture that is characterized by the skip-connections between shallow and deep layers. The main reason to use U-Net rather than other CNN architectures is to preserve the detailed spatial patterns in the outputs while keeping the receptive field as large as possible. The receptive field refers to the "visible pixels" of the input layer for each output pixel (Luo et al., 2016). It corresponds to the hydrological fact that the water accumulation in a small region is the result of water flowing from larger areas (i.e., from upstream to downstream). A larger receptive field allows the CNN model to learn from the global elevation information rather than only from local terrain patterns (Geirhos et al., 2019). Fig. 1 shows a diagram explaining the concept of receptive field.

The receptive field can be increased by 1) adding more network layers and 2) using larger convolutional kernels. However, these two strategies will "smooth out" the detailed spatial patterns for most CNN architectures (Long et al., 2015). Although small convolutional kernels can improve the output details (Badrinarayanan et al., 2017), using small kernels contradicts the goal of having a global view of the input data. This issue is resolved by skip-connections as they bring shortcuts between the input and output layers. In addition, the skip connections reduce the difficulty of training deep CNNs and improve the accuracy (He et al., 2016).

The structure of our CNN model is shown in Fig. 2, where the input is a terrain raster with four image channels (elevation, slope, aspect, and curvature) and the output is the corresponding flood predictions (water depth and flow velocity). The model consists of an encoder and a decoder. The encoder is a series of convolutional and max-pooling layers which compress the input raster to arrays of smaller sizes. The decoder is a series of up-sampling and convolutional layers which decompress the compressed arrays to the output raster. For each up-sampling layer of the decoder, its output array is concatenated with the array of the same size produced by the encoder. The concatenated arrays are fed to the layer that follows the up-sampling layer.

The number of layers of the CNN model depends on the size of the input. The goal is to have the receptive field in the latent layer (the last layer of the encoder) larger than the input size. The receptive field r_n of the n-th hidden layer of the encoder can be calculated using Eq. (1).

$$
r_n =
\begin{cases}
k_1 & \text{if } n = 1 \\
r_{n-1} + (k_n - 1) \prod_{i=1}^{n-1} s_i & \text{if } n > 1
\end{cases}
\tag{1}
$$

In the equation, k_n and s_n are the kernel size (the size of the convolutional kernel) and the stride of the n-th hidden layer, respectively. For max-pooling layers, k = s. Therefore, the larger the input size, the deeper the network.

Based on this formulation, a good combination that efficiently increases the receptive field of the encoder part is two convolutional layers with k = 7 and s = 1 followed by one max-pooling layer with k = 2 and s = 2. For the decoder part, we used a symmetrical layer sequence and replaced all max-pooling layers with up-sampling layers with k = 2. All convolutional layers of the decoder part have k = 3 in order to better preserve detailed spatial patterns. The activation functions for all except the last convolutional layers are Leaky-ReLU (Maas et al., 2013). The Leaky-ReLU function avoids the "vanishing gradient problem" (Hochreiter, 1998) of the sigmoid functions and the dead neuron problem of the rectified linear function (Nair and Hinton, 2010). The output layer has no activation function and produces unbounded values.

The CNN models were implemented using Keras 2.2.2 (Chollet et al., 2015) and Tensorflow 1.14.0 (Abadi et al., 2016), and were trained using the Adam optimizer (Kingma and Ba, 2015) with a learning rate of 5 × 10⁻⁵. The batch size for all training was two. We used a small batch size due to the memory limitation of the available graphics card (an Nvidia GTX 1080 Ti with 11 GB memory). The loss function for training the models was the mean squared error. All no-data pixels were excluded from the loss functions. For all models, we stopped the training process when their test losses converged to stable values.

2.3. Data source and data processing

The elevation data for this study were collected from the GeoVITe geodata service of ETH Zurich (https://geovite.ethz.ch/). The data were downloaded as 2 m raster tiles and were processed into catchment areas using GIS software. The collected elevation data consist of two regions. The first region is an area of approximately 90 km × 65 km around the Canton of Zurich, Switzerland, containing 649 catchment areas. The second region is located at the cities of Lausanne and Geneva, Switzerland, containing seven catchment areas. The catchment areas of the first region were randomly split into two sets that contain 433 and 216 catchment areas, respectively. The larger set was used as the training dataset and the smaller set was used as the validation dataset. All the catchment areas of the second region were used as the test dataset. Fig. 3 shows the spatial distribution of the dataset, where catchment areas with red boundaries belong to the validation and test datasets. Fig. 4 reports the terrain characteristics of the datasets. It is noteworthy that these characteristics show the average conditions of the catchments, not representing detailed spatial patterns, and that the catchment characteristics were not used as inputs during the training process.

Fig. 3. Spatial distribution of catchment areas that were categorized as training dataset (white) and validation and test datasets (blue). The red polygons represent case studies sampled from the validation and test datasets.

Fig. 4. Terrain characteristics of the training, validation, and test datasets.

The ground truth (i.e., maximum water depths and flow velocities) of all catchment areas was created by conducting simulations using the CADDIES model (Guidolin et al., 2016). CADDIES is a cellular-automata-based flood model capable of relatively fast pluvial flood simulations. All simulations were five hours long and were based on a design 100-year event generated using the alternating block method (Te Chow et al., 1988). Note that this rainfall event of approximately 100-year return period was used on the entire region without considering the rainfall characteristics of the different locations. The purpose of using the same event is to focus on the main factor (i.e., terrain elevations) and keep other numerical inputs constant. The rainfall intensity of the event is shown in Table 1.

Table 1
The design rainfall event (approximately 100-year return period) used for the simulations.
Rainfall interval (min) 0–5 5–10 10–15 15–20 20–25 25–30 30–35 35–40 40–45 45–50 50–55 55–60
Rainfall intensity (mm/h) 24.1 26.8 30.7 37.0 50.1 161.4 65.6 42.1 33.4 28.6 25.3 23.0

The training data for the patch- and resizing-based options were prepared differently. For the patch-based option, the training data were patches (input-output pairs) randomly sampled from the catchment areas of the training dataset. Each patch contains three raster maps: the terrain elevations, the maximum water depth, and the maximum flow velocity. We oversampled the patches so that the total number of patch pixels is three times larger than the total number of catchment pixels. For the resizing-based option, the training data were raster maps produced by resizing each of the catchment areas in the training dataset. The resizing was based on the longer side of the original catchment areas, and the shorter side of the results was padded with 0s to match the input size of the CNNs. In addition, data augmentation techniques that flip and rotate the inputs were used to increase the number of training data during the training process. The CNN models were trained using the training set only and were evaluated using both validation and test sets.

For both options, the training data were pre-processed before being fed to the CNN models. The data pre-processing consists of three steps: 1) computing terrain features, including slope, aspect, and curvature, from the raw elevation data x_raw using the approach proposed by De Smith et al. (2007); 2) rescaling the raw elevation x_raw to c(max(x_raw) − x_raw), where max returns the maximum value and c is a constant; and 3) concatenating the terrain features and the rescaled elevation to raster maps of multiple image channels. All no-data pixels were filled with 0s during data preprocessing. We trained the CNNs with and without data preprocessing and found that, although the CNNs can learn and converge using raw elevation data, the learning process was much faster using preprocessed training data. The test run also showed that smaller c such as 0.01 performed better than larger values in terms of the learning speed. This result is reported in the appendix.

2.4. Model validation

2.4.1. Performance evaluation

The performance of the CNN models was evaluated in terms of prediction accuracy and computational time. The prediction accuracy was assessed by comparing the ground truths and the prediction results, using two commonly used performance indicators: the modified index of agreement (d1) and the root mean squared error (RMSE). Both indicators show the deviation of the prediction results from the ground truths; the d1 value emphasizes the relative errors (Willmott et al., 1985), while the RMSE focuses on the absolute errors. The calculation of these indicators is reported in Table 2. As the output of the CNN models consists of two image channels, the maximum water depth and the maximum flow velocity, the indicators were computed individually for each of the output channels. The computation was conducted using inundated pixels only, which means both the no-data and the non-inundated pixels were excluded. The inundated pixels were selected by filtering the pixels where either water depth or flow velocity exceeds the given thresholds. These thresholds were set to 0.05 m and 0.05 m/s. Lastly, these performance indicators were individually computed for each of the catchments, rather than for the entire dataset.

Table 2
Performance indicators used by the experiments.
Performance indicator: Modified index of agreement (d1). Formula: $d_1 = 1 - \dfrac{\sum_{i=1}^{n} \lvert y_i - y'_i \rvert}{\sum_{i=1}^{n} \left( \lvert y'_i - \bar{y} \rvert + \lvert y_i - \bar{y} \rvert \right)}$. Range: [0, 1]. Optimal score: 1.
Performance indicator: RMSE. Formula: $\mathrm{RMSE} = \sqrt{\dfrac{\sum_{i=1}^{n} \left( y_i - y'_i \right)^2}{n}}$. Range: [0, +∞). Optimal score: 0.
Here $y_i$ and $y'_i$ are read as the simulated and predicted values at pixel $i$, $\bar{y}$ as the mean simulated value, and $n$ as the number of evaluated pixels, following the usual form of the index in Willmott et al. (1985).

The computational time was measured by the average prediction time for each catchment area. For the resizing-based option, this time is equivalent to the time for processing one input raster. For the patch-based option, the time depends on the size of the catchment area, i.e., how many patches are sampled. In our experiment, we sampled the patches by moving a patch-size × patch-size window horizontally and vertically with a step of patch-size/2 until the entire catchment area is reached. The patch-size depends on the input size of the CNN model, for example, 1,024 × 1,024 pixels. In addition to the prediction time, the time for necessary data preprocessing was also measured.

2.4.2. Validation experiments

The evaluation process described above was conducted in multiple validation experiments to compare the performance of different options (patch-based and resizing-based options) and different CNN model variants. Each validation experiment produces a list of d1 and RMSE values that show the performance of the corresponding model on all catchments.

The CNN model variants that were tested in the validation experiments are listed in Table 3. These variants were created by changing the key variables of the CNN model: the input size and the kernel size. Due to the long training process of each CNN model, we did not exhaustively test all possible values. Rather, we selected three input sizes and two kernel sizes, producing six model variants in total. These model variants were named by input size-kernel size. In addition, to reduce the potential information loss, we only tested the resizing-based option with CNN models that have the largest input size (i.e., 1,024 × 1,024 pixels).

Table 3
Different models tested in the validation experiments.
Name      Input size       Receptive field   Kernel size   Tested with
1024-k7   1,024 × 1,024    1588              7             Both options
1024-k3   1,024 × 1,024    572               3             Both options
512-k7    512 × 512        788               7             The patch-based option
512-k3    512 × 512        284               3             The patch-based option
256-k7    256 × 256        388               7             The patch-based option
256-k3    256 × 256        140               3             The patch-based option
All layers shown as a sequence (concatenations are not shown) (1):
1024-k7 and 1024-k3: convp(8); convp(16); convp(32); convp(64); convp(128); convp(256); 2 × conv(512); upconv(256); upconv(128); upconv(64); upconv(32); upconv(16); upconv(8); conv(2)
512-k7 and 512-k3: convp(16); convp(32); convp(64); convp(128); convp(256); 2 × conv(512); upconv(256); upconv(128); upconv(64); upconv(32); upconv(16); conv(2)
256-k7 and 256-k3: convp(32); convp(64); convp(128); convp(256); 2 × conv(512); upconv(256); upconv(128); upconv(64); upconv(32); conv(2)
(1) conv(n) represents one convolutional layer with the kernel size = 3; convp(n) is two convolutional layers with the kernel size specified in the network name, followed by one max pooling layer; upconv(n) is one up-sampling layer followed by two convolutional layers with the kernel size = 3, and n is the number of output image channels.

3. Results and discussions

3.1. Comparing two options and different model designs

The performance of the proposed options and different model variants is presented in Fig. 5. As can be seen, the patch-based option performs significantly better than the resizing-based option. The worst-performing patch-based model shows a higher accuracy than the best-performing resizing-based model. This result indicates that, unlike other CNN-based image recognition tasks that work well on images of different sizes, learning a flood model from resized rasters is difficult. The loss of the detailed terrain information due to image resizing cannot be compensated by having a global view of the terrain.

Fig. 5. Performance of different CNN models in the two proposed options.

For the patch-based option, it is clear that models with a larger receptive field are more accurate for both water depth and flow velocity. This trend can be better seen when different input sizes (such as 512 and 256) or different kernel sizes (such as k7 and k3) are compared. This result indicates that the availability of terrain information at a larger scale is essential for flood predictions. Without a sufficient receptive field, the models would tend to "memorize" the training data rather than generalize well on the test data. Also, when the receptive field decreases, the accuracy drop is less significant for the flow velocity than for the water depth, which indicates that the flow velocity is affected more by the local elevation pattern than by the global terrain information, thus making flow velocity easier to learn and to predict.
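The per-catchment indicators behind these comparisons (Table 2, Section 2.4.1) can be sketched as below. Two points are assumptions, because the text does not settle them: the inundation mask is derived from the simulated (ground-truth) rasters, and no-data pixels are represented as `None`. The function names are hypothetical and the rasters are flattened to plain lists for brevity.

```python
import math


def modified_index_of_agreement(truth, pred):
    """d1 = 1 - sum|y - y'| / sum(|y' - mean(y)| + |y - mean(y)|)."""
    mean_truth = sum(truth) / len(truth)
    numerator = sum(abs(y - p) for y, p in zip(truth, pred))
    denominator = sum(abs(p - mean_truth) + abs(y - mean_truth)
                      for y, p in zip(truth, pred))
    # A zero denominator means truth and prediction are the same constant.
    return 1.0 if denominator == 0 else 1.0 - numerator / denominator


def rmse(truth, pred):
    """Root mean squared error over the given pixels."""
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(truth, pred)) / len(truth))


def inundated_mask(depth, velocity, depth_min=0.05, velocity_min=0.05):
    """True for valid pixels where depth or velocity exceeds its threshold."""
    return [d is not None and v is not None and (d > depth_min or v > velocity_min)
            for d, v in zip(depth, velocity)]


def evaluate_channel(truth, pred, mask):
    """Return (d1, RMSE) for one output channel of one catchment."""
    kept_truth = [y for y, keep in zip(truth, mask) if keep]
    kept_pred = [p for p, keep in zip(pred, mask) if keep]
    if not kept_truth:
        raise ValueError("no inundated pixels in this catchment")
    return modified_index_of_agreement(kept_truth, kept_pred), rmse(kept_truth, kept_pred)


# Toy catchment with five pixels; None marks a no-data pixel.
sim_depth = [0.00, 0.20, 0.50, None, 0.02]
sim_velocity = [0.00, 0.10, 0.30, None, 0.08]
pred_depth = [0.01, 0.25, 0.40, 0.00, 0.03]
mask = inundated_mask(sim_depth, sim_velocity)
print(evaluate_channel(sim_depth, pred_depth, mask))
```

Calling `evaluate_channel` once per output channel and per catchment yields the lists of d1 and RMSE values that each validation experiment produces.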


Fig. 6. Scatter plots of simulation time (x-axis) against prediction times (y-axis). Each dot represents one catchment area.
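The patch-based prediction times in Fig. 6 depend on how many windows are evaluated per catchment. The sketch below follows the sampling rule of Section 2.4.1 (a patch-size window moved with a step of patch-size/2) on a raster stored as nested lists. Averaging the overlapping outputs and padding overrunning windows with 0s are assumptions, since the paper only states that patch-level outputs are assembled and that overlapping pixels reduce errors. The `model` argument is a placeholder for any function that maps a patch to a patch of the same size.

```python
import math


def window_origins(length, patch):
    """Offsets along one axis for windows moved with a step of patch / 2."""
    step = patch // 2
    if length <= patch:
        return [0]
    count = math.ceil((length - patch) / step) + 1
    return [i * step for i in range(count)]


def predict_by_patches(raster, patch, model):
    """Run `model` on half-overlapping windows and average the overlaps."""
    rows, cols = len(raster), len(raster[0])
    total = [[0.0] * cols for _ in range(rows)]
    hits = [[0] * cols for _ in range(rows)]
    for top in window_origins(rows, patch):
        for left in window_origins(cols, patch):
            # Assumption: pixels outside the raster are padded with 0s.
            window = [[raster[r][c] if r < rows and c < cols else 0.0
                       for c in range(left, left + patch)]
                      for r in range(top, top + patch)]
            output = model(window)
            for r in range(top, min(top + patch, rows)):
                for c in range(left, min(left + patch, cols)):
                    total[r][c] += output[r - top][c - left]
                    hits[r][c] += 1
    return [[total[r][c] / hits[r][c] for c in range(cols)] for r in range(rows)]


# Toy check with an identity "model" on a 5 x 7 raster and 4 x 4 patches.
toy = [[float(r * 7 + c) for c in range(7)] for r in range(5)]
assembled = predict_by_patches(toy, 4, lambda window: window)
print(assembled == toy)
```

For a 3,000 × 3,000 catchment and 1,024 × 1,024 patches, `window_origins` returns five offsets per axis, i.e., 25 model evaluations. The count grows with the catchment area, which is consistent with the linear trend reported for the patch-based option.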

Fig. 7. Left and middle: comparison of the two options’ RMSE on water depth and flow velocity, for all catchment areas. Right: relation between the water depth
RMSE and terrain elevation span, where WD represents water depth.

Fig. 8. Spatial plots of sample A. Top: simulated and predicted water depth. Bottom: simulated and predicted flow velocity.


Fig. 9. Spatial plots of sample B. Top: simulated and predicted water depth. Bottom: simulated and predicted flow velocity.

Fig. 10. Spatial plots of sample C. Top: simulated and predicted water depth. Bottom: simulated and predicted flow velocity.

3.2. Time performance

The time comparison between the cellular-automata-based simulations and CNN models is presented in Fig. 6, where each point represents one catchment. The x-axes of the plots represent the simulation time and the y-axes show the prediction time. The orange points consider both the prediction time and the time for necessary data preprocessing, whereas the blue points consider only the prediction time. The plots clearly show that CNNs achieved a significant improvement in computational speed. Results that take approximately 20,000 s by simulation can be obtained in 3 s using CNN-based models. For the patch-based option, the prediction time is linearly correlated with the simulation time. This is due to the increasing number of patches sampled from larger catchment areas. For the resizing-based option, the prediction time (blue) is constant, and the data preprocessing time (orange) slightly increases for catchment areas that cost more simulation time, which is explained by the different sizes of the catchment areas. The baseline experiment shows that the data-processing time remains constant if all elevation data have the same size.

3.3. Case studies

In addition to the mentioned performance indicators, we manually selected four catchment areas from the validation dataset for case


Fig. 11. Spatial plots of sample D. Top: simulated and predicted water depth. Bottom: simulated and predicted flow velocity.

Fig. A1. Sample 1 from the test dataset. Top: simulated and predicted water depth. Bottom: simulated and predicted flow velocity.

studies to assess the model’s spatial performance (their locations are shown in Fig. 3). The case studies were selected because 1) they correspond to relatively extreme RMSE values that represent the best- and worst-performing scenarios for the CNN models, and 2) they represent different terrain conditions (i.e., samples B and D are urban areas, and samples A and C are rural areas). Fig. 7 shows the RMSE of all catchments using both options, where the four selected cases are highlighted with different symbols. Figs. 8–11 show the spatial rendering of the flood predictions of the four case studies. The flood predictions were generated using the patch-based option, and the non-flooded pixels (water depth < 0.05 m or flow velocity < 0.05 m/s) were excluded from the rendering. In the appendix we present more spatial renderings from the test dataset.

Figs. 8 and 9 show two low-RMSE cases (samples A and B) that correspond to rural (sample A) and urban (sample B) conditions, respectively. A visual examination of the spatial rendering clearly shows that both the water depth and flow velocity predictions are reasonably accurate. On the other hand, the flood extent prediction (colored pixels)


Fig. A2. Sample 2 from the test dataset. Top: simulated and predicted water depth. Bottom: simulated and predicted flow velocity.

Fig. A3. The effect of data-processing parameters on model convergence. The S, A, and C represent slope, aspect, and curvature, respectively.


of the downstream area in Fig. 9 shows a visual pattern that differs from the simulation result. However, we would emphasize that this pattern difference can be explained by the nonlinear colormap, which gives more of the color spectrum to smaller values than to larger values. As seen in the error map of Fig. 9 (right column), despite the pattern difference, the accuracy of the downstream area is high.

Figs. 10 and 11 show two high-RMSE cases (samples C and D) that represent the worst-performing scenarios of the CNN model in rural (sample C) and urban (sample D) conditions. Compared with the previous low-RMSE cases, these high-RMSE cases have a larger elevation span and are characterized by underpredictions in the downstream deep-water areas (the dark red areas of the water depth simulations). Although the CNN model was able to recognize a relatively accurate flood extent for these deep-water areas, the numerical accuracy of these areas is substantially lower when compared with that of other areas. On the other hand, despite the underpredictions of the downstream areas, the prediction accuracy of the upstream areas remains high. This suggests that the main challenge for the CNN model is to correctly recognize the effect of water accumulation in large areas.

Although the comparison of these cases suggests that the larger the catchment elevation range, the lower the prediction accuracy, no clear relationship between this terrain feature and the prediction accuracy is observed when all catchments are taken into account (right-hand side plot of Fig. 7). For both the patch- and resizing-based options, one can see (left and middle plots in Fig. 7) that the prediction accuracy varies across the catchment areas. This may indicate that the generalizability of the model and its prediction accuracy can be improved if a larger training dataset is used.

4. Conclusions, limitations, and future works

This study presented a data-driven approach for fast flood prediction using CNNs that is able to generalize to different catchment areas and topographies. The study consists of two experiments that explored different methods for processing catchment areas larger than the input size of CNNs. The results have shown that CNNs exhibit a promising ability to generalize the information learnt from the training data to other unseen terrains, suggesting a potential to serve as a rapid surrogate model for flood predictions of different scenarios. The results of the experiments also showed that CNN models with a larger receptive field tend to have higher accuracy than models with a small receptive field. This suggests that water accumulation is sensitive to the global patterns of the catchment area and that the prediction accuracy depends on how much global information is available. This conclusion was also reported by Tsubaki and Kawahara (2013).

On the other hand, several drawbacks and limitations exist and remain as possible future research directions. The first drawback is that our experiment performed holdout validations rather than k-fold cross validations. This can be explained by the limitations on available computational power. Compared with holdout validations, k-fold cross validations smooth out the results and avoid highly misleading results. However, we considered this drawback acceptable as 1) the goal of this study was not to conduct a benchmark test which compares the performance of our model with other models using a specific dataset, 2) we have trained multiple models, which helps to avoid outliers, and 3) we had a relatively large dataset compared with other studies (as each catchment area produces multiple training samples), which compensates for the lack of cross validations. The second drawback is that we have simplified the problem and only focused on the main contributing factor (i.e., the elevation). Other influential factors such as land cover and spatial distributions of precipitation intensities were neglected. This decision was made to reduce the model’s complexity and the amount of data required for the investigation. Therefore, future investigations are still required before applying this type of flood model to real applications. Another limitation of the presented model is that the prediction accuracy can still be further improved. The mispredictions in some of the catchments suggest that the major challenge is how to recognize the water accumulations from large regions. Possible solutions for this issue include using a larger training dataset, oversampling the deep-water regions, and modifying the loss functions to give more weight to the deep-water regions. Lastly, our study was conducted using CADDIES as the flood data provider and therefore the accuracy was highly dependent on the performance of CADDIES. However, we would like to clarify that the goal was to investigate whether data-driven models can produce results as good as those of physically-based models. The reliability issue of data-driven predictions will be addressed by training with observational data in future works.

As for future works, the systematic encoding of catchment areas of arbitrary sizes and shapes remains a challenge. This issue was handled by the patch- and resizing-based options in our experiments, which showed that the former performed significantly better than the latter on both the validation and test datasets. However, other possible methods worth exploring include testing new neural network architectures, or sampling patches based on flow movement rather than spatial locations (e.g., Chu and Thuerey, 2017). Another interesting direction of future research would be to estimate the flow dynamics based on input constraints such as spatial rainfall intensity. Also, the rapid development of sensor networks has made it possible to collect data by crowdsourcing methods (Zheng et al., 2018) or computer vision techniques (e.g., Moy de Vitry et al., 2019; Gebrehiwot et al., 2019), opening new possibilities to produce observational flood data to be used in the training step of data-driven flood prediction models.

CRediT authorship contribution statement

Zifeng Guo: Data curation, Writing – original draft, Conceptualization, Methodology, Visualization. Vahid Moosavi: Conceptualization, Methodology. João P. Leitão: Conceptualization, Supervision, Writing – review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgement

This study was funded by the China Scholarship Council grant 201706090254.

Data accessibility

The elevation and simulation data of the catchment areas used by this study can be obtained from the data repository (Guo et al., 2020b) hosted by the Research Collection of ETH Zurich with DOI link 10.3929/ethz-b-000453305.

Appendix

Figs. A1 and A2 show two samples from the test dataset. The terrains of the test dataset were not included in the training data and were collected from regions far from where training data were collected. As seen, despite the “unfamiliar” terrains, the accuracy of the CNN model was promising.

As mentioned in Section 4.2, several tests were made to train CNN models with and without terrain features. These tests were made using patches sampled from several catchment areas. The results of these tests are presented as loss curves in Fig. A3, in which the left plot shows the result of different c values, and the right plot shows the result of different terrain features. It is clear from the left plot that the model converges faster as c decreases. However, the improvement in convergence speed becomes less significant when c < 0.01. The right plot suggests that models using multiple features converge faster than those using one


feature or those without any feature.

References

Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A., Dean, J., et al., 2016. Tensorflow: A system for large-scale machine learning. In: 12th Symposium on Operating Systems Design and Implementation. USENIX, Savannah, USA. pp. 265-283. doi: 10.1029/2018WR024301.
Badrinarayanan, V., Kendall, A., Cipolla, R., 2017. Segnet: A deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39 (12), 2481–2495. https://doi.org/10.1109/TPAMI.2016.2644615.
Berkhahn, S., Fuchs, L., Neuweiler, I., 2019. An ensemble neural network model for real-time prediction of urban floods. J. Hydrol. 575, 743–754. https://doi.org/10.1016/j.jhydrol.2019.05.066.
Bui, D.T., Hoang, N.D., Martínez-Álvarez, F., Ngo, P.T.T., Hoa, P.V., Pham, T.D., Costache, R., 2020. A novel deep learning neural network approach for predicting flash flood susceptibility: A case study at a high frequency tropical storm area. Sci. Total Environ. 701, 134413. doi: 10.1016/j.scitotenv.2019.134413.
Chang, L.C., Chang, F.J., Chiang, Y.M., 2004. A two-step-ahead recurrent neural network for stream-flow forecasting. Hydrol. Process. 18 (1), 81–92. https://doi.org/10.1002/hyp.1313.
Chang, F.J., Chen, P.A., Lu, Y.R., Huang, E., Chang, K.Y., 2014. Real-time multi-step-ahead water level forecasting by recurrent neural networks for urban flood control. J. Hydrol. 517, 836–846. https://doi.org/10.1016/j.jhydrol.2014.06.013.
Chen, P.A., Chang, L.C., Chang, F.J., 2013. Reinforced recurrent neural networks for multi-step-ahead flood forecasts. J. Hydrol. 497, 71–79. https://doi.org/10.1016/j.jhydrol.2013.05.038.
Chollet, F., et al., 2015. Keras. GitHub. Retrieved from https://github.com/fchollet/keras.
Chu, M., Thuerey, N., 2017. Data-driven synthesis of smoke flows with CNN-based feature descriptors. ACM Trans. Graphics 36 (4), 1–14. https://doi.org/10.1145/3072959.3073643.
De Smith, M.J., Goodchild, M.F., Longley, P., 2007. Geospatial analysis: a comprehensive guide to principles, techniques and software tools. Troubador Publishing Ltd.
Feng, T., Yu, L.F., Yeung, S.K., Yin, K., Zhou, K., 2016. Crowddriven mid-scale layout design. ACM Transactions on Graphics 35 (4), 132. https://doi.org/10.1145/2897824.2925894.
Gebrehiwot, A., Hashemi-Beni, L., Thompson, G., Kordjamshidi, P., Langan, T.E., 2019. Deep convolutional neural network for flood extent mapping using unmanned aerial vehicles data. Sensors 19 (7), 1486. https://doi.org/10.3390/s19071486.
Geirhos, R., Rubisch, P., Michaelis, C., Bethge, M., Wichmann, F.A., Brendel, W., 2019. ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness. In: 7th International Conference on Learning Representations (ICLR 2019). New Orleans, USA. Oral presentation. https://openreview.net/forum?id=Bygh9j09KX.
Gude, V., Corns, S., Long, S., 2020. Flood Prediction and Uncertainty Estimation Using Deep Learning. Water 12 (3), 884. https://doi.org/10.3390/w12030884.
Guo, Z., Leitão, J.P., Simões, N.E., Moosavi, V., 2021. Data-driven Flood Emulation: Speeding up Urban Flood Predictions by Deep Convolutional Neural Networks. J. Flood Risk Manage. 14 (1). https://doi.org/10.1111/jfr3.12684 (in Press).
Guo, Z., Leitao, J.P., Moosavi, V., 2020b. Flood simulation data of a 100-year designed storm in 656 catchment areas of Switzerland. ETH Zurich Research Collection. https://doi.org/10.3929/ethz-b-000453305.
He, K., Zhang, X., Ren, S., Sun, J., 2016. Deep residual learning for image recognition. In: Proceedings of the 29th IEEE conference on computer vision and pattern recognition (CVPR 2016), Las Vegas, USA. pp. 770-778. doi: 10.1109/CVPR.2016.90.
Hochreiter, S., 1998. The vanishing gradient problem during learning recurrent neural nets and problem solutions. Int. J. Uncertainty Fuzziness Knowl. Based Syst. 6 (02), 107–116. https://doi.org/10.1142/S0218488598000094.
Huang, S., Chang, J., Huang, Q., Chen, Y., 2014. Monthly streamflow prediction using modified EMD-based support vector machine. J. Hydrol. 511, 764–775. https://doi.org/10.1016/j.jhydrol.2014.01.062.
Kabir, S., Patidar, S., Xia, X., Liang, Q., Neal, J., Pender, G., 2020. A deep convolutional neural network model for rapid prediction of fluvial flood inundation. J. Hydrol. 590, 125481. https://doi.org/10.1016/j.jhydrol.2020.125481.
Khosravi, K., Panahi, M., Golkarian, A., Keesstra, S.D., Saco, P.M., Bui, D.T., Lee, S., 2020. Convolutional neural network approach for spatial prediction of flood hazard at national scale of Iran. J. Hydrol. 591, 125552. https://doi.org/10.1016/j.jhydrol.2020.125552.
Kingma, D.P., Ba, J., 2015. Adam: A method for stochastic optimization. In: the 3rd International Conference on Learning Representations (ICLR 2015), San Diego, USA. Poster presentation.
Kratzert, F., Klotz, D., Shalev, G., Klambauer, G., Hochreiter, S., Nearing, G., 2019a. Towards learning universal, regional, and local hydrological behaviors via machine learning applied to large-sample datasets. Hydrol. Earth Syst. Sci. 23 (12), 5089–5110.
Kratzert, F., Klotz, D., Herrnegger, M., Sampson, A.K., Hochreiter, S., Nearing, G.S., 2019b. Toward improved predictions in ungauged basins: Exploiting the power of machine learning. Water Resour. Res. 55 (12), 11344–11354. https://doi.org/10.1029/2019WR026065.
Leitão, J. P., Zaghloul, M., Moosavi, V., 2018. Modelling overland flow from local inflows in “almost no-time” using Self-Organizing Maps. In: 11th International Conference on Urban Drainage Modelling, Palermo, Italy. Oral presentation.
Ladický, L.U., Jeong, S., Solenthaler, B., Pollefeys, M., Gross, M., 2015. Data-driven fluid simulations using regression forests. ACM Transactions on Graphics 34 (6), 199. https://doi.org/10.1145/2816795.2818129.
Liu, J., Shao, W., Xiang, C., Mei, C., Li, Z., 2020. Uncertainties of urban flood modeling: Influence of parameters for different underlying surfaces. Environ. Res. 182, 108929. https://doi.org/10.1016/j.envres.2019.108929.
Long, J., Shelhamer, E., Darrell, T., 2015. Fully convolutional networks for semantic segmentation. In: Proceedings of the 28th IEEE conference on computer vision and pattern recognition (CVPR 2015). Boston, USA, pp. 3431–3440. doi: 10.1109/CVPR.2015.7298965.
Löwe, R., Böhm, J., Jensen, D.G., Leandro, J., Rasmussen, S.H., 2021. U-FLOOD–topographic deep learning for predicting urban pluvial flood water depth. J. Hydrol. 603, 126898.
Luo, W., Li, Y., Urtasun, R., Zemel, R., 2016. Understanding the effective receptive field in deep convolutional neural networks. In: Proceedings of the 30th International Conference on Neural Information Processing Systems. Barcelona, Spain. pp. 4905–4913.
Masci, J., Boscaini, D., Bronstein, M., Vandergheynst, P., 2015. Geodesic convolutional neural networks on riemannian manifolds. In: Proceedings of the IEEE international conference on computer vision workshops. Santiago, Chile. pp. 37-45. doi: 10.1109/ICCVW.2015.112.
Maas, A.L., Hannun, A.Y., Ng, A.Y., 2013. Rectifier nonlinearities improve neural network acoustic models. ICML Workshop on Deep Learning for Audio, Speech, and Language Processing (WDLASL 2013). Atlanta, USA.
Moy de Vitry, M., Kramer, S., Wegner, J.D., Leitão, J.P., 2019. Scalable flood level trend monitoring with surveillance cameras using a deep convolutional neural network. Hydrol. Earth Syst. Sci. 23 (11), 4621–4634. https://doi.org/10.5194/hess-2018-570.
Mustafa, A., Wei Zhang, X., Aliaga, D.G., Bruwier, M., Nishida, G., Dewals, B., Erpicum, S., Archambeau, P., Pirotton, M., Teller, J., 2020. Procedural generation of flood-sensitive urban layouts. Environ. Plann. B 47 (5), 889–911.
Nair, V., Hinton, G.E., 2010. Rectified linear units improve restricted boltzmann machines. In: Proceedings of the 27th international conference on machine learning (ICML 2010), Haifa, Israel, pp. 807–814.
Ronneberger, O., Fischer, P., Brox, T., 2015. U-net: Convolutional networks for biomedical image segmentation. In: Medical Image Computing and Computer-Assisted Intervention (MICCAI). Springer, LNCS 9351, 234–241. https://doi.org/10.1007/978-3-319-24574-4_28.
Tan, Q.F., Lei, X.H., Wang, X., Wang, H., Wen, X., Ji, Y., Kang, A.Q., 2018. An adaptive middle and long-term runoff forecast model using EEMD-ANN hybrid approach. J. Hydrol. 567, 767–780. https://doi.org/10.1016/j.jhydrol.2018.01.015.
Te Chow, V., Maidment, D.R., Mays, L.W., 1988. Applied Hydrology. McGraw-Hill. ISBN: 0-07-100174-3.
Tehrany, M.S., Pradhan, B., Jebur, M.N., 2013. Spatial prediction of flood susceptible areas using rule based decision tree (DT) and a novel ensemble bivariate and multivariate statistical models in GIS. J. Hydrol. 504, 69–79. https://doi.org/10.1016/j.jhydrol.2013.09.034.
Tehrany, M.S., Shabani, F., Neamah Jebur, M., Hong, H., Chen, W., Xie, X., 2017. GIS-based spatial prediction of flood prone areas using standalone frequency ratio, logistic regression, weight of evidence and their ensemble techniques. Geomatics Nat. Hazards Risk 8 (2), 1538–1561. https://doi.org/10.1080/19475705.2017.1362038.
Tehrany, M.S., Jones, S., Shabani, F., 2019. Identifying the essential flood conditioning factors for flood prone area mapping using machine learning techniques. Catena 175, 174–192. https://doi.org/10.1016/j.catena.2018.12.011.
Tsubaki, R., Kawahara, Y., 2013. The uncertainty of local flow parameters during inundation flow over complex topographies with elevation errors. J. Hydrol. 486, 71–87. https://doi.org/10.1016/j.jhydrol.2013.01.042.
Wang, Y., Chen, A.S., Fu, G., Djordjević, S., Zhang, C., Savić, D.A., 2018. An integrated framework for high-resolution urban flood modelling considering multiple information sources and urban features. Environ. Modell. Software 107, 85–95. https://doi.org/10.1016/j.envsoft.2018.06.010.
Wang, Y., Fang, Z., Hong, H., Peng, L., 2020. Flood susceptibility mapping using convolutional neural network frameworks. J. Hydrol. 582, 124482. https://doi.org/10.1016/j.jhydrol.2019.124482.
Willmott, C.J., Ackleson, S.G., Davis, R.E., Feddema, J.J., Klink, K.M., Legates, D.R., O’Donnell, J., Rowe, C.M., 1985. Statistics for the evaluation and comparison of models. J. Geophys. Res. 90 (C5), 8995–9005. https://doi.org/10.1029/JC090iC05p08995.
Zaghloul, M., 2017. Machine-Learning aided Architectural Design – Synthesize Fast CFD by Machine-Learning. Phd Diss. ETH Zurich. doi: 10.3929/ethz-b-000207226.
Zhao, G., Pang, B., Xu, Z., Peng, D., Xu, L., 2019. Assessment of urban flood susceptibility using semi-supervised machine learning model. Sci. Total Environ. 659, 940-949. doi: 10.1016/j.scitotenv.2018.12.217.
Zhao, G., Pang, B., Xu, Z., Peng, D., Zuo, D., 2020. Urban flood susceptibility assessment based on convolutional neural networks. J. Hydrol. 590, 125235. https://doi.org/10.1016/j.jhydrol.2020.125235.
Zheng, F., Tao, R., Maier, H.R., See, L., Savic, D., Zhang, T., Chen, Q., Assumpção, T.H., Yang, P., Heidari, B., Rieckermann, J., Minsker, B., Bi, W., Cai, X., Solomatine, D., Popescu, I., 2018. Crowdsourcing methods for data collection in geophysics: State of the art, issues, and future directions. Rev. Geophys. 56 (4), 698–740.
