Professional Documents
Culture Documents
150006 (2023)
www.papersinphysics.org
ISSN 1852-4249
The study of precipitation is one of the most intriguing areas in atmospheric sciences,
with significant implications for our daily lives and climate change projections. This paper
explores the estimation of rainfall trends in South American regions using convolutional
neural networks (CNNs). The study focuses on the application of Cloud-Net, a CNN-
based model with a format similar to an autoencoder, to obtain qualitative estimates of
precipitation patterns. The employed loss functions, Categorical Cross Entropy and Cate-
gorical Focal Loss, address the challenges of classifying minority categories in unbalanced
data. Regional analysis was conducted, identifying days with high rainfall intensity and
the predominant intensities in 25 regions. The CNN model’s performance was compared
with the XGBoost algorithm, showing excellent results for extreme rainfall categories and
challenging intermediate categories. Furthermore, a comparison was made with Quanti-
tative Precipitation Estimation (QPE) data and ground measurements from rain gauges.
While the CNN model provided a valuable qualitative estimate of precipitation trends,
achieving precise quantitative estimation would require an extensive data set of in-situ
measurements. Overall, this research demonstrates the potential of CNNs for estimating
rainfall trends and understanding precipitation patterns in South American regions. The
findings offer valuable insights for further applications in meteorology and environmental
studies.
150006-1
Papers in Physics, vol. 15, art. 150006 (2023) / F Andelsman et al.
Precipitation is a complex phenomenon in which sured in mm/h) at a 2 km scale for each pixel, with
many local parameters, such as sea level, season, updates occurring every 10 minutes. The study
surface temperature, among others, influence its area encompasses the latitudes 17° S to 39° S and
dynamics. These features are hidden for satellite longitudes 49° W to 73° W, including the countries
observation in the range VIS to IR, which provide of Uruguay, Paraguay, and portions of Argentina,
information solely about the top of the clouds that Brazil, Chile, and Bolivia. Our research focuses
generate the rain. However, these measurements only on the first 16 days of January 2021.
are available from geostationary orbits, enabling
quasi-continuous monitoring of a significant portion This work was divided into two main stages. In
of Earth with high update speed. This allows for the first stage, our focus was on gaining a deeper
adequate temporal sampling, essential for observ- understanding of the Quantitative Precipitation
ing the short-lived and high temporal variability of Estimation (QPE) itself. We conducted an analysis
rain events. MW measurements are more directly of temporal and regional patterns using frequency
sensitive to precipitation; however, currently, they graphs, visually verified the formation of rainfall
are only available from satellites in low orbits[2]. intensity clusters, obtained brightness temperature
Image classification is understood as the pro- distributions for each spectral band utilized, and
cess of making quantitative decisions using data compared the results with existing Red, Green and
extracted from images, by grouping their pixels or Blue (RGB) products.
regions into classes representing different physical
objects. This general definition of classification is During the second stage, the study involved the
based on a statistical and probabilistic analysis of processing of satellite images through pixel super-
the information, which can be performed by both vised classification, addressing the segmentation
humans and algorithms[3, 4]. problem. We considered the statistical character-
Machine learning can be defined as the utilization istics of the data, treating each spectral band as a
of data and algorithms to mimic the way humans distinct physical dimension. By doing so, we ob-
learn, gradually improving its accuracy through tained probability maps for each category or class.
practice. One potential approach is supervised Some models treated each pixel as an independent
learning, where a training set of data with known sample, like XGBoost[7], while others, such as con-
class labels is provided, and from this dataset, the volutional networks, considered the neighborhood
algorithm can generalize what it has learned when of each pixel as well. In this context, the Cloud-
presented with new information. Net model[8], initially designed for cloud pixel de-
In this study, the focus lies on the challeng- tection, was adapted for this research to effectively
ing task of image segmentation, aiming to produce detect rainfall pixels or classify them based on their
thematic maps that provide a spatial description respective rainfall intensity.
of specific characteristics (e.g., rain) in a desig-
nated area on Earth[5]. Neural networks are among This work draws inspiration from the growing
the models employed to accomplish this task, con- number of studies that combine machine learning
tributing to the creation of these informative maps. methods with satellite[9, 10], radar[11, 12], or hy-
Our study is centered around analyzing the brid information[13]. Its primary objective is to
Quantitative Precipitation Estimation (QPE)[6] demonstrate the effectiveness of the Cloud-Net net-
from the Geostationary Operational Environ- work, using only GOES spectral bands as input
mental Satellite 16 (GOES-16), a self-calibrated data, in reproducing the rainfall results obtained
level 2 standard product developed by the Na- by the QPE algorithm. The ultimate goal is to
tional Oceanic and Atmospheric Administration create a simpler hydro-estimator. Additionally, the
(NOAA). The algorithm utilizes IR data from classification results were compared across differ-
GOES and calibrates its features to determine rain- ent quantities of rainfall intensity categories (2, 3,
fall rates by incorporating microwave external in- or 6), using various loss functions and the XGBoost
formation from specific polar satellites like Wind- algorithm. This paper will present only the results
Sat. This process enables the algorithm to generate obtained with 6 categories, but further information
estimates of the instantaneous rainfall rate (mea- and datasets are available upon request.
150006-2
Papers in Physics, vol. 15, art. 150006 (2023) / F Andelsman et al.
ii Loss Functions
Neural networks require not only a specific architec-
ture to learn how to solve a problem but also a way
to measure how far their predictions are from the
actual target, in order to minimize the final loss or
error and achieve better results. For binary classifi-
cation, two loss functions were used: Dice Loss and
Tversky Loss[14–17]. These loss functions differ in
how they penalize errors when classifying classes.
In the case of multi-category classification, the fol-
lowing loss functions were employed:
150006-3
Papers in Physics, vol. 15, art. 150006 (2023) / F Andelsman et al.
Figure 2: The chosen area and the 25 regions of 4. Band 14 (11.2 µm): longwave window channel.
study
5. Band 15 (12.3 µm): “dirty” longwave IR win-
dow channel.
n K
1 XX As mentioned in the Introduction, the QPE
L(θ) = − [y ij log(pij )α(1 − pij )γ ] (2)
n i=1 j=1 product provides an instantaneous rain rate in
mm/h (ranging from 0 to 100 mm/h) every 10 min-
In Categorical Focal Loss, we use a parameter γ ≥ 0 utes. This means there are 144 images available
which is known as the focus parameter. This pa- per day, corresponding to the Coordinated Univer-
rameter reduces the impact of easy examples in the sal Time (UTC). For this research, the focus was on
loss value, giving more attention to challenging ex- binary and multi-category classification tasks. To
amples. Additionally, we have another parameter, achieve this, rainfall intensity classes needed to be
α, which is called the balance parameter. For this defined for the label data. For rain detection (bi-
specific work, the values α = 0.25 and γ = 2 were nary classification), a threshold of 0.1 mm/h was
chosen, based on the consulted bibliography[18]. chosen to differentiate ”rain pixels” from ”No Rain
pixels”. Additionally, 6 classes were defined based
on existing bibliography [19], see Table 1.
iii Study Area and Time Period
In order to avoid high class imbalance between
To focus on a specific area of South America, this the “No Rain” category and the other 5 rainfall
study was carried out within the latitudes 17° S to classes during training, an image selection process
39° S and longitudes 49° W to 73° W. This area was established. In the case of the multi-category
includes the countries of Uruguay, Paraguay and classification, a minimum of 1000 pixels of both
parts of Argentina, Brazil, Chile and Bolivia. The “Intense” and “Torrential” classes was required for
entire chosen area was then subdivided in 25 re- each image to be used during training. This re-
gions of 192x192 pixels for easier insertion in the sulted in the availability of 1326 images (a total
Cloud-Net model. The map displayed in Fig. 2 is of 48.881.664 pixels). The decision was based on
based on Google Earth, and each marker on the the focus of this study to prioritize the replication
map represents the corners of these square regions, of heavy summer rainfall through machine learning
as seen from a geostationary projection point of models. Then, a total of 3500 images from January
view. The models were trained using data from 16 were used for testing.
150006-4
Papers in Physics, vol. 15, art. 150006 (2023) / F Andelsman et al.
150006-5
Papers in Physics, vol. 15, art. 150006 (2023) / F Andelsman et al.
Figure 5: (top) Daily total count of pixels for Figure 6: (top) Daily total count of pixels for
both binary classification and (bottom) rainfall both binary classification and (bottom) rainfall
classes in region 5 (Mato Grosso, Brazil). classes in region 18 (central part of Argentina).
150006-6
Papers in Physics, vol. 15, art. 150006 (2023) / F Andelsman et al.
Figure 7: (top) Day Microphysics RGB. Figure 8: Cloud-Net training with 6 categories.
(bottom) QPE product at 17:10 UTC, 16/01, (top) Cross Entropy vs Epochs. (bottom) Focal
region 13. All map figures include north, scale and Loss vs Epochs
only the coordinates of the corner points.
150006-7
Papers in Physics, vol. 15, art. 150006 (2023) / F Andelsman et al.
Figure 9: Confusion matrix for 6 categories with (a) Cloud-Net and Cross Entropy, (b) Cloud-Net and
Focal Loss, (c) XGBoost
Cross Entropy, while the validation curves converge ing the first classification threshold. This indicates
rapidly with the training curves when using Focal that further improvements may be needed to en-
Loss. hance the classification performance in the lower
Figure 9 shows the confusion matrices for the 6 rainfall intensity categories.
rainfall categories using both Cloud-Net and XG- In Fig. 10, the predictions for the 6 rainfall cate-
Boost. Both algorithms face challenges in accu- gories at 17:10 UTC in region 13 are shown. Once
rately identifying the intermediate categories, ex- again, Cloud-Net demonstrates better performance
cept for category 2 when Cloud-Net is trained with in identifying isolated and low-intensity clusters.
Cross Entropy. However, the results are quite im- Moreover, it shows greater attention to detail in the
pressive for the extreme categories, ”No Rain” and main precipitation cluster, particularly in category
”Rain > 30 mm/h”, particularly when Cloud-Net 4 (regions with lighter green) which closely resem-
is trained with Focal Loss and for XGBoost. It bles the QPE label. Despite Focal Loss’s attempt
is important to note that XGBoost only considers to penalize mistakes while classifying minority cat-
the information of individual pixels and does not egories, it doesn’t significantly improve the results
take into account the neighborhood information. obtained with Cross Entropy in this case. However,
Finally, there is a noticeable trend where the al- Cloud-Net still outperforms XGBoost in handling
gorithms tend to predict category 1 pixels as ”No the complex patterns and variations in rainfall in-
Rain” pixels, suggesting a difficulty in establish- tensity.
150006-8
Papers in Physics, vol. 15, art. 150006 (2023) / F Andelsman et al.
(a) (b)
(c) (d)
Figure 10: 17:10 UTC, 16/01/2021, Region 13: (a) QPE label and predictions with (b) Cloud-Net and
Cross Entropy, (c) Cloud-Net and Focal Loss, (d) XGBoost
In Fig. 11, we observe the predictions for 23:30 v QPE and models comparison with
UTC in region 8, where a considerable number of ground floor information
“torrential” rain pixels are present. The prediction
made by both Cloud-Net and XGBoost show better Before making comparisons, it is essential to con-
results, and there is a relative similarity between sider some factors. Cloud precipitation exhibits
them. It is evident that all models demonstrate high variability both in time and space, making it
good performance in predicting the main clusters challenging to directly compare instantaneous esti-
with higher precipitation. However, they encounter mation over an area against in–situ measurements.
difficulty when it comes to the secondary clusters. Additionally, the rain rate, as a geophysical vari-
These secondary clusters are the main areas where able, can have different interpretations depending
the models face challenges and make errors, as seen on various factors such as latitude, season, or the
in the confusion matrix shown in Fig. 9, which specific application, such as agriculture or emer-
was to be expected due to the selection process gency situations[1]. In the case of QPE, its values
explained in the Input and Label Data sets sub- are generally higher when compared to radar es-
section. timations. For example, Sun et al.[23] reported a
factor of at least two in comparisons for the Conti-
nental U.S. (CONUS) area. Due to these inherent
150006-9
Papers in Physics, vol. 15, art. 150006 (2023) / F Andelsman et al.
(a) (b)
(c) (d)
Figure 11: 23:30 UTC, 16/01/2021, Region 8 with heavy rainfall: (a) QPE label and predictions with
(b) Cloud-Net and Cross Entropy, (c) Cloud-Net and Focal Loss, (d) XGBoost
complexities and differences, our comparisons pri- cation. For the first five categories, we assigned
marily focus on quantitative aspects rather than their mean value. However, for the last category of
direct quantitative assessments. “Torrential Rain”, we calculated a weighted aver-
age using the rainfall data from all 16 days in the
During January 16, 2021, we conducted a com-
nearby regions. This approach allowed us to con-
parison between the QPE algorithm’s results and
vert the model’s categorical predictions into mean-
the daily accumulated predictions generated by the
ingful quantitative estimates of rainfall intensity in
Cloud-Net and XGBoost models in a region experi-
millimeters per hour (mm/h), see Table 2.
encing heavy rainfall. To facilitate this comparison,
we accessed rainfall data from rain gauges provided It is worth noting that despite the differences
by the Chaco police. This data enabled us to ob- between ground-based and satellite-derived values,
tain figures 12, 13, 14, and 15. As these machine there are records of rainfall of the magnitude cal-
learning models assign rainfall intensity categories culated by the QPE in certain towns of Chaco[24].
to each pixel, each pixel had to be reassigned a For instance, in the town of Miraflores, there were
value in mm/h depending on its classification. As reports of rainfall reaching up to 105 mm, which
the machine learning models assign rainfall inten- resulted in impassable roads for vehicular traffic,
sity categories to each pixel, we needed to reassign as documented by the police[25].
each pixel a value in mm/h based on its classifi-
150006-10
Papers in Physics, vol. 15, art. 150006 (2023) / F Andelsman et al.
Figure 12: Area with high precipitation. The Figure 14: The colored dots correspond to rain
colored dots correspond to rain gauge information gauge information and the color map beneath
and the color map beneath corresponds to the corresponds to the accumulated Cloud-Net model
QPE information. Gauge data was collected at 7 with Focal Loss over a 24-hour period.
ART, 17/01/2021. The QPE product was
accumulated over a 24–hour period.
Figure 13: The colored dots correspond to rain Figure 15: The colored dots correspond to rain
gauge information and the color map beneath gauge information and the color map beneath
corresponds to the accumulated Cloud-Net model corresponds to the accumulated XGBoost model
with Cross Entropy over a 24-hour period. over a 24 hour period.
150006-11
Papers in Physics, vol. 15, art. 150006 (2023) / F Andelsman et al.
150006-12
Papers in Physics, vol. 15, art. 150006 (2023) / F Andelsman et al.
[8] S Mohajerani, S Parvaneh, Cloud-Net: An [18] T Lin et al., Focal Loss for Dense Object De-
End-To-End Cloud Detection Algorithm for tection, IEEE Transactions on Pattern Analy-
Landsat 8 Imagery, IGARSS 2019, 1029, sis and Machine Intelligence, 42, 2, (2020).
(2019).
[9] M Sadeghi et al., PERSIANN-CNN: Precip- [19] M Guico et al., Design and development
itation Estimation from Remotely Sensed In- of a novel acoustic rain sensor with auto-
formation Using Artificial Neural Networks mated telemetry, MATEC Web of Conferences,
- Convolutional Neural Networks, J. of Hy- (2018).
dromet., 40, (2019).
[20] W Huang et al., Simulation and Projection of
[10] M Sadeghi et al., Improving Near Real-time Summer Convective Afternoon Rainfall Activ-
Precipitation Estimation Using a U-Net Con- ities over Southeast Asia in CMIP6 Models, J.
volutional Neural Network and Geographical of Climate, 34, 1, (2021).
150006-13
Papers in Physics, vol. 15, art. 150006 (2023) / F Andelsman et al.
[21] J Welty et al., Increased Likelihood of Ap- [24] R Mancuello, 11/04/2022, “Temporal en
preciable Afternoon Rainfall Over Wetter or Castelli: En una hora y con fuertes
Drier Soils Dependent Upon Atmospheric Dy- vientos hubo destrozos en viviendas, sa-
namic Influence, Geophys. Research L., 47, lones comerciales, voladuras de techos”,
11, (2020). Chaco Dı́a por Dı́a, accesed 07/11/2023,
https://www.chacodiapordia.com/2022/
[22] C Böhm et al., The Role of Moisture Conveyor 04/11/temporal-en-castelli-en-una-
Belts for Precipitation in the Atacama Desert, hora-y-con-fuertes-vientos-hubo-
Geophy. Research L., 48, 24, (2021). destrozos-en-viviendas-salones-
[23] L Sun et al., Cross Validation of GOES-16 and comerciales-voladuras-de-techos/
NOAA Multi-Radar Multi-Sensor (MRMS) [25] Website of Chaco police with rainfall informa-
QPE over the Continental United States, Re- tion, accesed 07/11/2023, https://policia.
mote Sensing, 13, 4030, (2021). chaco.gob.ar/#/lluvia
150006-14