Modelling Reference Evapotranspiration Using Principal Component Analysis and Machine Learning Methods Under Different Climatic Environments
All content following this page was uploaded by Ali Raza on 23 May 2023.
RESEARCH ARTICLE
1 School of Agricultural Engineering, Jiangsu University, Zhenjiang, China
2 Laboratory of Water and Environment Engineering in Saharan Environment, University of Kasdi Merbah-Ouargla, Ouargla, Algeria
3 Department of Agriculture, Nutrition and Human Ecology, College of Agriculture and Human Sciences, Prairie View A&M University, Prairie View, Texas, USA
4 Civil Engineering Department, Faculty of Engineering, Osmaniye Korkut Ata University, Osmaniye, Turkey
5 Agricultural Research, Education and Extension Organization, Agricultural Engineering Research Institute, Karaj, Alborz, Iran
6 Department of Civil Engineering, Technical University of Lübeck, Lübeck, Germany
7 Department of Civil Engineering, Ilia State University, Tbilisi, Georgia
8 Agricultural Engineering Department, Faculty of Agriculture, Mansoura University, Mansoura, Egypt

Correspondence
Yongguang Hu, School of Agricultural Engineering, Jiangsu University, Zhenjiang 212013, China.
Email: deerhu@ujs.edu.cn

Abstract
Reference evapotranspiration (ETo) is a complex process in the hydrologic cycle that influences several hydrologic parameters. Although several methods have been developed to model ETo, a reliable method that can use limited climatic input parameters for data-limited regions is still lacking. This study evaluated four machine learning (ML) methods: M5 pruned (M5P) tree, sequential minimal optimization (SMO), radial basis function neural regression (RBFNreg) and multilinear regression (MLR). The major objective of this study was to identify the best approach to estimate ETo with minimum input data in five stations (Multan, Jacobabad, Faisalabad, Islamabad and Skardu) located in Pakistan. The datasets of these stations comprised maximum and minimum temperatures (Tmax, Tmin), average relative humidity (RHavg), average wind speed (Ux), and sunshine hours (n) variables. Two scenarios were used for ETo modelling. In the first scenario, five climatic variables were used as inputs to estimate ETo, as obtaining full climatic parameters is the biggest challenge in developing countries. Principal component analysis (PCA) was used as a clustering technique in the second scenario to reduce the climatic input parameters. The PCA results indicated that Tmax, Tmin and n were effective inputs for ETo estimation. Based on statistical indicators, the M5P tree outperformed the other applied ML methods in estimating ETo under various climatic environments. This study recommends focusing on areas with high ETo values and adequate irrigation scheduling of crops to achieve water sustainability.

KEYWORDS
diverse climatic environments, ETo modelling, machine learning methods, principal component analysis, reference evapotranspiration
Article title in French: Modelisation de l'evapotranspiration de reference a l'aide de methodes d'analyse des composantes principales et
d'apprentissage automatique dans differents environnements climatiques
Irrig. and Drain. 2023;1–26. wileyonlinelibrary.com/journal/ird © 2023 John Wiley & Sons, Ltd. 1
15310361, 0, Downloaded from https://onlinelibrary.wiley.com/doi/10.1002/ird.2838 by Jiangsu University, Wiley Online Library on [22/05/2023]. See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License
2 RAZA ET AL.
Funding information
Jiangsu Postdoctoral Science Foundations, Grant/Award Numbers: 2016M600376, 1601032C; Jiangsu Provincial Government, Grant/Award Number: BE2021340; Priority Academic Program Development of Jiangsu Higher Education Institutions, Grant/Award Number: PAPD-2018-87

Résumé
L'évapotranspiration de référence (ETo) est un processus complexe du cycle hydrologique qui influence plusieurs paramètres hydrologiques. Bien que plusieurs méthodes aient été mises au point pour modéliser l'ETo, une méthode fiable qui peut utiliser des paramètres d'entrée climatiques limités pour des régions où les données sont limitées est encore limitée. Cette étude a évalué quatre méthodes d'apprentissage automatique (ML): arbre M5 élagué (M5P), optimisation minimale séquentielle (SMO), régression neuronale à fonction de base radiale (RBFNreg), et régression multilinéaire (MLR). Cette étude visait à identifier la meilleure approche pour estimer l'ETo avec un minimum de données d'entrée dans 5 stations (Multan, Jacobabad, Faisalabad, Islamabad et Skardu) situées au Pakistan. L'ensemble de données de ces stations comprend les températures maximales et minimales (Tmax, Tmin), l'humidité relative moyenne (RH), la vitesse moyenne du vent (Ux) et les heures d'ensoleillement (n). Deux scénarios ont été utilisés pour la modélisation de l'ETo. Dans le premier scénario, cinq variables climatiques ont été utilisées comme intrants pour estimer l'ETo, car l'obtention de paramètres climatiques complets est le plus grand défi en face des pays en développement. L'analyse des composantes principales (PCA) a été utilisée comme technique de regroupement dans le deuxième scénario pour réduire les paramètres d'entrée climatiques. Les résultats de la PCA ont indiqué que Tmax, Tmin et n ont été identifiés comme des intrants efficaces pour l'estimation de l'ETo. Sur la base des indicateurs statistiques, l'arbre M5P a surpassé les autres méthodes ML appliquées pour estimer l'ETo dans divers environnements climatiques. Cette étude recommande de se concentrer sur les zones présentant des valeurs élevées de l'ETo et sur un calendrier adéquat d'irrigation des cultures pour assurer la durabilité de l'eau.

MOTS CLÉS
Evapotranspiration de référence, Modélisation d'ETo, Méthodes d'apprentissage automatique, Analyse des composantes principales, Environnements climatiques variés
continually increasing effectiveness (Kumar et al., 2011). Owing to its dynamic and nonlinear nature, ETo estimation is a considerably challenging task. Consequently, computer models with fewer climatic variables are the most effective alternatives for ETo estimation. ML is widely recognized (Ibrahim, 2016) for its capacity to handle challenging problems efficiently and to resolve complex problems with a limited amount of data. ML models rely on a set of algorithms applied in nonlinear mapping processes, such as ETo, to capture the relationship between the input and output (target) variables.

Raza, Hu, et al. (2021) listed research articles on ETo estimation published from 2012 to 2020, organized by premise, foundation of use, accuracy and structure. Soft computing models have been successfully used in many parts of the world to estimate ETo. These are chosen for their tendency and ability to provide absolute formulation and are helpful in their application. However, the authors pointed out a limitation in the predictive ability of soft computing models when only a few climatic parameters are used. This limitation was more prominent across different climate conditions, which may include humid, semi-arid and arid regions. It results from the level of influence exerted on the ETo process by multiple climate variables. In light of these observations, reliably estimating ETo with soft computing models in specific regions requires a variety of climatic information. The main objective of the study was to develop different soft computing models that can be preferred as an alternative to FAO-PM56. The need arises because FAO-PM56 requires numerous climatic and adjusted data as input, which are not easily accessible or available in most regions.

Recent work on ETo modelling has highlighted this contentious issue, which has drawn wide attention among researchers and climatologists. In the literature, various types of ML methods, for example, support vector machine (SVM) (Ferreira & da Cunha, 2020; Mehdizadeh et al., 2017), genetic programming (GP) (Mattar, 2018; Valipour et al., 2019), extreme learning machine (ELM) (Abdullah et al., 2015; Shamshirband & Kamsin, 2016), tree-based models (Raza et al., 2020), M5 model tree (Fan et al., 2018; Granata, 2019), random forest (RF) (Saggi & Jain, 2019), extreme gradient boosting (XGBoost) (Han et al., 2019) and artificial neural networks (ANN) (Walls et al., 2020), have been applied to estimate ETo using limited climatic data. Using historical meteorological data on a daily basis, Yin et al. (2017) evaluated the effectiveness of the ANN, SVM and three empirical models for estimating the daily ETo in a hilly interior watershed in northwest China. They observed that in the studied area, SVM consistently performed the best. Wen et al. (2015) applied two ML models of ANN and SVM against three empirical (Hargreaves [Ha], Ritchie [Ri], Priestley and Taylor [PT]) models to estimate the ETo of a dry region in China using daily meteorological data. The authors used the maximum and minimum temperatures as model input. The predicted daily ETo was found to be adequate when only a few meteorological variables were used. To estimate daily ETo using climatic data from four meteorological stations situated in the karst region of Guangxi province in China, Wang et al. (2016) studied the effectiveness of two ML models, namely, gene expression programming (GEP) and ANN. According to the study, GEP with fewer climatic inputs can generate straightforward explicit mathematical formulas that are simpler to use than the employed ANN models.

To estimate ETo, Mehdizadeh et al. (2017) examined the effectiveness of GEP, multivariate adaptive regression splines (MARS) and SVM. ML models were created using the monthly meteorological variables. MARS ranked first among the applied ML models according to the evaluation indices, and it was in good agreement with FAO-PM56. Kişi & Cimen (2009) used climate data from California stations and least square support vector machine (LSSVM) to estimate ETo. The evaluation indices used to examine the LSSVM performance indicated satisfactory and reliable ETo estimates. By using the GEP ML model, Saggi & Jain (2019) calculated ETo using monthly climate data. When the acquired results were compared to FAO-PM56, they were determined to be favourable. Mattar (2018) discovered that GEP performed best when applied to varied climatic conditions in Egypt. Elbeltagi et al. (2022) developed five variants of additive regression (AR) ML methods using monthly climatic data from Pakistan stations. The authors investigated the performance of each ML model using different evaluation indices. The M5 pruned (M5P) variant of AR was found to be the closest to FAO-PM56 and provided accurate ETo estimations. Similarly, Wang et al. (2022) developed 10 ML methods using monthly climatic data, and the results based on evaluation indices indicated that tree boost (TB) performed best compared to other ML models. Furthermore, ML methods can extract useful information from time series data without discretization. The handling of time series data using ML methods is recommended in various engineering challenges, especially in ETo estimation.

It can be perceived from the above literature that the application of ML algorithms in ETo modelling using limited climatic variables is a good choice and is accepted worldwide. In Pakistan, only a few weather stations have been installed, and climatic data for some areas were found to be insufficient to calculate ETo. Thus, when conventional
methods (FAO-PM56) cannot be implemented owing to enormous input demands or a lack of climatic characteristics, developing methods that depend on fewer climatic inputs becomes important. One promising possibility for developing an ETo model is to use ML methods. Creating an ML model with a known set of input variables versus the target variable is a challenging task that has been explicitly addressed in this paper.

According to the available literature, there is no comparison research on the use of M5P tree, sequential minimal optimization (SMO), radial basis function neural regression (RBFNreg) and multilinear regression (MLR) for estimating ETo in different climates. Thus, the objectives of the current study are (i) developing and evaluating the M5P tree, SMO, RBFNreg and MLR methods in ETo estimation using climatic data from five stations located in different climates (semi-arid, humid, hyperarid), (ii) examining various meteorological input combinations using principal component analysis (PCA) and identifying influential climatic variables for ETo estimation and (iii) creating ETo variation maps in the studied region based on the output of the best ML method.

the wind speed at Faisalabad station was recorded as the highest (149.92 km/day) because of its geographical location, and severe types of thunderstorms occurred every year due to the cold wind coming from the west. Moreover, it has dry winter and humid summer seasons, making its climatic condition semi-arid. Table 2 presents brief statistical characteristics of the climatic data employed in the training and testing stages.

Additionally, skewness and kurtosis coefficients (Xskp and Xkrt) were also determined, which indicate the direction of asymmetry and the degree of flatness/peakedness in the time series data, respectively. Table 2 shows that Xskp at Islamabad station for the Tmax, Tmin and RHavg climatic variables was positively skewed, which indicates a larger value of the mean (Xmean). At the same time, negatively skewed Ux, n and ETo showed lower values of Xmean. Alternatively, Xkrt for Tmax, Tmin, RHavg and ETo was estimated to be negative, indicating lower peakedness (platykurtic curve), while larger peakedness in Ux and n was observed due to the positive value (leptokurtic curve). Similarly, the Xskp and Xkrt relations of the climatic variables for the selected climatic stations can be found in Table 2.
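As a rough illustration of how skewness and kurtosis coefficients of this kind can be obtained, the sketch below uses population moments and reports excess kurtosis (so a normal distribution gives 0). The paper does not state which estimator was used, so this choice, the function name and the sample values are assumptions made for illustration only.

```python
def skewness_and_excess_kurtosis(values):
    """Return (skewness, excess kurtosis) from population central moments.

    Assumption: moment-based estimators without bias correction; the
    source does not specify the estimator behind Xskp and Xkrt.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # variance
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3.0
    return skew, excess_kurt


# Hypothetical monthly wind-speed values (not taken from the paper).
sample = [60.0, 65.0, 70.0, 72.0, 150.0]
skew, kurt = skewness_and_excess_kurtosis(sample)
print(f"skewness={skew:.3f}, excess kurtosis={kurt:.3f}")
```

A positive skewness means a longer right tail, and a negative excess kurtosis corresponds to a flatter (platykurtic) distribution than the normal curve.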
subnodes, and each node ends with a regression equation. After the classification, each final node containing a regression equation lets users make an estimation. Considering ‘T’ as a node, the standard deviation of the class in ‘T’ is a measure of error. The attribute chosen for splitting at that level is the one that maximizes the standard deviation reduction (SDR). The calculation of the SDR is given by Equation (2) as follows:

$$\mathrm{SDR} = \mathrm{SD}(T) - \sum_{i} \frac{|T_i|}{|T|}\,\mathrm{SD}(T_i), \qquad (2)$$

where Ti represents the subgroup of samples, T represents a group of samples reaching the node, SD represents the standard deviation, and SDR represents the standard deviation reduction. In some cases, the tree created by the M5P may split considerably, and the tree size could be larger than expected.
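The splitting criterion can be made concrete with a small sketch. It computes the standard deviation reduction for a candidate split and scans the midpoints of one input attribute for the threshold with the largest SDR. This is only an illustration of the criterion, not the Weka M5P implementation; the function names and the toy Tmax/ETo values are hypothetical.

```python
import statistics


def sdr(parent, subsets):
    """Standard deviation reduction: SD(T) - sum(|Ti|/|T| * SD(Ti))."""
    total = len(parent)
    weighted = sum(len(s) / total * statistics.pstdev(s) for s in subsets)
    return statistics.pstdev(parent) - weighted


def best_threshold(x, y):
    """Scan midpoints of sorted x and return (threshold, SDR) with max SDR."""
    pairs = sorted(zip(x, y))
    targets = [t for _, t in pairs]
    best = (None, float("-inf"))
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no split point between identical attribute values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        gain = sdr(targets, [targets[:i], targets[i:]])
        if gain > best[1]:
            best = (threshold, gain)
    return best


# Hypothetical values: Tmax (deg C) as the attribute, ETo (mm/day) as target.
tmax = [10, 12, 14, 30, 32, 34]
eto = [2.0, 2.1, 2.2, 6.0, 6.1, 6.2]
threshold, gain = best_threshold(tmax, eto)
print(threshold, round(gain, 3))
```

In a full model tree this search would be repeated over every attribute at every node, and a linear regression would then be fitted to the samples in each leaf.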
TABLE 1 Monthly average meteorological parameters for each selected climatic station.
Abbreviations: ETo, reference evapotranspiration; n, sunshine hours; RHavg, average relative humidity; Tmax, maximum temperature; Tmin, minimum
temperature; U, wind speed.
TABLE 2 (Continued)
Abbreviations: CV, coefficient of variation; ETo, reference evapotranspiration; n, sunshine hours; RH, relative humidity; Tmax, maximum temperature; Tmin,
minimum temperature; Ux, average wind speed; Xmean, mean value; Xstd, standard deviation; Xmin, minimum value; Xmax, maximum value; Xskp, skewness
coefficient; Xkrt, kurtosis coefficient.
2.3.2 | Sequential minimal optimization

SMO is used to train a support vector classifier using polynomial or RBF kernels. It replaces all missing values and transforms nominal attributes into binary ones. A single-hidden-layer neural network uses a similar type of model as an SVM. Support vector regression (SVR) is the general name of regression analysis performed using SVM-supervised learning models. The SVM method was initially developed for binary classification problems. Later, the SVM technique was extended to make it usable on multiclass classification and regression problems. In the general description of the SVM, it is indicated that the SVM tries to find the best line to separate the data into classes as accurately as possible. However, SVR tries to find the best-fitting line by minimizing the errors measured by the cost function. The following equation (Equation 3) is used to train the SVR:

$$|y_i - \langle w, x_i \rangle - b| \le \varepsilon, \qquad (3)$$
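To make the role of ε concrete, the following sketch checks whether a sample satisfies the SVR constraint for a given linear model and computes the usual ε-insensitive loss, which is zero inside the tube. It does not perform SMO training; the weights, bias and data points are hypothetical values chosen for illustration.

```python
def predict(w, b, x):
    """Linear model <w, x> + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def inside_epsilon_tube(w, b, x, y, eps):
    """True when |y - <w, x> - b| <= eps, i.e. the SVR constraint holds."""
    return abs(y - predict(w, b, x)) <= eps


def epsilon_insensitive_loss(w, b, x, y, eps):
    """Loss is zero inside the tube and grows linearly outside it."""
    return max(0.0, abs(y - predict(w, b, x)) - eps)


# Hypothetical one-feature model and two observations.
w, b, eps = [0.5], 1.0, 0.1
for x, y in [([2.0], 2.05), ([2.0], 2.5)]:
    print(inside_epsilon_tube(w, b, x, y, eps),
          round(epsilon_insensitive_loss(w, b, x, y, eps), 3))
```

Samples that violate the constraint are the ones that contribute to the cost function and end up as support vectors during training.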
FIGURE 2 Adopted reference evapotranspiration (ETo) methodology for selected machine learning algorithms. FAO, Food and Agriculture Organization; M5P tree, M5 pruned tree; MLR, multilinear regression; RBFNreg, radial basis function neural regression; SMO, sequential minimal optimization.
effectively. It has data preprocessing, clustering, classification, regression, visualization and feature selection abilities. In Weka software, each data point (parameter) is described as an ‘attribute’. The software supports different attributes, such as nominal, numeric and string attributes. It has its own file system called ‘arff’, and it also supports other common file types such as ‘CSV’.

Weka was used in this study, and the following steps were implemented. The dataset used was preprocessed in the first step (1). The data file was prepared in CSV file format (2), and it was uploaded to Weka software by using the import feature (3). The necessary parameters were selected (4), data were classified (5), and the nearest neighbour was chosen as the estimation function (6). In the cross-validation process, K-means clustering was utilized (7), association rules were performed (8), and in the last step, the model was evaluated (9). A scheme showing the methodology of the modelling process is given in Figure 2.

2.6 | Performance evaluation of ML methods

The present study calculated the performance indices, namely, correlation coefficient (r), mean absolute error (MAE), root mean squared error (RMSE), relative absolute error (RAE) and root relative squared error (RRSE), to evaluate the ML methods. These are defined as follows:
$$r = \frac{n\sum_{i=1}^{n} ET_{obs}\,ET_{est} - \sum_{i=1}^{n} ET_{obs}\sum_{i=1}^{n} ET_{est}}{\sqrt{\left[n\sum_{i=1}^{n}(ET_{obs})^2 - \left(\sum_{i=1}^{n} ET_{obs}\right)^2\right]\left[n\sum_{i=1}^{n}(ET_{est})^2 - \left(\sum_{i=1}^{n} ET_{est}\right)^2\right]}}, \qquad (7)$$
where ETobs is the observed ETo value, and ETest is the estimated ETo value.

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} |f_i - y_i| = \frac{1}{n}\sum_{i=1}^{n} |e_i|, \qquad (8)$$

where fi is the prediction value, and yi is the true value.

$$\mathrm{RMSE} = \sqrt{\sum_{t=1}^{n} \frac{(\hat{y}_t - y_t)^2}{n}}, \qquad (9)$$

$$\mathrm{RAE} = U_1 = \frac{\left[\sum_{i=1}^{n}(P_i - A_i)^2\right]^{1/2}}{\left[\sum_{i=1}^{n}(A_i)^2\right]^{1/2}}, \qquad (10)$$

where the first version of Theil's U, called ‘U1’, is a measure of accuracy, comparing actual values (A) to predicted values (P).

$$\mathrm{RRSE} = E_i = \sqrt{\frac{\sum_{j=1}^{n}\left(P_{(ij)} - T_j\right)^2}{\sum_{j=1}^{n}\left(T_j - \bar{T}\right)^2}}, \qquad (11)$$

where P(ij) is the value predicted by the individual model i for record j (out of n records), Tj is the target value for record j, and $\bar{T}$ is given by the formula $\bar{T} = \frac{1}{n}\sum_{j=1}^{n} T_j$.

3 | RESULTS AND DISCUSSION

This paper applied four ML techniques, M5P tree, SMO, RBFNreg and MLR, for ETo modelling. The modelling tasks were performed based on two scenarios. In the first scenario, the meteorological data for Tmax, Tmin, RH, Ux and n were used as inputs to predict ETo. In the second scenario, we used PCA as a clustering technique to reduce the inputs.

3.1 | PCA results

Due to the strong correlation between the explanatory variables, multicollinearity is a significant issue in multiple linear regression analysis, inflating the variance of the regression parameter estimators. Thus, PCA was suggested and used to address the multicollinearity between the explanatory variables. Table 3 presents the correlation matrices, eigenvalues, eigenvectors and contributions of the variables for the studied meteorological stations. Table 3 shows that Tmax, Tmin, Ux, and n have positive correlation coefficients at the humid and semi-arid stations. At Faisalabad station (semi-arid climate), the ETo has r values with Tmax, Tmin, Ux and n equal to 0.901, 0.911, 0.853 and 0.729, respectively. At the second semi-arid station, Islamabad, the ETo has a significant correlation coefficient with Tmax, Tmin and n, with values equal to 0.868, 0.838 and 0.728, respectively. A humid climate characterizes Skardu station, and the ETo has r values equal to 0.893, 0.911, 0.833 and 0.892 with Tmax, Tmin, Ux and n, respectively. For hyperarid stations, Tmax and Tmin had the highest correlations for the estimation of ETo, which were 0.898 and 0.875, respectively, at Jacobabad station. At Multan station, the highest r values were found between ETo and Tmax (0.913), Tmin (0.914) and Ux (0.884).

The main objective of using PCA in this study was to determine the best predictors of ETo to use in the second proposed scenario. For this aim, the number of components in the rotated space had to be specified correctly. Figure 3 presents a bar diagram; the x-axis presents the components, while the y-axis presents the eigenvalue of each component (on the left) with orange bars and the cumulative percentage of explained variance (on the right) with a purple line. Each station is presented in a separate window. From Figure 3, it is clear that in all five stations, the first two components explained more than 80% of the total variance (Islamabad station 84%, Faisalabad station 88%, Skardu station 93%, Jacobabad station 86% and Multan station 86%). Based on Figure 3, we selected the first two components to present meteorological parameters in a rotated space. Figure 4 illustrates the projection of parameters in a rotated space of two components and eigenvalues for selected stations. Based on Figure 4, a homogeneous group containing ETo at each station was detected and identified by a red dashed circle. At the Islamabad, Jacobabad and Skardu stations, Tmax, Tmin and n formed a homogeneous group with ETo. A new member, Ux, was added to this group at Faisalabad station, whereas at Multan station, Ux joined the homogeneous group of ETo while n was excluded. The RH was found to be out of this group at all stations. The percentage of the presence of ETo with one of the remaining parameters in a homogeneous group at the studied stations is as follows: ETo-Tmax 100%, ETo-Tmin 100%, ETo-n 80%, ETo-Ux 40% and ETo-RHavg < 10%. From the previous results, the parameters selected as good predictors of ETo in the second scenario are Tmax, Tmin and n.
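The component-selection step described above (keeping the leading components once the cumulative explained variance passes a chosen level) can be sketched as follows. The eigenvalues are hypothetical and were picked only so that they sum to six, as they would for a correlation matrix of six standardized variables; they are not the values from Table 3, and the 80% threshold is used here as an assumed cut-off.

```python
def cumulative_explained_variance(eigenvalues):
    """Cumulative percentage of variance explained, largest eigenvalue first."""
    total = sum(eigenvalues)
    cumulative = []
    running = 0.0
    for value in sorted(eigenvalues, reverse=True):
        running += value
        cumulative.append(100.0 * running / total)
    return cumulative


def components_needed(eigenvalues, threshold_pct=80.0):
    """Smallest number of components whose cumulative share reaches the threshold."""
    for k, pct in enumerate(cumulative_explained_variance(eigenvalues), start=1):
        if pct >= threshold_pct:
            return k
    return len(eigenvalues)


# Hypothetical eigenvalues for Tmax, Tmin, RHavg, Ux, n and ETo at one station.
eigenvalues = [4.2, 1.0, 0.4, 0.2, 0.15, 0.05]
print([round(p, 1) for p in cumulative_explained_variance(eigenvalues)])
print(components_needed(eigenvalues))
```

With these illustrative numbers the first component alone explains 70% of the variance and the first two together about 87%, so two components would be retained.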
TABLE 3 (Continued)
Abbreviations: ETo, reference evapotranspiration; n, sunshine hours; RHavg, average relative humidity; Tmax, maximum temperature; Tmin, minimum
temperature; Ux, average wind speed.
3.2 | Suggested scenario results

Table 4 shows the statistical performance indicators for modelling ETo based on four developed ML models at five stations for the first scenario. At Jacobabad station, all models produced satisfactory results, with correlation coefficients (r) ranging from 0.930 to 0.946 in the training phase and from 0.941 to 0.949 in the evaluation period. The RBFNreg model was the optimal choice in ETo estimation. This model generated low statistical errors (MAE = 0.573 mm/day, RMSE = 0.722 mm/day, RAE = 29.532 and RRSE = 32.312) for training, while the corresponding errors for evaluation were MAE = 0.608 mm/day, RMSE = 0.812 mm/day, RAE = 33.330 and RRSE = 37.525. Moreover, it also generated satisfactory results at Multan station, with correlation coefficients
ranging from 0.992 to 0.996 in the training phase and from 0.989 to 0.995 in the evaluation period. It generated low statistical errors (MAE = 0.165 mm/day, RMSE = 0.211 mm/day, RAE = 7.898 and RRSE = 8.923) for training, while the corresponding errors for evaluation were MAE = 0.232 mm/day, RMSE = 0.303 mm/day, RAE = 9.856 and RRSE = 11.311. At both previous stations, RBFNreg was found to be the best model in predicting ETo due to the same climate conditions in the hyperarid region. At the other three stations, due to different climates (located in semi-arid and humid regions), different methods were found to be optimal in modelling ETo. At Faisalabad and Islamabad stations, located in semi-arid regions, all designed models had the same correlation coefficient (r = 0.980), with low variations ranging from 0.004 to 0.008 at Faisalabad and from 0.0016 to 0.0071 at Islamabad during the evaluation process. The superior ETo models were SMO and M5P for the Faisalabad and Islamabad stations, which achieved the lowest values for statistical errors compared to the others. Additionally, at Skardu station (located in a humid region), RBFNreg was the best ML model, which had the highest correlation (r = 0.998) and the lowest values of errors (MAE = 0.100 mm/day, RMSE = 0.135 mm/day, RAE = 5.137 and RRSE = 6.223) through the evaluating phase. In addition, Figure 5 presents scatter plots of the observed (x-axis) and predicted (y-axis) ETo values for the created models. The results indicate that the performances of the RBFNreg predictive model have a high correlation with observations at the three stations (Jacobabad, Multan and Skardu). In
FIGURE 4 Projection of meteorological parameters on a two-dimensional rotated space plot. ETo, reference evapotranspiration; n, sunshine hours; RH, relative humidity; Tmax, maximum temperature; Tmin, minimum temperature; WS, wind speed.
contrast, the SMO and M5P predictive models had the highest correlations with the observed ETo at other stations (Faisalabad and Islamabad) during the training and evaluation periods. It is clearly observed from Figure 5 that all the ML models show less performance at Jacobabad station than at the other four stations. This can
TABLE 4 Performance metrics of developed models in ETo prediction for first scenario.
Train Test
Abbreviations: CCP, coefficient of pearson correlation; ETo, reference evapotranspiration; M5P, M5 pruned; MAE, mean absolute error; MLR, multilinear
regression; RAE, relative absolute error; RBFNreg, radial basis function neural regression; RMSE, root mean squared error; RRSE, root relative squared error;
SMO, sequential minimal optimization.
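For readers who want to reproduce indices of the kind listed in this table, the sketch below computes r, MAE, RMSE, RAE and RRSE from paired observed and estimated values. RAE and RRSE are written in the mean-relative percentage form that Weka-style tools commonly report; this is an assumption about how the tabulated values were produced, and the function name and sample values are hypothetical.

```python
import math


def evaluation_indices(observed, estimated):
    """Return r, MAE, RMSE, RAE (%) and RRSE (%) for paired samples.

    Assumption: RAE and RRSE are taken relative to a predictor that
    always outputs the mean of the observed values.
    """
    n = len(observed)
    mean_obs = sum(observed) / n
    errors = [e - o for o, e in zip(observed, estimated)]

    sum_xy = sum(o * e for o, e in zip(observed, estimated))
    sum_x, sum_y = sum(observed), sum(estimated)
    sum_x2 = sum(o * o for o in observed)
    sum_y2 = sum(e * e for e in estimated)
    r = (n * sum_xy - sum_x * sum_y) / math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )

    mae = sum(abs(err) for err in errors) / n
    rmse = math.sqrt(sum(err ** 2 for err in errors) / n)
    rae = 100.0 * sum(abs(err) for err in errors) / sum(
        abs(o - mean_obs) for o in observed
    )
    rrse = 100.0 * math.sqrt(
        sum(err ** 2 for err in errors) / sum((o - mean_obs) ** 2 for o in observed)
    )
    return {"r": r, "MAE": mae, "RMSE": rmse, "RAE": rae, "RRSE": rrse}


# Hypothetical ETo values in mm/day (not from the paper).
obs = [2.0, 4.0, 6.0, 8.0]
est = [2.5, 3.5, 6.5, 7.5]
for name, value in evaluation_indices(obs, est).items():
    print(f"{name}: {value:.3f}")
```

MAE and RMSE carry the units of ETo (mm/day), whereas r is dimensionless and RAE and RRSE are percentages, which is why the latter two can exceed the scale of the other errors.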
be explained by the lower correlations between the meteorological parameters (inputs) and ETo at this station compared to other stations (see Table 4).

The second scenario consists of reducing the size of required inputs for modelling ETo. The operation of input size reduction is performed based on PCA, as shown in Section 3.1. The results of PCA show that strong correlations exist between ETo and Tmax, Tmin and the hours of sunshine (n). Nevertheless, at Multan station, Tmax, Tmin and Ux seem to have a greater influence on ETo. In the second scenario, a slight reduction in model performance was noted compared to the first scenario, but the results were satisfactory and acceptable. At Jacobabad station, the best performance was registered after the application of the RBFNreg model, where the coefficient of correlation (r) exceeded 90%, with a value equal to 0.940 in the training phase and 0.937 in the evaluation phase. As mentioned above, all models performed well, with the
lowest result in the evaluation phase being 0.923 in the SMO model. At Multan station, the best model performance was registered for M5P tree with a correlation coefficient equal to 0.962 in the training phase and 0.954 in the evaluation phase, followed by the RBFNreg model with 0.952 in the training phase and 0.954 in the
TABLE 5 Performance metrics of developed models in ETo prediction for second scenario (training and testing phases).
Abbreviations: CCP, coefficient of Pearson correlation; ETo, reference evapotranspiration; M5P, M5 pruned; MAE, mean absolute error; MLR, multilinear
regression; RAE, relative absolute error; RBFNreg, radial basis function neural regression; RMSE, root mean squared error; RRSE, root relative squared error;
SMO, sequential minimal optimization.
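The indicators abbreviated above can be computed from paired observed and predicted series. The following is a minimal sketch, assuming the usual textbook definitions (RAE and RRSE are taken relative to a predictor that always returns the observed mean, and both are expressed in percent); the function name and the sample values are hypothetical and are not taken from the study.

```python
import math

def evaluation_metrics(observed, predicted):
    """Return CCP, MAE, RMSE, RAE (%) and RRSE (%) for two equal-length series."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    errors = [p - o for o, p in zip(observed, predicted)]
    dev_obs = [o - mean_obs for o in observed]      # observed minus its mean
    dev_pred = [p - mean_pred for p in predicted]   # predicted minus its mean
    abs_err = sum(abs(e) for e in errors)
    sq_err = sum(e * e for e in errors)
    ss_obs = sum(d * d for d in dev_obs)
    ss_pred = sum(d * d for d in dev_pred)
    cov = sum(a * b for a, b in zip(dev_obs, dev_pred))
    return {
        "CCP": cov / math.sqrt(ss_obs * ss_pred),
        "MAE": abs_err / n,
        "RMSE": math.sqrt(sq_err / n),
        "RAE": 100.0 * abs_err / sum(abs(d) for d in dev_obs),
        "RRSE": 100.0 * math.sqrt(sq_err / ss_obs),
    }

# Hypothetical ETo values (mm/day), for illustration only
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]
metrics = evaluation_metrics(observed, predicted)
```

Lower MAE, RMSE, RAE and RRSE together with a CCP closer to 1 indicate a better fit, which is how the models are ranked in the text.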
evaluation phase. After the application of SMO, the lowest performance was found with an r value equal to 0.929 in the training phase and 0.928 in the evaluation phase. The M5P tree model had the best performance at the Faisalabad, Islamabad and Skardu stations, with r values in the training phase equal to 0.960, 0.951 and 0.967 and in the evaluation phase equal to 0.955, 0.924 and 0.977, respectively. In contrast, the RBFNreg model had the second-best performance after the M5P tree model. It is worth mentioning that the lowest r value in the evaluation phase was registered using the SMO model at Islamabad station, where the performance percentage decreased by 11% compared to the best performance recorded when using the M5P tree model at Skardu station. This result indicates that all models have acceptable performance based on the reduced dataset of inputs used in the second scenario. Table 5 shows the statistical performance indicators for modelling ETo based on the reduced dataset of inputs used in the second scenario. Figure 5 presents the scatter plots between the observed (x-axis)
and predicted (y-axis) ETo values for the created models. From Figure 5, we can see that RBFNreg and the M5P tree have points distributed close to the matching line. It is worth noting that the effectiveness of the input parameters on ETo differs according to correlation analysis and PCA. Table 3 shows that Tmin, Tmax, Ux and n are effective parameters for ETo estimation. In contrast, according to PCA (see Figure 4), Tmin, Tmax and n are found to be effective parameters. This reveals the necessity of PCA in the determination of the most effective input variables. The violin diagrams of both scenarios (S1 and S2) are presented in Figure 6. The M5P tree model performed well in comparison to the other models. Additionally, heatmaps of selected stations from the input dataset, explaining the relation between explanatory and response variables, are presented in Figure 7, which shows that the climatic variables Tmin, Tmax and n were effective for ETo estimation.
3.3 | Uncertainty evaluation of results
Figure 8 depicts a boxplot that analyses the uncertainty in ETo estimation. The boxplot contains the first, second and third quartile values of the estimation models as well as the observed ETo. In addition, according to Kouadri et al. (2021), the margin of deviation (MoD) is one of the methods used to evaluate the performance of an ML model in the estimation operation, as shown in Figure 8. Calculating the MoD allows for the evaluation of anticipated ETo values depending on the degree of error, resulting in model performance analysis. The deviation rate error values between the measured and predicted ETo using the suggested models were determined using the equation below.
$$\mathrm{MoD} = \frac{Y - Y_i}{Y} \times 100, \qquad (12)$$
where MoD is the margin of deviation, Y is the measured ETo value, and Yi is the predicted ETo value.
It is clear from Figures 8 and 9 that the M5P tree model represents the best predictive model for ETo estimation, followed by the RBFNreg model. In addition, it was found that the fluctuations of the SMO and MLR predictive models were far from the range of the observed ETo. Hence, it could be concluded that the M5P tree and
FIGURE 6 Violin plots of first (S1, left side) and second (S2, right side) scenario. ETo, reference evapotranspiration; MLR, multilinear
regression; M5P, M5 pruned; RBF, radial basis function; SMO, sequential minimal optimization.
FIGURE 7 Heat maps between explanatory and response variables for selected stations. ETo, reference evapotranspiration; n, sunshine
hours; RHavg, average relative humidity; Tmax, maximum temperature; Tmin, minimum temperature; Ux, average wind speed.
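A heat map of this kind summarises the pairwise correlation between each explanatory variable and ETo. The sketch below shows one way such values could be obtained with the Pearson coefficient; the function name, the variable names and the sample series are hypothetical illustrations, not the station data or the software used in the study.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

# Hypothetical monthly series, for illustration only
inputs = {
    "Tmax": [20.0, 25.0, 30.0, 35.0, 40.0],
    "RHavg": [80.0, 70.0, 60.0, 50.0, 40.0],
}
eto = [2.0, 3.0, 4.0, 5.0, 6.0]

# One row of the heat map: correlation of each input with ETo
eto_row = {name: pearson(series, eto) for name, series in inputs.items()}
```

Inputs whose coefficient is close to +1 or -1 are the strongest candidates to retain when the input set is reduced.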
RBFNreg are more suitable for predicting ETo in different climate conditions.
3.4 | Comparison between ML and FAO-PM56 methods
The high probability of error in weather data monitoring and recording, mainly in developing countries (e.g. Pakistan) and at meteorological stations run by non-experts, is one of the most compelling arguments in favour of a simpler technique than FAO-PM56. In certain scenarios, data precision and quality metrics may be unreliable (Droogers & Allen, 2002). The potential explanation below supports the statistical index-based findings of our investigation.
Wu et al. (2019) studied the ability of various ML models to estimate ETo using climatic data from local and cross stations. ML-based models demonstrated superior estimation accuracy based on statistical indices (R² = 0.962 and RMSE = 0.263 mm/day). SVM and tree-based ML are considered the best approaches for ETo modelling. Similarly, Mohammadrezapour et al. (2019) utilized SVM, GEP and an adaptive neuro-fuzzy inference system (ANFIS) to estimate ETo utilizing various input climatic combinations. The findings of the chosen ML models indicated that SVM performed the best, with R² = 0.999 and RMSE = 0.434 mm/month. Saggi & Jain (2019) analysed four ML models, including deep learning (DL), the generalized linear model, the gradient boosting machine and TensorFlow (TF), for modelling daily ETo at Indian stations. The DL model had superior performance compared to other models, with the greatest Nash–Sutcliffe efficiency coefficient (NSE) being 0.980 and the lowest RMSE 0.190 mm/day. Shiri et al. (2019) compared the performance of GEP with locally and externally calibrated PT models on ETo estimation. GEP outperformed them (RMSE = 0.462 mm/day; MAE = 0.216 mm/day) and offered the best alternative to the FAO-PM56 approach for ETo modelling, utilizing two meteorological inputs at humid and desert stations of Iran. The acquired ML findings were compared to Valiantzas' empirical equation-based model. It was found that ML with grey wolf optimization techniques (ML GWO) outperformed the empirical equation as determined by the
FIGURE 8 Boxplots of first (S1) and second (S2) scenario. ETo, reference evapotranspiration; MLR, multilinear regression; M5P, M5
pruned; RBF, radial basis function; SMO, sequential minimal optimization.
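The quartiles that a boxplot displays can be computed directly from a series of ETo values. The following is a minimal sketch using Python's standard `statistics.quantiles`; the helper name `box_summary`, the choice of the `inclusive` interpolation method and the sample values are assumptions made for illustration, and plotting software may use a slightly different quartile convention.

```python
import statistics

def box_summary(values):
    """Return the first, second and third quartiles and the interquartile range."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"q1": q1, "median": q2, "q3": q3, "iqr": q3 - q1}

# Hypothetical ETo values (mm/day), for illustration only
observed_eto = [1, 2, 3, 4, 5, 6, 7, 8, 9]
summary = box_summary(observed_eto)
```

Comparing the summary of each model's predictions with that of the observed ETo shows how far the predicted distribution departs from the observed range.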
indices NSE = 0.990 and RMSE = 0.050–0.040 mm/month at both sites.
According to Ferreira et al. (2019), most empirical equations usually presented are site-specific or have less extensive climatic conditions, restricting their worldwide applicability. Consequently, the authors utilized various ML models for ETo modelling in the entire Brazilian region with fewer climate data. In the absence of meteorological data, the authors suggested that ML is the best option. In addition, the study determined that removing RHavg data in the analysis reduces the RMSE by up to 24%. Granata (2019) found ETo ML models (SVM, DT, TB and TF) to be better than the conventional FAO-PM56 method by analysing limited climate data from Florida's humid region. Keshtegar et al. (2019) created a polynomial chaos expansion (PCE) ML model for ETo modelling utilizing restricted meteorological data at two Turkish stations. The findings revealed that the PCE ML
FIGURE 9 Margins of deviation for first and second scenario. MLR, multilinear regression; M5P, M5 pruned; RBF, radial basis
function; SMO, sequential minimal optimization.
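The margin of deviation in Equation (12) is the percentage difference between each measured and predicted ETo value, relative to the measured value. A minimal sketch is given below; the function name and sample values are hypothetical, and the measured values are assumed to be non-zero so that the division is defined.

```python
def margin_of_deviation(measured, predicted):
    """MoD (%) for each pair, following Equation (12): (Y - Yi) / Y * 100."""
    return [(y - yi) / y * 100.0 for y, yi in zip(measured, predicted)]

# Hypothetical ETo values (mm/day), for illustration only
measured = [4.0, 5.0, 8.0]
predicted = [3.0, 5.5, 8.0]
mod = margin_of_deviation(measured, predicted)
```

With this sign convention, a positive MoD corresponds to an underestimate and a negative MoD to an overestimate; values near zero indicate predictions close to the measured ETo.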
model outperformed competing methods and delivered the greatest NSE of 0.999, the lowest RMSE of 0.045 mm and the highest agreement index of 0.999. Globally, Nourani et al. (2019) compared ML and empirical models for ETo modelling in several climatic locations (e.g. Turkey, Iraq, Cyprus, Iran and Libya). The outcomes of ML models surpassed those of empirical models. To enhance the ETo findings, two ensemble models based on ML and empirical equations were also built and compared to single ML and empirical equations. In the case of limited climate data, the study suggested using ML models for ETo modelling. Similarly, Shiri et al. (2019) compared the ML-GEP model against six empirical equations for estimating daily ETo using island climate data
from Iran. The findings indicated that the GEP model was superior to the selected empirical models at the test stations.
Kisi et al. (2015) examined the efficacy of four ML models to successfully predict the monthly ETo in Iran using limited climatic data. The ML-GEP model used in this work exhibited excellent performance with limited input data. Likewise, ML-based models excelled and produced a more accurate estimate of ETo, consistent with our results. Due to the limitations of meteorological data, it is advised to employ ML applications in lieu of empirical and locally calibrated models. In addition, the overestimation and underestimation of ETo values by ML models is contingent on correct calibration during the training phase. A larger amount of training data results in an underestimation of ETo, whereas less training data results in an overestimation of ETo. Implementing ML models with limited data requires good training, and ETo models based on ML may be applied to various climatic conditions for their validation and verification. To validate the efficacy of the generated ML ETo models, this work examined ML models under various meteorological conditions. Table 6 compares data requirements for the FAO-PM56 and ML models to estimate ETo. Table 6 demonstrates that FAO-PM56 is dependent on various parameters that are difficult to obtain, particularly in developing regions. Unlike the FAO-PM56 method, ML models use fewer parameters (Tmax, Tmin and n) and generate reliable ETo values. The symbol ‘■’ in Table 6 denotes the parameters required for ETo estimation, whereas ‘□’ indicates that they are not utilized in the corresponding method.
Significant efforts have been made to develop ML models that reliably predict ETo based on a few input data. Zhu et al. (2020) attempted to create an ML model using solely temperature data, and they compared the model's ETo predictions to several empirical equations. The study examined which ML model produced the least error during testing and had the highest correlation (R² = 89%) between the estimated and actual ETo. The method of earlier studies has generally been applied to estimate ETo over several sites, as demonstrated above. Even though researchers created robust ML models with high accuracy, the approaches mentioned could not be used to create a generic model in place of separately estimating each case study. However, utilizing climatic data from different meteorological locations in our analysis, the recommended ML model successfully calculated the ETo.
3.5 | ETo interpolation maps based on best ML model
ArcMap GIS 10.1 software was used to create ETo variation maps for the climatic stations under study. Inverse distance weighted (IDW) interpolation was used to develop a surface raster map based on the M5P tree output. Figure 10 presents the ETo variation at the Faisalabad, Islamabad, Jacobabad, Multan and Skardu climatic stations. The lowest ETo for all the studied stations was recorded at Faisalabad with 0.950 mm, Islamabad 0.800 mm, Jacobabad 1.370 mm, Multan 1.100 mm and Skardu 0.200 mm, while the highest ETo was noted at Faisalabad with 10.960 mm, Islamabad 8.34 mm, Jacobabad 11.020 mm, Multan 10.290 mm and Skardu 4.620 mm. In Figure 10, red indicates a high ETo, yellow a moderate ETo and green a low ETo value. It can also be noted that the climatic stations in the arid region (Jacobabad, Multan) have the highest ETo variation. In contrast, the climatic station in the humid region (Skardu) showed the lowest ETo variation. However, the climatic stations in semi-arid regions (Faisalabad and Islamabad) showed average ETo variation. The temperature in the arid region rises in the peak months (June, July and August) due to more sunshine hours; therefore, ETo was highest in this region. On the other hand, rainfall occurred in the humid region, which cooled the atmospheric temperature and reduced sunshine hours. Therefore, ETo was recorded lowest in this region. These ETo maps are based on effective climatic variables and present realistic scenarios that can be used to estimate agricultural water needs and maximize yields while conserving water.
Abbreviations: Δ, vapour pressure curve slope constant; γ, vapour pressure psychrometric constant; ea, actual vapour pressure; es, saturation vapour pressure;
ETo, reference evapotranspiration; FAO-PM56, Penman–Monteith equation of the Food and Agriculture Organization; ML, machine learning; n, sunshine
hours; RH, relative humidity; Rn, net radiation; Tmax, maximum temperature; Tmin, minimum temperature; emin, minimum vapour pressure; emax, maximum
vapour pressure; Z, height of installed instrument; U, wind speed.
FIGURE 10 Variations in ETo at studied stations (Faisalabad, Islamabad, Jacobabad, Multan and Skardu). ETo, reference
evapotranspiration.
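Maps of this kind rest on inverse distance weighted (IDW) interpolation, in which the value at an unsampled location is a weighted average of station values, with weights that decay with distance. The sketch below illustrates the general technique only; it is not the ArcMap implementation, and the function name, the planar coordinates, the power of 2 and the sample values are all assumptions made for illustration.

```python
import math

def idw_interpolate(stations, target, power=2.0):
    """Estimate a value at target (x, y) from stations given as (x, y, value).

    Each station is weighted by 1 / distance**power (power=2 is assumed here).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in stations:
        distance = math.hypot(target[0] - x, target[1] - y)
        if distance == 0.0:
            return value  # target coincides with a station
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total

# Hypothetical stations: (x, y, ETo in mm), for illustration only
stations = [(0.0, 0.0, 2.0), (2.0, 0.0, 6.0)]
midpoint_eto = idw_interpolate(stations, (1.0, 0.0))
at_station_eto = idw_interpolate(stations, (0.0, 0.0))
```

Evaluating the function on a regular grid of target points yields a raster surface; a larger power makes the surface follow the nearest station more closely.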
environments than the other deployed ML methods. Considering the outcomes of the performance statistical indices using limited climatic input for modelling ETo, the M5P tree ranked first, followed by the RBFNreg model.
Future advancements of this work will involve establishing additional ETo models based on hybrid data intelligence (HDI) approaches and ELM that evaluate numerous ecological factors. Effective planning, administration and control of water resource systems necessitate extensive and dependable data on ETo modelling, which requires the implementation of gap-filling approaches. Due to their vital function in developing local models in areas with enough data, regional models should be given special consideration. ETo maps are useful for managing data related to ground and surface water resource planning and management, as well as other water-related topics such as regional water usage analysis, water allocation, water consumption and water rights.
ACKNOWLEDGEMENTS
We are thankful to the editor and anonymous reviewers for their time in improving the quality of this article. This research was supported by Key R&D program of Jiangsu Provincial Government (BE2021340), China, Jiangsu Postdoctoral Science Foundations (2016M600376 and 1601032C), and the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD-2018-87).
CONFLICT OF INTEREST STATEMENT
The authors declare that no conflict of interest is associated with this research article.
DATA AVAILABILITY STATEMENT
Data are available on reasonable request.
REFERENCES
Abdullah, S.S., Malek, M.A., Abdullah, N.S., Kisi, O. & Yap, K.S. (2015) Extreme learning machines: a new approach for prediction of reference evapotranspiration. Journal of Hydrology, 527, 184–195. Available from: https://doi.org/10.1016/j.jhydrol.2015.04.073
Allen, R.G., Pereira, L.S., Raes, D. & Smith, M. (1998) Crop evapotranspiration-guidelines for computing crop water requirements-FAO irrigation and drainage paper 56. FAO, Rome, 300(9), 5109.
Bahrami, M., Zarei, A.R., Moghimi, M.M. & Mahmoudi, M.R. (2019) Trend analysis of evapotranspiration applying parametric and non-parametric techniques (case study: arid regions of southern Iran). Sustainable Water Resources Management, 5(4), 1981–1994. Available from: https://doi.org/10.1007/s40899-019-00352-z
Dhillon, R., Rojo, F., Upadhyaya, S.K., Roach, J., Coates, R. & Delwiche, M. (2019) Prediction of plant water status in almond and walnut trees using a continuous leaf monitoring system. Precision Agriculture, 20(4), 723–745. Available from: https://doi.org/10.1007/s11119-018-9607-0
Droogers, P. & Allen, R.G. (2002) Estimating reference evapotranspiration under inaccurate data conditions. Irrigation and Drainage Systems, 2002(16), 33–45. Available from: https://doi.org/10.1023/a:1015508322413
Elbeltagi, A., Raza, A., Hu, Y., al-Ansari, N., Kushwaha, N.L., Srivastava, A., et al. (2022) Data intelligence and hybrid metaheuristic algorithms-based estimation of reference evapotranspiration. Applied Water Science, 12(7), 152, 1–18. Available from: https://doi.org/10.1007/s13201-022-01667-7
Fan, J., Yue, W., Wu, L., Zhang, F., Cai, H., Wang, X., et al. (2018) Evaluation of SVM, ELM and four tree-based ensemble models for predicting daily reference evapotranspiration using limited meteorological data in different climates of China. Agricultural and Forest Meteorology, 263, 225–241. Available from: https://doi.org/10.1016/j.agrformet.2018.08.019
Ferreira, L.B. & da Cunha, F.F. (2020) New approach to estimate daily reference evapotranspiration based on hourly temperature and relative humidity using machine learning and deep learning. Agricultural Water Management, 234, 106113. Available from: https://doi.org/10.1016/j.agwat.2020.106113
Ferreira, L.B., da Cunha, F.F., de Oliveira, R.A. & Filho, E.I.F. (2019) Estimation of reference evapotranspiration in Brazil with limited meteorological data using ANN and SVM – a new approach. Journal of Hydrology, 572, 556–570. Available from: https://doi.org/10.1016/j.jhydrol.2019.03.028
Gavilan, P., Berengena, J. & Allen, R.G. (2007) Measuring versus estimating net radiation and soil heat flux: impact on Penman–Monteith reference ET estimates in semiarid regions. Agricultural Water Management, 89(3), 275–286. Available from: https://doi.org/10.1016/j.agwat.2007.01.014
Granata, F. (2019) Evapotranspiration evaluation models based on machine learning algorithms – a comparative study. Agricultural Water Management, 217, 303–315. Available from: https://doi.org/10.1016/j.agwat.2019.03.015
Han, Y., Wu, J., Zhai, B., Pan, Y., Huang, G., Wu, L., et al. (2019) Coupling a bat algorithm with xgboost to estimate reference evapotranspiration in the arid and semiarid regions of China. Advances in Meteorology, 2019, 1–16. Available from: https://doi.org/10.1155/2019/9575782
Ibrahim, D. (2016) An overview of soft computing. Procedia Computer Science, 102, 34–38. Available from: https://doi.org/10.1016/j.procs.2016.09.366
Keshtegar, B., Zounemat-Kermani, M. & Kisi, O. (2019) Polynomial chaos expansion and response surface method for non-linear modelling of reference evapotranspiration. Hydrological Sciences Journal, 2019(64), 720–730. Available from: https://doi.org/10.1080/02626667.2019.1601727
Kişi, O. & Cimen, M. (2009) Evapotranspiration modelling using support vector machines. Hydrological Sciences Journal, 54(5), 918–928. Available from: https://doi.org/10.1623/hysj.54.5.918
Kisi, O., Sanikhani, H., Zounemat-Kermani, M. & Niazi, F. (2015) Long-term monthly evapotranspiration modeling by several data-driven methods without climatic data. Computers and Electronics in Agriculture, 115, 66–77. Available from: https://doi.org/10.1016/j.compag.2015.04.015
Kouadri, S., Kateb, S. & Zegait, R. (2021) Spatial and temporal model for WQI prediction based on back-propagation neural network, application on EL MERK region (Algerian southeast). Journal of the Saudi Society of Agricultural Sciences, 20(5), 324–336. Available from: https://doi.org/10.1016/j.jssas.2021.03.004
Kumar, R., Shankar, V. & Kumar, M. (2011) Modelling of crop reference evapotranspiration: a review. Universal Journal of Environmental Research and Technology, 1(3), 239.
Mattar, M.A. (2018) Using gene expression programming in monthly reference evapotranspiration modeling: a case study in Egypt. Agricultural Water Management, 198, 28–38. Available from: https://doi.org/10.1016/j.agwat.2017.12.017
Mcmahon, T., Peel, M., Lowe, L., Srikanthan, R. & Mcvicar, T. (2013) Estimating actual, potential, reference crop and pan evaporation using standard meteorological data: a pragmatic synthesis. Hydrology and Earth System Sciences, 17(4), 1331–1363. Available from: https://doi.org/10.5194/hess-17-1331-2013
Mehdizadeh, S., Behmanesh, J. & Khalili, K. (2017) Using MARS, SVM, GEP and empirical equations for estimation of monthly mean reference evapotranspiration. Computers and Electronics in Agriculture, 139, 103–114. Available from: https://doi.org/10.1016/j.compag.2017.05.002
Mohammadrezapour, O., Piri, J. & Kisi, O. (2019) Comparison of SVM, ANFIS and GEP in modeling monthly potential evapotranspiration in an arid region (Case study: Sistan and Baluchestan Province, Iran). Water Supply, 19(2), 392–403. Available from: https://doi.org/10.2166/ws.2018.084
Nourani, V., Elkiran, G. & Abdullahi, J. (2019) Multi-station artificial intelligence based ensemble modeling of reference evapotranspiration using pan evaporation measurements. Journal of Hydrology, 577, 123958. Available from: https://doi.org/10.1016/j.jhydrol.2019.123958
Nouri, H., Beecham, S., Kazemi, F., Hassanli, A.M. & Anderson, S. (2013) Remote sensing techniques for predicting evapotranspiration from mixed vegetated surfaces. Hydrology and Earth System Sciences Discussions, 10(3), 3897–3925. Available from: https://doi.org/10.5194/hessd-10-3897-2013
Quinlan, J.R. (1992) Learning with continuous classes. 5th Australian joint conference on artificial intelligence, 92, 343–348.
Rahimikhoob, A. (2010) Estimation of evapotranspiration based on only air temperature data using artificial neural networks for a subtropical climate in Iran. Theoretical and Applied Climatology, 101(1-2), 83–91. Available from: https://doi.org/10.1007/s00704-009-0204-z
Raza, A., Hu, Y., Shoaib, M., Abd Elnabi, M.K., Zubair, M., Nauman, M., et al. (2021) A systematic review on estimation of reference evapotranspiration under Prisma guidelines. Polish Journal of Environmental Studies, 30, 5413–5422. Available from: https://doi.org/10.15244/pjoes/136348
Raza, A., Shoaib, M., Baig, M.A.I., Ahmad, S., Khan, M.M., Ullah, M.K., et al. (2021) Comparative study of powerful predictive modeling techniques for modeling monthly reference evapotranspiration in various climatic regions. Fresenius Environmental Bulletin, 30(6b), 7490–7513.
Raza, A., Shoaib, M., Khan, A., Baig, F., Faiz, M.A. & Khan, M.M. (2020) Application of non-conventional soft computing approaches for estimation of reference evapotranspiration in various climatic regions. Theoretical and Applied Climatology, 139(3-4), 1459–1477. Available from: https://doi.org/10.1007/s00704-019-03007-3
Saggi, M.K. & Jain, S. (2019) Reference evapotranspiration estimation and modeling of the Punjab northern India using deep learning. Computers and Electronics in Agriculture, 156, 387–398. Available from: https://doi.org/10.1016/j.compag.2018.11.031
Shamshirband, S. & Kamsin, A. (2016) Comparative analysis of reference evapotranspiration equations modelling by extreme learning machine. Computers and Electronics in Agriculture, 127, 56–63. Available from: https://doi.org/10.1016/j.compag.2016.05.017
Shiri, J., Nazemi, A.H., Sadraddini, A.A., Marti, P., Fakheri Fard, A., Kisi, O., et al. (2019) Alternative heuristics equations to the Priestley–Taylor approach: assessing reference evapotranspiration estimation. Theoretical and Applied Climatology, 138(1-2), 831–848. Available from: https://doi.org/10.1007/s00704-019-02852-6
Trajkovic, S. & Kolakovic, S. (2009) Estimating reference evapotranspiration using limited weather data. Journal of Irrigation and Drainage Engineering, 135(4), 443–449. Available from: https://doi.org/10.1061/(ASCE)IR.1943-4774.0000094
Valipour, M., Sefidkouhi, M.A.G., Raeini-Sarjaz, M. & Guzman, S.M. (2019) A hybrid data-driven machine learning technique for evapotranspiration modeling in various climates. Atmosphere (Basel), 10(6), 311. Available from: https://doi.org/10.3390/atmos10060311
Walls, S., Binns, A.D. & Levison, J. (2020) Prediction of actual evapotranspiration by artificial neural network models using data from a Bowen ratio energy balance station. Neural Computing and Applications, 32(17), 14001–14018. Available from: https://doi.org/10.1007/s00521-020-04800-2
Wang, J., Raza, A., Hu, Y., Buttar, N.A., Shoaib, M., Saber, K., et al. (2022) Development of monthly reference evapotranspiration machine learning models and mapping of Pakistan – a comparative study. Water, 14(10), 1666. Available from: https://doi.org/10.3390/w14101666
Wang, S., Fu, Z.-Y., Chen, H., Nie, Y.-P. & Wang, K.-L. (2016) Modeling daily reference ET in the karst area of Northwest Guangxi (China) using gene expression programming (GEP) and artificial neural network (ANN). Theoretical and Applied Climatology, 126(3-4), 493–504. Available from: https://doi.org/10.1007/s00704-015-1602-z
Wen, X., Si, J., He, Z., Wu, J., Shao, H. & Yu, H. (2015) Support-vector-machine-based models for modeling daily reference evapotranspiration with limited climatic data in extreme arid regions. Water Resources Management, 29(9), 3195–3209. Available from: https://doi.org/10.1007/s11269-015-0990-2
Wu, L., Peng, Y., Fan, J. & Wang, Y. (2019) Machine learning models for the estimation of monthly mean daily reference evapotranspiration based on cross-station and synthetic data. Hydrology Research, 50(6), 1730–1750. Available from: https://doi.org/10.2166/nh.2019.060
Yin, Z., Feng, Q., Yang, L., Deo, R.C., Wen, X., Si, J., et al. (2017) Future projection with an extreme-learning machine and support vector regression of reference evapotranspiration in a