
Ecotoxicology and Environmental Safety 253 (2023) 114665


The risk assessment of arsenic contamination in the urbanized coastal aquifer of Rayong groundwater basin, Thailand using the machine learning approach
Narongpon Sumdang a, Srilert Chotpantarat b, c, *, Kyung Hwa Cho d, Nguyen Ngoc Thanh e
a International Postgraduate Program in Hazardous Substance and Environmental Management, Graduate School, Chulalongkorn University, Bangkok 10330, Thailand
b Department of Geology, Faculty of Science, Chulalongkorn University, Bangkok 10330, Thailand
c Center of Excellence in Environmental Innovation and Management of Metals (EnvIMM), Chulalongkorn University, Phayathai Road, Pathumwan, Bangkok 10330, Thailand
d Department of Urban and Environmental Engineering, Ulsan National Institute of Science and Technology, 50, UNIST-gil, Ulsan 44919, Republic of Korea
e University of Agriculture and Forestry, Hue University, 102 Phung Hung Str, Hue City, Viet Nam

ARTICLE INFO

Edited by Dr. Hao Zhu

Keywords: Arsenic; Machine learning; Groundwater contamination; Spatial probability model; Groundwater risk assessment; Thailand

ABSTRACT

The rapid expansion of urbanization has resulted in insufficient groundwater resources. To use groundwater more efficiently, a risk assessment of groundwater pollution is needed. The present study used machine learning with three algorithms, Random Forest (RF), Support Vector Machine (SVM), and Artificial Neural Network (ANN), to locate risk areas of arsenic contamination in the Rayong coastal aquifers, Thailand, and selected the most suitable model for risk assessment based on model performance and uncertainty. The parameters of 653 groundwater wells (Deep=236, Shallow=417) were selected based on the correlation of each hydrochemical parameter with arsenic concentration in the deep and shallow aquifer environments. The models were validated with arsenic concentrations collected from 27 wells in the field. The RF algorithm had the highest performance compared to SVM and ANN in both the deep and shallow aquifers (Deep: AUC=0.72, Recall=0.61, F1=0.69; Shallow: AUC=0.81, Recall=0.79, F1=0.68). In addition, the uncertainty from the quantile regression of each model confirmed that the RF algorithm has the lowest uncertainty (Deep: PICP=0.20; Shallow: PICP=0.34). The risk map obtained from the RF model reveals that, in the deep aquifer, the northern part of the Rayong basin has a higher risk of human exposure to As. In contrast, in the shallow aquifer the southern part of the basin has a higher risk, which is also supported by the location of the landfill and industrial estates in the area. Therefore, health surveillance is important for monitoring the toxic effects on residents who use groundwater from these contaminated wells. The outcome of this study can help regional policymakers manage the quality of groundwater resources and enhance their sustainable use. The novel process of this research can be used to study other contaminated groundwater aquifers and to increase the effectiveness of groundwater quality management.

1. Introduction

Rapid urbanization increases the demand for water resources for development and has caused water shortages, which are mainly influenced by land use, industrial structure, and population density (Chen et al., 2019). Groundwater, a major freshwater resource that supports sustainable water use, plays an important role in terrestrial and aquatic ecosystems and in water consumption for human activities such as drinking, agriculture, and industry (Cho et al., 2011). However, groundwater contamination has become a concern, especially arsenic (As) contamination, which is associated with various health risks in many countries, such as China, Taiwan, Bangladesh, India, the USA, Vietnam, Japan, and Thailand (Wongsasuluk et al., 2018a, 2018b). As is a major carcinogenic element that is mostly present in groundwater in a toxic form as an inorganic component (Chetia et al., 2010). Arsenic can

* Corresponding author at: Department of Geology, Faculty of Science, Chulalongkorn University, Bangkok 10330, Thailand.
E-mail address: Srilert.C@chula.ac.th (S. Chotpantarat).

https://doi.org/10.1016/j.ecoenv.2023.114665
Received 13 July 2022; Received in revised form 26 December 2022; Accepted 15 February 2023
Available online 28 February 2023
0147-6513/© 2023 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
N. Sumdang et al. Ecotoxicology and Environmental Safety 253 (2023) 114665

be considered a toxic element to humans in several forms, especially arsenate (As(V)), arsenite (As(III)), and organic As compounds. A lethal dose in humans is 1.5 mg/kg of body weight (WHO, 2018). The acute intoxication symptoms include vomiting, abdominal pain, muscular pain, diarrhea, and weakness, with flushing of the skin. The chronic intoxication symptoms include dermal lesions such as hypopigmentation and hyperpigmentation, skin cancer, peripheral neuropathy, lung and bladder cancers, and peripheral vascular disease (WHO, 2018). Because of As toxicity, monitoring methods need to be developed to estimate and predict As in groundwater. A proper method could provide the necessary information for better public health and water management (Cho et al., 2011). However, due to a lack of equipment and human resources, As contamination remains a problem. Therefore, modeling methods that predict As contamination from hydrogeological and on-field measurement data can be used to evaluate the potential area of As contamination, as well as to provide the information needed to improve public health and water management (Cho et al., 2011). The prediction information, in the form of the probability and risk maps generated from the model, will help local government agencies manage groundwater and plan the installation of a monitoring well system. Probability maps are used in many types of environmental science studies to convey the percent chance of encountering the element under study (Sajedi Hosseini et al., 2018).

Machine learning (ML) has been used in several fields of environmental science. The power of ML comes from its nonlinear modeling capability, which is used for assessment in many areas of environmental science, including groundwater contamination studies. ML has been applied to several risk assessments of groundwater resources (Sajedi Hosseini et al., 2018; Bindal and Singh, 2019; Podgorski et al., 2020). Commonly used ML algorithms include random forest (RF), support vector machine (SVM), and artificial neural network (ANN). The RF algorithm is often used in spatial distribution studies, where it has proven effective and accurate with limited data sets and provides better stability in the evaluation process (Liu et al., 2019). The SVM, on the other hand, is mainly used to classify data into classes with high accuracy by constructing a hyperplane (Ghosh et al., 2020). ANNs have been used in many studies to evaluate contamination in groundwater and have the advantage of being able to classify complex datasets (Cho et al., 2011; Dawood et al., 2020). When combined with the PCA technique, ANN algorithms provided significant results in determining As contamination in groundwater in Cambodia, Laos, and Thailand (Cho et al., 2011). Moreover, ANNs are often used to model the classification of spatial distributions. In addition, several studies have shown that deep neural networks (DNN) are very effective. However, a DNN is highly effective when applied to a heterogeneous and high-volume dataset (Sarker, 2021), whereas the amount of data used in this study was limited, which makes an ANN the more efficient choice. All three algorithms are suitable for handling a large dataset with many variables to analyze and predict groundwater contamination.

In many previous studies, physical parameters such as land use, soil map, and hydrogeology parameters were mostly used with ML approaches to predict groundwater pollution risk (Winkel et al., 2008; Podgorski et al., 2020). The combined use of physical and hydrochemical parameters to predict contamination in groundwater, particularly As contamination in urbanized coastal aquifers, is still poorly understood (Zubair et al., 2015). To fill this research gap, this study applied ML algorithms, including RF, SVM, and ANN, to investigate and predict As contamination in the Rayong coastal basin. The objectives of the current study are: (i) to evaluate and compare ML algorithms such as RF, SVM, and ANN for predicting As contamination in groundwater, (ii) to investigate various environmental factors that influence As contamination in groundwater, and (iii) to generate As probability maps and groundwater pollution risk maps of As contamination based on population density and water consumption in the study area. This study can provide valuable predictive information for better public health management and groundwater quality control, supporting decision-makers in adopting suitable plans to preserve groundwater quality and reduce contamination of water resources. Good groundwater quality will reduce the cost of providing clean water for human use and help the sustainable development of water management, water resources, and the environment.

2. Materials and methods

2.1. Study area

The study area is in the Rayong groundwater basin in eastern Thailand. It is a significant industrial and agricultural area that has undergone rapid development, leading to the potential for many pollutants to contaminate groundwater (Sonthiphand et al., 2019). Thailand usually experiences dry weather in winter due to the northeast monsoon, which is the main control on the climate of this region (TMD, 2015). In Rayong province, annual rainfall averages around 1500 mm/year, which is the average annual rainfall in Thailand (TMD, 2015). Among the hydrogeological units in the study area, Quaternary colluvium sediment (Qc) has the largest aquifer area. Another main hydrogeological unit is Quaternary alluvium sediment, which is located alongside the main river in the study area. The other two aquifers are consolidated aquifers, granite (Gr) and limestone (Pl), which are in mountainous areas distributed around the groundwater basin. According to population density data from the National Statistical Office in 2020, the area can be divided into ten districts with different population densities. The most densely populated district is Bang Lamung with 643 people per square kilometer. Mueang Rayong is the second most densely populated district with 548 people per square kilometer, followed by Si Racha, Sattahip, Ban Chang, Nikhom Phatthana, Ban Bueng, Ban Khai, Pluak Daeng and Nong Yai with 512, 496, 313, 208, 166, 138, 110 and 59 people per square kilometer, respectively. The Rayong groundwater basin has been reported in many studies to be associated with As contamination in groundwater, which could consequently pose various health risks to local people (Kerdthep et al., 2009; Boonkaewwan et al., 2017; Pipattanajaroenkul et al., 2018; Boonkaewwan et al., 2020). Besides, the study area is part of the Eastern Economic Corridor (EEC), the ongoing economic plan for Thailand's Eastern Seaboard, and the government has been launching measures to support economic growth in the EEC, which is intended to develop into a new trade center in Asia (Niyomsilp et al., 2020). As a result, the demand for groundwater in this region will increase dramatically, and groundwater quality must be considered a priority before pumping groundwater to supply each sector (Fig. 1).

2.2. Framework of research

The overall framework of this study is shown in Fig. 2. Data preparation was conducted by collecting secondary data from various government agencies, such as hydrochemical data of groundwater from 653 groundwater wells (Deep = 236, Shallow = 417), soil type, land use, geological characteristics, and aquifer types. The hydrogeological data were used to categorize the aquifer into shallow and deep aquifers. The As concentration was classified by the drinking water standard for further use in the modeling process. Other physical and hydrochemical parameters were selected using Spearman's correlation technique to screen out unnecessary parameters. In the modeling process, spatial modeling uses different algorithms to generate different probability


Fig. 1. Rayong groundwater basin.

Fig. 2. Framework of the present study.
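The parameter screening step of this framework can be illustrated with a small sketch. The authors ran Spearman's rank correlation in SPSS; the pure-Python version below is only an illustration under stated assumptions. The function names, the toy numbers, and the use of the absolute value of rs against the 0.1 cut-off reported in Section 2.4 are assumptions, not details confirmed by the paper.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rs computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    if sx == 0 or sy == 0:
        return 0.0  # assumption: a constant parameter carries no rank signal
    return cov / (sx * sy)


def screen_parameters(arsenic, candidates, threshold=0.1):
    """Keep parameters whose |rs| with As exceeds the threshold (assumed rule)."""
    kept = {}
    for name, values in candidates.items():
        rs = spearman(arsenic, values)
        if abs(rs) > threshold:
            kept[name] = rs
    return kept


if __name__ == "__main__":
    # Hypothetical toy values, not data from the study.
    arsenic = [2.0, 4.0, 8.0, 12.0, 20.0]
    candidates = {"pH": [6.1, 6.4, 6.9, 7.2, 7.8], "flat": [3.0] * 5}
    print(screen_parameters(arsenic, candidates))
```

Working on ranks rather than raw values is what makes the screen nonparametric: a monotonic but nonlinear relation between a parameter and As still yields a high rs.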

models such as support vector machine (SVM), random forest (RF), and artificial neural networks (ANNs). The probability models were compared with each other in the validation process. Three aspects were used in the validation process to assess the models: classification performance, uncertainty evaluation, and prediction performance validated against the field data. The best probability


map was combined with the population density and water consumption data in the study area to generate the risk maps.

2.3. Data preparation and field-collected data

2.3.1. Data preparation

The secondary data were collected from various Thai government departments such as the Department of Mineral Resources (DMR), the Land Development Department (LDD), and the Department of Groundwater Resources (DGR). The data were classified into two groups, physical and hydrochemical data. The hydrochemical data were collected in 2011–2012, 2017–2018, and 2019 in the dry and rainy seasons around the study area (DMR, 2007; DGR, 2012, 2017; LDD, 2016). The physical data comprised geological characteristics, soil properties, soil texture, aquifer characteristics, land use, and groundwater level. None of these data are more than 10 years old, as shown in Table 1. Hydrochemical data collected from the DGR included concentrations of magnesium (Mg), calcium (Ca), sodium (Na), chloride (Cl), iron (Fe), fluoride (F⁻), potassium (K), carbonate (CO₃²⁻), bicarbonate (HCO₃⁻), sulfate (SO₄²⁻), nitrate (NO₃⁻), phosphate (PO₄³⁻), cadmium (Cd), copper (Cu), chromium (Cr⁶⁺), mercury (Hg), nickel (Ni), manganese (Mn), lead (Pb), selenium (Se), zinc (Zn), and arsenic (As), as well as on-field hydrochemical measurements consisting of temperature, electrical conductivity (EC), pH, total hardness (TTH), and total dissolved solids (TDS). These were converted from point data to raster data using the interpolation technique.

2.3.2. Collected data in the field

The groundwater samples were collected in the rainy season because the As concentrations across the observation wells in the rainy season were higher than those in the summer season, as previously reported. In addition, As concentrations exceeding the drinking water standards of the World Health Organization (WHO) were found in some shallow wells located in the community areas within the study area. Thus, the rainy season was the focus of this study because As contamination in groundwater in this period is harmful to the local people, who generally use groundwater for several purposes, especially the public water supply.

The groundwater's physicochemical parameters, such as temperature, pH, and electrical conductivity (EC), were analyzed. A plastic bailer was then used to collect groundwater samples. All groundwater samples were collected and preserved in 60 mL HDPE bottles that had been cleaned with deionized water and 10% HNO3. To prevent sample degradation, all samples were transferred to the laboratory in coolers with ice at 4 °C and preserved with HNO3 for heavy metal analysis (American Public Health Association, 2012). The samples were analyzed by the United Analyst and Engineering Consultant Co., LTD for total As using the hydride generation AAS method (Shimadzu, AA-6200) (Fig. 3).

2.4. Arsenic contamination probability map

The physical and hydrochemical parameters were analyzed with nonparametric Spearman's rank correlation using the SPSS software to study the correlation between As and the other parameters and to investigate the association of variables (Sangkham et al., 2021). The parameters with a correlation coefficient (rs) in excess of 0.1 with As concentration were classified as influential parameters associated with As in groundwater and were converted into raster data using the Inverse Distance Weighting (IDW) technique with the AutoMap package in R-studio software. IDW interpolation converts concentration point data into concentration area data by averaging the point data (Gaafar, 2019) and has been reported to estimate groundwater As concentrations better than the Kriging interpolation technique (Gong et al., 2014).

The framework of the probability model is shown in Fig. 4. Arsenic concentrations obtained from the dataset were used to establish a groundwater As probability map in the Rayong groundwater basin. Following the World Health Organization standard, an arsenic threshold of 10 μg/l was used to classify polluted and non-polluted groundwater wells. Groundwater wells were classified into 2 types as follows: the polluted type (As > 10 μg/l) and the non-polluted type (As < 10 μg/l). After that, the datasets were divided into a training dataset (containing 70% of the dataset) and a testing dataset (containing 30% of the dataset) (Sajedi Hosseini et al., 2018). The training set was used to build the probability models with three algorithms, SVM, RF, and ANN, coded with the SDM package in the R-studio software (Naimi and Araújo, 2016). The SVM algorithm follows the default settings of the svm function from the package 'e1071', with the Epsilon parameter and the Algocv integer set by default to 1e-08 and 3. The RF algorithm used the function from the package 'randomForest', with the Trees integer and the Final leave integer set by default to 2500 and 1. The ANN algorithm uses a function from the package 'nnet' with a maximum number of iterations set to 500, with 12,12,12 hidden layers in the deep condition and 22,22,22 hidden layers in the shallow condition (Naimi et al., 2016). Precision metrics such as recall, F1-score, and AUC are robust evaluation metrics that work well for many classification problems (Magalhaes et al., 2021). Therefore, the model performance was assessed using these precision metrics with the test dataset (30% of the dataset). The receiver operating characteristic (ROC) is a tool to measure the performance of a model in a classification task at various threshold settings. The area under the curve (AUC) expresses the degree of separability, that is, how well the model distinguishes between the classes. If the AUC is above 60%, the model is acceptable. In contrast, if the AUC is below 60%, the model must be calibrated and rebuilt until it reaches an AUC of 60% (Havryliuk et al., 2018). Moreover, a Taylor diagram was used to provide a graphical comparison, visualizing the models' performance by combining standard deviations (SD) and correlation coefficients (Taylor, 2001; Choubin et al., 2017). When the model was completed, the As probability map was generated using the SDM package in R-studio software (Naimi and Araújo, 2016).
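The labelling, splitting, and scoring steps of Section 2.4 can be sketched as follows. The study itself used the SDM package in R, so this plain-Python version is an illustrative assumption rather than the authors' code. The function names are hypothetical, the random seed is arbitrary, and treating a well at exactly 10 μg/l as non-polluted is an assumption because the text defines only As > 10 and As < 10.

```python
import random

WHO_LIMIT_UG_L = 10.0  # WHO drinking water threshold for As cited in the text


def label_wells(concentrations):
    """1 = polluted (As > 10 ug/l), 0 = non-polluted (assumed to include As == 10)."""
    return [1 if c > WHO_LIMIT_UG_L else 0 for c in concentrations]


def split_70_30(records, seed=0):
    """Shuffle and split records into 70% training and 30% testing subsets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def recall_f1(y_true, y_pred):
    """Recall and F1-score for the polluted class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return recall, f1


def auc_score(y_true, scores):
    """AUC as the chance that a polluted well outscores a non-polluted one.

    Ties between a polluted and a non-polluted score count as one half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one well of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    labels = label_wells([3.2, 14.0, 9.9, 25.1])  # hypothetical concentrations
    print(labels, auc_score(labels, [0.2, 0.8, 0.6, 0.7]))
```

The pairwise form of the AUC is equivalent to the area under the ROC curve and makes the 60% acceptance rule easy to read: a value of 0.5 means the model ranks polluted and non-polluted wells no better than chance.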
Table 1
Sources of physical data.

Data                         Data source                            Data format   Year
Geological characteristics   Ministry of Natural Resources          Shapefile     2015
Soil properties              Land Development Department            Shapefile     2011
Aquifer characteristics      Department of Groundwater Resources    Shapefile     2017
Land use                     Land Development Department            Shapefile     2015
Elevation                    USGS                                   Raster file   2019
Population density           National Statistical Office            Excel         2019
Groundwater consumption      Department of Groundwater Resources    Excel         2019

2.5. Probability map validation

2.5.1. Uncertainty evaluation of the models

To ensure that the models are working properly and reliably, quantile regression (QR) was used to evaluate the model's predictive uncertainty value. QR methods were used to calculate the model residuals and regard uncertainty sources, that is different to the classical methods such as orthogonal regression, Monte Carlo, and inverse regression methods that usually consider only one source of uncertainty (Solomatine and Shrestha, 2009; Meinrath and Spitzer, 2000). There are several statistical measures of uncertainty, such as mean prediction interval (MPI) and prediction interval coverage probability (PICP), which


Fig. 3. Groundwater sampling map.
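Sampled wells such as those mapped here are point data, and Sections 2.3.1 and 2.4 state that point values were converted to rasters with Inverse Distance Weighting (IDW). The sketch below shows the general IDW idea only. The paper does not report the distance exponent or search radius, so the power of 2, the use of all samples for every cell, and the function names are assumptions.

```python
import math


def idw(x, y, samples, power=2.0):
    """Estimate a value at (x, y) from (sx, sy, value) samples by IDW.

    Each sample is weighted by 1 / distance**power, so nearby wells
    dominate the estimate. power=2.0 is an assumed default.
    """
    num = 0.0
    den = 0.0
    for sx, sy, value in samples:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            return value  # the cell sits exactly on a sampled well
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den


def idw_grid(samples, xs, ys, power=2.0):
    """Interpolate onto a grid; rows follow ys, columns follow xs."""
    return [[idw(x, y, samples, power) for x in xs] for y in ys]


if __name__ == "__main__":
    # Hypothetical wells: (easting, northing, As in ug/l).
    wells = [(0.0, 0.0, 4.0), (1.0, 0.0, 8.0), (0.0, 1.0, 12.0)]
    for row in idw_grid(wells, xs=[0.0, 0.5, 1.0], ys=[0.0, 0.5]):
        print([round(v, 2) for v in row])
```

Because every estimate is a weighted average of the observed values, IDW never predicts outside the observed range, which is one reason it behaves conservatively for sparse well networks.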

Fig. 4. Framework of constructing the probability model of As.
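The two interval statistics used in Section 2.5.1 to judge the uncertainty of each probability model, MPI and PICP, can be computed from observed values and lower and upper prediction limits as in the following sketch. It follows the definitions given in the text (strict inequalities for coverage); the function names and toy numbers are hypothetical, and the quantile regression that would produce the limits is not shown.

```python
def mpi(lower, upper):
    """Mean prediction interval: the average width of the intervals."""
    return sum(u - l for l, u in zip(lower, upper)) / len(lower)


def picp(observed, lower, upper):
    """Prediction interval coverage probability.

    Fraction of observations lying strictly between their lower and
    upper prediction limits.
    """
    inside = sum(1 for y, l, u in zip(observed, lower, upper) if l < y < u)
    return inside / len(observed)


if __name__ == "__main__":
    # Hypothetical limits and observations, not values from the study.
    lower = [0.0, 1.0, 2.0, 3.0]
    upper = [2.0, 3.0, 4.0, 5.0]
    observed = [1.0, 5.0, 3.0, 3.0]
    print(mpi(lower, upper), picp(observed, lower, upper))
```

Reading the two numbers together matters: an interval can always reach high coverage by being very wide, so MPI is used to separate models whose PICP values are similar.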

were used as suggested by Shrestha and Solomatine (Shrestha and Solomatine, 2006). MPI is the average of the widths of the prediction intervals, where lower values of MPI indicate lower uncertainty. The MPI and PICP values are calculated as:

$$\mathrm{MPI} = \frac{1}{n}\sum_{t=1}^{n}\left(PL_t^{\mathrm{upper}} - PL_t^{\mathrm{lower}}\right)$$

$$\mathrm{PICP} = \frac{1}{n}\sum_{t=1}^{n} C, \qquad C = \begin{cases} 1, & PL_t^{\mathrm{lower}} < y_t < PL_t^{\mathrm{upper}} \\ 0, & \text{otherwise} \end{cases}$$

where $y_t$ is the observed value, and $PL_t^{\mathrm{lower}}$ and $PL_t^{\mathrm{upper}}$ are the lower and upper prediction limits, respectively. The PICP is the more important measurement of uncertainty as it indicates the number of observations that fall within the estimated interval (Dogulu et al., 2015). Therefore, MPI is used as a supplementary metric: between models with similar PICP values, the one with a lower MPI is regarded as the better model (Muthusamy et al., 2016).

2.5.2. Validation with field data

The As concentration collected in the field was classified into 2 types as follows: the polluted type (As > 10 μg/l), and the non-polluted type (As < 10 μg/l). To compare the forecasting performance of the models, the root mean square error (RMSE) and the mean absolute error (MAE) are used as validation indicators by comparing the actual value with the predicted value from the probability map in each


model. RMSE is often used to test the prediction performance of spatial distribution models (Rahmati et al., 2019), with the standard formula:

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{N}\left(x_i - \hat{x}_i\right)^2}{N}}$$

where $N$ is the number of non-missing data points, $x_i$ is the actual observation, and $\hat{x}_i$ is the estimated value.

The mean absolute error (MAE) is another useful measure widely used in model evaluations. The MAE is the average over the verification sample of the absolute values of the differences between the forecast and the corresponding observation:

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|e_i\right|$$

where $e_i$ is the difference between the forecast and the corresponding observation.

The MAE and the RMSE can be used together to diagnose the variation in errors in a set of forecasts. The RMSE will always be greater than or equal to the MAE; the larger the gap between the two, the more variability there is in the sample's individual errors. All errors will be of the same magnitude if RMSE=MAE (Chai et al., 2014).

2.6. Groundwater pollution risk map

The risk map was created using the overlay technique in QGIS software by overlaying the probability map with population density and water consumption in the study area following Eq. (1). The population density data were derived from the Bureau of Registration Administration in 2020 (NSO, 2020) and the water consumption data were obtained from the Department of Groundwater Resources (DGR, 2020). The water consumption data were created using the pumping rates of groundwater wells used in the domestic and agricultural sectors in the study area. The pumping rate was standardized into 0–1 values to represent the areas that were affected by groundwater consumption.

Risk = Probability map × Population density × Water consumption (1)

The two risk maps (Deep and Shallow) were classified into five risk levels: very high, high, moderate, low, and very low, using the equal interval mode in the classification process (Bindal and Singh, 2019).

3. Results and discussion

3.1. Mechanism of arsenic release

Spearman's correlation was used to screen out unnecessary parameters before the modeling process. To make sure that Spearman's correlation selected useful parameters, the correlation between the other parameters and total As in groundwater had to be thoroughly reviewed. The natural As sources in groundwater mostly come from the weathering of certain rock types in the region, from which As is released into groundwater (Adithya et al., 2016). Other sources of As in groundwater can be anthropogenic activities, such as leachates infiltrating through municipal waste in landfills contaminated with various organic and inorganic substances (Kumari et al., 2017). Previous studies indicated that As in groundwater probably came from both natural and anthropogenic sources (Boonkaewwan et al., 2020). This was in agreement with previous studies, which revealed that high total As concentrations in groundwater were highly positively correlated with bicarbonate (HCO3) concentrations under the reducing condition in the groundwater environment. Calcium (Ca) had a high positive correlation with magnesium (Mg) (r = 0.672) and bicarbonate (HCO3) (r = 0.744), which indicates the dissolution of calcite minerals from geological formations into groundwater. The correlation between magnesium (Mg) and sodium (Na) (r = 0.466) signifies an ion exchange reaction in groundwater (Sae-Ju et al., 2019). Total dissolved solids (TDS) were correlated with electrical conductivity (EC) (r = 0.612), sodium (Na) (r = 0.546), chloride (Cl) (r = 0.50), sulfate (SO4) (r = 0.350), magnesium (Mg) (r = 0.611) and calcium (Ca) (r = 0.672). Similarly, the electrical conductivity (EC) was positively correlated with sodium (Na) (r = 0.642), chloride (Cl) (r = 0.526), sulfate (SO4) (r = 0.428), magnesium (Mg) (r = 0.595) and calcium (Ca) (r = 0.710), as shown in Table SI1. These correlations with the electrical conductivity (EC) indicate that the increase in salinity level was influenced by seawater intrusion and groundwater mineralization (Sae-Ju et al., 2019).

For the shallow aquifer shown in Table SI2, like the deep aquifer, calcium also had a high positive correlation with magnesium (Mg) (r = 0.650) and bicarbonate (HCO3) (r = 0.790), indicating the dissolution of calcite in aquifers. The relation between magnesium (Mg) and sodium (Na) (r = 0.630) implied an ion-exchange reaction during seawater intrusion. Total dissolved solids (TDS) were highly correlated with electrical conductivity (EC) (r = 0.957), sodium (Na) (r = 0.839), chloride (Cl) (r = 0.720), sulfate (SO4) (r = 0.647), magnesium (Mg) (r = 0.710) and calcium (Ca) (r = 0.799). Similarly, the electrical conductivity (EC) was positively correlated with sodium (Na) (r = 0.817), chloride (Cl) (r = 0.696), sulfate (SO4) (r = 0.623), magnesium (Mg) (r = 0.692) and calcium (Ca) (r = 0.786). These correlations with the electrical conductivity (EC) indicate that the increase in salinity was also caused by seawater intrusion and groundwater mineralization (Sae-Ju et al., 2019), which suggests that the aquifers were influenced by seawater. In addition, As had a positive correlation with pH and a negative correlation with NO3, implying that reducing conditions might cause the release of As in the shallow aquifer in the Rayong groundwater basin.

Many studies have reported that a reducing environment is the main cause of As contamination in groundwater through a denitrification process, leading to As release into groundwater (Boonkaewwan et al., 2020). The majority of As in groundwater was usually in an oxidized form, including As-rich Fe oxyhydroxide (FeOH) and As-bearing pyrite, which exists as a coating on soil particles formed through an oxidation process (Bulut et al., 2013). In addition, the reducing environment also affects the microbial degradation of organic matter in groundwater: while consuming O2 and NO3, microbes release As into groundwater, which directly impacts the As concentration in groundwater (Nickson et al., 2000). A previous study (Boonkaewwan et al., 2021) found that As levels increased with decreasing ORP. This indicates that the reducing environment is widespread in the groundwater system in this study area.

In addition, anthropogenic activities, such as agricultural fields, urban wastewater, livestock areas, industrial sites, and municipal landfills, allow organic pollutants and acids to leach from the surface into the groundwater system, enhancing the reducing environment, which is highly correlated with sulfate (SO4) and nitrate (NO3). In the reducing environment, microorganisms increase both sulfate (SO4) and nitrate (NO3) in the groundwater system through a denitrification process, in which NO3 is reduced to ammonia. Arsenate is released from iron oxides through reductive dissolution by iron-reducing bacteria and is subsequently reduced by bacteria that produce ammonia (Fytianos and Christophoridis, 2004). The negative correlation between sulfate (SO4) and nitrate (NO3) (r = −0.086) means that microorganisms increased sulfate (SO4) while nitrate (NO3) was consumed in releasing As (Boonkhao et al., 2017).

The physical parameters influencing total As concentrations in groundwater (Tables SI3 and SI4) are mainly hydrogeological characteristics, soil types, and agricultural area. For example, in the shallow aquifer, as shown in Table SI4, total As was mainly associated with sandy soil (r = −0.127), clay (r = 0.143), granite (r = 0.206) and Quaternary alluvium (r = −0.152). The correlation indicates that bedrock and soil types are slightly correlated with As release into the groundwater environment. Moreover, the correlation with the agricultural area (r = −0.124) could indicate the anthropogenic activities of agriculture in the area.
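Returning briefly to the field validation of Section 2.5.2, the two error measures defined there can be written in a few lines. This is a minimal sketch, assuming that the field observation is coded as 1 (polluted) or 0 (non-polluted) and compared with the mapped probability at the same well. That coding, the function names, and the toy values are assumptions, since the paper does not spell out how the comparison was encoded.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mae(actual, predicted):
    """Mean absolute error between observed and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


if __name__ == "__main__":
    # Hypothetical field classes and map probabilities for four wells.
    field_class = [1, 0, 1, 0]
    map_probability = [0.9, 0.2, 0.4, 0.1]
    print(rmse(field_class, map_probability), mae(field_class, map_probability))
```

When every well has the same absolute error the two measures coincide, and the RMSE rises above the MAE as soon as the errors become uneven, which is the diagnostic use described in the text.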

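The overlay of Eq. (1) in Section 2.6 and the five-level equal interval classification can be sketched per map cell as below. The authors performed this step in QGIS, so the code is only an illustration. The text states that the pumping rate was standardized to 0–1; applying the same min-max scaling to population density, the handling of a constant layer, and all names and toy values are assumptions.

```python
LEVELS = ["very low", "low", "moderate", "high", "very high"]


def min_max(values):
    """Scale values to the 0-1 range; a constant layer maps to 0 (assumption)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def risk_scores(probability, density, pumping_rate):
    """Eq. (1): probability x population density x water consumption, per cell."""
    dens = min_max(density)       # assumption: density is scaled like pumping rate
    cons = min_max(pumping_rate)  # stated in the text: standardized to 0-1
    return [p * d * c for p, d, c in zip(probability, dens, cons)]


def equal_interval_levels(scores):
    """Assign each score to one of five equal-width classes."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [LEVELS[0]] * len(scores)
    width = (hi - lo) / len(LEVELS)
    levels = []
    for s in scores:
        idx = min(int((s - lo) / width), len(LEVELS) - 1)  # top edge joins last class
        levels.append(LEVELS[idx])
    return levels


if __name__ == "__main__":
    # Hypothetical cells: As probability, people per km2, pumping rate.
    scores = risk_scores([0.5, 1.0, 0.2], [100, 300, 200], [0, 10, 5])
    print(scores, equal_interval_levels(scores))
```

Because the three layers are multiplied, a cell needs a non-zero value in every layer to register any risk, so a high As probability in an unpopulated or unpumped area is mapped as low risk.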

However, although physical parameters can be screened by Spearman's correlation technique, it does not help much in explaining the mechanism of As release in groundwater.

Therefore, Spearman's correlation can be used to select hydrogeochemical and physical parameters that are significantly related to As. In addition, the factors that influence the As release mechanism in the study area could come from both anthropogenic and natural sources, as shown in Table 2.

Table 3
Performances of SVM, RF, and ANN in the deep and shallow aquifers.

Aquifer   Model   Train AUC   Train Recall   Train F1   Test AUC   Test Recall   Test F1
Deep      SVM     0.73        0.49           0.61       0.69       0.32          0.38
Deep      RF      0.93        0.51           0.66       0.72       0.61          0.69
Deep      ANN     1.00        0.22           0.32       0.65       0.14          0.18
Shallow   SVM     0.84        0.34           0.28       0.79       0.48          0.39
Shallow   RF      0.93        0.90           0.79       0.81       0.79          0.68
Shallow   ANN     1.00        1.00           1.00       0.75       0.75          0.70

3.2. Evaluation of the predictive performance of models

The classification performance, which showed how well the model


fits with the data were also quantified by precession matrices mea­ MAE, RF had the lowest MAE in both deep and shallow aquifers (MAE =
surements (Recall, Fi-Score and AUC). In the classification performance 0.38 and 0.57) compared to those of SVM (MAE = 0.54 and 0.57) and
(training dataset) of the deep aquifer, the ANN model produced a very ANN (MAE = 0.46 and 0.71).
high performance of total As probability model (AUC=1.00, The result indicated that the RF model was the appropriate model to
Recall=0.22, F1 =0.32), followed by SVM (AUC=0.73, Recall=0.49, create a groundwater pollution risk map of total As rather than SVM and
F1 =0.61) and RF (AUC=0.93, Recall=0.49, F1 =0.61). This also ANN. The result is also supported by many previous studies that RF al­
happened in the shallow aquifer as shown in the descending order: ANN gorithms have high performance and stable evaluation to predict data in
(AUC=1, Recall=1.00, F1 =1.00), SVM (AUC=0.84, Recall=0.34, the groundwater modeling field (Liu et al., 2019; Podgorski et al., 2020;
F1 =0.28), RF (AUC=0.93, Recall=90, F1 =0.79). Based on the classi­ Wang et al., 2021).
fication performance (training dataset) result, by investigating preces­
sion matrices ANN showed the sign of overfitting with very high 3.3. Uncertainty evaluates
performance in the train set and very low performance in the test set.
Commonly, the overfitting in the ANN model is found when ANN al­ The Quantile Regression (QR) technique was used to evaluate the
gorithms work with a small dataset (Rao et al., 2018). The result can uncertainty for each ML model. To determine the uncertainty of ML
indicate that ANN was not an appropriate model for the modeling pro­ models, this study applied only the testing data to carry out. Two sta­
cess in this study. However, the quality of the model cannot be estimated tistics approaches consist of prediction interval coverage probability
by just using only the training dataset because it was considered only by (PICP) and mean prediction interval (MPI) that were suggested by
the data used to generate the model (Henseler and Sarstedt, 2013). Rahmati (Rahmati et al., 2019). MPI was the average value from the
Therefore, the classification performance (test dataset) represents the prediction interval’s width, and the fewer values of MPI indicate the
quality of the model. To evaluate the appropriate model, the classifi­ lower uncertainty for each model. PICP was the probability that the
cation performance (test dataset) needs to be mainly considered (Bindal calculated within the prediction intervals, and the greatest value of PICP
and Singh, 2019). As shown in Table 3, the best deep aquifer probability indicates less uncertainty in each model.
model is RF (AUC=0.72, Recall=0.61, F1 =0.69) compared with SVM The result of PICP in the deep aquifer showed that Random Forest
(AUC=0.69, Recall=0.32, F1 =0.38) and ANN (AUC=0.65, (RF) provided the lowest uncertainty value (PICP=0.20) compared to
Recall=0.14, 0.18). In addition, in the shallow aquifer, RF (AUC=0.81, that of Support Vector Machine (SVM) (PICP=0.16) and Artificial
Recall=0.79, F1 =0.68) also the best model following by ANN Neural Networks (ANN) (PICP=0.05) (Table 5). In the case of shallow
(AUC=0.78, Recall=0.75, F1 =0.70) and SVM (AUC=0.79, aquifer, RF also provided the lowest uncertainty value (PICP=0.34)
Recall=0.48, F1 =0.39). The result of prediction performance indicated compared with that of SVM (PICP=0.23) and ANN (PICP=0.25)
that the RF had the best performance compared to those of SVM and (Table 5). However, because the PICP evaluation from the three models
ANN in both the deep and shallow aquifers. were indicate the uncertainty value of each model, there was no need to
In addition, Taylor’s diagrams were conducted to describe a model’s use the MPI value to compare the uncertainty of each model (Rahmati
performance (Rahmati et al., 2019). The figure of the models’ perfor­ et al., 2019). When evaluating uncertainty in the models in case the PICP
mance confirmed that the RF was the appropriate model to classify total was very clearly indicated the uncertainty value. Therefore, the PICP
As in groundwater in the study area as presented in Fig. 5. Based on the was a more important measurement uncertainty indicator than the MPI
criteria of Taylor’s diagram, the RF had the highest correlation with the (Dogulu et al., 2015). However, the MPI will be used as a secondary
observed total As probability and had the lowest correlation compared indicator in the case when the models provided close PICP values, a
to those values of the SVM and ANN models (Taylor, 2001). model with a lower MPI will be considered as a low uncertainty model
The prediction performance by validation with actual field data as (Muthusamy et al., 2016). According to QR results, the RF model pro­
shown in Table 4 revealed that RF had the lowest RMSE in both deep and vided the lowest uncertainty value compared to SVM and ANN in both
shallow aquifers (RMSE = 0.48 and 0.75) compared to those of SVM shallow and deep aquifers. The RF was the appropriate model to predict
(RMSE = 0.73 and 0.76) and ANN (RMSE = 0.67 and 0.84). Similar to the total As risk area based on the prediction performance and uncer­
tainty assessment in both deep and shallow aquifers. The spatial map of
Table 2 probability map showed that RF and other models in Fig. 6.
Spearman’s correlation with Physical and hydrochemical parameters in both
deep and shallow aquifers.
3.4. Risk map assessment
Parameters Deep aquifer Shallow aquifer

Physical DEM, Silt loam, Forest, DEM, Sandy loam, Silt loam, Clay Arsenic concentrations were found in the range of 0.3–500 µg/l with
parameters Quaternary, Granite loam, Agricultural(fields), Gr, Qc,
an average of 12.85 µg/l, which was higher than the groundwater
Qmc
Chemical EC, pH, TDS, TTH, EC, pH, TDS, TTH, drinking standard (WHO, 2018). The other heavy metals in the shallow
parameters Ca, Mg, F, HCO3 Ca, Mg, Na, K, Fe, Cl aquifer had a wide range of concentrations. Cadmium, Cr, Cu, Hg, Mn,
F, HCO3, SO4, NO3, Ni, Pb, Se, Zn had a wide range from 0.4 to 0.6 µg/l, 2.40–40 µg/l,
Cd, Cr, Cu, Hg, Mn, 3–500 µg/l, 0.1–700 µg/l, 5– 18,000 µg/l, 1–180 µg/l, 0.70–60 µg/l,
Ni, Pb, Se
0.3–160 µg/l, and 5–1900 µg/l respectively. Metals that were detected

to be above the groundwater standard were Cu, Mn, Ni, Pb, and Se (WHO, 2018).

Arsenic contamination in the Rayong groundwater basin appeared to be a potential risk, which has been mentioned in many studies (Pipattanajaroenkul et al., 2018; Boonkaewwan et al., 2021). Arsenic is known to cause diabetic, gastrointestinal, renal, and neurological diseases, as well as cardiovascular disease. Furthermore, repeated exposure to As has been related to a variety of cancers. In addition, the ingestion risk from As was found to be higher than the dermal uptake risk (Çelebi et al., 2014a, 2014b). Because of the toxicity of As, monitoring strategies and risk estimation are required to locate the risk area. The present study used risk maps from the RF model together with population density and water consumption to indicate As contamination in groundwater regions. In the deep aquifer (Table 6), the very high-level risk area was located in Sattahip and Ban Khai districts (71.9% and 28.1%, respectively). Ban Chang district was classified as the high-level area (15.6%). The districts classified as the moderate level were Si Racha, Bang Lamung, Mueang Rayong, Pluak Daeng, and Nikhom Phatthana (73.89%, 16.87%, 3.48%, 0.57%, and 0.13%, respectively). The very low-level districts were Nong Yai and Ban Bueng (2.77% and 1.52%, respectively).

The percentage of groundwater pollution risk area in the shallow aquifer is shown in Table 7, which reveals that Mueang Rayong and Ban Khai districts were classified in the very high-risk level (99.27% and 0.73%, respectively). For the high-level area, there was only Nikhom Phatthana district, covering 3.65%. Two districts were classified as the moderate level, Si Racha and Pluak Daeng (41.73% and 0.29%, respectively). Ban Chang district was grouped in the low-level area (0.94%). Lastly, Bang Lamung, Sattahip, Nong Yai, and Ban Bueng districts were classified in the very low level (8.07%, 4.99%, 2.54%, and 1.40%, respectively).

Fig. 5. Taylor's diagram for a) the deep aquifer and b) the shallow aquifer.

Table 4
Validation model with field data during 31 August – 1 September 2019.

          RMSE                  MAE
          SVM    RF     ANN     SVM    RF     ANN
Deep      0.73   0.48   0.67    0.54   0.38   0.46
Shallow   0.76   0.75   0.84    0.57   0.57   0.71

Table 5
Uncertainty analysis of SVM, RF, and ANN of the deep and shallow aquifers.

Models    Train               Test
          PICP   MPI          PICP   MPI
Deep
SVM       0.27   0.16         0.16   0.15
RF        0.53   0.49         0.20   0.13
ANN       0.69   1.3E-4       0.05   6.075E-10
Shallow
SVM       0.50   0.37         0.23   0.39
RF        0.58   0.44         0.34   0.27
ANN       0.86   8.7E-05      0.25   1.63E-10
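The two uncertainty statistics reported in Table 5 can be sketched as follows, assuming the lower and upper prediction bounds have already been obtained (for example from quantile regression). The function names, the toy intervals and the tolerance `tol` that decides when two PICP values count as "close" are illustrative assumptions and are not taken from the study.

```python
def picp(observed, lower, upper):
    """Fraction of observations that fall inside their prediction interval."""
    inside = sum(1 for y, lo, hi in zip(observed, lower, upper) if lo <= y <= hi)
    return inside / len(observed)


def mpi(lower, upper):
    """Mean width of the prediction intervals."""
    return sum(hi - lo for lo, hi in zip(lower, upper)) / len(lower)


def least_uncertain(scores, tol=0.02):
    """Pick the model with the highest PICP; among models whose PICP is
    within `tol` of the best (a hypothetical threshold), prefer the lowest MPI."""
    best_picp = max(p for p, _ in scores.values())
    close = {m: s for m, s in scores.items() if best_picp - s[0] <= tol}
    return min(close, key=lambda m: close[m][1])


# Toy intervals for four observations (made-up numbers).
obs = [1.0, 2.0, 3.0, 4.0]
low = [0.0, 2.5, 2.0, 5.0]
high = [2.0, 3.0, 4.0, 6.0]
print(picp(obs, low, high), mpi(low, high))

# Test-set (PICP, MPI) pairs as reported in Table 5.
deep = {"SVM": (0.16, 0.15), "RF": (0.20, 0.13), "ANN": (0.05, 6.075e-10)}
shallow = {"SVM": (0.23, 0.39), "RF": (0.34, 0.27), "ANN": (0.25, 1.63e-10)}
print(least_uncertain(deep), least_uncertain(shallow))
```

Putting PICP first and using MPI only as a tie-breaker follows the ordering described in Section 3.3. A very small MPI on its own, as for ANN, is not evidence of low uncertainty if few observations fall inside the intervals.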


Fig. 6. The probability map in the deep aquifer derived from a) RF, b) SVM, c) ANN and the probability map in the shallow aquifer derived from d) RF, e) SVM and
f) ANN.

Table 6
Percentage of groundwater pollution risk area in the deep aquifer.

Districts          Very high  High    Moderate  Low     Very low
Pluak Daeng        0.00       0.00    0.57      32.76   22.57
Mueang Rayong      0.00       0.00    3.48      0.10    22.31
Nong Yai           0.00       0.00    0.00      0.00    2.77
Ban Chang          0.00       15.58   0.00      0.79    9.14
Ban Khai           28.13      77.26   5.06      26.14   14.89
Sattahip           71.87      7.16    0.00      0.02    5.03
Si Racha           0.00       0.00    73.89     5.94    7.44
Ban Bueng          0.00       0.00    0.00      0.00    1.52
Nikhom Phatthana   0.00       0.00    0.13      33.39   6.23
Bang Lamung        0.00       0.00    16.87     0.87    8.10

Table 7
Percentage of groundwater pollution risk area in the shallow aquifer.

Districts          Very high  High    Moderate  Low     Very low
Pluak Daeng        0.00       0.00    0.29      9.29    26.52
Mueang Rayong      99.27      91.62   55.22     29.76   10.66
Nong Yai           0.00       0.00    0.00      0.00    2.54
Ban Chang          0.00       0.00    0.00      0.94    8.54
Ban Khai           0.73       4.73    1.22      0.94    18.80
Sattahip           0.00       0.00    0.00      0.00    4.99
Si Racha           0.00       0.00    41.73     16.95   8.35
Ban Bueng          0.00       0.00    0.00      0.00    1.40
Nikhom Phatthana   0.00       3.65    1.55      42.12   11.23
Bang Lamung        0.00       0.00    0.00      0.00    8.07
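Each risk-level column in Tables 6 and 7 sums to about 100%, so the entries can be read as the share of a risk level's area that falls in each district. A minimal sketch of that calculation is given below, assuming the risk map has been reduced to equal-area grid cells labelled with a district and a risk level. The cell list and the function name are hypothetical and do not reproduce the study's GIS workflow.

```python
from collections import Counter, defaultdict


def risk_area_share(cells):
    """cells: iterable of (district, risk_level) pairs, one per equal-area cell.

    Returns {risk_level: {district: percent of that level's cells}}.
    """
    per_level = defaultdict(Counter)
    for district, level in cells:
        per_level[level][district] += 1
    shares = {}
    for level, counts in per_level.items():
        total = sum(counts.values())
        shares[level] = {d: round(100.0 * n / total, 2) for d, n in counts.items()}
    return shares


# Made-up cells, for illustration only.
cells = (
    [("Sattahip", "Very high")] * 3
    + [("Ban Khai", "Very high")] * 1
    + [("Ban Khai", "High")] * 2
)
shares = risk_area_share(cells)
print(shares)
```

Read this way, a district such as Ban Khai can hold a share of several risk levels at once, which is why it appears in more than one column of Table 6.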


The results showed that Sattahip and Ban Khai districts (the northern part of Rayong Province) have the largest number of people exposed to As contamination at a very high risk level, followed by Ban Chang district, which had a high risk level of exposure. Other districts where people might be exposed to As contamination in the deep aquifer were, in descending order, Si Racha, Bang Lamung, Mueang Rayong, Pluak Daeng, and Nikhom Phatthana. Based on the risk map, the exposure of local people to As contamination in the shallow aquifer was mostly found in the southern part of the Rayong groundwater basin, in Mueang Rayong and Ban Khai districts, which had a very high risk level. After Nikhom Phatthana district (high-risk exposure), Si Racha and Pluak Daeng districts had a moderate risk level. A previous study by Nilkarnjanakul et al. (2022) indicated that oral exposure is the riskiest exposure pathway in Ban Khai district. The highest oral HQ of arsenic observed in the adult group was 15.30, which is about ten times the average risk level of 1.47. Moreover, the high-level health risk assessment area in Ban Khai district came from Gr groundwater. This can be related to the risk map in this study and indicates that, in Ban Khai district, arsenic exposure has a very high risk assessment level in the deep aquifer.

In summary, the risk map for the deep aquifer showed more risk in the northern part of the Rayong groundwater basin, whereas the map for the shallow aquifer showed high risk in the southern part of the basin, as shown in Fig. 7. The Rayong groundwater basin contains various types of land use, including industrial sites, agricultural areas, and urban areas, and in particular a landfill in the Mueang District that is close to the estuary of the groundwater basin (Boonkaewwan et al., 2021). As is highly concentrated in industrial sites due to anthropogenic activity (Çelebi et al., 2014a, 2014b). Therefore, the risk areas of the shallow aquifer could be affected by anthropogenic activity in the study area, such as landfills, agricultural areas, and industrial sites.

Given these results, it is critical to limit additional contamination of potential drinking water sources. Therefore, groundwater monitoring should be carried out in the Rayong groundwater basin. Moreover, decision-makers should plan an appropriate policy that follows the risk areas of the As risk map to locate areas suitable for installing filter tanks containing oxide mineral material to remediate and treat As contamination in groundwater before use (Aredes et al., 2012). Furthermore, the As groundwater risk map can be used as reference information to warn people in the study area, which will enhance the public willingness to procure groundwater before drilling a groundwater well (Das et al., 2019).

4. Conclusions

The correlation analysis helped to identify mechanisms operating in the groundwater environment and indicated that As in the study area originated from both natural and anthropogenic sources. The models' classification and prediction performances and the uncertainty evaluation indicated that the RF models had the highest model performance and the lowest uncertainty compared with the SVM and ANN models. Therefore, RF was the most suitable model for As contamination assessment in the Rayong groundwater basin. The RF model's risk map shows that the northern and middle parts of the Rayong groundwater basin had a high risk of people being exposed to As contamination in the deep aquifer. In contrast, the shallow aquifer map indicates that the southern part of the basin has a higher chance that people are exposed to As in groundwater, which could be related to anthropogenic activities in the Mueang District, close to the estuary of the Rayong groundwater basin.
Therefore, long-term groundwater monitoring should be carried out intensively in the Rayong groundwater basin. The procedure used in this research can be applied to study other contaminated groundwater aquifers and to increase the effectiveness of groundwater quality management. Moreover, the outcome of this study can be used by the government to manage groundwater resources and protect the environment in the study area, and people in the area can use the groundwater risk map as reference data to increase public willingness to procure uncontaminated groundwater before drilling a new well. The government can propose a suitable policy to procure uncontaminated groundwater and encourage sustainable groundwater management in the study area more effectively. Because the industrial estate is being built as part of the EEC project, this area will experience significant groundwater demand in the upcoming years. Overall, this study helps both government and non-government sectors understand the risk assessment of As-contaminated groundwater and protect groundwater resources.

Fig. 7. Groundwater pollution risk map of As contamination in a) the deep aquifer and b) the shallow aquifer.

CRediT authorship contribution statement

Narongpon Sumdang: Software, Validation, Formal analysis, Investigation, Visualization, Data curation, Writing – original draft. Srilert Chotpantarat: Conceptualization, Methodology, Validation, Resources, Formal analysis, Investigation, Writing – review & editing, Project administration, Funding acquisition, Data curation. Kyung Hwa Cho: Writing – review & editing, Supervision. Nguyen Ngoc Thanh: Data curation. All authors have read and agreed to the published version of the manuscript.


Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

Data will be made available on request.

Acknowledgments

The authors are very grateful for the 90th Year Chulalongkorn University Scholarship. We acknowledge financial supports from the Grant for Thailand Science Research and Innovation Fund Chulalongkorn University (CUFRB65_dis(2)_090_23_20). We are grateful for the thorough reviews of anonymous reviewers. Their valuable comments significantly improved the earlier draft of this article.

Appendix A. Supporting information

Supplementary data associated with this article can be found in the online version at doi:10.1016/j.ecoenv.2023.114665.

References

Adithya, V.S., Chidambaram, S., Thivya, C., Prasanna, M.V., Nepolian, M., Ganesh, N., 2016. A study on the impact of weathering in groundwater chemistry of a hard rock aquifer. Arab J. Geosci. 9, 158. https://doi.org/10.1007/s12517-015-2073-3.
Aredes, S., Klein, B., Pawlik, M., 2012. The removal of arsenic from water using natural iron oxide minerals. J. Clean. Prod. 29–30, 208–213. https://doi.org/10.1016/j.jclepro.2012.01.029.
Bindal, S., Singh, C.K., 2019. Predicting groundwater arsenic contamination: Regions at risk in highest populated state of India. Water Res. 159, 65–76. https://doi.org/10.1016/j.watres.2019.04.054.
Boonkaewwan, S., Sonthiphand, P., Chotpantarat, S., 2021. Mechanisms of arsenic contamination associated with hydrochemical characteristics in coastal alluvial aquifers using multivariate statistical technique and hydrogeochemical modeling: a case study in Rayong province, eastern Thailand. Environ. Geochem. Health 43 (1), 537–566. https://doi.org/10.1007/s10653-020-00728-7.
Boonkhao, L., Phanprasit, W., Robson, M., Sujirarat, D., Kwonpongsagoon, S., Tangtong, C., 2017. Arsenic exposure levels of petrochemical workers in three workplace settings in Rayong Province, Thailand, 00-00 Hum. Ecol. Risk Assess. Int. J. 23. https://doi.org/10.1080/10807039.2017.1333406.
Bulut, G., Yenial, U., Emiroğlu, E., Sirkeci, A.A., 2013. Arsenic removal from aqueous solution using pyrite. J. Clean. Prod. 84, 526–532. https://doi.org/10.1016/j.jclepro.2013.08.018.
Çelebi, A., Şengörür, B., Kløve, B., 2014. Seasonal and spatial variations of metals in melen watershed groundwater, Turkey. Clean Soil Air Water 43 (5), 739–745. https://doi.org/10.1002/CLEN.201300774.
Çelebi, A., Sengörür, B., Kløve, B., 2014. Human health risk assessment of dissolved metals in groundwater and surface waters in the Melen watershed, Turkey. J. Environ. Sci. Health Part A Toxic Hazard. Subst. Environ. Eng. 49 (2), 153–161. https://doi.org/10.1080/10934529.2013.838842.
Chai, T., Draxler, R.R., 2014. Root mean square error (RMSE) or mean absolute error (MAE)? – Arguments against avoiding RMSE in the literature. Geosci. Model Dev. 7, 1247–1250. https://doi.org/10.5194/gmd-7-1247-2014.
Chen, X., Li, F., Li, X., Hu, Y., Hu, P., 2019. Evaluating and mapping water supply and demand for sustainable urban ecosystem management in Shenzhen, China. J. Clean. Prod. 251, 119754 https://doi.org/10.1016/j.jclepro.2019.119754.
Chetia, M., Chatterjee, S., Banerjee, S., Nath, M.J., Singh, L., Srivastava, R., Sarma, H., 2010. Groundwater arsenic contamination in Brahmaputra river basin: a water quality assessment in Golaghat (Assam), India. Environ. Monit. Assess. 173, 371–385. https://doi.org/10.1007/s10661-010-1393-8.
Cho, K., Sthiannopkao, S., Pachepsky, Y., Kim, K.W., Kim, J.H., 2011. Prediction of contamination potential of groundwater arsenic in Cambodia, Laos, and Thailand using artificial neural network. Water Res. 45, 5535–5544. https://doi.org/10.1016/j.watres.2011.08.010.
Choubin, B., Malekian, A., Samadi, V., Khalighi, S., Sajedi Hosseini, F., 2017. An ensemble forecast of semiarid rainfall using large-scale climate predictors. Meteor. Appl. 24. https://doi.org/10.1002/met.1635.
Das, R., Laishram, B., Jawed, M., 2019. Perception of groundwater quality and health effects on willingness to procure: the case of upcoming water supply scheme in Guwahati, India. J. Clean. Prod. 226, 615–627. https://doi.org/10.1016/j.jclepro.2019.04.097.
Dawood, T., Elwakil, E., Novoa, H., Delgado, J., 2020. Toward urban sustainability and clean potable water: prediction of water quality via artificial neural networks. J. Clean. Prod. 291, 47907. https://doi.org/10.1016/j.jclepro.2020.125266.
DGR, 2012. Project for Exploration and Study of Heavy Metals in Groundwater Rayong and Chonburi Groundwater Basin, Thailand. Department of Groundwater Resources, Ministry of Natural Resource and Environment, Thailand.
DGR, 2017. Project of Exploration and Study of Heavy Metals in Groundwater in the Central and Eastern Regions of Thailand. Department of Groundwater Resources, Ministry of Natural Resource and Environment, Thailand.
DGR, 2020. Groundwater Well Data Provinces. Ministry of Natural Resources and Environment, Ministry of Natural Resource and Environment, Thailand.
DMR, 2007. Mineral Resource in Rayong Province, Thailand. Department of Mineral and Resource, Ministry of Natural Resource and Environment, Thailand.
Dogulu, N., Lopez Lopez, P., Solomatine, D., Weerts, A., Shrestha, D., 2015. Estimation of predictive hydrologic uncertainty using the quantile regression and UNEEC methods and their comparison on contrasting catchments. Hydrol. Earth Syst. Sci. 19, 3181–3201.
Fytianos, K., Christophoridis, C., 2004. Nitrate, Arsenic and Chloride Pollution of Drinking Water in Northern Greece. Elaboration by Applying GIS. Environ. Monit. Assess. 93, 55–67.
Gaafar, M., Mahmoud, S., Gan, T.Y., Davies, E., 2019. A practical GIS-based hazard assessment framework for water quality in stormwater systems. J. Clean. Prod. 245, 118855 https://doi.org/10.1016/j.jclepro.2019.118855.
Ghosh, S., Das, A., 2020. Wetland conversion risk assessment of East Kolkata Wetland: a Ramsar site using random forest and support vector machine model. J. Clean. Prod. 275. https://doi.org/10.1016/j.jclepro.2020.123475.
Gong, G., Mattevada, S., O’Bryant, S.E., 2014. Comparison of the accuracy of kriging and IDW interpolations in estimating groundwater arsenic concentrations in Texas. Environ. Res. 130, 59–69. https://doi.org/10.1016/j.envres.2013.12.005.
Havryliuk, S., Korol, M., Tokar, О, Olena, V., Lubov, K., 2018. Using the random forest classification for land cover interpretation of landsat images in the Prykarpattya region of Ukraine. In: Proceedings of the IEEE 13th International Scientific and Technical Conference on Computer Science and Information Technologies (CSIT). Lviv, Ukraine. 〈https://doi.org/10.1109/STC-CSIT.2018.8526646〉.
Henseler, J., Sarstedt, M., 2013. Goodness-of-fit indices for partial least squares path modeling. Comput. Stat. 28, 565–580. https://doi.org/10.1007/s00180-012-0317-1.
Kerdthep, P., Tongyonk, L., Rojanapantip, L., 2009. Concentrations of cadmium and arsenic in seafood from Muang District, Rayong Province. J. Health Res. 23 (4), 179–184.
Kumari, P., Gupta, N.C., Kaur, A., 2017. A review of groundwater pollution potential threats from municipal solid waste landfill sites: assessing the impact on human health. Avicenna J. Environ. Health Eng. 4 (1), 11525. https://doi.org/10.5812/ajehe.11525.
LDD, 2016. Land Use Summary in Rayong Province, Thailand: 2016. Land Development Department, Ministry of Agriculture and Cooperatives, Thailand.
Liu, D., Fan, Z., Fu, Q., Li, M., F, M.A., Ali, S., Li, T., Zhang, L., Khan, M.I., 2019. Random forest regression evaluation model of regional flood disaster resilience based on the whale optimization algorithm. J. Clean. Prod. 250. https://doi.org/10.1016/j.jclepro.2019.119468.
Magalhaes, C., Tavares, J.M., Mendes, J.G., Vardasca, R., 2021. Comparison of machine learning strategies for infrared thermography of skin cancer. Biomed. Signal Process. Control 69, 102872.
Meinrath, G., Spitzer, P., 2000. Uncertainties in determination of pH. Microchim. Acta 135, 155. https://doi.org/10.1007/s006040070005.
Muthusamy, M., Godiksen, P.N., Madsen, H., 2016. Comparison of different configurations of quantile regression in estimating predictive hydrological uncertainty. Procedia Eng. 154, 513–520. https://doi.org/10.1016/j.proeng.2016.07.546.
Naimi, B., Araújo, M.B., 2016. sdm: a reproducible and extensible R platform for species distribution modelling. Ecography 39 (4). https://doi.org/10.1111/ecog.01881.
Nickson, R., McArthur, J.M., Ravenscroft, P., Burgess, W.G., Ahmed, M., 2000. Mechanism of arsenic release to groundwater, Bangladesh and West Bengal. Appl. Geochem. 15 (4), 403–413. https://doi.org/10.1016/S0883-2927(99)00086-4.
Nilkarnjanakul, W., Watchalayann, P., Chotpantarat, S., 2022. Spatial distribution and health risk assessment of As and Pb contamination in the groundwater of Rayong Province, Thailand. Environ. Res. 204 (Pt. A), 111838 https://doi.org/10.1016/j.envres.2021.11183.
Niyomsilp, E., Worapongpat, N., Bunchapattanasakda, C., 2020. Thailand’s Eastern Economic Corridor (EEC): according to Thailand 4.0 economic policy. J. Leg. Entity Manag. Local Innov. 6 (2).
NSO, 2020. Number of Population from Registration by Sex, House, Region and Province: 2020. National Statistical Office, Thailand. 〈http://statbbi.nso.go.th/staticreport/page/sector/en/01.aspx〉 [dataset].
Pipattanajaroenkul, P., Sonthiphand, P., Kraidech, S., Boonkaewwan, S., Chotpantarat, S., 2018. Detection of arsenite-oxidizing bacteria in groundwater with low arsenic concentration in Rayong province, Thailand. MATEC Web Conf. 192, 03036. https://doi.org/10.1051/matecconf/201819203036.
Podgorski, J., Wu, R., Chakravorty, B., Polya, D.A., 2020. Groundwater arsenic distribution in india by machine learning geospatial modeling. Int. J. Environ. Res. Public Health 17 (19), 7119. https://doi.org/10.3390/ijerph17197119.
Rahmati, O., Choubin, B., Fathabadi, A., Coulon, F., Soltani, E., Shahabi, H., Mollaefar, E., Tiefenbacher, J., Cipullo, S., Ahmad, B.B., Tien Bui, D., 2019. Predicting uncertainty of machine learning models for modelling nitrate pollution of groundwater using quantile regression and UNEEC methods. Sci. Total Environ. 688, 855–866. https://doi.org/10.1016/j.scitotenv.2019.06.320.
Rao, M., Prasad, V., Teja, P.S., Zindavali, Reddy, O., 2018. A survey on prevention of overfitting in convolution neural networks using machine learning techniques. Int. J. Eng. Technol. 7, 177. https://doi.org/10.14419/ijet.v7i2.32.15399.


Sae-Ju, J., Chotpantarat, S., Thitimakorn, T., 2019. Hydrochemical, geophysical and multivariate statistical investigation of the seawater intrusion in the coastal aquifer at Prachuap-Khiri-Khan Province, Thailand. J. Asian Earth Sci. 191, 104165 https://doi.org/10.1016/j.jseaes.2019.104165.
Sajedi Hosseini, F., Malekian, A., Choubin, B., Rahmati, O., Cipullo, S., Coulon, F., Pradhan, B., 2018. A novel machine learning-based approach for the risk assessment of nitrate groundwater contamination. Sci. Total Environ. 644. https://doi.org/10.1016/j.scitotenv.2018.07.054.
Sangkham, S., Thongtip, S., Vongruang, P., 2021. Influence of air pollution and meteorological factors on the spread of COVID-19 in the Bangkok Metropolitan Region and air quality during the outbreak. Environ. Res. 197, 0013–9351. https://doi.org/10.1016/j.envres.2021.111104.
Sarker, I.H., 2021. Deep learning: a comprehensive overview on techniques, taxonomy, applications and research directions. SN Comput. Sci. 2, 420. https://doi.org/10.1007/s42979-021-00815-1.
Shrestha, D.L., Solomatine, D.P., 2006. Machine learning approaches for estimation of prediction interval for the model output. Neural Netw. Off. J. Int. Neural Netw. Soc. 19 (2), 225–235. https://doi.org/10.1016/j.neunet.2006.01.012.
Solomatine, D., Shrestha, D., 2009. A novel method to estimate model uncertainty using machine learning techniques. Water Resour. Res. 45. https://doi.org/10.1029/2008WR006839.
Sonthiphand, P., Ruangroengkulrith, S., Mhuantong, W., Charoensawan, V., Chotpantarat, S., Boonkaewwan, S., 2019. Metagenomic insights into microbial diversity in a groundwater basin impacted by a variety of anthropogenic activities. Environ. Sci. Pollut. Res. 26, 26765–26781.
Taylor, K., 2001. Summarizing multiple aspects of model performance in a single diagram. J. Geophys. Res. 106, 7183–7192. https://doi.org/10.1029/2000JD900719.
TMD, 2015. Thai Climate. Meteorological Department, Ministry of Digital Economy and Society, Thailand.
Wang, F., Wang, Y., Zhang, K., Hu, M., Weng, Q., Zhang, H., 2021. Spatial heterogeneity modeling of water quality based on random forest regression and model interpretation. Environ. Res. 202, 111660 https://doi.org/10.1016/j.envres.2021.111660.
WHO, 2018. Arsenic. 〈https://www.who.int/news-room/fact-sheets/detail/arsenic〉. (Accessed 20 August 2020).
Winkel, L., Berg, M., Amini, M., Hug, J.S., Johnson, A.C., 2008. Predicting groundwater arsenic contamination in Southeast Asia from surface parameters. Nat. Geosci. 1, 536–542. https://doi.org/10.1038/ngeo254.
Wongsasuluk, P., Chotpantarat, S., Siriwong, W., Robson, M., 2018. Using hair and fingernails in binary logistic regression for bio-monitoring of heavy metals/metalloid in groundwater in intensively agricultural areas, Thailand. Environ. Res. 162, 106–118. https://doi.org/10.1016/j.envres.2017.11.024.
Wongsasuluk, P., Chotpantarat, S., Siriwong, W., Robson, M., 2018. Using urine as a biomarker in human exposure risk associated with arsenic and other heavy metals contaminating drinking groundwater in intensively agricultural areas of Thailand. Environ. Geochem. Health 40, 323–348. https://doi.org/10.1016/j.envres.2017.11.024.
Zubair, A., Begum, A., Khan, M., Nasir, M., Ahmad, W., Khan, A., 2015. Contamination of Arsenic in Sea, Surface, and Ground water in the coastal aquifers of Sindh, Pakistan. Mitt. Klosterneubg. 163–178.
