
Computational intelligence applied to soil quality index using GIS and geostatistical approaches in semiarid ecosystem

Article  in  Arabian Journal of Geosciences · November 2020


DOI: 10.1007/s12517-020-06214-9

4 authors, including:

Huseyin Senol, T.C. Isparta Uygulamalı Bilimler Üniversitesi
Sinan Demir, Isparta University of Applied Sciences
Orhan Dengiz, Ondokuz Mayıs Üniversitesi

All content following this page was uploaded by Orhan Dengiz on 23 November 2020.



Arabian Journal of Geosciences (2020) 13:1235
https://doi.org/10.1007/s12517-020-06214-9

ICCESEN 2017

Computational intelligence applied to soil quality index using GIS and geostatistical approaches in semiarid ecosystem
Hüseyin Şenol 1 & Pelin Alaboz 1 & Sinan Demir 1 & Orhan Dengiz 2

Received: 27 August 2020 / Accepted: 1 November 2020


© Saudi Society for Geosciences 2020

Abstract
The importance of soil quality for sustainable agriculture is increasing day by day. In recent years, classifying soil quality with machine learning algorithms has drawn attention. This study was carried out for that purpose on the farmland of Isparta University of Applied Sciences. The soil quality index was determined with a linear combination technique and the analytical hierarchical process (observed values) and estimated by decision trees (predicted values). Total and minimum data sets (27 and 15 indicators, respectively) were evaluated by both methods, and all four outputs were compared. Deterministic (inverse distance weighted with powers 1, 2, and 3, and radial basis functions: completely regularized spline, spline with tension, multiquadric) and stochastic (spherical, exponential, and Gaussian semivariogram models of ordinary kriging, simple kriging, and universal kriging) models were used to create the distribution maps of observed and predicted values. No statistically significant difference was found between the soil quality index values obtained using the two data sets (P > 0.05). In the decision tree, where organic matter was determined as the root node, quality classes could be predicted with 91.1% accuracy by branching on sand, wilting point, and EC as internal nodes. The area under the curve values used in evaluating the estimation accuracy were 0.991, 0.960, and 0.943 for classes I, II, and III, respectively (P = 0.00). Estimation could be done with 91.7% sensitivity and 90.9% specificity at a cut-off value of 0.38 for class III soils. Consequently, the highest accuracy in the distribution maps of predicted and observed soil quality index values was found with the Gaussian semivariogram model of ordinary and simple kriging for both data sets.

Keywords Soil quality · Principal components · Interpolation · Decision tree · Soil properties

This article is part of the Topical Collection on Geo-Resources-Earth-Environmental Sciences

* Pelin Alaboz
pelinalaboz@isparta.edu.tr

1 Faculty of Agriculture, Department of Soil Science and Plant Nutrition, Isparta University of Applied Sciences, Isparta, Turkey
2 Faculty of Agriculture, Department of Soil Science and Plant Nutrition, Ondokuz Mayıs University, Samsun, Turkey

Introduction

Today, the misuse of agricultural lands with industrialization and urbanization leads to a spatial decrease in agricultural areas. Besides, increasing demand for production due to population growth requires a higher amount of production from the unit area with the use of more intensive inputs. This can only be achieved through sustainable use of the soil, which is the most important production environment of the terrestrial ecosystem, without losing its functions, and the most important way to do so is to determine the quality of the soil. Events such as wrong farming activities, excessive use of pesticides and chemical fertilizers, and erosion cause degradation of soils, which then fail to perform their productivity functions. In recent years, fertilization programs based on analysis have been developed, but these practices are not sufficient to maintain or improve crop productivity. Evaluating soil only in terms of nutrient content and ignoring other biophysicochemical properties will be insufficient in ensuring productivity and sustainability. Soil quality, which is defined as the capacity of soil in a natural or managed ecosystem to sustain plant and animal production, to increase water and air quality, and to provide all of the functions of creating a suitable living environment for human health (Karlen et al. 1997; Doran 2002), is studied by many researchers (Dengiz 2013; Mukherjee and Lal 2014; Vasu et al. 2016; Seker et al., 2017; Dengiz 2020).
It is reported that the use of minimum data sets in determining soil quality factors gives the best results in terms of economy, labor, and data quality produced (Seker et al., 2017). To generate a minimum data set, principal component analysis with fewer

new variables or basic components is created where the common effects of the variables are seen. Thus, the dependency structure between variables is eliminated with the analysis of principal components (Tatlıdil 2002). Many methods are used in the determination and monitoring of soil quality in the world (Qi et al. 2009; Adeyolanu et al. 2013; Askari and Holden 2015; Dengiz 2020), and the suitability of methods varies according to regional factors and soil characteristics. Minimum data sets containing different features have been used in determining soil quality. In this context, Askari and Holden (2015) created a data set with organic carbon, penetration resistance, magnesium content, aggregate size distribution, and carbon-nitrogen ratio; Navarro et al. (2015) used porosity, available water capacity, organic carbon, potassium and copper content, vegetation, cation exchange capacity, pH, dehydrogenase activity, and CaCO3 content; and Liu et al. (2014) used organic matter, total nitrogen, pH, dehydrogenase, and arbuscular mycorrhiza in albic soils in the evaluation of soil quality. Again, the total data set and the minimum data set are widely used in many studies of soil quality evaluation approaches (Doran and Parkin 1994; Qi et al. 2009; Cheng et al. 2016; Wu et al. 2019).
Rather than evaluating soil quality studies on a point basis, obtaining spatial distribution results is very important for decision-makers and producers. For this purpose, geostatistical approaches, in which spatial distributions of point-based data are evaluated, are widely used in the assessment and classification of lands (Özyazıcı et al. 2016; Tunçay et al. 2018; Aydın and Dengiz 2019; Mihalikova and Dengiz, 2019). Dengiz (2002), in his study to determine the quality of agricultural lands in Ankara Gölbaşı district and its immediate surroundings with parametric and geostatistical methods, determined that 70.1% of the area was very good or good and that 14.2% of the area was not suitable for agricultural use. In addition, Koca et al. (2019), in the study in which they determined the soil quality in the Çukurova region using standard scoring functions and the analytical hierarchy process, reported 68.3% of the area as moderate quality. Soil is very dynamic and is under the influence of many environmental factors. Therefore, many criteria need to be considered together when examining a feature of soil. Besides, because the influencing criteria do not have the same importance, the analytical hierarchical process, which is one of the multi-criteria decision-making methods, has been used in studies to determine the weights of the criteria and sub-criteria in estimating the desired feature (Turan and Dengiz 2017; Dedeoğlu and Dengiz 2018).
In recent years, pedotransfer functions have attracted researchers' interest in developing artificial intelligence technologies for the estimation, classification, and revealing of soil properties. Decision trees, one of the widely used machine learning algorithms for classification purposes, are supervised learning algorithms used to divide a dataset with a large number of features into smaller clusters by applying a series of decision rules (Albayrak and Yılmaz 2009). Hateffard et al. (2019), in their study of artificial neural networks and decision tree algorithms in mapping soil properties, reported that soil properties such as content of sand, clay, organic carbon, and carbonates can be predicted by decision trees (R2: 0.70–0.73), and that decision trees had better estimation accuracy than artificial neural networks. In addition, many researchers evaluated the relationship between digital soil mapping and soil properties with decision trees (Vågen et al., 2016; Malone et al. 2017). Kheir et al. (2010) reported that organic carbon maps created with the decision tree showed accuracy between 68.87 and 69.54%.
This study, which was conducted in the application farmlands of Isparta University of Applied Sciences in Turkey, aimed at the following: (i) determination of the soil quality of the area by creating the data set containing the physical, chemical, biological, and nutrient properties of the soils and the minimum data set by using the analytical hierarchical process and standard scoring function; (ii) prediction of soil quality with decision trees; (iii) comparison of soil quality distribution patterns obtained in both data sets; and (iv) evaluation of distribution maps of predicted and observed quality index values by means of different interpolation methods.

Material and methods

Study area

The study area is within the campus of Isparta University of Applied Sciences and is located to the east of the Isparta-Burdur Highway, at UTM coordinates 283,100–282,921 m E and 4,190,355–4,191,399 m N (WGS 1984, UTM Zone 36 N) (Fig. 1). The area covers approximately 150 ha. Isparta plain is in the south-east direction of the study area, and the area is surrounded by high hills and ridges in other directions. Almost all the farmland is located on the colluvial deposits from the alluvial and slope lands brought by the Horozyokuşu River coming from the west (Akgül and Başyiğit, 2005). The high hills and ridges are made of Cretaceous limestones (Görmüş and Özkul 1995). The climate can be described as sub-humid, and according to long-term meteorological data (1974–2017), average annual precipitation and temperature of the study area are 467 mm and 12.3 °C, respectively. According to the Newhall simulation model (Van Wambeke 2000), soil temperature and moisture regimes are Mesic and Dry Xeric, respectively (Fig. 2). The study area has generally been used for wheat, some vegetables, and fruit trees.

Soil analysis methods

Fifty-six disturbed and undisturbed soil samples were taken from 0 to 20 cm depth on a 150 × 150 m grid laid over the area. Samples were made ready for analysis

Fig. 1 Location of the study area

by sieving them with 2 mm sieve after being brought to the laboratory and becoming dry. Texture was determined with the hydrometer method (Bouyoucous 1951). Lime content was determined with the Scheibler calcimeter method (Soil Survey Staff 1993). The pH and EC were determined by pH and EC meter with glass electrode in saturation extract (Soil Survey Staff 1992). Organic matter content was determined with the modified Walkley-Black method (Jackson 1958). Soil Ca, Mg, K, and Na contents were determined by extracting with 1 N ammonium acetate (NH4OAc). Total nitrogen, available phosphorus

Fig. 2 Diagram of the soil moisture and temperature regimes

were determined using Kjeldahl (Bremner 1982) and Olsen methods (Olsen 1954). Micronutrients (Fe, Mn, Cu, and Zn) were determined by extracting with DTPA (Kacar 2016; Lindsay and Norvell 1978) and measuring with an atomic absorption spectroscopy device (Soil Survey Staff 1992). Aggregation percentages were determined in soil samples according to US Salinity Laboratory Staff (1954). Dry bulk density was determined using the core samples (100 cm3). Field capacity (0.33 bar), wilting point (15 bar), and available water moisture constants were determined using the ceramic table pF set (USA, Soil Moisture Equipment Corp.). Penetration resistance (PR) measurements were performed using the penetrologger. This device is capable of measuring between 0 and 10 MPa for each cm. In the measurements, a 60° cone (NEN 5140 1996) with a 1 cm2 base area was used. Correction equations suggested by Alaboz (2019) were used to adjust the penetration resistance to a standard moisture content. Urease enzyme analysis was performed with the method based on determining the ammonium formed in the medium by the urease-catalyzed hydrolysis of urea added to the medium (Tabatabai and Bremner 1972). Soil respiration was determined by letting the CO2 from the soil react with Ba(OH)2 in a closed container and titrating the unused Ba(OH)2 with HCl solution (Isermayer 1952). Alkali phosphatase enzyme analysis was based on the formation of p-nitrophenol from p-nitrophenyl phosphate. Determination of β-glucosidase enzyme activity was based on the hydrolysis of 4-nitrophenyl β-D-glucopyranoside to p-nitrophenol (μg of p-nitrophenol g−1 dry soil h−1) (Arcak et al. 1997).

Determination of soil quality

After the soil properties were converted to unitless scores by applying a standard scoring function (SSF), the indicators were weighted by the analytical hierarchical process (AHP), which was developed by Saaty (1980), to highlight which of the soil quality indicators is important. Soil quality index values were determined by using the linear combination technique. The linear combination equation is given below (Eq. 1).

$$\mathrm{SQI} = \sum_{i=1}^{n} \left( W_i \cdot X_i \right) \tag{1}$$

where SQI is the soil quality index for agricultural usage, $W_i$ the weighting of parameter i, and $X_i$ the sub-criterion score of parameter i. The above formula was applied to each soil sample.
Soil quality indicators were converted to unitless scores from 0.1 to 1.0 to be comparable to each other using standard scoring functions (Andrews et al. 2002). In general, there are 3 different scoring functions: “more is better,” “less is better,” and “midpoint is optimum” (Karlen and Stott. 1994; Masto et al. 2008). In “more is better” scoring, a high value of the indicator has a positive relationship with soil quality, and in “less is better” scoring, a high value of the indicator has a negative relationship with soil quality. In “midpoint is optimum,” threshold values are determined, and depending on whether indicators are above or below this threshold, they are converted into unitless form with “more is better” and “less is better” scoring. The SSF equations for the parameters are listed in Table 1.
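As a rough illustration of Eq. 1, the sketch below computes the SQI of a single sample as the weighted sum of its unitless indicator scores and assigns a class using the limits in Table 4. The function names, the three-indicator example values, and the handling of values that fall exactly on a class limit are assumptions made for illustration only; they are not taken from the study.

```python
def soil_quality_index(weights, scores):
    """Eq. 1: SQI = sum of W_i * X_i over all indicators of one soil sample."""
    if len(weights) != len(scores):
        raise ValueError("weights and scores must have the same length")
    return sum(w * x for w, x in zip(weights, scores))


def sqi_class(sqi):
    """Map an SQI value to the classes of Table 4.

    Assumption: a value exactly on a limit is assigned to the higher class,
    except 0.85, which stays in class IV because class V is listed as > 0.85.
    """
    if sqi < 0.40:
        return "I"
    if sqi < 0.50:
        return "II"
    if sqi < 0.65:
        return "III"
    if sqi <= 0.85:
        return "IV"
    return "V"


# Hypothetical sample: three indicators whose AHP weights sum to 1
example_weights = [0.5, 0.3, 0.2]
example_scores = [0.8, 0.4, 1.0]  # unitless SSF scores between 0.1 and 1.0
example_sqi = soil_quality_index(example_weights, example_scores)
example_class = sqi_class(example_sqi)
```

In the study the same weighted sum would run over all 27 indicators of the total data set or the 15 indicators of the minimum data set.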

It is possible to make binary (pairwise) comparisons of both qualitative and quantitative factors and to determine their weights and priorities with the AHP method (Saaty 2008). Saaty (1977) proposed a comparison scale with values ranging from 1 to 9, which defines the degree of importance. In the scientific studies conducted (Turan and Dengiz 2017; Dedeoğlu and Dengiz 2018), binary comparison is applied to criteria and sub-criteria based on expert opinions and evaluations. Numerical values indicating the relative importance of the criteria to each other according to the Saaty scale are given in Table 2.
Firstly, a comparison matrix (n × n dimensions) is created between the criteria by considering their impact status (Eq. 2).

$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\ a_{21} & a_{22} & a_{23} & \cdots & a_{2n} \\ a_{31} & a_{32} & a_{33} & \cdots & a_{3n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & a_{n3} & \cdots & a_{nn} \end{bmatrix} \tag{2}$$

All entries in this matrix must have a positive value. Here, A is the binary comparison matrix and $a_{ij}$ is the importance of element i compared with element j (i, j = 1, 2, …, n). The properties of the binary comparison matrix are
$a_{ji} = 1/a_{ij}$,
$a_{ij} > 0$ (i, j = 1, 2, …, n).
For the binary comparison to be fully consistent,
$a_{ik} = a_{ij} \, a_{jk}$ (i, j, k = 1, 2, …, n).
If consistent, $a_{ij} = W_i / W_j$, where $W_i$ is the priority value for element i and $W_j$ the priority value for element j.
Because each element has equal importance when compared with itself, the values of the elements on the diagonal of the comparison matrix must be equal to 1 (Saaty 1977). After the comparison matrix is created, the matrix is normalized. Normalization is done by dividing the value in each cell by the cell's column total. The column vector W, called the priority vector, is obtained as the arithmetic average of the values in each row of the normalized table obtained from the binary comparisons. This vector expresses the percentage importance weights of the criteria (Eq. 3).
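The AHP weighting can be sketched in a few lines of NumPy: normalize the pairwise comparison matrix by its column totals, average each row to get the priority vector (Eq. 3), and then follow Eqs. 4 to 7 and Table 3 to obtain the consistency ratio. This is a minimal sketch under stated assumptions: the function name `ahp_weights`, the choice to report CR = 0 for fewer than three criteria (where RI is 0), and the 3 × 3 example matrix are all hypothetical and not part of the original analysis.

```python
import numpy as np

# Random index (RI) values for n = 1..15 criteria, as listed in Table 3
RANDOM_INDEX = [0.00, 0.00, 0.58, 0.90, 1.12, 1.24, 1.32, 1.41,
                1.45, 1.49, 1.51, 1.48, 1.56, 1.57, 1.59]


def ahp_weights(comparison_matrix):
    """Return (priority vector, lambda_max, consistency ratio) for a pairwise matrix."""
    A = np.asarray(comparison_matrix, dtype=float)
    n = A.shape[0]
    normalized = A / A.sum(axis=0)   # divide each cell by its column total
    w = normalized.mean(axis=1)      # row averages give the priority vector (Eq. 3)
    d = A @ w                        # D = A x W (Eq. 4)
    e = d / w                        # E_i = d_i / w_i (Eq. 5)
    lambda_max = e.mean()            # average of the E_i values (Eq. 6)
    if n < 3:
        # Assumption: with RI = 0 the ratio is undefined, so report full consistency
        return w, lambda_max, 0.0
    ci = (lambda_max - n) / (n - 1)  # consistency index (Eq. 7)
    cr = ci / RANDOM_INDEX[n - 1]    # consistency ratio CR = CI / RI
    return w, lambda_max, cr


# Hypothetical, perfectly consistent comparison of three criteria
example_matrix = [[1.0, 2.0, 6.0],
                  [1 / 2, 1.0, 3.0],
                  [1 / 6, 1 / 3, 1.0]]
weights, lambda_max, cr = ahp_weights(example_matrix)
```

For this example the weights come out as 0.6, 0.3, and 0.1 and the consistency ratio is zero; a matrix built from real expert judgments would normally give CR > 0, and values above 0.10 would call for revising the comparisons.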

Table 1 Standard scoring functions and parameters for quantitative soil parameters

| Parameters | FT* | SSF equation** |
|---|---|---|
| Sand, Silt, EC, CaCO3, BD, pH, PR, Na, WP | LB | $f(x) = \begin{cases} 1 & x \le L \\ 1 - 0.9 \times \dfrac{x - L}{U - L} & L \le x \le U \\ 0.1 & x \ge U \end{cases}$ |
| OM; N, P, Ca, Mg; Fe, Cu, Zn, Mn; FC; AWC; Aggregation; CO2; Urease; Beta glucosidase; Phosphatase | MB | $f(x) = \begin{cases} 0.1 & x \le L \\ 0.9 \times \dfrac{x - L}{U - L} + 0.1 & L \le x \le U \\ 1 & x \ge U \end{cases}$ |
| Clay | OR | $f(x) = \begin{cases} 0.1 & x \le L_1 \text{ or } x \ge U_2 \\ 0.9 \times \dfrac{x - L_1}{L_2 - L_1} + 0.1 & L_1 \le x \le L_2 \\ 1 & L_2 \le x \le U_1 \\ 1 - 0.9 \times \dfrac{x - U_1}{U_2 - U_1} & U_1 \le x \le U_2 \end{cases}$ |

*FT means function type; MB means more is better; LB means less is better; OR means optimal range.
**SSF means standard scoring function; L and U are the lower and the upper threshold value, respectively; L1-L2: lower limits of the values determined as the optimal range groups; U1-U2: upper limits of the values determined as the optimal range groups
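The three function types of Table 1 can be sketched as small piecewise-linear functions. The piecewise forms below are an assumed reading of Table 1 that maps every indicator onto the 0.1 to 1.0 score range described in the text; the function names and the threshold values in the example calls are hypothetical.

```python
def score_more_is_better(x, lower, upper):
    """MB: 0.1 at or below the lower threshold, 1.0 at or above the upper one."""
    if x <= lower:
        return 0.1
    if x >= upper:
        return 1.0
    return 0.9 * (x - lower) / (upper - lower) + 0.1


def score_less_is_better(x, lower, upper):
    """LB: 1.0 at or below the lower threshold, 0.1 at or above the upper one."""
    if x <= lower:
        return 1.0
    if x >= upper:
        return 0.1
    return 1.0 - 0.9 * (x - lower) / (upper - lower)


def score_optimal_range(x, l1, l2, u1, u2):
    """OR: rises from l1 to l2, stays at 1.0 between l2 and u1, falls from u1 to u2."""
    if x <= l1 or x >= u2:
        return 0.1
    if x < l2:
        return 0.9 * (x - l1) / (l2 - l1) + 0.1
    if x <= u1:
        return 1.0
    return 1.0 - 0.9 * (x - u1) / (u2 - u1)


# Hypothetical thresholds, chosen only to show the shape of each function
mb_example = score_more_is_better(1.5, lower=1.0, upper=2.0)
lb_example = score_less_is_better(1.5, lower=1.0, upper=2.0)
or_example = score_optimal_range(25.0, l1=10.0, l2=20.0, u1=30.0, u2=40.0)
```

With these thresholds the midpoint of the MB and LB ranges scores about 0.55, and a value inside the optimal range scores 1.0.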

Table 2 Saaty scale (Saaty 1977)

| Importance level | Description | Definition |
|---|---|---|
| 1 | Equal importance | The two elements are equally important |
| 3 | Moderate importance | The first criterion is slightly more important than the second criterion |
| 5 | Strong importance | The first criterion is strongly more important than the second criterion |
| 7 | Very strong importance | The first criterion is very important compared with the second criterion, dominant or provable in practice |
| 9 | Extreme importance | The first criterion is strongly (extremely) important compared with the second criterion; it has the highest accuracy |
| 2, 4, 6, 8 | Intermediate values | Used when undecided between two close evaluations and a compromise is required between the two values |

$$w_i = \frac{\sum_{j=1}^{n} a_{ij}}{n} \tag{3}$$

where $a_{ij}$ is the element in row i and column j of the normalization table, n is the number of criteria, and $w_i$ is the priority (weighting) of criterion i.
The binary comparison matrix (A) is multiplied by the priority vector (W) to obtain the vector D (Eq. 4).

$$D = A \times W = \begin{bmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\ a_{21} & a_{22} & a_{23} & \cdots & a_{2n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & a_{n3} & \cdots & a_{nn} \end{bmatrix} \times \begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_n \end{bmatrix} = \begin{bmatrix} d_1 \\ d_2 \\ \vdots \\ d_n \end{bmatrix} \tag{4}$$

The elements of the D column vector ($d_1 = a_{11} w_1 + a_{12} w_2 + \dots + a_{1n} w_n$) are divided by the corresponding elements of the priority vector (W) to obtain the E values in Eq. 5.

$$E_i = \frac{d_i}{w_i}, \quad i = 1, 2, 3, \dots, n \tag{5}$$

The sum of the $E_i$ values is divided by the number of criteria, that is, their arithmetic average is calculated. This operation gives $\lambda_{max}$, the largest eigenvalue of the matrix (Eq. 6).

$$\lambda_{max} = \frac{\sum_{i=1}^{n} E_i}{n} \tag{6}$$

where $\lambda_{max}$ is the maximum eigenvalue and n is the number of criteria.
The eigenvector method is used to measure the consistency of the comparisons made by the decision-making group. The consistency index (CI) of the comparison matrix is obtained by subtracting the number of criteria from the maximum eigenvalue and dividing the result by the number of criteria minus one, as seen in Eq. 7.

$$CI = \frac{\lambda_{max} - n}{n - 1} \tag{7}$$

The consistency ratio (CR) is then obtained by dividing the consistency index (CI) from Eq. 7 by the random index (RI) value (Table 3). If the calculated CR value is less than 0.10, then the comparisons made by the decision-maker are consistent; if the CR value is greater than 0.10, the comparisons are inconsistent or there is a calculation error. In this case, the comparisons should be revised (Saaty 1980).
The linear combination of the values obtained with the analytical hierarchical process and the standard scoring functions and the classification of soil quality index values are given in Table 4.

Statistical techniques

Descriptive statistics of soil properties, principal components analysis (PCA), and decision trees for the classification process were determined using IBM SPSS 23 software. In the principal components analysis, the soil properties were standardized, and the Kaiser-Meyer-Olkin (KMO) and Bartlett sphericity test results were examined to check the suitability of the data for factor analysis. Principal components with an eigenvalue greater than 1 were selected, and the minimum data set was created by examining correlation matrices. A t test was used to compare the soil quality index values determined with the minimum data set and with the total data set, in which all the features were evaluated together.

Decision tree

Creating a decision tree takes place in two stages: tree building and tree pruning. All samples are first in the root node when building a tree. If all of the samples belong to the same class, the root ends with a leaf; if they do not belong to the same class, the features that will divide them best are selected. Branches with exceptions are pruned in the tree pruning

Table 3 Random index values (RI) depending on the number of criteria

n 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

RI 0.00 0.00 0.58 0.90 1.12 1.24 1.32 1.41 1.45 1.49 1.51 1.48 1.56 1.57 1.59

phase. There are quite different algorithms for determining the roots, nodes, and branching criteria in decision trees. One of the most important and commonly used ones is the chi-square automatic interaction detection (CHAID) algorithm. It is distinguished from other decision tree algorithms by allowing more than two divisions from a decision tree node, and it uses chi-square tests to determine the best division at each stage. A Bonferroni-adjusted p value is used to determine whether the variables are suitable for division (Pehlivan 2006). The Bonferroni approach is based on whether the differences of each group are zero once the differences of the mean vectors of each group from the general mean vector are found. The general mean vector ($\bar{X}$) and the mean vectors of each group ($\bar{X}_1, \dots, \bar{X}_g$) are given in Eq. 8, and the differences of the group mean vectors from the general mean vector ($d_1, \dots, d_g$) are given in Eq. 9.

$$\bar{X} = \begin{bmatrix} \bar{X}_1 \\ \bar{X}_2 \\ \vdots \\ \bar{X}_p \end{bmatrix}, \quad \bar{X}_1 = \begin{bmatrix} \bar{X}_{11} \\ \bar{X}_{21} \\ \vdots \\ \bar{X}_{p1} \end{bmatrix}, \quad \bar{X}_2 = \begin{bmatrix} \bar{X}_{12} \\ \bar{X}_{22} \\ \vdots \\ \bar{X}_{p2} \end{bmatrix}, \quad \dots, \quad \bar{X}_g = \begin{bmatrix} \bar{X}_{1g} \\ \bar{X}_{2g} \\ \vdots \\ \bar{X}_{pg} \end{bmatrix} \tag{8}$$

$$d_1 = \bar{X}_1 - \bar{X} = \begin{bmatrix} \bar{X}_{11} \\ \bar{X}_{21} \\ \vdots \\ \bar{X}_{p1} \end{bmatrix} - \begin{bmatrix} \bar{X}_1 \\ \bar{X}_2 \\ \vdots \\ \bar{X}_p \end{bmatrix}, \quad \dots, \quad d_g = \bar{X}_g - \bar{X} = \begin{bmatrix} \bar{X}_{1g} \\ \bar{X}_{2g} \\ \vdots \\ \bar{X}_{pg} \end{bmatrix} - \begin{bmatrix} \bar{X}_1 \\ \bar{X}_2 \\ \vdots \\ \bar{X}_p \end{bmatrix} \tag{9}$$

The confidence interval for the difference between the averages of variable t in group H and group L is given in Eq. 10, and the calculation of the W matrix is given in Eq. 11.

$$\left( d_{Ht} - d_{Lt} \right) = \left( \bar{X}_{Ht} - \bar{X}_{Lt} \right) \mp t_{\frac{\alpha}{pg(g-1)};\,(N-g)} \sqrt{\left( \frac{1}{n_H} + \frac{1}{n_L} \right) \frac{W_{tt}}{N-g}} \tag{10}$$

Here, $N = n_1 + n_2 + \dots + n_g$ is the total number of units, p refers to the number of variables, g refers to the number of groups, and $W_{tt}$ refers to the diagonal elements of the W matrix.

$$W = \sum_{t=1}^{g} \sum_{l=1}^{n_t} \left( X_{tl} - \bar{X}_t \right) \left( X_{tl} - \bar{X}_t \right)' \tag{11}$$

In this equation, g refers to the number of groups, and $n_t$ refers to the number of units in group t.

Interpolation methods

In this study, different interpolation methods [inverse distance weighting (IDW), radial basis functions (RBF), ordinary kriging (OK), simple kriging, and universal kriging] were applied to predict the spatial distribution of the soil quality index. In the kriging method, a semivariogram is created, which is a measure of the spatial correlation between two points. Thus, weights vary according to the spatial arrangement of the samples. Unlike other estimation methods, kriging evaluates the error or uncertainty in the prediction area. Radial basis function methods are a series of exact interpolation techniques, in which the interpolated surface must pass through each measured sample value. They are used in the interpolation of multidimensional data. The most commonly used radial basis functions, completely regularized spline (CRS), multiquadric (MQ), and spline with tension (ST), were selected to evaluate the soil quality index distribution. ArcGIS 10.5 software was used to create soil quality index distribution maps. In the present study, root mean square error (RMSE) was used to identify the most suitable interpolation model (Eq. 13). The lowest RMSE indicates the most accurate prediction.

Table 4 Classes of soil quality index

| Class | Definition | SQI |
|---|---|---|
| I | Very low | < 0.40 |
| II | Low | 0.40–0.50 |
| III | Moderate | 0.50–0.65 |
| IV | High | 0.65–0.85 |
| V | Very high | > 0.85 |

Evaluation of predictions

Cross-validation methods were preferred to verify the estimation accuracy of models in decision trees. In the performance

of classification models, evaluations were made by examining ROC (receiver operating characteristic) curve and AUC (area under a curve) values. The ROC curve relates true positives to false positives: it is formed by plotting the true positive rate as a function of the false positive rate over varying classification threshold values. Besides, root mean square error (RMSE), mean absolute percentage error (MAPE), and coefficient of determination (R2) parameters are used in the comparison of real and predicted values (Eq. 12, 13, 14).

$$\mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{Z_i - Z}{Z_i} \right| \tag{12}$$

$$\mathrm{RMSE} = \sqrt{\frac{\sum \left( Z_i - Z \right)^2}{n}} \tag{13}$$

$$R^2 = \left[ \frac{\sum Z_i Z - \dfrac{\sum Z_i \sum Z}{n}}{\sqrt{\left[ \sum Z_i^2 - \dfrac{\left( \sum Z_i \right)^2}{n} \right] \left[ \sum Z^2 - \dfrac{\left( \sum Z \right)^2}{n} \right]}} \right]^2 \tag{14}$$

where $Z_i$ is the predicted value, Z is the observed value, and n is the number of observations.

Results and discussion

Soil properties

Descriptive statistics of soil properties are given in Table 5. The texture classes of the soil samples vary among L, CL, SiL, SCL, SL, SiCL, and C, and the field capacity, WP, and AWC contents were determined in the ranges of 17.841–31.732%, 10–20.534%, and 4.654–14.89%, respectively. Penetration resistance values ranged from 0.71 to 2.082 MPa, and there are sampling points above 2 MPa, which is the limit value for plant root development (Schoeneberger et al. 2012). The bulk density of the soils in the area varied between 1.228 and 1.750 g cm−3.
Classification of the studied soil properties was made according to Lindsay and Norvell (1978); FAO (1990); Arshad and Martin (2002); Borůvka et al. (2005); Hazelton and Murphy (2016). Organic matter values of the soils of the study area vary between “very low” and “moderate.” The soils, which span a very wide range from low to very high lime content, are “salt-free” and have a “slightly alkaline” reaction. N contents of 21.4, 76.8, and 1.8% of the soils are “low,” “sufficient,” and “high,” respectively, and phosphorus was determined as “low,” “sufficient,” “high,” and “very high” in 42.9, 39.3, 16.0, and 1.8% of the soils, respectively. Of the soils, 78.60% and 21.40% include “high” and “very high” K levels, respectively, and 1.8, 87.5, and 10.7% of the soils have “sufficient,” “high,” and “very high” Ca content, respectively. Of the soils, 3.60, 30.30, and 66.10% contain “low,” “sufficient,” and “very high” Mg, respectively. Besides, the copper and manganese contents of the soils in question were determined at sufficient levels. According to the classification of CO2 output by Doran et al. (1997), very low (18.83 mg CO2 100 g dry soil−1 24 h−1) and low (39.27 mg CO2 100 g dry soil−1 24 h−1) soil activities were determined. According to Hofmann and Hoffmann (1966), urease activity varied between low (29.12 μg N g dry soil−1) and high (377 μg N g dry soil−1), while β-glucosidase enzyme activity was low (1.2–37.7 μg of p-nitrophenol g−1 dry soil h−1). According to Wilding et al. (1994) and Mulla and McBratney (2000), the K, P, Mg, Fe, Zn, OM, S, urease, β-g, and aggregation properties have a high (> 35%) coefficient of variation (CV), and the other properties have a moderate (15–35%) or low (< 15%) coefficient of variation.

Creating a minimum data set

In the study, the result of the Kaiser-Meyer-Olkin (KMO) test was found to be 68.93% (0.6893 > 0.5), and the Bartlett test was significant (P = 0.000 < 0.05). Thus, it has been determined that principal components analysis is suitable for the data set (Karagöz and Kösterelioğlu 2008). Eight principal components with an eigenvalue greater than 1, which together explained 76.6% of the variance, were taken into account in the result of the principal components analysis (Table 6). According to the PCA, the minimum data set was created with 15 of the 27 indicators handled in the total data set.
P, K, Zn, Fe, and Cu features provide the highest contribution rate in PCA-1. A high correlation (0.85) was found between P and K. Because P had the higher total correlation load (CT = 9.306), the K feature was eliminated. Similarly, a high correlation was found between Fe and Zn (0.71); because the total correlation load of Zn (9.243) was higher than that of Fe, the Fe feature was removed from the dataset. Since there was no high correlation (> 0.7) among the other features, the features selected in PCA-1 were P, Zn, and Cu. In PCA-2, C, S, FC, and WP are determined as the features with a high contribution rate. FC and WP are highly correlated (0.804). The WP feature was eliminated from the dataset because the total contribution rate of FC was higher (7.97). Since there was no high correlation among the features with a high contribution rate determined in the other principal components, the selected features were included in the minimum data set. Thus, the physical properties determined for the minimum data set are C, Si, S, FC, AWC, and PR; the chemical properties are N, P, Ca, Cu, and Zn; and the biological properties are β-g, alkali phosphatase, and urease enzyme activity.
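The elimination rule used within a principal component (when two highly loaded indicators correlate above 0.7, keep the one with the larger total correlation load) can be sketched as a small greedy filter. This is only an illustrative sketch: the function name is hypothetical, the total load is assumed here to be the row sum of absolute correlations among the candidate indicators, and apart from the reported P–K (0.85) and Fe–Zn (0.71) correlations every entry of the example matrix is invented.

```python
import numpy as np


def select_indicators(names, corr, threshold=0.7):
    """Keep indicators in order of total correlation load, dropping any that
    correlate above the threshold with an indicator already kept."""
    corr = np.abs(np.asarray(corr, dtype=float))
    total_load = corr.sum(axis=1)          # assumed definition of the total load
    kept = []
    for i in np.argsort(-total_load):      # highest total load first
        if all(corr[i, j] <= threshold for j in kept):
            kept.append(int(i))
    return [names[i] for i in sorted(kept)]


# Candidate indicators of PCA-1; only the P-K and Fe-Zn values come from the text
pc1_names = ["P", "K", "Zn", "Fe", "Cu"]
pc1_corr = [
    [1.00, 0.85, 0.40, 0.30, 0.20],
    [0.85, 1.00, 0.30, 0.25, 0.10],
    [0.40, 0.30, 1.00, 0.71, 0.35],
    [0.30, 0.25, 0.71, 1.00, 0.15],
    [0.20, 0.10, 0.35, 0.15, 1.00],
]
pc1_selected = select_indicators(pc1_names, pc1_corr)
```

With this illustrative matrix the filter keeps P, Zn, and Cu, which mirrors the selection reported for PCA-1; with the real correlation matrix the total loads would be computed over all indicators of the data set.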

Table 5 Descriptive statistics of soil properties

| Indicators | Unit | Min | Max | Mean | Std. Dev. | CV | Skewness | Kurtosis |
|---|---|---|---|---|---|---|---|---|
| N | % | 0.04 | 0.16 | 0.10 | 0.03 | 26.61 | 0.45 | 0.08 |
| P | mg kg−1 | 1.83 | 76.83 | 7.96 | 10.18 | 60.87 | 5.88 | 39.40 |
| K | cmol kg−1 | 0.79 | 8.11 | 1.60 | 0.96 | 59.88 | 5.86 | 39.77 |
| Ca | cmol kg−1 | 27.14 | 59.59 | 40.09 | 6.59 | 16.44 | 0.41 | 0.17 |
| Na | cmol kg−1 | 1.07 | 3.37 | 1.83 | 0.50 | 27.53 | 1.31 | 1.95 |
| Mg | cmol kg−1 | 1.14 | 8.60 | 3.70 | 1.70 | 45.98 | 1.00 | 0.70 |
| Mn | mg kg−1 | 17.05 | 35.97 | 26.51 | 3.64 | 13.72 | 0.26 | 0.80 |
| Fe | mg kg−1 | 3.24 | 51.46 | 7.09 | 6.15 | 86.71 | 7.08 | 51.90 |
| CaCO3 | % | 5.78 | 34.27 | 23.78 | 4.67 | 19.64 | −0.83 | 3.16 |
| EC | dS m−1 | 0.13 | 0.39 | 0.20 | 0.05 | 23.96 | 1.56 | 4.24 |
| pH | | 8.04 | 8.39 | 8.24 | 0.08 | 0.93 | −0.23 | 0.37 |
| OM | % | 0.12 | 2.93 | 0.91 | 0.53 | 58.51 | 1.04 | 2.36 |
| BD | g cm−3 | 1.23 | 1.75 | 1.46 | 0.12 | 8.18 | 0.35 | −0.25 |
| Zn | mg kg−1 | 0.35 | 2.67 | 0.76 | 0.40 | 52.29 | 2.87 | 10.63 |
| Cu | mg kg−1 | 1.45 | 7.87 | 2.93 | 0.91 | 31.15 | 3.07 | 15.08 |
| FC | % | 17.84 | 31.73 | 25.34 | 3.10 | 12.25 | −0.42 | −0.24 |
| WP | % | 10.00 | 20.53 | 14.80 | 2.67 | 18.05 | 0.05 | −0.88 |
| AWC | % | 4.65 | 14.89 | 10.55 | 1.85 | 17.57 | 0.04 | 1.03 |
| Aggregation | % | 14.83 | 49.97 | 24.60 | 8.95 | 36.36 | 1.17 | 0.68 |
| CO2 | mg CO2 100 g dry soil | 18.83 | 39.27 | 29.52 | 4.07 | 13.79 | 0.16 | 0.57 |
| β-g | μg p-nitrophenol g−1 dry soil | 1.20 | 37.70 | 14.06 | 9.57 | 68.05 | 0.82 | −0.06 |
| A-f | μg p-nitrophenol g−1 | 10.77 | 26.41 | 18.23 | 3.66 | 20.05 | 0.14 | 0.13 |
| Urease | μg N g dry soil−1 | 29.12 | 377.00 | 130.57 | 50.52 | 38.69 | 2.01 | 9.39 |
| PR | MPa | 0.71 | 2.08 | 1.17 | 0.31 | 26.55 | 0.73 | 0.13 |
| Si | % | 15.72 | 60.19 | 37.16 | 9.19 | 24.74 | 0.20 | 0.25 |
| S | % | 11.09 | 59.21 | 31.63 | 12.37 | 39.10 | 0.31 | −0.65 |
| C | % | 19.25 | 55.94 | 31.21 | 9.77 | 31.32 | 1.02 | 0.18 |

C clay, Si silt, S sand, FC field capacity, AWC available water content, BD bulk density, WP wilting point, OM organic matter, EC electric conductivity, PR penetration resistance, A-f alkaline phosphatase, β-g beta glucosidase

Evaluation of soil quality

The weights obtained from the evaluation of soil properties with AHP are given in Table 7, and the weights of the soil properties selected by PCA are presented in Table 8. In the analytical hierarchical process, weights were determined in 4 different hierarchies: soil physical properties (B1), soil chemical properties (B2), soil biological properties (B3), and fertility (B4). While the highest value (0.4147) was determined for hierarchy B1 (physical parameters), the lowest value (0.1304) was found for hierarchy B3 (soil biological properties). Moreover, the indicators with the highest contribution in hierarchies B1, B2, B3, and B4 were clay percentage (0.3470), OM (0.5430), CO2 (0.4732), and N (0.2601), respectively. The strong effect of soil texture on water and nutrient retention explains its high rate of contribution to soil quality (Şenol et al. 2019). Besides, when the physical structure of a soil is not in ideal condition, fertility and plant growth are significantly affected even if the soil is at an optimum level in terms of chemical content, soil quality, and management. Therefore, it is expected that hierarchy B1 will have the highest contribution rate. Dengiz and Sarıoğlu (2013) also reported that soil texture is among the basic features in the quality criteria. Penetration resistance and bulk density, which are indicators of soil compaction, have significant effects on the moisture level at field capacity and on the amount of water available to the plant (Haghighi Fashi et al., 2017).
In the B2 hierarchy, OM content (0.5430) has the highest contribution rate due to its important functions in the soil and its effect on the biological and physicochemical properties of soils. It is one of the parameters that are very effective in increasing biological activity and productivity through its effect on properties

Table 6 Results of principal component analysis (PCA)

Properties Component

PCA-1 PCA-2 PCA-3 PCA-4 PCA-5 PCA-6 PCA-7 PCA-8 CT

N 0.151 0.142 0.109 0.125 0.001 0.841 0.019 0.098 5.303


P 0.95 0.034 0.03 0.075 0.044 0.053 0.161 0.013 9.306
K 0.85 0.032 0.021 0.067 0.004 0.061 0.133 0.05 9.243
Ca 0.03 0.059 0.061 0.754 0.139 0.135 0.125 0.077 4.783
Na 0.569 0.05 0.039 0.02 0.134 0.509 0.232 0.112 7.032
Mg 0.324 0.192 0.519 0.308 0.029 0.289 0.259 0.062 8.040
Mn 0.481 0.141 0.169 0.321 0.422 0.209 0.22 0.158 6.742
Zn 0.825 0.089 0.087 0.051 0.167 0.201 0.003 0.154 9.243
Fe 0.904 0.094 0.083 0.025 0.121 0.102 0.142 0.151 8.909
Cu 0.757 0.411 0.193 0.032 0.017 0.123 0.027 0.071 9.739
CaCO3 0.214 0.351 0.302 0.309 0.087 0.486 0.063 0.131 6.927
EC 0.681 0.02 0.216 0.381 0.159 0.057 0.204 0.108 9.040
pH 0.004 0.042 0.055 0.113 0.897 0.061 0.044 0.008 4.280
OM 0.575 0.302 0.295 0.056 0.223 0.091 0.251 0.01 8.452
C 0.004 0.87 0.007 0.083 0.083 0.101 0.138 0.185 5.873
Si 0.255 0.059 0.018 0.14 0.072 0.015 0.84 0.034 5.054
S 0.193 0.731 0.008 0.038 0.012 0.069 0.515 0.121 6.316
BD 0.085 0.614 0.065 0.029 0.02 0.049 0.403 0.204 5.707
FC 0.243 0.745 0.051 0.467 0.204 0.099 0.016 0.141 7.978
WP 0.279 0.804 0.277 0.115 0.026 0.036 0.059 0.217 7.927
AWC 0.005 0.089 0.485 0.615 0.379 0.218 0.111 0.076 4.831
PR 0.072 0.126 0.68 0.065 0.282 0.152 0.01 0.223 5.568
Aggregation 0.038 0.122 0.578 0.092 0.35 0.413 0.172 0.232 5.287
CO2 0.108 0.409 0.162 0.123 0.452 0.065 0.306 0.542 5.133
Urease 0.04 0.053 0.109 0.038 0.048 0.018 0.065 0.849 3.235
β-g 0.222 0.016 0.79 0.086 0.027 0.021 0.069 0.191 5.850
A-f 0.243 0.028 0.112 0.66 0.289 0.273 0.079 0.183 4.277
Eigenvalue 6.972 3.579 2.660 2.153 1.700 1.357 1.154 1.094
Proportion 0.258 0.133 0.099 0.080 0.063 0.050 0.043 0.041

C clay, Si silt, S sand, FC field capacity, AWC available water content, BD bulk density, WP wilting point, OM organic matter, EC electric conductivity,
PR penetration resistance, A-f alkaline phosphatase, β-g beta glucosidase, CT total correlation

such as organic matter, aggregation, and water retention (Alaboz et al. 2017). While hierarchies B1 and B2 consist of more stable, formation-related features, hierarchies B3 and B4 are among the dynamic properties of the soil and are in a continuous process of change. Soil respiration and enzyme activities, which play an active role in determining biological activity in soils, are not stable features; they change constantly and are significantly affected by environmental conditions. Therefore, it is very difficult to determine their relations with soil properties. Thus, the weight of biological properties was determined at the lowest level (Demirtok et al. 2015). In the nutrient content of soils (B4), macroelements play an active role in plant growth and productivity (Tripathi et al. 2014). Similar weights were obtained for the properties within the minimum data set obtained by selecting 15 indicators. The weights of the B1, B2, and B3 hierarchies are 0.5247, 0.3338, and 0.1416, respectively.
Descriptive statistics and t test results of the soil quality index values obtained from the two data sets are given in Table 9.
Soil quality index values (SQITDS) obtained from the total data set using 27 indicators varied between 0.339 and 0.494. Of the soils, 17.85, 60.71, and 21.43% were determined to be very low, low, and moderate, respectively. Soil quality index (SQIMDS) values obtained with the minimum data set ranged between 0.285 and 0.489. Of the soils, 23.31%, 55.35%, 19.65%, and a very small part (1.78%) were defined as very low, low, moderate, and high, respectively.

Table 7 Contribution weight of total data set indicators to soil quality calculated by the AHP

Hierarchy A

Hierarchy B (weight Bi)     Hierarchy C/indicator   Weight (Ci)   Combined weight (Bi x Ci)
(B1) Physical (0.4147)      C                       0.3470        0.1439
                            Si                      0.1570        0.0651
                            S                       0.1802        0.0747
                            BD                      0.0798        0.0331
                            FC                      0.0524        0.0217
                            WP                      0.0524        0.0217
                            AWC                     0.0614        0.0255
                            Aggregation             0.0369        0.0153
                            PR                      0.0329        0.0136
(B2) Chemical (0.2959)      OM                      0.5430        0.1607
                            EC                      0.2445        0.0723
                            pH                      0.1360        0.0402
                            CaCO3                   0.0765        0.0226
(B3) Biological (0.1304)    CO2                     0.4732        0.0617
                            Urease                  0.2827        0.0369
                            β-g                     0.1220        0.0159
                            A-f                     0.1220        0.0159
(B4) Fertility (0.1591)     N                       0.2601        0.0414
                            P                       0.2009        0.0320
                            K                       0.1629        0.0259
                            Ca                      0.1029        0.0164
                            Mg                      0.0816        0.0130
                            Na                      0.0584        0.0093
                            Mn                      0.0467        0.0074
                            Fe                      0.0361        0.0057
                            Cu                      0.0264        0.0042
                            Zn                      0.0240        0.0038
∑ of Ci within each hierarchy: 1.00 (B1), 1.00 (B2), 1.00 (B3), 1.00 (B4); ∑ of combined weights: 1.00

C clay, Si silt, S sand, FC field capacity, AWC available water content, BD bulk density, WP wilting point, OM organic matter, EC electric conductivity,
PR penetration resistance, A-f alkaline phosphatase, β-g beta glucosidase.
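
The combined weights in Table 7 can be reproduced by multiplying each hierarchy weight (Bi) by the indicator weight within that hierarchy (Ci). The following is a minimal sketch of that arithmetic, using the values printed in Table 7. The function names are illustrative, and the `soil_quality_index` helper assumes that every indicator has already been converted to a standardized score between 0 and 1, which is not shown in this section.

```python
# Hierarchy B weights and within-hierarchy indicator weights, copied from Table 7.
HIERARCHY_WEIGHTS = {"B1": 0.4147, "B2": 0.2959, "B3": 0.1304, "B4": 0.1591}

INDICATOR_WEIGHTS = {
    "B1": {"C": 0.3470, "Si": 0.1570, "S": 0.1802, "BD": 0.0798, "FC": 0.0524,
           "WP": 0.0524, "AWC": 0.0614, "Aggregation": 0.0369, "PR": 0.0329},
    "B2": {"OM": 0.5430, "EC": 0.2445, "pH": 0.1360, "CaCO3": 0.0765},
    "B3": {"CO2": 0.4732, "Urease": 0.2827, "beta-g": 0.1220, "A-f": 0.1220},
    "B4": {"N": 0.2601, "P": 0.2009, "K": 0.1629, "Ca": 0.1029, "Mg": 0.0816,
           "Na": 0.0584, "Mn": 0.0467, "Fe": 0.0361, "Cu": 0.0264, "Zn": 0.0240},
}


def combined_weights(hierarchy_weights, indicator_weights):
    """Return the combined weight Bi x Ci for every indicator."""
    return {
        name: hierarchy_weights[hierarchy] * weight
        for hierarchy, group in indicator_weights.items()
        for name, weight in group.items()
    }


def soil_quality_index(scores, weights):
    """Weighted linear combination of standardized (0-1) indicator scores.

    The scoring step itself is an assumption here; only the weighting is
    taken from Table 7.
    """
    return sum(weights[name] * scores[name] for name in weights)


weights = combined_weights(HIERARCHY_WEIGHTS, INDICATOR_WEIGHTS)
for name in ("C", "OM", "CO2", "N"):
    print(name, round(weights[name], 4))
```

Because the combined weights sum to approximately 1, a soil scoring 1 on every indicator would receive an index close to 1 under this formulation.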

The t test carried out to determine the significance of the difference between the two data sets gave P = 0.223. The soil quality index values of the two data sets were close to each other, and the difference between them was not statistically significant. Thus, soil quality in this region can be accurately evaluated using the features in the minimum data set.

Prediction of soil quality with decision tree

The decision tree created for the estimation of the soil quality index (SQITDS) values of the total data set, obtained from the evaluation of 27 indicators, is given in Fig. 3. Group I, group II, and group III were determined by the CHAID algorithm with rates of 100, 88.2, and 91.7%, respectively. Overall, the index class was predicted with 91.1% accuracy by the decision tree (Table 10).
Organic matter is the root node; S, WP, and EC are internal nodes; and the other nodes are built as leaf nodes in the estimation of soil quality. Pruning was carried out on the branches using other features. Generalization is strengthened by pruning: the parts of the decision tree that do not affect or contribute to the classification accuracy are removed. Thus, a less complex and more understandable tree is obtained (Kavzoğlu and Çölkesen 2010). Starting from the root node

Table 8 Contribution weight of minimum data set indicators to soil quality calculated by the AHP

Hierarchy B (weight Bi)     Hierarchy C/indicator   Weight (Ci)   Combined weight (Bi x Ci)
(B1) Physical (0.5247)      C                       0.4497        0.2360
                            Si                      0.1956        0.1026
                            S                       0.1878        0.0985
                            FC                      0.0654        0.0343
                            AWC                     0.0575        0.0301
                            PR                      0.0441        0.0231
(B2) Chemical (0.3338)      N                       0.3611        0.1205
                            P                       0.2490        0.0831
                            Ca                      0.2070        0.0691
                            Cu                      0.0638        0.0213
                            Zn                      0.0503        0.0168
                            pH                      0.0687        0.0229
(B3) Biological (0.1416)    Urease                  0.5889        0.0834
                            A-f                     0.2519        0.0357
                            β-g                     0.1593        0.0225
∑ of Ci within each hierarchy: 1.00 (B1), 1.00 (B2), 1.00 (B3); ∑ of combined weights: 1.00

C clay, Si silt, S sand, FC field capacity, AWC available water content, PR penetration resistance, A-f alkaline
phosphatase, β-g beta glucosidase

Table 9 Results of t test

          Minimum   Maximum   Mean    StDev   T value   P value
SQITDS    0.339     0.494     0.454   0.058   −1.23     0.223
SQIMDS    0.285     0.489     0.447   0.076

SQITDS total data set of soil quality index for 27 indicators, SQIMDS minimum data set of soil quality index for 15 indicators

with OM, the tree splits into 3 sibling nodes: when OM is ≤ 0.3667%, soils are determined as class I with 66.7%; when OM is 0.3667–1.58%, soils are determined as class II; and when OM is > 1.58%, soils are determined as class III with 80%. If OM is ≤ 0.3667% and S is ≤ 21.60%, soils are found to be class II, and if S is greater, soils are found to be class II with 88.9%. When OM is between 0.3667 and 1.58% and WP is ≤ 10.75%, soils are determined to be class I, and if WP is greater, soils are determined to be class II with 78.4%. The node with WP > 10.75%, where 66.1% success was achieved, is divided into 3 child nodes according to EC content. If EC is ≤ 0.183 dS/m, soils are class I; between 0.183 and 0.217 dS/m, soils are class II with 70%; and if EC is > 0.217 dS/m, soils are class II with 92.9%. Hateffard et al. (2019) state that the decision tree shows high performance in mapping and predicting organic carbon, and another study reported that decision trees are an effective method for classifying satellite images, with approximately 92% accuracy (Kavzoğlu and Çölkesen 2010). Area under the curve (AUC) values and receiver operating characteristic (ROC) curves of the classes specified in Table 10 are shown in Fig. 4. The AUC values, determined as the area under the ROC curve and expressing the estimation accuracy of the soil quality classes, were 0.991, 0.960, and 0.943, respectively.
Sensitivity indicates the true positive rate, and 1 − specificity indicates the false positive rate. AUC, which is one of the criteria used to evaluate the accuracy of models in making the right decisions, can take a maximum value of “1” (the positives are separated from the negatives in the best way) (Kılıç 2013). The closer the ROC curve is to the upper left corner, the higher the overall accuracy of the test. The AUC value for class I was determined as 0.991 (P = 0.00), reflecting the high estimation accuracy. A high AUC value means a large area under the curve and an increased ability to discriminate between classes. The cut-off value for class I is 0.44 (100% sensitivity, 97.8% specificity), and for class II, it is 0.61 (85.3% sensitivity, 95.5% specificity). For class III soils, it has been determined that prediction will be made with 91.7% sensitivity and 90.9% specificity at a cut-off value of 0.38.
The decision tree created for the estimation of the soil quality index (SQIMDS) classes obtained with the minimum data set is given in Fig. 5. In the decision tree created by the CHAID algorithm, class I soils in the SQIMDS estimation were determined correctly at a rate of 61.5%, while the rates for classes II, III, and IV were 96.8, 72.7, and 0%, respectively (Table 11). Overall, the estimation accuracy is 82.1%. While 3 classes were determined in SQITDS, soils were divided into 4 classes with SQIMDS.
Clay (C) is the root node in the prediction of soil quality, Si and β-g are internal nodes, and the other nodes are built as leaf nodes. The clay node is divided into 4 sibling nodes. If C is ≤

Table 10 Classification of SQITDS

Observed             Predicted
                     I       II      III     Percent correct   AUC value   P value
I                    10      0       0       100.0%            0.991       0.00
II                   0       30      4       88.2%             0.960       0.00
III                  0       1       11      91.7%             0.943       0.00
Overall Percentage   17.9%   55.4%   26.8%   91.1%

26.03%, soils are classes I and II with 50%; between 26.03 and 33.60%, soils are class II with 76.5%; between 33.60 and 48.69%, soils are class III with 75%; and when C is > 48.69%, soils are class II with 80%. If C is ≤ 26.03% and Si is ≤ 33.98%, soils are determined as class I with 88.9%, and if Si is > 33.98%, soils are determined as class II with 76.9%. In soils containing 33.60–48.69% clay, β-g forms the child nodes: the branch with β-g ≤ 12.80 μg p-nitrophenol g−1 dry soil h−1 is determined to be class III with 100%, and the branch with β-g > 12.80 μg p-nitrophenol g−1 dry soil h−1 is class II with 75%. The clay

Fig. 3 Decision tree of SQITDS prediction

Fig. 4 ROC curve of SQITDS classes

content of soils is a very important parameter for nutrient and water retention and for aggregation (Baldock 2007). Besides, the contribution rate of clay content in the AHP scores was higher than that of the other physical properties. For this reason, starting the root node with C indicates that this feature is important in quality prediction. β-glucosidase is responsible for the degradation of carbohydrates, and it is considered to be important for soil as it provides a development and energy source for soil heterotrophs (Jian et al., 2016). It has been reported that β-glucosidase activity is also affected by the cultivation of soils (Turner et al. 2002).
ROC curves used in evaluating the estimation accuracy of soil quality classes are given in Fig. 6, and AUC values and significance levels are given in Table 11. The AUC values used in evaluating the validity of the prediction of soil classes were 0.896, 0.815, 0.955, and 0.855, respectively. While the values

Table 11 Classification of SQIMDS

Observed             Predicted
                     I       II      III     IV      Percent correct   AUC value   P value
I                    8       5       0       0       61.5%             0.896       0.000
II                   1       30      0       0       96.8%             0.815       0.000
III                  0       3       8       0       72.7%             0.955       0.000
IV                   0       1       0       0       0.0%              0.855       0.228
Overall Percentage   16.1%   69.6%   14.3%   0.0%    82.1%

obtained in the first 3 classes were statistically significant (P < 0.01), the estimation accuracy for class IV (high) was found to be statistically insignificant (P > 0.05). The low success of the evaluation is considered to result from the low number of class IV soils. While the cut-off values were 0.21 (84.6% sensitivity, 74.4% specificity) for class I and 0.43 (96.8% sensitivity, 64% specificity) for class II, it is determined that estimations for class III will be made with 81.8% sensitivity and 93.3% specificity at a cut-off value of 0.18.
MAPE, RMSE, and R2 values obtained by comparing the predicted SQITDS and SQIMDS values with the real values are given in Table 12. The soil quality index estimates obtained with the two data sets were very close to each other. MAPE is an evaluation parameter that expresses estimation errors in percent, and prediction models with a MAPE below 10% are classified as ‘high accuracy’ (Witt and Witt 1992). High accuracy estimation was achieved using both data sets.
As a result of the evaluation of the data obtained, soil quality classes obtained by TDS and MDS were determined with high predictability by decision trees.
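
The three evaluation measures named above can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' code: the function names are hypothetical, the sample values are invented for demonstration, and R2 is computed here as the coefficient of determination (the source does not say whether it was computed this way or as a squared correlation coefficient).

```python
import math


def mape(observed, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean square error, in the units of the index."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (an assumed definition)."""
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted index values, not the study data.
observed = [0.40, 0.45, 0.50]
predicted = [0.42, 0.45, 0.48]

print(mape(observed, predicted), rmse(observed, predicted), r_squared(observed, predicted))
```

Under the Witt and Witt (1992) convention cited above, a model would be labeled ‘high accuracy’ when `mape(...)` returns a value below 10.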

Fig. 5 Decision tree of SQIMDS prediction
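
The splits reported in the text for the minimum data set tree can be transcribed as a small rule function. This is a sketch of the reported thresholds only, not the fitted CHAID model: the function name is hypothetical, each branch returns the majority class stated in the text, and the first clay node (reported as classes I and II with 50%) is resolved through its silt child nodes.

```python
def classify_sqi_mds(clay, silt, beta_g):
    """Return the majority soil quality class reported for each leaf.

    clay and silt are in percent; beta_g is beta-glucosidase activity in
    micrograms of p-nitrophenol per gram of dry soil per hour.
    """
    if clay <= 26.03:
        # Child nodes split on silt content.
        return "I" if silt <= 33.98 else "II"
    if clay <= 33.60:
        return "II"
    if clay <= 48.69:
        # Child nodes split on beta-glucosidase activity.
        return "III" if beta_g <= 12.80 else "II"
    return "II"


print(classify_sqi_mds(clay=40.0, silt=30.0, beta_g=10.0))
```

No branch of these rules returns class IV, which is consistent with the 0% correct classification reported for class IV in Table 11.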

Fig. 6 ROC curve of SQIMDS classes
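
The percent-correct and overall-percentage figures in Tables 10 and 11 follow directly from the observed-versus-predicted counts. The sketch below recomputes them from the counts printed in those tables; the function names are illustrative.

```python
def per_class_accuracy(matrix):
    """Percent of each observed class (row) that was predicted correctly."""
    return [100.0 * row[i] / sum(row) for i, row in enumerate(matrix)]


def overall_accuracy(matrix):
    """Percent of all samples that lie on the diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return 100.0 * correct / total


def predicted_share(matrix):
    """Percent of all samples assigned to each predicted class (column)."""
    total = sum(sum(row) for row in matrix)
    return [100.0 * sum(row[j] for row in matrix) / total for j in range(len(matrix[0]))]


# Rows are observed classes, columns are predicted classes.
SQI_TDS = [[10, 0, 0], [0, 30, 4], [0, 1, 11]]                          # Table 10
SQI_MDS = [[8, 5, 0, 0], [1, 30, 0, 0], [0, 3, 8, 0], [0, 1, 0, 0]]     # Table 11

for name, matrix in (("SQITDS", SQI_TDS), ("SQIMDS", SQI_MDS)):
    print(name,
          [round(v, 1) for v in per_class_accuracy(matrix)],
          round(overall_accuracy(matrix), 1))
```

Both matrices contain 56 samples in total, and the single observed class IV soil in Table 11 explains why one misclassification is enough to give 0% for that class.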

Table 12 Evaluation of 2 prediction models

          MAPE (%)   RMSE   R2
SQITDS    4.44       0.02   0.80
SQIMDS    6.12       0.03   0.78

SQITDS soil quality index for 27 indicators, SQIMDS soil quality index for 15 indicators

Spatial distribution of SQI

Fifteen different interpolation models were applied to form the soil quality index distributions of the total data set and of the minimum data set obtained by PCA, and to evaluate the distribution maps of the soil quality index values predicted by the decision tree for both data sets. Their RMSE values are given in Table 13, and the distribution maps are presented in Fig. 7.
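
To illustrate how a cross-validation RMSE of this kind can be obtained for one of the interpolation families, the sketch below runs a leave-one-out cross-validation for inverse distance weighting with powers 1, 2, and 3. It is a simplified stand-in under stated assumptions: the sample coordinates and index values are hypothetical, all samples are used as neighbours, and the study's GIS software settings are not known from this section.

```python
import math


def idw_predict(x, y, samples, power=2):
    """Inverse distance weighted estimate at (x, y) from (xi, yi, value) samples."""
    numerator = denominator = 0.0
    for xi, yi, value in samples:
        distance = math.hypot(x - xi, y - yi)
        if distance == 0.0:
            return value  # the target coincides with a sample point
        weight = 1.0 / distance ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator


def loocv_rmse(samples, power=2):
    """Leave-one-out cross-validation RMSE for a given IDW power."""
    squared_errors = []
    for i, (x, y, value) in enumerate(samples):
        others = samples[:i] + samples[i + 1:]
        squared_errors.append((idw_predict(x, y, others, power) - value) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical sample points (x, y, SQI), not the study data.
samples = [(0, 0, 0.40), (1, 0, 0.42), (0, 1, 0.44), (1, 1, 0.47), (2, 1, 0.49)]

for power in (1, 2, 3):
    print(power, round(loocv_rmse(samples, power), 4))
```

The power with the lowest cross-validation RMSE would be preferred, which is the same selection criterion applied across the models compared in Table 13.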

Table 13 Cross-validation and their RMSE values according to different interpolation models

             Inverse distance weighting (IDW)   Radial basis function (RBF)
             1        2        3                MQ       CRS      SPT
SQITDS       0.0540   0.0552   0.0567           0.0634   0.0575   0.0562
SQITDS-DT    0.0443   0.0445   0.0448           0.0493   0.0458   0.0453
SQIMDS       0.0732   0.0753   0.0777           0.0893   0.0793   0.0772
SQIMDS-DT    0.0651   0.0658   0.0672           0.0757   0.0686   0.0672

             Kriging
             Ordinary                   Simple                     Universal
             Gau.     Exp.     Sph.     Gau.     Exp.     Sph.     Gau.     Exp.     Sph.
SQITDS       0.0535   0.0550   0.0541   0.0529   0.0549   0.0536   0.0543   0.0556   0.0548
SQITDS-DT    0.0441   0.0450   0.0444   0.0442   0.0454   0.0446   0.0442   0.0450   0.0444
SQIMDS       0.0731   0.0750   0.0738   0.0724   0.0754   0.0735   0.0731   0.0750   0.0738
SQIMDS-DT    0.0660   0.0670   0.0665   0.0650   0.0657   0.0652   0.0660   0.0670   0.0665

TDS total data set, MDS minimum data set, DT decision tree, Gau. Gaussian, Exp. exponential, Sph. spherical, MQ multiquadric, CRS completely regularized spline, SPT spline with tension.
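
Selecting the most suitable model for each data set amounts to finding the lowest RMSE in each row of Table 13. The sketch below does this with the values printed in the table; the short model labels (for example `OK-Gau` for ordinary kriging with a Gaussian model) are shorthand introduced here for illustration.

```python
# Column order follows Table 13: IDW powers, RBF kernels, then ordinary (OK),
# simple (SK), and universal (UK) kriging with Gaussian, exponential, and
# spherical models.
MODELS = ["IDW-1", "IDW-2", "IDW-3", "RBF-MQ", "RBF-CRS", "RBF-SPT",
          "OK-Gau", "OK-Exp", "OK-Sph", "SK-Gau", "SK-Exp", "SK-Sph",
          "UK-Gau", "UK-Exp", "UK-Sph"]

RMSE = {
    "SQITDS":    [0.0540, 0.0552, 0.0567, 0.0634, 0.0575, 0.0562,
                  0.0535, 0.0550, 0.0541, 0.0529, 0.0549, 0.0536,
                  0.0543, 0.0556, 0.0548],
    "SQITDS-DT": [0.0443, 0.0445, 0.0448, 0.0493, 0.0458, 0.0453,
                  0.0441, 0.0450, 0.0444, 0.0442, 0.0454, 0.0446,
                  0.0442, 0.0450, 0.0444],
    "SQIMDS":    [0.0732, 0.0753, 0.0777, 0.0893, 0.0793, 0.0772,
                  0.0731, 0.0750, 0.0738, 0.0724, 0.0754, 0.0735,
                  0.0731, 0.0750, 0.0738],
    "SQIMDS-DT": [0.0651, 0.0658, 0.0672, 0.0757, 0.0686, 0.0672,
                  0.0660, 0.0670, 0.0665, 0.0650, 0.0657, 0.0652,
                  0.0660, 0.0670, 0.0665],
}


def best_model(values):
    """Return (model label, RMSE) for the lowest RMSE in one table row."""
    index = min(range(len(values)), key=values.__getitem__)
    return MODELS[index], values[index]


for dataset, values in RMSE.items():
    print(dataset, *best_model(values))
```

With these values, the Gaussian model gives the lowest RMSE in every row, under ordinary kriging for SQITDS-DT and under simple kriging for the other three rows.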

Fig. 7 Observed and predicted distribution maps of the soil quality

For the total data set, the Gaussian models of simple kriging (RMSE: 0.0529) and ordinary kriging (RMSE: 0.0441) were determined as the most suitable interpolation models for the soil quality index and for its decision tree estimation, respectively, while the Gaussian model of simple kriging (RMSE: 0.0650) was determined as the most suitable interpolation model for the decision tree estimation of the minimum data set. When the distribution maps were examined, the soil quality index values of the total and minimum data sets showed a similar pattern, and the distribution patterns of the predicted index values were also parallel to each other. In all distribution maps, the soil quality index value generally increased in the north and northeast parts of the study area and decreased in the southern parts.

Conclusion

In this study, the soil quality index was determined using total and minimum data sets with a linear combination technique approach and the analytical hierarchical process. Statistical analysis showed that there were no statistically significant differences between the soil quality index values obtained from the minimum data set and those obtained from the total data set (P > 0.05). Decision trees were then formed for the prediction of soil quality classes, and spatial distribution maps of observed and predicted values were obtained with different interpolation methods. The distribution maps created with both data sets showed similar results. Therefore, in determining the soil quality index for the soils of the region, it is recommended to use the minimum data set of 15 indicators. Soil quality index classes were predicted with 91.1% accuracy (SQITDS) by decision trees built using the OM, S, WP, and EC contents of soils and with 82.1% accuracy (SQIMDS) using C, Si, and β-g. The AUC values obtained in the validation part of the models varied between 0.815 and 0.991, and a statistically insignificant AUC value was obtained for the class IV estimation in the SQIMDS. In general, while class I, class II, and class III soil quality was successfully predicted by decision trees, the accuracy of the class IV prediction was low. When the soil quality index values predicted with the decision tree for both data sets were compared with the actual values, high R2 and low RMSE and MAPE values were obtained at similar rates.
In the creation of distribution maps with different interpolation methods for the SQI values of TDS and MDS estimated with the decision tree, the Gaussian models of ordinary kriging for TDS and simple kriging for MDS were evaluated as the most suitable. In future studies, it is recommended that the usability of the models be examined in a more comprehensive and wider field. In addition, in semi-arid ecosystems, the suitability of these geostatistical approaches should be investigated in practice by considering soil properties and environmental effects more comprehensively.

Acknowledgment We would like to thank the Presidency of Scientific Research Projects Management Unit of Suleyman Demirel University, which financially supported part of this study with Project FYL-2018-6743.

References

Alaboz P, Işıldar AA, Müjdeci M, Şenol H (2017) Effects of different vermicompost and soil moisture levels on pepper (capsicum annuum) grown and some soil properties. Yuzuncu Yıl Univ J Agric Sci 27(1):30–36
Adeyolanu OD, Are KS, Oluwatosin GA, Ayoola OT, Adelana AO (2013) Evaluation of two methods of soil quality assessment as influenced by slash and burn in tropical rainforest ecology of Nigeria. Arch Agron Soil Sci 59(12):1725–1742
Akgül M, Başyiğit L (2005) Detailed soil survey and mapping of Suleyman Demirel University farming land. Suleyman Demirel Univ J Inst Sci 9(3):1–10
Alaboz P (2019) The development of prediction models to determine some soil moisture constants by penetration resistance measurements. Doctoral thesis. Süleyman Demirel University Institute of science, 142s, Isparta
Albayrak AS, Yılmaz K (2009) Data mining: decision tree algorithms and an application on ise data. Suleyman Demirel Univ J Faculty Econ Admin Sci 14(1):31–52
Andrews SS, Karlen DL, Mitchell JP (2002) A comparison of soil quality indexing methods for vegetable production systems in northern California. Agric Ecosyst Environ 90:25–45
Arcak S, Kütük AC, Haktanır K, Çaycı G (1997) The effects of tea wastes on soil enzyme activity and nitrification. Pamukkale Univ J Eng Sci 3(1):261–266
Arshad MA, Martin S (2002) Identifying critical limits for soil quality indicators in agro-ecosystems. Agric Ecosyst Environ 88(2):153–160
Askari MS, Holden NM (2015) Indices for quantitative evaluation of soil quality under grassland management. Geoderma 230–231:131–142
Aydın A, Dengiz O (2019) Determination of physico-chemical and nutrient element content of soils formed under semi-humid ecological environment. Acad J Agric 8(2):301–312
Baldock JA (2007) Composition and cycling of organic carbon in soil. In: Nutrient cycling in terrestrial ecosystems. Springer, Berlin Heidelberg, pp 1–35
Borůvka L, Vacek O, Jehlička J (2005) Principal component analysis as a tool to indicate the origin of potentially toxic elements in soils. Geoderma 128(3–4):289–300
Bouyoucous GA (1951) Determination of particle size in soils. Agron J 42:438–443
Bremner JM (1982) Total nitrogen, methods of soil analysis. Am Soc Agron Monogr 10(2):594–624
Cheng J, Ding C, Li X, Zhang T, Wang X (2016) Soil quality evaluation for navel orange production systems in central subtropical China. Soil Tillage Res 155:225–232
Dedeoğlu M, Dengiz O (2018) Determination of land suitability classes by using integrated geographic information systems with multi-criteria decision making analysis. J Süleyman Demirel Univ Fac Agric 13(2):60–72
Demirtok M, Kılıç Ş, Doğan K (2015) Mapping of microbial activities in the widespread soil series of Amik plain. Soil-water J 4(2):14–20

Dengiz O (2002) Determination of land quality using parametric approach in Gölbaşı district of Ankara province. Selcuk J Agric Food Sci 16(30):59–60
Dengiz O (2020) Soil quality index for paddy fields based on standard scoring functions and weight allocation method. Arch Agron Soil Sci 66(3):301–315
Dengiz O, Sarıoğlu FE (2013) Parametric approach with linear combination technique in land evaluation studies. J Agric Sci 19:101–112
Dengiz O (2013) Land suitability assessment for rice cultivation based on GIS modeling. Turk J Agric Forest 37(3):51
Doran J, Tim K, Maria T (1997) Field and laboratory solvita soil test evaluation. USDA-ARS. Department of Agronomy. University of Nebraska, Lincoln
Doran JW, Parkin TB (1994) Defining and assessing soil quality. In: Doran JW, Coleman DC, Bezdicek DF, Stewart BA (eds) Defining soil quality for a sustainable environment. SSSA Spec. Publ., 35, SSSA ASA, Madison, pp 1–21
Doran JW (2002) Soil health and global sustainability: translating science into practice. Agric Ecosyst Environ 88(2):119–127
FAO (1990) Micronutrient assessment at the country level p 1–208. An international study (Ed., M. Sillanpa). FAO Soil Bulletin 63. Published by FAO, Rome, Italy
Görmüş M, Özkul M (1995) Stratigraphy of the area between Gonen-Atabey (Isparta) and Aglasun (Burdur). J Sci Ins Suleyman Demirel Univ 1:43–64
Haghighi Fashi F, Gorji M, Sharifi F (2017) Least limiting water range for different soil management practices in dryland farming in Iran. Arch Agron Soil Sci 63(13):1814–1822. https://doi.org/10.1080/03650340.2017.1308688
Hateffard F, Dolati P, Heidari A, Zolfaghari AA (2019) Assessing the performance of decision tree and neural network models in mapping soil properties. J Mt Sci 16(8):1833–1847
Hazelton P, Murphy B (2016) Interpreting soil test results: what do all the numbers mean? CSIRO publishing
Hofmann ED, Hoffmann GG (1966) Die bestimmung der biologischen tätigkeit in böden mit enzymmethoden. Adv Enzymol Relat Areas Mol Biol 28:365–390
Isermayer H (1952) Eine einpache method zur bestimmung der pflanzenatmung under carbonate in boden. Z Pflanzenernahr, Dung
Jackson ML (1958) Soil chemical analysis prentice hall. Inc., Englewood Cliffs, NJ, vol 498, pp 183–204
Jian S, Li J, Chen J, Wang G, Mayes MA, Dzantor K, Hui ED, Luo Y (2016) Soil extracellular enzyme activities, soil carbon and nitrogen storage under nitrogen fertilization: a meta-analysis. Soil Biol Biochem 101:32–43
Kacar B (2016) Physical and chemical soil analysis. Nobel Press, Turkish
Karagöz Y, Kösterelioğlu İ (2008) Developing evaluation scale of communication skills with factor analysis. Dumlupınar Univ J Social Sci 21
Karlen DL, Mausbach MJ, Doran JW, Cline RG, Harris RF, Schuman GE (1997) Soil quality: a concept, definition, and framework for evaluation. Soil Sci Soc Am J 61:4–10
Karlen DL, Stott DE (1994) A framework for evaluating physical and chemical indicators of soil quality. Defin soil qual sustain environ 35:53–72
Kavzoğlu T, Çölkesen İ (2010) Classification of satellite images using decision trees: Kocaeli case. Electron J Map Technol 2(1):36–45
Kheir RB, Greve MH, Bøcher PK, Greve MB, Larsen R, McCloy K (2010) Predictive mapping of soil organic carbon in wet cultivated lands using classification-tree based models: the case study of Denmark. J Environ Manag 91(5):1150–1160
Kılıç S (2013) ROC analysis in clinical decision making. Psych Behav Sci 3(3):135
Koca YK, Acar M, Turgut YŞ (2019) Evaluation of quality of agricultural soils with geostatistical modeling. Harran J Agric Food Sci 23(4):489–499
Lindsay WL, Norvell WA (1978) Development of a DTPA soil test for zinc, iron, manganese, and copper. Soil Sci Soc Am J 42(3):421–428
Liu Z, Zhou W, Shen J, Li S, Ai C (2014) Soil quality assessment of yellow clayey paddy soils with different productivity. Biol Fertil Soils 50:537–548. https://doi.org/10.1007/s00374-013-0864-9
Malone BP, Minasny B, McBratney AB (2017) Using R for digital soil mapping. Springer International Publishing, Basel
Masto RE, Chhonkar PK, Purakayastha TJ, Patra AK, Singh D (2008) Soil quality indices for evaluation of long-term land use and soil management practices in semi-arid sub-tropical India. Land Degrad Dev 19(5):516–529
Mihalikova M, Dengiz O (2019) Towards more effective irrigation water usage by employing land suitability assessment for various irrigation techniques. Irrigation and drainage. https://doi.org/10.1002/ird.2349
Mukherjee A, Lal R, Zimmerman AR (2014) Effects of biochar and other amendments on the physical properties and greenhouse gas emissions of an artificially degraded soil. Sci Total Environ 487:26–36
Mulla DJ, McBratney AB (2000) Soil spatial variability. In: Sumner ME (ed) Handbook of soil science. CRC Press, Boca Raton, pp 321–352
Navarro SA, Gil-Vázquez JM, Delgado-Iniesta MJ, Marín-Sanleandro P, Blanco-Bernardeau A, Ortiz-Silla R (2015) Establishing an index and identification of limiting parameters for characterizing soil quality in Mediterranean ecosystems. Catena 131:35–45
NEN 5140 (1996) Nederlandse norm - Geotechniek. Bepaling van de conusweerstand en de plaatselijke wrijvingsweerstand van grond. Elektrische sondeermethode. Nederlands Normalisatie-instituut, Delft, p 6
Qi Y, Darilek JL, Huang B, Zhao Y, Sun W, Gu Z (2009) Evaluating soil quality indices in an agricultural region of Jiangsu Province, China. Geoderma 149(3–4):325–334
Olsen SR (1954) Estimation of available phosphorus in soils by extraction with sodium bicarbonate (no. 939). US Department of Agriculture
Özyazıcı MA, Dengiz O, Aydoğan M, Bayraklı B, Kesim E, Urla Ö, Yıldız H, Ünal E (2016) Levels of basic fertility and the spatial distribution of agricultural soils in central and eastern Black Sea region. Anadolu J Agri Sci 31(1):136–148
Pehlivan G (2006) CHAID analysis and an application. Yıldız Technical University, Institute of Science, Unpublished Master Thesis
Saaty TL (1977) A scaling method for priorities in hierarchical structures. J Math Psychol 15(3):234–281
Saaty TL (1980) The analytic hierarchy process. New York: McGraw-Hill, (This book has been translated into Chinese by S. Xu et al.; information is available from them at the Inst. of Systems Engineering, Tianjing Univ., Tianjin, China.), A translation into Russian by R. Vachnadze is currently underway
Saaty TL (2008) Decision making with the analytic hierarchy process. International Journal of Services Sciences 1(1):83–98
Schoeneberger PJ, Wysocki DA, Benham EC (2012) Soil survey staff field book for describing and sampling soils. In: Version, 3. National Soil Survey Center Lincoln NE, Natural Resources Conservation Service
Soil Survey Staff (1992) Procedures for collecting soil samples and methods of analysis for soil survey. Soil Survey Invest. Rep. I. U.S. Gov. Print. Office, Washington D.C. USA
Soil Survey Staff (1993) Soil survey manual. USDA Handbook. No: 18 Washington D.C.
Seker C, Özaytekin HH, Negiş H, Gümüş İ, Dedeoğlu M, Atmaca E, Karaca Ü (2017) Identification of regional soil quality factors and indicators: a case study on an alluvial plain (Central Turkey). Solid Earth 8(3):583–595
Şenol H, Dengiz O, Alaboz P (2019) Determination of spatial variability of soil quality index based on multi criteria decision analysis. International Soil Congress 2019 17-19 June Ankara, Turkey

Tabatabai MA, Bremner JM (1972) Assay of urease activity in soils. Soil Biol Biochem 4:479–487
Tatlıdil H (2002) Applied multivariate statistical analysis. Academy Printing House, Ankara
Tunçay T, Başkan O, Bayramin İ, Dengiz O, Kılıç Ş (2018) Geostatistical approach as a tool for estimation of field capacity and permanent wilting point in semiarid terrestrial ecosystem. Arch Agron Soil Sci 64(9):1240–1253. https://doi.org/10.1080/03650340.2017.1422081
Turan İD, Dengiz O (2017) Erosion risk prediction using multi-criteria assessment in Ankara Güvenç Basin. J Agric Sci 23(3):285–297
Turner BL, Hopkins DW, Haygarth PM, Ostle N (2002) β-glucosidase activity in pasture soils. Appl Soil Ecol 20:157–162
Tripathi DK, Singh VP, Chauhan DK, Prasad SM, Dubey NK (2014) Role of macronutrients in plant growth and acclimation: recent advances and future prospective. In: improvement of crops in the era of climatic changes. Springer, New York, pp 197–216
U.S Salinity Laboratory Staff (1954) Diagnosis and improvement of saline and alkali soils. Agricultural Handbook 60 U.S.D.A.
Vågen TG, Winowiecki LA, Tondoh JE, Desta LT, Gumbricht T (2016) Mapping of soil properties and land degradation risk in Africa using MODIS reflectance. Geoderma 263:216–225
Van Wambeke A R (2000) The Newhall simulation model for estimating soil moisture and temperature regimes. Department of Crop and Soil Sciences. Cornell University, Ithaca, NY. USA
Vasu D, Singh SK, Ray SK, Duraisami VP, Tiwary P, Chandran P, Anantwar SG (2016) Soil quality index (SQI) as a tool to evaluate crop productivity in semi-arid Deccan plateau, India. Geoderma 282:70–79
Wilding LP, Bouma J, Goss DW (1994) Impact of spatial variability on interpretative modelling. In: quantitative modelling of soil forming processes. R.B. Bryant and Arnold R.W. (Ed.) SSSA special publication number 39. SSSA Inc. Madison Wisconsin, USA
Witt SF, Witt CA (1992) Modeling and forecasting demand in tourism. Academic Press, London
Wu C, Liu G, Huang C, Liu Q (2019) Soil quality assessment in Yellow River Delta: establishing a minimum data set and fuzzy logic model. Geoderma 334:82–89
