
Eur. Phys. J. Plus (2019) 134: 326
DOI 10.1140/epjp/i2019-12692-0

THE EUROPEAN PHYSICAL JOURNAL PLUS

Regular Article

Prediction of the California bearing ratio (CBR) of compacted soils by using GMDH-type neural network

T. Fikret Kurnaz(1,a) and Yilmaz Kaya(2)

(1) Mersin University, Vocational School of Technical Sciences, Department of Transportation Services, Mersin, Turkey
(2) Siirt University, Department of Computer Engineering, Faculty of Engineering and Architecture, Siirt, Turkey

Received: 29 June 2018 / Revised: 7 April 2019
Published online: 10 July 2019
© Società Italiana di Fisica / Springer-Verlag GmbH Germany, part of Springer Nature, 2019

Abstract. The California bearing ratio (CBR) is an important parameter in defining the bearing capacity of various soil structures, such as earth dams, road fillings and airport pavements. However, determination of the CBR value of compacted soils from tests takes a relatively long time and leads to a demanding experimental working program in the laboratory. This study aims to predict the CBR value of compacted soils by using the group method of data handling (GMDH) model, a type of artificial neural network (ANN). The results were also compared with multiple linear regression (MLR) analysis and different ANN models. The selected variables for the developed models are the gravel content (GC), sand content (SC), fines content (FC), liquid limit (LL), plasticity index (PI), optimum moisture content (OMC) and maximum dry density (MDD) of the compacted soils. Many trials were carried out with different numbers of layers and different numbers of neurons in the hidden layer in the GMDH model, and with different training algorithms in the ANN models. The results indicate that the GMDH model is more successful in estimating the CBR value than both the MLR and the different types of ANN models.

1 Introduction
The California bearing ratio (CBR) is defined as the resistance of a soil sample to a penetration piston pressed into it at a rate of 1.27 mm/min; in other words, it is the force required for the piston to sink into the soil sample [1]. The CBR test was first developed in the US state of California in 1929 with the aim of determining the suitability of soils for use in the sub-structures of highways [2,3]. During the Second World War, some modifications were made to the test method by American civil engineers so that it could be used more effectively in airport construction. The CBR test has long been used in developing countries to measure the strength of pavement layers [4]. In particular, highway engineers need the CBR value for superstructure design.
CBR can also be defined as a measure of the quality of the base and subbase materials in transport structures, obtained by proportioning the penetration resistance of the soil to the penetration resistance of a standard material [5]. CBR is a comparative measure of the shear strength of the soil: the ratio of the bearing capacity of the existing base material, or of any material, to the bearing capacity of crushed rock. The CBR value of crushed rock used in engineering structures is taken as 100, while the CBR values of other materials fall below this maximum value.
CBR tests can be carried out in the laboratory or in the field [6]. Laboratory CBR tests are generally performed on compacted soil samples, while field tests are performed on compacted soil surfaces or in trenches and pits [7]. The CBR test is a penetration test that is applied to almost all soil groups, from clay to fine gravel. The soil structure, water content and dry unit weight are the most important factors affecting the CBR value: as the water content increases, the CBR value decreases, and as the dry unit weight of the material increases, the CBR value also increases [4]. The results of laboratory and field CBR tests may differ depending on the type of soil, water content and dry unit weight.
The laboratory CBR test procedure defined in TS 1900-2 [8] measures the load required for the sinking of a cylindrical piston with a cross-sectional area of 19.6 cm^2 into the soil at a certain speed (1.50 mm/min). This load is divided by the load required to sink the piston to the same depth into a standard crushed rock sample, and

(a) e-mail: fkurnaz@mersin.edu.tr (corresponding author)

Table 1. Standard stresses according to the amount of penetration in the test by using crushed rock [2].

Penetration depth (mm)   Standard stress (kgf/cm^2)   Standard load (kgf)
 2.54                     70.4                        1362.6
 5.08                    105.6                        2034.9
 7.62                    133.7                        2587.7
10.16                    161.9                        3133.5
12.70                    183.0                        3541.9

the result obtained is multiplied by 100 to give the CBR value. This value is not a fixed number for a given soil: it varies with the water content and density of the soil, and the CBR value obtained from the test is valid only for the water content and density at which the test was carried out. The CBR value can be expressed by the following equation:

    CBR = [Applied stress (or load) in experiment / Standard stress (or load)] × 100.    (1)

It is understood from the above equation that the CBR value is a percentage of the standard stress. In practice, the "%" symbol is usually omitted and integers such as 4, 38 or 98 are used for the CBR [4]. The standard stress values depending on the penetration amount are given in table 1. The CBR value is found as the ratio of the applied stress in the test to the standard stress, usually at a penetration of 2.54 mm. The soil samples used in the CBR test are usually prepared at the optimum water content, which is determined by the Standard or Modified Proctor test in the laboratory.
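Numerically, eq. (1) with the 2.54 mm standard load from table 1 works out as in the short sketch below; the measured load is a hypothetical example, not a value from the paper:

```python
STANDARD_LOAD_2_54_MM = 1362.6  # kgf, crushed rock standard load at 2.54 mm (table 1)

def cbr(applied_load, standard_load=STANDARD_LOAD_2_54_MM):
    """Eq. (1): CBR = (applied load / standard load) * 100."""
    return applied_load / standard_load * 100.0

measured_load = 136.26  # kgf at 2.54 mm penetration (hypothetical)
print(round(cbr(measured_load)))  # 10, i.e. a CBR of 10
```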
The CBR tests can be performed on soaked or unsoaked soil samples, and the results give useful information on the bearing capacity of various soil structures such as earth dams, road fillings, highway embankments and airport pavements. However, laboratory CBR tests are laborious and time consuming. In addition, poor quality of the soil samples and laboratory conditions sometimes affect the accuracy of the CBR values obtained from the tests. Therefore, many studies have been carried out by different researchers on the California bearing ratio (CBR) and various approaches have been developed. Previous research has shown that soil type and soil properties affect the CBR value [9]. Hence, investigations have mostly focused on empirical relations between the index and compaction characteristics and the CBR value of soil samples [9–22].

Recently, soft computing methods have become popular in practical solutions of geotechnical engineering problems and useful results have been obtained [23–35]. A few studies have also attempted to predict the CBR values of soils using different ANN applications [36–39].
In this paper, an alternative approach based on the GMDH model, a type of ANN, is proposed for the prediction of the CBR value of compacted soils. In order to compare the results of the GMDH model, MLR analysis and different ANN applications were also carried out with the same database. The GMDH model was first proposed by Ivakhnenko (1976) [40]; the GMDH network is a self-organizing machine learning method. Being self-organizing, GMDH creates an optimal network by trying a number of networks of different architectures depending on the number of input variables. Recently, the GMDH model has begun to be applied to some geotechnical problems [41–44]. The index and compaction characteristics of soils were selected as the input parameters for the prediction of the CBR value in all models of this study. The results obtained from the models were compared on the basis of the mean square error (MSE), root mean square error (RMSE) and determination coefficient (R).

2 Database compilation

The database used in this study consists of the index, Standard Proctor (SP) and CBR test results of 158 soil samples. 68 of the test results were obtained from soil mechanics laboratories located in different parts of Turkey, and the remaining 90 test results were obtained from road construction studies completed by the General Directorate of Highways of Turkey. The data forming the database are the gravel content (GC), sand content (SC), fines content (FC), liquid limit (LL), plasticity index (PI), optimum moisture content (OMC), maximum dry density (MDD) and the CBR values. The soil samples in the database are both fine and coarse grained and are classified into twelve different soil types (CH, CI, CL, GC, GM, MH, MI, ML, SC, SM, SP, SW).

All the CBR values in the database belong to soil samples compacted at the optimum moisture content to the maximum dry density. The soil type is one of the most important factors affecting the compaction of soil samples. In general, low water content and high dry density values are obtained for coarse-grained soils; the MDD decreases as the amount of fine material in the soil increases. The compaction of cohesive soils is directly affected by the plasticity characteristics of the fine material: as LL and PI increase, the MDD decreases. A statistical description of the soil parameters in the database is given in table 2. In order to better observe the distribution of the parameters, frequency histograms were prepared for all the parameters in the data set (fig. 1).

Table 2. Descriptive statistics of the soil parameters in the dataset.

           GC (%)  SC (%)  FC (%)  LL (%)  PI (%)  OMC (%)  MDD (kN/m^3)  CBR (%)
Min          0       0     20.30   24       2.80    1.68    12             0.70
Max        100      74.60 100      94      58      35.40    21.30         32
Mean         7.73   27.83  64.42   45.91   23.46   19.16    16.37          8.54
Std. Dev.   10.71   20.14  24.84   14.06   11.19    8.62     2.33          7.38

Fig. 1. The frequency histograms of the index and compaction parameters.



3 Method

In the last decade, many soft computing methods have been used for almost all complex problems in geotechnical engineering. The majority of these studies are based on ANN models with different network architectures. In this study, the CBR value of compacted soils, normally obtained as a result of difficult and laborious laboratory work, is estimated with MLR analysis, the GMDH model and different ANN applications.

3.1 Multiple linear regression model (MLR)

The relationships between multiple independent variables (x1, x2, ..., xn) and a dependent variable (y) are examined in multiple regression analysis. In general, the dependent variable may be related to n independent variables. Assuming that each independent variable has a linear relationship with the dependent variable, the multiple regression model used here is
yi = a + b1 xi,1 + b2 xi,2 + · · · + bn xi,n . (2)

An estimate of the real multiple relationship presumed to exist between the variables is obtained with the help of the following function:

yi = α + β1 xi,1 + β2 xi,2 + · · · + βn xi,n , (3)

where the parameter α is the y-intercept of the plane and parameters β1 , β2 , . . . , βn are called partial regression
coefficients. This model describes a hyperplane in the n-dimensional space of the independent variables.
In fitting a multiple regression model, it is much more convenient to express the mathematical operations using
matrix notation. For the n independent variables and k observations, the multiple regression model can be expressed
in matrix notation as
Y = Xβ + ε, (4)

where

    Y = (y1, y2, ..., yk)^T,  β = (α, β1, ..., βn)^T,  ε = (ε1, ε2, ..., εk)^T,    (5)

and X is the design matrix whose i-th row is (1, x_{i,1}, x_{i,2}, ..., x_{i,n}).

In general, Y is a (k × 1) vector of observations, X is a (k × (n + 1)) design matrix, β is an ((n + 1) × 1) vector of regression coefficients, and ε is a (k × 1) vector of random errors.
For the calculation of the vector of regression coefficients, the differences between the observed y values and the estimated values ŷ are minimized using the least squares method [45]:

    Σ_{i=1}^{k} (yi − ŷi)^2 → min.    (6)
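The least-squares solution of eqs. (4)–(6) is the normal-equation estimate β = (X^T X)^{-1} X^T Y. The sketch below uses synthetic data with a known coefficient vector (an illustration only, not the paper's data):

```python
import numpy as np

rng = np.random.default_rng(0)
k, n = 20, 2  # k observations, n independent variables
X = np.column_stack([np.ones(k), rng.uniform(0.0, 100.0, (k, n))])  # design matrix of eq. (5)

true_beta = np.array([5.0, 0.3, -0.1])  # [alpha, beta1, beta2]
Y = X @ true_beta  # noise-free, so least squares recovers true_beta exactly

beta = np.linalg.solve(X.T @ X, X.T @ Y)  # minimizes eq. (6)
print(np.allclose(beta, true_beta))  # True
```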

3.2 Artificial neural networks (ANN)

An ANN is an information processing system consisting of simple information processing units that can run in parallel [46]. The main goal of an ANN is to model the relationship between an input vector and an output. An ANN predicts the outputs for different inputs after being trained on the samples of a data set. It consists of a large number of interconnected information processing units called neurons, and its architecture depends on the weights and activation functions that connect the neurons together. There are two connection architectures in typical ANNs: feed-forward and feedback. In a feed-forward architecture, the connections between the ANN layers advance to the next layer; feedback architectures have a backward linkage from the output to the input neurons. An ANN can be single-layer or multi-layer. Generally, feed-forward ANN architectures based on backpropagation training rules are composed of three layers: input, intermediate (hidden) and output (fig. 2).

Fig. 2. Artificial neural network structure.

3.3 Group method of data handling (GMDH)

The GMDH algorithm is a self-organizing approach based on evaluating performance on multiple input - single output
data pairs. GMDH, proposed by Ivakhnenko in the 1960s [47], is an architectural class of polynomial neural network
models. Since the GMDH network has a flexible structure, hybrid methods have been developed with intuitive methods
such as genetic, evolutionary, particle swarm optimization [48]. The main implication of the GMDH model is to define
an analytical function that enables weights to be obtained on a regression basis in forward feed neural networks
using square neurons. In the GMDH network, neurons in a layer are bound to the next layer through a quadratic
and triquadratic polynomial to form new neurons in the next layer. The best nodes in each layer are determined by
performing a layer-layer pruning according to a predetermined criterion in the topology of the GMDH network. In this
model, the input variables are mapped to the output variable. In this mapping, the goal is to construct the function f (),
which will estimate the output value ŷ using the input vector X = (X1 , X2 , X3 , . . . , Xn ) [41]. This function estimates
the values as close as possible to real ŷ output values. When considering multiple input - single output, the function
between them is expressed as follows [42]:

yi = f (xi1 , xi2 , xi3 , . . . , xin ) (i = 1, 2, 3, . . . M ). (7)

Thus, it is possible to estimate the output value ŷ by using the input vector X = (X1 , X2 , X3 , . . . , Xn ). The prediction
equation can be written as
ŷi = fˆ (xi1 , xi2 , xi3 , . . . , xin ) (i = 1, 2, 3, . . . M ). (8)
To solve this problem, GMDH generates the general relation between the output and input variables in the form of a mathematical description referred to as the reference function. The aim is to minimize the difference between the actual output values and the estimated values:
    Σ_{i=1}^{M} [f̂(xi1, xi2, xi3, ..., xin) − yi]^2 → Minimum.    (9)

The general connection between the input and output variables can be expressed as a complex discrete form of a series of Volterra functions as follows [42,49]:

    y = w0 + Σ_{i=1}^{n} wi xi + Σ_{i=1}^{n} Σ_{j=1}^{n} wij xi xj + Σ_{i=1}^{n} Σ_{j=1}^{n} Σ_{k=1}^{n} wijk xi xj xk + ... .    (10)

The above equation is known as the Kolmogorov-Gabor polynomial. GMDH uses a recursive polynomial regression procedure to synthesize the model: the polynomial regression equations can produce a high-order polynomial model from effective predictors, built up from partial quadratic polynomials of the form

    Quadratic: ŷ = G(xi, xj) = w0 + w1 xi + w2 xj + w3 xi xj + w4 xi^2 + w5 xj^2.    (11)

The mathematical relation between the input variables of the generated network and the output variable is formed by eq. (10). The weights of the equation in eq. (11) are calculated by regression methods, so that the difference between the real y and the estimated ŷ is minimized for the input pairs xi and xj. The weights are obtained by the least squares method. In this way, the weighting coefficients of the quadratic function Gi are obtained so as to optimally fit the output for all input-output data pairs. In the training process, the GMDH model tries to best estimate the output variable by taking the input variables two at a time and creating a second-order polynomial equation (eq. (11)) for each pair. Each input vector pair (of attributes) forms a quadratic regression polynomial equation, so for the first layer L = m(m − 1)/2 regression polynomial equations are obtained; for example, if the number of input variables is m = 4, then L = 6 regression polynomial equations are obtained in the first layer. New variables are

Fig. 3. GMDH network architecture.

obtained for the next layer from the first layer using these equations, and likewise new variables are obtained in each subsequent layer. Thus, new variables are generated that best explain the output variable from the input variables. If the minimum error value in the current layer is greater than the error value in the previous layer, the model is becoming overly complicated; in other words, the error value in a given layer is expected to be smaller than the error value in the previous layer. The GMDH network architecture is given in fig. 3.
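The layer-wise pairing described above is easy to sketch. The snippet below (a minimal illustration, not the paper's implementation) reproduces the L = m(m − 1)/2 candidate count for m = 4 input variables:

```python
from itertools import combinations

def candidate_pairs(variables):
    """All unordered pairs of inputs; each pair feeds one quadratic neuron."""
    return list(combinations(variables, 2))

# m = 4 input variables -> L = 4 * 3 / 2 = 6 candidate neurons in the first layer
pairs = candidate_pairs(["x1", "x2", "x3", "x4"])
print(len(pairs))  # 6
```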
Each input data pair forms a regression equation. Outputs of the regression equations form new inputs to the next
layer. The final output consists of the regression equations selected from all the layers:
    E = (1/M) Σ_{i=1}^{M} (yi − Gi(xi, xj))^2 → minimum.    (12)
The GMDH network is constructed using all possible binary combinations of the n input variables to build the polynomial regression equations (eq. (10)) that best predict the dependent variable y with the least squares method. From the observed samples {(yi, xip, xiq), (i = 1, 2, 3, ..., M)}, the first layer of the GMDH network is constructed using n(n − 1)/2 quadratic polynomial neurons:
    [ x1p  x1q  y1 ]
    [ x2p  x2q  y2 ]
    [ ...          ]
    [ xMp  xMq  yM ].    (13)

Equation (11) can be written in matrix form as follows, using the input-output variables mentioned above:

    AW = Y,    (14)

where W is the vector of the unknown weight coefficients of the quadratic polynomial and Y is the vector of the output values,

    W = (w0, w1, w2, w3, w4, w5)^T,    (15)
    Y = (y1, y2, y3, ..., yM)^T,    (16)

and A is the design matrix whose i-th row is

    (1, xip, xiq, xip·xiq, xip^2, xiq^2).    (17)

The weights are solved in matrix form using multiple regression equations as follows:

W = (AT A)−1 AT Y, (18)

where W is the weight vector to be estimated, A is the input matrix, and Y is the output vector.
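As a sketch of eqs. (11)–(18), the least-squares fit of a single quadratic neuron can be written directly with the normal equations. The data below are synthetic, chosen so that the recovered weights are known (an illustration, not the paper's code):

```python
import numpy as np

rng = np.random.default_rng(1)
xp, xq = rng.uniform(0.0, 1.0, 50), rng.uniform(0.0, 1.0, 50)

# Noise-free target generated by a known quadratic polynomial (eq. (11))
true_w = np.array([2.0, 0.5, -1.5, 0.3, 0.1, -0.2])
A = np.column_stack([np.ones_like(xp), xp, xq, xp * xq, xp**2, xq**2])  # eq. (17)
y = A @ true_w

# Eq. (18): W = (A^T A)^{-1} A^T Y, solved as a linear system for stability
W = np.linalg.solve(A.T @ A, A.T @ y)
print(np.allclose(W, true_w))  # True
```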

Table 3. Multiple linear regression analysis results.

Model No.  Variables                      Regression equation                                                                            R     RMSE
Model 1    GC, SC, FC                     CBR = −161.77 + 1.8116GC + 1.852SC + 1.625FC                                                   0.72  5.11
Model 2    LL, PI                         CBR = 19.81 − 0.276LL + 0.059PI                                                                0.44  6.65
Model 3    MDD, OMC                       CBR = −25.37 + 2.053MDD + 0.015OMC                                                             0.63  5.71
Model 4    GC, SC, FC, LL, PI             CBR = −1361.85 + 13.843GC + 13.880SC + 13.673FC − 0.113LL + 0.049PI                            0.74  5.04
Model 5    LL, PI, MDD, OMC               CBR = −24.52 − 0.005LL − 0.007PI + 2.022MDD + 0.021OMC                                         0.63  5.75
Model 6    GC, SC, FC, LL, PI, MDD, OMC   CBR = −2914.53 + 28.948GC + 29.064SC + 28.812FC + 0.070LL − 0.128PI + 1.574MDD + 0.406OMC      0.83  4.16

3.4 Performance criteria

In this study, statistical criteria such as mean square error (MSE), root mean square error (RMSE) and determination
coefficient (R) were used to test the performances of the GMDH and the other models. These criteria are defined by
the following equations:

    MSE = (1/N) Σ_{i=1}^{N} (Ŷi − Yi)^2,    (19)

    RMSE = √[ (1/N) Σ_{i=1}^{N} (Ŷi − Yi)^2 ],    (20)

    R = √{ [ Σ_{i=1}^{N} (Yi − Ȳ)^2 − Σ_{i=1}^{N} (Ŷi − Yi)^2 ] / Σ_{i=1}^{N} (Yi − Ȳ)^2 },    (21)

where Yi is the actual value of the California bearing ratio (CBR), Ȳ is the mean of these values, Ŷi is the estimated CBR value and N is the total number of observations. MSE and RMSE are positive values, and these statistics are expected to be as small as possible: values near zero indicate that the model predicts the CBR parameter close to the real values. The R criterion quantifies the relation between the actual CBR values and the values the model estimates for the parameter; the R coefficient must be close to 1 for the model to be successful.

4 Results
4.1 MLR results

In this section, the relationships between the index and compaction parameters and the CBR values of the soils have been investigated with multiple linear regression (MLR) analysis. The independent variables were selected as the gravel content (GC), sand content (SC), fines content (FC), liquid limit (LL), plasticity index (PI), optimum moisture content (OMC) and maximum dry density (MDD), and the dependent variable was selected as the CBR value. MLR analyses were carried out for different models between the independent and dependent variables. The regression models, regression equations and obtained performance coefficients are given in table 3.

4.2 GMDH results

In this study, the CBR value, a resistance parameter of compacted soils, has been estimated using the GMDH algorithm. GMDH is a nonlinear regression method, but it is also a model that carries the characteristics of supervised and unsupervised artificial neural networks (ANN). Regression is a statistical model that examines the cause-and-effect relationship between independent and dependent variables; linear regression models the relationship between one or more independent variables and a dependent variable.

Table 4. Performance criteria for different numbers of hidden layers.

          Training                 Testing
#Layers   MSE    RMSE   R         MSE    RMSE   R
2         3.03   1.74   0.9628    4.28   2.07   0.9501
3         2.24   1.49   0.9726    3.28   1.81   0.9612
4         1.92   1.38   0.9766    2.56   1.60   0.9688
5         1.64   1.28   0.9800    2.33   1.52   0.9727
6         1.61   1.27   0.9804    2.01   1.42   0.9749
7         1.60   1.27   0.9805    1.69   1.30   0.9783

Fig. 4. Determination coefficient performance values.

Variables such as GC, SC, FC, LL, PI, OMC and MDD belonging to the compacted soil samples were used to estimate the CBR value with the GMDH model. The data set consisting of 158 records was divided into two separate sets: a training set (70%) and a testing set (30%). The input and output variables in both models were normalized with the min-max transformation given in the following equation:

    A_normalized = (A_actual − A_min) / (A_max − A_min).    (22)

Here, A_normalized is the normalized value of A, A_actual is the actual value of the variable, and A_max and A_min are the largest and smallest values of the variable. GMDH can be used with different numbers of layers and in architectures built with different numbers of neurons in each layer. Trials in this study were performed with GMDH architectures of different numbers of layers. The R and RMSE performance criteria obtained from trials with different numbers of hidden layers are given in table 4. Since the cost of calculation increases as the number of hidden layers increases, the trials were carried out with a maximum of 7 layers. As can be seen in table 4, as the number of hidden layers increases, the success of the model also increases (the RMSE and MSE error values decrease while the R determination coefficient increases). The best performance was achieved with 7 hidden layers (MSE = 1.60; RMSE = 1.27; R = 0.9805) (table 4).
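The min-max transformation of eq. (22) can be sketched as below; the OMC values used are hypothetical:

```python
import numpy as np

def min_max(values):
    """Eq. (22): scale a variable to [0, 1] by its own minimum and maximum."""
    a = np.asarray(values, dtype=float)
    return (a - a.min()) / (a.max() - a.min())

omc = [12.0, 19.0, 26.0]  # hypothetical optimum moisture contents (%)
print(min_max(omc))  # [0.  0.5 1. ]
```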
The regression plots between the real and the predicted values of the CBR parameter are given in fig. 4. The regression curves were created for the training, testing and complete data sets. The determination coefficient was obtained as R = 0.9805 for the training set, R = 0.9783 for the test set and R = 0.9798 for the whole data set (fig. 4). The distribution plots, error distributions and error distribution histograms for the real and estimated values of the CBR parameter are given in figs. 5, 6 and 7, respectively.

In order to investigate the effect of the number of neurons in the hidden layer on the achieved performance, trials were carried out using different numbers of neurons in the hidden layer. The trials were performed with between 1 and 10 neurons in the hidden layer for the GMDH network in the 7-layer topology with which the highest performance was achieved (table 5). As can be seen in table 5, the success on the test set increases as the number of neurons increases; however, this was not observed for the training set.

Fig. 5. The distribution graphics, error distributions and error distribution histograms belonging to the real and estimated
values of the CBR parameter for all data set.

Fig. 6. The distribution graphics, error distributions and error distribution histograms belonging to the real and estimated
values of the CBR parameter for training data set.

Fig. 7. The distribution graphics, error distributions and error distribution histograms belonging to the real and estimated
values of the CBR parameter for test data set.

Table 5. GMDH model performances with different neuron numbers.

           Training                 Testing
#Neurons   MSE    RMSE   R         MSE    RMSE   R
1          4.45   2.11   0.9450    6.05   2.46   0.9270
2          2.58   1.60   0.9684    4.40   2.09   0.9498
3          2.40   1.55   0.9707    4.23   2.05   0.9520
4          2.42   1.55   0.9704    4.08   2.02   0.9531
5          2.32   1.52   0.9716    3.98   1.99   0.9528
6          1.55   1.24   0.9811    2.54   1.59   0.9715
7          1.55   1.24   0.9811    2.03   1.42   0.9756
8          1.59   1.26   0.9806    1.88   1.37   0.9758
9          1.60   1.26   0.9805    1.69   1.30   0.9783
10         1.60   1.26   0.9805    1.69   1.30   0.9783

Fig. 8. Pairs of variables for the 7-layer GMDH network.

The most important features of the GMDH algorithm are the generated polynomial equations and the automated selection of the necessary input variables during modeling. The polynomial equations reveal the quantitative relation between the input and output variables. In this study, polynomial functions of the input parameters were developed for the prediction of the CBR value by the GMDH model. The polynomial regression equations were obtained with the GMDH model with 7 layers and 10 neurons per layer. The GMDH network used to estimate the CBR parameter is given in fig. 8. The GC, FC, SC, LL and MDD variables were selected as the most effective predictors of the CBR value in the GMDH model. OMC was observed to be the least effective variable and was not used at all. The most effective variable is SC, which is widely used in the polynomial equations. The pairs formed by the most effective variables are as follows:

{GC, MDD}, {FC, MDD}, {SC, MDD}, {SC, LL}, {FC, LL}, {LL, MDD}, {SC, PI}, {GC, SC}, {GC, FC} and {SC, FC}.

The polynomial functions were obtained using these variables, with the weights of the equations estimated by the least squares method. For each layer, an error is calculated for all combinations of variable pairs in the GMDH network, and the pairs of variables producing the lowest errors are moved to the next layer. The polynomial equations of the variable pairs forming the smallest errors in each layer are given below. In this way, the effective variables for the output can be determined in GMDH networks. In the first layer, new variables are created for the next layer using the most effective variable pairs from the input layer; similarly, in the subsequent layers the CBR values are estimated using the most effective variable pairs of the previous layer. The current error sum decreases while moving from one layer to the next in the GMDH network.

Hidden layer 1 polynomial equations:

A1 = 95.3873 + 0.8618x1 − 12.8900x6 − 0.0105x1x6 + 0.4497x1^2 − 0.0269x6^2,
A2 = 111.5454 + 0.2892x3 − 14.9904x6 − 0.0031x3x6 + 0.4995x3^2 + 0.0020x6^2,
A3 = 83.4719 + 0.0452x2 − 10.7638x6 − 0.0033x2x6 + 0.3511x2^2 + 0.0166x6^2,
A4 = −0.4661 + 0.8241x2 − 0.0725x4 − 0.0033x2x4 + 0.0019x2^2 − 0.0097x4^2,
A5 = 38.7169 − 0.2552x3 − 0.5968x4 − 0.0017x3x4 + 3.0978e−04x3^2 + 0.006x4^2,
A6 = 27.3663 + 1.1471x4 − 6.6436x6 − 0.0022x4x6 − 0.3261x4^2 − 0.0643x6^2,
A7 = −2.0613 + 0.5996x2 + 0.0906x5 − 0.0024x3x5 + 0.0014x2^2 − 0.0114x5^2,
A8 = 2.0745 + 0.2415x1 + 0.1431x2 − 0.0060x1x2 + 1.8852e−04x1^2 + 0.0055x2^2,
A9 = 18.2779 + 0.6148x1 − 0.1810x3 − 0.0114x1x3 + 1.8940e−04x1^2 − 0.0052x3^2,
A10 = −34.0937 + 1.6624x2 + 0.9648x3 − 0.0114x2x3 − 0.0060x2^2 − 0.0176x3^2.
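To show how the cascade is evaluated, the first neuron A1 above can be coded as a plain function of its two inputs; the input values in the example are hypothetical (the network operates on the pair of variables labelled x1 and x6):

```python
def A1(x1, x6):
    """First hidden-layer quadratic neuron, with the coefficients printed above."""
    return (95.3873 + 0.8618 * x1 - 12.8900 * x6
            - 0.0105 * x1 * x6 + 0.4497 * x1**2 - 0.0269 * x6**2)

print(round(A1(0.3, 0.7), 4))  # 86.6479 for the hypothetical inputs (0.3, 0.7)
```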


Hidden layer 2 polynomial equations:

B1 = −2.0569 + 0.7637A1 + 0.8325A3 − 0.1110A1A3 − 0.1235A1^2 + 0.2115A3^2,
B2 = −1.7277 + 0.1258A2 + 1.1488A6 + 0.0839A2A6 − 0.0727A2^2 − 0.0198A6^2,
B3 = −2.7423 + 0.7079A1 + 1.0518A5 − 0.0075A1A5 − 0.0667A1^2 + 0.0390A5^2,
B4 = 0.2884 + 0.5125A1 + 0.5346A2 − 0.2258A1A2 − 0.1903A1^2 + 0.4172A2^2,
B5 = −2.0082 + 1.2602A1 + 1.2602A6 + 0.3094A1A6 − 0.1123A1^2 + 0.2311A6^2,
B6 = −2.5725 + 0.7892A1 + 0.8616A4 − 0.0117A1A4 − 0.0507A1^2 + 0.0349A4^2,
B7 = −3.0106 + 0.7760A1 + 0.9717A7 − 0.0230A1A7 − 0.0606A1^2 + 0.0534A7^2,
B8 = −2.6028 + 0.6501A1 + 1.0431A8 + 0.0116A1A8 − 0.0486A1^2 + 0.0022A8^2,
B9 = −2.6047 + 0.6497A1 + 1.0440A9 + 0.0117A1A9 − 0.0486A1^2 + 0.0021A9^2,
B10 = −2.6054 + 0.6496A1 + 1.0443A10 + 0.0117A1A10 − 0.0486A1^2 + 0.0021A10^2.


Hidden layer 3 polynomial equations:

C1 = 0.6427 + 0.5049B1 + 0.3103B2 − 0.0422B1B2 − 0.0358B1^2 + 0.0877B2^2,
C2 = 0.6085 + 0.1530B2 + 0.6811B3 − 0.0660B2B3 − 0.0897B2^2 + 0.1658B3^2,
C3 = 0.6536 + 0.7232B2 + 0.1272B5 − 0.1136B2B5 − 0.1065B2^2 + 0.2297B5^2,
C4 = 0.2694 + 0.9657B1 − 0.0128B5 − 0.0791B1B5 − 0.0469B1^2 + 0.1280B5^2,
C5 = 0.4632 + 0.3110B2 + 0.5569B10 − 0.0611B2B10 − 0.0735B2^2 + 0.1426B10^2,
C6 = 0.4634 + 0.3107B2 + 0.5571B9 − 0.0611B2B9 − 0.0736B2^2 + 0.1427B9^2,
C7 = 0.4639 + 0.3103B2 + 0.5574B8 − 0.0611B2B8 − 0.0736B2^2 + 0.1428B8^2,
C8 = 0.7283 + 0.2389B2 + 0.5537B7 − 0.0289B2B7 − 0.0379B2^2 + 0.0773B7^2,
C9 = 0.6367 + 0.3815B2 + 0.4333B6 − 0.0141B2B6 − 0.0155B2^2 + 0.0386B6^2,
C10 = 0.0104 + 0.6642B4 + 0.2946B9 + 0.2021B4B9 + 0.1683B4^2 − 0.3690B9^2.



Hidden layer 4 polynomial equations:

D1 = 0.1273 + 0.2770C3 + 0.6941C10 + 0.0242C3C10 − 0.0175C3^2 − 0.0058C10^2,
D2 = −0.2107 + 0.6256C3 + 0.4129C4 + 0.1562C3C4 + 0.1353C3^2 − 0.2944C4^2,
D3 = −0.1110 + 0.4446C4 + 0.5907C9 − 0.1522C4C9 − 0.1330C4^2 + 0.2850C9^2,
D4 = −0.1505 + 0.2458C4 + 0.8035C8 − 0.1771C4C8 − 0.1594C4^2 + 0.3356C8^2,
D5 = 0.0794 + 0.3743C4 + 0.6002C5 − 0.0515C4C5 − 0.0443C4^2 + 0.0977C5^2,
D6 = 0.0795 + 0.3741C4 + 0.6004C6 − 0.0516C4C6 − 0.0444C4^2 + 0.0978C6^2,
D7 = 0.0798 + 0.3737C4 + 0.6007C7 − 0.0517C4C7 − 0.0445C4^2 + 0.0981C7^2,
D8 = 0.8291 − 0.4525C1 + 1.2586C10 − 0.1319C1C10 − 0.2032C1^2 + 0.3435C10^2,
D9 = 0.2541 + 0.1951C8 + 0.7699C10 − 0.2564C8C10 − 0.3025C8^2 + 0.5613C10^2,
D10 = 0.0672 + 0.7243C1 + 0.2583C4 − 0.0306C1C4 − 0.0451C1^2 + 0.0766C4^2.

Hidden layer 5 polynomial equations:

E1 = −0.2741 + 1.4033D1 − 0.3263D9 − 0.3006D1D9 − 0.2343D1^2 + 0.5322D9^2,
E2 = −0.0575 + 0.9460D1 + 0.0697D8 − 0.1074D1D8 − 0.0854D1^2 + 0.1922D8^2,
E3 = −0.1223 + 0.2732D7 + 0.7636D9 − 0.0971D7D9 − 0.0634D7^2 + 0.1589D9^2,
E4 = −0.1218 + 0.2723D6 + 0.7644D9 − 0.0971D6D9 − 0.0636D6^2 + 0.1593D9^2,
E5 = −0.1215 + 0.2716D5 + 0.7650D9 − 0.0972D5D9 − 0.0638D5^2 + 0.1594D9^2,
E6 = 0.1022 − 0.1979D4 + 1.1751D9 − 0.0983D4D9 − 0.1157D4^2 + 0.2151D9^2,
E7 = 0.0716 − 0.0328D3 + 1.0174D8 − 0.0735D3D8 − 0.0907D3^2 + 0.1650D8^2,
E8 = 0.0305 + 1.1651D1 − 0.1749D7 + 0.0550D1D7 + 0.0683D1^2 − 0.1229D7^2,
E9 = 0.0305 + 1.1663D1 − 0.1760D6 + 0.0545D1D6 + 0.0679D1^2 − 0.1220D6^2,
E10 = 0.0305 + 1.1671D1 − 0.1768D5 + 0.0543D1D5 + 0.0677D1^2 − 0.1217D5^2.

Hidden layer 6 polynomial equations:

F1 = −0.3827 − 1.2091 E1 + 2.3110 E8 − 0.2220 E1 E8 − 0.3265 E1^2 + 0.5447 E8^2,

F2 = −0.3830 − 1.2122 E1 + 2.3142 E9 − 0.2223 E1 E9 − 0.3269 E1^2 + 0.5454 E9^2,

F3 = −0.3831 − 1.2135 E1 + 2.3155 E10 − 0.2224 E1 E10 − 0.3271 E1^2 + 0.5457 E10^2,

F4 = −0.0196 + 273.0414 E3 − 272.0390 E4 − 1.2461 × 10^5 E3 E4 − 1.2461 × 10^5 E3^2 + 2.4922 × 10^5 E4^2,

F5 = −0.0178 + 190.6552 E3 − 189.6525 E5 − 4.3382 × 10^4 E3 E5 − 4.3379 × 10^4 E3^2 + 8.6761 × 10^4 E5^2,

F6 = −0.0152 + 531.3927 E4 − 530.3893 E5 − 2.1332 × 10^5 E4 E5 − 2.1331 × 10^5 E4^2 + 4.2663 × 10^5 E5^2,

F7 = −0.1829 + 0.0429 E3 + 1.0079 E8 − 0.1310 E3 E8 − 0.1626 E3^2 + 0.2920 E8^2,

F8 = −0.1827 + 0.0431 E3 + 1.0077 E9 − 0.1311 E3 E9 − 0.1626 E3^2 + 0.2920 E9^2,

F9 = −0.1826 + 0.0432 E3 + 1.0076 E10 − 0.1310 E3 E10 − 0.1626 E3^2 + 0.2920 E10^2,

F10 = −0.1829 + 0.0423 E4 + 1.0086 E8 − 0.1310 E4 E8 − 0.1627 E4^2 + 0.2921 E8^2.

Output layer polynomial equation:

Output = −0.1014 − 550.44 F7 + 551.467 F10 − 23557.93 F7 F10 − 23599.353 F7^2 + 47157.290 F10^2.

Table 6. ANN architecture and training parameters.

Architecture                Values
Number of layers            Input layer, hidden layer and output layer
Activation functions        Tangent sigmoid, sigmoid
Number of neurons           Input layer: 7; hidden layer: 1–100; output layer: 1
Training algorithm          Levenberg-Marquardt, Bayesian regularization, scaled conjugate gradient
Sum-squared error           0.0001
Initial weights and biases  Nguyen-Widrow method
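Table 6 describes a standard one-hidden-layer feed-forward network. A minimal numpy sketch of a forward pass through such an architecture (7 inputs, a tangent-sigmoid hidden layer, one linear output neuron); the random weights here are placeholders, not the trained values:

```python
import numpy as np

def forward(x, W1, b1, W2, b2):
    """One forward pass of a 7 -> n_hidden (tanh) -> 1 network as in table 6."""
    h = np.tanh(W1 @ x + b1)   # tangent-sigmoid hidden layer
    return float(W2 @ h + b2)  # linear output neuron (CBR estimate)

rng = np.random.default_rng(0)
n_hidden = 10                  # table 6 allows 1-100 hidden neurons
W1 = rng.normal(size=(n_hidden, 7)); b1 = rng.normal(size=n_hidden)
W2 = rng.normal(size=(1, n_hidden)); b2 = rng.normal(size=1)

x = np.zeros(7)                # one row of the 7 soil inputs (GC, SC, FC, LL, PI, OMC, MDD)
y = forward(x, W1, b1, W2, b2)
```

Training with Levenberg-Marquardt or Bayesian regularization, as in the paper, would adjust W1, b1, W2, b2 to minimize the sum-squared error down to the 0.0001 goal.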

Table 7. The success of the ANN models in predicting the CBR parameter.

ANN model    Train: MSE / R / RMSE        Test: MSE / R / RMSE
LM           2.9787 / 0.9577 / 1.7259     3.1228 / 0.9618 / 1.7671
BR           1.6244 / 0.9777 / 1.2745     2.8593 / 0.9703 / 1.6911
SCG          6.7686 / 0.9144 / 2.6017     7.6639 / 0.8990 / 2.7684

4.3 ANN results

In order to demonstrate the success of the GMDH algorithm, ANN models were also used to predict the CBR value in this study. In ANN applications the available dataset is generally divided into two subsets, a training set and a testing set, and the ANN models developed in this study therefore also have training and testing sets. As in the GMDH model, a randomly selected 70% of the 158 data points was used for training and the remaining 30% for testing.
The trials were performed with three different training algorithms: Levenberg-Marquardt (LM), scaled conjugate gradient (SCG) and Bayesian regularization (BR). The success of an ANN model depends on the activation function used in the neurons and on the number of neurons in the hidden layer; these parameters were determined by trial and error. Sigmoid and tangent-sigmoid activation functions were used in both the training and testing processes, and the number of neurons in the hidden layer was varied between 1 and 100 while observing the success of the training process. The parameters used for training the ANN models are given in table 6, and the results obtained with the ANN models for the prediction of the CBR parameter are given in table 7. In addition, the regression plots of the actual values against the values estimated by these models are given in fig. 9.
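The evaluation workflow described above — a random 70/30 split of the 158 records and scoring by MSE, R and RMSE — can be sketched as follows (numpy only; the synthetic data stand in for the real soil database):

```python
import numpy as np

def split_70_30(X, y, seed=0):
    """Randomly split a dataset into 70% training and 30% testing rows."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    cut = int(0.7 * len(y))
    tr, te = idx[:cut], idx[cut:]
    return X[tr], y[tr], X[te], y[te]

def scores(y_true, y_pred):
    """MSE, Pearson R and RMSE, the three metrics reported in table 7."""
    mse = float(np.mean((y_true - y_pred) ** 2))
    r = float(np.corrcoef(y_true, y_pred)[0, 1])
    return mse, r, float(np.sqrt(mse))

# Synthetic stand-in for the 158-sample CBR database (7 inputs, 1 output).
X = np.random.default_rng(1).normal(size=(158, 7))
y = X @ np.ones(7)
X_tr, y_tr, X_te, y_te = split_70_30(X, y)
```

With 158 samples, this split yields 110 training and 48 testing records; each trained model is then scored on both subsets.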

5 Discussion and conclusions


In this study, we aimed to develop a prediction model for the CBR value of compacted soils using the GMDH algorithm. A total of 158 data points belonging to compacted soil samples were used for this purpose. In addition, MLR and several ANN models were applied to the same database in order to compare the results and demonstrate the success of the GMDH model in predicting the CBR value. The soil samples forming the database belong to twelve different soil classes, both fine and coarse grained. The inputs used in all models were the gravel content (GC), sand content (SC), fines content (FC), liquid limit (LL), plasticity index (PI), optimum moisture content (OMC) and maximum dry density (MDD) of the compacted soils. The CBR value was selected as the output parameter and was estimated as closely as possible to the actual measured values.

Fig. 9. Comparison between the estimated and measured values of the CBR parameter: (1) regression model for LM; (2) regression model for BR; (3) regression model for SCG.

Many attempts were made to obtain the best CBR prediction performance, with different numbers of layers and different numbers of neurons per hidden layer in the GMDH models, and with different training algorithms (Levenberg-Marquardt, scaled conjugate gradient, Bayesian regularization) in the ANN models. The results obtained with the different architectures of the GMDH and ANN models were compared. The best prediction performance of the GMDH model was achieved with a 7-layered architecture with a maximum of 10 neurons in each hidden layer (R = 0.9783, MSE = 1.69 and RMSE = 1.30) (table 4). The best prediction performance of the ANN models, R = 0.9703, MSE = 2.8593 and RMSE = 1.6911, was obtained with the BR algorithm (table 7). The results obtained by the GMDH algorithm are thus more successful than those of the other ANN models in predicting the CBR value. In addition, because of its forecasting ability in a single cycle, the GMDH model is faster than classic ANN models.
The GMDH algorithm, which has recently been applied to various problems in geotechnical engineering, achieved successful results in predicting the CBR value of compacted soils in this study. Hence, the GMDH model can also be used to estimate other nonlinear soil mechanics parameters similar to the CBR parameter.

Publisher’s Note The EPJ Publishers remain neutral with regard to jurisdictional claims in published maps and institutional
affiliations.

References
1. A. Chegenizadeh, H.R. Nikraz, CBR Test on Reinforced Clay, in The 14th Pan-American Conference on Soil Mechanics
and Geotechnical Engineering (PCSMGE), the 64th Canadian Geotechnical Conference (CGC), Oct 2, Toronto, Ontario,
Canada (Canadian Geotechnical Society, 2011).
2. J.E. Bowles, Engineering Properties of Soils and Their Measurements (McGraw-Hill Book Company, New York, 1970).
3. B. Caglarer, Road construction technique, General Directorate of Highways of the Ministry of Public Works and Settlement,
Publication no. 259, Ankara, Turkey (1986).
4. M. Aytekin, Experimental Soil Mechanics (Technical Publisher, Ankara, 2004) pp. 483–559.
5. TS 5744, In Situ Measurement of Soil Properties in Civil Engineering (Turkish Standards Institute, 1988).
6. M.M. Zumrawi, IACSIT Int. J. Eng. Technol. 6, 439 (2014).
7. W.R. Day, Soil Testing Manual: Procedures, Classification Data, and Sampling Practices (McGraw-Hill Education, 2001).
8. TS 1900-2, Soil Laboratory Experiments in Civil Engineering - Part 2: Determination of Mechanical Properties (Turkish
Standards Institute, Ankara, 2006).
9. M. Zumrawi, Prediction of CBR from index properties of cohesive soils, in Advances in Civil Engineering and Building
Materials, edited by S.-Y. Chang, S.K. Al Bahar, J. Zhao (CRC Press, Boca Raton, 2012) pp. 561–565.
10. W.P.M. Black, Geotechnique 12, 271 (1962).
11. K.B. Agarwal, K.D. Ghanekar, Prediction of CBR from plasticity characteristics of soil, in Proceedings of the 2nd southeast
Asian conference on soil engineering, Singapore, June 11–15 (Asian Institute of Technology, Bangkok, 1970) pp. 571–576.
12. M. Linveh, Transp. Res. Rec. 1219, 56 (1989).
13. D.J. Stephens, J. Civ. Eng. S. Afr. 32, 523 (1990).
14. T. Al-Refeai, A. Al-suhaibani, King Saud U. J. Eng. Sci. 9, 191 (1997).
15. M.W. Kin, California Bearing Ratio Correlation with Soil Index Properties, Master degree Project, Faculty of Civil Engi-
neering, University Technology, Malaysia (2006).
16. C.N.V. Satyanarayana Reddy, K. Pavani, Mechanically stabilized soils-regression equation for CBR evaluation, in Proceedings
of the Indian geotechnical conference, Chennai, India (2006) pp. 731–734.
17. P. Vinod, C. Reena, Highw. Res. J. IRC 1, 89 (2008).
18. S.R. Patel, M.D. Desai, CBR predicted by index properties for alluvial soils of South Gujarat, Dec. 16–18, in Proceedings of
the Indian Geotechnical Conference, India (2010) pp. 79–82.
19. G.V. Ramasubbarao, G. Siva Sankar, Jordan J. Civ. Eng. 7, 354 (2013).
20. M.H. Alawi, M.I. Rajab, Road Mater. Pavement Des. 14, 211 (2013).
21. V. Chandrakar, R.K. Yadav, Int. Res. J. Eng. Technol. 3, 772 (2016).
22. A.O. Samson, Int. J. Sci. Eng. Res. 8, 1460 (2017).
23. F.P. Nejad, M.B. Jaksa, M. Kakhi, B.A. McCabe, Comput. Geotech. 36, 1125 (2009).
24. J.A. Abdalla, M.F. Attom, R. Hawileh, Environ. Earth Sci. 73, 5463 (2015).
25. M.J. Sulewska, Comput. Assist. Mech. Eng. Sci. 18, 231 (2011).
26. Z. Chik, Q.A. Aljanabi, A. Kasa, M.R. Taha, Arab. J. Geosci. 7, 4877 (2014).
27. F. Saboya, M.G. Alves, W.D. Pinto, Eng. Geol. 86, 211 (2006).
28. W. Li, S. Mei, S. Zai, S. Zhao, X. Liang, Int. J. Rock Mech. Mining Sci. 43, 503 (2006).
29. H.J. Oh, B. Pradhan, Comput. Geosci. 37, 1264 (2011).
30. H. Jalalifara, S. Mojedifar, A.A. Sahebi, H. Nezamabadi-pour, Comput. Geotech. 38, 783 (2011).
31. D. Padmini, K. Ilamparuthi, K.P. Sudheer, Comput. Geotech. 35, 33 (2008).
32. S. Levasseur, Y. Malecot, M. Boulon, E. Flavigny, Int. J. Numer. Anal. Methods Geomech. 32, 189 (2008).
33. P. McCombie, P. Wilkinson, Comput. Geotech. 29, 699 (2002).
34. P. Samui, Comput. Geotech. 35, 419 (2008).
35. P. Samui, D.P. Kothari, Sci. Iran. 18, 53 (2011).
36. B. Yildirim, O. Gunaydin, Expert Syst. Appl. 38, 6381 (2011).
37. T. Taskiran, Adv. Eng. Softw. 41, 886 (2010).
38. C. Venkatasubramanian, G. Dhinakaran, Int. J. Civ. Struct. Eng. 2, 605 (2011).
39. S. Bhatt, P.K. Jain, Am. Int. J. Res. Sci. Technol. Eng. Math. 8, 156 (2014).
40. A.G. Ivakhnenko, Sov. Autom. Control Avtomot. 9, 21 (1976).
41. A. Kordnaeij, F. Kalantary, B. Kordtabar, H. Mola-Abasi, Soils Found. 55, 1335 (2015).
42. A. Ardakani, A. Kordnaeij, Eur. J. Environ. Civ. Eng. 23, 449 (2019).
43. M. Hassanlourad, A. Ardakani, A. Kordnaeij, H. Mola-Abasi, Eur. Phys. J. Plus 132, 357 (2017).
44. R.A. Jirdehi, H.T. Mamoudan, H.H. Sarkaleh, Appl. Appl. Math. 9, 528 (2014).
45. N.R. Draper, H. Smith, Applied Regression Analysis, 2nd ed. (John Wiley & Sons Inc, NY, 1981).
46. S. Haykin, Neural Network: A Comprehensive Foundation (MacMillan College Publishing Co., New York, 1994).
47. V.A. Vissikirsky, V.S. Stepashko, I.K. Kalavrouziotis, P.A. Drakatos, Instrum. Sci. Technol. 33, 229 (2005).
48. H. Ghanadzadeh, M. Ganji, S. Fallahi, Appl. Math. Model. 36, 4096 (2012).
49. W. Zhu, J. Wang, W. Zhang, D. Sun, Atmos. Environ. 51, 29 (2012).
