Shear Angle Data Consolidation Tests

Formulation of soil angle of shearing resistance using a hybrid GP and OLS method
Article in Engineering With Computers · January 2011

DOI: 10.1007/s00366-011-0242-x

Formulation of Soil Angle of Shearing Resistance Using a Hybrid GP and OLS Method
S.M. Mousavi (1), A.H. Alavi (2), A. Mollahasani (3), A.H. Gandomi (4), M. Arab Esmaeili (5)

(1) Department of Civil Engineering, Sharif University of Technology, Tehran, Iran
(2) School of Civil Engineering, Iran University of Science and Technology, Tehran, Iran
(3) Department of Civil, Environmental and Material Engineering (DICAM), University of Bologna, Bologna, Italy
(4) Department of Civil Engineering, University of Akron, Akron, OH 44325-3905, USA
(5) Department of Civil Engineering, Islamic Azad University, Shahrood Branch, Shahrood, Iran
Abstract
In the present study, a prediction model was derived for the effective angle of shearing resistance (φ') of soils
using a novel hybrid method coupling genetic programming (GP) and the orthogonal least squares (OLS)
algorithm. The proposed nonlinear model relates φ' to the basic soil physical properties. A comprehensive
experimental database of consolidated-drained triaxial tests was used to develop the model. Traditional GP and
least squares regression analyses were performed to benchmark the GP/OLS model against classical approaches.
The validity of the model was verified using a part of the laboratory data that was not involved in the calibration
process. The correlation coefficient (R), root mean squared error (RMSE) and mean
absolute percent error (MAPE) were used to evaluate the performance of the models. Sensitivity and parametric
analyses were conducted and discussed. The GP/OLS-based formula precisely estimates the φ' values for a
number of soil samples. The proposed model provides a better prediction performance than the traditional GP
and regression models.
Keywords: Effective angle of shearing resistance; Soil physical properties; Genetic programming; Orthogonal
least squares; Hybridization.
1 Introduction
A major property of soil is its ability to resist sliding along internal surfaces within a mass. The soil shearing
resistance plays an important role in the stability of structures built on it. In general, the Mohr-Coulomb theory
is used to represent the shear strength of geotechnical materials. This theory indicates that the soil shear strength
varies linearly with the applied stress through two shear strength components known as the cohesion intercept
and angle of shearing resistance [1]. The tangent to the Mohr-Coulomb failure envelopes is represented by its
slope and intercept. The slope expressed in degrees is the angle of shearing resistance and the intercept is
cohesion [2, 3]. The cohesion intercept and angle of shearing resistance are treated as constants over the range
of normal stresses. The values of these empirical parameters for any soil depend upon several factors such as
the soil textural properties, past history of soil, initial state of soil, permeability characteristics of soil and
conditions of drainage allowed to take place during the test [1, 3]. Figs. 1(a) and (b) show the Mohr circles and
failure envelopes in terms of the total and effective stresses, respectively. If the cohesion intercept and angle of
shearing resistance are determined using the total stresses (Fig. 1(a)), they are termed the total or undrained
cohesion intercept (c) and angle of shearing resistance (φ). The effective stress is the difference between the
total stress and the pore water pressure. If the pore water pressures are measured during the test, the
effective circles can be plotted as shown in Fig. 1(b) and the effective strength parameters (c' and φ') are
obtained.
Fig. 1 Mohr circles and failure envelopes in terms of total and effective stresses [3]
Determination of φ' is an important consideration in the design of geotechnical structures. This key parameter can
be determined using field or laboratory tests. The triaxial compression and direct shear tests are the most
common tests for determining the φ' values in the laboratory. The testing procedures of triaxial and direct shear
tests have been standardized by ASTM WK3821 [4] and ASTM 6528-00 [5], respectively. The triaxial and
direct shear tests are more suitable for clayey and sandy soils, respectively. The tests used in the field are the vane
shear test or other indirect methods [3, 6].
Since experimental determination of φ' is cumbersome and costly, numerical models are developed to estimate
the φ' values. Despite the multivariable dependency of soils, the existing correlations are developed on the basis
of only one soil index property [1]. Further, simplifying assumptions are commonly incorporated into the
development of the statistical and numerical methods that may lead to very large errors [7-11].
In recent years, new soft computing methods such as artificial neural networks (ANNs) have been successfully
applied to behavioral modeling of many geotechnical engineering problems [11-13]. The inability of ANNs
to produce simplified prediction equations can create difficulty in practical circumstances. Furthermore, the
structure of a neural network should be identified a priori [14]. An alternative approach to overcome these
problems is known as genetic programming (GP) [15, 16]. GP is generally a supervised machine learning
technique that searches a program space instead of a data space. Many researchers have employed GP and its
variants to derive simple prediction equations for civil engineering problems [17-20]. Recently, Kayadelen et al.
[21] used the ANN and GP-based methods to predict the φ' value of soils.
Orthogonal least squares (OLS) algorithm [22, 23] is an effective algorithm to designate which terms are
significant in a linear-in-parameters model [24, 25]. Madar et al. [26] coupled GP and OLS to make a hybrid
algorithm with better efficiency. Introducing this strategy into the GP process resulted in obtaining more robust
and interpretable models [25, 26]. A few studies with the specific objective of applying the
GP/OLS method to civil engineering problems have recently been conducted by Gandomi et al. [24] and
Gandomi et al. [25].
The purpose of the current research is to utilize the hybrid GP/OLS technique to generate a linear-in-parameters
prediction model for φ'. The proposed model relates φ' to the coarse and fine-grained contents, liquid limit and
bulk density. The developed model can reliably be used for routine design practice in that it was derived from
tests with a wide range of aggregate gradation and soil index properties.
2 Genetic Programming
GP creates computer programs to solve a problem using the principle of Darwinian natural selection. A
significant advantage of GP over other soft computing techniques is its ability to generate practical prediction
equations. Development of GP in the late 1980s was a result of experiments of Koza [13] on symbolic
regression. GP is an extension of genetic algorithms (GAs). The classical GP technique is referred to as tree-
based GP [25], in which a random population of individuals (trees) is created to achieve high diversity. A
population member in GP is a hierarchically structured tree comprising functions and terminals. The functions
and terminals are selected from a set of functions and a set of terminals. The functions and terminals are chosen
at random and constructed together to form a computer model in a tree-like structure [25]. A simple tree
representation of a GP model is shown in Fig. 2.
Fig. 2 The tree representation of a GP model ((√(x - 1))
Once a population of models has been created at random, the GP algorithm evaluates the individuals and selects
individuals for reproduction. Thereafter, GP generates new individuals by mutation, crossover, and direct
reproduction [15, 25]. The crossover operation selects a point on a branch of each program at random. Then, a set
of terminals and/or functions from each program is swapped to generate two new programs (see Fig. 3).
During the mutation process, the algorithm occasionally selects a function or terminal from a model at random
and mutates it (see Fig. 4). In the following subsections, the coupled algorithm of GP and OLS, GP/OLS, is
described.
Fig. 3 Crossover operation in GP
Fig. 4 Mutation operation in GP
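The tree representation and the two genetic operators described above can be illustrated with a minimal sketch. It assumes a nested-tuple encoding of trees, a small hypothetical function set, and a protected square root; none of these choices is taken from the GP implementation used in this study.

```python
import math
import random

# Hypothetical function set: name -> (arity, implementation).
FUNCTIONS = {
    "+": (2, lambda a, b: a + b),
    "-": (2, lambda a, b: a - b),
    "*": (2, lambda a, b: a * b),
    "sqrt": (1, lambda a: math.sqrt(abs(a))),  # protected sqrt (assumption)
}

def evaluate(tree, env):
    """Evaluate a tree; a node is (function, child, ...), a variable name, or a number."""
    if isinstance(tree, tuple):
        _, fn = FUNCTIONS[tree[0]]
        return fn(*(evaluate(child, env) for child in tree[1:]))
    return env[tree] if isinstance(tree, str) else tree

def paths(tree, prefix=()):
    """Yield the path (tuple of child indices) to every node of the tree."""
    yield prefix
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from paths(child, prefix + (i,))

def get(tree, path):
    """Return the subtree found at the given path."""
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, new):
    """Return a copy of the tree with the subtree at `path` replaced by `new`."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], new),) + tree[i + 1:]

def crossover(a, b, rng):
    """Pick one random point in each parent and swap the subtrees below them."""
    pa = rng.choice(list(paths(a)))
    pb = rng.choice(list(paths(b)))
    return replace(a, pa, get(b, pb)), replace(b, pb, get(a, pa))

def mutate(tree, terminals, rng):
    """Replace one random function by a same-arity function, or one terminal by another."""
    p = rng.choice(list(paths(tree)))
    node = get(tree, p)
    if isinstance(node, tuple):
        arity = FUNCTIONS[node[0]][0]
        same_arity = [f for f, (n, _) in FUNCTIONS.items() if n == arity]
        new = (rng.choice(same_arity),) + node[1:]
    else:
        new = rng.choice(terminals)
    return replace(tree, p, new)

rng = random.Random(0)
parent_a = ("+", ("*", "x0", "x1"), "x2")
parent_b = ("-", "x1", ("sqrt", "x0"))
child_a, child_b = crossover(parent_a, parent_b, rng)
mutant = mutate(parent_a, ["x0", "x1", "x2"], rng)
```

Trees are kept immutable here so that the parents survive crossover unchanged, which mirrors the direct reproduction of unmodified individuals.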
2.1 Genetic Programming for Linear-in-parameters Models
In general, GP creates both nonlinear and linear-in-parameters models. In order to avoid nonlinear-in-parameters
models, the parameters must be removed from the set of terminals. That is, it must contain only variables:
T = {x0(k), ..., xi(k)}, where xi(k) denotes the ith regressor variable. Hence, a population member represents only
the Fi nonlinear functions [25, 27]. The parameters are assigned to the model after “extracting” the Fi function
terms from the tree, and are determined using a least squares (LS) algorithm [28]. A simple technique for the
decomposition of the tree into function terms can be used. The subtrees, representing the Fi function terms, are
determined by decomposing the tree starting from the root as far as reaching nonlinear nodes (nodes which are
not “+” or “-”). As shown in Fig. 5, the root node is a “+” operator; therefore, it is possible to decompose the tree
into two subtrees, “A” and “B”. The root node of the “A” tree is again a linear operator; therefore, it can be
decomposed into the “C” and “D” trees. As the root node of the “B” tree is a nonlinear node (/), it cannot be
decomposed. The root nodes of the “C” and “D” trees are also nonlinear. Consequently, the final decomposition
procedure results in three subtrees: “B”, “C”, and “D”. According to the results of the decomposition, it is
possible to assign parameters to the functional terms represented by the obtained subtrees. The resulting
linear-in-parameters model for this example is y = p0 + p1(x2 + x1)/x0 + p2x0 + p3x1 [25].
Fig. 5 Decomposition of a tree to function terms [29]
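The decomposition procedure can be sketched in a few lines. The nested-tuple tree encoding and the helper names below are illustrative assumptions; the tree is the example discussed above, with the “+” inside the “B” subtree left intact because it lies below a nonlinear node.

```python
def decompose(tree):
    """Split a tree into function terms by descending through '+' and '-' nodes only."""
    if isinstance(tree, tuple) and tree[0] in ("+", "-"):
        _, left, right = tree
        return decompose(left) + decompose(right)
    return [tree]  # a nonlinear node or a terminal: one function term

def to_string(tree):
    """Render a binary tree as an infix expression."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return f"({to_string(left)} {op} {to_string(right)})"
    return str(tree)

# Root "+": subtree A = x0 + x1 (split further into C and D), subtree B = (x2 + x1) / x0.
tree = ("+", ("+", "x0", "x1"), ("/", ("+", "x2", "x1"), "x0"))
terms = decompose(tree)

# One parameter per function term plus a bias p0; the term ordering is arbitrary.
model = "y = p0 + " + " + ".join(
    f"p{i}*{to_string(term)}" for i, term in enumerate(terms, start=1)
)
print(model)
```

A “-” node is treated like a “+” node in this sketch, since the sign of a term can be absorbed into its fitted parameter.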
GP can be used for selecting from special model classes such as polynomial models. For this aim, the set of
operators must be restricted and some simple syntactic rules must be introduced. For instance, if the set of
operators is defined as F = {×, +} and there is a syntactic rule that changes the internal nodes below a
“×”-type internal node to “×”-type nodes, GP will generate only polynomial models [15, 25, 29].
2.2 Orthogonal Least Squares Algorithm
The great advantage of using linear-in-parameters models is that the LS method can be used to identify the
model parameters. This is much less computationally demanding than nonlinear optimization algorithms,
because the optimal parameter vector $p = [p_1, \ldots, p_M]^T$ can be calculated analytically [25]:
$$ p = (U^T U)^{-1} U^T y \tag{1} $$
in which $y = [y(1), \ldots, y(N)]^T$ is the measured output vector and the regression matrix U is:
$$ U = \begin{bmatrix} U_1(x(1)) & \cdots & U_M(x(1)) \\ \vdots & \ddots & \vdots \\ U_1(x(N)) & \cdots & U_M(x(N)) \end{bmatrix} \tag{2} $$
The OLS algorithm [23, 24] is an efficient algorithm for determining which terms are significant in a linear-in-
parameters model. The OLS technique introduces the error reduction ratio (err), which is a measure of the
decrease in the variance of output by a given term. The matrix form corresponding to the linear-in-parameters
model is [25]:
y = Up+e (3)
where the U is the regression matrix, p is the parameter vector, and e is the error vector. The OLS method
transforms the columns of the U matrix into a set of orthogonal basis vectors to inspect the individual
contributions of each term [30]. It is assumed in the OLS algorithm that the regression matrix U can be
orthogonally decomposed as U = WA, where A is an M × M upper triangular matrix (i.e., Aij = 0 if i > j) and W is
an N × M matrix with orthogonal columns in the sense that $W^T W = D$ is a diagonal matrix (N is the length of the
y vector and M is the number of regressors). After this decomposition, the OLS auxiliary parameter vector g can
be calculated as [25]:
$$ g = D^{-1} W^T y \tag{4} $$
where $g_i$ represents the corresponding element of the OLS solution vector. The output variance $(y^T y)/N$ can be
described as:
$$ y^T y = \sum_{i=1}^{M} g_i^2 \, w_i^T w_i + e^T e \tag{5} $$
Therefore, the error reduction ratio $[err]_i$ of the $U_i$ term can be expressed as:
$$ [err]_i = \frac{g_i^2 \, w_i^T w_i}{y^T y} \tag{6} $$
This ratio offers a simple means of ordering and selecting the terms of a linear-in-parameters model on the
basis of their contribution to the performance of the model.
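As an illustration of Eqs. (4) to (6), the following sketch orthogonalizes the regression matrix with classical Gram-Schmidt and returns the parameters together with the error reduction ratios. The function name and the choice of Gram-Schmidt (rather than another orthogonalization scheme) are assumptions made for this example.

```python
import numpy as np

def ols_error_reduction(U, y):
    """Decompose U = W A by classical Gram-Schmidt and return (p, err)."""
    U = np.asarray(U, dtype=float)
    y = np.asarray(y, dtype=float)
    N, M = U.shape
    W = np.zeros((N, M))
    A = np.eye(M)  # unit upper triangular matrix
    for j in range(M):
        w = U[:, j].copy()
        for i in range(j):
            A[i, j] = (W[:, i] @ U[:, j]) / (W[:, i] @ W[:, i])
            w -= A[i, j] * W[:, i]
        W[:, j] = w
    d = np.einsum("ij,ij->j", W, W)  # diagonal of D = W^T W
    g = (W.T @ y) / d                # Eq. (4)
    err = g**2 * d / (y @ y)         # Eq. (6)
    p = np.linalg.solve(A, g)        # original parameters, from A p = g
    return p, err

# Hypothetical data: three regressors with known coefficients and small noise.
rng = np.random.default_rng(0)
U = rng.normal(size=(50, 3))
y = U @ np.array([2.0, -1.0, 0.5]) + 0.01 * rng.normal(size=50)
p, err = ols_error_reduction(U, y)
```

Because the columns of W are orthogonal, the error reduction ratios and the normalized residual energy sum to one, which is a convenient check on an implementation. Note that the ratios depend on the order in which the columns are orthogonalized.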
2.3 Hybrid Genetic Programming-Orthogonal Least Squares
The application of OLS in the GP algorithm leads to significant improvements in the performance of GP [25].
The main feature of this hybrid approach is to transform the trees to simpler trees which are more transparent,
but their accuracies are close to the original trees. In this coupled algorithm, GP generates a lot of potential
solutions in the form of a tree structure during the GP operation. These trees may have better and worse terms
(subtrees) that contribute more or less to the accuracy of the model represented by the tree. OLS is used to
estimate the contribution of the branches of the tree to the accuracy of the model; that is, using OLS, one
can identify the less significant terms in a linear regression problem. According to this strategy, terms (subtrees)
having the smallest error reduction ratio are eliminated from the tree [25, 27]. This “tree pruning” approach is
realized in every fitness evaluation before the calculation of the fitness values of the trees. Since GP works with
the tree structure, a further goal is to preserve the original structure of the trees as far as possible. The
GP/OLS method always guarantees that the elimination of one or more function terms of the model can be done
by pruning the corresponding subtrees, so there is no need for structural rearrangement of the tree after this
operation. The way the GP/OLS method works is demonstrated in Fig. 6. Assume that the
function which must be identified is $y(x) = 0.8 (u_{x-1})^2 + 1.2 y_{x-1} - 0.9 y_{x-2} - 0.2$. As can be seen in Fig. 6, the GP
algorithm finds a solution with four terms: $(u_{x-1})^2$, $y_{x-1}$, $y_{x-2}$, and $u_{x-1} \times u_{x-2}$. Based on the OLS algorithm,
the subtree with the least error reduction ratio ($F_4 = u_{x-1} \times u_{x-2}$) is eliminated from the tree. Subsequently, the error
reduction ratios and mean square error values (and model parameters) are calculated again. The new model
(after pruning) may have a higher mean square error but it has a more adequate structure [25].
Fig. 6 Pruning of a tree with OLS [25]
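The pruning example can be reproduced numerically under a few assumptions: a noise-free simulation of the function above driven by a hypothetical uniform random input, a bias column that is never pruned, and a QR factorization in place of an explicit Gram-Schmidt loop (with orthonormal columns, each error reduction ratio reduces to a squared projection of y divided by y'y).

```python
import numpy as np

# Simulate y(x) = 0.8 u(x-1)^2 + 1.2 y(x-1) - 0.9 y(x-2) - 0.2 with a hypothetical input u.
rng = np.random.default_rng(1)
n = 300
u = rng.uniform(0.0, 1.0, n)
y = np.zeros(n)
for x in range(2, n):
    y[x] = 0.8 * u[x - 1] ** 2 + 1.2 * y[x - 1] - 0.9 * y[x - 2] - 0.2

# Lagged signals aligned with the target y(x), x = 2, ..., n-1.
target = y[2:]
u1, u2 = u[1:-1], u[:-2]
y1, y2 = y[1:-1], y[:-2]

# Candidate terms found by GP, preceded by a bias column.
names = ["1", "u1^2", "y1", "y2", "u1*u2"]
U = np.column_stack([np.ones_like(u1), u1**2, y1, y2, u1 * u2])

def error_reduction_ratios(U, y):
    """err_i = (q_i^T y)^2 / (y^T y) for the orthonormal columns q_i of the QR factor."""
    Q, _ = np.linalg.qr(U)
    return (Q.T @ y) ** 2 / (y @ y)

err = error_reduction_ratios(U, target)
worst = 1 + int(np.argmin(err[1:]))  # the bias column is excluded from pruning
keep = [j for j in range(U.shape[1]) if j != worst]

# Refit the pruned model and recompute its mean square error.
p, *_ = np.linalg.lstsq(U[:, keep], target, rcond=None)
mse = float(np.mean((target - U[:, keep] @ p) ** 2))
print("pruned term:", names[worst])
```

In this idealized setting the redundant term contributes essentially nothing once the true terms have been accounted for, so it is the one removed, and the refit recovers the generating coefficients.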
3 Modeling of Effective Angle of Shearing Resistance
The angle of shearing resistance represents the interlocking among the soil particles. Soils with high
plasticity, such as clayey soils, have a lower angle of shearing resistance and higher cohesion. Conversely, as the
soil grain size increases, as in sands, the soil internal friction angle increases and its cohesion decreases.
Therefore, the main parameters which affect the soil strength parameters are the soil type, soil
plasticity, and soil density. This study aimed at obtaining meaningful relationships between φ' and the
influencing parameters using the GP/OLS approach. The most important factors representing the φ' behavior
were selected on the basis of a literature review [3, 6, 21, 31] and after a trial study. The formulation of φ' (°)
was considered to be as follows:
$$ \varphi' = f(FC, CC, LL, \gamma) \tag{7} $$
where,
FC (%): Fine-grained content
CC (%): Coarse-grained content
LL (%): Liquid limit
γ (g/cm³): Soil bulk density
The significant influence of these parameters in determining φ' is well understood. The best GP/OLS model was
chosen on the basis of a multi-objective strategy as below:
i. The simplicity of the model, although this was not a predominant factor.
ii. Providing the best fitness value on the training set of data.
3.1 Parameters for Measuring Performance
Correlation coefficient (R), root mean squared error (RMSE) and mean absolute percent error (MAPE) were
used to evaluate the performance of the models. R, RMSE and MAPE were calculated using the following
equations:
$$ R = \frac{\sum_{i=1}^{n} (h_i - \bar{h})(t_i - \bar{t})}{\sqrt{\sum_{i=1}^{n} (h_i - \bar{h})^2 \sum_{i=1}^{n} (t_i - \bar{t})^2}} \tag{8} $$
$$ RMSE = \sqrt{\frac{\sum_{i=1}^{n} (h_i - t_i)^2}{n}} \times 100 \tag{9} $$
$$ MAPE = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{h_i - t_i}{h_i} \right| \times 100 \tag{10} $$
where $h_i$ and $t_i$ are, respectively, the actual and predicted output values for the ith output, $\bar{h}$ and $\bar{t}$ are,
respectively, the average of the actual and predicted outputs, and n is the number of samples.
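For readers who wish to reproduce these measures, a minimal sketch is given below. The function names and the sample values are hypothetical. The factor of 100 in the RMSE is exposed as a `scale` argument, since the unscaled root mean squared error (`scale=1.0`) is the more common convention; treating the factor as optional is an assumption of this example.

```python
import numpy as np

def correlation_r(h, t):
    """Correlation coefficient between actual (h) and predicted (t) values, Eq. (8)."""
    h, t = np.asarray(h, dtype=float), np.asarray(t, dtype=float)
    dh, dt = h - h.mean(), t - t.mean()
    return float((dh @ dt) / np.sqrt((dh @ dh) * (dt @ dt)))

def rmse(h, t, scale=100.0):
    """Root mean squared error, Eq. (9); use scale=1.0 for the unscaled value."""
    h, t = np.asarray(h, dtype=float), np.asarray(t, dtype=float)
    return float(np.sqrt(np.mean((h - t) ** 2)) * scale)

def mape(h, t):
    """Mean absolute percent error, Eq. (10); actual values must be nonzero."""
    h, t = np.asarray(h, dtype=float), np.asarray(t, dtype=float)
    return float(np.mean(np.abs((h - t) / h)) * 100.0)

# Hypothetical actual and predicted angles (degrees), for illustration only.
actual = [30.0, 32.0, 35.0]
predicted = [31.0, 32.0, 33.0]
r = correlation_r(actual, predicted)
unscaled_rmse = rmse(actual, predicted, scale=1.0)
percent_error = mape(actual, predicted)
```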
3.2 Experimental Database
Within the scope of this study, a series of consolidated drained (CD) triaxial tests were performed in accordance
with ASTM WK3821 [4] on undisturbed soil samples. The samples were taken from Khorasan and Khouzestan
provinces in Iran. The most versatile test to measure shear strength parameters of soils is the triaxial test. In this
test, a cylinder-shaped specimen, sealed by a rubber membrane, is submitted to an axisymmetric water pressure
and then to an increasing axial loading. Herein, typical dimensions of specimens were 38 mm in diameter and
76 mm in height. The undisturbed samples were taken from drilling boreholes with a Shelby tube (thin-walled
metal) in accordance with the procedures given in ASTM 1587 [32]. All the triaxial tests were carried out using
samples that were taken at depths ranging from 30 m to 15 m and contained no gravel or larger particles.
Before conducting the test, the soil samples were saturated, such that the pore pressure response to an undrained
isotropic stress increment gave a value of B ≥ 0.95. For this purpose, both cell pressure and back pressure
(saturation water pressure) were applied and then simultaneously increased during saturation. The cell pressure
was always kept 10 kPa above the back pressure so that accidental swelling of the samples or consolidation
resulting from a high pressure on the specimen was prevented. At the end of the saturation process, specimens
were isotropically consolidated under constant vertical stresses and certain confining pressures. When 95%
excess pore pressure dissipation was achieved, the consolidation stage was finished. After consolidation, the
specimens were vertically loaded at a strain rate of 0.35 mm/min while pore water was allowed to drain out.
The loading was continued until the maximum deviator stress was achieved and axial strain was at least 20%.
Each soil was tested at three confining pressure levels. Approximately ten days were required to complete
each test, including the saturation and consolidation processes. To develop generalized correlations, a comprehensive
database of previously published triaxial tests gathered by Kayadelen et al. [21] and Kayadelen [33] was further
added to the available experimental results. The collected database together with data from the present study (13
data sets) consists of a total of 135 data sets. The complete list of the test results and soil properties are given in
Appendix A (Table 6).
3.3 Data Preprocessing
Some of the soil property variables are fundamentally interdependent. The first step in the analysis of
interdependency of the data is to make a careful study of what it is that these variables are measuring, noting
any highly correlated pairs. High positive or negative correlation coefficients between the pairs may lead to
poor performance of the models and difficulty in interpreting the effects of the explanatory variables on the
response. This interdependency can cause problems in analysis as it will tend to exaggerate the strength of
relationships between variables. This is a simple case commonly known as the problem of multicollinearity
[34]. Thus, the correlation coefficients between all possible pairs were determined and shown in Table 1.
Table 1 Correlation coefficients between all pairs of the explanatory variables
As can be seen in this table, there is a high negative correlation between FC and CC. This is
expected since CC is calculated by subtracting FC from 100. Since there was no advantage in having both
variables in the modeling (one can represent the other), it was decided to remove the correlated
parameters in order to maximize the reliability of the final models. Finally, on the basis of a trial study, the ratio
of FC to CC (FC/CC) was used as the input parameter. The descriptive statistics of the data used in this study
are given in Table 2. To visualize the distribution of the samples, the data are also presented by frequency
histograms (Fig. 7).
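The interdependency check can be sketched as follows. The data are randomly generated placeholders (not the database of this study), and the 0.9 threshold used to flag a pair is an arbitrary illustrative choice.

```python
import numpy as np

# Hypothetical soil properties, for illustration only.
rng = np.random.default_rng(0)
fc = rng.uniform(10.0, 95.0, 135)    # fine-grained content (%)
cc = 100.0 - fc                      # coarse-grained content (%), by definition
ll = rng.uniform(20.0, 100.0, 135)   # liquid limit (%)
gamma = rng.uniform(1.4, 2.3, 135)   # bulk density

names = ["FC", "CC", "LL", "gamma"]
X = np.column_stack([fc, cc, ll, gamma])
corr = np.corrcoef(X, rowvar=False)  # pairwise correlation coefficients

# Flag highly correlated pairs (threshold is an assumption of this sketch).
flagged = [
    (names[i], names[j])
    for i in range(len(names))
    for j in range(i + 1, len(names))
    if abs(corr[i, j]) > 0.9
]

# Replace the two interdependent variables by their ratio.
fc_cc = fc / cc
```

Because CC is an exact linear function of FC, their correlation coefficient is −1, so keeping both would add no information to the model.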
Table 2 Descriptive statistics of the variables used in the model development
Fig. 7 Histograms of the variables used in the model development
For the GP/OLS analysis, the data sets were randomly divided into training and testing subsets. The training
data were used for learning (genetic evolution). The testing data were used to measure the performance of the
program evolved by GP/OLS on data that played no role in building the model. In order to obtain a consistent
data division, several combinations of the training and testing sets were considered. The selection was such that
the maximum, minimum, mean and standard deviation of parameters were consistent in the training and testing
data sets. Of the 135 data sets, 108 were used for training and 27 for testing the
generalization capability of the models.
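One way to automate such a data division is sketched below: several random splits are generated and the one whose training and testing statistics agree most closely is kept. The scoring rule, the number of trials and the function names are assumptions of this example, not the procedure actually followed in the study.

```python
import numpy as np

def split_score(X, train_idx, test_idx):
    """Sum of range-normalized differences in min, max, mean and std between subsets."""
    def stats(A):
        return np.vstack([A.min(axis=0), A.max(axis=0), A.mean(axis=0), A.std(axis=0)])
    span = X.max(axis=0) - X.min(axis=0)
    return float(np.abs((stats(X[train_idx]) - stats(X[test_idx])) / span).sum())

def consistent_split(X, n_test, n_trials=200, seed=0):
    """Try several random divisions and keep the most statistically consistent one."""
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(n_trials):
        perm = rng.permutation(len(X))
        test_idx, train_idx = perm[:n_test], perm[n_test:]
        score = split_score(X, train_idx, test_idx)
        if best is None or score < best[0]:
            best = (score, train_idx, test_idx)
    return best[1], best[2]

# Hypothetical data with 135 records and three variables.
X = np.random.default_rng(42).normal(size=(135, 3))
train_idx, test_idx = consistent_split(X, n_test=27)
```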
Although normalization is not strictly necessary in the GP-based analysis, better results are usually reached after
normalizing the variables. Further, normalization speeds up the process [25]. These benefits are mainly due to the
unification of the variables, regardless of their range of variation. Thus, both input and output variables
were normalized in this study. After checking several normalization methods [35, 36], the following method
was used to normalize the variables to a range of [L, U]:
$$ X_n = aX + b \tag{11} $$
where
a = (U − L)/(Xmax − Xmin) and b = U − aXmax, in which Xmax and Xmin are the maximum and minimum values of the
variable X, and Xn is the normalized value. In the present study, L = 0.05 and U = 0.95.
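A short sketch of Eq. (11) and its inverse follows. The function names and the sample range (20 to 100) are hypothetical; only the bounds L = 0.05 and U = 0.95 are taken from the text.

```python
def normalization_coefficients(x_min, x_max, lower=0.05, upper=0.95):
    """Return (a, b) of Eq. (11) so that x_min maps to `lower` and x_max to `upper`."""
    a = (upper - lower) / (x_max - x_min)
    b = upper - a * x_max
    return a, b

def normalize(x, a, b):
    """Apply Eq. (11): Xn = a X + b."""
    return a * x + b

def denormalize(x_n, a, b):
    """Invert Eq. (11) to recover the original value."""
    return (x_n - b) / a

# Hypothetical variable ranging from 20 to 100.
a, b = normalization_coefficients(20.0, 100.0)
x_n = normalize(42.0, a, b)
x_back = denormalize(x_n, a, b)
```

The inverse mapping is what converts a normalized model output back to physical units.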
3.4 Model Development Using GP/OLS
The available database was used for generating a GP/OLS prediction model relating φ' to FC/CC, LL, and γ.
Various parameters are involved in the GP/OLS predictive algorithm. The parameter settings for the GP/OLS
algorithm are shown in Table 3. In this study, basic arithmetic operators were utilized to get the optimum
GP/OLS model. The number of programs in the population that GP/OLS will evolve is set by the population
size. A run will take longer with a larger population size. The number of generations sets the number of levels
the algorithm will use before the run terminates. The proper population size and number of generations depend on
the number of possible solutions and the complexity of the problem. A relatively large number of generations was
tested to find models with minimum error. The program was run until the runs terminated automatically. The
values of the other involved parameters were selected based on some previously suggested values [24, 25, 37]
and also after a trial and error approach.
Table 3 Parameter settings for the GP/OLS algorithm
3.4.1 GP/OLS-Based Formulation of Angle of Shearing Resistance
The optimal formulation of the effective angle of shearing resistance (φ') is as given below:
   FC   
 / OLS     27.47 - 0.0589 
GP  - LLn  + 0.8046γ n + 0.77 
2
(12)

   CC  n  
where,
(FC/CC)n = 0.0081(FC/CC)+0.0986
LLn = 0.0105LL-0.1316
γn = 0.9558γ-1.2677
A comparison of the experimental φ' values and those predicted by GP/OLS is shown in Fig. 8.
Fig. 8 Experimental versus predicted φ' values using the GP/OLS model: (a) training data, (b) testing data
3.5 Model Development Using Traditional GP
A tree-based GP analysis was performed to compare the hybrid GP and OLS technique (GP/OLS) with a
classical GP approach. The tree-based GP model was developed using the same variables and same data sets as
the GP/OLS model. Various parameters involved in the traditional GP predictive algorithm are shown in Table
4. The parameters were selected considering some previously suggested values [17] and also after a trial and
error approach. A large number of generations were tested to find a model with minimum error. A tree-based
GP software, GPLAB [38] in conjunction with subroutines coded in MATLAB was used in this study.
Table 4 Parameter settings for the traditional GP algorithm
3.5.1 Traditional GP-Based Formulation of Angle of Shearing Resistance
The best prediction model for φ' obtained by the traditional GP algorithm is given below:
   
l GP    27.47  n -  n  LLn n - LLn  n - 1
  FC 
Traditiona
   - LLn - γn +  n  + 0.55455 
 2
(13)
   CC  n  
in which,
(FC/CC)n = 0.0081(FC/CC)+0.0986
LLn = 0.0105LL-0.1316
γn = 0.9558γ-1.2677
A comparison of the experimental φ' values and those predicted by traditional GP is shown in Fig. 9.
Fig. 9 Experimental versus predicted φ' values using the traditional GP model: (a) training data, (b) testing data
3.6 Model Development Using Regression Analysis
A multivariable least squares regression (MLSR) [39] analysis was performed to assess the
predictive power of the best GP/OLS model in comparison with a classical statistical approach. The method of
LSR is extensively used in regression analysis: under certain
assumptions, it has attractive statistical properties that have made it one of the most powerful
and popular methods of regression analysis. The major task was to determine the MLSR-based equation
connecting the input variables to the output variable as:
$$ \varphi' = \alpha_1 \frac{FC}{CC} + \alpha_2 LL + \alpha_3 \gamma + \alpha_4 \tag{14} $$
where α denotes the coefficient vector. LSR minimizes the sum-of-squared residuals for each equation, accounting
for any cross-equation restrictions on the parameters of the system. If there are no such restrictions, this
technique is identical to estimating each equation using single-equation ordinary least squares. The MLSR
model was trained and tested using the same data sets previously considered for developing the GP/OLS model.
Eviews software package [40] was used to perform the regression analysis.
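Although the regression in this study was carried out in Eviews, the fit of Eq. (14) can be sketched with an ordinary least squares solver. The example below uses NumPy and synthetic data generated from arbitrary hypothetical coefficients; it is not a reproduction of the analysis reported here.

```python
import numpy as np

def fit_mlsr(fc_cc, ll, gamma, phi):
    """Least squares estimate of (alpha1, alpha2, alpha3, alpha4) in Eq. (14)."""
    fc_cc, ll, gamma, phi = (np.asarray(v, dtype=float) for v in (fc_cc, ll, gamma, phi))
    X = np.column_stack([fc_cc, ll, gamma, np.ones_like(phi)])  # last column: intercept
    alpha, *_ = np.linalg.lstsq(X, phi, rcond=None)
    return alpha

# Synthetic data from hypothetical coefficients, for illustration only.
rng = np.random.default_rng(0)
fc_cc = rng.uniform(0.1, 10.0, 108)
ll = rng.uniform(20.0, 100.0, 108)
gamma = rng.uniform(1.4, 2.3, 108)
true_alpha = np.array([-0.5, 0.03, 15.0, -5.0])
phi = true_alpha[0] * fc_cc + true_alpha[1] * ll + true_alpha[2] * gamma + true_alpha[3]

alpha = fit_mlsr(fc_cc, ll, gamma, phi)
```

With noise-free synthetic data the generating coefficients are recovered, which serves only as a sanity check of the fitting routine.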
3.6.1 MLSR-Based Formulation of Angle of Shearing Resistance
The MLSR-based formulation of φ' is as given below:
$$ \varphi'_{MLSR} = -0.0137 \frac{FC}{CC} + 0.0256\, LL + 18.783\, \gamma - 8.458 \tag{15} $$
where FC/CC, LL and γ are the predictor variables. Fig. 10 shows a comparison between the experimental φ'
values and the values predicted by MLSR. The resulting Fisher value of the performed regression analysis is
equal to 59.6.
Fig. 10 Experimental versus predicted φ' values using the MLSR model: (a) training data, (b) testing data
4 Performance Analysis
Different correlations were developed for the estimation of φ' upon a reliable database. Comparisons of the φ'
predictions made by the GP/OLS, traditional GP and MLSR models are presented in Fig. 11. No rational model
for the prediction of φ' has been found that encompasses the influencing variables considered in this study. Thus,
it was not possible to conduct a comparative study between the results of this research and those of existing models.
It is known that if the R value provided by a model is higher than 0.8 and the error values (e.g., RMSE and
MAPE) are low, the predicted and measured values are strongly correlated with each other [41, 42]. It can be
observed from Figs. 8 and 11 that the GP/OLS model with high R and low RMSE and MAPE values is able to
predict the target values to an acceptable degree of accuracy. The performance of the GP/OLS model on the
testing data is better compared with that on the training data. This indicates that the GP/OLS model has a very
good generalization performance. The amount of data used for the training process is an important issue, as it
bears heavily on the reliability of the final models [42]. To cope with this limitation, Frank and Todeschini [43]
argue that the minimum ratio of the number of objects over the number of selected variables for model
acceptability is 3. It is also suggested that considering a higher ratio equal to 5 is safer. In the present study, this
ratio is much higher and is equal to 135/3 = 45. Additionally, new criteria proposed by Golbraikh and Tropsha
[44] were checked for the external validation of the GP/OLS model on the testing data sets. It is recommended
that at least one slope of the regression lines (k or k') through the origin should be close to 1. Recently, Roy and
Roy [45] introduced a confirmatory indicator of the external predictability of models (Rm). For Rm > 0.5, the
condition is satisfied. Furthermore, the squared correlation coefficient through the origin between the predicted
and measured values (Ro²), and that between the measured and predicted values (Ro'²), should be close
to 1 [42]. The validation criteria and the relevant results obtained by the model are presented in Table 5. As can
be seen, the derived model satisfies nearly all of the required conditions. The only exception is the Rm criterion.
In this case, the proposed model marginally fails to satisfy the condition (Rm = 0.483). The validation phase
ensures that the GP/OLS model is valid, has predictive power and is not established by chance.
Table 5 Statistical parameters of the GP/OLS model for the external validation
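The two slopes through the origin can be computed as sketched below. Which regression direction is labelled k and which k' is an assumption of this example (the defining expressions are those of Table 5), and the sample values are hypothetical.

```python
import numpy as np

def origin_slopes(h, t):
    """Slopes of the regression lines through the origin for actual (h) and predicted (t).

    k  : slope of the line t = k * h   (predicted regressed on actual)
    k' : slope of the line h = k' * t  (actual regressed on predicted)
    """
    h, t = np.asarray(h, dtype=float), np.asarray(t, dtype=float)
    k = (h @ t) / (h @ h)
    k_prime = (h @ t) / (t @ t)
    return float(k), float(k_prime)

# Hypothetical actual and predicted angles (degrees), for illustration only.
actual = [28.0, 31.0, 34.0, 37.0]
predicted = [29.0, 30.5, 33.0, 38.0]
k, k_prime = origin_slopes(actual, predicted)
```

Both slopes equal 1 for a perfect prediction, so their deviation from 1 indicates a systematic over- or under-estimation.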
As shown in Figs. 8, 9 and 11, the GP/OLS-based model has produced better results than the traditional GP
model. This reveals that applying the OLS strategy into the GP process (GP/OLS) improves the efficiency of
the traditional GP. Because of the tree pruning process, the GP/OLS equation is structurally simpler in
comparison with the equation evolved by the traditional GP. The GP/OLS model can be used for routine design
practice via hand calculations.
It can also be seen from Figs. 8, 10 and 11 that the GP/OLS model produces remarkably better outcomes than
the empirical MLSR model. The significant limitations of empirical modeling based on the statistical techniques
strongly affect the prediction capabilities of the regression-based equations [14, 24]. Conventional regression
models often assume a linear relationship between the output and the predictor variables, which is not always
true. In most cases, the best models developed using the commonly used regression approach are obtained after
checking just a few equations established in advance. Thus, they cannot efficiently consider the interactions
between the dependent and independent variables [14]. On the other hand, GP/OLS introduces completely new
characteristics and traits. One of the major advantages of the GP/OLS approach over the traditional regression
analysis is its ability to derive explicit relationships for a problem without assuming prior forms of the existing
relationships. The best equations evolved by this technique are determined after checking numerous
preliminary models. However, it is notable that the GP-based methods are extremely parameter sensitive,
especially when difficult experimental training data sets like the one used in this paper are employed. Using any
form of optimal control of the run parameters (e.g., GAs) can improve the performance of these
algorithms [24]. In this context, further research can be focused on hybridizing GP with other optimization
algorithms such as Ant Colony or Tabu Search.
Fig. 11 Comparison of the φ' predictions made by different models: (a) GP/OLS, (b) traditional GP, (c) MLSR
5 Sensitivity Analysis
The contribution of each predictor variable in the models evolved by GP/OLS was evaluated through a
sensitivity analysis. For this purpose, frequency values [46] of the input variables were obtained. A frequency value
of 1.00 for an input indicates that this variable appeared in 100% of the best thirty programs
evolved by GP/OLS. This is a common approach in GP-based analyses [18-20]. The
frequency values of the predictor variables are presented in Fig. 12. According to these results,
φ' is more dependent on γ than on LL and FC/CC. These results are comparable to those of the ANN model
developed by Kayadelen et al. [21].
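The frequency value described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: each evolved program is represented simply by the set of input variables it uses, the four programs listed are hypothetical (they are not the best thirty programs of this study), and `input_frequencies` is a hypothetical helper name.

```python
from collections import Counter


def input_frequencies(programs, variables):
    """Fraction of programs in which each input variable appears."""
    counts = Counter()
    for used in programs:
        counts.update(v for v in variables if v in used)
    return {v: counts[v] / len(programs) for v in variables}


# Hypothetical example: the inputs used by four evolved programs.
best_programs = [
    {"gamma", "LL"},
    {"gamma"},
    {"gamma", "FC/CC"},
    {"gamma", "LL", "FC/CC"},
]

print(input_frequencies(best_programs, ["FC/CC", "LL", "gamma"]))
# {'FC/CC': 0.5, 'LL': 0.5, 'gamma': 1.0}
```

In this toy case `gamma` has a frequency of 1.0 because it appears in every program, which is the sense in which a frequency of 1.00 marks a variable as present in 100% of the best programs.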
Fig. 12 Contributions of the predictor variables in the GP/OLS analysis
6 Parametric Analysis
For further verification of the GP/OLS prediction equation, a parametric analysis was performed in this study.
The parametric analysis investigates the response of the predicted φ' from the GP/OLS
model to a set of hypothetical input data generated over the training ranges between the minimum and maximum
data values. The methodology is based on changing one predictor variable at a time while the other variables
are kept constant at the average values of their entire data sets. A set of synthetic data for the single varied
parameter is generated by increasing its value in increments [24]. These data are presented to the
prediction model and φ' is calculated. This procedure is repeated using another variable until the model response
is tested for all input variables. Fig. 13 presents the tendency of the φ' predictions to the variations of the
predictor variables, FC/CC, LL and γ. The results of the parametric analysis indicate that φ' continuously
decreases with increasing FC/CC and increases with increasing LL. As can be seen in Fig. 13(c), the φ' value
decreases as γ increases up to about 1.4 gr/cm3 and thereafter it starts increasing.
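The one-variable-at-a-time procedure can be sketched as follows. This is a hedged illustration: `placeholder_model` is a hypothetical stand-in and is NOT the GP/OLS equation derived in this study, and `parametric_sweep` is a hypothetical helper name. The ranges and mean values for LL and γ are those listed in the descriptive statistics table; FC/CC is left out because its mean and range are not tabulated there.

```python
def parametric_sweep(model, ranges, baseline, steps=5):
    """Vary one input at a time over its range; hold the others at baseline."""
    curves = {}
    for name, (low, high) in ranges.items():
        curve = []
        for i in range(steps):
            value = low + (high - low) * i / (steps - 1)
            inputs = dict(baseline)
            inputs[name] = value  # only this variable is changed
            curve.append((value, model(**inputs)))
        curves[name] = curve
    return curves


def placeholder_model(ll, gamma):
    # Hypothetical stand-in, NOT the GP/OLS equation from the paper.
    return 5.0 + 12.0 * gamma - 0.01 * ll


ranges = {"ll": (22.0, 98.0), "gamma": (1.431, 2.268)}  # minimum and maximum values
baseline = {"ll": 41.00, "gamma": 1.81}                 # mean values

curves = parametric_sweep(placeholder_model, ranges, baseline)
for name, curve in curves.items():
    print(name, [(round(x, 3), round(y, 2)) for x, y in curve])
```

Replacing `placeholder_model` with the actual prediction equation would give response curves of the kind plotted in Fig. 13.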
Fig. 13 Parametric analysis of φ' in the GP/OLS model
In addition, the ratios of the experimental φ' values to the values predicted by the GP/OLS solution, with respect to
FC/CC, LL and γ, are shown in Fig. 14. As the scattering in these plots increases, the accuracy of the model
decreases. It can be observed from these figures that the predictions obtained by the proposed
correlation have very good accuracy, with no significant trend with respect to the design parameters. In the
cases of FC/CC and LL, the scattering slightly decreases as the parameter increases.
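The ratio check can be illustrated with a few rows of the database. The sketch below is a minimal example using rows 1-5 of Table 6 (γ, experimental φ' and GP/OLS φ'); with so few points it only demonstrates the calculation and says nothing about trends in the full data set. The variable names are hypothetical.

```python
from statistics import mean, pstdev

# Rows 1-5 of Table 6: (gamma in gr/cm3, experimental angle, GP/OLS angle)
rows = [
    (1.810, 27.1, 26.1),
    (1.870, 28.4, 27.5),
    (2.070, 32.2, 32.3),
    (1.970, 30.8, 29.3),
    (1.990, 29.6, 29.7),
]

# Ratio of experimental to predicted value, paired with the design parameter
ratios = [(gamma, exp / pred) for gamma, exp, pred in rows]
values = [ratio for _, ratio in ratios]

for gamma, ratio in sorted(ratios):
    print(f"gamma = {gamma:.3f}: ratio = {ratio:.3f}")
print(f"mean ratio = {mean(values):.3f}, scatter (std) = {pstdev(values):.3f}")
```

A mean ratio close to 1 with small scatter, and no visible drift of the ratio with the design parameter, corresponds to the behavior described for Fig. 14.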
Fig. 14 The ratio between the experimental and predicted φ' values with respect to the design parameters
7 Conclusions
In this research, a high-precision model was derived for assessing the effective angle of shearing resistance
using a combined GP and OLS algorithm (GP/OLS). The proposed model was developed based on well-established
and widely dispersed triaxial test results obtained from the literature and from the experimental study
performed in this work. The performance of the model was benchmarked against the standard GP and multiple
regression-based models.
• The developed model gives reliable estimations of the φ' values. Introducing the OLS strategy to the GP
process improved the efficiency of the traditional GP. The results indicate that the proposed model
significantly outperforms the regression model.
• The proposed GP/OLS model simultaneously takes into account the role of several important factors
representing the behavior of shear strength parameters.
• The GP/OLS model can be used for practical engineering purposes since it was developed based on
tests conducted on clayey and sandy soils with a wide range of properties. The proposed model is very
simple. The predictive capability of the derived model is limited to the range of the data used for its
calibration. Despite this limitation, the model can be retrained and improved to make more accurate
predictions for a wider range by adding newer data sets for other soil types and test conditions.
• With the use of the GP/OLS approach, the φ' values can be estimated without carrying out sophisticated
and time-consuming laboratory or field tests.
• A finding from the sensitivity analysis is that the most important parameter governing the φ'
behavior is the soil bulk density.
• An interesting observation from the results of the parametric study is that bulk density is negatively
correlated with φ' only up to about 1.4 gr/cm3, after which the correlation becomes positive.
Appendix A
Table 6 Geotechnical properties of soils used for the model development

References
1. Mollahasani A, Alavi AH, Gandomi AH, Rashed A (2011) Nonlinear neural-based modeling of soil
cohesion intercept. KSCE J Civil Eng 15(5): 831-840.
2. Arora KR (1988) Introductory soil engineering. Text book, 322. Nem Chand Jane (Prop.), Standard
Publishers Distributors, 1705-Nai Sarak, Delhi
3. Murthy S (2008) Geotechnical engineering: principles and practices of soil mechanics, 2nd edn.
CRC Press, Taylor & Francis, UK
4. ASTM WK3821 New test method for consolidated drained triaxial compression test for soils
5. ASTM D 6528 Consolidated undrained direct simple shear testing of cohesive soils
6. El-Maksoud MAF (2006) Laboratory determining of Soil Strength Parameters in Calcareous Soils and
Their Effect on Chiseling Draft Prediction. In: Proceedings of Energy Efficiency and Agricultural
Engineering International Conference, Vol. 9, Rousse, Bulgaria
7. Bowles JE (1992) Engineering properties of soils and their measurement (4th ed.). New York, NY,
McGraw-Hill
8. Korayem AY, Ismail KM, Sehari SQ (1996) Prediction of soil shear strength and penetration resistance
using some soil properties. Missouri J Agric Res 13(4):119–140
9. Panwar JS, Seimens JC (1972) Shear strength and energy of soil failure related to density and moisture.
T ASAE 15:423–427
10. Terzaghi K, Peck RB, Mesri G (1996) Soil mechanics in engineering practice (2nd ed.). Wiley & Sons,
New York
11. Shahin MA, Maier HR, Jaksa MB (2001) Artificial neural network applications in geotechnical
engineering. Aus Geomech 36(1):49–62
12. Alavi AH, Gandomi AH, Mollahasani A, Heshmati AAR (2010) Modeling of maximum dry density
and optimum moisture content of stabilized soil using artificial neural networks. J Plant Nutr Soil Sci,
173(3): 368-379.
13. Heshmati AAR, Alavi AH, Keramati M, Gandomi AH (2009) A radial basis function neural network
approach for compressive strength prediction of stabilized soil. Geotechnical Special Publication ASCE
191:147-153
14. Alavi AH, Ameri M, Gandomi AH, Mirzahosseini MR (2011) Formulation of flow number of asphalt
mixes using a hybrid computational method. Constr Build Mater 25(3): 1338–1355.
15. Koza J (1992) Genetic programming, on the programming of computers by means of natural selection.
Cambridge (MA), MIT Press
16. Banzhaf W, Nordin P, Keller R, Francone F (1998) Genetic programming: an introduction. On the
automatic evolution of computer programs and its application. dpunkt/Morgan Kaufmann,
Heidelberg/San Francisco
17. Johari A, Habibagahi G, Ghahramani A (2006) Prediction of soil–water characteristic curve using
genetic programming. J Geotech Geoenviron ASCE 132(5):661-65
18. Gandomi AH, Alavi AH, Sahab MG, Arjmandi P (2010) Formulation of elastic modulus of concrete
using linear genetic programming. J Mech Sci Tech 24(6): 1011-1017.
19. Gandomi AH, Alavi AH, Sahab MG (2009) New formulation for compressive strength of CFRP
confined concrete cylinders using linear genetic programming. Mater Struct 43(7): 963-983.
20. Alavi AH, Gandomi AH, Sahab MG, Gandomi M (2010) Multi expression programming: a new
approach to formulation of soil classification. Eng Comput 26(2): 111-118.
21. Kayadelen C, Günaydın O, Fener M, Demir A, Özvan A (2009) Modeling of the angle of shearing
resistance of soils using soft computing systems. Expert Syst Appl 36:11814–11826
22. Billings S, Korenberg M, Chen S (1988) Identification of nonlinear output-affine systems using an
orthogonal least-squares algorithm. Int J Syst Sci 19(8):1559–1568
23. Chen S, Billings S, Luo W (1989) Orthogonal least squares methods and their application to non-linear
system identification. Int J Cont 50(5):1873–1896
24. Gandomi AH, Alavi AH, Mousavi M, Tabatabaei SM (2011) A hybrid computational approach to
derive new ground-motion attenuation models. Eng Appl Artif Int 24(4): 717–732.
25. Gandomi AH, Alavi AH, Arjmandi P, Aghaeifar A, Seyednoor M (2010) Genetic programming and
orthogonal least squares: a hybrid approach to modeling the compressive strength of CFRP-confined
concrete cylinders. J Mech Mater Struct 5(5): 735–753.
26. Madar J, Abonyi J, Szeifert F (2005) Genetic programming for the identification of nonlinear input-
output models. Ind Eng Chem Res 44(9):3178–3186
27. Pearson R (2003) Selecting nonlinear model structures for computer control. J Process Contr 13(1):1–26
28. Reeves CR (1997) Genetic algorithm for the operations research. INFORMS J Comput 9:231–250
29. Madár J, Abonyi J, Szeifert F (2004) Genetic programming for system identification. In: Proceedings of
Intelligent Systems Design and Applications (ISDA 2004) Conference, Budapest, Hungary
30. Cao H, Yu J, Kang L, Chen Y (1999) The kinetic evolutionary modelling of complex systems of
chemical reactions. Comput Chem Eng 23(1):143–151
31. Barends FBJ, Lindenberg JL, DeQuelerij L, Verruijt A, Luger HJ (1999) Geotechnical Engineering for
Transportation Infrastructure: Theory and Practice, Planning and Design, Construction and
Maintenance. AA Balkema Publishers, Rotterdam, Netherlands
32. ASTM D 1587 Standard practice for thin-walled tube sampling of soils for geotechnical purposes
33. Kayadelen C (2008) Estimation of effective stress parameter of unsaturated soils by using artificial
neural networks. Int J Numer Anal Meth Geomec 32:1087–106.
34. Dunlop P, Smith S (2003) Estimating key characteristics of the concrete delivery and placement process
using linear regression analysis. Civil Eng Environ Syst 20:273–290
35. Swingler K (1996) Applying neural networks: a practical guide. Academic Press, New York
36. Rafiq MY, Bugmann G, Easterbrook DJ (2001) Neural network design for engineering applications.
Comput Struct 79(17):1541–1552
37. Madár J, Abonyi J, Szeifert F (2005b) Genetic Programming for the Identification of Nonlinear Input-
Output Models, white paper. Retrieved September 05, 2009, from
http://www.fmt.vein.hu/softcomp/gp/ie049626e.pdf
38. Silva S (2007) GPLAB, a genetic programming toolbox for MATLAB. ITQB/UNL.
http://gplab.sourceforge.net
39. Ryan TP (1997) Modern Regression Methods. New York (NY), Wiley
40. Maravall A, Gomez V (2004) EViews Software, Ver. 5. Quantitative Micro Software, LLC, Irvine CA
41. Smith GN (1986) Probability and statistics in civil engineering. Collins, London
42. Mollahasani A, Alavi AH, Gandomi AH (2011) Empirical modeling of plate load test moduli of soil
via gene expression programming. Comput Geotech 38(2): 281-286.
43. Frank IE, Todeschini R (1994) The data analysis handbook. Elsevier, Amsterdam, Netherlands.
44. Golbraikh A, Tropsha A (2002) Beware of q2. J Mol Graph Model 20:269–76.
45. Roy PP, Roy K (2008) On some aspects of variable selection for partial least squares regression models.
QSAR Comb Sci 27:302–13.
46. Francone F (2001) Discipulus Owner's Manual, Version 4.0. Register Machine Learning Technologies
LIST OF TABLES


Table 1 Correlation coefficients between all pairs of the explanatory variables
Variable      FC (%)   CC (%)   LL (%)   γ (gr/cm3)
FC (%)         1.00    -1.00     0.28    -0.21
CC (%)        -1.00     1.00    -0.28     0.21
LL (%)         0.28    -0.28     1.00    -0.58
γ (gr/cm3)    -0.21     0.21    -0.58     1.00

Table 2
                      Inputs                                      Output
Parameters            FC (%)   CC (%)   LL (%)   γ (gr/cm3)      φ' (°)
Mean                  38.68    61.39    41.00    1.81            26.41
Standard Error        1.76     1.77     0.96     0.01            0.29
Median                40.00    60.00    37.00    1.81            26.00
Mode                  28.00    72.00    36.00    1.81            26.00
Standard Deviation    20.50    20.51    11.19    0.14            3.39
Sample Variance       420.28   420.72   125.16   0.02            11.46
Kurtosis              -0.63    -0.64    4.27     0.83            2.75
Skewness              -0.09    0.09     1.43     0.03            0.99
Range                 84       84.00    76       0.837           22
Minimum               1        15.00    22       1.431           18
Maximum               85       99.00    98       2.268           40
Table 3 Parameter settings for the GP/OLS algorithm
Parameter                                                                           Settings
Function set                                                                        +, -, ×, /
Population size                                                                     500-1000
Maximum tree depth                                                                  64
Maximum number of evaluated individuals                                             250
Generation                                                                          100
Type of selection                                                                   Roulette-wheel
Type of mutation                                                                    Point-mutation
Type of crossover                                                                   One-point (2 parents)
Type of replacement                                                                 Elitist
Probability of crossover                                                            0.5
Probability of mutation                                                             0.5
Probability of changing terminal–non-terminal nodes (vice versa) during mutation    0.25

Table 4 Parameter settings for the traditional GP algorithm
Parameter                            Settings
Function set                         +, -, ×, /
Population size                      100-1000
Maximum tree depth                   10
Total generations                    4000
Initial population                   Ramped half-and-half
Sampling                             Tournament
Expected no. of offspring method     Rank 89
Fitness function error type          Linear error function
Termination                          Generation 40
Minimum probability of crossover     0.1
Minimum probability of mutation      0.1
Real max level                       30
Survival mechanism                   Keep best
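Table 5 Statistical parameters of the GP/OLS model for the external validation
Item   Formula                                                                      Condition                GP/OLS
1      $R$                                                                          $R > 0.8$                0.909
2      $k = \frac{\sum_{i=1}^{n} (h_i \times t_i)}{\sum_{i=1}^{n} h_i^2}$           $0.85 < k < 1.15$        1.005
3      $k' = \frac{\sum_{i=1}^{n} (h_i \times t_i)}{\sum_{i=1}^{n} t_i^2}$          $0.85 < k' < 1.15$       0.991
6      $R_m = R^2 \times \left(1 - \sqrt{\left|R^2 - R_o^2\right|}\right)$          $R_m > 0.5$              0.483
where  $R_o^2 = 1 - \frac{\sum_{i=1}^{n} (t_i - h_i^o)^2}{\sum_{i=1}^{n} (t_i - \bar{t})^2}$, $h_i^o = k \times t_i$       Should be close to 1     0.999
       $R_o'^2 = 1 - \frac{\sum_{i=1}^{n} (h_i - t_i^o)^2}{\sum_{i=1}^{n} (h_i - \bar{h})^2}$, $t_i^o = k' \times h_i$     Should be close to 1     0.995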

Table 6 Geotechnical properties of soils used for the model development
Test No. FC (%) CC (%) LL (%) γ (gr/cm3) φ' Exp. (°) φ' GP/OLS (°) φ' Traditional GP (°) φ' MLSR (°)
1 77.4 22.6 36.4 1.810 27.1 26.1 26.3 26.4
2 81.3 18.7 45.8 1.870 28.4 27.5 28.0 27.8
3 56.3 43.7 24.2 2.070 32.2 32.3 31.4 31.0
4 97.1 2.9 34.7 1.970 30.8 29.3 29.1 29.0
5 96.9 3.1 29 1.990 29.6 29.7 29.3 29.2
6 86.4 13.6 33.9 1.930 28.8 28.6 28.5 28.6
7 72 28 26 2.030 32 31.2 30.4 30.3
8 96.5 3.5 40.4 1.950 30.5 28.9 29.0 28.8
9 68.6 31.4 43.2 1.880 30.1 27.7 28.1 27.9
10 87 13 28 1.960 36 29.3 28.8 29.0
11 81.9 18.1 37.7 1.950 30.7 29.2 29.2 29.1
12 98.8 1.2 35.8 1.910 29.5 27.2 27.2 27.2
13 79.2 20.8 43.5 1.950 29.1 29.3 29.5 29.2
14 76 24 39 1.781 26 25.6 25.9 25.9
15 79 21 36 1.801 27 25.9 26.1 26.2
16 52 48 44 1.804 25 26.1 26.7 26.5
17 72 29 28 1.800 28 25.7 25.6 26.0
18 50 51 48 1.828 25 26.7 27.4 27.1
19 59 41 43 1.765 25 25.4 25.9 25.8
20 35 65 35 1.705 26 24.3 24.2 24.5
21 76 24 34 1.715 26 24.4 24.3 24.6
22 74 27 47 1.814 25 26.3 27.0 26.8
23 63 37 34 1.831 27 26.5 26.6 26.8
24 64 37 55 1.532 23 22.6 21.5 21.7
25 49 51 35 1.831 29 26.5 26.6 26.8
26 41 59 42 1.826 28 26.5 27.0 26.9
27 24 76 35 1.902 26 28.0 28.1 28.2
28 72 28 36 1.760 26 25.1 25.3 25.5
29 42 58 38 1.841 26 26.7 27.0 27.1
30 33 67 47 1.848 28 27.1 27.7 27.4
31 27 73 34 1.946 28 29.1 28.9 29.0
32 38 62 28 1.922 28 28.4 28.0 28.4
33 37 63 39 1.882 25 27.7 27.9 27.9
34 50 50 46 1.792 26 25.9 26.6 26.4
35 64 36 35 1.726 21 24.6 24.6 24.8
36 66 34 35 1.728 23 24.6 24.6 24.9
37 45 55 37 1.837 24 26.6 26.9 27.0
38 95 5 57 2.093 35 33.3 33.3 32.1
39 53 48 35 1.979 26 29.9 29.7 29.6
40 95 5 49 1.865 29 27.2 27.8 27.6
41 51 49 28 1.966 27 29.5 29.0 29.2
42 45 55 47 1.809 27 26.3 27.0 26.7
43 31 69 46 1.739 26 25.0 25.6 25.4
44 27 73 25 2.020 27 30.9 30.1 30.1
45 61 39 38 1.910 26 28.3 28.4 28.4
46 53 47 48 1.778 24 25.7 26.4 26.2
47 54 46 47 1.826 26 26.6 27.3 27.0
48 55 45 98 1.778 26 26.5 29.6 27.4
49 51 49 32 1.845 29 26.7 26.7 27.0
50 35 65 25 1.918 30 28.2 27.7 28.2
51 72 28 36 1.638 22 23.3 23.0 23.2
52 32 68 36 1.813 28 26.2 26.3 26.5
53 58 43 30 1.799 29 25.8 25.7 26.1
54 61 39 31 1.806 25 25.9 25.9 26.2
55 33 67 49 1.737 23 25.0 25.7 25.4
56 17 83 32 2.233 39 37.9 36.9 34.3
57 94 6 65 1.550 23 22.7 22.1 22.1
58 58 42 29 1.953 31 29.2 28.8 28.9
59 60 40 53 1.641 25 23.7 23.9 23.7
60 33 67 34 1.881 30 27.5 27.6 27.7
61 42 58 31 1.970 28 29.6 29.3 29.3
62 72 28 36 1.673 23 23.8 23.6 23.9
63 42 58 36 1.912 27 28.3 28.3 28.4
64 65 35 42 1.660 26 23.7 23.7 23.8
65 96 4 37 1.638 22 23.1 22.8 22.9
66 44 56 40 1.886 25 27.8 28.0 28.0
67 51 49 33 1.745 30 24.9 24.9 25.1
68 98 2 63 1.638 22 23.2 23.2 23.3
69 47 53 33 1.832 24 26.5 26.5 26.8
70 57 43 48 1.743 25 25.1 25.7 25.5
71 75 25 56 1.572 24 22.9 22.5 22.5
72 65 35 47 1.651 25 23.7 23.8 23.7
73 83 17 32 1.806 24 25.9 25.9 26.2
74 67 33 39 1.883 29 27.7 27.9 27.9
75 65 35 36 1.758 27 25.1 25.3 25.5
76 40 60 40 1.790 24 25.8 26.2 26.2
77 67 33 37 1.770 27 25.3 25.6 25.7
78 64 36 34 1.726 21 24.6 24.6 24.8
79 56 44 46 1.720 25 24.7 25.2 25.0
80 75 25 33 1.883 29 27.5 27.5 27.7
81 36 64 40 1.748 24 25.0 25.4 25.4
82 15 85 23 2.016 29 30.7 29.9 30.0
83 72 28 32 1.806 24 25.9 25.9 26.2
84 19 81 35 1.926 27 28.6 28.6 28.6
85 49 51 45 1.832 26 26.7 27.3 27.1
86 57 43 46 1.739 24 25.0 25.5 25.4
87 99 1 71 1.655 23 22.9 22.4 23.1
88 99 1 55 1.533 19 21.3 20.0 20.4
89 75 25 55 1.467 22 22.1 19.8 20.5
90 99 1 71 1.655 23 22.9 22.4 23.1
91 56 44 49 1.649 26 23.7 23.9 23.8
92 35 65 37 1.958 28 29.4 29.4 29.3
93 72 28 36 1.725 25 24.6 24.6 24.8
94 66 34 35 1.728 23 24.6 24.6 24.9
95 60 40 47 1.918 30 28.6 29.1 28.8
96 38 62 38 1.822 27 26.4 26.7 26.7
97 42 58 41 1.812 25 26.2 26.6 26.6
98 57 43 53 1.640 24 23.6 23.9 23.7
99 50 50 49 1.689 24 24.2 24.7 24.5
100 72 28 36 1.813 27 26.1 26.3 26.5
101 61 39 51 1.718 24 24.7 25.4 25.1
102 42 58 36 1.912 27 28.3 28.3 28.4
103 37 63 49 1.702 28 24.4 25.0 24.8
104 66 34 49 1.609 26 23.2 23.0 23.0
105 92 8 60 1.498 18 22.3 20.6 21.1
106 55 45 36 1.885 27 27.7 27.8 27.9
107 45 55 51 1.668 25 24.0 24.4 24.2
108 75 25 55 1.467 22 22.1 19.8 20.5
109 60 40 33 1.838 24 26.6 26.6 26.9
110 60 40 33 1.823 27 26.3 26.3 26.6
111 31 69 51 1.872 29 27.6 28.5 28.0
112 74 26 57 1.431 22 21.9 18.8 19.8
113 80 21 54 1.552 23 22.7 21.9 22.0
114 56 44 53 1.776 26 25.7 26.7 26.2
115 98 2 65 1.743 25 24.7 25.4 25.3
116 93 7 58 1.613 21 23.3 23.3 23.1
117 61 39 35 1.770 24 25.3 25.4 25.7
118 54 46 35 1.914 27 28.3 28.3 28.4
119 60 40 55 1.704 26 24.6 25.4 24.9
120 67 33 37 1.770 27 25.3 25.6 25.7
121 91 9 32 2.000 28 30.3 29.9 29.8
122 47 53 41 1.785 28 25.7 26.1 26.1
123 48 53 22 1.848 28 26.6 26.1 26.8
124 87 14 33 1.988 32 30.1 29.7 29.6
125 61 39 35 1.770 24 25.3 25.4 25.7
126 84 16 27 2.268 40 39.0 38.1 34.8
127 52 48 29 1.904 25 28.0 27.7 28.0
128 91 9 32 2.058 34 32.0 31.4 30.9
129 53 47 27 1.962 27 29.4 28.8 29.1
130 47 53 43 1.759 27 25.3 25.8 25.7
131 75 25 51 1.708 24 24.5 25.2 24.9
132 72 28 36 1.708 24 24.3 24.3 24.5
133 37 63 30 1.913 26 28.2 28.0 28.2
134 95 5 49 1.883 29 27.6 28.2 27.9
135 50 50 44 1.806 27 26.1 26.7 26.6
Bold sets are test sets.
The first 13 datasets represent the test results obtained in this study.
LIST OF FIGURES
Fig. 2 The tree representation of a GP model (√(x - 1)) [tree diagram: root node √, functional node -, terminal nodes x and 1]
Fig. 3 Crossover operation in GP [tree diagrams: Parent 1 and Parent 2 produce Child 1 and Child 2]
Fig. 4 Mutation operation in GP [tree diagrams]
Fig. 5 Decomposition of a tree to function terms [29] [tree diagram]
Fig. 6 Pruning of a tree with OLS [25] [tree diagrams before and after pruning]
Fig. 7 Histograms of the variables used in the model development [frequency and cumulative percentage: (a) CC (%), (b) FC (%), (c) LL (%), (d) γ (gr/cm3), (e) φ' (°)]
Fig. 8 [predicted versus experimental φ' value (degree) with ideal-fit line: (a) R = 0.804, RMSE = 189.09, MAPE = 5.85; (b) R = 0.909, RMSE = 160.61, MAPE = 5.12]
Fig. 9 [predicted versus experimental φ' value (degree) with ideal-fit line: (a) R = 0.791, RMSE = 195.01, MAPE = 6.18; (b) R = 0.897, RMSE = 171.08, MAPE = 5.69]
Fig. 10 [predicted versus experimental φ' value (degree) with ideal-fit line: (a) R = 0.789, RMSE = 195.68, MAPE = 6.13; (b) R = 0.874, RMSE = 194.54, MAPE = 5.90]
Fig. 11 Comparison of the φ' predictions made by different models: (a) GP/OLS (R = 0.835, RMSE = 183.75, MAPE = 5.71), (b) traditional GP (R = 0.822, RMSE = 190.47, MAPE = 6.08), (c) MLSR (R = 0.813, RMSE = 195.45, MAPE = 6.08) [experimental and predicted φ' (°) versus Test No.]
Fig. 12 Contributions of the predictor variables in the GP/OLS analysis [bar chart of frequency, 0 to 1, for FC/CC, LL (%) and γ (gr/cm3)]
Fig. 13 Parametric analysis of φ' in the GP/OLS model [φ' (degree) versus (a) FC/CC, (b) LL (%), (c) γ (gr/cm3)]
Fig. 14 The ratio between the experimental and predicted φ' values with respect to the design parameters [φ'EXP / φ'GP/OLS versus (a) FC/CC, (b) LL (%), (c) γ (gr/cm3)]