You are on page 1of 34

See discussions, stats, and author profiles for this publication at: https://www.researchgate.

net/publication/225451900

Formulation of soil angle of shearing resistance using a hybrid GP and OLS


method

Article  in  Engineering With Computers · January 2011


DOI: 10.1007/s00366-011-0242-x

CITATIONS READS

10 129

5 authors, including:

Amir H. Alavi Amir H Gandomi


University of Missouri Stevens Institute of Technology
169 PUBLICATIONS   6,774 CITATIONS    244 PUBLICATIONS   9,894 CITATIONS   

SEE PROFILE SEE PROFILE

Milad arab esmaeili


university of Shahrood.Shahrood.Iran
5 PUBLICATIONS   77 CITATIONS   

SEE PROFILE

Some of the authors of this publication are also working on these related projects:

Ph. D. View project

Recycled aggregate concrete View project

All content following this page was uploaded by Amir H. Alavi on 19 May 2017.

The user has requested enhancement of the downloaded file.


Formulation of Soil Angle of Shearing Resistance Using a Hybrid GP and OLS Method

S.M. Mousavi1, A.H. Alavi2, A. Mollahasani3, A.H. Gandomi4, M Arab Esmaeili5


1
Department of Civil Engineering, Sharif University of Technology, Tehran, Iran
2
School of Civil Engineering, Iran University of Science and Technology, Tehran, Iran
3
Department of Civil, Environmental and Material Engineering (DICAM), University of Bologna, Bologna, Italy
4
Department of Civil Engineering, University of Akron, Akron, OH 44325-3905, USA
5
Department of Civil Engineering, Islamic Azad University, Shahrood Branch, Shahrood, Iran

Abstract

In the present study, a prediction model was derived for the effective angle of shearing resistance (') of soils

using a novel hybrid method coupling genetic programming (GP) and orthogonal least squares algorithm

(OLS). The proposed nonlinear model relates ' to the basic soil physical properties. A comprehensive

experimental database of consolidated-drained triaxial tests was used to develop the model. Traditional GP and

least square regression analyses were performed to benchmark the GP/OLS model against classical approaches.

Validity of the model was verified using a part of laboratory data that were not involved in the calibration

process. The statistical measures of correlation coefficient (R), root mean squared error (RMSE) and mean

absolute percent error (MAPE) were used to evaluate the performance of the models. Sensitivity and parametric

analyses were conducted and discussed. The GP/OLS-based formula precisely estimates the ' values for a

number of soil samples. The proposed model provides a better prediction performance than the traditional GP

and regression models.

Keywords: Effective angle of shearing resistance; Soil physical properties; Genetic programming; Orthogonal

least squares; Hybridization.

1
1 Introduction

A major property of soil is its ability to resist sliding along internal surfaces within a mass. The soil shearing

resistance plays an important role in the stability of structures built on it. In general, the Mohr-Coulomb theory

is used to represent the shear strength of geotechnical materials. This theory indicates that the soil shear strength

varies linearly with the applied stress through two shear strength components known as the cohesion intercept

and angle of shearing resistance [1]. The tangent to the Mohr-Coulomb failure envelopes is represented by its

slope and intercept. The slope expressed in degrees is the angle of shearing resistance and the intercept is

cohesion [2, 3]. The cohesion intercept and angle of shearing resistance are treated as constants over the range

of normal stresses. The values of these empirical parameters for any soil depend upon several factors such as

the soil textural properties, past history of soil, initial state of soil, permeability characteristics of soil and

conditions of drainage allowed to take place during the test [1, 3]. Figs. 1(a) and (b) show the Mohr circles and

failure envelopes in terms of the total and effective stresses, respectively. If the cohesion intercept and angle of

shearing resistance are determined using the total stresses (Fig. 1(a)), they are named as total or undrained

cohesion intercept (c) and angle of shearing resistance (). The effective stress is the difference between the

total stress and the excess pore water pressure. If the pore water pressures are measured during the test, the

effective circles can be plotted as shown in Fig. 1(b) and the effective strength parameters (c' and ') are

obtained.

Fig. 1 Mohr circles and failure envelopes in terms of total and effective stresses [3]

Determination of ' is an important consideration in design of geotechnical structures. This key parameter can

be determined using field or laboratory tests. The triaxial compression and direct shear tests are the most

common tests for determining the ' values in the laboratory. The testing procedures of triaxial and direct shear

tests have been standardized by ASTM WK3821 [4] and ASTM 6528-00 [5], respectively. The triaxial and

direct shear tests are more suitable for clayey and sandy soils, respectively. The tests used in the field are vane

shear test or any other indirect method [3, 6].

Since experimental determination of ' is cumbersome and costly, numerical models are developed to estimate

the ' values. Despite the multivariable dependency of soils, the existing correlations are developed on the basis

2
of only one soil index property [1]. Further, simplifying assumptions are commonly incorporated into the

development of the statistical and numerical methods that may lead to very large errors [7-11].

In recent years, new soft computing methods such as artificial neural networks (ANNs) have been successfully

applied to behavioral modeling of many geotechnical engineering problems [11-13]. The insufficiency of ANNs

to produce simplified prediction equations can create difficulty in practical circumstances. Furthermore,

structure of a neural network should be identified a priori [14]. A new alternative approach to overcome these

problems is known as genetic programming (GP) [15, 16]. GP is generally a supervised machine learning

technique that searches a program space instead of a data space. Many researchers have employed GP and its

variants to derive simple prediction equations for civil engineering problem [17-20]. Recently, Kayadelen et al.

[21] used the ANN and GP-based methods to predict the ' value of soils.

Orthogonal least squares (OLS) algorithm [22, 23] is an effective algorithm to designate which terms are

significant in a linear-in-parameters model [24, 25]. Madar et al. [26] coupled GP and OLS to make a hybrid

algorithm with better efficiency. Introducing this strategy into the GP process resulted in obtaining more robust

and interpretable models [25, 26]. Some of the limited researches with the specific objective of applying the

GP/OLS method to civil engineering problems have been recently conducted by Gandomi et al. [24] and

Gandomi et al. [25].

The purpose of the current research is to utilize the hybrid GP/OLS technique to generate a linear-in-parameters

prediction model for '. The proposed model relates ' to the coarse and fine-grained contents, liquid limit and

bulk density. The developed model can reliably be used for routine design practice in that it was derived from

tests with a wide range of aggregate gradation and soil index properties.

2 Genetic Programming

GP creates computer programs to solve a problem using the principle of Darwinian natural selection. A

significant advantage of GP over other soft computing techniques is its ability to generate practical prediction

equations. Development of GP in the late 1980s was a result of experiments of Koza [13] on symbolic

regression. GP is an extension of genetic algorithms (GAs). The classical GP technique is referred to as tree-

based GP [25], in which a random population of individuals (trees) is created to achieve high diversity. A

population member in GP is a hierarchically structured tree comprising functions and terminals. The functions

3
and terminals are selected from a set of functions and a set of terminals. The functions and terminals are chosen

at random and constructed together to form a computer model in a tree-like structure [25]. A simple tree

representation of a GP model is shown in Fig. 2.

Fig. 2 The tree representation of a GP model ((√(x - 1))

Once a population of models has been created at random, the GP algorithm evaluates the individuals, selects

individuals for reproduction. Thereafter, GP generates new individuals by mutation, crossover, and direct

reproduction [15, 25]. The crossover operation selects a point on a branch of each program at random. Then, set

of terminals and/or functions from each program are swapped to generate two new programs (see Fig. 3).

During the mutation process, the algorithm occasionally selects a function or terminal from a model at random

and mutates it (see Fig. 4). In the following subsections, the coupled algorithm of GP and OLS, GP/OLS, is

described.

Fig. 3 Crossover operation in GP

Fig. 4 Mutation operation in GP

2.1 Genetic Programming for Linear-in-parameters Models

In general, GP creates both nonlinear and linear-in-parameters models. In order to avoid parameter models, the

parameters must be removed from the set of terminals. That is, it must contain only variables: T = (x0 (k), ..., xi

(k)}, where xi (k) denotes the ith repressor variable. Hence, a population member represents only Fi nonlinear

functions [25, 27]. The parameters are assigned to the model after “extracting” the F i function terms from the

tree, and determined using a least square (LS) algorithm [28]. A simple technique for the decomposition of the

tree into function terms can be used. The subtrees, representing the F i function terms, are determined by

decomposing the tree starting from the root as far as reaching nonlinear nodes (nodes which are not “+” or “-”).

As shown in Fig. 5, the root node is a “+” operator; therefore, it is possible to decompose the tree into two

subtrees of “A” and “B”. The root node of the “A” tree is anew a linear operator; therefore, it can be

decomposed into “C” and “D” trees. As the root node of the “B” tree is a nonlinear node (/), it cannot be

decomposed. The root nodes of “C” and “D” trees are also nonlinear. Consequently, the final decomposition

procedure results in three subtrees: “B”, “C”, and “D”. According to the results of the decomposition, it is

4
possible to assign parameters to the functional terms represented by the obtained subtrees. The resulted linear-

in-parameters model for this example is y: p0 + p1(x2 + x1)/x0 + p2x0 + p3x1 [25].

Fig. 5 Decomposition of a tree to function terms [29]

GP can be used for selecting from special model classes such as a polynomial model. For this aim, the set of

operators must be restricted and some simple syntactic rules must be introduced. For instance, if the set of

operators is defined as F= {×, +} and there is a syntactic rule that exchanges the internal nodes that are below a

“×”-type internal nodes to “×”- type nodes, GP will generate only polynomial models [15, 25, 29].

2.2 Orthogonal Least Squares Algorithm

The great advantage of using linear-in-parameter models is that the LS method can be used to identify the

model parameters. This is much less computationally demanding than other nonlinear optimization algorithms,

because the optimal p = [p1,..., pm]T parameter vector can analytically be calculated [25]:

p = (U-1U)TUy (1)

in which y = [y(1),..., y(N)]T is the measured output vector and the U regression matrix is:

 U 1 ( x(1))  U M ( x(1)) 
 
U     
U ( x( N ))  U ( x( N )) 
 1 M  (2)

The OLS algorithm [23, 24] is an efficient algorithm for determining which terms are significant in a linear-in-

parameters model. The OLS technique introduces the error reduction ratio (err), which is a measure of the

decrease in the variance of output by a given term. The matrix form corresponding to the linear-in-parameters

model is [25]:

y = Up+e (3)

where the U is the regression matrix, p is the parameter vector, and e is the error vector. The OLS method

transforms the columns of the U matrix into a set of orthogonal basis vectors to inspect the individual

contributions of each term [30]. It is assumed in the OLS algorithm that the regression matrix U can be

orthogonally decomposed as U = WA, where A is a M by M upper triangular matrix (i.e., Aij = 0 if i > j). W is a

N by M matrix with orthogonal columns in the sense that WTW = D is a diagonal matrix (N is the length of the

5
y vector and M is the number of repressors). After this decomposition, the OLS auxiliary parameter vector g can

be calculate as [25]:

g = D-1WT y (4)

where gi represents the corresponding element of the OLS solution vector. The output variance (yTy)/N can be

described as:

M
y T y   g i2 wiT wi  eT e. (5)
i 1

Therefore, the error reduction ratio [err]i of the Ui term can be expressed as:

2 T
err i  g i wT i w . (6)
y y

This ratio offers a simple mean for order and selects the model terms of a linear-in-parameters model on the

basis of their contribution to the performance of the model.

2.3 Hybrid Genetic Programming-Orthogonal Least Squares

The application of OLS in the GP algorithm leads to significant improvements in the performance of GP [25].

The main feature of this hybrid approach is to transform the trees to simpler trees which are more transparent,

but their accuracies are close to the original trees. In this coupled algorithm, GP generates a lot of potential

solutions in the form of a tree structure during the GP operation. These trees may have better and worse terms

(subtrees) that contribute more or less to the accuracy of the model represented by the tree. OLS is used to

estimate the contribution of the branches of the tree to the accuracy of the model, whereas, using the OLS, one

can select the less significant terms in a linear regression problem. According to this strategy, terms (subtrees)

having the smallest error reduction ratio are eliminated from the tree [25, 27]. This “tree pruning” approach is

realized in every fitness evaluation before the calculation of the fitness values of the trees. Since GP works with

the tree structure, the further goal is to preserve the original structure of the trees as far as it possible. The

GP/OLS method always guarantees that the elimination of one or more function terms of the model can be done

by pruning the corresponding subtrees, so there is no need for structural rearrangement of the tree after this

operation. The way the GP/OLS method works on its basis is simply demonstrated in Fig. 6. Assume that the

function which must be identified is y(x) = 0.8 (u x−1)2 + 1.2yx−1 − 0.9yx−2 − 0.2. As can be seen in Fig. 5, the GP

6
algorithm finds a solution with four terms: (ux−1)2, yx−1, yx−2, ux−1 × ux−2. Based on the OLS algorithm, the sub-

tree with the least error reduction ratio (F 4 = ux−1 × ux−2) is eliminated from the tree. Subsequently, the error

reduction ratios and mean square error values (and model parameters) are calculated again. The new model

(after pruning) may have a higher mean square error but it obviously has a more adequate structure [25].

Fig. 6 Pruning of a tree with OLS [25]

3 Modeling of Effective Angle of Shearing Resistance

The angle of shearing resistance represents the interlocking among the soil particles. The soils with high

plasticity like clayey soils have lower angle of shearing resistance and higher cohesion. Conversely, as the soil

grain size increases like sands, the soil internal friction angle increases and its cohesion decreases. Therefore, in

a rational manner the main parameters which affect the soil strength parameters will be the soil type, soil

plasticity, and soil density. This study aimed at obtaining meaningful relationships between ' and the

influencing parameters using the GP/OLS approach. The most important factors representing the ' behavior

were selected on the basis of a literature review [3, 6, 21, 31] and after a trial study. The formulation of ' (o)

was considered to be as follows:

   f FC , CC, LL, γ  (7)

where,

FC (%): Fine-grained content

CC (%): Coarse-grained content

LL (%): Liquid limit

γ (gr/cm3): Soil bulk density

The significant influence of these parameters in determining ' is well understood. The best GP/OLS model was

chosen on the basis of a multi-objective strategy as below:

i. The simplicity of the model, although this was not a predominant factor.

ii. Providing the best fitness value on the training set of data.

7
3.1 Parameters for Measuring Performance

Correlation coefficient (R), root mean squared error (RMSE) and mean absolute percent error (MAPE) were

used to evaluate the performance of the models. R, RMSE and MAPE were calculated using the following

equations:

 n h  h  t  t   (8)
  i  1
  i
i  i i  
R
2 2
 n  h  h   n  t  t 
i  1 i i  i  1 i i 

n 2
  hi  ti 
(9)
RMSE  i  1  100
n

 
1 n  hi  ti  (10)
MAPE     100
n i  1 h 
 i 

where, hi and ti are, respectively, the actual and predicted output values for the ith output, hi and t are,
i

respectively, the average of the actual and predicted outputs, and n is the number of samples.

3.2 Experimental Database

Within the scope of this study, a series of consolidated drained (CD) triaxial tests were performed in accordance

with ASTM WK3821 [4] on undisturbed soil samples. The samples were taken from Khorasan and Khouzestan

provinces in Iran. The most versatile test to measure shear strength parameters of soils is the triaxial test. In this

test, a cylinder-shaped specimen, sealed by a rubber membrane, is submitted to an axisymmetric water pressure

and then to an increasing axial loading. Herein, typical dimensions of specimens were 38 mm in diameter and

76 mm in height. The undisturbed samples were taken from drilling boreholes with Shelby tube (thin-walled

metal) in accordance whit the procedures given in ASTM 1587 [32]. All the triaxial tests were carried out using

samples taken at the depth ranging from 30 m to 15 m and contained no gravel or larger particles.

Before conducting the test, the soil samples were saturated, such that pore pressure response to an undrained

isotropical stress increment gave a value of B ≥ 0.95. For this purpose, both cell pressure and back pressure

(saturation water pressure) were applied and then simultaneously increased during saturation. The cell pressure

was always kept 10 kPa above the back pressure so that the accidental swelling of the samples or consolidation

resulting from a high pressure of the specimen was prevented. At the end of the saturation process, specimens

8
were istropically consolidated under constant vertical stresses and certain confining pressures. When 95%

excess pore pressure dissipation was achieved, the consolidation stage was finished. After consolidation, the

specimens were vertically loaded at a strain rate of 0.35mm/min as long as pore water was allowed to drain out.

The loading was continued until the maximum deviator stress was achieved and axial strain was at least 20%.

Each soil was tested for three confining pressure levels. Approximately ten days were required to complete the

each test including saturation and consolidation process. To develop generalized correlations, a comprehensive

database of previously published triaxial tests gathered by Kayadelen et al. [21] and Kayadelen [33] was further

added to the available experimental results. The collected database together with data from the present study (13

data sets) consists of a total of 135 data sets. The complete list of the test results and soil properties are given in

Appendix A (Table 6).

3.3 Data Preprocessing

Some of the soil property variables are fundamentally interdependent. The first step in the analysis of

interdependency of the data is to make a careful study of what it is that these variables are measuring, noting

any highly correlated pairs. High positive or negative correlation coefficients between the pairs may lead to

poor performance of the models and difficulty in interpreting the effects of the explanatory variables on the

response. This interdependency can cause problems in analysis as it will tend to exaggerate the strength of

relationships between variables. This is a simple case commonly known as the problem of multicollinearity

[34]. Thus, the correlation coefficients between all possible pairs were determined and shown in Table 1.

Table 1 Correlation coefficients between all pairs of the explanatory variables

As can be seen in this table, there is a high negative correlation between FC and CC in the operation. This is

apparent since CC is calculated by subtracting FC from 100. Since there was no advantage of having both

variables in the modeling (one can represent the other), decisions were made to remove the correlated

parameters in order to maximize the reliability of the final models. Finally, on the basis of a trial study, the ratio

of FC to CC (FC/CC) was used as the input parameter. The descriptive statistics of the data used in this study

are given in Table 2. To visualize the distribution of the samples, the data are also presented by frequency

histograms (Fig. 7).

9
Table 2 Descriptive statistics of the variables used in the model development

Fig. 7 Histograms of the variables used in the model development

For the GP/OLS analysis, the data sets were randomly divided into training and testing subsets. The training

data were used for learning (genetic evolution). The testing data were used to measure the performance of the

program evolved by GP/OLS on data that played no role in building the model. In order to obtain a consistent

data division, several combinations of the training and testing sets were considered. The selection was such that

the maximum, minimum, mean and standard deviation of parameters were consistent in the training and testing

data sets. Out of the 135 data, 108 data were used as the training data and 27 data for the testing of the

generalization capability of the models.

Although normalization is not strictly necessary in the GP-based analysis, better results are usually reached after

normalizing the variables. Further, normalization speeds up the process [25]. These are mainly due to the

influence of unification of the variables, no matter their range of variation. Thus, both input and output variables

were normalized in this study. After controlling several normalization methods [35, 36], the following method

was used to normalize the variables to a range of [L, U]:

X n = ax + b (11)

where

a = (U−L)/ (Xmax−Xmin), and b = U − aXmax, in which Xmax and Xmin are the maximum and minimum values of the

variable, and Xn is the normalized value. In the present study, L = 0.05 and U = 0.95.

3.4 Model Development Using GP/OLS

The available database was used for generating a GP/OLS prediction model relating ' to FC/CC, LL, and γ.

Various parameters are involved in the GP/OLS predictive algorithm. The parameter settings for the GP/OLS

algorithm are shown in Table 3. In this study, basic arithmetic operators were utilized to get the optimum

GP/OLS model. The number of programs in the population that GP/OLS will evolve is set by the population

size. A run will take longer with a larger population size. The number of generation sets the number of levels

the algorithm will use before the run terminates. The proper number of population and generation depends on

10
the number of possible solutions and complexity of the problem. A relatively large number of generations were

tested to find models with minimum error. The program was run until the runs terminated automatically. The

values of the other involved parameters were selected based on some previously suggested values [24, 25, 37]

and also after a trial and error approach.

Table 3 Parameter settings for the GP/OLS algorithm

3.4.1 GP/OLS-Based Formulation of Angle of Shearing Resistance

The optimal formulation of the effective angle of shearing resistance (') is as given below:

   FC   
 / OLS     27.47 - 0.0589 
GP  - LLn  + 0.8046γ n + 0.77 
2
(12)

   CC  n  

where,

(FC/CC)n = 0.0081(FC/CC)+0.0986

LLn = 0.0105LL-0.1316

γn = 0.9558γ-1.2677

A comparison of the experimental ' values and those predicted by GP/OLS is shown in Fig. 8.

Fig. 8 Experimental versus predicted ' values using the GP/OLS model: (a) training data, (b) testing data

3.5 Model Development Using Traditional GP

A tree-based GP analysis was performed to compare the hybrid GP and OLS technique (GP/OLS) with a

classical GP approach. The tree-based GP model was developed using the same variables and same data sets as

the GP/OLS model. Various parameters involved in the traditional GP predictive algorithm are shown in Table

4. The parameters were selected considering some previously suggested values [17] and also after a trial and

error approach. A large number of generations were tested to find a model with minimum error. A tree-based

GP software, GPLAB [38] in conjunction with subroutines coded in MATLAB was used in this study.

Table 4 Parameter settings for the traditional GP algorithm

11
3.5.1 Traditional GP-Based Formulation of Angle of Shearing Resistance

The prediction model for ', for the best result by the traditional GP algorithm, is as given below:

   
l GP    27.47  n -  n  LLn n - LLn  n - 1
  FC 
Traditiona
   - LLn - γn +  n  + 0.55455 
 2
(13)
   CC  n  

in which,

(FC/CC)n = 0.0081(FC/CC)+0.0986

LLn = 0.0105LL-0.1316

γn = 0.9558γ-1.2677

A comparison of the experimental ' values and those predicted by traditional GP is shown in Fig. 9.

Fig. 9 Experimental versus predicted ' values using the traditional GP model: (a) training data, (b) testing data

3.6 Model Development Using Regression Analysis

A multivariable least squares regression (MLSR) [39] analysis was performed to have an idea about the

predictive power of the best GP/OLS model, in comparison with a classical statistical approach. The method of

LSR is extensively used in regression analysis primarily because of its interesting nature. Under certain

assumptions, LSR has some attractive statistical properties that have made it as a member of the most powerful

and popular methods of regression analysis. The major task was to determine the MLSR-based equation

connecting the input variables to the output variable as:

FC
   α1  α 2 LL  α3  α 4 (14)
CC

where a denotes coefficient vector. LSR minimizes the sum-of-squared residuals for each equation, accounting

for any cross-equation restrictions on the parameters of the system. If there are no such restrictions, this

technique is identical to estimating each equation using single-equation ordinary least squares. The MLSR

model was trained and tested using the same data sets previously considered for developing the GP/OLS model.

Eviews software package [40] was used to perform the regression analysis.

12
3.6.1 MLSR-Based Formulation of Angle of Shearing Resistance

The MLSR-based formulation of ' is as given below:

φMLSR    -0.0137 CC
 FC
+ 0.0256 LL + 18.783CC - 8.458 (15)

and FC/CC, LL and γ are the predictor variables. Fig. 10 shows a comparison between the experimental '

values and the values predicted by MLSR. The resulting Fisher value of the performed regression analyses is

equal to 59.6.

Fig. 10 Experimental versus predicted ' values using the MLSR model: (a) training data, (b) testing data

4 Performance Analysis

Different correlations were developed for the estimation of ' upon a reliable database. Comparisons of the '

predictions made by the GP/OLS, traditional GP and MLSR models are presented in Fig. 11. No rational model

for the prediction of ' has been found that encompass the influencing variables considered in this study. Thus,

it was not possible to conduct a comparative study between the results of this research and those in hand.

It is known that if the R value provided by a model is higher than 0.8 and the error values (e.g., RMSE and

MAPE) are low, the predicted and measured values are strongly correlated with each other [41, 42]. It can be

observed from Figs. 8 and 11 that the GP/OLS model with high R and low RMSE and MAPE values is able to

predict the target values to an acceptable degree of accuracy. The performance of the GP/OLS model on the

testing data is better compared with that on the training data. This indicates that the GP/OLS model has a very

good generalization performance. The amount of data used for the training process is an important issue, as it

bears heavily on the reliability of the final models [42]. To cope with this limitation, Frank and Todeschini [43]

argue that the minimum ratio of the number of objects over the number of selected variables for model

acceptability is 3. It is also suggested that considering a higher ratio equal to 5 is safer. In the present study, this

ratio is much higher and is equal to 135/3 = 45. Additionally, new criteria proposed by Golbraikh and Tropsha

[44] were checked for the external validation of the GP/OLS model on the testing data sets. It is recommended

that at least one slope of regression lines (k or k') through the origin should be close to 1. Recently, Roy and

Roy [45] introduced a confirm indicator of the external predictability of models (R m). For Rm > 0.5, the

13
condition is satisfied. Furthermore, the squared correlation coefficient between the predicted and measured

values (Ro2), and the correlation coefficient between the measured and predicted values (Ro' 2) should be close

to 1 [42]. The validation criteria and the relevant results obtained by the model are presented in Table 5. As it is

seen, the derived model satisfies nearly all of the required conditions. The only exception is for the Rm criterion.

In this case, the proposed model marginally fails to satisfy the condition (R m = 0.483). The validation phase

ensures that the GP/OLS model is valid, has the prediction power and is not established by chance.

Table 5 Statistical parameters of the GP/OLS model for the external validation

As shown in Figs. 8, 9 and 11, the GP/OLS-based model has produced better results than the traditional GP

model. This reveals that applying the OLS strategy into the GP process (GP/OLS) improves the efficiency of

the traditional GP. Because of the tree pruning process, the GP/OLS equation is structurally simpler in

comparison with the equation evolved by the traditional GP. The GP/OLS model can be used for routine design

practice via hand calculations.

It can also be seen from Figs. 8, 10 and 11 that the GP/OLS model produces remarkably better outcomes than

the empirical MLSR model. The significant limitations of empirical modeling based on the statistical techniques

strongly affect the prediction capabilities of the regression-based equations [14, 24]. Conventional regression

models often assume a linear relationship between the output and the predictor variables, which is not always

true. In most cases, the best models developed using the commonly used regression approach are obtained after

controlling just some equations established in advance. Thus, they cannot efficiently consider the interactions

between the dependent and independent variables [14]. On the other hand, GP/OLS introduces completely new

characteristics and traits. One of the major advantages of the GP/OLS approach over the traditional regression

analysis is its ability to derive explicit relationships for a problem without assuming prior forms of the existing

relationships. The best equations evolved by this technique are determined after controlling numerous

preliminary models. However, it is notable that the GP-based methods are extremely parameter sensitive,

especially when difficult experimental training data sets like the one used in this paper are employed. Using any

form of optimally controlling the parameters of the run (e.g., GAs), can improve the performance of their

algorithms [24]. In this context, further research can be focused on hybridizing GP with other optimization

algorithms such as Ant Colony or Tabu Search.

14
Fig. 11 Comparison of the ' predictions made by different models: (a) GP/OLS, (b) traditional GP (c) MLSR

5 Sensitivity Analysis

The contribution of each predictor variable in the models evolved by GP/OLS was evaluated through a

sensitivity analysis. For this aim, frequency values [46] of the input variables were obtained. A frequency value

equal to 1.00 for an input indicates that this variable has been appeared in 100% of the best thirty programs

evolved by GP/OLS. This methodology is a common approach in the GP-based analyses [18-20]. The

frequency values of the predictor variables are presented in Fig. 12. According to these results, it can be found

that ' is more dependent on γ than LL and FC/CC. The results comparably agree with those of the ANN model

developed by Kayadelen et al. [21].

Fig. 12 Contributions of the predictor variables in the GP/OLS analysis

6 Parametric Analysis

For further verification of the GP/OLS prediction equation, a parametric analysis was performed in this study.

The parametric analysis investigates the response of the predicted ground-motion parameters from the GP/OLS

models to a set of hypothetical input data generated over the training ranges of the minimum and maximum

data. The methodology is based on changing one predictor variable at a time while the other seismic variables

are kept constant at the average values of their entire data sets. A set of synthetic data for the single varied

parameter is generated by increasing the value of this in increments [24]. These variables are presented to the

prediction model and ' is calculated. This procedure is repeated using another variable until the model response

is tested for all input variables. Fig. 13 presents the tendency of the ' predictions to the variations of the

predictor variables, FC/CC, LL and γ. The results of the parametric analysis indicate that ' continuously

decreases due to increasing FC/CC and increases with increasing LL. As can be seen in Fig. 13(c), the ' value

decreases when γ increases up to 1.4 gr/cm3 and thereafter it starts decreasing.

Fig. 13 Parametric analysis of ' in the GP/OLS model

15
Besides, the ratios of the experimental ' values to the values predicted by the GP/OLS solution, with respect to

FC/CC, LL and γ, is shown in Fig. 14. As the scattering increases in these figures, the accuracy of the model

consequently decreases. It can be observed from these figures that the predictions obtained by the proposed

correlations have a very good accuracy with no significant trend with respect to the design parameters. In the

cases of FC/CC and LL, the scattering slightly decreases with increasing this parameter.

Fig. 14 The ratio between the predicted and experimental ' values with respect to the design parameters

design parameters
7 Conclusions design parameters

In this research, a high-precision model was derived for assessing the effective angle of shearing resistance

using a combined GP and OLS algorithm (GP/OLS). The proposed model was developed based on well

established and widely dispersed triaxial test results obtained from the literature and experimental study

performed in this study. The performance of the model was benchmarked against the standard GP and multiple

regression-based models.

 The developed model gives reliable estimations of the ' values. Introducing the OLS strategy to the GP

process improved the efficiency of the traditional GP. The results indicate that the proposed model

significantly outperforms the regression model.

 The proposed GP/OLS model simultaneously take into account the role of several important

representing the behavior of shear strength parameters.

 The GP/OLS model can be used for practical engineering purposes since it was developed based on

tests conducted on clayey and sandy soils with wide range properties. The proposed model is very

simple. The predictive capability of the derived model is limited to the range of the data used for its

calibration. Despite this limitation, this model can be retrained and improved to make more accurate

predictions for a wider range by adding newer data sets for other soil types and test conditions.

 With the use of the GP/OLS approach, the ' values can be estimated without carrying out sophisticated

and time-consuming laboratory or field tests.

 A finding from the sensitivity analysis results is that the most important parameter governing the '

behavior is the soil bulk density.

16
 The interesting observation from the results of parametric study is that bulk density is positively

correlated with ' just up to about 1.4 gr/cm3 and thereafter the correlation becomes negative.

Appendix A

Table 6 Geotechnical properties of soils used for the model development


design parameters
design parameters

References

1. Mollahasani A, Alavi AH, Gandomi AH, Rashed A (2011) Nonlinear neural-based modeling of soil

cohesion intercept. KSCE J Civil Eng 15(5): 831-840.

2. Arora KR (1988) Introductory soil engineering. text book, 322, pub.: Nem Chand Jane (Prop.), Standard

Publishers Distributors, 1705- NaiSarak, Delhi

3. Murthy S (2008) Geotechnical Engineering: Principles and Practices of Soil Mechanics, 2008, 2nd

Edition, CRC Press, Taylor & Francis, UK

4. ASTM WK3821 New test method for consolidated drained triaxial compression test for soils

5. ASTM D 6528 Consolidated undrained direct simple shear testing of cohesive soils

6. El-Maksoud MAF (2006) Laboratory determining of Soil Strength Parameters in Calcareous Soils and

Their Effect on Chiseling Draft Prediction. In: Proceedings of Energy Efficiency and Agricultural

Engineering International Conference, Vol. 9, Rousse, Bulgaria

7. Bowles JE (1992) Engineering properties of soils and their measurement (4th ed.). New York, NY,

McGraw-Hill

8. Korayem AY, Ismail KM, Sehari SQ (1996) Prediction of soil shear strength and penetration resistance

using some soil properties. Missouri J Agric Res 13(4):119–140

9. Panwar JS, Seimens JC (1972) Shear strength and energy of soil failure related to density and moisture.

T ASAE 15:423–427

10. Terzaghi K, Peck RB, Mesri G (1996) Soil mechanics in engineering practice (2 nd ed.). Wiley & Sons,

New York

17
11. Shahin MA, Maier HR, Jaksa MB (2001) Artificial neural network applications in geotechnical

engineering. Aus Geomech 36(1):49–62

12. Alavi AH, Gandomi AH, Mollahasani A, Heshmati AAR (2010) Modeling of maximum dry density

and optimum moisture content of stabilized soil using artificial neural networks. J Plant Nutr Soil Sci,

173(3): 368-379.

13. Heshmati AAR, AH Alavi, M Keramati, AH Gandomi (2009) A radial basis function neural network

approach for compressive strength prediction of stabilized soil. Geotechnical Special Publication ASCE

191:147-153

14. Alavi AH, Ameri M, Gandomi AH, Mirzahosseini MR (2011) Formulation of flow number of asphalt

mixes using a hybrid computational method. Constr Build Mater 25(3): 1338–1355.

15. Koza J (1992) Genetic programming, on the programming of computers by means of natural selection.

Cambridge (MA), MIT Press

16. Banzhaf W, Nordin P, Keller R, Francone F (1998) Genetic programming – an introduction. on the

automatic evolution of computer programs and its application. dpunkt/Morgan Kaufmann:

Heidelberg/San Francisco

17. Johari A, Habibagahi G, Ghahramani (2006) A prediction of soil–water characteristic curve using

genetic programming. J Geotech Geoenviron ASCE 132(5):661-65

18. Gandomi AH, Alavi AH, Sahab MG, Arjmandi P (2010) Formulation of elastic modulus of concrete

using linear genetic programming. J Mech Sci Tech 24(6): 1011-1017.

19. Gandomi AH, Alavi AH, Sahab MG (2009) New formulation for compressive strength of CFRP

confined concrete cylinders using linear genetic programming. Mater Struct 43(7): 963-983.

20. Alavi AH, Gandomi AH, Sahab MG, Gandomi M (2010) Multi expression programming: a new

approach to formulation of soil classification. Eng Comput 26(2): 111-118.

21. Kayadelen C, Günaydın O, Fener M, Demir A, Özvan A (2009) A Modeling of the angle of shearing

resistance of soils using soft computing systems. Expert Syst Appl 36:11814–11826

22. Billings S, Korenberg M, Chen S (1988) Identification of nonlinear outputaffine systems using an

orthogonal least-squares algorithm. Int J Syst Sci 19(8):1559–1568

18
23. Chen S, Billings S, Luo W (1989) Orthogonal least squares methods and their application to non-linear

system identification. Int J Cont 50(5):1873–1896

24. Gandomi AH, Alavi AH, Mousavi M, Tabatabaei SM (2011) A hybrid computational approach to

derive new ground-motion attenuation models. Eng Appl Artif Int 24(4): 717–732.

25. Gandomi AH, Alavi AH, Arjmandi P, Aghaeifar A, Seyednoor M (2010) Genetic programming and

orthogonal least squares: a hybrid approach to modeling the compressive strength of CFRP-Confined

concrete cylinders. J Mech Mater Struct 5(5), 735–753.

26. Madar J, Abonyi J, Szeifert F (2005) Genetic programming for the identification of nonlinear input-

output models. Ind Eng Chem Res 44(9):3178–3186

27. Pearson R (2003) Selecting nonlinear model structures for computer control. J Process Contr 13(1):1–

26

28. Reeves CR (1997) Genetic algorithm for the operations research. INFORMS J Comput 9:231–250

29. Madár J, Abonyi J, Szeifert F (2004) Genetic programming for system identification. In: Proceedings of

Intelligent Systems Design and Applications (ISDA 2004) Conference, Budapest, Hungary

30. Cao H, Yu J, Kang L, Chen Y (1999) The kinetic evolutionary modelling of complex systems of

chemical reactions. Comput Chem Eng 23(1):143–151

31. Barends FBJ, Lindenberg JL, DeQuelerij L, Verruijt A, Luger HJ (1999) Geotechnical Engineering for

Transportation Infrastructure: Theory and Practice, Planning and Design, Construction and

Maintenance. AA Balkema Publishers, Rotterdam, Netherlands

32. ASTM D 1587 Standard practice for thin-walled tube sampling of soils for geotechnical purposes

33. Kayadelen C (2008) Estimation of effective stress parameter of unsaturated soils by using artificial

neural networks. Int J Numer Anal Meth Geomec 32:1087–106.

34. Dunlop P, Smith S (2003) Estimating key characteristics of the concrete delivery and placement process

using linear regression analysis. Civil Eng Environ Syst 20,273–290

35. Swingler K (1996) Applying neural networks a practical guide. Academic Press, New York

36. Rafiq MY, Bugmann G, Easterbrook DJ (2001) Neural network design for engineering applications,

Comput Struct 79(17):1541–1552

19
37. Madár J, Abonyi J, Szeifert F (2005b) Genetic Programming for the Identification of Nonlinear Input-

Output Models, white paper. Retrieved September 05, 2009, from

http://www.fmt.vein.hu/softcomp/gp/ie049626e.pdf

38. Silva S (2007) GPLAB, a genetic programming toolbox for MATLAB, ITQB/UNL[M]. {http://gplab.

sourceforge. net}

39. Ryan TP (1997) Modern Regression Methods. New York (NY), Wiley

40. Maravall A, Gomez V (2004) EViews Software, Ver. 5. Quantitative Micro Software, LLC, Irvine CA

41. Smith GN (1986) Probability and statistics in civil engineering. Collins, London

42. Mollahasani A, Alavi AH, Gandomi AH, (2011) Empirical modeling of plate load test moduli of soil

via gene expression programming. Comput Geotech 38(2): 281-286.

43. Frank IE, Todeschini R (1994) The data analysis handbook. Elsevier, Amsterdam, Netherlands.

44. Golbraikh A, Tropsha A. Beware of q2. J Mol Graph Model 2002;20:269–76.

45. Roy PP, Roy K. On some aspects of variable selection for partial least squares regression models.

QSAR Comb Sci 2008;27:302–13.

46. Francone F (2001) Discipulus Owner„s Manual, Version 4.0. Register Machine Learning Technologies

20
LIST OF TABLES

Table 1 Correlation coefficients between all pairs of the explanatory variables


Table 2 Descriptive statistics of the variables used in the model development
Table 3 Parameter settings for the GP/OLS algorithm
Table 4 Parameter settings for the traditional GP algorithm
Table 5 Statistical parameters of the GP/OLS model for the external validation
Table 6 Geotechnical properties of soils used for the model development

Table 1 Correlation coefficients between all pairs of the explanatory variables


Variable FC (%) CC (%) LL (%) γ (gr/cm3)
FC (%) 1.00 -1.00 0.28 -0.21
CC (%) -1.00 1.00 -0.28 0.21
LL (%) 0.28 -0.28 1.00 -0.58
γ (gr/cm )
3
-0.21 0.21 -0.58 1.00

Table 2 Descriptive statistics of the variables used in the model development


Inputs Output
Parameters FC (%) CC (%) LL (%) γ (gr/cm )
3
' (o)
Mean 38.68 61.39 41.00 1.81 26.41
Standard Error 1.76 1.77 0.96 0.01 0.29
Median 40.00 60.00 37.00 1.81 26.00
Mode 28.00 72.00 36.00 1.81 26.00
Standard Deviation 20.50 20.51 11.19 0.14 3.39
Sample Variance 420.28 420.72 125.16 0.02 11.46
Kurtosis -0.63 -0.64 4.27 0.83 2.75
Skewness -0.09 0.09 1.43 0.03 0.99
Range 84 84.00 76 0.837 22
Minimum 1 15.00 22 1.431 18
Maximum 85 99.00 98 2.268 40
Table 3 Parameter settings for the GP/OLS algorithm
Parameter Settings

Function set +, -, ×, /
Population size 500-1000
Maximum tree depth 64
Maximum number of evaluated individuals 250
Maximum number of evaluated individuals 250
Generation 100
Type of selection Roulette-wheel
Type of mutation Point-mutation
Type of crossover One-point (2 parents)
Type of replacement Elitist
Probability of crossover 0.5
Probability of mutation 0.5
Probability of changing terminal–non-terminal nodes (vice versa) during mutation 0.25

Table 4 Parameter settings for the traditional GP algorithm


Parameter Settings
Function set +, -, ×, /
Population size 100-1000
Maximum tree depth 10
Total generations 4000
Initial population Ramped half-and-half
Sampling Tournament
Expected no. of offspring method Rank 89
Fitness function error type linear error function
Termination Generation 40
Minimum probability of crossover 0.1
Minimum probability of mutation 0.1
Real max level 30
Survival mechanism Keep best
Table 5 Statistical parameters of the GP/OLS model for the external validation

Item Formula Condition GP/OLS


1 R R > 0.8 0.909

2 k

in 1 hi  ti  0.85 < K < 1.15 1.005
h2
i

3 k 

in 1 hi  ti  0.85 < K' < 1.15 0.991
t2
i
6 Rm  R 2  (1  R 2  Ro 2 ) Rm > 0.5 0.483

Ro  1 
2 
in1 ti  hi
o

2
, hio  k  ti
t 
where 2 Should be Close to 1 0.999
 n
i 1 i  ti

Ro 2  1 

in1 hi  ti
o

2
, tio  k   hi
h  h  2 Should be Close to 1 0.995
 n
i 1 i i

Table 6 Geotechnical properties of soils used for the model development


Test No. FC (%) CC (%) LL (%) γ (gr/cm3) 'Exp. (o) ' GP/OLS (o) ' Traditional GP (o) 'MLSR (o)
1 77.4 22.6 36.4 1.810 27.1 26.1 26.3 26.4
2 81.3 18.7 45.8 1.870 28.4 27.5 28.0 27.8
3 56.3 43.7 24.2 2.070 32.2 32.3 31.4 31.0
4 97.1 2.9 34.7 1.970 30.8 29.3 29.1 29.0
5 96.9 3.1 29 1.990 29.6 29.7 29.3 29.2
6 86.4 13.6 33.9 1.930 28.8 28.6 28.5 28.6
7 72 28 26 2.030 32 31.2 30.4 30.3
8 96.5 3.5 40.4 1.950 30.5 28.9 29.0 28.8
9 68.6 31.4 43.2 1.880 30.1 27.7 28.1 27.9
10 87 13 28 1.960 36 29.3 28.8 29.0
11 81.9 18.1 37.7 1.950 30.7 29.2 29.2 29.1
12 98.8 1.2 35.8 1.910 29.5 27.2 27.2 27.2
13 79.2 20.8 43.5 1.950 29.1 29.3 29.5 29.2
14 76 24 39 1.781 26 25.6 25.9 25.9
15 79 21 36 1.801 27 25.9 26.1 26.2
16 52 48 44 1.804 25 26.1 26.7 26.5
17 72 29 28 1.800 28 25.7 25.6 26.0
18 50 51 48 1.828 25 26.7 27.4 27.1
19 59 41 43 1.765 25 25.4 25.9 25.8
20 35 65 35 1.705 26 24.3 24.2 24.5
21 76 24 34 1.715 26 24.4 24.3 24.6
22 74 27 47 1.814 25 26.3 27.0 26.8
23 63 37 34 1.831 27 26.5 26.6 26.8
24 64 37 55 1.532 23 22.6 21.5 21.7
25 49 51 35 1.831 29 26.5 26.6 26.8
26 41 59 42 1.826 28 26.5 27.0 26.9
27 24 76 35 1.902 26 28.0 28.1 28.2
28 72 28 36 1.760 26 25.1 25.3 25.5
29 42 58 38 1.841 26 26.7 27.0 27.1
30 33 67 47 1.848 28 27.1 27.7 27.4
31 27 73 34 1.946 28 29.1 28.9 29.0
32 38 62 28 1.922 28 28.4 28.0 28.4
33 37 63 39 1.882 25 27.7 27.9 27.9
34 50 50 46 1.792 26 25.9 26.6 26.4
35 64 36 35 1.726 21 24.6 24.6 24.8
36 66 34 35 1.728 23 24.6 24.6 24.9
37 45 55 37 1.837 24 26.6 26.9 27.0
38 95 5 57 2.093 35 33.3 33.3 32.1
39 53 48 35 1.979 26 29.9 29.7 29.6
40 95 5 49 1.865 29 27.2 27.8 27.6
41 51 49 28 1.966 27 29.5 29.0 29.2
42 45 55 47 1.809 27 26.3 27.0 26.7
43 31 69 46 1.739 26 25.0 25.6 25.4
44 27 73 25 2.020 27 30.9 30.1 30.1
45 61 39 38 1.910 26 28.3 28.4 28.4
46 53 47 48 1.778 24 25.7 26.4 26.2
47 54 46 47 1.826 26 26.6 27.3 27.0
48 55 45 98 1.778 26 26.5 29.6 27.4
49 51 49 32 1.845 29 26.7 26.7 27.0
50 35 65 25 1.918 30 28.2 27.7 28.2
51 72 28 36 1.638 22 23.3 23.0 23.2
52 32 68 36 1.813 28 26.2 26.3 26.5
53 58 43 30 1.799 29 25.8 25.7 26.1
54 61 39 31 1.806 25 25.9 25.9 26.2
55 33 67 49 1.737 23 25.0 25.7 25.4
56 17 83 32 2.233 39 37.9 36.9 34.3
57 94 6 65 1.550 23 22.7 22.1 22.1
58 58 42 29 1.953 31 29.2 28.8 28.9
59 60 40 53 1.641 25 23.7 23.9 23.7
60 33 67 34 1.881 30 27.5 27.6 27.7
61 42 58 31 1.970 28 29.6 29.3 29.3
62 72 28 36 1.673 23 23.8 23.6 23.9
63 42 58 36 1.912 27 28.3 28.3 28.4
64 65 35 42 1.660 26 23.7 23.7 23.8
65 96 4 37 1.638 22 23.1 22.8 22.9
66 44 56 40 1.886 25 27.8 28.0 28.0
67 51 49 33 1.745 30 24.9 24.9 25.1
68 98 2 63 1.638 22 23.2 23.2 23.3
69 47 53 33 1.832 24 26.5 26.5 26.8
70 57 43 48 1.743 25 25.1 25.7 25.5
71 75 25 56 1.572 24 22.9 22.5 22.5
72 65 35 47 1.651 25 23.7 23.8 23.7
73 83 17 32 1.806 24 25.9 25.9 26.2
74 67 33 39 1.883 29 27.7 27.9 27.9
75 65 35 36 1.758 27 25.1 25.3 25.5
76 40 60 40 1.790 24 25.8 26.2 26.2
77 67 33 37 1.770 27 25.3 25.6 25.7
78 64 36 34 1.726 21 24.6 24.6 24.8
79 56 44 46 1.720 25 24.7 25.2 25.0
80 75 25 33 1.883 29 27.5 27.5 27.7
81 36 64 40 1.748 24 25.0 25.4 25.4
82 15 85 23 2.016 29 30.7 29.9 30.0
83 72 28 32 1.806 24 25.9 25.9 26.2
84 19 81 35 1.926 27 28.6 28.6 28.6
85 49 51 45 1.832 26 26.7 27.3 27.1
86 57 43 46 1.739 24 25.0 25.5 25.4
87 99 1 71 1.655 23 22.9 22.4 23.1
88 99 1 55 1.533 19 21.3 20.0 20.4
89 75 25 55 1.467 22 22.1 19.8 20.5
90 99 1 71 1.655 23 22.9 22.4 23.1
91 56 44 49 1.649 26 23.7 23.9 23.8
92 35 65 37 1.958 28 29.4 29.4 29.3
93 72 28 36 1.725 25 24.6 24.6 24.8
94 66 34 35 1.728 23 24.6 24.6 24.9
95 60 40 47 1.918 30 28.6 29.1 28.8
96 38 62 38 1.822 27 26.4 26.7 26.7
97 42 58 41 1.812 25 26.2 26.6 26.6
98 57 43 53 1.640 24 23.6 23.9 23.7
99 50 50 49 1.689 24 24.2 24.7 24.5
100 72 28 36 1.813 27 26.1 26.3 26.5
101 61 39 51 1.718 24 24.7 25.4 25.1
102 42 58 36 1.912 27 28.3 28.3 28.4
103 37 63 49 1.702 28 24.4 25.0 24.8
104 66 34 49 1.609 26 23.2 23.0 23.0
105 92 8 60 1.498 18 22.3 20.6 21.1
106 55 45 36 1.885 27 27.7 27.8 27.9
107 45 55 51 1.668 25 24.0 24.4 24.2
108 75 25 55 1.467 22 22.1 19.8 20.5
109 60 40 33 1.838 24 26.6 26.6 26.9
110 60 40 33 1.823 27 26.3 26.3 26.6
111 31 69 51 1.872 29 27.6 28.5 28.0
112 74 26 57 1.431 22 21.9 18.8 19.8
113 80 21 54 1.552 23 22.7 21.9 22.0
114 56 44 53 1.776 26 25.7 26.7 26.2
115 98 2 65 1.743 25 24.7 25.4 25.3
116 93 7 58 1.613 21 23.3 23.3 23.1
117 61 39 35 1.770 24 25.3 25.4 25.7
118 54 46 35 1.914 27 28.3 28.3 28.4
119 60 40 55 1.704 26 24.6 25.4 24.9
120 67 33 37 1.770 27 25.3 25.6 25.7
121 91 9 32 2.000 28 30.3 29.9 29.8
122 47 53 41 1.785 28 25.7 26.1 26.1
123 48 53 22 1.848 28 26.6 26.1 26.8
124 87 14 33 1.988 32 30.1 29.7 29.6
125 61 39 35 1.770 24 25.3 25.4 25.7
126 84 16 27 2.268 40 39.0 38.1 34.8
127 52 48 29 1.904 25 28.0 27.7 28.0
128 91 9 32 2.058 34 32.0 31.4 30.9
129 53 47 27 1.962 27 29.4 28.8 29.1
130 47 53 43 1.759 27 25.3 25.8 25.7
131 75 25 51 1.708 24 24.5 25.2 24.9
132 72 28 36 1.708 24 24.3 24.3 24.5
133 37 63 30 1.913 26 28.2 28.0 28.2
134 95 5 49 1.883 29 27.6 28.2 27.9
135 50 50 44 1.806 27 26.1 26.7 26.6
Bold sets are test sets.
The first 13 datasets represent the test results obtained in this study.
LIST OF FIGURES

Fig. 1 Mohr circles and failure envelopes in terms of total and effective stresses [3]
Fig. 2 The tree representation of a GP model ((√(x - 1))
Fig. 3 Crossover operation in GP
Fig. 4 Mutation operation in GP
Fig. 5 Decomposition of a tree to function terms [29]
Fig. 6 Pruning of a tree with OLS [25]
Fig. 7 Histograms of the variables used in the model development
Fig. 8 Experimental versus predicted ' values using the GP/OLS model: (a) training data, (b) testing data
Fig. 9 Experimental versus predicted ' values using the traditional GP model: (a) training data, (b) testing data
Fig. 10 Experimental versus predicted ' values using the MLSR model: (a) training data, (b) testing data
Fig. 11 Comparison of the ' predictions made by different models: (a) GP/OLS, (b) traditional GP (c) MLSR
Fig. 12 Contributions of the predictor variables in the GP/OLS analysis
Fig. 13 Parametric analysis of ' in the GP/OLS model
Fig. 14 The ratio between the predicted and experimental ' values with respect to the design parameters

Fig. 1 Mohr circles and failure envelopes in terms of total and effective stresses [3]
√ Root Node

- Functional Node

x 1

Terminal Nodes

Fig. 2 The tree representation of a GP model ((√(x - 1))

√ √ √ √

/ + / +
Crossover
x1 x2 √ 2 √ x2 x1 2

x2 x2

Parent 1 Parent 2 Child 1 Child 2

Fig. 3 Crossover operation in GP

√ √

- +
Mutation
x1 x2 x1 x2

Fig. 4 Mutation operation in GP


+
+ /

x0 x1 + x0
C D
A x2 x1
B

Fig. 5 Decomposition of a tree to function terms [29]

+ +
+ + Pruning yx-2
+
F3
× yx-1 yx-2 × yx-1
×
F2 F3
ux-1 ux-1 ux-1 ux-2 F2
ux-1 ux-1
F1 F4 F1

Fig. 6 Pruning of a tree with OLS [25]


(a) (b)
30 Frequency 100% 30 Frequency 100%
Cumulative Cumulative
80% 80%
Frequency

Frequency
20 20
60% 60%

40% 40%
10 10
20% 20%

0 0% 0 0%

CC (%) FC (%)

(c) (d)
40 100% 40 100%

80% 80%
30 30
Frequency

Frequency
Frequency Frequency
Cumulative 60% 60%
Cumulative
20 20
40% 40%
10 10
20% 20%

0 0% 0 0%

LL (%) γ (gr/cm3)

(e)
40 100%

80%
30
Frequency

Frequency 60%
20 Cumulative
40%
10
20%

0 0%

' (o)

Fig. 7 Histograms of the variables used in the model development


50 50
(a) (b)

Predicted ' Value (Degree)


Predicted ' Value (Degree)
45 45

40 Ideal fit 40
Ideal fit

35 35

30 30

25 25
R = 0.804 R = 0.909
20 RMSE = 189.09 20 RMSE = 160.61
MAPE = 5.85 MAPE = 5.12
15 15
15 20 25 30 35 40 45 50 15 20 25 30 35 40 45 50
Experimental ' Value (Degree) Experimental ' Value (Degree)

Fig. 8 Experimental versus predicted ' values using the GP/OLS model: (a) training data, (b) testing data

50 50
(a) (b)
Predicted ' Value (Degree)
Predicted ' Value (Degree)

45 45

40 Ideal fit 40
Ideal fit

35 35

30 30

25 25
R = 0.791 R = 0.897
20 RMSE = 195.01 20 RMSE = 171.08
MAPE = 6.18 MAPE = 5.69
15 15
15 20 25 30 35 40 45 50 15 20 25 30 35 40 45 50
Experimental ' Value (Degree) Experimental ' Value (Degree)

Fig. 9 Experimental versus predicted ' values using the traditional GP model: (a) training data, (b) testing data

50 50
(a) (b)
Predicted ' Value (Degree)
Predicted ' Value (Degree)

45 45

40 Ideal fit 40
Ideal fit

35 35

30 30

25 25
R = 0.789 R = 0.874
20 RMSE = 195.68 20 RMSE = 194.54
MAPE = 6.13 MAPE = 5.90
15 15
15 20 25 30 35 40 45 50 15 20 25 30 35 40 45 50
Experimental ' Value (Degree) Experimental ' Value (Degree)

Fig. 10 Experimental versus predicted ' values using the MLSR model: (a) training data, (b) testing data
50 50
Experimental GP/OLS Experimental Traditional GP
' (o)

' (o)
40 40

30 30

20 20

10 R = 0.835 10 R = 0.822
RMSE = 183.75 RMSE = 190.47
MAPE = 5.71 (a) MAPE = 6.08 (b)
0 0
1 21 41 61 81 101 121 141 1 21 41 61 81 101 121 141
Test No. Test No.

50
Experimental MLSR
' (o)

40

30

20

10 R = 0.813
RMSE = 195.45
MAPE = 6.08 (c)
0
1 21 41 61 81 101 121 141
Test No.

Fig. 11 Comparison of the ' predictions made by different models: (a) GP/OLS, (b) traditional GP (c) MLSR

1
Frequcency

0.8

0.6

0.4

0.2

FC/CC LL (%) γ (gr/cm3)

Fig. 12 Contributions of the predictor variables in the GP/OLS analysis


26.2 27.2

26.0 27.0
26.8
25.8
' (Degree)

' (Degree)
26.6
25.6 26.4
25.4 26.2

25.2 26.0
25.8
25.0
25.6
24.8 25.4
(a) (b)
24.6 25.2
0 20 40 60 80 100 120 0 20 40 60 80 100 120

FC/CC LL (%)

70.0

60.0

50.0
' (Degree)

40.0

30.0

20.0

10.0
(c)
0.0
0.0 0.5 1.0 1.5 2.0 2.5 3.0

γ (gr/cm3)
Fig. 13 Parametric analysis of ' in the GP/OLS model

1.8 1.8
(a) (b)
1.6 1.6
'EXP / 'GP/OLS

'EXP / 'GP/OLS

1.4 1.4
1.2 1.2
1 1
0.8 0.8
0.6 0.6
0 20 40 60 80 100 120 0 20 40 60 80 100 120
FC/CC LL (%)

1.8
(c)
1.6
'EXP / 'GP/OLS

1.4
1.2
1
0.8
0.6
1.25 1.5 1.75 2 2.25 2.5
γ (gr/cm3)

Fig. 14 The ratio between the predicted and experimental ' values with respect to the design parameters

View publication stats

You might also like