
Tunnelling and Underground Space Technology 38 (2013) 368–376


Predicting tunnel convergence using Multivariate Adaptive Regression Spline and Artificial Neural Network
Amoussou-Coffi Adoko a, Yu-Yong Jiao a,⇑, Li Wu b, Hao Wang a, Zi-Hao Wang a
a State Key Laboratory of Geomechanics and Geotechnical Engineering, Institute of Rock and Soil Mechanics, Chinese Academy of Sciences, Wuhan 430071, China
b Faculty of Engineering, China University of Geosciences, Wuhan 430074, China

Article info

Article history:
Received 20 March 2013
Received in revised form 7 July 2013
Accepted 17 July 2013
Available online 17 August 2013

Keywords:
Tunnel convergence
New Austrian Tunneling Method (NATM)
Multivariate Adaptive Regression Spline
Artificial Neural Network
Predictive model

Abstract

Determining the tunnel convergence is an indispensable task in tunneling, especially when adopting the New Austrian Tunneling Method. The interpretation of the monitoring data allows the construction methods to be adjusted in order to achieve more effective tunneling conditions and to avoid problems like rock collapse, trapping and jamming of the boring machine, delay of the project or even geological disasters. In this research, a model capable of predicting the diameter convergence of a high-speed railway tunnel in weak rock was established based on two approaches: Multivariate Adaptive Regression Spline (MARS) and Artificial Neural Network (ANN). A tunnel construction project located in Hunan province (China) was used as a case study. The input parameters included the class index of the surrounding rock mass, angle of internal friction, cohesion, Young's modulus, rock density, tunnel overburden, distance between the monitoring station and the tunnel heading face and the elapsed monitoring time. The performance of the models was evaluated by comparing the predicted convergence to the measured data using several performance indices. Overall, the results showed that both models predict tunnel convergence with high accuracy, with MARS being slightly less accurate. However, MARS was more flexible and computationally efficient. It is concluded that MARS can constitute a reliable alternative to ANN in modeling nonlinear geo-engineering problems such as tunnel convergence.
© 2013 Elsevier Ltd. All rights reserved.

⇑ Corresponding author. Tel.: +86 27 87198299/7560.
E-mail address: (Y.-Y. Jiao).

1. Introduction

The monitoring of tunnel convergence is an essential task in tunnel construction, especially when using the New Austrian Tunneling Method. Tunnel convergence refers to the amount of closure of the tunnel diameter and is due to the loss of the stress–strain equilibrium state of the rock mass caused by the tunnel driving, which results in a redistribution of stress around the excavation and in rock deformations (AFTES, 2002; Zhao et al., 2007). In practice, convergence data obtained through instrumentation of the tunnel and a monitoring program during and after the tunnel construction are analyzed and interpreted for a wide variety of decision-making purposes, such as assessing the rock conditions ahead of the tunnel face, the tunnel stability and its serviceability. Deformations and displacements are measured using geodetic surveying (total stations), laser scanners (profilometers) and tape extensometers. These instruments can be placed at regular intervals between 5 m and 50 m depending on the rock mass properties, and the data can be processed manually or automatically as digital data and then transmitted to the data processing unit or simply recorded to be used later (Kavvadas, 2005; Simeoni and Zanei, 2009). In complex conditions, such as proximity of the foundation of superstructures, bridges, subway lines, riverbed and soft ground, or rehabilitation of old tunnels, automatic tunnel monitoring systems (Chung et al., 2006) and special monitoring methods (Navarro Torres et al., 2011) are implemented.

The interpretation of the processed data allows the adjustment of the construction methods to the actual tunneling conditions. In fact, it has been proved that the tunnel deformation, the excavation and support methods are influenced by the rate of convergence (Galler et al., 2009; Kovari and Amstad, 1994; Schubert et al., 2004; Schubert and Moritz, 2011). Higher convergence and unexpected rock deformation can lead to problems like rock collapse, trapping and jamming of the boring machine, delay of the project or even geological disasters.

Even though displacement monitoring is one of the most reliable means of knowing the amount of tunnel convergence, in some cases other methods employing field data and predictive models can be a useful alternative. Most of the time, displacement data are not available or are insufficient at the earlier stages of design or prior to construction. In such situations, the tunnel convergence can be conveniently predicted based on collected data or information gathered from a previous project with similar ground conditions. In general, a reliable tunnel convergence prediction should be carried out in such a way that both the amount and the trends of convergence can be known with reasonable accuracy


in accordance with the type of project (Schubert and Moritz, 2011; Jiao et al., 2013).

The literature highlights several methods of predicting tunnel convergence, including classical statistical models, the grey forecasting method, probabilistic approaches, time series, fuzzy set models and artificial intelligence methods (Adoko et al., 2011; Dai et al., 2011; Kang and Wang, 2010; Liu et al., 2008; Mahdevari and Torabi, 2012; Mahdevari et al., 2012; Mao et al., 2011; The Professional Standards Compilation Group of People's Republic of China, 2007; Wu, 2010). Generally speaking, in most of these existing methods the prediction can involve a single input parameter such as time (e.g. statistical models for time–displacement curves) or multiple input parameters correlating with the main influential factors such as rock mass properties, tunnel geometry and engineering ground conditions, for instance Artificial Neural Network (ANN) methods. ANN models can allow an accurate estimation of the convergence because the main influential parameters can be taken into consideration by learning from available field data. Due to this advantage, over the past few years, several ANN models for tunnel convergence analysis and prediction have been implemented (Lee and Akutagawa, 2009; Mahdevari and Torabi, 2012; Mahdevari et al., 2012; Rafiai and Moosavi, 2012). However, they have some limitations which make them not reliable enough. For example, they are often referred to as "black boxes" because they lack transparency, as they neither consider nor explain the underlying physical processes (Jaksa et al., 2008; Zhang and Goh, 2013). They correlate input and output parameters with high precision but do not explain explicitly the relationship among them. In addition, an inefficient and long process of trial and error is usually required to determine the optimal network configuration (for example network type, number of neurons and hidden layers, transfer function and training epochs), since this is not known a priori.

For these reasons, the current paper investigates an alternative for predicting tunnel convergence based on Multivariate Adaptive Regression Spline (MARS) methods (Friedman, 1991; Jekabsons, 2011) to model nonlinear and multidimensional relationships among the factors influencing the tunnel convergence. MARS is a nonlinear and nonparametric regression technique which uses piecewise linear segments (splines) to represent nonlinear behaviors between the input and the output variables of a system. Some recent successful applications of MARS showing its promising usage in geotechnical engineering have been achieved, such as predicting shaft resistance of piles in sand (Lashkari, 2012), determining the undrained shear strength of clay (Samui and Karup, 2011), modeling lateral load capacity of piles (Samui and Kim, 2012), analysis of several geotechnical problems including predicting surface settlement associated with tunneling, HP-pile drivability, collapse potential for compacted soils and seismic liquefaction potential (Zhang and Goh, 2013) and elastic modulus prediction of jointed rock masses (Samui, 2013). It is anticipated that this methodology can be useful in determining tunnel convergence as well, since MARS has the capacity to map complex data in high-dimensional problems, to produce simple, easy-to-interpret models, and to estimate the contributions of the input variables (Zhang and Goh, 2013).

In this study, a reliable prediction of the diameter convergence of a high-speed railway tunnel in weak rock conditions is investigated. MARS and ANN models are implemented and the results are compared.

2. Methods

2.1. Basics of MARS

MARS was introduced by Friedman (1991). It is a nonlinear and nonparametric statistical method that simulates the nonlinear responses between the inputs and the output of a system by means of piecewise linear segments called splines with differing gradients. The segments are delimited by knots, which mark the subdivision between two data regions in such a way that piecewise curves are obtained. These piecewise curves are referred to as basis functions (BFs). In addition, the functional relationship between the input variables and the output is not specifically required. This allows for greater flexibility, bends, thresholds, and several kinds of BFs in modeling.

The BFs are generated by searching in a stepwise manner. An adaptive regression algorithm is used to find the best knot locations. MARS models are constructed in a two-phase procedure. The forward phase adds functions and finds potential knots to improve the performance, resulting in over-fitting, while the backward phase (called the model pruning phase) deletes the least effective terms. A general MARS model can be written in the following form (Samui, 2013):

$$y = c_0 + \sum_{i=1} c_i \prod_{j=1}^{K_i} b_{ji}\left(x_{v(j,i)}\right) \qquad (1)$$

where $y$ is the output variable, $c_0$ is a constant, $c_i$ is the vector of coefficients of the non-constant basis functions, $b_{ji}(x_{v(j,i)})$ is the truncated power basis function with $v(j,i)$ being the index of the independent variable used in the ith term of the jth product, and $K_i$ is a parameter that limits the order of interactions.

The spline $b_{ji}$ is defined as below:

$$b_{ji}(x) = (x - t_{ji})_{+}^{q} = \begin{cases} (x - t_{ji})^{q} & \text{if } x > t_{ji} \\ 0 & \text{otherwise} \end{cases} \qquad (2)$$

$$b_{ji+1}(x) = (t_{ji} - x)_{+}^{q} = \begin{cases} (t_{ji} - x)^{q} & \text{if } x < t_{ji} \\ 0 & \text{otherwise} \end{cases} \qquad (3)$$

in which $t_{ji}$ represents the knot of the spline; $q$ ($q \ge 0$) stands for the power of the splines and represents the smoothness degree of the resultant function approximation. In particular, when $q = 1$ the splines are linear functions, which is the case in this study.

In the forward process, the BFs are selected according to Eq. (1). In the backward process, the ineffective BFs are deleted based on the generalized cross-validation (GCV) criterion. The GCV criterion is defined in the following way:

$$\mathrm{GCV} = \frac{\frac{1}{N}\sum_{i=1}^{N}\left[y_i - f(x_i)\right]^2}{\left[1 - \frac{M + d(M-1)/2}{N}\right]^2} \qquad (4)$$

where $M$ is the number of BFs, $d$ is the penalizing parameter (a default value of 3 is assigned to $d$ (Friedman, 1991)), $N$ is the number of observations, $y_i$ is the ith measured element and $f(x_i)$ denotes the ith predicted value of the model. As can be seen, the numerator is the mean squared error of the evaluated model on the training data, penalized by the denominator. The denominator accounts for the increasing variance when the model complexity increases. $(M - 1)/2$ is the number of hinge function knots. The GCV penalizes both the number of BFs and the number of knots. At each deletion step a BF is removed to minimize Eq. (4) until an adequately fitted model is found. MARS is an adaptive technique because the selection of BFs and the variable knot locations are data-driven procedures and specific to the problem at hand. A more detailed explanation of MARS theory can be found in Friedman (1991).

2.2. An overview of Artificial Neural Networks (ANNs)

Artificial Neural Networks (ANNs) are considered as a form of artificial intelligence which attempts to imitate the brain functions and learn from sample data presented to them with the purpose of

capturing the relationship among data. They consist of densely interconnected simple processing units referred to as neurons that can perform large parallel computations. In each neuron, n input data are processed and a single output is obtained as below:

$$y = f\left(\sum_{i=1}^{n} w_i x_i + b\right) \qquad (5)$$

where $x_i$, $w_i$, $b$ and $f$ are the value of the ith input, the value of the ith weight, the bias of the neuron and the activation function of the neuron, respectively (Veelenturf, 1995). Eq. (5) can also be conveniently written as shown in Eq. (6):

$$y = f(wx + b) \qquad (6)$$

where $x$ is the n × 1 input vector; $b$ and $y$ are the 1 × 1 bias and output vectors, respectively; $w$ is the 1 × n weight matrix and $f$ is a 1 × 1 vector representing the activation function.

Usually, a network consists of several layers of neurons. Each layer has two or more neurons and serves a specific purpose. The first one (let it be layer 0) distributes the inputs to the input layer 1. There is no processing in layer 0; it can be seen just as a sensory layer: each neuron receives one component of the input vector, which gets distributed, unchanged, to all neurons of the input layer. The last layer is the output layer, which outputs the processed data, each output of the individual output neurons being a component of the output vector y. The layers between the input one and the output one are called hidden layers.

Theories on ANNs are widely available in the literature. Based on the overall design, there are various types of ANNs such as back-propagation, Kohonen, Hopfield or counter-propagation networks. They can also be classified as static (feed-forward) or dynamic. In dynamic networks, the output depends not only on the current input to the network, but also on the current or previous inputs, outputs, or states of the network. In contrast, in feed-forward neural networks (FFNNs), there are no feedback elements; inputs are received and simply forwarded through all the next layers to obtain the outputs. FFNNs can fairly approximate any kind of function (Engelbrecht, 2007). The Multi-Layer Perceptron network (MLP) is a type of FFNN which employs back-propagation for training the network and consists of several layers of neurons (nodes) completely connected from one layer to the next. MLP was used in recent applications associated with tunnel convergence prediction (Mahdevari and Torabi, 2012). When a training pair constituted by input values and corresponding target values is presented to the network, it computes its own outputs using its initial weights and biases. Next, weights and biases are adjusted based on a comparison of the output values and the target values, until the network outputs match the targets. Usually, in the training process, the sum of squared errors is used as the performance index while the Levenberg–Marquardt algorithm is mostly implemented to minimize the errors (Engelbrecht, 2007).

One of the limitations of ANNs is that they do not perform very well when they have to extrapolate beyond the range of the data used for calibration (Jaksa et al., 2008). Also, over-fitting of training data occurs when the network memorizes the training data, and consequently loses the ability to generalize (Demuth and Beale, 2002).

3. Case study

3.1. Brief description of the geo-engineering conditions

The “Daguan N°1” and “Daguan N°2” tunnels, 585 m and 912 m long, respectively, are located in Hunan Province, China. The tunnel excavations were completed in 2012. These tunnels are part of the “CKTJ-9 Project” consisting of the construction of more than 15 tunnels and auxiliary excavations for a high-speed railway (350 km/h), which is intended to respond to the increasing demand for fast transportation between Changsha and Kunming in southern China (China Railway Corporation 14th Construction Bureau, 2010). The tunnels are two-track railway tunnels; the driven section has 12.62 m of height and 13.20 m of span, ensuring that the cross section for serviceability is larger than 100 m². The overburden varies between 25 m and 185 m. The lithology of the area includes mainly a succession of weathered calcareous slate and tuffaceous slate rocks of the Quaternary Holocene. Geological explorations revealed weakly to strongly weathered rock composed of clay minerals, sericite, feldspar quartz and other minerals. In most cases, the rock texture and structure are unclear, with the presence of unfilled rock cracks. The rock surrounding the tunnel is considered to be of poor quality and belongs to classes III, IV and V according to the Chinese code GB 50218-94 for rock classification (The Professional Standards Compilation Group of People's Republic of China, 1995). In these tunnels, the orientation (dip direction/dip) of the rock formations is mainly 296°/39° for the calcareous slate layers and 210°/17° for the tuffaceous slate layers, with joint strikes parallel to the tunnel axis, which is considered to have between fair and favorable effects on the tunnel excavation (Bieniawski, 1989). Fig. 1 indicates a simplified geological section of the Daguan N°2 tunnel.

The construction method is based on the design principle of NATM with partial face excavation supported temporarily by shotcrete, rock bolts, wire mesh and steel ribs as first lining. According to the ground conditions of the tunnel segments and the expected challenges, soil backfilling, pipe shed and roofing, smooth blasting, excavators and other specific methods were used. Some of the expected difficulties associated with this tunnel included collapses, landslides (entrance and exit of the tunnel located on steep slopes) as well as mud and water inflows in fissured metamorphic rock areas (mainly in segment DK389 + 045 to DK389 + 0465) due to hydraulic pressure, with the groundwater inflow rate per 10 m of tunnel length reaching 76 L/min.

3.2. Data collection and input parameters

The tunnel convergences (crown and wall sides of the benches) were measured after the first lining was placed, using extensometers (SGS-1) and a total station. These instruments were placed at regular intervals between 5 m and 20 m depending on the surrounding rock quality, in accordance with the technical regulation for such work in China (The Professional Standards Compilation Group of People's Republic of China, 2007). The recording lasted for 1–3 months and 65 monitoring stations were considered. Each station contributed 30–90 datasets. The model inputs selected in this study included the principal factors influencing the tunnel convergence, such as tunnel geometry and geomechanical properties of the rock (Lee and Akutagawa, 2009), namely the class rating index of the surrounding rock mass (SRM), angle of internal friction (Φ), cohesion (C), Young's modulus (E), rock density (γ), the tunnel overburden (H), the distance between the monitoring station and the tunnel working face (D) and the elapsed monitoring time (T). The output is the cumulated convergence (Ccum). Recent research did not consider T as a direct input parameter; however, since convergence is a time-dependent phenomenon, the present research took into account the elapsed time with the convergence history, as suggested (Mahdevari and Torabi, 2012).

It should be noted that the SRM index used in this research is not a rating system. The authors' experience and the available data were used to subjectively assign class indices (numerical codes) to the surrounding rock mass on a scale of 1–5. On this scale, the lower indices represent the rock classes with better engineering qualities.
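As a minimal sketch of this encoding, the lookup below maps rock classes to SRM indices in Python. The class labels and index values follow the pairs listed in Table 1 for the Daguan N°2 tunnel; the names `SRM_INDEX` and `srm_index` are illustrative assumptions and not part of the original study.

```python
# Illustrative lookup from rock class (Chinese code GB 50218-94) to SRM index.
# Lower index = better engineering quality. The pairs mirror Table 1;
# the names used here are hypothetical.
SRM_INDEX = {"IIIa": 1, "IIIb": 2, "IVa": 3, "IVb": 4, "V": 5}


def srm_index(rock_class: str) -> int:
    """Return the numeric SRM index for a rock class label such as 'IVa'."""
    try:
        return SRM_INDEX[rock_class]
    except KeyError:
        raise ValueError(f"unknown rock class: {rock_class!r}") from None


if __name__ == "__main__":
    # Classes reported for one of the tunnel segments in Table 1.
    print([srm_index(c) for c in ("IVa", "IVb", "V")])
```

A segment that spans several rock classes simply yields several indices, one per monitoring section.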

Fig. 1. Schematic representation of the geological profile of the Daguan N°2 tunnel.

Since the rock classes of the Chinese rock classification system (The Professional Standards Compilation Group of People's Republic of China, 1995) cannot be used directly as input, it is necessary to translate the rock class into a numeric input as a class index. This expert semi-quantitative approach is often used in rock engineering systems (Jiao and Hudson, 1995). Therefore, the SRM index was established based on the Chinese code GB 50218-94 in use in China, which takes into account mainly the uniaxial compressive strength, intactness factor, P-wave velocity (highly correlated to the well-known Rock Quality Designation), groundwater condition and initial stress state. The SRM indices corresponding to the “Daguan N°2” tunnel are shown in Table 1 while a statistical description of the raw data used for this study is provided in Table 2.

In ANN and MARS modeling, it is recommended to pre-scale the data to the range [0, 1] by a normalization procedure before training (Demuth and Beale, 2002; Friedman, 1991). This is because different dimensions and scales of the input variables can provoke instabilities that could affect the learning ability of the model. In addition, normalization allows orthogonalizing the components of the input vectors in order to avoid correlation with one another. The input vectors were scaled over [0, 1] using:

$$X_{ij}^{\mathrm{Norm}} = \frac{X_{ij} - X_{j\min}}{X_{j\max} - X_{j\min}} \qquad (7)$$

where $X_{ij}^{\mathrm{Norm}}$ is the scaled value and $X_{ij}$ is the original data in the ith row and jth column, respectively, and $X_{j\max}$ and $X_{j\min}$ are the respective maximum and minimum values of the corresponding jth column. The normalized output was inverse-scaled to obtain the final predicted convergence.
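The scaling of Eq. (7) and its inverse can be sketched in a few lines of Python. This is only an illustration using the standard library: the function names `minmax_scale` and `inverse_scale` are assumptions, and the sample rows reuse the minimum, (rounded) average and maximum of H and T from Table 2 purely as example numbers.

```python
def minmax_scale(rows):
    """Scale each column of a list of rows to [0, 1] as in Eq. (7).

    Returns the scaled rows plus the per-column minima and maxima,
    which are needed later to undo the scaling.
    """
    cols = list(zip(*rows))
    col_min = [min(c) for c in cols]
    col_max = [max(c) for c in cols]
    for lo, hi in zip(col_min, col_max):
        if hi == lo:
            raise ValueError("a constant column cannot be min-max scaled")
    scaled = [
        [(x - lo) / (hi - lo) for x, lo, hi in zip(row, col_min, col_max)]
        for row in rows
    ]
    return scaled, col_min, col_max


def inverse_scale(value, lo, hi):
    """Map one normalized value back to original units (inverse of Eq. (7))."""
    return value * (hi - lo) + lo


if __name__ == "__main__":
    # Hypothetical rows: overburden H (m) and elapsed time T (days).
    rows = [[25.0, 1.0], [85.0, 28.0], [178.0, 90.0]]
    scaled, col_min, col_max = minmax_scale(rows)
    print(scaled)
    # A normalized model output of 0.5 mapped back to the Ccum range of Table 2 (mm).
    print(inverse_scale(0.5, 0.60, 54.30))
```

Keeping the column minima and maxima from the training data is what makes the inverse scaling of the predicted output possible.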
3.3. Database preparation

Firstly, the most representative datasets (each dataset consists of eight input parameters and one output parameter) of the raw data were considered and 486 datasets were compiled. During the preparation of the database, the datasets were sorted in accordance with each tunnel segment where the rock mass of each monitoring section showed similar characteristics. The database was divided into two parts. For both the ANN and MARS models, approximately 80% (390 datasets) of the datasets were used for model training while the remaining datasets were kept for testing purposes. For selecting the testing datasets, a sorting method was utilized to make sure that every class of data is covered. The statistical consistency of the testing dataset was considered as well by choosing data at a regular interval. These testing datasets were used exclusively for the model performance evaluation and were not employed for the modeling itself.

4. Results

4.1. MARS model

An open source code of MARS (ARESLab) from Jekabsons (2011), which implements the main functionality of the MARS technique for regression proposed in Friedman (1991), is used to carry out the analyses presented in this paper. The datasets of the first part of the database were used as training datasets to build the MARS model. The maximal number of BFs was set to 26 while the maximum interaction level was limited to 2, i.e. only pairwise products of BFs are allowed (second-order interaction), leaving all the other parameters at their defaults. Ultimately, 14 piecewise-linear BFs including the intercept term were used for the optimum model.
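To make the piecewise-linear basis functions (Eqs. (2) and (3) with q = 1) and the GCV criterion of Eq. (4) concrete, the following Python sketch may help. It is not the ARESLab implementation: the function names `hinge` and `gcv` are assumptions, and the knot and data values in the demo are made up.

```python
def hinge(x, knot, direction=1):
    """Piecewise-linear basis function (q = 1).

    direction=+1 gives max(0, x - knot); direction=-1 gives max(0, knot - x).
    """
    return max(0.0, direction * (x - knot))


def gcv(y_true, y_pred, n_basis, penalty=3.0):
    """Generalized cross-validation score following Eq. (4).

    n_basis is M (number of basis functions) and penalty is d;
    (M - 1) / 2 stands for the number of hinge function knots.
    """
    n = len(y_true)
    mse = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred)) / n
    effective = n_basis + penalty * (n_basis - 1) / 2.0
    if effective >= n:
        raise ValueError("effective number of parameters must be smaller than N")
    return mse / (1.0 - effective / n) ** 2


if __name__ == "__main__":
    # Mirrored pair of hinge functions around a made-up knot at x = 2.
    print(hinge(3.0, 2.0), hinge(1.0, 2.0, direction=-1))
    # Made-up observations and predictions for a model with M = 3 basis functions.
    y_obs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
    y_hat = [1.1, 1.9, 3.2, 3.9, 5.0, 6.1, 6.8, 8.1, 9.0, 9.9]
    print(gcv(y_obs, y_hat, n_basis=3))
```

Assuming M counts the intercept, the 14 basis functions and d = 3 reported above would give an effective number of parameters of 14 + 3 × 13 / 2 = 33.5 in the denominator, against N = 390 training datasets.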
Table 1
SRM indices (Daguan N°2 tunnel).

Segment                       Rock classes (Chinese code GB 50218-94)   SRM indices
DK388 + 915 to DK388 + 045    IVa, IVb and V                            3, 4 and 5
DK388 + 045 to DK388 + 265    IIIa                                      1
DK388 + 265 to DK388 + 465    IIIb and IVa                              2 and 3
DK388 + 465 to DK388 + 635    IVa and IVb                               3 and 4
DK388 + 635 to DK388 + 827    V                                         5

Analysis of variance (ANOVA) decomposition, which is a well-known statistical procedure to identify important variables and important interactions between variables in high-dimensional models, was performed for the model using the training data and the results are indicated in Table 3.

Each row of the table summarizes the ANOVA decomposition for each ANOVA function while the columns are the summary quantities of each decomposition (Friedman, 1991). The first and the second columns give the function number and the standard deviation of the function, respectively. This standard deviation indicates its relative importance to the overall model and can be interpreted similarly to a standardized regression coefficient in a linear model. The third column provides the GCV score; this is an additional indication of the importance of the corresponding ANOVA function. It can be used to judge whether this ANOVA function is making an

Table 2
Statistical description of the model parameters.

Type of data   Symbol   Unit     Min.    Max.    Average   Standard deviation
Input          SRM      logic    1       5       2.93      1.43
               Φ        Deg.     20.50   35.60   27.65     5.11
               C        MPa      0.67    4.80    2.40      1.38
               E        GPa      1.75    12.10   3.88      3.00
               γ        g/cm³    2.30    2.80    2.60      0.15
               H        m        25      178     85.19     32.30
               D        m        25      65      45.99     12.85
               T        day      1       90      28.61     20.57
Output         Ccum     mm       0.60    54.30   18.68     11.29

important contribution to the model, or whether it just slightly helps to improve the global GCV score. The fourth column gives the number of BFs comprising the ANOVA function while the fifth column indicates the particular predictor variables associated with the ANOVA function. As can be seen in Table 3, the ANOVA functions 1, 3, 6, and 7 made the greatest contribution; they are associated with SRM, T, C, Φ and E. These are some of the parameters affecting the tunnel convergence and the results are in agreement with recent research works (Mahdevari and Torabi, 2012; Rafiai and Moosavi, 2012). Meanwhile, function 2 (corresponding to input variable D) makes a small contribution to the model.

The predictive performance of the model configuration on the data was evaluated using the k-fold cross-validation approach. In k-fold cross-validation, the data samples are randomly partitioned into k equal-size subsamples. Out of the k subsamples, a single subsample is kept as the validation data for testing the model while the remaining (k − 1) subsamples are used as training data. This process is repeated k times (folds) in such a way that each subset is used for validation exactly once. An estimate of the expected error can be obtained by averaging the validation error over the k trials (Jekabsons, 2011). In addition, the model was evaluated using the testing data of the second part of the database. The indices were the variance accounted for (VAF), root mean square error (RMSE), relative root mean square error (RRMSE) and the coefficient of determination (R²), computed using Eqs. (8)–(11).

$$\mathrm{VAF} = \left[1 - \frac{\mathrm{var}(y_i - \hat{y}_i)}{\mathrm{var}(y_i)}\right] \times 100 \qquad (8)$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(y_i - \hat{y}_i)^2} \qquad (9)$$

$$\mathrm{RRMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(y_i - \hat{y}_i)^2} \Bigg/ \sqrt{\frac{1}{N}\sum_{i=1}^{N}(y_i - \bar{y})^2} \qquad (10)$$

$$R^2 = 1 - \left(\frac{1}{N}\sum_{i=1}^{N}(y_i - \hat{y}_i)^2\right) \Bigg/ \left(\frac{1}{N}\sum_{i=1}^{N}(y_i - \bar{y})^2\right) \qquad (11)$$

where var stands for the variance, $y_i$ is the ith observed element, $\hat{y}_i$ is the ith predicted element, $\bar{y}$ is the mean of the observed values of $y_i$ and $N$ is the number of datasets used. Theoretically, a prediction model is considered excellent when RMSE is equal to zero and VAF is 100%. VAF expresses, as a percentage, how much of the variance of the measured outputs is accounted for by the predictions, whereas the RMSE measures the deviation between the measured and predicted data. On the one hand, the closer the value of VAF is to 100%, the smaller the variability and therefore the better the prediction capability of the model. On the other hand, a lower value of RMSE suggests better performance. In addition, using the validation datasets, an RRMSE equal to zero is a perfect fit and values of RRMSE approaching 1 indicate that the model is no better than the mean value of the observed data; however, RRMSE values largely exceeding 1 show that the model probably drastically overfits the training data. Conversely, values of R² approaching 1 indicate a near-perfect fit.

The average performance indices obtained with 5-fold cross-validation (k = 5) were 0.35, 0.11 and 0.98 for RMSE, RRMSE and R², respectively. Using the testing data, the calculated indices were 94.26%, 0.42, 0.18 and 0.96 for VAF, RMSE, RRMSE and R², respectively. These results suggest that drastic overfitting of the training data did not occur and that a good fit is achieved. The predicted convergence against the actual convergence is plotted in Fig. 2 using the testing datasets.

As can be seen in Fig. 2, almost all the data points are closely scattered along the line of equality (y = x), indicating very good prediction.

The equation of the piecewise-linear model with all its basis functions as the modeling output is given in Eq. (12) and in Table 4:

$$C_{cum} = 9.896 + \sum_{i} a_i b_i \qquad (12)$$
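The indices of Eqs. (8)–(11) used to assess the model, together with the random k-fold partition described above, can be sketched in Python as follows. This is a minimal standard-library illustration, not the ARESLab code: the function names are assumptions, the variance is taken as the population variance (dividing by N), and the demo values are made up.

```python
import math
import random


def _mean(values):
    return sum(values) / len(values)


def _var(values):
    """Population variance (divides by N)."""
    m = _mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def performance_indices(y_true, y_pred):
    """Return VAF (%), RMSE, RRMSE and R2 following Eqs. (8)-(11)."""
    n = len(y_true)
    residuals = [yt - yp for yt, yp in zip(y_true, y_pred)]
    mse = sum(r ** 2 for r in residuals) / n
    spread = _var(y_true)  # (1/N) * sum((y_i - mean(y))^2)
    if spread == 0:
        raise ValueError("observed values are constant; RRMSE and R2 are undefined")
    return {
        "VAF": (1.0 - _var(residuals) / spread) * 100.0,
        "RMSE": math.sqrt(mse),
        "RRMSE": math.sqrt(mse) / math.sqrt(spread),
        "R2": 1.0 - mse / spread,
    }


def k_fold_indices(n_samples, k, seed=0):
    """Randomly partition range(n_samples) into k folds of (near) equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


if __name__ == "__main__":
    # Made-up measured and predicted convergence values (mm).
    measured = [5.0, 12.0, 18.5, 25.0, 40.0]
    predicted = [5.5, 11.0, 19.0, 26.0, 38.5]
    print(performance_indices(measured, predicted))
    # Each fold serves once as validation data; the other k - 1 folds are used for training.
    folds = k_fold_indices(390, 5)
    print([len(fold) for fold in folds])
```

With these definitions R² equals 1 − RRMSE², so the two indices carry the same information on a given dataset.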

Table 3
ANOVA decomposition for MARS.

Function   STD     GCV       Number of BFs   Variable(s)
1          6.247   130.611   2               SRM
2          0.528   4.045     1               D
3          5.312   60.611    1               T
4          0.889   14.779    1               SRM, C
5          0.938   14.213    1               SRM, T
6          2.359   24.394    2               Φ, γ
7          3.720   26.038    1               C, E
8          0.781   14.233    1               E, γ
9          0.938   14.361    1               H, T
10         0.725   14.136    2               D, T

Eq. (12) and Table 4 can provide an interpretation or insight on how the convergence varies with changes in the input variables. Some coefficients (a4, a7, a10 and a13) are negative. Without entering into any deep and detailed analysis of each equation, these negative coefficients may suggest that when γ, Φ, H, SRM and T increase (i.e. denser rock with more internal friction, larger volume of rock, poorer rock quality and longer monitoring time), the convergence decreases. This is true to some extent. Weak rock tends to have larger deformation than hard rock around an excavation and can stabilize after a certain period. However, a10 contradicts this when the condition (T < 5 and SRM < 2) holds. This is difficult to interpret and may constitute a weakness of the model. Conversely, the positive values of the coefficients may suggest a positive correlation with the output. For example, with a separate increase of D, SRM or T, the cumulated convergence increases. This is in agreement with

the field data. In summary, even though it would be difficult to present a rigorous interpretation, these equations make an attempt to induce from the data a sort of physical meaning between the input and output variables.

Fig. 2. Prediction performance of MARS model.

4.2. ANN model

Several MLP models were investigated in order to establish the relationship between the dependent variable (convergence) and the independent variables (tunnel geometry and geomechanical parameters). Three principal steps were involved: defining the network architecture, training and testing the network. The same datasets which were used for the MARS analysis (Section 4.1) are employed for modeling and evaluating the prediction performance of the models. The crown convergence and sides' convergence (the top and bottom heading) data corresponding to the 65 monitoring stations have been used. The required computation was carried out using the neural network fitting tool of MATLAB R2010a software.

domly selected to train the network, 15–20% (60–78 datasets) were used for validation while the remaining served for the network testing. Several experiments with different network architectures and parameters were tested to identify the appropriate combinations. Also, the models were tuned to improve their performance. Two to four layers of neurons, hyperbolic tangent sigmoid (Tansig) and logistic sigmoid (Logsig) transfer functions in the hidden layers and linear (Purelin) transfer functions in the output layer were used according to:

$$\mathrm{Tansig}(n) = \frac{e^{n} - e^{-n}}{e^{n} + e^{-n}}; \quad \mathrm{Logsig}(n) = \frac{1}{1 + e^{-n}}; \quad \mathrm{Purelin}(n) = n \qquad (13)$$

where $e$ is Euler's number (the base of the natural logarithm) and $n$ is given as:

$$n = \sum_{i=1} p_i w_{l,i} + b \qquad (14)$$

in which $p$, $w$ and $b$ are the input vector, weight matrix and bias, respectively. The network architecture settings including the learning rate, momentum rate, goal and epochs were 0.05, 0.01, 1e−3 and 200, respectively. Some of the best results of the trial-and-error experimentation are listed in Table 5. As indicated in Table 5, the optimum configuration of the model is a 3-layer MLP using Tansig, Logsig and Purelin transfer functions with 20, 26 and 1 neurons in the first, second and third layer, respectively. Fig. 3 plots the mean squared error (MSE) against the epochs, showing the best validation performance, while Fig. 4 shows regression plots for the different datasets. MSE is commonly used in ANN models for performance evaluation and can be derived from Eq. (9) as the square of RMSE.

The optimum model was additionally tested using the testing data that served to evaluate the MARS model, and the performance assessment was also based on the same indices as previously (Section 4.1). This allows an objective comparison between the models' performances. The computed indices were 95.81%, 0.29, 0.13 and 0.97 for VAF, RMSE, RRMSE and R², respectively.
Each dataset is a vector of eight input parameters as described in
4.3. Comparison
Section 3.2. The target vector is the cumulated convergence. These
inputs were presented to the network which was trained using the
In this section, the models’ performances for the testing data are
Levenberg–Marquardt back-propagation algorithm, as recom-
compared. Fig. 5 contrasts the predicted and actual values of the
mended due to its high generalization capability (Rafiai and Moos-
convergence. Even though the MARS shows slightly less accuracy
avi, 2012). It is important to determine the optimum network
than ANN, both models indicate excellent prediction capability
architecture to achieve reliable results. This task still relies on
with R2 = 0.96 and R2 = 0.97, respectively. As it can be seen, the
trial-and-error method even though several heuristic relations
data points are very closely distributed along the 100% agreement
have proposed to determine appropriately the number of neurons
line. A sample of time–displacement curves (corresponding to
to be included in the hidden layer (Hecht-Nielsen, 1987; Rafiai and
DK389 + 035) is plotted in Fig. 6 using smooth lines for better visu-
Moosavi, 2012). During the modeling process, from the 390 data-
alization. The 3 curves are almost identical with slight deviation.
sets of the first part of the database, 60% (234 datasets) were ran-
However, MARS indicated a bit larger deviation from the actual
observational data. This shows that the models can predict the
Table 4 trends of the convergence i.e. the predicted and measured varia-
List of BFs (bi) of the MARS model and their coefficients (ai). tions of the convergence are in fair agreement during the monitor-
bi Equation ai
b1 max(0, T  39) 0.488 Table 5

b2 b1 max(0, D  35) 0.019 selected results of some MLP models for network testing data.

b3 b1 max(0, 35  D) 0.059
No. Transfer function Model architecture R2 MSE
b4 max(0, 1.5  c)  max(0, U  28) 0.023
b5 max(0, 1.5  c)  max(0, 28  U) 0.071 1 Logsig–Purelin 20-1 0.685 1.952
b6 max(0, 4.4  C)  max(0, E  5.7) 0.161 2 Tansig–Purelin 14-1 0.721 1.864

b7 b1 max(0, 90  H) 0.108 3 Tansig–Tansig–Purelin 21-16-1 0.890 0.406
b8 max(0, 4.4  C)  max(0, 4  SRM) 0.332 4 Logsig–Logsig–Purelin 22-20-1 0.921 0.325
b9 max(0, SRM  4.5) 0.500 5 Logsig–Tansig–Purelin 26-13-1 0.850 0.776
b10 max(0, 5  T)  max(0, 2  SRM) 2.522 6 Tansig–Logsig–Purelin 20-26-1 0.976 0.029
b11 max(0, SRM  1.2) 1.784 7 Tansig–Logsig–Logsig–Purelin 26-14-20-1 0.780 1.030
b12 max(0, 3  E)  max(0, c  9.6) 0.513 8 TanSig–LogSig–TanSig–PureLin 30-38-19-1 0.799 1.170
b13 max(0, 7  c) 0.069 9 Logsig–Tansig–Tansig–Purelin 12-28-16-1 0.723 2.056
b14 9.89 1 10 Logsig–Tansig– Logsig–Purelin 14-30-22-1 0.765 1.355
374 A.-C. Adoko et al. / Tunnelling and Underground Space Technology 38 (2013) 368–376

Best Validation Performance is 0.07169 at epoch 16 However, there are few points that make MARS overcoming this
103 minor weakness. This model was observed to be computationally
more efficient at finding the optimal model. In fact, the algorithm
basically employs a series of linear regressions to construct flexible
102 models and estimate approximation by splitting separate slopes in
Mean Squared Error (mse)

the input variable space. So, selecting the optimum model requires
less trial and error works compared to ANN. The final number of
101 BFs is determined by the algorithm after setting the maximum
number of BFs. In addition, it appears to be faster than ANN. Using
a PC with 2.4 GHz Intel Core i3 M370 processor, 2 GB RAM the pro-
100 cessing speed (CPU time) was smaller. Another distinctive aspect is
that it is able to provide a relative importance or contribution of
each variable to the tunnel convergence through the ANOVA
10-1 decomposition. The model output is expressed in a more interpret-
able way in form of ‘‘segmented’’ linear regressions defined on dif-
ferent intervals as shown in Eq. (13). This may provide additional
10-2 information about how changes in the input data can affect the
0 5 10 15 20
22 Epochs
Finally, models’ performance and the efficiency features are
Fig. 3. Performance error plot of the optimum MLP.
summarized in Table 6. These results conclude that the MARS
can constitute a valuable alternative of determining the conver-
gence of the tunnel diameter.

ing period. In addition, using 21 monitoring stations selected along

the tunnels, the models’ capability of predicting spatially the con- 5. Discussion
vergence was evaluated in Fig. 7. For both models, the results are in
good agreement with the measured data as well. Therefore, it is Knowing accurately the trend and the total amount of the tun-
suggested that ANN and MARS can be used to predict tunnel nel convergence is imperative when tunneling in weak rocks such
convergence even though MARS underperforms very slightly in as tuffaceous slate with unfavorable geo-engineering conditions. It
terms of prediction capability. is used to assess the tunnel overall stability and decide if any

Fig. 4. Regression plots corresponding to the optimum MLP.

A.-C. Adoko et al. / Tunnelling and Underground Space Technology 38 (2013) 368–376 375

order to be used in the field and can be able to estimate not only
the convergence trends but also the final amount of convergence.
Existing empirical and analytical approaches cannot be used in
all ground geological situations because they predict convergence
by making a series of assumption such as the input geomechanical
parameters, tunnel geometry and the stress state (Barton, 2002).
Thus, alternative approaches in which the complex and nonlinear
relationships between input parameters and the convergence are
learned from available data, need to be investigated.
The results obtained in this research showed excellent perfor-
mance indices for the ANN based model and are in agreement with
several recent works using the same methodology even though the
ground conditions and the model input parameters were different
(Mahdevari and Torabi, 2012; Mahdevari et al., 2012). One of the
principal limitations of ANN pointed out by many researchers for
its long training process since the optimal configuration is not
known a priori, has been experienced in this research, as well. Dur-
Fig. 5. ANN and MARS prediction comparison.
ing the trial-and-errors process, poor prediction accuracies were
noticed (with R2 as low as 0.56 and MSE as high as 45.8); this can-
not be reliably used. Selecting the optimum configuration some-
times cannot be easier. With the purpose of finding alternative,
MARS model for tunnel convergence was applied in this work.
The model can be interpreted and high accuracy results are ob-
tained as well.
Using MARS model in large scale engineering even with com-
plex geological conditions, can allow achieving reliable results pro-
vided that the model is properly calibrated with large dataset and
the engineers can observe the variation of the output with any
changes of each input variable by the mean of linear regression

6. Conclusions

Fig. 6. Predicted convergence trends (at DK389 + 035).

This paper intends to explore alternative for tunnel convergence
prediction. ANN and MARS models were investigated. Datasets
from the Daguan N°1 and Daguan N°2 tunnels were compiled.
The models’ input parameters included the principal factor influ-
encing the tunnel the convergence i.e. SRM, U, C, E, c, H, D and T.
On one hand, a 3-layer MLP with 20-26-1 configuration was iden-
tified as optimum ANN. On the other, the MARS model was imple-
mented with 14 effective piecewise-linear BFs. Both models
exhibited excellent prediction performance with MARS showing
lightly lesser accuracy. VAF, RMSE, RRMSE and R2 indices were
95.81%, 0.29, 0.13, 0.97 for ANN and 94.26%, 0.42, 0.18 and 0.96
for MARS, respectively.
Nevertheless, it was observed that MARS was computationally
more efficient at finding the optimal model and able to provide a
contribution of each variable to the tunnel convergence through
the ANOVA decomposition. In addition, the model output was ex-
pressed in a more interpretable way since it uses a series of linear
Fig. 7. Predicted convergence along some tunnel segments.
regressions defined in distinct intervals of the input variable space.
All these aspects balance its performance.
On the basis of these results, it can be concluded that MARS can
adjustment or modification of the construction method is required. be used to predict the tunnel convergence. It can usefully assist in
This is so important that usually, engineers prefer relying on direct decision-making process regarding the tunnel stability in the
measurement through a well established monitoring program observational tunnel method. One of the advantages is that data
since it is the most convenient. Therefore, a prime requirement from projects with similar geo-engineering can be reliably used
for any predictive model is that it should be reliable enough in for new project prior construction.
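The piecewise-linear basis functions referred to above can be illustrated with a minimal Python sketch. It is not the authors' implementation (the study used a MATLAB toolbox); the function names `hinge` and `mars_predict` are hypothetical, only the knots of the first three BFs of Table 4 (T at 39, D at 35) are taken from the paper, and the intercept and coefficients in the usage example are placeholders, not the fitted values.

```python
def hinge(x: float, knot: float, direction: int = 1) -> float:
    """Hinge function: max(0, x - knot) for direction=+1, max(0, knot - x) for direction=-1."""
    return max(0.0, direction * (x - knot))


def mars_predict(T: float, D: float, coeffs: dict, intercept: float) -> float:
    """Evaluate a three-term MARS-style model built from the knots of b1, b2 and b3.

    coeffs and intercept are supplied by the caller; the values used in the
    example below are placeholders, not the coefficients fitted in the paper.
    """
    b1 = hinge(T, 39.0)                      # active only when T > 39
    b2 = b1 * hinge(D, 35.0)                 # interaction term, active when D > 35
    b3 = b1 * hinge(D, 35.0, direction=-1)   # mirrored term, active when D < 35
    return intercept + coeffs["b1"] * b1 + coeffs["b2"] * b2 + coeffs["b3"] * b3


if __name__ == "__main__":
    placeholder = {"b1": 1.0, "b2": 0.5, "b3": 0.25}
    print(mars_predict(T=41.0, D=37.0, coeffs=placeholder, intercept=10.0))
    print(mars_predict(T=20.0, D=37.0, coeffs=placeholder, intercept=10.0))
```

Because each basis function is zero on one side of its knot, the contribution of every term can be read off directly for a given input, which is the interpretability argument made for MARS.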

Table 6
Performance comparison between ANN and MARS.

Model   Selecting the optimum model   Processing time (s)   VAF (%)   RMSE   RRMSE   R2
ANN     More trial-and-error          60.8                  95.81     0.29   0.13    0.97
MARS    Less trial-and-error          3.5                   94.26     0.42   0.18    0.95
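As an illustration of how indices such as those in Table 6 can be computed from measured and predicted convergence, the following Python sketch may help. The paper's own formulas are given in equations that are not reproduced in this section, so the definitions below are assumptions: VAF as the percentage of variance accounted for, RRMSE as RMSE divided by the mean of the measured values, and R2 as the coefficient of determination. The function name `performance_indices` is hypothetical.

```python
import math
from statistics import fmean, pvariance


def performance_indices(measured: list, predicted: list) -> dict:
    """Return VAF (%), RMSE, RRMSE, R2 and MSE for paired measured/predicted values."""
    residuals = [m - p for m, p in zip(measured, predicted)]
    mse = fmean([r * r for r in residuals])
    rmse = math.sqrt(mse)  # MSE is the square of RMSE
    # Assumed definition: variance accounted for, in percent
    vaf = (1.0 - pvariance(residuals) / pvariance(measured)) * 100.0
    # Assumed normalisation: RMSE relative to the mean measured value
    rrmse = rmse / fmean(measured)
    # Assumed definition: coefficient of determination
    mean_measured = fmean(measured)
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    r2 = 1.0 - ss_res / ss_tot
    return {"VAF": vaf, "RMSE": rmse, "RRMSE": rrmse, "R2": r2, "MSE": mse}


if __name__ == "__main__":
    # Made-up numbers for demonstration only, not data from the study
    print(performance_indices([1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 3.5, 4.5]))
```

Note that a constant bias leaves VAF at 100% while lowering R2 under these definitions, which is one reason to report several indices side by side.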

Acknowledgments

This study is financially supported by the Key Research Program of the Chinese Academy of Sciences (KZZD-EW-05-03) and the China National Natural Science Foundation (41172287, 51139004). Also, the authors greatly appreciate the contribution of the CKTJ-9 Project authorities for providing access to some data. Finally, the authors highly appreciate the two anonymous reviewers for their critical and helpful comments.

References

Adoko, A.C., Zuo, Q.-J., Wu, L., 2011. A fuzzy model for high-speed railway tunnel convergence prediction in weak rock. Elect. J. Geotech. Eng. 16, 1275–1295.
AFTES, 2002. Recommendations on the convergence–confinement method. Tunn. Ouvrages Souterr. 174, 414–424.
Barton, N., 2002. Some new Q-value correlations to assist in site characterisation and tunnel design. Int. J. Rock Mech. Min. Sci. 39, 185–216.
Bieniawski, Z.T., 1989. Engineering Rock Mass Classification. Wiley, New York.
China Railway Corporation 14th Construction Bureau, 2010. CKTJ9 Shanghai-Kunming Passenger Line Part I, Construction Management Manual. China Railway Corporation, Changsha.
Chung, H.-S., Chun, B.-S., Kim, B.-H., Lee, Y.-J., 2006. Measurement and analysis of long-term behavior of Seoul metro tunnels using the Automatic Tunnel Monitoring Systems. Tunn. Undergr. Space Technol. 21, 316–317.
Dai, T., Xie, D., Yao, H., Li, G., 2011. Establishment of grey system model about tunnel surrounding rock convergence and information renewal GM(1,1) model forecasting. In: Proc. International Conference on Remote Sensing, Environment and Transportation Engineering (RSETE 2011). Springer, Berlin, pp. 310–314.
Demuth, H., Beale, M., 2002. Neural Network Toolbox for Use with MATLAB, fourth ed. The MathWorks, Inc., MA, USA.
Engelbrecht, A.P., 2007. Computational Intelligence. Wiley, Chichester.
Friedman, J.H., 1991. Multivariate Adaptive Regression Splines (with Discussion). Ann. Stat. 19, 1–141.
Galler, R., Schneider, E., Bonapace, P., Moritz, B., Eder, M., 2009. The new guideline NATM – the Austrian practice of conventional tunnelling. BHM Berg- und H. 154, 441–449.
Hecht-Nielsen, R., 1987. Kolmogorov’s mapping neural network existence theorem. In: Proc. 1st IEEE International Conference on Neural Networks. SOS Printing, San Diego, CA, USA, pp. iii/11–14.
Jaksa, M.B., Maier, H.R., Shahin, M.A., 2008. Future challenges for artificial neural network modeling in geotechnical engineering. In: Proc. 12th International Conference of International Association for Computer Methods and Advances in Geomechanics (IACMAG), Goa, India, pp. 1710–1719.
Jekabsons, G., 2011. ARESLab: Adaptive Regression Splines Toolbox for Matlab/Octave.
Jiao, Y., Hudson, J.A., 1995. The fully-coupled model for rock engineering systems. Int. J. Rock Mech. Min. Sci. Geomech. Abst. 32, 491–512.
Jiao, Y.Y., Song, L., Wang, X.Z., Adoko, A.C., 2013. Improvement of the U-shaped steel sets for supporting the roadways in loose coal seam. Int. J. Rock Mech. Min. Sci. 60, 19–25.
Kang, Y., Wang, J., 2010. A support-vector-machine-based method for predicting large-deformation in rock mass. In: Proc. 7th International Conference on Fuzzy Systems and Knowledge Discovery, FSKD 2010. IEEE Computer Society, Yantai, Shandong, China, pp. 1176–1180.
Kavvadas, M.J., 2005. Monitoring ground deformation in tunnelling: current practice in transportation tunnels. Eng. Geol. 79, 93–113.
Kovari, K., Amstad, C., 1994. Decision making in tunnelling based on field measurements. Int. J. Rock Mech. Min. Sci. Geomech. Abst, 571–606.
Lashkari, A., 2012. Prediction of the shaft resistance of nondisplacement piles in sand. Int. J. Numer. Anal. Meth. Geomech.
Lee, J.-H., Akutagawa, S., 2009. Quick prediction of tunnel displacements using Artificial Neural Network and field measurement results. Int. J.-JCRM 5, 53–62.
Liu, K.Y., Qiao, C.S., Wang, S.D., 2008. Study on the GA-ANIFIS intelligence model for nonlinear displacement time series analysis of long and large tunnel construction. In: Proc. International Young Scholars’ Symposium on Rock Mechanics - Boundaries of Rock Mechanics Recent Advances and Challenges for the 21st Century. Taylor and Francis/Balkema, Beijing, China, pp. 667–671.
Mahdevari, S., Torabi, S.R., 2012. Prediction of tunnel convergence using Artificial Neural Networks. Tunn. Undergr. Space Technol. 28, 218–228.
Mahdevari, S., Torabi, S.R., Monjezi, M., 2012. Application of artificial intelligence algorithms in predicting tunnel convergence to avoid TBM jamming phenomenon. Int. J. Rock Mech. Min. Sci. 55, 33–44.
Mao, G., Xia, Y., Liu, L., 2011. Time series forecasting of tunnel surrounding rock displacement. In: Proc. International Conference on Civil Engineering and Building Materials (CEBM 2011). Trans Tech Publications, Kunming, China, pp. 1789–1793.
Navarro Torres, V.F., Dinis da Gama, C., Costa e Silva, M.M., Singh, R.N., Reddish, D., Stace, R., 2011. Application of a new convergence measurement technique for the rehabilitation of old Rossio railway tunnel. Lisbon. Geomech. Geoeng. 6, 109–118.
Rafiai, H., Moosavi, M., 2012. An approximate ANN-based solution for convergence of lined circular tunnels in elasto-plastic rock masses with anisotropic stresses. Tunn. Undergr. Space Technol. 27, 52–59.
Samui, P., 2013. Multivariate adaptive regression spline (Mars) for prediction of elastic modulus of jointed rock mass. Geotech. Geol. Eng. 31, 249–253.
Samui, P., Karup, P., 2011. Multivariate adaptive regression spline and least square support vector machine for prediction of undrained shear strength of clay. Appl. Metaheuristic Comput. 3, 33–42.
Samui, P., Kim, D., 2012. Least square support vector machine and multivariate adaptive regression spline for modeling lateral load capacity of piles. Neural Comput. Appl, 1–5.
Schubert, W., Moritz, B., 2011. State of the art in evaluation and interpretation of displacement monitoring data in tunnels/Stand der Auswertung und interpretation von verschiebungsmessdaten bei tunneln. Geomech. Tunn. 4, 371–380.
Schubert, W., Grossauer, K., Button, E.A., 2004. Interpretation of displacement monitoring data for tunnels in heterogeneous rock masses. Int. J. Rock Mech. Min. Sci. 41 (Suppl. 1), 882–887.
Simeoni, L., Zanei, L., 2009. A method for estimating the accuracy of tunnel convergence measurements using tape distometers. Int. J. Rock Mech. Min. Sci. 46, 796–802.
The Professional Standards Compilation Group of People’s Republic of China, 1995. GB 50218-94 Standard for Engineering Classification of Rock Masses. China Railway Press, Beijing.
The Professional Standards Compilation Group of People’s Republic of China, 2007. TB 10121-2007 Technical Code for Monitoring Measurement of Railway Tunnel. China Railway Press, Beijing.
Veelenturf, L.P.J., 1995. Analysis and Applications of Artificial Neural Networks. Prentice Hall International (UK) Ltd., Midsomer Norton.
Wu, Z.-Z., 2010. Stochastic medium predicting model of ground movement tunneling based on non-uniform convergence mode. Zhongnan Daxue Xuebao (Ziran Kexue Ban)/J. Cent. S. Univ. (Sci. Tech.) 41, 2005–2010.
Zhang, W.G., Goh, A.T.C., 2013. Multivariate adaptive regression splines for analysis of geotechnical engineering systems. Comput. Geotech. 48, 82–95.
Zhao, J., Gong, Q.M., Eisensten, Z., 2007. Tunnelling through a frequently changing and mixed ground: a case history in Singapore. Tunn. Undergr. Space Technol. 22 (4), 388–400.