Article history:
Received 27 November 2007
Received in revised form 25 July 2008
Accepted 11 August 2008
Available online 21 September 2008

Keywords: Rheology; Parameter estimation; BN; GA

Abstract

The long-term deformations of mountain tunnels, which attract more and more attention, are closely related to the time-dependent features of the surrounding rock mass. However, it is not easy to determine an appropriate rheological model and its corresponding parameters for a certain engineering instance. This paper presents a rheological parameter estimation technique using an error backpropagation neural network (BN) and a genetic algorithm (GA). The application of the proposed technique to an engineering instance, Ureshino tunnel line I on Nagasaki expressway, is expatiated in detail. The stochastic nature of the proposed technique is also discussed through case studies. It is proved that the proposed technique can provide the engineer with an optimal estimation of the rheological parameters, which can help the prediction of long-term deformations of mountain tunnels in the future.

© 2008 Elsevier Ltd. All rights reserved.
doi:10.1016/j.tust.2008.08.001
Z. Guan et al. / Tunnelling and Underground Space Technology 24 (2009) 250–259 251
[Figure residue: geological profile with elevations 100–400 m EL, stations STA 209+26 to STA 216+09 (L = 683 m), slope i = 0.5%, and rock types K-La (augite hypersthene andesite), K-Tb (pyroxene andesite tuff) and K-Vc (hornblende andesite lava); also a construction scheme showing the advancing tunnel face, rock bolts (Lb = 3.0 m), shotcrete lining (tc = 0.15 m), second lining (tc = 0.35–0.50 m) and an excavation cycle of Lz = 1.0 m.]

Fig. 1. Schematic representation of the geological profile of Ureshino tunnel line I (JPHC, 2000).
3.1. Multilayer error backpropagation neural network

There are many types of artificial neural network according to the different network architectures and learning algorithms, among which the BN is the most mature and popular one. The basic concepts and the learning algorithm of the BN are introduced briefly in this section, based on Schalkoff (1997) and Haykin (1999).

The neuron is the basic processing element that sums up the input signals by synaptic weights and converts them into the output signal through a transfer function. The artificial neural network is considered as a computing system composed of a number of neurons that are interconnected by a particular topology. Specifically, the BN consists of an input layer corresponding to the independent variables of the problem, an output layer corresponding to the dependent variables of the problem and one (or more) hidden layer(s) according to the nonlinearity of the problem. The architecture of a fully connected BN is schematically illustrated in Fig. 5. Hereafter, the symbol BN(N0, N1, N2, N3) is used to represent this kind of network, consisting of an input layer with N0 input signals, an output layer with N3 neurons and two hidden layers with N1 and N2 hidden neurons, respectively.

Consider a neuron n_k fed by a set of input signals y_j produced by its former (left) layer neurons; the output signal y_k of this neuron is computed by the following:

y_k = \varphi_k(v_k) = \varphi_k\left(\sum_{j=0}^{N_l} w_{kj} y_j\right)  (for a certain neuron n_k)  (1)

In the above equation, N_l is the total number of input signals fed to neuron n_k and w_{kj} is the synaptic weight connecting neuron n_j to neuron n_k (w_{k0} is the synaptic weight connecting a fixed unit bias input signal to neuron n_k). v_k is the weighted sum of all the input signals. \varphi_k is the transfer function of neuron n_k that transfers the weighted sum to the output signal y_k. It may be a threshold, sigmoid or linear transfer function depending on the problem.

When neuron n_k is an output neuron, comparing its output signal to the desired output d_k, the error signal of this output neuron n_k is defined as

e_k = d_k - y_k  (for an output neuron n_k)  (2)

For a pair of input signals and desired output signals (named a dataset hereafter), the sum-square-error for this dataset is defined as Eq. (3.1). Similarly, the average sum-square-error for the entire datasets is defined as Eq. (3.2).

sse = \frac{1}{2}\sum_{N_l} e_k^2  (3.1)

sse_{av} = \frac{1}{2N_T}\sum_{N_T}\sum_{N_l} e_k^2  (3.2)

where N_l is the total number of output neurons; N_T is the total number of datasets.

The synaptic weights are adjusted iteratively to minimize the error function (sse or sse_av) by means of the error backpropagation learning algorithm. Therefore, there are two sweeps in each iteration: forward activation to map the input signals into the output ones, and error backpropagation to adjust the synaptic weights to minimize the error function. There are many algorithms to adjust the synaptic weights, among which the gradient descent method is the most straightforward one.

For a neuron n_k being an output neuron, the weight corrections in the nth iteration, \Delta w_{kj}(n), are defined by the delta rule.

\Delta w_{kj}(n) = -\eta\frac{\partial sse}{\partial w_{kj}} + \mu\Delta w_{kj}(n-1) = -\eta\frac{\partial sse}{\partial v_k}\frac{\partial v_k}{\partial w_{kj}} + \mu\Delta w_{kj}(n-1) = \eta\delta_k y_j + \mu\Delta w_{kj}(n-1)  (for an output neuron n_k)  (4.1)

with \delta_k = -\frac{\partial sse}{\partial v_k} = -\frac{\partial sse}{\partial y_k}\frac{\partial y_k}{\partial v_k} = -\frac{\partial sse}{\partial e_k}\frac{\partial e_k}{\partial y_k}\frac{\partial y_k}{\partial v_k} = e_k\varphi'_k  (4.2)

In the above equations, \eta is the learning rate and \mu is the momentum coefficient. \delta_k is the local gradient of the output neuron n_k and \varphi'_k is the first-order differential of the transfer function with respect to v_k.

For a neuron n_j being a hidden neuron, the weight corrections in the nth iteration, \Delta w_{ji}(n), are similarly defined by the delta rule.

\Delta w_{ji}(n) = -\eta\frac{\partial sse}{\partial w_{ji}} + \mu\Delta w_{ji}(n-1) = -\eta\frac{\partial sse}{\partial v_j}\frac{\partial v_j}{\partial w_{ji}} + \mu\Delta w_{ji}(n-1) = \eta\delta_j y_i + \mu\Delta w_{ji}(n-1)  (for a hidden neuron n_j)  (5.1)

with \delta_j = -\frac{\partial sse}{\partial v_j} = -\frac{\partial sse}{\partial y_j}\frac{\partial y_j}{\partial v_j} = \left(-\sum_k e_k\frac{\partial e_k}{\partial v_k}\frac{\partial v_k}{\partial y_j}\right)\varphi'_j = \varphi'_j\sum_k \delta_k w_{kj}  (5.2)

where \delta_j is the local gradient of the hidden neuron n_j and \varphi'_j is the first-order differential of the transfer function with respect to v_j.

The training mode presented above, where the synaptic weights are updated immediately after the presentation of one training dataset, is called the sequential mode. In contrast, the batch mode uses sse_av as the error function, and updates the synaptic weights in an averaged manner after the entire training datasets have been presented. In general, the sequential mode needs a smaller local storage and has a better stochastic search ability to prevent entrapment in local minima. The batch mode has a better estimation of the gradient. A large learning rate accelerates the training, but increases the risk of oscillating on the error surface. A small learning rate drives the search steadily in the direction of the global minimum, though slowly. Similarly, a high momentum coefficient will reduce the risk of being stuck in local minima, but increase the risk of overshooting the solution as a high learning rate does. Commonly, the adaptive technique of varying these two parameters in relation to the error gradient information is a better solution for this issue (Schalkoff, 1997; Haykin, 1999).

Fig. 5. A fully-connected BN with the architecture of BN(N0, N1, N2, N3): an N0-dimension input layer, hidden layers with N1 and N2 neurons, and an N3-dimension output layer, connected by the interlayer weights w_ih, w_ji and w_kj; the desired outputs d define the error signals e at the output neurons.
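To make Eqs. (1)–(5.2) concrete, the forward activation and the sequential-mode delta-rule update can be sketched in Python (an illustrative toy implementation, not the authors' Matlab code; the network size, learning rate eta and momentum mu below are assumed values):

```python
import math
import random

def sigmoid(v):
    # log-sigmoid transfer function (cf. Eq. (8.2))
    return 1.0 / (1.0 + math.exp(-v))

class TinyBN:
    """One hidden layer; sigmoid hidden neurons, linear output neurons."""

    def __init__(self, n_in, n_hid, n_out, eta=0.2, mu=0.5):
        rnd = random.Random(1)
        # w[...][0] is the bias weight fed by a fixed unit input (w_k0 in Eq. (1))
        self.w_h = [[rnd.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hid)]
        self.w_o = [[rnd.uniform(-0.5, 0.5) for _ in range(n_hid + 1)] for _ in range(n_out)]
        self.dw_h = [[0.0] * (n_in + 1) for _ in range(n_hid)]
        self.dw_o = [[0.0] * (n_hid + 1) for _ in range(n_out)]
        self.eta, self.mu = eta, mu

    def forward(self, x):
        # forward activation, Eq. (1)
        y_h = [sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))) for w in self.w_h]
        y_o = [w[0] + sum(wi * yi for wi, yi in zip(w[1:], y_h)) for w in self.w_o]  # linear output
        return y_h, y_o

    def train_step(self, x, d):
        # sequential mode: one dataset presented, one weight update
        y_h, y_o = self.forward(x)
        e = [dk - yk for dk, yk in zip(d, y_o)]             # Eq. (2)
        delta_o = e[:]                                      # Eq. (4.2): phi' = 1 for a linear output
        delta_h = [yj * (1 - yj) *                          # Eq. (5.2): phi'_j * sum_k delta_k w_kj
                   sum(self.w_o[k][j + 1] * delta_o[k] for k in range(len(delta_o)))
                   for j, yj in enumerate(y_h)]
        for k, w in enumerate(self.w_o):                    # Eq. (4.1) with momentum
            for j in range(len(w)):
                yj = 1.0 if j == 0 else y_h[j - 1]
                self.dw_o[k][j] = self.eta * delta_o[k] * yj + self.mu * self.dw_o[k][j]
                w[j] += self.dw_o[k][j]
        for j, w in enumerate(self.w_h):                    # Eq. (5.1)
            for i in range(len(w)):
                xi = 1.0 if i == 0 else x[i - 1]
                self.dw_h[j][i] = self.eta * delta_h[j] * xi + self.mu * self.dw_h[j][i]
                w[i] += self.dw_h[j][i]
        return 0.5 * sum(ek * ek for ek in e)               # Eq. (3.1)
```

Training this toy network over a few datasets drives sse of Eq. (3.1) down epoch by epoch; the batch mode would instead accumulate the corrections over all datasets before applying them.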
…individuals die out. Reproduction is stopped when the number of individuals in the mating pool is equal to the number of individuals in a generation.

Crossover and mutation: Individuals in the mating pool are arranged randomly in couples, the so-called parents. Each couple is designated to yield two offspring by randomly exchanging one of their parental parameters. During the crossover operation, the parameters in the offspring can be randomly mutated (i.e. modified) with a very small probability. This mutation operation is used to protect the generations from losing some potentially useful genetic material.

After the genetic operations, the individuals in the next generation are generally characterized by an increased average fitness. This iteration is terminated after several generations when some conditions or criteria are reached (e.g. the improvement on the average fitness converges). Then the individual that appears most frequently in the latest generation is regarded as a solution to the given problem.
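The reproduction–crossover–mutation cycle described above can be sketched as follows (an illustrative Python sketch under the paper's conventions: each individual is a parameter vector scaled within 0 and 1; the mutation probability is an assumed value):

```python
import random

rnd = random.Random(7)

def crossover(mom, dad):
    # each couple yields two offspring by exchanging one randomly chosen parameter
    i = rnd.randrange(len(mom))
    kid1, kid2 = mom[:], dad[:]
    kid1[i], kid2[i] = dad[i], mom[i]
    return kid1, kid2

def mutate(ind, p_mut=0.01):
    # parameters are mutated with a very small probability, protecting the
    # generations from losing potentially useful genetic material
    return [rnd.random() if rnd.random() < p_mut else g for g in ind]

def evolve(mating_pool):
    # arrange the pool randomly in couples, then apply crossover and mutation
    pool = mating_pool[:]
    rnd.shuffle(pool)
    nxt = []
    for mom, dad in zip(pool[0::2], pool[1::2]):
        for kid in crossover(mom, dad):
            nxt.append(mutate(kid))
    return nxt
```

The next generation has the same size as the mating pool, and every gene either comes from a parent or is a fresh random number within 0 and 1.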
Fig. 6. Schematic illustration for the genetic algorithm (former generation → reproduction → mating pool → crossover and mutation → next generation; one individual is selected by its appearance frequency).

3.3. Philosophy of the proposed technique
[Figure residue: flowchart of the proposed technique — step 1: dataset preparation (parameter sets p^FL fed to FLAC3D to compute key convergences u^FL_key); step 2: network training (the scaled p^BN and u^BN_key used to adjust the BN weights until the network is well trained); step 4: revalidation (the estimated parameter set fed back to FLAC3D and u^FL_key compared with the measured u^MO_key).]

…algorithm. Firstly, the key convergences measured in situ, u^MO_key, should also be scaled into the dimensionless counterpart u^GA_key, which serves as the desired output for the aforementioned post-trained network. An individual p^GA represents a set of rheological parameters in this context. Then a generation of randomly initiated individuals (300 individuals, for example) is fed to the post-trained network, and the fitness of each individual is defined as the sum-square-error between the network output o^GA and the desired output u^GA_key. Then, this generation of individuals undergoes evolution via the genetic operations. This evolution is terminated after several generations when some conditions or criteria are reached, and the individual appearing most frequently in the latest generation is regarded as an estimation for the rheological parameter set. All the works presented in step two and step three are implemented in Matlab (Mathworks, 2006).

In step four, the rheological parameter set estimated by BN and GA should be reversely scaled (from the dimensionless numbers to their real values) and then revalidated by the identical numerical simulations aforementioned. This estimation is considered a reliable one if the key convergences computed from the numerical simulations agree with those measured in situ. The implementation details of the proposed technique for the Ureshino tunnel line I and some discussions on the stochastic nature of the proposed technique are expatiated in Section 4.

4. Implementation details and discussions

4.1. Dataset preparation

The training datasets are provided by the numerical simulations (code: FLAC3D). According to the geological conditions and tunnel dimensions presented in Section 2, a full 3D model including thirty excavation cycles from STA 212+00 to STA 211+70 in the longitudinal direction (y axis) is set up first. The surrounding rock mass is basically hornblende andesite lava with a cover of about 250 m, whereas the simulation regions in the cross-section are only five times and three times the tunnel diameter in the vertical direction (z axis) and the horizontal direction (x axis), respectively. The surrounding rock mass is assumed to be a Mohr-Coulomb material at this stage, with its mechanical properties listed in Table 1 (JPHC, 2000).

Then, according to the construction cycles presented in Section 2, the tunnel face advances and the rock bolts and shotcrete lining are modeled gradually. The second lining is modeled after all these thirty excavation cycles are completed. The mechanical properties of the shotcrete lining and the second lining are listed in Table 2 (JPHC, 2000).

After the construction process is completed, the Burger-MC deterioration rheological model is applied to the surrounding rock mass, and the aforementioned numerical simulations proceed to the rheological calculation focusing on the first five years of the tunnel's service life. The numerical model consists of 35,898 gridpoints, 33,450 zones and 2100 structure elements. The computing time for the rheological calculation is nearly 10 h per case, under the configuration of an AMD Athlon(tm) 3800+ and 1.0 GB memory.

The rheological parameters applied to the numerical simulations are initiated in a random way. Fifty sets of random numbers (within 0 and 1) are initiated first, so that the rheological parameters applied to the numerical simulations can be scaled from them by the following equation:

p^{FL} = p^{lower} + (p^{upper} - p^{lower}) \cdot p^{BN}  (6)

In the above equation, p^BN is a 6×1 vector recording the random numbers within 0 and 1 that are fed to the BN as the input signals; p^FL is the counterpart vector recording the six rheological parameters that are applied to the numerical simulations. p^lower and p^upper are two 6×1 vectors recording the lower and upper limits for the six rheological parameters, which can take reference from the literature (Hudson and Harrison, 1997; Li and Xia, 2000; Shin et al., 2005; Fabre and Pellet, 2006). The information for these 50 cases is partly listed in Table 3.

Then, the 12 key convergences (i.e. u_c, u_s and u_i at the 12th, 24th, 36th and 48th month, respectively) computed from the numerical simulations are recorded and then scaled into dimensionless numbers within 0 and 1 by the following equation:

u^{BN}_{key} = \frac{u^{FL}_{key} - u^{min}_{key}}{u^{max}_{key} - u^{min}_{key}}  (7)

In the above equation, u^FL_key is a 12×1 vector recording the key convergences that are computed from the numerical simulations; u^BN_key is its counterpart vector recording the scaled key convergences that serve as the desired outputs for BN training. u^min_key and u^max_key are two 12×1 vectors recording the minimum and maximum key convergences amongst these fifty cases of numerical simulations.

The normalization for p^BN and u^BN_key is very important for the training process for two reasons: it can prevent the large input (output) components from overriding the small ones; and the sigmoid transfer function that is commonly used in the hidden neurons is sensitive to values within −1 and 1.

Table 1
Properties of the rock masses employed in numerical simulations (JPHC, 2000)

Bulk modulus K (MPa): 833
Shear modulus G (MPa): 385
Cohesion c (MPa): 0.577
Friction angle φ (°): 30.0
Dilation angle ψ (°): 5.1
Density ρ (kg/m³): 2500
Residual cohesion c_res (MPa): 0.310
Residual friction angle φ_res (°): 25.5
In situ σ_zz (MPa): 8.0
In situ σ_xx (MPa): 8.0

Table 2
Mechanical properties of the shotcrete lining and the second lining (JPHC, 2000)

Property — Shotcrete lining — Second lining
Young's modulus E (MPa): 1.0e4 — 2.0e4
Poisson ratio μ: 0.25 — 0.25
UCS σ_c (MPa): 15 — 30
Friction angle φ (°): 35 — 40
Dilation angle ψ (°): 3.0 — 3.0
Thickness of lining t_c (m): 0.15 — 0.35

Table 3
Parameter information of 50 randomly generated cases (scaled values within 0 and 1)

Case No. — G_K — η_K — η_M — ω_c — ω_φ — R_thr
01: 0.527 — 0.729 — 0.468 — 0.516 — 0.650 — 0.650
02: 0.791 — 0.250 — 0.138 — 0.461 — 0.525 — 0.613
… — … — … — … — … — … — …
49: 0.412 — 0.556 — 0.444 — 0.111 — 0.125 — 0.750
50: 0.333 — 0.667 — 0.345 — 0.778 — 0.550 — 0.163

Lower and upper limits for each parameter: G_K: 1e8–1e9 Pa; η_K: 1e16–1e17 Pa s; η_M: 1e17–1e18 Pa s; ω_c: 1e4–1e5 Pa/year; ω_φ: 0.5–4.5 °/year; R_thr: 0.2–1.0.

4.2. Network training

4.2.1. Network architecture
There is no doubt that the input layer includes six input signals corresponding to the six rheological parameters under consideration, and the output layer includes 12 output neurons corresponding to the 12 key convergences. Although the output layer,
theoretically, can include as many output neurons as the monitoring data available, increasing the number of output neurons excessively is neither economic nor accurate, due to the following reason (Schalkoff, 1997; Haykin, 1999). The number of datasets required to train a BN well (i.e. having a good generalization ability) is closely related to the number of synaptic weights involved in the network. When the training datasets are very limited, increasing the number of output neurons would increase the number of synaptic weights greatly and consequently increase the risk of overfitting significantly.

The transfer function employed in the output neurons is a linear function, as described in the following.

\varphi(v) = v  (8.1)

Determining the numbers of hidden layers and hidden neurons is more complicated. Generally speaking, the number of hidden layers and hidden neurons is highly dependent on the nonlinearity and multimodality of the problem under consideration. More hidden layers and neurons are able to detect more hidden features in the training datasets, but increase the risk of overfitting at the same time (Schalkoff, 1997; Haykin, 1999). However, one knows nothing about the relationship between the network architecture and the problem under consideration prior to some attempts. Therefore, four types of network architecture with different hidden neurons and hidden layers are tried, with more detailed discussions expatiated in Section 4.5. The transfer function employed in the hidden neurons is the log-sigmoid function, as described in the following:

\varphi(v) = \frac{1}{1 + \exp(-v)}  (8.2)

4.2.2. Leave-n-out corotation strategy
The available datasets should be divided into two subsets (training subset and validating subset) by a certain proportion. Generally, the learning curve (i.e. error function plot) for the training subset decreases monotonically with the increase of training epochs. In contrast, the learning curve for the validating subset decreases monotonically to a minimum and then starts to increase as the training epochs continue, as schematically illustrated in Fig. 8 (Schalkoff, 1997; Haykin, 1999). This heuristic suggests that the point of minimum on the validation learning curve can be used as a stopping criterion to prevent excessive training.

[Fig. 8 residue: learning curves of the error function (e.g. sse) against training epochs for the training subset and the validating subset.]

However, the subset partitioning is an intractable issue, since the available datasets are very limited. Herein, a leave-n-out corotational strategy is employed to deal with this issue (Basheer and Hajmeer, 2000; Abraham, 2004). Ten datasets are randomly selected for validation and the other forty datasets for training. Then train the network until the stopping criterion is met. Repeat this iteration and train the network continuously, until each dataset has appeared at least once in both the validating subset and the training subset.

4.2.3. Random initial weights
The initialization of synaptic weights has an effect not only on the network convergence speed, but also on the network's generalization ability. Herein, a random initial weights strategy is employed to deal with this issue (Basheer and Hajmeer, 2000; Abraham, 2004). The synaptic weights are initiated randomly within the intervals of (−3 N_l^{−1/2}, 3 N_l^{−1/2}), where N_l is the number of neurons in the lth layer. The network is then trained by the leave-n-out corotation process as presented above. After that, all the 50 datasets are fed to the network entirely and the network performance (i.e. the error function sse_av) is recorded.

This kind of iteration of random initializing and corotative training is repeated dozens of times, and the network with the best performance (i.e. minimum sse_av) is considered as a post-trained network that can approximate the numerical simulations aforementioned in the sense of statistics.

4.3. Genetic searching

The aforementioned post-trained network together with the genetic algorithm is employed to estimate the six rheological parameters according to the monitoring data. The key convergences measured in situ, u^MO_key, should be scaled into the dimensionless counterpart u^GA_key by the following equation, and serve as the desired output for the post-trained network aforementioned.

u^{GA}_{key} = \frac{u^{MO}_{key} - u^{min}_{key}}{u^{max}_{key} - u^{min}_{key}}  (9)

Three hundred individuals p^GA (each individual represents a set of scaled rheological parameters within 0 and 1) are initiated randomly and fed to the post-trained network. Comparing the network output o^GA with the desired output u^GA_key, the error function sse is defined as the fitness for each individual. Then, a reproduction indicator ω is devised to select these individuals into the mating pool probabilistically, as formulated in the following:

\omega(sse) = -\frac{sse - sse_{av}}{sse_{av}} + gauss(0, \alpha)  (10)

In the above equation, sse_av is the average error function for the whole generation, and gauss(0, α) is a Gaussian-distributed random number generator with a mean of zero and a variance of α. An individual is selected into the mating pool when its reproduction indicator is greater than zero, or is prohibited on the inverse condition. Individuals in the mating pool are arranged randomly in couples, […] in the natural form.

The evolution is terminated after 8–10 generations, when the […]
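The probabilistic selection of Eq. (10) can be sketched as follows (an illustrative Python sketch; the variance alpha and the population below are assumed values, and the sign convention assumes that a below-average sse makes selection more likely):

```python
import random

def reproduction_indicator(sse, sse_av, alpha=0.1):
    # Eq. (10): a below-average error gives a positive indicator,
    # perturbed by Gaussian noise so the selection stays probabilistic
    return -(sse - sse_av) / sse_av + random.gauss(0.0, alpha)

def select_mating_pool(population, fitness, alpha=0.1):
    # fitness[i] is the sse of individual i; sweep the generation repeatedly,
    # admitting individuals whose indicator is greater than zero, until the
    # mating pool is as large as the generation
    sse_av = sum(fitness) / len(fitness)
    pool = []
    while len(pool) < len(population):
        for ind, sse in zip(population, fitness):
            if len(pool) == len(population):
                break
            if reproduction_indicator(sse, sse_av, alpha) > 0:
                pool.append(ind)
    return pool
```

Fitter (lower-sse) individuals may enter the pool several times, while the worst are usually prohibited, which is the reproduction step described in Section 3.2.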
…by the BN and GA should be revalidated again by the identical numerical simulations. Trying the estimation in the identical numerical simulations and comparing the results to the monitoring data, the error of crown convergence between them, e_uc, is defined as

e_{uc} = \frac{\| u^{FL}_c - u^{MO}_c \|}{\| u^{MO}_c \|}  (11)

where u^MO_c is the crown convergence measured in situ, and u^FL_c is the counterpart that is computed from the numerical simulations. ‖·‖ denotes vector norm calculation. The errors of springline convergence and invert convergence, e_us and e_ui, can also be defined by similar means.

For example, a network with the architecture of BN(6, 9, 12) (i.e. six input neurons in the input layer, nine hidden neurons in a hidden layer and twelve output neurons in the output layer) is trained by the strategy and the datasets presented above; then the post-trained network is employed to search for a parameter set estimation via GA. This estimation is revalidated by the identical numerical simulations again, with the results depicted in Fig. 9. The results from the numerical simulations agree well with the monitoring data, with the three errors being 4.0%, 12.1% and 9.3%, respectively. It implies that this estimation for the rheological parameter set is rather reliable.

However, it is expected that there exist numerous estimations due to the stochastic nature of the genetic operations and the training process, and there is no guarantee that all these estimations are always reliable. Section 4.5 presents more discussions on this issue.

4.5. Discussions

[…] for example, five parameter set estimations provided by the genetic searching are listed in Table 4. The difference between these five estimations is relatively small, and the median one among them is selected as the possible estimation for the post-trained network BN(6, 9, 12)_T3. A similar tendency can be observed for other post-trained networks. Therefore, it can be concluded that the possible estimation provided by the genetic searching is generally determinate for a particular post-trained network, when the number of individuals in each generation is large enough.

On the other hand, the randomness in network training is comparatively large, especially when the available datasets are limited. The BN is expected to approximate the numerical simulations in the sense of statistics, after it has been well trained by the datasets provided by the numerical simulations. Therefore, the number of datasets required to train a BN well is closely related to the number of synaptic weights and the desired goal of the error function. It is expected to be a very large number (Schalkoff, 1997; Haykin, 1999). Therefore, merely fifty datasets are insufficient on this occasion. Taking another post-trained network BN(6, 9, 12)_T2 for example, the estimation provided by this post-trained network is proved to be a bad one when revalidated by the numerical simulations. The simulation results are depicted in Fig. 10, with the three errors being 8.6%, 14.3% and 34.1%, respectively. Table 5 summarizes five possible estimations provided by those post-trained networks (T1–T5) sharing the same architecture of BN(6, 9, 12). These five estimations are quite different from one another, which implies that the randomness in the training process is comparatively large. Herein, the estimation with the minimum errors is selected as the optimal one for the network architecture of BN(6, 9, 12). Theoretically, increasing the number of datasets can reduce the randomness in the training process and increase the accuracy of estimation.
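Eq. (11) is a plain relative vector-norm error; it can be sketched as follows (illustrative Python, with made-up convergence vectors in the usage below):

```python
import math

def relative_error(u_sim, u_mon):
    # Eq. (11): ||u_sim - u_mon|| / ||u_mon||, Euclidean vector norms
    diff = math.sqrt(sum((a - b) ** 2 for a, b in zip(u_sim, u_mon)))
    norm = math.sqrt(sum(b * b for b in u_mon))
    return diff / norm
```

With the four crown convergences at the 12th, 24th, 36th and 48th month as the vectors, e_uc = relative_error(u_fl_c, u_mo_c); the springline and invert errors e_us and e_ui take the same form.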
[Fig. 9/Fig. 10 residue: plots of delayed deformations (mm, 0–50) against time (months, 0–60) comparing the monitoring data with the numerical simulation results and the network training results for u_c, u_s and u_i; the estimated parameter sets shown beneath the plots are those of trains T3 and T2 in Table 5, with e_us = 12.1% for T3 and e_us = 14.3% for T2.]

Fig. 9. Comparison of the in situ monitoring data to the numerical simulation results and the network training results (for BN(6, 9, 12)_T3).

Fig. 10. Comparison of the in situ monitoring data to the numerical simulation results and the network training results (for BN(6, 9, 12)_T2).
Z. Guan et al. / Tunnelling and Underground Space Technology 24 (2009) 250–259 257
Table 5
Five possible estimations for the rheological parameter set (for BN(6, 9, 12))

Train no. — G_K (Pa) — η_K (Pa s) — η_M (Pa s) — ω_c (Pa/year) — ω_φ (°/year) — R_thr — e_uc (%) — e_us (%) — e_ui (%)
T1 — 5.96E+08 — 4.38E+16 — 2.63E+17 — 4.69E+04 — 2.70 — 0.61 — 6.9 — 20.1 — 11.3
T2 — 5.28E+08 — 4.49E+16 — 6.24E+17 — 2.25E+04 — 2.65 — 0.71 — 8.6 — 14.3 — 34.1
T3* — 6.04E+08 — 4.04E+16 — 3.83E+17 — 3.88E+04 — 2.84 — 0.49 — 4.0 — 12.1 — 9.3
T4 — 5.63E+08 — 4.20E+16 — 5.05E+17 — 4.49E+04 — 2.46 — 0.44 — 6.2 — 25.7 — 8.2
T5 — 7.35E+08 — 4.45E+16 — 2.27E+17 — 3.91E+04 — 3.61 — 0.62 — 5.1 — 19.4 — 7.6

* The one with minimum errors is selected as the optimal estimation.
Table 6
Five possible estimations for the rheological parameter set (for BN(6, 6, 12, 12))

Train no. — G_K (Pa) — η_K (Pa s) — η_M (Pa s) — ω_c (Pa/year) — ω_φ (°/year) — R_thr — e_uc (%) — e_us (%) — e_ui (%)
T1* — 6.87E+08 — 3.48E+16 — 4.21E+17 — 5.08E+04 — 3.14 — 0.61 — 8.1 — 13.2 — 6.0
T2 — 9.03E+08 — 3.12E+16 — 1.22E+17 — 5.34E+04 — 2.94 — 0.53 — 24.4 — 57.1 — 24.6
T3 — 6.87E+08 — 3.48E+16 — 5.92E+17 — 6.02E+04 — 3.14 — 0.61 — 14.1 — 18.7 — 8.5
T4 — 5.36E+08 — 4.11E+16 — 4.47E+17 — 4.31E+04 — 1.50 — 0.79 — 5.6 — 21.5 — 7.9
T5 — 7.07E+08 — 3.32E+16 — 4.93E+17 — 2.89E+04 — 3.44 — 0.65 — 13.0 — 21.7 — 16.7

* The one with minimum errors is selected as the optimal estimation.
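The footnotes of Tables 5 and 6 pick the train with the minimum errors; if "minimum errors" is read as the minimum unweighted average of e_uc, e_us and e_ui (an assumption, since the paper does not spell the criterion out), the choice for Table 5 can be reproduced as:

```python
# (e_uc, e_us, e_ui) in percent for the five trains of BN(6, 9, 12), from Table 5
errors = {
    "T1": (6.9, 20.1, 11.3),
    "T2": (8.6, 14.3, 34.1),
    "T3": (4.0, 12.1, 9.3),
    "T4": (6.2, 25.7, 8.2),
    "T5": (5.1, 19.4, 7.6),
}
avg = {train: sum(e) / len(e) for train, e in errors.items()}
best = min(avg, key=avg.get)  # "T3", the row marked optimal in Table 5
```

Applying the same rule to the Table 6 errors reproduces T1 as the optimal train for BN(6, 6, 12, 12).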
In practice, however, trying and validating seem to be a more realistic solution for this issue.

4.5.2. Hidden neurons and hidden layers

To discuss the influence of hidden neurons and hidden layers on the network's learning and generalization ability, another three different network architectures are tried by similar means presented above. Taking the architecture of BN(6, 6, 12, 12) for example, Table 6 summarizes five possible estimations provided by those post-trained networks sharing the same architecture. After revalidating, the possible estimation provided by the network T1 has the minimum average error and is selected as the optimal estimation for BN(6, 6, 12, 12). The revalidating simulation results are depicted in Fig. 11, with the three errors being 8.1%, 13.2% and 6.0%, respectively.

[Fig. 11 residue: delayed deformations (mm, 0–50) against time (months, 0–60) for the monitoring data, the simulation results from FLAC3D and the training results from BN(6, 6, 12, 12)_T1; e_uc = 8.1%, e_us = 13.2%, e_ui = 6.0%; the parameter set shown beneath the plot is that of train T1 in Table 6.]

Fig. 11. Comparison of the in situ monitoring data to the numerical simulation results and the network training results (for BN(6, 6, 12, 12)_T1).

Table 7 summarizes the four optimal estimations for the different network architectures. In fact, all these four estimations are rather good, since their average errors are below 10%. However, one cannot tell easily which architecture is more proper for this problem, because the estimation is stochastic in nature and no statistical characteristic can be found through such few attempts. On the precondition that the errors are acceptable (for example, the average error is below 10%), it is the author's opinion that the networks with fewer hidden neurons are preferred.

4.5.3. Characteristic of the long-term deformations

As far as Ureshino tunnel line I is concerned, 20 possible estimations (five estimations for each of the four network architectures) are provided by the proposed technique presented above. Based on the revalidating results (although not totally listed in this paper), it can be concluded that the strength deterioration parameters (ω_c, ω_φ and R_thr) would influence u_i more significantly, and the viscoelastic parameters (G_K, η_K and η_M) would influence both u_c and u_i to some extent. However, it is difficult (or impossible) to find an estimation whose simulation results could agree well with the monitoring data of u_c, u_i and u_s simultaneously, since there are no numerical simulations that can take a comprehensive account of the in situ conditions and the rock mass behaviors. Therefore, it is up to the engineer to select the best estimation according to his/her judgment. For example, the first estimation in Table 7 is preferred, since it has the minimum average error.

In fact, ground reinforcement on Ureshino tunnel line I was conducted after the Ureshino tunnel line II was completed in 2000. The reinforcement mainly included the backfill grouting beneath the invert and the additional bolting at the foot, which proved to have a good effect on controlling the tunnel convergence at the invert. Although the in situ conditions have changed, the rheological
Table 7
Four optimal estimations for the rheological parameter set (for BNs with different architectures)
Architecture GK (Pa) gK (Pa s) gM (Pa s) xc (Pa/year) x/ (°/year) Rthr euc (%) eus (%) eui (%)
BN(6, 9, 12) 6.04E+08 4.04E+16 3.83E+17 3.88E+04 2.84 0.49 4.0 12.1 9.3
BN(6, 16, 12) 6.11E+08 3.82E+16 3.28E+17 4.94E+04 2.55 0.61 3.1 15.3 12.4
BN(6, 6, 12, 12) 6.87E+08 3.48E+16 4.21E+17 5.08E+04 3.14 0.61 8.1 13.2 6.0
BN(6, 8, 10, 12) 7.10E+08 3.99E+16 2.97E+17 5.45E+04 2.71 0.65 3.5 12.1 12.0
parameters estimated by the proposed method are still useful to predict the tunnel's long-term deformations in the future.

5. Conclusions

The long-term deformations of mountain tunnels, which attract more and more attention, are closely related to the time-dependent features of the surrounding rock mass. Although many rheological models have been proposed to account for the time-dependent features of rock mass from manifold aspects, for a certain engineering instance, however, it is difficult to estimate the rheological parameters due to the high cost and long duration of rheological tests. Therefore, a rheological parameter estimation technique using BN and GA is proposed in this paper, and applied to an engineering instance, the Ureshino tunnel line I on Nagasaki expressway.

The Burger-MC deterioration rheological model is applied to the surrounding rock mass to account for the mechanism of the delayed deformations that occurred during the tunnel's service life. The six rheological parameters involved in this model are estimated via four steps: (1) numerical simulations are utilized first to generate a certain number of datasets for BN training; (2) the BN is trained with those datasets to approximate the numerical simulations in the sense of statistics; (3) the post-trained network combined with the genetic algorithm is employed to provide an optimal estimation for the rheological parameter set according to the in situ monitoring data; (4) this optimal estimation should be revalidated by the aforementioned identical numerical simulations.

The stochastic nature of the proposed technique is discussed. The randomness in the genetic searching is relatively small when the number of individuals in each generation is large enough. In contrast, the randomness of the network training is comparatively large, because the available datasets are insufficient compared with the datasets required to train the network well. Therefore, it is suggested that networks with fewer hidden neurons are preferred, on the precondition that the errors between the revalidat…

…Guan et al. (2008). The deviatoric behavior can be schematically illustrated in Fig. A1, where a Kelvin unit, a Maxwell unit and a MC plastic unit are connected in series and subjected to certain deviatoric stresses jointly. The constitutive laws for the Kelvin and Maxwell units are formulated by Eq. (A1) and Eq. (A2).

s_{ij} = 2\eta^K \dot{e}^K_{ij} + 2G^K e^K_{ij}  (A1)

\dot{e}^M_{ij} = \frac{\dot{s}_{ij}}{2G^M} + \frac{s_{ij}}{2\eta^M}  (A2)

For the plastic unit, a Mohr-Coulomb failure criterion f is first defined as Eq. (A3). After the failure criterion is reached, the deviatoric behavior of the plastic unit is specified by the flow rule and the plastic potential g, as shown in Eq. (A4) and Eq. (A5).

f = \sigma_1 - \sigma_3\,\frac{1+\sin\phi}{1-\sin\phi} + 2c\sqrt{\frac{1+\sin\phi}{1-\sin\phi}}  (A3)

\dot{e}^P_{ij} = \lambda\left(\frac{\partial g}{\partial\sigma_{ij}} - \frac{\dot{e}^P_{kk}}{3}\delta_{ij}\right) \quad \text{with} \quad \dot{e}^P_{kk} = \lambda\left(\frac{\partial g}{\partial\sigma_1} + \frac{\partial g}{\partial\sigma_2} + \frac{\partial g}{\partial\sigma_3}\right)  (A4)

g = \sigma_1 - \sigma_3\,\frac{1+\sin\psi}{1-\sin\psi}  (A5)

In the above equations, e_ij and s_ij are the deviatoric strain and stress tensors; e_kk and σ_kk are the volumetric strain and stress. σ_1, σ_2 and σ_3 are the three principal stresses. G^K and η^K are the shear modulus and the viscosity of the Kelvin unit. G^M and η^M are the shear modulus and the viscosity of the Maxwell unit. c, φ and ψ are the cohesion, the friction angle and the dilation angle of the MC plastic unit. λ is a multiplier that can be eliminated in the calculation afterwards. The variables denoted by a dot mark refer to their first-order derivatives with respect to rheological time. The superscripts K, M and P denote the Kelvin, the Maxwell and the MC plastic partitions of the corresponding variables. In addition, the Burger-MC deterioration rheological model assumes that the cohesion c and the friction angle φ will decrease with rheological time, regardless of whether the loss of strength is caused by cyclic loading fatigue, by clay mineral hydration or by some other reasons. It is assumed that the loss of
strength is controlled by its current stress state. Furthermore, there
ing simulation results and the monitoring data are acceptable.
exists a threshold (i.e. Rthr) to initiate this kind of strength deterio-
The characteristic of long-term deformations for the Burger-MC
ration and two lower limits (i.e. cres and /res) to circumscribe the
deterioration rheological model is also summarized in this paper. It
strength deterioration.
is difficult (or impossible) to find out an estimation whose simula-
tion results could agree well with all the monitoring data simulta- dc
¼ xc R ðR P Rthr ; c P cres Þ ðA6Þ
neously. Therefore, it is up to the engineer to select a best dt
estimation from those ones provided by the BN and GA, according d/
¼ x/ R ðR P Rthr ; / P /res Þ ðA7Þ
to his/her judgment. dt
r1 r3
R¼ ðA8Þ
Appendix A. Burger-MC deterioration rheological model 2c cos / þ ðr1 þ r3 Þ sin /
In the above equations, the parameter R is named as stress coeffi-
The constitutive laws of Burger-MC deterioration rheological cient in this paper and indicates the ‘‘distance” from the current
model are introduced briefly in this appendix, which is based on stress state to the MC failure envelope. When the stress coefficient
Fig. A1. Schematic representation of the Burger-MC deterioration rheological model: a Kelvin unit (G_K, η_K), a Maxwell unit (G_M, η_M) and a MC plastic unit (c, φ, ψ; deterioration parameters ω_c, ω_φ, R_thr) connected in series under the deviatoric stress s_ij.
When the stress coefficient is greater than a certain threshold R_thr, the rock strength begins to deteriorate. The multipliers ω_c and ω_φ are two deterioration ratios that control the rates at which the cohesion and the friction angle deteriorate, and c_res and φ_res are the residual cohesion and residual friction angle, which can be estimated from conventional triaxial tests. The proposed rheological model can be implemented in the numerical code FLAC3D, although it is not included directly in the FLAC3D library of constitutive laws. The built-in programming language FISH helps to implement the proposed model within the framework of the classic Burger-MC rheological model in an indirect way (Itasca Consulting Group Inc., 1997): the cohesion and the friction angle of every rock zone are reevaluated according to Eqs. (A6)-(A8) after each rheological period has been computed. Commonly, the shear modulus of the Maxwell unit G_M is taken to be the same as the ordinary shear modulus G. Therefore, the model involves six rheological parameters (G_K, η_K, η_M, ω_c, ω_φ and R_thr) that should be estimated by using BN and GA according to the in situ monitoring data.

References

Abraham, A., 2004. Meta learning evolutionary artificial neural networks. Neurocomputing 56 (1), 1–38.
Basheer, I., Hajmeer, M., 2000. Artificial neural network: fundamentals, computing, design and application. J. Microbiol. Meth. 43 (1), 3–31.
Cai, J., Zhao, J., Hudson, J., 1998. Computerization of rock engineering system using neural network with expert system. Rock Mech. Rock Eng. 31 (3), 135–152.
Fabre, G., Pellet, F., 2006. Creep and time-dependent damage in argillaceous rocks. Int. J. Rock Mech. Min. Sci. 43 (6), 950–960.
Feng, X., Zhang, Z., Sheng, Q., 2000. Estimating mechanical rock mass parameters relating to the Three Gorges Project permanent shiplock using an intelligent displacement back analysis method. Int. J. Rock Mech. Min. Sci. 37 (7), 1039–1054.
Goldberg, D., 1989. Genetic Algorithms in Search, Optimization, and Machine Learning. Addison-Wesley.
Guan, Z., Jiang, Y., Tanabashi, Y., Huang, H., 2008. A new rheological model and its application in mountain tunnelling. Tunn. Undergr. Space Technol. 23 (3), 292–299.
Haykin, S., 1999. Neural Networks: A Comprehensive Foundation. Prentice-Hall Inc.
Hudson, J., Harrison, J., 1997. Engineering Rock Mechanics. Pergamon.
Itasca Consulting Group Inc., 1997. FLAC3D: Fast Lagrangian Analysis of Continua in 3 Dimensions. Version 2.0, User Manual. Itasca Consulting Group Inc.
Japan Public Highway Corporation, 2000. The construction and monitoring reports of the Nagasaki Expressway. JPHC (in Japanese).
Jin, J., Cristescu, N., 1998. An elastic viscoplastic model for transient creep of rock salt. Int. J. Plasticity 14 (1), 85–107.
Li, Y., Xia, C., 2000. Time-dependent tests on intact rocks in uniaxial compression. Int. J. Rock Mech. Min. Sci. 37 (3), 467–475.
Maranini, E., Brignoli, E., 1999. Creep behaviour of a weak rock: experimental characterization. Int. J. Rock Mech. Min. Sci. 36 (1), 127–138.
Mathworks Inc., 2006. Matlab R2006a, User Help Document. Mathworks Inc.
Mayoraz, F., Vulliet, L., 2002. Neural networks for slope movement prediction. Int. J. Geomech. ASCE 2 (2), 153–173.
Meulenkamp, F., Grima, M., 1999. Application of neural networks for the prediction of the unconfined compressive strength from Equotip hardness. Int. J. Rock Mech. Min. Sci. 36 (1), 29–39.
Nawrocki, P., Cristescu, N., Dusseault, M., Bratli, R., 1999. Experimental methods for determining constitutive parameters for nonlinear rock modeling. Int. J. Rock Mech. Min. Sci. 36 (5), 659–672.
Pellet, F., Hajdu, A., Deleruyelle, F., Besnus, F., 2005. A viscoplastic model including anisotropic damage for the time dependent behaviour of rock. Int. J. Numer. Anal. Meth. Geomech. 29 (9), 941–970.
Pichler, B., Lackner, R., Mang, H., 2003. Back analysis of model parameters in geotechnical engineering by means of soft computing. Int. J. Numer. Meth. Engng. 57 (14), 1943–1978.
Rangel, J., Iturraran-Viveros, U., Ayala, A., Cervantes, F., 2005. Tunnel stability analysis during construction using a neuro-fuzzy system. Int. J. Numer. Anal. Meth. Geomech. 29 (15), 1433–1456.
Schalkoff, J., 1997. Artificial Neural Network. McGraw-Hill.
Shao, J., Chau, K., Feng, X., 2006. Modeling of anisotropic damage and creep deformation in brittle rocks. Int. J. Rock Mech. Min. Sci. 43 (4), 582–592.
Shin, K., Okubo, S., Fukui, K., Hashiba, K., 2005. Variation in strength and creep life of six Japanese rocks. Int. J. Rock Mech. Min. Sci. 42 (2), 251–260.
Sonmez, H., Gokceoglu, C., Nefeslioglu, H., Kayabasi, A., 2006. Estimation of rock modulus: for intact rocks with an artificial neural network and for rock masses with a new empirical equation. Int. J. Rock Mech. Min. Sci. 43 (2), 224–235.
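The strength-deterioration rule of Eqs. (A6)-(A8) in Appendix A amounts to an explicit update applied to every rock zone after each computed rheological period. The sketch below illustrates that update in Python; the paper's implementation is written in FISH inside FLAC3D, so the function names, the explicit Euler time stepping and all numerical values here are illustrative assumptions, not the original code.

```python
import math

def stress_coefficient(sig1, sig3, c, phi):
    """Eq. (A8): ratio of the current deviatoric stress to the
    Mohr-Coulomb failure deviator; R = 1 on the failure envelope."""
    return (sig1 - sig3) / (2.0 * c * math.cos(phi) + (sig1 + sig3) * math.sin(phi))

def deteriorate(c, phi, sig1, sig3, dt,
                omega_c, omega_phi, R_thr, c_res, phi_res):
    """Explicit update of Eqs. (A6)-(A7) over one rheological period dt:
    once R exceeds the threshold R_thr, cohesion and friction angle decay
    at rates proportional to R, floored at their residual values."""
    R = stress_coefficient(sig1, sig3, c, phi)
    if R >= R_thr:
        c = max(c - omega_c * R * dt, c_res)
        phi = max(phi - omega_phi * R * dt, phi_res)
    return c, phi
```

In an actual FLAC3D model this update would read σ1 and σ3 from each zone's current stress state and write the floored c and φ back into the zone's material properties before the next rheological period is computed.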
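Step (3) of the estimation procedure summarized in the conclusions, the genetic search over the rheological-parameter space guided by the trained network, can be sketched as a simple real-coded GA. The trained BN is abstracted here as a generic callable surrogate; the population size, operator choices and all names are illustrative assumptions rather than the configuration used in the paper.

```python
import random

def genetic_search(surrogate, monitored, bounds,
                   pop_size=80, generations=60,
                   crossover_rate=0.8, mutation_rate=0.05, seed=0):
    """Minimal real-coded GA: search for the parameter set whose
    surrogate-predicted displacements best match the monitored ones."""
    rng = random.Random(seed)

    def misfit(params):
        # Sum-of-squares error between predicted and monitored displacements.
        pred = surrogate(params)
        return sum((p - m) ** 2 for p, m in zip(pred, monitored))

    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=misfit)
        next_pop = pop[:2]                        # elitism: keep the two best
        while len(next_pop) < pop_size:
            # tournament selection of two parents
            p1 = min(rng.sample(pop, 3), key=misfit)
            p2 = min(rng.sample(pop, 3), key=misfit)
            child = list(p1)
            if rng.random() < crossover_rate:     # arithmetic crossover
                a = rng.random()
                child = [a * x + (1 - a) * y for x, y in zip(p1, p2)]
            for i, (lo, hi) in enumerate(bounds):  # uniform mutation
                if rng.random() < mutation_rate:
                    child[i] = rng.uniform(lo, hi)
            next_pop.append(child)
        pop = next_pop
    return min(pop, key=misfit)
```

Because the surrogate is cheap to evaluate compared with a full FLAC3D run, the GA can afford thousands of fitness evaluations; the winning parameter set is then revalidated by the full numerical simulation, as in step (4).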