Professional Documents
Culture Documents
ScienceDirect
Article history: According to groundwater level monitoring data of Shuping landslide in the Three Gorges
Received 23 June 2016 Reservoir area, based on the response relationship between influential factors such as
Accepted 5 July 2016 rainfall and reservoir level and the change of groundwater level, the influential factors of
Available online 23 August 2016 groundwater level were selected. Then the classification and regression tree (CART) model
was constructed by the subset and used to predict the groundwater level. Through the
Keywords: verification, the predictive results of the test sample were consistent with the actually
Landslide measured values, and the mean absolute error and relative error is 0.28 m and 1.15%
Groundwater level respectively. To compare the support vector machine (SVM) model constructed using the
Prediction same set of factors, the mean absolute error and relative error of predicted results is 1.53 m
Classification and regression tree and 6.11% respectively. It is indicated that CART model has not only better fitting and
Three Gorges Reservoir area generalization ability, but also strong advantages in the analysis of landslide groundwater
dynamic characteristics and the screening of important variables. It is an effective method
for prediction of ground water level in landslides.
© 2016, Institute of Seismology, China Earthquake Administration, etc. Production and
hosting by Elsevier B.V. on behalf of KeAi Communications Co., Ltd. This is an open access
article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
How to cite this article: Zhao Y, et al., Groundwater level prediction of landslide based on classification and regression tree,
Geodesy and Geodynamics (2016), 7, 348e355, http://dx.doi.org/10.1016/j.geog.2016.07.005.
* Corresponding author. Key Laboratory of Earthquake Geodesy, Institute of Seismology, China Earthquake Administration, Wuhan
430071, China.
E-mail address: yannan-1985@163.com (Y. Zhao).
Peer review under responsibility of Institute of Seismology, China Earthquake Administration.
http://dx.doi.org/10.1016/j.geog.2016.07.005
1674-9847/© 2016, Institute of Seismology, China Earthquake Administration, etc. Production and hosting by Elsevier B.V. on behalf of KeAi
Communications Co., Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
g e o d e s y a n d g e o d y n a m i c s 2 0 1 6 , v o l 7 n o 5 , 3 4 8 e3 5 5 349
the dam accident damage caused by the groundwater flow [2]. (classification). The classic CART algorithm was popularized
The Three Gorges Dam begins to store high water level in by Breiman et al. [14,15].
October each year and to flood the low water level at next In the case of a categorical variable, the number of possible
flood season, and the reservoir level since 2009 started to keep splits increases quickly with the number of levels of the cat-
145 me175 m fluctuations. The significant and cyclical egorical variable. Thus, it is useful to tell the software the
fluctuations of water level coupled with the impact of heavy maximum number of levels for each categorical variable. In
rainfall produce dramatic changes in the groundwater power choosing the best splitter, the program seeks to maximize the
field of landslide area, thereby threatening a large number of average “purity” of the two child nodes. In CART, Gini coeffi-
landslides stability located in both sides of reservoir [3]. cient is used to measure the “purity”, and its mathematical
Therefore, the dynamic prediction of groundwater level plays definition is:
a key role in the evaluation of landslide stability.
The groundwater level in the landslide has a complex X
k
GðtÞ ¼ 1 p2 ðjjtÞ (1)
nonlinear relationship between the natural and anthropo- j¼1
2. Model theory
2.2. CART pruning
2.1. CART building
Due to the complete decision tree on the feature of training
CART, a recursive partitioning method, builds classifica- samples is described too accurate, it loses general represen-
tion and regression trees for predicting continuous dependent tation and cannot be used in the classification or prediction of
variables (regression) and categorical predictor variables new data. This phenomenon is called over-fitting. Pruning is a
350 g e o d e s y a n d g e o d y n a m i c s 2 0 1 6 , v o l 7 n o 5 , 3 4 8 e3 5 5
technique to overcome this problem, and it can increase the consisted of mudstone, siltstone and marlstone in Badong
accuracy of the decision tree by 25% [16]. formation of Triassic (Fig. 1). Upper part of the landslide is
CART uses the minimum cost complexity pruning method mainly composed of broken stone, while the lower part of it
(MCCP) as pruning algorithm, which is a trim and inspection is mainly composed of silty clay. East side of Shuping
process. In the process of pruning, we calculate the prediction landslide is accumulative-formation slope and the rock
accuracy of the current decision tree for the test sample, contact zone. The main ingredients for the slip soil are silty
finally get an optimal tree with the balance of complexity and clay and gravel breccia. The west side of the landslide
error rate. This method relies on a complexity parameter, contains two layers of sliding zone. Shallow sliding zone is
denoted as a [17], which is gradually increased during the located in the diluvial layer consisted of breccia soil (Fig. 2).
pruning process. The error of decision tree is regarded as the There are two kinds of groundwater in the landslide area:
cost, and the complexity of the decision tree T can be (1) Pore water of loose debris, loose aquifer medium for qua-
expressed as: ternary colluvial deposits, with weakemedium permeability
and weak water yield property; (2) Bedrock fissure water,
Ra ðTÞ ¼ RðTÞ þ a~T (5) aquifer medium is mainly argillaceous siltstone and brown
where R(T), ~T and a are classification error on testing sample gray marl, with weak permeability and weak water yield
set, the number of leaf nodes, and the complexity for each property. Groundwater in the slide body mainly is stagnant
increasing of the leaf node, respectively. water. The level and flow with great dynamic change influ-
enced by season obviously. Main source of groundwater is at-
mospheric rainfall and the three gorges reservoir, and
3. Analysis of landslide groundwater level eventually drains into the Yangtze River in the form of springs.
The water quantity of the atmospheric precipitation and
characteristics
surface water seeping into the underground is little. When in
the case of a long-time rain, part of the groundwater may
Shuping landslide belongs to an ancient landslide accu-
mulation body, and distributes from north to south. The infiltrate into deep sliding body. A small amount of ground-
terrain slope is about 20 e30 . The development of Shuping water will be concentrated near the sliding zone and soften the
landslide is in the south wing of Shaxi Town anticline, sliding zone.
Fig. 2 e The cross-section of Shuping landslide (T2b is the Badong formation of Triassic).
Groundwater of landslide body keep close hydraulic the raise of reservoir water level from 135.2 m to 153.8 m in
connection between the reservoir water. The runoff intensity October 2006 and from 145 m to 154 m in October 2007 caused
is related to the range of the reservoir water level. When the a sudden jump of groundwater level. Compared to the
permeability of the landslide is poor and the water level of reservoir water level, the rainfall impacted on groundwater
reservoir with sudden drawdown, groundwater will discharge level was small in the front of the landslide, because there
to the reservoir, resulting in rapid growth of the hydrostatic were three seasonal gullies distributed on the landslide. The
pressure at the slope foot, easy leading to the instability of the monitoring sites just distributed on the vicinity of the eastern
landslide. Therefore, dynamic prediction of groundwater level gully, which provided a surface water drainage channel. So
based on the analysis of relationship between groundwater relatively little rainfall became into groundwater. When the
change and the factors such as rainfall and water level reservoir water level changed slowly with large rainfall, the
response is significant. groundwater would uplift correspondingly. Such as in July
In this paper, the groundwater level data of the monitoring and August 2005, the reservoir water level fluctuation was
point SZK1 in the front of Shuping landslide was analyzed, and slow, but two monthly rainfall reached 360.1 mm, causing the
the curves of groundwater depth, water level and rainfall were groundwater certain uplift.
shown in Fig. 3 (the value of groundwater depth was negative). The above analysis shows that the front of groundwater
In the figure, the fluctuation of groundwater level and the dynamic changes of Shuping landslide affected by the com-
reservoir water level was consistent basically, when the bined effects of reservoir water level and rainfall. There is a
reservoir water level raised and dropped, the groundwater complex nonlinear response relationship between ground-
level would uplift and lower correspondingly. For example, water level and influencing factors.
Fig. 3 e Curves of ground water level, rainfall and reservoir water level.
352 g e o d e s y a n d g e o d y n a m i c s 2 0 1 6 , v o l 7 n o 5 , 3 4 8 e3 5 5
pruning with the higher prediction accuracy has a strong factors (c1,c2), the inertia weight (w) and the maximum num-
generalization ability, its average absolute error and relative ber of iterations were 2, 20 and 200. Then, the penalty factor
error respectively are 0.28 m and 1.15%. parameter (C) and the RBF kernel function parameter (g) of the
In order to verify the applicability of CART model, the PSO- support vector machine (SVM) were searched, which were
SVR [18,19] model was built with the same data set to predict 60.916 and 0.094. Finally, the best model parameters were
landslide groundwater depth. First, the initial parameters of used to learn from the training samples and to build a
particle swarm optimization (PSO) contained the learning nonlinear response model for landslide groundwater depth
Fig. 5 e The optimal pruning tree for groundwater prediction of Shuping landslide.
354 g e o d e s y a n d g e o d y n a m i c s 2 0 1 6 , v o l 7 n o 5 , 3 4 8 e3 5 5
Fig. 6 e Measured values and predicted values of groundwater depth of Shuping landslide.
and influence factors. The model fitting and prediction value the PSO-SVR model, CART model with better fit and predictive
and error for training sample and test data are shown in Fig. 6 ability of generalization can be used to predict landslide
and Table 2. By comparison, fitting and prediction error of groundwater level.
PSO-SVR model is the largest in the three models, mainly
because the factor set used to build SVM model has strong
coupling and information redundancy, which interfere the
Acknowledgments
model prediction strategy and reduce the ability of the
model fitting and generalization.
Thanks for the data supporting of Three Gorges Reservoir
Area geological disaster prevention and control headquarters.
This work is supported by the China Earthquake Administra-
tion, Institute of Seismology Foundation (IS201526246).
5. Conclusions
[5] Jiang ZM, Xu WY, Zhang XM. Prediction of groundwater level [15] Hand D, Mannila H, Smyth P. Principles of data ming.
in landslides based on radial basis function. Chin J Rock Massachusetts Institute of Technology; 2001. p. 327e63.
Mech Eng 2003;22(9):1500e4. [16] Mingers J. An empirical comparison of pruning methods for
[6] Liu YJ. The establishment and application of dynamic decision tree induction. Mach Learn 1989;4:227e43.
prediction model of groundwater level based on intelligent [17] Esposito F, Malerba D, Semeraro G. A comparative analysis of
algorithm. Hydrogeology Eng Geol 2004;3:55e7. methods for pruning decision trees. IEEE Trans Pattern Anal
[7] Peng L, Niu RQ, Ye RQ. Prediction of ground water level in Mach Intell 1997;19(5):476e91.
landslides based on genetic-support vector machine. J [18] Peng L, Niu RQ, Zhao YN. Prediction of landslide
Central South Univ Sci Technol 2012;43(12):4788e95. displacement based on KPCA and PSO-SVR. Geomat Inf Sci
[8] Sangrey DA, Harrop-williams KO, Klaiber JA. Predicting Wuhan Univ 2013;38(2). 148e152þ161.
ground-water response to precipitation. J Geotechnical Eng [19] Zhao YN, Peng L, Niu RQ, Cheng WM. Prediction of landslide
1984;110(7):957e75. deformation based on rough sets and particle swarm
[9] Wu JL, Yang SX, Liu CS. Parameter selection for support optimization-support vector machine. J Central South Univ
vector machines based on genetic algorithms to short-term Sci Technol 2015;46(6):2324e32.
power load forecasting. J Cent South Univ Sci Technol
2009;40(1):180e4.
[10] Wang JL, Wu JS, Sun JS. Application of support vector
Yannan Zhao, an assistant researcher at
machine method in prediction of groundwater level. J
the Institute of Seismology, China Earth-
Hydraulic Eng 2003;34(5):122e8.
quake Administration. She graduated from
[11] Qi CS, Yu ZH, Hou Z. Study on quality prediction of the
China University of Geosciences (Wuhan)
complex production based on CART algorithm. Modul Mach
and obtained a Doctor's degree in 2015. She
Tool Automatic Manuf Tech 2010;3:94e7.
is mainly engaged in research on geological
[12] Zhang LB, Zhang QQ, Xu F. Research and application of the
disaster prediction and reservoir-induced
statistical models based on CART. J Zhejiang Univ Technol
seismicity in the Three Gorges area.
2003;30(4):315e8.
[13] Zhang HJ, Zhang PX, Chen JH. On-line quality inspection of
spot welding based on classification and regression tree
(CART). J Lanzhou Univ Technol 2005;31(4):10e4.
[14] Breiman L, Friedman J, Olshen R. Classification and
regression trees. California: Wadsworth Belement; 1984.