
Paper No: SC 12 Smart Computing

Novel machine learning ensemble approach for landslide prediction

C. N. Madawala
Department of Computing and Information Systems, Sabaragamuwa University, Sri Lanka
cnmadawala@std.appsc.sab.ac.lk

B. T. G. S. Kumara
Department of Computing and Information Systems, Sabaragamuwa University, Sri Lanka
btgsk2000@gmail.com

L. Indrathilaka
National Building Research Organization, Sri Lanka
laksiriindrathilaka@gmail.com

Abstract

Haphazard development on mountain slopes and inadequate attention to construction practice have increased the number of landslides and the resulting damage to lives and infrastructure. Of the nearly 3275 sq.km covered by the Ratnapura District, about 2178 sq.km appears to be highly prone to landslides and mass wasting. Landslides occur in many parts of the district, and nearly 90 deaths were reported by the National Building Research Organization (NBRO) in 2017. Most landslides or potential failures could be predicted fairly accurately if proper investigations were performed in time. The primary objective of this study is landslide-hazard mapping and risk evaluation to determine the real extent, timing, and severity of landslide processes in the Ratnapura District. Such knowledge will be of most benefit to government officials, consulting engineering firms, and the general public in avoiding the landslide hazard or mitigating the losses. Hybrid machine learning techniques can be used to develop prediction models from existing data. In this study, a Support Vector Machine (SVM) and a Naïve Bayes model were combined in an ensemble approach and implemented for the final prediction. The causative factors (slope, land use, elevation, geology, and soil materials) and the triggering factor (rainfall) were extracted and applied to the machine learning algorithms. This research introduces a novel architecture to produce a more relevant and accurate prediction of landslide vulnerability within the study area. Moreover, all of the factors were found to have a relatively positive relationship with the occurrence of landslides. An improvement in hazard monitoring, accuracy of early warning, and disaster mitigation is documented.

Keywords: Landslide, Support Vector Machine (SVM), Naïve Bayes, Hydrological, Rainfall

1. Introduction

Landslides are geological events involving widespread ground movement that causes serious damage to people and their belongings. Fundamentally, a landslide occurs when part of a natural slope can no longer bear its own weight. Gravity is the fundamental driving force of a landslide, and the resulting debris flow depends on the slope of the area. A landslide happens when the slope changes from a stable to an unstable state. In recent decades, landslide frequency has increased considerably, in line with climatic changes, improper land use, and the expansion of urbanized areas around the world. Thus, landslide detection is a crucial requirement in pre-disaster and post-disaster hazard analysis.

Recognizing landslide susceptibility is important for taking preventive and control actions and for issuing early warnings that reduce or mitigate hazard impacts. Most developed countries apply the latest tools for landslide prediction, but Sri Lanka, as an underdeveloped country, cannot meet the expense of such technologies. It is therefore necessary to carry out research on landslide prediction so that early warnings can be given and lives can be saved.

Machine learning is a sophisticated fusion of applied mathematics and computational intelligence. It focuses on 'training' an algorithm to probe for and learn from structure in data that is robust enough to make predictions, even without prior knowledge of that structure. From a machine learning perspective, the majority of studies discuss aspects of Artificial Neural Networks (ANN), Support Vector Machines (SVM), decision trees, classification and clustering, Bayesian methods, WebGIS, and general data mining techniques.

Even though these models have been applied successfully and efficiently in landslide susceptibility assessment, no model is perfect, and improvements are needed to achieve desirable results. The performance of landslide models can be enhanced by using feature selection and ensembles. Ensemble frameworks combine multiple classifiers to improve the performance of

Smart Computing and Systems Engineering, 2019


Department of Industrial Management, Faculty of Science, University of Kelaniya, Sri Lanka. 78
individual classifiers, based on the characteristics of their diversity [1].

Ensemble frameworks emerged in the 1990s but have received significant attention from researchers only in recent years. Techniques such as Bagging, AdaBoost, Naïve Bayes, and SVM have been applied efficiently to improve the performance of individual classifiers on different problems. Therefore, the main objective of the present study is to attempt a novel classifier ensemble data mining approach for landslide susceptibility assessment in the Ratnapura District, Sri Lanka. This method is a combination of a Naive Bayes classifier and an SVM ensemble [2].

Recently, many algorithms have been proposed for landslide prediction. In landslide studies, machine learning algorithms are thought to be more accurate than conventional statistical techniques. Ensemble methods, in turn, are known as suitable machine learning techniques for combining statistical methods to achieve better landslide prediction.

The formation and occurrence of landslides is a complicated evolutionary process caused by the interaction of multiple instability factors. However, most methods consider only the current value of the instability factors and ignore how the factors evolve over time. This study proposes an architecture based on an ensemble of machine learning algorithms, namely SVM and a Naïve Bayes model. A variety of spatial data, including landslides, geology, topography, slope, soil, and land cover, together with the triggering factor rainfall, were identified and collected in the study areas.

2. Research problem

The detection of natural disasters is a significant and non-trivial problem. The traditional method relies on dedicated physical sensors to detect specific disasters. With the advances in information and communication technologies (ICT) and machine learning, it is critical to improve the efficiency and accuracy of disaster management systems through modern data processing techniques. These help decision-makers understand near-real-time possibilities during an event.

If the data related to a landslide can be collected and then analyzed using machine learning techniques, it may provide valuable insights into disaster management. Furthermore, a prediction model could be embedded in a user-friendly and efficient computer program that any ordinary person living in a landslide-prone area could use to answer the question "Am I safe in my current location, given the current geological and weather conditions?" from data on the current situation, rather than waiting blindly until the National Building Research Organization (NBRO) issues disaster warnings [3].

In the past, disaster information extraction and prediction were mainly based on manual visual interpretation. Apart from being time-consuming and strenuous, the traditional method is also limited in that the measurement process lacks accuracy and depends heavily on experts' experience [4]. With the development of computer vision and pattern recognition technologies, it is possible to automate hazard assessment. The other type of monitoring method is to embed different kinds of sensors related to slope, rainfall, water table level, and other factors into the landslide and sense the dynamic change of signals. Wireless sensor networks are therefore being used to achieve large-scale data collection and transmission.

During recent decades, a number of different methods have been proposed for landslide modelling, including heuristic, deterministic, and probabilistic methods [5], but these methods have some limitations [2]. Recent approaches to landslide modelling show that the prediction of landslide susceptibility could be enhanced with the use of hybrid machine learning techniques. Therefore, new hybrid machine learning methods for landslide susceptibility modelling should be explored further.

The motivation for this research is to use an ensemble approach to propose a suitable landslide prediction architecture and, at the same time, to identify expected locations where landslides would cause fatalities, damage, or disruption to existing standards of safety in the Ratnapura District, Sri Lanka. Hence, the study helps to predict the most important triggering factors of a landslide and the changes that can be expected in the activity of massive landslides in the future under the impact of environmental changes.

3. Literature review

Regarding the effect of hydrological characteristics on slope failures in different locations around the world, major landslide disasters triggered by rainfall are reported every year in different countries [6].

The objective of landslide-hazard mapping and risk evaluation is to determine the real extent, timing, and severity of landslide processes in selected high-priority areas of Sri Lanka, where such knowledge will be of most benefit to government officials, consulting engineering firms, and the general public in avoiding the landslide hazard or mitigating the losses.

More specifically, hydrological triggering is generally known as one of the major natural mechanisms by which landslides begin. Hydrological triggering can be defined as a reduction in shear strength due to an increase in pore-water pressure on the potential failure surface, which eventually causes the slope to fail [7]. In Sri Lanka, landslides are mostly triggered by massive and
continued rainfall. During the last few decades, landslides have occurred with increasing frequency and intensity. Intense rainfall of shorter duration could trigger more destructive landslides [3].

Currently in Sri Lanka, the NBRO and many researchers (e.g. Karunanayake, K.B.A.A.M.; Wijayanayake, W.M.J.I. et al.) issue landslide early warnings based on historical data collected using contours, the land use map, the overburden map, and the landslide map [8]. Among the predictive machine learning techniques, the decision tree algorithm and the neural network technique are used to develop prediction models [9]. However, the literature reviewed above shows that SVM is widely accepted to be an effective and robust method for landslide modelling compared to the other methods and techniques mentioned [10]. Therefore, this study aims to fill this gap by developing an SVM model and evaluating its performance for the prediction of landslides in the Sabaragamuwa Province of Sri Lanka.

From a machine learning perspective, the majority of studies discuss aspects of ANN, SVM, Decision Trees (DT), classification and clustering, Naïve Bayes, WebGIS, and general data mining techniques.

During recent decades, several different methods have been proposed for landslide modelling, including heuristic, deterministic (engineering approach), and probabilistic (non-deterministic or data-driven) methods. Because of the limitations of these methods, new hybrid machine learning methods for landslide susceptibility modelling should be explored further.

In the present study, this gap in the literature is partially filled by proposing a new hybrid machine learning approach for landslide susceptibility modelling, with a study of the Ratnapura District, Sri Lanka. The proposed approach relies on an integration of the Support Vector Machine and Naïve Bayes.

4. Objectives

Natural hazards are beyond human control, but their destruction can be reduced if prediction mechanisms are put in place in advance. Researchers worldwide are working at a great pace to develop early prediction mechanisms for such natural hazards, which are hard to analyze with traditional mathematical methods.

This paper aims to present an architecture that uses an ensemble approach, relying on an integration of SVM and Naïve Bayes, which possesses a strong capability to predict landslides and analyze climate variability (rainfall and temperature).

5. Data and methodology / materials and methods

5.1. Study area

The Ratnapura District covers nearly 3275 sq.km, of which 1097 sq.km is forest, and 2178 sq.km appears to be highly prone to landslides. Of the 575 GM divisional areas, 473 are reported as landslide-prone. Severe landslides occurred in Eheliyagoda, Nivithigala, Ayagama, Kalawana, Dolapallehena, Kiribathgala, Alupathgala, Hortonwatta, and Girapagama. Approximately 14 landslide-prone AGM divisional areas and 3 non-landslide-prone AGM divisional areas (Embilipitiya, Godakawela, and Kolonna) lie around the Ratnapura District, Sabaragamuwa Province, Sri Lanka [11].

Figure 1. The study areas, the Ratnapura District, Sabaragamuwa Province.

Figure 2. (a) Kiribathgala, in Ratnapura district, Sri Lanka, Monday, May 29, 2017, (b) Ratnapura, Kalawana and Ayagama areas, (c) Colombo Hatton main road due to rock falls.
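As a quick consistency check on the area figures quoted in Section 5.1, the landslide-prone area equals the district area minus the forest area, and the division counts imply that roughly four in five divisions are landslide-prone. Reading the 2178 sq.km as the non-forest remainder is an interpretation of the stated numbers, not something the data sources confirm.

```latex
\begin{align*}
A_{\text{prone}} &= A_{\text{district}} - A_{\text{forest}} = 3275 - 1097 = 2178~\text{sq.km} \\
\frac{A_{\text{prone}}}{A_{\text{district}}} &= \frac{2178}{3275} \approx 0.665 \\
\frac{N_{\text{prone}}}{N_{\text{total}}} &= \frac{473}{575} \approx 0.823
\end{align*}
```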

5.2. Materials

5.2.1. Required data

To determine the tributary characteristics of the study area, a) digital topographic maps of the study area (1:10000) and b) a digital elevation model (DEM) with satellite images of specific areas were obtained.

Table 1. Data used and sources.

Data used                     Source
Rainfall                      Meteorology Department
Soil materials                Irrigation Department
Geology                       Survey Department
Land use                      LUPP Department
Landslide and non-landslide   NBRO
Soil texture                  Survey Department
Distance from road, river     NBRO
Influence of constructions    NBRO

To analyze climate variability (rainfall and temperature), monthly rainfall and temperature data for the recent five years (2012-2017) were used.

To study the slope aspect and angle within the landslide area, the National Building Research Organization (NBRO) Hazard Zonation Map was consulted.

5.2.2. Software tools and models
■ ArcGIS 10.4
■ Digital Elevation Model (DEM)

5.2.3. Programming Language
■ Java

5.3. Methodology of ensemble model

The fundamental data required for producing the landslide susceptibility model of the study area was obtained from Standard Topographic Maps at a scale of 1/50000. Six major contributory factors can be identified for landslide events: the bedrock geology (with its degree of weathering and the nature and intensity of defects), slope angle, overburden soil cover, landform, drainage pattern, and land use pattern. The critical triggering factor for a landslide, however, is high-intensity rainfall.

5.3.1. Naïve Bayes Classifier

The Naive Bayes classifier is one of the simplest soft computing methods and is based on Bayesian theory and the maximum posterior hypothesis [12]. It uses the statistical assumption that all values of numeric attributes are independent and normally distributed within each class. The Naive Bayes classifier has been applied effectively in many fields, such as medical diagnosis and management. However, its application to landslide problems is still limited [13].

5.3.2. Support Vector Machine

The Support Vector Machine is primarily a classifier method that performs classification tasks by constructing hyperplanes in a multi-dimensional space that separate cases of different class labels. SVM supports both regression and classification tasks and can handle multiple continuous and categorical variables [14].

5.3.3. The novel classifier ensemble model

In this study, the novel ensemble classifier model was generated by combining the Naive Bayes classifier and an SVM ensemble. The SVM ensemble was first applied to create the training subsets [15]. After that, the Naive Bayes classifier was used to construct base classifiers from these subsets for classification. The methodological flow chart of the novel classifier ensemble model is shown in Figure 3.

Figure 3. Methodological flow chart of the novel classifier ensemble model.

The advantage of the novel classifier ensemble model is that the training subsets are optimized using the SVM ensemble, and these training subsets are then used to train a Naive Bayes base classifier. Therefore, the
novel classifier ensemble model could improve the predictive capability of a Naïve Bayes base classifier.

5.4. A methodology for reference architecture

Based on the classification results, an architecture is proposed that is meant to possess a strong capability to predict landslides from the factors in the landslide dataset using machine learning.

Figure 4. Reference architecture of prediction model.

6. Results and discussion

Natural disasters are beyond human control, but their destruction can be reduced if prediction mechanisms are put in place in advance. Developing applications of landslide prediction models remains an active area of exploration for researchers worldwide, and many aspects still need further research. The SVM was applied, and the results were used to produce the landslide prediction model of the study areas.

Figure 5. Landslide susceptibility map, Ratnapura District. (Map title: Landslide Locations in Ratnapura District, Sri Lanka.)

Furthermore, the researchers aim to separate instances belonging to two classes: (i) areas where landslides occurred or are likely to happen in the future; and (ii) areas where landslides did not occur or are not expected to occur in the future. The index was classified by area to give a simple visual interpretation of where landslides are most likely to occur. The proposed architecture is meant to possess a strong capability to predict landslides from the factors in the landslide dataset using machine learning.

6.1. Model performance and validation

The performance of the novel classifier ensemble model was evaluated with three main ensemble learning techniques.

6.1.1. Ensemble learning techniques

Bagging
Bagging trains similar learners on small sample populations and then takes the mean of all the predictions.

Figure 6. Bagging process, to reduce variance error.

Boosting
This technique adjusts the weight of an observation based on the last combined classification. If an observation was classified incorrectly, the technique increases the weight of that observation.

Figure 7. Boosting process, to decrease the bias error.
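The bagging and boosting steps described above can be sketched in Java, the language listed in Section 5.2.3. This is a minimal illustration under stated assumptions, not the implementation used in the study: the class name `EnsembleSketch` and both method names are hypothetical, and the boosting update follows the standard AdaBoost rule, which the paper does not specify.

```java
import java.util.Arrays;

/** Illustrative helpers only; class and method names are hypothetical. */
final class EnsembleSketch {

    private EnsembleSketch() { }

    /** Bagging: combine the learners by taking the mean of their predictions. */
    static double baggingMean(double[] learnerPredictions) {
        return Arrays.stream(learnerPredictions).average().orElse(Double.NaN);
    }

    /**
     * Boosting, AdaBoost-style (an assumed update rule): increase the weight of
     * misclassified observations, decrease the rest, then renormalise.
     * Expects the input weights to sum to 1.
     */
    static double[] reweight(double[] weights, boolean[] misclassified) {
        // Weighted error of the last classification.
        double error = 0.0;
        for (int i = 0; i < weights.length; i++) {
            if (misclassified[i]) {
                error += weights[i];
            }
        }
        // Clamp so the logarithm below stays finite.
        error = Math.min(Math.max(error, 1e-10), 1.0 - 1e-10);
        double alpha = 0.5 * Math.log((1.0 - error) / error);

        double[] updated = new double[weights.length];
        double sum = 0.0;
        for (int i = 0; i < weights.length; i++) {
            updated[i] = weights[i] * Math.exp(misclassified[i] ? alpha : -alpha);
            sum += updated[i];
        }
        // Renormalise so the new weights again sum to 1.
        for (int i = 0; i < updated.length; i++) {
            updated[i] /= sum;
        }
        return updated;
    }
}
```

With four equally weighted observations of which one is misclassified, `reweight` raises that observation's weight from 0.25 to 0.5 and lowers each of the others to about 0.167, so the next learner concentrates on the case that was missed.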

Stacking
Here a learner is used to combine the output of different learners. This leads to a decrease in either bias or variance error, depending on the combining learner used.

Figure 8. Stacking process, to decrease either bias or variance error depending on the combining learner.

An ensemble approach generally produces more accurate solutions than a single model would. When predicting the target variable with any machine learning ensemble technique, the leading causes of the difference between actual and predicted values are noise, variance, and bias. The ensemble approach helps to reduce these factors, except noise, which is an irreducible error.

A landslide prediction model should be able to make an adequate prediction of possible landslide areas, so as to improve hazard monitoring, the accuracy of early warning, and disaster mitigation by collecting data on the triggering factors of the disaster. The data collected for this research may provide a suitable database for hazard forecasting and future studies.

7. Conclusion

Landslide susceptibility assessment has been carried out in the Ratnapura District, Sri Lanka, using the novel ensemble classifier model, which is a combination of a Naive Bayes classifier and SVM. Naive Bayes is an effective classifier. The analysis results show that the novel classifier ensemble model has the best degree of fit for landslide susceptibility assessment compared to other models. This study identified factors that may be involved in landslides, as well as results and methods that can be used for landslide prediction in other regions beyond the study areas. Landslide prediction models can help implement a guide for planning the mass evacuation of residents in the case of a landslide, and can also help prevent or reduce the disruptive impacts of a natural disaster on surrounding communities.

Therefore, the present study proposed an architecture for predicting landslide occurrences in the areas susceptible to these phenomena in the Ratnapura District, where landslides are expected to continue to strike in the future.

Moreover, a landslide warning system may soon be established by forecasting landslides induced by rainfall. As a recommendation, conducting programs in hilly or mountainous areas on the control of run-off and erosion through water and soil conservation can be extremely beneficial in controlling the hazards.

8. Acknowledgement

The authors acknowledge access to the Hazard Zonation Map of the Ratnapura District provided by the Sri Lanka National Building Research Organization (NBRO); the Director General, Department of Meteorology; and Dr. R. M. S. Bandara. Mr. A. L. K. Wijemanna, Director (Computer, Research, Climate Change, International Affairs), supported the collection of rainfall data, and Mrs. A. S. Illangamge, Director General, Department of Land Use & Policy Planning, helped to collect land use data. Sincere gratitude is also due to the Sabaragamuwa University of Sri Lanka for its encouragement.

9. References

[1] Pham, B. T., Bui, D. T., Dholakia, M. B., Prakash, I., Pham, H. V., Mehmood, K. and Le, H. Q., “A novel ensemble classifier of rotation forest and Naïve Bayer for landslide susceptibility assessment at the Luc Yen district, Yen Bai Province (Viet Nam) using GIS,” Geomatics, Nat. Hazards Risk, vol. 8, no. 2, pp. 649-671, 2017.
[2] Pham, B. T., Shirzadi, A., Tien Bui, D., Prakash, I. and Dholakia, M. B., “A hybrid machine learning ensemble approach based on a Radial Basis Function neural network and Rotation Forest for landslide susceptibility modeling: A case study in the Himalayan area, India,” Int. J. Sediment Res., vol. 33, no. 2, pp. 157-170, 2018.
[3] Government of Sri Lanka, World Bank, UN Sri Lanka, and Global Facility for Disaster Reduction and Recovery, Sri Lanka Rapid Post Disaster Needs Assessment: Floods and Landslides, May 2017, no. May. 2017.
[4] Kadavi, P. R., Lee, C. W. and Lee, S., “Application of ensemble-based machine learning models to landslide susceptibility mapping,” Remote Sens., vol. 10, no. 8, pp. 1-18, 2018.
[5] Gariano, S. L. and Guzzetti, F., “Landslides in a changing climate,” Earth-Science Rev., vol. 162, no. August 2016, pp. 227-252, 2016.
[6] Pham, B. T. and Prakash, I., “A novel hybrid model of Bagging-based Naïve Bayes Trees for landslide susceptibility assessment,” Bull. Eng. Geol. Environ., pp. 1-15, 2017.
[7] Terlien, M. T. J., “The determination of statistical and deterministic hydrological landslide-triggering thresholds,” Environ. Geol., vol. 35, no. 2-3, pp. 124-130, 1998.
[8] “Predicting landslides in hill country of Sri Lanka using data mining techniques,” pp. 2016, 2016.
[9] Lee, S., Lee, M.-J. and Jung, H.-S., “Data Mining Approaches for Landslide Susceptibility Mapping in Umyeonsan, Seoul, South Korea,” Appl. Sci., vol. 7, no. 7, p. 683, 2017.
[10] International Consortium on Landslides; University of Ljubljana; Geological Survey of Slovenia, Landslide Research and Risk Reduction for Advancing Culture of Living with Natural Hazards, Local Proceedings with Programme, no. June. 2017.
[11] Baba, H., “Introductory study on Disaster Risk Assessment and Area Business Continuity Planning in industry agglomerated areas in the ASEAN,” J. Integr. Disaster Risk Management, vol. 3, no. 2, pp. 184-195, 2013.
[12] Hellerstein, J. L., Jayram, T. S. and Rish, I., “Recognizing End-User Transactions in Performance Management,” New York, vol. 1, 2000.
[13] Zhou, C., Yin, K., Cao, Y. and Ahmed, B., “Application of time series analysis and PSO-SVM model in predicting the Bazimen landslide in the Three Gorges Reservoir, China,” Eng. Geol., vol. 204, no. November 2017, pp. 108-120, 2016.
[14] Korup, O. and Stolle, A., “Landslide prediction from machine learning,” Geol. Today, vol. 30, no. 1, pp. 26-33, 2014.
[15] Goetz, J. N., Brenning, A., Petschko, H. and Leopold, P., “Evaluating machine learning and statistical prediction techniques for landslide susceptibility modeling,” Comput. Geosci., vol. 81, pp. 1-11, 2015.
