Original Paper

Landslide detection by deep learning of non-nadiral and crowdsourced optical images

Filippo Catani

Landslides (2021) 18:1025–1044
DOI 10.1007/s10346-020-01513-4
Received: 9 March 2020
Accepted: 10 August 2020
Published online: 2 September 2020
© The Author(s) 2020

Abstract The recent development of mobile surveying platforms and crowdsourced geoinformation has produced a huge amount of non-validated data that are now available for research and application. In the field of risk analysis, with particular reference to landslide hazard, images generated by autonomous platforms (such as UAVs, ground-based acquisition systems, satellite sensors) and pictures obtained from web data mining are easily gathered and contribute to the fast surge in the amount of non-organized information that may engulf data storage facilities. Therefore, the high potential impact of such methods is severely reduced by the need for a massive amount of human intelligence tasks (HITs), which is necessary to filter and classify the data, whatever the final purpose. In this work, we present a new set of convolutional neural networks (CNNs) specifically designed for the automated recognition of landslides and mass movements in non-standard pictures that can be used in automated image classification, in supporting UAV autonomous guidance and in the filtering of data-mined information. Computer vision can be of great help in fostering the autonomous capability of intelligent systems to complement, or completely substitute, HITs. Image and object recognition are at the forefront of this research field. The deep learning procedure has been accomplished by applying transfer learning to some of the top-performer CNNs available in the literature. Results show that the deep learning machines, calibrated on a relevant dataset of validated images of landforms, may supply reliable predictions with computational time and resource requirements compatible with most of the UAV platforms and web data mining applications in landslide hazard studies. Average accuracy achieved by the proposed methods ranges between 87 and 90% and is consistently higher than that obtained by general-purpose state-of-the-art image recognition convolutional neural networks. The method can be applied to early warning, vulnerability assessment, residual risk estimation, model parameterisation and landslide mapping. Specific advantages will be the reduction of the present limitations in the intelligent guidance of landslide mapping drones, the classification of fake news, the validation of post-disaster information and the correct interpretation of an impending change in the environment.

Keywords Landforms · Computer vision · Automated object recognition · Data mining · UAV

Introduction
The science of natural hazards, including landslides, has lately been positively impacted by the quick growth of remote sensing and crowdsourced platforms such as satellites, UAVs, social networks, sensor networks and public online data storages. For example, the usage of air- and UAV-borne sensors has gained a notable relevance due to the concurrent effect of price-lowering and quality gain in rotors, structure materials, power systems, on-board computing power and sensors (Giordan et al. 2018; Rossi et al. 2018) and to the multiplication of UAV-based applications in most of the research and application fields of science, industry and defence (Lee et al. 2017). An even stronger increase has been observed in the availability of crowdsourced information generated by data mining web resources of various types, with a special relevance of geo-tagged unclassified and potentially useful images. In turn, this has generated an exponential surge in the amount of available data that contain large quantities of noisy and non-validated information. To be usable, such big data require automation and the support of machine learning methods for selection, classification and storage (Catani et al. 2013; Smith et al. 2017; Intrieri et al. 2017; Du et al. 2019).

For the specific case of image-related data, where the information content is carried by a multi-layered digital matrix of quantitative measures in n dimensions, computer vision methods may be of great help, because they are capable of mimicking simple and repetitive human decision tasks, if suitably trained.

With reference to the specific field of landslide hazard and risk assessment, the usage of unmanned platforms has recently become almost mandatory, due to the operational flexibility, high spatial resolution, low cost, quick capability of deployment and availability of a number of new sensors that were previously unavailable as payload on such small aircraft (Niethammer et al. 2012; Lucieer et al. 2014; Turner et al. 2015; Giordan et al. 2015; Giordan et al. 2018; Allasia et al. 2019).

On a quite different path, direct surveying is being increasingly complemented by the usage of data mining of web-related and crowdsourced information (Battistini et al. 2013; Battistini et al. 2017; Smith et al. 2017). This indirect approach provides an alternative way to explore the occurrence of hazards over large areas and backwards in time. It also allows for the collection of soft data such as damage estimation, impacts on population, reaction time during emergency and system resilience, which are fundamental for the calibration and validation of risk assessment models (Corominas et al. 2014; Uzielli et al. 2015a; Uzielli et al. 2015b).

Several text and semantic analysis methods exist that can be fruitfully used for the selection and classification of online news and automated positioning of events (Battistini et al. 2013; Smith et al. 2017) even though not much exists concerning the analysis of more complex data on landslides, such as photographs, multi-spectral images and multi-source web graphics.

Therefore, the exploitation of digital images is at the forefront of the research challenges and is being improved at a fast rate.

Most of the advantages of both crowdsourced and UAV-surveyed imagery derive from the ease of use and the short time required to gather historical, monitoring or mapping data (Bishop et al. 2012; Corominas et al. 2014; Chae et al. 2017; Giordan et al. 2018). However, when the sorting and classification of thousands, if not hundreds of thousands, of completely different images entail a repeated human intelligence task (HIT) or when an efficient drone-based survey requires a direct or indirect human control by an expert pilot, most of the advantages may be lost and strong limitations may be introduced due to many factors,
including time constraints, data formats, terrain configuration and logistics, thereby reducing applicability and extent of data collection. For such reasons, recent cutting-edge research is trying to perfect the computer vision proficiency in object recognition on the one side and the autonomous flying capabilities of drones to allow the execution of larger scale, all-terrain surveys, on the other side (Niesterowicz and Stepinski 2013; Lee et al. 2017).

The autonomous recognition and guidance capabilities of machines are challenging tasks that are being tackled by the research community in several ways (Minaeian et al. 2016; Lee et al. 2017). All of them entail, as a basic requirement, the capability of computer vision by the machine platform, for decision-making, obstacle avoidance, path adjustment and object detection.

Object detection, in particular, is a very important add-on to any autonomous system as a specific skill that supports intelligent decisions by helping the CPU in the interpretation of complex data extracted from the surrounding environment. Examples of such skills are the proficiency in object recognition from simple photographs, the autonomous extraction of flying information and the generation of additional smart data for optimizing survey operations or validating models.

In the field of landslide hazard, one of the main tasks devoted to drone systems is the quick survey of areas that are too large to be inspected with ground visits, yet require a detail-scale analysis. On the other hand, data mining systems can be used to collect large-scale information in real-time and back-analysis concerning cases of damage and risk assessment (Battistini et al. 2017). In both cases, the computer vision system should mostly concentrate on the capability of correctly classifying the terrain in terms of landforms, processes and effects due to the action of mass movements, while being at the same time capable of detecting the presence of elements at risk, such as buildings, structures and infrastructures. A specific challenge is linked to the fact that most web-sourced and UAV-generated imagery for landslide studies is non-nadiral and non-standard (Minaeian et al. 2016).

There are many studies reporting on effective and accurate methods to map landslides from optical and non-optical imagery. An important review work by Evans (2012) proposes a conceptual framework for the interpretation of landforms, which is a starting point for every automated analysis to tackle multi-scale issues. Further developing the idea of landform delineation, Jasiewicz and Stepinski (2013) propose the operational concept of the geomorphons, as the basic landscape unit to be classified with the help of pattern recognition methods. Later on, the accuracy requirements on landform measurement needed by specific geomorphic analysis have been classified and discussed by several authors (Tarolli 2014; Eltner et al. 2016).

On such a basis, a relevant literature exists covering landform recognition. Examples include methods based on classical pixel-based satellite image classification (Liu and Wu 2016), super-pixel segmentation (Li et al. 2018), object-based image analysis (Drăguţ and Blaschke 2006; Lu et al. 2011; Stumpf and Kerle 2011; Drăguţ and Eisank 2012; Hölbling et al. 2016), combination of multi-spectral measurements with DEM-derived landform attributes such as elevation, slope, topographic position, and contributing area with watershed delineation (Mondini et al. 2011; Forzieri et al. 2012; Forzieri et al. 2013; Ciampalini et al. 2016; Du et al. 2019). An overview on such studies is provided by Scaioni et al. (2014) and, more recently, by Giordan et al. (2018). Most of the studies agree on the fact that deep learning convolutional neural networks (CNNs) may be an optimal solution for highly flexible and powerful image classification and object recognition (Shin et al. 2016; Du et al. 2019). In general, artificial neural networks have long been successfully used to recognize specific landscape characters leading to slope instability (Lee et al. 2004; Catani et al. 2005; Ermini et al. 2005; Pradhan and Lee 2010; Yilmaz 2010; Liu and Wu 2016; Zhou et al. 2018a) or to detect anomalous displacements of rock and soil masses (Zhou et al. 2018b).

Almost all the published research concentrates on the post-processing analysis of multi-source data to apply pattern recognition and object-oriented methods for landform classification, with some of them specifically targeting mass movements. Only a few published works (Huang et al. 2011; Niesterowicz and Stepinski 2013; Lee et al. 2017), to the best of our knowledge, focus on the attempt of achieving real-time target detection for landforms or landscape scenes with computer vision. And no work at all proposes an operational method to give on-board detection capability to any intelligent system as related to mass movements.

In this paper, we propose a simple, computationally compatible deep learning classifier (LanDLC) trained for the detection and classification of specific landslide-related landforms in nadiral and non-nadiral images. The four versions of LanDLC presented in the following sections are based on the transfer learning of pre-trained general-purpose image classification convolutional neural networks that have been specifically modified towards landslide recognition.

All LanDLC versions can be fully implemented in a desktop data mining toolbox to complement existing automated context extraction and news classification applications (see, e.g. the systems described by Smith et al. (2017) and Battistini et al. (2013)). Furthermore, despite being in a prototypal stage for UAV on-board implementation, LanDLC may provide a contribution towards the objective of building self-aware drones capable of mapping landform instability and geo-hydrological hazard by independently flying over an area and targeting specific terrain features to be surveyed, positioned and stored in digital form.

Materials and Methods

Methodology
Image analysis and classification in the Earth sciences and in the broader field of remote sensing have a long and successful history that has now undergone a huge step forward due to the capability of computers to manage and process big data with artificial intelligence methods. When dealing in particular with image classification and object recognition, the highest performances, at the present state of the art, are those provided by deep learning tools, such as CNNs, that are capable of performing classification tasks directly from images rather than by using pre-selected features of them (Krizhevsky et al. 2012; He et al. 2015a; Shin et al. 2016). A CNN combines multiple nonlinear processing layers using simple elements working in parallel. The layers are interconnected by nodes and each layer uses the previous layer’s output as input. Differently from other machine learning systems, CNNs may autonomously extract features from images, use them in the learning process, select only the most useful of them (activations) and then implement a highly accurate object recognition machine, based on a set of training images (Russakovsky et al. 2015; Shin et al. 2016). However, the training of a deep CNN with tens or hundreds of layers over a large data set of images is a non-trivial task that

requires a huge computational effort preceded by a similarly large undertaking that is necessary for collecting and labelling hundreds of thousands, if not millions, of training images (Russakovsky et al. 2015). As an example, the general-purpose image classification CNN AlexNet (Krizhevsky et al. 2012), which is quite simple and has only eight learnable layers, uses 61 million parameters trained over several million labelled images. Luckily, such heavy duties have already been accomplished by the leading computer vision research groups for general-purpose image analysis and can be fruitfully exploited as a starting base for a much simpler process of specialized training called transfer learning (Shin et al. 2016).

Transfer learning consists of the specialized training of a subset of the deepest layers of a CNN that has already been trained for similar, but more general, classification purposes. An entire class of such public-domain CNNs exists offering various levels of flexibility, complexity and accuracy, depending on the user requirements. By picking one of such pre-trained, non-specialized networks, it is possible to substitute the deepest layers and retrain them to fit very specialized tasks such as the classification of landforms characterized by mass wasting and landslides. Because most of the classification capability of the network has already been obtained, transfer learning can be performed with a relatively small number of specialized images belonging to the target category. Furthermore, the usage of a general-purpose object recognition CNN strongly enhances the capability of detecting single objects set against a complex background which may include other landscape features such as trees, buildings, clouds, roads, people and animals.

In this paper, we selected four among the best performing CNN architectures for image recognition and object detection as related to the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) (Russakovsky et al. 2015) and tested them by transfer learning on a dataset of labelled landscape images containing verified landforms belonging to five categories (‘landslide’, ‘scree deposit’, ‘rock cliff’, ‘alluvial fan’ and ‘slope without mass movements’).

The choice of the five categories is based on the following reasons: landslides are the target object for the detection system we want to develop; scree deposits, alluvial fans and rock cliffs are typical landforms that can be erroneously classified as landslides and that, therefore, have to be discriminated from them; finally, ‘slope without mass movements’ is the label assigned to any image in the dataset where none of the previous categories is present, according to a careful expert-based selection process. Most of the selected ‘slope without mass movements’ images, however, purposefully contain objects that can be mistaken for slope processes, such as mid-slope roads, buildings, cultivated fields, retaining walls and rivers. This should contribute to a more effective training of the network and decrease the degree of overfitting (Zhou et al. 2016; Lee et al. 2017).

The four pre-trained CNNs tested in this work derive from the successors of the AlexNet architecture (Krizhevsky et al. 2012) and its derivations. All of them are on the Pareto frontier and Pareto-efficient in the domain accuracy versus prediction time (Fig. 1). Any set of non-dominated solutions, being chosen as optimal, can be defined as Pareto-efficient if no objective can be improved without sacrificing at least one other objective. On the other hand, a solution ζ* is referred to as dominated by another solution ζ if, and only if, ζ is equally good or better than ζ* with respect to all objectives. In such terms, the chosen CNN architectures are state-of-the-art at the present stage and excel in the combination of accuracy and computational efficiency.

Fig. 1 Position of some popular CNNs along the Pareto frontier (dashed line) in terms of accuracy vs prediction time with respect to the ImageNet database. The four
CNNs used in this paper are highlighted in bold red. Data from Mathworks (www.mathworks.com)
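
The notion of a Pareto frontier in the accuracy versus prediction time plane can be illustrated with a short sketch. It assumes the common convention that a solution dominates another when it is at least as good on every objective and strictly better on at least one; the network names and values below are hypothetical and are not read from Fig. 1.

```python
from typing import List, Tuple

# Each candidate is (name, accuracy, prediction_time); higher accuracy and
# lower prediction time are both preferred.
Candidate = Tuple[str, float, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is at least as good as b on both objectives and strictly better on one."""
    _, acc_a, time_a = a
    _, acc_b, time_b = b
    no_worse = acc_a >= acc_b and time_a <= time_b
    strictly_better = acc_a > acc_b or time_a < time_b
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Return the candidates that are not dominated by any other candidate."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Hypothetical values for illustration only (not taken from Fig. 1).
nets = [
    ("net_a", 0.70, 0.02),
    ("net_b", 0.75, 0.05),
    ("net_c", 0.72, 0.08),  # less accurate and slower than net_b
    ("net_d", 0.80, 0.20),
]
front = pareto_front(nets)
print([name for name, _, _ in front])
```

In this toy set only net_c is left off the frontier, because net_b is both more accurate and faster.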

Table 1 CNN architecture, characteristics and parameter domains used for the training tests in the optimization procedure. For each CNN, the number of learnable layers is shown together with the total number of layers and the number of parameters. The image input size is also reported

CNN name     Original architecture   Input resolution (pixels)   Learnable layers   Total layers   Parameter no.
Go-LanDLC    GoogLeNet               224 × 224                   22                 144            7.0 × 10^6
GP-LanDLC    GoogLeNet.Places365     224 × 224                   22                 144            7.0 × 10^6
Re-LanDLC    ResNet.101              224 × 224                   101                347            44.6 × 10^6
In-LanDLC    Inception.v3            299 × 299                   48                 315            23.9 × 10^6

Mini-batch size (−)          10, 20, 30
Initial learning rate (−)    1.0 × 10^−5, 5.0 × 10^−5, 1.0 × 10^−4, 5.0 × 10^−4, 1.0 × 10^−3, 5.0 × 10^−3, 1.0 × 10^−2, 5.0 × 10^−2
Momentum (−)                 0.0, 0.2, 0.4, 0.6, 0.8, 1.0
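
The parameter domains of Table 1 define a full-factorial search grid. As a minimal sketch (variable names are illustrative, and the actual optimization was run in Matlab rather than Python), the grid can be enumerated to confirm the 144 configurations per architecture and the 576 training runs per trial reported in the text.

```python
from itertools import product

# Parameter domains as listed in Table 1.
mini_batch_sizes = [10, 20, 30]
initial_learning_rates = [1e-5, 5e-5, 1e-4, 5e-4, 1e-3, 5e-3, 1e-2, 5e-2]
momenta = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
architectures = ["Go-LanDLC", "GP-LanDLC", "Re-LanDLC", "In-LanDLC"]

# Full-factorial grid: every combination of the three training options.
grid = list(product(mini_batch_sizes, initial_learning_rates, momenta))

# One training run per (architecture, configuration) pair.
runs = [(arch, *cfg) for arch in architectures for cfg in grid]

print(len(grid))  # configurations per architecture
print(len(runs))  # training runs per trial over all four architectures
```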

They are as follows: GoogLeNet, compact and fast with a large degree of flexibility and a good overall accuracy (Szegedy et al. 2015a); GoogLeNet-Places365 (Zhou et al. 2016), a modified version of GoogLeNet specifically oriented towards the classification of the scene rather than single objects; ResNet.101, a 101-layer CNN with improved training curve based on residual learning (He et al. 2015a); and Inception.v3, possibly the latest state-of-the-art open-source network for classification of multi-purpose images in near real time (Szegedy et al. 2015b). While the two GoogLeNet-derived CNNs are compact and fast, ResNet and Inception.v3 are more accurate at the expense of requiring more computing power and being less compact in terms of potential UAV and robot implementation. Architectures with potentially higher accuracy than Inception.v3 require a prediction time more than double (Fig. 1) and have not been considered in this study due to their low suitability to operational near-real-time applications.

The transfer learning was performed by removing the classification and SoftMax layers at the end of the network structure and the learnable layers (convolutional or fully connected) just before them from the pre-trained CNNs, then by replacing them with new layers specifically designed for landform classification in five classes, as previously specified (all network architectures and specifications are provided upon request under a CC-BY-NC 3.0 licence in ONNX format).

All transfer learning and training were done in the Matlab environment (Mathworks®). Since performances in terms of accuracy, flexibility and overfitting avoidance are linked not only to the network architecture but also to training options, we performed a multiple-parameter optimization procedure based on a combination of three training regulation variables: (i) mini-batch size, that is, the size of the subset of images used for each iteration; (ii) initial learning rate, that is, the scale of the search lag in the error minimization procedure; and (iii) momentum, that is, the adjustment factor for avoiding target misdetection in the search of function minima. The combination of all the considered parameter domains sums up to 144 different configurations for each of the four trained CNN architectures, for a total of 576 training runs for each trial. Table 1 lists the different values used for the parameter domains as well as the main characteristics of the network architectures.

At the end of the optimization cycle, the results were ranked by overall accuracy to select the best method and parameter set for the choice of the optimal CNN configuration which was thereafter compiled and executed against an external validation dataset made up of unlabelled images to simulate an actual operational application. No direct comparison in classification performances is possible between the modified CNNs here developed (LanDLC) and the original pre-trained networks (GoogLeNet, GoogLeNet.Places365, ResNet.101 and Inception.v3) because the latter do not include the 5 labels which are the target of the research (landslide, stable slope, rock cliff, alluvial fan and scree deposit). Therefore, we only compare average accuracy of original networks as reported in the literature, as visible in the Pareto frontier plot of Fig. 1, to the average accuracy obtained by the LanDLC networks.

The training and testing of all the configurations during optimization were performed on a multi-GPU platform (CUDA NVIDIA GeForce RTX 2070 with 36 processing cores) by using the Matlab Deep Learning toolbox (Mathworks) supported by the specific packages for the four CNNs chosen for the experiment (see Table 1). The best CNN obtained for each basic type has been validated against an independent data set and then saved for usage with external packages in ONNX format (https://onnx.ai) with the name highlighted in the first column of Table 1.
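
The roles of the initial learning rate and of the momentum can be illustrated on a toy problem. The sketch below assumes a plain gradient descent update with momentum; the optimizer actually used for training is not stated in this section, and the function names and the quadratic loss are hypothetical choices made only for illustration.

```python
def sgd_momentum(grad, w0, learning_rate, momentum, steps):
    """Minimise a 1-D function, given its gradient, by gradient descent with momentum."""
    w, velocity = w0, 0.0
    for _ in range(steps):
        # The velocity keeps a fraction (momentum) of the previous update
        # and adds a new step scaled by the learning rate.
        velocity = momentum * velocity - learning_rate * grad(w)
        w += velocity
    return w


def grad(w):
    # Gradient of the toy loss f(w) = w**2, whose minimum is at w = 0.
    return 2.0 * w


small_step = sgd_momentum(grad, w0=1.0, learning_rate=0.1, momentum=0.4, steps=100)
large_step = sgd_momentum(grad, w0=1.0, learning_rate=1.5, momentum=0.4, steps=100)

print(abs(small_step) < 1e-6)  # the small learning rate settles at the minimum
print(abs(large_step) > 1.0)   # the large learning rate moves away from it
```

On this toy loss an overly large learning rate makes the search diverge, which is one plausible way in which a poorly chosen training option can leave a network with no useful classification capability.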

Data sets
The need for parameter optimization and architecture selection suggests that image datasets should be split into a training and a testing subset. Moreover, since overfitting is always a critical issue when re-training large, deep networks with a limited amount of data, a further independent dataset is required, for external validation (Russakovsky et al. 2015).

The images for training and testing are obtained through a combination of methods to ensure density (frequency of label representation) and diversity (high variability of appearances and viewpoints). Such datasets must be selected and supervised very carefully by an expert geomorphologist to avoid labelling errors or multiple labelling.

Landforms chosen for training are landslides of various types, scales, states of activity and materials, which are representative of a large range of physiographical settings, versus slopes without mass movements (‘stable slopes’ in the remainder of the paper). Furthermore, the CNNs were trained to distinguish landslides from typical slope processes that can be mistaken for proper mass movements, such as rocky cliffs, scree deposits and alluvial fans.

Most of the images were collected by taking UAV and ground pictures of the relevant categories from the archives of the Civil Protection Centre of the University of Florence (CPC-Unifi) with manual and semi-automated selection methods. To increase the discriminant capability of the trained networks, the dataset was complemented by a second catalogue, generated by data mining image search engines on the web (Google Images, Bing Images and Flickr) on query words related to the main denominations of the chosen landforms and by using the web news catalogue generated by the CPC-Unifi in-house system for automated search of landslide news over the period 2010–2017 (Battistini et al. 2013; Battistini et al. 2017). The data mining of no-landslide scenes was performed by looking up terms such as ‘hillslope’, ‘hill’ and ‘landscape’ and then by manually verifying them one by one, and by manually sorting through the previously mentioned image catalogues.

The combination of the two different sources of information ensures a higher diversity in the visual appearances within the dataset and allows for a more comprehensive set of non-nadiral scenes. This, in turn, should extend the capability of the trained networks towards computer vision applications and automated classification of images deriving from non-standard sources.

Very often, in fact, images obtained by data mining of web resources or through automated optical camera acquisitions (mounted on drones, fixed stands or collected from non-professional photographers) are neither object-centred nor clean in terms of target visibility. The inclusion of such noisy data in the training set adds more flexibility and generalization capability to the automated classification machine. As it is not possible to define a certain source for all images, with special reference to those derived from undocumented web data sources, we estimate that roughly 55% of images come from ground pictures, 35% from aerial and drone acquisitions and the remaining 10% from optical satellite images.

Table 2 Numerical consistency of labelled images across the three datasets used. Please note that figures do not consider data augmentation, adopted during training and testing

Label          Training   Test   Validation
Landslide      1980       495    403
Stable slope   2560       640    291
Scree          1025       256    100
Fan            1230       307    168
Cliff          1045       261    195

Table 3 Optimal configuration for each architecture. Overall accuracy for the combination showing best performances is also reported. Image classification time is relative to a single processor Intel Core i7 (2.7 GHz, 4 cores)

CNN name     Pre-trained architecture   Optimal ILR (−)   Optimal Mom (−)   Optimal MBS (−)   Overall accuracy (−)   Image classification time (s)   Size in ONNX format (MB)
Go-LanDLC    GoogLeNet                  1.0 × 10^−3       0.8               10                0.88                   0.025                           23.9
GP-LanDLC    GoogLeNet-Places365        1.0 × 10^−2       0.4               20                0.87                   0.025                           23.9
Re-LanDLC    ResNet.101                 5.0 × 10^−3       0.4               20                0.90                   0.105                           170.8
In-LanDLC    Inception.v3               5.0 × 10^−3       0.6               20                0.90                   0.030                           87.5
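
The label counts of Table 2 can be tallied with a few lines of arithmetic. This is only a checking sketch with illustrative variable names; it sums the three columns, derives the training share of the training plus test images and computes the mean number of training images per category.

```python
# Label counts from Table 2: (training, test, validation).
counts = {
    "Landslide":    (1980, 495, 403),
    "Stable slope": (2560, 640, 291),
    "Scree":        (1025, 256, 100),
    "Fan":          (1230, 307, 168),
    "Cliff":        (1045, 261, 195),
}

train_total = sum(c[0] for c in counts.values())
test_total = sum(c[1] for c in counts.values())
validation_total = sum(c[2] for c in counts.values())

# Share of the training subset within the training + test images.
train_share = train_total / (train_total + test_total)
# Average number of training images available for each of the five labels.
mean_train_per_class = train_total / len(counts)

print(train_total, test_total, validation_total)
print(round(train_share, 2), round(mean_train_per_class))
```

The totals agree with the rounded figures quoted in the text: an 80/20 split between training and test images, a validation set of roughly 1200 images and about 1570 training images per category.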

Fig. 2 Mean overall accuracy in classification for each run over the 144 combinations of optimization parameters. Points showing zero accuracy correspond to training
options leading to networks with no classification capability with respect to the test set. This is often due to the adoption of wrong values of ILR


Fig. 3 Variation of the average classification accuracy of the different tested CNNs with increasing initial learning rate ILR. Typically, for large values of ILR the accuracy
quickly degrades

In terms of pure numbers, after data augmentation, the dataset was split in two, with 80% (about 7900 images) of the validated images devoted to training and 20% to testing (about 2000 images). A separate validation dataset of about 1200 images was then generated by using independent non-filtered data, to simulate real-world cases of unlabelled web data mining and drone survey acquisitions. This dataset was only used after training optimization to define the actual level of reachable accuracy. In all datasets, the number of images for each class is not perfectly balanced due to the difficulty in finding suitable pictures for some specific landform types, such as scree deposits and alluvial fans. This has produced a certain degree of imbalance in the data that has been

Fig. 4 Variation of the average classification accuracy of the different tested CNNs with increasing momentum

tackled by resorting to image augmentation techniques and by adopting suitable performance metrics (Ferri et al. 2009; Sun et al. 2009; Batuwita and Palade 2012; Branco et al. 2015). See also the Results section for details. The distribution of the used labels across the three datasets is shown in Table 2.

The density of images (number of data for each category, see Table 2) is comparable to state-of-the-art benchmarks such as the ImageNet data storage that contains over 15 million images labelled in 22,000 categories. In ImageNet, the average number of samples that are available for each category is about 680 while in our case, each category (over the five used) has an average number of samples of about 1570, for training only. The adopted data density is even higher than that used for full training of CNNs in the ILSVRC challenge that was based on a subset of ImageNet with 1.2 million images labelled in 1000 categories, with an average density of 1200 images per category (Russakovsky et al. 2015).

During transfer learning, training and optimization, all the images were scaled to the required dimension by using an augmented image data store that combines RGB bands into a [Rx Ry 3] matrix, where Rx and Ry are, respectively, the x and y image input sizes in pixels. During training, the augmented image data store has also been used to generate slight variations of the single images, to further increase sample density and diversity (Russakovsky et al. 2015; Zhou et al. 2016).

Results
The training of the four selected CNNs during the optimization runs shows an execution time directly proportional to the architecture complexity. For each optimization cycle, learning time was about 2.5 min on Go-LanDLC and GP-LanDLC, 4.5 min on In-LanDLC and 6.2 min on Re-LanDLC, based on the hardware setup previously described. A series of independent post-training classification trials were also carried out on a separate hardware platform with basic computational capability, to simulate an actual operational environment on a portable platform (Intel Core i7 2.7 GHz with 4 cores). Average classification time (Table 3), using

Table 4 Summary statistics on the Go-LanDLC network validation. See text for symbol explanation

Go-LanDLC optimal config
Label       TP    FP   TN     FN   Precision   Recall (sensitivity)   f-score   Accuracy   Specificity   Negative predictive value   Error
Landslide   350   79   675    53   0.82        0.87                   0.84      0.89       0.90          0.93                        0.13
Slope       257   52   814    34   0.83        0.88                   0.86      0.93       0.94          0.96                        0.08
Scree       90    45   1012   10   0.67        0.90                   0.77      0.95       0.96          0.99                        0.05
Fan         136   11   978    32   0.93        0.81                   0.86      0.96       0.99          0.97                        0.04
Cliff       135   2    960    60   0.99        0.69                   0.81      0.95       1.00          0.94                        0.06

Fig. 5 Confusion matrix for classification using Go-LanDLC (GoogLeNet) on the validation dataset. The term ‘slope’ is short for ‘stable slope’ or negative measure with respect to positive predictions
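
The per-class statistics of Table 4 can be recomputed from the confusion counts. The sketch below assumes the standard one-vs-rest definitions of each metric, which are not spelled out in this section, and uses a hypothetical helper name. The counts are those attributed to the ‘landslide’ class, for which TP + FN matches the 403 validation images of Table 2.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """One-vs-rest statistics for a single class from its confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "precision": precision,
        "recall": recall,
        "f_score": 2 * precision * recall / (precision + recall),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "specificity": tn / (tn + fp),
        "npv": tn / (tn + fn),
    }


# Counts for the 'landslide' class in the Go-LanDLC validation (Table 4).
landslide = binary_metrics(tp=350, fp=79, tn=675, fn=53)
for name, value in landslide.items():
    print(f"{name}: {value:.2f}")
```

Rounded to two decimals, these values reproduce the ‘landslide’ entries of Table 4 for precision, recall, f-score, accuracy, specificity and negative predictive value.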

trained CNNs on single images was 0.025 s for both Go-LanDLC and GP-LanDLC, 0.030 s for In-LanDLC and 0.105 s for Re-LanDLC, within Matlab. Faster CNNs are slightly less accurate, showing maximum accuracy of 0.88 (Go-LanDLC) and 0.87 (GP-LanDLC) with respect to an average 0.90 shown by the more complex Re-LanDLC and In-LanDLC. In general terms, the best performing network seems to be In-LanDLC, based on the Inception.v3 architecture, which has maximum accuracy equal to Re-LanDLC (based on ResNet.101) but is much faster (0.030 against 0.105 s).

For all the tested networks, accuracy seems strongly dependent on the fine-tuning of training parameters, as shown in Fig. 2, which reports the average overall accuracy for each architecture and all optimization runs.

Sensitivity to the training parameter values is different for each architecture and suggests that specific optimization is needed case by case. Nonetheless, it is apparent that some parameter combinations yield generally high (or low) classification accuracy for all networks.

The influence of each training parameter may be better understood by looking at the mean accuracy achieved at varying values. Figure 3 shows the variation of accuracy with increasing values of the initial learning rate (ILR), while Fig. 4 refers to the momentum (Mom).

In Fig. 3, for relatively simple architectures, based on GoogLeNet (Go-LanDLC and GP-LanDLC), the accuracy increases with ILR only up to values around 5.0 × 10^-4. After that, the accuracy diminishes, with a sharp drop for ILR greater than 1.0 × 10^-3. Deeper networks, such as those based on the Inception.v3 and ResNet.101 architectures, are more robust to variations of ILR, showing a decline in prediction accuracy only for values of ILR greater than 1.0 × 10^-2. In-LanDLC, in particular, exhibits an increase in accuracy up to ILR equal to 5.0 × 10^-3.

Table 5 Summary statistics on the GP-LanDLC network validation. See text for symbol explanation
GP-LanDLC optimal config
Label         TP   FP    TN  FN  Precision  Recall (sensitivity)  f-score  Accuracy  Specificity  Negative predictive value  Error
Landslide    341  114   640  62       0.75                  0.85     0.79      0.85         0.85                       0.91   0.18
Slope        233   36   830  58       0.87                  0.80     0.83      0.92         0.96                       0.93   0.09
Scree         76   37  1020  24       0.67                  0.76     0.71      0.95         0.96                       0.98   0.06
Fan          126   61   928  42       0.67                  0.75     0.71      0.91         0.94                       0.96   0.10
Cliff        119   14   948  76       0.89                  0.61     0.73      0.92         0.99                       0.93   0.08

Fig. 6 Confusion matrix for classification using GP-LanDLC (GoogLeNet.Places365) on the validation dataset. The term ‘slope’ is short for ‘stable slope’ or negative measure with respect to positive predictions

The sensitivity to momentum appears lower (Fig. 4). Almost all architectures show a constant average accuracy up to values of Mom equal to 0.8, then accuracy declines. In-LanDLC, based on



Inception.v3, is less sensitive than the other CNNs to momentum variations. The same can be said for Re-LanDLC, even though the fact that its curve is always the highest may be because, in general, residual learning architectures are less prone to minima-seeking errors. This, however, does not mean that Re-LanDLC is the most efficient in terms of image object recognition, since its computation time is much higher. Finally, the mini-batch size (MBS) is not very relevant in the tests, showing a very limited influence on overall accuracy.

The best parameter set and overall accuracy for each trained CNN are shown in Table 3.

After training and testing, the 4 optimal configurations have been tested by running the classifier on an independent data set (see ‘Materials and methods’). The results have been used to estimate the classification capability of the CNNs according to standard ranking metrics for class label data in both balanced and unbalanced samples (Ferri et al. 2009; Sun et al. 2009; Batuwita and Palade 2012; Branco et al. 2015). For each architecture and for each landform type, the following metrics have been used, where TP is the number of true positives, FP the number of false positives, TN the number of true negatives and FN the number of false negatives.

Precision: $p = \frac{TP}{TP + FP}$
Recall (or sensitivity): $r = \frac{TP}{TP + FN}$
F-score: $f = 2 \cdot \frac{p \cdot r}{p + r}$
Accuracy: $\alpha = \frac{TP + TN}{TP + FP + TN + FN}$
Specificity: $s = \frac{TN}{TN + FP}$
Negative predictive value: $npv = \frac{TN}{FN + TN}$
Error: $\varepsilon = \frac{FP + FN}{TP + TN}$

Precision (p) is a measure of the robustness towards false positives. Recall (r, or sensitivity) summarizes how well positive cases are predicted, accounting for the robustness towards false negatives. The f-score combines p and r into a single score. Accuracy (α) is an overall measure of correct answers with respect to total answers. Specificity (s) refers to the capability of predicting negative values against false positives. The negative predictive value (npv) measures the relative importance of false negatives. Error (ε), computed here as the ratio of incorrect to correct answers, should be minimized.

Table 6 Summary statistics on the Re-LanDLC network validation. See text for symbol explanation
Re-LanDLC optimal config
Label         TP   FP    TN  FN  Precision  Recall (sensitivity)  f-score  Accuracy  Specificity  Negative predictive value  Error
Landslide    356   47   724  47       0.88                  0.88     0.88      0.92         0.94                       0.94   0.09
Slope        258   43   823  33       0.86                  0.89     0.87      0.93         0.95                       0.96   0.07
Scree         89   21  1036  11       0.81                  0.89     0.85      0.97         0.98                       0.99   0.03
Fan          150   15   974  18       0.91                  0.89     0.90      0.97         0.98                       0.98   0.03
Cliff        168   10   952  27       0.94                  0.86     0.90      0.97         0.99                       0.97   0.03

The results, highlighted in the following, clarify that, in terms of predictive performance, the 4 CNN architectures behave quite differently for each separate type of landform. The simplest and fastest CNNs, based on GoogLeNet, offer acceptable classification capability when left with the original pre-training (Go-LanDLC) and poor performance when pre-trained with the scene dataset of Places.365 (GP-LanDLC). In Table 4, the summary statistics for the Go-LanDLC network show that the recognition of landslides (i.e. the main target of the study) is acceptable compared to the Pareto frontier averages (Fig. 1), with precision of 0.82, accuracy of 0.89 and error of 0.13. The capability to classify alluvial fans (p = 0.93, α = 0.97) and stable slopes (p = 0.83, α = 0.93) is rather good as well. There is, however, a poor capability in the correct


classification of scree deposits (precision p = 0.67 and f-score f = 0.77). Rock cliffs show a very low number of false positives but an unsustainable number of false negatives (p = 0.99, r = 0.69). The complete results of validation for Go-LanDLC are presented in the confusion matrix of Fig. 5.

When trained on scene pictures from the Places.365 database, GoogLeNet does not improve. In fact, in Table 5, the validation statistics for GP-LanDLC show a quite poor classification power towards landslides (p = 0.75, f-score = 0.79, ε = 0.18). Even poorer is the performance with respect to scree deposits (p = 0.67, r = 0.76, f-score = 0.71), alluvial fans (p = 0.67, r = 0.75, f-score = 0.71) and rock cliffs (p = 0.89, r = 0.61, f-score = 0.73).

This behaviour may appear unexpected, since GoogLeNet.Places365 has been pre-trained on a dataset of images representing places so as to be able to classify specific site typologies. A more careful analysis, however, reveals that this pre-training is very good when the target is a set of classes representing generic places or place names, but it may be quite inefficient if the objective is to recognize an object inside a complex landscape. For example, GoogLeNet.Places365 is capable of distinguishing whether a picture represents a classroom or a library but cannot tell whether the objects ‘book’ or ‘computer’ are present in the picture itself. This, conversely, is typically feasible by resorting to GoogLeNet with standard training. In our specific case, we are training a CNN that must detect the presence of complex objects merged in a background that is not relevant. In other words, we want to be able to recognize a landslide (or another similar landform) that is overlapping (or overlapped by), e.g. a road, a series of buildings, some vineyard lines, a parking lot or a standing passer-by. This specific task is evidently better accomplished by Go-LanDLC than by GP-LanDLC. The complete results of the validation for GP-LanDLC are presented in the confusion matrix of Fig. 6. In any case, GoogLeNet-based CNNs are compact and fast (ONNX size of about 24 MB and average image classification time of 0.025 s).

The increase of architecture complexity in terms of number of learnable layers appears to boost overall performance. The most advanced CNN used, Re-LanDLC based on ResNet.101, a convolutional neural network with 101 layers and residual learning, improves landslide detection with p = 0.88, r = 0.88 and α = 0.92 (Table 6). Moreover, it strongly enhances the capability to classify scree deposits (p = 0.81, r = 0.89 and α = 0.97), alluvial fans (p = 0.91, r = 0.89 and α = 0.97) and rock cliffs (p = 0.94, r = 0.86 and α = 0.97). This is done at the expense of compactness (170.9 MB) and prediction time (0.105 s). The complete results of validation for Re-LanDLC are presented in the confusion matrix of Fig. 7.

Fig. 7 Confusion matrix for classification using Re-LanDLC (ResNet.101) on the validation dataset. The term ‘slope’ is short for ‘stable slope’ or negative measure with respect to positive predictions

The CNN based on Inception.v3 (In-LanDLC) seems to represent a good compromise in terms of cost-benefit ratio, given that it is more compact (87.5 MB) and faster (average image prediction time of 0.030 s) than Re-LanDLC. As highlighted in Table 7, overall classification proficiency is still high, with landslide classification figures that are actually higher than with Re-LanDLC (p = 0.93, r = 0.87 and α = 0.93). The same applies to alluvial fans (p = 0.90, r = 0.92 and α = 0.97) and rock cliffs (p = 0.91, r = 0.93 and α = 0.97). The only exception is the detection of scree deposits, with slightly lower figures, mainly concerning the number of false positives (p = 0.74, r = 0.97 and α = 0.97). The complete results of validation for In-LanDLC are presented in the confusion matrix of Fig. 8.

Some examples of image classification are shown in the following, with the purpose of visually describing results and typical errors as compared to actual landscape components. In all figures, the classification is reported along with the membership likelihood in percentage. Classes are indicated by short terms where the term



‘slope’ is short for ‘stable slope’ and has, as previously mentioned, the significance of any picture in which the CNN detector does not recognize one of the four trained landforms (‘cliff’, ‘fan’, ‘landslide’ and ‘scree’).

In Fig. 9, a selected sample of images classified by the Go-LanDLC algorithm is depicted to highlight a typical behaviour. The CNN correctly identifies all the features, with some uncertainties in ascribing the coastal cliff in image (g) and the debris-flow fan in image (h). This indecision may be due to the ambiguities that the two images present even to a skilled human expert. The coastline, in fact, may equally represent a rock cliff or a landslide scar, depending on the level of accuracy and classification choices. The debris flow is in effect dominated by the alluvial fan that it generates, and the error is understandable. On the other hand, the pictures in images (e), (f) and (i) are quite challenging but are correctly classified with a low level of uncertainty.

Figure 10 illustrates some cases for GP-LanDLC. The tendency to overestimate the class ‘landslide’, quantified by the overall precision value p = 0.75 in Table 5, is visible in the third image of the second row (f), where a road is flanked by a moderately steep slope with vegetation and some rock outcrops. The low probability (60.4%) for the class, however, may in part help to understand that the attribution is uncertain. An even worse case of false positive is the image in the second column, third row (h), in which the classification algorithm is almost certain (99.6%) that the ploughed fields in the background are landslide scars. Even though there is a certain probability that the slope hosts some dormant landslides, this is not actually visible from the image and the case must be considered a false positive. Better capability is shown by GP-LanDLC in recognizing the absence of trained landforms in the low-quality image (a) and the presence of a landslide in the very confusing picture in image (g), where almost the entire image is filled by a part of the landslide body, without context or contrasting background. The capability of classifying landforms that are only partially included in images is a very useful characteristic in data mining applications and in the classification of low-altitude aerial photographs.

The examples related to the ResNet.101 network (Re-LanDLC) are reported in Fig. 11. Here, the high discriminant capability of residual learning networks is highlighted by the correct classification of the landslides in images (a), (h) and (i), despite the surrounding disturbance given by buildings, people and infrastructures. There is, however, a quite serious error in the central image (e), which is misclassified (even though with some uncertainty given by the likelihood of 74.8%) as a stable slope, possibly due to the presence of vegetation on the main landslide body. The possible causes of this kind of false negative will be discussed in the next section. A typical feature of Re-LanDLC is visible when looking at all but the central image, which are classified without any uncertainty by the algorithm. This is not necessarily an advantage of the method and may generate false positives, especially in the crucial distinction between landslides and stable slopes (see values of p = 0.88 and p = 0.86 in Table 6).

Table 7 Summary statistics on the In-LanDLC network validation. See text for symbol explanation
In-LanDLC optimal config
Label         TP   FP    TN  FN  Precision  Recall (sensitivity)  f-score  Accuracy  Specificity  Negative predictive value  Error
Landslide    349   28   726  54       0.93                  0.87     0.89      0.93         0.96                       0.93   0.08
Slope        252   24   842  39       0.91                  0.87     0.89      0.95         0.97                       0.96   0.06
Scree         97   34  1023   3       0.74                  0.97     0.84      0.97         0.97                       1.00   0.03
Fan          155   18   971  13       0.90                  0.92     0.91      0.97         0.98                       0.99   0.03
Cliff        182   18   944  13       0.91                  0.93     0.92      0.97         0.98                       0.99   0.03

A similar level of detection skill is given by In-LanDLC, based on the state-of-the-art Inception.v3 convolutional neural network. In Fig. 12, a high discriminant power is shown in images (c), (f) and (h), where, again, several disturbances are present, including internal and external factors. Quite unexpected is the


false negative in image (g), where a possibly active landslide is completely missed (likelihood for stable slope 98.9%), possibly due to the very low colour contrast between the landslide body and the surrounding slopes.

Fig. 8 Confusion matrix for classification using In-LanDLC (Inception.v3) on the validation dataset. The term ‘slope’ is short for ‘stable slope’ or negative measure with respect to positive predictions

Colour contrast is a typical source of errors in CNNs for image classification that exploit only RGB optical bands. A possible improvement could be obtained by adding further bands, such as the near infrared or the short-wave infrared, if available. This, in turn, would force a complete redesign of the CNN structure and prevent the usage of most pre-trained architectures. This and other possible error sources will be briefly discussed in the next section.

Discussion

General considerations on CNN implementation
The results reported in the previous section seem to highlight the fact that specialized convolutional neural networks derived from transfer learning behave better than their original counterparts. This is clearly visible by comparing the average accuracies reported in the Pareto frontier plot of Fig. 1 with the figures of overall accuracy in Table 3. This is not unexpected and is related to the very nature of a specialized convolutional neural network. The four original CNNs used for transfer learning are general-purpose classification algorithms, capable of classifying thousands of different object types with a relatively good accuracy. This holistic aptitude is obtained at the expense of a lower precision and accuracy in the classification of specific features that are not included in the original training database, such as geomorphic landforms. On the contrary, the four proposed post-trained CNNs are strictly trained on the five desired target classes; therefore, they have higher precision and accuracy on that specific task but are not usable for general-purpose scene or object classification.

The best classification results are obtained by using In-LanDLC, based on the Inception.v3 architecture, one of the best open-source CNNs for image recognition in terms of accuracy-time trade-off. The only weak point of In-LanDLC is the relatively high number of false positives (p = 0.74) in scree deposit detection, possibly due to the underestimation of the number of landslides and stable slopes. Still, In-LanDLC boasts the best performance in all the remaining classes, including landslides, which are the ultimate target of the present study. When the overall classification power is compared to the parameters that constrain the operational implementation of the algorithm, In-LanDLC is clearly superior to Re-LanDLC, with figures that suggest that the latter should be discarded in case of UAV or drone applications. The ONNX-format size (Table 3) and, more importantly, the image classification time for Re-LanDLC are much higher than for In-LanDLC and not acceptable for real-time applications. In turn, simpler architectures based on the compact and fast GoogLeNet model may offer better suitability for drone implementation at the expense of larger errors. In particular, Go-LanDLC seems the best option between the two, due to the acceptable overall accuracy that is coupled with a small size (23.9 MB in ONNX) and a good classification speed (about 0.025 s per image). The search for the optimal trade-off between In-LanDLC and Go-LanDLC depends on the scope of work, on the type of drone (or robot) platform to be used, on the type of sensors and on a complex set of operational parameters such as flight speed and altitude, land cover type, target type and lighting conditions.
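The per-class figures used in this comparison can be re-derived from the confusion counts. The following is a minimal Python sketch, not the Matlab workflow used in the study, that applies the metric definitions given in the Results section to the landslide rows of Tables 4 to 7. The names Counts, metrics and LANDSLIDE_COUNTS are illustrative assumptions; the error is computed as (FP + FN)/(TP + TN), following the definition given for ε.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """One-vs-rest confusion counts for a single class."""
    tp: int
    fp: int
    tn: int
    fn: int


def metrics(c: Counts) -> dict:
    """Return the seven validation metrics defined in the Results section."""
    p = c.tp / (c.tp + c.fp)          # precision
    r = c.tp / (c.tp + c.fn)          # recall (sensitivity)
    return {
        "precision": p,
        "recall": r,
        "f_score": 2 * p * r / (p + r),
        "accuracy": (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "npv": c.tn / (c.fn + c.tn),
        # Ratio of wrong to correct answers, as in the paper's formula for error.
        "error": (c.fp + c.fn) / (c.tp + c.tn),
    }


# Landslide rows (TP, FP, TN, FN) as reported in Tables 4, 5, 6 and 7.
LANDSLIDE_COUNTS = {
    "Go-LanDLC": Counts(tp=350, fp=79, tn=675, fn=53),
    "GP-LanDLC": Counts(tp=341, fp=114, tn=640, fn=62),
    "Re-LanDLC": Counts(tp=356, fp=47, tn=724, fn=47),
    "In-LanDLC": Counts(tp=349, fp=28, tn=726, fn=54),
}

if __name__ == "__main__":
    for name, counts in LANDSLIDE_COUNTS.items():
        rounded = {k: round(v, 2) for k, v in metrics(counts).items()}
        print(name, rounded)
```

With these counts, In-LanDLC gives the highest landslide precision and Re-LanDLC the highest landslide recall, in line with the values quoted in the text.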



Fig. 9 a–i Some examples of classification as given by the Go-LanDLC algorithm on the validation dataset. For each single image, the assigned class is indicated along with the class membership likelihood in percentage

Landslide classification errors and possible solutions
On a different matter, we have seen in the results section that classification errors, with particular reference to landslide false negatives (or missed alerts in risk assessment terms), are present in all the trained CNNs. As an example, Re-LanDLC, despite its overall accuracy, misses the detection of a quite large landslide in the central picture of Fig. 11 (image e), possibly due to the vegetation regrowth that gives the landslide body a colour similar to that of the surrounding slopes. The failure to detect landforms that may appear quite clear to an expert human eye can be investigated by looking at the output of the convolutional layers. Each convolution produces a quasi-random set of image modifications from which the training can extract the most relevant parameters for a multivariate analysis. Such convolutional products of the original image are called activations in the CNN literature and represent features that activate exchanges of information among layers, thus enhancing the classification power. In the following, we provide two examples of activations that lead to a wrong classification by one of the CNNs, as a basis for discussion.

A sample of the activations for the landslide of Fig. 11e, as generated by the first convolutional layer of the CNN, is depicted in Fig. 13, along with the original image. It is clear that some of them are filters that highlight terrain texture, while others are capable of delineating horizontal boundaries or enhancing colour contrast. Others yet do not produce any significant pattern, at least to the human eye.

Despite the fact that the landslide is quite discernible from the surroundings, Re-LanDLC is not capable of classifying it correctly (see Fig. 11e). The remaining three networks, instead, provide correct predictions (Table 8).

This behaviour may be explained by considering that Re-LanDLC is the only architecture, among the four chosen, that uses residual learning. Residual learning allows for a deeper network to be developed by reducing the impact of the vanishing gradient problem (Hochreiter 1998). This is done by adding to each convolutional layer’s output the original (or upper layer’s) learned output, to produce a new input for the next layer that also includes the source information (He et al. 2015a; He et al. 2015b). This technique limits the information degradation that is typical of classical CNNs by keeping the (n−1)th layer output as part of the input for the (n+1)th layer. Consequently, it is possible to build efficient very deep CNNs with a low degree of horizontal complexity. Such deep and narrow networks are very powerful in image recognition (ResNet.101 has won several ILSVRC challenges), as is also well reflected in the performance indicators of our


modified version Re-LanDLC (Table 6). However, in some specific cases, such as the one of Fig. 11e, the residual learning may retain, in the weighting scheme, a landscape feature that is confusing rather than useful for correct classification. The same error may not occur in non-residual networks that, on the contrary, discard previous information before going deeper to the next convolutional layer.

Fig. 10 a–i Some examples of classification as given by the GP-LanDLC algorithm on the validation dataset. For each single image, the assigned class is indicated along with the class membership likelihood in percentage

A different case is the one depicted in Fig. 14, where a stable slope with terracing and scattered vegetation is analysed. This time, Re-LanDLC wrongly detects the presence of a landslide, thus producing a false positive (less dangerous in terms of risk assessment than a false negative). The remaining CNNs, despite some uncertainty with scree deposits and landslides, correctly identify the slope as stable (Table 9). Image activations are again the same for each network, but the way each one processes the parameters emerging from them is different. In this specific case, a possible explanation of the uncertainties in the correct classification is the seemingly bulging feature in the mid-right of the image generated by some of the activations. It is possible that a fine-tuning of the weighting scheme for some of the convolutional products could enhance the final prediction, thereby reducing false positives. In this case, as well, the residual learning approach of Re-LanDLC could cause a wrong weighting of the activations, by giving low scores to the most significant ones or by keeping noisy information as relevant through the residual learning technique, which inherits the previous layer’s convolutional outputs that may be deceiving in specific cases.

The examples reveal that a careful study of classification errors, through the analysis of activations and of the way the latter propagate down the neural network layers, might reveal specific insights on image filtering methods, in order to



devise a set of landform-specific image convolutions and further improve the overall performance. This may be feasible only by applying a full residual learning to a partially new architecture, a task that involves a large effort in terms of image labelling and computation. This specific task is outside the scope of this paper, since it will require a specific activation analysis for each one of the images used for the training and the subsequent development of a brand-new CNN with full training. The latter, in itself, as discussed in the methodological section, would require a much larger set of labelled images for training and testing, in the order of 10^5 or larger. Provided that a similar number of landform images actually exists within publicly available resources and databases (something that, so far, remains to be verified in the first place), the correct labelling of them would only be possible by resorting to automated human intelligence tasking (HIT), such as the Amazon Mechanical Turk used for the development of GoogLeNet.Places365 (Zhou et al. 2016). However, while the human recognition of landscape scenes is a task requiring a quite common general knowledge, the analysis of landforms is a specialized task that requires a certain level of training of the HIT workers. This would surely increase the effort in terms of expected costs and time.

Fig. 11 a–i Some examples of classification as given by the Re-LanDLC algorithm on the validation dataset. For each single image, the assigned class is indicated along with the class membership likelihood in percentage

Automated landform detection in optical satellite images
One of the main constraints of the proposed algorithms is the small image size, which is implicit in the transfer learning technique that has been adopted to exploit pre-trained high-performance image recognition CNNs (Table 1). This limitation, however, is not relevant for the implementation of robot guidance and scene recognition and is also well compatible with frame-by-frame video analysis and crowdsourced data mining, since most of the available imagery is low resolution and of limited areal coverage. Even in the case of targets with dimension much larger than the camera footprint, a simple solution is to increase flight altitude. Another, more elaborate, solution may be the automated mosaicking and resampling of drone acquisitions until a CNN-compatible scaling is obtained.


Fig. 12 a–i Some examples of classification as given by the In-LanDLC algorithm on the validation dataset. For each single image, the assigned class is indicated along with the class membership likelihood in percentage

The limited image size becomes an important drawback when the image to be analysed is very large with respect to the average dimension of the target. In such a case, a downsampling of the image, to fit the required size, would completely filter out the target landforms. In the case of landslides, this may happen when trying to apply computer vision techniques to satellite optical data that cover tens of square kilometres with a resolution of metres or tens of metres, such as Landsat 8 and Sentinel-2. On such images, having size in the order of 10^4 × 10^4 pixels and resolution of 10^1 m, a landslide will occupy a few pixels. Therefore, a complete image resizing to 10^2 × 10^2 pixels, required by the typical CNN, will definitely wipe out most of them, blurring any interesting feature with the background.

A possible solution may exist, even though only in a post-processing perspective, as exemplified by some previous applications (Liu and Wu 2016). The LanDLC algorithms could be applied over a moving window, with dimensions exactly matching the required size for each adopted CNN, either in overlapping or in non-overlapping mode. For example, in Sentinel-2 optical images, multi-spectral (RGB and NIR) information is measured at a ground resolution of 10 m. That, in case of a scan size of 224 × 224 pixels (required by Go-LanDLC, GP-LanDLC and Re-LanDLC), would mean that each moving window will cover an area of 2240 × 2240 m, a dimension that matches quite well most landslides, inclusive of runout. The operation should be repeated all over the satellite image about 10^3 times on average in non-overlapping mode. That would mean, based on the classification time figures and hardware setup reported in Table 3, a total scan time in the order of 10^2 s by using the slow but powerful Re-LanDLC and of 10^1 s when using the faster and more compact Go-LanDLC. Given the fact that in post-processing operations much larger computational power can be used, such as the CUDA-capable NVIDIA GeForce RTX 2070 with 36 processing cores used for the training or similar GPU processing units, we may expect that batch classification process chains might be implemented for large image datasets quite easily.
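The orders of magnitude quoted above can be checked with a few lines of arithmetic. The sketch below is a hypothetical Python helper (tiles_per_side and scan_plan are illustrative names, not part of LanDLC). It assumes a square scene of 10^4 × 10^4 pixels at 10 m resolution, a 224 × 224 pixel window, and the per-image classification times quoted in the text for the basic hardware platform; partial windows at the scene border are counted as full tiles.

```python
import math


def tiles_per_side(image_px: int, window_px: int, stride_px: int) -> int:
    """Number of window positions along one side, covering the border too."""
    if image_px <= window_px:
        return 1
    return math.ceil((image_px - window_px) / stride_px) + 1


def scan_plan(image_px, window_px, resolution_m, seconds_per_tile, stride_px=None):
    """Summarize a moving-window scan of a square scene.

    stride_px=None means non-overlapping mode (stride equal to window size).
    """
    stride = window_px if stride_px is None else stride_px
    n_tiles = tiles_per_side(image_px, window_px, stride) ** 2
    return {
        # Ground footprint of one window side, in metres.
        "window_side_m": window_px * resolution_m,
        "tiles": n_tiles,
        "scan_time_s": n_tiles * seconds_per_tile,
        # Ground size of one pixel if the whole scene were resized to the window.
        "resized_pixel_m": image_px * resolution_m / window_px,
    }


if __name__ == "__main__":
    # Per-image times quoted in the text: 0.105 s (Re-LanDLC), 0.025 s (Go-LanDLC).
    for name, t in (("Re-LanDLC", 0.105), ("Go-LanDLC", 0.025)):
        plan = scan_plan(10_000, 224, 10.0, t)
        print(name, plan["tiles"], "tiles,", round(plan["scan_time_s"], 1), "s")
```

Under these assumptions the non-overlapping scan needs 45 × 45 = 2025 windows, i.e. about 213 s with Re-LanDLC and about 51 s with Go-LanDLC, consistent with the quoted orders of magnitude. Resizing the whole scene to 224 pixels would instead give pixels of roughly 446 m on the ground, larger than many landslides.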



Fig. 13 Some activations of the image discussed in the text, generated by the first convolutional layer. The first image in the upper left corner is the original image passed to the network as input. It is clear that some activations are more relevant than others, because they are able to extract specific important features of the landform

Table 8 Degree of likelihood of class membership for the image in Fig. 13 for the four trained convolutional neural networks
Network     Cliff  Alluvial fan  Landslide  Scree deposit  Stable slope
Go-LanDLC   0.000         0.003      0.997          0.000         0.000
GP-LanDLC   0.008         0.008      0.878          0.003         0.103
Re-LanDLC   0.060         0.022      0.149          0.021         0.748
In-LanDLC   0.003         0.002      0.711          0.001         0.283

Fig. 14 Example activations of a stable slope characterized by terracing and scattered vegetation that may render landform classification difficult. The activations have been generated by the first convolutional layer of the CNNs

Conclusions
A set of powerful, publicly available convolutional neural networks has been adapted to recognize typical mass movement landforms within non-nadiral and non-standard pictures by transfer learning. The best parameter sets for the four tested algorithms have been determined by an iterative optimization procedure covering 576 different configurations. The accuracy and error analysis of such training runs shows that the classification performance of such post-trained CNNs is consistently higher than that of the general-purpose original architectures and suitable for usage in automated data mining of crowdsourced images. Furthermore, preliminary tests with basic and more advanced hardware configurations show that at least two of the optimal CNNs developed (Go-LanDLC and In-LanDLC) are compatible with usage in UAV and generic robot applications for automated survey and guidance, provided that some technical adjustments on image acquisition and pre-processing are made. A slight modification of the way the algorithm is applied may also allow for a quasi-real-time

scan of satellite VHR optical RGB images in a moving-window mode, thus potentially improving the capability of existing automated mapping tools. The four different versions of LanDLC are freely available for research purposes in the ONNX format under a CC BY NC 3.0 licence, as electronic supplementary material.

Further research is needed to work out the best trade-off between computational power on the one side and speed and compactness on the other, before developing an actually implementable machine intelligence for automated landslide and landform detection. Among the priorities in future research in this direction, we believe two will be essential: (i) a detailed analysis of a large number of convolutional schemes for the extraction of significant parameters for object recognition, to reduce false negatives and false positives, and (ii) the development of brand-new CNNs specifically suited for landform recognition through full training with suitably large datasets (inexistent at present) of correctly labelled images. According to the present experience and to the work carried out for similar networks, we expect that such databases should have dimension in the order of 10^5–10^6 images.

Table 9 Degree of likelihood of class membership for the image in Fig. 14 for the four trained convolutional neural networks
Network     Cliff   Alluvial fan   Landslide   Scree deposit   Stable slope
Go-LanDLC   0.014   0.003          0.076       0.185           0.722
GP-LanDLC   0.028   0.006          0.158       0.026           0.782
Re-LanDLC   0.001   0.000          0.987       0.004           0.008
In-LanDLC   0.001   0.001          0.229       0.028           0.741
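As a reading aid for Table 9, the short sketch below picks the most likely class for each network and compares it with the class stated in the caption of Fig. 14 (a stable slope). The values are transcribed from the table; the helper name `top_class` and the use of the margin over the runner-up as a rough confidence indicator are assumptions made here for illustration, not part of the original method.

```python
# Class order and likelihoods transcribed from Table 9 (image in Fig. 14).
CLASSES = ["Cliff", "Alluvial fan", "Landslide", "Scree deposit", "Stable slope"]

TABLE_9 = {
    "Go-LanDLC": [0.014, 0.003, 0.076, 0.185, 0.722],
    "GP-LanDLC": [0.028, 0.006, 0.158, 0.026, 0.782],
    "Re-LanDLC": [0.001, 0.000, 0.987, 0.004, 0.008],
    "In-LanDLC": [0.001, 0.001, 0.229, 0.028, 0.741],
}

# Fig. 14 is described in its caption as a stable slope.
EXPECTED = "Stable slope"


def top_class(likelihoods):
    """Return (label, likelihood, margin over the runner-up) for one table row."""
    ranked = sorted(zip(likelihoods, CLASSES), reverse=True)
    (best_p, best_label), (second_p, _) = ranked[0], ranked[1]
    return best_label, best_p, best_p - second_p


predictions = {net: top_class(row) for net, row in TABLE_9.items()}

for net, (label, p, margin) in predictions.items():
    verdict = "matches" if label == EXPECTED else "differs from"
    print(f"{net}: {label} (likelihood {p:.3f}, margin {margin:.3f}), {verdict} the caption")
```

For this image, three of the four networks assign the highest likelihood to the stable slope class, while Re-LanDLC assigns it to the landslide class.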



Acknowledgments
We are grateful to the Editor and to the three anonymous reviewers for their important comments and suggestions, which helped to improve the final version of the paper. We gratefully acknowledge the Civil Protection Centre of the University of Florence (CPC-UNIFI) for access to the historical archive of landslide and landscape imagery. We also thank Luca Tanteri (CPC-UNIFI) and Gabriele Scaduto (Department of Earth Sciences) for their help in collecting the UAV and web pictures used in the training and testing of the artificial neural networks. The four different versions of LanDLC are freely available for research purposes in the ONNX format under a CC BY NC 3.0 licence as electronic supplementary material.

Funding Information
Open access funding provided by Università degli Studi di Firenze within the CRUI-CARE Agreement.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Electronic supplementary material The online version of this article (https://doi.org/10.1007/s10346-020-01513-4) contains supplementary material, which is available to authorized users.

F. Catani
Earth Sciences Department,
University of Florence,
Via La Pira, 4, 50121, Florence, Italy
Email: filippo.catani@unifi.it
