
Concrete Crack Detection Using the Integration of Convolutional Neural Network and Support Vector Machine

Article · June 2018

DOI: 10.14456/scitechasia.2018.11


Science & Technology Asia
P-ISSN 2586-9000
E-ISSN 2586-9027
Homepage: https://tci-thaijo.org/index.php/SciTechAsia

Vol. 23 No. 2 April - July 2018 Page: [19 – 28]

Original research article

Mayank Sharma 1,*, Weerachai Anotaipaiboon 1, Krisada Chaiyasarn 2
1 Faculty of Engineering, Department of Electrical and Computer Science, Thammasat University,
Pathum Thani 12120, Thailand
2 Faculty of Engineering, Department of Civil Engineering, Thammasat University,
Pathum Thani 12120, Thailand
Received 11 September 2017; Received in revised form 9 February 2018
Accepted 19 February 2018; Available online 30 June 2018

ABSTRACT
Crack detection in concrete structures is an important task in the inspection of
buildings to ensure their safety and serviceability. Previous studies on crack detection
via image processing and machine learning generally involve complex modelling of cracks
and the extraction of handcrafted crack features. This approach often fails when applied
to images from a real environment. This paper proposes a new image-based crack detection
system using a combined model of classifiers, namely a Convolutional Neural Network (CNN)
and a Support Vector Machine (SVM), which was shown to perform better than methods
involving handcrafted features. In the proposed technique, a CNN is used to extract deep
convolutional features from RGB images of cracks, and an SVM classifier is used as an
alternative to a softmax layer to enhance classification ability. The combined model
automatically extracts features and determines whether or not an image patch belongs to
the crack class. A dataset of 550 images was collected with a digital camera from various
locations, and the results show that the proposed method is able to identify cracks on
concrete surfaces with an accuracy of 90.76%.
Keywords: Convolutional Neural Network; Computer Vision; Concrete Structure; Crack
Detection; Support Vector Machine

*Corresponding author: macx.sharma@gmail.com

1. Introduction
As concrete structures age, regular assessment and maintenance work is essential to obtain
information about their structural condition. The most common inspection procedure is visual
inspection, which is used to assess the physical and functional condition of structures. Visual
inspection is usually carried out by experienced engineers to detect and monitor defects within
aging structures. This procedure is costly and labor intensive. In visual inspection, crack
detection is an important procedure, as cracks provide significant information about the current
state of a concrete structure, as discussed in [1, 2, 3]. Fig. 1 shows some example images of
cracks on concrete surfaces used in this study. A number of previous studies have been conducted
on developing automatic crack detection systems, although much work is still required if they
are to be adopted in practice. In crack detection, pattern recognition is applied to obtain the
prominent features of cracks from digital images or video data. Previous methods of crack
detection applied handcrafted feature extraction techniques to obtain unique crack features from
images, which can be complex and time-consuming, unlike techniques using a Convolutional Neural
Network (CNN), which can retrieve semantic information from images. As exemplified in [4] for
handwritten digit recognition, a CNN was applied to automatically extract optimized features for
a Support Vector Machine (SVM) to classify, which gave better recognition results. This paper
presents a similar approach by combining a CNN and an SVM for crack detection. The aim is to
develop a new and reliable method based on state-of-the-art computer vision technology to
improve accuracy and efficiency in automatic crack detection. In this study, our primary focus
is on detecting the presence of cracks in images, which can be applied to assess concrete
structures for building inspection.

Fig. 1 Example images of cracks used in this study

This paper is organized as follows: the background study is included in section 2. An overview
of the proposed system is explained in section 3, and implementation detail is provided in
section 4. Lastly, results and discussion are provided in sections 5 and 6, followed by the
conclusion.

2. Background
2.1 Image-based crack detection
In previous studies, crack detection systems are based on the assumption that crack pixels have
lower intensity than background pixels; hence, thresholding techniques can be applied to extract
crack regions in images. The crack detection system in [5] involves two steps: (i) a line filter
based on the Hessian matrix to enhance the line features associated with cracks, and (ii)
thresholding to extract crack regions. In [6], an image-based crack detection method was proposed
to automate crack detection in bridges. In that work, an image division operation was applied,
splitting an image into two different regions, and the regions were then classified as either
crack or non-crack based on the region features. However, threshold-based algorithms are prone to
errors caused by shadow. In [7], a percolation-based system was proposed for pavement crack
detection by adopting a template matching technique, which can reduce the system computational
cost without affecting the results of crack detection. The modeling process starts by
initializing a seed region, and then the labelling operation is executed on neighbouring regions
to mark them as crack regions, based on the percolation process [8]. Amhaz et al. [9] developed an
algorithm based on Minimal Path Selection (MPS) for automatic crack detection, adding width
estimation. The model-based approach often relies on user input to initialize the seed pixels.

Crack detection algorithms based on pre-processing have been proposed to facilitate the laborious
process of visual inspection. In [10], various image-based edge detection techniques for
detecting cracks within concrete structures were compared, namely the wavelet, Fourier transform,
Sobel, and Canny algorithms. It was observed that the fast Haar wavelet-based method was more
reliable than the other methods. However, in the case of noisy blemishes in images, edge
detection algorithms show a dramatic drop in their performance. For extracting significant crack
features from images, a principal component analysis (PCA) algorithm was applied to mitigate the
dimensionality problem of feature vectors [11]. Then, for the classification step, the nearest
neighbour algorithm was used in this system. However, in the case of images with noisy blemishes,
methods based on pre-processing can fall short in their performance due to the level of image
noise.

2.2 Crack detection based on machine learning
To this end, algorithms that classify cracks using machine learning have been proposed. SVM is a
supervised learning technique in machine learning invented in 1995 by Vapnik and Cortes [12]. Liu
et al. [13] applied an SVM classifier to determine if cracks appear in images, in which the images
are pre-processed to extract crack features based on pixel intensity. A crack detection algorithm
using a neural network with five hidden layers was explained in [14]. The given inputs are the
ratio of the major and minor axis rotation and the area of an object. The features are generated
from various operations, such as subtraction processing, Gaussian filtering, morphological
closing and thresholding. A preliminary study was conducted in [15] to identify and classify
cracks using texture-based feature extraction. In [16], minimum spanning trees (MSTs) were used to
extract the possible connections of sampled crack seeds, and undesired edges in the trees are
then pruned to obtain crack curves. Prasanna et al. [17] developed a vision-based automatic crack
detection system to detect cracks on a bridge deck. The detection system starts by constructing
feature vectors based on intensity gradients and scale-space. The feature vectors are then
classified using various classifiers. Shi et al. [18] developed an automated road crack detection
method by extracting crack features based on discriminative integral channels and then
classifying the features by random structured forests. This method can deal with crack intensity
inhomogeneity, and can capture and utilise some unique characteristics of cracks.

The Convolutional Neural Network (CNN) has recently been applied in visual computing due to its
promising results [19, 20, 21, 22, 23, 24, 25]. The work of [26] developed a CNN-based defect
detector system for tunnel inspection. The inspection system starts by extracting feature vectors
of defects based on different edge, frequency, texture, entropy, scale invariance and HOG
features. The feature vectors are classified using a CNN, and then compared with various
classifiers. Although this method shows satisfactory results, it has to be carried out off-line
for training and classifying, as there are too many steps in the feature extraction stage. Zhang
et al. [27] developed a CNN-based road crack detection method by automatically extracting the
discriminative features from
RGB images at multiple levels, without the need for hand-crafted features, and then applying the
CNN softmax layer to classify the features. As shown in previous studies, the multi-level deep
features learned by CNN-based methods will likely replace hand-crafted feature-based techniques
[20]. Such promising results motivated the use of a CNN-based technique to solve the crack
detection problem in this paper.

Fig. 2 System pipeline

3. Methodology
3.1 Data Collection
In the proposed system, image-based visual inspection requires data on cracks from different
concrete surfaces. As shown in Fig. 2, the first process is image acquisition, in which images of
concrete surface cracks were taken with a digital camera to build a dataset of crack and
non-crack images from various locations, mainly inside the Thammasat University campus. For
training the CNN classifier, image patches are used instead of full images. Training and
validation images were collected in various locations for different types of concrete surfaces,
and the images were manually labelled to create a database of cracks. These images were used in
classification as well as in validation.

3.2 Pre-processing step
Image pre-processing is applied to obtain image patches using the Adobe® Photoshop software
package, which divides RGB images into image patches with a size of 28x28 pixels. The patches are
then used for training the CNN. Before feeding the CNN with training images, a labeling operation
needs to be performed on the images. The training patches are labelled either one or zero: one is
assigned to a patch containing cracks and zero is assigned to a non-crack patch. The RGB values
are used as features in the input vectors to the CNN.

3.3 Convolutional Neural Network
A CNN is a type of multilayer feed-forward, biologically inspired variant of an artificial neural
network. The CNN consists of three types of layers, as shown in Fig. 3. A convolutional layer
consists of a rectangular grid of filters, each of which takes input from rectangular regions of
the input layer. Another layer of the CNN is max pooling, which is a form of non-linear
down-sampling. Max pooling partitions an input image into a set of small non-overlapping
rectangular blocks and, for each block, outputs the maximum value of that sub-region, as shown in
Fig. 4. A fully connected layer is used to compute each class score. In the proposed model, the
Keras sequential model [28] is used for the CNN architecture. As shown in Fig. 5, in the first two
stages a convolution layer, an activation layer (ReLU) and a max-pooling layer are stacked. For
the CNN, this is a suitable layer pattern for larger and deeper networks, since multi-stacked
convolutional layers can develop more complex features of the input data before the destructive
pooling operation.
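As a minimal sketch of the max pooling operation described in section 3.3, the following Python function partitions a 2-D feature map into non-overlapping blocks and keeps the maximum of each block. The function name `max_pool_2d`, the toy 4x4 input values, and the choice to drop incomplete border blocks are illustrative assumptions and are not details taken from the proposed model.

```python
def max_pool_2d(feature_map, block=2):
    """Down-sample a 2-D grid by taking the max of each non-overlapping block.

    Rows/columns that do not fill a complete block are dropped (an assumption).
    """
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - rows % block, block):
        pooled_row = []
        for c in range(0, cols - cols % block, block):
            # Collect the values of one block and keep only the largest.
            window = [
                feature_map[r + dr][c + dc]
                for dr in range(block)
                for dc in range(block)
            ]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


# Toy 4x4 feature map (hypothetical values).
feature_map = [
    [1, 3, 2, 0],
    [4, 6, 1, 2],
    [7, 2, 9, 4],
    [1, 0, 3, 5],
]
pooled = max_pool_2d(feature_map, block=2)
print(pooled)  # [[6, 2], [7, 9]]
```

With a pooling ratio of 2, each spatial dimension is halved, so a 28x28 map would become 14x14 under this sketch.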
Fig. 3 The system pipeline with CNN layers architecture

Fig. 4 An example of the max pooling layer operation

In the last step, the fully connected layer features are used for classification. Usually, a
softmax classifier is used as the output layer of a convolutional neural network to compute the
probability score of each class given an input patch. However, in the proposed model, the
classifying step is replaced by an SVM classifier.

3.3.1 Convolutional Neural Network Learning
The main objective of training a convolutional network is to increase the variation of features
and to avoid overfitting [27]. For training, the dropout method is used to reduce overfitting
[29]. The training of the CNN is accelerated by using rectified linear units (ReLU) as the
activation function [19]. The input to the CNN architecture is feature vectors of size
$h \times w \times c$, where $h \times w$ is the height and width of an image patch and $c$ is
the number of RGB colour channels. The input training dataset is $\{(x_n, y_n)\}$,
$n = 1, \dots, N$, where $N$ is the number of image patches used for training, $x_n$ (a 28x28
pixel image patch) is the observation for the $n$-th image patch, and $y_n$ is the class label for
the $n$-th image patch. Let $a$ be the output of the convolutional layer whose filters are
determined by the weight $W$ and bias $b$. Then $a$ is calculated as

    $a_{i,j} = f\big((W * u)_{i,j} + b\big)$    (3.1)

where $u$ is the input of the convolutional layer and indices $i$ and $j$ correspond to the
location of the input region where the filter is applied. The symbols $*$ and $f(\cdot)$ denote
the convolutional operator and the non-linear function, respectively.

Fig. 5 Structure of the proposed CNN-SVM model
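The convolutional layer computation of Eq. (3.1), a non-linear function applied to the filter response plus a bias, can be sketched for a single channel and a single filter as follows. This is an illustration under stated assumptions rather than the authors' implementation: the function names, the 3x3 filter values, the toy 4x4 patch, and the "valid" border handling are all hypothetical, and the filter is applied without flipping (cross-correlation), which is how deep learning libraries commonly implement the operation.

```python
def relu(value):
    """Rectified linear unit: the non-linear function assumed in this sketch."""
    return max(0.0, value)


def conv2d_relu(patch, kernel, bias=0.0):
    """Apply one square filter to a 2-D patch, add a bias, then apply ReLU.

    Output element (i, j) is the response of the filter placed with its
    top-left corner at location (i, j) of the input ('valid' borders).
    """
    k = len(kernel)
    out_rows = len(patch) - k + 1
    out_cols = len(patch[0]) - k + 1
    output = []
    for i in range(out_rows):
        row = []
        for j in range(out_cols):
            response = sum(
                kernel[di][dj] * patch[i + di][j + dj]
                for di in range(k)
                for dj in range(k)
            )
            row.append(relu(response + bias))
        output.append(row)
    return output


# Toy patch: a dark vertical line (crack-like) in column 1 on a bright background.
patch = [
    [9, 1, 9, 9],
    [9, 1, 9, 9],
    [9, 1, 9, 9],
    [9, 1, 9, 9],
]
# Hypothetical filter that responds to a dark centre column with bright neighbours.
kernel = [
    [1, -2, 1],
    [1, -2, 1],
    [1, -2, 1],
]
activation = conv2d_relu(patch, kernel, bias=0.0)
print(activation)
```

The filter responds strongly where it is centred on the dark line and the ReLU clips the negative response elsewhere, which is the kind of learned feature a trained filter might produce for crack patches.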
The first convolutional layer has 32 convolutional filters, and the max pooling operation after
the filtering has a ratio of 2. The CNN training procedure was stopped after 20 epochs, as the
loss converged to a fixed value, as shown in Fig. 6.

Fig. 6. A plot of loss against the number of epochs

3.4 Support Vector Machine
The idea behind an SVM is to find an optimal linear hyperplane decision boundary such that the
expected classification error for testing samples is minimized. The working implementation of
the SVM classifier is summarized in detail in [23]. An SVM finds a hyperplane that separates the
largest fraction of a labelled data set. For binary classification, the training data is a set
of training sample pairs $\{(x_n, y_n)\}$, where $x_n$ is the observation for the $n$-th sample and
$y_n$ is the associated class label. The SVM classifier is a discriminant function mapping an
input vector space into a class label:

    $f(x) = w \cdot x + b$    (3.2)

where $w$ is the weight, or the direction of the linear decision boundary, and $b$ is an added
bias; they are chosen to maximize the margin between the classes. In the proposed system, the CNN
fully connected layer output features are the input for the SVM, as depicted in Fig. 5, and a
radial basis function (RBF) kernel is used. To obtain the optimal values for the SVM parameters C
and gamma, a cross-validation technique is employed. Different values of C and gamma were tried,
as shown in Table 1. As shown in the table, C = 4 and gamma = 1 provide the best accuracy.

Table 1. A parametric study for the support vector machine

4. Implementation
In this study, the proposed model employed a CNN combined with an SVM for crack detection. The CNN
can be considered as the composition of two tools: an automatic multiple-level feature extractor
and a classifier. The feature extractor uses the fully connected layers of the CNN to retrieve
information in the form of discriminative features from raw RGB images. The extracted features
are used to train the classifier, with the weights learned by a back-propagation algorithm. In
the proposed model, the softmax classifier is replaced by an SVM classifier. In this study, the
Scikit-learn toolbox [30] was used to build the SVM classifier in the experiment. In the
experiment, the classifier not only predicts the output class labels, but also provides
probability scores. The probability scores are then used to create the ROC shown in section 5.1.
For evaluation of the proposed model, 15600 crack and non-crack image
patches of different concrete surface types were used, as depicted in Fig. 7. The image patches
were split into three sets, i.e. training, validation and testing data, with a split ratio of
6:2:2: 60% of the image samples were selected randomly for training, 20% for validation and 20%
for testing. The number of crack and non-crack image patches was set equally in each of the three
datasets. The experiment was executed on a Windows PC with an Intel Core i7-4790 CPU and 8 GB RAM
using the Keras library implemented in Python.

Fig. 7. (a) Sample crack image patches (b) non-crack image patches

5. Results
The proposed CNN-SVM crack detection technique was evaluated using the ROC, which shows that the
CNN-SVM performs well on validation images, as shown in Fig. 9. Table 2 shows the results of the
proposed technique, which show that the CNN-SVM model has an accuracy of 90.76%. It is clear that
the combined model outperforms the CNN model. Some image patches have been misclassified because
the crack is not clearly visible in the image patches, as shown in Fig. 8. The overall
efficiency of the crack detection system is promising.

Table 2. Performance evaluation of the proposed approach

Fig. 8. Sample patches after being classified: incorrectly classified patches (a,
b, c, d) and correctly classified image patches (f, g, h, i).

5.1 Receiver Operating Characteristic
The performance of the proposed CNN-SVM model was evaluated using the ROC curve, which is a plot
of the true positive rate (TPR) against the false positive rate (FPR) for different probability
values of the output, computed by comparing predicted labels to ground truth values. For the
curves shown in Fig. 9, the true positive (TP), true negative (TN), false positive (FP), and
false negative (FN) counts are computed from the frequency of occurrences when comparing ground
truth labels against predicted output labels, as shown in Table 3. The best results are achieved
with the CNN-SVM approach, as shown in Fig. 9.

Table 3. Confusion matrix for class classification.

                            Predicted Label
Ground Truth Label          Positive Prediction (Crack)   Negative Prediction (Non-Crack)
Positive Class (Crack)      True Positive (TP)            False Negative (FN)
Negative Class (Non-Crack)  False Positive (FP)           True Negative (TN)

Fig. 9. The ROC curves for the CNN technique and the CNN-SVM technique.

6. Discussion
Previous researchers (Niu and Suen [4]) showed that better classification performance can be
achieved with CNN features by using an SVM as a classifier. The proposed CNN-SVM model slightly
boosted classification accuracy, from 87.63% to 90.76%. The increment is due to the use of a
different optimization strategy. According to these authors, the softmax layer classification is
based on empirical risk minimization, which seeks to minimize the prediction loss on the training
dataset. On the other hand, the SVM aims to find an optimal linear hyperplane decision boundary
such that the expected classification error for testing samples is minimized, i.e. good
generalization performance, which requires minimizing the generalization error by using the
structural risk minimization principle. By maximizing the margin, the SVM provides better
generalization ability compared to using the softmax as a classifier. Replacing the softmax
classifier of the CNN with the SVM is simple, and appears to be useful for the crack detection
problem. The CNN is capable of extracting discriminative features from a large number of images
without any pre-processing. Further work involves training and testing our system on a large
database. For the inspection of a small section of concrete infrastructure, a hand-held camera is
sufficient to acquire images from a real environment. However, to capture images from a larger
inspection site, a special acquisition system is required.

7. Conclusion
This paper presents the combined CNN-SVM model to solve the crack detection problem in concrete
structures. The model used the CNN as a deep feature extractor and the SVM as the output
classifier. The obtained results show that the proposed approach can detect cracks
with an accuracy of 90.76%. This is a promising result, which can be applied to assess concrete
structures for inspection. The results show that the combination of CNN and SVM is promising,
although the efficiency of the combined model can be further improved by fine-tuning the CNN, for
example the size of the convolution filter and max-pooling layer as defined in section 3.3.1.

Acknowledgements
The valuable support and resources for the research received from the Faculty of Engineering,
Thammasat University are gratefully acknowledged.

References
[1] Chaiyasarn K. "Damage detection and monitoring for tunnel inspection based on computer vision," [Ph.D. Dissertation]. U.K.: Christ's College, University of Cambridge, England; 2011.
[2] Koch C, Georgieva K, Kasireddy V, Akinci B, and Fieguth P. "A review on computer vision based defect detection and condition assessment of concrete and asphalt civil infrastructure," Advanced Engineering Informatics. 2015;29(2):196-210.
[3] Yu S. N, Jang J. H, and Han C. S. "Auto inspection system using a mobile robot for detecting concrete cracks in a tunnel," Automation in Construction. 2007;16(3):255-261.
[4] Niu X. X, and Suen C. Y. "A novel hybrid CNN-SVM classifier for recognizing handwritten digits," Pattern Recognition. 2012;45(4):1318-1325.
[5] Fujita Y, Mitani Y, and Hamamoto Y. "A method for crack detection on a concrete structure," in Proc. IEEE 18th Int. Conf. Pattern Recognition (ICPR), 2006;4:901-904.
[6] Tong X, Guo J, Ling Y, and Yiu Z. "A new image-based method for concrete bridge bottom crack detection," in Proc. IEEE Int. Conf. Image Analysis and Signal Processing (IASP), 2011;568-571.
[7] Yamaguchi T, and Hashimoto S. "Fast crack detection method for large size concrete surface images using percolation-based image processing," Machine Vision and Applications. 2010;21(5):797-809.
[8] Yamaguchi T, and Hashimoto S. "Image processing based on percolation model," IEICE Transactions on Information and Systems. 2006;89(7):2044-2052.
[9] Amhaz R, Chambon S, Idier J, and Baltazart V. "A new minimal path selection algorithm for automatic crack detection on pavement images," in Image Processing (ICIP), International Conference IEEE. 2014;17:788-792.
[10] Abdel-Qader I, Abudayyeh O, and Kelly M. E. "Analysis of edge-detection techniques for crack identification in bridges," Journal of Computing in Civil Engineering. 2003;17(4):255-263.
[11] Abdel-Qader I, Pashaie-Rad S, Abudayyeh O, and Yehia S. "PCA-based algorithm for unsupervised bridge crack detection," Advances in Engineering Software. 2006;37(12):771-778.
[12] Cortes C, and Vapnik V. "Support-vector networks," Machine Learning. 1995;20(3):273-297.
[13] Liu Z, Suandi S. A, Ohashi T, and Ejima T. "A Tunnel Crack Detection and Classification Systems Based on Image Processing," Machine vision application in industry inspection X. 2002;4664(145):820-8502.
[14] Moon H-G, and Kim J-H. "Intelligent crack detecting algorithm on the concrete crack image using neural network," Proceedings of the 28th ISARC. 2011;1461-1467.
[15] Chen Z, Derakhshani R. R, Halmen C, and Kevern J. T. "A texture-based method for classifying cracked concrete surfaces from digital images using neural networks," in Neural Networks (IJCNN), The International Joint Conference IEEE. 2011:2632-2637.
[16] Zou Q, Cao Y, Li Q, Mao Q, and Wang S. "CrackTree: Automatic crack detection from pavement images," Pattern Recognition Letters. 2012;33(3):227-238.
[17] Prasanna P, Dana K. J, Gucunski N, Basily B. B, La H. M, Lim R. S, and Parvardeh H. "Automated crack detection on concrete bridges," IEEE Transactions on Automation Science and Engineering. 2016;13(2):591-599.
[18] Shi Y, Cui L, Qi Z, Meng F, and Chen Z. "Automatic road crack detection using random structured forests," IEEE Transactions on Intelligent Transportation Systems. 2016;17(12):1-12.
[19] Krizhevsky A, Sutskever I, and Hinton G. E. "Imagenet classification with deep convolutional neural networks," Advances in Neural Information Processing Systems. 2012;25:1097-1105.
[20] LeCun Y, Bengio Y, and Hinton G. "Deep learning," Nature. 2015;521(7553):436-444.
[21] Luo G, An R, Wang K, Dong S, and Zhang H. "A deep learning network for right ventricle segmentation in short-axis MRI," in Computing in Cardiology Conference (CinC) IEEE, 2016;485-488.
[22] Shin H-C, Roth H. R, Gao M, Lu L, Xu Z, Nogues I, Yao J, Mollura D, and Summers R. M. "Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning," IEEE Transactions on Medical Imaging. 2016;35(5):1285-1298.
[23] Tang Y. "Deep learning using linear support vector machines," in Workshop on Representational Learning, ICML, 2013;1306-0239.
[24] Ukai M. "Development of image processing technique for detection of tunnel wall deformation using continuously scanned image," Quart. Rep. RTRI. 2000;41(3):120-126.
[25] Xie D, Zhang L, and Bai L. "Deep Learning in Visual Computing and Signal Processing," Applied Computational Intelligence and Soft Computing. 2017;1-13.
[26] Makantasis K, Karantzalos K, Doulamis A, and Doulamis N. "Deep supervised learning for hyperspectral data classification through convolutional neural networks," in Geoscience and Remote Sensing Symposium (IGARSS), IEEE International. 2015;4959-4962.
[27] Zhang L, Yang F, Zhang Y. D, and Zhu Y. J. "Road crack detection using deep convolutional neural network," in Proceedings of the IEEE Int. Conf. on Image Processing (ICIP '16), 2016;3708-3712.
[28] Charles P.W.D. Keras, (2013) GitHub repository [Internet]. [cited 2017 March]. Available from: https://github.com/charlespwd/project-title
[29] Srivastava N, Hinton G, Krizhevsky A, Sutskever I, and Salakhutdinov R. "Dropout: A simple way to prevent neural networks from overfitting," Journal of Machine Learning Research. 2014;15(1):1929-1958.
[30] Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, Vanderplas J. "Scikit-learn: Machine learning in Python," Journal of Machine Learning Research. 2011;2825-2830.