Convolutional neural networks-based crack detection for real concrete surface

Conference Paper · March 2018


DOI: 10.1117/12.2296536



Shengyuan Li (a), Xuefeng Zhao* (a)
(a) School of Civil Engineering, State Key Laboratory of Coastal and Offshore Engineering, Dalian
University of Technology, 116023, Dalian, China

ABSTRACT

Cracks are one of the most important forms of damage on real concrete surfaces. Visual inspection by trained
inspectors, the primary method of detecting cracks, is laborious and time-consuming in practice. Image processing
techniques automate crack detection to some extent, but they require explicit feature extraction to detect a crack
in an image. Their use is therefore limited, since images taken of real concrete surfaces are affected by noise
caused by lighting, blur, and so on. In this paper, a convolutional neural network-based crack detection method for
real concrete surfaces is proposed. Convolutional neural networks (CNNs) learn image features automatically instead
of relying on hand-crafted feature extraction, and are therefore far less affected by such noise. A CNN for crack
detection was designed by fine-tuning an existing CNN architecture. To train the CNN, image datasets first had to be
built. A large number of images were taken of real concrete surfaces using a smartphone, cropped into small images,
classified, and labeled. A CNN classifier for crack detection was obtained by training the CNN on these datasets.
By integrating the trained CNN classifier into a smartphone application, cracks in an image can be detected
automatically. The results show that the proposed method achieves high accuracy and robust performance and can
detect cracks on real concrete surfaces.
Keywords: crack detection, convolutional neural networks, real concrete surface, smartphone

1. INTRODUCTION
Crack detection is crucial in the maintenance and operation of concrete structures. The conventional crack detection
method depends on visual inspection: a trained inspector evaluates the state of a structure according to the location
and width of its cracks. However, manual detection is time-consuming, and its results are subjective, since inspectors
evaluate the condition of a structure according to their own experience.
Given the weaknesses of human inspection, image-based crack detection methods have been studied extensively. Image
processing techniques identify cracks in images based on the assumptions that cracks are slender, connected regions
that are darker than their background [1]. Morphological approaches usually segment cracks using a suitably chosen
threshold [2]. To make crack detection more robust, global transforms and local edge detectors are deployed [3-5],
such as the fast Haar transform, the fast Fourier transform, and the Sobel and Canny edge detectors. However, this
approach depends heavily on well-chosen image preprocessing and edge detection techniques, while the features on a
real structure's surface vary and are affected by many factors, such as lighting and shadows.
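The darker-than-background assumption behind threshold-based segmentation can be illustrated with a minimal sketch. The function name `dark_pixel_mask`, the parameter `k`, and the rule "mean minus k standard deviations" are illustrative assumptions; they are not the methods of [1] or [2].

```python
from statistics import mean, pstdev

def dark_pixel_mask(gray, k=1.0):
    """Mark pixels darker than (mean - k * std) as crack candidates.

    gray: 2-D list of grayscale intensities (0-255).
    k:    hypothetical sensitivity parameter, not taken from the paper.
    """
    pixels = [p for row in gray for p in row]
    threshold = mean(pixels) - k * pstdev(pixels)
    return [[1 if p < threshold else 0 for p in row] for row in gray]

# Toy 4x5 patch: bright concrete (200) with one dark vertical crack (40).
patch = [
    [200, 200, 40, 200, 200],
    [200, 200, 40, 200, 200],
    [200, 200, 40, 200, 200],
    [200, 200, 40, 200, 200],
]
mask = dark_pixel_mask(patch)  # 1 marks a crack candidate
```

A shadow or stain of similar intensity would pass the same test, which is the sensitivity to lighting that the paragraph above describes.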
To detect cracks in images more accurately, machine learning-based approaches have been used [6]. Artificial neural
networks, a class of supervised machine learning algorithms, are used to classify images as with or without cracks [7].
However, because of limited computational capability, only simple artificial neural network structures could be used
to detect cracks in practice. In recent years, thanks to the development of deep learning and parallel computation on
graphics processing units (GPUs) [8], deep CNNs have attracted attention in image recognition [9]. Unlike conventional
fully connected neural networks, CNNs need fewer computations to classify images owing to partial (local) connections,
weight sharing, and pooling. Using a CNN requires designing its architecture and building a databank containing a
large number of images for training [10]. With the popularity of smartphones, smartphones have been used as tools for
structural health monitoring [11]. Integrating the trained CNN model into a smartphone application is therefore
convenient for inspectors.
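The saving from local connections and weight sharing can be made concrete by counting weights. The sketch below compares a fully connected layer with a convolution layer on a 224×224×3 input; the layer sizes (64 outputs, 7×7 kernel) are illustrative assumptions, and the two layers do not produce the same output shape, so the comparison only indicates the order of magnitude.

```python
def fc_weights(in_h, in_w, in_c, out_units):
    """Weights in a fully connected layer: every input feeds every output."""
    return in_h * in_w * in_c * out_units

def conv_weights(kernel, in_c, out_c):
    """Weights in a convolution layer: one kernel x kernel x in_c filter per
    output channel, shared across all spatial positions."""
    return kernel * kernel * in_c * out_c

# Illustrative sizes (assumptions, not taken from the paper).
fc = fc_weights(224, 224, 3, 64)   # 9,633,792 weights
conv = conv_weights(7, 3, 64)      # 9,408 weights
ratio = fc // conv                 # 1024 times fewer weights
```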
In this paper, a deep CNN is employed to build a classifier for crack detection. The CNN-based crack detection
method recognizes cracks in images without a separate crack feature extraction step. Moreover, the detection result
of the CNN is robust to noise on the concrete surface. Section 2 gives an overview of the proposed method. Section 3
presents the CNN architecture used in this paper. Section 4 describes the CNN training process in detail, including
building the crack databank, the hyperparameters used to train the CNN, and the training results. Section 5
demonstrates how to integrate the trained CNN model into a smartphone application for convenient crack detection.

2. OVERVIEW
This section summarizes the whole process of the proposed crack detection method. Figure 1 shows the general flow of
CNN-based crack detection using a smartphone. Before the CNN is trained, a databank is established from which the
training and validation sets are generated. In the databank, large crack images are cropped into small images of
256×256 pixels, which are then manually classified as with or without cracks. The training and validation sets are
drawn randomly from these small images. Training the CNN on these sets yields a CNN classifier for crack detection.
To verify the effectiveness of the trained CNN, a testing process is carried out. Finally, the trained CNN is
integrated into a smartphone application for detecting cracks in real-world situations.
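The random selection of training and validation sets can be sketched as follows. The validation fraction, the seed, and the file names are hypothetical; the paper does not state the split ratio.

```python
import random

def split_train_val(items, val_fraction=0.2, seed=0):
    """Shuffle items and split them into (train, val) lists.

    val_fraction and seed are hypothetical; the paper does not state them.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]

# Hypothetical (file name, label) pairs standing in for labeled 256x256 crops.
crops = [("crack_%03d.jpg" % i, 1) for i in range(50)] + \
        [("intact_%03d.jpg" % i, 0) for i in range(50)]
train, val = split_train_val(crops)
```

Fixing the seed keeps the split reproducible, so that validation images never leak into the training set between runs.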

Figure 1. Flow chart for crack detection using CNN

3. CNN ARCHITECTURE USED TO DETECT CRACKS


This section presents the CNN of the proposed method. A CNN with two output classes (with cracks and without cracks)
is designed by modifying GoogLeNet [12]. The architecture of the designed CNN is shown in Figure 2. Images of
224×224 pixels with three color channels (224×224×3) are input into the CNN, and the softmax layers predict whether
each input image is with or without cracks. Some other operations, namely dropout [13], local response normalization,
rectified linear units (ReLU) [14], and full connection (FC), are not illustrated in Figure 2 but are indispensable
to the CNN.
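How a softmax layer turns two output scores into a class prediction can be sketched in a few lines. The class order, the function names, and the example scores are assumptions made for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

CLASSES = ("without crack", "with crack")  # class order is an assumption

def predict(logits):
    """Return the most probable class name and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best], probs[best]

label, confidence = predict([0.3, 2.1])  # hypothetical final-layer scores
```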
Figure 3 shows the inception module referred to in Figure 2. The inception modules in Figure 2 are an efficient
architecture for CNN training: they exploit the sparsity of the architecture while still using high-performance
computations on dense matrices. Because a 5×5 convolution kernel is computationally expensive, a 1×1 convolution
kernel is applied first to reduce the dimension (the number of channels).
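The effect of the 1×1 reduction can be checked by counting multiply-accumulate operations (MACs). The feature-map size and channel counts below are illustrative assumptions, chosen to resemble an early inception module; they are not taken from Figure 2.

```python
def conv_macs(out_h, out_w, kernel, in_c, out_c):
    """Multiply-accumulate operations for one convolution layer
    (stride 1, 'same' padding, bias ignored)."""
    return out_h * out_w * kernel * kernel * in_c * out_c

H = W = 28                          # feature-map size (illustrative)
C_IN, C_MID, C_OUT = 192, 16, 32    # channel counts (illustrative)

# 5x5 convolution applied directly to all input channels.
direct = conv_macs(H, W, 5, C_IN, C_OUT)

# 1x1 convolution down to C_MID channels, then the 5x5 convolution.
reduced = conv_macs(H, W, 1, C_IN, C_MID) + conv_macs(H, W, 5, C_MID, C_OUT)
```

With these numbers the direct 5×5 branch needs 120,422,400 MACs and the reduced branch 12,443,648, roughly a tenth of the cost.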

Figure 2. Architecture of the CNN (Conv: convolution; MaxPool: max pooling; AveragePool: average pooling; FC: full
connection)

Figure 3. Inception module

4. TRAINING THE CNN


Classifying crack images depends on a CNN classifier. To train the designed CNN, a databank consisting of training
and validation sets must first be established. The CNN is then trained and validated on this databank by optimizing
its weights and bias parameters. After training, a testing process is carried out to verify the effectiveness of the
trained CNN. The CNN is trained in the open-source deep learning framework Caffe [15], and a workstation with a GPU
(CPU: Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.2 GHz; RAM: 32 GB; GPU: ASUS GeForce GTX 1080 Ti) is used for all of the
work in this paper.
4.1 Building databank
A total of 1250 real-world crack images were taken of a bridge using a smartphone. These images were cropped into
smaller images of 256×256 pixels and then classified into two classes: with cracks and without cracks. The resulting
databank contains 60000 small images, with a 1:1 ratio between the two classes. Some of the small images used for
training are shown in Figure 4.
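The cropping step can be sketched as computing non-overlapping 256×256 tile boxes for each photo. The photo size used here is hypothetical, and dropping partial edge tiles is an assumption; the paper does not say how image borders were handled.

```python
def tile_boxes(width, height, tile=256):
    """Return (left, top, right, bottom) boxes of non-overlapping tiles.

    Partial tiles at the right and bottom edges are dropped; whether the
    authors dropped, padded, or overlapped edge tiles is not stated.
    """
    return [
        (left, top, left + tile, top + tile)
        for top in range(0, height - tile + 1, tile)
        for left in range(0, width - tile + 1, tile)
    ]

# Hypothetical photo size; the paper does not give the camera resolution.
boxes = tile_boxes(4032, 3024)

# From the figures in the text: 60000 crops from 1250 photos.
crops_per_photo = 60000 / 1250
```

The databank averages 48 crops per photo, so a photo of this hypothetical size (165 tiles) would contribute only a subset of its tiles.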

Figure 4. Images used for training

4.2 Training and validating the CNN


Before the CNN is trained, the weights and bias parameters must be initialized. In this paper, the weights and bias
parameters are initialized using the “Xavier” and “Constant” methods, respectively. The CNN is trained using
stochastic gradient descent with a momentum of 0.9. The base learning rate is 0.01, and the “step” learning policy is
adopted. The CNN is trained for 15000 iterations using the built training and validation sets. To match the input
size of the designed CNN, the training and validation images are resized to a resolution of 224×224 pixels. During
training, validation is performed every 50 iterations, and the training loss and validation accuracy are recorded
accordingly.
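The optimizer settings above can be sketched on a toy problem. The base learning rate and momentum come from the text; the decay factor `GAMMA` and step length `STEPSIZE` are assumptions, since the paper does not report them. The "step" policy is written in the form used by Caffe, where the rate is multiplied by a constant factor every fixed number of iterations.

```python
def step_lr(base_lr, gamma, stepsize, iteration):
    """'step' policy: base_lr * gamma ** floor(iteration / stepsize)."""
    return base_lr * gamma ** (iteration // stepsize)

def sgd_momentum_update(weight, velocity, grad, lr, momentum=0.9):
    """One SGD-with-momentum update for a single scalar weight:
    v <- momentum * v - lr * grad, then w <- w + v."""
    velocity = momentum * velocity - lr * grad
    return weight + velocity, velocity

BASE_LR = 0.01     # from the paper
MOMENTUM = 0.9     # from the paper
GAMMA = 0.1        # assumption: decay factor is not given in the paper
STEPSIZE = 5000    # assumption: step length is not given in the paper

# Toy problem: minimise f(w) = w**2, whose gradient is 2 * w.
w, v = 1.0, 0.0
for it in range(15000):
    lr = step_lr(BASE_LR, GAMMA, STEPSIZE, it)
    w, v = sgd_momentum_update(w, v, 2.0 * w, lr, MOMENTUM)
```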

Figure 5. Training loss and validation loss for each iteration

Figure 5 shows the training loss and validation loss for each iteration. As shown in Figure 5, the training loss and
validation loss descend rapidly in the first 2000 iterations and fluctuate around a roughly constant level after
7000 iterations. On the whole, the training loss is slightly smaller than the validation loss. Figure 6 presents the
validation accuracy over the iterations. The highest accuracy, about 99.39%, is achieved at the 13000th iteration.
The whole training process of 15000 iterations takes about 8 hours with GPU acceleration. Figure 7 shows the
visualized weights of the convolution kernels in the first convolution layer [16]. Unlike conventional machine
learning methods, the CNN learns crack features automatically.

Figure 6. Validation accuracy for each iteration

Figure 7. Visualized weights of the first convolution layer

5. INTEGRATING THE TRAINED CNN INTO A SMARTPHONE APPLICATION


The trained CNN can be used to predict the class of a new image, and the popularity of smartphones provides an
opportunity to mobilize the public to detect cracks. For this purpose, the trained CNN model is integrated into a
smartphone application based on the Core ML framework, so that cracks can be detected conveniently in practice.
During integration, Xcode (version 9.2), an integrated development environment, is used to create the application in
the Swift programming language. The resulting application, named Crack Detector, is installed on an iPhone 7 Plus
running iOS 11.2. Before predicting an image, Crack Detector resizes it to a resolution of 224×224 pixels. Notably,
it can classify not only photos stored on the phone but also a new photo taken of a concrete surface on the spot.
Figure 8 presents the prediction results for different images. The results demonstrate the good performance of the
trained CNN. In addition, this smartphone application can conveniently draw more attention to crack detection.
Figure 8. Prediction results of different concrete images using the Crack Detector
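The resizing step that precedes prediction can be sketched with nearest-neighbour sampling. This is an assumption made for illustration only: the paper does not state which interpolation the application or Core ML uses, and the function name `resize_nearest` is hypothetical.

```python
def resize_nearest(image, out_h=224, out_w=224):
    """Nearest-neighbour resize of an image stored as a list of rows.

    Each row is a list of pixel values (scalars or RGB tuples).
    """
    in_h, in_w = len(image), len(image[0])
    # For every output row/column, pick the source row/column it maps to.
    rows = [min(in_h - 1, (y * in_h) // out_h) for y in range(out_h)]
    cols = [min(in_w - 1, (x * in_w) // out_w) for x in range(out_w)]
    return [[image[r][c] for c in cols] for r in rows]

# Hypothetical 256x256 grayscale crop with a simple gradient pattern.
crop = [[(x + y) % 256 for x in range(256)] for y in range(256)]
resized = resize_nearest(crop)
```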

6. CONCLUSION
A convolutional neural network-based approach to detecting cracks using a smartphone is proposed in this paper. A CNN
for crack detection is designed by modifying GoogLeNet. A large number of images needed for CNN training, validation,
and testing are collected using a smartphone. These crack images are then cropped into small images with a resolution
of 256×256 pixels to build a databank. The databank contains a total of 60000 small images for building the training
and validation sets. The CNN is trained on these datasets and reaches a highest validation accuracy of 99.39%. To
mobilize more of the public to detect cracks in practice, the trained CNN model is integrated into a smartphone
application named Crack Detector. It is concluded that the proposed method can indeed detect cracks, and that the
created smartphone application makes crack detection convenient.

ACKNOWLEDGEMENT
This work was supported by the National Key Research and Development Plan (Grant 2016YFE0202400) and Natural
Science Foundation of China (Grant 51479031).

REFERENCES

[1] Yamaguchi, T., Nakamura, S., Saegusa, R., and Hashimoto, S., “Image‐based crack detection for real
concrete surfaces,” IEEJ T. Electr. Electr. 3(1), 128-135 (2010).
[2] Oliveira, H. and Lobato Correia, P., “Automatic road crack segmentation using entropy and image dynamic
thresholding,” Proc. EUSIPCO 2009, 622-626 (2009).
[3] Santhi, B., Krishnamurthy, G., Siddharth, S., and Ramakrishnan, P. K., “Automatic detection of cracks in pavements
using edge detection operator,” J. Theor. App. Inf. Technol. 36(2), 199-205 (2012).
[4] Abdelqader, I., Abudayyeh, O., and Kelly, M. E., “Analysis of edge-detection techniques for crack
identification in bridges,” J. Comput. Civil Eng. 17(4), 255-263 (2003).
[5] Yeum, C. M. and Dyke, S. J., “Vision‐based automated crack detection for bridge inspection,”
Comput.-Aided Civ. Infrastruct. Eng. 30(10), 759-770 (2015).
[6] Liu, S. W., Huang, J. H., Sung, J. C., and Lee, C. C., “Detection of cracks using neural networks and
computational mechanics,” Comput. Methods Appl. Mech. Engrg. 191(25–26), 2831-2845 (2002).
[7] Moselhi, O. and Shehab-Eldeen, T., “Classification of defects in sewer pipes using neural networks,” J.
Infrastruct. Syst. 6(3), 97-104 (2000).
[8] LeCun, Y., Bengio, Y., and Hinton, G., “Deep learning,” Nature 521(7553), 436 (2015).
[9] Krizhevsky, A., Sutskever, I., and Hinton, G. E., “ImageNet classification with deep convolutional neural
networks,” International Conference on Neural Information Processing Systems, 60, 1097-1105 (2012).
[10] Deng, J., Dong, W., Socher, R., Li, L. J., Li, K., and Li, F. F., “ImageNet: A large-scale hierarchical image
database,” Proc. CVPR 2009, 248-255 (2009).
[11] Zhao, X. F., Han, R. C., Yu, Y., Hu, W. T., Jiao, D., Mao, X. Q., Li, M. C., and Ou, J. P., “Smartphone-based
mobile testing technique for quick bridge cable–force measurement,” J. Bridge Eng. 22(4), 06016012 (2016).
[12] Szegedy, C., Liu, W., Jia, Y. Q., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., and
Rabinovich, A., “Going deeper with convolutions,” Proc. CVPR 2015, 1-9 (2015).
[13] Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., and Salakhutdinov, R., “Dropout: a simple way to
prevent neural networks from overfitting,” J. Mach. Learn. Res. 15(1), 1929-1958 (2014).
[14] Nair, V. and Hinton, G. E., “Rectified linear units improve restricted boltzmann machines,” Proc. ICML 2010,
807-814 (2010).
[15] Jia, Y., Shelhamer, E., Donahue, J., et al., “Caffe: convolutional architecture for fast feature
embedding,” Proc. ACM Multimedia 2014, 675-678 (2014).
[16] Zeiler, M. D. and Fergus, R., “Visualizing and understanding convolutional networks,” Proc. ECCV 2014,
818-833 (2014).
