A Novel Intelligent Inspection Robot With Deep Stereo Vision For Three-Dimensional Concrete Damage Detection and Quantification

Original Article
Structural Health Monitoring

2022, Vol. 21(3) 7 88–802
A novel intelligent inspection The Author(s) 2021

Article reuse guidelines:
robot with deep stereo vision for sagepub.com/journals-permissions

DOI: 10.1177/14759217211010238
journals.sagepub.com/home/shm
three-dimensional concrete
damage detection and quantification
Cheng Yuan, Bing Xiong, Xiuquan Li, Xiaohan Sang and Qingzhao Kong
Abstract
Crack assessment of reinforced concrete structures using stereo cameras is a potential way for increasing the efficiency
and safety of infrastructure maintenance routines. However, existing damage methods for reinforced concrete structures
are based on the segmentation of two-dimensional planes without consideration to the actual size of concrete damage.
Furthermore, on-site structural monitoring requires the installation of a large number of contact-based sensing devices,
resulting in the potentially excessive consumption of time and financial resources. Therefore, a new vision-based damage
assessment method for reinforced concrete structures using a novel intelligent inspection robot with Internet of things–
enabled data communication system is proposed in this article. In the first part of this article, the data acquisition system
of the inspection robot and the algorithm for three-dimensional structural reconstruction using a stereo camera is dis-
cussed. The discussion is followed by a description of the method for crack quantification based on a new proposed
deep-learning technique. Finally, to accomplish damage localization, the quantified concrete damage with actual size infor-
mation is projected onto a three-dimensional surface point cloud reconstruction of the inspected structure. To verify
the proposed method, a reinforced concrete column that has undergone cyclic loading failure is used as an inspection
subject. The validation experiment demonstrated the ability of the proposed system to segment, localize, and quantify
the damage in three-dimensional space with high accuracy.
Keywords
Structural health monitoring, deep learning, stereo vision, three-dimensional scene reconstruction, concrete damage,
volume quantification
Introduction Traditional facade damage detection relies on visual

inspection and sensor-based structural health monitor-
Reinforced concrete structures during their service life ing techniques, which mainly depends on the judgement
can often be subjected to fatigue stresses or cyclic load- of trained inspectors and the proper implementation of
ing that initiates microscopic cracks on the surface sensors or other data acquisition equipment, respec-
which can grow or coalesce into larger cracks.1–3 tively.15–17 Conventional sensors, such as strain gauges,
Cracking can degrade structural stiffness and generate accelerometers, and linear variable differential transfor-
material discontinuities.4,5 Potential failures due to mers (LVDTs) are widely used to assess structural
excessive cracking can be prevented by the diligent integrity.18–22 For example, the smart-aggregate (SA)
implementation of early inspection and monitoring.6–8
In general, two approaches are available for detecting
cracks in concrete, such as contact-based detection and Department of Disaster Mitigation for Structures, Tongji University,
non-contact inspection.9–13 Although vibration-based Shanghai, China
structural damage detection approaches have been
Corresponding author:
widely applied in recent years, such methods are not Qingzhao Kong, Department of Disaster Mitigation for Structures, Tongji
suitable for crack detection due to the fact that they are University, Shanghai 200092, China.
not sensitive to local damage.14 Email: qkong@tongji.edu.cn
Yuan
2 et al. 789
Structural Health Monitoring 00(0)
method can monitor concrete cracks in real time with inspection system was demonstrated experimentally on
damage index.23–25 However, the application of sensor- an RC column.
based acquisition networks can be prohibitively time- This study is organized as follows. Section
consuming and labour-intensive, and long-term moni- ‘‘Methodology’’ details the framework of the proposed
toring may prove impractical due to the need to store methodology. Following the overall procedure, section
massive amounts of data.26–30 ‘‘Design of the novel inspection system’’ shows the
Machine learning in computer vision demonstrated development of the novel inspection system, with the
rapidity and reliability for conducting image-based details of the hardware design and software processing
inspection of the concrete surface.8,31–36 Deep learning framework. Following the design of the proposed sys-
with robustness against noise disturbance has been tem, section ‘‘Crack detection and quantification’’
applied to accurately interpret images and sensing data explains the crack detection and quantification, describ-
for crack detection.8,33,37–41 It has demonstrated that ing the data set generation, deep-learning network used
automated damage detection, automated evaluation of to segment and detect cracks in 2D plane, proposed
the local damage and safety of the global structure, and quantification of concrete damage in 3D space coordi-
automated data collection using robots belong to the nates, and estimation of crack volume. Subsequently,
category of automate vision-based structural inspec- section ‘‘Three-dimensional scene reconstruction’’
tion.42 Vision-based inspection devices can be carried describes the approach to reconstruct the real concrete
by robotic vehicles that can be remotely controlled from structure. Finally, the experimental validation and error
a distance such as for accessing hard to reach places.43– analysis are conducted using a lab-scale RC column
46
Oftentimes, deep-learning techniques can benefit with real cracks in section ‘‘Application,’’ describing
from the inclusion of image pre-processing using estab- the inspection robot and experimental setup, the pro-
lished two-dimensional (2D) image processing methods posed crack assessment of the RC structure, error anal-
used traditionally for conventional damage detection ysis, and the validation results. A brief conclusion is
and quantification.47 Common processing methods detailed in the final section ‘‘Conclusion.’’
include edge characterization and threshold segmenta-
tion with advanced filtering.48 Chen et al.49 conducted
corrosion detection of steel structures using support Methodology
vector machines (SVMs). A convolutional neural net-
work (CNN)–based crack detection approach was pro- The data pipeline of this study is detailed in Figure 1.
posed by Cha et al.,50 and experimental results found The architecture of the proposed novel inspection sys-
that CNNs were able to effectively detect structural tem has three components, including data acquisition
damage. using stereo camera mounted on the inspection robot,
However, existing deep-learning–based damage 3D structural reconstruction, and concrete damage
detection approaches cannot automatically quantify quantification and projection. The main objective of
crack length, width, or volume, and on-site damage the inspection task is to automatically detect, localize,
assessment requires a large amount of contact sensing and quantify the crack volume in an RC structure. To
devices.6 In addition, since the integrated simultaneous accomplish the objective, a deep-learning–based con-
localization and mapping (SLAM) requires a large crete segmentation module is first proposed to localize
amount of computing power, existing three- and segment concrete damage. To acquire detailed
dimensional (3D) reconstruction and damage detection crack images and point cloud data of 3D representation
methods based on deep learning are not simulta- of the concrete crack, a novel infinite distance control
neous.51,52 Such an issue is compounded when assessing inspection robot was designed and tested. The inspec-
large-scale structures and infrastructures. Therefore, a tion robot is equipped with a monocular camera, a
new and non-contact structural health inspection sys- stereo camera, a Lidar sensor, an edge computing sys-
tem for reinforced concrete (RC) structures is proposed tem, and a quasi-real-time data transmission system.
in this article to overcome challenges regarding the RGB imaging data captured by the robot is routed to a
practical implementation of deep learning for damage deep-learning network to segment and localize cracks.
assessment. The proposed approach expands the quan- Through visual mapping, 3D convex hull instantiation,
tification of concrete damage from 2D plane to 3D and 3D scene reconstruction, the segmented cracks can
space. As a vehicle to deliver inspection, a novel inspec- be quantified.
tion robot with stereo camera was designed and utilized Following validation and testing of the deep-learn-
to perform quasi-real-time damage assessment of RC ing–based 3D damage volume detection model, the
structures. The onboard system is linked to quasi-real- model is placed on an online server to process image
time cloud computing and data visualization algo- data collected and uploaded by the inspection robot.
rithms using the Internet of things (IoT). The entire The deployed server serves two functions: (1)
790 et al.
Yuan Structural Health Monitoring 21(3)
3
Figure 1. Flow chart of the proposed 3D concrete damage volume quantification and damage projection.
implement spatial mapping and 3D damage assessment sensors to gather required inspection data, which is
and (2) visualize the 3D reconstruction model obtained called MSIL in this study, where M refers to a monocu-
through edge computing using the Jetson NX lar camera, S stands for a stereo camera, I represents
Developer Kit. Finally, damage projection onto the 3D the inertial measurement unit (IMU), and L is the light
point cloud surface was conducted to realize the dam- detection ranging (Lidar) sensor. The monocular cam-
age localization. era mounted at the forefront of the robot is used as the
visual sensor of the control board. The stereo camera
(i.e. ZED 2) developed by STEREOLABS with built-
Design of the novel inspection system in IMU, Barometer, and Magnetometer sensors is used
to conduct spatial object detection. The IMU provides
Design of the inspection robot
6-degree-of-freedom (6-DOF) information, in which
The designed mobile robot includes a control each image can be directly mapped for panoramic
terminal and edge computing terminal, both of which image stitching. Lidar sensor is employed to measure
serve as the basis of the inspection system, as shown in the distance between the RC structure and the camera.
Figure 2. The robot is designed to carry a variety of Consequently, the space and position between the 3D
Yuan
4 et al. 791
Figure 2. The components of the inspection robot (a) control terminal, (b) data acquisition terminal, and (c) robot internal
structure.
reconstructed structure with local crack and the acqui- image processing and manipulation of point cloud
sition system can be tagged under the cooperation data. Movement of the robot is accomplished
between these three types of sensors. Following the through four-wheel drive differential steering, which
damage segmentation extracted from a deep-learning– allows in situ rotation of the robot for scene recon-
based model, the obtained damage can be directly pro- struction. The STM-32 chip, which belongs to a
jected on to a global coordinate based on the MSIL. family of 32-bit microcontroller integrated circuits by
Since the Raspberry Pi enables control of electronic STMicroelectronics, was used to control robot move-
components for physical computing and interfacing ment. Jetson NX and the motion control board com-
through the IoT, the Raspberry Pi served as the main municate directly through a serial port. Commands for
control port with the remote-control system. A Jetson motions are transmitted to the Jetson NX with a rela-
NX (Nvidia) which was used to implement embedded tively low delay through a WebSocket protocol. The
edge computing for artificial intelligence (AI)–based Jetson NX encodes the instructions and sends them to
792 et al.
5
Figure 3. Pipeline of the data transmission and processing system.
the motion control board. In addition to unlimited dis- damage quantification, and 3D reconstruction. The
tance control, the robot can also be controlled using a first database consists primarily of RGB images of con-
manual remote controller. crete damage obtained from online open-source data-
bases for 2D object detection and segmentation. The
second database contained point cloud data derived
Data transmission and processing system from images acquired by the stereo camera, and is
The robustness and performance of the data acquisi- mainly used for spatial mapping, damage quantifica-
tion and transmission system is critical in object detection, and damage localization in 3D space.
tion due to potential issues and bottlenecks in real-time
data transmission and processing.53 Numerous studies Image acquisition for 3D concrete damage detection
used offline detection methods in which data are first
acquired and later exported to a computer for post- A total of 400 RGB images of cracks in concrete were
processing. This post-processing method reduces the collected from an open-source library to build up the
monitoring efficiency.43 To enhance the monitoring training data set, as shown in Figure 5(a). The stereo
efficiency, by improving the quasi-real-time processing camera was utilized as a depth sensor to prepare the
and transmission capabilities of data, this article testing data set for damage segmentation and quantifi-
proposes a partially automated monitoring method. cation, as shown in Figure 5(b). The collected RGB
images were then labelled manually using LabelImg for
Figure 3 illustrates the pipeline of the data transmission
use in the deep-learning framework. The stereo camera
and processing system. Following the data acquisition
used in this study is ZED 2, which allows remote
from the inspection robot using the MSIL, a 4G mod-
monitoring and recording of 1080p high-definition
ule sends the data to the cloud computing system. An
(HD) videos at 30 FPS or Wide Video Graphics Array
HTTP protocol transmits reconstructed point cloud
(WVGA) at 100 FPS. Wide-angle video depth can be
data and captured binocular images, and then pro-
captured with up to 120 field of view.
cessed by graphics processing unit (GPU) cluster to
complete the 3D damage detection and quantification.
Following the edge computing of the acquired data on Crack detection and segmentation
the Jetson NX, WebRTC and WebSocket protocols
Mask region convolutional neural network (R-CNN)54
provide low-latency transmission of images and traffic is a state-of-the-art method for object detection, and
of remote-control commands, respectively. To facilitate extends the target detection abilities of the Faster R-
client control and visualization, a specially designed CNN framework by adding an extra branch to achieve
web app that includes a graphical user interface (GUI) instance segmentation for each output proposal box
was installed on the laptop and a smartphone, as using a fully connected (FC) layer. The three stages of
shown in Figure 4. the framework are shown in Figure 6. First, feature
maps are extracted from the input images by the back-
bone network. Second, the generated feature maps are
Crack detection and quantification
fed to the region proposal network (RPN) to generate
Two types of image databases were set up for storing regions of interest (ROIs). Finally, the generated ROIs
images for object detection, damage segmentation, are mapped to extract the corresponding target features
Yuan
6 et al. 793
Figure 4. GUI of the IoT-enabled inspection system (a) PC platform and (b) smartphone platform.
in the shared feature maps. The ROIs are thereafter

output to the FC and the fully convolutional network
(FCN) for object detection and instance segmentation.
The corresponding bounding boxes and segmentation
masks are generated in this process.
As deeper networks can result in higher accuracy,
ResNet is used as the backbone network for feature
extraction. Image feature extraction is based on shared
convolution layers. Low-level features (i.e. edges and
angles) are extracted by the underlying network. High-
level features that represent target categories are Figure 5. Image acquisition for training of damage detection
extracted at the higher level. To better represent the model. (a) Open-source dataset; (b) Stereo vision dataset.
crack target on multiple scales, a feature pyramid net-
work (FPN) was employed to extend the backbone net-
work. The FPN improves feature extraction using a multiple objects across different scales. The ROI can be
second pyramid that routes high-level features from the generated by sliding the feature maps across the images
initial pyramid towards the lower layers. outputted from the anchors. The trained Mask R-
The RPN scans all FPN top-bottom features and CNN model is integrated into the IoT-based web server
proposes regions that potentially contain objects. using Tensorflow Lite for further damage segmenta-
Anchor boxes are rectangular regions defined to detect tion. To further evaluate the object detection
794 et al.
7
Figure 6. Pipeline of the Mask R-CNN for damage detection and segmentation.
Figure 7. Segmented results compared to actual images.
performance of the proposed model, the precision (Q) of instance segmentation, the mean intersection over
and recall (R) rates were used as follows55 union (MIU) rate is used to assess the performance of
image segmentation, and can be expressed as follows
TQ
Q= ð1aÞ
TQ + FQ 1 X k
pii
MIU = ð2Þ
TQ k + 1 i=0 P
k Pk
R= ð1bÞ pij + pji � pii
TQ + FN j=0 j=0
in which TQ, FQ, and FN, respectively, refer to the where k refers to the number of training classes, and pij
number of cases that are positive and detected as posi- refers to the number of pixels that belong to category i
tive, the number of cases that are negative but detected but is misjudged as category j. Thus, pii, pij, and pji,
positive, and the number of cases that are positive but respectively, refer to the number of pixels with right
detected negative. The testing results from 50 images classification, false positive, and false negative. The
showed that the general precision and recall rates were instance segmentation results from 50 testing images
80.10% and 81.98%, respectively. For the evaluation showed that the MIU rate for crack detection can reach
Yuan
8 et al. 795
distances outside an interval defined by the global mean

and standard deviation can be outliers and thus
trimmed from the data set. For each point cloud Pi
(i = 1, ..., n) in the data set, the average distance di(k)
can be computed with the consideration to K-nearest
neighbours (KNNs).56 By assuming that the average
distance to the KNN is normally distributed, the
standard-deviation multiplier (N) can be chosen based
upon the cumulative distribution function (CDF) from
the normal distribution.57
After going through point cloud denoising filters, the
projection from the 2D damage mask of the plane dam-
age segmentation to the 3D point cloud data is obtained
Figure 8. Damage object segmentation from 2D to 3D space.
as shown in Figure 8. The relationship between a point
in 3D Cartesian space [x, y, z] and its pixel coordinates
[u, v, d] in an image can be expressed as follows
80.97%, and such a rate is adequate for crack instance 8
segmentation. It is worth noting that this study only < u = xfx z + cx
applied instance segmentation instead of semantic seg- v = yfy z + cy ð3Þ
:
mentation because instance segmentation frameworks d = zs
yield an individual mask for each ROI so that individ-
ual instances can be processed separately. where d refers to the depth, fx and fy refer to the focal
Figure 7 shows the input images and the correspond- length of the camera on the x-and y-axis, respectively,
ing segmentation results using Mask R-CNN for vari- cx and cy refer to the centre of the aperture, and s repre-
ous concrete crack types (i.e. single crack, multiple sents the zoom factor of the depth map. Given the inter-
cracks, and concrete spalling). In addition, various nal parameters of a stereo camera, the spatial position
image resolutions, different background complexity, and pixel coordinates of each point can be described by
and different structural scenes are all involved in the the following matrix
data set. The segmentation results shown on the left 2 3 0 2 3 1
half of Figure 6 were derived from concrete columns u x
and slabs, and these images were collected by the stereo s4 v 5 = C @R4 y 5 + tA ð4Þ
camera. The right half shows the detection results of 1 z
the open-source database, including the spalling of C
floor, the cracks of pavement, and the shear cracks of where C refers to the parameter matrix consisting of fx,
RC beams. This precise detection results show that the fy, cx, and cy, R represents the rotation matrix, t refers
Mask R-CNN used in this study is capable of damage to the displacement vector, and s is the scaling factor of
detection and damage location segmentation in the 2D the data given in the depth map to the actual distance.
plane.
Estimation of crack volume
Quantification of concrete damage in 3D space Open3D, which is an open-source library, is utilized to
coordinates compute a 3D convex hull for volumetric quantifica-
Following segmentation with the Mask R-CNN, spatial tion, as shown in Figure 9. The 3D convex hull is com-
mapping from 2D to 3D space is conducted to quantify puted based on a finite set of points in space.58
the damage. Spatial mapping uses a single spatial map Mathematically, the convex hull of the point set P in
to describe real-world geometry. The map can be ini- the real vector space is defined as the smallest convex
tially sparse but continuously updated or can be set that contains P. The convex hull is defined as all
updated once after an entire area is mapped. However, convex combinations of the point set P, and the for-
the existing of noise and holes can remarkably impact mula can be expressed as follows
the accuracy of 3D reconstruction. To enhance the ( )
accuracy of damage quantification, a statistical outlier X
n X
n
removal (SOR) filter was employed in this study to pre- chð PÞ = ai pi jpi 2 P, ai ø 0, ai = 1, n = 1, 2, . . .
i=1 i=1
process the 3D point cloud. Assuming that the results
follow a Gaussian distribution, all points with mean ð5Þ
796 et al.
9
Figure 9. 3D concrete crack convex hull construction.
Figure 10. Steps for 3D scene reconstruction.
where a and n are the constant coefficients. Based on ZED camera captured a pair of images at 720p resolu-
Euler’s formula V 2 E + F = 2, where V refers to the tion and a frame rate of 60 FPS for 3D reconstruction
number of vertices, E refers to the number of edges, of structural surfaces. The ZED Fusion Application in
and F represents the number of faces.59 For any convex the Software Development Kit (SDK) package was
polytope P, if the body contains n vertices, then P will used in real time to simultaneously scan and register
not contain more than 3n 2 6 edges (E) and no more point clouds of an area. The 3D scene reconstruction
than 2n 2 4 faces (F). In this study, a random incre- process comprised of five main stages, including image
mental algorithm was employed to construct the con- acquisition, stereo calibration, stereo rectification,
vex hull of the point set P.60 stereo block matching, and 3D scene reconstruction, as
shown in Figure 10.
Stereo calibration is required to calibrate the cam-
3D scene reconstruction eras and obtain the required intrinsic and extrinsic
Figure 10 shows computation steps needed for stereo- parameters, which can be used to measure distances in
based 3D scene reconstruction for real structures. The length units rather than in pixels. These parameters are
Yuan
10 et al. 797
Figure 11. Visualization of stereo camera trajectory and the

corresponding 3D-pose estimation.
estimated in an offline calibration process.

Rectification, which ensures distortion removal and
stereo alignment, is an online process. Once the intrin- Figure 12. Experimental setup.
sic and extrinsic parameters are obtained, the corre-
spondence between the left and right images need to be
established, such as through a dense disparity map. As the horizontal FOV is remarkably larger than the
The dense disparity map can be computed based on a corresponding vertical FOV, the ZED camera was
method proposed by Geiger et al.61 The depth value zij rotated 90 horizontally to capture a larger vertical
can be obtained by the following equation area in one frame. Since the faster scanning speeds
caused the stereo camera to lose track of the object
b�f with drift errors, and the IMU can be affected by the
zij = ð6Þ
dij irregular or sudden movements, the movement of
inspection robot during scanning was limited to 0.5 m/
where dij is the disparity of pixel (i, j) in the image, b is s. The quaternion-based extended Kalman filter (EKF)
the stereo baseline, and f is the focal length of the was employed to estimate the dynamic pose of the
camera. stereo camera. During the movement of the camera,
To estimate the camera pose in the current frame, it the accelerometer of the IMU was used as the long-
is necessary to register the previous measurement value term reference for the static horizontal plane (i.e. the
based on the coordinates of the camera in the current vertical direction), and was fused with the angle change
frame. This study used the IMU to accurately estimate measured by the gyroscope to achieve a drift-free
the 6-DOF pose (i.e. x, y, z, roll, pitch, and yaw) of the dynamic tilt response. The visualization of stereo cam-
camera and consequently the pose of the inspection era trajectory and the corresponding 3D-pose estima-
robot. Most point-based 3D reconstruction methods tion are detailed in Figure 11.
map all valid pixels to the 3D space and project them
into a common coordinate system according to the esti-
mated camera pose. A clockwise scanning direction Application
was chosen for improved reconstruction accuracy. The
scanning path is directly impacted by the field of view Inspection robot and setup
of the stereo camera as well as the maximum range of Following the successful validation process, a post-
the device. The horizontal field of view (FOV) of ZED damage RC column recovered from a cyclic loading
2 is 61. Consequently, the vertical FOV can be test was selected to assess the performance of the pro-
obtained using the following equations posed inspection system. The prepared RC column has
a shear span of 1000 mm and a 200-mm square cross-
0:5 � H section. Figure 12 shows the field test setup on a scaled
FOVV = 2 arctan ð7Þ
fy structural sample. The image reconstruction process
was carried out around the entire RC column at a dis-
0:5 � W
FOVH = 2 arctan ð8Þ tance of 70 cm. To reconstruct more details of the
fx
entire column, the ZED camera was adjusted at differ-
where fx and fy refer to the focal length, and W and H ent viewing angles for data acquisition. A total of four
represent the width and the height of the image, respec- damage areas were visible as concrete cracks, all of
tively. The obtained vertical FOV is approximately 93. which occurred at the junction of the column and the
798 et al.
11
Figure 13. Isolated volume of the damage convex hull (a) Crack #1, (b) Crack #2, (c) Crack #3, (d) Crack #4, and (e) damage
convex hull of Crack #3 under different viewing angles.
base. The ROIs for crack reconstruction and quantifi- to the maximum length, width, and depth of a crack
cation were concentrated around this area. area. Due to the difficulty of quantifying the crack vol-
ume of concrete, only Crack #1 was manually mea-
sured and calculated for damage volume. The crack
Crack assessment of the RC column was divided into 10 small areas (from C1 to C10 in
Figure 13 shows the isolated damage regions from the Figure 12) to measure the length, width, and depth,
3D damage convex hull construction of the damaged respectively, and finally, the volume of these 10 small
RC column. The results demonstrate that the proposed areas was summed to obtain the total volume of the
method can accurately segment the contours of con- cracked region.
crete cracks and the 3D convex hull function can com-
pletely map the volume of the detected concrete
damage, as can be seen in Figure 13(e). Due to the irre- Error analysis
gularity of the cracks, it is worth noting that the size of To assess the accuracy and potential for verifying as-
concrete cracks, such as length, width, and depth refer built conditions, the dimensions of RC column are
Yuan
12 et al. 799
Table 1. Crack length, width, depth, and corresponding measurements (units: mm).
Crack ID # Measured Proposed Manual RE (%)

type method (mm) measurement
(mm)
#1 Maximum length 201.76 205.90 2.01

Maximum width 55.06 58.80 6.37
Maximum depth 27.00 29.70 9.09
Volume 104,965.65 (mm3) 120,000.00 (mm3) 12.53
#2 Maximum length 214.91 218.00 1.42
Maximum depth 18.00 20.20 10.89
Volume 7,554,144 (mm3) – –
#3 Maximum length 121.37 125.80 3.52
Maximum width 290.69 291.80 0.38
Volume 211,416.18 (mm3) – –
#4 Maximum length 189.69 192.10 1.25
Volume 105,629.01 (mm3) – –
ME 5.10
MSE 9.53
RE: relative error; ME: mean error; MSE: mean square error.
extracted from the 3D scene reconstruction model (MSE) is 9.53%, which is acceptable for most practical
manually and compared to manual survey dimensions. applications.
The error analysis has also been introduced by previous
studies.6,43 The length of the RC column and the maxi-
mum length, width, and depth of concrete damage were Conclusion
measured in the point cloud data using MeshLab. For This study presents a novel inspection system to quan-
manual measurement, a vernier calliper was used to tify the structural damage based on stereo vision and
verify the inspection accuracy of the proposed method. deep learning and can serve as a practical and effective
Table 1 lists the dimensions of the scaled model. Values solution for inspection of civil structures. An innova-
in Table 1 were obtained by measuring different positive, partially automated 3D segmentation and quanti-
tions of the cracked area. Since the crack length, width, fication method is proposed to extract damage
and depth varied greatly along the profile of the cracks, information and identify damage in building elements.
the multiple measurements in the interior of the crack The major contributions are given as follows:
and the largest measured model length, width, and
depth are reported for comparison. It is noted that the
(1) Intelligent inspection robots have the potential to
volume of the damage region was not measured due to
replace human inspectors in conducting on-site
the irregular shape of the concrete crack region.
civil structure inspection. The robots also can per-
form quasi-real-time assessment a disaster scene
before rescuers enter a possibly dangerous envi-
Validation results
ronment. Consequently, the efficiency of civil
This study demonstrated an automated method for 3D inspection work can be greatly improved, and the
identification of concrete cracks. The 360 panoramic safety of inspection tasks can be remarkably
model was reconstructed by the proposed system to enhanced.
quantify and localize damage regions. According to the (2) A wireless data transmission method is proposed
comparison of the measured and computed crack to achieve quasi-real-time image display and
dimensions of four damage regions listed in Table 1, online damage detection. The transmission latency
the relative errors (REs) between manual measure- of the reconstructed model after edge computing
ments and the proposed method are generally less than is lower than 5 s.
13%, and the corresponding mean error (ME), respec- (3) The proposed 3D damage segmentation and quan-
tively, for crack size is 5.10%. The mean square error tification model based on Mask R-CNN can
800 et al.
13
accurately localize cracks and quantify the seg- 4. Chupanit P and Roesler JR. Fracture energy approach to
mented damage volumes. characterize concrete crack surface roughness and shear
(4) An IoT-enabled data processing and visualization stiffness. J Mater Civil Eng 2008; 20: 275–282.
system is adopted to process the 3D damage seg- 5. Spencer BF Jr, Hoskere V and Narazaki Y. Advances in
mentation and quantification. Following data pro- computer vision-based civil infrastructure inspection and
cessing, the damage is projected into 3D space to monitoring. Engineering 2019; 5: 199–222.
6. Torok MM, Golparvar-Fard M and Kochersberger KB.
enable damage localization in the 3D reconstruc-
Image-based automated 3D crack detection for post-
tion model. Each module comes together to form disaster building assessment. J Comput Civil Eng 2014;
a comprehensive system that is ready for deploy- 28: A4014004.
ment in civil structural inspection. 7. Chang PC, Flatau A and Liu S. Health monitoring of
civil infrastructure. Struct Health Monit 2003; 2: 257–267.
However, limitations of the proposed method 8. Xu Y, Bao Y, Chen J, et al. Surface fatigue crack identifi-
become apparent during deployment. Concrete cracks cation in steel box girder of bridges by a deep fusion con-
smaller than 10 mm could not be detected because they volutional neural network based on consumer-grade
were not reconstructed in 3D. In addition, some of the camera images. Struct Health Monit 2019; 18: 653–674.
narrowest cracks cannot be detected when the stereo 9. Kharkovsky S, Giri P and Samali B. Non-contact inspec-
camera is located beyond a certain distance. tion of construction materials using 3-axis multifunc-
tional imaging system with microwave and laser sensing
Furthermore, this system is so far not fully automatic
techniques. IEEE Instrum Meas Mag 2016; 19: 6–12.
because the 3D reconstruction model obtained by the
10. Kang D and Cha YJ. Autonomous UAVs for structural
edge computing still requires human intervention for health monitoring using deep learning and an ultrasonic
uploading data to the cloud. beacon system with geo-tagging. Comput Aided Civil
Infrastruct Eng 2018; 33: 885–902.
Declaration of conflicting interests 11. Zhao J, Bao Y, Guan Z, et al. Video-based multiscale
identification approach for tower vibration of a cable-
The author(s) declared no potential conflicts of interest with
stayed bridge model under earthquake ground motions.
respect to the research, authorship, and/or publication of this
Struct Control Health Monit 2019; 26: e2314.
article.
12. Peng Z, Li J, Hao H, et al. High-resolution time-
frequency representation for instantaneous frequency
Funding identification by adaptive Duffing oscillator. Struct Con-
trol Health Monit 2020; 27: e2635.
The author(s) disclosed receipt of the following financial sup-
13. Fan G, Li J and Hao H. Vibration signal denoising for
port for the research, authorship, and/or publication of this
structural health monitoring by residual convolutional
article: This research is supported by the National Key
neural networks. Measurement 2020; 157: 107651.
Research Program of China (grant no. 2020YFC1512500),
14. Zhang J, Guo S and Zhang Q. Mobile impact testing for
the Science and Technology Commission of Shanghai
structural flexibility identification with only a single refer-
Municipality (grant no. 19DZ1201200), and the China
ence. Comput Aided Civil Infrastruct Eng 2015; 30:
National Science Foundation (grant no. 51978507).
703–714.
15. Tu J, Sui H, Feng W, et al. Detecting facade damage on
ORCID iD moderate damaged type from high-resolution oblique
Qingzhao Kong https://orcid.org/0000-0001-9577-4540 aerial images. IEEE J Select Topic Appl Earth Observ
Remote Sens 2017; 10: 5598–5607.
16. Sony S, Laventure S and Sadhu A. A literature review of
References next-generation smart sensing technology in structural
1. Yuan C, Chen W, Pham TM, et al. Finite element model- health monitoring. Struct Control Health Monit 2019; 26:
ling of dynamic bonding behaviours between fibre rein- e2321.
forced polymer sheet and concrete. Construct Build Mater 17. Kong Q, Fan S, Bai X, et al. A novel embeddable spheri-
2020; 255: 118939. cal smart aggregate for structural health monitoring: part
2. Noorsuhada M. An overview on fatigue damage assess- I. Fabrication and electrical characterization. Smart
ment of reinforced concrete structures with the aid of Mater Struct 2017; 26: 095050.
acoustic emission technique. Construct Build Mater 2016; 18. Zhang J, Guo S, Wu Z, et al. Structural identification
112: 424–439. and damage detection through long-gauge strain mea-
3. Narazaki Y, Hoskere V, Hoang TA, et al. Vision-based surements. Eng Struct 2015; 99: 173–183.
automated bridge component recognition with high-level 19. Lynch JP and Loh KJ. A summary review of wireless sen-
scene consistency. Comput Aided Civil Infrastruct Eng sors and sensor networks for structural health monitor-
2020; 35: 465–482. ing. Shock Vib Digest 2006; 38: 91–130.
Yuan
14 et al. 801
20. Friswell MI and Penny JE. Crack modeling for structural 36. Li S, Wei S, Bao Y, et al. Condition assessment of cables
health monitoring. Struct Health Monit 2002; 1: 139–148. by pattern recognition of vehicle-induced cable tension
21. Yoon H, Shin J and Spencer Jr BF. Structural displace- ratio. Eng Struct 2018; 155: 1–15.
ment measurement using an unmanned aerial system. 37. Koziarski M and Cyganek B. Image recognition with
Comput Aided Civil Infrastruct Eng 2018; 33: 183–192. deep neural networks in presence of noise–dealing with
22. Li S, Li H, Liu Y, et al. SMC structural health monitor- and taking advantage of distortions. Integr Comput Aided
ing benchmark problem using monitored data from an Eng 2017; 24: 337–349.
actual cable-stayed bridge. Struct Control Health Monit 38. Jang K, Kim N and An YK. Deep learning–based auton-
2014; 21: 156–172. omous concrete crack evaluation through hybrid image
23. Yuan C, Kong Q, Chen W, et al. Interfacial debonding scanning. Struct Health Monit 2019; 18: 1722–1737.
detection in externally bonded BFRP reinforced concrete 39. Saleem MR, Park J-W, Lee J-H, et al. Instant bridge
using stress wave-based sensing approach. Smart Mater visual inspection using an unmanned aerial vehicle by
Struct 2020; 29: 035039. image capturing and geo-tagging system and deep convo-
24. Kong Q, Robert RH, Silva P, et al. Cyclic crack monitor- lutional neural network. Struct Health Monit. Epub
ing of a reinforced concrete column under simulated ahead of print 1 July 2020. DOI: 10.1177/1475921720
pseudo-dynamic loading using piezoceramic-based smart 932384.
aggregates. Appl Sci 2016; 6: 341. 40. Lei B, Wang N, Xu P, et al. New crack detection method
25. Kong Q, Hou S, Ji Q, et al. Very early age concrete for bridge inspection using UAV incorporating image
hydration characterization monitoring using piezocera- processing. J Aerosp Eng 2018; 31: 04018058.
mic based smart aggregates. Smart Mater Struct 2013; 41. Bao Y and Li H. Machine learning paradigm for struc-
22: 085025. tural health monitoring. Struct Health Monit. Epub
26. Beckman GH, Polyzois D and Cha YJ. Deep learning- ahead of print 24 November 2020. DOI: 10.1177/147592
based automatic volumetric damage quantification using 1720972416.
depth camera. Autom Construct 2019; 99: 114–124. 42. Yeum CM, Dyke SJ, Ramirez J, et al. Big visual data
27. Sony S, Dunphy K, Sadhu A, et al. A systematic review analytics for damage classification in civil engineering. In:
of convolutional neural network-based structural condi- Transforming the future of infrastructure through smarter
tion assessment techniques. Engineering Structures 2021; information: proceedings of the international conference on
226: 111347. smart infrastructure and construction, Cambridge, 27–29
28. Wang F, Mobiny A, Van Nguyen H, et al. If structure June 2016, pp. 569–574. London: ICE Publishing.
can exclaim: a novel robotic-assisted percussion method 43. Jiang S and Zhang J. Real-time crack assessment using
for spatial bolt-ball joint looseness detection. Struct deep neural networks with wall-climbing unmanned aer-
Health Monit. Epub ahead of print 1 June 2020. DOI: ial system. Comput Aided Civil Infrastruct Eng 2020; 35:
10.1177/1475921720923147. 549–564.
29. Wang F and Song G. 1D-TICapsNet: an audio signal 44. Cha YJ, You K and Choi W. Vision-based detection of
processing algorithm for bolt early looseness detection. loosened bolts using the Hough transform and support
Struct Health Monit. Epub ahead of print 15 December vector machines. Autom Construct 2016; 71: 181–188.
2020. DOI: 10.1177/1475921720976989. 45. Omar T and Nehdi ML. Remote sensing of concrete
30. Yuan C, Zhang J, Chen L, et al. Timber moisture detec- bridge decks using unmanned aerial vehicle infrared ther-
tion using wavelet packet decomposition and convolu- mography. Autom Construct 2017; 83: 360–371.
tional neural network. Smart Mater Struct 2021; 30: 46. Lei B, Ren Y, Wang N, et al. Design of a new low-cost
035022. unmanned aerial vehicle and vision-based concrete crack
31. Xue Y and Li Y. A fast detection method via region- inspection method. Struct Health Monit 2020; 19: 1871–1883.
based fully convolutional neural networks for shield tun- 47. Wang C, Wang N, Ho S-C, et al. Design of a new vision-
nel lining defects. Comput Aided Civil Infrastruct Eng based method for the bolts looseness detection in flange
2018; 33: 638–654. connections. IEEE Trans Ind Electron 2019; 67:
32. Zalama E, Gómez Garcı́a Bermejo J, Medina R, et al. 1366–1375.
Road crack detection using visual features extracted by 48. Anand R and Kumar P. Flaw detection in radiographic
Gabor filters. Comput Aided Civil Infrastruct Eng 2014; weldment images using morphological watershed segmen-
29: 342–358. tation technique. NDT&E Int 2009; 42: 2–8.
33. Ni F, Zhang J and Chen Z. Pixel-level crack delineation 49. Chen PH, Shen HK, Lei CY, et al. Support-vector-
in images with convolutional feature fusion. Struct Con- machine-based method for automated steel bridge rust
trol Health Monit 2019; 26: e2286. assessment. Autom Construct 2012; 23: 9–19.
34. Eick BA, Narazaki Y, Smith MD, et al. Vision-based 50. Cha YJ, Choi W and Büyüköztürk O. Deep learning-
monitoring of post-tensioned diagonals on miter lock based crack damage detection using convolutional neural
gate. J Struct Eng 2020; 146: 04020209. networks. Comput Aided Civil Infrastruct Eng 2017; 32:
35. Bao Y, Chen Z, Wei S, et al. The state of the art of data 361–378.
science and engineering in structural health monitoring. 51. Li R, Liu J, Zhang L, et al. LIDAR/MEMS IMU inte-
Engineering 2019; 5: 234–242. grated navigation (SLAM) method for a small UAV in
802 et al.
15
indoor environments. In: 2014 DGON inertial sensors systems, San Diego, CA, 29 October–2 November 2007,
and systems (ISS), Karlsruhe, 16–17 September 2014, pp. 3191–3198. New York: IEEE.
pp. 1–15. New York: IEEE. 57. Carrilho A, Galo M and Santos R. Statistical outlier
52. Gomes G. Deep learning-based volumetric damage quan- detection method for airborne lidar data. In: Interna-
tification using an inexpensive depth camera, 2018, tional Archives of the Photogrammetry, Remote Sensing &
https://mspace.lib.umanitoba.ca/xmlui/handle/1993/33222 Spatial Information Sciences, Karlsruhe, 10–12 October
53. Ali MI, Ono N, Kaysar M, et al. Real-time data analytics 2018.
and event detection for IoT-enabled communication sys- 58. Barber CB, Dobkin DP and Huhdanpaa H. The Quic-
tems. J Web Semantic 2017; 42: 19–37. khull algorithm for convex hulls. ACM Trans Math Softw
54. He K, Gkioxari G, Dollár P, et al. Mask R-CNN. In: 1996; 22: 469–483.
Proceedings of the IEEE international conference on com- 59. Kabluchko Z, Last G and Zaporozhets D. Inclusion–
puter vision, Venice, 22–29 October 2017, pp. 2961–2969. exclusion principles for convex hulls and the Euler rela-
New York: IEEE. tion. Discrete Comput Geometry 2017; 58: 417–434.
55. Yu Y, Zhang K, Yang L, et al. Fruit detection for straw- 60. Chan TM. Output-sensitive results on convex hulls,
berry harvesting robot in non-structural environment extreme points, and related problems. Discrete Comput
based on Mask-RCNN. Comput Electr Agricult 2019; Geometry 1996; 16: 369–387.
163: 104846. 61. Geiger A, Roser M and Urtasun R. Efficient large-scale
56. Rusu RB, Blodow N, Marton Z, et al. Towards 3D object stereo matching. In: Asian conference on computer vision,
maps for autonomous household robots. In: 2007 IEEE/ Queenstown, New Zealand, 8–12 November 2010, pp.
RSJ international conference on intelligent robots and 25–38. New York: Springer.

A Novel Intelligent Inspection Robot With Deep Stereo Vision For Three-Dimensional Concrete Damage Detection and Quantification

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

A Novel Intelligent Inspection Robot With Deep Stereo Vision For Three-Dimensional Concrete Damage Detection and Quantification

Uploaded by

Copyright:

Available Formats

Original Article

Structural Health Monitoring

A novel intelligent inspection The Author(s) 2021

robot with deep stereo vision for sagepub.com/journals-permissions

Introduction Traditional facade damage detection relies on visual

Figure 3. Pipeline of the data transmission and processing system.

in the shared feature maps. The ROIs are thereafter

Figure 7. Segmented results compared to actual images.

distances outside an interval defined by the global mean

Figure 9. 3D concrete crack convex hull construction.

Figure 10. Steps for 3D scene reconstruction.

Figure 11. Visualization of stereo camera trajectory and the

estimated in an offline calibration process.

Crack ID # Measured Proposed Manual RE (%)

#1 Maximum length 201.76 205.90 2.01

You might also like

A Novel Intelligent Inspection Robot With Deep Stereo Vision For Three-Dimensional Concrete Damage Detection and Quantification

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

A Novel Intelligent Inspection Robot With Deep Stereo Vision For Three-Dimensional Concrete Damage Detection and Quantification

Uploaded by

Copyright:

Available Formats

Original Article

Structural Health Monitoring

A novel intelligent inspection  The Author(s) 2021

robot with deep stereo vision for sagepub.com/journals-permissions

Introduction Traditional facade damage detection relies on visual

Figure 3. Pipeline of the data transmission and processing system.

in the shared feature maps. The ROIs are thereafter

Figure 7. Segmented results compared to actual images.

distances outside an interval defined by the global mean

Figure 9. 3D concrete crack convex hull construction.

Figure 10. Steps for 3D scene reconstruction.

Figure 11. Visualization of stereo camera trajectory and the

estimated in an offline calibration process.

Crack ID # Measured Proposed Manual RE (%)

#1 Maximum length 201.76 205.90 2.01

You might also like

A novel intelligent inspection The Author(s) 2021