Keywords: Magnetic Particle Inspection; Crack segmentation; 3D reconstruction; CNN

With the rapid development of deep learning, target detection and segmentation in dirty backgrounds have become readily available. Fluorescent magnetic particle inspection (MPI) based on such technology is a promising alternative for automated crack defect inspection. Most previous studies in MPI have focused only on crack detection. Instead, we frame the task as a crack 3D localization problem, since cracks on the non-machined surfaces of metal parts need to be polished and re-inspected, which relies on their 3D positions. Although good results have been obtained in defect detection, it is still challenging to perform pixel-level segmentation of micro-cracks against a large background to obtain the crack 2D pixels needed for 3D reconstruction. This paper proposes a two-stage convolutional neural network (CNN) method for metal part crack defect detection and segmentation at the image-pixel level. The first stage detects and crops potential cracks to a small area, and the second stage learns the context of cracks within the detected patches. A window-based stereo matching method is then used to find matching crack pixels and to map crack image-plane points to 3D world points. We also describe the model deployment and signaling of the entire system used to apply these methods. Both computational and experimental results based on the system are presented for validation. The training precision of target detection reaches 96.3%, its average precision reaches 85.4%, and the average precision reaches 98.3% when the Intersection-over-Union (IoU) threshold is 0.5. The Dice score reaches 94% in pixel-level segmentation, and the average precision is 99.3% when the probability threshold is set to 0.5. The corresponding efficiencies reach 19 FPS and 18 FPS, and the mean absolute errors of the 3D coordinates of reconstructed crack defects are all within 1 mm in the X-, Y-, and Z-directions.
∗ Corresponding author at: School of Automotive Engineering, Wuhan University of Technology, Wuhan, 430070, China.
E-mail addresses: wuqiangyl@whut.edu.cn (Q. Wu), qxp915@hotmail.com (X. Qin), dkang0808@gmail.com (K. Dong), 2795687951@qq.com (A. Shi), 821759037@qq.com (Z. Hu).
1 Both authors contributed equally to the writing of the paper.
https://doi.org/10.1016/j.eswa.2022.118966
Received 19 April 2022; Received in revised form 16 September 2022; Accepted 1 October 2022; Available online 8 October 2022
0957-4174/© 2022 Published by Elsevier Ltd.
Q. Wu et al. Expert Systems With Applications 214 (2023) 118966
2. Related works
Fig. 2. Schematic diagram of the fluorescent MPI system, including the robot arm, cameras, detector, and host computer. The workstation receives working signals from the robot and camera and processes them through the deployed deep learning models. The tool, camera, and robot world coordinate systems are {𝐓}, {𝐂}, and {𝐖𝐂𝐒}, respectively.
Fig. 4. Scheme of the Scaled-YOLOv4 network, as proposed by Wang, Bochkovskiy, and Liao (2021), exemplified with three parts. The input is an image of height H and width
W. The box regression head predicts the pixel coordinates x1, y1, x2, y2 for anchor boxes.
& Farhadi, 2016) and its descendants (Bochkovskiy, Wang, & Liao, 2020; Redmon & Farhadi, 2017, 2018) in terms of detection speed and precision. It supports a higher input network size (resolution) while preserving inference speed, which is beneficial for detecting micro-cracks. Meanwhile, Scaled-YOLOv4 was the state of the art at the time we adopted it. Another important reason we chose it is that it provides a set of models (YOLOv4-Tiny, YOLOv4-CSP, YOLOv4-Large) for different GPUs (low-end and high-end). This makes it easy to deploy the model on different devices; YOLOv4-P5 is used in this paper.

As shown in Fig. 4, the network structure consists of three parts: the backbone for feature extraction, the neck for semantic representation of the extracted features, and the head for prediction. Images first pass through a convolutional layer with a kernel size of 3 × 3, a stride of 1, and 32 channels. Five CSPDark blocks follow. Each CSPDark block first down-samples by convolution with stride S = 2, so the down-sampling rates are 2, 4, 8, 16, and 32. After down-sampling, the feature map is divided into two parts with the same number of channels. Part 1 performs convolution using successive convolution layers with kernel sizes of 3 × 3 and 1 × 1 and is connected by a residual block. Part 2 performs no operation and is concatenated directly with the final output of Part 1. The numbers of residual blocks in the CSPDark blocks are 1, 3, 15, 15, and 7, and the numbers of channels are 64, 128, 256, 512, and 1024. A CSP-ized PAN (Path Aggregation Network) (Liu, Qi, Qin, Shi, & Jia, 2018) structure is then used at the neck of the network, and the bottom-up path is extended to make it easier for low-level information to propagate to the top level. In the top-to-bottom process, the CSP-ized Spatial Pyramid Pooling (SPP) (He, Zhang, Ren, & Sun, 2015) module is constructed from four parallel branches: max-pooling layers with kernel sizes of 5 × 5, 9 × 9, and 13 × 13, plus a skip connection. SPP extends the receptive field to fuse local and global features; it enriches the expressive ability of the feature map, which benefits the detection of targets with large scale differences. Up-sampling is then performed between two inverse CSP modules. From bottom to top, we down-sample by convolution with stride S = 2. To extract additional semantic features, the feature layers obtained from CSPDarknet53 are concatenated after convolution, then up-sampled and subsequently down-sampled, and stacked with the remaining feature layers to enhance the features. Three YOLO heads with sizes of 28 × 28, 56 × 56, and 112 × 112 fuse and interact with feature maps of different scales to detect objects of different sizes.

This paper addresses single-class crack detection, so the classification loss is not considered. The loss function used in this model consists of two parts: the object localization CIoU loss L_CIoU (Zheng et al., 2020) and the object confidence cross-entropy loss L_BCE, as shown in Eqs. (1)–(3), where λᵢ (i = 1, 2) are the balance coefficients.

Loss = λ₁ L_BCE + λ₂ L_CIoU,  (1)

L_CIoU = 1 − IoU(y, ŷ) + ρ²(y, ŷ)/c² + αν,  (2)

L_BCE = − Σᵢ Objᵢ [y log(ŷ) + (1 − y) log(1 − ŷ)].  (3)

In Eq. (2), IoU(y, ŷ) = |y ∩ ŷ|/|y ∪ ŷ| represents the Intersection over Union between the prediction box and the ground truth. ρ represents the Euclidean distance between the center points of the prediction box y and the target box ŷ, and c represents the diagonal length of the minimum closure region that contains both the prediction box and the target box. αν is the penalty for the aspect ratio: ν = (4/π²)(arctan(w^gt/h^gt) − arctan(w/h))² is a positive number, where (w^gt, h^gt) and (w, h) are the true and predicted width and height of the BBox, respectively, and α = ν/((1 − IoU) + ν) measures the consistency of the aspect ratio. In Eq. (3), Objᵢ indicates whether there is an object in predicted BBox i, taking the value 0 or 1.

3.2. Crack segmentation

The purpose of fluorescent MPI image segmentation is to obtain the pixel coordinates of cracks, which are used to calculate their corresponding 3D coordinates. The semantic information and structure of the patches cropped by the crack detection model are relatively simple. Therefore, to improve segmentation efficiency, reduce computation, and balance segmentation precision, the widely used U-Net (Ronneberger, Fischer, & Brox, 2015), an encoder–decoder symmetric network, is adopted. The structure of the model is shown in Fig. 5. It includes four down-sampling and four up-sampling steps. Skip connections are used at every stage to ensure that the feature maps integrate more low-level features. Up-sampling fuses features at different scales and further refines the edge information of the segmented images.

The image features are extracted by the encoding part, while the pixel-level segmentation is obtained by the decoding part. The model has five convolutional stages. Two convolution layers with 3 × 3 kernels are used in each stage, and the ReLU (Rectified Linear Unit) is used
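For readers implementing the detector loss, the CIoU term of Eq. (2) can be sketched in plain Python as follows. This is a minimal single-box illustration, not the paper's batched GPU implementation; the small epsilon added to α's denominator is our own guard against division by zero.

```python
import math

def ciou_loss(box_p, box_g):
    """CIoU loss between a predicted and a ground-truth box, each given
    as (x1, y1, x2, y2). Follows Eq. (2): 1 - IoU + rho^2/c^2 + alpha*nu."""
    # Intersection area
    ix1, iy1 = max(box_p[0], box_g[0]), max(box_p[1], box_g[1])
    ix2, iy2 = min(box_p[2], box_g[2]), min(box_p[3], box_g[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    wp, hp = box_p[2] - box_p[0], box_p[3] - box_p[1]
    wg, hg = box_g[2] - box_g[0], box_g[3] - box_g[1]
    union = wp * hp + wg * hg - inter
    iou = inter / union

    # rho^2: squared distance between the two box centers
    cxp, cyp = (box_p[0] + box_p[2]) / 2, (box_p[1] + box_p[3]) / 2
    cxg, cyg = (box_g[0] + box_g[2]) / 2, (box_g[1] + box_g[3]) / 2
    rho2 = (cxp - cxg) ** 2 + (cyp - cyg) ** 2

    # c^2: squared diagonal of the minimum enclosing box
    ex1, ey1 = min(box_p[0], box_g[0]), min(box_p[1], box_g[1])
    ex2, ey2 = max(box_p[2], box_g[2]), max(box_p[3], box_g[3])
    c2 = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2

    # Aspect-ratio penalty: nu = (4/pi^2)(arctan(wg/hg) - arctan(wp/hp))^2
    nu = (4 / math.pi ** 2) * (math.atan(wg / hg) - math.atan(wp / hp)) ** 2
    alpha = nu / ((1 - iou) + nu + 1e-9)  # epsilon avoids 0/0 for perfect boxes
    return 1 - iou + rho2 / c2 + alpha * nu
```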
Table 1
Comparison of different target detection methods.
Method         Size  FPS  AP     AP50   AP75   AP_S   AP_M   AP_L
SSD            512   67   51.7%  85.1%  61.1%  38.6%  61.5%  60.4%
Faster R-CNN   512   45   57.6%  90.5%  64.0%  46.9%  61.5%  70.7%
Faster R-CNN   896   31   69.9%  94.4%  79.6%  46.2%  70.9%  76.7%
Scaled-YOLOv4  512   77   79.5%  97.3%  87.4%  61.0%  75.7%  91.0%
Scaled-YOLOv4  896   38   85.4%  98.3%  93.5%  72.1%  82.6%  86.8%
Fig. 9. Precision and loss during training for Scaled-YOLOv4 with and without pre-trained weights.
Fig. 10. P–R (Precision–Recall) curves for different methods and image sizes. From left to right: IoU thresholds of 0.5, 0.75, and 0.90, respectively.
Fig. 11. Example results of the Scaled-YOLOv4 detection experiment on images with different types of cracks.
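The average precision reported with the P–R curves can be illustrated with a minimal sketch. This version ranks detections by confidence and integrates precision over recall; the IoU-based matching of detections to ground truth and the interpolation conventions used for Table 1 are omitted here.

```python
import numpy as np

def average_precision(scores, labels):
    """Average precision as the area under the P-R curve, from detection
    confidence scores and 0/1 ground-truth labels (1 = true positive)."""
    order = np.argsort(-np.asarray(scores))          # sort by confidence, descending
    tp = np.asarray(labels, dtype=float)[order]
    cum_tp = np.cumsum(tp)
    precision = cum_tp / (np.arange(len(tp)) + 1)    # precision after each detection
    recall = cum_tp / max(tp.sum(), 1)               # recall after each detection
    ap, prev_r = 0.0, 0.0
    for p, r in zip(precision, recall):              # step-wise integration over recall
        ap += p * (r - prev_r)
        prev_r = r
    return ap
```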
outperforms Faster R-CNN and SSD by about 20%. In terms of inference speed, Scaled-YOLOv4 also leads, running at over 77 frames per second. At the same time, the image size has a considerable impact on the precision of these methods.

The crack detection results of Scaled-YOLOv4 at different scales are shown in Fig. 11(a), (b), and (c), respectively. The crack defect detection in Fig. 11(a)–(b) is challenging because of the small size and low contrast of the cracks. The precision of searching for these non-salient targets reaches 85%, indicating that the model solves the detection task of small targets well. The model also shows good detection results when the length of the surface cracks, as shown in Fig. 11(c), is much larger than the others. Fig. 11(d) shows a case where there are multiple cracks on the surface of a part. In Fig. 11(e), more than two lateral cracks appear to emanate from one point, and a median crack extends from the same point towards the surface. At the ends of the lateral branch cracks, the visual crack features are significantly weakened and discontinuous. The model does not perform satisfactorily in such cases: it accurately identifies the well-characterized crack trunk, but the end branches are missed. Nevertheless, this deficiency can be compensated in the cropping stage. Overall, the model has high precision and good robustness, giving good predictions on both the test and training datasets. The model adapts well to cracks of different scales and shapes and has high recognition precision for cracks with obscure features.

Fig. 12. Example of detection failure when image size is 512 × 512. The red box is the ground truth, and the green box is the detection result. Top row: SSD. Middle row: Faster R-CNN. Bottom row: Scaled-YOLOv4.

Fig. 12 shows some false detection examples at an image size of 512 × 512. The red boxes are the ground truth, and the green boxes are the detection results. From top to bottom: SSD, Faster R-CNN, and Scaled-YOLOv4. For SSD and Faster R-CNN, the first three columns from left to right are examples of false alarms, while the last two columns are examples of missed detections. In the false-alarm cases, SSD and Faster R-CNN are prone to mistaking color-sharp edges for cracks. In the missed-detection examples, the crack visual features in the other images are weakened during image down-sampling, which prevents the models from detecting them effectively. In contrast, Scaled-YOLOv4 has few failed detection cases, shown in the third row. Except for the fourth example, each of the other failures occurred only once. Again, the first two examples mis-detect crack-like regions. In the third and fourth examples, the model splits a "single" crack into "multiple" targets; the fourth case in particular appeared many times in the tests. In the fifth example, the crack is not detected, possibly due to subtle image-enhancement differences. In general, a small image resolution accelerates the convergence and inference speed of the model, but improper image sampling may lower precision. Without significantly affecting the inference speed, this paper adopts an image size of 896 × 896 to ensure more stable detection precision.

4.2. Segmentation

Dataset Crack detection is performed with Scaled-YOLOv4 on the training, validation, and test sets used in the target detection stage. The corresponding target regions are cropped and filtered at a size of 512 × 512, yielding a total of 4050 images while maintaining an 8:1:1 ratio. Using the same image enhancement methods and parameters as for target detection (flipud, fliplr, random rotation, scale, brightness enhancement and color-space inversion, Gaussian blur), the dataset is doubled, yielding a total of 8100 images.

Implementation The model is trained from scratch for 300 epochs. The batch size is set to 6, and the initial learning rate is 0.001. As shown in Eqs. (4)–(6), the balance coefficients λ₁ and λ₂ for the binary cross-entropy loss L_BCE and the Dice loss L_Dice are both set to 1; ŷ is the ground truth and y is the predicted value.

The Dice coefficient score between the predicted image and the target image is used as the gold-standard precision indicator and is defined as

AC_Dice = 2 |y ∩ ŷ| / (|y| + |ŷ|).  (16)

In the segmentation phase, a positive sample is a crack pixel, and a negative sample is a background pixel.

Performance Fig. 13 shows, from left to right, the training precision, the training loss, and the P–R curves of several common segmentation methods, including SegNet (Badrinarayanan, Kendall, & Cipolla, 2017) and FCN (Long, Shelhamer, & Darrell, 2015). All of them show good performance and reach convergence within 50 epochs with a precision of over 80%. Among them, U-Net achieves the best results at an image size of 512 × 512, with a Dice precision of 93.8%. The training results of U-Net at two image sizes, 512 × 512 (U-Net:512) and 1024 × 1024 (U-Net:1024), are also compared. When the crop size changes from 512 to 1024, the U-Net precision drops by about 3.8%, and convergence is relatively slow. The main reason may be that the larger background brings more negative samples. In the P–R curves, U-Net:512 holds the curve closest to the upper-right corner of the chart and achieves the best precision and recall values. Table 2 shows that U-Net:512 obtains the highest average precision, reaching 99.3%. At the same time, the model maintains great advantages in terms of FPS, parameter count, and computation. Its 7.76 M parameters are 1/4 of SegNet's (29.44 M) and 1/17 of FCN's (134.27 M), and its MAC (Multiply–Accumulate) operations
Fig. 13. The accuracy and loss of different segmentation methods during training and their corresponding P–R curves. From left to right: Precision, Loss and P–R curve.
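The Dice score of Eq. (16) used as the precision indicator can be sketched for binary masks as follows. This is a minimal NumPy version; how the network output is binarized (e.g. thresholding probabilities at 0.5) is an assumption on our part.

```python
import numpy as np

def dice_score(pred, gt, eps=1e-7):
    """Dice coefficient of two binary masks, Eq. (16):
    2 * |y ∩ ŷ| / (|y| + |ŷ|). `eps` guards against empty masks."""
    pred = np.asarray(pred).astype(bool)
    gt = np.asarray(gt).astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum() + eps)
```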
Table 2
Comparison of different segmentation methods.

Method  Size  FPS  AP     Dice   Params    MACs
U-Net   512   101  99.3%  93.8%  7.76 M    54.96 G
U-Net   1024  25   97.2%  90.2%  7.76 M    219.84 G
SegNet  512   38   99.0%  91.8%  29.44 M   160.56 G
FCN     512   31   97.5%  89.0%  134.27 M  190.36 G

Table 3
Binocular vision system calibration result.

Calibration    f/mm  k       Sx      Sy      Cx      Cy
  L            16.6  −141.6  5.5e−6  5.5e−6  1024.9  1032.2
  R            17.0  −106.5  5.5e−6  5.5e−6  1048.7  1043.9
  𝐓            X/mm: 69.1  Y/mm: −3.3e−6  Z/mm: 1.6  Rx/°: 0.06  Ry/°: 359.3  Rz/°: 359.8

Rectification  f/mm  k  Sx      Sy      Cx      Cy
  L            16.8  0  5.5e−6  5.5e−6  1113.3  1059.4
  R            16.8  0  5.5e−6  5.5e−6  1172.5  1059.4
  𝐓            X/mm: 69.1  Y/mm: 0  Z/mm: 0  Rx/°: 0  Ry/°: 0  Rz/°: 0

Eye-in-hand 𝐇  X/mm: −103.3  Y/mm: 49.5  Z/mm: 85.7  Rx/°: 358.6  Ry/°: 0.68  Rz/°: 256.3

does not exceed 1/3 of those of the other methods (SegNet: 160.56 G, FCN: 190.36 G). The small numbers of parameters and computations also enable U-Net to achieve a higher inference speed of 101 FPS.

In Fig. 14, some typical segmentation results of these methods are compared. From left to right are the cropped patches,
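Eq. (10), which maps image points to 3D, is not reproduced in this excerpt, but for a rectified stereo pair the standard relations Z = f·B/d, X = (u − cx)·Z/f, and Y = (v − cy)·Z/f apply. The sketch below plugs in the rectified left-camera values from Table 3, assuming the pixel sizes Sx, Sy are given in meters (5.5 µm) and the baseline is the 69.1 mm translation:

```python
def stereo_to_3d(u, v, disparity, f_mm=16.8, pixel_size_mm=5.5e-3,
                 baseline_mm=69.1, cx=1113.3, cy=1059.4):
    """Map a rectified left-image pixel (u, v) with disparity d (pixels)
    to camera-frame 3D coordinates in mm, using textbook rectified-stereo
    triangulation. Default parameters are the Table 3 rectified values
    (unit assumptions are ours)."""
    f_px = f_mm / pixel_size_mm          # focal length in pixels (~3054.5)
    z = f_px * baseline_mm / disparity   # depth from disparity
    x = (u - cx) * z / f_px
    y = (v - cy) * z / f_px
    return x, y, z
```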
Fig. 14. Comparison of results obtained by different methods on five sample images (From top to bottom).
Fig. 15. U-Net segmentation samples at image crop size of 1024 × 1024.
Fig. 16. Examples of stereo matching for depth estimation. Top row: Original left image. Second row: Disparity map. Bottom row: Point cloud.
Fig. 17. Crack defects 3D coordinates in the camera coordinate system and the world coordinate system.
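The window-based stereo matching behind the disparity maps in Fig. 16 can be illustrated with a minimal sum-of-absolute-differences (SAD) block matcher. The window size, disparity range, and cost function here are illustrative assumptions; the paper matches only segmented crack pixels, which is far cheaper than this dense sketch.

```python
import numpy as np

def sad_block_match(left, right, max_disp=16, win=3):
    """Dense window-based stereo matcher on grayscale images.
    For each left-image pixel, pick the disparity d minimizing the SAD
    cost over a (2*win+1)^2 window against the right image at x - d."""
    h, w = left.shape
    left = left.astype(np.float32)
    right = right.astype(np.float32)
    disp = np.zeros((h, w), dtype=np.int32)
    for y in range(win, h - win):
        for x in range(win + max_disp, w - win):
            patch = left[y - win:y + win + 1, x - win:x + win + 1]
            costs = [np.abs(patch - right[y - win:y + win + 1,
                                          x - d - win:x - d + win + 1]).sum()
                     for d in range(max_disp)]
            disp[y, x] = int(np.argmin(costs))
    return disp
```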
Coordinate Transformation The coordinates in the camera coordinate system often need to be transformed into other coordinate systems, such as the robot world coordinate system. The coordinate transformation in this paper has two main purposes: one is subsequent robot marking, and the other is reconstruction-error analysis. The crack defect point cloud in the camera coordinate system is shown in Fig. 17(a); its coordinates transformed by Eq. (10) into the robot coordinate system are shown in Fig. 17(b).

Ground Truth It is often difficult to obtain actual 3D data and align it to the corresponding image by vision systems or other 3D techniques for error analysis. Since only the 3D coordinates of the crack are required, it is possible to obtain the 3D coordinates of some points on the surface crack of the part as the real values through the robot tool. Combined with the coordinate transformation, the reconstructed 3D coordinates of the crack, as well as the ground truth in the robotic coordinate system, can be obtained. The main problem is that the actual values introduce robotic system errors, such as TCP (Tool Center Point) calibration errors and manual manipulation errors. The second problem is that the number of reconstructed 3D points is much larger than the number of acquisitions, so the points corresponding to the actual values must be found in the point cloud. We pick out the corresponding points by Euclidean distance minimization, which is an imprecise but reasonable approach assuming the two objects are close enough. Finally, we selected a total of 150 points across 5 groups of experiments.

Error Analysis Under the previous assumptions, the errors of the five sets of data along the X-, Y-, and Z-axes were calculated by Eq. (11), as shown in Fig. 18 (from left to right). The evaluation metrics in Eqs. (11)–(14) were analyzed, and the results are shown in Table 4. The mean absolute errors on the X-, Y-, and Z-axes are 1.67, 1.25 and
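The Euclidean-distance pairing of robot-probed ground-truth points with the reconstructed cloud can be sketched as a brute-force nearest-neighbor search (a KD-tree would scale better for large clouds; the function name is our own):

```python
import numpy as np

def nearest_points(cloud, probes):
    """For each probed ground-truth point, return the closest reconstructed
    point by Euclidean distance minimization, plus its index in the cloud."""
    cloud = np.asarray(cloud, dtype=float)     # (N, 3) reconstructed points
    probes = np.asarray(probes, dtype=float)   # (M, 3) ground-truth points
    # (M, N) pairwise distance matrix via broadcasting
    d = np.linalg.norm(probes[:, None, :] - cloud[None, :, :], axis=2)
    idx = d.argmin(axis=1)
    return cloud[idx], idx
```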
Fig. 18. 3D reconstruction errors. From left to right: X-, Y-, and Z-axis errors, respectively.
Table 4
The statistical results of crack defect 3D reconstruction.

            MAE (mm)          RMSE (mm)         Percentage (<1 mm)    Percentage (<2 mm)    Percentage (<3 mm)
            x     y     z     x     y     z     x      y      z       x      y      z       x       y      z
Calculated  1.67  1.25  1.19  1.95  1.43  1.35  30.8%  40.3%  40.9%   55.7%  87.9%  91.9%   95.3%   99.3%  99.3%
Corrected   0.90  0.61  0.62  1.03  0.79  0.84  53.0%  81.9%  87.2%   99.3%  99.3%  98.0%   100.0%  99.3%  98.0%
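The "Calculated" and "Corrected" rows of Table 4 can be reproduced per axis from the signed errors. One assumption in this sketch: the horizontal (constant) least-squares fit of the errors equals their mean, which serves as the intercept that is subtracted.

```python
import numpy as np

def error_stats(err):
    """MAE, RMSE, and intercept-corrected MAE for one axis of signed
    reconstruction errors, mirroring Table 4's two rows."""
    err = np.asarray(err, dtype=float)
    mae = np.abs(err).mean()
    rmse = np.sqrt((err ** 2).mean())
    corrected = err - err.mean()       # subtract the fitted horizontal intercept
    return mae, rmse, np.abs(corrected).mean()
```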
1.19 mm, respectively. The root mean square errors are 1.95, 1.43 and 1.35 mm, respectively. The percentages of points with errors less than 1 mm are 30.8%, 40.3%, and 40.9%, respectively, and the errors of more than 95% of the points are within 3 mm. Because the calibration errors of the robot often show directivity at the macro level, the errors are fitted horizontally, and the intercepts of the yellow horizontal lines in Fig. 18 are −1.65, −1.19 and −1.06 mm, respectively. The mean absolute errors corrected by these intercepts are 0.90, 0.61, and 0.62 mm on the X-, Y-, and Z-axes, respectively. More than 50% of the points are within 1 mm, and more than 98% are within 2 mm.

4.4. Model deployment

Algorithm 1: Crack detection and 3D localization
Input: Image pair I = (I_L, I_R); camera parameters K; hand–eye relationship H^cam_tool; robot positions H^tool_wcs ← {H¹, …, Hⁿ}
Output: A list P of surface crack defect 3D coordinates
1   Initialization: camera TCP; robot socket; models
2   P ← { };  ⊳ list of surface crack defect 3D coordinates
3   for i ∈ {1, 2, …, n} do
4       Model ← {I_L, I_R} ← camera ← robot socket;
5       {D_L⁽¹⁾, …, D_L⁽ˢ⁾}, {D_R⁽¹⁾, …, D_R⁽ᵗ⁾} ← Detection(I_L, I_R);
6       if size of d ∈ {D_L, D_R} > σ then
7           End;
8       else if d ∈ {D_L, D_R} is empty or d ∉ I_L ∩ I_R then
9           Continue;
10      else
11          {R_L⁽¹⁾, …, R_L⁽ˢ⁾}, {R_R⁽¹⁾, …, R_R⁽ᵗ⁾} ← Crop(Iⁱ, Dⁱ);
12          {S_L⁽¹⁾, …, S_L⁽ˢ⁾}, {S_R⁽¹⁾, …, S_R⁽ᵗ⁾} ← Segment(R_L, R_R);
13          {I_L^S, I_R^S} ← Region-mapping(S_L, S_R);
14          I_dis ← Stereo-matching(I_L^S, I_R^S);
15          P_cam ← 3D-coordinates-in-camera-frame(I_dis, K);
16          P_world ← (P_cam, H^tool_wcs, H^cam_tool);
17          P ← P ∪ P_world;
18      end
19  end

Table 5
Efficiency and GPU memory consumption.

Steps            Size         Run-time  GPU Mem
Detection        896 × 896    53 ms     3.2 GB
Segmentation     512 × 512    55 ms     1.4 GB
Stereo matching  2046 × 2040  105 ms    –

The models are deployed using the NVIDIA Triton Inference Server, an open-source inference service that can deploy models from all popular frameworks, including TensorFlow and PyTorch. Triton maximizes performance and reduces end-to-end latency by running multiple models simultaneously on GPUs. The system communication and processing pseudo-code are shown in Algorithm 1. Its inputs are the image pair I = (I_L, I_R), the camera parameters K, the hand–eye relationship H^cam_tool, and the robot positions H^tool_wcs ← {H¹, …, Hⁿ}; its output is a list P of surface crack defect 3D coordinates. First, the camera is initialized, and communication is established with the robot through Socket/TCP. After receiving the client's movement command, the robot drives the camera to the preset position and feeds a signal back to the client. The cameras then collect images at this position and perform detection through the model deployed on the server side. Here we use parallel processing and image queues. The detected
crack areas {D_L, D_R} are determined by two conditions, and if the conditions are satisfied, operations such as cropping, segmentation, and reconstruction continue to calculate the 3D coordinates of the crack. When done, the robot is asked to move to the next position H^tool_wcs. In between, a PLC drives the rotary mechanism to rotate the parts, and the next round of photos is collected cyclically.

The GPU memory consumption and run time of the models deployed on the workstation are shown in Table 5. The image precision is 32-bit float. In detection, the image is down-sampled to 896 × 896, the average GPU consumption is 3.2 GB, and the average inference time for a single image is 53 ms. The detection results are displayed on the original images with a size of 2046 × 2040. In the image segmentation stage, the image size is 512 × 512, the average GPU consumption is 1.4 GB, and the average inference time for a single image is 55 ms. In the stereo matching stage, the image size is 2046 × 2040, and the computation takes 105 ms on the CPU. These times cover the entire operation process, including pre-processing and network delays. Counting both images of a pair, the total model time is about 320 ms. It should be noted that not all images require segmentation and stereo reconstruction. On the other hand, this paper uses a robot to drive the camera, and the magnetic particle flaw detector stops every 90 degrees to take 3 groups of photos each time, for a total of 12 groups of photos. The rotational speed of the mechanism is 180 degrees per second. Thanks to parallel
computing and image queues, the average detection time of a single Biederer, S., Knopp, T., Sattel, T. F., Lüdtke-Buzug, K., Gleich, B., Weizenecker, J.,
part is within 5 s (excluding the processes of loading and unloading, et al. (2009). Magnetization response spectroscopy of superparamagnetic nanopar-
ticles for magnetic particle imaging. Journal of Physics D: Applied Physics, [ISSN:
drenching and magnetizing, etc.) for the currently measured forming
0022-3727] 42(20), Article 205007. http://dx.doi.org/10.1088/0022-3727/42/20/
parts shown in Fig. 16. 205007.
Bochkovskiy, A., Wang, C., & Liao, H. M. (2020). Yolov4: Optimal speed and accuracy
5. Conclusion of object detection. http://dx.doi.org/10.48550/arXiv.2004.10934, arXiv preprint
arXiv:2004.10934.
British Standards Institution (1999). Open die steel forgings for general engineering
This paper developed a automated fluorescent MPI framework for purposes-part 1: General requirements. https://dlscrib.com/download/bs-10250-4-
simultaneous crack defect detection and its 3D localization. First, a 2000_59b98ded08bbc5bc27894d06_pdf.
Cheng, X., & Yu, J. (2020). Retinanet with difference channel attention and adaptively
two-stage model is used to obtain crack defect pixel coordinates. The
spatial feature fusion for steel surface defect detection. IEEE Transactions on
first stage operates on the image and removes the noisy background Instrumentation and Measurement, 70, 1–11. http://dx.doi.org/10.1109/TIM.2020.
area. The second stage is used to classify all pixels in the crack re- 3040485.
gion localized in the first stage. Then, the images are rectified by Chin, R. T., & Harlow, C. A. (1982). Automated visual inspection: A survey. IEEE
Transactions on Pattern Analysis and Machine Intelligence, PAMI-4(6), 557–573. http:
the vision system parameters, and the stereo disparity is estimated
//dx.doi.org/10.1109/TPAMI.1982.4767309.
by matching the crack defect pixels in the left and right images to Choi, D. c., Jeon, Y. J., Kim, S. H., Moon, S., Yun, J. P., & Kim, S. W. (2017).
restore 3D depth maps. Next, the model deployment, communication Detection of pinholes in steel slabs using Gabor filter combination and morpho-
and efficiency analysis of the whole system are completed. Calculations logical features. Isij International, 57(6), 1045–1053. http://dx.doi.org/10.2355/
and experiments based on this system are designed. High crack defect isijinternational.ISIJINT-2016-160.
Eisenmann, D. J., Enyart, D., Lo, C., & Brasche, L. (2015). Review of progress in
detection, segmentation precision, and low crack spatial position error magnetic particle inspection. AIP Conference Proceedings, 1581(1), 1505. http://
are obtained. dx.doi.org/10.1063/1.4865001.
In future work, the convenience, efficiency and economy between Fu, G., Sun, P., Zhu, W., Yang, J., Cao, Y., Yang, M. Y., et al. (2019). A deep-learning-
multiple fixed cameras and manipulators will be measured by opti- based approach for fast and robust steel surface defects classification. Optics and
Lasers in Engineering, 121, 397–405. http://dx.doi.org/10.1016/j.optlaseng.2019.05.
mizing the viewpoints of cameras, and the most suitable arrangement 005.
scheme can be selected according to the actual needs. We also plan Hartley, R. I. (1999). Theory and practice of projective rectification. Interna-
to analyze the specificity of fluorescence images and crack defects and tional Journal of Computer Vision, 35(2), 115–127. http://dx.doi.org/10.1023/A:
continuously optimize the structure and size of the model. 1008115206617.
He, Y., Song, K., Meng, Q., & Yan, Y. (2019). An end-to-end steel surface defect
detection approach via fusing multiple hierarchical features. IEEE Transactions
CRediT authorship contribution statement on Instrumentation and Measurement, 69(4), 1493–1504. http://dx.doi.org/10.1109/
TIM.2019.2915404.
He, K., Zhang, X., Ren, S., & Sun, J. (2015). Spatial pyramid pooling in deep
Qiang Wu: Designed the study, Contributed to analysis and convolutional networks for visual recognition. IEEE Transactions on Pattern Anal-
manuscript preparation. Xunpen Qin: Conception of the study. Kang ysis and Machine Intelligence, 37(9), 1904–1916. http://dx.doi.org/10.1109/TPAMI.
Dong: Data analysis and experimental platform. Aixian Shi: Data 2015.2389824.
analysis and experimental platform. Zeqi Hu: Data analysis and He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image
recognition. In Proceedings of the IEEE conference on computer vision and pattern
experimental platform.
recognition (pp. 770–778). http://dx.doi.org/10.48550/arXiv.1512.03385.
Hirschmuller, H. (2007). Stereo processing by semiglobal matching and mutual informa-
Declaration of competing interest tion. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(2), 328–341.
http://dx.doi.org/10.1109/TPAMI.2007.1166.
Honarvar, F., & Varvani-Farahani, A. (2020). A review of ultrasonic testing applications
The authors declare that they have no known competing finan- in additive manufacturing: Defect evaluation, material characterization, and process
cial interests or personal relationships that could have appeared to control. Ultrasonics, 108, Article 106227.
influence the work reported in this paper. International Association of Classification Societies (2021). Guidelines for non-
destructive testing of hull and machinery steel forgings no.68. https://iacs.org.uk/
download/1855.
Data availability Kanade, T., & Okutomi, M. (1994). A stereo matching algorithm with an adaptive
window: Theory and experiment. IEEE Transactions on Pattern Analysis and Machine
Intelligence, 16(9), 920–932. http://dx.doi.org/10.1109/34.310690.
Data availability

No data was used for the research described in the article.

Acknowledgments

The authors would like to thank all the Hubei Key Laboratory of Advanced Technology for Automotive Components staff for supporting this work.

The work was supported by the Major Project of Technological Innovation in Hubei Province (2020BED010) and the China Postdoctoral Science Foundation (2020M682498).

References

Abend, K. (1999). Fully automated dye-penetrant inspection of automotive parts. Computer Standards & Interfaces, 2(21), 157. http://dx.doi.org/10.1016/S0920-5489(99)92144-X.

Ali, R., & Cha, Y. (2019). Subsurface damage detection of a steel bridge using deep learning and uncooled micro-bolometer. Construction and Building Materials, 226, 376–387. http://dx.doi.org/10.1016/j.conbuildmat.2019.07.293.

Badrinarayanan, V., Kendall, A., & Cipolla, R. (2017). SegNet: A deep convolutional encoder-decoder architecture for image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(12), 2481–2495. http://dx.doi.org/10.1109/CVPR.2015.7298965.

Karthik, M. M., Terzioglu, T., Hurlebaus, S., Hueste, M. B., Weischedel, H., & Stamm, R. (2019). Magnetic flux leakage technique to detect loss in metallic area in external post-tensioning systems. Engineering Structures, 201, Article 109765. http://dx.doi.org/10.1016/j.engstruct.2019.109765.

Kim, M. S., Park, T., & Park, P. (2019). Classification of steel surface defect using convolutional neural network with few images. In 2019 12th Asian control conference (ASCC) (pp. 1398–1401). IEEE, https://ieeexplore.ieee.org/abstract/document/8764994.

Kolmogorov, V., Criminisi, A., Blake, A., Cross, G., & Rother, C. (2006). Probabilistic fusion of stereo with color and contrast for bilayer segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(9), 1480–1492. http://dx.doi.org/10.1109/TPAMI.2006.193.

Lee, J., Lee, S., Jiles, D., Garton, M., Lopez, R., & Brasche, L. (2003). Sensitivity analysis of simulations for magnetic particle inspection using the finite-element method. IEEE Transactions on Magnetics, 39, 3604–3606. http://dx.doi.org/10.1109/TMAG.2003.816152.

Li, J., Su, Z., Geng, J., & Yin, Y. (2018). Real-time detection of steel strip surface defects based on improved YOLO detection network. IFAC-PapersOnLine, 51(21), 76–81. http://dx.doi.org/10.1016/j.ifacol.2018.09.412.

Li, L., Yang, Y., Cai, X., & Kang, Y. (2020). Investigation on the formation mechanism of crack indications and the influences of related parameters in magnetic particle inspection. Applied Sciences, 10. http://dx.doi.org/10.3390/app10196805.

Lin, C., Chen, C., Yang, C., Akhyar, F., Hsu, C., & Ng, H. (2019). Cascading convolutional neural network for steel surface defect detection. In International conference on applied human factors and ergonomics (pp. 202–212). Springer, http://dx.doi.org/10.1007/978-3-030-20454-9_20.
Q. Wu et al. Expert Systems With Applications 214 (2023) 118966
Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., et al. (2014). Microsoft coco: Common objects in context. In European conference on computer vision (pp. 740–755). Springer, http://dx.doi.org/10.1007/978-3-319-10602-1_48.

Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.-Y., et al. (2016). SSD: Single shot multibox detector. In European conference on computer vision (pp. 21–37). Springer, http://dx.doi.org/10.48550/arXiv.1512.02325.

Liu, K., Li, A., Wen, X., Chen, H., & Yang, P. (2019). Steel surface defect detection using GAN and one-class classifier. In 2019 25th international conference on automation and computing (ICAC) (pp. 1–6). IEEE, http://dx.doi.org/10.23919/IConAC.2019.8895110.

Liu, S., Qi, L., Qin, H., Shi, J., & Jia, J. (2018). Path aggregation network for instance segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 8759–8768). http://dx.doi.org/10.48550/arXiv.1803.01534.

Liu, K., Wang, H., Chen, H., Qu, E., Tian, Y., & Sun, H. (2017). Steel surface defect detection using a new Haar–Weibull-variance model in unsupervised manner. IEEE Transactions on Instrumentation and Measurement, 66(10), 2585–2596. http://dx.doi.org/10.1109/TIM.2017.2712838.

Liu, W., & Yan, Y. (2014). Automated surface defect detection for cold-rolled steel strip based on wavelet anisotropic diffusion method. International Journal of Industrial and Systems Engineering, 17(2), 224–239. http://dx.doi.org/10.1504/IJISE.2014.061995.

Long, J., Shelhamer, E., & Darrell, T. (2015). Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 3431–3440). http://dx.doi.org/10.1109/CVPR.2015.7298965.

Lovejoy, M. (1993). Magnetic particle inspection: a practical guide. Springer Science & Business Media, http://dx.doi.org/10.1007/978-94-011-1536-0.

Luo, Q., Fang, X., Liu, L., Yang, C., & Sun, Y. (2020). Automated visual defect detection for flat steel surface: A survey. IEEE Transactions on Instrumentation and Measurement, 69(3), 626–644. http://dx.doi.org/10.1109/TIM.2019.2963555.

Mei, S., Wang, Y., & Wen, G. (2018). Automatic fabric defect detection with a multi-scale convolutional denoising autoencoder network model. Sensors, 18, 1064. http://dx.doi.org/10.3390/s18041064.

Miao, L., Li, H., & Tian, G. (2020). Resonant frequency tracking mode on eddy current pulsed thermography non-destructive testing. Philosophical Transactions. Series A, Mathematical, Physical, and Engineering Sciences, 378, Article 20190607. http://dx.doi.org/10.1098/rsta.2019.0607.

Milletari, F., Navab, N., & Ahmadi, S.-A. (2016). V-net: Fully convolutional neural networks for volumetric medical image segmentation. In 2016 fourth international conference on 3D vision (3DV) (pp. 565–571). IEEE, http://dx.doi.org/10.1109/3DV.2016.79.

Neogi, N., Mohanta, D. K., & Dutta, P. K. (2014). Review of vision-based steel surface inspection systems. EURASIP Journal on Image and Video Processing, 2014(1), 1–19. http://dx.doi.org/10.4028/www.scientific.net/AMR.308-310.1328.

Nguyen, N. H. T., Perry, S., Bone, D., Le, H. T., & Nguyen, T. T. (2021). Two-stage convolutional neural network for road crack detection and segmentation. Expert Systems with Applications, 186, Article 115718. http://dx.doi.org/10.1016/j.eswa.2021.115718.

Park, F. C., & Martin, B. J. (1994). Robot sensor calibration: solving AX=XB on the Euclidean group. IEEE Transactions on Robotics and Automation, 10(5), 717–721. http://dx.doi.org/10.1109/70.326576.

Redmon, J., Divvala, S., Girshick, R., & Farhadi, A. (2016). You only look once: Unified, real-time object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 779–788). http://dx.doi.org/10.48550/arXiv.1506.02640.

Redmon, J., & Farhadi, A. (2017). YOLO9000: better, faster, stronger. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 7263–7271). http://dx.doi.org/10.48550/arXiv.1612.08242.

Redmon, J., & Farhadi, A. (2018). Yolov3: An incremental improvement. arXiv preprint arXiv:1804.02767, http://dx.doi.org/10.48550/arXiv.1804.02767.

Ronneberger, O., Fischer, P., & Brox, T. (2015). U-net: Convolutional networks for biomedical image segmentation. In International conference on medical image computing and computer-assisted intervention (pp. 234–241). Springer, http://dx.doi.org/10.1007/978-3-319-24574-4_28.

Russell, B. C., Torralba, A., Murphy, K. P., & Freeman, W. T. (2008). LabelMe: a database and web-based tool for image annotation. International Journal of Computer Vision, 77(1), 157–173. http://dx.doi.org/10.1007/s11263-007-0090-8.

Scharstein, D., & Szeliski, R. (2002). A taxonomy and evaluation of dense two-frame stereo correspondence algorithms. International Journal of Computer Vision, 47(1), 7–42. http://dx.doi.org/10.1023/A:1014573219977.

Shi, T., Kong, J.-y., Wang, X., Liu, Z., & Zheng, G. (2016). Improved Sobel algorithm for defect detection of rail surfaces with enhanced efficiency and accuracy. Journal of Central South University, 23(11), 2867–2875. http://dx.doi.org/10.1007/s11771-016-3350-3.

Shi, B., & Qiao, P. (2018). A new surface fractal dimension for displacement mode shape-based damage identification of plate-type structures. Mechanical Systems and Signal Processing, 103, 139–161. http://dx.doi.org/10.1016/j.ymssp.2017.09.033.

Shi, J., Wu, K., Yang, C., & Deng, N. (2021). A method of steel bar image segmentation based on multi-attention U-net. IEEE Access, 9, 13304–13313. http://dx.doi.org/10.1109/ACCESS.2021.3052224.

Shipway, N., Barden, T., Huthwaite, P., & Lowe, M. (2019). Automated defect detection for fluorescent penetrant inspection using random forest. NDT & E International, 101, 113–123. http://dx.doi.org/10.1016/j.ndteint.2018.10.008.

Shipway, N., Huthwaite, P., Lowe, M., & Barden, T. (2019). Performance based modifications of random forest to perform automated defect detection for fluorescent penetrant inspection. Journal of Nondestructive Evaluation, 38(2), 1–11. http://dx.doi.org/10.1007/s10921-019-0574-9.

Shipway, N., Huthwaite, P., Lowe, M., & Barden, T. (2021). Using ResNets to perform automated defect detection for fluorescent penetrant inspection. NDT & E International, 119, Article 102400. http://dx.doi.org/10.1016/j.ndteint.2020.102400.

Song, G., Song, K., & Yan, Y. (2020). EDRNet: Encoder–decoder residual network for salient object detection of strip steel surface defects. IEEE Transactions on Instrumentation and Measurement, 69(12), 9709–9719. http://dx.doi.org/10.1109/TIM.2020.3002277.

Standardization Administration of China (2016). Steel die forgings – tolerance and machining allowance. https://openstd.samr.gov.cn/bzgk/gb/newGbInfo?hcno=21DA10C89BA268AC118A824F82B06FD5.

Staněk, P., & Škvor, Z. (2019). Automated magnetic field evaluation for magnetic particle inspection by impulse. Journal of Nondestructive Evaluation, 38, 75. http://dx.doi.org/10.1007/s10921-019-0615-4.

Tang, Y., Niu, A., Wee, W. G., & Han, C. Y. (1995). Automated inspection system for detecting metal surface cracks from fluorescent penetrant images. In Machine vision applications in industrial inspection III, Vol. 2423 (pp. 278–291). SPIE, http://dx.doi.org/10.1117/12.205514.

Tout, K., Meguenani, A., Urban, J.-P., & Cudel, C. (2021). Automated vision system for magnetic particle inspection of crankshafts using convolutional neural networks. International Journal of Advanced Manufacturing Technology, 112, 3307–3326. http://dx.doi.org/10.1007/s00170-020-06467-4.

Tsai, R. Y., Lenz, R. K., et al. (1989). A new technique for fully autonomous and efficient 3D robotics hand/eye calibration. IEEE Transactions on Robotics and Automation, 5(3), 345–358. http://dx.doi.org/10.1109/70.34770.

Wang, C., Bochkovskiy, A., & Liao, H. M. (2021). Scaled-YOLOv4: Scaling cross stage partial network. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 13029–13038). http://dx.doi.org/10.48550/arXiv.2011.08036.

Wang, T., Chen, Y., Qiao, M., & Snoussi, H. (2018). Fast dynamic hysteresis modeling using a regularized online sequential extreme learning machine with forgetting property. International Journal of Advanced Manufacturing Technology, 94, 3465–3471. http://dx.doi.org/10.1007/s00170-017-0549-x.

Wang, J., Li, Q., Gan, J., Yu, H., & Yang, X. (2019). Surface defect detection via entity sparsity pursuit with intrinsic priors. IEEE Transactions on Industrial Informatics, 16(1), 141–150. http://dx.doi.org/10.1109/TII.2019.2917522.

Wang, H., Zhang, J., Tian, Y., Chen, H., Sun, H., & Liu, K. (2018). A simple guidance template-based defect detection method for strip steel surfaces. IEEE Transactions on Industrial Informatics, 15(5), 2798–2809. http://dx.doi.org/10.1109/TII.2018.2887145.

Woodford, O., Torr, P., Reid, I., & Fitzgibbon, A. (2009). Global stereo reconstruction under second-order smoothness priors. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(12), 2115–2128. http://dx.doi.org/10.1109/TPAMI.2009.131.

Wu, H., Xu, X., Chu, J., Duan, L., & Siebert, P. (2019). Particle swarm optimization-based optimal real Gabor filter for surface inspection. Assembly Automation, 39, 963–972. http://dx.doi.org/10.1108/AA-04-2018-060.

Yang, Y., Yang, Y., Li, L., Chen, C., & Min, Z. (2022). Automatic defect identification method for magnetic particle inspection of bearing rings based on visual characteristics and high-level features. Applied Sciences, 12(3), 1293. http://dx.doi.org/10.3390/app12031293.

Youkachen, S., Ruchanurucks, M., Phatrapomnant, T., & Kaneko, H. (2019). Defect segmentation of hot-rolled steel strip surface by using convolutional auto-encoder and conventional image processing. In 2019 10th international conference of information and communication technology for embedded systems (IC-ICTES) (pp. 1–5). IEEE, http://dx.doi.org/10.1109/ICTEmSys.2019.8695928.

Yu, H., Li, Q., Tan, Y., Gan, J., Wang, J., Geng, Y.-a., et al. (2018). A coarse-to-fine model for rail surface defect detection. IEEE Transactions on Instrumentation and Measurement, 68(3), 656–666. http://dx.doi.org/10.1109/TIM.2018.2853958.

Zabih, R., & Woodfill, J. (1994). Non-parametric local transforms for computing visual correspondence. In European conference on computer vision (pp. 151–158). Springer, http://dx.doi.org/10.1007/BFb0028345.

Zhang, Z. (1999). Flexible camera calibration by viewing a plane from unknown orientations. In Proceedings of the seventh IEEE international conference on computer vision, Vol. 1 (pp. 666–673). IEEE, http://dx.doi.org/10.1109/ICCV.1999.791289.

Zhang, J., Kang, X., Ni, H., & Ren, F. (2021). Surface defect detection of steel strips based on classification priority YOLOv3-dense network. Ironmaking & Steelmaking, 48(5), 547–558. http://dx.doi.org/10.1080/03019233.2020.1816806.

Zhao, W., Chen, F., Huang, H., Li, D., & Cheng, W. (2021). A new steel defect detection algorithm based on deep learning. Computational Intelligence and Neuroscience, 2021. http://dx.doi.org/10.1155/2021/5592878.

Zheng, Z., Wang, P., Liu, W., Li, J., Ye, R., & Ren, D. (2020). Distance-IoU loss: Faster and better learning for bounding box regression. In Proceedings of the AAAI conference on artificial intelligence, Vol. 34 (pp. 12993–13000). http://dx.doi.org/10.1609/aaai.v34i07.6999.
Zheng, J., Xie, W., Viens, M., Birglen, L., & Mantegh, I. (2013). Design of advanced automatic inspection system for turbine blade FPI analysis. In AIP conference proceedings, Vol. 1511 (pp. 612–619). American Institute of Physics, http://dx.doi.org/10.1063/1.4789103.

Zhiznyakov, A., Privezentsev, D., & Zakharov, A. (2015). Using fractal features of digital images for the detection of surface defects. Pattern Recognition and Image Analysis, 25(1), 122–131. http://dx.doi.org/10.1134/S105466181501023X.

Zhou, F., Liu, G., Xu, F., & Deng, H. (2019). A generic automated surface defect detection based on a bilinear model. Applied Sciences, 9, 3159. http://dx.doi.org/10.3390/app9153159.

Zhou, S., Wu, S., Liu, H., Lu, Y., & Hu, N. (2018). Double low-rank and sparse decomposition for surface defect segmentation of steel sheet. Applied Sciences, 8(9), 1628. http://dx.doi.org/10.3390/app8091628.

Zou, Q., Zhang, Z., Li, Q., Qi, X., Wang, Q., & Wang, S. (2018). DeepCrack: Learning hierarchical convolutional features for crack detection. IEEE Transactions on Image Processing, 28, 1498–1512. http://dx.doi.org/10.1109/TIP.2018.2878966.