
Journal of Materials Processing Tech.

292 (2021) 117064


journal homepage: www.elsevier.com/locate/jmatprotec

Filtered selective search and evenly distributed convolutional neural networks for casting defects recognition

Xiaoyuan Ji a, Qiuyu Yan a, b, Dong Huang a, Bo Wu a, Xiaojing Xu a, Aibin Zhang c,
Guanglan Liao d, Jianxin Zhou a, *, Menghuai Wu e, *
a State Key Laboratory of Materials Processing and Die & Mould Technology, School of Materials Science and Engineering, Huazhong University of Science and Technology, Wuhan, Hubei, 430074, China
b CISDI Information Technology Co., Ltd., Chongqing, 401122, China
c Beijing Baimu High-tech Co., Ltd., of Aeronautical Materials, Beijing 100095, China
d State Key Laboratory of Digital Manufacturing Equipment and Technology, Huazhong University of Science and Technology, Wuhan, China
e Department Metallurgy, University of Leoben, Austria

ARTICLE INFO

Associate Editor: Dr Jian Cao

Keywords: Convolutional neural networks; Selective search; Defect detection; Classification; Casting

ABSTRACT

X-ray flaw detection is a key link in the detection of internal defects in titanium alloy castings, which are used for the most important components in aeroengines. However, the existing manual methods of detecting defects from X-ray images have common drawbacks such as unstable human recognition, misdetection, misjudgment, lack of quantitative analysis, huge workload, and low inspection efficiency. To avoid these drawbacks, this paper proposes a new artificial intelligence (AI) method to detect and recognize aerospace titanium casting defects from X-ray images. It includes a target defect positioning method named the filtered selective search algorithm (FSS) and a defect classification method named the evenly distributed convolutional neural network (ED-CNN). In the target positioning step, through statistical analysis of defect characteristics, a filtered selective search algorithm is built with two filters (size and edge curvature). In this way, the FSS algorithm can position the defects with almost 100 % accuracy, hence avoiding missed detection and false detection. In the target classification step, an ED-CNN is constructed with a similar structure of the same number of layers in each feature extraction stage, so that its entire architecture is evenly distributed. Compared with three other classic high-performance convolutional neural network models (AlexNet, VGG16 and VGG19), the ED-CNN model has the best performance. The ED-CNN model was tested with 324 targets from 50 original images, and a classification accuracy of nearly 90 % was obtained for low density holes, porosity, linear defects, high density inclusions and casting structure. The two-phase FSS/ED-CNN defect detection method proposed in this paper can achieve accurate positioning and highly accurate classification of typical defect targets, and is expected to solve the common drawbacks of "manual defect detection". The newly proposed FSS/ED-CNN method has important research significance and engineering value.

1. Introduction

Automatic identification of casting defects based on computer vision has always been a key issue in quality control procedures. Castings have a few kinds of defects in the production process due to various reasons, such as internal shrinkages, cracks, inclusions and unfused areas. The presence of these non-homogeneous regions severely affects the strength of the castings, impairing or even destroying the performance of the casting products. In order to ensure the quality of castings, it is necessary to carry out quality research and technical solutions. Vijayaram et al. (2006) studied the content of casting quality control and corresponding solutions from a macro perspective, and strengthened the content of quality control from two aspects of personnel operation and technical improvement, which brought us reference suggestions in terms of quality improvement, but the specific methods of quality inspection were rarely mentioned. Since the last decade, X-ray inspection has been used as the primary method for non-destructive industrial testing due to its ease of operation and readability. Gholizadeh (2016) summarized and reviewed nondestructive testing methods, noted that radiographic testing is the most commonly used testing method, and

* Corresponding authors.
E-mail addresses: zhoujianxin@hust.edu.cn (J. Zhou), menghuai.wu@unileoben.ac.at (M. Wu).

https://doi.org/10.1016/j.jmatprotec.2021.117064
Received 22 June 2020; Received in revised form 1 December 2020; Accepted 17 January 2021
Available online 19 January 2021
0924-0136/© 2021 Elsevier B.V. All rights reserved.

Fig. 1. Titanium Alloy Castings for Aerospace Industry.

introduced the characteristics of tested objects suitable for different types of rays. Today, multiple detection methods based on computer vision have opened up new avenues for non-destructive testing. Combined with traditional inspection equipment, these technologies bring great convenience to automated defect detection through feature extraction and recognition. In the field of color printing, Luo and Zhang (2003) realized the detection and classification of quality defects under different light conditions in the printing process based on histograms and neural networks, providing production information feedback for the improvement of the quality of printed products. During the welding process, Sreedhar et al. (2012) developed an online monitoring system by extracting thermal image features, and realized the identification of the defects generated in welding. For checking the quality of civil facilities, Makantasis et al. (2015) used convolutional neural networks to implement security inspections of tunnels, confirmed their ability to learn from images, and located the targets by capturing the differences between the pixels of the images. Similarly, this research and these applications based on industrial images provide a reference for internal defect detection by X-ray in our casting field.

In the process of quality inspection of castings, after the X-ray machine images the objects, people can judge the defects inside, record the corresponding defects and then feed the results back to other manufacturing processes, so as to guide production and improve casting quality. With the development of computer image processing technologies, the process above becomes more convenient. Taking into account the distortion of the casting X-ray images, Tokhy and Saad (2017) explored the feature extraction and defect classification of an artificial neural network under different image transformations based on the cepstral coefficient technology, which provided a reference for our pre-processing of X-ray images. For feature extraction on casting X-ray images, Silva and Mery (2007) used statistical techniques to extract 8 features and estimated the accuracy of defect classification by constructing a two-layer neural network. In order to improve the quality of the plastic injection molding process, Yang et al. (2015) combined digital image processing and a model-free optimization method to realize defect measurement and feedback optimization of plastic products. In the image processing stage, the authors extracted the defect contour through edge detection and successfully monitored the defect magnitude. In the process of multi-threshold segmentation of images, Wu et al. (2020) introduced an improved teaching and learning algorithm to achieve accurate segmentation of defect targets on X-ray images of castings. The above-mentioned image processing methods and applications achieve, for example, image transformation, feature extraction, and target detection and segmentation from multiple angles, but in general, only part of the entire multi-target identification process is implemented. For casting X-ray images, the task of defect identification includes multi-target detection and classification.

For multi-target and multi-category images, the objects first need to be detected, so image segmentation may be involved. Among the region proposal methods that have been put forward, selective search (Van de Sande et al., 2012) is one of the most commonly used due to its fast speed and high recall rate, and the Graph-Based Image Segmentation method (Felzenszwalb and Huttenlocher, 2004) it uses shows good performance. Considering the practical requirements of the X-ray image recognition task of aviation titanium alloy castings, we use this algorithm to capture all suspected defects. The specific principle and application of the algorithm will be introduced in the following

Fig. 2. Flowchart for casting image defects recognition.


chapters. With regard to the classification of targets, in recent years, convolutional neural networks have been widely used for their advantages in image processing. Compared with manual feature extraction methods, the automatic learning of the networks greatly reduces the manual workload, the feature conversion error, and the chance of missing important features. Krizhevsky et al. (2012) constructed and trained a convolutional neural network to classify a large number of pictures, which achieved good classification results. The network they proposed, named AlexNet, has important classical significance for the subsequent development of convolutional neural networks, and in the comparison part of our research, we compare its performance with that of our model.

We further study automatic multi-target continuous detection and classification on a single image. Especially for X-ray images of aviation titanium alloy castings, due to the particularity of the product use, the inspection task must ensure that small-size defects are not omitted, and the variety of defects adds to the difficulty of the classification task. Considering the characteristics of the obtained images and combining the above knowledge, in this work we propose filtered selective search and evenly distributed convolutional neural networks for defect identification on casting X-ray images. The method is mainly divided into two steps. One is to use the filtered selective search to detect all suspicious targets, and the other is to use convolutional neural networks to construct the ED-CNN to classify the suspicious regions. All the work aims to achieve the automatic positioning and classification of defects, and to facilitate the manual inspection of casting quality.

2. Overview of the proposed method

This section briefly introduces the overall framework of the method we have proposed. The obtained casting images are X-ray images of titanium alloy castings which are often used in the aerospace industry, as shown in Fig. 1. In order to further lighten the casing, the wall thickness of the support plate is reduced. Therefore, effective defect identification technology becomes a guarantee for confirming the safety of the machine.

In actual production, due to the large number of castings, the inspection task is heavy. The process of defect recognition is shown in Fig. 2.

In total 402 X-ray images are acquired in the casting quality inspection. All original images are divided into 3 groups: 282 for training, 70 for evaluation, and 50 for testing. Firstly, the 282 training images were used for target detection and feature analysis, so as to remove false defects and obtain a detection effect close to real defect detection. The image database is derived from the images of the castings taken by the X-ray detector, which contain 4 types of defects. In addition to the actual defects, parts of casting structures are also added as a type of others. The original images have a resolution of 3072 × 2400 pixels. The image database consists of manually cropped and labeled defects and backgrounds, and these images are randomly selected to form a training set and a validation set in a ratio of 4:1. As the next step, a convolutional neural network is constructed to realize the training and classification of the various types of defect images. Finally, the trained model is used for classification of the targets. The remaining 50 images are reserved to form a testing set; regions are first selected from them, and these regions are then sent to the validated model for classification to form a defect recognition report.

3. Methodology

This section explains the principles and background of the various parts used in the framework. It mainly consists of two parts: one is the filtered selective search algorithm, and the other is the structure of the ED-CNN, each layer of which will be explained in detail.

Fig. 3. Process of region's filtered selective search.

3.1. Filtered selective search

Traditional object detection usually uses an exhaustive method like a sliding window (Lampert et al., 2008) to select all candidate regions on the image, then extracts these regions' features and makes a classification. Every suspicious region is noticed. However, by using an exhaustive method like a sliding window to select every region where an object may appear, the computational complexity is high and a large amount of calculation is generated. At the same time, many redundant regions are produced, which is costly. The selective search algorithm proposed by Uijlings et al. (2013) can effectively reduce redundant regions and greatly reduce the amount of calculation. Firstly, the algorithm recommends object regions, calculates the similarity between adjacent sub-regions in the image, and finally obtains a smaller number of object regions by continuously merging adjacent similar regions, thereby narrowing the object search range. The algorithm adopts a diversified strategy to avoid the poor region recommendations that a single strategy may give. This diversity is embodied in a variety of color spaces, multiple thresholds and multiple similarity calculation standards. In order to consider the influence of factors such as scenes and lighting conditions, the original color space is converted to up to 8 color spaces, such as RGB, HSV, and grayscale. The original image is initialized with multiple thresholds, and the threshold can be flexibly adjusted to an optimal state according to the image type and specific conditions. In general, the larger the threshold, the fewer the divided regions. As for the multiple region similarity standards, similarity can be considered in terms of the color, texture features, size, and overlap of adjacent regions. When comparing the similarities, the value of each single dimension is normalized to the range 0 to 1. The larger the value, the greater the similarity between the compared regions. Finally, all the single values are added to obtain a comprehensive similarity value. The overall implementation steps of the algorithm are as follows:

1) Firstly, the segmentation method is used to generate a candidate region set, that is, the image is segmented into many small blocks.
2) Then calculate the similarity of each pair of adjacent regions, and merge the two regions with the highest similarity.
3) Repeat step 2 until only one complete region is left. At this point, we will get the target region set we want.

For a particular kind of image, the selective search algorithm may still generate many unnecessary recommended regions. Therefore, after the algorithm is executed, a filter is added. The filter condition is to set a maximum two-dimensional size limit (including a margin) after counting the two-dimensional sizes of the various defects, and to filter out obvious non-defect misdetection areas. In the selected areas remaining after filtering, the edge curve of each area is identified, and whether the area is a defect is then judged according to the ratio of the maximum continuous length of the edge curvature to the total length of the edge. By combining specific features such as the size and edge curvature of the image objects, the filtering conditions suitable for the image are determined, thereby achieving the effect of accurately


Fig. 4. A Schematic of the evenly distributed Convolutional Neural Network (ED-CNN) Architecture.

Table 1
Dimensional information in each layer.

Layer   Input size       Operator  Kernel size   Stride  Processed size
Input   128 × 128 × 3    –         –             –       –
L1      128 × 128 × 3    C1        5 × 5 × 32    1       128 × 128 × 32
L2      128 × 128 × 32   P1        2 × 2         2       64 × 64 × 32
L3      64 × 64 × 32     C2        5 × 5 × 64    1       64 × 64 × 64
L4      64 × 64 × 64     P2        2 × 2         2       32 × 32 × 64
L5      32 × 32 × 64     C3        3 × 3 × 128   1       32 × 32 × 128
L6      32 × 32 × 128    P3        2 × 2         2       16 × 16 × 128
L7      16 × 16 × 128    C4        3 × 3 × 256   1       16 × 16 × 256
L8      16 × 16 × 256    P4        2 × 2         2       8 × 8 × 256
L9      8 × 8 × 256      C5        3 × 3 × 512   1       8 × 8 × 512
L10     8 × 8 × 512      P5        2 × 2         2       4 × 4 × 512
L11     4 × 4 × 512      FC6       –             –       1 × 1 × 1024
L12     1 × 1 × 1024     FC7       –             –       1 × 1 × 5
Output  1 × 1 × 5        –         –             –       –
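
The sizes in Table 1 can be checked with a short shape trace. The following is only a sketch: the `LAYERS` list is transcribed from the table, and it assumes 'same' padding for the stride-1 convolutions (which the unchanged spatial sizes in Table 1 imply but the text does not state). The function name `trace_shapes` is hypothetical.

```python
# Layer list transcribed from Table 1: (operator, kind, kernel side, output channels).
LAYERS = [
    ("C1", "conv", 5, 32), ("P1", "pool", 2, None),
    ("C2", "conv", 5, 64), ("P2", "pool", 2, None),
    ("C3", "conv", 3, 128), ("P3", "pool", 2, None),
    ("C4", "conv", 3, 256), ("P4", "pool", 2, None),
    ("C5", "conv", 3, 512), ("P5", "pool", 2, None),
    ("FC6", "fc", None, 1024), ("FC7", "fc", None, 5),
]


def trace_shapes(height=128, width=128, channels=3):
    """Return [(operator, (h, w, c)), ...] for the processed size after each layer."""
    shapes = []
    h, w, c = height, width, channels
    for name, kind, kernel, out_channels in LAYERS:
        if kind == "conv":
            # Stride 1 with assumed 'same' padding: only the channel count changes.
            c = out_channels
        elif kind == "pool":
            # 2 x 2 window with stride 2: each spatial dimension is halved.
            h, w = h // kernel, w // kernel
        else:
            # Fully connected layer: the input is flattened to a vector.
            h, w, c = 1, 1, out_channels
        shapes.append((name, (h, w, c)))
    return shapes


for name, shape in trace_shapes():
    print(name, shape)
```

Each convolution stage is followed by exactly one pooling stage, which is the even distribution that gives the ED-CNN its name.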

Fig. 5. Comparison of three activation functions.

searching for the image targets. The specific process is shown in Fig. 3.
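
The three merging steps listed in Section 3.1 can be illustrated with a small sketch. It is not the implementation used in the paper: the initial segmentation is assumed to be given, each region is a hypothetical dictionary with a bounding box `(x0, y0, x1, y1)`, a pixel count `size` and a mean gray level `gray`, and the similarity uses only simplified gray-level, size and fill terms in place of the color and texture histograms of the full algorithm.

```python
from itertools import combinations


def bbox_union(a, b):
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def bbox_area(box):
    return (box[2] - box[0]) * (box[3] - box[1])


def touching(a, b):
    # Two boxes count as adjacent if they overlap or share a border.
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def similarity(r1, r2, image_area):
    # Each term lies in [0, 1]; the terms are summed into one overall value.
    gray = 1.0 - abs(r1["gray"] - r2["gray"]) / 255.0
    size = 1.0 - (r1["size"] + r2["size"]) / image_area
    union = bbox_area(bbox_union(r1["box"], r2["box"]))
    fill = 1.0 - (union - r1["size"] - r2["size"]) / image_area
    return gray + size + fill


def merge(r1, r2):
    size = r1["size"] + r2["size"]
    gray = (r1["gray"] * r1["size"] + r2["gray"] * r2["size"]) / size
    return {"box": bbox_union(r1["box"], r2["box"]), "size": size, "gray": gray}


def selective_search(regions, image_area):
    """Greedily merge the most similar adjacent regions; return every box seen."""
    regions = list(regions)
    proposals = [r["box"] for r in regions]  # step 1: initial small blocks
    while len(regions) > 1:
        pairs = [(i, j) for i, j in combinations(range(len(regions)), 2)
                 if touching(regions[i]["box"], regions[j]["box"])]
        if not pairs:
            break  # no adjacent regions remain
        # Step 2: merge the adjacent pair with the highest similarity.
        i, j = max(pairs, key=lambda p: similarity(regions[p[0]], regions[p[1]], image_area))
        merged = merge(regions[i], regions[j])
        regions = [r for k, r in enumerate(regions) if k not in (i, j)] + [merged]
        proposals.append(merged["box"])
    return proposals  # step 3: the loop ends when one region is left
```

Every intermediate merged box is kept as a proposal, so candidates at several scales are available to the size and edge-curvature filters applied afterwards.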

3.2. The overall structure of ED-CNN

Fig. 4 shows the overall structure of the evenly distributed convolutional neural network (ED-CNN) used for classifying defects. Each feature extraction stage uses a similar structure with the same number of layers, so the entire architecture of the convolutional neural network is evenly distributed. We name a CNN with this kind of structure ED-CNN. The network can be divided into three types of layers: convolutional layers, pooling layers and fully connected layers. The input of the network is an image of 128 × 128 × 3 pixels, and after processing by the network, the final output is a defect type from 5 categories. Table 1 details the dimensional information of the input image after being processed at each layer of the network. In addition, there is a dropout layer located after FC6 that is not shown in the figure.

3.3. Structural composition of ED-CNN

Normally, most of the feature extraction work relies on convolution operations. The image is convolved with a convolution kernel to obtain a new feature map. The initial weights of the convolution kernel are usually randomly generated and can also be initialized by using the 'Xavier' algorithm (Glorot and Bengio, 2010). The bias is commonly determined using 'Constant' and initialized to zero at first. The stride is defined as the distance the convolution kernel slides at each step; we set it to 1.

Two common methods of pooling are max pooling and mean pooling. The former takes the maximum value of the feature points in the neighborhood, and the latter averages the feature values. A detailed comparative analysis of the two methods is given in (Boureau et al., 2010). In this article we select max pooling.

Through matrix multiplication, a feature space transformation is performed, and the "distributed feature representation" learned from the previous layers is mapped to the sample label space (Lin et al., 2014). Coupled with the nonlinear mapping of the activation function, multiple fully connected layers can theoretically simulate any nonlinear transformation.

Between the convolutional layer and the fully connected layer, a flatten layer is added to realize the transition between convolution and full connection. In addition, to reduce the overfitting of the networks, the dropout layer (Srivastava et al., 2014) is placed behind the first two layers of the fully connected layers. The mechanism of dropout is to randomly discard a proportion of the neurons during network training, changing the fixed connections between layers to obtain a more robust model.

For the activation function, the ReLU (Nair and Hinton, 2010) is used after the convolution to increase the nonlinearity of the neural network model. Compared with sigmoid and other functions, the ReLU requires less computation and solves the vanishing gradient problem in the positive interval. Moreover, its convergence speed is much faster than that of the sigmoid and tanh(x). Fig. 5 shows the three kinds of functions and their derivatives.

In addition, when defining the loss, this article uses the labels and logits to calculate their softmax cross entropy (Kline and Berardi, 2005). The softmax calculation is performed at the last output layer of the network, and the probability vector L = [y1, y2, y3, …, y5] of the category to which the defect belongs is obtained. The calculation is given in Eq. (1).

$$y_i = \frac{e^{z_i}}{\sum_{k} e^{z_k}} \tag{1}$$

Here z_i is the output of the neurons in the preceding layer, and the index k runs over the 5 categories in this article. Then the cross entropy between the result L and the actual label y'_i is calculated, as in Eq. (2). Finally, the resulting vector is averaged to obtain the network loss.

$$C = -\sum_{i} y'_i \log(y_i) \tag{2}$$

In our method, Stochastic Gradient Descent (Bottou, 2012) is adopted to move the parameters along the opposite direction of the gradient, that is, the direction in which the total loss is reduced, implementing the parameter update.

4. Defects search and classification

This section describes the details involved in defect extraction and classification, including the image database used, filtered selective


Fig. 6. Different kinds of defects for training and validation: (a) low density holes, (b) porosity, (c) linear defects, (d) high density inclusions, (e) casting structure.

Fig. 7. Comparison of image enhancement effect: (a) before processed, (b) after brightness and color processed, (c) after contrast processed, (d) after sharpness processed.

search for defects, hyperparameter settings, and the classifier training results.

4.1. Image database

The image database for network training consists of 5 kinds of cropped images (4 types of casting defects, one Others), which are manually given corresponding labels from 352 original X-ray images. The specific types are shown in Fig. 6. A total of 3000 cropped images (targets) were randomly divided into a training set and a validation set in a ratio of 4:1, and the pixel resolution was 128 × 128. An additional 50 original X-ray images were used for testing.

Due to the influence of the photographing environment, some of the images affected by noise and other factors are of poor quality, so all the original images have been enhanced. For the training data set, in order to make it easier for the model to learn image features, we enhance the brightness, color, contrast, and sharpness of the image as shown in Fig. 7.

For neural network feature learning models, the key to good performance in training is to use high-quality data sets. Among the relevant factors, the balance of image data has a greater impact on the model. Relevant experimental experience also shows that convolutional neural networks are indeed sensitive to this type of problem. In order to distribute the image data as evenly as possible and improve the generalization ability of the trained model, we use data expansion and data shuffling to enhance it. The flip and rotation functions are similar: both can enrich the posture of the defective target in the image, and the rotation can increase the rotation invariance learned by the model. Adding Gaussian noise increases the pixel interference in the original image, so that the convolutional neural network can learn more powerful feature information, thereby enhancing the robustness of the model. Histogram equalization refers to increasing the contrast between relatively high and low image pixels to make the difference between image details more obvious. For grayscale images, it manifests as a wider


Fig. 8. Expansion of casting X-ray image data: (a)-(f) flip mirror image, (g) Gaussian noise, (h) histogram equalization.
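
The expansion operations named in Fig. 8 can be sketched on a small grayscale image stored as a list of rows. This is an illustration only: the paper does not give its implementation or parameters, so the function names, the noise level `sigma` and the fixed `seed` are assumptions, and the equalization uses the standard cumulative-histogram mapping.

```python
import random


def flip_horizontal(img):
    return [row[::-1] for row in img]


def flip_vertical(img):
    return [row[:] for row in img[::-1]]


def rotate_90(img):
    # Clockwise quarter turn.
    return [list(row) for row in zip(*img[::-1])]


def add_gaussian_noise(img, sigma=10.0, seed=0):
    # sigma and seed are illustrative values, not taken from the paper.
    rng = random.Random(seed)
    return [[min(255, max(0, round(p + rng.gauss(0.0, sigma)))) for p in row]
            for row in img]


def equalize_histogram(img):
    """Spread 8-bit gray levels over 0..255 using the cumulative histogram."""
    pixels = [p for row in img for p in row]
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    cdf, total = [], 0
    for count in hist:
        total += count
        cdf.append(total)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(pixels)
    if n == cdf_min:
        return [row[:] for row in img]  # constant image: nothing to stretch
    return [[round((cdf[p] - cdf_min) * 255 / (n - cdf_min)) for p in row]
            for row in img]


sample = [[50, 100], [150, 200]]
print(equalize_histogram(sample))
```

In the example, four gray levels clustered between 50 and 200 are stretched across the full 0 to 255 range, which is the wider contrast range that histogram equalization is meant to produce.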

Fig. 9. The size of the artificially cut defect images: (a) size distribution of different defects, (b) proportion of defects in each size range.

contrast range in brightness. Fig. 8 shows the effect of data expansion. After expansion (including flip mirror image, Gaussian noise, histogram equalization), the total number of training data reaches 3000, 600 for each category.

4.2. Filtered selective search for defects

The implementation of this part prepares for the network testing. A global search is performed on the original image, after which all proposed defect regions are selected and subsequently sent to the classification network model. Since there are some similarities and continuities between defects or in the background regions, the proposed bounding boxes are extracted based on the idea of sub-region merging. Firstly, algorithmic segmentation (Van de Sande et al., 2012) of the original image produces a number of small sub-regions, and secondly, region merging is performed based on similarities between the sub-regions such as color, texture, size, and overlap. Iterating in this way, the merged sub-regions are finally circumscribed with rectangles to obtain the desired proposed boxes. At last, the relative coordinate position, height and width of the boxes on the image are recorded.

By counting the sizes of the cropped defect images, excluding the background, as shown in Fig. 9, we can see that the size of most defects ranges from 0 to 400 pixels. Considering that these images are formed by manual cropping, the target positioning accuracy is lower than that of the selective search algorithm, and a certain distance of the casting background area is still reserved around the defect edge, so the actual defect size is less than this. Therefore, in each single dimension of width and height, the size limit for rejecting false detections is set to 500


Fig. 10. Target area segmentation and edge detection: (a) casting structure, (b) casting defects.

Fig. 11. Edge-curvature map of the targets: (a) casting body structure, (b) casting defect.

pixels. Also, the sizes are mainly concentrated in 0–200 pixels, and the small sizes bring challenges to detection and accurate classification. Based on this, the edge curvature characteristics of defective and non-defective targets are also considered.

As shown in Fig. 10, the edge curvature of the real defect targets varies irregularly along the edge. Meanwhile, because of the regular edges of the non-defective targets, their curvature shows longer runs of continuous zero values. We calculate the continuous values of the curvatures of the defect and non-defect targets, record the maximum continuous length of the curvature on the edge, and divide this value by the total edge length. The results show that the proportion for real defects is below 0.2, with an average value of 0.146, while the proportion for structural types is relatively larger and more dispersed, with an average value of 0.305. For example, target area segmentation and edge detection of casting structure and casting defects are shown in Fig. 11. Therefore, we set the distinction threshold between the two to 0.2.

The application effect is shown in Fig. 12. In this process, referring to the actual physical dimensions and mathematical statistical characteristics of casting defects, some of the apparently non-defective proposed boxes are screened out by constraints, such as the ratio of height to


Fig. 12. Application of defect search: (a) original image to be detected, (b) image after filtered selective search.


Fig. 13. Loss and accuracy changes during model training.
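
The loss plotted in Fig. 13 is the softmax cross entropy defined by Eqs. (1) and (2). A minimal sketch of that calculation follows; the function names are hypothetical, and subtracting the maximum logit before exponentiation is a common numerical safeguard added here, not something stated in the paper (it leaves the result of Eq. (1) unchanged).

```python
import math


def softmax(logits):
    # Eq. (1); shifting by the max avoids overflow without changing the ratios.
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(one_hot, probs):
    # Eq. (2); terms with a zero label contribute nothing and are skipped.
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs) if t > 0)


def batch_loss(batch_logits, batch_labels):
    # The per-sample losses are averaged to give the network loss.
    losses = [cross_entropy(y, softmax(z))
              for z, y in zip(batch_logits, batch_labels)]
    return sum(losses) / len(losses)


# A network that is undecided between the 5 categories assigns 0.2 to each.
print(softmax([0.0] * 5))
print(batch_loss([[0.0] * 5], [[1, 0, 0, 0, 0]]))
```

For equal logits over the 5 categories, every probability is 0.2 and the loss equals log 5, so a loss near zero requires the network to put almost all probability on the correct category.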


Fig. 14. Result of image target recognition.

width being too large and the edge curvature staying continuously at zero.
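
The two filters described in this section can be sketched as follows, using the 500-pixel size limit and the 0.2 curvature-ratio threshold given in the text. Everything else is an assumption for illustration: each proposal is a hypothetical pair of a box `(x, y, width, height)` and a list of curvature values sampled along its closed edge, how those curvature values are computed is not shown, a ratio of exactly 0.2 is treated as structure, and the height-to-width constraint is omitted because the text gives no value for it.

```python
def longest_zero_run_ratio(curvatures, tol=1e-6):
    """Longest run of (near) zero curvature divided by the total edge length."""
    n = len(curvatures)
    if n == 0:
        return 0.0
    longest = run = 0
    # Walk the closed edge twice so a run wrapping past the start is counted whole.
    for value in list(curvatures) + list(curvatures):
        if abs(value) <= tol:
            run += 1
            longest = max(longest, run)
        else:
            run = 0
    return min(longest, n) / n


def passes_size_filter(box, max_side=500):
    _, _, width, height = box
    return width <= max_side and height <= max_side


def filter_proposals(proposals, max_side=500, ratio_threshold=0.2):
    """Keep boxes that are small enough and whose edges are irregular enough."""
    kept = []
    for box, curvatures in proposals:
        if not passes_size_filter(box, max_side):
            continue  # larger than any defect observed in the statistics
        if longest_zero_run_ratio(curvatures) >= ratio_threshold:
            continue  # long straight stretch: likely casting structure
        kept.append(box)
    return kept


defect = ((10, 10, 40, 30), [0.3, 0.0, -0.2, 0.5, 0.1, 0.0, 0.4, -0.3, 0.2, 0.1])
structure = ((50, 50, 100, 80), [0.0, 0.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.4, 0.0, 0.0])
print(filter_proposals([defect, structure]))
```

Applying the size test first is a design choice of this sketch: it is cheap and removes oversized boxes before any edge analysis is needed.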
As can be seen from the results, the filtered selective search satisfies the search requirements for the defect images, and there are almost no missing regions. Moreover, compared with the sliding window method, the selective search requires less computation and therefore saves time, and the window size is also more flexible.

4.3. Hyperparameter setting and training of ED-CNN

The network is optimized with Stochastic Gradient Descent during training, and considering the sample sizes, we set the batch size to 32 for each iteration. For the learning rate, after comparing the results of multiple trial-and-error runs, we use a decaying learning rate with an initial value of 0.05. A learning rate higher than this value leads to rough performance, while a low learning rate results in slow convergence of accuracy and a mediocre effect. Since the number of samples used in this method still objectively needs to be increased, the dropout rate is set to 0.9, which is close to full-sample training. Besides, while the stride of all convolutional layers mentioned above is set to 1, the stride of all pooling layers in this method is set to 2.

Fig. 13 shows the loss and accuracy changes in the image data training process. It can be seen that in a total of 30,000 iterations, the model loss gradually decreases from a maximum of about 1.9 and drops close to 0 by 28,000 iterations. The training accuracy stabilizes at 1 from 22,000 iterations, indicating that the model has converged and the data fitting training is completed.

After the classification model is established, the network is trained. Based on the existing data, we explored the training effect of the model under different numbers of iterations. The results show that in the first 200 iterations, the model is still in a less stable state, and the training accuracy is improving. After 200 iterations, the precision gradually tends to stabilize. At 30,000 iterations, it reached a maximum precision of about 99.99 %. For the training dataset size, considering the limited number of original samples and the imbalance of distribution between categories, the data is expanded by mirroring and rotation, and the total number reaches 3000.

4.4. Testing of ED-CNN

After a global search, the proposed regions of varying sizes are boxed out on the image. These regions contain most of the defects, the structural parts of the casting, and regions that are misdetected as defects. On this basis, all proposed regions are numbered in order from left to right and top to bottom. Then, in numerical order, each region is cropped and sent to the trained network for category prediction. The predicted result is shown beside the number and the corresponding coordinates are output. The image target recognition results are shown in Fig. 14.

As can be seen from the results, defects within the image are detected and assigned types. One image contains various types of defects, as well as the structure and background of the castings, which are both unified into the Other category.

5. Testing and discussions

The images for testing are original images that have not been cropped. After the model is trained, the original image is directly processed. Firstly, a global search is performed on the image to pick out all regions that may be defects. Then the selected regions cropped from the original image become the input of the ED-CNN. After classification by the network, the category of the predicted defect is directly marked beside the boxes in the image, and the relative position and size of each defect are extracted.

5.1. Comparative study in detection

In order to test the performance of the trained network, another 50 original images that have not been used for training or validation are selected. First, a comparison is made with respect to the detection capabilities of the proposed method. The selected images contain defect targets in a variety of states, such as targets located within the structure, blurred targets, and overlapping targets. The proposed method is compared with the traditional Prewitt edge detection method (Shrivakshan and Chandrasekar, 2012) and the OTSU threshold segmentation method (Vala and Baxi, 2013); the specific comparison is shown in Fig. 15.

From the results, we can see that compared with the original selective search algorithm, the filtered selective search removes some targets of the structural class, the visual effect of defect detection is clearer, and the result is more accurate. While the Prewitt edge detection method can present the defects' edge information, it also detects too many useless edges, which leads to huge visual interference in the detection result. Finally, the OTSU threshold segmentation method is almost ineffective in the above scenarios, and defects are not detected completely in some scenarios.

Fig. 16 shows the comparison of the final detection results on the whole image. It can be seen that the original selective search algorithm detects many non-defective targets, and even covers the real defect targets, resulting in a reduction in the number of real defects. The filtered selective search retrieves all suspicious targets more specifically, with almost no omissions. The Prewitt detection method detects many non-defective features in images, and the visibility is poor, which makes target identification difficult. The OTSU threshold segmentation method detects some defects, but we still cannot

9
X. Ji et al. Journal of Materials Processing Tech. 292 (2021) 117064

Fig. 15. Comparison of target defects detection results with different methods under different conditions: (a) original image, (b) Selective search, (c) Filtered se­
lective search (ours), (d) Prewitt edge detection, (e) OTSU threshold segmentation.

clearly see the true shapes of the targets, and the method completely the qualitative classifications of targets are realized on a whole image in
fails for the targets in the background. Finally, the comparison of the our method, and it achieves good results in the detection step. Because of
false detection rate and the missed detection rate between the selective the large difference of target sizes in this type of images and especially
search algorithm before and after filtered is given. Misdetection refers to some sizes are too tiny to learn the features for CNN, it poses a great
the detection of non-defective cases. As can be seen from the graph, the challenge to achieve extremely accurate classification.
false detection rate after filtered drops from 49.1 % to 8.24 %, and the It can be seen from the Table 2 that the recognition accuracy of the
missed detection rate also drops from 4.01 % to 0.31 %. This is due to the ED-CNN model we built before optimization is only 76.54 %, but after
reduced selection of non-defective targets after filtering, leaving only a optimization, the recognition accuracy is improved to 87.65 %, close to
small number of Others similar to defects. At the same time, it reduces 90 %. The feature learning is optimized from three aspects of data
the meaningless coverage of some defect targets, resulting in a decrease expansion, optimization of the learning rate and improvement of model
in the number of missed inspections. overfitting. Of course, note that if more training samples were provided,
the imagine recognition accuracy of the AlexNet and VGG networks
might be improved.
5.2. Comparative study in classification From the result, we can learn that the classification result is not going
to be more accurate with the networks’ depth deepening. It is generally
After comparing the detection methods, a comparison of defects believed that increasing the number of network layers can improve the
classification is performed. Based on the data obtained, we realized 4 accuracy in some way. However, it also complicates the network, which
different depth networks to compare the effects of defects classification can lead to overfitting. Therefore, in order to ensure better network
including 3 kinds of classic networks, AlexNet (Krizhevsky et al., 2012), performance and robustness, a compact structure should be selected as
VGG16 and VGG19 (Simonyan and Zisserman, 2014). much as possible while designing so as to achieve the best accuracy.
The statistical classification results of the test set are shown in Based on this set of defect identification method, we developed the
Table 2. It can be seen from the table that under the test set of 50 un­ corresponding software and applied it to the actual quality inspection
trained original images including 324 targets of various types, the pro­ process of the enterprise. The results show that, compared with the
posed method shows the best classification performance in all the manual discrimination method, our method has higher stability and
methods implemented, with an accuracy 87.65 % out of 5 categories. improves the efficiency of defect identification while meeting the actual
Compared with the detection-only method for casting radiography im­ auxiliary decision-making needs.
ages, not only the detection and localization of defective targets but also
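As a quick check on Table 2, each test accuracy is the number of correctly classified targets divided by the 324 targets in the test set. The following minimal Python sketch recomputes the percentages from the counts reported in the table. The names `CORRECT_COUNTS` and `accuracy_percent` are introduced here for illustration only and are not part of the authors' software.

```python
# Counts of correctly classified targets, as reported in Table 2.
TOTAL_TARGETS = 324
CORRECT_COUNTS = {
    "AlexNet": 174,
    "VGG16": 128,
    "VGG19": 200,
    "ED-CNN-7 (before optimization)": 248,
    "ED-CNN-7 (after optimization)": 284,
}


def accuracy_percent(correct: int, total: int) -> float:
    """Return the classification accuracy in percent, rounded to two decimals."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= correct <= total:
        raise ValueError("correct must lie between 0 and total")
    return round(100.0 * correct / total, 2)


def accuracy_table(counts: dict, total: int) -> dict:
    """Map each method name to its accuracy in percent."""
    return {name: accuracy_percent(n, total) for name, n in counts.items()}


if __name__ == "__main__":
    for name, acc in accuracy_table(CORRECT_COUNTS, TOTAL_TARGETS).items():
        print(f"{name:32s} {acc:6.2f} %")
```

With these counts, the optimization of ED-CNN corresponds to 36 additional correctly classified targets (284 versus 248) out of 324.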


Fig. 16. Comparison of detection results on an entire defect image: (a) original image, (b) Selective search, (c) Filtered selective search (ours), (d) Prewitt edge detection, (e) OTSU threshold segmentation, (f) Comparison of detection accuracy before and after filtering.
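The false detection and missed detection rates compared in panel (f) are reported in the text only as percentages. The sketch below shows one plausible way such rates could be computed from raw counts. The definitions are assumptions made here for illustration: the false detection rate is taken as the share of proposed regions that contain no real defect, and the missed detection rate as the share of annotated defects not covered by any proposal. The class `DetectionCounts` and the example counts are hypothetical and are not data from the paper.

```python
from dataclasses import dataclass


@dataclass
class DetectionCounts:
    proposals: int        # regions proposed by the detector
    false_proposals: int  # proposed regions that contain no real defect
    true_defects: int     # defects annotated in the test images
    missed_defects: int   # annotated defects not covered by any proposal


def false_detection_rate(c: DetectionCounts) -> float:
    """Assumed definition: false proposals / all proposals, in percent."""
    return 100.0 * c.false_proposals / c.proposals


def missed_detection_rate(c: DetectionCounts) -> float:
    """Assumed definition: missed defects / all annotated defects, in percent."""
    return 100.0 * c.missed_defects / c.true_defects


def relative_reduction(before: float, after: float) -> float:
    """Relative drop of a rate, in percent of its value before filtering."""
    return 100.0 * (before - after) / before


if __name__ == "__main__":
    # Made-up counts, used only to exercise the functions.
    example = DetectionCounts(proposals=200, false_proposals=30,
                              true_defects=100, missed_defects=2)
    print(false_detection_rate(example), missed_detection_rate(example))
    # Rates reported in Section 5.1, before and after filtering.
    print(relative_reduction(49.1, 8.24), relative_reduction(4.01, 0.31))
```

Applied to the reported rates, `relative_reduction` gives a relative drop of about 83 % for false detections (49.1 % to 8.24 %) and about 92 % for missed detections (4.01 % to 0.31 %).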

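For readers unfamiliar with the two baseline methods in panels (d) and (e), the following pure-Python sketch illustrates the standard Prewitt gradient magnitude and Otsu threshold on a small grayscale image stored as a list of rows. It is a textbook formulation written for illustration, not the implementation used in the experiments; the function names and the toy image are assumptions.

```python
import math

# Standard 3x3 Prewitt kernels for horizontal and vertical gradients.
PREWITT_X = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]
PREWITT_Y = [[-1, -1, -1], [0, 0, 0], [1, 1, 1]]


def prewitt_magnitude(img):
    """Gradient magnitude per pixel; border pixels are left at 0."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    v = img[y + dy - 1][x + dx - 1]
                    gx += PREWITT_X[dy][dx] * v
                    gy += PREWITT_Y[dy][dx] * v
            out[y][x] = math.hypot(gx, gy)
    return out


def otsu_threshold(pixels, levels=256):
    """Return t maximizing the between-class variance (class 0: value <= t)."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * n for i, n in enumerate(hist))
    w0 = 0          # number of pixels in class 0
    sum0 = 0        # sum of gray levels in class 0
    best_t, best_var = 0, -1.0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mean0 - mean1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


if __name__ == "__main__":
    # Toy image: dark left half, bright right half (one vertical edge).
    image = [[0, 0, 0, 100, 100, 100] for _ in range(5)]
    edges = prewitt_magnitude(image)
    t = otsu_threshold([p for row in image for p in row])
    mask = [[1 if p > t else 0 for p in row] for row in image]
    print(edges[2], t, mask[2])
```

The sketch also shows why the two baselines behave differently: Prewitt responds to every intensity edge, including structural edges of the casting, whereas Otsu applies a single global threshold and therefore depends on defects and background separating cleanly in the gray-level histogram.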
Table 2
Classification results of different kinds of networks.

Method                                  Structure    Number of layers  Actual quantity  Correct quantity  Test accuracy
AlexNet (Krizhevsky et al., 2012)       5Conv-3FC    8                 324              174               53.70 %
VGG16 (Simonyan and Zisserman, 2014)    13Conv-3FC   16                324              128               39.51 %
VGG19 (Simonyan and Zisserman, 2014)    16Conv-3FC   19                324              200               61.73 %
ED-CNN-7 (ours) before optimization     5Conv-2FC    7                 324              248               76.54 %
ED-CNN-7 (ours) after optimization      5Conv-2FC    7                 324              284*              87.65 %*

* Bold means that the results are the best.

6. Conclusions

In this work, we propose filtered selective search and evenly distributed convolutional neural networks for casting defects recognition, which are used for the detection and classification of 5 kinds of internal objects in X-ray images of castings. These objects appear against complex backgrounds, their size range is wide, and the feature gaps between some categories are narrow. The method can be divided into two steps. The first is the filtered selective search of all suspicious targets. In this step, the conditions of the actual defects in castings are considered in order to filter out most of the non-defective targets. The second step is to predict the categories of these suspicious targets. In this step, the ED-CNN is constructed to learn the data. The training and validation set consists of manually cropped and labeled images containing 5 categories of objects, expanded from the original quantity to 3000 images, and another 50 untrained original images are used for testing. As a result, the average false detection rate is reduced to 8.24 % and the missed detection rate is lowered to 0.31 % in the detection step, after which a classification with 87.65 % accuracy is achieved in the label prediction step. The proposed method is compared with other methods, namely the original selective search, Prewitt edge detection, and OTSU threshold segmentation in the detection step, and AlexNet, VGG16 and VGG19 in the classification step. The results show that the proposed method has significant detection capabilities and better recognition performance.

With the combination and rapid development of image processing technologies and deep learning, CNNs will be applied more widely to visual tasks and will bring great changes to traditional image processing, especially for large-scale repetitive visual processing tasks. However, this also means that large amounts of accurate training data are needed in the supervised learning mode. In the future, CNNs will be used for other visual processing tasks in the foundry field involving different metal materials such as aluminum alloy and cast steel, and will partially or completely replace manual operations in the quality monitoring of castings. For the problem that some targets are too small to distinguish, high-resolution restoration methods (Bai et al., 2018) will also be applied in the field. Moreover, the quantitative data obtained will help the intelligent development of the foundry industry.

CRediT authorship contribution statement

Xiaoyuan Ji: Conceptualization, Methodology, Investigation, Resources, Data curation, Writing - review & editing, Writing - original draft, Funding acquisition. Qiuyu Yan: Data curation, Software, Writing - original draft. Dong Huang: Writing - review & editing, Validation. Bo Wu: Investigation, Writing - review & editing. Xiaojing Xu: Investigation, Validation. Aibin Zhang: Resources, Validation. Guanglan Liao: Supervision, Conceptualization. Jianxin Zhou: Supervision, Project administration, Funding acquisition. Menghuai Wu: Supervision, Methodology, Writing - review & editing.

Declaration of Competing Interest

The authors report no declarations of interest.

Acknowledgments

We gratefully acknowledge the support of the National Natural Science Foundation of China (No. 51905188, No. 51775205), the National Key Research and Development Program of China: Network Cooperative Manufacturing and Intelligent Factory (No. 2020YFB1710100), Investment Casting Collaborating Laboratory on Advanced Aeronautical Lightweight Alloy Materials of Huazhong University of Science & Technology and AECC Beijing Institute of Aeronautical Materials.

References

Bai, Y., Zhang, Y., Ding, M., Ghanem, B., 2018. Finding tiny faces in the wild with generative adversarial network. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 21–30.
Bottou, L., 2012. Stochastic Gradient Descent Tricks. Neural Networks: Tricks of the Trade, pp. 421–436.
Boureau, Y.L., Bach, F., LeCun, Y., Ponce, J., 2010. Learning mid-level features for recognition. IEEE Computer Society Conference on Computer Vision and Pattern Recognition 2559–2566.
Felzenszwalb, P.F., Huttenlocher, D.P., 2004. Efficient graph-based image segmentation. Int. J. Comput. Vision 59 (2), 167–181.
Gholizadeh, S., 2016. A review of non-destructive testing methods of composite materials. Procedia Struct. Integr. 1, 50–57.
Glorot, X., Bengio, Y., 2010. Understanding the difficulty of training deep feedforward neural networks. Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics 249–256.
Kline, D.M., Berardi, V.L., 2005. Revisiting squared-error and cross-entropy functions for training neural network classifiers. Neural Comput. Appl. 14 (4), 310–318.
Krizhevsky, A., Sutskever, I., Hinton, G., 2012. ImageNet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 25 (2).
Lampert, C.H., Blaschko, M.B., Hofmann, T., 2008. Beyond sliding windows: object localization by efficient subwindow search. 2008 IEEE Conference on Computer Vision and Pattern Recognition 1–8.
Lin, M., Chen, Q., Yan, S., 2014. Network in network. International Conference on Learning Representations.
Luo, J., Zhang, Z., 2003. Automatic colour printing inspection by image processing. J. Mater. Process. Technol. 139 (1-3), 373–378.
Makantasis, K., Protopapadakis, E., Doulamis, A., Doulamis, N., Loupos, C., 2015. Deep Convolutional Neural Networks for efficient vision based tunnel inspection. IEEE International Conference on Intelligent Computer Communication & Processing 335–342.
Nair, V., Hinton, G.E., 2010. Rectified linear units improve restricted Boltzmann machines. Proceedings of the 27th International Conference on Machine Learning (ICML-10) 807–814.
Shrivakshan, G.T., Chandrasekar, C., 2012. A comparison of various edge detection techniques used in image processing. Int. J. Comput. Sci. Issues (IJCSI) 9 (5), 269–276.
Silva, R.R.D., Mery, D., 2007. Accuracy estimation of detection of casting defects in X-ray images using some statistical techniques. Insight-Non-Destructive Testing Condition Monit. 49 (10), 603–609.
Simonyan, K., Zisserman, A., 2014. Very deep convolutional networks for large-scale image recognition. Comput. Sci.
Sreedhar, U., Krishnamurthy, C.V., Balasubramaniam, K., Raghupathy, V.D., Ravisankar, S., 2012. Automatic defect identification using thermal image analysis for online weld quality monitoring. J. Mater. Process. Technol. 212 (7), 1557–1566.
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R., 2014. Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15 (1), 1929–1958.
Tokhy, M.S.E., Saad, M.H., 2017. Automatic detection algorithm of defects in casting radiography images based on cepstral coefficients. Arab. J. Nucl. Sci. Appl. 50 (4), 18–28.
Uijlings, J.R.R., van de Sande, K.E.A., Gevers, T., Smeulders, A.W.M., 2013. Selective search for object recognition. Int. J. Comput. Vision 104 (2), 154–171.
Vala, H.J., Baxi, A., 2013. A review on Otsu image segmentation algorithm. Int. J. Adv. Res. Comput. Eng. Technol. (IJARCET) 2 (2), 387–389.
Van de Sande, K.E.A., Uijlings, J.R., Gevers, T., Smeulders, A.W., 2012. Segmentation as selective search for object recognition. IEEE International Conference on Computer Vision 1879–1886.
Vijayaram, T.R., Sulaiman, S., Hamouda, A.M.S., Ahmad, M.H.M., 2006. Foundry quality control aspects and prospects to reduce scrap rework and rejection in metal casting manufacturing industries. J. Mater. Process. Technol. 178 (1-3), 39–43.
Wu, B., Zhou, J., Ji, X., Yin, Y., Shen, X., 2020. An ameliorated teaching–learning-based optimization algorithm based study of image segmentation for multilevel thresholding using Kapur’s entropy and Otsu’s between class variance. Inform. Sci. 72–107.
Yang, Y., Yang, B., Zhu, S., Chen, X., 2015. Online quality optimization of the injection molding process via digital image processing and model-free optimization. J. Mater. Process. Technol. 226, 85–98.
