
Proceedings, 5th IFAC Workshop on Mining, Mineral and Metal Processing
Shanghai, China, August 23-25, 2018

Available online at www.sciencedirect.com
ScienceDirect
IFAC PapersOnLine 51-21 (2018) 76–81

Real-time Detection of Steel Strip Surface Defects Based on Improved YOLO Detection Network

First A. Jiangyun Li ∗  Second B. Zhenfeng Su ∗  Third C. Jiahui Geng ∗  Corresponding author. Yixin Yin ∗∗

∗ Key Laboratory of Knowledge Automation for Industrial Processes, Ministry of Education, School of Automation & Electrical Engineering, University of Science and Technology Beijing, Beijing 100083, China (e-mail: leejy@ustb.edu.cn)
∗∗ School of Automation & Electrical Engineering, University of Science and Technology Beijing, Beijing 100083, China (e-mail: yyx@ies.ustb.edu.cn)

Abstract: The surface defects of steel strip have diverse and complex features, and surface defects caused by different production lines tend to have different characteristics. Therefore, the detection algorithms for the surface defects of steel strip should have good generalization performance. Aiming at detecting surface defects of steel strip, we established a dataset of six types of surface defects on cold-rolled steel strip and augmented it in order to reduce over-fitting. We improved the You Only Look Once (YOLO) network and made it all convolutional. Our improved network, which consists of 27 convolution layers, provides an end-to-end solution for the surface defects detection of steel strip. We evaluated the six types of defects with our network and reached performance of 97.55% mAP and 95.86% recall rate. Besides, our network achieves 99% detection rate with speed of 83 FPS, which provides methodological support for real-time surface defects detection of steel strip. It can also predict the location and size information of defect regions, which is of great significance for evaluating the quality of an entire steel strip production line.

© 2018, IFAC (International Federation of Automatic Control) Hosting by Elsevier Ltd. All rights reserved.

Keywords: Surface quality; Defect Detection; Steel Strip; Improved YOLO Network; Convolutional Neural Network

1. INTRODUCTION

Due to the influence of raw materials, rolling process and system control, etc., steel strip in the production process may have scars, scratches, insect prints, inclusions, bright prints, burrs, seams, black burn, iron scales, pollution and other defects. The defect images are shown in Fig. 1. These defects not only affect the steel strip surface appearance, but also damage the wear resistance, corrosion resistance, high temperature resistance and fatigue strength of the steel strip. Therefore, it is very important to detect the surface defects of the steel strip for improving the steel strip production quality.

Fig. 1. Several types of surface defects on steel strip

However, there are many factors that make real-time detection of steel strip surface defects particularly difficult, such as the high-speed production line, diversity and large scale changes of defects, random distribution and non-defective interferences (oil stains and dust on the surface of steel strips).

Using Convolutional Neural Networks (CNNs), we can automatically extract multi-scale features of steel strip surface defects with good generalization and high accuracy by using a general-purpose learning procedure (LeCun et al., 2015). Using a trained network, defect regions can be detected in milliseconds. Therefore, CNNs can provide an accurate, real-time detection method for surface defects in steel strip production lines, and improve the product quality of steel strips.

This work was supported by the Fundamental Research Funds for the China Central Universities of USTB (FRF-BR-17-004A, FRF-GF-17-B49). Meanwhile, this work was also supported by the Open Project Program of the National Laboratory of Pattern Recognition (NLPR, 201800027).

2405-8963 © 2018, IFAC (International Federation of Automatic Control) Hosting by Elsevier Ltd. All rights reserved.
Peer review under responsibility of International Federation of Automatic Control.
10.1016/j.ifacol.2018.09.412

2. RELATED WORK

The existing surface defect detection methods are mainly based on classical machine learning algorithms, which are coarsely divided into three main stages: image preprocessing, feature extraction, and classification. However, these algorithms require manually designed feature extractors, and the hand-crafted features are heavily dependent on expert knowledge and require a lot of manpower (LeCun et al., 2015)(Bengio et al., 2013). An adaptive segmentation algorithm was proposed in (Ma et al., 2017) to adaptively segment defect regions based on the gray features of the metal surface, but the types of defects cannot be distinguished. A Tetrolet-based method was proposed in (Ke et al., 2016) to recognize the surface defects of steel strips. After extracting the sub-band characteristics of surface defects in different scales and directions, a Support Vector Machine (SVM) classifier was used to classify different types of surface defects. However, it took 0.239 seconds to extract features from a single defect image during testing, which is too long to meet the real-time detection requirements. Hu et al. extracted four kinds of defect features and transformed them to a 38-dimensional feature vector in (Hu et al., 2016), and an optimized SVM classifier was trained to classify 5 types of 101 defect images.

Deep multi-layer architectures of CNNs are capable of extracting more powerful features than hand-crafted features, and all of the features are extracted from training data automatically by using the backpropagation algorithm (LeCun et al., 2015)(Bengio et al., 2013). The convolutional networks provide an end-to-end solution from raw defect images to predictions, thereby alleviating the requirement to manually extract suitable features (Sermanet et al., 2013). What's more, objects can be detected in a few milliseconds, with accurate location and size information, via convolutional detection networks (Redmon et al., 2016)(Redmon and Farhadi, 2016).

In view of the above problems, this paper adopted the You Only Look Once (YOLO) network, a convolutional detection network, to automatically extract multi-scale features of steel strip surface defects and detect the defect regions. Similar to (Springenberg et al., 2014)(Radford et al., 2015), we replaced the pooling layers with convolutional layers and made it all convolutional, allowing the network to learn its own spatial downsampling. Our network can simultaneously predict the class, location and size information of defect regions, which is very important to improve the quality of steel strips in production lines. According to our experimental results, only 12 milliseconds are needed to detect a raw defect image, which fully meets the real-time requirements of defect detection tasks in steel strip production lines.

Table 1. Dataset of strip surface defect images

           Scar   Scratch   Inclusion   Burr   Seam   Iron scale   Total
Training   673    596       572         448    591    575          3455
Test       200    200       200         200    200    200          1200
Total      873    796       772         648    791    775          4655

3. NETWORK ARCHITECTURE

The YOLO detection network was first proposed by (Redmon et al., 2016) in 2015 and used for object detection tasks (Redmon and Farhadi (2016); Girshick (2015); Ren et al. (2015)). In the YOLO detection network, the convolutional layers extract image features, the softmax classifier classifies the object classes, and the bounding boxes are predicted to locate the position of the object. In this paper, we constructed an all convolutional YOLO detection network to detect the strip surface defects, which not only improves the accuracy and detection speed, but also precisely locates the defects. What's more, we added the prediction of the surface defect size. The all convolutional YOLO detection network does not require cumbersome steps and is an end-to-end strip surface defect detection network.

3.1 Detection Principle

The YOLO network divides the input image into an S × S grid. Meanwhile, the convolutional layers are designed to extract the defect features. For each grid cell, the network determines whether the cell contains defects and identifies the defect categories according to the extracted defect features. The detection procedure is shown in Fig. 2.

Fig. 2. YOLO flow diagram

The YOLO detection network also predicts B bounding boxes and the confidence of each bounding box. The bounding boxes locate the position of defects in the images, and the confidence score reflects how confident the network is in the predicted bounding box. Formally we define the confidence score as $\mathrm{confidence} = \Pr(\mathrm{Object}) \times \mathrm{IOU}^{\mathrm{truth}}_{\mathrm{b\text{-}box}}$. $\Pr(\mathrm{Object})$ represents the probability of defects in the grid cell, and $\mathrm{IOU}^{\mathrm{truth}}_{\mathrm{b\text{-}box}}$ represents the overlapping rate between the bounding box and the ground truth (as shown in Fig. 3). The NMS (Non-Maximum Suppression) method is adopted to remove the redundant bounding boxes.

Fig. 3. IOU defines the overlap of two bounding boxes. The IOU of rectangular boxes A and B is calculated as $\mathrm{IOU} = \frac{A \cap B}{A \cup B}$, which is the proportion of the overlapping area of A and B to the total area.

The network predicts 5 values for each bounding box: x, y, w, h and confidence. The (x, y) coordinates represent the center of the defect, and (w, h) represent the width and height of each box. The confidence is as described above.

The probability of a defect appearing in a box is defined as $\Pr(\mathrm{Class} \mid \mathrm{Object})$. At test time we multiply the class confidence score and the bounding box confidence score, as defined in equation (1). The product $\Pr(\mathrm{Class}_i) \times \mathrm{IOU}^{\mathrm{truth}}_{\mathrm{b\text{-}box}}$ provides class-specific confidence scores for each box. All the class confidences and the bounding boxes in each grid cell are finally encoded as an S × S × (5 + c) × B tensor.
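
As a minimal sketch of the quantities defined in this subsection, the following Python code computes the IOU of two boxes, the box confidence Pr(Object) × IOU, and a greedy form of non-maximum suppression. The function names, the corner-form box layout (x1, y1, x2, y2) and the 0.5 overlap threshold are illustrative assumptions; they are not taken from the paper.

```python
def center_to_corners(x, y, w, h):
    """Convert a (center x, center y, width, height) box to (x1, y1, x2, y2)."""
    return (x - w / 2.0, y - h / 2.0, x + w / 2.0, y + h / 2.0)


def iou(box_a, box_b):
    """Intersection over union of two corner-form boxes."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Width and height of the overlap are clamped at zero for disjoint boxes.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def box_confidence(p_object, iou_with_truth):
    """Confidence score: Pr(Object) x IOU between the box and the ground truth."""
    return p_object * iou_with_truth


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-maximum suppression; returns indices of the kept boxes.

    The threshold of 0.5 is an assumption of this sketch.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with a kept box.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


def output_depth(num_classes, boxes_per_cell):
    """Per-cell depth of the output tensor: (5 + c) x B."""
    return (5 + num_classes) * boxes_per_cell


if __name__ == "__main__":
    print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
    print(nms([(0, 0, 2, 2), (0.1, 0, 2.1, 2), (5, 5, 6, 6)], [0.9, 0.8, 0.7]))
    print(output_depth(6, 5))
```

With 6 defect classes and 5 boxes per cell, `output_depth(6, 5)` gives 55, which matches the 13 × 13 × 55 output tensor reported in Section 3.2.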


$$\Pr(\mathrm{Class}_i \mid \mathrm{Object}) \times \Pr(\mathrm{Object}) \times \mathrm{IOU}^{\mathrm{truth}}_{\mathrm{b\text{-}box}} = \Pr(\mathrm{Class}_i) \times \mathrm{IOU}^{\mathrm{truth}}_{\mathrm{b\text{-}box}} \qquad (1)$$

3.2 Improved YOLO Network

We constructed an all convolutional YOLO network with 27 convolutional layers. The first 25 convolutional layers are used to extract steel strip surface defect features, while the last two convolutional layers predict the defect categories and bounding boxes. The network structure is shown in Fig. 4.

Fig. 4. Architecture of improved YOLO detection network.

Similar to (Lin et al., 2013), our network simply uses successive 3 × 3 convolutional layers followed by 1 × 1 reduction layers. The 3 × 3 filters extract the defect features from input images, and the 1 × 1 kernels are used to reduce the feature space of the previous feature map. According to (Springenberg et al., 2014)(Radford et al., 2015), max pooling layers can be replaced by convolution layers with a stride of 2 without loss in accuracy on several image recognition benchmarks. Besides, the convolution layers allow the network to learn its own spatial downsampling rather than using deterministic spatial downsampling. In this paper, we replaced the max-pooling functions of the original network with 3 × 3 (stride = 2) convolutional functions and achieved a slight increase in accuracy of 0.6%.

The network predicts defect category information and bounding box information on the 13 × 13 feature maps (S = 13). This is sufficient for large-scale defects. However, some defect features will be lost during the convolution process. In addition, small-scale defects are often undetectable. To capture fine-grained features and improve the detection accuracy of small-scale defects, we add higher-resolution feature maps. By using the passthrough layer in (Redmon and Farhadi, 2016), the 26 × 26 × 512 feature maps are transformed into 13 × 13 × 2048 feature maps and concatenated with the original 13 × 13 × 1024 feature maps. The network returns the class and bounding box information from the resulting 13 × 13 × 3072 feature maps. In this paper, we mainly detect 6 types of strip surface defects and predict 5 bounding boxes in each grid cell, so we finally get a 13 × 13 × 55 tensor.

3.3 Defect database

We get the steel strip surface defect data from the cold-rolled steel strip production line. The database mainly includes scratches, inscriptions and other dozens of strip surface defect images. Due to the limited number of defect images that can be collected on the actual cold-rolled strip production line, defects of several types are too few to effectively extract defect features. Six types of defects are selected to be detected in this paper, namely scars, scratches, inclusions, burrs, seams and iron scales.

Each defect image was cut to 300 × 300 before being sent to our network, and each image has obvious defects. The surface defect database contains 4655 steel strip surface defect images in 6 classes. The details of our dataset are shown in Table 1. Prior to training the network, ground truth annotations were performed on all defect images manually.

3.4 Training

The loss function of the original YOLO network has proved effective, so we adopt the same loss function in our all convolutional network.

Loss function. For ease of optimization, the YOLO detection network uses the sum-squared error in the loss function. However, the sum-squared error weights localization error equally with classification error, which does not perfectly align with the goal of maximizing average precision. In every image many grid cells don't contain any defects. This pushes the confidence scores of no-defect cells towards zero, often overpowering the gradient from the cells that do contain defects, which can lead to model instability.

To remedy this, the YOLO network increases the loss from bounding box coordinate predictions and decreases the loss from confidence predictions for no-defect boxes. The YOLO network uses two parameters ($\lambda_{\mathrm{coord}} = 5$, $\lambda_{\text{no-defect}} = 0.5$) to accomplish this.

In order to improve the detection of small-scale defects, the YOLO network increases the relative weight of errors on small bounding boxes by computing the squared difference on the square roots of the bounding box width and height in the loss function. The optimized loss function is given in (2):

$$
\begin{aligned}
\mathrm{Loss} ={}& \lambda_{\mathrm{coord}} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{\mathrm{defect}} \left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 \right] \\
&+ \lambda_{\mathrm{coord}} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{\mathrm{defect}} \left[ \left(\sqrt{w_i} - \sqrt{\hat{w}_i}\right)^2 + \left(\sqrt{h_i} - \sqrt{\hat{h}_i}\right)^2 \right] \\
&+ \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{\mathrm{defect}} (C_i - \hat{C}_i)^2 \\
&+ \lambda_{\text{no-defect}} \sum_{i=0}^{S^2} \sum_{j=0}^{B} \mathbb{1}_{ij}^{\text{no-defect}} (C_i - \hat{C}_i)^2 \\
&+ \sum_{i=0}^{S^2} \mathbb{1}_{i}^{\mathrm{defect}} \sum_{c \in \mathrm{classes}} (p_i(c) - \hat{p}_i(c))^2
\end{aligned}
\qquad (2)
$$

where $\mathbb{1}_{i}^{\mathrm{defect}}$ denotes whether any kind of defect appears in cell i, and $\mathbb{1}_{ij}^{\mathrm{defect}}$ denotes that the jth bounding box predictor in cell i is responsible for that prediction.

Optimization. In order to improve the accuracy and speed of defect detection, we have adopted some training strategies in the training process.

Multi-scale input. A network trained on low-resolution images has high test speed but low accuracy, while a network trained on high-resolution images shows high test accuracy but does not meet the speed requirement. When training our network, we changed the fixed 416 × 416 input resolution to a variable input resolution. We define a set of selectable input resolutions {224, 256 · · · 416, 448}, and the network changes the input size every 10 iterations. This strategy ensures that the network extracts features from images of different scales, and the detection results can be traded off between accuracy and speed.
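
The resolution switching described above can be sketched as follows. This is only an illustration under two assumptions that the text does not state: that the elided resolutions in the set continue in steps of 32 pixels, and that each new size is drawn uniformly at random. The function names are hypothetical.

```python
import random


def candidate_resolutions(low=224, high=448, step=32):
    """Selectable square input sizes; the step of 32 is an assumption."""
    return list(range(low, high + 1, step))


def resolution_schedule(num_iterations, period=10, seed=0):
    """Return the input size used at each training iteration.

    A new size is chosen every `period` iterations (10 in the text).
    Uniform random choice is an assumption of this sketch.
    """
    rng = random.Random(seed)
    sizes = candidate_resolutions()
    schedule = []
    current = None
    for iteration in range(num_iterations):
        if iteration % period == 0:
            current = rng.choice(sizes)
        schedule.append(current)
    return schedule


if __name__ == "__main__":
    print(candidate_resolutions())
    print(resolution_schedule(30))
```

In a real training loop, the size returned for the current iteration would be used to resize the batch before the forward pass; because the network is all convolutional, the same weights apply at every resolution.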
Batch Normalization. During training, the network does not process all the images at the same time, but first divides them into several batches. Our network utilizes Batch Normalization (Ioffe and Szegedy, 2015) to normalize the data for each batch, as shown in (3). Here $x_i$ denotes the activation input, and the batch size is m.

$$
\begin{aligned}
\mu_\beta &\leftarrow \frac{1}{m} \sum_{i=1}^{m} x_i \\
\sigma_\beta^2 &\leftarrow \frac{1}{m} \sum_{i=1}^{m} (x_i - \mu_\beta)^2 \\
\hat{x}_i &\leftarrow \frac{x_i - \mu_\beta}{\sqrt{\sigma_\beta^2}} \\
y_i &\leftarrow \gamma \hat{x}_i + \beta \equiv \mathrm{BN}_{\gamma,\beta}(x_i)
\end{aligned}
\qquad (3)
$$

Batch normalization leads to a significant improvement in convergence and helps regularize the model, reducing the need for other forms of regularization. By adding batch normalization on all of the convolutional layers we get more than 2% improvement in mAP.

Multi-scale features. YOLO extracts defect features directly from the entire image, making full use of the contextual information of the original image to ensure a high recall rate. In our network, feature maps of different scales are jointly connected to predict the classification information and bounding boxes, which reduces the loss of information in defect images (Lin et al., 2017).

Data augmentation. When the training data is not sufficient, data augmentation (Krizhevsky et al., 2012) can expand the data set and increase the diversity of the training data. Data augmentation can also reduce overfitting. Before training the network, we performed sharpness augmentation and contrast augmentation on some of the defect images. During the training process, we randomly scaled and cropped the defect images.

Activation function. We use a linear activation function for the final layer, and all other layers use the following leaky rectified linear activation (4):

$$\phi(x) = \begin{cases} x, & x > 0 \\ 0.1x, & \text{otherwise} \end{cases} \qquad (4)$$

4. EXPERIMENTS

We trained the improved YOLO network for 50,000 iterations on the 6 types of defect images. Throughout training we use stochastic gradient descent with a batch size of 64, a momentum of 0.9 and a decay of 0.0005. The learning rate was initialized at 0.01 and was divided by 10 after every 12000 iterations. The training process took 12 hours on two NVIDIA GTX 1080Ti GPUs. From the data recorded during training, we obtain the loss attenuation curve and the recall curve, as shown in Fig. 5 and Fig. 6.

Fig. 5. The loss decay curve.
Fig. 6. The recall decay curve.

The test defect images were detected with the trained network, and 1200 images were completed within 13 seconds, achieving 97.55% mAP and 95.86% recall rate. The detection details are shown in Table 2, and the detection results are shown in Fig. 7.

Table 2. Detection results (mAP & Recall)

         Scar     Scratch   Inclusion   Burr     Seam     Iron scale   Avg.
mAP      97.54%   99.02%    97.20%      97.56%   97.07%   97.45%       97.55%
Recall   92.29%   98.70%    95.29%      99.17%   93.17%   99.26%       95.86%

4.1 Comparison with traditional methods

Traditional algorithms mainly focus on the defect classification problem and usually cannot locate defects or predict their size, such as the SVM classification in (Hu et al., 2016) and some machine learning algorithms in (Ke et al., 2016)(Guo et al., 2017). Our improved YOLO detection network can not only classify defect images, but also accurately obtain the position and size information of defects. We compared our

detection results with traditional methods; the results are shown in Table 3.

Table 3. Comparisons with other methods

Method                                   Task                                Accuracy   Inference time per image   Dataset size
M-pooling CNN (Masci et al., 2012)       classification                      93.03%     unknown                    2927
HCGA (Hu et al., 2016)                   classification                      95.04%     0.158 s                    351
HSVM-MC (Chu et al., 2017)               classification                      95.18%     1.1044 s                   900
Infrared imaging (Zhang, 2011)           classification                      95.42%     unknown                    1200
Contourlet transform (Xu et al., 2013)   classification                      96.46%     0.103 s                    868
Tetrolet transform (Ke et al., 2016)     classification                      97.38%     0.239 s                    868
Ours                                     classification + location + scale   97.55%     0.012 s                    4655

As can be seen in Table 3, the YOLO detection network is significantly higher in classification accuracy than the shallow neural network in (Masci et al., 2012) and the SVM classifier in (Chu et al., 2017), and slightly higher than the Tetrolet transform in (Ke et al., 2016) and the hybrid chromosomal genetic algorithm in (Hu et al., 2016). Since the number of defect images in this experiment far exceeds the number in these references, the defect features that our network extracted have better generalization.

Fig. 7. Detection results of 6 types of surface defects on steel strip.

4.2 Real-time analysis

The average inference time for our network to detect a strip surface image is only 0.012 s. This means our network can process 83 defect images per second, which is more than ten times faster than the methods in (Ke et al., 2016)(Hu et al., 2016). The maximum speed of the actual production line is 30 m/s and the field of view of a single camera is 50-100 cm. Covering 30 m of strip per second with frames that each span 0.5-1 m requires the defect detector to run at 30-60 FPS. Our improved YOLO detection network achieves a detection speed of 83 FPS, which fully meets the real-time detection speed requirements of the actual production lines.

4.3 The detection rate

When the network only needs to detect defects without classifying their categories, it achieves a 99% detection rate, with only 1% of defects missed. With the growth and accumulation of online defect data, the performance of our network can be further improved.

4.4 Defect scale accuracy

According to the experiments, our network can detect defects with a minimum area of 10 square millimeters.

4.5 Future outlook

Deep learning is a data-driven learning method, and the amount of data directly affects the learning results. If a larger number and variety of strip surface defect images are available to train our network, the network will have better performance and higher accuracy. Meanwhile, the location and scale of the defects will be more accurate.

5. CONCLUSIONS

We established a steel strip surface defect database which contains six types of surface defects on cold-rolled steel strip. We detected the surface defects by constructing an

all convolutional YOLO detection network. The results show that our network achieves a 97.55% mAP, a 95.86% recall rate and a 99% detection rate. The network provides an end-to-end detection solution for strip surface defects and achieves a detection speed of 83 FPS, making the real-time detection of strip surface defects more effective.

The improved YOLO detection network can predict the location and scale of surface defects on the entire strip production line, which is of great significance for improving the product quality of strip steel production. Given more types and larger quantities of strip surface defect data, the detection accuracy of this method can be further improved.

REFERENCES

Bengio, Y., Courville, A., and Vincent, P. (2013). Representation learning: A review and new perspectives. IEEE transactions on pattern analysis and machine intelligence, 35(8), 1798–1828.
Chu, M., Zhao, J., Gong, R., and Liu, L. (2017). Steel surface defects recognition based on multi-label classifier with hyper-sphere support vector machine. In Control And Decision Conference (CCDC), 2017 29th Chinese, 3276–3281. IEEE.
Girshick, R. (2015). Fast r-cnn. arXiv preprint arXiv:1504.08083.
Guo, H., Shao, W., and Zhou, A. (2017). Novel defect recognition method based on adaptive global threshold for highlight metal surface. Chinese Journal of Scientific Instrument, 38(11), 2797–2804.
Hu, H., Liu, Y., Liu, M., and Nie, L. (2016). Surface defect classification in large-scale strip steel image collection via hybrid chromosome genetic algorithm. Neurocomputing, 181, 86–95.
Ioffe, S. and Szegedy, C. (2015). Batch normalization: Accelerating deep network training by reducing internal covariate shift. In International conference on machine learning, 448–456.
Ke, X.U., Lei, W., and Wang, J. (2016). Surface defect recognition of hot-rolled steel plates based on tetrolet transform. Journal of Mechanical Engineering.
Krizhevsky, A., Sutskever, I., and Hinton, G.E. (2012). Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems, 1097–1105.
LeCun, Y., Bengio, Y., and Hinton, G. (2015). Deep learning. Nature, 521(7553), 436.
Lin, M., Chen, Q., and Yan, S. (2013). Network in network. arXiv preprint arXiv:1312.4400.
Lin, T.Y., Dollár, P., Girshick, R., He, K., Hariharan, B., and Belongie, S. (2017). Feature pyramid networks for object detection. In CVPR, volume 1, 4.
Ma, Y., Li, Q., He, F., Yan, L., and Xi, S. (2017). Adaptive segmentation algorithm for metal surface defects. Chinese Journal of Scientific Instrument.
Masci, J., Meier, U., Ciresan, D., Schmidhuber, J., and Fricout, G. (2012). Steel defect classification with max-pooling convolutional neural networks. In Neural Networks (IJCNN), The 2012 International Joint Conference on, 1–6. IEEE.
Radford, A., Metz, L., and Chintala, S. (2015). Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv preprint arXiv:1511.06434.
Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2016). You only look once: Unified, real-time object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition, 779–788.
Redmon, J. and Farhadi, A. (2016). Yolo9000: better, faster, stronger. arXiv preprint, 1612.
Ren, S., He, K., Girshick, R., and Sun, J. (2015). Faster r-cnn: Towards real-time object detection with region proposal networks. In Advances in neural information processing systems, 91–99.
Sermanet, P., Eigen, D., Zhang, X., Mathieu, M., Fergus, R., and LeCun, Y. (2013). Overfeat: Integrated recognition, localization and detection using convolutional networks. arXiv preprint arXiv:1312.6229.
Springenberg, J.T., Dosovitskiy, A., Brox, T., and Riedmiller, M. (2014). Striving for simplicity: The all convolutional net. arXiv preprint arXiv:1412.6806.
Xu, K., Ai, Y.H., Zhou, P., and Yang, C.L. (2013). Recognition of surface defects in continuous casting slabs based on contourlet transform. Journal of University of Science Technology Beijing, 35(9), 1195–1200.
Zhang, X. (2011). Vision inspection of metal surface defects based on infrared imaging. Acta Optica Sinica, 31(3), 0312004.
