
Accepted Manuscript

LPR-Net: Recognizing Chinese License Plate in Complex Environments

Di Wang, Yumin Tian, Wenhui Geng, Lin Zhao, Chen Gong

PII: S0167-8655(18)30699-8
DOI: https://doi.org/10.1016/j.patrec.2018.09.026
Reference: PATREC 7325

To appear in: Pattern Recognition Letters

Received date: 2 May 2018


Revised date: 28 August 2018
Accepted date: 27 September 2018

Please cite this article as: Di Wang, Yumin Tian, Wenhui Geng, Lin Zhao, Chen Gong, LPR-Net: Rec-
ognizing Chinese License Plate in Complex Environments, Pattern Recognition Letters (2018), doi:
https://doi.org/10.1016/j.patrec.2018.09.026

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service
to our customers we are providing this early version of the manuscript. The manuscript will undergo
copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please
note that during the production process errors may be discovered which could affect the content, and
all legal disclaimers that apply to the journal pertain.

Highlights

• An end-to-end Chinese license plate recognition method named LPR-Net is proposed.
• LPR-Net avoids the problem of accumulating errors and enhances recognition accuracy.
• An effective scheme based on batch normalization is used to accelerate the learning procedure of LPR-Net.
• LPR-Net outperforms state-of-the-art methods in terms of both recognition accuracy and robustness in complex environments.


Pattern Recognition Letters
journal homepage: www.elsevier.com

LPR-Net: Recognizing Chinese License Plate in Complex Environments

Di Wang a,b, Yumin Tian a,∗∗, Wenhui Geng a, Lin Zhao c,b, Chen Gong c,b

a School of Computer Science and Technology, Xidian University, Xi'an, 710071, China
b State Key Laboratory of Integrated Services Networks, Xidian University, Xi'an, 710071, China
c Key Laboratory of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education, School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, 210094, China

∗∗ Corresponding author. E-mail: ymtian@mail.xidian.edu.cn (Yumin Tian)

ABSTRACT

License plate recognition (LPR) technology has been attracting increasing interest during recent years for its exclusive role in real-world intelligent traffic management systems. Owing to its importance, numerous LPR methods have been developed. These methods are generally composed of three processing steps, i.e. license plate location, character segmentation and character recognition. However, the three-step scheme always yields unsatisfactory recognition performance in challenging complex environments like uneven illumination, adverse atmospheric conditions, complex backgrounds, unclear vehicle plates, low-quality surveillance cameras, etc. In such scenes, the obtained license plates are usually not clear, which causes imprecise results of localization and segmentation. Consequently, the recognition capacity is inadequate, as its performance highly depends on the effects of localization and segmentation. To address these challenges, we propose a novel Chinese vehicle license plate recognition method that directly recognizes the license plate through an end-to-end deep learning architecture named license plate recognition net (LPR-Net). LPR-Net is a hybrid deep architecture that consists of a residual network for extracting basic features, a multi-scale net for extracting multi-scale features, a regression net for locating the plate and characters, and a classification net for recognition. Moreover, an effective scheme based on batch normalization is used to accelerate training speed in the learning procedure. Extensive experiments demonstrate that the proposed method achieves excellent recognition accuracy and works more robustly and efficiently than state-of-the-art methods in complex environments.

© 2018 Elsevier Ltd. All rights reserved.

1. Introduction

License plate recognition (LPR) is an important research topic in computer vision [1], pattern recognition [2], and visual analysis [3]. It is widely employed in parking management, electronic toll collection, vehicle tracking, traffic control administration, etc. Owing to its importance, several LPR algorithms have been proposed for the automatic recognition of license plates. These LPR algorithms are generally composed of the following three processing steps: license plate location, character segmentation and character recognition. Each of these parts plays an important role in the final recognition accuracy.

License plate localization (LPL) is the first and foremost stage of LPR, which extracts the license plate region according to some defined conditions or properties [4]. The results of LPL will directly influence the following character segmentation and recognition stages. Typical LPL methods include edge-detection-based methods [5] and color-based methods [6][7]. Edge-detection-based LPL methods extract texture features to locate license plates. However, they are susceptible to invalid areas around license plates when there are many texture features. Color-based LPL methods take advantage of the fixed color collocations of license plates to narrow down the searching range. However, color-based methods may become invalid when there are regions in the license plate image whose color is similar to that of the car body. In order to avoid the aforementioned drawbacks, many LPL methods extract character features [8] and texture features [9] to locate license plates. Nevertheless, these methods can easily be affected by noise and by disturbing characters

outside the region of license plates. Recently, a color-depressed gray scale conversion method [10] was proposed to locate Chinese license plates. It achieves a high location rate of 98.95%. However, this method is easily affected by the color of the vehicle body.

Character segmentation (CS) is the second step of LPR, which locates each character in the license plate area. There are many CS methods based on connected component analysis (CCA) [11], projection [12], grey-level quantization [13], morphology analysis [14], and template matching [15][16]. The CCA-based methods select connected areas as candidates by scanning the entire license plate. They require unbroken binary characters and relatively large intervals between the frame and the characters. Projection-based methods project the extracted binary license plate vertically to confirm the starting and ending coordinates of the characters, and then project the extracted region horizontally to extract each character separately. However, a Chinese character such as chuan (川) may be divided into several characters by projection-based methods. Grey-level based methods separate character and background by different gray levels, which makes them susceptible to brightness. Template matching based methods extract character candidate regions that satisfy the matching threshold by sliding a window within the license plate region. However, a single template cannot accommodate skewed characters and complex circumstances. Extremal regions (ERs) based methods generate character candidates by searching extremal regions, and then the selected candidates are classified by support vector machines. All the above character segmentation methods need some involved parameters to be set appropriately by hand, so these methods are not robust enough for real-world applications.

Character recognition (CR) is the final task of LPR. Many classification techniques have been utilized for CR, such as artificial neural networks [17][18], support vector machines (SVM) [19][20], k-nearest neighbor classifiers [21][22], AdaBoost classifiers [23], and Bayesian classifiers [24]. Recently, the convolutional neural network (CNN) has become the most widely used character recognition method, achieving high recognition rates [25][26]. Character recognition is the most robust and reliable module in an LPR system. Usually, characters can be correctly identified as long as the results of localization and segmentation are accurate enough.

The overwhelming majority of the existing license plate recognition methods use the above three-step processing scheme to recognize license plates [27][28]. However, the three-step scheme causes unsatisfactory recognition performance in challenging complex environments like uneven illumination, adverse atmospheric conditions, complex backgrounds, unclear vehicle plates, and low-quality surveillance cameras [29]. In complex environments, it is difficult to obtain a precise result for each step. As the later step strongly depends on the former step, the incompetence of the former step will seriously affect the performance of the later step. Therefore, errors of localization and segmentation are accumulated and hinder the final recognition performance.

To overcome the aforementioned limitations, we propose a novel Chinese license plate recognition method based on deep learning, namely the license plate recognition neural network (LPR-Net), to improve the recognition accuracy in complex environments. The proposed LPR-Net is a hybrid deep architecture that consists of a basic net, a multi-scale net, a regression net, and a classification net. Fig. 1 illustrates the flowchart of the proposed LPR-Net. It first extracts deep basic features with the basic net. Then multi-scale features are extracted by the multi-scale net in order to adapt to plates of different sizes. Thirdly, the proposed LPR-Net locates plates and characters with a regression net. And finally, a classification net is designed to identify the characters. LPR-Net is trained by the back propagation method in an end-to-end way, and the batch normalization (BN) algorithm is used to accelerate the training. Compared with existing works, the main contributions of the proposed LPR-Net are summarized as follows:

• By recognizing the license plate in an end-to-end way, the proposed LPR-Net avoids the problem of accumulating errors, which degrades the recognition accuracy of typical three-step methods; thus its recognition accuracy is enhanced.

• An effective scheme based on batch normalization is used to accelerate the learning procedure of LPR-Net.

• Thorough experimental results demonstrate that the proposed LPR-Net outperforms state-of-the-art LPR methods in terms of both recognition accuracy and robustness in complex environments.

The rest of this paper is organized as follows. Section 2 presents the proposed LPR-Net and its structure. Section 3 presents the training process of LPR-Net. Experimental results and comparisons with traditional methods are shown in Section 4. Finally, conclusions are reached in Section 5.

2. License Plate Recognition Neural Network

To avoid the large accumulated error of typical three-step LPR methods and improve recognition performance in challenging complex environments, an end-to-end recognition method based on a deep convolutional neural network, named LPR-Net, is proposed in this paper. The input of LPR-Net is a gray or color image of any size and the output is the license plate number of the input image. If there is no license plate in the input image, LPR-Net outputs "no license". The proposed LPR-Net is a hybrid deep architecture that consists of a basic net, a multi-scale net, a regression net, and a classification net. Its structure is shown in Fig. 1. Firstly, LPR-Net resizes the input image to 500 × 500 pixels. Then the resized image is input to the basic net to get basic features. After that, the basic features are input to the multi-scale net to obtain multi-scale features. Finally, the multi-scale features are input to the regression net and the classification net to locate the plate and identify the characters. Table 1 lists the main notations and descriptions used in this paper.
Fig. 1. The structure of the proposed LPR-Net.
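The data flow in Fig. 1 can be summarized by the following Python sketch. It is only an illustrative skeleton written under assumed interfaces: the four sub-networks are passed in as placeholder callables, `classes` is a hypothetical index-to-character table, and `resize_nearest` is a simple stand-in for a proper resize; it is not the authors' implementation.

```python
import numpy as np

def resize_nearest(img, size=(500, 500)):
    """Nearest-neighbour resize; a stand-in for a proper interpolation-based resize."""
    h, w = img.shape[:2]
    rows = np.arange(size[0]) * h // size[0]
    cols = np.arange(size[1]) * w // size[1]
    return img[rows][:, cols]

def recognize_plate(image, basic_net, multi_scale_net, regression_net,
                    classification_net, classes, conf_threshold=0.5):
    """Illustrative end-to-end data flow of LPR-Net (Fig. 1).

    The four sub-networks are assumed callables; `classes` maps class indices
    to character strings, and index 0 is taken to be the background class.
    """
    resized = resize_nearest(image)              # 1. resize the input to 500 x 500
    basic_features = basic_net(resized)          # 2. basic net (ResNet) features
    features = multi_scale_net(basic_features)   # 3. multi-scale features
    boxes = regression_net(features)             # 4a. (num_boxes, 4): cx, cy, w, h
    scores = classification_net(features)        # 4b. (num_boxes, num_classes)
    labels = scores.argmax(axis=1)
    confs = scores.max(axis=1)
    # 5. Drop background / low-confidence boxes (NMS would additionally remove
    #    duplicates, as indicated in Fig. 1) and read the characters left to right.
    keep = (labels != 0) & (confs > conf_threshold)
    if not keep.any():
        return "no license"
    order = np.argsort(boxes[keep, 0])           # sort kept boxes by center x
    return "".join(classes[int(i)] for i in labels[keep][order])
```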

Table 1. Notations and descriptions.

Symbol | Description
α | Weight of the location loss
n | Number of positive samples
N | Number of candidate boxes
n_i | Number of ground-truth boxes of the i-th category
cx | Horizontal coordinate of the center point of a box
cy | Vertical coordinate of the center point of a box
s_k | Minimum scale of default boxes on the feature map of the k-th convolution layer
s_min | Minimum scale of the first layer in the multi-scale net
s_max | Maximum scale of the first layer in the multi-scale net
d_j | The j-th default box
d_{j∈pos} | Positive sample
d_{j∈neg} | Negative sample
w_k^a | Width of a default box in the feature map of the k-th layer
h_k^a | Height of a default box in the feature map of the k-th layer
g_i^p | The i-th ground-truth box of the p-th category
l_j^m | Prediction box of the j-th default box
x_{ij}^p | Indicates whether the j-th candidate box matches the i-th ground-truth box of the p-th category
J_{ij}^p | Overlap ratio of the i-th ground-truth box and the j-th default box
p_k^i | Probability that the k-th positive sample belongs to the i-th category

2.1. The Basic Net

The residual network (ResNet) can extract detailed features and high-level essential features of an image, which is beneficial for LPR [30]. Moreover, compared with traditional deep neural networks it has the advantages of rapid and stable convergence, deep layers, a small number of parameters, and avoidance of gradient vanishing. Therefore, it is used as the basic network of LPR-Net. The structure of ResNet is shown in Fig. 2. In LPR-Net, the resized 500 × 500 pixel license plate image is input to ResNet to extract detailed and essential basic features.

Fig. 2. The structure of ResNet (a residual block computes H(x) = F(x) + x, where F(x) is the stacked convolution branch and x is the identity shortcut).
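As a concrete illustration of the residual mapping H(x) = F(x) + x sketched in Fig. 2, a PyTorch-style residual block could look as follows. The channel count and the use of PyTorch are assumptions made for the example only; the exact block configuration of the paper follows [30].

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    """One basic residual block as sketched in Fig. 2: H(x) = F(x) + x."""

    def __init__(self, channels=64):
        super().__init__()
        # F(x): two 3x3 convolutions with batch normalization.
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        # Identity shortcut: H(x) = F(x) + x, followed by ReLU.
        return F.relu(out + x)
```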

2.2. The Multi-Scale Net

In a vehicle license plate recognition system, the license plate region and the characters in images always appear with various scales and angles. In order to adapt to different situations, we extract multi-scale features with the multi-scale net to make LPR-Net more robust. The multi-scale net involves six layers: the res3b3_relu layer, the res5c_relu layer, the conv1_2 layer, the conv2_2 layer, the conv3_2 layer and the pool6 layer. The res3b3_relu and res5c_relu layers are original layers in ResNet; detailed descriptions of them can be found in [30]. The parameters of the conv1_2, conv2_2 and conv3_2 layers are shown in Table 2. The pool6 layer is a global average pooling layer. We obtain six multi-scale feature maps from these layers. Samples of feature maps are shown in Fig. 3. Then, we calculate the maximum and minimum scales of each layer to control the sizes of the default boxes by the multi-scale transformation strategy. After that, we select default boxes with various scales on the six multi-scale feature maps. Finally, the pixel values of the default boxes are treated as multi-scale features.

Table 2. Parameters of the multi-scale net.
Layer | Padding | Kernel size | Stride | Output numbers
conv1_2 | 0 | 1 | 1 | 256
conv2_2 | 1 | 3 | 2 | 512
conv3_2 | 1 | 3 | 2 | 512

2.2.1. Default Box

After obtaining the feature maps from the six multi-scale layers, each feature map is cut into small lattices called feature map cells, as shown in Fig. 4(b) and (c). On each feature map, six default boxes with different aspect ratios are generated for each feature map cell, which is shown in Fig. 4(b) with rectangular boxes formed by dotted lines. The center point of each default box is the same as that of its feature map cell.

Generally, a Chinese license plate has two specifications, 220 × 440 mm and 140 × 440 mm, and the specification of each character is 45 × 90 mm. Considering that license plates may incline, the value range of the aspect ratio $a_r$ of default boxes in this paper is $a_r \in \{1, 2, 3, \frac{1}{2}, \frac{22}{7}\}$.

Fig. 3. Samples of feature maps (from the res3b3_relu, res5c_relu and conv1 layers).

Fig. 4. Feature map with default boxes. (a) is the original image with ground-truth boxes of two characters; (b) shows feature map cells which are selected from an 8 × 8 feature map and their corresponding 6 default boxes; (c) shows a feature map cell and its corresponding default boxes on a 4 × 4 feature map, and the four location elements (loc) and category confidence (conf) of a default box.

2.2.2. Multi-scale Transformation Strategy

The proposed LPR-Net uses the multi-scale transformation strategy to determine the height and width of the default boxes. The scale of the default boxes for each feature map is computed as

$$s_k = s_{min} + \frac{s_{max} - s_{min}}{m - 1}(k - 1), \quad k \in [1, m], \qquad (1)$$

where $s_k$ represents the minimum scale of the default boxes on the feature map of the k-th convolution layer; it is also the maximum scale of the (k+1)-th convolution layer. $m$ is the number of layers of the multi-scale net and equals 6 in this paper. $s_{min}$ and $s_{max}$ represent the minimum and maximum scales of the first layer in the multi-scale net, respectively. In the feature map of the k-th layer, the width of a default box is $w_k^a = s_k \sqrt{a_r}$ and the height is $h_k^a = s_k / \sqrt{a_r}$. For the case when the aspect ratio equals 1, we add a default box whose scale is $s_k' = \sqrt{s_k s_{k+1}}$. Thus, there are six default boxes for each feature map cell. After obtaining these default boxes, we select eligible default boxes as positive and negative samples. If the overlap ratio $J_{ij}^p$ of the i-th ground-truth box and the j-th default box is greater than 0.7, we choose the j-th default box as a positive sample $d_{j \in pos}$; if it is less than 0.3, as a negative sample $d_{j \in neg}$. The overlap ratio $J_{ij}^p$ is defined as

$$J_{ij}^p = \frac{g_i^p \cap d_j}{g_i^p \cup d_j} = \frac{g_i^p \cap d_j}{g_i^p + d_j - g_i^p \cap d_j}, \qquad (2)$$

where $g_i^p$ denotes the i-th ground-truth box of the p-th category and $d_j$ denotes the j-th default box.
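A minimal NumPy sketch of the default-box construction of Eq. (1) and the overlap-based sample selection of Eq. (2) is given below. It assumes boxes are expressed as (cx, cy, w, h) in relative coordinates; the concrete values of s_min and s_max are illustrative placeholders, since the paper does not list them.

```python
import numpy as np

def default_box_sizes(k, m=6, s_min=0.2, s_max=0.9,
                      aspect_ratios=(1.0, 2.0, 3.0, 1 / 2, 22 / 7)):
    """(width, height) pairs of the default boxes of the k-th layer, per Eq. (1)."""
    def scale(i):
        return s_min + (s_max - s_min) / (m - 1) * (i - 1)
    s_k, s_k1 = scale(k), scale(k + 1)
    sizes = [(s_k * np.sqrt(ar), s_k / np.sqrt(ar)) for ar in aspect_ratios]
    # Extra box with aspect ratio 1 and scale sqrt(s_k * s_{k+1}): six boxes per cell.
    sizes.append((np.sqrt(s_k * s_k1), np.sqrt(s_k * s_k1)))
    return sizes

def overlap_ratio(g, d):
    """Jaccard overlap of Eq. (2); g and d are (cx, cy, w, h) boxes."""
    gx1, gy1, gx2, gy2 = g[0] - g[2] / 2, g[1] - g[3] / 2, g[0] + g[2] / 2, g[1] + g[3] / 2
    dx1, dy1, dx2, dy2 = d[0] - d[2] / 2, d[1] - d[3] / 2, d[0] + d[2] / 2, d[1] + d[3] / 2
    iw = max(0.0, min(gx2, dx2) - max(gx1, dx1))
    ih = max(0.0, min(gy2, dy2) - max(gy1, dy1))
    inter = iw * ih
    union = g[2] * g[3] + d[2] * d[3] - inter
    return inter / union if union > 0 else 0.0

def assign_samples(gt_boxes, default_boxes, pos_thr=0.7, neg_thr=0.3):
    """Default boxes with overlap > 0.7 become positive samples, < 0.3 negative."""
    pos, neg = [], []
    for j, d in enumerate(default_boxes):
        best = max(overlap_ratio(g, d) for g in gt_boxes)
        if best > pos_thr:
            pos.append(j)
        elif best < neg_thr:
            neg.append(j)
    return pos, neg
```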
2.3. The Regression Net

After obtaining the multi-scale features from the multi-scale net, we input the features and the location information of each default box into the regression net to get prediction boxes of the plate and the characters. Then, the errors between the prediction boxes and the ground truth are back propagated to fine-tune the location information. The regression net is a convolution layer with a 1 × 1 filter. Its input is the multi-scale features, and its output is the prediction boxes of the plate and characters. The loss function of the regression net, $L_{loc}(x, l, g)$, is defined as

$$L_{loc}(x, l, g) = \sum_{j \in pos}^{N} \sum_{m \in \{cx, cy, w, h\}} x_{ij}^p \cdot smooth(l_j^m - \hat{g}_{ij}^m), \qquad (3)$$

$$\hat{g}_{ij}^{cx} = \frac{g_i^{cx} - d_{j \in pos}^{cx}}{d_{j \in pos}^{w}}, \quad \hat{g}_{ij}^{cy} = \frac{g_i^{cy} - d_{j \in pos}^{cy}}{d_{j \in pos}^{h}}, \quad \hat{g}_{ij}^{w} = \log\left(\frac{g_i^{w}}{d_j^{w}}\right), \quad \hat{g}_{ij}^{h} = \log\left(\frac{g_i^{h}}{d_j^{h}}\right), \qquad (4)$$

$$smooth(x) = \begin{cases} 0.5x^2, & |x| < 1 \\ |x| - 0.5, & |x| \geq 1 \end{cases} \qquad (5)$$

where $d_{j \in pos}^m$ represents the j-th positive sample, $\hat{g}_{ij}^m$ is the position deviation between the j-th positive sample $d_{j \in pos}^m$ and the i-th ground-truth box $g_i^m$, $cx$ and $cy$ represent the horizontal and vertical coordinates of the center point of a box respectively, $w$ and $h$ represent the width and height of a box respectively, $l_j^m$ represents the prediction box of the j-th default box, $smooth(l_j^m - \hat{g}_{ij}^m)$ is the error smoothing function of the prediction box and the position deviation value, $x_{ij}^p$ indicates whether the j-th candidate box matches the i-th ground-truth box of the p-th category (it equals 1 if they are successfully matched and 0 otherwise), and $N$ is the number of candidate boxes.
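The target encoding of Eq. (4) and the smoothed L1 penalty of Eq. (5) can be sketched in NumPy as follows. This is a per-box illustration under the (cx, cy, w, h) box convention rather than the full batched loss of Eq. (3); the `matches` list of (ground-truth index, default-box index) pairs is an assumed input.

```python
import numpy as np

def encode_offsets(gt, default):
    """Position deviation g-hat of Eq. (4) between a ground-truth box and its
    matched (positive) default box, both given as (cx, cy, w, h)."""
    g_cx = (gt[0] - default[0]) / default[2]
    g_cy = (gt[1] - default[1]) / default[3]
    g_w = np.log(gt[2] / default[2])
    g_h = np.log(gt[3] / default[3])
    return np.array([g_cx, g_cy, g_w, g_h])

def smooth_l1(x):
    """Error smoothing function of Eq. (5), applied elementwise."""
    x = np.abs(x)
    return np.where(x < 1.0, 0.5 * x ** 2, x - 0.5)

def localization_loss(pred_offsets, gt_boxes, default_boxes, matches):
    """Simplified form of Eq. (3): sum of smoothed errors over matched
    (positive) default boxes."""
    loss = 0.0
    for i, j in matches:
        target = encode_offsets(gt_boxes[i], default_boxes[j])
        loss += smooth_l1(pred_offsets[j] - target).sum()
    return loss
```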

2.4. The Classification Net

The multi-scale features of the positive samples are finally input to the classification net for license character recognition. The softmax classifier is applied as the classification net to compute the character class probability of an input sample, owing to its high classification efficiency and effectiveness [31]. A linear function is applied to model the relationship between the multi-scale feature $x$ and its probability distribution $z_i(x)$:

$$z_i(x) = w_i^T x + b_i, \qquad (6)$$

where $z_i(x)$ is the probability of input $x$ belonging to category $i$, and $w_i$ and $b_i$ are the corresponding model parameters of the softmax classifier. As probability has the properties of nonnegativity and unitarity, $z_i(x)$ is normalized as

$$p_i(x) = softmax(z_i(x)) = \frac{e^{z_i(x)}}{\sum_{j=1}^{n} e^{z_j(x)}}, \quad i = 1, 2, \ldots, m, \qquad (7)$$

where $p_i$ is the normalized probability of $x$ belonging to category $i$, $m$ is the number of categories, and $n$ is the number of positive samples.

The goal of the classification net is to maximize $p_i(x)$, which means that $-\log(p_i(x))$ should be minimized. We define class 0 as the background, so for negative samples $d_{j \in neg}$, which belong to category 0, the purpose of classification turns into maximizing their probability of class 0, that is, minimizing $-\log(p_0(x = d_{j \in neg}))$. Therefore, the class confidence error function $L_{conf}(x, p)$ of the classification net is defined as

$$L_{conf}(x, p) = \sum_{i=1}^{m} \sum_{j=1}^{n_i} \Big( -\sum_{k \in pos} x_{ij}^k \log(p_k^i) - \sum_{k \in neg} \log(p_k^0) \Big), \qquad (8)$$

where $x_{ij}^k$ indicates whether the k-th positive sample matches the j-th ground-truth box of the i-th category, $p_k^i$ represents the probability that the k-th positive sample belongs to category $i$, $m$ represents the number of categories, $n_i$ is the number of ground-truth boxes of the i-th category, and $n$ is the number of positive samples.

Then the overall loss function $L(x, p, l, g)$ of the proposed LPR-Net can be obtained by combining the location loss function $L_{loc}(x, l, g)$ and the classification loss function $L_{conf}(x, p)$ as

$$L(x, p, l, g) = \frac{1}{n} L_{conf}(x, p) + \alpha L_{loc}(x, l, g), \qquad (9)$$

where $\alpha$ is the weight of the location loss and its default value is 1.
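A compact NumPy sketch of the softmax of Eqs. (6)-(7) and the way the confidence and localization terms are combined in Eq. (9) is given below. The confidence loss is written in a simplified per-sample form (negative log-likelihood of the true class for positive samples plus a background term for negative samples) rather than in the exact index notation of Eq. (8).

```python
import numpy as np

def softmax(z):
    """Eq. (7): normalize the linear scores z_i(x) = w_i^T x + b_i into probabilities."""
    z = z - z.max()          # subtract the maximum for numerical stability
    e = np.exp(z)
    return e / e.sum()

def confidence_loss(scores_pos, labels_pos, scores_neg):
    """Spirit of Eq. (8): -log p of the true class for positive samples and
    -log p of the background class (class 0) for negative samples."""
    loss = 0.0
    for z, y in zip(scores_pos, labels_pos):
        loss -= np.log(softmax(z)[y])
    for z in scores_neg:
        loss -= np.log(softmax(z)[0])
    return loss

def total_loss(conf_loss, loc_loss, n_pos, alpha=1.0):
    """Eq. (9): overall loss, with alpha weighting the localization term."""
    return conf_loss / n_pos + alpha * loc_loss
```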
3. Implementation Details

LPR-Net can be trained end-to-end by back-propagation and stochastic gradient descent (SGD) [32]. To speed up the training process, we add a batch normalization (BN) layer after each multi-scale layer to normalize the input of each convolution layer in the multi-scale net [33].

3.1. Batch Normalization

In order to keep the original distribution pattern of the input data as far as possible, a BN layer is added after each multi-scale layer to prevent the convolution operations from destroying the features of previous layers. Batch normalization is defined as

$$y^{(k)} = \gamma^{(k)} \frac{x^{(k)} - E[x^{(k)}]}{\sqrt{Var[x^{(k)}]}} + \beta^{(k)}, \qquad (10)$$

where $k$ represents the k-th layer, $x^{(k)}$ is the input of the k-th layer, $E(x^{(k)})$ is the mean of $x^{(k)}$, $Var(x^{(k)})$ is the variance of $x^{(k)}$, and $\gamma^{(k)}$ and $\beta^{(k)}$ are two normalization parameters. By adding BN layers, the training time of LPR-Net for 70,000 iterations was shortened from 8.2 hours to 5.6 hours in our experiment. Fig. 5 shows the average recognition accuracy of license plates; it can be seen that the average plate recognition accuracy is higher when BN layers are used. Therefore, the BN layer enhances both the training speed and the recognition rate of LPR-Net.

Fig. 5. Average recognition rates of LPR-Net with BN and LPR-Net without BN.
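Eq. (10) normalizes each channel of a layer's input with the batch statistics and then rescales it with the learnable parameters γ and β. A minimal NumPy version is shown below; the small ε added to the variance and the use of per-channel statistics over (N, H, W) are standard implementation details assumed here rather than spelled out in the text.

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Eq. (10) for a batch of feature maps x with shape (N, C, H, W):
    normalize per channel with the batch mean/variance, then scale and shift."""
    mean = x.mean(axis=(0, 2, 3), keepdims=True)   # E[x^(k)]
    var = x.var(axis=(0, 2, 3), keepdims=True)     # Var[x^(k)]
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma.reshape(1, -1, 1, 1) * x_hat + beta.reshape(1, -1, 1, 1)

# Example: normalize a random batch of 8 feature maps with 16 channels.
x = np.random.randn(8, 16, 32, 32)
y = batch_norm(x, gamma=np.ones(16), beta=np.zeros(16))
```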
Table 3. Hyper parameters of LPR-Net.
Parameter | base_lr | gamma | lr_policy | Active_fun
Value | 0.001 | 0.1 | multistep | ReLU
Parameter | type | weight_decay | momentum | BN
Value | sgd | 0.0005 | 0.9 | yes

3.2. Hyper Parameters Selection

In the training process of deep neural networks, the selection of hyper parameters has a great influence on the performance. Hyper parameters are external configurations of a neural network, and their values cannot be learned from the data. For a given problem, we cannot know the optimal values of the hyper parameters in advance. Despite this, we can empirically find desirable values according to typical parameter adjustment rules or experiments. The hyper parameters of this paper are shown in Table 3. base_lr is the basic learning rate, gamma is an update factor of the learning rate, and lr_policy is the updating strategy of the learning rate. In this paper, the multi-step update strategy is adopted: the learning rate starts from base_lr and is multiplied by gamma at each of the iteration milestones {30000, 60000, 90000}. Active_fun is the activation function, and the ReLU function is selected in this paper. type represents the optimization strategy of the loss function, and stochastic gradient descent (SGD) is utilized. weight_decay is the weight attenuation factor used to prevent over-fitting. momentum is a momentum factor that allows the network to learn faster when the loss surface is flat. BN indicates whether the BN layer is utilized. LPR-Net was trained with these hyper parameters for 120,000 iterations in the following experiments. The training loss curve of LPR-Net is shown in Fig. 6. It can be seen that LPR-Net converges very fast without obvious oscillation.

Fig. 6. Training loss curve.
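The multistep policy of Table 3 can be sketched as follows; the milestones follow the values quoted in the text, and the interpretation (multiply the learning rate by gamma each time a milestone is passed, as in Caffe-style solvers) is an assumption.

```python
def multistep_lr(iteration, base_lr=0.001, gamma=0.1,
                 milestones=(30000, 60000, 90000)):
    """Multistep policy of Table 3: the learning rate is multiplied by gamma
    each time a milestone iteration is passed."""
    passed = sum(1 for m in milestones if iteration >= m)
    return base_lr * (gamma ** passed)

# 0.001 before 30,000 iterations, 1e-4 until 60,000, 1e-5 until 90,000, 1e-6 afterwards.
print(multistep_lr(10000), multistep_lr(45000), multistep_lr(100000))
```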
4. Experimental Results

This section evaluates the performance of the proposed LPR-Net. The hardware configuration in the following experiments is a GeForce GTX 1080 GPU and a Dell Precision Tower 7810 with 32 GB of RAM.

4.1. Data Set

We collect 2,000 Chinese license plate images (including 2,200 plates) in complex environments as the data set, and the bootstrapping method is used to partition the data set into a training set and a test set. Firstly, we randomly sample one image from the original data set m times and treat the m sampled images as the training set. Then, the remaining unsampled data serves as the test set. In our experiments, m equals 2,000 and there are 685 test images. Moreover, we expand the original test set into three other test sets with OpenCV to show the generalization ability and the robustness of LPR-Net in complex environments. We choose the images taken during daytime from the original test set to form the day-set (524 images). Then we randomly choose 410 images from the original test set and reduce the brightness value of each image pixel by 50% to form the night-set. Similarly, we form the askew-set and the fuzzy-set by randomly choosing 410 images from the original test set and, respectively, rotating the images by 5-10 degrees and filtering the images with a 5 × 5 Gaussian filter.
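The three derived test sets can be produced with standard OpenCV operations; the sketch below shows one plausible way to do it. The exact parameters used by the authors (e.g. the rotation pivot and the per-image rotation angle) are not specified, so the choices here are assumptions.

```python
import cv2
import numpy as np

def make_night(img):
    # Night-set: reduce the brightness value of every pixel by 50%.
    return (img.astype(np.float32) * 0.5).astype(np.uint8)

def make_askew(img, angle=7.0):
    # Askew-set: rotate the image by 5-10 degrees (here around the image center).
    h, w = img.shape[:2]
    m = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    return cv2.warpAffine(img, m, (w, h))

def make_fuzzy(img):
    # Fuzzy-set: smooth the image with a 5 x 5 Gaussian filter.
    return cv2.GaussianBlur(img, (5, 5), 0)
```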
CNN-based [10] method presents a color edge algorithm to
MSER-based method can hardly extract the bounding boxes of
CE

locate license plate region, and combines CCA and projection


characters in complex environment.
analysis to segment characters in license plate region. And a
Color-based [35] method locates license plate using color in-
simplified recurrent convolutional neural networks is proposed
formation for the most common blue and yellow plates. This
to automatically recognize characters. Despite its character rec-
method utilizes the color concomitant property and transitions
AC

ognize rate is high, it is easily affected by license plate location


between background and characters, and removes fake plates
method and character segmentation method in complex back-
and reserves real plates. In addition, it removes fake plates by
ground.
using the relationship between stroke width and character size.
It can select threshold automatically by judging the illumination
distribution of an image. However, it cannot deal with the situ- 4.3. Evaluation Criteria
ation when colors of license plate and car body are very similar. In this paper, license plate detection rate (LPDR), Chinese
We adopt support vector machine (SVM) to recognize charac- characters recognition rate (CCRR), alphanumeric characters
ters based on the located license plate region of the Color-based recognition rate (ACRR), OP1 and OP2(overall performance)
method. are used to evaluate the performance of the proposed LPR-Net.
Fig. 7 shows the location results of the combination of LPDR shows the detection rate of license plate and a license
MSER-based method and Color-based method and the pro- plate is correctly detected only if the overlap of the detected
posed method. The right column is the location results of the and ground truth bounding box is above 0.7. CCRR and ACRR

Table 4. Experimental results of LPR-Net.
Data set | LPDR | CCRR | ACRR | OP1 | OP2
day-set | 99.8% | 99.6% | 99.8% | 99.1% | 99.2%
night-set | 99.8% | 99.8% | 99.3% | 96.4% | 98.9%
askew-set | 99.7% | 99.7% | 98.8% | 93.9% | 99.2%
fuzzy-set | 99.8% | 98.9% | 99.0% | 93.7% | 97.7%
Average | 99.8% | 99.5% | 99.2% | 95.8% | 98.8%

Table 5. License plate location accuracy on the day-set and night-set (%).
Method | night-set Recall | night-set Precision | night-set F-score | day-set Recall | day-set Precision | day-set F-score
LPR-Net | 99.88 | 99.16 | 99.52 | 99.76 | 99.51 | 99.63
MSER-based | 98.94 | 98.75 | 98.84 | 97.46 | 97.59 | 97.53
Color-based | 94.17 | 91.69 | 92.91 | 97.95 | 93.54 | 95.70
CNN-based | 93.90 | 94.28 | 94.09 | 97.46 | 97.68 | 97.57

A character is correctly recognized only if the overlap of the detected character bounding box and the ground truth bounding box is above 0.7 and the labels of the two boxes are equal. OP1 is the overall recognition rate of license plates, and a license plate is correctly recognized only if the recognized character string is the same as the real license plate number. OP2 is the product of LPDR, CCRR and ACRR, which shows the overall character recognition rate. LPDR, CCRR, ACRR, OP1, and OP2 are defined as

$$LPDR = \frac{\text{Number of correctly detected license plates}}{\text{Number of all ground truth license plates}}, \qquad (11)$$

$$CCRR = \frac{\text{Number of correctly recognized Chinese characters}}{\text{Number of all real Chinese characters}}, \qquad (12)$$

$$ACRR = \frac{\text{Number of correctly recognized alphanumeric characters}}{\text{Number of all real alphanumeric characters}}, \qquad (13)$$

$$OP1 = \frac{\text{Number of correctly recognized license plates}}{\text{Number of all license plates}}, \qquad (14)$$

$$OP2 = LPDR \times CCRR \times ACRR. \qquad (15)$$

Recall, Precision, and F-score are also used to evaluate the performance of LPR-Net. They are defined as

$$Recall = \frac{TP}{P}, \qquad (16)$$

$$Precision = \frac{TP}{FP + TP}, \qquad (17)$$

$$F\text{-}score = \frac{2 \times Recall \times Precision}{Recall + Precision}, \qquad (18)$$

where TP means true detected positives, FP means false detected positives and P means real positives.

Fig. 8. An example of the case when the colors of the license plate and the car body are similar (real plate number: 赣C5V253; recognized plate number: 赣C5V253).

Fig. 9. An example of the case when illumination is insufficient (real plate number: 粤B1KF45; recognized plate number: 粤B1KF45).
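Given the counts defined above, the measures of Eqs. (11)-(18) reduce to simple ratios; a small helper for the detection-style measures is sketched below as an illustration.

```python
def detection_metrics(true_positives, false_positives, real_positives):
    """Recall, Precision and F-score of Eqs. (16)-(18)."""
    recall = true_positives / real_positives
    precision = true_positives / (true_positives + false_positives)
    f_score = 2 * recall * precision / (recall + precision)
    return recall, precision, f_score

def overall_performance(lpdr, ccrr, acrr):
    """OP2 of Eq. (15): product of the detection and character recognition rates."""
    return lpdr * ccrr * acrr
```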
4.4. Results and Discussions

Table 4 shows the recognition rates of LPR-Net on the four test sets. It can be seen that the recognition rates of both Chinese characters and alphanumeric characters are more than 99%, and the overall recognition rate OP1 on the day-set is as high as 99.1%. This shows that the proposed LPR-Net has a high recognition rate and strong robustness in complex environments.

To intuitively show the recognition results of LPR-Net, we show some concrete examples in Figs. 8, 9, 10, and 11. Fig. 8 shows the case when the colors of the license plate and the car body are similar and the characters are somewhat broken. It can be seen that LPR-Net can still accurately locate and identify the license plate. When illumination is seriously insufficient in the evening and characters are defective, the license plate number can still be accurately identified, as shown in Fig. 9. Fig. 10 shows that LPR-Net successfully recognizes a license plate which is very vague due to strong light. Fig. 11 shows that LPR-Net accurately recognizes a seriously skewed license plate without angle correction.

License plate location is an important task of license plate recognition. We show the comparison results of LPR-Net and the baseline methods on the license plate location task on the day-set and night-set in Table 5. It can be seen that the proposed LPR-Net has a higher F-score, recall and precision than the other plate location algorithms on both the night-set and the day-set. The F-scores of LPR-Net are as high as 99.52% and 99.63% on the night-set and day-set, respectively.
Table 6. Performance comparison with other Chinese LPR methods (CCRR / ACRR / OP1 in %, plus the average OP1 over the four data sets).
Method | day-set CCRR / ACRR / OP1 | night-set CCRR / ACRR / OP1 | askew-set CCRR / ACRR / OP1 | fuzzy-set CCRR / ACRR / OP1 | average OP1 (%)
LPR-Net | 99.64 / 99.83 / 99.10 | 99.81 / 99.32 / 96.42 | 99.71 / 98.83 / 93.91 | 98.92 / 99.01 / 93.68 | 95.78
MSER-based | 96.98 / 97.45 / 95.36 | 95.67 / 98.67 / 93.51 | 93.55 / 98.67 / 93.65 | 97.16 / 96.93 / 91.21 | 93.43
Color-based | 97.64 / 95.15 / 93.54 | 94.16 / 94.76 / 90.37 | 95.67 / 96.43 / 92.03 | 91.21 / 93.57 / 91.99 | 91.92
CNN-based | 96.55 / 98.67 / 96.35 | 99.06 / 99.26 / 91.96 | 93.78 / 95.37 / 90.97 | 98.63 / 98.69 / 90.87 | 92.54

Table 7. Comparison of recognition speed.
Method | LPR-Net | MSER-based | Color-based | CNN-based
Time (s) | 0.20 | 0.61 | 0.58 | 0.41

Table 8. Analysis on sub-networks and batch normalization.
Method | LPDR | CCRR | ACRR | OP1
LPR-Net | 99.78% | 99.50% | 99.23% | 95.78%
VGG-BasicNet | 98.68% | 93.67% | 95.56% | 90.17%
SSD-ScaleNet | 84.16% | 82.57% | 83.34% | 81.32%
LPR-Net without BN | 94.69% | 87.65% | 88.56% | 85.52%
The F-scores of LPR-Net on the night-set and day-set only differ by 0.11%, while those of the baseline methods differ by around 3.0%. This shows that the proposed method is more robust than the baseline methods. As can be seen from the comparison results, the detection rate and robustness of LPR-Net in complex environments are better than those of the baseline methods.

Fig. 10. An example of the case when the plate is small and fuzzy (real plate number: 粤B72Q47; recognized plate number: 粤B72Q47).

The proposed LPR-Net is an end-to-end license plate recognition algorithm, which includes the localization and recognition of license plates and characters. Table 6 shows the CCRR, ACRR and OP1 of the proposed method and other Chinese LPR methods on all four data sets. As can be seen from Table 6, the proposed method achieves the highest OP1 compared with other Chinese LPR methods on every data set. Moreover, the average OP1 of the proposed method on the four data sets is about 2% higher than the MSER-based method, about 4% higher than the Color-based method and about 3% higher than the CNN-based method. In addition, it can be seen from the results that ACRR is always higher than CCRR, which shows that Chinese character recognition is more difficult than alphanumeric recognition.

The average recognition time of LPR-Net and the baselines on the four data sets is presented in Table 7. From Table 7, we can see that LPR-Net costs only 0.2 seconds to recognize a plate number, and its recognition speed is two or three times faster than the other methods. The main reason is that most of the time of the baseline methods is spent on plate location and character segmentation. As a result, the proposed LPR-Net can better meet practical application requirements than the baseline methods.

Fig. 11. An example of the case when the plate is skewed (real plate number: 川AEK882; recognized plate number: 川AEK882).

The proposed LPR-Net is a hybrid deep architecture that consists of a basic net, a multi-scale net, a regression net, and a classification net. To show the effectiveness of each sub-network, we did experiments to analyze the contribution of the different stages in LPR-Net, and the results are reported in Table 8. In Table 8, "LPR-Net" represents the proposed LPR-Net, "VGG-BasicNet" indicates that the basic net ResNet in LPR-Net is replaced by VGGNet [37], "SSD-ScaleNet" indicates that the multi-scale net in LPR-Net is replaced by the scaling strategy of SSD [38], and "LPR-Net without BN" represents the proposed LPR-Net without batch normalization. It can be seen that the proposed LPR-Net always achieves the best performance. This shows the effectiveness of the basic net, the multi-scale net and batch normalization. As the regression net and the classification net are indispensable, and other effective regression and classification methods can also be used in LPR-Net, we did not do separate experiments on them. From the experimental results, we can infer the effectiveness of the regression net and the classification net, as the recognition rate is always larger than 95%.

The basic learning rate is one of the most important hyper-parameters for training deep neural networks. In this paper, we experimentally show the performance variations with the learning rate for LPR-Net.
Fig. 12. OP1 versus basic learning rate on the day, night, askew and fuzzy test sets.

Fig. 12 shows the OP1 variations with the learning rate on all four data sets. It can be seen that the OP1 values first increase and then decrease, and achieve the best performance when the learning rate equals 0.001. Therefore, the basic learning rate of LPR-Net can be chosen to be around 0.001.

5. Conclusions

This paper proposes an effective and efficient end-to-end Chinese license plate recognition method named license plate recognition net (LPR-Net). It is a hybrid deep architecture which consists of a residual network for extracting basic features, a multi-scale net for extracting multi-scale features, a regression net for locating the plate and characters, and a classification net for recognition. By recognizing plate characters in an end-to-end way, it avoids the accumulative errors of the traditional three-step scheme and therefore identifies plate characters more precisely. Moreover, an effective scheme based on batch normalization is used to accelerate the training speed in its learning procedure. Extensive experiments on a complex Chinese license plate data set have demonstrated that the proposed LPR-Net outperforms several state-of-the-art methods in terms of both accuracy and efficiency.

Acknowledgments

This paper was supported in part by the National Natural Science Foundation of China under Grant 61702394, Grant 61572385 and Grant 61711530248, in part by the Postdoctoral Science Foundation of China under Grant 2018T111021 and Grant 2017M613082, in part by the Science and Technology Project of Shaanxi Province under Grant 2016GY-033, in part by the Shaanxi Key Research and Development Program under Grant 2017ZDXM-GY-002, in part by the Aeronautical Science Foundation of China under Grant 20171981008, and in part by the Fundamental Research Funds for the Central Universities under Grant JBX170313, Grant XJS17063 and Grant JBF180301.

References

[1] L. Zhu, J. Shen, L. Xie, Z. Cheng, Unsupervised visual hashing with semantic assistant for content-based image retrieval, IEEE Transactions on Knowledge and Data Engineering 29 (2017) 472–486.
[2] L. Zhu, J. Shen, L. Xie, Z. Cheng, Unsupervised topic hypergraph hashing for efficient mobile image retrieval, IEEE Transactions on Cybernetics 47 (2017) 3941–3954.
[3] L. Zhu, J. Shen, H. Jin, R. Zheng, L. Xie, Content-based visual landmark search via multimodal hypergraph learning, IEEE Transactions on Cybernetics 45 (2015) 2756–2769.
[4] D. Zheng, Y. Zhao, J. Wang, An efficient method of license plate location, Pattern Recognition Letters 26 (2005) 2431–2438.
[5] B. Hongliang, L. Changping, A hybrid license plate extraction method based on edge statistics and morphology, in: International Conference on Pattern Recognition, 2004, pp. 831–834, Vol. 2.
[6] F. Wang, L. Man, B. Wang, Y. Xiao, W. Pan, X. Lu, Fuzzy-based algorithm for color recognition of license plates, Pattern Recognition Letters 29 (2008) 1007–1020.
[7] Y. Tian, J. Song, X. Zhang, P. Shen, L. Zhang, W. Gong, W. Wei, G. Zhu, An algorithm combined with color differential models for license-plate location, Neurocomputing 212 (2016) 22–35.
[8] M. Rasooli, S. Ghofrani, E. Fatemizadeh, Farsi license plate detection based on element analysis and characters recognition, International Journal of Signal Processing, Image Processing and Pattern Recognition 4 (2013) 697–700.
[9] J. Chen, H. E. Xiao-Hai, Q. Z. Teng, License plate localization based on MSER, Science Technology & Engineering 247 (2015).
[10] Y. Liu, H. Huang, J. Cao, T. Huang, Convolutional neural networks-based intelligent recognition of Chinese license plates, Soft Computing (2017) 1–17.
[11] K. Kanayama, Y. Fujikawa, K. Fujimoto, M. Horino, Development of vehicle-license number recognition system using real-time image processing and its application to travel-time measurement, in: Vehicular Technology Conference, 1991. Gateway to the Future Technology in Motion, IEEE, 1991, pp. 798–804.
[12] K. M. V. Deneen, An algorithm for license plate recognition applied to intelligent transportation system, IEEE Transactions on Intelligent Transportation Systems 12 (2011) 830–845.
[13] J. Jiao, Q. Ye, Q. Huang, A configurable method for multi-style license plate recognition, Pattern Recognition 42 (2009) 358–369.
[14] W. Zou, C. Bai, K. Kpalma, J. Ronsin, Online glocal transfer for automatic figure-ground segmentation, IEEE Transactions on Image Processing 23 (2014) 2109–2121.
[15] I. Paliy, V. Turchenko, V. Koval, A. Sachenko, Approach to recognition of license plate numbers using neural networks, in: IEEE International Joint Conference on Neural Networks, 2004. Proceedings, 2004, pp. 2965–2970, vol. 4.
[16] N. F. Gazcón, C. I. Chesñevar, S. M. Castro, Automatic vehicle identification for argentinean license plates using intelligent template matching, Pattern Recognition Letters 33 (2012) 1066–1074.
[17] M. H. Dashtban, Z. Dashtban, H. Bevrani, A novel approach for vehicle license plate localization and recognition, International Journal of Computer Applications 26 (2013) 22–30.
[18] F. Gao, J. Yu, S. Zhu, Q. Huang, Q. Tian, Blind image quality prediction by exploiting multi-level deep representations, Pattern Recognition 81 (2018) 432–442.
[19] S. Wang, X. Chang, X. Li, G. Long, L. Yao, Q. Z. Sheng, Diagnosis code assignment using sparsity-based disease correlation embedding, IEEE Transactions on Knowledge and Data Engineering 28 (2016) 3191–3202.
[20] S. Wang, X. Li, L. Yao, Q. Z. Sheng, G. Long, et al., Learning multiple diagnosis codes for ICU patients with local disease correlation mining, ACM Transactions on Knowledge Discovery from Data (TKDD) 11 (2017) 31.
[21] C. Patel, A. Desai, Gujarati handwritten character recognition using hybrid method based on binary tree-classifier and k-nearest neighbour, ESRSA Publications (2013).
[22] L. Zhu, J. Shen, H. Jin, L. Xie, R. Zheng, Landmark classification with hierarchical multi-modal exemplar feature, IEEE Transactions on Multimedia 17 (2015) 981–993.
[23] T. Jindal, U. Bhattacharya, Recognition of offline handwritten numerals using an ensemble of MLPs combined by AdaBoost, in: International Workshop on Multilingual OCR, 2013, pp. 1–5.
[24] X. U. Wei, Classification of machine-printed and handwritten texts based on the Bayesian judge, Chinese Journal of Computers (2003).

[25] H. Li, C. Shen, Reading car license plates using deep convolutional neural networks and LSTMs (2016).
[26] M. Wang, Y. Chen, X. Wang, Recognition of handwritten characters in Chinese legal amounts by stacked autoencoders, in: International Conference on Pattern Recognition, 2014, pp. 3002–3007.
[27] C. N. E. Anagnostopoulos, I. E. Anagnostopoulos, V. Loumos, E. Kayafas, A license plate-recognition algorithm for intelligent transportation system applications, IEEE Transactions on Intelligent Transportation Systems 7 (2006) 377–392.
[28] R. Panahi, I. Gholampour, Accurate detection and recognition of dirty vehicle plate numbers for high-speed applications, IEEE Transactions on Intelligent Transportation Systems PP (2017) 1–13.
[29] F. Gao, J. Yu, Biologically inspired image quality assessment, Signal Processing 124 (2016) 210–219.
[30] K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, in: Computer Vision and Pattern Recognition, 2016, pp. 770–778.
[31] S. Ren, K. He, R. Girshick, J. Sun, Faster R-CNN: Towards real-time object detection with region proposal networks, in: Advances in Neural Information Processing Systems, 2015, pp. 91–99.
[32] Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, L. D. Jackel, Backpropagation applied to handwritten zip code recognition, Neural Computation 1 (1989) 541–551.
[33] S. Ioffe, C. Szegedy, Batch normalization: Accelerating deep network training by reducing internal covariate shift (2015) 448–456.
[34] B. Li, B. Tian, Q. Yao, K. Wang, A vehicle license plate recognition system based on analysis of maximally stable extremal regions, in: IEEE International Conference on Networking, Sensing and Control, 2012, pp. 399–404.
[35] J. Dun, S. Zhang, X. Ye, Y. Zhang, Chinese license plate localization in multi-lane with complex background based on concomitant colors, IEEE Intelligent Transportation Systems Magazine 7 (2015) 51–61.
[36] J. Matas, O. Chum, M. Urban, T. Pajdla, Robust wide-baseline stereo from maximally stable extremal regions, Image Vision Computing 22 (2004) 761–767.
[37] K. Simonyan, A. Zisserman, Very deep convolutional networks for large-scale image recognition, arXiv preprint arXiv:1409.1556 (2014).
[38] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C. Y. Fu, A. C. Berg, SSD: Single shot multibox detector, in: European Conference on Computer Vision, 2016, pp. 21–37.

