
A Vision System for Traffic Sign Detection and Recognition
Jian-He Shi and Huei-Yung Lin
Department of Electrical Engineering
Advanced Institute of Manufacturing with High-Tech Innovation
National Chung Cheng University
168 University Road, Min-Hsiung, Chiayi 621, Taiwan
Email: lystyp 87015@yahoo.com.tw, lin@ee.ccu.edu.tw

Abstract—The paper presents an automatic traffic sign recognition system using the videos recorded from an on-board dashcam. It is based on image processing, bilateral Chinese transform, and vertex and bisector transform techniques. The images captured from the dashcam are processed with the histogram of oriented gradients to form feature vectors, followed by support vector machines to detect the traffic signs. The bilateral Chinese transform and vertex and bisector transform are used to extract the area of the traffic sign from the images. Finally, a neural network is adopted to identify the traffic sign information. In this work, we test the algorithms using the images captured from a camera mounted behind the front windshield. The experiments are evaluated with real traffic scenes, and the results have demonstrated the effectiveness of the proposed system.

I. INTRODUCTION

In the past few decades, people drive cars more frequently than ever, and consequently cause more safety problems. New sensing technologies including laser rangefinders, GPS and computer vision have become very popular for advanced driver assistance systems (ADAS) and autonomous vehicles [1]. One important issue for driving safety is being able to notice the traffic information, such as pedestrian warnings, road narrowing and speed limits, provided by the road signs [2], [3]. To make drivers pay more attention to the traffic signs, conspicuous colors and simple shapes are usually adopted. However, it is still very likely that drivers ignore the roadside traffic signs due to routine driving, conversation, or watching electronic devices. In such cases, a driver assistance system which is able to provide automatic detection and identification of traffic signs in real time is extremely helpful. It can reduce the driver's burden by sending audio reminders or warning signals. Furthermore, unmanned vehicles will also benefit from road sign recognition for autonomous navigation.

The objective of this work is to develop a road sign detection and identification system using the videos recorded by an in-vehicle dashcam. For traffic sign detection, color information is widely adopted in the existing research [4]. However, due to various weather conditions and inconsistent color outputs from different cameras, using color information alone does not generally provide robust detection results. For example, Escalera et al. limit the red value interval, and use the ratios of green over red and blue over red to filter out the red color. Because the RGB color space is more sensitive to luminance than the HSI color space, they transfer the images to the HSI domain and eliminate the obvious red color by restricting the intervals of hue and saturation [5]. There also exist similar approaches for sign detection based on the HSI color space [6].

In this paper, we present a road sign detection and identification technique which does not exclusively depend on color information. It also adopts the geometric features in the image for shape detection [7]. In previous work, Broggi et al. normalize the region with a specific color to a fixed size using geometric and gradient feature extraction, and compare it with a predefined template [8]. The proportion of pixels in the area is used to determine the shape. Escalera et al. use an angle mask and the coordinates with a pre-set search range to detect the triangular or circular traffic signs after the image region is extracted with a specific color [5]. Without using color information, Belaroussi and Tarel focus on detection with shape features [9]. They use a voting approach to identify the center of circular road signs and the equiangular characteristic of triangular road signs by accumulating the edge gradient changes.

In addition to the techniques using color and shape features, machine learning is also a common method for traffic sign detection and recognition [10]. Bahlmann et al. use AdaBoost and Bayesian classification with Haar features and color information as training data [11]. Kuo and Lin first detect the traffic sign using the Hough transform and corner detection, followed by an RBF neural network and a k-d tree for traffic sign recognition [12]. Greenhalgh and Mirmehdi convert the image to the HSI color space and extract the hue channel using MSER to determine the sign region [13]. An SVM classifier with HOG features is then used to recognize the traffic sign. Maldonado-Bascon et al. use the aspect ratio of the red color region to remove the regions without a traffic sign, followed by extracting the DtBs feature and classification using a linear support vector machine [14]. Fang et al. create two different neural networks and use the image's color and shape as input to detect the traffic sign. They also apply a Kalman filter to predict the next location of the traffic sign [15].

Our approach for road sign detection consists of two stages: First, the histogram of oriented gradient (HOG) features and
Fig. 1. Road sign detection failures using RGB and HSI color spaces.

Fig. 2. Training data for traffic sign detection: (a) circular signs; (b) triangular signs; (c) not signs.

linear support vector machine (SVM) are used to detect the regions which contain a traffic sign. The bilateral Chinese transform [16], vertex and bisector transform [9], and color information are used to determine the accurate sign regions. The results are then normalized and classified using a neural network.

II. TRAFFIC SIGN DETECTION

This work deals with the road signs in Taiwan, and focuses on the triangular and circular signs. Most traffic sign detection techniques use color information to select the road sign region in the image. Either the RGB image input is used directly, or it is transformed to the HSI color space and the hue channel is used to extract the red color region. The advantage of using the HSI color space is reducing the effect of luminance changes. After the red pixels are extracted, we limit the size and the aspect ratio of connected components to select the candidate road sign regions. There are, however, several problems associated with road sign detection using color information. First, the color characteristics might not be identical for different image acquisition hardware. Thus, there is no single standard to limit a specific color (e.g., red). Second, the color information is sometimes not obvious under various outdoor weather conditions such as sunny, cloudy or rainy days. Fig. 1 illustrates an example of road sign detection failures using both RGB and HSI color spaces. Third, the extracted connected components could be too large or irregular due to the background color. Thus, we use an SVM with HOG features instead of color information to detect traffic signs in the first step.

The SVM classifier trained with HOG features is often used in human detection and traffic sign detection. The objects for detection are composed of strong geometric shapes and high-contrast edges which encompass a range of orientations, and the traffic signs are generally found in the upright direction, which does not require rotation invariance. In general, the HOG feature vectors are calculated for each candidate region. A Sobel filter is used to calculate the horizontal and vertical derivatives, which provide the magnitude and orientation of the features. The HOG features are computed using local contrast normalization on a dense grid of cells with overlapping masks. A nine-bin histogram with unsigned pixel orientations weighted by magnitudes is created for each cell. The histogram is normalized for each block, and the components of each feature vector are the values from the histogram of each normalized cell. The HOG can produce large feature vectors with high accuracy. After the HOG features are calculated, the regions are classified as triangular or circular shapes using two trained SVMs. SVM classification is a supervised learning method that constructs a hyperplane to separate data into positive and negative classes. In an SVM classifier, support vectors are the data points that define the maximum margin of the hyperplane. SVM is fast, highly accurate, and less prone to overfitting compared to many other classification techniques. In addition, it is possible to train an SVM classifier very quickly [13].

We collect many images with circular and triangular traffic signs as positive training samples, and use many random street view images without traffic signs as negative training samples. The size of the positive and negative training samples is 24 × 24 pixels (see Fig. 2). We train two different SVM classifiers to detect the circular and triangular signs respectively, and use a sliding window with the same size as the training data to calculate the HOG features and classify the regions of the input images as sign regions or not. When the vehicle is moving forward, the traffic sign moves from far to near. Thus, the same traffic sign has different sizes in different images of the video sequence. To overcome this problem and achieve multi-scale detection, we scale the original images to many different sizes, and search for the sign region in each image [17].
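The sliding-window detection stage described above (HOG features, SVM classification, and rescaled copies of the image for multi-scale search) can be sketched as follows. This is a minimal illustration assuming scikit-image and scikit-learn; the synthetic bright-disk patches only stand in for the paper's 24 × 24 dashcam training crops, and the HOG cell/block sizes are our assumption since the paper does not specify them.

```python
import numpy as np
from skimage.feature import hog
from skimage.transform import rescale
from sklearn.svm import LinearSVC

WIN = 24  # window size matching the paper's 24 x 24 training samples

def hog_vector(patch):
    # nine unsigned orientation bins per cell, block-normalized,
    # as described in Section II
    return hog(patch, orientations=9, pixels_per_cell=(6, 6),
               cells_per_block=(2, 2), feature_vector=True)

# Synthetic stand-in training set: bright disks as "circular signs",
# low-contrast noise as "not signs" (hypothetical data, for illustration).
rng = np.random.default_rng(0)

def disk_patch():
    yy, xx = np.mgrid[:WIN, :WIN]
    patch = 0.2 * rng.random((WIN, WIN))
    patch[(yy - 12) ** 2 + (xx - 12) ** 2 < 81] = 1.0
    return patch

pos = [disk_patch() for _ in range(40)]
neg = [0.3 * rng.random((WIN, WIN)) for _ in range(40)]
X = np.array([hog_vector(p) for p in pos + neg])
y = np.array([1] * 40 + [0] * 40)
clf = LinearSVC(C=1.0).fit(X, y)  # one SVM; the paper trains two

def detect(image, scales=(1.0, 0.5), step=8):
    """Classify every WIN x WIN window at several image scales and
    return the hits mapped back to original-image coordinates."""
    hits = []
    for s in scales:
        im = rescale(image, s, anti_aliasing=True)
        for r in range(0, im.shape[0] - WIN + 1, step):
            for c in range(0, im.shape[1] - WIN + 1, step):
                feat = hog_vector(im[r:r + WIN, c:c + WIN])
                if clf.predict(feat[None])[0] == 1:
                    hits.append((int(r / s), int(c / s), s))
    return hits
```

In the actual system one classifier is trained for circular signs and another for triangular signs, and the positive windows are passed on to the BCT/VBT refinement of Section III.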


Fig. 3. Traffic sign detection by HOG+SVM.

III. TRAFFIC SIGN DETECTION AND RECOGNITION


Using the method mentioned above, not only the true traffic sign regions but also regions similar to traffic signs will be selected. For example, tires, advertisement boards and lane stripes could also be detected as traffic signs (see Fig. 3). Thus, the next step is to remove the wrongly detected regions. In this work, we use the bilateral Chinese transform (BCT), the vertex and bisector transform (VBT) and color information to validate the correctness of road signs. The input images are resized from the original images to smaller ones so that we can perform the detection quickly. Since BCT and VBT are sensitive to the gradient orientation of edge points, the distortion in the resized images might lead to poor results. We first find the sign region in the original image which is determined by the region detected in the small image. BCT and VBT are then carried out only in the possible traffic sign regions to reduce the computation time.

A. Bilateral Chinese Transform

The bilateral Chinese transform was first used for road sign detection by Belaroussi and Tarel [16]. The method is based on the gradient orientation and magnitude of the edge points. It is able to detect complete or broken circle patterns in the image with the following steps:

1) Apply Canny edge detection on the input image, and use a Sobel filter to calculate the gradient orientation and magnitude of each edge point.
2) Calculate the accumulated contribution of all pairs of edge points. For each point P in the image, a set of voters is defined as

    Γ(P) = { (i, j) | P = (Pi + Pj) / 2 }    (1)

where Pi and Pj are two different points. The accumulator of the BCT defining the symmetry magnitude at a point P is

    Accu(P) = Σ_{(i,j)∈Γ(P)} C(i, j)    (2)

where C(i, j) is the contribution of a pair of points (Pi, Pj) to the votes of its middle point P. By definition, the contribution of a pair of symmetric points is higher than the contribution of a pair of points which is not symmetric.
3) If the maximum in the accumulator is greater than a threshold, then it corresponds to a circle and its location is the center of the circle. To obtain the diameter, we take all pairs of symmetric points with their middle point at the center of the circle. The average of the distances between the two points of all selected pairs is calculated as the diameter.

Fig. 4. Left: the original images; Middle: the accumulators; Right: the circles detected by BCT.

Using the bilateral Chinese transform, it is possible to find the center coordinates and the diameter even if the circle is not complete. This implies that we can obtain the contour of the exact circular traffic sign even if it is covered by some small obstacles, as shown in Fig. 4.

B. Vertex and Bisector Transform

Similar to BCT, the vertex and bisector transform (VBT) also performs shape detection using the gradient orientation and magnitude of edge points [9]. The difference between VBT and BCT is the formulation for calculating the accumulated contribution. VBT is generally carried out with the following steps:

1) Calculate the gradient orientation and magnitude of each edge point.
2) Calculate the accumulated contribution of all pairs of edge points. The set of votes is defined as

    Γ(A) = { (i, j) | APi · ni = 0, APj · nj = 0 }

where APi is the vector from A to Pi, and ni is the gradient vector [Gix Giy] of Pi. We can calculate the angle ∠Pi APj, then cast votes to the


corresponding angle bisector AB defined by the vertex A and the angle ∠Pi AB = (1/2) ∠Pi APj. The accumulators of the VBT defining the contributions are given by

    Vaccu(A) = Σ_{(i,j)∈Γ(A)} T(i, j)    (3)

and

    Baccu(AB) = Σ_{(i,j)∈Γ(AB)} T(i, j)    (4)

where T(i, j) is the contribution of a pair of points (Pi, Pj), which is calculated from the gradient orientations of Pi and Pj. The contribution of a pair of points is higher if the angle ∠Pi APj is close to 60° for a triangular shape.
3) Detect the center of the triangle by finding the maxima of Baccu above a threshold, and look for the local maxima of Vaccu above a second threshold. If a local maximum of Baccu is associated with three local maxima in Vaccu, they are respectively the center and the three vertices of a triangle.

Fig. 5. Left: the original road sign images; Right: the triangles detected by VBT with and without occlusion.

As detected by VBT, we can find the contour of a triangular traffic sign even if it is covered by some small obstacles, as illustrated in Fig. 5.

C. Color Information

Common traffic signs contain a red contour and black or white content. Many approaches for traffic sign detection take the selection of the red color as the first step. However, in our experience this easily misses real traffic signs. In this work, we take the rough positions of circles and triangles, and then use the red color to avoid the confusion caused by the color information alone. The color selection for traffic signs is divided into two parts.

• Red frame: We capture the ROI and transform it to the HSI color space, and limit the hue to 0°∼30° and 330°∼360° for the red area. If the red area does not exceed 15% of the ROI, it is not considered as the target sign.
• Black and white inner content: To determine the color of a traffic sign, we use BCT and VBT to capture the ROI and transform the color by

    f = (|r − g| + |g − b| + |b − r|) / (3λ)    (5)

where λ is an experimental parameter. The color is more like black, gray or white if the value of f is small. We set λ to 15, and the color is the target color (black or white) if f is smaller than 1. If the target color area is smaller than 50% of the ROI, we consider it not to be a target traffic sign.

Fig. 6. Training data for road sign recognition.

D. Neural Network

A neural network is a machine learning method which simulates the structure of the human brain. It is useful for classifying nonlinear data, as in handwriting and speech recognition. A standard neural network has one input layer, one output layer and many hidden layers. There are many nodes in each layer, and the neurons connect the nodes with different weights in each layer to transfer information. To calculate the output of a node, all inputs of the node are multiplied by their weights and summed, and the result is substituted into an activation function. We can correctly classify the input road sign image by adjusting the weights.

In this paper, we use a back-propagation neural network to classify the input road sign images. The neural network we use has three layers: the input layer has 2500 nodes, the hidden layer has 100 nodes, and the number of nodes in the output layer is the same as the number of traffic sign classes. We collect many binary road sign images without external frames as training samples (see Fig. 6). The size is 50 × 50 pixels, and every pixel in the image is used as input data. To detect the traffic sign, we use HOG+SVM in the first step, and BCT, VBT and color information are used to determine the exact sign region. It is then binarized and the external frame of the sign is removed. Finally, the sign regions are normalized to the size of the training samples and classified by the neural network. These steps are illustrated in Fig. 7.

IV. EXPERIMENTAL RESULTS

Our test dataset contains many videos captured from different scenes, including countryside, city and highway.


Fig. 8. Left: The results using HOG+SVM. Right: The results using BCT, VBT and color information.
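The BCT refinement illustrated in Fig. 8 rests on the symmetry voting of Eqs. (1) and (2) in Section III-A. Below is a minimal NumPy sketch of that accumulator; note that the contribution C(i, j) is modeled here simply as how anti-parallel the two gradients are, which is an assumption on our part rather than the paper's exact weighting.

```python
import numpy as np

def bct_accumulator(edges, gx, gy):
    """Each pair of edge points votes for its midpoint (Eq. (1));
    the vote is weighted so that symmetric pairs, i.e. pairs with
    opposed gradient directions, contribute more (Eq. (2))."""
    pts = np.argwhere(edges)
    acc = np.zeros(edges.shape)
    for a in range(len(pts)):
        for b in range(a + 1, len(pts)):
            (r1, c1), (r2, c2) = pts[a], pts[b]
            g1 = np.array([gy[r1, c1], gx[r1, c1]])
            g2 = np.array([gy[r2, c2], gx[r2, c2]])
            n1, n2 = np.linalg.norm(g1), np.linalg.norm(g2)
            if n1 == 0 or n2 == 0:
                continue
            # anti-parallel gradients give cos = -1, hence weight = 1
            weight = max(0.0, -float(g1 @ g2) / (n1 * n2))
            acc[(r1 + r2) // 2, (c1 + c2) // 2] += weight
    return acc
```

For a circular contour the accumulator peaks at the circle center even when part of the contour is missing, which is exactly the property the paper exploits for occluded signs.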

Fig. 7. Step 1: Using HOG+SVM to detect the sign regions. Step 2: Using BCT, VBT and color information to determine the exact sign regions. Step 3: Recognize and classify the traffic signs by neural networks.

TABLE I
THE RESULTS USING HOG+SVM.

              True positive   False negative   Recall
HOG+SVM       2631            236              92%
Red region    1748            1119             61%

TABLE II
THE RESULTS USING BCT, VBT AND COLOR INFORMATION.

Signs correctly detected      2578
Background detected as sign   351
Signs undetected              289
Total signs in test data      2867
Precision                     88%
Recall                        90%
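The precision and recall figures reported in Tables I and II follow directly from the counts, with precision = correct detections / all detections and recall = correct detections / total signs in the test data. The arithmetic can be checked in a few lines:

```python
# Counts reported in Table II (BCT, VBT and color information)
correct = 2578        # signs correctly detected
false_pos = 351       # background detected as sign
total_signs = 2867    # total signs in the test data

precision = correct / (correct + false_pos)  # 2578 / 2929
recall = correct / total_signs               # 2578 / 2867
print(round(100 * precision), round(100 * recall))  # 88 90

# Table I: recall of the two compared detectors
print(round(100 * 2631 / total_signs))  # HOG+SVM: 92
print(round(100 * 1748 / total_signs))  # red-region method: 61
```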
The image resolution of the dataset is 1920 × 1080 pixels, and the images are resized to 512 × 288 pixels in the detection step. The numbers of training samples for circular signs, triangular signs and non-signs are 1300, 1200 and 1500, respectively. All training data are divided into many classes for the neural network.

There are four situations for the detection results: true positive, false positive, true negative, and false negative. We compare two methods on a PC platform with a C/C++ implementation: using HOG+SVM to detect traffic signs, and finding the red regions to determine the traffic sign location. The results are shown in Table I. Since HOG+SVM captures regions similar to a circular or triangular sign, we use BCT, VBT and color information to improve the detection accuracy. The results are shown in Fig. 8 and Table II.

In the experiments, we have 2929 possible traffic sign regions detected as positive. The possible regions are recognized by the neural network. The results are shown in Table III. We can see from the results that only 23 wrong traffic signs are classified from the 351 false detection regions. In more difficult conditions, like bad weather or dim light, we can still recognize the traffic signs (see Figs. 9 and 10).

TABLE III
THE ROAD SIGN RECOGNITION RESULTS OF THE PROPOSED SYSTEM.

Signs correctly classified         2462
Signs misclassified                467
Background misclassified as sign   23
Signs undetected                   289
Total signs in test data           2867
Precision                          99%
Recall                             84%

V. CONCLUSIONS

In this paper, we present a road sign detection and recognition algorithm based on image gradient information. It adopts the histogram of oriented gradients together with a support vector machine to perform the initial traffic sign detection. The exact position of the traffic sign is then obtained by BCT, VBT and color information. Experiments carried out on real scene images are used to demonstrate the effectiveness


Fig. 9. Road sign detection and recognition on a sunny day.

Fig. 10. Road sign detection and recognition under a dim light condition.

of the proposed technique. In future work, the proposed method will be tested on the German traffic sign detection and recognition benchmark [18] with various types of road signs.

ACKNOWLEDGMENT

The support of this work in part by Create Electronic Optical Co., LTD is gratefully acknowledged.

REFERENCES

[1] R. Timofte, K. Zimmermann, and L. Van Gool, "Multi-view traffic sign detection, recognition, and 3d localisation," Mach. Vision Appl., vol. 25, no. 3, pp. 633–647, Apr. 2014.
[2] Z. Zhu, D. Liang, S. Zhang, X. Huang, B. Li, and S. Hu, "Traffic-sign detection and classification in the wild," in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2016, pp. 2110–2118.
[3] C. Liu, F. Chang, Z. Chen, and D. Liu, "Fast traffic sign recognition via high-contrast region extraction and extended sparse representation," IEEE Transactions on Intelligent Transportation Systems, vol. 17, no. 1, pp. 79–92, Jan 2016.
[4] X. Yuan, X. Hao, H. Chen, and X. Wei, "Robust traffic sign recognition based on color global and local oriented edge magnitude patterns," IEEE Transactions on Intelligent Transportation Systems, vol. 15, no. 4, pp. 1466–1477, Aug 2014.
[5] A. de la Escalera, J. M. Armingol, J. M. Pastor, and F. J. Rodriguez, "Visual sign information extraction and identification by deformable models for intelligent vehicles," Trans. Intell. Transport. Sys., vol. 5, no. 2, pp. 57–68, Jun. 2004.
[6] W.-C. Huang and C.-H. Wu, "Adaptive color image processing and recognition for varying backgrounds and illumination conditions," IEEE Transactions on Industrial Electronics, vol. 45, no. 2, pp. 351–357, Apr 1998.
[7] H. Y. Lin and J. Y. Wei, "A street scene surveillance system for moving object detection, tracking and classification," in 2007 IEEE Intelligent Vehicles Symposium, June 2007, pp. 1077–1082.
[8] A. Broggi, P. Cerri, P. Medici, P. P. Porta, and G. Ghisio, "Real time road signs recognition," in 2007 IEEE Intelligent Vehicles Symposium, June 2007, pp. 981–986.
[9] R. Belaroussi and J.-P. Tarel, "Angle vertex and bisector geometric model for triangular road sign detection," in 2009 Workshop on Applications of Computer Vision (WACV), Dec 2009, pp. 1–7.
[10] R. Qian, Q. Liu, Y. Yue, F. Coenen, and B. Zhang, "Road surface traffic sign detection with hybrid region proposal and fast r-cnn," in 2016 12th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD), Aug 2016, pp. 555–559.
[11] C. Bahlmann, Y. Zhu, V. Ramesh, M. Pellkofer, and T. Koehler, "A system for traffic sign detection, tracking, and recognition using color, shape, and motion information," in IEEE Proceedings. Intelligent Vehicles Symposium, 2005, June 2005, pp. 255–260.
[12] W. J. Kuo and C. C. Lin, "Two-stage road sign detection and recognition," in 2007 IEEE International Conference on Multimedia and Expo, July 2007, pp. 1427–1430.
[13] J. Greenhalgh and M. Mirmehdi, "Real-time detection and recognition of road traffic signs," IEEE Transactions on Intelligent Transportation Systems, vol. 13, no. 4, pp. 1498–1506, Dec 2012.
[14] S. Maldonado-Bascon, S. Lafuente-Arroyo, P. Gil-Jimenez, H. Gomez-Moreno, and F. Lopez-Ferreras, "Road-sign detection and recognition based on support vector machines," IEEE Transactions on Intelligent Transportation Systems, vol. 8, no. 2, pp. 264–278, June 2007.
[15] C.-Y. Fang, S.-W. Chen, and C.-S. Fuh, "Road-sign detection and tracking," IEEE Transactions on Vehicular Technology, vol. 52, no. 5, pp. 1329–1341, Sept 2003.
[16] R. Belaroussi and J.-P. Tarel, "A real-time road sign detection using bilateral chinese transform," in Proceedings of the 5th International Symposium on Advances in Visual Computing: Part II, ser. ISVC '09. Berlin, Heidelberg: Springer-Verlag, 2009, pp. 1161–1170.
[17] P. Sermanet and Y. LeCun, "Traffic sign recognition with multi-scale convolutional networks," in The 2011 International Joint Conference on Neural Networks, July 2011, pp. 2809–2813.
[18] S. Houben, J. Stallkamp, J. Salmen, M. Schlipsing, and C. Igel, "Detection of traffic signs in real-world images: The german traffic sign detection benchmark," in The 2013 International Joint Conference on Neural Networks (IJCNN), Aug 2013, pp. 1–8.


