
PhD Research Proposal

Multiple Vehicles Detection and Tracking from a Moving Platform

Research Aim
The aim of this research is to detect and track multiple vehicles from a moving platform. The research falls within the areas of ADAS (Advanced Driver Assistance Systems) and ITS (Intelligent Transportation Systems), and will be used mainly for collision avoidance systems.

Introduction
Vehicle detection and tracking systems are currently being developed for driver assistance. On-board sensors are used to alert drivers about the driving environment and possible collisions with other vehicles. Vehicle detection techniques have gained importance over the last 15 years. A few months ago, Google's self-driving cars completed their first test drives, opening a new era of autonomous driving in this field. The development of a reliable and successful system for vehicle detection and tracking is the main step towards such applications. The main problem in vehicle detection is the correct extraction of vehicles from consecutive video frames in complex outdoor environments, with changing illumination conditions, unpredictable interactions among traffic participants, and cluttered backgrounds. The task becomes more challenging due to large changes in vehicle appearance: vehicles vary in size, color and shape, and their appearance is affected by nearby objects. Conventional background subtraction methods fail completely because the viewpoint changes from frame to frame. The process usually requires a near-real-time response to what has been seen: ideally, each video frame must be processed rapidly enough to give the vehicle detection system sufficient reaction time when a collision is likely. The choice of detection sensors also matters; the sensors currently in use include lasers, radar, cameras and others, and data fusion among several sensors is itself challenging. The aim of my research is to develop a reliable vehicle detection and tracking system that is robust under such complex conditions.

The basic model for vehicle detection and tracking consists of the following three steps:

(i) Vehicle candidate generation
(ii) Verification of generated candidates
(iii) Tracking of verified candidates

Fig. 1. General flow diagram for vehicle detection


The vehicle candidate generation step performs an initial filtering of possible vehicle candidates in an image. The verification step separates false candidates from true ones. The tracking step then follows the verified candidates across multiple frames and issues warnings in case of collision danger.
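
As a minimal illustration of this three-stage structure, the pipeline could be organized as the following Python/OpenCV sketch; the stage functions and the video file name are hypothetical placeholders, not the final design:

```python
import cv2

def generate_candidates(gray):
    # Stage (i), hypothetical: return candidate bounding boxes (x, y, w, h).
    return []

def verify_candidates(gray, candidates):
    # Stage (ii), hypothetical: keep only boxes classified as vehicles.
    return []

def track_vehicles(gray, vehicles, trackers):
    # Stage (iii), hypothetical: update per-vehicle trackers, check collision risk.
    pass

cap = cv2.VideoCapture("road.avi")  # assumed test sequence
trackers = {}
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    candidates = generate_candidates(gray)
    vehicles = verify_candidates(gray, candidates)
    track_vehicles(gray, vehicles, trackers)
cap.release()
```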

Background
Vehicle detection methods rely on on-board sensors such as optical sensors [1], millimeter-wave radar [2], laser scanners (i.e., LIDAR) [3], or acoustic sensors [4]. Radar shows good performance on highways but experiences difficulties in complicated environments. Laser scanners work well only within a short distance. To improve reliability, radar or laser scanners can be fused with image sensors [5]; however, radar and laser scanners are far more expensive than cameras. There are also stereo vision-based methods, based on disparity maps or inverse perspective mapping (IPM) [6][7], but these often suffer from correspondence problems and become computationally too expensive. A single-camera vision-based system, in contrast, can often achieve satisfactory vehicle detection accuracy, and owing to its low cost can be deployed on most vehicles.

The methods for vehicle candidate generation can currently be classified into two groups: (i) local feature-based methods, and (ii) motion-based methods.

Local feature-based methods detect local features such as edges [8], corners [9], shadows [10], texture [11], symmetry [12], or light artifacts [13].

Edges are strong cues for finding initial vehicle regions in a scene. Common edge operators include Sobel, Canny, Kovesi-Owens and LoG (Laplacian of Gaussian); intelligent grouping of the detected edges is then required to find vehicle regions. In [14], the robustness of the Sobel, Canny and Kovesi-Owens operators was evaluated. In [9], a corner-based method was proposed that hypothesizes vehicles from four corners (upper-left, upper-right, lower-left and lower-right). The shadow beneath a vehicle is one of the strongest features in the image; in [10], a threshold-based shadow detection method was proposed. Texture-based methods rely on entropy, energy, contrast and correlation measurements [11]. In [15], neural network (NN) based symmetry detection was proposed, exploiting the symmetry of horizontal and vertical edges.

Motion-based methods generate optical flow vectors for moving objects in the image. In [16], optical flow was estimated from spatio-temporal derivatives of grey-level images. There are also sparse optical flow methods, which consume less time by using features such as corners [17], [18], local maxima and minima [19], or color blobs [20].

Methods for verifying generated vehicle candidates can be classified as (i) part-based methods, or (ii) appearance-based methods.

Part-based methods verify candidates by detecting the presence of vehicle parts such as license plates and rear windows [21]. In [22], a U-shaped model of a vehicle (one horizontal edge, two vertical edges, and two corners connecting them) was proposed to verify the generated candidates. In [23], template-based symmetry was used to verify vehicle presence.

Appearance-based methods treat verification as a two-class pattern classification problem: vehicle versus non-vehicle. Robust classification requires estimating an optimized decision boundary between the two classes, which is not an easy task due to the large intra-class variability of both vehicles and non-vehicles. Features are extracted from examples of both classes and used to train classifiers that distinguish vehicles from non-vehicles; both local and global features are employed. Feature descriptors such as SIFT [24], SURF [25], Haar-like features [26] and Gabor features [27] are widely used. In [28], a Haar-like feature dictionary was used to find complex patterns. In [29], Gabor filters were investigated: Gabor features were extracted from overlapping sub-windows and an SVM was then used for classification. SURF detectors [30] were used to obtain shape and texture feature vectors of vehicles in images.

Tracking of detection results allows us to remove detection errors, reduce the number of false positives, and obtain trajectory information for the detected vehicles. There has been much work on trajectory-based vehicle tracking. In some approaches, tracking is done by frame-by-frame detection of geometry and dynamics, without explicit appearance models. Other approaches use a Bayesian framework to fuse detection and tracking, combining appearance-based models with pattern density, dynamics and the probabilistic presence of trajectory states. Particle filters [31] are very popular due to their ability to closely approximate complex real-world multi-target distributions with weighted sample densities. The most important problem in tracking is how to track multiple targets simultaneously.

Objectives

The main objectives of the PhD research are:

Optimized algorithm
• To develop an optimized algorithm for detection and verification of multiple vehicles
Reliability
• Reliable in the sense that the candidates remaining after verification should include only vehicles, and the detection process must not miss any vehicle on the road.
• Reliable tracking of overlapping vehicles
Real-time operation
• Real-time detection of vehicles, which requires a processing rate of 25-30 frames per second.
Performance evaluation
• To compare the performance of the developed scheme with other state-of-the-art technologies
Hardware
• To propose a hardware architecture (e.g., GPU or FPGA) for implementing the optimized algorithm
Extension
• Robust vehicle detection during night time
• Vehicle detection under different weather conditions such as fog, snow, or rain.
• Sensor fusion with a range of other sensors such as lasers, radar, or a 3D camera.

Methodology
The current methodology involves capturing video from a single on-board camera or from multiple time-synchronized cameras. The current research focus is on detection in daylight conditions, with extensions planned once the basic approach is implemented.

Vision-based Vehicle Detection System


Since vehicles come in various shapes, explicit vehicle models will not be used for detection. Instead, the shadow underneath a vehicle is chosen as a key feature, based on previous work in [10], where vehicle shadows were successfully used for vehicle detection. The shadows under the vehicles will therefore be detected.

Fig. 2 shows the overall flow diagram of the vehicle detection and tracking method. Starting with RGB-to-grey conversion, we first decide the region of interest (ROI) in which vehicles should be detected. Then edges are extracted and processed using a modified Sobel mask, and vehicle candidates are generated from the edges.
Fig. 2. Flow diagram of vehicle detection and tracking

Vehicle Candidate Generation


The images are pre-processed with histogram equalization to reduce the effects of changing illumination.
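
A minimal preprocessing sketch in Python/OpenCV, assuming the lower half of the frame as a placeholder ROI and an assumed input file name:

```python
import cv2

frame = cv2.imread("frame.png")                  # one captured frame (assumed file)
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)   # RGB-to-grey conversion
roi = gray[gray.shape[0] // 2:, :]               # lower half as a placeholder ROI
equalized = cv2.equalizeHist(roi)                # compensate illumination changes
```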

Edge Extraction
For edge extraction, a new adaptive threshold-based extended Sobel edge operator will be used.
When the brightness slope at a pixel exceeds the threshold value, the pixel is classified as belonging to an edge. Careful selection of the threshold is important, as an inappropriate value directly degrades the result of edge extraction. With a low threshold, a lot of edge information is obtained, but the extracted edge lines are thick and the result is highly sensitive to noise. With a high threshold, on the other hand, not all useful edges are extracted, resulting in a significant loss of edge information. Various methods exist for determining an appropriate threshold, but they show limited performance: one fixed threshold may be effective for some regions of an image but, in many cases, not for the entire image, because the brightness of an image is not uniform across all its regions. Wrongly extracted edges may later adversely affect the vehicle detection rate.
An adaptive threshold edge detection method will therefore be used. Edges in dark regions are especially difficult to extract, so a lower threshold should be used in dark regions than in bright ones. The 3x3 Sobel responses in the horizontal and vertical directions at a pixel P are computed by the following equations, where U_i, D_i, L_i and R_i denote the three neighbours above, below, to the left and to the right of P:
P_h = (D_1 + 2D_2 + D_3) − (U_1 + 2U_2 + U_3)  (1)
P_v = (R_1 + 2R_2 + R_3) − (L_1 + 2L_2 + L_3)  (2)

In a fixed-threshold method, a horizontal (vertical) edge is extracted at P if |P_h| ≥ threshold (|P_v| ≥ threshold). In the adaptive threshold method, the block-level brightness B_P of a block centered at P is computed, and the threshold at P is given by the following equation.
Threshold(P) = c ⋅ (B_P)^α  (3)
The values of α and c are carefully chosen from the average and standard deviation of the image brightness.
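
A sketch of Eqs. (1)-(3) in Python/OpenCV, using the standard 3x3 Sobel operator and a box filter for the block brightness B_P; the constants c, α and the block size are placeholder values to be tuned as described above:

```python
import cv2
import numpy as np

def adaptive_sobel_edges(gray, c=0.5, alpha=0.8, block=15):
    """Threshold Sobel responses per pixel by Threshold(P) = c * (B_P)**alpha,
    where B_P is the mean brightness of a block centered at P.
    c, alpha and block are placeholder values."""
    g = gray.astype(np.float32)
    p_h = cv2.Sobel(g, cv2.CV_32F, 0, 1, ksize=3)  # Eq. (1): responds to horizontal edges
    p_v = cv2.Sobel(g, cv2.CV_32F, 1, 0, ksize=3)  # Eq. (2): responds to vertical edges
    b_p = cv2.blur(g, (block, block))              # block-level brightness B_P
    thresh = c * np.power(b_p, alpha)              # Eq. (3): adaptive threshold
    h_edges = np.abs(p_h) >= thresh
    v_edges = np.abs(p_v) >= thresh
    return h_edges, v_edges
```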

When the brightness in an image changes gradually across a boundary, some edges may not be detected by a general Sobel filter. To extract such edges effectively, we will use the extended mask shown in Fig. 3.

Fig. 3. Extended Sobel mask.

Edge Processing
To generate vehicle candidates, the extracted edges need to be processed. The main goal is to find the width of each vehicle. The bottom or top boundary of a vehicle may not be extracted as a connected edge due to noise or road conditions, so we first connect edges wherever the length of the broken part is below a threshold, obtaining a connected edge that covers the width of a vehicle.

The second processing step is to separate individual objects, vehicles and non-vehicles alike. We will use vertical edges to separate objects; this is possible because objects usually have vertical boundary edges.

The third processing step is to remove short and diagonal edges. Short edges come from small objects on a vehicle or in the background, and diagonal edges from road markings or objects such as guard rails. Since neither is useful for vehicle detection, both are removed.
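
The first and third steps can be sketched with standard morphology and connected-component analysis (Python/OpenCV); the gap and minimum-length thresholds are placeholder values:

```python
import cv2
import numpy as np

def process_horizontal_edges(h_edges, max_gap=5, min_len=20):
    """Connect broken horizontal edge segments and drop short ones.
    max_gap and min_len are placeholder thresholds."""
    img = h_edges.astype(np.uint8) * 255
    # Close gaps narrower than max_gap pixels with a horizontal kernel.
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (max_gap, 1))
    closed = cv2.morphologyEx(img, cv2.MORPH_CLOSE, kernel)
    # Remove connected components narrower than min_len pixels.
    n, labels, stats, _ = cv2.connectedComponentsWithStats(closed, connectivity=8)
    out = np.zeros_like(closed)
    for i in range(1, n):  # label 0 is the background
        if stats[i, cv2.CC_STAT_WIDTH] >= min_len:
            out[labels == i] = 255
    return out
```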

Making an ROI based on a LUT


Vehicle candidates are generated from the horizontal edges. The candidates are then filtered through several stages to remove non-vehicle edges.

When a horizontal edge contains more than a predetermined number of pixels, it becomes a vehicle candidate; for a 640x480 image, this pixel threshold will be determined empirically. The width of a vehicle in the image depends on the vertical (y) position of the vehicle's bottom edge, so for each possible vertical coordinate y we can determine the minimum and maximum plausible vehicle widths. These bounds will be stored in a look-up table (LUT). For each generated candidate edge, we verify whether its width lies in the appropriate range by consulting the LUT. Since the generation step produces a large number of candidates, this LUT-based filtering efficiently removes non-vehicles and defines an ROI around each remaining candidate.
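
One way to realize such a LUT, assuming a flat road and a simple pinhole camera model; the focal length, camera height, horizon row and physical width bounds below are all placeholder values for illustration:

```python
def build_width_lut(height=480, focal=600.0, cam_h=1.2,
                    min_w=1.4, max_w=2.6):
    """Hypothetical LUT: for each bottom-edge row y, the plausible pixel-width
    range of a vehicle. Assumes a flat road and a pinhole camera; all
    parameters (focal length in pixels, camera height and vehicle widths
    in metres, horizon at mid-image) are placeholder values."""
    lut = {}
    horizon = height // 2
    for y in range(horizon + 1, height):
        dist = focal * cam_h / (y - horizon)       # road distance imaged at row y
        lut[y] = (focal * min_w / dist, focal * max_w / dist)
    return lut

def width_is_plausible(lut, y, width_px):
    lo, hi = lut.get(y, (0.0, float("inf")))
    return lo <= width_px <= hi
```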

Candidate Verification
Candidate verification by using visual features with SVM
The candidates generated after LUT filtering will be verified using the Bag of Visual Words model [32], modified here to handle multiple vehicles. The training database will be built by taking pictures of vehicles on the test roads and by using the MIT vehicle database [33]. Visual features will be extracted from the consistent patches across the vehicle images using SIFT, and a visual dictionary of these patches will be built by k-means clustering [34]. The clustering algorithm is chosen based on performance, evaluated using the recovery rate defined in [35]. The features obtained after clustering are then used to train an SVM; the whole training is performed offline. For the verification of possible vehicle candidates, SIFT features are extracted from the candidates generated by the LUT stage and tested by the SVM for a category decision. The process is shown as a flow chart in Fig. 4.

Fig. 4. Flow chart of the Bag of Visual Words with SVM approach
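
A compact sketch of the intended Bag of Visual Words pipeline, using OpenCV SIFT with scikit-learn's KMeans and SVC as stand-ins for the clustering and SVM stages; the vocabulary size and the training-data handling are placeholder assumptions:

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

K = 100                     # vocabulary size (placeholder)
sift = cv2.SIFT_create()

def bovw_histogram(patch, vocab):
    """Map a grey-level image patch to a normalized visual-word histogram."""
    _, desc = sift.detectAndCompute(patch, None)
    hist = np.zeros(K, dtype=np.float32)
    if desc is not None:
        for word in vocab.predict(desc):
            hist[word] += 1
        hist /= hist.sum()
    return hist

# Offline training (train_patches / labels are the assumed vehicle and
# non-vehicle training sets from the test roads and the MIT database):
# all_desc = np.vstack([sift.detectAndCompute(p, None)[1] for p in train_patches])
# vocab = KMeans(n_clusters=K).fit(all_desc)
# svm = SVC().fit([bovw_histogram(p, vocab) for p in train_patches], labels)
# Online verification of one LUT candidate ROI:
# is_vehicle = svm.predict([bovw_histogram(candidate_roi, vocab)])[0] == 1
```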

Candidate filtering by using Symmetry


The symmetry of vehicles can also be used for candidate filtering. In [15], a neural network was used for symmetry checking; in contrast, we can use a simpler pixel-by-pixel AND operation, which also reduces the computational cost compared with neural networks. Symmetry verification is necessary to confirm the vehicle pose: visual features alone do not give full information about the generative model of the object, so symmetry is used to verify the exact pose and location of the object within the ROI. After this filtering, only true vehicle candidates remain.
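
A sketch of the pixel-by-pixel AND symmetry check on a candidate's binary edge map; the acceptance threshold is a placeholder value:

```python
import cv2
import numpy as np

def is_symmetric(roi_edges, threshold=0.5):
    """Pixel-wise AND between a binary (uint8) edge map and its horizontal
    mirror; a high overlap ratio suggests a symmetric, vehicle-like object.
    The 0.5 acceptance threshold is a placeholder value."""
    mirrored = cv2.flip(roi_edges, 1)            # flip around the vertical axis
    overlap = cv2.bitwise_and(roi_edges, mirrored)
    edge_px = np.count_nonzero(roi_edges)
    if edge_px == 0:
        return False
    return np.count_nonzero(overlap) / edge_px >= threshold
```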

Tracking
After the removal of non-vehicles, the verified vehicles are tracked over consecutive frames. A particle filter will be used to track multiple targets closely, based on weighted probability densities [31]. A Kalman filter can also be employed to track multiple targets across consecutive frames.
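
As an illustration of the Kalman filter option, a constant-velocity filter per tracked vehicle could be set up with OpenCV as follows; the noise covariances are placeholder values:

```python
import cv2
import numpy as np

def make_kalman(cx, cy):
    """Constant-velocity Kalman filter for one vehicle center (sketch)."""
    kf = cv2.KalmanFilter(4, 2)  # state: (x, y, vx, vy); measurement: (x, y)
    kf.transitionMatrix = np.array([[1, 0, 1, 0],
                                    [0, 1, 0, 1],
                                    [0, 0, 1, 0],
                                    [0, 0, 0, 1]], np.float32)
    kf.measurementMatrix = np.array([[1, 0, 0, 0],
                                     [0, 1, 0, 0]], np.float32)
    kf.processNoiseCov = np.eye(4, dtype=np.float32) * 1e-2       # placeholder
    kf.measurementNoiseCov = np.eye(2, dtype=np.float32) * 1e-1   # placeholder
    kf.statePost = np.array([[cx], [cy], [0], [0]], np.float32)
    return kf

# Per frame: predicted = kf.predict(); then, if the vehicle is re-detected
# at (mx, my): kf.correct(np.array([[mx], [my]], np.float32))
```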

Hardware implementation
Hardware implementation is the most important part for meeting the timing constraints of the algorithm. Implementations on parallel architectures will be used: in [36], an ADI-BF561 DSP module is employed, and an SoC with a 55.3 GOPS image recognition engine was proposed in [37]. An optimized architecture for implementation on dedicated hardware will be developed to reduce the processing time of the algorithm, leaving drivers enough time to react in an emergency.

Extensions
Night time detection
To make vehicle detection robust at night, head lights and tail lights will be used as strong features to distinguish vehicles from non-vehicles [38].

Weather Conditions
Under diverse weather conditions such as snow, fog and rain, classical vehicle detection methods have difficulty detecting vehicles; model-based vehicle detectors are therefore used [39].

Sensor fusion
Different sensors, such as lasers, will be employed for pre-recognition of vehicles, to mitigate the problem of obstructed distant views in image sensors. Furthermore, additional cameras on the sides of the car will be deployed to detect vehicles alongside as well as in front.
Proposed research timeline and targets

January 2012 - June 2012: Literature review. Learning OpenCV and doing MATLAB experiments.
July 2012 - December 2012: Learning camera calibration in HAKA1. Video recording experiments. Seminar presentation. Preparing two conference submissions.
January 2013 - June 2013: Main algorithm development. Testing of the algorithm. Preparing a journal submission.
July 2013 - December 2013: Debugging and optimization of the main algorithm. Performance evaluation of the main algorithm. Preparing two conference submissions. Seminar presentation.
January 2014 - June 2014: Modifications to the algorithm for diverse weather conditions and night time. Deciding whether to include sensors other than the camera. Writing a second journal submission with improved results including weather conditions. Proposing hardware for the final algorithm. Seminar presentation.
July 2014 - December 2014: Completing experimental work. Preparing two conference submissions. Thesis writing.

References
[1] Z. Sun, G. Bebis, and R. Miller, “On-Road Vehicle Detection Using Optical Sensors,” IEEE International
Conference on Intelligent Transportation Systems, pp. 585-590, 2004.
[2] S. Park, T. Kim, S. Kang, and K. Heon, “A Novel Signal Processing Technique for Vehicle Detection
Radar,” 2003 IEEE MTT-S Int’l Microwave Symposium Digest, pp. 607-610, 2003.
[3] C. Wang, C. Thorpe, and A. Suppe, “Ladar-Based Detection and Tracking of Moving Objects from a
Ground Vehicle at High Speeds,” Proc. IEEE Intelligent Vehicles Symposium, 2003.
[4] R. Chellappa, G. Qian, and Q. Zheng, “Vehicle Detection and Tracking Using Acoustic and Video
Sensors,” Proc. IEEE Int’l Conf. Acoustics, Speech, and Signal Processing, pp. 793-796, 2004.
[5] S. Yang, H. Lho and B. Song, “Sensor Fusion for Obstacle Detection and Its Application to an
Unmanned Vehicle,” ICCAS-SICE, pp.1365-1369, 2009.
[6] R. Klette and Z. Liu, "Computer Vision for the Car Industry," In Phase, IIT Guwahati Cepstrum Magazine, pp. 5-8, 2008.
[7] U. Franke and I. Kutzbach, “Fast Stereo Based Object Detection for Stop and Go Traffic,” Intelligent
Vehicles, pp. 339-344, 1996.
[8] N. Matthews, P. An, D. Charnley, and C. Harris, “Vehicle Detection and Recognition in Grey scale
Imagery,” Control Eng. Practice, vol. 4, pp. 473-479, 1996.
[9] M. Bertozzi, A. Broggi, and S. Castelluccio, “A Real-Time Oriented System for Vehicle Detection,”
Journal of Systems Architecture, pp. 317-325, 1997.
[10] C. Tzomakas and W. von Seelen, "Vehicle Detection in Traffic Scenes Using Shadows," Technical Report 98-06, Institut für Neuroinformatik, Ruhr-Universität Bochum, Germany, 1998.
[11] T. Kalinke, C. Tzomakas, and W. von Seelen, “A Texture-Based Object Detection and an Adaptive
Model-Based Classification,” Proc. IEEE Int’l Conf. Intelligent Vehicles, pp. 143-148, 1998.
[12] M. Bertozzi, A. Broggi, and A. Fascioli, “Vision-Based intelligent Vehicles: State of the Art and
Perspectives,” Robotics and Autonomous Systems, vol. 32, pp. 1-16, 2000.
[13] R. Cucchiara and M. Piccardi, “Vehicle Detection under Day and Night Illumination,” Proc. Int’l ICSC
Symposium Intelligent Industrial Automation, 1999.
[14] A. Al-Sarraf, T. Vaudrey, R. Klette and Y. W. Woo, “An Approach for Evaluating Robustness of Edge
Operators on Real-World Driving Scenes,” Proc. IEEE Int’l Conf. IVCNZ, pp. 1-6, 2008.
[15] W. von Seelen, C. Curio, J. Gayko, U. Handmann, and T. Kalinke, “Scene Analysis and Organization of
Behavior in Driver Assistance Systems,” Proc. IEEE Int’l Conf. Image Processing, pp. 524-527, 2000.
[16] W. Kruger, W. Enkelmann, and S. Rossle, “Real-Time Estimation and Tracking of Optical Flow
Vectors for Obstacle Detection,” Proc. IEEE Intelligent Vehicle Symposium, pp. 304-309, 1995.
[17] J. Weng, N. Ahuja, and T. Huang, "Matching Two Perspective Views," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 14, pp. 806-825, 1992.
[18] S. Smith and J. Brady, "ASSET-2: Real-Time Motion Segmentation and Shape Tracking," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 17, 1995.
[19] D. Koller, N. Heinze, and H. Nagel, “Algorithmic Characterization of Vehicle Trajectories from Image
Sequence by Motion Verbs,” Proc. IEEE Int’l Conf. Computer Vision and Pattern Recognition, pp. 90-95,
1991.
[20] B. Heisele and W. Ritter, “Obstacle Detection Based on Color Blob Flow,” Proc. IEEE Intelligent
Vehicle Symposium, pp. 282-286, 1995.
[21] P. Parodi and G. Piccioli, “A Feature-Based Recognition Scheme for Traffic Scenes,” Proc. IEEE
Intelligent Vehicles Symposium, pp. 229-234, 1995.
[22] U. Handmann, T. Kalinke, C. Tzomakas, M. Werner, and W. Seelen, “An Image Processing System for
Driver Assistance,” Image and Vision Computing, vol. 18, no. 5, 2000.
[23] T. Ito, K. Yamada, and K. Nishioka, “Understanding Driving Situations Using a Network Model,”
Intelligent Vehicles, pp. 48-53, 1995.
[24] D. G. Lowe, “Distinctive Image Features from Scale-Invariant Keypoints,” International Journal of
Computer Vision, pp. 91-110, 2004.
[25] H. Bay, A. Ess, T. Tuytelaars, and L. Van Gool, "SURF: Speeded Up Robust Features," Computer Vision and Image Understanding (CVIU), vol. 110, pp. 346-359, 2008.
[26] P. Viola and M. Jones, "Rapid Object Detection Using a Boosted Cascade of Simple Features," Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2001.
[27] H. G. Feichtinger and T. Strohmer (eds.), Gabor Analysis and Algorithms, Birkhäuser, 1998.
[28] C. Papageorgiou and T. Poggio, “A Trainable System for Object Detection,” Int’l Journal Computer
Vision, vol. 38, no 1, pp. 15-33, 2000.
[29] Z. Sun, G. Bebis, and R. Miller, "On-Road Vehicle Detection Using Gabor Filters and Support Vector Machines," Proc. IEEE Int'l Conf. Digital Signal Processing, July 2002.
[30] R. Collins and Y. Liu, “On-Line Selection of Discriminant Tracking Features,” Proc. IEEE Int’l Conf.
Computer Vision, 2003.
[31] D. P. Bai and B. B. Lee, "Based on Particle Filter for Vehicle Detection and Tracking in Digital Video," Int'l Conf. on Machine Learning and Cybernetics, pp. 2810-2814, 2008.
[32] G. Csurka, C. Bray, C. Dance, and L. Fan, ‘‘Visual categorization with bags of keypoints,’’ Proc. of
Workshop Stat. Learning Computer Vision, pp. 1–22, 2004.
[33] MIT CBCL car dataset: http://cbcl.mit.edu/projects/cbcl/software-datasets/CarData.html
[34] M. Inaba, N. Katoh, and H. Imai, "Applications of Weighted Voronoi Diagrams and Randomization to Variance-Based k-Clustering," Proc. 10th ACM Symposium on Computational Geometry, pp. 332-339, 1994.
[35] F. Li and R. Klette, "Recovery Rate of Clustering Algorithms," Proc. PSIVT, pp. 1058-1069, 2009.
[36] S. P. Tseng and D. Fong, "A DSP-Based Real-Time Front Car Detection Driving Assistant System," Proc. SICE, pp. 2419-2423, 2010.
[37] H. Hamasaki, Y. Hoshi, A. Nakamura, A. Yamamoto, H. Kido and S. Muramatsu, "SoC for Car Navigation Systems with a 55.3 GOPS Image Recognition Engine," Proc. 15th ASP-DAC, pp. 464-465, 2010.
[38] R. O'Malley, E. Jones and M. Glavin, "Rear-Lamp Vehicle Detection and Tracking in Low-Exposure Color Video for Night Conditions," IEEE Transactions on Intelligent Transportation Systems, pp. 453-462, 2010.
[39] H. Sakaino, "Moving Vehicle Velocity Estimation from Obscure Falling Snow Scenes Based on Brightness and Contrast Model," Proc. Int'l Conference on Image Processing, pp. 905-908, 2002.
