
2009 International Conference on Artificial Intelligence and Computational Intelligence

A Highly Efficient System for Traffic Mean Speed Estimation from MPEG Video
Fu Yuan Hu
Department of Electronics & Informatics
Suzhou University of Science and Technology
Suzhou City, Jiangsu Prov., P.R. China
Email: fuyuanhu@gmail.com
Xing Fa Dong
Department of Electronics & Informatics
Suzhou University of Science and Technology
Suzhou City, Jiangsu Prov., P.R. China
Email: dongxfa@mail.usts.edu.cn

Jian Wang
Department of Electronics & Informatics
Suzhou University of Science and Technology
Suzhou City, Jiangsu Prov., P.R. China
Email: wanjiansuzhou@sina.com

Hichem Sahli
Department of Electronics & Informatics
Vrije Universiteit Brussel (VUB)
Brussel, Belgium
Email: hsahli@etro.vub.ac.be



Abstract: In this paper, we present a vision-based traffic measurement system, allowing automatic traffic flow segmentation, camera calibration and traffic information estimation. The system quickly estimates the mean vehicle speed directly from MPEG motion vectors. Although extensive work has been done on extracting and using motion information from MPEG video data in the compressed domain, to the best of our knowledge, only a few works have been dedicated to the use of MPEG motion vectors for traffic analysis. The proposed system is stable and handles camera vibrations and illumination changes. The paper describes the main principles of our system together with qualitative and quantitative representative results.

Keywords: intelligent transportation systems; motion vectors; traffic measurement system; vehicle speed

I. INTRODUCTION
Nowadays, traffic analysis is one of the challenging societal and economic problems related to transportation in industrialized countries. The last few years have seen a growing interest in intelligent transportation systems (ITS). Within ITS, traffic surveillance/measurement systems (TMS) are gaining interest within the research community as well as among industrial and governmental institutions. Traffic surveillance systems must quickly provide information, such as vehicle speed and density, to traffic managers and drivers. Apart from real-time operations, traffic data is also an important source of information for long-term planning and traffic management activities.

Vehicle speed is a fundamental piece of traffic information that is essential to both macroscopic and microscopic traffic analysis [6]. The current state of the art in speed measurement technologies includes magnetic inductive loop detectors, magnetic strips, laser sensors, etc. Recently, vision-based approaches have also been used. Compared with other non-intrusive TMS, vision-based traffic measurement systems (VTMS) have many advantages, including portability, easy installation and operation, rich information acquisition, as well as low cost [11].

Most existing vision systems for monitoring road traffic rely on (i) virtual-loop-based counting or (ii) vehicle-tracking-based approaches. A virtual loop is composed of detection lines or a bounding area, manually defined. By emulating the functionality of the induction loop detector, this type of system considers changes in the image profiles (corresponding to vehicles crossing the lines or the bounding area) to count the number of vehicles and estimate their average speed [14]. These methods are simple and fast, but not flexible, as they require manual settings by an operator. Tracking-based approaches generally consist of three steps: vehicle detection [17], vehicle tracking, and traffic parameter calculation [18], [4], [2], [9], [10], [11]. In these approaches, the images of individual cars need to be separated, which limits their applicability in real situations with changes in lighting conditions, traffic congestion and vehicle occlusion [7].

In order to alleviate the above-mentioned limitations of vision-based systems, traffic analysis approaches based on estimated motion vectors have recently been proposed. These methods do not require a close view for analyzing traffic scenes; moreover, they have a reduced computational cost. Among such methods, Yu et al. [15], [13], [16] proposed several versions of an algorithm to quickly extract the average vehicle speed from MPEG compressed video within a time window. They assume that the vehicles' motion is homogeneous in the temporal domain and project the motion vectors from the image plane onto the ground plane using an affine camera model, the camera calibration being performed manually. The proposed approach has a very low computational cost, since the computation for fully decoding the MPEG stream and for computing trajectories or detecting vehicles is avoided. The interested reader is referred to [1], [8], [13] for the aspects of MPEG parsing for the extraction of motion vectors and DCT coefficients and their use for motion vector smoothing.



Using motion vectors, other authors proposed traffic analysis methods based on stochastic process models [3], [7], neural networks [?], and deterministic approaches [12]. MPEG parsing and traffic status classification are outside the scope of this paper.
For the estimation of the average vehicle speed, we propose using the 3-D vector field defined on the road surface, describing the motion of each 3-D point between two time steps. It can be seen as an extension of the motion vector or optical flow to 3-D, the motion vector being the projection of the 3-D scene flow onto the image, resulting in a 2-D vector field.
The remainder of the paper is organized as follows. In Section II, we describe the proposed approach. Section III presents the camera calibration strategy, Section IV defines the traffic mean speed estimation, Section V illustrates some experimental results, and finally Section VI gives some conclusions and future work.

Figure 1. The overall diagram of the proposed algorithm.


II. PROPOSED APPROACH


A. General Principle
Fig. 1 depicts the proposed system. It consists of two processing steps:
• An off-line processing step dealing with the automatic estimation of the camera calibration parameters using the standardized road marker dimensions. This step consists of two modules: first, the road area is determined by segmenting the motion vectors; having determined the road area, we estimate the background road image, from which the road markers are detected, and calibration points, corresponding to the corners of the road markers, are used to estimate the camera calibration parameters.
• An on-line module, which takes as input the smoothed motion vectors, being the 2D image velocity, and estimates, on the road area determined during the off-line step, the vehicles' 3D velocity using the camera calibration parameters.


B. 2D to 3D Transformation

Before discussing the traffic mean speed estimation, we need to define the notation for the camera projection parameters, the 3D scene velocity and the 2D image velocity. MPEG motion vectors, or optical flow, typically estimate 2D pixel motion, but when combined with known depth information they can yield estimates of the 3D pixel velocity. The 2D flow or motion vector at pixel (block) (x, y) in an image can be written as M(x, y) = (u, v), with motion magnitude MV(x, y) = \sqrt{u^2 + v^2}. The corresponding 3D velocity of the pixel (block) in 3D space is denoted as E(X, Y, Z) = (U, V, W). In the following we define both the relationship between a point (X, Y, Z) in the 3D space and its projection (x, y), and the relationship between M and E.

1) Camera Model: Here we consider a camera system placed at an unknown height and orientation with respect to the X-Y ground plane (assumed flat). Any point (X, Y, 0) on the ground plane is projected onto the pixel (x, y) of the image plane following [10], [5], [16]:

x = \frac{L_1 X + L_2 Y + L_3}{L_7 X + L_8 Y + 1}    (1)

y = \frac{L_4 X + L_5 Y + L_6}{L_7 X + L_8 Y + 1}    (2)

The parameters L_i, i = 1, ..., 8, are the camera parameters of the projective model. They are estimated using at least four pairs of calibration points that determine the correspondence between the ground plane and the image plane. In Section III, we propose an automatic method for obtaining these calibration points by making use of the white markers along the road/lanes, the standardized road width, and the marker length and width, as illustrated in Fig. 2.
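To make the calibration step concrete, the following sketch (our illustration, not the authors' code; function and variable names are ours) estimates L_1, ..., L_8 from four or more ground-to-image point pairs by linearizing Eqs. (1)-(2) and solving the resulting system in the least-squares sense.

```python
import numpy as np

def estimate_camera_parameters(ground_pts, image_pts):
    """ground_pts: (N, 2) array of (X, Y) road-plane points (e.g. metres).
       image_pts:  (N, 2) array of (x, y) pixel coordinates, N >= 4.
       Returns L = [L1, ..., L8] satisfying Eqs. (1)-(2) in the least-squares sense."""
    A, b = [], []
    for (X, Y), (x, y) in zip(ground_pts, image_pts):
        # x * (L7*X + L8*Y + 1) = L1*X + L2*Y + L3
        A.append([X, Y, 1, 0, 0, 0, -x * X, -x * Y]); b.append(x)
        # y * (L7*X + L8*Y + 1) = L4*X + L5*Y + L6
        A.append([0, 0, 0, X, Y, 1, -y * X, -y * Y]); b.append(y)
    L, *_ = np.linalg.lstsq(np.asarray(A, float), np.asarray(b, float), rcond=None)
    return L  # L[0]..L[7] correspond to L1..L8
```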

Figure 2. Required road marking parameters.

2) 3D Velocity Estimation: Ignoring the vehicles' height, we can write the velocity along the X and Y axes as:

U = \frac{dX}{dt} \approx \frac{X_2 - X_1}{\Delta t}

V = \frac{dY}{dt} \approx \frac{Y_2 - Y_1}{\Delta t}
where (X_1, Y_1) and (X_2, Y_2) are the positions of a vehicle at two successive time instants. Using Eq. (1) and Eq. (2), and after some manipulation, we obtain the 3D velocity as a function of the 2D flow (u, v):

U = \frac{\tilde{U}}{(\tilde{L}_4 x + \tilde{L}_5 y + \tilde{L}_6)^2}    (3)

V = \frac{\tilde{V}}{(\tilde{L}_4 x + \tilde{L}_5 y + \tilde{L}_6)^2}    (4)

with:

\tilde{U} = (\tilde{L}_1 \tilde{u} + \tilde{L}_2 \tilde{v})(\tilde{L}_4 x + \tilde{L}_5 y + \tilde{L}_6) - (\tilde{L}_4 \tilde{u} + \tilde{L}_5 \tilde{v})(\tilde{L}_1 x + \tilde{L}_2 y + \tilde{L}_3)

\tilde{V} = (\tilde{L}_7 \tilde{u} + \tilde{L}_8 \tilde{v})(\tilde{L}_4 x + \tilde{L}_5 y + \tilde{L}_6) - (\tilde{L}_4 \tilde{u} + \tilde{L}_5 \tilde{v})(\tilde{L}_7 x + \tilde{L}_8 y + \tilde{L}_9)

where (\tilde{u}, \tilde{v}) = FrameRate \cdot (u, v), and the parameters \tilde{L}_j, j = 1, ..., 9, are functions of the projection parameters L_i, i = 1, ..., 8, in Eq. (1) and Eq. (2).
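Since the coefficients \tilde{L}_j are not spelled out here, the sketch below (an assumption on our part, not the authors' implementation) obtains the same ground-plane velocity numerically: the pixel (x, y) and its displaced position (x + u, y + v) are back-projected onto the road plane by inverting Eqs. (1)-(2), and the displacement is scaled by the frame rate, as in Eqs. (3)-(4).

```python
import numpy as np

def pixel_to_ground(x, y, L):
    """Invert Eqs. (1)-(2) for a point on the Z = 0 road plane.
       L = [L1, ..., L8] are the projective camera parameters."""
    L1, L2, L3, L4, L5, L6, L7, L8 = L
    A = np.array([[L1 - x * L7, L2 - x * L8],
                  [L4 - y * L7, L5 - y * L8]])
    b = np.array([x - L3, y - L6])
    return np.linalg.solve(A, b)            # (X, Y) in road-plane units

def ground_velocity(x, y, u, v, L, frame_rate):
    """Ground-plane velocity (U, V) of the block at (x, y) whose motion
       vector is (u, v) pixels per frame."""
    X1, Y1 = pixel_to_ground(x, y, L)
    X2, Y2 = pixel_to_ground(x + u, y + v, L)
    return (X2 - X1) * frame_rate, (Y2 - Y1) * frame_rate
```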

Figure 3. Motion vector filtering. First row: frames 12 and 299; second row: MPEG motion vectors; third row: smoothed motion vectors.

III. CAMERA CALIBRATION

In this section we give an overview of the different steps for estimating the camera parameters L_i, i = 1, ..., 8.
A. Road Region Segmentation From Motion Vectors
The motion estimation in MPEG is based on block-matching criteria, which are known to be sensitive to noise. To derive reliable results, it is necessary to filter the raw motion vectors before further processing. In our approach, we use a spatio-temporal median filter to smooth the motion vectors. Fig. 3 illustrates the results of the motion vector filtering.
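A minimal sketch of such a spatio-temporal median filter is given below; the window sizes are assumptions, since the paper does not specify them.

```python
import numpy as np
from scipy.ndimage import median_filter

def smooth_motion_vectors(mv_field, spatial=3, temporal=3):
    """mv_field: (T, H, W, 2) array of per-block motion vectors (u, v).
       Applies a median filter over a temporal x spatial x spatial window,
       independently on the u and v components."""
    size = (temporal, spatial, spatial, 1)   # do not mix the u and v channels
    return median_filter(mv_field, size=size)
```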
The smoothed motion vectors are used as inputs to the road region segmentation module. In this step we try to detect road pixels, characterized by the fact that at these locations the motion vectors behave as a continuous step function. Indeed, a road pixel (block) has a motion state (moving/static) that changes dynamically as vehicles move on the road. Thus, we build a matrix DM(x, y) which counts the number of state changes of a given pixel (x, y) during a certain time (number of frames). Pixels with a certain number of changes (a parameter set empirically between 6 and 14) are considered road pixels, denoted as Road(x, y).
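The following sketch illustrates this state-change counting; the 6-14 range is the empirical parameter mentioned above, while the motion-magnitude threshold is an assumption of ours.

```python
import numpy as np

def segment_road(mv_field, mag_thresh=0.5, low=6, high=14):
    """mv_field: (T, H, W, 2) smoothed motion vectors over a window of T frames.
       Returns a boolean (H, W) road mask."""
    # Per-frame moving/static state of each block.
    moving = np.linalg.norm(mv_field, axis=-1) > mag_thresh      # (T, H, W)
    # DM(x, y): number of state changes over the time window.
    DM = np.abs(np.diff(moving.astype(np.int8), axis=0)).sum(axis=0)
    return (DM >= low) & (DM <= high)
```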
Having detected the road area, we estimate the background image (corresponding to the scene without moving vehicles). The temporal difference between successive images is the simplest way to separate background pixels from moving ones. However, this method does not handle illumination changes or camera vibration. In order to alleviate this problem, we use the temporal-difference history (over several images) to accommodate these changes.

Fig. 4 illustrates the results of the above-described steps, namely motion-based road region segmentation, background image estimation, and the obtained road image without moving objects.
Figure 4. Road extraction: (a) image frame; (b) extracted road region; (c) background image; (d) road image.
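One possible reading of the temporal-difference-history idea used for the background image is sketched below; the threshold and the median aggregation are assumptions, not necessarily the authors' exact procedure.

```python
import numpy as np

def estimate_background(frames, diff_thresh=8):
    """frames: (T, H, W) grey-level images of the road area.
       A pixel contributes to the background only in frames where its
       inter-frame difference is small; the background is the per-pixel
       median of those static observations."""
    frames = frames.astype(np.float32)
    diffs = np.abs(np.diff(frames, axis=0))          # (T-1, H, W)
    static = diffs < diff_thresh                     # small change => static
    masked = np.where(static, frames[1:], np.nan)    # hide moving observations
    return np.nanmedian(masked, axis=0)              # (H, W) background image
```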

B. Road Marking Extraction


Having extracted the road image, the next step is the extraction of the road markers and the detection of some corner points, as illustrated in Fig. 2. The different steps of this processing are:
• first, a Canny edge detection is applied to the road image;
• then, a progressive probabilistic Hough transform is applied, and we keep only the longest detected lines;
• we select all the parallel lines, being the borders of the road markings;
• finally, we fit a triangle to the line segments sustained by the detected parallel linear structures.
Fig. 5 illustrates the different steps for the detection of the road markers on the image of Fig. 4(d).
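A minimal OpenCV-based sketch of the first three steps (Canny edges, probabilistic Hough transform, selection of long, mutually parallel segments) is given below; all thresholds are assumptions, and the final triangle-fitting step is omitted.

```python
import cv2
import numpy as np

def detect_marking_lines(road_image, keep=20, angle_tol_deg=5.0):
    """road_image: 8-bit grey background road image.
       Returns line segments (x1, y1, x2, y2) that are long and roughly
       parallel to the dominant marking direction."""
    edges = cv2.Canny(road_image, 50, 150)
    lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=40,
                            minLineLength=30, maxLineGap=5)
    if lines is None:
        return []
    segs = lines.reshape(-1, 4)
    lengths = np.hypot(segs[:, 2] - segs[:, 0], segs[:, 3] - segs[:, 1])
    segs = segs[np.argsort(lengths)[::-1][:keep]]          # keep the longest
    angles = np.degrees(np.arctan2(segs[:, 3] - segs[:, 1],
                                   segs[:, 2] - segs[:, 0]))
    ref = np.median(angles)   # dominant direction (assumes no +/-90 deg wrap)
    return segs[np.abs(angles - ref) < angle_tol_deg]
```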
Finally, we estimate the camera parameters by solving Eq. (1) and Eq. (2), using as calibration points the detected marker corners and the associated world coordinate points obtained from the standardized dimensions of the markers and the road width, as illustrated in Fig. 2.

Figure 5. Control point detection: (a) lane marking segmentation; (b) example of detected elongated parallel lines; (c) detected markers.

Figure 6. 3D speed calculation without removing noise.

Figure 7. 3D speed calculation based on motion vectors: (a) the car is entering the given region; (b) the car is moving in the given region; (c) the car is moving in the given region; (d) the car is going out of the given region.

IV. TRAFFIC MEAN SPEED ESTIMATION


Having estimated the camera parameters, we can compute the 3D velocity as given by Eq. (3) and Eq. (4). The average vehicle speed is then determined either over the segmented road region or over a given region of interest (ROI) in the image, as follows:

\bar{V} = \frac{\sum_{x,y} \sqrt{U^2(x, y) + V^2(x, y)}}{N}    (5)
where N is the number of moving pixels (blocks) of the road region or of the defined ROI.
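A direct sketch of Eq. (5) over a road mask or ROI could look as follows; the motion-magnitude threshold used to decide which blocks count as moving is an assumption.

```python
import numpy as np

def mean_speed(U, V, road_mask, mag_thresh=0.5):
    """U, V: (H, W) ground-plane velocity components from Eqs. (3)-(4);
       road_mask: boolean (H, W) road region or ROI.
       Averages the speed magnitude over the N moving blocks, as in Eq. (5)."""
    speed = np.hypot(U, V)
    moving = road_mask & (speed > mag_thresh)
    n = np.count_nonzero(moving)
    return speed[moving].sum() / n if n else 0.0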


V. EXPERIMENTAL RESULTS
Here we test the vehicle speed estimation on real video recorded in Belgium. We estimate the 3D speed from the motion vectors after removing noise. Noise affects the accuracy of the mean vehicle speed, as shown in Fig. 6: even when there is no vehicle, the estimated speed is not equal to zero. Subsequently, we compute the 3D mean speed in the given region, and the corresponding parameters, after filtering.

First of all, we show the results in the given region for the motion vectors. From Fig. 7, the average speed is 28.4117 m/s, 28.4277 m/s, 26.2403 m/s and 15.7060 m/s, respectively. Although the distance between the car and the camera becomes shorter, the estimated speed for the different images is approximately the same, except for the last case, where the vehicle is leaving the ROI.

Then, we show the results of the mean vehicle speed and the ratio of moving pixels for a 1500-frame image sequence. Fig. 8(a) shows the ratio of moving pixels in the given region obtained from the motion vectors. Fig. 8(b) shows the mean vehicle speed per frame for the given region, computed from the motion vectors and the camera model parameters. From Fig. 8, we notice that the vehicle speed does not change significantly over a short time window when vehicles are present.

Finally, we examine the distribution of the estimated vehicle speed. Fig. 9 shows the distribution of the estimated speed over the 1500-frame image sequence. It is symmetric about its mean, which suggests that the estimated speeds approximately follow a normal distribution.
VI. CONCLUSION

In this paper, we present a vision-based traffic measurement system using MPEG motion vectors. The proposed approach allows automatic traffic flow segmentation, camera calibration and traffic information estimation, such as the average vehicle speed. The approach is robust and cost-effective. Further work will address the assessment of the estimation error by comparing the obtained mean traffic speed results to state-of-the-art methods.


Figure 8. Vehicles mean speed: (a) ratio of moving pixels in the given region versus frame number; (b) mean vehicle speed (m/s) in the given region versus frame number.

Figure 9. The distributions of the estimated speed (estimated speed in m/s).

ACKNOWLEDGMENT

This work has been done in the framework of the FLEXYS project, funded by the Interdisciplinary Institute for Broadband Technology (IBBT) (founded by the Flemish Government in 2004), and in part by Suzhou University of Science and Technology Grant 330911601 and the Natural Science Foundation of the Jiangsu Higher Education Institutions of China Grant 06KJD510169. The authors would like to thank Traficon for the image sequences.

REFERENCES

[1] A. Aggarwal, S. Biswas, S. Singh, S. Sural, and A. K. Majumdar. Object tracking using background subtraction and motion estimation in MPEG videos. Lecture Notes in Computer Science, 2006.

[2] B. Li, W. Tompkinson, and Y. Wang. Computer vision techniques for traffic flow computation. Pattern Analysis & Applications, 1999.

[3] A. B. Chan and N. Vasconcelos. Classification and retrieval of traffic video using auto-regressive stochastic processes. In Proceedings of the 2005 IEEE Intelligent Vehicles Symposium, pages 771-776, 2005.

[4] D. J. Dailey and F. W. Cathey. CCTV technical report (phase 3). Technical report, University of Washington, Oct. 2006.

[5] X. C. He and N. H. C. Yung. A novel algorithm for estimating vehicle speed from two consecutive images. In IEEE Workshop on Applications of Computer Vision, 2007.

[6] S. P. Hoogendoorn and P. H. L. Bovy. State-of-the-art of vehicular traffic flow modelling. Journal of Systems and Control Engineering, 215(4):283-303, 2001.

[7] F. Porikli and X. Li. Traffic congestion estimation using HMM models without vehicle tracking. In IEEE Intelligent Vehicles Symposium, pages 188-193, 2004.

[8] F. Porikli. Real-time video object segmentation for MPEG encoded video sequences. TR-2004-11 Report, 2004.

[9] T. N. Schoepflin and D. J. Dailey. Dynamic camera calibration of roadside traffic management cameras. IEEE Transactions on Intelligent Transportation Systems, 4(2):90-98, 2003.

[10] T. N. Schoepflin and D. J. Dailey. Algorithms for estimating mean vehicle speed using uncalibrated traffic management cameras. Technical report, University of Washington, Oct. 2003.

[11] J. M. Wang, Y. C. Chung, S. C. Lin, S. L. Chang, S. Cherng, and S. W. Chen. Vision-based traffic measurement system. In Proceedings of the 17th International Conference on Pattern Recognition, 2004.

[12] W. J. Knibbe, H. Bokma, A. Oostveen, and D. Poot-Geers. A new incident detection scheme developed in the Netherlands. In Proceedings of the 8th International IEEE Conference on Intelligent Transportation Systems, 2005.

[13] X.-D. Yu, L.-Y. Duan, and Q. Tian. Robust moving video object segmentation in the MPEG compressed domain. In International Conference on Image Processing, 2003.

[14] Y. Cho and J. Rice. Estimating velocity fields on a freeway from low resolution video. IEEE Transactions on Intelligent Transportation Systems, 2006.

[15] X. Yu, L. Duan, and Q. Tian. Highway traffic information extraction from skycam MPEG video. In IEEE 5th International Conference on Intelligent Transportation Systems, pages 37-42, September 2002.

[16] X. Yu, P. Xue, L. Duan, and Q. Tian. An algorithm to estimate mean vehicle speed from MPEG skycam video. Multimedia Tools and Applications, 34:85-105, 2007.

[17] Z. Sun, G. Bebis, and R. Miller. On-road vehicle detection: A review. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2006.

[18] Z. Chen and G. Wang. Vehicle flow detection statistic algorithm based on optical flow. In 2005 IEEE International Symposium on Signal Processing and Information Technology, 2005.

