 
Multiple Moving Target Detection, Tracking, and Recognition from a Moving Observer

Fenghui Yao, Ali Sekmen, and Mohan J. Malkani
Department of Computer Science, Department of Electrical and Computer Engineering
Tennessee State University, 3500 John A Merritt Blvd, Nashville, TN 37215, USA
{fyao, asekmen, mmalkani}@tnstate.edu
Abstract - This paper describes an algorithm for multiple moving target detection, tracking, and recognition from a moving observer. When the camera is placed on a moving observer, the whole background of the scene appears to be moving, and the actual motion of the targets must be distinguished from the background motion. To do this, an affine motion model between consecutive frames is estimated, and then moving targets can be extracted. Next, the target tracking employs a similarity measure which is based on the joint feature-spatial space. Finally, the target recognition is performed by matching moving targets against a target database. The average processing time is 680 ms per frame, which corresponds to a processing rate of 1.5 frames per second. The algorithm was tested on the Vivid datasets provided by the Air Force Research Laboratory, and experimental results show that this method is efficient and fast enough for real-time application.
I. INTRODUCTION
Detection and tracking of moving objects in an image sequence is one of the basic tasks in computer vision. The detected moving object trajectory can either be of interest in its own right or be used as the input for a higher-level analysis such as motion pattern understanding, moving behavior recognition, and so on. Applications include surveillance, homeland security, protection of vital infrastructure, and advanced human-machine communication. Therefore, moving object detection and tracking has received more and more attention, and many algorithms have been proposed. Among these, one interesting approach is the particle filter [1], which has been used and extended many times [2] [3] [4]. The particle filter was developed to track objects in clutter, where the posterior density and observation density are often non-Gaussian. The key idea of particle filtering is to approximate the probability distribution by a weighted sample set. Each sample consists of an element which represents the hypothetical state of an object and a corresponding probability. The state of an object may be the control points of a contour [1], the position, shape, and motion of an elliptical region [2], or specific model parameters [3]. That is, these methods [2] [3] are model-based. Ross's approach [4] is a model-free, statistical detection method which uses both edge and color information. The common assumption of these methods [1] [2] [3] is that the background does not move and the image sequences come from a stationary camera. Tian et al. [5] developed a real-time algorithm to detect salient motion in complex environments by combining temporal difference imaging and temporally filtered optical flow. The image sequence used in this method is also from a stationary camera.

The works of Smith and Brady [7] and Kang et al. [6] employed image sequences from moving platforms. Kang et al. developed an approach for tracking moving objects observed by both stationary and Pan-Tilt-Zoom cameras. Smith and Brady's approach employed the image sequence from a camera mounted on a vehicle to detect other moving vehicles. This method used special-purpose hardware to implement real-time target detection and tracking. The COMETS system detects targets from a moving observer (an autonomous helicopter) but does not perform tracking [8]. Yang et al.'s tracker works for image sequences from both stationary and moving platforms, but it detects and tracks a single target [9]. Literature [10] proposes detection-based multiple object tracking, literature [11] shows a multiple object tracking method based on a multiple-hypothesis graph representation, and literature [12] demonstrates a distributed Bayesian multiple target tracker. However, they all employ image sequences from stationary observers.

As shown above, few works discuss multiple moving target detection and tracking from a moving observer, and also few works deal with target recognition at the same time. This paper introduces a method for moving target detection, tracking, and recognition from a moving observer.

II. MOVING TARGET DETECTION FROM A MOVING OBSERVER
The entire configuration is shown in Fig. 1. The output of the moving target detection is sent to the target tracking, and the tracked targets are sent to target recognition. This section describes moving target detection; target tracking and target recognition are discussed in Sections 3 and 4, respectively.

A moving observer usually means a camera mounted on a ground vehicle or on an airborne platform such as a helicopter or an unmanned aerial vehicle (UAV). In this work, the video sequences are generated by an airborne camera. In airborne video, everything (target and background) appears to be moving over time due to the camera motion. Before employing frame differencing (a simple motion detection method for stationary platforms) to detect motion images, it is necessary to conduct motion compensation first.
Two-frame background motion estimation is achieved by fitting a global parametric motion model (affine or projective) to sparse optic flow. Here, we use the affine transformation model.
 A. Optic Flow Detection
Sparse optic flow is obtained by applying the Lucas-Kanade algorithm [13]. The number of optic flow vectors is controlled in the range of 200 to 1000. Other methods, such as matching Harris corners, Moravec features, or SUSAN corners between frames, or matching SIFT features, are all applicable here. The main factors to be considered are computation cost and robustness. Experimental results show that the Lucas-Kanade method is the most reliable and reasonably fast.
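As a concrete illustration, the following sketch pairs Shi-Tomasi corner detection with OpenCV's pyramidal Lucas-Kanade tracker; the quality level, minimum distance, and window size are assumed values, not parameters reported in the paper.

```python
import cv2
import numpy as np

def sparse_optic_flow(prev_gray, curr_gray, max_corners=1000):
    """Track corner features from the previous frame into the current
    one with pyramidal Lucas-Kanade; returns matched point pairs."""
    # Corner count capped per the paper's 200-1000 range; the quality
    # and distance thresholds below are illustrative assumptions.
    pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=max_corners,
                                  qualityLevel=0.01, minDistance=7)
    nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray, pts, None,
                                              winSize=(21, 21), maxLevel=3)
    ok = status.ravel() == 1
    # (x_i, y_i) in the previous frame matched to (X_i, Y_i) in the current.
    return pts.reshape(-1, 2)[ok], nxt.reshape(-1, 2)[ok]
```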
 B. Affine Parameter Estimation
The 2-D affine transformation is described as follows:

$$\begin{pmatrix} X_i \\ Y_i \end{pmatrix} = \begin{pmatrix} a_1 & a_2 \\ a_3 & a_4 \end{pmatrix} \begin{pmatrix} x_i \\ y_i \end{pmatrix} + \begin{pmatrix} a_5 \\ a_6 \end{pmatrix}, \qquad (1)$$

where $(x_i, y_i)$ are the locations of feature points in the previous frame, and $(X_i, Y_i)$ are the locations of feature points in the current frame. Theoretically, three pairs of matched feature points are enough to determine the six affine parameters. How these three pairs of feature points are selected affects the precision of the affine parameter estimation. To reduce this estimation error, the parameters can be solved by the least-squares method based on all matched feature points. However, the computation cost of the least-squares method is heavy. To reduce the computation time and estimation error, this work uses an algorithm similar to LMedS (Least Median of Squares) [14]. Details are as follows. (i) Randomly select $N$ pairs of matched feature points from the previous frame and the current frame, and further randomly select $M$ triplets from the $N$ pairs of matched feature points, where $M \ll N$. Each triplet determines an affine transformation (six parameters). Let $\omega_k$ represent the $k$-th affine transform, where $k = 1, 2, \ldots, M$. (ii) For $\omega_k$, all feature points in the previous frame are transformed to the current frame. The affine transform error is defined as

$$\varepsilon_k = \sum_{i=1}^{N} \left\| \omega_k \times P_i - \hat P_i \right\|,$$

where $P_i$ is a feature point in the previous frame and $\hat P_i$ is its matched point in the current frame. The $s$-th affine transform $\omega_s$, corresponding to $\varepsilon_s = \min\{\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_M\}$, is considered as the global parametric motion model. Literature [14] has shown that the above method generates an accurate and reliable model if the moving targets constitute a small area of the frame (i.e., less than 50%). In airborne video, this requirement is easily satisfied.
C. Target Detection
The frame difference is generated according to $F_{diff} = F_i - \omega_s \times F_{i-1}$, where $F_{i-1}$ and $F_i$ are the previous frame and the current frame, respectively. After binarization, a set of morphological operations comprising dilation, white-blob detection, blob filtering, and blob merging is applied to the binary difference image $F_{diff}$. After these processing steps, the remaining blobs are target candidates. For each blob, its contour $C_k$, center $P_c = (x_c, y_c)$, hull $H_k$, affine transformation model $\omega_s$, and minimal circumscribed rectangle $R_k$ are passed to the next stage for tracking and recognition, where $k = 1, 2, \ldots, K$, and $K$ is the number of target candidates (see Fig. 4 for target detection results).
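The detection step can be sketched with OpenCV primitives as below, reusing the 3x2 affine model from the estimator above; the binarization threshold, structuring element, and minimum blob area are assumptions, and blob merging is omitted for brevity.

```python
import cv2
import numpy as np

def detect_targets(prev_gray, curr_gray, A, thresh=30, min_area=50):
    """Warp F_{i-1} with the affine model, difference against F_i,
    binarize, dilate, and extract blob candidates."""
    h, w = curr_gray.shape
    # warpAffine expects a 2x3 matrix, hence the transpose of A.
    warped = cv2.warpAffine(prev_gray, A.T.astype(np.float32), (w, h))
    diff = cv2.absdiff(curr_gray, warped)                     # F_diff
    _, binary = cv2.threshold(diff, thresh, 255, cv2.THRESH_BINARY)
    binary = cv2.dilate(binary, np.ones((5, 5), np.uint8))    # dilation
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    candidates = []
    for c in contours:
        if cv2.contourArea(c) < min_area:                     # blob filtering
            continue
        m = cv2.moments(c)
        center = (m["m10"] / m["m00"], m["m01"] / m["m00"])   # P_c
        candidates.append({"contour": c, "center": center,
                           "hull": cv2.convexHull(c),         # H_k
                           "rect": cv2.boundingRect(c)})      # R_k
    return candidates
```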
 
III. MULTIPLE TARGETS TRACKING
The multiple targets tracking algorithm accepts the target candidates from the target detection sub-system and keeps multiple target trajectories in a graph structure, as shown in Fig. 2. The tracker employs a similarity measure that is based on the joint feature-spatial space. It consists of (i) similarity measure generation, and (ii) tracking history management.
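One way the trajectory graph might be kept is sketched below; the node fields mirror the candidate attributes passed in from detection, but the structure itself is an assumption rather than the paper's implementation.

```python
from dataclasses import dataclass, field

@dataclass
class TrackNode:
    """One detected (or predicted) target in one frame."""
    frame: int
    center: tuple                                  # P_c
    hull: object                                   # H_k from detection
    parents: list = field(default_factory=list)    # links to frame t-1
    children: list = field(default_factory=list)   # links to frame t+1

# A trajectory is a path through linked nodes; merge and split events
# correspond to nodes with multiple parents or multiple children.
```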
A. Joint Feature-Spatial Spaces and Similarity Measure
Let $u_i$ be a $d$-dimensional feature vector at image location $x_i$, and let $I = \{y_i = (x_i, u_i)\}$, $i = 1, \ldots, N$, be samples from an image region. The estimate of the probability at $(x, u)$ in the joint space is

$$\hat P(x, u) = \frac{1}{N} \sum_{i=1}^{N} K_\sigma(x - x_i)\, G_h(u - u_i), \qquad (2)$$

where $K_\sigma$ is a 2-dimensional kernel with bandwidth $\sigma$, and $G_h$ is a $d$-dimensional kernel with bandwidth $h$. The bandwidth in the spatial dimensions represents the variability in feature location due to local deformation or measurement uncertainty, while the bandwidth in the feature dimensions represents the variability in the value of the feature.

Given two distributions with samples $I_x = \{(x_i, u_i)\}$, $i = 1, \ldots, N$, and $I_y = \{(y_j, v_j)\}$, $j = 1, \ldots, M$, the similarity measure between $I_x$ and $I_y$ is defined as:
Fig. 1 System configuration: moving target detection → moving target tracking → target recognition, with a target database and recognition updating.
Fig. 2 Graph structure in multiple object tracking (trajectories over frames #i-5 … #i-1, with merge, split, new target, missing detection, false detection, and disappearing cases).
$$J(I_x, I_y) = \frac{1}{M} \sum_{j=1}^{M} \hat P(y_j, v_j) = \frac{1}{MN} \sum_{j=1}^{M} \sum_{i=1}^{N} K_\sigma(y_j - x_i)\, G_h(v_j - u_i). \qquad (3)$$

$J(I_x, I_y)$ is symmetric and bounded by zero and one. This similarity is based on the average separation criterion in cluster analysis [15], except that it replaces the distance with a kernelized one. This similarity measure has been applied to single target tracking [16] [17].
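A direct rendering of Eq. (3), assuming Gaussian kernels for both $K_\sigma$ and $G_h$; the bandwidth values are placeholders, since the paper does not state them here.

```python
import numpy as np

def similarity(x, u, y, v, sigma=5.0, h=10.0):
    """Eq. (3): J(I_x, I_y) for samples I_x = {(x_i, u_i)} and
    I_y = {(y_j, v_j)}; x, y are (N,2)/(M,2) locations, and u, v are
    (N,d)/(M,d) feature vectors."""
    d_xy = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)  # (N, M) spatial
    d_uv = ((u[:, None, :] - v[None, :, :]) ** 2).sum(-1)  # (N, M) feature
    K = np.exp(-d_xy / (2 * sigma ** 2))                   # K_sigma
    G = np.exp(-d_uv / (2 * h ** 2))                       # G_h
    return (K * G).mean()    # 1/(MN) double sum, bounded by 0 and 1
```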
 B. Modified Similarity Measure for Multiple Target Tracking 
In multiple target tracking, the similarity between the target $k$ represented by the hull $H_k$ in the $(t-1)$-th frame and the target $l$ in the $t$-th frame depends not only on the joint feature-spatial space but also on the distance between them. Therefore, the similarity measure in Eq. (3) is modified as follows:

$$J_{kl} = J\left(I_x^{t-1,k}, I_y^{t,l}\right) \times K_\sigma\left(\omega_s^{t-1} P_c^{t-1,k} - P_c^{t,l}\right), \qquad (4)$$

where $I_x^{t-1,k}$ and $P_c^{t-1,k}$ are the distribution of target samples inside the hull $H_k$ and the target center in the $(t-1)$-th frame, $I_y^{t,l}$ and $P_c^{t,l}$ are the distribution of target samples inside $H_l$ and the target center in the $t$-th frame, respectively, and $\omega_s^{t-1}$ is the affine transformation model from the $(t-1)$-th frame to the $t$-th frame.

To verify the robustness of this similarity measure, four targets extracted from aerial images, shown in Fig. 3(a), which are a gray truck (GT), red sedan (RS), blue sedan (BS), and gray sedan (GS) from left to right, are employed for similarity testing. These four targets are rotated in the range of 0° to 180°, with a 5° increment at each rotation. The similarity measures between these generated images and the gray truck in Fig. 3(a) are calculated using 500 random sample points from each image. The similarity measures for GT-RS, GT-BS, GT-GS, and GT-GT are shown in Fig. 3(b). The similarity measure variances for GT-RS, GT-BS, GT-GS, and GT-GT matching are 4.34×10⁻⁶, 1.04×10⁻⁵, 1.29×10⁻⁵, and 5.04×10⁻⁵, respectively. These results show that the similarity in Eq. (4) is robust to rotation and scaling. To reduce the computation time, there is no need to use all points inside the target hull; the sample points can be chosen randomly from the samples inside the target hull.
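Under the same Gaussian-kernel assumption, Eq. (4) reduces to weighting the appearance similarity by the proximity of the affine-predicted center, as in this sketch; sigma_c is an assumed spatial bandwidth.

```python
import numpy as np

def modified_similarity(J_xy, center_prev, center_curr, A, sigma_c=20.0):
    """Eq. (4): weight J(I_x, I_y) by K_sigma of the distance between the
    affine-predicted previous center and the current center; A is the
    3x2 affine model from frame t-1 to frame t."""
    pred = np.append(center_prev, 1.0) @ A        # omega_s * P_c^{t-1,k}
    d2 = ((pred - np.asarray(center_curr)) ** 2).sum()
    return J_xy * np.exp(-d2 / (2 * sigma_c ** 2))
```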
C. Tracking Graph Management 
The multiple targets tracker needs to handle all of the problems listed in Fig. 2. The algorithms to deal with these problems are as follows.
1) Missing detection prediction: Targets that are under tracking up to the frame right before the current frame may be missed at the current frame because of a failure of the detector. Missing detections at the $i$-th frame are estimated from the detection results obtained in the image frames prior to the current frame by applying estimators. According to the position and velocity of the target in previous image frames, its new position and velocity in the new frame can be estimated by a Kalman filter, recursive Bayesian estimator, or particle filter. In this work, the Kalman filter is employed. From the previous state $(x_c^{i-1}, y_c^{i-1}, \hat v_\Delta^{i-1}, \hat\theta_\Delta^{i-1})$, the next state $(x_c^{i,k}, y_c^{i,k}, \hat v^{i,k}, \hat\theta^{i,k})$ is estimated, where $(x_c^{i-1}, y_c^{i-1})$ is the center of the target $k$ (which is missing at the $i$-th frame) at the $(i-1)$-th frame, and $(\hat v_\Delta^{i-1}, \hat\theta_\Delta^{i-1})$ are the average velocity and direction over the past $\Delta$ frames. Fig. 4(c) shows a missing detection at frame 21, which will be estimated (a Kalman predictor sketch follows this list).
2) New target detection: New targets usually appear in the four surrounding border regions, not in the interior area. If a target is detected and tracked over $\Delta_2$ frames, it is considered a new target. Currently, the four surrounding borders with a width of 20 pixels are cleared to zero, to remove the pixels that are not involved in generating the frame difference. Toward the inside, the four surrounding borders with a width of 40 pixels form the area where a new target may emerge. Fig. 4(a) shows 3 newly detected targets at frame 6.
3) False detection filtering: Targets that emerge in the interior area of the image and are not linked to targets in the previous frame or the next frame are false detections; they are filtered out. Fig. 4(b) shows a false detection at frame 9, which will be filtered out.
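For item 1), a constant-velocity Kalman predictor over the target center is one way to realize the estimator; the sketch below uses OpenCV's KalmanFilter with assumed noise covariances, and omits the averaged velocity and direction terms carried in the paper's state.

```python
import cv2
import numpy as np

def make_center_kf(dt=1.0):
    """Constant-velocity Kalman filter over (x_c, y_c, vx, vy)."""
    kf = cv2.KalmanFilter(4, 2)                   # 4 state dims, 2 measured
    kf.transitionMatrix = np.array([[1, 0, dt, 0],
                                    [0, 1, 0, dt],
                                    [0, 0, 1,  0],
                                    [0, 0, 0,  1]], np.float32)
    kf.measurementMatrix = np.eye(2, 4, dtype=np.float32)
    kf.processNoiseCov = 1e-2 * np.eye(4, dtype=np.float32)      # assumed
    kf.measurementNoiseCov = 1e-1 * np.eye(2, dtype=np.float32)  # assumed
    return kf

# Per tracked target: call kf.correct() with each detected center; on a
# missing detection, take kf.predict()[:2] as the estimated center.
```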
Fig. 3 Robustness of the similarity measure. (a) Four targets extracted from an aerial image (from left to right: gray truck, red sedan, blue sedan, and white sedan). (b) Similarity measures between the targets in (a), computed with 500 samples and plotted against the rotation angle theta, for GT-RS, GT-BS, GT-GS, and GT-GT.