Cody Lawyer Ee 7150 Final Paper
Cody Lawyer
December 10, 2013
Contents
Abstract
Introduction
Technical Discussion
Results
Conclusion
References
Appendix
Abstract
Tracking players in sporting events with video cameras makes it possible to measure
attributes that would otherwise be hard to measure, such as fitness and positioning. It is
also a difficult problem: in some sports there are many players on the field, the players
occlude each other, and the camera may be moving. A variety of methods have been proposed
and implemented to track players. The mean shift algorithm and the continuously adaptive
mean shift algorithm both use the color histogram of the target to do the tracking. The
particle filter method uses randomly placed particles and state estimation for tracking. This
project implements and compares the mean shift algorithm, the continuously adaptive mean
shift algorithm, and the particle filter method using MATLAB and a soccer video dataset.
Introduction
Tracking players using video is a commonly studied problem. It is difficult because,
among other things, the camera can be non-stationary, many players can be on the field, and
the players can occlude each other.
In [1], the mean shift algorithm for tracking objects is described. In [2], a modified mean shift
algorithm called the continuously adaptive mean shift algorithm is described. Both methods
use a color histogram to represent and track the target. In [3], a multiple hypothesis
tracker is described which implements camera parameter detection, player detection, and
player tracking using multiple camera angles. In [4], player tracking is implemented using a
Kalman filter. Finally, in [5], player tracking is implemented using a particle filter.
In this project, the mean shift algorithm, the continuously adaptive mean shift algorithm,
and the particle filter algorithm were selected to be implemented and compared. The mean
shift and continuously adaptive mean shift algorithms were chosen because they are similar,
so once one is implemented the other is not difficult to implement. The particle filter method
was chosen because it is an interesting method and is very different from the other two algorithms.
The rest of this paper is organized as follows. First, a technical discussion describes the
details of the three methods. Next, the dataset used for comparison is described, followed
by a description of the MATLAB simulation results. Finally, a conclusion discusses which
of the three methods is best for tracking players. The references used and an appendix
listing the MATLAB code are also included.
Technical Discussion
Mean Shift Algorithm
For the mean shift algorithm [1], the user is first presented with the first frame of the
video to select which player to track. The frame is converted from RGB color space to HSV
color space. This is done because the hue and saturation values in HSV are less susceptible
to changes in brightness. A small window around the point the user selects is used to calculate
the target hue histogram. The histogram is formed by placing each pixel in a
bin based on its hue value. The histogram is then used to calculate the backprojection of the
entire frame. The backprojection is formed by replacing each pixel with the value of the
histogram bin it would be placed in. As shown in Figure 1, this gives greater values to areas
that match the target to be tracked.
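The histogram and backprojection steps can be sketched in a few lines. This is a minimal illustration in plain Python rather than the MATLAB of the appendix; the function names, the 16-bin histogram, the hue range of 0 to 1, and the normalization by the largest bin are all assumptions made for the example, not details taken from the project code.

```python
def hue_histogram(hues, n_bins=16, max_hue=1.0):
    """Count how many hue values fall in each of n_bins equal-width bins."""
    hist = [0] * n_bins
    for h in hues:
        # Clamp so that h == max_hue lands in the last bin.
        b = min(int(h / max_hue * n_bins), n_bins - 1)
        hist[b] += 1
    return hist


def backproject(hue_image, hist, max_hue=1.0):
    """Replace every pixel by the (normalized) count of the bin its hue falls in."""
    n_bins = len(hist)
    peak = max(hist) or 1  # avoid dividing by zero for an empty histogram
    out = []
    for row in hue_image:
        out.append([hist[min(int(h / max_hue * n_bins), n_bins - 1)] / peak
                    for h in row])
    return out


# Toy 3x4 hue image: a "player" with hue near 0.0 on a field with hue 0.33.
image = [
    [0.33, 0.33, 0.33, 0.33],
    [0.33, 0.02, 0.01, 0.33],
    [0.33, 0.33, 0.33, 0.33],
]
target_hist = hue_histogram([0.02, 0.01])  # hues in the user-selected window
bp = backproject(image, target_hist)       # large only at the two player pixels
```

In this toy case the two player pixels receive the largest backprojection value and the field pixels receive zero, which is the behavior the text describes.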
$$M_{10} = \sum_x \sum_y x\,I(x, y), \qquad M_{01} = \sum_x \sum_y y\,I(x, y)$$
Using the moments, the mean x and mean y location of the window can be calculated.
$$x_c = \frac{M_{10}}{M_{00}}, \qquad y_c = \frac{M_{01}}{M_{00}}$$
The window is then shifted to the calculated location. If the difference between the previous
location and the new location is smaller than a user-defined value, the next frame is loaded.
If the difference is larger, the moments and location are recalculated until the window
shifts by less than the user-defined amount. This is done until the target leaves the video frame
or until the end of the video.
The mean shift algorithm requires tuning depending on the application. The tracking window
size needs to be adjusted to be similar in size to the expected target. The location difference
at which iteration stops can also be adjusted to trade speed against the precision of the track.
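The iteration described above can be sketched as follows. This is a hedged Python illustration, not the project's MATLAB implementation: the names `window_moments` and `mean_shift`, the top-left window convention, the stopping threshold `eps`, and the iteration cap `max_iter` are assumptions. The zeroth moment is taken to be the sum of the backprojection values inside the window.

```python
def window_moments(bp, x0, y0, w, h):
    """Return (M00, M10, M01) of backprojection bp inside a clipped window.

    (x0, y0) is the top-left corner of the w-by-h window.
    """
    m00 = m10 = m01 = 0.0
    for y in range(max(y0, 0), min(y0 + h, len(bp))):
        for x in range(max(x0, 0), min(x0 + w, len(bp[0]))):
            v = bp[y][x]
            m00 += v
            m10 += x * v
            m01 += y * v
    return m00, m10, m01


def mean_shift(bp, x0, y0, w, h, eps=1.0, max_iter=20):
    """Shift the window until it moves by less than eps pixels."""
    for _ in range(max_iter):
        m00, m10, m01 = window_moments(bp, x0, y0, w, h)
        if m00 == 0:  # nothing resembling the target inside the window
            break
        xc, yc = m10 / m00, m01 / m00
        # Re-center the window on the centroid (xc, yc).
        nx = int(round(xc - (w - 1) / 2))
        ny = int(round(yc - (h - 1) / 2))
        moved = abs(nx - x0) + abs(ny - y0)
        x0, y0 = nx, ny
        if moved < eps:
            break
    return x0, y0


# Toy backprojection: a 2x2 blob of ones at x = 6..7, y = 5..6.
bp = [[0.0] * 10 for _ in range(10)]
for y in (5, 6):
    for x in (6, 7):
        bp[y][x] = 1.0
result = mean_shift(bp, 3, 3, 4, 4)  # start with the blob only partly covered
```

Starting from a window that only partly covers the blob, the window is pulled toward it over a few iterations and stops once it no longer moves.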
CAMshift Algorithm
The CAMshift algorithm [2] is a modified version of the mean shift algorithm. Instead
of using a fixed window size, it adjusts the size of the window throughout the tracking to
better track through occlusions and changes in size.
The CAMshift algorithm is implemented by starting with the standard mean shift algorithm.
The window size is adjusted as a function of M00 after it is calculated. This function
needs to be tuned to the particular application; in this project it is set as follows.
$$s = 2\sqrt{M_{00}}$$
The window width is set to 1.5s and the window height is set to 2s. The values that s is
multiplied by can be tuned to the particular application.
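As a small worked example, the window update can be written as a single function. This Python sketch assumes the scale is twice the square root of M00 and uses the 1.5 and 2 multipliers quoted above as defaults; the function name and the keyword parameters are hypothetical.

```python
import math


def camshift_window_size(m00, width_factor=1.5, height_factor=2.0):
    """Return (width, height) of the tracking window from the zeroth moment.

    Assumes the scale s = 2 * sqrt(M00); both factors are tunable.
    """
    s = 2.0 * math.sqrt(m00)
    return width_factor * s, height_factor * s


width, height = camshift_window_size(4.0)  # s = 4, so the window is 6 by 8
```

Because M00 grows with the area of the matching region, taking its square root makes the window side scale with the target's linear size.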
Particle Filter
The particle filter can be used to track objects. It is implemented as described in [5] and [6].
The tracked object is represented by a collection of randomly generated
particles. Each particle has a state and a weight. The state is represented by
$$\mathbf{x}_t = [\, x_t \;\; y_t \;\; r_t \;\; g_t \;\; \dot{x}_t \;\; \dot{y}_t \,]$$
where $x_t$ and $y_t$ are the location of the particle in the x and y dimensions, $r_t$ and $g_t$ are the red
and green chromaticity values of the particle, and $\dot{x}_t$ and $\dot{y}_t$ are the velocity of the particle in
the x and y dimensions. The chromaticity values are between 0 and 1.
The particles and their weights are updated each frame using observed data. For this
application, the observed data is the set of player regions extracted from the frame, each of
which is represented by
C = (e, p, c)
where e is the list of edge points for the region, p is the center of mass of the region, and c
is the average color of the region.
Player region detection is implemented via various common digital image processing tasks.
Using the first frame of the video, a playing field mask is calculated. This is used to filter
out sideline clutter. First, the hue histogram of the entire image is calculated. Since the
image is mostly grass, the histogram will have a peak at green hues. The backprojection
of the entire frame is calculated using the histogram and is thresholded. This is shown in
Figure 2 and represents the grass areas in the image.
markings.
previous frame's particles and noise and the weights are updated using the player region data.
The particles are updated using the previous frame's particles and noise. The x and y
positions of the particles are updated using the following equations.
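$$x_t = x_{t-1} + \dot{x}_{t-1} + v_x, \qquad v_x \sim N(0, \sigma_x)$$
$$y_t = y_{t-1} + \dot{y}_{t-1} + v_y, \qquad v_y \sim N(0, \sigma_y)$$
The red and green chromaticity values are updated using the following equations.
$$r_t = r_{t-1} + v_r, \qquad v_r \sim N(0, \sigma_r)$$
$$g_t = g_{t-1} + v_g, \qquad v_g \sim N(0, \sigma_g)$$
Finally, the x and y velocity values are updated using the following equations.
$$\dot{x}_t = \alpha\,\dot{x}_{t-1} + (1 - \alpha)(x_t - x_{t-1}), \qquad \alpha \in [0, 1]$$
$$\dot{y}_t = \alpha\,\dot{y}_{t-1} + (1 - \alpha)(y_t - y_{t-1}), \qquad \alpha \in [0, 1]$$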
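The update of a single particle can be sketched as below. This is a hedged Python illustration rather than the project's MATLAB code: the dictionary layout, the name `propagate`, the name `alpha` for the velocity smoothing factor, the treatment of the noise parameters as standard deviations, and the clamping of the chromaticities to the range 0 to 1 are all assumptions.

```python
import random


def propagate(p, sigma, alpha, rng=random):
    """Advance one particle by one frame.

    p     : dict with keys x, y, r, g, vx, vy (vx, vy are the velocities)
    sigma : dict mapping 'x', 'y', 'r', 'g' to noise standard deviations
    alpha : velocity smoothing factor in [0, 1]
    """
    # Position: previous position plus previous velocity plus Gaussian noise.
    x = p["x"] + p["vx"] + rng.gauss(0.0, sigma["x"])
    y = p["y"] + p["vy"] + rng.gauss(0.0, sigma["y"])
    # Chromaticity: random walk, clamped to [0, 1] (clamping is an assumption).
    r = min(max(p["r"] + rng.gauss(0.0, sigma["r"]), 0.0), 1.0)
    g = min(max(p["g"] + rng.gauss(0.0, sigma["g"]), 0.0), 1.0)
    # Velocity: blend of the old velocity and the displacement just taken.
    vx = alpha * p["vx"] + (1.0 - alpha) * (x - p["x"])
    vy = alpha * p["vy"] + (1.0 - alpha) * (y - p["y"])
    return {"x": x, "y": y, "r": r, "g": g, "vx": vx, "vy": vy}


particle = {"x": 10.0, "y": 20.0, "r": 0.5, "g": 0.4, "vx": 2.0, "vy": -1.0}
noise = {"x": 3.0, "y": 3.0, "r": 0.02, "g": 0.02}
moved = propagate(particle, noise, alpha=0.5, rng=random.Random(1))
```

With all noise terms set to zero the particle simply moves by its velocity, which is a convenient sanity check before tuning the noise values.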
After the particles have been updated for the current frame, the weight of each particle is
calculated using the player region data. The error vector for each
particle is calculated as follows. $e_x = 0$ if the particle is inside the region; otherwise it is set
to the distance to the nearest edge. $e_y$ is set in the same way. $e_r = C_r - r_t$, where $C_r$
is the average red chromaticity of the region and $r_t$ is the red chromaticity value of the particle
for the current frame. Similarly, $e_g = C_g - g_t$, where $C_g$ is the average green
chromaticity of the region and $g_t$ is the green chromaticity value of the particle for the current
frame. $e_{\dot{x}}$ and $e_{\dot{y}}$ are set to zero because the velocities are not updated using any player region data,
just the movement of the particles. The error vector is defined as $e_d = [\, e_x \;\; e_y \;\; e_r \;\; e_g \;\; e_{\dot{x}} \;\; e_{\dot{y}} \,]^T$.
The weight of each particle is calculated using
$$w = \exp\left(-\frac{e_d^T R^{-1} e_d}{2}\right)$$
where R is the covariance matrix, which can be used to weight the errors by different amounts
if necessary. For example, if the color data in an application is noisy, the color errors can
be weighted less.
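The weight calculation can be illustrated with a short Python sketch. It assumes a diagonal covariance matrix, so that the quadratic form reduces to a sum of squared errors divided by variances; the function name and the example numbers are hypothetical.

```python
import math


def particle_weight(errors, variances):
    """Weight exp(-0.5 * e^T R^-1 e), assuming R is diagonal.

    errors    : the six entries of the error vector
    variances : the six diagonal entries of R
    """
    quad = sum(e * e / v for e, v in zip(errors, variances))
    return math.exp(-0.5 * quad)


# A particle 2 pixels outside a region with a small red chromaticity error.
errors = [2.0, 0.0, 0.1, 0.0, 0.0, 0.0]
trusting_color = particle_weight(errors, [4.0, 4.0, 0.01, 0.01, 1.0, 1.0])
noisy_color = particle_weight(errors, [4.0, 4.0, 1.0, 1.0, 1.0, 1.0])
```

Raising the color variances makes the same color error cost less, so `noisy_color` comes out larger than `trusting_color`.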
A problem with the particle filter is that after a few iterations most of the particles will
have very low weights. This problem can be mostly prevented by resampling the particles
after the weights are calculated each frame. This is done by replacing the particles that have small
weights with copies of particles that have large weights. First, a CDF $C(i)$ is calculated using
the weights of each particle. A random starting point $u_1$ is drawn from $U[0, N_s^{-1}]$, where $N_s$
is the number of particles. Then, for each $j = 1, \dots, N_s$, the point $u_j = u_1 + N_s^{-1}(j - 1)$ is
moved along the CDF: while $u_j > C(i)$, the index $i$ is advanced until a particle with a large
enough weight is found. The new particle
is assigned $X_k^{j*} = X_k^i$ and the new weight is assigned $w_k^j = N_s^{-1}$.
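The resampling step can be sketched as follows. This is a minimal Python illustration of the procedure just described, not the attached resampleSIR.m; the function name, the list-based particle container, and the guards against floating-point round-off are assumptions.

```python
import random


def resample(particles, weights, rng=random):
    """Systematic resampling: return (new_particles, uniform_weights)."""
    n = len(particles)
    total = sum(weights)
    # Build the CDF of the normalized weights.
    cdf, acc = [], 0.0
    for w in weights:
        acc += w / total
        cdf.append(acc)
    cdf[-1] = 1.0  # guard against round-off in the running sum
    # One random starting point, then n evenly spaced points along the CDF.
    u1 = rng.uniform(0.0, 1.0 / n)
    new_particles = []
    i = 0
    for j in range(n):
        uj = u1 + j / n
        while i < n - 1 and uj > cdf[i]:
            i += 1
        # Mutable particles (for example dicts) should be copied here.
        new_particles.append(particles[i])
    return new_particles, [1.0 / n] * n


new_p, new_w = resample(["a", "b", "c", "d"], [0.0, 0.0, 1.0, 0.0],
                        rng=random.Random(0))
```

When one particle holds all of the weight, every resampled particle is a copy of it. That is the intended behavior, and it also shows why the particle set can collapse when the noise added afterward is small.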
Finally, the tracked location of the object is the weighted mean of the particle locations.
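For completeness, the weighted mean can be written out as a short Python sketch. The dictionary keys and the function name are assumptions carried over for illustration only.

```python
def weighted_mean_location(particles, weights):
    """Return the weighted mean (x, y) of the particle locations."""
    total = sum(weights)
    x = sum(w * p["x"] for p, w in zip(particles, weights)) / total
    y = sum(w * p["y"] for p, w in zip(particles, weights)) / total
    return x, y


location = weighted_mean_location(
    [{"x": 0.0, "y": 0.0}, {"x": 4.0, "y": 8.0}],
    [1.0, 3.0],
)  # the heavier particle pulls the estimate toward (4, 8)
```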
The particle filter requires a large amount of tuning for a particular application. The number
of particles $N_s$, the particle noises $\sigma_x$, $\sigma_y$, $\sigma_r$, $\sigma_g$, the velocity smoothing factor $\alpha$, and the
player region detection method all need to be tuned. Also, some applications with low process
noise are not suited for particle filters, as described below in the results.
Results
Dataset
The dataset [7] consists of six videos of the same clip of a soccer match. Each video was
recorded by a stationary camera from a different viewpoint. The resolution of each camera was 1920 by
1080 pixels. The dataset provides ground truth tracking data for each of the players in the
video. An example frame from one of the cameras is shown in Figure 6.
algorithm was $\mathrm{RMSE}_{CS} = 209.0335$. Figure 7 shows a plot of the tracking data and the
truth data.
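A root mean square error between a track and the ground truth can be computed as in the Python sketch below. It assumes the per-frame error is the Euclidean pixel distance between the tracked and true positions; that definition, the function name, and the toy numbers are assumptions for illustration and are not the data behind the reported values.

```python
import math


def rmse(track, truth):
    """Root mean square of the per-frame Euclidean distance, in pixels."""
    sq = [(tx - gx) ** 2 + (ty - gy) ** 2
          for (tx, ty), (gx, gy) in zip(track, truth)]
    return math.sqrt(sum(sq) / len(sq))


# Two frames: a perfect match, then an error of 5 pixels (a 3-4-5 triangle).
example = rmse([(0.0, 0.0), (3.0, 4.0)], [(0.0, 0.0), (0.0, 0.0)])
```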
Conclusion
This project compared video tracking algorithms for soccer players. The mean shift
algorithm, the CAMshift algorithm, and the particle filter method were all detailed and
implemented in MATLAB. They were compared using a soccer video tracking dataset. The
mean shift algorithm and the CAMshift algorithm both tracked the player reasonably well.
The CAMshift algorithm performed better when comparing root mean square
errors. This is expected because the CAMshift algorithm is a modified version of the mean
shift algorithm. The particle filter method did not work for this dataset. The collection of
particles collapsed to a single location after a few iterations. This is due to low process noise,
possibly caused by the stationary camera of this dataset.
References
1. L. W. Kheng, "Mean Shift Tracking," Computer Vision and Pattern Recognition CS4243,
National University of Singapore.
2. G. R. Bradski, "Computer Vision Face Tracking For Use in a Perceptual User Interface,"
Intel Technology Journal, Q2 1998.
3. M. Beetz, S. Gedikli, J. Bandouch, B. Kirchlechner, N. von Hoyningen-Huene, and A.
C. Perzylo, "Visually Tracking Football Games Based on TV Broadcasts," in IJCAI, pp.
2066-2071, January 2007.
4. M. Xu, J. Orwell, L. Lowey, and D. Thirde, "Architecture and algorithms for tracking
football players with multiple cameras," in IEEE Proceedings of Vision, Image and Signal
Processing, vol. 152, no. 2, pp. 232-241, 8 April 2005.
5. A. Dearden, Y. Demiris, and O. Grau, "Tracking Football Player Movement From a
Single Moving Camera Using Particle Filters," in Proceedings of CVMP-2006, pp. 29-37,
IET Press, 2006.
6. M. S. Arulampalam, S. Maskell, N. Gordon, and T. Clapp, "A tutorial on particle
filters for online nonlinear/non-Gaussian Bayesian tracking," in IEEE Transactions on Signal
Processing, 50(2), 2002.
7. T. D'Orazio, M. Leo, N. Mosca, P. Spagnolo, and P. L. Mazzeo, "A Semi-Automatic System
for Ground Truth Generation of Soccer Video Sequences," in Proceedings of the 6th
IEEE International Conference on Advanced Video and Signal Surveillance, Genoa, Italy,
September 2-4, 2009.
Appendix
The following m-files are attached:
calcHueBackprojection.m
CAMShiftTracking.m
meanShiftTracking.m
ParticleFilterTracking.m
resampleSIR.m
SIRParticleFilter.m