
Proceedings of the Fifth International Conference on Machine Learning and Cybernetics, Dalian, 13-16 August 2006

MEAN SHIFT ALGORITHM AND ITS APPLICATION IN TRACKING OF OBJECTS

ZHI-QIANG WEN, ZI-XING CAI

College of Information Science and Engineering, Central South University, HuNan, Changsha, 410083 China
E-MAIL: zhqwen20001@163.com, zxcai@csu.edu.cn

Abstract:
The mean shift algorithm has recently been widely used in tracking, clustering, and related tasks; however, its convergence has not been rigorously proved. In this paper the mean shift algorithm with a Gaussian profile is studied and applied to the tracking of objects. First, the imprecise proofs of the convergence of mean shift are pointed out. Then a convergence theorem and its rigorous proof are provided. Lastly, the object tracking approach based on mean shift is modified. Experimental results show that the modified approach tracks objects well under occlusion. The contributions of this paper are expected to support further study and application of the mean shift algorithm.

Keywords:
Mean shift algorithm; Convergence; Kernel function; Tracking of object; Bhattacharyya coefficient

1. Introduction

Mean shift, which was proposed in 1975 by Fukunaga and Hostetler [1], is a nonparametric, iterative procedure that shifts each data point to a local maximum of the density function. In spite of its good properties, it was largely ignored until Cheng's paper [2] renewed interest in it. Cheng [2] revisited mean shift, developing a more general formulation and demonstrating its potential uses in clustering and global optimization. Since then, mean shift has been widely used in object tracking [3-7], image segmentation [8,9], pattern recognition and clustering [10,11], filtering [12], information fusion [13], etc.

Cheng [2] discussed the mean shift algorithm in three ways and chiefly studied the blurring process. Let the data be a finite set S embedded in the d-dimensional Euclidean space X, and let T ⊂ X be a finite set produced in the iterative procedure. When T = S, the mean shift procedure is a blurring process, namely the input data is recursively modified after each mean shift step. Cheng [2] proved the convergence of the blurring process. However, when S is fixed throughout the process and T is initialized to S, it is no longer a blurring process and Cheng's theories do not apply. Comaniciu [14] derived the mean shift procedure from density estimation and gave a proof that it converges to the nearest stationary point of the underlying density function. Comaniciu's work in [14] thus addresses exactly the case that Cheng did not deal with. Li [15] found a mistake in Comaniciu's convergence proof in [14] and gave some counterexamples. Comaniciu also made the same mistake in [4,16]. Li proved the convergence of mean shift in a new way. In fact the proofs in [4,14,15,16] are imprecise, and the convergence of mean shift needs further study.

Aiming at the mean shift algorithm with a Gaussian profile, this paper studies the above problem. The main contributions of this paper are as follows: the imprecise proofs of the convergence of mean shift in [4,14,15,16] are pointed out and a rigorous convergence proof is provided. Moreover, the object tracking approach based on mean shift is modified. The paper is organized as follows: the mean shift algorithm is introduced in section 2. Section 3 provides the proof of the convergence of mean shift. The modified object tracking approach and its experiment are presented in section 4. Section 5 is the conclusion.

2. Mean shift

Given n data points $x_i$, $i = 1, \ldots, n$, in the d-dimensional space $R^d$, the iterative formula of mean shift is as follows.

$$ y_{t+1} = \frac{\sum_{i=1}^{n} x_i \, g\!\left( \left\| \frac{y_t - x_i}{h} \right\|^2 \right)}{\sum_{i=1}^{n} g\!\left( \left\| \frac{y_t - x_i}{h} \right\|^2 \right)} \qquad (1) $$

From (1) we obtain

$$ y_{t+1} = y_t + \lambda_t \cdot d_t \qquad (2) $$

where

$$ \lambda_t = \frac{h^2 c}{2 \hat{f}_{h,g}(y_t)} > 0, \qquad d_t = \nabla \hat{f}_{h,k}(y_t). $$
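
As an illustration of iteration (1), the following Python sketch runs the mean shift update on a small two-dimensional point set and records the (unnormalized) density value at each iterate. It is not taken from the paper: the function names, the sample points, the tolerance, and the choice $g(s) = \exp(-s/2)$ (a Gaussian profile with constant factors dropped, since they cancel in the ratio of (1)) are assumptions made only for this example.

```python
import math

def gaussian_g(s):
    """Assumed Gaussian profile weight g(s) = exp(-s/2), s = ||(y - x_i)/h||^2."""
    return math.exp(-0.5 * s)

def scaled_sq_dist(y, x, h):
    """Return ||(y - x)/h||^2 for two points given as sequences of floats."""
    return sum((yk - xk) ** 2 for yk, xk in zip(y, x)) / (h * h)

def mean_shift_step(y, points, h):
    """One update of Eq. (1): the g-weighted mean of the data points."""
    num = [0.0] * len(y)
    den = 0.0
    for x in points:
        w = gaussian_g(scaled_sq_dist(y, x, h))
        den += w
        for k, xk in enumerate(x):
            num[k] += w * xk
    return [v / den for v in num]

def density(y, points, h):
    """Kernel density estimate at y, up to the constant c_{k,d} / (n h^d)."""
    return sum(gaussian_g(scaled_sq_dist(y, x, h)) for x in points)

def mean_shift(y0, points, h, tol=1e-8, max_iter=500):
    """Iterate Eq. (1) from y0 until the shift length drops below tol."""
    y = list(y0)
    values = [density(y, points, h)]
    for _ in range(max_iter):
        y_next = mean_shift_step(y, points, h)
        shift = math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_next)))
        y = y_next
        values.append(density(y, points, h))
        if shift < tol:
            break
    return y, values

# Hypothetical data: one cluster near the origin and one near (3, 3).
points = [(0.0, 0.0), (0.2, -0.1), (-0.1, 0.1), (3.0, 3.0), (3.1, 2.9)]
mode, values = mean_shift((0.5, 0.5), points, h=0.5)
```

Starting from (0.5, 0.5), the iterate is drawn toward the cluster near the origin, and the recorded density values can be inspected to see whether they increase from one step to the next, which is the behaviour analysed in section 3.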

1-4244-0060-0/06/$20.00 ©2006 IEEE



Here

$$ \hat{f}_{h,k}(x) = \frac{c_{k,d}}{n h^d} \sum_{i=1}^{n} k\!\left( \left\| \frac{x - x_i}{h} \right\|^2 \right) $$

is the multivariate kernel density estimator with profile $k$, which is the Gaussian profile, and

$$ \hat{f}_{h,g}(x) = \frac{c_{g,d}}{n h^d} \sum_{i=1}^{n} g\!\left( \left\| \frac{x - x_i}{h} \right\|^2 \right), $$

where $c_{g,d}$ and $c_{k,d}$ are the corresponding normalization constants (as in [14], $g(x) = -k'(x)$). We define the mean shift vector

$$ \nabla m_{h,g}(x) = \frac{\sum_{i=1}^{n} x_i \, g\!\left( \left\| \frac{x - x_i}{h} \right\|^2 \right)}{\sum_{i=1}^{n} g\!\left( \left\| \frac{x - x_i}{h} \right\|^2 \right)} - x \qquad (3) $$

From (2) we obtain

$$ y_{t+1} = y_t + \nabla m_{h,g}(y_t) \qquad (4) $$

where

$$ \nabla m_{h,g}(x) = \frac{1}{2} h^2 c \, \frac{\nabla \hat{f}_{h,k}(x)}{\hat{f}_{h,g}(x)}. $$

Equation (2) shows that mean shift moves along the gradient direction and shifts the original points toward a local maximum of the density function. The step size $\lambda_t$ changes throughout the iterative process.

3. Convergence of mean shift

The literature [4,14,16] provides convergence theories. Unlike [4,14], however, [16] studies variable-bandwidth mean shift. Li [15] found that the convergence proofs for mean shift given in [4,14,16] are wrong. Where [14] proves the convergence of the sequence $\{y_t, t = 1, 2, \ldots\}$, Li [15] argued that it is wrong to deduce "$\{y_t, t = 1, 2, \ldots\}$ converges" from "$\|y_{t+1} - y_t\|$ converges to zero" and gave some counterexamples. Li [15] introduced a weighting parameter $c_{k,i,h}$ into Comaniciu's method and proved the convergence of mean shift in a new way. Li's work [15] identified the above mistakes and attempted to correct them, but the correction is not thorough. Li [15] and Comaniciu [4,14,16] use the property of convex functions to prove that the sequence $\hat{f}_{h,k}(y_t)$, $t = 1, 2, \ldots$, converges and is monotonically increasing, but the function $k(x)$ is rewritten as $k(\|y_t - x_i\|^2)$, which is not always a convex function; it may be concave, or neither convex nor concave. Even when $k(x)$ is a convex function, the sequence $\hat{f}_{h,k}(y_t)$ does not always converge to a local maximum point. For the definitions and properties of convex sets, convex functions, and concave functions, see [17] or other literature. Theorem 1 and its proof, which address the imprecise proofs in [14,15], are given as follows.

Theorem 1. The proof in [14,15] of the convergence theorem (for example: "If the kernel $K$ has a convex and monotonically decreasing profile, the sequences $\{y_t\}_{t=1,2,\ldots}$ and $\{\hat{f}_{h,K}(t)\}_{t=1,2,\ldots}$ converge, and $\{\hat{f}_{h,K}(t)\}_{t=1,2,\ldots}$ is monotonically increasing" [14]) is imprecise.

Proof. The details of the original proof can be found in [14,15]. Suppose the proof in [14,15] is accurate, so that the sequence $\hat{f}_{h,k}(y_t)$ converges and increases monotonically. From [14,15] we know that $y_t$ converges to a point, say $\bar{y}$, and then $\nabla \hat{f}_{h,k}(\bar{y}) = 0$ holds.

The literature [14] proves its theorem from the property of convex functions, $k(x_2) \ge k(x_1) + k'(x_1)(x_2 - x_1)$, as follows (the expression in [14] is used):

$$ \hat{f}_{h,K}(j+1) - \hat{f}_{h,K}(j) \ge \frac{c_{k,d}}{n h^{d+2}} \sum_{i=1}^{n} g\!\left( \left\| \frac{y_j - x_i}{h} \right\|^2 \right) \left[ \| y_j - x_i \|^2 - \| y_{j+1} - x_i \|^2 \right] $$

Here the authors treat $k\!\left( \left\| \frac{y - x_i}{h} \right\|^2 \right)$ as a convex function (the proof in [15] is similar to that in [14]). According to the property of convex functions, $\hat{f}_{h,k}(y_t)$ is convex and the Hessian matrix $\nabla^2 \hat{f}_{h,k}(y_t)$ is positive semidefinite.

According to the second-order Taylor expansion around $\bar{y}$, we have:

$$ \hat{f}_{h,k}(y) = \hat{f}_{h,k}(\bar{y}) + \nabla \hat{f}_{h,k}(\bar{y})^T (y - \bar{y}) + \frac{1}{2} (y - \bar{y})^T \nabla^2 \hat{f}_{h,k}(\hat{y}) (y - \bar{y}) $$

where $\hat{y} = \bar{y} + \eta (y - \bar{y})$, $0 < \eta < 1$. Replacing $y$ by $y_t$ and using $\nabla \hat{f}_{h,k}(\bar{y}) = 0$, we obtain

$$ \hat{f}_{h,k}(y_t) - \hat{f}_{h,k}(\bar{y}) = \frac{1}{2} (y_t - \bar{y})^T \nabla^2 \hat{f}_{h,k}(\hat{y}) (y_t - \bar{y}) $$

From $\hat{f}_{h,k}(\bar{y}) > \hat{f}_{h,k}(y_t)$ we have

$$ (y_t - \bar{y})^T \nabla^2 \hat{f}_{h,k}(\hat{y}) (y_t - \bar{y}) < 0. $$

Assuming that $\hat{f}_{h,k}$ has continuous second-order partial derivatives, there exists a neighborhood $\Omega$ of $\hat{y}$ such that $\nabla^2 \hat{f}_{h,k}(y)$ is a negative definite matrix for $y \in \Omega$. When $y_t \to \bar{y}$, we have $\hat{y} \to \bar{y}$ and $\hat{y} \to y_t$. So there exists at least one $y_\tau$, $\tau \in (1, 2, \ldots)$, with $y_\tau \in \Omega$, which implies that $\nabla^2 \hat{f}_{h,k}(y_\tau)$ is a negative definite matrix. This contradicts the property that $\nabla^2 \hat{f}_{h,k}(y_t)$, $t \in (1, 2, \ldots)$, is a positive semidefinite matrix. So the proofs in [14,15] are imprecise. □

The proof of Theorem 1 suggests that if the sequence $\hat{f}_{h,k}(y_t)$, $t = 1, 2, \ldots$, converges and increases monotonically, the Hessian matrix $\nabla^2 \hat{f}_{h,k}(y_t)$ is a negative definite
matrix, which leads to Theorem 2. To prove Theorem 2, a lemma is given first.

Lemma. If $k(x)$ is the Gaussian function, the following expression is true.

$$ \nabla \hat{f}_{h,k}(y_t)^T \nabla \hat{f}_{h,k}(y_{t+1}) > 0, \qquad t = 0, 1, 2, \ldots \qquad (5) $$

Proof. From [14], we know that if $k(x)$ is the Gaussian function, the following expression is true:

$$ \frac{\nabla m_{h,g}(y_t)^T \nabla m_{h,g}(y_{t+1})}{\| \nabla m_{h,g}(y_t) \| \cdot \| \nabla m_{h,g}(y_{t+1}) \|} > 0 \qquad (6) $$

From (6), we obtain

$$ \left[ \frac{1}{2} h^2 c \, \frac{\nabla \hat{f}_{h,k}(y_t)}{\hat{f}_{h,g}(y_t)} \right]^T \cdot \frac{1}{2} h^2 c \, \frac{\nabla \hat{f}_{h,k}(y_{t+1})}{\hat{f}_{h,g}(y_{t+1})} > 0 $$

So $\nabla \hat{f}_{h,k}(y_t)^T \cdot \nabla \hat{f}_{h,k}(y_{t+1}) > 0$. □

Theorem 2. (Convergence Theorem) Assume that $S \subseteq R^d$ is a nonempty open convex set and that $\hat{f}_{h,k} : S \to R$ has continuous second-order partial derivatives in $S$. If $y_t \in S$ for all $t$, $k(\cdot)$ is the Gaussian function, and the Hessian matrix $\nabla^2 \hat{f}_{h,k}(y_t)$ of $\hat{f}_{h,k}$ at $y_t$, $t = 1, 2, \ldots$, is negative definite, then the sequence $\hat{f}_{h,k}(y_t)$ converges and increases monotonically; moreover, the sequence $\{y_t\}$ converges.

Proof. According to the properties of the profile, $k(x)$ is bounded, so $\hat{f}_{h,k}$ is also bounded. The convergence of $\hat{f}_{h,k}(y_t)$ therefore follows if $\hat{f}_{h,k}(y_t)$ increases monotonically.

For $t \ge 1$, according to the first-order Taylor expansion around $y_t$, with $y$ replaced by $y_t + \lambda_t d_t \in S$, we have:

$$ \hat{f}_{h,k}(y_t + \lambda_t d_t) = \hat{f}_{h,k}(y_t) + \lambda_t \nabla \hat{f}_{h,k}(y_t + \theta \lambda_t d_t)^T d_t $$

where $0 < \theta < 1$.

Letting

$$ \varphi(\theta) = \hat{f}_{h,k}(y_t + \lambda_t d_t) - \hat{f}_{h,k}(y_t) = \lambda_t \nabla \hat{f}_{h,k}(y_t + \theta \lambda_t d_t)^T d_t, $$

the following expressions are true.

$$ \lim_{\theta \to 0} \varphi(\theta) = \lambda_t \nabla \hat{f}_{h,k}(y_t + \theta \lambda_t d_t)^T d_t = \lambda_t \| \nabla \hat{f}_{h,k}(y_t) \|^2 > 0 $$

$$ \lim_{\theta \to 1} \varphi(\theta) = \nabla \hat{f}_{h,k}(y_t + \lambda_t d_t)^T \lambda_t d_t = \lambda_t \nabla \hat{f}_{h,k}(y_{t+1})^T \nabla \hat{f}_{h,k}(y_t) > 0 \quad \text{(Lemma)} $$

The function $\varphi(\theta)$ is continuous and differentiable, implying

$$ \varphi'(\theta) = \lambda_t^2 \, d_t^T \nabla^2 \hat{f}_{h,k}(y_t + \theta \lambda_t d_t)^T d_t = (y_{t+1} - y_t)^T \nabla^2 \hat{f}_{h,k}(y_t + \theta \lambda_t d_t)^T (y_{t+1} - y_t) < 0 $$

(it is easy to see that $y_t + \theta \lambda_t d_t \in S$).

So $\varphi(\theta)$ decreases monotonically and $\varphi(\theta) > 0$ for all $\theta \in (0, 1)$, from which $\hat{f}_{h,k}(y_{t+1}) > \hat{f}_{h,k}(y_t)$ can be concluded. Consequently, the sequence $\hat{f}_{h,k}(y_t)$ converges and increases monotonically.

Assuming that the sequence $\hat{f}_{h,k}(y_t)$ converges to $\hat{f}_{h,k}(\bar{y})$, $\bar{y} \in S$, we can derive the equality

$$ \lim_{\hat{f}_{h,k}(y_t) \to \hat{f}_{h,k}(\bar{y})} \varphi(\theta) = 0. $$

In addition, since $0 < \varphi(1) < \varphi(\theta)$, we have the two equalities (7a) and (7b):

$$ \lim_{\hat{f}_{h,k}(y_t) \to \hat{f}_{h,k}(\bar{y})} \nabla \hat{f}_{h,k}(y_{t+1})^T \nabla \hat{f}_{h,k}(y_t) = 0 \qquad (7a) $$

$$ \lim_{\hat{f}_{h,k}(y_t) \to \hat{f}_{h,k}(\bar{y})} \nabla \hat{f}_{h,k}(y_t)^T \nabla \hat{f}_{h,k}(y_{t+1}) = 0 \qquad (7b) $$

Multiplying the left-hand sides of (7a) and (7b) yields

$$ \lim_{\hat{f}_{h,k}(y_t) \to \hat{f}_{h,k}(\bar{y})} \| \nabla \hat{f}_{h,k}(y_{t+1}) \|^2 \cdot \| \nabla \hat{f}_{h,k}(y_t) \|^2 = 0 $$

which implies $\lim_{\hat{f}_{h,k}(y_t) \to \hat{f}_{h,k}(\bar{y})} \nabla \hat{f}_{h,k}(y_t) = 0$, that is, $\lim_{\hat{f}_{h,k}(y_t) \to \hat{f}_{h,k}(\bar{y})} (y_{t+1} - y_t) = 0$, so $y_t$ is convergent. By the continuity of the function $\hat{f}_{h,k}$, $y_t$ converges to the point $\bar{y}$ and $\nabla \hat{f}_{h,k}(\bar{y}) = 0$. □

Mean shift uses adaptive step sizes to guarantee convergence, which also eliminates the need for additional procedures to choose adequate step sizes (for example, finding the optimum step size in the steepest descent method [17,18]). This is a major advantage over traditional gradient-based methods. The bandwidth $h$ only determines the number of observed peaks [10] and the size of the region where $\hat{f}_{h,k}(y_t)$ is concave. Generally the number of peaks decreases as the bandwidth $h$ increases. So when the bandwidth matrix is $H = \begin{pmatrix} h_x & 0 \\ 0 & h_y \end{pmatrix}$, mean shift can converge to a stationary point.

4. Experiment on tracking based on mean shift

The tracking algorithm based on mean shift in [19] is adopted in this paper, and more details about this algorithm can be found in [19]. However, the approach in this paper differs from [19] in the following respects.
1) The bandwidth matrix is $H = \begin{pmatrix} h_x & 0 \\ 0 & h_y \end{pmatrix}$, where $h_x$ and $h_y$ are the width and height of the object respectively.
2) The profile $k$ is a truncated Gaussian function:

$$ k(x) = \begin{cases} e^{-x}, & \|x\| \le \lambda \\ 0, & \|x\| > \lambda \end{cases} $$

3) The optimum $h_x$, $h_y$ are searched for such that $\hat{f}_{h,k}(y_t)$ is maximum.

Figure 1. Analysis of the experimental process: (a) distribution of the Bhattacharyya coefficient; (b) number of iterations
Figure 2. The result of the experiment when λ=2: (a) 20th frame; (b) 30th frame; (c) 70th frame; (d) 120th frame
Figure 3. Variation of the Bhattacharyya coefficient
Figure 4. The result of the experiment when λ=1: (a) 20th frame; (b) 30th frame

The video in our experiments was acquired in an outdoor environment through the vision system installed on the robot of our lab [20]. The object and its initial location are assumed to be known. Through the mean shift algorithm, the location of the object is found after some iterations. Figure 1(a) shows the distribution of the Bhattacharyya coefficient ρ around the initial location of the object. After some iterations, the next location is found. Figure 1(b) shows the number of iterations needed to find the object in each frame from the first frame to the 120th frame. Figure 2 shows the result of the experiment. After 120 frames, the object can still be found, even though there are some occlusions. Figure 3 shows the variation of the Bhattacharyya coefficient from frame to frame; the curve in Figure 3 has two minimum points, A and B, which show that there are two
occlusions while the object is moving. During an occlusion, the Bhattacharyya coefficient decreases quickly; after that, it increases quickly. The occlusion cannot last too long; otherwise the object will be lost. λ is important for the performance of object tracking under occlusion. If λ is too small, the object is easily lost. If λ is too large, the computation time is too long and real-time performance is poor. When λ = 1, the results of the experiment in Figure 4 show that the object is lost. Generally λ = 1.5-2.5. With λ = 2, as in Figure 2, the results are good.

5. Conclusions

This paper primarily studies the convergence of the mean shift algorithm with a Gaussian profile and presents a modified object tracking approach based on mean shift. First, this paper reviews the mean shift algorithm; second, the imprecise proof in [14,15] is pointed out. Third, a new convergence theorem and its proof are provided; lastly, a modified object tracking approach is presented. This approach can be applied to the occlusion problem. However, the parameter λ has an important effect on the performance of object tracking in an occlusion environment. Generally λ = 1.5-2.5 can ensure good results.

Acknowledgements

This work was supported by the National Natural Science Foundation of China under Grant No. 60234030. The authors thank all members of the institute of intelligent robot of Central South University. We also thank the anonymous reviewers for their good advice.

References

[1] Fukunaga K, and Hostetler LD, “The estimation of the gradient of a density function, with applications in pattern recognition,” IEEE Trans. Information Theory, vol.21, pp.32-40, 1975.
[2] Cheng Y, “Mean shift, mode seeking, and clustering,” IEEE Trans. on Pattern Analysis and Machine Intelligence, vol.17, no.8, pp.790-799, 1995.
[3] Comaniciu D, and Ramesh V, “Mean shift and optimal prediction for efficient object tracking,” In: Mojsilovic A, Hu J, eds. Proc. of the IEEE Int’l Conf. on Image Processing (ICIP), pp.70-73, 2000.
[4] Comaniciu D, Ramesh V, and Meer P, “Real-Time tracking of non-rigid objects using mean shift,” In: Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pp.142-149, 2000.
[5] Collins RT, “Mean shift blob tracking through scale space,” In: Proc. of the Conf. on Computer Vision and Pattern Recognition (CVPR), pp.18-20, 2003.
[6] Shan C, Wei Y, and Tan T et al, “Real Time Hand Tracking by Combining Particle Filtering and Mean Shift,” In: Proc. of the 6th IEEE International Conf. on Automatic Face and Gesture Recognition, 17-19, May, pp.669-674, 2004.
[7] Maggio E, and Cavallaro A, “Hybrid Particle Filter and Mean Shift tracker with adaptive transition model,” In: Proc. of IEEE Signal Proc. Society Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP), Philadelphia, PA, USA, March pp.19-23, 2005.
[8] Comaniciu D, “Image segmentation using clustering with saddle point detection,” In: Proc. of the IEEE Int’l Conf. on Image Processing (ICIP), pp.297-300, 2002.
[9] Wang J, Thiesson B, and Y. Xu et al, “Image and Video Segmentation by Anisotropic Kernel Mean Shift,” In: Proc. European Conf. on Computer Vision (ECCV), 2004.
[10] Comaniciu D, Ramesh V, and A. D. Bue, “Multivariate saddle point detection for statistical clustering,” In: Proc. of the European Conf. Computer Vision (ECCV), pp.561-576, 2002.
[11] Georgescu B, Shimshoni I, and Meer P, “Mean Shift Based Clustering in High Dimensions: A Texture Classification Example,” In: Proc. ICCV, Oct. pp.456-463, 2003.
[12] Comaniciu D, and Meer P, “Mean shift analysis and applications,” In: Proc. of the IEEE Int’l Conf. on Computer Vision (ICCV), pp.1197-1203, 1999.
[13] Comaniciu D, “Nonparametric information fusion for motion estimation,” In: Proc. of the IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pp.59-66, 2003.
[14] Comaniciu D, and Meer P, “Mean shift: A robust approach toward feature space analysis,” IEEE Trans. on Pattern Analysis and Machine Intelligence, vol.24, no.5, pp.603-619, 2002.
[15] Li X, and Wu F et al, “Convergence of a mean shift algorithm,” Journal of Software, vol.16, no.3, pp.365-374, 2005, in Chinese.
[16] Comaniciu D, Ramesh V, and Meer P, “The variable bandwidth mean shift and data-driven scale selection,” In: Proc. of the IEEE Int’l Conf. on Computer Vision (ICCV), pp.438-445, 2001.
[17] Xie Z, Li J, and Tang Z, “Non-linear optimization,” Changsha: National University of Defence Technology Publish House, pp.167-174, 2003, in Chinese.
[18] Shi Y, “Globally convergent algorithms for unconstrained optimization,” Computational Optimization and Applications, vol.16, pp.295-308, 2000.
[19] Comaniciu D, Ramesh V, and Meer P, “Kernel-Based Object Tracking,” IEEE Trans. on Pattern Analysis and Machine Intelligence, vol.25, no.5, pp.564-577, 2003.
[20] Cai ZX, Zou XB, et al, “Design of Distributed Control System for Mobile Robot,” J. Cent. South Univ. (science and technology), vol.36, no.5, pp.727-732, 2005, in Chinese.
