
Mean Shift Algorithm

Pi19404
September 23, 2013

Contents

0.1 Introduction
0.2 Kernel density Estimation
0.3 Mean Shift
    0.3.1 Modes of Smooth function
    0.3.2 Using the Gradient
    0.3.3 Local Maxima
0.4 Code



0.1 Introduction
In this article we will look at the basics of the mean shift algorithm.

0.2 Kernel density Estimation


Let us first consider a univariate Gaussian PDF and data sampled from that PDF. Kernel density estimation (KDE) uses the fact that the density of samples around a given point is proportional to the probability at that point. It approximates the probability density by estimating the local density of points; as seen in Figure 1, this approximation is reasonable, since a large density of points is observed near the maximum of the PDF. The KDE estimates the PDF by superposing a kernel at each data point, which is equivalent to convolving the data points with a Gaussian kernel.
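The superposition of kernels described above can be sketched in a few lines of Python. This is a minimal illustration and not the author's MATLAB code: the function names `gaussian_kernel` and `kde`, the sample size and the bandwidth `h = 0.3` are all assumptions chosen for the example.

```python
import math
import random

def gaussian_kernel(u):
    """Standard normal density, used here as the smoothing kernel K."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def kde(x, samples, h):
    """Kernel density estimate at x: (1 / (n h)) * sum_i K((x - x_i) / h)."""
    n = len(samples)
    return sum(gaussian_kernel((x - xi) / h) for xi in samples) / (n * h)

random.seed(0)
samples = [random.gauss(0.0, 1.0) for _ in range(2000)]  # draws from N(0, 1)
h = 0.3  # bandwidth (assumed value for this sketch)

for x in (-2.0, 0.0, 2.0):
    true_pdf = gaussian_kernel(x)  # the N(0, 1) density coincides with K here
    print(f"x={x:+.1f}  kde={kde(x, samples, h):.3f}  true={true_pdf:.3f}")
```

The estimate should be largest near 0, where the samples are densest, and it should integrate to approximately 1 because each kernel does.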

0.3 Mean Shift


Mean shift is a tool for finding the modes of a set of data samples drawn from an underlying PDF. The aim of the mean shift algorithm is to find the densest region in a given set of data samples, since a high density of data points implies a high PDF value. Let us consider a 2D region whose points are sampled from an underlying, unknown PDF. Let $X = [x; y]$ be random variables associated with a multivariate PDF $P(X)$. Thus sampling a point from $P(X)$ will give us a vector $\hat{X} = [\hat{x}; \hat{y}]$.


Figure 1: Density estimation. (a) Original PDF, (b) sampled data, (c) density estimate.

For example, let us consider a multivariate Gaussian distribution where the random variables x and y take values in the range -3 to 3.

0.3.1 Modes of Smooth function


Let us say we want to find the modes of the PDF. The PDF is approximated using kernel density estimation. Modes are the points at which the PDF exhibits a local maximum; dense regions of the data correspond to modes, or local maxima, of the PDF. Since the kernel is smooth, it is differentiable, and it gives rise to a smooth estimate of the PDF. The gradient of the density estimate is given by


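$$\hat{f}_h(x) = \frac{1}{n}\sum_{i=1}^{n} K_h(x - x_i) = \frac{1}{nh}\sum_{i=1}^{n} K\left(\frac{x - x_i}{h}\right)$$

$$\nabla \hat{f}_h(x) = \frac{1}{nh}\sum_{i=1}^{n} \nabla K\left(\frac{x - x_i}{h}\right)$$

For a Gaussian kernel, $K(u) = C\exp(-u^2/2)$, so

$$\nabla \hat{f}_h(x) = \frac{C}{nh}\sum_{i=1}^{n} \nabla \exp\left(-\frac{(x - x_i)^2}{2h^2}\right) = \frac{C}{nh}\sum_{i=1}^{n} \exp\left(-\frac{(x - x_i)^2}{2h^2}\right)\frac{x_i - x}{h^2}$$

$$\nabla \hat{f}_h(x) = \frac{1}{nh}\sum_{i=1}^{n} K\left(\frac{x - x_i}{h}\right)\frac{x_i - x}{h^2}$$

Equating the gradient to 0 at $x = \hat{x}$:

$$\sum_{i=1}^{n} K\left(\frac{\hat{x} - x_i}{h}\right)(x_i - \hat{x}) = 0$$

$$\hat{x}\sum_{i=1}^{n} K\left(\frac{\hat{x} - x_i}{h}\right) = \sum_{i=1}^{n} K\left(\frac{\hat{x} - x_i}{h}\right)x_i$$

The estimate is

$$\hat{x} = m(x) = \frac{\sum_{i=1}^{n} K\left(\frac{x - x_i}{h}\right)x_i}{\sum_{i=1}^{n} K\left(\frac{x - x_i}{h}\right)}$$

where $m(x)$ is called the sample mean at $x$ with kernel $K$.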

The sample mean is always biased towards the region of higher density. Thus, if we move along the vector $m(x) - x$, we reach a region of higher density: the density at $m(x)$ will be greater than the density at $x$. This forms the basis of the mean shift algorithm. The vector $m(x) - x$ is called the mean shift vector, and it always points in the direction of the local maximum, or mode.
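As a rough illustration of moving along the mean shift vector, the sketch below repeatedly replaces a 1D point by its kernel-weighted sample mean until the shift becomes negligible. The names `sample_mean` and `mean_shift`, the Gaussian weights, the tolerance and the test data are assumptions made for this example and are not taken from the author's repository.

```python
import math
import random

def sample_mean(x, samples, h):
    """m(x): mean of the samples, weighted by a Gaussian kernel centred at x."""
    weights = [math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in samples]
    return sum(w * xi for w, xi in zip(weights, samples)) / sum(weights)

def mean_shift(x, samples, h, tol=1e-6, max_iter=500):
    """Move x by the mean shift vector m(x) - x until the shift is below tol."""
    for _ in range(max_iter):
        shift = sample_mean(x, samples, h) - x
        x += shift
        if abs(shift) < tol:
            break
    return x

random.seed(1)
samples = [random.gauss(2.0, 0.5) for _ in range(500)]  # one cluster near 2.0
mode = mean_shift(0.5, samples, h=0.4)  # start well away from the cluster
print(f"estimated mode: {mode:.3f}")
```

Starting from 0.5, the point should climb towards the dense region around 2.0, and the loop stops once the point is (numerically) a fixed point of the sample mean.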
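$$m(x) - x = \frac{\sum_{i=1}^{n} K\left(\frac{x - x_i}{h}\right)(x_i - x)}{\sum_{i=1}^{n} K\left(\frac{x - x_i}{h}\right)}$$

$$m(x) - x = h^2\,\frac{\nabla \hat{f}_h(x)}{\hat{f}_h(x)}$$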

This is an estimate of the normalized gradient of $\hat{f}_h(x)$. Given any point $x$, we can take a small step in the direction of the vector $m(x) - x$ to move towards the local maximum.
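For a Gaussian kernel, the mean shift vector $m(x) - x$ works out to $h^2$ times the gradient of the density estimate divided by the density estimate itself. The following sketch checks that relation numerically on synthetic 1D data. The helper names (`kernel`, `kde`, `kde_gradient`, `mean_shift_vector`), the sample data and the bandwidth are assumptions made for this illustration.

```python
import math
import random

def kernel(u):
    """Gaussian kernel K(u) with unit variance."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def kde(x, samples, h):
    """Density estimate (1 / (n h)) * sum_i K((x - x_i) / h)."""
    return sum(kernel((x - xi) / h) for xi in samples) / (len(samples) * h)

def kde_gradient(x, samples, h):
    """Analytic derivative of the Gaussian KDE with respect to x."""
    n = len(samples)
    return sum(kernel((x - xi) / h) * (xi - x) for xi in samples) / (n * h ** 3)

def mean_shift_vector(x, samples, h):
    """m(x) - x, with m(x) the kernel-weighted sample mean."""
    weights = [kernel((x - xi) / h) for xi in samples]
    return sum(w * xi for w, xi in zip(weights, samples)) / sum(weights) - x

random.seed(2)
samples = [random.gauss(0.0, 1.0) for _ in range(1000)]
h = 0.5  # bandwidth (assumed value)

for x in (0.5, 1.5, 2.5):
    lhs = mean_shift_vector(x, samples, h)
    rhs = h ** 2 * kde_gradient(x, samples, h) / kde(x, samples, h)
    print(f"x={x}: m(x)-x={lhs:+.4f}  h^2*grad/f={rhs:+.4f}  f={kde(x, samples, h):.4f}")
```

The two columns should agree up to floating-point rounding. The printed density also makes it possible to compare how the size of the shift changes between the dense centre and the sparse tail.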


Let us consider that the present estimate of the mode is $x$; we then compute $m(x)$ at this point.

For example, let the initial estimate of the location of the mode be (0.96, 2.01). The density at this point can be approximated by interpolation, or computed again using non-parametric density estimation. The plot for this is shown in Figure 2. The estimate clearly does not lie at the maximum. To find the direction of the mean shift vector, we compute the normalized gradient of the density estimate and take a small step in that direction. This is repeated until the gradient magnitude is very small. A video of the mean shift algorithm using KDE is available at https://googledrive.com/host/0B-pfqaQBbAAtNkg2bUJvWERmNFU/a2.avi

Figure 2: Mean Shift

In this case we scale the gradient by the estimated PDF value to obtain the normalized gradient:
$$m(x) - x = h^2\,\frac{\nabla \hat{f}_h(x)}{\hat{f}_h(x)}$$

This enables us to adaptively change the step size based on the estimated PDF value: the step size is inversely proportional to the estimated PDF value. If the estimated PDF value is small, we are far away from the maximum and the step size will be large. If the estimated PDF value is large, we are close to the maximum and the step size will be small.


0.3.2 Using the Gradient


To find the modes of the PDF we are not actually required to estimate the PDF itself; we require just the gradient of the PDF in order to move in the direction of the mean shift vector. The gradient of a superposition of kernels centered at each data point is equivalent to convolving the data points with the gradient of the kernel. So instead of convolving with a Gaussian kernel, we can convolve with the gradient of the Gaussian kernel.

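$$k(X) = C\exp\left(-\frac{x^2 + y^2}{2h^2}\right)$$

$$\frac{\partial}{\partial x}k(X) = -\frac{x}{h^2}\,k(X), \qquad \frac{\partial}{\partial y}k(X) = -\frac{y}{h^2}\,k(X)$$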

Thus, given an initial point $X$, we estimate the value at $X$ using the gradient kernels $\partial k/\partial x$ and $\partial k/\partial y$, which gives us the direction of the gradient at the point $X$.

Since we do not actually estimate the PDF at a point, but only the gradient of the PDF, each mean shift iteration takes a step in the direction of the gradient. In the earlier case, we scaled the gradient by the estimated PDF value to obtain a normalized measure. In the present case we do not adaptively change the step size, but multiply the gradient by a fixed step size. This still incorporates some adaptive behavior, since the magnitude of the resulting shift depends on the gradient magnitude: if the gradient magnitude is large, the step taken will be large; otherwise the step will be small and refined, as it is near the maximum.

A video of the mean shift algorithm using gradient estimates is available at https://googledrive.com/host/0B-pfqaQBbAAtNkg2bUJvWERmNFU/a3.avi

This iterative algorithm is a standard gradient ascent algorithm, and convergence is guaranteed for an infinitesimally small step size. Since the algorithm depends on the kernel density estimate, the bandwidth of the kernel plays an important role in the mean shift algorithm as well.
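The gradient-only variant can be sketched as fixed step-size gradient ascent in 2D, where each sample contributes the derivative of a Gaussian kernel. This is an illustrative sketch and not the author's `kgde2x` implementation: the function names, the step size, the bandwidth, the stopping tolerance and the synthetic data are all assumptions, and the kernel's normalizing constant is dropped because it only rescales the step.

```python
import math
import random

def kde_gradient_2d(px, py, points, h):
    """Gradient of a 2D Gaussian KDE at (px, py), up to a positive constant.

    Each sample contributes k(d) * (-d / h^2), the derivative of the Gaussian
    kernel evaluated at the offset d = (px - xi, py - yi).
    """
    gx = gy = 0.0
    for xi, yi in points:
        dx, dy = px - xi, py - yi
        k = math.exp(-(dx * dx + dy * dy) / (2.0 * h * h))
        gx += -dx / (h * h) * k
        gy += -dy / (h * h) * k
    n = len(points)
    return gx / n, gy / n

def gradient_ascent(start, points, h, step=1.0, tol=1e-5, max_iter=2000):
    """Fixed step size: x <- x + step * gradient, until the gradient is tiny."""
    px, py = start
    for _ in range(max_iter):
        gx, gy = kde_gradient_2d(px, py, points, h)
        if math.hypot(gx, gy) < tol:
            break
        px, py = px + step * gx, py + step * gy
    return px, py

random.seed(4)
# Synthetic samples from a 2D Gaussian centred at the origin (assumed data).
points = [(random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)) for _ in range(1000)]
h = 0.6  # bandwidth (assumed value)
mode = gradient_ascent((0.96, 2.01), points, h)
print("estimated mode: (%.3f, %.3f)" % mode)
```

Because the shift is `step * gradient`, the moves shrink automatically as the gradient flattens near the maximum, even though `step` itself never changes. Too large a `step` can overshoot the maximum, which is why the value here is kept small.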

0.3.3 Local Maxima


If we reach a region where the local density is flat, or we have reached a local maximum, the algorithm will terminate. This is a problem shared by all local search methods that try to reach a global maximum. An animation of this behaviour is available at https://googledrive.com/host/0B-pfqaQBbAAtNkg2bUJvWERmNFU/a1.avi
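The dependence on the starting point can be illustrated with a small 1D experiment on data that has two clusters of different size. The sketch below is a hypothetical example: the function `mean_shift`, the cluster positions, the bandwidth and the starting points are assumptions chosen for the illustration.

```python
import math
import random

def mean_shift(x, samples, h, tol=1e-6, max_iter=1000):
    """Iterate x <- m(x) with Gaussian weights until x stops moving."""
    for _ in range(max_iter):
        weights = [math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in samples]
        new_x = sum(w * xi for w, xi in zip(weights, samples)) / sum(weights)
        if abs(new_x - x) < tol:
            return new_x
        x = new_x
    return x

random.seed(3)
# Two clusters: a large one near -2 and a smaller one near +2.
samples = ([random.gauss(-2.0, 0.5) for _ in range(600)]
           + [random.gauss(2.0, 0.5) for _ in range(200)])

left = mean_shift(-3.0, samples, h=0.4)
middle = mean_shift(1.0, samples, h=0.4)
right = mean_shift(3.0, samples, h=0.4)
for start, mode in ((-3.0, left), (1.0, middle), (3.0, right)):
    print(f"start={start:+.1f} -> mode={mode:+.3f}")
```

A start at 1.0 lies on the slope of the smaller cluster, so the iteration should settle on the mode near +2 even though the mode near -2 is denser. Running the algorithm from several starting points and keeping the densest result is one common way to work around this.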

Figure 3: Mean shift. (a) Original PDF.

0.4 Code
The code is written in MATLAB and is available in the repository https://github.com/pi19404/m19404/tree/master/meanshift where mean_shift.m is the main file. The file kgde2 implements the kernel density estimator using bivariate Gaussian windows for 2D distributions. The file kgde2x implements estimation of the gradient of the KDE; its dim parameter selects whether the gradient is computed along the x or the y direction.
