
CSE 455/555 Spring 2013 Homework 8: Non-parametric Techniques

Jason J. Corso
Computer Science and Engineering
SUNY at Buffalo

This assignment does not need to be submitted and will not be graded, but students are advised to work through the
problems to ensure they understand the material.
You are both allowed and encouraged to work in groups on this and other homework assignments in this class.
These are challenging topics, and working together will both make them easier to decipher and help you ensure that
you truly understand them.

1. Kernel Density Estimation and k-Nearest Neighbors

Suppose you are given a dataset X = {0, 0, 0, 1, 2, 2, 2, 2, 3, 4, 4, 4, 5, 5}.
(a) Using the following kernel function with a bandwidth of 3, calculate the kernel density estimate at each of
the points x = {0, 1, 2, 3, 4, 5}.

K(u) =

(b) Consider the effect of the bandwidth in kernel density estimation: when will it yield estimates with
high bias, and when will it yield estimates with high variance?
(c) Consider the analogous question for k-nearest neighbors: what is the effect of the choice of k?
(d) Suppose that we are classifying d-dimensional data using the k-nearest neighbor method. Show
that the effective number of parameters used by k-nearest neighbor is on the order of N/k, where N is
the number of training examples.
Hint: Think of the cases where k = 1 and k = N.
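The mechanics of parts (a) and (b) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a worked solution: the kernel here is the Epanechnikov kernel, chosen only as a stand-in, and the helper names `epanechnikov` and `kde` are hypothetical. Substitute the kernel specified in part (a) before comparing numbers.

```python
import numpy as np

# Dataset from Problem 1.
X = np.array([0, 0, 0, 1, 2, 2, 2, 2, 3, 4, 4, 4, 5, 5], dtype=float)

def epanechnikov(u):
    """Stand-in kernel (an assumption): 0.75 * (1 - u^2) for |u| <= 1, else 0."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

def kde(query, data, h, kernel=epanechnikov):
    """1-D kernel density estimate: p(x) = 1/(N*h) * sum_i K((x - x_i) / h)."""
    query = np.atleast_1d(np.asarray(query, dtype=float))
    data = np.asarray(data, dtype=float)
    # Pairwise scaled differences, shape (len(query), len(data)).
    u = (query[:, None] - data[None, :]) / h
    return kernel(u).sum(axis=1) / (data.size * h)

if __name__ == "__main__":
    points = np.arange(6, dtype=float)
    # A large bandwidth smooths heavily; a small one follows individual samples.
    for h in (3.0, 0.5):
        print(f"h = {h}:", np.round(kde(points, X, h), 4))
```

Running the loop with a large and a small bandwidth is a quick way to see the trade-off that part (b) asks about: the estimate flattens as h grows and becomes spiky as h shrinks.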
2. Figure-Ground Segmentation
Implement the kernel density estimation based method for foreground and background segmentation. The
method is described in the attached paper.
(a) Be sure to implement weighted kernel-density estimation (based on the current probability of foreground
and probability of background).
(b) Use exactly the same color-based feature space that the paper uses.
(c) You do not need to implement the method of normalized KL-divergence for selecting the kernel scale
for initialization. Just set it to some reasonable value manually.
(d) You do not need to use the method based on sample variance to set the kernel bandwidth. Just set it to
something reasonable (0.1).
(e) You have the choice of either implementing a sampling-based version as Zhao and Davis have done (i.e.,
take 6% of the pixels each round), or you can simply process all of the pixels.

(f) Rather than implementing the Gaussian kernel as they have, use the Epanechnikov kernel:
$K(u) = \frac{3}{4}\,(1 - u^2)\,\mathbf{1}(|u| \le 1)$
(g) Have your system iterate for a fixed number of iterations (say 25).
(h) Set the bandwidth parameter to several different values (say 0.1, 0.01, or 0.2), and observe the effect on the
results for the provided flower.ppm and butterfly.ppm files.
(i) Also try this method on some images of your own choosing and examine the results. (You will probably want to
resize your images to a lower resolution, such as 240 × 180, before processing them.)
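
The iteration outlined in items (a) through (g) might be organized roughly as follows. This is a minimal sketch under several assumptions, not the method from the paper: features are taken to be an (N, d) array already scaled to [0, 1], the multivariate kernel is a product of 1-D Epanechnikov kernels, foreground and background are given equal priors in the update, and the names `weighted_kde` and `segment` are hypothetical.

```python
import numpy as np

def epanechnikov(u):
    """1-D Epanechnikov kernel: 0.75 * (1 - u^2) for |u| <= 1, else 0."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

def weighted_kde(query, samples, weights, h):
    """Weighted product-kernel density estimate at each row of `query`.

    query: (M, d), samples: (N, d), weights: (N,) non-negative.
    """
    u = (query[:, None, :] - samples[None, :, :]) / h   # (M, N, d)
    k = epanechnikov(u).prod(axis=2)                     # (M, N)
    d = samples.shape[1]
    norm = max(weights.sum() * h**d, np.finfo(float).tiny)
    return (k @ weights) / norm

def segment(features, p_fg_init, h=0.1, n_iter=25, sample_frac=None, seed=0):
    """Iteratively re-estimate per-pixel foreground probabilities.

    sample_frac=None uses every pixel as a kernel center; a value such as
    0.06 draws that fraction of pixels at random in each round.
    """
    rng = np.random.default_rng(seed)
    features = np.asarray(features, dtype=float)
    p_fg = np.asarray(p_fg_init, dtype=float).copy()
    n = features.shape[0]
    for _ in range(n_iter):
        if sample_frac is None:
            idx = np.arange(n)
        else:
            m = max(1, int(round(sample_frac * n)))
            idx = rng.choice(n, size=m, replace=False)
        centers, w = features[idx], p_fg[idx]
        f_fg = weighted_kde(features, centers, w, h)        # weights: P(fg)
        f_bg = weighted_kde(features, centers, 1.0 - w, h)  # weights: P(bg)
        total = f_fg + f_bg
        # Equal-prior update (an assumption); keep the old value where
        # neither density has any support at that pixel.
        p_fg = np.where(total > 0, f_fg / np.maximum(total, 1e-300), p_fg)
    return p_fg
```

Note that the all-pairs array in `weighted_kde` grows as M × N × d, so for a full 240 × 180 image it would need to be evaluated in chunks of query rows or combined with the sampling option in item (e).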
