3 Basic Algorithms 33
and 0 otherwise.
Stage 2 Keep the r fixed and determine µ. Since the r’s are fixed, J is a
quadratic function of µ. It can be minimized by setting the derivative
with respect to $\mu_j$ to 0:
$$\sum_{i=1}^{m} r_{ij} (x_i - \mu_j) = 0 \quad \text{for all } j. \tag{1.31}$$
Rearranging obtains
$$\mu_j = \frac{\sum_i r_{ij} x_i}{\sum_i r_{ij}}. \tag{1.32}$$
Since $\sum_i r_{ij}$ counts the number of points assigned to cluster j, we are
essentially setting $\mu_j$ to be the sample mean of the points assigned
to cluster j.
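The update in (1.32) can be sketched in a few lines of Python. This is only an illustration: the function name `update_centers`, the list-of-lists data layout, and the choice to leave the center of an empty cluster unchanged (where the denominator in (1.32) would be zero) are assumptions, not part of the text.

```python
def update_centers(X, r, mu_old):
    """Return new centers following (1.32).

    X: list of m points, each a list of d floats.
    r: list of m rows with k entries in {0, 1}; r[i][j] = 1 iff x_i is in cluster j.
    mu_old: list of k current centers, reused for empty clusters (an assumption).
    """
    m, k, d = len(X), len(mu_old), len(X[0])
    mu_new = []
    for j in range(k):
        count = sum(r[i][j] for i in range(m))  # sum_i r_ij: points in cluster j
        if count == 0:
            # (1.32) is undefined here; keeping the old center is an assumption.
            mu_new.append(list(mu_old[j]))
            continue
        # sum_i r_ij x_i, computed coordinate by coordinate
        total = [sum(r[i][j] * X[i][c] for i in range(m)) for c in range(d)]
        mu_new.append([t / count for t in total])
    return mu_new


X = [[0.0, 0.0], [2.0, 0.0], [10.0, 10.0]]
r = [[1, 0], [1, 0], [0, 1]]  # first two points in cluster 0, last in cluster 1
mu = update_centers(X, r, [[0.0, 0.0], [0.0, 0.0]])
print(mu)  # [[1.0, 0.0], [10.0, 10.0]]
```

Cluster 0 receives the mean of its two points and cluster 1 the single point assigned to it, which is exactly the sample-mean interpretation of (1.32).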
The algorithm stops when the cluster assignments do not change
significantly. Detailed pseudo-code can be found in Algorithm 1.5.
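As a rough companion to that pseudo-code, the two alternating stages and the stopping rule might look as follows in Python. This is a sketch under stated assumptions, not a transcription of Algorithm 1.5: the name `kmeans`, the `max_iter` safeguard, the use of squared Euclidean distance, the caller-supplied initial centers, and the stricter stopping test (no assignment changes at all) are choices made for this illustration.

```python
def kmeans(X, mu, max_iter=100):
    """Alternate the two stages until the assignments stop changing.

    X: list of points (lists of floats); mu: list of initial centers.
    Returns the final centers and the cluster index of every point.
    """
    mu = [list(c) for c in mu]  # copy so the caller's centers are not modified
    assign = None
    for _ in range(max_iter):
        # Stage 1: with the centers fixed, assign each point to its nearest
        # center (squared Euclidean distance, an assumption of this sketch).
        new_assign = [
            min(range(len(mu)),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(x, mu[j])))
            for x in X
        ]
        if new_assign == assign:  # no assignment changed: stop
            break
        assign = new_assign
        # Stage 2: with the assignments fixed, move each center to the
        # sample mean of its points.
        for j in range(len(mu)):
            members = [x for x, a in zip(X, assign) if a == j]
            if members:  # an empty cluster keeps its old center (assumption)
                mu[j] = [sum(col) / len(members) for col in zip(*members)]
    return mu, assign


X = [[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [10.0, 12.0]]
centers, labels = kmeans(X, mu=[X[0], X[2]])
print(centers)  # [[0.0, 1.0], [10.0, 11.0]]
print(labels)   # [0, 0, 1, 1]
```

Here the initial centers are simply two of the data points; with this well-separated toy data the loop converges after one update.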
Two issues with K-Means are worth noting. First, it is sensitive to the
choice of the initial cluster centers µ. A number of practical heuristics have
been developed. For instance, one could randomly choose k points from the
given dataset as cluster centers. Other methods try to pick k points from X
which are farthest away from each other. Second, it makes a hard assignment
of every point to a cluster center. Variants which we will encounter later in