
4/6/2017 kmeans_cluster

Contents
K-means clustering function
Distance function

K-means clustering function

function [c,centroids,dists2centroids] = kmeans_cluster(X,k,distFun)


% Cutoff criteria
epsilon = 1e-7; % Some small value

% Initialize cutoff values


maxEpsilon = 100; % Set to some large value initially

% Get initial centroids by randomly sampling rows of the input matrix


% Assumes clustering dimension is by rows
rand_vals = randperm(size(X,1));
order = sort(rand_vals(1:k)); % organize (for numerical reasons)
centroids = X(order,:);

loop_count = 0;

% Begin loop and continue until cutoffs met


while maxEpsilon > epsilon && loop_count < 100

loop_count = loop_count + 1;

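% Distance from every observation (row of X) to every centroid.
% size(X,1) is used instead of length(X), which returns the largest
% dimension and miscounts rows when X has more columns than rows.
n_obs = size(X,1);
distances = zeros(n_obs,k);
for i = 1:n_obs
    for j = 1:k
        distances(i,j) = distFun(X(i,:),centroids(j,:));
    end
end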

% Get membership of observations to clusters based on distance


[~,I] = min(distances',[],1); % explicit dimension so k = 1 still gives one index per observation

% Re-calculate centroids based on current assignments


new_centroids = zeros(size(centroids));
for i = 1:k
    % Average along dimension 1 so a single-member cluster still yields a row vector
    new_centroids(i,:) = mean(X(I==i,:),1);
end

% Compute change in centroids


% update epsilon cutoff value
maxEpsilon = sum(sum(abs(new_centroids - centroids)));

% Update centroids to new ones


centroids = new_centroids;

end

% Assign final membership, centroids, and distances to centroids


c = I;
dists2centroids = min(distances,[],2);

end
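
The two steps inside the loop (assign each observation to its nearest centroid, then move each centroid to the mean of its members) can be traced by hand on a tiny data set. The sketch below is not part of the original function: the data, the starting centroids, and the inline Euclidean handle are assumptions chosen so the result is easy to check.

```matlab
% Toy data: two well-separated groups of 2-D points (illustrative values)
X = [0 0; 0 1; 10 10; 10 11];
centroids = [0 0; 10 10];              % assumed starting centroids
distFun = @(x,y) sqrt(sum((x-y).^2));  % Euclidean distance, written inline

n = size(X,1);
k = size(centroids,1);

% Assignment step: distance from every observation to every centroid
distances = zeros(n,k);
for i = 1:n
    for j = 1:k
        distances(i,j) = distFun(X(i,:),centroids(j,:));
    end
end
[~,I] = min(distances,[],2);           % nearest centroid for each row of X

% Update step: each centroid becomes the mean of its assigned observations
new_centroids = zeros(size(centroids));
for j = 1:k
    new_centroids(j,:) = mean(X(I==j,:),1);
end
% I is [1;1;2;2] and new_centroids is [0 0.5; 10 10.5]
```

After this single pass the centroids move by a total of 1.0 (the sum of absolute changes), which is the quantity the loop compares against epsilon.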


Not enough input arguments.

Error in kmeans_cluster (line 11)


rand_vals = randperm(size(X,1));

Distance function

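function fun = distance(dist)
% Return a handle that computes the requested distance between two row vectors

switch dist
    case 'Euclidean'
        fun = @(x,y) sqrt(sum((x-y).^2));
    case 'cityblock'
        fun = @(x,y) sum(abs(x-y));
    otherwise
        % Without this branch an unrecognized name would leave fun unassigned
        error('distance:unknownMetric','Unsupported distance ''%s''.',dist);
end
end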
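
As a quick sanity check of the two metrics, the sketch below evaluates the same anonymous functions on a 3-4-5 example. The handles are written out inline so the snippet runs on its own; the variable names (euclid, cityblock, p, q) are illustrative and do not appear in the original listing.

```matlab
% Same anonymous functions that distance() returns, written inline
euclid    = @(x,y) sqrt(sum((x-y).^2));
cityblock = @(x,y) sum(abs(x-y));

p = [0 0];
q = [3 4];

d_euclid = euclid(p,q);      % 5, the hypotenuse of a 3-4-5 triangle
d_city   = cityblock(p,q);   % 7, i.e. |3| + |4|
```

With both functions on the path, the handle would be passed straight through as the third argument, for example kmeans_cluster(X,3,distance('cityblock')).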

Published with MATLAB R2016b
