UCS551 Chapter 7 - Clustering
CLUSTERING
DR AZLIN AHMAD
CONTENT
K-Means
K-Nearest Neighbour
WHAT IS CLUSTERING?
Clustering:-
a process of grouping similar objects into groups called clusters
The clusters represent the hidden patterns in the data set
widely used in numerous applications, e.g.
pattern recognition, data analysis, image processing, life sciences
Repeat these steps for a set number of iterations or until the group
centers don’t change much between iterations. You can also opt to
randomly initialize the group centers a few times, and then select
the run that looks like it provided the best results.
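The loop and the random restarts described above can be sketched in plain Python. This is a minimal illustration, not the lecture's own implementation. It assumes the repeated steps are the usual K-Means pair (assign each point to its nearest center, then move each center to the mean of its points), and it assumes "best results" means the lowest sum of squared distances to the assigned centers. The names `kmeans`, `best_of_restarts`, `max_iters`, `tol` and `n_restarts` are hypothetical.

```python
import math
import random


def assign(points, centers):
    """Return, for each point, the index of its nearest center."""
    return [min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points]


def kmeans(points, k, max_iters=100, tol=1e-6, seed=0):
    """One K-Means run. Returns (centers, labels, inertia)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # random initial centers taken from the data
    for _ in range(max_iters):       # stop after a set number of iterations...
        labels = assign(points, centers)
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                # Move the center to the mean of its assigned points.
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                # Assumption: an empty cluster keeps its previous center.
                new_centers.append(centers[j])
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:              # ...or when the centers barely move
            break
    labels = assign(points, centers)
    # Assumed quality measure: sum of squared distances to assigned centers.
    inertia = sum(math.dist(p, centers[lab]) ** 2 for p, lab in zip(points, labels))
    return centers, labels, inertia


def best_of_restarts(points, k, n_restarts=5):
    """Run K-Means from several random initializations and keep the best run."""
    runs = (kmeans(points, k, seed=s) for s in range(n_restarts))
    return min(runs, key=lambda run: run[2])


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, labels, inertia = best_of_restarts(points, k=2)
```

Choosing the run by lowest inertia is one common way to make "looks like it provided the best results" concrete; other criteria are possible.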
K-NEAREST NEIGHBOR
Get the labels of the selected K entries
If regression, return the mean of the K labels
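The two steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the K entries are selected by Euclidean distance to the query, and for classification the most common label (the mode) is returned, which is the usual convention but is not stated on the slide. The function name `knn_predict` and the `task` parameter are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train, query, k, task="regression"):
    """train is a list of (features, label) pairs; query is a feature tuple."""
    # Select the K entries closest to the query (assumed: Euclidean distance).
    neighbours = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    # Get the labels of the selected K entries.
    labels = [label for _, label in neighbours]
    if task == "regression":
        # If regression, return the mean of the K labels.
        return sum(labels) / len(labels)
    # Assumption: for classification, return the most common label.
    return Counter(labels).most_common(1)[0][0]


reg_train = [((1.0,), 10.0), ((2.0,), 20.0), ((3.0,), 30.0), ((10.0,), 100.0)]
reg_result = knn_predict(reg_train, (2.1,), k=3)

cls_train = [((0, 0), "a"), ((0, 1), "a"), ((5, 5), "b"), ((6, 5), "b")]
cls_result = knn_predict(cls_train, (1, 1), k=3, task="classification")
```

Every prediction sorts the whole training set by distance, so the cost of a query grows with both the number of examples and the number of features.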
Advantages
The algorithm is simple and easy to implement.
There’s no need to build a model, tune several parameters, or make additional assumptions.
The algorithm is versatile. It can be used for classification, regression, and search (as we
will see in the next section).
Disadvantages
The algorithm gets significantly slower as the number of examples and/or
predictors/independent variables increases.