
PRESENTATION

Operating System Concept


CS-582
GROUP MEMBERS

 Mujtaba Hassan
 Hamid Mehmood Saqib
 Nosheen Safdar
K-MEANS
CLUSTERING
HISTORY OF K-MEANS
CLUSTERING
• The term "k-means" was first used by James MacQueen in 1967, though the idea goes back further.
• The standard algorithm was first proposed by Stuart Lloyd in 1957 as a technique for pulse-code modulation, though it was not published outside of Bell Labs until 1982.
• In 1965, E. W. Forgy published essentially the same method, which is why it is sometimes referred to as the Lloyd-Forgy algorithm.
INTRODUCTION TO K-MEANS
CLUSTERING
K-means clustering is a type of unsupervised learning, which is used when you
have unlabeled data (i.e., data without defined categories or groups). The goal of
this algorithm is to find groups in the data, with the number of groups
represented by the variable K. The algorithm works iteratively to assign each
data point to one of K groups based on the features that are provided. Data
points are clustered based on feature similarity. The results of the K-means
clustering algorithm are:
• The centroids of the K clusters, which can be used to label new data
• Labels for the training data (each data point is assigned to a single cluster)
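As a minimal sketch (assuming Python with NumPy and scikit-learn available; the data and the choice K = 2 are made up for illustration), the two outputs listed above look like this:

import numpy as np
from sklearn.cluster import KMeans

# Toy 2-D data: two loose groups of points (made up for the example)
X = np.array([
    [1.0, 2.0], [1.5, 1.8], [1.2, 2.2],   # first group
    [8.0, 8.0], [8.5, 7.7], [7.8, 8.3],   # second group
])

# K is chosen by the user; here K = 2
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

print(kmeans.cluster_centers_)      # centroids of the K clusters
print(kmeans.labels_)               # cluster label for each training point
print(kmeans.predict([[0.9, 2.1]])) # the centroids can be used to label new data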
ALGORITHM
The K-means clustering algorithm uses iterative refinement to produce a final result. The algorithm inputs
are the number of clusters K and the data set. The data set is a collection of features for each data point.
The algorithm starts with initial estimates for the K centroids, which can either be randomly generated or
randomly selected from the data set.
1. Data assignment step:
In this step, each data point is assigned to its nearest centroid, based on the squared Euclidean
distance.

2. Centroid update step:


In this step, the centroids are recomputed. This is done by taking the mean of all data points assigned to
that centroid's cluster.
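These two steps can be written directly in code. A minimal from-scratch sketch in Python with NumPy (the function name and defaults are illustrative, not part of the original slides):

import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    rng = np.random.default_rng(seed)
    # Initial estimates: K centroids randomly selected from the data set
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # 1. Data assignment step: assign each point to its nearest
        #    centroid, based on the squared Euclidean distance
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # 2. Centroid update step: recompute each centroid as the mean
        #    of all data points assigned to that centroid's cluster
        #    (an empty cluster would need special handling in practice)
        new_centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centroids, centroids):
            break  # assignments have stabilized
        centroids = new_centroids
    return centroids, labels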
TYPES OF CLUSTERING
1. Hierarchical algorithms: these find successive clusters using previously established clusters.
   a. Agglomerative ("bottom-up"): agglomerative algorithms begin with each element as a separate cluster and merge them into successively larger clusters.
   b. Divisive ("top-down"): divisive algorithms begin with the whole set and proceed to divide it into successively smaller clusters.
2. Partitional clustering: partitional algorithms determine all clusters at once. They include:
   • K-means and derivatives
   • Fuzzy c-means clustering
   • The QT clustering algorithm
HOW THE K-MEANS
CLUSTERING ALGORITHM
WORKS
K-MEANS: ADVANTAGES AND
DISADVANTAGES
Advantages
• Easy to implement.
• With a large number of variables, K-means may be computationally faster than hierarchical clustering (if K is small).
• K-means may produce tighter clusters than hierarchical clustering.
• An instance can change cluster (move to another cluster) when the centroids are recomputed.
Disadvantages
• Difficult to predict the number of clusters (the K value).
• Initial seeds have a strong impact on the final results.
• The order of the data has an impact on the final results.
• Sensitive to scale.
APPLICATIONS OF K-MEAN
CLUSTERING
• It is relatively efficient and fast, computing a result in O(tkn) time,
where n is the number of objects or points, k is the number of clusters, and
t is the number of iterations.
• K-means clustering is widely applied in machine learning and data
mining.
• Used on acoustic data in speech understanding to convert
waveforms into one of k categories (a process known as vector
quantization); the same idea is used for image segmentation.
• Also used for choosing color palettes on old-fashioned graphical
display devices and for image quantization (see the sketch below).
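As a sketch of the color-palette / image-quantization use just mentioned (assuming Python with Pillow, NumPy, and scikit-learn; the file names are placeholders):

import numpy as np
from PIL import Image
from sklearn.cluster import KMeans

img = np.asarray(Image.open("photo.png").convert("RGB"))  # placeholder file name
pixels = img.reshape(-1, 3)                               # one row per pixel (R, G, B)

# Reduce the image to a palette of k = 16 colors (vector quantization)
km = KMeans(n_clusters=16, n_init=4, random_state=0).fit(pixels)
palette = km.cluster_centers_.astype(np.uint8)             # the 16 palette colors
quantized = palette[km.labels_].reshape(img.shape)         # each pixel -> nearest palette color

Image.fromarray(quantized).save("photo_16_colors.png")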
CONCLUSION
• The K-means algorithm is useful for undirected knowledge
discovery and is relatively simple.
• K-means has found widespread use in many fields, ranging
from unsupervised learning in neural networks, pattern
recognition, classification analysis, and artificial
intelligence to image processing, machine vision, and many
others.
