You are on page 1of 2

Experiment 3.

Aim: Write a program to implement k-means clustering using clustering algorithm.

Software Required:

1. R studio
2. R language

Description:

K-means clustering is a popular unsupervised machine learning algorithm used for partitioning
a set of data points into distinct, non-overlapping groups or clusters. The goal of K-means
clustering is to group similar data points together and separate dissimilar ones. It is widely
used for various applications, such as customer segmentation, image compression, document
categorization, and more.
The K-means algorithm aims to minimize the sum of squared distances between data points and
their assigned cluster centroids. This objective is known as the "within-cluster sum of squares"
(WCSS). The algorithm may converge to a local minimum, depending on the initial centroids, so
it is often advisable to run the algorithm multiple times with different initializations and choose
the best result based on the WCSS.

Code:

install.packages("clusterR")
install.packages("cluster")
setwd("C:/Users/Desktop/Harmandeep/Sem 7/BI_LAB")
dataset = read.csv('mall.csv')
X = dataset[4:5]
set.seed(6)
wcss = vector()
for (i in 1:10) wcss[i] = sum(kmeans(X, i)$withinss)
pdf("elbow-graph.pdf", paper="a4")
plot(x = 1:10,
y = wcss,
type = 'b',
main = 'The Elbow Method',
xlab = 'Number of clusters',

Name: Akar shan saini UID:20BCS4914


ylab = 'WCSS')
dev.off()
set.seed(29)
kmeans = kmeans(x = X,
centers = 6,
iter.max = 300,
nstart = 10)
library(cluster)
pdf("clusterplot.pdf", paper="a4")
clusplot(x = X,
clus = kmeans$cluster,
lines = 0,
shade = TRUE,
color = TRUE,
labels = 4,
plotchar = TRUE,
span = TRUE,
main = 'Clusters of customers',
xlab = 'Annual Income',
ylab = 'Spending Score')

dev.off()

Output:

Name: Akar shan saini UID:20BCS4914

You might also like