
SWE2009 - DATA MINING TECHNIQUES

DIGITAL ASSIGNMENT 2
NAME: S Pavithra
REG NO: 22MIS1152
K-MEANS CLUSTERING:
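As an illustrative sketch only (not the assignment's own code), a k-means run in R with the base kmeans() function could look like the following; the file name mydata.csv, the column selection, and k = 5 are assumptions:

# Load the dataset (file name is assumed for illustration)
data <- read.csv("mydata.csv")

# Keep only the numeric attributes and standardise them
x <- scale(data[sapply(data, is.numeric)])

# Run k-means with an assumed k = 5 and several random restarts
set.seed(42)
km <- kmeans(x, centers = 5, nstart = 25)

# Cluster assignments and cluster centres
km$cluster
km$centers

# Scatter plot of the first two attributes, coloured by cluster
plot(x[, 1], x[, 2], col = km$cluster,
     xlab = colnames(x)[1], ylab = colnames(x)[2])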

K-MEDOIDS CLUSTERING:

Initially, the input data, which is in table format, is transformed into a numerical array using the table2array()
function. The code then proceeds to determine the optimal
value of k (number of clusters) using an elbow curve
approach. The resulting cluster assignments, cluster
representatives, and the sum of dissimilarities are recorded.
The sum of dissimilarities for each k is stored in the sa array.
Finally, the k-medoids algorithm is executed once more with
k = 5 to obtain the final cluster assignments, cluster
representatives, and sum of dissimilarities. The resulting
clusters are visualized using a scatter plot, where the x and y
coordinates of the data points are taken from the first and
second columns of the dataset, respectively, and the points
are coloured based on their cluster assignments.
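The walkthrough above describes MATLAB code (table2array() is a MATLAB table-to-matrix conversion). A rough sketch of the same elbow-curve workflow in R, using cluster::pam() as the k-medoids implementation, might look like the following; the file name, the column choices, and the range of k values tried are assumptions, not from the assignment:

library(cluster)

# Load the dataset and keep the numeric attributes (file name is assumed)
data <- read.csv("mydata.csv")
x <- as.matrix(data[sapply(data, is.numeric)])

# Elbow curve: total within-cluster dissimilarity for each candidate k,
# analogous to the sa array in the original code
sa <- sapply(1:10, function(k) {
  fit_k <- pam(x, k)
  sum(fit_k$clusinfo[, "size"] * fit_k$clusinfo[, "av_diss"])
})
plot(1:10, sa, type = "b", xlab = "k", ylab = "Sum of dissimilarities")

# Re-run k-medoids with the chosen k = 5
fit <- pam(x, k = 5)

# Scatter plot of the first two columns, coloured by cluster assignment,
# with the medoids marked
plot(x[, 1], x[, 2], col = fit$clustering,
     xlab = colnames(x)[1], ylab = colnames(x)[2])
points(fit$medoids[, 1], fit$medoids[, 2], pch = 8, cex = 2)
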
PAM CLUSTERING:
In this code, we first install and load the cluster package, which provides the pam()
function for PAM clustering. Then, we load and preprocess the dataset as needed.
Next, we perform PAM clustering using the pam() function, specifying the attributes
to cluster on and the desired number of clusters.
The pam() function returns an object that contains information about the clustering
results, such as the cluster assignments and the medoids. You can access this
information using the $ operator.
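A minimal sketch of these steps (the file name mydata.csv, the attribute selection, the object name pam_result, and k = 5 are assumptions for illustration):

# Install and load the cluster package (install once)
# install.packages("cluster")
library(cluster)

# Load and preprocess the dataset (file name and columns are assumed)
data <- read.csv("mydata.csv")
x <- scale(data[sapply(data, is.numeric)])

# PAM clustering with an assumed k = 5
pam_result <- pam(x, k = 5)

# Access the results with the $ operator
pam_result$clustering   # cluster assignment of each data point
pam_result$medoids      # the k medoids (cluster representatives)
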
CLARA CLUSTERING:
In this code, we first install and load the cluster package, which provides the
clara() function for CLARA clustering. Then, we load and preprocess the dataset as
needed. Next, we perform CLARA clustering using the clara() function, specifying
the attributes to cluster on and the desired number of clusters.
The clara() function returns an object that contains information about the
clustering results, such as the cluster assignments and the medoids. You can access
this information using the $ operator. For example, clara_result$clustering gives
the cluster assignments for each data point.
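A minimal sketch of these steps (the file name, the attribute selection, and k = 5 are assumptions; samples and pamLike are optional arguments of clara() used here as illustrative choices):

# install.packages("cluster")
library(cluster)

# Load and preprocess the dataset (file name and columns are assumed)
data <- read.csv("mydata.csv")
x <- scale(data[sapply(data, is.numeric)])

# CLARA clustering: PAM applied to repeated random subsamples of the data
clara_result <- clara(x, k = 5, samples = 50, pamLike = TRUE)

# Access the results with the $ operator
clara_result$clustering   # cluster assignments for each data point
clara_result$medoids      # medoids found on the best subsample

CLARA repeatedly draws random subsamples, runs PAM on each, and keeps the medoids that give the lowest average dissimilarity over the full dataset, which is what lets it handle larger datasets than pam() alone.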

HIERARCHICAL CLUSTERING:
In this code, we first load and preprocess the dataset as needed. Then, we perform
hierarchical clustering using the hclust() function. We calculate the distance matrix
using the dist() function on the attributes of interest. The hclust() function then
performs the hierarchical clustering on the distance matrix.
The hclust() function returns an object that represents the hierarchical clustering
result. You can visualize the clustering using various methods, such as dendrograms
or heatmaps.
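A minimal sketch of these steps (the file name, the attribute selection, the ward.D2 linkage, and the cut at k = 5 are assumptions for illustration):

# Load and preprocess the dataset (file name and columns are assumed)
data <- read.csv("mydata.csv")
x <- scale(data[sapply(data, is.numeric)])

# Distance matrix on the attributes of interest
d <- dist(x, method = "euclidean")

# Hierarchical clustering on the distance matrix
hc <- hclust(d, method = "ward.D2")

# Visualise the result as a dendrogram
plot(hc, labels = FALSE, main = "Hierarchical clustering dendrogram")

# Optionally cut the tree into a fixed number of clusters (k = 5 assumed)
clusters <- cutree(hc, k = 5)
table(clusters)

Calling plot() on an hclust object draws the dendrogram directly; cutree() then turns the tree into flat cluster labels when a fixed number of clusters is wanted.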

GITHUB LINK:
https://github.com/pavithras333/DATAMININGDA2.1152
