
K-Means Clustering

https://www.analyticsvidhya.com/blog/2019/08/comprehensive-guide-k-means-clustering/

Learning Outcome
By the end of this lecture, you should be able to understand,
explain and apply K-Means Clustering.
• Similarity is measured as the distance between the “points” to be clustered
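As a concrete illustration, the distance between two points is commonly measured with the Euclidean norm; a minimal sketch with NumPy (the two points are made-up examples):

```python
import numpy as np

# two hypothetical points in 2-D feature space
a = np.array([1.0, 2.0])
b = np.array([4.0, 6.0])

# Euclidean distance: square root of the sum of squared differences
dist = np.linalg.norm(a - b)
print(dist)  # 5.0
```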
K-Means Clustering is one of the simplest unsupervised machine learning algorithms, and it is fast and efficient in terms of computational cost.
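The algorithm itself alternates two steps: assign each point to its nearest centroid, then recompute each centroid as the mean of its assigned points. A simplified from-scratch sketch (for illustration only, not the scikit-learn implementation used below):

```python
import numpy as np

def kmeans_sketch(x, k, n_iter=10, seed=0):
    """Toy K-means: x is an (n_samples, n_features) array."""
    rng = np.random.default_rng(seed)
    # initialise centroids by picking k random data points
    centroids = x[rng.choice(len(x), k, replace=False)]
    for _ in range(n_iter):
        # distance of every point to every centroid
        dists = np.linalg.norm(x[:, None, :] - centroids[None, :, :], axis=2)
        # assignment step: nearest centroid wins
        labels = dists.argmin(axis=1)
        # update step: centroid becomes the mean of its cluster
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = x[labels == j].mean(axis=0)
    return labels, centroids
```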
K-Means Clustering in Python
# K-Means Clustering on the iris flower dataset
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

# import the dataset
df = pd.read_csv('iris.csv')
# df.head(10)

# 4 columns of features
x = df.iloc[:, [0, 1, 2, 3]].values

kmeans = KMeans(n_clusters=5)
y_kmeans = kmeans.fit_predict(x)
print(y_kmeans)
print(kmeans.cluster_centers_)
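If iris.csv is not at hand, the same steps can be run against scikit-learn's built-in copy of the dataset (a self-contained variant of the code above, here already using K=3 as the elbow method later suggests):

```python
from sklearn.datasets import load_iris
from sklearn.cluster import KMeans

# built-in iris data: 150 samples, 4 feature columns
x = load_iris().data

kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
y_kmeans = kmeans.fit_predict(x)
print(y_kmeans[:10])
print(kmeans.cluster_centers_.shape)  # (3, 4)
```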
# to find the optimum number of clusters
Error = []
for i in range(1, 11):
    kmeans = KMeans(n_clusters=i).fit(x)
    Error.append(kmeans.inertia_)

# the elbow indicates the optimal value of K;
# edit and run again using the new K
plt.plot(range(1, 11), Error)
plt.title('Elbow method')
plt.xlabel('No of clusters')
plt.ylabel('Error')
plt.show()
The elbow method gives us an idea of what a good number of clusters k would be, based on the sum of squared errors (SSE) between data points and their assigned clusters’ centroids.

(Figure: elbow plot for the iris data, estimated K = 3)

We pick k at the spot where the SSE starts to flatten out, forming an elbow.

https://towardsdatascience.com/k-means-clustering-algorithm-applications-evaluation-methods-and-drawbacks-aa03e644b48a

inertia measures how well a dataset was clustered: the sum of squared distances of samples to their closest centroid.

homogeneity score measures how close each cluster comes to containing only members of a single class.

completeness score measures how close the clustering comes to assigning all members of a given class to the same cluster.

V-measure is the harmonic mean of homogeneity and completeness.

adjusted Rand index computes a similarity measure between two clusterings, corrected for chance.

adjusted mutual information computes a similarity measure between two clusterings, adjusted for agreement due to chance.

silhouette coefficient measures the degree of separation between clusters.
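All of the scores above are available in sklearn.metrics; a sketch evaluating a K=3 clustering of the iris data against the true species labels (the first five scores compare to ground truth, while silhouette uses only the data and the predicted labels):

```python
from sklearn import metrics
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

iris = load_iris()
x, y_true = iris.data, iris.target

y_pred = KMeans(n_clusters=3, n_init=10, random_state=42).fit_predict(x)

# supervised scores: compare the clustering to the true class labels
print('homogeneity  :', metrics.homogeneity_score(y_true, y_pred))
print('completeness :', metrics.completeness_score(y_true, y_pred))
print('V-measure    :', metrics.v_measure_score(y_true, y_pred))
print('adjusted Rand:', metrics.adjusted_rand_score(y_true, y_pred))
print('adjusted MI  :', metrics.adjusted_mutual_info_score(y_true, y_pred))

# unsupervised score: needs only the data and the predicted labels
print('silhouette   :', metrics.silhouette_score(x, y_pred))
```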
