
Clustering

Parts of the content of this class are adapted from online materials. In particular:
1. Introduction to Computational Thinking and Data Science, by Prof. Eric Grimson, Prof. John Guttag and Dr. Ana Bell, MIT.
https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-0002-introduction-to-computational-thinking-and-data-science-fall-2016/
2. Unsupervised Learning: Clustering, by Shimon Ullman, Tomaso Poggio, Danny Harari, Daniel Zysman, Darren Seibert, MIT.
http://www.mit.edu/~9.54/fall14/slides/Class13.pdf
Machine learning paradigm
• Observe a set of examples: training data
• Infer something about the process that generated that data
• Use the inference to make predictions about previously unseen data: test data
• Supervised: given a set of feature/label pairs, find a rule that predicts the label associated with a previously unseen input
• Unsupervised: given a set of feature vectors (without labels), group them into "natural clusters".
What is Clustering?
What do we need for Clustering?
Distance Measures
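To make the discussion concrete, here is a minimal sketch of two common distance measures used in clustering (Euclidean and Manhattan); the function names are illustrative assumptions, not the lecture's code.

```python
def euclidean(p, q):
    """Euclidean (L2) distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

def manhattan(p, q):
    """Manhattan (L1) distance between two feature vectors."""
    return sum(abs(a - b) for a, b in zip(p, q))
```

Both are special cases of the Minkowski metric; which one is appropriate depends on whether diagonal moves in feature space should count as shorter than axis-aligned ones.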
Clustering is an Optimization Problem

• Why not divide variability by the size of the cluster (as variance does)?
• Because clusters with more points are likely to look less cohesive according to this measure, big and bad is worse than small and bad.
• If one wants to compare the coherence of two clusters of different sizes, one needs to divide the variability of each cluster by its size.
• Is the optimization problem simply finding a C that minimizes dissimilarity(C)?
• No, otherwise we could put each example in its own cluster.
• We need a constraint, e.g.
• Minimum distance between clusters
• Number of clusters
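The objective discussed above can be sketched as follows. `variability` and `dissimilarity` follow the definitions in the bullets (variability is deliberately not divided by cluster size), but the exact names are assumptions, not the lecture's code:

```python
def mean_vector(cluster):
    """Coordinate-wise mean of a list of feature vectors."""
    n = len(cluster)
    return [sum(xs) / n for xs in zip(*cluster)]

def variability(cluster):
    """Sum of squared Euclidean distances to the cluster mean.
    Deliberately NOT divided by cluster size, so a big bad cluster
    is penalized more than a small bad one."""
    center = mean_vector(cluster)
    return sum(sum((a - b) ** 2 for a, b in zip(point, center))
               for point in cluster)

def dissimilarity(clustering):
    """Objective for a whole clustering: total variability."""
    return sum(variability(c) for c in clustering)
```

Note that minimizing `dissimilarity` alone is trivial (one point per cluster gives zero), which is why a constraint such as a fixed number of clusters is needed.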
Clustering Techniques
Hierarchical clustering
Linkage metrics
Example of hierarchical clustering
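The agglomerative procedure behind hierarchical clustering can be sketched as a bottom-up merge loop. This version assumes single linkage (the distance between two clusters is the distance of their closest pair of points), one of several possible linkage metrics; the function names are illustrative:

```python
def single_linkage(c1, c2):
    """Distance between clusters = Euclidean distance of their closest pair."""
    return min(sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5
               for p in c1 for q in c2)

def agglomerate(points, target_k):
    """Start with every point in its own cluster, then repeatedly merge
    the two closest clusters until target_k clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > target_k:
        i, j = min(((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
                   key=lambda ij: single_linkage(clusters[ij[0]],
                                                 clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```

Swapping `min` for `max` in `single_linkage` would give complete linkage; recording the order of merges instead of stopping at `target_k` would give the full dendrogram.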
Clustering Algorithms
K-means Algorithm
An Example: Step 1
Step 2:
Step 3:
Result of first iteration
Second Iteration
Result of Second Iteration
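The iteration walked through above (pick initial centroids, assign each point to its nearest centroid, recompute centroids, repeat until nothing changes) can be sketched as plain Lloyd's-style k-means. This is an illustrative implementation under that assumption, not the lecture's code:

```python
import random

def kmeans(points, k, iters=100, seed=0):
    rng = random.Random(seed)
    centroids = rng.sample(points, k)        # step 1: pick k initial centroids
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:                      # step 2: assign each point to
            i = min(range(k),                 # its nearest centroid
                    key=lambda j: sum((a - b) ** 2
                                      for a, b in zip(p, centroids[j])))
            clusters[i].append(p)
        new_centroids = [                     # step 3: recompute each centroid
            tuple(sum(xs) / len(c) for xs in zip(*c)) if c else centroids[j]
            for j, c in enumerate(clusters)
        ]
        if new_centroids == centroids:        # stop when assignments stabilize
            break
        centroids = new_centroids
    return centroids, clusters
```

On well-separated data this converges in a few iterations, which is the main reason k-means is popular despite its sensitivity to k and to the initial centroids.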
Why Use K-means?
Issues with K-means
• Choosing the "wrong" k can lead to strange results
• Consider k = 3
• Results can depend upon the initial centroids
• Number of iterations
• Even the final results
• The greedy algorithm can find different local optima
• The algorithm is sensitive to outliers
Dealing with Outliers
How to choose K
Sensitivity to Initial Seeds
Mitigating dependence on initial centroids
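One common mitigation, sketched below under the assumption of a plain k-means inner loop (this is a standard technique, not necessarily the lecture's exact method), is to run the algorithm from several different random seeds and keep the run with the lowest total within-cluster variability:

```python
import random

def _kmeans_once(points, k, seed, iters=100):
    """One run of plain k-means (Lloyd's algorithm) from a random seed."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k),
                    key=lambda j: sum((a - b) ** 2
                                      for a, b in zip(p, centroids[j])))
            clusters[i].append(p)
        new = [tuple(sum(xs) / len(c) for xs in zip(*c)) if c else centroids[j]
               for j, c in enumerate(clusters)]
        if new == centroids:
            break
        centroids = new
    return centroids, clusters

def kmeans_restarts(points, k, n_trials=10):
    """Run k-means n_trials times; keep the run with the lowest
    total within-cluster variability (sum of squared distances)."""
    best, best_score = None, float("inf")
    for seed in range(n_trials):
        centroids, clusters = _kmeans_once(points, k, seed)
        score = sum(sum((a - b) ** 2 for a, b in zip(p, c))
                    for c, cl in zip(centroids, clusters) for p in cl)
        if score < best_score:
            best, best_score = (centroids, clusters), score
    return best
```

Because each trial is independent, this reduces (but does not eliminate) the chance of ending in a poor local optimum caused by an unlucky initial seeding.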
An Example
Data Sample
Class Example
Class Cluster
Evaluating a clustering
Patients
K-means
Examining results
Result
How many positives are there?
A Hypothesis
Testing multiple values of K
