
Unsupervised learning

Surprise me!

1
Finding the Unknown

❖ Find latent variables

■ Dimensionality Reduction / Representation learning

❖ Separate things

■ Clustering

❖ Find anomalies

■ Anomaly detection

2
A universal tool

❖ Always handy

❖ Never enough

3
Clustering
Labeling the unlabeled

4
K-means
Further details in ME/SM

1. Define random prototypes (centroids)

2. Attach all samples to the closest one

3. Recompute prototypes

4. Repeat from step 2 until the assignments stop changing (see the sketch below)

5
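A minimal NumPy sketch of the four steps above. The toy data, k=2, the fixed iteration count and the seed are illustrative assumptions, not part of the slide.

```python
import numpy as np

def kmeans(X, k, n_iter=10, seed=0):
    rng = np.random.default_rng(seed)
    # 1. Define random prototypes (centroids): pick k distinct samples
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # 2. Attach every sample to the closest prototype
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # 3. Recompute prototypes as the mean of each cluster
        #    (keep the old prototype if a cluster went empty)
        centroids = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                              else centroids[j] for j in range(k)])
        # 4. Repeat step 2 (next loop iteration)
    return labels, centroids

X = np.array([[1.0, 1.0], [1.5, 2.0], [5.0, 7.0], [6.0, 8.0], [5.5, 6.5]])
print(kmeans(X, k=2))
```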
K-means

6
The good and the bad

❖ No guarantee of reaching the global optimum

❖ Generally fast

❖ Sensitive to

■ number of clusters (k)

■ prototype initialization

❖ Assumes spherical clusters (prototypes are means)

7
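To illustrate the sensitivity noted above, a hedged scikit-learn sketch: inertia (within-cluster sum of squares) for several values of k, and a single random init versus ten restarts. The blob data and parameter grid are made up.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(c, 0.5, size=(50, 2)) for c in ([0, 0], [5, 5], [0, 5])])

# Sensitivity to k: look for the "elbow" in the inertia curve
for k in (2, 3, 4, 5):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    print(k, round(km.inertia_, 1))

# Sensitivity to initialization: one random init vs. 10 restarts
single = KMeans(n_clusters=3, init="random", n_init=1, random_state=1).fit(X)
multi = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print(single.inertia_, multi.inertia_)
```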
Cluster this

❖ (1,1), (1,2), (1,5), (3,2), (3,4), (4,1), (4,4), (5,3), (5,5), (6,2), (6,6), (7,7)

❖ Choose a random init

❖ Run 3 iterations of the algorithm (a checking sketch follows this slide)

8
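One hedged way to verify the hand-run exercise with scikit-learn, assuming k=2 and a particular pair of starting prototypes; the slide leaves both choices to you.

```python
import numpy as np
from sklearn.cluster import KMeans

X = np.array([(1, 1), (1, 2), (1, 5), (3, 2), (3, 4), (4, 1),
              (4, 4), (5, 3), (5, 5), (6, 2), (6, 6), (7, 7)], dtype=float)

init = np.array([[1.0, 1.0], [7.0, 7.0]])   # assumed "random" starting prototypes
km = KMeans(n_clusters=2, init=init, n_init=1, max_iter=3).fit(X)

print(km.labels_)           # cluster assignment after 3 iterations
print(km.cluster_centers_)  # recomputed prototypes
```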
Variants

❖ k-Medians: per-coordinate median as prototype

■ Not necessarily a data point. Outlier robust

❖ k-Medoids: the sample closest to all others in its cluster

■ Actual data point. Outlier robust

■ Works with any pair-wise dissimilarity within the cluster

❖ Improved initializations available (k-means++ spreads the initial prototypes)

9
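A small NumPy sketch contrasting the prototype choices above (mean, per-coordinate median, medoid); the four-point cluster with one outlier is made up. Note that scikit-learn's KMeans already uses the spread-out k-means++ initialization by default.

```python
import numpy as np

cluster = np.array([[1.0, 1.0], [1.0, 2.0], [2.0, 1.0], [10.0, 10.0]])  # one outlier

mean = cluster.mean(axis=0)           # k-means prototype, pulled toward the outlier
median = np.median(cluster, axis=0)   # k-medians prototype, more robust

# k-medoids prototype: the actual sample with the smallest total distance
# to the rest of its cluster (only pair-wise distances are needed)
pairwise = np.linalg.norm(cluster[:, None] - cluster[None, :], axis=2)
medoid = cluster[pairwise.sum(axis=1).argmin()]

print(mean, median, medoid)
```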
Hierarchical Clustering
Further details in ME/SM

❖ Group things greedily, pair by pair


■ Bottom-up (agglomerative)

○ Start with N singleton clusters

○ Merge the two most similar items from different clusters

○ Define the merged pair as a new cluster

■ Top-down (divisive)

○ Split around the two most distant points (new centroids) of the same cluster

○ Define two new clusters. Assign the rest by distance to centroid

❖ Matrix of distances

■ Merge & update it at every step

10
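A minimal SciPy sketch of the bottom-up procedure above; the five toy points and the single-linkage choice are assumptions. linkage() starts from N singleton clusters and greedily merges the two closest ones.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

X = np.array([[1, 1], [1, 2], [5, 5], [6, 6], [10, 1]], dtype=float)

# Each row of Z records one merge: (cluster i, cluster j, distance, new size)
Z = linkage(X, method="single")
print(Z)
```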
Dendrograms
Further details in ME/SM

❖ Y axis = distance

❖ Branch length reflects the dissimilarity of the merged clusters

■ Cut with a horizontal line to obtain flat clusters

11
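Sketch of reading a dendrogram with SciPy: plot it and cut it with a horizontal line to obtain flat clusters. The threshold of 2.0, the toy points and the average linkage are arbitrary assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram, fcluster

X = np.array([[1, 1], [1, 2], [5, 5], [6, 6], [10, 1]], dtype=float)
Z = linkage(X, method="average")

dendrogram(Z)                      # y axis = merge distance
plt.axhline(2.0, linestyle="--")   # the horizontal "line cut"
plt.show()

print(fcluster(Z, t=2.0, criterion="distance"))  # flat cluster labels from the cut
```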
Hierarchical Clustering

❖ Linkage type (distance between clusters)

■ Complete (max, spheres)

■ Average (mean, spheres)

■ Single (min, ladders)

■ Ward (minimum variance within cluster)

12
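A hedged comparison of the four linkage types listed above on the same made-up points, cutting each tree into two flat clusters; different linkages can merge in a different order and give differently shaped clusters.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

X = np.array([[1, 1], [1, 2], [2, 1], [8, 8], [8, 9], [9, 8], [5, 5]], dtype=float)

for method in ("complete", "average", "single", "ward"):
    Z = linkage(X, method=method)
    labels = fcluster(Z, t=2, criterion="maxclust")  # force two flat clusters
    print(method, labels)
```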
Hierarchical Clustering

❖ Linkage type

■ Complete (max, spheres)

■ Average (mean, spheres)

■ Single (min, ladders)

■ Ward (variance)

13
The good and the bad

❖ Expensive (recompute dist. mat.)

❖ Sensitive to linkage type

❖ Sensitive to outliers & noise (single linkage!)

❖ Usable with any similarity measure

❖ Interpretable

❖ Kinda k-free (pick k afterwards by cutting the dendrogram)

14
Cluster this

❖ (1,1), (1,2), (1,5), (3,2), (3,4), (4,1), (4,4), (5,3), (5,5), (6,2), (6,6), (7,7)

❖ Choose a tie breaker

❖ Build the dendrogram (a checking sketch follows this slide)

15
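A way to check the exercise by machine, assuming Euclidean distance and single linkage; the slide leaves those choices, including the tie breaker, to you, and SciPy breaks ties internally.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram

X = np.array([(1, 1), (1, 2), (1, 5), (3, 2), (3, 4), (4, 1),
              (4, 4), (5, 3), (5, 5), (6, 2), (6, 6), (7, 7)], dtype=float)

dendrogram(linkage(X, method="single"))  # compare against your hand-built tree
plt.show()
```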
DBSCAN
Further details in ME
❖ Find dense sample regions

■ Core point: potential cluster center (at least min. points neighbours within the radius)

■ Border point: within the radius of a core point, but not a core point itself

■ Noise points: everything else

❖ Label all samples

❖ For each core point

■ start or expand a cluster

■ propagate its label to neighbouring border points

16
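A minimal scikit-learn sketch of the procedure above: eps is the radius, min_samples the core-point threshold, and samples labelled -1 are noise. The two blobs, the planted outliers and the parameter values are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(0)
blob1 = rng.normal([0, 0], 0.3, size=(30, 2))
blob2 = rng.normal([5, 5], 0.3, size=(30, 2))
outliers = np.array([[10.0, -5.0], [-6.0, 8.0]])
X = np.vstack([blob1, blob2, outliers])

db = DBSCAN(eps=0.8, min_samples=5).fit(X)
print(set(db.labels_))                         # cluster ids, with -1 = noise
print("core points:", len(db.core_sample_indices_))
```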
DBSCAN

❖ Density-based clusters

❖ Noise: Unassigned samples

❖ Very sensitive to hyperparams

❖ Finds the number of clusters (k) on its own

17
DBSCAN params

❖ Radius (epsilon)

■ Large: larger, more irregular clusters

■ Small: smaller, rounder clusters, more noise points

❖ Num. min. points

■ High: denser clusters

■ Low: sparser clusters

18
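A hedged illustration of the trade-offs above: sweep eps and min_samples on made-up data and count the resulting clusters versus noise points.

```python
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(0)
X = np.vstack([rng.normal([0, 0], 0.4, size=(40, 2)),
               rng.normal([4, 4], 0.4, size=(40, 2))])

for eps in (0.3, 0.6, 1.2):
    for min_samples in (3, 10):
        labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(X)
        n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
        n_noise = int(np.sum(labels == -1))
        print(f"eps={eps} min_samples={min_samples}: "
              f"{n_clusters} clusters, {n_noise} noise points")
```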
DBSCAN params

19
The good and the bad

❖ Bad for variable density datasets

❖ Works with any pair-wise similarity

❖ Fast

❖ No k needed (radius and min. points instead)

❖ Robust to outliers

❖ Non-deterministic for ties or border samples

❖ Extensions: OPTICS, HDBSCAN (hierarchical variants that sweep the radius)


20
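A sketch of "works with any pair-wise similarity": scikit-learn's DBSCAN accepts a precomputed distance matrix instead of raw features. The Manhattan metric, the toy points and the parameter values are just examples.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.metrics import pairwise_distances

X = np.array([[1, 1], [1, 2], [2, 1], [8, 8], [8, 9], [9, 8]], dtype=float)
D = pairwise_distances(X, metric="manhattan")   # any pair-wise distance works here

labels = DBSCAN(eps=2.5, min_samples=2, metric="precomputed").fit_predict(D)
print(labels)
```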
Anomaly Detection
It’s weird

21
Combination of other stuff

❖ Statistical measures

❖ Density metrics (KNN, DBSCAN)

❖ Frequent itemsets the sample fails to satisfy

❖ …

❖ Infrequent due to data volume

22
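A hedged sketch combining two of the ingredients above: a simple statistical score (z-score) and a density score (distance to the k-th nearest neighbour). The thresholds, the planted outlier and the data are illustrative assumptions.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, size=(200, 2)), [[8.0, 8.0]]])  # one planted anomaly

# Statistical measure: per-feature z-score, flag points far from the mean
z = np.abs((X - X.mean(axis=0)) / X.std(axis=0))
stat_flag = (z > 3).any(axis=1)

# Density measure: distance to the 5th nearest neighbour (large = sparse region)
nn = NearestNeighbors(n_neighbors=6).fit(X)   # 6 = the point itself + 5 neighbours
dist, _ = nn.kneighbors(X)
density_flag = dist[:, -1] > np.percentile(dist[:, -1], 99)

print(np.where(stat_flag & density_flag)[0])  # indices flagged by both criteria
```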
