Unsupervised Learning (1h)
Surprise me!
1
Finding the Unknown
❖ Separate things
■ Clustering
❖ Find anomalies
■ Anomaly detection
2
A universal tool
❖ Always handy
❖ Never enough
3
Clustering
Labeling the unlabelled
4
K-means
Further details in ME/SM
3. Recompute prototypes
4. Repeat from step 2 until the assignments stop changing
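A minimal sketch of the loop above, in plain Python. It assumes Euclidean distance, and the starting prototypes are passed in by the caller. The function name `kmeans` and the sample points are illustrative and not taken from the slides.

```python
from math import dist  # Euclidean distance, Python 3.8+


def kmeans(points, prototypes, max_iter=100):
    """Plain k-means on tuples of numbers.

    `prototypes` is the initial guess; its length fixes k.
    """
    prototypes = [tuple(p) for p in prototypes]
    labels = None
    for _ in range(max_iter):
        # Assignment: every sample goes to its nearest prototype.
        new_labels = [
            min(range(len(prototypes)), key=lambda j: dist(p, prototypes[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # nothing moved, so repeating would change nothing
        labels = new_labels
        # Recompute each prototype as the mean of its members.
        for j in range(len(prototypes)):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # an empty cluster keeps its old prototype
                prototypes[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, prototypes


# Two well-separated groups (made-up data).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, prototypes = kmeans(points, prototypes=[(0, 0), (10, 10)])
print(labels)
```

Passing the initial prototypes explicitly keeps the run deterministic; in practice they are often drawn at random from the data.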
5
K-means
6
The good and the bad
❖ No optimum guarantees
❖ Generally fast
❖ Sensitive to
■ prototype initialization
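The sensitivity to initialization can be seen on a toy example. The sketch below is an assumption-laden illustration (four made-up points, Euclidean distance, a hypothetical helper `kmeans_inertia`): the same data converges to two different stable solutions depending on where the prototypes start, and one of them has a much larger within-cluster sum of squares.

```python
from math import dist


def kmeans_inertia(points, prototypes, max_iter=100):
    """Run k-means from the given prototypes; return (labels, inertia)."""
    prototypes = [tuple(p) for p in prototypes]
    for _ in range(max_iter):
        labels = [
            min(range(len(prototypes)), key=lambda j: dist(p, prototypes[j]))
            for p in points
        ]
        updated = []
        for j, old in enumerate(prototypes):
            members = [p for p, l in zip(points, labels) if l == j]
            updated.append(
                tuple(sum(c) / len(members) for c in zip(*members)) if members else old
            )
        if updated == prototypes:
            break  # prototypes are stable
        prototypes = updated
    # Inertia: sum of squared distances of samples to their prototype.
    inertia = sum(dist(p, prototypes[l]) ** 2 for p, l in zip(points, labels))
    return labels, inertia


# Corners of a wide rectangle: the natural split is left vs. right.
points = [(0, 0), (0, 1), (4, 0), (4, 1)]

good = kmeans_inertia(points, [(0, 0), (4, 0)])  # starts on the left and right
bad = kmeans_inertia(points, [(2, 0), (2, 1)])   # starts on the bottom and top

print(good)  # left/right split, small inertia
print(bad)   # bottom/top split, stable but far worse
```

A common remedy is to run the algorithm from several initializations and keep the result with the lowest inertia.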
7
Cluster this
❖ (1,1), (1,2), (1,5), (3,2), (3,4), (4,1), (4,4), (5,3), (5,5), (6,2), (6,6), (7,7)
8
Variants
9
Hierarchical Clustering
Further details in ME/SM
❖ Y axis = distance
❖ Branch length
■ Line cut
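The line cut can be sketched as an agglomerative loop that stops merging once the next merge would sit above the cut. This is only an illustration: it assumes single linkage (distance between the closest members) and Euclidean distance, and the name `agglomerate` and the sample points are made up.

```python
from math import dist


def agglomerate(points, cut):
    """Merge the closest clusters bottom-up until the next merge exceeds `cut`.

    Returns the clusters below the cut and the heights of the merges made,
    which correspond to the branch heights on the dendrogram's Y axis.
    """
    clusters = [[p] for p in points]
    heights = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the two closest members.
                d = min(dist(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > cut:
            break  # every remaining merge lies above the cut line
        heights.append(d)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters, heights


points = [(0, 0), (0, 1), (5, 0), (5, 1), (20, 0)]
low, _ = agglomerate(points, cut=2)   # cut low: more, smaller clusters
high, _ = agglomerate(points, cut=6)  # cut higher: fewer, larger clusters
print(len(low), len(high))
```

Moving the cut up or down changes the number of clusters without rerunning the merges, which is why the dendrogram is inspected before choosing it.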
11
Hierarchical Clustering
cluster)
12
Hierarchical Clustering
❖ Linkage type
■ Ward (variance)
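For Ward linkage, the cost of merging two clusters is the increase in within-cluster sum of squares that the merge causes. The sketch below checks the usual closed form of that cost against a direct computation; the helper names (`centroid`, `sse`, `ward_cost`) and the data are hypothetical.

```python
def centroid(cluster):
    """Mean point of a cluster given as a list of tuples."""
    return tuple(sum(c) / len(cluster) for c in zip(*cluster))


def sse(cluster):
    """Sum of squared distances of the members to their centroid."""
    c = centroid(cluster)
    return sum(sum((x - y) ** 2 for x, y in zip(p, c)) for p in cluster)


def ward_cost(a, b):
    """Increase in total within-cluster variance if a and b are merged."""
    ca, cb = centroid(a), centroid(b)
    gap = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * gap


a = [(0, 0), (0, 2)]
b = [(4, 0), (4, 2)]
print(ward_cost(a, b))                # closed form
print(sse(a + b) - sse(a) - sse(b))   # same quantity computed directly
```

At each step, Ward linkage merges the pair of clusters with the smallest such cost, which tends to produce compact clusters of similar size.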
13
The good and the bad
❖ Interpretable
❖ Kinda k-free
14
Cluster this
❖ (1,1), (1,2), (1,5), (3,2), (3,4), (4,1), (4,4), (5,3), (5,5), (6,2), (6,6), (7,7)
15
DBSCAN
Further details in ME
❖ Find dense sample regions
■ Core point: Potential cluster center (at least a minimum number of neighbours within a maximum radius)
■ Border point: Neighbour to a core point
■ Noise points: Others
❖ Density-based clusters
❖ Finds its own k (the number of clusters is not set in advance)
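The three point types can be made concrete with a small sketch. It assumes Euclidean distance and counts a point as its own neighbour (one common convention); the names `dbscan`, `eps` and `min_samples` and the sample data are illustrative rather than taken from the slides.

```python
from math import dist

NOISE = -1


def dbscan(points, eps, min_samples):
    """Return (labels, is_core); labels are cluster ids or NOISE."""
    # Neighbourhood of each point: everything within radius eps, itself included.
    neighbours = [
        [j for j, q in enumerate(points) if dist(p, q) <= eps] for p in points
    ]
    is_core = [len(n) >= min_samples for n in neighbours]
    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None or not is_core[i]:
            continue
        # Grow a new cluster outward from an unvisited core point.
        labels[i] = cluster
        frontier = [i]
        while frontier:
            j = frontier.pop()
            for n in neighbours[j]:
                if labels[n] is None:
                    labels[n] = cluster
                    if is_core[n]:
                        frontier.append(n)  # border points join but do not expand
        cluster += 1
    # Anything never reached from a core point is noise.
    return [NOISE if l is None else l for l in labels], is_core


points = [
    (0, 0), (0, 1), (1, 0), (1, 1), (2, 1),    # dense group plus one border point
    (10, 10), (10, 11), (11, 10), (11, 11),    # second dense group
    (5, 5),                                    # isolated point
]
labels, is_core = dbscan(points, eps=1.1, min_samples=3)
print(labels)
```

In this run, (2, 1) is a border point (it joins the first cluster without being core), (5, 5) is noise, and the number of clusters falls out of the density parameters rather than being specified.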
17
DBSCAN params
❖ Radius (epsilon)
18
DBSCAN params
19
The good and the bad
❖ Fast
❖ Robust to outliers
21
Combination of other stuff
❖ Statistical measures
❖ …
22