Professional Documents
Culture Documents
Results of a k-medoid
ε-Neighborhood of p
ε ε ε-Neighborhood of q
q p
Density of p is “high” (MinPts = 4)
Density of q is “low” (MinPts = 4)
Minpts = 3
Eps=radius
of the
circles
Directly density-reachable
An object q is directly density-reachable from
object p if p is a core object and q is in p’s -
neighborhood.
MinPts = 4
Density-Connected
A pair of points p and q are density-connected
if they are commonly density-reachable from a
point o.
Density-connectivity is
symmetric
p q
o
University at Buffalo The State University of New York
Formal Description of
Cluster
P is a core object.
DBScan Algorithm
for each o D do
if o is not yet classified then
if o is a core-object then
collect all objects density-reachable from o
and assign them to a new cluster.
else
assign o to NOISE
University at Buffalo The State University of New York
DBSCAN Algorithm:
Example
Parameter
= 2 cm
MinPts = 3
for each o D do
if o is not yet classified then
if o is a core-object then
collect all objects density-reachable from o
and assign them to a new cluster.
else
assign o to NOISE
University at Buffalo The State University of New York
DBSCAN Algorithm:
Example
Parameter
= 2 cm
MinPts = 3
for each o D do
if o is not yet classified then
if o is a core-object then
collect all objects density-reachable from o
and assign them to a new cluster.
else
assign o to NOISE
P1
C1 P
P C1
C1
C1
C1
C1 C1
C1
= 10, MinPts = 4
• Resistant to Noise
• Can handle clusters of different shapes and sizes
(MinPts=4, Eps=9.92).
Original Points
(MinPts=4, Eps=9.75)
p 3-distance(p) :
q
3-distance(q) :
3-distance
first „valley“
Objects
„border object“
Heuristic method:
Fix a value for MinPts (default: 2 d –1)
User selects “border object” o from the MinPts-distance plot;
is set to MinPts-distance(o)
A C
F A, B, C
E
B, D, E
3-Distance
G
B‘, D‘, F, G
G1
G3 D1, D2,
D G2 G1, G2, G3
B D’
B’ D1
D2 Objects
Idea: Process objects in the “right” order and keep track of point
density in their neighborhood
C MinPts = 3
C1 C2
2 1
core-distance,MinPts(o)
“smallest distance such that o is a core object”
(if that distance is “?”otherwise)
reachability-distance,MinPts(p, o)
MinPts = 5
“smallest distance such that p is
directly density-reachable from o” p
(if that distance is “?”otherwise) q o
core-distance(o)
reachability-distance(p,o)
reachability-distance(q,o)
1
2 17
3 16 18 34
1
4 2 16 17
18
Core-distance
Performance: approx. Reachability-distance
runtime( DBSCAN( , MinPts) )
O( n * runtime(-neighborhood-query) )
without spatial index support (worst case): O( n2 )
e.g. tree-based spatial index support: O( n log(n) )
reachability distance
“large enough”
2
Reachability-
distance
undefined
‘
DBSCAN OPTICS
Density Boolean value Numerical value
(high/low) (core distance)
Density- Boolean value Numerical value
connected (yes/no) (reachability distance)
Searching random greedy
strategy
Guassian:
d ( x, y )2
y 2 2
f Gaussian ( x) e
d ( x , xi ) 2
( x ) i 1 e
D N
2 2
f Gaussian
d ( x , xi ) 2
( x ) i 1 e
D N
2 2
f Gaussian
d ( x , xi ) 2
( x, xi ) i 1 ( xi x) e
D N
2 2
f Gaussian
Center-defined cluster
A subset of objects attracted by an attractor x
density(x) ≥
Arbitrary-shape cluster
A group of center-defined clusters which are
connected by a path P
For each object x on P, density(x) ≥ .