
CAIM: Cerca i Anàlisi d'Informació Massiva
(Search and Analysis of Massive Information)

FIB, Bachelor's Degree in Informatics Engineering (Grau en Enginyeria Informàtica)

Slides by Marta Arias, José Luis Balcázar,
Ramon Ferrer-i-Cancho, Ricard Gavaldà
Department of Computer Science, UPC

Fall 2018
http://www.cs.upc.edu/~caim
7. Introduction to Network Analysis
Network Analysis, Part I
Today’s contents

1. Examples of real networks

2. What do real networks look like?
  ▶ real networks exhibit small diameter
  ▶ ... and so does the Erdős-Rényi or random model
  ▶ real networks have high clustering coefficient
  ▶ ... and so does the Watts-Strogatz model
  ▶ real networks' degree distribution follows a power law
  ▶ ... and so does the Barabási-Albert or preferential attachment model
Examples of real networks

▶ Social networks
▶ Information networks
▶ Technological networks
▶ Biological networks
Social networks
Links denote social "interactions"
▶ friendship, collaborations, e-mail, etc.
Information networks

Nodes store information, links associate information
▶ citation networks, the web, p2p networks, etc.
Technological networks
Man-made networks, built for the distribution of a commodity
▶ telephone networks, power grids, transportation networks, etc.
Biological networks

Represent biological systems
▶ protein-protein interaction networks, gene regulation networks, metabolic pathways, etc.
Representing networks

▶ Network ≡ Graph
▶ Networks are just collections of "points" joined by "lines"

    points     lines             field
    vertices   edges, arcs       math
    nodes      links             computer science
    sites      bonds             physics
    actors     ties, relations   sociology
Types of networks
From [Newman, 2003]

(a) unweighted, undirected
(b) discrete vertex and edge types, undirected
(c) varying vertex and edge weights, undirected
(d) directed
Small-world phenomenon

▶ A friend of a friend is also frequently a friend
▶ Only 6 hops separate any two people in the world
Measuring the small-world phenomenon, I

▶ Let d_ij be the shortest-path distance between nodes i and j
▶ To check whether "any two nodes are within 6 hops", we use:
  ▶ the diameter (longest shortest-path distance):

        d = \max_{i,j} d_{ij}

  ▶ the average shortest-path length:

        \ell = \frac{2}{n(n+1)} \sum_{i>j} d_{ij}

  ▶ the harmonic mean shortest-path length \ell_h, given by:

        \ell_h^{-1} = \frac{2}{n(n+1)} \sum_{i>j} d_{ij}^{-1}
From [Newman, 2003]

But...

▶ Can we mimic this phenomenon in simulated networks ("models")?
▶ The answer is YES!
The (basic) random graph model
a.k.a. the ER model

The basic G_{n,p} Erdős-Rényi random graph model:

▶ parameter n is the number of vertices
▶ parameter p is such that 0 ≤ p ≤ 1
▶ each edge (i, j) is generated independently at random with probability p
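As an illustration beyond the slides, here is a minimal sketch of sampling from G_{n,p} with the Python networkx library (assumed available); the parameter values are arbitrary:

    # Sample a G(n,p) random graph: every edge appears independently with prob. p.
    import networkx as nx

    n, p = 1000, 0.01
    G = nx.gnp_random_graph(n, p, seed=42)

    print(G.number_of_nodes(), G.number_of_edges())
    # The expected number of edges is p * n * (n - 1) / 2, here about 4995.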
Measuring the diameter in ER networks

We want to show that the diameter of ER networks is small:

▶ let the average degree be z
▶ at distance ℓ, one can reach about z^ℓ nodes
▶ at distance log n / log z, one reaches all n nodes
▶ so the diameter is (roughly) O(log n); see the sketch below
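A quick numerical check of this argument (a sketch assuming networkx; for these values of n and p the graph is connected with high probability, which nx.diameter requires):

    import math
    import networkx as nx

    n, p = 1000, 0.01
    G = nx.gnp_random_graph(n, p, seed=42)

    z = 2 * G.number_of_edges() / n                  # average degree
    print("log n / log z:", math.log(n) / math.log(z))
    print("diameter:     ", nx.diameter(G))           # small, roughly O(log n)
    print("avg distance: ", nx.average_shortest_path_length(G))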
ER networks have small diameter
As shown by the following simulation (figure omitted):
Measuring the small-world phenomenon, II

▶ To check whether "the friend of a friend is also frequently a friend", we use:
  ▶ the transitivity or clustering coefficient, which measures the probability that two of my friends are also friends with each other
Global clustering coefficient

    C = \frac{3 \times \text{nr. of triangles}}{\text{nr. of connected triples}}

In the example network on the slide:

    C = \frac{3 \times 1}{8} = 0.375
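A sketch with networkx; the 5-node graph below is a plausible reconstruction of the slide's figure (it has 1 triangle and 8 connected triples), not taken from the source:

    import networkx as nx

    # Triangle 1-2-3 plus two pendant nodes attached to node 3.
    G = nx.Graph([(1, 2), (1, 3), (2, 3), (3, 4), (3, 5)])

    print(nx.transitivity(G))   # 3 * 1 / 8 = 0.375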
Local clustering coefficient

▶ For each vertex i, let n_i be the number of neighbors of i
▶ Let C_i be the fraction of pairs of i's neighbors that are connected to each other:

    C_i = \frac{\text{nr. of connections between } i\text{'s neighbors}}{\frac{1}{2} n_i (n_i - 1)}

▶ Finally, average C_i over all nodes i in the network:

    C = \frac{1}{n} \sum_i C_i
Local clustering coefficient example

▶ C_1 = C_2 = 1/1
▶ C_3 = 1/6
▶ C_4 = C_5 = 0
▶ C = \frac{1}{5} \left( 1 + 1 + \frac{1}{6} \right) = \frac{13}{30} ≈ 0.433
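The same reconstructed graph reproduces these numbers with networkx's per-node clustering (again a sketch, not the slide's actual figure):

    import networkx as nx

    G = nx.Graph([(1, 2), (1, 3), (2, 3), (3, 4), (3, 5)])

    print(nx.clustering(G))          # {1: 1.0, 2: 1.0, 3: 0.1667, 4: 0, 5: 0}
    print(nx.average_clustering(G))  # 13/30 = 0.4333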
From [Newman, 2003]

ER networks do not show transitivity

▶ C = p, since edges are added independently
▶ Given a graph with n nodes and e edges, we can "estimate" p as

    \hat{p} = \frac{e}{\frac{1}{2} n (n - 1)}

▶ We say that clustering is high if C \gg \hat{p}
▶ Hence, ER networks do not have a high clustering coefficient, since for them C ≈ \hat{p}
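A quick check (sketch, networkx assumed) that the measured clustering of an ER graph sits near \hat{p}:

    import networkx as nx

    n, p = 1000, 0.01
    G = nx.gnp_random_graph(n, p, seed=42)

    p_hat = G.number_of_edges() / (n * (n - 1) / 2)
    print("C:    ", nx.average_clustering(G))  # close to 0.01, i.e. not high
    print("p_hat:", p_hat)                     # also close to 0.01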
So ER networks do not have high clustering, but...

▶ Can we mimic this phenomenon in simulated networks ("models"), while keeping the diameter small?
▶ The answer is YES!
The Watts-Strogatz model, I
From [Watts and Strogatz, 1998]

Reconciling two observations from real networks:

▶ high clustering: my friends' friends are also my friends
▶ small diameter
The Watts-Strogatz model, II

▶ Start with all n vertices arranged on a ring
▶ Each vertex initially has 4 connections to its closest nodes
  ▶ mimics local or geographical connectivity
▶ With probability p, rewire each local connection to a random vertex
  ▶ p = 0: high clustering, high diameter
  ▶ p = 1: low clustering, low diameter (ER model)
▶ What happens in between? As we increase p from 0 to 1 (see the sketch below):
  ▶ fast decrease of mean distance
  ▶ slow decrease in clustering
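A sketch of this interpolation, assuming networkx (the connected variant of the generator is used so that path lengths are always defined):

    import networkx as nx

    n, k = 1000, 10   # ring of n nodes, each joined to its k nearest neighbors
    for p in [0.0, 0.01, 1.0]:
        G = nx.connected_watts_strogatz_graph(n, k, p, seed=42)
        print(p,
              nx.average_clustering(G),            # decreases slowly with p
              nx.average_shortest_path_length(G))  # decreases fast with p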
The Watts-Strogatz model, III

For an appropriate value of p ≈ 0.01 (1%), the model achieves both high clustering and small diameter.
Degree distribution

Histogram of the number of nodes having a particular degree:

    f_k = fraction of nodes of degree k
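A sketch (networkx assumed) of computing the empirical f_k of a graph:

    from collections import Counter
    import networkx as nx

    G = nx.gnp_random_graph(1000, 0.01, seed=42)

    n = G.number_of_nodes()
    counts = Counter(d for _, d in G.degree())
    f = {k: c / n for k, c in sorted(counts.items())}   # f_k per degree k
    print(f)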
Scale-free networks

The degree distribution of most real-world networks follows a power-law distribution:

    f_k = c k^{-\alpha}

▶ a "heavy-tail" distribution, which implies the existence of hubs
▶ hubs are nodes with very high degree
Random networks are not scale-free!

For random networks, the degree distribution follows the binomial distribution (or Poisson, if n is large):

    f_k = \binom{n}{k} p^k (1 - p)^{n-k} \approx \frac{z^k e^{-z}}{k!}

▶ where z = p(n − 1) is the mean degree
▶ the probability of nodes with very large degree becomes exponentially small
  ▶ so no hubs
So ER networks are not scale-free, but...

▶ Can we obtain scale-free simulated networks?
▶ The answer is YES!
Preferential attachment

▶ "Rich get richer" dynamics
▶ The more someone has, the more she is likely to have
▶ Examples:
  ▶ the more friends you have, the easier it is to make new ones
  ▶ the more business a firm has, the easier it is to win more
  ▶ the more people there are at a restaurant, the more who want to go
Barabási-Albert model
From [Barabási and Albert, 1999]

▶ "Growth" model
  ▶ the model controls how a network grows over time
▶ Uses preferential attachment as a guide to grow the network
  ▶ new nodes prefer to attach to well-connected nodes
▶ (Simplified) process:
  ▶ the process starts with some initial subgraph
  ▶ each new node comes in with m edges
  ▶ the probability of connecting to an existing node i is proportional to i's degree
  ▶ results in a power-law degree distribution with exponent α = 3
ER vs. BA

Experiment with 1000 nodes and 999 edges (m0 = 1 in the BA model), comparing a random network against a preferential attachment network (figures omitted); a sketch of the comparison follows.
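A sketch of this experiment with networkx (seeds and exact edge counts are illustrative):

    import networkx as nx

    er = nx.gnp_random_graph(1000, 2 * 999 / (1000 * 999), seed=42)  # ~999 edges
    ba = nx.barabasi_albert_graph(1000, 1, seed=42)                  # 999 edges

    print("max degree ER:", max(d for _, d in er.degree()))  # small: no hubs
    print("max degree BA:", max(d for _, d in ba.degree()))  # much larger: hubs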
In summary...

    phenomenon        real networks   ER    WS    BA
    small diameter    yes             yes   yes   yes
    high clustering   yes             no    yes   yes¹
    scale-free        yes             no    no    yes

¹ the clustering coefficient is higher than in random networks, but not as high as, for example, in WS networks
Network Analysis, Part II
Today's contents

1. Centrality
   ▶ Degree centrality
   ▶ Closeness centrality
   ▶ Betweenness centrality
2. Community finding algorithms
   ▶ Hierarchical clustering
     ▶ Agglomerative
     ▶ Girvan-Newman
   ▶ Modularity maximization: Louvain method
Centrality in Networks

Centrality measures the importance of a node relative to the others:

▶ a central node is important and/or powerful
▶ a central node has an influential position in the network
▶ a central node has an advantageous position in the network
Degree centrality
Power through connections

    degree_centrality(i) := k(i)

For directed networks there are two variants:

    in_degree_centrality(i) := k_in(i)

    out_degree_centrality(i) := k_out(i)

By the way, there is a normalized version which divides the centrality of each node by the maximum centrality value possible, i.e. n − 1 (so all values are between 0 and 1).

But look at these examples: does degree centrality look OK to you?
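A sketch of the three variants with networkx (the graphs are arbitrary examples):

    import networkx as nx

    G = nx.karate_club_graph()               # a classic small social network
    print(nx.degree_centrality(G))           # normalized: k(i) / (n - 1)

    D = nx.DiGraph([(1, 2), (2, 3), (3, 1), (1, 3)])
    print(nx.in_degree_centrality(D))        # k_in(i) / (n - 1)
    print(nx.out_degree_centrality(D))       # k_out(i) / (n - 1)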
Closeness centrality
Power through proximity to others

    closeness_centrality(i) := \left( \frac{\sum_{j \neq i} d(i, j)}{n - 1} \right)^{-1} = \frac{n - 1}{\sum_{j \neq i} d(i, j)}

Here, what matters is to be close to everybody else, i.e., to be easily reachable or to have the power to quickly reach others.
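A sketch with networkx; on a path, the middle node is closest to everybody else:

    import networkx as nx

    G = nx.path_graph(5)               # 0 - 1 - 2 - 3 - 4
    print(nx.closeness_centrality(G))  # node 2 scores highest: 4/6 ≈ 0.667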
Betweenness centrality
Power through brokerage

A node is important if it lies on many shortest paths:

▶ it is then essential in passing information through the network
Betweenness centrality
Power through brokerage

    betweenness_centrality(i) := \sum_{j<k} \frac{g_{jk}(i)}{g_{jk}}

where
▶ g_jk is the number of shortest paths between j and k, and
▶ g_jk(i) is the number of those shortest paths that pass through i

Oftentimes it is normalized:

    norm_betweenness_centrality(i) := \frac{betweenness\_centrality(i)}{\binom{n-1}{2}}
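A sketch with networkx on an illustrative "broker" topology: two triangles joined through a single middle node, which consequently lies on every cross-side shortest path:

    import networkx as nx

    G = nx.Graph([(1, 2), (1, 3), (2, 3),      # left triangle
                  (4, 5), (4, 6), (5, 6),      # right triangle
                  (3, 0), (0, 4)])             # node 0 bridges the two sides

    print(nx.betweenness_centrality(G, normalized=True))  # node 0 scores highest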
Betweenness centrality
Examples (non-normalized; figure omitted)
What is community structure?

Why is community structure important?

... but don't trust visual perception
It is best to use objective algorithms.
Main idea
A community is dense on the inside but sparse towards the outside

There is no universal definition! But some guiding ideas are:

▶ a community should be densely connected
▶ a community should be well-separated from the rest of the network
▶ members of a community should be more similar among themselves than to the rest

Most common criterion:

    nr. of intra-cluster edges > nr. of inter-cluster edges
Some definitions
Let G = (V, E) be a network with |V| = n nodes and |E| = m edges. Let C be a subset of nodes in the network (a "cluster" or "community") of size |C| = n_c. Then:

▶ intra-cluster density:

    \delta_{int}(C) = \frac{\text{nr. internal edges of } C}{n_c (n_c - 1)/2}

▶ inter-cluster density:

    \delta_{ext}(C) = \frac{\text{nr. inter-cluster edges of } C}{n_c (n - n_c)}

A community should have \delta_{int}(C) > \delta(G), where \delta(G) is the average edge density of the whole graph G, i.e.

    \delta(G) = \frac{\text{nr. edges in } G}{n(n-1)/2}
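A sketch (networkx assumed) of both densities for a candidate community C; the graph and node set are arbitrary examples:

    import networkx as nx

    def intra_density(G, C):
        nc = len(C)
        mc = G.subgraph(C).number_of_edges()   # internal edges of C
        return mc / (nc * (nc - 1) / 2)

    def inter_density(G, C):
        nc = len(C)
        fc = nx.cut_size(G, C)                 # edges crossing C's frontier
        return fc / (nc * (len(G) - nc))

    G = nx.karate_club_graph()
    C = {0, 1, 2, 3, 7, 13}
    print(intra_density(G, C), inter_density(G, C))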
Most algorithms search for tradeoffs between large \delta_{int}(C) and small \delta_{ext}(C)

▶ e.g. optimizing \sum_C (\delta_{int}(C) - \delta_{ext}(C)) over all communities C

Define further:
▶ m_c = nr. of edges within cluster C = |{(u, v) | u, v ∈ C}|
▶ f_c = nr. of edges in the frontier of C = |{(u, v) | u ∈ C, v ∉ C}|

In the example network on the slide:
▶ n_c1 = 4, m_c1 = 5, f_c1 = 2
▶ n_c2 = 3, m_c2 = 3, f_c2 = 2
▶ n_c3 = 5, m_c3 = 8, f_c3 = 2
Community quality criteria

▶ conductance: fraction of edges leaving the cluster, \frac{f_c}{2 m_c + f_c}
▶ expansion: nr. of edges per node leaving the cluster, \frac{f_c}{n_c}
▶ internal density: a.k.a. "intra-cluster density", \frac{m_c}{n_c (n_c - 1)/2}
▶ cut ratio: a.k.a. "inter-cluster density", \frac{f_c}{n_c (n - n_c)}
▶ modularity: difference between the nr. of edges in C and the expected nr. of edges E[m_c] of a random graph with the same degree distribution,

    \frac{1}{4m} (m_c - E[m_c])
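A sketch (networkx assumed) of the first few criteria for a candidate cluster, reusing m_c and f_c as defined above:

    import networkx as nx

    G = nx.karate_club_graph()
    C = {0, 1, 2, 3, 7, 13}

    mc = G.subgraph(C).number_of_edges()   # edges within C
    fc = nx.cut_size(G, C)                 # edges leaving C
    nc = len(C)

    print("conductance:", fc / (2 * mc + fc))  # nx.conductance(G, C) agrees
                                               # when C is the smaller side
    print("expansion:  ", fc / nc)
    print("cut ratio:  ", fc / (nc * (len(G) - nc)))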
Methods we will cover

▶ Hierarchical clustering
  ▶ agglomerative
  ▶ divisive (the Girvan-Newman algorithm)
▶ Modularity maximization algorithms
  ▶ the Louvain method
Hierarchical clustering
From hairball to dendrogram
Suitable if the input network has hierarchical structure
Agglomerative hierarchical clustering [Newman, 2010]

Ingredients
▶ a similarity measure between nodes
▶ a similarity measure between sets of nodes

Pseudocode
1. Assign each node to its own cluster
2. Find the cluster pair with highest similarity and join them together into a single cluster
3. Compute the new similarities between the joined cluster and the others
4. Go to step 2 until all nodes form a single cluster

A code sketch follows.
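A compact sketch of this procedure, assuming networkx, numpy and scipy are available; it clusters a network's nodes using the Jaccard similarity defined later in these slides (drawing the dendrogram additionally requires matplotlib):

    import networkx as nx
    import numpy as np
    from scipy.cluster.hierarchy import linkage, dendrogram

    G = nx.karate_club_graph()
    nodes = list(G)

    def jaccard(u, v):
        # similarity of two nodes = overlap of their neighborhoods
        Nu, Nv = set(G[u]), set(G[v])
        return len(Nu & Nv) / len(Nu | Nv)

    # Condensed distance matrix: one entry (1 - similarity) per node pair u < v.
    dist = np.array([1 - jaccard(u, v)
                     for a, u in enumerate(nodes) for v in nodes[a + 1:]])

    Z = linkage(dist, method="average")   # agglomerate with average linkage
    dendrogram(Z, labels=nodes)           # the output is a dendrogram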
Example

[Figure sequence omitted: a 2-D scatter plot of data points (axes V1, V2) followed by snapshots of iterations 001-024 of agglomerative clustering merging the points step by step. From D. Blei's "Clustering 02" slides.]
Similarity measures w_ij for nodes, I
Let A be the adjacency matrix of the network, i.e. A_ij = 1 if (i, j) ∈ E and 0 otherwise.

▶ Jaccard index:

    w_{ij} = \frac{|\Gamma(i) \cap \Gamma(j)|}{|\Gamma(i) \cup \Gamma(j)|}

  where \Gamma(i) is the set of neighbors of node i

▶ Cosine similarity:²

    w_{ij} = \frac{\sum_k A_{ik} A_{kj}}{\sqrt{\sum_k A_{ik}^2} \sqrt{\sum_k A_{jk}^2}} = \frac{n_{ij}}{\sqrt{k_i k_j}}

  where:
  ▶ n_{ij} = |\Gamma(i) \cap \Gamma(j)| = \sum_k A_{ik} A_{kj}, and
  ▶ k_i = \sum_k A_{ik} is the degree of node i

² From the identity x \cdot y = |x| |y| \cos\theta
Similarity measures w_ij for nodes, II

▶ Euclidean distance (or rather Hamming distance, since A is binary):

    d_{ij} = \sum_k (A_{ik} - A_{jk})^2

▶ Normalized Euclidean distance:³

    d_{ij} = \frac{\sum_k (A_{ik} - A_{jk})^2}{k_i + k_j} = 1 - 2 \frac{n_{ij}}{k_i + k_j}

▶ Pearson correlation coefficient:

    r_{ij} = \frac{cov(A_i, A_j)}{\sigma_i \sigma_j} = \frac{\sum_k (A_{ik} - \mu_i)(A_{jk} - \mu_j)}{n \sigma_i \sigma_j}

  where \mu_i = \frac{1}{n} \sum_k A_{ik} and \sigma_i = \sqrt{\frac{1}{n} \sum_k (A_{ik} - \mu_i)^2}

³ Uses the idea that the maximum value of d_{ij} is attained when there are no common neighbors, and then d_{ij} = k_i + k_j
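A matrix-based sketch of the cosine similarity (numpy and networkx assumed; weight=None keeps A binary):

    import networkx as nx
    import numpy as np

    G = nx.karate_club_graph()
    A = nx.to_numpy_array(G, weight=None)   # binary adjacency matrix

    k = A.sum(axis=1)       # degrees k_i
    n_common = A @ A        # n_ij = number of common neighbors of i and j

    i, j = 0, 1
    print(n_common[i, j] / np.sqrt(k[i] * k[j]))   # cosine similarity w_ij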
Similarity measures for sets of nodes

▶ Single linkage: s_{XY} = \max_{x \in X, y \in Y} s_{xy}
▶ Complete linkage: s_{XY} = \min_{x \in X, y \in Y} s_{xy}
▶ Average linkage: s_{XY} = \frac{\sum_{x \in X, y \in Y} s_{xy}}{|X| \times |Y|}
Agglomerative hierarchical clustering on Zachary's network
Using average linkage
The Girvan-Newman algorithm
A divisive hierarchical algorithm [Girvan and Newman, 2002]

Edge betweenness
The betweenness of an edge is the nr. of shortest paths in the network that pass through that edge.

It uses the idea that "bridges" between communities must have high edge betweenness.
The Girvan-Newman algorithm

Pseudocode
1. Compute the betweenness of all edges in the network
2. Remove the edge with highest betweenness
3. Go to step 1 until no edges are left

The result is a dendrogram; a code sketch follows.
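A sketch using the girvan_newman generator in networkx, which yields one partition per level of the dendrogram:

    import itertools
    import networkx as nx
    from networkx.algorithms.community import girvan_newman

    G = nx.karate_club_graph()
    for partition in itertools.islice(girvan_newman(G), 3):  # first 3 splits
        print([sorted(c) for c in partition])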
Definition of modularity [Newman, 2010]
Using a null model

Random graphs are not expected to have community structure, so we will use them as null models:

    Q = (nr. of intra-cluster edges) − (expected nr. of intra-cluster edges)

In particular:

    Q = \frac{1}{2m} \sum_{ij} (A_{ij} - P_{ij}) \, \delta(C_i, C_j)

where P_ij is the expected number of edges between nodes i and j under the null model, C_i is the community of vertex i, and \delta(C_i, C_j) = 1 if C_i = C_j and 0 otherwise.
How do we compute P_ij?
Using the "configuration" null model

The "configuration" random graph model chooses, uniformly at random, a graph with the same degree distribution as the original graph.

▶ Let us compute P_ij
▶ There are 2m stubs (half-edges) available in the configuration model
▶ Let p_i be the probability of picking at random a stub incident with i:

    p_i = \frac{k_i}{2m}

▶ The probability of connecting i to j is then p_i p_j = \frac{k_i k_j}{4m^2}
▶ And so P_{ij} = 2m \, p_i p_j = \frac{k_i k_j}{2m}
Properties of modularity

    Q = \frac{1}{2m} \sum_{ij} \left( A_{ij} - \frac{k_i k_j}{2m} \right) \delta(C_i, C_j)

▶ Q depends only on nodes in the same clusters
▶ Larger modularity means better communities (better than random intra-cluster density)
▶ Q \le \frac{1}{2m} \sum_{ij} A_{ij} \delta(C_i, C_j) \le \frac{1}{2m} \sum_{ij} A_{ij} \le 1
▶ Q may take negative values
  ▶ a partition with large negative Q implies the existence of clusters with small internal edge density and many inter-community edges
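A sketch (networkx assumed) comparing Q for a good and a bad partition of the same graph:

    import networkx as nx
    from networkx.algorithms.community import modularity

    G = nx.barbell_graph(5, 0)              # two 5-cliques joined by one edge

    good = [set(range(5)), set(range(5, 10))]
    print(modularity(G, good))              # clearly positive

    bad = [{0, 1, 5, 6}, set(G) - {0, 1, 5, 6}]
    print(modularity(G, bad))               # lower: mixes the two cliques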
The Louvain method [Blondel et al., 2008]
Considered state-of-the-art

Pseudocode
1. Repeat until a local optimum is reached:
   1.1 Phase 1: partition the network greedily using modularity
   1.2 Phase 2: agglomerate the found clusters into new nodes
The Louvain method
Phase 1: optimizing modularity

Pseudocode for phase 1

1. Assign a different community to each node
2. For each node i:
   ▶ for each neighbor j of i, consider removing i from its community and placing it in j's community
   ▶ greedily choose to place i into the community of the neighbor that yields the highest modularity gain
3. Repeat until no improvement can be made
The Louvain method
Phase 2: agglomerating clusters to form a new network

Pseudocode for phase 2

1. Let each community C_i form a new node i
2. Let the weight of the edge between new nodes i and j be the sum of the edges between the nodes of C_i and C_j in the previous graph (notice this creates self-loops)
The Louvain method
Observations

▶ The output is also a hierarchy
▶ It works for weighted graphs, for which modularity has to be generalized to

    Q_w = \frac{1}{2W} \sum_{ij} \left( W_{ij} - \frac{s_i s_j}{2W} \right) \delta(C_i, C_j)

  where W_{ij} is the weight of undirected edge (i, j), W = \frac{1}{2} \sum_{ij} W_{ij} is the total edge weight, and s_i = \sum_k W_{ik} is the strength of node i.
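A sketch using the Louvain implementation shipped with recent networkx versions (an assumption; the separate python-louvain package is a common alternative):

    import networkx as nx

    G = nx.karate_club_graph()              # its edges carry 'weight' attributes
    communities = nx.community.louvain_communities(G, weight="weight", seed=42)
    print([sorted(c) for c in communities])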
References I

Barabási, A.-L. and Albert, R. (1999).
Emergence of scaling in random networks.
Science, 286(5439):509–512.

Blondel, V. D., Guillaume, J.-L., Lambiotte, R., and Lefebvre, E. (2008).
Fast unfolding of community hierarchies in large networks.
Journal of Statistical Mechanics: Theory and Experiment, 2008(10):P10008.

Girvan, M. and Newman, M. E. J. (2002).
Community structure in social and biological networks.
Proceedings of the National Academy of Sciences of the United States of America, 99:7821–7826.

Newman, M. (2010).
Networks: An Introduction.
Oxford University Press, USA, 2010 edition.
References II

Newman, M. E. (2003).
The structure and function of complex networks.
SIAM Review, 45(2):167–256.

Watts, D. J. and Strogatz, S. H. (1998).
Collective dynamics of 'small-world' networks.
Nature, 393(6684):440–442.