You are on page 1of 32

I NTRODUCTION Community detection

Local Community Computation

Rushed Kanawati
LIPN, CNRS UMR 7030; USPC
http://lipn.fr/∼kanawati
rushed.kanawati@lipn.univ-paris13.fr

1 / 32
I NTRODUCTION Community detection

P LAN

1 I NTRODUCTION

2 Community detection
Local community detection

2 / 32
I NTRODUCTION Community detection

C OMPLEX NETWORK
Definition
Graphs modeling (direct/indirect) interactions among actors.

Basic topological features


I Low Density
I Small Diameter
I Heterogeneous degree
distribution.
I High Clustering
coefficient

I Community structure

3 / 32
I NTRODUCTION Community detection

C OMMUNITY ?
Definitions
I A dense subgraph loosely coupled to other modules in the
network
I A community is a set of nodes seen as one by nodes outside the
community
I A subgraph where almost all nodes are linked to other nodes in
the community.
I ...

4 / 32
I NTRODUCTION Community detection

C OMMUNITY DETECTION PROBLEM

I Local community identification (ego-centred).

I Network partition computing

I Overlapping community detection

5 / 32
I NTRODUCTION Community detection

L OCAL COMMUNITY

6 / 32
I NTRODUCTION Community detection

L OCAL COMMUNITY
C: Core (noyau)
B: Border (frontière)
S: Shell
U: extérieur de la commu

7 / 32
I NTRODUCTION Community detection

L OCAL COMMUNITY

1 C ← {φ}, B ← {n0 } S ← Γ(n0 )


2 Q ← 0 /* a community quality function */
3 While Q can be enhanced Do
1 n ← argmaxn∈S Q
2 S ← S − {n}
3 D ← D + {n}
4 update B, S, C
4 Return D

8 / 32
I NTRODUCTION Community detection

Q UALITY FUNCTIONS : E XEMPLES I

Local modularity R [Cla05]


Bin
R= Bin +Bout

Local modularity M [LWP08]


Din
M= Dout

Local modularity L [CZG09]


P P
kΓ(i)∩Dk kΓ(i)∩Sk
Lin i∈D i∈B
L= Lex where : Lin = kDk , Lex = kBk

And many many others . . . [YL12]

9 / 32
I NTRODUCTION Community detection

L OCAL MODULARITY LIMITATIONS : A N EXEMPLE

 Blank nodes enhance Bin and Din without affecting Bout and Dout
 Bank nodes will be added if M or R modularities are used
 Low precision computed communities
 Proposed solution: Ensemble approaches
10 / 32
I NTRODUCTION Community detection

M ULTI - OBJECTIVE LOCAL COMMUNITY


IDENTIFICATION

Three main approaches


Combine then Rank
Ensemble ranking
Ensemble clustering

11 / 32
I NTRODUCTION Community detection

C OMBINE THEN R ANK

Principle
Let Qi (s) be the local modularity value induced by node s ∈ S

Qi (s)−min Qi (u)
u∈S
if max Qi (u) 6= min Qi (u)


] max Qi (u)−min Qi (u)
Qi (s) = u∈S u∈S u∈S u∈S
1

otherwise

k
1 P ]
Qcom (s) = k Qi (s)
i=1

12 / 32
I NTRODUCTION Community detection

E NSEMBLE RANKING APPROACHES

Principle
I Rank S in function of each local modularity Qi
I Select the winner after applying ensemble ranking approach
I What stopping criteria to apply ?

Stopping criteria
I Strict policy : All modularities should be enhanced
I Majority policy : Majority of local modularities are enhanced
I Least gain policy : At least one local modularity is enhanced.

13 / 32
I NTRODUCTION Community detection

E NSEMBLE RANKING

Problem
I Let S be a set of elements to rank by n rankers
I Let σi be the rank provided by ranker i
I Goal: Compute a consensus rank of S.

14 / 32
I NTRODUCTION Community detection

E NSEMBLE RANKING

Problem
I Let S be a set of elements to rank by n rankers
I Let σi be the rank provided by ranker i
I Goal: Compute a consensus rank of S.

Déjà Vu: Social choice algorithms, but . . .


I Small number of voters and big number of candidates
I Algorithmic efficiency is required
I Output could be a complete rank

15 / 32
I NTRODUCTION Community detection

E NSEMBLE RANKING : APPROACHES

Borda
I Borda’s score of i ∈ σk :
Bσk (i) = {count(j)|σk (j) < σk (i) ; j ∈ σk }.
I Rank elements in function of
B(i) = kt=1 wt × Bσt (i).
P
Jean-Charles de Borda [1733-1799]

16 / 32
I NTRODUCTION Community detection

E NSEMBLE RANKING : APPROACHES

Condorcet
I The winner is the candidate who defeats
every other candidate in pairwise
majority-rule election
I The winner may not exists
Marquis de Condorcet [1743-1794]

17 / 32
I NTRODUCTION Community detection

E NSEMBLE RANKING : APPROACHES

Condorcet
I The winner is the candidate who defeats
every other candidate in pairwise
majority-rule election
I The winner may not exists
Marquis de Condorcet [1743-1794]

Condorcet 6= Borda
I Votes : 6 × A  B  C, 4 × B  C  A
I Borda winner : B
I Condorcet winner : A

18 / 32
I NTRODUCTION Community detection

E NSEMBLE RANKING : APPROACHES

Extended Condorcet criterion


If for every a ∈ A and b ∈ B, majority prefers a to b the all elements in A
should be ranked before any element in B.

Arrow’s Theorem
No vote rule can have the following desired
proprieties :
I Every result must be achievable somehow.
I Monotonicity.
I Independence of irrelevant attributes.
Kenneth Arrow, 1921-
I Non-dictatorship.

19 / 32
I NTRODUCTION Community detection

E NSEMBLE RANKING : APPROACHES

Optimal Kemeny rank aggregation


I Let d() be distance over rankings σi (ex.
Kendall τ , Spearman’s footrule)
P
I Find π that minimise i d(π, σi )

I NP-Hard problem

I Approximation : Local Kemeny : two


adjacent candidats are in the good order.
John Kemeny 1926-1992
I Local Kemeny : Apply Bubble sort using the
majority preference partial order relationship
I Approximate Kemeny : Apply QuickSort

20 / 32
I NTRODUCTION Community detection

E NSEMBLE CLUSTERING APPROACHES

Principle
I Let CQ i
vq be the the local community of vq applying Qi .

I We have a natural partition : PQi = {CQ i Qi


vq , Cvq }
I Apply an ensemble clustering approach.

21 / 32
I NTRODUCTION Community detection

E NSEMBLE CLUSTERING : PRINCIPLE

from A. Topchy et. al. Clustering Ensembles: Models of Consensus and Weak Partitions. PAMI, 2005

22 / 32
I NTRODUCTION Community detection

E NSEMBLE CLUSTERING : APPROACHES

CSPA: Cluster-based Similarity Partitioning Algorithm


I Let K be the number of basic models, Ci (x) be the cluster in model
i to which x belongs.
K
P
δ(Ci (v),Ci (u))
i=1
I Define a similarity graph on objects : sim(v, u) = K

I Cluster the obtained graph :


Isolate connected components after pruning edges
Apply community detection approach

I Complexity : O(n2 kr) : n # objects, k # of clusters, r# of clustering


solutions

23 / 32
I NTRODUCTION Community detection

CSPA : E XEMPLE

from Seifi, M. Cœurs stables de communautés dans les graphes de terrain. Thèse de l’université Paris 6, 2012

24 / 32
I NTRODUCTION Community detection

E NSEMBLE CLUSTERING : APPROACHES

HGPA: HyperGraph-Partitioning Algorithm


I Construct a hypergraph where nodes are objects and hyperedges
are clusters.
I Partition the hypergraph by minimizing the number of cut
hyperedges
I Each component forms a meta cluster

I Complexity : O(nkr)

25 / 32
I NTRODUCTION Community detection

E NSEMBLE CLUSTERING : APPROACHES

MCLA: Meta-Clustering Algorithm


I Each cluster from a base model is an item
I Similarity is defined as the percentage of shared common objects
I Conduct meta-clustering on these clusters
I Assign an object to its most associated meta-cluster

I Complexity : O(nk2 r2 )

26 / 32
I NTRODUCTION Community detection

E XPERIMENTS

Protocol ([Bag08])
1 Apply the different algorithms on nodes in networks for which a
ground-truth community partition is known.

2 For each node compute the distance between the real-partition


and the computed one (Ex. NMI [Mei03])

3 Compute average and standard deviation for the network.

27 / 32
I NTRODUCTION Community detection

D ATASETS

Zachary’s Karate Club Football network

US Political books network Dolphins social network

28 / 32
I NTRODUCTION Community detection

R ESULTS : EVALUATING STOPPING CRITERIA (NMI)

Zachary’s Karate Club Football network

US Political books network Dolphins social network

29 / 32
I NTRODUCTION Community detection

R ESULTS : C OMPARATIVE RESULTS (NMI)

Zachary’s Karate Club Football network

US Political books network Dolphins social network

30 / 32
I NTRODUCTION Community detection

B IBLIOGRAPHY I

J. P. Bagrow, Evaluating local community methods in networks, J. Stat. Mech. 2008 (2008), no. 5, P05001.

Aaron Clauset, Finding local community structure in networks, Physical Review E (2005).

Jiyang Chen, Osmar R. Zaı̈ane, and Randy Goebel, Local community identification in social networks, ASONAM, 2009,
pp. 237–242.

Feng Luo, James Zijun Wang, and Eric Promislow, Exploring local community structures in large networks, Web Intelligence
and Agent Systems 6 (2008), no. 4, 387–400.

Marina Meila, Comparing clusterings by the variation of information, COLT (Bernhard Schölkopf and Manfred K. Warmuth,
eds.), Lecture Notes in Computer Science, vol. 2777, Springer, 2003, pp. 173–187.

31 / 32
I NTRODUCTION Community detection

B IBLIOGRAPHY II

Jaewon Yang and Jure Leskovec, Defining and evaluating network communities based on ground-truth, ICDM
(Mohammed Javeed Zaki, Arno Siebes, Jeffrey Xu Yu, Bart Goethals, Geoffrey I. Webb, and Xindong Wu, eds.), IEEE
Computer Society, 2012, pp. 745–754.

32 / 32

You might also like