
Joint Clustering and Ranking from Heterogeneous Pairwise Comparisons

Chen-Hao Hsiao and I-Hsiang Wang
Graduate Institute of Communication Engineering, National Taiwan University, Taipei, Taiwan
Email: {f07942062,ihwang}@ntu.edu.tw
ISIT 2021

Goal: Joint clustering and ranking

[Figure: items grouped into clusters labeled POP, Rock, and EDM]

Goal:
1. Recover the underlying clusters
2. Rank the items in each cluster
Given only pairwise preferences!

Intuition: Utilize the heterogeneity

[Figure: POP, Rock, and EDM clusters, with intra-cluster comparisons and inter-cluster comparisons marked]

Utilize the heterogeneity!

Fundamental Questions
To achieve reliable joint clustering and ranking:
• How much heterogeneity do we need?
• How many comparisons do we need?
• How can the heterogeneity be utilized?

Main Contribution

1. Characterize the fundamental limit for exact clustering and ranking:

   Level of heterogeneity × Number of comparisons = $\Theta\left(\frac{\log n}{n}\right)$

2. Develop a polynomial-time three-step algorithm

Proposed Model: Preference Score

Items (two clusters):
$n_1 = \beta n$, $n_2 = n - \beta n$, $\beta \in (0, 0.5]$

Preference score:
$w_1, w_2, \ldots, w_n$, with $w_i \in [w_{\min}, w_{\max}]$, $\forall i \in [n]$

Proposed Model: Number of Comparisons

Number of comparisons between any item pair $(i, j)$: items $w_i$ and $w_j$ are compared $\mathrm{Bin}(L, p)$ times.

N. B. Shah and M. J. Wainwright, “Simple, robust and optimal ranking from pairwise comparisons,” The Journal of Machine Learning Research, vol. 18, no. 1, pp. 7246–7283, 2017.

Proposed Model: Heterogeneity

Outcome of comparisons (heterogeneity!):

Same cluster: Bradley-Terry-Luce (BTL) model. Item $i$ beats item $j$ with probability
$r_{ij} = \frac{w_i}{w_i + w_j}$, and item $j$ beats item $i$ with probability $r_{ji} = 1 - r_{ij}$.

Different cluster: purely random. Each of the two items wins with probability $0.5$.

R. A. Bradley and M. E. Terry, “Rank analysis of incomplete block designs: I. the method of paired comparisons,” Biometrika, vol. 39, no. 3/4, pp. 324–345, 1952.
R. D. Luce, Individual choice behavior: A theoretical analysis. Courier Corporation, 2012.

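The comparison model above can be simulated in a few lines. The following is a minimal sketch using only the Python standard library; the function name `simulate_comparisons`, the example scores, and the values of `L` and `p` are illustrative assumptions, not taken from the talk.

```python
import random

def simulate_comparisons(w, sigma, L, p, seed=0):
    """Return wins[i][j] = number of comparisons in which item i beat item j."""
    rng = random.Random(seed)
    n = len(w)
    wins = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # The pair (i, j) is compared Bin(L, p) times.
            m = sum(rng.random() < p for _ in range(L))
            # BTL probability inside a cluster, fair coin across clusters.
            r_ij = w[i] / (w[i] + w[j]) if sigma[i] == sigma[j] else 0.5
            i_wins = sum(rng.random() < r_ij for _ in range(m))
            wins[i][j] = i_wins
            wins[j][i] = m - i_wins
    return wins

# Hypothetical example: six items, two clusters of three.
w = [4.0, 2.0, 1.0, 3.0, 1.5, 1.0]
sigma = [1, 1, 1, 2, 2, 2]
wins = simulate_comparisons(w, sigma, L=50, p=0.8)
```

Each entry `wins[i][j]` is then distributed as $\mathrm{Bin}(L, p\,r_{ij})$ for intra-cluster pairs and $\mathrm{Bin}(L, p/2)$ for inter-cluster pairs.
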
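Proposed Model: Quantifying the Heterogeneity

Scores within a cluster: $w_1, w_2, \ldots, w_{n_1}$

$\Delta = \min_{\sigma(i) = \sigma(j)} \left| r_{ij} - 0.5 \right| = \min_{\sigma(i) = \sigma(j)} \frac{|w_i - w_j|}{2(w_i + w_j)}$
(minimum over intra-cluster pairs $(i, j)$; the second form is a gap of preference scores)

• $\Delta$: heterogeneity of comparisons. $\Delta \uparrow \Rightarrow$ clustering becomes easier.
• $\Delta$: the smallest intra-cluster gap. $\Delta \uparrow \Rightarrow$ ranking becomes easier.
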
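As a small worked check, both forms of $\Delta$ can be computed and compared. This is a sketch under the slide's definition; the helper names and the example scores are assumptions made for illustration.

```python
from itertools import combinations

def heterogeneity(w, sigma):
    """Delta as the minimum of |r_ij - 0.5| over intra-cluster pairs."""
    return min(abs(w[i] / (w[i] + w[j]) - 0.5)
               for i, j in combinations(range(len(w)), 2)
               if sigma[i] == sigma[j])

def heterogeneity_from_scores(w, sigma):
    """The same quantity written as |w_i - w_j| / (2 (w_i + w_j))."""
    return min(abs(w[i] - w[j]) / (2 * (w[i] + w[j]))
               for i, j in combinations(range(len(w)), 2)
               if sigma[i] == sigma[j])

# Hypothetical example: the closest intra-cluster pair is (1.5, 1.0).
w = [4.0, 2.0, 1.0, 3.0, 1.5, 1.0]
sigma = [1, 1, 1, 2, 2, 2]
delta = heterogeneity(w, sigma)  # 0.5 / (2 * 2.5) = 0.1, up to rounding
```
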
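Ranking Error and Clustering Error
• Ground truth: $\sigma: [n] \to \{1, 2\}$, $\boldsymbol{w} \in \mathbb{R}^n$
• Output: $\hat{\sigma}: [n] \to \{1, 2\}$, $\hat{\boldsymbol{w}} \in \mathbb{R}^n$
• Ranking error (inconsistent ordering of an intra-cluster pair):
  $\mathcal{E}_r(\boldsymbol{w}, \hat{\boldsymbol{w}}) := \{\exists (i, j) \text{ with } \sigma(i) = \sigma(j) \text{ s.t. } (w_i - w_j)(\hat{w}_i - \hat{w}_j) < 0\}$
• Clustering error (wrong cluster label):
  $\mathcal{E}_c(\sigma, \hat{\sigma}) := \{\exists i \text{ s.t. } \hat{\sigma}(i) \neq \sigma(i)\}$
• Goal: characterize, in terms of $p, L, \Delta$,
  $\inf_{\hat{\sigma}, \hat{\boldsymbol{w}}} \sup \mathbb{P}\{\mathcal{E}_c(\sigma, \hat{\sigma}) \cup \mathcal{E}_r(\boldsymbol{w}, \hat{\boldsymbol{w}})\}$
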
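The two error events translate directly into predicates. The sketch below follows the definitions as written on the slide; in particular it compares cluster labels directly and assumes the estimated labels are already aligned with the ground-truth labels. Function names and example values are hypothetical.

```python
from itertools import combinations

def ranking_error(w, w_hat, sigma):
    """True if some intra-cluster pair is ordered inconsistently by w_hat."""
    return any((w[i] - w[j]) * (w_hat[i] - w_hat[j]) < 0
               for i, j in combinations(range(len(w)), 2)
               if sigma[i] == sigma[j])

def clustering_error(sigma, sigma_hat):
    """True if at least one item receives a wrong cluster label."""
    return any(s != s_hat for s, s_hat in zip(sigma, sigma_hat))

w = [4.0, 2.0, 1.0, 3.0, 1.5, 1.0]
sigma = [1, 1, 1, 2, 2, 2]
good_w_hat = [10.0, 5.0, 1.0, 0.3, 0.2, 0.1]  # same order inside each cluster
bad_w_hat = [5.0, 10.0, 1.0, 0.3, 0.2, 0.1]   # items 0 and 1 swapped
```

Note that `good_w_hat` scores every item of the second cluster below every item of the first; this does not count as a ranking error, because only intra-cluster pairs are checked.
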
The Fundamental Limit is Determined by…
• The parameters of the binomial: 𝑝, 𝐿
• 𝑝, 𝐿 ↑ ⇒ More comparisons!

• The level of heterogeneity: Δ
• Δ ↑ ⇒ Easier to distinguish items within a single cluster!
⇒ Also easier to distinguish intra-cluster comparisons from inter-cluster comparisons!

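Main Result

$\underbrace{pL}_{\text{expected number of comparisons}} \times \underbrace{\Delta^2}_{\text{level of heterogeneity}} = \Theta\left(\frac{\log n}{n}\right)$

On the $pL\Delta^2$ axis:
• Impossible (by Fano’s inequality): $pL\Delta^2$ below $C_1 \frac{\log n}{n}$
• Achievable (by the proposed algorithm): $pL\Delta^2$ above $C_2 \frac{\log n}{n}$

$C_1, C_2$: constants that depend only on $w_{\min}$, $w_{\max}$, and $\beta$
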
Proposed Three-step Algorithm

1. Borda Counting (ranking): ensures the correct ordering in each cluster with high probability
2. Elimination (transformation): transforms the graph into a block model
3. SDP (clustering): solves the clustering task

Borda Counting: Ensure the Correct Intra-Cluster Ordering

Claim: For items in the same cluster, Borda Counting ensures the correct ranking order with high probability.

[Figure: example comparison graph over items $i$, $j$, $k$, … with numeric edge labels]

J. d. Borda, “Mémoire sur les élections au scrutin,” Histoire de l’Académie Royale des Sciences pour 1781 (Paris, 1784), 1784.

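A minimal sketch of Borda counting on a win-count matrix follows. It assumes the Borda score of an item is its total number of wins over all opponents, which is one common convention and may differ in normalization from the paper's. The win matrix is a hypothetical example, and the ground-truth labels are used only to display the per-cluster order.

```python
def borda_scores(wins):
    """Borda score of item i = total number of comparisons that i won."""
    return [sum(row) for row in wins]

def rank_within_clusters(scores, sigma):
    """For each cluster label, list its items from highest to lowest score."""
    ranking = {}
    for label in sorted(set(sigma)):
        members = [i for i, s in enumerate(sigma) if s == label]
        ranking[label] = sorted(members, key=lambda i: scores[i], reverse=True)
    return ranking

# wins[i][j] = number of times item i beat item j (hypothetical counts).
wins = [
    [0, 7, 9, 5],
    [3, 0, 6, 4],
    [1, 4, 0, 5],
    [5, 6, 5, 0],
]
sigma = [1, 1, 1, 2]
scores = borda_scores(wins)                     # [21, 13, 10, 16]
ranking = rank_within_clusters(scores, sigma)   # {1: [0, 1, 2], 2: [3]}
```

Inter-cluster comparisons add the same expected amount to the score of every item in a cluster, so in expectation they do not change the intra-cluster order.
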
Elimination: Transform to an SBM-like Graph

[Figure: the weighted comparison graph before and after the elimination step; some edges are removed]

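Brief Introduction to SBM

Items $i$, $j$ in the same cluster: edge $\sim \mathrm{Ber}(\alpha)$. Items $i$, $j$ in different clusters: edge $\sim \mathrm{Ber}(\beta)$. Here $\alpha > \beta$.

Expected value of the adjacency matrix of the observation graph (block form):
$\begin{bmatrix} \alpha & \beta \\ \beta & \alpha \end{bmatrix}$

E. Abbe, “Community detection and stochastic block models: recent developments,” The Journal of Machine Learning Research, vol. 18, no. 1, pp. 6446–6531, 2017.
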
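For readers new to the SBM, the following sketch samples a two-cluster observation graph and builds the expected adjacency matrix entry by entry. The function names and parameter values are illustrative assumptions.

```python
import random

def sample_sbm(sigma, alpha, beta, seed=0):
    """Sample a symmetric 0/1 adjacency matrix: Ber(alpha) inside, Ber(beta) across."""
    rng = random.Random(seed)
    n = len(sigma)
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            prob = alpha if sigma[i] == sigma[j] else beta
            A[i][j] = A[j][i] = int(rng.random() < prob)
    return A

def expected_adjacency(sigma, alpha, beta):
    """Entry-wise expectation of the adjacency matrix (no self-loops)."""
    n = len(sigma)
    return [[0.0 if i == j else (alpha if sigma[i] == sigma[j] else beta)
             for j in range(n)] for i in range(n)]

sigma = [1, 1, 2, 2]
A = sample_sbm(sigma, alpha=0.8, beta=0.2)
EA = expected_adjacency(sigma, alpha=0.8, beta=0.2)
```
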
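Before Elimination: Directed Weighted Graph

Same cluster: the two directed edges between $i$ and $j$ have weights $\mathrm{Bin}(L, p r_{ij})$ and $\mathrm{Bin}(L, p r_{ji})$.
Different clusters: both directed edges have weight $\mathrm{Bin}(L, \frac{p}{2})$.

Expected value of the weight matrix (block form):
$\begin{bmatrix} pL r_{ij} & \frac{pL}{2} \\ \frac{pL}{2} & pL r_{ij} \end{bmatrix}$
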
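After Elimination: Similar to SBM

Same cluster: edge weight $\mathrm{Bin}(L, p \max\{r_{ij}, r_{ji}\})$, where $p \max\{r_{ij}, r_{ji}\} > \frac{p}{2}\left(1 + \frac{\Delta}{2}\right)$.
Different clusters: edge weight $\mathrm{Bin}(L, \frac{p}{2})$.

Expected value of the weight matrix (block form):
$\begin{bmatrix} > \frac{pL}{2} & \frac{pL}{2} \\ \frac{pL}{2} & > \frac{pL}{2} \end{bmatrix}$
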
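The slides do not spell out the elimination rule. One reading consistent with the $\mathrm{Bin}(L, p \max\{r_{ij}, r_{ji}\})$ weights above is: for every pair, keep only the wins of the item with the higher Borda score and drop the opposite direction. The sketch below implements that assumed rule; it is not confirmed as the authors' exact procedure, and the input counts are hypothetical.

```python
def eliminate(wins, scores):
    """Build a symmetric weight matrix from directed win counts.

    Assumed rule: for each pair, keep the wins of the higher-scored item
    over the lower-scored item and discard the other direction.
    """
    n = len(wins)
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            kept = wins[i][j] if scores[i] >= scores[j] else wins[j][i]
            A[i][j] = A[j][i] = kept
    return A

# wins[i][j] = number of times item i beat item j (hypothetical counts).
wins = [
    [0, 7, 9, 5],
    [3, 0, 6, 4],
    [1, 4, 0, 5],
    [5, 6, 5, 0],
]
scores = [sum(row) for row in wins]  # total wins per item: [21, 13, 10, 16]
A = eliminate(wins, scores)
```

If the Borda order is correct inside each cluster, the kept direction for an intra-cluster pair is the one with win probability $\max\{r_{ij}, r_{ji}\}$, while for an inter-cluster pair either direction has probability $0.5$.
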
Existing algorithms in SBM literature
• Spectral Methods
[Rohe-Chatterjee-Yu ’11][Vu ’14]
[Yun-Proutiere ’14][Lei-Rinaldo ’15]
[Joseph-Yu ’16][Gao-Ma-Zhang-Zhou ’17]
[Abbe-Fan-Wang-Zhong ’20]
• SDP
[Amini-Levina ’14][Abbe-Bandeira-Hall ’16]
[Montanari-Sen ’16][Guédon-Vershynin ’16]
[Hajek-Wu-Xu ’16][Moitra-Perry-Wein ’16]

Target structure (expected weight matrix after elimination): diagonal blocks $> \frac{pL}{2}$, off-diagonal blocks $\frac{pL}{2}$.

SDP is a Convex Relaxation of MLE in SBM

For $\mathrm{SBM}_2(n, \alpha, \beta)$:

MLE: find $\sigma$ that
maximizes $\langle A - \lambda J, \sigma\sigma^T \rangle$
subject to $\sigma_i \in \{\pm 1\}$, $\forall i \in [n]$

SDP: find $X$ that
maximizes $\langle A - \lambda J, X \rangle$
subject to $X$ is PSD, $X_{ii} = 1$, $\forall i \in [n]$

$\lambda = \frac{\log\left(\frac{1 - \beta}{1 - \alpha}\right)}{\log\left(\frac{\alpha(1 - \beta)}{\beta(1 - \alpha)}\right)}$

B. Hajek, Y. Wu, and J. Xu, “Achieving exact cluster recovery threshold via semidefinite programming: Extensions,” IEEE Transactions on Information Theory, vol. 62, no. 10, pp. 5918–5937, 2016.

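SDP in our problem

maximize $\langle A - \lambda(J - I), X \rangle$
subject to $X$ is PSD, $X_{ii} = 1$, $\forall i \in [n]$

where $A$ is the matrix after the elimination step and
$\lambda = \frac{pL\Delta}{4 \log\left(1 + \frac{\Delta}{2}\right)}$

$X^* = \begin{bmatrix} \mathbf{1} & -\mathbf{1} \\ -\mathbf{1} & \mathbf{1} \end{bmatrix} = \begin{bmatrix} 1 \\ -1 \end{bmatrix} \begin{bmatrix} 1 & -1 \end{bmatrix}$ (block form)

Theorem: If $pL\Delta^2 \geq C' \frac{\log n}{n}$, then the output of the above program satisfies $X_{SDP} = X^*$ with high probability.

Solves Clustering (w.h.p.)!
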
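To see what the objective rewards, the sketch below computes $\lambda$ from the slide's formula and maximizes $\langle A - \lambda(J - I), \sigma\sigma^T \rangle$ by brute force over $\pm 1$ labelings. This is the unrelaxed combinatorial problem on a toy instance, not an SDP solver. The matrix `A` and the values of `p`, `L`, and `delta` are hypothetical.

```python
import math
from itertools import product

def sdp_lambda(p, L, delta):
    """lambda = p L Delta / (4 log(1 + Delta / 2)), as on the slide."""
    return p * L * delta / (4 * math.log(1 + delta / 2))

def objective(A, lam, s):
    """<A - lam (J - I), s s^T>, assuming A has a zero diagonal."""
    n = len(A)
    return sum((A[i][j] - lam) * s[i] * s[j]
               for i in range(n) for j in range(n) if i != j)

def best_labeling(A, lam):
    """Exhaustive search; s[0] is fixed to +1 since s and -s score the same."""
    n = len(A)
    candidates = ((1,) + rest for rest in product((1, -1), repeat=n - 1))
    return max(candidates, key=lambda s: objective(A, lam, s))

# Hypothetical post-elimination weights: about 60+ inside clusters, about 50 across.
A = [
    [0, 63, 61, 50, 49, 51],
    [63, 0, 64, 48, 50, 52],
    [61, 64, 0, 51, 50, 49],
    [50, 48, 51, 0, 62, 65],
    [49, 50, 50, 62, 0, 61],
    [51, 52, 49, 65, 61, 0],
]
lam = sdp_lambda(p=1.0, L=100, delta=0.4)  # roughly 54.8
labels = best_labeling(A, lam)
```

Here $\lambda$ falls between the inter-cluster level $\frac{pL}{2} = 50$ and the intra-cluster level $\frac{pL}{2}(1 + \frac{\Delta}{2}) = 60$, so pairs with weight above $\lambda$ are rewarded for sharing a label and pairs below it are rewarded for differing.
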
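Key Difference Between Our SDP and The Others

Classical SBM:
maximize $\langle A - \lambda J, X \rangle$
subject to $X$ is PSD, $X_{ii} = 1$, $\forall i \in [n]$
$\alpha = \frac{a \log n}{n}$, $\beta = \frac{b \log n}{n}$, $\sqrt{a} - \sqrt{b} > \sqrt{2}$
$\alpha - \beta = \Theta\left(\frac{\log n}{n}\right)$ → same order as $\alpha, \beta$

Proposed SDP in our setting:
maximize $\langle A - \lambda(J - I), X \rangle$
subject to $X$ is PSD, $X_{ii} = 1$, $\forall i \in [n]$
$\alpha = \frac{p}{2}\left(1 + \frac{\Delta}{2}\right)$, $\beta = \frac{p}{2}$
$\Delta = \min_{\sigma(i) = \sigma(j)} \frac{|w_i - w_j|}{2(w_i + w_j)} \Rightarrow \Delta = O\left(\frac{1}{n}\right)$
$\alpha - \beta = O(p\Delta) = O\left(\frac{p}{n}\right)$ → smaller than $\alpha, \beta$
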
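The step $\Delta = O(1/n)$ follows from a counting argument on the scores inside one cluster. A short derivation, assuming $w_{\min} > 0$, a cluster of size $n_1 \geq 2$ with $n_1 = \Theta(n)$, and $w_{\min}$, $w_{\max}$ treated as constants:

```latex
% Assumptions: w_min > 0, n_1 >= 2, n_1 = Theta(n), w_min and w_max constant.
\begin{align*}
\min_{i \neq j,\ \sigma(i) = \sigma(j)} |w_i - w_j|
  &\le \frac{w_{\max} - w_{\min}}{n_1 - 1}
  && \text{($n_1$ sorted scores in $[w_{\min}, w_{\max}]$ leave $n_1 - 1$ consecutive gaps)} \\
\Delta = \min_{\sigma(i) = \sigma(j)} \frac{|w_i - w_j|}{2(w_i + w_j)}
  &\le \frac{w_{\max} - w_{\min}}{4 w_{\min} (n_1 - 1)}
  && \text{(since $w_i + w_j \ge 2 w_{\min}$)} \\
  &= O\!\left(\frac{1}{n}\right), \\
\frac{p}{2}\left(1 + \frac{\Delta}{2}\right) - \frac{p}{2}
  &= \frac{p \Delta}{4} = O\!\left(\frac{p}{n}\right).
\end{align*}
```

The last line is the gap $\alpha - \beta$ of the proposed setting: it shrinks by a factor of order $n$ relative to $\alpha$ and $\beta$ themselves, unlike the classical regime where the gap has the same order as the edge probabilities.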
Conclusion
• Characterize the fundamental limit for exact joint clustering and ranking, on the $pL\Delta^2$ axis:
  Impossible (by Fano’s inequality): $pL\Delta^2$ below $C_1 \frac{\log n}{n}$
  Achievable (with a polynomial-time algorithm): $pL\Delta^2$ above $C_2 \frac{\log n}{n}$
• Proposed algorithm: Borda Counting (ranking) → Elimination (transformation) → SDP (clustering)

Final Remarks
• The information limit can be extended to multiple clusters (in Appendix C).
• The joint clustering and top-K identification problem can be considered.
  This work (exact joint clustering and ranking): $\Delta = O\left(\frac{1}{n}\right)$, $pL\Delta^2 \geq C \frac{\log n}{n} \Rightarrow pL \geq C n \log n$
  Joint clustering and top-K identification: $\Delta = O\left(\frac{1}{K}\right)$
• The definition of Δ is slightly different from the definition in the paper.

Thank you for listening!
