
CONNECTIONS BETWEEN STOCHASTIC BLOCK MODEL AND LABEL PROPAGATION ALGORITHM
• Stochastic Block Model
  - A random graph model that captures community structure in a
    graph, i.e. it generates graphs that contain communities.
  - It differs substantially from the well-known random graph model of
    Erdős and Rényi.
  - The intention is to plant a community structure in a given
    graph. The simplest case is the planted bisection model.

INTRODUCTION
• Basic model: we are given a value k (the number of communities)
and a k x k probability matrix whose entry p_ij is the
probability that there is an edge between a node in community i
and a node in community j. The relative size of each community is given
by a probability vector p_i, where i ranges from 1 to k. This model is
useful to:
  - Model real-life networks.
  - Study the average-case complexity of NP-hard problems.
  - Establish benchmarks for clustering algorithms.
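
The basic model above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function name `sample_sbm` and its parameters (`n`, `sizes`, `probs`, `seed`) are hypothetical, each node draws its community independently from the size vector, and each pair of nodes is joined independently with the probability given by the k x k matrix.

```python
import random


def sample_sbm(n, sizes, probs, seed=0):
    """Sample an undirected graph from a stochastic block model (sketch).

    n     -- number of nodes (assumed parameter name)
    sizes -- probability vector over the k communities
    probs -- k x k symmetric matrix; probs[i][j] is the edge probability
             between a node in community i and a node in community j
    """
    rng = random.Random(seed)
    k = len(sizes)
    # Each node picks its community independently according to `sizes`.
    labels = rng.choices(range(k), weights=sizes, k=n)
    edges = set()
    for u in range(n):
        for v in range(u + 1, n):
            # Each pair is connected independently with probability p_ij.
            if rng.random() < probs[labels[u]][labels[v]]:
                edges.add((u, v))
    return labels, edges


# Planted bisection: two equal-sized communities, dense inside, sparse across.
labels, edges = sample_sbm(40, [0.5, 0.5], [[0.6, 0.05], [0.05, 0.6]])
```

With a matrix that is constant in every entry, the sketch reduces to the Erdős-Rényi model, which is one way to see how the two models differ.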

STOCHASTIC BLOCK MODEL (CONT.)


• The main objectives in the SBM are:
  - Weak recovery: find a partition of the nodes that is positively
    correlated with the hidden partition.
  - Partial recovery: what fraction of the nodes can be recovered
    exactly?
  - Exact recovery: when can entire clusters be recovered?
• These questions have been studied extensively for the past three
decades.
• Several results have been obtained so far using tools from high-dimensional
geometry, and they are mathematically very challenging. Abbe
et al. and Bandeira et al. present the current state of the art.
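
One way to make the three recovery objectives concrete is to measure how well an estimated partition agrees with the hidden one, up to a relabeling of the communities. The sketch below is an illustrative assumption rather than a definition taken from the cited papers: the helper name `agreement` is hypothetical, and it brute-forces all label permutations, which is only practical for small k.

```python
from itertools import permutations


def agreement(true_labels, est_labels, k):
    """Fraction of nodes labeled correctly under the best relabeling.

    Community names are arbitrary, so the estimate is compared with the
    hidden partition under every permutation of the k labels.
    """
    n = len(true_labels)
    best = 0
    for perm in permutations(range(k)):
        matches = sum(
            1 for t, e in zip(true_labels, est_labels) if perm[e] == t
        )
        best = max(best, matches)
    return best / n


hidden = [0, 0, 1, 1]
print(agreement(hidden, [1, 1, 0, 0], 2))  # same partition, labels swapped
print(agreement(hidden, [0, 0, 0, 1], 2))  # one node misplaced
```

In this reading, exact recovery corresponds to an agreement of 1, partial recovery asks how large the agreement can be made, and weak recovery (for two balanced communities) asks only for an agreement noticeably above the 1/2 that a random guess achieves.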

RESULTS REGARDING SBM


• Label Propagation is a semi-supervised machine learning
algorithm that assigns labels to the unlabeled nodes in a graph
based on the labels of their neighbors and various cost
functions.
• There are different notions of cost function and different labeling
strategies.
• Label Propagation also refers to a class of graph clustering
algorithms.
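
To illustrate the idea, here is a minimal sketch of one simple discrete labeling strategy: each unlabeled node repeatedly takes the majority label among its labeled neighbors, while the seed labels stay fixed. This is a generic majority-vote variant chosen for illustration, not the DLP algorithm analyzed by Yamaguchi et al.; the function name `propagate_labels`, the adjacency-dictionary input, and the tie-breaking rule (smallest label wins) are all assumptions.

```python
from collections import Counter


def propagate_labels(adj, seeds, max_iters=100):
    """Majority-vote label propagation on an undirected graph (sketch).

    adj   -- dict mapping each node to a list of its neighbors
    seeds -- dict mapping the initially labeled nodes to their labels
    """
    labels = dict(seeds)
    for _ in range(max_iters):
        changed = False
        for node in sorted(adj):
            if node in seeds:
                continue  # seed labels are never overwritten
            votes = Counter(labels[nb] for nb in adj[node] if nb in labels)
            if not votes:
                continue  # no labeled neighbor yet
            top = max(votes.values())
            # Assumed tie-break: the smallest label among the most frequent.
            new_label = min(lab for lab, c in votes.items() if c == top)
            if labels.get(node) != new_label:
                labels[node] = new_label
                changed = True
        if not changed:
            break  # labels are stable
    return labels


# Two triangles (0-1-2 and 3-4-5) joined by the single edge 2-3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
result = propagate_labels(adj, {0: 0, 1: 0, 4: 1, 5: 1})
```

In this small example nodes 2 and 3 each adopt the label of their own triangle, since two of their three neighbors carry it.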

LABEL PROPAGATION ALGORITHM


Can LP be viewed as a network generative model, as SBM is?

Yamaguchi et al. prove that LP and SBM share the same goal. They prove that a
modified, partially supervised version of SBM called PLSBM shares the same properties as
a discrete version of Label Propagation called DLP.

Our objective: since Yamaguchi et al. have established this connection, we want to
understand whether we can use it to apply the results obtained by
Bandeira et al., which are based on concentration inequalities, in the LP scenario.

OBJECTIVES
• Can the concentration-based results be extended to the PLSBM
setting and hence to the DLP algorithm?
• This objective has been partially met in Yamaguchi et al.'s paper.

PROBLEM DEFINITION
• So far, we have surveyed the literature on the results obtained in
research on SBM and Label Propagation algorithms.
• This involved learning the basic mathematical machinery needed to
understand these algorithms.

CONTRIBUTIONS
• Extend the results of Bandeira et al. and Abbe et al. to the PLSBM
setting.
  - More formally, can the concentration-based results be extended
    to the PLSBM setting and hence to the DLP algorithm?
  - Yamaguchi et al. establish a weak connection; can this
    connection be strengthened?
  - Does applying these results lead to a better
    understanding of the LP algorithm?

REMAINING TASKS
• Abbe, Emmanuel, and Colin Sandon. "Community detection in
general stochastic block models: Fundamental limits and efficient
algorithms for recovery." 2015 IEEE 56th Annual Symposium on
Foundations of Computer Science. IEEE, 2015.
• Abbe, Emmanuel, Afonso S. Bandeira, and Georgina Hall. "Exact
recovery in the stochastic block model." IEEE Transactions on
Information Theory 62.1 (2016): 471-487.
• Yamaguchi, Yuto, and Kohei Hayashi. "When Does Label
Propagation Fail? A View from a Network Generative
Model." IJCAI. 2017.

REFERENCES
