COMMUNITY DETECTION

1. INTRODUCTION

The rise of the internet has driven developments all over the globe. It introduced a
whole new way of interacting and of sharing knowledge, resources, and information, and it
changed how people consume content and learn. The result is a huge amount of data that can
be analyzed to better understand the current generation.

Interaction between people has reached new heights with the introduction of social
media, which offers easy and fast ways of communicating. Almost anything can be shared, from
a simple text message to an image or a voice note. When this much data is shared, there are
bound to be groups of people who share a common interest or a view on a specific ideology.
These like-minded people can get closer to each other by forming a community. Communities
are prevalent throughout social media, but the terminology varies from platform to platform.
For example, Google+ has ‘Circles’, Reddit has ‘Subreddits’, Facebook and WhatsApp have
‘Groups’, and so on.

These communities are not limited to social media but are found throughout the
world. The communities formed online have brought about change not just socially but also
economically. They can help brands gather reviews of their products both within one
community and across different communities. Communities also help e-commerce websites build
recommendation systems suited to a person’s tastes once the community he or she belongs to
has been identified.

Thus, real-world communities are important in the modern era, especially in the
field of social networking. Analysing the data from these communities is also important,
but the data should not be used in an inappropriate manner. Since some behaviours are only
observable at a group level and not at an individual level, real-world communities are
essential here.

Detecting these communities can be done through various mechanisms, either digitally
or manually. An error in the detection of communities can have quite a negative impact.
Thus, detecting these communities in a discrete data set is as important as the community
itself. Before heading into community detection, let’s try to understand what a community
is formally.

2. WHAT IS A COMMUNITY?

Formally, a ‘community’ can be defined as a group of individuals who live in the same
geographical location. In social network analysis, however, we define communities by
looking at how people are connected to each other and clustering them into groups of
similarly connected people. In other words, a network is said to have a community structure
if its nodes can be clustered into smaller groups, where the nodes in each group have
similar connections (properties).

3. COMMUNITY DETECTION

Many ways of detecting communities have been proposed and implemented over the
years. These algorithms have also been revised and modified to make them more efficient,
and several papers implementing them have been published in various journals.

In real-world networks, some communities tend to overlap with each other. Detecting
overlapping communities is more complex than detecting disjoint communities.

//PICTURE FOR OVERLAPPING COMMUNITIES AND DISJOINT COMMUNITIES

The need for community detection mostly revolves around obtaining a summary of the
network, since communities are easier to visualize and understand than the full real-world
network. Another use of community detection is preserving privacy: in some networks, a
community can reveal some properties of its individual nodes without releasing their
private information.

4. TYPES OF COMMUNITY DETECTION

Community detection methods are largely divided into:

(1) Group attributes based

(2) Member attributes based

These are further divided into smaller categories. Member-based community detection
includes (1) Similarity (2) Degree (3) Reachability. As the names imply, these categories
are defined by properties of the individual members or nodes. Group-based community
detection looks for communities that are (1) Modular (2) Balanced (3) Dense (4) Robust
(5) Hierarchical.

4.1 MEMBER-BASED COMMUNITY DETECTION

The properties of the nodes are used to detect the communities. Each node has several
properties, mainly its degree, its similarity with other nodes and its reachability from
other nodes in the network.

4.2 NODE DEGREE BASED

If the degree of the nodes is used as the measure to detect communities, the method
looks for maximal cliques, each of which becomes a community. Using the degree as a
measure, the three methods which can be used are (i) Brute Force (ii) Relaxing cliques (iii)
Clique Percolation Method (CPM). All these

4.2.1 BRUTE FORCE METHOD

The Brute Force Method, as the name suggests, exhaustively checks each node, growing
candidate cliques one node at a time and placing them on a ‘Clique Stack’. This method can
find all the maximal cliques. For ‘n’ nodes the algorithm can generate up to 2^(n-1) - 1
different cliques in the worst case. Even for a network of a hundred nodes this is
impractical, and real-world networks have thousands of nodes or more. Due to this, other
approaches were implemented.
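
The stack-based search described above can be sketched roughly as follows. This is a
minimal illustration, not a reference implementation: the function names and the small
example graph are assumptions made for this sketch, and the graph is stored as a plain
dictionary mapping each node to the set of its neighbours.

```python
from itertools import combinations

def is_clique(graph, nodes):
    """Return True if every pair of the given nodes is joined by an edge."""
    return all(v in graph[u] for u, v in combinations(nodes, 2))

def brute_force_maximal_cliques(graph):
    """Enumerate every clique with an explicit stack, then keep the maximal ones."""
    stack = [frozenset([v]) for v in graph]  # every single node is a clique
    seen = set(stack)
    while stack:
        clique = stack.pop()
        # Try to grow the current clique by one node at a time.
        for candidate in graph:
            if candidate in clique:
                continue
            bigger = clique | {candidate}
            if bigger not in seen and is_clique(graph, bigger):
                seen.add(bigger)
                stack.append(bigger)
    # A clique is maximal if no other clique strictly contains it.
    return [c for c in seen if not any(c < other for other in seen)]

# Hypothetical example: a triangle a-b-c with a pendant node d attached to c.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
cliques = brute_force_maximal_cliques(graph)
```

On this example the maximal cliques are {a, b, c} and {c, d}. Because every clique is
visited, the running time grows exponentially with the size of the largest clique, which
is the impracticality noted above.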

4.2.2 RELAXING CLIQUE APPROACH

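This approach uses the k-plex. A k-plex is a relaxed clique: a set of nodes V in which
every node has degree at least |V| - k within the set, so the minimum degree is not
necessarily |V| - 1 as it would be in a clique. This method finds all the k-plexes for k
from 1 to dv, where dv is the maximum degree in the graph. Finding a maximum k-plex is
NP-hard, which means that no polynomial-time algorithm is known for it, so exact
computation is only practical on small networks. Each k-plex found forms a community.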
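
As a small illustration of the relaxed-clique idea, the sketch below checks whether a
given set of nodes is a k-plex under the usual definition (every node in the set is
adjacent to at least |V| - k other members of the set). It only verifies a candidate set;
it does not search for k-plexes. The function name and example graph are assumptions made
for this sketch.

```python
def is_k_plex(graph, nodes, k):
    """Return True if each node has at least len(nodes) - k neighbours inside nodes."""
    nodes = set(nodes)
    return all(len(graph[v] & nodes) >= len(nodes) - k for v in nodes)

# Hypothetical example: a 4-cycle a-b-c-d-a. Each node has 2 neighbours in the set.
graph = {
    "a": {"b", "d"},
    "b": {"a", "c"},
    "c": {"b", "d"},
    "d": {"a", "c"},
}
one_plex = is_k_plex(graph, graph.keys(), 1)  # a 1-plex is an ordinary clique
two_plex = is_k_plex(graph, graph.keys(), 2)  # each node may miss one other member
```

The 4-cycle is not a clique, so the check fails for k = 1, but it passes for k = 2
because every node is adjacent to 4 - 2 = 2 members of the set.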

4.2.3 CLIQUE PERCOLATION METHOD (CPM)

This is the most suitable approach among the degree-based approaches to detecting
communities. The procedure finds all cliques of size k in a given network, then uses those
cliques as the nodes of a new graph. Two cliques are said to be adjacent if they share k-1
nodes. Thus, a network of cliques is formed, and each connected component of this clique
network, taken as the union of the nodes of its cliques, is reported as a community.
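
The procedure can be sketched in a few lines of Python. This is a simplified version that
finds the k-cliques by testing every k-node combination, which is only reasonable for
small graphs; the function name and the example edge list are assumptions made for
illustration.

```python
from itertools import combinations

def clique_percolation(graph, k):
    """Group adjacent k-cliques (sharing k-1 nodes) and return the node sets."""
    # Step 1: find every clique of size k by testing all k-node combinations.
    k_cliques = [
        frozenset(c)
        for c in combinations(sorted(graph), k)
        if all(v in graph[u] for u, v in combinations(c, 2))
    ]
    # Step 2: walk the clique graph; two cliques are adjacent if they share k-1 nodes.
    communities = []
    unvisited = set(k_cliques)
    while unvisited:
        start = unvisited.pop()
        community, frontier = set(start), [start]
        while frontier:
            current = frontier.pop()
            adjacent = [c for c in unvisited if len(c & current) == k - 1]
            for c in adjacent:
                unvisited.remove(c)
                community |= c
                frontier.append(c)
        communities.append(community)
    return communities

# Hypothetical example: triangles 1-2-3 and 2-3-4 share an edge; triangle 4-5-6
# touches them only at node 4.
edges = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4), (4, 5), (4, 6), (5, 6)]
graph = {}
for u, v in edges:
    graph.setdefault(u, set()).add(v)
    graph.setdefault(v, set()).add(u)

communities = clique_percolation(graph, k=3)
```

With k = 3 this yields the communities {1, 2, 3, 4} and {4, 5, 6}. Node 4 belongs to
both, which shows that this method can produce overlapping communities.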

//Working Example

4.3 NODE REACHABILITY

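Reachability can be taken to two extremes. At one extreme, two nodes belong to the
same community whenever a path exists between them, so BFS or DFS simply returns the
connected components, which is rarely useful when most of the network falls into one large
component. At the other extreme, only directly connected nodes are grouped, which gives
cliques. The best approach is to find communities that lie between cliques and connected
components in terms of connectivity and that have small shortest paths between their
nodes. With the use of k-cliques, k-clubs and k-clans, communities can be detected using
the reachability property.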
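
To make the distance conditions concrete, the sketch below assumes the usual definitions:
in a (distance-based) k-clique every pair of members is at most k steps apart in the whole
graph, while in a k-club the same bound must hold using only paths inside the group. The
code checks these distance conditions for a given node set and does not test maximality.
The function names and the example graph are assumptions made for this sketch.

```python
from collections import deque

def bfs_distances(graph, source, allowed=None):
    """Shortest-path lengths from source; if allowed is given, stay inside that set."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist and (allowed is None or v in allowed):
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def within_distance(graph, nodes, k, induced):
    """Check that all pairs in nodes are at most k apart.

    induced=False measures distances in the whole graph (k-clique condition);
    induced=True measures them inside the node set only (k-club condition).
    """
    nodes = set(nodes)
    allowed = nodes if induced else None
    for u in nodes:
        dist = bfs_distances(graph, u, allowed)
        if any(dist.get(v, float("inf")) > k for v in nodes):
            return False
    return True

# Hypothetical example: a and b are linked only through a third node, hub.
graph = {"a": {"hub"}, "b": {"hub"}, "hub": {"a", "b"}}

clique_condition = within_distance(graph, {"a", "b"}, 2, induced=False)
club_condition = within_distance(graph, {"a", "b"}, 2, induced=True)
```

Here {a, b} satisfies the 2-clique distance condition because the path through hub has
length 2, but it fails the 2-club condition because that path leaves the group. A k-clan
is a k-clique that also satisfies the k-club condition.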

4.4 NODE SIMILARITY

A node similarity algorithm compares nodes in a network based on the nodes they are
connected to. Two nodes are said to be similar if they are connected to the same set of nodes
within that network; this is also called structural equivalence. When detecting
communities, Jaccard similarity and cosine similarity are commonly used measures.
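
Both measures can be computed directly from the neighbour sets of two nodes. The sketch
below assumes the common neighbourhood-based forms: Jaccard similarity is the number of
shared neighbours divided by the size of the union of the two neighbourhoods, and cosine
similarity is the number of shared neighbours divided by the square root of the product of
the two degrees. The function names and the example graph are assumptions made for this
sketch.

```python
from math import sqrt

def jaccard_similarity(graph, u, v):
    """Shared neighbours divided by all distinct neighbours of u and v."""
    union = graph[u] | graph[v]
    return len(graph[u] & graph[v]) / len(union) if union else 0.0

def cosine_similarity(graph, u, v):
    """Shared neighbours divided by sqrt(degree(u) * degree(v))."""
    denom = sqrt(len(graph[u]) * len(graph[v]))
    return len(graph[u] & graph[v]) / denom if denom else 0.0

# Hypothetical example: u and v share the neighbours b and c.
graph = {
    "u": {"a", "b", "c"},
    "v": {"b", "c", "d"},
    "a": {"u"},
    "b": {"u", "v"},
    "c": {"u", "v"},
    "d": {"v"},
}
jac = jaccard_similarity(graph, "u", "v")  # 2 shared out of 4 distinct neighbours
cos = cosine_similarity(graph, "u", "v")   # 2 shared over sqrt(3 * 3)
```

Nodes whose similarity exceeds a chosen threshold can then be grouped into the same
community; the threshold itself is a design choice that depends on the network.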

5. GROUP BASED COMMUNITY DETECTION

In group-based community detection, communities are detected based on group
properties. The group properties are: (1) Robust (2) Dense (3) Hierarchical (4) Balanced
(5) Modular.

In a robust community detection algorithm, a k-vertex connected graph-based approach is
used to find a sub-graph as a community such that it does not lose its connectivity even
after some of its edges and vertices (fewer than k vertices) are removed.
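
The robustness condition can be illustrated with a brute-force check: a group of more
than k nodes is k-vertex connected if it stays connected after any k - 1 of its nodes are
removed. The sketch below tests this directly by trying every removal, so it is only
suitable for small groups. The function names and the example graph are assumptions made
for this sketch.

```python
from itertools import combinations

def is_connected(graph, nodes):
    """Return True if the sub-graph induced by nodes is connected."""
    nodes = set(nodes)
    if not nodes:
        return True
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in graph[u]:
            if v in nodes and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == nodes

def is_k_vertex_connected(graph, nodes, k):
    """Return True if removing any k - 1 nodes leaves the group connected."""
    nodes = set(nodes)
    if len(nodes) <= k:
        return False
    return all(
        is_connected(graph, nodes - set(removed))
        for removed in combinations(nodes, k - 1)
    )

# Hypothetical example: a 4-cycle a-b-c-d-a.
graph = {
    "a": {"b", "d"},
    "b": {"a", "c"},
    "c": {"b", "d"},
    "d": {"a", "c"},
}
two_connected = is_k_vertex_connected(graph, graph.keys(), 2)
three_connected = is_k_vertex_connected(graph, graph.keys(), 3)
```

The 4-cycle survives the loss of any single node, so it is 2-vertex connected, but
removing two opposite nodes disconnects it, so it is not 3-vertex connected.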

In a dense community detection algorithm, highly dense subgraphs such as cliques form
a community.

In the modular community detection approach, a matrix called the ‘modularity matrix’ is
used to partition a graph into k sub-graphs, each of which is a community. Hierarchical
group-based detection generates communities in a hierarchical format, either by division
or by aggregation. In the divisive form, all the nodes are initially considered to be in a
single large community, which is gradually split into the desired sub-communities; in the
agglomerative form, small communities are gradually merged.
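
As an illustration of the modular approach, the sketch below scores a given partition
using the standard modularity matrix, whose entry for nodes i and j is
B_ij = A_ij - (d_i * d_j) / (2m), where A is the adjacency matrix, d_i is the degree of
node i and m is the number of edges. The modularity Q is the sum of B_ij over all pairs in
the same community, divided by 2m. The code only evaluates a partition; it does not search
for the best one. The function name and the example graph are assumptions made for this
sketch.

```python
def modularity(graph, communities):
    """Sum modularity-matrix entries over same-community pairs, divided by 2m."""
    m = sum(len(neighbours) for neighbours in graph.values()) / 2  # edge count
    label = {v: i for i, community in enumerate(communities) for v in community}
    q = 0.0
    for u in graph:
        for v in graph:
            if label[u] != label[v]:
                continue
            a_uv = 1 if v in graph[u] else 0
            # Modularity matrix entry: actual edge minus expected edge weight.
            q += a_uv - len(graph[u]) * len(graph[v]) / (2 * m)
    return q / (2 * m)

# Hypothetical example: two triangles joined by the single edge 3-4.
edges = [(1, 2), (1, 3), (2, 3), (4, 5), (4, 6), (5, 6), (3, 4)]
graph = {}
for u, v in edges:
    graph.setdefault(u, set()).add(v)
    graph.setdefault(v, set()).add(u)

q_split = modularity(graph, [{1, 2, 3}, {4, 5, 6}])
q_single = modularity(graph, [{1, 2, 3, 4, 5, 6}])
```

Splitting the graph into its two triangles gives Q = 5/14, while placing every node in
one community gives Q = 0, so the split is the better partition under this measure.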
REFERENCES

COMMUNITY DETECTION IN SOCIAL NETWORKS

Punam Bedi, Chhavi Sharma, Department of Computer Science, University of Delhi

REVIEW OF COMMUNITY DETECTION OVER SOCIAL MEDIA: GRAPH PERSPECTIVE

Pranita Jain, Deepak Singh Tomar, Department of Computer Science, Maulana Azad
National Institute of Technology Bhopal, India 462001