You are on page 1of 37

Social Network Analysis (SNA)

Centrality measures

Grau de Ciència de Dades | Escola Tècnica Superior d’Informàtica | Universitat Politècnica de València
Sources


Albert László Barabási: Network Science. Cambridge
University Press, 2016
– Follows almost section-by-section chapter 02

● Newman, Mark E. J.: Networks: an introduction. Oxford


University Press, 2010
– Chapter 7 - Measures and metrics

2/37
Contents
1. Degree centrality
2. Eigenvector centrality
3. Katz centrality
4. Page rank
5. Betweenness centrality
6. Closeness centrality
7. Centrality in weighted networks

Social Network Analysis (SNA): Graph theory


Which are the most important or central
nodes in a network?

It depends on which
centrality mesure
we look at

Througout this lesson we will talk about node centrality (as opposed
to edge centrality or network centrality)
DEGREE CENTRALITY
DEGREE CENTRALITY

Each neighbor counts the same:


a node gets 1 “centrality point”
for every neighbor it has
DEGREE CENTRALITY

Degree centrality in directed graphs

Out-Degree In-degree
DEGREE CENTRALITY

Which one to chose?

• Citation network
• In-degree measures impact
• Out-degree measures reference size
• Disease propagation network
• In-degree (usually 1) measures source transmission
• Out-degree measures contagion size
DEGREE CENTRALITY

Citations in CS 2014; nodes: papers; edges: citations

M. Ley, "The DBLP computer science bibliography: Evolution, research issues,


perspectives." Proc. of the 9th Int. Symposium on String Processing and Information
Retrieval (SPIRE), 1-10 (2002), https://doi.org/10.1007/3-540-45735-6_1
DEGREE CENTRALITY

Limitations

• Not all neighbors are necessarily equivalent


Degree centrality assigns 1 ”centrality point” to
each neighbor
• Usually, a node centrality is increased by having
connections to important nodes
Centrality should be, not only about how many you
know, but also about who you know
EIGENVECTOR CENTRALITY

Eigenvector centrality awards nodes proportionally


to the centrality scores of its neighbors
EIGENVECTOR CENTRALITY

Proportionality Centrality of node j


constant Adjacency
matrix
Rewrite using vectors:
where x is the leading eigenvector* of A
(elements equal to the centrality scores xi)
and k is the largest eigenvalue
(*) In asymmetric matrices (directed networks) there are two sets of
eigenvectors (in and out degrees): we take the eigenvector with the
largest eigenvalue within the in-degrees set.
EIGENVECTOR CENTRALITY

Examples
• Citation networks
– Important papers receive high number of cites (awarded by
degree centrality)
– Receiving a cite from an important paper matters (awarded
by eigenvector centrality)
– The number of cites of a non-important paper (out-degree)
does not make that paper important (not awarded)

• World Wide Web


– Similar considerations than citation networks
EIGENVECTOR CENTRALITY

Problem arise for directed networks

• If nodes are not in a strongly connected component


(path exists between each pair and direction) then
eigenvector centrality can become zero for some nodes

Example: Node A has in-degree 0 → Zero


centrality propagates all over the network

Solution
• Avoid any node to have zero centrality
(Katz centrality)
KATZ CENTRALITY

Katz centrality assign nodes with extra


eighenvector centrality score
KATZ CENTRALITY (eigenvector-based measure)

• Each node gets some centrality “for free” (β constant)

• In matrix terms
where 1 is the uniform vector (1,1,..,1)

• rearranging and setting β=1

where α governs the balance between


the eigenvector centrality term (α=1/k)
and the constant term (α=0)
→ researchers must fix α / 0<=α<=1/k
KATZ CENTRALITY (eigenvector-based measure)

Problem with Katz and eigenvector centrality


If a node with high centrality points to 1 million nodes,
all those nodes will have high centrality
Do thousands of sellers and manufacturers
pointed by Amazon website deserve high centrality?

Possible solution
Centrality of neighbors normalized by their out degree
(page rank centrality)
PAGE RANK CENTRALITY

Page rank centrality awards nodes proportionally to the centrality


scores of its neighbors normalized by their out-degree
PAGE RANK CENTRALITY (eigenvector-based measure)

• Dividing centrality of neighbors by their out-degree

• In matrix terms
PAGE RANK CENTRALITY (eigenvector-based measure)

• Parameter alpha is not straightforward to find


Google uses alpha equal to 0.85 (based probably on
empirical results)
• The centrality gained by receiving an edge from a
prestigious node is now diluted among all the other
neighbors of the prestigious node
Amazon website points to thousands of non-central
sellers and manufacturers
• Network hubs (highly connected nodes) do not have
disproportionate influence on the centrality rankings
BETWEENNESS CENTRALITY

Betweenness centrality awards nodes that are part


of a higher number of shortest paths
BETWEENNESS CENTRALITY

Removal of nodes with highest


betweenness will disrupt
communication the most

Edge-betweenness of an edge (i,j) computes the number of paths passing through (i,j)
BETWEENNESS CENTRALITY

The Brandes algorithm efficiently computes the betweenness of each node of a network
Ulrich Brandes (2001). A faster algorithm for betweenness centrality. Journal of Mathematical Sociology. 25:163–177.
BETWEENNESS CENTRALITY

Betweenness measures how much a node falls between others

red is minimum
dark blue is maximum

Take care, Easley and Kleinberg


talk about ‘flow misleading’:

We only care about shortest paths

There are alternative paths just
running through red nodes

Note that, in undirected graphs, counting


paths once or twice is the same
BETWEENNESS CENTRALITY

Central nodes are bridges (aka gatekeepers)


High betweenness nodes
connect groups
BETWEENNESS CENTRALITY

Adolescents being friends of


black and white schoolmates
(in the frontier of both groups)
will obtain high betweenness
CLOSENESS CENTRALITY

Closeness centrality awards nodes that are closer


to the rest of the nodes
CLOSENESS CENTRALITY

Example:
Air travel routes between countries

Nodes: countries
Edges: more than 2 routes
CLOSENESS CENTRALITY

Mean shortest distance from a node


to every other node

# nodes shortest distance


between i and j

Closeness centrality:

Consider only nodes within a connected component!


CLOSENESS CENTRALITY

Closeness Country
0.5737265415549598 United Arab Emirates
0.5846994535519126 Germany
0.5911602209944751 France
0.6028169014084508 United Kingdom
0.6079545454545455 United States
CENTRALITY IN WEIGHTED NETWORKS

We do not talk about node degree (number of neighbors)


but node strength (sum of weights to reach neighbors)
CENTRALITY IN WEIGHTED NETWORKS

Betweenness and closeness


• Path length: sum of edge weights from one node to the
other

Unweighted distance

Weighted distance

Questions:
Does betweenness of node 2 change using weights vs. not?
Does closeness of node 2 change using weights vs. not?
CENTRALITY IN WEIGHTED NETWORKS

Eigenvector, Katz, and PageRank

• Use the weighted adjacency matrix instead

• Eigenvector

• Katz

• PageRank

Note: Only valid for non-negative weights!


CENTRALITY IN WEIGHTED NETWORKS

Weighted Core: The s-core


S-core: Maximal subnetwork
where nodes have strength
at least s

Main weighted core


(or simply core):

S’-core such that there is not


s-core with s > s’

Eidsaa, Marius, and Eivind Almaas. "S-core network decomposition: A generalization of k-core analysis to weighted networks." Physical Review E 88.6 (2013): 062819
USEFUL INFORMATION
Advices
• Compute centrality measures within connected
components
• Use them as a ranking system for node importance,
relative values are difficult to understand
• Measures are context dependent, use the best one for
your problem
Networkx python package

• nx.betweenness_centrality
• nx.closeness_centrality
• nx.k_core
• nx.katz_centrality
• nx.eigenvector_centrality
• nx.pagerank

You might also like