
Module4_NetworkModels

Reference: R. Zafarani, M. A. Abbasi, and H. Liu, Social Media Mining: An Introduction, Cambridge University Press, 2014.
Book at http://socialmediamining.info/
Why should we use network models?
Facebook – a few details
In May 2011:
– 721 million users
– Average number of friends: 190
– A total of 68.5 billion friendships
In September 2015:
– 1.35 Billion users
In June 2017:
– 2 Billion users
In July 2018:
– 2.2 Billion monthly active users
In June 2019:
– 2.41 billion monthly active users (MAU)
1. What are the principal underlying processes that help initiate these friendships?
2. How can these seemingly independent friendships form this complex friendship
network?
3. In social media there are many networks with millions of nodes and billions of
edges.
– They are complex and it is difficult to analyze them
Network Models
We can design models that generate, on a smaller scale, graphs similar to real-world networks.
If we can guarantee that generated graphs are similar to real-world networks:
1. We can analyze simulated graphs instead of real-world networks (cost-efficient)
2. We can better understand real-world networks by providing concrete mathematical
explanations
3. We can perform controlled experiments on synthetic networks when real-world
networks are unavailable.
Graph Models:
• Random graph model
• Small-world model
• Preferential attachment model
These models are designed to accurately model properties observed in real-world
networks
CONCERN: Properties of real-world networks that should be accurately modeled
Properties of Real-World Networks
• Real-world networks share common characteristics.
• When designing network models, we devise models that can accurately
describe these networks by mimicking these common characteristics.
• To determine these characteristics, identify their attributes and show
that measurements for these attributes are consistent across networks.
• Three network attributes exhibit consistent measurements across real-
world networks: degree distribution, clustering coefficient, and average
path length.
• Degree distribution denotes how node degrees are distributed across a
network.
• Clustering coefficient measures transitivity of a network.
• Average path length denotes the average distance (shortest path
length) between pairs of nodes.
Degree Distribution
Wealth Distribution:
– Most individuals have average capital
– Few are considered wealthy
– Exponentially more individuals with average capital than the wealthier ones

City Population:
– A few metropolitan areas are densely populated
– Most cities have an average population size

Social Media:
– We observe the same phenomenon regularly when measuring popularity or
interestingness for entities.
• The Pareto principle
(80–20 rule): 80% of the effects come from 20% of the causes
Degree Distribution
Site Popularity:
– Many sites are visited fewer than 1,000 times a month
– A few are visited more than a million times daily
User Activity:
– Social media users are often active on a few sites
– Some individuals are active on hundreds of sites
Product Price:
– There are exponentially more modestly priced products for sale compared to
expensive ones.
Friendships:
– Many individuals have a few friends and a handful of users have thousands of
friends
In all the provided observations, the distribution of values
follows a power-law distribution
Power-Law Degree Distribution
• When the frequency of an event changes as a power of an
attribute
– the frequency follows a power-law
• Let $k$ denote the degree of a node. Let $p_k$ denote the fraction of individuals with degree $k$ (i.e., the number of nodes with degree $k$ divided by $|V|$).
Then, according to the power-law distribution, we have
$p_k = a k^{-b}$
– where $b$ is the power-law exponent
– $a$ is the power-law intercept

• Taking the logarithm of both sides of $p_k = a k^{-b}$, we get
$\ln p_k = -b \ln k + \ln a$
Power-Law Degree Distribution
• A typical shape of a power-law distribution

• A log-log plot of a power-law distribution is a straight line with slope $-b$ and intercept $\ln a$
Power-Law Distribution
• Many real-world networks exhibit a power-law
distribution.
• Power-laws seem to dominate when the quantity being measured can be viewed as a type of popularity.

• A power-law distribution
• Small occurrences : common
• Large instances: extremely rare
Power-Law Distribution: Examples
• Call networks:
– The fraction of telephone numbers that receive $k$ calls per day is roughly proportional to $1/k^2$
• Book Purchasing:
– The fraction of books that are bought by $k$ people is roughly proportional to $1/k^3$
• Scientific Papers:
– The fraction of scientific papers that receive $k$ citations in total is roughly proportional to $1/k^3$
• Social Networks:
– The fraction of users that have in-degrees of $k$ is roughly proportional to $1/k^2$
Power-law Distribution: An Elementary Test

• To test whether a network exhibits a power-law distribution:
1. Pick a popularity measure and compute it for the whole network
– Example: number of friends for all nodes
2. Compute $p_k$, the fraction of individuals having popularity $k$.
3. Plot a log-log graph, where the x-axis represents $\ln k$ and the y-axis represents $\ln p_k$
4. If a power-law distribution exists, we should observe a straight line
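The test above can be sketched in a few lines of Python. This is a minimal illustration, not a rigorous fitting procedure: the helper names `degree_fractions` and `loglog_fit` are hypothetical, and an ordinary least-squares line through the points $(\ln k, \ln p_k)$ stands in for eyeballing the plot.

```python
import math
from collections import Counter

def degree_fractions(degrees):
    """Return {k: p_k}: the fraction of nodes with degree k (only k > 0,
    because ln 0 is undefined on a log-log plot)."""
    counts = Counter(d for d in degrees if d > 0)
    n = len(degrees)
    return {k: count / n for k, count in counts.items()}

def loglog_fit(p):
    """Least-squares line through the points (ln k, ln p_k).

    Returns (slope, intercept). For a power law p_k = a * k**(-b),
    the slope estimates -b and the intercept estimates ln a.
    """
    xs = [math.log(k) for k in p]
    ys = [math.log(v) for v in p.values()]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Synthetic check on an exact power law p_k = 0.6 * k**(-2), k = 1..50.
exact = {k: 0.6 * k ** -2 for k in range(1, 51)}
slope, intercept = loglog_fit(exact)
print(slope, math.exp(intercept))
```

On real data the tail of the plot is noisy, so a straight-line fit like this is only a first check and not proof of a power law.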
Power-Law Distribution: Real-World
Networks
• Networks with a power-law degree distribution are
called Scale-Free networks
The tail of the power-law distribution is long!

• The Loooooong Tail


• Are most sales being generated by a small set of items that are enormously popular, or by a much larger population of items that are each individually less popular?
An example power-law graph, being used to demonstrate ranking of popularity. To
the right is the long tail, and to the left are the few that dominate (also known as
the 80–20 rule [Pareto Principle]).
• The total sales volume of unpopular items, taken together, is very significant.
– 57% of Amazon’s sales are from the long tail
Clustering Coefficient
• In real-world networks, friendships are highly transitive
• These friendships form triads of friendships that are
frequently observed in social networks.
• These triads result in networks with high average clustering
coefficients.
• Facebook Observation
In May 2011:
• Average clustering coefficient of 0.5 for users with two friends
• This indicates that for 50% of all users with two friends, their two friends were
also friends with each other.
Clustering Coefficient for Real-World
Networks
Average Path Length
• How Small is the World?
A rumor is spreading over a social network.
– Assume all users pass it immediately to all of their friends
1. How long does it take to reach almost all of the
nodes in the network?
2. What is the maximum time?
3. What is the average time?
Average Path Length
• In real-world networks, any two members of the network are
usually connected via short paths.
• Average path length is small.
• This is known as the small-world phenomenon.
• In the well-known small-world experiment conducted in the 1960s
by Stanley Milgram, Milgram conjectured that people around the
world are connected to one another via a path of at most six
individuals (i.e., the six degrees of separation).
• Small average path lengths in social networks
– For example, in May 2011, the average path length between individuals
in the Facebook graph was 4.7.
– This average was 4.3 for individuals in the United States at the same time
Average Path Length in Real-World Networks
The Average Shortest Path in Sample Networks
Milgram’s Experiment
• 296 random people from
Nebraska (196 people) and
Boston (100 people) were asked
to send a letter (via
intermediaries) to a stock broker
in Boston
• S/he could only send to people
they personally knew, i.e., were
on a first-name basis
• Among the letters that found the
target (64), the average number
of links was around six.
• Stanley Milgram (1933-1984)
Milgram’s Experiment
• Travers, Jeffrey, and Stanley Milgram. "An experimental study of the small world problem."
• Average number of intermediate people is 5.2
Erdös Number
• Erdös Number: Number of links required to connect scholars to Erdös, via co-authorship of papers
• Erdös wrote 1500+ papers with 507 co-authors
• The Erdös Number Project allows you to compute your Erdös number: http://www.oakland.edu/enp/
• Connecting path lengths, among mathematicians only:
– Avg. is 4.65 and Maximum is 13
• Paul Erdös (1913-1996)
An Example of Erdös number 2 [Einstein]
Random Graphs – Network Model
• Most basic assumption on how friendships can be formed:
Random Graph assumption:
– Edges (i.e., friendships) between nodes (i.e., individuals) are formed randomly.
– Two random graph models
• G(n, p) and G(n, m)
Random Graph Model - G(n, p)
• Proposed independently by Edgar Gilbert, and by Solomonoff and Rapoport.
• Consider a graph with a fixed number of nodes $n$
• In the G(n, p) model, a graph is constructed by connecting nodes randomly.
– Each edge is included in the graph with probability $p$, independently of every other edge.
• Any of the $\binom{n}{2}$ possible edges can be formed independently, with probability $p$
• The graph is called a random graph
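As a minimal sketch of the G(n, p) construction, assuming nodes are labeled 0..n-1 and the graph is undirected without self-edges (the function name `gnp` is hypothetical):

```python
import random
from itertools import combinations

def gnp(n, p, seed=None):
    """Return the edge list of one G(n, p) sample.

    Each of the n*(n-1)/2 possible undirected edges is kept
    independently with probability p.
    """
    rng = random.Random(seed)
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]

edges = gnp(10, 0.3, seed=42)
print(len(edges), "edges out of", 10 * 9 // 2, "possible")
```

Looping over every node pair costs O(n²) time, which is fine for illustration but slow for very large graphs.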
Random Graph Model - G(n, m)
• Assume both the number of nodes $n$ and the number of edges $m$ are fixed.
• In the G(n, m) model, a graph is chosen uniformly at random from the collection of all graphs which have $n$ nodes and $m$ edges.
• This model was first proposed by Paul Erdös and Alfred Rényi
• Also called the Erdös-Rényi model
• Determine which $m$ edges are selected from the set of $\binom{n}{2}$ possible edges
• Let $\Omega$ denote the set of graphs with $n$ nodes and $m$ edges
– There are $|\Omega| = \binom{\binom{n}{2}}{m}$ different graphs with $n$ nodes and $m$ edges
• To generate a random graph, we uniformly select one of the $|\Omega|$ graphs (the selection probability is $1/|\Omega|$)
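A minimal sketch of G(n, m) sampling, assuming undirected graphs on nodes 0..n-1. Choosing $m$ distinct edges uniformly from the possible edges is the same as choosing one graph uniformly from all graphs with $n$ nodes and $m$ edges. The names `gnm` and `num_graphs` are hypothetical.

```python
import random
from itertools import combinations
from math import comb

def gnm(n, m, seed=None):
    """Return m distinct edges chosen uniformly from all possible edges."""
    rng = random.Random(seed)
    possible = list(combinations(range(n), 2))
    return rng.sample(possible, m)

def num_graphs(n, m):
    """Number of distinct graphs with n labeled nodes and m edges."""
    return comb(comb(n, 2), m)

print(gnm(6, 4, seed=1))
print(num_graphs(4, 2))  # C(6, 2) graphs
```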
Modeling Random Graphs (cont’d)
• The probability of uniformly selecting a graph in G(n, m), $1/|\Omega|$, is analogous to $p$, the probability of selecting an edge in G(n, p)

Similarities:
– In the limit (when $n$ is large), both the G(n, p) and G(n, m) models act similarly
• The expected number of edges in G(n, p) is $\binom{n}{2} p$
• We can set $\binom{n}{2} p = m$
– Both models then act the same because they contain the same number of edges in expectation

Differences:
– The G(n, m) model contains a fixed number of edges
– The G(n, p) model can contain any number of edges, from none to all possible edges
G(n, p) – a few Mathematical properties

• Mathematically, the G(n, p) model is almost always simpler to analyze.
• Mathematical properties
– Expected number of edges that are connected to a
node (Expected Degree)
– Expected number of edges observed in the graph
– Probability of observing m edges in a random
graph
G(n, p) - Expected Degree
• Proposition: The expected number of edges connected to a node (expected degree) in G(n, p) is $c = (n-1)p$

Proof:
– A node can be connected to at most $n-1$ nodes (via $n-1$ edges)
– All edges are selected independently with probability $p$
– Therefore, on average, $(n-1)p$ edges are selected

• $c = (n-1)p$ or equivalently, $p = \frac{c}{n-1}$
G(n, p) - Expected Number of Edges

• Proposition: The expected number of edges in G(n, p) is $\binom{n}{2} p$

Proof:
– Since edges are selected independently, and we have a maximum of $\binom{n}{2}$ edges, the expected number of edges is $\binom{n}{2} p$
G(n, p) - Probability of Observing m Edges
• Proposition: Given the G(n, p) model, the probability of observing $m$ edges is
$P(|E| = m) = \binom{\binom{n}{2}}{m} p^m (1-p)^{\binom{n}{2} - m}$
which is the binomial distribution

Proof:
– $m$ edges are selected from the $\binom{n}{2}$ possible edges.
– These $m$ edges are formed with probability $p^m$ and the other edges are not formed (to guarantee the existence of only $m$ edges) with probability $(1-p)^{\binom{n}{2} - m}$
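The three G(n, p) propositions (expected degree, expected number of edges, and the binomial probability of observing $m$ edges) can be checked numerically. This sketch uses hypothetical function names and a small example, $n = 6$ and $p = 0.3$, chosen only for illustration.

```python
from math import comb

def expected_degree(n, p):
    """Expected degree c = (n - 1) * p."""
    return (n - 1) * p

def expected_edges(n, p):
    """Expected number of edges C(n, 2) * p."""
    return comb(n, 2) * p

def prob_m_edges(n, p, m):
    """Binomial probability that G(n, p) has exactly m edges."""
    total = comb(n, 2)
    return comb(total, m) * p ** m * (1 - p) ** (total - m)

n, p = 6, 0.3
pmf = [prob_m_edges(n, p, m) for m in range(comb(n, 2) + 1)]
mean_edges = sum(m * q for m, q in enumerate(pmf))
print(expected_degree(n, p), expected_edges(n, p), mean_edges)
```

The probabilities sum to 1 and their mean agrees with the expected number of edges, as the propositions require.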
Evolution of Random Graphs
• In random graphs, when nodes form connections, after some time a large fraction of nodes get connected.
• This large fraction forms a connected component, commonly called the largest connected component or the giant component.
– In random graphs, as we increase $p$, a large fraction of nodes start getting connected
– i.e., we have a path between any pair of them
• In random graphs [G(n, p)]:
– When $p = 0$, the size of the giant component is 0
– When $p = 1$, the size of the giant component is $n$ (all pairs are connected)
The Giant Component
• Example: Evolution of Random Graphs.
• Here, p is the random graph generation probability, c is the average degree, ds is
the diameter size, slc is the size of the largest component, and l is the average path
length.
• The highlighted column denotes phase transition in the random graph
(18/12=1.5, 110/42=2.6)
1st Phase Transition (Rise of the Giant
Component)
• Phase Transition: The point where the diameter value starts to shrink in a random graph
– We have other phase transitions in random graphs
• E.g., when the graph becomes connected
• In this phase transition we focus on what happens when
– the average node degree $c = 1$ (or when $p = 1/(n-1)$)
• At this Phase Transition:
1. The giant component, which just started to appear, starts to
grow, and
2. The diameter, which just reached its maximum value, starts
decreasing.
Random Graphs
If $c < 1$:
– small, isolated clusters
– small diameters
– short path lengths

At $c = 1$:
– a giant component appears
– diameter peaks
– path lengths are long

For $c > 1$:
– almost all nodes connected
– diameter shrinks
– path lengths shorten
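The rise of the giant component can be observed in a small simulation. The sketch below is illustrative only: it generates G(n, p) with $p = c/(n-1)$ for a chosen average degree $c$ and tracks components with a simple union-find structure. The function name and the values $n = 500$ and seed 0 are arbitrary assumptions.

```python
import random
from itertools import combinations

def largest_component_fraction(n, c, seed=0):
    """Sample G(n, p) with p = c / (n - 1) and return the fraction of
    nodes that lie in the largest connected component."""
    rng = random.Random(seed)
    p = c / (n - 1)
    parent = list(range(n))

    def find(x):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            parent[find(u)] = find(v)  # merge the two components

    sizes = {}
    for x in range(n):
        root = find(x)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values()) / n

for c in (0.5, 1.0, 2.0, 4.0):
    print(c, largest_component_fraction(500, c))
```

For average degrees well below 1 the largest component holds only a small fraction of the nodes, while for degrees well above 1 it covers most of the graph.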
Why $c = 1$?

It is proven that in random graphs phase transition occurs when $c = 1$; that is, $p = 1/(n-1)$
Proposition: In random graphs, phase transition happens at $c = 1$.
Proof: Consider a random graph with expected node degree $c$.
• In this graph,
– Consider any connected set of nodes $S$;
– Let $\bar{S} = V - S$ denote the complement set; and
– Assume $|S| \ll |\bar{S}|$
• For any node $v$ in $S$
– If we move one hop away from $v$, we visit approximately $c$ nodes.
• If we move one hop away from the nodes in $S$,
– we visit approximately $|S|c$ nodes.
• If $S$ is small, the nodes in $S$ only visit nodes in $\bar{S}$, and when moving one hop away from $S$, the set of nodes guaranteed to be connected gets larger by a factor $c$.
• The connected set of visited nodes gets $c^2$ times larger when moving two hops, and so on.
• In the limit, if we want this connected component to become the largest component, then after traveling $n$ hops its size must grow, and we must have $c^n \ge 1$, or equivalently $c \ge 1$
Properties of Random Graphs - Degree Distribution

• When computing the degree distribution, we estimate the probability of observing $P(d_v = d)$ for node $v$
• Proposition: For a graph generated by G(n, p), node $v$ has degree $d$, $d \le n-1$, with probability
$P(d_v = d) = \binom{n-1}{d} p^d (1-p)^{n-1-d}$
• This is a binomial degree distribution. In the limit (i.e., $n \to \infty$) this will become the Poisson degree distribution
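To see the binomial degree distribution approach the Poisson one numerically, the following sketch compares the two for a fixed expected degree $c = (n-1)p$. The function names and the choice $c = 4$ are assumptions made for illustration.

```python
from math import comb, exp, factorial

def binomial_degree_prob(n, p, d):
    """P(d_v = d) in G(n, p): binomial with n - 1 trials."""
    return comb(n - 1, d) * p ** d * (1 - p) ** (n - 1 - d)

def poisson_degree_prob(c, d):
    """Poisson probability with mean c."""
    return exp(-c) * c ** d / factorial(d)

def max_gap(n, c, max_degree=15):
    """Largest pointwise difference between the two distributions."""
    p = c / (n - 1)
    return max(
        abs(binomial_degree_prob(n, p, d) - poisson_degree_prob(c, d))
        for d in range(max_degree + 1)
    )

c = 4.0
for n in (20, 200, 10_000):
    print(n, max_gap(n, c))
```

The gap shrinks as $n$ grows while $c$ is held fixed, which is the limit the slide refers to.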
Binomial Distribution in the Limit

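Poisson distribution: with $c = (n-1)p$ held fixed, $\binom{n-1}{d} p^d (1-p)^{n-1-d} \to e^{-c} \frac{c^d}{d!}$ as $n \to \infty$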
2nd Phase Transition (Connectivity)
- When the graph is connected there are no nodes with degree 0
- So, the probability of a node having degree 0, $P(d_v = 0) = (1-p)^{n-1} \approx e^{-c}$, should be less than $1/n$
Expected Local Clustering Coefficient
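• Proposition: In a random graph generated by G(n, p), the expected local clustering coefficient for node $v$ is $p$
Proof: The local clustering coefficient for node $v$ is
$C(v) = \frac{\text{number of connected pairs of } v\text{'s neighbors}}{\text{number of pairs of } v\text{'s neighbors}}$
• $v$ can have different degrees depending on the edges that are formed randomly, so the expected value is
$E(C(v)) = \sum_{d=0}^{n-1} E(C(v) \mid d_v = d) \, P(d_v = d)$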
Expected Local Clustering Coefficient (Cont'd)
Alternatively,
• LCC(node i) = (number of links between the $k$ neighbours of node i) / (possible number of links between the $k$ neighbours of node i)
= (expected number of links between the $k$ neighbours of node i) / $\binom{k}{2}$
= $\binom{k}{2} p \, / \binom{k}{2}$
= $p$
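The result that the expected local clustering coefficient equals $p$ can be checked empirically. This sketch builds one G(n, p) sample and averages the local clustering coefficient over all nodes with at least two neighbours. The function names and the values $n = 200$, $p = 0.1$, and seed 0 are arbitrary assumptions.

```python
import random
from itertools import combinations

def gnp_adjacency(n, p, seed=0):
    """Adjacency sets of one G(n, p) sample."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj

def local_clustering(adj, v):
    """Links among v's neighbours divided by the number of possible links.
    Returns None when v has fewer than two neighbours."""
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return None
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return links / (k * (k - 1) / 2)

adj = gnp_adjacency(200, 0.1)
values = [c for c in (local_clustering(adj, v) for v in adj) if c is not None]
average = sum(values) / len(values)
print(average)  # should land close to p = 0.1
```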
Global Clustering Coefficient
Proposition: The global clustering coefficient of a random graph generated by G(n, p) is $p$

Proof.
– The global clustering coefficient defines the probability of two neighbors of the same node being connected.
– In a random graph, this probability is the same for any two nodes
• It equals the generation probability $p$, which determines the probability of two nodes getting connected
• In random graphs, the expected local clustering coefficient is equivalent to the global clustering coefficient.
• By appropriately selecting $p$, we can generate networks with a high clustering coefficient.
• However, selecting a large $p$ is undesirable because doing so will generate a very dense graph, which is unrealistic, as real-world networks are often sparse.
Average Path Length
Proposition: The average path length $l$ in a random graph is
$l \approx \frac{\ln |V|}{\ln c}$
Proof:
• Let $D$ denote the expected diameter size of the graph
• Starting with any node and the expected degree $c$,
– one can visit approximately $c$ nodes by traveling one edge
– $c^2$ nodes by traveling two edges, and
– $c^D$ nodes by traveling diameter number of edges
• After this step, almost all nodes should be visited. In this case, we have $c^D \approx |V|$
• In random graphs, the expected diameter size tends to the average path length in the limit. Using this fact, we have $l \approx D = \frac{\ln |V|}{\ln c}$
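As a worked example of the estimate $l \approx \ln|V| / \ln c$, the sketch below plugs in the May 2011 Facebook figures quoted earlier in these slides (721 million users, 190 friends on average). The function name is hypothetical, and treating Facebook as a random graph is exactly the simplifying assumption under test.

```python
from math import log

def estimated_average_path_length(num_nodes, avg_degree):
    """Random-graph estimate l = ln|V| / ln c."""
    return log(num_nodes) / log(avg_degree)

estimate = estimated_average_path_length(721_000_000, 190)
print(round(estimate, 2))
```

The estimate comes out a little under 4, in the same range as the measured average path length of 4.7 reported for the Facebook graph at that time.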
Modeling with Random Graphs

• Given a real-world network, we can simulate it using a random graph model.
• Compute the average degree $c$ in the real-world graph
• Compute $p$ using $c = (n-1)p$
• Generate the random graph G(n, p) using $p$ and the number of nodes $n$ in the given network.

• How representative is the generated graph? (continued in next slide)


– [Degree Distribution] Random graphs do not have a power-law degree distribution
– [Average Path Length] Random graphs perform well in modeling the average path
lengths
– [Clustering Coefficient] Random graphs drastically underestimate the clustering
coefficient
• To tackle this issue, let us study the small-world model.
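The parameter-fitting step of this recipe (average degree first, then $p$) can be sketched as follows, assuming the real-world network is given as an undirected edge list. The function name `gnp_parameters` and the toy edge list are hypothetical.

```python
def gnp_parameters(num_nodes, edges):
    """Return (c, p) for simulating an observed undirected network.

    c is the average degree (each edge adds 1 to two nodes' degrees),
    and p follows from c = (n - 1) * p.
    """
    avg_degree = 2 * len(edges) / num_nodes
    p = avg_degree / (num_nodes - 1)
    return avg_degree, p

# Toy network: a 4-cycle plus one chord.
toy_edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
c, p = gnp_parameters(4, toy_edges)
print(c, p)
```

The resulting $p$, together with the number of nodes, is all that the G(n, p) model needs.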
Modeling with Random Graphs:
Real-World Networks / Simulated Random Graphs

Small-World Model
– Small-world model
• or the Watts-Strogatz (WS) model
• A special type of random graph
• Exhibits small-world properties:
– Short average path length
– High clustering coefficient

• It was proposed by Duncan J. Watts and Steven Strogatz in their joint 1998 Nature paper
Watts, Duncan J., and Steven H. Strogatz. "Collective dynamics of ‘small-world’ networks." Nature 393.6684 (1998): 440-442.
Small-world Model
• In real-world interactions, many individuals have a limited and often, at least, a fixed number of connections.
• Individuals connect with their parents, brothers, sisters,
grandparents, and teachers, among others.
• Thus, instead of assuming random connections, one can
assume an egalitarian model in real-world networks, where
people have the same number of neighbors (friends).
• This again is unrealistic;
• However, it models more accurately the clustering
coefficient of real-world networks.
• In graph theory terms, this assumption is equivalent to
embedding users in a regular network
• A regular (ring) lattice is a special case of regular networks where there exists a certain pattern on how ordered nodes are connected to one another
Figure: Regular Lattice of Degree 4
• In a regular lattice of degree $c$, nodes are connected to their previous $c/2$ and following $c/2$ neighbors
• Formally, for node set $V = \{v_1, v_2, \ldots, v_n\}$, an edge exists between node $v_i$ and $v_j$ if and only if $0 < \min(n - |i-j|, |i-j|) \le c/2$
Generating a Small-World Graph
• The lattice has a high, but fixed, clustering coefficient
• The clustering coefficient takes the value $\frac{3(c-2)}{4(c-1)} \approx \frac{3}{4}$
• The lattice has a high average path length
• To overcome these problems, the proposed small-world model lies between the regular lattice and the random network.
• In the small-world model, assume a parameter $\beta$ that controls randomness in the model
– When $\beta$ is 0, the model is basically a regular lattice
– When $\beta = 1$, the model becomes a random graph
• The model starts with a regular lattice and continues adding random edges through rewiring
– Rewiring: take an edge and change one of its endpoints randomly
Constructing Small World Networks

• The procedure creates new edges by a process called rewiring.
• Rewiring replaces an existing edge between nodes $v_i$ and $v_j$ with a nonexisting edge between $v_i$ and $v_k$ with probability $\beta$.
– An edge is disconnected from one of its endpoints $v_j$ and connected to a new endpoint $v_k$. Node $v_k$ is selected uniformly at random.
• As in many network generating algorithms
– Disallow self-edges
– Disallow multiple edges
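The construction can be sketched as below. This is one reasonable reading of the rewiring procedure, not necessarily the exact algorithm on the slides: every lattice edge is visited once, and with probability `beta` its second endpoint is moved to a uniformly chosen node that creates neither a self-edge nor a multiple edge. All function names are hypothetical.

```python
import random
from itertools import combinations

def ring_lattice(n, c):
    """Adjacency sets of a ring lattice: every node is linked to its
    c/2 nearest neighbours on each side (c is assumed even)."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for step in range(1, c // 2 + 1):
            u = (v + step) % n
            adj[v].add(u)
            adj[u].add(v)
    return adj

def rewire(adj, beta, seed=0):
    """Rewire each original edge (u, v) with probability beta, in place."""
    rng = random.Random(seed)
    n = len(adj)
    edges = [(u, v) for u in adj for v in adj[u] if u < v]
    for u, v in edges:
        if rng.random() >= beta:
            continue  # this edge is left in place
        # New endpoints that avoid self-edges and multiple edges.
        candidates = [w for w in range(n) if w != u and w not in adj[u]]
        if not candidates:
            continue
        w = rng.choice(candidates)
        adj[u].discard(v)
        adj[v].discard(u)
        adj[u].add(w)
        adj[w].add(u)
    return adj

def average_clustering(adj):
    """Mean local clustering coefficient (nodes with degree < 2 count as 0)."""
    total = 0.0
    for v, nbrs in adj.items():
        k = len(nbrs)
        if k >= 2:
            links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
            total += links / (k * (k - 1) / 2)
    return total / len(adj)

lattice = ring_lattice(20, 4)
print(average_clustering(lattice))               # lattice of degree 4
print(average_clustering(rewire(ring_lattice(20, 4), 0.3)))
```

For degree 4 the lattice value is 0.5, matching the closed form $3(c-2)/(4(c-1))$. Rewiring preserves the number of edges because each rewired edge is replaced by exactly one new edge.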
Small-World Model – Properties - Degree Distribution

• The degree distribution for the small-world model is
$P(d_v = d) = \sum_{i=0}^{\min(d - c/2, \, c/2)} \binom{c/2}{i} (1-\beta)^i \beta^{c/2-i} \frac{(\beta c/2)^{d-c/2-i}}{(d-c/2-i)!} e^{-\beta c/2}$
– where $P(d_v = d)$ is the probability of observing degree $d$ for node $v$.
– This degree distribution (stated here without proof) is quite similar to the Poisson degree distribution observed in random graphs
• In practice, in the graph generated by the small-world model, most nodes have similar degrees due to the underlying lattice.
Regular Lattice vs. Random Graph
• Regular Lattice:
• Clustering Coefficient (high): $\frac{3(c-2)}{4(c-1)} \approx \frac{3}{4}$
• Average Path Length (high): $n/(2c)$
• Random Graph:
• Clustering Coefficient (low): $p$
• Average Path Length (ok!): $\ln |V| / \ln c$
Clustering Coefficient for Small-world model

• The Clustering Coefficient (CC) for a small-world network is a value between the CC of a Regular Lattice and the CC of a Random Graph, depending on $\beta$
• Commonly, the clustering coefficient for a regular lattice is represented using $C(0)$, and the clustering coefficient for a small-world model with $\beta = p$ is represented as $C(p)$.
• The relation between the two values can be computed analytically; it has been proven that
$C(p) \approx (1-p)^3 C(0)$
• The intuition behind this relation is that because the clustering coefficient enumerates the number of closed triads in a graph,
– we are interested in triads that are still left connected after the rewiring process.
• For a triad to stay connected, none of its three edges may be rewired; each edge is left in place with probability $(1-p)$.
• Since the process is performed independently for each edge, the probability of observing triads is $(1-p)^3$ times the probability of observing them in a regular lattice.
• We also need to take into account new triads that are formed by the rewiring process; however, that probability is nominal and hence negligible.
Clustering Coefficient for Small-world model

• The probability that a connected triple stays connected after rewiring consists of
1. The probability that none of the 3 edges were rewired, which is $(1-p)^3$
2. The probability that other edges were rewired back to form a connected triple
• Very small and can be ignored
• Clustering coefficient: $C(p) \approx (1-p)^3 C(0)$
Regular Lattice vs. Random Graph, What
happens in Between?
• Regular Lattice:
• Clustering Coefficient (high): $\frac{3(c-2)}{4(c-1)} \approx \frac{3}{4}$
• Average Path Length (high): $n/(2c)$
• Random Graph:
• Clustering Coefficient (low): $p$
• Average Path Length (ok!): $\ln |V| / \ln c$
• Does smaller average path length mean smaller clustering coefficient?
• Does larger average path length mean larger clustering coefficient?
• Numerical simulation:
• We increase $p$ (i.e., $\beta$) from 0 to 1
• Assume
• $L(0)$ is the average path length of the regular lattice
• $C(0)$ is the clustering coefficient of the regular lattice
• For any $p$, $L(p)$ denotes the average path length of the small-world graph and $C(p)$ denotes its clustering coefficient
• Observations:
• Fast decrease of average distance
• Slow decrease in clustering coefficient
Change in Clustering Coefficient /Avg. Path Length

• The graph depicts the value of $C(p)/C(0)$ for different values of $p$.
• As shown in the figure, the value of $C(p)$ stays high until $p$ reaches 0.1 (10% rewired) and then decreases rapidly to a value around zero.
• Since models with a high clustering coefficient and small average path length are desired, values in the range $0.01 \le \beta = p \le 0.1$ are preferred.
Modeling with the Small-World Model
• Given a real-world network in which the average degree is $c$ and the clustering coefficient is $C$,
• we set $C(p) = C$ and determine $\beta$ ($= p$) using the equation $C(p) \approx (1-p)^3 C(0)$
• Given $\beta$, $c$, and $n$ (size of the real-world network), we can simulate the small-world model
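Solving $C(p) \approx (1-p)^3 C(0)$ for the rewiring parameter gives $\beta = 1 - (C / C(0))^{1/3}$. The sketch below assumes the standard closed form $C(0) = 3(c-2)/(4(c-1))$ for the ring lattice; the function names are hypothetical.

```python
def lattice_clustering(c):
    """Clustering coefficient C(0) of a ring lattice of degree c
    (assumed closed form 3(c - 2) / (4(c - 1)))."""
    return 3 * (c - 2) / (4 * (c - 1))

def beta_for_clustering(c, target_clustering):
    """Invert C(p) = (1 - p)**3 * C(0) to find the rewiring probability."""
    c0 = lattice_clustering(c)
    if not 0 < target_clustering <= c0:
        raise ValueError("target must lie in (0, C(0)]")
    return 1 - (target_clustering / c0) ** (1 / 3)

# Example: degree 4 and an observed clustering coefficient of 0.3645.
print(beta_for_clustering(4, 0.3645))
```

A target above $C(0)$ is rejected because rewiring can only lower the clustering coefficient under this approximation.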
Real-World Network and Simulated Graphs

• The small-world model is still incapable of generating a realistic degree distribution in the
simulated graph.
• To generate scale-free networks (i.e., with a power-law degree distribution), the preferential
attachment model is introduced.
Small-World Model
• SW networks have:
– High clustering coefficients – introduced by
“ring regularity”
– Large average diameters of regular lattices –
fixed by randomly re-wiring a small percentage
of edges
Preferential Attachment Model
• Main assumption:
– When a new user joins the network, the probability of connecting to existing nodes is proportional to existing nodes’ degrees
– For the new node $v$
• Connect to a random node $v_i$ with probability $P(v_i) = \frac{d_i}{\sum_j d_j}$
– Proposed by Albert-László Barabási and Réka Albert
• Distribution of wealth in the society:
– The rich get richer
• The higher the node’s degree, the higher the probability of new nodes getting connected to it.
• Barabási, Albert-László, and Réka Albert. "Emergence of scaling in random networks." Science 286.5439 (1999): 509-512.
• When new nodes are added to networks, they are more likely to connect to existing nodes that many others have connected to.
Preferential Attachment: Example
Node $v$ arrives. The existing nodes 1 to 5 are chosen with probabilities:
$P(1) = 1/7$, $P(2) = 1/7$, $P(3) = 2/7$, $P(4) = 0$, $P(5) = 3/7$
Preferential Attachment - Algorithm
• The algorithm starts with a graph containing a small set of nodes m0 and
then adds new nodes one at a time.
• Each new node gets to connect to m <= m0 other nodes, and each
connection to existing node vi depends on the degree of vi
• The model incorporates two ingredients to achieve a scale-free network
(1) The growth element and
(2) The preferential attachment element
• The growth is realized by adding nodes as time goes by.
• The preferential attachment is realized by connecting to node $v_i$ based on its degree probability, $P(v_i) = \frac{d_i}{\sum_j d_j}$
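A minimal sketch of the algorithm follows. The slides do not say how the initial $m_0$ nodes are wired, so this sketch assumes they form a ring, which gives every starting node a nonzero degree. Degree-proportional selection is done by drawing uniformly from a list in which each node appears once per incident edge. The function name is hypothetical.

```python
import random

def preferential_attachment(n, m0, m, seed=0):
    """Grow a graph to n nodes; each new node links to m distinct
    existing nodes chosen with probability proportional to degree.

    Assumption: the m0 initial nodes are connected in a ring (m0 >= 3).
    """
    if m0 < 3 or not (1 <= m <= m0 <= n):
        raise ValueError("need 3 <= m0 <= n and 1 <= m <= m0")
    rng = random.Random(seed)
    edges = [(i, (i + 1) % m0) for i in range(m0)]
    # Each node appears once per incident edge, so a uniform draw from
    # this list picks node v_i with probability d_i / sum_j d_j.
    endpoints = [v for edge in edges for v in edge]
    for new in range(m0, n):
        targets = set()
        while len(targets) < m:  # redraw until m distinct targets
            targets.add(rng.choice(endpoints))
        for t in targets:
            edges.append((new, t))
            endpoints.extend((new, t))
    return edges

edges = preferential_attachment(200, 5, 3)
degree = {}
for u, v in edges:
    degree[u] = degree.get(u, 0) + 1
    degree[v] = degree.get(v, 0) + 1
print(max(degree.values()), min(degree.values()))
```

Growth comes from the outer loop adding one node per step, and preferential attachment comes from the degree-weighted draw.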
Constructing Scale-free Networks
Properties of the Preferential Attachment
Model

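• Degree Distribution: $P(d) = \frac{2m^2}{d^3}$

• Clustering Coefficient: $C = \frac{m_0 - 1}{8} \frac{(\ln t)^2}{t}$, where $t$ is the number of time steps

• Average Path Length: $l \sim \frac{\ln |V|}{\ln(\ln |V|)}$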

Modeling with the Preferential Attachment
Model
• Similar to random graphs, we can simulate real-world networks by generating a preferential attachment model, setting the expected degree $m$
Real-World Networks and Simulated Graphs
Unpredictability of the Rich-Get-Richer
Effects
• The initial stages of one’s rise to popularity are fragile
• Once a user is well established, the rich-get-richer dynamics of popularity is likely to push the user even higher
• But getting the rich-get-richer process started in the first place is full of potential accidents and near-misses
• If we could roll time back to 1997, and then run history forward again, would the Harry Potter books again sell hundreds of millions of copies?
• See more: Salganik, Matthew J., Peter Sheridan Dodds, and Duncan J. Watts. "Experimental study of inequality and unpredictability in an artificial cultural market." Science 311.5762 (2006): 854-856.
Network (graph) model
