
Models & Algorithms

for Internet
Computing
Lecture 9: RANDOM NETWORK MODELS
A nice book to discover the field

• Markov chain and basic concepts


• Page Rank Model and Search Engine
• Stationary Distribution

Probability for Computing 3


What is a network (graph) model?
Network models
A brief cover on:
I. Erdos–Renyi random graphs
II. Generalized random graphs
with the same degree distribution as the data networks
III. Small-world networks
IV. Scale-free networks
V. Hierarchical model
VI. Geometric random graphs
The E-R model
Erdos–Renyi random graphs (ER)
• Model a data network G(V,E) with |V|=n and |E|=m
• An ER graph that models G is constructed as follows:
• It has n nodes
• Edges are added between pairs of nodes uniformly at
random with the same probability p
• Two (equivalent) methods for constructing ER graphs:
• Gn,p: pick p so that the resulting model network has m
edges
• Gn,m: pick randomly m pairs of nodes and add edges
between them with probability 1
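The two constructions above can be sketched as follows. This is a minimal illustration, not a reference implementation; the function names `gnp` and `gnm` are hypothetical, and p is chosen so that the expected edge count equals m.

```python
import itertools
import random

def gnp(n, p, rng=random):
    """G(n,p): include each of the n(n-1)/2 possible edges independently with probability p."""
    return [(u, v) for u, v in itertools.combinations(range(n), 2)
            if rng.random() < p]

def gnm(n, m, rng=random):
    """G(n,m): choose exactly m distinct node pairs uniformly at random."""
    all_pairs = list(itertools.combinations(range(n), 2))
    return rng.sample(all_pairs, m)

n, m = 50, 100
p = m / (n * (n - 1) / 2)   # pick p so that the expected number of edges is m
print(len(gnp(n, p)), len(gnm(n, m)))
```

G(n,m) always has exactly m edges, while G(n,p) has m edges only on average.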
Erdos–Renyi random graphs (ER)
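• Number of edges, |E|=m, in Gn,p is (in expectation):
$$m = p \, \frac{n(n-1)}{2}$$
• Average degree is:
$$\langle k \rangle = \frac{2m}{n} = p(n-1) \approx pn$$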

Erdos–Renyi random graphs (ER)
• Many properties of ER can be proven theoretically
(See: Bollobas, "Random Graphs," 2002)

• Example:
• When m = n/2 (i.e., average degree 1), the giant component
suddenly emerges, i.e.:
• One connected component of the network has
O(n) nodes
• The next largest connected component has
O(log(n)) nodes
DEGREE DISTRIBUTION OF A RANDOM GRAPH

$$P(k) = \binom{N-1}{k} p^k (1-p)^{(N-1)-k}$$

• $\binom{N-1}{k}$: select k nodes from N-1
• $p^k$: probability of having k edges
• $(1-p)^{(N-1)-k}$: probability of missing N-1-k edges

As the network size increases, the distribution becomes increasingly
narrow: we are increasingly confident that the degree of a node is in the
vicinity of <k>.
Network Science: Random Graphs
Erdos–Renyi random graphs (ER)
• The degree distribution is binomial:
$$P(k) = \binom{n-1}{k} p^k (1-p)^{n-1-k}$$
• For large n, this can be approximated with a
Poisson distribution:
$$P(k) = \frac{z^k e^{-z}}{k!}$$
where z is the average degree


• However, many real world networks have
power-law degree distribution
Erdos–Renyi random graphs (ER)

• Clustering coefficient, C, of ER is low (for low p)

• C=p, since probability p of connecting any two
nodes in an ER graph is the same, regardless of
whether the two nodes share a common neighbor

• However, many real world networks have high
clustering coefficients
Erdos–Renyi random graphs (ER)
• Average diameter of ER graphs is small
• It is approximately $\ln n / \ln z$, where z is the average degree

• Real networks also have small average diameters

• Summary
Generalized random graphs (ER-DD)
• Preserve the degree distribution of data
(“ER-DD”)

• Constructed as follows:
• An ER-DD network has n nodes
(so does the data)
• Edges are added between pairs of nodes using
the “stubs method” [configuration model
discussed earlier]
Generalized random graphs (ER-DD)
• The “stubs method” for constructing ER-DD
graphs:
• The number of “stubs” (to be filled by edges) is
assigned to each node in the model network
according to the degree distribution of the real
network to be modeled
• Edges are created between pairs of nodes with
“available” stubs picked at random
• After an edge is created, the number of stubs left
available at the corresponding “end nodes” of the
edges is decreased by one
• Multiple edges between the same pair of nodes are
not allowed
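A minimal sketch of the stubs method described above. The function name `stubs_graph` and the `max_tries` cut-off are assumptions made for illustration; the sketch also rejects self-loops, and it may leave a few stubs unmatched when only forbidden pairs remain.

```python
import random

def stubs_graph(degrees, rng=random, max_tries=1000):
    """Pair up stubs at random, skipping self-loops and multiple edges.

    Returns a set of edges (u, v) with u < v. Realized degrees never
    exceed the requested ones, but can fall short if pairing gets stuck.
    """
    # One list entry per stub: node i appears degrees[i] times.
    stubs = [node for node, d in enumerate(degrees) for _ in range(d)]
    edges = set()
    tries = 0
    while len(stubs) >= 2 and tries < max_tries:
        i, j = rng.sample(range(len(stubs)), 2)
        u, v = sorted((stubs[i], stubs[j]))
        if u == v or (u, v) in edges:
            tries += 1          # forbidden pair: try another one
            continue
        edges.add((u, v))
        for idx in sorted((i, j), reverse=True):
            stubs.pop(idx)      # both end nodes lose one available stub
        tries = 0
    return edges

print(sorted(stubs_graph([3, 2, 2, 2, 1, 1, 1])))
```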
Generalized random graphs (ER-DD)

• Summary

• 2 global network properties are matched by ER-DD


Small-world networks (SW)
• Watts and Strogatz, 1998
• Created from regular ring lattices by random rewiring
of a small percentage of their edges
Small-world networks (SW)
• SW networks have:
• High clustering coefficients – introduced by “ring
regularity”
• Large average diameters of regular lattices – fixed
by randomly re-wiring a small percentage of edges
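The rewiring idea can be sketched as below. This is a simplified variant under stated assumptions, not necessarily the exact Watts-Strogatz procedure: the names `ring_lattice` and `small_world` are hypothetical, k is assumed even and smaller than n, and each lattice edge has its far end rewired with probability `beta`.

```python
import random

def ring_lattice(n, k):
    """Each node links to its k nearest neighbours, k/2 on each side (k even, k < n)."""
    return {(u, (u + j) % n) for u in range(n) for j in range(1, k // 2 + 1)}

def small_world(n, k, beta, rng=random):
    """Rewire the far end of each lattice edge with probability beta."""
    edges = ring_lattice(n, k)
    present = {frozenset(e) for e in edges}
    for u, v in sorted(edges):
        if rng.random() < beta:
            # New end point: not u itself and not already linked to u.
            candidates = [w for w in range(n)
                          if w != u and frozenset((u, w)) not in present]
            if candidates:
                w = rng.choice(candidates)
                present.remove(frozenset((u, v)))
                present.add(frozenset((u, w)))
    return present

print(len(small_world(20, 4, 0.1)))   # the edge count is preserved by rewiring
```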

• Summary
Scale-free networks (SF)
• Power-law degree distributions: $P(k) \sim k^{-\gamma}$
• γ > 0; typically 2 < γ < 3
Scale-free networks (SF)
• Different models exist, e.g.:

• A popular one is:


• Preferential Attachment Model (SF-BA)
(Barabasi-Albert, 1999)
Scale-free networks (SF)
• Preferential Attachment Model (SF-BA)
• “Growth” model: nodes are added to an existing
network
• New nodes preferentially attach to existing nodes with
probability proportional to the degrees of the existing
nodes; e.g.:
• This is repeated until the size of SF network matches
the size of the data
• “Rich getting richer”
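A minimal sketch of the growth process above. The seed graph (a complete graph on m+1 nodes) and the function name `barabasi_albert` are assumptions for illustration; the slides do not specify how the initial network is chosen.

```python
import random

def barabasi_albert(n, m, rng=random):
    """Grow a network to n nodes; each new node attaches to m distinct
    existing nodes chosen with probability proportional to their degree."""
    # Assumed seed: complete graph on m + 1 nodes, so every node has degree >= 1.
    edges = [(u, v) for u in range(m + 1) for v in range(u)]
    # 'ends' lists each node once per incident edge, so a uniform draw
    # from it is a degree-proportional draw over nodes.
    ends = [x for e in edges for x in e]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(ends))
        for t in targets:
            edges.append((new, t))
            ends.extend((new, t))
    return edges

edges = barabasi_albert(100, 2)
print(len(edges))
```

Keeping the `ends` list avoids recomputing degrees at every step, which is one common way to implement "rich getting richer".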
Scale-free networks (SF)
• Summary
Hierarchical model
• Preserves network “modularity” via a fractal-
like generation of the network
Hierarchical model
• These graphs do not match any biological data
and are highly unlikely to be found in data sets
Geometric random graphs
• “Uniform” geometric random graphs (GEO)
• Take any metric space and, using a uniform random
distribution, place nodes within the space
• If any two nodes are within distance r of each other (calculated via any
chosen distance norm for the space), they will be
connected
• Choose r so that the size of the GEO network matches
that of the data
• There are many possible metric spaces (e.g., Euclidean
space)
• There are many possible distance norms
(e.g. the Euclidean distance, the Chessboard distance,
and the Manhattan/Taxi Driver distance)
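A minimal sketch of a uniform geometric random graph, assuming the unit square as the metric space and the Euclidean distance as the norm; the function name `geometric_graph` is hypothetical.

```python
import itertools
import math
import random

def geometric_graph(n, r, rng=random):
    """Place n points uniformly in the unit square; connect pairs within distance r."""
    pts = [(rng.random(), rng.random()) for _ in range(n)]
    edges = [(i, j) for i, j in itertools.combinations(range(n), 2)
             if math.dist(pts[i], pts[j]) <= r]
    return pts, edges

pts, edges = geometric_graph(100, 0.15)
print(len(edges))   # tune r until the edge count matches the data
```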
Geometric random graphs
• “Uniform” geometric random graphs (GEO)

• Summary
Stochastic Block Models

• M matrix (constant p) → Erdos-Renyi model


• M matrix (not constant) → Erdos-Renyi within each community; random bipartite graph across
communities.
• Vertices within a community are considered exchangeable (i.e. probabilistically equivalent
with respect to their interactions with other vertices)
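A minimal sketch of sampling an undirected SBM, assuming M is a symmetric matrix of edge probabilities between communities; the names `sbm` and `block_of` are hypothetical.

```python
import itertools
import random

def sbm(block_of, M, rng=random):
    """block_of[i] is the community of node i; M[a][b] is the probability
    of an edge between a node in community a and a node in community b."""
    n = len(block_of)
    return [(u, v) for u, v in itertools.combinations(range(n), 2)
            if rng.random() < M[block_of[u]][block_of[v]]]

block_of = [0] * 5 + [1] * 5
assortative = [[0.8, 0.05], [0.05, 0.8]]      # dense within, sparse across
disassortative = [[0.05, 0.8], [0.8, 0.05]]   # sparse within, dense across
print(len(sbm(block_of, assortative)), len(sbm(block_of, disassortative)))
```

With all entries of M equal to the same p, the sampler reduces to the Erdos-Renyi model.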
SBM: Representation and Instantiation
SBM: Assortative Networks
SBM: Disassortative Networks
SBM contd.
• A number of other composable models can be viewed as
Stochastic Block Models
• One may compose/create multiple generative models in this
fashion

• Can handle directed networks (M is not symmetric in this case)


• Can potentially model all three properties of real networks one
is often interested in!
Random network
models
MORE ON ERDOS-RENYI RANDOM GRAPH
RANDOM NETWORK MODEL

Pál Erdös (1913-1996), Alfréd Rényi (1921-1970)

Erdös-Rényi model (1960)

Connect with probability p

p=1/6, N=10, <k> ~ 1.5

Network Science: Random Graphs 2012


RANDOM NETWORK MODEL

Definition:

A random graph is a graph of N labeled nodes where each
pair of nodes is connected with a preset probability p.

We will call it G(N, p).


RANDOM NETWORK MODEL

p=1/6, N=12


RANDOM NETWORK MODEL

p=0.03
N=100

Note: No node has a very high degree. Rather, it is very unlikely for one
node to have a very high degree. Why? (HW question)



RANDOM NETWORK MODEL

N and p do not uniquely define the network: we can have many
different realizations of it. How many?

N=10, p=1/6

The probability to form a particular graph G(N,p) with L links is

$$P(G(N,p)) = p^L (1-p)^{\frac{N(N-1)}{2} - L}$$

That is, each graph G(N,p) appears with probability P(G(N,p)).


RANDOM NETWORK MODEL

P(L): the probability to have exactly L links in a network of N nodes and
probability p:

$$P(L) = \binom{\frac{N(N-1)}{2}}{L} p^L (1-p)^{\frac{N(N-1)}{2} - L}$$

• $\frac{N(N-1)}{2}$: the maximum number of links in a network of N nodes.
• The binomial coefficient: the number of different ways we can
choose L links among all potential links.

Binomial distribution...


MATH TUTORIAL the mean of a binomial distribution

For $X \sim \mathrm{Binomial}(n, p)$: $E(X) = \sum_{x=0}^{n} x \binom{n}{x} p^x (1-p)^{n-x} = np$

There is a faster way using generating functions, see:
http://planetmath.org/encyclopedia/BernoulliDistribution2.html
MATH TUTORIAL the variance of a binomial distribution

$$\sigma^2(X) = E(X^2) - [E(X)]^2$$

http://keral2008.blogspot.com/2008/10/derivation-of-mean-and-variance-of.html
MATH TUTORIAL Binomial distribution: the bottom line

$E(X) = np$, $\sigma^2(X) = np(1-p)$

http://keral2008.blogspot.com/2008/10/derivation-of-mean-and-variance-of.html
RANDOM NETWORK MODEL

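P(L): the probability to have a network of
exactly L links

$$P(L) = \binom{\frac{N(N-1)}{2}}{L} p^L (1-p)^{\frac{N(N-1)}{2} - L}$$

•The average number of links <L> in a random
graph

$$\langle L \rangle = p \, \frac{N(N-1)}{2}$$

•The standard deviation

$$\sigma = \sqrt{p(1-p)\frac{N(N-1)}{2}}$$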
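As a quick numerical check of the binomial link-count distribution, the sketch below evaluates P(L) directly for the small example N=10, p=1/6 and compares the mean and standard deviation with the closed forms pN(N-1)/2 and sqrt(p(1-p)N(N-1)/2). The function name `link_count_pmf` is hypothetical.

```python
import math

def link_count_pmf(N, p):
    """P(L) for L = 0 .. N(N-1)/2 in G(N, p)."""
    L_max = N * (N - 1) // 2
    return [math.comb(L_max, L) * p**L * (1 - p)**(L_max - L)
            for L in range(L_max + 1)]

N, p = 10, 1 / 6
pmf = link_count_pmf(N, p)
mean = sum(L * q for L, q in enumerate(pmf))
var = sum((L - mean) ** 2 * q for L, q in enumerate(pmf))

print(mean, p * N * (N - 1) / 2)                                  # <L>
print(math.sqrt(var), math.sqrt(p * (1 - p) * N * (N - 1) / 2))   # sigma
```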


DEGREE DISTRIBUTION OF A RANDOM GRAPH

$$P(k) = \binom{N-1}{k} p^k (1-p)^{(N-1)-k}$$

• $\binom{N-1}{k}$: select k nodes from N-1
• $p^k$: probability of having k edges
• $(1-p)^{(N-1)-k}$: probability of missing N-1-k edges

As the network size increases, the distribution becomes increasingly
narrow: we are increasingly confident that the degree of a node is in the
vicinity of <k>.
DEGREE DISTRIBUTION OF A RANDOM GRAPH

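For large N and small k, we can use the following approximations:

$$\ln\left[(1-p)^{(N-1)-k}\right] = (N-1-k)\ln\left(1 - \frac{\langle k \rangle}{N-1}\right) \approx -(N-1-k)\frac{\langle k \rangle}{N-1} = -\langle k \rangle\left(1 - \frac{k}{N-1}\right) \approx -\langle k \rangle$$

using $p = \langle k \rangle/(N-1)$ and $\ln(1+x) \approx x$ for $|x| \ll 1$.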


DEGREE DISTRIBUTION OF A RANDOM GRAPH

[Figure: degree distribution P(k) versus k]


DEGREE DISTRIBUTION OF A RANDOM NETWORK

Exact result (binomial distribution):

$$P(k) = \binom{N-1}{k} p^k (1-p)^{(N-1)-k}$$

Large N limit (Poisson distribution):

$$P(k) = e^{-\langle k \rangle} \frac{\langle k \rangle^k}{k!}$$

Probability Distribution Function (PDF)
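A small sketch of how quickly the binomial degree distribution approaches its Poisson limit when <k> is held fixed and N grows. The helper names and the choice of <k>=10 with k up to 59 are assumptions for illustration.

```python
import math

def binomial_pk(k, N, p):
    """Exact degree distribution of G(N, p)."""
    return math.comb(N - 1, k) * p**k * (1 - p)**(N - 1 - k)

def poisson_pk(k, mean_k):
    """Large-N limit with the same average degree."""
    return math.exp(-mean_k) * mean_k**k / math.factorial(k)

def max_gap(N, mean_k, k_max=60):
    """Largest pointwise difference between the two distributions for k < k_max."""
    p = mean_k / (N - 1)
    return max(abs(binomial_pk(k, N, p) - poisson_pk(k, mean_k))
               for k in range(k_max))

for N in (100, 1000, 10000):
    print(N, max_gap(N, 10))
```

The printed gap shrinks as N grows, which is the sense in which the Poisson form is the large-N limit.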


NODES HAVE COMPARABLE DEGREES IN RANDOM NETWORKS

What does it mean? Continuum formalism:

If we consider a network with average degree <k>, then the probability to
have a node whose degree exceeds a degree $k_0$ is:

For example, with <k>=10,

•the probability to find a node whose degree is more than twice the average degree is 0.00158826.
•the probability to find a node whose degree is at least ten times the average degree is
$1.79967152 \times 10^{-13}$
•the probability to find a node whose degree is at most a tenth of the average degree is
0.00049

What does it mean? Discrete formalism:

•The probability of seeing a node with very high or very low degree is
exponentially small.
•Most nodes have comparable degrees.
•The larger the size of a random network, the more similar are the node degrees.
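A hedged sketch of how such tail probabilities can be computed, assuming a Poisson degree distribution with <k>=10 and a plain discrete sum. Under that assumption, P(k > 20) and P(k ≤ 1) come out close to the first and third values quoted above; the function name `poisson_cdf` is hypothetical.

```python
import math

def poisson_cdf(k0, mean_k):
    """P(k <= k0) for a Poisson degree distribution with mean mean_k."""
    return sum(math.exp(-mean_k) * mean_k**k / math.factorial(k)
               for k in range(k0 + 1))

mean_k = 10
high_tail = 1 - poisson_cdf(20, mean_k)   # P(k > 20): more than twice the average
low_tail = poisson_cdf(1, mean_k)         # P(k <= 1): at most a tenth of the average
print(high_tail, low_tail)
```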
NO OUTLIERS IN A RANDOM SOCIETY

According to sociological research, for a typical individual k ~ 1,000

The probability to find an individual with degree k > 2,000 is $10^{-27}$.

Given that N ~ $10^9$, the chance of finding an individual with 2,000 acquaintances is so tiny
that such nodes are virtually non-existent in a random society.

A random society would consist mainly of average individuals, with everyone having
roughly the same number of friends.

It would lack outliers, individuals that are either highly popular or reclusive.


SIX DEGREES small worlds

[Figure: a chain of acquaintances linking Sarah, Jane, Ralph, and Peter]

Frigyes Karinthy, 1929
Stanley Milgram, 1967
SIX DEGREES 1967: Stanley Milgram

HOW TO TAKE PART IN THIS STUDY

1. ADD YOUR NAME TO THE ROSTER AT THE BOTTOM OF THIS SHEET, so that the next
person who receives this letter will know who it came from.

2. DETACH ONE POSTCARD. FILL IT AND RETURN IT TO HARVARD UNIVERSITY. No stamp
is needed. The postcard is very important. It allows us to keep track of the progress of the folder as
it moves toward the target person.

3. IF YOU KNOW THE TARGET PERSON ON A PERSONAL BASIS, MAIL THIS FOLDER
DIRECTLY TO HIM (HER). Do this only if you have previously met the target person and know each
other on a first name basis.

4. IF YOU DO NOT KNOW THE TARGET PERSON ON A PERSONAL BASIS, DO NOT TRY TO
CONTACT HIM DIRECTLY. INSTEAD, MAIL THIS FOLDER (POST CARDS AND ALL) TO A PERSONAL
ACQUAINTANCE WHO IS MORE LIKELY THAN YOU TO KNOW THE TARGET PERSON. You may send
the folder to a friend, relative or acquaintance, but it must be someone you know on a first name
basis.



DISTANCES IN RANDOM GRAPHS

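Random graphs tend to have a locally tree-like topology with almost
constant node degrees.

• nr. of first neighbors: $N_1 \simeq \langle k \rangle$
• nr. of second neighbors: $N_2 \simeq \langle k \rangle^2$
• nr. of neighbours at distance d: $N_d \simeq \langle k \rangle^d$
• estimate maximum distance: the neighbourhoods exhaust the network when $\langle k \rangle^{l_{max}} \approx N$, so $l_{max} \approx \log N / \log \langle k \rangle$

d for the world = log (7 billion) / log (1000) = 3.28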
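The estimate above is a one-line computation. The sketch below reproduces the "world" example under the same assumptions (N = 7 billion, <k> = 1000); the function name `max_distance` is hypothetical.

```python
import math

def max_distance(N, mean_k):
    """Tree-like estimate of the maximum distance: log N / log <k>."""
    return math.log(N) / math.log(mean_k)

print(round(max_distance(7e9, 1000), 2))   # 3.28
```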
DISTANCES IN RANDOM GRAPHS compare with real data

$$l_{max} = \frac{\log N}{\log \langle k \rangle}$$

Given the huge differences in scope, size, and average degree, the
agreement is excellent.
EVOLUTION OF A RANDOM NETWORK

Until now we focused on the static properties of a random graph with a fixed
p value.

What happens when we vary the parameter p?

GOTO http://cs.gmu.edu/~astavrou/random.html

Choose Nodes=100.
Note that p goes up in increments of 0.001. For N=100, L = pN(N-1)/2 = p × 4,950,
so each increment adds only about 5 new links on average.


EVOLUTION OF A RANDOM NETWORK

disconnected nodes → NETWORK.

How does this transition happen?
THE PHASE TRANSITION TAKES PLACE AT <k>=1

Let us denote by $u = 1 - N_G/N$ the fraction of nodes that are NOT part of
the giant component (GC), where $N_G$ is the size of the GC.

For a node i to be part of the GC, it needs to connect to it via another node j.
If i is NOT part of the GC, that could happen for two reasons:

Case A: node i does not connect to node j,
Probability: 1-p

Case B: node i connects to j, but j is not connected to the GC:
Probability: pu

Total probability that i is not part of the GC via node j is:
1-p+pu

The probability that i is not linked to the GC via any other node is $(1-p+pu)^{N-1}$
The probability that i is linked to the GC is $1-(1-p+pu)^{N-1}$
Hence:

$$u = (1-p+pu)^{N-1}$$

For any p and N this equation provides the size of the giant component as $N_{GC} = N(1-u)$


EVOLUTION OF A RANDOM GRAPH

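Using p=<k>/(N-1), taking the log of both sides, and using <k> << N we obtain:

$$\ln u = (N-1)\ln\left[1 - \frac{\langle k \rangle}{N-1}(1-u)\right] \approx -\langle k \rangle (1-u)$$

Taking an exponential of both sides we obtain

$$u = e^{-\langle k \rangle (1-u)}$$

If we denote by S the fraction of nodes in the giant component, $S = N_{GC}/N$, i.e. S=1-u:

$$S = 1 - e^{-\langle k \rangle S}$$

Erdos and Renyi, 1959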
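The self-consistency condition for the giant component, S = 1 - exp(-<k> S), has no closed-form solution but is easy to solve numerically. The sketch below uses plain fixed-point iteration starting from S=1; the function name `giant_fraction` and the iteration count are assumptions for illustration.

```python
import math

def giant_fraction(mean_k, iterations=1000):
    """Solve S = 1 - exp(-<k> S) by fixed-point iteration from S = 1."""
    S = 1.0
    for _ in range(iterations):
        S = 1.0 - math.exp(-mean_k * S)
    return S

for mean_k in (0.5, 1.5, 2, 3, 5):
    print(mean_k, round(giant_fraction(mean_k), 4))
```

For <k> below 1 the iteration collapses to S=0, while for <k> above 1 it settles on a positive giant-component fraction. Exactly at <k>=1 convergence is slow, so the returned value is only close to zero.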
THE PHASE TRANSITION IN A RANDOM NETWORK TAKES PLACE AT <k>=1

S: the fraction of nodes in the giant component, $S = N_G/N$

Phase transition point: setting S=0, we obtain a phase transition at
<k>=1

after Newman, 2010


EVOLUTION OF A RANDOM GRAPH

[Figure: analytical result versus numerical result]


CLUSTER SIZE DISTRIBUTION

p(s): the probability that a randomly selected node belongs to a cluster of
size s.

At the critical point <k>=1, $p(s) \sim s^{-3/2}$.

[Figure: the distribution of cluster sizes at the critical point, displayed in a log-log
plot. The data represent an average over 1000 systems. The dashed line shows a power-law slope.]


Derivation in Newman, 2010
I: Subcritical, <k> < 1
II: Critical, <k> = 1
III: Supercritical, <k> > 1
IV: Connected, <k> > ln N

[Figure: example networks for N=100 with <k>=0.5, <k>=1, <k>=3, <k>=5]
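The four regimes can be explored with a small simulation. The sketch below samples G(N, p) with p = <k>/(N-1) and measures the largest connected component with a union-find structure; the function name `largest_component` is hypothetical, and results vary from run to run.

```python
import itertools
import random

def largest_component(N, p, rng=random):
    """Sample G(N, p) and return the size of its largest connected component."""
    parent = list(range(N))

    def find(x):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in itertools.combinations(range(N), 2):
        if rng.random() < p:
            parent[find(u)] = find(v)   # merge the two clusters

    sizes = {}
    for x in range(N):
        root = find(x)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values())

N = 100
for mean_k in (0.5, 1, 3, 5):
    print(mean_k, largest_component(N, mean_k / (N - 1)))
```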


I:
Subcritical
<k> < 1
p < pc=1/N


No giant component.

N-L isolated clusters, cluster size distribution is exponential

The largest cluster is a tree, its size ~ ln N


II:
Critical
<k> = 1
p=pc=1/N

Unique giant component: $N_G \sim N^{2/3}$
It contains a vanishing fraction of all nodes, $N_G/N \sim N^{-1/3}$
Small components are trees, GC has loops.

A jump in the cluster size:
N=1,000 → ln N ~ 6.9; $N^{2/3}$ ~ 100
N=7×10^9 → ln N ~ 22; $N^{2/3}$ ~ 3,659,250

Cluster size distribution: $p(s) \sim s^{-3/2}$
III:
Supercritical
<k> > 1
p > pc=1/N
Example: <k>=3

Unique giant component: $N_G \sim (p - p_c) N$
GC has loops.

Cluster size distribution: exponential


IV:
Connected
<k> > ln N
p > (ln N)/N
Example: <k>=5

Only one cluster: $N_G = N$
GC is dense.
Cluster size distribution: None


IV:
Connected
<k> > ln N
p > (ln N)/N

The probability that a node does not connect to the giant component is
$(1-p)^{N_G} \approx (1-p)^N$

The expected number of such nodes is: $N(1-p)^N \approx N e^{-Np}$

For a sufficiently large p we are left with only one disconnected node;
setting $N e^{-Np} = 1$ gives $p = (\ln N)/N$.




CLUSTERING COEFFICIENT

$$C_i = \frac{2 n_i}{k_i (k_i - 1)}$$

where $n_i$ is the no. of connections among the $k_i$ neighbours of node i.

Since edges are independent and have the same probability p,

$$\langle n_i \rangle \simeq p \, \frac{k_i (k_i - 1)}{2} \quad \Rightarrow \quad C = p \simeq \frac{\langle k \rangle}{N}$$

The clustering coefficient of random graphs is small.
For fixed degree, C decreases with the system size N.

Eq. 13.47 from Newman 2010: valid for random networks only, with arbitrary degree distribution.
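A minimal sketch for measuring the average clustering coefficient of a sampled G(N, p) and comparing it with p. Nodes with fewer than two neighbours are counted as having clustering 0, which is an assumption of this sketch; the function name `average_clustering` is hypothetical.

```python
import itertools
import random

def average_clustering(n, edges):
    """Average of C_i = 2 n_i / (k_i (k_i - 1)) over all n nodes."""
    nbrs = {u: set() for u in range(n)}
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    total = 0.0
    for u in range(n):
        k = len(nbrs[u])
        if k < 2:
            continue   # C_i taken as 0 when undefined (assumption)
        links = sum(1 for a, b in itertools.combinations(nbrs[u], 2)
                    if b in nbrs[a])
        total += 2 * links / (k * (k - 1))
    return total / n

N, p = 200, 0.05
edges = [(u, v) for u, v in itertools.combinations(range(N), 2)
         if random.random() < p]
print(average_clustering(N, edges), p)   # the two values should be close
```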
Erdös-Rényi MODEL (1960)

•Degree distribution
Binomial, Poisson (exponential tails)

•Clustering coefficient
Vanishing for large network sizes

•Average distance among nodes
Logarithmically small


Thank you for your attention!
