CN 2016 Lecture1

Complex Systems @ cs.aalto.fi
CS-E5740
Complex Networks
Lectures 1-2: Jari Saramäki (jari.saramaki@aalto.fi, @JariSaramaki)
Lectures 3-6: Mikko Kivelä (mikko.kivela@aalto.fi, www.mkivela.com)
Assistants: Onerva Korhonen, Tuomas Alakörkkö, Sara Heydari, Elisa Ryyppö
Fall 2016
Goals
After the course, you should
• know how to analyze and characterize networks,
• know the fundamental network models,
• understand how networks evolve, and
• know how network structure affects dynamical processes.
A Brief How-To
• For all details, see MyCourses!
• No exam
• Weekly problem sets (return online)

• 60% of points needed for passing
• see grading table in MyCourses
• Project work (due end 2016)

Pipeline for one problem set
Timeline: Wed, week i; Wed, week i+1; Mon, week i+2
1. Lecture X: intro to set X (TU1)
2. Exercise session: get started, get help! (Maari)
3. Exercise session: do most of the work (Y342a)
4. Pre-DL exercise session: ask last questions! (Y342a)
5. DL at 23:55: finalize, submit! (MyCourses)
Continuous feedback!
• We will collect feedback for each exercise set and the project.
• We reward you with 1 bonus point each time you give feedback.
• We will publish summaries after each round!
Course outline
• Nov 2: Introduction; random (Erdős-Rényi) networks
• TUTORIAL: Python + NetworkX
• Nov 9: Small-world networks; scale-free networks
• TUTORIAL: Statistics with Python
• Nov 16: Network analysis and measures
• Nov 23: Weighted & social networks
• Nov 30: Communities & graph clustering
• Dec 7: Temporal networks & multilayer networks
Part I:
Why Study Networks?
Everything is a network
• gene regulatory networks
• metabolic networks
• protein interaction networks
• nerve cells
• power grids
• social systems
• world trade
• the human brain
• the Internet
• transport networks
Hagmann P, Kurant M, Gigandet X, Thiran P, Wedeen VJ, et al. (2007) Mapping Human Whole-Brain Structural Networks with Diffusion MRI. PLoS ONE 2(7): e597.
What can network science tell us?
Internet
- WHY IS THE INTERNET ALWAYS ON, WHY DOESN’T IT FAIL?
- WHY IS IT SO HARD TO ERADICATE ELECTRONIC VIRUSES?
human brain
- HOW AND WHY DO WE THINK?
- WHAT ARE THE DIFFERENCES BETWEEN NORMAL AND SCHIZOPHRENIC BRAINS?
world trade
- WHAT DOES GLOBALIZATION MEAN IN PRACTICE?
- HOW CAN DEVELOPING COUNTRIES DEVELOP FURTHER?
Hagmann P, Kurant M, Gigandet X, Thiran P, Wedeen VJ, et al. (2007) Mapping Human Whole-Brain Structural Networks with Diffusion MRI. PLoS ONE 2(7): e597.
What can network science tell us?
disease spreading
- HOW DO DISEASES SPREAD?
- WHAT IS THE ROLE OF THE UNDERLYING CONTACT AND TRANSPORT NETWORKS?
- HOW TO PREVENT GLOBAL EPIDEMICS?
Complex systems as networks

• Links denote interactions between nodes
  ‣ Interactions of different strength: weighted networks
  ‣ Interactions of different direction: directed networks
  ‣ Time-dependent interactions: temporal networks
  ‣ Interactions of different type: multiplex networks
• Vertex = node (synonyms), edge = link

Vertex     Edge
person     friendship
neuron     synapse
WWW        hyperlink
company    ownership
gene       regulation
What is a link?  
...that is not always straightforward.
Multiple types of links can be included using multilayer networks or multiplex networks. (More on these in the last lecture.)
The real question: How do we deal with such things?
[Figure: Protein 1, Protein 2, Protein 3. H. Jeong, S.P. Mason, A.-L. Barabási, Z.N. Oltvai, Nature 411, 41-42 (2001)]
The network approach
1) Make empirical observations
2) Try to explain observations
2.1) Choose the right level of coarse-graining (interacting elements = nodes, interactions = links)
2.2) Strip the problem, disregard some detail (assume the node-link structure contains the answers)
2.3) Cast the problem as maths (analyze networks, simulate processes, write theories)
3) See if theories, calculations, or simulations can reproduce findings, or predict something
4) Start over from 1), refine
Part II:
Basic concepts
What is a network/graph?
Graph $G = (V, E)$ consists of
• $N$ vertices $V = \{v_1, \ldots, v_N\}$,
• $m$ edges $E = \{e_1, \ldots, e_m\}$, where each edge is a pair of vertices, $e_i = (v_j, v_k)$.
If the vertex pairs $(v_j, v_k)$ are ordered, the network is directed. Then $e_i = (v_j, v_k)$ means that there is an edge from $v_j$ to $v_k$.
Otherwise the network is undirected and $e_i = (v_j, v_k) = (v_k, v_j)$ means that $v_j$ and $v_k$ are connected.
Note: link = edge, node = vertex, network = graph. Choose whichever you like.
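The definition above can be sketched directly in Python. This is only an illustration: the vertex names and the helper functions `has_edge_directed` and `has_edge_undirected` are made up for this example. Directed edges are stored as ordered tuples, undirected edges as frozensets so that the order of the endpoints does not matter.

```python
# Illustrative only: a graph as a vertex set V and an edge set E.
V = {"v1", "v2", "v3", "v4"}

# Directed network: ordered pairs, (a, b) means an edge from a to b.
E_directed = {("v1", "v2"), ("v2", "v3"), ("v3", "v1"), ("v1", "v4")}

# Undirected network: unordered pairs, so that (a, b) and (b, a) coincide.
E_undirected = {frozenset(e) for e in E_directed}

N = len(V)             # number of vertices
m = len(E_undirected)  # number of edges


def has_edge_directed(a, b):
    """True if there is a directed edge from a to b."""
    return (a, b) in E_directed


def has_edge_undirected(a, b):
    """True if a and b are connected, regardless of order."""
    return frozenset((a, b)) in E_undirected


print(N, m)
print(has_edge_directed("v2", "v1"), has_edge_undirected("v2", "v1"))
```

In the directed version the edge from v1 to v2 does not imply an edge from v2 to v1, whereas the undirected lookup ignores the order.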
Simple graphs, multigraphs
A simple graph has no self-loops (links from a node to itself) and no multiple edges between the same pair of nodes.
(Please note that in directed networks $(v_i, v_j) \neq (v_j, v_i)$, so bidirectional links do not count as multiple edges.)
Otherwise the graph is a multigraph.
In this course we only deal with simple graphs.
[Figure: a multigraph and simple graphs]
Walks and paths
Naming conventions for walks and paths vary (alternative names in parentheses).
• walk (path) is a sequence of vertices where each consecutive pair is connected by an edge.
• path (self-avoiding walk/path, simple path) is a walk where vertices are never repeated. (Exception: the first and the last vertex can be the same.)
• path length is the number of edges traversed along a path.
• shortest path (geodesic path) is a path between a pair of nodes $v_i$ and $v_j$ with minimum path length.
• distance (geodesic distance), $d_{ij}$, is the shortest path length between nodes $v_i$ and $v_j$.
• diameter, $d$, is the largest distance in the network: $d = \max_{i,j \in V} d_{ij}$.
[Figure: nodes i, j, and k] Path {i,j,k} has length 2. This is the distance between i and k, and also happens to be the diameter of this network.
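Distances and the diameter can be computed with breadth-first search. The sketch below assumes an unweighted, undirected, connected graph stored as a dictionary of neighbour sets. The function names and the three-node chain i - j - k are illustrative assumptions based on the example above.

```python
from collections import deque


def bfs_distances(adj, source):
    """Return {node: distance from source} for all nodes reachable from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:  # first visit gives the shortest path length
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def diameter(adj):
    """Largest distance over all node pairs; assumes the graph is connected."""
    return max(max(bfs_distances(adj, s).values()) for s in adj)


# Assumed example: the chain i - j - k.
chain = {"i": {"j"}, "j": {"i", "k"}, "k": {"j"}}
d_ik = bfs_distances(chain, "i")["k"]
print(d_ik, diameter(chain))
```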
Subgraphs, cliques
A subgraph $G^* = (V^*, E^*)$ consists of some subset of nodes, $V^* \subseteq V$, together with some subset of edges between those nodes. An induced subgraph contains all edges between the nodes $V^*$.
Subgraph: $E^* \subseteq \{(v_i, v_j) \in E \mid v_i, v_j \in V^*\}$
Induced subgraph: $E^* = \{(v_i, v_j) \in E \mid v_i, v_j \in V^*\}$
A clique is a subgraph where all nodes are linked to all other nodes.
[Figure: graph G and some (induced) subgraphs of G; one of the subgraphs is a 3-clique.]
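A minimal sketch of induced subgraphs and clique checks, assuming undirected edges stored as frozensets. The five-node edge list is a hypothetical example graph, not necessarily the one drawn on the slide, and the function names are made up for illustration.

```python
from itertools import combinations

# Hypothetical example graph on nodes i, j, k, l, m.
edges = {frozenset(e) for e in [("i", "j"), ("j", "k"), ("j", "l"),
                                ("j", "m"), ("k", "l")]}


def induced_subgraph(edges, nodes):
    """Keep every edge of the graph whose two endpoints are both in `nodes`."""
    nodes = set(nodes)
    return {e for e in edges if e <= nodes}


def is_clique(edges, nodes):
    """True if every pair of the given nodes is linked."""
    return all(frozenset(pair) in edges for pair in combinations(nodes, 2))


sub = induced_subgraph(edges, {"j", "k", "l"})
print(len(sub), is_clique(edges, ["j", "k", "l"]), is_clique(edges, ["i", "j", "k"]))
```

Here the nodes j, k, l form a 3-clique, while i, j, k do not because i and k are not linked.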

Connectedness, components
A graph is connected if a path can be found between every pair of vertices.
If a graph is not connected, it consists of separate components (maximal connected subgraphs).
Note that because there is no path between nodes that belong to different components, their distance is undefined (or infinite).
[Figure: two example graphs on nodes i, j, k, l, m. The first is connected; the second has two components, (i,j,m) and (l,k).]
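Components can be found by repeatedly exploring the graph from a node that has not been visited yet. The sketch below assumes an undirected graph stored as a dictionary of neighbour sets. The links inside the component (i,j,m) are an assumption, since only the component memberships are given above.

```python
def connected_components(adj):
    """Return a list of node sets, one per connected component."""
    seen = set()
    components = []
    for start in adj:
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:  # depth-first exploration from `start`
            u = stack.pop()
            if u in component:
                continue
            component.add(u)
            stack.extend(adj[u] - component)
        seen |= component
        components.append(component)
    return components


# Assumed edges: i - j, j - m, and l - k.
adj = {"i": {"j"}, "j": {"i", "m"}, "m": {"j"}, "l": {"k"}, "k": {"l"}}
components = connected_components(adj)
print(len(components))
```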
Directed networks: paths, components
Degree
If there is an edge $(v_i, v_j) \in E$,
• $v_i$ and $v_j$ are adjacent,
• $v_i$ is a neighbour of $v_j$,
• the edge is incident to $v_i$ and $v_j$.
[Figure: example graph in which the degree of j is 4; j's neighbours are (i,k,l,m).]
The degree $k_i$ of vertex $v_i$ is the number of edges it is incident to. (This is the number of neighbours in simple graphs. Loops are counted twice in multigraphs.)
For directed networks, one can consider separate in- and out-degrees. The in-degree $k_{i,\mathrm{in}}$ is defined as the number of edges leading to $v_i$ and the out-degree $k_{i,\mathrm{out}}$ as the number of edges leading out from $v_i$.
The average degree $\langle k \rangle$ of a network is $\langle k \rangle = \sum_i k_i / N = 2m/N$.
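A short sketch of degree counting from an edge list. The edge list is an assumed reconstruction of the example graph (j linked to i, k, l, m, plus a link between k and l). For the directed part, each pair (a, b) is simply read as an edge from a to b, which is an arbitrary choice made for illustration.

```python
nodes = ["i", "j", "k", "l", "m"]
edges = [("i", "j"), ("j", "k"), ("j", "l"), ("j", "m"), ("k", "l")]

# Undirected degrees: each edge contributes to both of its endpoints.
degree = {v: 0 for v in nodes}
for a, b in edges:
    degree[a] += 1
    degree[b] += 1

N, m = len(nodes), len(edges)
avg_degree = sum(degree.values()) / N  # equals 2m/N

# Directed reading of the same pairs: a -> b.
in_degree = {v: 0 for v in nodes}
out_degree = {v: 0 for v in nodes}
for a, b in edges:
    out_degree[a] += 1
    in_degree[b] += 1

print(degree["j"], avg_degree)
```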
Degree distribution
The degree distribution $P(k)$ is one of the central concepts in network analysis. It answers the question "if a random node is picked, what is the probability that its degree is $k$?" That is,
$P(k) = N_k / N$,
where $N_k$ is the number of nodes of degree $k$.
However, we often assume that the observed degree distribution is a sample from some "real" degree distribution that is smooth. In this case a better way to obtain an estimate for the real distribution is to bin the data and answer the question "what is the probability that a randomly picked node has a degree in some interval $[k_i, k_{i+1}]$?"
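Both estimates can be sketched in a few lines. The degree sequence below is an assumed example, and the function names are made up. The bins are taken as half-open intervals [lo, hi) so that no node is counted twice, which is a choice made here rather than something fixed by the text.

```python
from collections import Counter


def degree_distribution(degrees):
    """P(k) = N_k / N for every degree k that occurs."""
    N = len(degrees)
    counts = Counter(degrees)
    return {k: n_k / N for k, n_k in sorted(counts.items())}


def binned_distribution(degrees, bin_edges):
    """Fraction of nodes with degree in each half-open bin [lo, hi)."""
    N = len(degrees)
    fractions = []
    for lo, hi in zip(bin_edges, bin_edges[1:]):
        count = sum(1 for k in degrees if lo <= k < hi)
        fractions.append(count / N)
    return fractions


degrees = [1, 4, 2, 2, 1]  # assumed example degree sequence
P = degree_distribution(degrees)
binned = binned_distribution(degrees, [1, 3, 5])
print(P, binned)
```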
Edge density
The edge density of a network is the fraction of edges out of possible edges:
$\rho = \frac{m}{\binom{N}{2}} = \frac{2m}{N(N-1)}$.
In real-world networks, the edge density is usually low, i.e. the networks are sparse.
[Figure: metabolic network] Even though this metabolic network has dense subgraphs, in general it has a low link density, i.e. is sparse.
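As a quick numerical illustration of the density formula, the sketch below uses assumed values of N and m. The function name is made up for this example.

```python
def edge_density(N, m):
    """Fraction of the N(N-1)/2 possible undirected edges that are present."""
    return 2 * m / (N * (N - 1))


# Assumed small example: 5 nodes and 5 edges.
rho_small = edge_density(5, 5)

# Assumed sparse example: keeping the average degree at 2 while N grows
# makes the density shrink roughly as 1/N.
rho_large = edge_density(10_000, 10_000)

print(rho_small, rho_large)
```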
Clustering coefficient
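The clustering coefficient $C_i$ of node $v_i$ with degree $k_i$ is the fraction of pairs of its neighbours that are themselves linked: $C_i = \frac{2 E_i}{k_i (k_i - 1)}$, where $E_i$ is the number of links between the neighbours of $v_i$.
[Figure: example graph on nodes i, j, k, l, m]
$C_j = \frac{2 \times 1}{4 \times 3} \approx 0.167$
$C_k = \frac{2 \times 1}{2 \times 1} = 1$
[Figure: metabolic network] This metabolic network has a high average clustering coefficient because of the dense subgraphs.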
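The two values quoted above can be reproduced with a short sketch. The adjacency below is an assumption: it is the five-node graph inferred from the quoted numbers (j linked to i, k, l, m, plus a link between k and l). Returning 0 for nodes with fewer than two neighbours is a convention chosen here.

```python
from itertools import combinations

# Assumed example graph as a dictionary of neighbour sets.
adj = {
    "i": {"j"},
    "j": {"i", "k", "l", "m"},
    "k": {"j", "l"},
    "l": {"j", "k"},
    "m": {"j"},
}


def clustering(adj, v):
    """Fraction of pairs of v's neighbours that are linked to each other."""
    k = len(adj[v])
    if k < 2:
        return 0.0  # convention assumed here for degree 0 or 1
    links = sum(1 for a, b in combinations(adj[v], 2) if b in adj[a])
    return 2 * links / (k * (k - 1))


C_j = clustering(adj, "j")
C_k = clustering(adj, "k")
avg_C = sum(clustering(adj, v) for v in adj) / len(adj)
print(C_j, C_k, avg_C)
```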
Special graphs
• A tree is a connected graph with no loops.
• So $C_i = 0$ for all $i$, $\langle C \rangle = 0$.
• A tree with $N$ vertices always has $m = N-1$ edges.
• Also: a connected graph with $m = N-1$ edges is always a tree.
• A set of trees is called a forest.
[Figure: a tree; a forest]
Special graphs
• In a k-regular graph all nodes have degree k.
• k-regular graphs can e.g. take the shape of regular lattices, or be otherwise random.
• Cayley tree: a k-regular (infinite) tree.
[Figure: subgraphs of 1-d lattice, 2-d lattice, and a Cayley tree]
Bipartite graphs
• If the vertices of a graph can be divided into two subsets $V_1$ and $V_2$ such that edges exist only between subsets, the graph is bipartite.
• A bipartite graph can be projected (or collapsed) onto $V_1$ or $V_2$.
• Bipartite graphs arise naturally in many contexts (collaboration networks, metabolic reactions, etc).
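A projection can be sketched as follows. The author and paper names are hypothetical, and the rule used here (two authors are linked if they share at least one paper) is one common way to collapse a bipartite graph, assumed for illustration.

```python
from itertools import combinations

# Hypothetical collaboration data: each paper (one node set) lists its
# authors (the other node set).
authorship = {
    "paper1": {"Ann", "Bob"},
    "paper2": {"Bob", "Cat", "Dan"},
    "paper3": {"Ann"},
}


def project(memberships):
    """Link two members whenever they share at least one group."""
    projected = set()
    for members in memberships.values():
        for a, b in combinations(sorted(members), 2):
            projected.add((a, b))
    return projected


coauthors = project(authorship)
print(sorted(coauthors))
```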
Network representation
Network data structures
NetworkX does everything automatically, so you do not have to worry about all this.
Just do not try to manually generate an adjacency matrix for any larger network...
Calculations with the adjacency matrix
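Undirected network:
$A = A^T$
$k_i = \sum_{j=1}^{N} a_{ij} = \sum_{j=1}^{N} a_{ji}$
$2m = \sum_{i=1}^{N} k_i = \sum_{i=1}^{N} \sum_{j=1}^{N} a_{ij}$
Directed network (with $a_{ij} = 1$ when there is an edge from $v_i$ to $v_j$):
$k_{j,\mathrm{in}} = \sum_{i=1}^{N} a_{ij}$
$k_{i,\mathrm{out}} = \sum_{j=1}^{N} a_{ij}$
$m = \sum_{i=1}^{N} k_{i,\mathrm{in}} = \sum_{i=1}^{N} k_{i,\mathrm{out}}$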
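These sums are easy to check on a small adjacency matrix. The sketch below uses plain nested lists and an assumed five-node example. For the directed case it adopts the convention that row i, column j is 1 when there is an edge from i to j, which is an assumption consistent with the out-degree formula.

```python
nodes = ["i", "j", "k", "l", "m"]
edges = [("i", "j"), ("j", "k"), ("j", "l"), ("j", "m"), ("k", "l")]
index = {v: n for n, v in enumerate(nodes)}
N = len(nodes)

# Undirected adjacency matrix: symmetric.
A = [[0] * N for _ in range(N)]
for a, b in edges:
    A[index[a]][index[b]] = 1
    A[index[b]][index[a]] = 1
degrees = [sum(row) for row in A]  # row sums
m = sum(degrees) // 2              # each edge is counted twice

# Directed adjacency matrix: D[a][b] = 1 for an edge a -> b (assumed convention).
D = [[0] * N for _ in range(N)]
for a, b in edges:
    D[index[a]][index[b]] = 1
k_out = [sum(row) for row in D]                                   # row sums
k_in = [sum(D[r][c] for r in range(N)) for c in range(N)]         # column sums

print(degrees, m, k_in, k_out)
```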
Part III:
Random networks
Erdős-Rényi networks
• A maximally random ensemble of networks of given size
Construction:
• Connect N vertices randomly
Two versions:
• G(N,p): connect each pair of vertices with probability p
• G(N,m): place m edges randomly on the network
• these define ensembles of networks
[Figure: example network with N = 10, p = 1/5, <k> = 1.8; portrait of Pál Erdős (1913-1996)]
Erdős, P.; Rényi, A. (1959). "On Random Graphs. I". Publicationes Mathematicae 6: 290–297.
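Both versions can be generated with the standard library alone. This is a minimal sketch, not the NetworkX implementation: the function names `gnp` and `gnm` are made up here, and the straightforward loop over all pairs is only practical for small N.

```python
import random
from itertools import combinations


def gnp(N, p, rng):
    """G(N,p): include each of the N(N-1)/2 node pairs independently with probability p."""
    return [pair for pair in combinations(range(N), 2) if rng.random() < p]


def gnm(N, m, rng):
    """G(N,m): choose m distinct node pairs uniformly at random."""
    return rng.sample(list(combinations(range(N), 2)), m)


rng = random.Random(42)  # fixed seed only to make the example repeatable
edges_p = gnp(10, 0.2, rng)
edges_m = gnm(10, 9, rng)
print(len(edges_p), len(edges_m))
```

In G(N,p) the number of edges varies from one realization to the next, while in G(N,m) it is fixed at m.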
Ensemble G(N,p) with N = 3
N = 3, p = 1/3
$\pi_j$ = probability of realization of network $j$
$\langle k_j \rangle$ = avg degree in $j$

Network j    π_j       <k_j>
1            ≈ 0.30    0
2, 3, 4      ≈ 0.15    2/3
5, 6, 7      ≈ 0.07    4/3
8            ≈ 0.04    2

For any random network model, any quantity such as average degree can be viewed as either its ensemble average, or its expected value given the generation rules.
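For N = 3 the whole ensemble can be enumerated, which makes the equivalence of the ensemble average and the expected value concrete. The sketch below uses exact fractions; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import combinations, product

N = 3
p = Fraction(1, 3)
pairs = list(combinations(range(N), 2))  # the 3 possible edges

ensemble = []  # one (probability, average degree) entry per realization
for present in product([0, 1], repeat=len(pairs)):
    m = sum(present)                          # number of edges in this realization
    prob = p**m * (1 - p)**(len(pairs) - m)   # probability of this realization
    avg_k = Fraction(2 * m, N)                # its average degree, 2m/N
    ensemble.append((prob, avg_k))

total_prob = sum(prob for prob, _ in ensemble)
expected_k = sum(prob * avg_k for prob, avg_k in ensemble)
print(len(ensemble), total_prob, expected_k)
```

The ensemble average of the average degree comes out as (N-1)p = 2/3, the same value the generation rule gives directly.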
Properties of G(N,p)
Edges & degrees
• On average, the number of edges is $\langle m \rangle = \binom{N}{2} p = p \times N(N-1)/2$.
• Hence the average degree is $\langle k \rangle = 2\langle m \rangle / N = (N-1)p \approx Np$.
Degree distribution
• Each node's number of links comes from $N-1$ independent trials with probability $p$.
• Hence $P(k) = \mathrm{Bin}(N-1, p) = \binom{N-1}{k} p^k (1-p)^{N-1-k}$.
• For $N \to \infty$ with $\langle k \rangle$ constant, $P(k) \to \frac{\langle k \rangle^k}{k!} e^{-\langle k \rangle}$, that is, $P(k) = \mathrm{Poisson}(\langle k \rangle)$.
[Figure: $P(k)$ as a function of $k$ for $\langle k \rangle = 30$]
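The convergence of the binomial degree distribution to the Poisson form can be checked numerically. The sketch below keeps the average degree at 30 and increases N. The function names and the chosen values of N are assumptions made for illustration.

```python
from math import comb, exp, factorial


def binomial_pk(k, N, p):
    """P(k) in G(N,p): N - 1 independent trials, each with probability p."""
    return comb(N - 1, k) * p**k * (1 - p)**(N - 1 - k)


def poisson_pk(k, mean_k):
    """Poisson limit with mean degree mean_k."""
    return mean_k**k * exp(-mean_k) / factorial(k)


mean_k = 30
max_diff = {}
for N in (100, 1000, 10000):
    p = mean_k / (N - 1)  # keeps (N - 1) p = mean_k fixed
    max_diff[N] = max(
        abs(binomial_pk(k, N, p) - poisson_pk(k, mean_k)) for k in range(71)
    )
    print(N, max_diff[N])
```

The largest pointwise difference between the two distributions shrinks as N grows with the average degree held constant.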
Average shortest paths & dimensionality
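$\ell = \frac{1}{N(N-1)} \sum_{i,j} d_{ij}$
[Figure: example networks with the scaling of the average shortest path length: $\ell \propto \ln N$, $\ell \propto N$, $\ell \propto N^{1/2}$, $\ell \propto N^{1/d}$]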
Components in E-R networks
[Figure: relative giant component size S and average component size <s> as functions of average degree <k>]
• <s> = average number of vertices in components other than the giant
• S = (number of vertices in giant) / N
• small components grow in size until giant appears, then join the giant, leaving only very small disconnected parts
• when <k>=1, giant component appears and starts growing in size ("the percolation transition")
• this curve is, strictly speaking, valid only for ER networks where N→∞ so that pN=const
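The transition can be observed in a small simulation. The sketch below generates G(N,p) for a few average degrees and tracks components with a union-find structure. N = 1000, the seed, and the chosen values of <k> are arbitrary assumptions, and at finite N the transition is smoothed compared with the N→∞ curve.

```python
import random
from itertools import combinations


def giant_and_mean_small(N, mean_k, rng):
    """Return (S, <s>) for one realization of G(N,p) with p = mean_k / (N - 1)."""
    p = mean_k / (N - 1)
    parent = list(range(N))  # union-find forest over the nodes

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in combinations(range(N), 2):
        if rng.random() < p:  # edge present: merge the two components
            ra, rb = find(a), find(b)
            if ra != rb:
                parent[ra] = rb

    sizes = {}
    for v in range(N):
        root = find(v)
        sizes[root] = sizes.get(root, 0) + 1
    ordered = sorted(sizes.values(), reverse=True)
    S = ordered[0] / N   # relative size of the largest component
    rest = ordered[1:]   # all other components
    mean_s = sum(rest) / len(rest) if rest else 0.0
    return S, mean_s


rng = random.Random(1)
results = {mean_k: giant_and_mean_small(1000, mean_k, rng)
           for mean_k in (0.5, 1.0, 2.0, 4.0)}
for mean_k, (S, mean_s) in results.items():
    print(mean_k, S, mean_s)
```

Below <k> = 1 the largest component holds only a small fraction of the nodes, while well above it almost all nodes belong to the giant component.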
Randomizing networks:
configuration model
Random networks: summary
Extra reading material
Books:
• M.E.J. Newman: Networks: An Introduction (Oxford UP, 2010) recommended!
• S.N. Dorogovtsev: Lectures on Complex Networks (Oxford UP, 2010) (online at http://sweet.ua.pt/~f2358/)
• D. Easley & J. Kleinberg: Networks, Crowds, and Markets: Reasoning about a Highly Connected World (Cambridge UP, 2010) (online at http://www.cs.cornell.edu/home/kleinber/networks-book/)
Review papers:
• M.E.J. Newman: Structure and function of complex networks, SIAM Review 45, 167-256 (2003) (online at http://arxiv.org/abs/cond-mat/0303516/) recommended!
• Boccaletti et al.: Complex networks: Structure and dynamics, Physics Reports, Vol. 424, No. 4-5 (February 2006) (online at http://www.sciencedirect.com/science/article/pii/S037015730500462X)
• P. Holme & J. Saramäki: Temporal Networks, Physics Reports 519, 97-125 (2012) (online at http://arxiv.org/abs/1108.1780)
• M. Kivelä et al.: Multilayer Networks, Journal of Complex Networks 2(3), 203-271 (2014) (online at http://comnet.oxfordjournals.org/content/early/2014/07/14/comnet.cnu016)
Extra reading material
Python and Networkx, online documentation and tutorials: 
• http://docs.python.org/
• http://en.wikibooks.org/wiki/A_Beginner’s_Python_Tutorial
• http://networkx.github.io/documentation/latest/tutorial/
• http://docs.scipy.org/doc/
Other network software (not used in the course, but good to know): 
• http://snap.stanford.edu/
• http://graph-tool.skewed.de/
• http://github.com/CxAalto/
• http://www.boost.org/doc/libs/1_61_0/libs/graph/
