Editor : A. Ravi Ravindran
The Pennsylvania State University
December 1, 2006



11 Complexity and Large-scale Networks


11.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .


11.2 Statistical properties of complex networks . . . . . . . . . . . . . . . . . . .


11.2.1 Average path length and the small-world effect . . . . . . . . . . . . .


11.2.2 Clustering coefficient . . . . . . . . . . . . . . . . . . . . . . . . . . .


11.2.3 Degree distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . .


11.2.4 Betweenness centrality . . . . . . . . . . . . . . . . . . . . . . . . . .


11.2.5 Modularity and community structures . . . . . . . . . . . . . . . . .


11.2.6 Network resilience . . . . . . . . . . . . . . . . . . . . . . . . . . . . .


11.3 Modeling of complex networks . . . . . . . . . . . . . . . . . . . . . . . . . .


11.3.1 Random graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .


11.3.2 Small-world networks . . . . . . . . . . . . . . . . . . . . . . . . . . .


11.3.3 Scale-free networks . . . . . . . . . . . . . . . . . . . . . . . . . . . .


11.4 Why “Complex” Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . .




11.5 Optimization in complex networks . . . . . . . . . . . . . . . . . . . . . . . .


11.5.1 Network resilience to node failures . . . . . . . . . . . . . . . . . . . .


11.5.2 Local search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .


11.5.3 Other topics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .


11.6 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .


Chapter 11
Complexity and Large-scale Networks
Hari P. Thadakamalla1 , Soundar R. T. Kumara1 and R´
eka Albert2
Dept. of Industrial & Manufacturing Engineering, The Pennsylvania State University
Dept. of Physics, The Pennsylvania State University



In the past few decades, graph theory has been a powerful analytical tool for understanding
and solving various problems in operations research (OR). Study on graphs (or networks)
traces back to the solution of the K¨onigsberg bridge problem by Euler in 1735. In K¨onigsberg,
the river Preger flows through the town dividing it into four land areas A, B, C and D as
shown in figure 11.1 (a). These land areas are connected by seven (1 - 7) different bridges.
The K¨onigsberg bridge problem is to find whether it is possible to traverse through the
city on a route that crosses each bridge exactly once, and return to the starting point.
Euler formulated the problem using a graph theoretical representation and proved that the
traversal is not possible. He represented each land area as a vertex (or node) and each bridge
as an edge between two nodes (land areas) as shown in figure 11.1 (b). Then, he posed the
question as whether there exists a path such that it passes every edge exactly once and ends

For example.2). xij ≥ 0.   {j|(i. Euler proved that for a graph to have an Eulerian Circuit. Later. The weights on the edges represent the distance between the two nodes (see figure 11. we find the shortest path from node 10 to node 30 as (10 . The scale of the size of these networks is substantially dif- .i)∈ξ}  0 otherwise.j)∈ξ} {j|(j.2). Euler’s great insight lay in representing the K¨onigsberg bridge problem as a graph problem with a set of vertices and edges. The problem can be modeled as a shortest path problem on a network. j) connecting the nodes and w is a function such that wij is the weight of the edge (i.12 . j). The shortest path problem from node s to node t can be formulated as follows. minimize X wij xij (i. Note that this problem and similarly other problems considered in traditional graph theory requires to find the exact optimal path.1 . all the nodes in the graph need to have an even degree.j)∈ξ   if i = s. E) where V is the set of all nodes.2 CHAPTER 11. This path was later termed an Eulerian Circuit. graph theory has developed into a substantial area of study which is applied to solve various problems in engineering and several other disciplines [7]. In the last few years there has been an intense amount of activity in understanding and characterizing large-scale networks.3 . where xij = 1 or 0 depending on whether the edge from node i to node j belongs to the optimal path or not respectively. ∀(i. Let the network be G(V.30)(see figure 11. consider the problem of finding the shortest route between two geographical points. where different geographical points are represented as nodes and they are connected by an edge if there exists a direct path between the two nodes. which led to development of a new branch of science called “Network science” [108].   1 X X subject to xij − xji = −1 if i = t. COMPLEXITY AND LARGE-SCALE NETWORKS at the start node. E is the set of edges (i. in the twentieth century. Using one such popular algorithm (Dijkstra’s algorithm [7]). Many algorithms have been proposed to solve the shortest path problem [7]. j) ∈ ξ.

Here we use the exact distances between different nodes to calculate the shortest path 10 .3 . Each node represents a land area and the edge between them represent the bridges connecting the land areas. C.1: K¨onigsberg bridge problem. The land areas are connected by seven bridges numbered from 1 to . (b) Graph theoretical representation of the K¨onigsberg bridge problem. The objective is to find the shortest path from node 10 to node 30.11. B. The values on the edges represent the distance between two nodes. Figure 11.2: Illustration of a typical optimization problem in OR. and D. INTRODUCTION 3 Figure 11. (a) Shows the river flowing through the town dividing it into four land areas A.12 . .

[4] constructed a phone call network from the long distance telephone calls made during a single day which had 53. A recent study [66] estimated the size to be 11. 65]. These large-scale networks are referred to as complex networks and we will discuss the reasons why they are termed “complex” networks later in the section 11. It had approximately one billion nodes at the end of 1999 [89] and is continuously growing at an exponential rate. At the domain level.4 CHAPTER 11. • Phone call network : The phone numbers are the nodes and every completed phone call is an edge directed from the receiver to the caller. [29] have analyzed the International Air Transportation Association database to form the world-wide airport . 000 in 2000 [61] and at the domain level were 4000 in 1999 [55]. each router is represented as a node and physical connections between them as edges. Internet Provider System) is represented as a node and inter-domain connections by edges. The topology of the Internet is studied at two different levels [55]. The World Wide Web is currently the largest network for which topological information is available. 767. Barthelemy et al. 657 edges [16] in 2005. Abello et al. and substations are the nodes and high-voltage transmission lines are the edges. • Airline network : Nodes are the airports and an edge between two airports represent the presence of a direct flight connection [29. the problems posed in such networks are very different from traditional graph theory. each domain (autonomous system. transformers. approximately. The following are examples of complex networks: • World Wide Web: It can be viewed as a network where web pages are the nodes and hyperlinks connecting one webpage to another are the directed edges. 087 nodes and over 170 million edges.5 billion nodes as of January 2005. At the router level.4. The North American power grid consisted of 14. The number of nodes. Also. • Power grid network : Generators. COMPLEXITY AND LARGE-SCALE NETWORKS ferent from the networks considered in traditional graph theory. at the router level were 150. 099 nodes and 19. • Internet: The Internet is a network of computers and telecommunication devices connected by wired or wireless links. The power grid network of the western United States had 4941 nodes in 1998 [143].

885 edges for the U. The above are only a few examples of complex networks pervasive in the real world [13. This is a continuously growing network with 225. • Movie actor collaboration network : Another well studied network is the movie actor collaboration network. The study of large-scale complex systems has always been an active research area in various branches of science. 33] represented the stock market data as a network where the stocks are nodes and two nodes are connected by an edge if their correlation coefficient calculated over a period of time exceeds certain threshold value. especially in the physical sciences. Newman [99. 100] studied networks constructed from four different databases spanning biomedical research. [32. • Market graph: Recently. 31.11. For instance. If we represent the system with the . The substantial growth in size of many such networks [see figure 11. statistical description of gases. computer science and physics. Boginski et al. 786 edges in 1998 [143]. stock data during the period 2000-2002 [33]. Here again. the actors are represented as nodes and two nodes are connected by an edge if the two actors have performed together in a movie. 251 nodes and 2. 49. 163.S. The resulting network consisted of 3880 nodes and 18810 edges in 2002. The new methodology applied for analyzing complex networks is similar to the statistical physics approach to complex phenomena. Some examples are: ferromagnetic properties of materials. On of these networks formed from Medline database for the period from 1961 to 2001 had 1. 101]. • Scientific collaboration networks: Scientists are represented as nodes and two nodes are connected if the two scientists have written an article together.3] necessitates a different approach for analysis and design.4 (a)]. Tools and techniques developed in the field of traditional graph theory involved studies that looked at networks of tens or hundreds or in extreme cases thousands of nodes. let us consider a box containing one mole (6. INTRODUCTION 5 network. diffusion. 738.1. which contains all the movies and their casts from 1890s. high-energy physics. 923 edges. formed from the Internet Movie Database [1]. formation of crystals etc. 520.022 ∗ 1023 ) of gas atoms as our system of analysis[see figure 11. The network had 6556 nodes and 27. 226 nodes and 13.

Similarly. Furthermore.3: Pictorial description of the change in scale in the size of the networks found in many engineering systems. COMPLEXITY AND LARGE-SCALE NETWORKS Figure 11. edge-weight distribution etc). microscopic properties of the individual particles such as their position and velocity. This change in size necessitates a change in the analytical approach. During the last few years there has been a tremendous amount of research activity dedicated to the .4 (b)]. The objective of this chapter is to introduce this new direction of inter-disciplinary research (Network Science) and discuss the new challenges for the OR community. then it would be next to impossible to analyze the system.6 CHAPTER 11. pressure etc. We rarely require specific shortest path solutions such as from node A to node B (from webpage A to webpage B). this approach has become feasible and received a considerable boost by the availability of computers and communication networks which have made the gathering and analysis of large-scale data sets possible. rather than the properties of individual entities in the network (such as the neighbors of a given node. the weights on the edges connecting this node to its neighbors etc) [see figure 11. physicists use statistical mechanics to represent the system and calculate macroscopic properties such as temperature. Rather. where the number of nodes is extremely large. WWW). This new approach for understanding networked systems provides new techniques as well as challenges for solving conceptual and practical problems in this field. in networks such as the Internet and WWW. we have to represent the network using macroscopic properties (such as degree distribution. Rather it is useful if we know the average distance (number of hops) taken from any node to any other node (any webpage to any other webpage) to understand dynamical processes (such as search in WWW). Now let us consider the shortest path problem in such networks (for instance.

022 ∗ 1023 atoms) in a box. There was a revival of network modeling which gave rise to many path breaking results [13.11. However. we summarize different evolutionary models proposed to explain the properties of real networks. We summarize typical behaviors of complex systems and demonstrate how the real networks have these behaviors. In section 11.3. For analysis.2.4. We also present the empirical results obtained for many real complex networks that initiated a revival of network modeling. Until now. 49. 101] and provoked vivid interest across different disciplines of the scientific community. (b) An example of a large-scale network. 31. we discuss Erd˝os-R´enyi random graphs. In particular. This activity was mainly triggered by significant findings in real-world networks which we will elaborate later in the chapter. we address the urgent need and opportunity for the OR community to contribute to the fast-growing inter-disciplinary research on Network Science. small-world networks. The following is the outline of the chapter.4: Illustration of the analogy between a box of gas atoms and complex networks. a major part of this research was contributed by physicists. we discuss the optimization in complex networks by . INTRODUCTION 7 Figure 11. (a) A mole of gas atoms (6. mathematicians. we discuss briefly why these networks are called “complex” networks. we need to represent both the systems using statistical properties. In section 11. sociologists and biologists. In section 11. In section 11. study of these large-scale networks. we introduce different statistical properties that are prominently used for characterizing complex networks. and scale-free networks. rather than large-scale networks.1.5. The methodologies and techniques developed till now will definitely aid the OR community in furthering this research. the ultimate goal of modeling these networks is to understand and optimize the dynamical processes taking place in the network. In this chapter.

. then the path length is equal to the number of edges (or hops) along the path. w)i = X X 1 d(u. E) be a network where V is the collection of entities (or nodes) and E is the set of arcs (or edges) connecting them. These statistical properties help in classifying different kinds of complex networks. The average path length (l) of a connected network is the average of the shortest paths from each node to every other node in a network. COMPLEXITY AND LARGE-SCALE NETWORKS concentrating on two specific processes. N is the number of nodes in the network and d(u. This implies that any node can reach any other node in the network in a relatively small . If all the edges are equivalent in the network. A path between two nodes u and v in the network G is a sequence [u = u1 . 11.1 show the values of l for many different networks. We discuss the effects of statistical properties on these processes and demonstrate how they can be optimized. the average path length is small. u2 . robustness and local search. Finally. which are most relevant to engineering networks. where u′i s are the nodes in G and there exists an edge from ui−1 to ui in G for all i. Table 11. N(N − 1) u∈V u6=w∈ V where.r. the number of nodes). . It is given by l ≡ hd(u.2. w). Further.. we explain some of the statistical properties which are prominently used in the literature. 11. We discuss the definitions and present the empirical findings for many real networks. The path length is defined as sum of the weights on the edges along the path.t.2 Statistical properties of complex networks In this section. w) is the shortest path between u and w.6. We observe that despite the large size of the network (w. un = v] .1 Average path length and the small-world effect Let G(V. we conclude and discuss future research directions.8 CHAPTER 11. we briefly summarize few more important topics and give references for further reading. in section 11..

These letters had to finally reach a specific person in Boston. That is there is an acquaintance path of an average length 6 in the social network of people in the United States. Milgram randomly selected individuals from Wichita. STATISTICAL PROPERTIES OF COMPLEX NETWORKS 9 Table 11.2. say the number of nodes in the network.28 number of steps. For example. is called the small-world effect. a network is considered to be small-world if the average path length scales logarithmically or slower with the number of nodes N (∼ logN). Mathematically. Currently.000 212. router level [61] Internet.r.250 24.t.5. Note that despite the large size of the network (w.54 11. or contagious diseases across a network. the average length of these acquaintance chains for the letters that reached the targeted node was found to be approximately 6. The existence of the small-world effect was first demonstrated by the famous experiment conducted by Stanley Milgram in the 1960s [92] which led to the popular concept of six degrees of separation.097 880 Average path length 16 11 4 4.11.000 4. Anyone who received the letter subsequently would be given the same information and asked to do the same until it reached the target person. N. Kansas and Omaha.05 4. This characteristic phenomenon. domain level [55] Movie actors [143] Electronic circuits [75] peer-to-peer network [122] Size (number of nodes) 2 × 108 150. In this experiment. This phenomenon has critical implications on the dynamic processes taking place in the network. computer viruses. The participants were asked to send the letter to one of their acquaintances whom they judged to be closer (than themselves) to the target. Nebraska to pass on a letter to one of their acquaintances by mail.1: Average path length of many real networks. that most pairs of nodes are connected by a short path through the network.2. if we consider the spread of information. Over many trials. Massachusetts. Watts et al. Network WWW [37] Internet. For example. [145] are doing an Internet-based study to verify this phenomenon. increases from 103 to 106 . We will discuss another interesting and even more surprising observation from this experiment in section 11. the small-world . the number of nodes). the name and profession of the target was given to the participants. the average path length is very small. then average path length will increase approximately from 3 to 6.

3 Degree distribution The degree of a node is the number of edges incident on it. Consider a node i which is connected to ki other nodes. a node has both an in-degree (number of incoming edges) and an out-degree (number of outgoing edges). . phone call networks [4.2. 101]. It is measured in terms of number of triangles (3-cliques) present in the network. i. The clustering coefficient of a node i is the ratio of the number of edges Ei that actually exist between these ki nodes and the total number ki (ki − 1)/2 possible. C = n1 i Ci (see figure 11. The degree distribution of the network is the function pk . COMPLEXITY AND LARGE-SCALE NETWORKS phenomenon implies that within a few steps it could spread to a large fraction of most of the real networks.2.5). 11. a feature analyzed in detail in the so called theory of balance [43]. 8]. it means that it is highly likely that two friends of a person are also friends. in many networks if node A is connected to node B and node C.2 Clustering coefficient The clustering coefficient characterizes the local transitivity and order in the neighborhood of a node.e.10 CHAPTER 11. 14. With respect to social networks. the Internet [55].e. In other words. 11. The number of possible edges between these ki neighbors that form a triangle is ki (ki − 1)/2. a directed graph has both in-degree and out-degree distributions. metabolic networks [77]. then there is a high probability that node B and node C are also connected. It was found that most of the real networks including the World Wide Web [5. Ci = 2Ei ki (ki − 1) The clustering coefficient of the whole network (C) is then the average of Ci′ s over all the P nodes in the network i. where pk is the probability that a randomly selected node has degree k. The clustering coefficient is high for many real networks [13. 88]. In a directed network. Here again.

Hence. E[k] = ∞ otherwise. local search [6]. node 1 has degree 5 and the number of edges between the neighbors is 3.  f inite if γ > 2. 19. and epidemiological processes [111. These networks having power-law degree distributions are popularly known as scale-free networks. 113. 16]. The clustering coefficient of the entire network is the average of the clustering coefficients at each individual nodes (109/180).1 and 3. STATISTICAL PROPERTIES OF COMPLEX NETWORKS 11 Figure 11. For example. Later in this chapter. with a high fraction of small-degree nodes and few large degree nodes. V [k] = ∞ otherwise. The power-law exponent (γ) of most of the networks lie between 2.5: Calculating the clustering coefficient of a node and the network.11. where γ is the power-law exponent.2. 115]. Note that the variance of the node degree is infinite when γ < 3 and the mean is infinite when γ < 2. 25] follow a power-law degree distribution (p(k) ∼ k −γ ). 114.0 which implies that their is high heterogeneity with respect to node degree. These networks were called as scale-free networks because of the lack of a characteristic degree and the broad tail of the degree distribution. we will discuss the impact of the this heterogeneity in detail.  f inite if γ > 3. Figure 11. network navigation. 112. the clustering coefficient for node 1 is 3/10. scientific collaboration networks [26. and movie actor collaboration networks [12. This phenomenon in real networks is critical because it was shown that the heterogeneity has a huge impact on the network properties and processes such as network resilience [15.6 shows the empirical results for the Internet at the router level and co-authorship network of highenergy physicists. . The following are the expected values and variances of the node degree in scale-free networks. indicating that the topology of the network is very heterogeneous. 99].

Data courtesy of Ramesh Govindan [61]. The BC of a node i is given by BC(i) = X σst (i) . σst s6=n6=t where σst is the total number of shortest paths from node s to t and σst (i) is the number of these shortest paths passing through node i. Goh et al.2 and is insensitive to the detail of the scale-free network as long as the degree exponent (γ) lies between 2. (a) Internet at the router level. have shown numerically that the load (or BC) distribution follows a power-law.0.6: The degree distribution of real networks.4 Betweenness centrality Betweenness centrality (BC) of a node counts the fraction of shortest paths going through a node. BC was first introduced in the context of social networks [139]. and has been recently adopted by Goh et al. consider the transportation of data packets in the Internet along the shortest paths. COMPLEXITY AND LARGE-SCALE NETWORKS 0 0 10 10 −1 10 −1 10 −2 10 −2 10 −3 P(k) 10 −3 10 −4 10 −4 10 −5 10 −5 10 −6 10 −7 −6 10 0 10 1 2 10 10 10 3 0 10 10 1 10 2 10 3 10 4 10 k (b) (a) Figure 11. [59] as a proxy for the load (li ) at a node i with respect to transport dynamics in a network.12 CHAPTER 11. They further showed that .1 and 3. PL (l) ∼ l−δ with exponent δ ≈ 2.2. If the BC of a node is high. If many shortest paths pass through a node then the load on that node would be high. it implies that this node is central and many shortest paths pass through this node. 11. after Newman [99]. For example. (b) Co-authorship network of high-energy physicists.

11. we discuss how this property can be utilized for local search in complex networks. 121]. 11.7: Illustration of a network with community structure.2. STATISTICAL PROPERTIES OF COMPLEX NETWORKS 13 Figure 11. For example.5 Modularity and community structures Many real networks are found to exhibit a community structure (also called modular structure). In GPP the objective is to divide the nodes of the network into k disjoint sets . Many other centrality measures exists in literature and a detailed review of these measures can be found in [86]. Similar community structures are observed in many networks which reflects the division of nodes into groups based on the node properties [101]. Communities are defined as a group of nodes in the network that have higher density of edges with in the group than between groups. group of nodes enclosed with in a dotted loop is a community. groups of nodes in the network have high density of edges within the group and lower density between the groups (see figure 11. In many ways. community detection is similar to a traditional graph partitioning problem (GPP). in the WWW it reflects the subject matter or themes of the pages. age. profession etc.7).2. in citation networks it reflects the area of research. Later in this chapter. in cellular and metabolic networks it may reflect functional groups [72. That is. there exists a scaling relation l ∼ k (γ−1)/(δ−1) between the load and the degree of a node when 2 < γ ≤ 3. This property was first proposed in the social networks [139] where people may divide into groups based on interests. In the above network.

6 Network resilience The ability of a network to withstand removal of nodes/edges in a network is called network resilience or robustness. COMPLEXITY AND LARGE-SCALE NETWORKS of specified sizes. we do not have any a priori knowledge about the number of communities into which we should divide and about the size of the communities. the presence of community structures is a common phenomenon across many real networks. In general. in real networks. there is no unique definition of a community due to the ambiguity of how dense a group should be to form a community.5. However. in mapping of parallel computations. the stronger is the community structure.14 CHAPTER 11. the removal of nodes and edges disrupts the paths between nodes and can increase the distances and thus making the communication between nodes harder. It is important to note that in spite of this ambiguity. an initially connected network can break down into isolated components that cannot communicate anymore.8 shows the effect of . 120] considers a subgraph as a community if each node in the subgraph has more connections within the community than with the rest of the graph. 103. The goal is to find the naturally existing communities in the real networks rather than dividing the network into a pre-specified number of groups. 139]. Many possible definitions exist in literature [56. 119] have been proposed to decrease the computation time. In more severe cases. Here. This problem is NP-complete [58] and several heuristic methods [69. 81. the tasks have to be divided between a specified number of processors such that the communication between the processors is minimized and the loads on the processors are balanced. For example. A simple definition given in [56. such that. the number of edges between these sets is minimum. it is difficult to evaluate the goodness of a given partition. the number of partitions to be made is specified and the size of each partition is restricted. laying out of circuits (VLSI design) and the ordering of sparse matrix computations [69].2. 120. The higher this difference. Figure 11. Algorithms for detecting these communities are briefly discussed in section 11. Moreover. 109. 11. Newman and Grivan [103] have proposed another measure which calculates the fraction of links within the community minus the expected value of the same quantity in a randomized counterpart of the network. Since we do not know the exact partitions of network. GPP arises in many important engineering problems which include mapping of parallel computations.3.

the lesser the increase. Observe that as we remove more nodes and edges.8: Effects of removing a node or an edge in the network. the removal of nodes at random is termed as random failures and the removal of nodes with higher degree is termed as targeted attacks. Again. the better the robustness of the network.1. For example. other removal strategies are discussed in detail in [71]. removal of nodes/edges on a network. A connected component is a part of the network in which a path exists between any two nodes in that component and the largest connected component is the largest among the connected components. the size of the largest connected component decreases from 13 to 9 and then to 5. Similarly there are several ways of measuring the degradation of the network performance after the removal. One simple way to measure it is to calculate the decrease in size of the largest connected component in the network. the network disintegrates into many components. Another way to measure robustness is to calculate the increase of the average path length in the largest connected component. We discuss more about network resilience and robustness with respect to optimization in section 11. There are different ways of removing nodes and edges to test the robustness of a network. one can remove nodes at random with uniform probability or by selectively targeting certain classes of nodes. Observe that as we remove more nodes and edges the network disintegrates into small components/clusters. The lesser the decrease in the size of the largest connected component. STATISTICAL PROPERTIES OF COMPLEX NETWORKS 15 Figure 11. Malfunctioning of nodes/edges eliminates some existing paths and generally increases the distance between the remaining nodes. the better the robustness of the network. .8. In figure such as nodes with high degree. Usually.

the Watts-Strogatz small-world network model. . the ER random graph abruptly changes its topology from a loose collection of small clusters to one which has giant connected component. and the Barab´asi-Albert scale-free network model. where < k > is the average degree of the network. In a ER random graph. the Erd˝os-R´enyi random graph model. 11. We observe that there exists a threshold pc = 0. 53. Erd˝os and R´enyi have shown that at pc ≃ 1/N. 54] in the 1950s and 1960s. the network is composed of small isolated clusters and when p > pc a giant component suddenly appears. COMPLEXITY AND LARGE-SCALE NETWORKS 11. Most of the modeling efforts focused on understanding the underlying process involved during the network evolution and capture the above-mentioned properties of real networks. This phenomenon is similar to the percolation transition. we concentrate on three prominent models. Figure 11.001 such that when p < pc . The following is the evolutionary model given by Erd˝os and R´enyi: • Start with a set of N isolated nodes • Connect each pair of nodes with a connection probability p Figure 11. They proposed uniform random graphs for modeling complex networks with no obvious pattern or structure.3.9 illustrates two realizations for Erd˝os-R´enyi random graph model (ER random graphs) for two connection probabilities. namely. In specific. for N = 1000.10 shows the change in size of the largest connected component in the network as the value of p increases. we give a brief summary of different models for complex networks. the mean number of neighbors at a distance (number of hops) d from a node is approximately < k >d .3 Modeling of complex networks In this section. a topic well-studied both in mathematics and statistical mechanics [13].16 CHAPTER 11.1 Random graphs One of the earliest theoretical models for complex networks was given by Erd˝os and R´enyi [52.

3.001 such that when p < pc .9: An Erd˝os-R´enyi random graph that starts with N = 20 isolated nodes and connects any two nodes with a probability p.01 Connection probability (p) Figure 11. .006 0. As the value of p increases the number of edges in the network increase.17 11. Note that there exists pc = 0.008 0.10: Illustration of percolation transition for the size of the largest connected component in Erd˝os-R´enyi random graph model. the network is composed of small isolated clusters and when p > pc a giant component suddenly appears. 1000 Size of the largest connected component 900 800 700 600 500 400 300 200 100 0 0 0.004 0. MODELING OF COMPLEX NETWORKS Figure 11.002 0.

indicating a far more ordered structure in real networks. the clustering coefficient of a ER random graph is p = <k> N which is small for large sparse networks. 49. If we consider a node and its neighbors in a ER random graph then the probability that two of these neighbors are connected is equal to p (probability that two randomly chosen neighbors are connected).18 CHAPTER 11. This is only an approximate argument for illustration. The total number of edges in the network is a random variable with an expected value of pN(N − 1)/2 and the number of edges incident on a node (the node degree) follows a binomial distribution with parameters N − 1 and p. p(ki = k) = CNk −1 pk (1 − p)N −1−k . ER random graphs are small world. the degree distribution of many large-scale networks are found to follow a power-law p(k) ∼ k −γ . Hence. We observe that there is a remarkable difference between these networks. This implies that in the limit of large N. the probability that a given node has degree k approaches a Poisson distribution. Figure 11. ER random graphs are statistically homogenous in node degree as the majority of the nodes have a degree close to the average. the distance (l) should be such that < k >l ∼ N. Moreover. and significantly small and large node degrees are exponentially rare. Now. p(k) = <k>k e−<k> . Thus. The model was intuitive and analytically tractable. Hence. ER random graphs were used to model complex networks for a longtime [34]. moreover the average path length of real networks is close to the average path length of a ER random graph of the same size [13]. 31. recent studies on the topologies of diverse large-scale networks found in nature indicated that they have significantly different properties from ER random graphs [13. let us calculate the degree distribution of the ER random graphs. COMPLEXITY AND LARGE-SCALE NETWORKS To cover all the nodes in the network. k! Hence. The network with Poisson degree distribution is more homogenous in node degree. log <k> which scales logarithmically with the number of nodes N.11 compares two networks with Poisson and power-law degree distributions. These discoveries along with others related . the average path length is given by l = log N . 101]. It has been found [143] that the average clustering coefficient of real networks is significantly larger than the average clustering coefficient of ER random graphs with the same number of nodes and edges. The clustering coefficient of the ER random graphs is found to be low. whereas the network with power-law distribution is highly heterogenous. a rigorous proof can be found in [34]. However.

There are few nodes with large degree and many nodes with a small degree to the mixing patterns of complex networks [13.08 P(k) P(k) 0. these models specify either a degree sequence. power-law degree distribution.0E-02 1. If a degree distribution is specified then the sequence is formed by generating N random values from this distribution. 104] to mimic the properties of real-world networks. MODELING OF COMPLEX NETWORKS Poisson Power-law 0. the network with power-law degree distribution is highly heterogenous in node degree.0E-07 1 10 100 1000 k Figure 11. provided that the maximum degree is less than N 1/4 . Most of the nodes in the network have same degree which is close to the average degree of the network. Non-uniform random graphs are also studied [8. Molloy and Reed [93] have proved that for a random graph with a degree distribution p(k) a gaint connected component emerges P almost surely when k≥1 k(k − 2)p(k) > 0.0E-03 1. which is set of N values of the degrees ki of nodes i = 1.02 k 0 0 10 20 k 30 1. 31. [8. 93..12 0. γ) . 9.11: Comparison of networks with Poisson and power-law degree distribution of the same size. 2.06 0. This can be thought as giving each node i in the network ki “stubs” sticking out of it and then pairs of these stubs are connected randomly to form complete edges [104]..0E+00 1. 41. 101] initiated a revival of network modeling in the past few years. Typically.04 0. 9] introduced a two-parameter random graph model P (α. N or a degree distribution p(k).0E-04 1. 102.3.19 11.0E-06 1. in specific.0E-05 1. 49.0E-01 1. Aiello et al.. Later. However. Note that the network with Poisson distribution is homogenous in node degree.1 0. .

noting that the maximum degree of a node in the network is eα/γ . the probability that any of the second neighbors is connected to first neighbors or to one another scales as N −1 and can be neglected in the limit of large N. In addition. the distribution of number of edges from this node (one less than the degree) qk . Assuming that all the nodes in the network P can be reached within l steps. there is no gaint connected component almost surely when γ > γ0 . The generating function for the degree distribution pk P k is given by G0 (x) = ∞ k=0 pk x . we obtain the average path length of the network l = N/z1 z2 /z1 + 1. we can approximately calculate the relation for the average path length of the network. The total number of nodes in the network can be computed. this formulation helps in calculating other properties of the network [104]. The average degree of a randomly chosen node P ′ would be < k >= k kp(k) = G0 (1). Further. Thus the degree distribution of the node arrived by a randomly chosen edge is given by kpk and not pk . we find that the average number of mth neighbors zm . bipartite graphs and degree correlations [101]. Using the results from Molloy and Reed [93].20 CHAPTER 11. Let us consider. [104] have developed a general approach to random graphs by using a generating function formalism [146]. Now.. we have 1 + lm=1 zm = N... the generating function for qk is given by G1 (x) = P∞ k=0 (k+1)pk+1 P k kpk (k+1)pk+1 xk k = = (k+1)pk+1 . <k> ′ G0 (x) . let us start from a node and find the number of first neighbors. This function captures all the information present in the k original distribution since pk = k!1 ddxGk0 |x=0 . ′ G0 (1) Note that the distribution of number of first neighbors of a randomly chosen node (degree of a node) is G0 (x). the degree of the node reached by following a randomly chosen edge.47875. is [G′1 (1)]m−1 G′0 (1) = [ zz12 ]m−1 z1 . mth neighbors. For instance. Whereas. Extending this method of calculating neighbors is given by [ ∂x the average number of nearest neighbors. Here. . second. third . If the degree of this node is k then we are k times more likely to reach this node than a node of degree 1.. is Thus.. they showed that there is almost surely a unique gaint connected component if γ < γ0 = 3. Newman et al. Hence. As for most graphs N ≫ z1 and z2 ≫ z1 . The generating function formalism can further be extended to include other features such as directed graphs. This implies that the average number of second ∂ G0 (G1 (x))]x=1 = G′0 (1)G′1 (1). COMPLEXITY AND LARGE-SCALE NETWORKS for power-law graphs with exponent γ described as follows: Let nk be the number of nodes with degree k. the distribution of number of second neighbors from the same P randomly chosen node would be G0 (G1 (x)) = k pk [G1 (x)]k . such that nk and k satisfy log nk = α − γ log k.

but lie somewhere between these two extremes. The other end is chosen without allowing multiple edges (more than one edge joining a pair of nodes) and loops (edges joining a node to itself). where the function Z is. {ǫi } is the set of observable’s or measurable properties of the network such as number of nodes with certain degree. The following is the algorithm for the model. • Randomly rewire each edge with a probability p such that one end remains the same and the other end is chosen uniformly at random. 11. social networks. This is similar to the Boltzmann ensemble of statistical mechanics with Z as the partition function [101]. 140]. especially. {θi } are adjustable set of parameters for the model.11. Here. The major advantage of these models is that G ǫi (G)P (G) = Z ǫi exp(− they can represent any kind of structural tendencies such as dyad and triangle formations.2 Small-world networks Watts and Strogatz [143] presented a small-world network model to explain the existence of high clustering and small average path length simultaneously in many real networks. number of triangles etc.3. 127]. 70. . Z = G exp(− i θi ǫi ). They argued that most of the real networks are neither completely regular nor completely random.3. MODELING OF COMPLEX NETWORKS 21 Another class of random graphs which are especially popular in modeling social networks is Exponential Random Graphs Models (ERGMs) or p∗ models [20. The ERGM consists of a family of possible networks of N nodes in which each network G appears P P P with probability P (G) = Z1 exp(− i θi ǫi ). the networks can be generated by using Gibbs or MetropolisHastings sampling methods [127]. A detailed review of the parameter estimation techniques can be found in [20. • Start with a regular ring lattice on N nodes where each node is connected to its first k neighbors. Once the parameters {θi } are specified. 129. The WattsStrogatz model starts with a regular lattice on N nodes and each edge is rewired with certain probability p. The ensemble average of a property ǫi is given as hǫi i = P P ∂f 1 i θi ǫi ) = ∂θi . 57.

the graph becomes increasingly disordered until p = 1. . When p = 0 the graph is regular (each node has 4 edges). Also. since all the edges are rewired (see figure 11. small-world models are even more homogeneous than random graphs. Also. Later.12: Illustration of the random rewiring process for the Watts-Strogatz model. After Watts and Strogatz. This class of networks displays a high degree of clustering coefficient for small values of p since we start with a regular lattice. new edges are introduced with probability p. for small values of p the average path length falls rapidly due to the few long-range connections. which is not the case with real networks. This co-existence of high clustering coefficient and small average path length is in excellent agreement with the characteristics of many real networks [98. 143]. The degree distribution of both models depends on the parameter p. The clustering coefficient of the Watts-Strogatz model and the Newman model are CW S = 3(k − 1) (1 − p)3 2(2k − 1) CN = 3(k − 1) 2(2k − 1) + 4kp(p + 2) respectively. colleagues at work etc (the connections in the regular lattice). Thus. COMPLEXITY AND LARGE-SCALE NETWORKS Figure 11.22 CHAPTER 11. Newman [98] proposed a similar model where instead of edge rewiring. 1998 [143]. as p increases. each person has few friends who are a long way away (long-range connections attained by random rewiring).12). evolving from a univalued peak corresponding to the initial degree k to a somewhat broader but still peaked distribution. The above model is inspired from social networks where people are friends with their immediate neighbors such as neighbors on the street. without changing the number of vertices (N = 20) or edges (E = 40) in the graph. all the edges are rewired randomly. The resulting network is a regular network when p = 0 and a random graph when p = 1. This model interpolates between a regular ring lattice and a random network.

3 Scale-free networks As mentioned earlier. . collaboration networks and even biological networks are growing continuously by the creation of new web pages. 8] and movie actor collaboration networks [12. 19. Barab´asi and Albert [25] addressed the origin of this power-law degree distribution in many real networks. their degree distribution follows a power-law.3. The clustering coefficient of a scale-free network is approximately C ∼ (log N )2 . MODELING OF COMPLEX NETWORKS 11.3. leading to a well-developed theory of evolving networks [13. new nodes entering the network do not connect uniformly to existing nodes. They argued that a static random graph or Watts-Strogatz model fails to capture two important features of large-scale networks: their constant growth and the inherent selectivity in edge creation. 49. N which is a slower decay than C =< k > N −1 decay observed in random graphs [35]. • Growth: Start with small number of connected nodes say m0 and assume that every time a node enters the system. These networks were called as scale-free networks because of the lack of a characteristic degree and the broad tail of the degree distribution. In the years following the proposal of the first scale-free model a large number of more refined models have been introduced. This reasoning led them to define the following mechanism. m edges are pointing from it. unlike random networks where each node has the same probability of acquiring a new edge. Complex networks like the World-Wide Web. Moreover. each edge of the newly entered node preferentially attaches to a already existing node i with degree ki with the following probability. where m < m0 . The average path length of this network scales as log(N ) log(log(N )) and thus displays small world property. the Internet [55]. • Preferential Attachment: Every time a new node enters the system. peer-to-peer networks [122]. many real networks including the World Wide Web [5. ki Πi = P j kj It was shown that such a mechanism leads to a network with power-law degree distribution p(k) = k −γ with exponent γ = 3. that is. but attach preferentially to nodes of higher degree. 88]. metabolic networks [77]. 31. start of new researchers and by gene duplication and evolution.23 11. 101]. p(k) ∼ k −γ . 14. phone call networks [4. 25] are scale-free.

we will discuss why these large-scale networks are termed as “complex” networks. the list goes on. Some of the behaviors exhibited by complex systems are discussed below: • Scale invariance or self-similarity: A system is scale invariant if the structure of the system is similar regardless of the scale. This is similar to the percolation threshold. Similarly. they keep accumulating. though the complexity does arise due to the size of the network.24 CHAPTER 11. Consider an airplane as an example. One must also distinguish “complex systems” from “complicated systems” [136]. Note that if we look at a small part of the triangle at a different scale. this is not the case with complex systems. The reason is not merely because the large size of the network. ant hills . an addition of one more small particle may lead to an avalanche demonstrating that sand pile is highly sensitive. However. COMPLEXITY AND LARGE-SCALE NETWORKS 11.13. But after reaching a certain point. A small change in the system conditions or parameters may lead to a huge change in the global behavior. When we are adding more sand particles to a sand pile.. Typical examples of scale invariant systems are fractals. For example.4 Why “Complex” Networks In this section. Examples of such complex systems include ecosystems. the human brain. These emergent behaviors can not be fully explained by just understanding the properties of the individual components/constituents. we know its components and the rules governing its functioning. • Infinite susceptibility/response: Most of the complex systems are highly sensitive or susceptive to changes in certain conditions. consider the Sierpinski triangle in figure 11.. where a small change in connection probability induces the emergence of a giant connected cluster. various organizations/societies. Another good example of such system is a sand pile. at which ever scale we look at the triangle. the nervous system. it is self-similar and hence scale invariant. economies. Complex systems are characterized by diverse behaviors that emerge as a result of non-linear spatio-temporal interactions among a large number of components [73]. • Self-organization and Emergence: Self-organization is the characteristic of a system by which it can evolve itself into a particular structure based on interactions between . then it still looks similar to the original triangle. Even though it is a complicated system.

This network is highly sensitive when we remove just two nodes in the network. consider the network shown in figure 11. This heavy tailed degree distribution induces a high level of heterogeneity in the degrees of the vertices. It completely disintegrates into small components. Most of these networks have power-law degree distribution which does not have any specific scale [25]. a large number of ants interact in an ant colony that leads to more complex emergent behaviors. Self-organization typically leads to an emergent behavior. They have shown that in networks which do not have a heavy tailed degree distribution if the disease transmission . it is self-similar. the constituents and without any external influence. Also. A completely new property arises from the interactions between the different constituents of the system. WHY “COMPLEX” NETWORKS 25 Figure 11.13: Illustration of self-similarity in the Sierpinski triangle. On the other hand.4. Emergent behavior is a phenomenon in which the system global property is not evident from those of its individual parts. consider an ant colony.14(a). the network shown in the figure 11. For example. When we look at a small part of the triangle at a different scale then it looks similar to the original triangle. 115] have shown that the presence of heterogeneity has a huge impact on epidemiological processes such as disease spreading.14(b) having the same number of nodes and edges is not very sensitive. with a huge heterogeneity in node degree. Although a single ant (a constituent of an ant colony) can perform a very limited number of tasks in its lifetime. This is a typical behavior of a scale invariant system. Now let us consider the real large-scale networks such as the Internet.11 (b)). studies [111.14(a). 112. Moreover.1. 113. Most real networks are found to have a structure similar to the the network shown in figure 11. at each scale we look at the triangle.11. the WWW and other networks mentioned in section 11. The heterogeneity makes the network highly sensitive to external disturbances. 114. For example. This implies that the networks do not have any characteristic degree and an average behavior of the system is not typical (see figure 11. Due to these reasons they are called as scale-free networks.

They further showed that no matter what the transmission rate is. we clearly see that these real large-scale networks are highly sensitive or infinitely susceptible. they are called “complex” networks [135]. COMPLEXITY AND LARGE-SCALE NETWORKS Figure 11. More rigorous mathematical definitions of complexity can be found in [23. The above discussion on complexity is an intuitive explanation rather than technical details. . Further. There is no external influence that controlled the evolution process or structure of the network. (a) Observe that when we remove the two highest degree nodes from the network. (b) Example of a network with the same number of nodes and edges which is not sensitive. there exists a finite probability that the infection will cause a major outbreak. it disintegrates into small parts. rate is lesser than a certain threshold. The network in (a) is highly sensitive due to the presence of high heterogeneity in node degree. all these networks have evolved over time with new nodes joining the network (and some leaving) according to some self-organizing or evolutionary rules. these networks have evolved in such a manner that they exhibit complex behaviors such as power-law degree distributions and many others. Hence. it will not cause an epidemic or a major outbreak.26 CHAPTER 11. The network is highly sensitive to node removals. 30]. if the network has power-law or scale-free distribution. it becomes highly sensitive to disease propagation. This network is not effected much when we remove the three highest degree nodes. Nevertheless. However. Hence.14: Illustration of high sensitivity phenomena in complex networks.

In the past few years. there has been tremendous amount of effort by the research communities of different disciplines to understand the processes taking place on networks [13.11. These principles will certainly help in redesigning or restructuring the existing networks and perhaps even help in designing a network from scratch. But the ultimate goal in studying and modeling the structure of complex networks is to understand and optimize the processes taking place on these networks. they are very sensitive to targeted attacks. [15] demonstrated that the topological structure of the network plays a major role in its response to node/edge removal.1 Network resilience to node failures All real networks are regularly subject to node/edge failures either due to normal malfunctions (random failures) or intentional attacks (targeted attacks) [15. one would like to understand how the structure of the Internet affects its survivability against random failures or intentional attacks. On .3 are focused on explaining the evolution and growth process of many large real networks. Albert et al. one has to understand the interactions between the structure of the network and the processes taking place on the network. it is extremely important for the network to be robust against such failures for proper functioning. how the structure of social networks affects the spread of viruses or diseases. 11. They showed that most of the real networks are extremely resilient to random failures. In this chapter. Since a large fraction of nodes have small degree. OPTIMIZATION IN COMPLEX NETWORKS 11. On the other hand. to design rules for optimization.5 27 Optimization in complex networks The models discussed in section 11. because of their high relevance to engineering systems and discuss few other topics briefly.5. 49. Hence. random failures do not have any effect on the structure of the network. namely node failures and local search. They mainly concentrate on statistical properties of real networks and network modeling. which are highly heterogenous in node degree. how the structure of the WWW helps in efficient surfing or search on the web. we concentrate on two processes. 101]. They attribute it to the fact that most of these networks are scale-free networks. In other words. 16]. 31.5. For example. etc.

consider the Internet: despite frequent router problems in the network. The behavior with respective to random failures and targeted attacks is similar for random graphs. Ideally. they behave almost similarly for both random failures and targeted attacks (see figure 11. Valente et al. [133] and Paul et al. ER graphs are homogenous in node degree. for scale-free networks. the other hand.15: The size of the largest connected component as the percentage number of nodes (p) removed from the networks due to random failures (⋄) and targeted attacks (△). we rarely experience global effects. the removal of a few highly connected nodes that maintain the connectivity of the network. (b) Scalefree networks generated by Barab´asi-Albert model with N = 10. due to random failures and targeted attacks.15(b)). To determine the feasibility of modeling such a network. Figure 11. (a) ER graph with number of nodes (N) = 10. [117] have studied . drastically changes the topology of the network. Hence. For example.000 and mean degree < k > = 4. the size of the largest connected component decreases slowly for random failures and drastically for targeted attacks (see figure 11.15(a)).15 shows the decrease in the size of the largest connected component for both scale-free networks and ER graphs. COMPLEXITY AND LARGE-SCALE NETWORKS Size of the largest connected component Random graphs Scale−free networks 10000 10000 9000 9000 8000 8000 7000 7000 6000 6000 5000 5000 4000 4000 3000 3000 2000 2000 1000 1000 0 0 35 0 0 70 35 70 p (a) (b) Figure 11.28 CHAPTER 11. In contrast. we would like to have a network which is as resilient as scale-free networks to random failures and as resilient as random graphs to targeted attacks.000 and < k > = 4. However. that is all the nodes in the network have approximately the same degree. if a few critical nodes in the Internet are removed then it would lead to a devastating effect. Scale-free networks are highly sensitive to targeted attacks and robust to random failures.

The lower the average path length. 74.5. and one node has a very large degree. it is still unclear as to what would be the optimal configuration of such survivable . Flexibility is the ability of the network to have alternate paths for dynamic rerouting. a completely connected network will be the most robust network for both random failures and targeted attacks). Valente et al. responsiveness and flexibility along with robustness for random failures and targeted attacks. They designed a parameterized evolutionary algorithm for supply-chain networks and analyzed the performance with respect to these three measures. OPTIMIZATION IN COMPLEX NETWORKS 29 the following optimization problem: “What is the optimal degree distribution of a network of size N nodes that maximizes the robustness of the network to both random failures and targeted attacks with the constraint that the number of edges remain the same? ” Note that we can always improve the robustness by increasing the number of edges in the network (for instance. specifically for supply-chain networks. They define responsiveness as the ability of network to provide timely services with effective navigation and measure it in terms of average path length of the network. However. [130] consider two other measures. the better is the responsiveness of the network. They showed that the optimal networks that maximize robustness for both random failures and targeted attacks have at most three distinct node degrees and hence the degree distribution is three-peaked.11. Hence the problem has a constraint on the number of edges. where N is the number of nodes. Through simulation they have shown that there exist trade-offs between these measures and proposed different ways to improve these properties. Thadakamalla et al. In [133]. 130. Similar results were demonstrated by Paul et al. showed that the optimal network design is one in which all the nodes in the network except one have the same degree. Many different evolutionary algorithms have also been proposed to design an optimal network configuration that is robust to both random failures and targeted attacks [44. 125. and the flexibility of a network is measured in terms of the clustering coefficient. However. Paul et al. Good clustering properties ensure the presence of alternate paths. 134]. In particular. in [117]. showed that the optimal network configuration is very different from both scale-free networks and random graphs. k2 ∼ N 2/3 . k1 (which is close to the average degree). these optimal networks may not be practically feasible because of the requirement that each node has a limited repertoire of degrees.

Motter showed that a defence strategy based on a selective further removal of nodes and edges. can drastically reduce the size of the cascade. The research question would be “what is the optimal configuration of a network of N nodes that maximizes the robustness to random failures. By local information. Let us suppose some required information such as computer files or sensor data is stored at the nodes of a distributed network or database. with the constraint that the number of edges remain the same? ” Until now. Note that this is different from neighborhood search strategies used for solving combinatorial . Models of cascades of irreversible [97] or reversible [45] overload failures have demonstrated that removal of even a small fraction of highly loaded nodes can trigger global cascades if the load distribution of the nodes is heterogenous. This is an intriguing and relatively little studied problem that has many practical applications. the removal of nodes (power stations) changes the balance of flows and leads to a global redistribution of loads over all the network. 95. or perhaps second neighbors and it is not aware of nodes at a larger distance and how they are connected in the network. Then. 141]. In some cases. 94.30 CHAPTER 11. one should have efficient local (decentralized) search strategies. flexibility. in which a node tries to find a network path to a target node using only local information. targeted attacks. as happened on August 10th 1996 in 11 US states and two Canadian provinces [124]. cascade-based attacks can be much more destructive than any other strategies considered in [15. 138.2 Local search One of the important research problems that has many applications in engineering systems is search in complex networks. 11. we mean that each node has information only about its first. 71]. in many real networks. in order to quickly determine the location of particular information. However. Local search is the process. COMPLEXITY AND LARGE-SCALE NETWORKS networks. this may not be tolerated and might trigger a cascade of overload failures [82]. in [96]. the removal of nodes will also have dynamic effects on the network as it leads to avalanches of breakdowns also called cascading failures. Hence. in a power transmission grid. right after the initial attack or failure. Other studies on cascading failures include [39.5. and responsiveness. we have focussed on the effects of node removal on the static properties of a network. For instance. Later.

Most of the algorithms in wireless sensor networks literature find a path to the target node either by broadcasting or random walk and then concentrate on efficient routing of the data from start node to the end node [10. node 4 also has only local information. the statistical properties of the networks have significant effect on the search process. the algorithms in wireless sensor networks could be integrated with these results for better performance. 11]. 76]. node 1 can calculate the optimal path using traditional algorithms [7] and send the message through this path (1 .16(a). In such a case.5. OPTIMIZATION IN COMPLEX NETWORKS 31 optimization problems [2].13 . we would not know whether a step in the search process is going towards the target node or away from it.30.16(b).16 (b). based on some search algorithm. However.30) is not optimal. each node has global connectivity information about the network (that is. All the intermediary people (or nodes) use this information as a guide for passing the messages. In the first type of network. in which each node knows only about its immediate neighbors. Whereas in the second type of network. the global position of the target node can be quantified and each node has this information. As we will see in this section.3 . In this case. For example.23 . Node 1. depicted by the dotted line). Similarly. However. how each and every node is connected in the network). consider the network shown in figure 11. For example. chooses to send the message to one of its neighbors: in this case.28 . the problem tries to design optimal search algorithms in complex networks. One . if we look at the network considered in Milgram’s experiment each person has the geographical and professional information about the target node. We discuss this problem for two types of networks. The algorithms discussed in this section may look similar to “distributed routing algorithms” that are abundant in wireless ad hoc and sensor networks [10. the main difference is that the former try to exploit the statistical properties of the network topology whereas the latter do not.11.16(a) and 11. during the search process.4 . Hence. The objective is for node 1 to send a message to node 30 in the shortest possible path. and uses the same search algorithm to send the message to node 13. This makes the local search process even more difficult. consider the networks shown in figure 11. In the network shown in figure 11.12 . This information will guide the search process in reaching the target node. This process continues until the message reaches the target node. we can not quantify the global position of the target node. node 4. given that we have only local information available. We can clearly see that the search path obtained (1 . Next.

when a user is searching for a file he/she does not know the global position of the node that has the file. each node has global connectivity information about the whole network. For lack of more suitable name. it is difficult to find out whether this step is towards the target node or away from it. such kind of network is the peer-to-peer network. Further. The path obtained is longer than the optimal path. node 1 calculates the optimal path and send the message through this path. (a) In this case. This is the information about their immediate friends or perhaps . which is even more surprising. (b) In this case. Search in spatial networks The problem of local search goes back to the famous experiment by Stanley Milgram [92] (discussed in section 11. 84. where the network structure is such that one may know very little information about the location of the target node. node 1 tries to send the message to node 30.32 CHAPTER 11. Gnutella [79]. when the user sends a request to one of its neighbors. we focus more on search in non-spatial networks. In this chapter.2) illustrating the short distances in social networks. As pointed out by Kleinberg [83. each node has only information about its neighbors (as shown by the dotted curve). 85]. Another important observation of the experiment. is the ability of these nodes to find these short paths using just the local information. Using this local information. Hence. COMPLEXITY AND LARGE-SCALE NETWORKS Figure 11. Here. we call the networks of the first type spatial networks and networks of the second type non-spatial networks.16: Illustration for different ways of sending message from node 1 to node 30. this is not a trivial statement because most of the time. people have only local information in the network.

OPTIMIZATION IN COMPLEX NETWORKS 33 their friends’ friends. 101. 143]. all the nodes in the network would have received the message at least once or may be even more. Unfortunately. This could have devastating effects on the performance of the network. 84. Search in non-spatial networks The traditional search methods in non-spatial networks are broadcasting or random walk. They do not have the global information about the acquaintances of all people in the network. Sixth-grade students and their teacher sent out a sweet email to all the people they knew. the people to whom he gave the letters have only local information about the entire social network. we can see that arbitrary pairs of strangers are able to find short chains of acquaintances between them by using only local information. Mathematically. 85]. the model given by Kleinberg is highly constrained and represents a very small subset of complex networks. Still. 98. elementary school project [142]. Kleinberg demonstrated that the emergence of such a phenomenon requires special topological features [83. these models are not sufficient to explain the second phenomenon. Watts et al. [144] presented another model which is based upon plausible hierarchical social structures and contentions regarding social networks. Considering a family of network models on a n-dimensional lattice that generalizes the Watts-Strogatz model. and the process continues. The neighbors in turn broadcast the message to all their neighbors. from the results of the experiment. This model defines a class of searchable networks and offers an explanation for the searchability of social networks. a network is searchable if the length of the search path obtained scales logarithmically with the number of nodes N (∼ logN) or lesser. 31. 49. he showed that only one particular model among this infinite family can support efficient decentralized algorithms. each node sends the message to all its neighbors. However. In broadcasting. The observations from Milgram’s experiment suggest that there is something more embedded in the underlying social network that guides the message implicitly from the source to the target. Even in Milgram’s experiment.11. A hint on the potential damages of broadcasting can be viewed by looking at the Taylorsville NC. Such networks which are inherently easy to search are called searchable networks. They requested the recipients to forward the email to .5. Many models have been proposed to explain the existence of such short paths [13. Effectively.

85.0. 118. each node sends the message to one of its neighbors until it reaches the target node. Recently. If the neighbor is chosen randomly with equal probability then it is called random search. Clearly. All the algorithms discussed until now [6. 65. 116. Thadakamalla et al. it implies that this node is critical in the local . 62. have assumed that the edges in the network are equivalent. or power consumption associated with the process described by the edge) usually does not hold in real networks. COMPLEXITY AND LARGE-SCALE NETWORKS everyone they know and notify the students by email so that they could plot their locations on a map. 87. 83. σst (i) is the number of these shortest paths passing through i. bandwidth. But. the assumption of equal edge weights (which may represent the cost.1 and 3. 84. Many researchers [17. 36. If the LBC of a node is high.000 responses from all over the world [142]. the number of steps taken by high-degree search scales with a smaller exponent than the random walk search. 144]. The neighbor can be chosen in different ways depending on the algorithm. 106.0. 60.34 CHAPTER 11. Since most real networks have power-law degree distribution with exponent (γ) between 2. A good way to avoid such a huge exchange of messages is by doing a walk. [6] have demonstrated that high degree search is more efficient than random search in networks with a power-law degree distribution (scale-free networks). high-degree search would be more effective in these networks. for γ > 2. Adamic et al. 100. A few weeks later. [131] have proposed a new search algorithm based on a network measure called local betweenness centrality (LBC) that utilizes the heterogeneities in node degrees and edge weights. They showed that the number of steps (s) required for the random search until the whole graph is revealed is s ∼ N 3(1−2/γ) and for the high-degree search it is s ∼ N (2−4/γ) . 27. the project had to be canceled because they had received about 450. High degree search sends the message to a more connected neighbor that has higher probability of reaching the target node and thus exploiting the presence of heterogeneity in node degree to perform better. distance. In a walk. while in a high degree search the highest degree neighbor is chosen. 148]. The LBC of a neighbor node i. is given by L(i) = X s6=n6=t σst (i) . L(i).t ∈ local network where σst is the total number of shortest paths (shortest path means the path over which the sum of weights is minimal) from node s to t. σst s. have pointed out that it is incomplete to assume that all the edges are equivalent. 28.

0 and 2.30 Exp.41 (103%) 298.71 (29%) 368. performed the best for all edge weight distributions.98 (4%) 1228.99 (59%) 466.06 Power-law σ 2 = 4653.71 (202%) 704.47 (92%) 379. Table 11.70 (272%) 318.2: Comparison of different search strategies in power-law networks with exponent 2. assume that each node in the network has information about its first and second neighbors and using this information. The values in the parentheses show the relative difference between the average distance for each algorithm with respect to the average distance obtained by the LBC algorithm.1 and 2000 nodes with different edge weight distributions.35 11. they observed that as the heterogeneity in the edge weights increase.26 Uniform σ 2 = 8. This implies that it is critical to consider the edge weights in the local search algorithms. which reflects both the heterogeneities in edge weights and node degree.54 (44%) 394.68 (235%) 366. OPTIMIZATION IN COMPLEX NETWORKS Table 11.77 network. σ 2 = 25 1108.9 for a variety of edgeweight distributions.18 (88%) 247. LBC search. In specific.43 (14%) 788. The values in the table are the average distances obtained for each search strategy in these networks. The mean for all the edge weight distributions is 5 and the variance is σ 2 .21 (344%) 358. given that many real networks are heterogeneous in edge weights.8 1011.5.72 (241%) 414.3 1097.95 (7%) 375.15 (145%) 322. . Search strategy Random walk Minimum edge weight Highest degree Minimum average node weight Highest LBC Beta σ 2 = 2. al [6]. They demonstrated that this search algorithm utilizes the heterogeneities in node degree and edge-weights to perform well in power-law networks with exponent between 2.2 compares the performance of different search algorithms for scale-free networks with different edge weight distributions. Moreover. Thadakamalla et al. the node calculates the LBC of each neighbor and passes the message to the neighbor with the highest LBC.83 (26%) 605.3 1107. it becomes important to consider an LBC based search rather than high degree search as shown by Adamic et. the difference between the high-degree search and LBC search increase. The values in the brackets show the relative difference between average distance for each strategy with respect to the average distance obtained by the LBC strategy.

139]. In this subsection. One of the major classes of algorithms for clustering is hierarchical algorithms which fall into two broad types. Identifying them helps in finding the functional modules which correspond to specific biological functions. The behavior of an individual is highly influenced by the community he/she belongs. an empty network (n nodes with no edges) is considered and edges are added based on some similarity measure between nodes (for example. • Biological networks: Community structures are found in cellular [72. content filtering. Algorithmically. automatic classification. the community detection problem is same as cluster analysis problem studied extensively by OR community. computer scientists. Communities often have their own norms. statisticians. COMPLEXITY AND LARGE-SCALE NETWORKS 11. automatic realization of ontologies and focussed crawlers [18. • Social networks: Community structures are a typical feature of a social network. In an agglomerative method. and mathematicians [67]. subcultures which are an important source of a person’s identity [103. community structures are typically found in many real networks. 123]. agglomerative and divisive. The following are some of the examples: • The World Wide Web: Identification of communities in the web is helpful for implementation of search engines. Sometimes the statistical properties of the community alone may be very different from the whole network and hence these may be critical in understanding the dynamics in the community.3 Other topics There are various other applications to real networks which include the issues related to the structure of the networks and their dynamics. we briefly summarize these applications and give some references for further study. Detecting community structures As mentioned earlier. similarity based on the number of . Finding these communities is extremely helpful in understanding the structure and function of the network. metabolic [121] and genetic networks [147].36 CHAPTER 11. 56].5.

j} ∈ E. 68. in divisive methods edges are removed from the network based on certain measure (for example. . 110]. {i. Then we need to use external memory algorithms and data structures [3] for solving the optimization problems in such networks. let G(S) denote the subgraph induced by a subset S ⊆ V . The maximum clique problem has many practical applications such as project selection.e. A comprehensive list of algorithms to identify community structures in complex networks can be found in [46] where Danon et al. economics and integration of genome mapping data [38. many such algorithms are proposed and applied to complex networks [31. 46]. i. E) is complete if each node in the network is connected to every other node. solve this problem for finding maximal independent set in the market graph which can form a base for forming a diversified portfolio. if the network size is large. A clique C is a subset of V such that the induced graph G(C) is complete.5. coding theory. using external memory algorithms. Boginski et al. Abello et al proposed decomposition schemes that make large sparse graphs suitable for processing by graph optimization algorithms. Further. Another interesting problem in community detection is to find a clique of maximum cardinality in the network. in [33]. computer vision. then the data may not fit completely inside the computer’s internal memory. On the other hand. These algorithms use slower external memory (such as disks) and the resulting communication between internal memory and external memory can be a major performance bottleneck. E). The maximum clique problem is known to be NP-hard [58] and details on various algorithms and heuristics can be found in [78. As this process continues the network disintegrates into different communities. the edge with the highest betweenness centrality [103]). A clique is a complete subgraph in the network. In the network G(V. have compared them in terms of sensitivity and computational cost.11. j ∈ V. A network G(V. For instance. This procedure can be stopped at any step and the distinct components of the network are taken to be the communities. Recently. 110]. ∀i. OPTIMIZATION IN COMPLEX NETWORKS 37 common neighbors) starting with the edge between the pairs with highest similarity. In [4].

137]. computer virus or information on a network constitute examples of spreading processes. 49. there has been a burst of activities on understanding the effects of the network properties on the rate and dynamics of disease propagation [13. Recently. 112. presence of high clustering coefficients. 101]. However. Ohira and Sawatari [107] have shown that there exists a phase transition from a free flow to a congested phase as a function of the packet generation rate. Considering a basic model. 105]. In particular. which implies that the individuals who are in contact with susceptible individuals are uniformly distributed throughout the entire population. Due to limitations in resources (bandwidth). 42. many models have been proposed [13. many researchers have shown that incorporating these properties in the model radically changes the results previously established for random graphs. recent findings (section 11. congestion in networks. 31. increase in number of packets (packet generation rate) may lead to overload at the node and unusually long deliver times. 115] which consider these properties of the network. 101. In particular. COMPLEXITY AND LARGE-SCALE NETWORKS Spreading processes Diffusion of a infectious disease.38 CHAPTER 11. in other words. and strategies for marketing campaigns [90]. . The study of epidemiological modeling has been an active research area for a long time and is heavily used in planning and implementing various prevention and control programs [48]. data dissemination on the Internet [80. the better is the network performance with respect to congestion. Congestion Transport of packets or materials ranging from packet transfer in the Internet to the mass transfer in chemical reactions in cell is one of the fundamental processes occurring on many real networks.2) such as heterogeneities in node degree. 49. and community structures indicate that this assumption is far from reality. 31. Later. the spread of infectious diseases in a population is called epidemic spreading. Most of the earlier methods used the homogenous mixing hypothesis [21]. Other spreading processes which are of interest include spread of computer viruses [24. This critical rate is commonly called “congestion threshold” and the higher the threshold. 91.

132]. Importantly. For example. Network Science. Congestion due to such local search algorithms and optimal network configurations are studied in [22]. 51.6 Conclusions Complex networks abound in today’s world and are continuously evolving. 128] have been proposed to improve the performance. [128] introduced a novel static routing protocol which is superior to shortest path routing under intense packet generation rates. The sheer size and complexity of these networks pose unique challenges in their design and analysis. Toroczkai et al. 51. 50. 47. routing is done using local search algorithms. Sometimes when global information is not available. 128. 64. They propose a mechanism in which packets are routed through hub avoidance paths unless the hubs are required to establish the route. scale-free topologies are less prone to congestion than random graphs.6. in scale-free networks. 11. In this chapter. Singh and Gupte [126] discuss strategies to manipulate hub capacity and hub connections to relieve congestion in the network. to optimize a process on a given config- .11. Therefore. Routing algorithms also influence congestion at nodes. We discussed how network approaches and optimization problems are different in network science than traditional OR algorithms and addressed the need and opportunity for the OR community to contribute to this fast-growing research field. 126. 50. 63. if the packets are routed through the shortest paths then most of the packets pass through the hubs and hence causing higher loads on the hubs [59]. CONCLUSIONS 39 Many studies have shown that an important role is played by the topology and routing algorithms in the congestion of networks [40. Similarly many congestion-aware routing algorithms [40. Sreenivasan et al. [132] have shown that on large networks on which flows are influenced by gradients of a scalar distributed on the nodes. we presented significant findings and developments in recent years that led to a new field of inter-disciplinary research. The fundamental difference is that largescale networks are characterized based on macroscopic properties such as degree distribution and clustering coefficient rather than the individual properties of the nodes and edges. Such complex networks are so pervasive that there is an immediate need to develop new analytical approaches. these macroscopic or statistical properties have a huge influence on the dynamic processes taking place on the network.

findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF). This will further help in the design of optimal network configurations for various processes.) for making this work feasible. Moreover. the macroscopic properties and structure of networks across different disciplines are found to be similar. Acknowledgments The authors would like to acknowledge the National Science Foundation (Grant # DMI 0537992) and a Sloan Research Fellowship to one of the authors (R. A. it is important to understand the interactions between the macroscopic properties and the process. a macroscopic network approach is necessary for the design and analysis of such systems. Hence the results of this research can easily be migrated to applications as diverse as social networks to telecommunication networks. Due to the growing scale of many engineered systems. the authors would like to thank the anonymous reviewer for helpful comments and suggestions. Any opinions.40 CHAPTER 11. In addition. . COMPLEXITY AND LARGE-SCALE NETWORKS uration.

[3] J. [2] E. Abello and J. Vitter. Network flows: Theory. External Memory Algorithms: DIMACS series in discrete mathematics and theoretical computer science. and M. Aarts and J. and J. B. 2001. Resende. Adamic and B. P. E. Lukose. Algorithms. volume 50. 1999. Wiley & Sons. L.imdb. J. Nature. volume 50. R. K. and B. Phys. 1999. Ahuja. Magnanti. [7] R. American Mathematical Society. Local Search in Combinatorial Optimization. Rev.com/.Bibliography [1] The Internet Movie Database can be found on the WWW at http://www. C. chapter On maximum clique problems in very large graphs. R. USA. editors. Puniyani. MA. 401(6749):131. K. A. Huberman. [4] J. 41 . 64(4):046135. editors. Huberman. Abello. Pardalos. M. Growth dynamics of the world-wide web. Adamic. American Mathematical Society. NJ. pages 119–130. [5] L. Orlin. 1997. UK. External Memory Algorithms: DIMACS series in discrete mathematics and theoretical computer science. G. A. M. A. Search in powerlaw networks. Prentice-Hall. T. Boston. A. and Applications. Lenstra. 1993. A. [6] L. 1999. Chichester.

Proceedings of the thirty-second annual ACM symposium on Theory of computing. and L. and A. Attack and error tolerance of complex networks. and G. A community-aware search engine. Computer Networks. E. Almeida. 2004. I. Nature. Phys. In Proceedings of the 13th International Conference on World Wide Web. Jeong. 1999. Topology of evolving networks: Local events and universality. Albert and A. L. Nakarado. [11] J. 85(24):5234–5237. Al-Karaki and A. Oltval. Structural vulnerability of the north american power grid. 2002. Barab´asi. Barab´asi. pages 171–180. Global organization of metabolic fluxes in the bacterium escherichia coli. F. Aiello. ACM Press. A. Barab´asi. Almeida and V. Z. N. L. Rev. Albert. Sankarasubramaniam. 11(6):6–28. B. 401(6749):130–131. . Lett. 2000. Albert. IEEE Wireless Communications. [12] R. Experimental Mathematics. B. Nature. and A. 427(6977):839–843. Akyildiz. 2004. 2004. 2000. 38(4):393–422. 69(2):025103. [10] I. Barab´asi. Cayirci. L. 2004. and E. and A. 406(6794):378–382.42 BIBLIOGRAPHY [8] W. Kamal. [9] W. Su. Albert. [17] E. F. Lu. [13] R. L. [16] R. Lu. Nature. Wireless sensor networks: A survey. Phys. Diameter of the world wide web. Albert. 2001. T. [14] R. Statistical mechanics of complex networks. Routing techniques in wireless sensor networks: a survey. F. 2000. Aiello. Chung. Y. and L. 2002. A random graph model for massive graphs. Viscek. F. L. L. Chung. 10(1):53–66. H. A random graph model for power law graphs. [15] R. Kovacs. 74(1):47–97. H. Rev. Albert and A. W. [18] R. Almaas. N. Barab´asi. Jeong. Reviews of Modern Physics. E..

and A. Pastor-Satorras. R. chapter Search and Congestion in Complex Networks. J. Albert. S. N. Vega. Pastor-Satorras. Barrat. R.. L Barab´asi and R. Modeling the evolution of weighted networks. A. Amaral. Proc. and T. A. [22] A. Characterization and modeling of weighted networks. Natl. 304(5670):527–529. [27] A. Jeong. Barthelemy. Vicsek. Anderson. Schubert. 2004. Forrest. 70(6):066149. Scala. and A. The architecture of complex weighted networks. Vespignani. M. Diaz-Guilera. Phys. Rev. E. May. Arenas. [23] R. Cambridge university press. Acad. Ravasz. 346:34–43. Physica A. M. 101(11):3747. pages 175–194. and F. Z. Barab´asi. Infectious Diseases in Humans. Barrat. R. M. 2004. Sci. Technological networks and the spread of computer viruses. . L. [24] J. Neda. Barthelemy. Cabrales. 97(21):11149–11152. Classes of small-world networks. H. Stanley. Wasserman. Natl. Politi. Evolution of the social network of scientific collaborations. A. Anderson and R. Balthrop. E. Vespignani. [20] C. Barrat. M. and B. Sci. Guimera. 2005. [26] A. 2002. Oxford. Statistical mechanics of complex networks. Badii and A. Williamson. 286 (5439):509–512. Springer-Verlag. Vespignani. A p∗ primer: Logit models for social networks. Proc. 2000. 21(1):37–66. Germany. and M. Physica A. M. E. 1999. [25] A. Science. Emergence of scaling in random networks. M.BIBLIOGRAPHY 43 [19] L. 1992. [21] R. 1997. Acad. 311:590–614. Barthelemy. 1999. Crouch. and H. [29] M. S. A. and A.. A. E. Complexity : Hierarchical structures and scaling in physics. 2004. [28] A. Social Networks. Science. M. Oxford University Press. Newman. Berlin. A. 2003. Barthelemy.

Statistical analysis of financial networks. Boginski. 2003. 2005. Computer networks. Pardalos. and P. [33] V. Complex networks: Structure and dynamics. 2003. Latora. Academic. Butenko. 12(4):985–994. Wiener. Wiley-VCH. Carreras. Tomkins. [36] L. Hwang. and D. E. Critical points and transitions in an electric power transmission model for cascading failure blackouts. Boccaletti. Randon graphs. Computers & Operations Research. and P. 2006. Boginski. Pardalos. Butenko. [35] B. [32] V. 2003. Chaos. V. Moreno. E. E. R. Chavez. A. Raghavan. [37] A. Clique-detection models in computational biochemistry and genomics. [38] S. 1985. and Why. Stanley. Mining market data: a network approach. S. Havlin. and J. Computational Statistics & Data Analysis. Bollobas and O. Phys. E. [39] B. Rev. chapter How to Define Complexity in Physics. 2000. [40] Z. 424:175–308. A. S. Chen and X.. 91(16):168701. Y. chapter Mathematical results on scale-free graphs. H. Newman. 2006. Rajagopalan. European Journal of Operational Research. 2006. V. 2006. Effects of network structure and routing strategy on network capacity. F.E. [31] S. 33:309–320. S. I. . M. Lynch. S. and H. Bollobas. Lett. Stata. Graph structure in the web. U. A. 73(3):036107. Oxford University Press. Optimal paths in disordered complex networks. Dobson. Wang. R. Buldyrev. Braunstein. 173:1–17. 2002. P. [34] B. Butenko and W. V. From Complexity to Life. Rev. Broder.44 BIBLIOGRAPHY [30] C. London. and D. S. Riordan. Y. F. Handbook of Graphs and Networks. R. Phys. Maghoul. Physics Reports. Kumar. 48:431–443. Cohen. pages 34–43. Berlin. 33:3171–3184. Wilhelm. Bennett.

Vespignani. Reinforcing the resilience of complex networks. E. Mendes. Annals of combinatorics. 71(2):325–331. Lett. 31(3):681–703. Dynamics of jamming transitions in complex networks. Gomez-Gardenes. and A. and Y. 2000. 2006. F. Mathematical Epidemiology of Infectious Diseases: Model Building. Diaz-Guilera. Model for cascading failures in complex networks. Wiley. 2004. 2004. The role of the airline transportation network in the prediction and predictability of global epidemics. Rev. 2005. Rev. and M. PNAS. [45] P. [49] S.. 69(6): 066127. Phys. Barthlemy. Rev. Fluctuations in network dynamics. 103(7):2015–2020. 2002. 92(2):028701. Evolution of networks. 2002. Gomez-Gardenes.. Dorogovtsev and J. Barab´asi. 2004. Connected components in random graphs with given degree sequences. A. . Argollo de Menezes and A. Europhys. Marchiori. Faust. Phys.-L. Phys. Heesterbeek. [47] M. J. Phys.BIBLIOGRAPHY 45 [41] F.. Arenas. [43] N. Echenique. Costa. E. Contractor. Diekmann and J. Danon. A. 2006. and K. M. [51] P. Moreno. and A. and Y. Latora. Chung and L. Lu. Crucitti. Analysis and Interpretation. S. Journal of Statistical Mechanics. [50] P. Adv. Colizza. [48] O. 69(4):045104. Wasserman. Lett. Barrat. J. page P09008. Rev. F. 2004. N. New York. F. J. V. 51: 1079–1187. 70(5):056105. Moreno. 6:125–145. Improved routing strategies for internet traffic delivery. 2005. Duch. [42] V. [44] L. Comparing community structure identification. Testing multi-theoretical multilevel hypotheses about organizational networks: An analytic framework and empirical example. Academy of Management Review. Echenique. Phys. E. [46] L.

Kozl. Frank and D.Renyi. E. On the evolution of random graphs. Johnson. 78(6): 1360–1380. 29:251–262. [55] M. Goh. Goh. [61] R. IEEE INFOCOM.46 BIBLIOGRAPHY [52] P. 1973. 1979. 6:290–297. 2005. Granovetter. Universal behavior of load distribution in scale-free networks. Mat. 2001. Kahng. [54] P. Strauss. Garey and D. Tangmunarunkit. and C. Rev. Computer Communications Review. Noh. 5:17–61. Kahng. Flake. On power-law relationships of the internet topology. and D. Lett. Lawrence. [60] K. [56] G. In Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 1961.. Computers and Intractability. 1986.. Load distribution in weighted complex networks. H. 1999. 3:1371–1380. and C. Acta Math. On the strength of connectedness of a random graph. 2000. Acad. . On random graphs. P. S. Heuristics for internet map discovery. D. 87(27). I. Faloutsos. Kim. [53] P. 81: 832–842. and D. American Statistical Association. Kim. 72(1):017102. Efficient identification of web communities. A Guide to the Theory of NP-Completeness. I. Publicationes Mathematicae. B. 2000. J. Hungar. W. Kutato Int. Erdos and A. Erdos and A. 278701. 1960. American Journal of Sociology.Renyi. Phys. Govindan and H. B. [58] M. Freeman. pages 150–160. Sci. [62] M. R. [57] O. 12:261–267. 1959. Faloutsos. Rev. The strength of weak ties. Faloutsos. Magyar Tud. S. Phys. Lee Giles. Markov graphs.. Erdos and A.Renyi. [59] K. J.

Gulli and A. In WWW ’05: Special interest tracks and posters of the 14th international conference on World Wide Web. E.. Arenas. 79:191–215. Phys. Lett. Guimera. [69] B. 1997. Arenas. N. and G. community structure. Optimal network topologies for local search with congestion. . 66(2):026704. and cities’ global roles. New York. [70] P. 1981. W. USA. [71] P. A. 89(24):248701. American Statistical Association. Vairaktarakis. Phys. Pardalos.BIBLIOGRAPHY 47 [63] R. 65 (5). Phys. Hansen and B. A. A multilevel algorithm for partitioning graphs. 2005. Attack vulnerability of complex networks. Guimera. [68] J. S. Rev. pages 902–903. Jaumard. 1993. page 28.. A. D´ıaz-Guilera. Leinhardt. 2002. In Supercomputing ’95: Proceedings of the 1995 ACM/IEEE conference on Supercomputing. [66] A. Hendrickson and R. Proc. 1995. 3: 463–482. [65] R. A. Mathematical programming. Mossa. The worldwide air transportation network: Anomalous centrality. A. J. 2005.5 billion pages. Nat. USA. Amaral. 76:33–65. ACM Press. Holland and S. New York. and F. M. [64] R. Test case generators and computational results for the maximum clique problem. F. The indexable web is more than 11. and L. 2002. D´ıaz-Guilera. J. Cluster analysis and mathematical programming. [67] P. W. Turtschi. Dynamical properties of model communication networks. ACM Press. Giralt. Acad. Rev. Vega-Redondo. Journal of Global Optimization. Hasselberg. and A. Rev. Holme and B. E. Leland. Cabrales. P. Signorini. 102:7794–7799. Sci. Guimera. 2002. An exponential family of probability distributions for directed graphs. Kim. A.

2001. and A. Bioinformatics. Ferrer i Cancho and R. pages 114–126. Tombor. Johnson and M. 2003. N.48 BIBLIOGRAPHY [72] P.iastate. Honavar. The Bell System Technical Journal. October 11-13. Estrin.-L. Z. Ferrer i Cancho.cs. R. 2003. and Satisfiability: Second DIMACS Implementation Challenge. Berlin. and A. editors. 49:291–307. Jeong.edu/∼honavar/cas. pages 174–185. Lin. 2000. J. Boston. Phys. 1993. C. The large-scale organization of metabolic networks. [78] D. date accessed: March 22. [75] R. [73] V.html. O’Reilly. American Mathematical Society. M. Peer-to-Peer Harnessing the Power of Disruptive Technologies. [79] G. Sol´e. 2006. 1996. Jeong. E. Huss. 64(4):046119. [81] B. Statistical mechanics of complex networks. 407:651–654. Kan. Complex Adaptive Systems Group at Iowa State University. Kermarrec. Intanagonwiwat. 1970. and D. http://www. . chapter Optimization in complex networks. USA. [76] C. on Parallel and Distributed Sys. Beijing.W. Albert. MA. A. IEEE Trans. R. and R. J. Subnetwork hierarchies of biochemical pathways. L. Topology of technology graphs: Small world patterns in electronic circuits. 2003. Probabilistic reliable dissemination in large-scale systems. Barab´asi. Workshop. Proceedings of ACM MobiCom ’00. Directed diffusion: a scalable and robust communication paradigm for sensor networks. Nature. An efficient heuristic procedure for partitioning graphs. Springer-Verlag. Boston. Cliques. Rev.-M. [77] H. Ganesh. V. chapter Gnutella. Janssen. Govindan. Oltvai. [80] A. V. B. 2001. [74] R. Massoulie. and H. 19:532–538. Sol´e. 14(3):248–258. Holme. Coloring. Kernighan and S. 2000. Trick.

2005. The small world problem. 2003. A. K. Nature. A. Kleinberg. May. Lawrence and C. Huberman. How viruses spread among computers and people. [83] J. R. 1999. S. Accessibility of information on the web. L. A. 292:1316–1317. M. Frank. Albert. The small-world phenomenon: An algorithmic perspective. 2005. Modeling cascading failures in the north american power grid. E. 2:60–67. Milgram. 2000. 406:845. D. Krause. Nature. [85] J. Zlotowski. Lehmann. The dynamics of viral marketing. [84] J. Proceedings of the nineteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems. 426:282–285. L. 2000. 1967. Science. Kinney. [86] D. 2005. 46:101–107. Advances in Neural Information Processing Systems. Nature. Kosch¨ utzki. . Ulanowicz. [92] S. Taylor. Tomkins. A. Psychology Today. Kumar. 2001. K. D. Compartments revealed in food-web structure. pages 1–10. Small-world phenomena and the dynamics of information. Berlin. 32nd ACM Symposium on Theory of Computing. Lloyd and R. 2000. Proc. L. M. and V. pages 163–170. Network Analysis.BIBLIOGRAPHY 49 [82] R.org/abs/physics?papernum=0509039. 2001. Latora. W.arxiv. and B. Peeters. Tenfelde-Podehl. L. [87] A. Richter. Navigation in a small world. [89] S. Raghavan. Giles. e-print physics/0509039. Rajalopagan. [91] A. Upfal. D. pages 16–61. Kleinberg. [90] J. and O. and W. and E. Crucitti. The European Physical Journal B. http://lanl. Sivakumar. Mason. P. 400: 107–109. Leskovec. The web as a graph. Adamic. Springer-Verlag. chapter Centrality Indices. A. E. 14:431–438. S. [88] R. P. R. Kleinberg.

Newman. SIAM Review. 2001. 2001. [96] A. [102] M. 2002. E.. Scientific collaboration networks: Ii. [101] M. J. 62(2):292–298. Critical load and congestion instabilities in scale-free networks. Europhys. 6:161–179. 2003. J. and centrality. Moreno. weighted networks. Lai. Pastor-Satorras. Finding and evaluating community structure in networks. 2000. [98] M. Europhys. E. Rev. [94] Y. Journal Statistical Physics. shortest paths. 2004. Lett. and A. 69(2):026113. R. Girvan. Rev. Lett. J. E. Vazquez. Random Structures Algorithms. B. Phys. Cascade-based attacks on complex networks.50 BIBLIOGRAPHY [93] M. Molloy and B. Newman. A. [99] M. Lett. 45:167–256. 2003. E. Rev. Reed. Phys. Vespignani. network construction and fundamental results. E. Gomez. E. Scientific collaboration networks: I. 2002. 1995. 58(4):630–636. Phys. [97] A. Newman. Pacheco. . Motter and Y. 101:819–841. Rev. Cascade control and defense in complex networks.. and A. Phys. 2004. J. A critical point for random graphs with a given degree sequence. 64(1):016132. E. J. E. Instability of scale-free networks under node-breaking avalanches. Motter. The structure and function of complex networks. F. Newman. Rev. E. [103] M. J. [95] Y. Newman. 66(6):065102. J. Moreno. Newman and M. 93 (9):098701. E. 64(1):016131. [100] M. E. Models of small world. Phys. E. Handbook of Graphs and Networks. chapter Random graphs as models of networks..

Newman.. E. 86:3200–3203. Nature. E. [106] J. E.BIBLIOGRAPHY 51 [104] M. . Rev. 2001. Phase transition in a computer network traffic model. 65(3):036104. E. Phys. and J. Farkas. [108] Committee on network science for future army applications. Pastor-Satorras and A. Epidemic spreading in scale-free networks. D. Watts. 65(3):035108. Network Science. M. 2002. and T. Phys. E. Phys. S. 66(3):035101. Uncovering the overlapping community structure of complex networks in nature and society. Pastor-Satorras and A. Epidemic dynamics and endemic states in complex networks. and D. Pardalos and J. Random graphs with arbitrary degree distributions and their applications. 1998. Vespignani. Pastor-Satorras and A. 2002. E. Rev. Xue. Immunization of complex networks. [110] P. 66(6):066127. Rev. Epidemic dynamics in finite size scale-free networks. 63(6):066117. 1994. J. E. Rev. 2002. Lett. Balthrop. 2002. Phys. Palla. [114] R. Derenyi. 2001. Sawatari. Stability of shortest paths in complex networks with random edge weights. 58(1):193–195. Rev. J. Vespignani. S. [109] G. [111] R. 2005. [113] R. Ohira and R. Phys. [112] R. Rev. Vespignani. Vicsek. Noh and H. Pastor-Satorras and A. Strogatz. H. [107] T. E. [105] M. E. 2001. 2005. Vespignani. The maximum clique problem. The National Academies Press. Journal of Global Optimization. I. Phys. 435:814–818. Email networks and the spread of computer viruses. Rieger. I. Newman. 64(2):026118. Rev. J. Forrest. Phys. Rev. Phys. 4:301–328.

2002. V. Partitioning sparse matrices with eigenvectors of graphs. Optimization of robustness of complex networks. Barab´asi. Epstein. Liou. and H. I. Wiley-VCH. 2002. Berlin. Loreto. 101:2658–2663. Pimm. Pastor-Satorras and A. [125] B. F.. Mongru. 6:50–57. 61(5):4877–4882. Acad. A. Iamnitchi. 1990. Journal B. Sci. Rives and T. C. L. W. and K. Paul. Vespignani. [123] A. 11(3):430–452. Natl. Phys. E. E. Cecconi. Disturbances in a power transmission system. Carreras. H. E. Lynch. . [117] G. Rev. [116] R. The University of Chicago Press. Havlin. and V. Sayama. Galitskidagger. Z. 2 edition. Tanizawa. T. Evolution and structure of the Internet: A statistical physics approach. Stanley. Somera. Shargel. 2004. Food Webs. B. 2003. A. [124] M. and Y. Lett. Parisi. Modular organization of cellular networks. I. Defining and identifying communities in networks. Science. Radicchi. IEEE Internet Computing Journal. 90(6):068701. 2003. 2002. chapter Epidemics and immunization in scale-free networks. Hierarchical organization of modularity in metabolic networks. Proc. and D. R.L. Pothen. Sci. Optimization of robustness and connectivity in complex networks. [121] E. Matrix Anal. 100(3):1128–1133. L. Eur. Ripeanu. Rev. S. Natl. 2000.. D. Vespignani. Sachtjen. and A. Phys. [120] F. Bar-Yam. [118] S. 2004. [122] M. A. Simon. 38:187–191. Handbook of Graphs and Networks. Foster. 2003. 297:1551–1555. [119] A. N. SIAM J.52 BIBLIOGRAPHY [115] R. Mapping the gnutella network: Properties of large-scale peer-to-peer systems and implications for system design. Cambridge University Press. Pastor-Satorras and A. L. Castellano.. Phys.. 2004. and A. H. Proc. Acad. Oltvai. Ravasz.

SIAM Review.. Rev. Rev. [128] S. Snijders. Epidemic modeling: Dealing with http://vw. Rev. Thadakamalla. 72(6):066128. J. Search in weighted complex networks. Z. N. and H. R. P. Sarkar. Date accessed: July 6. Social Structure. B. C. On a general class of models for interaction. Lett.gov/abs/cs?papernum=0604023. 19:24–31. [127] T. Toroczkai and K. Vespignani. R. Raghavan. . P. X. Lopez. 92(11):118702. http://xxx. [134] V. and F. Thadakamalla. Phys. Gupte. T. 2004.indiana. 2005. 2004. 2004. Albert. E. Mu. Nature. [129] D. 2004. N. Two-peak and three-peak optimal complex networks. 2006. complexity. Spontaneous emergence of complex optimal networks through evolutionary adaptation. Markov chain monte carlo estimation of exponential random graph models. Valente. 28(9):1789–1798. munication bottlenecks in scale-free networks. and R. 71(5):055103. E. 1986. 428:716. Sreenivasan. P. IEEE Intelligent Systems. [135] A. Patkar. S. 2002. Congestion and decongestion in a communication network. [130] H. A. 2006. 2005. S. and H. R. 28:513–527. Com- e-print cs. Cohen. Kumara. E. Stanley. R. T. Phys.lanl. and S.NI/0604023. A. Kumara. [131] H. Albert. U. Stone.53 BIBLIOGRAPHY [126] B. Survivability of multi-agent based supply networks: A topological perspective. Computers & Chemical Engineering. Phys. Toroczkai.edu/talks-fall04/. R. Network dynamics: Jamming is limited in scale-free systems. K. [132] Z. Strauss. [133] A. A. Bassler. E. Venkatasubramanian. Katare. 2004. Singh and N. 3(2):1–40. E.

Faust. W. 1998. Generating Functionology. . [143] D.columbia. J. Natl. pages 75–81.. Watts. Frontiers of Engineering: Reports on Leading-Edge Engineering from the 2005 Symposium. [137] W. 2006. Watts. Rev. http://smallworld. Cambridge University Press.edu/index. 2004. H. 1996. 2002. Rev. [145] D. Six degrees: The science of a connected age. J. date accessed: March 22. 2003. chapter Complex Networks: Ubiquity. SIGCOMM Comput. R. 296:1302–1305. Social Network Analysis. Pattison. [141] D. Proc. Vespignani. Wasserman and K. S. [140] S.html. and Implications. E. E. Identity and search in social networks. Sci. Watts and S. J. Newman. 2002. The power of epidemics: robust communication for large-scale distributed systems. P. J. 393:440–442. [142] D. Commun. The National Academies Press. Nature. Vogels. Xu. Boston. Cascading failures in coupled map lattices. van Renesse. [146] H. F. 99(9):5766–5771. Muhamad. Acad. Wasserman and P. Psychometrika. Wilf. S. 2003. [138] X. Dodds. J. Wang and J. 70 (5):056113. Norton & Company. [144] D. 1994. Strogatz. 61:401–426. 2006. Birman. and M. Logit models and logistic regressions for social networks 1: An introduction to markov random graphs and p∗ . A simple model of global cascades on random networks. P.54 BIBLIOGRAPHY [136] A. 33 (1):131–135. Watts. Watts. [139] S. 1990.. Phys. Importance. and R. Academic. J. Dodds. Science. W. Collective dynamics of “small-world” networks. and K. S.

Sci. H. A method for finding communities of related genes. and Y. Rev. Acad. Lett. Weighted evolving networks. Huberman. . [148] S. Tu. Yook.. 2004. Barab´asi. Natl. H. Phys. Jeong. M..BIBLIOGRAPHY 55 [147] D. Wilkinson and B. A. L. 101:5241–5248. A. 86. Proc. 2001:5835–5838.