

An algorithm for determining backbones in a wireless sensor network using the Smallest Last graph coloring algorithm in random geometric graphs


EXECUTIVE SUMMARY

INTRODUCTION & SUMMARY

ABSTRACT

I implemented an algorithm which efficiently creates, colors, and computes statistics on a Random Geometric Graph (RGG) of various shapes and distributions. This implementation can produce reasonably sized networks for visualization in under a second, and can produce networks containing over a hundred thousand vertices and millions of edges in a few minutes. The implementation produces redundant independent sets, which could serve as backbones within the network, using a greedy graph coloring method [Kosowski] and the Smallest Last (SL) vertex ordering algorithm [Matula]. Many of these backbones achieve a very high percentage of coverage on the graph while using only a small subset of the vertices. This study has focused on applications within the field of wireless sensor networks and could serve as a useful tool in the simulation and study of such networks.

Note that a website has been developed to accompany this material. The site includes interactive features which will better relay some of the algorithms in this project, including Smallest Last ordering, Grundy coloring, etc. All of the more data-intensive features (such as tabulated orderings and statistics for benchmark graphs) have also been included on the website, along with extensive details showing the performance of this algorithm on certain benchmark sets. This paper is intended only to outline the algorithmic developments behind the project.

BACKGROUND & ENVIRONMENT

Wireless sensor networks are collections of sensing devices which communicate without traditional cabling. These networks have widespread applications, from agriculture to military security [Lawson 3], and are gradually becoming prevalent in environments where running wires and cabling is impractical [Cook 26]. Pushing aside the mechanical and manufacturing challenges involved in creating these networks, the deployment of sensors is often poorly structured, if not completely random. This introduces new challenges in trying to enable communication between these nodes, as well as in trying to extract the sensed information from the networks. The sensors must also be able to handle congestion, which is often much trickier in wireless applications. With so many potential applications in hand, much energy has been expended in attempting to efficiently enable communication between these sensors [Sohrabi, Mahjoub].

In particular, researchers are very interested in algorithms to develop "backbones" within these networks reliably and quickly. A backbone, more technically, is a (nearly) dominating, independent set in the graph of nodes. This is a particularly interesting structure in wireless networks for two reasons: first, because the set is independent, the nodes themselves, due to their placement, are out of one another's range and can therefore operate on the same frequency without fear of interference; second, because the set is (nearly) dominating, nodes could, theoretically, relay information through an intermediary node even when sender and receiver are out of one another's range.

A naive solution to calculate the ideal backbone would be a "brute-force" analysis, analyzing all possible connections between nodes in the graph. Unfortunately, this solution is super-exponential in the number of nodes and very quickly becomes computationally unfeasible, even for a handful of nodes.

There is obviously great interest in more efficient solutions to this problem. The scope of this project is to analyze these networks as random geometric graphs (RGGs): graphs in which n vertices are randomly placed, and a connection is established between two nodes if they are within some distance r of one another. When applied to wireless sensor networks, we can imagine that the vertices on these graphs are sensors, and an edge between two nodes indicates that those two nodes are within communication range of one another.

Algorithmic efficiency in this problem, as applied to wireless sensor networks, is important for three primary reasons:

1. The scale of the problem is much larger than the naive observer would expect. Many researchers hope to deploy these sensors ubiquitously, some going so far as to say that, in the future, every plant on a farm could have its own sensor to ensure optimal growing conditions and nutrients [Lawson 2]. In such hypothetical problems, we are quickly dealing with hundreds of thousands, if not millions, of nodes. Algorithms must be developed that can reliably handle networks of this size.

2. These networks are often dynamic and/or mobile, so we may need to re-compute network backbones frequently. The algorithm must be able to produce a backbone quickly enough to still be a useful means of transporting information before the nodes have relocated.

3. These sensor nodes are often impoverished devices with very limited capabilities. In a self-organizing network, the sensors themselves would need to be able to quickly and easily handle these computations.

RESULTS

There are two primary contributions of this project: the Java-based graph creation and coloring algorithm, and the Flash tool used in visualizing and examining such graphs.

The graph generation and coloring algorithm could be a useful implementation in this field. I am currently unaware of any implementations in Java, so the fact that this project provides the functionality in a (presumably) new language could be of use. Also, I have been able to make certain optimizations on the construction of the graph which pertain to the underlying algorithm itself, as opposed to its implementation. These optimizations may be of interest to users in the field at large.

The Flash program may have less tangible benefits, but I feel that it could be a valuable tool nonetheless. Visual analysis of problems is an important, albeit often overlooked, aspect of algorithm engineering and analysis, and many important breakthroughs have been made only after a physical or visual manifestation of the problem could be studied. The most obvious applications I see are in visually analyzing the performance of graphs and in teaching. It may be possible to convey the nuances of the Smallest Last algorithm, for instance, by viewing an animated sequence. Because the interface is interactive and animated, it could serve as an effective learning tool in studying these and related topics.

PROGRAMMING ENVIRONMENT DESCRIPTION

My algorithm was developed on a custom-built machine with specifications as listed in Table 1. I used two languages, primarily, in the development of the algorithm and displays. The algorithm was developed completely in Java [Sun]. The only imported libraries used in the code were:

java.util.Random, java.util.Vector, java.util.Stack, and Point. No preexisting code or external Java libraries (other than the above) were used.

Table 1 - Specifications of computer

    CPU:              Intel Q6600, 2.4 GHz Intel Core 2 Quad (quad core)
    RAM:              8GB DDR2, 800MHz
    Operating System: Windows 7
    Hard Drive:       Western Digital Raptor, 10K RPM, 130GB

The ultimate goal of much of this work involved visualization, and I wanted to find a solution which would allow for a more interactive experience in trying to convey these complicated topics. With this in mind, I developed a custom application in Adobe Flash to display the graphs [Adobe]. I chose Flash, in part, because of its ease of use in a web browser. This means that I would be able to publish my application online and have interested readers interact with the graphical tools in the same way I did. Specifically, I used Flash 9 with ActionScript 3.

Flash is a client-side technology that runs within a user's web browser. This means that any information that needs to be passed to Flash needs to be downloaded from a server, generally as an HTML or XML file. In order for my graphical application in Flash to be able to retrieve the graphs produced in Java, I needed to bridge the two technologies. To do this, I used Java Server Pages, a technology which runs Java code on a web server, making the Java output available as an HTML file and thus accessible to the Flash client. I considered using XML markup to describe the structure of the generated graphs, but found that the files were far too large, so I developed my own format to efficiently transmit the information.

Due to the current pixel density on modern displays, it became infeasible to visually examine graphs any larger than n = 10,000. With this in mind, the hardware on which I ran the programs turned out to be a bit of overkill, the RAM specifically.

A typical modern computer with 2GB of RAM, for example, would be sufficient for most graphs that could be visualized on a computer monitor; a graph with 10,000 nodes only occupied about 50MB of memory for me and took around 1 second to generate. However, as will be discussed later, having more memory allowed me to study the performance on some more theoretical networks containing hundreds of thousands of vertices.

REFERENCES

Adobe Inc. Adobe Flash Platform. http://www.adobe.com. Accessed December 11, 2009.

Cook, Diane J. and Das, Sajal. Smart Environments: Technologies, Protocols and Applications. Wiley & Sons Inc., 2005.

Kosowski, Adrian and Manuszewski, Krzysztof. Classical Coloring of Graphs.

Lawson, Shaun. Wireless Sensor Networks. University of Lincoln, 2005.

Mahjoub, Dhia and Matula, David. Experimental Study of Independent and Dominating Sets in Wireless Sensor Networks Using Graph Coloring Algorithms. WASA 2009, LNCS 5283, pp. 32-42.

Matula, David and Beck, Leland. Smallest-last ordering and clustering and graph coloring algorithms. Journal of the ACM, July 1983.

Sohrabi, Katayoun. Protocols for Self-Organization of a Wireless Network. Allerton Conference on Communication, Computing and Control, September 1999.

Sun Microsystems. Developer Resources for Java Technology. http://java.sun.com. Accessed December 11, 2009.

WIRELESS SENSOR NETWORK BACKBONE: REDUCTION TO PRACTICE

This project consisted of a multitude of algorithms which needed to be implemented. To give an idea of the overall scope of the project, the program currently in use to develop these graphs consists of over 1,100 lines of code just to create and color the graph. This section will detail just a few of the higher-level, more important algorithms which were used.

After some early testing, it became obvious that the memory requirements of this program would be minimal. With that in mind, I made the initial decision to prioritize computational complexity over memory use. There are certain situations in which one data structure could be converted to another where needed, but I typically just duplicate the data structures to avoid the overhead of converting the data back and forth. This may, at times, double the amount of memory required to perform some operation, but it typically saves enough time to justify the cost.

GRAPH CREATION

The goal is to create and color a graph in O(|V| + |E|) time, meaning that the complexity should scale linearly in the number of vertices and edges in the graph. The heart of the graph creation algorithm is an iterative loop which creates n vertices with random x and y coordinates between 0 and 1, stored as single-precision floating point values. More specifically, we are generating two random numbers per vertex, so the work involved in creating these points is on the order of the number of nodes we are creating, or O(n).
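The generation loop described above can be sketched as follows. This is a minimal illustration, not the project's source; the class and method names are my own.

```java
import java.util.Random;

// Sketch of the vertex-generation loop: n vertices with random x and y
// coordinates in [0, 1), stored as single-precision floats.
public class VertexGenerator {
    // Returns two parallel arrays of predefined size: {xs, ys}.
    public static float[][] generate(int n, long seed) {
        Random rng = new Random(seed);
        float[] xs = new float[n];
        float[] ys = new float[n];
        for (int i = 0; i < n; i++) {
            // Two random numbers per vertex: O(n) total work.
            xs[i] = rng.nextFloat();
            ys[i] = rng.nextFloat();
        }
        return new float[][] { xs, ys };
    }

    public static void main(String[] args) {
        float[][] pts = generate(1000, 42L);
        System.out.println("generated " + pts[0].length + " vertices");
    }
}
```

Storing the coordinates in fixed-size parallel arrays matches the observation below that the array size never changes once n is chosen.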

These nodes can be stored in an array of predefined size, as the size doesn't change. The pseudo-random generation itself does not account for non-uniform distributions or non-square surfaces; instead, the algorithm accepts each generated node into the graph with a probability based on its location. When working with a square, all coordinates in [0, 1] are acceptable, so no further filtering is necessary (Figure 1). When working with a disc, however, only a subset of these points actually falls within the area of the disc. To calculate whether or not a coordinate is within the acceptable bounds of a disc whose origin is at (0.5, 0.5), we compute, after generating a point with coordinates (x, y), the Euclidean distance from the origin, sqrt((x - 0.5)^2 + (y - 0.5)^2). If this distance is greater than 0.5, then the point must be outside the perimeter of the disc and thus cannot be used (Figure 2). Because rejected points must be regenerated, the time will increase on the order of 2n.

(Figure 1 - A uniform square graph with n = 400 and r = 0.1)

(Figure 2 - A uniform disc with n = 100 and r = 0.1)

The two non-uniform distributions considered in this project were both applied only to the disc. The first is a "skewed" distribution which, given a vertex with coordinates (x, y), will accept and place the vertex with probability p(x, y) = 2 * sqrt((x - 0.5)^2 + (y - 0.5)^2), i.e., twice the Euclidean distance from the origin. This method will, essentially, place a vertex on the perimeter of the disc with 100% probability and will never place a vertex at the origin. This creates a graph which is very sparse in the center and very dense as you approach the border (Figure 3).
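The disc rejection test and the skewed acceptance rule described above can be sketched as follows (a hypothetical helper class of my own, not the project's code):

```java
import java.util.Random;

// Sketch of the disc-based placement rules: reject points outside the disc,
// and for the skewed distribution accept with probability 2 * distance.
public class DiscPlacement {
    // Euclidean distance from the disc's origin at (0.5, 0.5).
    static double distFromCenter(double x, double y) {
        return Math.sqrt((x - 0.5) * (x - 0.5) + (y - 0.5) * (y - 0.5));
    }

    // Uniform disc: usable only if within radius 0.5 of the center.
    public static boolean inDisc(double x, double y) {
        return distFromCenter(x, y) <= 0.5;
    }

    // Skewed disc: accept with probability p = 2 * distance, so the perimeter
    // (distance 0.5) is always kept and the exact center is always rejected.
    public static boolean acceptSkewed(double x, double y, Random rng) {
        return inDisc(x, y) && rng.nextDouble() < 2.0 * distFromCenter(x, y);
    }

    public static void main(String[] args) {
        System.out.println(inDisc(0.9, 0.9)); // false: outside the disc
        System.out.println(inDisc(0.5, 0.9)); // true: on the vertical axis, inside
    }
}
```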

The second distribution is a "two-tiered" distribution, which does not have a continuously varying probability of placement as the skewed disc does. Instead, it distinguishes only between 1. those nodes which are within distance r of the border and 2. those which are not. The function will place nodes within distance r of the border with 100% probability and nodes in the interior region with only a 50% probability. This will create a graph which resembles the uniform disc on the interior but has a much more crowded border region (Figure 5).

(Figure 3 - A skewed disc with n = 400 and r = 0.1)

(Figure 5 - A two-tiered disc with n = 400 and r = 0.1)

Once the nodes have been placed, they must be connected. A random geometric graph, as stated earlier, connects those nodes which are within distance r of one another, so to establish all of these connections we must calculate the distances between many of the nodes in the graph. A naive algorithm would merely calculate the distance between each node (n total nodes) and every other node (n - 1 other nodes) and connect the pair if that distance is less than r. The problem with this solution, of course, is that it requires O(n^2) calculations, which quickly becomes computationally unfeasible for large graphs (Figure 4).

(Figure 4 - Number of nodes (100 to 6400) vs. number of comparisons between nodes, for the naive method and the cell method)
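The naive all-pairs connection step just described can be sketched as a baseline (class name mine; a real implementation would record the edges rather than merely count them):

```java
// Sketch of the naive O(n^2) connection step: compare every pair of nodes
// and connect those within distance r.
public class NaiveConnect {
    public static int connect(float[] xs, float[] ys, float r) {
        int edges = 0;
        int n = xs.length;
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {      // each pair checked once
                float dx = xs[i] - xs[j], dy = ys[i] - ys[j];
                if (dx * dx + dy * dy <= r * r) {  // squared form avoids a sqrt
                    edges++;
                }
            }
        }
        return edges;
    }

    public static void main(String[] args) {
        float[] xs = { 0.0f, 0.05f, 0.9f };
        float[] ys = { 0.0f, 0.0f, 0.9f };
        System.out.println(connect(xs, ys, 0.1f)); // prints 1
    }
}
```

Comparing squared distances avoids a square root per pair, but the quadratic pair count is what dominates, which motivates the cell method below.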

A more efficient method to connect the nodes is to divide the graph into smaller pieces, or "cells," and compare nodes only within and between nearby cells. If this division is done intelligently, the number of comparisons required to create a graph is significantly lower. As shown in Figure 4, for a graph of size n = 6400, the naive method would require 40 million comparisons, while the cell-based implementation averages just over 110 thousand comparisons.

Clearly, the cell method is an improvement in the number of comparisons, but is it feasible in terms of the computational complexity of dividing these nodes, as well as the memory required to do so? If implemented efficiently, it can be. Dividing the nodes into cells requires going through all nodes in the graph (of which there are n) and doing a constant amount of work on each; thus the division into cells is O(n).

To minimize the overlap of cells, it is best to use cells of size r; more specifically, we partition the graph into cells of size r x r. This way, there is no way that a vertex can be connected to a node in a non-adjacent cell, as that would require spanning a distance of more than r. Cells any larger than r would begin to be costly, as the number of comparisons increases on the order of the number of nodes in the cell squared. Comparing nodes within a cell is still technically quadratic in the cell's population; however, it will be much faster than a naive implementation in practice.

(Figure 6 - A view of a graph split into cells. The red nodes are in the current cell; the blue nodes are in adjacent cells, so they must be compared.)

To utilize the cells to connect the nodes, we first perform a "bucket sort" of all n nodes into (1/r)^2 cells. As we go through the nodes, we classify each node's destination cell based on its x and y coordinates; once we know which cell it will end up in, we copy that node into the "bucket" corresponding to that cell. These buckets are variable length, but we must also be able to retrieve elements by their index (for reasons to be explained later), so we store these nodes in a Vector, a Java class built on the List interface. Essentially, a Vector is a growable array: the interface is similar to a stack, but it provides indexable, O(1)-lookup access to its elements. By setting its parameters to the estimated cell size, we can minimize the need to grow a Vector, so the performance, if not as good as keeping these elements in an array, will be almost as good. These cells will double the required amount of memory, as they duplicate the initial array of nodes, which we are keeping for use later.

Once the elements are sorted into their constituent cells, we can begin connecting the nodes. We must first check all pairs within a cell. Note that, assuming a uniform distribution, the number of nodes within a cell is, on average, O(n * r^2); assuming additionally that r halves every time n is increased by a factor of four (which keeps the average degree of the nodes constant and was used on all test graphs in this project), the comparisons within a cell could be said to be O(n). After the connections are made within a cell, we must check for connections with adjacent cells.

Connections are stored redundantly by both vertices at either end of an edge. Each node stores a Vector of connections, which can grow to any desired size; each connection stores the node ID of the node at the other end of the edge. This avoids the overhead and wasted space of storing a sparse adjacency matrix.
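The bucket sort into cells might look like the following sketch (class and method names are mine, not the project's):

```java
import java.util.Vector;

// Sketch of the cell partition: bucket-sort n nodes into a (1/r) x (1/r)
// grid of Vectors, keyed by each node's coordinates. O(n) total work.
public class CellGrid {
    @SuppressWarnings("unchecked")
    public static Vector<Integer>[][] partition(float[] xs, float[] ys, float r) {
        int cells = (int) Math.ceil(1.0 / r);
        Vector<Integer>[][] grid = new Vector[cells][cells];
        for (int cx = 0; cx < cells; cx++)
            for (int cy = 0; cy < cells; cy++)
                grid[cx][cy] = new Vector<>();
        for (int i = 0; i < xs.length; i++) {
            // Classify the destination cell from the coordinates: O(1) per node.
            int cx = Math.min((int) (xs[i] / r), cells - 1);
            int cy = Math.min((int) (ys[i] / r), cells - 1);
            grid[cx][cy].add(i); // store node IDs; later lookup is by index
        }
        return grid;
    }
}
```

Storing node IDs (rather than copies of the node objects) keeps the duplicated memory to one integer per node on top of the original array.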

The first optimization we can make is to check only those relationships coming from the source cell and going to some adjacent cell in a consistent direction. If we were to compare against all eight adjacent cells (eight assumes that the source cell is not on an edge), we would be redundantly checking nine cells for each source cell. Instead, if we consistently check for connections between cells in certain directions, we can minimize the number of adjacent cells we need to check: by starting at the lower right and working up and to the left, by column then by row, we only need to check the adjacent cells to the left, upper-left, and above the source cell. This greatly reduces the complexity of the connection process. Some further optimization can also take place here: a node on the far edge of an adjacent cell may not need to be compared to a node on the opposite edge of the source cell, and it may be possible to further divide a cell dynamically if the nodes were sorted by x-coordinate.

One decision which was glazed over previously may deserve more consideration here: whether or not to sort vertices within these cells carries certain pros and cons. There are analytical purposes that sorting these vertices could serve, such as the dynamic subdivision just mentioned. However, for the sake of this implementation, it seemed beneficial to leave the cells unsorted: all information was retrieved by index, so we never needed to search for any particular item within a cell.

Thus, we are able to create random geometric graphs of size n with a given radius in O(n^2) time, which, in practice, is typically much closer to O(n) time when the radius scales inversely with n in order to maintain a constant average vertex degree.
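The directional sweep can be sketched as follows. Note this is my own illustration, not the project's code: to guarantee each pair of adjacent cells is visited exactly once regardless of traversal order, the sketch compares each cell against itself and a fixed half of its eight neighbors (left, upper-left, above, and upper-right), which differs slightly from the three-direction scheme described above.

```java
import java.util.Vector;

// Sketch of the cell-based connection sweep: within-cell pairs plus a fixed
// "half" of the eight neighbor directions, so no pair is checked twice.
public class CellConnect {
    static final int[][] HALF_NEIGHBORS = { {-1, 0}, {-1, -1}, {0, -1}, {1, -1} };

    public static int countEdges(float[] xs, float[] ys, float r) {
        int cells = (int) Math.ceil(1.0 / r);
        Vector<Integer>[][] grid = partition(xs, ys, r, cells);
        int edges = 0;
        for (int cx = 0; cx < cells; cx++) {
            for (int cy = 0; cy < cells; cy++) {
                Vector<Integer> cur = grid[cx][cy];
                for (int a = 0; a < cur.size(); a++)      // pairs within the cell
                    for (int b = a + 1; b < cur.size(); b++)
                        if (close(xs, ys, cur.get(a), cur.get(b), r)) edges++;
                for (int[] d : HALF_NEIGHBORS) {           // pairs across cells
                    int nx = cx + d[0], ny = cy + d[1];
                    if (nx < 0 || ny < 0 || nx >= cells || ny >= cells) continue;
                    for (int a : cur)
                        for (int b : grid[nx][ny])
                            if (close(xs, ys, a, b, r)) edges++;
                }
            }
        }
        return edges;
    }

    static boolean close(float[] xs, float[] ys, int i, int j, float r) {
        float dx = xs[i] - xs[j], dy = ys[i] - ys[j];
        return dx * dx + dy * dy <= r * r;
    }

    @SuppressWarnings("unchecked")
    static Vector<Integer>[][] partition(float[] xs, float[] ys, float r, int cells) {
        Vector<Integer>[][] grid = new Vector[cells][cells];
        for (int i = 0; i < cells; i++)
            for (int j = 0; j < cells; j++)
                grid[i][j] = new Vector<>();
        for (int v = 0; v < xs.length; v++) {
            int cx = Math.min((int) (xs[v] / r), cells - 1);
            int cy = Math.min((int) (ys[v] / r), cells - 1);
            grid[cx][cy].add(v);
        }
        return grid;
    }
}
```

The half-neighborhood works because, for each of the eight directions, exactly one of the offset or its negation is in the set, so every adjacent cell pair is enumerated from exactly one side.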

COMPONENT ANALYSIS

One consideration which hadn't been anticipated coming into the project, but had to be addressed, was the issue of separate graphs being created. At times, some nodes would have a degree of 0 or would form an isolated subgraph. For the smaller graphs in the sample set (n ≤ 1,600) this was not a large problem, but on the largest graphs there were multiple occasions on which the graph consisted of multiple separate components. This was a particular problem on the square graph (edges/corners) as well as on the skewed disc (the sparse center), as can be seen in Figure 7. The literature and others in this field suggested that the project limit itself to only those graphs with one component, i.e., connected graphs.

In order to limit my study to those graphs, I needed to perform some additional analysis on each graph once it was created, to ensure that it was, indeed, one contiguous component. To do this, I use a breadth-first search and keep an array of nodes which have been visited on this search. This requires O(|V|) memory, as I need a bit for each vertex in the graph. Regarding the computational complexity, a breadth-first search will require, roughly, O(|V| + |E|) time.

(Figure 7 - Histogram of the number of components on the skewed disc for n = 6400 and r = 0.025)
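The connectivity check described above can be sketched as a breadth-first search with one visited flag per vertex (names are mine; the adjacency representation mirrors the text, with each node's connections stored in a Vector of node IDs):

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.Vector;

// Sketch of the component check: BFS from vertex 0 with O(|V|) visited flags,
// finishing in O(|V| + |E|) time.
public class ComponentCheck {
    public static boolean isConnected(Vector<Integer>[] adj) {
        int n = adj.length;
        if (n == 0) return true;
        boolean[] visited = new boolean[n];
        Queue<Integer> queue = new ArrayDeque<>();
        queue.add(0);
        visited[0] = true;
        int seen = 1;
        while (!queue.isEmpty()) {
            int v = queue.remove();
            for (int w : adj[v]) {
                if (!visited[w]) {
                    visited[w] = true;
                    seen++;
                    queue.add(w);
                }
            }
        }
        return seen == n; // one contiguous component iff BFS reached every node
    }
}
```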

VERTEX ORDERING

I implemented the Smallest Last (SL) algorithm to order the vertices in the graph based on their degree. The logic here is to sort the nodes, roughly, by their degree so that we can color the nodes with the highest degree first; their neighbors are likely to have lower degree than the large-degree nodes, and thus there is less competition for color availability. The SL algorithm is designed to work in O(|V| + |E|) time. I will give a brief description of the algorithm here as it pertains to the implementation and data structures; to truly understand the algorithm, I recommend viewing the animations available on the accompanying website.

The algorithm is initially interested in grouping vertices by their degree. To do this, we first perform a "bucket sort" on all of the vertices in the graph, an O(|V|) operation, assuming that each node stores its number of connections as an O(1)-accessible variable. In this case, as all of my vertices stored their connections in a Vector, I was able to retrieve the length field in O(1) time.

Once the nodes have been grouped into these buckets, we begin by taking a vertex of the lowest degree and performing a "cut" on the graph at this node. This simulates the deletion of the node, including all edges connecting to it. We must then update two features of every node to which this node was previously connected (O(|E|) in time over the whole algorithm): first, we remove the edge from the other node; second, we move that node from a bucket of degree D to a bucket of degree D - 1, to show that its degree is now one less than it was previously.

In implementation, one must be careful at this point not to have an algorithm which searches through the entire bucket to find a neighbor node and remove it, as there could be up to n elements in any bucket. If this were the case, the algorithm would approach O(n^2) (or O(n lg n), if the bucket were sorted) in its time complexity.

The easiest way to handle this is to implement a doubly-linked list for the buckets, so that a vertex, upon deletion, can update the previous and next nodes' pointers in 2 * O(1) time. Note that, because of the lack of traditional pointers in Java, this had to be simulated.

As we delete the nodes, we place them into an array or stack which represents the order in which the nodes were deleted, for the most part beginning with low-degree nodes and moving up to higher-degree nodes. By recursively performing this action, we will eventually consume the entire graph; thus, the smallest-last ordering can be completed in O(|V| + |E|) time. This ordering, when read backwards, will serve as the order in which we color the vertices.

(Figure 8 - Degree when removed during the smallest-last algorithm, plotted over the smallest-last vertex order, smallest to largest)

A few interesting observations can be made regarding the degree of nodes when removed from the graph (Figure 8). For one thing, the degree can only decrease by one between consecutively removed nodes. This is because a node will only be selected if it is the lowest degree node in the remaining graph.
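The bucket-based deletion loop can be sketched as follows. This is my own illustration, not the project's code: it substitutes HashSet buckets for the hand-rolled doubly-linked lists the text describes, which gives the same O(1) removal of a neighbor from its bucket.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Vector;

// Sketch of smallest-last ordering with degree buckets; removal from a
// bucket is O(1) via HashSet, standing in for the doubly-linked lists.
public class SmallestLast {
    public static int[] order(Vector<Integer>[] adj) {
        int n = adj.length;
        int[] degree = new int[n];
        int maxDeg = 0;
        for (int v = 0; v < n; v++) {
            degree[v] = adj[v].size();     // O(1): Vector tracks its own size
            maxDeg = Math.max(maxDeg, degree[v]);
        }
        // Bucket-sort vertices by current degree: O(|V|).
        ArrayList<HashSet<Integer>> buckets = new ArrayList<>();
        for (int d = 0; d <= maxDeg; d++) buckets.add(new HashSet<>());
        for (int v = 0; v < n; v++) buckets.get(degree[v]).add(v);

        boolean[] removed = new boolean[n];
        int[] order = new int[n];
        for (int i = 0; i < n; i++) {
            // Take a vertex from the lowest non-empty bucket and "cut" it.
            int d = 0;
            while (buckets.get(d).isEmpty()) d++;
            int v = buckets.get(d).iterator().next();
            buckets.get(d).remove(v);
            removed[v] = true;
            order[i] = v;
            // Simulated deletion: each surviving neighbor drops one degree,
            // moving from its bucket of degree D to the bucket of degree D - 1.
            for (int w : adj[v]) {
                if (!removed[w]) {
                    buckets.get(degree[w]).remove(w);
                    degree[w]--;
                    buckets.get(degree[w]).add(w);
                }
            }
        }
        return order; // deletion order; read backwards for the coloring order
    }
}
```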

The only way a node could be of lower degree than the previously removed node is if it was connected to the node that just got deleted: if such a node had degree D, it is now of degree D - 1. The graph can, however, increase in degree between consecutive nodes by any number.

Note also that the final nodes removed by the Smallest Last algorithm will necessarily be a clique (a subgraph in which every vertex is connected to every other vertex), commonly called the "terminal clique." This is because the algorithm will remove all nodes of lower degree before arriving at m nodes which all have the same degree, m - 1, as they are all connected to one another. This is typically one of the larger cliques in the graph. In Figure 8, the final "plunge" down to zero marks the terminal clique; other cliques can be seen in the sharp vertical descents elsewhere in the graph.

GRAPH COLORING

The terminal clique is a near ideal place to start coloring, as we know that we will need at least m colors to color the graph if it contains an m-sized clique. By assigning these colors initially, we can assume that the lower-degree nodes will be easier to color. We apply the greedy algorithm commonly referred to as the "Grundy" function which, given a node off of the Smallest-Last ordering, analyzes all edges connected to this node (O(|E|) over the course of the coloring) and finds the smallest color which is available, i.e., not used by any adjacent node.

The maximum number of colors used in the graph can be bounded by the largest degree-when-removed in the graph, plus one. This is because, if a node has degree D, its neighbors could use at most D colors, and color D + 1 would remain for the node itself; no higher color could possibly be needed.

To do this, my implementation creates a bitmap of the size of the current number of colors. It then visits every neighbor of the current node, O(|E|) work over the whole graph, and marks the spot in the bitmap pertaining to the color of the visited neighbor as "taken." After visiting all neighbors, the algorithm finds the lowest index which has not been marked as "taken" and applies that color to the current vertex. Because this must be done for every node in the graph, the total time of coloring is O(|V| + |E|). Applying this function recursively will produce a coloring of the graph which, typically, is a near-optimal solution.

BENCHMARK RESULT SUMMARY & DISPLAY

All data related to the performance of my implementation on the benchmark algorithms is detailed extensively on the accompanying website.
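The greedy "Grundy" step described above can be sketched as follows (names mine; the "bitmap" is a boolean array sized degree + 1, which is always enough by the bound just given):

```java
import java.util.Vector;

// Sketch of greedy ("Grundy") coloring: process vertices in reverse
// smallest-last order, mark neighbor colors in a bitmap, take the lowest free.
public class GreedyColoring {
    public static int[] color(Vector<Integer>[] adj, int[] slOrder) {
        int n = adj.length;
        int[] color = new int[n];
        java.util.Arrays.fill(color, -1);            // -1 means "not yet colored"
        for (int i = n - 1; i >= 0; i--) {           // reverse deletion order
            int v = slOrder[i];
            boolean[] taken = new boolean[adj[v].size() + 1]; // degree + 1 suffices
            for (int w : adj[v])
                if (color[w] >= 0 && color[w] < taken.length)
                    taken[color[w]] = true;          // neighbor's color is taken
            int c = 0;
            while (taken[c]) c++;                    // lowest unmarked index
            color[v] = c;
        }
        return color;
    }
}
```

Each vertex does work proportional to its degree, so the whole pass is O(|V| + |E|), matching the stated total.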