You are on page 1of 5

CISC-235 20120322

http://sites.cs.queensu.ca/courses/cisc235/Record/Week10/2012...

20120223 Graphs

Graphs can be viewed as generalizations of trees. A graph G consists of two sets: a set V of vertices, and a set E of edges, where each set consists of a pair of vertices in V. Terminology: We use n to denote |V| We use m to denote |E|. Note that if we disallow duplicate edges, m is in O(n^2) Two vertices x and y are adjacent if (x,y) is in E If x and y are ajacent, we call them neighbours. The degree of a vertex is the number of times the vertex appears in E. If loops are prohibited (as is usual), the degree of a vertex is the number of edges that touch the vertex. We use d(v) to denote the degree of v. A path is a sequence of vertices and edges such that each edge joins the previous vertex to the next vertex. A cycle is a path where the rst and last vertices are the same. A simple path is a path in which all the vertices and edges are distinct. A simple cycle is a cycle in which all the vertices and edges are distinct (with the exception of the rst vertex, which is also the last vertex). Unless otherwise noted, we will always limit our discussions to graphs with no multiple edges or loops (edges that join vertices to themselves) simple paths simple cycles A directed edge is an edge with one vertex identied as the head and the other as the tail. The edge is directed from the tail to the head. This is usually signied in drawings of the graph by placing an arrow-head on the edge, pointing into the head vertex. Edges are undirected unless otherwise specied. Edges may be weighted. The weight of an edge e is usually denoted by w(e). If the edge is identied by its end vertices, such as e = (x,y), we will sometimes indicate its weight by w(x,y)

1 of 5

12-04-09 12:17 PM

CISC-235 20120322

http://sites.cs.queensu.ca/courses/cisc235/Record/Week10/2012...

In general, graphs are used to represent unilateral (ie. directed) or bilateral (ie. undirected) pairwise relationships between discrete entities. In a communication network, the edges could represent the existance of a wired link between two objects in the network. In a social network, the edges could represent "friend" links. A graph G is connected if for each pair of vertices x and y, there is a path in G that starts at x and ends at y. Connectivity is a fundamentally important property of graphs and we will look at two methods of determining if a graph is connected. Before addressing these algorithms we must rst discuss appropriate data structures for representing graphs. There are two well-known approaches: adjacency matrices and adjacency lists. Adjacency Matrix Representation An adjacency matrix for a graph G is an n*n matrix A where A[i,j] = 1 if and only if vertex i is adjacent to vertex j A[i,j] = 0 if and only if vertex i is not adjacent to vertex j

Note that the size of A is n^2, regardless of the number of edges in E If the graph is undirected, A[i,j] == A[j,i]. However if the edges are directed, we set A[i,j] = 1 if and only if there is a directed edge from vertex i to vertex j, and A[i,j] = 0 otherwise. In the adjacency matrix for a directed graph, it is not necessarily true that A[i,j] == A[j,i] If the graph is weighted, A[i,j] = w(vertex i, vertex j). If it is possible for edges to be present and to have weight 0, any edge (i,j) that is not in E must be indicated by setting A[i,j] to some value that is not a possible edge-weight. Graph operations using an adjacency matrix: Vertex addition: We need to add a new row and column to A. This requires about 2*n operations at least Vertex deletion: Removing the row and column will be expensive - it is probably easier to delete all the edges that touch the vertex and ag the vertex as being deleted. This is in O(n) Edge addition: We change two elements of A - this is in O(1) Edge deletion: Same as edge addition Test for adjacency: to see if vertex i and vertex j are adjacent, we examine A[i,j] - this is in O(1) List the neighbours of vertex i: We need to examine each element of the i-th row of A - this is in O(n)

Adjacency Lists Representation

2 of 5

12-04-09 12:17 PM

CISC-235 20120322

http://sites.cs.queensu.ca/courses/cisc235/Record/Week10/2012...

For each vertex, create a list of the neighbours of the vertex. The list of neighbours is in no particular order. If the graph is undirected, each edge will create an entry in two of the adjacency lists. If the edges are directed, an edge from vertex i to vertex j would be represented by including j in the adjacency list for vertex i. If the edges are weighted, the weights of the edges can be stored as auxilliary data in the adjacency lists Graph operations using adjacency lists: Vertex addition: Create a new (empty) adjacency list. O(1) Vertex deletion: Removing all references to the deleted vertex may require traversing all the adjacency lists. This is in O(m) Edge addition: Add elements to two adjacency lists: O(1) Edge deletion: We need to remove the relevant elements from two of the lists. This is in O(m) Test for adjacency: Traverse the relevant lists. This is in O(m) List the neighbours of vertex i: Traverse the adjacency list for vertex i. If the maximum degree in the graph is bounded by a constant (which is very often the case in practice), this is in O(1). If the degree of each vertex is not bounded, this operation is in O(m). In all cases it is not worse than the same operation when using an adjacency matrix.

Note that neither of the representations is a clear winner. Deciding which to use requires a careful analysis of the operations required for whatever algorithm we are implementing.

Breadth-First Search Breadth-rst search is an algorithm that explores the graph, building paths from a start vertex to all other vertices that can be reached. If this turns out to be the entire vertex set, we know the graph is connected. Furthermore, if the edges are unweighted then BFS nds the shortest paths from the start vertex to all other vertices.. The basic idea of the BFS algorithm to build a tree using edges of the graph, rooted at the start vertex. We build the tree by adding as many vertices at each level as possible, before moving on to the next level. def BFS(v): # v is the start vertex. The choice of v will usually be determined by some aspect of the task we are trying to complete. create an empty queue Q mark v "visited" append v to Q while Q is not empty let x be the rst vertex in Q

3 of 5

12-04-09 12:17 PM

CISC-235 20120322

http://sites.cs.queensu.ca/courses/cisc235/Record/Week10/2012...

remove x from Q for each neighbour y of x such that y is not marked "visited": let p(y) = x # p(y) is the "parent" of y in the tree we are building mark y "visited" append y to Q

If this terminates with all vertices marked "visited", then the graph is connected. Furthermore we have constructed a tree using only edges of E that includes every vertex in V. Such a tree is called a spanning tree of G. Experiment on a few graphs to see how BFS builds a spanning tree of a connected graph, regardless of which vertex we start at. Implementation and Complexity: The while loop executes up to n times (if G is connected, the while loop executes exactly n times). Within the while loop, the operations on the queue all take O(1) time. It is to the execution time of the for loop that the choice of data structure makes a difference. If G is represented as an adjacency matrix then creating the list of neighbours y of x takes O(n) time. This gives us O(n^2) for the algorithm. On the other hand, suppose G is represented as a set of adjacency lists. Now on each iteration of the while loop, the for loop executes d(x) times where x is the vertex just removed at the head of Q. Since each vertex in V plays the role of x exactly once, the number of executions of the for loop equals the sum of the degrees of the vertices. Since it is easy to show that the sum of the degrees = 2*m, the for loop executes O(m) times. Thus the algorithm runs in O(n + m) time. (Make sure you see why it is O(n + m) rather than O(n*m) ) Does this make a difference? If m is small, it does. We need to formalize the idea of m being small. If m is in O(n), we say that G is sparse. If m is in THETA(n^2), we say that G is dense. Now we can state that if G is sparse and represented as a set of adjacency lists, BFS will run in O(n) time. If G is dense, BFS will run in O(n^2) time regardless of which representation we use.

Depth-First Search The most popular alternative to BFS is Depth-First Search. DFS also explores the graph and builds a spanning tree (or at least tries to - if the graph is not connected, no spanning tree exists), but where BFS always creates a spanning tree with the minimum possible number of levels, DFS is much less predictable. As the name suggests, DFS always tries to move forward to a new level before backing up and looking for another vertex at the current level.

4 of 5

12-04-09 12:17 PM

CISC-235 20120322

http://sites.cs.queensu.ca/courses/cisc235/Record/Week10/2012...

DFS is a recursive algorithm. As before, we assume the start vertex is determined by some aspect of the algorithm that is calling DFS. def DFS(v): mark v "visited" for each neighbour y of v such that y is not marked "visited": let p(y) = v DFS(y) Implementation and Complexity: Once again, the complexity of the algorithm depends on the implementation we choose to represent the vertices and edges of G. The total number of recursive calls cannot exceed n, since each call marks a vertex. Within each recursive call the most time-consuming operation is nding the neighbours of the current vertex. As with BFS, this takes O(n) time if we use an adjacency matrix, and O(d(v)) time if we use adjacency lists. Once again we nd that this gives O(n^2) over all with the adjacency matrix, and O(n + m) over all with the adjacency lists. The same conclusion regarding sparse versus dense graphs can be made.

5 of 5

12-04-09 12:17 PM

You might also like