
CS204 - Data Structures and Algorithms

Fall 2022

Lecture 10: Depth-First Search (DFS) and Breadth-First Search (BFS)

Malek Smaoui
Introduction
• Beware of the misnomer: BFS and DFS are NOT
search algorithms (no key to search for)
• DFS and BFS are graph traversal algorithms which
systematically explore (visit) all vertices of a
graph
– Usually classified as brute force algorithms
• Used for investigation of fundamental properties
of graphs: connectivity, cycles, …
• They are elementary / basic operations used as
tools (backbone) for other algorithms
BFS: definition
• A vertex-“visiting” procedure which visits all neighbors (adjacent vertices) of a vertex before visiting the neighbors of the vertex that was visited next.
– Starts with an arbitrarily selected source vertex.
– The procedure is repeated with other unvisited source vertices until the whole graph is visited

Run BFS on these sample graphs

[Figure: two sample graphs, each on vertices A through K]

BFS: definition (revised)
• A vertex-marking procedure which marks all neighbors (adjacent vertices) of a vertex before marking the neighbors of the vertex that was marked next.
– Starts with an arbitrarily selected source vertex.
– The procedure is repeated with other unmarked source vertices until the whole graph is marked

Now the pseudocode
1. Input?
– Graph data structure / representation
2. How to mark?
– Vertex data structure / representation
3. Additional tools / data structures?
– Which vertex to visit next?

Graph representation: adjacency lists

[Figure: the two sample graphs, each on vertices A through K]

Adjacency lists of the first sample graph (undirected):
𝐴→𝐶→𝐹→𝐺
𝐵→𝐸→𝐷
𝐶→𝐴→𝐼
𝐷→𝐵→𝐸→𝐻→𝐾
𝐸→𝐵→𝐷→𝐾
𝐹→𝐴→𝐼→𝐽
𝐺→𝐴→𝐽
𝐻→𝐷
𝐼→𝐶→𝐹→𝐽
𝐽→𝐹→𝐺→𝐼
𝐾→𝐷→𝐸

Adjacency lists of the second sample graph (directed):
𝐴→𝐹→𝐺
𝐵→𝐸→𝐷
𝐶→𝐴
𝐷→𝐻
𝐸→𝐷
𝐹
𝐺
𝐻
𝐼→𝐶→𝐹
𝐽→𝐺→𝐹→𝐼
𝐾→𝐷→𝐸
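As a rough illustration, the adjacency lists above could be stored as Python dictionaries mapping each vertex to the list of its neighbors. The names `UNDIRECTED`, `DIRECTED` and `is_symmetric` are chosen here for illustration and do not come from the slides; the helper only checks that the first set of lists describes an undirected graph (every edge appears in both directions).

```python
# Adjacency lists copied from the slide; neighbor order follows the lists.
UNDIRECTED = {
    "A": ["C", "F", "G"],
    "B": ["E", "D"],
    "C": ["A", "I"],
    "D": ["B", "E", "H", "K"],
    "E": ["B", "D", "K"],
    "F": ["A", "I", "J"],
    "G": ["A", "J"],
    "H": ["D"],
    "I": ["C", "F", "J"],
    "J": ["F", "G", "I"],
    "K": ["D", "E"],
}

DIRECTED = {
    "A": ["F", "G"],
    "B": ["E", "D"],
    "C": ["A"],
    "D": ["H"],
    "E": ["D"],
    "F": [],
    "G": [],
    "H": [],
    "I": ["C", "F"],
    "J": ["G", "F", "I"],
    "K": ["D", "E"],
}


def is_symmetric(adj):
    """Return True if every edge u->v has a matching edge v->u."""
    return all(u in adj[v] for u in adj for v in adj[u])
```

In an undirected graph each edge is stored twice (once per endpoint), which is why the first set of lists is longer than the second.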

Now the pseudocode
1. Input?
– Graph data structure / representation
Adjacency lists
2. How to mark?
– Vertex data structure / representation
3. Additional tools / data structures?
– Which vertex to visit next?

BFS: marking vertices
• Marking requires an extra attribute per vertex
– It could be a number that records the order in which the vertices are marked
• In the following suggested pseudocode, the attribute is a color (white/gray/black)
– Undiscovered nodes are white (initially the whole graph is white)
– Discovered nodes that may still have undiscovered (white) neighbors are gray
– Discovered nodes whose neighborhood is entirely discovered (black or gray neighbors) are black
Now the pseudocode
1. Input?
– Graph data structure / representation
2. How to mark?
– Vertex data structure / representation
Color attribute per vertex
3. Additional tools / data structures?
– Which vertex to visit next?
Use a queue where vertices are queued when they
are marked and dequeued when starting to discover
their neighborhood

BFS from a single source routine
bfs(G, s)
/*** input: - Graph G with G.Adj being its “array” of adjacency lists.
- s, a vertex of G
Each vertex of G has an additional attribute:
- color: for marking it during exploration
output: does not return anything but the visited vertices of the graph will have a different color attribute value
Additional data structures used: Q, a queue of vertices ***/
  s.color = GRAY
  Q = ∅ // queue initially empty
  ENQUEUE(Q, s)
  while Q ≠ ∅
    u = DEQUEUE(Q)
    for each v in G.Adj[u]
      if v.color == WHITE
        v.color = GRAY
        ENQUEUE(Q, v)
    u.color = BLACK
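The single-source routine above can be sketched in Python roughly as follows. This is only one possible translation: it assumes the graph is a dictionary of adjacency lists and keeps colors in a separate dictionary rather than as vertex attributes. Unlike the pseudocode, it also returns the order in which vertices are dequeued, purely so the result can be inspected. The small graph at the bottom is a hypothetical example.

```python
from collections import deque

WHITE, GRAY, BLACK = "white", "gray", "black"


def bfs(adj, color, s):
    """Mark every vertex reachable from s; return the dequeue order."""
    order = []
    color[s] = GRAY
    queue = deque([s])            # queue initially holds only the source
    while queue:
        u = queue.popleft()       # DEQUEUE
        for v in adj[u]:
            if color[v] == WHITE:
                color[v] = GRAY   # discovered, waiting in the queue
                queue.append(v)   # ENQUEUE
        color[u] = BLACK          # whole neighborhood of u is discovered
        order.append(u)
    return order


# Hypothetical example graph: H is not reachable from A.
adj = {"A": ["C", "F"], "C": ["A", "I"], "F": ["A", "I"], "I": ["C", "F"], "H": []}
color = {v: WHITE for v in adj}
order = bfs(adj, color, "A")
```

`collections.deque` is used because removing from the left end of a plain Python list takes time proportional to its length.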
BFS
BFS(G)
/*** input: Graph G where G.V is the set of all vertices of G
output: Graph G with all vertices marked/colored and additional attributes like parent and distance from source updated ***/
  for each vertex s in G.V
    s.color = WHITE
  for each vertex s in G.V
    if s.color == WHITE
      bfs(G, s)
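A possible Python rendering of the driver is sketched below, run on the directed sample graph. The function names and the returned list of sources are illustrative additions, not part of the pseudocode; the list simply records which vertices were still white when the outer loop reached them, so each one started a new `bfs` call.

```python
from collections import deque

# Directed sample graph from the adjacency-list slide.
DIRECTED = {
    "A": ["F", "G"], "B": ["E", "D"], "C": ["A"], "D": ["H"],
    "E": ["D"], "F": [], "G": [], "H": [],
    "I": ["C", "F"], "J": ["G", "F", "I"], "K": ["D", "E"],
}


def bfs(adj, color, s):
    """Single-source BFS that only updates the color of each vertex."""
    color[s] = "gray"
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if color[v] == "white":
                color[v] = "gray"
                queue.append(v)
        color[u] = "black"


def bfs_all(adj):
    """Color every vertex; return the vertices that served as sources."""
    color = {s: "white" for s in adj}
    sources = []
    for s in adj:                 # dict order plays the role of G.V order
        if color[s] == "white":
            sources.append(s)
            bfs(adj, color, s)
    return color, sources


color, sources = bfs_all(DIRECTED)
```

In a directed graph the number of restarts depends on the order in which the vertices are tried, so the list of sources is not a property of the graph itself.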

BFS example
Perform vertex coloring according to BFS

𝐴→𝐶→𝐹→𝐺 𝐴→𝐹→𝐺
𝐵→𝐸→𝐷 𝐵→𝐸→𝐷
𝐶→𝐴→𝐼 𝐶→𝐴
𝐷→𝐵→𝐸→𝐻→𝐾 𝐷→𝐻
𝐸→𝐵→𝐷→𝐾 𝐸→𝐷
𝐹→𝐴→𝐼→𝐽 𝐹
𝐺→𝐴→𝐽 𝐺
𝐻→𝐷 𝐻
𝐼→𝐶→𝐹→𝐽 𝐼→𝐶→𝐹
𝐽→𝐹→𝐺→𝐼 𝐽→𝐺→𝐹→𝐼
𝐾→𝐷→𝐸 𝐾→𝐷→𝐸
Beyond BFS
• To use BFS for more than systematic exploration or traversal, we need additional attributes
• We will use BFS to find the shortest path (fewest edges) between two vertices:
– Use an additional attribute per node 𝜋 to record the parent vertex
• 𝑢 is the parent of 𝑣 if 𝑣 is marked while exploring the neighbors of 𝑢
– Use an additional attribute per node 𝑑 to record the distance (number of edges) between the vertex and the source.
BFS for shortest path
bfs(G, s)
/*** input: - Graph G with G.Adj being its “array” of adjacency lists.
- s, a vertex of G
Each vertex of G has additional attributes:
- color: for marking it during exploration
- 𝜋 : to save its parent vertex
- d: to save its distance from the source vertex
output: does not return anything but the visited vertices of the graph will have different attribute values
Additional data structures used: Q, a queue of vertices ***/
  s.color = GRAY
  s.d = 0
  s.𝜋 = NIL
  Q = ∅ // queue initially empty
  ENQUEUE(Q, s)
  while Q ≠ ∅
    u = DEQUEUE(Q)
    for each v in G.Adj[u]
      if v.color == WHITE
        v.color = GRAY
        v.d = u.d + 1
        v.𝜋 = u
        ENQUEUE(Q, v)
    u.color = BLACK
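The extended routine might look like this in Python, run here on the undirected sample graph with F as the source. Storing `d` and `parent` in dictionaries and using `math.inf` for unreached vertices are choices made for this sketch, not something the pseudocode prescribes.

```python
import math
from collections import deque

# Undirected sample graph from the adjacency-list slide.
UNDIRECTED = {
    "A": ["C", "F", "G"], "B": ["E", "D"], "C": ["A", "I"],
    "D": ["B", "E", "H", "K"], "E": ["B", "D", "K"], "F": ["A", "I", "J"],
    "G": ["A", "J"], "H": ["D"], "I": ["C", "F", "J"],
    "J": ["F", "G", "I"], "K": ["D", "E"],
}


def bfs_distances(adj, s):
    """Return (d, parent): distance from s and BFS parent of each vertex."""
    color = {v: "white" for v in adj}
    d = {v: math.inf for v in adj}       # unreached vertices stay at infinity
    parent = {v: None for v in adj}      # None stands in for NIL
    color[s] = "gray"
    d[s] = 0
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if color[v] == "white":
                color[v] = "gray"
                d[v] = d[u] + 1          # one edge farther than its parent
                parent[v] = u
                queue.append(v)
        color[u] = "black"
    return d, parent


d, parent = bfs_distances(UNDIRECTED, "F")
```

Vertices in the other component (B, D, E, H, K) keep an infinite distance and no parent, which is how an unreachable vertex can be recognized afterwards.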
BFS example
Update the graphs with the values of 𝜋 and d after BFS

𝐴→𝐶→𝐹→𝐺 𝐴→𝐹→𝐺
𝐵→𝐸→𝐷 𝐵→𝐸→𝐷
𝐶→𝐴→𝐼 𝐶→𝐴
𝐷→𝐵→𝐸→𝐻→𝐾 𝐷→𝐻
𝐸→𝐵→𝐷→𝐾 𝐸→𝐷
𝐹→𝐴→𝐼→𝐽 𝐹
𝐺→𝐴→𝐽 𝐺
𝐻→𝐷 𝐻
𝐼→𝐶→𝐹→𝐽 𝐼→𝐶→𝐹
𝐽→𝐹→𝐺→𝐼 𝐽→𝐺→𝐹→𝐼
𝐾→𝐷→𝐸 𝐾→𝐷→𝐸
Shortest path
• bfs can be used to find the shortest path between two vertices s and v
– It can optionally be modified slightly so that it starts marking at s and stops as soon as it reaches v
• The distance between s and v given by the attribute d is minimal
• The path between s and v recorded by the 𝜋 attributes is a shortest path
• Although seemingly straightforward, a mathematical proof of correctness is not elementary.

SHORTEST-PATH(G, s, v)
  for each vertex u in G.V
    u.color = WHITE
    u.d = ∞
    u.𝜋 = NIL
  bfs(G, s)
  PRINT-PATH(G, s, v)
--------------------------------------------
PRINT-PATH(G, s, v)
  if v == s
    print s
  else if v.𝜋 == NIL
    print “no path from” s “to” v “exists”
  else
    PRINT-PATH(G, s, v.𝜋)
    print v
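A compact Python sketch of the same idea follows. It makes two simplifications that are assumptions of this example: membership in the `parent` dictionary replaces the color attribute, and the path is returned as a list (or `None` when no path exists) instead of being printed. The two calls at the bottom correspond to the shortest path (F, G) and shortest path (A, K) exercises on the undirected sample graph.

```python
from collections import deque

UNDIRECTED = {
    "A": ["C", "F", "G"], "B": ["E", "D"], "C": ["A", "I"],
    "D": ["B", "E", "H", "K"], "E": ["B", "D", "K"], "F": ["A", "I", "J"],
    "G": ["A", "J"], "H": ["D"], "I": ["C", "F", "J"],
    "J": ["F", "G", "I"], "K": ["D", "E"],
}


def bfs_parents(adj, s):
    """BFS from s; a vertex is 'discovered' once it has an entry in parent."""
    parent = {s: None}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return parent


def path_to(parent, s, v):
    """Recursive counterpart of PRINT-PATH that builds a list instead."""
    if v == s:
        return [s]
    if parent.get(v) is None:      # v was never reached from s
        return None
    return path_to(parent, s, parent[v]) + [v]


def shortest_path(adj, s, v):
    return path_to(bfs_parents(adj, s), s, v)


path_fg = shortest_path(UNDIRECTED, "F", "G")
path_ak = shortest_path(UNDIRECTED, "A", "K")
```

When several shortest paths exist, which one is found depends on the order of the adjacency lists.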
BFS example
Run shortest path (F, G) and shortest path (A, K)

𝐴→𝐶→𝐹→𝐺
𝐵→𝐸→𝐷
𝐶→𝐴→𝐼
𝐷→𝐵→𝐸→𝐻→𝐾
𝐸→𝐵→𝐷→𝐾
𝐹→𝐴→𝐼→𝐽
𝐺→𝐴→𝐽
𝐻→𝐷
𝐼→𝐶→𝐹→𝐽
𝐽→𝐹→𝐺→𝐼
𝐾→𝐷→𝐸
Connected components in undirected graphs
• bfs(G,s) could be used as the backbone procedure to find connected components in undirected graphs
• Exercise: Modify the bfs(G,s) and/or BFS(G) pseudocode to get CONNECTED-COMPONENTS(G), which prints the vertices of each connected component of a graph on a separate line.
Connected components
Connected-component(G, s)
/* output: printout of the vertices of the connected component containing s on a single line */
  s.color = GRAY
  Q = ∅ // queue initially empty
  ENQUEUE(Q, s)
  while Q ≠ ∅
    u = DEQUEUE(Q)
    for each v in G.Adj[u]
      if v.color == WHITE
        v.color = GRAY
        ENQUEUE(Q, v)
    u.color = BLACK
    print(u)

CONNECTED-COMPONENTS(G)
/* output: the vertices of each connected component printed on a separate line */
  for each vertex s in G.V
    s.color = WHITE
  for each vertex s in G.V
    if s.color == WHITE
      Connected-component(G, s)
      print(new_line)
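The two procedures above could be translated to Python along these lines. As an adaptation made for this sketch, each component is collected into a list (in dequeue order) rather than printed, so the result can be checked; printing one list per line would reproduce the behavior of the pseudocode.

```python
from collections import deque

UNDIRECTED = {
    "A": ["C", "F", "G"], "B": ["E", "D"], "C": ["A", "I"],
    "D": ["B", "E", "H", "K"], "E": ["B", "D", "K"], "F": ["A", "I", "J"],
    "G": ["A", "J"], "H": ["D"], "I": ["C", "F", "J"],
    "J": ["F", "G", "I"], "K": ["D", "E"],
}


def connected_component(adj, color, s):
    """Return the vertices of the component containing s, in dequeue order."""
    component = []
    color[s] = "gray"
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if color[v] == "white":
                color[v] = "gray"
                queue.append(v)
        color[u] = "black"
        component.append(u)        # stands in for print(u)
    return component


def connected_components(adj):
    """Return one list of vertices per connected component."""
    color = {s: "white" for s in adj}
    components = []
    for s in adj:
        if color[s] == "white":    # s belongs to a component not yet seen
            components.append(connected_component(adj, color, s))
    return components


components = connected_components(UNDIRECTED)
for component in components:
    print(" ".join(component))
```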
BFS example
Run connected components

𝐴→𝐶→𝐹→𝐺
𝐵→𝐸→𝐷
𝐶→𝐴→𝐼
𝐷→𝐵→𝐸→𝐻→𝐾
𝐸→𝐵→𝐷→𝐾
𝐹→𝐴→𝐼→𝐽
𝐺→𝐴→𝐽
𝐻→𝐷
𝐼→𝐶→𝐹→𝐽
𝐽→𝐹→𝐺→𝐼
𝐾→𝐷→𝐸
BFS efficiency
• BFS enqueues and dequeues each vertex at most once, and examines each element of each adjacency list exactly once to decide whether the corresponding vertex needs to be enqueued
• The running time is therefore on the order of the number of vertices plus the total size of all adjacency lists: Θ(V + E)
• If the graph were represented as an adjacency matrix, what would the running time be?
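To make the Θ(V + E) count concrete for adjacency lists, the sketch below instruments a full BFS with two counters (both counters and the function name are additions for illustration). On the undirected sample graph, every vertex is dequeued once and every adjacency-list entry is examined once; since each undirected edge appears in two lists, the second count equals twice the number of edges.

```python
from collections import deque

UNDIRECTED = {
    "A": ["C", "F", "G"], "B": ["E", "D"], "C": ["A", "I"],
    "D": ["B", "E", "H", "K"], "E": ["B", "D", "K"], "F": ["A", "I", "J"],
    "G": ["A", "J"], "H": ["D"], "I": ["C", "F", "J"],
    "J": ["F", "G", "I"], "K": ["D", "E"],
}


def count_bfs_work(adj):
    """Run BFS from every white vertex; count dequeues and list entries examined."""
    color = {s: "white" for s in adj}
    dequeues = 0
    examined = 0
    for s in adj:
        if color[s] != "white":
            continue
        color[s] = "gray"
        queue = deque([s])
        while queue:
            u = queue.popleft()
            dequeues += 1
            for v in adj[u]:
                examined += 1          # one adjacency-list entry inspected
                if color[v] == "white":
                    color[v] = "gray"
                    queue.append(v)
            color[u] = "black"
    return dequeues, examined


dequeues, examined = count_bfs_work(UNDIRECTED)
num_edges = sum(len(neighbors) for neighbors in UNDIRECTED.values()) // 2
```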
DFS: definition and graph preparation
• Same objective as BFS, but proceeds by exploring edges out of the most recently discovered vertex that still has unexplored edges leaving it.
• DFS uses an alternative set of attributes per vertex:
– color to keep track of which vertices have been discovered
– 𝜋 to record the predecessor vertex
– d to record the “time” at which the vertex was discovered
– f to record the “time” at which all the neighbors of the vertex have been explored (the vertex becomes a dead end).
• Note that, just as for BFS, only the color attribute is strictly required for the sole purpose of “discovering” or “visiting” vertices. The other attributes serve more useful applications built on either traversal algorithm.
DFS implementation
dfs(G, u)
/*** input and output are the same as for bfs
- d : notes when the vertex has been discovered
- f : notes when all its neighborhood is discovered
- time is a “global” variable defined in DFS(G) ***/
  time = time + 1
  u.d = time
  u.color = GRAY
  for each v in G.Adj[u]
    if v.color == WHITE
      v.𝜋 = u
      dfs(G, v)
  u.color = BLACK
  time = time + 1
  u.f = time

DFS(G)
  for each vertex u in G.V
    u.color = WHITE
    u.𝜋 = NIL
  time = 0
  for each vertex u in G.V
    if u.color == WHITE
      dfs(G, u)
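One way to express the recursive procedure in Python is sketched below, run on the directed sample graph. The “global” time counter is modeled as a variable of the enclosing function shared through `nonlocal`; that choice, and the dictionary-based attributes, are assumptions of this sketch.

```python
DIRECTED = {
    "A": ["F", "G"], "B": ["E", "D"], "C": ["A"], "D": ["H"],
    "E": ["D"], "F": [], "G": [], "H": [],
    "I": ["C", "F"], "J": ["G", "F", "I"], "K": ["D", "E"],
}


def dfs_all(adj):
    """Return (d, f, parent): discovery time, finishing time, predecessor."""
    color = {u: "white" for u in adj}
    parent = {u: None for u in adj}
    d, f = {}, {}
    time = 0

    def visit(u):
        nonlocal time
        time += 1
        d[u] = time                 # u has just been discovered
        color[u] = "gray"
        for v in adj[u]:
            if color[v] == "white":
                parent[v] = u
                visit(v)
        color[u] = "black"
        time += 1
        f[u] = time                 # all neighbors of u have been explored

    for u in adj:
        if color[u] == "white":
            visit(u)
    return d, f, parent


d, f, parent = dfs_all(DIRECTED)
```

Python limits recursion depth (1000 frames by default), so on long paths a recursive version like this one can fail; that limit is one practical motivation for a stack-based variant.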
DFS example
Update the graphs with the values of 𝜋 , d and f after DFS
Alternative implementation
• dfs(G,u) is recursive: a vertex is marked GRAY before the recursive calls on its neighbors and BLACK after they return.
• For two vertices u and v: if v is discovered (marked GRAY) while u is still GRAY, then v is marked BLACK before u.
• This is LIFO behavior.
• dfs(G,u) could therefore be implemented without recursion, using a stack instead.
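A stack-based version could be sketched as follows. This is one of several possible designs and is not taken from the slides: each stack entry pairs a gray vertex with an iterator over its adjacency list, so that exploration resumes where it stopped, which keeps the discovery and finishing times identical to those of the recursive procedure. The graph is a five-vertex piece of the directed sample graph.

```python
def dfs_iterative(adj):
    """DFS with an explicit stack; returns discovery and finishing times."""
    color = {u: "white" for u in adj}
    d, f = {}, {}
    time = 0
    for s in adj:
        if color[s] != "white":
            continue
        time += 1
        d[s] = time
        color[s] = "gray"
        stack = [(s, iter(adj[s]))]
        while stack:
            u, neighbors = stack[-1]        # most recently discovered vertex
            for v in neighbors:
                if color[v] == "white":
                    time += 1
                    d[v] = time
                    color[v] = "gray"
                    stack.append((v, iter(adj[v])))
                    break                   # go deeper before finishing u
            else:
                # Adjacency list of u is exhausted: u is finished.
                stack.pop()
                color[u] = "black"
                time += 1
                f[u] = time
    return d, f


# Part of the directed sample graph (vertices B, D, E, H, K only).
adj = {"B": ["E", "D"], "D": ["H"], "E": ["D"], "H": [], "K": ["D", "E"]}
d, f = dfs_iterative(adj)
```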
DFS efficiency
• Using the same analysis as for BFS, we end up with the same running time as BFS: Θ(V + E)

DFS applications
• Just like BFS, DFS can be used to find
connected components in undirected graphs
• DFS can also tell if the graph contains cycles
• The timestamp attributes d and f are most
useful in applications like topological sorting
of DAGs.
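As an illustration of the cycle-detection application, here is a minimal sketch for directed graphs. It relies on a standard property that the slides do not spell out: a directed graph contains a cycle exactly when DFS meets an edge leading to a vertex that is still GRAY. The function name and both example graphs are hypothetical.

```python
def has_cycle(adj):
    """Return True if the directed graph given by adj contains a cycle."""
    color = {u: "white" for u in adj}

    def visit(u):
        color[u] = "gray"
        for v in adj[u]:
            if color[v] == "gray":          # edge back to an unfinished vertex
                return True
            if color[v] == "white" and visit(v):
                return True
        color[u] = "black"
        return False

    return any(color[u] == "white" and visit(u) for u in adj)


acyclic = {"C": ["A"], "A": ["F"], "F": []}
cyclic = {"A": ["B"], "B": ["C"], "C": ["A"]}
acyclic_result = has_cycle(acyclic)
cyclic_result = has_cycle(cyclic)
```

For undirected graphs the test needs an adjustment, because the edge leading back to the parent of a vertex always points to a gray vertex and must not be counted as a cycle.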

Topological sort of a DAG
• DAG: Directed Acyclic Graph
• A topological sort of a DAG G =(V,E) is a linear
ordering of all its vertices such that if G contains
an edge (u,v), then u appears before v in the
ordering.
• Many applications use directed acyclic graphs to indicate precedence among events.
– Examples:
• Degree study plan according to prerequisites
• Set of computing tasks where the outputs of some tasks are inputs to other tasks
Topological sort example

Source: Introduction to Algorithms, Thomas Cormen et al.


Topological sort of a DAG
TOPOLOGICAL-SORT(G)
  call DFS(G) to compute finishing times v.f for each vertex v
  as each vertex is finished, insert it onto the front of a linked list
  return the linked list of vertices
Topological sort of a DAG
TS-dfs(G, u, L)
  time = time + 1
  u.d = time
  u.color = GRAY
  for each v in G.Adj[u]
    if v.color == WHITE
      v.𝜋 = u
      TS-dfs(G, v, L)
  u.color = BLACK
  time = time + 1
  u.f = time
  LIST-INSERT(L, u) // insert u at the front of L

TOPOLOGICAL-SORT(G)
/*** L is a linked list of vertices ***/
  for each vertex u in G.V
    u.color = WHITE
    u.𝜋 = NIL
    u.d = -1
    u.f = -1
  time = 0
  L = ∅
  for each vertex u in G.V
    if u.color == WHITE
      TS-dfs(G, u, L)
  return L
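A Python sketch of this procedure is given below, applied to the directed sample graph from the earlier slides (which happens to be acyclic). Two simplifications are assumptions of this example: the timestamps d and f are omitted because only the finishing order matters here, and a `collections.deque` with `appendleft` stands in for the linked list. The helper `is_topological_order` is an illustrative addition that checks the definition directly.

```python
from collections import deque

DIRECTED = {
    "A": ["F", "G"], "B": ["E", "D"], "C": ["A"], "D": ["H"],
    "E": ["D"], "F": [], "G": [], "H": [],
    "I": ["C", "F"], "J": ["G", "F", "I"], "K": ["D", "E"],
}


def topological_sort(adj):
    """Return the vertices of a DAG in a topological order."""
    color = {u: "white" for u in adj}
    order = deque()

    def visit(u):
        color[u] = "gray"
        for v in adj[u]:
            if color[v] == "white":
                visit(v)
        color[u] = "black"
        order.appendleft(u)        # finished vertices go to the front

    for u in adj:
        if color[u] == "white":
            visit(u)
    return list(order)


def is_topological_order(adj, order):
    """Check that u appears before v for every edge (u, v)."""
    position = {u: i for i, u in enumerate(order)}
    return all(position[u] < position[v] for u in adj for v in adj[u])


order = topological_sort(DIRECTED)
valid = is_topological_order(DIRECTED, order)
```

A DAG usually has several valid topological orders; the one produced depends on the order in which vertices and neighbors are visited.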
