
Graph in Data Structure

A graph is a pictorial representation of a set of objects where some pairs of objects are connected by links. The interconnected objects are represented by points termed vertices, and the links that connect the vertices are called edges. Formally, a graph is a pair of sets (V, E), where V is the set of vertices and E is the set of edges connecting pairs of vertices. Take a look at the following graph −

In the above graph,


V = {a, b, c, d, e}
E = {ab, ac, bd, cd, de}

Graph Data Structure


Mathematical graphs can be represented in a data structure. We can represent a graph using an array of vertices and a two-dimensional array of edges. Before we proceed further, let's familiarize ourselves with some important terms −
 Vertex − Each node of the graph is represented as a vertex. In the following example, the labeled circles represent vertices. Thus, A to G are vertices. We can represent them using an array as shown in the following image. Here A can be identified by index 0, B can be identified by index 1, and so on.

 Edge − An edge represents a path between two vertices or a line between two vertices. In the following example, the lines from A to B, B to C, and so on represent edges. We can use a two-dimensional array to represent the edges as shown in the following image (and in the sketch after this list). Here AB can be represented as 1 at row 0, column 1, BC as 1 at row 1, column 2, and so on, keeping other combinations as 0.

 Adjacency − Two nodes or vertices are adjacent if they are connected to each other through an edge. In the following example, B is adjacent to A, C is adjacent to B, and so on.

 Path − A path represents a sequence of edges between two vertices. In the following example, ABCD represents a path from A to D.
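
The Python sketch below illustrates this vertex-array plus two-dimensional edge-array (adjacency matrix) representation. The vertices A to G follow the text; the specific edges AB and BC are only illustrative assumptions, since the original figure is not reproduced here.

    # Vertex array: index 0 -> A, index 1 -> B, ..., index 6 -> G
    vertices = ["A", "B", "C", "D", "E", "F", "G"]

    # Two-dimensional edge array (adjacency matrix), initially all 0
    n = len(vertices)
    edges = [[0] * n for _ in range(n)]

    def add_edge(u, v):
        # Mark the edge in both directions, since the graph is undirected
        i, j = vertices.index(u), vertices.index(v)
        edges[i][j] = 1
        edges[j][i] = 1

    add_edge("A", "B")   # 1 at row 0, column 1 (and row 1, column 0)
    add_edge("B", "C")   # 1 at row 1, column 2 (and row 2, column 1)

    for row in edges:
        print(row)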
Basic Operations
Following are the basic primary operations of a graph (a small code sketch of these operations follows the list) −
 Add Vertex − Adds a vertex to the graph.

 Add Edge − Adds an edge between the two vertices of the graph.

 Display Vertex − Displays a vertex of the graph.
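
A minimal Python sketch of these operations, assuming an undirected graph stored as an adjacency list; the class and method names are illustrative, not taken from the original text.

    class Graph:
        def __init__(self):
            self.adj = {}                    # vertex -> list of adjacent vertices

        def add_vertex(self, v):
            # Add Vertex − create an empty adjacency list for v
            self.adj.setdefault(v, [])

        def add_edge(self, u, v):
            # Add Edge − connect u and v in both directions (undirected)
            self.add_vertex(u)
            self.add_vertex(v)
            self.adj[u].append(v)
            self.adj[v].append(u)

        def display_vertex(self, v):
            # Display Vertex − print the vertex and its neighbours
            print(v, "->", self.adj.get(v, []))

    g = Graph()
    g.add_edge("A", "B")
    g.add_edge("B", "C")
    g.display_vertex("B")                    # B -> ['A', 'C']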

Depth First Search (DFS)

Depth First Search (DFS) − This algorithm traverses a graph in a depthward motion and uses a stack to remember the vertex from which to resume the search when a dead end occurs in any iteration.


As in the example given above, the DFS algorithm traverses from S to A to D to G to E to B first, then to F, and lastly to C. It employs the following rules (a code sketch follows the step-by-step walkthrough below).
 Rule 1 − Visit the adjacent unvisited vertex. Mark it as visited. Display it. Push it onto a stack.

 Rule 2 − If no adjacent vertex is found, pop a vertex from the stack. (This will pop all the vertices from the stack that have no unvisited adjacent vertices.)

 Rule 3 − Repeat Rule 1 and Rule 2 until the stack is empty.

Step-by-step DFS traversal −

Step 1 − Initialize the stack.

Step 2 − Mark S as visited and put it onto the stack. Explore any unvisited adjacent node from S. We have three nodes and we can pick any of them. For this example, we shall take the nodes in alphabetical order.

Step 3 − Mark A as visited and put it onto the stack. Explore any unvisited adjacent node from A. Both S and D are adjacent to A, but we are concerned with unvisited nodes only.

Step 4 − Visit D, mark it as visited and put it onto the stack. Here we have B and C, which are adjacent to D, and both are unvisited. However, we shall again choose in alphabetical order.

Step 5 − We choose B, mark it as visited and put it onto the stack. Here B does not have any unvisited adjacent node. So, we pop B from the stack.

Step 6 − We check the stack top to return to the previous node and check whether it has any unvisited nodes. Here, we find D on the top of the stack.

Step 7 − The only unvisited adjacent node from D is now C. So we visit C, mark it as visited and put it onto the stack.

As C does not have any unvisited adjacent node, we keep popping the stack until we find a node that has an unvisited adjacent node. In this case, there is none, so we keep popping until the stack is empty.
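
A minimal Python sketch of this stack-based DFS, assuming the graph is given as an adjacency dictionary; the graph below is only an illustrative stand-in for the graph in the figure.

    def dfs(graph, start):
        visited = [start]                # visit order (marked and displayed)
        stack = [start]                  # Rule 1: push the start vertex
        while stack:                     # Rule 3: repeat until the stack is empty
            top = stack[-1]
            # Look for an unvisited vertex adjacent to the stack top (alphabetical order)
            nxt = next((v for v in sorted(graph[top]) if v not in visited), None)
            if nxt is None:
                stack.pop()              # Rule 2: dead end, pop from the stack
            else:
                visited.append(nxt)      # Rule 1: visit, mark, push
                stack.append(nxt)
        return visited

    # Assumed adjacency for illustration
    graph = {
        "S": ["A", "B", "C"],
        "A": ["S", "D"],
        "B": ["S", "D"],
        "C": ["S", "D"],
        "D": ["A", "B", "C"],
    }
    print(dfs(graph, "S"))               # ['S', 'A', 'D', 'B', 'C']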
Breadth First Search (BFS)
The Breadth First Search (BFS) algorithm traverses a graph in a breadthward motion and uses a queue to remember the vertex from which to continue the search when a dead end occurs in any iteration.

As in the example given above, the BFS algorithm traverses from A to B to E to F first, then to C and G, and lastly to D. It employs the following rules (a code sketch follows the step-by-step walkthrough below).
 Rule 1 − Visit the adjacent unvisited vertex. Mark it as visited. Display it. Insert it in a queue.

 Rule 2 − If no adjacent vertex is found, remove the first vertex from the queue.

 Rule 3 − Repeat Rule 1 and Rule 2 until the queue is empty.

Step-by-step BFS traversal −

Step 1 − Initialize the queue.

Step 2 − We start by visiting S (the starting node) and mark it as visited.

Step 3 − We then see an unvisited adjacent node from S. In this example, we have three nodes, but alphabetically we choose A, mark it as visited and enqueue it.

Step 4 − Next, the unvisited adjacent node from S is B. We mark it as visited and enqueue it.

Step 5 − Next, the unvisited adjacent node from S is C. We mark it as visited and enqueue it.

Step 6 − Now, S is left with no unvisited adjacent nodes. So, we dequeue and find A.

Step 7 − From A we have D as an unvisited adjacent node. We mark it as visited and enqueue it.

At this stage, we are left with no unmarked (unvisited) nodes. But as per the algorithm, we keep on dequeuing to check every remaining node for unvisited neighbours. When the queue is emptied, the program is over.
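
A minimal Python sketch of this queue-based BFS, again assuming an adjacency dictionary; the graph is the same illustrative stand-in used for the DFS sketch above.

    from collections import deque

    def bfs(graph, start):
        visited = [start]                # visit order (marked and displayed)
        queue = deque([start])           # Rule 1: insert the start vertex
        while queue:                     # Rule 3: repeat until the queue is empty
            front = queue[0]
            # Look for an unvisited vertex adjacent to the front of the queue
            nxt = next((v for v in sorted(graph[front]) if v not in visited), None)
            if nxt is None:
                queue.popleft()          # Rule 2: no unvisited neighbour, dequeue
            else:
                visited.append(nxt)      # Rule 1: visit, mark, enqueue
                queue.append(nxt)
        return visited

    # Assumed adjacency for illustration
    graph = {
        "S": ["A", "B", "C"],
        "A": ["S", "D"],
        "B": ["S", "D"],
        "C": ["S", "D"],
        "D": ["A", "B", "C"],
    }
    print(bfs(graph, "S"))               # ['S', 'A', 'B', 'C', 'D']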
Spanning Tree

A spanning tree is a subset of a graph G that covers all the vertices with the minimum possible number of edges. Hence, a spanning tree does not have cycles and it cannot be disconnected.
By this definition, we can draw the conclusion that every connected and undirected graph G has at least one spanning tree. A disconnected graph does not have any spanning tree, as it cannot be spanned to all its vertices.

We found three spanning trees of one complete graph. A complete undirected graph can have a maximum of n^(n−2) spanning trees, where n is the number of nodes. In the above example, n is 3, hence 3^(3−2) = 3 spanning trees are possible.

General Properties of Spanning Tree


We now understand that one graph can have more than one spanning tree.
Following are a few properties of the spanning tree connected to graph G −
 A connected graph G can have more than one spanning tree.

 All possible spanning trees of graph G, have the same number of edges and vertices.

 The spanning tree does not have any cycle (loops).

 Removing one edge from the spanning tree will make the tree disconnected, i.e. the spanning tree is minimally connected.

 Adding one edge to the spanning tree will create a circuit or loop, i.e. the spanning tree is maximally
acyclic.

Mathematical Properties of Spanning Tree


 Spanning tree has n-1 edges, where n is the number of nodes (vertices).

 From a complete graph, by removing maximum e - n + 1 edges, we can construct a spanning tree.
 A complete graph can have a maximum of n^(n−2) spanning trees.

Thus, we can conclude that spanning trees are a subset of a connected graph G, and disconnected graphs do not have a spanning tree.

Application of Spanning Tree


A spanning tree is basically used to connect all the nodes of a graph with the minimum possible number of links. Common applications of spanning trees are −
 Civil Network Planning

 Computer Network Routing Protocol

 Cluster Analysis

Let us understand this through a small example. Consider a city network as a huge graph, and suppose we now plan to deploy telephone lines in such a way that with the minimum number of lines we can connect all the city nodes. This is where the spanning tree comes into the picture.

Minimum Spanning Tree (MST)


In a weighted graph, a minimum spanning tree is a spanning tree whose weight is less than or equal to the weight of every other spanning tree of the same graph. In real-world situations, this weight can be measured as distance, congestion, traffic load or any arbitrary value assigned to the edges.

Minimum Spanning-Tree Algorithm


We shall learn about the two most important spanning tree algorithms here −
 Kruskal's Algorithm

 Prim's Algorithm

Both are greedy algorithms.
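
As an illustration of the greedy approach, here is a minimal Python sketch of Prim's algorithm using a priority queue; the weighted graph is an assumed example, not taken from the original text.

    import heapq

    def prim_mst(graph, start):
        # Grow the tree from `start`, always adding the cheapest edge
        # that connects a vertex not yet in the tree.
        visited = {start}
        heap = [(w, start, v) for v, w in graph[start]]
        heapq.heapify(heap)
        mst_edges = []
        while heap and len(visited) < len(graph):
            w, u, v = heapq.heappop(heap)
            if v in visited:
                continue
            visited.add(v)
            mst_edges.append((u, v, w))
            for nxt, nw in graph[v]:
                if nxt not in visited:
                    heapq.heappush(heap, (nw, v, nxt))
        return mst_edges

    # Assumed weighted, undirected graph: vertex -> list of (neighbour, weight)
    graph = {
        "A": [("B", 2), ("C", 3)],
        "B": [("A", 2), ("C", 1), ("D", 4)],
        "C": [("A", 3), ("B", 1), ("D", 5)],
        "D": [("B", 4), ("C", 5)],
    }
    print(prim_mst(graph, "A"))   # [('A', 'B', 2), ('B', 'C', 1), ('B', 'D', 4)]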


Heap
A heap is a special case of a balanced binary tree data structure where the root-node key is compared with its children and arranged accordingly. If α has a child node β then −
key(α) ≥ key(β)
As the value of the parent is greater than that of the child, this property generates a Max Heap. Based on this criterion, a heap can be of two types −
For Input → 35 33 42 10 14 19 27 44 26 31
Min-Heap − Where the value of the root node is less than or equal to either of its
children.

Max-Heap − Where the value of the root node is greater than or equal to either of
its children.

Both trees are constructed using the same input and order of arrival.
Max Heap Construction Algorithm
We shall use the same example to demonstrate how a Max Heap is created. The
procedure to create Min Heap is similar but we go for min values instead of max
values.
We are going to derive an algorithm for a max heap by inserting one element at a time. At any point of time, the heap must maintain its property. During insertion, we also assume that we are inserting a node into an already heapified tree.
Step 1 − Create a new node at the end of heap.
Step 2 − Assign new value to the node.
Step 3 − Compare the value of this child node with its parent.
Step 4 − If value of parent is less than child, then swap them.
Step 5 − Repeat step 3 & 4 until Heap property holds.
Note − In Min Heap construction algorithm, we expect the value of the parent node
to be less than that of the child node.
Let's understand Max Heap construction with the same input sample that we used earlier (a code sketch follows).
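
A minimal Python sketch of this insertion (sift-up) procedure, applied to the input sample above; the array-based layout, with the parent of index i at (i − 1) // 2, is a standard assumption not spelled out in the text.

    def max_heap_insert(heap, value):
        # Steps 1 & 2: create a new node at the end of the heap and assign the value
        heap.append(value)
        i = len(heap) - 1
        # Steps 3-5: compare the child with its parent; swap while the parent is smaller
        while i > 0:
            parent = (i - 1) // 2
            if heap[parent] < heap[i]:
                heap[parent], heap[i] = heap[i], heap[parent]
                i = parent
            else:
                break

    heap = []
    for x in [35, 33, 42, 10, 14, 19, 27, 44, 26, 31]:
        max_heap_insert(heap, x)
    print(heap)          # the root heap[0] holds the maximum value, 44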

Max Heap Deletion Algorithm


Let us derive an algorithm to delete from a max heap. Deletion in a Max (or Min) Heap always happens at the root, to remove the maximum (or minimum) value.
Step 1 − Remove the root node.
Step 2 − Move the last element of the last level to the root.
Step 3 − Compare the value of this parent node with its children.
Step 4 − If the value of the parent is less than that of its largest child, swap them.
Step 5 − Repeat steps 3 & 4 until the heap property holds.
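
A minimal Python sketch of this deletion (sift-down) procedure, continuing from the heap built above; the array-based child indices 2i + 1 and 2i + 2 are again an assumption of the sketch.

    def max_heap_delete(heap):
        # Steps 1 & 2: remove the root and move the last element to the root
        root = heap[0]
        heap[0] = heap[-1]
        heap.pop()
        i, n = 0, len(heap)
        # Steps 3-5: compare the parent with its children; swap with the larger child
        while True:
            left, right = 2 * i + 1, 2 * i + 2
            largest = i
            if left < n and heap[left] > heap[largest]:
                largest = left
            if right < n and heap[right] > heap[largest]:
                largest = right
            if largest == i:
                break
            heap[i], heap[largest] = heap[largest], heap[i]
            i = largest
        return root

    heap = [44, 42, 35, 33, 31, 19, 27, 10, 26, 14]
    print(max_heap_delete(heap))   # 44
    print(heap)                    # the remaining elements still form a max heap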
Hashing
What is Hashing?

 Hashing is the process of mapping a large amount of data to a smaller table with the help of a hashing function.
 Hashing is also known as a Hashing Algorithm or Message Digest Function.
 It is a technique to convert a range of key values into a range of indexes of an array.
 It provides faster searching compared with linear or binary search.
 Hashing allows us to update and retrieve any data entry in constant time, O(1).
 Constant time O(1) means the operation does not depend on the size of the data.
 Hashing is used with a database to enable items to be retrieved more quickly.
 It is used in the generation and verification of digital signatures.

What is Hash Function?

 A fixed process that converts a key to a hash key is known as a Hash Function.
 This function takes a key and maps it to a value of a certain length, which is called a Hash value or Hash.
 The hash value represents the original string of characters, but it is normally smaller than the original.
 In a digital-signature scheme, the hash value and the signature are sent to the receiver. The receiver uses the same hash function to generate the hash value and then compares it to the one received with the message.
 If the hash values are the same, the message was transmitted without errors.

What is Hash Table?

 A hash table or hash map is a data structure used to store key-value pairs.
 It is a collection of items stored in a way that makes it easy to find them later.
 It uses a hash function to compute an index into an array of buckets or slots, from which the desired value can be found.
 It is an array of lists, where each list is known as a bucket.
 It contains values based on keys.
 In Java, the Hashtable class implements the Map interface and extends the Dictionary class.
 A Hashtable is synchronized and contains only unique keys.
 The figure above shows a hash table of size n = 10. Each position of the hash table is called a slot. The table has n slots, named {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}: slot 0, slot 1, slot 2, and so on. Initially the hash table contains no items, so every slot is empty.
 As we know, the mapping between an item and the slot where that item belongs in the hash table is called the hash function. The hash function takes any item in the collection and returns an integer in the range of slot names, 0 to n − 1.
 Suppose we have the integer items {26, 70, 18, 31, 54, 93}. One common method of determining a hash key is the division method of hashing, and the formula is:

Hash Key = Key Value % Number of Slots in the Table

 The division method (or remainder method) takes an item, divides it by the table size and returns the remainder as its hash value.

Data Item    Value % No. of Slots    Hash Value
26           26 % 10 = 6             6
70           70 % 10 = 0             0
18           18 % 10 = 8             8
31           31 % 10 = 1             1
54           54 % 10 = 4             4
93           93 % 10 = 3             3

 After computing the hash values, we can insert each item into the hash table at the designated position, as shown in the figure above. In this hash table, 6 of the 10 slots are occupied. The fraction of occupied slots is referred to as the load factor, denoted λ = number of items / table size. For example, λ = 6/10 = 0.6.
 It is easy to search for an item using the hash function: we compute the slot name for the item and then check the hash table to see if it is present.
 A constant amount of time, O(1), is required to compute the hash value and index into the hash table at that location.
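
A minimal Python sketch of the division method for the items above; the fixed-size slot list and the use of None to mark empty slots are assumptions of the sketch.

    TABLE_SIZE = 10
    table = [None] * TABLE_SIZE          # None marks an empty slot

    def hash_key(key):
        # Division (remainder) method: Hash Key = Key Value % Number of Slots
        return key % TABLE_SIZE

    for item in [26, 70, 18, 31, 54, 93]:
        table[hash_key(item)] = item

    print(table)
    # [70, 31, None, 93, 54, None, 26, None, 18, None]

    load_factor = sum(slot is not None for slot in table) / TABLE_SIZE
    print(load_factor)                   # 0.6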

Linear Probing

 Continuing the above example, if we insert the next item, 40, into our collection, it would have a hash value of 0 (40 % 10 = 0). But 70 already occupies slot 0, so this becomes a problem. This problem is called a Collision or Clash. Collisions create a problem for the hashing technique.
 Linear probing is used for resolving collisions in hash tables, i.e. data structures that maintain a collection of key-value pairs.
 Linear probing was invented by Gene Amdahl, Elaine M. McGraw and Arthur Samuel
in 1954 and analyzed by Donald Knuth in 1963.
 It is a component of open addressing scheme for using a hash table to solve the
dictionary problem.
 The simplest method is called linear probing. The formula to compute the next probe position is:

P = (P + 1) % table_size
For example,

If we insert the next item, 40, into our collection, it would have a hash value of 0 (40 % 10 = 0). But 70 also had a hash value of 0, so this becomes a problem.

Linear probing solves this problem:

P = H(40) = 40 % 10 = 0
Position 0 is occupied by 70, so we look elsewhere for a position to store 40.

Using Linear Probing:


P = (P + 1) % table_size
(0 + 1) % 10 = 1

But position 1 is occupied by 31, so we look elsewhere for a position to store 40.

Using linear probing, we try the next position: (1 + 1) % 10 = 2


Position 2 is empty, so 40 is inserted there.
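
A minimal Python sketch of insertion with linear probing, continuing from the table built above; the slot list and the None empty-slot marker are again assumptions of the sketch.

    def insert_linear_probing(table, key):
        size = len(table)
        p = key % size                   # initial hash value
        while table[p] is not None:      # slot occupied: a collision
            p = (p + 1) % size           # probe the next slot
        table[p] = key

    # Table state after inserting {26, 70, 18, 31, 54, 93}
    table = [70, 31, None, 93, 54, None, 26, None, 18, None]
    insert_linear_probing(table, 40)     # slots 0 and 1 are occupied, so 40 lands in slot 2
    print(table)
    # [70, 31, 40, 93, 54, None, 26, None, 18, None]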
