
Data Structures and Algorithms

Course Code: BCA-204 Credits: 4

Course Outcome: -
 To have a thorough understanding of the basic structure and operation of a digital computer.
 To discuss in detail the operation of the arithmetic unit, including the algorithms and implementation of fixed-point and floating-point addition, subtraction, multiplication and division.
 To study the different ways of communicating with I/O devices and standard I/O interfaces.
 To study the hierarchical memory system including cache memories and virtual memory

Course Goal
 To understand basic structure of a digital computer.
 To perform arithmetic operations of the binary number system.
 To know the organization of the control unit, the arithmetic and logic unit, the memory unit and
the I/O unit.

Course Learning Outcomes


 The course focuses on understanding the basic structure and operation of a digital computer
and on studying the different ways of communicating with I/O devices and standard I/O
interfaces.

Syllabus

Unit 1- Introduction to Algorithm Design: Algorithm, its characteristics, efficiency of
algorithms, analyzing algorithms and problems.
Linear Structure: Arrays, records, stack, operation on stack, implementation of stack as an
array, queue, types of queues, operations on queue, implementation of queue.

Unit 2- Linked Structure: List representation, Polish notations, operations on linked list, get
node and free node operations, implementing the list operations, inserting into an ordered
linked list, deleting, circular linked list, doubly linked list.

Unit 3-Tree Structure: Concept and terminology, Types of trees, Binary search tree,
inserting, deleting and searching into binary search tree, implementing the insert, search and
delete algorithms, tree traversals, Huffman's algorithm.

Unit 4- Graph Structure: Graph representation - Adjacency matrix, adjacency list,
Warshall's algorithm, adjacency multilist representation. Orthogonal representation of graph.
Graph traversals - BFS and DFS. Shortest path, all pairs of shortest paths, transitive closure.
Unit 5- Searching and sorting: Searching - sequential searching, binary searching, hashing.
Sorting - selection sort, bubble sort, quick sort, heap sort, merge sort, and insertion sort;
efficiency considerations.

Recommended reference books


1. S. Lipschutz: Data Structures, McGraw-Hill International Edition.

2. A.V. Aho, J.E. Hopcroft, and J.D. Ullman: Data Structures and Algorithms, Pearson Education Asia.

3. A. Michael Berman: Data Structures via C++. Oxford University Press.

4. Sara Baase and Allen Van Gelder: Computer Algorithms, Pearson Education Asia

CONTENTS

Unit 1 Introduction to Algorithm Design


Unit 2 Linked Structures
Unit 3 Tree Structure
Unit 4 Graph Structure
Unit 5 Searching and sorting

UNIT 1 Introduction to Algorithm Design & Linear
Structure

Structure:
1.0 Learning Objectives
1.1 Algorithm and Its characteristics
1.2 Efficiency of algorithms
1.3 Analyzing Algorithms and problems
1.4 Linear Structure: Arrays, records
1.5 Stack, operation on stack
1.6 Implementation of stack as an array
1.7 Queue, types of queues
1.8 Operations on queue
1.9 Implementation of queue.
1.10 Unit End Questions

1.0 Learning Objectives

After studying this unit, you will be able to:


 Design an Algorithm
 Discuss different characteristics of Algorithms

1.1 Algorithm and Its characteristics

Definition: An algorithm is a step-by-step process to solve a problem, where
each step indicates an intermediate task. An algorithm contains a finite number
of steps that lead to the solution of the problem.

Properties / Characteristics of an Algorithm:
An algorithm has the following basic properties:
 Input-Output: An algorithm takes zero or more inputs and produces the
required output. This is a basic characteristic of an algorithm.
 Finiteness: An algorithm must terminate in a finite number of steps.
 Definiteness: Each step of an algorithm must be stated clearly and
unambiguously.
 Effectiveness: Each and every step in an algorithm can be converted
into a programming-language statement.
 Generality: An algorithm is generalized: it works on all sets of inputs
and provides the required output. In other words, it is not restricted to
a single input value.

Categories of Algorithm:
Based on the different types of steps in an Algorithm, it can be divided into
three categories, namely
 Sequence
 Selection and
 Iteration
Sequence: The steps described in an algorithm are performed successively one
by one without skipping any step. The sequence of steps defined in an
algorithm should be simple and easy to understand. Each instruction of such
an algorithm is executed, because no selection procedure or conditional
branching exists in a sequence algorithm.
Example:
// adding two numbers
Step 1: start
Step 2: read a,b
Step 3: Sum=a+b
Step 4: write Sum
Step 5: stop

Selection: Sequence-type algorithms are not sufficient to solve problems
that involve decisions and conditions. In order to solve a problem that
involves decision making or option selection, we use a selection-type
algorithm. The general format of a selection-type statement
is shown below:
if(condition)
Statement-1;
else
Statement-2;
The above syntax specifies that if the condition is true, statement-1 will be
executed; otherwise statement-2 will be executed. If the operation is
unsuccessful, the sequence of the algorithm should be changed/corrected in
such a way that the system re-executes it until the operation is successful.

Iteration: Iteration-type algorithms are used to solve problems that
involve the repetition of statements. In this type of algorithm, a particular
set of statements is repeated ‘n’ times.

Example 1: (sum of the digits of a number)
Step 1 : start
Step 2 : read n, set s=0
Step 3 : repeat step 4 while n>0
Step 4 : (a) r=n mod 10
(b) s=s+r
(c) n=n/10
Step 5 : write s
Step 6 : stop

1.2 Efficiency of Algorithms

Performance Analysis of an Algorithm:

The efficiency of an algorithm can be measured by the following metrics:
i. Time Complexity and
ii. Space Complexity.

i. Time Complexity:
The amount of time required for an algorithm to complete its execution is its
time complexity. An algorithm is said to be efficient if it takes the minimum
(reasonable) amount of time to complete its execution.

ii. Space Complexity:
The amount of space occupied by an algorithm is known as space complexity.
An algorithm is said to be efficient if it occupies less space and requires the
minimum amount of time to complete its execution.

1. Write an algorithm to find the roots of a quadratic equation.

// Roots of a quadratic equation
Step 1 : start
Step 2 : read a,b,c
Step 3 : if (a = 0) then go to step 4 else go to step 5
Step 4 : write “Given equation is a linear equation” and go to step 11
Step 5 : d = (b * b) - (4 * a * c)
Step 6 : if (d > 0) then go to step 7 else go to step 8
Step 7 : write “Roots are real and distinct” and go to step 11
Step 8 : if (d = 0) then go to step 9 else go to step 10
Step 9 : write “Roots are real and equal” and go to step 11
Step 10 : write “Roots are imaginary”
Step 11 : stop

2. Write an algorithm to find the largest among three different numbers entered by user.
Step 1: Start
Step 2: Declare variables a, b and c.
Step 3: Read variables a, b and c.
Step 4: If a>b
            If a>c
                Display a is the largest number.
            Else
                Display c is the largest number.
        Else
            If b>c
                Display b is the largest number.
            Else
                Display c is the largest number.
Step 5: Stop

3. Write an algorithm to find the factorial of a number entered by user.

Step 1: Start
Step 2: Declare variables n, factorial and i.
Step 3: Initialize variables factorial←1, i←1
Step 4: Read value of n
Step 5: Repeat the following steps while i≤n
    5.1: factorial←factorial*i
    5.2: i←i+1
Step 6: Display factorial
Step 7: Stop

4. Write an algorithm to find the simple interest for a given principal, time and rate of interest.
Step 1: Start
Step 2: Read P, T, R.
Step 3: Calculate S = (P*T*R)/100
Step 4: Print S
Step 5: Stop

ASYMPTOTIC NOTATIONS
Asymptotic analysis of an algorithm refers to defining the mathematical
bounds of its run-time performance. Using asymptotic analysis,
we can very well conclude the best case, average case, and worst case
scenario of an algorithm.
Asymptotic analysis is input bound i.e., if there's no input to the algorithm, it
is concluded to work in a constant time. Other than the "input" all other
factors are considered constant.
Asymptotic analysis refers to computing the running time of any operation in
mathematical units of computation. For example, the running time of one
operation may be computed as f(n) and that of another operation as
g(n²). This means the first operation's running time will increase
linearly with the increase in n, while the running time of the second operation
will increase quadratically when n increases. Similarly, the running times of
both operations will be nearly the same if n is significantly small.
The time required by an algorithm falls under three types −
 Best Case − Minimum time required for program execution.
 Average Case − Average time required for program execution.
 Worst Case − Maximum time required for program execution.
Asymptotic Notations
Following are the commonly used asymptotic notations to calculate the
running timecomplexity of an algorithm.
 Ο Notation
 Ω Notation
 θ Notation
Big Oh Notation, Ο
The notation Ο(n) is the formal way to express the upper bound of an
algorithm's running time. It measures the worst case time complexity or the
longest amount of time an algorithm can possibly take to complete.

For example, for a function f(n)
Ο(f(n)) = { g(n) : there exist c > 0 and n0 such that g(n) ≤ c.f(n) for all n > n0 }
Omega Notation, Ω
The notation Ω(n) is the formal way to express the lower bound of an
algorithm's running time. It measures the best case time complexity or the
minimum amount of time an algorithm can possibly take to complete.

For example, for a function f(n)

Ω(f(n)) = { g(n) : there exist c > 0 and n0 such that g(n) ≥ c.f(n) for all n > n0 }
Theta Notation, θ
The notation θ(n) is the formal way to express both the lower bound and the
upper boundof an algorithm's running time. It is represented as follows −

θ(f(n)) = { g(n) : g(n) = Ο(f(n)) and g(n) = Ω(f(n)) }

1.3 Analysing Algorithms and problems

Algorithm analysis is an important part of computational complexity theory, which provides


theoretical estimation for the required resources of an algorithm to solve a specific computational
problem. Analysis of algorithms is the determination of the amount of time and space resources
required to execute it.

Why Analysis of Algorithms is important?

 To predict the behaviour of an algorithm without implementing it on a specific computer.


 It is much more convenient to have simple measures for the efficiency of an algorithm than to
implement the algorithm and test the efficiency every time a certain parameter in the
underlying computer system changes.
 It is impossible to predict the exact behavior of an algorithm. There are too many influencing
factors.

The analysis is thus only an approximation; it is not perfect.


More importantly, by analysing different algorithms, we can compare them to determine the best one
for our purpose.

Types of Algorithm Analysis:

 Best case
 Worst case
 Average case
Best case: Define the input for which the algorithm takes the minimum time; the best case
gives a lower bound on the running time. Example: in linear search, the best case occurs
when the search key is present at the first location of the data.

Worst Case: Define the input for which the algorithm takes the maximum time; the worst
case gives an upper bound on the running time. Example: in linear search, the worst case
occurs when the search key is not present at all.

Average case: In the average case, take all random inputs, calculate the computation time
for each of them, and then divide the total by the number of inputs.
Average case = sum of the times over all random cases / total number of cases
1.4 Linear Structure: Arrays, records

Introduction to Linear Data Structures

Linear data structure: Data structure in which data elements are arranged sequentially or
linearly, where each element is attached to its previous and next adjacent elements is called a
linear data structure.
Examples of linear data structures are array, stack, queue, linked list, etc.

Arrays:
An array is a linear data structure and it is a collection of items stored at contiguous
memory locations. The idea is to store multiple items of the same type together in one
place. It allows the processing of a large amount of data in a relatively short period. The
first element of the array is indexed by a subscript of 0. There are different operations
possible in an array, like Searching, Sorting, Inserting, Traversing, Reversing, and
Deleting.


Characteristics of an Array:
An array has various characteristics which are as follows:
 Arrays use an index-based data structure which helps to identify each of the elements in
an array easily using the index.
 If a user wants to store multiple values of the same data type, then the array can be
utilized efficiently.
 An array can also handle complex data structures by storing data in a two-
dimensional array.
 An array is also used to implement other data structures like Stacks, Queues, Heaps,
Hash tables, etc.
 The search process in an array can be done very easily.

Operations performed on array:

 Initialization: An array can be initialized with values at the time of declaration or


later using an assignment statement.
 Accessing elements: Elements in an array can be accessed by their index, which
starts from 0 and goes up to the size of the array minus one.
 Searching for elements: Arrays can be searched for a specific element using linear
search or binary search algorithms.
 Sorting elements: Elements in an array can be sorted in ascending or descending
order using algorithms like bubble sort, insertion sort, or quick sort.
 Inserting elements: Elements can be inserted into an array at a specific location,
but this operation can be time-consuming because it requires shifting existing elements
in the array.
 Deleting elements: Elements can be deleted from an array by shifting the elements
that come after it to fill the gap.
 Updating elements: Elements in an array can be updated or modified by assigning
a new value to a specific index.
 Traversing elements: The elements in an array can be traversed in order, visiting
each element once.
These are some of the most common operations performed on arrays. The specific
operations and algorithms used may vary based on the requirements of the problem and the
programming language used.

Applications of Array:
Different applications of an array are as follows:
 An array is used in solving matrix problems.
 Database records are also implemented by an array.
 It helps in implementing a sorting algorithm.
 It is also used to implement other data structures like Stacks, Queues, Heaps, Hash
tables, etc.
 An array can be used for CPU scheduling.
 Can be applied as a lookup table in computers.
 Arrays can be used in speech processing where every speech signal is an array.
 The screen of the computer is also displayed by an array. Here we use a
multidimensional array.
 The array is used in many management systems like a library, students, parliament,
etc.
 The array is used in online ticket booking systems. Contacts on a cell phone are
also stored using arrays.
 In games like online chess, the player can store past as well as current moves in an
array; this gives a hint of position.

Real-Life Applications of Array:


 An array is frequently used to store data for mathematical computations.
 It is used in image processing.
 It is also used in record management.
 Book pages are also real-life examples of an array.
 It is used in ordering boxes as well.

Need for Arrays


Arrays are used as solutions to many problems from the small sorting problems to more
complex problems like travelling salesperson problem. There are many data structures other
than arrays that provide efficient time and space complexity for these problems, so what
makes using arrays better? The answer lies in the random access lookup time.

Arrays provide O(1) random access lookup time. That means, accessing the 1st index of the
array and the 1000th index of the array will both take the same time. This is due to the fact
that array comes with a pointer and an offset value. The pointer points to the right location of
the memory and the offset value shows how far to look in the said memory.

array_name[index]

Here, array_name acts as the pointer (the base address) and index as the offset.

Therefore, in an array with 6 elements, to access the 1st element, array is pointed towards the
0th index. Similarly, to access the 6th element, array is pointed towards the 5th index.

Array Representation
Arrays are represented as a collection of buckets where each bucket stores one element.
These buckets are indexed from ‘0’ to ‘n-1’, where n is the size of that particular array. For
example, an array with size 10 will have buckets indexed from 0 to 9.
This indexing will be similar for the multidimensional arrays as well. If it is a 2-dimensional
array, it will have sub-buckets in each bucket. Then it will be indexed as array_name[m][n],
where m and n are the sizes of each level in the array.

For example, consider an array of 9 elements. The following points are important:

 Index starts with 0.
 Array length is 9, which means it can store 9 elements (indices 0 to 8).
 Each element can be accessed via its index; for example, LA[6] fetches the element
stored at index 6.

Basic Operations in the Arrays

The basic operations in the Arrays are insertion, deletion, searching, display, traverse, and
update. These operations are usually performed to either modify the data in the array or to
report the status of the array.

Following are the basic operations supported by an array.

 Traverse − print all the array elements one by one.


 Insertion − Adds an element at the given index.
 Deletion − Deletes an element at the given index.
 Search − Searches an element using the given index or by the value.
 Update − Updates an element at the given index.
 Display − Displays the contents of the array.

In C, when an array has static storage duration (or is only partially initialized), the
remaining elements are set to the default (zero) value for their type, as follows.

Data Type    Default Value

bool         false

char         '\0' (0)

int          0

float        0.0

double       0.0

wchar_t      0

An uninitialized local (automatic) array, by contrast, holds indeterminate values.

Insertion Operation

In the insertion operation, we are adding one or more elements to the array. Based on the
requirement, a new element can be added at the beginning, end, or any given index of array.
This is done using input statements of the programming languages.

Algorithm

Following is an algorithm to insert elements into a Linear Array until we reach the end of the
array −

1. Start
2. Create an Array of a desired datatype and size.
3. Initialize a variable ‘i’ as 0.
4. Enter the element at ith index of the array.
5. Increment i by 1.
6. Repeat Steps 4 & 5 until the end of the array.
7. Stop

Here, we see a practical implementation of insertion operation, where we add data


at the end of the array −

Example

#include <stdio.h>
int main(){
   int LA[3] = {0};  /* note: empty braces {} are not valid C before C23 */
   int i;
   printf("Array Before Insertion:\n");
   for(i = 0; i < 3; i++)
      printf("LA[%d] = %d \n", i, LA[i]);
   printf("Inserting Elements.. \n");
   printf("The array elements after insertion :\n"); // prints array values
   for(i = 0; i < 3; i++) {
      LA[i] = i + 2;
      printf("LA[%d] = %d \n", i, LA[i]);
   }
   return 0;
}
Output
Array Before Insertion:
LA[0] = 0
LA[1] = 0
LA[2] = 0
Inserting Elements..
The array elements after insertion :
LA[0] = 2
LA[1] = 3
LA[2] = 4

Deletion Operation

In this array operation, we delete an element from a particular index of an array. This
deletion is performed by assigning the value at each subsequent index to the current
index.

Algorithm

Consider LA is a linear array with N elements and K is a positive integer such that K<=N.
Following is the algorithm to delete an element available at the Kth position of LA.

1. Start
2. Set J = K
3. Repeat steps 4 and 5 while J < N
4. Set LA[J] = LA[J + 1]
5. Set J = J+1
6. Set N = N-1
7. Stop

Example

Following is an implementation of this operation in C −

#include <stdio.h>
int main(){
   int LA[] = {1,3,5};
   int n = 3;
   int k = 2;  /* delete the element at position k (1-based) */
   int i;
   printf("The original array elements are :\n");
   for(i = 0; i<n; i++)
      printf("LA[%d] = %d \n", i, LA[i]);
   for(i = k-1; i<n-1; i++)  /* shift the later elements one place left */
      LA[i] = LA[i+1];
   n = n - 1;                /* the array now holds one element less */
   printf("The array elements after deletion :\n");
   for(i = 0; i<n; i++)
      printf("LA[%d] = %d \n", i, LA[i]);
   return 0;
}
Output
The original array elements are :
LA[0] = 1
LA[1] = 3
LA[2] = 5
The array elements after deletion :
LA[0] = 1
LA[1] = 5

Search Operation

Searching means locating an element in the array using a key; the key is compared
sequentially with every value in the array to check whether the key is present or not.

Algorithm

Consider LA is a linear array with N elements and K is a positive integer such that K<=N.
Following is the algorithm to find an element with a value of ITEM using sequential search.

1. Start
2. Set J = 0
3. Repeat steps 4 and 5 while J < N
4. IF LA[J] is equal to ITEM THEN GOTO STEP 6
5. Set J = J + 1
6. IF J < N THEN PRINT J, ITEM ELSE PRINT "ITEM not found"
7. Stop

Example

Following is an implementation of this operation in C −

#include <stdio.h>
int main(){
   int LA[] = {1,3,5,7,8};
   int item = 5, n = 5;
   int i;
   printf("The original array elements are :\n");
   for(i = 0; i<n; i++) {
      printf("LA[%d] = %d \n", i, LA[i]);
   }
   for(i = 0; i<n; i++) {
      if( LA[i] == item ) {
         printf("Found element %d at position %d\n", item, i+1);
      }
   }
   return 0;
}
Output
The original array elements are :
LA[0] = 1
LA[1] = 3
LA[2] = 5
LA[3] = 7
LA[4] = 8
Found element 5 at position 3

Traversal Operation

This operation traverses through all the elements of an array. We use loop statements to carry
this out.

Algorithm

Following is the algorithm to traverse through all the elements present in a Linear Array −

1. Start
2. Initialize an Array of certain size and datatype.
3. Initialize another variable ‘i’ with 0.
4. Print the ith value in the array and increment i.
5. Repeat Step 4 until the end of the array is reached.
6. End

Example

Following is an implementation of this operation in C −

#include <stdio.h>
int main(){
   int LA[] = {1,3,5,7,8};
   int n = 5;
   int i;
   printf("The original array elements are :\n");
   for(i = 0; i<n; i++) {
      printf("LA[%d] = %d \n", i, LA[i]);
   }
   return 0;
}
Output
The original array elements are :
LA[0] = 1
LA[1] = 3
LA[2] = 5
LA[3] = 7
LA[4] = 8

Update Operation

Update operation refers to updating an existing element from the array at a given index.

Algorithm

Consider LA is a linear array with N elements and K is a positive integer such that K<=N.
Following is the algorithm to update an element available at the Kth position of LA.

1. Start
2. Set LA[K-1] = ITEM
3. Stop

Example

Following is an implementation of this operation in C −

#include <stdio.h>
int main(){
   int LA[] = {1,3,5,7,8};
   int k = 3, n = 5, item = 10;
   int i;
   printf("The original array elements are :\n");
   for(i = 0; i<n; i++) {
      printf("LA[%d] = %d \n", i, LA[i]);
   }
   LA[k-1] = item;
   printf("The array elements after updation :\n");
   for(i = 0; i<n; i++) {
      printf("LA[%d] = %d \n", i, LA[i]);
   }
   return 0;
}
Output
The original array elements are :
LA[0] = 1
LA[1] = 3
LA[2] = 5
LA[3] = 7
LA[4] = 8
The array elements after updation :
LA[0] = 1
LA[1] = 3
LA[2] = 10
LA[3] = 7
LA[4] = 8

Display Operation

This operation displays all the elements in the entire array using a print statement.

Algorithm
1. Start
2. Print all the elements in the Array
3. Stop

Example

Following is an implementation of this operation in C −

#include <stdio.h>
int main(){
   int LA[] = {1,3,5,7,8};
   int n = 5;
   int i;
   printf("The original array elements are :\n");
   for(i = 0; i<n; i++) {
      printf("LA[%d] = %d \n", i, LA[i]);
   }
   return 0;
}
Output
The original array elements are :
LA[0] = 1
LA[1] = 3
LA[2] = 5
LA[3] = 7
LA[4] = 8

Records

In computer science, records are a fundamental data structure used to store multiple pieces of
related information together under one name. They are also known as structures (in languages
like C) or classes (in object-oriented languages like Java or Python). Records allow you to
create custom data types that can hold different types of data in a structured way.

Here's a basic example of a record (structure) in C:

#include <stdio.h>
#include <string.h>  /* for strcpy */

struct Person {
    char name[50];
    int age;
    float height;
};

// Example usage:
int main(void) {
    struct Person person1;
    strcpy(person1.name, "John");
    person1.age = 30;
    person1.height = 5.9f;
    printf("%s is %d years old and %.1f feet tall\n",
           person1.name, person1.age, person1.height);
    return 0;
}

1.5 Stack, operation on stack

What is a Stack?
A Stack is a linear data structure that follows the LIFO (Last-In-First-Out) principle. A stack
has one end, whereas a queue has two ends (front and rear). It contains only one
pointer, the top pointer, which points to the topmost element of the stack. Whenever an element is
added to the stack, it is added on the top of the stack, and an element can be deleted only
from the top of the stack. In other words, a stack can be defined as a container in which insertion and
deletion can be done from one end, known as the top of the stack.

Some key points related to stack


o It is called a stack because it behaves like a real-world stack, e.g., a pile of books.
o A Stack is an abstract data type with a pre-defined capacity, which means that it can
store the elements of a limited size.
o It is a data structure that follows some order to insert and delete the elements, and that
order can be LIFO or FILO.

Working of Stack

Stack works on the LIFO pattern. Suppose a stack has five memory blocks; then the size of
the stack is 5.

Suppose we want to store elements in a stack, and assume that the stack is empty. We take a
stack of size 5 and push the elements one by one until the stack becomes full.

Once five elements have been pushed, the stack is full, since its size is 5. The stack gets
filled up from the bottom to the top.

When we perform a delete operation on the stack, there is only one way for entry and exit,
as the other end is closed. It follows the LIFO pattern, which means that the value entered
first will be removed last. If the value 5 is entered first, it will be removed only after the
deletion of all the other elements.

Standard Stack Operations

The following are some common operations implemented on the stack:

o push(): When we insert an element in a stack then the operation is known as a push.
If the stack is full then the overflow condition occurs.
o pop(): When we delete an element from the stack, the operation is known as a pop. If
the stack is empty means that no element exists in the stack, this state is known as an
underflow state.
o isEmpty(): It determines whether the stack is empty or not.
o isFull(): It determines whether the stack is full or not.
o peek(): It returns the topmost element of the stack without removing it.
o count(): It returns the total number of elements available in a stack.
o change(): It changes the element at the given position.
o display(): It prints all the elements available in the stack.

PUSH operation

The steps involved in the PUSH operation are given below:

o Before inserting an element in a stack, we check whether the stack is full.


o If we try to insert the element in a stack, and the stack is full, then
the overflow condition occurs.
o When we initialize a stack, we set the value of top as -1 to check that the stack is
empty.
o When the new element is pushed in a stack, first, the value of the top gets
incremented, i.e., top=top+1, and the element will be placed at the new position of
the top.
o The elements will be inserted until we reach the max size of the stack.

POP operation
The steps involved in the POP operation are given below:

o Before deleting the element from the stack, we check whether the stack is empty.
o If we try to delete the element from the empty stack, then the underflow condition
occurs.
o If the stack is not empty, we first access the element which is pointed to by the top.
o Once the pop operation is performed, the top is decremented by 1, i.e., top=top-1.

Applications of Stack

The following are the applications of the stack:

o Balancing of symbols: Stack is used for balancing a symbol. For example, we have
the following program:

int main()
{
   printf("Hello");
   printf("javaTpoint");
}

As we know, each program has opening and closing braces; when an opening brace
comes, we push it onto a stack, and when a closing brace appears, we pop the matching
opening brace from the stack. Therefore, the net count comes out to be zero. If any symbol
is left in the stack, it means that a syntax error exists in the program.

o String reversal: Stack is also used for reversing a string. For example, we want to
reverse a "javaTpoint" string, so we can achieve this with the help of a stack.
First, we push all the characters of the string in a stack until we reach the null
character.
After pushing all the characters, we start taking out the character one by one until we
reach the bottom of the stack.
o UNDO/REDO: It can also be used for performing UNDO/REDO operations. For
example, we have an editor in which we write 'a', then 'b', and then 'c'; therefore, the
text written in the editor is abc. So, there are three states, a, ab, and abc, which are
stored in a stack. There would be two stacks: one stack shows the UNDO state,
and the other shows the REDO state.
If we want to perform an UNDO operation, and want to achieve the 'ab' state, then we
perform a pop operation.
o Recursion: Recursion means that a function calls itself. To maintain the
previous states, the compiler creates a system stack in which all the previous
records of the function calls are maintained.
o DFS (Depth First Search): This search is implemented on a graph, and it uses
the stack data structure.
o Backtracking: Suppose we have to create a path to solve a maze problem. If, while
moving along a particular path, we realize that we have gone the wrong way, then in
order to return to the beginning of the path and create a new path, we have to use the
stack data structure.
o Expression conversion: Stack can also be used for expression conversion. This is
one of the most important applications of stack. The list of the expression conversions
is given below:
o Infix to prefix
o Infix to postfix
o Prefix to infix
o Prefix to postfix
o Postfix to infix
o Memory management: The stack manages the memory. The memory is assigned in
contiguous memory blocks. The memory is known as stack memory as all the
variables are assigned in a function's call-stack memory. The compiler knows the
memory size assigned to the program. When a function is called, its variables
are assigned on the stack memory. When the function completes its execution, the
variables assigned on the stack are released.

1.6 Implementation of stack as an array

Array implementation of Stack

In the array implementation, the stack is formed by using an array. All the operations on the
stack are performed using arrays. Let's see how each operation can be implemented on the
stack using the array data structure.

Adding an element onto the stack (push operation)

Adding an element into the top of the stack is referred to as push operation. Push operation
involves following two steps.
25
1. Increment the variable Top so that it can now refere to the next memory location.
2. Add element at the position of incremented top. This is referred to as adding new
element at the top of the stack.

Stack overflow occurs when we try to insert an element into a completely filled stack;
therefore, our push function must always check for the overflow condition.

Algorithm:

1. begin
2. if top = n then stack full
3. top = top + 1
4. stack (top) : = item;
5. end

Time Complexity: O(1)

Implementation of push algorithm in C language


void push (int val, int n)    // n is the size of the stack
{
    if (top == n - 1)         // top starts at -1, so the stack is full at n - 1
        printf("\n Overflow");
    else
    {
        top = top + 1;
        stack[top] = val;
    }
}
Deletion of an element from a stack (Pop operation)

Deletion of an element from the top of the stack is called the pop operation. The value of
the variable top is decremented by 1 whenever an item is deleted from the stack. The
topmost element of the stack is first stored in another variable, then top is decremented by
1, and the operation returns the deleted value stored in that variable as the result.

The underflow condition occurs when we try to delete an element from an already empty
stack.

Algorithm :

1. begin
2. if top = 0 then stack empty;
3. item := stack(top);
4. top = top - 1;

5. end;

Time Complexity: O(1)

Implementation of POP algorithm using C language


int pop ()
{
    if (top == -1)
    {
        printf("Underflow");
        return 0;
    }
    else
    {
        return stack[top--];   // return the top element, then decrement top
    }
}
Visiting each element of the stack (Peek operation)

Peek operation involves returning the element which is present at the top of the stack without
deleting it. Underflow condition can occur if we try to return the top element in an already
empty stack.

Algorithm :

PEEK (STACK, TOP)

1. Begin
2. if top = -1 then stack empty
3. item = stack[top]
4. return item
5. End

Time complexity: O(1)

Implementation of Peek algorithm in C language


int peek()
{
    if (top == -1)
    {
        printf("Underflow");
        return 0;
    }
    else
    {
        return stack[top];
    }
}

C program

#include <stdio.h>
int stack[100],i,j,choice=0,n,top=-1;
void push();
void pop();
void show();
void main ()
{

printf("Enter the number of elements in the stack ");


scanf("%d",&n);
printf("*Stack operations using array*");

printf("\n----------------------------------------------\n");
while(choice != 4)
{
printf("Chose one from the below options...\n");
printf("\n1.Push\n2.Pop\n3.Show\n4.Exit");
printf("\n Enter your choice \n");
scanf("%d",&choice);
switch(choice)
{
case 1:
{
push();
break;
}
case 2:
{
pop();
break;
}
case 3:
{
show();
break;
}
case 4:
{
printf("Exiting....");
break;
}
default:
{
printf("Please Enter valid choice ");
}
};
}
}

void push ()
{
int val;
if (top == n - 1)
printf("\n Overflow");
else
{
printf("Enter the value?");
scanf("%d",&val);
top = top +1;
stack[top] = val;
}
}

void pop ()
{
if(top == -1)
printf("Underflow");
else
top = top -1;
}
void show()

{
for (i=top;i>=0;i--)
{
printf("%d\n",stack[i]);
}
if(top == -1)
{
printf("Stack is empty");
}
}

1.7 Queue, Types of Queues

What is a Queue?
A queue is a data structure similar to a queue in the real world: whatever comes in first goes
out first, following the FIFO (First-In-First-Out) policy. A queue can also be defined as a list
or collection in which insertion is done at one end, known as the rear end or the tail of the
queue, whereas deletion is done at the other end, known as the front end or the head of the
queue.

The real-world example of a queue is the ticket queue outside a cinema hall: the person who
enters the queue first gets the ticket first, and the person who enters last gets the ticket last.
A similar approach is followed in the queue data structure.


Now, let's move towards the types of queue.

Types of Queue

There are four different types of queue that are listed as follows -

o Simple Queue or Linear Queue


o Circular Queue
o Priority Queue
o Double Ended Queue (or Deque)

Let us discuss each type of queue.

Simple Queue or Linear Queue


In Linear Queue, an insertion takes place from one end while the deletion occurs from
another end. The end at which the insertion takes place is known as the rear end, and the end
at which the deletion takes place is known as front end. It strictly follows the FIFO rule.

The major drawback of using a linear Queue is that insertion is done only from the rear end.
If the first three elements are deleted from the Queue, we cannot insert more elements even
though the space is available in a Linear Queue. In this case, the linear Queue shows the
overflow condition as the rear is pointing to the last element of the Queue.
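This drawback can be demonstrated with a minimal sketch (the struct name and the 5-slot capacity are illustrative). After three deletions, the queue still rejects a new element, because rear has reached the end of the array even though three slots are free:

```cpp
#include <cassert>

// Minimal linear queue over a 5-slot array. front and rear only move
// forward, so freed slots at the front are never reused.
struct LinearQueue {
    int data[5];
    int front = 0, rear = -1;
    bool enqueue(int x) {
        if (rear == 4) return false;   // "overflow" even when front > 0
        data[++rear] = x;
        return true;
    }
    bool dequeue(int* out) {
        if (front > rear) return false;
        *out = data[front++];
        return true;
    }
};
```

Filling the queue, removing three elements, and then enqueuing again fails, which is exactly the false-overflow condition the circular queue below is designed to avoid.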

Circular Queue

In a circular queue, the last position of the queue is connected back to the first, forming a
ring. It is similar to a linear queue except that the last element is connected to the first. It is
also known as a ring buffer, because the two ends are joined.

The drawback that occurs in a linear queue is overcome by using the circular queue. If the
empty space is available in a circular queue, the new element can be added in an empty space
by simply incrementing the value of rear. The main advantage of using the circular queue is
better memory utilization.

Priority Queue
It is a special type of queue in which every element has a priority associated with it, and the
elements are arranged based on that priority. If some elements occur with the same priority,
they are arranged according to the FIFO principle.

Insertion in priority queue takes place based on the arrival, while deletion in the priority
queue occurs based on the priority. Priority queue is mainly used to implement the CPU
scheduling algorithms.

There are two types of priority queue that are discussed as follows -

o Ascending priority queue - In an ascending priority queue, elements can be inserted
in arbitrary order, but only the smallest element can be deleted first. Suppose the
elements 7, 5, and 3 are inserted in that order; the elements are then deleted in the
order 3, 5, 7.

o Descending priority queue - In a descending priority queue, elements can be inserted
in arbitrary order, but only the largest element can be deleted first. Suppose the
elements 7, 3, and 5 are inserted in that order; the elements are then deleted in the
order 7, 5, 3.
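Both behaviours can be reproduced with the C++ standard `std::priority_queue`, which is a max-heap by default (the descending variant); supplying `std::greater` gives the ascending variant. The function names are illustrative:

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <vector>

// Deletion order from a descending priority queue: largest first.
// std::priority_queue is a max-heap by default.
std::vector<int> descendingOrder(std::vector<int> xs) {
    std::priority_queue<int> pq(xs.begin(), xs.end());
    std::vector<int> out;
    while (!pq.empty()) { out.push_back(pq.top()); pq.pop(); }
    return out;
}

// Deletion order from an ascending priority queue: smallest first.
// std::greater<int> turns the underlying heap into a min-heap.
std::vector<int> ascendingOrder(std::vector<int> xs) {
    std::priority_queue<int, std::vector<int>, std::greater<int>>
        pq(xs.begin(), xs.end());
    std::vector<int> out;
    while (!pq.empty()) { out.push_back(pq.top()); pq.pop(); }
    return out;
}
```

With the elements from the examples above, the ascending queue deletes 3, 5, 7 and the descending queue deletes 7, 5, 3, regardless of insertion order.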

Deque (or, Double Ended Queue)

In a deque (double-ended queue), insertion and deletion can be done at both ends of the
queue, the front and the rear. A deque can be used as a palindrome checker: a string is a
palindrome if it reads the same from both ends.
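The palindrome check mentioned above can be sketched with the standard `std::deque`, comparing and removing characters from both ends until they meet (the function name is illustrative):

```cpp
#include <cassert>
#include <deque>
#include <string>

// A string is a palindrome if the characters at its two ends keep
// matching as we consume the deque from both sides.
bool isPalindrome(const std::string& s) {
    std::deque<char> dq(s.begin(), s.end());
    while (dq.size() > 1) {
        if (dq.front() != dq.back()) return false;
        dq.pop_front();   // consume from the front end
        dq.pop_back();    // consume from the rear end
    }
    return true;
}
```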

Because it allows insertion and deletion at both ends, a deque can serve both as a stack and
as a queue. Restricting insertion and deletion to a single end gives LIFO (stack) behaviour,
while inserting at one end and deleting at the other gives FIFO (queue) behaviour.


There are two types of deque that are discussed as follows -

o Input restricted deque - As the name implies, in an input restricted deque, the
insertion operation can be performed at only one end, while deletion can be
performed from both ends.

o Output restricted deque - As the name implies, in an output restricted deque, the
deletion operation can be performed at only one end, while insertion can be
performed from both ends.


Now, let us see the ways to implement the queue.

Ways to implement the queue


There are two ways of implementing the Queue:

o Implementation using array: The sequential allocation in a Queue can be


implemented using an array.
o Implementation using Linked list: The linked list allocation in a Queue can be
implemented using a linked list.

1.8 Operations on Queue

Operations performed on queue

The fundamental operations that can be performed on queue are listed as follows -

o Enqueue: The Enqueue operation is used to insert the element at the rear end of the
queue. It returns void.
o Dequeue: It performs the deletion from the front-end of the queue. It also returns the
element, which has been removed, from the front-end. It returns an integer value.
o Peek: This is the third operation that returns the element, which is pointed by the
front pointer in the queue but does not delete it.
o Queue overflow (isfull): It shows the overflow condition when the queue is
completely full.
o Queue underflow (isempty): It shows the underflow condition when the Queue is
empty, i.e., no elements are in the Queue.

1.9 Implementation of Queue


Array implementation of Queue:
For implementing a queue, we need to keep track of two indices, front and rear. We enqueue
an item at the rear and dequeue an item from the front. If we simply keep incrementing front
and rear, the rear soon reaches the end of the array even though free slots may exist at the
front. The solution to this problem is to advance front and rear in a circular manner.
Steps for enqueue:
1. Check whether the queue is full.
2. If full, print overflow and exit.
3. If the queue is not full, increment rear (circularly) and add the element.
Steps for dequeue:
1. Check whether the queue is empty.
2. If empty, print underflow and exit.
3. If not empty, return the element at the front and increment front (circularly).
Below is a program to implement above operation on queue

// C++ array implementation of a circular queue
#include <climits>
#include <iostream>
using namespace std;

// A structure to represent a queue


class Queue {
public:
int front, rear, size;

unsigned capacity;
int* array;
};

// Function to create a queue of given capacity.
// It initializes size of queue as 0.
Queue* createQueue(unsigned capacity)
{
Queue* queue = new Queue();
queue->capacity = capacity;
queue->front = queue->size = 0;
// This is important, see the enqueue
queue->rear = capacity - 1;
queue->array = new int[queue->capacity];
return queue;
}
// Queue is full when size
// becomes equal to the capacity
int isFull(Queue* queue)
{
return (queue->size == queue->capacity);
}

// Queue is empty when size is 0


int isEmpty(Queue* queue)
{
return (queue->size == 0);
}
// Function to add an item to the queue.
// It changes rear and size
void enqueue(Queue* queue, int item)
{
if (isFull(queue))
return;
queue->rear = (queue->rear + 1)
% queue->capacity;
queue->array[queue->rear] = item;
queue->size = queue->size + 1;
cout << item << " enqueued to queue\n";
}
// Function to remove an item from queue.
// It changes front and size
int dequeue(Queue* queue)
{
if (isEmpty(queue))
return INT_MIN;
int item = queue->array[queue->front];
queue->front = (queue->front + 1)
% queue->capacity;
queue->size = queue->size - 1;
return item;

}

// Function to get front of queue
int front(Queue* queue)
{
    if (isEmpty(queue))
        return INT_MIN;
    return queue->array[queue->front];
}

// Function to get rear of queue
int rear(Queue* queue)
{
    if (isEmpty(queue))
        return INT_MIN;
    return queue->array[queue->rear];
}

int main()
{
    Queue* queue = createQueue(1000);

    enqueue(queue, 10);
    enqueue(queue, 20);
    enqueue(queue, 30);
    enqueue(queue, 40);

    cout << dequeue(queue) << " dequeued from queue\n";
    cout << "Front item is " << front(queue) << endl;
    cout << "Rear item is " << rear(queue) << endl;

    return 0;
}

Output
10 enqueued to queue
20 enqueued to queue
30 enqueued to queue
40 enqueued to queue
10 dequeued from queue
Front item is 20
Rear item is 40

1.10 Unit End Questions

1. Write a short note on the different operations that can be performed on data structures.

2. Write an algorithm for Push() and Pop() operations on Stack using arrays.

3. What are various operations performed on the Stack?

UNIT 2 Linked Structure

Structure:
2.1 List Representation
2.2 Polish Notations
2.3 Operations on linked list – get node and free node operation
2.4 Implementing the list operation
2.5 Inserting into an ordered linked list
2.6 Deleting, circular linked list
2.7 Doubly linked list
2.8 Unit End Questions

2.1 List Representation

A linked list is a fundamental data structure in computer science. It is a collection of nodes


where each node contains data and a reference (or link) to the next node in the sequence. The
last node typically points to a null reference, indicating the end of the list. Linked lists
provide dynamic memory allocation, efficient insertion and deletion at arbitrary positions, but
they lack direct access to elements by index.

Advantages of linked list

Linked lists have several advantages that make them useful in various scenarios. Here
are some of the key advantages of linked lists:

1. Dynamic Size:
- Linked lists can easily grow or shrink in size during program execution because
memory allocation is dynamic. This contrasts with static arrays, where the size is fixed at
compile time.

2. Efficient Insertion and Deletion:


- Inserting or deleting an element in a linked list is generally more efficient than in an
array, especially when the operation is not at the beginning or end. No shifting of elements
is required; you just need to adjust the pointers.

3. No Wasted Memory:
- Linked lists do not require a fixed amount of memory like arrays. Memory is allocated
as needed, minimizing wasted space and making efficient use of memory.

4. No Pre-allocation of Space:

- Unlike arrays, which may need to pre-allocate space for a certain size, linked lists
allocate memory as elements are added. This can be beneficial when the number of
elements is uncertain or varies.

5. Ease of Implementation for Certain Operations:


- Certain operations, such as inserting an element in the middle of the list or merging
two lists, are easier and more efficient with linked lists compared to arrays.

6. No Need for Contiguous Memory:


- Elements in a linked list do not need to be stored in contiguous memory locations. This
property allows for better memory utilization and helps avoid fragmentation issues.

7. Ease of Building Other Data Structures:


- Linked lists are the building blocks for other complex data structures, such as stacks,
queues, and hash tables. They are a fundamental component in computer science.

8. No Predefined Size:
- Linked lists do not have a predefined size, which means that they can accommodate
data of any size without requiring modifications to the underlying data structure.

Despite these advantages, it's important to note that linked lists also have some drawbacks.
They may have higher memory overhead due to the storage of pointers, and direct access
to elements by index is not as efficient as with arrays. The choice between linked lists and
other data structures depends on the specific requirements of the task at hand.

Disadvantages of linked list

While linked lists have several advantages, they also come with certain disadvantages. It's
essential to consider these drawbacks when choosing a data structure for a particular
application. Here are some of the disadvantages of linked lists:

1. Memory Overhead:
- Linked lists require additional memory to store the pointers/references for each
element. This can lead to higher memory overhead compared to arrays, where elements
are stored contiguously.

2. Random Access is Inefficient:


- Unlike arrays, linked lists do not provide constant-time access to elements by index. To
access an element at a specific position, you need to traverse the list from the beginning or
end, resulting in a linear time complexity.

3. Traversal Overhead:

- Traversing a linked list can have a higher constant factor than iterating through an
array due to the extra memory accesses required for following pointers.

4. Cache Issues:
- Linked lists may not utilize the cache as efficiently as arrays. Since the elements are
scattered in memory, there is a higher likelihood of cache misses, leading to slower access
times.

5. Sequential Access is More Efficient:


- While insertion and deletion operations are efficient at any position in a linked list,
sequential access can be less efficient than with arrays. This is because arrays benefit from
spatial locality, and their elements are stored in contiguous memory locations.

6. Extra Space for Pointers:


- Each element in a linked list requires additional space for storing the pointers,
increasing the overall space complexity. This can be a concern when memory usage is a
critical factor.

7. Not Suitable for Small Data Sets:


- For small data sets, the overhead of maintaining pointers in a linked list may outweigh
the benefits. In such cases, simpler data structures like arrays may be more appropriate.

8. Complexity in Implementation:
- Implementing and managing linked lists can be more complex than using other data
structures, especially for beginners. Dealing with pointers and ensuring proper memory
management can be error-prone.

9. Lack of Cache Locality:


- Linked lists lack the cache locality that arrays offer. This can result in less efficient use
of the system's cache, leading to performance degradation.

Despite these disadvantages, linked lists remain valuable in certain situations where
dynamic size, efficient insertions/deletions, and flexibility in memory usage are crucial.
The choice of data structure depends on the specific requirements of the task at hand.

Types of Linked Lists

Linked lists come in various types, and each type has different characteristics to suit
different needs. The main types of linked lists are:

1. Singly Linked List:


- Each node contains data and a pointer/reference to the next node in the sequence.
- It is a unidirectional list, allowing traversal only in one direction (forward).
- The last node's next pointer is typically set to `nullptr` to indicate the end of the list.

struct Node {
int data;
Node* next;
};

2. Doubly Linked List:


- Each node contains data and two pointers/references: one to the next node and one to
the previous node.
- Supports bidirectional traversal (forward and backward).
- More memory overhead due to the extra pointers, but provides easier backward
traversal.

struct Node {
int data;
Node* next;
Node* prev;
};

3. Circular Linked List:


- Similar to a singly or doubly linked list, but the last node points back to the first node
(for a circular singly linked list) or the last node points to the first, and the first points to
the last (for a circular doubly linked list).
- Useful in applications where continuous access to the list is needed.

// Circular Singly Linked List


struct Node {
int data;
Node* next;
};

// Circular Doubly Linked List


struct Node {
int data;
Node* next;
Node* prev;
};
4. Skip List:
- A probabilistic data structure that allows for quick search, insertion, and deletion of
elements.
- Consists of multiple layers of linked lists, where higher layers skip over several
elements, allowing faster searches.
- More complex than traditional linked lists but provides efficient search operations.

5. Self-adjusting Lists:
- Lists designed to reorganize themselves dynamically to optimize for frequently
accessed elements.
- Examples include the move-to-front list and transpose list.

6. XOR Linked List:


- A memory-efficient variation where each node stores the XOR combination of the
addresses of its previous and next nodes.
- Requires bitwise XOR operations for traversal, making it less common in practice.
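A minimal, non-portable sketch of the XOR idea, storing node addresses as integers (the names are illustrative; traversal must keep the previous node's address to recover the next one):

```cpp
#include <cstdint>
#include <cassert>
#include <vector>

// XOR linked list node: a single field holds prev XOR next, so one
// pointer field serves both directions.
struct XNode {
    int data;
    std::uintptr_t npx;  // XOR of previous and next node addresses
};

std::uintptr_t addr(XNode* p) {
    return reinterpret_cast<std::uintptr_t>(p);
}

// Walk forward from head: next = npx XOR address-of-previous.
std::vector<int> collect(XNode* head) {
    std::vector<int> out;
    XNode* prev = nullptr;
    XNode* cur = head;
    while (cur != nullptr) {
        out.push_back(cur->data);
        XNode* next = reinterpret_cast<XNode*>(cur->npx ^ addr(prev));
        prev = cur;
        cur = next;
    }
    return out;
}
```

This is a sketch only: XOR-ing addresses hides real pointers from debuggers and garbage collectors, which is one reason the structure is rare in practice.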

It's important to choose the appropriate type of linked list based on the specific
requirements of the task at hand. Each type has its own advantages and trade-offs, and the
choice depends on factors such as memory efficiency, access patterns, and the types of
operations you need to perform on the list.

Difference between arrays and linked lists


- Memory Allocation: arrays use a contiguous block of memory allocated up front;
linked lists use dynamically allocated nodes connected by pointers.
- Size Flexibility: arrays have a fixed size at compile time; linked lists have a
dynamic size (they can grow or shrink).
- Random Access: arrays give O(1) access by index; linked lists need O(n) traversal
to reach an index.
- Insertion/Deletion: arrays take O(n) time for insert/delete because elements must
shift; linked lists take O(1) at a known position, O(n) otherwise.
- Memory Overhead: arrays store only the data elements; linked lists need additional
memory for pointers/references.
- Cache Locality: good for arrays (elements stored contiguously); poor for linked
lists (elements scattered in memory).
- Implementation Complexity: arrays are simple; linked lists are more complex due to
pointers and memory management.
- Usage: arrays are well-suited where the size is known and access is frequent;
linked lists are well-suited for dynamic size requirements and frequent
insertions/deletions.
- Applications: arrays suit matrices, vectors, and cases where access patterns are
known; linked lists suit dynamic data structures, stacks, queues, and situations
where the size varies.
- Traversal Efficiency: arrays are efficient due to cache locality; linked lists are
less efficient due to scattered memory.
- Example in C++: int arr[10]; versus struct Node { int data; Node* next; };
It's important to note that the choice between arrays and linked lists depends on the
specific requirements of the task. Arrays are more suitable when the size is fixed, and
random access to elements is frequent. Linked lists, on the other hand, are more suitable
for dynamic size requirements and situations where insertions and deletions are common.
Each has its own advantages and trade-offs.

2.2 Polish Notations

Polish notation, also known as prefix notation, is a way of writing mathematical
expressions in which the operators come before their operands. In Polish notation the
order of operations is clear and unambiguous, and parentheses are not required. Two
related notations are in common use: Polish (prefix) notation and reverse Polish
(postfix) notation.

1. Polish Prefix Notation:


In Polish prefix notation, the operator precedes its operands. For example, the infix
expression "2 + 3" would be written as "+ 2 3" in Polish prefix notation.

Example:
- Infix: (3 + 4) × 5
- Prefix: × + 3 4 5

2. Polish Postfix Notation:


In Polish postfix notation, the operator follows its operands. For example, the infix
expression "2 + 3" would be written as "2 3 +" in Polish postfix notation.

Example:
- Infix: (3 + 4) × 5
- Postfix: 3 4 + 5 ×

Polish notation eliminates the need for parentheses and the rules of precedence because
the position of the operator determines its scope. Evaluating expressions in Polish notation
is often done using a stack-based algorithm.

Advantages of Polish Notation:


1. No need for parentheses to indicate order of operations.
2. Eliminates ambiguity in expressions.

Algorithm for Evaluating Postfix Expressions:


1. Initialize an empty stack.
2. Scan the expression from left to right.
3. If an operand is encountered, push it onto the stack.
4. If an operator is encountered, pop the required number of operands from the stack,
perform the operation, and push the result back onto the stack.
5. Continue scanning and performing operations until the end of the expression is reached.
6. The final result should be on the top of the stack.

Here's an example in C++ for evaluating a postfix expression:

#include <cctype>
#include <iostream>
#include <sstream>
#include <stack>
#include <string>

int evaluatePostfix(const std::string& expression) {


std::stack<int> stack;

std::istringstream iss(expression);
std::string token;

while (iss >> token) {


if (isdigit(token[0])) {
stack.push(std::stoi(token));
} else {
int operand2 = stack.top();
stack.pop();
int operand1 = stack.top();
stack.pop();

if (token == "+") {
stack.push(operand1 + operand2);
} else if (token == "-") {
stack.push(operand1 - operand2);
} else if (token == "*") {
stack.push(operand1 * operand2);
} else if (token == "/") {
stack.push(operand1 / operand2);
}
}
}
return stack.top();
}

int main() {
std::string postfixExpression = "3 4 + 5 *";
std::cout << "Result: " << evaluatePostfix(postfixExpression) << std::endl;

return 0;
}

This example evaluates the postfix expression "3 4 + 5 *", which corresponds to the infix
expression (3 + 4) × 5. The result is 35.

An expression is any word or group of words or symbols that generates a value on evaluation.
Parsing expression means analyzing the expression for its words or symbols depending on a
particular criterion. Expression parsing is a term used in a programming language to evaluate
arithmetic and logical expressions.

The way to write arithmetic expression is known as a notation. An arithmetic expression can
be written in three different but equivalent notations, i.e., without changing the essence or
output of an expression. These notations are −

 Infix Notation
 Prefix (Polish) Notation
 Postfix (Reverse-Polish) Notation

These notations are named after where they place the operator in the expression. We shall
study all three in this chapter.

Infix Notation

We write expression in infix notation, e.g. a - b + c, where operators are used in-between
operands. It is easy for us humans to read, write, and speak in infix notation but the same
does not go well with computing devices. An algorithm to process infix notation could be
difficult and costly in terms of time and space consumption.

Prefix Notation

In this notation, operator is prefixed to operands, i.e. operator is written ahead of operands.
For example, +ab. This is equivalent to its infix notation a + b. Prefix notation is also known
as Polish Notation.

Postfix Notation

This notation style is known as Reversed Polish Notation. In this notation style, the operator
is postfixed to the operands i.e., the operator is written after the operands. For example, ab+.
This is equivalent to its infix notation a + b.

The following table briefly shows the difference between the three notations −

Sr.No. Infix Notation Prefix Notation Postfix Notation

1 a+b +ab ab+

2 (a + b) ∗ c ∗+abc ab+c∗

3 a ∗ (b + c) ∗a+bc abc+∗

4 a/b+c/d +/ab/cd ab/cd/+

5 (a + b) ∗ (c + d) ∗+ab+cd ab+cd+∗

6 ((a + b) ∗ c) - d -∗+abcd ab+c∗d-

Parsing Expressions

As discussed, designing an algorithm or program that parses infix notation directly is not
very efficient. Instead, infix notation is first converted into either postfix or prefix
notation and then computed.

To parse any arithmetic expression, we need to take care of operator precedence and
associativity also.

Precedence

When an operand is in between two different operators, which operator will take the operand
first, is decided by the precedence of an operator over others. For example, in a + b ∗ c,
since multiplication has precedence over addition, b ∗ c will be evaluated first. A table of
operator precedence is provided below.

Associativity

Associativity describes the rule where operators with the same precedence appear in an
expression. For example, in expression a + b − c, both + and − have the same precedence,
then which part of the expression will be evaluated first, is determined by associativity of
those operators. Here, both + and − are left associative, so the expression will be evaluated
as (a + b) − c.

Precedence and associativity determine the order of evaluation of an expression. Following
is an operator precedence and associativity table (highest to lowest) −

Sr.No. Operator Precedence Associativity

1 Exponentiation ^ Highest Right Associative

2 Multiplication ( ∗ ) & Division ( / ) Second Highest Left Associative

3 Addition ( + ) & Subtraction ( − ) Lowest Left Associative

The above table shows the default behavior of operators. At any point in expression
evaluation, the order can be altered by using parentheses. For example, in a + b ∗ c, the
part b ∗ c is evaluated first, since multiplication has precedence over addition; to have
a + b evaluated first instead, we parenthesize it: (a + b) ∗ c.
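Precedence and associativity are exactly what an infix-to-postfix converter consults. Below is a minimal shunting-yard-style sketch for single-character operands; the function names and the restriction to the operators ^, *, /, +, - are assumptions for illustration:

```cpp
#include <cassert>
#include <cctype>
#include <stack>
#include <string>

// Precedence matching the table above: ^ highest, then * /, then + -.
int prec(char op) {
    if (op == '^') return 3;
    if (op == '*' || op == '/') return 2;
    if (op == '+' || op == '-') return 1;
    return 0;
}

// Convert an infix expression with single-character operands to postfix.
// ^ is treated as right-associative, the rest as left-associative.
std::string infixToPostfix(const std::string& infix) {
    std::string out;
    std::stack<char> ops;
    for (char c : infix) {
        if (std::isalnum(static_cast<unsigned char>(c))) {
            out += c;                     // operands go straight to output
        } else if (c == '(') {
            ops.push(c);
        } else if (c == ')') {
            while (!ops.empty() && ops.top() != '(') {
                out += ops.top();
                ops.pop();
            }
            if (!ops.empty()) ops.pop();  // discard the '('
        } else if (prec(c) > 0) {
            // Pop operators of higher precedence, or of equal precedence
            // when the incoming operator is left-associative.
            while (!ops.empty() && ops.top() != '(' &&
                   (prec(ops.top()) > prec(c) ||
                    (prec(ops.top()) == prec(c) && c != '^'))) {
                out += ops.top();
                ops.pop();
            }
            ops.push(c);
        }
    }
    while (!ops.empty()) {      // flush remaining operators
        out += ops.top();
        ops.pop();
    }
    return out;
}
```

The conversions agree with the table earlier in this section: a + b ∗ c becomes abc∗+, while the parenthesized (a + b) ∗ c becomes ab+c∗.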

Postfix Evaluation Algorithm

We shall now look at the algorithm on how to evaluate postfix notation −

Step 1. Scan the expression from left to right.
Step 2. If it is an operand, push it onto the stack.
Step 3. If it is an operator, pop the required operands from the stack and
perform the operation.
Step 4. Push the result of step 3 back onto the stack.
Step 5. Repeat steps 2 to 4 until the whole expression is scanned.
Step 6. Pop the stack; the value obtained is the result of the expression.

2.3 Operations on Linked List

Linked lists support various operations for manipulating and accessing data. Here are
some common operations on linked lists:

1. Insertion:
- Insert at the Beginning: Add a new node at the beginning of the linked list.
- Insert at the End: Add a new node at the end of the linked list.
- Insert at a Specific Position: Add a new node at a specified position in the linked
list.

// Insert at the Beginning


void insertAtBeginning(Node*& head, int data) {
Node* newNode = new Node(data);
newNode->next = head;
head = newNode;
}

// Insert at the End
void insertAtEnd(Node*& head, int data) {
Node* newNode = new Node(data);
if (head == nullptr) {
head = newNode;
return;
}
Node* current = head;
while (current->next != nullptr) {
current = current->next;
}
current->next = newNode;
}

// Insert at a Specific Position


void insertAtPosition(Node*& head, int data, int position) {
// Implementation depends on the specific requirements
}

2. Deletion:
- Delete at the Beginning: Remove the first node from the linked list.
- Delete at the End: Remove the last node from the linked list.
- Delete at a Specific Position: Remove a node from a specified position in the
linked list.

// Delete at the Beginning


void deleteAtBeginning(Node*& head) {
if (head == nullptr) {
return;
}
Node* temp = head;
head = head->next;
delete temp;
}

// Delete at the End


void deleteAtEnd(Node*& head) {
if (head == nullptr) {
return;
}
if (head->next == nullptr) {
delete head;
head = nullptr;
return;
}
Node* current = head;
while (current->next->next != nullptr) {
current = current->next;
}
delete current->next;
current->next = nullptr;
}

// Delete at a Specific Position


void deleteAtPosition(Node*& head, int position) {
// Implementation depends on the specific requirements
}

3. Traversal:
- Traverse the linked list to access and print each element.

void traverse(Node* head) {


Node* current = head;
while (current != nullptr) {
// Process current node (e.g., print its data)
std::cout << current->data << " ";
current = current->next;
}
}

4. Search:
- Search for a specific element in the linked list.

bool search(Node* head, int key) {


Node* current = head;
while (current != nullptr) {
if (current->data == key) {
return true;
}
current = current->next;
}
return false;
}

5. Length:
- Determine the length (number of nodes) of the linked list.

int length(Node* head) {


int count = 0;
Node* current = head;
while (current != nullptr) {
count++;
current = current->next;
}
return count;
}

These are basic operations, and depending on the specific requirements, more complex
operations and variations may be implemented. Additionally, some operations might be more
efficient with specific types of linked lists (e.g., singly linked lists, doubly linked lists,
circular linked lists).
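As a quick check, the singly linked list operations above can be exercised together; a minimal sketch (re-declaring the Node structure and condensed versions of the functions from this section):

```cpp
struct Node {
    int data;
    Node* next;
    Node(int val) : data(val), next(nullptr) {}
};

// Append a new node at the tail of the list.
void insertAtEnd(Node*& head, int data) {
    Node* newNode = new Node(data);
    if (head == nullptr) { head = newNode; return; }
    Node* current = head;
    while (current->next != nullptr) current = current->next;
    current->next = newNode;
}

// Linear search for a key.
bool search(Node* head, int key) {
    for (Node* current = head; current != nullptr; current = current->next)
        if (current->data == key) return true;
    return false;
}

// Count the nodes in the list.
int length(Node* head) {
    int count = 0;
    for (Node* current = head; current != nullptr; current = current->next)
        ++count;
    return count;
}
```

Building the list 1 -> 2 -> 3 with insertAtEnd, length returns 3 and search finds 2 but not 9.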

2.5 Doubly Linked List

A doubly linked list is a type of linked list in which each node contains a data element
and two pointers or references—one pointing to the next node in the sequence (as in a
singly linked list), and another pointing to the previous node. This bidirectional
linking allows for easier traversal in both forward and backward directions.

Here is a basic structure of a node in a doubly linked list:

struct Node {
int data;
Node* next;
Node* prev;

Node(int val) : data(val), next(nullptr), prev(nullptr) {}


};

Now, let's go through some common operations on a doubly linked list:

1. Insertion:
- Insert at the Beginning:

void insertAtBeginning(Node*& head, int data) {


Node* newNode = new Node(data);
newNode->next = head;
if (head != nullptr) {
head->prev = newNode;
}
head = newNode;
}

- Insert at the End:

void insertAtEnd(Node*& head, int data) {


Node* newNode = new Node(data);
if (head == nullptr) {
head = newNode;
} else {
Node* current = head;
while (current->next != nullptr) {
current = current->next;
}
current->next = newNode;
newNode->prev = current;
}
}

- Insert at a Specific Position:


void insertAtPosition(Node*& head, int data, int position) {
// Implementation depends on the specific requirements
}

2. Deletion:
- Delete at the Beginning:

void deleteAtBeginning(Node*& head) {


if (head == nullptr) {
return;
}
Node* temp = head;
head = head->next;
if (head != nullptr) {
head->prev = nullptr;
}
delete temp;
}

- Delete at the End:


void deleteAtEnd(Node*& head) {
if (head == nullptr) {
return;
}
if (head->next == nullptr) {
delete head;
head = nullptr;
return;
}
Node* current = head;
while (current->next != nullptr) {
current = current->next;
}
current->prev->next = nullptr;
delete current;
}

- Delete at a Specific Position:

void deleteAtPosition(Node*& head, int position) {


// Implementation depends on the specific requirements
}

3. Traversal (Forward and Backward):


void traverseForward(Node* head) {
Node* current = head;
while (current != nullptr) {
// Process current node (e.g., print its data)
std::cout << current->data << " ";
current = current->next;
}
}

void traverseBackward(Node* tail) {


Node* current = tail;
while (current != nullptr) {
// Process current node (e.g., print its data)
std::cout << current->data << " ";
current = current->prev;
}
}

These are basic operations for a doubly linked list. The doubly linked list provides
bidirectional traversal and makes certain operations, such as deletion at the end or
backward traversal, more efficient compared to a singly linked list.
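The backward traversal above is easiest to verify by walking the list in both directions and comparing. A minimal sketch (the struct is named DNode here to keep the example self-contained; it mirrors the Node defined above):

```cpp
#include <vector>

struct DNode {
    int data;
    DNode* next;
    DNode* prev;
    DNode(int val) : data(val), next(nullptr), prev(nullptr) {}
};

// Append a new node at the tail, maintaining both links.
void insertAtEnd(DNode*& head, int data) {
    DNode* newNode = new DNode(data);
    if (head == nullptr) { head = newNode; return; }
    DNode* current = head;
    while (current->next != nullptr) current = current->next;
    current->next = newNode;
    newNode->prev = current;
}

// Walk to the tail, then follow prev pointers back to the head,
// collecting the values in reverse order.
std::vector<int> valuesBackward(DNode* head) {
    std::vector<int> result;
    if (head == nullptr) return result;
    DNode* tail = head;
    while (tail->next != nullptr) tail = tail->next;
    for (DNode* cur = tail; cur != nullptr; cur = cur->prev)
        result.push_back(cur->data);
    return result;
}
```

For the list 1 <-> 2 <-> 3, valuesBackward returns 3, 2, 1, confirming that the prev links agree with the next links.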

2.6 Circular Linked List

A circular linked list is a type of linked list in which the last node of the list points
back to the first node, creating a closed loop. In a circular linked list, the "next"
pointer of the last node is not `nullptr` but points to the first node in the list. This
allows for continuous traversal and is especially useful in applications where
continuous access to the list is required.

Here is the basic structure of a node in a circular linked list:

struct Node {
int data;
Node* next;

Node(int val) : data(val), next(nullptr) {}


};

Now, let's go through some common operations on a circular linked list:

1. Insertion:
- Insert at the Beginning:

void insertAtBeginning(Node*& head, int data) {


Node* newNode = new Node(data);
if (head == nullptr) {
newNode->next = newNode; // Points to itself
head = newNode;
} else {
newNode->next = head;
Node* current = head;
while (current->next != head) {
current = current->next;
}
current->next = newNode;
head = newNode;
}
}

- Insert at the End:

void insertAtEnd(Node*& head, int data) {


Node* newNode = new Node(data);
if (head == nullptr) {
newNode->next = newNode; // Points to itself
head = newNode;
} else {
Node* current = head;
while (current->next != head) {
current = current->next;
}
current->next = newNode;
newNode->next = head;
}
}

- Insert at a Specific Position:

void insertAtPosition(Node*& head, int data, int position) {


// Implementation depends on the specific requirements
}

2. Deletion:
- Delete at the Beginning:

void deleteAtBeginning(Node*& head) {


if (head == nullptr) {
return;
}
Node* temp = head;
if (head->next == head) {
head = nullptr;
} else {
Node* current = head;
while (current->next != head) {
current = current->next;
}
current->next = head->next;
head = head->next;
}
delete temp;
}

- Delete at the End:

void deleteAtEnd(Node*& head) {


if (head == nullptr) {
return;
}
if (head->next == head) {
delete head;
head = nullptr;
} else {
Node* current = head;
while (current->next->next != head) {
current = current->next;
}
Node* temp = current->next;
current->next = head;
delete temp;
}
}

- Delete at a Specific Position:

void deleteAtPosition(Node*& head, int position) {


// Implementation depends on the specific requirements
}

3. Traversal:

void traverse(Node* head) {


if (head == nullptr) {
return;
}
Node* current = head;
do {
// Process current node (e.g., print its data)
std::cout << current->data << " ";
current = current->next;
} while (current != head);
}

In a circular linked list, operations are performed similarly to a singly linked list, with
the main difference being how the last node points back to the first. This allows for
continuous traversal without the need to check for `nullptr` as an end condition.
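The do-while pattern above generalizes to other whole-list operations, such as counting nodes; a minimal sketch:

```cpp
struct CNode {
    int data;
    CNode* next;
    CNode(int val) : data(val), next(nullptr) {}
};

// Append a node at the end of a circular list, preserving the loop.
void insertAtEnd(CNode*& head, int data) {
    CNode* newNode = new CNode(data);
    if (head == nullptr) {
        newNode->next = newNode; // single node points to itself
        head = newNode;
        return;
    }
    CNode* current = head;
    while (current->next != head) current = current->next;
    current->next = newNode;
    newNode->next = head;
}

// Count nodes with a do-while: start at the head and stop on wrap-around.
int countNodes(CNode* head) {
    if (head == nullptr) return 0;
    int count = 0;
    CNode* current = head;
    do {
        ++count;
        current = current->next;
    } while (current != head);
    return count;
}
```

After inserting three values, following next three times from the head lands back on the head, which is exactly the wrap-around the do-while relies on.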

2.7 Unit End Questions

1) Write down the steps to modify a node in a linked list.
2) What is the difference between arrays and linked lists?
3) What is the basic purpose of the header in a linked list?
4) Explain the operation of traversing a linked list. Write the algorithm and give an
example.

UNIT 3 Tree Structure

Structure:
3.1 Concept and terminology
3.2 Types of trees
3.3 Binary Search Trees
3.4 Inserting, deleting and searching into binary search tree; implementing the insert, search and delete algorithms
3.5 Tree traversals
3.6 Huffman’s algorithm
3.7 Unit End Questions

3.1 Concept and Terminology

A tree data structure is a hierarchical structure that is used to represent and organize data in a way that
is easy to navigate and search. It is a collection of nodes that are connected by edges and has a
hierarchical relationship between the nodes.

The topmost node of the tree is called the root, and the nodes below it are called the child nodes. Each
node can have multiple child nodes, and these child nodes can also have their own child nodes,
forming a recursive structure.

Basic Terminologies In Tree Data Structure (the examples below refer to a sample tree rooted at A):

Parent Node: The node which is a predecessor of a node is called the parent node of that node. {B} is
the parent node of {D, E}.

Child Node: The node which is the immediate successor of a node is called the child node of that
node. Examples: {D, E} are the child nodes of {B}.

Root Node: The topmost node of a tree, i.e., the node that has no parent, is called the
root node. {A} is the root node of the tree. A non-empty tree contains exactly one root node, and
there is exactly one path from the root to every other node of the tree.

Leaf Node or External Node: The nodes which do not have any child nodes are called leaf nodes.
{K, L, M, N, O, P, G} are the leaf nodes of the tree.

Ancestor of a Node: Any predecessor node on the path from the root to that node is called an
ancestor of that node. {A, B} are the ancestor nodes of the node {E}.

Descendant: Any successor node on the path from that node down to a leaf node. {E, I} are the
descendants of the node {B}.
Sibling: Children of the same parent node are called siblings. {D,E} are called siblings.

Level of a node: The count of edges on the path from the root node to that node. The root node has
level 0.

Internal node: A node with at least one child is called Internal Node.

Neighbour of a Node: Parent or child nodes of that node are called neighbors of that node.

Subtree: Any node of the tree along with all of its descendants forms a subtree.


Representation of Tree Data Structure:

A tree consists of a root and zero or more subtrees T1, T2… Tk such that there is an edge from the
root of the tree to the root of each subtree.

Representation of a Node in Tree Data Structure:

struct Node
{
int data;
struct Node *first_child;
struct Node *second_child;
struct Node *third_child;
.
.
.
struct Node *nth_child;
};
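Fixed child fields become unwieldy when the number of children varies per node. A common alternative (an illustrative sketch, not the representation shown above) stores the children in a container:

```cpp
#include <vector>

// An alternative general-tree node: children held in a vector,
// so any node can have any number of children.
struct TreeNode {
    int data;
    std::vector<TreeNode*> children;
    TreeNode(int val) : data(val) {}
};

// Count all nodes in the tree recursively: one for this node,
// plus the counts of each child subtree.
int countNodes(const TreeNode* root) {
    if (root == nullptr) return 0;
    int count = 1;
    for (const TreeNode* child : root->children)
        count += countNodes(child);
    return count;
}
```

A root with two children, one of which has a child of its own, gives a count of 4.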

3.2 Types of Trees

There are several types of trees in computer science, each with its own characteristics and use cases.
Here are some common types of trees:

1. Binary Tree:
- A tree in which each node has at most two children, usually referred to as the left child and the
right child.

2. Binary Search Tree (BST):


- A binary tree where the left subtree of a node contains only nodes with values less than the node's
value, and the right subtree contains only nodes with values greater than the node's value. This
property allows for efficient search, insertion, and deletion operations.

3. AVL Tree:
- A self-balancing binary search tree where the heights of the two child subtrees of any node differ
by at most one. AVL trees maintain balance to ensure efficient search, insertion, and deletion
operations.

4. Red-Black Tree:
- A self-balancing binary search tree with additional properties, ensuring that the tree remains
balanced during insertions and deletions. Red-Black trees are used in various applications, including
C++'s `std::set` and `std::map` implementations.

5. B-Tree:
- A self-balancing search tree that maintains sorted data and is designed to work well on secondary
storage, such as disks. B-trees are commonly used in databases and file systems.

6. Trie (Prefix Tree):
- A tree-like data structure used for efficiently storing a dynamic set or associative array where the
keys are usually strings. Trie structures are frequently used in applications like spell checkers and IP
routers.

7. Heap:
- A specialized tree-based data structure that satisfies the heap property. A binary heap is commonly
used to implement priority queues.

8. Expression Tree:
- A tree representation of an algebraic expression, where the leaves are operands and internal nodes
are operators. Expression trees are useful in compilers for evaluating expressions.

9. Segment Tree:
- A tree structure used for storing information about intervals or segments. It is particularly useful in
applications like range queries and updates.

10. Quadtree:
- A tree data structure in which each internal node has four children, dividing a space into four
quadrants. Quadtrees are often used in image processing and geographical information systems.

11. Octree:
- Similar to a quadtree, but in three dimensions, dividing space into octants. Octrees are used in 3D
computer graphics and simulations.

12. Splay Tree:


- A binary search tree with a self-adjusting property. Nodes that are frequently accessed are moved
closer to the root, optimizing for frequently accessed elements.

13. Suffix Tree:


- A tree-like data structure used to represent all the suffixes of a given string. Suffix trees are often
used in string matching and bioinformatics.

These are just a few examples of the many types of trees used in computer science. The choice of a
tree structure depends on the specific requirements and characteristics of the application or problem
being solved.

Tree Traversals

Tree traversal is the process of visiting each node in a tree data structure and performing a
specific operation at each node. There are different strategies for tree traversal, and they can
be broadly categorized into two main types: depth-first traversal and breadth-first (level
order) traversal. Here are the common tree traversal techniques:

Depth-First Traversal:
In depth-first traversal, you start from the root and explore as far as possible along each
branch before backtracking.

1. Inorder Traversal (Left-Root-Right):


Traverse the left subtree, visit the root, and then traverse the right subtree.

void inorderTraversal(Node* root) {


if (root != nullptr) {
inorderTraversal(root->left);
std::cout << root->data << " ";
inorderTraversal(root->right);
}
}

2. Preorder Traversal (Root-Left-Right):


Visit the root, traverse the left subtree, and then traverse the right subtree.

void preorderTraversal(Node* root) {


if (root != nullptr) {
std::cout << root->data << " ";
preorderTraversal(root->left);
preorderTraversal(root->right);
}
}

3. Postorder Traversal (Left-Right-Root):


Traverse the left subtree, traverse the right subtree, and then visit the root.

void postorderTraversal(Node* root) {


if (root != nullptr) {
postorderTraversal(root->left);
postorderTraversal(root->right);
std::cout << root->data << " ";
}
}

Breadth-First Traversal (Level Order Traversal):


In breadth-first traversal, you visit all the nodes at the current level before moving on to the
next level.

1. Level Order Traversal:
Traverse the tree level by level, from left to right.

void levelOrderTraversal(Node* root) {


if (root == nullptr) {
return;
}
std::queue<Node*> q;
q.push(root);

while (!q.empty()) {
Node* current = q.front();
std::cout << current->data << " ";
q.pop();

if (current->left != nullptr) {
q.push(current->left);
}
if (current->right != nullptr) {
q.push(current->right);
}
}
}

These traversal techniques are fundamental for various applications, including searching,
printing tree structures, and evaluating expressions. The choice of the traversal method
depends on the specific requirements of the problem at hand.

3.3 Binary Search Trees

A Binary Search Tree (BST) is a binary tree data structure with the following properties:

1. Binary Tree Structure:


- Each node in a BST has at most two children, referred to as the left child and the right
child.

2. Ordering Property:
- For every node n, all nodes in its left subtree have values less than n, and all nodes
in its right subtree have values greater than n.

This ordering property makes BSTs suitable for efficient search, insertion, and deletion
operations. The binary search tree is a common and versatile data structure used in various
applications.
Operations on Binary Search Trees:

1. Search Operation:
- Start at the root and recursively navigate left or right based on the comparison of the
target value with the current node's value until finding the target or reaching a leaf node.

Node* search(Node* root, int key) {


if (root == nullptr || root->data == key) {
return root;
}

if (key < root->data) {


return search(root->left, key);
} else {
return search(root->right, key);
}
}

2. Insertion Operation:
- Starting at the root, recursively navigate left or right based on the comparison of the value
to be inserted with the current node's value. When reaching a null pointer, insert the new
node.

Node* insert(Node* root, int key) {


if (root == nullptr) {
return new Node(key);
}

if (key < root->data) {


root->left = insert(root->left, key);
} else if (key > root->data) {
root->right = insert(root->right, key);
}

return root;
}

3. Deletion Operation:
- Search for the node to be deleted. If it has one or no children, remove it. If it has two
children, find its in-order successor (or predecessor), replace it with the successor (or
predecessor), and then recursively delete the successor (or predecessor).

Node* findMin(Node* root) {


while (root->left != nullptr) {
root = root->left;
}
return root;
}

Node* deleteNode(Node* root, int key) {


if (root == nullptr) {
return root;
}

if (key < root->data) {


root->left = deleteNode(root->left, key);
} else if (key > root->data) {
root->right = deleteNode(root->right, key);
} else {
if (root->left == nullptr) {
Node* temp = root->right;
delete root;
return temp;
} else if (root->right == nullptr) {
Node* temp = root->left;
delete root;
return temp;
}

Node* temp = findMin(root->right);


root->data = temp->data;
root->right = deleteNode(root->right, temp->data);
}

return root;
}

4. Traversal Operations:
- In-order, pre-order, and post-order traversals are commonly used to visit nodes in a BST.
void inorderTraversal(Node* root) {
if (root != nullptr) {
inorderTraversal(root->left);
std::cout << root->data << " ";
inorderTraversal(root->right);
}
}

void preorderTraversal(Node* root) {


if (root != nullptr) {
std::cout << root->data << " ";
preorderTraversal(root->left);
preorderTraversal(root->right);
}
}

void postorderTraversal(Node* root) {


if (root != nullptr) {
postorderTraversal(root->left);
postorderTraversal(root->right);
std::cout << root->data << " ";
}
}

Binary Search Trees are widely used in databases, symbol tables, and other applications
where efficient search, insertion, and deletion operations are important. However, in certain
situations, unbalanced BSTs can become degenerate, leading to poor performance. Self-
balancing variants like AVL trees and Red-Black trees help address this issue by maintaining
balance during insertions and deletions.
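The degenerate case mentioned above is easy to observe: inserting keys in sorted order produces a chain, while a more mixed insertion order keeps the tree shallow. A minimal sketch using the insert function from this chapter plus a simple height helper:

```cpp
struct Node {
    int data;
    Node* left;
    Node* right;
    Node(int value) : data(value), left(nullptr), right(nullptr) {}
};

// Standard BST insertion, as in section 3.4.
Node* insert(Node* root, int key) {
    if (root == nullptr) return new Node(key);
    if (key < root->data) root->left = insert(root->left, key);
    else if (key > root->data) root->right = insert(root->right, key);
    return root;
}

// Height counted in nodes along the longest root-to-leaf path.
int height(Node* root) {
    if (root == nullptr) return 0;
    int l = height(root->left);
    int r = height(root->right);
    return 1 + (l > r ? l : r);
}
```

Inserting 1 through 7 in sorted order yields height 7 (a chain), whereas the order 4, 2, 6, 1, 3, 5, 7 yields height 3 (a perfectly balanced tree), illustrating why self-balancing variants matter.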

3.4 Inserting, deleting and searching into binary search tree

Here are C++ implementations of the insert, delete, and search operations in a Binary
Search Tree (BST), assuming the basic `Node` structure described earlier:

#include <iostream>

struct Node {
int data;
Node* left;
Node* right;

Node(int value) : data(value), left(nullptr), right(nullptr) {}


};

// Function to insert a value into the BST


Node* insert(Node* root, int key) {
if (root == nullptr) {
return new Node(key);
}

if (key < root->data) {


root->left = insert(root->left, key);
} else if (key > root->data) {
root->right = insert(root->right, key);
}

return root;
}

// Function to search for a value in the BST


Node* search(Node* root, int key) {
if (root == nullptr || root->data == key) {
return root;
}

if (key < root->data) {


return search(root->left, key);
} else {
return search(root->right, key);
}
}

// Function to find the minimum value node in a BST


Node* findMin(Node* root) {
while (root->left != nullptr) {
root = root->left;
}
return root;
}

// Function to delete a node with a given value from the BST
Node* deleteNode(Node* root, int key) {
if (root == nullptr) {
return root;
}

if (key < root->data) {


root->left = deleteNode(root->left, key);
} else if (key > root->data) {
root->right = deleteNode(root->right, key);
} else {
// Node with only one child or no child
if (root->left == nullptr) {
Node* temp = root->right;
delete root;
return temp;
} else if (root->right == nullptr) {
Node* temp = root->left;
delete root;
return temp;
}

// Node with two children


Node* temp = findMin(root->right);
root->data = temp->data;
root->right = deleteNode(root->right, temp->data);
}

return root;
}

// In-order traversal to print the BST


void inorderTraversal(Node* root) {
if (root != nullptr) {
inorderTraversal(root->left);
std::cout << root->data << " ";
inorderTraversal(root->right);
}
}

int main() {
Node* root = nullptr;

// Insert some values into the BST
root = insert(root, 50);
insert(root, 30);
insert(root, 20);
insert(root, 40);
insert(root, 70);
insert(root, 60);
insert(root, 80);

std::cout << "In-order traversal of the BST: ";


inorderTraversal(root);
std::cout << std::endl;

// Search for a value in the BST


int searchKey = 40;
Node* searchResult = search(root, searchKey);
if (searchResult != nullptr) {
std::cout << "Value " << searchKey << " found in the BST." << std::endl;
} else {
std::cout << "Value " << searchKey << " not found in the BST." << std::endl;
}

// Delete a value from the BST


int deleteKey = 30;
root = deleteNode(root, deleteKey);

std::cout << "In-order traversal after deleting " << deleteKey << ": ";
inorderTraversal(root);
std::cout << std::endl;

return 0;
}

This example demonstrates the basic operations of inserting, deleting, and searching in a
Binary Search Tree. The `insert`, `search`, and `deleteNode` functions are implemented based
on the properties of the Binary Search Tree. The `inorderTraversal` function is used to print
the elements in the tree in sorted order.

3.5 Tree traversals

Tree traversals are techniques for systematically visiting each node in a tree data structure.
There are several ways to traverse a tree, and the three main types are:

1. In-order Traversal (LNR - Left-Node-Right):
- In this traversal, the left subtree is explored first, then the current node is processed, and
finally, the right subtree is explored.

void inorderTraversal(Node* root) {


if (root != nullptr) {
inorderTraversal(root->left);
std::cout << root->data << " ";
inorderTraversal(root->right);
}
}

2. Pre-order Traversal (NLR - Node-Left-Right):


- In this traversal, the current node is processed first, followed by the left subtree, and then
the right subtree.

void preorderTraversal(Node* root) {


if (root != nullptr) {
std::cout << root->data << " ";
preorderTraversal(root->left);
preorderTraversal(root->right);
}
}

3. Post-order Traversal (LRN - Left-Right-Node):


- In this traversal, the left subtree is explored first, then the right subtree, and finally, the
current node is processed.

void postorderTraversal(Node* root) {


if (root != nullptr) {
postorderTraversal(root->left);
postorderTraversal(root->right);
std::cout << root->data << " ";
}
}

These traversals are fundamental in understanding and working with tree structures. The
choice of traversal depends on the specific requirements of the problem at hand.

Example:

Let's assume you have the following `Node` structure:

struct Node {
int data;
Node* left;
Node* right;

Node(int value) : data(value), left(nullptr), right(nullptr) {}


};

Here's how you might use these traversal functions:

int main() {
Node* root = new Node(1);
root->left = new Node(2);
root->right = new Node(3);
root->left->left = new Node(4);
root->left->right = new Node(5);

std::cout << "In-order traversal: ";


inorderTraversal(root);
std::cout << std::endl;

std::cout << "Pre-order traversal: ";


preorderTraversal(root);
std::cout << std::endl;

std::cout << "Post-order traversal: ";


postorderTraversal(root);
std::cout << std::endl;

return 0;
}

This example creates a simple binary tree and performs in-order, pre-order, and post-order
traversals, printing the node values in different orders.

3.6 Huffman’s algorithm

Huffman coding is a popular algorithm used for lossless data compression. It was developed
by David A. Huffman in 1952. The key idea behind Huffman coding is to assign variable-
length codes to input characters, with shorter codes assigned to more frequently occurring
characters. This way, the most common characters are represented with fewer bits, leading to
more efficient compression.

Here are the basic steps involved in Huffman coding:

1. Frequency Count:
- Calculate the frequency of each character in the input data.

2. Priority Queue (Min-Heap):


- Create a priority queue (min-heap) based on the frequency of characters. Each node in the
heap represents a character along with its frequency.

3. Build Huffman Tree:


- Build a Huffman tree by repeatedly extracting the two nodes with the lowest frequencies
from the heap, creating a new node with their sum as the frequency, and inserting the new
node back into the heap. Continue this process until only one node remains in the heap,
which becomes the root of the Huffman tree.

4. Assign Codes:
- Assign binary codes to each character based on their position in the Huffman tree. Left
edges are labeled with '0', and right edges are labeled with '1'. The codes are constructed by
traversing from the root to each leaf.

5. Encode Data:
- Replace each character in the input data with its corresponding Huffman code to obtain
the compressed data.

6. Decode Data:
- Use the Huffman tree to decode the compressed data back to the original form.

Here's a basic example of Huffman coding in C++:

#include <iostream>
#include <queue>
#include <unordered_map>

struct Node {
char data;
unsigned frequency;
Node* left;
Node* right;

Node(char c, unsigned freq) : data(c), frequency(freq), left(nullptr), right(nullptr) {}


};

struct Compare {
bool operator()(Node* left, Node* right) {
return left->frequency > right->frequency;
}
};

void generateCodes(Node* root, std::string code,
                   std::unordered_map<char, std::string>& codes) {
if (root->left == nullptr && root->right == nullptr) {
codes[root->data] = code;
return;
}

generateCodes(root->left, code + "0", codes);


generateCodes(root->right, code + "1", codes);
}

Node* buildHuffmanTree(std::unordered_map<char, unsigned>& frequencies) {


std::priority_queue<Node*, std::vector<Node*>, Compare> minHeap;

for (const auto& entry : frequencies) {


minHeap.push(new Node(entry.first, entry.second));
}

while (minHeap.size() > 1) {


Node* left = minHeap.top();
minHeap.pop();
Node* right = minHeap.top();
minHeap.pop();

Node* internalNode = new Node('$', left->frequency + right->frequency);


internalNode->left = left;
internalNode->right = right;

minHeap.push(internalNode);
}

return minHeap.top();
}

int main() {
std::string input = "abracadabra";

// Step 1: Count character frequencies


std::unordered_map<char, unsigned> frequencies;
for (char c : input) {
frequencies[c]++;
}

// Step 2: Build Huffman Tree


Node* root = buildHuffmanTree(frequencies);

// Step 3: Generate Huffman Codes


std::unordered_map<char, std::string> huffmanCodes;
generateCodes(root, "", huffmanCodes);

// Display Huffman Codes


std::cout << "Huffman Codes:\n";
for (const auto& entry : huffmanCodes) {
std::cout << entry.first << ": " << entry.second << "\n";
}

// Step 4: Encode Data


std::cout << "\nEncoded Data:\n";
for (char c : input) {
std::cout << huffmanCodes[c];
}
std::cout << "\n";

return 0;
}

This example demonstrates the encoding part of Huffman coding. The Huffman tree is built,
and codes are generated for each character based on their frequency. The input data is then
encoded using these Huffman codes. In practice, Huffman coding is often used in
combination with other techniques for efficient compression.
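Decoding (step 6) is the reverse walk over the tree: follow a '0' bit to the left child and a '1' bit to the right child, and each time a leaf is reached, emit its character and restart at the root. A minimal sketch, assuming the same node layout as the encoder above (the decode function itself is an illustrative addition, and the struct is named HNode here to keep the example self-contained):

```cpp
#include <string>

struct HNode {
    char data;
    unsigned frequency;
    HNode* left;
    HNode* right;
    HNode(char c, unsigned freq)
        : data(c), frequency(freq), left(nullptr), right(nullptr) {}
};

// Walk the tree bit by bit; every time a leaf is reached,
// emit its character and restart from the root.
std::string decode(HNode* root, const std::string& bits) {
    std::string output;
    HNode* current = root;
    for (char bit : bits) {
        current = (bit == '0') ? current->left : current->right;
        if (current->left == nullptr && current->right == nullptr) {
            output += current->data;
            current = root;
        }
    }
    return output;
}
```

For a hand-built tree with codes a = 0, b = 10, c = 11, decoding the bit string "01011" yields "abc".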

UNIT 4 Graph Structure

Structure:
4.1 Graph Representation
4.2 Adjacency Matrix
4.3 Adjacency List
4.4 Warshall’s Algorithm
4.5 Adjacency Multilist Representation
4.6 Orthogonal Representation of a graph
4.7 Graph Traversals – BFS and DFS
4.8 Shortest Path – All pairs of shortest paths
4.9 Transitive Closure
4.10 Unit End Questions

Introduction:

A Graph is a non-linear data structure consisting of vertices and edges. The vertices are
sometimes also referred to as nodes and the edges are lines or arcs that connect any two
nodes in the graph. More formally a Graph is composed of a set of vertices( V ) and a set of
edges( E ). The graph is denoted by G(V, E).

Components of a Graph
Vertices: Vertices are the fundamental units of the graph; they are also known as nodes.
Every node/vertex can be labeled or unlabelled.
Edges: Edges are drawn or used to connect two nodes of the graph. In a directed graph, an
edge is an ordered pair of nodes. Edges can connect any two nodes; there are no restrictions.
Sometimes, edges are also known as arcs. Every edge can be labelled/unlabelled.

Types Of Graph
1. Null Graph
A graph is known as a null graph if there are no edges in the graph.
2. Trivial Graph
Graph having only a single vertex, it is also the smallest graph possible.

3. Undirected Graph
A graph in which edges do not have any direction; that is, the edges are unordered pairs of
nodes.

4. Directed Graph
A graph in which each edge has a direction; that is, the edges are ordered pairs of nodes.

5. Connected Graph
The graph in which from one node we can visit any other node in the graph is known as a
connected graph.

6. Disconnected Graph
A graph in which at least one node is not reachable from another node is known as a
disconnected graph.

7. Regular Graph
A graph in which the degree of every vertex is equal to K is called a K-regular graph.

8. Complete Graph
A graph in which there is an edge between every pair of distinct nodes.

9. Cycle Graph
A graph that consists of a single cycle; the degree of each vertex is 2.

10. Cyclic Graph


A graph containing at least one cycle is known as a Cyclic graph.

11. Directed Acyclic Graph


A Directed Graph that does not contain any cycle.

12. Bipartite Graph


A graph whose vertices can be divided into two sets such that no edge connects two vertices
belonging to the same set.

13. Weighted Graph

A graph in which the edges are already specified with suitable weight is known as a
weighted graph.
Weighted graphs can be further classified as directed weighted graphs and undirected
weighted graphs

Tree v/s Graph


Trees are a restricted type of graph, with some additional rules. Every tree is a graph, but not
every graph is a tree. Linked lists, trees, and heaps are all special cases of graphs.

4.1 Graph Representation

There are two ways to store a graph:

Adjacency Matrix
Adjacency List

Adjacency Matrix

In this method, the graph is stored in the form of a 2D matrix whose rows and columns
denote vertices. Each entry in the matrix represents the weight of the edge between those
vertices.

Adjacency List
In this method, the graph is represented as a collection of linked lists. There is an array of
pointers, where each entry points to the list of vertices adjacent to that vertex.

Comparison between Adjacency Matrix and Adjacency List


When the graph contains a large number of edges (a dense graph), it is better to store it as a
matrix, because few entries in the matrix will be empty. For algorithms such as Prim's and
Dijkstra's, an adjacency matrix gives lower complexity on dense graphs.
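Both representations can be sketched for the same small undirected graph. In this illustrative example (the edge set is hypothetical), the adjacency matrix is an n x n grid of 0/1 entries, while the adjacency list keeps, for each vertex, the list of its neighbours:

```cpp
#include <utility>
#include <vector>

// Build the n x n adjacency matrix for an undirected graph.
std::vector<std::vector<int>> buildMatrix(
        int n, const std::vector<std::pair<int, int>>& edges) {
    std::vector<std::vector<int>> matrix(n, std::vector<int>(n, 0));
    for (const auto& e : edges) {
        matrix[e.first][e.second] = 1;
        matrix[e.second][e.first] = 1; // undirected: the matrix is symmetric
    }
    return matrix;
}

// Build the adjacency list for the same undirected graph.
std::vector<std::vector<int>> buildList(
        int n, const std::vector<std::pair<int, int>>& edges) {
    std::vector<std::vector<int>> adj(n);
    for (const auto& e : edges) {
        adj[e.first].push_back(e.second);
        adj[e.second].push_back(e.first);
    }
    return adj;
}
```

For the hypothetical edge set {0-1, 0-2, 1-2, 2-3}, vertex 2 has three neighbours in the list representation, and the matrix is symmetric with zeros where no edge exists.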

Basic Operations on Graphs
Below are the basic operations on the graph:

Insertion of Nodes/Edges in the graph – Insert a node into the graph.


Deletion of Nodes/Edges in the graph – Delete a node from the graph.
Searching on Graphs – Search an entity in the graph.
Traversal of Graphs – Traversing all the nodes in the graph.

Usage of graphs
Maps can be represented using graphs and then can be used by computers to provide
various services like the shortest path between two cities.
When various tasks depend on each other then this situation can be represented using a
Directed Acyclic graph and we can find the order in which tasks can be performed using
topological sort.
State Transition Diagram represents what can be the legal moves from current states. In-
game of tic tac toe this can be used.

4.2 Adjacency Matrix

What is an adjacency matrix?

Adjacency matrix definition


In graph theory, an adjacency matrix is a dense way of describing a finite graph structure.
It is a 2D matrix that records the connections between the graph's nodes.

If a graph has n number of vertices, then the adjacency matrix of that graph is n x n, and
each entry of the matrix represents the number of edges from one vertex to another.

An adjacency matrix is also called a connection matrix. Sometimes it is also called a vertex
matrix.

Adjacency Matrix Representation


If an undirected graph G consists of n vertices, then the adjacency matrix of the graph is the
n x n matrix A = [aij], defined by:

aij = 1 {if there is an edge from Vi to Vj}

aij = 0 {otherwise}

Let's see some important points with respect to the adjacency matrix.

o If there exists an edge between vertex Vi and Vj, where i is the row and j is the
  column, then the value of aij = 1.
o If there is no edge between vertex Vi and Vj, then the value of aij = 0.
o If there are no self-loops in the simple graph, then the vertex matrix (or
  adjacency matrix) has 0s on the diagonal.
o An adjacency matrix is symmetric for an undirected graph: the value in the ith
  row and jth column is equal to the value in the jth row and ith column.
o If the adjacency matrix is multiplied by itself and a non-zero value is present
  at the ith row and jth column, then there is a route from Vi to Vj with a length
  equivalent to 2. More precisely, that value gives the number of distinct walks
  of length 2 between the two vertices.
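The points above can be verified with a short Python sketch; the three-vertex edge list below is an arbitrary example, not a graph from the text:

```python
def adjacency_matrix(n, edges):
    """Build the n x n adjacency matrix of an undirected simple graph."""
    a = [[0] * n for _ in range(n)]
    for i, j in edges:
        a[i][j] = 1
        a[j][i] = 1  # symmetric, since the graph is undirected
    return a


def matmul(a, b):
    """Plain matrix multiplication for square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Path graph 0 - 1 - 2: no self-loops, so the diagonal is all zeros.
a = adjacency_matrix(3, [(0, 1), (1, 2)])
assert a == [[0, 1, 0], [1, 0, 1], [0, 1, 0]]

# A squared counts walks of length 2: entry (0, 2) is 1 because the
# only two-edge walk from vertex 0 to vertex 2 goes via vertex 1.
a2 = matmul(a, a)
print(a2[0][2])  # 1
```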

How to create an adjacency matrix?


Suppose a graph G has n vertices; then the vertex matrix (or adjacency matrix) is given by -

A = | a11  a12  . . .  a1n |
    | a21  a22  . . .  a2n |
    |  .    .          .   |
    | an1  an2  . . .  ann |

where aij equals the number of edges from vertex i to vertex j. As mentioned above, the adjacency matrix is symmetric for an undirected graph, so for an undirected graph, aij = aji.

When the graphs are simple and there are no weights on the edges or multiple
edges, then the entries of the adjacency matrix will be 0 and 1. If there are no self-
loops, then the diagonal entries of the adjacency matrix will be 0.

Now, let's see the adjacency matrix for an undirected graph and for directed graphs.

Adjacency matrix for an undirected graph


In an undirected graph, edges are not associated with a direction. If an edge exists between vertex A and vertex B in an undirected graph, then it can be traversed from A to B as well as from B to A.

Let us consider the below undirected graph and try to construct its adjacency matrix.

In the graph, we can see there is no self-loop, so the diagonal entries of the adjacency matrix will be 0. The adjacency matrix of the above graph will be -

Adjacency matrix for a directed graph


In a directed graph, edges form ordered pairs. Each edge leads from some vertex A to another vertex B; node A is called the initial node, while node B is called the terminal node.

Let us consider the below directed graph and try to construct the adjacency matrix of
it.

In the above graph, we can see there is no self-loop, so the diagonal entries of the adjacency matrix will be 0. The adjacency matrix of the above graph will be -

Properties of the adjacency matrix


Some of the properties of the adjacency matrix are listed as follows:

o An adjacency matrix is a matrix that contains rows and columns used to
  represent a simple labeled graph, with the numbers 0 and 1 in position
  (Vi, Vj) according to the condition of whether or not the two vertices
  Vi and Vj are adjacent.
o For a directed graph, if an edge exists from vertex i (Vi) to vertex j (Vj),
  then the value of A[Vi][Vj] = 1; otherwise the value will be 0.
o For an undirected graph, if an edge exists between vertex i (Vi) and vertex
  j (Vj), then the value of A[Vi][Vj] = 1 and A[Vj][Vi] = 1; otherwise the
  value will be 0.

Let's see some questions on the adjacency matrix. The questions below are on weighted undirected and directed graphs.

Question 1 - What will be the adjacency matrix for the below undirected weighted
graph?

Solution - In the given question, there is no self-loop, so it is clear that the diagonal entries of the adjacency matrix for the above graph will be 0. The above graph is a weighted undirected graph, so the weights on the graph edges are the entries of the adjacency matrix.

The adjacency matrix of the above graph will be -

Question 2 - What will be the adjacency matrix for the below directed weighted
graph?

Solution - In the given question, there is no self-loop, so it is clear that the diagonal entries of the adjacency matrix for the above graph will be 0. The above graph is a weighted directed graph, so the weights on the graph edges are the entries of the adjacency matrix.

The adjacency matrix of the above graph will be -
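Since the figures for the two questions are not reproduced here, the following sketch uses an arbitrary weighted example to show how edge weights become matrix entries; using Infinity for a missing edge and 0 on the diagonal is one common convention, assumed here:

```python
INF = float("inf")  # marks the absence of an edge


def weighted_matrix(n, edges, directed=False):
    """n x n matrix: weight of edge i -> j, 0 on the diagonal, INF if absent."""
    a = [[0 if i == j else INF for j in range(n)] for i in range(n)]
    for i, j, w in edges:
        a[i][j] = w
        if not directed:
            a[j][i] = w  # mirror the weight for an undirected graph
    return a


# Directed: the weight appears only in the row of the initial node.
m = weighted_matrix(3, [(0, 1, 4), (1, 2, 7)], directed=True)
print(m[0][1], m[1][0])  # 4 inf
```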

4.3 Adjacency List

An adjacency list is a collection of lists, one per node, recording which nodes share an edge with it. An advantage of this representation over an adjacency matrix is that you can add or remove nodes and edges easily, for example through operations such as add_edge() or add_node() on the list structure.

What is an Adjacency List?

It is a list that can be used to represent a graph. It is typically implemented as an array, with each element of the array representing a vertex in the graph. Each element stores the vertices that are adjacent to the vertex it represents, so every entry carries two pieces of information: the vertex itself and the list of its neighbours (whose length is the vertex's degree).

In a directed graph, an edge A -> B is recorded only in the list of A, because the edge can be followed only from A to B. In an undirected graph no such restriction exists: the edge between A and B is recorded in the lists of both A and B, since it can be traversed in either direction.

Features of an Adjacency List

Adjacency lists are a data structure that stores the relationships between the vertices of a graph. The nodes in an adjacency list are referred to as vertices, and the neighbours of each vertex are stored in its list. Adjacency lists also allow you to add new edges or remove existing ones easily, by simply adding items to or removing items from the appropriate lists.

If adjacent vertices are stored in sorted order, a particular neighbour can be found quickly when needed. This makes it easy to perform operations such as finding paths between two vertices using the indices of the vertices within these structures.

An adjacency list stores only the edges that actually exist in the graph, making it a space-efficient representation of a sparse graph.
It is also simple to implement and easy to modify.
An adjacency list can be easily modified to store additional information about the edges in the graph, such as weights or labels.
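As a sketch of that last point, a weighted adjacency list can store (neighbour, weight) pairs; the helper name and the sample edges below are illustrative assumptions, not from the text:

```python
from collections import defaultdict


def weighted_adjacency_list(edges, directed=False):
    """Map each vertex to a list of (neighbour, weight) pairs."""
    adj = defaultdict(list)
    for u, v, w in edges:
        adj[u].append((v, w))
        if not directed:
            adj[v].append((u, w))  # record the edge at both endpoints
    return dict(adj)


adj = weighted_adjacency_list([("A", "B", 4), ("B", "C", 7)])
print(adj["B"])  # [('A', 4), ('C', 7)]
```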

Advantages of an Adjacency List

An adjacency list is a data structure that stores, for each vertex, the set of vertices adjacent to it in the graph. It has several advantages over other graph representations; in particular, for sparse graphs, adjacency lists are more efficient than adjacency matrices in terms of storage and access.

Adjacency lists can also be used to store and retrieve data attached to the graph. In addition, they can be used to implement algorithms on graphs, such as shortest path or topological ordering, including algorithms that work with numeric values (such as the distance between two vertices).

Drawbacks of an Adjacency List

Adjacency lists are one of the most common graph representations, but they have some drawbacks.
For a dense graph, the size of an adjacency list becomes very large; storing large amounts of data makes operations slow because of the memory used by each node in the lists.
It is not efficient for checking whether an edge connects a particular pair of vertices, as we may have to traverse the entire linked list of a vertex to find the desired edge.
Another disadvantage is that finding the degree of a vertex is not immediate in an adjacency list, as we have to count the number of edges in the corresponding linked list.

Graph Represented as an Adjacency List

A graph is a collection of nodes and edges. Nodes are connected by edges, which can be directed or undirected. A directed edge is a connection between two nodes in which one node is called the source and the other the target. Edges are often represented as lines on paper or screens, but they can also be visualized as arrows.

An important concept to remember about graphs is that some graphs allow more than one edge between two given vertices, like this:

This means we can have multiple parallel edges connecting the same two vertices, and consequently multiple paths between them.

Below is an example of a graph represented using an adjacency list:

Vertex 0: [1, 2]

Vertex 1: [0, 2]

Vertex 2: [0, 1, 3]

Vertex 3: [2]

In this example, the adjacency list for vertex 0 is [1, 2], which means that vertex 0
is connected to vertices 1 and 2. Similarly, the adjacency list for vertex 1 is [0, 2],
which means that vertex 1 is connected to vertices 0 and 2.
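The four-vertex example above can be built directly from its edge list; this is a minimal sketch, with the helper name chosen for illustration:

```python
from collections import defaultdict


def build_adjacency_list(edges, directed=False):
    """Map each vertex to the list of its adjacent vertices."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        if not directed:
            adj[v].append(u)  # undirected: the edge appears in both lists
    return dict(adj)


# Edges of the four-vertex example shown above.
adj = build_adjacency_list([(0, 1), (0, 2), (1, 2), (2, 3)])
print(adj[2])  # [0, 1, 3]
```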

Example of an Adjacency List


Consider the following graph:

To represent this graph as an adjacency list:

Vertex 0: [1, 2]

Vertex 1: [0, 2]

Vertex 2: [1]

4.4 Floyd Warshall’s Algorithm

The Floyd-Warshall algorithm, named after its creators Robert Floyd and Stephen Warshall, is a fundamental algorithm in computer science and graph theory. It is used to find the shortest paths between all pairs of nodes in a weighted graph. The algorithm is efficient and can handle graphs with both positive and negative edge weights (provided the graph has no negative cycle), making it a versatile tool for solving a wide range of network and connectivity problems.

Floyd Warshall Algorithm:

The Floyd Warshall Algorithm is an all-pairs shortest path algorithm, unlike Dijkstra's and Bellman-Ford, which are single-source shortest path algorithms. The algorithm works for both directed and undirected weighted graphs, but it does not work for graphs with negative cycles (where the sum of the edge weights in a cycle is negative). It follows a dynamic programming approach, checking every possible path going via every possible intermediate node in order to calculate the shortest distance between every pair of nodes.

Idea Behind the Floyd Warshall Algorithm:

Suppose we have a graph G[][] with N vertices numbered 1 to N. We have to evaluate a shortestPathMatrix[][] where shortestPathMatrix[i][j] represents the shortest path between vertices i and j.

The shortest path from i to j may pass through some number of intermediate nodes. The idea behind the Floyd Warshall algorithm is to treat each vertex from 1 to N as an intermediate node, one by one.

The following figure shows this optimal substructure property of the Floyd Warshall algorithm:

Steps of the Floyd Warshall Algorithm:

Initialize the solution matrix to be the same as the input graph matrix as a first step.
Then update the solution matrix by considering each vertex as an intermediate vertex: pick the vertices one by one and update all shortest paths that include the picked vertex as an intermediate vertex.
When we pick vertex k as an intermediate vertex, we have already considered vertices {0, 1, 2, ..., k-1} as intermediate vertices.
For every pair (i, j) of source and destination vertices, there are two possible cases:
1. k is not an intermediate vertex on the shortest path from i to j. We keep the value of dist[i][j] as it is.
2. k is an intermediate vertex on the shortest path from i to j. We update dist[i][j] to dist[i][k] + dist[k][j] if dist[i][j] > dist[i][k] + dist[k][j].

Pseudo-Code of Floyd Warshall Algorithm :

For k = 0 to n - 1
    For i = 0 to n - 1
        For j = 0 to n - 1
            Distance[i, j] = min(Distance[i, j], Distance[i, k] + Distance[k, j])

where i = source node, j = destination node, k = intermediate node
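The pseudo-code above translates almost line for line into Python; the four-node sample graph below is an arbitrary example, not the figure from the text:

```python
INF = float("inf")  # no edge between the two nodes


def floyd_warshall(dist):
    """All-pairs shortest paths; dist is an n x n matrix with INF for no edge."""
    n = len(dist)
    d = [row[:] for row in dist]  # work on a copy of the input matrix
    for k in range(n):            # k = intermediate node
        for i in range(n):        # i = source node
            for j in range(n):    # j = destination node
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d


graph = [
    [0,   3,   INF, 7],
    [8,   0,   2,   INF],
    [5,   INF, 0,   1],
    [2,   INF, INF, 0],
]
# Shortest 0 -> 3 path is 0 -> 1 -> 2 -> 3 with weight 3 + 2 + 1 = 6,
# beating the direct edge of weight 7.
print(floyd_warshall(graph)[0][3])  # 6
```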

Illustration of Floyd Warshall Algorithm :

Suppose we have a graph as shown in the image:

Step 1: Initialize the Distance[][] matrix from the input graph such that
Distance[i][j] = weight of the edge from i to j, and Distance[i][j] = Infinity
if there is no edge from i to j.

Step 2: Treat node A as an intermediate node and recalculate Distance[][]
for every {i, j} node pair using the formula:

Distance[i][j] = minimum (Distance[i][j], Distance[i][A] + Distance[A][j])

Step 3: Treat node B as an intermediate node and recalculate Distance[][]
for every {i, j} node pair using the formula:

Distance[i][j] = minimum (Distance[i][j], Distance[i][B] + Distance[B][j])

Step 4: Treat node C as an intermediate node and recalculate Distance[][]
for every {i, j} node pair using the formula:

Distance[i][j] = minimum (Distance[i][j], Distance[i][C] + Distance[C][j])

Step 5: Treat node D as an intermediate node and recalculate Distance[][]
for every {i, j} node pair using the formula:

Distance[i][j] = minimum (Distance[i][j], Distance[i][D] + Distance[D][j])

Step 6: Treat node E as an intermediate node and recalculate Distance[][]
for every {i, j} node pair using the formula:

Distance[i][j] = minimum (Distance[i][j], Distance[i][E] + Distance[E][j])

Step 7: Since all the nodes have now been treated as intermediate nodes, we
can return the updated Distance[][] matrix as our answer matrix.
