
DATA STRUCTURE

[BCS-301]
Syllabus
Introduction: Basic Terminology, Elementary Data
Organization, Built-in Data Types in C. Algorithm,
Efficiency of an Algorithm, Time and Space Complexity,
Asymptotic notations: Big Oh, Big Theta and Big Omega,
Time-Space trade-off. Abstract Data Types (ADT)
Text Books
Aaron M. Tenenbaum, Yedidyah Langsam and Moshe J.
Augenstein, “Data Structures Using C and C++”, PHI
Learning Private Limited, Delhi India
Horowitz and Sahani, “Fundamentals of Data Structures”,
Galgotia Publications Pvt Ltd Delhi India.
Lipschutz, “Data Structures” Schaum’s Outline Series, Tata
McGraw-hill Education (India) Pvt. Ltd.
 Thareja, “Data Structure Using C” Oxford Higher
Education.
Course Outcomes [CO]
CO1: Describe the ways arrays, linked lists, stacks,
queues, trees, and graphs are represented in memory, how
they are used by algorithms, and their common applications.
CO2: Understand and implement linear data structures
such as Stack, Queue and Priority Queue.
CO3: Analyze the computational efficiency of sorting
and searching algorithms.
CO4: Implement Trees and Graphs and perform
various operations on these data structures.
CO5: Identify alternative implementations of data
structures with respect to their performance to solve a
real-world problem.
Introduction
Data is a collection of facts and figures or a set of values that
can be recorded.
Data Structure is a way of collecting and organizing data in
such a way that we can perform operations on these data in an
effective way.
Data Structure is a particular way of storing and organizing
data in the memory of the computer so that these data can
easily be retrieved and efficiently utilized in the future when
required.
The choice of a good data structure makes it possible to
perform a variety of critical operations effectively.
An efficient data structure also uses minimum memory space
and minimum execution time to process the task. The main
goal is to reduce the space and time complexities of different
tasks.
Why is Data Structure important?
Data structure is important because it is used in almost every
program or software system.
Data Structures are necessary for designing efficient algorithms.
Using appropriate data structures can help programmers save a
good amount of time while performing operations such as storage,
retrieval, or processing of data.
Manipulation of large amounts of data is easier.
Data structures can help improve the performance of our code by
reducing the time and space complexity of algorithms.
Data structures can help you solve complex problems by breaking
them down into smaller, more manageable parts.
Data structures provide a way to abstract the underlying
complexity of a problem and simplify it, making it easier to
understand and solve.
Basic Types of Data Structures
 As we have discussed above, anything that can store data can be
called a data structure; hence Integer, Float, Boolean, Char etc.
are all data structures. They are known as Primitive Data
Structures.
 Then we also have some complex data structures, which are used
to store large and connected data. Some examples of such Abstract
Data Structures are:
 Arrays
 Linked List
 Tree
 Graph
 Stack, Queue etc.
 All these data structures allow us to perform different operations on
data. We select these data structures based on which type of
operation is required.
Characteristic – Description
Linear – In linear data structures, the data items are arranged in a linear sequence and sequential order. Example: Array, Linked List
Non-Linear – In non-linear data structures, the data items are not in sequence. Example: Tree, Graph
Homogeneous – In homogeneous data structures, all the elements are of the same type. Example: Array
Non-Homogeneous – In non-homogeneous data structures, the elements may or may not be of the same type. Example: Structures
Static – Static data structures are those whose sizes and associated memory locations are fixed at compile time. Example: Array
Dynamic – Dynamic data structures are those which expand or shrink depending upon the program's need and its execution. Also, their associated memory locations change. Example: Linked List created using pointers
Operations on Data Structure
The major or the common operations that can be performed on
the data structures are:
 Traversal: To access each data element exactly once.
 Searching: We can search for any element in a data structure.
 Sorting: We can sort the elements of a data structure either in
an ascending or descending order.
 Insert: We can also insert the new element in a data structure.
 Update: We can also update the element, i.e., we can replace
the element with another element.
 Delete: We can also perform the delete operation to remove the
element from the data structure.
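As an illustration (a sketch added to these notes, not from the original), the short C program below performs three of these operations, traversal, searching and insertion, on a plain array; the function names, the CAPACITY constant and the sample values are assumptions made only for this example.

#include <stdio.h>

#define CAPACITY 10          /* maximum number of elements the array can hold */

/* Traversal: visit every element exactly once */
void traverse(int a[], int n) {
    for (int i = 0; i < n; i++)
        printf("%d ", a[i]);
    printf("\n");
}

/* Searching: return the index of item, or -1 if it is not present */
int search(int a[], int n, int item) {
    for (int i = 0; i < n; i++)
        if (a[i] == item)
            return i;
    return -1;
}

/* Insertion: insert item at position pos (0-based), shifting later elements right */
int insert(int a[], int n, int pos, int item) {
    if (n >= CAPACITY || pos < 0 || pos > n)
        return n;                 /* no room or invalid position: size unchanged */
    for (int i = n; i > pos; i--)
        a[i] = a[i - 1];
    a[pos] = item;
    return n + 1;                 /* new number of elements */
}

int main(void) {
    int a[CAPACITY] = {10, 20, 30, 40};
    int n = 4;
    traverse(a, n);                              /* prints: 10 20 30 40 */
    printf("index of 30 = %d\n", search(a, n, 30));
    n = insert(a, n, 2, 25);                     /* array becomes 10 20 25 30 40 */
    traverse(a, n);
    return 0;
}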
Basic Terminologies
 Data: We can define data as an elementary value or a collection of
values. For example, the Employee's name and ID are the data
related to the Employee.
 Data Items: A Single unit of value is known as Data Item.
 Group Items: Data Items that have subordinate data items are
known as Group Items. For example, an employee's name can
have a first, middle, and last name.
 Elementary Items: Data Items that cannot be divided into sub-
items are known as Elementary Items. For example, the ID of an
Employee.
 Entity and Attribute: A class of certain objects is represented by
an Entity. It consists of different Attributes. Each Attribute
symbolizes the specific property of that Entity.
Meaningful or processed data is called information. The
collection of data is organized into the hierarchy of fields,
records and files. A single elementary unit of information
representing an attribute of an entity is called a Field.
Records are the collection of field values of a given entity.
Collection of records of the entities in a given entity set is
called a file. Each record may contain a certain field that
uniquely represents that record. Such a field is called
a primary key.
Based on their length, records may be classified into two types.
They are:
Fixed-length records: All records contain the same data items,
with the same amount of space assigned to each item.
Variable-length records: Records may contain data items of
different lengths.
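As a small illustration (a sketch, with field names and sizes assumed only for the example), a fixed-length employee record can be written in C as a struct whose fields all have fixed sizes; the id field plays the role of the primary key, and a file is then simply a collection of such records.

#include <stdio.h>
#include <string.h>

/* A fixed-length record: every record occupies the same amount of space,
   because each field has a fixed size. The id field serves as the primary key. */
struct Employee {
    int  id;             /* primary key: uniquely identifies the record */
    char name[30];       /* field: name of the employee */
    char department[20]; /* field: department of the employee */
};

int main(void) {
    /* A "file" is then a collection of such records, e.g. an array of them. */
    struct Employee file[2];

    file[0].id = 101;
    strcpy(file[0].name, "A. Kumar");
    strcpy(file[0].department, "CSE");

    printf("Every record occupies %zu bytes\n", sizeof(struct Employee));
    return 0;
}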
What is an Algorithm ?
 An algorithm is any well-defined computational procedure that takes
some value, or set of values, as input and produces some value, or set of
values, as output.
 An algorithm is a set of well-defined instructions to solve a particular
problem. It takes a set of input(s) and produces the desired output.
 Properties of an algorithm:
 Input- There should be 0 or more inputs supplied externally to the
algorithm.
 Output- There should be at least 1 output obtained.
 Definiteness- Every step of the algorithm should be clear and well
defined.
 Finiteness- The algorithm should have a finite number of steps.
 Correctness- Every step of the algorithm must generate a correct output.
Efficiency of an Algorithm
The efficiency of an algorithm depends on its design,
implementation, resources, and memory required by it
for the computation.
An algorithm is said to be efficient and fast, if it takes
less time to execute and consumes less memory space.
The performance of an algorithm is measured on the
basis of following properties :
Time Complexity
Space Complexity
Space Complexity
It is the amount of memory space required by the algorithm,
during the course of its execution. Space complexity must be
taken seriously for multi-user systems and in situations where
limited memory is available.
Auxiliary Space is the extra space or temporary space used by
an algorithm.
Space Complexity of an algorithm is the total space taken by
the algorithm with respect to the input size. Space complexity
includes both Auxiliary space and space used by input.
Note: Space complexity depends on a variety of things such
as the programming language, the compiler, or even the
machine running the algorithm.
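For instance (a sketch added for illustration, not from the notes), both functions below compute the sum of an n-element array: the iterative version uses O(1) auxiliary space, while the recursive version uses O(n) auxiliary space because of the call stack.

/* Iterative sum: a few scalar variables only -> O(1) auxiliary space */
long sum_iterative(const int a[], int n) {
    long s = 0;
    for (int i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Recursive sum: one stack frame per element -> O(n) auxiliary space */
long sum_recursive(const int a[], int n) {
    if (n == 0)
        return 0;
    return a[n - 1] + sum_recursive(a, n - 1);
}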
Time Complexity
Time Complexity is a way to represent the amount of time
required by the program to run till its completion.
It is generally good practice to keep the time required to a
minimum, so that our algorithm completes its execution in
the minimum time possible.
The time complexity of an algorithm depends on the
behavior of input:
Worst-case
Best-case
Average-case
Best, Average and Worst case Analysis of Algorithms
We all know that the running time of an algorithm
increases (or remains constant) as the input size (n)
increases.
Sometimes, even if the size of the input is the same, the running
time varies among different instances of the input.
In that case, we perform best, average and worst-case
analysis.
The best case gives the minimum time, the worst case
running time gives the maximum time and average case
running time gives the time required on average to execute
the algorithm.
int LinearSearch(int a[], int n, int item) {
    int i;
    for (i = 0; i < n; i++) {
        if (a[i] == item) {
            return i;      /* found: return the position of the item */
        }
    }
    return -1;             /* item is not present in the array */
}
Best case happens when the item we are looking for is in the very
first position of the array: Θ(1)
Worst case happens when the item we are searching for is in the last
position of the array, or the item is not in the array at all: Θ(n)
Average case analysis considers all possible inputs and calculates the
running time for each. Add up all the calculated values and divide the
sum by the total number of inputs: (1 + 2 + 3 + … + n + (n+1)) / (n+1)
= O(n)
Example-1
int a = 0, b = 0;
int i, j;
for (i = 0; i < N; i++) {
    a = a + 1;
}
for (j = 0; j < M; j++) {
    b = b + 1;
}
 O(N + M) time, O(1) space
 The first loop is O(N) and the second loop is O(M).
Since N and M are independent variables, we cannot say which
one is the leading term. Therefore the time complexity of the given
problem will be O(N + M).
Since the number of variables does not depend on the size of the input,
the space complexity will be constant, i.e., O(1).
Example-2
int i, j, k = 0;
for (i = 1; i <= n; i++) {
    for (j = 1; j <= n; j = j + 1) {
        k = k + n / 2;
    }
}
 Time complexity: O(n²)
 Space complexity: O(1)
Example-3
int i, j, k = 0;
for (i = 1; i <= n; i++) {
    for (j = 1; j <= n; j = j * 2) {
        k = k + j;
    }
}
 Time complexity: O(n log n) & space complexity: O(1)
 If you notice, j keeps doubling as long as it is less than or equal to n.
Doubling a number until it exceeds n requires log₂(n) steps.
The outer loop runs n times, so
total steps = O(n · log₂ n) = O(n log n)
Example-4
int i;
int j;
int k = 0;
for (i = 1; i <= n; i++) {
    printf("%d", i);
    printf("%d", i * i);
    for (j = 1; j <= n; j = j + 1) {
        k = k + i + j;
    }
}
 T(n) = n² + 2n + 3
 Time complexity: O(n²) & space complexity: O(1)
GATE-CS-2005

double fun (int n)


{
int i;
double sum;
if (n = = 0) return 1.0;
else
{
sum = 0.0;
for (i = 0; i < n; i++)
sum = sum + fun (i);
return sum;
}
}
 Function fun() is recursive. Space complexity is O(n) as there can
be at most n active function calls (stack frames) at a time.
GATE-CS-2004
Let A[1, ..., n] be an array storing a bit (1 or 0) at each location, and let f(m)
be a function whose time complexity is Θ(m).

counter = 0;
for (i = 1; i <= n; i++)
{
    if (A[i] == 1)
        counter++;
    else {
        f(counter);
        counter = 0;
    }
}
 The complexity of this program fragment is Θ(n)
Asymptotic Notations
When it comes to analyzing the complexity of any algorithm
in terms of time and space, we can never provide an exact
number to define the time required and the space required by
the algorithm, instead we express it using some standard
notations, also known as Asymptotic Notations.
When we analyse any algorithm, we generally get a formula
representing the amount of resources it requires: the time the
computer takes to run the lines of code, the number of memory
accesses, the number of comparisons, the temporary variables
occupying memory space, etc.
Asymptotic Notations allow us to analyze an algorithm's
running time by identifying its behavior as its input size
grows. This is also referred to as an algorithm's growth
rate.
Let us take an example: if some algorithm has a time complexity of
T(n) = n² + 3n + 4, which is a quadratic expression, then for large
values of n the 3n + 4 part will become insignificant compared to
the n² part.
Growth rate of functions
 Logarithmic Function - log n
 Linear Function - an + b
 Quadratic Function - an² + bn + c
 Polynomial Function - anᶻ + … + an² + an¹ + an⁰, where z is some constant
 Exponential Function - aⁿ, where a is some constant
GATE-CS-2021
Consider the following three functions.
f1 = 10^n
f2 = n^(log n)
f3 = n^(√n)
Arrange the functions in the increasing order of asymptotic growth rate:
f2 < f3 < f1
AKTU Questions
Define best case, average case and worst case for analyzing
the complexity of a program. [2022-23] [2 Marks]
Rank the following typical bounds in increasing order of
growth rate: O(log n), O(n⁴), O(1), O(n² log n) [2021-22]
[2 Marks]
Define the following terms: (i) Time complexity (ii) Space
complexity (iii) Asymptotic notation (iv) Big O notation
[2019-20] [10 Marks]
What is Asymptotic Behavior?
The word Asymptotic means approaching a value or curve
arbitrarily closely (i.e., as some sort of limit is taken).
In asymptotic notation, we ignore the constant factors and
insignificant parts of an expression to devise a better way of
representing the complexities of algorithms in a single term,
so that comparisons between algorithms can be done easily.
Let's take an example to understand this:
 If we have two algorithms with the following expressions
representing the time required by them for execution:
 Expression 1: (20n² + 3n - 4)
 Expression 2: (n³ + 100n - 2)
 Now, as per asymptotic notation, we should just worry about how
the function will grow as the value of n (the input) grows, and that
will entirely depend on n² for Expression 1, and on n³ for
Expression 2. Hence, we can clearly say that the algorithm whose
running time is represented by Expression 2 will grow faster than
the other one, simply by analyzing the highest-order term and
ignoring the constants (20 in 20n²) and the insignificant parts of
the expressions (3n - 4 and 100n - 2).
 The main idea behind casting aside the less important part is to
make things manageable.
 All we need to do is, first, analyze the algorithm to find an
expression that defines its time requirements, and then analyze how
that expression will grow as the input (n) grows.
Types of Asymptotic Notations
We use three types of asymptotic notations to represent the
growth of any algorithm, as input increases:
Big Theta (Θ)
Big Oh(O)
Big Omega (Ω)
Upper Bounds: Big-O
This notation is known as the upper bound of the algorithm, or
the worst case of an algorithm.
It tells us that the running time will never exceed a certain
bounding function, for any sufficiently large value of the input n.
Consider the Linear Search algorithm, in which we traverse the
array elements one by one to search for a given number.
In the worst case, starting from the front of the array, we find the
element we are searching for at the end, which leads to a
time complexity of n, where n represents the total number of
elements.
If the element we are searching for is the first element of the
array, the time complexity will be constant.
So, the time complexity is O(n) in the worst case.
Let f(n) and g(n) be two nonnegative functions
indicating the running time of two algorithms. We say
g(n) provides an upper bound to f(n) if there exist
some positive constants c and n0 such that
0 ≤ f(n) ≤ c.g(n) for all n ≥ n0. It is denoted as f(n) =
Ο(g(n)).
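As a quick worked check of this definition (added here for illustration, not part of the original notes): take f(n) = 3n + 2 and g(n) = n. Since 3n + 2 ≤ 4n for all n ≥ 2, the constants c = 4 and n0 = 2 satisfy the definition, so f(n) = O(n).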
Lower Bounds: Omega
Big Omega notation is used to define the lower bound of
any algorithm or we can say the best case of any
algorithm.
This always indicates the minimum time required for any
algorithm for all input values, therefore the best case of any
algorithm.
In simple words, when we represent the time complexity of
any algorithm in the form of big-Ω, we mean that the
algorithm will take at least this much time to complete its
execution. It can definitely take more time than this too.
Let f(n) and g(n) be two nonnegative functions indicating the
running time of two algorithms. We say the function g(n) is a lower
bound of the function f(n) if there exist some positive constants c and
n0 such that 0 ≤ c.g(n) ≤ f(n) for all n ≥ n0. It is denoted as f(n) =
Ω (g(n)).
Tight Bounds: Theta
 When we say tight bounds, we mean that the time complexity
represented by the Big-Θ notation is like the average value or range
within which the actual time of execution of the algorithm will be.
 For example, if for some algorithm the time complexity is
represented by the expression 3n² + 5n, and we use the Big-Θ
notation to represent this, then the time complexity would be Θ(n²),
ignoring the constant coefficient and removing the insignificant
part, which is 5n.
 Here, in the example above, a complexity of Θ(n²) means that the
running time for any sufficiently large input n will remain
between c1·n² and c2·n², where c1 and c2 are two constants, thereby
tightly bounding the expression representing the growth of the
algorithm.
Let f(n) and g(n) be two nonnegative functions indicating the
running time of two algorithms. We say the function g(n) is a tight
bound of the function f(n) if there exist some positive constants c1,
c2, and n0 such that 0 ≤ c1 g(n) ≤ f(n) ≤ c2 g(n) for all n ≥ n0. It is
denoted as f(n) = Θ (g(n)).
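Continuing the same worked example (again added only for illustration): for f(n) = 3n + 2 and g(n) = n, we have 3n ≤ 3n + 2 ≤ 4n for all n ≥ 2, so c1 = 3, c2 = 4 and n0 = 2 satisfy the definition, giving f(n) = Θ(n) (and hence also f(n) = Ω(n)).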
ISRO CS 2015
int recursive (int n)
{
    if (n == 1)
        return (1);
    else
        return (recursive(n - 1) + recursive(n - 1));
}
 The time complexity of the above C function is
(assume n > 0): O(2ⁿ)
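One way to see this (a short derivation added for illustration): each call with n > 1 makes two recursive calls on n - 1, so the running time satisfies T(n) = 2T(n - 1) + c with T(1) = c. Unrolling this recurrence gives T(n) = c(2ⁿ - 1), which is O(2ⁿ).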
Complexity And Space-Time Tradeoff
The best algorithm for solving a problem is one that requires
less space in memory and also takes less time to
generate the output. But in general, it is not always possible to
achieve both of these goals at the same time.
In computer science, a space-time or time-memory tradeoff is
a way of solving a problem or calculation in less time by using
more storage space (or memory), or by solving a problem in
very little space by spending a long time.
So if your problem is taking a long time but not much memory,
a space-time tradeoff would let you use more memory and solve
the problem more quickly. Or, if it could be solved very quickly
but requires more memory than you have, you can try to spend
more time solving the problem in the limited memory.
Suppose a file of thousands of records contains a name, SSN
and much additional information. Searching (linear search) for a
record with a given name will take much time. Sorting the file
by name and using binary search will reduce the search
time.
But, on the other hand, searching for a record by SSN will
again take a lot of time. There are two approaches to solve this
problem:
Create a duplicate file and sort its records on the basis of
SSN, while the main file stays sorted by name. This approach
doubles the memory requirement.
Another approach is to sort the main file by SSN and keep an
auxiliary array with two columns: the first column contains the
sorted names and the second column contains pointers which
give the locations of the corresponding records in the main file.
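A minimal C sketch of the second approach (the structure and field names are assumptions chosen for illustration): the main file is an array of records kept sorted by SSN, and a separate index array pairs each name with the position of its record in the main file.

#include <stdio.h>

struct Record {
    long ssn;            /* the main file is kept sorted by this key */
    char name[30];
    /* ... much additional information ... */
};

/* One entry of the auxiliary index: a name plus the location of its record */
struct NameIndex {
    char name[30];
    int  pos;            /* index of the corresponding record in the main file */
};

int main(void) {
    /* Main file, sorted by SSN */
    struct Record file[3] = {
        {1111, "Charlie"}, {2222, "Alice"}, {3333, "Bob"}
    };

    /* Auxiliary index, sorted by name, pointing into the main file */
    struct NameIndex byName[3] = {
        {"Alice", 1}, {"Bob", 2}, {"Charlie", 0}
    };

    /* A binary search by name could be performed on byName;
       here we simply follow one index entry into the main file. */
    printf("%s has SSN %ld\n", byName[0].name, file[byName[0].pos].ssn);
    return 0;
}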
Abstract Data Type (ADT)
A data type is basically a type of data that can be used
in different computer programs. It signifies the kind of value, like
integer, float etc., and the space it occupies: an integer will take
4 bytes, a character will take 1 byte of space, etc.
An abstract data type is a special kind of data type whose
behavior is defined by a set of values and a set of operations.
The keyword "Abstract" is used because we can use these data
types and perform different operations on them, but how those
operations work is totally hidden from the user.
An ADT is built from primitive data types, but its
operation logic is hidden.
Some examples of ADT are Stack, Queue, Linked List etc.
Stack −
isFull(), This is used to check whether the stack is full or not
isEmpty(), This is used to check whether the stack is empty or
not
push(x), This is used to push x onto the stack
pop(), This is used to delete one element from the top of the
stack
peek(), This is used to get the top-most element of the stack
size(), This function is used to get the number of elements present
in the stack
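A minimal array-based sketch of this stack ADT in C (only an illustration, assuming a fixed capacity MAX; other implementations, e.g. linked lists, are equally valid):

#include <stdbool.h>

#define MAX 100                       /* assumed fixed capacity */

struct Stack {
    int items[MAX];
    int top;                          /* index of the top element, -1 when empty */
};

void init(struct Stack *s)          { s->top = -1; }
bool isFull(const struct Stack *s)  { return s->top == MAX - 1; }
bool isEmpty(const struct Stack *s) { return s->top == -1; }
int  size(const struct Stack *s)    { return s->top + 1; }

bool push(struct Stack *s, int x) {   /* returns false on overflow */
    if (isFull(s)) return false;
    s->items[++s->top] = x;
    return true;
}

bool pop(struct Stack *s, int *x) {   /* returns false on underflow */
    if (isEmpty(s)) return false;
    *x = s->items[s->top--];
    return true;
}

bool peek(const struct Stack *s, int *x) {
    if (isEmpty(s)) return false;
    *x = s->items[s->top];
    return true;
}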
Queue −
isFull(), This is used to check whether the queue is full or not
isEmpty(), This is used to check whether the queue is empty or
not
insert(x), This is used to add x into the queue at the rear end
delete(), This is used to delete one element from the front end
of the queue
size(), This function is used to get the number of elements present
in the queue
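Similarly, a minimal circular-array sketch of this queue ADT in C (again only an illustration, with an assumed fixed capacity QMAX):

#include <stdbool.h>

#define QMAX 100                      /* assumed fixed capacity */

struct Queue {
    int items[QMAX];
    int front;                        /* index of the front element */
    int count;                        /* number of elements currently stored */
};

void qinit(struct Queue *q)           { q->front = 0; q->count = 0; }
bool qisFull(const struct Queue *q)   { return q->count == QMAX; }
bool qisEmpty(const struct Queue *q)  { return q->count == 0; }
int  qsize(const struct Queue *q)     { return q->count; }

bool qinsert(struct Queue *q, int x) {          /* add x at the rear end */
    if (qisFull(q)) return false;
    q->items[(q->front + q->count) % QMAX] = x;
    q->count++;
    return true;
}

bool qdelete(struct Queue *q, int *x) {         /* remove from the front end */
    if (qisEmpty(q)) return false;
    *x = q->items[q->front];
    q->front = (q->front + 1) % QMAX;
    q->count--;
    return true;
}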
List −
size(), This function is used to get the number of elements
present in the list
insert(x), This function is used to insert one element into the
list
remove(x), This function is used to remove the given element
from the list
get(i), This function is used to get the element at position i
replace(x, y), This function is used to replace the value x with y