
Introduction to Data Structures
LECTURE 1: INTRODUCTION
Introduction
•A data structure can be defined as a group of data elements together with a way of storing and
organizing them in the computer so that the data can be used efficiently.
•Some examples of data structures are arrays, linked lists, stacks and queues.
•Data structures are widely used in almost every area of Computer Science, e.g. Operating
Systems, Compiler Design, Artificial Intelligence, Graphics and many more.

Reference: https://www.javatpoint.com/data-structure-introduction
Basic Terminology
•Data structures are the building blocks of any program or software.
•Choosing the appropriate data structure for a program is one of the most difficult tasks for a
programmer.
•The following terminology is used when discussing data structures.
•Data: Data can be defined as an elementary value or a collection of values, for example, a
student's name and ID are data about the student.

•Group Items: Data items which have subordinate data items are called group items, for example,
the name of a student can consist of a first name and a last name.
•Record: A record can be defined as a collection of various data items, for example, if we talk
about the student entity, then its name, address, course and marks can be grouped together to
form the record for the student.
•File: A file is a collection of records of one type of entity, for example, if there are 60
employees in an organization, then there will be 60 records in the related file, where each record
contains the data about one employee.
•Attribute and Entity: An entity represents a class of certain objects. It contains various
attributes. Each attribute represents a particular property of that entity.
•Field: A field is a single elementary unit of information representing an attribute of an entity.
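The terms above can be made concrete with a small Python sketch. The class names, field names and sample values are hypothetical and chosen only for illustration; they do not come from the lecture.

```python
from dataclasses import dataclass


@dataclass
class Name:
    # Group item: a data item made up of subordinate data items.
    first: str
    last: str


@dataclass
class StudentRecord:
    # Record: the data items that describe one student entity.
    student_id: int   # field (elementary item) for the "id" attribute
    name: Name        # group item
    course: str       # field for the "course" attribute
    marks: float      # field for the "marks" attribute


# File: a collection of records for one type of entity (hypothetical data).
student_file = [
    StudentRecord(1, Name("Asha", "Rao"), "Data Structures", 82.5),
    StudentRecord(2, Name("Ben", "Lee"), "Data Structures", 74.0),
]

for record in student_file:
    print(record.student_id, record.name.first, record.name.last, record.marks)
```

Here a plain list plays the role of the file; a real application would usually keep the records on disk or in a database.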
Need of Data Structures
As applications become more complex and the amount of data grows day by day, the following
problems may arise:
Processor speed: Handling a very large amount of data requires high-speed processing, but as
the data grows to billions of files per entity, the processor may fail to deal with
that much data.
Data Search: Consider an inventory of 10^6 items in a store. If our application needs to
search for a particular item, it may have to traverse up to 10^6 items every time, which slows down
the search process.
Multiple requests: If thousands of users are searching the data simultaneously on a web server,
there is a chance that even a very large server fails during that process.
In order to solve the above problems, data structures are used. Data is organized to form a data
structure in such a way that not all items need to be examined and the required data can be
found quickly.
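As a rough illustration of the search problem, the sketch below looks up an item first by scanning a list and then through a dictionary used as an index. The inventory contents and the function names are assumptions made for this example only.

```python
# Hypothetical inventory: each item is a small dictionary with an id and a name.
inventory = [{"id": i, "name": f"item-{i}"} for i in range(1000)]


def find_by_scan(items, item_id):
    """Examine the items one by one until the wanted id is found."""
    for item in items:
        if item["id"] == item_id:
            return item
    return None


# Organizing the same data as a dictionary keyed by id means a lookup
# no longer has to traverse the whole inventory.
index = {item["id"]: item for item in inventory}


def find_by_index(item_index, item_id):
    """Look the item up directly through the dictionary."""
    return item_index.get(item_id)


print(find_by_scan(inventory, 999)["name"])
print(find_by_index(index, 999)["name"])
```

Both calls return the same item; the difference is how many elements have to be examined to reach it.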
Advantages of Data Structures
Efficiency: Efficiency of a program depends upon the choice of data structures.
For example, suppose we have some data and need to search for a particular record. If we
organize our data in an unsorted array, we have to search sequentially, element by element, so
an array may not be very efficient here. Other data structures make the search more efficient,
such as an ordered array, a binary search tree or a hash table.
Reusability: Data structures are reusable, i.e. once we have implemented a particular data structure, we can use
it at any other place. Implementation of data structures can be compiled into libraries which can be used by
different clients.
Abstraction: A data structure is specified by an ADT, which provides a level of abstraction. The client program uses
the data structure through its interface only, without getting into the implementation details.
An Abstract Data Type (ADT) is a type (or class) for objects whose behaviour is defined by a set of values and a set of
operations.
The definition of ADT only mentions what operations are to be performed but not how these operations will be
implemented. It does not specify how data will be organized in memory and what algorithms will be used for
implementing the operations. It is called “abstract” because it gives an implementation-independent view. The
process of providing only the essentials and hiding the details is known as abstraction.
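A minimal sketch of this idea in Python: an abstract class states which operations a stack offers, and a separate class decides how they are implemented. The names StackADT, ListStack and reverse are hypothetical and not part of the lecture.

```python
from abc import ABC, abstractmethod


class StackADT(ABC):
    """The ADT: says WHAT operations exist, not HOW they work."""

    @abstractmethod
    def push(self, item): ...

    @abstractmethod
    def pop(self): ...

    @abstractmethod
    def is_empty(self): ...


class ListStack(StackADT):
    """One possible implementation, backed by a Python list."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def is_empty(self):
        return not self._items


def reverse(values, stack):
    """Client code: uses only the interface, never the internal list."""
    for value in values:
        stack.push(value)
    result = []
    while not stack.is_empty():
        result.append(stack.pop())
    return result


print(reverse([1, 2, 3], ListStack()))
```

Because reverse depends only on the three ADT operations, ListStack could be swapped for a linked-list implementation without changing the client.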
Data Structure Classification
There are two types of data structures available for programming purposes:
•Primitive data structure
•Non-primitive data structure
•A primitive data structure is a fundamental type that stores data of only one
type, whereas a non-primitive data structure is a user-defined type
that can store data of different types in a single entity.
•Primitive data structures are the fundamental data types such as integer, float,
character and pointer; each of these holds a single type of value.
•For example, an integer variable holds an integer value, a float variable holds a floating-point
value, a character variable holds a character value, and a pointer variable
holds a pointer (address) value.
Non-primitive data structures are divided into two categories: linear data
structures and non-linear data structures.
A linear data structure is a sequential type of data structure: its elements are arranged one
after another in a single sequence; for example, the element after the second element is the
third element, the element after the third element is the fourth element, and so on. The
sequence is logical; the elements need not occupy adjacent memory locations.
The linear data structures include Array, Linked list, Stack and Queue.
In a non-linear data structure, the elements are not arranged in a single sequence. The non-linear data
structures are Tree and Graph.
Linear Data Structures: A data structure is called linear if all of its elements are arranged in
linear order. In linear data structures, the elements are stored in a non-hierarchical way where
each element has one predecessor and one successor, except the first element (no predecessor)
and the last element (no successor).
Types of Linear Data Structures are given below:
Arrays: An array is a collection of data items of the same type, and each data item is called an
element of the array. The data type of the elements may be any valid data type such as char, int, float
or double.
• The elements of an array share the same variable name, but each one carries a different index
number known as its subscript. An array can be one-dimensional, two-dimensional or
multidimensional.
• For example, the individual elements of a 100-element array named age are: age[0], age[1],
age[2], age[3], ..., age[98], age[99].
Linked List: A linked list is a linear data structure which is used to maintain a list in memory. It
can be seen as a collection of nodes stored at non-contiguous memory locations. Each node of
the list contains a pointer to its adjacent node.
Stack: A stack is a linear list in which insertions and deletions are allowed only at one end,
called the top.
A stack is an abstract data type (ADT) that can be implemented in most programming
languages. It is called a stack because it behaves like a real-world stack, for example a pile of
plates or a deck of cards.
Queue: A queue is a linear list in which elements can be inserted only at one end, called the rear, and
deleted only at the other end, called the front.
It is an abstract data type, similar to the stack. A queue is open at both ends and therefore follows the
First-In-First-Out (FIFO) discipline for storing data items.
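The four linear structures described above can be sketched briefly in Python. This is an illustrative sketch only: the Node class, the to_list helper and the sample values are hypothetical, and Python's built-in list and collections.deque stand in for the array, stack and queue.

```python
from collections import deque

# Array: elements share one name and are selected by index (subscript).
age = [0] * 100          # age[0] ... age[99]
age[99] = 21


# Linked list: each node holds a value and a reference to the next node.
class Node:
    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node


def to_list(head):
    """Follow the next references from head and collect the values."""
    values = []
    while head is not None:
        values.append(head.value)
        head = head.next
    return values


head = Node(10, Node(20, Node(30)))
linked_values = to_list(head)

# Stack: insert and delete at one end only (the top).
stack = []
stack.append("a")        # push
stack.append("b")        # push
top = stack.pop()        # pop removes the most recently pushed item

# Queue: insert at the rear, delete from the front (FIFO).
queue = deque()
queue.append("a")        # enqueue at the rear
queue.append("b")
front = queue.popleft()  # dequeue from the front

print(linked_values, top, front)
```

The stack returns the item that was added last, while the queue returns the item that was added first.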
Non Linear Data Structures: These data structures do not form a sequence, i.e. an item or element may be
connected with two or more other items in a non-linear arrangement. The data elements are not
arranged in a sequential structure.
Types of Non Linear Data Structures are given below:
Trees: Trees are multilevel data structures with a hierarchical relationship among their elements, known
as nodes. The bottommost nodes in the hierarchy are called leaf nodes, while the topmost node is
called the root node. Each node contains pointers to its adjacent nodes.
The tree data structure is based on the parent-child relationship among the nodes. Every node except a
leaf node can have one or more children, whereas every node except the root node has exactly one
parent. Trees can be classified into many categories, which will be discussed later
in this course.
Graphs: Graphs can be defined as the pictorial representation of a set of elements (represented by
vertices) connected by links known as edges. A graph differs from a tree in that a
graph can contain a cycle, whereas a tree cannot.
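The sketch below shows one possible way to represent a tree and a graph in Python and to check the cycle property mentioned above. TreeNode, leaves and has_cycle are hypothetical names; the graph is stored as an adjacency list (a dictionary of neighbour lists), and the cycle check assumes a simple undirected graph.

```python
class TreeNode:
    def __init__(self, value):
        self.value = value
        self.children = []   # references to the child nodes


def leaves(node):
    """Collect the values of all leaf nodes below (and including) node."""
    if not node.children:
        return [node.value]
    result = []
    for child in node.children:
        result.extend(leaves(child))
    return result


root = TreeNode("root")
left, right = TreeNode("left"), TreeNode("right")
root.children = [left, right]
left.children = [TreeNode("leaf")]

# Undirected graph as an adjacency list; A-B-C-A forms a cycle.
graph = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"]}


def has_cycle(adjacency):
    """Depth-first search: reaching a visited vertex that is not the
    vertex we just came from means there is a cycle."""
    visited = set()

    def dfs(vertex, parent):
        visited.add(vertex)
        for neighbour in adjacency[vertex]:
            if neighbour not in visited:
                if dfs(neighbour, vertex):
                    return True
            elif neighbour != parent:
                return True
        return False

    return any(dfs(v, None) for v in adjacency if v not in visited)


print(leaves(root), has_cycle(graph))
```

Applying has_cycle to a tree-shaped adjacency list, such as {"A": ["B"], "B": ["A"]}, returns False.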
Operations on data structure
1) Traversing: Every data structure contains a set of data elements. Traversing the data structure
means visiting each element of the data structure in order to perform some specific operation like
searching or sorting.
Example: If we need to calculate the average of the marks obtained by a student in 6 different
subjects, we need to traverse the complete array of marks and calculate the total sum, then
divide that sum by the number of subjects, i.e. 6, in order to find the average.
2) Insertion: Insertion can be defined as the process of adding an element to the data structure at
any location.
If the data structure has a fixed capacity of n elements, at most n elements can be stored in it;
trying to insert into a full data structure causes overflow.
3) Deletion: The process of removing an element from the data structure is called Deletion. We can
delete an element from the data structure at any location.
If we try to delete an element from an empty data structure, underflow occurs.
4) Searching: The process of finding the location of an element within the data structure is
called Searching. Two common algorithms for searching are Linear Search and Binary
Search. We will discuss each of them later in this course.
5) Sorting: The process of arranging the data structure in a specific order is known as Sorting.
There are many algorithms that can be used to perform sorting, for example, insertion sort,
selection sort, bubble sort, etc.
6) Merging: When two lists, List A and List B, of sizes M and N respectively and containing elements
of a similar type, are joined to produce a third list, List C of size (M+N), the process is
called merging.
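The six operations can be sketched on a Python list as follows. The marks values and the helper names are assumptions made for this example; the built-in sorted function and the bisect module are used in place of hand-written sorting and binary search.

```python
import bisect

marks = [72, 85, 64, 90, 58, 77]   # hypothetical marks in 6 subjects

# 1) Traversing: visit every element to compute the average.
total = 0
for mark in marks:
    total += mark
average = total / len(marks)


# 2) Insertion: add an element at a chosen position.
def insert_at(items, position, value):
    items.insert(position, value)


# 3) Deletion: remove the last element, reporting underflow when empty.
def delete_last(items):
    if not items:
        raise IndexError("underflow: the data structure is empty")
    return items.pop()


# 4) Searching: linear search works on any list, binary search needs a sorted one.
def linear_search(items, target):
    for position, item in enumerate(items):
        if item == target:
            return position
    return -1


def binary_search(sorted_items, target):
    position = bisect.bisect_left(sorted_items, target)
    if position < len(sorted_items) and sorted_items[position] == target:
        return position
    return -1


# 5) Sorting: arrange the elements in ascending order.
sorted_marks = sorted(marks)


# 6) Merging: join list A (size M) and list B (size N) into list C (size M+N).
def merge(list_a, list_b):
    return list_a + list_b


print(average, sorted_marks, linear_search(marks, 90), binary_search(sorted_marks, 90))
```

Note that binary_search is called on sorted_marks, not on marks, because it relies on the elements being in order.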
What is an Algorithm?
•What are algorithms?
•Why is the study of algorithms worthwhile?
•What is the role of algorithms relative to other technologies used in computers?
Let us consider the problem of preparing an omelette. To prepare an omelette, we follow the
steps given below:
1) Get the frying pan.
2) Get the oil.
   a. Do we have oil?
      i. If yes, put it in the pan.
      ii. If no, do we want to buy oil?
         1. If yes, then go out and buy.
         2. If no, we can terminate.
3) Turn on the stove, etc...


What we are doing is, for a given problem (preparing an omelette), providing a step-by-step
procedure for solving it.
•An algorithm is a sequence of step-by-step, unambiguous instructions to solve a given problem.
•In the traditional study of algorithms, there are two main criteria for judging the merits of
algorithms: correctness (does the algorithm give a solution to the problem in a finite number of
steps?) and efficiency (how many resources, in terms of memory and time, does it take to
execute?).
An algorithm is a finite set of instructions which, if followed, accomplishes a particular task. In addition, every
algorithm must satisfy the following criteria:
(i) input: there are zero or more quantities which are externally supplied;
(ii) output: at least one quantity is produced;
(iii) definiteness: each instruction must be clear and unambiguous;
(iv) finiteness: if we trace out the instructions of an algorithm, then for all cases the algorithm will
terminate after a finite number of steps;
(v) effectiveness: every instruction must be sufficiently basic that it can in principle be carried out by a
person using only pencil and paper. It is not enough that each operation be definite as in (iii); it must
also be feasible.
Example
Let's understand the algorithm through a real-world example. Suppose we want to make lemon juice; the
following are the steps required to make it:
Step 1: First, cut the lemon in half.
Step 2: Squeeze the lemon as much as you can and collect its juice in a container.
Step 3: Add two tablespoons of sugar to it.
Step 4: Stir the container until the sugar dissolves.
Step 5: When the sugar has dissolved, add some water and ice.
Step 6: Store the juice in a fridge for 5 to minutes.
Step 7: Now, it's ready to drink.
The above real-world example can be directly compared to the definition of an algorithm. We cannot perform step 3
before step 2; we need to follow a specific order to make lemon juice. Likewise, in an algorithm each
instruction must be followed in a specific order to perform a specific task.
Example
The following are the steps required to add two numbers entered by the user:
Step 1: Start
Step 2: Declare three variables a, b, and sum.
Step 3: Enter the values of a and b.
Step 4: Add the values of a and b and store the result in the sum variable, i.e., sum=a+b.
Step 5: Print sum
Step 6: Stop
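The steps above can be sketched in Python as follows. The function name add_two_numbers is hypothetical, the result variable is called total to avoid hiding Python's built-in sum, and fixed sample values stand in for the numbers a user would type so that the sketch runs unattended.

```python
def add_two_numbers(a, b):
    # Step 4: add the values of a and b and store the result.
    total = a + b
    return total


# Step 3: values that a program would normally read with input().
a = 12
b = 30

# Step 5: print the sum.
print(add_two_numbers(a, b))
```

In an interactive program, a and b could be read with float(input(...)) instead of being fixed.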
Why the Analysis of Algorithms?
•To go from city “A” to city “B”, there can be many ways of accomplishing this: by flight, by bus, by
train and also by bicycle.
•Depending on the availability and convenience, we choose the one that suits us.
•Similarly, in computer science, multiple algorithms are available for solving the same problem
(for example, a sorting problem has many algorithms, like insertion sort, selection sort, quick
sort and many more).
•Algorithm analysis helps us to determine which algorithm is most efficient in terms of time and
space consumed.
Goal of the Analysis of Algorithms
The goal of the analysis of algorithms is to compare algorithms (or solutions) mainly in terms of running time but
also in terms of other factors (e.g., memory, developer effort, etc.)
What is Running Time Analysis?
•It is the process of determining how processing time increases as the size of the problem (input size) increases.
•Input size is the number of elements in the input, and depending on the problem type, the input may be of
different types. The following are the common types of inputs.
• Size of an array
• Polynomial degree
• Number of elements in a matrix
• Number of bits in the binary representation of the input
• Vertices and edges in a graph.
How to Compare Algorithms
To compare algorithms, let us define a few objective measures:
Execution times? Not a good measure as execution times are specific to a particular computer.
Number of statements executed? Not a good measure, since the number of statements varies
with the programming language as well as the style of the individual programmer.
Ideal solution? Let us assume that we express the running time of a given algorithm as a
function of the input size n (i.e., f(n)) and compare these different functions corresponding to
running times.
This kind of comparison is independent of machine time, programming style, etc.
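One way to see running time as a function of the input size n is to count basic steps instead of measuring seconds. The sketch below counts how many elements two search algorithms examine; the function names are hypothetical, and searching for the last element is used here as a demanding case for linear search.

```python
def linear_search_steps(items, target):
    """Count the elements examined by a sequential search."""
    steps = 0
    for item in items:
        steps += 1
        if item == target:
            break
    return steps


def binary_search_steps(sorted_items, target):
    """Count the elements examined by a binary search on sorted data."""
    steps = 0
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        steps += 1
        mid = (low + high) // 2
        if sorted_items[mid] == target:
            break
        if sorted_items[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return steps


for n in (10, 1_000, 100_000):
    data = list(range(n))      # already sorted
    target = n - 1             # last element
    print(n, linear_search_steps(data, target), binary_search_steps(data, target))
```

The linear count grows in proportion to n, while the binary count grows only with the number of times n can be halved, and neither count depends on the speed of the computer running the code.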
Pseudocode
•In computer science, pseudocode is a plain language description of the steps in an algorithm or
another system.
•Pseudocode is a generic way of describing an algorithm without use of any specific
programming language syntax.
•Pseudocode often uses structural conventions of a normal programming language, but is
intended for human reading rather than machine reading.
•It typically omits details that are essential for machine understanding of the algorithm, such
as variable declarations and language-specific code.
•The purpose of using pseudocode is that it is easier for people to understand than conventional
programming language code, and that it is an efficient and environment-independent
description of the key principles of an algorithm.
•It is, as the name suggests, pseudo code: it cannot be executed on a real computer, but it
models and resembles real programming code, and is written at roughly the same level of detail.
•Pseudocode, by nature, exists in various forms, although most borrow syntax from popular
programming languages (like C, Lisp, or FORTRAN). Natural language is used whenever details
are unimportant or distracting.
•Simply put, it is an informal, made-up representation of an algorithm.
•Algorithms are often represented with the help of pseudocode, as it can be
interpreted by programmers no matter what their programming background or knowledge is.
•Pseudocode, as the name suggests, is a false code, or a representation of code which can be
understood even by a layman with some school-level programming knowledge.
•Algorithm: It is an organized, logical sequence of actions, or the approach towards a particular
problem. A programmer implements an algorithm to solve a problem. Algorithms are expressed
in natural language with some technical notation.
•Pseudo code: It is simply a description of an algorithm in the form of annotations and
informative text written in plain English. It does not follow the strict syntax of any programming language
and thus cannot be compiled or interpreted by the computer.
Advantages of Pseudocode
•Improves the readability of any approach. It’s one of the best approaches to start
implementation of an algorithm.
•Acts as a bridge between the program and the algorithm or flowchart. Also works as a rough
documentation, so the program of one developer can be understood easily when a pseudo code
is written out. In industries, the approach of documentation is essential. And that’s where a
pseudo-code proves vital.
•The main goal of a pseudo code is to explain what exactly each line of a program should do,
hence making the code construction phase easier for the programmer.
Disadvantages of Pseudocode
•Pseudocode does not provide a visual representation of the logic of programming.
•There is no standard format for writing pseudocode.
•Pseudocode creates an extra need to maintain documentation.
•There is no common standard for pseudocode; every company follows its own standard for writing
it.
Examples of Pseudocode
1.
If student's grade is greater than or equal to 50
    Print "passed"
Else
    Print "failed"
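A minimal Python version of this pseudocode might look like the sketch below. The function name grade_result is hypothetical, and it returns the message instead of printing it so that the result can be reused.

```python
def grade_result(grade):
    # A grade of 50 or more counts as a pass.
    if grade >= 50:
        return "passed"
    else:
        return "failed"


print(grade_result(50))
print(grade_result(49))
```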
Finding class average
1. Set total to zero
2. Set grade counter to one
3. While grade counter is less than or equal to ten
   1. Input the next grade
   2. Add the grade into the total
   3. Add one to the grade counter
4. Set the class average to the total divided by ten
5. Print the class average.
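The class-average pseudocode can be sketched in Python as below. As assumptions for this example, the ten grades are passed in as a list rather than typed in one by one, the function name class_average is hypothetical, and the counter is increased inside the loop so that the loop ends after ten grades.

```python
def class_average(grades):
    """Average exactly ten grades using a counter-controlled loop."""
    total = 0                          # 1. set total to zero
    counter = 1                        # 2. set grade counter to one
    while counter <= 10:               # 3. repeat for ten grades
        grade = grades[counter - 1]    #    take the next grade
        total += grade                 #    add the grade into the total
        counter += 1                   #    move on to the next grade
    return total / 10                  # 4. total divided by ten


sample_grades = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]   # hypothetical data
print(class_average(sample_grades))                          # 5. print the class average
```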
1. Write an algorithm to find maximum of two numbers “a” and “b”
2. Write an algorithm or steps to find maximum of three numbers a,b,c
If a>b and a>c
    print "a is maximum"
If b>a and b>c
    print "b is maximum"
If c>a and c>b
    print "c is maximum"
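A possible Python sketch of the three-number comparison follows. It deliberately differs from the pseudocode in one respect: it uses >= instead of >, because with strict comparisons nothing is reported when the largest value occurs more than once (for example a = b = 5, c = 1). The function name max_of_three is hypothetical.

```python
def max_of_three(a, b, c):
    # a is the maximum if it is at least as large as both others.
    if a >= b and a >= c:
        return a
    # Otherwise the maximum is the larger of b and c.
    if b >= c:
        return b
    return c


print(max_of_three(3, 9, 4))
print(max_of_three(5, 5, 1))
```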
Exercise
3. Write an algorithm to find maximum of n numbers?
Thank You
