
DATA STRUCTURE

AND
ALGORITHM

BY - HIMANSHU RAWAT
Topics of Lecture 1
● What is Data Structure
● Classification of data structure
● Basic Terminology: Elementary Data Organization
● Operations in Data Structure
● Need of Data Structure
● What is Algorithm
● Examples of Algorithm
● What is Flowchart
● Symbols in Flowchart
● Rules for creating Flowchart
● Examples of Flowchart
● Homework Question
● Motive of DSA (complexities)
What Is Data Structure?

A data structure is a way of organizing data in a computer so that it can be used
efficiently. It is a collection of data elements and the relationships between them.
Data structures are used in almost every type of software application, from
operating systems to web browsers to games.

Examples of data structures include arrays, linked lists, stacks, queues, trees, and
graphs.
Classification of Data Structure
There are two types of data structures:
● Primitive data structure
● Non-primitive data structure

Primitive Data structure


Primitive data structures correspond to the primitive data types of a language. They are the basic data
structures that machine instructions operate on directly.
They may have different representations on different computers. int, char, float, double, and pointer are
primitive data structures; each holds a single value.

Non-Primitive Data structure


Non-primitive data structures are more complicated data structures and are derived from primitive data
structures.

The non-primitive data structure is divided into two types:


● Linear data structure
● Non-linear data structure
Linear Data Structure

The arrangement of data in a sequential manner is known as a linear data
structure. In these data structures, each element is connected only to the element
before it and the element after it. Linear data structures are easy to implement
because computer memory is also arranged linearly. Examples
are the array, stack, queue, and linked list.

Non-linear Data Structure:

Data structures where data elements are not arranged sequentially or linearly
are called non-linear data structures. In a non-linear data structure, the elements are
not all on a single level, so we cannot traverse all of them in a single run.
Non-linear data structures are harder to implement than linear
data structures, but they can use computer memory more efficiently.
Examples are trees and graphs.
Linear Data structures can also be classified as:

● Static data structure: A data structure whose size is
allocated at compile time, so the size of the
structure is fixed. The contents of the data structure can be modified,
but the memory space allocated to it cannot change. Therefore, the
maximum size is fixed, e.g., an array.

● Dynamic data structure: A data structure whose size is
allocated at run time. The size of the
structure is not fixed and can change during the operations
performed on it. Dynamic data structures are designed to allow
the structure to change at run time. Therefore, the maximum size
is flexible, e.g., a linked list or a queue.
Basic Terminology: Elementary Data Organization

● Data and Data Item:-

Data are simply a collection of facts and figures. Data
are values or sets of values. A data item refers to a
single unit of values. Data items that are divided into
sub-items are called group items; those that are not are
called elementary items.

For example, a student's name may be divided into
three sub-items (first name, middle name and last
name), but the ID of a student would normally be
treated as a single item.

In the above figure ( ID, Age, Gender, First, Middle, Last, Street, Area ) are elementary data
items, whereas (Name, Address ) are group data items.
● Data Type:-

A data type is a classification, such as floating-point,
integer, or Boolean, that determines the possible values for that type, the operations that
can be done on values of that type, and the way values of that type can be stored. There are
two kinds: primitive and non-primitive data types. A primitive data type is a basic data
type that the programming language provides with built-in support. It is
native to the language and is supported directly by the machine, while a non-primitive data type
is derived from primitive data types, for example an array or a structure.

● Variable:-

It is a symbolic name given to some known or unknown quantity or information, for the
purpose of allowing the name to be used independently of the information it represents.
A variable name in computer source code is usually associated with a data storage
location and thus also its contents and these may change during the course of program
execution.
● Record:-

A collection of related data items is known as a record. The elements of a record are usually called fields
or members. Records are distinguished from arrays by the fact that their number of fields is typically
fixed, each field has a name, and each field may have a different type.

● Program:-

A sequence of instructions that a computer can interpret and execute is called a program.

● Entity:-

An entity is something that has certain attributes or properties which may be assigned values.
The values themselves may be either numeric or non-numeric. An example of an entity with its attributes
and their values is shown in the figure.
● Entity Set

An entity set is a group of similar entities, for example the employees of an
organization or the students of a class. Each attribute of an entity set has a range of values,
the set of all possible values that could be assigned to that attribute. The term
“information” is sometimes used for data with given attributes or, in other words,
meaningful or processed data.

● Field

A field is a single elementary unit of information representing an attribute of an entity. A
record is the collection of field values of a given entity, and a file is the collection of records
of the entities in a given entity set.

● File

A file is a collection of records of the entities in a given entity set, for example a file containing
the records of the students of a particular class.

● Key

A key is one or more field(s) in a record that take(s) unique values and can be used to
distinguish one record from the others.
Operations in Data Structure
Operations in data structures refer to the fundamental actions or functions that can be performed on
data structures, such as arrays, linked lists, trees, graphs, and more. These operations are essential for
manipulating, retrieving, and managing data efficiently within a data structure. The specific operations
available may vary depending on the type of data structure, but here are some common data structure
operations:

● Creation: The creation operation in data structure is the process of allocating memory for the
data structure and initializing its values. This operation is usually performed when the program
starts running.

● Insertion: Adding a new element or data item into the data structure. The position of insertion
may vary based on the data structure, such as adding an element at the beginning, end, or a
specific location.

● Deletion: Removing an element or data item from the data structure. Like insertion, deletion may
involve specifying the location of the item to be removed.

● Traversal: Visiting all elements of the data structure to perform a specific operation on each item.
Traversal is commonly used in tasks like searching, printing, or modifying elements.
● Search: Finding a specific element within the data structure. The goal is to determine whether a
particular item exists in the structure and, if so, its location or value.

● Update/Modification: Changing the value or attributes of an existing element within
the data structure. This is often necessary when you need to update the stored data.

● Sorting: Arranging the elements in a specific order (e.g., ascending or descending)
based on certain criteria. Sorting is crucial for efficient searching and organizing
data.

● Merging/Combining: Combining two or more data structures into a single data
structure. This operation is common in various scenarios, such as merging two
sorted lists into one sorted list.

● Splitting/Partitioning: Dividing a data structure into two or more smaller data
structures. For example, splitting a list into two lists based on a specific condition.
Need of data structure:-

➢ It gives different levels of organization to data.
➢ It tells how data can be stored and accessed at its elementary level.
➢ It provides operations on groups of data, such as adding an item or looking
up the highest-priority item.
➢ It provides a means to manage huge amounts of data efficiently.
➢ It provides fast searching and sorting of data.
What is Algorithm?

An algorithm is a logical sequence of steps that can be followed to solve a
problem or achieve a goal. It is a clearly defined procedure that can be
repeated to produce the same result each time. Algorithms are used in a wide
variety of applications, from computer science to everyday life.
A few examples of algorithms

1. Find the sum of any two given numbers.

Step 1 : Start.

Step 2 : Input two numbers num1 and num2.

Step 3 : Calculate sum = num1 + num2.

Step 4 : Print sum

Step 5 : Exit
2. Find the maximum of three given numbers.
Step 1 : Start.
Step 2 : Read a, b and c.
Step 3 : If a > b do
             If a > c do
                 Print a
             Else
                 Print c
         Else
             If b > c do
                 Print b
             Else
                 Print c
Step 4 : Exit
What is a Flowchart?

A flowchart is a graphical representation of an algorithm. Programmers often use it
as a program-planning tool to solve a problem. It makes use of symbols that are
connected to one another to indicate the flow of information and processing.

The process of drawing a flowchart for an algorithm is known as “flowcharting”.


Basic Symbols used in Flowchart Designs

● Terminal: The oval symbol indicates Start, Stop and Halt
in a program’s logic flow. A pause/halt is generally used
in program logic under some error conditions.
The terminal is the first and last symbol in the flowchart.

● Input/Output: A parallelogram denotes any function of
input/output type. Program instructions that take input
from input devices and display output on output devices
are indicated with a parallelogram in a flowchart.

● Processing: A box represents arithmetic instructions.
All arithmetic processes such as addition, subtraction,
multiplication and division are indicated by the action or
process symbol.
● Decision: A diamond symbol represents a decision point. Decision-based
operations such as yes/no questions or true/false tests are indicated
by a diamond in a flowchart.

● Connectors: Whenever a flowchart becomes complex or spreads
over more than one page, it is useful to use connectors to avoid
confusion. A connector is represented by a circle.

● Flow lines: Flow lines indicate the exact sequence in which
instructions are executed. Arrows represent the direction
of flow of control and the relationship among different symbols
of the flowchart.
Rules For Creating Flowchart :

A flowchart is a graphical representation of an algorithm. Some rules should be
followed while creating a flowchart:

Rule 1: Flowchart opening statement must be ‘start’ keyword.

Rule 2: Flowchart ending statement must be ‘end’ keyword.

Rule 3: All symbols in the flowchart must be connected with an arrow line.

Rule 4: The decision symbol in the flowchart is associated with the arrow line.
1. Find the sum of any two given numbers (flowchart)

Start → Input num1 and num2 → Sum = num1 + num2 → Print Sum → Exit
2. Find the maximum of three given numbers (flowchart)

Start → Read a, b and c → Is a > b?
    YES → Is a > c?
        YES → Print a → Exit
        NO → Print c → Exit
    NO → Is b > c?
        YES → Print b → Exit
        NO → Print c → Exit
Question for homework :-

Draw a flowchart and write an algorithm to check whether a
given number is prime.
Hint :- Start with 2 as the divisor.
Motive of DSA (Complexity)

"Complexity" typically refers to two main aspects: time complexity and space
complexity. These complexities help us analyze and understand how
efficiently an algorithm or data structure performs in terms of execution time
and memory usage.
Brain Teaser!
How can we tell when 9 minutes have passed, using only a 7-minute timer and a 4-minute timer?
Topics of Lecture 2
❖ Types of Data Structures
❖ Advantages of data structure
❖ Selecting a Data Structure
❖ Data Types and its types
❖ Characteristics of algorithm
❖ Analysis of algorithm
❖ Types of analysis of algorithm
❖ Space Complexity
❖ Time Complexity
❖ Time-Space trade off
❖ What is Asymptotic?
❖ Types of Asymptotic Notation
❖ Big O notation & its example
❖ Big omega notation & its example
❖ Theta notation & its example
Types of Data Structures:-
The collections of data you work with in a program have some kind of structure or organization.
No matter how complex your data structures are, they can be broken down into two fundamental types:

1. Contiguous

2. Non-Contiguous.

In contiguous structures, items of data are kept together in memory (either in RAM or in a file). An
array is an example of a contiguous structure, since each element in the array is located next to
one or two other elements. In contrast, items in a non-contiguous structure are scattered in memory but
are linked to each other in some way. A linked list is an example of a non-contiguous data structure. Here,
the nodes of the list are linked together using pointers stored in each node.
Advantages of Data structures

● Efficiency: If a suitable data structure is chosen to implement a particular abstract data type (ADT),
the program becomes very efficient in terms of time and space.

● Reusability: A data structure is reusable, meaning that multiple client
programs can use the same data structure.

● Abstraction: The data structure specified by an ADT also provides a level of
abstraction. The client cannot see the internal working of the data structure, so it
does not have to worry about the implementation part. The client can only see the
interface.
Selecting a Data Structure

➢ Analyze the problem to determine the resource constraints a solution must meet.
➢ Determine the basic operations that must be supported. Quantify the resource constraints
for each operation.
➢ Select the data structure that best meets these requirements.

Each data structure has costs and benefits. Rarely is one data structure better than another
in all situations. A data structure requires:

➢ Space for each item it stores
➢ Time to perform each basic operation
➢ Programming effort.
Data Types
A data type is a way to classify data, such as integer or string. It
determines the values that can be used with that type of data and the
operations that can be performed on it.
There are two data types −
● Built-in Data Type
● Derived Data Type

Built-in Data Type


Those data types for which a language has built-in support are known as Built-in Data types.
For example, most of the languages provide the following built-in data types.
● Integers
● Boolean (true, false)
● Floating (Decimal numbers)
● Character and Strings
Derived Data Type
Those data types which are implementation independent, as they can be implemented in one
way or another, are known as derived data types. These data types are normally built from a
combination of primary or built-in data types and associated operations on them.
For example −
List
Array
Stack
Queue
Characteristics of Algorithm:-

● Input specified - The input is the data to be transformed during the computation to produce the
output. An algorithm should have 0 or more well-defined inputs. Input precision requires that you
know what kind of data, how much, and in what form the data should be.

● Output specified - The output is the data resulting from the computation (your intended result).
An algorithm should have 1 or more well-defined outputs, and they should match the desired
output. Output precision also requires that you know what kind of data, how much, and in what form
the output should be (or even if there will be any output at all!).

● Definiteness - An algorithm must specify every step and the order in which the steps must be taken.
Definiteness means specifying the sequence of operations for turning input into output.
An algorithm should be clear and unambiguous. The details of each step must also be spelled out
(including how to handle errors). It should be stated in quantitative rather than qualitative terms.
● Effectiveness - For an algorithm to be effective, all the steps required
to get to the output must be feasible with the available resources. It should not contain any
unnecessary or redundant steps.

● Finiteness - The algorithm must eventually stop. Stopping may mean that you get the
expected output or a response that no solution is possible. An algorithm must always terminate
after a finite number of steps.

An algorithm that never terminates is of no use, because it never delivers a result.

● Independent - An algorithm should have step-by-step directions that are
independent of any programming code, so that it can be implemented in any
programming language.

These are the characteristics a good algorithm should have.
Analysis of algorithm

After an algorithm is designed, its correctness needs to be verified. This
is done by analysing the algorithm.

Algorithm analysis also measures the efficiency of the algorithm. An algorithm
is assessed on the following:

➔ Correctness of an Algorithm
➔ Implementation of an Algorithm
➔ Simplicity of an Algorithm
➔ Execution time and Memory requirement of an Algorithm.
Types of Analysis:-
● Worst Case Analysis (Running Time)
● Average Case Analysis (Running Time)
● Best case Analysis (Running time)

1. Worst case Running Time :-

It is an upper bound on the running time for any input of a given size. Knowing it gives us a
guarantee that the algorithm will never take longer than this.

2. Average case Running time :-

It is an estimate of the running time for an average input. Computing the
average case running time entails knowing all possible input sequences, the
probability distribution of occurrence of these sequences and the running
times for the individual sequences.
3. Best case Running time :-
It is the shortest running time over all inputs of a given size, for example the behaviour of some
sorting algorithms when the input is already in order. The best case rarely
occurs in practice compared with the first and second cases.

The choice of a particular algorithm depends on the following performance
analysis and measurements.

● Space Complexity
● Time Complexity
1. Space Complexity :-
The space complexity of an algorithm or program is the amount of memory it needs to run to completion.

The space needed by a program consists of the following components.

● Instruction Space - Space needed to store the executable version of the program. It is fixed.

● Data Space - Space needed to store all constants and variable values. It has three components.

A. Space needed by constants and simple variables.

B. Space needed by fixed-size structured variables, such as arrays and structures.

C. Dynamically allocated space - This space usually varies.

● Environment Stack Space - This space is needed to store the information required to resume suspended functions.
Each time a function is invoked, the following data is saved on the environment stack.

a. Return address - The point from which execution resumes after the called function completes.

b. The values of all local variables and of the formal parameters of the function being invoked.
2. Time Complexity-

The time complexity of an algorithm or a program is the amount of time it needs to run to completion. The
exact time will depend on the implementation of the algorithm, the programming language, the optimising
capabilities of the compiler used, the CPU speed, other hardware characteristics and so on.

The time complexity also depends on the amount of data input to the algorithm. We cannot predict the
exact time, but we can calculate the order of magnitude of the time required.

Time-Space trade off


In computer science, a space-time or time-memory trade-off is a way of solving a problem or calculation in
less time by using more storage space, or in very little space by spending a long time. So
if our problem takes a long time but not much memory, a space-time trade-off lets us use more
memory and solve the problem more quickly. If it can be solved very quickly but requires more memory
than we have, we can spend more time solving the problem in the limited memory.
Asymptotic

The term comes from asymptote: a line that continually approaches a given curve but does not meet it at any finite distance.

Example:-

x is asymptotic with x + 1 as shown in graph.

Asymptotic may also be defined as a way to describe the
behavior of functions in the limit or without bounds.


Types of Asymptotic Notations:-

● Big O Notation [ F(n) <= C.g(n) ]

● Big Omega Notation [ F(n) >= C.g(n) ]

● Theta Notation [ C1.g(n) <= F(n) <= C2.g(n) ]

● Little O Notation [ F(n) < C.g(n) ]

● Little Omega Notation [ F(n) > C.g(n) ]


● Big-Oh Notation (O)

It provides an asymptotic upper bound for f(n), which may or may not be tight. It is
most often used to describe worst case complexity.

Big-Oh is the formal method of expressing the upper bound of an
algorithm’s running time. It is a measure of the longest amount of
time the algorithm could possibly take to complete.

We say that f(n) is Big-O of g(n), written as f(n) = O(g(n)), iff there are
positive constants c and n0 such that

0 ≤ f(n) ≤ c g(n) for all n ≥ n0

If f(n) = O(g(n)), we say that g(n) is an upper bound on f(n).


● Big-Omega Notation (Ω)
It provides an asymptotic lower bound for f(n), which may or may not
be tight. It is most often used to describe best case
complexity.
f(n) is said to be Big-Omega of g(n), written as
f(n) = Ω(g(n)), iff there are positive constants c
and n0 such that

0 ≤ c g(n) ≤ f(n) for all n ≥ n0


If f(n) = Ω(g(n)), we say that g(n) is a lower bound
on f(n).
● Theta Notation (Θ)

We say that f(n) is Big-Theta of g(n), written as
f(n) = Θ(g(n)), iff there are positive constants c1,
c2 and n0 such that
0 ≤ c1 g(n) ≤ f(n) ≤ c2 g(n) for all n ≥ n0
Equivalently, f(n) = Θ(g(n)) if and only if f(n) = O(g(n))
and f(n) = Ω(g(n)). If f(n) = Θ(g(n)), we say that g(n) is a
tight bound on f(n).
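As a worked example of the three definitions, take f(n) = 3n + 2 and g(n) = n. The function and the constants below are chosen only for illustration; other constants would work equally well.

```latex
\begin{align*}
f(n) &= 3n + 2, \qquad g(n) = n \\
\text{Big-O:}\quad & 0 \le 3n + 2 \le 4n \ \text{for all } n \ge 2 && (c = 4,\ n_0 = 2) \\
\text{Big-Omega:}\quad & 0 \le 3n \le 3n + 2 \ \text{for all } n \ge 1 && (c = 3,\ n_0 = 1) \\
\text{Theta:}\quad & 0 \le 3n \le 3n + 2 \le 4n \ \text{for all } n \ge 2 && (c_1 = 3,\ c_2 = 4,\ n_0 = 2)
\end{align*}
```

The upper bound holds because 3n + 2 ≤ 4n exactly when n ≥ 2. Since f(n) is both O(n) and Ω(n), it is Θ(n), so n is a tight bound on 3n + 2.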
Brain Teaser!
Michelle’s mom has four
children. The first child is
named April, the second is
named May and the third is
named June. What is the
name of her fourth child?
Topics of Lecture 3

❏ Program Development Life Cycle (PDLC)
❏ Structured Programming
❏ Top Down approach in structured programming
❏ Bottom Up approach in structured programming
❏ Advantages of structured programming
❏ Disadvantages of structured programming
❏ Pointer
❏ General syntax of pointer
❏ Advantages of pointer
❏ Installing mingw and vscode
Program Development Life Cycle:-

PDLC is a systematic approach to developing programs. It breaks the job of program
development into manageable steps.

PDLC contains 7 steps:-

1. Problem Definition
2. Program Designing
3. Algorithm Development and flow charting
4. Program Coding
5. Debugging and compilation
6. Program Testing
7. Implementation and Documentation
● Problem Definition
This is the very first step to be followed during the development of a program. Before writing the program,
one must have a proper understanding of the problem.
The programmer must know what output the problem requires and what type of input must be
supplied.

● Program Designing
In this phase the solution procedure is designed and all alternatives are considered, along with all types of
input and output formats.

● Algorithm Development and Flowcharting
Once the design phase is over, the job is entirely in the hands of the programmer.
The programmer now prepares the logical design or plan of the program using various program design tools,
such as algorithms, flowcharts, decision tables and charts.

● Program Coding
Normally the language for coding the program for any project is decided in advance.
Once the algorithm is developed and tested, it is implemented in that language. This is
known as program coding.
● Debugging and Compilation

Once the programs are ready, they must be checked for all types of errors:

➔ Syntax error
➔ Logical error
➔ Execution error

Isolating errors and removing them is known as debugging. Compilation helps in removing syntax
errors.

● Program Testing

After compilation there may still be logical errors, which may lead to undesired output.
To test a program, some sample data are taken and their expected output is calculated manually.
Then the same data are given as input to the program, and its output is compared with the manual output. If the
results match, no logical error has been found for that data. After program testing is completed successfully, you move
to the next step.
● Implementation and Documentation
This phase consists of 3 major jobs.
➢ Installation
➢ Maintenance
➢ Documentation
● Installation - Loading the program at the user's site and training the user to use the software.

● Maintenance - After installation the software is evaluated regularly to confirm its usefulness
over time. Maintenance may be needed because the user discovers errors
during day-to-day use that must be removed, or because of slight
changes in user requirements that the software must be updated to meet.

● Documentation - Collecting, storing, organizing and maintaining a complete
historical record of the developed software. It includes useful comments in the program, structure
charts, decision tables, algorithms, flowcharts, etc.
Structured Programming
Structured programming is concerned with the structures used in a computer program. The programming
method used to decompose the main function into lower level components for modular coding is
called structured programming. In structured programming the program is developed as a series of
independent sections, each designed to perform only one specified task.

A structured program can be completely developed using four basic control structures.

1. Sequential
2. Conditional
3. Repetition
4. Procedures
● Sequential : This structure is composed of statements executed one after another. There is only one
entry point and one exit point.

● Conditional : It is also known as the selection structure. Here, statements are
executed depending on a certain condition.
● Repetition : It repeats a set of statements while certain conditions are met.

● Procedure : It enables us to replace a set of statements with a single statement.


Structured programming uses one of two approaches:

● Top Down approach
● Bottom Up approach
Top Down approach in structured programming
It is a disciplined approach to program design in
which top level functions are decomposed into lower level
modules for easy handling and better management. The
decomposition is done in a hierarchical manner.

It divides a large problem into smaller problems that can be
handled more easily. If a subproblem is still
complex, it must be divided further. Each subproblem is
known as a module.
Top-down programming focuses on the use of modules. It is therefore also known as modular
programming. The program is broken up into small modules so that it is easy to trace a
particular segment of code in the software program. The modules at the top level are those
that perform general tasks and call other modules to perform particular
tasks. The top-down model is followed by structured programming languages like C and Fortran.
Bottom Up approach in structured programming

In this approach, instead of starting from the top level, one starts
with the bottom level modules. The bottom level modules are
prepared and tested first, and then they are combined to form
the next higher level modules. Bottom-up programming refers
to the style of programming where an application is
constructed starting with the description of modules. The
description begins at the bottom of the hierarchy of
modules and progresses through higher levels until it
reaches the top.

Bottom-up programming is just the opposite of top-down programming. Here, the program modules
are more general and reusable than in top-down programming. The bottom-up model is based on a
composition approach and has high interactivity between the various modules.
It is mainly used by object oriented programming languages like Java and C++.
Advantages of Structured Programming :

1. Easier to read and understand


2. User Friendly
3. Easier to Maintain
4. Mainly problem based instead of being machine based
5. Development is easier as it requires less effort and time
6. Easier to Debug
7. Machine-Independent, mostly.
Disadvantages of Structured Programming Approach:

1. Since it is machine-independent, it takes time to convert into
machine code.
2. The converted machine code is not the same as that for assembly
language.
3. The program depends on changeable factors like data types.
Therefore it needs to be updated as needs change.
4. Development in this approach usually takes longer because it is
language-dependent, whereas in the case of assembly language
development takes less time because the language is fixed for the machine.
Pointer

A pointer in the C language is a variable that stores the address of another variable.
The variable pointed to can be of type int, char, array, function, or even another pointer.
General syntax of pointer

type *var-name;

* is the dereferencing operator
& is the referencing (address-of) operator

Here, type is the pointer's base type; it must be a valid C data type, and var-name is
the name of the pointer variable. The asterisk * used to declare a pointer is the
same asterisk used for multiplication. However, in this statement the asterisk is
being used to designate a variable as a pointer.

Take a look at some of the valid pointer declarations −


int *ip; /* pointer to an integer */
double *dp; /* pointer to a double */
float *fp; /* pointer to a float */
char *ch; /* pointer to a character */

With the help of * (the indirection operator), we can print the value stored at the address held in a pointer variable.
Advantages of pointer

1. Pointers reduce code and improve performance. They are used for
retrieving strings, trees, etc. and are used with arrays, structures, and functions.
2. We can return multiple values from a function using pointers.
3. They let you access memory locations directly by address, within the memory the program is allowed to use.
More About Pointer:-
Install MinGW and VS Code

MinGW - https://sourceforge.net/projects/mingw-w64/files/latest/download

VS Code - https://code.visualstudio.com/docs/?dv=win
