
DATA STRUCTURE & ALGORITHMS
(TIU-UCS-T201)

Presented by
Suvendu Chattaraj
(Department of CSE, TIU, WB)
What will we learn today? (Objective)
•What is an algorithm?
•Why study algorithms?
•What is a data structure?
•Types of data structure
•Relation between data structure and algorithm
•Algorithm:
  •Properties
  •Expression
What is an algorithm?
Al'Khwarizmi (780-850, Baghdad, Iraq) was a mathematician who wrote on Hindu-Arabic numerals. The word algorithm derives from his name.
By definition:
“A finite sequence of instructions, each of which has a clear meaning and can be
performed with a finite amount of effort in a finite length of time”…
Note:
• An algorithm must be precise enough to be understood by human beings
• To be executed by a computer, an algorithm is represented as a program (written in a rigorous formal programming language)
• Details that a program normally requires, such as declaring variables before use or stating array sizes, can be omitted from an algorithm because human readers are far more flexible in understanding than a computer
What will the following do?
Read num1, num2, num3
If (num1 < num2)
    If (num2 < num3)
        Write num1, num2, num3
    Else
        If (num3 < num1)
            Write num3, num1, num2
        Else
            Write num1, num3, num2
Else
    If (num1 < num3)
        Write num2, num1, num3
    Else
        If (num3 < num2)
            Write num3, num2, num1
        Else
            Write num2, num3, num1
What will the following do? (answer)
The pseudocode above is an algorithm that reads in three numbers and writes them all in sorted order.
Check:
1. A finite sequence of instructions
2. each of which has a clear meaning
3. can be performed with a finite amount of effort
4. in a finite length of time
Additionally:
1. precise enough to be understood by human beings (they need not be programmers / coders)
2. Free from programming language specification requirements
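As a minimal sketch, the three-number pseudocode can be turned into a C function like the one below. The names sort_three and write3 are hypothetical (they are not from the slides), and the result goes into an array instead of being printed so that it is easy to check.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: plays the role of "Write a, b, c" in the pseudocode. */
static void write3(int out[3], int a, int b, int c) {
    out[0] = a;
    out[1] = b;
    out[2] = c;
}

/* Stores num1, num2, num3 into out[] in ascending order, using the
   same nested comparisons as the pseudocode. */
void sort_three(int num1, int num2, int num3, int out[3]) {
    assert(out != NULL);
    if (num1 < num2) {
        if (num2 < num3)
            write3(out, num1, num2, num3);
        else if (num3 < num1)
            write3(out, num3, num1, num2);
        else
            write3(out, num1, num3, num2);
    } else {
        if (num1 < num3)
            write3(out, num2, num1, num3);
        else if (num3 < num2)
            write3(out, num3, num2, num1);
        else
            write3(out, num2, num3, num1);
    }
}

/* Prints the three numbers in sorted order on one line. */
void print_sorted(int num1, int num2, int num3) {
    int out[3];
    sort_three(num1, num2, num3, out);
    printf("%d %d %d\n", out[0], out[1], out[2]);
}
```

Whatever the input, at most three comparisons are made, so the function is finite, each step has a clear meaning, and it always terminates.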
Why do we need to study algorithms?
Very relevant question because…
Processor speed
Average PC Desktop (1.5 - 2.5 GHz)
Average Laptop or Macintosh (1.0 GHz)
1 GHz = 1 billion cycles per second = 10^9 cycles per second
2.5 GHz = 2.5 × 10^9 cycles per second
Each instruction takes a certain number of clock cycles
The clock speed therefore bounds the number of instructions that the processor can execute in a second
Typically, a high-end desktop x86 processor can execute a billion (10^9) instructions per second (Enormous!!!!!)
Let’s check
Computer A
• Executes 1 billion (10^9) instructions per second
• Computer A is 100 times faster than Computer B
• Runs a sorting algorithm whose running time grows like 2n^2 w.r.t. the size of the input n
• The resulting code requires 2n^2 instructions to sort n numbers
Computer B
• Executes 10 million (10 × 10^6 = 10^7) instructions per second
• Computer B is 100 times slower than Computer A
• Runs a sorting algorithm whose running time grows like 50 n lg n w.r.t. the size of the input n
• The resulting code requires 50 n lg n instructions to sort n numbers
For both: an array of 1 million numbers needs to be sorted, i.e. n = 1 million = 10^6
How much time would Computer A take? How much time would Computer B take?
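How much time would Computer A and B take?
• Computer A: 2 × (10^6)^2 instructions / 10^9 instructions per second = 2000 seconds
• Computer B: 50 × 10^6 × lg(10^6) instructions / 10^7 instructions per second ≈ 100 seconds
• By using an algorithm whose running time grows more slowly, even a 100 times slower computer (B) finishes 20 times faster than computer A!
• To sort 10 million numbers, computer A would take 2.3 days, while computer B would finish in under 20 minutes!!! WHY???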
• Algorithms that are efficient in terms of time or space will help us
to utilize computer resources wisely
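The arithmetic behind the Computer A vs. Computer B comparison can be checked with a small sketch like the one below. The function names seconds_on_a, seconds_on_b and lg are hypothetical. lg is a hand-written base-2 logarithm, used here only so that the sketch does not depend on the math library; log2() from <math.h> would normally be used.

```c
#include <assert.h>

/* Base-2 logarithm for x >= 1, computed bit by bit (assumed helper). */
double lg(double x) {
    assert(x >= 1.0);
    double result = 0.0;
    while (x >= 2.0) {            /* integer part of the logarithm */
        x /= 2.0;
        result += 1.0;
    }
    double bit = 0.5;
    for (int i = 0; i < 40; i++) { /* 40 fractional bits */
        x *= x;
        if (x >= 2.0) {
            x /= 2.0;
            result += bit;
        }
        bit /= 2.0;
    }
    return result;
}

/* Computer A: 2n^2 instructions at 10^9 instructions per second. */
double seconds_on_a(double n) {
    return 2.0 * n * n / 1e9;
}

/* Computer B: 50 n lg n instructions at 10^7 instructions per second. */
double seconds_on_b(double n) {
    return 50.0 * n * lg(n) / 1e7;
}
```

For n = 10^6 this gives 2000 seconds for A and just under 100 seconds for B. For n = 10^7 it gives about 2.3 days for A and about 19 minutes for B, because the n^2 term grows 100 times when n grows 10 times, while n lg n grows only a little more than 10 times.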
What is a data structure?
A few definitions:
•Data Type: the data type of a variable is the set of values that the
variable may assume (e.g. int, char, float in C)
•Abstract Data Type (ADT): a set of elements with a collection
of well defined operations - defines operations and results,
but not how they're implemented (eg. list, stack, queue, set,
tree, graph…)
•Data Structures:
•An implementation of an ADT;
•A way of arranging data in a computer's memory so that
items can be stored and retrieved conveniently
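To illustrate the difference between an ADT and a data structure, here is a minimal sketch of a stack. The four operations are the ADT (what a stack does); the fixed-size array inside the struct is one possible data structure implementing it. All names (Stack, STACK_CAPACITY, stack_push, ...) are assumptions made for this example.

```c
#include <assert.h>
#include <stdbool.h>

#define STACK_CAPACITY 100   /* assumed maximum size for this sketch */

/* One possible implementation of the stack ADT: an array plus a counter. */
typedef struct {
    int items[STACK_CAPACITY];
    int top;                 /* number of elements currently stored */
} Stack;

void stack_init(Stack *s) {
    s->top = 0;
}

bool stack_is_empty(const Stack *s) {
    return s->top == 0;
}

/* Returns false if the stack is full, true otherwise. */
bool stack_push(Stack *s, int value) {
    if (s->top == STACK_CAPACITY)
        return false;
    s->items[s->top++] = value;
    return true;
}

/* Removes and returns the most recently pushed value (LIFO). */
int stack_pop(Stack *s) {
    assert(!stack_is_empty(s));
    return s->items[--s->top];
}
```

Code that only calls these four functions would keep working if the array were replaced by a linked list, which is the point of separating the ADT from its implementation.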
Types of data structure – very important
Based on how the data is conceptually organized
• Linear Data Structure
  • a collection that stores its entries in a linear sequence, and in which entries may be added or removed at will.
  • linear structures differ in the restrictions they place on how these entries may be added, removed, or accessed (LIFO vs. FIFO)
  • data elements can be traversed in a single run
  • stacks, queues and linked lists are examples of linear data structures
• Nonlinear Data Structure
  • data entries are not arranged in a sequence, but according to different rules
  • data elements can NOT be traversed in a single run
  • trees and graphs are examples of nonlinear data structures
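The traversal difference can be sketched as follows, assuming a singly linked list as the linear structure and a binary tree as the nonlinear one. ListNode, TreeNode, list_sum and tree_sum are hypothetical names for this example. The list is visited with one loop that follows a single "next" link, while each tree node branches, so the traversal has to come back and explore the other branch (done here with recursion).

```c
#include <assert.h>
#include <stddef.h>

typedef struct ListNode {
    int value;
    struct ListNode *next;          /* exactly one successor */
} ListNode;

typedef struct TreeNode {
    int value;
    struct TreeNode *left, *right;  /* up to two successors */
} TreeNode;

/* Linear: a single pass from head to tail visits every element. */
int list_sum(const ListNode *head) {
    int sum = 0;
    for (const ListNode *p = head; p != NULL; p = p->next)
        sum += p->value;
    return sum;
}

/* Nonlinear: after finishing the left subtree the traversal must
   return to the node and continue into the right subtree. */
int tree_sum(const TreeNode *root) {
    if (root == NULL)
        return 0;
    return root->value + tree_sum(root->left) + tree_sum(root->right);
}
```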
Relation between data structure and algorithm
•Data structures and algorithms together help produce good-quality
computer programs.
•Their role is brought out clearly in the following diagram
(Aho, Hopcroft, and Ullman (1983))

Mathematical Model → Abstract Data Type → Data Structure
Informal Algorithm → Pseudo Language Program → Program in C or Java or …
Properties of algorithms - important
An algorithm must possess the following properties:
•finiteness: algorithm must always terminate after a finite number of
steps
•definiteness: each step of an algorithm must be precisely defined; the
actions to be carried out must be rigorously and unambiguously
specified for each case.
•input: an algorithm has zero or more inputs (quantities which are given
to it for processing), taken from a specified set of objects
•output: an algorithm has one or more outputs, which have a specified
relation to the inputs.
•effectiveness: all operations to be performed must be sufficiently basic
that they can be done exactly and in a finite length of time by a person
using paper and pencil (doable)
Expressing algorithm
Many ways:
1. Natural language
✔ ambiguous
✔ Vaguely defined
2. Flowchart
✔ very specific set of symbols
✔ tough to design if problem is complicated
3. Pseudocode
✔ combination of natural language and a programming language syntax
✔ most technical way to express an algorithm
✔ syntax independent, as algorithms are not run on computers
✔ ….
4. Programs
Problem, Algorithm and Program
A problem is a task to be accomplished on a computer that maps some inputs
to outputs. An algorithm solves a problem for every possible instance.
A program is an implementation of an algorithm in a programming
language, using some appropriate data structure.

Problem P1 → (one-to-many relation) → Algorithms A1, A2, …, An → (one-to-many relation) → Programs Pr1, Pr2, …, Prm
Performance analysis
Algorithms are compared based on the relative amount of
time or space they require; the analysis specifies how the
time / space requirement grows as a function of the input
size.
– Time Complexity: Running time of the program as a function
of the size of input
– Space Complexity: Amount of computer memory required
during the program execution, as a function of the input size

Will continue in the next class…


What do we measure / count to analyze time complexity?
•The analysis can be done in two ways:
1. By implementing the algorithm and profiling it on a particular machine – a measure of the actual execution time of a program
   • This approach depends heavily on factors like the program, the input to the program, the programming language, compiler, machine, operating system, etc. Generalizing a profiling result is tough.
2. By estimating the time and space needed by an algorithm with paper and pencil – time complexity as a function of input length
   • Independent of language, compiler, machine, operating system, etc.
   • This is a rough analysis which shows "trends" rather than providing specific execution times
   • Knowing these trends is enough to understand the behavior of the algorithm
Check…
•Searching is a basic problem in computer science
•Given a set of ‘n’ elements <a1, a2, … , an> and another value
key, searching is the problem of finding whether the value key is
present in the set <a1, a2, … , an>
•A searching algorithm scans the list and returns the
position where the value key is found for the first time.
It returns a "not found" value if the key is NOT present in the set
(NULL here; the linear search pseudocode uses -1)
•A few popular searching algorithms
  •Linear search
  •Binary search
  •Interpolation search
Linear search

The algorithm (Pseudocode)

Linear_Search(A[1…n], key)
1. for i = 1 to n
2.     do if(A[i] == key)
3.         then return i
4. return(-1)

Implementation in C, with timing:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main() {
    // to store execution time of code
    double time_spent = 0.0;
    int arr[100], key, i;

    srand(time(0));
    for (i = 0; i < 100; i++)
        arr[i] = rand() % 150;   // random data in the range 0..149
    key = rand() % 150;          // random key in the same range

    clock_t begin = clock();
    for (i = 0; i < 100; i++) {
        if (arr[i] == key) {
            printf("\n%d found in position %d", key, i);
            break;
        }
    } // End of for
    if (i == 100)                // loop finished without a match
        printf("%d is not present in the array", key);
    clock_t end = clock();

    time_spent += (double)(end - begin) / CLOCKS_PER_SEC;
    printf("\nTime elapsed is %f seconds", time_spent);
    return 0;
}

Typical outputs:

34 is not present in the array
Time elapsed is 0.000090 seconds

66 found in position 35
Time elapsed is 0.000087 seconds
Implementation and profiling of linear search algorithm
• Linear search function implemented; random data and a random key value are generated from the main
function; the dimension n of the data takes the values of i in for(i=100000; i <= 1000000; i = i + 50000)
• data[1…n] and key are passed to the linear search function; execution times are recorded
n Program execution time
100000 3.344297171
150000 5.028327942
200000 6.663085699
250000 8.314799309
300000 10.03962421
350000 11.66102529
400000 13.31302595
450000 14.96860862
500000 16.60722566
550000 18.32366061
600000 19.98621774
650000 21.55535412
700000 23.40171409
750000 24.9976809
800000 26.50400424
850000 28.16081691
900000 29.89888048
950000 31.62698221
1000000 33.29020047
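A table like the one above could be produced with a sketch along the following lines. This is not the code used for the slides: the names linear_search and time_linear_search, the repeats parameter, and the choice of a key that is never present (so every search scans all n elements) are assumptions made for this example, whereas the slides describe a random key.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Returns the index of the first occurrence of key in data[0..n-1], or -1. */
int linear_search(const int *data, int n, int key) {
    for (int i = 0; i < n; i++)
        if (data[i] == key)
            return i;
    return -1;
}

/* Fills an array with n random values, runs `repeats` worst-case
   searches and returns the CPU time used in seconds
   (or -1.0 if the array cannot be allocated). */
double time_linear_search(int n, int repeats) {
    assert(n > 0 && repeats > 0);
    int *data = malloc((size_t)n * sizeof *data);
    if (data == NULL)
        return -1.0;
    for (int i = 0; i < n; i++)
        data[i] = rand() % 150;        /* values in 0..149 */

    volatile int sink = 0;  /* keeps the searches from being optimized away */
    clock_t begin = clock();
    for (int r = 0; r < repeats; r++)
        sink += linear_search(data, n, -1);  /* -1 is never in the data */
    clock_t end = clock();

    free(data);
    return (double)(end - begin) / CLOCKS_PER_SEC;
}
```

Calling time_linear_search(n, repeats) for n = 100000, 150000, …, 1000000 and printing each result gives one row per n. The absolute numbers depend on the machine, compiler and repeats value, so only the trend is comparable.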
Generalization of profiling result is tough!!!
Important:
•Lack of knowledge about the hardware platform on which the program will run, and about the actual size and distribution of the input, makes ‘profiling’ based analysis difficult
•Paper and pencil based estimation of running time is the other way out; it provides a rough idea of the behavior of an algorithm, which is sufficient to understand how the algorithm behaves in different situations
How?
Index: 1 2 3  4 5 6 7 8 9 10
A:     2 5 10 3 8 9 7 4 6 1

Linear_Search(A[1…n], key)
1. for i = 1 to n
2.     do if(A[i] == key)
3.         then return i
4. return(-1)

Following calls require:
Linear_Search(A[1…10], 1) => 10 comparisons
Linear_Search(A[1…10], 11) => 10 comparisons
Linear_Search(A[1…10], 2) => 1 comparison
Linear_Search(A[1…10], 8) => 5 comparisons
Linear_Search(A[1…10], 9) => 6 comparisons

The comparison “if(A[i] == key)” inside for - loop will run
• n times in worst possible situation – linear in terms of input size n
  • Element found in the last position
  • Element NOT found
• Constant number of times in best possible situation
  • Element found in first position
• n/2 times on an average – linear in terms of input size n
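The comparison counts listed for the ten-element array can be reproduced with a small sketch. linear_search_counted is a hypothetical variant of Linear_Search that also reports how many times the comparison ran. Note that C arrays are 0-based, so the returned index is one less than the position used in the pseudocode.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the 0-based index of the first occurrence of key in a[0..n-1],
   or -1 if it is absent. The number of "a[i] == key" comparisons
   performed is stored in *comparisons. */
int linear_search_counted(const int *a, int n, int key, int *comparisons) {
    assert(comparisons != NULL);
    *comparisons = 0;
    for (int i = 0; i < n; i++) {
        (*comparisons)++;
        if (a[i] == key)
            return i;
    }
    return -1;
}
```

With the array 2 5 10 3 8 9 7 4 6 1, searching for 2 costs 1 comparison (best case), while searching for 1 (last position) or 11 (absent) costs 10 comparisons (worst case).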
Sufficient to understand the behavior of the algorithm
Linear_Search(A[1…n], key)
1. for i = 1 to n
2.     do if(A[i] == key)
3.         then return i
4. return(-1)

This loop will run
• n times in worst possible situation - linear
• Constant number of times in best possible situation
• n/2 times on an average - linear

Does the analysis correspond to the graph?
Questions
please…
