1st Feb 2014:
- Data structures: conceptual and concrete ways to organize data for efficient storage and
efficient manipulation
- Employment of these data structures in the design of efficient algorithms
Topics Covered:
• Introduction to Data Structure
• Abstract Data Type (ADT)
• Introduction to Algorithm Analysis and Design
Why Study Data Structures:
Any organization has a collection of records that can be searched, processed in any order, or modified.
o Data structures organize data:
• Good choice: more efficient programs
• Bad choice: poor program performance
– The choice can make the difference between a program running in a
few seconds or in many days
o Characteristics of a problem’s solution
• Efficient: if it solves the problem within its resource constraints
– time
– space
• Cost: amount of resources a solution consumes
Costs & Benefits :
A data structure requires a certain amount of:
• space for each data item it stores
• time to perform a single basic operation
• programming effort.

Selecting a Data Structure :
Select a data structure as follows:
1. Analyze the problem to determine the resource constraints a solution must meet.
2. Determine the basic operations that must be supported. Quantify the resource constraints
for each operation.
3. Select the data structure that best meets these requirements.
Abstract Data Type :
o A logical view of the data objects together with specifications of the operations required to
create and manipulate them.
• Describe an algorithm – pseudo-code
• Describe a data structure – ADT
o A data structure is the physical implementation of an ADT
• Each ADT operation is implemented by one or more subroutines.
• Data structures are used to organize data in main memory
• An Abstract Data Type (ADT) is a simple or structured data type whose implementation details are
hidden.
• An abstract data type is not part of a program, because a program written in a programming
language requires the definition of a data structure, not only the operations on the data structure.
• An object-oriented language (OOL) such as C++ has a direct link to abstract data types by
implementing them as a class.
Data Type :
• Set of values
• Operations that can be performed on those values
– Ex: short int
• can take values from -32768 to 32767 (for a 16-bit short int)
• Operations are +, -, ×, /

Data Type Classification:
[Figure: classification tree of data types. Recoverable labels: Data Types, Primitive, Non-Primitive,
int, float, short int, User Defined, Reference, Lang. Defined, arrays, strings, structure, pointers,
class, ADT.]

ADT :
Both an interface and an implementation
• Interface and implementation are independent
• Interface defines
– The type of the data stored
– Operations that are performed on the data
– parameters of each operation
• Implementation defines
– data organization
– developing efficient algorithm for each operation

Basic Operations of ADT :
• insert(S, x)
• delete(S, x)
• search(S, x)
• findMin(S)
• findMax(S)
• findSuccessor(S, x)
• findPredecessor(S, x)
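
As a rough illustration of the operations listed above, the sketch below implements them on a sorted Python list using the standard bisect module. The class name SortedSetADT and the method names are assumptions made for this example only; trees or hash tables could sit behind the same interface with different costs.

```python
import bisect


class SortedSetADT:
    """Hypothetical dynamic-set ADT backed by a sorted list (no duplicates)."""

    def __init__(self):
        self._items = []  # always kept in ascending order

    def insert(self, x):
        i = bisect.bisect_left(self._items, x)
        if i == len(self._items) or self._items[i] != x:
            self._items.insert(i, x)

    def delete(self, x):
        i = bisect.bisect_left(self._items, x)
        if i < len(self._items) and self._items[i] == x:
            del self._items[i]

    def search(self, x):
        i = bisect.bisect_left(self._items, x)
        return i < len(self._items) and self._items[i] == x

    def find_min(self):
        return self._items[0] if self._items else None

    def find_max(self):
        return self._items[-1] if self._items else None

    def find_successor(self, x):
        # Smallest stored value strictly greater than x, or None.
        i = bisect.bisect_right(self._items, x)
        return self._items[i] if i < len(self._items) else None

    def find_predecessor(self, x):
        # Largest stored value strictly less than x, or None.
        i = bisect.bisect_left(self._items, x)
        return self._items[i - 1] if i > 0 else None
```

The caller sees only the operations; the sorted list behind them could be swapped for another structure without changing the calling code.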
Classification of ADT :
• Linear
    • Arrays
    • Linked list
        – Circular list
        – Doubly linked list
    • Stack
    • Queue
        – Circular queue
        – Priority queue
• Non-linear
    • Trees
        – Binary Trees and Types
        – Binary Search Trees and Variants
        – Threaded Binary Trees
        – Heaps
    • Graphs
        – Undirected
        – Directed
    • Hash Tables
ADT Summary :
• Standard data collection organizations (Data Structures) with desired operations
• Described by an interface
• Many implementations are possible
• Facilitate reuse and easy extensibility
• Design issues are Time and Space Complexity
Problems, Algorithms and Programs
Programmers deal with:
– problems,
– algorithms and
– computer programs.
Problem: a task to be performed.
– Best thought of as inputs and matching outputs.
– Problem definition should include constraints on the resources that may be
consumed by any acceptable solution.
Problems as mathematical functions:
– A function is a matching between inputs (the domain) and outputs (the range).
– An input to a function may be a single number or a collection of information.
– The values making up an input are called the parameters of the function.
– A particular input must always result in the same output every time the function is
computed.
Algorithm: a method or a process followed to solve a problem.
– A recipe: The algorithm gives us a “recipe” for solving the problem by performing a
series of steps, where each step is completely understood and can be implemented.
An algorithm takes the input to a problem (function) and transforms it to the output.
– A mapping of input to output.
A problem can be solved by many algorithms.
For example, the problem of sorting can be solved by the following algorithms:
• Insertion sort
• Bubble sort
• Selection sort
• Shellsort
• Mergesort
An algorithm possesses the following properties:
– It must be correct.
– It must be composed of a series of concrete steps.
– There can be no ambiguity as to which step will be performed next.
– It must be composed of a finite number of steps.
– It must terminate.
A computer program is an instance, or concrete representation, of an algorithm in some programming
language.
Algorithm Design Techniques
The design of algorithms is also an important focus.
Types of algorithms:
• Greedy algorithms
• Divide and Conquer
• Dynamic programming
• Randomized algorithms
• Backtracking
Algorithm Analysis:
Predict the amount of resources required:
• Memory: how much space is needed?
• Computational time: how fast does the algorithm run?
FACT: running time grows with the size of the input
Input size (number of elements in the input)
– Size of an array, polynomial degree, # of elements in a matrix, # of bits in the binary
representation of the input, vertices and edges in a graph
Def: Running time = the number of primitive operations (steps) executed before termination
Running time is expressed as T(n) for some function T on input size n.
Two approaches to obtaining running time:
– Measuring under standard benchmark conditions.
– Estimating the algorithm's performance
Estimation is based on:
– The “size” of the input
– The number of basic operations
The time to complete a basic operation does not depend on the value of its operands.
List ADT :
• A list is an ordered sequence of elements
• A list has the property length (count of elements)
• The elements are arranged consecutively.
• Can be implemented as static (array implementation) or dynamic (linked list implementation)
Array
Fundamental data structure
• Homogeneous collection of values
• store values sequentially in memory
• associate INDEX with each value
• use array name and index to quickly access the value.
• efficient method for working with large collection of data.
• An array can be
• Single-dimensional
• Multi-dimensional
Array Memory Layout:
• The index in a one-dimensional array directly defines the relative position of the element in
actual memory.
• A two-dimensional array is stored in memory using row-major or column-major storage
Operations on Array:
• The common operations on arrays are searching, insertion, deletion and traversal.
• An array is more suitable when the number of deletions and insertions is small, but a lot of
searching and retrieval activities are expected.
Pros and cons:
Advantages
• Simple and easy to use
• Faster Access to elements (Constant time random access)
Disadvantages
• Fixed size
• Inefficient insertions and deletions
Ques: A two-dimensional array students is stored in memory. The array is 100 × 4 (100 rows
and 4 columns). Find the address of the element students[5][3], assuming that the element
students[1][1] is stored at memory address 1000 and each element occupies only one memory
location. The computer uses row-major storage.
Solution: Assuming each element occupies one memory location, the address y of element [i][j] is
y = x + cols × (i - 1) + (j - 1)
where x is the address of the first element and cols is the number of columns.
With x = 1000, cols = 4, i = 5 and j = 3: y = 1000 + 4 × 4 + 2 = 1018.
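
The address formula can be checked with a small sketch. The function names and the elem_size parameter (bytes or locations per element) are assumptions added for illustration; the column-major variant is included only for comparison.

```python
def row_major_address(base, cols, i, j, elem_size=1):
    """Address of element [i][j] (1-based indices) under row-major storage."""
    return base + elem_size * (cols * (i - 1) + (j - 1))


def column_major_address(base, rows, i, j, elem_size=1):
    """Address of element [i][j] (1-based indices) under column-major storage."""
    return base + elem_size * (rows * (j - 1) + (i - 1))


# The 100 x 4 array from the question, first element at address 1000.
print(row_major_address(1000, cols=4, i=5, j=3))        # 1018
print(column_major_address(1000, rows=100, i=5, j=3))   # 1204
```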
Linked Lists:
• A linked list is a collection of data in which each element contains the location of the next
element.
• Each element (node) contains two parts: data and link.
• Head: pointer to the first node. The name of the list is the same as the name of this pointer
variable.
• The link of the last node is NULL
Operations on Linked List :
• Search
• Insertion
• Deletion
• Traversal
Search operation:
Insertion:
Four cases can arise:
• Inserting into an empty list.
• Insertion at the beginning of the list.
• Insertion at the end of the list.
• Insertion in the middle of the list.
Deletion:
Two cases can arise:
• Deleting the first node.
• Deleting any other node.
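
The insertion and deletion cases can be sketched as a minimal singly linked list. The class and method names are hypothetical, positions are 0-based, and an out-of-range position is treated as "insert at the end"; these are choices made for the example, not something the notes prescribe.

```python
class Node:
    def __init__(self, data, next=None):
        self.data = data
        self.next = next  # link to the next node, None for the last node


class LinkedList:
    def __init__(self):
        self.head = None  # an empty list has no nodes

    def insert(self, index, data):
        """Insert data so that it ends up at position index (0-based)."""
        if index == 0 or self.head is None:
            # Covers the empty list and insertion at the beginning.
            self.head = Node(data, self.head)
            return
        prev = self.head
        # Walk to the predecessor, stopping at the last node if index is too large.
        for _ in range(index - 1):
            if prev.next is None:
                break
            prev = prev.next
        # Covers insertion in the middle and at the end.
        prev.next = Node(data, prev.next)

    def delete(self, data):
        """Remove the first node holding data; return True if one was removed."""
        if self.head is None:
            return False
        if self.head.data == data:        # case 1: deleting the first node
            self.head = self.head.next
            return True
        prev = self.head
        while prev.next is not None:      # case 2: deleting any other node
            if prev.next.data == data:
                prev.next = prev.next.next
                return True
            prev = prev.next
        return False

    def search(self, data):
        node = self.head
        while node is not None:
            if node.data == data:
                return True
            node = node.next
        return False

    def to_list(self):
        """Traverse the list and collect the data fields in order."""
        out = []
        node = self.head
        while node is not None:
            out.append(node.data)
            node = node.next
        return out
```

Every operation except insertion at the head walks the links from the head, which is why access is sequential rather than constant time.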
Linked List advantages and disadvantages:

Advantages
• Dynamic size
• Efficient insertions and deletions
Disadvantages
• Not so simple
• Sequential access
Linked List Application:
• It is a dynamic data structure in which the list can start with no nodes and then grow as new
nodes are needed
• It is a suitable structure if a large number of insertions and deletions are needed, but searching a
linked list is slower than searching an array.
• It is a very efficient data structure for a sorted list that will go through many insertions and
deletions
Linked List Operations:
Comparisons of linked list and Array


Variations of linked list :
Singly linked list: each node holds only a reference to the next node, and the list is reached through
the head pointer.
Doubly linked list: each node holds references to both the next and the previous node, thus allowing
traversal in both directions. The list typically keeps both a head and a tail pointer.
Circular linked list: a linked list whose last node holds a reference to the first node.
Try:
• How many pointers are contained as data members in the nodes of a circular, doubly linked list
of integers with five nodes?
• If the addresses of A[1][1] and A[2][1] are 1000 and 1010 respectively and each element occupies
2 bytes then the array has been stored in _________ order.
• The operation of processing each element in the list is known as _____________
8th Feb 2014:
Topics:
• Arrays
• Linked Lists
• Stacks
• Queues
Stacks:
• A stack is a restricted linear list in which all additions and deletions are made at one end, the
top. (LIFO)
Operations on stack ADT:
• No search
• No adding in arbitrary positions
• No sorting
• No access to anything beyond the top element.
• Stack: stack(stackName)
• Push: push(stackName, dataItem)
• Pop: pop(stackName, dataItem)
• Empty: empty(stackName)
Stack ADT implementation:
• A stack ADT can be implemented using either an array or a linked list.
Stack Array Implementation:
createStack(S): Define an array S of some fixed size N
                top ← -1
push(x, S): if top = N - 1 then error
            else top ← top + 1
                 S[top] ← x
stackEmpty(S): return (top < 0)
pop(S): if stackEmpty(S) then error
        else item ← S[top]
             top ← top - 1
             return (item)
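
A minimal Python sketch of the array implementation follows. The class name ArrayStack and the choice of exceptions for the error cases are assumptions; the pseudocode only says "error".

```python
class ArrayStack:
    """Fixed-capacity stack that mirrors the array pseudocode (0-based indices)."""

    def __init__(self, capacity):
        self._data = [None] * capacity  # array S of fixed size N
        self._top = -1                  # index of the top element; -1 means empty

    def is_empty(self):
        return self._top < 0

    def push(self, x):
        if self._top == len(self._data) - 1:
            raise OverflowError("stack is full")
        self._top += 1
        self._data[self._top] = x

    def pop(self):
        if self.is_empty():
            raise IndexError("pop from empty stack")
        item = self._data[self._top]
        self._top -= 1
        return item
```

Both push and pop touch only the top index, so each takes constant time.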
Application of stack
• Expression Evaluation
• Function calls
• Memory Management (Run time Environment)
• Backtracking
• Parenthesis Matching
Expression Evaluation:
The three Notations of Expressions are:
• Infix: a + b
• Postfix (RPN): a b +
• Prefix (PN): + a b
Conversion from Infix to Postfix:
1. Print operands as they arrive.
2. If the stack is empty or contains a left parenthesis on top, push the incoming
operator onto the stack.
3. If the incoming symbol is a left parenthesis, push it on the stack.
4. If the incoming symbol is a right parenthesis, pop the stack and print the
operators until you see a left parenthesis. Discard the pair of parentheses.
5. If the incoming symbol has higher precedence than the top of the stack, push it
on the stack.
6. If the incoming symbol has equal precedence with the top of the stack, use
associativity. If the associativity is left to right, pop and print the top of the stack and
then push the incoming operator. If the associativity is right to left, push the
incoming operator.
7. If the incoming symbol has lower precedence than the symbol on the top of the
stack, pop the stack and print the top operator. Then test the incoming operator
against the new top of stack.
8. At the end of the expression, pop and print all operators on the stack. (No
parentheses should remain.)
Ex:
Infix arithmetic expression: a + b * c - d
Symbol read    Output           opStack
a              a                empty
+              a                +
b              a b              +
*              a b              + *
c              a b c            + *
-              a b c *          +        (* has higher precedence than -: pop and print *)
               a b c * +        empty    (+ has equal precedence, left to right: pop and print +)
               a b c * +        -        (push -)
d              a b c * + d      -
end            a b c * + d -    empty
Ex (infix → postfix):
• a/b^c-d → abc^/d-
• a-b+c → ab-c+
• a*(b+c) → abc+*
• a * (b + c * d) + e → a b c d * + * e +
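
The conversion rules can be sketched in Python as below. This is an illustrative sketch under stated assumptions: operands are single letters or digits, the input is well formed with balanced parentheses, subtraction is written with the ASCII "-", and "^" is treated as right associative. The names PRECEDENCE, RIGHT_ASSOCIATIVE and infix_to_postfix are made up for the example.

```python
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2, "^": 3}
RIGHT_ASSOCIATIVE = {"^"}


def infix_to_postfix(expression):
    """Convert an infix string with single-character operands to postfix."""
    output = []
    stack = []
    for symbol in expression.replace(" ", ""):
        if symbol.isalnum():
            output.append(symbol)             # rule 1: print operands
        elif symbol == "(":
            stack.append(symbol)              # rule 3
        elif symbol == ")":
            while stack[-1] != "(":           # rule 4: pop until "("
                output.append(stack.pop())
            stack.pop()                       # discard the "("
        else:
            # Rules 6 and 7: pop while the top operator has higher precedence,
            # or equal precedence and the incoming operator is left associative.
            while stack and stack[-1] != "(" and (
                PRECEDENCE[stack[-1]] > PRECEDENCE[symbol]
                or (PRECEDENCE[stack[-1]] == PRECEDENCE[symbol]
                    and symbol not in RIGHT_ASSOCIATIVE)
            ):
                output.append(stack.pop())
            stack.append(symbol)              # rules 2 and 5: push the operator
    while stack:                              # rule 8: flush remaining operators
        output.append(stack.pop())
    return "".join(output)


print(infix_to_postfix("a + b * c - d"))  # abc*+d-
```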
Queue ADT:
• A queue is a linear list in which data can only be inserted at one end, called the rear, and deleted
from the other end, called the front (FIFO).
Operations on Queue ADT:
• Queue
• Enqueue
• Dequeue
• Empty
Queue Implementation:
• A queue ADT can be implemented using either an array or a linked list
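
One possible array implementation is a circular buffer, sketched below. The class name ArrayQueue, the fixed capacity and the exception types are assumptions for the example; a linked list with head and tail pointers would serve equally well.

```python
class ArrayQueue:
    """Fixed-capacity FIFO queue stored in a circular array."""

    def __init__(self, capacity):
        self._data = [None] * capacity
        self._front = 0  # index of the oldest element
        self._size = 0   # number of stored elements

    def is_empty(self):
        return self._size == 0

    def enqueue(self, x):
        if self._size == len(self._data):
            raise OverflowError("queue is full")
        # The rear slot wraps around to the start of the array when needed.
        rear = (self._front + self._size) % len(self._data)
        self._data[rear] = x
        self._size += 1

    def dequeue(self):
        if self.is_empty():
            raise IndexError("dequeue from empty queue")
        item = self._data[self._front]
        self._data[self._front] = None
        self._front = (self._front + 1) % len(self._data)
        self._size -= 1
        return item
```

The wrap-around index lets both enqueue and dequeue run in constant time without shifting elements.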
Application of Queue ADT:
• For implementing any "natural" FIFO service, like telephone enquiries, reservation requests,
traffic flow, etc.
• For implementing any "computational" FIFO service, for instance, to access some resources.
Examples: printer queues, disk queues, etc.
• For searching in special data structures (breadth-first search in graphs and trees).
• For handling scheduling of processes in a multitasking operating system.


22nd Feb 2014:
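• O-notation: intuitively, O(g(n)) = the set of functions with a smaller or same order of growth
as g(n). Formally, f(n) = O(g(n)) if there are constants c > 0 and n0 such that f(n) <= c g(n) for
all n >= n0.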












Examples :
3n + 2 = O(n) ; 3n + 2 <= 4n for all n >= 2
3n + 3 = O(n) ; 3n + 3 <= 4n for all n >= 3
100n + 6 = O(n) ; 100n + 6 <= 101n for all n >= 6
10n^2 + 4n + 2 = O(n^2) ; 10n^2 + 4n + 2 <= 11n^2 for all n >= 5
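
Each example pairs a bound with a constant multiplier and a starting value of n. The sketch below spot-checks such pairs numerically over a finite range; the helper name holds_for_range and the upper limit n_max are assumptions, and a finite check is only a sanity test, not a proof.

```python
def holds_for_range(f, g, c, n0, n_max=10_000):
    """Check f(n) <= c * g(n) for every integer n in [n0, n_max]."""
    return all(f(n) <= c * g(n) for n in range(n0, n_max + 1))


# 3n + 2 <= 4n for all n >= 2
print(holds_for_range(lambda n: 3 * n + 2, lambda n: n, c=4, n0=2))  # True

# 10n^2 + 4n + 2 <= 11n^2 holds from n = 5 but fails at n = 4
quadratic = lambda n: 10 * n**2 + 4 * n + 2
print(holds_for_range(quadratic, lambda n: n**2, c=11, n0=5))  # True
print(holds_for_range(quadratic, lambda n: n**2, c=11, n0=4))  # False
```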

• Ω-notation: intuitively, Ω(g(n)) = the set of functions with a larger or same order of growth as
g(n)

Examples :
3n + 2 = Ω(n) ; 3n + 2 >= 3n for all n >= 1
3n + 3 = Ω(n) ; 3n + 3 >= 3n for all n >= 1
100n + 6 = Ω(n) ; 100n + 6 >= 100n for all n >= 1

• Θ-notation: intuitively, Θ(g(n)) = the set of functions with the same order of growth as g(n)

Example :
3n + 3 = Θ(n)
3n + 3 <= 6n for all n >= 1, c2 = 6
3n + 3 >= 3n for all n >= 1, c1 = 3
3n <= 3n + 3 <= 6n for all n >= 1, so 3n + 3 = Θ(n)


Sorting:
• Iterative methods:
– Insertion sort
– Bubble sort
– Selection sort
• Divide and conquer:
– Merge sort
– Quicksort
• Non-comparison methods:
– Counting sort
– Radix sort
– Bucket sort

Insertion Sort
Alg.: INSERTION-SORT(A)
for j ← 2 to n
    do key ← A[j]
       // Insert A[j] into the sorted sequence A[1 . . j - 1]
       i ← j - 1
       while i > 0 and A[i] > key
           do A[i + 1] ← A[i]
              i ← i - 1
       A[i + 1] ← key
Insertion sort – sorts the elements in place
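
A direct Python transcription of INSERTION-SORT is sketched below. It uses 0-based indices, so the outer loop starts at index 1 and the inner test is i >= 0; returning the list is a convenience added for the example.

```python
def insertion_sort(a):
    """Sort the list a in place and return it."""
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        # Shift elements of the sorted prefix a[0..j-1] that are larger than key.
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    return a


print(insertion_sort([5, 2, 4, 6, 1, 3]))  # [1, 2, 3, 4, 5, 6]
```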





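Analysis of Insertion Sort:
INSERTION-SORT(A)                                                  cost   times
for j ← 2 to n                                                     c1     n
    do key ← A[j]                                                  c2     n - 1
       // Insert A[j] into the sorted sequence A[1 . . j - 1]      0      n - 1
       i ← j - 1                                                   c4     n - 1
       while i > 0 and A[i] > key                                  c5     sum over j = 2..n of t_j
           do A[i + 1] ← A[i]                                      c6     sum over j = 2..n of (t_j - 1)
              i ← i - 1                                            c7     sum over j = 2..n of (t_j - 1)
       A[i + 1] ← key                                              c8     n - 1
t_j = number of times the while loop test is executed for that value of j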



Best Case Analysis :

The array is already sorted
– A[i] ≤ key the first time the while loop test is run (when i = j - 1)
– $t_j = 1$
$T(n) = c_1 n + c_2 (n - 1) + c_4 (n - 1) + c_5 (n - 1) + c_8 (n - 1)$
$\quad = (c_1 + c_2 + c_4 + c_5 + c_8) n - (c_2 + c_4 + c_5 + c_8)$
$\quad = an + b = \Theta(n)$

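Worst Case Analysis :
The array is in reverse sorted order
– Always A[i] > key in the while loop test
– Have to compare key with all elements to the left of the j-th position ⇒ compare
with j - 1 elements ⇒ $t_j = j$

$T(n) = c_1 n + c_2 (n - 1) + c_4 (n - 1) + c_5 \sum_{j=2}^{n} t_j + c_6 \sum_{j=2}^{n} (t_j - 1) + c_7 \sum_{j=2}^{n} (t_j - 1) + c_8 (n - 1)$

Substituting $t_j = j$, with $\sum_{j=2}^{n} j = \frac{n(n+1)}{2} - 1$ and $\sum_{j=2}^{n} (j - 1) = \frac{n(n-1)}{2}$, gives a
quadratic: $T(n) = an^2 + bn + c = \Theta(n^2)$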
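
The best-case and worst-case values of t_j can be observed by counting how often the while loop test runs. The sketch below instruments a 0-based insertion sort for that purpose; the function name count_while_tests is an assumption for the example.

```python
def count_while_tests(values):
    """Return the total number of while-loop tests insertion sort performs."""
    a = list(values)  # work on a copy
    tests = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while True:
            tests += 1  # one evaluation of the loop condition
            if not (i >= 0 and a[i] > key):
                break
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    return tests


n = 5
print(count_while_tests(range(1, n + 1)))   # 4  = n - 1 (already sorted)
print(count_while_tests(range(n, 0, -1)))   # 14 = n(n + 1)/2 - 1 (reverse sorted)
```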
Selection Sort
Alg.: SELECTION-SORT(A)
n ← length[A]
for j ← 1 to n - 1
    do smallest ← j
       for i ← j + 1 to n
           do if A[i] < A[smallest]
                 then smallest ← i
       exchange A[j] ↔ A[smallest]
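
SELECTION-SORT translates to Python almost line for line. The sketch below uses 0-based indices and returns the list for convenience; both are choices made for the example.

```python
def selection_sort(a):
    """Sort the list a in place and return it."""
    n = len(a)
    for j in range(n - 1):
        smallest = j
        # Find the index of the smallest element in a[j..n-1].
        for i in range(j + 1, n):
            if a[i] < a[smallest]:
                smallest = i
        a[j], a[smallest] = a[smallest], a[j]  # exchange
    return a


print(selection_sort([5, 2, 4, 6, 1, 3]))  # [1, 2, 3, 4, 5, 6]
```

The inner loop always scans the whole unsorted suffix, so the number of comparisons is the same for every input of a given length.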
Divide and Conquer
Divide the problem into a number of sub-problems
– Similar sub-problems of smaller size
Conquer the sub-problems
– Solve the sub-problems recursively
– Sub-problem size small enough ⇒ solve the problems in a straightforward manner
Combine the solutions to the sub-problems
– Obtain the solution for the original problem
Merge Sort Approach :
To sort an array A[p . . r]:
Divide
– Divide the n-element sequence to be sorted into two subsequences of n/2
elements each
Conquer
– Sort the subsequences recursively using merge sort
– When the size of the sequences is 1 there is nothing more to do
Combine
– Merge the two sorted subsequences

Merge Sort :
Alg.: MERGE-SORT(A, p, r)
if p < r                              // check for base case
    then q ← ⌊(p + r)/2⌋              // divide
         MERGE-SORT(A, p, q)          // conquer
         MERGE-SORT(A, q + 1, r)      // conquer
         MERGE(A, p, q, r)            // combine
Initial call: MERGE-SORT(A, 1, n)
Merging:
Input: array A and indices p, q, r such that p ≤ q < r
– Subarrays A[p . . q] and A[q + 1 . . r] are sorted
Output: one single sorted subarray A[p . . r]
Idea for merging:
– Two piles of sorted cards
• Choose the smaller of the two top cards
• Remove it and place it in the output pile
– Repeat the process until one pile is empty
– Take the remaining input pile and place it face-down onto the output pile
Merge Pseudocode:
Alg.: MERGE(A, p, q, r)
Compute n1 and n2 (n1 = q - p + 1, n2 = r - q)
Copy the first n1 elements into L[1 . . n1 + 1] and the next n2 elements into R[1 . . n2 + 1]
L[n1 + 1] ← ∞; R[n2 + 1] ← ∞
i ← 1; j ← 1
for k ← p to r
    do if L[i] ≤ R[j]
          then A[k] ← L[i]
               i ← i + 1
          else A[k] ← R[j]
               j ← j + 1
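
MERGE and MERGE-SORT can be sketched together in Python as follows. The sketch uses 0-based inclusive indices and math.inf as the sentinel, so it assumes the elements are numbers; the default arguments of merge_sort are a convenience added for the example.

```python
import math


def merge(a, p, q, r):
    """Merge the sorted runs a[p..q] and a[q+1..r] (inclusive) in place."""
    left = a[p:q + 1] + [math.inf]        # sentinel marks the end of the pile
    right = a[q + 1:r + 1] + [math.inf]
    i = j = 0
    for k in range(p, r + 1):
        if left[i] <= right[j]:
            a[k] = left[i]
            i += 1
        else:
            a[k] = right[j]
            j += 1


def merge_sort(a, p=0, r=None):
    """Sort a[p..r] in place and return the list."""
    if r is None:
        r = len(a) - 1
    if p < r:                      # more than one element left
        q = (p + r) // 2           # divide
        merge_sort(a, p, q)        # conquer
        merge_sort(a, q + 1, r)    # conquer
        merge(a, p, q, r)          # combine
    return a


print(merge_sort([5, 2, 4, 7, 1, 3, 2, 6]))  # [1, 2, 2, 3, 4, 5, 6, 7]
```

The sentinels remove the need to test whether either pile is empty inside the loop.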
Running Time of Merge:
Initialization (copying into temporary arrays):
– O(n1 + n2) = O(n)
Adding the elements to the final array (the last for loop):
– n iterations, each taking constant time ⇒ O(n)
Total time for merge:
– O(n)
Analysing Divide and Conquer:
The recurrence is based on the three steps of the paradigm:
– T(n): running time on a problem of size n
– Divide the problem into a subproblems, each of size n/b: takes D(n)
– Conquer (solve) the subproblems: takes aT(n/b)
– Combine the solutions: takes C(n)
T(n) = O(1)                          if n ≤ c
T(n) = aT(n/b) + D(n) + C(n)         otherwise
Merge Sort Running Time:
Divide:
– Compute q as the average of p and r: D(n) = O(1)
Conquer:
– Recursively solve 2 subproblems, each of size n/2 ⇒ 2T(n/2)
Combine:
– MERGE on an n-element subarray takes O(n) time ⇒ C(n) = O(n)
T(n) = O(1)                  if n = 1
T(n) = 2T(n/2) + O(n)        if n > 1

Solve the Recurrence:
T(n) = c                  if n = 1
T(n) = 2T(n/2) + cn       if n > 1
Use the Master Theorem:

Compare n^(log_2 2) = n with f(n) = cn
Case 2: T(n) = Θ(n lg n)
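
As a sanity check on this result, the recurrence can be evaluated directly for powers of two and compared with the closed form cn lg n + cn, which is Θ(n lg n). The function name t and the default c = 1 are assumptions for the example.

```python
def t(n, c=1):
    """Evaluate T(n) = 2T(n/2) + cn with T(1) = c, for n a power of two."""
    if n == 1:
        return c
    return 2 * t(n // 2, c) + c * n


for k in range(11):
    n = 2 ** k
    # Closed form for n = 2^k with c = 1: n*k + n, i.e. n lg n + n
    assert t(n) == n * k + n

print(t(8))     # 32
print(t(1024))  # 11264
```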