Exponential
x^a · x^b = x^(a+b)
x^a / x^b = x^(a-b)
(x^a)^b = x^(ab)
x^n + x^n = 2x^n
Logarithmic
Series
Proving statements
We can prove statements in data structure analysis by either:
1. Proof by induction
A proof by induction has two standard parts. The first part is to prove the base
case, which establishes that the theorem is true for some small value(s). Next
an inductive hypothesis is assumed; generally this means that the theorem is
assumed true for all cases up to some limit k. Using this assumption, the theorem
is then shown to be true for the next value, which is typically k + 1.
2. Proof by contradiction
Proof by contradiction proceeds by assuming that the theorem is false and showing
that this assumption implies that some known property is false, and hence the
original assumption was erroneous.
Recursion Revisited
Mathematical functions are sometimes defined in a less standard form. For example,
we can define a function f, valid on nonnegative integers, that satisfies
f(n) = 2f(n-1) + n^2 with f(0) = 0. A function that is defined in terms of itself
is called recursive. Any recursive function definition has a base case, which is the
value for which the function is directly known without resorting to recursion.
Recursive calls will keep on being made until a base case is reached.
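As a quick illustration, the definition above can be written directly in C++ (a sketch; integer overflow for large n is ignored):

```cpp
// f(n) = 2*f(n-1) + n^2 for n > 0, with base case f(0) = 0.
int f(int n)
{
    if (n == 0)                    // base case: value known without recursion
        return 0;
    return 2 * f(n - 1) + n * n;   // recursive case, defined in terms of f itself
}
```

Evaluating f(3) unwinds to 2*f(2) + 9 = 2*(2*f(1) + 4) + 9 = 21, with the recursive calls stopping at the base case f(0) = 0.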
Chapter 1
The model defines an abstract view of the problem. This implies that the model
focuses only on problem-related aspects and that the programmer tries to define the
properties of the problem. These properties include:
The data which are affected, and
The operations that are involved in the problem.
With abstraction you create a well-defined entity that can be properly handled. These
entities define the data structure of the program. An entity with the properties just
described is called an abstract data type (ADT).
Abstract Data Types
An ADT consists of an abstract data structure and operations. In other terms, an ADT
is an abstraction of a data structure. The ADT specifies:
What can be stored in the Abstract Data Type, and
What operations can be done on/by the Abstract Data Type.
For example, suppose we are going to model the employees of an organization:
This ADT stores employees with their relevant attributes, discarding irrelevant
attributes.
This ADT supports hiring, firing, retiring, and similar operations.
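A minimal C++ sketch of such an ADT; the class name, the attributes kept, and the operations are illustrative assumptions, not taken from any standard library:

```cpp
#include <string>
#include <vector>

// Illustrative employee ADT: keeps only the attributes judged relevant
// (name, salary) and supports hire/fire operations.
class EmployeeList
{
public:
    // hire: add an employee with the stored (relevant) attributes
    void hire(const std::string& name, double salary)
    {
        names.push_back(name);
        salaries.push_back(salary);
    }
    // fire: remove the first employee with the given name;
    // returns false if no such employee exists
    bool fire(const std::string& name)
    {
        for (std::size_t i = 0; i < names.size(); i++)
            if (names[i] == name)
            {
                names.erase(names.begin() + i);
                salaries.erase(salaries.begin() + i);
                return true;
            }
        return false;
    }
    std::size_t count() const { return names.size(); }
private:
    std::vector<std::string> names;   // relevant attribute
    std::vector<double> salaries;     // relevant attribute
};
```

Users of the ADT see only what can be stored and which operations exist; the vectors are an implementation detail that could be replaced without changing the interface.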
A data structure is a language construct that the programmer defines in order to
implement an abstract data type. There are many formalized and standard abstract
data types, such as Stacks, Queues, Trees, etc. Do all characteristics need to be
modeled? Not at all:
It depends on the scope of the model.
It depends on the reason for developing the model.
Abstraction
Abstraction is a process of classifying characteristics as relevant and irrelevant for the
particular purpose at hand and ignoring the irrelevant ones. Applying abstraction
correctly is the essence of successful programming. How do data structures model the
world or some part of the world?
The value held by a data structure represents some specific characteristic of the
world
The characteristic being modeled restricts the possible values held by a data
structure
The characteristic being modeled restricts the possible operations to be performed
on the data structure.
Note: notice the relation between characteristic, value, and data structure.
Algorithms
An algorithm is a well-defined computational procedure that takes some value or a set
of values as input and produces some value or a set of values as output. Data
structures model the static part of the world. They are unchanging while the world is
changing. In order to model the dynamic part of the world we need to work with
algorithms. Algorithms are the dynamic part of a program’s world model. An
algorithm transforms data structures from one state to another state in two ways:
An algorithm may change the value held by a data structure
An algorithm may change the data structure itself
The quality of a data structure is related to its ability to successfully model the
characteristics of the world. Similarly, the quality of an algorithm is related to its
ability to successfully simulate the changes in the world. However, independent of
any particular world model, the quality of data structure and algorithms is determined
by their ability to work together well. Generally speaking, correct data structures lead
to simple and efficient algorithms and correct algorithms lead to accurate and
efficient data structures.
Properties of an algorithm
Finiteness: Algorithm must complete after a finite number of steps.
Definiteness: Each step must be clearly defined, having one and only one
interpretation. At each point in computation, one should be able to tell exactly
what happens next.
Sequence: Each step must have a unique defined preceding and succeeding step.
The first step (start step) and last step (halt step) must be clearly noted.
Feasibility: It must be possible to perform each instruction.
Correctness: It must compute the correct answer for all possible legal inputs.
Language Independence: It must not depend on any one programming language.
Completeness: It must solve the problem completely.
Effectiveness: It must be possible to perform each step exactly and in a finite
amount of time.
Efficiency: It must solve with the least amount of computational resources such
as time and space.
Generality: Algorithm should be valid on all possible inputs.
Input/Output: There must be a specified number of input values, and one or
more result values.
Algorithm Analysis Concepts
Algorithm analysis refers to the process of determining how much computing time
and storage an algorithm will require. In other words, it's a process of predicting
the resource requirement of algorithms in a given environment. In order to solve a
problem, there are many possible algorithms. One has to be able to choose the best
algorithm for the problem at hand using some scientific method. To classify some
data structures and algorithms as good, we need precise ways of analyzing them in
terms of resource requirement. The main resources are:
Running Time
Memory Usage/storage
Communication Bandwidth
Running time is usually treated as the most important since computational time is the
most precious resource in most problem domains. There are two approaches to
measure the efficiency of algorithms:
Empirical: Programming competing algorithms and trying them on different
instances.
Theoretical: Determining mathematically the quantity of resources (execution
time, memory space, etc.) needed by each algorithm.
However, it is difficult to use actual clock-time as a consistent measure of an
algorithm’s efficiency, because clock-time can vary based on many things. For
example,
Specific processor speed
Current processor load
Specific data for a particular run of the program
o Input Size
o Input Properties
Operating Environment
Accordingly, we can analyze an algorithm according to the number of operations
required, rather than according to an absolute amount of time involved. This can
show how an algorithm’s efficiency changes according to the size of the input.
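As a sketch of the empirical approach, one can time a run of an algorithm with the C++ <chrono> clock (the countMatches workload here is just an illustrative stand-in for a competing algorithm):

```cpp
#include <chrono>
#include <vector>

// Illustrative workload: count how many elements equal key (an O(n) scan).
int countMatches(const std::vector<int>& v, int key)
{
    int count = 0;
    for (std::size_t i = 0; i < v.size(); i++)
        if (v[i] == key)
            count++;
    return count;
}

// Time one run of the workload on a given instance, in microseconds.
// Note: the result depends on processor speed, current load, and the
// specific instance, which is exactly why clock time is an inconsistent measure.
long long timeOneRun(const std::vector<int>& v, int key)
{
    auto start = std::chrono::steady_clock::now();
    volatile int result = countMatches(v, key);   // volatile: keep the call
    (void)result;
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

Running timeOneRun on the same instance twice can give different numbers, which motivates the operation-counting approach that follows.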
Complexity Analysis
Complexity analysis is the systematic study of the cost of computation, measured
either in time units or in operations performed, or in the amount of storage space
required. The goal is to have a meaningful measure that permits comparison of
algorithms independent of operating platform. There are two things to consider:
Time Complexity: Determine the approximate number of operations required to
solve a problem of size n.
Space Complexity: Determine the approximate memory required to solve a
problem of size n.
Complexity analysis involves two distinct phases:
Algorithm Analysis: Analysis of the algorithm or data structure to produce a
function T (n) that describes the algorithm in terms of the operations performed in
order to measure the complexity of the algorithm.
Order of Magnitude Analysis: Analysis of the function T (n) to determine the
general complexity category to which it belongs.
There is no generally accepted set of rules for algorithm analysis. However, an exact
count of operations is commonly used.
Analysis Rules:
We assume an arbitrary time unit.
Execution of one of the following operations takes time 1:
o Assignment Operation
o Single Input/Output Operation
o Single Boolean Operations
o Single Arithmetic Operations
o Function Return
Running time of a selection statement (e.g. if, switch) is the time for the condition
evaluation + the maximum of the running times for the individual clauses in the
selection.
Loops: Running time for a loop is equal to the running time for the statements
inside the loop * number of iterations. The total running time of a statement inside
a group of nested loops is the running time of the statements multiplied by the
product of the sizes of all the loops. For nested loops, analyze inside out.
Always assume that the loop executes the maximum number of iterations
possible.
Running time of a function call is 1 for setup + the time for any parameter
calculations + the time required for the execution of the function body.
Examples:
1. int count()
{
    int k = 0;
    int n, i;
    cout << "Enter an integer";
    cin >> n;
    for (i = 0; i < n; i++)
        k = k + 1;
    return 0;
}
Time Units needed to compute
-------------------------------------------------
1, for the assignment statement: int k=0
1, for the output statement: cout<<”Enter an integer”;
1, for the input statement: cin>>n;
In the for loop:
1 assignment, n+1 tests, and n increments.
n loops of 2 units for an assignment and an addition.
1, for the return statement.
-------------------------------------------------------------------
Hence, T (n) = 1+1+1+ (1+n+1+n) +2n+1 = 4n+6 = O (n)
2. int total(int n)
{
    int sum = 0;
    for (int i = 1; i <= n; i++)
        sum = sum + 1;
    return sum;
}
Time Units needed to compute
-------------------------------------------------
1, for the assignment statement: int sum=0
In the for loop:
1 assignment, n+1 tests, and n increments.
n loops of 2 units for an assignment and an addition.
1, for the return statement.
-------------------------------------------------------------------
Hence, T (n) = 1+ (1+n+1+n) +2n+1 = 4n+4 = O (n)
3. void func()
{
    int x = 0; int i = 0; int j = 1;
    int n;
    cout << "Enter an Integer value";
    cin >> n;
    while (i < n)
    {
        x++;
        i++;
    }
    while (j < n)
    {
        j++;
    }
}
Time Units to Compute
-------------------------------------------------
1, for the first assignment statement: x=0;
1, for the second assignment statement: i=0;
1, for the third assignment statement: j=1;
1, for the output statement.
1, for the input statement.
In the first while loop:
n+1 tests
n loops of 2 units for the two increment (addition) operations
In the second while loop:
n tests
n-1 increments
-------------------------------------------------------------------
Hence, T (n) = 1+1+1+1+1+n+1+2n+n+n-1 = 5n+5 = O (n)
4. int sum(int n)
{
    int partial_sum = 0;
    for (int i = 1; i <= n; i++)
        partial_sum = partial_sum + (i * i * i);
    return partial_sum;
}
Time Units to Compute
-------------------------------------------------
1, for the assignment.
1 assignment, n+1 tests, and n increments.
n loops of 4 units for an assignment, an addition, and two multiplications.
1 for the return statement.
-------------------------------------------------------------------
Hence, T (n) = 1+ (1+n+1+n) +4n+1 = 6n+4 = O (n)
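The counting rules can also be checked mechanically. The sketch below instruments example 4 with a unit counter placed according to the analysis rules; the counter placement is an illustration of the rules, not part of the original algorithm:

```cpp
// Instrumented version of example 4: charge time units exactly as the
// analysis rules prescribe, and compare the total against T(n) = 6n + 4.
int sumCounted(int n, long& units)
{
    units = 0;
    int partial_sum = 0;  units++;   // 1 unit: assignment
    int i = 1;            units++;   // 1 unit: loop initialization
    while (true)
    {
        units++;                     // 1 unit per loop test (n+1 tests total)
        if (!(i <= n))
            break;
        partial_sum = partial_sum + (i * i * i);
        units += 4;                  // assignment + addition + 2 multiplications
        i++;              units++;   // 1 unit per increment
    }
    units++;                         // 1 unit: return statement
    return partial_sum;
}
```

For n = 10 this yields units = 64 = 6*10 + 4, agreeing with the hand count above.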
Formal Approach to Analysis
In the above examples we have seen that such analysis is somewhat complex. However,
it can be simplified by using a formal approach, in which case we can ignore
initializations, loop control, and bookkeeping.
For Loops: Formally
In general, a for loop translates to a summation. The index and bounds of the
summation are the same as the index and bounds of the for loop.

for (int i = 1; i <= N; i++) {
    sum = sum + i;
}

translates to Σ (i = 1 to N) 1 = N.

Suppose we count the number of additions that are done. There is 1 addition per
iteration of the loop, hence N additions in total.
Nested Loops: Formally
Nested for loops translate into multiple summations, one for each for loop.

for (int i = 1; i <= N; i++) {
    for (int j = 1; j <= M; j++) {
        sum = sum + i + j;
    }
}

translates to Σ (i = 1 to N) Σ (j = 1 to M) 2 = Σ (i = 1 to N) 2M = 2MN.

Again, count the number of additions. The outer summation is for the outer for
loop.
Consecutive Statements: Formally
Add the running times of the separate blocks of your code.

for (int i = 1; i <= N; i++) {
    sum = sum + i;
}
for (int i = 1; i <= N; i++) {
    for (int j = 1; j <= N; j++) {
        sum = sum + i + j;
    }
}

translates to Σ (i = 1 to N) 1 + Σ (i = 1 to N) Σ (j = 1 to N) 2 = N + 2N^2.
Conditionals: Formally
if (test) s1 else s2: compute the maximum of the running times for s1 and s2.

if (test == 1) {
    for (int i = 1; i <= N; i++) {
        sum = sum + i;
    }
}
else {
    for (int i = 1; i <= N; i++) {
        for (int j = 1; j <= N; j++) {
            sum = sum + i + j;
        }
    }
}

translates to max( Σ (i = 1 to N) 1, Σ (i = 1 to N) Σ (j = 1 to N) 2 )
= max(N, 2N^2) = 2N^2.
Example:
Suppose we have hardware capable of executing 10^6 instructions per second. How
long would it take to execute an algorithm whose complexity function is T(n) = 2n^2
on an input of size n = 10^8?
Ans. The total number of operations to be performed would be T(10^8) =
2*(10^8)^2 = 2*10^16. The required number of seconds would then be given by
T(10^8)/10^6, so: Running time = 2*10^16/10^6 = 2*10^10 seconds. The number of
seconds per day is 86,400, so this is about 231,481 days (634 years).
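The arithmetic can be double-checked with a few lines of C++:

```cpp
// Reproduce the running-time estimate: T(n) = 2n^2 operations executed at
// 10^6 instructions per second, for an input of size n = 10^8.
double runningTimeDays()
{
    double ops = 2.0 * 1e8 * 1e8;   // T(10^8) = 2 * (10^8)^2 = 2e16 operations
    double seconds = ops / 1e6;     // 2e10 seconds at 1e6 instructions/second
    return seconds / 86400.0;       // convert seconds to days
}
```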
Exercises
Determine the run time equation and complexity of each of the following code
segments.
1. for (i=0; i<n; i++)
      for (j=0; j<n; j++)
         sum = sum + i + j;
   What is the value of sum if n=100?
3. int k=0;
   for (int i=0; i<n; i++)
      for (int j=i; j<n; j++)
         k++;
   What is the value of k when n is equal to 20?
4. int k=0;
   for (int i=1; i<n; i*=2)
      for (int j=1; j<n; j++)
         k++;
   What is the value of k when n is equal to 20?
5. int x=0;
   for (int i=1; i<n; i=i+5)
      x++;
   What is the value of x when n=25?
6. int x=0;
   for (int k=n; k>=n/3; k=k-5)
      x++;
   What is the value of x when n=25?
7. int x=0;
   for (int i=1; i<n; i=i+5)
      for (int k=n; k>=n/3; k=k-5)
         x++;
   What is the value of x when n=25?
8. int x=0;
   for (int i=1; i<n; i=i+5)
      for (int j=0; j<i; j++)
         for (int k=n; k>=n/2; k=k-3)
            x++;
   What is the correct big-Oh notation for the above code segment?
Measures of Times
In order to determine the running time of an algorithm it is possible to define three
functions Tbest(n), Tavg(n) and Tworst(n) as the best, the average and the worst case
running time of the algorithm respectively.
Average Case (Tavg): The amount of time the algorithm takes on an "average" set of
inputs.
Worst Case (Tworst): The amount of time the algorithm takes on the worst possible
set of inputs.
Best Case (Tbest): The amount of time the algorithm takes on the best possible set
of inputs. We are interested in the worst-case time, since it provides a bound for
all input - this is called the "Big-Oh" estimate.
Asymptotic Analysis
Asymptotic analysis is concerned with how the running time of an algorithm
increases with the size of the input in the limit, as the size of the input increases
without bound. There are five notations used to describe a running time function.
These are:
Big-Oh Notation (O)
Big-Omega Notation (Ω)
Theta Notation (Θ)
Little-o Notation (o)
Little-Omega Notation (ω)
The Big-Oh Notation
Big-Oh notation is a way of comparing algorithms and is used for computing the
complexity of algorithms, i.e., the amount of time that it takes for a computer
program to run. It is only concerned with what happens for very large values of n.
Therefore only the largest term in the expression (function) is needed. For example,
if the number of operations in an algorithm is n^2 - n, then n is insignificant
compared to n^2 for large values of n, hence the n term is ignored. Of course, for
small values of n, it may be important. However, Big-Oh is mainly concerned with
large values of n.
Formal Definition: f(n) = O(g(n)) if there exist c, k ∊ ℛ+ such that for all n >= k,
f(n) <= c·g(n).
Examples: The following points are facts that you can use for Big-Oh problems:
1 <= n for all n >= 1
n <= n^2 for all n >= 1
2^n <= n! for all n >= 4
log2(n) <= n for all n >= 2
n <= n·log2(n) for all n >= 2
1. f(n) = 10n + 5 and g(n) = n. Show that f(n) is O(g(n)).
To show that f(n) is O(g(n)) we must find constants c and k such that
f(n) <= c·g(n) for all n >= k,
or 10n + 5 <= c·n for all n >= k.
Try c = 15. Then we need to show that 10n + 5 <= 15n.
Solving for n we get: 5 <= 5n, or 1 <= n.
So f(n) = 10n + 5 <= 15·g(n) for all n >= 1.
(c = 15, k = 1)
2. f(n) = 3n^2 + 4n + 1. Show that f(n) = O(n^2).
4n <= 4n^2 for all n >= 1, and 1 <= n^2 for all n >= 1, so
3n^2 + 4n + 1 <= 3n^2 + 4n^2 + n^2 for all n >= 1
             <= 8n^2 for all n >= 1.
So we have shown that f(n) <= 8n^2 for all n >= 1.
Therefore, f(n) is O(n^2) (c = 8, k = 1).
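Such a bound can also be spot-checked numerically. The sketch below verifies f(n) <= 8n^2 at sample points; this complements, but does not replace, the algebraic argument:

```cpp
// Check f(n) = 3n^2 + 4n + 1 <= 8n^2 for n = 1..maxN (sample points only).
bool boundHolds(long long maxN)
{
    for (long long n = 1; n <= maxN; n++)
    {
        long long f = 3 * n * n + 4 * n + 1;
        if (f > 8 * n * n)
            return false;   // a counterexample would disprove the bound
    }
    return true;
}
```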
Typical Orders
Here is a table of some typical cases. This uses logarithms to base 2, but these are
simply proportional to logarithms in any other base.
n     O(1)  O(log n)  O(n)  O(n log n)  O(n^2)  O(n^3)
1 1 1 1 1 1 1
2 1 1 2 2 4 8
4 1 2 4 8 16 64
8 1 3 8 24 64 512
16 1 4 16 64 256 4,096
1024 1 10 1,024 10,240 1,048,576 1,073,741,824
Demonstrating that a function f(n) is big-O of a function g(n) requires that we find
specific constants c and k for which the inequality holds (and show that the inequality
does in fact hold). Big-O expresses an upper bound on the growth rate of a function,
for sufficiently large values of n. An upper bound is the best algorithmic solution that
has been found for a problem. “What is the best that we know we can do?”
Exercise:
f(n) = (3/2)n^2 + (5/2)n - 3. Show that f(n) = O(n^2). In simple words, f(n) = O(g(n))
means that the growth rate of f(n) is less than or equal to that of g(n).
Big-O Theorems
For all the following theorems, assume that f(n) is a function of n and that k is an
arbitrary constant.
Theorem 1: k is O(1)
Theorem 2: A polynomial is O(the term containing the highest power of n).
Polynomial’s growth rate is determined by the leading term
If f(n) is a polynomial of degree d, then f(n) is O(n^d). In general, f(n) is big-O of
the dominant term of f(n).
Theorem 3: k*f(n) is O(f(n)). Constant factors may be ignored. For example:
f(n) = 7n^4 + 3n^2 + 5n + 1000 is O(n^4).
Theorem 4(Transitivity): If f(n) is O(g(n))and g(n) is O(h(n)), then f(n) is O(h(n)).
Theorem 5: For any base b, logb(n) is O(log n). All logarithms grow at the same rate:
logb(n) is O(logd(n)) for all b, d > 1.
Theorem 6: Each of the following functions is big-O of its successors:
k
logb(n)
n
n·logb(n)
n^2
n to higher powers
2^n
3^n
larger constants to the nth power
n!
n^n
For example, f(n) = 3n·logb(n) + 4·logb(n) + 2 is O(n·logb(n)) and O(n^2) and O(2^n).
Properties of the O Notation
Higher powers grow faster:
• n^r is O(n^s) if 0 <= r <= s
Fastest growing term dominates a sum:
• If f(n) is O(g(n)), then f(n) + g(n) is O(g(n)).
  E.g. 5n^4 + 6n^3 is O(n^4)
Exponential functions grow faster than powers:
• n^k is O(b^n) for b > 1 and k >= 0
  E.g. n^20 is O(1.05^n)
Logarithms grow more slowly than powers:
• logb(n) is O(n^k) for b > 1 and k >= 0
  E.g. log2(n) is O(n^0.5)
Big-Omega Notation
Just as O-notation provides an asymptotic upper bound on a function, Ω-notation
provides an asymptotic lower bound. Formal Definition: A function f(n) is Ω(g(n))
if there exist constants c, k ∊ ℛ+ such that f(n) >= c·g(n) for all n >= k.
f(n) = Ω(g(n)) means that f(n) is greater than or equal to some constant multiple
of g(n) for all values of n greater than or equal to some k.
Example: If f(n) = n^2, then f(n) = Ω(n). In simple terms, f(n) = Ω(g(n)) means that
the growth rate of f(n) is greater than or equal to that of g(n).
Theta Notation
A function f(n) belongs to the set Θ(g(n)) if there exist positive constants c1 and
c2 such that it can be sandwiched between c1·g(n) and c2·g(n) for sufficiently large
values of n. Formal Definition: A function f(n) is Θ(g(n)) if it is both O(g(n)) and
Ω(g(n)). In other words, there exist constants c1, c2, and k > 0 such that
c1·g(n) <= f(n) <= c2·g(n) for all n >= k. If f(n) = Θ(g(n)), then g(n) is an
asymptotically tight bound for f(n). In simple terms, f(n) = Θ(g(n)) means that f(n)
and g(n) have the same rate of growth.
Example:
1. If f(n) = 2n + 1, then f(n) = Θ(n).
2. If f(n) = 2n^2, then
f(n) = O(n^4)
f(n) = O(n^3)
f(n) = O(n^2)
All these are technically correct, but the last expression is the best and tightest
one. Since 2n^2 and n^2 have the same growth rate, we can write f(n) = Θ(n^2).
Little-o Notation
Big-Oh notation may or may not be asymptotically tight. For example:
2n^2 = O(n^2)
     = O(n^3)
f(n) = o(g(n)) means that for all c > 0 there exists some k > 0 such that
f(n) < c·g(n) for all n >= k. Informally, f(n) = o(g(n)) means f(n) becomes
insignificant relative to g(n) as n approaches infinity.
Example: f(n) = 3n + 4 is o(n^2).
In simple terms, f(n) has a smaller growth rate than g(n).
If g(n) = 2n^2, then g(n) = o(n^3) and g(n) = O(n^2), but g(n) is not o(n^2).
Little-Omega (ω) Notation
Little-omega (ω) notation is to big-omega (Ω) notation as little-o notation is to
Big-Oh notation. We use ω notation to denote a lower bound that is not
asymptotically tight.
Formal Definition: f(n) = ω(g(n)) if for any constant c > 0 there exists a constant
k > 0 such that 0 <= c·g(n) < f(n) for all n >= k.
Example: 2n^2 = ω(n), but 2n^2 is not ω(n^2).
Relational Properties of the Asymptotic Notations
Transitivity
• if f(n) = Θ(g(n)) and g(n) = Θ(h(n)) then f(n) = Θ(h(n)),
• if f(n) = O(g(n)) and g(n) = O(h(n)) then f(n) = O(h(n)),
• if f(n) = Ω(g(n)) and g(n) = Ω(h(n)) then f(n) = Ω(h(n)),
• if f(n) = o(g(n)) and g(n) = o(h(n)) then f(n) = o(h(n)), and
• if f(n) = ω(g(n)) and g(n) = ω(h(n)) then f(n) = ω(h(n)).
Symmetry
• f(n) = Θ(g(n)) if and only if g(n) = Θ(f(n)).
Transpose symmetry
• f(n) = O(g(n)) if and only if g(n) = Ω(f(n)),
• f(n) = o(g(n)) if and only if g(n) = ω(f(n)).
Reflexivity
• f(n) = Θ(f(n)),
• f(n) = O(f(n)),
• f(n) = Ω(f(n)).
Chapter 2
There are several sorting algorithms that are easy to implement and run in
O(n^2) time.
There is also an algorithm (Shellsort) that is very simple to implement and,
with suitable increment sequences, runs in o(n^2) time; it is efficient in practice.
There are slightly more complicated sorting algorithms that run in O(n log n) time.
Any general-purpose comparison-based sorting algorithm requires Ω(n log n)
comparisons.
Assumptions:
- Our sorting is comparison based.
- Each algorithm will be passed an array containing N elements.
- N is the number of elements passed to our sorting algorithm
- The operators that will be used are "<", ">", and "==".
Bubble Sort
Bubble sort is the simplest algorithm to implement and the slowest
algorithm on very large inputs.
The basic idea is:
Loop through the list and compare adjacent pairs of elements;
whenever two elements are out of order with respect to each other,
interchange them.
Target:
Pull the least element up to its proper position like a rising bubble.
The process of sequentially traversing through all or part of a list
is known as a pass.
Implementation:
This algorithm needs two nested loops. The outer loop controls the
number of passes through the list and the inner loop controls the
number of adjacent comparisons.
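A straightforward implementation with the two nested loops described above (a sketch; swap is written out explicitly here, and the loop bounds assume a 0-indexed array of n elements):

```cpp
// Exchange a[i] and a[j].
void swap(int a[], int i, int j)
{
    int tmp = a[i];
    a[i] = a[j];
    a[j] = tmp;
}

// Plain bubble sort: the outer loop counts the passes, the inner loop
// compares adjacent pairs and interchanges those that are out of order.
void bubbleSort(int a[], int n)
{
    for (int i = 1; i < n; i++)            // pass number: n-1 passes in total
        for (int j = 0; j < n - i; j++)    // adjacent comparisons in this pass
            if (a[j] > a[j + 1])
                swap(a, j, j + 1);
}
```

After the ith pass the largest i elements have bubbled to the end of the array, which is why the inner loop can stop i positions earlier each time.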
Index  Elements  Pass 1  Pass 2  Pass 3  Pass 4  Pass 5  Pass 6  Pass 7
5      15        20      7       17      17      17      17      17
4      3         17      17      7       13      13      13      13
3      72        3       13      13      7       11      11      11
2      13        13      3       11      11      7       9       9
1      11        11      11      3       9       9       7       7
0      9         9       9       9       3       3       3       3
Analysis:
- How many comparisons?
T(n) = 1 + 2 + 3 + … + (n-2) + (n-1)
     = n(n-1)/2
     = (n^2 - n)/2
T(n) = O(n^2)
- How many swaps?
(n-1) + (n-2) + … + 1 = O(n^2)
If there is no change (no swap) in the ith pass, it implies that the elements were
already sorted before that pass. Thus there is no need to continue with the
remaining passes.
Exploiting this observation, one can develop a modified bubble sort algorithm that
has better performance.
Modified bubble sort:
void mBubbleSort(int a[], int n)
{
    int i = 1;
    int swapped = 1;
    while (swapped && i < n)
    {
        swapped = 0;
        for (int j = 0; j < n - i; j++)
            if (a[j] > a[j+1])
            {
                swap(a, j, j+1);
                swapped = 1;
            }
        i++;
    }
} // end of mBubbleSort
Insertion Sort
Like bubble sort, insertion sort compares adjacent entries in the array
being sorted. However, the motivation for doing so is quite different in
insertion sort. Here the goal is that on the ith pass through the list, the ith
element should be inserted into its proper place among the first i entries of
the array. Thus after the ith pass, the first i elements of the array should be in
sorted order.
The insertion sort works just like its name suggests - it inserts each
item into its proper place in the final list. The simplest implementation of this
requires two list structures - the source list and the list into which sorted
items are inserted. To save memory, most implementations use an in-place
sort that works by moving the current item past the already sorted items and
repeatedly swapping it with the preceding item until it is in place. It's the
most instinctive type of sorting algorithm.
The approach is the same approach that you use for sorting a set of
cards in your hand. While playing cards, you pick up a card, start at the
beginning of your hand and find the place to insert the new card, insert it and
move all the others up one place.
Basic Idea is:
Find the location for an element and move all others up, and
insert the element.
Steps:
1. The left most value can be said to be sorted relative to itself. Thus, we
don’t need to do anything.
2. Check to see if the second value is smaller than the first one. If it is,
swap these two values. The first two values are now relatively sorted.
3. Next, we need to insert the third value in to the relatively sorted portion
so that after insertion, the portion will still be relatively sorted.
4. Now the first three are relatively sorted.
5. Do the same for the remaining items in the list.
Summarized steps:
1. Start pass i at 2.
2. Start an index j at i.
3. Check if list[j] < list[j-1]; if so exchange the elements, set j to j-1,
and repeat until j has reached one.
4. For each of i = 3, 4, …, n repeat steps 2 and 3.
Implementation
void insertionSort(int a[], int n)
{
    for (int i = 1; i < n; i++)
    {
        // work backwards through the array finding where the element should go
        for (int j = i; j > 0; j--)
        {
            if (a[j] < a[j-1])
                swap(a, j, j-1);
        }
    }
} // end of insertionSort
Analysis
- How many comparisons?
T(n) = 1 + 2 + 3 + … + (n-1)
     = n(n-1)/2
     = O(n^2)
- How many swaps?
1 + 2 + 3 + … + (n-1) = O(n^2)
- How much space? In-place algorithm.
Modified insertion sort
void mInsertionSort(int a[], int n)
{
    int j;
    int inserted;
    for (int i = 1; i < n; i++)
    {
        inserted = 0;
        j = i;
        while (j > 0 && !inserted)
        {
            if (a[j] < a[j-1])
                swap(a, j, j-1);
            else
                inserted = 1;
            j--;
        }
    }
} // end of mInsertionSort
Selection Sort
The selection sort algorithm is motivated similarly to bubble sort. However,
selection sort attempts to avoid the multitude of interchanges of adjacent
entries to which bubble sort is prone.
To do this, on the ith pass through the array it determines the position of the
smallest/largest entry among the remaining elements, and then this element is
swapped into place. The algorithm uses its inner loop to find the smallest/largest
entry and store its position in a variable called minpos/maxpos.
Basic Idea:
Loop through the array from i=0 to n-1.
Select the smallest element in the array from i to n-1.
Swap this value with the value at position i.
Index  Elements  Pass 1  Pass 2  Pass 3  Pass 4  Pass 5  Pass 6  Pass 7
7      7         9       56      56      56      56      56      72
6      20        20      20      20      20      24      72      56
5      15        15      15      72      72      72      24      24
4      19        19      19      19      24      20      20      20
3      24        24      24      24      19      19      19      19
2      72        72      72      15      15      15      15      15
1      56        56      9       9       9       9       9       9
0      9         7       7       7       7       7       7       7
Implementation:
Selecting the minimum element (pseudocode, scanning from a start position):
    minpos <- the start position
    currpos <- position of the next element
    while the current position is not past the end of the array
        if a[currpos] < a[minpos] then
            minpos <- currpos
        end if
        currpos <- position of the next element
    end loop
    return minpos
//code minselection.cpp
int minselection(int a[], int from, int n)
{
    int minpos = from;
    int currpos = from + 1;
    while (currpos < n)
    {
        if (a[currpos] < a[minpos])
            minpos = currpos;
        currpos++;
    }
    return minpos;
}
//selection sort
void selectionSort(int a[], int n)
{
    for (int i = 0; i < n - 1; i++)
    {
        // find the smallest element in a[i..n-1] and swap it into position i
        int minpos = minselection(a, i, n);
        swap(a, minpos, i);
    }
}
Analysis
How many comparisons?
T(n) = (n-1) + (n-2) + … + 1
     = O(n^2)
How many swaps?
n - 1 = O(n)
How much space? In-place algorithm.
General Comments
Each of these algorithms requires n-1 passes: each pass places one item in its
correct place. The ith pass makes either i or n-i comparisons and moves. So:
T(n) = (n-1) + (n-2) + … + 1 = n(n-1)/2, which is O(n^2).
Thus these algorithms are only suitable for small problems where their simple code
makes them faster than the more complex code of the O(n log n) algorithms. As a
rule of thumb, expect to find an O(n log n) algorithm faster for n > 10 - but the
exact value depends very much on individual machines!
Empirically it's known that insertion sort is over twice as fast as bubble sort
and is just as easy to implement as selection sort. In short, there really isn't
much reason to use the selection sort - use the insertion sort instead.
If you really want to use the selection sort for some reason, try to avoid
sorting lists of more than 1000 items with it, or repetitively sorting lists of
more than a couple hundred items.
Comparisons of sorting algorithms
There is no single optimal sorting algorithm. One may be better for small inputs,
another on average, and so on. Due to this fact it may be necessary to use more
than one method even in a single application.
Algorithm Best case Average case Worst case
Bubble O(n2) O(n2) O(n2)
Insertion O(n) O(n2) O(n2)
Selection O(n2) O(n2) O(n2)
Basic characteristics of sorting algorithms
Bubble Sort:
- Easy to develop and implement.
- Involves sequential data movement.
- Continues working even after the elements have been sorted,
  i.e., many comparisons.
- Although it is the least efficient algorithm, it may be fast on
  input data that is small in size and almost sorted.
Insertion sort:
- Reduces the number of comparisons in each pass, because it stops
  comparing the moment the exact location is found.
- Thus it runs in nearly linear time on almost sorted input data.
- Involves frequent data movement.
- Appropriate for input whose size is small and almost sorted.
Selection sort:
- Least data movement (swaps).
- Continues working even after the list is sorted, i.e., many
  comparisons.
- Appropriate for applications whose comparisons are cheap but
  swaps are expensive.
Logical sorting
All the above sorting algorithms perform physical sorting, but we can also use
algorithms that perform logical sorting.
Data movement refers to the physical rearrangement of data elements in
memory; frequent data movement hurts performance greatly. This is a
big problem when the elements are strings, structures, or objects.
In such cases, when data movement is expensive, logical sorting is important.
Logical sorting sorts an array of whole numbers (indexes) whose entries
refer to the locations of the actual data.
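A sketch of logical sorting in C++: the records never move, only an array of indexes is rearranged (the use of insertion sort on the indexes and the std::string record type are illustrative choices):

```cpp
#include <string>

// Sort idx so that data[idx[0]] <= data[idx[1]] <= ... The (possibly
// large) records in data stay where they are; only integers move.
void indexSort(const std::string data[], int idx[], int n)
{
    for (int i = 0; i < n; i++)
        idx[i] = i;                 // start with the identity order
    for (int i = 1; i < n; i++)     // insertion sort, applied to the indexes
        for (int j = i; j > 0; j--)
            if (data[idx[j]] < data[idx[j - 1]])
            {
                int tmp = idx[j];   // swap two indexes, not two records
                idx[j] = idx[j - 1];
                idx[j - 1] = tmp;
            }
}
```

Traversing data[idx[0]], data[idx[1]], … then visits the records in sorted order without any record ever having been moved.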
Simple Searching
Searching is the process of finding the location of a target element in an input
list. There are two simple searching algorithms:
• Sequential Search, and
• Binary Search
The algorithm that one chooses generally depends on the way the data items
are organized and structured in the data list.
Implementation (sequential search):
int Sequential_Search(int list[], int n, int key)
{
    int index = 0;
    int found = 0;
    while (index < n && !found)
    {
        if (list[index] == key)
            found = 1;
        else
            index++;
    }
    if (!found)
        index = -1;
    return index;
}
Analysis:
- Time is proportional to the size of the input (n), and we call this time
complexity O(n).
- Complexity: the number of comparisons required for the different cases:
Best case: 1; O(1)
Average case: n/2; O(n)
Worst case: n; O(n)
Binary Search
Binary search assumes that the data being searched has already been
sorted. Each comparison performed by the algorithm eliminates
half of the remaining elements in the array, until the element is
found or only one element remains in the list. This searching algorithm
works only on an ordered list.
Basic Idea:
• Locate the midpoint of the array to search.
• Determine if the target is in the lower half or upper half of the array.
o If in the lower half, make this half the array to search.
o If in the upper half, make this half the array to search.
• Loop back to step 1 until the array to search has size one and this
element does not match, in which case return -1.
Implementation:
int Binary_Search(int list[], int n, int key)
{
    int left = 0;
    int right = n - 1;
    int found = 0;
    int mid, index;
    do {
        mid = (left + right) / 2;
        if (key == list[mid])
            found = 1;
        else
        {
            if (key < list[mid])
                right = mid - 1;
            else
                left = mid + 1;
        }
    } while (found == 0 && left <= right);
    if (!found)
        index = -1;
    else
        index = mid;
    return index;
}
Analysis:
The algorithm requires at most log2 n + 1 comparisons; the computational
time is therefore proportional to log2 n, so the time complexity is O(log n).
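The halving idea can be checked on a small sorted array with a self-contained sketch (this is a separate illustration, not the listing above; the function name is an assumption):

```cpp
#include <cassert>

// Iterative binary search sketch: returns the index of key in the
// sorted array a[0..n-1], or -1 if key is not present.
int binary_search(const int a[], int n, int key)
{
    int left = 0, right = n - 1;
    while (left <= right)
    {
        int mid = (left + right) / 2;
        if (a[mid] == key) return mid;
        if (key < a[mid]) right = mid - 1;   // discard the upper half
        else              left  = mid + 1;   // discard the lower half
    }
    return -1;
}
```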
Chapter 3
Can data items be deleted?
Are data processed in some well-defined order, or is random access
allowed?
The data structure affects the algorithms, and vice versa. Each data structure
has costs and benefits; rarely is one data structure better than another in all cases.
A data structure requires:
Space for each data item (data members)
Time to perform each basic operation (member functions)
Programming effort
The study of data structures includes:
The logical (mathematical) description of the data structure, and
The implementation of the data structure.
i.e., Data structures have both logical and physical form.
Logical Form: the definition of the data items within ADT
Physical Form: the implementation of the data items in
the data structure.
ADT:
Type
Operation
Data Structure:
Storage
Subroutines
List ADT
The most common operations that are associated with data structures are:
Operation Description
Create
Insert
Delete
Traverse
Search
Sort
Merge
The type of linear structure we choose for a particular problem depends on the
relative frequency with which one performs the operations on the structure.
Array ADT
A linear array is a list of a finite number of homogeneous data elements. The
relationship between the data elements of the array is represented by the
physical relationship of the data in memory.
Implementation of array ADT
Data members: Stores the list of elements in consecutive array location
a1, a2, a3, a4…an-1, an
size: the size of the array.
index: indicates the current position in the list
Function Member
Create Declare an array of appropriate size or declare a dynamic
array and resize it when necessary.
Insert Insert an element at either of the following locations:
Insert Top
Insert Middle
Insert End
Delete Deletes an element from the list from
Top
Bottom
Middle
Traverse Passes through the list and processes every element
Search Searches for the occurrence of an element
Merge Combines two lists into a third list
Sort Rearranges the elements of the list in some defined order.
The linear relationship between data items in memory makes the following easy:
- Computation
- Traversal
On the other hand, it is expensive:
o To delete and insert
o To increase the size
Even if the array is dynamically allocated, an estimate of the maximum size of
the list is required. Usually this requires a high overestimate, which wastes
considerable space. This could be a serious limitation, especially if there are
many lists of unknown size.
Implementation of Array ADT
//Arrayadt.h
#include<iostream.h>
#include<conio.h>
typedef int dataType;
typedef int bool;
const int capacity=100;
bool isFull(lList l)
{
return (l.count==capacity);
}
//inserts a new element
bool insertAt(lList &l,dataType newItem, int index)
{
if(isFull(l))
return 0;
int last=l.count;
//shift from last to index
while(index<last)
{
l.node[last].data=l.node[last-1].data;
--last;
}
l.node[index].data=newItem;
l.count++;
return 1;
}
//inserts the item newItem before the item beforeItem
bool insertBefore(lList &l,dataType beforeItem, dataType newItem)
{
if(isFull(l) || isEmpty(l))
return 0;
int index=search(l,beforeItem);
if(index==-1)
return 0;
return insertAt(l,newItem,index);
}
Linked List
A linked list is a linear collection of elements, called nodes, where the linear
relationship is given by means of pointers or links. Each node has two parts:
the data part and the link part.
Data part: the part which contains the data element.
Link part: the part which contains the address of the next element.
A linked list is made up of a chain of nodes. It is the most basic self-
referential structure. Linked lists allow you to have a chain of nodes with
related data.
In order to avoid the linear cost of insertion and deletion, we need to
ensure that the list is not stored contiguously, since otherwise large parts of
the list would need to be moved. Because each node has a link part, the
successive elements of the list need not occupy adjacent memory locations.
This makes it easier to insert and delete.
Arrays are simple and fast, but we must specify their size at construction
time, and this has its own drawbacks: if you construct an array with space for
n elements, tomorrow you may need n + 1. Hence the need for a more flexible
scheme. A linked list uses space flexibly by dynamically allocating space for
each element as needed, which implies that one need not know the size of the
list in advance. Memory is efficiently utilized.
The disadvantage of a linked list: except for the first element, we cannot
access any element directly. Algorithms that require fast random access will
therefore not be as effective as required.
The linked list is a very flexible dynamic data structure: items may be
added to it or deleted from it at will. A programmer need not worry about how
many items a program will have to accommodate: this allows us to write
robust programs which require much less maintenance. A very common
source of problems in program maintenance is the need to increase the
capacity of a program to handle larger collections: even the most generous
allowance for growth tends to prove inadequate over time! In a linked list,
each item is allocated space as it is added to the list. A link is kept with each
item to the next item in the list.
List:
a1, a2, a3, a4, …, an
Logical view:
First → a1 → a2 → a3 → a4
Operations:
Insert:
Insert top: the new node is linked in ahead of a1, and First is updated.
Insert last: the new node (e.g. a5) is linked in after a4.
Insert nth: the new node is linked in between the (n-1)th and nth nodes.
Remove:
Remove first: First is moved to the second node.
Remove last: the link of the next-to-last node is set to null.
Remove nth: the (n-1)th node is linked directly to the (n+1)th node.
Traverse: Traversal is the most costly task, and is the drawback of a linked
list. Except for the first element, no element in the list can be
accessed directly, which makes the complexity of any traversal
operation O(n).
Implementation of Linked list ADT
(Diagram: the array of nodes after initialization. Each slot's next field holds
the index of the following slot — 1, 2, …, 15 in the figure — and the last
slot's next field is -1, so the unused slots form a chain: the free list.)
Data part:
Space for the nodes
Array of nodes of predetermined size.
Operations:
Create Declare an array of appropriate size or declare a dynamic array
and resize it when necessary.
Insert Insert an element at either of the following locations:
Insert Front
Insert After a Node
Insert Before a Node
Insert Last
Delete Deletes an element from the list from
Delete Top
After a Node
Before a node
Delete Node
Delete last node
Traverse Passes through the list and processes every element
Display
Search Searches for the occurrence of an element
Merge Combines two lists into a third list
Sort Rearranges the elements of the list in some defined order.
Utility functions Isfull
Isempty
Getprev
Returnnode
*/
struct nodeType
{
dataType item;
pointer next;
};
struct LList
{
nodeType node[capacity];
pointer first,free;
};
void initialize(LList &l);
boolean isempty(LList l);
boolean isfull(LList l);
pointer getnode(LList &l);
pointer getprev(LList,dataType newitem);
void insertFront(LList &l, dataType newItem);
void insertAfterNode(LList &l,dataType itemAfter,dataType newItem);
void insertBeforeNode(LList &l,dataType beforeItem,dataType newItem);
void insertLast(LList &l, dataType newItem);
void deleteAfterNode(LList &l,dataType itemAfter);
void deleteNode(LList &l,dataType deleteItem);
void deleteBeforeNode(LList &l,dataType beforeItem);
void delet(LList &l, dataType deleteitem);
void ascending(LList l1,LList &l2);
void display(LList l);
void returnnode(LList &l,pointer index);
/*
********************* Implementation************************
ALLIST.cpp
*/
#include<iostream.h>
#include<conio.h>
#include"c:\data\alist.h"
void initialize(LList &l)
{
l.first=-1;
l.free=0;
for(int i=0;i<capacity-1;i++)
l.node[i].next=i+1;
l.node[capacity-1].next=-1;
}
boolean isempty(LList l)
{
return (l.first==-1);
}
boolean isfull(LList l)
{
return (l.free==-1);
}
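The prototypes above declare getnode and returnnode, but their bodies do not appear in this extract. A sketch consistent with initialize() — where l.free heads a chain of unused slots linked through node[i].next, and -1 means "no node" — might look like this (the bodies are assumptions; only the prototypes survive here):

```cpp
#include <cassert>

typedef int dataType;
typedef int pointer;
const int capacity = 100;

struct nodeType { dataType item; pointer next; };
struct LList { nodeType node[capacity]; pointer first, free; };

// Mirrors the document's initialize(): all slots are chained into the free list.
void initialize(LList &l)
{
    l.first = -1;
    l.free = 0;
    for (int i = 0; i < capacity - 1; i++)
        l.node[i].next = i + 1;
    l.node[capacity - 1].next = -1;
}

// Take a slot off the front of the free list; returns -1 if the list is full.
pointer getnode(LList &l)
{
    pointer slot = l.free;
    if (slot != -1)
        l.free = l.node[slot].next;   // unlink it from the free chain
    return slot;
}

// Push a released slot back onto the front of the free list.
void returnnode(LList &l, pointer index)
{
    l.node[index].next = l.free;
    l.free = index;
}
```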
void insertFront(LList &l, dataType newItem)
{
int fSlot=getnode(l);
if(fSlot!=-1)
{
l.node[fSlot].item=newItem;
l.node[fSlot].next=l.first;
l.first=fSlot;
}
else
cout<<"\nNo free node!";
}
/* (tail of insertAfterNode: the search for itemAfter and the getnode call that sets fSlot precede this fragment) */
l.node[fSlot].item=newItem;
l.node[fSlot].next=l.node[temp].next;
l.node[temp].next=fSlot;
}
else
cout<<"\nNo free Node!";
}
else
cout<<"\nThe item to be searched is not found!\n";
}
void display(LList l)
{
int temp=l.first;
while(temp!=-1)
{
cout<<l.node[temp].item<<endl;
temp=l.node[temp].next;
}
if(l.first==-1)
cout<<"\nEmpty List:";
}
void deleteAfterNode(LList &l,dataType itemAfter)
{
pointer temp=l.first;
int found=0;
while(!found && temp!=-1)
{
if(l.node[temp].item==itemAfter)
found=1;
else
temp=l.node[temp].next;
}
if(found)
{
if(l.node[temp].next==-1)
cout<<"\nThere is no node after the item to be deleted:\n";
else
{
int tempNext=l.node[l.node[temp].next].next;
returnnode(l,l.node[temp].next);
l.node[temp].next=tempNext;
}
}
else
cout<<"\nThe Item data not found!";
}
void deleteBeforeNode(LList &l,dataType beforeItem)
{
pointer prev2,prev1=-1,temp=l.first;
int found=0;
while(!found && temp!=-1)
{
if(beforeItem==l.node[temp].item)
found=1;
else
{
prev2=prev1;
prev1=temp;
temp=l.node[temp].next;
}
}
if(found)
{
if(temp==l.first)
cout<<"\nThere is no Item before the item to be deleted:\n";
else
{
if(prev1==l.first)
{
l.first=temp;
returnnode(l,prev1);
}
else
{
l.node[prev2].next=temp;
returnnode(l,prev1);
}
}
}
else
cout<<"\nItem to be searched is not found:\n";
}
void deleteNode(LList &l,dataType deleteItem)
{ pointer prev,temp=l.first;
int found=0;
while(!found && temp!=-1)
{
if(deleteItem==l.node[temp].item)
found=1;
else
{
prev=temp;
temp=l.node[temp].next;
}
}
if(found)
{
if(temp==l.first)
{
l.first=l.node[temp].next;
returnnode(l,temp);
}
else
{
l.node[prev].next=l.node[temp].next;
returnnode(l,temp);
}
}
else
cout<<"\nThe item to be deleted is not found!";
}
else if(l2.node[l2.first].item>l2.node[fSlot].item)
{ l2.node[fSlot].next=l2.first;
l2.first=fSlot;
}
else
{ pointer prev,temp=l2.first;
int flag=0;
while(temp!=-1 && !flag)
{ if(l2.node[temp].item>l2.node[fSlot].item)
flag=1;
else
{ prev=temp;
temp=l2.node[temp].next;
}
}
if(flag)
{ l2.node[fSlot].next=temp;
l2.node[prev].next=fSlot;
}
else
{ l2.node[fSlot].next=-1;
l2.node[prev].next=fSlot;
}
}
tmp=l1.node[tmp].next;
}while(tmp!=-1);
cout<<"\nThe original list is copied to another list in ascending:";
}
}
A linked list is a data structure that is built from structures and pointers. It
forms a chain of "nodes" with pointers representing the links of the chain and
holding the entire thing together.
In a linked list, each node has a link to the next node in the series. The last
node has a link to the special value NULL, which any pointer (whatever its
type) can point to, to show that it is the last link in the chain. There is also
another special pointer, called Start (also called head), which points to the
first link in the chain so that we can keep track of it.
The key part of a linked list is a structure, which holds the data for each node
(the name, address, age or whatever for the items in the list), and, most
importantly, a pointer to the next node. Here we have given the structure of a
typical node:
Defining the data structure for a linked list
struct node
{
char name[20]; // Name of up to 20 letters
int age;
float height; // In metres
node *nxt; // Pointer to next node
};
struct LList
{
node *first;
int count;
};
The important part of the structure is the line before the closing curly
brackets. This gives a pointer to the next node in the list. This is the only case
in C++ where you are allowed to refer to a data type (in this case node) before
you have even finished defining it!
Adding a node
The simplest strategy for adding an item to a list is to:
a. allocate space for a new node,
b. copy the item into it,
c. make the new node's next pointer point to the current head of the list
and
d. make the head of the list point to the newly allocated node.
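Steps a-d above can be sketched as a small helper (a minimal sketch: the node is trimmed to a single field, and the function name is an assumption):

```cpp
#include <cassert>
#include <cstddef>

// Minimal node for the sketch (fields trimmed from the struct above).
struct node { int age; node *nxt; };

// The four steps for adding at the head of the list.
node *add_front(node *start_ptr, int age)
{
    node *temp = new node;   // a. allocate space for a new node
    temp->age = age;         // b. copy the item into it
    temp->nxt = start_ptr;   // c. point it at the current head of the list
    return temp;             // d. the new node becomes the head
}
```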
Adding a node to the end of the list
The first problem that we face is how to add a node to the list. For simplicity's
sake, we will assume that it has to be added to the end of the list, although it
could be added anywhere in the list (a problem we will deal with later on).
Firstly, we declare the space for a pointer item and assign a temporary
pointer to it. This is done using the new statement as follows:
node *temp = new node;
We can refer to the new node as *temp, i.e. "the node that temp points to".
When the fields of this structure are referred to, brackets can be put round
the *temp part, as otherwise the compiler will think we are trying to refer to
the fields of the pointer. Alternatively, we can use the arrow pointer notation.
Having set up the information, we have to decide what to do with the
pointers. Of course, if the list is empty to start with, there's no problem - just
set the Start pointer to point to this node (i.e. set it to the same value as
temp):
if (l.first == NULL)
l.first = tmp;
It is harder if there are already nodes in the list. In this case, the secret is to
declare a second pointer, temp2, to step through the list until it finds the last
node.
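That walk can be sketched as follows (a hedged sketch: temp already points at the filled-in new node, the node struct is trimmed to one field, and the function shape is an assumption):

```cpp
#include <cassert>
#include <cstddef>

struct node { int age; node *nxt; };   // trimmed node for the sketch

// Append to the end: if the list is empty the new node becomes the head;
// otherwise a second pointer temp2 steps through to the last node.
node *add_end(node *first, node *temp)
{
    temp->nxt = NULL;            // the new node will be the last one
    if (first == NULL)
        return temp;             // empty list: temp is now the head
    node *temp2 = first;
    while (temp2->nxt != NULL)   // march along until the last node
        temp2 = temp2->nxt;
    temp2->nxt = temp;           // hook the new node on at the end
    return first;
}
```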
Deleting a node
Firstly, the start pointer doesn't point to NULL, so we don't have to display an
"Empty list, wise guy!" message. Let's get straight on with step 2 - set the
pointer temp1 to the same as the start pointer:
The next pointer from this node isn't NULL, so we haven't found the end node.
Instead, we set the pointer temp2 to the same node as temp1
Going back to step 3, we see that temp1 still doesn't point to the last node in
the list, so we make temp2 point to what temp1 points to
Eventually, this goes on until temp1 really is pointing to the last node in the
list, with temp2 pointing to the penultimate node:
Now we have reached step 8. The next thing to do is to delete the node
pointed to by temp1
We suppose you want some code for all that! All right then ....
node *temp1, *temp2;
if (start_ptr == NULL)
cout << "The list is empty!" << endl;
else
{ temp1 = start_ptr;
while (temp1->nxt != NULL)
{ temp2 = temp1;
temp1 = temp1->nxt;
}
delete temp1;
temp2->nxt = NULL;
}
The code seems a lot shorter than the explanation!
Now, the sharp-witted amongst you will have spotted a problem. If the list
only contains one node, the code above will malfunction. This is because the
function goes as far as the temp1 = l.first statement, but never gets as far as
setting up temp2. The code above has to be adapted so that if the first node is
also the last (has a NULL nxt pointer), then it is deleted and the l.first pointer
is assigned to NULL. In this case, there is no need for the pointer temp2:
node *temp1, *temp2;
if (start_ptr == NULL)
cout << "The list is empty!" << endl;
else
{ temp1 = start_ptr;
if (temp1->nxt == NULL) // This part is new!
{ delete temp1;
start_ptr = NULL;
}
else
{ while (temp1->nxt != NULL)
{ temp2 = temp1;
temp1 = temp1->nxt;
}
delete temp1;
temp2->nxt = NULL;
}
}
Navigating through the list
One thing you may need to do is to navigate through the list, with a
pointer that moves backwards and forwards through the list, like an index
pointer in an array. This is certainly necessary when you want to insert or
delete a node from somewhere inside the list, as you will need to specify the
position. We will call the mobile pointer current. First of all, it is declared, and
set to the same value as the l.first pointer:
node *current;
current = l.first;
Notice that you don't need to set current equal to the address of the start
pointer, as they are both pointers. The statement above makes them both
point to the same thing:
It's easy to get the current pointer to point to the next node in the list
(i.e. move from left to right along the list). If you want to move current along
one node, use the nxt field of the node that it is pointing to at the moment:
current = current->nxt;
In fact, we had better check that it isn't pointing to the last item in the
list. If it is, then there is no next node to move to:
if (current->nxt == NULL)
cout << "You are at the end of the list." << endl;
else
current = current->nxt;
Moving the current pointer back one step is a little harder, because
we have no way of moving back a step automatically from the current node.
The only way to find the node before the current one is to start at the
beginning, work our way through, and stop when we find the node before the
one we are considering at the moment. We can tell when this happens, as the
next pointer from that node will point to exactly the same place in memory
as the current pointer (i.e. the current node).
First of all, we had better check to see if the current node is also the
first one. If it is, then there is no "previous" node to point to. If not, we check
through all the nodes in turn until we detect that we are just behind the
current one (like a pantomime - "behind you!"):
if (current ==l.first)
cout << "You are at the start of the list" << endl;
else
{ node *previous; // Declare the pointer
previous = l.first;
while (previous->nxt != current)
{
previous = previous->nxt;
}
current = previous;
}
The else clause translates as follows: Declare a temporary pointer (for
use in this else clause only). Set it equal to the start pointer. All the time that
it is not pointing to the node before the current node, move it along the line.
Once the previous node has been found, the current pointer is set to that
node - i.e. it moves back along the list.
The next easiest thing to do is to delete a node from the list directly
after the current position. We have to use a temporary pointer to point to the
node to be deleted. Once this node has been "anchored", the pointers to the
remaining nodes can be readjusted before the node on death row is deleted.
Here is the sequence of actions:
1. Firstly, the temporary pointer is assigned to the node after the current
one. This is the node to be deleted:
2. Now the pointer from the current node is made to leap-frog the next
node and point to the one after that:
3. The last step is to delete the node pointed to by temp.
Here is the code for deleting the node. It includes a test at the start to test
whether the current node is the last one in the list:
if (current->nxt == NULL)
cout << "There is no node after current" << endl;
else
{
node *temp;
temp = current->nxt;
current->nxt = temp->nxt; // Could be NULL
delete temp;
}
/*
Pointer implementation of Linked List
*****************PLLIST.h*****************************
*/
#include<iostream.h>
#include<conio.h>
typedef int dataType;
typedef int boolean;
struct node
{
dataType item;
int dup;
node *next;
};
typedef node* pointer;
struct PLList
{
pointer first;
};
void initialize(PLList &l);
boolean isempty(PLList l);
boolean isfull();
pointer getnode();
pointer getprev(PLList l, dataType newitem);
void insertFront(PLList &l, dataType newitem);
void insertLast(PLList &l, dataType newitem);
void insertAfter(PLList &l, dataType afteritem,dataType newitem);
void insertBefore(PLList &l, dataType beforeitem,dataType newitem);
void deletFront(PLList &l);
void deletLast(PLList &l);
void deletNth(PLList &l, int N);
void delet(PLList &l, dataType deleteitem);
void display(PLList l);
void returnnode(pointer loc);
void copy(PLList source, PLList &destination);
void merge(PLList l1,PLList l2,PLList &l3);//merge into a third list
int size(PLList l);
#include "PLLIST.h"
boolean isempty(PLList l)
{
return (l.first==NULL);
}
boolean isfull()
{ pointer test=new node;
if(test)
{ delete test;
return 0;
}
return 1;
}
pointer getnode()
{
pointer tmp=NULL;
if(!isfull())
tmp=new node;
return tmp;
}
void insertLast(PLList &l, dataType newitem)
{ // (head reconstructed from the prototype in PLLIST.h)
if(!isempty(l))
{
pointer tmp=l.first;
pointer fnode=getnode();
if(fnode!=NULL)
{
fnode->item=newitem;
while(tmp->next!=NULL)
tmp=tmp->next;
tmp->next=fnode;
fnode->next=NULL;
}
else
cout<<"The List is Full";
}
else
insertFront(l,newitem);
}
void insertBefore(PLList &l, dataType beforeitem, dataType newitem)
{ // (head reconstructed from the prototype in PLLIST.h)
if(!isempty(l))
{
pointer prev=NULL,current=l.first;
int found=0;
while(!found && current!=NULL)
{
if(current->item==beforeitem)
found=1;
else
{
prev=current;
current=current->next;
}
}
if(found)
{
if(current==l.first)
insertFront(l,newitem);
else
{
pointer fnode=getnode();
fnode->item=newitem;
prev->next=fnode;
fnode->next=current;
}
}
else
cout<<"No such element";
}
else
cout<<"The List is empty";
}
void returnnode(pointer loc)
{
delete loc;
loc=NULL; // note: loc is passed by value, so this assignment does not affect the caller's pointer
}
void deletLast(PLList &l)
{ // (head reconstructed from the prototype in PLLIST.h)
if(!isempty(l))
{
pointer prev=l.first;
pointer current=l.first;
while(current->next!=NULL)
{
prev=current;
current=current->next;
}
prev->next=NULL;
returnnode(current);
}
else
cout<<"The List is empty";
}
if(current)
{
prev->next=current->next;
returnnode(current);
}
else
cout<<deleteitem<<" is not found";
}
}
else
cout<<"Empty list";
int size(PLList l)
{
int count=0;
if(!isempty(l))
{
pointer tmp=l.first;
while(tmp)
{
count++;
tmp=tmp->next;
}
}
return count;
}
void display(PLList l)
{
if(!isempty(l))
{
pointer tmp=l.first;
while(tmp)
{
cout<<tmp->item<<endl;
tmp=tmp->next;
}
}
else
cout<<"The List is empty";
}
Variations of Linked List
1. Linked list with a dummy header or trailer node
The advantage of a dummy header or trailer node is that it eliminates the
special cases in inserting or deleting a node. The dummy part of the node can
be used to store information about the list.
a. Linked list with dummy header
Even if there is no element in the list, the dummy node still exists, so there is
no need to treat the empty list as a special case. The dummy head node also
helps to keep information about the list; for example, we can store the number
of elements in the list there.
(Diagram: lists of various lengths, each starting with a dummy header node
that stores the element count — 0 for the empty list — followed by the nodes
pointed to via First.)
An empty linked list with dummy header and trailer nodes looks like:
(Diagram: the dummy header, with count 0, linked to the dummy trailer.)
2. Linked list with two pointers
A linked list with two pointers, one to the first node and the other to the
last node, makes insertion at the end of the list fast. Note that it does not
help deletion at the end, since the node before the last one is still needed
(its link must be set to null).
(Diagram: the list with both First and Last pointers.)
3. Doubly linked list (two-way list)
They permit scanning or searching of the list in both directions. (To go
backwards in a simple list, it is necessary to go back to the start and scan
forwards.) Many applications require searching backwards and forwards
through sections of a list: for example, searching for a common name like
"Kim" in a Korean telephone directory would probably need much scanning
backwards and forwards through a small region of the whole list, so the
backward links become very useful. In this case, the node structure is altered
to have two links:
struct node
{
dataType item;
node *prev;
node *next;
};
node *current;
current = new node;
current->item = "Fred";
current->next = NULL;
current->prev = NULL;
I have also included some code to declare the first node and set its
pointers to NULL. It gives the following situation:
void insertFront(string newitem)
{
// Declare a temporary pointer and move it to the start
node *temp = current;
while (temp->prev != NULL)
temp = temp->prev;
// Declare a new node and link it in
node *temp2;
temp2 = new node;
temp2->item = newitem;
temp2->prev = NULL;
temp2->next = temp; // Links to current list
temp->prev = temp2;
}
We'll go through the function for adding a node to the right-most end of
the list; the method is similar for adding a node at the other end. Firstly, a
temporary pointer is set up and is made to march along the list until it points
to the last node in the list.
After that, a new node is declared, and the name is copied into it. The next
pointer of this new node is set to NULL to indicate that this node will be the
new end of the list. The prev pointer of the new node is linked into the last
node of the existing list. The next pointer of the current end of the list is set to
the new node.
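That description can be sketched as follows (a minimal, self-contained sketch; the function name and the choice to reach the list through any node are assumptions):

```cpp
#include <cassert>
#include <cstddef>
#include <string>

struct node
{
    std::string item;
    node *prev;
    node *next;
};

// Append at the right-most end: march to the last node, then link the
// new node in on both sides.
node *add_right(node *any, const std::string &newitem)
{
    node *temp = any;
    while (temp->next != NULL)      // walk to the last node
        temp = temp->next;
    node *temp2 = new node;
    temp2->item = newitem;
    temp2->next = NULL;             // the new node is the new end of the list
    temp2->prev = temp;             // back-link into the old end
    temp->next = temp2;             // old end points forward to the new node
    return temp2;
}
```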
The advantages of a two-way list over a one-way list:
Deleting a node whose location is given is easy, because the location of
the preceding node is available directly; in a one-way list we may need to
traverse the entire list to find it. Inserting a node before a given node at a
given location is likewise very easy (inserting after a node is already easy in
either kind of list, since the location of the previous node is not needed).
Traversing the list to process each element can be done in either direction.
Finally, searching a sorted list for an item that is most likely located near the
end of the list can be done by traversing the list in the backward direction.
Chapter 4
Queues
A queue follows the first-in-first-out (FIFO) principle.
Elements may be added to the queue at any time; however, only the element in the queue
for the longest time may be removed.
Elements are inserted at the rear (enqueued) and removed from the front (dequeued).
Queue ADT
front Return the Object at the front of the queue without removing the Object. If the
queue is empty, return an error.
Input: none Return: Object
Array-Based Queue
“Normal” configuration:
The head is the front of the queue, the tail is the rear of the queue.
No sentinel node is required.
Priority Queue
• Like an ordinary queue, a priority queue has a front and a rear, and items are removed from
the front.
• However, in a priority queue, items are ordered so that the item with the highest priority is
always at the front.
• When an item is inserted, this order must be maintained.
• A priority queue is said to be an ascending-priority queue if the item with the smallest key
has the highest priority.
• A priority queue is said to be a descending-priority queue if the item with the largest key has
the highest priority.
Operations
• Remove an element
• Return the element at the front position
• Insert an element
• Find the place to insert
• Shift the elements to make room
• Insert the element
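The three insert steps above, for a priority queue kept as a sorted array, can be sketched like this (an ascending-priority queue with the smallest key at the front; the function shape and names are assumptions):

```cpp
#include <cassert>

// Insert key into the sorted array q[0..count-1], keeping it sorted.
void pq_insert(int q[], int &count, int key)
{
    int pos = 0;
    while (pos < count && q[pos] <= key)   // 1. find the place to insert
        pos++;
    for (int i = count; i > pos; i--)      // 2. shift elements to make room
        q[i] = q[i - 1];
    q[pos] = key;                          // 3. insert the element
    count++;
}
```

Removal is then trivial: the highest-priority item is always at the front.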
2. Sorted Linked List implementation
Elements are inserted into the list in sorted order in terms of a priority key field.
Deletions are performed at only one end, hence preserving the status of the priority
queue. All other functionalities associated with a linked list ADT are performed without
modification.
3. Heap-based implementation
• Heap is a kind of tree
• It provides both insertion and deletion operations
• It’s a method of choice where speed is important and a high number of insertions is
expected
Chapter 5 Stacks
Top 12
9
8
4
After pushing a single element (45)
Top 45
12
9
8
4
After the 2nd pushing (39)
Top 39
45
12
9
8
4
After popping (39 is removed)
Top 45
12
9
8
4
Implementation of Stack ADT
Stacks can be implemented either as an array (contiguous list) or as a linked
list. We want a set of operations that will work with either type of
implementation; that means the method of implementation is hidden and can
be changed without affecting the programs that use them.
Algorithms of the Basic Operations:
CreateStack()
{
Remove existing items from the stack
Initialize the stack to empty
}
Push()
{
if there is room
put an item on the top of the stack
else
give an error message
}
Pop()
{
if stack not empty
{
return the value of the top item
remove the top item from the stack
}
else
give an error message
}
Array Implementation of Stacks
Data Members: we need two data members:
An integer variable which indicates the index of the top of the stack.
An array of elements, which forms the stack structure.
Function Members:
Initialize
Push
Pop
Top
Isempty
Isfull
#include<iostream.h>
#include<conio.h>
#define stacksize 100
typedef int dataType;
typedef int pointer;
typedef int boolean;
struct stack
{
pointer top;
dataType items[stacksize];
};
Here, as you might have noticed, addition of an element is known as the PUSH
operation. So, if an array is given to you which is supposed to act as a STACK,
you know that it has to be a STATIC stack, meaning data will overflow if you
cross the upper limit of the array. So, keep this in mind.
Algorithm:
Step-1: Increment the Stack TOP by 1. Check whether it is always less than
the Upper Limit of the stack. If it is less than the Upper Limit go to step-2 else
report -"Stack Overflow"
Step-2: Put the new element at the position pointed by the TOP
Note that in array implementation, we have taken TOP = -1 to signify the empty
stack, as this simplifies the implementation.
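The two steps above can be sketched as a push function (a self-contained sketch: the notes give only the algorithm, so the struct copy and the 1/0 success convention here are assumptions):

```cpp
#include <cassert>

const int stacksize = 100;
struct stack { int top; int items[stacksize]; };   // top = -1 means empty

// Step 1: bump TOP and check it stays below the upper limit;
// Step 2: store the new element at the position TOP points to.
int push(stack &s, int item)
{
    if (s.top + 1 >= stacksize)   // TOP would pass the upper limit
        return 0;                 // report "Stack Overflow"
    s.top++;
    s.items[s.top] = item;
    return 1;
}
```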
int length(stack s)
{
return (s.top+1); // top is -1 when the stack is empty
}
int top(stack s)
{
return s.items[s.top];
}
POP is the synonym for delete when it comes to Stack. So, if you're taking an
array as the stack, remember that you'll return an error message, "Stack
underflow", if an attempt is made to Pop an item from an empty Stack. OK.
int pop(stack &s)
{
if(!isempty(s))
{
s.top--;
return 1;
}
return 0;
}
void display(stack s)
{
while(s.top+1!=0)
{
cout<<s.items[s.top]<<endl;
s.top=s.top-1;
}
}
boolean isfull(stack s)
{
return s.top==stacksize-1;
}
boolean isempty(stack s)
{
return s.top==-1;
}
Linked List Implementation of Stacks
struct node
{
int item;
struct node *next;
};
struct stack
{
node *stack; /* bottom of the stack; NULL when the stack is empty */
node *top; /* points to the top-most node */
};
The PUSH operation
It’s very similar to the insertion operation in a dynamic singly linked list.
The only difference is that here you'll add the new element only at the end of
the list, which means addition can happen only from the TOP. Since a dynamic
list is used for the stack, the Stack is also dynamic, means it has no prior
upper limit set. So, we don't have to check for the Overflow condition at all!
First we create the new element to be pushed onto the Stack. Then the
current TOP-most element is made to point to our newly created element, and
TOP is moved to point to the last element in the stack, which is our newly
added element.
Algorithm:
Step-1:- If the Stack is empty go to step-2 or else go to step-3
Step-2:- Create the new element and make your "stack" and "top" pointers
point to it and quit.
Step-3:- Create the new element and make the last (top most) element of the
stack to point to it
Step-4:- Make that new element your TOP most elements by making the "top"
pointer point to it.
void push(struct stack *s, int newval)
{ /* head reconstructed from the surrounding text; newnode is created and filled first */
node *newnode = new node;
newnode->item = newval;
newnode->next = NULL;
if(s->stack == NULL)
{
s->stack = newnode;
s->top = s->stack;
}
else
{
s->top->next = newnode;
s->top = newnode;
}
}
This is again very similar to the deletion operation in any linked list, but
you can only delete from the end of the list and only one at a time; and that
makes it a stack. Here, we'll have a list pointer, "target", which points to the
last-but-one element in the list (stack). Every time we POP, the TOP-most
element is deleted and "target" is made the new TOP-most element.
First we get "target" pointing to the last-but-one node, then we free the
TOP-most element, and finally we make the "target" node our new TOP-most
element. If only one element is left in the stack, "target" is not needed; the
bottom of the stack (the "stack" pointer) is used instead. See how...
int pop(struct stack *s)
{
int pop_val = 0;
struct node *target = s->stack;
if(s->stack == NULL)
cout<<"Stack Underflow";
else
{
if(s->top == s->stack) /* only one element left */
{
pop_val = s->top->item;
delete s->top;
s->stack = NULL;
s->top = NULL;
}
else
{
while(target->next != s->top) target = target->next;
pop_val = s->top->item;
delete s->top;
s->top = target;
target->next = NULL;
}
}
return(pop_val);
}
Function Calls
When a function is called, arguments (including the return address) have to be
passed to the called function.
If these arguments are stored in a fixed memory area then the function cannot be
called recursively, since the 1st return address would be overwritten by the 2nd
return address before the first was used:
10 call function abc(); /* retadrs = 11 */
11 continue;
...
90 function abc;
91 code;
92 if (expression)
93 call function abc(); /* retadrs = 94 */
94 code
95 return /* to retadrs */
A stack allows a new instance of retadrs for each call to the function. Recursive
calls on the function are limited only by the extent of the stack.
10 call function abc(); /* retadrs1 = 11 */
11 continue;
...
90 function abc;
91 code;
92 if (expression)
93 call function abc(); /* retadrs2 = 94 */
94 code
95 return /* to retadrsn */
Queues
A queue is a data structure which permits insertion only at one end and deletion at
the opposite end. It is also called a FIFO (First In First Out) buffer.
Examples:
Calls to large companies are generally placed on a queue when all operators are
busy.
Queue Operations:
insertion: enqueue append q_in qpush
deletion: dequeue serve q_out qpop
Implementations of Queues
Array
Circular Array
Linked List
1. Arrays
InitializeQueue:
Declare array of size Asize;
Set Front = 0, Rear = 0;
EnQueue(item):
if(Rear < ASize)
Queue[Rear++] = item;
else
"queue full";
DeQueue(item):
if(Front < Rear)
item = Queue[Front++];
else
"queue empty";
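The pseudocode above can be sketched in C++ as follows; the names (ASize, Front, Rear) follow the pseudocode, and the error cases simply print a message.

```cpp
#include <iostream>

const int ASize = 100;          // capacity of the queue array

struct ArrayQueue {
    int Queue[ASize];
    int Front = 0, Rear = 0;    // stored items occupy positions Front .. Rear-1

    void EnQueue(int item) {
        if (Rear < ASize)
            Queue[Rear++] = item;          // insert at the rear
        else
            std::cout << "queue full\n";
    }

    bool DeQueue(int &item) {
        if (Front < Rear) {                // at least one stored item
            item = Queue[Front++];         // remove from the front
            return true;
        }
        std::cout << "queue empty\n";
        return false;
    }
};
```

Note that this simple version runs out of space permanently: once Rear reaches ASize the queue reports full even after dequeues, which is exactly the problem the circular array below solves.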
2. Circular Arrays
A problem with simple arrays is we run out of space even if the queue never reaches
the size of the array - the end of the queue chases the front. Hence simulated
circular arrays can be a better idea. However circular arrays present problems with
deciding when the array is full or empty (Is it full or empty when the in pointer is
equal to the out pointer?). The simplest answer is to keep track of the queue length:
InitializeQueue:
Declare array of size Asize;
Set Front = 0, Rear = 0, QSize = 0;
EnQueue(item){
if(QSize < ASize) {
Queue[Rear++] = item;
QSize++;
} else
"queue full";
if(Rear == ASize) Rear = 0;
}
DeQueue(item){
if(QSize > 0) {
item = Queue[Front++];
QSize--;
} else
"queue empty";
if(Front == ASize) Front = 0;
}
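A C++ sketch of the circular queue above; a deliberately small ASize is used here so the wrap-around is easy to exercise (treat this as an illustration, not a production container).

```cpp
#include <iostream>

const int ASize = 4;            // small on purpose, to show wrap-around

struct CircularQueue {
    int Queue[ASize];
    int Front = 0, Rear = 0;
    int QSize = 0;              // item count: tells "full" from "empty"

    void EnQueue(int item) {
        if (QSize < ASize) {
            Queue[Rear++] = item;
            QSize++;
            if (Rear == ASize) Rear = 0;   // wrap the rear index
        } else
            std::cout << "queue full\n";
    }

    bool DeQueue(int &item) {
        if (QSize > 0) {
            item = Queue[Front++];
            QSize--;
            if (Front == ASize) Front = 0; // wrap the front index
            return true;
        }
        std::cout << "queue empty\n";
        return false;
    }
};
```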
Stack
A stack is a restricted variant of a list in which elements may be inserted or removed from only one
end.
“Last-In, First-Out” principle – elements are retrieved in reverse order of their insertion
class Stack // Stack ADT
{
public:
Stack(); // Constructor
~Stack(); // Destructor
void clear(); // Remove all from stack
void push(const Elem item); // Push onto stack
Elem pop(); // Pop from top of stack
Elem topValue() const; // Get value of top Elem
bool isEmpty() const; // TRUE if stack is empty
};
Example: Problem – Reversing a Word
Write an algorithm that reverses a word (string)
The algorithm should:
o Extract characters from the string one by one
o Create a new string with the same characters but in the reverse order
Reverse algorithm using a Stack
create a stack
take characters from the string one by one and push them onto the stack
pop characters from the stack and append them to the output string
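Using the standard library's std::stack, the steps above can be sketched as:

```cpp
#include <stack>
#include <string>

// Reverse a word with a stack: push every character, then pop them back.
// Popping retrieves the characters in reverse order of their insertion.
std::string reverseWord(const std::string &word)
{
    std::stack<char> s;
    for (char c : word)
        s.push(c);              // push characters onto the stack
    std::string out;
    while (!s.empty()) {
        out += s.top();         // pop characters and append to the output
        s.pop();
    }
    return out;
}
```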
Implementations of stack:
• Array-based implementation
• Uses an array to store stack elements
• Linked-list based implementation
• Uses a linked list to store stack elements
Array based implementation
Elements are stored in an array
Maximum size of array must be known before the stack is created
Bottom of the stack is at position 0
Top – the only position addressed [no current]
Array based stack
Stack::Stack(int sz)                // Constructor: create the array
{ size = sz; top = 0; listarray = new Elem[sz]; }

Stack::~Stack()                     // Destructor: reclaim the array
{ delete [] listarray; }

void Stack::clear()                 // Remove all from stack
{ top = 0; }

void Stack::push(const Elem item)   // Push Elem onto stack
{ assert(top < size); listarray[top++] = item; }

Elem Stack::pop()                   // Pop from top of stack
{ assert(!isEmpty());               // Must be an Elem to pop
  return listarray[--top]; }

Elem Stack::topValue() const        // Get value of top Elem
{ assert(!isEmpty());
  return listarray[top-1]; }

bool Stack::isEmpty() const         // TRUE if stack is empty
{ return top == 0; }
class Link
{ private:
    Elem element;           // Elem value for node
    Link *next;             // Pointer to next node
    friend class Stack;     // let Stack reach element and next
public:
    Link(const Elem elemval, Link* nextval = NULL)   // Constructor
    { element = elemval; next = nextval; }
};
class Stack // Linked stack
{ private:
    Link* top;                  // Pointer to top node
public:
    Stack();                    // Constructor
    ~Stack();                   // Destructor
    void clear();               // Remove all from stack
    void push(const Elem item); // Push onto stack
    Elem pop();                 // Pop from top of stack
    Elem topValue() const;      // Get value of top Elem
    bool isEmpty() const;       // TRUE if stack is empty
};
Stack::Stack()                  // Constructor
{ top = NULL; }

Stack::~Stack()                 // Destructor: free every node
{ while(top != NULL)
  { Link* temp = top;
    top = top->next;
    delete temp;
  }
}

void Stack::clear()             // Remove all from stack
{ while(top != NULL)
  { Link* temp = top;
    top = top->next;
    delete temp;
  }
}

void Stack::push(const Elem item)   // Push onto stack
{ top = new Link(item, top); }

Elem Stack::pop()               // Pop from top of stack
{ assert(!isEmpty());
  Elem temp = top->element;     // save the value
  Link* ltemp = top->next;      // unlink the top node
  delete top;
  top = ltemp;
  return temp;
}

Elem Stack::topValue() const    // Get value of top Elem
{ assert(!isEmpty());
  return top->element;
}

bool Stack::isEmpty() const     // TRUE if stack is empty
{ return top == NULL; }
Chapter 6
Tree ADT
A general tree is a set of nodes that is either empty or has a designated node called the root,
from which zero or more sub-trees descend hierarchically. Each sub-tree satisfies the definition
of a tree. A tree is a set of nodes and edges that connect pairs of nodes. It is an abstract model of a
hierarchical structure. A rooted tree has the following structure:
One node is distinguished as the root.
Every node C except the root is connected from exactly one other node P. P is C's parent,
and C is one of P's children.
There is a unique path from the root to each node.
The number of edges in a path is the length of the path.
Tree Terminologies
Consider the following tree.
[Figure: a sample tree with root A; its children B, E, F, and G; and further
descendants C, D, H, I, J, K, L, and M on the lower levels.]
A node that has subtrees is the parent of the roots of those subtrees.
The roots of the subtrees are the children of that node.
Some Properties of Trees:
A tree with n nodes has n-1 edges
The depth of a root node is zero
The height of leaves is zero.
Binary Tree
A binary tree is a tree in which each node has at most two subtrees, i.e. a node has at most two
children, called the left child and the right child.
Full Binary Tree: a binary tree where each node has either zero or two children.
Balanced Binary Tree: a binary tree where each node except the leaf nodes has both left and right
children and all the leaves are at the same level.
Complete Binary Tree: a binary tree in which the length of the path from the root to any leaf node
is either h or h-1, where h is the height of the tree. The deepest level should
also be filled from left to right.
A binary tree is called skewed if every node has at most one child, so that the length of the tree is
only one less than the number of nodes.
Complete binary trees can be implemented in an array without any loss of memory space.
Implementation of a binary tree
1. array implementation
Data members:
An array to store elements
Function members:
initialize()
insert()
delete()
:
traverse ()
2. linked list implementation
Data members:
Node with: - Left pointer to a node.
-Right pointer to a node.
-Data part to store the element.
Pointer to the root node
Counter to hold the number of elements in the tree.
[Figure: the Tree struct, holding the Root pointer and the Count, lives on the
stack, while the nodes A, B, and C are allocated in dynamic memory (heap).]
Function members:
initialize()
insert()
delete()
:
traverse()
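The node and tree records described above might be declared as follows (the member names are illustrative assumptions, not taken from the text):

```cpp
// Linked representation: each node stores its element plus left and right
// child pointers; the tree keeps a root pointer and an element count.
struct Node {
    int data;               // data part storing the element
    Node *left;             // left pointer to a node
    Node *right;            // right pointer to a node
};

struct Tree {
    Node *root = nullptr;   // pointer to the root node
    int count = 0;          // counter holding the number of elements
};
```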
Example:
RootNodePtr
10
6 15
4 8 14 18
7 12 16 19
11 13 17
Preorder traversal: - 10, 6, 4, 8, 7, 15, 14, 12, 11, 13, 18, 16, 17, 19
Inorder traversal: - 4, 6, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19
Postorder traversal: - 4, 7, 8, 6, 11, 13, 12, 14, 17, 16, 19, 18, 15, 10
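The three orders can be sketched as recursive functions; here each visit appends the value to a string so the result can be inspected (the minimal Node type is an assumption for the sketch).

```cpp
#include <string>

struct Node {
    int data;
    Node *left = nullptr;
    Node *right = nullptr;
};

// Preorder: node first, then left subtree, then right subtree
void preorder(Node *n, std::string &out)
{
    if (n == nullptr) return;
    out += std::to_string(n->data) + " ";
    preorder(n->left, out);
    preorder(n->right, out);
}

// Inorder: left subtree, then node, then right subtree
void inorder(Node *n, std::string &out)
{
    if (n == nullptr) return;
    inorder(n->left, out);
    out += std::to_string(n->data) + " ";
    inorder(n->right, out);
}

// Postorder: left subtree, then right subtree, then node
void postorder(Node *n, std::string &out)
{
    if (n == nullptr) return;
    postorder(n->left, out);
    postorder(n->right, out);
    out += std::to_string(n->data) + " ";
}
```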
B-Trees (or N-ary Trees)
Introduction
A B-tree is a specialized multiway tree designed especially for use on disk. In a B-tree each node
may contain a large number of keys. The number of subtrees of each node, then, may also be
large. A B-tree is designed to branch out in this large number of directions and to contain a lot of
keys in each node so that the height of the tree is relatively small. This means that only a small
number of nodes must be read from disk to retrieve an item. The goal is to get fast access to the
data, and with disk drives this means reading a very small number of records. Note that a large
node size (with lots of keys in the node) also fits with the fact that with a disk drive one can
usually read a fair amount of data at once.
Definitions
A multiway tree of order m is an ordered tree where each node has at most m children. For
each node, if k is the actual number of children in the node, then k - 1 is the number of keys in
the node. If the keys and subtrees are arranged in the fashion of a search tree, then this is called a
multiway search tree of order m. For example, the following is a multiway search tree of order
4. Note that the first row in each node shows the keys, while the second row shows the pointers
to the child nodes. Of course, in any useful application there would be a record of data associated
with each key, so that the first row in each node might be an array of records where each record
contains a key and its associated data. Another approach would be to have the first row of each
node contain an array of records where each record contains a key and a record number for the
associated data record, which is found in another file. This last method is often used when the
data records are large. The example software will use the first method.
What does it mean to say that the keys and subtrees are "arranged in the fashion of a search
tree"? Suppose that we define our nodes as follows:
struct nodetype
{
int Count; // number of keys stored in the current node
ItemType Key[3]; // array to hold the 3 keys
long Branch[4]; // array of fake pointers (record numbers)
};
Then a multiway search tree of order 4 has to fulfill the following conditions related to the
ordering of the keys:
The keys in each node are in ascending order.
At every node, the subtree starting at Branch[0] contains only keys that are less than
Key[0]; the subtree starting at Branch[1] contains only keys greater than Key[0] and
less than Key[1]; the subtree starting at Branch[2] contains only keys greater than
Key[1] and less than Key[2]; and the subtree starting at Branch[3] contains only keys
greater than Key[2].
This generalizes in the obvious way to multiway search trees with other orders.
Note that ceil(x) is the so-called ceiling function. Its value is the smallest integer that is greater
than or equal to x. Thus ceil(3) = 3, ceil(3.35) = 4, ceil(1.98) = 2, ceil(5.01) = 6, ceil(7) = 7, etc.
A B-tree of order m is a multiway search tree of order m that satisfies these additional
conditions:
1. All leaves are on the bottom level.
2. All internal nodes (except perhaps the root node) have at least ceil(m / 2) children.
3. The root node may have as few as 2 children if it is an internal node, and no children at
all if the whole tree consists only of the root node.
4. Each leaf node (other than the root node if it is a leaf) must contain at least
ceil(m / 2) - 1 keys.
A B-tree is a fairly well-balanced tree by virtue of the fact that all leaf nodes must be at the
bottom. Condition (2) tries to keep the tree fairly bushy by insisting that each node have at least
half the maximum number of children. This causes the tree to "fan out" so that the path from root
to leaf is very short even in a tree that contains a lot of data.
Example B-Tree
The following is an example of a B-tree of order 5. This means that (other than the root node) all
internal nodes have at least ceil(5 / 2) = ceil(2.5) = 3 children (and hence at least 2 keys). Of
course, the maximum number of children that a node can have is 5 (so that 4 is the maximum
number of keys). According to condition 4, each leaf node must contain at least 2 keys. In
practice B-trees usually have orders a lot bigger than 5.
Operations on a B-Tree
Question: How would you search in the above tree to look up S? How about J? How would you
do a sort-of "in-order" traversal, that is, a traversal that would produce the letters in ascending
order? (One would only do such a traversal on rare occasion as it would require a large amount
of disk activity and thus be very slow!)
According to Kruse (see reference at the end of this file) the insertion algorithm proceeds as
follows: When inserting an item, first do a search for it in the B-tree. If the item is not already in
the B-tree, this unsuccessful search will end at a leaf. If there is room in this leaf, just insert the
new item here. Note that this may require that some existing keys be moved one to the right to
make room for the new item. If instead this leaf node is full so that there is no room to add the
new item, then the node must be "split" with about half of the keys going into a new node to the
right of this one. The median (middle) key is moved up into the parent node. (Of course, if that
node has no room, then it may have to be split as well.) Note that when adding to an internal
node, not only might we have to move some keys one position to the right, but the associated
pointers have to be moved right as well. If the root node is ever split, the median key moves up
into a new root node, thus causing the tree to increase in height by one.
Let's work our way through an example similar to that given by Kruse. Insert the following
letters into what is originally an empty B-tree of order 5: C N G A H E K Q M F W L T Z D P R
X Y S. Order 5 means that a node can have a maximum of 5 children and 4 keys. All nodes other
than the root must have a minimum of 2 keys. The first 4 letters get inserted into the same node,
resulting in this picture:
When we try to insert the H, we find no room in this node, so we split it into 2 nodes, moving the
median item G up into a new root node. Note that in practice we just leave the A and C in the
current node and place the H and N into a new node to the right of the old one.
Inserting M requires a split. Note that M happens to be the median key and so is moved up into
the parent node.
The letters F, W, L, and T are then added without needing any split.
When Z is added, the rightmost leaf must be split. The median item T is moved up into the
parent node. Note that by moving up the median key, the tree is kept fairly balanced, with 2 keys
in each of the resulting nodes.
The insertion of D causes the leftmost leaf to be split. D happens to be the median key and so is
the one moved up into the parent node. The letters P, R, X, and Y are then added without any
need of splitting:
Finally, when S is added, the node with N, P, Q, and R splits, sending the median Q up to the
parent. However, the parent node is full, so it splits, sending the median M up to form a new root
node. Note how the 3 pointers from the old parent node stay in the revised node that contains D
and G.
Deleting an Item
In the B-tree as we left it at the end of the last section, delete H. Of course, we first do a lookup
to find H. Since H is in a leaf and the leaf has more than the minimum number of keys, this is
easy. We move the K over where the H had been and the L over where the K had been. This
gives:
Next, delete the T. Since T is not in a leaf, we find its successor (the next item in ascending
order), which happens to be W, and move W up to replace the T. That way, what we really have
to do is to delete W from the leaf, which we already know how to do, since this leaf has extra
keys. In ALL cases we reduce deletion to a deletion in a leaf, by using this method.
Next, delete R. Although R is in a leaf, this leaf does not have an extra key; the deletion results
in a node with only one key, which is not acceptable for a B-tree of order 5. If the sibling node to
the immediate left or right has an extra key, we can then borrow a key from the parent and move
a key up from this sibling. In our specific case, the sibling to the right has an extra key. So, the
successor W of S (the last key in the node where the deletion occurred), is moved down from the
parent, and the X is moved up. (Of course, the S is moved over so that the W can be inserted in
its proper place.)
Finally, let's delete E. This one causes lots of problems. Although E is in a leaf, the leaf has no
extra keys, nor do the siblings to the immediate right or left. In such a case the leaf has to be
combined with one of these two siblings. This includes moving down the parent's key that was
between those of these two leaves. In our example, let's combine the leaf containing F with the
leaf containing A C. We also move down the D.
Of course, you immediately see that the parent node now contains only one key, G. This is not
acceptable. If this problem node had a sibling to its immediate left or right that had a spare key,
then we would again "borrow" a key. Suppose for the moment that the right sibling (the node
with Q X) had one more key in it somewhere to the right of Q. We would then move M down to
the node with too few keys and move the Q up where the M had been. However, the old left
subtree of Q would then have to become the right subtree of M. In other words, the N P node
would be attached via the pointer field to the right of M's new location. Since in our example we
have no way to borrow a key from a sibling, we must again combine with the sibling, and move
down the M from the parent. In this case, the tree shrinks in height by one.
Tree ADT
Definition
A tree T is a finite set of one or more nodes such that there is one designated node r called the
root of T, and the remaining nodes in (T – { r }) are partitioned into k ≥ 0 disjoint subsets T1, T2,
..., Tk, each of which is a tree, and whose roots r1, r2, ..., rk, respectively, are children of r.
Characteristics
A tree is an instance of a more general category called graph
A tree consists of nodes connected by edges
Typically there is one node in the top row of the tree, with lines connecting to more nodes on the
second row, even more on the third, and so on.
Path
If n1, n2, …, nk is a sequence of nodes in the tree such that ni is the parent of ni+1 for 1 ≤ i < k,
then this sequence is called a path from n1 to nk. The length of the path is k – 1.
Think of someone walking from node to node along edges that connect them. The
resulting sequence of nodes is called a path. The number of edges that were walked on is
the length of the path.
Parent
Any node (except the root) has exactly one edge running upward to another node. The
node above it is called the parent of the node.
If there is a path from node R to node M, then R is an ancestor of M.
Parents, grandparents, etc. are ancestors.
Child
Any node may have one or more lines running downward to other nodes. The nodes
below a given node are called its children.
If there is a path from node R to node M, then M is a descendant of R.
Children, grandchildren, etc. are descendants
Levels
The level of a particular node refers to how many generations the node is from the root. If the
root is assumed to be on level 0, then its children will be on level 1, its grandchildren will be on
level 2, and so on.
Subtree
Any node may be considered to be the root of a subtree, which consists of its children
and its children's children, and so on. If you think in terms of families, a node’s subtree
contains all its descendants.
Visiting
A node is visited when program control arrives at the node, usually for the purpose of
carrying out some operation on the node, such as checking the value of one of its data
fields, or displaying it. Merely passing over a node on the path from one node to another
is not considered to be visiting the node.
Traversing
To traverse a tree means to visit all of the nodes in some specified order.
o For example, you might visit all the nodes in order of ascending key value. There
are other ways to traverse a tree, as we’ll see later.
If a traversal visits all of the tree nodes, it is called an enumeration.
Keys
One data item in an object is usually designated a key value. This value is used to search
for the item or perform other operations on it.
Binary Tree
A binary tree is made up of a finite set of nodes that is either empty or consists of a node
called the root together with two binary trees, called the left and right subtrees, which are
disjoint from each other and from the root.
If every node in the tree can have at most two children, the tree is called binary tree.
[Figure: an example of a complete binary tree, and an example of a tree that
is not a complete binary tree.]
BST Property
• All elements stored in the left subtree of a node whose value is K have values less than K. All
elements stored in the right subtree of a node whose value is K have values greater than or
equal to K.
• That is, a node’s left child must have a key less than its parent, and a node’s right child must
have a key greater than or equal to its parent
Operations on BST ADT
Create a BST
Insert an element
Remove an element
Find an element
Clear (remove all elements)
Display all elements in a sorted order
class BST // Binary Search Tree ADT
{
public:
BST();
//Constructor
~BST();
//Destructor
void clear();
//Clear the tree
bool insert(const Elem val);
//Insert an element
bool remove(const Elem val);
//Remove an element
bool find(const Elem val) const;
//Find an element
void display() const;
//Display all elements in a sorted order
};
Pointer based implementation
class BinNode
{ public:
Elem element;       // Elem value for node
BinNode* left;      // Pointer to left child
BinNode* right;     // Pointer to right child
BinNode() { left = right = NULL; }  // Constructor
~BinNode ( ) { }    // Destructor
};
BST – Inorder traversal
• Visit the left subtree, then the node, then the right subtree.
Algorithm:
– If there is a left child visit it
– Visit the node itself
– If there is a right child visit it
BST - Implementation
BST – Clear the tree
Algorithm:
– If there is a left child visit it
– If there is a right child visit it
– Visit the node itself
BST - Implementation
void BST::clearhelp(BinNode* rt)
{ if (rt == NULL) return;
  clearhelp(rt->left);    // visit (clear) the left child
  clearhelp(rt->right);   // visit (clear) the right child
  delete rt;              // then delete the node itself
}
BST – find
Algorithm
• If value we are looking for < key of current node we have to check left subtree
• If value we are looking for > key of current node we have to check right subtree
BST - Implementation
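As a sketch of how the find algorithm above might look for a minimal pointer-based node (the BinNode shape shown here and the helper name findhelp are assumptions for illustration):

```cpp
struct BinNode {
    int element;
    BinNode *left = nullptr;
    BinNode *right = nullptr;
};

// Return true if val is stored somewhere in the subtree rooted at rt
bool findhelp(const BinNode *rt, int val)
{
    if (rt == nullptr) return false;      // empty subtree: not found
    if (val < rt->element)
        return findhelp(rt->left, val);   // smaller: check the left subtree
    if (val > rt->element)
        return findhelp(rt->right, val);  // larger: check the right subtree
    return true;                          // equal: found it
}
```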
BST – insert
BST – Insert Algorithm
• If value we want to insert < key of current node we have to go to the left subtree
• Otherwise we have to go to the right subtree
• If the current node is empty (not existing) create a node with the value we are inserting
and place it here
BST - Implementation
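Again as a sketch under the same assumed minimal BinNode shape, the insert algorithm above can be written as a recursive helper that returns the (possibly new) root of the subtree:

```cpp
struct BinNode {
    int element;
    BinNode *left = nullptr;
    BinNode *right = nullptr;
};

// Insert val into the subtree rooted at rt and return the new subtree root
BinNode *inserthelp(BinNode *rt, int val)
{
    if (rt == nullptr) {                         // empty position: place it here
        BinNode *n = new BinNode;
        n->element = val;
        return n;
    }
    if (val < rt->element)
        rt->left = inserthelp(rt->left, val);    // go to the left subtree
    else
        rt->right = inserthelp(rt->right, val);  // otherwise go right
    return rt;
}
```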
BST - delete
Similar to the insert function, after deletion of a node, the property of the BST must be
maintained.
Algorithm
• The property of the BST must be maintained
Steps:
BST – Implementation
// Remove val from the subtree rooted at rt; return the new subtree root
BinNode* BST::removehelp(BinNode* rt, const Elem val)
{ if (rt == NULL) return NULL;                 // val is not in the tree
  else if (val < rt->element) rt->left = removehelp(rt->left, val);
  else if (val > rt->element) rt->right = removehelp(rt->right, val);
  else // Found it: remove it
  { BinNode* temp = rt;
    if (rt->left == NULL) rt = rt->right;      // only a right subtree
    else if (rt->right == NULL) rt = rt->left; // only a left subtree
    else { temp = deletemin(rt->right);        // two children: take the
           rt->element = temp->element; }      // smallest of the right subtree
    delete temp;
  }
  return rt;
}

BinNode* BST::deletemin(BinNode*& rt)          // unlink and return the node
{ assert(rt != NULL);                          // holding the smallest value
  if (rt->left != NULL) return deletemin(rt->left);
  else { BinNode* temp = rt;                   // Found it
         rt = rt->right;
         return temp; }
}