You are on page 1of 202

Fundamentals of

Data Structure in C

Ellis Horowitz
University of southern California
Chapter 1. BASIC CONCEPT

Data structure + Algorithm = Programming

The tools and techniques to design and implement large-


scale computer systems:
Data abstraction
Algorithm specification
Performance analysis
Performance measurement

1.1 System life cycle (the system development process)

requirement the purpose of the project (input, output )

break the problem down into manageable pieces


analysis Bottom - up (unstructed)
Top - down (structed)

design data object + operations <---language independanc

abstract data type algorithm specification

coding choose concrete representation for data objects


and write algorithm for each operatios on them

verification developing correctness for programs


testing the program with variety of input data
removing errors
*correctness running time
1.2 Algorithm Specification

【definition】An algorithm is a finite set of instructions that, if followed, accomplishes a


particular task, In addition, all algorithms must satisfy the following criteria:
(1) Input. There are zero or more quantities that are externally applied.
(2) Output. At least one quantity is produced.
(3) Definiteness. Each instruction is clear and unambiguous.
(4) Finiteness. If we trace out the instructions of an algorithm. then for all
cases, the algorithm terminates after finite number of steps.
(5) Effectiveness. Every instruction must be basic enough to be carried
out, in principle, by a person using only pencil and paper, It is not
enough that each operation be definite as in (3); it also must be
feasible.

The distinguishes between an algorithm and a program:


Program----- does not have to satisfy(4) (operation system)
a programming language
algorithm----may was natural language、graphic、a programming
language

〖example1〗selection sort
algorithm:
for (i=0; i<n; i++) {
examine list[i] to list[n-1] and suppose that the
smallest integer is at list[min];
interchange list[i] and list[min];
}

Program:
for interchanging:
1. macro version
#define swap (x, y, t ) (( t ) = ( x ), ( x ) = ( y ), ( y ) = ( t ))
2. function version swap ( int *a, *b);
void swap ( int *x, int *y ) /* both parameters are pointers
to ints */
{ int temp = *x ; /* declares temp as an int and assigns
to it the contents of what x points to */
*x = *y; /* stores what y points to into the location
where x points*/
*y = temp; /*places the contents of temp in location
pointed to by y*/
}
#include<stdio.h>
#include<math.h>
#define MAX_SIZE 101
#define SWAP(x, y, t) ((t) = (x),(x) = (y), (y) = (t))
void sort ( int [ ], int ); /*selection sort*/
void main (void)
{
int i,n;
int list [MAX_SIZE];
printf ("Enter the number of numbers to generate; ");
scan ("%d", &n);
if ( n < 1 || n > MAX_SIZE) {
printf (stderr, "Improper value of n\n");
exit (1);
}
for (i = 0; i < n; i++) { /* randomly generate numbers*/
list [ i ] = rand ( ) % 1000;
printf ("%d ",list [ i ]);
}
sort ( list , n ) ;
printf ( " \n Sorted array: \n ");
for ( i = 0 ; i < n ; i++ ) /*print out sorted numbers */
printf ( "%d ", list [ i ] );
printf ( " \n " );
}
void sort ( int list [ ] ,int n )
{
int i, j, min, temp;
for ( i = 0; i < n-1; i++ ) {
min = i;
for ( j = i + 1; j < n; j++ )
if ( list [ j ] <list [ min ] )
min = j;
SWAP ( list [ i ] , list [ min ], temp );
}
}

〖example2〗Binary search
given: a sorted list list [ n ]
serach number searchnum
return: i if list [ i ] = serachnum
-1 not present in the list

Algorithm:
while ( there are more integers to check ) {
middle = (lift +right )/2;
if ( serchnum < list[middle] )
right = middle - 1;
else if ( serchnum == list[middle])
return middle;
else left = middle + 1;
}

left(0) middle right(n-1)

searchnum ~ list[middle]
< = >
right=middle -1 i lefit=middle +1

program:
for comparing:
1. macro version
#define compare(x,y) ((x) < (y)? -1:((x) == (y)) ? 0:1)

2. function version
int compare ( int x, int y );
{ /* compare x and y, return -1 for less than, 0 for equal, 1 for
greater*/
if ( x < y ) return -1;
else if ( x == y ) return 0;
else return 1;
}

int binsearch ( int list [ ], int searchnum , int left, int right )
{ /* search list [ 0 ] <= list [ 1 ] <= ··· <= list [ n-1 ] for searchnum.
Return its position if found . Otherwise return -1*/
int middle ;
while ( left <= right ) {
middle = ( left + right ) / 2;
switch ( COMPARE ( list [ middle ] , searchnum )) {
case -1 : left = middle + 1 ;
break;
case 0 : return middle;
case 1 : right = middle - 1;
}
}
return -1;
}

Recursive algorithm

【definition】define an object in terms of a simpler case of itself

FACTORIAL
n! = n·(n-1)! n>1 ( 0! = 1)
n! = n·(n-1)·(n-2) … 3·2·1
MULTIPLY
a·b = a·(b-1) + a (a·1 = a)
a·b = a·(b-1) + a = a·(b-2) +a +a = a·(b-3) + a + a +a …
=a+a+ … +a+a (b times addition)
FIBONACCI NUMBER
fib (n) = n n=0 or n=1
fib (n) = fib(n-2) + fib(n-1) otherwise

FUNCTION F1 F2 --- Fn

direct recursion indirect recursion

〖example 3〗Binary search


int binsearch ( int list [ ], int searchnum , int left, int right)
{ /* search list [ 0 ] <= list [ 1 ] <= ··· <= list [ n-1 ] for searchnum.
Return its position if found . Otherwise return -1*/
int middle ;
if ( left <= right ) {
middle = ( left + right ) / 2 ;
switch ( COMPARE ( list [ middle ] , searchnum )) {
case -1 : return binsearch ( list , searchnum, middle + 1, right );
case 0 : return middle;
case 1 : return binsearch ( list , searchnum, left , middle -1 ) ;
}
}
return -1;
}

〖example 4〗permutation
o a,b,c t -------> o (a,b,c),(a,c,b),(b,a,c),(b,c,a),(c,a,b),(c,b,a) t
o a,b,c,d t -------> a+ all permutation of (b,c,d)
b+ all permutation of (a,c,d) --> 4·6 permutations
c+ all permutation of (a,b,d)
d+ all permutation of (a,b,c)

void perm ( char *list , int i, int n )


{ /* generate all the permutations of list [ i ] to list [ n ] */
int j , temp;
if ( i == n ) {
for ( j = 0; j <= n; j++ )
printf ( "%c ", list [ j ] );
printf ( " " );
return;
}
else { /* list [i ] to list [n ] has more than one permutation,
generate these recursively*/
for ( j = i; j <= n; j++) {
swap( list [i ], list [ j ], temp);
perm( list, i + 1, n);
swap(list [ i ], list [j ], temp);
}
}
}

?Exercises

1.3 Data abstraction

data type ------ a collection of objects and a set operations that act on
those objects.

predefined data type :


int 、char、 float 、double (short、long 、unsigned、 pointer)
array 、struct 、union 、enum

Abstract Data Type (ADT)


a data type that is organized in such a way that the
specification on the objects and specification of the operations on
the objects separated from the representation of the objects and
the implementation on the operations
----------------------------------------------------------------------------------------------
structure natural_Number is
objects: an ordered subrange of the integers starting at
zero and ending at the maximum integer(INT_MAX) on the
computer
functions:
for all x,y ∈ Nat_Number,TRUE,FALSE ∈ Boolean
and where +,-,<,and== are the usual integer operations
Nat_No Zero( ) ::= 0
Boolean Is_Zero(X) ::= If(x) return FALSE
else return TRUE
Nat_No Add(x,y) ::= If((x+y)<=INT_MAX) return x+y
else return INT_MAX
Boolean Equal(x,y) ::= if(x==y) return TRUE
else return FALSE
Nat_No Successor(x) ::= if(x==INT_MAX) return x
else return x+1
Nat_No subtract(x,y)::= if(x<y) return 0
else return x-y
end Natural_Number
------------------------------------------------------------------------------------------------
operation specification
the name of every function
the type of its arguments
the type of its result
function categories:
(1) creator / constructor
(2) transformers
(3) observers / reporters

1.4 Performance analysis

criteria
(1) Does the program meet the original specifications of the task?
(2) Does it work correctly?
(3) Does the program contain documentation that shows how to use
it and how it work?
(4) Does the program effectively use functions to create logical units?
(5) Is program's code readable?

more concrete criteria (space, running time)


(6) Does the program effectively use primary and secondary storage?
(7) Is program's running time acceptable for the task?

performance analysis ( complexity theory )


performance measurement ( machine--dependence running times )

complexity (space, time)

【definition】the space complexity of a program is the amount of memory that it needs to run
to completion.The time complexity of program is the amount of computer time that is need to
run to completion.

(1) space complexity

Fixed space requirements (C):


instruction space (store the code)
simple variables
fixed_size structured variables
constants

Variable space requirements:


Sp( I ) , the space needed by structured variables whose size depends on the particular
instance, I , of the problem being solved. ( +recursive space)
I ---- number、size、value of the input and output.

S(P) = C + Sp( I )

 float abc(float a, float b, float c)


{
return a+b+b*c+(a+b-c)/(a+b)+4.00) Sabc( I )=0
}

 float sum(float list[ ] , int n)


{
float tempsum=0; ssum( I )= ssum( n )=n
int i; ssum( n )=0
for ( i=0; i < n; i++) ( C does not copy the array)
tempsum +=list[ i ];
return tempsum;
}

 float rsum(float list[ ], int n)


{
if (n) return rsum( list, n-1) + list[ n-1 ];
returm 0;
}

srsum( MAX_SIZE )= 6 * MAX_SIZE (n=MAX_SIZE )

Type Name Number of bytes


parameter: float list [ ] 2
parameter: integer n 2
return address:(used internally) 2 (unless a far address)
TOTAL per recursive call 6

(2) Time complexity

T(p) The sum of a program, p, compile time and run (or execution) time

Tp(n) = CaADD(n) + CsSUB(n) + ClLDA(n) + CSTSTA(n)

【definition 】A program step is a syntactically or semantically meaningful program segment


whose execution time is independent of the instance characteristics.

s/e ------ the steps count for each statement.


Frequency ------ the number of times that each statement is execution.

T(p) not dependent solely on the number of inputs or outputs or some other easily specified
characteristic.

The best case step count --- the minimum number of steps
The worst case step count --- the maximum number of steps
The average case step count --- the average number of steps

Iterative summing of a list of numbers


float sum(float list[ ] , int n)
{
float tempsum=0; count++; /* for assignment*/
int i;
for ( i=0; i < n; i++){
count ++; / * for the for loop */
tempsum +=list[ i ]; count ++; / * for assignment */
}
count ++; /* last execution of for*/
count ++; /* for return*/
return tempsum;
}

 float sum(float list[ ] , int n) /*simplified version*/


{
float tempsum=0;
int i;
for ( i=0; i < n; i++)
count += 2;
count += 3;
return 0;
}

count = 2n + 3

Statement s/e Frequency Total steps


float sum(float list[ ] , int n) 0 0 0
{ 0 0 0
float tempsum=0; 1 1 1
int i; 0 0 0
for ( i=0; i < n; i++) 1 n+1 n+1
tempsum +=list[ i ]; 1 n n
return tempsum; 1 1 1
} 0 0 0
Total 2n+3

 Recursive summing of a list of numbers

float rsum(float list[ ], int m)


{
count ++; /*for if conditional*/
if (n) {
count++; /* for return and rsum invocation */
return rsum( list, n-1) + list[ n-1 ];
}
count++;
returm list [ 0 ];
}

count = 2n + 2

Statement s/e Frequency Total steps


float rsum(float list[ ], int m) 0 0 0
{ 0 0 0
if (n) 1 n+1 n+1
return rsum( list, n-1) + list[ n- 1 n n
1 ]; 1 1 1
return list [ 0 ]; 0 0 0
}

Total 2n+2
Matrix addition

Void add(int a [ MAX_SIZE ], int b [ MAX_SIZE], int c [ MAX_SIZE ],


int rows, int cols)
{
int i , j;
for ( i =0; i < rows; i++)
for ( j = 0; j < cols; j++)
c[ i ][ j ] = a [i ][ j ] +b [i ][ j ];
}
 Void add(int a [ MAX_SIZE ], int b [ MAX_SIZE], int c [ MAX_SIZE ],
int rows, int cols)
{
int i , j;
for ( i =0; i < rows; i++) {
count++; /* for i for loop*/
for ( j = 0; j < cols; j++)
count++; /* for j for loop*/
c[ i ][ j ] = a [i ][ j ] +b [i ][ j ];
count++; /* for assignment statement*/
}
count++; /* last time of j for loop*/
}
count++; /* last time of i for loop*/
}
 Void add(int a [ MAX_SIZE ], int b [ MAX_SIZE], int c [ MAX_SIZE ],
int rows, int cols)
{
int i , j;
for ( i =0; i < rows; i++) {
for ( j = 0; j < cols; j++)
count +=2;
count += 2;
}
count +=2;
}
Count = 2*rows*cols + 2*rows +1
Statement s/ e Frequency Total steps
Void add(int a [ MAX_SIZE ] … 0 0 0
{ 0 0 0
int i , j; 0 0 0
for ( i =0; i < rows; i++) 1 rows+1 rows+1
for ( j = 0; j < cols; j++) 1 rows(cols+1) rows*cols+rows
c[ i ][ j ] = a [i ][ j ] +b [i ][ j ]; 1 rows*cols rows*cols
} 0 0 0

total 2*rows*cols+2rows+1

(3) Asymptotic Notation ( Ο、Ω、Θ )

Program1 with a complexity of c1 n 2 + c2n

Program2 with a complexity of c3n

Program2 is faster than program1 ?

 c1 =1, c2 =1, c3 = 100


c1 n 2 + c2n <= c3n ( n <= 98 )
c1 n 2 + c2n > c3n ( n > 98 )

Break even point


 c1 =1, c2 =1, c3 = 1000

c1 N 2 + c2n <= c3n ( n <= 998 )

《1》Ο
【definition】f(n)
= Ο(g(n)) iff there exist positive constants c
and n0 such that f(n) <= cg(n) for all n, n>=n0
3n + 2 = Ο(n) 3n + 2 <= 4n for n >=2 c = 4 , n0=2
10 N 2 +4n + 2=Ο( n ) 10 N 2 +4n + 2 <=11 N 2
2
for n >=5 c =11,
n0=5
6* 2 n + N 2 =O( 2 )
n
6 2 n + N 2 <=7* 2 n for n >=4 c =7 , n0=4
2
3n + 3=Ο( n )2
3n+3 <=3 N for n >=2 c =3 , n0=2
3n+2 ≠ Ο(1) c, n0 not exist
10 N +4n +2 ≠ Ο(n)
2
c, n0 not exist
2
????????logn????O?n?????O?nlogn???O? n ) O? n )
3
?? 2 ) n

【Theorem1.2】 If f(n) = am N m + ……+a1n +a0 then f(n)=Ο( n


m
)

《2》Ω

【definition】f(n)
= Ω(g(n)) iff there exist positive constants c
and n0 such that f(n) >= cg(n) for all n, n>=n0

3n + 2 = Ω(n) 3n + 2 >= 3n for n >=1 c = 3 , n0=1


10 N +4n + 2=Ω( n ) 10 N 2 +4n + 2 >=
2 2
N 2
for n >=1 c = 1, n0=1
6* 2 n + N 2 =Ω( 2 )
n
6 2 n + N 2 <= 2n for n >=1 c =1 , n0=1
3n + 3=Ω(1)
10 N 2 +4n +2 = Ω(n)
10 N 2 +4n +2 = Ω(1)
6* 2 n + N 2 =Ω( n )
2

f (n) ----------> several function g(n)

g(n) --- as large of function of n as possible for which the


statement f(n)=Ω(g(n)) is true

【Theorem1.3】 If f(n) = am N m + ……+a1n +a0 then f(n)=Ω( n )


m

《3》Θ
【definition】f(n) = Θ(g(n)) iff there exist positive constants c1, c2
and n0 such that c1 g(n) <= f(n) <= c2 g(n) for all n, n>=n0

3n + 2 = Θ(n) 3n <= 3n + 2 <= 4n c1 =3, c2 =4 ,


n0=2
10 N 2 +4n + 2=Θ( n )2

6* 2 n + N 2 =Θ( 2 )
n
【Theorem1.4】If f(n) = am N m + ……+a1n +a0 then f(n)=Θ( n
m
)
【example】Complexity of matrix addition

Statement Asymptotic complexity


Void add(int a [ MAX_SIZE ] … 0
{ 0
int i , j; 0
for ( i =0; i < rows; i++) Θ(rows)
for ( j = 0; j < cols; j++) Θ(rows.cols)
c[ i ][ j ] = a [i ][ j ] +b [i ][ j ]; Θ(rows.cols)
} 0

Total Θ(rows.cols)

【example】Binary search
n --- the number of elements in the list
worst case Θ(logn)

【example】Permutation
i=n Θ(n)
i<n Θ( n + Tperm (i + 1,n))

Tperm( i,n) = Θ( (n - i + 1)(n + Tperm (i + 1,n)))


Tperm(1,n) = Θ( n ( n! )) n >=1

【example】Magic square
row sum = column sum = diagonals sum

15 8 1 24 17
16 14 7 5 23
22 20 13 6 4
3 21 19 12 10
9 2 25 18 11
#include <stdio.h>
#define MAX_SIZE 15 /*maximum size of square*/
void main(void) /* construct a magic square, interactively*/
{
static int square[MAX_SIZE][MAX_SIZE];
int i, j, row, column; /*indices*/
int count; /* counter */
int size; /* square size */
printf ("Enter the size of the square : ");
scan ("%d", &size);
/* check for input errors */
if (size < 1 || size > MAX_SIZE + 1 ) {
printf( stderr, "Error! size is out of range\n"); /* Ο(1) ∗/
exit (1);
}
if (!(size % 2) ) {
printf(stderr, "Error! size is even\n");
exit(1);
}
for(i = 0; i < size; i++)

for ( j = 0; j < size; j++) /* Ο( n 2 )*/


square[i][j] = 0;
square[0][(size -1 )/2] = 1; /* middle of fist row*/
/* i and j are current position */
i = 0;
j = (size / 2);

for ( count = 2; count < =size * size; count ++) { /* Ο( n 2 )*/


row = ( i - 1 < 0) ? ( size -1 ) : (i -1 ); /*up */
column = ( j –1<0) ? (size -1 ) : ( j -1 ); /*left*/
if (square [row][column]) /* down*/
i = (++i) % size;
else { /* square is unoccupied*/
i = row;
j = ( j -1< 0 ) ? ( size -1) : - - j:
}
square[i][j] = count;
}
/* output the magic square */
printf (" Magic square of size %d : \n\n", size );
for ( i = 0; i < size; i ++) {
for ( j = 0 ; j , size; j ++)
printf( "%5d", square[i][j]);
printf("\n");
}
printf("\n\n");
}

Coxeter's rule:
1. put a one in the middle box of the top row
2. go up and left assigning number in the increasing order to
empty box
jump off the square
occupied

(4) Practical Complexity

instance characteristic n
time name 1 2 4 8 16
32
1 constant 1 1 1 1 1 1
l ogn logarithmic 0 1 2 3 4 5
n linear 1 2 4 8 16
nlogn log linear 32
N2 quadratic 0 2 8 24 64
cubic 160
3
n
1 4 16 64 256
1024
1 8 64 512 4096
32768
2n exponential 2 4 16 256 65536
n! factorial 4294967296
1 2 24 40326 2092278988000
26313* 10 33

1.5 Performance measurement


★Generate a suitable large number of random test data
★Measure average case times

method1: use clock to time events

clock ( ) ------ the internal processor time


CLK_TCK ------ the number of clock ticks per second

# include <time.h>
clock_t start, stop (call clock() )
double duration
duration = ((double) ( stop - star)) / CLK_TCK <second>

method2: use time

# include <time.h>
time_t start, stop (call time (null) )
double duration
duration = (double) ( difference (stop,star)) <second>

【example】The worst case performance of the selection function

#include<stdio.h>
#include <time.h>
#define MAX_SIZE 1601
#define INTERACTIONS 26
#define SWAP(x, y, t ) ((t) = (x), (x) = (y), (y) = (t)))
void main (void)
{
int i, j, position;
int list[MAX_SIZE];
int sizelist[ ] = {1,10,20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400,
500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400,
1500, 1600};
clock_t start, stop;
double duration;
printf (" n time\n");
for ( i =0; i < interactions; i++ ) {
for ( j =0; j <sizelist [ i ]; j ++ )
list [ j ] = sizelist [ i ] - j ;
star = clock ( );
sort ( list, sizelist [ i ]);
stop = clock ( );
/* CLK_TCK = number of clock ticks per second */
duration = ((double ) ( stop - start )) / CLK_TCK;
printf ("%6d %f \ n", sizelist[ i ], duration);
}
}

n time n time
30 …100 .00 900 1 .86
200 .11 1000 2.31
300 .22 1100 2.80
400 .38 1200 3.35
500 .60 1300 3.90
600 .82 1400 4.54
700 1.15 1500 5.22
800 1.48 1600 5.93
7
6
5
4
3
2
1
0 500 1000 1500 2000

【example】Worst case performance for sequential search

sequential search function:

int seqsearch (int list[ ], int searchnum, int n)


{ /*search an array, list, that has n number, return i , if list [ i ] = searchnum.
return -1, if searchnum is not in the list * /
int i;
list [ n ] = searchnum;
for ( i = 0; list[ i ] != searchnum; i ++)
;
return ((i < n) i : -1);
}

n Iterations Ticks Total time(sec) duration


0 30000 16 0.879121
0.000029
10 12000 16 0.879121
0.000073
20 6000 14 0.769231
0.000128
30 5000 16 0.879121
0.000176
40 4000 16 0.879121
0.000220
50 4000 20 1.098901
0.000275
60 4000 23 1.263736
0.000316
70 3000 20 1.098901
0.000421
80 3000 23 1.263736
0.000421
90 2000 17 0.934066
0.000467
100 2000 18 0.989011
0.000495
200 1000 18 0.989011
0.000989
400 500 18 0.989011
0.001978
600 500 27 1.483516
0.002967
800 500 35 1.923077
0.003846
1000 300 27 1.483516
0.004945

#include<stdio.h>
#include <time.h>
#define MAX_SIZE 1001
#define INTERACTIONS 16
int segsearch( int [ ], int, int);
void main (void)
{
int i, j, position;
int list[MAX_SIZE];
int sizelist[ ] = { 0, 10,20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 400,
600, 800, 1000};
int numtimes [ ] = {30000, 12000, 6000, 5000, 4000,4000,4000, 3000,
3000, 2000, 2000,1000, 500, 500, 500, 200};
clock_t start, stop;
double duration, total;
for ( i =0; i < MAX_SIZE; i++ )
list [ i ] = i ;
for ( i =0; i < interactions; i++ )
star = clock ( );
for ( j =0; j < numtimes; j++ ) {
position = segsearch ( list, -1, sizelist[ i ] );
stop = clock ( );
/* CLK_TCK = number of clock ticks per second */
total = ((double ) ( stop - start )) / CLK_TCK;
printf ("%5d %d %d %f %f \ n", sizelist[ i ], numtimes [ i ],
int (stop -start), total, duration);
list [ sizelist[ i ]] = sizelist[ i ]; /*reset value */
}
}
CHAPTER 4 LISTS
4.1 Pointers

& the address operator


* the deferencing ( or indirection ) operator
int i, *pi ;
pi = & i; i = 10; *pi = 10;

other operations:

assign a pointer to a variable of type pointer


arithmetic operations ( + - * / ) on a pointer

a high degree of flexibility and efficiency

wise programming:

1) set all pointers to NULL when they are not actually pointing
to an object.
the null pointer is represented by the integer 0
2) use explicit type casts when converting between pointer
types
pi = malloc( sizeof (int)); /* assign to pi a pointer to int */
pf = (float * ) pi; /* cast an int pointer
to a float pointer */
3) define explicit return type for function
4) using dynamically allocated storage (malloc, free)

int i, *pi;
float f, *pf;
pi = (int *) malloc(sizeof(int));
pf = (float *) malloc(sizeof(float));
*pi = 1024;
*pf = 3.14;
printf ("an integer = %d, a float = %f \n", *pi, *pf );
free(pi);
free(pf);
4.2 Singly Linked Lists

Linked list ~ an ordered sequence of nodes with links


represented as arrows

ptr
bat cat sat vat null

Insert mat after cat :

ptr
bat cat sat vat null

mat

Delete mat from list :

ptr
bat cat mat sat vat null

the capabilities to make linked representation:

define a node's structure ----- self-referential


structures
create a new node --------- malloc function
remove nodes --------- free function

62
〖example〗list of words ending in at
define a node structure --
typedef struct list_node *list_pointer;
typedef struct list_node
{
char data [ 4 ];
list_pointer link;
};
list_pointer ptr = NULL;

create a new list ----- ( #define IS_EMPTY(ptr) ( ! ( ptr )) )

create new nodes ---- ptr = (list_pointer) malloc ( sizeof


(list_node ));

place a word into the list ----- strcpy ( ptr ->data, "bat");
ptr -> link = null;

address of
first node ptr->data ptr->link
b a t \0
ptr
〖example〗Two-node linked list
typedef struct list_node *list_pointer;
typedef struct list_node {
char data ;
list_pointer link;
};

list_pointer ptr = NULL;

list_pointer create2()
{
/* create a linked list with two nodes */
list_pointer first, second;
first = (list_pointer)malloc(sizeof(list_node));
second = (list_pointer)malloc(siezof(list_node));

63
second->link = NULL;
second->data = 20;
first->data = 10;
first->link = second;
return first;
}
ptr
10 20 null

〖example〗List insertion

#define IS_FULL (ptr ) ( ! (ptr ))


void insert(list_pointer *ptr, list_pointer node)
{
/* insert a new node with data = 50 into the list ptr after node */
list_pointer temp;
temp = (list_pointer)malloc(sizeof(list_node));
if ( IS_FULL(temp) ) {
fprintf(stderr, "The memory is full\n');
exit(1);
}
temp->data = 50;
if ( *ptr ) {
temp->link = node->link;
node->link = temp;
}
else {
temp->link = NULL;
*ptr = temp;
}
}

Two node list after the function call insert (&ptr, ptr)
ptr
10 20 null
node
temp 50

64
〖example〗list deletion

List after the function call delete(&ptr, NULL, ptr)


ptr trail = NULL ptr
10 50 20 null 50 20 null
node
before deletion after deletion

List after the function call delete(&ptr, ptr, prt->link)


ptr node ptr
10 50 20 null 10 20 null
trail
before deletion after deletion

void delete(list_pointer *ptr, list_pointer trail, list_pointer node)


{ /* delete node from the list, trail is the preceding node ptr
is the
head of the list */
if ( trail )
trail->link = node->link;
else
*ptr = (*ptr)->link;
free(node);
}

〖example〗Printing out a list

void print_list(list_pointer ptr)


{
printf("The list contains: ");
for ( ; ptr; ptr = ptr->link )
printf("%4d,ptr->data);
printf("\n");
}

65
4.3 Dynamically Linked Stacks and Queues

Linked Stacks

top element link


.... null

#define MAX_STACKS 10 /*maximum number od stacks */


typedef struct {
int key :
/* other fields */
} element;
typedef struct stack *stack_pointer;
typedef struct stack {
element item;
stack_pointer link:
};
stack_pointer top [ MAX_STACKS ];

the initial condition for the stacks:

top [ i ] = NULL, 0 <= i < MAX_STACKS

the boundary condition for the stacks:


top [ i ] = NULL, iff the i th stack is empty
IS_FULL ( temp) iff the memory is full

Program : Add to a linked stack


void add(stack_pointer *top, element item)
{
/* add an element to the top of the stack */

66
stack_pointer temp = (stack_pointer)malloc(sizeof(stack));
if ( IS_FULL(temp) ) {
fprintf( stderr, "The memory is full\n");
exit(1);
}
temp->item = item;
temp->link = *top;
*top = temp;
}

Program : Delete from a linked stack

element delete(stack_pointer *top) {


/* delete an element from the stack */
stack_pointer temp = *top;
element item;
if ( IS_EMPTY(temp) ) {
fprintf(stderr, "The stack is empty\n");
exit(1);
}
item = temp->item;
*top = temp->link;
free(temp);
return item;
}

Linked Queues

front element link rear


.... null

#define MAX_QUEUES 10 /*maximum number od queues */


typedef struct queue *queue_pointer;

67
typedef struct queue {
element item;
queue_pointer link:
};
queue_pointer front [ MAX_STACKS ]; rear [ MAX_STACKS ];

the initial condition for the queues:

front [ i ] = NULL, 0 <= i < MAX_QUEUES

the boudary condition for the queues:

front [ i ] = NULL, iff the i th queue is


empty
IS_FULL ( temp) iff the memory is full

Program: Add to the raer of a linked queue

void addq(queue_pointer *front, queue_pointer *rear, element


item)
{ /* add an element to the rear of the queue */
queue_pointer temp =
(queue_pointer)malloc(sizeof(queue));
if ( IS_FULL(temp) ) {
fprintf(stderr, "The memory is full\n");
exit(1);
}
temp->item = item;
temp->link = NULL;
if ( *front ) (*rear)->link = temp;
else *front = temp;
*rear = temp;
}

Program: Delete from the front of a linked queue

element deleteq(queue_pointer *front)

68
{
/* delete an element from the queue */
queue_pointer temp = *front;
element item;
if ( IS_EMPTY(*front) ) {
fprintf(stderr, "The queue is empty\n");
exit(1);
}
item = temp->item;
*front = temp->link;
free(temp);
return item;
}
4.4 Polynomials
A( x ) = am−1 X em−1 +..... + a0 X e0
ai are nozero coefficients
ei are nonnegative integer exponents such that
em −1 > em − 2 > . . . > e1 > e0 ≥ 0

typedef struct poly_node *poly_node;


typedef struct poly_node {
int coef;
int expon;
poly_pointer link:
};
poly_pointer a, b, d;

coef expon link


poly_nodes:
a = 3 X 14 + 2 X 8 + 1
b = 8 X 14 − 3 X 10 + 10 X 6
a 3 14 2 8 1 0 null

b 8 14 -3 10 10 6 null

69
(1) Adding Polynomials (d=a+b)
a->expon == b->expon

3 14 2 8 1 0 null
a
8 14 -3 10 10 6 null
b
11 14 null
d
a->expon < b->expon
3 14 2 8 1 0 null
a
8 14 -3 10 10 6 null
b
11 14 -3 10 null
d
a->expon > b->expon
3 14 2 8 1 0 null
a
8 14 -3 10 10 6 null
b
11 14 -3 10 2 8 null
d

Program: Add two polynomials

poly_pointer padd(poly_pointer a, poly_pointer b)


{ /* return a polynomial which is the sum of a and b */
poly_pointer front, rear, temp;
int sum;
rear = (poly_pointer)malloc(sizeof(poly_node));
if ( IS_FULL(rear) ) {
fprintf(stderr, "The memory is full\n");
exit(1);
}
front = rear;
while ( a && b )
switch ( COMPARE(a->expon, b->expon) ) {
case -1: /* a->expon < b->expon */
attach(b->coef, b->expon,&rear);

70
b = b->link;
break;
case 0: /* a->expon = b->expon */
sum = a->coef + b->coef;
if ( sum ) attach(sum, a->expon, &rear);
a = a->link; b = b->link; break;
case 1: /* a->expon > b->expon */
attach(a->coef, a->expon, &rear);
a = a->link;
}
/* copy rest of list a and then list b */
for ( ; a; a = a->link ) attach(a->coef, a->expon, &rear);
for ( ; b; b = b->link ) attach(b->coef, b->expon, &rear);
rear->link = NULL;
/* delete extra initial node */
temp = front; front = front->link; free(temp);
return front;
}

Program: Attach a node to the end of a list

void attach(float coefficient, int exponent, poly_pointer *ptr)


{ /* create a new node with coef = coefficient and expon =
exponent, attach it to the node pointer to by ptr. ptr is
updated to pointer to this new node */
poly_pointer temp;
temp = (poly_pointer)malloc(sizeof(poly_node));
if ( IS_FULL(temp) ) {
fprintf(stderr, 'The memory is full\n");
exit(1);
}
temp->coef = coefficient;
temp->expon = exponent;
(*ptr)->link = temp;

71
*ptr = temp;
}
Time complexity  (m + n)
A( x ) = am −1 X e + ..... + a0 X e
m −1 0

B ( x ) = bn−1 X f + ..... + b0 X f
m −1 0

1.coefficient additions
0 <= number of coefficient additions <= min [m, n ]
mix -- none of the exponent are equal
max -- exponents of polynomial are a subset
of the exponents of the other
2.exponent comparisons <= m+n
3.creation of new nodes for d <= m+n

(2)Erasing Polynomials

void erase(poly_pointer *ptr)


{ /* erase the polynomial pointed to by ptr */
poly_pointer temp;
while ( *ptr ) {
temp = *ptr; *ptr = (*ptr)->link;
free(temp);
}
}

(3)Representing Polynomials As Circlarly Linked Lists

circular list ----- the link field of the last node points to
the first node in the list

ptr = 3 X 14 + 2 X 8 + 1

3 14 2 8 1 0 null
ptr

avail ----a variable of type poly_pointer that points to the


first node in the list of freed nodes.

72
Program: get_node function

poly_pointer get_node(void)
/* provide a node for use */
{
poly_pointer node;
if ( avail ) {
node = avail;
avail = avail->link;
}
else {
node = (poly_pointer)malloc(sizeof(poly_node));
if ( IS_FULL(node) ) {
fprintf(stderr, "The memory is full\n");
exit(1);
}
}
return node;
}

Program: ret_node function

void ret_node(poly_pointer ptr)


{ /* return a node to the available list */
ptr->link = avail;
avail = ptr;
}

Program: Erasing a circular list

void crease(poly_pointer *ptr)


{ /* erase the circular list ptr */
poly_pointer temp;
if ( *ptr ) {

73
temp = (*ptr)->link;
(*ptr)->link = avail;
avail = temp;
*ptr = NULL;
}
}

avail

...
ptr

temp

...
avail

The circular list with head node representation

ptr 3 14 2 8 1 0

Program: Adding circular represented polynomials

poly_pointer cpadd(poly_pointer a, poly_pointer b)


{
/* polynomials a and b are singly linked circular lists with a
head node. Return a polynomial which is the sum of a and b */
poly_pointer starta, d, lastd;
int sum, done = FALSE;
starta = a; /* record start of a */
a = a->link; /* skip head node for a and b */
b = b->link;
d = get_node(); /* get a head node for sum */
d->expon = -1; lasted = d;
do {

74
switch ( COMPARE(a->expon, b->expon) ) {
case -1: /* a->expon , b->expon */
attach(b->coef, b->expon, &lasted);
b = b->link;
break;
case 0: /* a->expon = b->expon */
if ( starta == a ) done = TRUE;
else {
sum = a->coef + b->coef;
if ( sum ) attach(sum, a->expon, &lastd);
a = a->link; b = b->link;
}
break;
case 1: /* a->expon > b->expon */
attach(a->coef, a->expon, &lastd);
a = a->link;
}
} while ( !done );
lastd->link = d;
return d;
}

4.5 Additional List Operations

1. operations for chains

a) Inverting a singly linked list

list_pointer invert(list_pointer lead)


{
/* invert the list pointed to by lead */
list_pointer middle, trail;
middle = NULL;
while ( lead ) {
trail = middle;
middle = lead;

75
lead = lead->link;
middle->link = trail;
}
return middle;
}

b) Concatenating singly linked lists

list_pointer cncatenate(list_pointer ptr1, list_pointer ptr2)


{
/* produce a new list hat contains the list ptr1 followed by the
list ptr2. The list pointed to by ptr1 is changed permanently */
list_pointer temp;
if ( IS_EMPTY(ptr1) ) return ptr2;
else {
if ( !IS_EMPTY(ptr2) ) {
for ( temp = ptr1; temp->link; temp = temp->link )
;
temp->link = ptr2;
}
return ptr1;
}
}
2. operations for circularly linked lists

x1 x2 x3
a

x1 x2 x3
a
a) Inserting at the front of a list

void insert_front(list_pointer *ptr, list_pointer node)


/* insert node at the front of the circular list ptr, where ptr is the
last node in the list */
{ if ( IS_EMPTY(*ptr) ) {
/* list is empty, change ptr to point to new entry */
*ptr = node;

76
node->link = node;
}
else { /* list is not empty, add new entry at front */
node->link = (*ptr)->link;
(*ptr)->link = node;
}
}
b) Finding the length of a circular list

int length(list_pointer ptr)


{ /* find the length of the circular list ptr */
list_pointer temp;
int count = 0;
if ( ptr ) {
temp = ptr;
do {
count ++;
temp = temp->link;
} while ( temp != ptr );
}
return count;
}

4.6 Equivalence relations

【definition】A relation, ≡ ,over a set, S, is said to be an


equivalence relation over S iff it is symmetric, reflexive, and
transitive over S.

〖example〗 the "equal to " relationship is an equivalence


relation
1) x = x
2) x = y implies y = x
3) x = y and y = z implies x = z

〖example〗

77
0 ≡ 4, 3 ≡ 1 ,6 ≡ 10, 8 ≡ 9, 7 ≡ 4, 6 ≡ 8, 3 ≡ 5, 2
≡ 11, 11 ≡ 0

equivalence classes:
{0, 2,4, 7, 11}; {1, 3, 5 }; {6, 8, 9, 10 }

Equivalence algorithm

void equivalence()
{
initialize;
while ( there are more pairs ) {
read the next pair <i,j>;
process this pair;
}
initialize the output;
do
output a new equivalence class;
while ( not done );
}

A more detailed version of the equivalence algorithm

void equivalence()
{ initialize seq to NULL and out to TRUE;
while ( there are more pairs ) {
read the next pair, <i,j>;
put j on the seq[i] list;
put i on the seq[j] list;
}
for ( i =0; i<n; i++)
if ( out[i] ) {
out[i] = FALSE;

78
output this equivalence class;
}
}

[0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11]
seq

data 11 3 11 5 7 3 8 4 6 8 6 0
null null null null null null
link

data 4 1 0 10 9 2
link null null null null null null

Program: find equivalence classes O(m+ n)

m ~ the number of related pairs input


n ~ the number of objects

#include <stdio.h>
#include <alloc.h>
#define MAX_SIZE 24
#define IS_FULL(ptr) (!(ptr))
#define FALSE 0
#define TRUE 1

typedef struct node *node_opinter;


typedef sreuct node {
int data;
node_pointer link;
};
void main(void)
{ short int out[MAX_SIZE];
node_pointer seq[MAX_SIZE];
node_pointer x,y,top;
int i,j,n;
printf("Enter the size (<= %d) ",MAX_SIZE);
scanf("%d",&n);

79
for ( i = 0; i<n; i++ ) { /* initialize seq and out */
out[i] = TRUE; seq[i] = NULL;
}

/* Phase 1: Input the equivalence pairs: */


printf("Enter a pair of numbers ( -1 -1 to quit) : ");
scanf("%d%d",&i,&j);
while ( i>=0 ) {
x = (node_pointer)malloc(sizeof(node));
if ( IS_FULL(x) ) {
fprintf(stderr,"The memory is full\n");
exit(1);
}
x->data = j; x->link = seq[i]; seq[i] = x;
x = (node_pointer)malloc(sizeof(node));
if ( IS_FULL(x) ) {
printf(stderr, "The memory is full\n");
exit(1);
}
x->data = i; x->link = seq[j]; seq[j] = x;
printf("Enter a pair of number (-1 -1 to quit): ");
scanf("%d%d",&i,&j);
}

/* Phase 2: output the equivalence classes */


for ( i = 0; i<n; ++i )
if ( out [i] ) {
printf("\nNew class: %5d",i);
out[i] = FALSE; /* set class to false */
x = seq[i]; top = NULL; /* initialize stack */
for ( ; ; ) { /* find rest of class */
while ( x ) { /* process list */
j = x->data;
if ( out[j] ) {
printf("%5d",j); out[j] = FALSE;

80
y = x->link; x->link = top; top = x; x = y;
}
else x = x->link;
}
if ( !top ) break;
/* unstack */
x = seq[top->data]; top = top->link;
}
}
}

4.7 sparse matrixes

Node structure for sparse matrixes

down head right down head row col head entry i j


next value aij

head node entry node set up for aij


L
M
0 0 11 0 O
P
M
M
12 0 0 0 P
P
0 −4 0 0
M
N0 0 0 −15
P
Q

a h0 h1 h2 h3
4 4

h0 0 2
11
1 0
h1
12

h2
2 1
-4
3 3
h3 -15

81
#define MAX_SIZE 50 /* size of largest matrix */
typedef enum (head, entry ) tagfield;
typedef struct matrix_node *matrix_pointer;
typedef struct entry_node {
int row;
int col;
int value;
};
typedef struct matrix_node {
matrix_pointer down;
matrix_pointer right;
tagfield tag;
union {
matrix_pointer next;
entry_node entry;
} u;
};
matrix_pointer hdnode [ MAX_SIZE ];

Sample input :
first line num_rows num_cols
num_terms
other lines row column
value
[0] [1]
[2]
[0] 4 4
[1] 4
[2] 0 2
[3] 11
[4] 1 0
12
2 1

82
-1
3 3
-15

Program : Read in a sparse matrix

matrix_pointer mread(void)
{ /* read in a matrix and set up its linked representation. An
auxiliary global array hdnode is used */
int num_rows, num_cols, num_terms, num_heads, i;
int col, col, value, last, node;

printf("Enter the number of rows, columns and number of


nonzero terms: ");
scanf("%d%d%d',&num_rows,&num_cols,&num_terms);
num_heads = (num_cols > num_rows) ? num_cols :
num_rows;
/* set up head node for the list of head nodes */
node = new_node(); node->tag = entry;
node->u.entry.row = num_rows;
node->u.entry.col = num_cols;

if ( !num_heads ) node->right = node;


else { /* initialize the head nods */
for ( i = 0; i<num_heads; i++ ) {
temp = new_node;
hdnode[i] = temp; hdnode[i]->tag = head;
hdnode[i]->right = temp; hdnode[i]->u.next = temp;
}

current_row = 0;
last = hdnode[0]; /* last node in current row */
for ( i = 0; i < num_terms; i++ ) {
printf("Enter row, column and value: ");
scanf("%d%d%d",&row,&col,&value);
if ( row > current_row ) {

83
/* close current row */
last->right = hdnode[current_row];
current_row = row; last = hdnode[row];
}
temp = new_node();
temp->tag = entry; temp->u.entry.row = row;
temp->u.entry.col = col;
temp->u.entry.value = value;
temp->right = temp; /* link into row list */
last = temp;
/* link into column list */
hdnode[col]->u.next->down = temp;
hdnode[col]->u.next = temp;
}
/* close last row */
last->right = hdnode[current_row];
/* close all column lists */
for ( i=0; i<num_cols; i++ )
hdnode[i]->u.next->down = hdnode[i];
/* link all head nodes together */
for ( i=0; i<num_heads-1; i++ )
hdnode[i]->u.next = hdnode[i+1];
hdnode[num_heads-1]->u.next = node;
node->right = hdnode[0];
}
return node;
}

Program: Get a new maxtrix node

matrix_pointer new_node(void)
{ matrix_pointer temp;
temp = (matrix_pointer)malloc(sizeof(matrix_node));
if ( IS_FULL(temp) ) {

84
fprintf(stderr, "The memory is full\n');
exit(1);
}
return temp;
}
time complexity
O(max{num_rows, num_cols } + num_terms )
= O(num_row + num_cols + num_terms )

Program: Write out a sparse matrix

void mwrite(matrix_pointer node)


{
/* print out the matrix in row major form */
int i;
matrix_pointer temp, head = node->right;
/* matrix dimensions */
printf(" \n num_rows = %d, num_cols = %d \n",
node->u.entry.row, node-
>u.entry.col);
printf(" The matrix by row, column, and value: \n\n");
for ( i = 0; i<node->u.entry.row; i++ ) {
/* print out the entries in each row */
for ( temp = head->right; temp != head;

temp = temp->right)
printf("%5d%5d%5d \n",temp->u.entry.value);
head = head->u.next; /* next row */
}
}
time complexity
O(num_row + num_terms )

Program : Erase a sparse matrix

void merase(matrix_pointer *node)


{

85
/* erase the matrix, return the nodes to the heap */
matrix_pointer x,y,head = (*node)->right;
int i,num_heads;
/* free the entry and head nodes bu row */
for ( i = 0; i< (*node)->u.entry.row; i++ ) {
y = head->right;
while ( y != head ) {
x = y; y = y->right; free(x);
}
x = head; head = head->u.next; free(x);
}
/* free remaining head nodes */
y = head;
while ( y != *node ) {
x = y; y = y->u.next; free(x);
}
free(*node); *node = NULL;
}

time complexity
O(num_row + num_cols + num_terms )

4.8 Doubly Linked lists

typedef struct node *node_pointer;


typrdef struct node {
node_pointer llink;
element item;
node_pointer rlink;
};

86
llink item rlink
head node

ptr = ptr -> llink -> rlink = ptr -> rlink ->
llink

empty list : ptr

Program : Insertion into a doubly linked circular list

void dinsert(node_pointer node, node_pointer newnode)


{ /* insert newnode to the right of node */
newnode->llink = node;
newnode->rlink = node->rlink;
node->rlink->llink = newnode;
node->rlink = newnode;
}
node node

new node

87
Program: Deletion from a doubly linked circular list

void ddelete(node_pointer node, node_pointer deleted)


{ /* delete from the doubly linked list */
if ( node == deleted )
printf("Deletion of head node not permitted.\n");
else {
deleted->llink->rlink = deleted->rlink;
deleted->rlink->llink = deleted->llink;
free(deleted);
}
}

node node

、 deleted

88
Chapter 5 Trees
5.1 Introduction

Pedigree Dusty

Honey Bear Brandy

Brunhilde Terry Coyote Nugget

Gill Tansey Tweed Zoe Crocus Primrose Nous Belle

Lineal Proto ind0-European

Italic Hellenic Germanic

Osco-Umbrian Latin Greek North West


Germanic Germanic
Osco
Umbrian Spanish Italian Icolandic Swedish Low high
Franch Norwegian
Yiddish
【definition】A tree is a finite set of one or more nodes
such that
(1) There is a specially designated node called the
root
(2) The remaining nodes are partitioned into n >= 0
disjoint set
T 1, ... , T n , where each of these sets is a tree.
We call
T 1, ... , T n the subtrees of the root
Level
A 1

B
C D 2

E F G H I J 3

K L M 4
degree of a node --- the number of subtree of the node
degree of a tree --- the maximum degree of the nodes
in the tree
leaf (terminal node ) --- a node with degree zero
parent children sibling ancestors descendants
level high or depth

Representation of Trees

List Representation

(A ( B ( E ( K, L ) , F ), C ( G ), D ( H ( M ), I, L ) ) )

A
B C D
E F G H I L
K L M

In memory

data link 1 link 2 ... link n

Left Child -Right Sibling Representation

data

left child right child


A

B C D

E F G H I J

K L M

Representation As A Degree two Tree

C
E

G D
K F
H
L
M I

J
A A A

B B B

tree Left child-Right sibling Tree Binary tree

A
A A

B
B C B C
C

tree Left child-Right sibling Tree Binary tree

5.2 Binary Tree

【definition】A binary tree is a finite set of nodes that


is either empty or consists of a root and two disjoint
binary trees called the left subtree and the right
subtree.

note: a binary tree is really a different object than a


tree

1) The Abstract Data Type

structure Binary_Tree (abbreviated BinTree) is


objects: a finite set of nodes either empty or
consisting of a root node, left
Binary_Tree, and right
Binary_Tree .
funtion: for all bt, bt1, bt2 ∈ BinTree, item ∈
element
BinTree Creat( ) ::= creats an
empty binary tree
Boolean Isempty( bt ) ::= if ( bt ==
empty binary
tree)
return TRUE
else return
FALSE
BinTreeMakeBT(bt1,item,bt2)

::= return a
binary tree
whose left
subtree is bt1,
whose right
subtree is bt2,
and whose root
node contains
the data item.
BinTree Lchild (bt ) ::= if
( Isempty( bt ))
return error
else return
the left
subtree of bt.
element Data(bt) ::= if
(Isempty( bt ))
return error
else return
the data in
theroot node
of bt.
BinTree Rchild(bt) ::=
if( Isempty( bt
)) return
error
else return
the right
subtreeof bt.

TWO DIFFERENT BINARY TREES


A A

B B

TWO SPECIAL BINARY TREES

A
A
B
B C

C
D E F G
D
H I
Skewed binary tree Complete binary tree
E

2) The property of Binary Trees

〖Lemma5.1〗Maximum number Of nodes:


(1) The Maximum number of nodes on level i of
a binary
tree is 2
i −1
,i ≥1
(2) The Maximum number of nodes in a binary
tree of
depth k is 2 − 1, k ≥ 1
k

Proof:
(1) The proof is by induction on i
induction base: i = 1(root)
max_num ( on level 1) = 2 = 2 = 1
i −1 0

induction hypotheses: For all j, 1<= j < i,


j −1
max_num ( on level j ) = 2
induction step:
Since each node in a binary tree has a maximum
degree of 2,
max_num (on level i) = max_num(on level i -
1)*2= 2 * 2 = 2
i −2 i −1

(2) The maximum number of nodes in a binary tree of


k

depth k is: i =1 (maximum number of nodes on level
k
∑2
i −1
= 2 k −1
i) = i =1

〖Lemma5.2〗Relation between number of leaf


nodes and nodes of degree 2:
For any noempty binary tree, T, if n0 is the
number of leaf nodes and n2 the number of nodes of
degree 2, then
n0 = n2 + 1
Proof: n = n0 + n1 + n2
B = n1 + 2 n2 ( B is the number of
branches, n = B + 1)
n = 1 + n1 + 2 n2
-----> n0 = n2 + 1
【definition】A full binary tree of depth k is a binary
tree of depth having 2 − 1 nodes , k >= 0
k
1

2 3

4 5 6 7

8 9 10 11 12 13 14 15

【definition】A binary tree with n nodes and depth k


is complete iff its nodes correspond to the nodes
numbered from 1 to n in the full binary tree of depth
k.

(3) Binary Tree Representation


Array representation

〖Lemma5.3〗If a complete binary tree with n nodes


( depth = [ log 2 n + 1 ] ) is represented sequentially,
then for any node with index i , 1<= i <= n, we have:
(1) parent ( i ) is at [ i /2 ] if i ≠ 1 If i = 1, i is at
the root and
has no parent
(2) left_child( i ) is at 2i if 2i <= n, If 2i > n, then i
has no left
child.
(3) right_child ( i ) is at 2i + 1 if 2i + 1 > n , then i
has no right
child.

A B -- C --
- -- -- D -- ... E
[1] [2] [3] [4] [5] [6] [7] [8] [9] ... [16]

A B C D E F G H I
[1] [2] [3] [4] [5] [6] [7] [8] [9]
Linked representation
typedef struct node *tree_pointer;
typedef struct node {
int data;
tree_pointer left_child,
right_child;
};

left_child data data


right_child

left_child right_child

ROOT ROOT
A NULL A

B NULL B C

C NULL D NULL E NULL NULL F NULL NULL G NULL

D NULL NULL H NULL NULL I NULL

NULL E NULL

5.3 Binary Tree Traversal

L ------ moving left


V ------ visiting the node ( for example, printing out the
data field )
R ------ moving right

the possible combination of traversal:


LVR LRV VLR VRL RVL
RLV
inorder postorder preorder
1 +

2 17 E
*
3 19
* 14 D 18

4 / 11 C 16
15
5 13
A 8 B 12
10
6 7 9
(1) Recursive Traversal
Inorder Traversal
A/B*C*D+E
void inorder( tree_pointer ptr)
/* inorder tree traversal */
{
if (pr) {
inorder (ptr->left_child);
printf ("%d",ptr->data);
inorder (ptr->right_child);
}
}
Preorder Traversal
+**/AB C D E
void preorder(tree_pointer ptr)
/* preorder tree traversal */
{
if (ptr){
printf ("%d", ptr_data);
preorder (ptr->left_child);
preorder (ptr->right_child);
}
}
Postorder Traversal
A B/C*D*E+
void postorder( tree_pointer ptr)
/* postorder tree traversl */
{
if (ptr){
postorder ( ptr->left_child);
postorder ( ptr->right_child);
printf ("%d", ptr_data);
}
}

(2) Iterative Traversal

Inorder Traversal O(n)

void iter_inorder(tree_pointer node)


{
int top = -1; /*initialize stack */
tree_pointer stack[MAX_STACK_SIZE];
for (; ;){
for (;node; node = node->left_child)
add (&top, node); /* add to
stack */
node = delete(&top);
/*delete from stack*/
if (!node) break; /*
empty stack */
printf ("%d", node_data);
node = node->right_child;
}
}

(3) Level Order Traversal


+*E*D/CAB
void level_order(tree_pointer ptr)
/* level order tree traversal */
{ int front = rear = 0;
tree_pointer queue[MAX_QUEUE_SIZE];
if (!ptr) return; /*empty tree
*/
addq (front, &rear, ptr);
for (; ;){
ptr = deleteq(&front, rear);
if (ptr) {
printf ("%d",ptr_data);
if ( ptr->left_child)
addq (front, &rear, ptr-
>left_child);
if ( ptr->right_child)
addq (front, &rear, tr-
>right_child);
}
else break;
}
}
5.4 Binary Tree Operations
(1) Copy Binary trees
tree_pointer copy( tree_pointer original)
/* this function returns a tree_pointer to an exact
copy of the original tree */
{ tree_pointer temp;
if (original) {
temp = (tree_pointer) malloc (sizeof(node));
if (IS_FULL( temp )){
fprintf (stderr, "The memory is full \n");
exit(1);
}
temp->left_child = copy(original->left_child);
temp->right_child = copy(original->right_child);
temp->data = original->data;
return temp;
}
return NULL;
}

(2) Testing For Equality of Binary Trees

int equal(tree_pointer first,tree_pointer secord)


{/* function returns FALSE if the binary trees first
and second
are not equal ,Otherwise it returns TRUE*/
return (( ! first && !secord) ||
(first && second && (first->data==second-
data)
&& equal ( first->left_child, second->left_child )
&& equal ( first->right_child, second-
>right_child))
}

(3) The Satisfiability Problem


formulas ~ variables x1, x2, ...., xn
(true, false)
operator ∧ (and) ∨ (or) ¬
(not)
rules:
1. A variable is an expression
2. If x and y expressions, then ¬ x, x ∧ y, x ∨ y,
are expression
3. Parentheses can be used to alter the normal
order of
evaluation, which is ¬ before ∧ before ∨ .

the expression: x1 ∨ ( x2 ∧ ¬ x3 ) is a
formula.
If x1 = x3 = false, x2 = true, Then the value of
the expression =
false ∨ ( true ∧ ¬ false)
= false ∨ true
= true

The satisfiability problem for formulas of the


propositional calculus asks if there is an
assignment of values to the variables that causes
the value of the expression to be True
(x1 ∧ ¬ x2 ) ∨ ( ¬ x1 ∧ x3 ) ∨ ¬ x3

x3

x1 x3

x2 x2

left_child data value right_child

First version of satisfiability algorithm

for (all 2 n possible combination ) {


generate the next combination;
replace the variable by their values;
evaluate root by traversing it in postorder;
if (root->value) {
printf (<combination>);
return;
}
}
printf ("No satisfiable combination\n");

post_order_eval function
void post_order_eval(tree_pointer node)
{ /* modified post order traversal to evaluate a
propositional
calculus tree*/
if (node){
post_order_eval (node->left_child);
post_order_eval (node->right_child);
switch (node->data) {
case not:
node->value = !node->right_child-
>value;
break;
case and:
node->value = node->right_child-
>value &&
node->left_child->value;
break;
case or:
node->value = node->right_child-
>value ||
node->left_child->value;
break;
case true:
node->value = TRUE;
break;
case false:
node->value = FALSE;
}
}
}

5.5 Threaded Binary Trees


threads ~ replace the null links by pointer to other
nodes in
the tree.
construct rules:
(1) If ptr->left_child is null,
replace the null link with a pointer to the
inorder
predecessor of ptr
(2) If ptr->right_child is null,
replace the null link with a pointer to the
inorder
successor of ptr
root
A

B C

D E F G

H I
typedef struct threaded_tree *threaded_pointer;
typedef struct threaded_tree {
short int left_thread;
threaded_pointer left_child;
char data;
threaded_pointer right_child;
short int right_thread;
};

left_thread left_child data right_child right_thread

True False

memory representation of a threaded tree

ROOT f = FALSE
f . - . f t = TRUE

f . A . f

f . B . f f . C . f

f . D . f t . E . t t . F . t t . G . t

t . H . t t . I . t
operations

(1) inorder traversal of a threaded binary tree

Finding the inorder successor of a node

threaded_pointer insucc( threaded_pointer tree)


{
/*find the inorder sucessor of tree in a threaded
binary tree */
threaded_pointer temp;
temp = tree->right_child;
if ( !tree->right_thread )
while( !temp->left_child )
temp = temp->left_child;
return temp;
}

Inorder traversal of a threaded binary tree

void tinorder ( threaded_pointer tree)


{ /* traverse the threaded binary tree inorder */
threaded_pointer temp=tree;
for(; ;){
temp = insucc (temp);
if (temp = tree) break;
printf ("%3c", temp->data);
}
}
(2) inserting a node into a threaded binary tree
root root

A A
B parent parent
B
child
C D C D
child
( a )

root root

A A
parent parent
B B
child
C D C X

E F D

E F
child X
before after
( b )
right insertion in a threaded binary tree

void insert_right (threaded_pointer parent,

threaded_pointer child)
{ /* insert child as the right child of parent in a
threaded
binary tree */
threaded_pointer temp;
child->right_child=parent->right_child;
child->right_therad=parent->right_thread;
child->left_child=parent;
child->left_thread=TRUE;
parent->right_child=child;
parent->right_thread=FALSE;
if ( !child->right_thread ) {
temp = insucc(child);
temp->left_child = child;
}
}

5.6 heaps

heap ~ a special form of a complete binary tree

【definition】A max tree is a tree in which the key


value in each node is no smaller than the key value
in its children ( in any ) . A max heap is a complete
binary tree that is also a max tree.

[1] 14 9 [1] 30 [1]

[2] 12 [3] 7 6 [2] 3 [3] 25 [2]

[4] 10 [5] 8 [6] 6 5 [4]


【definition】A mix tree is a tree in which the key
value in each node is no greater than the key value
in its children ( in any ) . A mix heap is a complete
binary tree that is also a mix tree.

[1] 2 10 [1] 11 [1]

[2] 7 [3] 4 20 83 [3] 21 [2]


[2]

[4] 10 [5] 8 [6] 6 50 [4]


ABSTRACT DATA TYPE

structure MaxHeap is
objects:a complete binary tree of n>0 elements
organized
so that the value in each node is
at least as large
as those in its children
function:
for all heap ∈ MaxHeap, item ∈
Element,
n, max_size ∈ integer
MaxHeap create(max_size) ::= create an empty
heap that

can hold a maximum of

max_size elements.
Boolean HeapFull(heap,n) ::= if (n ==
max_size) return

TRUE else return FLASE


MaxHeap Insert(heap,item,n)::= if ( !HeapFull
(heap,n))

insert item into heap and

return the resulting heap

else return error


Boolean HeapEmpty(heap,n)::= if ( n>0 ) return
TRUE
else return
FALSE
Element Delete(heap,n) ::= if
( !HeapEmpty(heap,n)

return one instance of the

largest element in the

heap and remove it from

the heap else return error.

OPERATIONS

1. creation of an empty heap


2. insertion of a new element into the heap
3. deletion of the largest element from the heap

(1) Priority queues

A priority queue delete the element with the


highest ( or lowest) priority. At any time we can
insert an element with arbitrary priority into a
priority queue
Heaps are only one way to implement
priority queue

representationinsertiondeletionunordered
arrayΘ(1)Θ(n)unordered linked listΘ(1)Θ(n)sorted
arrayO(n)Θ(1)sorted linked listO(n)Θ(1)max
heapO( log 2 n )O( log 2 n)

(2) Insertion into a max heap


#define MAX_ELEMENTS 200 /* maximum heap size
+1*/
#define HEAP_FULL(n) (n = MAX_ELEMENTS -1 )
#define HEAP_EMPTY ( n) (!n)
typedef struct { int key ;
/* other field */
} element;
element heap [ MAX_ELEMENTS ];
int n = o;

[1] 20 [1] 20

[2] 15 [2] 15 [3] 2


[3] 2

[4] 14 [5] 10 [4] 14 [5] 10


[6]

(a) heap before insertion (b) initial location of new node

[1] 20 [1] 21

[2] 15 [2] 15 [3] 20


[3] 5

[4] 14 [5] 10 [4] 14 [5] 10 2 [6]


2 [6]

(c) insert 5 into heap(a) (d) insert 21 into heap(a)

PROGRAM O( log 2 n )

void insert_max_heap (element item,int *n)


{ /* insert item into a max heap of current size *n
*/
int i;
if (HEAP_FULL (*n)) {
fprintf (stderr,"The heap is full.\n");
exit(1);
}
i=++(*n);
while (( i != 1) && ( item.key > heap[ i /
2 ].key)) {
heap[ i ] = heap [ i/2 ]; i /= 2;
}
heap [ i ] = item;
}

(3) deletion from a max heap O( log 2 n )

20
removed
[1] [1] 10 [1] 15

[2] 15 [3] 2 [2] 15 [3] 2 [2] 14 [3] 2

[4] 14 [5] [4] 14 [4] 10

(a) heap structure (b)inserted at the root (c) final heap


element delete_max_heap(int *n)
{ /* delete element with the highest key from the
heap */
int parent,child;
element item, temp;
if (HEAP_EMPTY(*n)) {
fprintf (stderr, "The heap is empty\n");
exit (1);
}
/* save value of the element with the highest
key */
item = heap[1];
/* use last element in heap to adjust heap */
temp = heap[ (*n)-- ];
parent = 1;
child = 2;
while (child <=*n) {
/* find the larger child of the current
parent */
if (child < *n) && (heap[child].key<
heap[child+1].key)
child++;
if (temp.key >= heap[child].key) break;
/* move to the next lower level */
heap[parent] = heap[child];
parent = child;
child* = 2;
}
heap [parent] = temp;
return item;
}

5.7 Binary search trees


【definition】A binary search tree is a binary tree. It
may be empty. If it is not empty, it satisfies following
properties:
(1) Every element has a key, and no two elements
have the
same key, that is , the keys are unique.
(2) The keys in a nonempty left subtree must be
smaller than
the key in the root of the subtree.
(3) the keys in a nonempty right subtree must be
larger than
the key in the root of the subtree.
(4) The left and right subtree are also binary search
trees.
20 30 60

15 25 5 40 70

12 10 22 2 65 80

(1) Searching a binary search tree O(h)


Recursive search of a binary search tree

tree_pointer search(tree_pointer root,int key)


{ /* return a pointer to the node that contains key.If
there is
no such node,return NULL. */
if ( !root ) return NULL;
if (key == root->data) return root;
if (key < root->data)
return search (root->left_child, key);
return search (root->right_child, key);
}

Iterative search of a binary search tree


tree_pointer search2(tree_pointer tree,int key)
{ /* return a pointer to the node that contains key.
If there is
no such node,return NULL. */
while (tree) {
if (key == tree->data) return tree;
if (key < tree->data)
tree = tree->left_child;
else
tree = tree->right_child;
}
return NULL;
}

(2) Inserting into a binary search tree O(h)


30 30

5 40 5 40

2 80 2 35 80
void insert_node(tree_pointer *node, int num)
/* If num is in the tree pointed at by node do
nothing;otherwise add a new node with data = num */
{tree_pointer ptr, temp = modified_search (*node,
num);
if ( temp || !(*node)){
/*num is not in the tree */
ptr = (tree_pointer) malloc
(sizeof(node));
if (IS_FULL( ptr )){
fprintf (stderr,"The memory is full
\n");
exit(1);
}
ptr->data = num;
ptr->left_child = ptr->right_child =
NULL;
if (*node) /* insert as child of
temp */
if (num < temp->data) temp-
>left_child = ptr;
else temp->right_child = ptr;
else *node = ptr;
}
}

(3) Deletion from a binary search tree O( log 2 n)


30 30

5 40 5 80

2 80 2

40 40

20 60 20 55

50 70 10 30 50 70
10 30

45 55 45 52

52
(a)tree before deletion of 60 (b)tree after deletion of 60
(4) Height of a binary search tree

Height of the binary search tree is O( log 2 n ), on


average.

balanced search tree ~ search trees with a


worst case height of O( log 2 n) AVL , 2 - 3, red-
black tree O(h)

5.8 Selection trees


goal --- k ordered sequences ( run ) are to be
merged into a single ordered
sequence ( n records) O( log 2 n )
selection tree ---- a binary tree where each node
represents
the smaller of its two children
tournament algorithm O(n log 2 n )
leaf node ~ the first record in the
corresponding run
nonleaf node ~ the winner of a tournament
root ~ the overall winner or smallest key

1 6

2 3
6 8
6 7
4 5 6 8 17
9
10 11 12 13 14 15
8 9 90 17
10 9 20 6 8 9
15 20 20 15 15 11 100 18
16 38 30 25 50 16 110 20

run1 run2 run3 run4 run5 run6 run7 run8


k = 8
1 8

2 3
9 8
6 7
4 5 8 17
9 15

10 12 13 14 15
8 9 11
10 9 20 15 8 9 90 17

15 20 20 25 15 11 100 18
16 38 30 25 50 16 110 20

run1 run2 run3 run4 run5 run6 run7 run8

Tree of losers
each nonleaf node retains a pointer to the
loser. An additional node, node 0, has been added
to represent the overall winner of the tournament.

overall
0 6 winner

1 8

2 6 3
9 17
6 7
4 5 20 9 90
10
10 12 13 14 15
8 9 9 11
10 20 6 8 9 90 17
run 1 2 3 4 5 6 7 8
5.9 Forests
【definition】A forest is a set of n >= 0 disjoint
trees

(1) Transforming a forest into a binary tree

【definition】If T 1, ..., T n is a forest of trees, then


the binary
tree corresponding to this forest, denoted by B(T
1, ..., T n):
1. is empty, if n = 0
2. has root equal to root (T 1); has left
subtree equal to
B(T 11, T 12, ..., T 1m), where T 11, T 12, ..., T 1m
are the
subtrees of root (T 1); and has right subtree
B(T 2, ..., T n)

A
A E

B E
B C D F
C F G
G
D H
H I
I
(2) Forest Traversal

inorder
1. If F is empty, then return
2. Traverse the subtree of the first tree in tree
inorder
3. Visit the root of the first tree of F
4. traverse the remaining trees of F in inorder

preorder
1. If F is empty, then return
2. Visit the root of the first tree of F
3. Traverse the subtree of the first tree in tree
preorder
4. Traverse the remaining trees of F in preorder

postorder
1. If F is empty, then return
2. Traverse the subtree of the first tree in tree
postorder
3. Traverse the remaining trees of F in postorder
4. Visit the root of the first tree of F

5.10 Set representation


assume
⑴ the elements of the set are the number 0,1, 2, ...,
n-1
⑵ the set being represented are pairwise disjoint,
that is, if Si
and Sj are two sets and i ≠ j , the there is no
element that
is in both Si and Sj.
⑶ for each set we have linked the nodes from the
children to
the parent

0 4 2

6 7 8 1 9 3 5
9
S1 S2 S3
operations
⑴ disjoint set union. If Si and Sj are disjoint sets,
then their
union Si U Sj = { all elements, x, such that x is
in Si or Sj }.
⑵ Find( i). Find the set containing the element, i,

5.10.1 UNION AND FIND OPERATIONS


(1) union operation

0 4

6 7 8 4 0 1 9

1 9 6 7 8

S1 U S2 S2 U S1

set 0
name pointer
S1 6 7 8

S2
4
S3
1 9

3 5

i [0] [1] [2] [3] [4] [5] [6] [7][8] [9]parent


-1 4 -1 2 -1 2 0 0 0 4
(2) find operation ( for n-1 finds O( n ) )
2

int find1( int i)


{
for ( ; parent [ i] >= 0; i = parent [ i ] )
;
return i;
}
void union1( int i, int j ) /* for n-1 unions O(n)
*/
{
parent [ i ] = j;
}

union (0, 1), find (0)


union (1, 2), find (0)

union (n-2, n-1), find (0)


n-1

n-2

【definition】Weighting rule for union ( i, j ). If the


number of nodes in tree i is less then the number in
tree j then make j the parent of i; otherwise make i
the parent of j.
0 1 ... n-1 0 2 ... n-1 0 3 ... n-1

1 1 2

initial union(0,1), union(0,2),


0=find(0) 0=find(0)

0 4 ... n-1 0

1 2 3 1 2 3 ... n-1

union(0,3), union(0,n-1)
0=find(0)
void union2( int i, int j)
{ /* union the sets with roots i and j, i != j, using the
weighting rule.parent[i] = -count [ i ] and parent[j] = -
count[j] */
int temp = parent[i] + parent[j];
if (parent[i] > parent[j]) {
parent[i] = j; /*make j the new
root */
parent[j] = temp;
}
else { parent [j] = i; /*make i the new
root */
parent [j] = temp;
}
}
〖lemma5.4〗Let T be a tree with n nodes created as
a result of union2. No node in T has level greater
then [ log 2 n ] + 1
proof:
n = 1, lemma is clearly true
i < = n - 1, assume that it is true for all trees with i
nodes
i=n, T is a tree with n noes created by
union2
last union operation union( k, j )
m ---- the number of nodes in tree j (1
<= m <= n/2 )
n - m ---- the number of nodes in k
if the max level of any node in T is
same as k
then max level in T <= [ log 2 ( n − m ) ]
+ 1 <= [ log 2 n] +1
if the max level of any node in T is one
more than in j
then max level in T <= [ log 2 m ] + 2
<= [ log 2 n / 2 ] +2 <= [ log 2 n] +1
〖example〗
initial, parent [ i ] = - count [ i ] = -1, 0 ≤ i < n = 2 3
union( 0, 1) union( 2, 3) union( 4, 5)
union( 6, 7)
union( 0, 2) union( 4, 6) union(0, 4)

[-1] [-1] [-1] [-1] [-1] [-1] [-1] [-1] level command
0 1 2 3 4 5 6 7 1 initial

[-2] [-2] [-2] [-2]


1 union(0,1)
0 2 4 6 union(2,3)
union(4,5)
7 2 union(6,7)
1 3 5

[-4] [-4]
1 union(0,2)
0 4 union(4,6)

5 6 2
1 2

3 7 3
[-8]
0 1 union(0,4)

2
1 2 4

6 3
3 5

7 4

【definition】Collapsing rule: If j is a node on the


path from i to its root then make j a child of the root

int find2 (int i)


{/* find the root of the tree containing element i.Use
the
collapsing rule to collapse all nodes from i to root */
int root, trail, lead;
for (root = i; parent [root] >= 0; root = parent
[root])
;
for (trail = i; trail != root; trail = lead}{
lead = parent [trail]; parent
[trail] = root;
}
return root;
}

〖example〗process the following 8 finds:


find (7), find (7), … ,
find (7)
old version ----- 24 moves to process all eight finds
new version --- 13 moves to process all eight finds

α ( m , n ) ---- a very slowly growing function

α ( m , n ) = min{ z ≥ 1| A( z , 4[ m / n]) > log 2 n}

A( p , g ) ---- Ackermann's function ( a very


rapidly growing
function)
2 q p = 0
0 q = 0 and p ≥ 1

A( p, q ) = 
0 p ≥ 1 and q = 1
 A(p - 1, A(p, q - 1)) p ≥ 1 and q ≥ 2

(1)
A(3,4) = 2 --
2

}65,536 twos
(2) A( p , q + 1) > A( p , q )
(3) A( p + 1, q ) > A( p , q )

〖lemma5.5〗Tarjan: Let T(m, n) be the maximum


time require to process an intermixed sequence of
m >= n finds and n - 1 unions. Then:
k 1 m α (m, n) ≤ T(m, n) ≤ k 2α (m, n)
for some positive constants k1 and k2

5.10.2 Equivalence classes


i≡ j ---- determine the sets containing i and j
(find)
replace the two sets by their union
(union) Ο( m α ( 2 m , n )) n ---
equivalence classes m ---
equivalence pairs
〖example〗tree for equivalence

[-1] [-1] [-1] [-1] [-1] [-1] [-1] [-1] [-1] [-1] [-1] [-1] lnput
0 1 2 3 4 5 6 7 8 9 10 11 initial

[-2] [-1] [-2] [-1] [-2] [-1] [-2] [-1] 0 = 4


0 3 6 11 3 = 1
2 5 7 8 6 = 10
8 = 9
4 1 10 9

[-2] [-4] [-3] [-2] 7 = 4


0 6 = 8
6 3 2 3 = 5
2 = 11
4 7 10 8 1 5 11

9
[-5] [-4] [-3]
0 11 = 0
6 3

4 2 7 10 8 1 5

11 9

5.11 counting binary tree

1. distinct binary trees


n = 1 ( 0 ) node -------- one binary tree
n=2 nodes ------- two binary tree
n nodes ?
n=3 nodes ------- five binary tree
n = 2 n = 3

2. Stack permutations
preorder sequence: ABCDEFGH I
inorder sequence: BCAEDGHF I
Does such a pair of sequences uniquely define a
binary tree ?

(a) (b)
A
A
B D,E,F,G,H,I
B,C D,E,F,G,H,I
C

A PREORDER: 123456789
1 INORDER: 213547869

B D 2 4

C E F 3 5 6

G I 7 9

H 8
(c)

Every binary tree has


a unique pair of preorder-inorder
sequence

inorder (preorder ) permutation :


the order in which its nodes are visited
during an inorder (preorder ) traversal of the
tree.

If nodes are numbered such that its preorder


permutation
1,2,...,n,
then
1.distinct binary trees define distinct inorder
permutation.
2.the number of distinct binary trees is equal
to the
number of distinct inorder permutation
obtainable from
binary tree having the its preorder
permutation, 1,2,...,n.

〖example〗If we start with the numbers 1, 2, 3,


then the possible permutation obtainable by a
stack are:
(1, 2, 3) (1, 3, 2) (2, 1, 3) (2, 3, 1)
(3, 2, 1)
Binary trees corresponding to the five
permutations (3,2,1) is impossible

1 1 1
1 1
2 2 3 2 2
2

3 3 3 3

3. Matrix Multiplication

M 1 * M 2 * .... * M n

matrix multiplication is associative


n =3 : two possibilities: (M1*M2)* M3
M1*(M2*M3 )

n = 4: five possibilities: (( M 1 * M 2 ) * M 3 ) * M
4
(M 1 * ( M 2 * M 3 ) )* M
4
M 1 * (( M 2 * M 3 ) * M
4)
(M 1 * ( M 2 * ( M 3 * M
4 ) ))
(( M 1 * M 2 ) * (M 3 * M
4))
bn ----- the number of different ways to compute
the product
of n matrices ( b 1 = 1, b 2 = 2, b 3 = 5)
M ij = the product M i * M i + 1 * ... * M j ( i <=
j)
M 1n = the product M 1i * M i+1, n ( 1 <= i <= n)
M 1i = b i M i+1, n = b n - i
n−1
bn = ∑ bi bn− i n>1
i =1
b n = the number of distinct binary trees with n
nodes
b n = the sum of all the possible binary trees
formed in the
following way:
a root and two subtrees with b i and b n - i-1
nodes,
for 0 <= i <= n

bn

bi b n-i-1

n−1
bn = ∑ bi bn− i −1, n ≥ 1, and b0 = 1
i =0 Eq.(1).

4. Number of distinct binary trees


to obtain the number of distinct binary tree
with n nodes, we must solve the recurrence of
Eq.(1). we let
B ( x ) = ∑ bi x i
i ≥0 Eq.(1).

xB 2 ( x ) = B ( x ) − 1

using the formula to solve quadratics and the fact


that b(0) = b 0 = 1 we get:

xB ( x ) = B( x ) − 1
2
Eq.(2)

using the binomial theorem to expand (1 − 4 x )


1/ 2
to
obtain:
1 L
M L O O
P L1/ 2 O
1/ 2
B( x ) =
2x M
1− ∑ P
N NQ Q N
n≥ 0 n
M
m + 1QP
( −4 x ) n = ∑
( −1)
m≥0
m
2 2 m+1 x m

Eq.(3)

comparing Eq.(2) and Eq.(3) we see that bn, which

B ( x ) = ∑ bi x i
is the coefficient of x
n
in B(x), is: i ≥0

1 2n L
M O
bn =
n +1 n NPQ
bn = Ο( 4 n / n3/ 2 )
Chapter6 GRAPHS
6.1 The graph abstract data type
1. introduction
Koenigsberg bridge problem
starting at some land area, is it possible to return to our staring
location after walking across each of the bridge exactly once ?
C d g
c
C
g
c d
A e
e
D A D
a b
a f
b f
B B
start B ----> (a)---> A ---> (e )---> D ---> (g) --->
C ---> (d) ---> A ---> (b) ---> B ---> (f ) ---> D [no
solution]

Eulerian walk
There is a walk starting at any vertex, going through each
edge exactly once, and terminating at the starting vertex iff the
degree of each vertex is even.

graph applications:
analysis of electrical circuits,
finding shortest routes,
project planning,
identification of chemical compounds.

GRAPH (G) --- consists of two sets, a finite, nonempty set of


vertices ( V(G) ), and a finite, possibly empty set of edges
( E(G) ), G= ( V, E ).

undirected graph ---- the pair of vertices representing any


edges is undirected. (v0, v1) = (v1, v0)
directed graph ----- the pair of vertices representing any
edges is directed. < v0, v1 > ( v0 -- tail, v1 -
- head )
0 0 0

1 2 1 2 1
3
3 4 5 6 2
G1 G2 G3

V(G1) = {0, 1, 2, 3} E(G1) = { (0,1), (0,2), (0,3), (1,2),


(1,3), (2,3) }
V(G2) = {0, 1, 2, 3, 4, 5, 6} E(G2) = { (0,1), (0,2), (1,3), (1,4),
(2,5), (2,6) }
V(G3) = {0, 1, 2} E(G3) = { < 0, 1 >, < 1, 0 >,<1,2> }

G2 is a tree ---- > trees is a special case of graphs

The restrictions on graph:


1. A graph may not have an edge from a vertex , i, back to
itself (self loops)
2. A graph may not have multiple occurrences of the same
edge ( multigraph )

0 2 0

1 3

1 2

complete graph ----- is a graph that has the maximum


number
of edges
undirected graph (n vertices) ----- n (n-1) / 2
directed graph (n vertices) ----- n (n-1)
(v0, v1) ---> vertices v0 and v1 are adjacent
---> edge (v0, v1) is incident on v0 and v1
<v0, v1> ---> vertex v0 is adjacent to vertex v1
vertex v1 is adjacent from vertex v0
---> edge <v0, v1> is incident on v0 and v1
subgraph of G (G') ----- V(G') ⊆ V(G) and E(G') ⊆ E(G)

125
0 0
0 1 2
1 2 3 1 2

3
(a) some of subgraphs of G1

0 0 3 0

2 1
1
6
2
(b) some of subgraphs of G3
path from vp to vq ----- a sequence of vertices vp , vi1 , vi2 , … ,
vin , vq such that ( vp , vi1 ), ( vi1 , vi2 ),…,
( vin , vq ) are edges in an undirected graph.
in directed graph, the path consists of
< vp , vi1 >, < vi1 , vi2 >,…, < vin , vq >
length of a path ---- the number of edges on the graph
simple path ----- a path in which all vertices, except possibly
the first and the last are distinct.
simple directed path
cycle ---- a simple path in which the first and the last vertices
are the same.
directed cycle
v0 and v1 are connected ----- there is a path from v0 to v1
an undirected graph are connected ------
if , for every pair of distinct vertices vi , vj ,
there is a path from vi to vj in G.

H1 0 H2 4

2 1 5

3 6

7 G4

connected component of an undirected graph -----


a maximal connected subgraph. ( a tree is
graph

126
that is connected and acycle)

strongly connected ( directed graph ) ------


if , for every pair of vertices vi , vj in G, there is
a
path from vi to vj, and also from vj to vi.
strongly connected component -----
a maximal subgraph that is strongly connected.
0
2
1
degree of vi ---- the number of edges incident to that vertex.
in-degree ---- the number of edges that have v as the head.
out-degree ----- the number of edges that have v as the tail.
n−1
the number of edges : e = (∑di ) / 2
0

Abstract data type


structure Graph is
object: a nonempty set of vertices and a set of undirected
edges, where each edge is a pair of vertices.
function: for all graph ∈ Graph, v, v1, and v2 ∈
Vertices
Graph Create() ::= return an empty graph.
Graph InsertVertex( graph,v ) ::= return a graph with
v inserted. v has no incident
edges.
Graph Insertedge( graph, v1 , v2 )::= return a graph
with a new edge between v1
and v2 .
Graph DeleteVertex( graph,v ) ::= return a graph in
which v and all edges incident
to it are removed.
Graph DeleteEdge(graph, v1 , v2)::= return a graph
in which the edge (v1 , v2) is
removed. Leave the incident
nodes in the graph.
Boolean IsEmpty( graph ) ::= if ( graph == empty
graph) return TRUE else return
FALSE.

127
List Adjacent( graph, v ) ::= return a list of all
vertices hat are adjacent to v.
2. representations

⑴ adjacency matrix

0 1 2 3 4 5 6 7
0 1 2 3 0 0 1 1 0 0 0 0 0
0 1 2 1 1 0 0 1 0 0 0 0
0 0 1 1 1 0 0 1 0 2 1 0 0 1 0 0 0 0
1 1 0 1 1 1 1 0 1 3 0 1 1 0 0 0 0 0
2 1 1 0 1 2 0 0 0 4 0 0 0 0 0 1 0 0
3 1 1 1 0 5 0 0 0 0 1 0 1 0
6 0 0 0 0 0 1 0 1
G1 G3 7 0 0 0 0 0 0 1 0
G4

n−1
the degree of any vertex, i : ∑ adj _ mat [ i ][ j ]
j =0

How many edges are there in G or, Is G connected ?


Ο( n2 )
sparse graphs --- graphs that have a small number of edges
Ο( e + n ) e << n2 / 2

⑵ Adjacency list
headnodes vertex link

1 0 1 2 null
0 2 3 null

1 0 2 1 0 3 null
3 null

2 0 1 3 null 2 0 3 null

3 0 1 2 null 3 1 2 null

G1 4 5 null

0 1 null 5 4 6 null

1 0 2 null 6 5 7 null

2 null 7 6 null

G3
G4

128
#define MAX_VETICES 50 /* maximum number of vertices */
typedef struct node *node_pointer;
typedef struct node { int vertex;
struct node *link;
};
node_pointer graph[MAX_VERTICES];
int n= 0 ; /* vertices currently in use */

In undirected graph (n vertices, e edges):

n head nodes, 2e list nodes


node [ i ] --- the starting point of the list for vertex i, 0 <= i < n
and node [ n ] is set to n + 2e + 1
The vertices adjacent from vertex i are stored in node[i],…,
node[i+1] -1, 0 <= i < n.

the sequential representation for G4:


[0] 9 [5] 18 [10] 2 [15] 1
[20] 5
[1] 11 [6] 20 [11] 0 [16] 2
[21] 7
[2] 13 [7] 22 [12] 3 [17] 5
[22] 6
[3] 15 [8] 23 [13] 0 [18] 4
[4] 17 [9] 1 [14] 3 [19] 6

the degree of any vertex ---- the number of nodes in the its
adjacency list
the total number of edges in G ---- Ο( n + e )

In undurected graph (n vertices, e edges):

out - degree ---- the number of nodes in its adjacency list


in - degree ? inverse adjacency lists

129
0 1 null

1 0 null inverse adjacency list for g3


2 1 null

Alternate node structure for adjacency list :

tail head column link for haed row link for tail

headnodes
(shown twice) 0 1 2

0 0 1 null

1 1 0 null 1 2 null

2 null G3

note: we arranged the nodes in each of the lists so that the


vertices were in ascending order, it is not necessary in fact.
headnodes vertex link

0 3 1 2 null

1 2 0 3 null

2 3 0 1 null

3 2 1 0 null

G1
⑶ Adjacency multilists

marked vertex1 vertex2 path2


path1

typedef struct edge *edge_pointer;


typedef struct edge {
short int marked;
int vertex1;
int vertex2;
edge_pointer path1;
edge_pointer path2;

130
};
edge_pointer graph [MAX_VERTICES];

headnode
0 M1 0 1 M2 M4 edge(0,1)
1
2 M2 0 2 M3 M4 edge(0,2)
3
M3 0 3 null M5 edge(0,3)

M4 1 2 M5 M6 edge(1,2)

M5 1 3 null M6 edge(1,3)

M6 2 3 null null edge(0,1)

The lists are: vertex 0: M1-> M2 -> M3


vertex 1: M1-> M4 -> M5
vertex 2: M2-> M4 -> M6
vertex 3: M3-> M5 -> M6

⑷ Weighted edges ( network )


weight --- the distance from one vertex to another or the cost
of going from one vertex to an adjacent vertex

6.2 Elementary graph operations

⑴ depth first search

#define FALSE 0
#define TRUE 1
short int visited [MAX_VERTICES];

void dfs(int v)
{
/* depth first search of a graph beginning with vertex v */
node_pointer w;
visited[v] = TRUE;
printf ("%5d",v);
for (w = graph[v]; w; w = w->link)

131
if ( !visited [ w->vertex ])
dfs( w->vertex );
}
〖example〗
0 1 2 null
v0
1 0 3 4 null

2 0 5 6 null v1 v2

3 1 7 null
v3 v4 v5 v6
4 1 7 null

5 2 7 null
v7
6 2 7 null

7 3 4 5 6 null

complexity
for adjacency list , O (e) for adjacency matrix , O
(n )
2

⑵ Breadth first search


typedef struct queue *queue_pointer;
typedef struct queue { int vertex; queue_pointer; };
void addq (queue_pointer *, queue_pointer *, int);
int deleteq ( queue_pointer *);
void bfs(int v)
{/* breadth first traversal of a graph,starting with node v
the global array visited is initialized to 0, the queue
operations are similar to those described in chapter 4. */
node_pointer w;
queue_pointer front,rear;
front = rear = NULL; /* initialize queue */
printf ("%5d", v);
visited[v] = TRUE;
addq (&front,&rear,v);
while (front) { v = deleteq (&front);
for (w = graph[v]; w; w = w->link)
if ( !visited [w->vertex])
{
printf ("%5d",w->vertex);
addq (&front, &rear, w->vertex);
visited [w->vertex]=TRUE;

132
}
}
}

complexity
for adjacency list , O (e)
for adjacency matrix , O ( n )
2

⑶ Connected components

void connected(void)
{
/* determine the connected components of a graph */
int i;
for ( i=0; i < n; i++)
if ( !visited [ i ] ) {
dfs ( i );
printf ( "\n" );
}
}
complexity
for adjacency list , O(n+e)
for adjacency matrix , O ( n )
2

⑷ Spanning trees
spanning tree is any tree that consists solely of edge in G
and that includes all the vertices in G.

depth first spanning tree (a)


breadth first spanning tree (b)

133
v0 v0

v2 v0 v2
v0

v5 v6 v3 v4 v5 v6
v3 v4

v7 v7
(a) dfs(0) spanning tree (b) bfs(0) spanning tree

property1 :

If a nontree edge, (v,w) is added into any spanning tree, T, the


result is a cycle that consists of the edge (v,w) and all the
edge on the path from w to v in T.

[ use: obtain an independent set of circuit equation for an


electrical network ]

property2 :

A spanning tree is a minimal subgraph, G', of G such that


V(G') = V(G) is connect.

minimal subgraph -- subgraph with the fewest number of


edges.
any connected graph with n vertices must have at least n-1
edges, and all connected graphs with n-1 edges are trees.
therefore a spanning tree has n-1 edges.

[use: design of communication networks , some time we


assign weights to the edges for representing the cost of
constructing the communication link or the length of
the
link]

⑸ Biconnected components and articulation points

134
articulation points --- a vertex v of G such that the deletion of
v, together with all edges incident on v, produces a graph, G',
that has at lest two connected components.

Biconnected graph --- a connected graph that has no


articulation points.

Biconnected components of a connected undirected graph


is a maximal biconnected subgraph, H, of G.

□maximal means that G contains no other subgraph that


is both biconnected and properly contains H.
□two biconnected components of the same graph have no
more than one vertex in common.

8 9 0 8 9
0

1 7 7
1 7
1 7
2 3 5
2 3 3 5 5
4 6
4 6
(a) connected graph (b) biconnected components

the depth first number, or dfn, of the vertex:


the sequence in which the vertices are visited during the depth
first search
□ if u and v are both vertices, and u is an ancestor of v in the
depth first spanning tree, then dfn(u) < dfn (v)

135
0
3
9 8 9 8 1 5
4 0
4 5
3 1 7 7
0 5 2 2 6 6
2
2 3 5
3 1
7 7
4 6 9
1 6 4 8
0 9 8

(a) connected graph (b)

□ a nontree edge ( u, v ) is a back edge


iff either u is an ancestor of v or v is an ancestor of u.
* all nontree edges are back edges.
□ the root of a depth spanning tree is an articulation point
iff it has at least two children.
□ any other vertex u is an articulation point
iff it has at least one child w such that we cannot reach
an
ancestor of u using a path that consists of only w,
descendants of w, and a single back edge.

definition: low
for each vertex of G such that low(u) is the lowest depth first
number that we can reach from u using a path of
descendants followed by at most one back edge:

low(u) = min { dfn(u), min { low(u) | w is a child of u }


min { dfn(w) | (u,w)
is a back edge}}

□u is an articulation point
iff u is either the root of the spanning tree and has two or
more children, or u is not the root and u has a child w such
that low(w) >= dfn(u).

vertex 0 1 2 3 4 5 6 7 8 9
dfn 4 3 2 0 1 5 6 7 9 8

136
low 4 3 0 0 0 5 5 7 9 8

void dfnlow(int u,int v)


{
/* compute dfn and low while performing a dfs search
beginning at vertex u,v is the parent of u (if any) */
node_pointer ptr;
int w;
dfn[u] = low[u] = num++;
for (ptr =graph[u]; ptr; ptr= ptr->link ) {
w = ptr->vertex;
if (dfn [w] < 0) { /* w is an unvisited vertex */
dfnlow (w,u);
low[u] = MIN2 ( low[u], low[w] );
}
else if ( w != v)
low [u] = MIN2 ( low[u], low[w]);
}
}

void init (void)


{
int i;
for ( i = 0; i < n; i++ ){
visited [ i] = FALSE;
dfn[ i ] = low[ i ] = -1;
}
num = 0;
}

global declarations:

#define MIN2 (x, y) ((x ) < ( y ) ? (x) : (y))


short int dfn [MAX_VERTICES ];
short int low [MAX_VERTICES ];
int num;

137
void bicon( int u, int v) O(n + e )
{
/* compute dfn and low,and output the edges of G by their
biconnected component, v is the parent (if any) of the u
{if any) in the resulting spanning tree.It is assumed that
all entries of dfn[] have been initialized to -1,num has
been initialized to 0,and the stack has been set to empty
*/
node_pointer ptr;
int w,x,y;
dfn[u] = low[u] = num++;
for (ptr = graph[u]; ptr; ptr = ptr->link){
w = ptr->vertex;
if(v!= w && dfn[w] < dfn[u])
add (&top, u, w); /* add edge to
stack */
if ( dfn [ w ] m< 0) { /* w has not been
visited * /
bicon(w,u);
low[u] = MIN2(low[u],low[w]);
if (low[w] >= dfn[u]{
printf ("New biconnected component:");
do { /* delete edge from stack */
delete (&top, &x, &y);
printf (" <&d,&d>",x,y);
} while ( !((x == u) && ( y == w)));
printf ("\n");
}
}
else if (w != v) low[u] = MIN2(low[u], low[w]);
}
}

6.3 Minimum cost spanning trees

cost of a spanning tree of a weighted undirected graph is the


sum of the costs (weights) of the edges in the spanning tree.

a minimum cost spanning tree is a spanning tree of least cost.

138
greedy method ( design strategy ):
Kruskal's Prim's sollin algorithms

solution must satisfy the following constraints :


(1) we must use only edges within the graph
(2) we must use exactly n - 1 edges
(3) we may not use edges that would produce a cycle

Kruskal's algorithms

method --- adding edges to T one at a time.


1. select the edges for inclusion in T in nondecreasing order
of their cost.
2. an edge is added to T if it does not form a cycle with
the
edges that are already in T.
( since G is connected and has n > 0 vertices, exactly n
-1
edges be selected for inclusion in T )

Kruskal's algorithm

T={ };
while (T contains less than n-1 edges && E is not empty ){
choose a least cost edge (v,w) from E;
delete (v,w) from E;
if ((v,w) does not create a cycle in T)
add (v,w) into T;
else
discard (v,w);
}
if (T contains fewer than n-1 edges)
printf ("No spanning tree\n");

〖example〗

139
0 0 0
28
10 10 1
1 1
14 16
5 6 2 5 6 2 5 6 2
24
25 18
4 12 4 4
22 3 3 3

(a) (b) (c)


0 0 0
10 10 10 1
1 1 14
14 16
5 6 2 5 6 2 5 6 2

4 12 4 12 4 12
3 3 3

(d) (e) (f)


0 0
10 10 1
1 14 16
14 16
5 6 2 5 6 2
25
12 4 12
4
3 22 3
22

(g) (h)

Edge weight result figure


---- ---- initial fig.(b)
(0,5) 10 added to tree fig.(c)
(2,3) 12 added fig.(d)
(1,6) 14 added fig.(e)
(1,2) 16 added fig.(f )
(3,6) 18 discarded
(3,4) 22 added fig.(g)
(4,6) 24 discarded
(4,5) 25 added fig.(h)
(0,1) 28 no considered

140
〖theorem 6.1〗let G be an undirected connected graph.
Kruskal's algorithm generates a minimum cost spanning tree.
proof :
(a) Kruskal's method produces a spanning tree whenever
a
spanning tree exist
(b) the spanning tree generated is of minimum cost

Prim's algorithm

T={ };
TV = { 0 }; /* start with vertex 0 and no edges */
while (T contains fewer than n-1 edges) {
let (u,v) be a least cost edge such that u ∈ TV and
v ∉ TV;
if (there is no such edges)
break;
add v into T;
add (u, v) into T;
}
if (T contains fewer than n-1 edges)
printf("No spanning tree\n");

141
0 0 0
10 1 10 10
1 1

5 6 2 5 6 2 5 6 2
25 25
4 4 4
3 3 22 3

(a) (b) (c)

0 0
0
10 10 10 1
1 1 14 16
16
5 5 6 2 5 6 2
6 2
25 25 25
12 12 4 12
4 4
3 3 22 3
22 22
(d) (e) (f)

Sollin's algorithm

1. initialize the set of selected edges as empty set.


2. select edges (minimum cost) for each of the vertices.
(eliminating the duplicate edges)
3. add these edges to the set of selected edges, producing
the forest.
4. when there is only one tree at the end of a stage or no
edges remain for selection.

0 0
10 10
1 1
14 14 16
5 6 2 5 6 2
25
4 12 4 12
22 3 22 3
(a) (b)

142
6.4 Shortest path and transitive closure

※ Is there a path from A to B ?


※ If there is more than one path from A to B,
which path is the shortest ?

⑴ Single source all destinations

problem : given a directed graph, G = (V, E), a weighting


function, w(e), w(e) > 0, for the edges of G, and a source vertex
v0, determine shortest path from v0 to each of the remaining
vertices of G.

greedy method:
1.Let S denote the set of vertices, including v0, whose
shortest paths have been found
2. For w in not in S, let distance[w] be the length of the
shortest path starting from v0, going through vertices
only
in S, and ending in w.

□If the next shortest path is to vertex u, then the path from v0
to u goes through only those vertices that are in S.
□Vertex u is chosen so that it has the minimum distance,
distance[u], among all the vertices not in S.
□ Once we have selected u and generated the shortest path
from v0 to u, u becomes a member of S. (w is not currently
in S) the shortest path = distance [ u ] + length (<u, w> )

45
v0 50 10 v4 path lenght
v1
1) v0 v2 10
20 10 15 35 2) v0 v2 v3 25
20 30 3) v0 v2 v3 v1 45
4) v0 v4 45
v2 v3 v5
15 3

143
#define MAX_VERTICES 6 /* maximum number of vertices */
int cost[ ][MAX_VERTICES} =
{{ 0, 50, 10,1000, 45,1000},
{1000, 0, 15,1000, 10,1000},
{ 20,1000, 0, 15,1000,1000},
{1000, 20,1000, 0, 35,1000},
{1000,1000, 30,1000, 0,1000},
{1000,1000,1000, 3,1000, 0}};
int distance[MAX_VERTICES];
short int found[MAX_VERTICES];
int n = MAX_VERTICES;

program Single source shortest paths Ο( n2 )

void shortestpath (int v,int cost[ ][MAX_VERTICES],


int distance[ ],int n,short int found[ ])
{
/* distance[i] represents the shortest path from vertex v
to i,found[i] holds a 0 if the shortest path from vertex i
has not been found and a 1 if it has ,cost is the
adjacency matrix */
int i,u,w;
for (i = 0; i < n; i++){
found[ i ] = FALSE;
distance [ i] = cost [v] [ i ];
}
found [v]= TRUE;
distance[v] = 0;
for (i=0; i<n-2; i++){
u=choose(distance,n,found);
found[ u ] = TRUE;
for (w=0; w<n; w++)
if ( !found[w])
if (distance[u] +cost[u][w] <distance[w])
distance[w] = distance[u]+cost[u][w];
}
}

Choosing the least cost edge :

int choose(int distance[ ], int n, short int found[ ])


{

144
/* find the smallest distance not yet checked */
int i,min,minpos;
min = INT_MAX;
minpos = -1;
for (i=0; i<n; i++)
if (distance[i] < min && !found [i]){
min =distance[i];
minpos = i;
}
return minpos;
}

〖example〗
4 Boston
1500
Chicago 250
San 3
francisco Denver 1200
800 1000
1 2 5

300 Mew york


1000 1400

0 1700 900

Los Angeles 7 1000


New Orleans 6 Miami

0 1 2 3 4 5 6 7
0 0
1 300 0
2 1000 800 0
3 1200 0
4 1500 0 250
5 1000 0 900 1400
6 0 1000
7 1700 0

itera s vertex LA SF BOST NO


tion sele. [0] [1] DEN CHI [4] NY MIA [7]
[2]
[3] [5] [6]
initial -- --- +∞ +∞ +∞ 1500 0 250 +∞ +∞

145
{4} 5 +∞ +∞ +∞ 1250 0 250 1150 1650
1
{4,5} 6 +∞ +∞ +∞ 1250 0 250 1150 1650
2
{4,5,6} 3 +∞ +∞ 2450 1250 0 250 1150 1650
3
{4,5,6,3} 7 3350 +∞ 2450 1250 0 250 1150 1650
4
{4,5,6,3,7} 2 3350 3250 2450 1250 0 250 1150 1650
5
{4,5,6,3,7,2} 1 3350 3250 2450 1250 0 250 1150 1650
6
{4,5,6,3,7,2,1} 3350 3250 2450 1250 0 250 1150 1650

⑵ All pairs shortest paths

Using the above shortestpath algorithm , find the shortest


path between all pairs of vertices , Vi , V j , i ≠ j .
complexity = n * Ο ( n2 ) = Ο ( n3 )

G ~ cost adjacency matrix


k
A [ i ][ j ] ~ the cost of the shortest path from i to j, using only
these intermediate vertices with an index ≤ k .
An−1 [ i ][ j ] ~ the cost of the shortest path from i to j
A−1 [ i ][ j ] = cost [i ] [j ]
k −1
Rules for A [ i ][ j ] -----> Ak [ i ][ j ]
(1) the shortest path from i to j going through no vertices with
index greater than k does not go through the vertex with
k −1
index k and so its cost is A [ i ][ j ].
(2) the shortest such path does go through vertex k,
path i to j = path i to k + path k to j ( indices <= k-1)

Ak [ i ][ j ] = min{ Ak −1 [ i ][ j ], Ak −1 [ i ][ k ] + Ak −1 [ k ][ j ]}, k ≥ 0
A−1 [ i ][ j ] = cost [i ] [j ]

146
〖example〗

0
-2
1 2
L
M
0 1 ∞ O
P
1 −2 A−1
1
M
M
0 1
P
P
N∞ ∞ ∞ Q
A1 [0][2] ≠ min{ A1 [0][2], A0 [0][1] + A0 [1][2] } = 2
A1 [0][2] = ∞ ( path 0,1,0,1, 0,1, …, 0, 1, 2 )

All pairs,shortest paths function Ο( n3 )

int distance [ MAX_VERTICES ][ MAX_VERTICES ]


void allcosts(int cost[ ][MAX_VERTICES],
int distance[ ][MAX_VERTICES],int n)
{
/* determine the distances from each vertex to every other
vertex,cost is the adjacency matrix,distance is the matrix of
distances */
int i, j, k;
for (i = 0; i <n; i++)
for (j = 0; j < n; j++)
distance[i][j] = cost[i][j];
for (k = 0; k<n; k++)
for (i =0; i<n; i++)
for (j = 0; j<n; j++)
if (distance[i][k] + distance[k][j] < distance[i][j])
distance[i][j] = distance[i][k] +
distance[k][j];
}

〖example〗

V0 V1 0 1 2
0 0 4 11
1 6 0 2
V2 2 3 0

G cost adjacence matrix gor G

147
-1 0 1 2
A A A 0 1 2 A 0 1 2
0 1 2 0 1 2
0 0 4 11 0 0 4 11 0 0 4 6 0 0 4 6
1 6 0 2 1 6 0 2 1 6 0 2 1 5 0 2
2 3 0 2 3 7 0 2 3 7 0 2 3 7 0

⑶ transitive closure
+
【definition】 The transitive closure matrix , denoted A , of a
+
directed graph, G, is a matrix such that A [i][j] = 1 if there is
+
a path of length > 0 from i to j : otherwise, A [i][j] = 0.
【definition】 The reflexive closure matrix , denoted A*, of a
directed graph, G, is a matrix such that A* [i][j] = 1 if there is
a path of length >= 0 from i to j : otherwise, A* [i][j] = 0.

0 1 2 3 4

0 0 1 0 0 0 0 0 1 1 1 1 0 1 1 1 1 1
1 0 0 1 0 0 1 0 0 1 1 1 1 0 1 1 1 1
2 0 0 0 1 0 2 0 0 1 1 1 2 0 0 1 1 1
3 0 0 0 0 1 3 0 0 1 1 1 3 0 0 1 1 1
4 0 0 1 0 0 4 0 0 1 1 1 4 0 0 1 1 1
+ A*
A A
by using allcosts A+ --> A*
--> Ο( n3 )
simplify the algorithm : Ο( n ) (for undirected graph)
2

distanve[ i ][ j ] =
distanve[ i ][ j ] || distanve[ i ][ k ] && distanve[ k ][ j ]

6.5 Activity networks


⑴ Activity on vertex (AOV) networks
【definition】An activity on vertex, or AOV, network is a
directed graph G in which the vertices represent tasks or
activities and the edges represent precedence relations
between tasks
【definition】Vertex i in an AOV network G is a predecessor of
vertex j iff there is a directed path from vertex i to vertex j.
Vertex i is an immediate predecessor of vertex j iff < i, j > is an
edge in G. If i is a predecessor of j . then j is a successor of i. If i
is an immediate predecessor of j, then j Is an immediate
successor of i.

148
course course name prerequisites
c1 programming 1 none
c2 discrete mathematics none
c3 data structure c1,c2
c4 calculus I none
c5 calculus II c4
c6 linear algebra c5
c7 analysis of algorithm c3,c6
c8 assembly language c3
c9 operating system c7,c8
c10 programming language c7
c11 compiler design c10
c12 artificial intelligence c7
c13 computational theory c7
c14 parallel algorithms c13
c15 numerical analysis c5

C9
C8
C1
C10 C11
C3
C7
C2 C12 C14

C6 C13
C4 C5

C15

【definition】A relation · is transitive iff for all i, j, k. i·j and


j·k => i·k. A relation ·is irreflexive on a set S if i·i is false
for all elements, in S. A partial order is a precedence relation
that is both transitive and irreflexive.

【definition】A topological order is a linear ordering of the


vertices of a graph such that for any two vertices, i, j, if i is a
predecessor of j in the network then i precedes j in the linear
ordering.

several possible topological order:

149
C1,C2,C4,C5,C3,C6,C8,C7,C10,C13,C12,C14,C15,C11,C9 and
C4,C5,C2,C1,C6,C3,C8,C15,C7,C9,C10,C11,C12,C13,C14

algorithm
1. list out a vertex in the network that has no predecessor
2. delete this vertex, and all edges leading out from it, from
the network.
3. repeat 1, 2, until either all the vertices have been listed, or
all remaining vertices have predecessors and so we
cannot
remove any of them.

program topological sort

for (i = 0; i < n; i++){


if every vertex has a predecessor {
fprintf(stderr,"Network has a cycle .\n");
exit(1);
}
pick a vertex v that has no predecessors;
output v;
delete v and all edges leading out of v
from the network;
}

V1 V1 V1 V1 V1 V4

V0 V2 V4 V2 V4 V2 V4 V4 V4

V3 V5 V3 V5 V5 V5

(a)initial (b)V0 (c)V3 (d)V2 (e)V5 (f)V1 (g)V4


Topological order generated V0,V3,V2,V5,V1,V4

150
typedef struct node *node_pointer;
typedef struct node { int vertex;
node_pointer link;
};
typedef struct { int count;
node_pointer link;
} hdnodes;
hdnodes graph[ MAX_VERTICES ]

void topsort(hdnodes graph[ ],int n)


{ int i,j,k,top;
node_pointer ptr;
/* create a stack of vertices with no predecessors*/
top = -1;
for (i = 0; i < n; i++)
if (!graph[i].count){
graph[i].count = top;
top = i;
}
for (i = 0; i < n; i++)
if (top = -1){
fprintf(stderr,"\Network has a cycle. Sort
terminated.\n");
exit(1);
}
else{
j = top; /* unstack a vertex */
top = graph[top].count;
printf("v%d",j);
for (ptr = graph[j].link; ptr; ptr = ptr->link){
/*decrease the count of the successor vertices
of j */
k = ptr->vertex;
graph[k].count--;
if ( !graph[k].count){ /* add vertex k to the
stack */
graph[k].count = top;
top = k;
}
}

151
}
}

V0 0 1 2 3 null

V1 1 4 null

V2 1 4 5 null

V3 1 5 4 null

V4 3 null

V5 2 null

n−1
complexity Ο (( ∑ d i ) + n ) = Ο ( e + n )
i =0

⑵ activity on edge (AOE) network


AOE ( an activity on edge) closely related to the AOV
directed edges ~ tasks or activities to be performed on a
project
vertices ~ events which signal the completion of certain
activities
v1 a3=1 v6 a9=2
a0=6
a6=9 finish
v0 a1=4 v4 v8
start event interpretation
v2 a4=1 a7=7 v7 a10=4 v0 start of project
a2=5 v1 completion of activity a0
a8=4 v4 completion of activity a3 & a4
v3 v5 v7 completion of activity a7 & a8
a5=2 v8 completion of project

techniques for evaluating network:


PERT ( performance evaluation and review technique)
CPM ( critical path method )
RAMPS ( resource allocation and multiproject scheduling )

critical path --- a path that has the longest length


earliest time (an event ,vi, can occur) --- the length of the
longest path from the start vertex v0 to vertex vi.
latest time, late(i), of activity, a i ---- the latest time the

152
activity may start without increasing the project duration

calculation of earliest times

if activity a i is represented by edge <k,l > :

early( i ) = earliest [ k ]
late ( i ) = latest [ l ] - duration of activity ai

earliest [ 0 ] = 0
earliest [ j ] = max
i ∈P ( j ) { earliest [ i ] + duration of < i, j > }
P(j) is the set of immediate predecessors of j

calculation of latest times

latest [ n -1 ] = earliest [ n - 1 ] ( start)


latest [ j ] = min { latest [ i ] - duration of < i, j > }
i ∈S ( j )
S(j) is the set of vertices adjacent from vertex j,
S(j) is the set of immediate successors of j

V0 0 1 6 2 4 3 5 null

V1 1 4 1 null

V2 1 4 1 null

V3 1 5 2 null

V4 2
6 9 7 7 null

V5 1 7 4 null

V6 1 8 2 null

V7 2 8 4 null

V8 2 null

153
earliest [0] [1] [2] [3] [4] [5] [6] [7] [8] stack
initial 0 0 0 0 0 0 0 0 0 [0]
output v0 0 6 4 5 0 0 0 0 [3,2,1]
output v3 0 [5,2,1]
output v5 0 6 4 5 0 7 [2,1]
output v2 0 0 0 [1]
output v1 0 6 4 5 0 7 [4]
output v4 0 11 0 [7,6]
output v7 0 6 4 5 5 7 [6]
output v6 0 11 0 [8]
output v8 0 6 4 5 7 7
0 11 0
0 6 4 5 7 7
16 14 0
0 6 4 5 7 7
16 14 16
0 6 4 5 7 7
16 14 18

154
V0 3 null

V1 1 0 6 null

V2 1 0 4 null

V3 1 0 5 null

V4 2 1 1 2 1 null
V5 1
3 2 null

V6 1 4 9 null

V7 1 4 7 5 4 null

V8 0 6 2 7 4 null

earliest [0] [1] [2] [3] [4] [5] [6] [7]


[8] stack
initial 18 18 18 18 18 18 18
18 [8]
output v8 18 [7,6]
output v7 18 18 18 18 18 18 16 14 [5,6]
output v5 18 [3,6]
output v3 18 18 18 18 7 10 16 [6]
output v6 14 18 [4]
output v4 18 18 18 18 7 10 16 [2,1]
output v2 14 18 [1]
output v1 3 18 18 8 7 10 [0]
16 14 18
3 18 18 8 7 10
16 14 18
3 6 6 8 7
10 16 14 18
2 6 6 8 7
10 16 14 18
0 6 6 8 7
10 16 14 18

latest [ 8 ] = earliest [ 8 ] = 18
latest [ 6 ] = min { earliest [ 8 ] - 2 } = 16
latest [ 7 ] = min { earliest [ 8 ] - 4 } = 14
latest [ 4 ] = min { earliest [ 6 ] - 9 , earliest [ 7 ] - 7} = 7
latest [ 1 ] = min { earliest [ 4 ] - 1 } = 6

155
latest [ 2 ] = min { earliest [ 4 ] - 1 } = 6
latest [ 5 ] = min { earliest [ 7 ] - 4 } = 10
latest [ 3 ] = min { earliest [ 5 ] - 2 } = 8
latest [ 0 ]
= min { earliest [ 1 ] - 6,earliest [ 2 ] - 4,earliest [ 3 ] - 5 } = 0

156
chapter 9 heap structures
9.1 Min-Max heap

【definition】A min-max heap is a complete binary tree such


that if it is not empty, each element has a field called key.
Alternating levels of this tree are min level and max levels,
respectively. The root is on min level. Let x be any node in a
min-max heap. If x is on a min level then the element in x has
the minimum key from among all elements in the subtree with
root x. We call this node a min node. Similarly, If x is on a max
level then the element in x has the maximum key from among
all elements in the subtree with root x. We call this node a max
node.

7 min

70 40 max

30 9 10 15 min

45 50 30 20 12 max

(1) insertion into a min-max heap

#define MAX_SIZE 100 /* maximum size of heap plus 1


*/ #define FALSE 0
#define TRUE 1
#define SWAP (x, y, t) ((t) = (x) , (x) = (y), (y) = (t))
typrdef struct {
int key;
/* other field */
} element;
element heap [MAX_SIZE];
void min_max_insert(element heap[ ],int *n,element item)
{ /* insert item into the min-max heap : O (log n ) */
int parent;
(*n) ++;
if (*n == MAX_SIZE) {
fprintf(stderr,"The heap is full\n");
exit(1);
}
parent = (*n)/2;
if (!parent)
/* heap is empty, insert item into first position */
heap[1] = item;
else switch(level(parent)) {
case FALSE: /* min level */
if (item.key < heap[parent].key) {
heap[*n] = heap[parent];
verify_min(heap, parent, item);
}
else
verify_max(heap, *n, item);
break;
case TRUE: /* max level */
if (item.key > heap[parent].key) {
heap[*n] = heap[parent];
verify_max(heap, parent, item);
}
else
verify_min(heap, *n, item);
}
}
void verify_max(element heap[],int i,element item)
{
/* follow the nodes from the max node i to the root and
insert item into its proper place */
int grandparent = i / 4;
while (grandparent)

212
if (item.key > heap[grandparent].key) {
heap[i] = heap[grandparent];
i = grandparent;
grandparent /= 4;
}
else
break;
heap[i] = item;
}
7 min

70 40 max

30 9 10 15 min

45 50 30 20 12 j max

inserting 5
5 min

70 40 max

30 9 7 15 min

45 50 30 20 12 10 max

inserting 80
7 min

70 80 max

30 9 10 15 min

45 50 30 20 12 40 max

213
(2) deletion of min element
12 min

70 40 max

30 9 10 15 min

45 50 30 20 max

the two cases for reinserting an element item into a min-max-


heap :
① The root has no children
--> item is to be inserted into the root

② The root has at least one child


determine which node has the smallest key ( node k )
(a) item.key ≤ heap[k].key
--> item is to be inserted into the root
(b) item.key > heap[k].key and k is a child of the root
--> element heap[ k ] may be moved to the root and
item insert into node k
(c) item.key > heap[k].key and k is a grandchild
--> element heap[ k ] may be moved to the root and
item.key > heap[ parent ].key ( the parent of k )
heap[ parent ].key and item are to be interchanged .
insert item into the sub-heap with root k …

9 min

70 40 max

30 12 10 15 min

45 50 30 20 max

214
element delete_min(element heap[ ], int *n)
{ /* delete the minimum element from the min-max heap:
O(log n) */
int i, last, k, parent;
element temp, x;

if ( !(*n) ) {
fprintf(stderr,"The heap is empty\n");
heap[0].key = INT_MAX; /* error key in heap[0] */
return heap[0];
}
heap[0] = heap[1]; /* save the element */
x = heap[ (*n)-- ];
/* find place to insert x */
for ( i = 1, last = (*n)/2; i <= last;) {
k = min_child_grandchild(i, *n);
if (x.key <= heap[k].key) break;
/* case 2(b) or 2(c) */
heap[i] = heap[k];
if (k <= 2*i+1) { /* 2(b) */
i = k;
break;
}
/* case 2(c), k is a grandchild of i */
parent = k/2;
if (x.key > heap[parent].key)
SWAP(heap[parent],x,temp);
i = k;
} /* for */
heap[i] = x;
return heap[0];
}

215
9.2 deaps

【definition】 A deap is a complete binary tree that is either


empty or satisfies the following properties:
(1) the root contains no element
(2) the left subtree is a min-heap
(3) the right subtree is a max-heap
(4) if the right subtree is not empty, then let i be any node
in the left subtree, let j be the corresponding
node in
the right subtree. If such a j does not exist, then let
j
be the node in the right subtree that corresponds
to
the parent of i. The key in node i is less than or
equal to the key in j.
[log i ]−1
j=i+ 2 2

if ( j > n ) j /= 2

5 45

10 8 25 40

15 19 9 30 20

(1) insertion into a deap

216
5 45

10 8 25 40

15 19 9 30 20
i j

insertion of 4

4 45

5 8 25 40

15 10 9 30 20 19

insertion of 30

5 45

10 8 30 40

15 19 9 30 20 25

functions

• max_heap ( n )
this function return TRUE iff n is a position in the
max-heap of the deap.

• min_partner ( n )
This function compute the min_heap node that
corresponds to the max-heap position n. This is given by
[log n]−1
n- 2 2

• max_partner ( n )

217
This function compute the max-heap node that
corresponds to the parent of min-heap position n. This is
[log n]−1
given by ( n - 2 2
)/2

• min_insert and max_insert


These functions insert an element into a specified position
of a min- and max-heap, respectively. This is done by
following the path from this position toward the root of the
respective heap. Elements are moved down as necessary
until the correct place to insert the new element is found

element deap [ MAX_SIZE ]

void deap_insert(element deap[ ],int *n,element x)


{
/* insert x into the deap : O ( log n ) */
int i;
(*n) ++;
if (*n == MAX_SIZE) {
fprintf(stderr,"The heap is full\n");
exit(1);
}
if (*n == 2)
deap[2] = x; /* insert into empty deap */
else switch(max_heap(*n)) {
case FALSE: /* *n is a position on min side */
i = max_parent(*n);
if (x.key > deap[i].key) {
deap[*n] = deap[i];
max_insert(deap,i,x);
}
else
min_insert(deap,*n,x);
break;
case TRUE: /* *n is a position on max side */
i = min_parent(*n);
if (x.key < deap[i].key) {
deap[*n] = deap[i];
min_insert(deap,i,x);
}

218
else
max_insert(deap,*n,x);
}
}

(2) deletion of min element


[log i ]−1
j=i+ 2 2
if ( j > n ) j /= 2

8 45

10 9 25 40

15 19 20 30

element deap_delete_min(element deap[ ] , int *n)


{
/* delete the minimum element from the heap: O ( log n ) */
int i , j;
element temp;
if (*n < 2) {
fprintf(stderr , "The deap is empty\n");
/* return an error code to user */
deap[0].key = INT_MAX;
return deap[0];
}
deap[0] = deap[2]; /* save min element */
temp = deap[(*n)--];
for (i = 2; i*2 < = *n; deap[i] = deap[j] , i = j) {
/* find node with smaller key */

219
j = i*2;
if (j+1 < = *n) {
if (deap[j].key > deap[j+1].key)
j++;
}
}
modified_deap_insert(deap , i , temp);
return deap[0];
}

9.3 leftist trees


problem
combine two priority queues into a single priority queue
heap --- O (n) , leftist tree ---- O ( log n)

A G

B C H I

D E F J

2 2
A G
2 1 1 1
B C H I
1 1 1
D E F J

extrenal node intrenal node

shortest ( x ) =
R
So if x is an external node
T1+ min{ shortest ( left _ child ( x )), shortest ( right _ child ( x ))} otehrwise

【definition】
A leftist tree is a binary tree such that iff it is not empty,
then
shortest (left_child(x)) >= shortest(right_child(x))

220
for every internal node x

〖Lemma9.1〗
Let x be the root of a leftist tree that has n (internal ) nodes.
(a) n >= 2
shortest ( x )
-1
(b) The rightmost root to external node path is the shortest
root to external node path. Its length is shortest( x ).

proof: (a) the leftist tree has at least internal nodes


shortest ( x )
∑ 2 i −1 = 2 shortest ( x ) -1
i =1

typedef struct {
int key;
/* other field */
} element;
typedef struct leftist *leftist_tree;
struct leftist {
leftist_tree left_child;
element data;
leftist_tree right_child;
int shortest;
};
【definition】 A min-leftist tree (max leftist tree) is a leftist tree
in which the key value in each node is no large ( smaller ) than
the key value in its children ( if any ). In other words. a min (max)
leftist tree is a leftist that is also a min ( max ) tree
2 2
2 a 5 b
1 1 1 1
7 50 9 8
1 1 2 1
11 80 12 10
1 1 1 1
13 20 18 15

insert x --> min-leftist b


• create a min-leftist a that contains the single element x
• combine a and b

221
delete the min element <-- a
• combine a->left_child and a->right_child
• delete node a

combine a and b
• obtain a new binary tree containing all elements in a
and b following the rightmost path in a and/or b. (the key in
each node is no larger than the key in its children )
• interchange the left and right subtree of nodes as
necessary to convert this binary tree into a leftist tree

2
5
2 2
8 1
1 1 9 8
2 1 1
10 50
1 1 12 10 50
1 1 1 1
15 80
20 18 15 80

2 2
2 2
1 2 1
7 5 5 71
1 1 2 2 1
11 9 8 8 9 11 1
1 1 1
2 1 1 2
13 12
1
10 50
1
10 50 12 13 1
1 1 1 1 1 1
20 18 15 80 15 80 20 18

void min_combine(leftist_tree *a , leftist_tree *b)


{
/* combine the two min leftist trees *a and *b. The resulting
min leftist tree is returned in *a , and *b is set to NULL */
if (!*a)
*a = *b;

222
else if (*b)
min_union(a , b);
*b = NULL;
}

void min_union(leftist_tree *a , leftist_tree *b) /* O(log n ) */


{
/* recursively combine two nonempty min leftist trees */
leftist_tree temp;
/* set a to be the tree with smaller root */
if ((*a)->data.key > (*b)->data.key)
SWAP(*a , *b , temp);
/* create binary tree such that the smallest key in each
subtree is in the root */
if (!(*a)->right_child)
(*a)->right_child = *b;
else
min_union(&(*a)->right_child , b);
/* leftist tree property */
if (!(*a)->left_child) {
(*a)->left_child = (*a)->right_child;
(*a)->right_child = NULL;
}
else if ((*a)->left_child->shortest <
(*a)->right_child->shortest)
SWAP((*a)->left_child , (*a)->right_child , temp);
(*a)->shortest = (!(*a)->right_child) ? 1 :
(*a)->right_child->shortest + 1;
}

9.4 binomial heap

(1) cost amortization

individual operation:
binomial heap ~ O ( n )

223
leftist tree ~ O ( log n )
amortize part of the cost of expensive operation over the
inexpensive once ~ O ( 1 ) or O ( log n )

amortize scheme :
charge some of the actual cost of an operation to other
operations. This reduce the charged cost of some operations
and increases that of others

amortized cost of an operation = the total cost charged to it

the sum of the amortized costs of the operations is greater


than or equal to the sum of their actual costs

〖example〗
operation sequence: I1,I2,D1,I3,I4,I5,I6,D2,I7
actual cost: I1 ~ I7 -- 1, D1 - 8, D2 -- 10 ( unit of
time )
total cost = 25
if 2 units (D1) --> I1, I2, 4 units ( D2) --> I2 ~ I6
the amortized cost of each of I1 ~ I6 = 2 units
the amortized cost of each of D1 ~ C2 = 6 units
the total amortized cost = (2*6+1*+2*6 ) = 25 units

for any sequence of insert and delete min operations:


actual sequence cost = i + 10 * d
amortized sequence cost = 2 * i + 6 * d
bound on the sequence cost = min {2 * i + 6 * d, i + 10 * d}

while individual delete operations on an binomial heap may be


expensive, the cost of any sequence of binomial heap
operations is actually quite small

(2) definition of binomial heap

min-binomial heap -- a collection of min-tree ( B-heap )


max-binomial heap -- a collection of max-tree

224
1
3
8 12 7 16
5 4

10 15 30 9
6

20
insert, combine -- O (1) (actual , amortized time )
delete -- O ( log n ) ( amortized time )
B-heap representation : doubly linked list
field: degree, child, left_link, right_link, data
a
8 3 1

5 4 12 7 16
10
6 15 30 9

20
(3) insertion into a binomial heap O (1)
element x --> B-heap a
1. putting x into a new node
2. placing this node into the doubly linked circular list
pointer at by a
3. if a is null or x's key < the key in the node pointed by a
then reset a to the new node

(4) combine O(1)


B-heap a + B-heap b
1. combine the top doubly linked circular lists of a and b
into a single doubly linked circular list.
2. new B-heap pointer --> the smaller key' B-heap

(5) deletion of min element


step1: [Handle empty B-heap]
if (a = NULL) deletion_error else perform step
2- 4
step2: [Deletion from nonempty B-heap]
x = a->data; y = a->child ; delete a from its
doubly

225
linked circular list; Now, a pointer to any
remaining
node in the resulting list; If there is no such
node,
then a = NULL
step3: [Min-tree joining]
Consider the min-tree in the list a and y; joining
together pairs of min-tree of the same degree
until
all remaining min-trees has different degree;
step4: [Form min-tree root list]
Link the roots of the remaining min-tree ( if any )
together to form a doubly linked circular list;
Set a
to point the root ( if any ) with minimum key;
new
B-heap consist of the remaining min-tree and
the
sub min-tree of the deleted root
8 3 12 7 16

5 4 15 30 9
10
6 20

making the min-tree whose root has larger key a subtree of the other

7 3 12 16

8 9 5 4 15 30

6 20
10

226
3 12 16

7 5 4
15 30

8 9 6
20

10

for (degree = p->degree; tree[degree]; degree++) {


join_min_trees(p , tree[degree]);
tree[degree] = NULL;
}
tree[degree] = p;

(6) analysis

【definition】 The binomial tree , Bk, of degree k is a tree such


that if k = 0, then the tree has exactly one node and if
k > 0, then it consists of a root whose degree is k and whose
sub tree are B0, B1, …, Bk-1

insert -- O(1) delete -- O(logn)


〖Lemma9.2〗Let a be a B-heap with n elements that results
from a sequence of insert, combine, and delete min operations
performed on initially empty B-heap. Each min tree in a has
degree ≤ log 2 n. Conesquencely,
MAX_DEGREE ≤ [ log 2 n ] and the actual cost of a delete min is O(logn + s )

〖Theorem9.1〗 If a sequence of n insert, combine, and


delete min operations is performed on initially empty
B-heap, then we can amortize cost such that the amortized
time complexity of each insert and combine is O(1) and that of
each delete min is O(log n)

actual cost of any sequence of i insert, c combines, and dm


delete min is O (i + c + dm log i )

9.5 Fibonacci heap

227
Fibonacci heap:
min-Fibonacci heap is a collection of min-tree ( F-heap )
max-Fibonacci heap is a collection of max-tree

□ B-heaps are a special case of F-heap

operations:

(1) delete, delete the element in a specified node --- O(1)


(2) decrease key, decrease the key of a specified node by
a given positive amount ----------------------------
O(log n)

F-heap representation : doubly linked list

field: degree, child, left_link, right_link, data, (B-heap )


parent, child_cut

(1) deletion from an F-heap

1. if a = b, then do a delete min; otherwise do step 2, 3,


and 4 below
2. Delete b from the doubly linked list it is in
3. Combine the doubly linked list of b' s children with the
doubly linked list of a' s min-tree roots to get a single
doubly linked list. Trees of equal degree are not
joined
together as in a delete min
4. Dispose of not b

the deletion of 12

228
8 3 1 15 30

5 4 7 16 20
10
6 9

(2) Decrease key


1. reduce the key in b
2. if b is not a min-tree root and its key is smaller than
that in its parent, then delete b from its doubly linked
list and insert it into the doubly linked list of min-tree
roots
3. Change a to point to b in case the key in b is smaller
than that in a

the reducetion of 15 by 4

8 3 1 11

5 4 7 16 20
12
10
6 30 9

(3) Cascading cut


problem
theorem9.1 requires that each min-tree of degree k have
an exponential ( in k) number of nodes.
How to ensure that each min-tree of degree k has at least
ck nodes for some c, c>1 ?

□each delete and decrease key operation must be


followed by a cascading cut step --> add the child_cut to
each node.
□child_cut = TRUE
iff one of the children of x was cut off (i.e., removed) after the
most recent time x was made the child of its current parent

229
□each time two min-tree are joined in a delete min
operation, the child_cut of the root with larger key = False

2
14 12 10
F
4 5
16 15
T 18 30 11
6 60
T
20 8 7
8 6
T
10
20 7
T
30 12 11
2
*
14 18 4 5

16 15 60

(4) Analysis
〖Lemma9.3〗 Let a be an F-heap with n elements that
results from a sequence of insert, combine, delete min, delete,
and decrease key operations performed on initially empty F-
heap
(a) Let b be any node node in any of the min-trees of a.
The degree of b is at most log φ m , where
φ = ( 1+ and m is the number of elements in the
5 ) / 2
subtree with root b.
(b) MAX_DEGREE ≤ [ log φ m ] and the actual cost of a delete

230
min is O(log n + s)
〖Theorem9.2〗 If a sequence of n insert, combine, delete
min, delete, and decrease keyoperations is performed on an
initially empty F-heap, then we can amortize costs such that
the amortized time complexity of each insert, combine, and
decrease key operation is O(1) and that of each delete min
and delete is O(log n). The total time complexity of the entire
sequence is the sum of the amortized complexities of the
individual operations in the sequence

(5) Application of F-heaps

• the single source all destinations algorithm


S -- the set of vertices to which a shortest path has been found
dostance(i) -- the length of a shortest from the source vertex i, i ∈ S

S ~ F-hearp
1) determine i is min distance and add i to S
~~ a delete min operation on S
2) distance value of the remaining vertices in S
~~ a decrease key operation on each of the affected
verties
total time (graph ) -- Ω ( n )
2

total time (heap ) -- O (nlog n + e )

• find a shortest path between every pair of vertices

total time (graph ) -- Ω ( n )


3

total time (heap ) -- O ( n log n + ne )


2

231
Chapter 7 SORTING
7.1 searching and list verification
a collection of information concerning some set about objects :
list ~ be stored within the available memory
file ~ be stored externally
record --- the information for one of the objects
field --- a smaller information unit within each record
key --- some field that server to identify the record

□ the efficient of a searching strategy depends on the


assumptions we make about the arrangement of records int the
list

(1) Sequential search

#define MAX_SIZE 1000 /* maximum size of list plus one */


typedef struct {
int key;
/* other field */
} element;
element list [MAX_SIZE ];
program

int seqsearch(int list[ ], int searchnum, int n)


{
/* search an array, list, that has n numbers. Return i, if
list [ i ] = searchnum. Return -1, if searchnum is not in the list */

int i;
list [ n ] = searchnum;
for (i=0; list [ i] != searchnum; i++)
;
return ((i < n) ? i : - 1);
}

complexity
the average number of comparisons for a successful search:
n−1
∑ ( i + 1) / n = ( n + 1) / 2
i =0

(2) Binary search

assume: the list is ordered on the key field

program O( logn )

int binsearch(element list [ ], int searchnum, int n)


{
/* search list[0],....,list[n-1] */
int left = 0,right = n-1, middle;
while (left <= right){
middle = (left + right)/2;
switch(COMPARE(list[middle].key, searchnum)){
case -1:left = middle+1;
break;
case 0: return middle;
case 1: right = middle-1;
}
}
return -1;
}
decision tree for binary search

[5]
46
[2] [8]
17 58
[0] [10]
[3] [6]
4 26 48 90
[1] [4] [7] [9] [11]
15 30 56 82 95

157
(3) List verification

compare lists to verify that they are identical, or to identify the


discrepancies. ( an instance of repeatedly searching one list ,
using each key in the other list as the search key )

three types of error


1. a record with key list1 [i ].key appears in the first list, but
there is no record with the same key in the second list (list2).
2. a record with key list2 [i ].key appears in the second list,
but there is no record with the same key in the first list (list1).
3. list1[ i ].key = list2 [ i ].key, but the two records do not
match on at least one of the other fields.

program1 O( mn )

void verify1(element list1[ ], element list2[ ], int n, int m)


/* compare two unordered lists list1 and list2 */
{
int i, j;
int marked[MAX_SIZE];

for (i = 0; i < m; i++)


marked[i] = FALSE;
for (i = 0; i < n; i++)
if ((j = seqsearch(list2, m, list1[ i ].key)) < 0)
printf("%d is not in list2 \n", list1[ i ].key);
else
/* check each of the other fields from list1[i] and
list2[j],and print out any discrepancies */
marked[j] = TRUE;
for (i = 0; i < m; i++)
if ( !marked[ i ])
printf("%d is not in list1\n", list2[ i ].key);
}

158
program2 O(max [n log n, m log m ] )
void verify2( element list1[ ], element list2[ ], int n, int m)
/* same task as verify1,but list1 and list2 are sorted */
{ int i,j;
sort(list1, n); sort(list2, m);
i = j = 0;
while (i < n && j < m)
if ( list1[ i ].key < list2 [ j ].key){
printf("%d is not in list2\n", list1[i].key);
i++;
}
else if (list1[i].key == list2[j].key){
/* compare list1[i] and list2[j] on each of the other
fields and report any discrepancies */
i++; j++;
}
else {
printf("%d is not in list1\n",list2[j].key);
j++;
}
for ( ; i < n; i++)
printf("&d is not in list2\n", list1[i].key);
for ( ; j < m; j++)
printf("&d is not in list1\n", list2[j].key);
}

7.2 definitions
applications of sorting:
1. as an aid to searching
2. for matching entries in the lists
3. finding a permutation (permutation is not unique)

given: a list ( R0,R1,…, Rn-1) ( Ki a key value )


kσ ( i −1) ≤ kσ ( i ) , 0 < i ≤ n − 1
desired ordering: ( Rσ ( 0) , Rσ (1) , ..., Rσ ( n−1) )
unique permutation has the following properties:
(1) | sorted | kσ ( i −1) ≤ kσ ( i ) , 0 < i ≤ n − 1

159
(2) | stable | if i < j and Ki = kj in the input
list, then Ri precedes R j in the sorted
list
sorting methods:
internal sort ---- the list is small enough to sort entries
in main memory
selection sort insertion sort quick sort
heap sort merge sort radix sort
external sort ---- there is too much information to fit into
main memory
7.3 insertion sort

program
void insertion_sort(element list[ ],int n)
/* perform a insert sort on the list */
{
int i, j;
element next;
for (i = 1; i < n; i++){
next = list[ i ];
for (j = i-1; j >= 0 && next.key < list[j].key; j--)
list[ j+1] = list [ j ];
list [ j+1] = next;
}
}
n−1

complexity Ο ( ∑ i ) = Ο ( n2 )
i =0
left out of order ( LOO ) Ri is LOO iff Ri < max
0≤ j < i
{ Rj }
if K is the number of records LOO,
then the computing time O((k+1)n ).
worst case: Ο( n )
2

case: Ο( n )
2
average

〖example〗 n = 5, input ( 5, 4, 3, 2, 1)
i [0] [1] [2] [3] [4]
- 5 4 3 2 1
1 4 5 3 2 1
2 3 4 5 2 1

160
3 2 3 4 5 1
4 1 2 3 4 5

〖example〗 n = 5, input ( 2, 3, 4, 5, 1)

i [0] [1] [2] [3] [4]


- 2 3 4 5 1
1 2 3 4 5 1
2 2 3 4 5 1
3 2 3 4 5 1
4 1 2 3 4 5

R4 is LOO , i = 1,2,3 time O(1)


i=4 time O(n)
Variations

1.binary insertion sort:


binary search ~ sequential search
2. List insertion sort:
dynamically linked list ~ array

7.4 Quick sort ( C.A.R.Hoare )

Quick sort differs from insertion sort in that the pivot key Ki
controlling the process is placed at the right spot with respect
to the whole file.
1. if key Ki placed in position S(i),
then K j <= K s( i ) for j < S(i)
K j >= K s( i ) for j > S(i)
2. original file is partitioned into two subfiles:
R0, …, R s(i) -1 ( left of s(i) )
R s(i) +1, … ,R n -1 ( right of s(i) )
3. sort these subfiles

161
program quicksort ( list, 0, n-1 )

void quicksort (element list[ ], int left, int right)


/* sort list[left],...,list[right] into nondecreasing order on the
key field,list[left].key is arbitrarily chosen as the pivot key,it is
assumed that list[left].key <= list[right+1].key. */
{ int pivot, i, j;
element temp;
if (left < right){
i = left; j = right + 1;
pivot = list [left].key;
do{ /* search for keys from the left and right sublists,
swapping out-of-order elements until the left
and right boundaries cross or meet */
do i++;
while (list[ i ].key < pivot);

do
j--;
while (list[ j ], key > pivot);
if (i < j)
SWAP(list [ i ], list [ j ], temp);
} while ( i < j);
SWAP(list [ left ], list[ j ], temp);
quicksort( list, left, j-1);
quicksort( list, j+1, right);
}
}
〖example〗 n =10 input (26, 5, 37, 1, 61, 11, 59, 15, 48, 19)

R0 R1 R2 R3 R4 R5 R6 R7 R8 R9
left right
[26 5 37 1 61 11 59 15 48 19]
0 9
[11 5 19 1 15] 26 [59 61 48 37]
0 4

162
[1 5] 11 [19 15] 26 [59 61 48 37]
0 1
1 5 11 [19 15] 26 [59 61 48 37]
3 4
1 5 11 15 19 26 [59 61 48 37]
6 9
1 5 11 15 19 26 [48 37] 59 [61]
6 7
1 5 11 19 15 26 37 48 59 [61]
9 9
1 5 11 19 15 26 37 48 59 61

worst case : O( n
2
)
best case : O(n)
average case:
T (n) -- the time taken to sort a file of n records:
T(n) ≤ cn + 2T(n/2), for some constant c
≤ cn + 2(cn/2 + 2T(n/4))
≤ 2cn + 4T(n/4)
·
≤ cn log 2 n + nT(1) = O(n log 2 n )

〖lemma7.1〗Let Tavg(n) be the expected time for quicksort to


sort a file with n records. Then there exists a constant k such
that Tavg ( n) ≤ kn log e n for n >= 2.
1 n−1 2 n−1
Tavg ( n ) ≤ cn + ∑ (Tavg ( j ) + Tavg ( n − j − 1)) = cn + ∑ (Tavg ( j ))
n j =0 n j =0
proof:
induction base:
for n = 2, Tavg ( 2 ) ≤ 2c + 2 b ≤ kn log e 2
induction hypothesis:
Tavg ( n) ≤ kn log e n for 1 ≤ n < m
induction step:
4 b 2 m−1 4 b 2 k m−1
Tavg ( m ) ≤ cm + + ∑ Tavg ( j ) ≤ cm + + ∑ j log e j
m m j =2 m m j =2

163
Tavg ( m ) ≤ cm +
4b 2k m
+ z
x log e xdx = cm +
4 b 2 k m 2 log e m m 2
+ −
L
M O
P
m m 2 m m 2 4 N Q
4b km
= cm + + km log e m − ≤ km log e m for m >= 2
m 2
stack space: O(log n)

variation
quicksort using a median of three:
povit = median { K left , K ( left + right )/ 2 ) , K right }

7.5 optimal sorting time


Best possible time ----- O( n log 2 n )
〖example〗Decision tree for insertion sort on R0,R1,R2

K0<=K1 [0,1,2]
YES NO
[0,1,2] [1,0,2]
K1<=K2 K0<=K2
NO
YES NO YES NO
[0,1,2]
[0,2,1] [1,0,2] [1,2,0]
stop K0<=K2 stop K1<=K2
I YES NO IV YES NO
[2,0,1] [1,2,0] [2,1,0]
[0,2,1] stop stop stop stop
II III V VI

Leaf permutation sample input key value that


give the permutation
I 012 (7, 9, 10)
II 021 (7, 10, 9)
III 201 (9, 10, 7)
IV 102 (9, 7, 10)
V 120 (10, 7, 9)
VI 210 (10, 9, 7)

【theorem】any decision tree that sorts n distinct elements has a


height of at least log 2 ( n!) +1

164
【corollary】any algorithm that sort by comparison only must
have a worst computing time of Ω( n log 2 n )
proof:
for every decision tree with n! leaves there is a path of length
c n log 2 n , c a constant, By the theorem, there is path length
log 2 n!
n/ 2
n! = n(n-1)(n-2), ..., (3) (2) (1) >= ( n / 2 )
so, log 2 n! >= (n/2) log 2 ( n / 2 ) = O( n log 2 n )

7.6 Merge sort

(1) merging
algorithm1
void merge(element list[ ],element sorted[ ],int i,int m, int n)
/* merge two sorted files: list[i], ..., list[m], and list[m+1], ...,
list[n]. These files are sorted to obtain a sorted list :
sorted [i] , ..., sorted[n] */
{
int j, k, t;
j = m+1; /* index for the second sublist */
k = i; /* index for the sorted list */

while ( i <= m && j <= n){


if ( list[i].key <= list[j].key)
sorted[k++] = list[i++];
else
sorted[k++] = list[j++];
}
if (i > m)
/* sorted[k],...,sorted[n] = list[j],...list[n] */
for (t = j; t <= n; t++)
sorted[k+t-j] = list[t];
else
/* sorted[k], ...,sorted[n] = list[i], ..., list[m] */
for (t = i; t <= m; t++)
sorted[k+t-i] = list[t];
}

complexity

165
computing time: O( n - i + 1)
m - the length of a record O( m( n - i + 1))

algorithm2 O(1) space merge


assume:
1.n is a perfect square
2. the number of records in each of the two lists to be
merged is a multiple of n

steps:
1. Identify the n records with largest keys. This is done by
following right to left along the two files to be merged.
2. Exchanged the records of the second file that were
identified step 1 with those just to the left of those
identified from the first file so that the n records with
largest keys form a contiguous block.
3. swap the block of n largest records with the leftmost
block ( unless it is already the leftmost block ). Sort the
rightmost block.
4. Reorder the blocks excluding the block of largest records
into nondecresing order of the last key in the blocks.
5. perform as many merge sub steps as needed to merge
the n - 1 blocks other than the block with the largest keys
6. sort the block with the largest keys.

0 2 4 6 8 a c e g i j k l m n t w z 1 3 5 7 9 b d f h o p q r s u v x y
0 2 4 6 8 a c e g i j k l m n t w z 1 3 5 7 9 b d f h o p q r s u v x y
0 2 4 6 8 a c e g i j k u v x y w z 1 3 5 7 9 b d f h o p q r s l m n t

u v x y w z c e g i j k 0 2 4 6 8 a 1 3 5 7 9 b d f h o p q l m n r s t
. . .
u v x y w z 0 2 4 6 8 a 1 3 5 7 9 b c e g i j k d f h o p q l m n r s t
. . .
0 v x y w z u 2 4 6 8 a 1 3 5 7 9 b c e g i j k d f h o p q l m n r s t
. . .
0 1 x y w z u 2 4 6 8 a v 3 5 7 9 b c e g i j k d f h o p q l m n r s t
. . .
0 1 2 y w z u x 4 6 8 a v 3 5 7 9 b c e g i j k d f h o p q l m n r s t

166
. . .
0 1 2 3 4 5 u x w 6 8 a v y x 7 9 b c e g i j k d f h o p q l m n r s t
. .
0 1 2 3 4 5 6 7 8 u w a.v y z x 9 b c e g i j k d f h o p q l m n r s t
.. .
0 1 2 3 4 5 6 7 8 9 a w v y z x u b c e g i j k d f h o p q l m n r s t
. . .
0 1 2 3 4 5 6 7 8 9 a w v y z x u b c e g i j k d f h o p q l m n r s t
. . .
0 1 2 3 4 5 6 7 8 9 a b c d e f g h i j k v z u y x w o p q l m n r s t
. . .
0 1 2 3 4 5 6 7 8 9 a b c d e f g h i j k v z u y x w o p q l m n r s t
. . .
0 1 2 3 4 5 6 7 8 9 a b c d e f g h i j k l m n o p q y x w v z u r s t
.
0 1 2 3 4 5 6 7 8 9 a b c d e f g h i j k l m n o p q r s t v z u y x w

complexity
step1, step2 (each ): space O(1) time O( n )
step3: swapping : space O(1) time O( n )
sort (insertion sort) : time O( n ) space O(1)
step4: selection sort: time O( n ) space O(1)
1.5
insertion sort: time O( n )
step5: time O( n )
step6 ( any sort): time O( n )
total time : O( n ), space : O(1)

(2) Iterative merge sort O( n log 2 n )


void merge_pass(element list[ ],element sorted[ ],int n,int
length)
{ /* perform one pass of the merge sort.It merges adjacent
pairsof subfiles from list into sorted. n is the number of element
in the list.length is the length of the subfile */
int i, j;
for (i = 0; i <= n-2 * length; i += 2 * length)
merge(list, sorted, i, i + length - 1, i + 2 * length -1);
if( i+length < n)
merge(list, sorted, i, i+length-1, n-1);
else
for (j = i; j < n; j++)
sorted[ j ] = list[ j ];
}

167
void merge_sort(element list[ ], int n)
/* perform a merge sort on the files */
{ int length = 1;/* current length being merged */
element extra[MAX_SIZE];
while(length > n){
merge_pass(list, extra, n, length);
length *= 2;
merge_pass(extra, list, n, length);
length *= 2;
}
}
〖example〗input ( 26, 5, 77, 1, 61, 11, 59, 15, 48, 19 )

26 5 77 1 61 11 59 15 48 19

5 26 1 17 11 61 15 59 19 48

1 5 26 77 11 15 59 61 19 48

1 5 11 15 26 59 61 77 19 48
1 5 11 15 19 26 48 59 61 77

(2) Recursive merge sort O( n log 2 n )

typedef struct {
int key:
/* other field */
int link;
} element;

int rmerge(element list[ ], int lower, int upper)


/* sort the list, list[lower],..., list[upper].The link
field in each record is initially set to -1 */
{
int middle;
if (lower >= upper)
return lower;
else {
middle = (lower+upper) / 2;
return listmerge(list, rmerge (list, lower, middle),

168
rmerge (list, middle + 1, upper));
}
}
26 5 77 1 61 11 59 15 48 19

5 26 15 59
5 26 77 1 61 11 15 59 19 48

1 5 26 61 77 11 15 19 48 59
1 5 11 15 19 26 48 59 61 77

i R0 R1 R2 R3 R4 R5 R6 R7 R8 R9
key 26 5 77 1 61 11 59 15 48 19
link 8 5 -1 1 2 7 4 9 6 0

int listmerge(element list[ ], int first, int second)


/* merge lists pointed to by first and second */
{
int start = n;
while (first != -1 && second != -1)
if (list[first].key <= list[second].key){
/* key in first list is lower,link this element to
start and change start to point to first */
list[start].link = first;
start = first;
first = list[first].link;
}
else{
/* key second list is lower,link this element into
the partially sorted list */
list[start].link = second;
start = second;
second = list[second].link;
}
/*move remainder */
if (first == -1)
list[start].link = second;
else
list[start].link = first;

169
return list[n].link;/* start of the new list */
}
〖example〗input ( 26, 5, 77, 1, 61, 11, 59, 15, 48, 19 )

variation
natural merge sort: prevailing order within the input list

26 5 77 1 61 11 59 15 48 19

5 26 77 1 11 59 61 15 19 48

1 5 11 26 59 61 77 15 19 48

1 5 11 15 19 26 48 59 61 77

7.7 heap sort

heap sort ---- a fixed amount of additional storage


same time O( n log 2 n )
⊙slightly slower than merge sort using O(n) additional space
faster than merge sort using O(1) additional space

void adjust(element list[ , int root, int n)


/* adjust the binary tree to establish the heap */
{
int child,rootkey;
element temp;
temp = list[root];
rootkey = list[root].key;
child = 2*root; /* left child */
while (child <= n){
if ((child < n) && (list[child].key < list[child+1].key))
child++;
if (rootkey > list[child].key) /* compare root and

max child */
break;
else{

170
list[child/2] = list[child]; /* move to parent
*/
child *= 2;
}
}
list [child/2] = temp;
}

void heapsort(element list[ ], int n)


/* perform a heapsort on the array */
{
int i,j;
element temp;

for (i = n/2; i > 0; i--)


adjust(list, i, n);
for (i = n-1; i > 0; i--){
SWAP(list[1], list[i+1], temp);
adjust(list, 1, i);
}
}

complexity
suppose: 2 ≤ n < 2
k −1 k

first for loop:


k k k −1 i
∑ 2 (k − i) = ∑ 2
i −1 k − i −1
i ≤ n∑ i
< 2 n = O ( n)
i =1 i =1 i =1 2

second for loop: O( n log 2 n )


total time: O( n log 2 n )

〖example〗input ( 26, 5, 77, 1, 61, 11, 59, 15, 48, 19 )

1. array interpreted as a binary tree

171
[1]
26

[2] [3]
5 7
7
[7]
[4] [5] [6]
1 61 11 59

[8] [9]
15 48 19 [10]

2. max heap
[1] [1]
77 61
[2] [3] [2] [3]
61 59 48 59
[7] [7]
[4] [5] [6] [4] [5] [6]
48 19 11 26 15 19 11 26

[8] [9] [8] [9]


15 1 5 [10] 5 1 77 [10]

[1] [1]
59 48

[2] [3] [2] [3]


48 26 19 26
[7] [7]
[4] [5] [6] [4] [5] [6]
15 19 11 1 15 5 11 1

[8] [9] [8] [9]


5 61 77 [10] 59 61 77 [10]

[1]
26

[2] [3]
19 11
[7]
[4] [5] [6]
15 5 1 48

[8] [9]
59 61 77 [10]

7.8 Radix sort


problem
r −1
sorting records that have several keys ( K , K , ...., K )
0 1

172
K0 ---- the most significant key
r−1
K ---- the least significant key
j j
Ki ---- the key K of record Ri
A list of records, R0 , R1 , ...., Rn−1 is lexically sorted with
0 1 r −1
respect to the keys K , K , ...., K iff
( K i , K i , ..., K i ) ≤ ( K i +1 , K i +1 , ..., K i +1 )
0 1 r −1 0 1 r −1
0 ≤ i < n-1
□ r-tuple ( x0, x1, …, xr-1 ) ≤ r-tuple ( y0, y1, …, yr-1 )
iff
either xi = yi , 0 ≤ i ≤ j and xj+1 < yj+1 for some j < r-1
or xi = yi , 0 ≤ i < r
〖example〗
K 0 [suit]:
K 1 [face value]: 2 < 3 < 5 < … < 10 < J < Q < K < A
2 , ... , A , ... , 2 , ... , A
a sorted deck of cards:

MSD (Most Significant Digit )

3 5 4 A

LSD (Least Significant Digit )

2 3 4 A

□LSD approach is simpler than the MSD one because we do


not have to sort the subfiles independently

#define MAX_DIGIT 3 /* number between 0 and 999 */


#define RADIX_SIZE 10
typedef struct list_node *list_pointer;
typedef struct list_node {

173
int key[MAX_DIGIT];
list_pointer link;
};

list_pointer radix_sort (list_pointer ptr)


/* radix sort using a linked list ( LSD radix sort) */
{
list_pointer front [RADIX_SIZE], rear[RADIX_SIZE];
int i,j,digit;
for (i = MAX_DIGIT-1; i >= 0; i--){
for (j = 0;j < RADIX_SIZE; j++)
front[j] = rear[j] = NULL;
while (ptr){
digit = ptr->key[i];
if ( !front[digit])
front[digit] = ptr;
else
rear[digit]->link = ptr;
rear[digit] = ptr;
ptr = ptr->link;
}
/* reestablish the linked list for the next pass */
ptr = NULL;
for (j = RADIX_SIZE-1; j >= 0; j--)
if ( front[j]){
rear[j]->link = ptr; ptr = front[j];
}
}
return ptr;
}

〖example〗
initial input :

174
179-> 208-> 306-> 93-> 859-> 984-> 55-> 9-> 271-> 33

front[0] null rear[0]


front[1] 271 null rear[1]
front[2] null rear[2]
front[3] 93 33 null rear[3]
front[4] 984 null rear[4]
front[5] 55 null rear[5]
front[6] 306 null rear[6]
front[7] null rear[7]
front[8] 208 null rear[8]
front[9] 179 859 9 null rear[9]

first pass , i = 2:
271-> 93 -> 33 -> 984 -> 55 -> 306 -> 208 -> 179 -> 859 -> 9

front[0] 306 208 9 null rear[0]


front[1] null rear[1]
front[2] null rear[2]
front[3] 33 null rear[3]
front[4] null rear[4]
front[5] 55 859 null rear[5]
front[6] null rear[6]
front[7] 271 179 null rear[7]
front[8] 984 null rear[8]
front[9] 93 null rear[9]
second pass i = 1:
306 -> 208 -> 9 -> 33 -> 55 -> 859 -> 271 -> 179 -> 984 -> 93

175
front[0] 9 33 55 93 null
rear[0]
front[1] 179 null rear[1]
front[2] 208 271 null rear[2]
front[3] 306 null rear[3]
front[4] null rear[4]
front[5] null rear[5]
front[6] null rear[6]
front[7] null rear[7]
front[8] 859 null rear[8]
front[9] 984 null rear[9]

third pass i = 0:
9 -> 33 -> 55 -> 93 -> 179 -> 208 -> 271-> 306 -> 859 -> 984

7.9 List and table sorts


minimize data movement !

The sort does not physically rearrange the list, but modifies the
link field to show the sorted order.
three methods:
1, 2 ~ require linked list representations
3 ~ an auxiliary table that indirectly references the
list's records.

(1) linked list representation


start points to the record that contains the smallest key.
link field points to the record with the second smallest key ,
and so on.
1. interchange records R0 and Rstart
2. find the record that contained a 0 in its link field, and
change this record's link field to pointer to start
3. change the start to point to record R0's link field, and
interchange record R1 and Rstart
4. After n-1 iterations the list is in ascending order.

typedef struct {

176
int key;
int link;
int linkb; /* back link */
} element:

void list_sort1(element list[ ],int n,int start)


/* start is a pointer to the list of n sorted elements,linked
together by the field link,linkb is assumed to be present in each
element.the elements are rearranged so that the resulting
elements list[0],...,list[n-1] are consecutive and sorted */
{
int i, last, current;
element temp;

last = -1;
for (current = start; current != -1;
current =
list[current].link)
{
/* establish the back links for the list */
list[current].linkb = last;
last = current;
}
for (i = 0; i < n-1; i++){
/*move list[start] to position i while maintaining the list
*/
if (start != i){
if (list [ i ].link+1)
list [list [ i ].link].linkb = start;
list [ list[ i ].linkb].link = start;
SWAP ( list[start], list[i], temp);
}
start = list[i].link;
}
}

complexity O( mn )
n --- the number of records in the list
m --- each record is m words long

177
〖example〗input list ( 26, 5, 77, 1, 61, 11, 59, 15, 48, 19 )
start = 3 (linked list )
i R0 R1 R2 R3 R4 R5 R6 R7 R8 R9
key 26 5 77 1 61 11 59 15 48 19
link 8 5 -1 1 2 7 4 9 6 0
linkb

start = 3 (doubly linked list )


i R0 R1 R2 R3 R4 R5 R6 R7 R8 R9
key 26 5 77 1 61 11 59 15 48 19
link 8 5 -1 1 2 7 4 9 6 0
linkb 9 3 4 -1 6 1 8 5 0 7

start = 3 (first iteration of the for loop )


i R0 R1 R2 R3 R4 R5 R6 R7 R8 R9
key 1 5 77 26 61 11 59 15 48 19
link 1 5 -1 8 2 7 4 9 6 3
linkb -1 3 4 9 6 1 8 5 3 7

i=2 start = 5
i R0 R1 R2 R3 R4 R5 R6 R7 R8 R9
key 1 5 77 26 61 11 59 15 48 19
link 1 5 -1 8 2 7 4 9 6 3
linkb -1 3 4 9 6 1 8 5 3 7
i=3 start = 7
i R0 R1 R2 R3 R4 R5 R6 R7 R8 R9
key 1 5 11 26 61 77 59 15 48 19
link 1 5 7 8 5 -1 4 9 6 3
linkb -1 3 1 9 6 4 8 5 3 7
i=4 start = 9
i R0 R1 R2 R3 R4 R5 R6 R7 R8 R9
key 1 5 11 15 61 77 59 26 48 19
link 1 5 7 9 5 -1 4 9 8 7
linkb -1 3 1 5 6 4 8 5 9 7

(2) linked list representation ( no back links --- .D.MacLaren )

178
void list_sort2(element list[ ], int n, int start)
/* list sort with only one link field */
{ int i, next;
element temp;
for (i = 0; i < n-1; i++){
while(start < i) start = list[start].link;
next = list[start].link; /*save index of next largest
key*/
if (start != i){
SWAP(list[i], list[start], temp);
list[i].link = start;
}
start = next;
}
}
complexity O( mn )

〖example〗input list ( 26, 5, 77, 1, 61, 11, 59, 15, 48, 19 )


start = 3
i R0 R1 R2 R3 R4 R5 R6 R7 R8 R9
key 26 5 77 1 61 11 59 15 48 19
link 8 5 -1 1 2 7 4 9 6 0
i=0 start = 3
i R0 R1 R2 R3 R4 R5 R6 R7 R8 R9
key 1 5 77 26 61 11 59 15 48 19
link 3 5 -1 8 2 7 4 9 6 0
i=1 start = 5
i R0 R1 R2 R3 R4 R5 R6 R7 R8 R9
key 1 5 77 26 61 11 59 15 48 19
link 3 5 -1 8 2 7 4 9 6 0
i=2 start = 7
i R0 R1 R2 R3 R4 R5 R6 R7 R8 R9
key 1 5 11 26 61 77 59 15 48 19
link 3 5 5 8 2 -1 4 9 6 0
i=3 start = 9
i R0 R1 R2 R3 R4 R5 R6 R7 R8 R9
key 1 5 11 15 61 77 59 26 48 19
link 3 5 5 7 2 -1 4 8 6 0

179
i=4 start = 0
i R0 R1 R2 R3 R4 R5 R6 R7 R8 R9
key 1 5 11 15 19 77 59 26 48 61
link 3 5 5 7 9 -1 4 8 6 2

□list sort is not well suited for quick sort or heap sort.

(3) table sort O(n)


an auxiliary table ---
indirectly references the records in the list

before 0 1 4 Auxiliary
sort 2 3
table T

R0 R1 R2 R3 R4
key 50 9 11 8 3

After Auxiliary
4 3 1 2 0
sort table T
void table_sort(element list[ ], int n, int table[ ])
/* rearrange list[0], .., list[n-1] to correspond to the
sequence list[table[0]], ..., list[table[n-1]] */
{ int i, current, next;
element temp;
for (i = 0; i < n-1; i++)
if (table[i] != i){
/* nontrivial cycle starting at i */
temp = list[i]; current = i;
do {
next = table[current];
list[current] = list[next];
table[current] = current;
current = next;
}while(table[current] != i);
list[current] = temp;
table[current] = current;
}
}
〖example〗
initial configuration:

180
i R0 R1 R2 R3 R4 R5 R6 R7
key 35 14 12 42 26 50 31 18
table 2 1 7 4 6 0 3 5

after rearrangement of first cycle:


key 12 14 18 42 26 35 31 50
table 0 1 2 4 6 5 3 7

after rearrangement of second cycle:


key 12 14 18 26 31 35 42 50
table 0 1 2 3 4 5 6 7

7.10 summary of internal sorting

insertion sort
advantage: low overhead
suited case: already partially ordered
small n

merge sort
advantage: best worst case behavior
shortcoming: more storage
suited case: large n

quick sort
advantage: best average behavior
shortcoming: worst case behavior is O( n )
2

radix sort
shortcoming: behavior depends on the size of the key and
the choice of the radix

class type time(ave.) time(best) time (worst)


exchange bubble O( n )
2
--> O( n) 2
<-- O( n )

181
2
quick O( n log n) --> O( n) <--> O( n )
2
select& straight O( n )
2
tree binary tree O( n log n) skewedO( n )
tournament O( n log n) 2 n +1 number
heap O( n log n) small n
2 2
insertion simple O( n ) -->O( n) <-- O( n )
2
shell O( n(log n ) )
2
address O( n ) n/m=1O( n) n/m>>1O( n2 )
merge & merge O( n log n) large n small n
radix radix O( m * n) small m large m

7.11 external sorting


overhead for reading or writing from/to a disk:
seek time latency time transmission time
block --- the unit of data that is read from or written to the disk
at one time
sort method (merge sort ) :
step1 -- segments of the input file are sorted using internal
sort method.
step2 -- the sorted segments ( runs ) are written out onto
external storage as they are generated.
step3 -- runs are merged together to one run

〖example〗
file -- 4500 records (A1, …, A4500 )
block -- 250 records
run -- 3 blocks
1. internally sort three blocks at a time (i.e. 750 records ) to
obtain six runs R1 - R6
run1 run2 run3 run4 run5 run6

1-750 751-1500 1501-2250 2251-3000 3001-3750 3751-4500


2. set aside three blocks of internal memory, each capable of
holding 250 records.

182
run1 run2 run3 run4 run5 run6
750records 750records 750records 750records 750records 750records

run1 run2 run3


1500 records 1500 records 1500 records
run1
3000 records
run1
4500 records

ts = maximum seek time


tl = maximum latency time
trw = time to read or write one block of 250 records
tIO = ts + tl + trw
tIS = time to internally sort 750 records
ntm = time to merge n records from input buffer to the output
buffer

operation
time
(1) read 18 blocks of input , 18tIO , 36 tIO + 6tIS
internally sort, 6tIS ,
write 18 blocks, 18tIO
(2) merge runs 1-6 in pairs 36 tIO + 4500
tm
(3) merge two runs of 1500 records 24tIO + 3000 tm
each , 12 blocks
(4) merge one run of 3000 records 36 tIO+ 4500 tm
with one run of 1500 records
___________________
total time
132tIO + 12000 tm + 6tIS

(1) k-way Merging

4-way merge on 16 runs

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16

183
k-way merge
1 2 ...k k+1 ... 2k ([m/k]-1)k+1 ...m

[log km]+1levels

n records-->m runs
---> [ log k m ] + 1 level
--> k-way merge
--> merge log k m passes
merge k-runs ( s1,s2,s3,, …,sk ) --> O((k-1) ∑1 si )
k

computing time
total computing time : n(k-1) log k m computing number
n(k-1) log k m = n(k-1) log 2 m / log 2 k = n log 2 m * (k-1) /
log 2 k

As k increase , the reduction in input/output time will be


offset by the resulting increase in CPU time needed to
perform the k-way merge.

the total internal processing time per level = O(n log 2 k )


the total internal processing time :
O( n log 2 k log k m ) = O ( n log 2 m )

(2) buffer handling for parallel operation

k-way merge --- k input buffer, 1 output buffer


parallel merge --- 2k input buffer, 2 output buffer

184
〖example〗run 1 : 1,3,5,7,8,9 run 2 : 2,4,6,15,20,25
- - 1 3
- - 2 4
ou[1] ou[2]
output ou[1]
merge into ou[1] merge into ou[2]
1 2 - - - -
3- 4 input into in[3] 3 4 input into in[4]
- -
in[1] in[2]
- - 5 5 6
- - 7 7 15
in[3] in[4]

5 7 9
6 8 -

output ou[2] output ou[1] output ou[2]


merge into ou[2]
merge into ou[1] 8 - - 20 merge into ou[1] - 20
input into in[1] 9 - input into in[2] 9 25 - 25
input into in[3]

- - - - -
7 15 - 15 15
assumption about the parallel processing capabilities:
(1) two disk drives and the input/output channel
(2) while data transmission is taking place between an
input/output device and a block of memory, the
CPU
cannot make references to that same block of the
memory.
(3) input and output buffer are to be the same size.

buffering algorithm

step 1: input the first block of the k runs setting up k linked


queues each having one block of data.
put the remaining k input blocks into a linked
stack of free input blocks. set ou to 0.

step 2: let lastkey [i] be the last key input from run i. let
nextrun be the run for which lastkey is minimum.if
lastkey [nextrun] < > + then initiate the input
of the next block from run nextrun.

185
step 3: use a function kwaymerge to merge records from the
k input queues into the output buffer ou. merging
continues until either the output buffer gets full or
arecord with key + is merged into ou. if,
during this merge , an input buffer becomes
empty before the output buffer gets full or
before a + is merged into a ou, the
kwaymerge advances to the next bufferon the
same queue and returns the empty buffer to the
stack of empty buffers.However , if an input
buffer becomes empty at the same time as the
output buffer gets full or + is merged into ou,
the empty buffer is left on the queue and
kwaymerge does not advance to the next buffer
on the queue.
Rather, the merge terminate.
step 4: wait for any ongoing disk input/output to complete.
step 5: if an input buffer has been read, add it to the queue
for the appropriate run. Determine the next run to
read from by determining nextrun such that
lastkey[nextrun] is minimum.
step 6: if lastkey [nextrun] + , then initiate reading
the next block from run nextrun into a free
input buffer.
step 7: initiate the writing of output buffer ou. Set ou = 1- ou.
step 8: if a record with key + has not been merged into
the output buffer go back to step 3.Otherwise, wait
for the ongoing write to complete and then terminate.
〖example〗
run1 20 25 26 28 29 30 33 +

run2 23 29 34 36 38 60 70 +

run3 24 28 31 33 40 43 50 +

186
next block
line queue run2 run3 outpu
run1 being read from
1 20 25 23 29 24 28 no output run1
2 25 26 28 29 24 28 20 23 run1
3 26 28 29 30 29 28 24 25 run3
4 29 30 29 28 31 33 26 28 run2
5 29 34 36 28 29
30 31 33 run1
6 33 + 34 36 31 33 29 30
run3
7 + 34 36 33 40 43 31 33 run2
8 + 36 38 60 40 43 33 34 run3
9 + 60 40 43 50 + 36 38 run2
10 + 60 70 + 50 + 40 43 no next
11 + 70 + + 50 60 no next
12 + + 70 +

〖theorem7.2〗the following is true for above program


1. in step 6, there is always a buffer available in which to
being reading the next block and
2. during the k-way merge of step 3 the next block in the
queue has been read in by the time it is needed

(3) run generation


#define k 16 /* nodes in the tree */
typedef struct {
int key; int run;
} element;
typedef struct {
element data; int loser;
} tree_node
tree_node tree[k];

tree[i].data, 0<=i<k k records in the tree


tree[i].data.key key value of record, tree[i].data

187
tree[i].data.run run number to which tree[i].data
belong
tree[i].loser loser of the tournament played at node
i
current_run run number of current run
winner position of overall tournament
winner,[0]
winner_run run number for tree[winner].data
max_runs number of runs that will be generated
last_key key value of last record output

void run_generation(char *in_name, char *out_name)


{
/* generate runs using a loser tree */

int winner = 0, winner_run = 0, current_run = 0;


int max_runs = 0, last_key = INT_MAX;
int i, parent, loser, temp;
FILE *in, *out;

in = open_input(in_name);
out = fopen(out_name, "wb");
for (i = 1; i < k; i++){
/* set up tree with dummy nodes */
tree[i].data.key = 0;
tree[i].data.run = 0;
tree[i].loser = i;
}
tree[winner].data.run = 0;

for ( ; ; ){
if (winner_run != current_run){
if (winner_run > max_runs){
/* last record reached , close files and return
*/
fclose(in); fclose(out);
return;
}
current_run = winner_run;

188
}
if (winner_run){
/* suppress output if dummy records */
fwrite(&tree[winner].data, sizeof(element), 1,
out);
last_key = tree[winner].data.key;
}
fread(&tree[winner].data, sizeof(element), 1,
in);
if (feof(in)){ /* signal to end
processing */
winner_run = max_runs+1;
tree[winner].data.run = winner_run;
}

else {
if (tree[winner].data.key < last_key){
winner_run++;
tree[winner].data.run = winner_run;
max_runs = winner_run;
}
else
tree[winner].data.run = current_run;
}
/* adjust tree */
parent = (k + winner)/2;
while(parent){
loser = tree[parent].loser;
if (tree[loser].data.run < winner_run ||
(tree[loser].data.run == winner_run &&
tree[loser].data.key <
tree[winner].data.key)){
temp = winner;
winner = tree[parent].loser;
tree[parent].loser = temp;
winner_run = tree[winner].data.run;
}
parent/ = 2;
}
}

189
}

FILE *open_input(char *source_name)


{
FILE *source;
source = fopen(source_name, "rb");
if (!source){
fprintf(stderr, "File %s cannot be opened for
input\n",
source_name);
exit(1);
}
return source;
}

(4) optimal merging of runs

intrenal node

15
and
5
2 4 5 15
2 4 extrenal node

external path lengths 1: 2*3 + 4*3 + 5*2 + 15*1 = 43


external path lengths 2: 2*2 + 4*2 + 5*2 + 15*2 = 52

Huffman tree
a binary tree with minimum weighted external path length

typedef struct tree_node *tree_pointer;


typedef struct tree_node {
tree_pointer left_child:
int weight;
tree_pointer right_child;
};

190
tree_pointer tree;
int n;

function least -- find a tree in heap with minimum weight


and remove it from list
function insert -- add a new tree to list

void huffman ( tree_pointer heap[ ], int n) /* O( nlogn )


*/
{ /* heap is a list of n single node binary trees */
tree_pointer tree;
int i;
/* initialize min heap */
initialize ( heap, n );
/* create a new tree by combining the trees with the
smallest weights until one tree remains */
for ( i = 1; i < n ; i++ ) {
tree ( tree_pointer) malloc (sizeof(tree_node));
if (is_full(tree)) {
fprintf (stderr, "the memeory is full\n");
exit(1);
}
tree->left_child = least(heap, n - i + 1);
tree->right_child = least(heap, n - i );
tree->weight = tree->left_child->weight +
tree->right_child ->weight ;
insert ( heap, n - i -1, tree);
}
}

191
5 10 16

2 3 5 5 7 9

2 3
39

23
16 23

10 13 10
7 9 13

5 5 5 5

2 3 2 3

192
chapter 9 heap structures
9.1 Min-Max heap

【definition】A min-max heap is a complete binary tree such


that if it is not empty, each element has a field called key.
Alternating levels of this tree are min level and max levels,
respectively. The root is on min level. Let x be any node in a
min-max heap. If x is on a min level then the element in x has
the minimum key from among all elements in the subtree with
root x. We call this node a min node. Similarly, If x is on a max
level then the element in x has the maximum key from among
all elements in the subtree with root x. We call this node a max
node.

7 min

70 40 max

30 9 10 15 min

45 50 30 20 12 max

(1) insertion into a min-max heap

#define MAX_SIZE 100 /* maximum size of heap plus 1


*/ #define FALSE 0
#define TRUE 1
#define SWAP (x, y, t) ((t) = (x) , (x) = (y), (y) = (t))
typrdef struct {
int key;
/* other field */
} element;
element heap [MAX_SIZE];
void min_max_insert(element heap[ ],int *n,element item)
{ /* insert item into the min-max heap : O (log n ) */
int parent;
(*n) ++;
if (*n == MAX_SIZE) {
fprintf(stderr,"The heap is full\n");
exit(1);
}
parent = (*n)/2;
if (!parent)
/* heap is empty, insert item into first position */
heap[1] = item;
else switch(level(parent)) {
case FALSE: /* min level */
if (item.key < heap[parent].key) {
heap[*n] = heap[parent];
verify_min(heap, parent, item);
}
else
verify_max(heap, *n, item);
break;
case TRUE: /* max level */
if (item.key > heap[parent].key) {
heap[*n] = heap[parent];
verify_max(heap, parent, item);
}
else
verify_min(heap, *n, item);
}
}
void verify_max(element heap[],int i,element item)
{
/* follow the nodes from the max node i to the root and
insert item into its proper place */
int grandparent = i / 4;
while (grandparent)

212
if (item.key > heap[grandparent].key) {
heap[i] = heap[grandparent];
i = grandparent;
grandparent /= 4;
}
else
break;
heap[i] = item;
}
7 min

70 40 max

30 9 10 15 min

45 50 30 20 12 j max

inserting 5
5 min

70 40 max

30 9 7 15 min

45 50 30 20 12 10 max

inserting 80
7 min

70 80 max

30 9 10 15 min

45 50 30 20 12 40 max

213
(2) deletion of min element
12 min

70 40 max

30 9 10 15 min

45 50 30 20 max

the two cases for reinserting an element item into a min-max-


heap :
① The root has no children
--> item is to be inserted into the root

② The root has at least one child


determine which node has the smallest key ( node k )
(a) item.key ≤ heap[k].key
--> item is to be inserted into the root
(b) item.key > heap[k].key and k is a child of the root
--> element heap[ k ] may be moved to the root and
item insert into node k
(c) item.key > heap[k].key and k is a grandchild
--> element heap[ k ] may be moved to the root and
item.key > heap[ parent ].key ( the parent of k )
heap[ parent ].key and item are to be interchanged .
insert item into the sub-heap with root k …

9 min

70 40 max

30 12 10 15 min

45 50 30 20 max

214
element delete_min(element heap[ ], int *n)
{ /* delete the minimum element from the min-max heap:
O(log n) */
int i, last, k, parent;
element temp, x;

if ( !(*n) ) {
fprintf(stderr,"The heap is empty\n");
heap[0].key = INT_MAX; /* error key in heap[0] */
return heap[0];
}
heap[0] = heap[1]; /* save the element */
x = heap[ (*n)-- ];
/* find place to insert x */
for ( i = 1, last = (*n)/2; i <= last;) {
k = min_child_grandchild(i, *n);
if (x.key <= heap[k].key) break;
/* case 2(b) or 2(c) */
heap[i] = heap[k];
if (k <= 2*i+1) { /* 2(b) */
i = k;
break;
}
/* case 2(c), k is a grandchild of i */
parent = k/2;
if (x.key > heap[parent].key)
SWAP(heap[parent],x,temp);
i = k;
} /* for */
heap[i] = x;
return heap[0];
}

215
9.2 deaps

【definition】 A deap is a complete binary tree that is either


empty or satisfies the following properties:
(1) the root contains no element
(2) the left subtree is a min-heap
(3) the right subtree is a max-heap
(4) if the right subtree is not empty, then let i be any node
in the left subtree, let j be the corresponding
node in
the right subtree. If such a j does not exist, then let
j
be the node in the right subtree that corresponds
to
the parent of i. The key in node i is less than or
equal to the key in j.
[log i ]−1
j=i+ 2 2

if ( j > n ) j /= 2

5 45

10 8 25 40

15 19 9 30 20

(1) insertion into a deap

216
5 45

10 8 25 40

15 19 9 30 20
i j

insertion of 4

4 45

5 8 25 40

15 10 9 30 20 19

insertion of 30

5 45

10 8 30 40

15 19 9 30 20 25

functions

• max_heap ( n )
this function return TRUE iff n is a position in the
max-heap of the deap.

• min_partner ( n )
This function compute the min_heap node that
corresponds to the max-heap position n. This is given by
[log n]−1
n- 2 2

• max_partner ( n )

217
This function compute the max-heap node that
corresponds to the parent of min-heap position n. This is
[log n]−1
given by ( n - 2 2
)/2

• min_insert and max_insert


These functions insert an element into a specified position
of a min- and max-heap, respectively. This is done by
following the path from this position toward the root of the
respective heap. Elements are moved down as necessary
until the correct place to insert the new element is found

element deap [ MAX_SIZE ]

void deap_insert(element deap[ ],int *n,element x)


{
/* insert x into the deap : O ( log n ) */
int i;
(*n) ++;
if (*n == MAX_SIZE) {
fprintf(stderr,"The heap is full\n");
exit(1);
}
if (*n == 2)
deap[2] = x; /* insert into empty deap */
else switch(max_heap(*n)) {
case FALSE: /* *n is a position on min side */
i = max_parent(*n);
if (x.key > deap[i].key) {
deap[*n] = deap[i];
max_insert(deap,i,x);
}
else
min_insert(deap,*n,x);
break;
case TRUE: /* *n is a position on max side */
i = min_parent(*n);
if (x.key < deap[i].key) {
deap[*n] = deap[i];
min_insert(deap,i,x);
}

218
else
max_insert(deap,*n,x);
}
}

(2) deletion of min element


[log i ]−1
j=i+ 2 2
if ( j > n ) j /= 2

8 45

10 9 25 40

15 19 20 30

element deap_delete_min(element deap[ ] , int *n)


{
/* delete the minimum element from the heap: O ( log n ) */
int i , j;
element temp;
if (*n < 2) {
fprintf(stderr , "The deap is empty\n");
/* return an error code to user */
deap[0].key = INT_MAX;
return deap[0];
}
deap[0] = deap[2]; /* save min element */
temp = deap[(*n)--];
for (i = 2; i*2 < = *n; deap[i] = deap[j] , i = j) {
/* find node with smaller key */

219
j = i*2;
if (j+1 < = *n) {
if (deap[j].key > deap[j+1].key)
j++;
}
}
modified_deap_insert(deap , i , temp);
return deap[0];
}

9.3 leftist trees


problem
combine two priority queues into a single priority queue
heap --- O (n) , leftist tree ---- O ( log n)

A G

B C H I

D E F J

2 2
A G
2 1 1 1
B C H I
1 1 1
D E F J

extrenal node intrenal node

shortest ( x ) =
R
So if x is an external node
T1+ min{ shortest ( left _ child ( x )), shortest ( right _ child ( x ))} otehrwise

【definition】
A leftist tree is a binary tree such that iff it is not empty,
then
shortest (left_child(x)) >= shortest(right_child(x))

220
for every internal node x

〖Lemma9.1〗
Let x be the root of a leftist tree that has n (internal ) nodes.
(a) n >= 2
shortest ( x )
-1
(b) The rightmost root to external node path is the shortest
root to external node path. Its length is shortest( x ).

proof: (a) the leftist tree has at least internal nodes


shortest ( x )
∑ 2 i −1 = 2 shortest ( x ) -1
i =1

typedef struct {
int key;
/* other field */
} element;
typedef struct leftist *leftist_tree;
struct leftist {
leftist_tree left_child;
element data;
leftist_tree right_child;
int shortest;
};
【definition】 A min-leftist tree (max leftist tree) is a leftist tree
in which the key value in each node is no large ( smaller ) than
the key value in its children ( if any ). In other words. a min (max)
leftist tree is a leftist that is also a min ( max ) tree
2 2
2 a 5 b
1 1 1 1
7 50 9 8
1 1 2 1
11 80 12 10
1 1 1 1
13 20 18 15

insert x --> min-leftist b


• create a min-leftist a that contains the single element x
• combine a and b

221
delete the min element <-- a
• combine a->left_child and a->right_child
• delete node a

combine a and b
• obtain a new binary tree containing all elements in a
and b following the rightmost path in a and/or b. (the key in
each node is no larger than the key in its children )
• interchange the left and right subtree of nodes as
necessary to convert this binary tree into a leftist tree

2
5
2 2
8 1
1 1 9 8
2 1 1
10 50
1 1 12 10 50
1 1 1 1
15 80
20 18 15 80

2 2
2 2
1 2 1
7 5 5 71
1 1 2 2 1
11 9 8 8 9 11 1
1 1 1
2 1 1 2
13 12
1
10 50
1
10 50 12 13 1
1 1 1 1 1 1
20 18 15 80 15 80 20 18

void min_combine(leftist_tree *a , leftist_tree *b)


{
/* combine the two min leftist trees *a and *b. The resulting
min leftist tree is returned in *a , and *b is set to NULL */
if (!*a)
*a = *b;

222
else if (*b)
min_union(a , b);
*b = NULL;
}

void min_union(leftist_tree *a , leftist_tree *b) /* O(log n ) */


{
/* recursively combine two nonempty min leftist trees */
leftist_tree temp;
/* set a to be the tree with smaller root */
if ((*a)->data.key > (*b)->data.key)
SWAP(*a , *b , temp);
/* create binary tree such that the smallest key in each
subtree is in the root */
if (!(*a)->right_child)
(*a)->right_child = *b;
else
min_union(&(*a)->right_child , b);
/* leftist tree property */
if (!(*a)->left_child) {
(*a)->left_child = (*a)->right_child;
(*a)->right_child = NULL;
}
else if ((*a)->left_child->shortest <
(*a)->right_child->shortest)
SWAP((*a)->left_child , (*a)->right_child , temp);
(*a)->shortest = (!(*a)->right_child) ? 1 :
(*a)->right_child->shortest + 1;
}

9.4 binomial heap

(1) cost amortization

individual operation:
binomial heap ~ O ( n )

223
leftist tree ~ O ( log n )
amortize part of the cost of expensive operation over the
inexpensive once ~ O ( 1 ) or O ( log n )

amortize scheme :
charge some of the actual cost of an operation to other
operations. This reduce the charged cost of some operations
and increases that of others

amortized cost of an operation = the total cost charged to it

the sum of the amortized costs of the operations is greater


than or equal to the sum of their actual costs

〖example〗
operation sequence: I1,I2,D1,I3,I4,I5,I6,D2,I7
actual cost: I1 ~ I7 -- 1, D1 - 8, D2 -- 10 ( unit of
time )
total cost = 25
if 2 units (D1) --> I1, I2, 4 units ( D2) --> I2 ~ I6
the amortized cost of each of I1 ~ I6 = 2 units
the amortized cost of each of D1 ~ C2 = 6 units
the total amortized cost = (2*6+1*+2*6 ) = 25 units

for any sequence of insert and delete min operations:


actual sequence cost = i + 10 * d
amortized sequence cost = 2 * i + 6 * d
bound on the sequence cost = min {2 * i + 6 * d, i + 10 * d}

while individual delete operations on an binomial heap may be


expensive, the cost of any sequence of binomial heap
operations is actually quite small

(2) definition of binomial heap

min-binomial heap -- a collection of min-tree ( B-heap )


max-binomial heap -- a collection of max-tree

224
1
3
8 12 7 16
5 4

10 15 30 9
6

20
insert, combine -- O (1) (actual , amortized time )
delete -- O ( log n ) ( amortized time )
B-heap representation : doubly linked list
field: degree, child, left_link, right_link, data
a
8 3 1

5 4 12 7 16
10
6 15 30 9

20
(3) insertion into a binomial heap O (1)
element x --> B-heap a
1. putting x into a new node
2. placing this node into the doubly linked circular list
pointer at by a
3. if a is null or x's key < the key in the node pointed by a
then reset a to the new node

(4) combine O(1)


B-heap a + B-heap b
1. combine the top doubly linked circular lists of a and b
into a single doubly linked circular list.
2. new B-heap pointer --> the smaller key' B-heap

(5) deletion of min element


step1: [Handle empty B-heap]
if (a = NULL) deletion_error else perform step
2- 4
step2: [Deletion from nonempty B-heap]
x = a->data; y = a->child ; delete a from its
doubly

225
linked circular list; Now, a pointer to any
remaining
node in the resulting list; If there is no such
node,
then a = NULL
step3: [Min-tree joining]
Consider the min-tree in the list a and y; joining
together pairs of min-tree of the same degree
until
all remaining min-trees has different degree;
step4: [Form min-tree root list]
Link the roots of the remaining min-tree ( if any )
together to form a doubly linked circular list;
Set a
to point the root ( if any ) with minimum key;
new
B-heap consist of the remaining min-tree and
the
sub min-tree of the deleted root
8 3 12 7 16

5 4 15 30 9
10
6 20

making the min-tree whose root has larger key a subtree of the other

7 3 12 16

8 9 5 4 15 30

6 20
10

226
3 12 16

7 5 4
15 30

8 9 6
20

10

for (degree = p->degree; tree[degree]; degree++) {


join_min_trees(p , tree[degree]);
tree[degree] = NULL;
}
tree[degree] = p;

(6) analysis

【definition】 The binomial tree , Bk, of degree k is a tree such


that if k = 0, then the tree has exactly one node and if
k > 0, then it consists of a root whose degree is k and whose
sub tree are B0, B1, …, Bk-1

insert -- O(1) delete -- O(logn)


〖Lemma9.2〗Let a be a B-heap with n elements that results
from a sequence of insert, combine, and delete min operations
performed on initially empty B-heap. Each min tree in a has
degree ≤ log 2 n. Conesquencely,
MAX_DEGREE ≤ [ log 2 n ] and the actual cost of a delete min is O(logn + s )

〖Theorem9.1〗 If a sequence of n insert, combine, and


delete min operations is performed on initially empty
B-heap, then we can amortize cost such that the amortized
time complexity of each insert and combine is O(1) and that of
each delete min is O(log n)

actual cost of any sequence of i insert, c combines, and dm


delete min is O (i + c + dm log i )

9.5 Fibonacci heap

227
Fibonacci heap:
min-Fibonacci heap is a collection of min-tree ( F-heap )
max-Fibonacci heap is a collection of max-tree

□ B-heaps are a special case of F-heap

operations:

(1) delete, delete the element in a specified node --- O(1)


(2) decrease key, decrease the key of a specified node by
a given positive amount ----------------------------
O(log n)

F-heap representation : doubly linked list

field: degree, child, left_link, right_link, data, (B-heap )


parent, child_cut

(1) deletion from an F-heap

1. if a = b, then do a delete min; otherwise do step 2, 3,


and 4 below
2. Delete b from the doubly linked list it is in
3. Combine the doubly linked list of b' s children with the
doubly linked list of a' s min-tree roots to get a single
doubly linked list. Trees of equal degree are not
joined
together as in a delete min
4. Dispose of not b

the deletion of 12

228
8 3 1 15 30

5 4 7 16 20
10
6 9

(2) Decrease key


1. reduce the key in b
2. if b is not a min-tree root and its key is smaller than
that in its parent, then delete b from its doubly linked
list and insert it into the doubly linked list of min-tree
roots
3. Change a to point to b in case the key in b is smaller
than that in a

the reducetion of 15 by 4

8 3 1 11

5 4 7 16 20
12
10
6 30 9

(3) Cascading cut


problem
theorem9.1 requires that each min-tree of degree k have
an exponential ( in k) number of nodes.
How to ensure that each min-tree of degree k has at least
ck nodes for some c, c>1 ?

□each delete and decrease key operation must be


followed by a cascading cut step --> add the child_cut to
each node.
□child_cut = TRUE
iff one of the children of x was cut off (i.e., removed) after the
most recent time x was made the child of its current parent

229
□each time two min-tree are joined in a delete min
operation, the child_cut of the root with larger key = False

2
14 12 10
F
4 5
16 15
T 18 30 11
6 60
T
20 8 7
8 6
T
10
20 7
T
30 12 11
2
*
14 18 4 5

16 15 60

(4) Analysis
〖Lemma9.3〗 Let a be an F-heap with n elements that
results from a sequence of insert, combine, delete min, delete,
and decrease key operations performed on initially empty F-
heap
(a) Let b be any node node in any of the min-trees of a.
The degree of b is at most log φ m , where
φ = ( 1+ and m is the number of elements in the
5 ) / 2
subtree with root b.
(b) MAX_DEGREE ≤ [ log φ m ] and the actual cost of a delete

230
min is O(log n + s)
〖Theorem9.2〗 If a sequence of n insert, combine, delete
min, delete, and decrease keyoperations is performed on an
initially empty F-heap, then we can amortize costs such that
the amortized time complexity of each insert, combine, and
decrease key operation is O(1) and that of each delete min
and delete is O(log n). The total time complexity of the entire
sequence is the sum of the amortized complexities of the
individual operations in the sequence

(5) Application of F-heaps

• the single source all destinations algorithm


S -- the set of vertices to which a shortest path has been found
dostance(i) -- the length of a shortest from the source vertex i, i ∈ S

S ~ F-hearp
1) determine i is min distance and add i to S
~~ a delete min operation on S
2) distance value of the remaining vertices in S
~~ a decrease key operation on each of the affected
verties
total time (graph ) -- Ω ( n )
2

total time (heap ) -- O (nlog n + e )

• find a shortest path between every pair of vertices

total time (graph ) -- Ω ( n )


3

total time (heap ) -- O ( n log n + ne )


2

231

You might also like