You are on page 1of 522

Fundamentals of

Data Structures in C

• Ellis Horowitz
• Sartaj Sahni

• Susan Anderson-Freed

CHAPTER 1 1
CHAPTER 1

BASIC CONCEPT

All the programs in this file are selected from


Ellis Horowitz, Sartaj Sahni, and Susan Anderson-Freed
“Fundamentals of Data Structures in C”,
Computer Science Press.

CHAPTER 1 2
How to create programs
 Requirements
 Analysis: bottom-up vs. top-down
 Design: data objects and operations
 Refinement and Coding
 Verification
– Program Proving
– Testing
– Debugging

CHAPTER 1 3
Algorithm
 Definition
An algorithm is a finite set of instructions that accomplishes a particular task.
 Criteria
– input
– output
– definiteness: clear and unambiguous
– finiteness: terminate after a finite number of steps
– effectiveness: instruction is basic enough to be carried out

CHAPTER 1 4
Data Type
 Data Type
A data type is a collection of objects and a set of operations that act on those objects.
 Abstract Data Type
An abstract data type(ADT) is a data type that is organized in such a way that the
specification of the objects and the operations on the objects is separated from the
representation of the objects and the implementation of the operations.

CHAPTER 1 5
Specification vs. Implementation
 Operation specification
– function name
– the types of arguments
– the type of the results
 Implementation independent

CHAPTER 1 6
*Structure 1.1:Abstract data type Natural_Number (p.17)
structure Natural_Number is
objects: an ordered subrange of the integers starting at zero and ending

at the maximum integer (INT_MAX) on the computer


functions:
for all x, y  Nat_Number; TRUE, FALSE  Boolean
and where +, -, <, and == are the usual integer operations.
Nat_No Zero ( ) ::= 0
Boolean Is_Zero(x) ::= if (x) return FALSE
else return TRUE
Nat_No Add(x, y) ::= if ((x+y) <= INT_MAX) return x+y
else return INT_MAX
Boolean Equal(x,y) ::= if (x== y) return TRUE
else return FALSE
Nat_No Successor(x) ::= if (x == INT_MAX) return x
else return x+1
Nat_No Subtract(x,y) ::= if (x<y) return 0
else return x-y
::= is defined as
end Natural_Number
CHAPTER 1 7
Measurements
 Criteria
– Is it correct?
– Is it readable?
– …
 Performance Analysis (machine independent)
– space complexity: storage requirement
– time complexity: computing time
 Performance Measurement (machine dependent)

CHAPTER 1 8
Space Complexity
S(P)=C+SP(I)
 Fixed Space Requirements (C)
Independent of the characteristics of the inputs and outputs
– instruction space
– space for simple variables, fixed-size structured variable, constants
 Variable Space Requirements (SP(I))
depend on the instance characteristic I
– number, size, values of inputs and outputs associated with I
– recursive stack space, formal parameters, local variables, return address

CHAPTER 1 9
*Program 1.9: Simple arithmetic function (p.19)
float abc(float a, float b, float c)
{
return a + b + b * c + (a + b - c) / (a + b) + 4.00;
}
Sabc(I) = 0

*Program 1.10: Iterative function for summing a list of numbers (p.20)


float sum(float list[ ], int n)
{
Ssum(I) = 0
float tempsum = 0;
int i; Recall: pass the address of the
for (i = 0; i<n; i++) first element of the array &
tempsum += list [i]; pass by value
return tempsum;
}
CHAPTER 1 10
*Program 1.11: Recursive function for summing a list of numbers (p.20)
float rsum(float list[ ], int n)
{
if (n) return rsum(list, n-1) + list[n-1];
return 0;
} Ssum(I)=Ssum(n)=6n

Assumptions:
*Figure 1.1: Space needed for one recursive call of Program 1.11 (p.21)

Type Name Number of bytes


parameter: float list [ ] 2
parameter: integer n 2
return address:(used internally) 2(unless a far address)
TOTAL per recursive call 6

CHAPTER 1 11
Time Complexity
T(P)=C+TP(I)
 Compile time (C)
independent of instance characteristics
 run (execution) time TP
 Definition
A program step is a syntactically or semantically meaningful program segment whose execution time is
independent of the instance characteristics.
 Example
– abc = a + b + b * c + (a + b - c) / (a + b) + 4.0
– abc = a + b + c T (n)=caADD(n)+csSUB(n)+clLDA(n)+cstSTA(n)
P

Regard as the same unit


machine independent
CHAPTER 1 12
Methods to compute the step count

 Introduce variable count into programs


 Tabular method
– Determine the total number of steps contributed by each statement
step per execution  frequency
– add up the contribution of all statements

CHAPTER 1 13
Iterative summing of a list of numbers
*Program 1.12: Program 1.10 with count statements (p.23)

float sum(float list[ ], int n)


{
float tempsum = 0; count++; /* for assignment */
int i;
for (i = 0; i < n; i++) {
count++; /*for the for loop */
tempsum += list[i]; count++; /* for assignment */
}
count++; /* last execution of for */
return tempsum;
count++; /* for return */
}
2n + 3 steps
CHAPTER 1 14
*Program 1.13: Simplified version of Program 1.12 (p.23)

float sum(float list[ ], int n)


{
float tempsum = 0;
int i;
for (i = 0; i < n; i++) 2n + 3 steps
count += 2;
count += 3;
return 0;
}

CHAPTER 1 15
Recursive summing of a list of numbers
*Program 1.14: Program 1.11 with count statements added (p.24)

float rsum(float list[ ], int n)


{
count++; /*for if conditional */
if (n) {
count++; /* for return and rsum invocation */
return rsum(list, n-1) + list[n-1];
}
count++;
return list[0];
}
2n+2

CHAPTER 1 16
Matrix addition

*Program 1.15: Matrix addition (p.25)

void add( int a[ ] [MAX_SIZE], int b[ ] [MAX_SIZE],


int c [ ] [MAX_SIZE], int rows, int cols)
{
int i, j;
for (i = 0; i < rows; i++)
for (j= 0; j < cols; j++)
c[i][j] = a[i][j] +b[i][j];
}

CHAPTER 1 17
*Program 1.16: Matrix addition with count statements (p.25)

void add(int a[ ][MAX_SIZE], int b[ ][MAX_SIZE],


int c[ ][MAX_SIZE], int row, int cols )
{
int i, j; 2rows * cols + 2 rows + 1
for (i = 0; i < rows; i++){
count++; /* for i for loop */
for (j = 0; j < cols; j++) {
count++; /* for j for loop */
c[i][j] = a[i][j] + b[i][j];
count++; /* for assignment statement */
}
count++; /* last time of j for loop */
}
count++; /* last time of i for loop */
}
CHAPTER 1 18
*Program 1.17: Simplification of Program 1.16 (p.26)

void add(int a[ ][MAX_SIZE], int b [ ][MAX_SIZE],


int c[ ][MAX_SIZE], int rows, int cols)
{
int i, j;
for( i = 0; i < rows; i++) {
for (j = 0; j < cols; j++)
count += 2;
count += 2;
}
count++;
}
2rows  cols + 2rows +1
Suggestion: Interchange the loops when rows >> cols
CHAPTER 1 19
Tabular Method
*Figure 1.2: Step count table for Program 1.10 (p.26)
Iterative function to sum a list of numbers
steps/execution
Statement s/e Frequency Total steps
float sum(float list[ ], int n) 0 0 0
{ 0 0 0
float tempsum = 0; 1 1 1
int i; 0 0 0
for(i=0; i <n; i++) 1 n+1 n+1
tempsum += list[i]; 1 n n
return tempsum; 1 1 1
} 0 0 0
Total 2n+3

CHAPTER 1 20
Recursive Function to sum of a list of numbers
*Figure 1.3: Step count table for recursive summing function (p.27)

Statement s/e Frequency Total steps


float rsum(float list[ ], int n) 0 0 0
{ 0 0 0
if (n) 1 n+1 n+1
return rsum(list, n-1)+list[n-1]; 1 n n
return list[0]; 1 1 1
} 0 0 0
Total 2n+2

CHAPTER 1 21
Matrix Addition
*Figure 1.4: Step count table for matrix addition (p.27)

Statement s/e Frequency Total steps

Void add (int a[ ][MAX_SIZE]‧‧‧) 0 0 0


{ 0 0 0
int i, j; 0 0 0
for (i = 0; i < row; i++) 1 rows+1 rows+1
for (j=0; j< cols; j++) 1 rows‧(cols+1) rows‧cols+rows
c[i][j] = a[i][j] + b[i][j]; 1 rows‧cols rows‧cols
} 0 0 0

Total 2rows‧cols+2rows+1

CHAPTER 1 22
Exercise 1

*Program 1.18: Printing out a matrix (p.28)

void print_matrix(int matrix[ ][MAX_SIZE], int rows, int cols)


{
int i, j;
for (i = 0; i < row; i++) {
for (j = 0; j < cols; j++)
printf(“%d”, matrix[i][j]);
printf( “\n”);
}
}

CHAPTER 1 23
Exercise 2

*Program 1.19:Matrix multiplication function(p.28)

void mult(int a[ ][MAX_SIZE], int b[ ][MAX_SIZE], int c[ ][MAX_SIZE])


{
int i, j, k;
for (i = 0; i < MAX_SIZE; i++)
for (j = 0; j< MAX_SIZE; j++) {
c[i][j] = 0;
for (k = 0; k < MAX_SIZE; k++)
c[i][j] += a[i][k] * b[k][j];
}
}

CHAPTER 1 24
Exercise 3
*Program 1.20:Matrix product function(p.29)

void prod(int a[ ][MAX_SIZE], int b[ ][MAX_SIZE], int c[ ][MAX_SIZE],

int rowsa, int colsb, int colsa)


{
int i, j, k;
for (i = 0; i < rowsa; i++)
for (j = 0; j< colsb; j++) {
c[i][j] = 0;
for (k = 0; k< colsa; k++)
c[i][j] += a[i][k] * b[k][j];
}
}

CHAPTER 1 25
Exercise 4

*Program 1.21:Matrix transposition function (p.29)

void transpose(int a[ ][MAX_SIZE])


{
int i, j, temp;
for (i = 0; i < MAX_SIZE-1; i++)
for (j = i+1; j < MAX_SIZE; j++)
SWAP (a[i][j], a[j][i], temp);
}

CHAPTER 1 26
Asymptotic Notation (O)
 Definition
f(n) = O(g(n)) iff there exist positive constants c and n0 such that f(n)  cg(n) for all n, n  n0.
 Examples
– 3n+2=O(n) /* 3n+24n for n2 */
– 3n+3=O(n) /* 3n+34n for n3 */
– 100n+6=O(n) /* 100n+6101n for n10 */
– 10n2+4n+2=O(n2) /* 10n2+4n+211n2 for n5 */
– 6*2n+n2=O(2n) /* 6*2n+n2 7*2n for n4 */

CHAPTER 1 27
Example
 Complexity of c1n2+c2n and c3n
– for sufficiently large of value, c3n is faster than
c1n2+c2n
– for small values of n, either could be faster
• c1=1, c2=2, c3=100 --> c1n2+c2n  c3n for n  98
• c1=1, c2=2, c3=1000 --> c1n2+c2n  c3n for n  998
– break even point
• no matter what the values of c1, c2, and c3, the n beyond
which c3n is always faster than c1n2+c2n

CHAPTER 1 28
 O(1): constant
 O(n): linear
 O(n2): quadratic
 O(n3): cubic
 O(2n): exponential
 O(logn)
 O(nlogn)

CHAPTER 1 29
*Figure 1.7:Function values (p.38)

CHAPTER 1 30
*Figure 1.8:Plot of function values(p.39)

nlogn

logn

CHAPTER 1 31
*Figure 1.9:Times on a 1 billion instruction per second computer(p.40)

CHAPTER 1 32
CHAPTER 2

ARRAYS AND STRUCTURES


All the programs in this file are selected from
Ellis Horowitz, Sartaj Sahni, and Susan Anderson-Freed
“Fundamentals of Data Structures in C”,
Computer Science Press.

CHAPTER 2 33
Arrays
Array: a set of index and value

data structure
For each index, there is a value associated with
that index.

representation (possible)
implemented by using consecutive memory.

CHAPTER 2 34
objects: A set of pairs <index, value> where for each value of index
there is a value from the set item. Index is a finite ordered set of one or

more dimensions, for example, {0, … , n-1} for one dimension,


{(0,0),(0,1),(0,2),(1,0),(1,1),(1,2),(2,0),(2,1),(2,2)} for two dimensions,

etc.
Functions:
for all A  Array, i  index, x  item, j, size  integer
Array Create(j, list) ::= return an array of j dimensions where list is a

j-tuple whose ith element is the size of the

ith dimension. Items are undefined.


Item Retrieve(A, i) ::= if (i  index) return the item associated with
index value i in array A
else return error
Array Store(A, i, x) ::= if (i in index)
return an array that is identical to array
A except the new pair <i, x> has been
inserted else return error
CHAPTER 2 35
end array
Arrays in C
int list[5], *plist[5];

list[5]: five integers


list[0], list[1], list[2], list[3], list[4]
*plist[5]: five pointers to integers
plist[0], plist[1], plist[2], plist[3], plist[4]

implementation of 1-D array


list[0] base address = 
list[1]  + sizeof(int)
list[2]  + 2*sizeof(int)
list[3]  + 3*sizeof(int)
list[4]  + 4*size(int)

CHAPTER 2 36
Arrays in C (Continued)
Compare int *list1 and int list2[5] in C.

Same: list1 and list2 are pointers.


Difference: list2 reserves five locations.

Notations:
list2 - a pointer to list2[0]
(list2 + i) - a pointer to list2[i] (&list2[i])
*(list2 + i) - list2[i]

CHAPTER 2 37
Example: 1-dimension array addressing
int one[] = {0, 1, 2, 3, 4};
Goal: print out address and value

void print1(int *ptr, int rows)


{
/* print out a one-dimensional array using a pointer */
int i;
printf(“Address Contents\n”);
for (i=0; i < rows; i++)
printf(“%8u%5d\n”, ptr+i, *(ptr+i));
printf(“\n”);
}

CHAPTER 2 38
call print1(&one[0], 5)

Address Contents
1228 0
1230 1
1232 2
1234 3
1236 4
*Figure 2.1: One- dimensional array addressing (p.53)

CHAPTER 2 39
Structures (records)
struct {
char name[10];
int age;
float salary;
} person;

strcpy(person.name, “james”);
person.age=10;
person.salary=35000;

CHAPTER 2 40
Create structure data type
typedef struct human_being {
char name[10];
int age;
float salary;
};

or

typedef struct {
char name[10];
int age;
float salary
} human_being;

human_being person1, person2;

CHAPTER 2 41
Unions
Similar to struct, but only one field is active.
Example: Add fields for male and female.
typedef struct sex_type {
enum tag_field {female, male} sex;
union {
int children;
int beard;
} u;
};
typedef struct human_being {
char name[10];
int age;
float salary;
human_being person1, person2;
date dob;
person1.sex_info.sex=male;
sex_type sex_info;
person1.sex_info.u.beard=FALSE;
}

CHAPTER 2 42
Self-Referential Structures
One or more of its components is a pointer to itself.

typedef struct list {


char data; Construct a list with three nodes
list *link; item1.link=&item2;
} item2.link=&item3;
malloc: obtain a node
list item1, item2, item3;
item1.data=‘a’;
item2.data=‘b’;
item3.data=‘c’; a b c
item1.link=item2.link=item3.link=NULL;

CHAPTER 2 43
Ordered List Examples
ordered (linear) list: (item1, item2, item3, …, itemn)

 (MONDAY, TUEDSAY, WEDNESDAY,


THURSDAY, FRIDAY, SATURDAYY,
SUNDAY)
 (2, 3, 4, 5, 6, 7, 8, 9, 10, Jack, Queen,
King,
Ace)
 (1941, 1942, 1943, 1944, 1945)
 (a1, a2, a3, …, an-1, an)
CHAPTER 2 44
Operations on Ordered List
 Find the length, n , of the list.
 Read the items from left to right (or right to left).
 Retrieve the i’th element.
 Store a new value into the i’th position.
 Insert a new element at the position i , causing
elements numbered i, i+1, …, n to become numbered
i+1, i+2, …, n+1
 Delete the element at position i , causing elements
numbered i+1, …, n to become numbered i, i+1, …,
n-1 array (sequential mapping)? (1)~(4) O (5)~(6) X
CHAPTER 2 45
Polynomials A(X)=3X20+2X5+4, B(X)=X4+10X3+3X2+1
Structure Polynomial is
objects:
p( x )  a1 x e1  ...  an x en ; a set of ordered pairs of
<ei,ai> where ai in Coefficients and ei in Exponents, ei are integers >= 0
functions:
for all poly, poly1, poly2  Polynomial, coef Coefficients, expon
Exponents
Polynomial Zero( ) ::= return the polynomial,
p(x) = 0
Boolean IsZero(poly) ::= if (poly) return FALSE
else return TRUE
Coefficient Coef(poly, expon) ::= if (expon  poly) return its
coefficient else return Zero
Exponent Lead_Exp(poly) ::= return the largest exponent in
poly
Polynomial Attach(poly,coef, expon) ::= if (expon  poly) return error
else return the polynomial poly

with the term <coef, expon>


inserted
CHAPTER 2 46
Polynomial Remove(poly, expon) ::= if (expon  poly) return the
polynomial poly with the
term whose exponent is
expon deleted
else return error
Polynomial SingleMult(poly, coef, expon) ::= return the polynomial
poly • coef • xexpon
Polynomial Add(poly1, poly2) ::= return the polynomial
poly1 +poly2
Polynomial Mult(poly1, poly2) ::= return the polynomial
poly1 • poly2
End Polynomial

*Structure 2.2:Abstract data type Polynomial (p.61)

CHAPTER 2 47
Polynomial Addition
data structure 1: #define MAX_DEGREE 101
typedef struct {
int degree;
float coef[MAX_DEGREE];
} polynomial;
/* d =a + b, where a, b, and d are polynomials */
d = Zero( )
while (! IsZero(a) && ! IsZero(b)) do {
switch COMPARE (Lead_Exp(a), Lead_Exp(b)) {
case -1: d =
Attach(d, Coef (b, Lead_Exp(b)), Lead_Exp(b));
b = Remove(b, Lead_Exp(b));
break;
case 0: sum = Coef (a, Lead_Exp (a)) + Coef ( b, Lead_Exp(b));
if (sum) {
Attach (d, sum, Lead_Exp(a));
a = Remove(a , Lead_Exp(a));
b = Remove(b , Lead_Exp(b));
} CHAPTER 2 48
break;
case 1: d =
Attach(d, Coef (a, Lead_Exp(a)), Lead_Exp(a));
a = Remove(a, Lead_Exp(a));
}
}
insert any remaining terms of a or b into d
advantage: easy implementation
disadvantage: waste space when sparse

*Program 2.4 :Initial version of padd function(p.62)

CHAPTER 2 49
Data structure 2: use one global array to store all polynomials
A(X)=2X1000+1
B(X)=X4+10X3+3X2+1 *Figure 2.2: Array representation of two polynomials
(p.63)
starta finisha startb finishb avail

coef
2 1 1 10 3 1

exp
1000 0 4 3 2 0

0 1 2 3 4 5 6

specification representation
poly <start, finish>
A <0,1>
B CHAPTER 2 <2,5> 50
storage requirements: start, finish, 2*(finish-start+1)
nonparse: twice as much as (1)
when all the items are nonzero

MAX_TERMS 100 /* size of terms array */


typedef struct {
float coef;
int expon;
} polynomial;
polynomial terms[MAX_TERMS];
int avail = 0;

*(p.62)

CHAPTER 2 51
Add two polynomials: D = A + B
void padd (int starta, int finisha, int startb, int finishb,
int * startd, int *finishd)
{
/* add A(x) and B(x) to obtain D(x) */
float coefficient;
*startd = avail;
while (starta <= finisha && startb <= finishb)
switch (COMPARE(terms[starta].expon,
terms[startb].expon)) {
case -1: /* a expon < b expon */
attach(terms[startb].coef, terms[startb].expon);
startb++
break;

CHAPTER 2 52
case 0: /* equal exponents */
coefficient = terms[starta].coef +
terms[startb].coef;
if (coefficient)
attach (coefficient, terms[starta].expon);
starta++;
startb++;
break;
case 1: /* a expon > b expon */
attach(terms[starta].coef, terms[starta].expon);
starta++;
}

CHAPTER 2 53
/* add in remaining terms of A(x) */
for( ; starta <= finisha; starta++)
attach(terms[starta].coef, terms[starta].expon);
/* add in remaining terms of B(x) */
for( ; startb <= finishb; startb++)
attach(terms[startb].coef, terms[startb].expon);
*finishd =avail -1;
}
Analysis: O(n+m)
where n (m) is the number of nonzeros in A(B).

*Program 2.5: Function to add two polynomial (p.64)


CHAPTER 2 54
void attach(float coefficient, int exponent)
{
/* add a new term to the polynomial */
if (avail >= MAX_TERMS) {
fprintf(stderr, “Too many terms in the polynomial\n”);
exit(1);
}
terms[avail].coef = coefficient;
terms[avail++].expon = exponent;
}
*Program 2.6:Function to add anew term (p.65)

Problem: Compaction is required


when polynomials that are no longer needed.
(data movement takes time.)

CHAPTER 2 55
Sparse Matrix

col1 col2 col3 col4 col5 col6


row0 15 0 0 22 0  15
row1
 0 11 3 0 0 0 
 
row2  0 0 0 6 0 0
 
row3  0 0 0 0 0 0 
row4
91 0 0 0 0 0
 
5*3 row5  0 0 28 0 0 0
6*6
(a) 15/15 (b) 8/36

*Figure 2.3:Two matrices


sparse matrix
data structure?
CHAPTER 2 56
SPARSE MATRIX ABSTRACT DATA TYPE
Structure Sparse_Matrix is
objects: a set of triples, <row, column, value>, where row
and column are integers and form a unique combination, and
value comes from the set item.
functions:
for all a, b  Sparse_Matrix, x  item, i, j, max_col,
max_row  index

Sparse_Marix Create(max_row, max_col) ::=


return a Sparse_matrix that can hold up to
max_items = max _row  max_col and
whose maximum row size is max_row and

whose maximum column size is max_col.

CHAPTER 2 57
Sparse_Matrix Transpose(a) ::=
return the matrix produced by interchanging
the row and column value of every triple.
Sparse_Matrix Add(a, b) ::=
if the dimensions of a and b are the same
return the matrix produced by adding
corresponding items, namely those with
identical row and column values.
else return error
Sparse_Matrix Multiply(a, b) ::=
if number of columns in a equals number of
rows in b
return the matrix d produced by multiplying
a by b according to the formula: d [i] [j] =
(a[i][k]•b[k][j]) where d (i, j) is the (i,j)th
element
else return error.
CHAPTER 2 58
* Structure 2.3: Abstract data type Sparse-Matrix (p.68)
(1) Represented by a two-dimensional array.
Sparse matrix wastes space.
(2) Each element is characterized by <row, col, value>.

# of rows (columns) row col value


row col value
# of nonzero terms
a[0] 6 6 8 b[0] 6 6 8
[1] 0 0 15 [1] 0 0 15
[2] 0 3 22 [2] 0 4 91
[3] 0 5 -15 [3] 1 1 11
[4] 1 1 11 transpose
[4] 2 1 3
[5] 1 2 3 [5] 2 5 28
[6] 2 3 -6 [6] 3 0 22
[7] 4 0 91 [7] 3 2 -6
[8] 5 2 28 [8] 5 0 -15

(a) (b)
row, column in ascending order
*Figure 2.4:Sparse matrix and its transpose stored as triples (p.69)
CHAPTER 2 59
Sparse_matrix Create(max_row, max_col) ::=

#define MAX_TERMS 101 /* maximum number of terms +1*/


typedef struct {
int col; # of rows (columns)
int row; # of nonzero terms
int value;
} term;
term a[MAX_TERMS]

* (P.69)

CHAPTER 2 60
Transpose a Matrix

(1) for each row i


take element <i, j, value> and store it
in element <j, i, value> of the transpose.

difficulty: where to put <j, i, value>


(0, 0, 15) ====> (0, 0, 15)
(0, 3, 22) ====> (3, 0, 22)
(0, 5, -15) ====> (5, 0, -15)
(1, 1, 11) ====> (1, 1, 11)
Move elements down very often.

(2) For all elements in column j,


place element <i, j, value> in element <j, i, value>

CHAPTER 2 61
void transpose (term a[], term b[])
/* b is set to the transpose of a */
{
int n, i, j, currentb;
n = a[0].value; /* total number of elements */
b[0].row = a[0].col; /* rows in b = columns in a */
b[0].col = a[0].row; /*columns in b = rows in a */
b[0].value = n;
if (n > 0) { /*non zero matrix */
currentb = 1;
for (i = 0; i < a[0].col; i++)
/* transpose by columns in a */
for( j = 1; j <= n; j++)
/* find elements from the current column */
if (a[j].col == i) {
/* element is in current column, add it to b */
CHAPTER 2 62
columns

elements

b[currentb].row = a[j].col;
b[currentb].col = a[j].row;
b[currentb].value = a[j].value;
currentb++
}
}
}
* Program 2.7: Transpose of a sparse matrix (p.71)

Scan the array “columns” times.


==> O(columns*elements)
The array has “elements” elements.

CHAPTER 2 63
Discussion: compared with 2-D array representation

O(columns*elements) vs. O(columns*rows)

elements --> columns * rows when nonsparse


O(columns*columns*rows)

Problem: Scan the array “columns” times.

Solution:
Determine the number of elements in each column of
the original matrix.
==>
Determine the starting positions of each row in the
transpose matrix.

CHAPTER 2 64
a[0] 6 6 8
a[1] 0 0 15
a[2] 0 3 22
a[3] 0 5 -15
a[4] 1 1 11
a[5] 1 2 3
a[6] 2 3 -6
a[7] 4 0 91
a[8] 5 2 28
[0] [1] [2] [3] [4] [5]
row_terms = 2 1 2 2 0 1
starting_pos = 1 3 4 6 8 8
CHAPTER 2 65
void fast_transpose(term a[ ], term b[ ])
{
/* the transpose of a is placed in b */
int row_terms[MAX_COL], starting_pos[MAX_COL];
int i, j, num_cols = a[0].col, num_terms = a[0].value;
b[0].row = num_cols; b[0].col = a[0].row;
b[0].value = num_terms;
if (num_terms > 0){ /*nonzero matrix*/
for (i = 0; i < num_cols; i++)
columns
row_terms[i] = 0;
elements
for (i = 1; i <= num_terms; i++)
row_term [a[i].col]++
starting_pos[0] = 1;
for (i =1; i < num_cols; i++)
columns
starting_pos[i]=starting_pos[i-1] +row_terms [i-1];

CHAPTER 2 66
for (i=1; i <= num_terms, i++) {
j = starting_pos[a[i].col]++;
elements b[j].row = a[i].col;
b[j].col = a[i].row;
b[j].value = a[i].value;
}
}
}
*Program 2.8:Fast transpose of a sparse matrix

Compared with 2-D array representation


O(columns+elements) vs. O(columns*rows)
elements --> columns * rows
O(columns+elements) --> O(columns*rows)

Cost: Additional row_terms and starting_pos arrays are required.


Let the two arrays row_terms and starting_pos be shared.

CHAPTER 2 67
Sparse Matrix Multiplication
Definition: [D]m*p=[A]m*n* [B]n*p
Procedure: Fix a row of A and find all elements in column j
of B for j=0, 1, …, p-1.

Alternative 1. Scan all of B to find all elements in j.


Alternative 2. Compute the transpose of B.
(Put all column elements consecutively)

1 0 0 1 1 1 1 1 1
1 0 0 0 0 0  1 1 1
    
1 0 0 0 0 0 1 1 1
CHAPTER 2 68
*Figure 2.5:Multiplication of two sparse matrices (p.73)
An Example

A= 1 0 2 B= 3 0 2
-1 4 6 -1 0 0
0 0 5
2 3 5 3 3 4
0 0 1 0 0 3
0 2 2 0 2 2
1 0 -1 1 0 -1
1 1 4 2 2 5
1 2 6

BT = 3 -1 0 (0,0) 3 3 4
0 0 0 (0,2) 0 0 3
2 0 5 0 1 -1
(1,0) 2 0 2
2 2 5
(1,2)

CHAPTER 2 69
General Case

dij=ai0*b0j+ai1*b1j+…+ai(n-1)*b(n-1)j

a 本來依 i 成群,經轉置後, b 也依 j 成群。

a Sa d Sd
b Sb e Se
c Sc f Sf
g Sg

最多可以產生 ad, ae, af, ag,


bd, be, bf, bg,
cd, ce, cf, cg 等 entries 。

CHAPTER 2 70
void mmult (term a[ ], term b[ ], term d[ ] )
/* multiply two sparse matrices */
{
int i, j, column, totalb = b[].value, totald = 0;
int rows_a = a[0].row, cols_a = a[0].col,
totala = a[0].value; int cols_b = b[0].col,
int row_begin = 1, row = a[1].row, sum =0;
int new_b[MAX_TERMS][3];
if (cols_a != b[0].row){
fprintf (stderr, “Incompatible matrices\n”);
exit (1);
}

CHAPTER 2 71
fast_transpose(b, new_b); cols_b + totalb
/* set boundary condition */
a[totala+1].row = rows_a;
new_b[totalb+1].row = cols_b;
new_b[totalb+1].col = 0;
for (i = 1; i <= totala; ) { at most rows_a times
column = new_b[1].row;
for (j = 1; j <= totalb+1;) {
/* mutiply row of a by column of b */
if (a[i].row != row) {
storesum(d, &totald, row, column, &sum);
i = row_begin;
for (; new_b[j].row == column; j++)
;
column =new_b[j].row
}
CHAPTER 2 72
else switch (COMPARE (a[i].col, new_b[j].col)) {
case -1: /* go to next term in a */
i++; break;
case 0: /* add terms, go to next term in a and b */
sum += (a[i++].value * new_b[j++].value);
break;
case 1: /* advance to next term in b*/
j++
}
} /* end of for j <= totalb+1 */
for (; a[i].row == row; i++)
;
row_begin = i; row = a[i].row;
} /* end of for i <=totala */
d[0].row = rows_a;
d[0].col = cols_b; d[0].value = totald;
} *Praogram 2.9: Sparse matrix multiplication (p.75)
CHAPTER 2 73
Analyzing the algorithm
cols_b * termsrow1 + totalb +
cols_b * termsrow2 + totalb +
…+
cols_b * termsrowp + totalb
= cols_b * (termsrow1 + termsrow2 + … + termsrowp) +
rows_a * totalb
= cols_b * totala + row_a * totalb

O(cols_b * totala + rows_a * totalb)

CHAPTER 2 74
Compared with matrix multiplication using array
for (i =0; i < rows_a; i++)
for (j=0; j < cols_b; j++) {
sum =0;
for (k=0; k < cols_a; k++)
sum += (a[i][k] *b[k][j]);
d[i][j] =sum;
}
O(rows_a * cols_a * cols_b) vs.
O(cols_b * total_a + rows_a * total_b)

optimal case: total_a < rows_a * cols_a


total_b < cols_a * cols_b
worse case: total_a --> rows_a * cols_a, or
total_b --> cols_a * cols_b

CHAPTER 2 75
void storesum(term d[ ], int *totald, int row, int column,
int *sum)
{
/* if *sum != 0, then it along with its row and column
position is stored as the *totald+1 entry in d */
if (*sum)
if (*totald < MAX_TERMS) {
d[++*totald].row = row;
d[*totald].col = column;
d[*totald].value = *sum;
}
else {
fprintf(stderr, ”Numbers of terms in product
exceed %d\n”, MAX_TERMS);

exit(1);
}
} CHAPTER 2 76
Program 2.10: storsum function
CHAPTER 3

STACKS AND QUEUES

All the programs in this file are selected from


Ellis Horowitz, Sartaj Sahni, and Susan Anderson-Freed
“Fundamentals of Data Structures in C”,
Computer Science Press.

CHAPTER 3 77
stack: a Last-In-First-Out (LIFO) list

E top
D top D D top
C top C C C
B top B B B B
A top A A A A A

*Figure 3.1: Inserting and deleting elements in a stack (p.102)

CHAPTER 3 78
an application of stack: stack frame of function call
(activation record)

old frame pointer fp


fp: a pointer to current stack frame
return address
al

local variables
old frame pointer fp old frame pointer
stack frame of invoking function

return address return address


main
system stack before a1 is invoked system stack after a1 is invoked
(a) (b)
*Figure 3.2: System stack after function call a1 (p.103)
CHAPTER 3 79
abstract data type for stack
structure Stack is
objects: a finite ordered list with zero or more elements.
functions:
for all stack  Stack, item  element, max_stack_size
 positive integer
Stack CreateS(max_stack_size) ::=
create an empty stack whose maximum size is
max_stack_size
Boolean IsFull(stack, max_stack_size) ::=
if (number of elements in stack == max_stack_size)
return TRUE
else return FALSE
Stack Add(stack, item) ::=
if (IsFull(stack)) stack_full
else insert item into top of stack and return

CHAPTER 3 80
Boolean IsEmpty(stack) ::=
if(stack == CreateS(max_stack_size))
return TRUE
else return FALSE
Element Delete(stack) ::=
if(IsEmpty(stack)) return
else remove and return the item on the top
of the stack.

*Structure 3.1: Abstract data type Stack (p.104)

CHAPTER 3 81
Implementation: using array
Stack CreateS(max_stack_size) ::=
#define MAX_STACK_SIZE 100 /* maximum stack size */
typedef struct {
int key;
/* other fields */
} element;
element stack[MAX_STACK_SIZE];
int top = -1;

Boolean IsEmpty(Stack) ::= top< 0;

Boolean IsFull(Stack) ::= top >= MAX_STACK_SIZE-1;

CHAPTER 3 82
Add to a stack
void add(int *top, element item)
{
/* add an item to the global stack */
if (*top >= MAX_STACK_SIZE-1) {
stack_full( );
return;
}
stack[++*top] = item;
}
*program 3.1: Add to a stack (p.104)

CHAPTER 3 83
Delete from a stack

element delete(int *top)


{
/* return the top element from the stack */
if (*top == -1)
return stack_empty( ); /* returns and error key */
return stack[(*top)--];
}
*Program 3.2: Delete from a stack (p.105)

CHAPTER 3 84
Queue: a First-In-First-Out (FIFO) list

D rear
C rear C D rear
B rear B B C
rear front A front
A A front A front B
front

*Figure 3.4: Inserting and deleting elements in a queue (p.106)

CHAPTER 3 85
Application: Job scheduling

front rear Q[0] Q[1] Q[2] Q[3] Comments


-1 -1 queue is empty
-1 0 J1 Job 1 is added
-1 1 J1 J2 Job 2 is added
-1 2 J1 J2 J3 Job 3 is added
0 2 J2 J3 Job 1 is deleted
1 2 J3 Job 2 is deleted

*Figure 3.5: Insertion and deletion from a sequential queue (p.108)

CHAPTER 3 86
Abstract data type of queue
structure Queue is
objects: a finite ordered list with zero or more elements.
functions:
for all queue  Queue, item  element,
max_ queue_ size  positive integer
Queue CreateQ(max_queue_size) ::=
create an empty queue whose maximum size is
max_queue_size
Boolean IsFullQ(queue, max_queue_size) ::=
if(number of elements in queue == max_queue_size)
return TRUE
else return FALSE
Queue AddQ(queue, item) ::=
if (IsFullQ(queue)) queue_full
else insert item at rear of queue and return queue
CHAPTER 3 87
Boolean IsEmptyQ(queue) ::=
if (queue ==CreateQ(max_queue_size))
return TRUE
else return FALSE
Element DeleteQ(queue) ::=
if (IsEmptyQ(queue)) return
else remove and return the item at front of queue.

*Structure 3.2: Abstract data type Queue (p.107)

CHAPTER 3 88
Implementation 1: using array

Queue CreateQ(max_queue_size) ::=


# define MAX_QUEUE_SIZE 100/* Maximum queue size */
typedef struct {
int key;
/* other fields */
} element;
element queue[MAX_QUEUE_SIZE];
int rear = -1;
int front = -1;
Boolean IsEmpty(queue) ::= front == rear
Boolean IsFullQ(queue) ::= rear == MAX_QUEUE_SIZE-1

CHAPTER 3 89
Add to a queue
void addq(int *rear, element item)
{
/* add an item to the queue */
if (*rear == MAX_QUEUE_SIZE_1) {
queue_full( );
return;
}
queue [++*rear] = item;
}

*Program 3.3: Add to a queue (p.108)

CHAPTER 3 90
Delete from a queue
element deleteq(int *front, int rear)
{
/* remove element at the front of the queue */
if ( *front == rear)
return queue_empty( ); /* return an error key */
return queue [++ *front];
}

*Program 3.4: Delete from a queue(p.108)

problem: there may be available space when IsFullQ is true


I.E. movement is required.

CHAPTER 3 91
Implementation 2: regard an array as a circular queue
front: one position counterclockwise from the first element
rear: current end
EMPTY QUEUE

[2] [3] [2] [3]


J2 J3

[1] [4] [1] J1 [4]

[0] [5] [0] [5]

front = 0 front = 0
rear = 0 rear = 3

*Figure 3.6: Empty and nonempty circular queues (p.109)


CHAPTER 3 92
Problem: one space is left when queue is full

FULL QUEUE FULL QUEUE

[2] [3] [2] [3]


J2 J3 J8 J9

[1] J1 J4 [4][1] J7 [4]

J5 J6 J5

[0] [5] [0] [5]

front =0 front =4
rear = 5 rear =3

*Figure 3.7: Full circular queues and then we remove the item (p.110)
CHAPTER 3 93
Add to a circular queue

void addq(int front, int *rear, element item)


{
/* add an item to the queue */
*rear = (*rear +1) % MAX_QUEUE_SIZE;
if (front == *rear) /* reset rear and print error */
return;
}
queue[*rear] = item;
}

*Program 3.5: Add to a circular queue (p.110)

CHAPTER 3 94
Delete from a circular queue
element deleteq(int* front, int rear)
{
element item;
/* remove front element from the queue and put it in item */
if (*front == rear)
return queue_empty( );
/* queue_empty returns an error key */
*front = (*front+1) % MAX_QUEUE_SIZE;
return queue[*front];
}

*Program 3.6: Delete from a circular queue (p.111)

CHAPTER 3 95
A Mazing Problem

010001100011111
entrance
100011011100111
011000011110011
110111101101100
110100101111111
001101110100101
011110011111111
001101101111101
110001101100000
001111100011110
010011111011110 exit
1: blocked path 0: through path
*Figure 3.8: An example maze(p.113)
CHAPTER 3 96
a possible representation

NW N NE
[row-1][col-1] [row-1][col] [row-1][col+1]

W E
[row][col-1]  [row][col+1]
[row][col]
SW S SE
[row+1][col-1] [row+1][col] [row+1][col+1]

*Figure 3.9: Allowable moves (p.113)


CHAPTER 3 97
a possible implementation
typedef struct {
short int vert; next_row = row + move[dir].vert;
short int horiz; next_col = col + move[dir].horiz;
} offsets;
offsets move[8]; /*array of moves for each direction*/
Name Dir move[dir].vert move[dir].horiz
N 0 -1 0
NE 1 -1 1
E 2 0 1
SE 3 1 1
S 4 1 0
SW 5 1 -1
W 6 0 -1
NW 7 -1 -1
CHAPTER 3 98
Use stack to keep pass history

#define MAX_STACK_SIZE 100


/*maximum stack size*/
typedef struct {
short int row;
short int col;
short int dir;
} element;
element stack[MAX_STACK_SIZE];

CHAPTER 3 99
Initialize a stack to the maze’s entrance coordinates and direction
to north;
while (stack is not empty){
/* move to position at top of stack */
<row, col, dir> = delete from top of stack;
while (there are more moves from current position) {
<next_row, next_col > = coordinates of next move;
dir = direction of move;
if ((next_row == EXIT_ROW)&& (next_col ==
EXIT_COL))
success;
if (maze[next_row][next_col] == 0 &&
mark[next_row][next_col] == 0) {

CHAPTER 3 100
/* legal move and haven’t been there */
mark[next_row][next_col] = 1;
/* save current position and direction */
add <row, col, dir> to the top of the stack;
row = next_row;
col = next_col;
dir = north;
}
}
}
printf(“No path found\n”);

*Program 3.7: Initial maze algorithm (p.115)

CHAPTER 3 101
The size of a stack?
000001
111110
100001
011111
100001
111110
100001
011111
100000 m*p

mp --> ┌ m/2┐p, mp --> ┌ p/2┐m

*Figure 3.11: Simple maze with a long path (p.116)

CHAPTER 3 102
(1,1)
void path (void)
{ (m,p)
(m+2)*(p+2)
/* output a path through the maze if such a path exists */
int i, row, col, next_row, next_col, dir, found = FALSE;
element position;
mark[1][1] = 1; top =0;
stack[0].row = 1; stack[0].col = 1; stack[0].dir = 1;
while (top > -1 && !found) {
position = delete(&top);
0
row = position.row; col = position.col; 7 N 1
dir = position.dir; 6W E2
while (dir < 8 && !found) { 5 S 3
/*move in direction dir */ 4
next_row = row + move[dir].vert;
next_col = col + move[dir].horiz;

CHAPTER 3 103
if (next_row==EXIT_ROW && next_col==EXIT_COL)
found = TRUE;
else if ( !maze[next_row][next_col] &&
!mark[next_row][next_col] {
mark[next_row][next_col] = 1;
position.row = row; position.col = col;
position.dir = ++dir;
add(&top, position);
row = next_row; col = next_col; dir = 0;
}
else ++dir;
}
}

CHAPTER 3 104
if (found) {
printf(“The path is :\n”);
printf(“row col\n”);
for (i = 0; i <= top; i++)
printf(“ %2d%5d”, stack[i].row, stack[i].col);
printf(“%2d%5d\n”, row, col);
printf(“%2d%5d\n”, EXIT_ROW, EXIT_COL);
}
else printf(“The maze does not have a path\n”);
}

*Program 3.8:Maze search function (p.117)

CHAPTER 3 105
Evaluation of Expressions
X=a/b-c+d*e-a*c

a = 4, b = c = 2, d = e = 3

Interpretation 1:
((4/2)-2)+(3*3)-(4*2)=0 + 8+9=1

Interpretation 2:
(4/(2-2+3))*(3-4)*2=(4/3)*(-1)*2=-2.66666…

How to generate the machine instructions corres


ponding to a given expression?
precedence rule + associative rule
CHAPTER 3 106
Token Operator Precedence1 Associativity

() function call 17 left-to-right


[] array element
-> . struct or union member
-- ++ increment, decrement2 16 left-to-right

-- ++ decrement, increment3 15 right-to-left


! logical not
- one’s complement
-+ unary minus or plus
&* address or indirection
sizeof size (in bytes)
(type) type cast 14 right-to-left

*/% mutiplicative 13 Left-to-right

CHAPTER 3 107
+- binary add or subtract 12 left-to-right

<< >> shift 11 left-to-right

> >= relational 10 left-to-right


< <=
== != equality 9 left-to-right

& bitwise and 8 left-to-right

^ bitwise exclusive or 7 left-to-right

bitwise or 6 left-to-right

&& logical and 5 left-to-right

 logical or 4 left-to-right
CHAPTER 3 108
?: conditional 3 right-to-left

= += -= assignment 2 right-to-left
/= *= %=
<<= >>=
&= ^= =

, comma 1 left-to-right

1.The precedence column is taken from Harbison and Steele.


2.Postfix form
3.prefix form

*Figure 3.12: Precedence hierarchy for C (p.119)

CHAPTER 3 109
user compiler
Infix Postfix
2+3*4 234*+
a*b+5 ab*5+
(1+2)*7 12+7*
a*b/c ab*c/
(a/(b-c+d))*(e-a)*c abc-d+/ea-*c*
a/b-c+d*e-a*c ab/c-de*ac*-

*Figure 3.13: Infix and postfix notation (p.120)

Postfix: no parentheses, no precedence

CHAPTER 3 110
Token Stack Top
[0] [1] [2]
6 6 0
2 6 2 1
/ 6/2 0
3 6/2 3 1
- 6/2-3 0
4 6/2-3 4 1
2 6/2-3 4 2 2
* 6/2-3 4*2 1
+ 6/2-3+4*2 0

*Figure 3.14: Postfix evaluation (p.120)


CHAPTER 3 111
Goal: infix --> postfix

Assumptions:
operators: +, -, *, /, %
operands: single digit integer

#define MAX_STACK_SIZE 100 /* maximum stack size */


#define MAX_EXPR_SIZE 100 /* max size of expression */
typedef enum{1paran, rparen, plus, minus, times, divide,
mod, eos, operand} precedence;
int stack[MAX_STACK_SIZE]; /* global stack */
char expr[MAX_EXPR_SIZE]; /* input string */

CHAPTER 3 112
int eval(void)
{
/* evaluate a postfix expression, expr, maintained as a
global variable, ‘\0’ is the the end of the expression.
The stack and top of the stack are global variables.
get_token is used to return the token type and
the character symbol. Operands are assumed to be single
character digits */
precedence token;
char symbol;
int op1, op2;
int n = 0; /* counter for the expression string */
int top = -1;
token = get_token(&symbol, &n);
while (token != eos) {
if (token == operand) exp: character array
add(&top, symbol-’0’); /* stack insert */
CHAPTER 3 113
else {
/* remove two operands, perform operation, and
return result to the stack */
op2 = delete(&top); /* stack delete */
op1 = delete(&top);
switch(token) {
case plus: add(&top, op1+op2); break;
case minus: add(&top, op1-op2); break;
case times: add(&top, op1*op2); break;
case divide: add(&top, op1/op2); break;
case mod: add(&top, op1%op2);
}
}
token = get_token (&symbol, &n);
}
return delete(&top); /* return result */
} CHAPTER 3 114
*Program 3.9: Function to evaluate a postfix expression (p.122)
precedence get_token(char *symbol, int *n)
{
/* get the next token, symbol is the character
representation, which is returned, the token is
represented by its enumerated value, which
is returned in the function name */

*symbol =expr[(*n)++];
switch (*symbol) {
case ‘(‘ : return lparen;
case ’)’ : return rparen;
case ‘+’: return plus;
case ‘-’ : return minus;

CHAPTER 3 115
case ‘/’ : return divide;
case ‘*’ : return times;
case ‘%’ : return mod;
case ‘\0‘ : return eos;
default : return operand;
/* no error checking, default is operand */
}
}

*Program 3.10: Function to get a token from the input string (p.123)

CHAPTER 3 116
Infix to Postfix Conversion
(Intuitive Algorithm)
(1) Fully parenthesize expression
a / b - c + d * e - a * c -->
((((a / b) - c) + (d * e)) - a * c))

(2) All operators replace their corresponding right


parentheses.
((((a / b) - c) + (d * e)) - a * c))

/ - *+ * -
(3) Delete all parentheses.
ab/c-de*+ac*-
two passes
CHAPTER 3 117
The orders of operands in infix and postfix are the same.
a + b * c, * > +
Token Stack Top Output
[0] [1] [2]
a -1 a
+ + 0 a
b + 0 ab
* + * 1 ab
c + * 1 abc
eos -1 abc*=
*Figure 3.15: Translation of a+b*c to postfix (p.124)

CHAPTER 3 118
a *1 (b +c) *2 d
Token Stack Top Output
[0] [1] [2]
a -1 a
*1 *1 0 a
( *1 ( 1 a
b *1 ( 1 ab
+ *1 ( + 2 ab
c *1 ( + 2 abc
) *1 match ) 0 abc+
*2 *2 *1 = *2 0 abc+*1
d *2 0 abc+*1d
eos *2 0 abc+*1d*2
CHAPTER 3 119
* Figure 3.16: Translation of a*(b+c)*d to postfix (p.124)
Rules
(1) Operators are taken out of the stack as long as their
in-stack precedence is higher than or equal to the
incoming precedence of the new operator.

(2) ( has low in-stack precedence, and high incoming


precedence.

( ) + - * / % eos
isp 0 19 12 12 13 13 13 0
icp 20 19 12 12 13 13 13 0

CHAPTER 3 120
precedence stack[MAX_STACK_SIZE];
/* isp and icp arrays -- index is value of precedence
lparen, rparen, plus, minus, times, divide, mod, eos */
static int isp [ ] = {0, 19, 12, 12, 13, 13, 13, 0};
static int icp [ ] = {20, 19, 12, 12, 13, 13, 13, 0};
isp: in-stack precedence
icp: incoming precedence

CHAPTER 3 121
void postfix(void)
{
/* output the postfix of the expression. The expression
string, the stack, and top are global */
char symbol;
precedence token;
int n = 0;
int top = 0; /* place eos on stack */
stack[0] = eos;
for (token = get _token(&symbol, &n); token != eos;
token = get_token(&symbol, &n)) {
if (token == operand)
printf (“%c”, symbol);
else if (token == rparen ){

CHAPTER 3 122
/*unstack tokens until left parenthesis */
while (stack[top] != lparen)
print_token(delete(&top));
delete(&top); /*discard the left parenthesis */
}
else{
/* remove and print symbols whose isp is greater
than or equal to the current token’s icp */
while(isp[stack[top]] >= icp[token] )
print_token(delete(&top)); f(n)=(g(n)) iff there exist
add(&top, token); positive
} constants c1, c2, and n0 such
} that c1g(n)f(n)c2g(n) for all
while ((token = delete(&top)) !=n,eos)
nn0.
print_token(token); f(n)=(g(n)) iff g(n) is both an
print(“\n”); upper and lower bound on f(n).
} (n) CHAPTER 3 123
*Program 3.11: Function to convert from infix to postfix (p.126)
Infix Prefix

a*b/c /*abc
a/b-c+d*e-a*c -+-/abc*de*ac
a*(b+c)/d-g -/*a+bcdg

(1) evaluation
(2) transformation
*Figure 3.17: Infix and postfix expressions (p.127)

CHAPTER 3 124
Multiple stacks and queues
Two stacks

m[0], m[1], …, m[n-2], m[n-1]

bottommost bottommost
stack 1 stack 2

More than two stacks (n)


memory is divided into n equal segments
boundary[stack_no]
0  stack_no < MAX_STACKS
top[stack_no]
0  stack_no < MAX_STACKS
CHAPTER 3 125
Initially, boundary[i]=top[i].

0 1 [ m/n ] 2[ m/n ] m-1

boundary[ 0] boundary[1] boundary[ 2] boundary[n]


top[ 0] top[ 1] top[ 2]

All stacks are empty and divided into roughly equal segments.

*Figure 3.18: Initial configuration for n stacks in memory [m]. (p.129)


CHAPTER 3 126
#define MEMORY_SIZE 100 /* size of memory */
#define MAX_STACK_SIZE 100
/* max number of stacks plus 1 */
/* global memory declaration */
element memory[MEMORY_SIZE];
int top[MAX_STACKS];
int boundary[MAX_STACKS];
int n; /* number of stacks entered by the user */
*(p.128)

top[0] = boundary[0] = -1;


for (i = 1; i < n; i++)
top[i] =boundary[i] =(MEMORY_SIZE/n)*i;
boundary[n] = MEMORY_SIZE-1;
*(p.129)

CHAPTER 3 127
void add(int i, element item)
{
/* add an item to the ith stack */
if (top[i] == boundary [i+1])
stack_full(i); may have unused storage
memory[++top[i]] = item;
}
*Program 3.12:Add an item to the stack stack-no (p.129)

element delete(int i)
{
/* remove top element from the ith stack */
if (top[i] == boundary[i])
return stack_empty(i);
return memory[top[i]--];
}
CHAPTER 3 128
*Program 3.13:Delete an item from the stack stack-no (p.130)
Find j, stack_no < j < n ( 往右 )
such that top[j] < boundary[j+1]
or, 0  j < stack_no ( 往左 )

b[0] t[0] b[1] t[1] b[i] t[i] t[i+1] t[j] b[j+1] b[n]
b[i+1] b[i+2]
meet
b=boundary, t=top
往左或右找一個空間

*Figure 3.19: Configuration when stack i meets stack i+1,


but the memory is not full (p.130)

CHAPTER 3 129
CHAPTER 4

LISTS

All the programs in this file are selected from


Ellis Horowitz, Sartaj Sahni, and Susan Anderson-Freed
“Fundamentals of Data Structures in C”,
Computer Science Press.

CHAPTER 4 130
Introduction
 Array
successive items locate a fixed distance
 disadvantage
– data movements during insertion and deletion
– waste space in storing n ordered lists of varying
size
 possible solution
– linked list
CHAPTER 4 131
4.1.1 Pointer Can Be Dangerous

pointer
int i, *pi;
pi = &i; i=10 or *pi=10
pi= malloc(size of(int));
/* assign to pi a pointer to int */
pf=(float *) pi;
/* casts an int pointer to a float pointer */
CHAPTER 4 132
4.1.2 Using Dynamically Allocated Storage

int i, *pi;
float f, *pf;
pi = (int *) malloc(sizeof(int));
pf = (float *) malloc (sizeof(float)); request memory
*pi =1024;
*pf =3.14;
printf(”an integer = %d, a float = %f\n”, *pi, *pf);
free(pi);
free(pf);
return memory

*Program4.1:Allocation and deallocation of pointers (p.138)


CHAPTER 4 133
4.2 SINGLY LINKED LISTS

bat  cat  sat  vat NULL

*Figure 4.1: Usual way to draw a linked list (p.139)

CHAPTER 4 134
Insertion

bat  cat  sat  vat NULL

mat 

*Figure 4.2: Insert mat after cat (p.140)

CHAPTER 4 135
bat  cat  mat  sat  vat NULL

dangling
reference

*Figure 4.3: Delete mat from list (p.140)

CHAPTER 4 136
Example 4.1: create a linked list of words
Declaration
typedef struct list_node, *list_pointer;
typedef struct list_node {
char data [4];
list_pointer link;
};
Creation
list_pointer ptr =NULL;
Testing
#define IS_EMPTY(ptr) (!(ptr))
Allocation
ptr=(list_pointer) malloc (sizeof(list_node));
CHAPTER 4 137
e -> name  (*e).name
strcpy(ptr -> data, “bat”);
ptr -> link = NULL;

address of ptr data ptr link


first node
 b a t \0 NULL
ptr

*Figure 4.4:Referencing the fields of a node(p.142)

CHAPTER 4 138
Example: create a two-node list
ptr

10  20 NULL

typedef struct list_node *list_pointer;


typedef struct list_node {
int data;
list_pointer link;
};
list_pointer ptr =NULL

Example 4.2: (p.142)


CHAPTER 4 139
list_pointer create2( )
{
/* create a linked list with two nodes */
list_pointer first, second;
first = (list_pointer) malloc(sizeof(list_node));
second = ( list_pointer) malloc(sizeof(list_node));
second -> link = NULL;
second -> data = 20; ptr
first -> data = 10;
first ->link = second;
10  20 NULL
return first;
} *Program 4.2:Create a tow-node list (p.143)

CHAPTER 4 140
Pointer Review (1)
int i, *pi;
1000 2000
i ? pi ?

pi = &i;
i 1000 2000
*pi ? pi 1000

i = 10 or *pi = 10
i 1000 2000
*pi 10 pi 1000

CHAPTER 4 141
Pointer Review (2)
typedef struct list_node *list_pointer;
typedef struct list_node {
int data;
list_pointer link;
}
list_pointer ptr = NULL;

1000
ptr NULL ptr->data(*ptr).data

ptr = malloc(sizeof(list_node));
1000 2000 *ptr
ptr 2000
data link
CHAPTER 4 142
Pointer Review (3)
void delete(list_pointer *ptr, list_pointer trail, list_pinter node)

ptr: a pointer point to a pointer point to a list node

1000 2000 3000


ptr 2000 3000
*ptr (*ptr)->link

trail (node): a pointer point to a list node

1000 3000 *trail


trail 3000
data link
trail->link(*trail).link
CHAPTER 4 143
Pointer Review (4)
element delete(stack_pointer *top)
top
delete(&st) vs. delete(st) st 200 300
top
200 300 200 300
400 400

.
st top .
300 300 .
400
Does not change.

CHAPTER 4 144
List Insertion:

Insert a node after a specific node

void insert(list_pointer *ptr, list_pointer node)


{
/* insert a new node with data = 50 into the list ptr after node */
list_pointer temp;
temp = (list_pointer) malloc(sizeof(list_node));
if (IS_FULL(temp)){
fprintf(stderr, “The memory is full\n”);
exit (1);
}

CHAPTER 4 145
temp->data = 50;
if (*ptr) { noempty list
temp->link =node ->link;
node->link = temp; ptr
}
else { empty list 10  20 NULL
temp->link = NULL; node
*ptr =temp;
} 50 
}
temp
*Program 4.3:Simple insert into front of list (p.144)

CHAPTER 4 146
ptr node trail = NULL ptr

10  50  20 NULL 50  20 NULL

(a) before deletion (b)after deletion

*Figure 4.7: List after the function call Delete(&ptr,NULL.ptr);(p.145)

CHAPTER 4 147
List Deletion
Delete the first node.
ptr trail node ptr

10  50  20 NULL 50  20 NULL

(a) before deletion (b)after deletion

Delete node other than the first node.

ptr trail node ptr

10  50  20 NULL 10  20 NULL

CHAPTER 4 148
void delete(list_pointer *ptr, list_pointer trail,

list_pointer node)
{
/* delete node from the list, trailtrail
is the preceding
node node
ptr is the head of the list */
10  50  20 NULL
if (trail)
trail->link = node->link;
10  20 NULL
else
*ptr = (*ptr) ->link;
free(node);
ptr node ptr
}
10  50  20 NULL 50  20 NULL
CHAPTER 4 149
Print out a list (traverse a list)
void print_list(list_pointer ptr)
{
printf(“The list ocntains: “);
for ( ; ptr; ptr = ptr->link)
printf(“%4d”, ptr->data);
printf(“\n”);
}

*Program 4.5: Printing a list (p.146)

CHAPTER 4 150
4.3 DYNAMICALLY LINKED STACKS AND QUEUES

top
element link
     NULL
(a) Linked Stack

front rear
element link
     NULL
(b) Linked queue

*Figure 4.10: Linked Stack and queue (p.147)

CHAPTER 4 151
Represent n stacks
#define MAX_STACKS 10 /* maximum number of stacks */
typedef struct {
int key;
/* other fields */
} element;
typedef struct stack *stack_pointer;
typedef struct stack {
element item;
stack_pointer link;
};
stack_pointer top[MAX_STACKS];

CHAPTER 4 152
Represent n queues
#define MAX_QUEUES 10 /* maximum number of queues */
typedef struct queue *queue_pointer;
typedef struct queue {
element item;
queue_pointer link;
};
queue_pointer front[MAX_QUEUE], rear[MAX_QUEUES];

CHAPTER 4 153
Push in the linked stack
void add(stack_pointer *top, element item)
{
/* add an element to the top of the stack */
stack_pointer temp =
(stack_pointer) malloc (sizeof (stack));
if (IS_FULL(temp)) {
fprintf(stderr, “ The memory is full\n”);
exit(1);
}
temp->item = item;
temp->link = *top;
*top= temp; *Program 4.6:Add to a linked stack
(p.149)
} CHAPTER 4 154
pop from the linked stack
element delete(stack_pointer *top) {
/* delete an element from the stack */
stack_pointer temp = *top;
element item;
if (IS_EMPTY(temp)) {
fprintf(stderr, “The stack is empty\n”);
exit(1);
}
item = temp->item;
*top = temp->link;
free(temp);
return item;
}
CHAPTER 4 155
*Program 4.7: Delete from a linked stack (p.149)
enqueue in the linked queue
void addq(queue_pointer *front, queue_pointer *rear, element item)
{ /* add an element to the rear of the queue */
queue_pointer temp =
(queue_pointer) malloc(sizeof (queue));
if (IS_FULL(temp)) {
fprintf(stderr, “ The memory is full\n”);
exit(1);
}
temp->item = item;
temp->link = NULL;
if (*front) (*rear) -> link = temp;
else *front = temp;
*rear = temp; }

CHAPTER 4 156
dequeue from the linked queue (similar to push)
element deleteq(queue_pointer *front) {
/* delete an element from the queue */
queue_pointer temp = *front;
element item;
if (IS_EMPTY(*front)) {
fprintf(stderr, “The queue is empty\n”);
exit(1);
}
item = temp->item;
*front = temp->link;
free(temp);
return item;
}

CHAPTER 4 157
Polynomials
A( x )  a m 1 x e m1
 a m 2 x e m 2
... a 0 x e 0

Representation

typedef struct poly_node *poly_pointer;


typedef struct poly_node {
int coef;
int expon;
poly_pointer link;
};
poly_pointer a, b, c;

coef expon link

CHAPTER 4 158
Examples

a  3x  2 x  1
14 8

a
3 14 2 8 1 0 null

b  8 x 14  3x 10  10 x 6
b
8 14 -3 10 10 6 null

CHAPTER 4 159
Adding Polynomials

3 14 2 8 1 0
a
8 14 -3 10 10 6
b
11 14 a->expon == b->expon
d

3 14 2 8 1 0
a
8 14 -3 10 10 6
b
11 14 -3 10 a->expon < b->expon
d
CHAPTER 4 160
Adding Polynomials (Continued)

3 14 2 8 1 0
a
8 14 -3 10 10 6
b
11 14 -3 10 2 8
d
a->expon > b->expon

CHAPTER 4 161
Alogrithm for Adding Polynomials
poly_pointer padd(poly_pointer a, poly_pointer b)
{
poly_pointer front, rear, temp;
int sum;
rear =(poly_pointer)malloc(sizeof(poly_node));
if (IS_FULL(rear)) {
fprintf(stderr, “The memory is full\n”);
exit(1);
}
front = rear;
while (a && b) {
switch (COMPARE(a->expon, b->expon)) {

CHAPTER 4 162
case -1: /* a->expon < b->expon */
attach(b->coef, b->expon, &rear);
b= b->link;
break;
case 0: /* a->expon == b->expon */
sum = a->coef + b->coef;
if (sum) attach(sum,a->expon,&rear);
a = a->link; b = b->link;
break;
case 1: /* a->expon > b->expon */
attach(a->coef, a->expon, &rear);
a = a->link;
}
}
for (; a; a = a->link)
attach(a->coef, a->expon, &rear);
for (; b; b=b->link)
attach(b->coef, b->expon, &rear);
rear->link = NULL;
temp = front; front = front->link; free(temp);
return front;
}
Delete extra initial node.

CHAPTER 4 163
Analysis
(1) coefficient additions
0  additions  min(m, n)
where m (n) denotes the number of terms in A (B).
(2) exponent comparisons
extreme case
em-1 > fm-1 > em-2 > fm-2 > … > e0 > f0
m+n-1 comparisons
(3) creation of new nodes
extreme case
m + n new nodes
summary O(m+n)

CHAPTER 4 164
Attach a Term
void attach(float coefficient, int exponent,
poly_pointer *ptr)
{
/* create a new node attaching to the node pointed to
by ptr. ptr is updated to point to this new node. */
poly_pointer temp;
temp = (poly_pointer) malloc(sizeof(poly_node));
if (IS_FULL(temp)) {
fprintf(stderr, “The memory is full\n”);
exit(1);
}
temp->coef = coefficient;
temp->expon = exponent;
(*ptr)->link = temp;
*ptr = temp;
}

CHAPTER 4 165
A Suite for Polynomials
e(x) = a(x) * b(x) + d(x)
poly_pointer a, b, d, e;
read_poly()
...
a = read_poly(); print_poly()
b = read_poly(); padd()
d = read_poly();
temp = pmult(a, b); psub()
e = padd(temp, d); pmult()
print_poly(e);

temp is used to hold a partial result.


By returning the nodes of temp, we
may use it to hold other polynomials

CHAPTER 4 166
Erase Polynomials
void earse(poly_pointer *ptr)
{
/* erase the polynomial pointed to by ptr */
poly_pointer temp;
while (*ptr) {
temp = *ptr;
*ptr = (*ptr)->link;
free(temp);
}
}

O(n)

CHAPTER 4 167
Circularly Linked Lists
circular list vs. chain

ptr
3 14 2 8 1 0

avail
ptr

temp

avail ...

CHAPTER 4 168
Maintain an Available List
poly_pointer get_node(void)
{
poly_pointer node;
if (avail) {
node = avail;
avail = avail->link:
}
else {
node = (poly_pointer)malloc(sizeof(poly_node));
if (IS_FULL(node)) {
printf(stderr, “The memory is full\n”);
exit(1);
}
}
return node;
}

CHAPTER 4 169
Maintain an Available List (Continued)
Insert ptr to the front of this list

void ret_node(poly_pointer ptr)


{
ptr->link = avail;
avail = ptr;
} Erase a circular list (see next page)
void cerase(poly_pointer *ptr)
{
poly_pointer temp;
if (*ptr) {
temp = (*ptr)->link;
(*ptr)->link = avail; (1)
avail = temp; (2)
*ptr = NULL;
}
}

Independent of # of nodes in a list O(1) constant time


CHAPTER 4 170
4.4.4 Representing Polynomials As Circularly Linked Lists

avail
(2)

     
ptr
(1) temp

     NULL

avail

*Figure 4.14: Returning a circular list to the avail list (p.159)

CHAPTER 4 171
Head Node

Represent polynomial as circular list.


(1) zero

a -1

Zero polynomial
(2) others
a
-1 3 14 2 8 1 0

a  3x  2 x  1
14 8

CHAPTER 4 172
Another Padd

poly_pointer cpadd(poly_pointer a, poly_pointer b)


{
poly_pointer starta, d, lastd;
int sum, done = FALSE;
starta = a;
a = a->link; Set expon field of head node to -1.
b = b->link;
d = get_node();
d->expon = -1; lastd = d;
do {
switch (COMPARE(a->expon, b->expon)) {
case -1: attach(b->coef, b->expon, &lastd);
b = b->link;
break;

CHAPTER 4 173
Another Padd (Continued)

case 0: if (starta == a) done = TRUE;


else {
sum = a->coef + b->coef;
if (sum) attach(sum,a->expon,&lastd);
a = a->link; b = b->link;
}
break;
case 1: attach(a->coef,a->expon,&lastd);
a = a->link;
}
} while (!done);
lastd->link = d;
return d; Link last node to first
}

CHAPTER 4 174
Additional List Operations

typedef struct list_node *list_pointer;


typedef struct list_node {
char data;
list_pointer link;
};

Invert single linked lists


Concatenate two linked lists

CHAPTER 4 175
Invert Single Linked Lists
Use two extra pointers: middle and trail.

list_pointer invert(list_pointer lead)


{
list_pointer middle, trail;
middle = NULL;
while (lead) {
trail = middle;
middle = lead;
lead = lead->link;
middle->link = trail;
}
return middle;
}

0: null
1: lead ...
2: lead

CHAPTER 4 176
Concatenate Two Lists

list_pointer concatenate(list_pointer
ptr1, list_pointer ptr2)
{
list_pointer temp;
if (IS_EMPTY(ptr1)) return ptr2;
else {
if (!IS_EMPTY(ptr2)) {
for (temp=ptr1;temp->link;temp=temp->link);
temp->link = ptr2;
}
return ptr1;
}
}
O(m) where m is # of elements in the first list

CHAPTER 4 177
4.5.2 Operations For Circularly Linked List

What happens when we insert a node to the front of a circular


linked list?

a X1  X2  X3 

Problem: move down the whole list.

*Figure 4.16: Example circular list (p.165)

CHAPTER 4 178
A possible solution:

X1  X2  X3  a

Note a pointer points to the last node.

*Figure 4.17: Pointing to the last node of a circular list (p.165)

CHAPTER 4 179
Operations for Circular Linked Lists
void insert_front (list_pointer *ptr, list_pointer
node)
{
if (IS_EMPTY(*ptr)) {
*ptr= node;
node->link = node;
}
else {
node->link = (*ptr)->link; (1)
(*ptr)->link = node; (2)
}
}

(2) X1  X2  X3  ptr
(1) CHAPTER 4 180
Length of Linked List

int length(list_pointer ptr)


{
list_pointer temp;
int count = 0;
if (ptr) {
temp = ptr;
do {
count++;
temp = temp->link;
} while (temp!=ptr);
}
return count;
}

CHAPTER 4 181
Equivalence Relations

A relation over a set, S, is said to be an equivalence relation over S iff it is


symmertric, reflexive, and transitive over S.
reflexive, x=x
symmetric, if x=y, then y=x
transitive, if x=y and y=z, then x=z

CHAPTER 4 182
Examples

0=4, 3=1, 6=10, 8=9, 7=4,


6=8, 3=5, 2=11, 11=1

three equivalent classes


{0,2,4,7,11}; {1,3,5}; {6,8,9,10}

CHAPTER 4 183
A Rough Algorithm to
Find Equivalence Classes
void equivalenec()
{
initialize;
while (there are more pairs) {
read the next pair <i,j>;
process this pair;
1
Phase

}
initialize the output;
do {
output a new equivalence class;
} while (not done);
}
2
Phase

What kinds of data structures are adopted?

CHAPTER 4 184
First Refinement

#include <stdio.h>
#include <alloc.h>
#define MAX_SIZE 24
#define IS_FULL(ptr) (!(ptr))
#define FALSE 0
#define TRUE 1
void equivalence()
{
initialize seq to NULL and out to TRUE
while (there are more pairs) {
read the next pair, <i,j>;
put j on the seq[i] list;
put i on the seq[j] list;
}
for (i=0; i<n; i++)
if (out[i]) { direct equivalence
out[i]= FALSE;
output this equivalence class;
}
} Compute indirect equivalence
using transitivity

CHAPTER 4 185
Lists After Pairs are input

[0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11]
0  4 seq
31
6  10
89 11 3 11 5 7 3 8 4 6 8 6 0
74 NULL NULL NULL NULL NULL NULL
68
35
4 1 0 10 9 2
2  11
11  0 NULL NULL NULL NULL NULL NULL

typedef struct node *node_pointer ;


typedef struct node {
int data;
node_pointer link;
}; CHAPTER 4 186
Final Version for
Finding Equivalence Classes
void main(void)
{
short int out[MAX_SIZE];
node_pointer seq[MAX_SIZE];
node_pointer x, y, top;
int i, j, n;
printf(“Enter the size (<= %d) “, MAX_SIZE);
scanf(“%d”, &n);
for (i=0; i<n; i++) {
out[i]= TRUE; seq[i]= NULL;
}
printf(“Enter a pair of numbers (-1 -1 to quit): “);
scanf(“%d%d”, &i, &j);
Phase 1: input the equivalence pairs:

CHAPTER 4 187
while (i>=0) {
x = (node_pointer) malloc(sizeof(node));
if (IS_FULL(x))
fprintf(stderr, “memory is full\n”);
exit(1);
} Insert x to the top of lists seq[i]
x->data= j; x->link= seq[i]; seq[i]= x;
if (IS_FULL(x))
fprintf(stderr, “memory is full\n”);
exit(1);
} Insert x to the top of lists seq[j]
x->data= i; x->link= seq[j]; seq[j]= x;
printf(“Enter a pair of numbers (-1 -1 to \
quit): “);
scanf(“%d%d”, &i, &j);
}

CHAPTER 4 188
Phase 2: output the equivalence classes
for (i=0; i<n; i++) {
if (out[i]) {
printf(“\nNew class: %5d”, i);
out[i]= FALSE;
x = seq[i]; top = NULL;
for (;;) {
while (x) {
j = x->data;
if (out[j]) {
printf(“%5d”, j); push
out[j] = FALSE;
y = x->link; x->link = top;
top = x; x = y;
}
else x = x->link;
}
if (!top) break; pop
x = seq[top->data]; top = top->link;
}
}
}

CHAPTER 4 189
4.7 Sparse Matrices

0 0 11 0 
12 0 0 0 
 
0 4 0 0 
 
 0 0 0  15
inadequates of sequential schemes
(1) # of nonzero terms will vary after some matrix computation
(2) matrix just represents intermediate results

new scheme
Each column (row): a circular linked list with a head node

CHAPTER 4 190
Revisit Sparse Matrices
# of head nodes = max{# of rows, # of columns}
素連 down head right 連同一列元素
head node 同

行 next

down entry row col right


entry node
value

entry i j
aij
aij

CHAPTER 4 191
Linked Representation for Matrix

4 4

0 2
11

1 0 1 1
12 5

2 1
-4

3 3
-15
Circular linked list CHAPTER 4 192
#define MAX_SIZE 50 /* size of largest matrix */
typedef enum {head, entry} tagfield;
typedef struct matrix_node *matrix_pointer;
typedef struct entry_node {
int row;
int col;
int value;
};
typedef struct matrix_node {
matrix_pointer down;
matrix_pointer right;
tagfield tag;
CHAPTER 4 193
union {
matrix_pointer next;
entry_node entry;
} u;
};
matrix_pointer hdnode[MAX_SIZE];

CHAPTER 4 194
[0] [1] [2]
[0] 4 4 4
[1] 0 2 11
[2] 1 0 12
[3] 2 1 -4
[4] 3 3 -15
*Figure 4.22: Sample input for sparse matrix (p.174)

CHAPTER 4 195
Read in a Matrix

matrix_pointer mread(void)
{
/* read in a matrix and set up its linked
list. An global array hdnode is used */
int num_rows, num_cols, num_terms;
int num_heads, i;
int row, col, value, current_row;
matrix_pointer temp, last, node;
printf(“Enter the number of rows, columns
and number of nonzero terms: “);

CHAPTER 4 196
scanf(“%d%d%d”, &num_rows, &num_cols,
&num_terms);
num_heads =
(num_cols>num_rows)? num_cols : num_rows;
/* set up head node for the list of head
nodes */
node = new_node(); node->tag = entry;
node->u.entry.row = num_rows;
node->u.entry.col = num_cols;
if (!num_heads) node->right = node;
else { /* initialize the head nodes */
for (i=0; i<num_heads; i++) {
term= new_node();
hdnode[i] = temp;
hdnode[i]->tag = head;
hdnode[i]->right = temp;
hdnode[i]->u.next = temp;
} O(max(n,m))

CHAPTER 4 197
current_row= 0; last= hdnode[0];
for (i=0; i<num_terms; i++) {
printf(“Enter row, column and value:”);
scanf(“%d%d%d”, &row, &col, &value);
if (row>current_row) {
last->right= hdnode[current_row];
current_row= row; last=hdnode[row];
}
temp = new_node(); ...
temp->tag=entry; temp->u.entry.row=row;
temp->u.entry.col = col;
temp->u.entry.value = value;
last->right = temp;/*link to row list */
last= temp;
/* link to column list */
hdnode[col]->u.next->down = temp;
hdnode[col]=>u.next = temp;
}

利用 next field 存放 column 的 last node

CHAPTER 4 198
/*close last row */
last->right = hdnode[current_row];
/* close all column lists */
for (i=0; i<num_cols; i++)
hdnode[i]->u.next->down = hdnode[i];
/* link all head nodes together */
for (i=0; i<num_heads-1; i++)
hdnode[i]->u.next = hdnode[i+1];
hdnode[num_heads-1]->u.next= node;
node->right = hdnode[0];
}
return node;
}

O(max{#_rows, #_cols}+#_terms)

CHAPTER 4 199
Write out a Matrix

void mwrite(matrix_pointer node)


{ /* print out the matrix in row major form */
int i;
matrix_pointer temp, head = node->right;
printf(“\n num_rows = %d, num_cols= %d\n”,
node->u.entry.row,node->u.entry.col);
printf(“The matrix by row, column, and
value:\n\n”); O(#_rows+#_terms)
for (i=0; i<node->u.entry.row; i++) {
for (temp=head->right;temp!=head;temp=temp->right)
printf(“%5d%5d%5d\n”, temp->u.entry.row,
temp->u.entry.col, temp->u.entry.value);
head= head->u.next; /* next row */
}
}

CHAPTER 4 200
Free the entry and head nodes by row.
Erase a Matrix
void merase(matrix_pointer *node)
{
int i, num_heads;
matrix_pointer x, y, head = (*node)->right;
for (i=0; i<(*node)->u.entry.row; i++) {
y=head->right;
while (y!=head) {
x = y; y = y->right; free(x);
}
x= head; head= head->u.next; free(x);
}
y = head;
while (y!=*node) {
x = y; y = y->u.next; free(x);
}
free(*node); *node = NULL;
}
O(#_rows+#_cols+#_terms)

Alternative: 利用 Fig 4.14 的技巧,把一列資料 erase (constant time)


CHAPTER 4 201
Doubly Linked List
Move in forward and backward direction.

Singly linked list (in one direction only)


How to get the preceding node during deletion or insertion?
Using 2 pointers

Node in doubly linked list


left link field (llink)
data field (item)
right link field (rlink)

CHAPTER 4 202
Doubly Linked Lists

typedef struct node *node_pointer;


typedef struct node {
node_pointer llink; ptr
element item; = ptr->rlink->llink
node_pointer rlink; = ptr->llink->rlink
}

head node

llink item rlink


CHAPTER 4 203
ptr

*Figure 4.24:Empty doubly linked circular list with head node (p.180)

CHAPTER 4 204
node node

 

newnode

*Figure 4.25: Insertion into an empty doubly linked circular list (p.181)

CHAPTER 4 205
Insert

void dinsert(node_pointer node, node_pointer newnode)


{
(1) newnode->llink = node;
(2) newnode->rlink = node->rlink;
(3) node->rlink->llink = newnode;
(4) node->rlink = newnode;
}
head node

llink item rlink


(1) (3)
(4) (2)

CHAPTER 4 206
Delete

void ddelete(node_pointer node, node_pointer


deleted)
{
if (node==deleted) printf(“Deletion of head node
not permitted.\n”);
else {
(1) deleted->llink->rlink= deleted->rlink;
(2) deleted->rlink->llink= deleted->llink;
free(deleted);
}
}
head node

(1)
llink item rlink
(2)
CHAPTER 4 207
CHAPTER 5

Trees
All the programs in this file are selected from
Ellis Horowitz, Sartaj Sahni, and Susan Anderson-Freed
“Fundamentals of Data Structures in C”,
Computer Science Press.
Trees
Root
Dusty

Honey Bear Brandy

Brunhilde Terry Coyote Nugget

Gill Tansey Tweed Zoe Crocus Primrose Nous Belle

leaf

CHAPTER 5 209
Definition of Tree
 A tree is a finite set of one or more nodes
such that:
 There is a specially designated node called
the root.
 The remaining nodes are partitioned into n>=0
disjoint sets T1, ..., Tn, where each of these
sets is a tree.
 We call T1, ..., Tn the subtrees of the root.

CHAPTER 5 210
Level and Depth
Level
node (13)
degree of a node
leaf (terminal) A 1
nonterminal 3 1
parent
B C D 2
children 2 2 1 2 3 2
sibling
degree of a tree (3) E F G H I J
ancestor 2 30 30 31 30 30 33
level of a node
height of a tree (4) K L M 4
0 40 4 0 4

CHAPTER 5 211
Terminology
 The degree of a node is the number of subtrees
of the node
– The degree of A is 3; the degree of C is 1.
 The node with degree 0 is a leaf or terminal
node.
 A node that has subtrees is the parent of the
roots of the subtrees.
 The roots of these subtrees are the children of
the node.
 Children of the same parent are siblings.
 The ancestors of a CHAPTER
node 5are all the nodes 212
along the path from the root to the node.
Representation of Trees
 List Representation
– ( A ( B ( E ( K, L ), F ), C ( G ), D ( H ( M ), I, J ) ) )
– The root comes first, followed by a list of sub-trees

data link 1 link 2 ... link n

How many link fields are


needed in such a representation?

CHAPTER 5 213
Left Child - Right Sibling
data
A left child right sibling

B C D

E F G H I J

K L M

CHAPTER 5 214
Binary Trees
 A binary tree is a finite set of nodes that is
either empty or consists of a root and two
disjoint binary trees called the left subtree
and the right subtree.
 Any tree can be transformed into binary tree.
– by left child-right sibling representation
 The left subtree and the right subtree are
distinguished.
CHAPTER 5 215
*Figure 5.6: Left child-right child tree representation of a tree (p.191)

E C

F G D
K

L H

M I

J
Abstract Data Type Binary_Tree
structure Binary_Tree(abbreviated BinTree) is
objects: a finite set of nodes either empty or
consisting of a root node, left Binary_Tree,
and right Binary_Tree.
functions:
for all bt, bt1, bt2  BinTree, item  element
Bintree Create()::= creates an empty binary tree
Boolean IsEmpty(bt)::= if (bt==empty binary
tree) return TRUE else return FALSE
CHAPTER 5 217
BinTree MakeBT(bt1, item, bt2)::= return a binary tree
whose left subtree is bt1, whose right subtree is bt2,
and whose root node contains the data item
Bintree Lchild(bt)::= if (IsEmpty(bt)) return error
else return the left subtree of bt
element Data(bt)::= if (IsEmpty(bt)) return error
else return the data in the root node of bt
Bintree Rchild(bt)::= if (IsEmpty(bt)) return error
else return the right subtree of bt

CHAPTER 5 218
Samples of Trees
Complete Binary Tree

A A 1 A

B B 2 B C

C Skewed Binary Tree


3 D E F G
D

4 H I
E 5
CHAPTER 5 219
Maximum Number of Nodes in BT
 The maximum number of nodes on level i of a
i-1
binary tree is 2i-1, i>=1.
 The maximum nubmer of nodes in a binary tree
of depth k is 2k-1, k>=1.

Prove by induction.
k
 2 i 1
 2 k
1
i 1

CHAPTER 5 220
Relations between Number of
Leaf Nodes and Nodes of Degree 2
For any nonempty binary tree, T, if n0 is the
number of leaf nodes and n2 the number of nodes
of degree 2, then n0=n2+1
proof:
Let n and B denote the total number of nodes &
branches in T.
Let n0, n1, n2 represent the nodes with no children
single child, and two children respectively.
n= n0+n1+n2, B+1=n, B=n1+2n2 ==> n1+2n2+1= n
CHAPTER 5 221
n1+2n2+1= n0+n1+n2 ==> n0=n2+1
Full BT VS Complete BT
 A full binary tree of depth k is a binary tree of
k
depth k having 2 -1 nodes, k>=0.
 A binary tree with n nodes and depth k is
complete iff its nodes correspond to the nodes
numbered from 1 to n in the full binary tree of
depth k. 由上至下,
A A
由左至右編號
B C B C

D E F G D E F G

H I J K L M N O
H I
Complete binary tree CHAPTER 5
Full binary tree of depth 4
222
Binary Tree Representations
 If a complete binary tree with n nodes (depth =
log n + 1) is represented sequentially, then for
any node with index i, 1<=i<=n, we have:
– parent(i) is at i/2 if i!=1. If i=1, i is at the root and
has no parent.
– left_child(i) ia at 2i if 2i<=n. If 2i>n, then i has no
left child.
– right_child(i) ia at 2i+1 if 2i +1 <=n. If 2i +1 >n,
then i has no right child.
CHAPTER 5 223
[1]
Sequential Representation [2]
A
B
[3] C
(1) waste space
(2) insertion/deletion [4] D
A [1] A [5] E
problem
[2] B [6] F
[3] -- [7]
B G
[4] C [8]
A H
[5] -- [9]
C I
[6] --
[7] -- B C
D [8] D
[9] --
. . D E F G
E
[16] E
CHAPTER 5 224
H I
Linked Representation
typedef struct node *tree_pointer;
typedef struct node {
int data;
tree_pointer left_child, right_child;
};
data

left_child data right_child


left_child right_child

CHAPTER 5 225
Binary Tree Traversals
 Let L, V, and R stand for moving left, visiting
the node, and moving right.
 There are six possible combinations of traversal
– LVR, LRV, VLR, VRL, RVL, RLV
 Adopt convention that we traverse left before
right, only 3 traversals remain
– LVR, LRV, VLR
– inorder, postorder, preorder
CHAPTER 5 226
Arithmetic Expression Using BT
+ inorder traversal
A/B*C*D+E
infix expression
* E
preorder traversal
+**/ABCDE
* D prefix expression
postorder traversal
AB/C*D*E+
/ C
postfix expression
level order traversal
A B +*E*D/CAB

CHAPTER 5 227
Inorder Traversal (recursive version)
void inorder(tree_pointer ptr)
/* inorder tree traversal */
{
A/B*C*D+E
if (ptr) {
inorder(ptr->left_child);
printf(“%d”, ptr->data);
indorder(ptr->right_child);
}
} CHAPTER 5 228
Preorder Traversal (recursive version)
void preorder(tree_pointer ptr)
/* preorder tree traversal */
{ +**/ABCDE
if (ptr) {
printf(“%d”, ptr->data);
preorder(ptr->left_child);
predorder(ptr->right_child);
}
} CHAPTER 5 229
Postorder Traversal (recursive version)
void postorder(tree_pointer ptr)
/* postorder tree traversal */
{ AB/C*D*E+
if (ptr) {
postorder(ptr->left_child);
postdorder(ptr->right_child);
printf(“%d”, ptr->data);
}
} CHAPTER 5 230
Iterative Inorder Traversal
(using stack)
void iter_inorder(tree_pointer node)
{
int top= -1; /* initialize stack */
tree_pointer stack[MAX_STACK_SIZE];
for (;;) {
for (; node; node=node->left_child)
add(&top, node);/* add to stack */
node= delete(&top);
/* delete from stack */
if (!node) break; /* empty stack */
printf(“%D”, node->data);
node = node->right_child;
}
} O(n) CHAPTER 5 231
Trace Operations of Inorder Traversal
Call of inorder Value in root Action Call of inorder Value in root Action
1 + 11 C
2 * 12 NULL
3 * 11 C printf
4 / 13 NULL
5 A 2 * printf
6 NULL 14 D
5 A printf 15 NULL
7 NULL 14 D printf
4 / printf 16 NULL
8 B 1 + printf
9 NULL 17 E
8 B printf 18 NULL
10 NULL 17 E printf
3 * printf 19 NULL

CHAPTER 5 232
Level Order Traversal
(using queue)

void level_order(tree_pointer ptr)


/* level order tree traversal */
{
int front = rear = 0;
tree_pointer queue[MAX_QUEUE_SIZE];
if (!ptr) return; /* empty queue */
addq(front, &rear, ptr);
for (;;) {
ptr = deleteq(&front, rear);

CHAPTER 5 233
if (ptr) {
printf(“%d”, ptr->data);
if (ptr->left_child)
addq(front, &rear,
ptr->left_child);
if (ptr->right_child)
addq(front, &rear,
ptr->right_child);
}
else break;
} +*E*D/CAB
}
CHAPTER 5 234
Copying Binary Trees
tree_poointer copy(tree_pointer original)
{
tree_pointer temp;
if (original) {
temp=(tree_pointer) malloc(sizeof(node));
if (IS_FULL(temp)) {
fprintf(stderr, “the memory is full\n”);
exit(1);
}
temp->left_child=copy(original->left_child);
temp->right_child=copy(original->right_child)
temp->data=original->data;
return temp;
} postorder
return NULL;
} CHAPTER 5 235
Equality of Binary Trees
the same topology and data

int equal(tree_pointer first, tree_pointer second)


{
/* function returns FALSE if the binary trees first and
second are not equal, otherwise it returns TRUE */

return ((!first && !second) || (first && second &&

(first->data == second->data) &&


equal(first->left_child, second->left_child) &&
equal(first->right_child, second->right_child)))
}

CHAPTER 5 236
Propositional Calculus Expression

 A variable is an expression.
 If x and y are expressions, then ¬x, xy,
xy are expressions.
 Parentheses can be used to alter the normal
order of evaluation (¬ >  > ).
 Example: x1  (x2  ¬x3)
 satisfiability problem: Is there an assignment to
make an expression true?
CHAPTER 5 237
(x1  ¬x2)  (¬ x1  x3)  ¬x3

(t,t,t) 
(t,t,f)
(t,f,t)  
(t,f,f)
(f,t,t)   X3
(f,t,f)
(f,f,t) X1   X3
(f,f,f)

2n possible combinations X2 X1
for n variables
postorder traversal (postfix evaluation)
node structure

left_child data value right_child

typedef emun {not, and, or, true, false } logical;


typedef struct node *tree_pointer;
typedef struct node {
tree_pointer list_child;
logical data;
short int value;
tree_pointer right_child;
};
First version of satisfiability algorithm

for (all 2n possible combinations) {


generate the next combination;
replace the variables by their values;
evaluate root by traversing it in postorder;
if (root->value) {
printf(<combination>);
return;
}
}
printf(“No satisfiable combination \n”);
Post-order-eval function
void post_order_eval(tree_pointer node)
{
/* modified post order traversal to evaluate a propositional
calculus tree */
if (node) {
post_order_eval(node->left_child);
post_order_eval(node->right_child);
switch(node->data) {
case not: node->value =
!node->right_child->value;
break;
case and: node->value =
node->right_child->value &&
node->left_child->value;
break;
case or: node->value =
node->right_child->value | |
node->left_child->value;
break;
case true: node->value = TRUE;
break;
case false: node->value = FALSE;
}
}
}
Threaded Binary Trees
 Two many null pointers in current representation
of binary trees
n: number of nodes
number of non-null links: n-1
total links: 2n
null links: 2n-(n-1)=n+1
 Replace these null pointers with some useful
“threads”.

CHAPTER 5 243
Threaded Binary Trees (Continued)
If ptr->left_child is null,
replace it with a pointer to the node that would be
visited before ptr in an inorder traversal

If ptr->right_child is null,
replace it with a pointer to the node that would be
visited after ptr in an inorder traversal

CHAPTER 5 244
A Threaded Binary Tree
root A
dangling
B C

dangling D E F G

inorder traversal:
H I H, D, I, B, E, A, F, C, G

CHAPTER 5 245
Data Structures for Threaded BT
left_thread left_child data right_child right_thread
TRUE   FALSE

TRUE: thread FALSE: child


typedef struct threaded_tree
*threaded_pointer;
typedef struct threaded_tree {
short int left_thread;
threaded_pointer left_child;
char data;
threaded_pointer right_child;
short int right_thread; };
Memory Representation of A Threaded BT

root --
f f

f A f

f B f f C f

f D f t E t t F t t G

t H t t I t

CHAPTER 5 247
Next Node in Threaded BT
threaded_pointer insucc(threaded_pointer
tree)
{
threaded_pointer temp;
temp = tree->right_child;
if (!tree->right_thread)
while (!temp->left_thread)
temp = temp->left_child;
return temp;
CHAPTER 5 248
}
Inorder Traversal of Threaded BT
void tinorder(threaded_pointer tree)
{
/* traverse the threaded binary tree
inorder */
threaded_pointer temp = tree;
for (;;) {
temp = insucc(temp);
O(n) if (temp==tree) break;
printf(“%3c”, temp->data);
}
CHAPTER 5 249
}
Inserting Nodes into Threaded BTs
 Insert child as the right child of node parent
– change parent->right_thread to FALSE
– set child->left_thread and child->right_thread
to TRUE
– set child->left_child to point to parent
– set child->right_child to parent->right_child
– change parent->right_child to point to child

CHAPTER 5 250
Examples
Insert a node D as a right child of B.
root root

A A
(1)
B parent B parent
(3)
child child
C D C D
(2)
empty
CHAPTER 5 251
*Figure 5.24: Insertion of child as a right child of parent in a threaded binary tree (p.217)

(3)

(2) (1)

(4)

nonempty
Right Insertion in Threaded BTs
void insert_right(threaded_pointer parent,
threaded_pointer child)
{
threaded_pointer temp;
(1) child->right_child = parent->right_child;
child->right_thread = parent->right_thread;
child->left_child = parent; case (a)
(2) child->left_thread = TRUE;
(3)parent->right_child = child;
parent->right_thread = FALSE;
if (!child->right_thread) { case (b)
(4) temp = insucc(child);
temp->left_child = child;
}
CHAPTER 5 253
}
Heap
 A max tree is a tree in which the key value in
each node is no smaller than the key values in
its children. A max heap is a complete binary
tree that is also a max tree.
 A min tree is a tree in which the key value in
each node is no larger than the key values in
its children. A min heap is a complete binary
tree that is also a min tree.
 Operations on heaps
– creation of an empty heap
– insertion of a new element into the heap;
CHAPTER 5 254
– deletion of the largest element from the heap
*Figure 5.25: Sample max heaps (p.219)

[1] [1] 9 [1] 30


14

[2] 12 [3] 7 [2] 6 [3] 3 [2]


25
[4] [5] [6] [4]
10 8 6 5

Property:
The root of max heap (min heap) contains
the largest (smallest).
*Figure 5.26:Sample min heaps (p.220)

[1] [1] 10 [1] 11


2

[2] 7 [3] 4 [2] 20 [3] 83 [2]


21
[4] [5] [6] [4]
10 8 6 50
structure MaxHeap ADT for Max Heap
objects: a complete binary tree of n > 0 elements organized so that
the value in each node is at least as large as those in its children
functions:
for all heap belong to MaxHeap, item belong to Element, n,
max_size belong to integer
MaxHeap Create(max_size)::= create an empty heap that can
hold a maximum of max_size elements
Boolean HeapFull(heap, n)::= if (n==max_size) return TRUE
else return FALSE
MaxHeap Insert(heap, item, n)::= if (!HeapFull(heap,n)) insert
item into heap and return the resulting heap
else return error
Boolean HeapEmpty(heap, n)::= if (n>0) return FALSE
else return TRUE
Element Delete(heap,n)::= if (!HeapEmpty(heap,n)) return one
instance of the largest element in the heap
and remove it from the heap
CHAPTER 5 257
else return error
Application: priority queue
 machine service
– amount of time (min heap)
– amount of payment (max heap)
 factory
– time tag

CHAPTER 5 258
Data Structures
 unordered linked list
 unordered array
 sorted linked list
 sorted array
 heap

CHAPTER 5 259
*Figure 5.27: Priority queue representations (p.221)

Representation Insertion Deletion

Unordered (1) (n)


array
Unordered (1) (n)
linked list
Sorted array O(n) (1)
Sorted linked O(n) (1)
list
Max heap O(log2n) O(log2n)
Example of Insertion to Max Heap

20 20 21

15 2 15 5 15 20

14 10 14 10 2 14 10 2

initial location of new node insert 5 into heap insert 21 into heap

CHAPTER 5 261
Insertion into a Max Heap
void insert_max_heap(element item, int *n)
{
int i;
if (HEAP_FULL(*n)) {
fprintf(stderr, “the heap is full.\n”);
exit(1);
}
i = ++(*n);
while ((i!=1)&&(item.key>heap[i/2].key)) {
heap[i] = heap[i/2];
i /= 2; 2k-1=n ==> k=log2(n+1)
}
heap[i]= item; O(log2n)
CHAPTER 5 262
}
Example of Deletion from Max Heap

remove
20 10 15

15 2 15 2 14 2

14 10 14 10

CHAPTER 5 263
Deletion from a Max Heap
element delete_max_heap(int *n)
{
int parent, child;
element item, temp;
if (HEAP_EMPTY(*n)) {
fprintf(stderr, “The heap is empty\n”);
exit(1);
}
/* save value of the element with the
highest key */
item = heap[1];
/* use last element in heap to adjust heap
temp = heap[(*n)--];
parent = 1;
child = 2; CHAPTER 5 264
while (child <= *n) {
/* find the larger child of the current
parent */
if ((child < *n)&&
(heap[child].key<heap[child+1].key))
child++;
if (temp.key >= heap[child].key) break;
/* move to the next lower level */
heap[parent] = heap[child];
child *= 2;
}
heap[parent] = temp;
return item;
}

CHAPTER 5 265
Binary Search Tree
 Heap
– a min (max) element is deleted. O(log2n)
– deletion of an arbitrary element O(n)
– search for an arbitrary element O(n)
 Binary search tree
– Every element has a unique key.
– The keys in a nonempty left subtree (right subtree) are
smaller (larger) than the key in the root of subtree.
– The left and right subtrees are also binary search trees.

CHAPTER 5 266
Examples of Binary Search Trees

20 30 60

12 25 5 40 70

10 15 22 2 65 80

CHAPTER 5 267
Searching a Binary Search Tree
tree_pointer search(tree_pointer root,
int key)
{
/* return a pointer to the node that
contains key. If there is no such
node, return NULL */

if (!root) return NULL;


if (key == root->data) return root;
if (key < root->data)
return search(root->left_child,
key);
return search(root->right_child,key);
} CHAPTER 5 268
Another Searching Algorithm
tree_pointer search2(tree_pointer tree,
int key)
{
while (tree) {
if (key == tree->data) return tree;
if (key < tree->data)
tree = tree->left_child;
else tree = tree->right_child;
}
return NULL; O(h)
CHAPTER 5 269
}
Insert Node in Binary Search Tree

30 30 30

5 40 5 40 5 40

2 2 80 2 35 80

Insert 80 Insert 35

CHAPTER 5 270
Insertion into A Binary Search Tree
void insert_node(tree_pointer *node, int num)
{tree_pointer ptr,
temp = modified_search(*node, num);
if (temp || !(*node)) {
ptr = (tree_pointer) malloc(sizeof(node));
if (IS_FULL(ptr)) {
fprintf(stderr, “The memory is full\n”);
exit(1);
}
ptr->data = num;
ptr->left_child = ptr->right_child = NULL;
if (*node)
if (num<temp->data) temp->left_child=ptr;
else temp->right_child = ptr;
else *node = ptr;
}
CHAPTER 5 271
}
Deletion for A Binary Search Tree
1
leaf 30
node
5 80
T1 T2
1
2
2
T1 X

T2
CHAPTER 5 272
Deletion for A Binary Search Tree
non-leaf
40 40
node

20 60 20 55

10 30 50 70 10 30 50 70

45 55 45 52

52
Before deleting 60 After deleting 60
CHAPTER 5 273
1

2
T1

T2 T3
1

2‘
T1

T2’ T3
CHAPTER 5 274
Selection Trees
(1) winner tree
(2) loser tree

CHAPTER 5 275
sequential
allocation
winner tree Each node represents
the smaller of its two
1
scheme children.
6
(complete
2 3
binary
tree) 6 8
4 5 6 7
9 6 8 17

8 9 10 11 12 13 14 15
10 9 20 6 8 9 90 17
18
sequence
ordered

15 20 20 15 15 11 100
16 38 30 25 50 16 110 20

run 1 run 2 run 3 run 4 run 5 run 6 run 7 run 8


CHAPTER 5 276
*Figure 5.35: Selection tree of Figure 5.34 after one record has been
output and the tree restructured(nodes that were changed are ticked)

1 

8
2  3
9 8

4 5  6 7
9 15 8 17
8 9 10 11 12 13 14 15
10 9 20 15 8 9 90 17

15 20 20 25 15 11 100 18
16 38 30 25 50 16 110 20
Analysis
 K: # of runs
 n: # of records
 setup time: O(K) (K-1)
 restructure time: O(log2K) log2(K+1)
 merge time: O(nlog2K)
 slight modification: tree of loser
– consider the parent node only (vs. sibling nodes)
CHAPTER 5 278
*Figure 5.36: Tree of losers corresponding to Figure 5.34 (p.235)
overall
6 winner
8
9
1
8

2
9 3
9 15 17

4 5 6 7
10 20 15 9 90

8 9 10 11 12 13 14 15
10 9 20 6 8 9 90 17
Run 1 2 3 4 5 6 7 8
15 15
Forest
 A forest is a set of n >= 0 disjoint trees
A
Forest
A E G
B E

B C D F H I C F G

D H

I
CHAPTER 5 280
Transform a forest into a binary tree

 T1, T2, …, Tn: a forest of trees


B(T1, T2, …, Tn): a binary tree
corresponding to this forest
 algorithm
(1) empty, if n = 0
(2) has root equal to root(T1)
has left subtree equal to B(T11,T12,…,T1m)
has right subtree equal to B(T2,T3,…,Tn)

CHAPTER 5 281
Forest Traversals
 Preorder
– If F is empty, then return
– Visit the root of the first tree of F
– Taverse the subtrees of the first tree in tree preorder
– Traverse the remaining trees of F in preorder
 Inorder
– If F is empty, then return
– Traverse the subtrees of the first tree in tree inorder
– Visit the root of the first tree
– Traverse the remaining trees of F is indorer
CHAPTER 5 282
A inorder: EFBGCHIJDA
preorder: ABEFCGDHIJ
B
A

E C
B C D

F G D
E F J
G H I
H

I
B C D
r
preorde

J E G H
F I
J
CHAPTER 5 283
Set Representation
 S1={0, 6, 7, 8}, S2={1, 4, 9}, S3={2, 3, 5}
0 1 2

6 7 8 4 9 3 5
Si  Sj = 
 Two operations considered here
– Disjoint set union S1  S2={0,6,7,8,1,4,9}
– Find(i): Find the set containing the element i.
3  SCHAPTER
3, 8 5S1 284
Disjoint Set Union
Make one of trees a subtree of the other

0 1

8 1 0 4 9
6 7

4 9 6 7 8

Possible representation for S1 union S2

CHAPTER 5 285
*Figure 5.41:Data Representation of S1S2and S3 (p.240)

0
set pointer
name
6 7 8
S1
S2 4

S3 1 9

3 5
Array Representation for Set
i [0] [1] [2] [3] [4] [5] [6] [7] [8] [9]
parent -1 4 -1 2 -1 2 0 0 0 4

int find1(int i)
{
for (; parent[i]>=0; i=parent[i])
return i;
}

void union1(int i, int j)


{
parent[i]= j;
}
CHAPTER 5 287
*Figure 5.43:Degenerate tree (p.242)

union operation n-1 union(0,1), find(0)


O(n) n-1 union(1,2), find(0)
.
find operation n-2 .
O(n2) n .
i union(n-2,n-1),find(0)
i 2 

degenerate tree
*Figure 5.44:Trees obtained using the weighting rule(p.243)
weighting rule for union(i,j): if # of nodes in i < # in j then j the parent of i
Modified Union Operation
void union2(int i, int j)
{ Keep a count in the root of tree
int temp = parent[i]+parent[j];
if (parent[i]>parent[j]) {
parent[i]=j; i has fewer nodes.
parent[j]=temp;
}
else { j has fewer nodes
parent[j]=i;
parent[i]=temp;
} If the number of nodes in tree i is
} less than the number in tree j, then
make j the parent of i; otherwise
make i the parent of j.
CHAPTER 5 290
Figure 5.45:Trees achieving worst case bound (p.245)

 log28+1
Modified Find(i) Operation
int find2(int i)
{
int root, trail, lead;
for (root=i; parent[root]>=0;
root=parent[root]);
for (trail=i; trail!=root;
trail=lead) {
lead = parent[trail];
parent[trail]= root;
} If j is a node on the path from
return root: i to its root then make j a child
} of the root
CHAPTER 5 292
0 0

1 2 4 1 2 4 6 7

3 5 6 3 5

find(7) find(7) find(7) find(7) find(7) find(7) find(7) find(7)


go up 3 1 1 1 1 1 1 1
reset 2
12 moves (vs. 24 moves)
CHAPTER 5 293
Applications
 Find equivalence class i  j
 Find Si and Sj such that i  Si and j  Sj
(two finds)
– Si = Sj do nothing
– Si  Sj union(Si , Sj)
 example
0  4, 3  1, 6  10, 8  9, 7  4, 6  8,
3  5, 2  11, 11  0
{0, 2, 4, 7, 11}, {1, 3, 5}, {6, 8, 9, 10}

CHAPTER 5 294
preorder: ABCDEFGHI
inorder: BCAEDGHFI
A A

B, C D, E, F, G, H, I B D

A
C E F

B D, E, F, G, H, I G I

C H
CHAPTER 5 295
CHAPTER 6

GRAPHS
All the programs in this file are selected from
Ellis Horowitz, Sartaj Sahni, and Susan Anderson-Freed
“Fundamentals of Data Structures in C”,
Computer Science Press.

CHAPTER 6 296
Definition
 A graph G consists of two sets
– a finite, nonempty set of vertices V(G)
– a finite, possible empty set of edges E(G)
– G(V,E) represents a graph
 An undirected graph is one in which the pair of
vertices in a edge is unordered, (v 0, v1) = (v1,v0)
 A directed graph is one in which each edge is a
directed pair of vertices, <v0, v1> != <v1,v0>
tail head
CHAPTER 6 297
Examples for Graph
0 0 0

1 2 1 2
1
3
3 4 5 6
G1 2
G2
complete graph incomplete graph G3
V(G1)={0,1,2,3} E(G1)={(0,1),(0,2),(0,3),(1,2),(1,3),(2,3)}
V(G2)={0,1,2,3,4,5,6} E(G2)={(0,1),(0,2),(1,3),(1,4),(2,5),(2,6)}
V(G3)={0,1,2} E(G3)={<0,1>,<1,0>,<1,2>}
complete undirected graph: n(n-1)/2 edges
complete directed graph: n(n-1) edges
CHAPTER 6 298
Complete Graph
 A complete graph is a graph that has the
maximum number of edges
– for undirected graph with n vertices, the maximum
number of edges is n(n-1)/2
– for directed graph with n vertices, the maximum
number of edges is n(n-1)
– example: G1 is a complete graph

CHAPTER 6 299
Adjacent and Incident
 If (v0, v1) is an edge in an undirected graph,
– v0 and v1 are adjacent
– The edge (v0, v1) is incident on vertices v0 and v1
 If <v0, v1> is an edge in a directed graph
– v0 is adjacent to v1, and v1 is adjacent from v0
– The edge <v0, v1> is incident on v0 and v1

CHAPTER 6 300
*Figure 6.3:Example of a graph with feedback loops and a
multigraph (p.260)
0

0 2 1 3

1 2
self edge multigraph:
(a) (b) multiple occurrences
of the same edge

Figure 6.3
CHAPTER 6 301
Subgraph and Path
 A subgraph of G is a graph G’ such that V(G’)
is a subset of V(G) and E(G’) is a subset of
E(G)
 A path from vertex vp to vertex vq in a graph G,

is a sequence of vertices, vp, vi1, vi2, ..., vin, vq,


such that (vp, vi1), (vi1, vi2), ..., (vin, vq) are edges
in an undirected graph
 The length of a path is the number of edges on
it CHAPTER 6 302
Figure 6.4: subgraphs of G1 and G3 (p.261)
0 0 0 1 2 0

1 2 1 2 3 1 2
3
3
G1 (i) (ii) (iii) (iv)
(a) Some of the subgraph of G1

0 0 0 0
0
單一 1 1 1
1 分開
2 2
(i) (ii) (iii) (iv)
2 (b) Some of the subgraph of G3
CHAPTER 6 303
G3
Simple Path and Style
 A simple path is a path in which all vertices,
except possibly the first and the last, are distinct
 A cycle is a simple path in which the first and
the last vertices are the same
 In an undirected graph G, two vertices, v0 and v1,
are connected if there is a path in G from v0 to v1
 An undirected graph is connected if, for every
pair of distinct vertices vi, vj, there is a path
from vi to vj
CHAPTER 6 304
connected

0 0

1 2 1 2
3
3 4 5 6
G1
G2
tree (acyclic graph)

CHAPTER 6 305
Connected Component
 A connected component of an undirected graph
is a maximal connected subgraph.
 A tree is a graph that is connected and acyclic.
 A directed graph is strongly connected if there
is a directed path from vi to vj and also
from vj to vi.
 A strongly connected component is a maximal
subgraph that is strongly connected.
CHAPTER 6 306
*Figure 6.5: A graph with two connected components (p.262)
connected component (maximal connected subgraph)

H1 0 H2 4

2 1 5

3 6

G4 (not connected)
CHAPTER 6 307
*Figure 6.6: Strongly connected components of G3 (p.262)

strongly connected component


not strongly connected (maximal strongly connected subgraph)

0
0 2

1
2
G3

CHAPTER 6 308
Degree
 The degree of a vertex is the number of edges
incident to that vertex
 For directed graph,
– the in-degree of a vertex v is the number of edges
that have v as the head
– the out-degree of a vertex v is the number of edges
that have v as the tail
– if di is the degree of a vertex i in a graph G with n
verticesn 1
and e edges, the number of edges is
e( d )/ 2
0
i

CHAPTER 6 309
undirected graph
degree
3 0
0 2
1 2
3 1 23 3 3
3 4 5 6
3
G13 1 1 G2 1 1
0 in:1, out: 1
directed graph
in-degree
out-degree 1 in: 1, out: 2

2 in: 1, out: 0
G3 CHAPTER 6 310
ADT for Graph
structure Graph is
objects: a nonempty set of vertices and a set of undirected edges, where each

edge is a pair of vertices


functions: for all graph  Graph, v, v1 and v2  Vertices
Graph Create()::=return an empty graph
Graph InsertVertex(graph, v)::= return a graph with v inserted. v has no
incident edge.
Graph InsertEdge(graph, v1,v2)::= return a graph with new edge
between v1 and v2
Graph DeleteVertex(graph, v)::= return a graph in which v and all edges
incident to it are removed
Graph DeleteEdge(graph, v1, v2)::=return a graph in which the edge (v1,
v2 )
is removed
Boolean IsEmpty(graph)::= if (graph==empty graph) return TRUE
else return FALSE
CHAPTER 6 311
List Adjacent(graph,v)::= return a list of all vertices that are adjacent to v
Graph Representations
 Adjacency Matrix
 Adjacency Lists
 Adjacency Multilists

CHAPTER 6 312
Adjacency Matrix
 Let G=(V,E) be a graph with n vertices.
 The adjacency matrix of G is a two-dimensional
n by n array, say adj_mat
 If the edge (vi, vj) is in E(G), adj_mat[i][j]=1
 If there is no such edge in E(G), adj_mat[i][j]=0
 The adjacency matrix for an undirected graph is
symmetric; the adjacency matrix for a digraph
need not be symmetric
CHAPTER 6 313
Examples for Adjacency Matrix
0 0 4
0
2 1 5
1 2
3 6
3 1
0 1 1 1 0 1 0
1 0 1 1    7
  1 0 1 
2 0 1 1 0 0 0 0 0
1 1 0 1 0 0 0 1
   0 0 1 0 0 0 0
1 1 1 0
1 0 0 1 0 0 0 0
G2
G1  
0 1 1 0 0 0 0 0
0 0 0 0 0 1 0 0
 
0 0 0 0 1 0 1 0
symmetric 0 0 0 0 0 1 0 1
 
undirected: n2/2 0 0 0 0 0 0 1 0
directed: n2 CHAPTER 6 314
G4
Merits of Adjacency Matrix
 From the adjacency matrix, to determine the
connection of vertices is easy n 1

 The degree of a vertex is  adj _ mat[i][ j ]


j0
 For a digraph, the row sum is the out_degree,
while the column sum is the in_degree
n 1 n 1
ind (vi )   A[ j , i ] outd (vi )   A[i , j ]
j 0 j 0

CHAPTER 6 315
Data Structures for Adjacency Lists
Each row in adjacency matrix is represented as an adjacency list.

#define MAX_VERTICES 50
typedef struct node *node_pointer;
typedef struct node {
int vertex;
struct node *link;
};
node_pointer graph[MAX_VERTICES];
int n=0; /* vertices currently in use *
CHAPTER 6 316
0 0 4
2 1 5
1 2 3 6
3 7
0 1 2 3 0 1 2
1 0 2 3 1 0 3
2 0 1 3 2 0 3
3 0 1 2 3 1 2
G1 0 4 5
5 4 6
0 1 6 5 7
1 0 2 1
7 6
2
G3 G4
2CHAPTER 6 317
An undirected graph with n vertices and e edges ==> n head nodes and 2e list nodes
Interesting Operations
degree of a vertex in an undirected graph
–# of nodes in adjacency list
# of edges in a graph
–determined in O(n+e)
out-degree of a vertex in a directed graph
–# of nodes in its adjacency list
in-degree of a vertex in a directed graph
–traverse the whole data structure

CHAPTER 6 318
Compact Representation
0 4
node[0] … node[n-1]: starting point for vertices
2 1 5 node[n]: n+2e+1
3 6 node[n+1] … node[n+2e]: head node of edge

7
[0] 9 [8] 23 [16] 2
[1] 11 0 [9] 1 4 [17] 5
[2] 13 [10] 2 5 [18] 4
[3] 15 1 [11] 0 [19] 6
[4] 17 [12] 3 6 [20] 5
[5] 18 2 [13] 0 [21] 7
[6] 20 [14] 3 7 [22] 6
[7] 22 3 [15] 1
CHAPTER 6 319
Figure 6.10: Inverse adjacency list for G3

0
0  1 NULL

1 1  0 NULL

2  1 NULL
2

Determine in-degree of a vertex in a fast way.

CHAPTER 6 320
Figure 6.11: Alternate node structure for adjacency lists (p.267)

tail head column link for head row link for tail

CHAPTER 6 321
Figure 6.12: Orthogonal representation for graph G3(p.268)

0 1 2

0 0 1 NULL NULL

1 1 0 NULL 1 2 NULL NULL

0
2 NULL
0 1 0
 
 1 0 1 
0 0 0
1

CHAPTER 6
2 322
Figure 6.13:Alternate order adjacency list for G1 (p.268)
Order is of no significance.
headnodes vertax link

0  3  1  2 NULL

1  2  0  3 NULL

2  3  0  1 NULL

3  2  1  0 NULL

1 2
3 CHAPTER 6 323
Adjacency Multilists
An edge in an undirected graph is
represented by two nodes in adjacency list
representation.
Adjacency Multilists

–lists in which nodes may be shared among


several lists.
(an edge is shared by two different paths)

marked vertex1 vertex2 path1 path2

CHAPTER 6 324
Example for Adjacency Multlists
Lists: vertex 0: M1->M2->M3, vertex 1: M1->M4->M5
vertex 2: M2->M4->M6, vertex 3: M3->M5->M6
(1,0)
0 N1 0 1 N2 N4 edge (0,1)
1 (2,0)
2 N2 0 2 N3 N4 edge (0,2)
(3,0)
3 0 3 N5
N3 (2,1)
edge (0,3)
0 N4 1 2 N5 N6 edge (1,2)
(3,1)
N5 1 3 N6 edge (1,3)
1 2 (3,2)
N6 2 3 edge (2,3)
3
six edges
CHAPTER 6 325
Adjacency Multilists

typedef struct edge *edge_pointer;


typedef struct edge {
short int marked;
int vertex1, vertex2;
edge_pointer path1, path2;
};
edge_pointer graph[MAX_VERTICES];

marked vertex1 vertex2 path1 path2

CHAPTER 6 326
Some Graph Operations
 Traversal
Given G=(V,E) and vertex v, find all wV,
such that w connects v.
– Depth First Search (DFS)
preorder tree traversal
– Breadth First Search (BFS)
level order tree traversal
 Connected Components
 Spanning Trees
CHAPTER 6 327
*Figure 6.19:Graph G and its adjacency lists (p.274)
depth first search: v0, v1, v3, v7, v4, v5, v2, v6

breadth first search: v0, v1, v2, v3, v4, v5, v6, v7
CHAPTER 6 328
Depth First Search
#define FALSE 0
#define TRUE 1
short int visited[MAX_VERTICES];
void dfs(int v)
{
node_pointer w;
visited[v]= TRUE;
printf(“%5d”, v);
for (w=graph[v]; w; w=w->link)
if (!visited[w->vertex])
dfs(w->vertex); Data structure
} adjacency list: O(e)
adjacency matrix: O(n2)
CHAPTER 6 329
Breadth First Search
typedef struct queue *queue_pointer;
typedef struct queue {
int vertex;
queue_pointer link;
};
void addq(queue_pointer *,
queue_pointer *, int);
int deleteq(queue_pointer *);
CHAPTER 6 330
Breadth First Search (Continued)
void bfs(int v)
{
node_pointer w;
queue_pointer front, rear;
front = rear = NULL; adjacency list: O(e)
printf(“%5d”, v); adjacency matrix: O(n2)
visited[v] = TRUE;
addq(&front, &rear, v);

CHAPTER 6 331
while (front) {
v= deleteq(&front);
for (w=graph[v]; w; w=w->link)
if (!visited[w->vertex]) {
printf(“%5d”, w->vertex);
addq(&front, &rear, w->vertex);
visited[w->vertex] = TRUE;
}
}
}
CHAPTER 6 332
Connected Components
void connected(void)
{
for (i=0; i<n; i++) {
if (!visited[i]) {
dfs(i);
printf(“\n”);
}
adjacency list: O(n+e)
} adjacency matrix: O(n2)
}
CHAPTER 6 333
Spanning Trees
 When graph G is connected, a depth first or
breadth first search starting at any vertex will
visit all vertices in G
 A spanning tree is any tree that consists solely
of edges in G and that includes all the vertices
 E(G): T (tree edges) + N (nontree edges)
where T: set of edges used during search
N: set of remaining edges
CHAPTER 6 334
Examples of Spanning Tree

0 0 0 0

1 2 1 2 1 2 1 2
3 3 3 3

G1 Possible spanning trees

CHAPTER 6 335
Spanning Trees
 Either dfs or bfs can be used to create a
spanning tree
– When dfs is used, the resulting spanning tree is
known as a depth first spanning tree
– When bfs is used, the resulting spanning tree is
known as a breadth first spanning tree
 While adding a nontree edge into any spanning
tree, this will create a cycle
CHAPTER 6 336
DFS VS BFS Spanning Tree
0 0 0

1 2 1 2 1 2

3 4 5 6 3 4 5 6 3 4 5 6

7 7 nontree edge 7
cycle
DFS Spanning BFS Spanning

CHAPTER 6 337
A spanning tree is a minimal subgraph, G’, of G
such that V(G’)=V(G) and G’ is connected.

Any connected graph with n vertices must have


at least n-1 edges.

A biconnected graph is a connected graph that has


no articulation points. 0

1 2
biconnected graph
3 4 5 6

7
CHAPTER 6 338
0 8 9

1 7

3 5 connected graph

2
4 6
two connected components one connected graph

0 8 9 0 8 9

7 1 7
1

3 5 3 5

2 2
4 6 CHAPTER 6 4 6 339
biconnected component: a maximal connected subgraph H
(no subgraph that is both biconnected and properly contains H)

0 8 9

0 8 9
1 7 7

1 7
1 7

3 5
3 3 5 5
2
4 6 2
4 6

biconnected components
CHAPTER 6 340
Find biconnected component of a connected undirected graph
by depth first spanning tree
nontree
0 edge
3 (back edge)
4 0 9 8 9 8 1 5
nontree 4 5
3 1 7 7edge
0 5 (back edge) 2 2 6 6
2 2 3 5 3
1 7 7
dfn
depth 8 9
4 6
first 4 0 9 8
1 6
number Why is cross edge impossible?
(a) depth first spanning tree (b)
If u is an ancestor of v then dfn(u) < dfn(v).
CHAPTER 6 341
*Figure 6.24: dfn and low values for dfs spanning tree with root =3(p.281)

Vertax 0 1 2 3 4 5 6 7 8 9
dfn 4 3 2 0 1 5 6 7 9 8
low 4 0 0 0 0 5 5 5 9 8

CHAPTER 6 342
0
3 *The root of a depth first spanning
1 5 tree is an articulation point iff
4 5 it has at least two children.

2 2 6 6 *Any other vertex u is an articulation


3 point iff it has at least one child w
1 7 7 such that we cannot reach an ancestor
8 9of u using a path that consists of
4 0 9 8
(1) only w (2) descendants of w (3)
low(u)=min{dfn(u), single back edge.
min{low(w)|w is a child of u},
min{dfn(w)|(u,w) is a back edge}

u: articulation point
low(child)  dfn(u)
CHAPTER 6 343
vertex dfn low child low_child low:dfn
0 4 4 (4,n,n) null null null:4
1 3 0 (3,4,0) 0 4 43 
2 2 0 (2,0,n) 1 0 0<2
3 0 0 (0,0,n) 4,5 0,5 0,5  0 
4 1 0 (1,0,n) 2 0 0<1
5 5 5 (5,5,n) 6 5 55 
6 6 5 (6,5,n) 7 5 5<6
7 7 5 (7,8,5) 8,9 9,8 9,8  7 
8 9 9 (9,n,n) null null null, 9
9 8 8 (8,n,n) null null null, 8
3
1 5
4 5

2 2 6 6
3
1 7 7
8 9 CHAPTER 6 344
4 0 9 8
*Program 6.5: Initializaiton of dfn and low (p.282)

void init(void)
{
int i;
for (i = 0; i < n; i++) {
visited[i] = FALSE;
dfn[i] = low[i] = -1;
}
num = 0;
}

CHAPTER 6 345
*Program 6.4: Determining dfn and low (p.282)
void dfnlow(int u, int v) Initial call: dfn(x,-1)
{
/* compute dfn and low while performing a dfs search
beginning at vertex u, v is the parent of u (if any) */
node_pointer ptr;
int w;
dfn[u] = low[u] = num++; low[u]=min{dfn(u), …}
for (ptr = graph[u]; ptr; ptr = ptr ->link) {
v w = ptr ->vertex;
v if (dfn[w] < 0) { /*w is an unvisited vertex */
u dfnlow(w, u);
u low[u] = MIN2(low[u], low[w]);
} low[u]=min{…, min{low(w)|w is a child of u}, …}
w
else if (w != v) dfn[w]0 非第一次,表示藉 back edge
O X low[u] =MIN2(low[u], dfn[w] );
} CHAPTER 6 346
} low[u]=min{…,…,min{dfn(w)|(u,w) is a back edge}
*Program 6.6: Biconnected components of a graph (p.283)
void bicon(int u, int v)
{
/* compute dfn and low, and output the edges of G by their
biconnected components , v is the parent ( if any) of the u
(if any) in the resulting spanning tree. It is assumed that all
entries of dfn[ ] have been initialized to -1, num has been
initialized to 0, and the stack has been set to empty */
node_pointer ptr;
int w, x, y;
dfn[u] = low[u] = num ++; low[u]=min{dfn(u), …}
for (ptr = graph[u]; ptr; ptr = ptr->link) {
w = ptr ->vertex; (1) dfn[w]=-1 第一次
if ( v != w && dfn[w] < dfn[u] ) (2) dfn[w]!=-1 非第一次,藉 back
add(&top, u, w); /* add edge to stack */ edge

CHAPTER 6 347
if(dfn[w] < 0) {/* w has not been visited */
bicon(w, u); low[u]=min{…, min{low(w)|w is a child of u}, …
low[u] = MIN2(low[u], low[w]);
if (low[w] >= dfn[u] ){ articulation point
printf(“New biconnected component: “);
do { /* delete edge from stack */
delete(&top, &x, &y);
printf(“ <%d, %d>” , x, y);
} while (!(( x = = u) && (y = = w)));
printf(“\n”);
}
}
else if (w != v) low[u] = MIN2(low[u], dfn[w]);
} low[u]=min{…, …, min{dfn(w)|(u,w) is a back edge}}
}

CHAPTER 6 348
Minimum Cost Spanning Tree
 The cost of a spanning tree of a weighted
undirected graph is the sum of the costs of the
edges in the spanning tree
 A minimum cost spanning tree is a spanning
tree of least cost
 Three different algorithms can be used
– Kruskal
– Prim Select n-1 edges from a weighted graph
of n vertices with minimum cost.
– Sollin
CHAPTER 6 349
Greedy Strategy
 An optimal solution is constructed in stages
 At each stage, the best decision is made at this
time
 Since this decision cannot be changed later,
we make sure that the decision will result in a
feasible solution
 Typically, the selection of an item at each
stage is based on a least cost or a highest profit
criterion
CHAPTER 6 350
Kruskal’s Idea
 Build a minimum cost spanning tree T by
adding edges to T one at a time
 Select the edges for inclusion in T in
nondecreasing order of the cost
 An edge is added to T if it does not form a
cycle
 Since G is connected and has n > 0 vertices,
exactly n-1 edges will be selected
CHAPTER 6 351
Examples for Kruskal’s Algorithm
0 10 5

2 12 3

1 14 6 0 0 0
28
1 16 2 10 1 1 10 1
14 16
3 18 6 5 6 2 5 6 2 5 6 2
24
25
3 22 4 18 12 4 4
4
4 24 6 22 3 3 3

4 25 5

28
6/9 CHAPTER 6 352
0 10 5

2 12 3

1 14 6

1 16 2 0 0 0
10 1 10 1 10 1
3 18 6
14 14 16

3 22 4 5 6 2 5 6 2 5 6 2

12 12 12
4 24 6 4 4 4
3 3 3
4 25 5

0 28 1 + 3 6
CHAPTER 6 353
cycle
0 10 5

2 12 3

1 14 6
0 0
1 16 2
10 1 10 1
3 18 6 14 16 14 16

5 6 2 5 6 2
3 22 4
25
12 12
4 4
4 24 6 22
22 + 4 6 3
3
4 25 5 cycle
cost = 10 +25+22+12+16+14

0 28 1
CHAPTER 6 354
Kruskal’s Algorithm
T= {}; 目標:取出 n-1 條 edges
while (T contains less than n-1 edges
&& E is not empty) {
choose a least cost edge (v,w) from E;
min heap construction time O(e)
delete (v,w) from E; choose and delete O(log e)
if ((v,w) does not create a cycle in T)
add (v,w) to T find find & union O(log e)
else discard (v,w);
} {0,5}, {1,2,3,6}, {4} + edge(3,6) X + edge(3,4) --> {0,5},{1,2,3,4,6}
if (T contains fewer than n-1 edges)
printf(“No spanning tree\n”);
CHAPTER 6 355
O(e log e)
Prim’s Algorithm
(tree all the time vs. forest)
T={};
TV={0};
while (T contains fewer than n-1 edges)
{
let (u,v) be a least cost edge such
that u  TV and v  TV
if (there is no such edge ) break;
add v to TV;
add (u,v) to T;
}
if (T contains fewer than n-1 edges)
printf(“No spanning tree\n”);
CHAPTER 6 356
Examples for Prim’s Algorithm
0
28
10 1
14 16

5 6 2
24
25
18 12 0 0
0 4
10 1 22 3 10 1 10 1

5 6 2 5 6 2 5 6 2
25 25
4 4 4
3 22 3
3
CHAPTER 6 357
0
28
10 1
14 16

5 6 2
24
25
18 12
4
0
22 3 0 0
10 1 10 1 10 1
16 14 16

5 6 2 5 6 2 5 6 2
25 25 25
12 12 12
4 4 4
22 22 3 22 3
3
CHAPTER 6 358
0
Sollin’s Algorithm
28 vertex edge
10 1 0 0 -- 10 --> 5, 0 -- 28 --> 1
14 16 1 1 -- 14 --> 6, 1-- 16 --> 2, 1 -- 28 --> 0
2 2 -- 12 --> 3, 2 -- 16 --> 1
5 6 2 3 3 -- 12 --> 2, 3 -- 18 --> 6, 3 -- 22 --> 4
24 4 4 -- 22 --> 3, 4 -- 24 --> 6, 5 -- 25 --> 5
25
18 12 5 5 -- 10 --> 0, 5 -- 25 --> 4
4 6 6 -- 14 --> 1, 6 -- 18 --> 3, 6 -- 24 --> 4
22 3
0 0 {0,5} 0
28
1 10 1 5 25
4 0 1 10 1
14 14 16
{1,6}
5 6 2 5 6 2 5 6 2
16 28
1 2 1 0
12
4 4 12
18 24 4
CHAPTER 6 6 3 6 4 359
22 22 3
3 3
Single Source All Destinations

Determine the shortest paths from v0 to all the remaining vertices.

*Figure 6.29: Graph and shortest paths from v0 (p.293)

45

50 10 path length
V0 V1 V4 1) v0 v2 10
2) v0 v2 v3 25
20 10 15 20 35 30 3) v0 v2 v3 v1 45
4) v0 v4 45
15 3
V2 V3 V5

(a) (b)
CHAPTER 6 360
Example for the Shortest Path Boston
1500 4
San Chicago
Denver 250
Francisco
800
1200 3 1000
1 2
New
5 York
300 1000 1400

1700 New Orleans 900


0 7 1000
Los Angeles 6 Miami
0 1 2 3 4 5 6 7
0 0
1 300 0
2 1000 800 0
3 1200 0
4 1500 0 250
5 1000 0 900 1400
6 0 1000
7 1700 CHAPTER 6
0 361
Cost adjacency matrix
(a) (b) (c) 4
1500 4 1500 4 250
3 250 3 5
250
5 1000
5 900
6
選5 4 到 3 由 1500 改成 1250 4 到 6 由改成 1250

4 4
(d) (e) (f) 4
3
250
250 250
1000
5 5
7 1400 5
7 1400 7 1400
900
1000 900
6
6
4 到 7 由改成 1650 選6
4-5-6-7 比 4-5-7 長
CHAPTER 6 362
(g) 4 (h)
4
3 2
1200
3
250
250
1000
5 1000
7 1400 5
7 1400
900
900
6 6
選3 4 到 2 由改成 2450

(i) 4 4
2
1200
3 (j)
250
0 250
1000
5 5
7 1400 7 1400
900
6
4 到 0 由改成 3350
選7 CHAPTER 6 363
Example for the Shortest Path
(Continued)

Iteration S Vertex LA SF DEN CHI BO NY MIA NO


Selected [0] [1] [2] [3] [4] [5] [6]
Initial -- ---- + + + 1500 0 250 + (d)+
(b) (c)
1 {4} (a) 5 + + + 1250 0 250 1150 1650
2 {4,5} (e) 6 + + + 1250 0 250 1150(f)1650
(h)
3 {4,5,6} (g) 3 + + 2450 1250 0 250 1150 1650
4 {4,5,6,3} (j)3350 + 2450 1250 0 250 1150 1650
(i) 7
5 {4,5,6,3,7} 2 3350 3250 2450 1250 0 250 1150 1650
6 {4,5,6,3,7,2} 1 3350 3250 2450 1250 0 250 1150 1650
7 {4,5,6,3,7,2,1}

CHAPTER 6 364
Data Structure for SSAD
#define MAX_VERTICES 6 adjacency matrix
int cost[][MAX_VERTICES]=
{{ 0, 50, 10, 1000, 45, 1000},
{1000, 0, 15, 1000, 10, 1000},
{ 20, 1000, 0, 15, 1000, 1000},
{1000, 20, 1000, 0, 35, 1000},
{1000, 1000, 30, 1000, 0, 1000},
{1000, 1000, 1000, 3, 1000, 0}};
int distance[MAX_VERTICES];
short int found{MAX_VERTICES];
int n = MAX_VERTICES;
CHAPTER 6 365
Single Source All Destinations
void shortestpath(int v, int cost[]
[MAX_ERXTICES], int distance[], int n, short
int found[])
{
int i, u, w;
for (i=0; i<n; i++) {
found[i] = FALSE; O(n)
distance[i] = cost[v][i];
}
found[v] = TRUE;
distance[v] = 0;
CHAPTER 6 366
for (i=0; i<n-2; i++) {determine n-1 paths from v
u = choose(distance, n, found);
found[u] = TRUE;
for (w=0; w<n; w++)
if (!found[w]) 與 u 相連的端點 w
if (distance[u]+cost[u][w]<distance[w])
distance[w] = distance[u]+cost[u][w];
}
}
O(n2)

CHAPTER 6 367
All Pairs Shortest Paths
Find the shortest paths between all pairs of
vertices.
Solution 1

–Apply shortest path n times with each vertex as


source.
O(n3)
Solution 2
–Represent the graph G by its cost adjacency matrix
with cost[i][j]
–If the edge <i,j> is not in G, the cost[i][j] is set to
some sufficiently large number
–A[i][j] is the cost of the shortest path form i to j,
using only thoseCHAPTER
intermediate
6
vertices with an index
368

<= k
All Pairs Shortest Paths (Continued)
n-1
 The cost of the shortest path from i to j is A [i][j],
as no vertex in G has an index greater than n-1
-1
 A [i][j]=cost[i][j]
0 1 2 n-1 -1
 Calculate the A, A, A, ..., A from A iteratively
k k-1 k-1 k-1
 A [i][j]=min{A [i][j], A [i][k]+A [k][j]}, k>=0

CHAPTER 6 369
Graph with negative cycle

-2  0 1 
 2 0 1 
0 1 2  
1 1    0 

(a) Directed graph (b) A-1

The length of the shortest path from vertex 0 to vertex 2 is -.

0, 1, 0, 1,0, 1, …, 0, 1, 2

CHAPTER 6 370
Algorithm for All Pairs Shortest Paths
void allcosts(int cost[][MAX_VERTICES],
int distance[][MAX_VERTICES], int n)
{
int i, j, k;
for (i=0; i<n; i++)
for (j=0; j<n; j++)
distance[i][j] = cost[i][j];
for (k=0; k<n; k++)
for (i=0; i<n; i++)
for (j=0; j<n; j++)
if (distance[i][k]+distance[k][j]
< distance[i][j])
distance[i][j]=
distance[i][k]+distance[k][j];
} CHAPTER 6 371
* Figure 6.33: Directed graph and its cost matrix (p.299)

0 1 2

0 0 4 11
6
V0 V1 1 6 0 2
4
3 11 2
2 3  0
V2

(a)Digraph G (b)Cost adjacency matrix for G

CHAPTER 6 372
0 1 2 6 0 1 2
A -1 A 0

V0 V1 0 0 4 A-1
0 0 4 11 4 11
3 11 2 0 0

1 6 0 2 V2 1 6 0 2 6 4

2 3 7 0 3 11
2 3  0
v0

0 1 2 0 1 2
A 1
A 2

A0 A1
0 0 4 6 0 0 4 6
4 6 6 3

0 0 1 6 0 2 2 7
1 5 0 2
7 2 2 3 7 0 0 0
v1 2 3 7 0 v2
CHAPTER 6 373
Transitive Closure
Goal: given a graph with unweighted edges, determine if there is a path
from i to j for all i and j.
(1) Require positive path (> 0) lengths. transitive closure matrix
(2) Require nonnegative path (0) lengths. reflexive transitive closure matrix
0 0 1 0 0 0 
1 0 0 1 0 0
2 0 0 0 1 0 
0 1 2 3 4 
3 0 0 0 0 1 

4 0 0 1 0 0
(a) Digraph G (b) Adjacency matrix A for G

0 0 1 1 1 1 0 1 1 1 1 1
0 1 0
1  0 1 1 1  1 1 1 1
2 0 0 1 1 1 2 0 0 1 1 1
   
3 0 0 1 1 1 3 0 0 1 1 1 reflexiv
4 0 1 cycle 0 1
0 1 1 4 0 1 1 e
(c) transitive closure matrix A+ (d) reflexive transitive closure matrix A*
CHAPTER 6 374
There is a path of length > 0 There is a path of length 0
Activity on Vertex (AOV) Network
definition
A directed graph in which the vertices represent
tasks or activities and the edges represent
precedence relations between tasks.
predecessor (successor)
vertex i is a predecessor of vertex j iff there is a
directed path from i to j. j is a successor of i.
partial order
a precedence relation which is both transitive (i, j,
k, ij & jk => ik ) and irreflexive (x xx).
acylic graph
a directed graph with no directed cycles
CHAPTER 6 375
*Figure 6.38: An AOV network (p.305)
Topological order:
linear ordering of vertices
of a graph
i, j if i is a predecessor of
j, then i precedes j in the
linear ordering

C1, C2, C4, C5, C3, C6, C8,


C7, C10, C13, C12, C14, C15,
C11, C9

C4, C5, C2, C1, C6, C3, C8,


C15, C7, C9, C10, C11, C13,
C12, C14

CHAPTER 6 376
*Program 6.13: Topological sort (p.306)

for (i = 0; i <n; i++) {


if every vertex has a predecessor {
fprintf(stderr, “Network has a cycle. \n “ );
exit(1);
}
pick a vertex v that has no predecessors;
output v;
delete v and all edges leading out of v
from the network;
}

CHAPTER 6 377
*Figure 6.39:Simulation of Program 6.13 on an AOV
network (p.306)

v0 no predecessor v1, v2, v3 no predecessor select v2


delete v0->v1, v0->v2, v0->v3 select v3 delete v2->v4, v2->v5
delete v3->v4, v3->v5

select v5 select v1
delete v1->v4
CHAPTER 6 378
Issues in Data Structure
Consideration

Decide whether a vertex has any predecessors.


–Each vertex has a count.
Decide a vertex together with all its incident
edges.
–Adjacency list

CHAPTER 6 379
*Figure 6.40:Adjacency list representation of Figure 6.30(a)
(p.309) headnodes
node
count link vertex link
V0 0  1  2  3 NULL

1  4 NULL
V1
1  4  5 NULL

V23 1  5  4 NULL

v1
V4 3 NULL

v0 v2 v4
2 NULL

V5
v3 v5
CHAPTER 6 380
*(p.307)

typedef struct node *node_pointer;


typedef struct node {
int vertex;
node_pointer link;
};
typedef struct {
int count;
node_pointer link;
} hdnodes;
hdnodes graph[MAX_VERTICES];

CHAPTER 6 381
*Program 6.14: Topological sort (p.308)

void topsort (hdnodes graph [] , int n)


{
int i, j, k, top;
node_pointer ptr;
/* create a stack of vertices with no predecessors */
top = -1;
for (i = 0; i < n; i++)
if (!graph[i].count) {no predecessors, stack is linked through
graph[i].count = top; count field
O(n) top = i;
}
for (i = 0; i < n; i++)
if (top == -1) {
fprintf(stderr, “\n Network has a cycle. Sort terminated. \n”);
exit(1);
}

CHAPTER 6 382
} Continued
else {
j = top; /* unstack a vertex */
top = graph[top].count;
printf(“v%d, “, j);
for (ptr = graph [j]. link; ptr ;ptr = ptr ->link ){
/* decrease the count of the successor vertices of j */
k = ptr ->vertex;
graph[k].count --;
O(e)
if (!graph[k].count) {
/* add vertex k to the stack*/
graph[k].count = top;
top = k;
}
} O(e+n)
}
} CHAPTER 6 383
Activity on Edge (AOE)
Networks
directed edge
–tasks or activities to be performed
vertex

–events which signal the completion of certain activities


number

–time required to perform the activity

CHAPTER 6 384
*Figure 6.41:An AOE network(p.310)

concurrent

CHAPTER 6 385
Application of AOE Network
Evaluate performance
–minimum amount of time
–activity whose duration time should be shortened
–…
Critical path
–a path that has the longest length
–minimum time required to complete the project
–v0, v1, v4, v7, v8 or v0, v1, v4, v6, v8 (18)

CHAPTER 6 386
other factors
Earliest time that vi can occur
–the length of the longest path from v0 to vi
–the earliest start time for all activities leaving vi
–early(6) = early(7) = 7
Latest time of activity
–the latest time the activity may start without increasing
the project duration
–late(5)=8, late(7)=7
Critical activity
–an activity for which early(i)=late(i)
–early(7)=late(7)
late(i)-early(i)

–measure of how critical an activity is


CHAPTER 6 387
–late(5)-early(5)=8-5=3
earliest, early, latest, late
16
6
v1 6 v6 16
0 a3=1 a9=2
a0=6 7 16
0 6 7 a6=9 18
0 6 10
v4 7 v8
v0 0
a1=4 4 7 14
7 14 18
0 4 a4=1 a7=7
2 a10=4
v2 6 v7
14
0 6 14
a2=5
7
3 a8=4
7
5 14
5 v5
v3
a5=2
8 10
8
CHAPTER 6 388
Determine Critical Paths
Delete all noncritical activities
Generate all the paths from the start to
finish vertex.

CHAPTER 6 389
Calculation of Earliest Times
earliest[j]

–the earliest event occurrence time


earliest[0]=0
earliest[j]=max{earliest[i]+duration of <i,j>}
i p(j)

latest[j]

–the latest event occurrence time


vk ai vl

early(i)=earliest(k)
late(i)=latest(l)-duration of ai
CHAPTER 6 390
vi1

vi2 vj
forward stage .
.
.
vin

if (earliest[k] < earliest[j]+ptr->duration)


earliest[k]=earliest[j]+ptr->duration

CHAPTER 6 391
*Figure 6.42:Computing earliest from topological sort
(p.313)

v1 v6

v0 v4 v8
v2
v7

v3 v5

CHAPTER 6 392
Calculation of Latest Times
 latest[j]
– the latest event occurrence time
latest[n-1]=earliest[n-1]
latest[j]=min{latest[i]-duration of <j,i>}
i s(j)
vi1

vj vi2 backward stage


.
.
.
vin
if (latest[k] > latest[j]-ptr->duration)
CHAPTER 6 393
latest[k]=latest[j]-ptr->duration
*Figure 6.43: Computing latest for AOE network of Figure 6.41(a)(p.314)

CHAPTER 6 394
*Figure 6.43(continued):Computing latest of AOE network of Figure 6.41(a)
(p.315)

latest[8]=earliest[8]=18
latest[6]=min{earliest[8] - 2}=16
latest[7]=min{earliest[8] - 4}=14
latest[4]=min{earliest[6] - 9;earliest[7] -7}= 7
latest[1]=min{earliest[4] - 1}=6
latest[2]=min{earliest[4] - 1}=6
latest[5]=min{earliest[7] - 4}=10
latest[3]=min{earliest[5] - 2}=8
latest[0]=min{earliest[1] - 6;earliest[2]- 4; earliest[3] -
5}=0
(c)Computation of latest from Equation (6.4) using a reverse topological
order

CHAPTER 6 395
*Figure 6.44:Early, late and critical values(p.316)

Activity Early Late Late- Critical


Early
a0 0 0 0 Yes
a1 0 2 2 No
a2 0 3 3 No
a3 6 6 0 Yes
a4 4 6 2 No
a5 5 8 3 No
a6 7 7 0 Yes
a7 7 7 0 Yes
a8 7 10 3 No
a9 16 16 0 Yes
a10 14 14 0 Yes
CHAPTER 6 396
*Figure 6.45:Graph with noncritical activities deleted
(p.316)

V1 V6
a3 a6 a9
a0
V0 V4 V8

a7 a10
V7

CHAPTER 6 397
*Figure 6.46: AOE network with unreachable activities (p.317)

a4
V3 V4

V1 a5
a6 earliest[i]=0
a0 a2

V0 V5

a1 a3
V2

CHAPTER 6 398
*Figure 6.47: an AOE network(p.317)

a2=3 a6=4
V1 V3 V6
a12=4
a0=5 a7=4
a3=6 a =3 a10=5 a13=2
V5 V8 V9 finish
start V0
5

a8=1 a11=2
a1=6
V2 V4 V7
a4=3 a9=4

CHAPTER 6 399
CHAPTER 7

SORTING

All the programs in this file are selected from


Ellis Horowitz, Sartaj Sahni, and Susan Anderson-Freed
“Fundamentals of Data Structures in C”,
Computer Science Press.

CHAPTER 7 400
Sequential Search
 Example
44, 55, 12, 42, 94, 18, 06, 67
 unsuccessful search
– n+1
 successful search

n 1 n 1
 (i  1) / n 
i 0 2

CHAPTER 7 401
* (P.320)

# define MAX-SIZE 1000/* maximum size of list plus one */


typedef struct {
int key;
/* other fields */
} element;
element list[MAX_SIZE];

CHAPTER 7 402
*Program 7.1:Sequential search (p.321)

int seqsearch( int list[ ], int searchnum, int n )


{
/*search an array, list, that has n numbers. Return i, if
list[i]=searchnum. Return -1, if searchnum is not in the list */
int i;
list[n]=searchnum; sentinel
for (i=0; list[i] != searchnum; i++)
;
return (( i<n) ? i : -1);
}

CHAPTER 7 403
*Program 7.2: Binary search (p.322)

int binsearch(element list[ ], int searchnum, int n)


{
/* search list [0], ..., list[n-1]*/
int left = 0, right = n-1, middle;
while (left <= right) {
middle = (left+ right)/2;
switch (COMPARE(list[middle].key, searchnum)) {
case -1: left = middle +1;
break;
case 0: return middle;
case 1:right = middle - 1;
}
} O(log2n)
return -1;
}
CHAPTER 7 404
*Figure 7.1:Decision tree for binary search (p.323)

[5]

[2] 46 [8]

17 58
[0] [3] [6] [10]

4 26 48 90

[1] [4] [7] [9] [11]

15 30 56 82 95

4, 15, 17, 26, 30, 46, 48, 56, 58, 82, 90, 95

CHAPTER 7 405
List Verification
 Compare lists to verify that they are identical or identify the discrepancies.
 example
– international revenue service (e.g., employee vs. employer)
 complexities
– random order: O(mn)
– ordered list:
O(tsort(n)+tsort(m)+m+n)

CHAPTER 7 406
*Program 7.3: verifying using a sequential search(p.324)

void verify1(element list1[], element list2[ ], int n, int m)


/* compare two unordered lists list1 and list2 */
{
(a) all records found in list1 but not in list2
int i, j; (b) all records found in list2 but not in list1
int marked[MAX_SIZE]; (c) all records that are in list1 and list2 with
the same key but have different values
for(i = 0; i<m; i++) for different fields.
marked[i] = FALSE;
for (i=0; i<n; i++)
if ((j = seqsearch(list2, m, list1[i].key)) < 0)
printf(“%d is not in list 2\n “, list1[i].key);
else
/* check each of the other fields from list1[i] and list2[j], and
print out any discrepancies */
CHAPTER 7 407
marked[j] = TRUE;
for ( i=0; i<m; i++)
if (!marked[i])
printf(“%d is not in list1\n”, list2[i]key);
}

CHAPTER 7 408
*Program 7.4:Fast verification of two lists (p.325)
void verify2(element list1[ ], element list2 [ ], int n, int m)
/* Same task as verify1, but list1 and list2 are sorted */
{
int i, j;
sort(list1, n);
sort(list2, m);
i = j = 0;
while (i < n && j < m)
if (list1[i].key < list2[j].key) {
printf (“%d is not in list 2 \n”, list1[i].key);
i++;
}
else if (list1[i].key == list2[j].key) {
/* compare list1[i] and list2[j] on each of the other field

and report any discrepancies */


i++; j++;
CHAPTER 7 409
}
else {
printf(“%d is not in list 1\n”, list2[j].key);
j++;
}
for(; i < n; i++)
printf (“%d is not in list 2\n”, list1[i].key);
for(; j < m; j++)
printf(“%d is not in list 1\n”, list2[j].key);
}

CHAPTER 7 410
Sorting Problem
 Definition
– given (R0, R1, …, Rn-1), where Ri = key + data
find a permutation , such that K(i-1)  K(i), 0<i<n-1
 sorted
– K(i-1)  K(i), 0<i<n-1
 stable
– if i < j and Ki = Kj then Ri precedes Rj in the sorted list
 internal sort vs. external sort
 criteria
– # of key comparisons
– # of data movements

CHAPTER 7 411
Insertion Sort
Find an element smaller than K.

26 5 77 1 61 11 59 15 48 19

5 26 77 1 61 11 59 15 48 19

5 26 77 1 61 11 59 15 48 19

1 5 26 77 61 11 59 15 48 19

1 5 26 61 77 11 59 15 48 19

1 5 11 26 61 77 59 15 48 19

1 5 11 26 59 61 77 15 48 19

1 5 11 15 26 59 61 77 48 19

1 5 11 15 26 48 59 61 77 19

5 26 1 61 11 59 15 48 19 77

CHAPTER 7 412
Insertion Sort
void insertion_sort(element list[], int n)
{
int i, j;
element next;
for (i=1; i<n; i++) {
next= list[i];
for (j=i-1; j>=0&&next.key<list[j].key;
j--)
list[j+1] = list[j];
list[j+1] = next;
}
}
insertion_sort(list,n)

CHAPTER 7 413
worse case

i 0 1 2 3 4
- 5 4 3 2 1
1 4 5 3 2 1 n2
2 3 4 5 2 1
3 2 3 4 5 1
O(  i )  O( n ) 2

j 0
4 1 2 3 4 5

best case left out of order (LOO)

i 0 1 2 3 4
- 2 3 4 5 1
1 2 3 4 5 1
2 2 3 4 5 1 O(n)
3 2 3 4 5 1
4 1 2 3 4 5

CHAPTER 7 414
Ri is LOO if Ri < max{Rj}
0j<i

k: # of records LOO

Computing time: O((k+1)n)

44 55 12 42 94 18 06 67
* * * * *

CHAPTER 7 415
Variation
 Binary insertion sort
– sequential search --> binary search
– reduce # of comparisons,
# of moves unchanged
 List insertion sort
– array --> linked list
– sequential search, move --> 0

CHAPTER 7 416
Quick Sort (C.A.R. Hoare)

 Given (R0, R1, …, Rn-1)


Ki: pivot key
if Ki is placed in S(i),
then Kj  Ks(i) for j < S(i),
Kj  Ks(i) for j > S(i).
 R0, …, RS(i)-1, RS(i), RS(i)+1, …, RS(n-1)
two partitions

CHAPTER 7 417
Example for Quick Sort

R0 R1 R2 R3 R4 R5 R6 R7 R8 R9 left right
{ 26 5 37 1 61 11 59 15 48 19} 0 9
{ 11 5 19 1 15} 26 { 59 61 48 37} 0 4
{ 1 5} 11 { 19 15} 26 { 59 61 48 37} 0 1
{ 59 }
1 5 11 15 19 26 61 48 37 3 4
1 5 11 15 19 26 { 48 37} 59 { 61} 6 9
1 5 11 15 19 26 37 48 59 { 61} 6 7
1 5 11 15 19 26 37 48 59 61 9 9
1 5 11 15 19 26 37 48 59 61

CHAPTER 7 418
Quick Sort
void quicksort(element list[], int left,
int right)
{
int pivot, i, j;
element temp;
if (left < right) {
i = left; j = right+1;
pivot = list[left].key;
do {
do i++; while (list[i].key < pivot);
do j--; while (list[j].key > pivot);
if (i < j) SWAP(list[i], list[j], temp);
} while (i < j);
SWAP(list[left], list[j], temp);
quicksort(list, left, j-1);
quicksort(list, j+1, right);
}
}

CHAPTER 7 419
Analysis for Quick Sort
 Assume that each time a record is positioned,
the list is divided into the rough same size of
two parts.
 Position a list with n element needs O(n)
 T(n) is the time taken to sort n elements
T(n)<=cn+2T(n/2) for some c
<=cn+2(cn/2+2T(n/4))
...
<=cnlog n+nT(1)=O(nlog n)
CHAPTER 7 420
Time and Space for Quick Sort
 Space complexity:
– Average case and best case: O(log n)
– Worst case: O(n)
 Time complexity:
– Average case and best case: O(n log n)
– Worst case: O(n 2 )

CHAPTER 7 421
Merge Sort
 Given two sorted lists
(list[i], …, list[m])
(list[m+1], …, list[n])
generate a single sorted list
(sorted[i], …, sorted[n])
 O(n) space vs. O(1) space

CHAPTER 7 422
Merge Sort (O(n) space)
void merge(element list[], element sorted[],
int i, int m, int n)
{
int j, k, t; addition space: n-i+1
j = m+1; # of data movements: M(n-i+1)
k = i;
while (i<=m && j<=n) {
if (list[i].key<=list[j].key)
sorted[k++]= list[i++];
else sorted[k++]= list[j++];
}
if (i>m) for (t=j; t<=n; t++)
sorted[k+t-j]= list[t];
else for (t=i; t<=m; t++)
sorted[k+t-i] = list[t];
}

CHAPTER 7 423
Analysis
 array vs. linked list representation
– array: O(M(n-i+1)) where M: record length
for copy
– linked list representation: O(n-i+1)
(n-i+1) linked fields

CHAPTER 7 424
*Figure 7.5:First eight lines for O(1) space merge example (p337)
file 1 (sorted) file 2 (sorted)

discover n keys
exchange
g
preprocessin

exchange sort

Sort by the rightmost records of n -1 blocks


compare
exchange

compare
exchange

compare

CHAPTER 7 425
  
0 1 2 y w z u x 4 6 8 a v 3 5 7 9 b c e g i j k d f h o p q l m n r s t

  
0 1 2 3 w z u x 4 6 8 a v y 5 7 9 b c e g i j k d f h o p q l m n r s t

  
0 1 2 3 4 z u xw 6 8 a v y 5 7 9 b c e g i j k d f h o p q l m n r s t

  
0 1 2 3 4 5 u xw 6 8 a v y z 7 9 b c e g i j k d f h o p q l m n r s t

CHAPTER 7 426
*Figure 7.6:Last eight lines for O(1) space merge example(p.338)

6, 7, 8 are merged

Segment one is merged (i.e., 0, 2, 4, 6, 8, a)

Change place marker (longest sorted sequence of records)

Segment one is merged (i.e., b, c, e, g, i, j, k)

Change place marker

Segment one is merged (i.e., o, p, q)

No other segment. Sort the largest keys.

CHAPTER 7 427
*Program 7.8: O(1) space merge (p.339)

Steps in an O(1) space merge when the total number of records,


n is a perfect square */
and the number of records in each of the files to be merged is a
multiple of n */

Step1:Identify the n records with largest key. This is done by

following right to left along the two files to be merged.


Step2:Exchange the records of the second file that were
identified in Step1 with those just to the left of those
identified from the first file so that the n record with
largest keys form a contiguous block.
Step3:Swap the block of n largest records with the leftmost
block (unless it is already the leftmost block). Sort the
rightmost block.
CHAPTER 7 428
Step4:Reorder the blocks excluding the block of largest
records into nondecreasing order of the last key in the
blocks.
Step5:Perform as many merge sub steps as needed to merge
the n-1 blocks other than the block with the largest
keys.
Step6:Sort the block with the largest keys.

CHAPTER 7 429
Interactive Merge Sort
Sort 26, 5, 77, 1, 61, 11, 59, 15, 48, 19

26 5 77 1 61 11 59 15 48 19
\ / \ / \ / \ / \ /
5 26 1 77 11 61 15 59 19 48
\ / \ / |
1 5 26 77 11 15 59 61 19 48
\ /
1 5 11 15 26 59 61 77 19 48
\ /
1 5 11 15 19 26 48 59 61 77

O(nlog2n): log2n passes, O(n) for each pass

CHAPTER 7 430
Merge_Pass
void merge_pass(element list[], element
sorted[],int n, int length)
{
int i, j;
for (i=0; i<n-2*length; i+=2*length)
merge(list,sorted,i,i+length-1,
i+2*lenght-1);
if (i+length<n)
merge(list, sorted, i, i+lenght-1, n-1);
One complement segment and one partial segment
else
for (j=i; j<n; j++) sorted[j]= list[j];
} Only one segment

i i+length-1 i+2length-1
...
2*length
...
CHAPTER 7 431
Merge_Sort
void merge_sort(element list[], int n)
{
int length=1;
element extra[MAX_SIZE];

while (length<n) {
merge_pass(list, extra, n, length);
length *= 2;
merge_pass(extra, list, n, length);
length *= 2;
}
} l l l l ...

2l 2l ...

4l ...

CHAPTER 7 432
Recursive Formulation of Merge Sort
26 5 77 1 61 11 59 15 48 19

5 26 copy copy 11 59 copy copy

5 26 77 1 61 11 15 59 19 48

1 5 26 61 77 11 15 19 48 59

(1+5)/2 (6+10)/2

1 5 11 15 19 26 48 59 61 77
(1+10)/2

Data Structure: array (copy subfiles) vs. linked list (no copy)

CHAPTER 7 433
*Figure 7.9:Simulation of merge_sort (p.344)

start=3

i R0 R1 R2 R3 R4 R5 R6 R7 R8 R9
key 26 5 77 1 61 11 59 15 48 19
link 8 5 -1 1 2 7 4 9 6 0

CHAPTER 7 434
Recursive Merge Sort
lower upper

Point to the start of sorted chain


int rmerge(element list[], int lower,
int upper)
{
int middle;
if (lower >= upper) return lower;
else {
middle = (lower+upper)/2;
return listmerge(list,
rmerge(list,lower,middle),
rmerge(list,middle, upper));
}
}

lower middle upper


CHAPTER 7 435
ListMerge
int listmerge(element list[], int first, int
second)
{ first
int start=n; ...
second
while (first!=-1 && second!=-1) {
if (list[first].key<=list[second].key) {
/* key in first list is lower, link this
element to start and change start to
point to first */
list[start].link= first;
start = first;
first = list[first].link;
}

CHAPTER 7 436
else {
/* key in second list is lower, link this
element into the partially sorted list */
list[start].link = second;
start = second;
second = list[second].link;
}
}
if (first==-1)
list[start].link =first second;
is exhausted.
else
list[start].link = first;
second is exhausted.
return list[n].link;
}

O(nlog2n)
CHAPTER 7 437
Nature Merge Sort
26 5 77 1 61 11 59 15 48 19

5 26 77 1 11 59 61 15 19 48

1 5 11 26 59 61 77 15 19 48

1 5 11 15 19 26 48 59 61 77

CHAPTER 7 438
Heap Sort
*Figure 7.11: Array interpreted as a binary tree (p.349)
1 2 3 4 5 6 7 8 9 10
26 5 77 1 61 11 59 15 48 19

input file [1] 26

[2] 5 [3] 77

[4] 1 [5] 61 [6] 11 [7] 59

[8] 15 [9] 48 [10] 19

CHAPTER 7 439
*Figure 7.12: Max heap following first for loop of heapsort(p.350)

initial heap [1] 77

[2] 61 [3] 59
exchange

[4] 48 [5] 19 [6] 11 [7] 26

[8] 15 [9] 1 [10] 5

CHAPTER 7 440
Figure 7.13: Heap sort example(p.351)
[1] 61

[2] 48 [3] 59

[4] 15 [5] 19 [6] 11 [7] 26

[8] 5 [9] 1 [10] 77


(a)

[1] 59

[2] 48 [3] 26

[4] 15 [5] 19 [6] 11 [7] 1

[8] 5 [9] 61 [10] 77

(b)

CHAPTER 7 441
Figure 7.13(continued): Heap sort example(p.351)
[1] 48

[2] 19 [3] 26

[4] 15 [5] 5 [6] 11 [7] 1

[8] 59
59 [9] 61
61
[10] 77
(c)

[1] 26

[2] 19 [3] 11

48
[4] 15 [5] 5 [6] 1 [7] 48

59
[8] 59 [9] 61 [10] 77

(d)

CHAPTER 7 442
Heap Sort
void adjust(element list[], int root, int n)
{
int child, rootkey; element temp;
temp=list[root]; rootkey=list[root].key;
child=2*root;
while (child <= n) {
if ((child < n) &&
(list[child].key < list[child+1].key))
child++;
if (rootkey > list[child].key) break;
else {
list[child/2] = list[child];
child *= 2;
}
}
list[child/2] = temp; i
2i 2i+1
}

CHAPTER 7 443
Heap Sort
void heapsort(element list[], int n)
{ ascending order (max heap)
int i, j;
element temp;
bottom-up
for (i=n/2; i>0; i--) adjust(list, i, n);
for (i=n-1; i>0; i--) {
n-1 cylces
SWAP(list[1], list[i+1], temp);
adjust(list, 1, i);
} top-down
}

CHAPTER 7 444
Radix Sort
Sort by keys
K0, K1 , …, Kr-1

Most significant key Least significant key

R0, R1, …, Rn-1 are said to be sorted w.r.t. K0, K1, …, Kr-1 iff

( k , k ,..., k )  ( k , k ,..., k )
i
0
i
1
i
r 1 0
i 1
1
i 1
r 1
i 1
0i<n-1

Most significant digit first: sort on K0, then K1, ...

Least significant digit first: sort on Kr-1, then Kr-2, ...

CHAPTER 7 445
Figure 7.14: Arrangement of cards after first pass of an MSD
sort(p.353)

Suits:  <  <  < 


Face values: 2 < 3 < 4 < … < J < Q < K < A

(1) MSD sort first, e.g., bin sort, four bins  
LSD sort second, e.g., insertion sort

(2) LSD sort first, e.g., bin sort, 13 bins


2, 3, 4, …, 10, J, Q, K, A
MSD sort, e.g., bin sort four bins  
CHAPTER 7 446
Figure 7.15: Arrangement of cards after first pass of LSD
sort (p.353)

CHAPTER 7 447
Radix Sort
0  K  999
(K0, K1, K2)
MSD LSD
0-9 0-9 0-9
radix 10 sort
radix 2 sort

CHAPTER 7 448
Example for LSD Radix Sort
d (digit) = 3, r (radix) = 10 ascending order
179, 208, 306, 93, 859, 984, 55, 9, 271, 33
front[0] NULL rear[0]
Sort by

front[1] 271 NULL rear[1]


front[2] NULL rear[2]
front[3] 93 33 NULL rear[3]
front[4] 984 NULL rear[4]
front[5] 55 NULL rear[5]
front[6] 306 NULL rear[6]
front[7] NULL rear[7]
front[8] 208 NULL rear[8]
front[9] 179 859 9 NULL rear[9]
concatenat

271, 93, 33, 984, 55, 306, 208, 179, 859, 9 After the first pass
CHAPTER 7 449
front[0] 306 208 9 null rear[0]

front[1] null rear[1]

front[2] null rear[2]

front[3] 33 null rear[3]

front[4] null rear[4]

front[5] 55 859 null rear[5]

front[6] null rear[6]

front[7] 271 179 null rear[7]

front[8] 984 null rear[8]


front[9] 93 null CHAPTER 7 450
306, 208, 9, 33, 55, 859, 271, 179, 984, 93 (second pass) rear[9]
front[0] 9 33 55 93 null
rear[0]
front[1] 179 null rear[1]

front[2] 208 271 null rear[2]

front[3] 306 null rear[3]

front[4] null rear[4]

front[5] null rear[5]

front[6] null rear[6]

front[7] null rear[7]

front[8] 859 null rear[8]


front[9] 984 null CHAPTER 7 451
9, 33, 55, 93, 179, 208, 271, 306, 859, 984 (third pass) rear[9]
Data Structures for LSD Radix Sort
 An LSD radix r sort,
 R0, R1, ..., Rn-1 have the keys that are d-tuples
(x0, x1, ..., xd-1)
#define MAX_DIGIT 3
#define RADIX_SIZE 10
typedef struct list_node *list_pointer
typedef struct list_node {
int key[MAX_DIGIT];
list_pointer link;
}
CHAPTER 7 452
LSD Radix Sort
list_pointer radix_sort(list_pointer ptr)
{
list_pointer front[RADIX_SIZE],
rear[RADIX_SIZE];
int i, j, digit;
for (i=MAX_DIGIT-1; i>=0; i--) {
for (j=0; j<RADIX_SIZE; j++)
front[j]=read[j]=NULL; Initialize bins to be
while (ptr) { empty queue.
digit=ptr->key[I];
Put records
if (!front[digit]) into queues.
front[digit]=ptr;
else rear[digit]->link=ptr;

CHAPTER 7 453
rear[digit]=ptr;
O(n) ptr=ptr->link;
Get next record.
}
/* reestablish the linked list for the next pass */
ptr= NULL;
O(d(n+r)) for (j=RADIX_SIZE-1; j>=0; j++)
if (front[j]) {
rear[j]->link=ptr;
ptr=front[j];
O(r) }
}
return ptr;
}

CHAPTER 7 454
Practical Considerations for
Internal Sorting
 Data movement
– slow down sorting process
– insertion sort and merge sort --> linked file
 perform a linked list sort + rearrange records

CHAPTER 7 455
List and Table Sorts
 Many sorting algorithms require excessive
data movement since we must physically
move records following some comparisons
 If the records are large, this slows down the
sorting process
 We can reduce data movement by using a
linked list representation

CHAPTER 7 456
List and Table Sorts
 However, in some applications we must
physically rearrange the records so that they
are in the required order
 We can achieve considerable savings by first
performing a linked list sort and then physically
rearranging the records according to the order
specified in the list.

CHAPTER 7 457
In Place Sorting

i R0 R1 R2 R3 R4 R5 R6 R7 R8 R9
key 26 5 77 1 61 11 59 15 48 19
link 8 5 -1 1 2 7 4 9 6 0

R0 <--> R3

1 5 77 26 61 11 59 15 48 19
1 5 -1 8 2 7 4 9 6 3

How to know where the predecessor is ?


I.E. its link should be modified.

CHAPTER 7 458
Doubly Linked List
i R0 R1 R2 R3 R4 R5 R6 R7 R8 R9
key 26 5 77 1 61 11 59 15 48 19
link 8 5 -1 1 2 7 4 9 6 0
linkb 9 3 4 -1 6 1 8 5 0 7

i R0 R1 R2 R3 R4 R5 R6 R7 R8 R9
key 1 5 77 26 61 11 59 15 48 19
link 1 5 -1 8 2 7 4 9 6 3
linkb -1 3 4 9 6 1 8 5 3 7

i R0 R1 R2 R3 R4 R5 R6 R7 R8 R9
key 1 5 77 26 61 11 59 15 48 19
link 1 5 -1 8 2 7 4 9 6 3
linkb -1 3 4 9 6 1 8 5 3 7

i R0 R1 R2 R3 R4 R5 R6 R7 R8 R9
key 1 5 77 26 61 11 59 15 48 19
link 1 5 -1 8 2
CHAPTER 77 4 9 6 3 459
linkb -1 3 4 9 6 1 8 5 3 7
Algorithm for List Sort
void list_sort1(element list[], int n,
int start)
{
int i, last, current;
element temp;
Convert start into a doubly lined list using linkb
last = -1;
for (current=start; current!=-1;
current=list[current].link) {
O(n)
list[current].linkb = last;
last = current;
}
CHAPTER 7 460
for (i=0; i<n-1; i++) {
if (start!=i) { list[i].link-1
if (list[i].link+1)
O(mn)
list[list[i].link].linkb= start;
list[list[i].linkb].link= start;
SWAP(list[start], list[i].temp);
}
start = list[i].link;
}
}

CHAPTER 7 461
(2)
i R0 R1 R2 R3 R4 R5 R6 R7 R8 R9
key 26 5 77 1 61 11 59 15 48 19
link 8 5 -1 1 2 7 4 9 6 0
(1)
i=0
start=1 i R0 R1 R2 R3 R4 R5 R6 R7 R8 R9
key 1 5 77 26 61 11 59 15 48 19
link 3 5 -1 8 2 7 4 9 6 0
原值已放在 start 中 值不變
i=1 i R0 R1 R2 R3 R4 R5 R6 R7 R8 R9
start=5 key 1 5 77 26 61 11 59 15 48 19
link 3 5 -1 8 2 7 4 9 6 0

i=2
start=7 i R0 R1 R2 R3 R4 R5 R6 R7 R8 R9
key 1 5 11 26 61 77 59 15 48 19
link 3 5 5 8 2 -1 4 9 6 0
原值已放在 start 中
CHAPTER 7 462
值不變
i=3 i R0 R1 R2 R3 R4 R5 R6 R7 R8 R9
start=9 key 1 5 11 15 61 77 59 26 48 19
link 3 5 5 7 2 -1 4 8 6 0

值不變 原值已放在 start 中


i=4
start=0
i R0 R1 R2 R3 R4 R5 R6 R7 R8 R9
key 1 5 11 15 19 77 59 26 48 61
link 3 5 5 7 9 -1 4 8 6 2

想把第 0 個元素放在第 5 個位置?


想把第 3 個元素放在第 5 個位置?
想把第 7 個元素放在第 5 個位置?

i=5 i R0 R1 R2 R3 R4 R5 R6 R7 R8 R9
start=8 key 1 5 11 15 19 77 59 26 48 61
link 3 5 5 7 9 -1 4 8 6 2

CHAPTER 7 463
Table Sort

Before Auxiliary
0 1 2 3 4 table
sort
T
Data are not moved.
(0,1,2,3,4) R0 R1 R2 R3 R4
Key 50 9 11 8 3

A permutation
linked list sort

(4,3,1,2,0) Auxiliary
After
sort 4 3 2 1 0 table
T
第 0 名在第 4 個位置,第 1 名在第 4 個位置,…,第 4 名在第 0 個位
CHAPTER 7 464
(0,1,2,3,4)
Every permutation is made of disjoint cycles.

Example
R0 R1 R2 R3 R4 R5 R6 R7
key 35 14 12 42 26 50 31 18
table 2 1 7 4 6 0 3 5

two nontrivial cycles trivial cycle

R0 R2 R7 R5 R0 R1 R1
2 7 5 0 1

R3 R4 R6 R3
4 6 3 4

CHAPTER 7 465
(1) R0 R2 R7 R5 R0
0 2 7 5
12 18 50 35

35 0

(2) i=1,2 t[i]=i


(3) R3 R4 R6 R3
3 4 6
26 31 42

42 3
CHAPTER 7 466
Algorithm for Table Sort
void table_sort(element list[], int n,
int table[])
{
int i, current, next;
element temp;
for (i=0; i<n-1; i++)
if (table[i]!=i) {
temp= list[i];
current= i;

CHAPTER 7 467
do {
next = table[current];
list[current] = list[next];
table[current]= current;
current = next;
} while (table[current]!=i);
形成 cycle
list[current] = temp;
table[current] = current;
}
}

CHAPTER 7 468
Complexity of Sort
stability space time
best average worst
Bubble Sort stable little O(n) O(n2) O(n2)
Insertion Sort stable little O(n) O(n2) O(n2)
Quick Sort untable O(logn) O(nlogn) O(nlogn) O(n2)
Merge Sort stable O(n) O(nlogn) O(nlogn) O(nlogn)
Heap Sort untable little O(nlogn) O(nlogn) O(nlogn)
Radix Sort stable O(np) O(nlogn) O(nlogn) O(nlogn)
List Sort ? O(n) O(1) O(n) O(n)
Table Sort ? O(n) O(1) O(n) O(n)

CHAPTER 7 469
Comparison
n < 20: insertion sort

20  n < 45: quick sort

n  45: merge sort

hybrid method: merge sort + quick sort


merge sort + insertion sort

CHAPTER 7 470
External Sorting
 Very large files (overheads in disk access)
– seek time
– latency time
– transmission time
 merge sort
– phase 1
Segment the input file & sort the segments (runs)
– phase 2
Merge the runs

CHAPTER 7 471
File: 4500 records, A1, …, A4500
internal memory: 750 records (3 blocks)
block length: 250 records
input disk vs. scratch pad (disk)
(1) sort three blocks at a time and write them out onto scratch pad
(2) three blocks: two input buffers & one output buffer

750 Records 750 Records 750 Records 750 Records 750 Records 750 Recor

1500 Records 1500 Records 1500 Records

3000 Records
Block Size = 250

4500 Records
CHAPTER 7 472
2 2/3 passes
Time Complexity of External Sort
 input/output time
 ts = maximum seek time
 tl = maximum latency time
 trw = time to read/write one block of 250 records
 tIO = ts + tl + trw
 cpu processing time
 tIS = time to internally sort 750 records
 ntm = time to merge n records from input buffers to
the output buffer CHAPTER 7 473
Time Complexity of External Sort
(Continued)
Operation time

(1) read 18 blocks of input , 18tIO, 36 tIO +6 tIS


internally sort, 6tIS,
write 18 blocks, 18tIO
(2) merge runs 1-6 in pairs 36 tIO +4500 tm

(3) merge two runs of 1500 records 24 tIO +3000 tm


each, 12 blocks
(4) merge one run of 3000 records 36 tIO +4500 tm
with one run of 1500 records
Total Time 96 tIO +12000 tm+6 tIS
Critical factor: number of passes over the data
CHAPTER 7 474
Runs: m, pass: log2m
Consider Parallelism
 Carry out the CPU operation and I/O operation
in parallel
 132 tIO = 12000 tm + 6 tIS
 Two disks: 132 tIO is reduced to 66 tIO

CHAPTER 7 475
K-Way Merging
1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4

A 4-way merge on 16 runs

2 passes (4-way) vs. 4 passes (2-way)

CHAPTER 7 476
Analysis
 logk m passes
O(nlog2 k*logk m)
 1 pass: nlog2 k
 I/O time vs. CPU time
– reduction of the number of passes being made over the data
– efficient utilization of program buffers so that input, output
and CPU processing is overlapped as much as possible
– run generation
– run merging

CHAPTER 7 477
CHAPTER 8

HASHING
All the programs in this file are selected from
Ellis Horowitz, Sartaj Sahni, and Susan Anderson-Freed
“Fundamentals of Data Structures in C”,
Computer Science Press.
CHAPTER 8 478
Symbol Table
 Definition
A set of name-attribute pairs
 Operations
– Determine if a particular name is in the table
– Retrieve the attributes of the name
– Modify the attributes of that name
– Insert a new name and its attributes
– Delete a name and its attributes

CHAPTER 8 479
The ADT of Symbol Table
Structure SymbolTable(SymTab) is
objects: a set of name-attribute pairs, where the names are unique
functions:
for all name belongs to Name, attr belongs to Attribute, symtab belongs to
SymbolTable, max_size belongs to integer
SymTab Create(max_size) ::= create the empty symbol table whose maximum
capacity is max_size
Boolean IsIn(symtab, name) ::= if (name is in symtab) return TRUE
else return FALSE
Attribute Find(symtab, name) ::= if (name is in symtab) return the corresponding
attribute else return null attribute
SymTab Insert(symtal, name, attr) ::= if (name is in symtab)
replace its existing attribute with attr
else insert the pair (name, attr) into symtab
SymTab Delete(symtab, name) ::= if (name is not in symtab) return
else delete (name, attr) from symtab

search, insertion, deletion CHAPTER 8 480


Search vs. Hashing
 Search tree methods: key comparisons
 hashing methods: hash functions
 types
– statistic hashing
– dynamic hashing

CHAPTER 8 481
Static Hashing
hash table (ht) f(x): 0 … (b-1)
0
1
buckets
b

2
. . .
. . . ... ... ... .
. . . .
b-2
b-1
1 2 ………. s
s slots
CHAPTER 8 482
Identifier Density and Loading Density

 The identifier density of a hash table is the


ratio n/T
– n is the number of identifiers in the table
– T is possible identifiers
 The loading density or loading factor of a
hash table is  = n/(sb)
– s is the number of slots
– b is the number of buckets
CHAPTER 8 483
Synonyms
 Since the number of buckets b is usually
several orders of magnitude lower than T, the
hash function f must map several different
identifiers into the same bucket
 Two identifiers, i and j are synonyms with
respect to f if f(i) = f(j)

CHAPTER 8 484
Overflow and Collision
 An overflow occurs when we hash a new
identifier into a full bucket
 A collision occurs when we hash two
non-identical identifiers into the same bucket

CHAPTER 8 485
Example
Slot 0 Slot 1
0 acos atan synonyms
1
synonyms: 2 char ceil
char, ceil, 3 define
clock, ctime
4 exp
5 float floor synonyms
overflow
6

25
b=26, s=2, n=10, =10/52=0.19, f(x)=the first char of x
x: acos, define, float, exp, char, atan, ceil, floor, clock, ctime
f(x):0, 3, 5, 4, 2, 0, 2, 5, 2, 2
CHAPTER 8 486
Hashing Functions
 Two requirements
– easy computation
– minimal number of collisions
 mid-square (middle of square)
f m ( x)  middle( x2)
 division
f D ( x)  x% M (0 ~ (M-1))

Avoid the choice of M that leads to many collisions


CHAPTER 8 487
A program tends to have variables with the same suffix,
M=2i is not suitable
M = 2i fD(X) depends on LSBs of X
Example.
(1) Each character is represented by six bits.
(2) Identifiers are right-justified and zero-filled.
Zero-filled fD(X) shift right i bits
3 4
000000A1 = … 000001011100
000000B1 = … 000010011100
000000C1 = … 000011011100
00000X41 = … 011000011100
00NTXY1 = … 011000011001011100
M=2i, i  6

Similarly, AXY, BXY, WTXY (M=2i, i  12) have the same bucket address
CHAPTER 8 488
(3) Identifiers are left-justified and zero-filled.
60-bit word

fD(one-char id) = 0, M=2i, i  54


fD(two-char id) = 0, M=2i, i  48

CHAPTER 8 489
Programs in which many variables are permutations of each other.

Example. X=X1X2 Y=X2X1


X1 --> C(X1) X2 --> C(X2)
Each character is represented by six bits
X: C(X1) * 26 + C(X2)
Y: C(X2) * 26 + C(X1)
(fD(X) - fD(Y)) % P (where P is a prime number)
= C(X1) * 26 % P + C(X2) % P - C(X2) * 26 % P- C(X1) % P
P=3
64 % 3 * C(X1) % 3 + C(X2) % 3 -
64 % 3 * C(X2) % 3- C(X1) % 3
= C(X1) % 3 + C(X2) % 3 - C(X2) % 3- C(X1) % 3
=0
P = 7?
M is a prime number such that M does not divide rka for
small k and a (Knuth) CHAPTER 8 490
for example, M = 1009
Hashing Functions

 Folding
– Partition the identifier x into several parts
– All parts except for the last one have the same length
– Add the parts together to obtain the hash address
– Two possibilities
• Shift folding
– x1=123, x2=203, x3=241, x4=112, x5=20, address=699
• Folding at the boundaries
– x1=123, x2=203, x3=241, x4=112, x5=20, address=897
CHAPTER 8 491
123 203 241 112 20
P1 P2 P3 P4 P5
shift folding 123
203
241
112
20
699
folding at
the boundaries 123 203 241 112 20
MSD ---> LSD
LSD <--- MSD

CHAPTER 8 492
Digital Analysis
 All the identifiers are known in advance
M=1~999
X1 d11 d12 … d1n
X2 d21 d22 … d2n

Xm dm1 dm2 … dmn
Select 3 digits from n
Criterion:
Delete the digits having the most skewed distributions

CHAPTER 8 493
Overflow Handling
 Linear Open Addressing (linear probing)
 Quadratic probing
 Chaining

CHAPTER 8 494
Data Structure for Hash Table

#define MAX_CHAR 10
#define TABLE_SIZE 13
typedef struct {
char key[MAX_CHAR];
/* other fields */
} element;
element hash_table[TABLE_SIZE];

CHAPTER 8 495
Hash Algorithm via Division
void init_table(element ht[]) int hash(char *key)
{
{
int i;
for (i=0; i<TABLE_SIZE; i++) return (transform(key)
ht[i].key[0]=NULL; % TABLE_SIZE);
} }

int transform(char *key)


{
int number=0;
while (*key) number += *key++;
return number;
}

CHAPTER 8 496
Example

Identifier Additive Transform x Hash


for 102+111+114 327 2
do 100+111 211 3
while 119+104+105+108+101 537 4
if 105+102 207 12
else 101+108+115+101 425 9
function 102+117+110+99+116+105+111+110 870 12

CHAPTER 8 497
Linear Probing
(linear open addressing)
 Compute f(x) for identifier x
 Examine the buckets
ht[(f(x)+j)%TABLE_SIZE]
0  j  TABLE_SIZE
– The bucket contains x.
– The bucket contains the empty string
– The bucket contains a nonempty string other than x
– Return to ht[f(x)]
CHAPTER 8 498
Linear Probing
void linear_insert(element item, element ht[])
{
int i, hash_value;
I = hash_value = hash(item.key);
while(strlen(ht[i].key)) {
if (!strcmp(ht[i].key, item.key))
fprintf(stderr, “Duplicate entry\n”);
exit(1);
}
i = (i+1)%TABLE_SIZE;
if (i == hash_value) {
fprintf(stderr, “The table is full\n”);
exit(1);
}
}
ht[i] = item;
}
CHAPTER 8 499
Problem of Linear Probing
 Identifiers tend to cluster together
 Adjacent cluster tend to coalesce
 Increase the search time

CHAPTER 8 500
Coalesce Phenomenon
bucket x bucket searched bucket x bucket searched
0 acos 1 1 atoi 2
2 char 1 3 define 1
4 exp 1 5 ceil 4
6 cos 5 7 float 3
8 atol 9 9 floor 5
10 ctime 9 …
… 25
Average number of buckets examined is 41/11=3.73

CHAPTER 8 501
Quadratic Probing
 Linear probing searches buckets (f(x)+i)%b
 Quadratic probing uses a quadratic function
of i as the increment
 Examine buckets f(x), (f(x)+i )%b, (f(x)-i )
%b, for 1<=i<=(b-1)/2
 b is a prime number of the form 4j+3, j is an
integer
CHAPTER 8 502
rehashing

 Try f1, f2, …, fm in sequence if collision


occurs
 disadvantage
– comparison of identifiers with different hash
values
– use chain to resolve collisions

CHAPTER 8 503
Data Structure for Chaining
#define MAX_CHAR 10
#define TABLE_SIZE 13
#define IS_FULL(ptr) (!(ptr))
typedef struct {
char key[MAX_CHAR];
/* other fields */
} element;
typedef struct list *list_pointer;
typedef struct list {
element item;
list_pointer link;
};
list_pointer hash_table[TABLE_SIZE];
CHAPTER 8 504
Chain Insert
void chain_insert(element item, list_pointer ht[])
{
int hash_value = hash(item.key);
list_pointer ptr, trail=NULL, lead=ht[hash_value];
for (; lead; trail=lead, lead=lead->link)
if (!strcmp(lead->item.key, item.key)) {
fprintf(stderr, “The key is in the table\n”);
exit(1);
}
ptr = (list_pointer) malloc(sizeof(list));
if (IS_FULL(ptr)) {
fprintf(stderr, “The memory is full\n”);
exit(1);
}
ptr->item = item;
ptr->link = NULL;
if (trail) trail->link = ptr;
else ht[hash_value] = ptr;
CHAPTER 8 505
}
Results of Hash Chaining
acos, atoi, char, define, exp, ceil, cos, float, atol, floor, ctime
f(x)=first character of x

[0] -> acos -> atoi -> atol


[1] -> NULL
[2] -> char -> ceil -> cos -> ctime
[3] -> define
[4] -> exp
[5] -> float -> floor
[6] -> NULL

[25] -> NULL
# of key comparisons=21/11=1.91CHAPTER 8 506
=n/b .50 .75 .90 .95
hashing function chain/open chain/open chain/open chain/open
mid square 1.26/1.73 1.40/9.75 1.45/37.14 1.47/37.53
division 1.19/4.52 1.31/7.20 1.38/22.42 1.41/25.79
shift fold 1.33/21.75 1.48/65.10 1.40/77.01 1.51/118.57
Bound fold 1.39/22.97 1.57/48.70 1.55/69.63 1.51/97.56
digit analysis 1.35/4.55 1.49/30.62 1.52/89.20 1.52/125.59
theoretical 1.25/1.50 1.37/2.50 1.45/5.50 1.48/10.50

CHAPTER 8 507
dynamic hashing
(extensible hashing)
 dynamically increasing and decreasing file
size
 concepts
– file: a collection of records
– record: a key + data, stored in pages (buckets)
– space utilization
NumberOf Re cord
NumberOfPages * PageCapacity
CHAPTER 8 508
Dynamic Hashing Using Directories
Example. m(# of pages)=4, P(page capacity)=2
00, 01, 10, 11

Identifiers Binary representaiton


a0 100 000
a1 100 001 allocation:
b0 101 000 lower order
b1 101 001 from LSB two bits
c0 110 000 to MSB
c1 110 001
c2 110 010
c3 110 011

*Figure 8.8:Some identifiers requiring 3 bits per character(p.414)


CHAPTER 8 509
*Figure 8.9: A trie to hole identifiers(p.415)
0 a0,b0
0 a0,b0 1 c2
0
0 1 c2
+c5 0 a1,b1
1 0 a1,b1 1 0
1 c5
1 c3 +c1 1 c3
(a) two level trie on four pages (b) inserting c5 with overflow
0 a0,b0
0
1 c2
0 a1,c1
0
1 0 1 b1
Note: time to access 1 c5
a page: # of bits to 1
distinguish the identifiers
c3
Note: identifiers skewed: (c) inserting c1 with overflow
CHAPTER 8 510
depth of tree skewed
local depth
Extendiable Hashing
global depth: 4
f(x)=a set of binary digits --> table lookup
2 {0000,1000,0100,1100}
{000,100} 4 {0001}
{001} 2 {0010,1010,0110,1110}
{010,110} 2 {0011,1011,0111,1111}
{011,111}
3 {0101,1101}
page pointer
{101}
0 a0,b0
0 1 c2 4 {1001}

1 0 a1,b1
pages c & d: buddies
1 c3 0 a0,b0
0 1 c2

0 a1,b1
1 0
1 c5
CHAPTER 8 511
+c1 1 c3
If keys do not uniformly divide up among pages, then the
directory can glow quite large, but most of entries will point
to the same page

f: a family of hashing functions


hashi: key --> {0 .. 2i-1} 1  i  d
hash(key, i): produce random number of i bits from identifier key

hashi is hashi-1 with either a zero or one appeared as the new


leading bit of result
100 000 100 001 101 000
hash(a0,2)=00 hash(a1,4)=0001 hash(b0,2)=00
101 001 110 001 110 010
hash(b1,4)=1001 hash(c1, 4)=0001 hash(c2, 2)=10
110 011 110 101
hash(c3,2)=11 hash(c5,3)=101
CHAPTER 8 512
*Program 8.5: Dynamic hashing (p.421)

#include <stdio.h>
#include <alloc.h>
#include <stdlib.h> 25=32
#define WORD_SIZE 5 /* max number of directory bits */
#define PAGE_SIZE 10 /* max size of a page */
#define DIRECTORY_SIZE 32 /* max size of directory */
typedef struct page *paddr;
typedef struct page { See Figure 8.10(c) global depth=4
int local_depth; /* page level */ local depth of a =2
char *name[PAGE_SIZE]; the actual identifiers
int num_idents; /* #of identifiers in page */
};
typedef struct {
char *key; /* pointer to string */
/*other fields */
} brecord;
int global_depth; /* trie height */
paddr directory[DIRECTORY_SIZE]; /* pointers to pages */
CHAPTER 8 513
paddr hash(char *, short int);
paddr buddy(paddr);
short int pgsearch(char *, paddr );
int convert(paddr);
void enter(brecord, paddr);
void pgdelete(char *, paddr);
paddr find(brecord, char *);
void insert (brecord, char *);
int size(paddr);
void coalesce (paddr, paddr);
void delete(brecord, char *);

paddr hash(char *key, short int precision)


{
/* *key is hashed using a uniform hash function, and the
low precision bits are returned as the page address */
} directory subscript for directory lookup

CHAPTER 8 514
paddr buddy(paddr index)
{
/*Take an address of a page and returns the page’s
buddy, i. e., the leading bit is complemented */
buddy bn-1bn-2 … b0 bn-1bn-2 … b0
}

int size(paddr ptr)


{
/* return the number of identifiers in the page */
}
void coalesce(paddr ptr, paddr, buddy)
{
/*combine page ptr and its buddy into a single page */
}
short int pgsearch{char *key, paddr index)
{
/*Search a page for a key. If found return 1
otherwise return 0 */
CHAPTER 8 515
}
void convert (paddr ptr)
{
/* Convert a pointer to a pointer to a page to an equivalent integer */
}

void enter(brecord r, paddr ptr)


{
/* Insert a new record into the page pointed at by ptr */
}

void pgdelete(char *key, paddr ptr)


{
/* remove the record with key, hey, from the page pointed to by ptr */
}

short int find (char *key, paddr *ptr)


{
/* return 0 if key is not found and 1 if it is. Also,
return a pointer (in ptr) to the page that was searched.
Assume that an empty directory has one page. */
CHAPTER 8 516
paddr index;
int intindex;
index = hash(key, global_depth);
intindex = convert(index);
*ptr = directory[intindex];
return pgsearch(key, ptr);
}

void insert(brecord r, char *key)


{
paddr ptr;
if find(key, &ptr) {
fprintr(stderr, “ The key is already in the table.\n”);
exit(1);
}
if (ptr-> num_idents != PAGE_SIZE) {
enter(r, ptr);
ptr->num_idents++;
}
else{ /*Split the page into two, insert the new key, and update global_depth
CHAPTER 8 517

if necessary.
If this causes global_depth to exceed WORD_SIZE then print an error
and terminate. */
};
}

void delete(brecord r, char *key)


{
/* find and delete the record r from the file */
paddr ptr;
if (!find (key, &ptr )) {
fprintf(stderr, “Key is not in the table.\n”);
return; /* non-fatal error */
}
pgdelete(key, ptr);
if (size(ptr) + size(buddy(ptr)) <= PAGE_SIZE)
coalesce(ptr, buddy(ptr));
}

void main(void)
{
} CHAPTER 8 518
Directoryless Dynamic Hashing (Linear Hashing)
continuous address space

offset of base address (cf directory scheme)

0 a0,b0 00 a0
b0
0 1 c2 01 c2
-
10 a1
1 0 a1,b1 b1
11 c3
1 c3 -
*Figure 8.12: A trie mapped to a directoryless, contiguous storage
(p.424)
CHAPTER 8 519
*Figure 8.13: An example with two insertions (p.425)
2 rehash & split rehash & split
00 a0 000 a0 000 a0
b0 b0 1 b0
01 c2 01 c2 overflow 01 c2
- +c5 - page +c1 -
10 a1 10 a1 10 a1
b1 b1 c5 b1 c1 c5
11 c3 11 c3 11 c3
- - -
100 - 100 -
new page -
-
- new page
insert c5 insert
- c1
start of expansion 2 page 10 overflows page 10 overflows
there are 4 pages page 00 splits page 01 splits
(a) (b) (c)
CHAPTER 8 520
*Figure 8.14: During the rth phase of expansion of directoryless method (p.426)

pages already split pages not yet split pages added so far

addressed by r+1 bits addressed by r bits addressed by r+1 bits

q r
2r pages at start
suppose we are at phase r; there are 2r pages indexed by r bits

CHAPTER 8 521
*Program 8.6:Modified hash function (p.427)

if ( hash(key,r) < q)
page = hash(key, r+1);
else
page = hash(key, r);
if needed, then follow overflow pointers;

CHAPTER 8 522

You might also like