
National Diploma in Information Technology

DATA STRUCTURES
& ALGORITHMS

Part I
INTRODUCTION

1. CONTEXT

Computers deal with huge amounts of data and information.


There is a need to organise data properly and to use it in a structured way.

2. SOME DEFINITIONS

• Data
Any fact, number, character, state

• Information
Data placed within a context
Data with meaning

• Algorithm
A finite sequence of instructions, each with a clear meaning,
which leads to the solution of a problem.

• Data Type
The data type of a variable is the set of values that the variable may assume

Simple Data Type
Variables of such a type can assume 1 value at any time

Structured Data Type
Variables of such a type can assume 1 or more values at any time

• Abstract Data Type


A mathematical model, together with the different operations defined on that model

• Data Structure
Collection of variables, possibly of different data types, connected in various ways
Data structures are used to represent the mathematical model of an ADT

3. AN EXAMPLE: RATIONAL NUMBER

• ADT DEFINITION (type and operators)

type rational;

function makeRational(a, b: integer): rational;

function nominator (r: rational): integer;


function denominator (r: rational): integer;

function add (r1, r2: rational): rational;


function subtract (r1, r2: rational): rational;
function multiply (r1, r2: rational): rational;
function divide (r1, r2: rational): rational;

function equal (r1, r2: rational): boolean;

procedure display (r: rational);

• ADT IMPLEMENTATION (type and operators)


type rational = RECORD
nom: integer;
denom: integer;
END;

function makeRational(a, b: integer): rational;


var r: rational;
begin
r.nom := a; r.denom := b;
makeRational := r;
end;

function nominator (r: rational): integer;


begin
nominator := r.nom;
end;

function denominator (r: rational): integer;
begin
  denominator := r.denom;
end;

function add (r1, r2: rational): rational;


var r: rational;
begin
r.denom := r1.denom * r2.denom;
r.nom := r1.nom * r2.denom + r2.nom * r1.denom;
add := r;
end;

function subtract (r1, r2: rational): rational;


function multiply (r1, r2: rational): rational;


function divide (r1, r2: rational): rational;


function equal (r1, r2: rational): boolean;
begin
  { cross-multiply: a/b = c/d exactly when a*d = c*b }
  equal := r1.nom * r2.denom = r2.nom * r1.denom;
end;

procedure display (r: rational);
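The operators that are listed above without a body can be written along the same lines as add. The program below is only one possible sketch: it assumes that denominators are never zero, it does not reduce fractions to lowest terms, and the values used in the main program are made up for illustration.

```pascal
program RationalSketch;

type rational = record
       nom: integer;     { numerator; the notes call this field "nom" }
       denom: integer;   { denominator, assumed to be non-zero }
     end;

function makeRational(a, b: integer): rational;
var r: rational;
begin
  r.nom := a; r.denom := b;
  makeRational := r;
end;

function subtract(r1, r2: rational): rational;
var r: rational;
begin
  r.denom := r1.denom * r2.denom;
  r.nom := r1.nom * r2.denom - r2.nom * r1.denom;
  subtract := r;
end;

function multiply(r1, r2: rational): rational;
var r: rational;
begin
  r.nom := r1.nom * r2.nom;
  r.denom := r1.denom * r2.denom;
  multiply := r;
end;

{ assumes r2.nom <> 0, otherwise the result gets a zero denominator }
function divide(r1, r2: rational): rational;
var r: rational;
begin
  r.nom := r1.nom * r2.denom;
  r.denom := r1.denom * r2.nom;
  divide := r;
end;

function equal(r1, r2: rational): boolean;
begin
  equal := r1.nom * r2.denom = r2.nom * r1.denom;
end;

procedure display(r: rational);
begin
  writeln(r.nom, '/', r.denom);
end;

var half, third: rational;
begin
  half := makeRational(1, 2);
  third := makeRational(1, 3);
  display(subtract(half, third));             { 1/6 }
  display(multiply(half, third));             { 1/6 }
  display(divide(half, third));               { 3/2 }
  writeln(equal(half, makeRational(2, 4)));   { TRUE }
end.
```

An application that only calls these subprograms never touches the fields nom and denom, so the record could later be replaced by another representation without changing the application.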


4. ADVANTAGES OF USING ADTs

- teamwork
- prototyping
- modular programming approach

Part II
PASCAL DATA TYPES
(a revision)

1. STANDARD PASCAL DATA TYPES

• integer
values: whole numbers, positive and negative
+, -, *, DIV, MOD
ABS(-4) -> 4
• real
values: floating point numbers
+, -, *, /
ROUND(3.6) -> 4
TRUNC(3.6) -> 3
• char
values: alphabet, special chars, control chars
ORD(ch) -> ordinal number
CHR(num) -> character (refer to the ASCII table for details)
• boolean
values: true and false
A      B      A and B   A or B   not A
True   True   True      True     False
True   False  False     True     False
False  True   False     True     True
False  False  False     False    True

2. SUBRANGE TYPES

A simple data type that limits the range of an ordinal type.

TYPE interval = -10 .. 10;
     year = 0..2000;
     result = 0..100;
     capitalLetter = 'A' .. 'Z';

Using subrange types, you protect your programs against using 'unexpected' values.

3. ENUMERATED TYPES

A simple data type for which all possible values are listed as identifiers.

TYPE color = (yellow, red, green);
     day = (Mon, Tue, Wed, Thu, Fri, Sat, Sun);
     weekday = Mon .. Fri;

Using enumerated types, you make programs easier to read, because every value has a meaningful name.
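As a small illustration of how enumerated values behave, the sketch below loops over the type day from the example and uses the standard functions ord and succ. The variable names d and workDays are chosen only for this example.

```pascal
program EnumSketch;

type day = (Mon, Tue, Wed, Thu, Fri, Sat, Sun);

var d: day;
    workDays: integer;
begin
  workDays := 0;
  for d := Mon to Sun do
    if d <= Fri                  { enumerated values are ordered as listed }
      then workDays := workDays + 1;
  writeln('Working days: ', workDays);             { 5 }
  writeln('ord(Wed) = ', ord(Wed));                { 2, counting starts at 0 }
  writeln('succ(Mon) = Tue: ', succ(Mon) = Tue);   { TRUE }
end.
```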

4. ARRAY TYPES

An array is a structured data type for which all members are of the same type.
Each value of an array variable is identified by its "index".
• Example
type string = ARRAY[0..255] OF char;
table = ARRAY [1..10] OF integer;
var myTable: table;

index     1    2    3    4    5    6    7    8    9   10
myTable   ?    ?    ?    ?    ?    ?    ?    ?    ?    ?

x := 5;
myTable[6] := 13;
myTable[1] := x;
myTable[x] := 4 * x;
myTable[3 * x] := 2; (* index out of range error *)
myTable[2] := myTable[1] * 6;
myTable[10] := myTable[1 * 6];

index     1    2    3    4    5    6    7    8    9   10
myTable   5   30    ?    ?   20   13    ?    ?    ?   13

• Example: a two dimensional array
type rowIndex = 1..5;
     colIndex = 1..5;
     matrix = array[rowIndex, colIndex] of integer;
var myMatrix: matrix;
    row: rowIndex;
    col: colIndex;
    sum: integer;

• Initialising a two dimensional array
for row := 1 to 5 do
  for col := 1 to 5 do
    myMatrix[row, col] := 0;

• Summing all the elements of a two dimensional array
sum := 0;
for row := 1 to 5 do
  for col := 1 to 5 do
    sum := sum + myMatrix[row, col];

5. RECORD TYPES

A structured data type for which members can be of different types.


Each value of a record variable is identified by its "fieldname".
• Example
type date = record
day: 1..31;
month: 1..12;
year: 0..3000;
end;
var today: date;

today:   day    ?
         month  ?
         year   ?

today.day := 14;
today.month := 5;
today.year := 2001;

today:   day    14
         month  5
         year   2001

Part III
SEARCHING
METHODS (using
tables)

1. INTRODUCTION

We search for a data item in a table using a KEY.


A key-field is a field whose value is unique for each data item.
• Example
const maxBooks = 10000;
type index = 1 .. maxBooks;
book = RECORD
author: String[25];
publisher: String[25];
title: String[50];
ISBN: String[13]; (* key field *)
END;
bookFile = ARRAY[index] of book;

2. SEQUENTIAL SEARCH

• Idea
Start comparing keys at beginning of table and move towards the end of the table until element is
found or the end of the table is reached.
• Example: searching for 43

1 2 3 4 5 6 7 8 9 10
myTable
5 14 23 30 40 43 56 61 77 99

• Algorithm
The following function returns the index of the cell where myKey is found, or 0 if myKey is not
present.
function seqSearch(T: table; myKey: integer): integer;
var index: integer;
begin
  index := 1;
  while (index < max) and (T[index] <> myKey) do
    index := index + 1;
  if T[index] = myKey
    then seqSearch := index
    else seqSearch := 0;
end;

3. BINARY SEARCH

• Idea
take middle-element of the table & compare
if keys do not match then
if your key is smaller
then Search the left half of the table (*using binary search *)
else Search the right half of the table (*using binary search *)
• Example: searching for 43
1 2 3 4 5 6 7 8 9 10
myTable
5 14 23 30 40 43 56 61 77 99

• Algorithm
The following function returns the index of the cell where myKey is found, or 0 if myKey is not
present.
function BinSearch(T: table; myKey: integer): integer;
var low, high, mid: integer;
    found: boolean;
begin
  low := 1;
  high := max;
  found := false;
  repeat
    mid := (low + high) DIV 2;
    if T[mid] = myKey
      then found := true
      else if T[mid] < myKey
        then low := mid + 1
        else high := mid - 1;
  until found OR (low > high);
  if found
    then BinSearch := mid
    else BinSearch := 0;
end;

4. COMPARING ALGORITHMS

• Pre-conditions are different.


Sequential Search works when elements are not sorted.
Binary Search works only when elements are sorted.
• Running time of algorithms.
Sequential Search -> average of about n/2 comparisons (when the key is present)
Binary Search -> at most about log2(n) + 1 comparisons (the average is only slightly lower)
Sequential Search -> O(n)
Binary Search -> O(log2 n)
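To make the difference in running time concrete, the sketch below runs both searches on the example table and counts how many table elements each one inspects. The global counter comparisons and the typed constant myTable are additions made for this illustration; they are not part of the algorithms above.

```pascal
program CompareSearches;

const max = 10;
type table = array[1..max] of integer;

var comparisons: integer;   { number of table elements inspected by the last search }

function seqSearch(T: table; myKey: integer): integer;
var index: integer;
begin
  index := 1;
  comparisons := 1;                     { T[1] is always inspected }
  while (index < max) and (T[index] <> myKey) do begin
    index := index + 1;
    comparisons := comparisons + 1;     { one more element inspected }
  end;
  if T[index] = myKey
    then seqSearch := index
    else seqSearch := 0;
end;

function binSearch(T: table; myKey: integer): integer;
var low, high, mid: integer;
    found: boolean;
begin
  low := 1; high := max; found := false;
  comparisons := 0;
  repeat
    mid := (low + high) div 2;
    comparisons := comparisons + 1;     { T[mid] is inspected }
    if T[mid] = myKey
      then found := true
      else if T[mid] < myKey
        then low := mid + 1
        else high := mid - 1;
  until found or (low > high);
  if found
    then binSearch := mid
    else binSearch := 0;
end;

const myTable: table = (5, 14, 23, 30, 40, 43, 56, 61, 77, 99);

var where: integer;
begin
  where := seqSearch(myTable, 43);
  writeln('sequential: index ', where, ', elements inspected: ', comparisons);  { index 6, 6 }
  where := binSearch(myTable, 43);
  writeln('binary:     index ', where, ', elements inspected: ', comparisons);  { index 6, 3 }
end.
```

For the key 43 the sequential search inspects cells 1 to 6, while the binary search inspects only cells 5, 8 and 6.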

Part IV
INTERNAL SORTING
METHODS
(using tables)

1. INTRODUCTION

"Sorting" means to arrange a sequence of data items so that the values of their key-fields form a
proper sequence.
Sorting of data is an extremely important and frequently executed task.
Efficient algorithms for sorting data are needed.

2. BUBBLE SORT

• The idea
Go through the whole array, comparing adjacent elements and reversing two elements whenever
they are out of order.
Observe that after one pass, the smallest element will be in place.
If we repeat this process N-1 times then the whole array will be sorted!
• An example

1 2 3 4 5 6 7 8 9 10
11 76 4 14 40 9 66 61 5 12

Pass 1
1 2 3 4 5 6 7 8 9 10
4 11 76 5 14 40 9 66 61 12

smallest element is in its right place


1 2 3 4 5 6 7 8 9 10
4 11 76 5 14 40 9 66 61 12

Pass 2
1 2 3 4 5 6 7 8 9 10
4 5 11 76 9 14 40 12 66 61

two elements are in the right place

Pass 3

1 2 3 4 5 6 7 8 9 10
4 5 9 11 76 12 14 40 61 66

Pass 4

1 2 3 4 5 6 7 8 9 10
4 5 9 11 12 76 14 40 61 66

Pass 5

1 2 3 4 5 6 7 8 9 10
4 5 9 11 12 14 76 40 61 66

Pass 6

1 2 3 4 5 6 7 8 9 10
4 5 9 11 12 14 40 76 61 66

Pass 7

1 2 3 4 5 6 7 8 9 10
4 5 9 11 12 14 40 61 76 66

Pass 8

1 2 3 4 5 6 7 8 9 10
4 5 9 11 12 14 40 61 66 76

Pass 9

1 2 3 4 5 6 7 8 9 10
4 5 9 11 12 14 40 61 66 76

The complete table is sorted after N-1 passes

• The Algorithm
We assume the following declarations:
const max = 10;
type table = array[1..max] of integer;

We need a procedure to swap the contents of two variables:


procedure swap(var x, y: integer);
var temp: integer;
begin
temp := x;
x := y;
y := temp;
end;

The procedure to sort a table using the bubble sort strategy:


procedure bubbleSort(var t: table);
var pass, index: integer;
begin
  for pass := 1 to max - 1 do
    for index := max downto pass + 1 do
      if t[index] < t[index - 1]
        then swap(t[index], t[index - 1]);
end;
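To follow the passes on a machine, the procedure can be wrapped in a small test program. The sketch below is one way to do that: the names example, printTable and bubbleSortTraced are introduced only for this illustration, and the table is printed after every pass so that the output can be compared with the tables in the example above.

```pascal
program BubbleTrace;

const max = 10;
type table = array[1..max] of integer;

const example: table = (11, 76, 4, 14, 40, 9, 66, 61, 5, 12);

procedure swap(var x, y: integer);
var temp: integer;
begin
  temp := x;
  x := y;
  y := temp;
end;

procedure printTable(var t: table);
var i: integer;
begin
  for i := 1 to max do
    write(t[i]:4);
  writeln;
end;

{ the bubble sort from the notes, printing the table after each pass }
procedure bubbleSortTraced(var t: table);
var pass, index: integer;
begin
  for pass := 1 to max - 1 do begin
    for index := max downto pass + 1 do
      if t[index] < t[index - 1]
        then swap(t[index], t[index - 1]);
    write('Pass ', pass, ':');
    printTable(t);
  end;
end;

var t: table;
begin
  t := example;
  bubbleSortTraced(t);
end.
```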

3. INSERTION SORT

• The Idea
If we insert an element in its correct position in a list of x sorted elements, we end up with a list
of x + 1 sorted elements.
Observe that 1 element is always sorted, and so, after inserting one other element we have 2
sorted elements.
If we now keep on inserting elements then after N-1 insertions, our whole list will be sorted.

• An example
1 2 3 4 5 6 7 8 9 10
11 76 4 14 40 9 66 61 5 12

one sorted element


Pass 1: we insert 76

1 2 3 4 5 6 7 8 9 10
11 76 4 14 40 9 66 61 5 12

two sorted elements


Pass 2: we insert 4

1 2 3 4 5 6 7 8 9 10
4 11 76 14 40 9 66 61 5 12

Pass 3: we insert 14

1 2 3 4 5 6 7 8 9 10
4 11 14 76 40 9 66 61 5 12

Pass 4: we insert 40

1 2 3 4 5 6 7 8 9 10
4 11 14 40 76 9 66 61 5 12

Pass 5
1 2 3 4 5 6 7 8 9 10
4 9 11 14 40 76 66 61 5 12

Pass 6

1 2 3 4 5 6 7 8 9 10
4 9 11 14 40 66 76 61 5 12

Pass 7

1 2 3 4 5 6 7 8 9 10
4 9 11 14 40 61 66 76 5 12

Pass 8

1 2 3 4 5 6 7 8 9 10
4 5 9 11 14 40 61 66 76 12

Pass 9

1 2 3 4 5 6 7 8 9 10
4 5 9 11 12 14 40 61 66 76

The complete table is sorted after N-1 passes

• The Algorithm
We assume the following declarations:
const max = 10;
type table = array[1..max] of integer;

We need a procedure to insert one element into a list of already sorted elements:
{ the procedure insert puts element T[index] in its correct }
{ place between elements T[1] and T[index - 1] }
procedure insert(var t: table; index: integer);
begin
  while (index > 1) and (t[index] < t[index - 1]) do
  begin
    swap(t[index], t[index - 1]);
    index := index - 1;
  end;
end;

The procedure to sort a table using the insertion sort strategy:


procedure insertionSort(var t: table);
var pass: integer;
begin
  for pass := 2 to max do
    insert(t, pass);
end;

4. SELECTION SORT

• The Idea
Find the smallest element in a list of unsorted elements and swap it with the element in position
1.
Now, find the smallest element in the rest of the list of unsorted elements and swap it with the
element in position 2.
If we repeat this N-1 times, our list will be completely sorted.

• An Example

1 2 3 4 5 6 7 8 9 10
11 76 4 14 40 9 66 61 5 12

we find the smallest element


Pass 1: we swap 4 and 11

1 2 3 4 5 6 7 8 9 10
4 76 11 14 40 9 66 61 5 12

we find the smallest element


Pass 2: we swap 5 and 76

1 2 3 4 5 6 7 8 9 10
4 5 11 14 40 9 66 61 76 12

Pass 3: we swap 9 and 11

1 2 3 4 5 6 7 8 9 10
4 5 9 14 40 11 66 61 76 12

Pass 4: we swap 11 and 14

1 2 3 4 5 6 7 8 9 10
4 5 9 11 40 14 66 61 76 12

Pass 5

1 2 3 4 5 6 7 8 9 10
4 5 9 11 12 14 66 61 76 40
Pass 6

1 2 3 4 5 6 7 8 9 10
4 5 9 11 12 14 66 61 76 40

Pass 7

1 2 3 4 5 6 7 8 9 10
4 5 9 11 12 14 40 61 76 66

Pass 8

1 2 3 4 5 6 7 8 9 10
4 5 9 11 12 14 40 61 76 66

Pass 9

1 2 3 4 5 6 7 8 9 10
4 5 9 11 12 14 40 61 66 76

The complete table is sorted after N-1 passes

• The Algorithm.
We assume the following declarations:
const max = 10;
type table = array[1..max] of integer;

We need a function that finds the smallest element in a part of the table:
{ the function findSmallest returns the index of the }
{ smallest element to be found in cells T[index] .. T[max]}
function findSmallest(t: Table; index: integer): integer;
var currentSmall, x: integer;
begin
currentSmall := index;
for x := index + 1 to max do
if t[x] < t[currentSmall]
then currentSmall := x;
findSmallest := currentSmall;
end;

The procedure to sort a table using the selection sort strategy:


procedure selectionSort(var t: table);
var pass, index: integer;
begin
  for pass := 1 to max - 1 do begin
    index := findSmallest(t, pass);
    swap(t[pass], t[index]);
  end;
end;

Part V
POINTERS

1. A NEW DATA TYPE: POINTER

A pointer variable doesn’t store 'data' directly, it only points to where 'data' is stored.
A pointer variable is a variable whose value indicates another variable.
We say that a pointer variable points to an anonymous variable. That is, a variable which does not
have a name. The only way to access the anonymous variable is through the use of the pointer.

2. DECLARING POINTERS

We declare pointers in Pascal using ^


In general: VAR PointerVariable: ^datatype;

Example: VAR intPtr: ^integer;

Note:
• intPtr is a variable of type pointer.
• intPtr points to another variable; this anonymous variable, which can not be referred to by
name, is of type integer.
• intPtr actually contains the address of the anonymous variable

intPtr ---> [ 12 ]

(intPtr is the pointer variable; the cell holding 12 is the anonymous variable)


Another example:
type charPointer = ^char;
infoPointer = ^info;
info = RECORD
name: string[25];
age: 0..100;
sex: (male, female);
END;
var ip: infopointer;
cp: charPointer;

3. ALLOCATING MEMORY: NEW

When a program starts running, the value of all variables is undefined.


So, initially, intPtr does not point to an anonymous variable.
intPtr ?

The procedure NEW allows us to allocate a portion of memory which can hold an anonymous
variable.
This means we have run-time creation of variables, we call them dynamic variables.

NEW(pointerVariable)
The system will search for a free part of memory which is big enough to hold an
anonymous variable and it will store the address of that variable in
pointerVariable

• Example
NEW(intPtr);
• an anonymous variable of type integer is created
• intPtr points to that anonymous variable
• the value of the anonymous variable is undefined

intPtr ?

4. ACCESSING ANONYMOUS VARIABLES

To access an anonymous variable we use the ^ selector.

pointerVariable^
This refers to the (anonymous) variable pointed to by pointerVariable

• Example

intPtr^ := 42;             (* intPtr -> [ 42 ]  *)
intPtr^ := intPtr^ + 5;    (* intPtr -> [ 47 ]  *)
intPtr^ := -1 * intPtr^;   (* intPtr -> [ -47 ] *)

It is important to know the difference between


‘working with a pointer’ and ‘working with an anonymous variable’

The only operations allowed on pointers are =, <> and :=

The operations allowed on anonymous variables are defined by their type.

• Example
program happyPointers;
var ip1, ip2: ^integer;
begin
  ip1^ := 10;           (* Error: ip1^ is undefined, nothing has been allocated yet *)

  new(ip1);             (* ip1 -> [ ? ] *)

  ip1 := 25;            (* Error: types incompatible *)

  ip1^ := 25;           (* ip1 -> [ 25 ] *)

  ip2^ := ip1^;         (* Error: ip2^ has not been allocated *)

  ip2 := ip1;           (* ip1 -> [ 25 ] <- ip2 *)

  ip2^ := 82;           (* ip1 -> [ 82 ] <- ip2 *)

  if ip1 = ip2
    then writeln('Pointers are equal')
    else writeln('Pointers are different');

  new(ip1);             (* ip1 -> [ ? ]     ip2 -> [ 82 ] *)

  ip1^ := ip2^ div 2;   (* ip1 -> [ 41 ]    ip2 -> [ 82 ] *)

  ip2 := ip1^;          (* Error: types incompatible *)

  ip2 := ip1;           (* ip1 -> [ 41 ] <- ip2 *)
                        (* the cell holding 82 can no longer be reached: GARBAGE *)
end.

5. DE-ALLOCATING MEMORY: DISPOSE

Through the use of pointers we can allocate, but also de-allocate memory at run-time.
The procedure DISPOSE allows us to free a portion of memory that is used by a dynamic variable.

DISPOSE(pointerVariable)
The Pascal system will free the memory that was in use by the anonymous variable
pointed to by pointerVariable.
After this, the value for pointerVariable is undefined.

• Example
program happyPointers;
var ip1, ip2: ^integer;
begin
  new(ip1);        (* ip1 -> [ ? ] *)

  ip1^ := 150;     (* ip1 -> [ 150 ] *)

  dispose(ip1);    (* ip1 = ? (undefined) *)

  new(ip1);        (* ip1 -> [ ? ] *)

  ip2 := ip1;      (* ip1 -> [ ? ] <- ip2 *)

  ip2^ := 60;      (* ip1 -> [ 60 ] <- ip2 *)

  dispose(ip1);    (* ip1 = ? (undefined) *)
                   (* ip2 is now a DANGLING POINTER!!! We did not dispose ip2, *)
                   (* but it is no longer pointing to a dynamic variable.      *)
end.

6. THE NIL VALUE

Every pointer can be assigned a special NIL value to indicate it does not point to any dynamic
variable.
It is a good practice to assign this NIL-value to pointers which are not pointing to any dynamic
variable.

ip1 := nil;      (* ip1 points to nothing *)

• Example.
type info = record
       name: string[25];
       age: 0..100;
       sex: (male, female);
     end;
     infoPtr = ^info;

var ptr1, ptr2: infoPtr;

begin
  NEW(ptr1);                 (* ptr1 -> [ ?, ?, ? ] *)

  ptr1^.age := 18;           (* ptr1 -> [ 18, ?, ? ] *)

  ptr1^.name := 'Nyasha';    (* ptr1 -> [ 18, Nyasha, ? ] *)

  ptr1^.sex := male;         (* ptr1 -> [ 18, Nyasha, male ] *)

  ptr2 := ptr1;              (* ptr1 -> [ 18, Nyasha, male ] <- ptr2 *)

  ptr1 := nil;               (* ptr1 = nil     ptr2 -> [ 18, Nyasha, male ] *)
end.

Part VI
Linked Lists

1. DYNAMIC AND STATIC DATA STRUCTURES

Data structures which have their size defined at compile time, and which cannot have their size
changed at run time, are called static data structures.
The size of a dynamic data structure is not fixed. It can be changed at run time.

2. INTRODUCING LINKED LISTS.

type list = ^node;
     node = record
       name: string[25];
       age: 0..100;
       sex: (male, female);
       next: list;
     end;

var head, tail: list;

begin
  (* head = ?, tail = ?  (both undefined) *)

  NEW(head);
  head^.name := 'Nyasha';
  head^.sex := male;
  head^.age := 9;
  head^.next := nil;
  tail := head;
  (* head -> [9, Nyasha, male] <- tail *)

  NEW(tail^.next);
  (* head -> [9, Nyasha, male] -> [ ? ]      tail still points to the first node *)

  tail := tail^.next;
  (* head -> [9, Nyasha, male] -> [ ? ] <- tail *)

  tail^.name := 'Paida';
  tail^.sex := female;
  tail^.age := 5;
  tail^.next := nil;
  (* head -> [9, Nyasha, male] -> [5, Paida, female] <- tail *)

  NEW(tail^.next);
  tail := tail^.next;
  tail^.name := 'Mazvita';
  tail^.sex := female;
  tail^.age := 14;
  tail^.next := nil;
  (* head -> [9, Nyasha, male] -> [5, Paida, female] -> [14, Mazvita, female] <- tail *)
end.

3. SINGLE LINKED LISTS.

A linked list consists of a sequence of data-items. Each data-item in the list is related, or identified,
by its position relative to the other elements.
In a linear single linked list, the only data-item accessible from an element is the ‘next’ element.

• Example
List1: 2, 5, 10, 8, 3, 7
List2: 2, 5, 8, 10, 3, 7

List1 and List2 contain the same elements in a different order, so they are two different lists.


We can access elements of a list in a sequential way but we do not have direct access to elements
of the lists

4. IMPLEMENTATION OF LISTS

TYPE list = ^node;


node = record
info: integer; (* any data structure *)
next: list;
end;

VAR l, temp, pos: list;

• The null list or the empty list.

l := nil;        (* l points to nothing: the empty list *)

• Adding an element to the front of a list.

l -> 5 -> 10 -> 2

1. make a new cell                                        NEW(temp);
2. put info in cell                                       temp^.info := 3;
3. let the cell's next field point to the original list   temp^.next := l;
4. let the original list now point to our new cell        l := temp;

l -> 3 -> 5 -> 10 -> 2

• Adding an element in the middle of a list.

l -> 5 -> 10 -> 2
          ^
          pos

1. make a new cell                                        NEW(temp);
2. put info in cell                                       temp^.info := 7;
3. adjust the next field                                  temp^.next := pos^.next;
4. adjust the next field of pos                           pos^.next := temp;

l -> 5 -> 10 -> 7 -> 2

• Deleting an element at the front of a list.

l -> 5 -> 10 -> 2

1. change l to its next field                             l := l^.next;

     5 -> 10 -> 2
          ^
          l
Doing this, we create GARBAGE !!! (the cell holding 5 can no longer be reached)

Proper solution:

l -> 5 -> 10 -> 2

1. make temp point to l                                   temp := l;
2. change l to its next field                             l := l^.next;
3. destroy temp's dynamic variable                        DISPOSE(temp);

l -> 10 -> 2

• Traversing a list.
Example 1: Printing the contents of a list.

l 5 10 2

temp := l;
while temp <> nil do begin
writeln(temp^.info);
temp := temp^.next;
end;

Example 2: Searching for an element in a list.


(* Search returns a pointer to the node containing key when *)
(* key is present. It returns nil otherwise. *)
function Search(l: list; key: integer): list;
var temp: list; found: boolean;
begin
temp := l; found := false;
while (temp <> nil) and (not(found)) do
if temp^.info = key
then found := true
else temp := temp^.next;
if found
then Search := temp
else Search := nil;
end;

• Deleting an element in the middle of a list.

l -> 5 -> 10 -> 2
          ^
          pos
Observe: we need a pointer to the previous element!

l -> 5 -> 10 -> 2
     ^    ^
     prev pos

prev^.next := pos^.next;
DISPOSE(pos);

l -> 5 -> 2
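The two deletion cases (front and middle) can be combined into one procedure that first walks through the list to find pos and prev. The sketch below is one possible way to do this. The names addFront, deleteKey and printList are chosen for this illustration, and a found flag is used in the loop, as in the Search function above, so that pos^ is never accessed when pos is nil.

```pascal
program DeleteFromList;

type list = ^node;
     node = record
       info: integer;
       next: list;
     end;

{ adds a new cell holding x to the front of l }
procedure addFront(var l: list; x: integer);
var temp: list;
begin
  new(temp);
  temp^.info := x;
  temp^.next := l;
  l := temp;
end;

{ deletes the first cell holding key; does nothing if key is absent }
procedure deleteKey(var l: list; key: integer);
var prev, pos: list;
    found: boolean;
begin
  prev := nil;
  pos := l;
  found := false;
  while (pos <> nil) and not found do
    if pos^.info = key
      then found := true
      else begin
        prev := pos;            { remember the previous cell }
        pos := pos^.next;
      end;
  if found then begin
    if prev = nil
      then l := pos^.next              { key was in the first cell }
      else prev^.next := pos^.next;    { bypass the cell in the middle }
    dispose(pos);                      { no garbage is left behind }
  end;
end;

procedure printList(l: list);
begin
  while l <> nil do begin
    write(l^.info, ' ');
    l := l^.next;
  end;
  writeln;
end;

var l: list;
begin
  l := nil;
  addFront(l, 2); addFront(l, 10); addFront(l, 5);   { l -> 5 -> 10 -> 2 }
  deleteKey(l, 10);
  printList(l);                                      { 5 2 }
  deleteKey(l, 5);
  printList(l);                                      { 2 }
end.
```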

5. STATIC VS. DYNAMIC DATA STRUCTURES

• Contiguous List

TYPE cList = ARRAY [1..100] OF integer;


VAR staticResults: cList;

• Dynamic List

TYPE dList = ^node;


node = record
info: integer;
next: dList;
end;
VAR dynamicResults: dList;

• Comparison
Dynamic data structures have the obvious advantage that their size can grow and shrink at run time.
- The variable staticResults can hold exactly 100 integers.
- If we store only 5 integers in the variable, space for 100 integers is still taken up.
- Storing 101 integers in the variable is impossible.
- The size of dynamicResults is only limited by the memory size of your machine/system.

Dynamic data structures have a pointer overhead.
- For every integer we store in dynamicResults, we also need to store a pointer.

Lists can only be processed in a sequential way.
- For a binary search we need direct access.

6. OTHER LIST STRUCTURES.

• Doubly linked linear list.

5 <-> 8 <-> 9

type doubleList = ^doubleNode;
     doubleNode = record
       info: integer;
       next: doubleList;
       previous: doubleList;
     end;

Advantage: Allows us to travel in both directions; this makes some operations easier (e.g. deleting).
Disadvantage: Overhead of two pointers per node.

• Single linked circular list.

l 5 10 2

type circularList = ^circularNode;


circularNode = record
info: integer;
next: circularList;
end;

Advantage: You can reach any node from a given node; this makes some operations easier (e.g. searching).

• Double linked circular list.

5 8 9

type dcList = ^dcNode;
     dcNode = record
       info: integer;
       next: dcList;
       previous: dcList;
     end;

Part VII
RECURSION

1. WHAT IS RECURSION?

A recursive subprogram is a subprogram which calls itself.

A recursive definition defines an object in terms of the object itself.

• Example: Calculating the factorial.


Iterative definition:
Factorial(n) = 1 when n = 0
Factorial(n) = n x (n-1) x (n-2) x ... x 2 x 1 when n > 0

Recursive definition:
Factorial(n) = 1 when n = 0
Factorial(n) = n x Factorial(n-1) when n > 0

Factorial(5) = 5 x Factorial(4)        Factorial(5) = 5 x 24 = 120
Factorial(4) = 4 x Factorial(3)        Factorial(4) = 4 x 6 = 24
Factorial(3) = 3 x Factorial(2)        Factorial(3) = 3 x 2 = 6
Factorial(2) = 2 x Factorial(1)        Factorial(2) = 2 x 1 = 2
Factorial(1) = 1 x Factorial(0)        Factorial(1) = 1 x 1 = 1
Factorial(0) = 1

(read the left column from top to bottom, then the right column from bottom to top)

2. SUBPROGRAMS IN PASCAL

Program Example;
procedure P1;
begin
...
end;
procedure P2;
begin
...
p1;
...
end;
procedure P3;
begin
...
p2;
...
end;

begin
p3;
end.

(Figure: the program above, with arrows marking each subprogram call and each return from a subprogram.)

• subprogram call
- halt the current process
- pass parameters
- start execution of subprogram

• end of subprogram
- go back to that point in the program where the subprogram was called from
- return result (functions!)
- resume execution.

3. RECURSIVE SUBPROGRAMS

• The function FACTORIAL.

function factorial(n: integer): integer;
begin
  if n = 0
    then factorial := 1
    else factorial := n * factorial(n-1);
end;

Some guidelines for recursive subprograms:
- Each recursive step 'simplifies' the problem a bit.
- There is always a value for which the subprogram does not call upon itself; the exit-value.
- Each recursive step brings you closer to the exit-value.

writeln(fac(4)) ???

function fac(n: integer): integer;
begin
  if n = 0
    then fac := 1
    else fac := n * fac(n-1);
end;

Every call of fac gets its own copy of the parameter n:

n <- 4     fac := 4 * fac(3)
n <- 3     fac := 3 * fac(2)
n <- 2     fac := 2 * fac(1)
n <- 1     fac := 1 * fac(0)
n <- 0     fac := 1

• The FIBONACCI sequence.

The fibonacci sequence:


0 1 1 2 3 5 8 13 21 34 55 ...

The definition:
fib(n) = n when n = 0 or n = 1
fib(n) = fib(n-1) + fib(n-2) when n > 1

The function:
function fibo(n: integer): integer;
begin
if (n = 0) or (n = 1)
then fibo := n
else fibo := fibo(n-1) + fibo(n-2);
end;

An exercise:
Write an iterative version of ‘fibo’

• MULTIPLICATION of natural numbers.

The idea:

b x a = a + (a + a + ... + a), where the bracketed sum is (b-1) x a

The definition:
a x b = a when b = 1
a x b = a + a x (b - 1) when b > 1

The function:
function multiply(a, b: integer): integer;
begin
if (b = 1)
then multiply := a
else multiply := a + multiply(a, b-1);
end;

An exercise:
Write a recursive version of multiply which works for all integer numbers (including zero and negative numbers)

• BINARY SEARCH.
The idea:

search for key:

compare key with middle-element


if equal then stop
if key is smaller: search for key in left half
if key is bigger: search for key in right half

The function:
{ table = array[1..max] of integer, as declared in Part IV; }
{ a parameter needs a type name, not an array description   }
function binSearch(a: table;
                   key: integer;
                   low, high: integer): integer;
var mid: integer;
begin
  if low > high
    then binSearch := 0
    else begin
      mid := (low + high) div 2;
      if a[mid] = key
        then binSearch := mid
        else if key < a[mid]
          then binSearch := binSearch(a, key, low, mid - 1)
          else binSearch := binSearch(a, key, mid + 1, high)
    end;
end;

An exercise:

Determine the output of hello(5)

procedure hello(i: integer);
var x: integer;
begin
  write(i);
  for x := 1 to i do
    write('x');
  writeln;
  if i > 0
    then hello(i-1)
    else writeln('Stop...');
  write('x');
end;

• The TOWERS OF HANOI problem.

The problem:

A B C

Move all (let’s call it N) disks from peg A to peg C.


Only the top disk of any peg may be moved to another peg.
A larger disk may never rest on a smaller one.

???

Some ideas:
Suppose we can move N-1 disks from one peg to another.
...then the problem is solved!

because we would
- move N-1 disks from A to B
- move 1 disk from A to C (triviality)
- move N-1 disks from B to C

So we have a solution for N disks in terms of a solution of N-1 disks, and we have
the trivial case of 1 disk.

If we keep on using this same strategy then the whole problem will come down to
trivial cases:

N disks
N-1 disks & trivial case
N-2 disks & a trivial case
N-3 disks & a trivial case
...
1 disk = a trivial case

The solution:

procedure towers(N: integer;
                 fromPeg, toPeg, auxPeg: char);
begin
  if N = 1
    then writeln('Move disk from ', fromPeg, ' to ', toPeg)
    else begin
      towers(N-1, fromPeg, auxPeg, toPeg);
      towers(1, fromPeg, toPeg, auxPeg);
      towers(N-1, auxPeg, toPeg, fromPeg);
    end;
end;

program TowersOfHanoi;
var nrOfDisks: integer;

procedure towers(...);
begin
  ...
end;

begin
  write('How many disks need to be moved? ');
  readln(nrOfDisks);
  towers(nrOfDisks, 'A', 'C', 'B');   { from peg A to peg C, with B as auxiliary peg }
end.
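The recursive structure also tells us how many moves are needed: moving N disks takes twice the moves for N-1 disks plus one, which works out to 2^N - 1. The sketch below checks this by counting moves instead of printing them. The global counter moves is an addition made for this illustration, and N is kept small so that the count fits in a 16-bit integer.

```pascal
program HanoiCount;

var moves: integer;   { number of single-disk moves made so far }

procedure towers(n: integer; fromPeg, toPeg, auxPeg: char);
begin
  if n = 1
    then moves := moves + 1      { the trivial case counts as one move }
    else begin
      towers(n - 1, fromPeg, auxPeg, toPeg);
      towers(1, fromPeg, toPeg, auxPeg);
      towers(n - 1, auxPeg, toPeg, fromPeg);
    end;
end;

var n: integer;
begin
  for n := 1 to 10 do begin
    moves := 0;
    towers(n, 'A', 'C', 'B');
    writeln(n:2, ' disks: ', moves:5, ' moves');   { 1, 3, 7, 15, ..., 1023 }
  end;
end.
```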
• Counting nodes in a linked list.

function countNodes(l: list): integer;


begin
if l = nil
then countNodes := 0
else countNodes := 1 + countNodes(l^.next)
end;

• Printing the contents of a linked list.

procedure printList(l: list);


begin
if l <> nil
then begin
write(l^.info);
printList(l^.next);
end;
end;

• Sequential search in a linked list.

function searchList(l: list; key: integer): list;
begin
  if l = nil
    then searchList := nil
    else if l^.info = key
      then searchList := l
      else searchList := searchList(l^.next, key);
end;

Part VIII
STACKS

1. DEFINITIONS.

A stack is an ordered collection of items into which new items may be inserted and from which items
may be deleted at one end, called the TOP of the stack.

A stack is a dynamic data structure. The two basic operations on stacks are push and pop.

We use PUSH to add an element to the top of a stack.


push(s, x) -> add the item x to the top of the stack.

We use POP to delete an element from a stack.


pop(s) -> remove and return the top item from the stack.

2. AN EXAMPLE.

                            25 <- tos
              80 <- tos     80            80 <- tos
40 <- tos     40            40            40            40 <- tos
55            55            55            55            55            55 <- tos
20            20            20            20            20            20
10            10            10            10            10            10
s             s             s             s             s             s
(start)       push(s, 80)   push(s, 25)   x := pop(s)   x := pop(s)   x := pop(s)

3. SOME MORE TERMINOLOGY.

A stack is also called a 'pushdown list' or 'LIFO'-structure.

An 'empty stack' is a stack which contains no elements.

• Calling pop(s) when s is empty creates an underflow.
• The function empty(s) returns true when s is empty.

The function stacktop returns a copy of the top element without removing it.

x := STACKTOP(s);     has the same effect as     x := POP(s);
                                                 PUSH(s, x);

4. EXAMPLE: REVERSING A STRING.

var x: integer;
    s: stack;
    str: string;

createStack(s);
readln(str);                       (* e.g. str = 'Apple' *)

for x := 1 to length(str) do
  push(s, str[x]);                 (* s, from top to bottom: e l p p A *)

x := 1;
while not(emptyStack(s)) do begin
  str[x] := pop(s);
  x := x + 1;
end;

writeln(str);                      (* elppA *)
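The fragment above can be turned into a complete program once a stack of characters is available. The sketch below uses a linked-list stack with elementType = char. The function name reverse is chosen for this illustration, and pop leaves out the underflow check because it is only called after emptyStack has been tested.

```pascal
program ReverseString;

type elementType = char;
     stack = ^stackNode;
     stackNode = record
       info: elementType;
       next: stack;
     end;

procedure createStack(var s: stack);
begin
  s := nil;
end;

function emptyStack(s: stack): boolean;
begin
  emptyStack := s = nil;
end;

procedure push(var s: stack; el: elementType);
var temp: stack;
begin
  new(temp);
  temp^.info := el;
  temp^.next := s;
  s := temp;
end;

{ assumes s is not empty; the caller tests emptyStack first }
function pop(var s: stack): elementType;
var temp: stack;
begin
  temp := s;
  s := s^.next;
  pop := temp^.info;
  dispose(temp);
end;

{ returns the characters of str in reverse order }
function reverse(str: string): string;
var s: stack;
    x: integer;
begin
  createStack(s);
  for x := 1 to length(str) do
    push(s, str[x]);             { the last character ends up on top }
  x := 1;
  while not emptyStack(s) do begin
    str[x] := pop(s);            { so it is the first one to come off }
    x := x + 1;
  end;
  reverse := str;
end;

begin
  writeln(reverse('Apple'));     { elppA }
end.
```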

5. THE ADT STACK.

• DEFINITION

type stack;
     elementType;

{ creates an empty stack s }
procedure createStack(var s: stack);

{ destroys the stack s }
procedure destroyStack(var s: stack);

{ pushes el on top of the stack s }
procedure push(var s: stack; el: elementType);

{ removes and returns the top element of s }
function pop(var s: stack): elementType;

{ returns true if s contains no elements, and false otherwise }
function emptyStack(s: stack): boolean;

• IMPLEMENTATION USING ARRAYS

const maxStack = 100;

type elementType = integer;


stack = record
content: array[1..maxStack] of elementType;
top: 0..maxStack;
end;

procedure createStack(var s:stack);


begin
s.top := 0;
end;

procedure destroyStack(var s:stack);


begin
s.top := 0;
end;

function emptyStack(s: stack): boolean;


begin
emptyStack := s.top = 0;
end;

procedure push(var s: stack; el: elementType);
begin
  if s.top = maxStack
    then error('Stack overflow')
    else begin
      s.top := s.top + 1;
      s.content[s.top] := el;
    end;
end;

function pop(var s: stack): elementType;
begin
  if emptyStack(s)
    then error('Stack underflow')
    else begin
      pop := s.content[s.top];
      s.top := s.top - 1;
    end;
end;
• IMPLEMENTATION USING LISTS
type elementType = integer;
stack = ^stackNode;
stackNode = record
info: elementType;
next: stack;
end;

procedure createStack(var s:stack);


begin
s := nil;
end;

function emptyStack(s: stack): boolean;


begin
emptyStack := s = nil;
end;

procedure push(var s: stack; el: elementType);
var temp: stack;
begin
  NEW(temp);
  temp^.info := el;
  temp^.next := s;
  s := temp;
end;

function pop(var s: stack): elementType;
var temp: stack;
begin
  if emptyStack(s)
    then error('Stack underflow')
    else begin
      temp := s;
      s := s^.next;
      pop := temp^.info;
      DISPOSE(temp);
    end;
end;

procedure destroyStack(var s: stack);
var aux: elementType;
begin
  while not(emptyStack(s)) do
    aux := pop(s);
end;
• Black box principle

(Figure: APPLICATION 1, APPLICATION 2, ... APPLICATION x on one side, the IMPLEMENTATION on the
other side, and between them the DEFINITION of the ADT stack: the types stack and elementType
and the headers of createStack, destroyStack, push, pop and emptyStack as listed above.)

6. INTRODUCTION TO EXPRESSIONS.

• Prefix, infix & postfix

A+B     operand operator operand     infix
+AB     operator operand operand     prefix (Polish notation)
AB+     operand operand operator     postfix (Reverse Polish notation)

• Precedence rules

From highest to lowest precedence ($ stands for 'to the power of'):

( )       $       x and /       + and -

• Examples

INFIX                              POSTFIX               PREFIX
A+BxC                              ABCx+                 +AxBC
(A + B) x C                        AB+Cx                 x+ABC
(A + B) x (C - D)                  AB+CD-x               x+AB-CD
A $ B x C - D + E / F / (G + H)    AB$CxD-EF/GH+/+       +-x$ABCD//EF+GH
5 - 10 / (1 x 2 $ 3)               5 10 1 2 3 $ x / -    - 5 / 10 x 1 $ 2 3

7. EVALUATION OF POSTFIX EXPRESSIONS

Observation
When scanning a postfix expression from left to right, we always read the operands before we
read the operator. This means that, when we reach an operator, its operands are readily available.
In fact, the last operands that we read will be the first ones to be used, hence
our idea to use a stack!

Idea
- we scan the expression from left to right
- if we get an operand,
  we push it on the stack
- if we get an operator,
  we pop the two operands from the stack,
  apply the operator and
  push the result on the stack

Example

item     6   2   3   +   -   3   8   2   /   +   *   2   $   3   +
step     1   2   3   4   5   6   7   8   9  10  11  12  13  14  15

Contents of the stack after each step (top of the stack at the top):

                                     2
                 3               8   8   4
             2   2   5       3   3   3   3   7       2       3
         6   6   6   6   1   1   1   1   1   1   7   7  49  49  52
step     1   2   3   4   5   6   7   8   9  10  11  12  13  14  15

The algorithm
Restriction: for the sake of simplicity, every operand in our expression is a single digit.

function evaluatePostfix(exp: string): real;


var operandStack: stack;
position: integer;
op1, op2: real;
item: char;
begin
initializeStack(operandStack);
for position := 1 to length(exp) do begin
item := exp[position];
if not(operator(item))
then push(operandStack, ord(item) - ord ('0'))
else begin
op2 := pop(operandStack);
op1 := pop(operandStack);
push(operandStack, calculate(item, op1, op2))
end;
end;
evaluatePostfix := pop(operandStack);
end;

function operator(item: char): boolean;
begin
  operator := item in ['+', '-', '*', '/', '$'];
end;

function calculate(operator: char; op1, op2: real): real;
begin
  case operator of
    '+': calculate := op1 + op2;
    '-': calculate := op1 - op2;
    '/': calculate := op1 / op2;
    '*': calculate := op1 * op2;
    '$': calculate := exp(op2 * ln(op1));   { op1 to the power op2; requires op1 > 0 }
  end;
end;
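
The same idea can be sketched in a few lines of Python (used here instead of Pascal only so that the example can be run as-is). This is a minimal sketch, not the notes' own code: a Python list plays the role of the stack, the names evaluate_postfix and OPERATIONS are illustrative, and spaces between items are allowed as an extra convenience.

```python
import operator

# Map each operator symbol used in these notes to a binary function;
# '$' denotes exponentiation.
OPERATIONS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
    "$": operator.pow,
}


def evaluate_postfix(expression):
    """Evaluate a postfix string whose operands are single digits."""
    operand_stack = []  # a Python list used as a stack
    for item in expression:
        if item == " ":
            continue  # spaces are only there for readability
        if item in OPERATIONS:
            op2 = operand_stack.pop()  # the last operand read is popped first
            op1 = operand_stack.pop()
            operand_stack.append(OPERATIONS[item](op1, op2))
        else:
            operand_stack.append(float(item))
    return operand_stack.pop()


# The expression traced in the example above.
print(evaluate_postfix("6 2 3 + - 3 8 2 / + * 2 $ 3 +"))
```

Note that op2 is popped before op1: for '-', '/' and '$' the order of the operands matters.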

An exercise
Write a function to evaluate a prefix expression.

8. CONVERSION OF INFIX TO POSTFIX

• Infix expressions without brackets

Examples
3*4+5 34*5+
3+4*5 345*+

3*5+6–7*2 35*6+72*-
3+5*6–7*2 356*+72*-

Observations
the operands appear in the same order in both expressions
the order of the operators depends on precedence rules
we can’t decide to insert an operator until we’ve seen the next operator

Idea
- we scan the infix expression from left to right
- if we get an operand, we place it in the postfix expression
- if we get an operator, we pop all operators with higher (or same) precedence from the
  stack and put them in the postfix string; then we push the operator on the stack
- at the end of the expression, we pop all remaining operators and put them in the postfix
  string

Example
Infix expression:

   3   +   5   *   6   -   7   *   2
   1   2   3   4   5   6   7   8   9

Operator stack after each step (top of the stack uppermost):

               *   *           *   *
       +   +   +   +   -   -   -   -
   1   2   3   4   5   6   7   8   9

Postfix expression:

   3   5   6   *   +   7   2   *   -
   1   2   3   4   5   6   7   8   9

The algorithm
Restriction: for the sake of simplicity, every operand in our expression is a single digit.
function convertInfixToPostfix(exp: string): string;
var operatorStack: stack;
result: string;
position: integer;
item, temp: char;
testPrcd: boolean;
begin
initializeStack(operatorStack);
result := '';
for position := 1 to length(exp) do begin
item := exp[position];
if not(operator(item))
then result := result + item
else begin
testPrcd := not(emptyStack(operatorStack));
while testPrcd do begin
temp := tos(operatorStack);
if prcd(temp, item)
then begin
result := result + pop(operatorStack);
testPrcd := not(emptyStack(operatorStack));
end
else testPrcd := false
end;
push(operatorStack, item);
end;
end;
while not(emptyStack(operatorStack)) do
result := result + pop(operatorStack);
convertInfixToPostfix := result;
end;

(* The function prcd(op1, op2) returns TRUE when the precedence of op1 *)
(* is higher than or equal to that of op2, FALSE otherwise             *)

function prcd(op1, op2: char): boolean;

function assignValue(op: char): integer;


begin
case op of
'+', '-': assignValue := 1;
'*', '/': assignValue := 2;
'$': assignValue := 3;
end;
end;

begin
prcd := assignValue(op1) >= assignValue(op2);
end;
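
For comparison, here is a minimal Python sketch of the conversion without brackets (Python is used only so that the example can be run as-is). The names infix_to_postfix and PRECEDENCE are illustrative; prcd mirrors the Pascal function above, and a Python list plays the role of the operator stack.

```python
# Precedence values as in assignValue above.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2, "$": 3}


def prcd(op1, op2):
    """True when op1 has precedence higher than or equal to that of op2."""
    return PRECEDENCE[op1] >= PRECEDENCE[op2]


def infix_to_postfix(expression):
    """Convert a bracket-free infix string with single-character operands."""
    operator_stack = []
    result = []
    for item in expression:
        if item == " ":
            continue  # spaces are only there for readability
        if item not in PRECEDENCE:
            result.append(item)  # operands go straight to the output
        else:
            # Pop every operator that must be applied before the new one.
            while operator_stack and prcd(operator_stack[-1], item):
                result.append(operator_stack.pop())
            operator_stack.append(item)
    while operator_stack:  # flush what is left at the end
        result.append(operator_stack.pop())
    return "".join(result)


print(infix_to_postfix("3 + 5 * 6 - 7 * 2"))
```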
• Infix expressions with brackets
When an opening bracket is reached, we start the evaluation of a fresh 'sub-expression'.
So, to accommodate brackets:

prcd('(', op) -> FALSE (for any op)
prcd(op, '(') -> FALSE (for any op <> ')')
prcd(op, ')') -> TRUE (for any op <> '(')

A closing bracket is never pushed on the stack.

When a closing bracket is compared with an opening bracket, the opening bracket is removed
from the stack but not copied to the postfix string. No further operators are then popped from
the stack.

Example

  (  (  1  -  (  2  +  3  )  )  *  4  )  $  (  5  +  6  )
  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19

Operator stack after each step (top of the stack uppermost):

                    +  +
              (  (  (  (
           -  -  -  -  -  -                       +  +
     (  (  (  (  (  (  (  (     *  *        (  (  (  (
  (  (  (  (  (  (  (  (  (  (  (  (     $  $  $  $  $  $
  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19

Postfix string: 1 2 3 + - 4 * 5 6 + $

Exercise

Adjust the algorithm to accommodate brackets.

9. THE RUN-TIME STACK.

The run-time stack is used to do the run-time organisation of Pascal programs.


The stack contains activation records which record the values of all the variables belonging to each
active part of a program.

program Example;
var x, y: integer;

procedure One(a, b: char);


begin ... end;

procedure Two(x, z: integer);


var e, y: char;
begin ... One; ... x := 1; z := 1; e := 'E'; y := 'Y'; ... end;

function Three(n, s: real; t: boolean): char;


begin ... One; Two; ... end;

begin
x := 5; y := 10;
writeln(Three(x, y, false));
end.

Activation Record:
contains declared items that are currently in use (~ scope)
contains a return address

Subprogram Call:
an activation record is pushed on the run-time stack
the return address is the address of the instruction that follows the call
parameters are 'initialised', local variables are not initialised

Subprogram Ends:
an activation record is popped from the run-time stack
parameters and local variables are freed
program execution resumes at the return address found in that record

• Recursive subprograms.

Each recursive call involves a new activation record to be pushed on the run-time stack!

Disadvantages are time and space requirements

But recursion is still a natural, simple way to solve many problems
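
To make the cost of recursion visible, the following Python sketch (Python is used only so that the example can be run as-is) computes a factorial twice: once recursively, and once with an explicit list that mimics the run-time stack. The record layout {"n": ...} and both function names are hypothetical; a real activation record also holds a return address and any local variables.

```python
def factorial_recursive(n):
    # Each call gets its own activation record holding its own n.
    if n <= 1:
        return 1
    return n * factorial_recursive(n - 1)


def factorial_with_explicit_stack(n):
    """Mimic the run-time stack with a list of simplified records."""
    run_time_stack = []
    max_depth = 0
    # "Calls": push one record per pending multiplication.
    while n > 1:
        run_time_stack.append({"n": n})
        max_depth = max(max_depth, len(run_time_stack))
        n -= 1
    result = 1
    # "Returns": pop the records in reverse order of the calls.
    while run_time_stack:
        record = run_time_stack.pop()
        result *= record["n"]
    return result, max_depth


print(factorial_recursive(5))
print(factorial_with_explicit_stack(5))
```

The second value returned by factorial_with_explicit_stack is the largest number of records that were on the stack at the same time, which grows with n.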

Part IX
QUEUES

10. DEFINITIONS.

A queue is a dynamic data structure consisting of an ordered collection of items into which new
items may be inserted at one end, called the REAR of the queue, and from which items may be
deleted at the other end, called the FRONT of the queue.

We use INSERT or ENQUEUE to add an element to the rear of the queue.


insert(q, x) -> insert the item x at the rear of the queue.

We use REMOVE (also called DELETE or DEQUEUE) to remove an element from the front of the queue.
x := remove(q) -> remove and return the element at the front of the queue.

11. AN EXAMPLE.

myQueue A B C D

front rear

INSERT(myQueue, ‘E’);

myQueue A B C D E

front rear

ch := REMOVE(myQueue); {ch <- ‘A’}

myQueue B C D E

front rear

ch := REMOVE(myQueue); {ch <- ‘B’}

myQueue C D E

front rear

INSERT(myQueue, ‘F’);
myQueue C D E F

front rear

12. SOME MORE TERMINOLOGY.

A queue is also referred to as a 'FIFO'-structure.

An 'empty queue' is a queue which contains no elements


• Calling remove(q) when q is empty creates an underflow.
• The function empty(q) returns true when q is empty, false otherwise.

13. THE ADT QUEUE.

• DEFINITION
type queue;
     elementType;

{ creates an empty queue q }
procedure createQueue(var q: queue);

{ destroys the queue q }
procedure destroyQueue(var q: queue);

{ inserts el at the rear end of q }
procedure insert(var q: queue; el: elementType);

{ removes and returns the front element of q }
function remove(var q: queue): elementType;

{ returns true if q contains no elements, false otherwise }
function empty(q: queue): boolean;

• IMPLEMENTATION USING ARRAYS
const maxQueue = 100;

type elementType = integer;


queue = record
items: array[1..maxQueue] OF elementType;
front, rear: 0..maxQueue;
end;

procedure createQueue(var q: queue);


begin
q.rear := 0;
q.front := 1;
end;

function empty(q: queue): boolean;


begin
empty := q.rear < q.front;
end;

procedure insert(var q: queue; el: elementType);


begin
if q.rear = maxQueue
then error('Queue overflow')
else begin
q.rear := q.rear + 1;
q.items[q.rear] := el;
end;
end;

function remove(var q: queue): elementType;


begin
if empty(q)
then error('Queue underflow')
else begin
remove := q.items[q.front];
q.front := q.front + 1;
end;
end;

• Problems with array implementation
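Space is used in an unacceptable way: cells in front of the queue are never reused.

             1   2   3   4   5  ...  97  98  99 100
myQueue                                   5   8

front <- 98   rear <- 99

insert(myQueue, 6)
insert(myQueue, 4)

The first insert fills cell 100. The second insert generates an overflow, although the queue
would then hold only 4 elements and cells 1 to 97 are free.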

• Solution 1: Shifting of elements


After a remove operation we can shift all elements one place to the left. This solves our problem,
but at a high cost in time: every remove then takes a number of steps proportional to the length
of the queue.

• Solution 2: Circular array representation


A circular array is a normal array which is used in a different way.
We consider the first element to immediately follow the last element, i.e.

             1   2   3   4   5  ...  97  98  99 100
myQueue                                   5   8   4

front <- 98   rear <- 100

insert(myQueue, 10)

             1   2   3   4   5  ...  97  98  99 100
myQueue     10                            5   8   4

front <- 98   rear <- 1

• IMPLEMENTATION WITH ‘CIRCULAR ARRAYS’

const maxQueue = 100;

type elementType = integer;


queue = record
items: array[1..maxQueue] OF elementType;
front, rear: 1..maxQueue;
end;

procedure createQueue(var q: queue);


begin
q.rear := maxQueue;
q.front := maxQueue;
end;

function empty(q: queue): boolean;


begin
empty := q.rear = q.front;
end;

function remove(var q: queue): elementType;
begin
if empty(q)
then error('Queue underflow')
else begin
q.front := q.front mod maxQueue + 1;
remove := q.items[q.front];
end;
end;

(* We sacrifice one element so that we can tell        *)
(* the difference between an empty and a full queue    *)
procedure insert(var q: queue; el: elementType);
begin
if q.rear mod maxQueue + 1 = q.front
then error('Queue overflow')
else begin
q.rear := q.rear mod maxQueue + 1;
q.items[q.rear] := el;
end;
end;

procedure destroyQueue(var q: queue);


begin
q.rear := maxQueue;
q.front := maxQueue;
end;
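
The circular-array queue can also be sketched in Python (used only so that the example can be run as-is). This is an illustrative sketch, not the notes' code: the class name CircularQueue is an assumption, indices run from 0 to max_queue - 1 instead of 1 to maxQueue, and errors are raised as Python exceptions. As in the Pascal version, front denotes the cell just before the first element, and one cell is sacrificed to tell a full queue from an empty one.

```python
class CircularQueue:
    """Queue in a fixed-size list; indices wrap around with the mod operator."""

    def __init__(self, max_queue=100):
        self.max_queue = max_queue
        self.items = [None] * max_queue
        # front == rear means "empty"; both start at the last cell.
        self.front = max_queue - 1
        self.rear = max_queue - 1

    def empty(self):
        return self.front == self.rear

    def insert(self, element):
        next_rear = (self.rear + 1) % self.max_queue
        if next_rear == self.front:
            raise OverflowError("Queue overflow")
        self.rear = next_rear
        self.items[self.rear] = element

    def remove(self):
        if self.empty():
            raise IndexError("Queue underflow")
        self.front = (self.front + 1) % self.max_queue
        return self.items[self.front]


q = CircularQueue(4)        # room for 3 elements, one cell is sacrificed
for value in (5, 8, 4):
    q.insert(value)
print(q.remove())           # the element that was inserted first
q.insert(10)                # reuses the cell that has just been freed
```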

• IMPLEMENTATION USING DYNAMIC LISTS

type elementType = integer;


queuePointer = ^queueNode;
queueNode = record
info: elementType;
next: queuePointer;
end;
queue = record
front, rear: queuePointer;
end;

procedure createQueue(var q: queue);


begin
q.rear := nil;
q.front := nil;
end;

function empty(q: queue): boolean;


begin
empty := q.front = nil;
end;

(* Cfr. Remove an element from the front of a list *)


function remove(var q: queue): elementType;
var temp: queuePointer;
begin
if empty(q)
then error('Queue underflow')
else with q do begin
temp := front;
remove := front^.info;
front := front^.next;
dispose(temp);
end;
end;

(* Cfr. Add an element to the tail of a list *)
procedure insert(var q: queue; el: elementType);
var temp: queuePointer;
begin
new(temp);
temp^.info := el;
temp^.next := nil;
with q do
if empty (q)
then begin
rear := temp;
front := temp;
end
else begin
rear^.next := temp;
rear := rear^.next;
end;
end;

procedure destroyQueue(var q: queue);


var temp: elementType;
begin
while not(empty(q)) do
temp := remove(q);
end;

14. APPLICATIONS USING QUEUES.

• Buffers
A queue is an ideal data structure to provide a buffer between (computer-) devices that work at
different speeds.

- Typing: characters are entered into a queue and are removed from the queue when they are
ready to be processed.

c:> TYPE myFile.txt

d i r

c:> dir

- Print SPOOLing
Print jobs are stored in a queue. Whilst the print spooler is taking care of the print jobs, you
continue to work on your machine.
In some environments, different users have different priorities.

15. PRIORITY QUEUES.

An ASCENDING PRIORITY QUEUE consists of an ordered collection of items into which items may be
inserted arbitrarily and from which only the smallest element can be removed.

A DESCENDING PRIORITY QUEUE is similar, except that only the biggest element can be removed.

Note that both structures require items to have a field on which they can be sorted

insert(dpq, 3);
insert(dpq, 7);
insert(dpq, 2);
write(remove(dpq)); -> 7
write(remove(dpq)); -> 3
insert(dpq, 9);
insert(dpq, 1);
write(remove(dpq)); -> 9

• Implementation of priority queues
Different possibilities...
One idea: keep elements in a sorted list.

Insert -> put element in its proper place in the list


Remove -> take the front element from the list

Operation                        List after the operation

insert(dpq, 3);                  3
insert(dpq, 7);                  7 3
insert(dpq, 2);                  7 3 2
write(remove(dpq)); -> 7         3 2
write(remove(dpq)); -> 3         2
insert(dpq, 9);                  9 2
insert(dpq, 1);                  9 2 1
write(remove(dpq)); -> 9         2 1

STATIC or DYNAMIC implementation
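
A minimal Python sketch of the sorted-list idea (Python is used only so that the example can be run as-is). The class name DescendingPriorityQueue is illustrative. One deliberate difference from the description above: the list is kept in ascending order and the biggest element is taken from the end, because removing from the end of a Python list is cheap; the behaviour seen by the caller is the same.

```python
import bisect


class DescendingPriorityQueue:
    """Priority queue kept as a sorted list; remove returns the biggest element."""

    def __init__(self):
        self._items = []  # ascending order, so the biggest element is last

    def insert(self, element):
        # Put the element in its proper place in the sorted list.
        bisect.insort(self._items, element)

    def remove(self):
        if not self._items:
            raise IndexError("Queue underflow")
        return self._items.pop()

    def empty(self):
        return not self._items


dpq = DescendingPriorityQueue()
for value in (3, 7, 2):
    dpq.insert(value)
print(dpq.remove(), dpq.remove())
```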

• Real World Simulations.
Creating ‘models’ on the computer
- to learn
- to save money
- for safety reasons

In a model each object and action of the real world has its counterpart in the computer program.

Event-driven simulation:
Actions, or events, occur over a period of time. The simulation proceeds as events are generated
and they have their effect on the simulated situation. The generated events are stored in a
queue structure, waiting to be processed.

Part X
BINARY TREES

16. INTRODUCTION.

• Definition
A BINARY TREE is a finite set of elements that is either empty or is partitioned into three disjoint
subsets. The first subset contains a single element called the root of the tree. The other two subsets
are themselves binary trees, called the left and the right sub-trees of the original tree. Each element
of a binary tree is called a node of the tree.

• Example

[Figure: example binary tree; the node labels that survive are R, E, S, D, G, U, O, Y and B]

• Some terminology
A node A is called the father of B when B is the root of A's left or right sub-tree. B is then
called the left son or the right son of A.
A leaf node is a node which has no sons.
A strictly binary tree is a tree in which each non-leaf node has nonempty left and right sub-trees.
The level of a node is 0 for the root of the tree and one more than the level of the father for any
other node in the tree. The depth of a binary tree is the maximum level of any leaf in the tree.
A complete binary tree of depth d is a strictly binary tree for which all leaf nodes are at level d.

17. BINARY SEARCH TREES.

In a binary search tree all the elements in the left sub-tree of a node n have values less than the
contents of n and all elements in the right sub-tree of n have values bigger than or equal to the
contents of n.

• Example.
Insert the following elements in an initially empty binary search tree.
70, 50, 30, 80, 60, 90, 55, 76, 20, 85

18. TREE TRAVERSALS.

• Pre-order traversal (depth-first order)

procedure preOrder(t: binTree);


begin
if not(empty(t))
then begin
writeln(t^.info);
preOrder(t^.left);
preOrder(t^.right);
end;
end;

• Post-order traversal

procedure postOrder(t: binTree);


begin
if not(empty(t))
then begin
postOrder(t^.left);
postOrder(t^.right);
writeln(t^.info);
end;
end;

• In-order traversal (symmetric order)

procedure inOrder(t: binTree);


begin
if not(empty(t))
then begin
inOrder(t^.left);
writeln(t^.info);
inOrder(t^.right);
end;
end;
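
The three traversals differ only in the moment at which the node itself is visited. The Python sketch below (Python is used only so that the example can be run as-is) builds the binary search tree from the numbers of section 17 and returns the visiting order as a list instead of printing it. The class Node and the helper add_to_bst are illustrative names, not part of the notes.

```python
class Node:
    def __init__(self, info):
        self.info = info
        self.left = None
        self.right = None


def add_to_bst(root, info):
    """Insert info below root and return the (possibly new) root."""
    if root is None:
        return Node(info)
    if info < root.info:
        root.left = add_to_bst(root.left, info)
    else:
        root.right = add_to_bst(root.right, info)
    return root


def pre_order(t):
    # node, then left sub-tree, then right sub-tree
    return [] if t is None else [t.info] + pre_order(t.left) + pre_order(t.right)


def in_order(t):
    # left sub-tree, then node, then right sub-tree
    return [] if t is None else in_order(t.left) + [t.info] + in_order(t.right)


def post_order(t):
    # left sub-tree, then right sub-tree, then node
    return [] if t is None else post_order(t.left) + post_order(t.right) + [t.info]


tree = None
for number in (70, 50, 30, 80, 60, 90, 55, 76, 20, 85):
    tree = add_to_bst(tree, number)

print(pre_order(tree))
print(in_order(tree))
print(post_order(tree))
```

An in-order traversal of a binary search tree visits the elements in ascending order.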

• Exercise
What output will be produced when the following binary search tree is traversed using pre-order,
post-order and in-order traversals?

19. SOME ALGORITHMS

• Building a binary search tree.

procedure addToBst(var t: tree; info: integer);


var head, aux: tree;
begin
if emptyTree(t)
then t := createNode(info)
else begin
aux := t;
head := t;
while not(emptyTree(head)) do begin
aux := head;
if info < getInfo(head)
then head := getLeft(head)
else head := getRight(head);
end;
if info < getInfo(aux)
then setLeft(aux, createNode(info))
else setRight(aux, createNode(info))
end;
end;

• Removing duplicates from a list of numbers

We want to develop an algorithm to remove duplicates from a list of numbers.


Idea: we store the numbers in a BST, but as we go down to determine their
position, we check for duplicates.

program removeDuplicates;
uses trees;

const sentinel = -99;

var t, aux, head: tree;
    number: integer;

begin
write('Enter number: ');
readln(number);
t := createNode(number);
while number <> sentinel do
begin
write('Enter number: ');
readln(number);
if number <> sentinel then begin
aux := t;
head := t;
while (number <> getInfo(aux)) and not(emptyTree(head)) do
begin
aux := head;
if number < getInfo(head)
then head := getLeft(head)
else head := getRight(head);
end;
if number = getInfo(aux)
then writeln(number, ' is a duplicate')
else if number < getInfo(aux)
then setLeft(aux, createNode(number))
else setRight(aux, createNode(number))
end;
end;
end.
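
The same idea in a compact Python sketch (Python is used only so that the example can be run as-is). It takes the numbers from a list instead of reading them up to a sentinel, and it collects the duplicates instead of printing them; the function name find_duplicates and the dictionary-based nodes are illustrative choices.

```python
def find_duplicates(numbers):
    """Return the values that were already in the BST when they arrived."""
    root = None  # each node is a dict with the keys "info", "left" and "right"
    duplicates = []
    for number in numbers:
        new_node = {"info": number, "left": None, "right": None}
        if root is None:
            root = new_node
            continue
        current = root
        while True:
            if number == current["info"]:
                duplicates.append(number)  # found on the way down
                break
            side = "left" if number < current["info"] else "right"
            if current[side] is None:
                current[side] = new_node   # free position reached: insert
                break
            current = current[side]
    return duplicates


print(find_duplicates([14, 15, 4, 9, 7, 18, 3, 5, 16, 4, 20, 17, 9, 14, 5]))
```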

20. THE ADT BINARY TREE

unit trees;

interface

type tree = ^node;


node = record
info: integer;
left, right: tree;
end;

function createNode(el: integer): tree;


function createNullTree: tree;
function getInfo(t: tree): integer;
function getLeft(t: tree): tree;
function getRight(t: tree): tree;
procedure setLeft(var t: tree; lst: tree);
procedure setRight(var t:tree; rst: tree);
function emptyTree(t: tree): boolean;

implementation

function createNode(el: integer): tree;


var temp: tree;
begin
new(temp);
with temp^ do begin
info := el;
left := nil;
right := nil;
end;
createNode := temp;
end;

function createNullTree: tree;


begin
createNullTree := nil;
end;

function getInfo(t: tree): integer;


begin
getInfo := t^.info;
end;

function getLeft(t: tree): tree;


begin
getLeft := t^.left;
end;

function getRight(t: tree): tree;


begin
getRight := t^.right;
end;

procedure setLeft(var t: tree; lst: tree);


begin
t^.left := lst;
end;
procedure setRight(var t:tree; rst: tree);
begin
t^.right := rst;
end;

function emptyTree(t: tree): boolean;


begin
emptyTree := t = nil;
end;

begin
end.

• Trees using ARRAY implementation.


const maxNodes = 500;
type index = 1..maxNodes;
node = record
info: integer;
left: 0..maxNodes;
right: 0..maxNodes;
end;
binTree = record
elements: array[index] of node;
root: 0..maxNodes;
end;
var bt: binTree;

• Example:

bt.root = 1

          1   2   3   4   5   6   7   8   9  10
info     25   6  15  50  40  30  14  20  30  75
left      3   0   2   0   6   0   0   0   0   0
right     5   0   8  10   4   0   0   0   0   0

How to find a free node? Use a special value for left or right!
Construct a free-list.

• Implicit ARRAY implementation / Sequential representation

const maxNodes = 500;


type index = 1..maxNodes;
bintree = array[index] of integer;
var bt: bintree;

• Example:

index   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16
info   25  15  40   6  20  30  50                              75

How do we know if a node is used? Special value or extra field!

Advantage: we can go to a node's father.
Disadvantage: use of space in the array.
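
In the sequential representation no left or right fields are stored: positions are computed. The sketch below assumes the usual convention (not spelled out in the notes) that the sons of the node in cell i are in cells 2i and 2i + 1, so that its father is in cell i div 2. It is written in Python only so that it can be run as-is; None marks an unused cell, and cell 0 is left unused so that the numbering starts at 1.

```python
def left(i):
    return 2 * i


def right(i):
    return 2 * i + 1


def father(i):
    return i // 2


# The tree of the example: 25 at the root, 75 as the right son of 50.
bt = [None] * 17
for position, info in {1: 25, 2: 15, 3: 40, 4: 6, 5: 20, 6: 30, 7: 50, 15: 75}.items():
    bt[position] = info


def in_order(bt, i=1):
    """In-order traversal that uses index arithmetic only."""
    if i >= len(bt) or bt[i] is None:
        return []
    return in_order(bt, left(i)) + [bt[i]] + in_order(bt, right(i))


print(in_order(bt))
print(bt[father(15)])  # the father of the node holding 75
```

Eight nodes occupy an array of sixteen cells here, which illustrates the disadvantage mentioned above.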

21. HETEROGENEOUS BINARY TREES.

A heterogeneous binary tree is a tree in which the information field of different nodes might have a
different type.

• Example: expression trees


leaf node: operand
non-leaf node: operator

3 + 4 * (6 - 7) / 5 + 1

evaluate(expTree):
case expTree^.infoKind of
  operand: evaluate := getInfo(expTree);
  operator: begin
    opnd1 := evaluate(getLeft(expTree));
    opnd2 := evaluate(getRight(expTree));
    oper := getInfo(expTree);
    evaluate := calculate(oper, opnd1, opnd2);
  end;
end;
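
A runnable Python sketch of the same evaluation (Python is used only so that the example can be run as-is). Here a leaf is simply a number and a non-leaf node is a tuple (operator, left, right); this encoding and the name OPERATIONS are assumptions made for the sketch, since the notes do not fix a node layout.

```python
import operator

OPERATIONS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def evaluate(exp_tree):
    """Evaluate an expression tree: leaves are operands, other nodes operators."""
    if not isinstance(exp_tree, tuple):
        return exp_tree                       # leaf node: operand
    oper, left, right = exp_tree              # non-leaf node: operator
    return OPERATIONS[oper](evaluate(left), evaluate(right))


# 3 + 4 * (6 - 7) / 5 + 1
tree = ("+", ("+", 3, ("/", ("*", 4, ("-", 6, 7)), 5)), 1)
print(evaluate(tree))
```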

22. THREADED BINARY TREES

Observations:
- lots of pointers in a tree have the nil value
- traversing a tree is a recursive process

Idea: use some of those nil pointers as guides for traversing the tree

Right In-threaded binary tree:


A node with an empty right subtree has its right field pointing to the in-order successor of that node.

• Example

[Figure: right in-threaded binary tree; the node labels that survive are b, c, d, e, f, g, h and i]

23. GENERAL TREES.

A tree is a finite nonempty set of elements in which one element is called the root and the remaining
elements are partitioned into m >= 0 disjoint subsets, each of which is itself a tree.
The degree of a node is defined as its number of sons.

Example:

Implementation:

using arrays:
const maxSons = 20;
type tree = ^node;
node = record
info: integer;
sons: array[1..maxSons] of tree;
end;

using pointers:
type tree = ^node;
node = record
info: integer;
sons: tree;
brother: tree;
end;

Part XI
More sorting
algorithms

24. MERGE SORT

Merging is the process of combining two or more sorted files into a third sorted file.
A sequence of x sorted numbers on a file is called a run of length x.

Example

File 1:[25 30 46] File 2:[10 18 43]

Merge: [10 18 25 30 43 46]

Observe: a file with n random numbers is a file containing n runs of length 1.

Idea: n runs of length 1 can be merged into n div 2 runs of length 2.
We can repeat this process until we end up with 1 file containing 1 run of length n.

[25] [75] [46] [10] [18] [43] [55] [27]

[25 75] [10 46] [18 43] [27 55]

[10 25 46 75] [18 27 43 55]

[10 18 25 27 43 46 55 75]

Remark: the use of many files can be reduced to the use of three files using merge and distribute.

File 1: [25 75 46 10 18 43 55 27]

Distribute runs of length 1


File 2: [25 46 18 55]
File 3: [75 10 43 27]

Merge into runs of length 2


File 1: [25 75 10 46 18 43 27 55]

Distribute runs of length 2


File 2: [25 75 18 43]
File 3: [10 46 27 55]

Merge into runs of length 4


File 1: [10 25 46 75 18 27 43 55]

Distribute runs of length 4


File 2: [10 25 46 75]
File 3: [18 27 43 55]

Merge into run of length 8 (= n)


File 1: [10 18 25 27 43 46 55 75]

Merge sort is an external sorting method of order O(n log2 n), but it can also be used on tables
and on any other sequential data structure, such as a linked list.
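
The run-based idea can be sketched on an in-memory list in Python (used only so that the example can be run as-is). The sketch keeps a list of runs instead of files and merges neighbouring runs until one run is left; the function names merge and merge_sort are illustrative.

```python
def merge(run_a, run_b):
    """Merge two sorted runs into one sorted run."""
    merged = []
    i = j = 0
    while i < len(run_a) and j < len(run_b):
        if run_a[i] <= run_b[j]:
            merged.append(run_a[i])
            i += 1
        else:
            merged.append(run_b[j])
            j += 1
    merged.extend(run_a[i:])  # at most one of the two runs has elements left
    merged.extend(run_b[j:])
    return merged


def merge_sort(numbers):
    runs = [[x] for x in numbers]           # n runs of length 1
    while len(runs) > 1:
        next_runs = []
        for k in range(0, len(runs), 2):
            if k + 1 < len(runs):
                next_runs.append(merge(runs[k], runs[k + 1]))
            else:
                next_runs.append(runs[k])   # odd run out, carried over as-is
        runs = next_runs
    return runs[0] if runs else []


print(merge([25, 30, 46], [10, 18, 43]))
print(merge_sort([25, 75, 46, 10, 18, 43, 55, 27]))
```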

25. RADIX SORT

Radix sort is another external sorting method based on the values of the actual digits in a number
(key).

Example:
123 45 670 320 523 36 13 605 102 425 671 11

Store into queues according to the LSD (least significant digit)


Queue0 -> 670 320
Queue1 -> 671 11
Queue2 -> 102
Queue3 -> 123 523 13
Queue4 ->
Queue5 -> 45 605 425
Queue6 -> 36
Queue7 ->
Queue8 ->
Queue9 ->

Empty queues:
670 320 671 11 102 123 523 13 45 605 425 36

Store into queues according to the 2nd digit
Queue0 -> 102 605
Queue1 -> 11 13
Queue2 -> 320 123 523 425
Queue3 -> 36
Queue4 -> 45
Queue5 ->
Queue6 ->
Queue7 -> 670 671
Queue8 ->
Queue9 ->

Empty queues:
102 605 11 13 320 123 523 425 36 45 670 671

Store into queues according to the MSD (most significant digit)


Queue0 -> 11 13 36 45
Queue1 -> 102 123
Queue2 ->
Queue3 -> 320
Queue4 -> 425
Queue5 -> 523
Queue6 -> 605 670 671
Queue7 ->
Queue8 ->
Queue9 ->

Empty queues:
11 13 36 45 102 123 320 425 523 605 670 671
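
The three passes above can be sketched in Python (used only so that the example can be run as-is). The ten queues are deque objects from the standard library; the function name radix_sort and the parameter digits (the number of passes, assumed to be known in advance) are illustrative.

```python
from collections import deque


def radix_sort(keys, digits=3):
    """Sort non-negative integers of at most `digits` decimal digits."""
    keys = list(keys)
    for position in range(digits):            # 0 = least significant digit
        queues = [deque() for _ in range(10)]
        for key in keys:
            digit = (key // 10 ** position) % 10
            queues[digit].append(key)         # store into the matching queue
        # Empty the queues in order, Queue0 first.
        keys = [key for queue in queues for key in queue]
    return keys


numbers = [123, 45, 670, 320, 523, 36, 13, 605, 102, 425, 671, 11]
print(radix_sort(numbers, digits=1))  # state after the first pass
print(radix_sort(numbers))
```

The method relies on the queues being first-in first-out: keys with the same digit keep the order they had after the previous pass.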

Part XII
Hashing

26. INTRODUCTION

Hashing is a way of organising information that tries to optimise the speed for searching.
Its aim is actually to come up with a searching algorithm of O(1).

• Example.
We want to store information on 200 parts, each part identified by a key which is a number between
0 and 999999.

• Solution 1: Using the key as an index in a table.

type stock = array[0..999999] OF part;

function search(s: stock; key: integer): integer;


begin
search := key;
end;

ideal! search involves 1 step only!


impossible! use of space unacceptable!

• Solution 2: Using a hash function


A hash function is a function which transforms a key into an index, called the hash of the key.

key -> [hash function] -> index (hash of key)

• Example
The hash function "MOD 1000" converts a key to an index which lies between 0 and 999.

hash(x) = x mod 1000


hash(825341) -> 341
hash(001234) -> 234

Knowing that the index will be a number between 0 and 999 we can declare our table as follows:

type stock = array[0..999] of part;

For storing and retrieving purposes we now use the hash function.

function search(s: stock; key: integer): integer;


begin
search := key mod 1000;
end;

procedure addPart(var s: stock; p: part);


begin
s[p.key mod 1000] := p;
end;

index   key

    1   563001
    3   110003
    4   010004
    5   569005
    7   100007
  ...
  996   563996
  997   103997

27. COLLISIONS (CLASHES).

A problem arises when two records hash to the same index. This is called a collision or a hash
clash.

hash(110003) = 3
hash(583003) = 3

• Solution 1: rehashing or closed hashing


We use a second hash function whenever a collision occurs.
This second hash function is called the rehash function.
Note that we keep on using the rehash function until no collision occurs.

Example:
rehash(x) = (x + 1) mod 1000     "linear rehashing"

Inserting the part with key 583003


hash(583003) -> 3 we detect collision, we use the rehash function
rehash(3) -> 4 we detect collision, we use the rehash function
rehash(4) -> 5 we detect collision, we use the rehash function
rehash(5) -> 6 we insert part 583003 in cell #6

Searching is no longer O(1)!
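
Linear rehashing can be sketched in Python (used only so that the example can be run as-is). The table stores bare keys, None marks a free cell, and the names hash_key, insert and search are illustrative. Deleting keys is left out on purpose, since it needs extra care with this technique.

```python
TABLE_SIZE = 1000


def hash_key(key):
    return key % TABLE_SIZE


def rehash(index):
    return (index + 1) % TABLE_SIZE   # linear rehashing


def insert(table, key):
    """Store key in the first free cell on its rehash sequence; return the index."""
    index = hash_key(key)
    probes = 1
    while table[index] is not None:   # collision: keep rehashing
        if probes == TABLE_SIZE:
            raise OverflowError("Table full")
        index = rehash(index)
        probes += 1
    table[index] = key
    return index


def search(table, key):
    """Return the index of key, or None when key is not in the table."""
    index = hash_key(key)
    for _ in range(TABLE_SIZE):
        if table[index] is None:
            return None               # a free cell ends the rehash sequence
        if table[index] == key:
            return index
        index = rehash(index)
    return None


table = [None] * TABLE_SIZE
for key in (563001, 110003, 10004, 569005, 100007):
    insert(table, key)
print(insert(table, 583003))          # cells 3, 4 and 5 are taken
```

The search has to follow the same sequence of cells as the insertion, which is why it is no longer a single step.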

• Solution 2: chaining or open hashing
We use a linked list to cater for collisions. Each cell in our table now contains a “list” of items, a
bucket, that hash to the cell with that index.

Example:
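
A minimal Python sketch of chaining (Python is used only so that the example can be run as-is). Each cell of the table holds a bucket, represented here by a Python list instead of a linked list; the function names and the part descriptions are made up for the illustration.

```python
TABLE_SIZE = 1000


def make_table():
    return [[] for _ in range(TABLE_SIZE)]   # one empty bucket per cell


def add_part(table, key, description):
    table[key % TABLE_SIZE].append((key, description))


def search(table, key):
    """Walk through the bucket of the key; return the description or None."""
    for stored_key, description in table[key % TABLE_SIZE]:
        if stored_key == key:
            return description
    return None


stock = make_table()
add_part(stock, 110003, "bolt")     # hypothetical descriptions
add_part(stock, 583003, "nut")      # same index 3, ends up in the same bucket
print(len(stock[3]), search(stock, 583003))
```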

• A final remark
The design of a good hash function is a complex task. It should minimise the number of collisions,
and, at the same time, use as little space as possible.

28. SOME IDEAS FOR HASH FUNCTIONS

midsquare
key * key and then take the middle few digits

folding
break number into parts and combine them (adding/or-ing)

division method
take the key mod the table size (as in the MOD 1000 example)
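
The three ideas can be sketched as follows in Python (used only so that the example can be run as-is). These are only one possible reading of each idea: the number of digits kept, the way the "middle" is chosen and the length of the parts are assumptions of this sketch.

```python
def midsquare(key, digits=3):
    """Square the key and keep `digits` digits from the middle of the result."""
    squared = str(key * key)
    start = max(0, (len(squared) - digits) // 2)
    return int(squared[start:start + digits])


def folding(key, part_length=3):
    """Break the key into parts of part_length digits and add the parts."""
    text = str(key)
    parts = [int(text[i:i + part_length]) for i in range(0, len(text), part_length)]
    return sum(parts) % 10 ** part_length    # keep the result within range


def division(key, table_size=1000):
    return key % table_size


print(midsquare(1234), folding(825341), division(825341))
```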

Part XIII
Graphs

29. DEFINITION.

A graph consists of a set of nodes (or vertices) and a set of arcs (or edges). Each arc in a graph is
specified by a pair of nodes.

G nodes {a, b, c, d, e}
edges { (a, b), (a, d), (a, e), (b, c),
(c, e), (d, c), (d, e), (e, e)}

If the pairs of nodes that make up the arcs are ordered pairs, the graph is said to be a directed
graph (or digraph).

G nodes {a, b, c, d, e}
edges { <a, b>, <a, d>, <a, e>, <b, c>,
<c, e>, <d, c>, <d, e>, <e, e> }

30. DIRECTED GRAPHS

A node n is incident to an arc x if n is one of the two nodes in the ordered pair of nodes that
comprise x. We also say that x is incident to n.

The degree of a node is the number of arcs incident to it.


The indegree of a node n is the number of arcs that have n as the head and the outdegree of a
node n is the number of arcs that have n as the tail.

A node n is adjacent to a node m if there is an arc from m to n. If n is adjacent to m, then n is called
the successor of m, and m the predecessor of n.

A graph in which a number is associated with every arc is called a weighted graph or a network.
The number associated with an arc is called its weight.

Graphs can be used to represent relations.


S = {3, 5, 6, 8, 10, 17}
R = {<3, 10>, <5, 6>, <5, 8>, <6, 17>,
<8, 17>, <10, 17> }

“x is related to y if x < y and y mod x is odd”
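
The set of arcs R follows mechanically from S and the rule. A short Python check (Python is used only so that the example can be run as-is; the function names are illustrative):

```python
S = [3, 5, 6, 8, 10, 17]


def related(x, y):
    """x is related to y if x < y and y mod x is odd."""
    return x < y and (y % x) % 2 == 1


# Every ordered pair <x, y> that satisfies the rule is an arc of the digraph.
R = [(x, y) for x in S for y in S if related(x, y)]
print(R)


def indegree(node):
    return sum(1 for _, head in R if head == node)


def outdegree(node):
    return sum(1 for tail, _ in R if tail == node)


print(indegree(17), outdegree(5))
```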

A path of length k from node a to node b is defined as a sequence of k+1 nodes n_1, n_2, ..., n_(k+1)
such that n_1 = a, n_(k+1) = b and adjacent(n_i, n_(i+1)) is true for all i between 1 and k.

A path from a node to itself is called a cycle. If a graph contains a cycle it is cyclic, otherwise it is
acyclic.
A directed acyclic graph is called a dag.

Example.
G nodes = {A, B, C, D, E, F}
arcs = { <A, B>, <A, C>, <B, A>, <C, C>, <D, A>,
<D, C>, <D, F>, <D, D>, <F, C>, <F, D> }

[Figure: drawing of the graph G]

31. IMPLEMENTATIONS OF GRAPHS.

• Adjacency matrix.

const maxNodes = 100;


type node = record
{ any fields... }
end;
graph = record
nodes: array[1..maxNodes] of node;
arcs: array[1..maxNodes, 1..maxNodes] of boolean;
end;
var g: graph;

1 2 3 4 5
1 F T F T F
2 F F F F F
3 F T T F T
4 F T F F F
5 F F T F F

Some graph operations:

JOIN(n1, n2) g.arcs[n1, n2] := true;

REMOVE(n1, n2) g.arcs[n1, n2] := false;

ADJACENT(n1, n2) adjacent := g.arcs[n1, n2];
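
The adjacency matrix and its operations can be sketched in Python (used only so that the example can be run as-is). The matrix is a list of lists of booleans; row and column 0 are left unused so that nodes are numbered from 1 as in the notes. The helper names make_graph, indegree and outdegree are additions of this sketch.

```python
def make_graph(max_nodes):
    """Return an arcs matrix without arcs; row and column 0 stay unused."""
    return [[False] * (max_nodes + 1) for _ in range(max_nodes + 1)]


def join(arcs, n1, n2):
    arcs[n1][n2] = True


def remove(arcs, n1, n2):
    arcs[n1][n2] = False


def adjacent(arcs, n1, n2):
    return arcs[n1][n2]


def outdegree(arcs, n):
    return sum(arcs[n])                  # number of T values in row n


def indegree(arcs, n):
    return sum(row[n] for row in arcs)   # number of T values in column n


# The matrix shown above, entered arc by arc.
g = make_graph(5)
for tail, head in [(1, 2), (1, 4), (3, 2), (3, 3), (3, 5), (4, 2), (5, 3)]:
    join(g, tail, head)

print(adjacent(g, 1, 4), adjacent(g, 4, 1), outdegree(g, 3), indegree(g, 2))
```

All three operations take constant time, at the price of maxNodes x maxNodes cells, however few arcs the graph has.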

Declarations for a weighted graph:

arc = record
adj: boolean;
weight: integer;
end;
adjMatrix = array[1..maxNodes, 1..maxNodes] of arc;

• Mixed representation, adjacency lists.

1 -> 2 -> 4
2
3 -> 2 -> 3 -> 5
4 -> 2
5 -> 3

CONST maxNodes = 100;


TYPE arcList = ^arcNode;
arcNode = record
adj: 1..maxNodes;
next: arcList;
end;
graph = array[1..maxNodes] of arcList;
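
In Python (used only so that the example can be run as-is) the mixed representation can be approximated by a dictionary that maps every node to the list of its successors. Python lists stand in for the linked arcList, and the function name to_adjacency_lists is illustrative.

```python
def to_adjacency_lists(arc_pairs, node_count):
    """Build one list of successors per node, for nodes 1..node_count."""
    graph = {node: [] for node in range(1, node_count + 1)}
    for tail, head in arc_pairs:
        graph[tail].append(head)
    return graph


arcs = [(1, 2), (1, 4), (3, 2), (3, 3), (3, 5), (4, 2), (5, 3)]
graph = to_adjacency_lists(arcs, 5)
print(graph)
```

Only the arcs that exist take up space, but testing whether two nodes are adjacent now means walking through a list.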

• Linked representation.

type graph = ^graphNode;
     arcList = ^arcNode;
     arcNode = record
       adj: graph;
       next: arcList;
     end;
     graphNode = record
       info: integer;
       arcs: arcList;
       nextNode: graph;
     end;

32. PERT GRAPHS.

Project Evaluation & Review Technique

A PERT graph is ‘a weighted directed acyclic graph in which there is exactly one source and one sink.
The weights represent a time value.’

arc -> activity, time


node -> event, state

source-node: start of the project


sink node: end of the project

• Example.

What time is needed to finish the project?


How can we save time or manpower?

Know how to find


E.T. (Earliest Times)
L.T. (Latest Times)
Critical Path
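
As an illustration of these three notions, here is a Python sketch (Python is used only so that the example can be run as-is). The earliest time of an event is the length of the longest path from the source to it; the latest time is the latest moment at which it may occur without delaying the sink; the critical path runs through the events where both are equal. The small graph and the name pert_times are hypothetical, they do not come from the notes.

```python
def pert_times(arcs, sink):
    """arcs: list of (tail, head, duration) describing a weighted dag."""
    nodes = {n for tail, head, _ in arcs for n in (tail, head)}

    # Topological order: repeatedly take a node whose predecessors are all done.
    indegree = {n: 0 for n in nodes}
    for _, head, _ in arcs:
        indegree[head] += 1
    ready = [n for n in nodes if indegree[n] == 0]
    order = []
    while ready:
        node = ready.pop()
        order.append(node)
        for tail, head, _ in arcs:
            if tail == node:
                indegree[head] -= 1
                if indegree[head] == 0:
                    ready.append(head)

    # Earliest times: forward pass, longest path from the source.
    earliest = {n: 0 for n in nodes}
    for node in order:
        for tail, head, duration in arcs:
            if tail == node:
                earliest[head] = max(earliest[head], earliest[node] + duration)

    # Latest times: backward pass, starting from the sink.
    latest = {n: earliest[sink] for n in nodes}
    for node in reversed(order):
        for tail, head, duration in arcs:
            if tail == node:
                latest[node] = min(latest[node], latest[head] - duration)

    critical = [n for n in order if earliest[n] == latest[n]]
    return earliest, latest, critical


# Hypothetical project: two parallel branches between start and end.
project = [("start", "a", 3), ("start", "b", 2), ("a", "end", 4), ("b", "end", 1)]
earliest, latest, critical = pert_times(project, "end")
print(earliest["end"], critical)
```

In this made-up project event b may be reached as late as time 6 without delaying the end, so time or manpower can only be saved on the branch through a.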

Part XIV
File Design

33. INTRODUCTION

MAIN MEMORY <-> EXTERNAL MEMORY

bytes blocks

The unit of transfer between main storage and secondary storage is a block.

34. SEQUENTIAL FILES.

Records are stored one after another.


Access can be serial or sequential.

Disadvantage: slow
Advantage: can be implemented on any medium

Revision: Pascal text files.

35. INDEXED SEQUENTIAL FILES.

Records are stored sequentially, so access can be sequential.


A table, called an index, is kept on the external medium, so that records can also be accessed
directly through the use of the index.

Indexed sequential files cannot be organised on a tape, because a tape does not allow direct access.

• Prime area
In this area the records are stored in a sequential way.

• Index area.
This area holds the index, which is usually split in different levels.

A disk address -> cylinder, surface, sector

Example:
The first level index contains the highest key found on every cylinder. There is one first level index.

Key Cylinder-Index
150 1
300 2
570 3
. .
. .
. .
40000 249
50000 250

The second level index contains the highest key found on every surface. There is one second level
index for every cylinder.

Key Surface-Index
20 1
50 2
75 3
. .
. .
. .
140 9
150 10

The third level index contains the highest key found on every sector. There is one third level index
for every cylinder/surface.

Key Sector-Index
1 1
5 2
7 3
. .
. .
. .
12 9
20 10

The index is kept on disk! When working with the file, relevant parts of the index are read in main
memory.

• Overflow area
Rebuilding the index when records are added and deleted is a time-consuming task.
As tracks fill up with records, new records can instead be stored on surfaces which are dedicated
to these overflow records.
Two types of overflow areas are used: dedicated and independent.

When overflow areas become full, access time becomes slower and the need for re-organising the
file arises.

36. RANDOM OR DIRECT FILES.

Random files give the fastest (direct) access time to records but have the big disadvantage that
records cannot be accessed sequentially.
The techniques for these files are the hashing techniques in which record keys are translated into
disk addresses.
The blocks (sectors) act as buckets.

CASE STUDY: Stock-keeping, a dynamic vs. static implementation

• 1 • INFORMATION.

What needs to be stored?


How will we store it?

Item description (string[255])


Item code (integer)
Price (real)
Nr in stock (integer)

TYPE Article = RECORD


description: string[255];
code: integer;
price: real;
nr_in_stock: integer;
END;

For our collection of articles:

STATIC IMPLEMENTATION (contiguous list)

stock = RECORD
items: ARRAY [1..max] OF article;
nrOfItems: 0..max;
END;

DYNAMIC IMPLEMENTATION (linked list)

stock = ^node;
node = RECORD
item: article;
next: stock;
END;

• 2 • ALGORITHMS.

Adding articles
Deleting articles *
Changing price of an article *
Changing description of article *
Update the stock *
Printing out the stock in an ordered way

* involves a search!
  - we prefer a binary search
  - only possible with tables
  - only possible when data is sorted
  -> let us keep our stock sorted at all times

A. Initializing

STATIC                                   DYNAMIC

ourStock.nrOfItems := 0;                 ourStock := nil;

NO SIGNIFICANT DIFFERENCE

B. Adding articles

STATIC                                   DYNAMIC

if s.nrOfItems = max                     Sequential Search
then Error('Overflow')                   Insert a node
else
  Sequential Search (since it is
  not an exact search)
  Shift elements to the right
  Put info in cell

Drawbacks of STATIC: OVERFLOW, SHIFTING

C. Deleting articles

STATIC                                   DYNAMIC

Binary Search                            Sequential Search
Shift elements to the left               Delete a node

Drawback of STATIC: SHIFTING             Drawback of DYNAMIC: SEARCH

D. Changing price/description

STATIC                                   DYNAMIC

Binary Search                            Sequential Search
Make change                              Make change

                                         Drawback of DYNAMIC: SEARCH

E. Update stock

STATIC                                   DYNAMIC

Binary Search                            Sequential Search
Make change                              Make change

                                         Drawback of DYNAMIC: SEARCH

F. Print ordered list

STATIC                                   DYNAMIC

Sequential walk/visit                    Sequential walk/visit

NO SIGNIFICANT DIFFERENCE

A FINAL COMPARISON:

Based on algorithms
Static implementation will generally be faster

Based on storage
“POINTER OVERHEAD”
for each item there is a pointer which takes up memory space!
Example: • assume 1 pointer takes up 4 bytes.
• assume 1 article takes up 300 bytes.
• assume there is static space reserved for 10000
articles of which 9500 are actually used.

STATIC                                   DYNAMIC

3000000 bytes taken from memory          9500 x 304 = 2888000 bytes used
2850000 bytes really used                9500 x 4 = 38000 bytes overhead
150000 bytes wasted

• QUICKSORT. (* not in syllabus, addendum *)

The idea:
To sort an array A containing N elements...

Choose an element A[index] and re-arrange the array so that

   only elements less than or equal to A[index] are found to the left of A[index]
   only elements greater than or equal to A[index] are found to the right of A[index]

Note that A[index] will remain in this position when the whole array is sorted.

If we repeat this process for the elements in the left and right subarrays, then
we end up with a completely sorted array.

An example:

   index     1   2   3   4   5   6   7   8
   myTable  30  60  50  40  20  80  75  35

Let’s take the first element (30) and re-arrange myTable:

   index     1   2   3   4   5   6   7   8
   myTable  20  30  50  40  60  80  75  35

Left part is OK (1 element).
Right part: take 50 and re-arrange:

   index     1   2   3   4   5   6   7   8
   myTable  20  30  35  40  50  80  75  60

Re-arrange the 2 parts (35 40 and 80 75 60)...

   index     1   2   3   4   5   6   7   8
   myTable  20  30  35  40  50  60  75  80

The parts that remain are re-arranged in the same way; they are already
in order, so myTable does not change any more.

The function Re_Arrange:

Example: re-arrange the part 3..8 of the table T.

   index   3   4   5   6   7   8
   T      50  40  60  80  75  35
         low                 high

   low = 3
   high = 8
   pivot = 3   <- the element around which we re-arrange

1. increase low until you find an element > T[pivot]

   index   3   4   5   6   7   8
   T      50  40  60  80  75  35
                 low         high

   low = 5
   high = 8

2. decrease high until you find an element <= T[pivot]

   index   3   4   5   6   7   8
   T      50  40  60  80  75  35
                 low         high

   low = 5
   high = 8   (35 <= 50, so high does not move)

3. if low < high then swap T[low] and T[high]

   index   3   4   5   6   7   8
   T      50  40  35  80  75  60
                 low         high

   low = 5
   high = 8

-> repeat this process until low >= high

   index   3   4   5   6   7   8
   T      50  40  35  80  75  60
                     low     high

   low = 6
   high = 8

   index   3   4   5   6   7   8
   T      50  40  35  80  75  60
                high low

   low = 6
   high = 5

-> when low >= high then swap T[pivot] with T[high]

   index   3   4   5   6   7   8
   T      35  40  50  80  75  60

The element 50 is now in its final position (index 5 = high).

Round up:

Declarations:

CONST max = 10;


TYPE table = array[1..max] of integer;

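The code of Re_Arrange (with the helper procedure swap):

procedure swap(var x, y: integer);
var temp: integer;
begin
   temp := x; x := y; y := temp;
end;

(* This function puts the element a[pivot] in its right position *)
(* in the array a and returns the index of that position.        *)
(* It works as written only when pivot = left, which is how      *)
(* quickSort calls it.                                           *)
function Re_Arrange(var a: table; left, right, pivot: integer): integer;
var low, high: integer;
begin
   low := left;
   high := right;
   while low < high do begin
      (* skip the elements that may stay on the left *)
      while (a[low] <= a[pivot]) and (low < right) do
         low := low + 1;
      (* skip the elements that may stay on the right *)
      while a[high] > a[pivot] do
         high := high - 1;
      if low < high
         then swap(a[low], a[high])
         else swap(a[pivot], a[high]);
   end;
   Re_Arrange := high;
end;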

The procedure quicksort:

procedure quickSort(var a: table; left, right: integer);
var position: integer;
begin
   if left < right
   then begin
      (* the first element of the part is used as pivot *)
      position := Re_Arrange(a, left, right, left);
      quickSort(a, left, position - 1);
      quickSort(a, position + 1, right);
   end;
end;

Using quicksort:

quickSort(myTable, 1, max);

Time analysis:

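QuickSort       O(N x log2 N) on average, O(N^2) in the worst case
                (for example an already sorted table, when the first
                element is always taken as pivot)

BubbleSort      O(N^2)
SelectionSort   O(N^2)
InsertionSort   O(N^2)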
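
To make the quadratic growth concrete, the comparisons of one of the simple sorts can be counted. The sketch below uses a plain selection sort written for this illustration (it is not the course's own version); the name selectionSortCount is hypothetical.

```pascal
const max = 10;

type table = array[1..max] of integer;

{ Sorts a[1..n] by selection sort and returns the number of comparisons. }
function selectionSortCount(var a: table; n: integer): integer;
var i, j, smallest, temp, count: integer;
begin
  count := 0;
  for i := 1 to n - 1 do begin
    smallest := i;
    for j := i + 1 to n do begin
      count := count + 1;                { one comparison of two elements }
      if a[j] < a[smallest] then smallest := j;
    end;
    temp := a[i]; a[i] := a[smallest]; a[smallest] := temp;
  end;
  selectionSortCount := count;
end;
```

The count is always N x (N - 1) / 2, whatever the initial order: 28 comparisons for the 8 elements of the example, and 499500 for N = 1000. For N = 1000, N x log2 N is roughly 10000, which gives an idea of why quicksort is usually much faster on large tables.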
