But why must programs be efficient when new computers are faster
every year?
The quest for program efficiency need not and should not conflict with
sound design and clear coding.
A programmer who has not mastered the basic principles of clear design is
not likely to write efficient programs
For this reason, the study of data structures and the algorithms that
manipulate them is at the heart of computer science.
This is done by describing, for each data structure, the amount of space and
time required for typical operations.
Only through such measurement can you determine which data structure in
your toolkit is most appropriate for a new problem.
A data structure is a way of organizing input data and operations which can
be performed on this data (e.g., add, delete, search).
Goals:
Using the proper data structure can make the difference between a
program running in a few seconds and one requiring many days.
A solution is said to be efficient if it solves the problem within the
required resource constraints.
When selecting a data structure to solve a problem, you should first consider the basic operations that the solution must support.
Examples of basic operations include inserting a data item into the data
structure, deleting a data item from the data structure, and finding a
specified data item.
In practice, it is hardly ever true that one data structure is better than
another for use in all situations.
Example 1
A bank must support many types of transactions with its customers, but we
will examine a simple model where customers wish to open accounts, close
accounts, and add money or withdraw money from accounts.
Two aspects of this problem must be distinguished:
(1) the requirements for the physical infrastructure and workflow process that the bank uses in its interactions with its customers, and
(2) the requirements for the database system that manages the accounts.
The typical customer opens and closes accounts far less often than he or
she accesses the account.
Customers are willing to wait many minutes while accounts are created
or deleted but are typically not willing to wait more than a brief time for
individual account transactions such as a deposit or withdrawal. These
observations can be considered as informal specifications for the time
constraints on the problem.
Teller and ATM transactions are expected to take little time. Opening or
closing an account can take much longer (perhaps up to an hour from the
customer’s perspective).
When considering the choice of data structure to use in the database system
that manages customer accounts, we see that a data structure that has
little concern for the cost of deletion, but is highly efficient for search
and moderately efficient for insertion, should meet the resource
constraints imposed by this problem.
One data structure that meets these requirements is the hash table.
Hash tables allow for extremely fast exact-match search. A record can
be modified quickly when the modification does not affect its space
requirements.
The integers also form a type. An integer is a simple type because its
values contain no subparts.
These implementation details are hidden from the user of the ADT and
protected from outside access, a concept referred to as encapsulation.
The term “data structure” often refers to data stored in a computer’s main
memory.
The related term file structure often refers to the organization of data on
peripheral storage, such as a disk drive or CD-ROM.
Examples of ADTs – linked lists, queues, stacks, trees, heaps,
graphs.
Data Organization
• linear – list, array, vector, linked list, sequences, queue, stack, deque
Algorithm
2. Display counter
3. Increment counter by 1
With the use of an algorithm, the same specified steps are used
for performing the tasks. This makes the process more
consistent and reliable.
The way in which the various data elements are organized in memory with
respect to each other is called a data structure.
Data can be organized in many different ways; therefore, you can create as
many data structures as you want. However, there are some standard data
structures that have proved useful over the years. These include arrays,
linked lists, stacks, queues, trees and graphs.
Suppose you have to write an algorithm that enables a printer to service the requests of multiple users on a first-come-first-served (FCFS) basis. In this case, using a data structure that stores and retrieves the requests in the order of their arrival would be much more efficient than a data structure that stores and retrieves the requests in a random order.
Consider an example where you have to find the maximum value in a set of 50 numbers. In this scenario, you can either use 50 variables or a data
structure, such as an array of size 50, to store the numbers. When 50
different variables are used to store the numbers, the algorithm to
determine the maximum value among the numbers can be written as:
Display max
On the other hand, when an array of size 50 is used, the algorithm can be
written as:-
Display max
From the preceding two algorithms, it can be seen that the algorithm using
an array manipulates memory much more efficiently than the algorithm
using 50 variables.
Also, the algorithm using an array involves fewer steps and is therefore easier to understand and implement than the algorithm that uses 50 variables.
Moreover, there may exist more than one algorithm to solve a problem.
Writing an effective algorithm for a new problem, or a better algorithm for an already existing one, is an art as well as a science because it requires both creativity and insight.
Classification of Algorithms
1. divide and conquer
2. dynamic programming
3. greedy
4. brute force
5. backtracking
6. randomized
EXAMPLE
Consider an example where you have to find the minimum value in a list of numbers. The list is shown in the following figure:
To find the minimum value, you can divide the list into two halves, as shown
in the following figure:-
Again, divide each of the two lists into two halves as shown in the following
figure:-
Now, there are only two elements in each list. At this stage, compare the two elements in each list to find the minimum of the two.
The minimum values from each of the four lists are shown in the following figure:
3 2 1 8
Minimum values in the four lists
Again, compare the first two minimum values to determine their minimum.
Also compare the last two minimum values to determine their minimum. The
two minimum values thus obtained are shown in the following figure:-
2 1
Minimum values in the two halves of the original list
Again, compare the two final minimum values to obtain the overall minimum
value, which is 1 in the preceding example.
Greedy Approach
The greedy approach is an algorithm design technique that selects the best
possible option at any given time. Algorithms based on the greedy
approach are used for solving optimization problems, where you need to
maximize profits or minimize costs under a given set of conditions. Some
examples of optimization problems are:-
For example, suppose a bag can carry a maximum weight of 10 kg, and the following items are available:
Item Weight (in kg) Value per kg (in $) Total value (in $)
A 2 200 400
B 3 150 450
C 4 200 800
D 1 50 50
E 5 100 500
A greedy algorithm acts greedily, and therefore selects the item with the maximum total value at each stage. Therefore, first of all, item C with total
value of $800 and weight 4 kg will be selected. Next, item E with total value
$500 and weight 5 kg will be selected. The next item with the highest value
is item B with a total value of $450 and weight 3 kg. However, if this item is
selected, the total weight of the selected items will be 12 kg (4 + 5 + 3),
which is more than the capacity of the bag.
Therefore, we discard item B and search for the item with the next highest value. The item with the next highest value is item A, having a total value of
$400 and a total weight of 2 kg. However, the item also cannot be selected
because if it is selected, the total weight of the selected items will be 11 kg (
4 + 5 + 2). Now, there is only one item left, that is, item D with a total
value of $50 and a weight of 1 kg. This item can be selected as it makes the
total weight equal to 10 kg.
The selected items and their total weights are listed in the following table.
Item Weight (in kg) Total value (in $)
C 4 800
E 5 500
D 1 50
Total 10 1350
Items selected using Greedy Approach
For most problems, greedy algorithms usually fail to find the globally optimal solution, because they usually do not operate exhaustively on all the data. They can make commitments to certain choices too early, which prevents them from finding the best overall solution later.
This can be seen from the preceding example, where the greedy algorithm selects items with a total value of only $1350. However, if the items were selected in the sequence depicted by the following table, the total value would have been much greater, with the weight still being only 10 kg.
Item Weight (in kg) Total value (in $)
C 4 800
B 3 450
A 2 400
D 1 50
Total 10 1700
In the preceding example, you can observe that the greedy approach commits to item E very early, which rules out the better selection.
Dynamic Programming
Dynamic programming breaks a problem into smaller subproblems, solves each subproblem, and builds the overall solution from the bottom up.
The dynamic programming algorithm ends up with a better run time than the brute force algorithm.
Backtracking algorithm
Backtracking algorithm views the problem to be solved as a sequence of
decisions and systematically considers all possible outcomes for each
decision to solve the overall problem.
For example, it finds a solution to the first subproblem and then attempts to recursively solve the other subproblems based on that first solution; if an attempt fails, the algorithm backtracks and tries the next possibility.
Randomized approach
Randomized approach: Any algorithm that makes some random (or
pseudo-random) choices.
EXAMPLE
Compute the sum of 1+2+3+...+n for any integer n > 0.
Algorithm 1:
Time/Space Tradeoff
Time/space tradeoff refers to a situation where you can reduce the use of
memory at the cost of slower program execution, or reduce the running
time at the cost of increased memory usage.
Set I = 0 // 1 assignment
Increment I by 1 // n increments
The execution time required for the preceding algorithm is given by:
T = a + b×n + c×n + d×n
T = a + n(b + c + d)
The basic idea behind recursion is to break a problem into smaller versions
of itself, and then build up a solution for the entire problem. This may sound
similar to the divide and conquer technique. However, recursion is not the same as divide and conquer: divide and conquer is a theoretical concept that may be implemented in a computer program with the help of recursion.
f(n) = f(n – 1) + n
In this case, the recursive definition of the function f(n) calls the same function, but with its argument reduced by one. The recursion will end when n = 1, for which f(1) = 1 has been defined.
To understand this concept, consider the factorial function, defined as:
n! = 1 × 2 × 3 × 4 × ... × n
Equivalently, it can be defined recursively (with 0! = 1) as:
n! = (n − 1)! × n
For example, 3! is evaluated as follows:
3! = (3 x (2 x 1!))
3! = (3 x (2 x (1 x 0!)))
3! = (3 x (2 x (1 x 1)))
3! = (3 x (2 x 1 ))
3! = (3 x 2)
3! = 6
Algorithm: Factorial(n)
If n = 0 or n = 1:
Return (1)
Else:
Return (n × Factorial(n − 1))
Tower of Hanoi
The objective of the game is to move all disks from the first pin to the third
pin in the least number of moves by using the second pin as an
intermediary.
When n = 2, we should move the top disc from pin 1 to pin 2, move the top disc from pin 1 to pin 3, and then move the top disc from pin 2 to pin 3.
The solution for n = 1 will be to move the disc from pin 1 to pin 3.
The following algorithm can be used to move the top n discs from the first pin START to the final pin FINISH through the temporary pin TEMP:
Move (n, START, TEMP, FINISH):
If n = 1: move the single disc from START to FINISH and Return
Move (n − 1, START, FINISH, TEMP)
Move the single disc from START to FINISH
Move (n − 1, TEMP, START, FINISH)
Return
Efficiency of An Algorithm
In each case there is an ordering to the elements (in this case, the ordering
is based on time of arrival).
Even the characters which form the sentences on this web page are ordered
where the author specifies the order (hopefully in an attempt to convey an
idea to the intended audience).
• Unordered relationships,
• Well-ordered relationships,
• Partially-ordered relationships.
Another example is the units converter in Maple: associated with each unit, such as the metre (m), the gram (g), the pound (lb), and the galileo (Gal), is certain information, such as its dimension and its relationship to other base units.
A lot of data is well-ordered, that is, given any two elements, it is possible to
determine if one precedes the other.
The actual rule that determines this well-ordering may be an implicit rule based on the characteristics of the data, for example, numerical order for numbers or alphabetical order for strings.
In this case, we see that Karen < Roger and Roger < Susan, and
therefore we may also infer that Karen < Susan. This is called the
transitive property of a partial ordering. In this example, Karen is not
preceded by any other element in this tree, and is therefore called a
minimal element.
Note, however, that not all elements are comparable: we cannot compare
Karen and Julie, as neither Karen < Julie nor Julie < Karen is true.
Adjacency Relationships
The layout of the intersections of streets in a city, or of circuit elements on a chip, may be described by their locations; however, what is much more useful is knowing which nodes are adjacent to which others.
Static: These are data structures whose size is fixed at compile time, and
does not grow or shrink at runtime. An example of a static structure is an
array. Suppose you declare an array of size 50, but store only 5 elements in
it; the memory space allocated for the remaining 45 elements will be
wasted. Similarly, if you have declared an array of size 50 but later want to store 20 more elements, you will not be able to store the extra elements because the size of the array is fixed.
Dynamic: These are data structures whose size is not fixed at compile time
and that can grow and shrink at run time to make efficient use of memory.
An example of a dynamic data structure would be a list of items for which
memory is not allocated in advance. As and when items are added to the
lists, memory is allocated for those elements. Similarly, when items are
removed from the list, memory allocated to those elements is de-allocated.
Such a list is called a linked list.
The position of an element in the array is called its index. In C++, array indices always begin at 0:
0 1 2 3 4 5 (indices)
12 -3 24 65 92 11 (array values)
Indeed, an array with one or two indices is often called a vector or matrix
structure, respectively.
Arrays are among the oldest and most important data structures; they are used by almost every program and serve to implement many other data structures, such as lists and strings.
The terms array and array structure are often used to mean array data
type, a kind of data type provided by most high-level programming
languages that consists of a collection of values or variables that can be
selected by one or more indices computed at run-time.
The terms are also used, especially in the description of algorithms, to mean an associative array or "abstract array", a theoretical computer science model (an abstract data type or ADT) intended to capture the essential properties of arrays.
Applications
Arrays are used to implement mathematical vectors and matrices, as
well as other kinds of rectangular tables. Many databases, small and
large, consist of (or include) one-dimensional arrays whose elements are
records.
Arrays are used to implement other data structures, such as heaps, hash
tables, deques, queues, stacks, strings, and VLists.
“Ordered” in this definition means that each element has a position in the
list.
In the simple list implementations, all elements of the list have the
same data type, although there is no conceptual objection to lists
whose elements have differing data types if the application requires it.
The operations defined as part of the list ADT do not depend on the
elemental data type.
For example, the list ADT can be used for lists of integers, lists
The beginning of the list is called the head, the end of the list is
called the tail. There might or might not be some relationship between
the value of an element and its position in the list.
Positions in the list are commonly numbered 0 through n − 1.
Some languages may allow list types to be indexed or sliced like array
types. In object-oriented programming languages, lists are usually provided
as instances of subclasses of a generic "list" class.
List data types are often implemented using arrays or linked lists of some
sort, but other data structures may be more appropriate for some
applications. In some contexts, such as in Lisp programming, the term list
may refer specifically to a linked list rather than an array.
Operations
An implementation of the list data structure may provide some of the following operations: create an empty list, test whether the list is empty, prepend an item, append an item, access or modify the item at a given index, and remove an item from the list.
Characteristics
• Lists have the following properties:
• The size of the list indicates how many elements there are in the list.
• Equality of lists: two lists are equal if they have the same size and their corresponding elements are equal.
• Lists may be typed. This implies that the entries in a list must have
types that are compatible with the list's type. It is common that lists
are typed when they are implemented using arrays.
• Each element in the list has an index. The first element commonly has
index 0 or 1 (or some other predefined integer). Subsequent elements
have indices that are 1 higher than the previous element. The last
element has index <initial index> + <size> − 1.
Implementations
• Lists are typically implemented either as linked lists (either singly or
doubly linked) or as arrays, usually variable-length or dynamic
arrays.
• A linked list is a data structure that consists of a sequence of data
records such that in each record there is a field that contains a
reference (i.e., a link) to the next record in the sequence.
• A linked list whose nodes contain two fields: an integer value and a
link to the next node.
• Linked lists are among the simplest and most common data
structures; they provide an easy implementation for several important
abstract data structures, including stacks, queues, associative
arrays, and symbolic expressions.
The field of each node that contains the address of the next node is usually
called the next link or next pointer. The remaining fields are known as the
data, information, value, cargo, or payload fields.
The head of a list is its first node, and the tail is the list minus that node
(or a pointer thereto).
Singly-linked lists contain nodes which have a data field as well as a next
field, which points to the next node in the linked list
A singly-linked list whose nodes contain two fields: an integer value and a
link to the next node
A doubly-linked list whose nodes contain three fields: an integer value, the
link forward to the next node, and the link backward to the previous node
In a multiply-linked list, each node contains two or more link fields, each
field being used to connect the same set of data records in a different order
(e.g., by name, by department, by date of birth, etc.). (While doubly-linked
lists can be seen as special cases of multiply-linked lists, the fact that the
two orders are opposite to each other leads to simpler and more efficient
algorithms, so they are usually treated as a separate case.)
Operations on Sorted Lists
Suppose that we have a sorted list of size N in which we are storing n < N sorted objects. There are three different operations we will focus on:
• The operation affects the front of the sorted list (e.g., deleting the smallest element),
• The operation affects the back of the sorted list (e.g., inserting an element larger than any element currently in the sorted list), and
• The operation affects an arbitrary position in the sorted list (e.g., inserting or deleting an element in the middle).
The middle element (at location 7) is 21, and because 17 < 21, we exclude
the right half, as is shown in Figure below.
Next, we find the middle element of the left half: 8. Because 17 > 8, we
search the right half of the new list, as is shown in Figure below
2 5 7 8 12 17 19 21 25 26 28 31 33 34 39
If we denote the runtime of a binary search by T(n), then the run time for
searching a list of size n is a constant number of operations to find and test
the middle element and then the time it takes to search either the left or the
right halves (but only one).
That is, T(n) = T(n/2) + Θ(1). In the special case where n = 1, we know
that the run time is O(1): all we must do is check the one element. To
determine the run time, we assume n = 2^k, and therefore k = log2(n):
T(n) = T(2^k)
= T(2^(k−1)) + 1
= T(2^(k−2)) + 2
= ...
= T(2^(k−k)) + k
= T(1) + k
= 1 + k
= O(log2(n))
Thus, it follows that the time it takes to find an arbitrary object is O(log n).
Stacks
A stack is a linear list in which all additions and deletions are restricted to one end, called the top.
If you insert objects into a stack and then remove them, the order of the objects is reversed: numbers inserted as 1, 2, 3, 4, 5 are removed as 5, 4, 3, 2, 1.
This reversing attribute is the reason why stacks are known as a LIFO (last-in, first-out) data structure.
Thus a stack is appropriate if we expect to access only the top object; all other objects are inaccessible.
The three natural operations on a stack are push (insert), pop (delete),
and top (view).
The method pop removes an object from the top of the stack.
The method top reads the value on the top of the stack (it does not remove
it).
If useful, we can combine top and pop and get the method topAndPop which
returns and removes the most recent object from the stack.
Applications
There are numerous applications of stacks:
Most applications allow the user to undo the previous action performed.
This is usually implemented by recording what changes must be made to
revert the current state to the previous state. These are packaged
together and stored as a single object on the stack, and if undo is selected,
then the top set of instructions is popped from the stack, the reversions are
implemented, and the application is now in the previous state.
Each time a new link or page is visited, the previous link is pushed onto the
back stack while the forward stack is emptied. If the back button is selected,
the current URL is pushed onto the forward stack and the last URL is popped
from the back stack. If the forward button is selected, the current URL is
pushed onto the back stack, and the next URL is popped from the forward
stack.
In a browser, the back and forward arrows are disabled if the corresponding
stacks are empty. Visiting a new page or selecting an active forward button
will always make the back stack non-empty, while selecting an active back
button will always make the forward stack non-empty.
Matching Parentheses
In most programming languages today, the grammar for the language
requires that delimiters be matched in order, including parentheses (),
brackets [], angled brackets <>, and braces {}.
For example, the delimiters (abc), [abc], <abc>, and {abc} are vacuously matched, while the parentheses in (abc[def]ghi) are matched because the brackets contained within them are also matched.
A stack can also be used to reverse a sequence of items. Of course, this is extremely suboptimal if you are using an array to store the data you are trying to reverse (just swap the appropriate entries), but it may be useful if the number of items to be reversed is not known in advance.
Reverse-Polish Notation (RPN) Calculator
Implementations of RPN are stack-based; that is, operands are popped from
a stack, and calculation results are pushed back onto it.
Although this concept may seem obscure at first, RPN has the advantage of being extremely easy, and therefore fast, for a computer to analyze, because it forms a regular grammar.
Example
• 1 2 + 4 * 3 +
Token Stack Action
1 1 Push operand
2 1, 2 Push operand
+ 3 Addition
4 3, 4 Push operand
* 12 Multiplication
3 12, 3 Push operand
+ 15 Addition
This method of defining operators does not require parentheses: there is no
possibility for ambiguity as there may be with
5 + 3 * 4 - 7, which may be interpreted in as many as 3! = 6 different
ways. The associations and their postfix forms are given below.
((5 + 3) * 4) - 7 = 25    5 3 + 4 * 7 -
(5 + 3) * (4 - 7) = -24   5 3 + 4 7 - *
(5 + (3 * 4)) - 7 = 10    3 4 * 5 + 7 -
5 + ((3 * 4) - 7) = 10    3 4 * 7 - 5 +
(5 + 3) * (4 - 7) = -24   4 7 - 5 3 + *
5 + (3 * (4 - 7)) = -4    4 7 - 3 * 5 +
Queues
The concept of a queue is quite familiar to humanity: entering a lineup in
anticipation of receiving a service. We can generalize this concept as
follows:
Recursion
Recursive functions
Many mathematical functions can be defined recursively:
• factorial
• Fibonacci
• Fourier Transform
Many problems can be solved recursively, e.g., games of all types, from simple ones like the Towers of Hanoi problem to complex ones like chess. In
games, the recursive solutions are particularly convenient because, having
solved the problem by a series of recursive calls, you want to find out how
you got to the solution. By keeping track of the move chosen at any point,
the program call stack does this housekeeping for you! This is explained in
more detail later.
Example: Factorial
One of the simplest examples of a recursive definition is that for the factorial function: n! = 1 if n ≤ 1, and n × (n − 1)! otherwise.
Note how this definition refers to itself to evaluate the next term.
Eventually it will reach the termination condition and exit. However, before
it reaches the termination condition, it will have pushed n stack frames onto
the program's run-time stack.
Data structures also may be recursively defined. One of the most important
class of structure - trees - allows recursive definitions which lead to simple
(and efficient) recursive functions for manipulating them.
Searching
Computer systems are often used to store large amounts of data from which
individual records must be retrieved according to some search criterion.
Thus the efficient storage of data to facilitate fast searching is an important
issue. In this section, we shall investigate the performance of some
searching algorithms and the data structures which they use.
Sequential Searches
Let's examine how long it will take to find an item matching a key in the
collections we have discussed so far. We're interested in:
The best case - in which the first comparison returns a match - requires a
single comparison and is O(1).
The average time depends on the probability that the key will be found in the collection - this is something that we would not expect to know in the majority of cases.
The space used is O(n) and each operation of the Queue ADT
takes O(1) time
Applications of queues
There are numerous applications of queues. The most common is where
there are one or more servers satisfying requests from any number of
clients. Examples of this are:
A bank with one or more tellers and zero or more clients waiting to be served.
INTRODUCTION TO TREES
In computer science, a tree is an abstract model of a hierarchical structure
Nodes which are the same distance from the root (have the same number of
edges lying between them and the root) are at the same level.
The height of the tree is the maximum distance between the root and any
node.
(Figure: an organization chart, with the SECRETARIAT at the root, SCHOOLS at the next level, and DEPARTMENTS below them.)
Applications:
• Organization chart
• File systems
• Programming environments
The terminology for a tree is based on one of three possible analogies:
• Physical trees,
• Family trees (genealogy), and
• Graph theory.
For each node in a linked list, if we consider the next node to be its
successor, then we may redefine a non-empty linked list as follows:
1. There is a designated first node, called the head,
2. Each node has either zero or one successor (a node which follows it), and
3. For each node, except for the head, there is a node which has that node as its successor.
In a finite linked list, there is exactly one node which has no successor, and
that node is called the tail node of the linked list. An example of such a
linked list is shown in Figure.
As suggested by the shape (standing it on end) we will call the first element
the root and the entire structure a tree.
Each node has zero or more successors (nodes which follow it), and
For each node, except for the root, there is a node which has that
node as its successor.
The terms root and leaf come from botany; however, the standard representation of a tree is to draw it with the root node at the top, as shown in the figure.
A path is a sequence of nodes (N0, N1, ..., Nn) such that Nk−1 is a parent
of Nk for each k = 1, ..., n. If there exists a path from node X to node Y,
then X is said to be an ancestor of Y, and Y is said to be a descendant of X.
From this comes the peculiar definition that each node is both an ancestor
and a descendant of itself. To avoid this, we further say that if there exists
a non-trivial path (i.e., a path of length n ≥ 1) from X to Y, then X is said
to be a proper ancestor of Y, while Y is said to be a proper
descendant of X.
Depth of a node in a tree is the length of the path from the root to the
node. Height of a node is the length of the longest path from a given node
to the deepest leaf. The height of a tree is equal to the height of the root.
A general rooted tree is a set of nodes that has a designated node called
the root, from which zero or more subtrees descend.
Every node (except the root) is connected by an edge from exactly one
other node. There is a unique path from the root to each node. An empty
tree has no nodes.
C is the sibling of B.
D, E, F, G, I are external nodes, or leaves.
Examples
The height of the tree (the maximum depth of any node within the tree) is
5.
The degree of the root node is 2 and its children are B and C.
For the node D, the degree is 3 and its children are E, F and G.
Tree Traversals
It is straight-forward to visit all the elements in either a linked list or an
array: start at the front and step through the elements one at a time.
What to do for trees is less clear: how do you visit all of the nodes within a
tree in an ordered fashion?
We would like to visit each of the nodes within the tree. A scheme for
visiting all of the nodes within a tree is termed a traversal of the tree. The
term visit is used to describe a point where some form of operation or
function is applied to a node.
The simplest way to traverse the nodes of this tree would be to visit all
nodes at level 0 (the root node) and then all nodes at level 1 (B-E).
Such a traversal is shown in Figure , where the nodes are visited in the
order A-B-C-D-E. Because all the nodes at each level are visited before the
level is incremented, such a traversal is termed a breadth-first traversal.
By definition, a tree does not have any loops, and therefore, we can follow
a path along the outside of all of the nodes. For convenience, we will begin
and end at the root.
Rather than visiting all the nodes at each level, such a walk around the
nodes of a tree immediately goes as deep as possible into the tree, and
therefore such traversals are collectively termed depth-first traversals.
The figure above emphasizes that each node is approached at least twice: a
first time and a last time. These two approaches are shown below.
If we restrict ourselves to actually visiting a node only on the first approach,
the traversal is termed a pre-order depth-first traversal, while if we visit
the node only on the last approach, the traversal is termed a post-order
depth-first traversal.
The first thing to notice is that with a pre-order traversal, the parent
node is visited before any child nodes, while with a post-order traversal,
the parent node is only visited after all the child nodes have been visited.
This is critical in applications: in some cases, it is necessary to know
information about the children before the parent can be processed. In other
cases, the children must have information about the parent before they can
be processed.
Figure shows a depth-first traversal of a more complex tree.
The order in which the nodes are visited using a pre-order traversal is:
A B F G C D H E
The order in which the nodes are visited using a post-order traversal is:
F G B C H D E A
Example 1
Pre-order traversal: 35 25 12 7 16 28 26 33 74 63 42 68 87 79 94
Post-order traversal: 7 16 12 26 33 28 25 42 68 63 79 94 87 74 35
Breadth-first traversal: 35 25 74 12 28 63 87 7 16 26 33 42 68 79 94
Example 2:
Pre-order traversal: 1 2 6 7 8 D E 9 3 A B C D 4 5
Post-order traversal: 6 7 D E 8 9 2 A B D C 3 4 5 1
Breadth-first traversal: 1 2 3 4 5 6 7 8 9 A B C D E F
The call for v costs $(cv + 1), where cv is the number of children of v.
• For the call for v, charge one cyber-dollar to v and charge one cyber-dollar to each child of v.
• Each node is therefore charged twice: once for its own call and once for its parent's call.
Types of trees
Free Trees
A free tree has no root.
This is the most general kind of tree, and it may be converted into the more
familiar rooted form by designating a node as the root.
Note the recursive nature of this definition; it is one of the salient points of
the tree structure.
Note also that it is possible to have an unordered rooted tree, e.g., a tree
representing the possible moves in a game of chess.
Position Trees
Two two-node position trees, on the other hand, may be quite different.
They both have roots, but one may have a child in position #1, and the
other a child in position #2.
Binary Trees
Algorithms for general trees tend to be complex, as each node may have a
different number of children.
Additionally, the children must be stored in some form of list, either an array
or a linked list. For a given node N, accessing and modifying these
structures is O(deg(N)) where deg(N) is the degree of the node (the
number of children).
A binary tree is a tree in which each node has exactly two subtrees: the
left subtree and right subtree, either or both of which may be empty.
We will identify the two children as the left sub-tree and the right
sub-tree of a given node. This reflects the common presentation of such
trees, as shown in Figure, with each node having one sub-tree to the left
and the other sub-tree to the right. Figure explicitly shows empty sub-trees
using a ∅. For the purposes of order, we will consider the left sub-tree to
precede the right sub-tree.
A binary tree
Applications
Expression Trees
As each operator has two operands, we can store any operator as a binary
tree node with the two operands forming the two sub-trees.
For example, the above expression may be represented by the binary tree in
Figure below:
One observation about Figure above is that this is an ordered tree: for
non-commutative operators such as subtraction and division, swapping the two
sub-trees would change the value of the expression.
Another observation is that if all of the nodes are numeric values, we may
use a post-order depth-first traversal of this tree together with a stack to
evaluate the tree:
When a node is visited during the post-order traversal, one of two things
occurs:
• If the node is storing a number, we push the number onto the stack,
and
• If the node is storing an operator, we pop the top two elements off
the stack, apply the operation, and push the result onto the stack.
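These two rules can be sketched as follows. The Node class and the example expression (2 + 3) * 4 are illustrative assumptions, not the expression from the figure:

```python
import operator

class Node:
    """A binary expression-tree node (hypothetical minimal class)."""
    def __init__(self, value, left=None, right=None):
        self.value = value      # a number, or an operator symbol
        self.left = left
        self.right = right

OPS = {'+': operator.add, '-': operator.sub,
       '*': operator.mul, '/': operator.truediv}

def evaluate(root):
    """Evaluate the tree with a post-order traversal and an explicit stack."""
    stack = []
    def visit(node):
        if node is None:
            return
        visit(node.left)                     # children first (post-order)
        visit(node.right)
        if node.value in OPS:                # operator: pop two, apply, push
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[node.value](left, right))
        else:                                # operand: push the number
            stack.append(node.value)
    visit(root)
    return stack.pop()

# (2 + 3) * 4, written as a tree
tree = Node('*', Node('+', Node(2), Node(3)), Node(4))
print(evaluate(tree))  # 20
```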
Algorithm
(a) If the token is an operand, create a one-node tree for it and push the
node on the stack.
(b) If the token is an operator, pop two values from the stack and make the
first one the right child and the second one the left child of the
operator’s node. Push this node on the stack.
(c) Repeat the two previous steps until no token is left in the postfix form
of the algebraic expression.
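The construction algorithm above can be sketched as follows; the Node class and the token list are illustrative assumptions:

```python
class Node:
    """A binary expression-tree node (hypothetical minimal class)."""
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

OPERATORS = {'+', '-', '*', '/'}

def build_expression_tree(postfix_tokens):
    """Build an expression tree from a list of postfix tokens."""
    stack = []
    for token in postfix_tokens:
        if token in OPERATORS:
            right = stack.pop()   # the first value popped is the right child
            left = stack.pop()    # the second value popped is the left child
            stack.append(Node(token, left, right))
        else:
            stack.append(Node(token))   # operand: a one-node tree
    return stack.pop()                  # root of the finished tree

# "2 3 + 4 *" is postfix for (2 + 3) * 4
root = build_expression_tree(['2', '3', '+', '4', '*'])
print(root.value, root.left.value, root.right.value)  # * + 4
```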
For example, consider the post-order depth-first traversal shown in Figure.
The order in which the nodes are visited is indicated by the red line.
Figures below show how the stack changes with the visit of each node,
ultimately resulting in a single value on the stack: the result of the
evaluation.
Visiting the node 3
A perfect binary tree is a binary tree where all the leaves are at the same
depth, as is demonstrated in Figure
Perfect trees of height 0, 1, 2, 3, and 4.
An empty binary tree is vacuously perfect, as all nodes (there are none) are
at the same depth. A binary tree with a single node is also perfect, as there
is only one node.
• A binary tree of height h ≥ 1 is perfect if both the left and right
sub-trees are perfect binary trees of height h − 1.
A perfect binary tree of height 4 seen as a node with two perfect binary
trees of height 3.
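The recursive definition can be checked directly. A minimal sketch, assuming a bare Node class with only left and right links and the convention that the empty tree has height −1:

```python
class Node:
    """A bare binary-tree node (hypothetical minimal class)."""
    def __init__(self, left=None, right=None):
        self.left, self.right = left, right

def height(node):
    if node is None:
        return -1                   # convention: the empty tree has height -1
    return 1 + max(height(node.left), height(node.right))

def is_perfect(node):
    if node is None:
        return True                 # vacuously perfect
    return (height(node.left) == height(node.right)
            and is_perfect(node.left)
            and is_perfect(node.right))

print(is_perfect(None))                                # True (empty tree)
print(is_perfect(Node(Node(), Node())))                # True (height 1)
print(is_perfect(Node(Node(Node(), Node()), Node())))  # False (uneven heights)
```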
Complete Binary Trees
A complete binary tree is a binary tree where the nodes are filled in
breadth-first traversal order. Consequently, all leaves in a complete binary
tree of height h must be at a depth of either h or h − 1. Unlike perfect
trees, complete binary trees exist for any number of nodes. The complete
binary trees for 1 through 10 nodes are shown in Figure.
An empty binary tree is vacuously complete, as all nodes (there are none)
follow the breadth-first order. A binary tree with a single node is also
complete, as there is only one node.
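The breadth-first characterization suggests a direct test: walk the tree level by level and reject any node that appears after an empty slot. A minimal sketch with an assumed bare Node class:

```python
from collections import deque

class Node:
    """A bare binary-tree node (hypothetical minimal class)."""
    def __init__(self, left=None, right=None):
        self.left, self.right = left, right

def is_complete(root):
    """True if the nodes fill the tree in breadth-first order with no gaps."""
    queue = deque([root])
    seen_empty = False
    while queue:
        node = queue.popleft()
        if node is None:
            seen_empty = True       # a gap in the breadth-first order
        else:
            if seen_empty:
                return False        # a node after a gap: not complete
            queue.append(node.left)
            queue.append(node.right)
    return True

complete = Node(Node(Node(), None), Node())   # nodes fill left to right
print(is_complete(complete))                  # True
skewed = Node(None, Node())                   # a right child with no left sibling
print(is_complete(skewed))                    # False
```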