15.5 Optimal binary search trees

• We are designing a program to translate text, say from English to French
• Perform lookup operations by building a BST with words as keys and their equivalents as satellite data
• We can ensure an $O(\lg n)$ search time per occurrence by using a red-black tree or any other balanced BST
• A frequently used word may appear far from the root
while a rarely used word appears near the root
• We want frequent words to be placed nearer the root
• How do we organize a BST so as to minimize the
number of nodes visited in all searches, given that we
know how often each word occurs?

• What we need is an optimal binary search tree
• Formally, given a sequence $K = \langle k_1, k_2, \ldots, k_n \rangle$ of $n$ distinct sorted keys ($k_1 < k_2 < \cdots < k_n$), we wish to build a BST from these keys
• For each key $k_i$, we have a probability $p_i$ that a search will be for $k_i$
• Some searches may be for values not in $K$, so we also have $n + 1$ "dummy keys" $d_0, d_1, \ldots, d_n$ representing values not in $K$
• In particular, $d_0$ represents all values less than $k_1$, and $d_n$ represents all values greater than $k_n$

• For $i = 1, 2, \ldots, n - 1$, the dummy key $d_i$ represents all values between $k_i$ and $k_{i+1}$
• For each dummy key $d_i$, we have a probability $q_i$ that a search will correspond to $d_i$


• Each key $k_i$ is an internal node, and each dummy key $d_i$ is a leaf
• Every search is either successful (finds a key $k_i$) or unsuccessful (finds a dummy key $d_i$), and so we have

$$\sum_{i=1}^{n} p_i + \sum_{i=0}^{n} q_i = 1$$

• Because we have probabilities of searches for each key and each dummy key, we can determine the expected cost of a search in a given BST $T$

• Let us assume that the actual cost of a search equals the number of nodes examined, i.e., the depth of the node found by the search in $T$, plus 1
• Then the expected cost of a search in $T$ is

$$\mathrm{E}[\text{search cost in } T] = \sum_{i=1}^{n} (\text{depth}_T(k_i) + 1) \cdot p_i + \sum_{i=0}^{n} (\text{depth}_T(d_i) + 1) \cdot q_i$$
$$= 1 + \sum_{i=1}^{n} \text{depth}_T(k_i) \cdot p_i + \sum_{i=0}^{n} \text{depth}_T(d_i) \cdot q_i,$$

where $\text{depth}_T$ denotes a node's depth in tree $T$

Node    Depth   Probability   Contribution
k_1     1       0.15          0.30
k_2     0       0.10          0.10
k_3     2       0.05          0.15
k_4     1       0.10          0.20
k_5     2       0.20          0.60
d_0     2       0.05          0.15
d_1     2       0.10          0.30
d_2     3       0.05          0.20
d_3     3       0.05          0.20
d_4     3       0.05          0.20
d_5     3       0.10          0.40
Total                         2.80
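
As a quick check, the following Python sketch (an addition, not part of the original slides) recomputes the total above; each node contributes $(\text{depth} + 1)$ times its probability:

# Depths and probabilities from the table above
key_depths   = [1, 0, 2, 1, 2]                        # k_1 .. k_5
p            = [0.15, 0.10, 0.05, 0.10, 0.20]         # p_1 .. p_5
dummy_depths = [2, 2, 3, 3, 3, 3]                     # d_0 .. d_5
q            = [0.05, 0.10, 0.05, 0.05, 0.05, 0.10]   # q_0 .. q_5

cost = (sum((d + 1) * pi for d, pi in zip(key_depths, p))
        + sum((d + 1) * qi for d, qi in zip(dummy_depths, q)))
print(round(cost, 2))  # 2.8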


• For a given set of probabilities, we wish to construct a BST whose expected search cost is smallest
• We call such a tree an optimal binary search tree
• An optimal BST for the probabilities given above has expected cost 2.75
• An optimal BST is not necessarily a tree whose overall height is smallest
• Nor can we necessarily construct an optimal BST by always putting the key with the greatest probability at the root
• The lowest expected cost of any BST with $k_5$, the key with the greatest probability, at the root is 2.85


Step 1: The structure of an optimal BST
• Consider any subtree of a BST
• It must contain keys in a contiguous range $k_i, \ldots, k_j$, for some $1 \le i \le j \le n$
• In addition, a subtree that contains keys $k_i, \ldots, k_j$ must also have as its leaves the dummy keys $d_{i-1}, \ldots, d_j$
• If an optimal BST $T$ has a subtree $T'$ containing keys $k_i, \ldots, k_j$, then this subtree $T'$ must be optimal as well for the subproblem with keys $k_i, \ldots, k_j$ and dummy keys $d_{i-1}, \ldots, d_j$

• Given keys $k_i, \ldots, k_j$, one of them, say $k_r$ ($i \le r \le j$), is the root of an optimal subtree containing these keys
• The left subtree of the root $k_r$ contains the keys $k_i, \ldots, k_{r-1}$ (and dummy keys $d_{i-1}, \ldots, d_{r-1}$)
• The right subtree contains the keys $k_{r+1}, \ldots, k_j$ (and dummy keys $d_r, \ldots, d_j$)
• As long as we
  – examine all candidate roots $k_r$, where $i \le r \le j$,
  – and determine all optimal BSTs containing $k_i, \ldots, k_{r-1}$ and those containing $k_{r+1}, \ldots, k_j$,
  we are guaranteed to find an optimal BST


• Suppose that in a subtree with keys $k_i, \ldots, k_j$, we select $k_i$ as the root
• $k_i$'s left subtree contains the keys $k_i, \ldots, k_{i-1}$
• Interpret this sequence as containing no keys
• Subtrees, however, also contain dummy keys
• Adopt the convention that a subtree containing keys $k_i, \ldots, k_{i-1}$ has no actual keys but does contain the single dummy key $d_{i-1}$
• Symmetrically, if we select $k_j$ as the root, then $k_j$'s right subtree contains no actual keys, but it does contain the dummy key $d_j$

Step 2: A recursive solution
• We pick our subproblem domain as finding an optimal BST containing the keys $k_i, \ldots, k_j$, where $i \ge 1$, $j \le n$, and $j \ge i - 1$
• Let us define $e[i, j]$ as the expected cost of searching an optimal BST containing the keys $k_i, \ldots, k_j$
• Ultimately, we wish to compute $e[1, n]$
• The easy case occurs when $j = i - 1$: then we have just the dummy key $d_{i-1}$, and the expected search cost is $e[i, i-1] = q_{i-1}$

• When $j \ge i$, we need to select a root $k_r$ from among $k_i, \ldots, k_j$ and make an optimal BST with keys $k_i, \ldots, k_{r-1}$ as its left subtree and an optimal BST with keys $k_{r+1}, \ldots, k_j$ as its right subtree
• What happens to the expected search cost of a subtree when it becomes a subtree of a node?
  – The depth of each node increases by 1
  – The expected search cost of the subtree increases by the sum of all the probabilities in it
• For a subtree with keys $k_i, \ldots, k_j$, let us denote this sum of probabilities as

$$w(i, j) = \sum_{l=i}^{j} p_l + \sum_{l=i-1}^{j} q_l$$

• Thus, if $k_r$ is the root of an optimal subtree containing keys $k_i, \ldots, k_j$, we have

$$e[i, j] = p_r + (e[i, r-1] + w(i, r-1)) + (e[r+1, j] + w(r+1, j))$$

• Noting that

$$w(i, j) = w(i, r-1) + p_r + w(r+1, j),$$

we rewrite $e[i, j]$ as

$$e[i, j] = e[i, r-1] + e[r+1, j] + w(i, j)$$

• We choose the root that gives the lowest expected search cost:

$$e[i, j] = \begin{cases} q_{i-1} & \text{if } j = i - 1, \\ \min_{i \le r \le j} \{ e[i, r-1] + e[r+1, j] + w(i, j) \} & \text{if } i \le j \end{cases}$$


• The $e[i, j]$ values give the expected search costs in optimal BSTs
• To help us keep track of the structure of optimal BSTs, we define $\text{root}[i, j]$, for $1 \le i \le j \le n$, to be the index $r$ for which $k_r$ is the root of an optimal BST containing keys $k_i, \ldots, k_j$
• Although we will see how to compute the values of $\text{root}[i, j]$, we leave the construction of an optimal binary search tree from these values as an exercise

Step 3: Computing the expected search cost of an optimal BST
• We store the $e[i, j]$ values in a table $e[1..n+1, 0..n]$
• The first index needs to run to $n + 1$ because to have a subtree containing only the dummy key $d_n$, we need to compute and store $e[n+1, n]$
• The second index needs to start from 0 because to have a subtree containing only the dummy key $d_0$, we need to compute and store $e[1, 0]$

• We use only the entries $e[i, j]$ for which $j \ge i - 1$
• We also use a table $\text{root}[i, j]$ for recording the root of the subtree containing keys $k_i, \ldots, k_j$
• This table uses only the entries with $1 \le i \le j \le n$
• We also store the $w(i, j)$ values in a table $w[1..n+1, 0..n]$
• For the base case, we compute $w[i, i-1] = q_{i-1}$
• For $j \ge i$, we compute $w[i, j] = w[i, j-1] + p_j + q_j$
• Thus, we can compute the $\Theta(n^2)$ values of $w[i, j]$ in $\Theta(1)$ time each

OPTIMAL-BST(p, q, n)
1. let e[1..n+1, 0..n], w[1..n+1, 0..n], and root[1..n, 1..n] be new tables
2. for i = 1 to n + 1
3.     e[i, i-1] = q_{i-1}
4.     w[i, i-1] = q_{i-1}
5. for l = 1 to n
6.     for i = 1 to n - l + 1
7.         j = i + l - 1
8.         e[i, j] = ∞
9.         w[i, j] = w[i, j-1] + p_j + q_j
10.        for r = i to j
11.            t = e[i, r-1] + e[r+1, j] + w[i, j]
12.            if t < e[i, j]
13.                e[i, j] = t
14.                root[i, j] = r
15. return e and root
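
For concreteness, here is a minimal runnable Python transcription of OPTIMAL-BST (a sketch, not from the original slides; the tables are padded so that indices match the pseudocode, and the probabilities are those of the running example):

import math

def optimal_bst(p, q, n):
    # p[1..n] and q[0..n]; p[0] is an unused dummy so indices match the pseudocode
    e = [[0.0] * (n + 1) for _ in range(n + 2)]   # e[1..n+1, 0..n]
    w = [[0.0] * (n + 1) for _ in range(n + 2)]   # w[1..n+1, 0..n]
    root = [[0] * (n + 1) for _ in range(n + 1)]  # root[1..n, 1..n]
    for i in range(1, n + 2):
        e[i][i - 1] = q[i - 1]
        w[i][i - 1] = q[i - 1]
    for l in range(1, n + 1):                 # l = length of the key range
        for i in range(1, n - l + 2):
            j = i + l - 1
            e[i][j] = math.inf
            w[i][j] = w[i][j - 1] + p[j] + q[j]
            for r in range(i, j + 1):         # try each candidate root k_r
                t = e[i][r - 1] + e[r + 1][j] + w[i][j]
                if t < e[i][j]:
                    e[i][j] = t
                    root[i][j] = r
    return e, root

p = [0.0, 0.15, 0.10, 0.05, 0.10, 0.20]   # p_1 .. p_5
q = [0.05, 0.10, 0.05, 0.05, 0.05, 0.10]  # q_0 .. q_5
e, root = optimal_bst(p, q, 5)
print(round(e[1][5], 2))  # 2.75, the optimal expected cost quoted above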

• The OPTIMAL-BST procedure takes $\Theta(n^3)$ time, just like MATRIX-CHAIN-ORDER
• Its running time is $O(n^3)$, since its for loops are nested three deep and each loop index takes on at most $n$ values
• The loop indices in OPTIMAL-BST do not have exactly the same bounds as those in MATRIX-CHAIN-ORDER, but they are within 1 in all directions
• Thus, like MATRIX-CHAIN-ORDER, the OPTIMAL-BST procedure takes $\Omega(n^3)$ time

16 Greedy Algorithms
• Optimization algorithms typically go through a sequence of steps, with a set of choices at each step
• For many optimization problems, using dynamic programming to determine the best choices is overkill; simpler, more efficient algorithms will do
• A greedy algorithm always makes the choice that looks best at the moment
• That is, it makes a locally optimal choice in the hope that this choice will lead to a globally optimal solution

16.1 An activity-selection problem
• Suppose we have a set $S = \{a_1, a_2, \ldots, a_n\}$ of $n$ proposed activities that wish to use a resource (e.g., a lecture hall), which can serve only one activity at a time
• Each activity $a_i$ has a start time $s_i$ and a finish time $f_i$, where $0 \le s_i < f_i < \infty$
• If selected, activity $a_i$ takes place during the half-open time interval $[s_i, f_i)$


• Activities $a_i$ and $a_j$ are compatible if the intervals $[s_i, f_i)$ and $[s_j, f_j)$ do not overlap
• I.e., $a_i$ and $a_j$ are compatible if $s_i \ge f_j$ or $s_j \ge f_i$
• We wish to select a maximum-size subset of mutually compatible activities
• We assume that the activities are sorted in monotonically increasing order of finish time: $f_1 \le f_2 \le \cdots \le f_n$

• Consider, e.g., the following set of activities:

  i     1   2   3   4   5   6   7   8   9  10  11
  s_i   1   3   0   5   3   5   6   8   8   2  12
  f_i   4   5   6   7   9   9  10  11  12  14  16

• For this example, the subset $\{a_3, a_9, a_{11}\}$ consists of mutually compatible activities
• It is not a maximum subset, however, since the subset $\{a_1, a_4, a_8, a_{11}\}$ is larger
• In fact, $\{a_1, a_4, a_8, a_{11}\}$ is a largest subset of mutually compatible activities; another largest subset is $\{a_2, a_4, a_9, a_{11}\}$

The optimal substructure of the activity-selection problem
• Let $S_{ij}$ be the set of activities that start after $a_i$ finishes and that finish before $a_j$ starts
• We wish to find a maximum set of mutually compatible activities in $S_{ij}$
• Suppose that such a maximum set is $A_{ij}$, which includes some activity $a_k$
• By including $a_k$ in an optimal solution, we are left with two subproblems: finding mutually compatible activities in the set $S_{ik}$ and finding mutually compatible activities in the set $S_{kj}$

• Let $A_{ik} = A_{ij} \cap S_{ik}$ and $A_{kj} = A_{ij} \cap S_{kj}$, so that
  – $A_{ik}$ contains the activities in $A_{ij}$ that finish before $a_k$ starts and
  – $A_{kj}$ contains the activities in $A_{ij}$ that start after $a_k$ finishes
• Thus, $A_{ij} = A_{ik} \cup \{a_k\} \cup A_{kj}$, and so the maximum-size set $A_{ij}$ in $S_{ij}$ consists of $|A_{ij}| = |A_{ik}| + |A_{kj}| + 1$ activities
• The usual cut-and-paste argument shows that the optimal solution $A_{ij}$ must also include optimal solutions for $S_{ik}$ and $S_{kj}$

• This suggests that we might solve the activity-selection problem by dynamic programming
• If we denote the size of an optimal solution for the set $S_{ij}$ by $c[i, j]$, then we would have the recurrence

$$c[i, j] = c[i, k] + c[k, j] + 1$$

• Of course, if we did not know that an optimal solution for the set $S_{ij}$ includes activity $a_k$, we would have to examine all activities in $S_{ij}$ to find which one to choose, so that

$$c[i, j] = \begin{cases} 0 & \text{if } S_{ij} = \emptyset, \\ \max_{a_k \in S_{ij}} \{ c[i, k] + c[k, j] + 1 \} & \text{if } S_{ij} \ne \emptyset \end{cases}$$

• A memoized sketch of this recurrence follows below
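The following memoized Python sketch implements this recurrence directly (an illustration with hypothetical helper names; sentinel activities $a_0$ and $a_{n+1}$ make $S_{0,n+1}$ the whole problem). It solves $\Theta(n^2)$ subproblems, which is exactly the work the greedy choice below avoids:

from functools import lru_cache

s = [0, 1, 3, 0, 5, 3, 5, 6, 8, 8, 2, 12]      # s[1..11]; index 0 is sentinel a_0
f = [0, 4, 5, 6, 7, 9, 9, 10, 11, 12, 14, 16]  # f[0] = 0
n = 11
s.append(10**9); f.append(10**9)               # sentinel a_{n+1} starts after everything

@lru_cache(maxsize=None)
def c(i, j):
    # size of a maximum set of mutually compatible activities in S_ij
    best = 0
    for k in range(i + 1, j):
        if s[k] >= f[i] and f[k] <= s[j]:      # a_k lies within S_ij
            best = max(best, c(i, k) + c(k, j) + 1)
    return best

print(c(0, n + 1))  # 4, matching the largest subsets shown above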

Making the greedy choice
• For the activity-selection problem, we need consider only one choice: the greedy choice
• We choose an activity that leaves the resource available for as many other activities as possible
• Now, of the activities we end up choosing, one of them must be the first one to finish
• Choose the activity in $S$ with the earliest finish time, since that leaves the resource available for as many of the activities that follow it as possible
• Activities are sorted in monotonically increasing order by finish time; the greedy choice is activity $a_1$

• If we make the greedy choice, we only have to find activities that start after $a_1$ finishes
• $s_1 < f_1$ and $f_1$ is the earliest finish time of any activity $\Rightarrow$ no activity can have a finish time $\le s_1$
• Thus, all activities that are compatible with activity $a_1$ must start after $a_1$ finishes
• Let $S_k = \{a_i \in S : s_i \ge f_k\}$ be the set of activities that start after activity $a_k$ finishes
• Optimal substructure: if $a_1$ is in the optimal solution, then an optimal solution to the original problem consists of $a_1$ and all the activities in an optimal solution to the subproblem $S_1$

Theorem 16.1 Consider any nonempty subproblem $S_k$, and let $a_m$ be an activity in $S_k$ with the earliest finish time. Then $a_m$ is included in some maximum-size subset of mutually compatible activities of $S_k$.

Proof Let $A_k$ be a maximum-size subset of mutually compatible activities in $S_k$, and let $a_j$ be the activity in $A_k$ with the earliest finish time. If $a_j = a_m$, we are done, since $a_m$ is in a maximum-size subset of mutually compatible activities of $S_k$. If $a_j \ne a_m$, let the set $A_k' = (A_k - \{a_j\}) \cup \{a_m\}$. The activities in $A_k'$ are disjoint because the activities in $A_k$ are disjoint, $a_j$ is the first activity in $A_k$ to finish, and $f_m \le f_j$. Since $|A_k'| = |A_k|$, we conclude that $A_k'$ is a maximum-size subset of mutually compatible activities of $S_k$, and it includes $a_m$. ∎

• We can repeatedly choose the activity that finishes first, keep only the activities compatible with this activity, and repeat until no activities remain
• Moreover, because we always choose the activity with the earliest finish time, the finish times of the activities we choose must strictly increase
• We can consider each activity just once overall, in monotonically increasing order of finish times

A recursive greedy algorithm

RECURSIVE-ACTIVITY-SELECTOR(s, f, k, n)
1. m = k + 1
2. while m ≤ n and s[m] < f[k]  // find the first activity in S_k to finish
3.     m = m + 1
4. if m ≤ n
5.     return {a_m} ∪ RECURSIVE-ACTIVITY-SELECTOR(s, f, m, n)
6. else return ∅

• The initial call is RECURSIVE-ACTIVITY-SELECTOR(s, f, 0, n), where $a_0$ is a fictitious activity with $f_0 = 0$
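
A direct Python transcription might look as follows (a sketch; index 0 holds the fictitious activity $a_0$ with $f_0 = 0$, and the result is a list of activity indices):

def recursive_activity_selector(s, f, k, n):
    # s and f are 1-indexed via a sentinel entry at position 0 (f[0] = 0)
    m = k + 1
    while m <= n and s[m] < f[k]:   # find the first activity in S_k to finish
        m += 1
    if m <= n:
        return [m] + recursive_activity_selector(s, f, m, n)
    return []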

An iterative greedy algorithm

GREEDY-ACTIVITY-SELECTOR(s, f)
1. n = s.length
2. A = {a_1}
3. k = 1
4. for m = 2 to n
5.     if s[m] ≥ f[k]
6.         A = A ∪ {a_m}
7.         k = m
8. return A
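
The iterative version translates just as directly; here it is run on the example activities from earlier in this section (a sketch using the same 1-indexing convention):

def greedy_activity_selector(s, f):
    n = len(s) - 1          # activities 1..n; index 0 is a sentinel
    A = [1]                 # a_1, the first activity to finish, is always chosen
    k = 1
    for m in range(2, n + 1):
        if s[m] >= f[k]:    # a_m starts after the last chosen activity finishes
            A.append(m)
            k = m
    return A

s = [0, 1, 3, 0, 5, 3, 5, 6, 8, 8, 2, 12]
f = [0, 4, 5, 6, 7, 9, 9, 10, 11, 12, 14, 16]
print(greedy_activity_selector(s, f))  # [1, 4, 8, 11]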

• The set returned by the call GREEDY-ACTIVITY-SELECTOR(s, f) is precisely the set returned by the call RECURSIVE-ACTIVITY-SELECTOR(s, f, 0, n)
• Both the recursive version and the iterative algorithm schedule a set of $n$ activities in $\Theta(n)$ time, assuming that the activities were already sorted initially by their finish times

16.3 Huffman codes
• Huffman codes compress data very effectively
  – savings of 20% to 90% are typical, depending on the characteristics of the data being compressed
• We consider the data to be a sequence of characters
• Huffman's greedy algorithm uses a table giving how often each character occurs (i.e., its frequency) to build up an optimal way of representing each character as a binary string

• We have a 100,000-character data file that we wish to store compactly
• We observe that the characters in the file occur with the frequencies given in the following table:

  character                   a    b    c    d    e    f
  frequency (in thousands)   45   13   12   16    9    5
  fixed-length codeword     000  001  010  011  100  101
  variable-length codeword    0  101  100  111  1101 1100

• That is, only 6 different characters appear, and the character a occurs 45,000 times
• Here, we consider the problem of designing a binary character code (or code for short) in which each character is represented by a unique binary string, which we call a codeword

• Using a fixed-length code requires 3 bits to represent 6 characters: a = 000, b = 001, …, f = 101
• We then need 300,000 bits to code the entire file
• A variable-length code gives frequent characters short codewords and infrequent characters long codewords
• Here the 1-bit string 0 represents a, and the 4-bit string 1100 represents f
• This code requires

  (45·1 + 13·3 + 12·3 + 16·3 + 9·4 + 5·4) · 1,000 = 224,000 bits

  (a savings of approximately 25%)

Prefix codes
• We consider only codes in which no codeword is also a prefix of some other codeword
• A prefix code can always achieve the optimal data compression among any character code, and so we can restrict our attention to prefix codes
• Encoding is always simple for any binary character code; we just concatenate the codewords representing each character of the file
• E.g., with the variable-length prefix code above, we code the 3-character file abc as 0·101·100 = 0101100

• Prefix codes simplify decoding
  – Since no codeword is a prefix of any other, the codeword that begins an encoded file is unambiguous
• We can simply identify the initial codeword, translate it back to the original character, and repeat the decoding process on the remainder of the encoded file
• In our example, the string 001011101 parses uniquely as 0·0·101·1101, which decodes to aabe
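
A minimal Python decoding sketch (an addition, assuming the variable-length codewords from the table earlier in this section); the prefix property guarantees that the first codeword match is the correct one:

codewords = {'a': '0', 'b': '101', 'c': '100', 'd': '111', 'e': '1101', 'f': '1100'}
decode_map = {w: ch for ch, w in codewords.items()}

def decode(bits):
    out, buf = [], ''
    for b in bits:
        buf += b
        if buf in decode_map:   # prefix property: the first match is unambiguous
            out.append(decode_map[buf])
            buf = ''
    return ''.join(out)

print(decode('001011101'))  # 'aabe'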


• The decoding process needs a convenient representation for the prefix code so that we can easily pick off the initial codeword
• A binary tree whose leaves are the given characters provides one such representation
• We interpret the binary codeword for a character as the simple path from the root to that character, where 0 means "go to the left child" and 1 means "go to the right child"
• Note that the trees are not BSTs — the leaves need not appear in sorted order and internal nodes do not contain character keys

The trees corresponding to the fixed-length code a = 000, …, f = 101 and the optimal prefix code a = 0, b = 101, …, f = 1100


• An optimal code for a file is always represented by a full binary tree, in which every nonleaf node has two children
• The fixed-length code in our example is not optimal since its tree is not a full binary tree: it contains codewords beginning 10…, but none beginning 11…
• Since we can now restrict our attention to full binary trees, we can say that if $C$ is the alphabet from which the characters are drawn and
  – all character frequencies are positive, then
  – the tree for an optimal prefix code has exactly $|C|$ leaves, one for each letter of the alphabet, and
  – exactly $|C| - 1$ internal nodes

• Given a tree $T$ corresponding to a prefix code, we can easily compute the number of bits required to encode a file
• For each character $c$ in the alphabet $C$, let the attribute $c.freq$ denote the frequency of $c$ and let $d_T(c)$ denote the depth of $c$'s leaf in the tree
• $d_T(c)$ is also the length of the codeword for character $c$
• The number of bits required to encode a file is thus

$$B(T) = \sum_{c \in C} c.freq \cdot d_T(c),$$

which we define as the cost of the tree $T$

Constructing a Huffman code
• Let $C$ be a set of $n$ characters and each character $c \in C$ be an object with an attribute $c.freq$
• The algorithm builds the tree $T$ corresponding to the optimal code bottom-up
• It begins with a set of $|C|$ leaves and performs a sequence of $|C| - 1$ "merging" operations to create the final tree
• We use a min-priority queue $Q$, keyed on the freq attribute, to identify the two least-frequent objects to merge
• The result of merging is a new object whose frequency is the sum of the frequencies of the two objects that were merged

HUFFMAN(C)
1. n = |C|
2. Q = C
3. for i = 1 to n - 1
4.     allocate a new node z
5.     z.left = x = EXTRACT-MIN(Q)
6.     z.right = y = EXTRACT-MIN(Q)
7.     z.freq = x.freq + y.freq
8.     INSERT(Q, z)
9. return EXTRACT-MIN(Q)  // return the root of the tree
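
A compact runnable Python sketch of the same procedure (not from the original slides), using heapq as the min-priority queue and the frequencies of the running example; the counter merely breaks ties so that heapq never compares tree nodes:

import heapq
from itertools import count

def huffman(freqs):
    tiebreak = count()
    # queue entries: (freq, tiebreaker, tree); a leaf is a character,
    # an internal node is a (left, right) pair
    q = [(fr, next(tiebreak), ch) for ch, fr in freqs.items()]
    heapq.heapify(q)
    for _ in range(len(freqs) - 1):     # |C| - 1 merging operations
        f1, _, left = heapq.heappop(q)
        f2, _, right = heapq.heappop(q)
        heapq.heappush(q, (f1 + f2, next(tiebreak), (left, right)))
    return q[0][2]                      # the root of the tree

def codeword_lengths(tree, depth=0):
    if isinstance(tree, str):           # a leaf: its depth is its codeword length
        return {tree: depth}
    left, right = tree
    return {**codeword_lengths(left, depth + 1),
            **codeword_lengths(right, depth + 1)}

freqs = {'a': 45, 'b': 13, 'c': 12, 'd': 16, 'e': 9, 'f': 5}   # in thousands
lengths = codeword_lengths(huffman(freqs))
print(sum(freqs[ch] * d for ch, d in lengths.items()))  # 224 (thousand bits)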

• To analyze the running time of HUFFMAN, let $Q$ be implemented as a binary min-heap
• For a set $C$ of $n$ characters, we can initialize $Q$ (line 2) in $O(n)$ time using the BUILD-MIN-HEAP procedure
• The for loop executes exactly $n - 1$ times, and since each heap operation requires time $O(\lg n)$, the loop contributes $O(n \lg n)$ to the running time
• Thus, the total running time of HUFFMAN on a set of $n$ characters is $O(n \lg n)$
• We can reduce the running time to $O(n \lg \lg n)$ by replacing the binary min-heap with a van Emde Boas tree

Correctness of Huffman's algorithm
• We show that the problem of determining an optimal prefix code exhibits the greedy-choice and optimal-substructure properties

Lemma 16.2 Let $C$ be an alphabet in which each character $c$ has frequency $c.freq$. Let $x$ and $y$ be two characters in $C$ having the lowest frequencies. Then there exists an optimal prefix code for $C$ in which the codewords for $x$ and $y$ have the same length and differ only in the last bit.

Lemma 16.3 Let $C$, $c.freq$, $x$, and $y$ be as in Lemma 16.2. Let $C' = (C - \{x, y\}) \cup \{z\}$. Define freq for $C'$ as for $C$, except that $z.freq = x.freq + y.freq$. Let $T'$ be any tree representing an optimal prefix code for the alphabet $C'$. Then the tree $T$, obtained from $T'$ by replacing the leaf node for $z$ with an internal node having $x$ and $y$ as children, represents an optimal prefix code for the alphabet $C$.

Theorem 16.4 Procedure HUFFMAN produces an optimal prefix code.
