(Les08) Greedy

Algoritmen & Datastructuren
2014-2015
Greedy Algorithms
Philip Dutré & Benedict Brown
Dept. of Computer Science, K.U.Leuven
What is a greedy algorithm?

Greedy algorithms make the choice that seems best now
Hopefully, many locally good choices give a good overall outcome
Example 1:
Driving directions: always head toward your goal (but this can be suboptimal)
Copyright Ph. Dutré, Spring 2015

Example 2: Numbering fragments
Sequentially numbered fragments should be near each other
Greedy: the next fragment is the closest unnumbered fragment
Not optimal, but a reasonable approximation to a very hard problem

Example 3: Coin Exchange
Suppose you want to count out a certain amount of money, using the fewest possible bills and coins
A greedy algorithm to do this would be:
At each step, take the largest possible bill or coin that does not overshoot
Example: To make $6.39, you can choose:
a $5 bill
a $1 bill, to make $6
a 25¢ coin, to make $6.25
a 10¢ coin, to make $6.35
four 1¢ coins, to make $6.39
For US money, the greedy algorithm always gives the optimum solution
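The greedy rule above can be sketched in a few lines of Python. This is only an illustration: the function name `greedy_change` and the list `US_CENTS` are assumptions made for this example, and amounts are expressed in cents to avoid floating-point rounding.

```python
def greedy_change(amount, denominations):
    """Return the bills/coins picked greedily, largest first (amounts in cents)."""
    chosen = []
    for coin in sorted(denominations, reverse=True):
        # Keep taking this denomination while it does not overshoot.
        while amount >= coin:
            amount -= coin
            chosen.append(coin)
    if amount != 0:
        raise ValueError("amount cannot be made with these denominations")
    return chosen


# Assumed subset of US denominations, in cents ($5, $1, 25c, 10c, 5c, 1c).
US_CENTS = [500, 100, 25, 10, 5, 1]

print(greedy_change(639, US_CENTS))  # [500, 100, 25, 10, 1, 1, 1, 1]
```

The output matches the $6.39 example: one $5 bill, one $1 bill, one 25¢ coin, one 10¢ coin and four 1¢ coins.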

Example 3: Coin Exchange
In some (fictional) monetary system, quatloos come in:
1 quatloo, 7 quatloo, and 10 quatloo coins
Using a greedy algorithm to count out 15 quatloos, you would get:
A 10 quatloo piece
Five 1 quatloo pieces (for a total of 15 quatloos)
This requires six coins
Better solution: two 7 quatloo pieces and one 1 quatloo piece
This only requires three coins
The greedy algorithm for this problem results in a solution, but not in an optimal solution
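To see the gap concretely, the sketch below counts coins for the greedy rule and compares it with the true minimum. The minimum is computed with a small dynamic-programming table, which is a technique swapped in for this illustration and not something the slides use; the names `greedy_count`, `optimal_count` and `QUATLOOS` are likewise assumptions.

```python
def greedy_count(amount, coins):
    """Number of coins used by the greedy rule (assumes a 1-unit coin exists)."""
    count = 0
    for coin in sorted(coins, reverse=True):
        count += amount // coin
        amount %= coin
    return count


def optimal_count(amount, coins):
    """Minimum number of coins, via dynamic programming (not greedy)."""
    inf = float("inf")
    best = [0] + [inf] * amount  # best[v] = fewest coins that sum to v
    for value in range(1, amount + 1):
        for coin in coins:
            if coin <= value and best[value - coin] + 1 < best[value]:
                best[value] = best[value - coin] + 1
    return best[amount]


QUATLOOS = [1, 7, 10]
print(greedy_count(15, QUATLOOS))   # 6  (10 + 1 + 1 + 1 + 1 + 1)
print(optimal_count(15, QUATLOOS))  # 3  (7 + 7 + 1)
```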
(Wall Street, 1987)

Real problem: Activity Scheduling

We want to schedule as many non-overlapping
activities as possible.
Tennis matches: as many on centre court as possible

Hubble: observations can occur only when orbit is right
Classrooms: class and practice sessions limited by available
rooms
Real problem: Activity Scheduling
Too many schedules to try them all
Greedy approach: Schedule an activity
... then schedule as many non-overlapping activities as possible
If we always pick well, the result should be pretty good or perfect!
Activity Scheduling, Heuristic 1
Schedule the next non-conflicting activity to start

This can lead to suboptimal choices:
We schedule one long activity instead of four short ones
We schedule this activity first, but this leaves no room for the other 4
(and we want to optimize for the number of activities)
Activity Scheduling, Heuristic 2
Schedule the shortest non-conflicting activity

This can lead to suboptimal choices:
We schedule one short activity, blocking two longer ones.
We schedule this activity first, but this leaves no room for the other 2
Activity Scheduling, Heuristic 3
Schedule the activity with the fewest conflicts

This can lead to suboptimal choices:
We schedule the central activity and two others, instead of the top four.
We schedule this activity first (fewest conflicts), and we then only have room for 2 others
Activity Scheduling, Heuristic 4
Schedule the non-conflicting activity that finishes first
(Figure: the scheduled activities, labelled 1st, 2nd, 3rd and 4th in the order they are selected.)
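A minimal sketch of heuristic 4 follows. It assumes each activity is a `(start, finish)` pair and that an activity may start exactly when the previous one finishes; the function name and the example data are made up for illustration.

```python
def earliest_finish_schedule(activities):
    """Greedily pick non-conflicting activities in order of finish time."""
    chosen = []
    last_finish = float("-inf")
    for start, finish in sorted(activities, key=lambda a: a[1]):
        # Non-conflicting means it starts no earlier than the last finish.
        if start >= last_finish:
            chosen.append((start, finish))
            last_finish = finish
    return chosen


# Hypothetical instance: one long activity overlapping four short ones.
example = [(0, 10), (1, 3), (3, 5), (5, 7), (7, 9)]
print(earliest_finish_schedule(example))
# [(1, 3), (3, 5), (5, 7), (7, 9)]
```

On this instance heuristic 1 (earliest start) would pick only the long activity `(0, 10)`, while the earliest-finish rule schedules all four short ones.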
Let A = {a_1, ..., a_n} be the set of activities selected by heuristic 4, sorted by finish time.
Let A' = {a'_1, ..., a'_m} be an optimal set of activities, also sorted by finish time.
Suppose a_1 != a'_1:
The activities within a schedule do not overlap, so they are also sorted by start time.
By heuristic 4, a_1 is the first activity to finish, so it finishes no later than a'_1.
Therefore, a_1 cannot conflict with a'_2 (since A' is a solution to the problem, a'_2 starts after a'_1 finishes),
and the schedule {a_1, a'_2, ..., a'_m} is also a valid optimal schedule
(same number of activities as A').
By induction, we can keep replacing the a'_i's with a_i, while maintaining an optimal schedule.
If A' had more activities than A, the extra one would not conflict with a_n and heuristic 4 would have selected it, so m = n.
Therefore A is an optimal solution.
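The proof can be sanity-checked empirically, which is not a substitute for the argument but a useful habit. The sketch below compares the greedy count with an exhaustive search on small random instances. All names, the instance generator and the convention that touching endpoints do not conflict are assumptions of this example.

```python
import random
from itertools import combinations


def greedy_size(activities):
    """Number of activities chosen by the earliest-finish rule."""
    count, last_finish = 0, float("-inf")
    for start, finish in sorted(activities, key=lambda a: a[1]):
        if start >= last_finish:
            count, last_finish = count + 1, finish
    return count


def compatible(subset):
    """True if no two activities in the subset overlap."""
    ordered = sorted(subset, key=lambda a: a[1])
    return all(a[1] <= b[0] for a, b in zip(ordered, ordered[1:]))


def brute_force_size(activities):
    """Largest compatible subset, found by trying every subset."""
    for k in range(len(activities), 0, -1):
        if any(compatible(c) for c in combinations(activities, k)):
            return k
    return 0


def random_instance(rng, n=7, horizon=20):
    starts = [rng.randrange(horizon) for _ in range(n)]
    return [(s, s + rng.randrange(1, 6)) for s in starts]


def check(trials=200, seed=0):
    rng = random.Random(seed)
    instances = (random_instance(rng) for _ in range(trials))
    return all(greedy_size(i) == brute_force_size(i) for i in instances)


print(check())  # True
```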
Real problem: File Compression
Computers usually encode characters using a fixed number of bits:
ASCII: 7 bits per character (128 characters)
ISO-8859-1: 8 bits per character (256 characters)
Unicode: 16 bits per character (64k characters) [ ... Unicode 7.0 June 2014]
But: Most documents contain fewer than 64k distinct characters
UTF-8: use fewer bits to represent common characters
Roman alphabet (ASCII): 1 byte per character
Accented letters, Greek, Cyrillic, Arabic, Hebrew, etc.: 2 bytes
Chinese, etc.: 3 bytes
Obscure Chinese characters, historic characters, etc.: 4 bytes.
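The byte counts in the UTF-8 list can be checked directly with Python's built-in `str.encode`. The sample characters below are chosen for this illustration only; they are written as escapes so the code does not depend on the file's own encoding.

```python
# Sample characters (an assumption of this example), one per category above.
samples = {
    "A": "Roman alphabet (ASCII)",
    "\u00e9": "accented letter (e with acute)",
    "\u03bb": "Greek (lambda)",
    "\u0436": "Cyrillic (zhe)",
    "\u4e2d": "common Chinese character",
    "\U00020000": "rare CJK character",
}

for ch, description in samples.items():
    n_bytes = len(ch.encode("utf-8"))
    print(f"U+{ord(ch):04X}  {description}: {n_bytes} byte(s)")

utf8_lengths = [len(ch.encode("utf-8")) for ch in samples]
print(utf8_lengths)  # [1, 2, 2, 2, 3, 4]
```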
Cf. Morse code
Assign short codes to common letters
e = .   t = _   ? = ..__..
On average, Morse code yields 1.66:1 compression for English text
File Compression
No algorithm can compress every bitstring
Proof 1 [by contradiction]
Suppose you have a universal data compression algorithm U that can compress every bitstring.
Given bitstring B0, compress it to get a smaller bitstring B1.
Compress B1 to get a smaller bitstring B2.
Continue until reaching a bitstring of size 1.
Implication: all bitstrings can be compressed to 1 bit!
Proof 2 [by counting]
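Suppose your algorithm can compress all 1,000-bit strings.
There are 2^1000 possible bitstrings with 1,000 bits.
Only 1 + 2 + 4 + ... + 2^998 + 2^999 = 2^1000 - 1 bitstrings can be encoded with 999 bits or fewer.
So at least two different inputs would have to compress to the same output.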
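The counting step is easy to verify with Python's arbitrary-precision integers. The variable names are illustrative; the sum runs over every possible length from 0 to 999 bits.

```python
n = 1000

# Number of distinct bitstrings of exactly n bits.
strings_of_length_n = 2 ** n

# Number of distinct bitstrings that are strictly shorter (lengths 0 .. n-1).
shorter_strings = sum(2 ** k for k in range(n))

# There is exactly one fewer short string than there are n-bit inputs,
# so no lossless scheme can map every n-bit string to a shorter one.
print(shorter_strings == strings_of_length_n - 1)  # True
```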
Redundancy in the English language?
Run-Length Encoding
Representation. Use 4-bit counts to represent alternating runs of 0s and 1s:
15 0s, then 7 1s, then 7 0s, then 11 1s.
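A minimal sketch of this representation is given below, working on strings of `'0'` and `'1'` for readability. Two conventions are assumptions of this example and are not stated on the slide: the encoding always starts with a run of 0s (possibly of length zero), and a run longer than 15 is split by inserting a zero-length run of the other symbol.

```python
def rle_encode(bits, count_bits=4):
    """Encode a '0'/'1' string as fixed-width run lengths, starting with 0s."""
    max_run = (1 << count_bits) - 1
    counts = []
    current, run = "0", 0
    for bit in bits:
        if bit != current:
            counts.append(run)          # close the current run
            current, run = bit, 0
        elif run == max_run:
            counts.extend([run, 0])     # run too long: split it
            run = 0
        run += 1
    counts.append(run)
    return "".join(format(c, f"0{count_bits}b") for c in counts)


def rle_decode(encoded, count_bits=4):
    """Invert rle_encode."""
    out, current = [], "0"
    for i in range(0, len(encoded), count_bits):
        out.append(current * int(encoded[i:i + count_bits], 2))
        current = "1" if current == "0" else "0"
    return "".join(out)


bits = "0" * 15 + "1" * 7 + "0" * 7 + "1" * 11   # the 40-bit example
encoded = rle_encode(bits)
print(encoded)                      # 1111011101111011
print(len(bits), "->", len(encoded))  # 40 -> 16
```

The four counts 15, 7, 7 and 11 become the 4-bit groups 1111, 0111, 0111 and 1011, so the 40-bit input is stored in 16 bits.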
Huffman Coding
How can we compress more than Morse code?
Pick the codes for each character in a specific text

We can usually get 2:1 compression for English text
How do we know the length of each code?
Morse code uses dots and dashes for letters, and gaps to end codes
Computers have only 1s and 0s (no gaps)
Prefix codes: No code is a prefix of any other code
e.g. If e = 101, no other letter's code can start with 101
Simple tree representation for prefix codes: letters are leaves, and the path from the root to a leaf spells the letter's code
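Because no code is a prefix of another, a decoder can read bits one at a time and emit a letter as soon as the bits read so far match a codeword, with no gaps needed. The sketch below illustrates this; the code table `CODES` is hypothetical (only `e = 101` comes from the slide) and the function name is an assumption.

```python
def decode_prefix_code(bits, codes):
    """Decode a '0'/'1' string using a prefix code given as {letter: codeword}."""
    by_codeword = {code: letter for letter, code in codes.items()}
    decoded, current = [], ""
    for bit in bits:
        current += bit
        # A match is unambiguous: no codeword is a prefix of another.
        if current in by_codeword:
            decoded.append(by_codeword[current])
            current = ""
    if current:
        raise ValueError("bitstring ends in the middle of a codeword")
    return "".join(decoded)


# Hypothetical prefix code; only e = 101 is taken from the slide.
CODES = {"e": "101", "t": "0", "a": "11", "n": "100"}

encoded = "".join(CODES[ch] for ch in "eaten")
print(encoded)                             # 101110101100
print(decode_prefix_code(encoded, CODES))  # eaten
```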
Optimal prefix codes
Best code for each letter?
Construct the prefix tree by greedily linking the two least frequent characters
Intuition: the more frequent a letter, the shorter its code should be
Combined characters form a new character with frequency = sum of frequencies of linked characters.
Implementation requires a priority queue: ~ n log2 n
Optimal prefix codes
Example: ovoviviparous
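A compact sketch of the greedy construction, applied to this example word, is shown below. It uses Python's `heapq` module as the priority queue; the function name and the tie-breaking counter are assumptions of this example, and the individual codewords depend on how ties are broken, so only the total length is reported.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_codes(text):
    """Build a prefix code by repeatedly linking the two least frequent nodes."""
    tiebreak = count()  # keeps heap entries comparable when frequencies tie
    heap = [(f, next(tiebreak), letter) for letter, f in Counter(text).items()]
    heapq.heapify(heap)
    if not heap:
        return {}
    if len(heap) == 1:  # a single distinct letter still needs one bit
        return {heap[0][2]: "0"}
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        # The linked pair acts as a new character with the summed frequency.
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))

    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):      # internal node: (left, right)
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                            # leaf: a single letter
            codes[node] = prefix

    walk(heap[0][2], "")
    return codes


word = "ovoviviparous"
codes = huffman_codes(word)
total_bits = sum(len(codes[ch]) for ch in word)
print(total_bits)  # 37
```

The word has 8 distinct letters, so a fixed-length code would need 3 bits per letter, or 39 bits in total, against 37 here.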
Proof of optimality
Lemma 1: Any optimal code tree is full.
(nodes with only 1 child are impossible)
Proof of optimality
Lemma 2:
There exists an optimal tree in which the two least frequent letters are siblings at the maximum depth.
Proof:
Every leaf node has a sibling or the tree would not be full (lemma 1).
The two least frequent letters x and y must be at the maximum depth.
If not, there must be a letter w at maximum depth that occurs more frequently; swapping w with x or y would shorten the encoding, so the tree would not be optimal.
If x and y are not siblings, swap y with x's sibling. Both are at the maximum depth, so the number of bits for all letters remains unchanged.
Proof of optimality
Base Case:
Induction: Assume an optimal tree where the two least frequent letters x and y are merged into a single letter z
Hypothesis: The tree created by adding x and y as children of z is optimal
Proof of optimality
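Contradiction: Suppose the expanded tree is not optimal. Then start from an optimal expanded tree (with x and y as siblings, by lemma 2) and merge x and y: the result would be cheaper than the tree assumed optimal.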
Traveling Salesman Problem
A bike thief wishes to visit every building on campus, then return to his hideout. In what order should he visit the buildings?
TSP is NP-hard (its decision version is NP-complete); the best known exact algorithms take exponential time
Use a greedy algorithm to get a reasonable solution
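The slides do not say which greedy rule is meant here. One common choice is the nearest-neighbour rule: from the current building, always go to the closest building not yet visited. The sketch below assumes buildings are points in the plane with straight-line distances; the coordinates and function names are made up for illustration.

```python
import math


def nearest_neighbour_tour(points, start=0):
    """Greedy tour: repeatedly visit the closest unvisited point, then return."""
    unvisited = set(range(len(points))) - {start}
    tour = [start]
    while unvisited:
        here = points[tour[-1]]
        nxt = min(unvisited, key=lambda i: math.dist(here, points[i]))
        unvisited.remove(nxt)
        tour.append(nxt)
    tour.append(start)  # return to the hideout
    return tour


def tour_length(points, tour):
    return sum(math.dist(points[a], points[b]) for a, b in zip(tour, tour[1:]))


# Hypothetical campus: index 0 is the hideout.
buildings = [(0, 0), (0, 3), (4, 3), (4, 0), (1, 1)]
tour = nearest_neighbour_tour(buildings)
print(tour)  # [0, 4, 1, 2, 3, 0]
print(round(tour_length(buildings, tour), 2))
```

The tour is built in roughly quadratic time, but like the other greedy examples it carries no guarantee of being the shortest one.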
