
CSE 221: Design and

Analysis of Algorithms
Fall 23

Prof. Hala Zayed

5 - Greedy Algorithms

Adapted from: Design and Analysis of Algorithms, Stanford University

Contents

• THE GREEDY PARADIGM
• NON-EXAMPLE: GREEDY KNAPSACK?
• ACTIVITY SELECTION
• D&C vs. DP vs. GREEDY
• SCHEDULING
• HUFFMAN CODING
THE GREEDY PARADIGM

Commit to choices one-at-a-time,


never look back,
and hope for the best.

Greedy doesn’t always work.


We’ll see some non-examples where a tempting greedy approach won’t work.
Then, we’ll see some examples where a greedy solution exists!

THE GREEDY PARADIGM

DISCLAIMER: It's often surprisingly easy to come up with ideas for greedy
algorithms, they're usually pretty easy to write down, and their runtimes
are straightforward to analyze! But you'll end up wondering, "how am I
supposed to know when I can use greedy algorithms?"
The answer may not be satisfying: a lot of the time, greedy algorithms
are not correct, and even when they are correct, it can be difficult to
prove their correctness.

NON-EXAMPLE: GREEDY KNAPSACK?

Can we design a greedy algorithm for Unbounded Knapsack?

UNBOUNDED KNAPSACK (Capacity: 10)
We have infinite copies of all the items.
What's the most valuable way to fill the knapsack?

Item:      1    2    3    4    5
Weight:    6    2    4    3   11
Value:    20    8   14   13   35

One optimal answer:
Total weight: 2 + 2 + 3 + 3 = 10
Total value:  8 + 8 + 13 + 13 = 42

Greedy approach? Here’s an idea: koalas have the best value/weight ratio, so keep using koalas!
Total weight: 3 + 3 + 3 = 9
Total value: 13 + 13 + 13 = 39

NON-EXAMPLE: GREEDY KNAPSACK?

This doesn't work! We ended up "regretting" our greedy choices. By the
time we put in the third koala, we realized that a magnet would have been
better (even though it doesn't immediately seem as valuable at the time)
because it would have left enough space for a fourth object that could
bump up our overall value!
NON-EXAMPLE: GREEDY KNAPSACK?

Our greedy approach for Unbounded Knapsack doesn't work for all inputs
(and we showed it fails via a counterexample).

While we usually don't say "no greedy algorithm can work," you can often get an idea of
whether a nearsighted style of greedy decision making feels suitable for a problem by
going through a few attempts at designing a greedy solution.

In this Unbounded Knapsack attempt, we saw that making the nearsighted decision of
putting in the highest value/weight ratio object that fits at the time causes us to have
"regret" later down the road. Making a nearsighted greedy decision feels inappropriate in
this problem, since it might be better to give up something earlier on to make room for
optimal decisions later. That's why DP made more sense for Unbounded Knapsack: DP tries
to optimize its choice by seeing a decision all the way through (via recursive formulations)
and then picking the optimal choice.
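To make the contrast concrete, here's a small Python sketch (not part of the original slides) that runs both the value/weight-ratio greedy and the standard unbounded-knapsack DP recurrence on the item table above:

```python
# Items from the slide's table; capacity 10.
weights = [6, 2, 4, 3, 11]
values = [20, 8, 14, 13, 35]
capacity = 10

def greedy_by_ratio(weights, values, capacity):
    """Repeatedly take the best value/weight item that still fits."""
    order = sorted(range(len(weights)),
                   key=lambda i: values[i] / weights[i], reverse=True)
    total, remaining = 0, capacity
    for i in order:
        while weights[i] <= remaining:  # keep using this item while it fits
            remaining -= weights[i]
            total += values[i]
    return total

def dp_unbounded(weights, values, capacity):
    """best[c] = max value achievable with capacity c."""
    best = [0] * (capacity + 1)
    for c in range(1, capacity + 1):
        for w, v in zip(weights, values):
            if w <= c:
                best[c] = max(best[c], best[c - w] + v)
    return best[capacity]

print(greedy_by_ratio(weights, values, capacity))  # 39 (three koalas)
print(dp_unbounded(weights, values, capacity))     # 42 (the optimum)
```

The DP sees the decision all the way through via the recurrence, while the greedy locks in the best-ratio item and regrets it.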

ACTIVITY SELECTION
• An example where greedy works!

ACTIVITY SELECTION: THE TASK
Input: n activities with start times and finish times
Constraint: All activities are equally important, but you can only do 1 activity at a time!
Output: A way to maximize the number of activities you can do

[Timeline figure: overlapping activities plotted against time: Sit outside,
Wash dishes, Work on homework, Section, Piano, Make hats, Talk to people,
Eat dinner, Watch TV, Sleep.]
ACTIVITY SELECTION: THE TASK

In what order should you greedily add activities? Here are 3 ideas:
1) Be impulsive: choose activities in ascending order of start times
2) Avoid commitment: choose activities in ascending order of length
3) Finish fast: choose activities in ascending order of end times

Only the third one seems to work (this is just our intuition right now)!
The first two greedy approaches could lead to "regrettable" decisions,
and finding a counterexample confirms that.
OUR GREEDY ALGORITHM

Pick an available activity with the smallest finish time & repeat.

[Figure: the timeline of activities, with the greedy picks highlighted
step by step.]

ACTIVITY SELECTION: PSEUDOCODE

ACTIVITY_SELECTION(activities A with start and finish times):
    A = MERGESORT_BY_FINISHTIMES(A)
    result = {}
    busy_until = 0
    for a in A:
        if a.start >= busy_until:
            result.add(a)
            busy_until = a.finish
    return result
Runtime: O(n) when the activities are already sorted by their finish time;
O(n log n) otherwise, due to the cost of sorting.
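The pseudocode above translates almost line-for-line into Python; the (start, finish) pairs below are made-up illustrative times, not the exact activities in the figure:

```python
def activity_selection(activities):
    """activities: list of (start, finish) pairs with nonnegative starts.
    Returns a maximum-size set of non-overlapping activities."""
    result = []
    busy_until = 0
    for start, finish in sorted(activities, key=lambda a: a[1]):  # by finish time
        if start >= busy_until:  # available: it starts after we're free
            result.append((start, finish))
            busy_until = finish
    return result

# Hypothetical example times:
acts = [(1, 4), (3, 5), (0, 6), (5, 7), (3, 9), (5, 9), (6, 10), (8, 11)]
print(activity_selection(acts))  # [(1, 4), (5, 7), (8, 11)]
```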

WHY IS IT GREEDY?

What makes our algorithm a greedy algorithm?

At each step in the algorithm, we make a choice (pick the available


activity with the smallest finish time) and never look back.

DP vs. GREEDY

Like Dynamic Programming, Greedy algorithms often work for problems with
nice optimal substructure. However, not only are optimal solutions to a problem
made up from optimal solutions of sub-problems,

but each problem depends on only one sub-problem!


(there’s some “best” decision to be made now, and then we solve a single sub-problem)

In our greedy activity selection problem, we made a choice and then moved
on to solve a single subproblem: find the optimal set of activities among
the smaller set of activities that remain after that choice.
D&C vs. DP vs. GREEDY

DIVIDE-AND-CONQUER: a big problem splits into several sub-problems, each
of which splits into several sub-sub-problems, and so on.
DYNAMIC PROGRAMMING: a big problem is built from (possibly overlapping)
sub-problems, each built from sub-sub-problems.
GREEDY: a big problem reduces to just one sub-problem, which reduces to
just one sub-sub-problem, and so on.
SCHEDULING
• Another (more complex) problem with a greedy solution!

SCHEDULING: THE TASK

Input: A set of n jobs. Job i takes t_i hours. For every hour until job i is done, pay c_i.
Output: An order of jobs to complete s.t. you minimize the cost.

Homework   Time: 5 hours.  Cost: 1 unit/hr until it's done
Sleep      Time: 8 hours.  Cost: 5 units/hr until it's done
Laundry    Time: 4 hours.  Cost: 2 units/hr until it's done
Three possible orderings:

Homework, Sleep, Laundry   costs (5 · 1) + (13 · 5) + (17 · 2) = 104 units
Sleep, Homework, Laundry   costs (8 · 5) + (13 · 1) + (17 · 2) = 87 units
Sleep, Laundry, Homework   costs (8 · 5) + (12 · 2) + (17 · 1) = 81 units
SCHEDULING: THE TASK

This problem has optimal substructure!

Suppose this is an optimal schedule:

    Job A, Job B, Job C, Job D

Then (B, C, D) must be the optimal schedule on just jobs B, C, and D.
(If not, rearranging B, C, D could make a better overall schedule than (A, B, C, D)!)

A greedy algorithm could greedily commit to the "best" job to do first, and then move on,
repeatedly picking the next "best" job. What would be the "best" job to do first?
SCHEDULING: EASIER VERSION #1

What would be the "best" job to do first?

Thinking about time lengths & costs together feels a bit complicated... To get some intuition about how they
relate to each other, let's brainstorm with a simpler version of the scheduling problem first:

SIMPLIFIED VERSION #1
Input: A set of n tasks. Each task takes 1 hour. For every hour until task i is done, pay c_i.
Output: An order of tasks to complete s.t. you minimize the cost.

    Job A (Cost/hr: 5)   Job B (Cost/hr: 2)   Job C (Cost/hr: 1)   Job D (Cost/hr: 10)

Which jobs should we do first?
A) Do higher-cost jobs first
B) Do lower-cost jobs first
SCHEDULING: EASIER VERSION #1

Do higher-cost jobs first! Why?

Suppose A costs c_A/hr and B costs c_B/hr, and c_A ≥ c_B (A is higher-cost).
Then cost(A then B) = 1·c_A + 2·c_B, and cost(B then A) = 1·c_B + 2·c_A.
Since c_A ≥ c_B, we know cost(A then B) ≤ cost(B then A),
so it's cheaper to go with A (the higher-cost job) before B.

Delaying expensive jobs is a bad idea, and it'll be better to get them out of the way
first. So if we save the cheapest jobs for last, then even though more hours go by
before they get completed, the rate we pay for that delay is lower.
SCHEDULING: EASIER VERSION #2

What would be the "best" job to do first?

Here's a different but still simpler version of the scheduling problem:

SIMPLIFIED VERSION #2
Input: A set of n tasks. Task i takes t_i hours. For every hour until task i is done, pay 1 unit.
Output: An order of tasks to complete s.t. you minimize the cost.

    Job A (Cost/hr: 1)   Job B (Cost/hr: 1)   Job C (Cost/hr: 1)   Job D (Cost/hr: 1)

Which jobs should we do first?
A) Do longer jobs first
B) Do shorter jobs first
SCHEDULING: EASIER VERSION #2

Do shorter jobs first! Why?

Suppose A takes t_A hours and B takes t_B hours, and t_A ≥ t_B (A is longer).
Then cost(A then B) = t_A + (t_A + t_B), and cost(B then A) = t_B + (t_B + t_A).
Since t_A ≥ t_B, we know cost(A then B) ≥ cost(B then A),
so it's cheaper to go with B (the shorter job) before A.

Basically, doing longer jobs first is a bad idea. A longer job delays every job that
comes after it by a longer amount, which is why shorter jobs are more attractive here:
the shortest jobs add only a minimal delay for each subsequent job.
SCHEDULING: THE "BEST" JOB

What would be the "best" job to do first?

Since both time & cost can vary in the actual problem, we'd like to combine the best of both versions...

It seems like we prefer higher-cost & shorter jobs.
So if A is higher-cost and shorter than B (i.e. c_A ≥ c_B and t_A ≤ t_B), then A is "better".

But what if neither A nor B is both higher-cost and shorter than the other? Then it's
not immediately obvious what the "better" job would be...

We need some way of assigning a "score" to each job, and then we can
choose the job with the best score. Higher cost should increase a job's
score, while a longer time should decrease a job's score.
SCHEDULING: THE "BEST" JOB

We need some way of assigning a "score" to each job, and then we can choose the job
with the best score. Higher cost should increase a job's score, while a longer time
should decrease a job's score.

REASONABLE ATTEMPT #1?  score for job i = cost_i - time_i
REASONABLE ATTEMPT #2?  score for job i = cost_i / time_i

(Both satisfy the requirement: higher cost increases the score, and a longer time
decreases it.) Which one works?
SCHEDULING: THE "BEST" JOB

Consider this example:
    Job A: Cost/hr 5, time 3 hours
    Job B: Cost/hr 2, time 1 hour

WRONG SCORING SCHEME: score for job i = cost_i - time_i
This says we should do Job A then Job B,
which gives us cost (3 · 5) + (4 · 2) = 23.

PROMISING SCORING SCHEME: score for job i = cost_i / time_i
This says we should do Job B then Job A,
which gives us cost (1 · 2) + (4 · 5) = 22.
SCHEDULING: THE "BEST" JOB

Why does the ratio matter? For any two tasks A and B:

cost(A then B) = (t_A · c_A) + ((t_A + t_B) · c_B)
cost(B then A) = (t_B · c_B) + ((t_A + t_B) · c_A)

A-then-B is better than B-then-A exactly when c_B / t_B ≤ c_A / t_A:

(t_A · c_A) + ((t_A + t_B) · c_B) ≤ (t_B · c_B) + ((t_A + t_B) · c_A)
(t_A · c_A) + (t_A · c_B) + (t_B · c_B) ≤ (t_B · c_B) + (t_A · c_A) + (t_B · c_A)
t_A · c_B ≤ t_B · c_A
c_B / t_B ≤ c_A / t_A
SCHEDULING: "PSEUDOCODE"

Our greedy choice: always choose the job with the next biggest ratio of
cost (per hour until finished) to the time it takes.

SCHEDULING(n jobs with times & costs):
    Compute cost/time ratios for all jobs
    Sort jobs in descending order of cost/time ratios
    Return sorted jobs!

Runtime: O(n log n)
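As a sanity check, here's a short Python sketch of this greedy schedule run on the Homework/Sleep/Laundry example from earlier (the tuple format is my own):

```python
def schedule(jobs):
    """jobs: list of (name, hours, cost_per_hour).
    Orders jobs by descending cost/time ratio and computes the total cost."""
    ordered = sorted(jobs, key=lambda j: j[2] / j[1], reverse=True)
    elapsed, total = 0, 0
    for name, hours, cost_per_hr in ordered:
        elapsed += hours                 # this job finishes at time `elapsed`
        total += elapsed * cost_per_hr   # we paid cost_per_hr every hour until then
    return ordered, total

jobs = [("Homework", 5, 1), ("Sleep", 8, 5), ("Laundry", 4, 2)]
order, cost = schedule(jobs)
print([name for name, _, _ in order])  # ['Sleep', 'Laundry', 'Homework']
print(cost)                            # 81, matching the best ordering above
```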

SCHEDULING: WHAT DID WE LEARN?

The scheduling problem does have a greedy solution that works!
Always choose the job with the next biggest ratio of
cost (per hour until finished) to the time it takes.
HUFFMAN CODING
• One more problem with a greedy solution!

OPTIMAL CODES: THE TASK

ASCII can be pretty wasteful for English sentences, where letters have varying
frequencies. If e shows up so often, maybe we should have a more efficient
way of representing it (e.g. use fewer bits to represent e)!

everyday english sentence


01100101 01110110 01100101 01110010 01111001 01100100 01100001
01111001 00100000 01100101 01101110 01100111 01101100 01101001
01110011 01101000 00100000 01110011 01100101 01101110 01110100
01100101 01101110 01100011 01100101

Input: Some distribution on characters (frequencies of characters)


Output: A way to encode the characters as efficiently* as possible

Character:      A    B    C    D    E    F
Percentage:    45   13   12   16    9    5

(This means, e.g., "D" makes up 16% of the characters in the text we're encoding.)

Goal: Minimize the average number of bits used to encode a symbol
(with symbols weighted according to their frequencies)
OPTIMAL CODES: ATTEMPT 0

ATTEMPT 0: Use a fixed-length code (the ith character gets coded as i in binary)

Character:      A    B    C    D    E    F
Code:         000  001  010  011  100  101

We should really try to get away with fewer bits for our more common symbols...
OPTIMAL CODES: ATTEMPT 1

ATTEMPT 1: Use a variable-length code (shorter codes for common characters)

Character:      A    B    C    D    E    F
Code:           0   00   01    1   10   11

What is 001? It could be AC, or it could be BD... we've introduced ambiguity!
OPTIMAL CODES: ATTEMPT 2

ATTEMPT 2: Use a variable-length prefix-free code, so that no character's encoding
is a prefix of another character's encoding (sometimes called a prefix code)

Character:      A    B    C    D    E    F
Code:          00  101  110   01  111  100

What is 0011001? ACD
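A quick Python sketch (not in the original slides) of why prefix-freeness makes decoding unambiguous, using the Attempt 2 code above: scan left to right and emit a character whenever the buffer matches a codeword.

```python
# Attempt 2's prefix-free code: no codeword is a prefix of another.
CODE = {"A": "00", "B": "101", "C": "110", "D": "01", "E": "111", "F": "100"}
DECODE = {bits: ch for ch, bits in CODE.items()}

def decode(bits):
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in DECODE:         # a full codeword has been read; no ambiguity
            out.append(DECODE[buf])
            buf = ""
    return "".join(out)

print(decode("0011001"))  # ACD
```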

A PREFIX-FREE CODE IS A TREE

We can think of a prefix-free code as a tree:
• As long as all the letters show up as leaves, the corresponding code is prefix-free.
• A character's encoding can be found by tracing down from the root. By convention,
  left edges denote 0, and right edges denote 1.
• The cost of a tree is the expected length of the encoding of a random letter
  (randomness is weighted by frequency).

[Tree figure: A:45 and D:16 sit at depth 2 (codes 00 and 01), while F:5, B:13,
C:12, and E:9 sit at depth 3 (codes 100, 101, 110, 111).]
Cost of this tree (average leaf depth, weighted by frequency):
(2 · 0.45) + (2 · 0.16) + (3 · 0.05) + (3 · 0.13) + (3 · 0.12) + (3 · 0.09) = 2.39
Our goal (rephrased in terms of this tree):
Minimize the (weighted) average leaf depth of this binary tree!
A PREFIX-FREE CODE IS A TREE

IDEA: Greedily build sub-trees from the bottom up, where the "greedy goal"
is to have less frequent letters further down in the tree.

To ensure that less frequent letters are further down in the tree,
we'll greedily build subtrees by "merging" the 2 nodes with the smallest
frequency counts, and then repeating until we've merged everything!

A "merge" between 2 nodes creates a common parent node whose key is the
sum of those 2 nodes' frequencies (e.g. merging E:9 and C:12 creates a
parent with key 21).
HUFFMAN CODING: EXAMPLE

Greedily build subtrees by merging, starting with the 2 most infrequent letters.

Start: A:45, B:13, C:12, D:16, E:9, F:5
Merge E:9 and F:5 into a node with key 14.
Merge C:12 and B:13 into a node with key 25.
Merge the 14-node and D:16 into a node with key 30.
Merge the 25-node and the 30-node into a node with key 55.
Merge A:45 and the 55-node into the root, with key 100.
HUFFMAN CODING: EXAMPLE

The resulting codes: A = 0, B = 100, C = 101, D = 110, E = 1110, F = 1111.

Note: This merging procedure guarantees that all characters will be leaves
in the tree (so the tree corresponds to a prefix-free code).

Expected cost:
(0.45 · 1) + (0.13 · 3) + (0.12 · 3) + (0.16 · 3) + (0.09 · 4) + (0.05 · 4) = 2.24
HUFFMAN CODING: PSEUDOCODE

HUFFMAN_CODING(Characters C, Frequencies F):
    Create a node for each character (key is its frequency)
    CURRENT = {set of all these nodes}
    while len(CURRENT) > 1:
        X, Y ← the 2 nodes in CURRENT with the smallest keys
        Create a new node Z with Z.key = X.key + Y.key
        Z.left = X, Z.right = Y
        Add Z to CURRENT, and remove X and Y from CURRENT
    return CURRENT[0]

Runtime: O(n · runtime per iteration). The per-iteration cost depends on how
we store our set of CURRENT nodes! We need to find minimum nodes and insert nodes.

With a better data structure, the whole algorithm runs in O(n log n): for
example, pre-sort the frequencies using MERGESORT and maintain 2 queues
(can you figure this out?).
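Here's a compact runnable sketch using Python's heapq as the CURRENT set instead of the two-queue trick (a min-heap also gives O(n log n) overall; the tuple-based node representation is my own):

```python
import heapq
from itertools import count

def huffman(freqs):
    """freqs: dict char -> frequency. Returns dict char -> codeword."""
    tiebreak = count()  # unique tiebreaker so heapq never compares nodes
    heap = [(f, next(tiebreak), ch) for ch, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        fx, _, x = heapq.heappop(heap)  # the 2 nodes with the smallest keys...
        fy, _, y = heapq.heappop(heap)
        heapq.heappush(heap, (fx + fy, next(tiebreak), (x, y)))  # ...get merged
    codes = {}
    def assign(node, prefix):
        if isinstance(node, tuple):  # internal node: left edge is 0, right is 1
            assign(node[0], prefix + "0")
            assign(node[1], prefix + "1")
        else:
            codes[node] = prefix or "0"  # single-character edge case
    assign(heap[0][2], "")
    return codes

freqs = {"A": 45, "B": 13, "C": 12, "D": 16, "E": 9, "F": 5}
codes = huffman(freqs)
# Expected cost (average bits per character, weighted by frequency):
cost = sum(freqs[ch] * len(code) for ch, code in codes.items()) / 100
print(cost)  # 2.24, matching the example slides
```

The exact 0/1 labels may differ from the slides' tree, but the codeword lengths (and hence the expected cost) come out the same.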

HUFFMAN CODING: WHAT DID WE LEARN?

Huffman Coding is an optimal way to encode characters so as to minimize the
average number of bits needed to encode a character!
We greedily built subtrees from the bottom up by merging the 2 nodes with the
minimum total frequency.

SUPER FUN FACT!!!
David Huffman came up with this as his term paper for an MIT class. His professor gave students
the option to opt out of the final exam if they worked on a project to come up with an optimal prefix
code. It turns out that his professor, Robert Fano, had been working on coming up with a prefix
code and had a more divide-and-conquer-y way to build a prefix tree, but it was suboptimal!
Huffman didn't realize that the prefix code was an open problem (and later admitted that he
wouldn't have tried it if he had known his professor had tried and couldn't get it), and he just
managed to come up with this beautiful and optimal algorithm!
References
● J. Kleinberg, E. Tardos. Algorithm Design. Pearson Education, Ch. 6.
● Some slides are updated from: CS381 Introduction to the Analysis of Algorithms.
● Some slides are updated from: Design and Analysis of Algorithms, Stanford University.

Questions
