
Introduction to Algorithms

Greedy Algorithms
CSE 680
Prof. Roger Crawfis
Optimization Problems

• For most optimization problems you want to find, not just a solution,
but the best solution.
• A greedy algorithm sometimes works well for optimization
problems. It works in phases. At each phase:
• You take the best you can get right now, without regard for future
consequences.
• You hope that by choosing a local optimum at each step, you will end up at a
global optimum.
Greedy Algorithms
• In dynamic programming, a recursive property is used to
divide an instance into smaller instances.
• In the greedy approach, there is no division into smaller
instances. A greedy algorithm arrives at a solution by
making a sequence of choices, each of which simply looks
the best at the moment. That is, each choice is locally optimal.
• The hope is that a globally optimal solution will be obtained,
but this is not always the case.
• For a given algorithm, we must determine whether the
solution is always optimal.
Example
• A simple example illustrates the greedy approach. Joe, the sales clerk, often encounters the problem of giving change for a purchase. Customers usually don't want to receive a lot of coins.
• Most customers would be aggravated if he gave them 87 pennies when the change was $0.87.
• A solution to an instance of Joe's change problem is a set of coins that adds up to the amount he owes the customer, and an optimal solution is such a set of minimum size. A greedy approach to the problem could proceed as follows. Initially there are no coins in the change.
• Joe starts by looking for the largest coin (in
value) he can find. That is, his criterion for
deciding which coin is best (locally optimal) is
the value of the coin. This is called the selection
procedure in a greedy algorithm.
• Next he sees if adding this coin to the change
would make the total value of the change exceed
the amount owed. This is called the feasibility
check in a greedy algorithm.
• If adding the coin would not make the change exceed the
amount owed, he adds the coin to the change.
• Next he checks to see if the value of the change is now
equal to the amount owed. This is the solution check in a
greedy algorithm.
• If the values are not equal, Joe gets another coin using his
selection procedure and repeats the process.
Steps in Greedy approach
• A selection procedure chooses the next item to add
to the set. The selection is performed according to a
greedy criterion that satisfies some locally optimal
consideration at the time.
• A feasibility check determines if the new set is
feasible by checking whether it is possible to
complete this set in such a way as to give a solution
to the instance.
• A solution check determines whether the new set
constitutes a solution to the instance.
Example: Counting Money

• Suppose you want to count out a certain amount of money, using the
fewest possible bills and coins
• A greedy algorithm to do this would be:
At each step, take the largest possible bill or coin that does not
overshoot
• Example: To make $6.39, you can choose:
• a $5 bill
• a $1 bill, to make $6
• a 25¢ coin, to make $6.25
• A 10¢ coin, to make $6.35
• four 1¢ coins, to make $6.39
• For US money, the greedy algorithm always gives the optimum
solution
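
• A sketch of this procedure in Java is below. It is illustrative only (the class and method names are assumptions, not from the slides); note how the loop condition is the solution check, the largest-first scan is the selection procedure, and the "does not overshoot" test is the feasibility check.

import java.util.*;

public class GreedyChange {
    // Greedy change-making: repeatedly add the largest coin that
    // does not overshoot the amount owed.
    // Assumes denominations are sorted in decreasing order and
    // include a 1-unit coin, so the loop always terminates.
    static List<Integer> makeChange(int[] denominations, int amount) {
        List<Integer> change = new ArrayList<>();
        int total = 0;
        while (total < amount) {                   // solution check
            for (int coin : denominations) {       // selection: largest first
                if (total + coin <= amount) {      // feasibility check
                    change.add(coin);
                    total += coin;
                    break;
                }
            }
        }
        return change;
    }

    public static void main(String[] args) {
        int[] us = {500, 100, 25, 10, 5, 1};       // US bills/coins, in cents
        System.out.println(makeChange(us, 639));   // $6.39: [500, 100, 25, 10, 1, 1, 1, 1]
    }
}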
Greedy Algorithm Failure

• In some (fictional) monetary system, “krons” come in 1 kron, 7 kron, and 10 kron coins
• Using a greedy algorithm to count out 15 krons, you would get
• A 10 kron piece
• Five 1 kron pieces, for a total of 15 krons
• This requires six coins
• A better solution would be to use two 7 kron pieces and one 1 kron
piece
• This only requires three coins
• The greedy algorithm results in a solution, but not in an optimal
solution
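• You can check this with the sketch above: makeChange(new int[]{10, 7, 1}, 15) returns [10, 1, 1, 1, 1, 1], six coins, even though [7, 7, 1] uses only three.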
A Scheduling Problem

• You have to run nine jobs, with running times of 3, 5, 6, 10, 11, 14, 15,
18, and 20 minutes.
• You have three processors on which you can run these jobs.
• You decide to do the longest-running jobs first, on whatever processor is available.

  P1: 20, 10, 3
  P2: 18, 11, 6
  P3: 15, 14, 5

• Time to completion: 18 + 11 + 6 = 35 minutes


• This solution isn’t bad, but we might be able to do better
Another Approach

• What would be the result if you ran the shortest job first?
• Again, the running times are 3, 5, 6, 10, 11, 14, 15, 18, and 20 minutes
  P1: 3, 10, 15
  P2: 5, 11, 18
  P3: 6, 14, 20

• That wasn’t such a good idea; time to completion is now 6 + 14 + 20 = 40 minutes
• Note, however, that the greedy algorithm itself is fast
• All we had to do at each stage was pick the minimum or maximum
An Optimum Solution

• Better solutions do exist:


  P1: 20, 14
  P2: 18, 11, 5
  P3: 15, 10, 6, 3

• This solution is clearly optimal (why?)


• Clearly, there are other optimal solutions (why?)
• How do we find such a solution?
• One way: Try all possible assignments of jobs to processors
• Unfortunately, this approach can take exponential time
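
• A minimal sketch of the longest-job-first greedy heuristic, using a min-heap of processor loads (class and method names are assumptions, not from the slides):

import java.util.*;

public class LongestJobFirst {
    // Greedy: assign each job, longest first, to the currently
    // least-loaded processor; returns the time to completion.
    static int schedule(int[] jobs, int processors) {
        int[] sorted = jobs.clone();
        Arrays.sort(sorted);                          // ascending order
        PriorityQueue<Integer> loads = new PriorityQueue<>();
        for (int i = 0; i < processors; i++) loads.add(0);
        for (int i = sorted.length - 1; i >= 0; i--)  // longest job first
            loads.add(loads.poll() + sorted[i]);      // add to least-loaded
        return Collections.max(loads);                // time to completion
    }

    public static void main(String[] args) {
        int[] jobs = {3, 5, 6, 10, 11, 14, 15, 18, 20};
        System.out.println(schedule(jobs, 3));        // 35; the optimum is 34
    }
}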
Huffman encoding

• The Huffman encoding algorithm is a greedy algorithm


• Given the percentage with which each character appears in a corpus, determine a variable-length bit pattern for each char.
• You always pick the two smallest percentages to combine.
[Figure: Huffman tree for frequencies A = 22%, B = 12%, C = 24%, D = 6%, E = 27%, F = 9%. D and F combine into 15%, which joins B to form 27%; A and C form 46%; the 27% node joins E (27%) to form 54%; 54% and 46% meet at the 100% root.]
Huffman Encoding
[Figure: the same tree with its edges labeled 0 and 1.]
• Resulting codes: A = 00, B = 100, C = 01, D = 1010, E = 11, F = 1011
• Average bits/char: 0.22·2 + 0.12·3 + 0.24·2 + 0.06·4 + 0.27·2 + 0.09·4 = 2.42
• The solution found doing this is an optimal solution.
• The resulting binary tree is a full tree.
Analysis

• A greedy algorithm typically makes (approximately) n choices for a problem of size n
  • (The first or last choice may be forced)
• Hence the expected running time is O(n · choice(n)), where choice(n) is the cost of making a choice among n objects
• Counting: must find the largest usable coin from among k sizes of coin (k a constant), an O(k) = O(1) operation
  • Therefore, coin counting is O(n)
• Huffman: must sort the n values before making n choices
  • Therefore, Huffman is O(n log n) + O(n) = O(n log n)
Other Greedy Algorithms

• Dijkstra’s algorithm for finding the shortest path in a graph


• Always takes the shortest edge connecting a known node to an unknown
node
• Kruskal’s algorithm for finding a minimum-cost spanning tree
• Always tries the lowest-cost remaining edge
• Prim’s algorithm for finding a minimum-cost spanning tree
• Always takes the lowest-cost edge between nodes in the spanning tree and
nodes not yet in the spanning tree
Connecting Wires

• There are n white dots and n black dots, equally spaced, in a line
• You want to connect each white dot with some black dot, with a minimum total length of “wire”
• Example: [Figure: a sample matching of white dots to black dots; total wire length is 1 + 1 + 1 + 5 = 8]


• Do you see a greedy algorithm for doing this?
Huffman Code Example
Symbol     A      B      C      D
Frequency  13%    25%    50%    12%
Original   00     01     10     11     (2 bits each)
Huffman    110    10     0      111    (3, 2, 1, 3 bits)

• Expected size
  • Original: 1/8·2 + 1/4·2 + 1/2·2 + 1/8·2 = 2 bits / symbol
  • Huffman: 1/8·3 + 1/4·2 + 1/2·1 + 1/8·3 = 1.75 bits / symbol
Huffman Code Data Structures
• Binary (Huffman) tree
  • Represents Huffman code
  • Edge → code bit (0 or 1)
  • Leaf → symbol
  • Path to leaf → encoding
  • Example: A = “110”, B = “10”, C = “0”
  [Figure: a small Huffman tree with 0/1 edge labels and leaves C, B, A, D]
• Priority queue
  • To efficiently build the binary tree
Huffman Code Algorithm Overview

• Encoding
• Calculate frequency of symbols in file
• Create binary tree representing “best” encoding
• Use binary tree to encode compressed file
• For each symbol, output path from root to leaf
• Size of encoding = length of path
• Save binary tree
Huffman Code – Creating Tree

• Algorithm
  • Place each symbol in a leaf
    • Weight of leaf = symbol frequency
  • Select two trees L and R (initially leaves)
    • Such that L, R have the lowest frequencies in the tree
  • Create a new (internal) node
    • Left child → L
    • Right child → R
    • New frequency → frequency(L) + frequency(R)
  • Repeat until all nodes are merged into one tree
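
• A minimal sketch of this loop using Java's PriorityQueue (the Node class and names are assumptions, not from the slides):

import java.util.*;

public class HuffmanBuild {
    static class Node {
        final int freq;           // weight = sum of leaf frequencies below
        final Character symbol;   // non-null only for leaves
        final Node left, right;
        Node(int freq, Character symbol, Node left, Node right) {
            this.freq = freq; this.symbol = symbol;
            this.left = left; this.right = right;
        }
    }

    static Node buildTree(Map<Character, Integer> freqs) {
        PriorityQueue<Node> pq =
            new PriorityQueue<>(Comparator.comparingInt((Node n) -> n.freq));
        for (Map.Entry<Character, Integer> e : freqs.entrySet())
            pq.add(new Node(e.getValue(), e.getKey(), null, null));  // leaves
        while (pq.size() > 1) {          // repeat until one tree remains
            Node l = pq.poll();          // the two lowest-frequency trees
            Node r = pq.poll();
            pq.add(new Node(l.freq + r.freq, null, l, r));  // merged node
        }
        return pq.poll();                // root of the finished Huffman tree
    }
}

• With the frequencies used in the construction walkthrough below (A = 3, C = 5, E = 8, H = 2, I = 7), buildTree produces the same 25-weight root.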
Huffman Tree Construction
• Start with one leaf per symbol: A = 3, C = 5, E = 8, H = 2, I = 7
• Step 1: merge the two lowest-frequency trees, H (2) and A (3), into a node of weight 5
• Step 2: merge C (5) and the A/H node (5) into a node of weight 10
• Step 3: merge I (7) and E (8) into a node of weight 15
• Step 4: merge the 10 and 15 nodes into the root, of weight 25
• Reading off the 1/0 edge labels gives: E = 01, I = 00, C = 10, A = 111, H = 110
Huffman Coding Example
• Huffman code: E = 01, I = 00, C = 10, A = 111, H = 110
• Input: ACE
• Output: (111)(10)(01) = 1111001
Huffman Code Algorithm Overview

• Decoding
• Read compressed file & binary tree
• Use binary tree to decode file
• Follow path from root to leaf
Huffman Decoding
• Input bits: 1111001
• Start at the root and follow one edge per bit; when a leaf is reached, output its symbol and return to the root
• Bits 1, 1, 1 lead to leaf A → output “A”
• Bits 1, 0 lead to leaf C → output so far “AC”
• Bits 0, 1 lead to leaf E → final output “ACE”
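
• The same walk can be written as a short loop (a sketch; it assumes the Node class from the construction sketch above, and that 0-edges go left and 1-edges go right):

// Decode a bit string by walking the Huffman tree: one edge per bit,
// emit a symbol and restart at the root whenever a leaf is reached.
static String decode(HuffmanBuild.Node root, String bits) {
    StringBuilder out = new StringBuilder();
    HuffmanBuild.Node cur = root;
    for (char b : bits.toCharArray()) {
        cur = (b == '0') ? cur.left : cur.right;  // follow the edge for this bit
        if (cur.symbol != null) {                 // reached a leaf
            out.append(cur.symbol);
            cur = root;                           // restart for the next symbol
        }
    }
    return out.toString();
}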
Bin packing problem
■ Input:
– n items with sizes a1, …, an (0 < ai ≤ 1).
■ Task:
– Find a packing in unit-sized bins that minimizes the number of bins used.

[Figure: two example packings of the items 0.3, 0.2, 0.2, 0.2, 0.2, 0.4, 0.5 into unit-capacity (1.0) bins]
Overview (3/4)
■ Bin packing problem
– An example
– The First-Fit algorithm.
  • Approximation factor is 2.
– No approximation algorithm can have a guarantee better than 3/2 (unless P = NP).
  • Reduction from set partition, an NP-complete problem.
– Asymptotic PTAS Aε.
  • Lower bound on item sizes: ε; number of distinct item sizes: K.
  • Exact algorithm when ε and K are constants.
  • Approximation algorithm when ε is constant.
The First-Fit algorithm
■ This algorithm puts each item into one of the partially packed bins.
– If the item does not fit into any of these bins, it opens a new bin and puts the item into it.

Example, items in arrival order: 0.5, 0.3, 0.4, 0.8, 0.2, 0.2, 0.2
– 0.5 opens bin 1; 0.3 also fits in bin 1 (total 0.8)
– 0.4 does not fit in bin 1, so it opens bin 2
– 0.8 fits in no open bin, so it opens bin 3
– the first 0.2 tops off bin 1 (total 1.0); the last two 0.2s go into bin 2 (total 0.8)
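
■ A sketch of First-Fit in Java (names are assumptions; the small epsilon guards against floating-point round-off when a bin fills exactly):

import java.util.*;

public class FirstFit {
    // First-Fit: place each item into the first bin with enough room,
    // opening a new unit-capacity bin only when none fits.
    static List<List<Double>> pack(double[] items) {
        List<List<Double>> bins = new ArrayList<>();
        List<Double> space = new ArrayList<>();   // remaining room per bin
        for (double item : items) {
            int i = 0;
            while (i < bins.size() && space.get(i) + 1e-9 < item)
                i++;                              // first bin that fits
            if (i == bins.size()) {               // no open bin fits
                bins.add(new ArrayList<>());
                space.add(1.0);
            }
            bins.get(i).add(item);
            space.set(i, space.get(i) - item);
        }
        return bins;
    }

    public static void main(String[] args) {
        double[] items = {0.5, 0.3, 0.4, 0.8, 0.2, 0.2, 0.2};
        System.out.println(pack(items));
        // [[0.5, 0.3, 0.2], [0.4, 0.2, 0.2], [0.8]]
    }
}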
Divide and Conquer
Definition
– Recursion lends itself to a general problem-solving technique (algorithm
design) called divide & conquer
• Divide the problem into 1 or more similar sub-problems
• Conquer each sub-problem, usually using a recursive call
• Combine the results from each sub-problem to form a
solution to the original problem
– Algorithmic Pattern:

  DC( problem )
      solution = ∅
      if ( problem is small enough )
          solution = problem.solve()           // conquer directly
      else
          children = problem.divide()          // divide
          for each c in children
              solution = solution + c.solve()  // conquer and combine
      return solution
Applicability
– Use the divide-and-conquer algorithmic pattern when ALL of the following are true:
  • The problem lends itself to division into sub-problems of the same type
  • The sub-problems are relatively independent of one another (i.e., no overlap in effort)
  • An acceptable solution to the problem can be constructed from acceptable solutions to sub-problems
MergeSort
Sort a collection of n items into increasing order

mergeSort(A) {
    if (A.size() <= 1)
        return A;                                   // base case: already sorted
    else {
        left = A.subList(0, A.size()/2);            // divide: first half
        right = A.subList(A.size()/2, A.size());    // divide: second half
        sLeft = mergeSort(left);                    // conquer each half
        sRight = mergeSort(right);
        newA = merge(sLeft, sRight);                // combine sorted halves
        return newA;
    }
}
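
A sketch of the merge step the code above relies on, written in Java (list names are assumptions; requires java.util):

static List<Integer> merge(List<Integer> left, List<Integer> right) {
    List<Integer> result = new ArrayList<>();
    int i = 0, j = 0;
    while (i < left.size() && j < right.size()) {
        if (left.get(i) <= right.get(j))          // take the smaller head
            result.add(left.get(i++));
        else
            result.add(right.get(j++));
    }
    while (i < left.size()) result.add(left.get(i++));    // drain remainder
    while (j < right.size()) result.add(right.get(j++));
    return result;
}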
MergeSort

Example trace on 3 8 5 4 1 7 6 2:
– divide into halves: 3 8 5 4 and 1 7 6 2
– recursively sort each half: 3 4 5 8 and 1 2 6 7
– merge: 1 2 3 4 5 6 7 8
QuickSort
Sort a collection of n items into increasing order
– Algorithm steps:
• Break the list into 2 pieces based on a pivot
– The pivot is usually the first item in the list
• All items smaller than the pivot go in the left and all items larger go
in the right
• Sort each piece (recursion again)
• Combine the results together
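
A sketch of these steps in the same style as the mergeSort code (names are assumptions, not from the slides):

static List<Integer> quickSort(List<Integer> a) {
    if (a.size() <= 1)
        return a;                                   // base case
    int pivot = a.get(0);                           // pivot: first item
    List<Integer> left = new ArrayList<>();
    List<Integer> right = new ArrayList<>();
    for (int i = 1; i < a.size(); i++) {            // break list into 2 pieces
        if (a.get(i) < pivot) left.add(a.get(i));
        else right.add(a.get(i));                   // ties go right
    }
    List<Integer> result = new ArrayList<>(quickSort(left));
    result.add(pivot);                              // combine around the pivot
    result.addAll(quickSort(right));
    return result;
}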
Binary Search
Find the location of an element in a sorted collection of n items (assuming it
exists)

binarySearch( A, low, high, target )
    mid = (high + low) / 2            // midpoint of the current range
    if ( A[mid] == target )
        return mid
    else if ( A[mid] < target )
        return binarySearch(A, mid+1, high, target)   // search upper half
    else
        return binarySearch(A, low, mid-1, target)    // search lower half
Closest-Pair Problem
Find the closest pair of points in a collection of n points in a given plane (use
Euclidean distance)
• Assume n = 2^k
• If n = 2, answer is distance between these two points
• Otherwise,
– sort points by x-coordinate
– split sorted points into two piles of equal size (i.e., divide
by vertical line l)
• Three possibilities: closest pair is
– in Left side
– in Right side
– between Left and Right side

• Application: closest pair of airplanes at a given altitude.
Closest-Pair Problem
• Closest pair is between Left and Right?
– Let d = min(LeftMin, RightMin)
– need to see if there are points in Left and Right that are closer than d
– only need to check points within d of l (a strip of width 2d)
– sort the strip’s points by y-coordinate
– there can be at most eight points in any 2d × d slice of the strip
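
• A sketch of this strip check (the Point record and method name are assumptions): since points farther than d apart vertically cannot form a closer pair, the inner loop only examines a constant number of successors.

import java.util.*;

public class StripCheck {
    record Point(double x, double y) {}

    // strip = points within d of the dividing line l, sorted by y-coordinate.
    // Returns min(d, closest distance between a Left point and a Right point).
    static double stripClosest(List<Point> strip, double d) {
        double best = d;
        for (int i = 0; i < strip.size(); i++) {
            // stop once the vertical gap alone already exceeds the best distance
            for (int j = i + 1; j < strip.size()
                     && strip.get(j).y() - strip.get(i).y() < best; j++) {
                double dx = strip.get(i).x() - strip.get(j).x();
                double dy = strip.get(i).y() - strip.get(j).y();
                best = Math.min(best, Math.hypot(dx, dy));
            }
        }
        return best;
    }
}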
Steps to Designing a Dynamic Programming Algorithm

1. Characterize optimal substructure
2. Recursively define the value of an optimal solution
3. Compute the value bottom up
4. (if needed) Construct an optimal solution
Principle of Optimality

• Dynamic programming works on the principle of optimality.
• The principle of optimality states that in an optimal sequence of decisions or choices, each subsequence must also be optimal.
Example 1: Fibonacci numbers

• Recall definition of Fibonacci numbers:

F(n) = F(n-1) + F(n-2)


F(0) = 0
F(1) = 1

• Computing the nth Fibonacci number recursively (top-down) expands a tree of calls:

  F(n)
  = F(n-1) + F(n-2)
  = (F(n-2) + F(n-3)) + (F(n-3) + F(n-4))
  = ...

• Note that the same subproblems, such as F(n-2) and F(n-3), are recomputed over and over.
Fibonacci Numbers

• F(n) = F(n-1) + F(n-2) for n ≥ 2
• F(0) = 0, F(1) = 1
• 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, …

• Straightforward recursive procedure is slow!

• Let’s draw the recursion tree
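
• A bottom-up sketch (names are assumptions) computes each value once, avoiding the repeated subtrees:

static long fib(int n) {
    if (n < 2) return n;              // F(0) = 0, F(1) = 1
    long prev = 0, curr = 1;
    for (int i = 2; i <= n; i++) {    // build upward from the base cases
        long next = prev + curr;      // F(i) = F(i-1) + F(i-2)
        prev = curr;
        curr = next;
    }
    return curr;
}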


Backtracking
 Suppose you have to make a series of decisions,
among various choices, where
 You don’t have enough information to know what to
choose
 Each decision leads to a new set of choices
 Some sequence of choices (possibly more than one)
may be a solution to your problem
 Backtracking is a methodical way of trying out
various sequences of decisions, until you find one
that “works”

Solving a maze
 Given a maze, find a path from start to finish
 At each intersection, you have to decide between
three or fewer choices:
 Go straight
 Go left

 Go right

 You don’t have enough information to choose correctly


 Each choice leads to another set of choices
 One or more sequences of choices may (or may not) lead to a
solution
 Many types of maze problem can be solved with backtracking

Coloring a map
 You wish to color a map with
not more than four colors
 red, yellow, green, blue
 Adjacent countries must be in
different colors
 You don’t have enough information to choose colors
 Each choice leads to another set of choices
 One or more sequences of choices may (or may not) lead to a
solution
 Many coloring problems can be solved with backtracking

Solving a puzzle
 In this puzzle, all holes but one
are filled with white pegs
 You can jump over one peg
with another
 Jumped pegs are removed
 The object is to remove all
but the last peg
 You don’t have enough information to jump correctly
 Each choice leads to another set of choices
 One or more sequences of choices may (or may not) lead to a
solution
 Many kinds of puzzle can be solved with backtracking

Backtracking (animation)

[Figure: animation of a backtracking search; from the start node, “?” marks each choice point, several branches end in dead ends, and the search backs up and tries other branches until one reaches success]
Terminology I

 A tree is composed of nodes
 There are three kinds of nodes:
  The (one) root node
  Internal nodes
  Leaf nodes
 Backtracking can be thought of as searching a tree for a particular “goal” leaf node
Terminology II

 Each non-leaf node in a tree is a parent of one or more other nodes (its children)
 Each node in the tree, other than the root, has exactly one parent

[Figure: a parent node and its children; usually, however, we draw our trees downward, with the root at the top]
Full example: Map coloring
 The Four Color Theorem states that any map on a
plane can be colored with no more than four colors,
so that no two countries with a common border are
the same color
 For most maps, finding a legal coloring is easy
 For some maps, it can be fairly difficult to find a legal
coloring
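
 A sketch of backtracking applied to map coloring (the adjacency-matrix representation and names are assumptions, not from the slides):

// Try to color country 'next' and everything after it; colors[i] in 1..4,
// 0 = uncolored. Returns true if a full legal coloring exists.
static boolean color(boolean[][] adjacent, int[] colors, int next) {
    if (next == colors.length) return true;       // goal leaf: all colored
    for (int c = 1; c <= 4; c++) {                // try each of the 4 colors
        boolean legal = true;
        for (int other = 0; other < next; other++)
            if (adjacent[next][other] && colors[other] == c)
                legal = false;                    // clashes with a neighbor
        if (legal) {
            colors[next] = c;                     // tentative decision
            if (color(adjacent, colors, next + 1)) return true;
            colors[next] = 0;                     // dead end: backtrack
        }
    }
    return false;                                 // forces the caller to backtrack
}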

Backtracking

The principal idea of backtracking is to construct solutions one component at a time, and then to evaluate such partially constructed solutions.