You are on page 1of 39

Design of Algorithms by

Induction
Part 1
Algorithm Design and Analysis
2015 - Week 3
http://bigfoot.cs.upt.ro/~ioana/algo/

Bibliography: [Manber]- Chap 5


Algorithm Analysis <-> Design
• Given an algorithm, we can
– Analyze its complexity
– Proof its correctness

• But:
– Given a problem, how do we design a
solution that is both correct and efficient ?
• Is there a general method for this ?
Induction proofs vs
Design of algorithms by induction (1)
• Induction used to prove that a statement P(n)
holds for all integers n:
– Base case: Prove P(0)
– Assumption: assume that P(n-1) is true
– Induction step: prove that P(n-1) implies P(n) for all
n>0

– Strong induction: when we assume P(k) is true for


all k<=n-1 and use this in proving P(n)
Induction proofs vs
Design of algorithms by induction (2)
• Induction used in algorithm design:
– Base case: Solve a small instance of the problem
– Assumption: assume you can solve smaller
instances of the problem
– Induction step: Show how you can construct the
solution of the problem from the solution(s) of the
smaller problem(s)
Design of algorithms by induction
• Represents a fundamental design principle that
underlies techniques such as divide & conquer,
dynamic programming and even greedy
• How to reduce the problem to a smaller problem
or a set of smaller problems ? (n -> n-1, n/2, n/4,
…?)
• Examples:
– The successful party problem [Manber 5.3]
Part 1
– The celebrity problem [Manber 5.5]
– The skyline problem [Manber 5.6]
– One knapsack problem [Manber 5.10] Part 2
– The Max Consecutive Subsequence [Manber 5.8]
The successful party problem
• Problem: you are arranging a party and have a list of n
persons that you could invite. In order to have a
successful party, you want to invite as many people as
possible, but every invited person must be friends with at
least k of the other party guests. For each person, you
know his/her friends. Find the set of invited people.
Successful party - Example

K=3 Ann
Bob

Finn

Chris
Ellie Dan
Successful party – Direct approach
• Direct approach for solving: remove persons
who have less than k friends
• But: which is the right order of removing ?
1. Remove all persons with less than k friends, then
deal with the persons that are left without enough
friends ?
2. Remove first one person, then continue with
affected persons ?

Instead of thinking about our algorithm


as a sequence of steps to be executed,
think of proving a theorem that the algorithm exists
The design by induction approach
• Instead of thinking about our algorithm as a
sequence of steps to be executed, think of
proving a theorem that the algorithm exists
• We need to prove that this “theorem” holds for a
base case, and that if it holds for “n-1” this
implies that it holds for “n”
Successful party - Solution
• Induction hypothesis: We know how to find the
solution (maximal list of invited persons that
have at least k friends among the other invited
persons), provided that the number of given
persons is <n
• Base case:
– n<= k: no person can be invited
– n=k+1: if every person knows all of the other persons,
everybody gets invited. Otherwise, no one can be
invited.
Successful party - Solution
• Inductive step:
– Assume we know how to select the invited persons
out of a list of n-1, prove that we know how to select
the invited persons out of a list of n, n>k+1
– If all the n persons have more than k friends among
them, all n persons get invited => problem solved
– Else, there exists at least one person that has less
than k friends. Remove this person and the solution is
what results from solving the problem for the
remaining n-1 persons. By induction assumption, we
know to solve the problem for n-1 => problem solved
Successful party - Conclusions
• Designing by induction got us to a correct
algorithm, because we designed it proving its
correctness
– Every invited person knows at least k other invited
– We got the maximum possible number of invited persons
• The best way to reduce a problem is to eliminate
some of its elements.
– In this case, it was clear which persons to eliminate
– We will see examples where the elimination is not so
straightforward
The Celebrity problem
• Problem: A celebrity in a group of people is someone
who is known by everybody but does not know anyone.
You are allowed to ask anyone from the group a
question such as “Do you know that person?” pointing to
any other person from the group. Identify the celebrity (if
one exists) by asking as few questions as possible
• Problem:
• Given a n*n matrix “know” with know[p, q] = true if p
knows q and know[p, q] = false otherwise.
• Determine whether there exists an i such that:
– Know[j; i] = true (for all j, j≠ i) and Know[i; j] = false (for all j, j≠i )
Celebrity - Solution 1
• Brute force approach: ask questions arbitrary,
for each person ask questions for all others =>
n*(n-1) questions asked
Celebrity - Solution 2
• Use induction:
• Base case: Solution is trivial for n=1, one person is a
celebrity by definition.
• Assumption: Assume we leave out one person and that we
can solve the problem for the remaining n-1 persons.
– If there is a celebrity among the n-1 persons, we can find it
– If there is no celebrity among the n-1 persons, we find this out
• Induction step:
• For the n'th person we have 3 possibilities:
– The celebrity was among the n-1 persons
– The n’th person is the celebrity
– There is no celebrity in this group
Celebrity - Sol 2- Induction step
• We must solve the problem for n persons, assuming that
we know to solve it for n-1
• We leave out one person.
• We choose this person randomly – let it be the n’th
person
• Case 1: There is a celebrity among the n-1 persons, say
p. To check if this is also a celebrity for the n'th person
– check if know[n, p] and not know[p, n]
• Case 2: There is no celebrity among the n-1 persons. In
this case, either person n is the celebrity, or there is no
celebrity in the group.
– To check this we have to find out if know[i, n] and not know[n, i],
for all i <> n.
Celebrity – Solution 2
Function Celebrity_Sol2(n:integer) returns integer
if n = 1 then return 1
else
p = Celebrity_Sol2(n-1);
if p != 0 then
// p is the celebrity among n-1
if( knows[n,p] and not knows[p,n] ) then
return p
end if
end if
forall i = 1..n-1
if (knows[n, i] or not knows[i, n]) then
return 0 // there is no celebrity
end if
end for
return n // n is the celebrity
Celebrity – Solution 2 Analysis
T(n)
Function Celebrity_Sol2(n:integer) returns integer
if n = 1 then return 1
else
p = Celebrity_Sol2(n-1); T(n-1)
if p != 0 then
// p is the celebrity among n-1
if( knows[n,p] and not knows[p,n] ) then
return p
end if
end if
forall i = 1..n-1 O(n)
if (knows[n, i] or not knows[i, n]) then
return 0 // there is no celebrity
end if
end for
return n // n is the celebrity
Celebrity – Solution 2 Analysis
• T(n)=T(n-1) + n
• T(1)=1

• T(n)=1+2+3+ +n=n*(n+1)/2

• T(n) is O(n2)

• We have reduced a problem of size n to a problem of size n-1.


We then still have to relate the n-th element to the n-1 other
elements, and here this is done by a sequence which is O(n),
so we get an algorithm of complexity O (n2), which is the
same as the brute force.
• If we want to reduce the complexity of the algorithm to O(n),
we should have T(n)=T(n-1)+c
Celebrity – Solution 3
• The key idea here is to reduce the size of the
problem from n persons to n-1, but in a clever
way – by eliminating someone who is a non-
celebrity.
• After each question, we can eliminate a person
– if knows[i,j] then i cannot be a celebrity => elim i
– if not knows[i,j] then j cannot be a celebrity => elim j
Celebrity - Solution 3
• Use induction:
• Base case: Solution is trivial for n=1, one person is a
celebrity by definition.
• Assumption: Assume we leave out one person who is not a
celebrity and that we can solve the problem for the
remaining n-1 persons.
– If there is a celebrity among the n-1 persons, we can find it
– If there is no celebrity among the n-1 persons, we find this out
• Induction step:
• For the n'th person we have only 2 possibilities left:
– The celebrity was among the n-1 persons
– There is no celebrity in this group
Celebrity - Sol 3- Induction step
• We must solve the problem for n persons, assuming that
we know to solve it for n-1
• We leave out one person. In order to decide which
person to elim, we ask a question (to a random person i
about a random person j).
• The eliminated person is e=i or e=j, and we know that e
is not a celebrity
• Case 1: There is a celebrity among the n-1 persons that
remain after eliminating e, say p. To check if this is also
a celebrity for the person e
– check if know[e, p] and not know[p, e]
• Case 2: There is no celebrity among the n-1 persons. In
this case, there is no celebrity in the group. It is no need
any more to check if e is a celebrity !
Celebrity – Solution 3
Function Celebrity_Sol3(S:Set of persons) return person
if card(S) = 1 return S(1)
pick i, j any persons in S
if knows[i, j] then // i no celebrity
elim=i
else // if not knows[i, j] then j no celebrity
elim=j

p = Celebrity_Sol3(S-elim)
if p != 0 and knows[elim,p] and not knows[p,elim]
return p
else
return 0 // no celebrity
end if
end if
end function Celebrity_Sol3
Celebrity – Solution 3 Analysis
Function Celebrity_Sol3(S:Set of persons) return person
if card(S) = 1 return S(1)
pick i, j any persons in S
if knows[i, j] then // i no celebrity
elim=i
else // if not knows[i, j] then j no celebrity
elim=j

p = Celebrity_Sol3(S-elim)
if p != 0 and knows[elim,p] and not knows[p,elim]
return p
else
return 0 // no celebrity
end if
end if
end function Celebrity_Sol3 T(n)=T(n-1)+c
Algorithm is O(n)
Celebrity - Conclusions
• The size of the problem should be reduced (from
n to n-1) in a clever way
• This example shows that it sometimes pays off
to expend some effort ( in this case – one
question) to perform the reduction more
effectively
The Skyline Problem
• Problem: Given the exact locations and heights
of several rectangular buildings, having the
bottoms on a fixed line, draw the skyline of these
buildings, eliminating hidden lines.
The Skyline Problem
• A building Bi is represented as a triplet (Li, Hi, Ri)
• A skyline of a set of n buildings is a list of x coordinates and the
heights connecting them
• Input (1,11,5), (2,6,7), (3,13,9), (12,7,16), (14,3,25), (19,18,22),
(23,13,29), (24,4,28)
• Output (1, 11), (3, 13), (9, 0), (12, 7), (16, 3), (19, 18), (22, 3), (23,
13), (29, 0)
Skyline – Solution 1
• Base case: If number of buildings n=1, the
skyline is the building itself
• Inductive step: We assume that we know to
solve the skyline for n-1 buildings, and then we
add the n’th building to the skyline
Adding One Building to the Skyline
• We scan the skyline, looking at one horizontal
line after the other, and adjusting when the
height of the building is higher than the skyline
height

Worst case:
Bn O(n)
Skyline – Solution 1 Analysis
T(n)
Algorithm Skyline_Sol1(n:integer) returns Skyline
if n = 1 then return Building[1]
else
sky1 = Skyline_Sol1(n-1); T(n-1)
sky2 = MergeBuilding(Building[n], sky1);
return sky2;
O(n)
end algorithm

T(n)=T(n-1)+O(n)

O(n2)
Skyline – Solution 2
• Base case: If number of buildings n=1, the
skyline is the building itself
• Inductive step: We assume that we know to
solve the skylines for n/2 buildings, and then we
merge the two skylines
Merging two skylines
• We scan the two skylines together, from left to
right, match x coordinates and adjust height
where needed

n1 n2
Worst case:
O(n1+n2)
Skyline – Solution 2 Analysis
T(n)
Algorithm Skyline_Sol2(left, right:integer) returns
Skyline
if left=right then return Building[left]
else
middle=left+(left+right)/2 T(n/2)
sky1 = Skyline_Sol2(left, middle);
sky2 = Skyline_Sol2(middle+1, right);T(n/2)
sky3 = MergeSkylines(sky1, sky2);
return sky3;
O(n)
end algorithm

T(n)=2T(n/2)+O(n)

O(n log n)
Skyline - Conclusion
• When the effort of combining the subproblems
cannot be reduced, it is more efficient to split
into several subproblems of the same type which
are of equal sizes
– T(n)=T(n-1)+O(n) … O(n2)
– T(n)=2T(n/2)+O(n) … O(n log n)

• This technique is Divide and conquer


Efficiently dividing and combining
• Solving a problem takes following 3 actions:
– Divide the problem into a number of subproblems
that are smaller instances of the same problem.
– Solve the subproblems
– Combine the solutions to the subproblems into the
solution for the original problem.
• Execution time is defined by recurrences like
– T(size)=Sum (T(subproblem size)) + CombineTime

– Choosing the right subproblems sizes and the right


way of combining them decides the performance !
Recurrences
• T(n)=O(1), if n=1

• For n>1, we may have different recurrence relations, according to


the way of splitting into subproblems and of combining them. Most
common are:

– T(n)=T(n-1)+O(n) … O(n2)

– T(n)=T(n-1)+O(1) … O(n)

– T(n)=2T(n/2)+O(n) … O(n log n)

– T(n)=2T(n/2)+O(1) … O(n)

– T(n)=a*T(n/b) + f (n) ;
Conclusions (1)
• What is Design by induction ?
• An algorithm design method that uses the idea
behind induction to solve problems
– Instead of thinking about our algorithm as a sequence of steps
to be executed, think of proving a theorem that the algorithm
exists
• We need to prove that this “theorem” holds for a base case, and that
if it holds for “n-1” this implies that it holds for “n”
Conclusions (2)
• Why/when should we use Design by induction ?
– It is a well defined method for approaching a large variety of
problems (“where do I start ?”)
• Just take the statement of the problem and consider it is a theorem
to be proven by induction
– This method is always safe: designing by induction gets us to a
correct algorithm, because we designed it proving its
correctness
– Encourages abstract thinking vs coding => you can handle the
reasoning in case of complex algorithms
• A[1..n]; A[n/4+1 .. 3n/4]
• L, R; L=(R+3L+1)/4, R=(L+3R+1)/4
– We can make it also efficient (see next slide)
Conclusions (3)
• The inductive step is always based on a reduction
from problem size n to problems of size <n. How to
efficiently make the reduction to smaller problems:
– Sometimes one has to spend some effort to find the suitable
element to remove. (see Celebrity Problem)
– If the amount of work needed to combine the subproblems is not
trivial, reduce by dividing in subproblems of equal size – divide
and conquer (see Skyline Problem)

You might also like