You are on page 1of 15

Week 8: Prob.

Analysis and Linear Sorting CS478

Probabilistic Analysis and Sorting in


Linear Time

Dr. Zahid Halim

Ghulam Ishaq Khan Institute of Engineering Sciences and Technology, Topi


Week 8: Prob. Analysis and Linear Sorting CS478

The Hiring Problem—An Introduction


• Problem:
– Use a job agency to hire a new office assistant
– Every day we interview a new candidate provided by the agency
– If the candidate is better qualified than our current office assistant,
we fire the current assistant and hire the candidate

Hire-Assistant(A)
1 current ← an infinitely useless dummy
assistant
2 for i = 1..n
3 do if A[i] is better qualified than current
4 then current ← A[i]
5 return current
Ghulam Ishaq Khan Institute of Engineering Sciences and Technology, Topi
Week 8: Prob. Analysis and Linear Sorting CS478

A Cost Model
• Every candidate we interview costs us ci dollars
– c1 dollars fee to be paid to the job agency
– c2 dollars travel expenses of the candidate, …

• If we hire the candidate, we have to pay ch extra dollars


– c3 dollars additional charge to be paid to the job agency
– c4 dollars internal administration

• Question: How much should we expect to pay when selecting


an office assistant out of n candidates?

Ghulam Ishaq Khan Institute of Engineering Sciences and Technology, Topi


Week 8: Prob. Analysis and Linear Sorting CS478

• General answer:
– We always pay ci ⋅ n dollars for the interviews
– We pay ch ⋅ m dollars for hiring people, where m is the number of
candidates we hire (and fire) in the process
• Best case:
– The first candidate is the best
– Then our cost is ci ⋅ n + ch
• Worst case:
– The candidates show up in increasing order of qualification
– Then we hire every candidate
– The cost is ci ⋅ n + ch ⋅ n
• Abstract problem: Compute the expected number of
candidates we would hire.

Ghulam Ishaq Khan Institute of Engineering Sciences and Technology, Topi


Week 8: Prob. Analysis and Linear Sorting CS478

A Reasonable Probability Distribution


• To compute expectations, we need a probability distribution
over all possible inputs.
• For the hiring problem:
– Assume that no two candidates are equally qualified
– Then we can rank the candidates 1, 2, …, n
– We assume that every possible order (permutation) of the candidates
occurs with equal probability
• This is called a uniformly random permutation.

Ghulam Ishaq Khan Institute of Engineering Sciences and Technology, Topi


Week 8: Prob. Analysis and Linear Sorting CS478

The Expected Number of Hired Candidates

The number m of candidates we hire is a random variable that


depends on the given input permutation.
Our goal is to compute E(m), the expected value of m.

Ghulam Ishaq Khan Institute of Engineering Sciences and Technology, Topi


Week 8: Prob. Analysis and Linear Sorting CS478

Indicator Variables
For an event E, we can define an indicator variable XE = I(E):

Indicator variables help to convert between probabilities an


expectations.

Lemma: For an event E and its indicator variable XE = I(E),


Pr(E) = E(XE).
Ghulam Ishaq Khan Institute of Engineering Sciences and Technology, Topi
Week 8: Prob. Analysis and Linear Sorting CS478

Average-Case Analysis of Hiring Algorithm


Events: E1, E2, …, En

Ei = we hire the i-th candidate

Indicator variables: X1, X2, …, Xn


Xi = I(Ei)

Then

Ghulam Ishaq Khan Institute of Engineering Sciences and Technology, Topi


Week 8: Prob. Analysis and Linear Sorting CS478

Lemma: Assuming that the order in which the candidates


are presented is a uniformly random permutation, the
expected cost for hiring a new assistant out of n candidates is
Θ(ci ⋅ n + ch ⋅ lg n).

Ghulam Ishaq Khan Institute of Engineering Sciences and Technology, Topi


Week 8: Prob. Analysis and Linear Sorting CS478

The Flaw of Average-Case Analysis


Adversarial Behaviour

• The hiring algorithm is deterministic


• Every deterministic algorithm has an input that elicits its worst-
case behaviour
• An adversary (the job agency) may always provide us with this
worst-case input

• Result: The input distribution we assumed is no longer valid.

• The big flaw of average-case analysis:


• The input distribution we assume may or may not be correct.

Ghulam Ishaq Khan Institute of Engineering Sciences and Technology, Topi


Week 8: Prob. Analysis and Linear Sorting CS478

Randomization—Beating the Adversary


• Modification of our hiring algorithm:
– Get the complete list of candidates from the agency
– Permute them in a uniformly random order
– Interview them in this order using our old strategy
• Expected cost: Θ(ci ⋅ n + ch ⋅ lg n)
• Big difference: We impose the desired input distribution.
– No more questionable assumptions
– No input is guaranteed to elicit a worst-case behaviour
– The adversary is powerless

Ghulam Ishaq Khan Institute of Engineering Sciences and Technology, Topi


Week 8: Prob. Analysis and Linear Sorting CS478

Counting Sort

Ghulam Ishaq Khan Institute of Engineering Sciences and Technology, Topi


Week 8: Prob. Analysis and Linear Sorting CS478

Counting Sort…

Ghulam Ishaq Khan Institute of Engineering Sciences and Technology, Topi


Week 8: Prob. Analysis and Linear Sorting CS478

Radix Sort (Example)

Ghulam Ishaq Khan Institute of Engineering Sciences and Technology, Topi


Week 8: Prob. Analysis and Linear Sorting CS478

Radix Sort

Ghulam Ishaq Khan Institute of Engineering Sciences and Technology, Topi

You might also like