
CS251: Algorithms

Linear time selection


Computer Science Dept.
Instructor: Ameera Jaradat
Outline
● The Selection Problem
● Randomized Selection
● Worst-Case Linear-Time Selection
Review: Order Statistics
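● The ith order statistic in a set of n elements is the ith smallest element
● The minimum is thus the 1st order statistic
● The maximum is the nth order statistic
● The median is the n/2 order statistic
■ If n is odd, the median is the (n+1)/2 order statistic
■ If n is even, there are two medians: the lower median ⎣(n+1)/2⎦ and the upper median ⎡(n+1)/2⎤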
The Selection Problem
● The selection problem: find the ith smallest element of a set
Example: If A = (7, 4, 8, 2, 4), then |A| = 5 and the 3rd smallest element (and median) is 4.
● How fast can we solve the problem?
■ O(n) for min or max.
■ O(n log n) by sorting.
● Two algorithms:
■ Practical randomized algorithm: O(n) expected running time
■ Theoretical algorithm: O(n) worst-case running time
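
The two baseline bounds above can be sketched in Python as follows. This is only an illustration: the names `select_by_sorting` and `minimum` are hypothetical and do not come from the slides.

```python
def select_by_sorting(values, i):
    """Return the i-th smallest element (1-based) by sorting: O(n log n)."""
    if not 1 <= i <= len(values):
        raise ValueError("i must be between 1 and len(values)")
    return sorted(values)[i - 1]


def minimum(values):
    """Return the 1st order statistic with a single O(n) scan."""
    smallest = values[0]
    for v in values[1:]:
        if v < smallest:
            smallest = v
    return smallest


A = (7, 4, 8, 2, 4)
print(select_by_sorting(A, 3))  # 3rd smallest element (the median of A)
print(minimum(A))               # 1st order statistic
```

Sorting answers every rank at once, which is more work than a single selection query needs; the two algorithms that follow avoid the full sort.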
Randomized Selection
● Key idea: use partition() from quicksort
● But we only need to examine one subarray
● This saving shows up in the running time: O(n) expected

1. PARTITION(list[]) // returns k, the rank of the pivot x


2. If i < k, recurse on the left
3. If i > k, recurse on the right
4. Otherwise, output x

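[Figure: A[p..r] after partitioning around the pivot A[q]. Elements ≤ A[q] occupy A[p..q-1], the pivot sits at index q, and elements ≥ A[q] occupy A[q+1..r].]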
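
As a minimal sketch of the partition step, assuming the Lomuto scheme with a uniformly random pivot (the slides do not say which partition variant is used, and the name `randomized_partition` is hypothetical):

```python
import random


def randomized_partition(a, p, r):
    """Partition a[p..r] (inclusive) in place around a random pivot.

    Returns the final index q of the pivot, with
    a[p..q-1] <= a[q] <= a[q+1..r].
    """
    pivot_index = random.randint(p, r)   # randint is inclusive on both ends
    a[pivot_index], a[r] = a[r], a[pivot_index]
    pivot = a[r]
    store = p                            # next slot for an element <= pivot
    for j in range(p, r):
        if a[j] <= pivot:
            a[store], a[j] = a[j], a[store]
            store += 1
    a[store], a[r] = a[r], a[store]      # move the pivot into its final place
    return store


data = [7, 4, 8, 2, 4]
q = randomized_partition(data, 0, len(data) - 1)
print(q, data)  # the output depends on the random pivot
```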
Randomized Selection
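Select(A, p, r, i)              // return the ith smallest element of A[p..r]
    if (p == r) then return A[p];
    q = Partition(A, p, r);     // randomized partition; the pivot ends at index q
    k = q - p + 1;              // rank of the pivot within A[p..r]
    if (i == k) then return A[q];
    if (i < k) then
        return Select(A, p, q-1, i);
    else
        return Select(A, q+1, r, i-k);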
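
The pseudocode can be turned into runnable Python roughly as follows. This sketch makes a few assumptions that the slides do not state: it uses 0-based array indices, it replaces the two tail-recursive calls with a loop, and it inlines a Lomuto-style random partition. The name `randomized_select` is hypothetical.

```python
import random


def randomized_select(values, i):
    """Return the i-th smallest element of values (1-based rank)."""
    a = list(values)                     # work on a copy
    if not 1 <= i <= len(a):
        raise ValueError("i must be between 1 and len(values)")
    p, r = 0, len(a) - 1
    while p < r:
        # Partition a[p..r] around a randomly chosen pivot.
        s = random.randint(p, r)
        a[s], a[r] = a[r], a[s]
        q = p
        for j in range(p, r):
            if a[j] <= a[r]:
                a[q], a[j] = a[j], a[q]
                q += 1
        a[q], a[r] = a[r], a[q]
        k = q - p + 1                    # rank of the pivot within a[p..r]
        if i == k:
            return a[q]
        if i < k:
            r = q - 1                    # continue in the left part
        else:
            p, i = q + 1, i - k          # (i-k)-th smallest of the right part
    return a[p]


print(randomized_select([7, 4, 8, 2, 4], 3))  # the median of the earlier example
```

Only one side of each partition is examined again, which is where the saving over quicksort comes from.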

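Example
● Select the 7th smallest element.

● Partition: the pivot ends up with rank k = 4.

● Since 7 > 4, select the 7 - 4 = 3rd smallest element of the right subarray recursively (i = 3).

[Figure: the example array before and after partitioning.]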
Analysis
● T(n) = T(max(k, n-k-1)) + Θ(n), where k is the number of elements to the left of the pivot
● The worst-case running time:
■ T(n) = T(n-1) + Θ(n) = Θ(n²)
● If the partition is perfect (q = n/2) we have:
■ T(n) = T(n/2) + Θ(n) = Θ(n)

● Summary
■ Works fast: linear expected time.
■ Excellent algorithm in practice.
■ But the worst case is very bad: Θ(n²).
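
Both closed forms follow from unrolling the recurrences. The derivation below is a sketch under simplifying assumptions that are not in the slides: the Θ(n) term is taken to be exactly cn, T(0) = 0 in the worst case, and n is a power of 2 with T(1) = c in the perfect-split case.

```latex
\begin{align*}
\text{Worst case:}\quad
T(n) &= T(n-1) + cn
      = c\,\bigl(n + (n-1) + \dots + 1\bigr)
      = c\,\frac{n(n+1)}{2}
      = \Theta(n^2) \\[4pt]
\text{Perfect split:}\quad
T(n) &= T(n/2) + cn
      = cn + \frac{cn}{2} + \frac{cn}{4} + \dots + c
      \le 2cn
\end{align*}
```

In the perfect-split case T(n) is also at least cn, so T(n) = Θ(n).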
Worst-Case Linear-Time Selection
Procedure select(A, low..high, k), with n = high - low + 1
1. If n is small, for example n < 6, then sort and return the kth smallest element. (bounded by 7 comparisons)
2. If n > 5, partition the numbers into groups of 5. (time n/5)
3. Sort the elements in each group. Select the middle elements (medians). (time 7n/5)
4. Select the median of the medians → mm, by calling select recursively on the n/5 medians.
5. Partition the elements into 3 lists (L, M, R) according to mm.
1. L ← elements with A[i] < mm
2. R ← elements with A[i] > mm
3. M ← elements with A[i] = mm
Note that mm occupies ranks |L|+1 through |L|+|M| (|L| is the size of L). If all elements are distinct, |M| = 1 and the rank of mm is r = |L|+1.
6. Case:
● k ≤ |L|: return the kth smallest of set L: select(L, 1..|L|, k)
● |L| < k ≤ |L|+|M|: return mm
● k > |L|+|M|: return the (k - |L| - |M|)th smallest of set R: select(R, 1..|R|, k - |L| - |M|)
Example

The median is the kth smallest element, where k = ⎡25/2⎤ = 13.

Divide into 5 groups        Sort each group
(8, 33, 17, 51, 57)         (8, 17, 33, 51, 57)
(49, 35, 11, 25, 37)        (11, 25, 35, 37, 49)
(14, 3, 2, 13, 52)          (2, 3, 13, 14, 52)
(12, 6, 29, 32, 54)         (6, 12, 29, 32, 54)
(5, 16, 22, 23, 7)          (5, 7, 16, 22, 23)

Extract the median of each group:
M = {33, 35, 13, 29, 16}. The median of medians is mm = 29.
Partition A into three sequences:
L = {8, 17, 11, 25, 14, 3, 2, 13, 12, 6, 5, 16, 22, 23, 7}
M = {29}
R = {33, 51, 57, 49, 35, 37, 52, 32, 54}
Example (cont.)

L = {8, 17, 11, 25, 14, 3, 2, 13, 12, 6, 5, 16, 22, 23, 7}
● Since k = 13 ≤ |L| = 15, we repeat the same procedure on L: select(L, 1..|L|, k)
● So we set A = L.
● We divide the elements into 3 groups of 5 elements each:
(8, 17, 11, 25, 14), (3, 2, 13, 12, 6), (5, 16, 22, 23, 7).
● Sort each group and find the new set of medians: M = {14, 6, 16}.
● The new median of medians mm is 14.
● Next, partition A into three sequences:
L = {8, 11, 3, 2, 13, 12, 6, 5, 7},
M = {14} and
R = {17, 25, 16, 22, 23}.
● Since 13 > 10 = |L| + |M|, we set A = R and find the 3rd smallest element in A (3 = 13 - 10).
● A now has only 5 elements, so it is sorted directly: (16, 17, 22, 23, 25). The algorithm returns the 3rd smallest element, 22. Thus, the median of the numbers in the given sequence is 22.
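
The two rounds of the worked example can be replayed with a short script. This is only a check of the numbers above; the helpers `medians_of_groups` and `split` are hypothetical names, and the median of medians is found here by sorting the few medians rather than by a recursive call.

```python
data = [8, 33, 17, 51, 57, 49, 35, 11, 25, 37, 14, 3, 2, 13, 52,
        12, 6, 29, 32, 54, 5, 16, 22, 23, 7]


def medians_of_groups(a, size=5):
    """Return the median of each consecutive group of `size` elements."""
    groups = [sorted(a[s:s + size]) for s in range(0, len(a), size)]
    return [g[(len(g) - 1) // 2] for g in groups]


def split(a, mm):
    """Return (L, M, R): elements below, equal to, and above mm."""
    return ([x for x in a if x < mm],
            [x for x in a if x == mm],
            [x for x in a if x > mm])


k = 13                                    # rank of the median of 25 numbers

# Round 1: medians of the five groups, then partition around mm = 29.
meds1 = medians_of_groups(data)
mm1 = sorted(meds1)[len(meds1) // 2]
L1, M1, R1 = split(data, mm1)

# Round 2: k <= |L1|, so repeat on L1 and partition around mm = 14.
meds2 = medians_of_groups(L1)
mm2 = sorted(meds2)[len(meds2) // 2]
L2, M2, R2 = split(L1, mm2)

# k > |L2| + |M2|, so continue in R2 with the reduced rank.
k2 = k - len(L2) - len(M2)
answer = sorted(R2)[k2 - 1]               # base case: 5 elements, sort directly

print(meds1, mm1, meds2, mm2, k2, answer)
```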
Worst-Case Linear-Time Selection
● Step 4 calls Select() recursively on the n/5 group medians
● After partitioning around mm, step 6 will call Select() on at most 3n/4 elements
● The recurrence is therefore:

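\[
\begin{aligned}
T(n) &\le T(n/5) + T(3n/4) + \Theta(n) \\
     &\le cn/5 + 3cn/4 + \Theta(n) && \text{Substitute } T(n) = cn \\
     &= 19cn/20 + \Theta(n) && \text{Combine fractions} \\
     &= cn - \bigl(cn/20 - \Theta(n)\bigr) \\
     &\le cn && \text{if } c \text{ is big enough}
\end{aligned}
\]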
Theorem: For constants c > 0 and a1, ..., ak > 0 such that a1 + ... + ak < 1,
the recurrence T(n) ≤ T(a1 n) + T(a2 n) + ... + T(ak n) + cn
solves to T(n) = O(n). For selection, a1 + a2 = 1/5 + 3/4 = 19/20 < 1.
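
The theorem can be checked numerically for the selection recurrence. The sketch below assumes c = 1, floors on the subproblem sizes, and T(0) = 0; none of these details come from the slides. Under those assumptions, induction gives T(n) ≤ n / (1 - 19/20) = 20n.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def T(n):
    """T(n) = T(n // 5) + T(3n // 4) + n, with T(0) = 0."""
    if n == 0:
        return 0
    return T(n // 5) + T(3 * n // 4) + n


for n in (10, 1_000, 100_000, 10_000_000):
    # The ratio T(n) / n stays below 20 instead of growing with n.
    print(n, T(n), T(n) / n)
```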
