
Medians and Ordered Statistics

January 31, 2011

This is the term project report for Algorithms Analysis by a student of BSCS 5th Semester at the Institute of Computing, Bahauddin Zakariya University, Multan.

Submitted by: Suleman Altaf Bhutta (www.facebook.com/suleman.bhutta)

Table of Contents

Chapter 1. Medians and Ordered Statistics Overview
1.1. What is Median?
1.2. Ordered statistics
1.3. Minimum of ordered statistic
1.4. Maximum of Ordered Statistic
1.5. Lower Median in Ordered statistic
1.6. Upper Median in Ordered statistic

Chapter 2: Selection Problems
2.1. The Selection Problem
2.2. Nonlinear general selection algorithm
2.3. Advantages of Nonlinear general selection algorithms
2.4. Partition-based general selection algorithm / QuickSelect
2.5. QuickSelect Algorithm
2.6. The Partition Algorithm
2.7. The QuickSelect Algorithm
2.8. Minimum of a set
2.9. Maximum of a set
2.10. Simultaneous minimum and maximum

Chapter 3: Selection in Expected Linear Times
3.1. Overview
3.2. Algorithm for Randomized Algorithm
3.3. Expected running time
3.4. Calculation of Upper Bounds on Expected Running time

Chapter 4: Selection in worst-case linear time
4.1. Overview and steps
4.2. Running time of Selection in worst case linear time

References

Chapter 1. Medians and Ordered Statistics Overview

1.1. What is Median?
In probability theory and statistics, a median is described as the numeric value separating the higher half of a sample, a population, or a probability distribution, from the lower half. The median of a finite list of numbers can be found by arranging all the observations from lowest value to highest value and picking the middle one. If there is an even number of observations, then there is no single middle value; the median is then usually defined to be the mean of the two middle values.

1.2. Ordered statistics
In statistics, the kth order statistic of a statistical sample is equal to its kth-smallest value. Together with rank statistics, order statistics are among the most fundamental tools in non-parametric statistics and inference.

1.3. Minimum of ordered statistic
The minimum of a set of elements is the first order statistic (i = 1).
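The definitions in sections 1.1 and 1.2 can be illustrated with a small Python sketch. It simply sorts a copy of the data, which is not an efficient selection method; the function names `kth_order_statistic` and `median` are chosen here for illustration and do not come from the report.

```python
def kth_order_statistic(values, k):
    """Return the k-th smallest value (k is 1-indexed) by sorting a copy."""
    if not 1 <= k <= len(values):
        raise ValueError("k must satisfy 1 <= k <= len(values)")
    return sorted(values)[k - 1]


def median(values):
    """Median of a non-empty finite list of numbers.

    For an even number of observations, return the mean of the two
    middle values, as described in section 1.1.
    """
    if not values:
        raise ValueError("median of an empty list is undefined")
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


sample = [7, 1, 5, 3]
print(median(sample))                            # mean of the middle values 3 and 5
print(kth_order_statistic(sample, 1))            # first order statistic (minimum)
print(kth_order_statistic(sample, len(sample)))  # nth order statistic (maximum)
```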

1.4. Maximum of Ordered Statistic
The maximum is the nth order statistic (i = n).

1.5. Lower Median in Ordered statistic
When the number of elements n in a set is odd, the median of the set is unique and always occurs at position i = (n + 1)/2. When n is even, there are two medians, and the lower median occurs at position i = n/2.

1.6. Upper Median in Ordered statistic
When the number of elements n in a set is even, the upper median occurs at position i = n/2 + 1. When n is odd, the median is unique, so the lower and upper medians coincide.

..... End of Chapter 1 .....

Chapter 2: Selection Problems

2.1. The Selection Problem
In computer science, a selection algorithm is an algorithm for finding the kth smallest number in a list (such a number is called the kth order statistic). This includes the cases of finding the minimum, maximum, and median elements. There are selection algorithms that run in O(n) worst-case (linear) time. Selection is a subproblem of more complex problems like the nearest neighbor problem and shortest path problems.

Input: A set A of n (distinct) numbers and a number i, with 1 ≤ i ≤ n.
Output: The element x ∈ A that is larger than exactly i − 1 other elements of A.
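The input/output specification above, together with the lower and upper median positions of Chapter 1, can be checked directly from the definition. The following Python sketch is a deliberately naive O(n^2) illustration under the assumption of distinct elements; the names `select_by_definition`, `lower_median` and `upper_median` are hypothetical helpers, not part of the report.

```python
def select_by_definition(a, i):
    """Return the element of a that is larger than exactly i - 1 others.

    Assumes the elements of a are distinct and 1 <= i <= len(a).
    """
    for x in a:
        smaller = sum(1 for y in a if y < x)  # how many elements x exceeds
        if smaller == i - 1:
            return x
    raise ValueError("i must satisfy 1 <= i <= len(a)")


def lower_median(a):
    """Element at position floor((n + 1) / 2) in sorted order."""
    return select_by_definition(a, (len(a) + 1) // 2)


def upper_median(a):
    """Element at position ceil((n + 1) / 2) in sorted order."""
    return select_by_definition(a, (len(a) + 2) // 2)


even_set = [8, 3, 5, 1]   # n = 4: two medians, at positions 2 and 3
odd_set = [9, 2, 7]       # n = 3: a unique median at position 2
print(lower_median(even_set), upper_median(even_set))
print(lower_median(odd_set), upper_median(odd_set))
```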

2.2. Nonlinear general selection algorithm
In this algorithm, we repeatedly find the smallest remaining value and move it to the front until we reach our desired index. This can be viewed as an incomplete (partial) selection sort. The algorithm requires O(kn) time.

function select_nonLinear(list[1..n], k)
    for i from 1 to k
        minIndex = i
        minValue = list[i]
        for j from i+1 to n
            if list[j] < minValue
                minIndex = j
                minValue = list[j]
        swap list[i] and list[minIndex]
    return list[k]
NonLinear General Selection Algorithm

2.3. Advantages of Nonlinear general selection algorithms
After locating the jth smallest element, it requires only O(j + (k − j)^2) time to find the kth smallest element, or only O(k) for k ≤ j. It can be done with linked list data structures, whereas the one based on partition requires random access.

2.4. Partition-based general selection algorithm / QuickSelect
The partition-based algorithm was invented by C.A.R. Hoare. The partition-based selection algorithm closely resembles quicksort; in this version we call it Quickselect().

2.5. QuickSelect Algorithm
The quickselect algorithm has two parts, like QuickSort:
The Partition Algorithm
The QuickSelect Algorithm
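The nonlinear selection pseudocode of section 2.2 translates almost line for line into Python. This is a minimal sketch, assuming 0-based Python lists and a 1-indexed rank k; it works on a copy so that the caller's list is left unchanged, which is a design choice made here and not something the report specifies.

```python
def select_nonlinear(items, k):
    """Return the k-th smallest element (k is 1-indexed).

    Runs k rounds of selection sort, so it takes O(k * n) time.
    """
    if not 1 <= k <= len(items):
        raise ValueError("k must satisfy 1 <= k <= len(items)")
    work = list(items)          # copy, so the input is not reordered
    n = len(work)
    for i in range(k):
        min_index = i
        for j in range(i + 1, n):
            if work[j] < work[min_index]:
                min_index = j
        # Move the smallest remaining value to position i.
        work[i], work[min_index] = work[min_index], work[i]
    return work[k - 1]


data = [9, 4, 7, 1, 8]
print(select_nonlinear(data, 3))   # third smallest of 1, 4, 7, 8, 9
```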

2.6. The Partition Algorithm
function partition(list, left, right, pivotIndex)
    pivotValue := list[pivotIndex]
    swap list[pivotIndex] and list[right]
    storeIndex := left
    for i from left to right-1
        if list[i] < pivotValue
            swap list[storeIndex] and list[i]
            storeIndex := storeIndex + 1
    swap list[right] and list[storeIndex]
    return storeIndex
QuickSort Partition Algorithm

2.7. The QuickSelect Algorithm
function Quickselect(list, left, right, k)
    select pivotIndex between left and right
    pivotNewIndex := partition(list, left, right, pivotIndex)
    if k = pivotNewIndex
        return list[k]
    else if k < pivotNewIndex
        return Quickselect(list, left, pivotNewIndex-1, k)
    else
        return Quickselect(list, pivotNewIndex+1, right, k)
QuickSelect Algorithm

2.8. Minimum of a set
The minimum of a set of elements can be found by the following algorithm.

MINIMUM(A)
    min = A[1]
    for i = 2 to length[A]
        do if min > A[i]
            then min ← A[i]
    return min
Minimum of a Set
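The partition and Quickselect routines above can be combined into a runnable Python sketch. Several details are assumptions made for illustration: indices are 0-based, k is the 0-based position in sorted order, the pivot is taken as the middle index (the pseudocode only says to select it between left and right), and the tail recursion is written as a loop.

```python
def partition(items, left, right, pivot_index):
    """Partition items[left..right] around items[pivot_index].

    Returns the final index of the pivot; smaller values end up to its left.
    """
    pivot_value = items[pivot_index]
    items[pivot_index], items[right] = items[right], items[pivot_index]
    store_index = left
    for i in range(left, right):
        if items[i] < pivot_value:
            items[store_index], items[i] = items[i], items[store_index]
            store_index += 1
    items[right], items[store_index] = items[store_index], items[right]
    return store_index


def quickselect(items, k):
    """Return the element that would sit at 0-based index k if items were sorted."""
    if not 0 <= k < len(items):
        raise ValueError("k must satisfy 0 <= k < len(items)")
    work = list(items)                      # copy, so the input is not reordered
    left, right = 0, len(work) - 1
    while True:
        if left == right:
            return work[left]
        pivot_index = (left + right) // 2   # assumed pivot rule: middle element
        pivot_new_index = partition(work, left, right, pivot_index)
        if k == pivot_new_index:
            return work[k]
        if k < pivot_new_index:
            right = pivot_new_index - 1     # continue in the low side
        else:
            left = pivot_new_index + 1      # continue in the high side


data = [9, 4, 7, 1, 8]
print(quickselect(data, 2))   # element at sorted index 2
```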

2.9. Maximum of a set
The maximum of a set of elements can be found by the following algorithm.

MAXIMUM(A)
    max = A[1]
    for i = 2 to length[A]
        do if max < A[i]
            then max ← A[i]
    return max
Maximum of a Set

2.10. Simultaneous minimum and maximum
In some applications, we must find both the minimum and the maximum of a set of n elements. For example, a graphics program may need to scale a set of (x, y) data to fit onto a rectangular display screen or other graphical output device. To do so, the program must first determine the minimum and maximum of each coordinate.

..... End of Chapter 2 .....

Chapter 3: Selection in Expected Linear Times

3.1. Overview
Selection in expected linear time is done through the randomized procedure Random-Select.

Input: An array A[1..n] of n distinct elements from a totally ordered set, and an index i, where 1 ≤ i ≤ n.
Output: The ith smallest element in array A.

If i = 1 the output is the minimum element in the array, and if i = n the output is the maximum element in the array. The median element is the output if n is odd and i = (n + 1)/2. If n is even there are two "middle elements" corresponding to i = n/2 and i = (n/2) + 1; for convenience we will assume that the median is i = ⌈(n + 1)/2⌉, i.e., the upper middle element, when n is even. The selection problem can be solved in O(n log n) time using a good sorting algorithm. But it can be solved in O(n) expected time using the following randomized algorithm.

3.2. Algorithm for Randomized Algorithm
Input: Subarray A[p..r] and index i with 1 ≤ i ≤ t, where t = r − p + 1.
Output: The ith smallest element in subarray A[p..r].

Random-Select(A, p, r, i)
    if p = r then return A[p]
    q := Rand-Partition(A, p, r)
    k := q − p + 1
    if i = k then return A[q]
    else if i < k then return Random-Select(A, p, q − 1, i)
    else return Random-Select(A, q + 1, r, i − k)
Algorithm for Randomized Algorithm

3.3. Expected running time
Let T(n) be the running time of Random-Select on a given input A[1..n] of size n (worst-case over all values of i). If the rank of the pivot element in the initial call to Rand-Partition is k, then we can write

T(n) ≤ T(max(k − 1, n − k)) + a·n    (1)

where a·n represents the time taken by the initial call to Rand-Partition. (Rand-Partition is the same as the partition algorithm discussed in section 2.6 of this project report, with the pivot chosen at random.)

3.4. Calculation of Upper Bounds on Expected Running time
We will now compute an upper bound on E[T(n)] from the recurrence in equation (1). Since the rank k of the pivot is equally likely to be any of 1, ..., n, the recurrence can be rewritten in expectation as

E[T(n)] ≤ (1/n) Σ_{k=1}^{n} E[T(max(k − 1, n − k))] + a·n

We now compute a bound on E[T(n)] using the above expression for T(n):

E[T(n)] ≤ (2/n) Σ_{k=⌊n/2⌋}^{n−1} E[T(k)] + a·n

The last inequality follows by replacing max(k − 1, n − k) by the maximum of the two terms for each value of k (the first term is the maximum if and only if k > ⌈n/2⌉), so that each of the terms T(⌊n/2⌋), ..., T(n − 1) appears at most twice in the sum. Using the substitution method with the hypothesis E[T(k)] ≤ ck for k < n, the sum is at most c(3n/4 + 1/2), which gives E[T(n)] ≤ cn − (cn/4 − c/2 − a·n). Hence we can show that E[T(n)] ≤ cn for n ≥ 2c/(c − 4a) if c > 4a.

..... End of Chapter 3 .....
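For reference, the Random-Select procedure of section 3.2 can be sketched in Python as follows. This is an illustrative version, assuming 0-based array indices p and r while keeping i as a 1-indexed rank within A[p..r], as in the pseudocode. The helper `rand_partition` is written here on the assumption that Rand-Partition picks a uniformly random pivot and then partitions as in Chapter 2.

```python
import random


def rand_partition(a, p, r):
    """Partition a[p..r] in place around a uniformly random pivot.

    Returns the final index q of the pivot.
    """
    pivot_index = random.randint(p, r)       # inclusive on both ends
    a[pivot_index], a[r] = a[r], a[pivot_index]
    pivot = a[r]
    store = p
    for j in range(p, r):
        if a[j] < pivot:
            a[store], a[j] = a[j], a[store]
            store += 1
    a[store], a[r] = a[r], a[store]
    return store


def random_select(a, p, r, i):
    """Return the i-th smallest element (1-indexed rank) of a[p..r].

    Reorders a in place; pass a copy to keep the original order.
    """
    if p == r:
        return a[p]
    q = rand_partition(a, p, r)
    k = q - p + 1                             # rank of the pivot within a[p..r]
    if i == k:
        return a[q]
    if i < k:
        return random_select(a, p, q - 1, i)
    return random_select(a, q + 1, r, i - k)


data = [12, 3, 9, 27, 5, 18, 1]
# Upper-middle rank ceil((n + 1) / 2), computed with integer arithmetic.
median_rank = (len(data) + 2) // 2
print(random_select(list(data), 0, len(data) - 1, median_rank))
```

Because the pivot is random, the sequence of recursive calls differs from run to run, but the returned element does not.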

Chapter 4: Selection in worst-case linear time

4.1. Overview and steps
The SELECT algorithm determines the ith smallest of an input array of n > 1 elements by executing the following steps. (If n = 1, SELECT merely returns its only input value as the ith smallest.)

STEP 1 Divide the n elements of the input array into ⌊n/5⌋ groups of 5 elements each and at most one group made up of the remaining n mod 5 elements.
STEP 2 Find the median of each of the ⌈n/5⌉ groups by first insertion sorting the elements of each group (of which there are at most 5) and then picking the median from the sorted list of group elements.
STEP 3 Use SELECT recursively to find the median x of the ⌈n/5⌉ medians found in step 2. (If there are an even number of medians, then by our convention, x is the lower median.)
STEP 4 Partition the input array around the median-of-medians x using the modified version of PARTITION. Let k be one more than the number of elements on the low side of the partition, so that x is the kth smallest element and there are n − k elements on the high side of the partition.
STEP 5 If i = k, then return x. Otherwise, use SELECT recursively to find the ith smallest element on the low side if i < k, or the (i − k)th smallest element on the high side if i > k.

Selection in worst-case linear time is actually a refinement of divide and conquer.

4.2. Running time of Selection in worst case linear time
The following picture helps us to find the worst-case running time of SELECT. To analyze the running time of SELECT, we first determine a lower bound on the number of elements that are greater than the partitioning element x. Figure 9.1 is helpful in visualizing this bookkeeping.

At least half of the medians found in step 2 are greater than the median-of-medians x. Thus, at least half of the ⌈n/5⌉ groups contribute 3 elements that are greater than x, except for the one group that has fewer than 5 elements if 5 does not divide n exactly, and the one group containing x itself. Discounting these two groups, it follows that the number of elements greater than x is at least 3n/10 − 6. Similarly, the number of elements that are less than x is at least 3n/10 − 6. Thus, in the worst case, SELECT is called recursively on at most 7n/10 + 6 elements in step 5.

We can now develop a recurrence for the worst-case running time T(n) of the algorithm SELECT. Steps 1, 2, and 4 take O(n) time. (Step 2 consists of O(n) calls of insertion sort on sets of size O(1).) Step 3 takes time T(⌈n/5⌉), and step 5 takes time at most T(7n/10 + 6), assuming that T is monotonically increasing. We make the assumption, which seems unmotivated at first, that any input of 140 or fewer elements requires O(1) time; the origin of the magic constant 140 will be clear shortly. We can therefore obtain the recurrence

T(n) ≤ O(1)                                if n ≤ 140
T(n) ≤ T(⌈n/5⌉) + T(7n/10 + 6) + O(n)      if n > 140

Solving the recurrence by substitution proceeds as follows. Assume T(n) ≤ cn for a suitably large constant c, and let a be a constant such that the O(n) term is at most a·n. Then

T(n) ≤ c⌈n/5⌉ + c(7n/10 + 6) + a·n
     ≤ cn/5 + c + 7cn/10 + 6c + a·n
     = 9cn/10 + 7c + a·n
     = cn + (−cn/10 + 7c + a·n)

which is at most cn whenever −cn/10 + 7c + a·n ≤ 0, that is, whenever c ≥ 10a·n/(n − 70) for n > 70. For n ≥ 140 we have n/(n − 70) ≤ 2, so choosing c ≥ 20a suffices. The worst-case running time of SELECT is therefore linear.
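Steps 1 to 5 of section 4.1 can be sketched in Python as below. This is a simplified illustration and not the in-place procedure the chapter analyzes: it assumes distinct elements, uses Python's built-in `sorted` in place of insertion sort on each group of at most 5, and builds new lists for the low and high sides instead of calling a modified PARTITION. It also recurses all the way down to n = 1 and does not use the 140-element cutoff, which only matters for the analysis.

```python
def select(items, i):
    """Return the i-th smallest element (1-indexed) of items.

    Median-of-medians selection; assumes the elements are distinct.
    """
    if not 1 <= i <= len(items):
        raise ValueError("i must satisfy 1 <= i <= len(items)")
    if len(items) == 1:
        return items[0]

    # Steps 1 and 2: split into groups of at most 5 and take each group's
    # (lower) median after sorting the group.
    groups = [sorted(items[j:j + 5]) for j in range(0, len(items), 5)]
    medians = [group[(len(group) - 1) // 2] for group in groups]

    # Step 3: recursively find the lower median x of the group medians.
    x = select(medians, (len(medians) + 1) // 2)

    # Step 4: partition around x; k is the rank of x in items.
    low = [value for value in items if value < x]
    high = [value for value in items if value > x]
    k = len(low) + 1

    # Step 5: return x or recurse into the side that contains rank i.
    if i == k:
        return x
    if i < k:
        return select(low, i)
    return select(high, i - k)


# 59 distinct values in scrambled order (37 * j mod 101 for j = 1..59).
data = [(37 * j) % 101 for j in range(1, 60)]
print(select(data, (len(data) + 1) // 2))   # the median, since n is odd
```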

..... End of Term Project Report .....

References

Topic : Reference Material used
Median : http://en.wikipedia.org/wiki/Median
Ordered statistic : en.wikipedia.org/wiki/Order_statistic
Selection algorithms : en.wikipedia.org/wiki/Selection_algorithm
Non-Linear general Selection Problem : en.wikipedia.org/wiki/Selection_algorithm
Partition Based Selection Problem : en.wikipedia.org/wiki/Selection_algorithm
Minimum of a set : Algorithms by Cormen at page #184
Selection in expected linear times : www.cs.utexas.edu/~vlr/s06.357/notes/lec7.pdf
Selection in Worst case linear times : http://www.algorithmsblog.com/?p=136
