
Statistical Inference

Order Statistics

Noha Youssef & Rana Salah

The American University in Cairo

Table of Contents

Introduction to Order Statistics

Notation

Distribution of Order Statistics

Exercises

Intuition and Applications

Many functions of random variables that are of interest in practice depend
on the relative magnitudes of the observed variables.
1 The fastest time in a car race, or the heaviest mouse among
those fed on a certain diet.
2 Joint life insurance: a policy for a couple that pays out when the
first of the spouses dies. Here you want to know the distribution of
the minimum of the couple's two lifespans.
3 Last survivor policy: an insurance policy that covers a married
couple and pays the death benefit on the death of the second spouse.
Second-to-die insurance benefits the heirs of the married couple
rather than either the husband or the wife. Here you want to know
the distribution of the maximum of the two lifespans.

Intuition and Applications (cont'd)

4 A high school teacher wants to know whether students who take a
foreign language get higher grades when their final exam is an oral
exam instead of a written exam. The following year he asks all of the
Spanish teachers to give their students an oral exam at the end of
the course. The highest exam score in the previous years was 87 (out
of a possible 100 points). Suppose we wish to know the probability
that the fifth largest grade this year exceeds 87, or the expected
value of the tenth largest grade.
5 A machine runs on 5 batteries and shuts off when the fifth (last)
battery dies. You may want to know the distribution of the lifetime
of the third battery to die out of five, or the distribution of the
last one to die, depending on the action you are going to take.

Random Sampling

Random sampling assumes that the sample is taken in such a way
that the random variables for each trial are independent and follow
the common population density function. In this case, the joint
density function is the product of the common marginal densities.
Definition 1
The set of random variables $X_1, X_2, \dots, X_n$ is said to be a
random sample of size $n$ from a population with density function
$f(x)$ if the joint pdf has the form
\[ f(x_1, x_2, \dots, x_n) = f(x_1) f(x_2) \cdots f(x_n). \]
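The product form in Definition 1 can be illustrated with a short Python sketch. The exponential density with rate 1 and the three observation values are assumptions chosen only for illustration; the helper names `exp_pdf` and `joint_pdf` are hypothetical.

```python
from math import exp, prod

def exp_pdf(x, rate=1.0):
    """Density of an exponential distribution with the given rate (illustrative choice)."""
    return rate * exp(-rate * x) if x >= 0 else 0.0

def joint_pdf(xs, pdf):
    """Joint density of a random sample: the product of the common marginal densities."""
    return prod(pdf(x) for x in xs)

observations = [0.5, 1.2, 0.3]  # made-up observations x_1, x_2, x_3
value = joint_pdf(observations, exp_pdf)
print(value)  # for rate 1 this equals exp(-(0.5 + 1.2 + 0.3)) = exp(-2.0)
```

The same function evaluated at fixed observations and viewed as a function of the rate is what is usually called the likelihood.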

Order Statistics

If the random variables $X_1, X_2, \dots, X_n$ are arranged in order of
magnitude and then written as $X_{(1)} \le X_{(2)} \le \dots \le X_{(n)}$, we call
$X_{(i)}$ the $i$-th order statistic ($i = 1, 2, \dots, n$).
The $X_i$ are assumed to be independent and identically distributed, while
the $X_{(i)}$ are necessarily dependent, since they are constrained to be
ordered. In some applications we are interested in the order of the set of
random variables $X_1, X_2, \dots, X_n$. In such cases, we need to know
something about the probability density function of the order statistics
$X_{(1)}, X_{(2)}, \dots, X_{(n)}$.
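In practice, the ordered observations are obtained simply by sorting the sample. The sketch below assumes a Uniform(0, 1) parent and a sample of size 5 purely for illustration.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

n = 5
# Unordered observations x_1, ..., x_n (Uniform(0, 1) is an arbitrary choice).
sample = [random.random() for _ in range(n)]

# Ordered observations x_(1) <= x_(2) <= ... <= x_(n).
order_stats = sorted(sample)

x_min = order_stats[0]        # x_(1), the smallest observation
x_max = order_stats[-1]       # x_(n), the largest observation
x_third = order_stats[3 - 1]  # x_(3); the r-th order statistic sits at index r - 1

print(sample)
print(order_stats)
```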

Notation

▶ $X_1, X_2, \dots, X_n$: unordered variables
▶ $x_1, x_2, \dots, x_n$: unordered observations (an observation is a
realization of the random variable)
▶ $X_{(1)}, X_{(2)}, \dots, X_{(n)}$: ordered variables
▶ $x_{(1)}, x_{(2)}, \dots, x_{(n)}$: ordered observations
▶ $X_{1:n}, X_{2:n}, \dots, X_{n:n}$: ordered variables (a notation used when
we need to emphasize the sample size)
▶ $F(x) = P(X \le x)$: cumulative distribution function (cdf) of $X$
▶ $f(x)$: probability density function (pdf) for a continuous variable $X$
▶ $p(x)$: probability function (pf) for a discrete variable $X$

Notation (cont'd)

▶ $F_{(r)}(x)$, $F_{r:n}(x)$: cdf of $X_{(r)}$, $X_{r:n}$, $r = 1, 2, \dots, n$
▶ $f_{(r)}(x)$, $f_{r:n}(x)$: pdf of $X_{(r)}$, $X_{r:n}$, $r = 1, 2, \dots, n$
▶ $p_{(r)}(x)$, $p_{r:n}(x)$: pf of $X_{(r)}$, $X_{r:n}$, $r = 1, 2, \dots, n$
▶ $F_{(r)(s)}(x, y) = P(X_{(r)} \le x, X_{(s)} \le y)$, $1 \le r < s \le n$: joint cdf
of $X_{(r)}$ and $X_{(s)}$
▶ $f_{(r)(s)}(x, y)$: joint pdf of $X_{(r)}$ and $X_{(s)}$, $1 \le r < s \le n$
▶ $p_{(r)(s)}(x, y)$: joint pf of $X_{(r)}$ and $X_{(s)}$, $1 \le r < s \le n$
▶ $\mu_{r:n} = E(X_{r:n})$: mean of $X_{r:n}$
▶ $\mu^{(k)}_{r:n} = E(X^{k}_{r:n})$: $k$th raw moment of $X_{r:n}$

Distribution of a Single Order Statistic

Distribution of the Maximum:

We suppose that $X_1, X_2, \dots, X_n$ are $n$ independent variables, each
with cdf $F(x)$. The cdf of the $n$th order statistic $X_{(n)}$ (the largest order
statistic) is given by
\[
\begin{aligned}
F_{(n)}(x) &= P(X_{(n)} \le x) \\
           &= P(\text{all } X_i \le x) \\
           &= \prod_{i=1}^{n} P(X_i \le x) = \{F(x)\}^{n}.
\end{aligned}
\]
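As a quick sanity check of $\{F(x)\}^n$, the following Python sketch compares the formula with a simulated proportion. The Uniform(0, 1) parent (for which $F(x) = x$), the sample size, the evaluation point and the number of trials are all illustrative assumptions.

```python
import random

random.seed(1)

n = 4            # sample size
x = 0.8          # point at which the cdf of the maximum is evaluated
trials = 50_000  # number of simulated samples

# Fraction of simulated samples whose maximum is at most x.
hits = sum(
    max(random.random() for _ in range(n)) <= x
    for _ in range(trials)
)
empirical = hits / trials

# For Uniform(0, 1), F(x) = x, so the formula gives F(x)^n = x^n.
theoretical = x ** n

print(empirical, theoretical)
```

The two printed numbers should agree up to simulation noise, which shrinks as `trials` grows.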

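Distribution of a Single Order Statistic (cont'd)

Distribution of the Minimum:

The cdf of the first order statistic $X_{(1)}$ (the smallest order statistic) is
given by
\[
\begin{aligned}
F_{(1)}(x) &= P(X_{(1)} \le x) = 1 - P(X_{(1)} > x) \\
           &= 1 - P(\text{all } X_i > x) \\
           &= 1 - \prod_{i=1}^{n} P(X_i > x) \\
           &= 1 - \{1 - F(x)\}^{n}.
\end{aligned}
\]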
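The cdfs of the maximum and the minimum can both be built directly from the parent cdf. The sketch below is one way to express that in Python; the helper names `cdf_max` and `cdf_min` are hypothetical, and the exponential parent with rate 1 is an assumption made only for illustration.

```python
from math import exp

def cdf_max(parent_cdf, n):
    """Return the cdf of the largest of n iid draws: x -> F(x)^n."""
    return lambda x: parent_cdf(x) ** n

def cdf_min(parent_cdf, n):
    """Return the cdf of the smallest of n iid draws: x -> 1 - (1 - F(x))^n."""
    return lambda x: 1.0 - (1.0 - parent_cdf(x)) ** n

def exp_cdf(x):
    """Illustrative parent: exponential with rate 1, F(x) = 1 - exp(-x) for x >= 0."""
    return 1.0 - exp(-x) if x >= 0 else 0.0

n = 3
F_min = cdf_min(exp_cdf, n)
F_max = cdf_max(exp_cdf, n)

x = 0.5
print(F_min(x))  # equals 1 - exp(-n * x): the minimum is again exponential, with rate n
print(F_max(x))  # equals (1 - exp(-x)) ** n
```

Passing the parent cdf as a function keeps the two formulas independent of any particular distribution.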

Distribution of a Single Order Statistic (cont'd)

The pdf of the $r$th order statistic
\[
f_{(r)}(x) = \frac{n!}{(r-1)!\,1!\,(n-r)!}\,\{F(x)\}^{r-1} f(x)\,\{1 - F(x)\}^{n-r}.
\]
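The formula for $f_{(r)}(x)$ translates almost literally into code. The following is a minimal sketch, assuming a Uniform(0, 1) parent and a simple midpoint rule for the integrals; the function names are hypothetical and the choice $n = 5$, $r = 2$ is arbitrary.

```python
from math import factorial

def order_stat_pdf(x, r, n, cdf, pdf):
    """Density of the r-th order statistic of n iid draws, evaluated at x."""
    # The factor 1! in the denominator equals 1 and is omitted.
    coeff = factorial(n) / (factorial(r - 1) * factorial(n - r))
    return coeff * cdf(x) ** (r - 1) * pdf(x) * (1.0 - cdf(x)) ** (n - r)

def midpoint_integral(g, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

# Illustrative parent distribution: Uniform(0, 1), so F(x) = x and f(x) = 1.
uniform_cdf = lambda x: x
uniform_pdf = lambda x: 1.0

n, r = 5, 2
density = lambda x: order_stat_pdf(x, r, n, uniform_cdf, uniform_pdf)

total = midpoint_integral(density, 0.0, 1.0)
mean = midpoint_integral(lambda x: x * density(x), 0.0, 1.0)

print(total)  # close to 1, as any density must integrate to 1
print(mean)   # close to r / (n + 1), the known mean for a uniform parent
```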
The joint distribution of two order statistics

For $1 \le r < s \le n$ and $x < y$,
\[
f_{(r)(s)}(x, y) = \frac{n!}{(r-1)!\,1!\,(s-r-1)!\,1!\,(n-s)!}\,\{F(x)\}^{r-1} f(x)
\times \{F(y) - F(x)\}^{s-r-1} f(y)\,\{1 - F(y)\}^{n-s}.
\]
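The joint density of two order statistics can be checked numerically in the same spirit. This sketch assumes a Uniform(0, 1) parent with $n = 4$, $r = 1$, $s = 4$ (the minimum and the maximum) and takes the density to be zero unless $x < y$; the helper names are hypothetical and the nested midpoint rule is only one of many ways to approximate the double integral.

```python
from math import factorial

def joint_pdf(x, y, r, s, n, cdf, pdf):
    """Joint density of the r-th and s-th order statistics (r < s), zero unless x < y."""
    if not x < y:
        return 0.0
    # The two 1! factors in the denominator equal 1 and are omitted.
    coeff = factorial(n) / (factorial(r - 1) * factorial(s - r - 1) * factorial(n - s))
    return (coeff * cdf(x) ** (r - 1) * pdf(x)
            * (cdf(y) - cdf(x)) ** (s - r - 1) * pdf(y)
            * (1.0 - cdf(y)) ** (n - s))

def midpoint_integral(g, a, b, steps=200):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

# Illustrative parent distribution: Uniform(0, 1).
cdf = lambda t: t
pdf = lambda t: 1.0
n, r, s = 4, 1, 4

def inner_mass(y):
    # Integrate over x from 0 to y, i.e. over the region x < y.
    return midpoint_integral(lambda x: joint_pdf(x, y, r, s, n, cdf, pdf), 0.0, y)

def inner_range(y):
    # Same region, weighted by the sample range y - x.
    return midpoint_integral(lambda x: (y - x) * joint_pdf(x, y, r, s, n, cdf, pdf), 0.0, y)

total = midpoint_integral(inner_mass, 0.0, 1.0)
expected_range = midpoint_integral(inner_range, 0.0, 1.0)

print(total)           # close to 1
print(expected_range)  # close to (n - 1) / (n + 1) = 0.6 for the uniform parent
```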
The Joint pdf of the n Order Statistics

\[
f_{(1),(2),\dots,(n)}(x_1, x_2, \dots, x_n) = n!\, f(x_1) f(x_2) \cdots f(x_n),
\qquad x_1 < x_2 < \dots < x_n.
\]

Reading assignment: page 334 of the textbook.

Exercise 1
Let $X$ have the uniform distribution over the range $(\theta - 2, \theta + 2)$.
Find the expected value of the median from a sample of size 3.

Solution:
\[
f_{(2)}(x) = \frac{3!}{1!\,1!\,1!}\,\{P(X \le x)\}^{1} f(x)\,\{1 - P(X \le x)\}^{1},
\qquad \theta - 2 \le x_{(2)} \le \theta + 2.
\]
We write this domain as $\theta - 2 \le x_{(1)} \le x_{(2)} \le x_{(3)} \le \theta + 2$ when we
have the joint distribution of the three order statistics.
The pdf of the uniform distribution is
\[ f(x) = \frac{1}{4} \]
and
\[ F(x) = P(X \le x) = \frac{x}{4} - \frac{\theta}{4} + \frac{1}{2}. \]
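Exercise 1 (cont'd)

Then
\[
f_{(2)}(x) = 6 \times \left( -\frac{1}{64}x^{2} + \frac{1}{32}\theta x - \frac{1}{64}\theta^{2} + \frac{1}{16} \right),
\qquad \theta - 2 \le x_{(2)} \le \theta + 2.
\]

The expected value $\mu_{2:3} = E(X_{2:3})$ is obtained by
\[
\begin{aligned}
E(X_{2:3}) &= \int_{\theta-2}^{\theta+2} x f_{(2)}(x)\,dx \\
           &= \int_{\theta-2}^{\theta+2} 6x \left( -\frac{1}{64}x^{2} + \frac{1}{32}\theta x - \frac{1}{64}\theta^{2} + \frac{1}{16} \right) dx \\
           &= \theta.
\end{aligned}
\]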
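The result $E(X_{2:3}) = \theta$ can be confirmed numerically for a particular value of $\theta$. In the sketch below, $\theta = 5$ is an arbitrary assumption (the exercise leaves $\theta$ unspecified), and the midpoint rule stands in for the exact integration.

```python
theta = 5.0  # illustrative value; the exercise leaves theta unspecified

def f2(x):
    """Density of the median X_(2) of 3 draws from Uniform(theta - 2, theta + 2)."""
    return 6.0 * (-x**2 / 64 + theta * x / 32 - theta**2 / 64 + 1 / 16)

def midpoint_integral(g, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

lo, hi = theta - 2, theta + 2
total = midpoint_integral(f2, lo, hi)
expected_median = midpoint_integral(lambda x: x * f2(x), lo, hi)

print(total)            # close to 1
print(expected_median)  # close to theta
```

The answer is also what symmetry suggests: the density of the median is symmetric about $\theta$.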

As a further exercise, think about how to find the distribution of the median
when $n$ is even. You may look it up, but it is not examinable.

Exercise 2

Let $X_1, X_2, X_3$ be a random sample from the exponential distribution
with mean 1, and let $X_{(1)}, X_{(2)}, X_{(3)}$ be the order statistics of that
random sample. Find the probability density functions of $X_{(1)}, X_{(2)}, X_{(3)}$.

Solution: With $n = 3$, the marginal pdf of the $r$th order statistic is given by
\[
f_{(r)}(x) = \frac{3!}{(r-1)!\,1!\,(3-r)!}\,\{F(x)\}^{r-1} f(x)\,\{1 - F(x)\}^{3-r}.
\]

The exponential density with mean 1 (so $\lambda = 1$) is
$f(x) = \lambda e^{-\lambda x} = e^{-x}$, where $x \ge 0$.

The cdf is obtained as
\[ F(x) = \int_{0}^{x} e^{-t}\,dt = 1 - e^{-x}. \]

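Exercise 2 (cont'd)

The distributions for the three order statistics are
\[
\begin{aligned}
f_{(1)}(x) &= \frac{6}{2}\,e^{-x}\{1 - (1 - e^{-x})\}^{2} = 3e^{-x}\{e^{-x}\}^{2} = 3e^{-3x}, \\
f_{(2)}(x) &= \frac{3!}{(2-1)!\,1!\,(3-2)!}\,\{1 - e^{-x}\}\,e^{-x}\{1 - (1 - e^{-x})\}^{1} \\
           &= 6\{1 - e^{-x}\}e^{-2x} = 6\{e^{-2x} - e^{-3x}\}, \\
f_{(3)}(x) &= \frac{6}{2}\,\{1 - e^{-x}\}^{2} e^{-x}\{1 - (1 - e^{-x})\}^{0} = 3\{1 - e^{-x}\}^{2} e^{-x},
\end{aligned}
\]
where $0 \le x_{(1)}$, $0 \le x_{(2)}$ and $0 \le x_{(3)}$.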
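The three densities can be checked against a simulation. The sketch below integrates each density numerically (truncating the range at 40, an assumption that discards a negligible tail) and compares the resulting means with the average of sorted simulated triples; the seed and the number of trials are arbitrary.

```python
import random
from math import exp

# Densities of X_(1), X_(2), X_(3) as derived in the solution (valid for x >= 0).
def f1(x):
    return 3.0 * exp(-3.0 * x)

def f2(x):
    return 6.0 * (exp(-2.0 * x) - exp(-3.0 * x))

def f3(x):
    return 3.0 * (1.0 - exp(-x)) ** 2 * exp(-x)

def midpoint_integral(g, a, b, steps=40_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

upper = 40.0  # truncation point; the neglected tail mass is of order exp(-40)
masses = [midpoint_integral(f, 0.0, upper) for f in (f1, f2, f3)]
means = [midpoint_integral(lambda x, f=f: x * f(x), 0.0, upper) for f in (f1, f2, f3)]

# Simulation: draw 3 exponentials with rate 1, sort them, and average each position.
random.seed(2)
trials = 20_000
sums = [0.0, 0.0, 0.0]
for _ in range(trials):
    ordered = sorted(random.expovariate(1.0) for _ in range(3))
    for k in range(3):
        sums[k] += ordered[k]
simulated_means = [total / trials for total in sums]

print(masses)           # each close to 1
print(means)            # close to 1/3, 5/6 and 11/6
print(simulated_means)  # should agree with `means` up to simulation noise
```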

Exercise 3

Electronic components of a certain type have a length of life $Y$, with
probability density given by
\[ f(y) = \frac{1}{100}\,e^{-y/100}, \qquad y > 0. \]
Length of life is measured in hours. Suppose that two such components
operate independently and in series in a certain system (hence, the
system fails when either component fails). Find the density function
of the length of life of the system.

Exercise 3 (cont'd)

Solution:
Because the system fails at the first component failure, $X = \min(Y_1, Y_2)$,
where $Y_1$ and $Y_2$ are independent random variables with the given
density. So we need to find $f_{(1)}(y)$:
\[ f_{(1)}(y) = \frac{2!}{0!\,1!\,1!}\,f(y)\,\{1 - F(y)\}^{1}, \]
and, since $F(y) = 1 - e^{-y/100}$ for this exponential distribution, it is given by
\[
f_{(1)}(y) = 2 \cdot \frac{1}{100}\,e^{-y/100}\{1 - (1 - e^{-y/100})\} = \frac{1}{50}\,e^{-y/50}, \qquad y > 0.
\]
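The density $\frac{1}{50} e^{-y/50}$ is that of an exponential distribution with mean 50 hours, which a short simulation can corroborate. The seed, the number of trials and the checkpoint of 50 hours are illustrative assumptions.

```python
import random
from math import exp

random.seed(3)
trials = 50_000
mean_life = 100.0  # mean lifetime of one component, in hours

# random.expovariate takes the rate, i.e. 1 / mean.
system_lives = [
    min(random.expovariate(1 / mean_life), random.expovariate(1 / mean_life))
    for _ in range(trials)
]

sample_mean = sum(system_lives) / trials
frac_beyond_50 = sum(t > 50.0 for t in system_lives) / trials

print(sample_mean)     # should be near 50, the mean of the density (1/50) e^{-y/50}
print(frac_beyond_50)  # should be near exp(-1), since P(X > 50) = e^{-50/50}
```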

Exercise 4

Suppose that the components in Exercise 3 operate in parallel (hence,
the system does not fail until both components fail). Find the
density function of the length of life of the system.

Left for you as an assignment.
