Percentile Matching
Aside from moments, there are many other characteristics of the observations and distribution that should align.
The percentile matching method focuses on the characteristic of percentiles to obtain parameter estimates.
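If $\pi_p$ denotes the 100$p$th percentile of the distribution, then
$$\Pr(X \le \pi_p) = F(\pi_p) = p$$
Percentile matching sets each chosen distribution percentile equal to the corresponding sample percentile $\pi'_{p_k}$:
$$\pi_{p_k} = \pi'_{p_k}, \quad k = 1, 2, \ldots, r$$
where the $p_k$'s are arbitrarily chosen. For this course, the problem will specify which percentiles should be matched.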
EXAMPLE 2.1.3
Assume the median of a sample is 250. Suppose the data came from an exponential distribution with CDF
$$F(x) = 1 - e^{-x/\theta}, \quad x > 0$$
Recall that the median is another name for the 50th percentile. Therefore, we need to match
$$\pi_{0.5} = \pi'_{0.5} = 250$$
By definition,
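$$
\begin{aligned}
F(\pi_{0.5}) = 1 - e^{-\pi_{0.5}/\theta} &= 0.5 \\
1 - e^{-250/\theta} &= 0.5 \\
e^{-250/\theta} &= 1 - 0.5 \\
-\frac{250}{\theta} &= \ln(1 - 0.5) \\
\hat{\theta} &= -\frac{250}{\ln(1 - 0.5)} = 360.674
\end{aligned}
$$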
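The algebra above works for any matched percentile of an exponential distribution, since solving $1 - e^{-\pi'_p/\theta} = p$ gives $\hat{\theta} = -\pi'_p / \ln(1 - p)$. The following is a minimal Python sketch of that calculation; the function name `exponential_theta_from_percentile` is an illustrative assumption, not something defined in this course.

```python
import math


def exponential_theta_from_percentile(sample_percentile, p):
    """Solve 1 - exp(-sample_percentile / theta) = p for theta.

    Hypothetical helper: matches the 100p-th percentile of an
    exponential distribution to a given sample percentile.
    """
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    # -pi / theta = ln(1 - p)  =>  theta = -pi / ln(1 - p)
    return -sample_percentile / math.log(1 - p)


# Example 2.1.3: sample median of 250
theta_hat = exponential_theta_from_percentile(250, 0.5)
print(round(theta_hat, 3))  # 360.674
```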
In the example above, we were conveniently given that the median of the sample is 250. In reality, there are
several different approaches to compute sample percentiles, each potentially producing a different answer. In
this course, we will use the smoothed empirical percentile approach.
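$$\pi'_p = (1 - c)\,x_{(b)} + c\,x_{(b+1)}$$
where
• $x_{(j)}$ is the $j$th observation in ascending order,
• $b = \lfloor p(n + 1) \rfloor$, i.e. round $p(n + 1)$ down to the nearest integer, and
• $c = p(n + 1) - b$.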
The formula above might look intimidating, but the logic is rather straightforward. Instead of memorizing the
formula, consider the following steps:
1. Calculate $p(n + 1)$. For this course, you won't have to worry about it being outside the interval $[1, n]$.
◦ For example, if $p = 0.65$ and $n = 34$, then $p(n + 1) = 22.75$. You may interpret this to loosely mean "the 65th percentile of the sample is the 22.75th observation in ascending order".
◦ Since 22.75 is between 22 and 23, we need to interpolate between $x_{(22)}$ and $x_{(23)}$, the 22nd and 23rd observations in ascending order.
2. Take the digits after the decimal point of $p(n + 1)$ as the weight applied to the larger observation. The smaller observation then gets the complementary weight. In other words, $\pi'_{0.65} = 0.25x_{(22)} + 0.75x_{(23)}$, where 0.75 is taken from 22.75, and $0.25 = 1 - 0.75$.
◦ If $p(n + 1)$ is an integer instead, the formula simplifies to $\pi'_p = x_{(b)}$. This is consistent with the procedure above; integers have 0's after the decimal point, so the larger observation will receive a weight of 0.
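The steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `smoothed_percentile` and the five-value sample are hypothetical, and `Fraction` is used so that $p(n + 1)$ is computed exactly rather than with floating-point rounding.

```python
import math
from fractions import Fraction


def smoothed_percentile(data, p):
    """Smoothed empirical 100p-th percentile of `data` (hypothetical helper)."""
    xs = sorted(data)                      # observations in ascending order
    n = len(xs)
    pos = Fraction(str(p)) * (n + 1)       # exact value of p(n + 1)
    if pos < 1 or pos > n:
        raise ValueError("p(n + 1) must lie in [1, n]")
    b = math.floor(pos)                    # index of the smaller observation
    c = float(pos - b)                     # weight on the larger observation
    lower = xs[b - 1]                      # x_(b); Python lists are 0-indexed
    if c == 0:
        return lower                       # p(n + 1) is an integer
    upper = xs[b]                          # x_(b+1)
    return (1 - c) * lower + c * upper


# Hypothetical sample, not taken from the course material
sample = [12, 5, 20, 8, 15]
print(round(smoothed_percentile(sample, 0.4), 4))  # 9.6
print(smoothed_percentile(sample, 0.5))            # 12
```

Here $p(n + 1) = 0.4(6) = 2.4$, so the 40th percentile is $0.6(8) + 0.4(12) = 9.6$, while $0.5(6) = 3$ is an integer and returns the 3rd observation directly.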
Write the expression that calculates the 34th percentile of a sample with size 7.
$$p(n + 1) = 0.34(7 + 1) = 2.72$$
$$\pi'_{0.34} = 0.28x_{(2)} + 0.72x_{(3)}$$
Write the expression that calculates the 84th percentile of a sample with size 17.
$$p(n + 1) = 0.84(17 + 1) = 15.12$$
15.12 tells us to interpolate between the 15th and 16th observations in ascending order, with respective weights $1 - 0.12 = 0.88$ and $0.12$. Thus,
$$\pi'_{0.84} = 0.88x_{(15)} + 0.12x_{(16)}$$
Write the expression that calculates the 40th percentile of a sample with size 19.
$$p(n + 1) = 0.4(19 + 1) = 8$$
$$\pi'_{0.4} = x_{(8)}$$
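The three expressions above can be checked with a short Python sketch that returns the index $b$ and the two interpolation weights for a given $p$ and $n$. The helper name `interpolation_terms` is an assumption made for illustration.

```python
import math
from fractions import Fraction


def interpolation_terms(p, n):
    """Return (b, weight on x_(b), weight on x_(b+1)). Hypothetical helper."""
    pos = Fraction(str(p)) * (n + 1)   # exact p(n + 1)
    b = math.floor(pos)                # position of the smaller observation
    c = pos - b                        # fractional part of p(n + 1)
    return b, float(1 - c), float(c)


for p, n in [(0.34, 7), (0.84, 17), (0.4, 19)]:
    b, w_lower, w_upper = interpolation_terms(p, n)
    print(f"p={p}, n={n}: {w_lower} * x({b}) + {w_upper} * x({b + 1})")
```

When $p(n + 1)$ is an integer, as in the third case, the weight on $x_{(b+1)}$ is 0 and the expression reduces to $x_{(b)}$.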
COACH'S REMARKS
An alternative and perhaps more intuitive way to remember which weight goes to which observation is: $p(n + 1)$ hints at which is the closer observation; it gets the larger weight.
Revisiting the examples above,
• $p(n + 1) = 2.72$ is closer to 3 than 2. Therefore, the larger weight of 0.72 is applied to $x_{(3)}$, and the smaller weight of 0.28 is applied to $x_{(2)}$.
• $p(n + 1) = 15.12$ is closer to 15 than 16. Therefore, the larger weight of 0.88 is applied to $x_{(15)}$, and the smaller weight of 0.12 is applied to $x_{(16)}$.
EXAMPLE 2.1.4
A soccer fan records the time it takes his favorite team to score one goal in 16 random matches.
15 35 60 85 33 69 88 44
35 78 90 32 2 68 23 19
$$F(x) = 1 - e^{-(x/\theta)^{\tau}}, \quad x > 0$$
SOLUTION
2 15 19 23 32 33 35 35
44 60 68 69 78 85 88 90
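$$
\begin{aligned}
\pi'_{0.4} &= 0.2x_{(6)} + 0.8x_{(7)} \\
&= 0.2(33) + 0.8(35) \\
&= 34.6
\end{aligned}
$$
$$
\begin{aligned}
\pi'_{0.6} &= 0.8x_{(10)} + 0.2x_{(11)} \\
&= 0.8(60) + 0.2(68) \\
&= 61.6
\end{aligned}
$$
$$\pi_{0.4} = \pi'_{0.4} = 34.6$$
$$\pi_{0.6} = \pi'_{0.6} = 61.6$$
$$1 - e^{-(34.6/\theta)^{\tau}} = 0.4$$
$$1 - e^{-(61.6/\theta)^{\tau}} = 0.6$$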
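First, simplify each equation in the same way as in Example 2.1.3. The resulting equations are
$$-\left(\frac{34.6}{\theta}\right)^{\tau} = \ln(0.6)$$
$$-\left(\frac{61.6}{\theta}\right)^{\tau} = \ln(0.4)$$
Dividing the first equation by the second eliminates $\theta$:
$$
\begin{aligned}
\left(\frac{34.6}{61.6}\right)^{\tau} &= \frac{\ln(0.6)}{\ln(0.4)} = 0.5575 \\
\tau \ln\left(\frac{34.6}{61.6}\right) &= \ln(0.5575) \\
\tau &= \frac{\ln(0.5575)}{\ln\left(\frac{34.6}{61.6}\right)} = 1.013
\end{aligned}
$$
Substituting $\tau = 1.013$ into the first equation and solving for $\theta$:
$$
\begin{aligned}
-\left(\frac{34.6}{\theta}\right)^{1.013} &= \ln(0.6) \\
\theta^{1.013} &= -\frac{34.6^{1.013}}{\ln(0.6)} \\
\theta &= \left(-\frac{34.6^{1.013}}{\ln(0.6)}\right)^{1/1.013} \\
&= 67.152
\end{aligned}
$$
$$\hat{\theta} = 67.152, \quad \hat{\tau} = 1.013$$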
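The two-equation solve above has a closed form, which the following Python sketch reproduces. It assumes the Weibull CDF given in the example; the function name `weibull_percentile_match` and its argument names are illustrative assumptions. Because the code keeps $\tau$ unrounded, later digits may differ slightly from a hand calculation that rounds $\tau$ to 1.013 first.

```python
import math


def weibull_percentile_match(q1, p1, q2, p2):
    """Solve 1 - exp(-(q/theta)**tau) = p at two points for (theta, tau).

    Hypothetical helper: q1, q2 are sample percentiles matched to
    probabilities p1, p2.
    """
    k1 = -math.log(1 - p1)    # equals (q1 / theta) ** tau
    k2 = -math.log(1 - p2)    # equals (q2 / theta) ** tau
    # Dividing the two equations eliminates theta: (q1 / q2) ** tau = k1 / k2
    tau = math.log(k1 / k2) / math.log(q1 / q2)
    # Back-substitute into the first equation: theta = q1 / k1 ** (1 / tau)
    theta = q1 / k1 ** (1 / tau)
    return theta, tau


# Example 2.1.4: 40th and 60th smoothed empirical percentiles
theta_hat, tau_hat = weibull_percentile_match(34.6, 0.4, 61.6, 0.6)
print(round(theta_hat, 2), round(tau_hat, 3))  # 67.15 1.013
```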