
Finding the Exact Quantile in a Sample

March 2, 2021

The R definition of the q-th quantile of a sample, where 0 ≤ q ≤ 1, is to first
find an appropriate index i, which will generally be a real number (although
it might occasionally be an integer), and then to perform linear interpolation
in order to smoothly join the two data points nearest that particular fraction
of the way from bottom to top. To be precise, let

i = 1 + q(n − 1)

be the theoretical index we are looking for (you will see that this runs
linearly from i = 1 at the bottom quantile q = 0 up to i = n at the top
quantile q = 1). (There are some other possible choices, see ?quantile in R,
but we are content with this one, which is simple and symmetric.) Write your
data set (at least conceptually) in order from bottom to top,

x(1) ≤ x(2) ≤ x(3) ≤ · · · ≤ x(n)

(these are called the order statistics of the sample). Then write the index as
a whole number plus a fraction between 0 and 1 (just like a mixed numeral):

i = w + f

The whole number part w tells us which two observations are involved,
x(w) and x(w+1). The fractional part f tells us how far to move over from
the former to the latter. (If f = 0, we just take x(w), and as f approaches 1,
we approach x(w+1).) Thus the linear formula interpolating between them is

q-th quantile = x(w) + f (x(w+1) − x(w)) = (1 − f) x(w) + f x(w+1)
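The recipe above can be sketched in a few lines of Python. This is only an illustration of the formula in these notes, not R's own code; the function name sample_quantile is made up for this example, and the sorted list is 0-based, so x(w) is stored at position w − 1.

```python
import math

def sample_quantile(data, q):
    """q-th sample quantile using i = 1 + q(n - 1) and linear interpolation."""
    if not 0 <= q <= 1:
        raise ValueError("q must lie between 0 and 1")
    xs = sorted(data)            # order statistics x(1) <= ... <= x(n)
    n = len(xs)
    if n == 0:
        raise ValueError("data must be non-empty")
    i = 1 + q * (n - 1)          # theoretical (1-based) index
    w = math.floor(i)            # whole-number part
    f = i - w                    # fractional part, 0 <= f < 1
    if f == 0:
        return xs[w - 1]         # exactly x(w); also covers w = n when q = 1
    # xs is 0-based: x(w) is xs[w - 1] and x(w+1) is xs[w]
    return (1 - f) * xs[w - 1] + f * xs[w]
```

For instance, sample_quantile([1, 2, 3, 4], 0.5) gives i = 2.5, so w = 2 and f = 0.5, and the result is halfway between 2 and 3. At q = 0 and q = 1 the function returns the smallest and largest observations.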

(Exercise to the reader: check that this definition matches the ordinary
symmetrical notion of median from last lecture, in the case q = 1/2.)
An example:

To find Q1, the quantile with q = 1/4, for the temperature data

72, 75, 59, 61, 79, 60, 66, 65, 61, 66, 69, 61, 76, 79, 56, 56

first sort the data:

56, 56, 59, 60, 61, 61, 61, 65, 66, 66, 69, 72, 75, 76, 79, 79

The sample size is n = 16.


i = 1 + q(n − 1) = 1 + (1/4)(16 − 1) = 4 + 3/4,

so w = 4 and f = 3/4, and

Q1 = x(4) + (3/4)(x(5) − x(4)) = 60 + (3/4)(61 − 60) = 60.75.
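The arithmetic in this example can be replayed step by step in Python. This is a sketch for checking the numbers, with variable names chosen to match the notation of these notes. The last line cross-checks against the standard library's statistics.quantiles (Python 3.8 or later) with method="inclusive", which, as far as I can tell, interpolates at the same index.

```python
import math
import statistics

temps = [72, 75, 59, 61, 79, 60, 66, 65, 61, 66, 69, 61, 76, 79, 56, 56]

xs = sorted(temps)               # order statistics, stored 0-based
n = len(xs)                      # sample size, 16
q = 0.25                         # Q1 is the quantile with q = 1/4

i = 1 + q * (n - 1)              # theoretical index
w = math.floor(i)                # whole-number part
f = i - w                        # fractional part

# x(w) is xs[w - 1] and x(w+1) is xs[w] because Python lists start at 0
q1 = xs[w - 1] + f * (xs[w] - xs[w - 1])
print(i, w, f, q1)               # prints: 4.75 4 0.75 60.75

# Cross-check with the standard library (first of the three quartile cut points)
q1_check = statistics.quantiles(temps, n=4, method="inclusive")[0]
```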
