Example 6.4.7 Let $X_1, \ldots, X_n$ be independent normally distributed random variables,
$X_i \sim N(\mu_i, \sigma_i^2)$, and let $Y = \sum_{i=1}^{n} X_i$. The MGF of $X_i$ is
6.5 ORDER STATISTICS
The concept of a random sample of size n was discussed earlier, and the joint
density function of the associated n independent random variables, say
$X_1, \ldots, X_n$, is given by
$$f(x_1, \ldots, x_n) = f(x_1) \cdots f(x_n) \qquad (6.5.1)$$
For example, if a random sample of five light bulbs is tested, the observed
failure times might be (in months) $(x_1, \ldots, x_5) = (5, 11, 4, 100, 17)$. Now, the
actual observations would have taken place in the order $x_3 = 4$, $x_1 = 5$, $x_2 = 11$,
$x_5 = 17$, and $x_4 = 100$. It often is useful to consider the "ordered" random sample
of size n, denoted by $(x_{1:n}, x_{2:n}, \ldots, x_{n:n})$. That is, in this example $x_{1:5} = x_3 = 4$,
$x_{2:5} = x_1 = 5$, $x_{3:5} = x_2 = 11$, $x_{4:5} = x_5 = 17$, and $x_{5:5} = x_4 = 100$. Because we
do not really care which bulbs happened to be labeled number 1, number 2, and
so on, one could equivalently record the ordered data as it was taken without
keeping track of the initial labeling. In some cases one may desire to stop after
the r smallest ordered observations out of n have been observed, because this
could result in a great saving of time. In the example, 100 months were required
before all five light bulbs failed, but the first four failed in 17 months.
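The ordering of the light bulb data, and the time saved by stopping at the rth failure, can be sketched in a few lines of Python. This is only an illustration: the helper names `order_statistics` and `time_to_rth_failure` are hypothetical and do not come from the text; the data are the five failure times quoted above.

```python
# Failure times (in months) from the light bulb example: x1, ..., x5.
failure_times = [5, 11, 4, 100, 17]


def order_statistics(sample):
    """Return the ordered sample and the original 1-based labels.

    Hypothetical helper: labels[i] is the index j such that the
    (i+1)th smallest value is x_j.
    """
    labeled = sorted(enumerate(sample, start=1), key=lambda pair: pair[1])
    labels = [label for label, _ in labeled]
    ordered = [value for _, value in labeled]
    return ordered, labels


def time_to_rth_failure(sample, r):
    """Time at which the rth smallest observation occurs (1 <= r <= n)."""
    return sorted(sample)[r - 1]


ordered, labels = order_statistics(failure_times)
print(ordered)  # the ordered sample x_{1:5}, ..., x_{5:5}
print(labels)   # which original bulb produced each ordered value
print(time_to_rth_failure(failure_times, 4), time_to_rth_failure(failure_times, 5))
```

Stopping after the fourth failure discards the label information and the largest observation, but it ends the test as soon as the fourth bulb fails.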
The joint distribution of the ordered variables is not the same as the joint
density of the unordered variables. For example, the 5! different permutations of
a sample of five observations would correspond to just one ordered result. This
suggests the result of the following theorem. We will consider a transformation
that orders the values $x_1, x_2, \ldots, x_n$. For example,
$$y_1 = u_1(x_1, x_2, \ldots, x_n) = \min(x_1, x_2, \ldots, x_n)$$
$$y_n = u_n(x_1, x_2, \ldots, x_n) = \max(x_1, x_2, \ldots, x_n)$$
and in general $y_i = u_i(x_1, x_2, \ldots, x_n)$ represents the ith smallest of $x_1, x_2, \ldots, x_n$.
For an example of this transformation, see the light bulb data above. Sometimes
we will use the notation $x_{i:n}$ for $u_i(x_1, x_2, \ldots, x_n)$, but ordinarily we will use the
simpler notation $y_i$. Similarly, when this transformation is applied to a random
sample $X_1, X_2, \ldots, X_n$, we will obtain a set of ordered random variables, called
the order statistics and denoted by either $X_{1:n}, X_{2:n}, \ldots, X_{n:n}$ or $Y_1, Y_2, \ldots, Y_n$.
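Theorem 6.5.1 If $X_1, X_2, \ldots, X_n$ is a random sample from a population with continuous pdf
f(x), then the joint pdf of the order statistics $Y_1, Y_2, \ldots, Y_n$ is
$$g(y_1, y_2, \ldots, y_n) = n!\, f(y_1) f(y_2) \cdots f(y_n) \qquad (6.5.2)$$
if $y_1 < y_2 < \cdots < y_n$, and zero otherwise.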
216 CHAPTER 6 FUNCTIONS OF RANDOM VARIABLES
joint pdf is the product of factors $f(y_i)$ multiplied in some order, but it can be
written as $f(y_1) f(y_2) f(y_3)$ regardless of the order. If we sum over all 3! = 6
subsets, then the joint pdf of $Y_1$, $Y_2$, and $Y_3$ is
$$g(y_1, y_2, y_3) = \sum_{i=1}^{6} f(y_1) f(y_2) f(y_3) = 3!\, f(y_1) f(y_2) f(y_3)$$
Suppose that $X_1$, $X_2$, and $X_3$ represent a random sample of size 3 from a population
with pdf
$$f(x) = 2x, \qquad 0 < x < 1$$
It follows that the joint pdf of the order statistics $Y_1$, $Y_2$, and $Y_3$ is
$$g(y_1, y_2, y_3) = 3!\,(2y_1)(2y_2)(2y_3), \qquad 0 < y_1 < y_2 < y_3 < 1$$
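As a sanity check that is not part of the original text, the joint pdf $3!\,(2y_1)(2y_2)(2y_3) = 48 y_1 y_2 y_3$ should integrate to 1 over the ordered region $0 < y_1 < y_2 < y_3 < 1$. Integrating from the innermost variable outward gives:

```latex
\begin{aligned}
\int_0^1 \int_0^{y_3} \int_0^{y_2} 48\, y_1 y_2 y_3 \; dy_1\, dy_2\, dy_3
  &= \int_0^1 \int_0^{y_3} 24\, y_2^{3}\, y_3 \; dy_2\, dy_3 \\
  &= \int_0^1 6\, y_3^{5} \; dy_3 \\
  &= 1 .
\end{aligned}
```

The factor 3! is exactly what compensates for restricting the product density to one of the six possible orderings.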
$$= 3!\, f(y_1) \left[ -\frac{[1 - F(y_2)]^2}{2} \right]_{y_1}^{b}$$
$$= 3 f(y_1) [1 - F(y_1)]^2, \qquad a < y_1 < b$$
Similarly,
$$g_2(y_2) = \int_{y_2}^{b} \int_{a}^{y_2} 3!\, f(y_1) f(y_2) f(y_3) \, dy_1 \, dy_3$$
These results may be generalized to the n-dimensional case to obtain the following
theorem.
Theorem 6.5.2 Suppose that $X_1, \ldots, X_n$ denotes a random sample of size n from a continuous
pdf, f(x), where f(x) > 0 for a < x < b. Then the pdf of the kth order statistic $Y_k$ is
given by
$$g_k(y_k) = \frac{n!}{(k-1)!\,(n-k)!} [F(y_k)]^{k-1} [1 - F(y_k)]^{n-k} f(y_k) \qquad (6.5.3)$$
if $a < y_k < b$, and zero otherwise.
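The formula for the pdf of the kth order statistic translates directly into code. The sketch below is illustrative only: the function names `order_stat_pdf` and `integrate` are hypothetical, and the population $f(x) = 2x$, $F(x) = x^2$ on (0, 1) is chosen here as an assumption because it is the one used in the examples of this section. A crude midpoint rule checks that each $g_k$ integrates to 1.

```python
from math import factorial


def order_stat_pdf(y, k, n, pdf, cdf):
    """pdf of the kth order statistic of a sample of size n, as in (6.5.3)."""
    coef = factorial(n) / (factorial(k - 1) * factorial(n - k))
    F = cdf(y)
    return coef * F ** (k - 1) * (1.0 - F) ** (n - k) * pdf(y)


# Assumed population for this sketch: f(x) = 2x, F(x) = x^2 on (0, 1).
def pdf(x):
    return 2.0 * x


def cdf(x):
    return x * x


def integrate(func, a, b, steps=20000):
    """Midpoint-rule integral; accurate enough for a sanity check."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))


n = 5
# Each entry should be close to 1 if g_k is a proper density.
totals = [
    integrate(lambda y: order_stat_pdf(y, k, n, pdf, cdf), 0.0, 1.0)
    for k in range(1, n + 1)
]
print(totals)
```

Passing `pdf` and `cdf` as arguments keeps the function usable for any continuous population, at the cost of requiring the caller to supply a matching pair.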
An interesting heuristic argument can be given, based on the notion that the
"likelihood" of an observation is assigned by the pdf. To have $Y_k = y_k$ one must
have k - 1 observations less than $y_k$, one at $y_k$, and n - k observations greater
than $y_k$, where $P[X \le y_k] = F(y_k)$, $P[X \ge y_k] = 1 - F(y_k)$, and the likelihood of
an observation at $y_k$ is $f(y_k)$. There are n!/[(k - 1)! 1! (n - k)!] possible orderings of
the n independent observations, and $g_k(y_k)$ is given by the multinomial expression
(6.5.3). This is illustrated in Figure 6.6.
A similar argument can be used to easily give the joint pdf of any set of order
statistics. For example, consider a pair of order statistics $Y_i$ and $Y_j$ where i < j. To
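[Figure 6.6: the k - 1 observations below $y_k$, the one observation at $y_k$, and the n - k observations above $y_k$]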
For discrete and continuous random variables, the CDF of the minimum or
maximum of the sample can be derived directly by following the CDF technique.
For the minimum,
$$G_1(y_1) = P[Y_1 \le y_1]$$
$$= 1 - P[Y_1 > y_1]$$
$$= 1 - P[\text{all } X_i > y_1]$$
$$= 1 - [1 - F(y_1)]^n \qquad (6.5.7)$$
Following similar arguments, it is possible to express the CDF of the kth order
statistic. In this case we have $Y_k \le y_k$ if k or more of the $X_i$ are at most $y_k$, where the
number of $X_i$ that are at most $y_k$ follows a binomial distribution with parameters
n and $p = F(y_k)$.
That is, let $A_j$ denote the event that exactly j of the $X_i$'s are less than or equal to $y_k$,
and let B denote the event that $Y_k \le y_k$; then
$$B = \bigcup_{j=k}^{n} A_j$$
where the $A_j$ are disjoint and $P(A_j) = \binom{n}{j} p^j (1 - p)^{n-j}$. It follows that
$P(B) = \sum_{j=k}^{n} P(A_j)$, which gives the result stated in the following theorem.
Theorem 6.5.3 For a random sample of size n from a discrete or continuous CDF, F(x), the
marginal CDF of the kth order statistic is given by
$$G_k(y_k) = \sum_{j=k}^{n} \binom{n}{j} [F(y_k)]^j [1 - F(y_k)]^{n-j}$$
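The binomial argument above says that $Y_k \le y$ exactly when at least k of the n observations are at most y, so the CDF is an upper binomial tail with success probability $p = F(y)$. A minimal sketch follows; the function name `order_stat_cdf` is hypothetical, and the caller is assumed to supply the population CDF value $F(y)$ directly.

```python
from math import comb


def order_stat_cdf(F_y, k, n):
    """P[Y_k <= y] for a sample of size n, given F_y = F(y).

    Sums the binomial probabilities that exactly j of the n
    observations are <= y, for j = k, ..., n.
    """
    return sum(
        comb(n, j) * F_y ** j * (1.0 - F_y) ** (n - j)
        for j in range(k, n + 1)
    )


# Special cases: k = n is the maximum, k = 1 is the minimum.
p, n = 0.3, 6
print(order_stat_cdf(p, n, n), p ** n)
print(order_stat_cdf(p, 1, n), 1.0 - (1.0 - p) ** n)
```

Because only $F(y)$ enters the formula, the same function serves discrete and continuous populations.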
Example 6.5.2 Consider the result of two rolls of the four-sided die in Example 2.1.1. The graph
of the CDF of the maximum is shown in Figure 2.3. Although this function was
obtained numerically from a table of the pdf, we can obtain an analytic expression
using equation (6.5.8). Specifically, let $X_1$ and $X_2$ represent a random sample
of size 2 from the discrete uniform distribution, $X_i \sim DU(4)$. The CDF of $X_i$ is
$F(x) = [x]/4$ for $1 \le x \le 4$, where [x] is the greatest integer not exceeding x. If
$Y_2 = \max(X_1, X_2)$, then $G_2(y_2) = ([y_2]/4)^2$ for $1 \le y_2 \le 4$, according to equation
(6.5.8). The CDF of the minimum, $Y_1 = \min(X_1, X_2)$, would be given by
$G_1(y_1) = 1 - (1 - [y_1]/4)^2$ for $1 \le y_1 \le 4$, according to equation (6.5.7).
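Because two rolls of a four-sided die have only 16 equally likely outcomes, the two closed-form CDFs in this example can be checked by brute-force enumeration. The sketch below uses exact fractions; the names `empirical_cdf`, `max_cdf`, and `min_cdf` are hypothetical and introduced only for this illustration.

```python
from fractions import Fraction
from itertools import product
from math import floor

FACES = (1, 2, 3, 4)
# All 16 equally likely outcomes of two rolls.
outcomes = list(product(FACES, repeat=2))


def empirical_cdf(stat, y):
    """Exact P[stat(X1, X2) <= y] by counting outcomes."""
    hits = sum(1 for pair in outcomes if stat(pair) <= y)
    return Fraction(hits, len(outcomes))


def max_cdf(y):
    """Closed form for the maximum: ([y]/4)^2 on 1 <= y <= 4."""
    return Fraction(floor(y), 4) ** 2


def min_cdf(y):
    """Closed form for the minimum: 1 - (1 - [y]/4)^2 on 1 <= y <= 4."""
    return 1 - (1 - Fraction(floor(y), 4)) ** 2


for y in (1, 1.5, 2, 3, 3.9, 4):
    print(y, empirical_cdf(max, y), max_cdf(y), empirical_cdf(min, y), min_cdf(y))
```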
Example 6.5.3 Consider a random sample of size n from a distribution with pdf and CDF given
by $f(x) = 2x$ and $F(x) = x^2$; 0 < x < 1. From equations (6.5.5) and (6.5.6), we
have that
$$g_1(y_1) = 2n y_1 (1 - y_1^2)^{n-1}, \qquad 0 < y_1 < 1$$
and
$$g_n(y_n) = 2n y_n (y_n^2)^{n-1} = 2n y_n^{2n-1}, \qquad 0 < y_n < 1$$
The corresponding CDFs may be obtained by integration or directly from
equations (6.5.7) and (6.5.8).
Example 6.5.4 Suppose that in Example 6.5.3 we are interested in the density of the range of the
sample, $R = Y_n - Y_1$. From expression (6.5.4), we have
$$g(y_1, y_n) = \frac{n!}{(n-2)!} (2y_1) [y_n^2 - y_1^2]^{n-2} (2y_n), \qquad 0 < y_1 < y_n < 1$$
Making the transformation $R = Y_n - Y_1$, $S = Y_1$ yields the inverse transformation
$y_1 = s$, $y_n = r + s$, and |J| = 1. Thus, the joint pdf of R and S is
$$h(r, s) = \frac{4\, n!}{(n-2)!}\, s (r + s) [r^2 + 2rs]^{n-2}, \qquad 0 < s < 1 - r, \quad 0 < r < 1$$
The regions A and B of the transformation are shown in Figure 6.8.
Example 6.5.5 Again consider Example 6.5.4. In that case F(s + r) = 1 if s > 1 - r, so equation
(6.5.11) becomes
$$H_1(r) = \int_0^{1-r} n (2s) [(r + s)^2 - s^2]^{n-1} \, ds + \int_{1-r}^{1} n (2s) [1 - s^2]^{n-1} \, ds$$
For the case n = 2,
$$H_1(r) = 2 \int_0^{1-r} 2s (r^2 + 2rs) \, ds + \left[ -(1 - s^2)^2 \right]_{1-r}^{1}$$
$$= \frac{8}{3} r - 2r^2 + \frac{1}{3} r^4$$
which is consistent with the pdf given by equation (6.5.10).
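The CDF of the range in this example can be checked numerically. The sketch below assumes, based on the two integrals in the example, that the CDF of the range has the form $H_1(r) = \int n f(s) [F(s + r) - F(s)]^{n-1} ds$, with $F(s + r)$ capped at 1. For n = 2 it is compared with the polynomial $\tfrac{8}{3} r - 2r^2 + \tfrac{1}{3} r^4$, which is what those integrals evaluate to when worked by hand. All function names are hypothetical.

```python
def F(x):
    """CDF of the population in Example 6.5.3: x^2 on (0, 1), clipped outside."""
    x = min(max(x, 0.0), 1.0)
    return x * x


def f(x):
    """pdf of the population: 2x on (0, 1), zero elsewhere."""
    return 2.0 * x if 0.0 < x < 1.0 else 0.0


def range_cdf(r, n, steps=20000):
    """Midpoint-rule value of the integral of n f(s) [F(s + r) - F(s)]^(n-1) over (0, 1)."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        total += n * f(s) * (F(s + r) - F(s)) ** (n - 1)
    return total * h


def range_cdf_n2(r):
    """Hand-evaluated closed form for n = 2 (an assumption checked below)."""
    return 8.0 * r / 3.0 - 2.0 * r ** 2 + r ** 4 / 3.0


for r in (0.1, 0.5, 0.9):
    print(r, range_cdf(r, 2), range_cdf_n2(r))
```

Clipping inside `F` handles the region s > 1 - r automatically, so the two-piece split of the integral does not need to be coded explicitly.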
CENSORED SAMPLING
As mentioned earlier, in certain types of problems such as life-testing experiments,
the ordered observations may occur naturally. In such cases a great saving
in time and cost may be realized by terminating the experiment after only the
first r ordered observations have occurred, rather than waiting for all n failures to
occur. This usually is referred to as Type II censored sampling. In this case, the
joint marginal density function of the first r order statistics may be obtained by
integrating over the remaining variables. Censored sampling is applicable to
many different types of problems, but for convenience the variable will be referred
to as "time" in the following discussion.
Theorem 6.5.4 Type II Censored Sampling The joint marginal density function of the first r
order statistics from a random sample of size n from a continuous pdf, f(x), is
given by
$$g(y_1, \ldots, y_r) = \frac{n!}{(n-r)!} [1 - F(y_r)]^{n-r} \prod_{i=1}^{r} f(y_i) \qquad (6.5.12)$$
if $-\infty < y_1 < \cdots < y_r < \infty$ and zero otherwise.
$$= \frac{1}{[F(t_0)]^r} \prod_{i=1}^{r} f(x_i) \qquad (6.5.15)$$
to have values less than $t_0$. It is interesting to note that equation (6.5.15) does not
involve the original sample size n. Indeed, truncated sampling may occur in two
slightly different ways. Suppose that the failure time of a unit follows the density
f(x), and that the unit is guaranteed for $t_0$ years. If a unit fails under warranty,
then it is returned to a certain repair center, and the failure times of these units
are recorded until r failure times are observed. The conditional density function
of these r failure times then would follow equation (6.5.15), which does not
depend on n, and the original number of units, n, placed in service may be known
or unknown. Also note that the data would again naturally occur as ordered data,
and the original labeling of the original random units placed in service would be
unimportant or unknown. Thus, it again would be reasonable to consider directly
the joint density of the ordered observations given by
$$g(y_1, \ldots, y_r \mid r) = \frac{r!}{[F(t_0)]^r} \prod_{i=1}^{r} f(y_i) \qquad (6.5.16)$$
Theorem 6.5.5 Type I Censored Sampling If $Y_1, \ldots, Y_r$ denote the observed values of a random
sample of size n from f(x) that is Type I censored on the right at $t_0$, then the
joint pdf of $Y_1, \ldots, Y_R$ is given by
$$f_{Y_1, \ldots, Y_R}(y_1, \ldots, y_r) = \frac{n!}{(n-r)!} [1 - F(t_0)]^{n-r} \prod_{i=1}^{r} f(y_i) \qquad (6.5.17)$$
Proof
This follows by factoring the joint pdf into the product of the marginal pdf of R
with the conditional pdf of $Y_1, \ldots, Y_r$ given R = r. Specifically,
$$f_{Y_1, \ldots, Y_R}(y_1, \ldots, y_r) = g(y_1, \ldots, y_r \mid r)\, P[R = r] = \frac{r! \prod_{i=1}^{r} f(y_i)}{[F(t_0)]^r} \cdot \frac{n!}{r!\,(n-r)!} [F(t_0)]^r [1 - F(t_0)]^{n-r}$$
which simplifies to equation (6.5.17).
Note that the forms of equations (6.5.12) and (6.5.17) are quite similar, with $t_0$
replacing $y_r$.
As suggested earlier, we will wish to use sample data to make statistical inferences
about the probability model for a given experiment. The joint density function
or "likelihood function" of the sample data is the connecting link between
the observed data and the mathematical model, and indeed many statistical procedures
are expressed directly in terms of the likelihood function of the data.
In the case of censored data, equations (6.5.12), (6.5.16), or (6.5.17) give the
likelihood function or joint density function of the available ordered data, and
statistical or probabilistic results must be based on these equations. Thus it is
clear that the type of data available and the methods of sampling can affect the
likelihood function of the observed data.
Example 6.5.6 We will assume that failure times of airplane air conditioners follow an exponential
model EXP($\theta$). We will study properties of random variables in the next
chapter that will help us characterize a distribution and interpret the physical
meaning of parameters such as $\theta$. However, for illustration purposes, suppose the
manufacturer claims that an exponential distribution with $\theta = 200$ provides a
good model for the failure times of such air conditioners, but the mechanics feel
$\theta = 150$ provides a better model. Thirteen airplanes were placed in service, and
the first 10 air conditioner failure times were as follows (Proschan, 1963):
23, 50, 50, 55, 74, 90, 97, 102, 130, 194
For Type II censored sampling, the likelihood function for the exponential
distribution is given by equation (6.5.12) as
$$g(y_1, \ldots, y_r; \theta) = \frac{n!}{(n-r)!} \frac{1}{\theta^r} \exp\left( -\frac{\sum_{i=1}^{r} y_i}{\theta} \right) \left[ e^{-y_r/\theta} \right]^{n-r}$$
$$= \frac{n!}{(n-r)!\, \theta^r} \exp\left\{ -\frac{1}{\theta} \left[ \sum_{i=1}^{r} y_i + (n - r) y_r \right] \right\} = \frac{n!}{(n-r)!\, \theta^r} e^{-T/\theta}$$
where
$$T = \sum_{i=1}^{10} y_i + (13 - 10) y_{10} = 1447$$
Thus we see that the observed data values are more likely under the assumption
$\theta = 150$ than when $\theta = 200$. Based on these data, it would be reasonable to infer
that the exponential model with $\theta = 150$ provides the better model. Indeed, it is
possible to show that the value of $\theta$ that yields the maximum value of the likelihood
is the value
$$\hat{\theta} = \frac{T}{r} = \frac{1447}{10} = 144.7$$
Thus, if one wished to choose a value of $\theta$ based on these data, the value
$\theta = 144.7$ seems reasonable.
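The Type II comparison can be reproduced with a short script. This is a sketch under the assumption that the likelihood has the form $n!/[(n-r)!\,\theta^r]\, e^{-T/\theta}$ with $T = \sum y_i + (n - r) y_r$, as in the example; the function names are hypothetical. Working on the log scale avoids very small numbers.

```python
from math import lgamma, log

# First 10 ordered failure times out of n = 13 units (Proschan, 1963).
failure_times = [23, 50, 50, 55, 74, 90, 97, 102, 130, 194]
n = 13
r = len(failure_times)


def total_time_type2(ordered_times, n):
    """T = sum of observed times + (n - r) * largest observed time."""
    r = len(ordered_times)
    return sum(ordered_times) + (n - r) * ordered_times[-1]


def log_likelihood(theta, T, n, r):
    """Log of n!/(n-r)! * theta^(-r) * exp(-T/theta)."""
    return lgamma(n + 1) - lgamma(n - r + 1) - r * log(theta) - T / theta


T = total_time_type2(failure_times, n)
theta_hat = T / r

print("T =", T, "theta_hat =", theta_hat)
for theta in (150.0, 200.0, theta_hat):
    print(theta, log_likelihood(theta, T, n, r))
```

The log-likelihood at $\theta = 150$ should exceed the value at $\theta = 200$, and neither should exceed the value at `theta_hat`.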
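For illustration purposes, suppose that Type I censoring had been used and
that the experiment had been conducted for 200 flying hours for each plane to
obtain the preceding data. The likelihood function now is given by equation
(6.5.17):
$$f(y_1, \ldots, y_r; \theta) = \frac{n!}{(n-r)!\, \theta^r} \exp\left\{ -\frac{1}{\theta} \left[ \sum_{i=1}^{r} y_i + (n - r) t_0 \right] \right\}$$
and the value of $\theta$ that maximizes it is
$$\hat{\theta} = \frac{\sum_{i=1}^{r} y_i + (n - r) t_0}{r} = 146.5$$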
As a final illustration, suppose that a large fleet of planes is placed in service
and a repair depot decides to record the failure times that occur before 200 hours.
However, some units in service may be taken to a different depot for repair, so it
is unknown how many units have not failed after 200 hours. That is, the sample
size n is unknown. Given that r ordered observations have been recorded, the
conditional likelihood is given by equation (6.5.16):
$$g(y_1, \ldots, y_r; \theta, t_0 \mid r) = \frac{r! \exp\left( -\sum_{i=1}^{r} y_i / \theta \right)}{\theta^r [1 - \exp(-t_0/\theta)]^r}$$
where r = 10 and $t_0 = 200$.
The value of O that maximizes this joint pdf cannot be expressed in closed
form; however, the approximate value for this case based on the given data is
$\theta \approx 245$. This value is not too close to the other values obtained, but of course
the data were not actually obtained under this mode of sampling. If two different
assumptions are made about the same data, then one cannot expect to always get
similar results (although the Type I and Type II censoring formulas are quite
similar).
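Although the maximizing value for the truncated-sample likelihood has no closed form, it is easy to find numerically. The sketch below assumes the conditional log-likelihood $-r \ln\theta - \sum y_i/\theta - r \ln(1 - e^{-t_0/\theta})$ up to a constant; setting its derivative to zero and multiplying by $\theta^2/r$ gives the condition $\bar{y} + t_0/(e^{t_0/\theta} - 1) - \theta = 0$, which is solved here by bisection. The function names and the bracket [100, 1000] are choices made for this illustration, not part of the text.

```python
from math import expm1

failure_times = [23, 50, 50, 55, 74, 90, 97, 102, 130, 194]
t0 = 200.0
r = len(failure_times)
mean_time = sum(failure_times) / r


def stationarity(theta):
    """theta^2 / r times the derivative of the conditional log-likelihood.

    Positive where the likelihood is still increasing in theta,
    negative where it is decreasing.
    """
    return mean_time + t0 / expm1(t0 / theta) - theta


def bisect(func, lo, hi, tol=1e-8):
    """Plain bisection; assumes func changes sign exactly once on [lo, hi]."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


theta_hat = bisect(stationarity, 100.0, 1000.0)
print(theta_hat)
```

The likelihood is quite flat near its maximum, so the root is sensitive to rounding; the computed value should land close to the approximate figure quoted in the text.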
SUMMARY
The main purpose of this chapter was to develop methods for deriving the distribution
of a function of one or more random variables. The CDF technique is a
general method that involves expressing the CDF of the "new" random variable
in terms of the distribution of the "old" random variable (or variables). When one
k-dimensional vector of random variables (new variables) is defined as a function
of another k-dimensional vector of random variables (old variables) by means of
a set of equations, transformation methods make it possible to express the joint
pdf of the new random variables in terms of the joint pdf of the old random
variables. The continuous case also involves multiplying by a function called the
Jacobian of the transformation. A special transformation, called the probability
integral transformation, and its inverse are useful in applications such as computer
simulation of data.
The transformation that orders the values in a random sample from smallest to
largest can be used to define the order statistics. A set of order statistics in which
a specified subset is not observed is termed a censored sample. This concept is
useful in applications such as life-testing of manufactured components, where it is
not feasible to wait for all components to fail before analyzing the data.
EXERCISES
1. Let X be a random variable with pdf $f(x) = 4x^3$ if 0 < x < 1 and zero otherwise. Use the
cumulative (CDF) technique to determine the pdf of each of the following random
variables:
(a) $Y = X^4$.
(b) $W = e^X$.
(c) $Z = \ln X$.
(d) $U = (X - 0.5)^2$.
2. Let X be a random variable that is uniformly distributed, $X \sim \text{UNIF}(0, 1)$. Use the CDF
technique to determine the pdf of each of the following:
(a) $Y = X^{1/4}$.
(b) $W = e^{-X}$.
(c) $Z = 1 - e^{-X}$.
(d) $U = X(1 - X)$.
3. The measured radius of a circle, R, has pdf $f(r) = 6r(1 - r)$, 0 < r < 1.
(a) Find the distribution of the circumference.
(b) Find the distribution of the area of the circle.