
On Prediction Intervals for Normal Samples

Toke Jayachandran
Naval Postgraduate School, Department of Mathematics, Monterey, California 93940
It is shown that many of the normal theory prediction intervals for statistics of random
samples are still valid when the samples are correlated and have a specified covariance
structure.
1. INTRODUCTION
In a number of papers [2-5] Hahn considered the problem of constructing simultaneous prediction intervals for future sample observations and their statistics, such as the means and variances; example applications were also discussed. The prediction intervals were derived under the assumption that the sample observations are independent and have identical normal distributions. The purpose of this article is to show that the independence assumption can be relaxed and to present the most general covariance structure for which Hahn's results would still be valid.
2. SIMULTANEOUS PREDICTION INTERVALS FOR RANDOM SAMPLES
Suppose $X_{01}, X_{02}, \ldots, X_{0n}$ is an initial random sample of size $n$ and $X_{ij}$, $i = 1, 2, \ldots, k$; $j = 1, 2, \ldots, m$, are $k$ sets of future random samples of size $m$ from a normal distribution $N(\mu, \sigma^2)$. Let $\bar{X}_i$ and $S_i^2$, $i = 0, 1, 2, \ldots, k$, be the means and variances (unbiased) of the $k + 1$ sample sets and $\bar{X}$ and $S^2$ the mean and variance of the pooled sample of size $N = n + km$. Starting from $T_i = (\bar{X}_i - \bar{X}_0)/[S_0^2(1/m + 1/n)]^{1/2}$, $i = 1, 2, \ldots, k$, which are jointly distributed as the multivariate generalization of the Student's $t$ distribution with parameters $k$, $\nu = n - 1$, and $\rho = m/(m + n)$, Hahn [2] obtained $100\gamma\%$ simultaneous prediction intervals for the $\bar{X}_i$ to be

$$\bar{X}_0 \pm t(k, \nu, \rho, \gamma)\, S_0\,(1/m + 1/n)^{1/2}, \qquad i = 1, 2, \ldots, k, \tag{1}$$
where $t(k, \nu, \rho, \gamma)$ is the appropriate c.d.f. value of the multivariate $t$ distribution tabulated in [2, 3, 8].
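As a concrete illustration, the following sketch assembles interval (1) in Python, assuming the factor $t(k, \nu, \rho, \gamma)$ has already been read from the tables in [2, 3, 8]; the function name, sample values, sample sizes, and the factor 2.8 used below are placeholders, not tabulated values.

```python
import numpy as np

def simultaneous_mean_limits(x0, m, t_factor):
    """Two-sided simultaneous prediction limits (1) for the means of the
    k future samples of size m, given the initial sample x0.
    t_factor is t(k, nu, rho, gamma) read from the multivariate-t tables;
    it is NOT computed here."""
    n = len(x0)
    xbar0 = x0.mean()                    # initial sample mean
    s0 = x0.std(ddof=1)                  # initial sample standard deviation
    half_width = t_factor * s0 * np.sqrt(1.0 / m + 1.0 / n)
    return xbar0 - half_width, xbar0 + half_width

# Hypothetical illustration: n = 10 past observations, k = 3 future samples
# of size m = 5, with a placeholder value standing in for t(k, nu, rho, gamma).
rng = np.random.default_rng(0)
x0 = rng.normal(loc=50.0, scale=4.0, size=10)
lower, upper = simultaneous_mean_limits(x0, m=5, t_factor=2.8)
print(f"simultaneous limits for all 3 future means: ({lower:.2f}, {upper:.2f})")
```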
Hahn's [5] prediction intervals for $S_i^2$, $i = 1, 2, \ldots, k$, were based on $W_u = \max(S_i^2/S_0^2)$ and $W_l = \min(S_i^2/S_0^2)$, which are, respectively, distributed as the Studentized largest and Studentized smallest chi-square variables with parameters $k$, $m - 1$, and $n - 1$; c.d.f. values of these distributions are tabulated in [6, 7]. An upper $100\gamma\%$ simultaneous prediction limit to exceed the variances of all $k$ future samples of size $m$ is

$$W_u(k, m - 1, n - 1, \gamma)\, S_0^2, \tag{2}$$
where $W_u(k, m - 1, n - 1, \gamma)$ is a c.d.f. value of $W_u$. Lower prediction limits can be obtained similarly, starting with $W_l$.
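A companion sketch for the variance limits, again assuming the Studentized chi-square factors $W_u(k, m-1, n-1, \gamma)$ and $W_l(k, m-1, n-1, \gamma)$ are supplied from the tables in [6, 7]; the numerical factors and data below are placeholders.

```python
import numpy as np

def simultaneous_variance_limits(x0, w_upper, w_lower):
    """Upper and lower simultaneous prediction limits for the variances of
    all k future samples, per (2) and its lower-limit analogue.
    w_upper = W_u(k, m-1, n-1, gamma) and w_lower = W_l(k, m-1, n-1, gamma)
    must come from the Studentized chi-square tables; they are inputs here."""
    s0_sq = np.var(x0, ddof=1)           # unbiased variance of the initial sample
    return w_lower * s0_sq, w_upper * s0_sq

# Hypothetical illustration with placeholder table values.
rng = np.random.default_rng(1)
x0 = rng.normal(loc=50.0, scale=4.0, size=10)
lo, up = simultaneous_variance_limits(x0, w_upper=4.1, w_lower=0.18)
print(f"lower limit {lo:.2f}, upper limit {up:.2f} for all future sample variances")
```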
It is not necessary for all future samples to be of the same size m. However, the
necessary c.d.f. values of the multivariate t and the Studentized chi-square distributions
are not readily available for the general case.
The basic distributional results (which are a consequence of the assumption that the samples $X_{ij}$ are independent and identically distributed) used in the construction of the prediction limits (1) and (2) are that $S_i^2/\sigma^2$, $i = 0, 1, 2, \ldots, k$, are independent chi-square variates and that $Z_i = \bar{X}_i - \bar{X}_0$, $i = 1, 2, \ldots, k$, are independent of $S_0^2$.
3. PREDICTION INTERVALS FOR CORRELATED SAMPLES
We will now show that the distributional results mentioned in the previous section still hold when the $N \times 1$ vector $X = (X_{01}, \ldots, X_{0n}, X_{11}, \ldots, X_{1m}, \ldots, X_{k1}, \ldots, X_{km})'$ has a multivariate normal distribution with mean vector $\mu = (\mu, \ldots, \mu)'$ and covariance matrix $V$, provided $V$ is of the form

$$V = \tfrac{1}{2}(A + A') + \sigma^2(I - E), \tag{3}$$

$$A = \begin{bmatrix} a_1 & a_1 & \cdots & a_1 \\ a_2 & a_2 & \cdots & a_2 \\ \vdots & \vdots & & \vdots \\ a_N & a_N & \cdots & a_N \end{bmatrix}_{N \times N},$$

where $a_i > 0$, $i = 1, 2, \ldots, N$, $I$ is an identity matrix, and $E$ is a matrix of all 1s.
Consider the identity (4).
Baldessari [1] proved that a necessary and sufficient condition for $S_i^2/\sigma^2$, $i = 0, 1, 2, \ldots, k$, to be independent and have chi-square distributions is that the covariance matrix be of the form (3). Sufficiency can be established using the standard theorems on quadratic forms in normal variables [9, Chap. 3]. To show that $Z = (Z_1, Z_2, \ldots, Z_k)'$ is independent of $S_0^2$, note that $Z$ is linear in $X$ of the form $CX$ and $S_0^2$ is a quadratic form $X'B_0X$ for an appropriate $k \times N$ matrix $C$ and an $N \times N$ matrix $B_0$. It is fairly straightforward to show that $CVB_0 = 0$, which implies the independence of $Z$ and $S_0^2$.
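A small numerical check of the $CVB_0 = 0$ step, under one natural reading of the matrices involved: $C$ maps $X$ to $Z_i = \bar{X}_i - \bar{X}_0$, and $B_0$ is the centering quadratic form with $S_0^2 = X'B_0X$; the sample sizes, the $a_i$, and $\sigma^2$ below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, k = 4, 3, 2                 # initial sample size, future sample size, number of future samples
N = n + k * m

# Covariance matrix of form (3): V_ii = a_i, V_ij = (a_i + a_j)/2 - sigma^2 for i != j.
a = rng.uniform(2.0, 8.0, size=N)
sigma2 = 0.5
A = np.tile(a[:, None], (1, N))   # i-th row constant and equal to a_i
V = 0.5 * (A + A.T) + sigma2 * (np.eye(N) - np.ones((N, N)))

# C: k x N matrix with Z = C X, Z_i = mean of future sample i minus mean of the initial sample.
C = np.zeros((k, N))
C[:, :n] = -1.0 / n
for i in range(k):
    C[i, n + i * m : n + (i + 1) * m] = 1.0 / m

# B0: N x N matrix with S_0^2 = X' B0 X (centered sum of squares of the first block over n - 1).
B0 = np.zeros((N, N))
B0[:n, :n] = (np.eye(n) - np.ones((n, n)) / n) / (n - 1)

# The product vanishes, which is the standard condition for independence of the
# linear form Z and the quadratic form S_0^2 in normal variables.
print(np.allclose(C @ V @ B0, 0.0))   # True
```

The check succeeds for any choice of the $a_i$ and $\sigma^2$, since only the structure of $V$ in (3) is used.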
An example of a system of random variables with a covariance structure as in (3) is a sequence of symmetric or exchangeable normal random variables. Examples of nonidentically distributed systems can be constructed by choosing values for $a_i$, $i = 1, 2, \ldots, N$, and $\sigma^2$ so that $V = \tfrac{1}{2}(A + A') + \sigma^2(I - E)$ is a positive definite matrix; if $N = 4$, $a_1 = 2$, $a_2 = 4$, $a_3 = 6$, $a_4 = 8$, and $\sigma^2 = 2$, then

$$V = \begin{bmatrix} 2 & 1 & 2 & 3 \\ 1 & 4 & 3 & 4 \\ 2 & 3 & 6 & 5 \\ 3 & 4 & 5 & 8 \end{bmatrix}.$$
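A sketch that rebuilds this example from (3) and confirms positive definiteness; the helper function is mine, but the numbers are those used above. The last line illustrates the exchangeable special case mentioned earlier, in which all the $a_i$ are equal.

```python
import numpy as np

def covariance_of_form_3(a, sigma2):
    """Build V = (1/2)(A + A') + sigma^2 (I - E) from the diagonal values a_i."""
    a = np.asarray(a, dtype=float)
    N = len(a)
    A = np.tile(a[:, None], (1, N))          # row i constant and equal to a_i
    return 0.5 * (A + A.T) + sigma2 * (np.eye(N) - np.ones((N, N)))

V = covariance_of_form_3([2, 4, 6, 8], sigma2=2)
print(V)                                      # first row (2, 1, 2, 3), and so on
print(np.all(np.linalg.eigvalsh(V) > 0))      # True: V is positive definite

# Exchangeable (symmetric) normal variables arise when all a_i are equal:
# every diagonal entry is a, every off-diagonal entry is a - sigma^2.
print(covariance_of_form_3([3, 3, 3, 3], sigma2=1))
```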
REFERENCES
[1] Baldessari, B., Analysis of Variance of Dependent Data, Statistica (Bologna), 26, 895-903 (1966).
[2] Hahn, G. J., Factors for Calculating Two-sided Prediction Intervals for Samples from a Normal Population, Journal of the American Statistical Association, 64, 878-888 (1969).
[3] Hahn, G. J., Additional Factors for Calculating Prediction Intervals for Samples from a Normal Population, Journal of the American Statistical Association, 65, 1668-1676 (1970).
[4] Hahn, G. J., Statistical Intervals for a Normal Population, Parts I and II, Journal of Quality Technology, 2, 115-125, 195-206 (1970).
[5] Hahn, G. J., Prediction Intervals for a Normal Distribution, General Electric Company TIS Report No. 71-C-038, 1970 (available from Distribution Unit, P.O. Box 43, Bldg. 5, Room 237, General Electric Company, Schenectady, NY).
[6] Krishnaiah, P. R., and Armitage, J. V., Distribution of the Studentized Smallest Chi-Square with Tables and Applications, ARL 64-218, Aerospace Research Laboratories, Wright-Patterson Air Force Base, Ohio, 1964.
[7] Krishnaiah, P. R., and Armitage, J. V., Tables for the Studentized Largest Chi-Square Distribution and Their Applications, ARL 64-188, Aerospace Research Laboratories, Wright-Patterson Air Force Base, Ohio, 1964.
[8] Krishnaiah, P. R., and Armitage, J. V., Tables for Multivariate t-Distribution, Sankhya, Series B, 28, 31-56 (1966).
[9] Rao, C. R., Linear Statistical Inference and Its Applications, 2nd ed., Wiley, New York, 1973.
