A Method of Obtaining Prediction Intervals

Author(s): G. David Faulkenberry


Source: Journal of the American Statistical Association, Vol. 68, No. 342 (Jun., 1973), pp. 433-435
Published by: American Statistical Association
Stable URL: http://www.jstor.org/stable/2284093
Accessed: 16/06/2014 05:28

Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at
http://www.jstor.org/page/info/about/policies/terms.jsp.
JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of
content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms
of scholarship. For more information about JSTOR, please contact support@jstor.org.

American Statistical Association is collaborating with JSTOR to digitize, preserve and extend access to Journal
of the American Statistical Association.

http://www.jstor.org

This content downloaded from 62.122.73.250 on Mon, 16 Jun 2014 05:28:01 AM


All use subject to JSTOR Terms and Conditions
A Method of Obtaining Prediction Intervals

G. DAVID FAULKENBERRY*

* G. David Faulkenberry is associate professor, Department of Statistics, Oregon State University, Corvallis, Oregon 97331.

Journal of the American Statistical Association, June 1973, Volume 68, Number 342, Theory and Methods Section.

A method is given for obtaining a prediction interval for a variable $Y$ with distribution $F_Y(y \mid \theta)$ based on observations $x_1, \ldots, x_n$ from distribution $F_X(x \mid \theta)$, where $\theta$ is a common parameter to both distributions. The method uses the conditional distribution of $y$ given $t$, where $t$ is a sufficient statistic for $\theta$ in the joint distribution of $(X_1, \ldots, X_n, Y)$. Sample applications are given for the negative exponential and Poisson distributions.

1. INTRODUCTION

It is sometimes of interest to obtain a value, arrived at by life testing, that with high probability will be less than the life length of a particular component that is to be used in a "one-shot" system. Or, on the basis of January sales in previous years a firm is interested in having an estimate, in interval form, of the sales for the coming January. These are examples of statistical inference problems called prediction intervals or $\beta$-expectation tolerance intervals. The problem can be stated more formally as follows: Let $X_1, \ldots, X_n$ represent a random sample from a distribution $F_X(x \mid \theta)$ and let $Y$ have distribution function $F_Y(y \mid \theta)$, where $\theta$ is common to both distributions. On the basis of the outcomes $x_1, \ldots, x_n$ we wish to make a prediction about the outcome of $Y$, usually in the form of an interval or region that we are reasonably confident will contain the outcome of $Y$. That is, if $L$ and $U$ are functions of $X_1, \ldots, X_n$, then

$$\Pr(L < Y < U) = \beta$$

or equivalently

$$E\left[\int_L^U dF(y \mid \theta)\right] = \beta .$$

More detailed information on the abstract formulation of the problem is given by [3], and a decision theory formulation is given by [1].

A good bit of the literature on prediction intervals is concerned with solving for prediction intervals of a particular type or solving for intervals for a particular distribution. For example, Thatcher [8] discusses the prediction interval concept and gives a solution for binomial variables, Hahn [4] considers prediction regions for $k$ future $Y$ observations when sampling from a normal distribution, Shah [7] gives a method for obtaining prediction intervals for a Poisson variable and Nelson [6] gives approximate solutions for several sampling situations.

The methods of deriving these prediction intervals are generally either use of a pivotal quantity [4], use of a normal approximation [6] or use of an hypothesis testing approach [3]. The purpose of this article is to give a technique for deriving prediction intervals based on the idea of conditioning on a sufficient statistic.

2. METHOD

The proposed procedure for obtaining prediction intervals depends upon the following result.

If a complete sufficient statistic exists, then the prediction problem can be equivalently formulated in terms of the distribution conditioned on the sufficient statistic.

This result can be verified as follows. Let $X_1, \ldots, X_n, Y$ be independent random variables with distribution functions $F_{X_i}(x \mid \theta)$, for $i = 1, \ldots, n$, and $F_Y(y \mid \theta)$, where $\theta$ is a common parameter. Let $R(x_1, \ldots, x_n)$ be a function whose range is regions of the real line. Define

$$\phi(x_1, \ldots, x_n, y) = \begin{cases} 1 & \text{if } y \in R(x_1, \ldots, x_n) \\ 0 & \text{if } y \notin R(x_1, \ldots, x_n). \end{cases}$$

Then we have

$$E\,\phi(X_1, \ldots, X_n, Y) = \Pr[Y \in R(X_1, \ldots, X_n)].$$

Now,

1. if $R$ is such that $\Pr[Y \in R(X_1, \ldots, X_n)] = \beta$, then if $t$ is a complete sufficient statistic,

$$\beta = \Pr[Y \in R(X_1, \ldots, X_n)] = E\,\phi(X_1, \ldots, X_n, Y) = E\{E[\phi(X_1, \ldots, X_n, Y) \mid T = t]\} = E\{\Pr[Y \in R(X_1, \ldots, X_n) \mid T = t]\}.$$

The completeness and sufficiency of $t$ then implies

$$\Pr[Y \in R(X_1, \ldots, X_n) \mid T = t] = \beta .$$

Also,

2. if there exists an $R$ such that

$$\Pr[Y \in R(X_1, \ldots, X_n) \mid T = t] = \beta \quad \text{for all } t,$$

then

$$\Pr[Y \in R(X_1, \ldots, X_n)] = E\,\phi(X_1, \ldots, X_n, Y) = E\{E[\phi(X_1, \ldots, X_n, Y) \mid T = t]\} = E\{\Pr[Y \in R(X_1, \ldots, X_n) \mid T = t]\} = E(\beta) = \beta .$$

Therefore we have the equivalence of prediction regions under the conditional and unconditional distributions.
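As a numerical illustration of part 2 of this argument, the following sketch assumes exponential observations with mean $\theta$, a setting chosen here for convenience. In that model $T = \sum X_i + Y$ is a complete sufficient statistic and $\Pr[Y < cT \mid T = t] = 1 - (1 - c)^n$ for $0 < c < 1$, so taking $c = 1 - (1 - \beta)^{1/n}$ gives conditional coverage $\beta$ for every $t$. The function names, the seed and the number of trials are illustrative assumptions. If the equivalence holds, the simulated unconditional coverage should be close to $\beta$ whatever the value of $\theta$.

```python
import random


def conditional_region_contains(xs, y, beta):
    """Return True when y lies in R'(t) = (0, c*t), with t = sum(xs) + y.

    c = 1 - (1 - beta)**(1/n) makes Pr[Y < c*T | T = t] equal to beta
    for every t under the assumed exponential model.
    """
    n = len(xs)
    c = 1.0 - (1.0 - beta) ** (1.0 / n)
    t = sum(xs) + y
    return 0.0 < y < c * t


def simulated_coverage(theta, n, beta, trials=20000, seed=0):
    """Monte Carlo estimate of the unconditional Pr[Y in R(X_1, ..., X_n)]."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # expovariate takes the rate, so mean theta corresponds to rate 1/theta.
        xs = [rng.expovariate(1.0 / theta) for _ in range(n)]
        y = rng.expovariate(1.0 / theta)
        if conditional_region_contains(xs, y, beta):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for theta in (0.5, 1.0, 10.0):
        print(theta, simulated_coverage(theta, n=5, beta=0.90))
```

The region is defined only through the conditional distribution given $T$, yet the estimate does not involve $\theta$ in its construction; the simulation only checks that the unconditional coverage agrees with the conditional one.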

This result can be applied to obtain prediction intervals as follows:

1. Determine $F(y \mid t)$ where $T$ is sufficient for $\theta$ in the joint distribution of $(X_1, \ldots, X_n, Y)$.
2. Determine $R'(t)$ such that

$$\int_{R'(t)} dF(y \mid t) = \beta .$$

3. Solve for $R(x_1, \ldots, x_n)$ such that $y \in R(x_1, \ldots, x_n)$ if and only if $y \in R'(t)$.

In some cases it is actually easier to omit Step 2 and solve directly for $R$ as will be illustrated with the Poisson example.

3. EXAMPLES

The following results are not new, but the examples illustrate the procedure. The exponential is used since it clearly illustrates the three preceding steps, and the Poisson is used to illustrate a case where no pivotal statistic is available.

Example 3.1: Exponential: Suppose $X_1, \ldots, X_n, Y$ are independent random variables each having density function

$$f(u) = (1/\theta)\, e^{-u/\theta}, \quad u > 0,\ \theta > 0 .$$

The joint density is given by

$$f(x_1, \ldots, x_n, y \mid \theta) = \theta^{-(n+1)} \exp\left[-\left(\sum x_i + y\right)/\theta\right]$$

and the factorization theorem gives $T = \sum X_i + Y$ sufficient for $\theta$, so the three steps corresponding to those in Section 2 are as follows:

1. The conditional density of $y$ given $t$ is

$$g(y \mid t) = \begin{cases} (n/t)(1 - y/t)^{n-1} & \text{for } 0 < y < t \\ 0 & \text{otherwise.} \end{cases}$$

This is obtained by finding the joint distribution of $\sum X_i$ and $Y$ and transforming to $t = \sum x_i + y$ and $y$.

2. $R'(t)$ in interval form will be $(a, b)$ where $a$ and $b$ satisfy

$$\int_a^b g(y \mid t)\, dy = \beta$$

or

$$(1 - a/t)^n - (1 - b/t)^n = \beta .$$

For example, if we take an interval of the form $(0, b)$, which will be the interval of shortest width, we have

$$R'(t) = \{0,\ t[1 - (1 - \beta)^{1/n}]\} .$$

3. Using the fact that $t = \sum x_i + y$,

$$y < \left(\sum x_i + y\right)[1 - (1 - \beta)^{1/n}]$$

so

$$R(x_1, \ldots, x_n) = \left\{0,\ \sum x_i\,[(1 - \beta)^{-1/n} - 1]\right\} .$$

This is, of course, the same result given by the standard approach of using the fact that the pivotal statistic $nY/\sum X_i$ is distributed $F(2, 2n)$.

Example 3.2: Poisson: Let $X_1, \ldots, X_n, Y$ be independent random variables each having a Poisson distribution with parameter $\lambda$. The joint density is given by

$$p(x_1, \ldots, x_n, y \mid \lambda) = \frac{e^{-(n+1)\lambda}\, \lambda^{\sum x_i + y}}{y!\, \prod x_i!} .$$

$T = \sum X_i + Y$ is sufficient for $\lambda$ and the conditional distribution of $Y$ given $t$ is

$$B(y \mid t) = \binom{t}{y} \left(\frac{1}{n+1}\right)^{y} \left(\frac{n}{n+1}\right)^{t-y}, \quad y = 0, 1, \ldots, t,$$

which is a binomial distribution with parameters $t$ and $1/(n + 1)$. We now need to pick an interval $[a', b']$ such that

$$\sum_{i=a'}^{b'} B(i \mid t) \ge \beta .$$

("$\ge$" is used because of the discrete distribution. This could be eliminated by randomization.) This interval would then be converted to an interval $[a, b]$ for $y$, where $a$ and $b$ depend only on $\sum x_i$. If it is decided to take an "equal-tailed" interval, then we need

$$\sum_{i=0}^{b'} B(i \mid t) \ge 1 - \frac{1 - \beta}{2} \quad \text{and} \quad \sum_{i=0}^{a'-1} B(i \mid t) \le \frac{1 - \beta}{2} . \qquad (1)$$

This is essentially the result given by Shah [7], but we can carry this further to simplify the process of obtaining a specific interval. Since we cannot solve explicitly for $a'$ and $b'$ as functions of $t$ (as was done in the exponential example), we solve directly for $a$ and $b$ as functions of $\sum x_i$. Note that (1) defines $a'$ and $b'$ as functions of $y$, and we are interested in the intersection of (1) and $a' \le y \le b'$. The intersection $(a, b)$ can be determined by simultaneously solving the equations in (1), respectively, with $y = b'$ and $y = a'$. We therefore have the following equations defining $a$ and $b$:

$$\sum_{i=0}^{b} B\left(i \,\Big|\, \sum x_i + b\right) = 1 - \frac{1 - \beta}{2}$$

and

$$\sum_{i=0}^{a-1} B\left(i \,\Big|\, \sum x_i + a\right) = \frac{1 - \beta}{2} .$$

For given values of $\sum x_i$ and $\beta$, binomial tables can now be used to solve for $a$ and $b$. For example, suppose that in the past four years a system has had a total of 20 breakdowns. And suppose it is reasonable that the yearly number of breakdowns can be considered observations from a Poisson distribution with parameter $\lambda$ not depending on the year. Then a prediction interval at approximately the 90 percent confidence level for the number of breakdowns that will occur the coming year is
obtained by taking $n = 4$, $\sum x_i = 20$ and using binomial tables such as [5]. This gives

$$\sum_{i=0}^{1} B(i \mid 22) = .048 \quad \text{for } a = 2$$

and

$$\sum_{i=0}^{9} B(i \mid 29) = .951 \quad \text{for } b = 9 .$$

Therefore $[2, 9]$ is a prediction interval at the 90.3 percent confidence level.

For large values of $n$ and $\sum x_i$ it may be desirable to use the Poisson approximation to the cumulative binomial.

4. CONCLUSION

The procedure given and illustrated in this note is offered as a useful method of approaching the problem of obtaining prediction intervals. It is always possible (though not necessarily always practical) to formulate the problem by conditioning on a sufficient statistic, so in this sense the approach is general. Also it may be of interest to use this formulation to consider optimality properties such as is done for unbiased tests in [2].

[Received May 1971. Revised May 1972.]

REFERENCES

[1] Aitchison, J. and Schulthorpe, D., "Some Problems of Statistical Prediction," Biometrika, 52 (December 1965), 469-83.
[2] Ferguson, T.S., Mathematical Statistics: A Decision Theoretic Approach, New York: Academic Press, 1967.
[3] Fraser, D.A.S. and Guttman, I., "Tolerance Regions," Annals of Mathematical Statistics, 27 (March 1956), 162-79.
[4] Hahn, G.J., "Factors for Calculating Two-Sided Prediction Intervals for Samples from a Normal Distribution," Journal of the American Statistical Association, 64 (September 1969), 878-88.
[5] Harvard Computation Laboratory, Tables of the Cumulative Binomial Probability Distribution, Cambridge, Mass.: Harvard University Press, 1955.
[6] Nelson, W.B., "Two-Sample Prediction," General Electric Company TIS Report 68-C404. (Available from Distribution Unit, Research and Development Center, General Electric Company, Schenectady, N.Y.) 1968.
[7] Shah, B.V., "On Predicting Failures in a Future Time Period from Known Observations," IEEE Transactions on Reliability, 18 (November 1969), 203-4.
[8] Thatcher, A.R., "Relationships Between Bayesian and Confidence Limits for Prediction," Journal of the Royal Statistical Society, Ser. B, 26 (1964), 176-92.
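The Poisson example of Section 3 can also be checked numerically rather than with binomial tables. The sketch below computes the exact binomial sums for the conditional distribution of $Y$ given $t$, which is binomial with parameters $t$ and $1/(n + 1)$. Because the sums are discrete, it assumes a particular rule that the article leaves to table lookup: take the largest $a$ whose lower-tail sum (over $i = 0, \ldots, a - 1$) does not exceed $(1 - \beta)/2$ and the smallest $b$ whose cumulative sum reaches $1 - (1 - \beta)/2$. The function names and this rule are illustrative assumptions.

```python
from math import comb


def binomial_cdf(k, t, p):
    """Pr[Y <= k] for Y ~ Binomial(t, p); returns 0.0 when k < 0."""
    if k < 0:
        return 0.0
    return sum(comb(t, i) * p**i * (1.0 - p) ** (t - i) for i in range(min(k, t) + 1))


def poisson_prediction_interval(n, sum_x, beta):
    """Equal-tailed interval [a, b] for a future Poisson count.

    Conditional on t = sum_x + y, Y is Binomial(t, 1/(n + 1)).
    Returns (a, b, lower_tail_at_a, cdf_at_b).
    """
    p = 1.0 / (n + 1)
    alpha = (1.0 - beta) / 2.0

    def lower_tail(a):
        # Pr[Y <= a - 1 | t = sum_x + a]
        return binomial_cdf(a - 1, sum_x + a, p)

    def upper_cdf(b):
        # Pr[Y <= b | t = sum_x + b]
        return binomial_cdf(b, sum_x + b, p)

    # Assumed rule: largest a with lower tail <= alpha.
    a = 0
    while lower_tail(a + 1) <= alpha:
        a += 1
    # Assumed rule: smallest b with cumulative probability >= 1 - alpha.
    b = 0
    while upper_cdf(b) < 1.0 - alpha:
        b += 1
    return a, b, lower_tail(a), upper_cdf(b)


if __name__ == "__main__":
    a, b, low, high = poisson_prediction_interval(n=4, sum_x=20, beta=0.90)
    print(a, b, round(low, 3), round(high, 3), round(high - low, 3))
```

With $n = 4$ and $\sum x_i = 20$ this rule selects $a = 2$ and $b = 9$, and the two sums agree to three decimals with the values .048 and .951 quoted in the example.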
