
A Catalog of Noninformative Priors

Ruoyong Yang James O. Berger


Parexel International ISDS
60 Revere Drive, Suite 200 Duke University
Northbrook, IL 60062 Durham, NC 27708-0251
ruoyong.yang@parexel.com berger@stat.duke.edu
December, 1998

Abstract
[PRELIMINARY DRAFT: This draft of the catalog is incomplete, with much remaining to be filled in and/or added. We are circulating this draft in the hope that readers will know of relevant information that should be added. Please send such information to the authors above.]
A variety of methods of deriving noninformative priors have been developed and applied to a wide variety of statistical models. In this paper we provide a catalog of many of the resulting priors, and list known properties of the priors. Emphasis is given to reference priors and the Jeffreys prior, although other approaches are also considered.
Key words and phrases: Jeffreys prior, reference prior, maximal data information prior.

This research was supported by NSF grants DMS-8923071 and DMS-9303556 at Purdue University.

Contents

1 Introduction
  1.1 Motivation
  1.2 Approaches to Development of Noninformative Priors
2 Organization and Notation
3 AR(1)
4 Behrens-Fisher Problem
5 Beta
6 Binomial
7 Bivariate Binomial
8 Box-Cox Power Transformed Linear Model
9 Cauchy
10 Dirichlet
11 Exponential Regression Model
12 F Distribution
13 Gamma
14 Generalized Linear Model
15 Inverse Gamma
16 Inverse Normal or Gaussian
17 Linear Calibration
18 Location-Scale Parameter Models
19 Logit Model
20 Mixed Model
21 Mixture Model
22 Multinomial
23 Negative Binomial
24 Neyman and Scott Example
25 Nonlinear Regression Model
26 Normal
27 Pareto
28 Poisson
29 Product of Normal Means
30 Random Effects Models
31 Ratio of Exponential Means
32 Ratio of Normal Means
33 Ratio of Poisson Means
34 Sequential Analysis
35 Stress-Strength System
36 Sum of Squares of Normal Means
37 T Distribution
38 Uniform
39 Weibull
1 Introduction
1.1 Motivation
The literature on noninformative priors has grown enormously over recent years. There have been several excellent books or review articles concerned with discussing or comparing different approaches to developing noninformative priors (e.g., Kass and Wasserman, 1993), but there has been no systematic effort to catalog the noninformative priors that have been developed. Since use of noninformative priors is becoming routine in Bayesian practice, preparation of such a catalog seemed in order.
Although general discussion is not the purpose of this catalog, it is useful to review the
numerous reasons that noninformative priors are important to Bayesian analysis:
(i) Frequently, elicitation of subjective prior distributions is impossible, because of time or cost limitations, or resistance or lack of training of clients. Automatic or default prior distributions are then needed.
(ii) The statistical analysis is often required to appear objective. Of course, true objectivity is virtually never attainable, and the prior distribution is usually the least of the problems in terms of objectivity, but use of a subjectively elicited prior significantly reduces the appearance of objectivity. Noninformative priors not only preserve this appearance, but can be argued to result in analyses that are more objective than most classical analyses.
(iii) Subjective elicitation can easily result in poor prior distributions, because of systematic elicitation bias and the fact that elicitation typically yields only a few features of the prior, with the rest of the prior (e.g., its functional form) being chosen in a convenient, but possibly inappropriate, way. It is thus good practice to compare answers from a subjective analysis with answers from a noninformative prior analysis. If there are substantial differences, it is important to check that the differences are due to features of the prior that are trusted, and not due to either unelicited "convenience" features of the prior or suspect elicitations.
(iv) In high dimensional problems, the best one can typically hope for is to develop subjective priors for the "important" parameters, with the unimportant or "nuisance" parameters being given noninformative priors.
(v) Good noninformative priors can be somewhat magical in multiparameter problems. As an example, the Jeffreys prior seems to almost always yield a proper posterior distribution. This is "magical," in that the common constant (or uniform) prior will much more frequently fail to yield a proper posterior. Even better, the reference prior approach has repeatedly yielded multiparameter priors that overcome limitations of the Jeffreys prior, and yield surprisingly good performance from almost any perspective. The point here is that, in multiparameter problems, inappropriate aspects of priors (even proper ones) can accumulate across dimensions in very detrimental ways; reference priors seem to "magically" avoid such inappropriate accumulation.
(vi) Bayesian analysis with noninformative priors is being increasingly recognized as a method for classical statisticians to obtain good classical procedures. For instance, the frequentist-matching approach to developing noninformative priors is based on ensuring that one has Bayesian credible sets with good frequentist properties, and it turns out that this is probably the best way to find good frequentist confidence sets.

1.2 Approaches to Development of Noninformative Priors


We do not attempt a thorough discussion of the various approaches; see, e.g., Kass and Wasserman (1993) for such discussion. We primarily will just define the various approaches and give relevant references.
The Uniform Prior: By this, we just mean the constant density, with the constant typically being chosen to be 1 (unless the constant can be chosen to yield a proper density). This choice was, of course, popularized by Laplace (1812).
The Jeffreys Prior: This is defined as π(θ) = √(det I(θ)), where I(θ) is the Fisher information matrix. This was proposed in Jeffreys (1961) as a solution to the problem that the uniform prior does not yield an analysis invariant to the choice of parameterization. Note that, in specific situations, Jeffreys often recommended noninformative priors that differed from the formal Jeffreys prior.
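For a concrete feel for this definition, the following sketch (our own illustration, not part of the catalog) approximates the Fisher information of a Bernoulli(p) model by finite differences and compares √I(p) with the known closed form p^{−1/2}(1−p)^{−1/2}:

```python
import numpy as np

def fisher_info_bernoulli(p, h=1e-4):
    """Estimate I(p) = E[-d^2/dp^2 log f(x|p)] for a Bernoulli(p) model
    by central finite differences (a numerical sketch of the recipe)."""
    def loglik(x, q):
        return x * np.log(q) + (1 - x) * np.log(1 - q)
    info = 0.0
    for x, w in [(1, p), (0, 1 - p)]:   # expectation over x in {0, 1}
        d2 = (loglik(x, p + h) - 2 * loglik(x, p) + loglik(x, p - h)) / h**2
        info += w * (-d2)
    return info

p = 0.3
jeffreys = np.sqrt(fisher_info_bernoulli(p))
exact = p ** -0.5 * (1 - p) ** -0.5   # closed form, since I(p) = 1/(p(1-p))
```

The same finite-difference device applies to any of the models below for which only the density is listed.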
The Reference Prior: This approach was developed in Bernardo (1979), and modified for multiparameter problems in Berger and Bernardo (1992c). The approach cannot be simply described, but it can be roughly thought of as trying to modify the Jeffreys prior by reducing the dependence among parameters that is frequently induced by the Jeffreys prior; there are many well-known examples in which the Jeffreys prior yields poor performance (even inconsistency) because of this dependence.
The Maximal Data Information Prior (MDIP): This approach was developed in Zellner (1971), based on an information argument. It is given by π(θ) = exp{∫ p(x|θ) log p(x|θ) dx}, where p(x|θ) is the data density function.
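For a single Bernoulli(p) observation this definition gives π(p) ∝ p^p (1−p)^{1−p}, which is the MDIP entry in the Binomial section below; the following sketch (our own numerical check, not part of the catalog) recovers the normalizing constant 1.6186 quoted there:

```python
import numpy as np

# For one Bernoulli(p) observation, E_p[log f(x|p)] = p log p + (1-p) log(1-p),
# so the MDIP is pi(p) proportional to p^p (1-p)^(1-p).  Recover its
# normalizing constant by simple numerical integration on a fine grid.
p = np.linspace(1e-8, 1.0 - 1e-8, 1_000_001)
density = p**p * (1 - p)**(1 - p)
integral = density.mean() * (p[-1] - p[0])   # uniform-grid approximation
const = 1.0 / integral
```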

2 Organization and Notation


The catalog is organized around statistical models, with the models listed in alphabetical order. Each model entry is kept as self-contained as possible. Listed for each are (i) the model density, (ii) various noninformative priors, and (iii) certain of the resulting posteriors and their properties. Category (iii) information is often very limited.
Notation is standard. This includes π(θ|D) for the posterior density, |A| = determinant of A, and π(θ), a density with respect to dθ.
Important Notation: Noninformative priors that are proper (i.e., integrate to 1) are given in bold type. Others are improper. (The distinction is important for testing problems, where proper distributions are typically needed; for estimation and prediction, improper noninformative priors are typically fine.)

3 AR(1)
The AR(1) model, in which the data X = (X_1, ..., X_T) follow the model

    X_t = ρ X_{t−1} + ε_t,

where the ε_t are i.i.d. N(0, σ²).

The expressions below are for σ² known. If σ² is unknown, multiply by (1/σ) dσ (or (1/σ²) dσ²).

Prior        π(ρ)                                                               (Marginal) Posterior
Uniform      1
Jeffreys     [T/(1−ρ²) + (1−ρ^{2T})/(1−ρ²)² · (E[X_0²]/σ² − 1/(1−ρ²))]^{1/2}
Reference¹   exp{(1/2) E[log(Σ_{i=1}^T X_{i−1}²)]}                              all are proper
Reference²   [2π√(1−ρ²)]^{−1} if |ρ| < 1;  [2π|ρ|√(ρ²−1)]^{−1} if |ρ| > 1

1. Nonasymptotic reference prior.
2. Symmetrized reference prior, recommended for typical use. See Berger and Yang (1994) for comparison of the noninformative priors.

4 Behrens-Fisher Problem
Let x_1, ..., x_n be i.i.d. observations from N(μ_1, σ_1²) and y_1, ..., y_n be i.i.d. observations from N(μ_2, σ_2²); the parameters of interest are θ = μ_1 − μ_2 and η = μ_1 + μ_2.
Liseo (1993) computed the Jeffreys prior and reference prior for this problem as

Prior        π(μ_1, μ_2, σ_1², σ_2²)      (Marginal) Posterior
Uniform      1
Jeffreys     1/(σ_1 σ_2)³                 proper
Reference¹   1/(σ_1 σ_2)²

1. Independent of the group ordering of the parameters.

5 Beta
The Be(α, β), α > 0, β > 0, density is

    f(x|α, β) = [Γ(α+β)/(Γ(α)Γ(β))] x^{α−1} (1−x)^{β−1} I_{[0,1]}(x).

The Fisher information matrix is

    I(α, β) = ( PG(1,α) − PG(1,α+β)      −PG(1,α+β)
                −PG(1,α+β)               PG(1,β) − PG(1,α+β) ),

where PG(1,x) = Σ_{i=0}^∞ (x+i)^{−2} is the polygamma function. The Jeffreys prior is thus the square root of the determinant of the Fisher information matrix.
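This prior is easy to evaluate with standard special-function libraries; a minimal sketch (the function name and test values are our own):

```python
import numpy as np
from scipy.special import polygamma

def jeffreys_beta(a, b):
    """Unnormalized Jeffreys prior for the Be(a, b) model:
    sqrt(det I(a, b)), with I built from polygamma terms PG(1, .)."""
    pg_ab = polygamma(1, a + b)
    info = np.array([[polygamma(1, a) - pg_ab, -pg_ab],
                     [-pg_ab, polygamma(1, b) - pg_ab]])
    return np.sqrt(np.linalg.det(info))
```

For example, at a = b = 1 the matrix entries reduce to values of the Riemann zeta function at 2.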

6 Binomial
The B(n, p), 0 ≤ p ≤ 1, density is

    f(x|n, p) = (n choose x) p^x (1−p)^{n−x}.

Case 1: Priors for p, given n

Prior                 π(p|n)                             (Marginal) Posterior
Uniform               1                                  Be(p|x+1, n−x+1)
Jeffreys/Reference    π^{−1} p^{−1/2} (1−p)^{−1/2}       Be(p|x+1/2, n−x+1/2)
MDIP                  1.6186 p^p (1−p)^{(1−p)}           proper
Novick and Hall's¹    p^{−1} (1−p)^{−1}                  Be(p|x, n−x)

1. See Novick and Hall (1965). Note this prior is uniform in the logit, log[p/(1−p)].

Case 2: Priors for n

Prior        π(n)                                                 (Marginal) Posterior
Uniform      1
Jeffreys     n^{−1/2}
Reference¹   1/n
Universal²   1/[2.865 n log(n) log log(n) ··· log log ··· log(n)]

1. Discussed by Alba and Mendoza (1995).
2. The product in the denominator is taken up to the last term for which log log ··· log(n) > 1. See Rissanen (1983).
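Since the Jeffreys/reference posterior in Case 1 is a standard beta distribution, posterior summaries are one-liners; a sketch with made-up data (n = 20, x = 6 are our own illustrative numbers):

```python
from scipy.stats import beta

n, x = 20, 6                               # hypothetical data
post = beta(x + 0.5, n - x + 0.5)          # Be(p | x + 1/2, n - x + 1/2)
post_mean = post.mean()                    # equals (x + 1/2) / (n + 1)
ci = (post.ppf(0.025), post.ppf(0.975))    # equal-tailed 95% credible interval
```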

7 Bivariate Binomial
    f(r, s|p, q, m) = (m choose r) p^r (1−p)^{m−r} (r choose s) q^s (1−q)^{r−s}

for r = 1, ..., m and s = 1, ..., r. Polson and Wasserman (1990) compute the Fisher information matrix of this distribution as I(p, q) = m · diag({p(1−p)}^{−1}, p{q(1−q)}^{−1}).

Prior                      π(p, q|m)                                                   (Marginal) Posterior
Uniform                    1
Jeffreys                   (2π)^{−1} (1−p)^{−1/2} q^{−1/2} (1−q)^{−1/2}                proper
Reference¹                 π^{−2} p^{−1/2} (1−p)^{−1/2} q^{−1/2} (1−q)^{−1/2}
Reference²                 (2π)^{−1} (1−p)^{−1/2} q^{−1/2} (1−q)^{−1/2} (1−pq)^{−1/2}
Crowder and Sweeting's³    p^{−1} (1−p)^{−1} q^{−1} (1−q)^{−1}

1. Parameter of interest is p or q.
2. Parameter of interest is φ = pq or φ = p(1−q)/(1−pq).
3. See Crowder and Sweeting (1989).

8 Box-Cox Power Transformed Linear Model


Given observations {y_1, ..., y_n}, the model is

    z_i^{(λ)} = { (y_i^λ − 1)/λ,  λ ≠ 0        = α + x_i^t β + σ ε_i,
                { ln y_i,         λ = 0

where β is a (k × 1) vector of regression coefficients, x_i is a vector of covariates, and ε_i ~ N(0, 1) truncated at −(α + x_i^t β)/σ.
The Jeffreys prior was obtained by Pericchi (1981):

    π(α, β, σ, λ) ∝ p(λ)/σ^{k+1},

where p(λ) is some unspecified prior for λ.
Box and Cox (1964) proposed the prior

    π(α, β, σ, λ) ∝ g^{−(k+1)(λ−1)} σ^{−1},

where g is the geometric mean of the y's.

Based on the so-called data-translated parameterization,

    z_i^{(λ)} = μ_λ + x_i^t β + σ ε_i,

corresponding to μ_λ = (μ^λ − 1)/λ (λ ≠ 0) or μ_λ = ln μ (λ = 0), Wixley (1993) proposed the following two priors:

    π(α, β, σ, λ) ∝ g^{−(k+1)(λ−1)} σ^{−1} p(λ),

where g is the geometric mean of the y's and p(λ) is some unspecified prior for λ. This prior differs from the prior of Box and Cox (1964) only by the factor p(λ). It is also the prior used in Box and Tiao (1992).

    π(μ_λ, β, σ, λ) ∝ [1 + λ μ_λ]^{−(k+1)(1−1/λ)} σ^{−1} p(λ)
                    ∝ μ^{−(k+1)(λ−1)} σ^{−1} p(λ),

where p(λ) is again some unspecified prior for λ. This prior gives a posterior distribution very close to that of Box and Cox (1964).

9 Cauchy
The C(α, β), −∞ < α < ∞, β > 0, density is

    f(x|α, β) = β/{π[β² + (x−α)²]}.

This is a location-scale parameter problem; see that section for priors. Posterior analysis can be found in Spiegelhalter (1985) and Howlader and Weiss (1988).

10 Dirichlet
The D(α) density, where Σ_{i=1}^k x_i = 1, 0 ≤ x_i ≤ 1, and α = (α_1, ..., α_k)^t, α_i > 0 for all i, is given by

    f(x|α) = [Γ(α_0)/∏_{i=1}^k Γ(α_i)] ∏_{i=1}^k x_i^{α_i − 1},

where α_0 = Σ_{i=1}^k α_i. The Fisher information matrix is

    I(α_1, ..., α_k) = ( PG(1,α_1) − PG(1,α_0)    ···    −PG(1,α_0)
                               ⋮                   ⋱          ⋮
                         −PG(1,α_0)               ···    PG(1,α_k) − PG(1,α_0) ),

where PG(1,x) = Σ_{i=0}^∞ (x+i)^{−2} is the polygamma function. The Jeffreys prior is |I(α_1, ..., α_k)|^{1/2}.

11 Exponential Regression Model


See Ye and Berger (1991):

    Y_ij ~ N(α + β λ^{x + a x_i}, σ²),

where α, β ∈ R, 0 < λ < 1, x ≥ 0 and a > 0 are known constants, 0 ≤ i ≤ k−1, 1 ≤ j ≤ m, the x_i's are known nonnegative regressors with x_i ≠ x_j for i ≠ j, and the variance σ² > 0 is an unknown constant. It is assumed that x_i < x_j for i < j.

Prior        π(α, β, λ, σ)                   (Marginal) Posterior
Uniform      1
ad hoc       1/σ                             improper
Jeffreys     |β| λ^{2x−1} p_1(λ^a)/σ⁴        proper; 2-dimensional numerical integration
Reference¹   λ^{x−1} p(λ^a)/σ               proper; 1-dimensional numerical integration
Reference²   λ^{x−1} p(λ^a)/σ²
Reference³   λ^{x−1} p(λ^a)/σ³

where p(w) = p_1(w)/p_2(w), and

    p_1(w) = (Σ_{i=0}^{k−1} w^{2x_i} − (1/k)(Σ_{i=0}^{k−1} w^{x_i})²)(Σ_{i=0}^{k−1} x_i² w^{2x_i} − (1/k)(Σ_{i=0}^{k−1} x_i w^{x_i})²)
             − (Σ_{i=0}^{k−1} x_i w^{2x_i} − (1/k) Σ_{i=0}^{k−1} w^{x_i} Σ_{i=0}^{k−1} x_i w^{x_i})²,

    p_2(w) = Σ_{i=0}^{k−1} w^{2x_i} − (1/k)(Σ_{i=0}^{k−1} w^{x_i})².

1. Group ordering {λ, α, β, σ} or {λ, (α, β), σ}, or with all permutations of α, β, σ; this is recommended for typical use, and appears to be approximately frequentist matching.
2. Group ordering {λ, σ, (α, β)}, and with all permutations of α, β.
3. Group ordering {λ, (α, β, σ)}, and with all permutations of α, β, σ.
The marginal posterior of λ under the prior σ^{−s} λ^{x−1} p(λ^a) is available in closed form up to a one-dimensional normalizing integral; see Ye and Berger (1991) for the exact expression.

12 F Distribution
The F(α, β), α > 0, β > 0, density is

    f(x|α, β) = [Γ((α+β)/2)/(Γ(α/2)Γ(β/2))] α^{α/2} β^{β/2} x^{α/2−1} (β + αx)^{−(α+β)/2} I_{(0,∞)}(x).

13 Gamma
The G(α, β), α > 0, β > 0, density is

    f(x|α, β) = [1/(Γ(α) β^α)] x^{α−1} e^{−x/β} I_{(0,∞)}(x).

If α is known, β is a scale parameter. The Jeffreys, reference, and MDIP priors are all 1/β, and the posterior is therefore IG(nα, 1/Σ_{i=1}^n x_i).
The Fisher information matrix is

    I(α, β) = ( PG(1,α)    1/β
                1/β        α/β² ),

where PG(1,x) = Σ_{i=0}^∞ (x+i)^{−2} is the polygamma function.

Prior        π(α, β)                     (Marginal) Posterior
Uniform      1
Jeffreys     √(α PG(1,α) − 1)/β
Reference¹   √(PG(1,α) − 1/α)/β          proper³
Reference²   √PG(1,α)/β

1. Group ordering {α, β}.
2. Group ordering {β, α}.
3. See Liseo (1993) and Sun and Ye (1994b) for marginal posterior.
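The Jeffreys entry above is straightforward to evaluate numerically; a sketch (the function name and the test point are our own):

```python
import numpy as np
from scipy.special import polygamma

def jeffreys_gamma(alpha, beta):
    """Unnormalized Jeffreys prior for the G(alpha, beta) model:
    sqrt(det I) = sqrt(alpha * PG(1, alpha) - 1) / beta."""
    return np.sqrt(alpha * polygamma(1, alpha) - 1.0) / beta

# at alpha = 1, PG(1, 1) = pi^2/6, so the value is sqrt(pi^2/6 - 1)/beta
val = jeffreys_gamma(1.0, 2.0)
```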

14 Generalized Linear Model


Let y_1, ..., y_n be independent observations from the exponential family density

    f(y_i|θ_i, τ) = exp{a_i^{−1}(τ)(y_i θ_i − b(θ_i)) + c(y_i, τ)},

where the a_i(·), b(·) and c(·) are known functions, and a_i(τ) is of the form a_i(τ) = τ/w_i, where the w_i's are known weights. The θ_i's are related to the regression coefficients β = (β_1, ..., β_p)^t by the link function

    θ_i = η(ν_i),  i = 1, ..., n,

where ν_i = x_i^t β, x_i^t = (x_{i1}, ..., x_{ip}) is the 1 × p vector denoting the ith row of the n × p matrix of covariates X, and η is a monotonic differentiable function.
This model includes a large class of regression models, such as normal linear regression, logistic and probit regression, Poisson regression, gamma regression, and some proportional hazards models.
Ibrahim and Laud (1991) studied this model using the Jeffreys prior. They focus on the case where τ is known. The Fisher information matrix they obtained for β is I(β) = τ^{−1}(X^t W V(β) Δ²(β) X), where W is an n × n diagonal matrix with ith diagonal element w_i, and V(β) and Δ(β) are n × n diagonal matrices with ith diagonal elements v_i = v(x_i^t β) = d²b(θ_i)/dθ_i² and δ_i = δ(x_i^t β) = dθ_i/dν_i, respectively. The Jeffreys prior is thus given by

    π(β) ∝ |X^t W V(β) Δ²(β) X|^{1/2}.

Ibrahim and Laud (1991) show that the Jeffreys prior can lead to an improper posterior for this model, but that the posterior is proper for most GLM's. A sufficient condition for the posterior to be proper is that the integral

    ∫_S exp{−τ^{−1} w (y r − b(r))} [d²b(r)/dr²]^{1/2} dr

be finite, the likelihood function of β be bounded above, and X be of full rank. Here S denotes the range of θ.
In addition, Ibrahim and Laud (1991) give a necessary and sufficient condition for the Jeffreys prior to be proper, namely

    ∫_S [d²b(r)/dr²]^{1/2} dr < ∞.

When τ is unknown, the Fisher information matrix is

    I(β, τ) = ( I_1(β)    0
                0         I_2(τ) ),

where I_1(β) = τ^{−1}(X^t W V(β) Δ²(β) X) as above, and

    I_2(τ) = −Σ_{i=1}^n {2 w_i τ^{−3} (ḃ(θ_i) θ_i − b(θ_i)) + E(c̈(y_i, τ))};

here ḃ(θ_i) = db(θ_i)/dθ_i and c̈ = ∂²c(y_i, τ)/∂τ². The Jeffreys prior is then the square root of the determinant of I(β, τ).
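As an illustration of the |X^t W V(β) Δ²(β) X|^{1/2} formula, the following sketch specializes it under assumptions of our own choosing: Poisson regression with the canonical log link (so Δ = I and v_i = exp(x_i^t β)), unit weights, and τ = 1:

```python
import numpy as np

def jeffreys_poisson_reg(beta, X):
    """Unnormalized Jeffreys prior |X^t V(beta) X|^{1/2} for Poisson
    regression with canonical log link: Delta = I, v_i = exp(x_i^t beta),
    unit weights (w_i = 1) and tau = 1."""
    v = np.exp(X @ beta)                  # diagonal of V(beta)
    info = X.T @ (v[:, None] * X)         # X^t V X
    return np.sqrt(np.linalg.det(info))

# hypothetical 2x2 design: intercept plus one covariate
X = np.array([[1.0, 0.0],
              [1.0, 1.0]])
val = jeffreys_poisson_reg(np.zeros(2), X)   # v = (1, 1), so X^t X has det 1
```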

15 Inverse Gamma
The IG(α, β), α > 0, β > 0, density is

    f(x|α, β) = [1/(Γ(α) β^α)] x^{−(α+1)} e^{−1/(βx)} I_{(0,∞)}(x).

If α is known, 1/β is a scale parameter. The Jeffreys, reference, and MDIP priors are 1/β, and the posterior is IG(nα, 1/Σ_{i=1}^n (1/x_i)).
If α is unknown, the Fisher information matrix is the same as that of the gamma distribution. Thus the reference prior and the Jeffreys prior are the same as for the gamma distribution.

16 Inverse Normal or Gaussian


The IN(μ, λ), μ > 0, λ > 0, density is

    f(x|μ, λ) = [λ/(2π)]^{1/2} x^{−3/2} exp{−λ(x−μ)²/(2μ²x)} I_{(0,∞)}(x).

The Fisher information matrix is given by I(μ, λ) = diag(λ/μ³, 1/(2λ²)).

Prior        π(μ, λ)               (Marginal) Posterior
Uniform      1
Jeffreys     μ^{−3/2} λ^{−1/2}
Scale        1/(μλ)                proper²
Reference¹   μ^{−1/2} λ^{−1}

1. See Liseo (1993).
2. See Sun and Ye (1994b) for marginal posteriors.

Define ν = n − 1, ω = (u x̄/ν)^{−1/2}, q = (ω x̄)^{−1}, and x̄ = Σ x_i/n. Banerjee and Bhattacharyya (1979) show that the marginal posterior of 1/μ is a left-truncated t-distribution with ν degrees of freedom, location parameter 1/x̄, and scale parameter q, the point of truncation being zero; i.e.,

    π(μ|D) ∝ {1 + [1/(νq²)](1/μ − 1/x̄)²}^{−(ν+1)/2},

where u = x̄_r − 1/x̄ and x̄_r = (1/n) Σ (1/x_i). The marginal posterior of λ is the modified gamma distribution

    π(λ|D) ∝ [(nu/2)^{ν/2}/Γ(ν/2)] λ^{ν/2−1} exp(−nuλ/2) Φ((nλ/x̄)^{1/2}) / H_ν(ω),

where Φ(·) is the standard normal cdf and H_ν(·) denotes the cdf of Student's t-distribution with ν degrees of freedom. The posterior mean and variance are also available; see Banerjee and Bhattacharyya (1979) for more detail.
Another parametrization of the inverse Gaussian distribution, the IN(μ, σ²), μ > 0, σ² > 0, density is

    f(x|μ, σ²) = (2πσ²)^{−1/2} x^{−3/2} exp{−(x−μ)²/(2μ²σ²x)} I_{(0,∞)}(x).

The Fisher information matrix is given by I(μ, σ²) = diag(μ^{−3}σ^{−2}, (2σ⁴)^{−1}).

Prior        π(μ, σ²)             (Marginal) Posterior
Uniform      1
Jeffreys     μ^{−3/2} σ^{−3}
Reference¹   μ^{−3/2} σ^{−2}

1. See Datta and Ghosh (1993).

17 Linear Calibration
Consider the model

    y_i = α + β x_i + ε_{1i},    i = 1, ..., n,
    y_{0j} = α + β x_0 + ε_{2j},  j = 1, ..., k,

where α, β, y_i, and y_{0j} are (p × 1) vectors, the x_i's are known values of the precise measurements, and the ε_{1i} and ε_{2j} are i.i.d. N_p(0, σ² I_p). The object is to predict x_0.
Denote x̄ = Σ_{i=1}^n x_i/n, ȳ = Σ_{i=1}^n y_i/n, ȳ_0 = Σ_{j=1}^k y_{0j}/k, c_x = Σ_{i=1}^n (x_i − x̄)², β̂ = Σ_{i=1}^n (x_i − x̄)(y_i − ȳ)/c_x, and let s_1 and s_2 be the two sums of residual squared errors based on the calibration and prediction experiments, respectively. The statistics β̂, ȳ, ȳ_0, and s = s_1 + s_2 are minimal sufficient and mutually independent. Kubokawa and Robert (1994) then consider the reduced model

    y ~ N_p(β*, σ² I_p),
    z ~ N_p(x_0* β*, σ² I_p),
    s ~ σ² χ²_q,

where y = c_x^{1/2} β̂, z = (ȳ_0 − ȳ)(n^{−1} + k^{−1})^{−1/2}, q = (n + k − 3)p, β* = c_x^{1/2} β, and x_0* = (x_0 − x̄) c_x^{−1/2} (n^{−1} + k^{−1})^{−1/2}.
The Fisher information matrix for the reduced model is given by

    I(β*, σ², x_0*) = ( [(1+x_0*²)/σ²] I_p     0                x_0* β*/σ²
                        0                      (q+2p)/(2σ⁴)    0
                        x_0* β*^t/σ²           0               ||β*||²/σ² ).

Prior        π(x_0*, β*, σ²)                                  (Marginal) Posterior
Uniform      1
Jeffreys     (1 + x_0*²)^{(p−1)/2} ||β*|| (σ²)^{−(p+3)/2}
Reference¹   (σ²)^{−(p+2)/2} / √(1 + x_0*²)                   proper

1. With respect to the group ordering {x_0*, (β*, σ²)}. See Kubokawa and Robert (1994).

18 Location-Scale Parameter Models


Location Parameter Models:
The LP(θ), θ ∈ R^p, density is

    f(y|θ) = g(y − Xθ),

where X is an (n × p) constant matrix and g is an n-dimensional density function.

Prior        π(θ)                            (Marginal) Posterior
Uniform¹     1                               proper if rank(X^t X) = p
Reference²   [θ^t (X^t X) θ]^{−(p−2)/2}      proper if rank(X^t X) = p > 2

1. Also the Jeffreys and the usual reference prior.
2. Baranchik (1964) "shrinkage" prior; also the reference prior for (||θ||, O), where θ = O(||θ||, 0, ..., 0)^t. This also arises from admissibility considerations; see Berger and Strawderman (1993).

Location-Scale Parameter Models:
The LSP(θ, σ), θ ∈ R^p, σ > 0, density is

    f(y|θ, σ) = (1/σ^n) g((y − Xθ)/σ),

where X is an (n × p) constant matrix and g is an n-dimensional density function.

Prior        π(θ, σ)
Uniform      1
Jeffreys     1/σ^{p+1}
Reference¹   1/σ
Reference²   (1/σ)[θ^t (X^t X) θ]^{−(p−2)/2}
Reference³   σ^{−1}(c_1 σ² + c_2 ||θ||²)^{−1/2}

1. Reference with respect to the group ordering {θ, σ}; also the prior actually recommended by Jeffreys (1961).
2. Baranchik (1964) "shrinkage" prior; also the reference prior with respect to the parameters {||θ||, O, σ} (see Location Parameter Models).
3. φ = ||θ||/σ is the parameter of interest, selecting rectangular compacts on φ and the nuisance parameter c_1 σ² + c_2 ||θ||². See Datta and Ghosh (1993).

Scale Parameter Models:
The SP(σ_1, ..., σ_p), σ_i > 0 for all i, density is

    f(y|σ_1, ..., σ_p) = (∏_{i=1}^p σ_i^{−n_i}) g(y_1/σ_1, y_2/σ_2, ..., y_p/σ_p),

where the y_i are (n_i × 1) vectors, i = 1, ..., p, and g is an (n_1 + ··· + n_p)-dimensional density function.

Prior                       π(σ_1, ..., σ_p)
Uniform                     1
Jeffreys/Reference/MDIP     ∏_{i=1}^p σ_i^{−1}

19 Logit Model
Poirier (1992) analyzes the Jeffreys prior for the conditional logit model as follows. An experiment consists of N trials. On trial n exactly one of J_n discrete alternatives is observed. Let y_{nj} be a binary variable which equals unity if alternative j is observed on trial n; otherwise y_{nj} equals zero. Let z_n = [z_{n1}^t, ..., z_{nJ_n}^t]^t, with each z_{nj} being a K × 1 vector, and let γ be a K × 1 vector of unknown parameters. The probability of alternative j on trial n is specified to be

    p_{nj}(γ) = Prob(y_{nj} = 1 | z_n, γ) = exp(z_{nj}^t γ) / Σ_{i=1}^{J_n} exp(z_{ni}^t γ).

When J_n ≡ J and z_{nj} = e_j ⊗ x_n, where e_j is a (J−1) × 1 vector with all elements equal to zero except the jth, which equals unity, and x_n is an M × 1 vector of observable characteristics of trial n, this is the multinomial logit model.
Poirier (1992) gives the Jeffreys prior for this model as

    π(γ) ∝ | Σ_{n=1}^N Σ_{j=1}^{J_n} p_{nj}(γ) [z_{nj} − z̃_n(γ)] [z_{nj} − z̃_n(γ)]^t |^{1/2},

where z̃_n(γ) = Σ_{i=1}^{J_n} p_{ni}(γ) z_{ni}. The following special cases are also given therein.

Multinomial Logit Without Covariates:
Suppose K = J − 1 and z_{nj} = e_j; then the Jeffreys prior reduces to the proper density

    π(γ) = [Γ(J/2)/π^{J/2}] ∏_{j=1}^J [p_j(γ)]^{1/2},

where p_j = p_{nj} for any n. The corresponding density for p = (p_1, ..., p_{J−1}) is the Dirichlet density

    π(p) = [Γ(J/2)/π^{J/2}] ∏_{j=1}^J p_j^{−1/2}.

In the binomial case,

    π(γ) = π^{−1} {p_1(γ)[1 − p_1(γ)]}^{1/2} = exp(γ/2)/{π[1 + exp(γ)]},

and

    π(p_1) = π^{−1} [p_1(1 − p_1)]^{−1/2}.

The other competing noninformative priors are: the uniform prior on γ; the uniform prior on p_1; and the MDIP prior for p_1. See Geisser (1984) for discussion.

Logistic Regression:
Suppose the y_i are independent Bernoulli(θ_i), with

    θ_i = e^{x_i^t β}/(1 + e^{x_i^t β}),  i = 1, ..., n.

Then the likelihood function is

    f(y|β) = e^{Σ_{i=1}^n y_i x_i^t β} / ∏_{i=1}^n (1 + e^{x_i^t β}),

and the Fisher information matrix is

    I(β) = (x_1, ..., x_n) diag(θ_1(1−θ_1), ..., θ_n(1−θ_n)) (x_1, ..., x_n)^t = Σ_{i=1}^n θ_i(1−θ_i) x_i x_i^t.

The Jeffreys prior is the square root of the determinant of the above matrix.
For more special cases of the logit model, see Poirier (1992).
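The two closed forms given above for the binomial case can be checked against each other numerically (a quick sketch; the evaluation point γ = 0.7 is an arbitrary choice of ours):

```python
import math

# Binomial logit: p1(gamma) = e^gamma / (1 + e^gamma).  The Jeffreys prior
# pi(gamma) = (1/pi) sqrt(p1 (1 - p1)) should equal exp(gamma/2)/(pi (1 + exp(gamma))).
gamma = 0.7
p1 = math.exp(gamma) / (1 + math.exp(gamma))
form1 = math.sqrt(p1 * (1 - p1)) / math.pi
form2 = math.exp(gamma / 2) / (math.pi * (1 + math.exp(gamma)))
```

The two expressions agree exactly, since p_1(1 − p_1) = e^γ/(1 + e^γ)².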

20 Mixed Model
Consider the mixed model

    y_{ijk} = B_{ijk} β_b + W_{ijk} β_w + T_i + C_j + ε_{ijk},  i = 1, ..., I; j = 1, ..., J; k = 1, ..., t_{ij},

assuming ε_{ijk} ~ N(0, σ²) and, independently, C_j ~ N(0, σ_c²), with β_b, β_w, T_i, σ², and σ_c² the parameters of interest.
Box and Tiao (1992) used the following prior in the balanced mixed model case, with t_{ij} = n for all i, j:

    π(σ_c², σ²) ∝ 1/[σ²(σ² + n σ_c²)].

Chaloner (1987) used three priors for this model: (σ² + σ_c²)^{−1}, σ^{−2}(σ² + σ_c²)^{−1}, and σ^{−2}(σ² + σ_c²)^{−3/2}.
Yang and Pyne (1996) derived the Jeffreys prior,

    π(σ_c², σ²) ∝ [ Det ( Σ_{j=1}^J ( t_j²/[2(t_j σ_c² + σ²)²]      t_j/[2(t_j σ_c² + σ²)²]
                                      t_j/[2(t_j σ_c² + σ²)²]      (t_j − 1)/(2σ⁴) + 1/[2(t_j σ_c² + σ²)²] ) ) ]^{1/2},

where t_j = Σ_{i=1}^I t_{ij}, and the reference prior,

    π(σ_c², σ²) ∝ (1/σ²) [ Σ_{j=1}^J t_j²/(t_j σ_c² + σ²)² ]^{1/2}.

21 Mixture Model
For arbitrary density functions p_1(x) and p_2(x), consider the model

    p(x|θ) = θ p_1(x) + (1 − θ) p_2(x).

Bernardo and Girón (1988) discussed the reference prior for this model. They found that the reference prior is always proper.

22 Multinomial
The M(n, p) density, where Σ_{i=1}^{k+1} x_i = n, each x_i is an integer between 0 and n, and p = (p_1, ..., p_{k+1})^t with Σ_{i=1}^{k+1} p_i = 1 and 0 ≤ p_i ≤ 1 for all i, is given by

    f(x|p) = [n!/∏_{i=1}^{k+1} (x_i!)] ∏_{i=1}^{k+1} p_i^{x_i}.

Prior                π(p)                                                                                           (Marginal) Posterior
Uniform              1
Jeffreys             C_k^{−1} (∏_{i=1}^k p_i^{−1/2}) (1 − τ_k)^{−1/2}
Reference¹                                                                                                          all are proper
Reference²           π^{−k} ∏_{i=1}^k [p_i^{−1/2} (1 − τ_i)^{−1/2}]
Reference³           (∏_{i=1}^m C_{n_i}^{−1}) (∏_{i=1}^k p_i^{−1/2}) (∏_{i=1}^{m−1} (1 − τ_{N_i})^{−n_{i+1}/2}) (1 − τ_{N_m})^{−1/2}
Reference⁴           φ^{−1/2} (1 + φ)^{−1/2} ∏_{i=2}^k {p_i^{−1/2} (1 − p_2(1 + φ) − Σ_{j=3}^i p_j)^{−1/2}}
MDIP                 p_1^{p_1} p_2^{p_2} ··· p_k^{p_k} (1 − Σ_{i=1}^k p_i)^{1 − Σ_{i=1}^k p_i}
Novick and Hall's⁵   ∏_{i=1}^k p_i^{−1}

Here τ_j = Σ_{i=1}^j p_i, C_{2l−1} = π^l/(l−1)!, and C_{2l} = (2π)^l/[(2l−1)(2l−3) ··· (1)], for all positive integers l. See Berger and Bernardo (1992b), Zellner (1993), and Bernardo and Ramón (1996).
1. One-group reference prior.
2. k-group reference prior.
3. m-group reference prior. N_i and n_i are as defined in Berger and Bernardo (1992c). The posterior for the m-group reference prior, which includes the posteriors for Reference¹ and Reference² as special cases, is

    π(p|D) ∝ (∏_{i=1}^k p_i^{x_i − 1/2}) (∏_{i=1}^{m−1} (1 − τ_{N_i})^{−n_{i+1}/2}) (1 − τ_{N_m})^{n − r − 1/2}.

4. k-group reference prior, where φ = p_1/p_2 is the parameter of interest. The reference posterior for φ is

    π(φ|D) ∝ φ^{x_1 − 1/2} / (1 + φ)^{x_1 + x_2 + 1}.

5. See Novick and Hall (1965).
Sono (1983) also derived a noninformative prior for this model, using the assumption of prior independence of transformed parameters and an approximate data-translated likelihood function.
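The Jeffreys prior above is the Dirichlet(1/2, ..., 1/2) distribution over the k+1 cell probabilities, so posterior simulation is direct; a sketch with hypothetical cell counts of our own choosing:

```python
import numpy as np

rng = np.random.default_rng(1)
counts = np.array([12, 7, 3, 8])      # hypothetical multinomial cell counts
alpha_post = counts + 0.5             # Jeffreys prior = Dirichlet(1/2, ..., 1/2)
draws = rng.dirichlet(alpha_post, size=50_000)
post_mean = draws.mean(axis=0)        # approx alpha_post / alpha_post.sum()
```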

23 Negative Binomial
The NB(λ, p), λ > 0, 0 < p ≤ 1, density is

    f(x|λ, p) = [Γ(λ + x)/(Γ(x + 1)Γ(λ))] p^λ (1 − p)^x.

λ is given:

Prior        π(p)                 (Marginal) Posterior
Uniform      1                    Be(p|λ + 1, x + 1)
Jeffreys     1/[p√(1 − p)]        Be(p|λ, x + 1/2)
Reference

24 Neyman and Scott Example


This model consists of 2n independent observations,

    X_ij ~ N(μ_i, σ²),  i = 1, ..., n; j = 1, 2.

Prior        π(σ, μ_1, ..., μ_n)      (Marginal) Posterior
Uniform      1
Jeffreys     σ^{−(n+1)}
Reference¹   σ^{−1}                   proper
Reference²   σ^{−n}

1. Group ordering {σ, (μ_1, ..., μ_n)} or {σ, μ_1, ..., μ_n}. Yields a sensible posterior. The posterior mean of σ² is S²/(n−2), with S² = Σ_{i=1}^n Σ_{j=1}^2 (X_ij − X̄_i)², X̄_i = (X_{i1} + X_{i2})/2. See Berger and Bernardo (1992c) for discussion.
2. Group ordering {μ_1, (μ_2, ..., μ_n, σ)}. Strong inconsistency occurs for this prior, along with the Jeffreys prior. The posterior mean of σ² for the Jeffreys prior is S²/(2n−2).

25 Nonlinear Regression Model


Eaves (1983) considered the following nonlinear regression model:

    y = g(θ) + σ e,

where the n × 1 vector-valued design function g of a d-dimensional vector parameter θ is no longer assumed linear, as in linear regression, although it remains 1-1 and smooth. The noninformative prior proposed therein is

    π(θ, σ) ∝ |I(θ)|^{1/2}/σ,

where

    I(θ) = (1/2) E(D ||y − g(θ)||² | θ) = [d g(θ)]^t [d g(θ)]

is the regression information matrix. Here D is the d × d matrix-valued second-order partial derivative operator; d g(θ) is the n × d Jacobian of g. Note this prior is also the reference prior provided that θ and σ are in separate groups.

26 Normal

Univariate Normal:
The N(μ, σ²), −∞ < μ < ∞, σ² > 0, density is

    f(x|μ, σ²) = (2πσ²)^{−1/2} e^{−(x−μ)²/(2σ²)}.

σ² Known:
Prior     π(μ)     (Marginal) Posterior
All       1        π(μ|D) = N(x̄, σ²/n)

Here x̄ = (1/n) Σ_{i=1}^n x_i.

μ Known:
Prior                       π(σ²)     (Marginal) Posterior
Uniform                     1         π(σ²|D) ~ IG((n−2)/2, 2/S²)
Jeffreys/Reference/MDIP     1/σ²      π(σ²|D) ~ IG(n/2, 2/S²)

Here S² = Σ_{i=1}^n (x_i − μ)².

μ and σ² Both Unknown:

Prior             π(μ, σ²)                              (Marginal) Posterior
Uniform           1                                     π(μ|D) ~ T(n−3, x̄, S²/[n(n−3)]);
                                                        π(σ²|D) ~ IG((n−3)/2, 2/S²)
Jeffreys          1/σ⁴                                  π(μ|D) ~ T(n+1, x̄, S²/[n(n+1)]);
                                                        π(σ²|D) ~ IG((n+1)/2, 2/S²)
Reference¹        π(μ, σ) ∝ (2σ² + μ²)^{−1/2} σ^{−1}
Reference²/MDIP   1/σ²                                  π(μ|D) ~ T(n−1, x̄, S²/[n(n−1)]);
                                                        π(σ²|D) ~ IG((n−1)/2, 2/S²)

Here x̄ = (1/n) Σ_{i=1}^n x_i, and S² = Σ_{i=1}^n (x_i − x̄)².
1. θ = μ/σ is the parameter of interest, parameter ordering {θ, σ}. See Bernardo and Smith (1994).
2. If μ and σ² are in separate groups.
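Under the reference/MDIP prior 1/σ² the posterior factorizes conveniently, so Monte Carlo draws are immediate. A small sketch (the simulated data are our own, purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(5.0, 2.0, size=50)     # simulated data, purely illustrative
n, xbar = len(x), x.mean()
S2 = ((x - xbar) ** 2).sum()

# Under pi(mu, sigma^2) = 1/sigma^2:
#   sigma^2 | D ~ IG((n-1)/2, 2/S2), i.e. S2/sigma^2 ~ chi^2_{n-1};
#   mu | sigma^2, D ~ N(xbar, sigma^2 / n).
sig2 = S2 / rng.chisquare(n - 1, size=100_000)
mu = rng.normal(xbar, np.sqrt(sig2 / n))
```

Averaging the draws of μ recovers x̄, and the draws of σ² have mean S²/(n−3), matching the IG((n−1)/2, 2/S²) marginal in the table.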

p-Variate Normal:
The N_p(μ, Σ) density, where μ = (μ_1, ..., μ_p) ∈ R^p and Σ is a positive definite matrix, is given by

    f(x|μ, Σ) = (2π)^{−p/2} (det Σ)^{−1/2} e^{−(x−μ)^t Σ^{−1} (x−μ)/2}.

Σ Known:
Prior                        π(μ)                          (Marginal) Posterior
Uniform/Jeffreys/Reference   1                             π(μ|D) ~ N_p(x̄, Σ/n)
Shrinkage¹                   (μ^t Σ^{−1} μ)^{−(p−2)/2}

Here x̄ = (1/n) Σ_{i=1}^n x_i.
1. See Baranchik (1964) and Berger and Strawderman (1993).

μ Known:
Prior        π(Σ)                                                    (Marginal) Posterior
Uniform      1                                                       π(Σ^{−1}|D) ~ W_p(n−p−1, S^{−1}/n)
Jeffreys     1/|Σ|^{(p+1)/2}                                         π(Σ^{−1}|D) ~ W_p(n, S^{−1}/n)
Reference¹   1/[|Σ| ∏_{i<j}(d_i − d_j)]                              proper
Reference²   1/[|Σ| (log d_1 − log d_p)^{p−2} ∏_{i<j}(d_i − d_j)]
Reference³   1/[R(1 − R²)]                                           π(R|D) ∝ (R²)^{−1/2}(1−R²)^{n/2−1} ₂F₁(n/2, n/2; (p−1)/2; (RR̂)²) / ₃F₂(n/2, n/2, 1/2; (p−1)/2, (n+1)/2; R̂²)
MDIP         1/|Σ|                                                   π(Σ^{−1}|D) ~ W_p(n−p+1, S^{−1}/n)

Here S = (1/n) Σ_{i=1}^n (x_i − μ)(x_i − μ)^t, d_1 < d_2 < ··· < d_p are the eigenvalues of Σ, and R and R̂ are the population and sample multiple correlation coefficients, respectively. If we write

    Σ = ( σ_11     σ_(1)^t        Σ̂ = S = ( s_11     s_(1)^t
          σ_(1)    Σ_22 ),                   s_(1)    S_22 ),

then R = [σ_(1)^t Σ_22^{−1} σ_(1)/σ_11]^{1/2} and R̂ = [s_(1)^t S_22^{−1} s_(1)/s_11]^{1/2}.
1. Group ordering lists the ordered eigenvalues of Σ first, and is recommended for typical use. See Yang and Berger (1994).
2. Group ordering lists the eigenvalues of Σ first, with {d_1, d_p} preceding the other ordered eigenvalues. See Yang and Berger (1994).
3. The population multiple correlation coefficient is the parameter of interest, and the sample multiple correlation coefficient is used as data. Prior and posterior are with respect to dR. ₂F₁ and ₃F₂ are hypergeometric functions. See Tiwari, Chib and Jammalamadaka (1989); also see Muirhead (1982).

μ and Σ Both Unknown:

Prior        π(μ, Σ)
Uniform      1
Jeffreys     1/|Σ|^{(p+2)/2}
Reference¹   1/[|Σ| ∏_{i<j}(d_i − d_j)]
Reference²   1/[|Σ| (log d_1 − log d_p)^{p−2} ∏_{i<j}(d_i − d_j)]
MDIP         1/|Σ|

Here d_1 < d_2 < ··· < d_p are the eigenvalues of Σ.
1. Group ordering lists the ordered eigenvalues of Σ first; μ and Σ are in separate groups. It is recommended for typical use. See Yang and Berger (1994).
2. Group ordering lists the eigenvalues of Σ first, with {d_1, d_p} preceding the other ordered eigenvalues; μ and Σ are in separate groups. See Yang and Berger (1994).

Bivariate Normal:
The N_2(μ, Σ) density, where μ = (μ_1, μ_2) ∈ R² and Σ = (σ_ij) is a 2 × 2 positive definite matrix, is given by

    f(x|μ, Σ) = [2π (det Σ)^{1/2}]^{−1} e^{−(x−μ)^t Σ^{−1} (x−μ)/2}.

μ and Σ Both Unknown:

Prior        π(μ, Σ)                                        (Marginal) Posterior
Reference¹   (1 − ρ²)^{−1} (σ_11 σ_22)^{−1/2}               π(ρ|D) ∝ [(1−ρ²)^{(n−3)/2}/(1−ρr)^{n−3/2}] F(1/2, 1/2; n − 1/2; (1+ρr)/2)
Reference²   σ_11^{1/2}/(σ_11 σ_22 − σ_12²)²
Reference³   (σ_11 σ_22 + σ_12²)^{1/2}/(σ_11 σ_22 − σ_12²)²
MDIP⁴        [σ_11 σ_22 (1 − ρ²)]^{−1/2}

1. The correlation coefficient ρ is the parameter of interest. Parameters ordered as {ρ, μ_1, μ_2, σ_11, σ_22}. r is the sample correlation coefficient. Prior and posterior are with respect to dρ dμ_1 dμ_2 dσ_11 dσ_22. F is the hypergeometric function. See Bayarri (1981).
2. σ_11 is the parameter of interest. Parameters ordered as {σ_11, (σ_12, σ_22, μ_1, μ_2)}. The limiting sequence of compact sets is {σ_12²/(σ_11 σ_22) ∈ (l^{−1}, 1 − l^{−1}), σ_11, σ_22 ∈ (l^{−1}, l)}.
3. σ_12 is the parameter of interest. Parameters ordered as {σ_12, (σ_11, σ_22, μ_1, μ_2)}. The limiting sequence of compact sets is {σ_11 σ_22 ∈ (σ_12²(1 + 1/l), σ_12² l), σ_11 ∈ (l^{−1}, l)}.
4. Prior and posterior are with respect to dρ dμ_1 dμ_2 dσ_11 dσ_22.

27 Pareto
The Pa(x_0, α) density, where 0 < x_0 < ∞ and α > 0, is given by

    f(x|x_0, α) = (α/x_0)(x_0/x)^{α+1} I_{(x_0,∞)}(x).

If x_0 is known, this is a scale density, and the Jeffreys prior and reference prior are both 1/α.

28 Poisson
The P(λ), λ > 0, density is

    f(x|λ) = e^{−λ} λ^x / x!

Prior      π(λ)          (Marginal) Posterior
Uniform    1             G(Σ_{i=1}^n x_i + 1, 1/n)
Jeffreys   λ^{−1/2}      G(Σ_{i=1}^n x_i + 1/2, 1/n)
Reference  λ^{−1/2}      G(Σ_{i=1}^n x_i + 1/2, 1/n)

Poisson Process:
For a Poisson process X1, X2, … with unknown parameter λ, Jeffreys (1961), Novick and
Hall (1965), and Villegas (1977) proposed the ignorance prior π(λ) = λ^{−1}, also called a
logarithmic uniform prior because it implies a uniform distribution for log λ.
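The Gamma forms in the table are conjugate updating; a minimal check of the Jeffreys-prior posterior mean (the counts are hypothetical):

```python
import math

data = [3, 0, 2, 5, 1]  # hypothetical Poisson counts
n, s = len(data), sum(data)

# The Jeffreys prior lambda^(-1/2) gives the G(s + 1/2, 1/n) posterior,
# i.e. shape s + 1/2 and scale 1/n, with posterior mean (s + 1/2)/n.
post_mean = (s + 0.5) / n

# Cross-check against the unnormalised posterior lambda^(s-1/2) e^(-n*lambda).
def unnorm(lam):
    return lam ** (s - 0.5) * math.exp(-n * lam)

grid = [i * 0.001 for i in range(1, 40000)]
numeric_mean = sum(l * unnorm(l) for l in grid) / sum(unnorm(l) for l in grid)
```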

29 Product of Normal Means

Consider the Np(θ, Σ) density, where θ = (θ1, …, θp), and the parameter of interest is
∏_{i=1}^p θi.

p = 2, Σ = I2:

Prior        π(θ1, θ2)              (Marginal) Posterior
Uniform      1                      proper
Reference1   (θ1² + θ2²)^{1/2}
1. See Berger and Bernardo (1989).

p = n, Σ = In, n > 2:

Prior        π(θ1, …, θn)                                   (Marginal) Posterior
Uniform      1                                              proper
Reference1   (∏_{i=1}^n θi)(Σ_{i=1}^n θi^{−2})^{1/2}
1. See Sun and Ye (1995).

p = n, Σ = diag(σ1², …, σn²), and θi > 0 for i = 1, …, n:

Prior        π(θ1, …, θn)                                                              (Marginal) Posterior
Uniform      1
Jeffreys1    ∏_{i=1}^n σi^{−2}
Reference2   (θ1 ⋯ θn)^{2/k−1} g1(θ1, …, θn)(Σ_{i=1}^n ni θi²/σi²)^{1/2} / ∏_{i=1}^n σi²   proper
Reference3   (θ1 ⋯ θn)(Σ_{i=1}^n ni θi²/σi²)^{1/2} / ∏_{i=1}^n σi²

1. Proposed by Jeffreys (1961).
2. See Sun and Ye (1994a), where g1(·) is any positive real function and ni is the number of
   observations from the ith population.
3. Also see Sun and Ye (1994a), where ni is the number of observations from the ith population.
   This prior is also Tibshirani's matching prior.

30 Random Effects Models

One-Way Model (balanced):
See Berger and Bernardo (1992a).

    Xij = μ + αi + εij,  i = 1, …, p and j = 1, …, n,

where the αi are i.i.d. N(0, τ²) and, independently, the εij are i.i.d. N(0, σ²). The parameters
(μ, σ², τ²) are unknown. The reference priors for this model are,

Ordered Grouping                                Reference Prior                               Posterior
{(μ, σ², τ²)}                                   σ^{−2}(nτ² + σ²)^{−3/2}
{(μ, σ²), τ²}                                   σ^{−5/2}(nτ² + σ²)^{−1}
{(μ, τ²), σ²}                                   (3Cn/2) σ^{−2} ξ(τ²/σ²)
{σ², (μ, τ²)}                                   σ^{−1}(nτ² + σ²)^{−3/2}
{τ², (μ, σ²)}                                   τ^{−1}σ^{−2}(nτ² + σ²)^{−1/2} ξ(τ²/σ²)        proper
{μ, (σ², τ²)}, {(σ², τ²), μ},
  {μ, σ², τ²}                                   σ^{−2}(nτ² + σ²)^{−1}
{σ², μ, τ²}, {σ², τ², μ},
  {μ, τ², σ²}, {τ², μ, σ²}, {τ², σ², μ}         Cn σ^{−2} ξ(τ²/σ²)

Here Cn = {1 − √(n − 1)(√n + √(n − 1))^{−3}} and ξ(τ²/σ²) = [(n − 1) + (1 + nτ²/σ²)^{−2}]^{1/2}.
The posterior computation involves only one-dimensional numerical integration. For details
see Berger and Bernardo (1992a).
Suppose ρ = nτ²/σ² is the parameter of interest (see Ye, 1994).

Prior        π(ρ, μ, σ²)                (Marginal) Posterior
Uniform      1
Jeffreys     σ^{−3}(1 + ρ)^{−3/2}
Reference1   σ^{−2}(1 + ρ)^{−3/2}       proper4
Reference2   σ^{−2}(1 + ρ)^{−1}
Reference3   σ^{−3}(1 + ρ)^{−1}

1. Group ordering {(ρ, μ), σ²}.
2. Group orderings {ρ, μ, σ²}, {ρ, σ², μ}, {(ρ, σ²), μ}; recommended for typical use.
3. Group ordering {ρ, (μ, σ²)}.
4. The marginal posterior for ρ, corresponding to the prior σ^{−a}(1 + ρ)^{−b}, is given by

    π(ρ|D) = [W^{q̃} / (B(W/(1 + W))(1 + ρ)^{q̃+1})] [1 + W/(1 + ρ)]^{−(p̃+q̃)},

where B(x) = ∫_0^x t^{q̃−1}(1 − t)^{p̃−1} dt is the incomplete Beta function, p̃ = (p + 2b − 3)/2,
q̃ = [p(n − 1) + a − 2b]/2, W = S2/S1, S1 = Σ_{i=1}^p Σ_{j=1}^n (Yij − Ȳi)², Ȳi = (1/n) Σ_{j=1}^n Yij,
S2 = n Σ_{i=1}^p (Ȳi − Ȳ)², and Ȳ = (1/pn) Σ_{i=1}^p Σ_{j=1}^n Yij.
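The normalization in note 4 can be checked numerically: with p̃ and q̃ as defined there, the marginal posterior of ρ, written as (W/(1 + ρ))^{q̃−1}(1 + W/(1 + ρ))^{−(p̃+q̃)} W/(1 + ρ)² and divided by the incomplete Beta factor B(W/(1 + W)), should integrate to one. A sketch with illustrative (hypothetical) values of p, n, a, b and W:

```python
# Hypothetical inputs: p groups, n observations per group, prior
# sigma^(-a) (1 + rho)^(-b), and observed variance ratio W = S2/S1.
p, n, a, b, W = 5, 4, 2.0, 1.5, 1.3
pt = (p + 2 * b - 3) / 2.0            # p-tilde
qt = (p * (n - 1) + a - 2 * b) / 2.0  # q-tilde

# Incomplete Beta factor B(x) = int_0^x t^(qt-1) (1-t)^(pt-1) dt at
# x = W/(1+W), by the midpoint rule.
x = W / (1 + W)
m = 200000
h = x / m
Bx = sum(((i + 0.5) * h) ** (qt - 1) * (1 - (i + 0.5) * h) ** (pt - 1)
         for i in range(m)) * h

def posterior(rho):
    u = W / (1 + rho)
    return u ** (qt - 1) * (1 + u) ** (-(pt + qt)) * W / (1 + rho) ** 2 / Bx

# Integrate the density over rho in (0, infinity); the tail beyond 400 is
# negligible for these values.
m2 = 400000
h2 = 400.0 / m2
total = sum(posterior((i + 0.5) * h2) for i in range(m2)) * h2
```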

One-Way Model (unbalanced):

See Ye (1990).

    Xij = μ + αi + εij,  i = 1, …, k and j = 1, …, ni,

where the αi are i.i.d. N(0, τ²) and, independently, the εij are i.i.d. N(0, σ²). The parameters
(μ, σ², τ²) are unknown.
The reference priors for this model are,

Prior        π(μ, σ², τ²)                                                 (Marginal) Posterior
Uniform      1
Jeffreys     σ^{−5}[s11(τ²/σ²)(n s22(τ²/σ²) − s11(τ²/σ²)²)]^{1/2}
Reference1   σ^{−2} τ^{−2Cn} λ(τ²/σ²)^{1/2}                               proper
Reference2   σ^{−4} s22(τ²/σ²)^{1/2}

Here the limiting sequence of compact sets for the reference prior is chosen to be
Ωl = [al, bl] × [cl, dl] × [el, fl], for (μ, σ², τ²), and

    spq(x) = Σ_{i=1}^k ni^p/(1 + ni x)^q,   for p, q = 0, 1, 2, …,

    λ(x) = n − k + Σ_{i=1}^k 1/(1 + ni x)²,

    Cn = √n(√n − √(n − k))/(√n + √(n − k)) + γ(√n − √(n − k))²/[2(√n + √(n − k))²],

    γ = lim_{l→∞} log dl / log cl^{−1}.

1. Group orderings {μ, τ², σ²}, {τ², μ, σ²}, and {τ², σ², μ}.
2. Group orderings {μ, σ², τ²}, {σ², μ, τ²}, {σ², τ², μ}, {μ, (σ², τ²)}, and {(σ², τ²), μ}.

Suppose v = τ²/σ² is the parameter of interest.

Prior        π(v, μ, σ²)                                  (Marginal) Posterior
Uniform      1
Jeffreys     σ^{−3}[s11(v)(n s22(v) − s11(v)²)]^{1/2}
Reference1   σ^{−2}[n s22(v) − s11(v)²]^{1/2}             proper
Reference2   σ^{−2} s22(v)^{1/2}

1. Group orderings {μ, v, σ²}, {v, μ, σ²}, and {v, σ², μ}.
2. Group orderings {μ, σ², v}, {σ², μ, v}, {σ², v, μ}, {μ, (σ², v)}, and {(σ², v), μ}.

Random Coefficient Regression Model:

    yi = Xi βi + εi,

where yi is a (ti × 1) vector of observations, Xi is a (ti × p) constant design matrix, βi is
a (p × 1) vector of random coefficients for the ith experimental subject, and εi is a vector of
errors, for i = 1, …, n. Furthermore, we assume that (βi, εi, i = 1, …, n) are independent, and
βi ∼ MVN(β, Λ) and εi ∼ MVN(0, σ²Ii), where β is the (p × 1) mean of the βi and Ii is the
(ti × ti) identity matrix.

Defining Ω = Λ/σ², the Jeffreys prior is

    πJ(β, Ω, σ²) ∝ |Σ_{i=1}^n Bi|^{1/2} |G[Σ_{i=1}^n (Bi ⊗ Bi) − (vec(Σ_{i=1}^n Bi))(vec(Σ_{i=1}^n Bi))^t / Σ_{i=1}^n ti] G^t|^{1/2} / σ^{p+2}.

The reference prior with respect to the group ordering {β, σ², Ω}, {σ², β, Ω} or {σ², Ω, β} is

    πR(β, Ω, σ²) ∝ |G[Σ_{i=1}^n (Bi ⊗ Bi)] G^t|^{1/2} / σ².

The reference prior with respect to the group ordering {β, Ω, σ²}, {Ω, β, σ²} or {Ω, σ², β} is

    πR(β, Ω, σ²) ∝ |G[Σ_{i=1}^n (Bi ⊗ Bi) − (vec(Σ_{i=1}^n Bi))(vec(Σ_{i=1}^n Bi))^t / Σ_{i=1}^n ti] G^t|^{1/2} / σ²,

where Bi := Xi^t(Xi Ω Xi^t + Ii)^{−1} Xi, and G denotes a (p(p + 1)/2) × p² constant matrix
∂(vecp V)/∂(vec V), where V is a p × p symmetric matrix.
The posteriors corresponding to the priors above are proper, provided we have at least 2p + 1
full rank design matrices. Computation is discussed in Yang and Chen (1995).

Random Effects Model:

X = {xij, j = 1, …, ni, i = 1, …, k} arises from k populations π1, …, πk, where πi is Np(θi, Σ)
and θi ∼ Np(μ, T). Fatti (1982) considers the usual diffuse prior distribution for Σ and T,
and Box and Tiao's noninformative prior distribution. Specifically, Fatti (1982) considers the
following type of diffuse joint prior density

    π(Σ, μ, T) ∝ |Σ|^{−v1/2} |T|^{−v2/2}.

This prior has been used by Geisser (1964) and Geisser and Cornfield (1963), for v1 = v2 = p + 1.
Fatti (1982) also studies the predictive density of a new observation x, under the hypothesis
x ∈ πr. Fatti found that v2 must be less than 2 for the predictive density to exist. So, T cannot
have the usual diffuse prior distribution with v2 = p + 1. If we assign the values v1 = p + 1 and
v2 = 1,

    f(x|X, πr) ∝ |A3|^{−(N−p−1)/2} ₂F₁(p/2, (N − p − 1)/2, (k − 1)/2, A3^{−1}A1),  for k > p,

where A1 = n Σ_{i=1}^k (x̄i· − x̄··)(x̄i· − x̄··)^t, A3 = Σ_{i=1}^k Σ_{j=1}^n (xij − x̄··)(xij − x̄··)^t,
₂F₁(a1, a2, b1, ω) is the hypergeometric function, and

    ñi = ni        if i ≠ r,
    ñi = nr + 1    if i = r,

where we assume ni = n for all i, x̄i· = (1/n) Σ_{j=1}^n xij, x̄·· = (1/k) Σ_{i=1}^k x̄i·, and
N = Σ_{i=1}^k ni.
Box and Tiao (1992) propose the following noninformative joint prior,

    π(Σ, μ, T) ∝ |Σ|^{−(p+1)/2} |Σ + nT|^{−(p+1)/2}.

This prior distribution gives the predictive density as,

    f(x|X, πr) ∝ |A3|^{−(N−1)/2} ₂F₁((p + 1)/2, (N − 1)/2, (k + p)/2, A3^{−1}A1),  for N > p and k > p.

31 Ratio of Exponential Means

Let Xi ∼ind Exponential(λi), i = 1, 2. The parameter of interest is θ1 = λ2/λ1. With nuisance
parameter θ2 = λ1λ2, Datta and Ghosh (1993) give the reference prior as π(θ1, θ2) = (θ1 θ2)^{−1}.

32 Ratio of Normal Means

Suppose X = {X1, …, Xn} and Y = {Y1, …, Ym} are available from two independent
normal populations with unknown means α and β and unknown common variance σ². The problem
is to make inferences about the value of φ = α/β, the ratio of the means.

The Fisher information matrix is (see Bernardo, 1977),

    I(φ, β, σ) = (1/σ²) ( β²    βφ        0
                          βφ    1 + φ²    0
                          0     0         4 ).

It follows that the reference prior is

    π(φ, β, σ) ∝ (1 + φ²)^{−1/2} σ^{−1},

or, in terms of the original parameterization,

    π(α, β, σ) ∝ (α² + β²)^{−1/2} σ^{−1}.

The reference posterior is

    π(φ|D) ∝ (1 + φ²)^{−1/2} (m + φ²n)^{−1/2} {S² + nm(x̄ − φȳ)²/(m + φ²n)}^{−(n+m−1)/2},

where x̄ = Σ_{i=1}^n xi/n, ȳ = Σ_{i=1}^m yi/m, and S² = Σ_{i=1}^n (xi − x̄)² + Σ_{i=1}^m (yi − ȳ)².
Note that for this problem the usual noninformative prior 1/σ leads to the Fieller-Creasy
problem: Kappenman, Geisser and Antle (1970) showed that 1/σ can lead to a confidence interval
for φ consisting of the whole real line.
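The two expressions of the reference prior are consistent: transforming (φ, β) back to (α, β), with φ = α/β, picks up a Jacobian factor 1/β. A pointwise numerical check (with β > 0):

```python
def prior_phi(phi, sigma):
    # reference prior in the (phi, beta, sigma) parameterization
    return (1 + phi * phi) ** -0.5 / sigma

def prior_ab(alpha, beta, sigma):
    # reference prior in the original (alpha, beta, sigma) parameterization
    return (alpha * alpha + beta * beta) ** -0.5 / sigma

# |d(phi, beta)/d(alpha, beta)| = 1/beta, so the two expressions must agree
# after multiplying by the Jacobian.
checks = [abs(prior_phi(a / b, s) / b - prior_ab(a, b, s))
          for a, b, s in [(1.0, 2.0, 0.5), (-3.0, 1.5, 2.0), (0.7, 0.3, 1.0)]]
```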

33 Ratio of Poisson Means

Let X and Y be independent Poisson random variables with means λ and μ. The parameter
of interest is θ = λ/μ.
Liseo (1993) computed the Jeffreys prior and reference prior for this problem as

Prior       π(θ)                   (Marginal) Posterior
Uniform     1
Jeffreys    θ^{−1/2}               proper
Reference   1/[√θ (1 + θ)]

Here the Jeffreys prior and reference prior give the same marginal posterior for θ,

    π(θ|D) ∝ θ^{x−1/2}/(1 + θ)^{x+y+1}.
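The marginal posterior is a transformed Beta density: substituting t = θ/(1 + θ) shows its normalizing constant is B(x + 1/2, y + 1/2), which is easy to confirm numerically (the counts are illustrative):

```python
import math

x, y = 7, 4  # illustrative counts

def unnorm(theta):
    return theta ** (x - 0.5) / (1 + theta) ** (x + y + 1)

# Midpoint-rule integral over (0, 400); the tail is negligible for these counts.
m = 400000
h = 400.0 / m
numeric = sum(unnorm((i + 0.5) * h) for i in range(m)) * h

# Beta(x + 1/2, y + 1/2) via the gamma function.
exact = math.gamma(x + 0.5) * math.gamma(y + 0.5) / math.gamma(x + y + 1)
```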

34 Sequential Analysis
Suppose X1, X2, … is an i.i.d. sample with common density function f(xi|θ) which satisfies
the usual regularity conditions. Here Xi and θ are k × 1 and p × 1 vectors, respectively. Let
N be the stopping time.
From Ye (1993), if 0 < Eθ(N) < ∞, the Jeffreys prior is

    πJ(θ) ∝ (Eθ[N])^{p/2} √(det(I(θ))),

where I(θ) is the Fisher information matrix of X1.

Suppose θ = (θ(1), θ(2), …, θ(m)) is an m-ordered group. Furthermore, assume Eθ(N) ≡
Eθ(1)(N) depends only on θ(1) and 0 < Eθ(N) < ∞. Then the reference prior is (see Ye, 1993)

    π̃R(θ(1), …, θ(m)) ∝ (Eθ[N])^{p1/2} πR(θ(1), …, θ(m)),

where p1 is the dimension of θ(1) and πR(θ(1), …, θ(m)) is the reference prior of θ for X1, using
the same group ordering and compact subsets.

35 Stress-Strength System
Suppose X1, …, Xm are i.i.d. Weibull(β1, θ) random variables and, independently, Y1, …, Yn
are i.i.d. Weibull(β2, θ) random variables. The parameter of interest is

    ω1 = P(X1 < Y1) = β2/(β1 + β2).

When θ = 1, this is the simple stress-strength system under the exponential distribution, with
parameter βi as scale parameter. Thompson and Basu (1993) computed the reference prior and
showed that the reference prior for (β1, β2), when ω1 is the parameter of interest and ω2 = β1 + β2
is the nuisance parameter, coincides with the Jeffreys prior. With the reparameterization of ω1 and
ω2 = β1β2/(β1 + β2), Basu and Sun (1994) computed the following reference priors and gave a
necessary and sufficient condition for the existence of a proper posterior,

Prior        π(ω1, ω2, θ)                     (Marginal) Posterior
Uniform      1
Jeffreys1    [ω1(1 − ω1)ω2 θ]^{−1}
Reference2   g1(ω1, ω2)/[ω1(1 − ω1)ω2 θ]
Reference3   g2(ω1, ω2)/[ω1(1 − ω1)ω2 θ]
Reference4   g3(ω1, ω2)/[ω1(1 − ω1)ω2 θ]

where

    g1(ω1, ω2) = 1/√(γ* + a(1 − a){log[(1 − ω1)/ω1]}²),

    g2(ω1, ω2) = √(a(1 − ω1)² + (1 − a)ω1²),

    g3(ω1, ω2) = g1(ω1, ω2) √(γ* + a{γ + log[(1 − ω1)ω2]}² + (1 − a){γ + log(ω1 ω2)}²),

and

    γ = 1 + ∫_0^∞ (log z) e^{−z} dz,
    γ* = ∫_0^∞ (log z)² e^{−z} dz − {∫_0^∞ (log z) e^{−z} dz}²,
    a = m/(m + n).

1. Also the reference prior for the group orderings {(ω1, ω2, θ)}, {θ, ω1, ω2}, and {θ, (ω1, ω2)}.
2. Group orderings {ω1, (ω2, θ)} and {ω1, θ, ω2}.
3. Group ordering {(ω2, θ), ω1}.
4. Group ordering {ω1, ω2, θ}.
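For the exponential case (θ = 1), the identity ω1 = P(X1 < Y1) = β2/(β1 + β2) is easy to verify by quadrature, reading the βi as the component means (an assumption consistent with the formula above):

```python
import math

beta1, beta2 = 2.0, 3.0  # hypothetical mean lifetimes

def integrand(t):
    # P(X < Y) = int f_X(t) * P(Y > t) dt for independent exponentials
    return (1 / beta1) * math.exp(-t / beta1) * math.exp(-t / beta2)

# Midpoint-rule integral over (0, 200); the tail is negligible here.
m = 200000
h = 200.0 / m
p = sum(integrand((i + 0.5) * h) for i in range(m)) * h
closed_form = beta2 / (beta1 + beta2)
```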

36 Sum of Squares of Normal Means

Consider the Np(θ, Ip) density, where θ = (θ1, …, θp), and the parameter of interest is
Σ_{i=1}^p θi².

Prior        π(θ1, …, θp)         (Marginal) Posterior
Uniform      1                    proper
Reference1   ||θ||^{−(p−1)}

1. See Datta and Ghosh (1993). This is also the prior obtained by Stein (1985) and Tibshirani
(1989). It can be viewed as a hierarchical prior with (i) θ1, …, θp | λ i.i.d. ∼ N(0, λ^{−1}), and
(ii) λ having the improper gamma density f(λ) ∝ λ^{−3/2}.
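The hierarchical representation in note 1 can be checked numerically: mixing ∏ N(θi | 0, λ^{−1}) over f(λ) ∝ λ^{−3/2} gives a marginal proportional to ||θ||^{−(p−1)}. A sketch for p = 3:

```python
import math

def mixture(theta):
    # int lambda^(p/2) exp(-lambda ||theta||^2 / 2) lambda^(-3/2) d lambda,
    # by the midpoint rule (Gaussian normalising constants are dropped).
    p = len(theta)
    s = sum(t * t for t in theta)
    m = 200000
    h = 400.0 / m
    return sum(((i + 0.5) * h) ** (p / 2.0 - 1.5)
               * math.exp(-(i + 0.5) * h * s / 2.0) for i in range(m)) * h

t1, t2 = [1.0, 0.5, 0.2], [2.0, -1.0, 0.4]
s1 = sum(v * v for v in t1)
s2 = sum(v * v for v in t2)
# The ratio of mixtures should match (||t1||/||t2||)^(-(p-1)) with p = 3,
# i.e. (s1/s2)^(-1).
ratio = (mixture(t1) / mixture(t2)) / ((s1 / s2) ** -1.0)
```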

37 T Distribution
The Tα(μ, σ²) density, where α > 0, −∞ < μ < ∞, and σ² > 0, is given by

    f(x|μ, σ²) = [Γ((α + 1)/2)/(Γ(α/2)(απ)^{1/2} σ)] (1 + (x − μ)²/(ασ²))^{−(α+1)/2}.

This is a location-scale parameter problem; see that section for priors.

38 Uniform
The U(θ − a, θ + a), −∞ < θ < ∞, density is

    f(x|θ) = (1/2a) I_{(θ−a, θ+a)}(x).

The reference prior (Bernardo and Smith, 1994) is the same as the uniform prior, and the
reference posterior is the uniform distribution U(max(x1, …, xn) − a, min(x1, …, xn) + a).
The U(0, θ), 0 < θ < ∞, density is

    f(x|θ) = (1/θ) I_{(0,θ)}(x).

The reference prior (Bernardo and Smith, 1994) is π(θ) ∝ θ^{−1}, and the reference posterior is
the Pareto distribution Pa(max(x1, …, xn), n).
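The Pa(max(xi), n) reference posterior for U(0, θ) has mean n·max(xi)/(n − 1); a quick numerical confirmation (the sample is hypothetical):

```python
data = [0.8, 2.4, 1.1, 0.3]  # hypothetical U(0, theta) sample
n, mx = len(data), max(data)

# Pareto(mx, n) posterior: pi(theta | x) proportional to theta^-(n+1) on (mx, inf).
exact_mean = n * mx / (n - 1)

# Midpoint-rule integration of the unnormalised posterior on (mx, 2000).
m = 400000
h = (2000.0 - mx) / m
num = den = 0.0
for i in range(m):
    th = mx + (i + 0.5) * h
    w = th ** -(n + 1)
    num += th * w * h
    den += w * h
numeric_mean = num / den
```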
39 Weibull
The Weibull density, with shape parameter c > 0 and quasi-scale parameter θ > 0, is

    f(x|θ, c) = cθ x^{c−1} exp(−θx^c),  x > 0.

Then w = log x has the extreme value density

    f(w|θ, c) = θc exp(cw) exp(−θ exp(cw)),  −∞ < w < ∞.

As indicated in Evans and Nigm (1980), setting φ = c^{−1} and θ = exp(−μ/φ), the density of w
becomes

    f(w|μ, φ) = φ^{−1} exp{(w − μ)/φ} exp[−exp{(w − μ)/φ}],

in which μ and φ are seen to be location-scale parameters. Therefore, π(μ, φ) = 1/φ. This prior
transformed back to the original parameterization is π(θ, c) = 1/(θc²). Analysis using the usual
noninformative prior is discussed by Evans and Nigm (1980).
Sun and Berger (1994) computed the following reference priors,

Prior        π(θ, c)                                                            (Marginal) Posterior
Reference1   1/(θc)                                                             proper3
Reference2   {c[1 + γ* + γ² − 2γ − 2(1 − γ) log θ + (log θ)²]^{1/2}}^{−1}

Here γ is Euler's constant, i.e., γ = −∫_0^∞ (log z) e^{−z} dz, and γ* = ∫_0^∞ (log z)² e^{−z} dz.
1. Group orderings {(θ, c)} and {c, θ} (also the Jeffreys prior).
2. Group ordering {θ, c}.
3. Provided n > 1 and not all observations are equal.
This density can also be written in the form

    f(x|α, β) = (α/β)(x/β)^{α−1} exp{−(x/β)^α}.

Under this form, Sun and Berger (1994) computed the reference priors as

Prior        π(α, β)          (Marginal) Posterior
Reference1   β^{−1}           proper3
Reference2   (αβ)^{−1}

1. Group ordering {(α, β)} (also the Jeffreys prior).
2. Group orderings {α, β} and {β, α}.
3. Provided n > 1 and not all observations are equal.
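The two Weibull forms above are related by α = c and β = θ^{−1/c}; a pointwise numerical check of the equivalence:

```python
import math

def f_theta_c(x, theta, c):
    # quasi-scale form: c * theta * x^(c-1) * exp(-theta * x^c)
    return c * theta * x ** (c - 1) * math.exp(-theta * x ** c)

def f_alpha_beta(x, alpha, beta):
    # shape-scale form: (alpha/beta) (x/beta)^(alpha-1) exp(-(x/beta)^alpha)
    return (alpha / beta) * (x / beta) ** (alpha - 1) * math.exp(-((x / beta) ** alpha))

theta, c = 0.7, 2.5
alpha, beta = c, theta ** (-1.0 / c)  # beta^(-alpha) = theta
diffs = [abs(f_theta_c(x, theta, c) - f_alpha_beta(x, alpha, beta))
         for x in (0.2, 0.9, 1.7, 3.1)]
```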

References
Alba, E.D. and Mendoza, M. (1995). A discrete model for Bayesian forecasting with stable
seasonal patterns. Advances in Econometrics 11.
Banerjee, A.K. and Bhattacharyya, G.K. (1979). Bayesian results for the inverse Gaussian
distribution with an application. Technometrics 21, No 2, 247-251.
Baranchik, A. (1964). Multiple regression and estimation of the mean of a multivariate normal
distribution. Technical Report No 51, Department of Statistics, Stanford University.
Basu, A.P. and Sun, D. (1994). Bayesian analysis for a stress-strength system via reference prior
approach. Submitted.
Bayarri, M.J. (1981). Inferencia Bayesiana sobre el coeficiente de correlación de una población
normal bivariante. Trab. Estadist. 32, 18-31.
Berger, J.O. (1985). Statistical Decision Theory and Bayesian Analysis. New York: Springer-
Verlag.
Berger, J.O. and Bernardo, J.M. (1989). Estimating a product of means: Bayesian analysis with
reference priors. J. Amer. Statist. Assoc. 84, 200-207.
Berger, J.O. and Bernardo, J.M. (1992a). Reference priors in a variance components problem.
In Bayesian Analysis in Statistics and Econometrics, P. Goel and N.S. Iyengar(eds.). New
York: Springer Verlag.
Berger, J.O. and Bernardo, J.M. (1992b). Ordered group reference priors with application to a
multinomial problem. Biometrika 79, 25-37.
Berger, J.O. and Bernardo, J.M. (1992c). On the development of the reference prior method. In
Bayesian Statistics 4, J.M.Bernardo, J.O.Berger, D.V.Lindley, and A.F.M.Smith (eds.).
London: Oxford University Press.
Berger, J.O. and Strawderman, W.E. (1993). Choice of hierarchical priors: admissibility in
estimation of normal means. Technical Report 93-34C, Department of Statistics, Purdue
University.
Berger, J.O. and Yang, R. (1994). Noninformative priors and Bayesian testing for the AR(1)
model. Econometric Theory, 10, No 3-4, 461-482.
Bernardo, J.M. (1977). Inferences about the ratio of normal means: a Bayesian approach to
the Fieller-Creasy problem. In Recent Developments in Statistics, J.R. Barra et al. (eds.).
Amsterdam: North-Holland.
Bernardo, J.M. (1979). Reference posterior distributions for Bayes inference. J. Roy. Statist.
Soc. Ser B 41, 113-147 (with discussion).
Bernardo, J.M. and Girón, F.J. (1988). A Bayesian analysis of simple mixture problems. In
Bayesian Statistics 3, J.M. Bernardo, M.H. DeGroot, D.V. Lindley, and A.F.M. Smith (eds.).
London: Oxford University Press.
Bernardo, J.M. and Ramón, J.M. (1996). An elementary introduction to reference analysis.
Tech. Rep. BR96, Universitat de Valencia, Spain.
Bernardo, J.M. and Smith, A.F.M. (1994). Bayesian Theory. New York: John Wiley & Sons.
Box, G.E.P. and Cox, D.R. (1964). An analysis of transformation. J. Roy. Statist. Soc. Ser B
26, 211-252. (with discussion).
Box, G.E.P. and Tiao, G.C. (1992). Bayesian Inference in Statistical Analysis. John Wiley &
Sons, New York.
Chaloner, K. (1987). A Bayesian approach to the estimation of variance components for the
unbalanced one-way random model. Technometrics, 29, 323-337.
Crowder, M. and Sweeting, T. (1989). Bayesian inference for a bivariate binomial. Biometrika,
76, 599-604.
Datta, G.S. and Ghosh, M. (1993). Some remarks on noninformative priors. Technical Report
93-16, Department of Statistics, University of Georgia.
Eaves, D.M. (1983). On Bayesian nonlinear regression with an enzyme example. Biometrika,
70, 2, 373-379.
Evans, I.G. and Nigm, A.M. (1980). Bayesian prediction for two-parameter Weibull lifetime
models. Commun. Statist. - Theor. Meth., A9(6), 649-658.
Fatti, L.P. (1982). Predictive discrimination under the random effect model. South African
Statist. J., 16, 55-77.
Geisser, S. (1984). On prior distributions for binary trials (with discussion). Amer. Statist., 38,
244-251.
Geisser, S. (1964). Posterior odds for multivariate normal classifications. J. Roy. Statist. Soc.
Ser B 26, 69-76.
Geisser, S. and Cornfield, J. (1963). Posterior distributions for multivariate normal parameters.
J. Roy. Statist. Soc. Ser B 25, 368-376.
Howlader, H.A. and Weiss, G. (1988). Bayesian reliability estimation of a two-parameter Cauchy
distribution. Biom. J., 30, 3, 329-337.
Ibrahim, J.G. and Laud, P.W. (1991). On Bayesian analysis of generalized linear models using
Jeffreys prior. J. Amer. Statist. Assoc. 86, 981-986.
Jeffreys, H. (1961). Theory of Probability. London: Oxford University Press.
Joshi, S. and Shah, M. (1991). Estimating the mean of an inverse Gaussian distribution with
known coefficient of variation. Commun. Statist. - Theor. Meth., 20(9), 2907-2912.
Kappenman, R.F., Geisser, S. and Antle, C.F. (1970). Bayesian and fiducial solutions to the
Fieller-Creasy problem. Sankhya B, 32, 331-340.
Kass, R.E. and Wasserman, L. (1993). Formal rules for selecting prior distributions: a review
and annotated bibliography.
Kubokawa, T. and Robert, C.P. (1994). New perspective on Linear calibration. J. Multivariate
Anal., 51, No 1, 178-200.
Liseo, B. (1993). Elimination of nuisance parameters with reference noninformative priors.
Biometrika, 80, 295-304.
Muirhead, R.J. (1982). Aspects of Multivariate Statistical Theory. Wiley, New York.
Novick, W.R. and Hall, W.J. (1965). A Bayesian indifference procedure. J. Amer. Statist.
Assoc., 60, 1104-1117.
Pericchi, L.R. (1981). A Bayesian approach to transformation to normality. Biometrika, 68,
35-43.
Poirier, D.J. (1992). Jeffreys prior for logit models. To appear in Econometrics.
Polson, N. and Wasserman, L. (1990). Prior distribution for the bivariate binomial. Biometrika,
77, 4, 901-904.
Rissanen, J. (1983). A universal prior for integers and estimation by minimum description
length. Annal. Statist., 11, No 2, 416-431.
Sono, S. (1983). On a noninformative prior distribution for Bayesian inference of multinomial
distribution's parameters. Ann. Inst. Statist. Math., 35, Part A, 167-174.
Spiegelhalter, D.J. (1985). Exact Bayesian inference on the parameters of a Cauchy distribution
with vague prior information. In Bayesian Statistics 2, J.M. Bernardo, M.H. DeGroot,
D.V. Lindley, and A.F.M. Smith (eds.). North-Holland: Elsevier Science Publishers B.V.
Stein, C. (1985). On the coverage probability of confidence sets based on a prior distribution. In
Sequential Methods in Statistics, Banach Center Publications, 16. PWN-Polish Scientific
Publishers, Warsaw.
Sun, D.C. and Berger, J.O. (1994). Bayesian sequential reliability for Weibull and related
distributions. Ann. Inst. Statist. Math., 46, No 2, 221-249.
Sun, D.C. and Ye, K.Y. (1994a). Inference on a product of normal means with unknown vari-
ances. Submitted to Biometrika.
Sun, D.C. and Ye, K.Y. (1994b). Frequentist validity of posterior quantiles for a two-parameter
exponential family. Submitted to Biometrika.
Sun, D.C. and Ye, K.Y. (1995). Reference prior Bayesian analysis for normal mean products.
J. Amer. Statist. Assoc., 90.
Tibshirani, R. (1989). Noninformative priors for one parameter of many. Biometrika, 76, 604-
608.
Tiwari, R.C., Chib, S. and Jammalamadaka, S.R. (1989). Bayes estimation of the multiple
correlation coecient. Commun. Statist. - Theor. Meth., 18(4), 1401-1413.
Villegas, C. (1977). On the representation of ignorance. J. Amer. Statist. Assoc., 72, 651-654.
Wixley, R.A.J. (1993). Data translated likelihood and choice of prior for power transformation
to normality.
Yang, R. and Berger, J.O. (1994). Estimation of a covariance matrix using the reference prior.
Annal. Statist., 22, No 3, 1195-1211.
Yang, R. and Chen, M.H. (1995). Bayesian analysis for random coecient regression models
using noninformative priors. J. Multivariate Anal., 55, No 2, 283-311.
Yang, R. and Pyne, D. (1996). Bayesian analysis with mixed model in unbalanced case. Draft.
Ye, K.Y. (1990). Noninformative priors in Bayesian analysis. Ph.D. Dissertation. Purdue
University.
Ye, K.Y. (1994). Bayesian reference prior analysis on the ratio of variances for the balanced
one-way random effect model. J. Statist. Plann. Inference, 41, No 3, 267-280.
Ye, K.Y. (1993). Reference priors when the stopping rule depends on the parameter of interest.
J. Amer. Statist. Assoc., 88, 360-363.
Ye, K.Y. and Berger, J.O. (1991). Noninformative priors for inference in exponential regression
models. Biometrika 78, 645-656.
Zellner, A. (1971). An Introduction to Bayesian Inference in Econometrics. New York: John
Wiley & Sons.
Zellner, A. (1984). Maximal data information prior distributions. In Zellner, A. Basic Issues in
Econometrics, U. of Chicago Press.
Zellner, A. (1993). Models, prior information and Bayesian analysis.
