Professional Documents
Culture Documents
Plummer Bayes 2
Plummer Bayes 2
16 December 2012
Plummer
Plummer
Plummer
M
So
Ed
Po1
Po2
Width posterior
probability
Sorted in order of
posterior probability
LF
M.F
Pop
NW
U1
U2
GDP
Ineq
Prob
Time
1
4 5
11
14
17 20
24
28
33 39 46
Model #
Plummer
Bayes Factors
1
2
p(Y | Mi )
p(Y | M2 )
posterior odds
prior odds
Bayes factor
p(M1
Plummer
Interpretation
Not worth more than a bare mention
Positive
Strong
Very strong
Plummer
Plummer
Plummer
Plummer
Plummer
Plummer
Plummer
N( + i , 2 )
N(0, )
N(0, 2 )
where 2 , 2 are fixed.
As 0 this tends to a pooled model where Y1 . . . Yn have
a common mean ()
As we get a fixed effects model: learning i tells us
nothing about j for j 6= i.
Plummer
Plummer
hi = p
i=1
Plummer
n
X
hi
i=1
Plummer
Spiegelhalter, Best, Carlin and van der Linde (2002) introduced the
deviance information criterion
DIC = D + pD
which combines
A measure of fit D, the expected deviance.
A measure of complexity pD , the effective number of parameters
Choose the model with the smallest DIC
Plummer
Y1
...
Y2
Yn
Y = (Y1 . . . Yn ) is a vector of
observations
is a vector of parameters
common to all models (the
focus)
Plummer
Spiegelhalter et al definition of pD
The effective number of parameters
In general pD depends on Y .
Plummer
Spiegelhalter et al definition of pD
The effective number of parameters
Plummer
Plummer
Advantages of DIC
Plummer
200
250
150
100
50
Number of citations
1999
2001
2003
2005
2007
Year
Plummer
Early publications
cite the technical
report
Spiegelhalter, Best
and Carlin (1998).
Limitations of DIC
Not consistent.
Given a set of nested models, DIC will tend to choose a model
that is too large as n .
Plummer
Problems with pD
Plummer
Plummer
Criticism of DIC
Plummer
Parameter
estimation
Model
criticism
Plummer
Parameter
estimation
Model
criticism
Bayes factors
Model choice based on marginal likelihood
without any attempt to estimate model
parameters.
Incompatible with improper or diffuse
reference priors.
Plummer
Parameter
estimation
Model
criticism
Plummer
Parameter
estimation
Model
criticism
Two-phase studies
Data collection in two phases
Initial data collection used for parameter
estimation (hypothesis generating)
New data collection for model criticism
(hypothesis testing)
Plummer
Parameter
estimation
Model
criticism
k-fold cross-validation
Randomly split the data k 10 groups
Use data from k 1 groups for parameter
estimation.
Use remaining group for model criticism.
Plummer
Parameter
estimation
Model
criticism
Plummer
Parameter
estimation
Model
criticism
Plummer
Loss functions
We adopt a utilitarian approach to model choice
A model is considered useful if it gives good out-of-sample
predictions.
Ideally we have two data sets
Z, a set of training data
Y, a set of validation data
Plummer
Plummer
Definition
A loss function is linear if it breaks down into a sum of
contributions from each element of the test data Y.
X
L(Y, Z) =
L(Yi , Z)
i
Plummer
Optimism
Definition
The optimism of L(Yi , Y)
popt i = E {L(Yi , Yi ) L(Yi , Y) | Yi }
Plummer
Optimism
Definition
The optimism of L(Yi , Y)
popt i = E {L(Yi , Yi ) L(Yi , Y) | Yi }
where
Yi = (Y1 , ...Yi1 , Yi+1 , . . . Yn )
is the data set with observation i removed.
Plummer
Plummer
(Z)
= E( | Z)
Plummer
Plummer
Plummer
Plummer
Plummer
pD
Pooled
Exchangeable
Spatial
Exchangeable + spatial
Plummer
1.0
43.5
31.0
31.6
Correct
penalty
1.1
570.5
163.9
166.4
Z
d p ( | Z) log {p (Y | )}
Plummer
Plummer
n
X
Cov (i , i | Yi )
i=1
where
i is the canonical parameter
i is the mean value parameter
is the scale parameter
Plummer
Plummer
Plummer
Approximation of popt
D + 2pD
Plummer
Plummer
Plummer