
AStA Adv Stat Anal (2014) 98:287–303

DOI 10.1007/s10182-013-0222-0

ORIGINAL PAPER

Robust Bayesian methodology with applications in credibility premium derivation and future claim size prediction

Ali Karimnezhad · Ahmad Parsian

Received: 18 April 2013 / Accepted: 6 November 2013 / Published online: 22 November 2013
© Springer-Verlag Berlin Heidelberg 2013

Abstract Robust Bayesian methodology deals with the problem of expressing uncertainty about the inputs (the prior, the model, and the loss function) and provides a systematic way to take the inputs' variation into account. If the uncertainty concerns the prior knowledge, robust Bayesian analysis provides a way to describe the prior knowledge in terms of a class of priors Γ and to derive some optimal rules. In this paper, we motivate the use of robust Bayes methodology under the asymmetric general entropy loss function in insurance and pursue two main goals, namely (i) computing premiums and (ii) predicting a future claim size. To achieve these goals, we choose some classes of priors and deal with (i) Bayes and posterior regret gamma minimax premium computation, and (ii) Bayes and posterior regret gamma minimax prediction of a future claim size under the general entropy loss. We also perform a prequential analysis and compare the performance of the posterior regret gamma minimax predictors against the Bayes predictors.

Keywords Bayesian analysis · Credibility premium · General entropy loss · Prequential analysis · Robust Bayesian analysis

1 Introduction

Let the random variable X (real or vector valued) represent the claim size of a risk having the distribution function F(·|θ), where θ belongs to the parameter space Θ. The main concerns of this paper are two problems in the Bayesian framework: (i) estimation

A. Karimnezhad · A. Parsian (B)


School of Mathematics, Statistics and Computer Science, University of Tehran, Tehran, Iran
e-mail: ahmad_p@khayam.ut.ac.ir
A. Karimnezhad
e-mail: a.karimnezhad@khayam.ut.ac.ir


of a risk premium PR ≡ PR(θ) and (ii) prediction of a future claim size Y based on currently observed data X = x. Bayesian methodology in credibility theory has been developed in several actuarial applications (Heilmann 1989; Klugman 1992; Makov et al. 1996; Gómez-Déniz et al. 1999, 2006; Gómez-Déniz 2008). In the credibility premium derivation studies, viewed as a statistical decision-making process, some optimal estimators P of the risk premium PR are derived under the symmetric squared error loss (SEL) function L(PR, P) = (PR − P)². However, the SEL function assigns the same penalty whether a policyholder is under- or overcharged, and gives no credit to situations in which the policyholder is undercharged and the insurance company might lose money. There are numerous examples in the literature suggesting that the loss function associated with estimation or prediction should assign a more severe penalty to overestimation than to underestimation, or vice versa (Parsian and Kirmani 2002; Boratynska 2008; Kiapour and Nematollahi 2011). Hence, the use of the symmetric SEL is questionable, and it is beneficial to consider loss functions which assign unequal penalties for being under- or overcharged. A useful asymmetric loss function for giving different credit to under- and overestimation is the general entropy loss (GEL) function
L(PR, P) = (PR/P)^q − q ln(PR/P) − 1,  q ≠ 0,  (1)

where the risk premium PR can be chosen to be one of the well-known classical
principles such as the net premium principle, the exponential principle and the Esscher
principle (Heilmann 1989; Bühlmann and Gisler 2005; Gómez-Déniz 2008). The GEL
function has been used by many researchers (Calabria and Pulcini 1994; Soliman 2005;
Singh et al. 2008) and is a more general version of the entropy loss function which
allows different shapes of the loss to be considered; see Dey et al. (1987) and Dey and
Liu (1992) for details. It is interesting to mention that the GEL function is asymmetric, convex in Δ = PR/P when q = 1 and quasi-convex otherwise, has a unique minimum at Δ = 1, and is scale invariant. The GEL function with negative q values penalizes overestimation more than underestimation, while with positive q values it acts vice versa.
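As a quick numerical illustration of this asymmetry, the following sketch (plain Python; the premium values are hypothetical and the function name `gel` is ours) evaluates the GEL function (1) for a premium that over- and undercharges the same risk by a factor of two:

```python
import math

def gel(pr, p, q):
    """General entropy loss (1): L(PR, P) = (PR/P)**q - q*ln(PR/P) - 1."""
    ratio = pr / p
    return ratio ** q - q * math.log(ratio) - 1.0

true_premium = 100.0
over = gel(true_premium, 200.0, -1)   # insurer overcharges by a factor of 2
under = gel(true_premium, 50.0, -1)   # insurer undercharges by a factor of 2
assert over > under                   # q < 0: overestimation penalized more

assert gel(true_premium, 200.0, 1) < gel(true_premium, 50.0, 1)  # q > 0: reversed
```

Note that the loss vanishes only at PR = P and grows without bound in both directions, but at different rates on the two sides.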
Any statistical decision-making problem in the Bayesian framework behaves under
a specific structure function (or prior density) π over the parameter space and, obvi-
ously, no one can say a chosen structure function acts better than any other. More
specifically, information on the appropriate prior is often too inadequate to specify
a prior distribution unambiguously. The problem of expressing uncertainty regarding
prior information can be solved by using a class Γ of priors as mentioned in Kamińska
and Porosiński (2009). This situation occurs in practice when a practitioner is unable
to choose an explicit prior. In connection with prior uncertainty, it is interesting to consider an ε-contaminated class of priors given by

Γ_ε = {π_ε = (1 − ε)π0 + επ : π ∈ Γ},  (2)

where π0 is an explicit prior, π ∈ Γ is any prior in the family of all probability distributions on (0, +∞), and ε ∈ [0, 1] is a fixed known constant. The ε-contaminated class of priors Γ_ε is a beneficial tool when dealing with the effect of prior uncertainty. To see this, note that when ε = 0, Γ_ε has just one member, i.e., the explicit prior


π0. When ε = 1, Γ_ε contains many alternative priors we might prefer to use instead of π0, and this case is interesting in robust Bayes analysis. Finally, 0 < ε < 1 reflects the practitioner's belief on how close the prior π_ε is to the explicit prior π0. For more information see Berger (1985), Sivaganesan (1988), Sivaganesan and Berger (1989), Berger and Moreno (1994), and Gómez-Déniz et al. (1999, 2006).
In this paper, we focus on the use of Bayesian and robust Bayesian methodology under the GEL function. In Sects. 2 and 3, we deal with Bayes and robust Bayes premium estimation and derive Bayes and posterior regret gamma minimax premiums. In Sects. 4 and 5, we derive Bayes and posterior regret gamma minimax predictors of a future claim size. Illustrative examples are provided to clarify how the results benefit the practitioner. Finally, in Sect. 6, a prequential analysis is used to compare the behavior of Bayes and posterior regret gamma minimax predictors of future claim sizes. To keep readers on track, all the proofs are postponed to the Appendix.

2 Bayes premiums

In this section, we utilize the Bayesian viewpoint which has a substantial role in
modern statistics (Berger 1985) and consider Bayes premium calculation under the
GEL function (1). All the premium estimators PE are considered to satisfy

Pθ(PE ∈ PR(Θ)) = 1,  ∀ θ ∈ Θ = (0, ∞),

where PR(Θ) denotes the range of PR. This condition generates a class of premium estimators, and we denote this class by P. Now, suppose the individual risk X has a distribution F(·|θ) and density function f(·|θ), where θ ∈ Θ. Further, suppose
θ has a prior distribution (structure function in risk theory) with density π(θ ). In
actuarial literature, when experience is not available, the actuary charges the collec-
tive premium PCπ , which is given by minimizing the risk function, i.e., minimizing
E π(θ) [L(PR , P)], where E π(θ) [.] denotes the expectation w.r.t. the prior information
π(θ ). However, if the experience is available, the actuary computes the Bayes pre-
mium PBπ (x) ≡ PBπ , which is given by minimizing the posterior risk function, i.e.,
minimizing ρ(π, P) = E π(θ|x) [L(PR , P)], where E π(θ|x) [.] denotes the expectation
w.r.t. the posterior π(θ |x).
The following proposition provides the collective and Bayes premium of PR under
the GEL function (1).

Proposition 1 Let X = (X_1, . . . , X_t) be a sequence of independent and identically distributed random variables from a distribution with pdf f(x|θ), θ ∈ Θ. Consider the mixed prior π_ε(θ) = (1 − ε)π0(θ) + επ(θ) w.r.t. some σ-finite measure ν. Then,
(i) The posterior distribution is π_ε(θ|X) = λπ0(θ|X) + (1 − λ)π(θ|X) with λ ≡ λ(X) = (1 − ε)m_{π0}(X) / [(1 − ε)m_{π0}(X) + ε m_π(X)], where m_{π0}(X) = ∫_Θ f(X|θ)π0(θ) dν(θ) and m_π(X) = ∫_Θ f(X|θ)π(θ) dν(θ).
(ii) The unique collective premium of PR w.r.t. the prior π_ε(θ) under the GEL function (1) is given by



PC^{π_ε} = [(1 − ε)PC^{π0} + εPC^{π}]^{1/q},

where PC^{π•} = E^{π•(θ)}[PR^q], π• ∈ {π0, π}.
(iii) The unique Bayes premium of PR w.r.t. the prior π_ε(θ) under the GEL function (1) is given by

PB^{π_ε}(X) ≡ PB^{π_ε} = [λPB^{π0} + (1 − λ)PB^{π}]^{1/q},

where PB^{π•} = E^{π•(θ|X)}[PR^q], π• ∈ {π0, π}.

Proof See the Appendix. □

The following examples provide two illustrations of premium calculation under the GEL function (1).

Example 1 (Gamma-Gamma Model) Suppose X_1, . . . , X_t represent independent claim sizes of single risks during the last t periods, each having a Γ(ν, θ)-distribution with probability density function (pdf)

f(x|θ) = (θ^ν / Γ(ν)) x^{ν−1} e^{−θx},  x > 0, θ > 0, ν > 0,

and consider Γ(α0, β0) and Γ(α, β) as prior distributions for π0 and π, respectively. It is easy to verify that the contaminated posterior distribution of θ given X = (X_1, . . . , X_t) is π_ε(θ|X) = λΓ(α0 + νt, β0 + S) + (1 − λ)Γ(α + νt, β + S) with S = Σ_{i=1}^t X_i and λ = (1 − ε)/(1 − ε + ελ*), where

λ* = [β^α (S + β0)^{α0+νt} Γ(α + νt) Γ(α0)] / [β0^{α0} (S + β)^{α+νt} Γ(α0 + νt) Γ(α)].

Then, under the net premium principle, i.e., PR = E^θ[X_{t+1}], and under the GEL function (1), the risk, collective and Bayes premiums are given respectively by

PR = ν/θ,
PC^{π_ε} = ν[(1 − ε)β0^q Ψ^q(α0, q) + εβ^q Ψ^q(α, q)]^{1/q},
PB^{π_ε} = ν[λ(S + β0)^q Ψ^q(α0 + νt, q) + (1 − λ)(S + β)^q Ψ^q(α + νt, q)]^{1/q},  (3)



where Ψ(m, q) = [Γ(m − q)/Γ(m)]^{1/q}, provided m > q. It is interesting to realize that under both the Stein and entropy losses, i.e., q = −1 and q = 1, the Bayes net premium PB^{π_ε} in (3) when ε = 0 or 1 can be restated as a credibility formula of the form

PB^{π_ε} = (1 − ε)PB^{π0} + εPB^{π},  ε = 0, 1,  (4)

where PB^{π0} and PB^{π}, each, is a credibility formula of the form

PB^{π•} = Zt^{π•} X̄ + (1 − Zt^{π•})PC^{π•},  π• ∈ {π0, π},  (5)


in which the credibility factor Zt^{π•} and the risk premium PC^{π•} correspond to the prior π•. As a special case, when ε = 1, i.e., π_ε = π, and for q = 1, the credibility factor and the risk premium in (5) are of the form Zt^{π} = νt/(α + νt − 1) and PC^{π} = νβ/(α − 1), respectively. It is also interesting to note that when ε = 0 or 1 and q = 1, the credibility factor can be restated in the form

Zt^{π•} = tVar(E^θ[X]) / (tVar(E^θ[X]) + E[Var^θ[X]]),  (6)

as in Heilmann (1989).
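The Bayes premium (3) can be evaluated directly with standard library gamma functions. The sketch below is ours, not the paper's (helper names `psi` and `bayes_net_premium` are assumptions); it works on the log scale for numerical stability and assumes the Gamma-Gamma model of Example 1:

```python
import math

def psi(m, q):
    """Psi(m, q) = (Gamma(m - q)/Gamma(m))**(1/q), provided m > q."""
    return math.exp((math.lgamma(m - q) - math.lgamma(m)) / q)

def bayes_net_premium(xs, nu, q, a0, b0, a, b, eps):
    """Bayes net premium (3) for claim sizes xs in the Gamma-Gamma model."""
    t, s = len(xs), sum(xs)
    # lambda* and lambda(X) of Example 1, computed on the log scale
    log_lstar = (a * math.log(b) - a0 * math.log(b0)
                 + (a0 + nu * t) * math.log(s + b0)
                 - (a + nu * t) * math.log(s + b)
                 + math.lgamma(a + nu * t) - math.lgamma(a0 + nu * t)
                 + math.lgamma(a0) - math.lgamma(a))
    lam = (1 - eps) / ((1 - eps) + eps * math.exp(log_lstar))
    pq = (lam * (s + b0) ** q * psi(a0 + nu * t, q) ** q
          + (1 - lam) * (s + b) ** q * psi(a + nu * t, q) ** q)
    return nu * pq ** (1.0 / q)

# eps = 0, q = 1: reduces to the credibility premium nu*(S + b0)/(a0 + nu*t - 1)
print(bayes_net_premium([2.0, 3.0, 1.0], 4.0, 1.0, 4.0, 1.0, 4.0, 1.5, 0.0))  # 28/15
```

With ε = 0 the contamination term drops out (λ = 1), so the hyperparameters (α, β) of the contaminating prior do not affect the result.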
Example 2 (Poisson-Gamma Model) Suppose that X_1, . . . , X_t represent independent numbers of claims having a P(θ)-distribution with probability function (pf)

f_θ(x) = e^{−θ}θ^x / x!,  x = 0, 1, . . . , θ > 0,

and consider Γ(α0, β0) and Γ(α, β) as prior distributions for π0 and π, respectively. Then, the contaminated posterior distribution of θ given the data X is π_ε(θ|X) = λΓ(S + α0, β0 + t) + (1 − λ)Γ(S + α, β + t) with S = Σ_{i=1}^t X_i and λ = (1 − ε)/(1 − ε + ελ*), where

λ* = [β^α (β0 + t)^{S+α0} Γ(S + α) Γ(α0)] / [β0^{α0} (β + t)^{S+α} Γ(S + α0) Γ(α)].

Then it can be verified that under the net premium principle and under the GEL function (1), the risk, collective and Bayes premiums are

PR = θ,
PC^{π_ε} = [(1 − ε)Ψ*^q(α0, q)/β0^q + εΨ*^q(α, q)/β^q]^{1/q},
PB^{π_ε} = [λΨ*^q(S + α0, q)/(β0 + t)^q + (1 − λ)Ψ*^q(S + α, q)/(β + t)^q]^{1/q},  (7)

where Ψ*(m, q) = [Γ(m + q)/Γ(m)]^{1/q}, provided m + q > 0. Again, it is interesting to realize that under both the Stein and entropy losses, the Bayes net premium (7) when ε = 0 or 1 can be restated as the credibility formula (4), where PB^{π0} and PB^{π}, each, is a credibility formula of the form (5). As a special case, when ε = 1, i.e., π_ε = π, and for q = 1, the credibility factor and the risk premium in (5) are of the form Zt^{π} = t/(β + t) and PC^{π} = α/β, respectively. It is also interesting to note that for both the entropy and Stein losses, the credibility factor can be restated in the form (6).
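The credibility representation (4)-(6) is easy to check numerically in the Poisson-Gamma model. A minimal sketch (the claim counts and hyperparameters are hypothetical; ε = 0 and q = 1, in which case formula (7) reduces to (S + α)/(β + t)):

```python
import math

def psi_star(m, q):
    """Psi*(m, q) = (Gamma(m + q)/Gamma(m))**(1/q), provided m + q > 0."""
    return math.exp((math.lgamma(m + q) - math.lgamma(m)) / q)

a, b = 4.0, 2.0                  # single prior Gamma(a, b), i.e., eps = 0
xs = [1, 0, 2, 3, 1]             # observed numbers of claims
t, s = len(xs), sum(xs)

pb = psi_star(s + a, 1) / (b + t)   # Bayes premium (7) with eps = 0, q = 1
z = t / (b + t)                     # credibility factor of Example 2
pc = a / b                          # collective premium
assert abs(pb - (z * (s / t) + (1 - z) * pc)) < 1e-12   # credibility form (5)
```

The assertion verifies that the Bayes premium is exactly the credibility-weighted average of the sample mean and the collective premium.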

3 Posterior regret gamma minimax premiums

In this section, we use the robust Bayesian methodology in the context of prior uncertainty. This situation occurs in insurance when a practitioner is unwilling or unable to choose an explicit prior, but might be able to restrict himself/herself to a specific class
of possible priors. Robust Bayesian methodology provides a useful tool to consider
the prior knowledge in terms of a class Γ of priors and compute a posterior functional,


such as posterior risk, Bayes risk or posterior expected value, as the prior ranges over
Γ . See Berger (1984, 1985, 1990) for more details.
Dealing with the posterior regret as a posterior functional due to choosing the rule P instead of the Bayes premium PB^{π}, i.e., r_p(P, PB^{π}) = ρ(π, P) − ρ(π, PB^{π}), the rule P_{PR,Γ} is said to be a posterior regret gamma minimax (PRGM) premium over the class of priors Γ if sup_{π∈Γ} r_p(P_{PR,Γ}, PB^{π}) = inf_{P∈P} sup_{π∈Γ} r_p(P, PB^{π}). Regret-type rules in decision theory have been used and appreciated for a very long time. See Berger (1984, 1985) and Zen and DasGupta (1993).
To derive the PRGM premiums under the GEL function (1) and over the class of priors Γ_ε in (2), the posterior regret is computed as follows:

r_p(P, PB^{π_ε}) = ρ(π_ε, P) − ρ(π_ε, PB^{π_ε}) = (PB^{π_ε}/P)^q − q ln(PB^{π_ε}/P) − 1.  (8)

The following theorem provides the PRGM premiums under the GEL function (1).

Theorem 1 Suppose P̲(X) ≡ P̲ = inf_{π_ε∈Γ_ε} PB^{π_ε} and P̄(X) ≡ P̄ = sup_{π_ε∈Γ_ε} PB^{π_ε} are finite and P̲ < P̄. Then the PRGM premium over the class Γ_ε and under the GEL function (1) is

P_{PR,Γ_ε}(X) ≡ P_{PR,Γ_ε} = [(P̄^q − P̲^q) / (q(ln P̄ − ln P̲))]^{1/q}.

Proof See the Appendix. □

To clarify how to compute the PRGM premiums, consider the following examples.

Example 3 (Example 1, continued) In the ε-contaminated class (2), let Γ(α0, β0) be a prior distribution for π0 and let Γ be one of the following classes:

Γ1 = {Γ(α, β); α1 ≤ α ≤ α2, β = β0},
Γ2 = {Γ(α, β); β1 ≤ β ≤ β2, α = α0},
Γ3 = {Γ(α, β); α1 ≤ α ≤ α2, β1 ≤ β ≤ β2},
Γ4 = {Γ(α, β); γ1 ≤ PC^{π} ≤ γ2, α = α0},

and denote the corresponding ε-contaminated class by Γ_ε^i. Then, the PRGM net premium over Γ_ε^i, i = 1, . . . , 4, under the GEL function (1) is given by

P_{PR,Γ_ε^i} = [ ε(P̄_i^q − P̲_i^q) / ( ln[(1 − ε)(PB^{π0})^q + εP̄_i^q] − ln[(1 − ε)(PB^{π0})^q + εP̲_i^q] ) ]^{1/q},  (9)

where PB^{π0} = νΨ(α0 + νt, q)(S + β0), P̲_i = inf_{π∈Γi} PB^{π} and P̄_i = sup_{π∈Γi} PB^{π}, and PB^{π} is the Bayes premium w.r.t. π. We observe that PB^{π} is decreasing in α and increasing in β. Hence, for the chosen classes Γi, i = 1, . . . , 4,

P̲_1 = νΨ(α2 + νt, q)(S + β0),  P̄_1 = νΨ(α1 + νt, q)(S + β0),
P̲_2 = νΨ(α0 + νt, q)(S + β1),  P̄_2 = νΨ(α0 + νt, q)(S + β2),
P̲_3 = νΨ(α2 + νt, q)(S + β1),  P̄_3 = νΨ(α1 + νt, q)(S + β2),
P̲_4 = νΨ(α0 + νt, q)(S + γ1*),  P̄_4 = νΨ(α0 + νt, q)(S + γ2*),

where γj* = γj / (νΨ(α0, q)), j = 1, 2.
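Once the bounds P̲ and P̄ are available, the rule of Theorem 1 is a one-line computation. A small sketch (the helper name `prgm` and the numerical bounds are ours, chosen for illustration only):

```python
import math

def prgm(lower, upper, q):
    """PRGM rule of Theorem 1: ((U^q - L^q) / (q (ln U - ln L)))**(1/q)."""
    num = upper ** q - lower ** q
    den = q * (math.log(upper) - math.log(lower))
    return (num / den) ** (1.0 / q)

# Hypothetical bounds on the Bayes premium over the class of priors:
p_low, p_up = 2.0, 5.0
p_star = prgm(p_low, p_up, -1)
assert p_low < p_star < p_up        # the PRGM premium lies between the bounds

# For q = 1 the rule is exactly the logarithmic mean of the two bounds:
assert abs(prgm(p_low, p_up, 1) - (p_up - p_low) / math.log(p_up / p_low)) < 1e-12
```

Since the PRGM premium always falls strictly between the extreme Bayes premiums, it can be read as a compromise rule over the class of priors.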

Example 4 (Example 2, continued) The PRGM net premiums over the classes Γ_ε^i, i = 1, . . . , 4, under the GEL function (1) are given by (9) with PB^{π0} = Ψ*(S + α0, q)/(β0 + t) and

P̲_1 = Ψ*(S + α1, q)/(β0 + t),  P̄_1 = Ψ*(S + α2, q)/(β0 + t),
P̲_2 = Ψ*(S + α0, q)/(β2 + t),  P̄_2 = Ψ*(S + α0, q)/(β1 + t),
P̲_3 = Ψ*(S + α1, q)/(β2 + t),  P̄_3 = Ψ*(S + α2, q)/(β1 + t),
P̲_4 = Ψ*(S + α0, q)/(γ1‡ + t),  P̄_4 = Ψ*(S + α0, q)/(γ2‡ + t),

where γj‡ = Ψ*(α0, q)/γj, j = 1, 2.

4 Bayes prediction of a future claim size

The goal of this section is to predict some future observation of Y based on observing
X = x. Let X denote some claim sizes and the random variable Y represent a future
claim size with joint distribution function F(x, y|θ ) and density f (x, y|θ ), where
θ ∈ . Further, suppose θ has a prior density π w.r.t. some σ -finite measure ν. The
predictive distribution is given by

h_π(y|x) = ∫_Θ [f(x, y|θ)/f(x|θ)] π(θ|x) dν(θ),  (10)


where f(x|θ) is the density of X given θ, and π(θ|x) is the posterior density of θ given X = x. In some cases, it is reasonable to assume that, given θ, X and Y are conditionally independent, i.e., f(y|x, θ) = f(y|θ) (see Spiegelhalter et al. 2004 and Boratynska 2006). In this case, the predictive distribution (10) reduces to

h_π(y|x) = ∫_Θ f(y|θ)π(θ|x) dν(θ).


Applications of predictive distributions to reinsurance are given in Hesselager


(1993) and Hürlimann (1993, 1995). For other applications, see Zellner (1971), Box


and Tiao (1973), Aitchison and Dunsmore (1975), Grieve (1998), Berry and Stangl
(1996) as well as Dickson et al. (1998), Nayak (2000) and Spiegelhalter et al. (2004).
In this section, we apply the prediction procedure in insurance and obtain Bayes
predictors under the general entropy prediction loss (GEPL) function
L(Y, D) = (Y/D)^q − q ln(Y/D) − 1.  (11)

The prediction problem in insurance has not been studied well and is in its infancy. One reason could be the fact that, in a random sample of size t and under the squared prediction error loss function L(Y, D) = (Y − D)², a Bayes predictor of a future observation X_{t+1} is identical to the Bayes net premium of the corresponding risk premium under the SEL function; see Bühlmann and Gisler (2005). But, as will be seen, this property need not hold under other choices of prediction loss function.
A Bayes predictor DBπ w.r.t. a given prior π is obtained by minimizing the posterior
risk ρ(π, D) = E h π (y|x) [L(Y, D)], where E h π (y|x) [.] denotes the expectation w.r.t.
the associated predictive density h π (y|x). The following proposition provides the
Bayes predictor under the GEPL function (11).
Proposition 2 Let π_ε(θ) = (1 − ε)π0(θ) + επ(θ) be a mixed prior, where π0 is an explicit prior, π ∈ Γ is any prior in the family of all probability distributions on (0, +∞), and ε ∈ [0, 1] is a fixed known constant. Then,
(i) The posterior distribution is π_ε(θ|x) = λ(x)π0(θ|x) + [1 − λ(x)]π(θ|x) with λ(x) = (1 − ε)m_{π0}(x) / [(1 − ε)m_{π0}(x) + ε m_π(x)], where m_{π0}(x) = ∫_Θ f(x|θ)π0(θ) dν(θ), m_π(x) = ∫_Θ f(x|θ)π(θ) dν(θ), and f(x|θ) is the density of x given θ.
(ii) The predictive density (10) can be represented as

h_{π_ε}(y|x) = λ(x)h_{π0}(y|x) + [1 − λ(x)]h_π(y|x).

(iii) The Bayes predictor is given by

DB^{π_ε}(x) = [λ(x)DB^{π0}(x) + (1 − λ(x))DB^{π}(x)]^{1/q},

where DB^{π•}(x) = E^{h_{π•}(y|x)}[Y^q], π• ∈ {π0, π}.
Proof See the Appendix. □

The next example makes use of the above facts in the Gamma–Gamma model.
Example 5 (Example 1, continued) Suppose X = (X_1, . . . , X_t) is the observed vector of claim sizes and we are interested in obtaining the Bayes predictor of the future claim size X_{t+1} w.r.t. the prior π_ε(θ). It suffices to compute E^{h_{π0}(X_{t+1}|X)}[X_{t+1}^q], as follows:

E^{h_{π0}(X_{t+1}|X)}[X_{t+1}^q] = E^{π0(θ|X)}[E^{X_{t+1}|θ}[X_{t+1}^q]]
 = Ψ*^q(ν, q) E^{π0(θ|X)}[θ^{−q}]
 = Ψ*^q(ν, q) Ψ^q(α0 + νt, q)(S + β0)^q,

where Ψ*(m, q) = [Γ(m + q)/Γ(m)]^{1/q}, m + q > 0, and Ψ(m, q) = [Γ(m − q)/Γ(m)]^{1/q}, m − q > 0. Then, the Bayes predictor w.r.t. the prior π_ε is given by

DB^{π_ε}(X) = Ψ*(ν, q)[λ(X)Ψ^q(α0 + νt, q)(S + β0)^q + (1 − λ(X))Ψ^q(α + νt, q)(S + β)^q]^{1/q}.

As a special case, when ε = 0, the Bayes predictor of X_{t+1} is given by DB^{π0}(X) = Ψ*(ν, q)Ψ(α0 + νt, q)(S + β0), and obviously it is not identical to the Bayes premium PB^{π0} in Example 1.
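The difference between the Bayes predictor and the Bayes premium of Example 1 can be seen numerically. A sketch with hypothetical claim sizes and ε = 0 (note that for q = 1 the two coincide, since Ψ*(ν, 1) = ν; here q = −1, where they differ exactly by the factor (ν − 1)/ν):

```python
import math

def psi(m, q):       # Psi(m, q) = (Gamma(m - q)/Gamma(m))**(1/q), m > q
    return math.exp((math.lgamma(m - q) - math.lgamma(m)) / q)

def psi_star(m, q):  # Psi*(m, q) = (Gamma(m + q)/Gamma(m))**(1/q), m + q > 0
    return math.exp((math.lgamma(m + q) - math.lgamma(m)) / q)

nu, q, a0, b0 = 4.0, -1.0, 4.0, 1.0   # hypothetical model and prior
xs = [2.0, 1.5, 3.0]                  # hypothetical observed claim sizes
t, s = len(xs), sum(xs)

d_b = psi_star(nu, q) * psi(a0 + nu * t, q) * (s + b0)  # Bayes predictor, eps = 0
p_b = nu * psi(a0 + nu * t, q) * (s + b0)               # Bayes premium of Example 1
assert abs(d_b / p_b - (nu - 1) / nu) < 1e-12           # they differ when q = -1
```

The ratio is driven entirely by the extra predictive factor Ψ*(ν, q) replacing ν, i.e., by the sampling variability of X_{t+1} that the premium ignores.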

5 Posterior regret gamma minimax prediction of a future claim size

This section is devoted to posterior regret gamma minimax prediction of a future observation of claim size Y. In the context of robust Bayesian prediction methodology, given a class of prior distributions Γ, we say D_{PR,Γ} is a PRGM predictor over the class Γ if sup_{π∈Γ} r_p(D_{PR,Γ}, DB^{π}) = inf_{D∈D} sup_{π∈Γ} r_p(D, DB^{π}), where r_p(D, DB^{π}) = ρ(π, D) − ρ(π, DB^{π}) measures the posterior regret of choosing the predictor D instead of the Bayes predictor DB^{π}.
Once again, let the class of priors be Γ_ε given in (2). It is easy to verify that the posterior prediction regret under the GEPL function (11) is given by

r_p(D, DB^{π_ε}) = ρ(π_ε, D) − ρ(π_ε, DB^{π_ε}) = (DB^{π_ε}/D)^q − q ln(DB^{π_ε}/D) − 1.  (12)

The next theorem provides the PRGM predictor of a future claim size under the
GEPL function (11). The proof is similar to that of Theorem 1 and hence omitted.

Theorem 2 Suppose D̲(X) ≡ D̲ = inf_{π_ε∈Γ_ε} DB^{π_ε} and D̄(X) ≡ D̄ = sup_{π_ε∈Γ_ε} DB^{π_ε} are finite and D̲ < D̄. Then the PRGM predictor of a future claim size over the ε-contaminated class Γ_ε and under the GEPL function (11) is given by

D_{PR,Γ_ε}(X) ≡ D_{PR,Γ_ε} = [(D̄^q − D̲^q) / (q(ln D̄ − ln D̲))]^{1/q}.

The next example clarifies the derivation of PRGM predictors in the Gamma-
Gamma model.

Example 6 (Example 5, continued) The interest is in obtaining the PRGM predictor of the future claim size X_{t+1} w.r.t. the prior π_ε(θ). Once again, let Γ(α0, β0) be a prior distribution for π0, let Γ be one of the classes considered in Example 3, and denote the corresponding ε-contaminated class by Γ_ε^i. Then, the PRGM predictor over Γ_ε^i, i = 1, . . . , 4, under the GEPL function (11) is specified by

D_{PR,Γ_ε^i} = [ ε(D̄_i^q − D̲_i^q) / ( ln[(1 − ε)(DB^{π0})^q + εD̄_i^q] − ln[(1 − ε)(DB^{π0})^q + εD̲_i^q] ) ]^{1/q},  (13)

where DB^{π0}(X) = Ψ*(ν, q)Ψ(α0 + νt, q)(S + β0), D̲_i = inf_{π∈Γi} DB^{π} and D̄_i = sup_{π∈Γi} DB^{π}, and DB^{π} is the Bayes predictor w.r.t. π. We observe that DB^{π} is decreasing in α and increasing in β. Hence, for a chosen class Γi, i = 1, . . . , 4,

D̲_1 = Ψ*(ν, q)Ψ(α2 + νt, q)(S + β0),  D̄_1 = Ψ*(ν, q)Ψ(α1 + νt, q)(S + β0),
D̲_2 = Ψ*(ν, q)Ψ(α0 + νt, q)(S + β1),  D̄_2 = Ψ*(ν, q)Ψ(α0 + νt, q)(S + β2),
D̲_3 = Ψ*(ν, q)Ψ(α2 + νt, q)(S + β1),  D̄_3 = Ψ*(ν, q)Ψ(α1 + νt, q)(S + β2),
D̲_4 = Ψ*(ν, q)Ψ(α0 + νt, q)(S + γ1*),  D̄_4 = Ψ*(ν, q)Ψ(α0 + νt, q)(S + γ2*),

where γj* = γj / (νΨ(α0, q)), j = 1, 2.

6 A prequential analysis

In this section, we use a prequential analysis to compare the behavior of Bayes predictors against PRGM predictors (Dawid 1984; Dawid and Vovk 1999). In Example 5, let Γ(4, 1) and Γ(4, 1.5) be prior distributions for π0 and π, and let d[1] and d[2] be the Bayes predictors w.r.t. π_ε when ε = 0 and ε = 0.05, respectively. Also, in Example 6, consider the four classes of priors with α0 = 4, α1 = 2, α2 = 6, β0 = 1, β1 = 1, β2 = 5, γ1 = 1, γ2 = 5, and denote the corresponding contaminated classes by Γ_1^i, i = 1, . . . , 4. For ε = 1, denote the resulting PRGM predictors by d[3] = D_{PR,Γ_1^1}, d[4] = D_{PR,Γ_1^2}, d[5] = D_{PR,Γ_1^3} and d[6] = D_{PR,Γ_1^4}.
For a simulation study comparing the predictors, let X_1, X_2, . . . be a sequence of independent random variables from a Γ(ν, θ)-distribution. To predict a future observation of claim size X_{t+1} based on observing claim sizes during the past t periods, the problem reduces to predicting a future observation of Y = X_{t+1} ∼ Γ(ν, θ) based on an observation of X = Σ_{i=1}^t X_i ∼ Γ(νt, θ). We carry out the simulation study as follows:
1. Generate a random sample x_1, . . . , x_i from a Γ(ν, θ)-distribution with ν = 4 and θ = 0.5(0.5)2 (i.e., θ = 0.5, 1, 1.5, 2).
2. Generate the next observation x_{i+1} from the distribution considered in Step 1 and compute d_{i+1}[l], l = 1, . . . , 6, the predicted value of x_{i+1} based on x_1, x_2, . . . , x_i.
3. Calculate the prediction loss for x_{i+1}, i.e., (x_{i+1}/d_{i+1}[l])^q − q ln(x_{i+1}/d_{i+1}[l]) − 1, for q = −2, −1, 1, 2.
4. Increase i by 1 and repeat Steps 2 and 3 until i = t, where t = 10, 25(25)100 (i.e., t = 10, 25, 50, 75, 100).
5. Compute the average prediction risk (APR) for each predictor d_{i+1}[l], l = 1, . . . , 6, as

APR(d[l]) = (1/t) Σ_{i=1}^t [ (x_{i+1}/d_{i+1}[l])^q − q ln(x_{i+1}/d_{i+1}[l]) − 1 ],

for the selected values of q in Step 3.


Table 1 EAPR values of the Bayes and PRGM predictors for various values of θ and t over Γ_1^i, i = 1, . . . , 4, under the GEPL function (11)

θ    t    q = −2                                      q = −1
          d[1]   d[2]   d[3]   d[4]   d[5]   d[6]     d[1]   d[2]   d[3]   d[4]   d[5]   d[6]
0.50  10  0.9272 0.9246 0.8667 0.9250 0.8715 0.8847   0.2166 0.2158 0.1976 0.2151 0.1975 0.2055
      25  0.8202 0.8191 0.7946 0.8194 0.7966 0.8023   0.1863 0.1859 0.1783 0.1857 0.1783 0.1816
      50  0.7743 0.7737 0.7609 0.7738 0.7619 0.7650   0.1733 0.1731 0.1691 0.1730 0.1691 0.1709
      75  0.7712 0.7708 0.7624 0.7709 0.7630 0.7650   0.1690 0.1689 0.1662 0.1688 0.1662 0.1674
     100  0.7622 0.7619 0.7556 0.7620 0.7561 0.7576   0.1668 0.1667 0.1647 0.1667 0.1647 0.1656
1.00  10  0.9298 0.9272 0.8592 0.9281 0.8648 0.8746   0.2073 0.2064 0.1844 0.2060 0.1849 0.1923
      25  0.8078 0.8065 0.7740 0.8070 0.7765 0.7820   0.1818 0.1814 0.1719 0.1812 0.1721 0.1753
      50  0.7807 0.7801 0.7634 0.7803 0.7646 0.7675   0.1712 0.1710 0.1660 0.1709 0.1661 0.1678
      75  0.7579 0.7574 0.7461 0.7576 0.7469 0.7489   0.1673 0.1672 0.1638 0.1671 0.1639 0.1651
     100  0.7487 0.7484 0.7398 0.7485 0.7405 0.7420   0.1652 0.1651 0.1627 0.1651 0.1627 0.1636
1.50  10  0.8710 0.8685 0.7988 0.8694 0.8031 0.8094   0.1969 0.1961 0.1766 0.1958 0.1770 0.1815
      25  0.7948 0.7937 0.7629 0.7941 0.7647 0.7678   0.1772 0.1769 0.1686 0.1768 0.1688 0.1707
      50  0.7639 0.7633 0.7471 0.7635 0.7480 0.7497   0.1695 0.1693 0.1650 0.1692 0.1650 0.1661
      75  0.7539 0.7535 0.7432 0.7536 0.7438 0.7447   0.1664 0.1663 0.1633 0.1662 0.1634 0.1641
     100  0.7430 0.7427 0.7345 0.7428 0.7349 0.7358   0.1641 0.1640 0.1618 0.1640 0.1618 0.1624
2.00  10  0.8496 0.8477 0.7991 0.8482 0.7999 0.7976   0.1902 0.1897 0.1775 0.1894 0.1776 0.1775
      25  0.7946 0.7939 0.7786 0.7941 0.7780 0.7759   0.1750 0.1748 0.1697 0.1747 0.1697 0.1696
      50  0.7574 0.7570 0.7459 0.7571 0.7460 0.7455   0.1680 0.1679 0.1653 0.1679 0.1653 0.1652
      75  0.7514 0.7511 0.7430 0.7512 0.7432 0.7430   0.1648 0.1648 0.1629 0.1647 0.1629 0.1629
     100  0.7413 0.7411 0.7352 0.7411 0.7353 0.7351   0.1631 0.1631 0.1616 0.1630 0.1616 0.1616

θ    t    q = 1                                       q = 2
          d[1]   d[2]   d[3]   d[4]   d[5]   d[6]     d[1]   d[2]   d[3]   d[4]   d[5]   d[6]
0.50  10  0.2079 0.2064 0.1731 0.2013 0.1680 0.1963   0.8583 0.8488 0.6570 0.8039 0.6226 0.8225
      25  0.1664 0.1658 0.1521 0.1638 0.1500 0.1616   0.6546 0.6507 0.5713 0.6325 0.5573 0.6398
      50  0.1503 0.1500 0.1430 0.1490 0.1420 0.1479   0.5788 0.5767 0.5358 0.5676 0.5287 0.5711
      75  0.1441 0.1439 0.1393 0.1432 0.1386 0.1425   0.5480 0.5467 0.5201 0.5407 0.5155 0.5431
     100  0.1410 0.1408 0.1373 0.1403 0.1368 0.1398   0.5336 0.5326 0.5122 0.5281 0.5087 0.5298
1.00  10  0.1860 0.1846 0.1529 0.1813 0.1514 0.1726   0.7242 0.7170 0.5692 0.6913 0.5658 0.6892
      25  0.1577 0.1571 0.1437 0.1558 0.1430 0.1521   0.6010 0.5980 0.5353 0.5876 0.5337 0.5864
      50  0.1452 0.1449 0.1380 0.1443 0.1377 0.1424   0.5480 0.5464 0.5139 0.5413 0.5131 0.5405
      75  0.1412 0.1410 0.1363 0.1405 0.1361 0.1392   0.5302 0.5292 0.5076 0.5258 0.5071 0.5252
     100  0.1389 0.1387 0.1352 0.1384 0.1350 0.1374   0.5210 0.5202 0.5033 0.5175 0.5029 0.5171
1.50  10  0.1721 0.1711 0.1484 0.1687 0.1496 0.1596   0.6545 0.6496 0.5589 0.6335 0.5737 0.6243
      25  0.1518 0.1514 0.1419 0.1505 0.1425 0.1467   0.5722 0.5702 0.5321 0.5641 0.5382 0.5598
      50  0.1430 0.1427 0.1378 0.1423 0.1380 0.1403   0.5355 0.5344 0.5143 0.5313 0.5173 0.5290
      75  0.1395 0.1393 0.1359 0.1390 0.1360 0.1376   0.5216 0.5209 0.5066 0.5186 0.5085 0.5171
     100  0.1373 0.1372 0.1346 0.1370 0.1347 0.1359   0.5123 0.5118 0.5014 0.5102 0.5029 0.5090


Table 1 continued

θ    t    q = 1                                       q = 2
          d[1]   d[2]   d[3]   d[4]   d[5]   d[6]     d[1]   d[2]   d[3]   d[4]   d[5]   d[6]
2.00  10  0.1625 0.1618 0.1523 0.1602 0.1559 0.1527   0.6103 0.6074 0.5816 0.5982 0.6094 0.5875
      25  0.1480 0.1477 0.1434 0.1471 0.1449 0.1437   0.5545 0.5532 0.5405 0.5497 0.5518 0.5446
      50  0.1409 0.1408 0.1386 0.1405 0.1394 0.1388   0.5265 0.5258 0.5194 0.5241 0.5251 0.5215
      75  0.1374 0.1373 0.1359 0.1371 0.1364 0.1360   0.5124 0.5120 0.5078 0.5108 0.5116 0.5091
     100  0.1362 0.1362 0.1350 0.1360 0.1354 0.1351   0.5079 0.5076 0.5042 0.5067 0.5071 0.5054

6. Repeat Steps 1–5 N = 10^4 times and let APR_m(d[l]) denote the APR of predictor d[l], l = 1, . . . , 6, for the m-th repetition (m = 1, . . . , N). Calculate the estimated average prediction risk (EAPR) as the measure for the comparison study:

EAPR(d[l]) = (1/N) Σ_{m=1}^N APR_m(d[l]).
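The prequential loop above can be sketched in a few lines. The fragment below is a reduced version of the study, not the paper's code: stdlib Python only, a single replication, one θ, one value of q, and only the Γ1-class PRGM predictor with ε = 1, so it compares just the running average prediction risk of d[1] against d[3]:

```python
import math, random

def psi(m, q):       # Psi(m, q) = (Gamma(m - q)/Gamma(m))**(1/q)
    return math.exp((math.lgamma(m - q) - math.lgamma(m)) / q)

def psi_star(m, q):  # Psi*(m, q) = (Gamma(m + q)/Gamma(m))**(1/q)
    return math.exp((math.lgamma(m + q) - math.lgamma(m)) / q)

def gepl(y, d, q):   # general entropy prediction loss (11)
    return (y / d) ** q - q * math.log(y / d) - 1.0

def prgm(lo, hi, q): # PRGM rule of Theorem 2
    return ((hi ** q - lo ** q) / (q * (math.log(hi) - math.log(lo)))) ** (1.0 / q)

random.seed(0)
nu, theta, q, t = 4.0, 0.5, -1.0, 25
a0, b0, a1, a2 = 4.0, 1.0, 2.0, 6.0          # hyperparameters of Section 6

loss_bayes = loss_prgm = 0.0
xs = [random.gammavariate(nu, 1.0 / theta)]  # gammavariate takes shape and SCALE
for _ in range(t):
    s, n = sum(xs), len(xs)
    d_bayes = psi_star(nu, q) * psi(a0 + nu * n, q) * (s + b0)   # d[1] (eps = 0)
    # d[3]: PRGM predictor over Gamma_1 (alpha in [a1, a2], beta = b0), eps = 1
    d_low = psi_star(nu, q) * psi(a2 + nu * n, q) * (s + b0)
    d_up = psi_star(nu, q) * psi(a1 + nu * n, q) * (s + b0)
    d_rob = prgm(d_low, d_up, q)
    x_next = random.gammavariate(nu, 1.0 / theta)  # next period's claim
    loss_bayes += gepl(x_next, d_bayes, q)
    loss_prgm += gepl(x_next, d_rob, q)
    xs.append(x_next)

print(loss_bayes / t, loss_prgm / t)  # average prediction risks of d[1] and d[3]
```

Averaging such runs over many replications and over the grids of θ, t and q reproduces the structure of the EAPR comparison reported in Table 1.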

The results are summarized in Table 1 for comparison purposes. From this table, the following conclusions follow immediately:
(i) EAPR values decrease as t increases.
(ii) PRGM predictors mostly have smaller EAPR values than the Bayes predictors. All the PRGM predictors act much better than the Bayes predictor w.r.t. π0 (ε = 0). When considering ε-contaminated priors, d[5] and d[6] still perform better than the Bayes predictors. We should add here that none of the classes Γ_1^i, i = 1, . . . , 4, is superior to the alternatives, and thus we cannot determine the best PRGM predictor without having reliable information about the prior hyperparameters. In fact, each class of priors reflects the practitioner's knowledge about the hyperparameters, and when choosing a class of priors, the practitioner decides based only on his/her experience, in the hope of arriving at results compatible with the real world. We emphasize that if the chosen values of the hyperparameters are not justified, then PRGM predictors outperform the Bayes prediction rule, as is somewhat expected, since robust rules aim at global protection against a bad choice of a single prior. Obviously, for justified choices of the hyperparameters, the results may reverse, in the sense that for a justified prior the Bayes predictor outperforms the robust prediction rules.

Appendix

Proof of Proposition 1 The proof of (i) is easily attained following Lemma 4.2 of Berger (1985) with some adjustments in notation. Also see Sivaganesan and Berger (1989).
To prove part (ii), first notice that the collective premium of PR w.r.t. the ε-contaminated prior π_ε(θ) = (1 − ε)π0(θ) + επ(θ) is obtained by minimizing the following quantity:

E^{π_ε(θ)}[L(PR, P)] = ∫_Θ L(PR, P)π_ε(θ) dν(θ)
 = (1 − ε) ∫_Θ L(PR, P)π0(θ) dν(θ) + ε ∫_Θ L(PR, P)π(θ) dν(θ).  (14)

One can easily verify that when the loss is the GEL function (1), the RHS of (14) has a unique minimum, which is given by

(PC^{π_ε})^q = (1 − ε) ∫_Θ PR^q π0(θ) dν(θ) + ε ∫_Θ PR^q π(θ) dν(θ).

The proof of part (ii) is completed by defining PC^{π0} = E^{π0(θ)}[PR^q] and PC^{π} = E^{π(θ)}[PR^q].

Part (iii) is proved similarly to part (ii). The Bayes premium of PR w.r.t. the ε-contaminated prior π_ε(θ) is obtained by minimizing

E^{π_ε(θ|X)}[L(PR, P)] = ∫_Θ L(PR, P)π_ε(θ|X) dν(θ),  (15)

where π_ε(θ|X) is the posterior density given in part (i). Replacing the posterior density from part (i), (15) reduces to

E^{π_ε(θ|X)}[L(PR, P)] = λ ∫_Θ L(PR, P)π0(θ|X) dν(θ) + (1 − λ) ∫_Θ L(PR, P)π(θ|X) dν(θ).  (16)

It can be easily verified that when the loss is the GEL function (1), the RHS of (16) has a unique minimum at

PB^{π_ε} = [λPB^{π0} + (1 − λ)PB^{π}]^{1/q},

where PB^{π0} = E^{π0(θ|X)}[PR^q] and PB^{π} = E^{π(θ|X)}[PR^q]. □

Proof of Theorem 1 Note that

∂/∂PB^{π} r_p(P, PB^{π}) = (q/PB^{π})[(PB^{π}/P)^q − 1]

and

∂²/∂(PB^{π})² r_p(P, PB^{π}) = (q/(PB^{π})²)[(q − 1)(PB^{π}/P)^q + 1].

So, r_p(P, PB^{π}) has a unique minimum at PB^{π} = P. The following three steps lead us to the PRGM premium under the GEL function (1). Notice that P̲ < PB^{π} < P̄.

Step 1: When P < P̲, differentiating r_p(P, PB^{π}) w.r.t. PB^{π}, it is easy to verify that

sup_{π∈Γ} r_p(P, PB^{π}) = r_p(P, P̄).

Let f_1(P) = r_p(P, P̄); then f_1′(P) = −(q/P)[(P̄/P)^q − 1]. So for all P < P̲, f_1(P) is decreasing in P and inf_{P<P̲} f_1(P) = f_1(P̲). It means that

inf_{P<P̲} sup_{π∈Γ} r_p(P, PB^{π}) = r_p(P̲, P̄);  note that f_1(P̄) = 0.

Step 2: When P > P̄, differentiating r_p(P, PB^{π}) w.r.t. PB^{π}, it is easy to verify that

sup_{π∈Γ} r_p(P, PB^{π}) = r_p(P, P̲).

Let f_2(P) = r_p(P, P̲); then f_2′(P) = −(q/P)[(P̲/P)^q − 1]. So for all P > P̄, f_2(P) is increasing in P and inf_{P>P̄} f_2(P) = f_2(P̄). It means that

inf_{P>P̄} sup_{π∈Γ} r_p(P, PB^{π}) = r_p(P̄, P̲);  note that f_2(P̲) = 0.

Step 3: When P̲ < P < P̄, there are two possibilities: either P̲ < P < PB^{π} ≤ P̄ or P̲ ≤ PB^{π} < P < P̄. So for these possibilities, sup_{π∈Γ} r_p(P, PB^{π}) = max{r_p(P, P̲), r_p(P, P̄)}. Let

l(P) = f_1(P) − f_2(P) = (P̄/P)^q − q ln(P̄/P) − (P̲/P)^q + q ln(P̲/P)
 = (P̄^q − P̲^q)/P^q − q ln(P̄/P̲).

It is easy to verify that l(P) is continuous and decreasing in P and l(P̲)l(P̄) < 0. Thus, there exists P* ∈ (P̲, P̄) such that l(P*) = 0, i.e., r_p(P*, P̄) = r_p(P*, P̲). Therefore, for all P̲ < P < P*, l(P) > 0 and max{r_p(P, P̲), r_p(P, P̄)} = r_p(P, P̄). Furthermore, for all P̲ < P < P*, r_p(P, P̄) is decreasing in P and

inf_{P̲<P<P*} r_p(P, P̄) = r_p(P*, P̄).

On the other hand, for all P* < P < P̄, l(P) < 0 and max{r_p(P, P̲), r_p(P, P̄)} = r_p(P, P̲). Hence, for all P* < P < P̄, r_p(P, P̲) is increasing in P and

inf_{P*<P<P̄} r_p(P, P̲) = r_p(P*, P̲).


Combining these arguments leads us to

$$\inf_{\underline{P} < P < \bar{P}}\, \sup_{\pi \in \Gamma} r_p(P, P_B^{\pi}) = r_p(P^*, \underline{P}) = r_p(P^*, \bar{P}).$$

Using the above three steps, it is easy to see that

$$\inf_{P \in \mathcal{D}}\, \sup_{\pi \in \Gamma} r_p(P, P_B^{\pi}) = \inf_{\underline{P} < P < \bar{P}}\, \sup_{\pi \in \Gamma} r_p(P, P_B^{\pi}) = r_p(P^*, \underline{P}) = r_p(P^*, \bar{P}).$$

Now, $P_{PR}^{\Gamma} = P^* \in \left(\underline{P}, \bar{P}\right)$ is given by $l\left(P_{PR}^{\Gamma}\right) = 0$, which leads to

$$P_{PR}^{\Gamma} = \left[\frac{\bar{P}^{q} - \underline{P}^{q}}{\ln \bar{P}^{q} - \ln \underline{P}^{q}}\right]^{1/q}. \qquad \Box$$
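The closed form just derived can be verified numerically; in the sketch below the bounds $\underline{P}$, $\bar{P}$ and the value of $q$ are illustrative assumptions. It checks that $P^*$ equates the regrets against the two extreme Bayes premiums and minimizes the worst-case regret:

```python
import numpy as np

q = 1.5                # GEL shape (illustrative)
p_lo, p_hi = 2.0, 5.0  # assumed bounds on the Bayes premium over Gamma

def regret(P, PB):
    """Posterior regret r_p(P, P_B) = (P_B/P)^q - q*ln(P_B/P) - 1."""
    r = PB / P
    return r**q - q * np.log(r) - 1.0

# PRGM premium from Theorem 1
p_star = ((p_hi**q - p_lo**q) / (np.log(p_hi**q) - np.log(p_lo**q))) ** (1 / q)
assert p_lo < p_star < p_hi

# P* balances the regrets against the two extreme Bayes premiums ...
assert abs(regret(p_star, p_lo) - regret(p_star, p_hi)) < 1e-10

# ... and minimizes the worst-case regret over (p_lo, p_hi)
grid = np.linspace(p_lo, p_hi, 100_001)
worst = np.maximum(regret(grid, p_lo), regret(grid, p_hi))
assert abs(grid[int(np.argmin(worst))] - p_star) < 1e-3
```

The worst-case curve is the maximum of a decreasing and an increasing branch, so its minimum sits exactly at their crossing point $P^*$.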



Proof of Proposition 2 The proof of (i) is similar to part (i) of Proposition 1. To prove part (ii), notice that from (10) the predictive density associated with the $\varepsilon$-contaminated prior $\pi_\varepsilon$ is given by

$$h_{\pi_\varepsilon}(y|x) = \int_{\Theta} \frac{f(x, y|\theta)}{f(x|\theta)}\, \pi_\varepsilon(\theta|x)\, d\nu(\theta), \qquad (17)$$

where $\nu(\cdot)$ is a $\sigma$-finite measure. Now, from (10), replacing $\pi_\varepsilon(\theta|x)$ by $\lambda(x)\pi_0(\theta|x) + [1-\lambda(x)]\pi(\theta|x)$, the predictive density (17) reduces to

$$h_{\pi_\varepsilon}(y|x) = \lambda(x)\, h_{\pi_0}(y|x) + [1-\lambda(x)]\, h_{\pi}(y|x), \qquad (18)$$

where $h_{\pi_0}(y|x)$ and $h_{\pi}(y|x)$ are the predictive densities associated with $\pi_0$ and $\pi$, respectively.
For part (iii), notice that a Bayes predictor w.r.t. a given prior $\pi$ is obtained by minimizing the posterior risk

$$\rho(\pi, D) = E^{h_{\pi}(y|x)}\left[L(Y, D)\right] = \int_{\mathcal{Y}} L(y, D)\, h_{\pi}(y|x)\, dy, \qquad (19)$$

where $\mathcal{Y}$ is the support of $y$. Using the predictive density $h_{\pi_\varepsilon}(y|x)$ given in (18), the posterior risk (19) reduces to

$$\rho(\pi_\varepsilon, D) = \lambda(x) \int_{\mathcal{Y}} L(y, D)\, h_{\pi_0}(y|x)\, dy + [1-\lambda(x)] \int_{\mathcal{Y}} L(y, D)\, h_{\pi}(y|x)\, dy. \qquad (20)$$

Now, if the prediction loss is the GEPL function (11), it is easy to observe that the RHS of (20) has a unique minimum at



$$D_B^{\pi_\varepsilon}(x) = \left[\lambda(x) \left(D_B^{\pi_0}(x)\right)^{q} + (1-\lambda(x)) \left(D_B^{\pi}(x)\right)^{q}\right]^{1/q},$$

where $\left(D_B^{\pi_\bullet}(x)\right)^{q} = \int_{\mathcal{Y}} y^{q}\, h_{\pi_\bullet}(y|x)\, dy = E^{h_{\pi_\bullet}(y|x)}\left(Y^{q}\right)$, $\pi_\bullet \in \{\pi_0, \pi\}$. This completes the proof. $\Box$
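The same mixture structure can be illustrated for the predictor; in the Monte Carlo sketch below, the gamma forms for the component predictive densities and the value $\lambda(x)=0.4$ are illustrative assumptions, not from the paper. It confirms that the $q$-power mixture of the component predictors matches the predictor computed directly from draws of the mixture predictive (18):

```python
import numpy as np

rng = np.random.default_rng(1)
q, lam_x = 2.0, 0.4  # GEPL shape and lambda(x); both illustrative assumptions

# Monte Carlo draws from the component predictive densities h_{pi_0}(.|x)
# and h_{pi}(.|x); the gamma forms are illustrative stand-ins.
y0 = rng.gamma(shape=2.0, scale=1.5, size=400_000)
y1 = rng.gamma(shape=4.0, scale=1.0, size=400_000)

# Component Bayes predictors under the GEPL: D_B(x) = [E(Y^q)]^(1/q)
d0 = np.mean(y0**q) ** (1 / q)
d1 = np.mean(y1**q) ** (1 / q)

# Closed form from the proof: q-power mixture of the component predictors
d_mix = (lam_x * d0**q + (1 - lam_x) * d1**q) ** (1 / q)

# Direct computation from draws of the mixture predictive (18)
pick = rng.random(400_000) < lam_x
y_mix = np.where(pick, y0, y1)
d_direct = np.mean(y_mix**q) ** (1 / q)

assert abs(d_direct - d_mix) < 0.02 * d_mix
```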


