Introduction
These notes are based on Chapter 3 of the textbook.
Tentatively, the material will be covered on September 14 - 16.
Our first exam is scheduled for 09/21, so there will be no regular class session on that day. After the exam, our sessions will resume on September 23.
• Prior. The unknown parameter θ is viewed as a random variable Θ with prior probability function
f_Θ(θ) = π(θ).
• Posterior. Having observed the values of a set X of n records, the posterior distribution of (Θ|X) can
be found using a formula for its probability function:
π(θ|x) = f_{Θ|X}(θ|x) = π(θ) · f(x|θ) / f(x),
where the denominator represents the joint marginal probability function for X, obtained by taking
the expectation of the numerator with respect to the prior: f(x) = E[f(x|Θ)].
1.2 Objectives
The Bayesian framework pursues objectives related to its structure. They are listed below.
1. The joint marginal distribution associated with the sample. This is what appears in the denominator of the
posterior distribution, π(θ|x). In particular, if there is a single record, X = X1, then its marginal probability
function needs to be derived.
2. When a potential new value, Y = X_{n+1}, is supposed to be recorded, we assume that altogether the values
{X1, X2, . . . , Xn, Y = X_{n+1}},
given Θ = θ, form a collection of conditionally i.i.d. variables. Then the predictive distribution of
(Y|X = x) is one of the goals that can be achieved by using the Bayesian framework. It is the conditional
distribution of Y, given the observations carried by the array X.
3. Particularly important is the Bayesian credibility premium for Y = X_{n+1}, defined as the conditional
expectation
E[Y|X] = Ẽ[E[Y|Θ]],
where the symbol Ẽ indicates that the posterior distribution of (Θ|X) replaces the prior.
2 Geometric Model and Continuous Prior
Assume that the variables X = {Xj : j ≥ 1} are independent and identically distributed, representing the
counts of claim-free time periods (usually, years) that precede the first claim. The model in this case is
geometric, so
f(x|q) = (1 − q)^x · q for integer x ≥ 0.
The parameter q is unknown, so the Bayesian framework is based on assumptions expressed in terms of the prior
distribution. We restrict our considerations to conjugate pairs of model and prior.
That is why we assume that Q ∼ Beta[a, b], with specified parameters a > 0 and b > 0. For a geometric
model, we will sometimes require a > 1 or a > 2, when needed. The prior density is
π(q) = Γ(a + b)/(Γ(a) · Γ(b)) · q^{a−1} · (1 − q)^{b−1} for 0 < q < 1.
Such a formula can hardly help us by itself and is presented as an illustration of how marginal probabilities can be
evaluated. Fortunately, as we use conjugate priors, such an exercise will be easy to bypass. Generally,
when n records are observed, the joint probability function is derived similarly by using
w = Σ_{j=1}^n xj
as the observed total count of claim-free time periods. Then the joint probability function is
f(x|q) = Π_{j=1}^n (1 − q)^{xj} · q = q^n · (1 − q)^w.
The sum W = Σ_{j=1}^n Xj is a sufficient statistic that carries all information about the parameter, as will
be seen immediately.
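As a quick numerical check (a minimal Python sketch with made-up sample values), the joint probability of a geometric sample indeed depends on the data only through n and w:

```python
from math import prod

def geometric_pmf(x, q):
    # f(x | q) = (1 - q)^x * q, the claim-free-periods model from the text
    return (1 - q) ** x * q

def joint_pmf(xs, q):
    # joint probability of an i.i.d. geometric sample
    return prod(geometric_pmf(x, q) for x in xs)

# two hypothetical samples with the same n = 4 and the same total w = 6
sample_1 = [0, 1, 2, 3]
sample_2 = [3, 3, 0, 0]
q, n, w = 0.3, 4, 6
print(abs(joint_pmf(sample_1, q) - q ** n * (1 - q) ** w) < 1e-12)  # True
print(abs(joint_pmf(sample_1, q) - joint_pmf(sample_2, q)) < 1e-12)  # True
```

Any reshuffling of the observations with the same total w gives the same likelihood, which is exactly what sufficiency of W means here.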
2.3 Posterior Distribution
Having observed X = {Xj : 1 ≤ j ≤ n}, we derive the posterior density for (Q|X) as follows:
π(q|x) = π(q) · f(x|q)/P[X = x] ∝ q^{a+n−1} · (1 − q)^{b+w−1},
so (Q|X = x) ∼ Beta[a + n, b + w]. The predictive probability function of (Y|X = x) is then
P[Y = y|X = x] = P[Y = y|W = w] = (a + n) · Γ(a + n + w + b) · Γ(w + b + y) / (Γ(w + b) · Γ(a + n + w + b + y + 1)).
The Bayesian credibility premium is
E[Y|X] = b′/(a′ − 1) = (b + w)/(a + n − 1),
where a′ = a + n and b′ = b + w are the parameters of the posterior Beta distribution.
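The predictive probabilities and the premium can be verified numerically; the following sketch uses hypothetical values of a, b, n, w and the log-gamma function to avoid overflow:

```python
from math import lgamma, exp

def predictive_pmf(y, a, b, n, w):
    # P[Y = y | W = w] from the formula above, computed via log-gamma
    ap, bp = a + n, b + w                      # posterior Beta parameters
    return exp(lgamma(ap + 1) + lgamma(bp + y) + lgamma(ap + bp)
               - lgamma(ap) - lgamma(bp) - lgamma(ap + bp + y + 1))

a, b, n, w = 2.0, 3.0, 5, 7                    # hypothetical prior and data
probs = [predictive_pmf(y, a, b, n, w) for y in range(2000)]
print(round(sum(probs), 6))                    # 1.0, a proper distribution
premium = sum(y * p for y, p in enumerate(probs))
print(round(premium, 4), round((b + w) / (a + n - 1), 4))   # 1.6667 1.6667
```

The predictive mean matches the closed-form premium (b + w)/(a + n − 1), as it should.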
3 Poisson Model and Gamma Prior
Typical considerations for claim frequency are based on a Poisson model with unknown parameter λ, which
will be viewed as a random variable Λ with a specified prior distribution. There is a minor ambiguity in
notation related to the gamma distribution. Using a detailed description, we assume that Λ ∼ Gamma[a, θ]
if its density function is
π(λ) = f_Λ(λ) = 1/(Γ(a) · θ^a) · λ^{a−1} · exp(−λ/θ)
for λ > 0. Often b = θ^{−1} is used as a parameter, so the same density is presented as
π(λ) = f_Λ(λ) = b^a/Γ(a) · λ^{a−1} · exp(−b·λ).
The marginal probability function of a single observation X is
f(x) = ∫_0^∞ f(x|λ) · π(λ) dλ = b^a/(Γ(a) · x!) · Γ(x + a)/(b + 1)^{a+x}.
If a = r is an integer, the marginal distribution of X is negative binomial and can be interpreted as
follows: (X = k) means that k failures will occur before the r-th success, where q = b/(b+1) is the success probability.
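A short numerical check of this marginal formula (with hypothetical a and b): the probabilities should sum to one, and the marginal mean should equal E[Λ] = a/b:

```python
from math import lgamma, exp, log

def marginal_pmf(x, a, b):
    # P[X = x] = (b^a / (Gamma(a) * x!)) * Gamma(x + a) / (b + 1)^(a + x)
    return exp(a * log(b) - lgamma(a) - lgamma(x + 1)
               + lgamma(x + a) - (a + x) * log(b + 1))

a, b = 2.5, 1.5                                # hypothetical prior parameters
probs = [marginal_pmf(x, a, b) for x in range(400)]
print(round(sum(probs), 6))                    # 1.0
mean = sum(x * p for x, p in enumerate(probs))
print(round(mean, 6), a / b)                   # marginal mean equals a/b
```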
Now move on to a sample of size n. The formula for the joint probability distribution for X = x is more
complicated, yet the good news is that one can proceed with no attention paid to it.
Consider W = Σ_{1≤j≤n} Xj and notice that (W|Λ = λ) also has a Poisson distribution, with rate λ · n. Given
X = x, we have the same information about Λ as is carried by W = w alone. The posterior
distribution for (Λ|W = w) is derived as follows:
π(λ|W = w) = (n + b)^{w+a}/Γ(w + a) · λ^{(w+a)−1} · exp(−(n + b)λ),
which can also be stated in the equivalent form: (Λ|W = w) ∼ Gamma[w + a, (n + b)^{−1}]. The Bayesian
credibility premium is therefore
E[Y|W = w] = Ẽ[Λ] = (w + a)/(n + b).
In terms of θ = b^{−1}, this formula can be presented as follows:
E[Y|W = w] = (w + a) · θ/(n · θ + 1).
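The conjugate update can be confirmed without the closed form: numerically integrating prior × likelihood over a λ-grid reproduces Ẽ[Λ] = (w + a)/(n + b). The numbers below are hypothetical:

```python
from math import exp, log

a, b = 3.0, 2.0        # hypothetical Gamma prior with rate b
n, w = 10, 7           # sample size and observed total W = w

def log_kernel(lam):
    # log of prior(lambda) * P[W = w | lambda], lambda-free factors dropped
    return (a - 1) * log(lam) - b * lam + w * log(n * lam) - n * lam

h = 1e-4                                        # midpoint rule on (0, 20)
grid = [h * (k + 0.5) for k in range(200000)]
kern = [exp(log_kernel(l)) for l in grid]
mass = sum(kern) * h
mean = sum(l * u for l, u in zip(grid, kern)) * h / mass
print(round(mean, 4), round((w + a) / (n + b), 4))   # 0.8333 0.8333
```

The grid-based posterior mean and the conjugate formula agree, which is the whole point of working with a conjugate pair.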
3.3.1 Example
Assume that the prior density of Q is
π(q) = K · q^2 · (1 − q)^3 for 0 < q < 1, with a = 3, b = 4 and
K = Γ(a + b)/(Γ(a) · Γ(b)) = 6!/((2!) · (3!)) = 60.
Given X1 = 0, X2 > 0, and X3 = 0, we are about to determine all elements of the Bayesian framework. For
the sake of brevity, introduce the event A = (X1 = 0) ∩ (X2 > 0) ∩ (X3 = 0).
1. Since P[X = 0|Q = q] = q and P[X > 0|Q = q] = 1 − q, we have P[A|Q = q] = q^2 · (1 − q), so
P[A] = E[P[A|Q]] = 60 · ∫_0^1 q^4 · (1 − q)^4 dq = 2/21.
2. The posterior distribution of (Q|A) can be found directly, by using the same trick as before:
π(q|A) = π(q) · P[A|Q = q]/P[A] = C · q^4 · (1 − q)^4,
so (Q|A) ∼ Beta[5, 5] with C = Γ(10)/(Γ(5) · Γ(5)) = 630.
4. Given A, the expected value of the binary random variable Y = 1{X4 = 0} is the posterior expectation of Q,
which is 0.5.
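The whole example can be re-derived numerically (a sketch assuming the prior Beta[3, 4], i.e. π(q) = 60·q^2·(1 − q)^3, and P[A|Q = q] = q^2·(1 − q)):

```python
# midpoint rule on (0, 1); prior and conditional probability of A as above
h = 1e-4
grid = [h * (k + 0.5) for k in range(10000)]
prior = [60 * q**2 * (1 - q) ** 3 for q in grid]
like = [q**2 * (1 - q) for q in grid]
p_a = sum(p * l for p, l in zip(prior, like)) * h
print(round(p_a, 6))                           # 0.095238, i.e. 2/21
post_mean = sum(q * p * l for q, p, l in zip(grid, prior, like)) * h / p_a
print(round(post_mean, 6))                     # 0.5, the Beta(5, 5) mean
```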
4 Continuous Model and Prior
Assume that a sample X = {Xj : j ≥ 1} represents recorded losses or claim severity values. Assume that
they are all independent and identically distributed with density
f(x|λ) = λ^r/Γ(r) · x^{r−1} · exp(−λ · x) for x > 0,
where r is a specified natural number and the parameter λ > 0 is unknown. Let us consider what will happen
when this parameter is viewed as a random variable Λ with a specified continuous prior distribution.
4.1 Example
Assume that the model for losses is described in terms of a Gamma density,
f(x|θ) = θ^3/2 · x^2 · exp(−θ · x) for x > 0,
and the prior density for Θ is
π(θ) = 1/6 · θ^3 · exp(−θ) for θ > 0.
It is clear that (X|Θ = θ) ∼ Gamma[3, θ^{−1}] and Θ ∼ Gamma[4, 1].
Those who love the inverse Gamma can use the transformed parametrization with Λ = Θ^{−1}, which follows the
inverse Gamma distribution with parameters [a = 4, b = 1].
This example and further general extensions are to be considered on September 16. The following tasks are
performed.
• Marginal distribution of X
• Predictive distribution of (Y |X = x)
• Bayesian credibility, E [Y |X = x]
1. The marginal density of X is
f(x) = x^2/12 · ∫_0^∞ θ^6 · exp(−(x + 1)θ) dθ = x^2/12 × Γ(7)/(x + 1)^7 = 60 · x^2/(x + 1)^7 for x > 0.
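A quick check that f(x) = 60·x^2/(x + 1)^7 is a proper density:

```python
def marginal_density(x):
    # f(x) = 60 x^2 / (x + 1)^7 from the computation above
    return 60 * x**2 / (x + 1) ** 7

# midpoint rule on (0, 1000); the tail decays like x^(-5), so the cutoff
# contributes a negligible amount of mass
h = 1e-3
total = sum(marginal_density(h * (k + 0.5)) for k in range(1000000)) * h
print(round(total, 6))       # 1.0
```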
2. The posterior density of (Θ|X = x) formally should be found using the formula:
π(θ|x) = π(θ) · f(x|θ)/f(x) = (x + 1)^7/Γ(7) · θ^6 · exp(−(x + 1)θ),
so (Θ|X = x) ∼ Gamma[7, (x + 1)^{−1}] with shape parameter a′ = 7 and scale (x + 1)^{−1}. Equivalently,
if you prefer to use the reciprocal of the scale, then it is b = x + 1.
The same result could be obtained faster, since the model and prior are conjugate. Present the posterior
density in the form:
π(θ|x) = K · θ^{3+4−1} · exp(−(x + 1)θ) = K · θ^6 · exp(−(x + 1)θ),
where the factor K does not contain θ and depends on x and the parameters of the prior. Therefore, we
obtain the same posterior for (Θ|X = x) as before, and the value of K can be determined as
K = (x + 1)^7/6!.
3. The predictive distribution for (Y|X = x) is obtained via integration of f(y|θ) with respect to the
posterior of (Θ|X = x) as follows:
f(y|x) = ∫_0^∞ [(x + 1)^7/720 · θ^6 · exp(−(x + 1)θ)] × [θ^3/2 · y^2 · exp(−θ · y)] dθ = 252 · (x + 1)^7 · y^2/(x + y + 1)^{10} for y > 0.
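The predictive density can be checked the same way; its mean is the Bayesian premium E[Y|X = x] = 3 · Ẽ[Θ^{−1}] = 3(x + 1)/6 = (x + 1)/2 (the sketch below takes x = 1):

```python
def predictive_density(y, x):
    # f(y | x) = 252 (x + 1)^7 y^2 / (x + y + 1)^10
    return 252 * (x + 1) ** 7 * y**2 / (x + y + 1) ** 10

x = 1.0
h = 1e-3                                          # midpoint rule, y in (0, 1000)
grid = [h * (k + 0.5) for k in range(1000000)]
vals = [predictive_density(y, x) for y in grid]
print(round(sum(vals) * h, 6))                    # 1.0, total mass
premium = sum(y * v for y, v in zip(grid, vals)) * h
print(round(premium, 4))                          # 1.0, equal to (x + 1)/2
```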
Multiple Observations. If a sample of n recorded X-values were given, then evaluation of the joint den-
sity associated with the values of X would be time consuming, but the same representation of π(θ|(x1, x2, . . . , xn))
would lead to the formula:
π(θ|x) = K · θ^{3n+a−1} · exp(−(w + b)θ),
where w = Σ_{j=1}^n xj is the value of a sufficient statistic, Θ ∼ Gamma[a, b^{−1}], and K can be found from
the condition that the posterior density integrates to one: K = (w + b)^{3n+a}/Γ(3n + a).
4.3 Poisson Model and Gamma Prior: Example
Assume that observations, X = {Xj : 1 ≤ j ≤ n}, represent claim frequency values. The following
information is provided.
1. {Xj : 1 ≤ j ≤ n} are independent Poisson distributed with unknown rate λ.
2. Prior distribution for Λ is Gamma [α, β] such that
E [Λ] = 0.10 and Var [Λ] = 0.0003
4.3.1 Solution
Identify the parameters of the prior distribution as follows. With the moments of Λ specified, we have:
E[Λ] = α · β = 0.10 and Var[Λ] = α · β^2 = 0.0003,
which implies that β = 0.003 and α = (0.03)^{−1}; in terms of the rate, b = β^{−1} = (0.003)^{−1}. The total
number of exposures is n = 200 · 3 = 600 and the overall claim count is w = 150. Therefore, the posterior
distribution for Λ is
Gamma[α′ = α + 150, b′ = b + 600] (with b′ the posterior rate),
so the posterior expectation of Λ is
Ẽ[Λ] = α′/b′ = (150 + (0.03)^{−1}) · ((0.003)^{−1} + 600)^{−1} ≈ 0.1964.
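The arithmetic of the posterior mean, in a few lines:

```python
alpha = 1 / 0.03        # prior shape, about 33.33
b = 1 / 0.003           # prior rate, the reciprocal of the scale beta = 0.003
n, w = 600, 150         # exposures and total claim count
posterior_mean = (alpha + w) / (b + n)
print(round(posterior_mean, 4))   # 0.1964
```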
4.4 Conclusion
Assume that Λ ∼ Gamma[a, b^{−1}], so
π(λ) = b^a/Γ(a) · λ^{a−1} · exp(−b · λ).
For n being the number of exposures, denote by
W = Σ_{j=1}^n Xj
the total number of claims, and let w be its observed value. Then the posterior
distribution of (Λ|X = x) is the same as that of (Λ|W = w), and
(Λ|W = w) ∼ Gamma[a′ = a + w, b′ = b + n],
so in scale terms β′ = (b + n)^{−1}, and
E[Λ|W = w] = a′/b′ = (a + w)/(b + n).