
Financial Mathematics in Life and Pension Insurance

Ragnar Norberg

Summer School in Mathematical Finance, Dubrovnik, 16-22 September 2001

Contents

1 Payment streams and interest
  1.1 Basic notions of payments and interest
  1.2 Application to loans

2 Mortality
  2.1 Aggregate mortality
  2.2 The Gompertz-Makeham mortality law
  2.3 Actuarial notation

3 Insurance of a single life
  3.1 Some standard forms of insurance
  3.2 The principle of equivalence
  3.3 Prospective reserves
  3.4 Thiele's differential equation
  3.5 The stochastic process point of view

4 Markov chains in life insurance
  4.1 The insurance policy as a stochastic process
  4.2 The time-continuous Markov chain
  4.3 Applications
  4.4 The standard multi-state contract
  4.5 Higher order moments of present values

5 A Markov chain interest model
  5.1 The Markov model
  5.2 Differential equations for moments of present values
  5.3 Complement on Markov chains

6
  6.1 General considerations
  6.2 First and second order bases
  6.3 The technical surplus and how it emerges
  6.4 Dividends and bonus
  6.5 Bonus prognoses
  6.6 Examples
  6.7 Discussions

7 Financial mathematics in insurance
  7.1 Finance in insurance
  7.2 Prerequisites
  7.3 A Markov chain financial market - Introduction
  7.4 The Markov chain market
  7.5 Arbitrage-pricing of derivatives in a complete market
  7.6 Numerical procedures
  7.7 Risk minimization in incomplete markets
  7.8 Trading with bonds: How much can be hedged?
  7.9 The Vandermonde matrix in finance
  7.10 Two properties of the Vandermonde matrix
  7.11 Applications to finance

A Calculus

B Indicator functions

Chapter 1

Payment streams and interest

1.1 Basic notions of payments and interest

A. Streams of payments. We commence by giving some formal mathematical structure to the notion of payment streams. Referring to Appendix A, we deal only with their properties as functions of time and do not discuss their possible stochastic properties for the time being.
To fix ideas and terminology, consider a financial contract commencing at time 0 and terminating at a later time $n$ ($\le \infty$), say, and denote by $A_t$ the total amount paid in respect of the contract during the time interval $[0, t]$. The payment function $\{A_t\}_{t \ge 0}$ is assumed to be the difference of two non-decreasing, finite-valued functions representing incomes and outgoes, respectively, and is thus of finite variation (FV). Furthermore, the payment function is assumed to be right-continuous (RC). From a practical point of view this assumption is just a matter of convention, stating that the balance of the account changes at the time of any deposit or withdrawal. From a mathematical point of view this is convenient, since payment functions can then serve as integrators. In fact, we shall restrict attention to payment functions that are piecewise differentiable (PD):
$$A_t = A_0 + \int_0^t a_\tau \, d\tau + \sum_{0 < \tau \le t} (A_\tau - A_{\tau-}) . \tag{1.1}$$
The integral adds up payments that fall due continuously, and the sum adds up lump sum payments. In differential form (1.1) reads
$$dA_t = a_t \, dt + A_t - A_{t-} . \tag{1.2}$$
It seems natural to count incomes as positive and outgoes as negative. Sometimes, and in particular in the context of insurance, it is convenient to work with outgoes less incomes, and to avoid ugly minus signs we introduce $B = -A$.

B. Interest. Suppose money is currently invested on (or borrowed from) an account that bears interest. This means that a unit deposited on the account at time $u$ gives the account holder the right to cash, at any other time $t$, a certain amount $v(t, u)$, typically different from 1. The function $v$ must be strictly positive, and we shall argue that it must satisfy the functional relationship
$$v(s, u) = v(s, t) \, v(t, u) , \tag{1.3}$$
implying, of course, that $v(t, t) = 1$ (put $s = t = u$ and use strict positivity): If the account holder invests 1 at time $u$, he may cash the amount on the left of (1.3) at time $s$. If he instead cashes $v(t, u)$ at time $t$ and immediately reinvests this amount, he will obtain at time $s$ the amount on the right of (1.3). To avoid arbitrary gains, so-called arbitrage, the two strategies must give the same result.
It is easy to verify that $v(t, u)$ satisfies (1.3) if and only if it is of the form
$$v(t, u) = v_t^{-1} v_u \tag{1.4}$$
for some strictly positive function $v_t$ (allowing an abuse of notation), which can be taken to satisfy $v_0 = 1$. Then $v_u$ must be the value at time 0 of a unit invested at time $u$, and we call it the discounting function. Correspondingly, $v_t^{-1}$ is the value at time $t$ of a unit invested at time 0, and we call it the accumulation function.
In practical banking operations one uses
$$v_t = e^{-\int_0^t r} , \qquad v_t^{-1} = e^{\int_0^t r} , \tag{1.5}$$
where $r_t$ is some piecewise continuous function, usually positive. (The shorthand exemplified by $\int r = \int r_\tau \, d\tau$ will be in frequent use throughout.) Under the rule (1.5) the dynamics of accumulation and discounting are given by
$$d e^{\int_0^t r} = e^{\int_0^t r} \, r_t \, dt , \tag{1.6}$$
$$d e^{-\int_0^t r} = -e^{-\int_0^t r} \, r_t \, dt . \tag{1.7}$$
The relation (1.6) says that the interest earned in a small time interval is proportional to the length of the interval and to the current amount on deposit. The proportionality factor $r_t$ is called the force of interest or the (instantaneous) interest rate at time $t$. In integral form (1.6) and (1.7) read
$$e^{\int_0^t r} = 1 + \int_0^t e^{\int_0^\tau r} \, r_\tau \, d\tau , \tag{1.8}$$
$$e^{-\int_0^t r} = 1 - \int_0^t e^{-\int_0^\tau r} \, r_\tau \, d\tau . \tag{1.9}$$

We will henceforth assume that interest is earned in accordance with (1.5) and will be working with the expressions
$$v(t, u) = e^{-\int_t^u r}$$
for the general discount factor when $t < u$, and
$$v(t, s) = e^{\int_s^t r}$$
for the general accumulation factor when $s < t$.

By constant interest rate $r$ we have $v_t = v^t$, where
$$v = e^{-r} \tag{1.10}$$
is the constant annual discount factor. In this case the constant annual accumulation factor is
$$v^{-1} = e^{r} = 1 + i , \tag{1.11}$$
where $i$ is the annual interest rate.
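The constant-rate case lends itself to a small numerical illustration. The following is a minimal sketch, not part of the original notes: the function names and the 4.5% annual rate are hypothetical choices made only for the example.

```python
import math


def discount_factor(r: float, t: float) -> float:
    """v_t = exp(-r t): value at time 0 of a unit due at time t, cf. (1.10)."""
    return math.exp(-r * t)


def v(t: float, u: float, r: float) -> float:
    """v(t, u) = v_u / v_t of (1.4): value at time t of a unit deposited at time u."""
    return discount_factor(r, u) / discount_factor(r, t)


def force_from_annual_rate(i: float) -> float:
    """Invert (1.11): e^r = 1 + i, hence r = log(1 + i)."""
    return math.log1p(i)


r = force_from_annual_rate(0.045)   # hypothetical annual rate i = 4.5%
print(v(10.0, 0.0, r))              # accumulation factor over 10 years
print(v(0.0, 10.0, r))              # discount factor over 10 years
```

The accumulation factor printed first agrees with $(1 + i)^{10}$, and any three time points satisfy the no-arbitrage relation (1.3).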

C. Valuation of payment streams. Suppose that the incomes/outgoes created by the payment stream $A$ are currently deposited on/drawn from an account which bears interest at rate $r_t$ at time $t$. By (1.4) the value at time $t$ of the amount $dA_\tau$ paid in the small time interval around time $\tau$ is $v(t, \tau) \, dA_\tau = v_t^{-1} v_\tau \, dA_\tau$. Summing over all time intervals, and using (1.5), we get the value at time $t$ of the entire payment stream,
$$e^{\int_0^t r} \int_{0-}^n e^{-\int_0^\tau r} \, dA_\tau = U_t - V_t ,$$
where
$$U_t = e^{\int_0^t r} \int_{0-}^t e^{-\int_0^\tau r} \, dA_\tau = \int_{0-}^t e^{\int_\tau^t r} \, dA_\tau \tag{1.12}$$
is the accumulated value of past incomes less outgoes, and (recall the convention $B = -A$)
$$V_t = e^{\int_0^t r} \int_t^n e^{-\int_0^\tau r} \, dB_\tau = \int_t^n e^{-\int_t^\tau r} \, dB_\tau \tag{1.13}$$
is the discounted value of future outgoes less incomes. This decomposition is particularly relevant for payments governed by some contract; $U_t$ is the cash balance, that is, the amount held at the time of consideration, and $V_t$ is the future liability. The difference between the two is the current value of the contract.
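The decomposition into cash balance and future liability can be checked on a stream consisting of lump sums only. The sketch below assumes a constant force of interest; the stream, the rate and the function names are hypothetical and serve only to illustrate (1.12) and (1.13).

```python
import math


def cash_balance(payments, r, t):
    """U_t of (1.12): accumulated value at t of the amounts paid at times s <= t."""
    return sum(a * math.exp(r * (t - s)) for s, a in payments if s <= t)


def future_liability(payments, r, t):
    """V_t of (1.13): discounted value at t of B = -A over payment times s > t."""
    return sum(-a * math.exp(-r * (s - t)) for s, a in payments if s > t)


def value_at(payments, r, t):
    """Value at t of the whole stream: each amount times v(t, s) = exp(-r (s - t))."""
    return sum(a * math.exp(-r * (s - t)) for s, a in payments)


# Hypothetical stream (time, amount): income 100 at time 0, outgoes 30 at times 1, 2, 3.
stream = [(0.0, 100.0), (1.0, -30.0), (2.0, -30.0), (3.0, -30.0)]
r, t = 0.05, 1.5
U = cash_balance(stream, r, t)
V = future_liability(stream, r, t)
print(U, V, U - V)
```

The difference `U - V` coincides with the value at time $t$ of the entire stream, which is the content of the decomposition.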

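The development of the cash balance can be viewed in various ways: Application of (A.8) to (1.12), taking $X_t = \exp\left(\int_0^t r_s \, ds\right)$ (continuous, with dynamics given by (1.6)) and $Y_t = \int_{0-}^t \exp\left(-\int_0^\tau r_s \, ds\right) dA_\tau$, yields
$$dU_t = U_t \, r_t \, dt + dA_t , \tag{1.14}$$
which integrates to (note that $U_0 = A_0$)
$$U_t = A_t + \int_0^t U_\tau \, r_\tau \, d\tau . \tag{1.15}$$
An alternative expression,
$$U_t = A_t + \int_0^t e^{\int_\tau^t r} A_\tau \, r_\tau \, d\tau , \tag{1.16}$$
is derived from (1.12) upon applying the rule (A.9) of integration by parts. For instance, in the first expression in (1.12), put
$$\int_{0-}^t e^{-\int_0^\tau r} \, dA_\tau = A_0 + \int_0^t e^{-\int_0^\tau r} \, dA_\tau = A_0 + e^{-\int_0^t r} A_t - A_0 - \int_0^t A_\tau \, e^{-\int_0^\tau r} (-r_\tau) \, d\tau .$$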

The relations (1.14)-(1.16) show, in an easily interpretable manner, how the cash balance emerges from payments and earned interest. As a special case of (1.16), with $A_t \equiv 1$, we have the trivial relationship
$$e^{\int_0^t r} = 1 + \int_0^t e^{\int_\tau^t r} \, r_\tau \, d\tau , \tag{1.17}$$
which shows how a unit invested at time 0 accumulates with interest. Compare with (1.8).
Likewise, from (1.13) we derive
$$dV_t = V_t \, r_t \, dt - dB_t , \tag{1.18}$$
$$V_t = B_n - B_t - \int_t^n V_\tau \, r_\tau \, d\tau , \tag{1.19}$$
and
$$V_t = B_n - B_t - \int_t^n e^{-\int_t^\tau r} (B_n - B_\tau) \, r_\tau \, d\tau , \tag{1.20}$$
the last two relationships valid for $n = \infty$ only if $B_\infty < \infty$. Again interpretations are easy; (1.19) and (1.20) state, in different ways, that the debt can be settled immediately at a price which is the total debt minus the present value of future interest saved by advancing the repayment. We also easily obtain
$$V_t = e^{-\int_t^n r} (B_n - B_t) + \int_t^n e^{-\int_t^\tau r} (B_\tau - B_t) \, r_\tau \, d\tau . \tag{1.21}$$

Typically, the financial contract will lay down that incomes and outgoes be equivalent in the sense that
$$U_n = 0 \quad \text{or} \quad V_{0-} = 0 . \tag{1.22}$$
These two relationships are equivalent and they imply that, for any $t$,
$$U_t = V_t . \tag{1.23}$$

We anticipate here that, in the insurance context, the equivalence requirement
is usually not exercised at the level of the individual policy: the very purpose of
insurance is to redistribute money among the insured. Thus the principle must
be applied at the level of the portfolio in some sense, which we shall discuss
later. Moreover, in insurance the payments, and typically also the interest rate,
are not foreseeable at the outset, so in order to establish equivalence one may
have to currently adapt the payments to the development in some way or other.
D. Some standard payment functions and their values. Certain simple payment functions are so frequently used that they have been given names. An endowment of 1 at time $n$ is defined by $A_t = \varepsilon_n(t)$, where
$$\varepsilon_n(t) = \begin{cases} 0, & 0 \le t < n , \\ 1, & t \ge n . \end{cases} \tag{1.24}$$
(The only payment is $A_n - A_{n-} = 1$.) By constant interest rate $r$ the present value at time 0 of the endowment is $e^{-rn}$ or, setting $v = e^{-r}$, $v^n$.
An $n$-year immediate annuity of 1 per year consists of a sequence of endowments of 1 at times $t = 1, \ldots, n$, and is thus given by
$$A_t = \sum_{j=1}^n \varepsilon_j(t) = [t] \wedge n .$$
By constant interest rate its present value at time 0 is
$$a_{\overline{n}|} = \sum_{j=1}^n e^{-rj} = \frac{1 - e^{-rn}}{i} , \tag{1.25}$$
see (1.10)-(1.11).

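An $n$-year annuity-due of 1 per year consists of a sequence of endowments of 1 at times $t = 0, \ldots, n - 1$, that is,
$$A_t = \sum_{j=0}^{n-1} \varepsilon_j(t) = [t + 1] \wedge n .$$
By constant interest rate its present value at time 0 is
$$\ddot{a}_{\overline{n}|} = \sum_{j=0}^{n-1} e^{-rj} = (1 + i) \, \frac{1 - e^{-rn}}{i} . \tag{1.26}$$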

An $n$-year continuous annuity payable at level rate 1 per year is given by
$$A_t = t \wedge n . \tag{1.27}$$
For the case with constant interest rate its present value at time 0 is (recall (1.10))
$$\bar{a}_{\overline{n}|} = \int_0^n e^{-r\tau} \, d\tau = \frac{1 - e^{-rn}}{r} . \tag{1.28}$$
An everlasting (perpetual) annuity is called a perpetuity. Putting $n = \infty$ in (1.25), (1.26), and (1.28), we find the following expressions for the present values of the immediate perpetuity, the perpetuity-due, and the continuous perpetuity:
$$a_{\overline{\infty}|} = \frac{1}{i} , \qquad \ddot{a}_{\overline{\infty}|} = \frac{1 + i}{i} , \qquad \bar{a}_{\overline{\infty}|} = \frac{1}{r} . \tag{1.29}$$
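The annuity values (1.25), (1.26), (1.28) and their perpetuity limits (1.29) are easily coded. The following is a minimal sketch under constant force of interest; the function names are hypothetical, and passing `math.inf` as the term is an implementation convenience rather than anything prescribed by the text.

```python
import math


def immediate_annuity(n: float, r: float) -> float:
    """(1.25): present value of payments of 1 at times 1, ..., n."""
    i = math.expm1(r)                      # annual interest rate i = e^r - 1
    return (1.0 - math.exp(-r * n)) / i


def annuity_due(n: float, r: float) -> float:
    """(1.26): present value of payments of 1 at times 0, ..., n - 1."""
    return math.exp(r) * immediate_annuity(n, r)   # the factor 1 + i equals e^r


def continuous_annuity(n: float, r: float) -> float:
    """(1.28): present value of payments at rate 1 per year during (0, n)."""
    return (1.0 - math.exp(-r * n)) / r


r = 0.05                                   # hypothetical force of interest
for n in (10, 30, math.inf):               # math.inf gives the perpetuities (1.29)
    print(n, immediate_annuity(n, r), annuity_due(n, r), continuous_annuity(n, r))
```

For finite integer $n$ the first two functions agree with the direct sums of discounted endowments, and for $n = \infty$ the continuous perpetuity equals $1/r$.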

An $m$-year deferred $n$-year temporary annuity commences only after $m$ years and is payable throughout $n$ years thereafter. Thus it is just the difference between an $m + n$ year annuity and an $m$ year annuity. For the continuous version,
$$A_t = ((t - m) \vee 0) \wedge n = (t \wedge (m + n)) - (t \wedge m) . \tag{1.30}$$
Its present value at time 0 by constant interest is denoted $\bar{a}_{m|\overline{n}|}$ and must be
$$\bar{a}_{m|\overline{n}|} = \bar{a}_{\overline{m+n}|} - \bar{a}_{\overline{m}|} = v^m \, \bar{a}_{\overline{n}|} . \tag{1.31}$$

1.2 Application to loans

A. Basic features of a loan contract. Traditional loans and savings accounts in banks are among the simplest financial contracts since they are entirely deterministic. Let us consider a loan contract stipulating that at time 0, say, the bank pays to a borrower an amount $H$, called the principal (from the Latin for "first"), and that the borrower thereafter pays back or amortizes the loan in accordance with a non-decreasing payment function $\{A_t\}_{0 \le t \le n}$ called the amortization function. The term of the contract, $n$, is sometimes called the duration of the loan. Without loss of generality we assume henceforth that $H = 1$ (the principal is proclaimed monetary unit).
The amortization function is to fulfill $A_0 = 0$ and $A_n \ge 1$. The excess of total amortizations over the principal is the total amount of interest. We denote it by $R_n$ and have $A_n = 1 + R_n$. General principles of book-keeping, needed e.g. for taxation purposes, prescribe that the decomposition of the amortizations into repayments and interest be extended to all $t \in [0, n]$. Thus,
$$A_t = F_t + R_t , \tag{1.32}$$
where $F$ is a non-decreasing repayment function satisfying
$$F_0 = 0 , \qquad F_n = 1$$
(formally a distribution function due to the convention $H = 1$), and $R$ is a non-decreasing interest payment function.
Furthermore, the contract is required to specify a nominal force of interest $r_t$, $0 \le t \le n$, under which the value of the amortizations should be equivalent to the value of the principal, that is,
$$\int_0^n e^{-\int_0^\tau r} \, dA_\tau = 1 . \tag{1.33}$$
There are, of course, infinitely many admissible decompositions (1.32) satisfying (1.33). A clue to constraints on $F$ and $R$ is offered by the relationship
$$\int_0^n e^{-\int_0^\tau r} \, dR_\tau = \int_0^n e^{-\int_0^\tau r} (1 - F_\tau) \, r_\tau \, d\tau , \tag{1.34}$$
which is obtained upon inserting (1.32) into (1.33) and then using integration by parts on the term $\int_0^n \exp\left(-\int_0^\tau r\right) dF_\tau = -\int_0^n \exp\left(-\int_0^\tau r\right) d(1 - F_\tau)$. The condition (1.34) is trivially satisfied if
$$dR_t = (1 - F_t) \, r_t \, dt ,$$
that is, interest is paid currently and instantaneously on the outstanding (part of the) principal, $1 - F$. This will be referred to as natural interest.
Under the scheme of natural interest the relation (1.32) becomes
$$dA_t = dF_t + (1 - F_t) \, r_t \, dt , \tag{1.35}$$
which establishes a one-to-one correspondence between amortizations and repayments. The differential equation (1.35) is easily solved: First, integrate (1.35) over $(0, t]$ to obtain
$$A_t = F_t + \int_0^t (1 - F_\tau) \, r_\tau \, d\tau , \tag{1.36}$$
which determines amortizations when repayments are given. Second, multiply (1.35) with $\exp\left(-\int_0^t r\right)$ to obtain $\exp\left(-\int_0^t r\right) dA_t = -d\left\{\exp\left(-\int_0^t r\right)(1 - F_t)\right\}$ and then integrate over $(t, n]$ to arrive at
$$\int_t^n e^{-\int_t^\tau r} \, dA_\tau = 1 - F_t , \tag{1.37}$$
which determines (outstanding) repayments when amortizations are given. Interpretations of the relationships are obvious. For instance, since $1 - F_t$ is the remaining debt at time $t$, (1.37) is the time $t$ update of the equivalence requirement (1.33).

B. Standard forms of loans. We list some standard types of loans, taking now $r$ constant. It is understood that we consider only times $t$ in $[0, n]$.
The simplest form is the fixed loan, which is repaid in its entirety only at the term of the contract, that is, $F_t = \varepsilon_n(t)$, the endowment defined by (1.24). The amortization function is obtained directly from (1.36): $A_t = \varepsilon_n(t) + rt$.
A series loan has repayments of annuity form. The continuous version is given by $F_t = t/n$, see (1.27). The amortization plan is obtained from (1.36): $A_t = t/n + rt(1 - t/(2n))$. Thus, $dF_t/dt = 1/n$ (fixed) and $dR_t/dt = r(1 - t/n)$ (linearly decreasing).
An annuity loan is called so because the amortizations, which are the amounts actually paid by the borrower, are of annuity form. The continuous version is given by $A_t = t/\bar{a}_{\overline{n}|}$, confer (1.33) and (1.28). From (1.37) we easily obtain $F_t = 1 - \bar{a}_{\overline{n-t}|}/\bar{a}_{\overline{n}|}$. We find $dF_t/dt = e^{-r(n-t)}/\bar{a}_{\overline{n}|}$ (exponentially increasing), and $dR_t/dt = (1 - e^{-r(n-t)})/\bar{a}_{\overline{n}|}$.
Putting $n = \infty$, the fixed loan and the series loan both specialize to an infinite loan without repayment. Amortizations consist only of interest, which is paid indefinitely at rate $r$.
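The three standard loans can be compared numerically. The sketch below is an illustration under stated assumptions: constant $r$, hypothetical term and rate, and a simple Stieltjes sum (a choice made here, not a method from the notes) to check the equivalence requirement (1.33). Each loan function returns the pair $(F_t, A_t)$.

```python
import math


def a_bar(n: float, r: float) -> float:
    """Continuous annuity value (1.28)."""
    return (1.0 - math.exp(-r * n)) / r


def fixed_loan(t, n, r):
    F = 1.0 if t >= n else 0.0             # repayment only at the term n
    return F, F + r * t


def series_loan(t, n, r):
    F = t / n                              # repayments at the fixed rate 1/n
    return F, F + r * t * (1.0 - t / (2.0 * n))


def annuity_loan(t, n, r):
    F = 1.0 - a_bar(n - t, r) / a_bar(n, r)
    return F, t / a_bar(n, r)              # amortizations at a level rate


def pv_of_amortizations(loan, n, r, steps=20000):
    """Left-hand side of (1.33), approximated by a Stieltjes sum over (0, n]."""
    h = n / steps
    total, previous = 0.0, loan(0.0, n, r)[1]
    for k in range(1, steps + 1):
        t = n * k / steps
        current = loan(t, n, r)[1]
        # Increment of A over ((k - 1) h, k h], discounted from the interval midpoint.
        total += math.exp(-r * (t - h / 2.0)) * (current - previous)
        previous = current
    return total


n, r = 20.0, 0.05                          # hypothetical term and force of interest
for loan in (fixed_loan, series_loan, annuity_loan):
    print(loan.__name__, loan(10.0, n, r), pv_of_amortizations(loan, n, r))
```

All three present values come out close to 1, the principal, while the total amount paid $A_n$ differs between the loan types.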

Chapter 2

Mortality

2.1 Aggregate mortality

A. The stochastic model. Consider an aggregate of individuals, e.g. the population of a nation, the persons covered under an insurance scheme, or a certain species of animals. The individuals need not be animate beings; for instance, in engineering applications one is often interested in studying the work-life until failure of technical components or systems. Having demographic and actuarial problems in mind, we shall, however, be speaking of persons and life lengths until death.
Due to differences in inheritance and living conditions and also due to events of a more or less purely random nature, like accidents, diseases, etc., the life lengths vary among individuals. Therefore, the life length of a randomly selected new-born can suitably be envisaged as a non-negative random variable $T$ with a cumulative distribution function
$$F(t) = P[T \le t] . \tag{2.1}$$
In survival analysis it is convenient to work with the survival function
$$\bar{F}(t) = P[T > t] = 1 - F(t) . \tag{2.2}$$
The density of $T$ is
$$f(t) = \frac{d}{dt} F(t) = -\frac{d}{dt} \bar{F}(t) . \tag{2.3}$$

B. The force of mortality. The density is the derivative of $-\bar{F}$, see (2.3). When dealing with non-negative random variables representing life lengths, it is convenient to work with the derivative of $-\log \bar{F}$,
$$\mu(t) = \frac{d}{dt} \{-\log \bar{F}(t)\} = \frac{f(t)}{\bar{F}(t)} , \tag{2.4}$$
which is well defined for all $t$ such that $\bar{F}(t) > 0$. For small, positive $dt$ we have
$$\mu(t) \, dt = \frac{f(t) \, dt}{\bar{F}(t)} = \frac{P[t < T \le t + dt]}{P[T > t]} = P[T \le t + dt \mid T > t] .$$
(In the second equality we have neglected a term $o(dt)$ such that $o(dt)/dt \to 0$ as $dt \searrow 0$.) Thus, for a person aged $t$, the probability of dying within $dt$ years is (approximately) proportional to the length of the time interval, $dt$. The proportionality factor $\mu(t)$ depends on the attained age, and is called the force of mortality at age $t$. It is also called the mortality intensity.
Integrating (2.4) from 0 to $t$ and using $\bar{F}(0) = 1$, we obtain
$$\bar{F}(t) = e^{-\int_0^t \mu} . \tag{2.5}$$
Relation (2.4) may be cast as
$$f(t) = \bar{F}(t) \, \mu(t) = e^{-\int_0^t \mu} \, \mu(t) , \tag{2.6}$$
which says that the probability $f(t) \, dt$ of dying in the age interval $(t, t + dt)$ is the product of the probability $\bar{F}(t)$ of survival to $t$ and the conditional probability $\mu(t) \, dt$ of then dying before age $t + dt$.
The functions $F$, $\bar{F}$, $f$, and $\mu$ are equivalent representations of the mortality law; each of them corresponds one-to-one to any one of the others.
Since $\bar{F}(\infty) = 0$, we must have $\int_0^\infty \mu = \infty$. Thus, if there is a finite highest attainable age $\omega$ such that $\bar{F}(\omega) = 0$ and $\bar{F}(t) > 0$ for $t < \omega$, then $\int_0^t \mu \nearrow \infty$ as $t \nearrow \omega$. If, moreover, $\mu$ is non-decreasing, we must also have $\lim_{t \nearrow \omega} \mu(t) = \infty$.
C. The distribution of the remaining life length. Let $T_x$ denote the remaining life length of an individual chosen at random from the $x$ years old members of the population. Then $T_x$ is distributed as $T - x$, conditional on $T > x$, and has cumulative distribution function
$$F(t|x) = P[T \le x + t \mid T > x] = \frac{F(x + t) - F(x)}{1 - F(x)}$$
and survival function
$$\bar{F}(t|x) = P[T > x + t \mid T > x] = \frac{\bar{F}(x + t)}{\bar{F}(x)} , \tag{2.7}$$
which are well defined for all $x$ such that $\bar{F}(x) > 0$. The density of this conditional distribution is
$$f(t|x) = \frac{f(x + t)}{\bar{F}(x)} . \tag{2.8}$$
Denote by $\mu(t|x)$ the force of mortality of the distribution $F(t|x)$. It is obtained by inserting $f(t|x)$ from (2.8) and $\bar{F}(t|x)$ from (2.7) in the places of $f$ and $\bar{F}$ in the definition (2.4). We find
$$\mu(t|x) = f(x + t)/\bar{F}(x + t) = \mu(x + t) . \tag{2.9}$$
Alternatively, we could insert (2.5) into (2.7) to obtain
$$\bar{F}(t|x) = e^{-\int_x^{x+t} \mu(y) \, dy} = e^{-\int_0^t \mu(x + \tau) \, d\tau} , \tag{2.10}$$
which by the general relation (2.5) entails (2.9). Relation (2.9) explains why the force of mortality is particularly handy; it depends only on the attained age $x + t$, whereas the conditional density in (2.8) depends in general on $x$ and $t$ in a more complex manner. Thus, the properties of all the conditional survival distributions are summarized by one simple function of the total age only.
D. Expected values in life distributions. Let $T$ be a non-negative r.v. with distribution function $F$, not necessarily absolutely continuous, and let $G : \mathbb{R}_+ \to \mathbb{R}$ be a PD and RC function such that $E[G(T)]$ exists and is finite. Integrating by parts, we find
$$E[G(T)] = G(0) + \int_0^\infty \bar{F}(\tau) \, dG(\tau) . \tag{2.11}$$
Taking $G(t) = t^k$ we get
$$E[T^k] = k \int_0^\infty t^{k-1} \bar{F}(t) \, dt , \tag{2.12}$$
and, in particular,
$$E[T] = \int_0^\infty \bar{F}(t) \, dt . \tag{2.13}$$
The expected remaining life length of an $x$ years old individual is
$$\bar{e}_x = \int_0^\infty \bar{F}(t|x) \, dt . \tag{2.14}$$
From (2.10) it is seen that $\bar{F}(t|x)$ is a decreasing function of $x$ for fixed $t$ if $\mu$ is an increasing function. Then $\bar{e}_x$ is a decreasing function of $x$. One can easily construct mortality laws for which $\bar{F}(t|x)$ and $\bar{e}_x$ are not decreasing functions of $x$.
Consider the more general function
$$G(t) = ((t \wedge b) - (t \wedge a))^k = \begin{cases} 0 , & 0 \le t < a , \\ (t - a)^k , & a \le t < b , \\ (b - a)^k , & b \le t , \end{cases} \tag{2.15}$$
that is, $dG(t) = k (t - a)^{k-1} \, dt$ for $a < t < b$ and 0 elsewhere. It is realized that $G(T)$ is the $k$th power of the number of years lived between age $a$ and age $b$. From (2.11) we obtain
$$E[G(T)] = k \int_a^b (t - a)^{k-1} \bar{F}(t) \, dt . \tag{2.16}$$
In particular, the expected number of years lived between the ages of $a$ and $b$ is $\int_a^b \bar{F}(t) \, dt$, which is the area between the $t$-axis and the survival function in the interval from $a$ to $b$. The formula can be motivated directly by noting that $\bar{F}(t) \, dt$ is the expected number of years survived in the small time interval $(t, t + dt)$ and using that the expected value of the sum is the sum of the expected values.
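Formula (2.16) reduces the moments of the years lived between two ages to a one-dimensional integral of the survival function. The sketch below evaluates it with a composite Simpson rule; the constant force of mortality $\mu = 0.02$ is a hypothetical choice (an exponential life length), made because the integral then has a closed form to compare with.

```python
import math


def simpson(f, a, b, m=2000):
    """Composite Simpson rule on [a, b] with an even number m of subintervals."""
    h = (b - a) / m
    s = f(a) + f(b)
    for j in range(1, m):
        s += (4 if j % 2 else 2) * f(a + j * h)
    return s * h / 3.0


def moment_of_years_lived(survival, a, b, k=1):
    """(2.16): k times the integral over [a, b] of (t - a)^(k - 1) * survival(t)."""
    return k * simpson(lambda t: (t - a) ** (k - 1) * survival(t), a, b)


MU = 0.02                                  # hypothetical constant force of mortality


def exponential_survival(t):
    """Survival function (2.5) when the force of mortality is constant."""
    return math.exp(-MU * t)


# Expected number of years lived between ages 30 and 60 by a new-born.
print(moment_of_years_lived(exponential_survival, 30.0, 60.0))
# Closed form of the same integral for comparison.
print((math.exp(-MU * 30.0) - math.exp(-MU * 60.0)) / MU)
```

Any other survival function can be passed in the same way, for instance one built from (2.5) and a given force of mortality.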

2.2 The Gompertz-Makeham mortality law

The Gompertz-Makeham distribution is widely used as a model for survivorship of human lives, especially in the context of life insurance. Thus, as it will be frequently referred to, we shall use the acronym G-M for this law. Its mortality intensity is of the form
$$\mu(t) = \alpha + \beta c^t , \tag{2.17}$$
$\alpha, \beta, c > 0$. The corresponding survival function is
$$\bar{F}(t) = \exp\left(-\int_0^t (\alpha + \beta c^s) \, ds\right) = \exp\left(-\alpha t - \beta (c^t - 1)/\log c\right) . \tag{2.18}$$
If $\beta > 0$ and $c > 1$, then $\mu(t)$ is an increasing function of $t$. The constant term $\alpha$ accounts for age-independent causes of death like certain accidents and epidemic diseases, and the term $\beta c^t$ accounts for all kinds of wear-out effects due to aging.
The G82M mortality law used by Danish insurance companies has parameters
$$\alpha = 5 \cdot 10^{-4} , \qquad \beta = 7.5858 \cdot 10^{-5} , \qquad c = 1.09144 . \tag{2.19}$$

2.3 Actuarial notation

A. Actuaries in all countries unite! The International Association of
Actuaries (IAA) has laid down a notational standard, which is generally accepted among actuaries all over the world. Familiarity with this notation is a
must for anyone who wants to communicate in writing or reading with actuaries, and we shall henceforth adopt it in those simple situations where it is
applicable.
B. A list of some standard symbols. According to the IAA standard, the quantities introduced so far are denoted as follows:
$${}_t q_x = F(t|x) , \tag{2.20}$$
$${}_t p_x = \bar{F}(t|x) , \tag{2.21}$$
$$\mu_{x+t} = \mu(x + t) . \tag{2.22}$$
In particular, ${}_t q_0 = F(t)$ and ${}_t p_0 = \bar{F}(t)$. One-year death and survival probabilities are abbreviated as
$$q_x = {}_1 q_x , \qquad p_x = {}_1 p_x . \tag{2.23}$$
Frequently used is also the $n$-year deferred probability of death within $m$ years,
$${}_{n|m} q_x = {}_{m+n} q_x - {}_n q_x = {}_n p_x - {}_{m+n} p_x . \tag{2.24}$$
In this notation the relations (2.10), (2.8) combined with (2.9), and (2.14) read
$${}_t p_x = \exp\left(-\int_0^t \mu_{x+\tau} \, d\tau\right) , \tag{2.25}$$
$$f(t|x) = {}_t p_x \, \mu_{x+t} , \tag{2.26}$$
$$\bar{e}_x = \int_0^\infty {}_t p_x \, dt . \tag{2.27}$$
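The standard symbols are straightforward to evaluate under the G-M law with the G82M parameters (2.19). The following is a minimal sketch; the function names, the Simpson integrator and the truncation horizon used for $\bar{e}_x$ are assumptions made for the illustration.

```python
import math

ALPHA, BETA, C = 5e-4, 7.5858e-5, 1.09144  # G82M parameters (2.19)


def mu(t):
    """Force of mortality (2.17)."""
    return ALPHA + BETA * C ** t


def survival(t):
    """Survival function (2.18)."""
    return math.exp(-ALPHA * t - BETA * (C ** t - 1.0) / math.log(C))


def tpx(t, x):
    """t_p_x of (2.21), computed from (2.7)."""
    return survival(x + t) / survival(x)


def tqx(t, x):
    """t_q_x of (2.20)."""
    return 1.0 - tpx(t, x)


def deferred_q(n, m, x):
    """n|m_q_x of (2.24): death between ages x + n and x + n + m."""
    return tpx(n, x) - tpx(n + m, x)


def simpson(f, a, b, m=4000):
    """Composite Simpson rule on [a, b] with an even number m of subintervals."""
    h = (b - a) / m
    s = f(a) + f(b)
    for j in range(1, m):
        s += (4 if j % 2 else 2) * f(a + j * h)
    return s * h / 3.0


def e_bar(x, horizon=120.0):
    """(2.27), truncated at an assumed horizon beyond which t_p_x is negligible."""
    return simpson(lambda t: tpx(t, x), 0.0, horizon)


print(tpx(30.0, 30.0))                     # survival from age 30 to age 60
print(deferred_q(30.0, 10.0, 30.0))        # death between ages 60 and 70
print(e_bar(30.0), e_bar(60.0))            # expected remaining life lengths
```

Since $\mu$ is increasing under these parameters, the computed $\bar{e}_x$ decreases with $x$, in line with the remark following (2.14).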

Chapter 3

Insurance of a single life

3.1 Some standard forms of insurance

A. The single-life status. Consider a person aged $x$ with remaining life length $T_x$ as described in the previous section. In actuarial parlance this life is called the single-life status $(x)$. Referring to Appendix B, we introduce the indicator of the event of survival in $t$ years, $I_t = 1\{T_x > t\}$. It is a binomial random variable with success probability ${}_t p_x$. The indicator of the event of death within $t$ years is $1 - I_t = 1\{T_x \le t\}$, which is a binomial variable with success probability ${}_t q_x = 1 - {}_t p_x$. (We apologize for sometimes using technical terms where they may sound cynical.) Note that, being 0 or 1, any indicator $1_A$ satisfies $(1_A)^q = 1_A$ for $q > 0$.
The present section lists some standard forms of insurance that $(x)$ can purchase, investigates some of their properties, and presents some standard actuarial methods and formulas.
We assume that the investments of the insurance company yield interest at a fixed rate $r$, hence discounting is at annual rate $v = e^{-r}$. Standard actuarial notation pertaining to this case is employed throughout.
B. The pure endowment insurance. An $n$-year pure (life) endowment of 1 is a unit that is paid to $(x)$ at the end of $n$ years if he is then still alive. Recalling (1.24), the associated payment function is an endowment of $I_n$ at time $n$. Its present value at time 0 is
$$PV^{e;n} = e^{-rn} I_n . \tag{3.1}$$
The expected value of $PV^{e;n}$ is
$${}_n E_x = e^{-rn} \, {}_n p_x . \tag{3.2}$$
For any $q > 0$ we have $(PV^{e;n})^q = e^{-qrn} I_n$ (recall that $I_n^q = I_n$), and so the $q$-th non-central moment of $PV^{e;n}$ may be expressed as
$$E[(PV^{e;n})^q] = {}_n E_x^{(qr)} , \tag{3.3}$$
where the topscript $(qr)$ signifies that discounting is made under a force of interest that is $q$ times the standard $r$.
In particular, the variance of $PV^{e;n}$ is
$$V[PV^{e;n}] = {}_n E_x^{(2r)} - {}_n E_x^2 . \tag{3.4}$$
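The moments (3.2) to (3.4) of the pure endowment need only one survival probability. The sketch below assumes the G82M parameters (2.19) and the case $x = 30$, $n = 30$, $r = \ln(1.045)$; the function names are hypothetical.

```python
import math

ALPHA, BETA, C = 5e-4, 7.5858e-5, 1.09144  # G82M parameters (2.19)


def survival(t):
    """Survival function (2.18)."""
    return math.exp(-ALPHA * t - BETA * (C ** t - 1.0) / math.log(C))


def tpx(t, x):
    """t_p_x, computed from (2.7)."""
    return survival(x + t) / survival(x)


def pure_endowment_moment(x, n, r, q=1):
    """(3.3): q-th moment of the present value, i.e. n_E_x at force of interest q r."""
    return math.exp(-q * r * n) * tpx(n, x)


x, n, r = 30.0, 30.0, math.log(1.045)
mean = pure_endowment_moment(x, n, r)                      # (3.2)
variance = pure_endowment_moment(x, n, r, q=2) - mean ** 2  # (3.4)
cv = math.sqrt(variance) / mean                            # coefficient of variation
print(mean, variance, cv)
```

Under these assumptions the mean is about 0.2257 and the coefficient of variation about 0.428; the latter equals $\sqrt{{}_n q_x / {}_n p_x}$ and does not depend on the interest rate.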

C. The life assurance. A life assurance contract specifies that a certain amount, called the sum insured, is to be paid upon the death of the insured, possibly limited to a specified period. We shall here consider only insurances payable immediately upon death, and take the sum to be 1 (just a matter of notation).
First, an $n$-year term insurance is payable upon death within $n$ years. The payment function is a lump sum of $1 - I_n$ at time $T_x$. Its present value at time 0 is
$$PV^{ti;n} = e^{-rT_x} (1 - I_n) . \tag{3.5}$$
The expected value of $PV^{ti;n}$ is
$$\bar{A}^{1}_{x:\overline{n}|} = \int_0^n e^{-rt} \, {}_t p_x \, \mu_{x+t} \, dt , \tag{3.6}$$
and, similar to (3.3),
$$E[(PV^{ti;n})^q] = \bar{A}^{1\,(qr)}_{x:\overline{n}|} . \tag{3.7}$$
In particular,
$$V[PV^{ti;n}] = \bar{A}^{1\,(2r)}_{x:\overline{n}|} - \left(\bar{A}^{1}_{x:\overline{n}|}\right)^2 . \tag{3.8}$$

An $n$-year endowment insurance is payable upon death if it occurs within time $n$ and otherwise at time $n$. The payment function is a lump sum of 1 at time $T_x \wedge n$. Its present value at time 0 is
$$PV^{ei;n} = e^{-r(T_x \wedge n)} . \tag{3.9}$$
The expected value of $PV^{ei;n}$ is
$$\bar{A}_{x:\overline{n}|} = \int_0^n e^{-rt} \, {}_t p_x \, \mu_{x+t} \, dt + e^{-rn} \, {}_n p_x = \bar{A}^{1}_{x:\overline{n}|} + {}_n E_x , \tag{3.10}$$
and
$$E[(PV^{ei;n})^q] = \bar{A}^{(qr)}_{x:\overline{n}|} . \tag{3.11}$$
It follows that
$$V[PV^{ei;n}] = \bar{A}^{(2r)}_{x:\overline{n}|} - \bar{A}^2_{x:\overline{n}|} . \tag{3.12}$$

D. The life annuity. An $n$-year temporary life annuity of 1 per year is payable as long as $(x)$ survives but limited to $n$ years. We consider here only the continuous version. Recalling (1.27), the associated payment function is an annuity of 1 in $T_x \wedge n$ years. Its present value at time 0 is
$$PV^{a;n} = \bar{a}_{\overline{T_x \wedge n}|} = \frac{1 - e^{-r(T_x \wedge n)}}{r} . \tag{3.13}$$
The expected value of $PV^{a;n}$ is
$$\bar{a}_{x:\overline{n}|} = \int_0^n \bar{a}_{\overline{t}|} \, {}_t p_x \, \mu_{x+t} \, dt + \bar{a}_{\overline{n}|} \, {}_n p_x = \int_0^n e^{-rt} \, {}_t p_x \, dt = \int_0^n {}_t E_x \, dt . \tag{3.14}$$
The last expression, which follows upon integrating by parts, displays that the annuity is a sum of pure endowments of $dt$ in each small interval $[t, t + dt)$ up to time $n$, see (3.2). We shall demonstrate below that
$$E[(PV^{a;n})^q] = \frac{q}{r^{q-1}} \sum_{p=1}^{q} (-1)^{p-1} \binom{q-1}{p-1} \bar{a}^{(pr)}_{x:\overline{n}|} , \tag{3.15}$$
from which we derive
$$V[PV^{a;n}] = \frac{2}{r} \left( \bar{a}_{x:\overline{n}|} - \bar{a}^{(2r)}_{x:\overline{n}|} \right) - \bar{a}^2_{x:\overline{n}|} . \tag{3.16}$$
The endowment insurance is a combined benefit consisting of an $n$-year term insurance and an $n$-year pure endowment. By (3.9) and (3.13) it is related to the life annuity by
$$PV^{a;n} = \frac{1 - PV^{ei;n}}{r} \quad \text{or} \quad PV^{ei;n} = 1 - r \, PV^{a;n} , \tag{3.17}$$
which just reflects the more general relationship (1.28). Taking expectation in (3.17), we get
$$\bar{A}_{x:\overline{n}|} = 1 - r \, \bar{a}_{x:\overline{n}|} . \tag{3.18}$$
Also, since $PV^{ti;n} = PV^{ei;n} - PV^{e;n} = 1 - r \, PV^{a;n} - PV^{e;n}$, we have
$$\bar{A}^{1}_{x:\overline{n}|} = 1 - r \, \bar{a}_{x:\overline{n}|} - {}_n E_x . \tag{3.19}$$

The formerly announced result (3.15) follows by operating with the $q$-th moment on the first relationship in (3.17), and then using (3.11) and (3.18) and rearranging a bit. One needs the binomial formula
$$(x + y)^q = \sum_{p=0}^{q} \binom{q}{p} x^{q-p} y^p$$
and the special case $\sum_{p=0}^{q} \binom{q}{p} (-1)^{q-p} = 0$ (for $x = -1$ and $y = 1$).
A whole-life annuity is obtained by putting $n = \infty$. Its expected present value is denoted simply by $\bar{a}_x$ and is obtained by putting $n = \infty$ in (3.14), that is
$$\bar{a}_x = \int_0^\infty e^{-rt} \, {}_t p_x \, dt , \tag{3.20}$$
and the same goes for the variance in (3.16) (justify the limit operations).
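The relationships (3.16) and (3.18) can be verified numerically. The sketch below is one possible way, assuming the G82M parameters (2.19), the case $x = 30$, $n = 30$, $r = \ln(1.045)$, and a composite Simpson rule for the integrals (3.6) and (3.14); none of these computational choices is taken from the notes.

```python
import math

ALPHA, BETA, C = 5e-4, 7.5858e-5, 1.09144  # G82M parameters (2.19)


def mu(t):
    """Force of mortality (2.17)."""
    return ALPHA + BETA * C ** t


def tpx(t, x):
    """t_p_x from (2.18) and (2.7)."""
    s = lambda u: math.exp(-ALPHA * u - BETA * (C ** u - 1.0) / math.log(C))
    return s(x + t) / s(x)


def simpson(f, a, b, m=2000):
    """Composite Simpson rule on [a, b] with an even number m of subintervals."""
    h = (b - a) / m
    total = f(a) + f(b)
    for j in range(1, m):
        total += (4 if j % 2 else 2) * f(a + j * h)
    return total * h / 3.0


def annuity_epv(x, n, r):
    """Expected present value (3.14) of the temporary life annuity."""
    return simpson(lambda t: math.exp(-r * t) * tpx(t, x), 0.0, n)


def term_insurance_epv(x, n, r):
    """Expected present value (3.6) of the term insurance."""
    return simpson(lambda t: math.exp(-r * t) * tpx(t, x) * mu(x + t), 0.0, n)


def endowment_insurance_epv(x, n, r):
    """Expected present value (3.10): term insurance plus pure endowment."""
    return term_insurance_epv(x, n, r) + math.exp(-r * n) * tpx(n, x)


x, n, r = 30.0, 30.0, math.log(1.045)
a = annuity_epv(x, n, r)
A = endowment_insurance_epv(x, n, r)
var_annuity = (2.0 / r) * (a - annuity_epv(x, n, 2.0 * r)) - a ** 2   # (3.16)
print(a, A, 1.0 - r * a)                   # the last two agree, cf. (3.18)
print(var_annuity)
```

The variance obtained from (3.16) also agrees with $V[PV^{ei;n}]/r^2$ computed from (3.12), as it must by (3.17).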


E. Computational aspects. Distribution functions of present values and many other functions of interest can be calculated easily; after all there is only one random variable in play, and finding expected values amounts just to forming integrals in one dimension. We shall, however, not pursue this approach because it will turn out that a different point of view is needed in more complex situations to be studied in the sequel.

Table 3.1: Expected value (E), coefficient of variation (CV), and skewness (SK) of the present value at time 0 of a pure endowment (PE) with sum 1, a term insurance (TI) with sum 1, an endowment insurance (EI) with sum 1, and a life annuity (LA) with level intensity 1 per year, when $x = 30$, $n = 30$, $\mu$ is given by (2.19), and $r = \ln(1.045)$.

        PE        TI         EI        LA
E       0.2257    0.06834    0.2940    16.04
CV      0.4280    2.536      0.3140    0.1308
SK      -1.908    2.664      4.451     -4.451

Anyway, by methods to be developed later, we easily compute the three first moments of the present values considered above, and find the expected values, coefficients of variation, and skewnesses shown in Table 3.1. The reader should contemplate the results, keeping in mind that the coefficient of variation may be taken as a simple measure of riskiness.
We interpose that numerical techniques will be dominant in our context. Explicit formulas cannot be obtained even for trivial quantities like $\bar{a}_{x:\overline{n}|}$ under the Gompertz-Makeham law (2.17); age dependence and other forms of inhomogeneity of basic entities leave little room for aesthetics in actuarial science. Also relationships like (3.18) are of limited interest; they are certainly not needed for computational purposes, but may provide some general insight.

3.2 The principle of equivalence

A. A note on terminology. Like any other good or service, insurance coverages are bought at some price. And, like any other business, an insurance company must fix prices that are sufficient to defray the costs. In one respect, however, insurance is different: for obvious reasons the customer is to pay in advance. This circumstance is reflected by the insurance terminology, according to which payments made by the insured are called premiums, from the French "prime" ("first").
B. The equivalence principle. The equivalence principle of insurance states that the expected present values of premiums and benefits should be equal. Then, roughly speaking, premiums and benefits will balance on the average. This idea will be made precise later. For the time being all calculations are made on an individual net basis, that is, the equivalence principle is applied to each individual policy, and without regard to expenses incurred in addition to the benefits specified by the insurance treaties. The resulting premiums are called net premiums.
The premium rate depends on the premium payment scheme. In the simplest case, the full premium is paid as a single amount immediately upon the inception of the policy. The resulting net single premium is just the expected present value of the benefits, which for standard forms of insurance is given in Section 3.1.
The net single premium may be a considerable amount and may easily exceed the liquid assets of the insured. Therefore, premiums are usually paid by a series of installments extending over some period of time. The most common solution is to let a fixed level amount fall due periodically, e.g. annually or monthly, from the inception of the agreement until a specified time $m$ and contingent on the survival of the insured. Assume for the present that the premiums are paid continuously at a fixed level rate $\pi$. Then the premiums form an $m$-year temporary life annuity, payable by the insured to the insurer. Its present value is $\pi \, PV^{a;m}$, with expected value $\pi \, \bar{a}_{x:\overline{m}|}$ given by (3.14).

C. The net economic result for a policy. The random variables studied in
Section 3.1 represent the uncertain future liabilities of the insurer. Now, unless
single premiums are used, the premium incomes are also dependent on the
insured's life length and become a part of the insurer's uncertainty. Therefore,
the relevant random variable associated with an insurance policy is the present
value of benefits less premiums,
$$PV = PV^{b} - \pi\,PV^{a;m}, \tag{3.21}$$
where $PV^{b}$ is the present value of the benefits, e.g. $PV^{ei;n}$ in the case of an
n-year endowment insurance.
Stated precisely, the equivalence principle lays down that
$$E[PV] = 0. \tag{3.22}$$
For example, with $PV^{b} = PV^{ei;n}$, (3.22) becomes $0 = \bar{A}_{x\,\overline{n}|} - \pi\,\bar{a}_{x\,\overline{m}|}$.
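As a numerical illustration of (3.22), the following Python sketch computes the level net premium rate $\pi$ of an n-year endowment insurance with m = n. The Gompertz-Makeham parameters, the interest rate, the age and the term are illustrative assumptions and are not taken from the text; the function names are hypothetical helpers, and the integrals are approximated by Simpson's rule.

```python
import math

# Illustrative assumptions (not from the text): Gompertz-Makeham mortality
# mu(age) = ALPHA + BETA * C**age and a constant interest intensity R.
ALPHA, BETA, C = 5.0e-4, 7.5858e-5, 10 ** 0.038
R = math.log(1.045)


def mu(age):
    """Assumed mortality intensity at the given age."""
    return ALPHA + BETA * C ** age


def survival(x, t):
    """t-year survival probability of an x-year-old (closed form for G-M)."""
    integrated_mu = ALPHA * t + BETA * C ** x * (C ** t - 1.0) / math.log(C)
    return math.exp(-integrated_mu)


def simpson(f, a, b, steps=2000):
    """Composite Simpson rule; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def annuity(x, n):
    """Expected present value of a continuous n-year temporary life annuity."""
    return simpson(lambda t: math.exp(-R * t) * survival(x, t), 0.0, n)


def endowment_insurance(x, n):
    """Expected present value of an n-year endowment insurance with sum 1."""
    death_part = simpson(
        lambda t: math.exp(-R * t) * survival(x, t) * mu(x + t), 0.0, n
    )
    return death_part + math.exp(-R * n) * survival(x, n)


def net_premium_rate(x, n):
    """Level premium rate pi solving E[PV] = 0 when m = n."""
    return endowment_insurance(x, n) / annuity(x, n)


if __name__ == "__main__":
    print(net_premium_rate(30, 30))
```

Under these assumptions `net_premium_rate(30, 30)` is the rate that makes the expected present values of premiums and benefits equal for a 30-year-old with a 30-year term.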
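A measure of the uncertainty associated with the economic result of the
policy is the variance $V[PV]$. For example, with $PV^{b} = PV^{ei;n}$ and m = n,
$$
\begin{aligned}
V[PV] &= V\!\left[ v^{T_x \wedge n} - \pi\,\frac{1 - v^{T_x \wedge n}}{r} \right]
 = \left(1 + \frac{\pi}{r}\right)^{2} V\!\left[ v^{T_x \wedge n} \right] \\
&= \frac{2}{r}\,\frac{\bar{a}_{x\,\overline{n}|} - \bar{a}^{(2r)}_{x\,\overline{n}|}}{\bar{a}^{2}_{x\,\overline{n}|}} - 1.
\end{aligned}
\tag{3.23}
$$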

3.3 Prospective reserves

A. The case. We shall discuss the notion of reserve in the framework of a
combined insurance which comprises all standard forms of contingent payments
that have been studied so far and, therefore, easily specializes to each of those.
The insured is x years old upon issue of the contract, which is for a term of
n years. The benefits consist of a term insurance with sum insured $b_t$ payable
upon death at time $t \in (0, n)$ and a pure endowment with sum $b_n$ payable upon
survival at time n. A lump sum premium of $\pi_0$ is due immediately upon issue
of the policy at time 0, and thereafter premiums are payable at rate $\pi_t$ per time
unit contingent on survival at time $t \in (0, n)$.
The expected present value at time 0 of total benefits less premiums under
the contract is
$$-\pi_0 + \int_0^n v^{\tau}\,{}_{\tau}p_x \left\{ \mu_{x+\tau}\,b_\tau - \pi_\tau \right\} d\tau + b_n\,v^{n}\,{}_{n}p_x. \tag{3.24}$$
Under the equivalence principle this is set equal to 0, a constraint on the premium function $\pi$.
B. Definition of the reserve. The expected value (3.24) represents, in an
average sense, an assessment of the economic prospects of the policy at the
outset. At any time t > 0 in the subsequent development of the policy the
assessment should be updated with regard to the information currently available.
If the policy has expired by death before time t, there is nothing more to be
done. If the policy is still in force, a renewed assessment must be based on the
conditional distribution of the remaining life length. Insurance legislation lays
down that at any time the insurance company must provide a reserve to meet
future net liabilities on the contract, and this reserve should be precisely the
expected present value at time t of total benefits less premiums in the future.
Thus, if the policy is still in force at time t, the reserve is
$$V_t = \int_t^n v^{\tau-t}\,{}_{\tau-t}p_{x+t} \left\{ \mu_{x+\tau}\,b_\tau - \pi_\tau \right\} d\tau + b_n\,v^{n-t}\,{}_{n-t}p_{x+t}. \tag{3.25}$$

More precisely, this quantity is called the prospective reserve at time t since it
looks ahead. Under the principle of equivalence it is usually called the net
premium reserve. We shall here take the liberty to just speak of the reserve.
There is a retrospective formula for the net premium reserve, which is obtained upon setting the expression in (3.24) equal to 0, then splitting the integral $\int_0^n$ into $\int_0^t + \int_t^n$, and observing that the latter integral plus the last term
in (3.24) is $v^{t}\,{}_{t}p_x\,V_t$. Then, solving with respect to $V_t$, we obtain
$$V_t = \frac{1}{{}_{t}p_x} \left[ (1+i)^{t}\,\pi_0 + \int_0^t (1+i)^{t-\tau}\,{}_{\tau}p_x \left\{ \pi_\tau - \mu_{x+\tau}\,b_\tau \right\} d\tau \right]. \tag{3.26}$$
This formula expresses $V_t$ as the surplus of transactions in the past, accumulated
at time t with the benefit of interest and survivorship.
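The agreement between the prospective formula (3.25) and the retrospective formula (3.26) can be checked numerically. The sketch below does so for an endowment insurance with unit benefits, a level premium rate and no lump sum premium ($\pi_0 = 0$); the mortality law, interest rate, age and term are illustrative assumptions, and all function names are hypothetical.

```python
import math

# Illustrative assumptions: G-M mortality, interest intensity R, age X, term N,
# unit death and survival benefits, level premium rate, no lump sum (pi_0 = 0).
ALPHA, BETA, C = 5.0e-4, 7.5858e-5, 10 ** 0.038
R = math.log(1.045)
X, N = 30, 30


def mu(age):
    return ALPHA + BETA * C ** age


def survival(x, t):
    return math.exp(-(ALPHA * t + BETA * C ** x * (C ** t - 1.0) / math.log(C)))


def simpson(f, a, b, steps=2000):
    """Composite Simpson rule; returns 0 on an empty interval."""
    if b <= a:
        return 0.0
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def equivalence_premium():
    """Level rate pi that makes the expected value (3.24) equal to 0."""
    benefits = simpson(
        lambda s: math.exp(-R * s) * survival(X, s) * mu(X + s), 0.0, N
    ) + math.exp(-R * N) * survival(X, N)
    premiums_per_unit = simpson(lambda s: math.exp(-R * s) * survival(X, s), 0.0, N)
    return benefits / premiums_per_unit


def prospective_reserve(t, pi):
    """Formula (3.25) with b_tau = b_n = 1 and pi_tau = pi."""
    integral = simpson(
        lambda s: math.exp(-R * (s - t)) * survival(X + t, s - t) * (mu(X + s) - pi),
        t, N,
    )
    return integral + math.exp(-R * (N - t)) * survival(X + t, N - t)


def retrospective_reserve(t, pi):
    """Formula (3.26) with pi_0 = 0, using (1 + i) = exp(R)."""
    integral = simpson(
        lambda s: math.exp(R * (t - s)) * survival(X, s) * (pi - mu(X + s)),
        0.0, t,
    )
    return integral / survival(X, t)
```

The two functions agree only when the premium satisfies the equivalence principle, which is why the premium is computed from (3.24) first.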
C. Some special cases. The net reserve is easily put up for the standard
forms of insurance treated in Sections 3.1 and 3.2. It is assumed that premiums
are determined by the principle of equivalence, which means that
$$-\pi_0 + V_0 = 0. \tag{3.27}$$

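Consider first the pure endowment introduced in Paragraph 3.1.B. If the
single net premium ${}_{n}E_x$ is collected at time 0, then
$$V_t = {}_{n-t}E_{x+t}, \quad 0 < t \leq n. \tag{3.28}$$
If premiums are payable continuously at level rate $\pi$ throughout the contract
period, then
$$
\begin{aligned}
V_t &= {}_{n-t}E_{x+t} - \pi\,\bar{a}_{x+t\,\overline{n-t}|} \\
&= {}_{n-t}E_{x+t} - \frac{{}_{n}E_x}{\bar{a}_{x\,\overline{n}|}}\,\bar{a}_{x+t\,\overline{n-t}|}.
\end{aligned}
\tag{3.29}
$$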

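Next, consider an m-year deferred whole life annuity against the level net
premium $\pi$ in the deferred period. The net reserve is
$$
\begin{aligned}
V_t &= \begin{cases}
{}_{m-t|}\bar{a}_{x+t} - \pi\,\bar{a}_{x+t\,\overline{m-t}|}, & 0 < t < m, \\
\bar{a}_{x+t}, & t \geq m,
\end{cases} \\
&= \bar{a}_{x+t} - \bar{a}_{x+t\,\overline{m-t}|} - \frac{\bar{a}_x - \bar{a}_{x\,\overline{m}|}}{\bar{a}_{x\,\overline{m}|}}\,\bar{a}_{x+t\,\overline{m-t}|} \\
&= \bar{a}_{x+t} - \frac{\bar{a}_x}{\bar{a}_{x\,\overline{m}|}}\,\bar{a}_{x+t\,\overline{m-t}|}
\end{aligned}
\tag{3.30}
$$
(with the understanding that $\bar{a}_{x+t\,\overline{m-t}|} = 0$ if t > m).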
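For the n-year term insurance, considered in Paragraph 3.1.C, with level
net premium $\pi$ during the contract period,
$$
\begin{aligned}
V_t &= \bar{A}^{1}_{x+t\,\overline{n-t}|} - \pi\,\bar{a}_{x+t\,\overline{n-t}|} \\
&= 1 - r\,\bar{a}_{x+t\,\overline{n-t}|} - {}_{n-t}E_{x+t} - \frac{1 - r\,\bar{a}_{x\,\overline{n}|} - {}_{n}E_x}{\bar{a}_{x\,\overline{n}|}}\,\bar{a}_{x+t\,\overline{n-t}|} \\
&= 1 - {}_{n-t}E_{x+t} - (1 - {}_{n}E_x)\,\frac{\bar{a}_{x+t\,\overline{n-t}|}}{\bar{a}_{x\,\overline{n}|}}.
\end{aligned}
\tag{3.31}
$$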

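Finally, for the n-year endowment insurance, with level net premium $\pi$ in
the contract period,
$$
\begin{aligned}
V_t &= \bar{A}_{x+t\,\overline{n-t}|} - \pi\,\bar{a}_{x+t\,\overline{n-t}|} \\
&= 1 - r\,\bar{a}_{x+t\,\overline{n-t}|} - \frac{1 - r\,\bar{a}_{x\,\overline{n}|}}{\bar{a}_{x\,\overline{n}|}}\,\bar{a}_{x+t\,\overline{n-t}|} \\
&= 1 - \frac{\bar{a}_{x+t\,\overline{n-t}|}}{\bar{a}_{x\,\overline{n}|}}.
\end{aligned}
\tag{3.32}
$$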

The reserve in (3.32) is, of course, the sum of the reserves in (3.29) and
(3.31), since the endowment insurance is a pure endowment combined with a
term insurance. Note that the pure term insurance requires a much smaller reserve than
the other insurance forms, with elements of savings in them. However, at old
ages x (where people typically are not covered against the risk of death since
death will occur soon with certainty) also the term insurance may have a $V_t$
close to 1 in the middle of the insurance period.
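The closed-form reserves for the pure endowment (3.29), the term insurance (3.31) and the endowment insurance (3.32), all with level premium during the contract period, can be evaluated as in the following sketch. The mortality law and interest rate are illustrative assumptions, and the function names are hypothetical. The sketch also lets one confirm numerically that the endowment insurance reserve equals the pure endowment reserve plus the term insurance reserve.

```python
import math

# Illustrative assumptions: G-M mortality and a constant interest intensity R.
ALPHA, BETA, C = 5.0e-4, 7.5858e-5, 10 ** 0.038
R = math.log(1.045)


def survival(x, t):
    return math.exp(-(ALPHA * t + BETA * C ** x * (C ** t - 1.0) / math.log(C)))


def simpson(f, a, b, steps=2000):
    """Composite Simpson rule; returns 0 on an empty interval."""
    if b <= a:
        return 0.0
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def annuity(x, n):
    """Continuous n-year temporary life annuity for age x."""
    return simpson(lambda t: math.exp(-R * t) * survival(x, t), 0.0, n)


def pure_endowment(x, n):
    """Expected present value of 1 payable on survival to time n."""
    return math.exp(-R * n) * survival(x, n)


def reserve_pure_endowment(x, n, t):
    """Formula (3.29)."""
    premium = pure_endowment(x, n) / annuity(x, n)
    return pure_endowment(x + t, n - t) - premium * annuity(x + t, n - t)


def reserve_term_insurance(x, n, t):
    """Last line of formula (3.31)."""
    ratio = annuity(x + t, n - t) / annuity(x, n)
    return 1.0 - pure_endowment(x + t, n - t) - (1.0 - pure_endowment(x, n)) * ratio


def reserve_endowment_insurance(x, n, t):
    """Last line of formula (3.32)."""
    return 1.0 - annuity(x + t, n - t) / annuity(x, n)


if __name__ == "__main__":
    for t in (0, 10, 20, 30):
        print(t, reserve_term_insurance(30, 30, t), reserve_endowment_insurance(30, 30, t))
```

Printing the reserves along the contract period shows how small the term insurance reserve is compared with the reserve of the endowment insurance at this age.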


D. Non-negativity of the reserve. In all the examples given here the net
reserve is sketched as a non-negative function. Non-negativity of $V_t$ is not a
consequence of the definition. One may easily construct premium payment
schemes that lead to negative values of $V_t$ (just let the premiums fall due after
the payment of the benefits), but such payment schemes are not used in practice.
The reason is that the holder of a policy with $V_t < 0$ is in expected debt to
the insurer and would thus have an incentive to cancel the policy and thereby
get rid of the debt. (The agreement obliges the policyholder only to pay the
premiums, and the contract can be terminated at any time the policyholder
wishes.) Therefore, it is in practice required that
$$V_t \geq 0, \quad t \geq 0. \tag{3.33}$$

3.4 Thiele's differential equation

A. The differential equation. We turn back to the general case with the
reserve given by (3.25). Suppose the policy is in force at time $t \in (0, n)$. Upon
conditioning on what happens in the small time interval (t, t + dt), we find
$$V_t = b_t\,\mu_{x+t}\,dt - \pi_t\,dt + (1 - \mu_{x+t}\,dt)\,e^{-r\,dt}\,V_{t+dt}. \tag{3.34}$$
Subtract $V_{t+dt}$ on both sides, divide by dt and let dt tend to 0. Observing that
$(e^{-r\,dt} - 1)/dt \to -r$ as $dt \to 0$, one obtains Thiele's differential equation,
$$\frac{d}{dt}V_t = \pi_t - b_t\,\mu_{x+t} + (r + \mu_{x+t})\,V_t, \tag{3.35}$$
valid at each t where $b$, $\pi$, and $\mu$ are continuous. The right hand side expression
in (3.35) shows how the fund per surviving policyholder changes per time unit
at time t. It is increased by the excess of premiums over benefits (which may
be negative, of course), by the interest earned, $rV_t$, and by the fund inherited
from those who die, $\mu_{x+t}V_t$.
When combined with the boundary condition
$$V_{n-} = b_n, \tag{3.36}$$
the differential equation (3.35) determines $V_t$ for fixed $b$ and $\pi$.

If the principle of equivalence is exercised, then we must add the condition
(3.27). This represents a constraint on the contractual payments $b$ and $\pi$; typically, one first specifies the benefit $b$ and then determines the premium rate for
a given premium plan (shape of $\pi$).
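Thiele's differential equation lends itself to numerical solution from the boundary condition (3.36) backwards in time. The sketch below applies a classical fourth order Runge-Kutta scheme to (3.35) for an endowment insurance with unit benefits and level premium; the mortality law, interest rate, age and term are illustrative assumptions and the function names are hypothetical. The result can be compared with the closed form (3.32).

```python
import math

# Illustrative assumptions: G-M mortality, interest intensity R, age X, term N.
ALPHA, BETA, C = 5.0e-4, 7.5858e-5, 10 ** 0.038
R = math.log(1.045)
X, N = 30, 30


def mu(age):
    return ALPHA + BETA * C ** age


def survival(x, t):
    return math.exp(-(ALPHA * t + BETA * C ** x * (C ** t - 1.0) / math.log(C)))


def simpson(f, a, b, steps=2000):
    if b <= a:
        return 0.0
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def annuity(x, n):
    return simpson(lambda t: math.exp(-R * t) * survival(x, t), 0.0, n)


def thiele_rhs(t, v, pi):
    """Right-hand side of (3.35) with death benefit b_t = 1."""
    m = mu(X + t)
    return pi - m + (R + m) * v


def solve_thiele(t_target, pi, steps=3000):
    """Integrate (3.35) from V_{n-} = b_n = 1 backwards to t_target by RK4."""
    h = (t_target - N) / steps  # negative step: time runs from n down to t_target
    t, v = float(N), 1.0
    for _ in range(steps):
        k1 = thiele_rhs(t, v, pi)
        k2 = thiele_rhs(t + h / 2, v + h * k1 / 2, pi)
        k3 = thiele_rhs(t + h / 2, v + h * k2 / 2, pi)
        k4 = thiele_rhs(t + h, v + h * k3, pi)
        v += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return v


# Equivalence premium of the endowment insurance, (1 - r * a) / a.
PI = 1.0 / annuity(X, N) - R

if __name__ == "__main__":
    print(solve_thiele(10.0, PI), 1.0 - annuity(X + 10, N - 10) / annuity(X, N))
```

With the equivalence premium the computed reserve at time 0 is numerically zero, in line with condition (3.27) when there is no lump sum premium.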
B. Savings premium and risk premium. Suppose the equivalence principle is in use. Rearrange (3.35) as
$$\pi_t = \frac{d}{dt}V_t - rV_t + (b_t - V_t)\,\mu_{x+t}. \tag{3.37}$$
This form of the differential equation shows how the premium at any time
decomposes into a savings premium,
$$\pi_t^{s} = \frac{d}{dt}V_t - rV_t, \tag{3.38}$$
and a risk premium,
$$\pi_t^{r} = (b_t - V_t)\,\mu_{x+t}. \tag{3.39}$$

The savings premium provides the amount needed in excess of the earned interest to maintain the reserve. The risk premium provides the amount needed in
excess of the available reserve to cover an insurance claim.
C. Uses of the differential equation. In the examples given above, Thiele's
differential equation was useful primarily as a means of investigating the development of the reserve. It was not required in the construction of the premium
and the reserve, which could be put up by direct prospective reasoning. In the final example to be given Thiele's differential equation is needed as a constructive
tool.
Assume that the pension treaty studied above is modified so that the reserve
is paid back at the moment of death in case the insured dies during the contract
period, the philosophy being that the savings belong to the insured. Then
the scheme is supplemented by an (n + m)-year temporary term insurance with sum
$b_t = V_t$ at any time $t \in (0, m + n)$. The solution to (3.35) is easily obtained as
$$
V_t = \begin{cases}
\pi\,\bar{s}_{\overline{t}|}, & 0 < t < m, \\
b\,\bar{a}_{\overline{m+n-t}|}, & m < t < m + n,
\end{cases}
$$
where $\bar{s}_{\overline{t}|} = \int_0^t (1+i)^{t-\tau}\,d\tau$. The reserve develops just as for ordinary savings
contracts offered by banks.

3.5 The stochastic process point of view

A. The processes indicating survival and death. In Paragraph A of
Section 3.1 we introduced the indicator of the event of survival to time t, $I_t = 1\{T_x > t\}$, and the indicator of the complementary event of death within time t,
$N_t = 1 - I_t = 1\{T_x \leq t\}$. Viewed as functions of t, they are stochastic processes.
The latter counts the number of deaths of the insured as time progresses and
is thus a simple example of a counting process as defined in Paragraph D of
Appendix A. This motivates the notation $N_t$. By their very definitions, $I_t$ and
$N_t$ are RC.
In the present context, where everything is governed by just one single random variable, $T_x$, the process point of view is not important for practical purposes. For didactical purposes, however, it is worthwhile taking it already here
as a rehearsal for more complicated situations where stochastic processes cannot
be dispensed with.


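The payment functions of the benefits considered in Section 3.1 can be recast
in terms of the processes $I_t$ and $N_t$. In differential form they are
$$
\begin{aligned}
dA^{e;n}_t &= I_t\,d\varepsilon_n(t), \\
dA^{ti;n}_t &= (1 - \varepsilon_n(t))\,dN_t, \\
dA^{a;n}_t &= (1 - \varepsilon_n(t))\,I_t\,dt, \\
dA^{ei;n}_t &= dA^{ti;n}_t + dA^{e;n}_t.
\end{aligned}
$$
Their present values are
$$
\begin{aligned}
PV^{e;n} &= e^{-\int_0^n r}\,I_n, \\
PV^{ti;n} &= \int_0^n e^{-\int_0^\tau r}\,dN_\tau, \\
PV^{a;n} &= \int_0^n e^{-\int_0^\tau r}\,I_\tau\,d\tau, \\
PV^{ei;n} &= PV^{ti;n} + PV^{e;n}.
\end{aligned}
$$
The expressions in (3.14) and (3.10) are obtained directly by taking expectation
under the integral sign (Fubini), using the obvious relations
$$
\begin{aligned}
E[I_\tau] &= {}_{\tau}p_x, \\
E[dN_\tau] &= {}_{\tau}p_x\,\mu_{x+\tau}\,d\tau.
\end{aligned}
$$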

The relationship (3.19) reemerges in its more basic form upon integrating
by parts to obtain
$$e^{-\int_0^n r}\,I_n = 1 + \int_0^n e^{-\int_0^\tau r}\,(-r_\tau)\,I_\tau\,d\tau + \int_0^n e^{-\int_0^\tau r}\,dI_\tau,$$

Chapter 4

Markov chains in life insurance

4.1 The insurance policy as a stochastic process

A. The basic entities. Consider an insurance policy issued at time 0 for a
finite term of n years. We have in mind life or pension insurance or some other
form of insurance of persons like disability or sickness coverage. In such lines of
insurance, payments are typically contingent on transitions of the
policy between certain states specified in the contract. Thus, we assume there
is a finite set of states, $\mathcal{Z} = \{0, 1, \ldots, r\}$, such that the policy at any time is in
one and only one state, commencing in state 0 (say) at time 0. Denote the state
of the policy at time t by Z(t). Regarded as a function from [0, n] to $\mathcal{Z}$, Z is
assumed to be right-continuous, with a finite number of jumps, and Z(0) = 0.
To account for the random course of the policy, Z is modelled as a stochastic
process on some probability space $(\Omega, \mathcal{H}, P)$.
B. Model deliberations; realism versus simplicity. On specifying the
probability model, two concerns must be kept in mind, and they are inevitably
conflicting. On the one hand, the model should reflect the essential features of (a
certain piece of) reality, and this speaks for a complex model to the extent that
reality itself is complex. On the other hand, the model should be mathematically
tractable, and this speaks for a simple model allowing of easy computation of
quantities of interest. The art of modelling is to strike the right balance between
these two concerns.
Favouring simplicity in the first place, we shall be working under Markov
assumptions, which allow for fairly easy computation of relevant probabilities
and expected values. Later on we shall demonstrate the versatility of this model
framework, showing that it is capable of representing virtually any conception
one might have of the mechanisms governing the development of the policy. We
shall take the Markov chain model presented in [14] as a suitable framework
4.2 The time-continuous Markov chain

A. The Markov property. A stochastic process is essentially determined
by its finite-dimensional distributions. In the present case, where Z has only a
finite state space, these are fully specified by the probabilities of the elementary
events $\bigcap_{h=1}^{p} [Z(t_h) = j_h]$, $t_1 < \cdots < t_p$ in [0, n] and $j_1, \ldots, j_p \in \mathcal{Z}$. Now
$$P[Z(t_h) = j_h,\ h = 1, \ldots, p] = \prod_{h=1}^{p} P[Z(t_h) = j_h \mid Z(t_g) = j_g,\ g = 0, \ldots, h-1], \tag{4.1}$$
where, for convenience, we have put $t_0 = 0$ and $j_0 = 0$ so that $[Z(t_0) = j_0]$ is

the trivial event with probability 1. Thus, the specification of P could suitably
start from the conditional probabilities on the right of (4.1).
A particularly simple structure is obtained by assuming that, for all $t_1 < \cdots < t_p$ in [0, n] and $j_1, \ldots, j_p \in \mathcal{Z}$,
$$P[Z(t_p) = j_p \mid Z(t_h) = j_h,\ h = 1, \ldots, p-1] = P[Z(t_p) = j_p \mid Z(t_{p-1}) = j_{p-1}], \tag{4.2}$$
which means that the process is fully determined by the (simple) transition probabilities
$$p_{jk}(t,u) = P[Z(u) = k \mid Z(t) = j], \tag{4.3}$$
t < u in [0, n] and $j, k \in \mathcal{Z}$. In fact, if (4.2) holds, then (4.1) reduces to
$$P[Z(t_h) = j_h,\ h = 1, \ldots, p] = \prod_{h=1}^{p} p_{j_{h-1} j_h}(t_{h-1}, t_h), \tag{4.4}$$
and one easily proves the equivalent statement that, for any $t_1 < \cdots < t_p < t < t_{p+1} < \cdots < t_{p+q}$ in [0, n] and $j_1, \ldots, j_p, j, j_{p+1}, \ldots, j_{p+q}$ in $\mathcal{Z}$,
$$
\begin{aligned}
&P[Z(t_h) = j_h,\ h = p+1, \ldots, p+q \mid Z(t) = j,\ Z(t_h) = j_h,\ h = 1, \ldots, p] \\
&\quad = P[Z(t_h) = j_h,\ h = p+1, \ldots, p+q \mid Z(t) = j].
\end{aligned}
\tag{4.5}
$$
Proclaiming t "the present time", (4.5) says that the future of the process is
independent of its past when the present is known. (Fully known, that is; if
the present state is only partly known, it may certainly help to add information
about the past.)
The condition (4.2) is called the Markov property. We shall assume that
Z possesses this property and, accordingly, call it a continuous time Markov
process on the state space $\mathcal{Z}$.

CHAPTER 4. MARKOV CHAINS IN LIFE INSURANCE


From the simple transition probabilities we form the more general transition
probability from j to some subset $K \subset \mathcal{Z}$,
$$p_{jK}(t,u) = P[Z(u) \in K \mid Z(t) = j] = \sum_{k \in K} p_{jk}(t,u). \tag{4.6}$$
We have, of course,
$$p_{j\mathcal{Z}}(t,u) = \sum_{k \in \mathcal{Z}} p_{jk}(t,u) = 1. \tag{4.7}$$

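C. The Chapman-Kolmogorov equation. For a fixed $t \in [0, n]$ the events
$\{Z(t) = j\}$, $j \in \mathcal{Z}$, are disjoint and their union is the almost sure event. It
follows that
$$
\begin{aligned}
P[Z(u) = k \mid Z(s) = i] &= \sum_{j \in \mathcal{Z}} P[Z(t) = j,\ Z(u) = k \mid Z(s) = i] \\
&= \sum_{j \in \mathcal{Z}} P[Z(t) = j \mid Z(s) = i]\,P[Z(u) = k \mid Z(s) = i,\ Z(t) = j].
\end{aligned}
$$
If Z is Markov, and $0 \leq s \leq t \leq u$, this reduces to
$$p_{ik}(s,u) = \sum_{j \in \mathcal{Z}} p_{ij}(s,t)\,p_{jk}(t,u), \tag{4.8}$$
which is known as the Chapman-Kolmogorov equation.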

D. Intensities of transition. In principle, specifying the Markov model
amounts to specifying the $p_{jk}(t,u)$ in such a manner that the expressions on
the right of (4.4) define probabilities in a consistent way. This would be easy
if Z were a discrete time Markov chain with t ranging in a finite time set
$0 = t_0 < t_1 < \cdots < t_q = n$: then we could just take the $p_{jk}(t_{p-1}, t_p)$ as any
non-negative numbers satisfying $\sum_{k=0}^{r} p_{jk}(t_{p-1}, t_p) = 1$ for each $j \in \mathcal{Z}$ and
p = 1, ..., q. This simple device does not carry over without modification to the
continuous time case since there are no smallest finite time intervals from which
we can build all probabilities by (4.4). An obvious way of adapting the basic
idea to the time-continuous case is to add smoothness assumptions that give
meaning to a notion of transition probabilities in infinitesimal time intervals.
More specifically, we shall assume that the intensities of transition,
$$\mu_{jk}(t) = \lim_{h \downarrow 0} \frac{p_{jk}(t, t+h)}{h} \tag{4.9}$$
exist for each $j, k \in \mathcal{Z}$, $j \neq k$, and $t \in [0, n)$ and, moreover, that they are
piecewise continuous. Another way of phrasing (4.9) is
$$p_{jk}(t, t+dt) = \mu_{jk}(t)\,dt + o(dt), \tag{4.10}$$
where the term o(dt) is such that $o(dt)/dt \to 0$ as $dt \to 0$. Thus, transition
probabilities over a short time interval are assumed to be (approximately) proportional to the length of the interval, and the proportionality factors are just
the intensities, which may depend on the time. What is "short" in this connection depends on the sizes of the intensities. For instance, if the $\mu_{jk}(\tau)$ are
approximately constant and $\ll 1$ for all $k \neq j$ and all $\tau \in [t, t+1]$, then
$\mu_{jk}(t)$ approximates the transition probability $p_{jk}(t, t+1)$. In general, however,
the intensities may attain any positive values and should not be confused with
probabilities.
For $j \notin K \subset \mathcal{Z}$, we define the intensity of transition from state j to the set
of states K at time t as
$$\mu_{jK}(t) = \lim_{u \downarrow t} \frac{p_{jK}(t,u)}{u - t} = \sum_{k \in K} \mu_{jk}(t). \tag{4.11}$$
In particular, the total intensity of transition out of state j at time t is $\mu_{j,\mathcal{Z}\setminus\{j\}}(t)$,
which is abbreviated
$$\mu_{j\cdot}(t) = \sum_{k;\,k \neq j} \mu_{jk}(t). \tag{4.12}$$
From (4.7) and (4.10) we get
$$p_{jj}(t, t+dt) = 1 - \mu_{j\cdot}(t)\,dt + o(dt). \tag{4.13}$$

E. The Kolmogorov differential equations.
The transition probabilities are two-dimensional functions of time, and in nontrivial situations it is virtually impossible to specify them directly in a consistent manner or even figure how they should look on intuitive grounds. The
intensities, however, are one-dimensional functions of time and, being easily interpretable, they form a natural starting point for specification of the model.
Luckily, as we shall now see, they are also basic entities in the system as they
determine the transition probabilities uniquely.
Suppose the process Z is in state j at time t. To find the probability that
the process will be in state k at a given future time u, let us condition on
what happens in the first small time interval (t, t + dt]. In the first place Z
may remain in state j with probability $1 - \mu_{j\cdot}(t)\,dt$ and, conditional on this
event, the probability of ending up in state k at time u is $p_{jk}(t+dt, u)$. In the
second place, Z may jump to some other state g with probability $\mu_{jg}(t)\,dt$ and,
conditional on this event, the probability of ending up in state k at time u is
$p_{gk}(t+dt, u)$. Thus, the total probability of Z being in state k at time u is
$$p_{jk}(t,u) = (1 - \mu_{j\cdot}(t)\,dt)\,p_{jk}(t+dt, u) + \sum_{g;\,g \neq j} \mu_{jg}(t)\,dt\,p_{gk}(t+dt, u) + o(dt). \tag{4.14}$$
Upon putting $d_t\,p_{jk}(t,u) = p_{jk}(t+dt, u) - p_{jk}(t,u)$ in the infinitesimal sense,
we arrive at
$$d_t\,p_{jk}(t,u) = \mu_{j\cdot}(t)\,dt\,p_{jk}(t,u) - \sum_{g;\,g \neq j} \mu_{jg}(t)\,dt\,p_{gk}(t,u). \tag{4.15}$$
For given k and u these differential equations determine the functions $p_{jk}(\cdot, u)$,
j = 0, ..., r, uniquely when combined with the obvious conditions
$$p_{jk}(u,u) = \delta_{jk}. \tag{4.16}$$
Here $\delta_{jk}$ is the Kronecker delta defined as 1 if j = k and 0 otherwise.

The relation (4.14) could have been put up directly by use of the Chapman-Kolmogorov equation (4.8), with s, t, i, j replaced by t, t + dt, j, g, but we have
carried through the detailed (still informal though) argument above since it will
be in use repeatedly throughout the text. It is called the backward (differential)
argument since it focuses on t, which in the perspective of the considered time
period [t, u] is the very beginning. Accordingly, (4.15) is referred to as the
Kolmogorov backward differential equations, being due to A.N. Kolmogorov.
At points of continuity of the intensities we can divide by dt in (4.15) and
obtain a limit on the right as dt tends to 0. Thus, at such points we can write
(4.15) as
$$\frac{\partial}{\partial t}\,p_{jk}(t,u) = \mu_{j\cdot}(t)\,p_{jk}(t,u) - \sum_{g;\,g \neq j} \mu_{jg}(t)\,p_{gk}(t,u). \tag{4.17}$$
Since we have assumed that the intensities are piecewise continuous, the indicated derivatives exist piecewise. We prefer, however, to work with the differential form (4.15) since it is generally valid under our assumptions and, moreover, invites algorithmic reasoning; numerical procedures for solving differential
equations are based on approximation by difference equations for some fine discretization and, in fact, (4.14) is basically what one would use with some small
dt > 0.
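The remark that (4.14) is what one would use with a small dt > 0 can be made concrete. The sketch below iterates the difference scheme (4.14), with the o(dt) term dropped, backwards from the condition (4.16). The three-state intensity matrix is a hypothetical example, and the function names are illustrative. The scheme is first order accurate only; a Runge-Kutta method would be the more efficient choice in practice.

```python
import math


def intensities(t):
    """Hypothetical intensities mu_jk(t) for states 0, 1, 2 (diagonal unused)."""
    death = 5.0e-4 + 7.5858e-5 * 10 ** (0.038 * (30 + t))
    return [
        [0.0, 0.01, death],        # from state 0: to 1, to 2
        [0.05, 0.0, 2.0 * death],  # from state 1: to 0, to 2
        [0.0, 0.0, 0.0],           # state 2 is absorbing
    ]


def transition_matrix(t, u, steps):
    """Approximate the matrix (p_jk(t, u)) by iterating (4.14) backwards from u."""
    n = 3
    dt = (u - t) / steps
    # Condition (4.16): p_jk(u, u) is the Kronecker delta.
    p = [[1.0 if j == k else 0.0 for k in range(n)] for j in range(n)]
    for i in range(steps):
        s = u - (i + 1) * dt  # left end point of the current small interval
        m = intensities(s)
        total = [sum(m[j][g] for g in range(n) if g != j) for j in range(n)]
        p = [
            [
                (1.0 - total[j] * dt) * p[j][k]
                + sum(m[j][g] * dt * p[g][k] for g in range(n) if g != j)
                for k in range(n)
            ]
            for j in range(n)
        ]
    return p


if __name__ == "__main__":
    for row in transition_matrix(0.0, 10.0, 1000):
        print(row)
```

Each step of the scheme preserves the row sums in (4.7), and products of the one-step matrices satisfy the Chapman-Kolmogorov equation (4.8) on the grid, so the approximations are consistent as probabilities.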
As one may have guessed, there exist also Kolmogorov forward differential
equations. These are obtained by focusing on what happens at the end of the
time interval in consideration. Reasoning along the lines above, we have
$$p_{ij}(s, t+dt) = \sum_{g;\,g \neq j} p_{ig}(s,t)\,\mu_{gj}(t)\,dt + p_{ij}(s,t)\,(1 - \mu_{j\cdot}(t)\,dt) + o(dt),$$
hence
$$d_t\,p_{ij}(s,t) = \sum_{g;\,g \neq j} p_{ig}(s,t)\,\mu_{gj}(t)\,dt - p_{ij}(s,t)\,\mu_{j\cdot}(t)\,dt. \tag{4.18}$$
For given i and s, the differential equations (4.18) determine the functions
$p_{ij}(s, \cdot)$, j = 0, ..., r, uniquely in conjunction with the obvious conditions
$$p_{ij}(s,s) = \delta_{ij}. \tag{4.19}$$

In some simple cases the differential equations have nice analytical solutions,
but in most non-trivial cases they must be solved numerically, e.g. by the Runge-Kutta method.
Once the simple transition probabilities are determined, we may calculate
the probability of any event in $\mathcal{H}_{\{t_1, \ldots, t_r\}}$ from the finite-dimensional distribution (4.4). In fact, with finite $\mathcal{Z}$ every such probability is just a finite sum of
probabilities of elementary events to which we can apply (4.4).
Probabilities of more complex events that involve an infinite number of coordinates of Z, e.g. events in $\mathcal{H}_T$ with T an interval, cannot in general be
calculated from the simple transition probabilities. Often we can, however, put
up differential equations for the requested probabilities and solve these by some
suitable method.
Of particular interest is the probability of staying uninterruptedly in the
current state for a certain period of time,
$$p_{\overline{jj}}(t,u) = P[Z(\tau) = j,\ \tau \in (t,u] \mid Z(t) = j]. \tag{4.20}$$
Obviously $p_{\overline{jj}}(t,u) = p_{\overline{jj}}(t,s)\,p_{\overline{jj}}(s,u)$ for t < s < u. By the "backward"
construction and (4.13) we get
$$p_{\overline{jj}}(t,u) = (1 - \mu_{j\cdot}(t)\,dt)\,p_{\overline{jj}}(t+dt, u) + o(dt). \tag{4.21}$$
From here proceed as above, using $p_{\overline{jj}}(u,u) = 1$, to obtain
$$p_{\overline{jj}}(t,u) = e^{-\int_t^u \mu_{j\cdot}}. \tag{4.22}$$

4.3 Applications

A. A single life with one cause of death. The life length of a person is
modelled as a positive random variable T with survival function $\bar{F}$. There are
two states, "alive" and "dead". Labelling these by 0 and 1, respectively, the state
process Z is simply
$$Z(t) = 1[T \leq t], \quad t \in [0, n],$$
which counts the number of deaths by time $t \geq 0$. The process Z is right-continuous and is obviously Markov since in state 0 the past is trivial, and in
state 1 the future is trivial. The transition probabilities are
$$p_{00}(s,t) = \bar{F}(t)/\bar{F}(s).$$
The Chapman-Kolmogorov equation reduces to the trivial
$$p_{00}(s,u) = p_{00}(s,t)\,p_{00}(t,u)$$
or $\bar{F}(u)/\bar{F}(s) = \{\bar{F}(t)/\bar{F}(s)\}\{\bar{F}(u)/\bar{F}(t)\}$. The only non-null intensity is $\mu_{01}(t) = \mu(t)$, and
$$p_{00}(t,u) = e^{-\int_t^u \mu}. \tag{4.23}$$

The Kolmogorov differential equations reduce to just the definition of the intensity (write out the details).
The simple two state process with state 1 absorbing is outlined in Fig. 4.1.

[Figure 4.1: Sketch of the mortality model with one cause of death. State 0 (Alive) with a transition to state 1 (Dead).]

B. A single life with r causes of death. In the previous paragraph it was,
admittedly, the process set-up that needed the example and not the other way
around. The process formulation shows its power when we turn to more complex
situations. Fig. 4.2 outlines a first extension of the model in the previous paragraph, whereby the single absorbing state ("dead") is replaced by r absorbing
states, one for each cause of death, e.g. death from heart disease, etc. The index 0 in the intensities $\mu_{0j}$ is superfluous and
has been dropped.
Relation (4.12) implies that the total mortality intensity is the sum of the
intensities of death from different causes,
$$\mu(t) = \sum_{j=1}^{r} \mu_j(t). \tag{4.24}$$

For a person aged t the probability of survival to u is the well-known survival
probability $p_{00}(t,u)$ given by (4.23), now with a nuanced explanation in the
present enriched model. For instance, the G-M law in the simple mortality model
may be motivated as resulting from two causes of death, one with intensity $\alpha$
independent of age (pure accident) and the other with intensity $\beta c^{t}$ (wear-out).
The probability of a t years old dying from cause j before age u is
$$p_{0j}(t,u) = \int_t^u e^{-\int_t^\tau \mu}\,\mu_j(\tau)\,d\tau. \tag{4.25}$$

[Figure 4.2: Sketch of the mortality model with r causes of death. State 0 (Alive) with transitions to the absorbing states 1, ..., j, ..., r.]

Inspection of (4.24) and (4.25) gives rise to a comment. An increase of one
mortality intensity $\mu_k$ results in a decrease of the survival probability (evidently)
and also of the probabilities of death from every other cause $j \neq k$, hence (since
the probabilities sum to 1) an increase of the probability of death from cause
k (also evident). Thus, the increased proportions of deaths from heart diseases
and cancer in our times could be sufficiently explained by the fact that medical
progress has practically eliminated mortality by lung inflammation, childbed
fever, and a number of other diseases.
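The effect described in this comment can be reproduced numerically. The sketch below takes two hypothetical causes of death, an age-independent "accident" intensity and a "wear-out" intensity of Gompertz form, and evaluates (4.23) and (4.25). The parameter values and the scaling argument `accident_factor` are illustrative assumptions.

```python
import math

# Illustrative assumptions: accident intensity ALPHA, wear-out BETA * C**age.
ALPHA, BETA, C = 5.0e-4, 7.5858e-5, 10 ** 0.038


def cause_intensity(j, age, accident_factor=1.0):
    """Intensity of death from cause j: 1 = accident (constant), 2 = wear-out."""
    if j == 1:
        return accident_factor * ALPHA
    return BETA * C ** age


def survival(t, u, accident_factor=1.0):
    """p_00(t, u) as in (4.23); the total intensity (4.24) is integrated in closed form."""
    integral = accident_factor * ALPHA * (u - t) + BETA * (C ** u - C ** t) / math.log(C)
    return math.exp(-integral)


def simpson(f, a, b, steps=2000):
    """Composite Simpson rule; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def death_probability(j, t, u, accident_factor=1.0):
    """p_0j(t, u) as in (4.25)."""
    return simpson(
        lambda s: survival(t, s, accident_factor) * cause_intensity(j, s, accident_factor),
        t, u,
    )


if __name__ == "__main__":
    for factor in (1.0, 10.0):
        print(
            factor,
            survival(30, 80, factor),
            death_probability(1, 30, 80, factor),
            death_probability(2, 30, 80, factor),
        )
```

Raising the accident intensity lowers both the survival probability and the probability of dying from wear-out, while the probability of dying by accident rises, as the argument above predicts.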
C. A model for disabilities, recoveries, and death. Fig. 4.3 outlines a
model suitable for analyzing insurances with payments depending on the state
of health of the insured, e.g. sickness insurance providing an annuity benefit
during periods of disability or life insurance with premium waiver during disability. Many other problems fit into the same scheme by mere relabeling of
the states. For instance, in connection with a pension insurance with additional
benefits to the spouse, states 0 and 1 would be "unmarried" and "married",
and in connection with unemployment insurance they would be "employed"
and "unemployed".
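For a person who is active at time s the Kolmogorov forward differential
equations (4.18) are
$$\frac{\partial}{\partial t}\,p_{00}(s,t) = p_{01}(s,t)\,\mu_{10}(t) - p_{00}(s,t)\,(\mu_{01}(t) + \mu_{02}(t)), \tag{4.26}$$
$$\frac{\partial}{\partial t}\,p_{01}(s,t) = p_{00}(s,t)\,\mu_{01}(t) - p_{01}(s,t)\,(\mu_{10}(t) + \mu_{12}(t)). \tag{4.27}$$
Here the intensities are written in the general notation of (4.9): $\mu_{01}$ and $\mu_{10}$ are the intensities of disability and recovery, and $\mu_{02}$ and $\mu_{12}$ are the mortality intensities of active and disabled lives.
(The probability $p_{02}(s,t)$ is determined by the other two.) The initial conditions
(4.19) become
$$p_{00}(s,s) = 1, \tag{4.28}$$
$$p_{01}(s,s) = 0. \tag{4.29}$$

[Figure 4.3: Sketch of a model for disabilities, recoveries, and death. States: 0 (Active), 1 (Disabled), 2 (Dead).]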

(For a person who is disabled at time s the forward differential equations are
the same, only with the first subscript 0 replaced by 1 in all the probabilities,
and the side conditions are $p_{10}(s,s) = 0$, $p_{11}(s,s) = 1$.)
When the intensities are sufficiently simple functions, one may find explicit
closed expressions for the transition probabilities. Work through the case with
constant intensities.
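As a companion to the exercise with constant intensities, the sketch below solves the forward equations for $p_{00}(s,t)$ and $p_{01}(s,t)$ by a fourth order Runge-Kutta scheme. The names `mu01` (disability), `mu10` (recovery), `mu02` and `mu12` (death as active and as disabled) are notation assumed here for the four intensities of the model, and the numerical values used in the demonstration are hypothetical. In the special case without recovery the explicit solution is elementary and is included for comparison.

```python
import math


def forward_rhs(p00, p01, mu01, mu10, mu02, mu12):
    """Right-hand sides of the forward equations for p_00(s, t) and p_01(s, t)."""
    d00 = p01 * mu10 - p00 * (mu01 + mu02)
    d01 = p00 * mu01 - p01 * (mu10 + mu12)
    return d00, d01


def solve_forward(duration, mu01, mu10, mu02, mu12, steps=1000):
    """RK4 from p_00(s, s) = 1, p_01(s, s) = 0; returns (p00, p01, p02) at s + duration."""
    h = duration / steps
    rates = (mu01, mu10, mu02, mu12)
    p00, p01 = 1.0, 0.0
    for _ in range(steps):
        a0, a1 = forward_rhs(p00, p01, *rates)
        b0, b1 = forward_rhs(p00 + h / 2 * a0, p01 + h / 2 * a1, *rates)
        c0, c1 = forward_rhs(p00 + h / 2 * b0, p01 + h / 2 * b1, *rates)
        d0, d1 = forward_rhs(p00 + h * c0, p01 + h * c1, *rates)
        p00 += h / 6 * (a0 + 2 * b0 + 2 * c0 + d0)
        p01 += h / 6 * (a1 + 2 * b1 + 2 * c1 + d1)
    return p00, p01, 1.0 - p00 - p01


def closed_form_without_recovery(duration, mu01, mu02, mu12):
    """Explicit solution when mu10 = 0 and mu01 + mu02 differs from mu12."""
    out_of_active = mu01 + mu02
    p00 = math.exp(-out_of_active * duration)
    p01 = mu01 * (math.exp(-mu12 * duration) - p00) / (out_of_active - mu12)
    return p00, p01


if __name__ == "__main__":
    print(solve_forward(10.0, 0.02, 0.10, 0.01, 0.05))
```

With recovery switched on the explicit solution involves the two eigenvalues of the system; the numerical routine handles that case without change.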

4.4 The standard multi-state contract

A. The contractual payments. We refer to the insurance policy with development as described in Paragraph 4.1.A. Taking Z to be a stochastic process
with right-continuous paths and at most a finite number of jumps, the same
holds also for the associated indicator processes $I_j$ and counting processes $N_{jk}$
defined, respectively, by $I_j(t) = 1[Z(t) = j]$ (1 or 0 according as the policy is in
the state j or not at time t) and $N_{jk}(t) = \#\{\tau;\ Z(\tau-) = j,\ Z(\tau) = k,\ \tau \in (0, t]\}$
(the number of transitions from state j to state k ($k \neq j$) during the time interval (0, t]). The indicator processes $\{I_j(t)\}_{t \geq 0}$ and the counting processes
$\{N_{jk}(t)\}_{t \geq 0}$ are related by the fact that $I_j$ increases/decreases (by 1) upon a
transition into/out of state j. Thus
$$dI_j(t) = dN_{\cdot j}(t) - dN_{j\cdot}(t), \tag{4.30}$$
where a dot in the place of a subscript signifies summation over that subscript,
e.g. $N_{j\cdot} = \sum_{k;\,k \neq j} N_{jk}$.
The policy is assumed to be of standard type, which means that the payment
function representing contractual benefits less premiums is of the form
$$dB(t) = \sum_{k} I_k(t)\,dB_k(t) + \sum_{k \neq \ell} b_{k\ell}(t)\,dN_{k\ell}(t), \tag{4.31}$$
where each $B_k$, of form $dB_k(t) = b_k(t)\,dt + B_k(t) - B_k(t-)$, is a deterministic
payment function specifying payments due during sojourns in state k (a general
life annuity), and each $b_{k\ell}$ is a deterministic function specifying payments due
upon transitions from state k to state $\ell$ (a general life assurance). When different
from 0, $B_k(t) - B_k(t-)$ is an endowment at time t. The functions $b_k$ and $b_{k\ell}$ are
assumed to be finite-valued and piecewise continuous. The set of discontinuity
points of any of the annuity functions $B_k$ is $D = \{t_0, t_1, \ldots, t_q\}$ (say).
Positive amounts represent benefits and negative amounts represent premiums. In practice premiums are only of annuity type. At times $t \notin [0, n]$ all
payments are null.
B. Identities revisited. Here we make an intermission to make a comment
that does not depend on the probability structure to be specified below. The
identity (3.18) rests on the corresponding identity (3.17) between the present
values. The latter is, in its turn, a special case of the identities put up in Section
1.1, from which many identities between present values in life insurance can be
derived.
Suppose the investment portfolio of the insurance company bears interest
with intensity r(t) at time t. The following identity, which expresses life annuities by endowments and life assurances, is easily obtained upon integrating by
parts, using (4.30):
$$
\begin{aligned}
\int_t^u e^{-\int_0^\tau r}\,I_j(\tau)\,dB_j(\tau)
&= e^{-\int_0^u r}\,I_j(u)\,B_j(u) - e^{-\int_0^t r}\,I_j(t)\,B_j(t)
 + \int_t^u e^{-\int_0^\tau r}\,I_j(\tau)\,B_j(\tau)\,r(\tau)\,d\tau \\
&\quad + \int_t^u e^{-\int_0^\tau r}\,B_j(\tau)\,d\big(N_{j\cdot}(\tau) - N_{\cdot j}(\tau)\big).
\end{aligned}
$$

C. Expected present values and prospective reserves. At any time
$t \in [0, n]$, the present value of future benefits less premiums under the contract
is
$$V(t) = \int_t^n e^{-\int_t^\tau r}\,dB(\tau). \tag{4.32}$$
This is a liability for which the insurer is to provide a reserve, which by statute
is the expected value. Suppose the policy is in state j at time t. Then the
conditional expected value of V(t) is
$$V_j(t) = \int_t^n e^{-\int_t^\tau r} \sum_{k} p_{jk}(t,\tau) \Big( dB_k(\tau) + \sum_{\ell;\,\ell \neq k} b_{k\ell}(\tau)\,\mu_{k\ell}(\tau)\,d\tau \Big). \tag{4.33}$$
This follows by taking expectation under the integral in (4.32), inserting $dB(\tau)$
from (4.31), and using
$$E[I_k(\tau) \mid Z(t) = j] = p_{jk}(t,\tau),$$
$$E[dN_{k\ell}(\tau) \mid Z(t) = j] = p_{jk}(t,\tau)\,\mu_{k\ell}(\tau)\,d\tau.$$
We expound the result as follows. With probability $p_{jk}(t,\tau)$ the policy stays
in state k at time $\tau$, and if this happens the life annuity provides the amount
$dB_k(\tau)$ during a period of length $d\tau$ around $\tau$. Thus, the expected present value
at time t of this contingent payment is $p_{jk}(t,\tau)\,e^{-\int_t^\tau r}\,dB_k(\tau)$. With probability
$p_{jk}(t,\tau)\,\mu_{k\ell}(\tau)\,d\tau$ the policy jumps from state k to state $\ell$ during a period of
length $d\tau$ around $\tau$, and if this happens the assurance provides the amount
$b_{k\ell}(\tau)$. Thus, the expected present value at time t of this contingent payment
is $p_{jk}(t,\tau)\,\mu_{k\ell}(\tau)\,d\tau\,e^{-\int_t^\tau r}\,b_{k\ell}(\tau)$. Summing over all future times and types of
payments, we find the total given by (4.33).
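Let $0 \leq t < u < n$. Upon separating payments in (t, u] and in (u, n] on the
right of (4.33), and using Chapman-Kolmogorov on the latter part, we obtain
$$
\begin{aligned}
V_j(t) &= \int_t^u e^{-\int_t^\tau r} \sum_{k} p_{jk}(t,\tau) \Big( dB_k(\tau) + \sum_{\ell;\,\ell \neq k} b_{k\ell}(\tau)\,\mu_{k\ell}(\tau)\,d\tau \Big) \\
&\quad + e^{-\int_t^u r} \sum_{k} p_{jk}(t,u)\,V_k(u).
\end{aligned}
\tag{4.34}
$$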
This expression is also immediately obtained upon conditioning on the state of

the policy at time u.
Throughout the term of the policy the insurance company must currently
maintain a reserve to meet future net liabilities in respect of the contract. By
statute, if the policy is in state j at time t, then the company is to provide
a reserve that is precisely Vj (t). Accordingly, the functions Vj are called the
(state-wise) prospective reserves of the policy. One may say that the principle
of equivalence has been carried over to time t, now requiring expected balance
between the amount currently reserved and the discounted future liabilities,
given the information currently available. (Only the present state of the policy
is relevant due to the Markov property and the simple memoryless payments
under the standard contract).
D. The backward (Thiele's) differential equations. By letting u approach t in (4.34), we obtain a differential form that displays the dynamics
of the reserves. In fact, we are going to derive a set of backward differential
equations and, therefore, take the opportunity to apply the direct backward differential argument demonstrated and announced previously in Paragraph 4.2.E.
Thus, suppose the policy is in state j at time $t \notin D$. Conditioning on what
happens in a small time interval (t, t + dt] (not intersecting D) we write
$$
\begin{aligned}
V_j(t) &= b_j(t)\,dt + \sum_{k;\,k \neq j} \mu_{jk}(t)\,dt\,b_{jk}(t) \\
&\quad + (1 - \mu_{j\cdot}(t)\,dt)\,e^{-r(t)\,dt}\,V_j(t+dt) + \sum_{k;\,k \neq j} \mu_{jk}(t)\,dt\,e^{-r(t)\,dt}\,V_k(t+dt).
\end{aligned}
$$
Proceeding from here along the lines of the simple case in Section 3.4, we easily arrive at the backward or Thiele's differential equations for the state-wise
prospective reserves,
$$\frac{d}{dt}V_j(t) = (r(t) + \mu_{j\cdot}(t))\,V_j(t) - \sum_{k;\,k \neq j} \mu_{jk}(t)\,V_k(t) - b_j(t) - \sum_{k;\,k \neq j} b_{jk}(t)\,\mu_{jk}(t). \tag{4.35}$$
The differential equations are valid in the open intervals $(t_{p-1}, t_p)$, p = 1, ..., q,
and together with the conditions
$$V_j(t_p-) = (B_j(t_p) - B_j(t_p-)) + V_j(t_p), \quad p = 1, \ldots, q,\ j \in \mathcal{Z}, \tag{4.36}$$
they determine the functions $V_j$ uniquely.

A comment is in order on the differentiability of the $V_j$. At points of continuity of the functions $b_j$, $b_{jk}$, $\mu_{jk}$, and $r$ there is no problem since there the integrand on the right of (4.33) is continuous. At possible points of discontinuity of the integrand the derivative $\frac{d}{dt} V_j$ does not exist. However, since such discontinuities are finite in number, they will not affect the integrations involved in numerical procedures. Thus we shall throughout allow ourselves to write the differential equations in the form (4.35) instead of the generally valid differential form obtained upon putting $dV_j(t)$ on the left and multiplying with $dt$ on the right.
E. Solving the differential equations. Only in rare cases of no practical interest is it possible to find closed form solutions to the differential equations. In practice one must resort to numerical methods to determine the prospective reserves. As a matter of experience a fourth order Runge-Kutta procedure works reliably in virtually all situations encountered in practice.
One solves the differential equations "from top down". First solve (4.35) in the upper interval $(t_{q-1}, n)$ subject to (4.36), which specializes to $V_j(n-) = B_j(n) - B_j(n-)$ since $V_j(n) = 0$ for all $j$ by definition. Then go to the interval below and solve (4.35) subject to $V_j(t_{q-1}-) = (B_j(t_{q-1}) - B_j(t_{q-1}-)) + V_j(t_{q-1})$, where $V_j(t_{q-1})$ was determined in the first step. Proceed in this manner downwards.
It is realized that the Kolmogorov backward equations (4.15) are a special case of the Thiele equations (4.35); the transition probability $p_{jk}(t, u)$ is just the prospective reserve in state $j$ at time $t$ for the simple contract with the only payment being a lump sum payment of 1 at time $u$ if the policy is then in state $k$, and with no interest. Thus a numerical procedure for computation of prospective reserves can also be used for computation of the transition probabilities.
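The top-down procedure can be sketched in a few lines of Python. This is only a minimal illustration, not the procedure used for the tables in these notes: the function names `thiele_rhs` and `solve_backward`, the two-state example (0 = alive, 1 = dead), and the constants 0.04 and 0.01 are assumptions chosen so that the result can be compared with a closed form. The contract has no lump sum annuity payments, so a single interval $(0, n)$ suffices.

```python
import math

def thiele_rhs(t, V, r, mu, b, b_trans):
    """Right-hand side of (4.35) for all states j at time t.

    r(t): force of interest; mu(t)[j][k]: transition intensity;
    b(t)[j]: annuity rate in state j; b_trans(t)[j][k]: lump sum on j -> k.
    """
    J = len(V)
    rt, m, bj, bjk = r(t), mu(t), b(t), b_trans(t)
    out = []
    for j in range(J):
        total_mu = sum(m[j][k] for k in range(J) if k != j)
        dV = (rt + total_mu) * V[j] - bj[j]
        for k in range(J):
            if k != j:
                dV -= m[j][k] * (V[k] + bjk[j][k])
        out.append(dV)
    return out

def solve_backward(V_end, t_end, t_start, steps, rhs):
    """Classical fourth order Runge-Kutta from t_end down to t_start."""
    h = (t_start - t_end) / steps          # negative step: we move downwards
    t, V = t_end, list(V_end)
    for _ in range(steps):
        k1 = rhs(t, V)
        k2 = rhs(t + h / 2, [v + h / 2 * k for v, k in zip(V, k1)])
        k3 = rhs(t + h / 2, [v + h / 2 * k for v, k in zip(V, k2)])
        k4 = rhs(t + h, [v + h * k for v, k in zip(V, k3)])
        V = [v + h / 6 * (a + 2 * c + 2 * d + e)
             for v, a, c, d, e in zip(V, k1, k2, k3, k4)]
        t += h
    return V

# Hypothetical example: term insurance with sum 1, constant r and mu, n = 10.
r = lambda t: 0.04
mu = lambda t: [[0.0, 0.01], [0.0, 0.0]]
b = lambda t: [0.0, 0.0]
b_trans = lambda t: [[0.0, 1.0], [0.0, 0.0]]
rhs = lambda t, V: thiele_rhs(t, V, r, mu, b, b_trans)

V0 = solve_backward([0.0, 0.0], 10.0, 0.0, 200, rhs)   # V_j(n-) = 0 here
closed_form = 0.01 / 0.05 * (1 - math.exp(-0.05 * 10))  # mu/(r+mu) * (1 - e^{-(r+mu)n})
```

With lump sum annuity payments one would call `solve_backward` once per interval $(t_{p-1}, t_p)$ and add the jump from (4.36) between calls.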

F. The equivalence principle. If the equivalence principle is invoked, one must require that

$$ V_0(0) = -B_0(0) . \tag{4.37} $$

This condition imposes a constraint on the contractual functions $b_j$, $B_j$, and $b_{jk}$, viz. on the premium level for given benefits and design of the premium plan. It is of a different nature than the conditions (4.36), which follow by the very definition of prospective reserves (for given contractual functions).
G. Savings premium and risk premium. The equation (4.35) can be recast as

$$ -b_j(t)\,dt = dV_j(t) - r(t)\,dt\, V_j(t) + \sum_{k;k\neq j} R_{jk}(t)\, \mu_{jk}(t)\,dt , \tag{4.38} $$

where

$$ R_{jk}(t) = b_{jk}(t) + V_k(t) - V_j(t) . \tag{4.39} $$

The quantity $R_{jk}(t)$ is called the sum at risk associated with (a possible) transition from state $j$ to state $k$ at time $t$ since, upon such a transition, the insurer must immediately pay out the sum insured and also provide the appropriate reserve in the new state, but he can cash the reserve in the old state. Thus, the last term in (4.38) is the expected net payout in connection with a possible transition out of the current state $j$ in $(t, t+dt)$, and it is called the risk premium. The two first terms on the right of (4.38) constitute the savings premium in $(t, t+dt)$, called so because it is the amount that has to be provided to maintain the reserve in the current state; the increment of the reserve less the interest earned on it. On the left of (4.38) is the premium paid in $(t, t+dt)$, and so the relation shows how the premium decomposes in a savings part and a risk part. Although helpful as an interpretation, this consideration alone cannot carry the full understanding of the differential equation since (4.38) is valid also if $b_j(t)$ is positive (a benefit) or 0.
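For the reader who wants the intermediate step, the recasting of (4.35) into (4.38) can be written out as follows. It uses only the definition of the total intensity as the sum of the $\mu_{jk}(t)$ over $k \neq j$ and the definition (4.39) of the sum at risk.

```latex
\begin{align*}
dV_j(t)
 &= r(t) V_j(t)\,dt + \mu_{j\cdot}(t) V_j(t)\,dt
    - \sum_{k;k\neq j} \mu_{jk}(t) V_k(t)\,dt
    - b_j(t)\,dt - \sum_{k;k\neq j} b_{jk}(t)\mu_{jk}(t)\,dt \\
 &= r(t) V_j(t)\,dt - b_j(t)\,dt
    - \sum_{k;k\neq j} \bigl\{ b_{jk}(t) + V_k(t) - V_j(t) \bigr\}\, \mu_{jk}(t)\,dt \\
 &= r(t) V_j(t)\,dt - b_j(t)\,dt - \sum_{k;k\neq j} R_{jk}(t)\, \mu_{jk}(t)\,dt .
\end{align*}
```

Moving $-b_j(t)\,dt$ to the left and the remaining terms to the right gives (4.38).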
I. Uses of the differential equations. If the contractual functions do not depend on the reserves, the defining relation (4.33) gives explicit expressions for the state-wise reserves and, strictly speaking, the differential equations (4.35) are not needed for constructive purposes. They are, however, computationally convenient since there are good methods for numerical solution of differential equations. They also serve to give insight into the dynamics of the policy.
The situation is entirely different if the contractual functions are allowed to depend on the reserves in some way or other. The most typical examples are repayment of a part of the reserve upon withdrawal (a state "withdrawn" must then be included in the state space $\mathcal{Z}$) and expenses depending partly on the reserve. Also the primary insurance benefits may in some cases be specified as functions of the reserve. In such situations the differential equations are an indispensable tool in the construction of the reserves and determination of the equivalence premium. We shall provide an example in the next paragraph.

Figure 4.4: Sketch of a model for two lives (states 0 to 3, state 0 being "Both alive", with transition intensities $\mu_{01}$, $\mu_{02}$, $\mu_{13}$, $\mu_{23}$).

J. An example: Widow's pension. A married couple buys a combined life insurance and widow's pension policy specifying that premiums are to be paid with level intensity $c$ as long as both husband and wife are alive, pensions are to be paid with intensity $b$ as long as the wife is widowed, and a life assurance with sum $s$ is due immediately upon the death of the husband if the wife is already dead (a benefit to their dependents). The policy terminates at time $n$. The relevant Markov model is sketched in Figure 4.4. We assume that $r$ is constant.
The differential equations (4.35) now specialize to the following (we omit the trivial equation for $V_3(t) = 0$):

$$ \frac{d}{dt} V_0(t) = (r + \mu_{01}(t) + \mu_{02}(t))\, V_0(t) - \mu_{01}(t) V_1(t) - \mu_{02}(t) V_2(t) + c , \tag{4.40} $$

$$ \frac{d}{dt} V_1(t) = (r + \mu_{13}(t))\, V_1(t) - b , \tag{4.41} $$

$$ \frac{d}{dt} V_2(t) = (r + \mu_{23}(t))\, V_2(t) - \mu_{23}(t)\, s . \tag{4.42} $$

Consider a modified contract, by which 50% of the reserve is to be paid back to the husband in case he is widowered before time $n$, the philosophy being that couples receiving no pensions should have some of their savings back. Now the differential equations are really needed. Under the modified contract the equations above remain unchanged except that the term $0.5\, V_0(t)\, \mu_{02}(t)$ must be subtracted on the right of (4.40), which then changes to

$$ \frac{d}{dt} V_0(t) = (r + \mu_{01}(t) + 0.5\, \mu_{02}(t))\, V_0(t) + c - \mu_{01}(t) V_1(t) - \mu_{02}(t) V_2(t) . \tag{4.43} $$

Together with the conditions $V_j(n-) = 0$, $j = 0, 1, 2$, these equations are easily solved.
As a second case the widow's pension shall be analyzed in the presence of administration expenses that depend partly on the reserve. Consider again the policy terms described in the introduction of this paragraph, but assume that administration expenses are incurred with an intensity that is $a$ times the current reserve throughout the entire period $[0, n]$.
The differential equations for the reserves remain as in (4.40)-(4.42), except that for each $j$ the term $a\, V_j(t)$ is to be subtracted on the right of the differential equation for $V_j$. Thus, the administration costs related to the reserve have the same effect as a decrease of the interest intensity $r$ by $a$.

4.5 Higher order moments of present values

A. Differential equations for moments of present values. Our framework is the Markov model and the standard insurance contract. The set of time points with possible lump sum annuity payments is $D = \{t_0, t_1, \dots, t_m\}$ (with $t_0 = 0$ and $t_m = n$).
Denote by $V(t, u)$ the present value at time $t$ of the payments under the contract during the time interval $(t, u]$ and abbreviate $V(t) = V(t, n)$ (the present value at time $t$ of all future payments). We want to determine higher order moments of $V(t)$. By the Markov property, we need only the state-wise conditional moments

$$ V_j^{(q)}(t) = E[V(t)^q \mid Z(t) = j] , $$

$q = 1, 2, \dots$ The functions $V_j^{(q)}$ are determined by the differential equations

$$ \frac{d}{dt} V_j^{(q)}(t) = (q\, r(t) + \mu_{j\cdot}(t))\, V_j^{(q)}(t) - q\, b_j(t)\, V_j^{(q-1)}(t) - \sum_{k;k\neq j} \mu_{jk}(t) \sum_{p=0}^{q} \binom{q}{p} (b_{jk}(t))^p\, V_k^{(q-p)}(t) , $$

valid on $(0, n) \setminus D$ and subject to the conditions

$$ V_j^{(q)}(t-) = \sum_{p=0}^{q} \binom{q}{p} (B_j(t) - B_j(t-))^p\, V_j^{(q-p)}(t) , \quad t \in D . \tag{4.44} $$

Proof: Obviously, for $t < u < n$,

$$ V(t) = V(t, u) + e^{-\int_t^u r}\, V(u) . \tag{4.45} $$

For any $q = 1, 2, \dots$ we have by the binomial formula

$$ V(t)^q = \sum_{p=0}^{q} \binom{q}{p} V(t, u)^p \left( e^{-\int_t^u r}\, V(u) \right)^{q-p} . \tag{4.46} $$

Consider first a small time interval $(t, t+dt]$ without any lump sum annuity payment. Putting $u = t + dt$ in (4.46) and taking conditional expectation, given $Z(t) = j$, we get

$$ V_j^{(q)}(t) = \sum_{p=0}^{q} \binom{q}{p} E\left[ V(t, t+dt)^p \left( e^{-r(t)\,dt}\, V(t+dt) \right)^{q-p} \,\Big|\, Z(t) = j \right] . \tag{4.47} $$

By use of iterated expectations, conditioning on what happens in the small interval $(t, t+dt]$, the $p$-th term on the right of (4.47) becomes

$$ \binom{q}{p} (1 - \mu_{j\cdot}(t)\,dt)\, (b_j(t)\,dt)^p\, e^{-(q-p) r(t)\,dt}\, V_j^{(q-p)}(t+dt) \tag{4.48} $$

$$ {} + \binom{q}{p} \sum_{k;k\neq j} \mu_{jk}(t)\,dt\, (b_j(t)\,dt + b_{jk}(t))^p\, e^{-(q-p) r(t)\,dt}\, V_k^{(q-p)}(t+dt) . \tag{4.49} $$
Let us identify the significant parts of this expression, disregarding terms of order $o(dt)$. First look at (4.48); for $p = 0$ it is

$$ (1 - \mu_{j\cdot}(t)\,dt)\, e^{-q r(t)\,dt}\, V_j^{(q)}(t+dt) , $$

for $p = 1$ it is

$$ q\, b_j(t)\,dt\, e^{-(q-1) r(t)\,dt}\, V_j^{(q-1)}(t+dt) , $$

and for $p > 1$ it is $o(dt)$. Next look at (4.49); the factor

$$ dt\, (b_j(t)\,dt + b_{jk}(t))^p = dt \sum_{r=0}^{p} \binom{p}{r} (b_j(t)\,dt)^r\, (b_{jk}(t))^{p-r} $$

reduces to $dt\, (b_{jk}(t))^p$ so that (4.49) reduces to

$$ \binom{q}{p} \sum_{k;k\neq j} \mu_{jk}(t)\,dt\, (b_{jk}(t))^p\, e^{-(q-p) r(t)\,dt}\, V_k^{(q-p)}(t+dt) . $$

Thus, we gather

$$ V_j^{(q)}(t) = (1 - \mu_{j\cdot}(t)\,dt)\, e^{-q r(t)\,dt}\, V_j^{(q)}(t+dt) + q\, b_j(t)\,dt\, e^{-(q-1) r(t)\,dt}\, V_j^{(q-1)}(t+dt) + \sum_{p=0}^{q} \binom{q}{p} \sum_{k;k\neq j} \mu_{jk}(t)\,dt\, (b_{jk}(t))^p\, e^{-(q-p) r(t)\,dt}\, V_k^{(q-p)}(t+dt) . $$

Now subtract $V_j^{(q)}(t+dt)$ on both sides, divide by $dt$, let $dt$ tend to 0, and use $\lim_{dt \downarrow 0} \left( e^{-q r(t)\,dt} - 1 \right)/dt = -q\, r(t)$ to obtain the differential equation (4.44).
The condition (4.44) follows easily by putting $t - dt$ and $t$ in the roles of $t$ and $u$ in (4.46) and letting $dt$ tend to 0. □
A rigorous proof is given in [21].
Central moments are easier to interpret and therefore more useful than the non-central moments. Letting $m_j^{(q)}(t)$ (written $m_t^{(q)j}$ in the tables below) denote the $q$-th central moment corresponding to the non-central $V_j^{(q)}(t)$, we have

$$ m_j^{(1)}(t) = V_j^{(1)}(t) , \tag{4.50} $$

$$ m_j^{(q)}(t) = \sum_{p=0}^{q} (-1)^{q-p} \binom{q}{p} V_j^{(p)}(t) \left( V_j^{(1)}(t) \right)^{q-p} . \tag{4.51} $$
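The conversion (4.50)-(4.51) is easy to mechanize. The following Python sketch is only an illustration; the function name `central_moments` and the three-point test distribution are assumptions made here, not part of the notes. It takes the zeroth non-central moment to be 1, as the binomial expansion requires.

```python
from math import comb

def central_moments(raw):
    """raw[q-1] is the q-th non-central moment, q = 1, ..., Q.

    Returns m with m[0] the mean (4.50) and m[q-1] the q-th central
    moment for q >= 2, computed from (4.51).
    """
    V = [1.0] + list(raw)            # V[0] = 1 is the zeroth moment
    mean = V[1]
    m = [mean]
    for q in range(2, len(V)):
        m.append(sum((-1) ** (q - p) * comb(q, p) * V[p] * mean ** (q - p)
                     for p in range(q + 1)))
    return m

# Hypothetical check: a variable uniform on {1, 2, 3}.
values = [1.0, 2.0, 3.0]
raw = [sum(v ** q for v in values) / len(values) for q in (1, 2, 3)]
m = central_moments(raw)             # mean 2, variance 2/3, third central moment 0
```

For $q = 2$ the sum collapses to the familiar variance formula, second non-central moment minus squared mean, which is a quick sanity check on the signs.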

B. Computations. The computational procedure goes as follows. First solve the differential equations in the upper interval $(t_{m-1}, n)$, where the side conditions (4.44) are just

$$ V_j^{(q)}(n-) = (B_j(n) - B_j(n-))^q \tag{4.52} $$

since $V_j^{(q)}(n) = \delta_{q0}$ (the Kronecker delta). Then, if $m > 1$, solve the differential equations in the interval $(t_{m-2}, t_{m-1})$ subject to (4.44) with $t = t_{m-1}$, and proceed in this manner downwards.
C. Numerical examples. We shall calculate the first three moments for some standard forms of insurance related to the disability model in Paragraph 4.3.C. We assume that the interest rate is constant and 4.5% per year,

$$ r = \ln(1.045) = 0.044017 , $$

and that the intensities of transitions between the states depend only on the age $x$ of the insured and are

$$ \mu_x = \nu_x = 0.0005 + 0.000075858 \cdot 10^{0.038 x} , $$
$$ \sigma_x = 0.0004 + 0.0000034674 \cdot 10^{0.06 x} , $$
$$ \rho_x = 0.005 . $$

The intensities $\mu$, $\nu$, and $\sigma$ are those specified in the G82M technical basis. (That basis does not allow for recoveries and uses $\rho = 0$.)
Consider a male insured at age 30 for a period of 30 years, hence use $\mu_{02}(t) = \mu_{12}(t) = \mu_{30+t}$, $\mu_{01}(t) = \sigma_{30+t}$, $\mu_{10}(t) = \rho_{30+t}$, $0 < t < 30\ (= n)$. The central moments $m_t^{(q)j}$ defined in (4.50)-(4.51) have been computed for the states 0 and 1 (state 2 is uninteresting) at times $t = 0, 6, 12, 18, 24$, and are shown
in Table 4.1 for a term insurance with sum 1 ($= b_{02} = b_{12}$);
in Table 4.2 for an annuity payable in active state with level intensity 1 ($= b_0$);
in Table 4.3 for an annuity payable in disabled state with level intensity 1 ($= b_1$);
in Table 4.4 for a combined policy providing a term insurance with sum 1 ($= b_{02} = b_{12}$) and a disability annuity with level intensity 0.5 ($= b_1$) against level net premium 0.013108 ($= -b_0$) payable in active state.
You should try to interpret the results.
D. Solvency margins in life insurance: an illustration. Let $Y$ be the present value of all future net liabilities in respect of an insurance portfolio. Denote the $q$-th central moment of $Y$ by $m^{(q)}$. The so-called normal power approximation of the upper $\varepsilon$-fractile of the distribution of $Y$, which we denote by $y_{1-\varepsilon}$, is based on the first three moments and is

$$ y_{1-\varepsilon} \approx m^{(1)} + c_{1-\varepsilon} \sqrt{m^{(2)}} + \frac{c_{1-\varepsilon}^2 - 1}{6}\, \frac{m^{(3)}}{m^{(2)}} , $$

where $c_{1-\varepsilon}$ is the upper $\varepsilon$-fractile of the standard normal distribution. Adopting the so-called break-up criterion in solvency control, $y_{1-\varepsilon}$ can be taken as a

Table 4.1: Moments for a term insurance with sum 1:

Time t                      0        6        12       18       24       30
m_t^{(1)0} = m_t^{(1)1} :   0.0683   0.0771   0.0828   0.0801   0.0592   0
m_t^{(2)0} = m_t^{(2)1} :   0.0300   0.0389   0.0484   0.0549   0.0484   0
m_t^{(3)0} = m_t^{(3)1} :   0.0139   0.0191   0.0262   0.0343   0.0369   0

Table 4.2: Moments for an annuity of 1 per year while active:

Time t          0        6        12       18       24      30
m_t^{(1)0} :    15.763   13.921   11.606   8.698    4.995   0
m_t^{(1)1} :    0.863    0.648    0.431    0.230    0.070   0
m_t^{(2)0} :    5.885    5.665    4.740    2.950    0.833   0
m_t^{(2)1} :    7.795    5.372    3.104    1.290    0.234   0
m_t^{(3)0} :    51.550   44.570   32.020   15.650   2.737   0
m_t^{(3)1} :    78.888   49.950   25.099   8.143    0.876   0
Table 4.3: Moments for an annuity of 1 per year while disabled:

Time t          0         6        12       18       24      30
m_t^{(1)0} :    0.277     0.293    0.289    0.239    0.119   0
m_t^{(1)1} :    15.176    13.566   11.464   8.708    5.044   0
m_t^{(2)0} :    1.750     1.791    1.646    1.147    0.364   0
m_t^{(2)1} :    11.502    8.987    6.111    3.107    0.716   0
m_t^{(3)0} :    15.960    14.835   11.929   6.601    1.277   0
m_t^{(3)1} :    101.500   71.990   42.500   17.160   2.452   0

Table 4.4: Moments for a life assurance of 1 plus a disability annuity of 0.5 per year against net premium of 0.013108 per year while active:

Time t          0         6        12       18       24       30
m_t^{(1)0} :    0.0000    0.0410   0.0751   0.0858   0.0533   0
m_t^{(1)1} :    7.6451    6.8519   5.8091   4.4312   2.5803   0
m_t^{(2)0} :    0.4869    0.5046   0.4746   0.3514   0.1430   0
m_t^{(2)1} :    2.7010    2.0164   1.2764   0.5704   0.0974   0
m_t^{(3)0} :    2.1047    1.9440   1.5563   0.8686   0.1956   0
m_t^{(3)1} :    12.1200   8.1340   4.3960   1.5100   0.1430   0

minimum requirement on the technical reserve at the time of consideration. It decomposes into the premium reserve, $m^{(1)}$, and what can be termed the fluctuation reserve, $y_{1-\varepsilon} - m^{(1)}$. A possible measure of the riskiness of the portfolio is the ratio $R = \left( y_{1-\varepsilon} - m^{(1)} \right)/P$, where $P$ is some suitable measure of the size of the portfolio at the time of consideration. By way of illustration, consider a portfolio of $N$ independent policies, all identical to the one described in connection with Table 4.4 and issued at the same time. Taking as $P$ the total premium income per year, the value of $R$ at the time of issue is 48.61 for $N = 10$, 12.00 for $N = 100$, 3.46 for $N = 1000$, 1.06 for $N = 10000$, and 0.332 for $N = 100000$.
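The ratio $R$ can be sketched in Python as follows. The function names are hypothetical, and the level $\varepsilon = 0.025$ is an assumption made for this illustration, since the level is not stated in the passage above; with this assumed level the sketch gives ratios close to those quoted. The moments are the state 0, time 0 entries of Table 4.4, and the code uses the fact that the mean, the variance, and the third central moment of a sum of independent variables are the sums of the individual ones.

```python
from math import sqrt
from statistics import NormalDist

def np_fractile(m1, m2, m3, eps):
    """Normal power approximation of the upper eps-fractile."""
    c = NormalDist().inv_cdf(1.0 - eps)        # standard normal fractile
    return m1 + c * sqrt(m2) + (c * c - 1.0) / 6.0 * m3 / m2

def riskiness(N, m1, m2, m3, premium, eps):
    """R = (fractile - mean) / P for N independent identical policies."""
    y = np_fractile(N * m1, N * m2, N * m3, eps)
    return (y - N * m1) / (N * premium)        # P = total premium per year

# Table 4.4, state 0 at time 0; eps = 0.025 is an assumed level.
ratios = {N: riskiness(N, 0.0, 0.4869, 2.1047, 0.013108, 0.025)
          for N in (10, 100, 1000)}
```

The fluctuation reserve grows roughly like the square root of $N$ while $P$ grows like $N$, which is why $R$ falls as the portfolio grows.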

Chapter 5

A Markov chain interest model

5.1 The Markov model

A. The force of interest process.

The economy (or rather the part of the economy that governs the interest) is a homogeneous time-continuous Markov chain $Y$ on a finite state space $\mathcal{J}^Y = \{1, \dots, J^Y\}$, with intensities of transition $\lambda_{ef}$, $e, f \in \mathcal{J}^Y$, $e \neq f$. The force of interest is $r_e$ when the economy is in state $e$, that is,

$$ r(t) = \sum_e I_e^Y(t)\, r_e , \tag{5.1} $$

where $I_e^Y(t) = 1_{\{Y(t) = e\}}$ is the indicator of the event that $Y$ is in state $e$ at time $t$.

B. The payment process.

We adopt the standard Markov chain model of a life insurance policy in Chapter 4 and equip the associated indicator and counting processes with topscript $Z$ to distinguish them from the corresponding entities for the Markov chain governing the interest. We assume the payment stream is of the standard type considered in the previous chapter.

C. The full Markov model.

We assume that the processes $Y$ and $Z$ are independent. Then $(Y, Z)$ is a Markov chain on $\mathcal{J}^Y \times \mathcal{J}^Z$ with intensities

$$ \kappa_{ej,fk}(t) = \begin{cases} \lambda_{ef} , & e \neq f,\ j = k , \\ \mu_{jk}(t) , & e = f,\ j \neq k , \\ 0 , & e \neq f,\ j \neq k . \end{cases} $$

5.2 Differential equations for moments of present values

A. The main result.

For the purpose of assessing the contractual liability we are interested in aspects of its conditional distribution, given the available information at time $t$. We focus here on determining the conditional moments. By the Markov assumption, the functions in quest are the state-wise conditional moments

$$ V_{ej}^{(q)}(t) = E\left[ \left( \frac{1}{v(t)} \int_t^n v\, dB \right)^q \,\Big|\, Y(t) = e,\ Z(t) = j \right] . $$

Copying the proof in Section 4.5 we find that the functions $V_{ej}^{(q)}(\cdot)$ are determined by the differential equations

$$ \frac{d}{dt} V_{ej}^{(q)}(t) = (q\, r_e + \mu_{j\cdot}(t) + \lambda_{e\cdot})\, V_{ej}^{(q)}(t) - q\, b_j(t)\, V_{ej}^{(q-1)}(t) - \sum_{k;k\neq j} \mu_{jk}(t) \sum_{r=0}^{q} \binom{q}{r} (b_{jk}(t))^r\, V_{ek}^{(q-r)}(t) - \sum_{f;f\neq e} \lambda_{ef}\, V_{fj}^{(q)}(t) , \tag{5.1} $$

valid on $(0, n) \setminus D$ and subject to the conditions

$$ V_{ej}^{(q)}(t-) = \sum_{r=0}^{q} \binom{q}{r} (\Delta B_j(t))^r\, V_{ej}^{(q-r)}(t) , \quad t \in D , \tag{5.2} $$

where $\Delta B_j(t) = B_j(t) - B_j(t-)$.

For $q = 2, 3, \dots$, denote by $m_{ej}^{(q)}(t)$ the $q$-th central moment corresponding to $V_{ej}^{(q)}(t)$, and define $m_{ej}^{(1)}(t) = V_{ej}^{(1)}(t)$. Having computed the non-central moments, we obtain the central moments of orders $q > 1$ from

$$ m_{ej}^{(q)}(t) = \sum_{p=0}^{q} \binom{q}{p} (-1)^{q-p}\, V_{ej}^{(p)}(t) \left( V_{ej}^{(1)}(t) \right)^{q-p} . $$

B. Numerical results for a combined insurance policy.

Consider a combined life insurance and disability pension policy issued at time 0 to a person who is then aged $x$, say. The relevant states of the policy are 1 = "active", 2 = "disabled", and 3 = "dead". At time $t$, when the insured is $x + t$ years old, transitions between these states take place with intensities

$$ \mu_{13}(t) = \mu_{23}(t) = 0.0005 + 0.000075858 \cdot 10^{0.038 (x+t)} , $$
$$ \mu_{12}(t) = 0.0004 + 0.0000034674 \cdot 10^{0.06 (x+t)} , $$
$$ \mu_{21}(t) = 0.005 . $$

We extend the model by assuming that the force of interest may assume three values, $r_1 = \ln(1.00) = 0$ (low, in fact no interest), $r_2 = \ln(1.045) = 0.04402$ (medium), and $r_3 = \ln(1.09) = 0.08618$ (high), and that the transitions between these states are governed by a Markov chain with infinitesimal matrix of the form

$$ \Lambda = \lambda \begin{pmatrix} -1 & 1 & 0 \\ 0.5 & -1 & 0.5 \\ 0 & 1 & -1 \end{pmatrix} . \tag{5.3} $$

The scalar $\lambda$ can be interpreted as the expected number of transitions per time unit and is thus a measure of interest volatility.
Table 5.1 displays the first three central moments of the present value at time 0 for the following case, henceforth referred to as the combined policy for short: the age at entry is $x = 30$, the term of the policy is $n = 30$, the benefits are a life assurance with sum 1 ($= b_{13} = b_{23}$) and a disability annuity with level intensity 0.5 ($= b_2$), and premiums are payable in active state continuously at level rate $\pi$ ($= -b_1$), which is taken to be the net premium rate in state (2,1) (i.e. the rate that establishes expected balance between discounted premiums and benefits when the insured is active and the interest is at medium level at time 0).
The first three rows in the body of the table form a benchmark; $\lambda = 0$ means no interest fluctuation, and we therefore obtain the results for three cases of fixed interest. It is seen that the second and third order moments of the present value are strongly dependent on the (fixed) force of interest and, in fact, their absolute values decrease when the force of interest increases (as could be expected since increasing interest means decreasing discount factors and, hence, decreasing present values of future amounts).
It is seen that, as $\lambda$ increases, the differences across the three pairs of columns get smaller and in the end they vanish completely. The obvious interpretation is that the initial interest level is of little importance if the interest changes rapidly.
The overall impression from the two central columns corresponding to medium interest is that, as $\lambda$ increases from 0, the variance of the present value will first increase to a maximum and then decrease again and stabilize. This observation supports the following piece of intuition: the introduction of moderate interest fluctuation adds uncertainty to the final result of the contract, but if the interest changes sufficiently rapidly, it will behave like fixed interest at the mean level. Presumably, the values of the net premium in the second column reflect the same effect.
5.3 Complement on Markov chains

A. Time-continuous Markov chains.

Let $X = \{X(t)\}_{t \geq 0}$ be a time-continuous Markov chain on the finite state space $\mathcal{J} = \{1, \dots, J\}$. Denote by $P(t, u)$ the $J \times J$ matrix whose $j,k$-element is the transition probability $p_{jk}(t, u) = P[X(u) = k \mid X(t) = j]$. The Markov property

Table 5.1: Central moments $m_{ej}^{(q)}(0)$ of orders $q = 1, 2, 3$ of the present value of future benefits less premiums for the combined policy in interest state $e$ and policy state $j$ at time 0, for some different values of the rate of interest changes, $\lambda$. Second column gives the net premium $\pi$ of a policy starting from interest state 2 (medium) and policy state 1 (active).

lambda   pi      q   e,j:  1,1     1,2     2,1     2,2     3,1     3,2

0        .0131   1         0.15    13.39   0.00    7.65    0.39    5.03
                 2         2.55    12.50   0.49    2.70    0.13    0.80
                 3         20.45   99.02   2.11    12.12   0.37    2.38

.05      .0137   1         0.06    11.31   0.00    7.90    0.03    5.78
                 2         1.61    12.26   0.62    5.41    0.25    2.43
                 3         11.94   42.87   3.20    4.33    0.94    0.08

.5       .0134   1         0.02    8.43    0.00    7.81    0.02    7.24
                 2         0.65    4.90    0.55    4.15    0.46    3.52
                 3         3.34    13.35   2.59    10.13   2.02    7.74

         .0132   1         0.00    7.77    0.00    7.70    0.00    7.64
                 2         0.51    2.86    0.50    2.91    0.49    2.86
                 3         2.26    12.51   2.20    12.19   2.14    11.88

         .0132   1         0.00    7.69    0.00    7.69    0.00    7.69
                 2         0.50    2.74    0.50    2.74    0.50    2.74
                 3         2.15    12.37   2.15    12.37   2.15    12.37

implies the Chapman-Kolmogorov equation

$$ P(s, u) = P(s, t)\, P(t, u) , \tag{5.1} $$

valid for $0 \leq s \leq t \leq u$. In particular

$$ P(t, t) = I^{J \times J} , \tag{5.2} $$

the $J \times J$ identity matrix. The intensity of transition from state $j$ to state $k$ ($\neq j$) at time $t$ is defined as $\mu_{jk}(t) = \lim_{dt \downarrow 0} p_{jk}(t, t+dt)/dt$ or, equivalently, by

$$ p_{jk}(t, t+dt) = \mu_{jk}(t)\,dt + o(dt) , \tag{5.3} $$

when the limit exists. Then, obviously,

$$ p_{jj}(t, t+dt) = 1 - \mu_{j\cdot}(t)\,dt + o(dt) , \tag{5.4} $$

where $\mu_{j\cdot}(t) = \sum_{k;k\neq j} \mu_{jk}(t)$ can appropriately be termed the total intensity of transition out of state $j$ at time $t$. The infinitesimal matrix $M(t)$ is the $J \times J$ matrix with $\mu_{jk}(t)$ in row $j$ and column $k$, defining $\mu_{jj}(t) = -\mu_{j\cdot}(t)$. With this notation (5.3)-(5.4) can be assembled in

$$ P(t, t+dt) = I + M(t)\,dt . \tag{5.5} $$

The probabilities determine the intensities. Conversely, the probabilities are determined by the intensities through Kolmogorov's differential equations, which are readily obtained upon combining (5.1) and (5.5). There is a forward equation,

$$ \frac{\partial}{\partial t} P(s, t) = P(s, t)\, M(t) , \tag{5.6} $$

and a backward equation,

$$ \frac{\partial}{\partial t} P(t, u) = -M(t)\, P(t, u) , \tag{5.7} $$

each of which determines $P(t, u)$ when combined with the condition (5.2).
B. Stationary Markov chains.
When $M(t) = M$, a constant, then (as is obvious from the Kolmogorov equations) $P(s, t) = P(0, t - s)$ depends on $s$ and $t$ only through $t - s$. In this case we write $P(t) = P(0, t)$, allowing a slight abuse of notation. The equations (5.6)-(5.7) now reduce to

$$ \frac{d}{dt} P(t) = P(t)\, M = M\, P(t) . \tag{5.8} $$

The limit $\Pi = \lim_{t \to \infty} P(t)$ exists, and the $j$-th row of $\Pi$ is the limiting (stationary) distribution of the state of the process, given that it starts from state $j$. We shall assume throughout that all states communicate with each other. Then the stationary distribution $\pi' = (\pi_1, \dots, \pi_J)$, say, is independent of the initial state, and so

$$ \Pi = 1^{J \times 1}\, \pi' , \tag{5.9} $$

where $1^{J \times 1}$ is the $J$-dimensional column vector with all entries equal to 1. Letting $t \to \infty$ in (5.8) and using (5.9), we get $1^{J \times 1} \pi' M = M\, 1^{J \times 1} \pi' = 0^{J \times J}$ (a matrix of the indicated dimension with all elements equal to 0), that is,

$$ \pi' M = 0^{1 \times J} , \quad M\, 1^{J \times 1} = 0^{J \times 1} . \tag{5.10} $$

Thus, 0 is an eigenvalue of $M$, and $\pi'$ and $1^{J \times 1}$ are corresponding left and right eigenvectors, respectively.
From Paragraph 4.3 of [17] we gather the following useful representation result. Let $\rho_j$, $j = 1, \dots, J$, be the eigenvalues of $M$ and, for each $j$, let $\psi_j'$ and $\phi_j$ be the corresponding left and right eigenvectors, respectively. Let $\Phi$ be the $J \times J$ matrix whose $j$-th column is $\phi_j$. Then the $j$-th row of $\Phi^{-1}$ is just $\psi_j'$, and introducing $R(t) = \mathrm{diag}(e^{\rho_j t})$, the transition matrix $P(t)$ can be expressed as

$$ P(t) = \Phi\, R(t)\, \Phi^{-1} = \sum_{j=1}^{J} e^{\rho_j t}\, \phi_j\, \psi_j' , \tag{5.11} $$

which is computationally convenient. We can take $\rho_1$ to be 0 and $\phi_1 = 1^{J \times 1}$. Then $\psi_1' = \pi'$, and we obtain

$$ P(t) = 1^{J \times 1}\, \pi' + \sum_{j=2}^{J} e^{\rho_j t}\, \phi_j\, \psi_j' . \tag{5.12} $$

All the $\rho_j$, $j = 2, \dots, J$, are strictly negative, and so the representation shows that the transition probabilities converge exponentially to the stationary distribution.
At this point we need to make precise that in (5.8) the $\frac{d}{dt}$ is to be thought of as an operator, to be distinguished from the matrix $P^{(1)}(t)$ of derivatives it produces when applied to $P(t)$. Now, for $\lambda > 0$, define

$$ P_\lambda(t) = P(\lambda t) . \tag{5.13} $$

Upon differentiating this relationship and using (5.8), we obtain

$$ \frac{d}{dt} P_\lambda(t) = \frac{d}{dt} P(\lambda t) = \lambda\, P^{(1)}(\lambda t) = P_\lambda(t)\, \lambda M , $$

which shows that $P_\lambda(t)$, which is certainly a matrix of transition probabilities, has infinitesimal matrix

$$ M_\lambda = \lambda M . \tag{5.14} $$

Thus, doubling (say) the intensities of transition affects the transition probabilities the same way as a doubling of the time period.
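The stationary limit (5.9) and the time scaling property (5.13)-(5.14) can be checked numerically. The sketch below is an illustration only: it solves the forward equation in (5.8) with a classical Runge-Kutta scheme in plain Python instead of using the eigenvalue representation (5.11), the helper names are hypothetical, and the matrix is the one from (5.3) with the scalar factor set to 1.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def axpy(A, B, a):
    """Entrywise A + a * B."""
    return [[x + a * y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def transition_matrix(M, t, steps=2000):
    """Solve dP/dt = P M with P(0) = I by fourth order Runge-Kutta."""
    J = len(M)
    P = [[1.0 if i == j else 0.0 for j in range(J)] for i in range(J)]
    h = t / steps
    for _ in range(steps):
        k1 = matmul(P, M)
        k2 = matmul(axpy(P, k1, h / 2), M)
        k3 = matmul(axpy(P, k2, h / 2), M)
        k4 = matmul(axpy(P, k3, h), M)
        P = [[P[i][j] + h / 6 * (k1[i][j] + 2 * k2[i][j] + 2 * k3[i][j] + k4[i][j])
              for j in range(J)] for i in range(J)]
    return P

base = [[-1.0, 1.0, 0.0], [0.5, -1.0, 0.5], [0.0, 1.0, -1.0]]
lam = 2.0
M_lam = [[lam * x for x in row] for row in base]

P_scaled = transition_matrix(M_lam, 1.5)          # intensities doubled, time 1.5
P_stretched = transition_matrix(base, lam * 1.5)  # original intensities, time 3
P_long = transition_matrix(base, 40.0)            # close to the stationary limit
```

For this matrix the stationary distribution solving $\pi' M = 0$ is $(0.25, 0.5, 0.25)$, which every row of `P_long` should approach.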

Chapter 6

6.1 General considerations

A. Background. The basic paradigm in life insurance mathematics is the principle of equivalence, which lays down that the expected present value of total benefits less premiums in respect of an individual insurance policy should equal 0 at the time of inception of the contract. The rationale of the principle is, roughly speaking, that the law of large numbers will make outgoes and incomes balance on the average in a large insurance portfolio. An implicit assumption underlying this consideration is that the experience basis, that is, the factual transition intensities, interest, and administration costs throughout the contract period, is known at the time of issue. In reality, however, the experience basis may undergo significant and unforeseeable changes within the time horizon of the contract, thus exposing the insurer to a risk that is indiversifiable, that is, cannot be eliminated or mitigated by increasing the size of the portfolio.
The risk stemming from the uncertain development of the interest rate can, under certain ideal market conditions, be eliminated by letting the contractual payments depend on the returns on the company's investments. Products of this type, known as unit-linked insurances, have been gaining increasing market shares ever since they emerged some few decades ago, and today they are also theoretically well understood, see [2], [20], and references therein.
Unlike the unit-linked concept, a standard life insurance policy specifies contractual payments in nominal amounts, binding to both parties throughout the entire term of the contract. Thus, an adverse development of the experience basis cannot be countered by raising premiums or reducing benefits and also not by cancelling the contract (the right of withdrawal remains one-sidedly with the insured). The only way the insurer can prevent the indiversifiable risk is to charge premiums to the safe side. In practice this is done by calculating premiums on a conservative so-called technical basis or first order basis, which represents a provisional worst-case scenario for the future development of the experience basis. In the course of the contract period the insurer currently observes the experience basis (or rather assesses it by what is called the second order basis). Upon identifying reserves of first and second order, with the latter incorporating explicit safety contributions, one obtains an expression for the current contributions to the technical surplus showing how it emerges from safety margins in the individual technical elements. By statute the technical surplus belongs to the insured and is to be redistributed as bonus, which may be paid out either currently (cash bonus) or as a lump sum upon expiry of the policy (terminal bonus), or may be used as single premiums for current purchases of additional benefits. A key reference on this classical technique is the textbook by [3], which describes the basic principles in the context of a single-life policy. [27] carried the concept over to the multi-state policy.
The present theory is developed in [24] and [25].

B. Sketch of the usual technique. The approach commonly used in practice is the following. At the outset the contractual benefits are valuated, and the premium is set accordingly, on a first order (technical) basis, which is a set of hypothetical assumptions about interest, intensities of transition between policy-states, costs, and possibly other relevant technical elements. The first order model is a means of prudent calculation of premiums and reserves, and its elements are therefore placed to the safe side in a sense that will be made precise later. As time passes reality reveals true elements that ultimately set the realistic scenario for the entire term of the policy and constitute what is called the second order (experience) basis. Upon comparing elements of first and second order, one can identify the safety loadings built into those of first order and design schemes for repayment of the systematic surplus they have created. We will now make these things precise.

6.2 First and second order bases

A. The second order model. The policy-state process $Z$ is assumed to be a time-continuous Markov chain as described in Section 4.2. In the present context we need to equip the indicator processes and counting processes related to the process $Z$ with a topscript, calling them $I_j^Z$ and $N_{jk}^Z$. The probability measure and expectation operator induced by the transition intensities are denoted by $P$ and $E$, respectively.
The history of the policy up to and including time $t$ is represented by the sigma-algebra $\mathcal{H}_t = \sigma\{Z(\tau);\ \tau \in [0, t]\}$. The development of the policy is given by the filtration (increasing family of sigma-algebras) $\mathbf{H} = \{\mathcal{H}_t\}_{t \in [0,n]}$.
We recall that the compensated counting processes $M_{jk}$, $j \neq k$, defined by

$$ dM_{jk}(t) = dN_{jk}^Z(t) - I_j^Z(t)\, \mu_{jk}(t)\,dt , $$

are zero mean $\mathbf{H}$-martingales.
The investment portfolio of the insurance company bears interest with intensity $r(t)$ at time $t$.
The intensities $r$ and $\mu_{jk}$ constitute the experience basis, also called the second order basis, representing the true mechanisms governing the insurance business. At any time its past history is known, whereas its future is unknown. We extend the set-up by viewing the second order basis as stochastic, whereby the uncertainty associated with it becomes quantifiable in probabilistic terms. In particular, prediction of its future development becomes a matter of model-based forecasting. Thus, let us consider the set-up above as the conditional model, given the second order basis, and place a distribution on the latter, whereby $r$ and the $\mu_{jk}$ become stochastic processes. Let $\mathcal{G}_t$ denote their complete history up to, and including, time $t$ and, accordingly, let $E[\,\cdot \mid \mathcal{G}_t]$ denote conditional expectation, given this information.
For the time being we will work only in the conditional model and need not specify any particular marginal distribution of the second order elements.
B. The first order model. We let the first order model be of the same type as the conditional model of second order. Thus, the first order basis is viewed as deterministic, and we denote its elements by $r^*$ and $\mu_{jk}^*$ and the corresponding probability measure and expectation operator by $P^*$ and $E^*$, respectively. The first order basis represents a prudent initial assessment of the development of the second order basis, and its elements are placed on the safe side in a sense that will be made precise later.
By statute, the insurer must currently provide a reserve to meet future liabilities in respect of the contract, and these liabilities are to be valuated on the first order basis. The first order reserve at time $t$, given that the policy is then in state $j$, is

$$ V_j^*(t) = E^*\left[ \int_t^n e^{-\int_t^\tau r^*}\, dB(\tau) \,\Big|\, Z(t) = j \right] = \int_t^n e^{-\int_t^\tau r^*} \sum_g p_{jg}^*(t, \tau) \Big( dB_g(\tau) + \sum_{h;h\neq g} b_{gh}(\tau)\, \mu_{gh}^*(\tau)\, d\tau \Big) . \tag{6.1} $$

We need Thiele's differential equations

$$ dV_j^*(t) = r^*(t)\, V_j^*(t)\,dt - dB_j(t) - \sum_{k;k\neq j} R_{jk}^*(t)\, \mu_{jk}^*(t)\,dt , \tag{6.2} $$

where

$$ R_{jk}^*(t) = b_{jk}(t) + V_k^*(t) - V_j^*(t) \tag{6.3} $$

is the sum at risk associated with a possible transition from state $j$ to state $k$ at time $t$.
The premiums are based on the principle of equivalence exercised on the first order valuation basis,

$$ E^*\left[ \int_0^n e^{-\int_0^\tau r^*}\, dB(\tau) \right] = 0 , \tag{6.4} $$

or, equivalently,

$$ V_0^*(0) = -B_0(0) . \tag{6.5} $$

6.3 The technical surplus and how it emerges

A. Denition of the surplus. With premiums determined by the principle

of equivalence (6.4) based on prudent rst order assumptions, the portfolio will
create a systematic technical surplus. To see how it emerges, we work at the
level of the individual policy and dene its individual surplus at time t as
 t R
t
S(t) =
e r d(B)( ) V (t)
0

= e

Rt
0

R
0

dB( )

IjZ (t)Vj (t)

which is past net income (premiums less benets), compounded with the factual
second order interest, minus expected discounted future liabilities valuated on
the conservative rst order basis. This denition complies with practical accountancy regulations in insurance since S(t) is precisely the dierence between
the current cash balance and the rst order reserve that by statute has to be
provided to meet future liabilities. We interpose that the integral in the rst
term on the right of (6.6) is well dened path by path since it involves only
processes of bounded variation.
Carrying on, we first note that

$$
S(0) = 0 ,
\tag{6.6}
$$

$$
S(n) = -\int_{0-}^n e^{\int_\tau^n r}\,dB(\tau) ,
\tag{6.7}
$$

as it ought to be. Differentiating (6.6) gives

$$
\begin{aligned}
dS(t) &= -e^{\int_0^t r}\,r(t)\,dt \int_{0-}^t e^{-\int_0^\tau r}\,dB(\tau) - dB(t) - dV^*_{Z(t)}(t) \\
&= r(t)\,dt\,S(t) + r(t)\,dt \sum_j I_j^Z(t)\,V_j^*(t) - dB(t) - d\sum_j I_j^Z(t)\,V_j^*(t) .
\end{aligned}
$$

Upon substituting $dB(t)$ from (4.31), using the general Itô formula to write

$$
d\sum_j I_j^Z(t)\,V_j^*(t) = \sum_j I_j^Z(t)\,dV_j^*(t) + \sum_{j\neq k} \{V_k^*(t) - V_j^*(t)\}\,dN^Z_{jk}(t)
$$

and picking $dV_j^*(t)$ from (6.2), we find

$$
dS(t) = r(t)\,dt\,S(t) + dC(t) + dM(t) ,
$$

where

$$
dC(t) = \sum_j I_j^Z(t)\,c_j(t)\,dt ,
$$

with

$$
c_j(t) = \{r(t) - r^*(t)\}\,V_j^*(t) + \sum_{k;\,k\neq j} R^*_{jk}(t)\{\mu^*_{jk}(t) - \mu_{jk}(t)\} ,
$$

and

$$
dM(t) = -\sum_{j\neq k} R^*_{jk}(t)\,dM_{jk}(t) ,
$$

with the $M_{jk}$ defined in (6.1).

The process $M$ is a zero mean $\mathbf{H}$-martingale in the conditional model, given
$\mathcal{G}_n$, that is,
$$E[M(t) \mid \mathcal{H}_s \vee \mathcal{G}_n] = M(s)$$
for $s \le t$, and $M(0) = 0$. Then it is also a zero mean $\mathbf{F}$-martingale in the full
model since $E[M(t) \mid \mathcal{F}_s] = E[\,E[M(t) \mid \mathcal{H}_s \vee \mathcal{G}_n] \mid \mathcal{F}_s] = E[M(s) \mid \mathcal{F}_s] = M(s)$.
The term $dM(t)$ in (6.8) is the purely accidental part of the surplus increment.
The first two terms on the right of (6.8) are the systematic parts, which make the
surplus drift to something with expected value different from 0. The first term
is the earned interest on the surplus itself, and what remains is quite naturally
the policy-holder's contribution to the technical surplus.
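To make the two sources of contributions concrete, here is a minimal sketch that evaluates the contribution rate $c_j(t) = \{r - r^*\}V_j^* + \sum_k R^*_{jk}\{\mu^*_{jk} - \mu_{jk}\}$ at one fixed time point. The function name and all numbers are hypothetical and chosen only for illustration.

```python
def contribution_rate(r, r_star, reserve, sums_at_risk, mu_star, mu):
    """Contribution rate c_j(t) for a fixed state j and a fixed time t.

    sums_at_risk, mu_star and mu are dicts keyed by the destination state k;
    reserve is the first order reserve V_j*(t).
    """
    interest_part = (r - r_star) * reserve
    risk_part = sum(sums_at_risk[k] * (mu_star[k] - mu[k]) for k in sums_at_risk)
    return interest_part + risk_part


# Hypothetical term insurance: small reserve, positive sum at risk.
c_term = contribution_rate(
    r=0.055, r_star=0.044, reserve=0.02,
    sums_at_risk={"d": 0.98}, mu_star={"d": 0.004}, mu={"d": 0.003},
)

# Hypothetical pure endowment: large reserve, negative sum at risk, so a
# second order mortality below the first order one reduces the contribution.
c_endowment = contribution_rate(
    r=0.055, r_star=0.044, reserve=0.5,
    sums_at_risk={"d": -0.5}, mu_star={"d": 0.004}, mu={"d": 0.003},
)
print(c_term, c_endowment)
```

In the first case the mortality margin dominates, in the second the interest margin does and the mortality term is negative, which is the effect of the sign of the sum at risk.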
To put it another way, let us switch the first term on the right of (6.8) over
to the left and multiply the equation with $e^{-\int_0^t r}$ to form a complete differential
on the left hand side. Integrating from 0 to $t$ and using the fact that $S(0) = C(0) = M(0) = 0$, we arrive at

$$
e^{-\int_0^t r}\,S(t) = \int_0^t e^{-\int_0^\tau r}\,dC(\tau) + \int_0^t e^{-\int_0^\tau r}\,dM(\tau) ,
$$

showing that the discounted surplus at time $t$ is the discounted total contributions plus a martingale representing noise.

B. Safety margins. The expression on the right of (6.8) displays how the
contributions arise from safety margins in the first order force of interest (the
first term) and in the transition intensities (the second term). The purpose
of the first order basis is to create a non-negative technical surplus. This is
certainly fulfilled if

$$
r(t) \ge r^*(t)
\tag{6.8}
$$

(assuming that all $V_j^*(t)$ are non-negative as they should be) and

$$
\operatorname{sign}\{\mu^*_{jk}(t) - \mu_{jk}(t)\} = \operatorname{sign} R^*_{jk}(t) .
\tag{6.9}
$$

6.4 Dividends and bonus

A. The dividend process. Legislation lays down that the technical surplus
belongs to the insured and has to be repaid in its entirety. Therefore, to the
contractual payments $B$ there must be added dividends, henceforth denoted
by $D$. The dividends are currently adapted to the development of the second
order basis and, as explained in Paragraph 6.1.A, they cannot be negative.
The purpose of the dividends is to establish, ultimately, equivalence on the true
second order basis:

$$
E\left[\int_{0-}^n e^{-\int_0^\tau r}\,d\{B + D\}(\tau) \,\Big|\, \mathcal{G}_n\right] = 0 .
\tag{6.10}
$$

We can state (6.10) equivalently as

$$
E\left[\int_{0-}^n e^{\int_\tau^n r}\,d\{B + D\}(\tau) \,\Big|\, \mathcal{G}_n\right] = 0 .
\tag{6.11}
$$

The value at time $t$ of past individual contributions less dividends, compounded with interest, is

$$
U^d(t) = \int_0^t e^{\int_\tau^t r}\,d\{C - D\}(\tau) .
\tag{6.12}
$$

This amount is an outstanding account of the insured against the insurer, and
we shall call it the dividend reserve at time $t$.
By virtue of (6.7) we can recast the equivalence requirement (6.11) in the
appealing form

$$
E[U^d(n) \mid \mathcal{G}_n] = 0 .
\tag{6.13}
$$

From a solvency point of view it would make sense to strengthen (6.13) by
requiring that compounded dividends must never exceed compounded contributions:

$$
E[U^d(t) \mid \mathcal{G}_t] \ge 0 ,
\tag{6.14}
$$

$t \in [0, n]$. At this point some explanation is in order. Although the ultimate
balance requirement is enforced by law, the dividends do not represent a contractual obligation on the part of the insurer; the dividends must be adapted
to the second order development up to time $n$ and can, therefore, not be stipulated in the terms of the contract at time 0. On the other hand, at any time,
dividends allotted in the past have irrevocably been credited to the insured's
account. These regulatory facts are reflected in (6.14).
If we adopt the view that the technical surplus belongs to those who created
it, we should sharpen (6.13) by imposing the stronger requirement

$$
U^d(n) = 0 .
\tag{6.15}
$$

This means that no transfer of redistributions across policies is allowed. The
solvency requirement conforming with this point of view, and sharpening (6.14),
is

$$
U^d(t) \ge 0 ,
\tag{6.16}
$$

$t \in [0, n]$.
The constraints imposed on D in this paragraph are of a general nature and
leave a certain latitude for various designs of dividend schemes. We shall list
some possibilities motivated by practice.
B. Special dividend schemes. The so-called contribution scheme is defined
by $D = C$, that is, all contributions are currently and immediately credited to
the account of the insured. No dividend reserve will accrue and, consequently,
the only instrument on the part of the insurer in case of adverse second order
experience is to cease crediting dividends. In some countries the contribution
principle is enforced by law. This means that insurers are compelled to operate
with minimal protection against adverse second order developments.
By terminal dividend is meant that all contributions are currently invested
and their compounded total is credited to the insured as a lump sum dividend
payment only upon the termination of the contract at some time $T$ after which
no more contributions are generated. Typically $T$ would be the time of transition
to an absorbing state (death or withdrawal), truncated at $n$. If compounding is
at second order rate of interest, then

$$
D(t) = 1[t \ge T] \int_0^T e^{\int_\tau^T r}\,dC(\tau) .
$$

Contribution dividends and terminal dividends represent opposite extremes
in the set of conceivable dividend schemes, which are countless. One class
of intermediate solutions are those that yield dividends only at certain times
$T_1 < \cdots < T_K \le n$, e.g. annually or at times of transition between certain
states. At each time $T_i$ the amount $\Delta D(T_i) = \int_{T_{i-1}}^{T_i} e^{\int_\tau^{T_i} r}\,dC(\tau)$ (with $T_0 = 0$)
is entered to the insured's credit.
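The compounding integrals in these schemes are easy to evaluate on a grid. The following sketch assumes, purely for illustration, that the contribution rate $c$ and the second order force of interest $r$ are piecewise constant on a time grid; the function name and the numbers are hypothetical.

```python
import math


def compounded_total(times, c_rates, r_rates, start, end):
    """Integral of exp(int_tau^end r) c(tau) d tau over (start, end].

    times is an increasing grid; c_rates[i] and r_rates[i] are treated as
    constant on [times[i], times[i + 1]).
    """
    total = 0.0
    for i in range(len(times) - 1):
        lo, hi = max(times[i], start), min(times[i + 1], end)
        if hi <= lo:
            continue
        r, c = r_rates[i], c_rates[i]
        # Roll the balance forward over [lo, hi] and add what accrues there.
        growth = math.exp(r * (hi - lo))
        accrued = c * (hi - lo) if r == 0 else c * (growth - 1.0) / r
        total = total * growth + accrued
    return total


grid = [float(i) for i in range(11)]   # years 0, 1, ..., 10
c = [0.01] * 10                        # hypothetical contribution rate
r = [0.05] * 10                        # hypothetical second order interest

# Terminal dividend credited at T = 10.
terminal = compounded_total(grid, c, r, 0.0, 10.0)

# Annual dividends credited at T_i = 1, ..., 10.
annual = [compounded_total(grid, c, r, float(i), float(i + 1)) for i in range(10)]
print(terminal, annual[0])
```

Rolling each annual dividend forward to time 10 at the same interest reproduces the terminal dividend, so the schemes differ in timing only, not in compounded value.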
C. Allocation of dividends; bonus. Once they have been allotted, dividends belong to the insured. They may, however, be disposed of in various ways
and need not be paid out currently as they fall due. The actual payouts of dividends are termed bonus in the sequel, and the corresponding payment function
is denoted by $B^b$.
The compounded value of credited dividends less paid bonuses at time $t$ is

$$
U^b(t) = \int_0^t e^{\int_\tau^t r}\,d\{D - B^b\}(\tau) .
\tag{6.17}
$$

This is a debt owed by the insurer to the insured, and we shall call it the bonus
reserve at time $t$. Bonuses may not be advanced, so $B^b$ must satisfy

$$
U^b(t) \ge 0
\tag{6.18}
$$

for all $t \in [0, n]$. In particular, since $D(0) = 0$, one has $B^b(0) = 0$. Moreover,
since all dividends must eventually be paid out, we must have

$$
U^b(n) = 0 .
\tag{6.19}
$$

We have introduced three notions of reserves that all appear on the debit
side of the insurer's balance sheet. First, the premium reserve $V^*$ is provided
to meet net outgoes in respect of future events; second, the dividend reserve $U^d$
is provided to settle the excess of past contributions over past dividends; third,
the bonus reserve $U^b$ is provided to settle the unpaid part of dividends credited
in the past. The premium reserve is of prospective type and is a predicted
amount, whereas the dividend and bonus reserves are of retrospective type and
are indeed known amounts summing up to

$$
U^d(t) + U^b(t) = \int_0^t e^{\int_\tau^t r}\,d\{C - B^b\}(\tau) ,
\tag{6.20}
$$

the compounded total of past contributions not yet paid back to the insured.
D. Some commonly used bonus schemes. The term cash bonus is, quite
naturally, used for the scheme $B^b = D$. Under this scheme the bonus reserve is
always null, of course.
By terminal bonus, also called reversionary bonus, is meant that all dividends, with accumulation of interest, are paid out as a lump sum upon the
termination of the contract at some time $T$, that is,

$$
B^b(t) = 1[t \ge T] \int_0^T e^{\int_\tau^T r}\,dD(\tau) .
$$

Here we could replace the integrator $D$ by $C$ since terminal bonus obviously
does not depend on the dividend scheme; all contributions are to be repaid with
accumulation of interest.
Assume now, what is common in practice, that dividends are currently used
to purchase additional insurance coverage of the same type as in the primary
policy. It seems natural to let the additional benefits be proportional to those
stipulated in the primary policy since they represent the desired profile of the
product. Thus, the dividends $dD(s)$ in any time interval $[s, s + ds)$ are used as
a single premium for an insurance with payment function of the form

$$
dQ(s)\{B^+(\tau) - B^+(s)\} ,
$$

$\tau \in (s, n]$, where the topscript "+" signifies, in an obvious sense, that only
positive payments (benefits) are counted.
Supposing that additional insurances are written on first order basis, the
proportionality factor $dQ(s)$ is determined by

$$
dD(s) = dQ(s)\,V^{*+}_{Z(s)}(s) ,
$$

where

$$
V^{*+}_{Z(s)}(s) = E^*\left[\int_s^n e^{-\int_s^\tau r^*}\,dB^+(\tau) \,\Big|\, Z(s)\right]
\tag{6.21}
$$

is the single premium at time $s$ for the future benefits under the policy.

Now the bonus payments $B^b$ are of the form

$$
dB^b(t) = Q(t)\,dB^+(t) .
\tag{6.22}
$$

Being written on first order basis, also the additional insurances create technical
surplus. The total contributions under this scheme develop as

$$
dC(t) + Q(t)\,dC^+(t) ,
\tag{6.23}
$$

where the first term stems from the primary policy and the second
term stems from the $Q(t)$ units of additional insurances purchased in the past,
each of which has payment function $B^+$ producing contributions $C^+$ of the form
$dC^+(t) = \sum_j I_j^Z(t)\,c_j^+(t)\,dt$, with

$$
c_j^+(t) = \{r(t) - r^*(t)\}\,V_j^{*+}(t) + \sum_{k;\,k\neq j} R^{*+}_{jk}(t)\{\mu^*_{jk}(t) - \mu_{jk}(t)\} ,
$$

$$
R^{*+}_{jk}(t) = b^+_{jk}(t) + V_k^{*+}(t) - V_j^{*+}(t) .
$$

The present situation is more involved than those encountered previously
since, not only are dividends driven by the contractual payments, but it is
also the other way around. To keep things relatively simple, suppose that the
contribution principle is adopted so that the dividends in (6.21) are set equal
to the contributions in (6.23). Then the system is governed by the dynamics

$$
dC(t) + Q(t)\,dC^+(t) = dQ(t)\,V^{*+}_{Z(t)}(t)
$$

or, realizing that $V^{*+}_{Z(t)}(t)$ is strictly positive whenever $dC(t)$ and $dC^+(t)$ are,

$$
dQ(t) - Q(t)\,dG(t) = dH(t) ,
\tag{6.24}
$$

where $G$ and $H$ are defined by

$$
dG(t) = \frac{1}{V^{*+}_{Z(t)}(t)}\,dC^+(t) ,
\tag{6.25}
$$

$$
dH(t) = \frac{1}{V^{*+}_{Z(t)}(t)}\,dC(t) .
\tag{6.26}
$$

Multiplying with $\exp(-G(t))$ to form a complete differential on the left and then
integrating from 0 to $t$, using $Q(0) = 0$, we obtain

$$
Q(t) = \int_0^t e^{G(t) - G(\tau)}\,dH(\tau) .
\tag{6.27}
$$
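The linear equation $dQ(t) - Q(t)\,dG(t) = dH(t)$ with $Q(0) = 0$ can be checked numerically against the explicit solution (6.27). The sketch below assumes, for illustration only, absolutely continuous $G$ and $H$ with rates $g$ and $h$; the constant rates and the function names are hypothetical.

```python
import math


def additional_units(g, h, t_end, steps=1000):
    """Integrate dQ(t) = Q(t) g(t) dt + h(t) dt from Q(0) = 0 with classical RK4."""
    def f(t, q):
        return g(t) * q + h(t)

    dt = t_end / steps
    t, q = 0.0, 0.0
    for _ in range(steps):
        k1 = f(t, q)
        k2 = f(t + dt / 2, q + dt / 2 * k1)
        k3 = f(t + dt / 2, q + dt / 2 * k2)
        k4 = f(t + dt, q + dt * k3)
        q += dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += dt
    return q


def additional_units_closed_form(g_const, h_const, t_end):
    """(6.27) with constant rates: Q(t) = h (exp(g t) - 1) / g."""
    return h_const * (math.exp(g_const * t_end) - 1.0) / g_const


q_numeric = additional_units(lambda t: 0.02, lambda t: 0.01, 30.0)
q_exact = additional_units_closed_form(0.02, 0.01, 30.0)
print(q_numeric, q_exact)
```

With state-dependent rates the same integrator applies along a given path of the policy and the environment; only the closed form is special to constant rates.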

6.5 Bonus prognoses

A. A Markov chain environment. We shall adopt a simple Markov chain
description of the uncertainty associated with the development of the second
order basis. Let $Y(t)$, $0 \le t \le n$, be a time-continuous Markov chain with
finite state space $\mathcal{Y} = \{1, \ldots, q\}$ and constant intensities of transition, $\lambda_{ef}$.
Denote the associated indicator processes by $I_e^Y$. The process $Y$ represents the
economic-demographic environment, and we let the second order elements
depend on the current $Y$-state:

$$
r(t) = \sum_e I_e^Y(t)\,r_e = r_{Y(t)} ,
$$

$$
\mu_{jk}(t) = \sum_e I_e^Y(t)\,\mu_{e;jk}(t) = \mu_{Y(t);jk}(t) .
$$

The $r_e$ are constants and the $\mu_{e;jk}(t)$ are intensity functions, all deterministic.
With this specification of the full two-stage model it is realized that the pair
$X = (Y, Z)$ is a Markov chain on the state space $\mathcal{X} = \mathcal{Y} \times \mathcal{Z}$, and its intensities
of transition, which we denote by $\kappa_{ej,fk}(t)$ for $(e, j), (f, k) \in \mathcal{X}$, $(e, j) \neq (f, k)$,
are

$$
\kappa_{ej,fj}(t) = \lambda_{ef} , \quad e \neq f ,
\tag{6.28}
$$

$$
\kappa_{ej,ek}(t) = \mu_{e;jk}(t) , \quad j \neq k ,
\tag{6.29}
$$

and null for all other transitions.
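The product structure of the intensities is simple to code. The following sketch builds the non-null intensities of $X = (Y, Z)$ at a given time from the environment intensities and the state-dependent policy intensities; the function name, the data layout and the numbers are assumptions made for illustration.

```python
from itertools import product


def product_chain_intensities(env_states, policy_states, lam, mu, t):
    """Return {((e, j), (f, k)): intensity} for the chain X = (Y, Z) at time t.

    lam[(e, f)] are the constant environment intensities; mu[e][(j, k)] is a
    function of t giving the policy intensity in environment state e.
    """
    intensities = {}
    for e, j in product(env_states, policy_states):
        for f in env_states:
            if f != e and lam.get((e, f), 0.0) > 0.0:
                # Environment jumps while the policy state stays put.
                intensities[((e, j), (f, j))] = lam[(e, f)]
        for k in policy_states:
            if k != j and (j, k) in mu[e]:
                # Policy jumps while the environment state stays put.
                intensities[((e, j), (e, k))] = mu[e][(j, k)](t)
    return intensities


# Hypothetical two-state environment (b, g) and policy states alive/dead.
lam = {("b", "g"): 0.1, ("g", "b"): 0.1}
mu = {
    "b": {("a", "d"): lambda t: 0.004},
    "g": {("a", "d"): lambda t: 0.003},
}
kappa = product_chain_intensities(["b", "g"], ["a", "d"], lam, mu, t=0.0)
for transition, value in sorted(kappa.items()):
    print(transition, value)
```

Simultaneous jumps of $Y$ and $Z$ never appear in the result, which is the content of "null for all other transitions".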

In this extended set-up the contributions, whose dependence on the second
order elements was not visualized earlier, can appropriately be represented as

$$
dC(t) = c(t)\,dt = \sum_{e,j} I_e^Y(t)\,I_j^Z(t)\,c_{ej}(t)\,dt ,
$$

where

$$
c_{ej}(t) = \{r_e - r^*(t)\}\,V_j^*(t) + \sum_{k;\,k\neq j} R^*_{jk}(t)\{\mu^*_{jk}(t) - \mu_{e;jk}(t)\} .
\tag{6.30}
$$

Under the scheme of additional benefits described in Paragraph 6.4.D a similar
convention goes for $C^+$ and $c^+$ and, accordingly, (6.25) and (6.26) become

$$
dG(t) = g(t)\,dt = \sum_{e,j} I_e^Y(t)\,I_j^Z(t)\,g_{ej}(t)\,dt ,
\tag{6.31}
$$

$$
g_{ej}(t) = \frac{c^+_{ej}(t)}{V_j^{*+}(t)} ,
\tag{6.32}
$$

$$
dH(t) = h(t)\,dt = \sum_{e,j} I_e^Y(t)\,I_j^Z(t)\,h_{ej}(t)\,dt ,
\tag{6.33}
$$

$$
h_{ej}(t) = \frac{c_{ej}(t)}{V_j^{*+}(t)} .
\tag{6.34}
$$

B. Preparatory remarks on the issue of bonus prognoses. There is no
single functional of the future bonus stream that presents itself as the relevant
quantity to prognosticate. One could e.g. take the total bonuses discounted by
some suitable inflation rate, or the undiscounted total bonuses, or the rate at
which bonus will be paid at certain times, and one could apply any of these possibilities to the random development of the policy or to some representative fixed
development. We shall focus on the expected value, and in the simplest cases
also higher order moments, of the future bonuses discounted by the stochastic
second order interest. From this we can easily deduce predictors for a number
of other relevant quantities. We turn now to the analysis of some of the schemes
described in Section 6.4.
C. Contribution dividends and cash bonus. This case, where $B^b = C = D$, is particularly simple since the bonus payments at any time depend only on
the current state of the process. We are interested in the state-wise expected discounted
future bonuses (= contributions),

$$
W_{ej}(t) = E\left[\int_t^n e^{-\int_t^\tau r}\,c(\tau)\,d\tau \,\Big|\, X(t) = (e, j)\right] .
$$

They are determined by the appropriate version of Thiele's differential equation,

$$
\frac{d}{dt} W_{ej}(t) = r_e W_{ej}(t) - c_{ej}(t) - \sum_{f;\,f\neq e} \lambda_{ef}\,\big(W_{fj}(t) - W_{ej}(t)\big) - \sum_{k;\,k\neq j} \mu_{e;jk}(t)\,\big(W_{ek}(t) - W_{ej}(t)\big) ,
\tag{6.35}
$$

subject to

$$
W_{ej}(n) = 0 , \quad \forall e, j .
\tag{6.36}
$$
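The system (6.35)-(6.36) can be solved together with the first order reserve by a standard backward scheme. The sketch below does this for a term insurance of 1 in an environment with two interest states and two mortality states. Everything specific in it is an assumption made for illustration: the Gompertz-Makeham mortality law, the premium rate, the factors 1.25 and 0.75, the switching intensities 0.1, the state labels and the function names.

```python
import math

R_STAR = math.log(1.045)                 # assumed first order interest
G_I, G_M, LAM_I, LAM_M = 1.25, 0.75, 0.1, 0.1
N, PREMIUM = 30.0, 0.0042608             # term insurance of 1, level premium
ENV = ["bb", "gb", "bg", "gg"]           # first letter: interest, second: mortality


def mu_star(t):
    """Assumed first order mortality intensity at age 30 + t."""
    return 0.0005 + 0.000075858 * 10.0 ** (0.038 * (30.0 + t))


def rate(e):
    return R_STAR * (G_I if e[0] == "g" else 1.0)


def mu(e, t):
    return mu_star(t) * (G_M if e[1] == "g" else 1.0)


def neighbours(e):
    """Environment states reachable from e, with their jump intensities."""
    flip = {"b": "g", "g": "b"}
    return [(flip[e[0]] + e[1], LAM_I), (e[0] + flip[e[1]], LAM_M)]


def rhs(t, y):
    """Derivatives of (V*, W_bb, W_gb, W_bg, W_gg) in the state 'alive'."""
    v = y[0]
    w = dict(zip(ENV, y[1:]))
    out = [R_STAR * v + PREMIUM - mu_star(t) * (1.0 - v)]        # (6.2)
    for e in ENV:
        c = (rate(e) - R_STAR) * v + (1.0 - v) * (mu_star(t) - mu(e, t))  # (6.30)
        dw = rate(e) * w[e] - c
        dw -= sum(lam * (w[f] - w[e]) for f, lam in neighbours(e))
        dw -= mu(e, t) * (0.0 - w[e])    # W vanishes in the dead state
        out.append(dw)
    return out


def solve_backwards(steps=3000):
    """Classical RK4 from t = n down to t = 0; returns W_e(0) for 'alive'."""
    h = N / steps
    t, y = N, [0.0] * 5                  # V*(n) = 0 and W(n) = 0, cf. (6.36)
    for _ in range(steps):
        k1 = rhs(t, y)
        k2 = rhs(t - h / 2, [a - h / 2 * b for a, b in zip(y, k1)])
        k3 = rhs(t - h / 2, [a - h / 2 * b for a, b in zip(y, k2)])
        k4 = rhs(t - h, [a - h * b for a, b in zip(y, k3)])
        y = [a - h / 6 * (p + 2 * q + 2 * r + s)
             for a, p, q, r, s in zip(y, k1, k2, k3, k4)]
        t -= h
    return dict(zip(ENV, y[1:]))


W0 = solve_backwards()
for e in ENV:
    print(e, round(W0[e], 5))
```

Carrying the reserve in the same state vector as the $W_{ej}$ avoids storing $V^*$ on a grid; the price is that the reserve is recomputed for every prognosis.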

D. Terminal dividend and/or bonus. Under the terminal bonus scheme
dividends and bonuses are the same, of course. The problem of predicting
the total bonus payments discounted with respect to second order interest is
basically the same as in the previous paragraph since it amounts to adding
the total amount of compounded past contributions, which is known, and the
state-wise predictor of discounted future contributions.
Suppose instead that at time $t$, the policy still being in force, it is decided
to predict the undiscounted value of the terminal bonus amount,

$$
W = \int_0^T e^{\int_\tau^T r}\,c(\tau)\,d\tau = \int_0^t e^{\int_\tau^t r}\,c(\tau)\,d\tau \; W'(t) + W''(t) ,
\tag{6.37}
$$

where

$$
W'(t) = e^{\int_t^T r} , \qquad W''(t) = \int_t^T e^{\int_\tau^T r}\,c(\tau)\,d\tau .
$$

We need the state-wise conditional expected values

$$
W'_e(t) = E[W'(t) \mid Y(t) = e] , \qquad W''_{ej}(t) = E[W''(t) \mid X(t) = (e, j)] ,
$$

to find the state-wise predictors of $W$ in (6.37),

$$
W_{ej}(t) = \int_0^t e^{\int_\tau^t r}\,c(\tau)\,d\tau \; W'_e(t) + W''_{ej}(t) .
$$

We shall find these functions by the backward construction, starting from

$$
W'(t) = e^{r(t)\,dt}\,W'(t + dt) , \qquad W''(t) = c(t)\,dt\,W'(t) + W''(t + dt) .
$$

Conditioning on what happens in the small time interval $(t, t + dt]$, we get

$$
W'_e(t) = e^{r_e\,dt}\Big\{ (1 - \lambda_{e\cdot}\,dt)\,W'_e(t + dt) + \sum_{f;\,f\neq e} \lambda_{ef}\,dt\,W'_f(t + dt) \Big\} ,
$$

and

$$
\begin{aligned}
W''_{ej}(t) ={}& c_{ej}(t)\,dt\,W'_e(t) + \big(1 - (\lambda_{e\cdot} + \mu_{e;j\cdot}(t))\,dt\big)\,W''_{ej}(t + dt) \\
&+ \sum_{f;\,f\neq e} \lambda_{ef}\,dt\,W''_{fj}(t + dt) + \sum_{k;\,k\neq j} \mu_{e;jk}(t)\,dt\,W''_{ek}(t + dt) ,
\end{aligned}
$$

where $\lambda_{e\cdot} = \sum_{f;\,f\neq e} \lambda_{ef}$ and $\mu_{e;j\cdot}(t) = \sum_{k;\,k\neq j} \mu_{e;jk}(t)$.
From these relationships we easily obtain the differential equations

$$
\frac{d}{dt} W'_e(t) = -r_e W'_e(t) - \sum_{f;\,f\neq e} \lambda_{ef}\,\big(W'_f(t) - W'_e(t)\big) ,
\tag{6.38}
$$

$$
\frac{d}{dt} W''_{ej}(t) = -c_{ej}(t)\,W'_e(t) - \sum_{f;\,f\neq e} \lambda_{ef}\,\big(W''_{fj}(t) - W''_{ej}(t)\big) - \sum_{k;\,k\neq j} \mu_{e;jk}(t)\,\big(W''_{ek}(t) - W''_{ej}(t)\big) ,
\tag{6.39}
$$

with side conditions

$$
W'_e(n) = 1 , \qquad W''_{ej}(n) = 0 , \quad \forall e, j .
\tag{6.40}
$$
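Equation (6.38) with the side condition $W'_e(n) = 1$ is a small linear system that can be integrated backwards directly. The sketch below is one possible implementation; the function name, the data layout and the numerical values are hypothetical.

```python
import math


def expected_accumulation(rates, lam, horizon, steps=2000):
    """Solve (6.38) backwards and return W'_e(0) for every environment state e.

    rates[e] is the force of interest in state e and lam[e][f] the constant
    intensity of transition from e to f (missing entries count as zero).
    """
    states = list(rates)

    def rhs(w):
        return {
            e: -rates[e] * w[e]
            - sum(lam[e].get(f, 0.0) * (w[f] - w[e]) for f in states if f != e)
            for e in states
        }

    h = horizon / steps
    w = {e: 1.0 for e in states}            # side condition W'_e(n) = 1
    for _ in range(steps):                  # classical RK4, stepping from n to 0
        k1 = rhs(w)
        k2 = rhs({e: w[e] - h / 2 * k1[e] for e in states})
        k3 = rhs({e: w[e] - h / 2 * k2[e] for e in states})
        k4 = rhs({e: w[e] - h * k3[e] for e in states})
        w = {e: w[e] - h / 6 * (k1[e] + 2 * k2[e] + 2 * k3[e] + k4[e])
             for e in states}
    return w


rates = {"b": 0.044, "g": 0.055}
frozen = expected_accumulation(rates, {"b": {}, "g": {}}, 30.0)
switching = expected_accumulation(rates, {"b": {"g": 0.1}, "g": {"b": 0.1}}, 30.0)
print(frozen["b"], switching["b"], switching["g"], frozen["g"])
```

Without switching the solution reduces to plain compounding at the initial rate, and with switching the values fall between the two frozen-rate extremes.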

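E. Additional benefits. Suppose we want to predict the total future bonuses
discounted with respect to second order interest,

$$
W(t) = \int_t^n e^{-\int_t^\tau r}\,Q(\tau)\,dB^+(\tau) ,
$$

with $Q$ defined by (6.27). Recalling (6.31)-(6.34), we reshape $W(t)$ as

$$
\begin{aligned}
W(t) &= \int_t^n e^{-\int_t^\tau r} \int_0^\tau e^{\int_s^\tau g}\,h(s)\,ds\;dB^+(\tau) \\
&= \int_t^n e^{-\int_t^\tau r} \Big\{ \int_0^t e^{\int_s^t g}\,h(s)\,ds\; e^{\int_t^\tau g} + \int_t^\tau e^{\int_s^\tau g}\,h(s)\,ds \Big\}\,dB^+(\tau) \\
&= \int_0^t e^{\int_s^t g}\,h(s)\,ds\; W'(t) + W''(t) ,
\end{aligned}
\tag{6.41}
$$

with

$$
W'(t) = \int_t^n e^{-\int_t^\tau (r - g)}\,dB^+(\tau) ,
$$

$$
W''(t) = \int_t^n e^{-\int_t^\tau r}\,W'(\tau)\,h(\tau)\,d\tau .
$$

We need the state-wise conditional expected values

$$
W'_{ej}(t) = E[W'(t) \mid X(t) = (e, j)] , \qquad W''_{ej}(t) = E[W''(t) \mid X(t) = (e, j)] ,
$$

in order to find the state-wise predictors of $W(t)$ in (6.41),

$$
W_{ej}(t) = \int_0^t e^{\int_s^t g}\,h(s)\,ds\; W'_{ej}(t) + W''_{ej}(t) .
$$

The backward construction starts from

$$
W'(t) = dB^+(t) + e^{-(r(t) - g(t))\,dt}\,W'(t + dt) ,
$$

$$
W''(t) = W'(t)\,h(t)\,dt + e^{-r(t)\,dt}\,W''(t + dt) ,
$$

from which we proceed in the same way as in the previous paragraph to obtain

$$
\begin{aligned}
dW'_{ej}(t) ={}& -dB_j^+(t) + (r_e - g_{ej}(t))\,dt\,W'_{ej}(t) - \sum_{f;\,f\neq e} \lambda_{ef}\,dt\,\big(W'_{fj}(t) - W'_{ej}(t)\big) \\
&- \sum_{k;\,k\neq j} \mu_{e;jk}(t)\,dt\,\big(b^+_{jk}(t) + W'_{ek}(t) - W'_{ej}(t)\big) ,
\end{aligned}
\tag{6.42}
$$

$$
\begin{aligned}
dW''_{ej}(t) ={}& -W'_{ej}(t)\,h_{ej}(t)\,dt + r_e\,dt\,W''_{ej}(t) - \sum_{f;\,f\neq e} \lambda_{ef}\,dt\,\big(W''_{fj}(t) - W''_{ej}(t)\big) \\
&- \sum_{k;\,k\neq j} \mu_{e;jk}(t)\,dt\,\big(W''_{ek}(t) - W''_{ej}(t)\big) .
\end{aligned}
\tag{6.43}
$$

The appropriate side conditions are

$$
W'_{ej}(n-) = \Delta B_j^+(n) , \qquad W''_{ej}(n-) = 0 , \quad \forall e, j .
\tag{6.44}
$$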

F. Predicting undiscounted amounts. If the undiscounted total contributions or additional benefits is what one wants to predict, one can just apply the
formulas with all $r_e$ replaced by 0.
G. Predicting bonuses for a given policy path. Yet another form of
prognosis would be to predict bonus payments for some possible fixed pursuits of
a policy instead of averaging over all possibilities. Such prognoses are obtained
from those described above upon keeping the realized path $Z(\tau)$ for $\tau \in [0, t]$,
where $t$ is the time of consideration, and putting $Z(\tau) = z(\tau)$ for $\tau \in (t, n]$,
where $z(\cdot)$ is some fixed path with $z(t) = Z(t)$. The relevant predictors then
become essentially functions only of the current $Y$-state and are simple special
cases of the results above.
As an example of an even simpler type of prognosis for a policy in state $j$
at time $t$, the insurer could present the expected bonus payment per time unit
at a future time $s$, given that the policy is then in state $i$, and do this for some
representative selections of $s$ and $i$. If $Y(t) = e$, then the relevant prediction is

$$
E[c_{Y(s)i}(s) \mid Y(t) = e] = \sum_f p^Y_{ef}(t, s)\,c_{fi}(s) .
$$

6.6 Examples

A. The case. For our purpose, which is to illustrate the role of the stochastic
environment in model-based prognoses, it suffices to consider simple insurance
products for which the relevant policy states are $\mathcal{Z} = \{a, d\}$ (alive and dead).
We will consider a single life insured at age 30 for a period of $n = 30$ years,
and let the first order elements be those of the Danish technical basis G82M for
males:

$$
r^* = \ln(1.045) , \qquad \mu^*(t) = 0.0005 + 0.000075858 \cdot 10^{0.038(30 + t)} .
$$

Three different forms of insurance benefits will be considered, and in each
case we assume that premiums are payable continuously at level rate as long
as the policy is in force. First, a term insurance (TI) of $1 = b_{ad}(t)$ with first
order premium rate $0.0042608 = -b_a(t)$. Second, a pure endowment (PE) of
$1 = \Delta B_a(30)$ with first order premium rate $0.0140690 = -b_a(t)$. Third, an
endowment insurance (EI), which is just the combination of the former two;
$1 = b_{ad}(t) = \Delta B_a(30)$, $0.0183298 = -b_a(t)$.
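The quoted premium rates can be checked against the principle of equivalence by direct numerical integration. The sketch below assumes the Gompertz-Makeham law commonly quoted for G82M and uses a composite Simpson rule; the parameter values of the mortality law, the helper names and the grid size are assumptions made for this illustration.

```python
import math

R_STAR = math.log(1.045)     # first order force of interest
A, C = 0.000075858, 0.038    # assumed Gompertz-Makeham parameters
AGE, N = 30.0, 30.0


def mu_star(t):
    """Assumed first order mortality intensity at age 30 + t."""
    return 0.0005 + A * 10.0 ** (C * (AGE + t))


def survival(t):
    """First order probability of surviving from age 30 to age 30 + t."""
    gompertz = A * (10.0 ** (C * (AGE + t)) - 10.0 ** (C * AGE)) / (C * math.log(10.0))
    return math.exp(-(0.0005 * t + gompertz))


def simpson(f, a, b, m=2000):
    """Composite Simpson rule with an even number m of subintervals."""
    h = (b - a) / m
    total = f(a) + f(b)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


# Expected present values on the first order basis.
annuity = simpson(lambda t: math.exp(-R_STAR * t) * survival(t), 0.0, N)
term = simpson(lambda t: math.exp(-R_STAR * t) * survival(t) * mu_star(t), 0.0, N)
pure_endowment = math.exp(-R_STAR * N) * survival(N)

# Level premium rates by equivalence: benefit value divided by annuity value.
premium_ti = term / annuity
premium_pe = pure_endowment / annuity
premium_ei = premium_ti + premium_pe
print(f"TI {premium_ti:.7f}  PE {premium_pe:.7f}  EI {premium_ei:.7f}")
```

The endowment premium is the sum of the other two because the endowment insurance is the combination of the term insurance and the pure endowment with premiums payable over the same period.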
Just as an illustration, let the second order model be the simple one where
interest and mortality are governed by independent time-continuous Markov
chains and, more specifically, that $r$ switches with a constant intensity $\lambda_i$ between the first order rate $r^*$ and a better rate $G_i r^*$ ($G_i > 1$) and, similarly, $\mu$
switches with a constant intensity $\lambda_m$ between the first order rate $\mu^*$ and a better rate $G_m \mu^*$ ($G_m < 1$). (We choose to express ourselves this way although (6.9)
shows that, for insurance forms with negative sum at risk, e.g. pure endowment
insurance, it is actually a higher second order mortality that is better in the
sense of creating positive contributions.)

[Figure 6.1: state diagram with the four "alive" states (bb,a): $r^*$, $\mu^*$; (gb,a): $G_i r^*$, $\mu^*$; (bg,a): $r^*$, $G_m \mu^*$; (gg,a): $G_i r^*$, $G_m \mu^*$; and the merged death state d. Interest switches occur with intensity $\lambda_i$, mortality switches with intensity $\lambda_m$, and death with intensity $\mu^*(t)$ or $G_m \mu^*(t)$.]

Figure 6.1: The Markov process $X = (Y, Z)$ for a single life insurance in an
environment with two interest states and two mortality states.

The situation fits into the framework of Paragraph 6.5.A; $Y$ has states $\mathcal{Y} = \{bb, gb, bg, gg\}$ representing all combinations of bad (b) and good (g) interest
and mortality, and the non-null intensities are

$$
\lambda_{bb,gb} = \lambda_{gb,bb} = \lambda_{bg,gg} = \lambda_{gg,bg} = \lambda_i ,
$$

$$
\lambda_{bb,bg} = \lambda_{bg,bb} = \lambda_{gb,gg} = \lambda_{gg,gb} = \lambda_m .
$$

The first order basis is just the worst-scenario bb.
Adopting the device (6.28)-(6.29), we consider the Markov chain $X = (Y, Z)$
with states (bb, a), (gb, a), etc. It is realized that all death states can be merged
into one, so it suffices to work with the simple Markov model with five states
sketched in Figure 6.1.

B. Results. We shall report some numerical results for the case where $G_i = 1.25$, $G_m = 0.75$, and $\lambda_i = \lambda_m = 0.1$. Prognoses are made at the time of issue
of the policy. Computations were performed by the fourth order Runge-Kutta
method, which turns out to work with high precision in the present class of
situations.
Table 6.1 displays, for each of the three policies, the state-wise expected
values of discounted contributions obtained by solving (6.35)-(6.36). We shall
be content here to point out two features: First, for the term insurance the
mortality margin is far more important than the interest margin, whereas for
the pure endowment it is the other way around (the latter has the larger reserve).
Note that the sum at risk is negative for the pure endowment, so that the first
order assumption of excess mortality is really not to the safe side, see (6.9).
Second, high interest produces large contributions, but, since high initial interest
also induces severe discounting, it is not necessarily true that good initial interest
will produce a high value of the expected discounted contributions, see the two
last entries in the row TI.
The latter remark suggests the use of a discounting function different from
the one based on the second order interest, e.g. some exogenous deflator reflecting the likely development of the price index or the discounting function
corresponding to first order interest. In particular, one can simply drop discounting and prognosticate the total amounts paid. We shall do this in the
following, noting that the expected values of bonuses discounted by second order
interest must in fact be the same for all bonus schemes, and are already shown
in Table 6.1.
Table 6.2 shows state-wise expected values of undiscounted bonuses for three
different schemes: contribution dividends and cash bonus (C, the same as total undiscounted contributions), terminal bonus (TB), and additional benefits
(AB).
We first note that, now, any improvement of initial second order conditions
helps to increase prospective contributions and bonuses.
Furthermore, expected bonuses are generally smaller for C than for TB and
AB since bonuses under C are paid earlier. Differences between TB and AB
must be due to a similar effect. Thus, we can infer that AB must on the average
fall due earlier than TB, except for the pure endowment policy, of course.
One might expect that the bonuses for the term insurance and the pure endowment policies add up to the bonuses for the combined endowment insurance
policy, as is the case for C and TB. However, for AB it is seen that the sum of
the bonuses for the two component policies is generally smaller than the bonuses
for the combined policy. The explanation must be that additional death benefits and additional survival benefits are not purchased in the same proportions
under the two policy strategies. The observed difference indicates that, on the
average, the additional benefits fall due later under the combined policy, which
therefore must have the smaller proportion of additional death benefits.
C. Assessment of prognostication error. Bonus prognoses based on the
present model may be equipped with quantitative measures of the prognostication error. By the technique of proof shown in Section 6.5 we may derive
differential equations for higher order moments of any of the predictands considered and calculate e.g. the coefficient of variation, the skewness, and the
kurtosis.

Table 6.1: Conditional expected present value at time 0 of total contributions
for term insurance policy (TI), pure endowment policy (PE), and endowment
insurance policy (EI), given initial second order states of interest and mortality
(b or g).

          bb        gb        bg        gg
TI :    .00851    .00854    .01061    .01059
PE :    .01613    .01823    .01595    .01807
EI :    .02463    .02677    .02656    .02865

6.7 Discussions

A. The principle of equivalence. This principle, as formulated in (6.4), is
basic in life insurance. The expected value represents averaging over a large
(really infinite) portfolio of policies, the philosophy being that, even if the individual policy creates a (possibly large) loss or gain, there will be balance on
the average between outgoes and incomes in the portfolio as a whole if the premiums are set by equivalence. The deviation from perfect balance, which is
inevitable in a finite world with finite portfolios, represents profit or loss on the
part of the insurer and has to be settled by an adjustment of the equity capital.
(The possibility of loss, about as likely and about as large as the possible profit,
might seem unacceptable to an industry that needs to attract investors, but it
should be kept in mind that salaries to employees and dividends to owners are
accounted as part of the expenses, which we have not discussed here.)
B. On the notion of second order basis. The definition of the second
order basis as the true one is slightly at variance with practical usage (which is
not uniform anyway). The various amendments made to our idealized definition
in practice are due to administrative and procedural bottlenecks: The factual
development of interest, mortality, etc. has to be verified by the insurer and
then approved by the supervisory authority. Since this cannot be a continuous
operation, any regulatory definition of the second order basis must to some
extent involve realistic, still typically conservative, short term forecasts of the
future development. However, our definition can certainly be agreed upon as
the intended one.

Table 6.2: Conditional expected value (E) of undiscounted total contributions
(C), terminal bonus (TB), and total additional benefits (AB) for term insurance
policy (TI), pure endowment policy (PE), and endowment insurance policy (EI),
given initial second order states of interest and mortality (b or g).

                   bb        gb        bg        gg
TI:   E  C :     .02153    .02222    .02436    .02505
      E  TB:     .03693    .03916    .04600    .04847
      E  AB:     .02949    .03096    .03545    .03706

PE:   E  C :     .04342    .04818    .04314    .04791
      E  TB:     .07337    .08687    .07264    .08615
      E  AB:     .07337    .08687    .07264    .08615

EI:   E  C :     .06495    .07040    .06750    .07296
      E  TB:     .11030    .12603    .11864    .13462
      E  AB:     .10723    .12199    .11501    .13003

C. Model deliberations. The Markov chain model is mathematically tractable
since state-wise expected values are determined by solving (in most cases simple) systems of first order ordinary differential equations. At the same time,
when equipped with a sufficiently rich state space and appropriate intensities
of transition, it is able to picture virtually any conceivable notion of the real
object of the model.
The Markov chain model is particularly apt to describe the development of
a life insurance policy since the paths of $Z$ are of the same kind as the true ones.
When used to describe the development of the second order basis, however,
the approximative nature of the Markov chain is obvious, and it will surface
immediately as e.g. the experienced force of interest takes values outside of the
finite set allowed by the model. This is not a serious objection, however, and
the next paragraph explains why.
D. The role of the stochastic environment model. A paramount concern is that of establishing equivalence conditional on the factual second order
history in the sense of (6.10). Now, in this conditional expectation the marginal
distribution of the second order elements does not appear and is, in this respect, irrelevant. Also the contributions and, hence, the dividends are functions
only of the realized experience basis and do not involve the distribution of its
elements.
Then, what remains the purpose of placing a distribution on the second
order elements is to form a basis for prognostication of bonus. Subsidiary as it
is, this role is still an important part of the play; although a prognosis does not
commit the insurer to pay the forecasted amounts, it should as much as possible
be a reliable piece of information to the insured. Therefore, the distribution
placed on the second order elements should set a reasonable scenario for the
course of events, but it need not be perfectly true. This is comforting since any
view of the mechanisms governing the economic-demographic development is
to some extent guess-work. When the accounts are eventually made up, every
speculative element must be absent, and that is precisely what the principle
(6.10) lays down.
E. A digression: Which is more important, interest or mortality?
Actuarial wisdom says it is interest. This is, of course, an empirical statement
based on the fact that, in the era of contemporary insurance, mortality rates
have been smaller and more stable than interest rates. Our model can add some
other kind of insight. We shall again be content with a simple illustration related
to the single life described in Section 6.6. Table 6.3 displays expected values and
standard deviations of the present values at time 0 of a term life insurance and
a life annuity under various scenarios with fixed interest and mortality, that is,
conditional on fixed $Y$-state throughout the term of the policy. The impact of
interest variation is seen by reading column-wise, and the impact of mortality
variation is seen by reading row-wise. The overall impression is that mortality
is the more important element for term insurance, whereas interest is (by
far) the more important for life annuity insurance.


Table 6.3: Expected value (E) and standard deviation (SD) of present values of
a term life insurance (TI) with sum 1 and a life annuity (LA) with level intensity
1 per year, with interest $r = G_i r^*$ and mortality $\mu = G_m \mu^*$ for various choices of
$G_i$ and $G_m$.

                           TI                            LA
G_m :             1.5      1.0      0.5        1.5      1.0      0.5

G_i : 0.5   E :  .14636   .10119   .05250     20.545   20.996   21.467
            SD:  .27902   .24104   .18041      3.691    3.101    2.257

      1.0   E :  .09927   .06834   .03531     15.750   16.039   16.340
            SD:  .20245   .17330   .12857      2.505    2.097    1.521

      1.5   E :  .06976   .04782   .02460     12.466   12.655   12.852
            SD:  .15858   .13437   .09868      1.759    1.468    1.061

Table 6.4: Conditional expected present value at time 0 of total contributions
for term insurance policy (TI), pure endowment policy (PE), and endowment
insurance policy (EI), given initial second order states of interest and mortality
(b or g).

          bb        gb        bg        gg
TI :    .00851    .00854    .01061    .01059
PE :    .01613    .01823    .01595    .01807
EI :    .02463    .02677    .02656    .02865

Table 6.5: Conditional expected value of undiscounted total contributions (C)
and terminal bonus (TB) for term insurance policy (TI), pure endowment policy
(PE), and endowment insurance policy (EI), given initial second order states of
interest and mortality (b or g).

                bb        gb        bg        gg
TI:   C :     .02153    .02222    .02436    .02505
      TB:     .03693    .03916    .04600    .04847

PE:   C :     .04342    .04818    .04314    .04791
      TB:     .07337    .08687    .07264    .08615

EI:   C :     .06495    .07040    .06750    .07296
      TB:     .11030    .12603    .11864    .13462

Chapter 7

Financial mathematics in
insurance
7.1

Finance in insurance

Finance was always an essential part of insurance. Trivially, one might say, because any business has to attend to its money affairs. However, insurance is not just any business; its products are not physical goods or services, but financial obligations released by certain random events. Furthermore, it is characteristic of insurance business that the products are typically paid in advance. This makes the insurance industry a major accumulator of capital in today's society, in particular long term business like pension funds. Consequently, the financial operations (investment strategy) of an insurance company may be as decisive of its revenues as its insurance operations (design of products, risk management, premium rating, procedures of claims assessment, and the pure randomness in the claims process). Accordingly, one speaks of assets risk or financial risk as well as of liability risk or insurance risk. We anticipate here that financial risk may well be the more severe; insurance risk due to random deviations from the expected claims result is diversifiable, that is, can be eliminated in a sufficiently large insurance portfolio (the laws of large numbers), whereas financial risk is held to be indiversifiable since the entire portfolio is affected by the development of the economy.
One may ask why insurance mathematics traditionally centers on measurement and control of the insurance risk. The answer may partly be found in institutional circumstances: The insurance industry used to be heavily regulated, solvency being the primary concern of the regulatory authority. Possible adverse developments of economic factors (e.g. inflation, weak returns on investment, low interest rates, etc.) would be safeguarded against by placing premiums on the safe side. The comfortable surpluses, which would typically accumulate under this regime, were redistributed as bonuses (dividends) to the policyholders only in arrears, after interest and other financial parameters had been observed. Furthermore, the insurance industry used to be separated from other forms of business and protected from competition within itself, and severe restrictions were placed on its investment operations. In these circumstances financial matters appeared to be something the traditional actuary did not need to worry about. Another reason why insurance mathematics used to be void of financial considerations was, of course, the absence of a well developed theory for description and control of financial risk.
All this has changed. National and institutional borders have been downsized or eliminated and regulations have been liberalized: Mergers between insurance companies and banks are now commonplace, and new insurance products are being created and put on the market virtually every day, by insurance companies and other financial institutions as well, and without prior licensing by the supervisory authority. The insurance companies of today find themselves placed on a fiercely competitive market. Many new products are directly linked to economic indices, like unit-linked life insurance and catastrophe derivatives. By so-called securitization, insurance risk too can be put on the market, which opens new possibilities of inviting investors from outside to participate in risk that previously had to be shared solely between the participants in the insurance schemes. These developments in practical insurance coincide with the advent of modern financial mathematics, which has equipped the actuaries with a well developed theory within which financial risk and insurance risk can be analyzed, quantified and controlled.
A new order of the day is thus set for the actuarial profession. The purpose of this chapter is to give a glimpse into some basic ideas and results in modern financial mathematics and to indicate by examples how they may be applied to actuarial problems involving management of financial risk.

7.2 Prerequisites

A. Probability and expectation.

Taking basic measure theoretic probability as a prerequisite, we represent the relevant part of the world and its uncertainties by a probability space $(\Omega, \mathcal{F}, P)$. Here $\Omega$ is the set of possible outcomes $\omega$, $\mathcal{F}$ is a sigma-algebra of subsets of $\Omega$ representing the events to which we want to assign probabilities, and $P : \mathcal{F} \to [0, 1]$ is a probability measure.

A set $A \in \mathcal{F}$ such that $P[A] = 0$ is called a nullset, and a property that takes place in all of $\Omega$, except possibly on a nullset, is said to hold almost surely (a.s.). If more than one probability measure is in play, we write nullset ($P$) and a.s. ($P$) whenever emphasis is needed. Two probability measures $\tilde P$ and $P$ are said to be equivalent, written $\tilde P \sim P$, if they are defined on the same $\mathcal{F}$ and have the same nullsets.

Let $\mathcal{G}$ be some sub-sigma-algebra of $\mathcal{F}$. We denote the restriction of $P$ to $\mathcal{G}$ by $P_{\mathcal{G}}$; $P_{\mathcal{G}}[A] = P[A]$, $A \in \mathcal{G}$. Note that also $(\Omega, \mathcal{G}, P_{\mathcal{G}})$ is a probability space.

A $\mathcal{G}$-measurable random variable (r.v.) is a function $X : \Omega \to \mathbb{R}$ such that $X^{-1}(B) \in \mathcal{G}$ for all $B \in \mathcal{R}$, the Borel sets in $\mathbb{R}$. We write $X \in \mathcal{G}$ in short.

The expected value of a r.v. $X$ is the probability-weighted average $E[X] = \int X \, dP = \int_\Omega X(\omega) \, dP(\omega)$, provided this integral is well defined.

The conditional expected value of $X$, given $\mathcal{G}$, is the r.v. $E[X|\mathcal{G}] \in \mathcal{G}$ satisfying
\[
E\{E[X|\mathcal{G}] \, Y\} = E[XY] \tag{7.1}
\]
for each $Y \in \mathcal{G}$ such that the expected value on the right exists. It is unique up to nullsets ($P$). To motivate (7.1), consider the special case when $\mathcal{G} = \sigma\{B_1, B_2, \ldots\}$, the sigma-algebra generated by the $\mathcal{F}$-measurable sets $B_1, B_2, \ldots$, which form a partition of $\Omega$. Being $\mathcal{G}$-measurable, $E[X|\mathcal{G}]$ must be of the form $\sum_k b_k 1_{B_k}$. Putting this together with $Y = 1_{B_j}$ into the relationship (7.1) we arrive at
\[
E[X|\mathcal{G}] = \sum_j 1_{B_j} \frac{\int_{B_j} X \, dP}{P[B_j]} ,
\]
as it ought to be. In particular, taking $X = 1_A$, we find the conditional probability $P[A|B] = P[A \cap B]/P[B]$.
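The partition formula can be tried out on a finite sample space. The following is a minimal sketch, assuming a hypothetical six-point space with uniform probabilities; the function name `conditional_expectation` and the data are illustrative only. It computes $E[X|\mathcal{G}]$ block by block and checks the defining relation (7.1) for the indicators $Y = 1_{B_j}$.

```python
from fractions import Fraction


def conditional_expectation(x, p, partition):
    """Return E[X|G] as a list over outcomes, where G is generated by `partition`.

    x         : values X(omega), indexed by outcome
    p         : probabilities P[{omega}], indexed by outcome
    partition : list of blocks B_j, each a list of outcome indices
    """
    cond = [None] * len(x)
    for block in partition:
        mass = sum(p[w] for w in block)                  # P[B_j]
        avg = sum(x[w] * p[w] for w in block) / mass     # integral of X over B_j, divided by P[B_j]
        for w in block:
            cond[w] = avg                                # E[X|G] is constant on each block
    return cond


def expect(z, p):
    """Probability-weighted average of z."""
    return sum(zi * pi for zi, pi in zip(z, p))


# Hypothetical example: six equally likely outcomes, three blocks.
p = [Fraction(1, 6)] * 6
x = [1, 2, 3, 4, 5, 6]
partition = [[0, 1], [2, 3, 4], [5]]
cond = conditional_expectation(x, p, partition)

# Check (7.1) with Y equal to the indicator of each block.
for block in partition:
    y = [1 if w in block else 0 for w in range(len(x))]
    lhs = expect([c * yi for c, yi in zip(cond, y)], p)
    rhs = expect([xi * yi for xi, yi in zip(x, y)], p)
    assert lhs == rhs
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the two sides of (7.1) can be compared with `==` rather than up to rounding error.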
One easily verifies the rule of iterated expectations, which states that, for $\mathcal{H} \subset \mathcal{G} \subset \mathcal{F}$,
\[
E\{ E[X|\mathcal{G}] \,|\, \mathcal{H}\} = E[X|\mathcal{H}] . \tag{7.2}
\]

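F. Change of measure.
If $L$ is a r.v. such that $L \geq 0$ a.s. ($P$) and $E[L] = 1$, we can define a probability measure $\tilde P$ on $\mathcal{F}$ by
\[
\tilde P[A] = \int_A L \, dP = E[1_A L] . \tag{7.3}
\]
If $L > 0$ a.s. ($P$), then $\tilde P \sim P$.

The expected value of $X$ w.r.t. $\tilde P$ is
\[
\tilde E[X] = E[XL] \tag{7.4}
\]
if this integral exists; by the definition (7.3), the relation (7.4) is true for indicators, hence for simple functions and, by passing to limits, it holds for measurable functions. Spelling out (7.4) as $\int X \, d\tilde P = \int X L \, dP$ suggests the notation $d\tilde P = L \, dP$ or
\[
\frac{d\tilde P}{dP} = L . \tag{7.5}
\]
The function $L$ is called the Radon-Nikodym derivative of $\tilde P$ w.r.t. $P$.

Conditional expected values under $\tilde P$ are obtained from those under $P$ by
\[
\tilde E[X|\mathcal{G}] = \frac{E[XL|\mathcal{G}]}{E[L|\mathcal{G}]} . \tag{7.6}
\]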

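To verify (7.6), recall from (7.1) that $\tilde E[X|\mathcal{G}]$ is the $\mathcal{G}$-measurable r.v. satisfying
\[
\tilde E\{\tilde E[X|\mathcal{G}] \, Y\} = \tilde E[XY] \tag{7.7}
\]
for all $Y \in \mathcal{G}$. The expression on the left of (7.7) can be reshaped as
\[
E\{\tilde E[X|\mathcal{G}] \, Y L\} = E\{\tilde E[X|\mathcal{G}] \, E[L|\mathcal{G}] \, Y\} .
\]
The expression on the right of (7.7) is
\[
E[XYL] = E\{E[XL|\mathcal{G}] \, Y\} .
\]
It follows that (7.7) is true for all $Y \in \mathcal{G}$ if and only if
\[
\tilde E[X|\mathcal{G}] \, E[L|\mathcal{G}] = E[XL|\mathcal{G}] ,
\]
which is the same as (7.6).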
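The relation (7.6) can be checked numerically on a finite space. The sketch below assumes a hypothetical four-point space, a hypothetical density $L$ with $E[L] = 1$, and a two-block partition; the helper `cond_exp` is illustrative. It computes the conditional expectation directly under $\tilde P$ and compares it with the ratio on the right of (7.6).

```python
from fractions import Fraction


def cond_exp(x, p, partition):
    """E[X|G] under the probabilities p, G generated by the partition."""
    out = [None] * len(x)
    for block in partition:
        mass = sum(p[w] for w in block)
        avg = sum(x[w] * p[w] for w in block) / mass
        for w in block:
            out[w] = avg
    return out


# Hypothetical data: uniform P, strictly positive L with E[L] = 1.
p = [Fraction(1, 4)] * 4
L = [Fraction(1, 2), Fraction(3, 2), Fraction(1), Fraction(1)]
assert sum(l * q for l, q in zip(L, p)) == 1

# The new measure by (7.3): P~[{omega}] = L(omega) P[{omega}].
p_tilde = [l * q for l, q in zip(L, p)]

x = [10, 20, 30, 40]
partition = [[0, 1], [2, 3]]

# Left of (7.6): conditional expectation computed directly under P~.
direct = cond_exp(x, p_tilde, partition)

# Right of (7.6): E[XL|G] / E[L|G], both computed under P.
numerator = cond_exp([xi * li for xi, li in zip(x, L)], p, partition)
denominator = cond_exp(L, p, partition)
bayes = [a / b for a, b in zip(numerator, denominator)]

assert direct == bayes
```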
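For $X \in \mathcal{G}$ we have
\[
\tilde E_{\mathcal{G}}[X] = \tilde E[X] = E[XL] = E\{X \, E[L|\mathcal{G}]\} = E_{\mathcal{G}}\{X \, E[L|\mathcal{G}]\} , \tag{7.8}
\]
showing that
\[
\frac{d\tilde P_{\mathcal{G}}}{dP_{\mathcal{G}}} = E[L|\mathcal{G}] . \tag{7.9}
\]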

B. Stochastic processes.
To describe the evolution of random phenomena over some time interval $[0, T]$, we introduce a family $\mathbf{F} = \{\mathcal{F}_t\}_{0 \leq t \leq T}$ of sub-sigma-algebras of $\mathcal{F}$, where $\mathcal{F}_t$ represents the information available at time $t$. More precisely, $\mathcal{F}_t$ is the set of events whose occurrence or non-occurrence can be ascertained at time $t$. If no information is ever sacrificed, we have $\mathcal{F}_s \subset \mathcal{F}_t$ for $s < t$. We then say that $\mathbf{F}$ is a filtration, and $(\Omega, \mathcal{F}, \mathbf{F}, P)$ is called a filtered probability space.

A stochastic process is a family of r.v.-s, $\{X_t\}_{0 \leq t \leq T}$. It is said to be adapted to the filtration $\mathbf{F}$ if $X_t \in \mathcal{F}_t$ for each $t \in [0, T]$, that is, at any time the current state (and also the past history) of the process is fully known if we are currently provided with the information $\mathbf{F}$. An adapted process is said to be predictable if its value at any time is entirely determined by its history in the strict past, loosely speaking. For our purposes it is sufficient to think of predictable processes as being either left-continuous or deterministic.
C. Martingales.
An adapted process $X$ with finite expectation is a martingale if
\[
E[X_t | \mathcal{F}_s] = X_s
\]
for $s < t$. The martingale property depends both on the filtration and on the probability measure, and when these need emphasis, we shall say that $X$ is a martingale $(\mathbf{F}, P)$. The definition says that, on the average, a martingale is always expected to remain on its current level. One easily verifies that, conditional on the present information, a martingale has uncorrelated future increments. Any integrable r.v. $Y$ induces a martingale $\{X_t\}_{t \geq 0}$ defined by $X_t = E[Y | \mathcal{F}_t]$, a consequence of (7.2).

Abbreviate $P_t = P_{\mathcal{F}_t}$ and introduce
\[
L_t = \frac{d\tilde P_t}{dP_t} .
\]
By (7.9),
\[
L_t = E[L | \mathcal{F}_t] , \tag{7.10}
\]
which is a martingale $(\mathbf{F}, P)$.

D. Counting processes.
As the name suggests, a counting process is a stochastic process $N = \{N_t\}_{0 \leq t \leq T}$ that commences from zero ($N_0 = 0$) and thereafter increases by isolated jumps of size 1 only. The natural filtration of $N$ is $\mathbf{F}^N = \{\mathcal{F}^N_t\}_{0 \leq t \leq T}$, where $\mathcal{F}^N_t = \sigma\{N_s ; s \leq t\}$ is the history of $N$ by time $t$. This is the smallest filtration to which $N$ is adapted. The strict past history of $N$ at time $t$ is denoted by $\mathcal{F}^N_{t-}$.

An $\mathbf{F}^N$-predictable process $\{\Lambda_t\}_{0 \leq t \leq T}$ is called a compensator of $N$ if the process $M$ defined by
\[
M_t = N_t - \Lambda_t \tag{7.11}
\]
is a zero mean $\mathbf{F}^N$-martingale. If $\Lambda$ is absolutely continuous, that is, of the form
\[
\Lambda_t = \int_0^t \lambda_s \, ds ,
\]
then the process $\lambda$ is called the intensity of $N$. We may also define the intensity informally by
\[
\lambda_t \, dt = P[dN_t = 1 \,|\, \mathcal{F}_{t-}] = E[dN_t \,|\, \mathcal{F}_{t-}] ,
\]
and we sometimes write the associated martingale (7.11) in differential form,
\[
dM_t = dN_t - \lambda_t \, dt . \tag{7.12}
\]
A stochastic integral w.r.t. the martingale $M$ is a process $H$ of the form
\[
H_t = H_0 + \int_0^t h_s \, dM_s , \tag{7.13}
\]
where $H_0$ is $\mathcal{F}^N_0$-measurable and $h$ is an $\mathbf{F}^N$-predictable process such that $H$ is integrable. The stochastic integral is also a martingale.

A fundamental representation result states that every $\mathbf{F}^N$-martingale is a stochastic integral w.r.t. $M$. It follows that every integrable $\mathcal{F}^N_t$-measurable r.v. is of the form (7.13).

If $H^{(1)}_t = H^{(1)}_0 + \int_0^t h^{(1)}_s \, dM_s$ and $H^{(2)}_t = H^{(2)}_0 + \int_0^t h^{(2)}_s \, dM_s$ are stochastic integrals with finite variance, then an easy heuristic calculation shows that
\[
\mathrm{Cov}[H^{(1)}_T, H^{(2)}_T \,|\, \mathcal{F}_t] = E\left[ \int_t^T h^{(1)}_s h^{(2)}_s \lambda_s \, ds \,\Big|\, \mathcal{F}_t \right] , \tag{7.14}
\]
and, in particular,
\[
\mathrm{Var}[H_T \,|\, \mathcal{F}_t] = E\left[ \int_t^T h_s^2 \lambda_s \, ds \,\Big|\, \mathcal{F}_t \right] .
\]
$H^{(1)}$ and $H^{(2)}$ are said to be orthogonal if they have conditionally uncorrelated increments, that is, the covariance in (7.14) is null. This is equivalent to saying that $H^{(1)} H^{(2)}$ is a martingale.

The intensity is also called the infinitesimal characteristic of the counting process since it entirely determines its probabilistic properties. If $\lambda_t$ is deterministic, then $N_t$ is a Poisson process. If $\lambda_t$ depends only on $N_{t-}$, then $N_t$ is a Markov process.

A comprehensive textbook on counting processes in life history analysis is [1].
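E. The Girsanov transform.
Girsanov's theorem is a celebrated one in stochastics, and it is basic in mathematical finance. We formulate and prove the counting process variation:

Theorem (Girsanov). Let $N_t$ be a counting process with $(\mathbf{F}, P)$-intensity $\lambda_t$, and let $\tilde\lambda_t$ be a given non-negative $\mathbf{F}$-adapted process such that $\tilde\lambda_t = 0$ if and only if $\lambda_t = 0$. Then there exists a probability measure $\tilde P$ such that $\tilde P \sim P$ and $N$ has $(\mathbf{F}, \tilde P)$-intensity $\tilde\lambda_t$. The likelihood process (7.10) is
\[
L_t = \exp\left( \int_0^t (\ln \tilde\lambda_s - \ln \lambda_s) \, dN_s + \int_0^t (\lambda_s - \tilde\lambda_s) \, ds \right) .
\]

Proof: We shall give a constructive proof, starting from a guessed $L$ in (7.5). Since $L$ must be strictly positive a.e. ($P$), a candidate would be $L = L_T$, where
\[
L_t = \exp\left( \int_0^t \phi_s \, dN_s + \int_0^t \psi_s \, ds \right) ,
\]
with $\phi$ and $\psi$ predictable processes to be determined. In the first place, $L_t$ should be a martingale $(\mathbf{F}, P)$. By Itô's formula,
\[
dL_t = L_t \psi_t \, dt + L_{t-} (e^{\phi_t} - 1) \, dN_t
     = L_{t-} \left( \psi_t + (e^{\phi_t} - 1) \lambda_t \right) dt + L_{t-} (e^{\phi_t} - 1) \, dM_t .
\]
The representation result (7.13) tells us that to make $L$ a martingale, we must make the drift term vanish, that is,
\[
\psi_t = (1 - e^{\phi_t}) \lambda_t , \tag{7.15}
\]
whereby
\[
dL_t = L_{t-} (e^{\phi_t} - 1) \, dM_t .
\]
In the second place, we want to determine $\phi_t$ such that the process $\tilde M$ given by
\[
d\tilde M_t = dN_t - \tilde\lambda_t \, dt \tag{7.16}
\]
is a martingale $(\mathbf{F}, \tilde P)$. Thus, we should have $\tilde E[\tilde M_t | \mathcal{F}_s] = \tilde M_s$ or, by (7.6),
\[
\frac{E[\tilde M_t L \,|\, \mathcal{F}_s]}{E[L \,|\, \mathcal{F}_s]} = \tilde M_s .
\]
Using the martingale property (7.10) of $L_t$, this is the same as
\[
E[\tilde M_t L_t \,|\, \mathcal{F}_s] = \tilde M_s L_s ,
\]
i.e. $\tilde M_t L_t$ should be a martingale $(\mathbf{F}, P)$. Since
\[
\begin{aligned}
d(\tilde M_t L_t) &= (-\tilde\lambda_t \, dt) L_{t-} + \tilde M_{t-} (e^{\phi_t} - 1) L_{t-} (-\lambda_t \, dt)
   + \left( (\tilde M_{t-} + 1) L_{t-} e^{\phi_t} - \tilde M_{t-} L_{t-} \right) dN_t \\
 &= L_{t-} \left( -\tilde\lambda_t + e^{\phi_t} \lambda_t \right) dt
   + \left( (\tilde M_{t-} + 1) L_{t-} e^{\phi_t} - \tilde M_{t-} L_{t-} \right) dM_t ,
\end{aligned}
\]
we conclude that the martingale property is obtained by choosing $\phi_t = \ln \tilde\lambda_t - \ln \lambda_t$. By (7.15), $\psi_t = \lambda_t - \tilde\lambda_t$, which gives the likelihood process stated in the theorem.

The multivariate case goes in the same way; just replace by vector-valued processes.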
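As a numerical illustration of the theorem, consider the simplest case of a Poisson process with constant intensity $\lambda$ under $P$ and a constant target intensity $\tilde\lambda$. The sketch below is a sanity check under these assumptions (the parameter values and function names are hypothetical): it sums over the Poisson distribution of $N_T$ to confirm that $E[L_T] = 1$ and that $E[N_T L_T] = \tilde\lambda T$, the mean of $N_T$ under $\tilde P$.

```python
import math


def poisson_pmf(n, mean):
    """P[N = n] for a Poisson variate with the given mean."""
    return math.exp(-mean) * mean ** n / math.factorial(n)


def likelihood(n, lam, lam_tilde, T):
    """L_T from the theorem when both intensities are constant and N_T = n."""
    return math.exp(n * (math.log(lam_tilde) - math.log(lam)) + (lam - lam_tilde) * T)


# Hypothetical constants: intensity 2.0 under P, 3.5 under the new measure.
lam, lam_tilde, T = 2.0, 3.5, 1.5
N_MAX = 100  # truncation point; the neglected Poisson tail is negligible here

# E[L_T] under P should equal 1 (L_T is a density).
total_mass = sum(
    poisson_pmf(n, lam * T) * likelihood(n, lam, lam_tilde, T) for n in range(N_MAX)
)

# E[N_T L_T] under P is the mean of N_T under the new measure, by (7.4).
mean_under_tilde = sum(
    n * poisson_pmf(n, lam * T) * likelihood(n, lam, lam_tilde, T) for n in range(N_MAX)
)

print(total_mass, mean_under_tilde, lam_tilde * T)
```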

7.3 A Markov chain financial market - Introduction

A. Motivation.
The theory of diffusion processes, with its wealth of powerful theorems and model variations, is an indispensable toolkit in modern financial mathematics. The seminal papers of Black and Scholes [6] and Merton [18] were crafted with Brownian motion, and so were most of the almost countless papers on arbitrage pricing theory and its bifurcations that followed over the past quarter of a century.

A main course of current research, initiated by the martingale approach to arbitrage pricing ([13] and [16]), aims at generalization and unification. Today the core of the matter is well understood in a general semimartingale setting, see e.g. [8]. Another course of research investigates special models, in particular various Lévy motion alternatives to the Brownian driving process, see e.g. [9] and [23]. Pure jump processes have been widely used in finance, ranging from plain Poisson processes introduced in [19] to quite general marked point processes, see e.g. [4]. And, as a pedagogical exercise, the market driven by a binomial process has been intensively studied since it was launched in [7].

The present paper undertakes to study a financial market driven by a continuous time homogeneous Markov chain. The idea was launched in [22] and reappeared in [10], the context being limited to modelling of the spot rate of interest. The purpose of the present study is two-fold: In the first place, it is instructive to see how well established theory turns out in the framework of a general Markov chain market. In the second place, it is worthwhile investigating the feasibility of the model from a theoretical as well as from a practical point of view. Poisson driven markets are accommodated as special cases.
B. Preliminaries: Notation and some useful results.
Vectors and matrices are denoted by bold letters, lower and upper case, respectively. They may be equipped with topscripts indicating dimensions, e.g. $\mathbf{A}^{n \times m}$ has $n$ rows and $m$ columns. We may write $\mathbf{A} = (a^{jk})^{k \in \mathcal{K}}_{j \in \mathcal{J}}$ to emphasize the ranges of the row index $j$ and the column index $k$. The transpose of $\mathbf{A}$ is denoted by $\mathbf{A}'$. Vectors are invariably taken to be of column type, hence row vectors appear as transposed. The identity matrix is denoted by $\mathbf{I}$, the vector with all entries equal to 1 is denoted by $\mathbf{1}$, and the vector with all entries equal to 0 is denoted by $\mathbf{0}$. By $\mathbf{D}_{j=1,\ldots,n}(a_j)$, or just $\mathbf{D}(\mathbf{a})$, is meant the diagonal matrix with the entries of $\mathbf{a} = (a_1, \ldots, a_n)'$ down the principal diagonal. The $n$-dimensional Euclidean space is denoted by $\mathbb{R}^n$, and the linear subspace spanned by the columns of $\mathbf{A}^{n \times m}$ is denoted by $\mathbb{R}(\mathbf{A})$.

A diagonalizable square matrix $\mathbf{A}^{n \times n}$ can be represented as
\[
\mathbf{A} = \mathbf{\Phi} \, \mathbf{D}_{j=1,\ldots,n}(\rho_j) \, \mathbf{\Phi}^{-1} = \sum_{j=1}^n \rho_j \boldsymbol{\phi}_j \boldsymbol{\psi}_j' , \tag{7.1}
\]
where the $\boldsymbol{\phi}_j$ are the columns of $\mathbf{\Phi}^{n \times n}$ and the $\boldsymbol{\psi}_j'$ are the rows of $\mathbf{\Phi}^{-1}$. The $\rho_j$ are the eigenvalues of $\mathbf{A}$, and $\boldsymbol{\phi}_j$ and $\boldsymbol{\psi}_j'$ are the corresponding right and left eigenvectors, respectively. Eigenvectors (right or left) corresponding to eigenvalues that are distinguishable and non-null are mutually orthogonal. These results can be looked up in e.g. [17].
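The exponential function of $\mathbf{A}^{n \times n}$ is the $n \times n$ matrix defined by
\[
\exp(\mathbf{A}) = \sum_{p=0}^{\infty} \frac{1}{p!} \mathbf{A}^p = \mathbf{\Phi} \, \mathbf{D}_{j=1,\ldots,n}(e^{\rho_j}) \, \mathbf{\Phi}^{-1} = \sum_{j=1}^n e^{\rho_j} \boldsymbol{\phi}_j \boldsymbol{\psi}_j' , \tag{7.2}
\]
where the last two expressions follow from (7.1). The matrix $\exp(\mathbf{A})$ has full rank.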
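The two expressions for the matrix exponential can be compared numerically. The following is a minimal sketch in plain Python, assuming a hypothetical $2 \times 2$ matrix whose eigenvalues ($0$ and $-3$) and eigenvectors were worked out by hand; `mat_mul` and `expm_series` are illustrative helpers, and the power series is simply truncated after a fixed number of terms, which is adequate only for matrices of small norm.

```python
import math


def mat_mul(a, b):
    """Product of two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def expm_series(a, terms=40):
    """Truncated power series sum_{p < terms} A^p / p! (first expression in (7.2))."""
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]  # p = 0 term: identity
    term = [row[:] for row in result]
    for p in range(1, terms):
        term = [[x / p for x in row] for row in mat_mul(term, a)]   # A^p / p!
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result


# Hypothetical example matrix with eigenvalues 0 and -3.
A = [[-1.0, 1.0], [2.0, -2.0]]
rho = [0.0, -3.0]
Phi = [[1.0, 1.0], [1.0, -2.0]]            # columns: right eigenvectors
Phi_inv = [[2 / 3, 1 / 3], [1 / 3, -1 / 3]]  # rows: left eigenvectors

# Spectral form: Phi D(e^rho) Phi^{-1} (second expression in (7.2)).
D = [[math.exp(rho[0]), 0.0], [0.0, math.exp(rho[1])]]
spectral = mat_mul(mat_mul(Phi, D), Phi_inv)
series = expm_series(A)

max_diff = max(abs(s - t) for rs, rt in zip(series, spectral) for s, t in zip(rs, rt))
print(max_diff)
```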

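If $\mathbf{\Sigma}^{n \times n}$ is positive definite symmetric, then $\langle \boldsymbol{\zeta}_1, \boldsymbol{\zeta}_2 \rangle_{\mathbf{\Sigma}} = \boldsymbol{\zeta}_1' \mathbf{\Sigma} \boldsymbol{\zeta}_2$ defines an inner product on $\mathbb{R}^n$. The corresponding norm is given by $\|\boldsymbol{\zeta}\|_{\mathbf{\Sigma}} = \langle \boldsymbol{\zeta}, \boldsymbol{\zeta} \rangle_{\mathbf{\Sigma}}^{1/2}$. If $\mathbf{F}^{n \times m}$ has full rank $m$ ($\leq n$), then the $\langle \cdot\,, \cdot \rangle_{\mathbf{\Sigma}}$-projection of $\boldsymbol{\zeta}$ onto $\mathbb{R}(\mathbf{F})$ is
\[
\boldsymbol{\zeta}_{\mathbf{F}} = \mathbf{P}_{\mathbf{F}} \boldsymbol{\zeta} , \tag{7.3}
\]
where the projection matrix (or projector) $\mathbf{P}_{\mathbf{F}}$ is
\[
\mathbf{P}_{\mathbf{F}} = \mathbf{F} (\mathbf{F}' \mathbf{\Sigma} \mathbf{F})^{-1} \mathbf{F}' \mathbf{\Sigma} . \tag{7.4}
\]
The projection of $\boldsymbol{\zeta}$ onto the orthogonal complement $\mathbb{R}(\mathbf{F})^{\perp}$ is
\[
\boldsymbol{\zeta}_{\mathbf{F}^{\perp}} = \boldsymbol{\zeta} - \boldsymbol{\zeta}_{\mathbf{F}} = (\mathbf{I} - \mathbf{P}_{\mathbf{F}}) \boldsymbol{\zeta} .
\]
Its squared length, which is the squared $\langle \cdot\,, \cdot \rangle_{\mathbf{\Sigma}}$-distance from $\boldsymbol{\zeta}$ to $\mathbb{R}(\mathbf{F})$, is
\[
\|\boldsymbol{\zeta}_{\mathbf{F}^{\perp}}\|^2_{\mathbf{\Sigma}} = \|\boldsymbol{\zeta}\|^2_{\mathbf{\Sigma}} - \|\boldsymbol{\zeta}_{\mathbf{F}}\|^2_{\mathbf{\Sigma}} = \boldsymbol{\zeta}' \mathbf{\Sigma} (\mathbf{I} - \mathbf{P}_{\mathbf{F}}) \boldsymbol{\zeta} . \tag{7.5}
\]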
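A small worked instance of the projection may help fix ideas. The sketch below assumes the simplest case $m = 1$, so that the matrix to be inverted in the projector is a scalar, and a diagonal weight matrix; the numbers and function names are hypothetical. It checks that the residual is orthogonal to the spanning vector in the weighted inner product and that the Pythagorean identity in (7.5) holds exactly.

```python
from fractions import Fraction


def inner(x, y, sigma_diag):
    """Weighted inner product x' Sigma y for a diagonal Sigma."""
    return sum(s * a * b for s, a, b in zip(sigma_diag, x, y))


def project_onto_line(zeta, f, sigma_diag):
    """Projection of zeta onto the span of the single column f.

    With one column, the projector reduces to f (f' Sigma f)^{-1} f' Sigma.
    """
    coef = inner(f, zeta, sigma_diag) / inner(f, f, sigma_diag)
    return [coef * fi for fi in f]


# Hypothetical data in exact rational arithmetic.
sigma_diag = [Fraction(1), Fraction(2), Fraction(3)]   # diagonal of Sigma (positive)
f = [Fraction(1), Fraction(1), Fraction(1)]
zeta = [Fraction(1), Fraction(2), Fraction(3)]

zeta_f = project_onto_line(zeta, f, sigma_diag)
residual = [z - p for z, p in zip(zeta, zeta_f)]

# Squared distance from zeta to the line, computed two ways as in (7.5).
dist_sq_direct = inner(residual, residual, sigma_diag)
dist_sq_pythagoras = inner(zeta, zeta, sigma_diag) - inner(zeta_f, zeta_f, sigma_diag)
print(dist_sq_direct, dist_sq_pythagoras)
```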

The cardinality of a set $\mathcal{Y}$ is denoted by $|\mathcal{Y}|$. For a finite set it is just its number of elements.

7.4

A. The continuous time Markov chain.

At the base of everything (although slumbering in the background) is some probability space $(\Omega, \mathcal{F}, P)$.

Let $\{Y_t\}_{t \geq 0}$ be a continuous time Markov chain with finite state space $\mathcal{Y} = \{1, \ldots, n\}$. We assume that it is time homogeneous so that the transition probabilities
\[
p^{jk}_t = P[Y_{s+t} = k \,|\, Y_s = j]
\]
depend only on the length of the transition period. This implies that the transition intensities
\[
\lambda^{jk} = \lim_{t \downarrow 0} \frac{p^{jk}_t}{t} , \tag{7.1}
\]
$j \neq k$, exist and are constant. To avoid repetitious reminders of the type "$j, k \in \mathcal{Y}$", we reserve the indices $j$ and $k$ for states in $\mathcal{Y}$ throughout. We will frequently refer to
\[
\mathcal{Y}^j = \{k ; \lambda^{jk} > 0\} ,
\]
the set of states that are directly accessible from state $j$, and denote the number of such states by
\[
n^j = |\mathcal{Y}^j| .
\]
Put
\[
\lambda^{jj} = -\lambda^{j\cdot} = - \sum_{k ; k \in \mathcal{Y}^j} \lambda^{jk}
\]
(minus the total intensity of transition out of state $j$). We assume that all states intercommunicate so that $p^{jk}_t > 0$ for all $j, k$ (and $t > 0$). This implies that $n^j > 0$ for all $j$ (no absorbing states). The matrix of transition probabilities,
\[
\mathbf{P}_t = (p^{jk}_t) ,
\]
and the infinitesimal matrix,
\[
\mathbf{\Lambda} = (\lambda^{jk}) ,
\]
are related by (7.1), which in matrix form reads $\mathbf{\Lambda} = \lim_{t \downarrow 0} \frac{1}{t} (\mathbf{P}_t - \mathbf{I})$, and by the backward and forward Kolmogorov differential equations,
\[
\frac{d}{dt} \mathbf{P}_t = \mathbf{P}_t \mathbf{\Lambda} = \mathbf{\Lambda} \mathbf{P}_t .
\]
Under the side condition $\mathbf{P}_0 = \mathbf{I}$, (7.2) integrates to
\[
\mathbf{P}_t = \exp(\mathbf{\Lambda} t) .
\]
In the representation (7.2),
\[
\mathbf{P}_t = \mathbf{\Phi} \, \mathbf{D}_{j=1,\ldots,n}(e^{\rho_j t}) \, \mathbf{\Phi}^{-1} = \sum_{j=1}^n e^{\rho_j t} \boldsymbol{\phi}_j \boldsymbol{\psi}_j' , \tag{7.2}
\]
the first (say) eigenvalue is $\rho_1 = 0$, and corresponding eigenvectors are $\boldsymbol{\phi}_1 = \mathbf{1}$ and $\boldsymbol{\psi}_1' = (p^1, \ldots, p^n) = \lim_{t \to \infty} (p^{j1}_t, \ldots, p^{jn}_t)$, the stationary distribution of $Y$. The remaining eigenvalues, $\rho_2, \ldots, \rho_n$, are all strictly negative so that, by (7.2), the transition probabilities converge exponentially to the stationary distribution as $t$ increases.
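For a chain with only two states the spectral representation can be written out by hand, which makes the statements above easy to check. The sketch below assumes hypothetical intensities $a$ (from state 1 to 2) and $b$ (from state 2 to 1); the eigenvalues are then $0$ and $-(a + b)$ and the stationary distribution is $(b, a)/(a + b)$. The code verifies that the rows of $\mathbf{P}_t$ sum to one, that $\mathbf{P}_{s+t} = \mathbf{P}_s \mathbf{P}_t$, and that $\mathbf{P}_t$ approaches the stationary distribution for large $t$.

```python
import math


def transition_matrix(a, b, t):
    """P_t for a two-state chain with intensities a (1 -> 2) and b (2 -> 1).

    Closed form of the spectral representation with eigenvalues 0 and -(a + b).
    """
    total = a + b
    decay = math.exp(-total * t)          # e^{rho_2 t} with rho_2 = -(a + b)
    p1, p2 = b / total, a / total         # stationary distribution
    return [
        [p1 + p2 * decay, p2 * (1.0 - decay)],
        [p1 * (1.0 - decay), p2 + p1 * decay],
    ]


def mat_mul(x, y):
    """Product of two 2 x 2 matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


# Hypothetical intensities.
a, b = 1.0, 2.0

P_short = transition_matrix(a, b, 0.3)
P_rest = transition_matrix(a, b, 0.7)
P_one = transition_matrix(a, b, 1.0)
P_long = transition_matrix(a, b, 50.0)

# Chapman-Kolmogorov: P_{0.3} P_{0.7} should reproduce P_{1.0}.
product = mat_mul(P_short, P_rest)
print(product, P_one, P_long)
```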
Introduce
\[
I^j_t = 1[Y_t = j] , \tag{7.3}
\]
the indicator of the event that $Y$ is in state $j$ at time $t$, and
\[
N^{jk}_t = |\{s ; 0 < s \leq t, \; Y_{s-} = j, \; Y_s = k\}| , \tag{7.4}
\]
the number of direct transitions of $Y$ from state $j$ to state $k \in \mathcal{Y}^j$ in the time interval $(0, t]$. For $k \notin \mathcal{Y}^j$ we define $N^{jk}_t \equiv 0$. Taking $Y$ to be right-continuous, the same goes for the indicator processes $I^j$ and the counting processes $N^{jk}$. As is seen from (7.3), (7.4), and the obvious relationships
\[
Y_t = \sum_j j \, I^j_t , \qquad I^j_t = I^j_0 + \sum_{k ; k \neq j} (N^{kj}_t - N^{jk}_t) ,
\]
the state process, the indicator processes, and the counting processes carry the same information, which at any time $t$ is represented by the sigma-algebra $\mathcal{F}^Y_t = \sigma\{Y_s ; 0 \leq s \leq t\}$. The corresponding filtration, denoted by $\mathbf{F}^Y = \{\mathcal{F}^Y_t\}_{t \geq 0}$, is taken to satisfy the usual conditions of right-continuity ($\mathcal{F}_t = \cap_{u > t} \mathcal{F}_u$) and completeness ($\mathcal{F}_0$ contains all subsets of $P$-nullsets), and $\mathcal{F}_0$ is assumed to be the trivial $(\emptyset, \Omega)$. This means, essentially, that $Y$ is right-continuous (hence the same goes for the $I^j$ and the $N^{jk}$) and that $Y_0$ is deterministic.

The compensated counting processes $M^{jk}$, $j \neq k$, defined by
\[
dM^{jk}_t = dN^{jk}_t - I^j_t \lambda^{jk} \, dt \tag{7.5}
\]
and $M^{jk}_0 = 0$, are zero mean, square integrable, mutually orthogonal martingales w.r.t. $(\mathbf{F}^Y, P)$.

We now turn to the subject matter of our study and, referring to introductory texts like [5] and [26], take basic notions and results from arbitrage pricing theory as prerequisites.
B. The continuous time Markov chain market.
We consider a financial market driven by the Markov chain described above. Thus, $Y_t$ represents the state of the economy at time $t$, $\mathcal{F}^Y_t$ represents the information available about the economic history by time $t$, and $\mathbf{F}^Y$ represents the flow of such information over time.

In the market there are $m + 1$ basic assets, which can be traded freely and frictionlessly (short sales are allowed, and there are no transaction costs). A special role is played by asset No. 0, which is a locally risk-free bank account with state-dependent interest rate
\[
r_t = r^{Y_t} = \sum_j I^j_t r^j ,
\]
where the state-wise interest rates $r^j$, $j = 1, \ldots, n$, are constants. Thus, its price process is
\[
B_t = \exp\left( \int_0^t r_s \, ds \right) = \exp\left( \sum_j r^j \int_0^t I^j_s \, ds \right) ,
\]
with dynamics
\[
dB_t = B_t r_t \, dt = B_t \sum_j r^j I^j_t \, dt .
\]
(Setting $B_0 = 1$ is just a matter of convention.)

The remaining $m$ assets, henceforth referred to as stocks, are risky, with price processes of the form
\[
S^i_t = \exp\left( \sum_j \alpha^{ij} \int_0^t I^j_s \, ds + \sum_j \sum_{k \in \mathcal{Y}^j} \beta^{ijk} N^{jk}_t \right) , \tag{7.6}
\]
$i = 1, \ldots, m$, where the $\alpha^{ij}$ and $\beta^{ijk}$ are constants and, for each $i$, at least one of the $\beta^{ijk}$ is non-null. Thus, in addition to yielding state-dependent returns of the same form as the bank account, stock No. $i$ makes a price jump of relative size
\[
\gamma^{ijk} = \exp(\beta^{ijk}) - 1
\]
upon any transition of the economy from state $j$ to state $k$. By the general Itô's formula, its dynamics is given by
\[
dS^i_t = S^i_{t-} \left( \sum_j \alpha^{ij} I^j_t \, dt + \sum_j \sum_{k \in \mathcal{Y}^j} \gamma^{ijk} \, dN^{jk}_t \right) . \tag{7.7}
\]

Taking the bank account as numeraire, we introduce the discounted stock prices $\tilde S^i_t = S^i_t / B_t$, $i = 0, \ldots, m$. (The discounted price of the bank account is $\tilde B_t \equiv 1$, which is certainly a martingale under any measure.) The discounted stock prices are
\[
\tilde S^i_t = \exp\left( \sum_j (\alpha^{ij} - r^j) \int_0^t I^j_s \, ds + \sum_j \sum_{k \in \mathcal{Y}^j} \beta^{ijk} N^{jk}_t \right) ,
\]
with dynamics
\[
d\tilde S^i_t = \tilde S^i_{t-} \left( \sum_j (\alpha^{ij} - r^j) I^j_t \, dt + \sum_j \sum_{k \in \mathcal{Y}^j} \gamma^{ijk} \, dN^{jk}_t \right) ,
\]
$i = 1, \ldots, m$.

We stress that the theory we are going to develop does not aim at explaining how the prices of the basic assets emerge from supply and demand, business cycles, investment climate, or whatever; they are exogenously given basic entities. (And God said "let there be light", and there was light, and he said "let there also be these prices".) The purpose of the theory is to derive principles for consistent pricing of financial contracts, derivatives, or claims in a given market.
C. Portfolios.
A dynamic portfolio or investment strategy is an $(m + 1)$-dimensional stochastic process
\[
\boldsymbol{\theta}_t' = (\eta_t, \boldsymbol{\xi}_t') ,
\]
where $\eta_t$ represents the number of units of the bank account held at time $t$, and the $i$-th entry in
\[
\boldsymbol{\xi}_t = (\xi^1_t, \ldots, \xi^m_t)'
\]
represents the number of units of stock No. $i$ held at time $t$. As it will turn out, the bank account and the stocks will appear to play different parts in the show, the latter being the more visible. It is, therefore, convenient to costume the two types of assets and their corresponding portfolio entries accordingly. To save notation, however, it is useful also to work with double notation
\[
\boldsymbol{\theta}_t = (\theta^0_t, \ldots, \theta^m_t)' ,
\]
with $\theta^0_t = \eta_t$, $\theta^i_t = \xi^i_t$, $i = 1, \ldots, m$, and work with
\[
\mathbf{S}_t = (S^0_t, \ldots, S^m_t)' , \qquad S^0_t = B_t .
\]
The portfolio $\boldsymbol{\theta}$ is adapted to $\mathbf{F}^Y$ (the investor cannot see into the future), and the shares of stocks, $\boldsymbol{\xi}$, must also be $\mathbf{F}^Y$-predictable (the investor cannot, e.g. upon a sudden crash of the stock market, escape losses by selling stocks at prices quoted just before and hurry the money over to the locally risk-free bank account).
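The value of the portfolio at time $t$ is
\[
V^{\theta}_t = \eta_t B_t + \sum_{i=1}^m \xi^i_t S^i_t = \sum_{i=0}^m \theta^i_t S^i_t = \boldsymbol{\theta}_t' \mathbf{S}_t .
\]
Henceforth we will mainly work with discounted prices and values and, in accordance with (7.8), equip their symbols with a tilde. The discounted value of the portfolio at time $t$ is
\[
\tilde V^{\theta}_t = \eta_t + \sum_{i=1}^m \xi^i_t \tilde S^i_t = \boldsymbol{\theta}_t' \tilde{\mathbf{S}}_t . \tag{7.8}
\]
The strategy $\boldsymbol{\theta}$ is self-financing (SF) if $dV^{\theta}_t = \boldsymbol{\theta}_t' \, d\mathbf{S}_t$ or, equivalently,
\[
d\tilde V^{\theta}_t = \boldsymbol{\theta}_t' \, d\tilde{\mathbf{S}}_t = \sum_{i=1}^m \xi^i_t \, d\tilde S^i_t . \tag{7.9}
\]
We explain the last step: Put $Y_t = B_t^{-1}$, a continuous process. The dynamics of the discounted prices $\tilde{\mathbf{S}}_t = Y_t \mathbf{S}_t$ is then $d\tilde{\mathbf{S}}_t = dY_t \, \mathbf{S}_t + Y_t \, d\mathbf{S}_t$. Thus, for $\tilde V^{\theta}_t = Y_t V^{\theta}_t$, we have
\[
d\tilde V^{\theta}_t = dY_t \, V^{\theta}_t + Y_t \, dV^{\theta}_t = dY_t \, \boldsymbol{\theta}_t' \mathbf{S}_t + Y_t \boldsymbol{\theta}_t' \, d\mathbf{S}_t = \boldsymbol{\theta}_t' (dY_t \, \mathbf{S}_t + Y_t \, d\mathbf{S}_t) = \boldsymbol{\theta}_t' \, d\tilde{\mathbf{S}}_t ,
\]
hence the property of being self-financing is preserved under discounting. (The last equality in (7.9) holds because the discounted bank account $\tilde S^0_t \equiv 1$ has zero increments.)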
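The SF property says that, after the initial investment of $V^{\theta}_0$, no further investment inflow or dividend outflow is allowed. In integral form:
\[
\tilde V^{\theta}_t = \tilde V^{\theta}_0 + \int_0^t \boldsymbol{\theta}_s' \, d\tilde{\mathbf{S}}_s = \tilde V^{\theta}_0 + \int_0^t \boldsymbol{\xi}_s' \, d\tilde{\mathbf{S}}_s . \tag{7.10}
\]
A constant portfolio $\boldsymbol{\theta}$ is trivially SF: its discounted value process is $\tilde V^{\theta}_t = \boldsymbol{\theta}' \tilde{\mathbf{S}}_t$, hence (7.9) is satisfied. More generally, for a continuous portfolio we would have $d\tilde V^{\theta}_t = d\boldsymbol{\theta}_t' \, \tilde{\mathbf{S}}_t + \boldsymbol{\theta}_t' \, d\tilde{\mathbf{S}}_t$, and the self-financing condition would be equivalent to the budget constraint $d\boldsymbol{\theta}_t' \, \tilde{\mathbf{S}}_t = 0$, which says that any purchase of assets must be financed by a sale of some other assets. We hasten to say that we shall typically be dealing with portfolios that are not continuous and, in fact, not even right-continuous so that $d\boldsymbol{\theta}_t$ is meaningless (integrals with respect to the process $\boldsymbol{\theta}$ are not well defined).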



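D. Absence of arbitrage.
An SF portfolio $\boldsymbol{\theta}$ is called an arbitrage if, for some $t > 0$,
\[
V^{\theta}_0 < 0 \ \text{ and } \ V^{\theta}_t \geq 0 \ \text{ a.s. } P ,
\]
or, equivalently,
\[
\tilde V^{\theta}_0 < 0 \ \text{ and } \ \tilde V^{\theta}_t \geq 0 \ \text{ a.s. } P .
\]
A basic requirement on a well-functioning market is the absence of arbitrage. The assumption of no arbitrage, which appears very modest, has surprisingly far-reaching consequences as we shall see.

A martingale measure is any probability measure $\tilde P$ that is equivalent to $P$ and such that the discounted asset prices $\tilde{\mathbf{S}}_t$ are martingales $(\mathbf{F}, \tilde P)$. The fundamental theorem of arbitrage pricing says: If there exists a martingale measure, then there is no arbitrage. This result follows from easy calculations starting from (7.10): Forming expectation $\tilde E$ under $\tilde P$ and using the martingale property of $\tilde{\mathbf{S}}$ under $\tilde P$, we find
\[
\tilde E[\tilde V^{\theta}_t] = \tilde V^{\theta}_0 + \tilde E\left[ \int_0^t \boldsymbol{\theta}_s' \, d\tilde{\mathbf{S}}_s \right] = \tilde V^{\theta}_0
\]
(the stochastic integral is a martingale). It is seen that arbitrage is impossible: since $\tilde P \sim P$, having $\tilde V^{\theta}_t \geq 0$ a.s. would give $\tilde E[\tilde V^{\theta}_t] \geq 0$, which contradicts $\tilde V^{\theta}_0 < 0$.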
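We return now to our special Markov chain driven market. Let
\[
\tilde{\mathbf{\Lambda}} = (\tilde\lambda^{jk})
\]
be an infinitesimal matrix that is equivalent to $\mathbf{\Lambda}$ in the sense that $\tilde\lambda^{jk} = 0$ if and only if $\lambda^{jk} = 0$. By Girsanov's theorem, there exists a measure $\tilde P$, equivalent to $P$, under which $Y$ is a Markov chain with infinitesimal matrix $\tilde{\mathbf{\Lambda}}$. Consequently, the processes $\tilde M^{jk}$, $j = 1, \ldots, n$, $k \in \mathcal{Y}^j$, defined by
\[
d\tilde M^{jk}_t = dN^{jk}_t - I^j_t \tilde\lambda^{jk} \, dt , \tag{7.11}
\]
and $\tilde M^{jk}_0 = 0$, are zero mean, mutually orthogonal martingales w.r.t. $(\mathbf{F}^Y, \tilde P)$. Rewrite (7.8) as
\[
d\tilde S^i_t = \tilde S^i_{t-} \left[ \sum_j \left( \alpha^{ij} - r^j + \sum_{k \in \mathcal{Y}^j} \gamma^{ijk} \tilde\lambda^{jk} \right) I^j_t \, dt + \sum_j \sum_{k \in \mathcal{Y}^j} \gamma^{ijk} \, d\tilde M^{jk}_t \right] , \tag{7.12}
\]
$i = 1, \ldots, m$. The discounted stock prices are martingales w.r.t. $(\mathbf{F}^Y, \tilde P)$ if and only if the drift terms on the right vanish, that is,
\[
\alpha^{ij} - r^j + \sum_{k \in \mathcal{Y}^j} \gamma^{ijk} \tilde\lambda^{jk} = 0 , \tag{7.13}
\]
$j = 1, \ldots, n$, $i = 1, \ldots, m$. From general theory it is known that the existence of such an equivalent martingale measure $\tilde P$ implies absence of arbitrage.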
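The relation (7.13) can be cast in matrix form as
\[
r^j \mathbf{1} - \boldsymbol{\alpha}^j = \mathbf{\Gamma}^j \tilde{\boldsymbol{\lambda}}^j , \tag{7.14}
\]
$j = 1, \ldots, n$, where $\mathbf{1}$ is $m \times 1$ and
\[
\boldsymbol{\alpha}^j = (\alpha^{ij})_{i=1,\ldots,m} , \qquad
\mathbf{\Gamma}^j = (\gamma^{ijk})^{k \in \mathcal{Y}^j}_{i=1,\ldots,m} , \qquad
\tilde{\boldsymbol{\lambda}}^j = (\tilde\lambda^{jk})_{k \in \mathcal{Y}^j} .
\]
The existence of an equivalent martingale measure is equivalent to the existence of a solution $\tilde{\boldsymbol{\lambda}}^j$ to (7.14) with all entries strictly positive. Thus, the market is arbitrage-free if (and we can show only if) for each $j$, $r^j \mathbf{1} - \boldsymbol{\alpha}^j$ is in the interior of the convex cone of the columns of $\mathbf{\Gamma}^j$.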
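In the smallest non-trivial market the condition (7.14) can be solved by hand. The sketch below assumes a hypothetical economy with $n = 2$ states and $m = 1$ stock, so that each state has exactly one accessible state and (7.13) is a single scalar equation per state; all parameter values and the function name `martingale_intensity` are illustrative. It solves for the intensity under the martingale measure and rejects parameter sets for which the solution is not strictly positive.

```python
import math


def martingale_intensity(r, alpha, gamma):
    """Solve alpha - r + gamma * lam = 0, i.e. (7.13) with one stock and one accessible state.

    Raises ValueError when no strictly positive solution exists, in which case
    this (hypothetical) one-stock market would not be arbitrage-free.
    """
    lam = (r - alpha) / gamma
    if lam <= 0:
        raise ValueError("no strictly positive martingale intensity")
    return lam


# Hypothetical parameters. State 1: high interest, stock drifts up but drops 10% on leaving.
r1, alpha1 = 0.03, 0.08
gamma12 = math.expm1(math.log(0.9))     # gamma = exp(beta) - 1 with beta = ln 0.9

# State 2: low interest, stock drifts down but gains 15% on leaving.
r2, alpha2 = 0.01, -0.02
gamma21 = math.expm1(math.log(1.15))

lam12 = martingale_intensity(r1, alpha1, gamma12)
lam21 = martingale_intensity(r2, alpha2, gamma21)

# The drift of the discounted stock price in (7.12) now vanishes in both states.
drift1 = alpha1 - r1 + gamma12 * lam12
drift2 = alpha2 - r2 + gamma21 * lam21
print(lam12, lam21, drift1, drift2)
```

The sign pattern matters: a stock whose drift exceeds the interest rate must have a negative jump (and vice versa) for a positive solution to exist, which is the one-dimensional version of the convex cone condition.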
Assume henceforth that the market is arbitrage-free so that (7.12) reduces to
\[
d\tilde S^i_t = \tilde S^i_{t-} \sum_j \sum_{k \in \mathcal{Y}^j} \gamma^{ijk} \, d\tilde M^{jk}_t ,
\]
where the $\tilde M^{jk}$ defined by (7.11) are martingales w.r.t. $(\mathbf{F}^Y, \tilde P)$ for some measure $\tilde P$ that is equivalent to $P$.

Inserting (7.15) into (7.9), we find that $\boldsymbol{\theta}$ is SF if and only if
\[
d\tilde V^{\theta}_t = \sum_j \sum_{k \in \mathcal{Y}^j} \sum_{i=1}^m \xi^i_t \tilde S^i_{t-} \gamma^{ijk} \, d\tilde M^{jk}_t , \tag{7.15}
\]
implying that $\tilde V^{\theta}$ is a martingale w.r.t. $(\mathbf{F}^Y, \tilde P)$ and, in particular,
\[
\tilde V^{\theta}_t = \tilde E[\tilde V^{\theta}_T \,|\, \mathcal{F}_t] .
\]
Here $\tilde E$ denotes expectation under $\tilde P$. (Note that the tilde, which in the first place was introduced to distinguish discounted values from the nominal ones, is also attached to the equivalent martingale measure and certain related entities. This usage is motivated by the fact that the martingale measure arises from the discounted basic price processes, roughly speaking.)
E. Attainability.
A $T$-claim is a contractual payment due at time $T$. Formally, it is an $\mathcal{F}^Y_T$-measurable random variable $H$ with finite expected value. The claim is attainable if it can be perfectly duplicated by some SF portfolio $\boldsymbol{\theta}$, that is,
\[
\tilde V^{\theta}_T = \tilde H .
\]
If an attainable claim should be traded in the market, then its price must at any time be equal to the value of the duplicating portfolio in order to avoid arbitrage. Thus, denoting the price process by $\pi_t$ and, recalling (7.16) and (7.16), we have
\[
\tilde\pi_t = \tilde V^{\theta}_t = \tilde E[\tilde H \,|\, \mathcal{F}_t] , \tag{7.16}
\]
or
\[
\pi_t = \tilde E\left[ e^{-\int_t^T r} H \,\Big|\, \mathcal{F}_t \right] . \tag{7.17}
\]
By (7.16) and (7.15), the dynamics of the discounted price process of an attainable claim is
\[
d\tilde\pi_t = \sum_j \sum_{k \in \mathcal{Y}^j} \sum_{i=1}^m \xi^i_t \tilde S^i_{t-} \gamma^{ijk} \, d\tilde M^{jk}_t . \tag{7.18}
\]
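F. Completeness.
Any $T$-claim $H$ as defined above can be represented as
\[
\tilde H = \tilde E[\tilde H] + \int_0^T \sum_j \sum_{k \in \mathcal{Y}^j} \zeta^{jk}_t \, d\tilde M^{jk}_t , \tag{7.19}
\]
where the $\zeta^{jk}_t$ are $\mathbf{F}^Y$-predictable and integrable processes. Conversely, any random variable of the form (7.19) is, of course, a $T$-claim. By virtue of (7.16) and (7.15), attainability of $H$ means that
\[
\tilde H = \tilde V^{\theta}_0 + \int_0^T d\tilde V^{\theta}_t
 = \tilde V^{\theta}_0 + \int_0^T \sum_j \sum_{k \in \mathcal{Y}^j} \sum_{i=1}^m \xi^i_t \tilde S^i_{t-} \gamma^{ijk} \, d\tilde M^{jk}_t . \tag{7.20}
\]
Comparing (7.19) and (7.20), we see that $H$ is attainable iff there exist predictable processes $\xi^1_t, \ldots, \xi^m_t$ such that
\[
\sum_{i=1}^m \xi^i_t \tilde S^i_{t-} \gamma^{ijk} = \zeta^{jk}_t ,
\]
for all $j$ and $k \in \mathcal{Y}^j$. This means that the $n^j$-vector
\[
\boldsymbol{\zeta}^j_t = (\zeta^{jk}_t)_{k \in \mathcal{Y}^j}
\]
is in $\mathbb{R}(\mathbf{\Gamma}^{j\prime})$.

The market is complete if every $T$-claim is attainable, that is, if every $n^j$-vector is in $\mathbb{R}(\mathbf{\Gamma}^{j\prime})$. This is the case if and only if $\mathrm{rank}(\mathbf{\Gamma}^j) = n^j$, which can be fulfilled for each $j$ only if $m \geq \max_j n^j$.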

7.5

A. Differential equations for the arbitrage-free price.

Assume that the market is arbitrage-free and complete so that prices of $T$-claims are uniquely given by (7.16) or (7.17).

Let us for the time being consider a $T$-claim of the form
\[
H = h(Y_T, S^{\ell}_T) . \tag{7.1}
\]
Examples are a European call option on stock No. $\ell$ defined by $H = (S^{\ell}_T - K)^+$, a caplet defined by $H = (r_T - g)^+ = (r^{Y_T} - g)^+$, and a zero coupon $T$-bond defined by $H = 1$.

For any claim of the form (7.1) the relevant state variables involved in the conditional expectation (7.17) are $t$, $Y_t$, $S^{\ell}_t$, hence $\pi_t$ is of the form
\[
\pi_t = \sum_{j=1}^n I^j_t f^j(t, S^{\ell}_t) , \tag{7.2}
\]
where the
\[
f^j(t, s) = \tilde E\left[ e^{-\int_t^T r} H \,\Big|\, Y_t = j, \; S^{\ell}_t = s \right] \tag{7.3}
\]
are the state-wise price functions.

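The discounted price (7.16) is a martingale w.r.t. $(\mathbf{F}^Y, \tilde P)$. Assume that the functions $f^j(t, s)$ are continuously differentiable. Using Itô on
\[
\tilde\pi_t = e^{-\int_0^t r} \sum_{j=1}^n I^j_t f^j(t, S^{\ell}_t) , \tag{7.4}
\]
we find
\[
\begin{aligned}
d\tilde\pi_t
 &= e^{-\int_0^t r} \sum_j I^j_t \left( -r^j f^j(t, S^{\ell}_t) + \frac{\partial}{\partial t} f^j(t, S^{\ell}_t) + \frac{\partial}{\partial s} f^j(t, S^{\ell}_t) \, S^{\ell}_t \alpha^{\ell j} \right) dt \\
 &\quad + e^{-\int_0^t r} \sum_j \sum_{k \in \mathcal{Y}^j} \left( f^k(t, S^{\ell}_{t-}(1 + \gamma^{\ell jk})) - f^j(t, S^{\ell}_{t-}) \right) dN^{jk}_t \\
 &= e^{-\int_0^t r} \sum_j I^j_t \Big( -r^j f^j(t, S^{\ell}_t) + \frac{\partial}{\partial t} f^j(t, S^{\ell}_t) + \frac{\partial}{\partial s} f^j(t, S^{\ell}_t) \, S^{\ell}_t \alpha^{\ell j} \\
 &\qquad\qquad + \sum_{k \in \mathcal{Y}^j} \{ f^k(t, S^{\ell}_{t-}(1 + \gamma^{\ell jk})) - f^j(t, S^{\ell}_{t-}) \} \tilde\lambda^{jk} \Big) dt \\
 &\quad + e^{-\int_0^t r} \sum_j \sum_{k \in \mathcal{Y}^j} \left( f^k(t, S^{\ell}_{t-}(1 + \gamma^{\ell jk})) - f^j(t, S^{\ell}_{t-}) \right) d\tilde M^{jk}_t .
\end{aligned}
\]
By the martingale property, the drift term must vanish, and we arrive at the non-stochastic partial differential equations
\[
-r^j f^j(t, s) + \frac{\partial}{\partial t} f^j(t, s) + \frac{\partial}{\partial s} f^j(t, s) \, s \alpha^{\ell j}
 + \sum_{k \in \mathcal{Y}^j} \left( f^k(t, s(1 + \gamma^{\ell jk})) - f^j(t, s) \right) \tilde\lambda^{jk} = 0 ,
\]
$j = 1, \ldots, n$, which are to be solved subject to the side conditions
\[
f^j(T, s) = h(j, s) , \tag{7.5}
\]
$j = 1, \ldots, n$.

In matrix form, with
\[
\mathbf{R} = \mathbf{D}_{j=1,\ldots,n}(r^j) , \qquad \mathbf{A}^{\ell} = \mathbf{D}_{j=1,\ldots,n}(\alpha^{\ell j}) ,
\]
and other symbols (hopefully) self-explaining, the differential equations and the side conditions are
\[
-\mathbf{R} \mathbf{f}(t, s) + \frac{\partial}{\partial t} \mathbf{f}(t, s) + s \mathbf{A}^{\ell} \frac{\partial}{\partial s} \mathbf{f}(t, s) + \tilde{\mathbf{\Lambda}} \mathbf{f}(t, s(1 + \boldsymbol{\gamma})) = \mathbf{0} , \tag{7.6}
\]
\[
\mathbf{f}(T, s) = \mathbf{h}(s) . \tag{7.7}
\]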

B. Identifying the strategy.

Once we have determined the solution $f^j(t, s)$, $j = 1, \ldots, n$, the price process is known and given by (7.2).

The duplicating SF strategy can be obtained as follows. Setting the drift term to 0 in (7.5), we find the dynamics of the discounted price;
\[
d\tilde\pi_t = e^{-\int_0^t r} \sum_j \sum_{k \in \mathcal{Y}^j} \left( f^k(t, S^{\ell}_{t-}(1 + \gamma^{\ell jk})) - f^j(t, S^{\ell}_{t-}) \right) d\tilde M^{jk}_t . \tag{7.8}
\]
Identifying the coefficients in (7.8) with those in (7.18), we obtain, for each state $j$, the equations
\[
\sum_{i=1}^m \xi^i_t S^i_{t-} \gamma^{ijk} = f^k(t, S^{\ell}_{t-}(1 + \gamma^{\ell jk})) - f^j(t, S^{\ell}_{t-}) , \tag{7.9}
\]
$k \in \mathcal{Y}^j$. The solution $\boldsymbol{\xi}^j_t = (\xi^{i,j}_t)_{i=1,\ldots,m}$ (say) certainly exists since $\mathrm{rank}(\mathbf{\Gamma}^j) \leq m$, and it is unique iff $\mathrm{rank}(\mathbf{\Gamma}^j) = m$. Furthermore, it is a function of $t$ and $\mathbf{S}_{t-}$ and is thus predictable. This simplistic argument works on the open intervals between the jumps of the process $Y$, where $d\tilde M^{jk}_t = -I^j_t \tilde\lambda^{jk} \, dt$. For the dynamics (7.8) and (7.18) to be the same also at jump times, the coefficients must clearly be left-continuous. We conclude that
\[
\boldsymbol{\xi}_t = \sum_{j=1}^n I^j_{t-} \boldsymbol{\xi}^j_t ,
\]
which is predictable.

Finally, $\eta$ is determined upon combining (7.8), (7.16), and (7.4):
\[
\eta_t = e^{-\int_0^t r} \sum_{j=1}^n \left( I^j_t f^j(t, S^{\ell}_t) - I^j_{t-} \sum_{i=1}^m \xi^{i,j}_t S^i_t \right) .
\]
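C. The Asian option.

As an example of a path-dependent claim, let us consider an Asian option, which essentially is a $T$-claim of the form $H = \left( \int_0^T S^{\ell}_{\tau} \, d\tau - K \right)^+$, where $K \geq 0$. The price process is
\[
\pi_t = \tilde E\left[ e^{-\int_t^T r} \left( \int_0^T S^{\ell}_{\tau} \, d\tau - K \right)^+ \,\Big|\, \mathcal{F}^Y_t \right]
      = \sum_{j=1}^n I^j_t f^j\left( t, S^{\ell}_t, \int_0^t S^{\ell}_{\tau} \, d\tau \right) ,
\]
where
\[
f^j(t, s, u) = \tilde E\left[ e^{-\int_t^T r} \left( \int_t^T S^{\ell}_{\tau} \, d\tau + u - K \right)^+ \,\Big|\, Y_t = j, \; S^{\ell}_t = s \right] .
\]
The discounted price process is
\[
\tilde\pi_t = e^{-\int_0^t r} \sum_{j=1}^n I^j_t f^j\left( t, S^{\ell}_t, \int_0^t S^{\ell}_s \, ds \right) .
\]
We obtain partial differential equations in three variables.

The special case $K = 0$ is simpler, with only two state variables.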
D. Interest rate derivatives.
A particularly simple, but still important, class of claims are those of the form
$H = h(Y_T)$. Interest rate derivatives of the form $H = h(r_T)$ are included since
$r_T = r^{Y_T}$. For such claims the only relevant state variables are t and $Y_t$, so that
the function in (7.3) depends only on t and j. The equation (7.5) reduces to
$$\frac{d}{dt} f_t^j = r^j f_t^j - \sum_{k \in \mathcal{Y}^j} (f_t^k - f_t^j)\,\tilde\lambda^{jk}, \tag{7.10}$$
$$f_T^j = h^j. \tag{7.11}$$
In matrix form,
$$\frac{d}{dt} f_t = (R - \tilde\Lambda)\, f_t,$$
subject to
$$f_T = h.$$
The solution is
$$f_t = \exp\{(\tilde\Lambda - R)(T-t)\}\,h. \tag{7.12}$$
It depends on t and T only through T - t.

In particular, the zero coupon bond with maturity T corresponds to h = 1.
We will henceforth refer to it as the T-bond in short and denote its price process
by p(t, T) and its state-wise price functions by $p(t,T) = (p^j(t,T))_{j=1,\dots,n}$:
$$p(t,T) = \exp\{(\tilde\Lambda - R)(T-t)\}\,1. \tag{7.13}$$
For a call option on a U-bond, exercised at time T (< U) with price K, h
has entries $h^j = (p^j(T,U) - K)^+$.
In (7.12)-(7.13) it may be useful to employ the representation shown in
(7.2),
$$\exp\{(\tilde\Lambda - R)(T-t)\} = \tilde\Phi\, D_{j=1,\dots,n}\!\left(e^{\tilde\rho^j (T-t)}\right) \tilde\Phi^{-1}, \tag{7.14}$$
say.
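As a rough numerical illustration of the state-wise bond price formula (7.13), the following sketch evaluates the matrix exponential with a plain scaling-and-squaring Taylor scheme. The two-state intensity matrix, the interest rates, and all function names are hypothetical choices made for the example only; they do not come from the text.

```python
import math

def mat_mul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(m, terms=20, squarings=8):
    """exp(m) by scaling and squaring with a truncated Taylor series."""
    n = len(m)
    scaled = [[x / 2.0 ** squarings for x in row] for row in m]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in mat_mul(term, scaled)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result

def bond_prices(intensities, rates, time_to_maturity):
    """State-wise prices exp{(Lambda - R)(T - t)} 1, cf. (7.13)."""
    n = len(rates)
    generator = [[(intensities[j][k] - (rates[j] if j == k else 0.0))
                  * time_to_maturity for k in range(n)] for j in range(n)]
    # Multiplying by the vector of ones amounts to taking row sums.
    return [sum(row) for row in mat_exp(generator)]

# Hypothetical example: state 0 has a low rate, state 1 a high rate.
LAMBDA = [[-0.5, 0.5], [0.3, -0.3]]
RATES = [0.02, 0.06]
prices = bond_prices(LAMBDA, RATES, 5.0)
```

With equal rates in all states the chain plays no role and the price collapses to the deterministic discount factor, which gives a simple sanity check.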

7.6

Numerical procedures

A. Simulation.
The homogeneous Markov process $\{Y_t\}_{t \in [0,T]}$ is simulated as follows: Let K be
the number of transitions between states in [0, T], and let $T_1, \dots, T_K$ be the
successive times of transition. The sequence $\{(T_n, Y_{T_n})\}_{n=0,\dots,K}$ is generated
recursively, starting from the initial state $Y_0$ at time $T_0 = 0$, as follows. Having arrived at $T_n$ and $Y_{T_n}$, generate the next waiting time $T_{n+1} - T_n$ as an
exponential variate with parameter $\tilde\lambda^{Y_{T_n}\cdot}$ (e.g. $-\ln(U_n)/\tilde\lambda^{Y_{T_n}\cdot}$, where $U_n$ has a
uniform distribution over [0, 1]), and let the new state $Y_{T_{n+1}}$ be k with probability $\tilde\lambda^{Y_{T_n} k}/\tilde\lambda^{Y_{T_n}\cdot}$. Continue in this manner K + 1 times until $T_K < T \le T_{K+1}$.
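The recursive recipe above can be sketched in a few lines. This is only an illustration under assumptions: states are labelled 0, ..., n-1, the intensity matrix is passed as a nested list whose diagonal is ignored, and the function name and example intensities are hypothetical.

```python
import math
import random

def simulate_chain(intensities, initial_state, horizon, rng):
    """Simulate jump times and states of a homogeneous Markov chain on [0, horizon]."""
    times, states = [0.0], [initial_state]
    t, j = 0.0, initial_state
    while True:
        # Transitions that are possible from the current state j.
        out = [(k, intensities[j][k]) for k in range(len(intensities))
               if k != j and intensities[j][k] > 0.0]
        total = sum(rate for _, rate in out)
        if total == 0.0:          # absorbing state: no further jumps
            break
        # Exponential waiting time; 1 - U lies in (0, 1], so log is finite.
        t += -math.log(1.0 - rng.random()) / total
        if t >= horizon:          # next jump falls beyond the horizon
            break
        # Pick the new state k with probability rate(j, k) / total.
        u, acc = rng.random() * total, 0.0
        for k, rate in out:
            acc += rate
            if u < acc:
                break
        j = k
        times.append(t)
        states.append(j)
    return times, states

# Hypothetical three-state example with a fixed seed for reproducibility.
EXAMPLE = [[0.0, 0.4, 0.1], [0.3, 0.0, 0.2], [0.5, 0.5, 0.0]]
jump_times, visited = simulate_chain(EXAMPLE, 0, 10.0, random.Random(1))
```

Averaging a discounted payoff over many such paths gives a Monte Carlo estimate of a price, provided the intensities used are those of the martingale measure.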
B. Numerical solution of differential equations.
Alternatively, the differential equations must be solved numerically. For interest
rate derivatives, which involve only ordinary first order differential equations,
a Runge-Kutta scheme will do. For stock derivatives, which involve partial first order
differential equations, one must employ a suitable finite difference method, see
e.g. [28].
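For the ordinary differential equations (7.10) with terminal condition (7.11), a classical fourth-order Runge-Kutta scheme run backwards from T might look as follows. The step count, the two-state intensities, the rates, and the floorlet strike 0.04 are hypothetical example values, and the function names are not taken from the text.

```python
import math

def rhs(f, intensities, rates):
    """Right-hand side of (7.10): r^j f^j - sum_{k != j} (f^k - f^j) lambda^{jk}."""
    n = len(f)
    return [rates[j] * f[j]
            - sum((f[k] - f[j]) * intensities[j][k] for k in range(n) if k != j)
            for j in range(n)]

def solve_backward(h, intensities, rates, horizon, steps=500):
    """Integrate (7.10) from t = horizon (where f = h) down to t = 0 with RK4."""
    dt = -horizon / steps          # negative step: we move backwards in time
    f = list(h)
    for _ in range(steps):
        k1 = rhs(f, intensities, rates)
        k2 = rhs([x + 0.5 * dt * k for x, k in zip(f, k1)], intensities, rates)
        k3 = rhs([x + 0.5 * dt * k for x, k in zip(f, k2)], intensities, rates)
        k4 = rhs([x + dt * k for x, k in zip(f, k3)], intensities, rates)
        f = [x + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
             for x, a, b, c, d in zip(f, k1, k2, k3, k4)]
    return f

LAMBDA = [[0.0, 0.5], [0.3, 0.0]]   # off-diagonal intensities only
RATES = [0.02, 0.06]
bond = solve_backward([1.0, 1.0], LAMBDA, RATES, 5.0)
floorlet = solve_backward([max(0.04 - r, 0.0) for r in RATES], LAMBDA, RATES, 5.0)
```

The same routine handles any claim of the form h(Y_T); only the terminal vector changes.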

7.7

Risk minimization in incomplete markets

A. Incompleteness.
The notion of incompleteness pertains to situations where a contingent claim
cannot be duplicated by an SF portfolio and, consequently, does not receive a
unique price from the no arbitrage postulate alone.
In Paragraph 7.4F we were dealing implicitly with incompleteness arising
from a scarcity of traded assets, that is, the discounted basic price processes
are incapable of spanning the space of all martingales w.r.t. $(\mathbf{F}^Y, \tilde P)$ and, in
particular, reproducing the value (7.19) of every financial derivative (function
of the basic asset prices).

CHAPTER 7. FINANCIAL MATHEMATICS IN INSURANCE


Incompleteness also arises when the contingent claim is not a purely financial derivative, that is, its value depends also on circumstances external to the
financial market. We have in mind insurance claims that are caused by events
like death or fire and whose claim amounts are e.g. inflation adjusted or linked
to the value of some investment portfolio.
In the latter case we need to work in an extended model specifying a basic
probability space with a filtration $\mathbf{F} = \{\mathcal{F}_t\}_{t \ge 0}$ containing $\mathbf{F}^Y$ and satisfying the
usual conditions. Typically it will be the natural filtration of Y and some other
process that generates the insurance events. The definitions and conditions laid
down in Paragraphs 7.4C-E are modified accordingly, so that adaptedness of $\eta$
and predictability of $\theta$ are taken to be w.r.t. $(\mathbf{F}, P)$ (keeping the symbol P for
the basic probability measure), a T-claim H is $\mathcal{F}_T$ measurable, etc.
B. Risk minimization.
Throughout the remainder of the paper we will mainly be working with discounted prices and values without any other mention than the notational tilde.
The reason is that the theory of risk minimization rests on certain martingale
representation results that apply to discounted prices under a martingale measure. We will be content to give just a sketchy review of some main concepts
and results from the seminal paper of Föllmer and Sondermann [11].
Let $\tilde H$ be a T-claim that is not attainable. This means that an admissible
portfolio satisfying
$$\tilde V_T = \tilde H$$
cannot be SF. The cost, $\tilde C_t$, of the portfolio by time t is defined as that part of
the value that has not been gained from trading:
$$\tilde C_t = \tilde V_t - \int_0^t \theta_\tau'\, d\tilde S_\tau.$$
The risk at time t is defined as the mean squared outstanding cost,
$$\tilde R_t = \tilde E\left[ (\tilde C_T - \tilde C_t)^2 \,\big|\, \mathcal{F}_t \right]. \tag{7.1}$$
By definition, the risk of an admissible portfolio is
$$\tilde R_t = \tilde E\left[ \left( \tilde H - \tilde V_t - \int_t^T \theta_\tau'\, d\tilde S_\tau \right)^2 \,\Big|\, \mathcal{F}_t \right],$$
which is a measure of how well the current value of the portfolio plus future
trading gains approximates the claim. The theory of risk minimization takes this
entity as its objective function and proves the existence of an optimal admissible
portfolio that minimizes the risk (7.1) for all $t \in [0, T]$. The proof is constructive
and provides a recipe for how to actually determine the optimal portfolio.
One sets out by defining the intrinsic value of $\tilde H$ at time t as
$$\tilde V_t^H = \tilde E\left[ \tilde H \mid \mathcal{F}_t \right].$$


Thus, the intrinsic value process is the martingale that represents the natural
current forecast of the claim under the chosen martingale measure. By the
Galchouk-Kunita-Watanabe representation, it decomposes uniquely as
$$\tilde V_t^H = \tilde E[\tilde H] + \int_0^t \theta_\tau^{H\prime}\, d\tilde S_\tau + L_t^H,$$
where $L^H$ is a martingale w.r.t. $(\mathbf{F}, \tilde P)$ which is orthogonal to $\tilde S$.
The portfolio $\theta^H$ defined by this decomposition minimizes the risk process among all
admissible strategies. The minimum risk is
$$\tilde R_t^H = \tilde E\left[ \int_t^T d\langle L^H \rangle_\tau \,\Big|\, \mathcal{F}_t \right].$$

As the name suggests, a life insurance product is said to be unit-linked if the
benefit is a certain predetermined number of units of an asset (or portfolio)
into which the premiums are currently invested. If the contract stipulates a
minimum value of the benefit, disconnected from the asset price, then one speaks
of unit-linked insurance with guarantee. A risk minimization approach to pricing
and hedging of unit-linked insurance claims was first taken by Møller [20], who
worked with the Black-Scholes-Merton financial market. We will here sketch
how the analysis goes in our Markov chain market, which conforms well with
the life history process in that they both are intensity-driven.
Let $T_x$ be the remaining lifetime of an x-year-old who purchases an insurance policy at time 0, say. The conditional probability of survival to age x + u, given
survival to age x + t ($0 \le t < u$), is
$${}_{u-t}p_{x+t} = e^{-\int_t^u \mu_{x+s}\,ds}, \tag{7.2}$$
where $\mu_y$ is the mortality intensity at age y. We have
$$d\,{}_{u-t}p_{x+t} = {}_{u-t}p_{x+t}\,\mu_{x+t}\,dt. \tag{7.3}$$
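To make (7.2) concrete, the sketch below evaluates the survival probability under a Gompertz-Makeham intensity $\mu_y = a + b\,c^y$, once by Simpson's rule and once in closed form. The parameter values are hypothetical and chosen only for illustration; the function names are likewise assumptions.

```python
import math

# Hypothetical Gompertz-Makeham parameters: mu_y = A + B * C**y.
A, B, C = 5e-4, 7.5858e-5, 1.09144

def mu(age):
    """Mortality intensity at the given age."""
    return A + B * C ** age

def survival(x, t, u, steps=200):
    """exp(-int_t^u mu_{x+s} ds) by Simpson's rule (steps must be even)."""
    h = (u - t) / steps
    total = mu(x + t) + mu(x + u)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * mu(x + t + i * h)
    return math.exp(-total * h / 3.0)

def survival_closed_form(x, t, u):
    """Same probability using the explicit integral of A + B * C**(x+s)."""
    integral = A * (u - t) + B * C ** x * (C ** u - C ** t) / math.log(C)
    return math.exp(-integral)

p_30_year_old = survival(30, 5, 35)   # survive from age 35 to age 65
```

The multiplicative property of survival probabilities over adjacent intervals, which underlies (7.3), can be checked numerically with the same functions.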

Introduce the indicator of survival to age x + t,
$$I_t = 1[T_x > t],$$
and the indicator of death before time t,
$$N_t = 1[T_x \le t] = 1 - I_t.$$
The process $N_t$ is a (very simple) counting process with intensity $I_t\,\mu_{x+t}$, that
is, M given by
$$dM_t = dN_t - I_t\,\mu_{x+t}\,dt \tag{7.4}$$


is a martingale w.r.t. $(\mathbf{F}, P)$. Assume that the life time $T_x$ is independent of the
economy Y. We will work with the martingale measure $\tilde P$ obtained by replacing
the intensity matrix $\Lambda$ of Y with the martingalizing $\tilde\Lambda$ and leaving the rest of
the model unaltered.
Consider a unit-linked pure endowment benefit payable at a fixed time T,
contingent on survival of the insured, with sum insured equal to one unit of
stock No. :, but guaranteed no less than a fixed amount g. This benefit is a
contingent T-claim,
$$H = (S_T \vee g)\, I_T.$$
The single premium payable as a lump sum at time 0 is to be determined.
Let us assume that the financial market is complete so that every purely
financial derivative has a unique price process. Then the intrinsic value of H at
time t is
$$\tilde V_t^H = \tilde\pi_t\, I_t\, {}_{T-t}p_{x+t},$$
where $\tilde\pi_t$ is the discounted price process of the derivative $S_T \vee g$.
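Using Itô and inserting (7.4), we find
$$\begin{aligned}
d\tilde V_t^H &= d\tilde\pi_t\, I_{t-}\, {}_{T-t}p_{x+t} + \tilde\pi_{t-}\, I_{t-}\, {}_{T-t}p_{x+t}\,\mu_{x+t}\,dt + (0 - \tilde\pi_{t-}\, {}_{T-t}p_{x+t})\,dN_t \\
&= d\tilde\pi_t\, I_{t-}\, {}_{T-t}p_{x+t} - \tilde\pi_{t-}\, {}_{T-t}p_{x+t}\,dM_t.
\end{aligned}$$
It is seen that the optimal trading strategy is that of the price process of the
sum insured multiplied with the conditional probability that the sum will be
paid out, and that
$$dL_t^H = -{}_{T-t}p_{x+t}\,\tilde\pi_{t-}\,dM_t.$$
Consequently,
$$\begin{aligned}
\tilde R_t^H &= \int_t^T {}_{T-s}p_{x+s}^2\; \tilde E\left[\tilde\pi_s^2 \mid \mathcal{F}_t\right]\, {}_{s-t}p_{x+t}\,\mu_{x+s}\,ds \\
&= {}_{T-t}p_{x+t} \int_t^T \tilde E\left[\tilde\pi_s^2 \mid \mathcal{F}_t\right]\, {}_{T-s}p_{x+s}\,\mu_{x+s}\,ds.
\end{aligned}$$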

7.8

A. A finite zero coupon bond market.

Suppose an agent faces a contingent T-claim and is allowed to invest only in
the bank account and a finite number m of zero coupon bonds with maturities
$T_i$, i = 1, . . . , m, all post time T. For instance, regulatory constraints may be
imposed on the investment strategies of an insurance company. The question
is, to what extent can the claim be hedged by self-financed trading in these
available assets?
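An allowed SF portfolio has discounted value process $\tilde V_t$ of the form
$$d\tilde V_t = \sum_{i=1}^m \theta_t^i \sum_j \sum_{k \in \mathcal{Y}^j} \left(\tilde p^k(t, T_i) - \tilde p^j(t, T_i)\right) d\tilde M_t^{jk} = \sum_j d\tilde{\mathbf{M}}_t^j\, \mathbf{F}_t^j\, \theta_t,$$
where $\theta$ is predictable, $\tilde{\mathbf{M}}_t^j = (\tilde M_t^{jk})_{k \in \mathcal{Y}^j}$ is the $n^j$-dimensional row vector
comprising the non-null entries in the j-th row of $\tilde{\mathbf{M}}_t = (\tilde M_t^{jk})$, and
$$\mathbf{F}_t^j = \mathbf{Y}^j \mathbf{F}_t,$$
where
$$\mathbf{F}_t = \left(\tilde p^j(t, T_i)\right)^{i=1,\dots,m}_{j=1,\dots,n} = \left(\tilde p(t, T_1), \dots, \tilde p(t, T_m)\right), \tag{7.1}$$
and $\mathbf{Y}^j$ is the $n^j \times n$ matrix which maps $\mathbf{F}_t$ to $\left(\tilde p^k(t, T_i) - \tilde p^j(t, T_i)\right)^{i=1,\dots,m}_{k \in \mathcal{Y}^j}$. If
e.g. $\mathcal{Y}^n = \{1, \dots, p\}$, then $\mathbf{Y}^n = \left(I^{p \times p},\ 0^{p \times (n-p-1)},\ -1^{p \times 1}\right)$.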
The sub-market consisting of the bank account and the m zero coupon bonds
is complete in respect of T-claims iff the discounted bond prices span the space
of all martingales w.r.t. $(\mathbf{F}^Y, \tilde P)$ over the time interval [0, T]. This is the case iff,
for each j, $\mathrm{rank}(\mathbf{F}_t^j) = n^j$. Now, since $\mathbf{Y}^j$ obviously has full rank $n^j$, the rank
of $\mathbf{F}_t^j$ is determined by the rank of $\mathbf{F}_t$ in (7.1). We will argue that, typically, $\mathbf{F}_t$
has full rank. Thus, suppose $c = (c_1, \dots, c_m)'$ is such that
$$\mathbf{F}_t\, c = 0^{n \times 1}.$$
Recalling (7.13), this is the same as
$$\sum_{i=1}^m c_i \exp\{(\tilde\Lambda - R)T_i\}\,1 = 0,$$
or, by (7.14) and since $\tilde\Phi$ has full rank,
$$D_{j=1,\dots,n}\!\left(\sum_{i=1}^m c_i\, e^{\tilde\rho^j T_i}\right) \tilde\Phi^{-1} 1 = 0. \tag{7.2}$$
Since $\tilde\Phi^{-1}$ has full rank, the entries of the vector $\tilde\Phi^{-1} 1$ cannot be all null.
Typically all entries are non-null, and we assume this is the case. Then (7.2) is
equivalent to
$$\sum_{i=1}^m c_i\, e^{\tilde\rho^j T_i} = 0, \quad j = 1, \dots, n. \tag{7.3}$$
Using the fact that the generalized Vandermonde matrix has full rank, we know
that (7.3) has a non-null solution c if and only if the number of distinct eigenvalues $\tilde\rho^j$ is less than m, see Section 7.9 below.
In the case where $\mathrm{rank}(\mathbf{F}_t^j) < n^j$ for some j we would like to determine the
Galchouk-Kunita-Watanabe decomposition for a given $\mathcal{F}_T^Y$-claim. The intrinsic
value process has dynamics
$$d\tilde H_t = \sum_j \sum_{k \in \mathcal{Y}^j} \xi_t^{jk}\, d\tilde M_t^{jk} = \sum_j d\tilde{\mathbf{M}}_t^j\, \xi_t^j. \tag{7.4}$$

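We seek a decomposition of the form
$$\begin{aligned}
d\tilde V_t &= \sum_i \theta_t^i\, d\tilde p(t, T_i) + \sum_j \sum_{k \in \mathcal{Y}^j} \psi_t^{jk}\, d\tilde M_t^{jk} \\
&= \sum_j \sum_{k \in \mathcal{Y}^j} \sum_i \theta_t^i \left(\tilde p^k(t, T_i) - \tilde p^j(t, T_i)\right) d\tilde M_t^{jk} + \sum_j \sum_{k \in \mathcal{Y}^j} \psi_t^{jk}\, d\tilde M_t^{jk} \\
&= \sum_j d\tilde{\mathbf{M}}_t^j\, \mathbf{F}_t^j\, \theta_t^j + \sum_j d\tilde{\mathbf{M}}_t^j\, \psi_t^j,
\end{aligned}$$
such that the two martingales on the right hand side are orthogonal, that is,
$$\sum_j I_{t-}^j\, (\mathbf{F}_t^j \theta_t^j)'\, \tilde\Lambda^j\, \psi_t^j = 0,$$
where $\tilde\Lambda^j = D(\tilde\lambda^j)$. This means that, for each j, the vector $\xi_t^j$ in (7.4) is to
be decomposed into its $\langle\,\cdot\,,\cdot\,\rangle_{\tilde\Lambda^j}$ projections onto $\mathcal{R}(\mathbf{F}_t^j)$ and its orthocomplement.
From (7.3) and (7.4) we obtain
$$\mathbf{F}_t^j \theta_t^j = \mathbf{P}_t^j \xi_t^j,$$
where
$$\mathbf{P}_t^j = \mathbf{F}_t^j \left(\mathbf{F}_t^{j\prime}\, \tilde\Lambda^j\, \mathbf{F}_t^j\right)^{-1} \mathbf{F}_t^{j\prime}\, \tilde\Lambda^j,$$
hence
$$\theta_t^j = \left(\mathbf{F}_t^{j\prime}\, \tilde\Lambda^j\, \mathbf{F}_t^j\right)^{-1} \mathbf{F}_t^{j\prime}\, \tilde\Lambda^j\, \xi_t^j. \tag{7.5}$$
Furthermore,
$$\psi_t^j = (\mathbf{I} - \mathbf{P}_t^j)\, \xi_t^j, \tag{7.6}$$
and the risk is
$$\int_t^T \sum_j \tilde p_{s-t}^{Y_t j} \sum_{k \in \mathcal{Y}^j} \tilde\lambda^{jk}\, (\psi_s^{jk})^2\, ds. \tag{7.7}$$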

kY j

The computation goes as follows: The coefficients $\xi^{jk}$ involved in the intrinsic value process (7.4) and the state-wise prices $p^j(t, T_i)$ of the $T_i$-bonds are
obtained by simultaneously solving (7.5) and (7.10), starting from (7.7) and
(7.10), respectively, and at each step computing the optimal trading strategy
$\theta$ by (7.5) and the $\psi$ from (7.6), and adding the step-wise contribution to the
variance (7.7) (the step-length times the current value of the integrand).
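For a single time step and a single state j, the projection step described above reduces to an intensity-weighted least-squares fit. The sketch below treats the simplest case of one traded bond, so that the matrix inverse becomes a scalar division. All names and numbers are hypothetical illustrations: `f` stands for the bond price jumps from state j to each accessible state k, `xi` for the corresponding jumps of the intrinsic value, and `lam` for the transition intensities.

```python
def project(f, xi, lam):
    """Weighted projection of xi onto the span of f; returns (holding, residual)."""
    numerator = sum(l * a * b for l, a, b in zip(lam, f, xi))
    denominator = sum(l * a * a for l, a in zip(lam, f))
    theta = numerator / denominator          # units of the bond to hold
    psi = [b - a * theta for a, b in zip(f, xi)]  # unhedgeable part
    return theta, psi

def risk_rate(psi, lam):
    """Integrand of the risk: sum over k of intensity times squared residual."""
    return sum(l * p * p for l, p in zip(lam, psi))

# Hypothetical numbers for one state with two accessible neighbours.
LAM = [0.5, 0.2]
BOND_JUMPS = [-0.03, 0.05]
CLAIM_JUMPS = [0.01, -0.004]
theta, psi = project(BOND_JUMPS, CLAIM_JUMPS, LAM)
step_risk = risk_rate(psi, LAM) * 0.01       # contribution of a step of length 0.01
```

The residual is orthogonal to the bond jumps in the intensity-weighted inner product, and it vanishes when the claim jumps are an exact multiple of the bond jumps, in which case the step carries no risk.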
B. First example: The floorlet.
For a simple example, consider a floorlet $H = (r^* - r_T)^+$, where $T < \min_i T_i$.
The motivation could be that at time T the insurance company will ascribe
interest to the insured's account at current interest rate, but not less than a
prefixed guaranteed rate $r^*$. Then H is the amount that must be provided per
unit on deposit and per time unit at time T.
Computation goes by the scheme described above, with the $\xi_t^{jk} = f_t^k - f_t^j$
obtained from (7.10) subject to (7.11) with $h^j = (r^* - r^j)^+$.
C. Second example: The interest guarantee in insurance.
A more practically relevant example is an interest rate guarantee on a life insurance policy. Premiums and reserves are calculated on the basis of a prudent
so-called first order assumption, stating that the interest rate will be at some
fixed (low) level $r^*$ throughout the term of the insurance contract. Denote the
corresponding first order reserve at time t by $V_t^*$. The (portfolio-wide) mean
surplus created by the first order assumption in the time interval [t, t + dt) is
$(r^* - r_t)^+\, {}_tp_x\, V_t^*\, dt$. This surplus is currently credited to the account of the insured as dividend, and the total amount of dividends is paid out to the insured
at the term of the contracts at time T. Negative dividends are not permitted,
however, so at time T the insurer must cover
$$H = \int_0^T e^{\int_s^T r}\, (r^* - r_s)^+\, {}_sp_x\, V_s^*\, ds.$$
The intrinsic value of this claim is
$$\begin{aligned}
\tilde H_t &= \tilde E\left[ \int_0^T e^{-\int_0^s r}\, (r^* - r_s)^+\, {}_sp_x\, V_s^*\, ds \,\Big|\, \mathcal{F}_t \right] \\
&= \int_0^t e^{-\int_0^s r}\, (r^* - r_s)^+\, {}_sp_x\, V_s^*\, ds + e^{-\int_0^t r} \sum_j I_t^j f_t^j,
\end{aligned}$$
where the $f_t^j$ are the state-wise expected values of future guarantees, discounted
at time t,
$$f_t^j = \tilde E\left[ \int_t^T e^{-\int_t^s r}\, (r^* - r_s)^+\, {}_sp_x\, V_s^*\, ds \,\Big|\, Y_t = j \right].$$
Working along the lines of Section 7.5, we determine the $f_t^j$ by solving
$$\frac{d}{dt} f_t^j = -(r^* - r^j)^+\, {}_tp_x\, V_t^* + r^j f_t^j - \sum_{k \in \mathcal{Y}^j} (f_t^k - f_t^j)\, \tilde\lambda^{jk}, \tag{7.8}$$
subject to
$$f_T^j = 0.$$
The intrinsic value has dynamics (7.4) with $\xi_t^{jk} = f_t^k - f_t^j$.
From here we proceed as described in Paragraph A.

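D. Computing the risk.

Constructive differential equations may be put up for the risk. As a simple
example, for an interest rate derivative the state-wise risk is
$$\tilde R_t^j = \int_t^T \sum_g \tilde p_{\tau - t}^{jg} \sum_{k; k \ne g} \tilde\lambda^{gk} \left(\psi_\tau^{gk}\right)^2 d\tau.$$
Differentiating this expression, we find
$$\frac{d}{dt} \tilde R_t^j = -\sum_{k; k \ne j} \tilde\lambda^{jk} \left(\psi_t^{jk}\right)^2 + \int_t^T \sum_g \frac{d}{dt}\, \tilde p_{\tau - t}^{jg} \sum_{k; k \ne g} \tilde\lambda^{gk} \left(\psi_\tau^{gk}\right)^2 d\tau,$$
and, using the backward equations
$$\frac{d}{dt}\, \tilde p_{\tau - t}^{jg} = -\sum_{h; h \ne j} \tilde\lambda^{jh}\, \tilde p_{\tau - t}^{hg} + \tilde\lambda^{j\cdot}\, \tilde p_{\tau - t}^{jg},$$
we arrive at
$$\frac{d}{dt} \tilde R_t^j = -\sum_{k; k \ne j} \tilde\lambda^{jk} \left(\psi_t^{jk}\right)^2 - \sum_{k; k \ne j} \tilde\lambda^{jk}\, \tilde R_t^k + \tilde\lambda^{j\cdot}\, \tilde R_t^j.$$

7.9
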

A. The Vandermonde matrix.

Let $A_n$ denote the generic n × n matrix of the form
$$A_n = \left(e^{\alpha_i \beta_j}\right)^{j=1,\dots,n}_{i=1,\dots,n}, \tag{7.1}$$
where $\alpha_1, \dots, \alpha_n$ and $\beta_1, \dots, \beta_n$ are reals. This is a classic in matrix theory,
known as the generalized Vandermonde matrix (usually its elements are written
in the form $x_i^{\beta_j}$ with $x_i > 0$). It is well known that it is non-singular iff all $\alpha_i$
are different and all $\beta_j$ are different, see Gantmacher [12] p. 87.
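The non-singularity statement can be probed numerically on small cases. The sketch below builds the matrix with entries $e^{\alpha_i \beta_j}$ and computes its determinant by Gaussian elimination with partial pivoting. The sample values of the exponents are hypothetical, and a numerical check of this kind only illustrates the result; it proves nothing.

```python
import math

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-14:   # numerically singular
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det                   # a row swap flips the sign
        det *= a[col][col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
    return det

def generalized_vandermonde(alphas, betas):
    """Matrix with entry exp(alpha_i * beta_j) in row i, column j."""
    return [[math.exp(alpha * beta) for beta in betas] for alpha in alphas]

distinct = determinant(generalized_vandermonde([0.5, 1.0, 2.0], [0.3, 1.0, 1.7]))
repeated = determinant(generalized_vandermonde([0.5, 0.5, 2.0], [0.3, 1.0, 1.7]))
```

With distinct, increasingly ordered exponents the determinant comes out strictly positive, whereas a repeated exponent produces two identical rows and a vanishing determinant.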
B. Purpose of the study.
The matrix $A_n$ in (7.1) and its close relative
$$A_n - 1_n 1_n' = \left(e^{\alpha_i \beta_j} - 1\right)^{j=1,\dots,n}_{i=1,\dots,n}, \tag{7.2}$$
arise naturally in zero coupon bond prices based on spot interest rates driven by
certain homogeneous Markov processes. It turns out that, in such bond markets,
the issue of completeness is closely related to the rank of the two archetype
matrices. Roughly speaking, non-singularity of matrices of types (7.1) or (7.2)
ensures that any simple T-claim can be duplicated by a portfolio consisting of
the risk-free bank account and a sufficiently large number of zero coupon bonds.
The non-singularity results are proved in Section 7.10, and applications to bond
markets are presented in Section 7.11.

7.10


A. The main result.

We take the opportunity here to provide a short proof of the quoted result on
non-singularity of the Vandermonde matrix in (7.1), and will supply a similar
result about its relative defined in (7.2).
Theorem
(i) If the $\alpha_i$ are all different and the $\beta_j$ are all different, then $A_n$ is non-singular.
(ii) If, furthermore, the $\alpha_i$ and the $\beta_j$ are all different from 0, then $A_n - 1_n 1_n'$
is non-singular.

Proof: The proof goes by induction. Let $H_n$ be the hypothesis stated in the
two items of the theorem. Trivially, $H_1$ is true. Assuming that $H_{n-1}$ is true, we
need to prove $H_n$.
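Addressing first item (i) of the hypothesis, it suffices to prove that
$\det(A_n) \ne 0$. Recast this determinant as
$$\begin{aligned}
\det(A_n) &= \prod_{j=1}^n e^{\alpha_n \beta_j}\, \det \begin{pmatrix} e^{(\alpha_1 - \alpha_n)\beta_1} & \cdots & e^{(\alpha_1 - \alpha_n)\beta_n} \\ \vdots & & \vdots \\ e^{(\alpha_{n-1} - \alpha_n)\beta_1} & \cdots & e^{(\alpha_{n-1} - \alpha_n)\beta_n} \\ 1 & \cdots & 1 \end{pmatrix} \\
&= \prod_{j=1}^n e^{\alpha_n \beta_j} \prod_{i=1}^{n-1} e^{(\alpha_i - \alpha_n)\beta_n}\, \det \begin{pmatrix} \tilde A_{n-1} & 1_{n-1} \\ 1_{n-1}' & 1 \end{pmatrix},
\end{aligned}$$
where
$$\tilde A_{n-1} = \left(e^{(\alpha_i - \alpha_n)(\beta_j - \beta_n)}\right)^{j=1,\dots,n-1}_{i=1,\dots,n-1}. \tag{7.1}$$
The determinant appearing in (7.1) remains unchanged upon subtracting the
n-th row of the matrix from all other rows, which gives
$$\det \begin{pmatrix} \tilde A_{n-1} & 1_{n-1} \\ 1_{n-1}' & 1 \end{pmatrix} = \det \begin{pmatrix} \tilde A_{n-1} - 1_{n-1} 1_{n-1}' & 0_{n-1} \\ 1_{n-1}' & 1 \end{pmatrix} = \det\left(\tilde A_{n-1} - 1_{n-1} 1_{n-1}'\right).$$
Now, since the $\alpha_i$ are all different and also the $\beta_j$ are all different, the matrix
$\tilde A_{n-1}$ in (7.1) is of the form required in item (ii) of the theorem and so, by the
assumed hypothesis $H_{n-1}$, $\det(\tilde A_{n-1} - 1_{n-1} 1_{n-1}') \ne 0$. It follows from (7.1)
and (7.2) that $\det(A_n) \ne 0$, hence item (i) of $H_n$ holds true.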

Next, we turn to item (ii) of $H_n$. Preparing for an ad absurdum argument,
assume that $A_n$ is as specified in item (ii) of the theorem and that $A_n - 1_n 1_n'$ is
singular. Then there exists a vector $c = (c_1, \dots, c_n)' \ne 0_n$ such that
$$A_n c = 1_n 1_n' c. \tag{7.2}$$
Introducing the function
$$f(\alpha) = \sum_{j=1}^n c_j\, e^{\alpha \beta_j},$$
and putting $\alpha_0 = 0$, we can spell out (7.2) as
$$f(\alpha_0) = f(\alpha_1) = \cdots = f(\alpha_n), \tag{7.3}$$
that is, f assumes the same value at n + 1 distinct values of $\alpha$. Since f is
continuously differentiable, Rolle's theorem implies that the derivative $f'$ of f
is 0 at n distinct values $\alpha_1^*, \dots, \alpha_n^*$ (say) of $\alpha$. Now,
$$f'(\alpha) = \sum_{j=1}^n c_j \beta_j\, e^{\alpha \beta_j},$$
and since some $c_j$ are different from 0 and all $\beta_j$ are different from 0, it follows
that the matrix $A_n^* = \left(e^{\alpha_i^* \beta_j}\right)^{j=1,\dots,n}_{i=1,\dots,n}$ should be singular. This contradicts the
previously established item (i) under $H_n$, showing that the assumed singularity
of $A_n - 1_n 1_n'$ is absurd. We conclude that also item (ii) of $H_n$ holds true. $\square$
B. Remarks.
In fact, if $\alpha_1 < \cdots < \alpha_n$ and $\beta_1 < \cdots < \beta_n$, then $\det(A_n) > 0$ (see [12]). If we
take this fact for granted, (7.1) and (7.2) show that also $\det(A_n - 1_n 1_n') > 0$,
implying that the latter is non-singular under the hypothesis of item (ii) in the
theorem. The sign of a general Vandermonde determinant is, of course, the
product of the signs of the row and column permutations needed to order the
$\alpha_i$ and the $\beta_j$ by their size.

7.11

Applications to finance

A. Zero coupon bond prices.

A zero coupon bond with maturity T, or just T-bond in short, is the simple
contingent claim of 1 at time T. Taking an arbitrage-free financial market for
granted, the price process $\{p(t, T)\}_{t \in [0,T]}$ of the T-bond is
$$p(t, T) = \tilde E\left[ e^{-\int_t^T r_u\,du} \,\Big|\, \mathcal{F}_t \right], \tag{7.1}$$
where $\tilde E$ denotes expectation under some martingale measure, and $\mathcal{F}_t$ is the
information available at time t.


We will provide some examples where the results in Section 7.10 are instrumental for establishing linear independence of price processes of bonds with
different maturities. The issue is non-trivial only in cases where the bond prices
are governed by more than one source of randomness, of course, so we have
to look into cases where the spot rate of interest is driven by more than one
martingale.
B. Markov chain interest rate.
Referring to Chapter 5, let us model the spot rate of interest $\{r_t\}_{t \ge 0}$ as a
continuous time, homogeneous, recurrent Markov chain with finite state space
$\{r^1, \dots, r^n\}$.
We are working under some martingale measure given by an infinitesimal
matrix $\tilde\Lambda = (\tilde\lambda^{jk})$ of the Markov chain, that is, the transition intensities are $\tilde\lambda^{jk}$,
$j \ne k$, and $\tilde\lambda^{jj} = -\sum_{k; k \ne j} \tilde\lambda^{jk}$. The price at time $t \le T$ of a zero coupon bond
with maturity T is
$$p(t, T) = \sum_{j=1}^n I_t^j\, p^j(t, T),$$
where $I_t^j = 1[r_t = r^j]$ and
$$p^j(t, T) = \tilde E\left[ e^{-\int_t^T r_u\,du} \,\Big|\, r_t = r^j \right].$$
The vector of state-wise prices,
$$p(t, T) = (p^j(t, T))_{j=1,\dots,n},$$
is given by (7.13),
$$p(t, T) = \exp\{(\tilde\Lambda - R)(T - t)\}\,1 = \tilde\Phi\, \mathrm{Diag}\!\left(e^{\tilde\rho^j (T - t)}\right) \tilde\Psi\, 1,$$
where $R = \mathrm{Diag}(r^j)$ is the n × n diagonal matrix with the entries $r^j$ down the
principal diagonal, 1 is the n-vector with all entries equal to 1, $\tilde\rho^j$, j = 1, . . . , n,
are the eigenvalues of $\tilde\Lambda - R$, and $\tilde\Phi$ and $\tilde\Psi$ are the n × n matrices formed by
the right and left eigenvectors, respectively.
The price processes of m zero coupon bonds with maturities $T_1 < \cdots < T_m$
are linearly independent only if the matrix
$$\left(e^{\tilde\rho^j T_i}\right)^{j=1,\dots,n}_{i=1,\dots,m}$$
has rank m. From item (i) in the theorem in Paragraph 7.10A we conclude that
this is the case if there are at least m distinct eigenvalues $\tilde\rho^j$. It also follows that
the market consisting of the bank account with price process $\exp\left(\int_0^t r_s\,ds\right)$ and
the m zero coupon bonds is complete for the class of all $\mathcal{F}_{T_1}^r$-claims only if both
the number of distinct eigenvalues and the number of bonds are no less than
the maximum number of states that can be directly accessed from any single
state of the Markov chain.


C. Mixed Vasicek interest rate.
The Vasicek model takes the spot rate of interest to be an Ornstein-Uhlenbeck
process given by
$$dr_t = \alpha(\rho - r_t)\,dt + \sigma\,d\tilde W_t. \tag{7.2}$$
Here $\rho$ is the stationary mean of the process, $\alpha$ is a positive mean reversion
parameter, $\sigma$ is a positive volatility parameter, and $\tilde W$ is a standard Brownian
motion under a martingale measure. The dynamics of the discounted T-bond
price,
$$\tilde p(t, T) = e^{-\int_0^t r_u\,du}\, p(t, T), \tag{7.3}$$
is
$$d\tilde p(t, T) = \tilde p(t, T)\, \frac{\sigma}{\alpha} \left(e^{-\alpha(T - t)} - 1\right) d\tilde W_t, \tag{7.4}$$
see e.g. [5]. Obviously, any $\mathcal{F}_T^W$-claim can be duplicated by a self-financing
portfolio in the T-bond and the bank account, and so the completeness issue is
trivial in this model.
To create an example where one bond is not sufficient to complete the market, let us concoct a mixed Vasicek model by putting
$$r_t = \sum_{j=1}^n r_t^j,$$
where the $r^j$ are independent Ornstein-Uhlenbeck processes,
$$dr_t^j = \alpha^j(\rho^j - r_t^j)\,dt + \sigma^j\,d\tilde W_t^j,$$
j = 1, . . . , n, and the $\tilde W^j$ are independent standard Brownian motions. We
assume that the $\alpha^j$ are all distinct (otherwise we could gather all processes $r^j$
with coinciding mean reversion parameter into one Ornstein-Uhlenbeck process).
The mixed Vasicek process is not mean-reverting in the same simple sense as
the traditional Vasicek process. It is stationary, however, and is apt to describe
interest that is subject to several random phenomena, each of mean-reverting
type.
By the assumed independence, the price of the T-bond is just
$$p(t, T) = \prod_{j=1}^n p^j(t, T),$$
where $p^j(t, T) = \tilde E\left[ e^{-\int_t^T r_u^j\,du} \,\big|\, r_t^j \right]$, and the discounted price is
$$\tilde p(t, T) = \prod_{j=1}^n \tilde p^j(t, T),$$
where $\tilde p^j(t, T)$ is the j-analogue to (7.3). By virtue of (7.4), we conclude that
the discounted T-bond price has dynamics
$$d\tilde p(t, T) = \tilde p(t, T) \sum_{j=1}^n \frac{\sigma^j}{\alpha^j} \left(e^{-\alpha^j (T - t)} - 1\right) d\tilde W_t^j. \tag{7.5}$$
Now, consider the market consisting of the bank account and m zero coupon
bonds with maturities $T_1 < \cdots < T_m$. From (7.5) it is seen that this market is
complete for the class of $\mathcal{F}_{T_1}^{W^1,\dots,W^n}$-claims if and only if the matrix
$$\left(e^{-\alpha^j (T_i - t)} - 1\right)^{i=1,\dots,m}_{j=1,\dots,n} \tag{7.6}$$
has rank n. By virtue of item (ii) in the theorem in Paragraph 7.10A, we
conclude that this is the case if $m \ge n$.
D. Mixed Poisson-driven Ornstein-Uhlenbeck interest rate.
Referring to [23], let us replace the Brownian motions in Paragraph C above
with independent compensated Poisson processes, that is,
$$d\tilde W_t^j = dN_t^j - \lambda^j\,dt,$$
where each $N^j$ is a Poisson process with intensity $\lambda^j$. Instead of (7.5) we obtain
$$d\tilde p(t, T) = \tilde p(t-, T) \sum_{j=1}^n \left( \exp\left\{ \frac{\sigma^j}{\alpha^j} \left(e^{-\alpha^j (T - t)} - 1\right) \right\} - 1 \right) d\tilde W_t^j. \tag{7.7}$$
It is seen from (7.7) that the market consisting of the bank account and m
zero coupon bonds with maturities $T_1 < \cdots < T_m$ is complete for the class of
$\mathcal{F}_{T_1}^{N^1,\dots,N^n}$-claims if and only if the matrix
$$\left( \exp\left\{ \frac{\sigma^j}{\alpha^j} \left(e^{-\alpha^j (T_i - t)} - 1\right) \right\} - 1 \right)^{i=1,\dots,m}_{j=1,\dots,n}$$
has rank n. By item (ii) in the theorem in Paragraph 7.10A, we know that the
matrix (7.6) has full rank. Thus, completeness of a market consisting of the
bank account and at least n bonds would be established, and we would be
done, if we could prove that the n × m matrix $(e^{a_{ji}} - 1)$ has full rank whenever
$(a_{ji})$ has full rank. With this conjecture our study of these problems will have
to halt for the time being.

Bibliography
[1] Andersen, P.K., Borgan, Ø., Gill, R.D., Keiding, N. (1993). Statistical Models Based on Counting Processes. Springer-Verlag, New York, Berlin, Heidelberg.
[2] Aase, K.K. and Persson, S.-A. (1994). Pricing of unit-linked life insurance
policies. Scand. Actuarial J., 1994, 26-52.
[3] Berger, A. (1939): Mathematik der Lebensversicherung. Verlag von Julius
Springer, Vienna.
[4] Björk, T., Kabanov, Y., Runggaldier, W. (1997): Bond market structures
in the presence of marked point processes. Mathematical Finance, 7, 211-239.
[5] Björk, T. (1998): Arbitrage Theory in Continuous Time, Oxford University
Press.
[6] Black, F., Scholes, M. (1973): The pricing of options and corporate liabilities. J. Polit. Economy, 81, 637-654.
[7] Cox, J., Ross, S., Rubinstein, M. (1979): Option pricing: A simplified
approach. J. of Financial Economics, 7, 229-263.
[8] Delbaen, F., Schachermayer, W. (1994): A general version of the fundamental theorem on asset pricing. Mathematische Annalen, 300, 463-520.
[9] Eberlein, E., Raible, S. (1999): Term structure models driven by general
Lévy processes. Mathematical Finance, 9, 31-53.
[10] Elliott, R.J., Kopp, P.E. (1998): Mathematics of financial markets,
Springer-Verlag.
[11] Föllmer, H., Sondermann, D. (1986): Hedging of non-redundant claims.
In Contributions to Mathematical Economics in Honor of Gérard Debreu,
205-223, eds. Hildenbrand, W., Mas-Colell, A., North-Holland.
[12] Gantmacher, F.R. (1959): Matrizenrechnung II, VEB Deutscher Verlag der
Wissenschaften, Berlin.

[13] Harrison, J.M., Kreps, D.M. (1979): Martingales and arbitrage in multiperiod securities markets. J. Economic Theory, 20, 1979, 381-408.
[14] Hoem, J.M. (1969): Markov chain models in life insurance. Blätter Deutsch.
Gesellschaft Vers.math., 9, 91-107.
[15] Hoem, J.M. (1969): Purged and partial Markov chains. Skandinavisk Aktuarietidskrift, 52, 147-155.
[16] Harrison, J.M., Pliska, S. (1981): Martingales and stochastic integrals in
the theory of continuous trading. J. Stoch. Proc. and Appl., 11, 215-260.
[17] Karlin, S., Taylor, H. (1975): A First Course in Stochastic Processes, 2nd.
[18] Merton, R.C. (1973): The theory of rational option pricing. Bell Journal
of Economics and Management Science, 4, 141-183.
[19] Merton, R.C. (1976): Option pricing when underlying stock returns are
discontinuous. J. Financial Economics, 3, 125-144.
[20] Møller, T. (1998): Risk minimizing hedging strategies for unit-linked life
insurance. ASTIN Bull., 28, 17-47.
[21] Norberg, R. (1995): Differential equations for moments of present values
in life insurance. Insurance: Math. & Econ., 17, 171-180.
[22] Norberg, R. (1995): A time-continuous Markov chain interest model with
applications to insurance. J. Appl. Stoch. Models and Data Anal., 245-256.
[23] Norberg, R. (1998): Vasicek beyond the normal. Working paper No. 152,
Laboratory of Actuarial Math., Univ. Copenhagen.
[24] Norberg, R. (1999): A theory of bonus in life insurance. Finance and
Stochastics., 3, 373-390
[25] Norberg, R. (2000): On bonus and bonus prognoses in life insurance. Scand.
Actuarial J. (To appear.)
[26] Pliska, S.R. (1997): Introduction to Mathematical Finance, Blackwell Publishers.
[27] Ramlau-Hansen, H. (1991): Distribution of surplus in life insurance. ASTIN
Bull., 21, 57-71.
[28] Thomas, J.W. (1995): Numerical Partial Differential Equations: Finite
Difference Methods, Springer-Verlag.

Appendix A

Calculus
A. Piecewise differentiable functions. Being invariably concerned with operations in time, commencing at some given date, we will mainly consider functions defined on the positive real line $[0, \infty)$. Thus, let us consider a generic
function $X = \{X_t\}_{t \ge 0}$ and think of $X_t$ as the state or value of some process at
time t. For the time being we take X to be real-valued.
In the present text we will work exclusively in the space of so-called piecewise
differentiable functions. From a mathematical point of view this space is tiny
since only elementary undergraduate calculus is needed to move about in it.
From a practical point of view it is huge since it comfortably accommodates any
idea, however sophisticated, that an actuary may wish to express and analyze.
It is convenient to enter this space from the outside, starting from a wider class
of functions.
We first take X to be of finite variation (FV), which means that it is the
difference between two non-decreasing, finite-valued functions. Then the left-limit $X_{t-} = \lim_{s \uparrow t} X_s$ and the right-limit $X_{t+} = \lim_{s \downarrow t} X_s$ exist for all t, and
they differ on at most a countable set D(X) of discontinuity points of X.
We are particularly interested in FV functions X that are right-continuous
(RC), that is, $X_t = \lim_{s \downarrow t} X_s$ for all t. Any probability distribution function
is of this type, and any stream of payments accounted as incomes or outgoes,
can reasonably be taken to be FV and, as a convention, RC. If X is RC, then
$\Delta X_t = X_t - X_{t-}$, when different from 0, is the jump made by X at time t.
For our purposes it suffices to let X be of the form
$$X_t = X_0 + \int_0^t x_\tau\,d\tau + \sum_{0 < \tau \le t} (X_\tau - X_{\tau-}). \tag{A.1}$$
The integral, which may be taken to be of Riemann type, adds up the continuous increments/decrements, and the sum, which is understood to range over
discontinuity times, adds up increments/decrements by jumps.
We assume, furthermore, that X is piecewise differentiable (PD); a property
holds piecewise if it takes place everywhere except possibly at a finite number
of points in every finite interval. In other words, the set of exceptional points,
if not empty, must be of the form $\{t_0, t_1, \dots\}$, with $t_0 < t_1 < \cdots$, and, in case it
is infinite, $\lim_{j \to \infty} t_j = \infty$. Obviously, X is PD if both X and x are piecewise
continuous. At any point $t \notin D = D(X) \cup D(x)$ we have $\frac{d}{dt} X_t = x_t$, that is, the
function X grows (or decreases) continuously at rate $x_t$.
As a convenient notational device we shall frequently write (A.1) in differential form as
$$dX_t = x_t\,dt + X_t - X_{t-}. \tag{A.2}$$

A left-continuous PD function may be defined by letting the sum in (A.1)
range only over the half-open interval [0, t). Of course, a PD function may be
neither right-continuous nor left-continuous, but such cases are of no interest to
us.
B. The integral with respect to a function. Let X and Y both be PD
and, moreover, let X be RC and given by (A.2). The integral over (s, t] of Y
with respect to X is defined as
$$\int_s^t Y_\tau\,dX_\tau = \int_s^t Y_\tau x_\tau\,d\tau + \sum_{s < \tau \le t} Y_\tau (X_\tau - X_{\tau-}), \tag{A.3}$$
provided that the individual terms on the right and also their sum are well
defined. Considered as a function of t the integral is itself PD and RC with
continuous increments $Y_t x_t\,dt$ and jumps $Y_t (X_t - X_{t-})$. One may think of the
integral as the weighted sum of the Y-values, with the increments of X as
weights, or vice versa. In particular, (A.1) can be written simply as
$$X_t = X_s + \int_s^t dX_\tau, \tag{A.4}$$
saying that the value of X at time t is its value at time s plus all its increments
in (s, t].
By definition,
$$\int_s^{t-} Y_\tau\,dX_\tau = \lim_{r \uparrow t} \int_s^r Y_\tau\,dX_\tau = \int_s^t Y_\tau\,dX_\tau - Y_t (X_t - X_{t-}) = \int_{(s,t)} Y_\tau\,dX_\tau,$$
a left-continuous function of t. Likewise,
$$\int_{s-}^t Y_\tau\,dX_\tau = \lim_{r \uparrow s} \int_r^t Y_\tau\,dX_\tau = \int_s^t Y_\tau\,dX_\tau + Y_s (X_s - X_{s-}) = \int_{[s,t]} Y_\tau\,dX_\tau,$$
a left-continuous function of s.

C. The chain rule (Itô's formula). Let $X_t = (X_t^1, \dots, X_t^m)$ be an m-variate
function with PD and RC components given by $dX_t^i = x_t^i\,dt + (X_t^i - X_{t-}^i)$.
Let $f : \mathbb{R}^m \to \mathbb{R}$ have continuous partial derivatives, and form the composed
function $f(X_t)$. On the open intervals where there are neither discontinuities in
the $x^i$ nor jumps of the $X^i$, the function $f(X_t)$ develops in accordance with the
well-known chain rule for scalar fields along rectifiable curves. At the exceptional
points $f(X_t)$ may change (only) due to jumps of the $X^i$, and at any such point
t it jumps by $f(X_t) - f(X_{t-})$. Thus, we gather the so-called change of variable
rule or Itô's formula, which in our simple function space reads
$$df(X_t) = \sum_{i=1}^m \frac{\partial f}{\partial x_i}(X_t)\, x_t^i\,dt + f(X_t) - f(X_{t-}), \tag{A.5}$$
or, in integral form,
$$f(X_t) = f(X_s) + \int_s^t \sum_{i=1}^m \frac{\partial f}{\partial x_i}(X_\tau)\, x_\tau^i\,d\tau + \sum_{s < \tau \le t} \{f(X_\tau) - f(X_{\tau-})\}. \tag{A.6}$$
Obviously, $f(X_t)$ is PD and RC.

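A frequently used special case is (check the formulas!)
$$\begin{aligned}
d(X_t Y_t) &= X_t y_t \, dt + Y_t x_t \, dt + X_t Y_t - X_{t-} Y_{t-} \\
&= X_{t-} \, dY_t + Y_{t-} \, dX_t + (X_t - X_{t-})(Y_t - Y_{t-}) \\
&= X_{t-} \, dY_t + Y_t \, dX_t .
\end{aligned} \tag{A.7}$$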

If $X$ and $Y$ have no common jumps, as is certainly the case if one of them is
continuous, then (A.7) reduces to the familiar
$$d(X_t Y_t) = X_t \, dY_t + Y_t \, dX_t . \tag{A.8}$$
The integral form of (A.7) is the so-called rule of integration by parts:
$$\int_s^t Y_\tau \, dX_\tau = Y_t X_t - Y_s X_s - \int_s^t X_{\tau-} \, dY_\tau . \tag{A.9}$$
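The role of the left limit in (A.9) shows up most clearly when X and Y jump at the same time. The sketch below checks the rule numerically on a hypothetical pair of functions with a common jump at time 1; the functions, the helper `midpoint`, and the grid size are assumptions chosen for illustration only.

```python
def midpoint(g, s, t, n=2000):
    """Midpoint-rule approximation of the ordinary integral of g over [s, t]."""
    h = (t - s) / n
    return h * sum(g(s + (k + 0.5) * h) for k in range(n))


# Hypothetical PD and RC functions with a common jump at tau = 1:
# X has rate 1 and jump size 1, Y has rate 2 and jump size 3.
def X(u):
    return u + (1.0 if u >= 1.0 else 0.0)


def Y(u):
    return 2.0 * u + (3.0 if u >= 1.0 else 0.0)


def X_left(u):
    """Left limit X_{u-}: the jump at 1 is not yet included at u = 1."""
    return u + (1.0 if u > 1.0 else 0.0)


s, t = 0.0, 2.0
# Left-hand side of (A.9): continuous part plus Y_1 times the jump of X.
lhs = midpoint(lambda u: Y(u) * 1.0, s, t) + Y(1.0) * 1.0
# Integral of X_{tau-} with respect to Y: continuous part plus X_{1-} times the jump of Y.
int_X_dY = midpoint(lambda u: X(u) * 2.0, s, t) + X_left(1.0) * 3.0
rhs = Y(t) * X(t) - Y(s) * X(s) - int_X_dY
# Using X_1 instead of X_{1-} as the jump weight breaks the identity.
naive_rhs = Y(t) * X(t) - Y(s) * X(s) - (midpoint(lambda u: X(u) * 2.0, s, t) + X(1.0) * 3.0)
```

Here both sides of (A.9) equal 12 up to rounding, whereas the variant that weights the jump of Y by $X_1$ rather than $X_{1-}$ gives 9.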

D. Counting processes. Let $t_1 < t_2 < \cdots$ be a sequence in $(0, \infty)$, either
finite or, if infinite, such that $\lim_{j \to \infty} t_j = \infty$. Think of $t_j$ as the j-th time of
occurrence of a certain event. The number of events occurring within a given
time $t$ is $N_t = \#\{ j \, ; \, t_j \le t \}$ or, putting $t_0 = 0$, $N_t = j$ for $t_j \le t < t_{j+1}$.
The function $N = \{N_t\}_{t \ge 0}$ thus defined is called a counting function since it
currently counts the number of occurred events. It is a particularly simple PD
and RC function commencing from $N_0 = 0$ and thereafter increasing only by
jumps of size 1 at the epochs $t_j$, $j = 1, 2, \ldots$
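The change of variable rule (A.6) becomes particularly simple when $X$ is a
counting function. In fact, for $f : \mathbb{R} \to \mathbb{R}$ and for $N$ defined above,
$$f(N_t) = f(N_s) + \sum_{s < \tau \le t} \{ f(N_\tau) - f(N_{\tau-}) \} \tag{A.10}$$
$$\phantom{f(N_t)} = f(N_s) + \sum_{s < \tau \le t} \{ f(N_{\tau-} + 1) - f(N_{\tau-}) \} (N_\tau - N_{\tau-}) \tag{A.11}$$
$$\phantom{f(N_t)} = f(N_s) + \int_s^t \{ f(N_{\tau-} + 1) - f(N_{\tau-}) \} \, dN_\tau . \tag{A.12}$$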

Basically, what these expressions state is just the fact that
$$f(j) = f(0) + \sum_{i=1}^{j} \{ f(i) - f(i-1) \} .$$
Still they will prove to be useful representations when we come to stochastic
counting processes.
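A counting function and the change of variable rule (A.12) are easy to sketch in code. The example below is a minimal illustration, assuming strictly increasing event times as in the text; the helper names and the sample event times are hypothetical.

```python
from bisect import bisect_right


def counting_function(event_times):
    """Return N with N(t) = #{j : t_j <= t} for the given event times."""
    times = sorted(event_times)

    def N(t):
        # bisect_right counts the times <= t, which makes N right-continuous.
        return bisect_right(times, t)

    return N, times


def change_of_variable(f, event_times, s, t):
    """Right-hand side of (A.12): f(N_s) plus the increments of f at jumps in (s, t]."""
    N, times = counting_function(event_times)
    total = f(N(s))
    for tau in times:
        if s < tau <= t:
            n_before = N(tau) - 1  # N_{tau-}, valid because the times are distinct
            total += f(n_before + 1) - f(n_before)
    return total


events = [0.5, 1.2, 3.0, 4.7]  # hypothetical event times


def f(n):
    return n * n


N, _ = counting_function(events)
direct = f(N(4.0))
via_rule = change_of_variable(f, events, 1.0, 4.0)
```

With these event times, $N_1 = 1$ and $N_4 = 3$, and both `direct` and `via_rule` equal 9, the latter as the telescoping sum 1 + 3 + 5.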
Going back to the general PD and RC function $X$ in (A.1), we can associate
with it a counting function $N$ defined by $N_t = \#\{ \tau \in (0, t] \, ; \, X_\tau \ne X_{\tau-} \}$, the
number of discontinuities of $X$ within time $t$. Equipped with our notion of
integral, we can now express $X$ as
$$dX_t = x_t^c \, dt + x_t^d \, dN_t , \tag{A.13}$$
where $x_t^c = x_t$ is the instantaneous rate of continuous change and $x_t^d = X_t - X_{t-}$
is the size of the jump, if any, at $t$. Generalizing (A.12), we have
$$f(X_t) = f(X_s) + \int_s^t \frac{d}{dx} f(X_\tau) \, x_\tau^c \, d\tau + \int_s^t \{ f(X_{\tau-} + x_\tau^d) - f(X_{\tau-}) \} \, dN_\tau . \tag{A.14}$$

Appendix B

Indicator functions
A. Indicator functions in general spaces. Let $\Omega$ be some space with
generic point $\omega$, and let $A$ be some subset of $\Omega$. The function $I_A : \Omega \to \{0, 1\}$
defined by
$$I_A(\omega) = \begin{cases} 1 & \text{if } \omega \in A , \\ 0 & \text{if } \omega \in A^c , \end{cases}$$
is called the indicator function or just the indicator of $A$ since it indicates by
the value 1 precisely those points that belong to $A$.
Since $I_A$ assumes only the values 0 and 1, $(I_A)^p = I_A$ for any $p > 0$. Clearly,
$I_\emptyset = 0$, $I_\Omega = 1$, and
$$I_{A^c} = 1 - I_A , \tag{B.1}$$
where $A^c = \Omega \setminus A$ is the complement of $A$.
For any two sets $A$ and $B$ (subsets of $\Omega$),
$$I_{A \cap B} = I_A I_B \tag{B.2}$$
and
$$I_{A \cup B} = I_A + I_B - I_A I_B . \tag{B.3}$$
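The last two statements are displayed here only for ease of reference. They
are special cases of the following results, valid for any finite collection of sets
$\{A_1, \ldots, A_r\}$:
$$I_{\cap_{j=1}^r A_j} = \prod_{j=1}^r I_{A_j} , \tag{B.4}$$
$$I_{\cup_{j=1}^r A_j} = \sum_j I_{A_j} - \sum_{j_1 < j_2} I_{A_{j_1}} I_{A_{j_2}} + \cdots + (-1)^{r-1} I_{A_1} \cdots I_{A_r} . \tag{B.5}$$
To demonstrate (B.5) we need the identity
$$\prod_{j=1}^r (a_j + b_j) = \sum_{p=0}^r \sum_{r \setminus p} a_{j_1} \cdots a_{j_p} \, b_{j_{p+1}} \cdots b_{j_r} , \tag{B.6}$$
where $\sum_{r \setminus p}$ signifies that the sum ranges over all $\binom{r}{p}$ different ways of dividing
$\{1, \ldots, r\}$ into two disjoint subsets $\{j_1, \ldots, j_p\}$ ($\emptyset$ when $p = 0$) and $\{j_{p+1}, \ldots, j_r\}$
($\emptyset$ when $p = r$). Combining the general relation
$$\Big\{ \bigcup_\alpha A_\alpha \Big\}^c = \bigcap_\alpha A_\alpha^c , \tag{B.7}$$
with (B.1) and (B.4), we find
$$I_{\cup_{j=1}^r A_j} = 1 - I_{\cap_{j=1}^r A_j^c} = 1 - \prod_{j=1}^r (1 - I_{A_j}) ,$$
and arrive at (B.5) by use of (B.6) with $a_j = -I_{A_j}$ and $b_j = 1$.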
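The inclusion-exclusion relation (B.5) can be checked pointwise on a small finite space. The following is a minimal sketch; the space, the three sets, and the function names are hypothetical and serve only to illustrate the formula.

```python
from itertools import combinations
from math import prod


def indicator(A):
    """Return the indicator function of the set A."""
    return lambda w: 1 if w in A else 0


def union_indicator_by_B5(sets, w):
    """Right-hand side of (B.5) evaluated at the point w."""
    I = [indicator(A)(w) for A in sets]
    r = len(I)
    total = 0
    for p in range(1, r + 1):
        # Sum of products of p distinct indicators, with alternating sign.
        sign = (-1) ** (p - 1)
        total += sign * sum(
            prod(I[j] for j in idx) for idx in combinations(range(r), p)
        )
    return total


omega = range(10)  # hypothetical space {0, ..., 9}
sets = [{0, 1, 2, 3}, {2, 3, 4, 5}, {5, 6}]
union = set().union(*sets)
checks = [union_indicator_by_B5(sets, w) == indicator(union)(w) for w in omega]
```

Every entry of `checks` is true: at each point the alternating sum of products of indicators reproduces the indicator of the union.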

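B. Indicators of events. Let $(\Omega, \mathcal{F}, \mathbb{P})$ be some probability space. The indicator
$I_A$ of an event $A \in \mathcal{F}$ is a simple binomial random variable;
$$I_A \sim \mathrm{Bin}(1, \mathbb{P}[A]) .$$
It follows that
$$\mathbb{E}[I_A] = \mathbb{P}[A] , \qquad \mathbb{V}[I_A] = \mathbb{P}[A] (1 - \mathbb{P}[A]) . \tag{B.8}$$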
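The moments in (B.8) can be verified exactly on a finite probability space. The sketch below assumes a hypothetical example, one roll of a fair die with A the event of an even outcome, and uses exact rational arithmetic.

```python
from fractions import Fraction

# Hypothetical finite probability space: one roll of a fair die.
P = {w: Fraction(1, 6) for w in range(1, 7)}
A = {2, 4, 6}  # the event "even outcome"


def I_A(w):
    """Indicator of the event A."""
    return 1 if w in A else 0


prob_A = sum(P[w] for w in A)
# Expectation and variance computed directly from the definition.
mean = sum(I_A(w) * P[w] for w in P)
variance = sum((I_A(w) - mean) ** 2 * P[w] for w in P)
```

Here `mean` equals `prob_A`, which is 1/2, and `variance` equals `prob_A * (1 - prob_A)`, which is 1/4, in line with (B.8).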