Stochastic Processes: Learning the Language
Keywords
Financial Mathematics; General Insurance Mathematics; Life Insurance Mathematics; Stochastic Processes
Authors' Addresses
A. J. G. Cairns, M.A., Ph.D., F.F.A., Department of Actuarial Mathematics and Statistics, Heriot-Watt University, Edinburgh EH14 4AS, U.K. Tel: +44(0)131-451-3245; Fax: +44(0)131-451-3249; E-mail: A.J.G.Cairns@ma.hw.ac.uk
D. C. M. Dickson, B.Sc., Ph.D., F.F.A., F.I.A.A., Centre for Actuarial Studies, The University of
Melbourne, Parkville, Victoria 3052, Australia. Tel: +61(0)3-9344-4727; Fax: +61-3-9344-6899;
E-mail: ddickson@cupid.ecom.unimelb.edu.au
A. S. Macdonald, B.Sc., Ph.D., F.F.A., Department of Actuarial Mathematics and Statistics, Heriot-Watt University, Edinburgh EH14 4AS, U.K. Tel: +44(0)131-451-3209; Fax: +44(0)131-451-3249; E-mail: A.S.Macdonald@ma.hw.ac.uk
H. R. Waters, M.A., D.Phil., F.I.A., F.F.A., Department of Actuarial Mathematics and Statistics, Heriot-Watt University, Edinburgh EH14 4AS, U.K. Tel: +44(0)131-451-3211; Fax: +44(0)131-451-3249; E-mail: H.R.Waters@ma.hw.ac.uk
M. Willder, B.A., F.I.A., Department of Actuarial Mathematics and Statistics, Heriot-Watt
University, Edinburgh EH14 4AS, U.K. Tel: +44(0)131-451-3204; Fax: +44(0)131-451-3249;
E-mail: M.Willder@ma.hw.ac.uk
1. Introduction
In Sections 2 and 3 of this paper we introduce some of the main concepts in stochastic
modelling, now included in the actuarial examination syllabus as Subject 103. In Sections
4, 5 and 6 we illustrate the application of these concepts to life insurance mathematics
(Subjects 104 and 105), finance (Subject 109) and risk theory (Subject 106). Subject 103
includes material not previously included in the UK actuarial examinations. Hence, we
hope this paper will be of interest not only to students preparing to take Subject 103, but
also to students and actuaries who will not be required to take this Subject.
It is assumed that the reader is familiar with probability and statistics up to the level
of Subject 101.
2. Familiar Territory
Consider two simple experiments: (a) spinning a coin, where the outcome is Heads (H) or Tails (T); and (b) rolling a die, where the outcome is one of the numbers 1 to 6. The set of possible outcomes from experiment (a) is rather limited compared to that from experiment (b). For experiment (b) we can consider more complicated events, each of which is just a subset of the set of all possible outcomes. For example, we could consider the event {even number}, which is equivalent to {2, 4, 6}, or the event {less than or equal to 4}, which is equivalent to {1, 2, 3, 4}. Probabilities for these events are calculated by summing the probabilities of the corresponding individual outcomes, so that:

P[{even number}] = P[{2}] + P[{4}] + P[{6}] = 3 × (1/6) = 1/2.
A real-valued random variable is a function which associates a real number with
each possible outcome from an experiment. For example, for the coin spinning experiment
we could define the random variable X to be 1 if the outcome is {H} and 0 if the outcome
is {T}.
Now suppose our experiment is to spin our coin 100 times. We now have 2^100 possible outcomes. We can define events such as {the first spin gives Heads and the second spin gives Heads} and, using the presumed independence of the results of different spins, we can calculate the probability of this event as (1/2) × (1/2) = 1/4.
Consider the random variable Xn , for n = 1, 2, . . . , 100 which is defined to be the
number of Heads in the first n spins of the coin. Probabilities for Xn come from the
binomial distribution, so that:
P[Xn = m] = (n choose m) (1/2)^m (1/2)^(n−m)   for m = 0, 1, . . . , n.
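These binomial probabilities are easy to compute directly; as a quick check (this sketch is ours, not the paper's):

```python
from math import comb

def prob_heads(n: int, m: int) -> float:
    """P[X_n = m]: probability of exactly m heads in n fair-coin spins."""
    return comb(n, m) * 0.5 ** n

# The probabilities over m = 0, ..., n sum to 1, as they must.
assert abs(sum(prob_heads(10, m) for m in range(11)) - 1.0) < 1e-12

print(prob_heads(2, 1))  # P[X_2 = 1] = 0.5
```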
We can also consider conditional probabilities for Xn+k given the value of Xn . For
example:
P[X37 = 20 | X36 = 19] = 1/2

P[X37 = 19 | X36 = 19] = 1/2

P[X37 = m | X36 = 19] = 0   if m ≠ 19, 20.
From these probabilities we can calculate the conditional expectation of X37 given
that X36 = 19. This is written E[X37 | X36 = 19] and its value is 19.5. If we had not
specified the value of X36, then we could still say that E[X37 | X36] = X36 + 1/2. There are
two points to note here:
(a) E[X37 | X36 ] denotes the expected value of X37 given some information about what
happened in the first 36 spins of the coin; and
(b) E[X37 | X36 ] is itself a random variable whose value is determined by the value taken
by X36 . In other words, E[X37 | X36 ] is a function of X36 .
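The identity E[Xn+1 | Xn] = Xn + 1/2 can be verified by brute force on a smaller version of the experiment; a sketch of ours, enumerating all equally likely outcomes:

```python
from itertools import product

def cond_exp_next(n: int, x: int) -> float:
    """E[X_{n+1} | X_n = x] for the fair-coin heads-count process,
    computed by enumerating all equally likely outcomes of n+1 spins."""
    paths = [p for p in product((0, 1), repeat=n + 1) if sum(p[:n]) == x]
    return sum(sum(p) for p in paths) / len(paths)

# Whatever the conditioning value x, the answer is x + 1/2:
for x in range(6):
    assert abs(cond_exp_next(5, x) - (x + 0.5)) < 1e-12
```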
The language of elementary probability theory has been adequate for describing the
ideas introduced in this section. However, when we consider more complex situations, we
will need a more precise language.
3. Further Concepts
Ω = {1, 2, 3, 4, 5, 6}

In this case there are 6 sample points. We express the outcome of a 4 being rolled as ω = {4}.
An event is a subset of the sample space. In our example the event of an odd number
being rolled is the subset {1, 3, 5}.
3.3 σ-algebras
We denote by F the set of all events in which we could possibly be interested.
To make the mathematics work, we insist that F contains the empty set ∅, the whole sample space Ω, and all unions, intersections and complements of its members. With these conditions, F is called a σ-algebra of events¹.

¹ To be more precise, the number of unions and intersections should be finite or countably infinite.
For example, G1 = {∅, Ω, {6}, {1, 2, 3, 4, 5}} is a sub-σ-algebra of F, but G2 = {∅, Ω, {6}} is not, since the complement of the set {6} does not belong to G2.
A probability measure P assigns a number P(A) to each event A in F such that:
(a) P(A) ≥ 0 for every event A in F;
(b) if A1, A2, . . . are mutually exclusive events in F, then P(A1 ∪ A2 ∪ · · ·) = P(A1) + P(A2) + · · ·; and
(c) P(Ω) = 1; that is, with probability 1 one of the outcomes in Ω occurs.
The three axioms above are consistent with our usual understanding of probability.
For the die rolling experiment on the pair (Ω, F), we could have a very simple measure which assigns a probability of 1/6 to each of the outcomes {1}, {2}, {3}, {4}, {5}, and {6}. Now consider a biased die where the probability of an odd number is twice that of an even number. We now need a new measure P∗ where P∗({1}) = P∗({3}) = P∗({5}) = 2/9 and P∗({2}) = P∗({4}) = P∗({6}) = 1/9. This new measure P∗ still satisfies the axioms above, but note that the sample space Ω and the σ-algebra F are unchanged. This shows that it is possible to define two different probability measures on the same sample space and σ-algebra, namely (Ω, F, P) and (Ω, F, P∗).
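The fair and biased dice give a small worked example of two measures on the same (Ω, F); the following sketch (ours, not from the paper) checks axiom (c) for both:

```python
from fractions import Fraction as F

omega = {1, 2, 3, 4, 5, 6}
P      = {w: F(1, 6) for w in omega}                        # fair die
P_star = {w: F(2, 9) if w % 2 else F(1, 9) for w in omega}  # biased die

def prob(measure, event):
    """Probability of an event (a subset of omega) under a measure."""
    return sum(measure[w] for w in event)

# Both measures satisfy axiom (c): P(omega) = 1.
assert prob(P, omega) == 1 and prob(P_star, omega) == 1
print(prob(P, {1, 3, 5}), prob(P_star, {1, 3, 5}))  # 1/2 2/3
```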
A stochastic process is a collection of random variables indexed by a time set; we use it to study the development of some quantity over time. An example of a stochastic process {Xn}, n = 1, . . . , 100, was given in Section 2, where Xn was the number of heads in the first n spins of a coin.
A sample path for a stochastic process {Xt, t ∈ T}, ordered by some time set T, is the realised set of random variables {Xt(ω), t ∈ T} for an outcome ω ∈ Ω. For example, for the experiment where we spin a coin 5 times and count the number of heads, the sample path {0, 0, 1, 2, 2} corresponds to the outcome ω = {T, T, H, H, T}.
3.7 Information
Consider the coin spinning experiment introduced in Section 2 and the associated stochastic process {Xn}, n = 1, . . . , 100. As discussed in Section 2, the conditional expectation E[Xn+m | Xn] is a random variable which depends on the value taken by Xn. Because of the nature of this particular stochastic process, the value of E[Xn+m | Xn] is the same as E[Xn+m | X1, . . . , Xn]. Let Fn be the sub-σ-algebra created from all the possible events, together with their possible unions, intersections and complements, that could have happened in the first n spins of the coin. Then Fn represents the information we have after n spins from knowing the values of X1, X2, . . . , Xn. In this case, we describe Fn as the sub-σ-algebra generated by X1, X2, . . . , Xn, and write Fn = σ(X1, X2, . . . , Xn). The conditional expectation E[Xn+m | X1, . . . , Xn] can be written E[Xn+m | Fn].
More generally, our information at time t is a σ-algebra Ft containing those events
which, at time t, we would know either have happened or have not happened.
3.8 Filtrations
A filtration is any set of σ-algebras {Ft} where Ft ⊆ Fs for all t < s. So we have
a sequence of increasing amounts of information where each member Ft contains all the
information in prior members.
Usually Ft contains all the information revealed up to time t, that is, we do not
delete any of our old information. Then at a later time, s, we have more information,
Fs , because we add to the original information the information we have obtained between
times t and s. In this case Ft can be regarded as the history of the process up to and
including time t.
For our coin spinning experiment, the information provided by the filtration Ft should
allow us to reconstruct the result of all the spins up to and including time t, but not after
time t. If Ft recorded the results of the last four spins only, it would not be a filtration
since Ft would tell us nothing about the (t − 4)th spin.
We are interested in the probability that a stochastic process will have a certain value
in the future. We may be given information as to the values of the stochastic process at
certain times in the past, and this information may affect the probability of the future
outcome. However, for a Markov Chain the only relevant information is the most recent
known value of the stochastic process. Any additional information prior to the most recent
value will not change the probability. A consequence of this property is that if {Xn}, n = 1, 2, . . . , is a Markov Chain and Fn = σ(X1, X2, . . . , Xn), then:

E[Xn+m | Fn] = E[Xn+m | X1, . . . , Xn]
             = E[Xn+m | Xn]
For example, consider a die rolling experiment where Nr is the number of sixes in the first r rolls. Given N2 = 1, the probability that N4 = 3 is 1/36 using a fair die. This probability is not altered if we also know that N1 = 1.
Now consider the coin spinning experiment where Xn is the number of heads in the
first n spins. The argument used in the previous paragraph can be used in this case to
show that {Xn} is a Markov Chain.
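The die-rolling example can be checked by enumerating all 6⁴ equally likely outcomes; in this sketch (ours, not from the paper), the extra information about N1 leaves the conditional probability unchanged:

```python
from itertools import product
from fractions import Fraction as F

rolls = list(product(range(1, 7), repeat=4))  # all 6**4 equally likely outcomes

def N(path, r):
    """Number of sixes in the first r rolls."""
    return sum(1 for x in path[:r] if x == 6)

def cond_prob(target, given):
    """Exact conditional probability by counting equally likely outcomes."""
    sample = [p for p in rolls if given(p)]
    return F(sum(1 for p in sample if target(p)), len(sample))

p1 = cond_prob(lambda p: N(p, 4) == 3, lambda p: N(p, 2) == 1)
p2 = cond_prob(lambda p: N(p, 4) == 3, lambda p: N(p, 2) == 1 and N(p, 1) == 1)
assert p1 == p2 == F(1, 36)  # knowing N_1 as well changes nothing
```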
Informally, a stopping time for a process {Xt} is a random time T such that the information available at any time t is enough to decide whether or not T = t. For example, suppose that:
(a) T is the first time the process {Xt} reaches the value 120; or
(b) T is the time when the process {Xt} reaches its maximum value.
Definition (a) defines a stopping time for the process because the decision to set T =
t means that the process reaches the value 120 for the first time at time t, and this
information should be known at time t. Definition (b) does not define a stopping time for
the process because setting T = t requires knowledge of the values of the process before
and after time t.
More formally, the random variable T mapping Ω to the time index set T is a stopping time if:

{ω : T(ω) = t} ∈ Ft for all t ∈ T.
3.12 Martingales
Let {Ft}t∈T be a filtration. A martingale with respect to {Ft}t∈T is a stochastic process {Xt} with the properties that:
(a) E(|Xt|) < ∞ for all t;
(b) E(Xt | Fs) = Xs for all s < t.
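The heads count Xn of Section 2 is not itself a martingale (its conditional mean drifts upward), but the centred process Xn − n/2 is; this example and the exhaustive check below are ours, not the paper's:

```python
from itertools import product
from fractions import Fraction as F

n, s, t = 6, 3, 6  # six spins; condition at time 3, look at time 6
paths = list(product((0, 1), repeat=n))  # 2^6 equally likely outcomes

def M(path, k):
    """M_k = X_k - k/2, the centred heads count."""
    return F(sum(path[:k])) - F(k, 2)

# Martingale property E[M_t | F_s] = M_s: average M_t over all
# paths that share the same first s spins, and compare with M_s.
for prefix in product((0, 1), repeat=s):
    bundle = [p for p in paths if p[:s] == prefix]
    cond_exp = sum(M(p, t) for p in bundle) / len(bundle)
    assert cond_exp == M(bundle[0], s)
```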
4. Life Insurance Mathematics

[Figure 1. The two-state (single decrement) model: state 0 = able, state 1 = dead, with transition intensity μx from state 0 to state 1.]
4.1 Introduction
The aim of this section is to formulate life insurance mathematics in terms of stochastic processes. The motivation for this is the observation that life and related insurances
depend on life events (death, illness and so on) that, in sequence, form an individual's life
history. It is this life history that we regard as the sample path of a suitable stochastic
process. The simplest such life event is death, and precisely because of its simplicity it
can be modelled successfully without resorting to stochastic processes (for example, by
regarding the remaining lifetime as a random variable). Other life events are not so simple, so it is more important to have regard to the life history when we try to formulate
models. Hence stochastic processes form a natural starting point.
To keep matters clear, we will develop the simplest possible example, represented in
an intuitive way by the two-state (or single decrement) model in Figure 1. Of course, this
process (death as a single decrement) is very familiar, so at first it seems that all we do is express familiar results in not-so-familiar language. Of itself this offers nothing new,
but, we emphasise, the payoff comes when we must model more complicated life histories.
(a) All the tools developed in the case of this simple process carry over to more complicated processes, such as are often needed to model illness or long term care.
(b) The useful tools turn out to be exactly those that are also needed in modern financial
mathematics. In particular, stochastic integrals and conditional expectation
are key ideas. So, instead of acquiring two different toolkits, one will do for both.
The main difference between financial mathematics and life insurance mathematics
is that the former is based on processes with continuous paths, while the latter is based
on processes with jumps². The fundamental objects in life insurance mathematics are
stochastic processes called counting processes.
As will be obvious from the references, this section is based on the work of Professor
Ragnar Norberg.
[Figure 2. A typical sample path of the two-state process, for a life dying at age 46; horizontal axis: Age (0 to 100).]
state, and the number 1 to the dead state. A typical sample path of this process might
then look like Figure 2, where a life dies at age 46. The sample path is a function of time;
we call it N01(t). N01(t) indicates whether death has yet occurred³. Looked at another
way, N01 (t) counts the number of events that have happened up to and including time
t. Do not think that because only one type of event can occur, and that only once, this
counting interpretation is trivial: far from it. It is what defines a counting process.
We pay close attention to the increments of the sample path N01(t). They are very simple. If the process does not jump at time t, the increment is 0. We write this as dN01(t) = 0. If the process does jump at time t, the increment is 1. We write this as dN01(t) = 1. (Sometimes you will see ΔN01(t) instead of dN01(t); here it does not matter.)
Discrete increments like dN 01 (t) are, for counting processes, what the first derivative
d/dx is for processes with differentiable sample paths. Just as a differentiable sample path
can be reconstructed from its derivative (by integration) so can a counting process be
reconstructed from its increments (also by integration). That leads us to the stochastic
integral.
Suppose for the moment that N01(t) is a discrete-time process which can jump only at integer times; since death occurs at most once, it cannot jump at more than one integer time. Can we reconstruct N01(t) from its increments dN01(t)? To be specific, can we find N01(T)? (T need not be an integer.) Let J(T) be the set of all possible jump times up to and including T (that is, all integers ≤ T). Then:

N01(T) = Σ_{t ∈ J(T)} dN01(t).    (1)
Suppose N01 (t) is still discrete-time, but can jump at more points: for example at the
end of each month. Again, define J(T ) as the set of all possible jump times up to and
including T , and equation (1) remains valid. This works for any discrete set of possible
jump times, no matter how refined it is (years, months, days, minutes, nanoseconds . . .).
What happens in the limit?
(a) the counting process becomes the continuous-time version with which we started;
(b) the set of possible jump times J(T ) becomes the interval (0, T ]; and
(c) the sum for N01 (T ) becomes an integral:
N01(T) = ∫_{t ∈ J(T)} dN01(t) = ∫_0^T dN01(t).    (2)
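Equation (1) is easy to verify numerically: place the single jump somewhere, and sum the increments over a discrete grid of possible jump times. A sketch of ours, using a monthly grid and a death at age 46:

```python
def increments(death_time, grid):
    """dN01(t) on a discrete grid: 1 at the first grid point >= death_time,
    0 everywhere else (the process jumps at most once)."""
    jumped = False
    d = {}
    for t in grid:
        if not jumped and t >= death_time:
            d[t] = 1
            jumped = True
        else:
            d[t] = 0
    return d

grid = [m / 12 for m in range(1, 101 * 12)]  # monthly grid of ages
dN = increments(46.0, grid)

def N01(T):
    """Reconstruct N01(T) by summing increments over J(T), as in (1)."""
    return sum(v for t, v in dN.items() if t <= T)

assert N01(45.9) == 0 and N01(46.0) == 1 and N01(80.0) == 1
```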
Let v be the discount factor and let I0(t) indicate presence in state 0 (able) at time t. The present value of an annuity payable continuously at rate 1 per annum while able is then:

Y = ∫_0^∞ v^t I0(t) dt.    (4)
Given the sample path, this is a perfectly ordinary integral, but since the sample path
is random, so is Y . Defining X(T ) and Y (T ) as the present value of payments up to time
T , we can write down the stochastic processes:
X(T) = ∫_0^T v^t dN01(t)   and   Y(T) = ∫_0^T v^t I0(t) dt.    (5)
Noting the obvious, premiums can be treated as a negative annuity, and these definitions
can be extended to any multiple state model. Also without difficulty, discrete annuity
or pure endowment payments can also be accommodated, but we leave them out for
simplicity.
The quantities a0 (t) and A01 (t) are functions of time, but need not be stochastic
processes. They define payments that will be made, depending on events, but they do not
represent the events themselves. In the case of a non-profit assurance, for example, they
will be deterministic functions of age. The payments actually made can be expressed as
a rate, dL(t):

dL(t) = A01(t) dN01(t) + a0(t) I0(t) dt

and the value of the cumulative payment at time 0, denoted V(0), is:

V(0) = ∫_0^T v^t dL(t) = ∫_0^T v^t A01(t) dN01(t) + ∫_0^T v^t a0(t) I0(t) dt.    (8)
This quantity is the main target of study. Compare it with equation (5); it simply allows
for more general payments. It is a stochastic process, as a function of T , since it now
represents the payments made depending on the particular life history (that is, the sample
path of N01 (t)).
We also make use of the accumulated/discounted value of the payments at any time
s, denoted V (s):
V(s) = (1/v^s) ∫_0^T v^t dL(t) = (1/v^s) ∫_0^T v^t A01(t) dN01(t) + (1/v^s) ∫_0^T v^t a0(t) I0(t) dt.
All concrete calculations depend on the choice of probability measure (mortality basis). We will illustrate this using expected present values. Suppose the actuary has chosen
a probability measure P (equivalent to life table probabilities t px ). Taking as an example
the whole life assurance benefit, for a life aged x, say, EP [X] is:
EP[X] = EP[ ∫_0^∞ v^t dN01(t) ] = ∫_0^∞ v^t EP[dN01(t)] = ∫_0^∞ v^t P[dN01(t) = 1] = ∫_0^∞ v^t t px μx+t dt    (10)
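Equation (10) can be evaluated numerically for any mortality basis. As a check, under the simplifying assumption (ours, not the paper's) of a constant force of mortality μ and force of interest δ, the integral has the closed form μ/(μ + δ):

```python
from math import exp

def epv_assurance(mu, delta, h=1e-3, horizon=200.0):
    """EPV of a whole life assurance: integrate v^t * t_p_x * mu dt by the
    midpoint rule, with constant force of mortality mu and force of interest
    delta, so that v^t = exp(-delta t) and t_p_x = exp(-mu t)."""
    total, t = 0.0, 0.0
    while t < horizon:
        m = t + h / 2  # midpoint of the sub-interval
        total += exp(-delta * m) * exp(-mu * m) * mu * h
        t += h
    return total

mu, delta = 0.04, 0.05
assert abs(epv_assurance(mu, delta) - mu / (mu + delta)) < 1e-4
```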
[Figure: The three-state illness-death model, with states 0 = able, 1 = ill and 2 = dead, transition intensities μ01_x and μ10_x between able and ill, and intensities μ02_x and μ12_x into the dead state.]
which should be familiar⁵. If the actuary chooses a different measure P∗, say (equivalent to different life table probabilities t p∗x), we get a different expected value:

EP∗[X] = ∫_0^∞ v^t t p∗x μ∗x+t dt.    (11)
or, regarding them as one object, we have a multivariate counting process with 4 components. We can also define stochastic processes indicating presence in each state, Ij(t),⁶
⁵ The last step in equation (10) follows because the event {dN01(t) = 1} is just the event "survives to just before age x + t, then dies in the next instant", which has the probability t px μx+t dt.
⁶ We did not introduce S(t) for the 2-state model: we do so now; it is the same as N01(t).
[Figure: A sample path for the three-state model; vertical axis 0 to 2.5, horizontal axis Age (0 to 100).]
annuity payment functions aj(t) for each state, and sum assured functions for each possible transition, Ajk(t). Then all of the life insurance mathematics from the 2-state model carries over with only notational changes.
M01(t) = N01(t) − ∫_0^t I0(s) μs ds    (16)

M01(t) is called the compensated counting process, and the integral on the right hand side is called the compensator of N01(t). It is easy to see that M01(t) is a martingale from its increments:

dM01(t) = dN01(t) − I0(t) μt dt

and EP[dM01(t)] = 0. Alternatively, given a force of mortality μ∗t, we can find a probability measure P∗ such that M01(t) is a P∗-martingale; P∗ is simply given by the probabilities t p∗x = exp(−∫_0^t μ∗s ds). This is true of any (well-behaved) force of mortality, not just nature's chosen "true" force of mortality⁷.
An idea of the usefulness of M01 (t) can be gained from equation (13). If we con-
sider an age interval short enough that a constant transition intensity is a reasonable
approximation, this becomes:
M01(t) = N01(t) − μ ∫_0^t I0(s) ds.    (17)
EP[V(s) | Fs] = EP[ (1/v^s) ∫_0^T v^t dL(t) | Fs ]    (19)
⁷ This is exactly analogous to the equivalent martingale measure of financial mathematics, in which we are given the drift of a geometric Brownian motion (coincidentally, also often denoted μt) and then find a probability measure under which the discounted process is a martingale.
= EP[ (1/v^s) ∫_0^s v^t dL(t) | Fs ] + EP[ (1/v^s) ∫_s^T v^t dL(t) | Fs ]    (20)
The second term on the right is the prospective reserve at time s. If the information
Fs is the complete life history up to time s, it is the same as the usual prospective
reserve. However, this definition is more general; for example, under a joint-life second-
death assurance, the first death might not be reported, so that Fs represents incomplete
information. Also, it does not depend on the probabilistic nature of the process generating
the life history; it is not necessary to suppose that the process is Markov, for example.
If the process is Markov (as we often suppose) then conditioning on Fs simply means
conditioning on the state occupied at time s, which is very convenient in practice.
The first term on the right is minus the retrospective reserve. This definition of the
retrospective reserve is new (Norberg, 1991) and is not equivalent to classical definitions.
This is a striking achievement of the stochastic process approach: for convenience we also
list some of the notions of retrospective reserve that have preceded it:
(a) The classical retrospective reserve (for example, Neill (1977)) depends on a deter-
ministic cohort of lives, who share out a fund among survivors at the end of the
term. However, this just exposes the weaknesses of the deterministic model: given
a whole number of lives at outset, lx say, the number of survivors some time later, lx · t px, is usually not an integer. Viewed prospectively this can be excused as being a
convenient way of thinking about expected values, but viewed retrospectively there
is no such excuse.
(b) Hoem (1969) allowed both the number of survivors, and the fund shared among
survivors, to be random, and showed that the classical retrospective reserve was
obtained in the limit, as the number of lives increased to infinity.
(c) Perhaps surprisingly, the classical notion of retrospective reserve does not lead to a
unique specification of what the reserve should be in each state of a general Markov
model, leading to several alternative definitions (Hoem, 1988; Wolthuis & Hoem, 1990; Wolthuis, 1992) in which the retrospective and prospective reserves in the initial state
were equated by definition.
(d) Finally, Norberg (1991) pointed out that the classical retrospective reserve is ". . . rather a retrospective formula for the prospective reserve . . .", and introduced the definition in equation (20). This is properly defined for individual lives, and depends on
known information Fs. If Fs is the complete life history, the conditional expectation disappears and the retrospective reserve is simply:

−(1/v^s) ∫_0^s v^t dL(t).
Most systems of ODEs do not admit closed-form solutions, and have to be solved numerically, but many methods of solution are quite simple⁸, and well within the capability
of a modern PC. So, while closed-form solutions are nice, they are not too important,
and it is better to seek ODEs that are relevant to the problem, rather than explicitly
soluble. We would remind actuaries of a venerable example of a numerical solution to an
intractable ODE, namely the life table.
(d) The tools we use are exactly those that are essential in modern financial mathematics,
in particular stochastic integrals and conditional expectations. For a remarkable syn-
thesis of these two fields, see Møller (1998). An alternative approach, in which rates of return as well as lifetimes are modelled by Markov processes, has been developed (Norberg, 1995b), extending greatly the scope of the material discussed here.
(e) We have not discussed data analysis, but mortality studies are increasingly turning
towards counting process tools, for exactly the same reason as in (a). It will often be
helpful for actuaries at least to understand the language.
5. Finance
5.1 Introduction
In this section we are going to illustrate how stochastic processes can be used to price
financial derivatives.
A financial derivative is a contract which derives its value from some underlying
security. For example, a European call option on a share gives the holder the right, but
not the obligation, to buy the share at the exercise date T at the strike price of K. If the
share price at time T, ST, is less than K then the option will not be exercised and it will expire without any value. If ST is greater than K then the holder will exercise the option and a profit of ST − K will be made. The profit at T is, therefore, max{ST − K, 0}.
Suppose that we have a set of assets in which we can invest (with holdings which can
be positive or negative). Consider a particular portfolio which starts off with value zero
at time 0 (so we have some positive holdings and some negative). With this portfolio,
it is known that there is some time T in the future when its value will be non-negative
with certainty and strictly positive with probability greater than zero. This is called an
arbitrage opportunity. To exploit it we could multiply up all amounts by one thousand
or one million and make huge profits without any cost or risk.
In financial mathematics and derivative pricing we make the fundamental assumption
that arbitrage opportunities like this do not exist (or at least that if they do exist, they
disappear too quickly to be exploited).
As such the model appears to be quite unrealistic. However, it does provide us with good
insight into the theory behind more realistic models. Furthermore it provides us with an
effective computational tool for derivatives pricing.
At time 0 suppose we hold φ units of stock and ψ units of cash. The value of this portfolio at time 0 is V0. At time 1 the same portfolio has the value:

V1 = φS0u + ψe^r   if the stock price goes up
V1 = φS0d + ψe^r   if the stock price goes down

Let us choose φ and ψ so that V1 = fu if the stock price goes up and V1 = fd if the stock price goes down. Then:

φS0u + ψe^r = fu
and φS0d + ψe^r = fd
[Figure: The one-period binomial tree: from S0, the stock price moves up to S0u or down to S0d.]
Thus we have two linear equations in two unknowns, φ and ψ. We solve this system of equations and find that:

φ = (fu − fd) / (S0(u − d))

ψ = e^{-r}(fu − φS0u)
  = e^{-r}( fu − (fu − fd)u/(u − d) )
  = e^{-r}(fd u − fu d)/(u − d)
V0 = φS0 + ψ
   = (fu − fd)/(u − d) + e^{-r}(fd u − fu d)/(u − d)
   = [ (1 − d e^{-r})/(u − d) ] fu + [ (u e^{-r} − 1)/(u − d) ] fd
   = e^{-r}( q fu + (1 − q) fd )

where q = (e^r − d)/(u − d)

and 1 − q = (u − e^r)/(u − d) = 1 − (e^r − d)/(u − d).

Note that the no-arbitrage condition d < e^r < u ensures that 0 < q < 1.
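The one-period argument can be coded directly; the following sketch (example figures are ours) prices a derivative both via the replicating portfolio (stock and cash holdings, here called phi and psi) and via the risk-neutral expectation, and checks that the two agree:

```python
from math import exp

def one_period_price(S0, u, d, r, f):
    """Price of a derivative paying f(S1) at time 1 in the one-period
    binomial model, by replication and by risk-neutral expectation."""
    fu, fd = f(S0 * u), f(S0 * d)
    phi = (fu - fd) / (S0 * (u - d))              # units of stock held
    psi = exp(-r) * (fd * u - fu * d) / (u - d)   # units of cash held
    q = (exp(r) - d) / (u - d)                    # risk-neutral probability
    V0 = exp(-r) * (q * fu + (1 - q) * fd)
    assert abs(V0 - (phi * S0 + psi)) < 1e-9      # same answer both ways
    return V0

# Call option, strike 100, on a stock at 100 with u = 1.2, d = 0.9, r = 5%:
price = one_period_price(100.0, 1.2, 0.9, 0.05, lambda s: max(s - 100.0, 0.0))
```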
If we denote the payoff of the derivative at t = 1 by the random variable f(S1), we can write:

V0 = e^{-r} EQ(f(S1))
Under Q we see that the expected return on the risky stock is the same as that on
a risk-free investment in cash. In other words under the probability measure Q investors
are neutral with regard to risk: they require no additional returns for taking on more risk.
This is why Q is sometimes referred to as a risk-neutral probability measure.
Under the real-world measure P the expected return on the stock will not normally
be equal to the return on risk-free cash. Under normal circumstances investors demand
higher expected returns in return for accepting the risk in the stock price. Thus we would
normally find that p > q. However, this makes no difference to our analysis.
where the rate of discount used is the actuarial, risk-discount rate. Compare this with the price calculated using
the principles of financial economics above:
If contracts are trading at V0a, where V0a > V0, then we can sell one derivative at the actuarial price, and use an amount V0 to set up the replicating portfolio (φ, ψ) at time 0. The replicating portfolio ensures that we have the right amount of money at t = 1 to pay off the holder of the derivative contract. The difference between V0a and V0 is then guaranteed profit with no risk.
Similarly if V0a < V0 we can also make arbitrage profits.
(In fact neither of these situations could persist for any length of time because demand
for such contracts trading at V0a would push the price back towards V0 very quickly. This is
a fundamental principle of financial economics: that is, prices should not admit arbitrage
opportunities. If they did exist then the market would spot any opportunities very quickly
and the resulting excess supply or demand would remove the arbitrage opportunity before
any substantial profits could be made. In other words, arbitrage opportunities might
exist for very short periods of time in practice, while the market is free from arbitrage
for the great majority of time and certainly at any points in time where large financial
transactions are concerned. Of course, we would have no problem in buying such a
contract if we were to offer a price of V0a to the seller if this was greater than V0 but we
would not be able to sell at that price. Similarly we could easily sell such a contract if
V0a < V0 but not buy at that price. In both cases we would be left in a position where we
would have to maintain a risky portfolio in order to give ourselves a chance of a profit,
since hedging would result in a guaranteed loss.)
For V0a to make reasonable sense, then, we must set the risk-discount rate in such a way that V0a equals V0. In other words, the subjective choice of risk-discount rate in actuarial work equates to the objective selection of the risk-neutral probability measure Q. Choosing the rate to equate V0a and V0 is not what happens in practice and, although the rate is set with regard to the level of risk under the derivative contract, the subjective element in this choice means that there is
no guarantee that V0a will equal V0 . In general, therefore, the actuarial approach, on
its own, is not appropriate for use in derivative pricing. Where models are generalised
and assumptions weakened to such an extent that it is not possible to construct hedging
strategies which replicate derivative payoffs then there is a role for a combination of the
financial economic and actuarial approaches. However, this is beyond the scope of this
paper.
St = S0 u^{Nt} d^{t−Nt}

where Nt is the number of up-steps¹⁰ between time 0 and time t. This means that we
have n + 1 possible states at time n. We can see that the value of the stock price at time t
depends only upon the number of up and down steps and not on the order in which they
occurred. Because of this property the model is called a recombining binomial tree
or a binomial lattice (see Figure 6).
The sample space for this model, Ω, is the set of all sample paths from time 0 to time n. This is widely known as the random walk model. There are 2^n such sample paths since there are two possible outcomes in each time period. The information F is the σ-algebra
¹⁰ In this sense, Nt can also be regarded as a discrete-time counting process; see Section 4.
[Figure 6. A recombining binomial lattice over four periods: from S0, the possible prices at time 4 are S0u⁴, S0u³d, S0u²d², S0ud³ and S0d⁴.]
generated by all sample paths from time 0 to n while the filtrations Ft are generated by all sample paths up to time t. (Given the sample space Ω, each sample path up to time t is equivalent to 2^{n−t} elements of the sample space, each element being the same over the period 0 to t. Nt and St are random variables which are functions of the sample space.)
Under this model all periods have the same probability of an up step and steps in
each time period are independent of one another. Thus the number of up steps up to
time t, Nt , has under Q a binomial distribution with parameters t and q. Furthermore,
for 0 < t < n, Nt is independent of Nn − Nt, and Nn − Nt has a binomial distribution with parameters n − t and q.
Let us extend our notation a little bit. Let Vt(j) be the fair value of the derivative at time t given Nt = j, for j = 0, . . . , t. Also let Vn(j) = f(S0 u^j d^{n−j}). Finally we write Vt = Vt(Nt) to be the random value at some future time t.
In order for us to calculate the value at time 0, V0 (0), we must work backwards one
period at a time from time n making use of the one-period binomial model as we go.
First let us consider the time period n − 1 to n. Suppose that Nn−1 = j. Then, by analogy with the one-period model we have:

Vn−1(j) = e^{-r}( q Vn(j + 1) + (1 − q) Vn(j) )

and, repeating the argument back to any time t:

Vt = e^{-r(n−t)} EQ[ f( St u^{Nn−Nt} d^{(n−t)−(Nn−Nt)} ) | Nt ]
   = e^{-r(n−t)} Σ_{k=0}^{n−t} [ (n − t)! / (k!(n − t − k)!) ] f( St u^k d^{n−t−k} ) q^k (1 − q)^{n−t−k}.
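The backward recursion and the direct binomial sum must give the same time-0 value; a sketch of ours with illustrative parameters:

```python
from math import exp, comb

def price_backward(S0, u, d, r, n, f):
    """Backward induction: V_t(j) = e^{-r}(q V_{t+1}(j+1) + (1-q) V_{t+1}(j))."""
    q = (exp(r) - d) / (u - d)
    V = [f(S0 * u**j * d**(n - j)) for j in range(n + 1)]  # payoffs at time n
    for t in range(n - 1, -1, -1):
        V = [exp(-r) * (q * V[j + 1] + (1 - q) * V[j]) for j in range(t + 1)]
    return V[0]

def price_direct(S0, u, d, r, n, f):
    """Direct sum: e^{-rn} * sum_k C(n,k) q^k (1-q)^{n-k} f(S0 u^k d^{n-k})."""
    q = (exp(r) - d) / (u - d)
    return exp(-r * n) * sum(
        comb(n, k) * q**k * (1 - q)**(n - k) * f(S0 * u**k * d**(n - k))
        for k in range(n + 1)
    )

call = lambda s: max(s - 100.0, 0.0)
v1 = price_backward(100.0, 1.1, 0.95, 0.02, 10, call)
v2 = price_direct(100.0, 1.1, 0.95, 0.02, 10, call)
assert abs(v1 - v2) < 1e-9  # the two routes agree
```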
We have noted before that EQ(S1) = S0 e^r, giving rise to the use of the name risk-neutral measure for Q. Similarly in the n-period model we have (putting f(s) = s):

EQ(Sn) = S0 e^{rn}.
So the use of the expression risk-neutral measure for Q is still valid. Alternatively we
can write:
EQ[ e^{-rT} ST | Ft ] = e^{-rt} St.

In other words, the discounted asset value process Dt = e^{-rt} St is a martingale under Q.
This gives rise to another name for Q: equivalent martingale measure.
In fact we normally use this result the other way round, as we will see in the next
section. That is, the first thing we do is to find the equivalent martingale measure Q, and
then use it immediately to price derivatives.
Now consider the continuous-time model in which the asset price follows the geometric Brownian motion:

dS_t = μ S_t dt + σ S_t dZ_t.
By analogy with the binomial model there is another probability measure Q (the
risk-neutral measure or equivalent martingale measure) under which:
(a) e^{-rt} S_t is a martingale;
(b) S_t can be written as the geometric Brownian motion S0 exp[ (r - (1/2)σ²) t + σ Z̃(t) ],
where Z̃(t) is a standard Brownian motion under Q.
By continuing the analogy with the binomial model (for example, see Baxter & Rennie
(1996)) we can also say that the value at time t of the derivative is:

V_t = e^{-r(T-t)} E_Q[ f(S_T) | F_t ].
With a bit more work we can also see that, under this model, if we invest Vt in the
right way (that is, with a suitable hedging strategy), then we can replicate the payoff at
T without the need for extra cash.
Suppose that we consider a European call option, so that f(s) = max{s - K, 0}. Then
we can exploit a well known property of the log-normal distribution to get the celebrated
Black-Scholes formula:

V_t = S_t Φ(d_1) - K e^{-r(T-t)} Φ(d_2),

where Φ is the standard normal distribution function,

d_1 = [ log(S_t/K) + (r + σ²/2)(T - t) ] / ( σ √(T - t) )   and   d_2 = d_1 - σ √(T - t).
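The resulting call price is easy to evaluate numerically. A short Python sketch (function and parameter names are our own; the values used are illustrative):

```python
from math import log, sqrt, exp, erf

def norm_cdf(x):
    """Standard normal distribution function Phi(x), via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def black_scholes_call(S, K, r, sigma, tau):
    """Black-Scholes price of a European call with time tau = T - t to expiry,
    discounting the strike at the risk-free rate r."""
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))
    d2 = d1 - sigma * sqrt(tau)
    return S * norm_cdf(d1) - K * exp(-r * tau) * norm_cdf(d2)
```

For example, with S = K = 100, r = 5%, sigma = 20% and one year to expiry, the price is about 10.45; increasing sigma increases the price, as the formula predicts.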
6. Risk Theory
6.1 Introduction
In this section we show how stochastic processes can be used to gain insight into the
pricing of general insurance policies. In particular, we will make use of the notion of a
martingale, Brownian motion and also the optional stopping theorem.
Suppose we have a general insurance risk, for example comprehensive insurance for
a fleet of cars or professional indemnity insurance for a software supplier, for which cover
is required on an annual basis. We want to answer the question: In an ideal world,
how should we calculate the annual premium for this cover? Let us denote by Pn the
premium to be charged for cover in year n, where the coming year is year 1.
The starting point is to consider the claims which will arise each year. Since the
aggregate amount of claims is uncertain, we model these amounts as random variables.
Let Sn be a random variable denoting the aggregate claims arising from the risk in year n.
We might also take into consideration, particularly for a large risk, the amount of capital
with which we are prepared to back this risk. We denote this initial capital, or initial
surplus, by U. In practice, we would also take into consideration many other factors, for
example, expenses and the premium rates charged by our competitors. However, to keep
things simple, we will ignore these other factors.
Throughout this section we will assume that the random variables {S_n}_{n=1}^∞ are inde-
pendent of each other, but we will not assume they are identically distributed. To be able
to calculate Pn we need to be able to calculate, or at least estimate, the distribution of
Sn . If we have information about the distributions of claim numbers and claim amounts
in year n, we may be able to use Panjer's celebrated recursion formula to calculate the
distribution of Sn . See, for example, Klugman et al (1997). In some circumstances, for
example when S_n is the sum of a large number of independent claim amounts, it may be
reasonable to assume that S_n is approximately normally distributed:

S_n ~ N(μ_n, σ_n²)                                                                  (24)

for some parameters μ_n and σ_n.
One simple premium principle is then to set:

P_n = μ_n + z_p σ_n                                                                 (26)

where z_p is such that:

Φ(z_p) = p,

so that the premium is sufficient to meet the year's claims with probability approximately p.
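Formula (26) is straightforward to evaluate with the inverse of the standard normal distribution function; a minimal Python sketch (the moments and loading probability below are illustrative assumptions):

```python
from statistics import NormalDist

def premium_normal(mu, sigma, p):
    """Premium P_n = mu_n + z_p * sigma_n, where Phi(z_p) = p, so that
    Pr(S_n <= P_n) is approximately p under the normal approximation (24)."""
    z_p = NormalDist().inv_cdf(p)    # z_p satisfying Phi(z_p) = p
    return mu + z_p * sigma
```

For example, with mu = 100, sigma = 10 and p = 0.95, the premium is about 116.45 (z_0.95 is roughly 1.645); with p = 0.5 the premium is just the mean.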
An alternative approach is based on a utility function u(x) satisfying:

(d/dx) u(x) > 0   and   (d²/dx²) u(x) < 0.

The first of these conditions says that we always prefer more money. The second
condition says that an extra pound is worth less, the wealthier we are. See Bowers et
al. (1997) for more details.
We can use this utility function to calculate P_n from the following formula:

u(W) = E[ u(W + P_n - S_n) ]                                                        (27)

where W denotes our wealth at the start of the year. If we do not insure the risk, our
utility of wealth at the end of the year is u(W); if we do
insure the risk, our expected utility of wealth at the end of the year is E[u(W + P_n - S_n)].
Formula (27) says that for a premium of P_n we are indifferent between insuring the risk
and not insuring it. In this sense the premium P_n is fair. See, for example, Bowers et
al. (1997) for details.
Now assume that u(x) is an exponential utility function with parameter α > 0,
so that:

u(x) = 1 - e^{-αx}.                                                                 (28)

Using (28) in (27) and solving for P_n, we have:

P_n = (1/α) log E[ exp{α S_n} ].                                                    (29)

In particular, if S_n has the normal distribution in (24), then (29) reduces to:

P_n = μ_n + (α/2) σ_n².                                                             (30)
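For normally distributed S_n the expectation in (29) is available in closed form (log E[exp{α S_n}] = α μ_n + α²σ_n²/2), which gives a simple simulation check; a Python sketch with illustrative parameters:

```python
import math
import random

def exp_utility_premium(alpha, samples):
    """P = (1/alpha) * log E[exp(alpha * S)], estimated from samples of S."""
    m = sum(math.exp(alpha * s) for s in samples) / len(samples)
    return math.log(m) / alpha

# illustrative assumptions: S_n ~ N(100, 10^2), risk aversion alpha = 0.01
random.seed(1)
mu, sigma, alpha = 100.0, 10.0, 0.01
samples = [random.gauss(mu, sigma) for _ in range(200_000)]
P_mc = exp_utility_premium(alpha, samples)
P_exact = mu + 0.5 * alpha * sigma**2   # closed form for normal S_n
```

The Monte Carlo estimate P_mc should be close to the exact value 100.5, and both exceed the pure mean 100 — the exponential principle always adds a positive loading.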
A major difference in the development so far is that for (26) the loading factor p has an
intuitive meaning, whereas in (30), or (29), the parameter is not so easily understood
or quantified. To fill in this gap, we need to consider our surplus at the end of each year
in the future.
Let U_n denote the surplus at the end of the n-th year, so that:

U_n = U + Σ_{k=1}^{n} (P_k - S_k).
Define ψ(U) to be the probability that at some time in the future we will need more capital to back
this risk, that is, in more emotive language, the probability of ultimate ruin in discrete
time for our surplus process. Let us assume that P_n is calculated using (29).
Consider the process {Y_n}_{n=0}^∞, where Y_n = exp{-α U_n} for n = 0, 1, 2, ... . Then
{Y_n}_{n=0}^∞ is a martingale with respect to {S_n}_{n=1}^∞. To prove this, note that:

U_{n+1} = U_n + P_{n+1} - S_{n+1}

implies that:

E[Y_{n+1} | S_1, . . . , S_n]
 = E[ exp{-α (U_n + P_{n+1} - S_{n+1})} | S_1, . . . , S_n ]
 = E[ exp{-α (P_{n+1} - S_{n+1})} | S_1, . . . , S_n ] exp{-α U_n}.
Since {S_n}_{n=1}^∞ is a sequence of independent random variables, it follows that:

E[ exp{-α (P_{n+1} - S_{n+1})} | S_1, . . . , S_n ] = exp{-α P_{n+1}} E[ exp{α S_{n+1}} ] = 1,

by the choice of P_{n+1} in (29), and hence E[Y_{n+1} | S_1, . . . , S_n] = Y_n, as required.
Now, for b > U, define the stopping time:

T = min{ n : U_n ≤ 0 or U_n ≥ b }.
Thus, the process stops when the first of two events occurs: (i) ruin; or, (ii) the surplus
reaches at least level b. The optional stopping theorem tells us that:

E[ exp{-α U_T} ] = exp{-α U}.
Let p_b denote the probability that ruin occurs without the surplus ever having been
at level b or above. Then, conditioning on the two events described above:

exp{-α U} = p_b E[ exp{-α U_T} | U_T ≤ 0 ] + (1 - p_b) E[ exp{-α U_T} | U_T ≥ b ].      (31)

Now let b → ∞. Then p_b → ψ(U); the first expectation on the right hand side of (31)
is at least 1, since it is the moment generating function of the deficit at ruin, evaluated at
α; and the second expectation goes to 0, since it is bounded above by exp{-α b}. Thus:

exp{-α U} = ψ(U) E[ exp{-α U_T} | U_T ≤ 0 ]

giving:

ψ(U) = exp{-α U} / E[ exp{-α U_T} | U_T ≤ 0 ] ≤ exp{-α U}.                           (32)
This gives us a simple bound on the probability of ultimate ruin. It also suggests an
appropriate value for the parameter α in formula (29). For example, if the initial surplus
is 10, and we require that the probability of ultimate ruin be no more than 1%, then
we require α such that exp{-10α} = 0.01, giving α = 0.461.
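This calculation can be reproduced directly from the bound (32); a one-line Python sketch (the helper name is ours):

```python
import math

def alpha_for_ruin_bound(U, epsilon):
    """Smallest alpha with exp(-alpha * U) <= epsilon: the risk-aversion
    parameter making the bound (32) on ultimate ruin equal to epsilon."""
    return -math.log(epsilon) / U

alpha = alpha_for_ruin_bound(10.0, 0.01)   # about 0.461, as in the text
```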
Let us consider the special case when {S_n}_{n=1}^∞ is a sequence of independent and iden-
tically distributed random variables, each having a compound Poisson distribution with Poisson
parameter λ. Then P_n is independent of n, say P_n = P, and:

P = (1/α) log E[ exp{α S_n} ]
  = (1/α) log [ exp{ λ (M_X(α) - 1) } ]

where M_X(·) is the moment generating function of the distribution of a single claim
amount, giving:

λ + α P = λ M_X(α).                                                                 (33)
This is the equation for the adjustment coefficient for this model (see, for exam-
ple, Gerber (1979)). Thus, in this particular case, the parameter α is the adjustment
coefficient, and equation (32) is simply Lundberg's inequality.
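Equation (33) can be solved numerically for α. A Python sketch assuming exponentially distributed claim amounts with mean m, so that M_X(t) = 1/(1 - mt) for t < 1/m (λ, m and P below are illustrative assumptions):

```python
def adjustment_coefficient(lam, m, P, tol=1e-12):
    """Solve lam + alpha * P = lam * M_X(alpha) for alpha in (0, 1/m) by
    bisection, with M_X(t) = 1/(1 - m*t) (exponential claims, mean m)."""
    g = lambda a: lam + a * P - lam / (1.0 - m * a)
    # with a positive loading (P > lam*m), g > 0 just above 0
    # and g -> -infinity as alpha -> 1/m, so a root lies in between
    lo, hi = 1e-12, 1.0 / m - 1e-12
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For exponential claims the root is known in closed form, (P - λm)/(mP); for example, λ = 1, m = 1 and P = 1.2 give an adjustment coefficient of 1/6, which the bisection recovers.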
Now let us move to continuous time and model the aggregate claims up to time t by a
Brownian motion with drift, {B(t)}_{t≥0}, so that for s < t:

B(t) - B(s) ~ N( μ(t - s), σ²(t - s) ).                                             (34)
We assume that premiums are received continuously at constant rate P per annum,
where:
P = μ + (1/2) α σ²                                                                  (35)
which is equivalent to formula (30). The surplus at time t is denoted U(t), where:

U(t) = U + P t - B(t).

Define ψ_c(U) to be the probability that U(t) ≤ 0 for some t > 0, so that ψ_c(U) is the
probability of ultimate ruin in continuous time for our surplus process.
We will apply the martingale argument of the previous subsection to find ψ_c(U).
Consider the process Y(t) = exp{-α U(t)}. First note that:

E[ Y(t) | B(u), 0 ≤ u ≤ s ]
 = E[ exp{-α U(t)} | B(u), 0 ≤ u ≤ s ]
 = E[ exp{-α (U(t) - U(s))} exp{-α U(s)} | B(u), 0 ≤ u ≤ s ].
Hence:

E[ Y(t) | B(u), 0 ≤ u ≤ s ]
 = E[ exp{-α (U(t) - U(s))} | B(u), 0 ≤ u ≤ s ] E[ exp{-α U(s)} | B(u), 0 ≤ u ≤ s ]
 = E[ exp{-α (U(t) - U(s))} ] exp{-α U(s)}.
Now:

U(t) - U(s) = P (t - s) - (B(t) - B(s)).
We can then use the fact that the process {B(t)}_{t≥0} has stationary increments to say that
B(t) - B(s) is equivalent in distribution to B(t - s), and hence U(t) - U(s) is equivalent
in distribution to U(t - s) - U. It then follows from (34) and (35) that
E[ exp{-α (U(t) - U(s))} ] = 1, so that E[Y(t) | B(u), 0 ≤ u ≤ s] = Y(s) and {Y(t)}_{t≥0}
is a martingale. (Recall that a martingale is defined with respect to a filtration: here we mean that the relevant
filtration is that generated by the process {B(t)}_{t≥0}.)
Define the stopping time T_c = min{ t : U(t) ≤ 0 or U(t) ≥ b }, and once again define p_b
to be the probability that ruin occurs without the surplus process
ever having been at level b or above. The optional stopping theorem gives:

exp{-α U} = p_b E[ exp{-α U(T_c)} | U(T_c) ≤ 0 ] + (1 - p_b) E[ exp{-α U(T_c)} | U(T_c) ≥ b ].
If the surplus level attains b without ruin occurring, then U(T_c) = b, since the sample
paths of Brownian motion are continuous, i.e. the process cannot jump from below b to
above b without passing through b. The situation is the same if ruin occurs, so that
U(T_c) = 0. Hence:

E[ exp{-α U(T_c)} | U(T_c) ≤ 0 ] = 1

and:

E[ exp{-α U(T_c)} | U(T_c) ≥ b ] = exp{-α b}.

Thus:

p_b + exp{-α b} (1 - p_b) = exp{-α U}.
Now let b → ∞; then p_b → ψ_c(U) and the second term on the left goes to 0, giving:

ψ_c(U) = exp{-α U}.                                                                 (38)
Formula (38) is Lundberg's inequality for our continuous time surplus process, but
it is now an equality. Going back to formula (32), this shows that the upper bound for
the probability of ruin in discrete time, in the special case where the mean and variance
of claims do not change over time, is just the exact probability of ruin in continuous time.
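Combining (35) and (38): since P = μ + (1/2)ασ², we have α = 2(P - μ)/σ², so the ruin probability is exp{-2(P - μ)U/σ²}. A minimal Python sketch with illustrative numbers:

```python
import math

def ruin_prob_continuous(U, P, mu, sigma2):
    """psi_c(U) = exp(-alpha * U), where alpha = 2 * (P - mu) / sigma2
    is obtained by rearranging (35) and substituting into (38)."""
    alpha = 2.0 * (P - mu) / sigma2
    return math.exp(-alpha * U)
```

For example, with claim drift mu = 10, variance rate sigma2 = 4, premium rate P = 11 and initial surplus U = 10, alpha = 0.5 and the ruin probability is exp(-5), roughly 0.7%.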
References
Baxter, M. & Rennie, A. (1996). Financial calculus. Cambridge University Press.
Bowers, N.L., Gerber, H.U., Hickman, J.C., Jones, D.A. & Nesbitt, C.J. (1997).
Actuarial mathematics (2nd edition). Society of Actuaries, Schaumburg, IL.
Gerber, H.U. (1979). An introduction to mathematical risk theory. S.S. Huebner Foundation
Monograph Series No. 8. Distributed by R. Irwin, Homewood, IL.
Goovaerts, M.J., De Vylder, F. & Haezendonck, J. (1984). Insurance premiums. North-
Holland, Amsterdam.
Grimmett, G.R. & Stirzaker, D.R. (1992). Probability and random processes. Oxford Uni-
versity Press.
Hoem, J.M. (1969). Markov chain models in life insurance. Blätter der Deutschen Gesellschaft
für Versicherungsmathematik 9, 91-107.
Hoem, J.M. (1988). The versatility of the Markov chain as a tool in the mathematics of
life insurance. Transactions of the 23rd International Congress of Actuaries, Helsinki S,
171-202.
Hoem, J.M. & Aalen, O.O. (1978). Actuarial values of payment streams. Scandinavian
Actuarial Journal 1978, 38-47.
Klugman, S.A., Panjer, H.H. & Willmot, G.E. (1997). Loss models: from data to
decisions. John Wiley & Sons, New York.
Kulkarni, V.G. (1995). Modeling and analysis of stochastic systems. Chapman & Hall,
London.
Macdonald, A.S. (1996a). An actuarial survey of statistical models for decrement and tran-
sition data I: Multiple state, Poisson and Binomial models. British Actuarial Journal 2,
129-155.
Macdonald, A.S. (1996b). An actuarial survey of statistical models for decrement and tran-
sition data III: Counting process models. British Actuarial Journal 2, 703-726.
Møller, T. (1998). Risk-minimising hedging strategies for unit-linked life insurance contracts.
ASTIN Bulletin 28, 17-47.
Neill, A. (1977). Life contingencies. Heinemann, London.
Norberg, R. (1990). Payment measures, interest and discounting. Scandinavian Actuarial
Journal 1990, 14-33.
Norberg, R. (1991). Reserves in life and pension insurance. Scandinavian Actuarial Journal
1991, 3-24.
Norberg, R. (1992). Hattendorff's theorem and Thiele's differential equation generalized.
Scandinavian Actuarial Journal 1992, 2-14.
Norberg, R. (1993a). Addendum to "Hattendorff's theorem and Thiele's differential equation
generalized", SAJ 1992, 2-14. Scandinavian Actuarial Journal 1993, 50-53.
Norberg, R. (1993b). Identities for life insurance benefits. Scandinavian Actuarial Journal
1993, 100-106.
Norberg, R. (1995a). Differential equations for moments of present values in life insurance.
Insurance: Mathematics & Economics 17, 171-180.
Norberg, R. (1995b). Stochastic calculus in actuarial science. University of Copenhagen
Working Paper 135, 23 pp.
APPENDIX A
Brownian Motion
APPENDIX B
Stochastic Differential Equations
With traditional calculus we have no problem in dealing with the first integral. How-
ever, in the second integral the usual Riemann-Stieltjes approach fails because Z_u is just
too volatile a function. (This is related to the fact that Z_t is not differentiable.) In fact
the second integral is dealt with using Itô integration. A good treatment of this can be
found in Øksendal (1998).
Writing down the stochastic integral is not really very informative, and it is useful to
have, if possible, a closed expression for X_t. An important result which allows us to do
this in many cases is Itô's Lemma: suppose that X_t and Y_t are diffusion processes with
dX_t = m(t, X_t)dt + s(t, X_t)dZ_t and Y_t = f(t, X_t) for some function f(t, x). Then:

dY_t = (∂f/∂t)(t, X_t) dt + (∂f/∂x)(t, X_t) dX_t + (1/2) (∂²f/∂x²)(t, X_t) s(t, X_t)² dt.
For example, suppose that X_t = Z_t and Y_t = exp[at + bX_t] = f(t, X_t). Then:
∂f/∂t = a exp[at + bx] = a f(t, x)

∂f/∂x = b f(t, x)

∂²f/∂x² = b² f(t, x).
Thus, by Itô's Lemma:

dY_t = a Y_t dt + b Y_t dX_t + (1/2) b² Y_t dt
     = (a + (1/2) b²) Y_t dt + b Y_t dZ_t.
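The drift term (a + (1/2)b²)Y_t implies E[Y_t] = exp[(a + (1/2)b²)t] (since the dZ_t term has zero mean), which is easy to confirm by simulating Z_t directly; an illustrative Python sketch (parameter values are our own):

```python
import math
import random

def mean_Y(a, b, t, n_paths, seed=2):
    """Monte Carlo estimate of E[exp(a*t + b*Z_t)], with Z_t ~ N(0, t)."""
    rng = random.Random(seed)
    total = sum(math.exp(a * t + b * rng.gauss(0.0, math.sqrt(t)))
                for _ in range(n_paths))
    return total / n_paths

a, b, t = 0.1, 0.2, 1.0
estimate = mean_Y(a, b, t, 200_000)
exact = math.exp((a + 0.5 * b * b) * t)   # E[Y_t] implied by the dt term
```

The extra (1/2)b² in the exponent is exactly the Itô correction: naively exponentiating the drift a alone would underestimate the mean.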