Subject CS2
Revision Notes
For the 2019 exams
Survival analysis
Booklet 4
covering
Chapter 6 Survival models

Chapter 7 Estimating the lifetime distribution function
Chapter 8 Proportional hazards models
The Actuarial Education Company

CONTENTS
Links to the Course Notes and Syllabus
Overview
Core Reading
Past Exam Questions
Solutions to Past Exam Questions
Factsheet
Copyright agreement
All of this material is copyright. The copyright belongs to Institute and

Faculty Education Ltd, a subsidiary of the Institute and Faculty of Actuaries.
The material is sold to you for your own exclusive use. You may not hire
out, lend, give, sell, transmit electronically, store electronically or photocopy
any part of it. You must take care of your material to ensure it is not used or
copied by anyone at any time.
Legal action will be taken if these terms are infringed. In addition, we may
seek to take disciplinary action through the profession or through your
employer.
These conditions remain in force after you have finished using the course.
© IFE: 2019 Examinations Page 1

LINKS TO THE COURSE NOTES AND SYLLABUS
Material covered in this booklet
Chapter 6 Survival models

Chapter 7 Estimating the lifetime distribution function
Chapter 8 Proportional hazards models
These chapter numbers refer to the 2019 edition of the ActEd course notes.
Syllabus objectives covered in this booklet
4.1 Explain the concept of survival models.
4.1.1 Describe the model of lifetime or failure time from age x

as a random variable.
4.1.2 State the consistency condition between the random
variable representing lifetimes from different ages.
4.1.3 Define the distribution and density functions of the random
future lifetime, the survival function, the force of mortality
or hazard rate, and derive relationships between them.
4.1.4 Define the actuarial symbols ${}_t p_x$ and ${}_t q_x$ and derive
integral formulae for them.
4.1.5 State the Gompertz and Makeham laws of mortality.
4.1.6 Define the curtate future lifetime from age $x$ and state its
probability function.
4.1.7 Define the symbols $e_x$ and $\mathring{e}_x$ and derive an approximate
relation between them. Define the expected value and
variance of the complete and curtate future lifetimes and
derive expressions for them.
4.2 Describe estimation procedures for lifetime distributions.
4.2.1 Describe the various ways in which lifetime data might be

censored.
4.2.2 Describe the estimation of the empirical survival function in

the absence of censoring, and what problems are
introduced by censoring.
4.2.3 Describe the Kaplan-Meier (or product limit) estimator of
the survival function in the presence of censoring, compute
it from typical data and estimate its variance.
4.2.4 Describe the Nelson-Aalen estimator of the cumulative
hazard rate in the presence of censoring, compute it from
typical data and estimate its variance.
4.2.5 Describe models for proportional hazards, and how these
models can be used to estimate the impact of covariates
on the hazard.
4.2.6 Describe the Cox model for proportional hazards, derive
the partial likelihood estimate in the absence of ties and
state the asymptotic distribution of the partial likelihood
estimator.

OVERVIEW
This booklet covers Syllabus objectives 4.1.1-4.1.7 and 4.2, relating to

survival analysis.
The following topics are included:
• the lifetime model (based on the lifetime random variables $T$ and $K$)
• types of censoring
• the Kaplan-Meier model
• the Nelson-Aalen model
• proportional hazards models (including the Cox regression model).
Models that make an assumption about the distribution of the lifetime
random variable, eg $T \sim \text{Exp}(\lambda)$, are described as parametric models.
The Kaplan-Meier and Nelson-Aalen models are non-parametric models.

They aim to estimate a particular feature of the distribution of the lifetime
random variable, eg cumulative distribution function or integrated hazard
function, without making any assumptions about the distribution of the
lifetime random variable itself.
The Kaplan-Meier and Nelson-Aalen models assume that the same

stochastic model applies to all the individuals in the population being
modelled and that the censoring of observations is non-informative, ie the
same stochastic model applies to both the censored and the uncensored
individuals.
Proportional hazards models such as the Cox model are semi-parametric. The
effect of the covariates on the hazard is given a parametric form (the
proportionality part), while the baseline hazard is left unspecified. These
models allow us to compare the hazard functions for different classes of lives.
As well as carrying out calculations, you need to be able to explain the

assumptions made by each model and to interpret the results.
You should also be able to define and give examples of different types of
censoring, and to spot all the types of censoring that are present in a given
situation.

CORE READING
All of the Core Reading for the topics covered in this booklet is contained in
this section.
We have inserted paragraph numbers in some places, such as 1, 2, 3 …, to

help break up the text. These numbers do not form part of the Core
Reading.
The text given in Arial Bold font is Core Reading.
The text given in Arial Bold Italic font is additional Core Reading that is not
directly related to the topic being discussed.
____________
Chapter 6 – Survival models
The future lifetime model
The starting point for a simple mathematical model of survival is the

observation that the future lifetime of a person (called a ‘life’ in
actuarial work) is not known in advance. Further, we observe that
lifetimes range from 0 to in excess of 100 years. A natural assumption
therefore is that the future lifetime of a given life is a random variable.
1 Assumption
The future lifetime of a new-born person is a random variable,
denoted $T$, which is continuously distributed on an interval $[0, \omega]$
where $0 < \omega < \infty$.
____________
2 The maximum age $\omega$ is called the limiting age.
Typical values of $\omega$ for practical work are in the range 100–120. The
possibility of survival beyond age $\omega$ is excluded by the model for
convenience and simplicity.
____________

3 Distribution function and survival function of a new-born life
$F(t) = P[T \le t]$ is the distribution function of $T$
$S(t) = P[T > t] = 1 - F(t)$ is the survival function of $T$
____________
4 We often need to deal with ages greater than zero. To meet this need,
we define $T_x$ to be the future lifetime after age $x$, of a life who
survives to age $x$, for $0 \le x \le \omega$. Note that $T_0 = T$.
____________
5 Distribution function and survival function of a life aged x
For $0 \le x \le \omega$:
$F_x(t) = P[T_x \le t]$ is the distribution function of $T_x$
$S_x(t) = P[T_x > t] = 1 - F_x(t)$ is the survival function of $T_x$

____________
6 For consistency with $T$, the distribution function of the random
variable $T_x$ ($0 \le x \le \omega$) must satisfy the following relationships:
$$F_x(t) = P[T_x \le t] = P[T \le x + t \mid T > x] = \frac{F(x+t) - F(x)}{S(x)}$$
____________
7 Probabilities of survival and death
We now introduce the notation used by actuaries for probabilities of

death and survival. Define:
${}_t q_x = F_x(t)$
${}_t p_x = 1 - {}_t q_x = S_x(t)$
____________
It is convenient in much actuarial work to use a time unit of one year.
When this is the case, so that $t = 1$, we omit the ‘t’ from these
probabilities. That is, we define:
$q_x = {}_1 q_x$ and $p_x = {}_1 p_x$
$q_x$ and ${}_t q_x$ are called rates of mortality.

____________
The force of mortality
8 A quantity which plays a central role in a survival model is the force of

mortality (which is more widely known as the hazard rate in statistics).
____________
9 We denote the force of mortality at age $x$ ($0 \le x < \omega$) by $\mu_x$, and define
it as:
$$\mu_x = \lim_{h \to 0^+} \frac{1}{h} \times P[T \le x + h \mid T > x]$$
We will always suppose that the limit exists.
The interpretation of $\mu_x$ is very important.
____________
The probability $P[T \le x + h \mid T > x]$ is (from the definitions
above) $F_x(h) = {}_h q_x$.
____________
10 For small $h$, we can ignore the limit and write:
$${}_h q_x \approx h \, \mu_x$$
In other words, the probability of death in a short time $h$ after age $x$ is
roughly proportional to $h$, the constant of proportionality being $\mu_x$.
____________

11 For $x \ge 0$ and $t > 0$, we could define the force of mortality $\mu_{x+t}$ in two
ways:
(1) $\mu_{x+t} = \lim_{h \to 0^+} \frac{1}{h} \times P[T \le x + t + h \mid T > x + t]$
(2) $\mu_{x+t} = \lim_{h \to 0^+} \frac{1}{h} \times P[T_x \le t + h \mid T_x > t]$
It is an easy exercise to show from the definitions that these are equal.
We will often use $\mu_{x+t}$ for a fixed age $x$ and $0 \le t < \omega - x$.
____________
12 The definition of $S_x(t)$ leads to an important relationship:
$$S_x(t) = P[T_x > t] = P[T > x + t \mid T > x] = \frac{P[T > x + t]}{P[T > x]} = \frac{S(x+t)}{S(x)}$$
which can be expressed in actuarial notation as:
$${}_t p_x = \frac{{}_{x+t} p_0}{{}_x p_0}$$
____________
13 Therefore, for any age $x$ and for $s > 0$, $t > 0$:
$${}_{s+t} p_x = \frac{{}_{x+s+t} p_0}{{}_x p_0} = \frac{{}_{x+s} p_0}{{}_x p_0} \times \frac{{}_{x+s+t} p_0}{{}_{x+s} p_0} = {}_s p_x \times {}_t p_{x+s}$$
Similarly:
$${}_{s+t} p_x = {}_t p_x \times {}_s p_{x+t}$$
____________

In words, the probability of surviving for time (s + t ) after age x is

given by multiplying:
(i) the probability of surviving for time s , and
(ii) the probability of then surviving for a further time t
or by multiplying:
(i) the probability of surviving for time t , and
(ii) the probability of then surviving for a further time s .

____________
The probability density function of Tx
14 The distribution function of $T_x$ is $F_x(t)$, by definition. We also want
to know its probability density function (PDF).
Denote this by $f_x(t)$, and recall that $f_x(t) = \frac{d}{dt} F_x(t)$.
Then:
$$\begin{aligned}
f_x(t) &= \frac{d}{dt} P[T_x \le t] \\
&= \lim_{h \to 0^+} \frac{1}{h} \times \left( P[T_x \le t + h] - P[T_x \le t] \right) \\
&= \lim_{h \to 0^+} \frac{P[T \le x + t + h \mid T > x] - P[T \le x + t \mid T > x]}{h} \\
&= \lim_{h \to 0^+} \frac{P[T \le x + t + h] - P[T \le x] - \left( P[T \le x + t] - P[T \le x] \right)}{S(x) \times h} \\
&= \lim_{h \to 0^+} \frac{P[T \le x + t + h] - P[T \le x + t]}{S(x) \times h}
\end{aligned}$$
Now multiply and divide by $S(x+t)$ and we have:
$$\begin{aligned}
f_x(t) &= \frac{S(x+t)}{S(x)} \times \lim_{h \to 0^+} \frac{1}{h} \, \frac{P[T \le x + t + h] - P[T \le x + t]}{S(x+t)} \\
&= S_x(t) \times \lim_{h \to 0^+} \frac{1}{h} P[T \le x + t + h \mid T > x + t] \\
&= S_x(t) \times \mu_{x+t}
\end{aligned}$$
or, in actuarial notation, for a fixed age $x$ between 0 and $\omega$:
$$f_x(t) = {}_t p_x \, \mu_{x+t} \qquad (0 \le t < \omega - x)$$
This is one of the most important results concerning survival models.

____________
Let’s summarise the model we have introduced. $T_x$ is the (random)
future lifetime after age $x$:
• it is by assumption a continuous random variable taking values in
$[0, \omega - x]$;
• its distribution function is $F_x(t) = {}_t q_x$; and
• its probability density function is $f_x(t) = {}_t p_x \, \mu_{x+t}$.
The force of mortality is interpreted by the approximate relationship:
$${}_h q_x \approx h \, \mu_x \quad \text{(for small } h \text{)}$$
The survival functions $S_x(t)$ or ${}_t p_x$ satisfy the relationship:
$${}_{s+t} p_x = {}_s p_x \times {}_t p_{x+s} = {}_t p_x \times {}_s p_{x+t} \quad \text{(for any } s > 0, \ t > 0 \text{)}$$

____________

Initial and central rates of mortality
15 $q_x$ is called an initial rate of mortality, because it is the probability that
a life alive at exact age $x$ (the initial time) dies before exact age $x + 1$.
____________
16 An alternative often used (especially in demography) is the central rate
of mortality, denoted $m_x$:
$$m_x = \frac{q_x}{\int_0^1 {}_t p_x \, dt}$$
The quantity $m_x$ is the probability of dying between exact ages $x$
and $x + 1$ per person-year lived between exact ages $x$ and $x + 1$; the
denominator $\int_0^1 {}_t p_x \, dt$ is interpreted as the expected amount of time
spent alive between ages $x$ and $x + 1$ by a life alive at age $x$ and the
numerator is the probability of that life dying between exact ages $x$
and $x + 1$.
____________
$m_x$ is useful when the aim is to project numbers of deaths, given the
number of lives alive in age groups; this is one of the basic
components of a population projection. In practice the age groups
used in population projection are often broader than one year, so the
definition of $m_x$ has to be suitably adjusted.
Historically, $m_x$ was estimated by statistics of the form:
$$\frac{\text{Number of deaths}}{\text{Total time spent alive and at risk}}$$
called ‘occurrence-exposure rates’.
More recently, these statistics have been used to estimate the force of
mortality rather than $m_x$, because in that context they have a solid
basis in terms of a probabilistic model.
____________
17 However, if $\mu_{x+t}$ is a constant, $\mu$, between ages $x$ and $x + 1$, then:
$$m_x = \frac{q_x}{\int_0^1 {}_t p_x \, dt} = \frac{\int_0^1 {}_t p_x \, \mu \, dt}{\int_0^1 {}_t p_x \, dt} = \mu$$
so there is still a close connection.

____________
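The equality of the central rate and a constant force of mortality can be checked numerically. The R sketch below is illustrative only: the value of mu is a hypothetical assumption, and the survival probability exp(-mu * t) is the constant-force result derived later in these notes.

```r
# Hypothetical constant force of mortality between ages x and x + 1
mu <- 0.02

# Survival probability t_p_x under a constant force: exp(-mu * t)
tpx <- function(t) exp(-mu * t)

# Initial rate q_x: probability of dying within the year
qx <- 1 - tpx(1)

# Expected time spent alive between ages x and x + 1
exposure <- integrate(tpx, lower = 0, upper = 1)$value

# Central rate m_x = q_x divided by the integral of t_p_x over the year
mx <- qx / exposure

c(qx = qx, mx = mx, mu = mu)
```

Under these assumptions the central rate should agree with mu, while the initial rate is slightly smaller because the exposure in the denominator of the central rate is less than one year.
____________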
Expected future lifetime
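18 The expected future lifetime after age $x$, which is referred to by
demographers as the expectation of life at age $x$, is defined as $E[T_x]$.
It is denoted $\mathring{e}_x$.
____________
19 By definition:
$$\begin{aligned}
\mathring{e}_x &= \int_0^{\omega - x} t \, {}_t p_x \, \mu_{x+t} \, dt \\
&= \int_0^{\omega - x} t \left( -\frac{\partial}{\partial t} \, {}_t p_x \right) dt \\
&= -\left[ t \, {}_t p_x \right]_0^{\omega - x} + \int_0^{\omega - x} {}_t p_x \, dt \quad \text{(integrating by parts)} \\
&= \int_0^{\omega - x} {}_t p_x \, dt
\end{aligned}$$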
____________

20 To define the curtate expectation of life, we first need to define $K_x$, the
curtate future lifetime of a life age $x$.
____________
21 The curtate future lifetime of a life age $x$ is:
$$K_x = [T_x]$$
where the square brackets denote the integer part. In words, $K_x$ is
equal to $T_x$ rounded down to the integer below.
Clearly $K_x$ is a discrete random variable, taking values on the
integers:
$$0, 1, 2, \ldots, [\omega - x]$$
____________
22 The probability distribution of $K_x$ is easy to write down.
$$\begin{aligned}
P[K_x = k] &= P[k \le T_x < k + 1] \\
&= P[k < T_x \le k + 1] \quad (*) \\
&= {}_k p_x \, q_{x+k}
\end{aligned}$$
____________
Note that switching the inequalities at step (*) requires an assumption
about $T_x$. It is enough to suppose that $F_x(t)$ is continuous in $t$. We
will not discuss this further here.
____________
23 We now define the curtate expectation of life, denoted $e_x$, by:
$$e_x = E[K_x]$$
____________

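24 Then:
$$\begin{aligned}
e_x &= \sum_{k=0}^{[\omega - x]} k \, {}_k p_x \, q_{x+k} \\
&= {}_1 p_x \, q_{x+1} \\
&\quad + {}_2 p_x \, q_{x+2} + {}_2 p_x \, q_{x+2} \\
&\quad + {}_3 p_x \, q_{x+3} + {}_3 p_x \, q_{x+3} + {}_3 p_x \, q_{x+3} \\
&\quad + \cdots \\
&= \sum_{k=1}^{[\omega - x]} \sum_{j=k}^{[\omega - x]} {}_j p_x \, q_{x+j} \quad \text{(summing columns)} \\
&= \sum_{k=1}^{[\omega - x]} {}_k p_x
\end{aligned}$$
____________
We have two simple formulae:
$$\mathring{e}_x = \int_0^{\omega - x} {}_t p_x \, dt$$
$$e_x = \sum_{k=1}^{[\omega - x]} {}_k p_x$$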
____________
25 The complete and curtate expectations of life are related by the
approximate equation:
$$\mathring{e}_x \approx e_x + \tfrac{1}{2}$$
To see this, define $J_x = T_x - K_x$ to be the random lifetime after the
highest integer age to which a life age $x$ survives.
Approximately, $E[J_x] = \tfrac{1}{2}$, but $E[T_x] = E[K_x] + E[J_x]$ so $\mathring{e}_x \approx e_x + \tfrac{1}{2}$
as stated.
____________
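To get a feel for how close the approximation is, the two expectations can be computed directly from the two simple formulae. The R sketch below assumes, purely for illustration, a Gompertz force of mortality with hypothetical parameters B and c_par, an assumed limiting age omega of 120, and the Gompertz survival probability that is stated later in these notes.

```r
# Hypothetical Gompertz parameters (illustrative only, not from the notes)
B <- 0.00003
c_par <- 1.1
x <- 60
omega <- 120  # assumed limiting age

# t_p_x under Gompertz' Law: exp(-B * c^x * (c^t - 1) / log(c))
tpx <- function(t) exp(-B * c_par^x * (c_par^t - 1) / log(c_par))

# Complete expectation: integral of t_p_x from 0 to omega - x
e_complete <- integrate(tpx, lower = 0, upper = omega - x)$value

# Curtate expectation: sum of k_p_x for k = 1, ..., omega - x
e_curtate <- sum(tpx(1:(omega - x)))

c(complete = e_complete, curtate = e_curtate,
  difference = e_complete - e_curtate)
```

With these assumed parameters the difference should come out close to one half, in line with the approximate relation.
____________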
26 It is easy to write down the variances of the complete and curtate
future lifetimes:
$$\text{var}[T_x] = \int_0^{\omega - x} t^2 \, {}_t p_x \, \mu_{x+t} \, dt - \mathring{e}_x^{\,2}$$
$$\text{var}[K_x] = \sum_{k=0}^{[\omega - x]} k^2 \, {}_k p_x \, q_{x+k} - e_x^2$$
but these do not simplify neatly as the expected values do.

____________
The expectation of life is often used as a measure of the standard of

living and health care in a given country.
____________
27 Some important formulae
In this section we give two important formulae, one for ${}_t q_x$ and one
for ${}_t p_x$. The first follows from the result that $f_x(t) = {}_t p_x \, \mu_{x+t}$. We
have:
$${}_t q_x = F_x(t) = \int_0^t f_x(s) \, ds = \int_0^t {}_s p_x \, \mu_{x+s} \, ds$$
This formula is easy to interpret. For each time $s$, between 0 and $t$,
the integrand is the product of:
(i) ${}_s p_x$, the probability of surviving to age $x + s$, and
(ii) $\mu_{x+s} \, ds$, which is approximately equal to ${}_{ds} q_{x+s}$, the probability
of dying just after age $x + s$.
Since it is impossible to die at more than one time, we simply add up,
or in the limit integrate, all these mutually exclusive probabilities.
____________
28 The formula for ${}_t p_x$ follows from the solution of the following
equation:
$$\frac{\partial}{\partial s} \, {}_s p_x = -\frac{\partial}{\partial s} \, {}_s q_x = -f_x(s) = -{}_s p_x \, \mu_{x+s}$$
(You will see why we have used $s$ as the variable in a moment.)
____________
29 To solve this, note that:
$$\frac{\partial}{\partial s} \log {}_s p_x = \frac{\frac{\partial}{\partial s} \, {}_s p_x}{{}_s p_x}$$
so that the above equation can be rewritten as:
$$\frac{\partial}{\partial s} \log {}_s p_x = -\mu_{x+s}$$
Hence:
$$\int_0^t \frac{\partial}{\partial s} \log {}_s p_x \, ds = -\int_0^t \mu_{x+s} \, ds + c$$
where $c$ is some constant of integration. The left-hand side is:
$$\left[ \log {}_s p_x \right]_0^t = \log {}_t p_x \quad (\text{since } {}_0 p_x = 1)$$
so taking exponentials of both sides gives:
$${}_t p_x = \exp\left\{ -\int_0^t \mu_{x+s} \, ds + c \right\}$$
Now since ${}_0 p_x = 1$, we must have $c = 0$, so finally:
$${}_t p_x = \exp\left\{ -\int_0^t \mu_{x+s} \, ds \right\}$$
____________
To summarise, we have derived the following very important results:
$${}_t q_x = \int_0^t {}_s p_x \, \mu_{x+s} \, ds \qquad (6.1)$$
$${}_t p_x = \exp\left\{ -\int_0^t \mu_{x+s} \, ds \right\} \qquad (6.2)$$
____________
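Results (6.1) and (6.2) can be checked against each other numerically, since the two probabilities must add up to one. The R sketch below is only an illustration: the force of mortality mu() and the values of x and n_years are hypothetical assumptions, not taken from the notes.

```r
# Hypothetical force of mortality as a function of age (illustrative only)
mu <- function(age) 0.0005 + 0.00003 * 1.1^age

x <- 50
n_years <- 10

# (6.2): s_p_x = exp(-integral of mu_{x+r} dr over [0, s]), vectorised in s
spx <- function(s) {
  sapply(s, function(u) exp(-integrate(function(r) mu(x + r), 0, u)$value))
}
tpx <- spx(n_years)

# (6.1): t_q_x = integral of s_p_x * mu_{x+s} ds over [0, t]
tqx <- integrate(function(s) spx(s) * mu(x + s), 0, n_years)$value

c(tpx = tpx, tqx = tqx, total = tpx + tqx)
```

The total should equal one up to numerical integration error, which confirms that the two formulae describe complementary probabilities.
____________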
Simple parametric survival models
30 Several survival models are in common use in which the random
variable denoting future lifetime has a distribution expressed in terms
of a small number of parameters. Perhaps the simplest is the
exponential model, in which the hazard is constant:
$$\mu_x = \mu$$
It follows from (6.2) above that in the exponential model:
$${}_t p_x = S_x(t) = \exp\left\{ -\int_0^t \mu \, ds \right\} = \exp\left\{ -[\mu s]_0^t \right\} = \exp(-\mu t)$$
and hence that:
$${}_t q_x = 1 - {}_t p_x = 1 - \exp(-\mu t)$$
____________

Suppose we have an exponential distribution with parameter $\lambda = 0.5$.
The R code for simulating 100 values is given by:
rexp(100, rate = 0.5)
The PDF is obtained by dexp(x, rate = 0.5) and is useful for
graphing. For example:
x <- seq(0, 10, by = 0.01)
plot(x, dexp(x, rate = 0.5), type = "l")
To calculate probabilities for a continuous distribution we use the CDF
which is obtained by pexp.
For example, to calculate $P(X \le 2) = 0.6321206$ we use the R code:
pexp(2, rate = 0.5)
Similarly, the quantiles can be calculated with qexp.

____________
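31 A simple extension to the exponential model is the Weibull model, in
which the survival function $S_x(t)$ is given by the two-parameter
formula:
$$S_x(t) = \exp\left[ -\alpha t^{\beta} \right] \qquad (6.3)$$
Since:
$$\mu_{x+t} = -\frac{\partial}{\partial t} \log[S_x(t)]$$
we see that:
$$\mu_{x+t} = -\frac{\partial}{\partial t} \left[ -\alpha t^{\beta} \right] = -\left[ -\alpha \beta t^{\beta - 1} \right] = \alpha \beta t^{\beta - 1}$$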
____________

32 Different values of the parameter $\beta$ can give rise to a hazard that is
monotonically increasing or monotonically decreasing as $t$ increases,
or in the specific case where $\beta = 1$, a hazard that is constant, since if
$\beta = 1$:
$$\alpha \beta t^{\beta - 1} = \alpha \times 1 \times t^0 = \alpha$$
This can be seen also from the expression for $S_x(t)$ (6.3), from which it
is clear that, when $\beta = 1$, the Weibull model is the same as the
exponential model.
____________
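The three shapes of the Weibull hazard can be seen by evaluating the hazard formula at a few durations. In the R sketch below the function name weibull_hazard, the value of alpha and the chosen durations are hypothetical and serve only as an illustration.

```r
# Weibull hazard from the text: alpha * beta * t^(beta - 1)
weibull_hazard <- function(t, alpha, beta) alpha * beta * t^(beta - 1)

durations <- c(0.5, 1, 2, 4)
alpha <- 0.1  # hypothetical value

h_decreasing <- weibull_hazard(durations, alpha, beta = 0.5)
h_constant   <- weibull_hazard(durations, alpha, beta = 1)
h_increasing <- weibull_hazard(durations, alpha, beta = 2)

rbind(durations, h_decreasing, h_constant, h_increasing)
```

A value of beta below 1 gives a decreasing hazard, beta equal to 1 gives the constant hazard alpha, and beta above 1 gives an increasing hazard.
____________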
The R code for simulating a random sample of 100 values from the
Weibull distribution with $c = 2$ and $\gamma = 0.25$ is:
rweibull(100, 0.25, 2^(-1/0.25))
R uses a different parameterisation: its second argument is the shape, $\gamma$,
and its third is a scale parameter equal to $c^{-1/\gamma}$.
Similarly, the PDF, CDF and quantiles can be obtained using the R
functions dweibull, pweibull and qweibull.
Alternatively, we could redefine them from first principles as follows:
# Inverse transform: solve 1 - exp(-c * t^g) = u for t
rweibull <- function(n, c, g) {
  (-log(1 - runif(n)) / c)^(1 / g)
}
# PDF: derivative of the CDF below
dweibull <- function(x, c, g) {
  c * g * x^(g - 1) * exp(-c * x^g)
}
# CDF: 1 - S(q), with S(q) = exp(-c * q^g)
pweibull <- function(q, c, g) {
  1 - exp(-c * q^g)
}
# Quantile function: inverse of the CDF
qweibull <- function(p, c, g) {
  (-log(1 - p) / c)^(1 / g)
}
____________
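The link between the two parameterisations can be confirmed by comparing the CDF written from first principles with the built-in function. This is a minimal check, with the names c_par, g and q chosen here for illustration; stats::pweibull is called explicitly so that the built-in version is used even if pweibull has been redefined.

```r
# Parameters matching the example in the text
c_par <- 2
g <- 0.25
q <- c(0.01, 0.1, 1)

# CDF from first principles: 1 - exp(-c * q^g)
cdf_direct <- 1 - exp(-c_par * q^g)

# Same CDF via R's built-in parameterisation (shape = g, scale = c^(-1/g))
cdf_builtin <- stats::pweibull(q, shape = g, scale = c_par^(-1 / g))

all.equal(cdf_direct, cdf_builtin)
```
____________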

Gompertz’ and Makeham’s laws
33 The Gompertz and Makeham laws of mortality are two further examples
of parametric survival models. They can be expressed as follows:
Gompertz’ Law: $\mu_x = B c^x$ (6.4)
Makeham’s Law: $\mu_x = A + B c^x$
____________
34 Gompertz’ Law is an exponential function, and it is often a reasonable
assumption for middle ages and older ages.
____________
35 Makeham’s Law incorporates a constant term, which is sometimes
interpreted as an allowance for accidental deaths, not depending on
age.
____________
36 If a life table is known to follow Gompertz’ Law, the parameters $B$
and $c$ can be determined given the values of $\mu_x$ at any two ages. In
the case of a life table following Makeham’s Law, the
parameters $A$, $B$ and $c$ can be determined given the values of $\mu_x$ at
any three ages.
____________
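For Gompertz’ Law, the two parameters follow from the ratio of the forces of mortality at the two ages. The R sketch below uses hypothetical ages and hypothetical values of the force of mortality; the names c_hat and B_hat are introduced here for illustration.

```r
# Hypothetical values of the force of mortality at two ages
x1 <- 50; mu1 <- 0.004
x2 <- 70; mu2 <- 0.030

# mu_x = B * c^x, so mu2 / mu1 = c^(x2 - x1)
c_hat <- (mu2 / mu1)^(1 / (x2 - x1))
B_hat <- mu1 / c_hat^x1

c(B = B_hat, c = c_hat)
```

Substituting the fitted values back into $B c^x$ reproduces the two given forces of mortality. For Makeham’s Law the same idea applies to differences of the forces at three ages, since the constant term cancels.
____________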
37 Survival probabilities ${}_t p_x$ can be found using:
$${}_t p_x = \exp\left( -\int_0^t \mu_{x+s} \, ds \right)$$
____________
38 For example, in the case of Gompertz’ Law:
$${}_t p_x = g^{c^x (c^t - 1)}$$
where:
$$g = \exp\left( \frac{-B}{\log c} \right)$$
____________
39 In the case of Makeham’s Law:
$${}_t p_x = s^t \, g^{c^x (c^t - 1)}$$
where:
$$g = \exp\left( \frac{-B}{\log c} \right) \quad \text{and} \quad s = \exp(-A)$$
We will use these laws in Booklet 6.

____________
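The closed-form survival probabilities can be checked against a direct numerical evaluation of the integral of the force of mortality. The R sketch below uses hypothetical Makeham parameters; setting A to zero gives the Gompertz case. The names c_par, n_years, tpx_closed and tpx_numeric are introduced here for illustration.

```r
# Hypothetical Makeham parameters (set A <- 0 to recover Gompertz' Law)
A <- 0.0005
B <- 0.00003
c_par <- 1.1
x <- 40
n_years <- 20

# Closed form: t_p_x = s^t * g^(c^x * (c^t - 1))
g <- exp(-B / log(c_par))
s <- exp(-A)
tpx_closed <- s^n_years * g^(c_par^x * (c_par^n_years - 1))

# Direct evaluation of exp(-integral of mu_{x+u} du) by numerical integration
mu <- function(age) A + B * c_par^age
tpx_numeric <- exp(-integrate(function(u) mu(x + u), 0, n_years)$value)

c(closed = tpx_closed, numeric = tpx_numeric)
```
____________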
The R base system does not have a command to simulate the
Gompertz distribution.
In the package flexsurv, the command rgompertz will simulate a
Gompertz distribution. The commands dgompertz, pgompertz and
qgompertz will generate the density, distribution function and
quantiles respectively.
The command hgompertz generates the hazard, and Hgompertz the
cumulative hazard.
Note that in these commands the parameters of the Gompertz
distribution are to be specified as ‘shape’ and ‘rate’. If the shape is $a$
and the rate is $b$, then Gompertz’ Law (6.4) may be written:
$$\mu_x = b e^{a x}$$
In terms of the notation used in (6.4) above, we have:
shape = $\log c$
rate = $B$
For example, if the force of mortality is governed by Gompertz’ Law with
shape parameter equal to 0.01 and rate parameter equal to 0.001, the
force of mortality or hazard at age 30 can be calculated using:
hgompertz(30, shape = 0.01, rate = 0.001)
to be 0.00135.
____________
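The same figure can be reproduced in base R, without flexsurv, directly from the formula for the hazard. This is a minimal check; the variable names are chosen here for illustration.

```r
# Gompertz hazard b * exp(a * x) with shape a = 0.01 and rate b = 0.001
shape <- 0.01
rate <- 0.001
hazard_30 <- rate * exp(shape * 30)
round(hazard_30, 5)

# Equivalent parameters in the notation of (6.4)
B <- rate
c_par <- exp(shape)
B * c_par^30
```
____________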

Chapter 7 – Estimating the lifetime distribution function
We now turn to statistical inference.
40 Given some mild conditions on the distribution of $T$, we can obtain all
information by estimating $F(t)$, $S(t)$, $f(t)$ or $\mu_t$ for all $t \ge 0$.
____________
The simplest experiment would be to observe a large number of

new-born lives.
____________
41 The proportion alive at age t > 0 would furnish an estimate of S (t ) .

The estimate would be a step function, and the larger the sample the
closer to a smooth function we would expect it to be. For use in
applications it could be smoothed further.
____________
We need not assume that T is a member of any parametric family; this

is a non-parametric approach to estimation. You will recognise this as
the empirical distribution function of T .
____________
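In the absence of censoring, the estimate is simply the proportion of the sample still alive at each time. The R sketch below uses a small hypothetical sample of fully observed lifetimes; the function name S_hat is introduced here for illustration.

```r
# Hypothetical observed lifetimes (all deaths observed, no censoring)
lifetimes <- c(2.1, 3.4, 3.4, 5.0, 7.8, 9.2, 9.9, 12.5)

# Empirical survival function: proportion of the sample still alive at t
S_hat <- function(t) sapply(t, function(u) mean(lifetimes > u))

S_hat(c(0, 3.4, 6, 13))
```

The result is a step function that drops at each observed death. The same values can be obtained as one minus the empirical distribution function returned by the base R function ecdf.
____________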
42 Clearly, there are some practical problems:
• Even if a satisfactory group of lives could be found, the experiment
would take about 100 years to complete.
• The observational plan requires us to observe the deaths of all the
lives in the sample. In practice many would be lost to the
investigation, for one reason or another, and to exclude these from
the analysis might bias the result.
____________
43 The statistical term for this problem is censoring. All we know in

respect of some lives is that they died after a certain age.
____________

In medical statistics, where lifetimes are often shorter, non-parametric

estimation is very important.
44 In this booklet we show how the experiment above can be amended to

allow for censoring. Otherwise, we must use a different observational
plan, and base inference on data gathered over a shorter time, eg 3 or 4
years.
A consequence is that we no longer observe the same cohort
throughout their joint lifetimes, so we might not be sampling from the
same distribution. It might be sensible to widen the model assumption,
so that the mortality of lives born in year $y$ is modelled by a random
variable $T^y$, for example. In practice we usually divide the
investigation up into single years of age.
____________
45 Observing lives between (say) integer ages x and x + 1 , and limiting

the period of investigation, are also forms of censoring. Censoring
might still occur at unpredictable times – by lapsing a life policy, for
example – but survivors will certainly be lost to observation at a known
time, either on attaining age x + 1 or when the investigation ends.
____________
46 Censoring is the key feature of survival data (indeed survival analysis

might be defined as the analysis of censored data) and the
mechanisms which give rise to censoring play an important part in
statistical inference. Censoring is present when we do not observe the
exact length of a lifetime, but only that its length falls within some
interval. This can happen in several ways.
____________
47 Right censoring
Data are right censored if the censoring mechanism cuts short

observations in progress. An example is the ending of a mortality
investigation before all the lives being observed have died. Persons
still alive when the investigation ends are right censored – we know
only that their lifetimes exceed some value.
____________

48 Left censoring
Data are left censored if the censoring mechanism prevents us from

knowing when entry into the state that we wish to observe took place.
An example arises in medical studies in which patients are subject to
regular examinations. Discovery of a condition tells us only that the
onset fell in the period since the previous examination; the time
elapsed since onset has been left censored.
____________
49 Interval censoring
Data are interval censored if the observational plan only allows us to

say that an event of interest fell within some interval of time. An
example arises in actuarial investigations, where we might know only
the calendar year of death.
Both right and left censoring can be seen as special cases of interval
censoring.
____________
In actuarial investigations, right censoring is the most common form of

censoring encountered.
____________
50 Random censoring
Suppose that the time $C_i$ (say) at which observation of the $i$th lifetime
is censored is a random variable. Suppose that $T_i$ is the (random)
lifetime of the $i$th life. Then the observation will be censored if $C_i < T_i$.
In such a situation, censoring is said to be random.
____________
The case in which the censoring mechanism is a second decrement of

interest gives rise to multiple decrement models.
____________

51 Type I censoring
If the censoring times $\{C_i\}$ are known in advance (a degenerate case of
random censoring) then the mechanism is called ‘Type I censoring’.
____________
52 Type II censoring
If observation is continued until a predetermined number of deaths has

occurred, then ‘Type II censoring’ is said to be present. This can
simplify the analysis, because then the number of events of interest is
non-random.
____________
Many actuarial investigations are characterised by a combination of

random and Type I censoring, for example, in life office mortality
studies where policies rather than lives are observed, and observation
ceases either when a policy lapses (random censoring) or at some
predetermined date marking the end of the period of investigation
(Type I censoring).
____________
53 Informative and non-informative censoring
Censoring is non-informative if it gives no information about the
lifetimes $\{T_i\}$.
In the case of random censoring, the independence of each pair $T_i$, $C_i$
is sufficient to ensure that the censoring is non-informative.
Informative censoring is more difficult to analyse, essentially because
the resulting likelihoods cannot usually be factorised.
____________

It is obvious that the observational plan is likely to introduce censoring

of some kind, and consideration should be given to the effect on the
analysis in specifying the observational plan. Censoring might also
depend on the results of the observations to date. For example, if
strong enough evidence accumulates during the course of a medical
experiment, the investigation might be ended prematurely, so that the
better treatment can be extended to all the subjects under study, or the
inferior treatment withdrawn.
____________
The Kaplan-Meier (product-limit) estimator
54 In this section we develop the empirical distribution function to allow

for censoring.
We will consider lifetimes as a function of time t without mention of a

starting age x . The following could be applied equally to new-born
lives, to lives aged x at outset, or to lives with some property in
common at time t = 0 , for example diagnosis of a medical condition.
Medical studies are often based on time since diagnosis or time since
the start of treatment, and if the patient’s age enters the analysis it is
usually as an explanatory variable in a regression model.
____________
Assumptions and notation
55 Suppose we observe a population of $n$ lives in the presence of
non-informative censoring, and suppose we observe $m$ deaths.
Let $t_1 < t_2 < \cdots < t_k$ be the ordered times at which deaths were
observed. We do not assume that $k = m$, so more than one death
might be observed at a single failure time.
Suppose that $d_j$ deaths are observed at time $t_j$ ($1 \le j \le k$) so that
$d_1 + d_2 + \cdots + d_k = m$.
Observation of the remaining $n - m$ lives is censored.
Suppose that $c_j$ lives are censored between times $t_j$ and
$t_{j+1}$ ($0 \le j \le k$), where we define $t_0 = 0$ and $t_{k+1} = \infty$ to allow for
censored observations after the last observed failure time; then
$c_0 + c_1 + \cdots + c_k = n - m$.
____________
56 The Kaplan-Meier estimator of the survivor function adopts the
following conventions.
(a) The hazard of experiencing the event is zero at all durations
except those where an event actually happens in our sample.
(b) The hazard of experiencing the event at any particular
duration, $t_j$, when an event takes place is equal to $d_j / n_j$, where
$d_j$ is the number of individuals experiencing the event at
duration $t_j$ and $n_j$ is the risk set at that duration (that is, the
number of individuals still at risk of experiencing the event just
prior to duration $t_j$).
(c) Persons that are censored are removed from observation at
the duration at which censoring takes place, save that persons
who are censored at a duration where events also take place
are assumed to be censored immediately after the events have
taken place (so that they are still at risk at that duration).
____________
Effectively, what we are doing is partitioning duration into very small

intervals such that at the vast majority of such intervals no events
occur. There is no reason to suppose, given the data that we have, that
the risk of the event happening is anything other than zero at those
intervals where no events occur. We have no evidence in our data to
suppose anything else.
For those very small intervals in which events do occur, we suppose

that the hazard is constant (ie piecewise exponential) within each
interval, but that it can vary between intervals.
____________

57 We estimate the hazard within the interval containing event time $t_j$ as:
$$\hat{\lambda}_j = \frac{d_j}{n_j}$$
Of course, effectively this formula is being used for all the other
intervals as well, but as $d_j = 0$ in all these intervals, the hazard will be
zero.
____________
It is possible to show that this estimate arises as a maximum likelihood
estimate. The likelihood of the data can be written:
$$\prod_{j=1}^{k} \lambda_j^{d_j} (1 - \lambda_j)^{n_j - d_j}$$
This is proportional to a product of independent binomial likelihoods,
so that the maximum is attained by setting:
$$\hat{\lambda}_j = \frac{d_j}{n_j} \qquad (1 \le j \le k)$$
____________
Extending the force of mortality to discrete distributions
It is convenient to extend to discrete distributions the definition of

force of mortality (or hazard) given in Paragraph 9 for continuous
distributions.
____________

Discrete hazard function
58 Suppose $F(t)$ has probability masses at the points $t_1, t_2, \ldots, t_k$. Then
define:
$$\lambda_j = P[T = t_j \mid T \ge t_j] \qquad (1 \le j \le k) \qquad (7.1)$$
(We use the symbol $\lambda$ to avoid confusion with the usual force of
mortality.)
____________
59 If we assume that $T$ has a discrete distribution then:
$$1 - F(t) = \prod_{t_j \le t} (1 - \lambda_j)$$
Since $1 - F(t) = S(t)$, we can estimate the survival function using the
formula:
$$\hat{S}(t) = \prod_{t_j \le t} (1 - \hat{\lambda}_j)$$
____________
This is the Kaplan-Meier estimate. To compute the Kaplan-Meier
estimate of the survivor function, $\hat{S}(t)$, we simply multiply the survival
probabilities within each of the intervals up to and including
duration $t$.
____________
60 Because the Kaplan-Meier estimate involves multiplying up survival

probabilities, it is sometimes called the product limit estimate.
____________
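The calculation can be carried out from first principles in a few lines. The R sketch below uses a small hypothetical right-censored sample; the vectors time and delta and the name km_table are assumptions made for illustration. Lives censored at a death time are counted in the risk set at that time, following convention (c).

```r
# Hypothetical right-censored sample: delta = 1 for a death, 0 for censoring
time  <- c(2, 3, 3, 5, 6, 6, 8, 9, 11, 12)
delta <- c(1, 1, 0, 1, 0, 1, 1, 0, 1, 0)

# Ordered distinct death times t_j
t_j <- sort(unique(time[delta == 1]))

# d_j: deaths at t_j; n_j: lives at risk just before t_j
# (lives censored at t_j are counted as still at risk, as in convention (c))
d_j <- sapply(t_j, function(u) sum(time == u & delta == 1))
n_j <- sapply(t_j, function(u) sum(time >= u))

# Discrete hazard estimates and Kaplan-Meier estimate at each death time
lambda_hat <- d_j / n_j
S_hat <- cumprod(1 - lambda_hat)

km_table <- data.frame(t_j, n_j, d_j, lambda_hat, S_hat)
km_table
```

For this sample the estimate steps down from 0.9 at time 2 to 3/14 at time 11, and stays constant between death times.
____________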

In effect, we choose finer and finer partitions of the time axis, and
estimate (1 - F (t )) as the product of the probabilities of surviving each
sub-interval. Then, with the above definition of the discrete force of
mortality (7.1), we obtain the Kaplan-Meier estimate as the mesh of the
partition tends to zero. This is the origin of the name ‘product-limit’
estimate, by which the Kaplan-Meier estimate is sometimes known.
____________
61 Note that the Kaplan-Meier estimate of the survivor function is constant

after the last duration at which an event is observed to occur.
____________
62 It is not defined at durations longer than the duration of the last

censored observation.
____________
Only those at risk at the observed lifetimes $\{t_j\}$ contribute to the
estimate. It follows that it is unnecessary to start observation on all
lives at the same time or age; the estimate is valid for data truncated
from the left, provided the truncation is non-informative in the sense
that entry to the study at a particular age or time is independent of the
remaining lifetime. (Note that left truncation is not the same as left
censoring.)
____________
In the package survival, the function survfit() is used to find the

Kaplan-Meier estimate of the survival function.
R code:
survfit(formula, conf.int = 0.95, conf.type = "log")
In this code ‘formula’ is a model formula with a survival object on its
left-hand side, for example Surv(time, delta) ~ 1 for a single sample.
With right-censored data, a survival object may be created with the R
command:
Surv(time, delta)
Here ‘time’ is a vector containing the times to the event or censoring,
and ‘delta’ is a 0/1 vector denoting whether the individual was
censored (0) or experienced the event (1).
____________
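A minimal usage sketch follows, assuming the survival package is installed. The data are hypothetical, and the one-sample formula Surv(time, delta) ~ 1 is an assumption about how the call would typically be set up for a single homogeneous group.

```r
library(survival)  # assumed to be available

# Hypothetical right-censored sample (illustrative values only)
time  <- c(2, 3, 3, 5, 7, 8, 8, 10)
delta <- c(1, 1, 0, 1, 0, 1, 1, 0)   # 1 = event, 0 = censored

# Survival object on the left of the formula, "~ 1" for a single group
fit <- survfit(Surv(time, delta) ~ 1, conf.int = 0.95, conf.type = "log")

# One row per event time: number at risk, number of events,
# Kaplan-Meier estimate and confidence limits
summary(fit)

# Step-function plot of the estimate with its confidence band
plot(fit, xlab = "Duration", ylab = "Estimated S(t)")
```

The components fit$time, fit$n.risk, fit$n.event and fit$surv hold the quantities needed to check the output against a hand calculation.
____________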

Comparing lifetime distributions
63 Since Kaplan-Meier estimates are often used to compare the lifetime

distributions of two or more populations – for example, in comparing
medical treatments – their statistical properties are important.
Approximate formulae for the variance of $\tilde{F}(t)$ are available.
Greenwood’s formula (proof not required):
$$\operatorname{var}[\tilde{F}(t)] \approx \left(1 - \hat{F}(t)\right)^2 \sum_{t_j \le t} \frac{d_j}{n_j (n_j - d_j)}$$
is reasonable over most $t$, but might tend to understate the variance in
the tails of the distribution.
____________
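Greenwood's formula can be evaluated as a running sum. The sketch below uses hypothetical values of the risk sets and deaths, and the symmetric normal-approximation interval at the end is an illustrative choice of this sketch, not something specified in the Core Reading.

```r
# Hypothetical risk sets n_j and deaths d_j at the ordered death times t_j
nj <- c(8, 7, 5, 3)
dj <- c(1, 1, 1, 2)

S_hat  <- cumprod(1 - dj / nj)             # Kaplan-Meier estimate, 1 - F_hat(t_j)
gw_sum <- cumsum(dj / (nj * (nj - dj)))    # running Greenwood sum
var_F  <- S_hat^2 * gw_sum                 # approximate variance of the estimator

# Approximate 95% interval for S(t_j), truncated to [0, 1]
z     <- qnorm(0.975)
lower <- pmax(0, S_hat - z * sqrt(var_F))
upper <- pmin(1, S_hat + z * sqrt(var_F))

print(data.frame(nj, dj, S_hat, var_F, lower, upper))
```

The formula breaks down if $n_j = d_j$ at some death time (all remaining lives die together), since the corresponding term has a zero denominator.
____________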
The Nelson-Aalen estimate
64 The integrated hazard function
An alternative non-parametric approach is to estimate the integrated
hazard:
$$\Lambda_t = \int_0^t \mu_s \, ds + \sum_{t_j \le t} \lambda_j$$
where the integral deals with the continuous part of the distribution
and the sum with the discrete part. (Since this methodology was
developed by statisticians, the term ‘integrated hazard’ is in universal
use, and ‘integrated force of mortality’ is almost never seen.)
____________

Calculating Nelson-Aalen estimates
65 The Nelson-Aalen estimate of the integrated hazard is:
$$\hat{\Lambda}_t = \sum_{t_j \le t} \frac{d_j}{n_j}$$
____________
66 The Nelson-Aalen estimate of the survival function is therefore:
$$\hat{S}(t) = \exp(-\hat{\Lambda}_t)$$
and the Nelson-Aalen estimate of the distribution function is:
$$\hat{F}(t) = 1 - \exp(-\hat{\Lambda}_t)$$
____________
67 Corresponding to Greenwood’s formula for the variance of the
Kaplan-Meier estimator, there is a formula for the variance of the
Nelson-Aalen estimator:
$$\operatorname{var}[\tilde{\Lambda}_t] \approx \sum_{t_j \le t} \frac{d_j (n_j - d_j)}{n_j^3}$$
____________
68 The Kaplan-Meier estimate can be approximated in terms of $\hat{\Lambda}_t$:
$$\hat{F}(t) = 1 - \prod_{t_j \le t} \left(1 - \frac{d_j}{n_j}\right) \approx 1 - \exp\left(-\sum_{t_j \le t} \frac{d_j}{n_j}\right) = 1 - \exp(-\hat{\Lambda}_t)$$
____________
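The Nelson-Aalen quantities and their relationship to the Kaplan-Meier estimate can be sketched as follows. The values of nj and dj are hypothetical, and all variable names are assumptions of this sketch.

```r
# Hypothetical risk sets n_j and deaths d_j at the ordered death times t_j
nj <- c(8, 7, 5, 3)
dj <- c(1, 1, 1, 2)

Lambda_hat <- cumsum(dj / nj)                  # Nelson-Aalen integrated hazard
S_na       <- exp(-Lambda_hat)                 # Nelson-Aalen survival estimate
F_na       <- 1 - S_na                         # Nelson-Aalen distribution estimate
var_Lambda <- cumsum(dj * (nj - dj) / nj^3)    # approximate variance of the estimator

S_km <- cumprod(1 - dj / nj)                   # Kaplan-Meier estimate for comparison

print(data.frame(nj, dj, Lambda_hat, var_Lambda, S_na, S_km))
```

Because $\exp(-x) \ge 1 - x$ for every $x$, the Nelson-Aalen survival estimate is never below the Kaplan-Meier estimate, and the two are close whenever each $d_j / n_j$ is small.
____________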

Parametric estimation of the survival function
69 An alternative approach to estimating the survival function proceeds

as follows:
1. assume a functional form for the survival function $S(t)$
2. express $S(t)$ and the hazard $h(t)$ in terms of the parameters of the
chosen function
3. estimate the parameters by maximum likelihood.
____________
Unless the functional form chosen is very simple, estimation will

involve the solution of several simultaneous equations and must be
done iteratively.
____________
70 Possible simple functional forms include the exponential and Weibull

distributions, and the Gompertz’ law, which are all described earlier in
this booklet.
____________
71 For many processes, such as human mortality, it turns out that no

simple functional form can describe human mortality at all ages.
However, for estimation purposes this is not a problem, since we can
divide the age range into small sections, estimate the chosen function
for each section (the parameters for each section will be different) and
then ‘chain’ the sections together to create a life table for the whole
age (or duration) range with which we are concerned.
____________
Maximum likelihood estimation
72 We illustrate maximum likelihood estimation by considering the
exponential hazard, which has one parameter, $\mu$.
Consider only the single year of age between exact ages $x$ and $x + 1$.
We follow a sample of $n$ independent lives from exact age $x$ until the
first of the following things happens:
(a) their death between exact ages $x$ and $x + 1$
(b) they withdraw from the investigation between exact ages $x$ and
$x + 1$
(c) their $(x + 1)$th birthday.
Cases (b) and (c) are treated as censored at either the time of
withdrawal, or exact age $x + 1$ respectively.
Assume that the hazard of death (or force of mortality) is constant
between ages $x$ and $x + 1$ and takes the unknown value $\mu$. We ask
the question: what is the most likely value of $\mu$ given the data in our
investigation? Assume that we measure duration in years since a
person’s $x$th birthday.
Consider first those lives in category (a), who die before exact age
$x + 1$. Suppose there are $k$ of these.
Take the first of these, and suppose that he or she died at duration $t_1$.
Given only the data on this life, the value of $\mu$ that is most likely is the
value that maximises the probability that he or she actually dies at
duration $t_1$.
____________
73 The probability that Life 1 will actually die at duration $t_1$ is equal to
$f(t_1)$, where $f(t)$ is the probability density function of $T$.
____________
So the value of $\mu$ that we need is the value that maximises $f(t_1)$.
However, in the investigation, we have more than one life that died.
Suppose a second life died at duration $t_2$.
____________

74 The probability of this happening is $f(t_2)$, and the joint probability that
Life 1 died at duration $t_1$ and Life 2 died at duration $t_2$ is $f(t_1) f(t_2)$.

____________
Given just these two lives, the value of $\mu$ we need will be that which
maximises $f(t_1) f(t_2)$.
____________
75 If we now consider all the $k$ lives that died, then the value of $\mu$ we
want is that which maximises:
$$\prod_{\text{all lives that died}} f(t_i)$$
This product is the probability of observing the data we actually did
observe.
____________
But what of the lives that were censored? Their experience must also
be taken into account.
Consider the first censored life, and suppose he or she was censored
at duration $t_{k+1}$. All we know about this person is that he or she was
still alive at duration $t_{k+1}$.
____________
76 The probability that a life will still be alive at duration $t_{k+1}$ is $S(t_{k+1})$.

____________

77 Considering all the censored lives, the probability of observing the
data we do observe is:
$$\prod_{\text{all censored lives}} S(t_i)$$
Now, putting the deaths and the censored cases together, we can write
down the probability of observing all the data we actually observe –
both censored lives and those that died. This probability is:
$$\prod_{\text{all censored lives}} S(t_i) \prod_{\text{all lives which died}} f(t_i)$$
This is called the likelihood of the data.

____________
The maximum likelihood estimate of the parameter $\mu$, which we
denote by $\hat{\mu}$, is the value that maximises this likelihood.
To obtain $\hat{\mu}$, define a variable $\delta_i$ such that:
$\delta_i = 1$ if life $i$ died
$\delta_i = 0$ if life $i$ was censored
Then, in the general case, the likelihood can be written:
$$L = \prod_{i=1}^{n} f(t_i)^{\delta_i} S(t_i)^{1 - \delta_i}$$
Now, since:
$$f(t) = S(t) h(t)$$
the likelihood can also be written:
$$L = \prod_{i=1}^{n} h(t_i)^{\delta_i} S(t_i)^{\delta_i} S(t_i)^{1 - \delta_i} = \prod_{i=1}^{n} h(t_i)^{\delta_i} S(t_i)$$
We now substitute the chosen functional form into this equation to
express the likelihood in terms of the parameter $\mu$.
____________
78 This produces:
$$L = \prod_{i=1}^{n} \mu^{\delta_i} \exp(-\mu t_i)$$
Noting that whatever value of $\mu$ maximises $L$ will also maximise the
logarithm of $L$, we first take the logarithm of $L$:
$$\log L = \sum_{i=1}^{n} \delta_i \log \mu - \sum_{i=1}^{n} \mu t_i$$
We differentiate this with respect to $\mu$ to give:
$$\frac{\partial \log L}{\partial \mu} = \frac{\sum_{i=1}^{n} \delta_i}{\mu} - \sum_{i=1}^{n} t_i$$
Setting this equal to zero produces:
$$\frac{\sum_{i=1}^{n} \delta_i}{\mu} = \sum_{i=1}^{n} t_i$$
so that:
$$\hat{\mu} = \frac{\sum_{i=1}^{n} \delta_i}{\sum_{i=1}^{n} t_i}$$

We can check that this is a maximum by noting that:
$$\frac{\partial^2 \log L}{\partial \mu^2} = -\frac{\sum_{i=1}^{n} \delta_i}{\mu^2}$$
This must be negative, as both numerator and denominator are
necessarily positive (unless we have no deaths at all in our data, in
which case the maximum likelihood estimate of the hazard is 0).
Since $\sum_{i=1}^{n} \delta_i$ is just the total number of deaths in our data, and $\sum_{i=1}^{n} t_i$ is
the total time that the lives in the data are exposed to the risk of death,
our maximum likelihood estimate of the force of mortality (or hazard) is
just deaths divided by exposed to risk, which is intuitively sensible.
____________
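The result can be checked numerically. In the sketch below the eight durations and death indicators are hypothetical, and the use of optimize() to confirm the closed-form answer is an illustrative choice of this sketch.

```r
# Hypothetical data for one year of age: durations in years since the
# x-th birthday, and indicators of death (1) or censoring (0)
t_i     <- c(0.25, 0.60, 1.00, 1.00, 0.40, 1.00, 0.85, 1.00)
delta_i <- c(1,    1,    0,    0,    0,    0,    1,    0)

# Closed-form estimate: deaths divided by exposed to risk
mu_hat <- sum(delta_i) / sum(t_i)

# Log-likelihood under a constant hazard mu
loglik <- function(mu) sum(delta_i) * log(mu) - mu * sum(t_i)

# Numerical maximisation as a check on the closed form
num <- optimize(loglik, interval = c(1e-6, 5), maximum = TRUE)

print(c(closed_form = mu_hat, numerical = num$maximum))
```

Here there are 3 deaths and 6.1 years of exposure, so both approaches should agree on an estimate of 3/6.1 up to the tolerance of the optimiser.
____________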
For parametric distributions with more than one parameter, maximum

likelihood estimation of the parameters involves the solution of
simultaneous equations, the number of simultaneous equations being
equal to the number of parameters to be estimated. These equations
often require iterative methods.
____________
Using the estimates for different age ranges
If we repeat the exercise for other years of age, we can obtain a series
of estimates for the different hazards in each year of age.
Suppose that the maximum likelihood estimate of the constant force
during the single year of age from $x$ to $x + 1$ is $\hat{\mu}_x$. Then the
probability that a person alive at exact age $x$ will still be alive at exact
age $x + 1$ is just $S_x(1)$.
____________
79 Given the constant force:
$$\hat{S}_x(1) = \exp(-\hat{\mu}_x)$$
____________

80 To work out the probability that a person alive at exact age $x$ will
survive to exact age $x + 2$ we note that this probability is equal to:
$$\hat{S}_x(1) \, \hat{S}_{x+1}(1) = \exp(-\hat{\mu}_x) \exp(-\hat{\mu}_{x+1})$$
Therefore:
$$\hat{S}_x(1) \, \hat{S}_{x+1}(1) = \hat{S}_x(2) = \exp[-(\hat{\mu}_x + \hat{\mu}_{x+1})]$$
In general, therefore:
$$\hat{S}_x(m) = {}_m\hat{p}_x = \exp\left(-\sum_{j=0}^{m-1} \hat{\mu}_{x+j}\right)$$
By ‘chaining’ together the probabilities in this way, we can evaluate
probabilities over any relevant age range.
____________
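A short sketch of the chaining step, using hypothetical values for the estimated constant forces at four consecutive ages:

```r
# Hypothetical constant-force estimates for ages x, x+1, x+2, x+3
mu_hat <- c(0.010, 0.012, 0.015, 0.019)

p_single <- exp(-mu_hat)           # one-year survival probabilities
S_chain  <- cumprod(p_single)      # m-year survival from age x, m = 1, ..., 4
S_direct <- exp(-cumsum(mu_hat))   # the same values via the summed forces

print(data.frame(m = 1:4, S_chain, S_direct))
```

Multiplying the one-year probabilities and exponentiating the summed forces give identical results, which is the content of the general formula.
____________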

Chapter 8 – Proportional hazards models
Covariates
81 Estimates of the lifetime distribution, whether parametric or

non-parametric, are limited in their ability to deal with some important
questions in survival analysis, such as the effect of covariates on
survival.
____________
82 A covariate is any quantity recorded in respect of each life, such as

age, sex, type of treatment, level of medication, severity of symptoms
and so on.
____________
83 If the covariates partition the population into a small number of

homogeneous groups, it is possible to compare Kaplan-Meier or other
estimates of the survivor function in respect of each population, but a
more direct and transparent method is to construct a model in which
the effects of the covariates on survival are modelled directly: a
regression model.
____________
In this section, we will assume that the values of the covariates in
respect of the $i$th life are represented by a $1 \times p$ vector, $z_i$.
____________
Proportional hazards models
The most widely used regression model in recent years has been the
proportional hazards model.
____________

84 Proportional hazards (PH) models can be constructed using both

parametric and non-parametric approaches to estimating the effect of
duration on the hazard function.
In PH models the hazard function for the $i$th life, $\lambda_i(t, z_i)$, may be
written:
$$\lambda_i(t, z_i) = \lambda_0(t) \, g(z_i)$$
where $\lambda_0(t)$ is a function only of duration $t$, and $g(z_i)$ is a function
only of the covariate vector. (In keeping with statistical habit, we
denote hazards by $\lambda$ rather than $\mu$.) Here, $\lambda_0(t)$ is the hazard for an
individual with a covariate vector equal to zero. It is called the baseline
hazard.
____________
85 Models can be specified in which the effect of covariates changes with
duration:
$$\lambda_i(t, z_i) = \lambda_0(t) \, g(z_i, t)$$
but because the hazard no longer factorises into two terms, one
depending only on duration and the other depending only on the
covariates, these are not PH models. They are also both more complex
to interpret and more computer-intensive to estimate.
____________
Fully parametric models
86 In a fully parametric PH model, the strong assumption is made that the

lifetime distribution, and hence the hazard, belongs to a given family of
parametric distributions, and the regression problem is reduced to
estimating the parameters from the data.
____________
87 Distributions commonly used are the exponential (constant hazard),

Weibull (monotonic hazard), Gompertz-Makeham (exponential hazard)
and log-logistic (‘humped’ hazard).
____________

88 The same distributions are often used as loss distributions with

insurance claims data, but censored observations complicate the
likelihoods considerably and numerical methods are usually required.
For the distributions above, the likelihoods can be written down
(though not always solved) explicitly.
____________
89 Parametric models can be used with a homogeneous population (the

one-sample case) as described in Paragraphs 69 to 80, or can be fitted
to a moderate number of homogeneous groups, in which case
confidence intervals for the fitted parameters give a test of differences
between the groups which should be better than non-parametric
procedures.
____________
90 A parametric PH model using the Gompertz distribution might be
specified as follows. The Gompertz hazard is:
$$\lambda(t) = B c^t$$
with two parameters $B$ and $c$. If we let the value of the parameter $B$
depend on the covariate vector $z_i$:
$$B = \exp(\beta z_i^T)$$
where $\beta$ is a $1 \times p$ vector of regression coefficients, then through the
scalar product $\beta z_i^T$ the influence of each factor in $z_i$ enters the hazard
multiplicatively. (Note that the ‘T’ denotes the transpose of the
vector $z_i$, not a lifetime.)
We then have the PH model:
$$\lambda_i(t, z_i) = c^t \exp(\beta z_i^T)$$
____________

91 Actuaries are frequently interested in both the baseline hazard and the
effect of the covariates. As long as numerical methods are available to
maximise the full likelihood (and find the information matrix), which
nowadays should not be a problem, it is not difficult to specify any
baseline hazard required and to estimate all the parameters
simultaneously, ie those in the baseline hazard and the regression
coefficients.
____________
92 Under PH models, the hazards of different lives with covariate vectors
$z_1$ and $z_2$ are in the same proportion at all times:
$$\frac{\lambda(t, z_1)}{\lambda(t, z_2)} = \frac{\exp(\beta z_1^T)}{\exp(\beta z_2^T)}$$
Hence the name proportional hazards model.

____________
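The proportionality can be illustrated with the Gompertz PH model. All numerical values below (the parameter c, the coefficients and the two covariate vectors) are hypothetical, and the function name hazard is an assumption of this sketch.

```r
cc   <- 1.09            # hypothetical Gompertz parameter c
beta <- c(0.5, -0.3)    # hypothetical regression coefficients
z1   <- c(1, 0)         # hypothetical covariate vector for life 1
z2   <- c(0, 1)         # hypothetical covariate vector for life 2

# Hazard c^t * exp(beta z^T), vectorised over the duration t
hazard <- function(t, z) cc^t * exp(sum(beta * z))

t     <- c(0, 1, 5, 10, 20)
ratio <- hazard(t, z1) / hazard(t, z2)

print(data.frame(t, h1 = hazard(t, z1), h2 = hazard(t, z2), ratio))
```

Both hazards increase with duration, but their ratio stays at exp(0.5 + 0.3) = exp(0.8) at every duration because the factor depending on t cancels.
____________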
93 Moreover, the specification above ensures that the hazard is always
positive and gives a linear model for the log-hazard:
$$\log \lambda_i(t, z_i) = t \log c + \beta z_i^T$$
which is very convenient in theory and practice.

____________
94 However, fully parametric models are difficult to apply without

foreknowledge of the form of the hazard function. Moreover, in many
medical applications answers to questions depend mainly on
estimating the regression coefficients. The baseline hazard is
relatively unimportant. For these reasons, an alternative
semi-parametric approach, originally proposed by D R Cox in 1972, has
become popular.
____________

The Cox proportional hazards model
95 The Cox PH model proposes the following form of hazard function for
the $i$th life:
$$\lambda(t; z_i) = \lambda_0(t) \exp(\beta z_i^T)$$
$\lambda_0(t)$ is the baseline hazard.

____________
96 The utility of this model arises from the fact that the general ‘shape’ of
the hazard function for all individuals is determined by the baseline
hazard, while the exponential term accounts for differences between
individuals. So, if we are not primarily concerned with the precise form
of the hazard, but with the effects of the covariates, we can
ignore $\lambda_0(t)$ and estimate $\beta$ from the data irrespective of the shape of
the baseline hazard. This is termed a semi-parametric approach.
So useful and flexible has this proved, that the Cox model now
dominates the literature on survival analysis, and it is probably the tool
to which a statistician would turn first for the analysis of survival data.
____________
The partial likelihood
97 To estimate $\beta$ in the Cox model it is usual to maximise the partial

likelihood. The partial likelihood estimates the regression coefficients
but avoids the need to estimate the baseline hazard. Moreover, since
(remarkably) it behaves essentially like an ordinary likelihood, it
furnishes all the statistical information needed for standard inference
on the regression coefficients.
____________
98 Let $R(t_j)$ denote the set of lives which are at risk just before the $j$th
observed lifetime and for the moment assume that there is only one
death at each observed lifetime, that is $d_j = 1$ $(1 \le j \le k)$.
The partial likelihood is:
$$L(\beta) = \prod_{j=1}^{k} \frac{\exp(\beta z_j^T)}{\sum_{i \in R(t_j)} \exp(\beta z_i^T)}$$
Intuitively, each observed lifetime contributes the probability that the
life observed to die should have been the one out of the $R(t_j)$ lives at
risk to die, conditional on one death being observed at time $t_j$.
____________
99 Note that the baseline hazard cancels out and the partial likelihood
depends only on the order in which deaths are observed. (The name
‘partial’ likelihood arises because those parts of the full likelihood
involving the times at which deaths were observed and what was
observed between the observed deaths are thrown away.)
____________
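For a single covariate and no tied death times, the log partial likelihood can be written out directly. The five observations below are hypothetical, and the function name log_partial_lik is an assumption of this sketch.

```r
# Hypothetical data: one covariate, no tied death times
time  <- c(3, 5, 6, 8, 10)
delta <- c(1, 1, 0, 1, 0)   # 1 = death, 0 = censored
z     <- c(1, 0, 1, 0, 1)   # e.g. an indicator covariate

log_partial_lik <- function(beta) {
  death_idx <- which(delta == 1)
  sum(sapply(death_idx, function(j) {
    risk_set <- time >= time[j]   # R(t_j): lives at risk just before t_j
    # log of exp(beta z_j) / sum over the risk set of exp(beta z_i)
    beta * z[j] - log(sum(exp(beta * z[risk_set])))
  }))
}

# Numerical maximisation over a bounded interval
fit <- optimize(log_partial_lik, interval = c(-10, 10), maximum = TRUE)
print(fit$maximum)
```

Only the order of the observations enters the calculation: replacing the times by any other values with the same ranking leaves the function unchanged.
____________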
Maximising the partial likelihood
100 Maximisation of this expression has to proceed numerically, and most

statistics packages have procedures for fitting a Cox model.
____________
101 In practice there might be ties in the data, that is:
(a) some $d_j > 1$; or
(b) some observations are censored at an observed lifetime.
It is usual to deal with (b) by including the lives on whom observation
was censored at time $t_j$ in the risk set $R(t_j)$, effectively assuming that
censoring occurs just after the deaths were observed.
____________

Breslow’s approximation
102 Accurate calculation of the partial likelihood in case (a) is messy, since
all possible combinations of $d_j$ deaths out of the $R(t_j)$ at risk at time
$t_j$ ought to contribute, and an approximation due to Breslow is often
used, namely:
$$L(\beta) = \prod_{j=1}^{k} \frac{\exp(\beta s_j^T)}{\left(\sum_{i \in R(t_j)} \exp(\beta z_i^T)\right)^{d_j}}$$
where $s_j$ is the sum of the covariate vectors $z$ of the $d_j$ lives
observed to die at time $t_j$.
____________
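Breslow's approximation can be sketched for a single covariate as follows. The six observations are hypothetical (two deaths and one censoring are tied at duration 4), and the function name breslow_loglik is an assumption of this sketch.

```r
# Hypothetical data with ties at duration 4
time  <- c(2, 4, 4, 4, 7, 9)
delta <- c(1, 1, 1, 0, 1, 0)   # 1 = death, 0 = censored
z     <- c(0, 1, 0, 1, 1, 0)

breslow_loglik <- function(beta) {
  tj <- sort(unique(time[delta == 1]))      # distinct death times
  sum(sapply(tj, function(t) {
    died <- time == t & delta == 1
    risk <- time >= t                       # includes lives censored at t
    d_j  <- sum(died)                       # number of deaths at t
    s_j  <- sum(z[died])                    # summed covariates of those deaths
    beta * s_j - d_j * log(sum(exp(beta * z[risk])))
  }))
}

print(breslow_loglik(0))
print(optimize(breslow_loglik, interval = c(-10, 10), maximum = TRUE)$maximum)
```

When every $d_j = 1$ the function reduces to the ordinary log partial likelihood, since $s_j$ is then just the covariate of the single life that died.
____________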
Properties of the partial likelihood
103 As mentioned earlier, the partial likelihood behaves much like a full
likelihood; it yields an estimator for $\beta$ which is asymptotically
(multivariate) normal and unbiased, and whose asymptotic variance
matrix can be estimated by the inverse of the observed information
matrix.
____________
The efficient score function, namely the vector function:
$$u(\beta) = \left(\frac{\partial \log L(\beta)}{\partial \beta_1}, \ldots, \frac{\partial \log L(\beta)}{\partial \beta_p}\right)$$
plays an important part; in particular solving $u(\hat{\beta}) = 0$ furnishes the
maximum likelihood estimate $\hat{\beta}$.
The observed information matrix $I(\hat{\beta})$ is then the negative of the $p \times p$
matrix of second partial derivatives:
$$I(\beta)_{ij} = -\frac{\partial^2 \log L(\beta)}{\partial \beta_i \, \partial \beta_j} \quad (1 \le i, j \le p)$$
evaluated at $\hat{\beta}$.
____________
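For a single covariate with no ties, the score and observed information have simple closed forms (the observed covariate minus its weighted mean over the risk set, and the weighted variance over the risk set, summed over deaths), so a Newton-Raphson iteration is easy to sketch. The data are hypothetical and the function name score_info is an assumption; real packages use more careful implementations.

```r
# Hypothetical data: one covariate, no tied death times
time  <- c(3, 5, 6, 8, 10)
delta <- c(1, 1, 0, 1, 0)
z     <- c(1, 0, 1, 0, 1)

score_info <- function(beta) {
  u <- 0
  info <- 0
  for (j in which(delta == 1)) {
    risk <- time >= time[j]
    w    <- exp(beta * z[risk])
    zbar <- sum(z[risk] * w) / sum(w)     # weighted mean of z over R(t_j)
    z2   <- sum(z[risk]^2 * w) / sum(w)   # weighted mean of z^2 over R(t_j)
    u    <- u + (z[j] - zbar)             # contribution to d log L / d beta
    info <- info + (z2 - zbar^2)          # contribution to -d^2 log L / d beta^2
  }
  c(score = u, info = info)
}

beta <- 0
for (iter in 1:25) {
  si   <- score_info(beta)
  step <- si[["score"]] / si[["info"]]    # Newton-Raphson update
  beta <- beta + step
  if (abs(step) < 1e-10) break
}

se <- 1 / sqrt(score_info(beta)[["info"]])  # standard error from the information
print(c(beta_hat = beta, std_error = se))
```

With so few observations the standard error is large relative to the estimate, which illustrates why the information matrix matters when judging whether a covariate has a significant effect.
____________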
104 A useful feature of most computer packages for fitting a Cox model is
that the information matrix evaluated at $\hat{\beta}$ is usually produced as a
by-product of the fitting process (it is used in the Newton-Raphson
algorithm) so standard errors of the components of $\hat{\beta}$ are available.
These are helpful in evaluating the fit of a particular model.
____________
Model fitting
Assessing the effect of the covariates
105 In a practical problem, several possible explanatory variables might

present themselves, and part of the modelling process is the selection
of those which have significant effects. Therefore criteria are needed
for assessing the effects of covariates, alone or in combination.
____________
106 A common criterion is the likelihood ratio statistic. Suppose we need
to assess the effect of adding further covariates to the model. In
general, suppose we fit a model with $p$ covariates, and another model
with $p + q$ covariates, which include the $p$ covariates of the first
model.
Each is fitted by maximising a likelihood; let $L_p$ and $L_{p+q}$ be the
maximised log-likelihoods of the first and second models respectively.
The likelihood ratio statistic is then:
$$-2(L_p - L_{p+q})$$
and it has an asymptotic $\chi^2$ distribution on $q$ degrees of freedom,
under the hypothesis that the extra $q$ covariates have no effect in the
presence of the original $p$ covariates.
____________
Strictly this statistic is based upon full likelihoods, but when fitting a
Cox model it is used with partial likelihoods.
For example, suppose we have considered a model for the effect of
hypertension on survival, in which $z_i$ has two components, with the
level of $z_i^{(1)}$ representing sex and the level of $z_i^{(2)}$ representing blood
pressure.
Suppose we want to test the hypothesis that cigarette smoking has no
effect, allowing for sex and blood pressure.
Then we could define an augmented covariate vector
$z_i' = (z_i^{(1)}, z_i^{(2)}, z_i^{(3)})$ in which $z_i^{(3)}$ is a factor (say, 0 for non-smoker and
1 for smoker) and refit the model.
The likelihood ratio statistic $-2(L_2 - L_3)$ then has an asymptotic $\chi^2$
distribution on 1 degree of freedom, under the null hypothesis (which
is that the new parameter $\beta_3 = 0$).
____________
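The test itself is a one-line comparison against a chi-squared distribution. The two maximised log-likelihoods below are hypothetical numbers invented for illustration, as is the 5% significance level.

```r
# Hypothetical maximised log (partial) likelihoods
L2 <- -1532.7   # model with sex and blood pressure
L3 <- -1529.4   # model with the smoking covariate added
q  <- 1         # number of extra covariates

lr_stat <- -2 * (L2 - L3)                                # likelihood ratio statistic
p_value <- pchisq(lr_stat, df = q, lower.tail = FALSE)   # upper tail probability
crit    <- qchisq(0.95, df = q)                          # 5% critical value
reject  <- lr_stat > crit

print(c(statistic = lr_stat, critical = crit, p_value = p_value))
```

With these illustrative figures the statistic is 6.6, which exceeds the critical value of about 3.84, so the null hypothesis that smoking has no effect would be rejected at the 5% level.
____________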

Building models
107 The likelihood ratio statistic is the basis of various model-building

strategies, in which:
(a) we start with the null model (one with no covariates) and add
possible covariates one at a time; or
(b) we start with a full model which includes all possible covariates,
and then try to eliminate those of no significant effect.
____________
In addition, it is necessary to test for interactions between covariates,

in case their effects should depend on the presence or absence of each
other in the same way as described in Subject CS1.
The likelihood ratio statistic is a standard tool in model selection; for

example it was used in the UK to choose members of a
Gompertz-Makeham family of functions for parametric graduations (see
Booklet 6).
____________
In the R package survival, the command coxph() fits a Cox

proportional hazards model to the supplied data.
R code:
coxph(formula)
The argument formula will be similar to that used when fitting a linear
model via lm() (see Subject CS1) except that the response variable
will be a survival object instead of a vector.
____________
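A minimal usage sketch follows, assuming the survival package is installed. The data frame and its column names (time, delta, z) are hypothetical.

```r
library(survival)  # assumed to be available

# Hypothetical data: one covariate, no tied death times
dat <- data.frame(
  time  = c(3, 5, 6, 8, 10),
  delta = c(1, 1, 0, 1, 0),   # 1 = death, 0 = censored
  z     = c(1, 0, 1, 0, 1)
)

# Response is a survival object; covariates go on the right-hand side
fit <- coxph(Surv(time, delta) ~ z, data = dat)

summary(fit)   # coefficient, exp(coef), standard error and tests
coef(fit)      # estimated regression coefficient
```

The summary also reports a likelihood ratio test of the fitted model against the null model with no covariates.
____________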

PAST EXAM QUESTIONS
This section contains all the relevant exam questions from 2008 to 2017 that
are related to the topics covered in this booklet.
Solutions are given after the questions. These give enough information for
you to check your answer, including working, and also show you what an
outline examination answer should look like. Further information may be
available in the Examiners’ Report, ASET or Course Notes. (ASET can be
ordered from ActEd.)
We first provide you with a cross-reference grid that indicates the main
subject areas of each exam question. You can use this, if you wish, to
select the questions that relate just to those aspects of the topic that you
may be particularly interested in reviewing.
Alternatively, you can choose to ignore the grid, and attempt each question
without having any clues as to its content.

Cross-reference grid
[Grid of Questions 1 to 50 (rows) against the following topics (columns): Lifetime RVs, Force of mortality, UDD assumption, Central rate of mortality, Gompertz model, Weibull model, Hazard function, Types of censoring, Kaplan-Meier model, Nelson-Aalen model, PH models; with a final ‘Question attempted’ column.]

1 Subject CT4 April 2008 Question 8
An education authority provides children with musical instrument tuition. The

authority is concerned about the number of children giving up playing their
instrument and is testing a new tuition method with a proportion of the children
which it hopes will improve persistency rates. Data have been collected and a
Cox proportional hazards model has been fitted for the hazard of giving up
playing the instrument. Symmetric 95% confidence intervals (based upon
standard errors) for the regression parameters are shown below.
Covariate            Confidence interval
Instrument
  Piano              0
  Violin             [-0.05, 0.19]
  Trumpet            [0.07, 0.21]
Tuition method
  Traditional        0
  New                [-0.15, 0.05]
Sex
  Male               [-0.08, 0.12]
  Female             0
(i) Write down a general expression for the Cox proportional hazards model,
defining all terms that you use. [3]
(ii) State the regression parameters for the fitted model. [2]
(iii) Describe the class of children to which the baseline hazard applies. [1]
(iv) Discuss the suggestion that the new tuition method has improved the
chances of children continuing to play their instrument. [3]
(v) Calculate, using the results from the model, the probability that a boy will
still be playing the piano after 4 years if provided with the new tuition
method, given that the probability that a girl will still be playing the trumpet
after 4 years following the traditional method is 0.7. [3]
[Total 12]

An investigation into the mortality of patients following a specific type of major

operation was undertaken. A sample of 10 patients was followed from the
date of the operation until either they died, or they left the hospital where the
operation was carried out, or a period of 30 days had elapsed (whichever of
these events occurred first). The data on the 10 patients are given in the table
below.
Patient number    Duration of observation (days)    Reason for observation ceasing
1                 2                                 Died
2                 6                                 Died
3                 12                                Died
4                 20                                Left hospital
5                 24                                Left hospital
6                 27                                Died
7                 30                                Study ended
8                 30                                Study ended
9                 30                                Study ended
10                30                                Study ended
(i) State whether the following types of censoring are present in this
investigation. In each case give a reason for your answer.
(a) Type I
(b) Type II
(c) Random. [3]
(ii) State, with a reason, whether the censoring in this investigation is likely to
be informative. [1]
(iii) Calculate the value of the Kaplan-Meier estimate of the survival function
at duration 28 days. [5]
(iv) Write down the Kaplan-Meier estimate of the hazard of death at duration 8
days. [1]
(v) Sketch the Kaplan-Meier estimate of the survival function. [2]

[Total 12]

3 Subject CT4 September 2008 Question 10
In an investigation of reconviction rates among those who have served

prison sentences, let X be a random variable which measures the duration
from the date of release from prison until the ex-prisoner is convicted of a
subsequent offence. The investigation monitored a sample of 100
ex-prisoners (who were all released on the same date) at one-monthly
intervals from their date of release for a period of 6 months. Those who
could not be traced in any month were removed from the sample at that
point and not traced in subsequent months. Reconviction was assumed to
take place at the duration that a prisoner was first known to have been
reconvicted.
(i) Express the hazard rate at duration x months in terms of probabilities. [1]
The investigation produced the following data for a sample of 100

ex-prisoners.
Months since release    Number of prisoners contacted    Number who had been reconvicted since last contact
1                       100                              0
2                       97                               0
3                       95                               4
4                       90                               3
5                       85                               5
6                       80                               0
(ii) Calculate the Nelson-Aalen estimate of the survival function. [5]
A previous investigation found that the probability that a prisoner would be

reconvicted within 6 months of release was 0.2.
(iii) Estimate confidence intervals around the integrated hazard using the
results from part (ii) to test the hypothesis that the rate of reconviction has
declined since the previous investigation. [6]
[Total 12]

4 Subject CT4 September 2008 Question 12 (part)
(i) Explain the meaning of the rates of mortality usually denoted $q_x$
and $m_x$, and the relationship between them. [3]
(ii) Write down a formula for ${}_tq_x$, $0 \le t \le 1$, under both of the following
assumptions about the distribution of deaths in the age range $[x, x + 1]$:
(a) uniform distribution of deaths
(b) constant force of mortality [2]
A group of animals experiences a mortality rate $q_x = 0.1$.
(iii) Calculate $m_x$ under both of the assumptions (a) and (b) above. [5]
(iv) Comment on your results in part (iii). [2]

[Total 12]
Below is an extract from English Life Table 15 (females).
Age x (years)    Number of survivors to exact age x out of 100,000 births
30               98,617
40               97,952
(i) Calculate ${}_5q_{30}$ under each of the two following alternative assumptions:
(a) a uniform distribution of deaths (UDD) between ages 30 and 40 years
(b) a constant force of mortality between ages 30 and 40 years. [3]
(ii) Calculate the number of survivors to exact age 35 years out of 100,000
births under each of the assumptions in (i) above. [1]

English Life Table 15 (females) was originally calculated using data

classified by single years of age. The number of survivors to exact age 35
years was 98,359.
(iii) Comment on the appropriateness of the assumptions of UDD and a

constant force of mortality between ages 30 and 40 years in this example.
[3]
[Total 7]
(i) Prove that, under Gompertz’s Law, the probability of survival from age $x$
to age $x + t$, ${}_tp_x$, is given by:
$${}_tp_x = \left[\exp\left(\frac{-B}{\ln c}\right)\right]^{c^x (c^t - 1)}$$ [3]
For a certain population, estimates of survival probabilities are available as
follows:
${}_1p_{50} = 0.995$
${}_2p_{50} = 0.989$
(ii) Calculate values of $B$ and $c$ consistent with these observations. [3]
(iii) Comment on the calculation performed in (ii) compared with the usual
process for estimating the parameters from a set of crude mortality rates.
[3]
[Total 9]

Let $T_x$ be a random variable denoting future lifetime after age $x$, and let $T$
be another random variable denoting the lifetime of a new-born person.
(i) (a) Define, in terms of probabilities, $S_x(t)$, which represents the survival
function of $T_x$.
(b) Derive an expression relating $S_x(t)$ to $S(t)$, the survival function
of $T$. [2]
(ii) Define, in terms of probabilities involving $T_x$, the force of mortality, $\mu_{x+t}$.
[1]
The Weibull distribution has a survival function given by:
$$S_x(t) = \exp\left(-(\lambda t)^\beta\right)$$
where $\lambda$ and $\beta$ are parameters ($\lambda, \beta > 0$).
(iii) Derive an expression for the Weibull force of mortality in terms of $\lambda$
and $\beta$. [3]
(iv) Sketch, on the same graph, the Weibull force of mortality for $0 \le t \le 5$ for
the following pairs of values of $\lambda$ and $\beta$:
$\lambda = 1$, $\beta = 0.5$
$\lambda = 1$, $\beta = 1$
$\lambda = 1$, $\beta = 1.5$ [4]
[Total 10]

Describe the difference between the following assumptions about mortality

between any two ages, $x$ and $y$ ($y > x$):
• uniform distribution of deaths
• constant force of mortality
In your answer, explain the shape of the survival function between ages x
and y under each of the two assumptions. [2]
An electronics company developed a revolutionary new battery which it

believed would make it enormous profits. It commissioned a sub-contractor
to estimate the survival function of battery life for the first 12 prototypes. The
sub-contractor inserted each prototype battery into an identical electrical
device at the same time and measured the duration elapsing between the
time each device was switched on and the time its battery ran out. The sub-
contractor was instructed to terminate the test immediately after the failure of
the 8th battery, and to return all 12 batteries to the company.
When the test was complete, the sub-contractor reported that he had
terminated the test after 150 days. He further reported that:
• two batteries had failed after 97 days
• three further batteries had failed after 120 days
• two further batteries had failed after 141 days
• one further battery had failed after 150 days.
However, he reported that he was only able to return 11 batteries, as one

had exploded after 110 days, and he had treated this battery as censored at
that duration when working out the Kaplan-Meier estimate of the survival
function.
(i) State, with reasons, the forms of censoring present in this study. [2]
(ii) Calculate the Kaplan-Meier estimate of the survival function based on the
information supplied by the sub-contractor. [5]
In his report, the sub-contractor claimed that the Kaplan-Meier estimate of

the survival function at the duration when the investigation was terminated
was 0.2727.

(iii) Explain why the sub-contractor’s Kaplan-Meier estimate would be

consistent with him having stolen the battery he claimed had exploded. [4]
[Total 11]
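The product-limit calculation behind parts (ii) and (iii) can be sketched in Python. This is an illustrative sketch only: the function name `kaplan_meier` and the (time, failed) data layout are assumptions made here, and failures are assumed to precede censorings at tied durations.

```python
from collections import Counter

def kaplan_meier(observations):
    """Return [(t_j, S(t_j))] at each distinct failure time t_j.

    observations: list of (time, failed) pairs, failed=True for an
    observed failure and False for a censored observation.
    Assumed convention: at tied times, failures precede censorings.
    """
    deaths = Counter(t for t, failed in observations if failed)
    exits = Counter(t for t, _ in observations)
    at_risk = len(observations)
    survival = 1.0
    estimate = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            survival *= 1 - d / at_risk
            estimate.append((t, survival))
        at_risk -= exits[t]  # remove failures and censorings at t
    return estimate

# Battery data as reported: the exploded battery is censored at 110 days
# and the three batteries still working are censored at 150 days.
batteries = (
    [(97, True)] * 2 + [(110, False)] + [(120, True)] * 3
    + [(141, True)] * 2 + [(150, True)] + [(150, False)] * 3
)
for t, s in kaplan_meier(batteries):
    print(t, round(s, 4))
```

Re-running the same function with the (110, False) entry removed, ie with only 11 batteries on test, gives an estimate of 3/11 at 150 days, which is the figure quoted in the sub-contractor’s report.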
A study was undertaken into the length of spells of unemployment among

young people in a certain city. A sample of young people was monitored
from the time they started to claim unemployment benefit until either they
resumed work, or they moved away from the city. None of the members of
the sample died during the study.
The study investigated the impact of age, sex and educational qualifications
on the hazard of returning to work using the following covariates:
A  a young person’s age when he or she started claiming benefit
   (measured in exact years since his or her 16th birthday)
S  a dummy variable taking the value 1 if the person was male and 0 if the
   person was female
E  a dummy variable taking the value 1 if the person had passed a school
   leaving examination in mathematics, and 0 otherwise
with associated parameters $\beta_A$, $\beta_S$ and $\beta_E$.
The investigators decided to use a Cox proportional hazards regression

model for the study.
(i) Explain what is meant by a proportional hazards model. [3]
(ii) Explain why the Cox model is a popular model for the analysis of survival
data. [3]
(iii) (a) Write down the equation of the model that was estimated, defining
the terms you use (other than those defined above).
(b) List the characteristics of the young person to whom the baseline
hazard applies. [3]

The results showed:
• The hazard of resuming work for males who started claiming benefit
aged 17 years exact and who had passed the mathematics examination
was 1.5 times the hazard for males who started claiming benefit aged
16 years exact but who had not passed the mathematics examination.
• Females who had passed the mathematics examination were twice as
likely to take up a new job as were males of the same age who had
failed the mathematics examination.
• Females who started claiming benefit aged 20 years exact and who had
passed the mathematics examination were twice as likely to resume
work as were males who started claiming benefit aged 16 years exact
and who had also passed the mathematics examination.
(iv) Calculate the estimated values of the parameters $\beta_A$, $\beta_S$ and $\beta_E$. [6]

[Total 15]
Write down integral equations for the mean and variance of the complete
future lifetime at age $x$, $T_x$. [2]
A certain profession admits new members to the status of student. Students

may qualify as fellows of the profession by virtue of passing a series of
examinations. Normally student members sit the examinations whilst working
for an employer. There are two sessions of the examinations each year.
An employer provides study support to student members of the profession. It

wishes to assess the cost of providing this study support and therefore wishes
to know the average time it can expect to take for its students to qualify.
The employer has maintained records for 23 of its students who all sat their
first examination in the first session of 2003. The students’ progress has been
recorded up to and including the last session of 2009. The following data
records the number of sessions which had been held before the specified
event occurred for a student in this cohort:
Qualified 6, 8, 8, 9, 9, 9, 11, 11, 13, 13, 13

Stopped studying 4, 5, 8, 11, 14

The remaining seven students were still studying for the examinations at the
end of 2009.
(i) Determine the median number of sessions taken to qualify for those
students who qualified during the period of observation. [2]
(ii) Calculate the Kaplan-Meier estimate of the survival function S(t ) for the
‘hazard’ of qualifying, where t is the number of sessions of examinations
since 1 January 2003. [5]
(iii) Hence estimate the median number of sessions to qualify for the students
of this employer. [2]
(iv) Explain the difference between the results in (i) and (iii) above. [2]
[Total 11]
(i) Write down the hazard function for the Cox proportional hazards model
defining all the terms that you use. [2]
A farmer is concerned that he is losing a lot of his birds to a predator, so he

decides to build a new enclosure using taller fencing. This fencing is
expensive and he cannot afford to build a large enough area for all his birds.
He therefore decides to put half his birds in the new enclosure and leave the
others in the existing enclosure. He is convinced that the new enclosure is an
improvement, but has asked an actuarial student to determine whether the
new enclosure will result in an increase in the life expectancy of his birds. The
student has fitted a Cox proportional hazards model to data on the duration
until a bird is killed by a predator and calculated the following figures relating to
the regression parameters:
                        Parameter estimate   Variance
Bird        Chicken      0                   0
            Duck        –0.210               0.002
            Goose        0.075               0.004
Enclosure   New          0.125               0.0015
            Old          0                   0
Sex         Male         0.2                 0.0026
            Female       0                   0

(ii) State the features of the bird to which the baseline hazard applies. [1]
(iii) For each regression parameter:
(a) Define the associated covariate.
(b) Calculate the 95% confidence interval based on the standard error. [3]
(iv) Comment on the farmer’s belief that the new enclosure will result in an
increase in his birds’ life expectancy. [2]
(v) Calculate, using this model, the probability that a female duck in the new
enclosure has been killed by a predator at the end of six months, given
that the probability that a male goose in the old enclosure has been killed
at the end of the same period is 0.1 (all other decrements can be ignored).
[4]
[Total 12]
(i) Write down a formula for ${}_t q_x$ ($0 \le t \le 1$) under both of the following
assumptions:
(a) uniform distribution of deaths
(b) constant force of mortality [2]
(ii) Calculate ${}_{0.5} p_{60}$ to six decimal places under both assumptions given
$q_{60} = 0.05$. [2]
(iii) Comment on the relative magnitude of your answers to part (ii). [1]
[Total 5]
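As a numerical check on part (ii), the two standard formulae for the survival probability over a fraction of a year can be evaluated directly. This is a sketch only; the function names `p_udd` and `p_cfm` are labels chosen here, not notation from the Core Reading.

```python
def p_udd(t, q):
    """t_p_x under a uniform distribution of deaths, for 0 <= t <= 1."""
    return 1 - t * q

def p_cfm(t, q):
    """t_p_x under a constant force of mortality, for 0 <= t <= 1."""
    return (1 - q) ** t

q60 = 0.05
print(f"UDD: {p_udd(0.5, q60):.6f}")
print(f"CFM: {p_cfm(0.5, q60):.6f}")
```

The constant force figure comes out slightly lower than the UDD figure, which is the comparison part (iii) asks about.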

A researcher is reviewing a study published in a medical journal into survival

after a certain major operation. The journal only gives the following summary
information:
• the study followed 16 patients from the point of surgery
• the patients were studied until the earliest of five years after the
operation, the end of the study or the withdrawal of the patient from the
study
• the Nelson-Aalen estimate, $S(t)$, of the survival function was as follows:
Duration since operation t (years)   S(t)
$0 \le t < 1$                        1
$1 \le t < 3$                        0.9355
$3 \le t < 4$                        0.7122
$4 \le t < 5$                        0.6285
(i) Describe the types of censoring which are present in the study. [2]
(ii) Calculate the number of deaths which occurred, classified by duration

since the operation. [6]
(iii) Calculate the number of patients who were censored. [1]

[Total 9]
A study of the mortality of a certain species of insect reveals that for the first 30
days of life, the insects are subject to a constant force of mortality of 0.05.
After 30 days, the force of mortality increases according to the formula:
$\mu_{30+x} = 0.05 \exp(0.01x)$
where x is the number of days after day 30.
(i) Calculate the probability that a newly born insect will survive for at least
10 days. [1]
(ii) Calculate the probability that an insect aged 10 days will survive for at
least a further 30 days. [3]

(iii) Calculate the age in days by which 90 per cent of insects are expected
to have died. [4]
[Total 8]
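The three calculations all follow from the integrated force of mortality, which has a closed form on each piece. A rough Python sketch, with function and variable names chosen here purely for illustration:

```python
import math

def cumulative_hazard(age):
    """Integrated force of mortality from birth to `age` (in days)."""
    if age <= 30:
        return 0.05 * age
    x = age - 30
    # Integral of 0.05*exp(0.01*s) over 0 <= s <= x is 5*(exp(0.01*x) - 1).
    return 0.05 * 30 + 5 * (math.exp(0.01 * x) - 1)

def survival(age):
    """Probability that a newly born insect survives to `age` days."""
    return math.exp(-cumulative_hazard(age))

p_10 = survival(10)                       # part (i)
p_10_to_40 = survival(40) / survival(10)  # part (ii)

# Part (iii): survival(30) exceeds 0.1, so the required age is over 30 days.
# Solve 1.5 + 5*(exp(0.01*x) - 1) = ln(10) for x, the days after day 30.
x90 = 100 * math.log(1 + (math.log(10) - 1.5) / 5)
age_90_percent_dead = 30 + x90

print(round(p_10, 4), round(p_10_to_40, 4), round(age_90_percent_dead, 2))
```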
At Miracle Cure hospital a pioneering new surgery was tested to replace

human lungs with synthetic implants. Operations were carried out throughout
June 2010. Patients who underwent the surgery were monitored daily until the
end of August 2010, or until they died or left hospital if sooner. The results are
shown below. Where no date is given, the patient was alive and still in hospital
at the end of August.
Patient   Date of surgery   Date of leaving observation   Reason for leaving observation
A         June 1            June 3                        Died
B         June 3            July 2                        Left hospital
C         June 5
D         June 8
E         June 9            July 11                       Died
F         June 12
G         June 16           June 21                       Died
H         June 17           Aug 12                        Left hospital
I         June 22
J         June 24           June 29                       Died
K         June 25           Aug 20                        Died
L         June 26
M         June 29           Aug 6                         Left hospital
N         June 30
(i) Explain whether each of the following types of censoring is present and
for those present explain where they occur:
• right censoring
• left censoring
• informative censoring. [3]
(ii) Calculate the Kaplan-Meier estimate of the survival function for these
patients, stating all assumptions that you make. [6]

(iii) Sketch, on a suitably labelled graph, the Kaplan-Meier estimate of the

survival function. [2]
(iv) Estimate the probability that a patient will die within four weeks of surgery.
[1]
[Total 12]
(i) Describe what is represented by each of the central rate of mortality $m_x$
and the initial rate of mortality $q_x$. [2]
(ii) State the circumstance in which $m_x = \mu_x$. [1]

[Total 3]
A new weed killer was tested which was designed to kill weeds growing in
grass. The weedkiller was administered via a single application to 20 test
areas of grass. Within hours of applying the weedkiller, the leaves of all the
weeds went black and died, but after a time some of the weeds re-grew as
the weedkiller did not always kill the roots.
The test lasted for 12 months, but after six months five of the test areas were
accidentally ploughed up and so the trial on these areas had to be
discontinued. None of these five areas had shown any weed re-growth at
the time they were ploughed up.
• Ten of the remaining 15 areas experienced a re-growth of weeds at the
following durations (in months): 1, 2, 2, 2, 5, 5, 8, 8, 8, 8.
• Five areas still had no weed re-growth when the trial ended after 12
months.
(i) Describe, giving reasons, the types of censoring present in the data. [2]
(ii) Estimate the probability that there is no re-growth of weeds nine months
after application of the weedkiller using either the Kaplan-Meier or the
Nelson-Aalen estimator. [4]
[Total 6]
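If the Nelson-Aalen route is chosen for part (ii), the cumulative hazard is the running sum of d_j / n_j over the re-growth times, and the survival estimate is exp(-cumulative hazard). A minimal sketch, assuming a (time, event) data layout and a function name `nelson_aalen` that are both introduced here for illustration:

```python
import math
from collections import Counter

def nelson_aalen(observations):
    """Return [(t_j, Lambda(t_j))], the cumulative hazard at each event time.

    observations: list of (time, occurred) pairs, occurred=False if censored.
    """
    events = Counter(t for t, occurred in observations if occurred)
    exits = Counter(t for t, _ in observations)
    at_risk = len(observations)
    cumulative = 0.0
    estimate = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            cumulative += d / at_risk
            estimate.append((t, cumulative))
        at_risk -= exits[t]  # remove events and censorings at t
    return estimate

regrowth = [1, 2, 2, 2, 5, 5, 8, 8, 8, 8]
weed_data = (
    [(t, True) for t in regrowth]
    + [(6, False)] * 5    # areas ploughed up after six months
    + [(12, False)] * 5   # areas with no re-growth when the trial ended
)
cum_hazard_at_9 = [h for t, h in nelson_aalen(weed_data) if t <= 9][-1]
print(round(math.exp(-cum_hazard_at_9), 4))
```

The five ploughed-up areas leave the risk set at six months, so only nine areas are at risk when the re-growths at eight months occur.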

A study is made of the impact of regular exercise and gender on the risk of
developing heart disease among 50-70 year olds. A sample of people is
followed from exact age 50 years until either they develop heart disease or
they attain the age of 70 years. The study uses a Cox regression model.
(i) List reasons why the Cox regression model is a suitable model for
analyses of this kind. [3]
The investigator defined two covariates as follows:
• $Z_1$ = 1 if male, 0 if female
• $Z_2$ = 1 if takes regular exercise, 0 otherwise.
The investigator then fitted three models, one with just gender as a covariate,
a second with gender and exercise as covariates, and a third with gender,
exercise and the interaction between them as covariates. The maximised
log-likelihoods of the three models and the maximum likelihood estimates of
the parameters in the third model were as follows:
Model                             Maximised log-likelihood
Null model                        –1,269
Gender                            –1,256
Gender + exercise                 –1,250
Gender + exercise + interaction   –1,246

Covariate     Parameter
Gender         0.2
Exercise      –0.3
Interaction   –0.35
(ii) Show that the interaction term is required in the model by performing a
suitable statistical test. [5]
(iii) Interpret the results of the model. [3]

[Total 11]

A new drug treatment for patients suffering from a chronic skin disease with
visible symptoms was tested. The drug was administered through a daily dose
for the duration of the trial. As soon as the drug regime started, the symptoms
disappeared in all patients, but after some time had a tendency to reappear as
the agent causing the disease developed resistance to the drug. The trial
lasted for six months.
The data below show the number of patients experiencing a return of their
symptoms in each month after the drug regime started.
Month   Number of patient-months   Number of patients experiencing
        exposed to risk            a return of their symptoms
1       200                        5
2       190                        8
3       175                        15
4       150                        10
5       135                        6
6       125                        3
(i) Calculate the hazard of symptoms returning in each month. [2]
As part of the investigation, it is desired to assess the impact of certain risk

factors on the hazard of symptoms returning. It is suggested that to achieve
this, the hazard could be modelled using either a Gompertz model or a
semi-parametric model.
(ii) Comment on the use of each of these models in this situation. [4]
[Total 6]

For a particular investigation the hazard of mortality is assumed to take the

form:
h(t ) = A + Bt
where A and B are constants and t represents time.
For each life $i$ in the investigation ($i = 1, \ldots, n$) information was collected on
the length of time the life was observed $t_i$ and whether the life exited due to
death ($\delta_i = 1$ if the life died, 0 otherwise).
(i) Show that the likelihood of the data is given by:
$L = \prod_{i=1}^{n} (A + B t_i)^{\delta_i} \exp\left[-A t_i - \tfrac{1}{2} B t_i^2\right]$ [3]
(ii) Derive two simultaneous equations from which the maximum likelihood
estimates of the parameters A and B can be calculated. [3]
[Total 6]
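One way to set out the working for part (ii), writing $\delta_i$ for the death indicator and $t_i$ for the observed time of life $i$, is to take logs of the likelihood in part (i) and differentiate with respect to each parameter. The following is a sketch of that route rather than a model solution:

```latex
\log L = \sum_{i=1}^{n} \delta_i \log(A + B t_i) - A \sum_{i=1}^{n} t_i - \tfrac{1}{2} B \sum_{i=1}^{n} t_i^2

\frac{\partial \log L}{\partial A} = \sum_{i=1}^{n} \frac{\delta_i}{A + B t_i} - \sum_{i=1}^{n} t_i = 0

\frac{\partial \log L}{\partial B} = \sum_{i=1}^{n} \frac{\delta_i t_i}{A + B t_i} - \tfrac{1}{2} \sum_{i=1}^{n} t_i^2 = 0
```

The last two equations have no closed-form solution in general and would need to be solved numerically for the estimates of $A$ and $B$.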

Mr Bunn the baker made 12 pies to sell in his shop. He placed the pies in
the shop at 9am. During the rest of the day the following events took place:
Time Event
10am A boy bought two pies
11am A man bought three pies
12 noon Mr Bunn accidentally sat on one pie and squashed it so it could
not be sold
1pm A woman bought two pies
2pm A dog from across the street ran into Mr Bunn’s shop and stole
two pies
3pm A girl on the way home from school bought one pie
5pm Mr Bunn closed for the day and the remaining pie was still in
the shop
(i) Estimate the time it takes Mr Bunn to sell 40% of the pies he makes, using
the Nelson-Aalen estimator. [6]
(ii) Comment on whether you think this estimate would be a good basis for
Mr Bunn to plan his future production of pies. [3]
[Total 9]

(i) State one advantage of a semi-parametric model over a fully parametric

one. [1]
(ii) Write down a general expression for the Cox proportional hazards model,
defining all the terms you use. [2]
A life office is trying to understand the impact of certain factors on the lapse
rates of its policies. It has studied the lapse rates on a block of business
subdivided by:
• sex of policyholder (Male or Female)
• policy type (Term Assurance or Whole Life)
• sales channel (Internet, Direct Sales Force or Independent Financial
Adviser.)
The office has fitted a Cox proportional hazards model to the data and has
calculated the following regression parameters:
Covariate Regression parameter
Female 0.2
Male 0
Term Assurance −0.1

Whole Life 0
Internet 0.4
Independent Financial Adviser −0.2
Direct Sales Force 0
(iii) State the sex/sales channel/policy type combination to which the baseline
hazard relates. [1]
A term assurance is sold to a female by an independent financial adviser.
(iv) Calculate the probability that this term assurance is still in force after five
years given that 60% of whole life policies bought on the internet by males
have lapsed by the end of year five. [4]
[Total 8]
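Part (iv) rests on the proportional hazards property that two survival functions are linked by S_target(t) = S_known(t) ** (hazard ratio). A short Python sketch of that step, using the regression parameters from the table above; the dictionary and helper names are illustrative assumptions:

```python
import math

# Regression parameters from the question (baseline levels have parameter 0).
beta = {
    "Female": 0.2, "Male": 0.0,
    "Term Assurance": -0.1, "Whole Life": 0.0,
    "Internet": 0.4, "Independent Financial Adviser": -0.2,
    "Direct Sales Force": 0.0,
}

def linear_predictor(*levels):
    """Sum of the regression parameters for the given covariate levels."""
    return sum(beta[level] for level in levels)

known = linear_predictor("Male", "Whole Life", "Internet")
target = linear_predictor(
    "Female", "Term Assurance", "Independent Financial Adviser"
)

s_known = 1 - 0.60                       # 60% have lapsed by year five
hazard_ratio = math.exp(target - known)  # target hazard / known hazard
s_target = s_known ** hazard_ratio       # proportional hazards link

print(round(s_target, 4))
```

Working with the ratio of the two hazards avoids having to evaluate the baseline survival function explicitly.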

A certain town runs a training course for traffic wardens each year. The
course lasts for 30 days, but the examination which enables someone to
qualify as a traffic warden can be sat any day during the course. In 2011
there were 13 participants who started the training course. The following
table has been compiled to show the day each candidate qualified or the day
each candidate who did not qualify left the course.
Candidate   Day qualified   Day left without qualifying
A 30
B 5
C 21
D 19
E 12
F 30
G 1
H 19
I 12
J 30
K 15
L 10
M 24
(i) Explain whether the following types of censoring are present:
• interval censoring
• right censoring
• informative censoring. [3]
(ii) Calculate the Kaplan-Meier estimate of the non-qualification function. [6]
(iii) Sketch a graph of the Kaplan-Meier estimate, labelling the axes. [2]
When the data were gathered, the reasons for exit of candidates D and H
were accidentally transposed, and those for candidates B and L were also
accidentally transposed.
(iv) Explain how your answer to part (ii) would change if you had access to the
correct (ie untransposed) data for candidates D, H, B and L. [3]
[Total 14]

In the context of a survival model:
(i) Define right censoring, Type I censoring and Type II censoring. [3]
(ii) Give an example of a practical situation in which censoring would be

informative. [2]
[Total 5]
The mortality of a certain species of furry animal has been studied. It is
known that at ages over five years the force of mortality $\mu$ is constant, but
the variation in mortality with age below five years of age is not understood.
Let the proportion of furry animals that survive to exact age five years be
${}_5 p_0$.
(i) Show that, for furry animals that die at ages over five years, the average
age at death in years is $\frac{5\mu + 1}{\mu}$. [1]
(ii) Obtain an expression, in terms of $\mu$ and ${}_5 p_0$, for the proportion of all
furry animals that die between exact ages 10 and 15 years. [3]
A new investigation of this species of furry animal revealed that 30 per cent
of those born survived to exact age 10 years and 20 per cent of those born
survived to exact age 15 years.
(iii) Calculate $\mu$ and ${}_5 p_0$. [3]

[Total 7]

(i) State the form of the hazard function for the Cox regression model,
defining all the terms used. [2]
(ii) State two advantages of the Cox regression model. [2]
Susanna is studying for an online test. She has collected data on past
attempts at the test and has fitted a Cox regression model to the success
rate using three covariates:
Employment Z1 = 0 if an employee, and 1 if self-employed

Attempt Z2 = 0 if first attempt, and 1 if subsequent attempt
Study time Z3 = 0 if no study time taken, and 1 if study time taken.
Having analysed the data Susanna estimates the parameters as:
Employment 0.4
Attempt –0.2
Study time 1.15
Bill is an employee. He has taken study time and is attempting the test for
the second time. Ben is self-employed and is attempting the test for the first
time without taking study time.
(iii) Calculate how much more or less likely Ben is to pass, compared with Bill.
[3]
Susanna subsequently discovers that the effect of the number of attempts is

different for employees and the self-employed.
(iv) Explain how the model could be adjusted to take this into account. [2]
[Total 9]

The Shining Light company has developed a new type of light bulb which it
recently tested. 1,000 bulbs were switched on and observed until they
failed, or until 500 hours had elapsed. For each bulb that failed, the duration
in hours until failure was noted. Due to an earth tremor after 200 hours, 200
bulbs shattered and had to be removed from the test before failure.
The results showed that 10 bulbs failed after 50 hours, 20 bulbs failed after
100 hours, 50 bulbs failed after 250 hours, 300 bulbs failed after 400 hours
and 50 bulbs failed after 450 hours.
(i) Calculate the Kaplan-Meier estimate of the survival function S(t ) for the
light bulbs in the test. [6]
(ii) Sketch the Kaplan-Meier estimate calculated in part (i). [2]
(iii) Estimate the probability that a bulb will not have failed after each of the
following durations: 300 hours, 400 hours and 600 hours. If it is not
possible to obtain an estimate for any of the durations without additional
assumptions, explain why. [3]
[Total 11]

(i) Explain what is meant by censoring in the context of a mortality

investigation. [1]
A trial was conducted on the effectiveness of a new cream to treat a skin

condition. 100 sufferers applied the cream daily for four weeks or until their
symptoms disappeared if this happened sooner. Some of the sufferers left
the trial before their symptoms disappeared.
(ii) Describe two types of censoring that are present and state to whom they
apply. [2]
The following data were collected:
Number of sufferers   Day symptoms disappeared   Number of sufferers   Day they left the trial
2 6 3 2
1 7 1 10
1 10 3 13
2 14
(iii) Calculate the Nelson-Aalen estimate of the survival function for this trial.
[5]
(iv) Sketch the survival function, labelling the axes. [2]
(v) Estimate the probability that a person using the cream will still have
symptoms of the skin condition after two weeks. [1]
[Total 11]

(i) Explain why the Gompertz model is commonly used in investigations of

human mortality. [1]
The following model of mortality was used in an investigation of the effects of
where someone lives and income on the risk of death:
$\log_e \mu_x = \alpha + \beta_0 x + \beta_1 U + \beta_2 I$
where $\mu_x$ is the force of mortality at age $x$, $U$ takes the value 1 if the
person lives in an urban area and 0 if the person lives in a rural area, $I$ is
the annual income in US dollars, and $\alpha$, $\beta_0$, $\beta_1$ and $\beta_2$ are parameters.
(ii) Show that the model is both a Gompertz model and a proportional
hazards model. [3]
The estimates of the parameters were $\alpha = -9.0$, $\beta_0 = 0.09$, $\beta_1 = 0.3$ and
$\beta_2 = -0.0001$.
(iii) Calculate the predicted force of mortality for an urban resident aged 40
years with an annual income of $20,000. [2]
(iv) Calculate the additional income that an urban resident must have in order
to have the same force of mortality as a rural resident of the same age. [2]
(v) Calculate the 10-year survival probability for an urban resident aged 40
years whose annual income is $20,000. [2]
(vi) Determine the age of a rural resident with the same income as an urban
resident aged 40 years, who has the same chance of surviving for the
next 10 years. [4]
[Total 14]

An investigation has been performed into risk factors for liver disease in
persons currently resident in the United Kingdom (UK) and aged over 50
years. It considered the impact of three covariates: age at the start of the
investigation, weekly alcohol consumption and previous residence in a
tropical country.
The investigation used a Cox regression model for the hazard of developing
the disease, $h(t)$, with three parameters, $\beta_A$, $\beta_C$ and $\beta_T$, as follows:
$h(t) = h_0(t) \exp(\beta_A A + \beta_C C + \beta_T T)$
A was defined as exact age at the start of the investigation less 50 years.
C represented weekly alcohol consumption, and took the value 1 if the

person consumed more than the recommended maximum per week (a
heavy drinker) and 0 otherwise.
T represented previous residence in a tropical country, and took the value 1

if the person had lived in a tropical country for more than 12 months and 0
otherwise.
(i) State the characteristics of a person to whom the baseline hazard $h_0(t)$
applies. [1]
The results of the investigation revealed that the hazard was:
• twice as high for a heavy drinker aged 60 years exact at the start of the
investigation than for a person aged 50 years exact at the start of the
investigation who was not a heavy drinker, where neither had previously
lived in a tropical country.
• four times as high for a heavy drinker who had previously lived in a
tropical country for more than 12 months than for a non-heavy drinker of
the same age who had not previously lived in a tropical country.
• three times as high for a person who had lived in a tropical country for
more than 12 months than for a person of the same age and drinking
habits who had always lived in the UK.
(ii) Calculate $\beta_A$, $\beta_C$ and $\beta_T$. [5]

The probability of a person aged 50 years exact at the start of the

investigation, who does not drink heavily and has always lived in the UK
remaining free of the disease for 10 years is 0.8.
(iii) Show that the probability of a person of the same age and drinking habits,
who has lived for more than 12 months in a tropical country, remaining
free of the disease for 10 years is slightly over one half. [4]
[Total 10]
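Each of the three reported hazard ratios translates into a linear equation in the regression parameters once logs are taken, and part (iii) then uses the proportional hazards link between survival functions. A minimal Python sketch, in which the variable names `beta_A`, `beta_C` and `beta_T` stand for the three parameters of the model:

```python
import math

# Third result: tropical residence alone multiplies the hazard by 3.
beta_T = math.log(3)
# Second result: heavy drinking plus tropical residence multiplies it by 4.
beta_C = math.log(4) - beta_T
# First result: ten extra years of age plus heavy drinking multiplies it by 2.
beta_A = (math.log(2) - beta_C) / 10

# Part (iii): the baseline person has 10-year disease-free probability 0.8.
# Under proportional hazards, S(t) = S_0(t) ** exp(linear predictor).
s_baseline_10 = 0.8
s_tropical_10 = s_baseline_10 ** math.exp(beta_T)

print(round(beta_A, 4), round(beta_C, 4), round(beta_T, 4))
print(round(s_tropical_10, 3))
```

Since exp(beta_T) is 3, the final probability is simply 0.8 cubed.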
(i) Describe what is meant by censoring in the context of a mortality

investigation. [1]
(ii) Explain what right censoring, left censoring and interval censoring are,
giving an example of each. [3]
A toy manufacturer is testing the lifetime of its new electric children’s toy.
500 are set going at 9am one morning on test rigs plugged into the electricity
supply and are run until 5pm the next day or until they fail, whichever comes
first. Unfortunately the cleaner unplugged a test rig on which 17 toys were
still working at 7pm on the first evening in order to plug his floor polisher in.
Then, as he left work three hours later, he took three of the still working toys
for his children to play with. Of the other 480 toys it was found that 12 failed
after four hours, 25 failed after 11 hours and a further 8 failed after 31 hours.
(iii) Explain which forms of censoring are present in this investigation. [2]
(iv) Calculate the Nelson-Aalen estimate of the survival function. [5]
(v) Sketch a graph of the Nelson-Aalen estimate of the survival function,

labelling the axes. [2]
(vi) Comment on the length of time for which a new toy has a 60% probability
of surviving. [1]
[Total 14]

(i) Define the force of mortality $\mu_{x+t}$ of a random variable $T$ denoting length
of life. [1]
The mortality of a certain species of animal has been studied. It is known
that at ages under five years the force of mortality $\mu$ is constant.
(ii) Write down an expression, in terms of $\mu$, for the probability that an
animal will survive from birth to exact age five years. [1]
Mortality of these animals at ages over five years exact is incompletely
understood.
However, it is known that the probability that an animal aged exactly five
years will survive until exact age 10 years is twice the probability that an
animal aged exactly five years will survive until exact age 20 years.
Assume that the force of mortality $\lambda$ is constant at ages over five years
exact.
(iii) Calculate $\lambda$. [3]
(iv) Calculate the expectation of life at birth for these animals if $\lambda = \mu$. [1]
(v) Derive an expression, in terms only of $\mu$, for the expectation of life at
birth for these animals if $\lambda \ne \mu$. [4]
[Total 10]
An investigation was undertaken into the length of post-operative stay in

hospital after a particular type of surgery. All patients undergoing this
surgery between 1 January and 31 January 2013 were observed until either
they left the hospital, died, or underwent a second operation. The event of
interest was leaving the hospital. Patients who died or underwent a second
operation during the period of investigation were treated as censored at the
date of death or second operation respectively. The investigation ended on
28 February 2013, and patients who were still in the hospital at that time
were treated as censored.

(i) State, with reasons, whether the following types of censoring are present
in this investigation:
• right
• Type I
• Type II
• random. [4]
(ii) Comment on whether censoring in this investigation is likely to be

informative. [2]
The following data relate to 11 patients included in the investigation.
Date of operation   Date observation ended   Reason that observation ended
2 January           30 January               Second operation
5 January           7 January                Died
10 January          24 January               Left hospital
12 January          12 February              Left hospital
15 January          29 January               Left hospital
20 January          21 January               Died
23 January          28 February              End of investigation
24 January          31 January               Second operation
(iii) Calculate the Kaplan-Meier estimate of the survivor function for remaining
in the hospital. [6]
(iv) Sketch the Kaplan-Meier estimate of the survivor function, labelling the
axes. [2]
(v) Comment on the results of the investigation. [2]

[Total 16]

The mortality of a rare form of flying beetle is being studied. It has been
discovered that beetles kept in a protected environment have a constant
force of mortality $\mu$ but that those in the wild have a force of mortality which
is 50% higher. It has been proven that the beetles revert immediately to the
higher rate of mortality if they are released from the protected environment.
A beetle born and always living in the wild has a 58% chance of living for
eight days.
Calculate the probability of living the same length of time for:
(a) a beetle born and reared in the protected environment
(b) a beetle born in the protected environment which is scheduled to be

released into the wild after six days. [4]
(i) Explain what is meant by a proportional hazards model. [3]
(ii) Outline three reasons why the Cox proportional hazards model is widely
used in empirical work. [3]
[Total 6]

A study was made of a group of people seeking jobs. 700 people who were
just starting to look for work were followed for a period of eight months in a
series of interviews after exactly one month, two months etc. If the job
seeker found a job during a month, the job was assumed to have started at
the end of the month. Unfortunately, the study was unable to maintain
contact with all the job seekers.
The data from the study are shown in the table below.
Months since start of study   Found employment   Contact lost
1                             100                50
2                             70                 0
3                             50                 20
4                             40                 20
5                             20                 30
6                             20                 60
7                             12                 38
8                             6                  0
(i) (a) Describe two types of censoring present in the investigation.
(b) Describe an example of a person to whom each type applies. [3]
(ii) Calculate the Kaplan-Meier estimate of the function for ‘remaining without
employment’. [6]
A Weibull distribution with a rate $h(t)$ given by the formula $h(t) = \lambda^{\beta} \beta t^{\beta - 1}$
was fitted to these data. The estimated value of $\lambda$ was 0.18 and the
estimated value of $\beta$ was 0.3.
(iii) Test the goodness of fit of the data to this Weibull distribution. [6]
[Total 15]

(i) Define how the following forms of censoring arise in a survival

investigation:
● right censoring
● type I censoring
● random censoring. [3]
An experience analysis is conducted where the event of interest is the lapse

of a term assurance policy.
(ii) Explain whether each form of censoring listed in part (i) occurs in each of
the following situations. If it is not possible to state whether a form of
censoring occurs, explain why this is the case.
(a) A policyholder dies.
(b) A subset of the policies is migrated to a new administration system

and no data are provided from the new system to the experience
analysis team.
(c) A policy reaches its maturity date. [4]

[Total 7]

(i) Describe what is meant by a proportional hazards model. [3]
A pharmaceutical company is interested in testing a new treatment for a

debilitating but non-fatal condition in cows. A randomised trial was carried
out in which a sample of cows with the condition was assigned to either the
new treatment or the previous treatment. The event of interest was the
recovery of a cow from the condition. The results were analysed using a
Cox regression model.
The final model estimated the hazard $h(t, x)$ as:
$h(t, x) = h_0(t) \exp(\beta_0 z + \beta_1 x + \beta_2 xz)$
where:
• $h_0(t)$ is the baseline hazard
• $z$ is a covariate taking the value 1 if the cow was assigned the new
treatment and 0 if the cow was assigned the previous treatment
• $x$ is a covariate denoting the length of time (in days) for which the cow
had been suffering from the condition when treatment was started
• $t$ is the number of days since treatment started.
$\beta_0$, $\beta_1$ and $\beta_2$ are parameters. Their estimated values were $\beta_0 = 0.8$,
$\beta_1 = 0.4$ and $\beta_2 = -0.1$.
(ii) Determine the characteristics of the baseline cow. [1]
For a particular cow, the new treatment and the previous treatment have
exactly the same hazard.
(iii) Calculate the number of days for which that cow had the condition before
the initiation of treatment. [2]
Under the previous treatment, cows whose treatment began after they had
been suffering from the condition for three days had a median recovery time
of 14 days once treatment had started.

(iv) Calculate the proportion of these cows which would still have had the
condition after 14 days if they had been given the new treatment. [4]
[Total 10]
A school offers a one-year course in a foreign language as an evening class.

This is divided into three terms of 13 weeks each with one lesson per week.
At the end of each lesson all the students sit a test and any that pass are
awarded a qualification, and no longer attend the course.
Last year 33 students started the course. Of these 13 dropped out before
completing the year, and 16 passed the test before the end of the year. The
last lesson attended by the students who did not stay for the whole 39
lessons is shown in the table below along with their reason for leaving.
Number of students   Last lesson attended   Reason for leaving
5                    1                      Dropped out
1                    6                      Dropped out
2                    7                      Passed test
2                    13                     Dropped out
5                    14                     Passed test
6                    27                     Passed test
4                    28                     Dropped out
1                    30                     Dropped out
3                    36                     Passed test
(i) Calculate the Nelson-Aalen estimate of the survival function. [5]
(ii) Sketch a graph of the Nelson-Aalen estimate of the survival function,

labelling the axes. [2]
(iii) Determine the probability that a student who starts the course passes by
the end of the year. [1]
Since only four students had not passed by the end of the year and a total of
16 had passed, the school claims in its publicity that 80% of students are
awarded the qualification by the end of the year.
(iv) Comment on the school’s claim in light of your answer to part (iii). [2]
[Total 10]

An investigation was undertaken into the time spent waiting in check-out

queues at a supermarket. A random sample of customers was surveyed,
and the times at which they joined the check-out queue and completed their
purchases were recorded. If they left the check-out queue without
completing a purchase, the time at which they left was also recorded. Below
are the data for 12 customers.
Customer number   Time joined   Time purchase completed   Time left without making purchase
1 10.00 am 10.08 am
2 10.07 am 10.09 am
3 10.10 am 10.16 am
4 10.25 am 10.31 am
5 10.30 am 10.32 am
6 10.45 am 10.49 am
7 11.10 am 11.20 am
8 11.15 am 11.21 am
9 11.35 am 11.40 am
10 11.58 am 12.09 pm
11 12.10 pm 12.14 pm
12 12.15 pm 12.22 pm
(i) Calculate the Kaplan-Meier estimate of the survival function of the

duration between joining the queue and completing a purchase. [6]
The supermarket decides to introduce a scheme under which any customer

who has to wait at a check-out for more than 10 minutes receives a $2
refund on the cost of their shopping. The supermarket has 20,000
customers per day.
(ii) Give an estimate of the daily cost of the new scheme. [1]
(iii) Comment on the assumptions that you have made in obtaining the
estimate in (ii). [2]
[Total 9]

An energy provider is worried about the number of its customers who

transfer to other companies within the first two years of their contract and is
trying to direct its advertising towards the most loyal section of the
population.
The company has looked at its records over recent years and has fitted a
Cox proportional hazards model to those who have transferred within the
first two years using the factors which appear to have the most impact on
early transfer rates.
The following figures have been derived from the data:
Factor                                          Parameter estimate   Variance
Gender                      Male                0.25                 0.015
                            Female              0                    0
Volume of energy consumed   High                0.32                 0.008
                            Low                 0                    0
Area of residence           City Centre         0.19                 0.012
                            City (not centre)   0                    0
                            Rural               0.35                 0.005
(i) Give the hazard function for this Cox proportional hazard model defining
all the terms and covariates. [3]
(ii) State the features of the person to whom the baseline hazard applies. [1]
(iii) Calculate symmetric 95% confidence intervals for the parameters based
on the standard errors. [2]
(iv) Test the suggestion that women change energy providers more frequently
than men. [3]
There is a 70% probability that a male customer who is a low consumer of

energy and lives in a rural area has transferred providers before the end of
two years.
(v) Calculate the probability that a male customer who is a high consumer of
energy and lives in a city centre remains with the company for at least two
years. [3]

(vi) Set out how you would determine whether the effect of any of the factors
depends upon any of the other factors. [5]
[Total 17]
Brian worked in a large open-plan office with a communal kitchen in which

the workers made coffee. Each worker supplied his or her own coffee cup.
For several years Brian was annoyed by his coffee cups being taken away
by colleagues and never returned to the kitchen, so he decided to do an
experiment. He brought into the kitchen 20 cups which were distinguishable
from the other cups in the kitchen. At the end of each day for 15 days he
counted the number of his 20 cups which remained. The results were as
follows:
Day Number of cups Day Number of cups
1 20 9 15
2 19 10 15
3 18 11 15
4 18 12 15
5 17 13 13
6 17 14 12
7 17 15 10
8 16
Brian noted that:

 the cup that ‘disappeared’ during day 2 was taken home by Brian to be
used by his mother
 the two cups that ‘disappeared’ during day 13 were accidentally broken
by Brian when doing his daily check.
Let h( x ) be the hazard that each of Brian’s cups is taken by colleagues

during day x and not returned, and let S( x ) be the corresponding survival
function.
(i) Determine an estimate of S( x ) for Brian’s cups using the Nelson-Aalen

estimator. [6]
(ii) Sketch a chart for your estimated S( x ) . [2]

[Total 8]

45 Subject CT4 September 2016 Question 10 (adapted)
A researcher is investigating the contributing factors to the speed at which

patients recover from a common minor surgical procedure undertaken in
hospitals across the country. He has the questionnaires which each patient
completed before the surgery and the length of time the patient remained in
hospital after surgery and is attempting to fit a Cox proportional hazards
model to the data.
He's fitted a model with what he assumes are the most common contributing
factors and has calculated the parameters as shown in the table below:
Covariate   Category           Parameter
Gender      Male               0
            Female             0.065
Smoker      Non-Smoker         0.035
            Smoker             0
Drinker     Non-Drinker        0.06
            Moderate Drinker   0
            Heavy Drinker      0.085
(i) Give the hazard function for this Cox proportional hazards model, defining
all the terms and covariates. [4]
A male moderate drinker who does not smoke has a hazard of leaving
hospital after three days of 0.6.
(ii) Calculate the hazard rate at three days for a female smoker who is a
heavy drinker and who is still in hospital at that point. [3]
A colleague suggests that, in his experience, gender has no material impact

on the length of time in hospital after surgery.
(iii) Explain how the researcher could test this suggestion statistically. [2]
Another colleague suggests that the original model is good, but could be
improved by including an additional factor as to whether a patient is married
or not.
(iv) Set out how the researcher could establish whether an additional factor
representing marital status would improve the model. [4]
[Total 13]

In a certain country, the force of mortality, $\mu_x$, in the age range 90-105
years exact is given by:
Age range (years)     $\mu_x$
$90 \le x < 95$       0.10
$95 \le x < 100$      0.15
$100 \le x < 105$     0.20
The head of state sends a congratulatory card on a citizen’s 100th birthday

and again on reaching age 105.
Derive the probability that a person aged exactly 93 WILL receive a

congratulatory card for reaching age 100 but NOT receive a second
congratulatory card for reaching age 105. [3]

(i) Describe the essential feature of a proportional hazards model. [2]
A study was made of the impact of drinking beer on men aged 60 years and
over. A sample of men was followed from their 60th birthdays until they died,
or left the study for other reasons. The baseline hazard of death, $\mu$, was
assumed to be constant, and a proportional hazards model was estimated with
a single covariate: the average daily beer intake in standard-sized glasses
consumed, $x$. The equation of the model is:
$$h(t) = \mu \exp(\beta x)$$
where $h(t)$ is the hazard of death at age $60 + t$.
The estimated value of $\mu$ is 0.03, and the estimated value of $\beta$ is 0.2.
(ii) Explain how $\mu$ and $\beta$ should be interpreted, in the context of this model.
[2]
(iii) Calculate the estimated hazard of death of a man aged exactly 62 years
who drinks two glasses of beer a day. [1]
A man is aged exactly 60 years and drinks three glasses of beer a day.
(iv) (a) Calculate the estimated probability that this man will still be alive in
10 years’ time.
(b) Calculate the expectation of life at age 60 years for this man. [2]
Another man is aged exactly 60 years. He drinks beer only in his local bar.
He drinks all the beer he buys and is expected to continue drinking the same
amount of beer every day until he dies. The owner of the bar is interested in
selling as much beer as possible.
(v) Determine the average number of glasses of beer a day the owner must
sell the man in order to maximise the total amount of beer the man buys
over his remaining lifetime. [4]
[Total 11]

A careful shopkeeper takes delivery of a batch of 20 packets of cheese. Every

morning at 8am precisely she checks to see if any of the cheese has gone
mouldy and throws away any mouldy packets.
As she runs a high quality establishment, she has lots of customers and some
of the cheese is sold. After ten days she decides the cheese will be too old to
sell and throws out the remaining packets.
A curious customer observes that the shopkeeper has created an

observational plan for calculating the hazard of cheese going mouldy.
(i) State, with reasons, THREE types of censoring present in this situation. [3]
(ii) Assess, for EACH type of censoring listed in your answer to part (i),
whether a change to the observational plan could be made which would
remove that type of censoring. [3]
The shopkeeper made notes at 8am each day as follows:
Day Shopkeeper’s notes

1 Sold three packets already
2 Sold one more packet
3 One went mouldy
4 Two more mouldy ones, I hope my fridge is cold enough
5 Seems OK, nothing to report
6 Sold four more – all to one customer!
7 Nothing to report
8 Another two mouldy ones this morning
9 Sold two more
10 Three more mouldy ones – I’ll throw the rest out
(iii) Calculate the Kaplan-Meier estimate of the survival function for cheese
staying free from mould. [6]
[Total 12]

A pharmaceutical company is undertaking trials on a new drug which, it claims,

cures a particularly uncomfortable but not life threatening condition. It has
conducted extensive testing of the drug on a large group of people suffering
from the condition and has noticed that the drug is much more effective in
some groups of patients than others. It has fitted a Cox regression for the
hazard of symptoms disappearing $h(t)$ with three parameters:
$$h(t) = h_0(t) \exp(S \beta_S + A \beta_A + G \beta_G)$$
where $\beta_S$, $\beta_A$ and $\beta_G$ are parameters and:
 S represents the sex of the patient and takes a value of 1 if the patient
is female, 0 if male
 A represents the age, in years minus 20, of the patient when the drug
was administered
 G takes the value 1 if the patient attended a gym, 0 otherwise.
The company has discovered the following, where the age given is the age
when the drug was administered:
 a 25 year old female who attended a gym had a hazard of symptoms
disappearing equal to twice that of a male of the same age who did not
attend a gym
 a 45 year old male who did not attend a gym had a hazard of symptoms
disappearing half that of a 43 year old male who attended a gym
 a 32 year old female who attended a gym had a hazard of symptoms
disappearing 60% greater than that of a 45 year old female who did not
attend a gym.
(i) Calculate the values of the parameters $\beta_S$, $\beta_A$ and $\beta_G$. [5]
(ii) Determine for which group of people the drug is most effective. [3]
The probability that a woman who attended a gym and was aged 38 years
when she was given the drug still had symptoms of the condition after 28 days
was found to be 0.75.

(iii) Calculate the probability of still having symptoms after 28 days for a male
aged 26 years when given the drug who did not attend a gym. [4]
[Total 12]
(i) Write down the formulae for the Kaplan-Meier estimator $\hat{S}(t)$ and
Nelson-Aalen estimator $\tilde{S}(t)$ of survival in the presence of a stated hazard,
defining all terms used. [2]
The following graph shows the functions:
$y = 1 - x$ and $y = e^{-x}$ over the range $0 \le x \le 1$.
(ii) Demonstrate that the Nelson-Aalen estimator is never lower than the
Kaplan-Meier estimator. [2]
A trial is conducted amongst 20 patients who have suffered from eczema but
are in remission (that is, they are clear of the condition). The trial is to assess
whether continuing with periodic doses of a certain steroid cream in remission
reduces the rate at which eczema recurs. Patients are invited to tests every 3
months for a period of up to 5 years from when first declared to be in
remission.

(iii) Describe THREE types of censoring present in the investigation. [3]
The data for the trial are subdivided into a group who continued to receive the
steroid cream, and a control group who did not receive the steroid cream. The
data for the patients in the trial showing the quarterly test at which eczema
recurred, or censoring occurred, are as follows (an * indicates a patient who
was censored):
For group receiving steroid cream: 3, 5, 6*, 7*, 10, 10, 12*, 14*, 18, 19*
For control group: 6, 8, 8, 10*, 11*, 12*, 14, 15*, 18, 18
(iv) Calculate the Kaplan-Meier estimates of the survival function for

remaining clear of eczema for:
(a) the group who continued to receive the steroid cream
(b) the control group. [8]
(v) (a) Recommend, without performing any calculations, a method of

establishing whether the hazard of eczema returning is statistically
lower for those continuing to receive the steroid cream.
(b) Comment on the chance of being able to conclude from the trial data
that continuing to receive the steroid cream reduces the risk of
recurrence of eczema. [3]
[Total 18]

SOLUTIONS TO PAST EXAM QUESTIONS
The solutions presented here are just outline solutions for you to use to
check your answers. See ASET for full solutions.
(i) Expression for the Cox proportional hazards model
The Cox proportional hazards model is of the form:
$$\lambda(t, Z) = \lambda_0(t) \, e^{\beta Z^T}$$
where:
• $t$ is the time in years since taking up an instrument
• $\lambda_0(t)$ is the baseline hazard at time $t$
• $Z = (Z_1, Z_2, Z_3, Z_4)$ is a vector of covariates such that:
$$Z_1 = \begin{cases} 1 & \text{if instrument played is violin} \\ 0 & \text{otherwise} \end{cases}$$
$$Z_2 = \begin{cases} 1 & \text{if instrument played is trumpet} \\ 0 & \text{otherwise} \end{cases}$$
$$Z_3 = \begin{cases} 1 & \text{if tuition method is new} \\ 0 & \text{otherwise} \end{cases}$$
$$Z_4 = \begin{cases} 1 & \text{if child is male} \\ 0 & \text{otherwise} \end{cases}$$
• $\beta = (\beta_1, \beta_2, \beta_3, \beta_4)$ where $\beta_1, \ldots, \beta_4$ are the parameters of the model
corresponding to the covariates defined above.

(ii) Regression parameters for the fitted model
The vector of regression parameters for the fitted model is:
(0.07, 0.14, -0.05, 0.02)

These values are the mid-points of the confidence intervals.
(iii) Baseline group
The class of children to which the baseline hazard applies is girls who learn
the piano using the traditional method.
(iv) Has the new tuition method improved persistency rates?
The point estimate of $\beta_3$ is -0.05. Since this is negative, it suggests that the
new tuition method reduces the hazard rate, ie it improves the chances of
children continuing to play their instruments.
However, when we look at the 95% confidence interval for $\beta_3$, we see that it
contains 0. So $\beta_3$ does not appear to be significantly different from 0. This
implies that the difference made by the new tuition method is not significant.
It may also be worth looking at the breakdown by sex and instrument played to
see if the new method makes a significant difference for a particular sex or a
particular instrument.
(v) Probability
The probability that a girl, taught using the traditional method, will still be
playing the trumpet after 4 years is:
$$\exp\left(-\int_0^4 \lambda_0(s) \, e^{\beta_2} \, ds\right) = 0.7$$
Using the model:
$$\left[\exp\left(-\int_0^4 \lambda_0(s) \, ds\right)\right]^{e^{0.14}} = 0.7$$
$$\Rightarrow \exp\left(-\int_0^4 \lambda_0(s) \, ds\right) = (0.7)^{e^{-0.14}} = 0.73339$$

So the probability that a boy, taught using the new method, will still be playing
the piano after 4 years is:
$$\exp\left(-\int_0^4 \lambda_0(s) \, e^{\beta_3 + \beta_4} \, ds\right) = \left[\exp\left(-\int_0^4 \lambda_0(s) \, ds\right)\right]^{e^{-0.05 + 0.02}}$$
$$= (0.73339)^{e^{-0.03}} = 0.74014$$
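The two steps above both use the fact that, under a proportional hazards model, a survival probability for one covariate profile can be converted to another by raising it to the power of the hazard ratio. A minimal Python sketch of that conversion, using the fitted parameters from part (ii); the function name `rescale_survival` and its argument layout are illustrative assumptions rather than anything defined in the course material:

```python
import math

def rescale_survival(known_survival, known_linear_predictor, target_linear_predictor):
    """Convert a survival probability between covariate profiles.

    Under proportional hazards, S(t | Z) = S0(t) ** exp(beta . Z), so
    S_target = S_known ** exp(target - known).
    """
    return known_survival ** math.exp(target_linear_predictor - known_linear_predictor)

# Known: girl, traditional method, trumpet -> linear predictor 0.14, survival 0.7.
baseline_survival = rescale_survival(0.7, 0.14, 0.0)

# Target: boy, new method, piano -> linear predictor -0.05 + 0.02.
boy_new_piano = rescale_survival(0.7, 0.14, -0.05 + 0.02)

print(round(baseline_survival, 5), round(boy_new_piano, 5))
```

Going directly from the known profile to the target profile avoids carrying the rounded baseline figure through the second step.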
(i)(a) Type I censoring
Type I censoring occurs at the end of the investigation, when the remaining
patients are censored. It is known in advance that any remaining patients will
be censored after 30 days.
(i)(b) Type II censoring
Type II censoring does not occur here since the investigation does not end
once a predetermined number of patients have died.
(i)(c) Random censoring
Random censoring occurs since it is not known in advance when the patients
will leave hospital. The duration at which a patient leaves hospital can be
considered to be a random variable.
(ii) Whether the censoring in this investigation is likely to be informative
Informative censoring is likely to be present here since a patient who has left
hospital is likely to be in better health, and is therefore less likely to die, than
those that remain.

(iii) Kaplan-Meier estimate of the survival function
The Kaplan-Meier estimate of the survival function is:
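$$\hat{S}(t) = \begin{cases}
1 & \text{for } 0 \le t < 2 \\
\frac{9}{10} & \text{for } 2 \le t < 6 \\
\frac{9}{10} \times \frac{8}{9} = \frac{4}{5} & \text{for } 6 \le t < 12 \\
\frac{9}{10} \times \frac{8}{9} \times \frac{7}{8} = \frac{7}{10} & \text{for } 12 \le t < 27 \\
\frac{9}{10} \times \frac{8}{9} \times \frac{7}{8} \times \frac{4}{5} = \frac{14}{25} & \text{for } 27 \le t \le 30
\end{cases}$$
So the Kaplan-Meier estimate of the survival function at duration 28 days
is $\frac{14}{25}$, or 0.56.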
(iv) Kaplan-Meier estimate of the hazard of death at duration 8
Since no lives are observed to die at duration 8, the Kaplan-Meier estimate of

the hazard of death at that time is 0.
(v) Sketch of the Kaplan-Meier estimate of the survival function
The graph of the Kaplan-Meier estimate of the survival function is shown

below:
[Figure: step plot of the Kaplan-Meier estimate S(t) (vertical axis, 0 to 1) against time, t (horizontal axis, 0 to 30 days).]

(i) Hazard function in terms of probabilities
$$\lambda_x = \lim_{h \to 0^+} \frac{1}{h} \, \frac{P(x < X \le x + h)}{P(X > x)}$$
(ii) Nelson-Aalen estimate of survival function
For the Nelson-Aalen estimate, first we estimate the cumulative hazard

function:
$$\hat{\Lambda}_t = \sum_{t_j \le t} \hat{\lambda}_j$$
where:
$$\hat{\lambda}_j = \frac{d_j}{n_j}$$
$d_j$ = number of reconvictions at time $t_j$
$n_j$ = number of criminals at large at time $t_j$ (prior to any reconvictions
at time $t_j$)
$t_j$ = $j$th time at which a reconviction occurs.
We then calculate an estimate of the survival function from:
$$\hat{S}(t) = e^{-\hat{\Lambda}_t}$$
The table below shows the data in terms of the required values, and the
calculation of the cumulative hazard function at each $t_j$.
$j$   $t_j$   $n_j$   $d_j$   $\hat{\lambda}_j$   $\hat{\Lambda}_{t_j}$
1     3       95      4       0.042105            0.042105
2     4       90      3       0.033333            0.075439
3     5       85      5       0.058824            0.134262
The value of $\hat{\Lambda}_t$ is zero for $0 \le t < 3$, and equal to 0.134262 for $5 \le t < 7$.

The values of $\hat{S}(t)$ are therefore:
Range of $t$       $\hat{S}(t)$
$0 \le t < 3$      1
$3 \le t < 4$      0.95877
$4 \le t < 5$      0.92734
$5 \le t < 7$      0.87436
$t \ge 7$          $\le 0.87436$
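The tabulated calculation can be reproduced with a short loop over the risk table. The following is a minimal sketch in Python, using the $(t_j, n_j, d_j)$ values from the table in this solution; the function name `nelson_aalen` is an illustrative assumption:

```python
import math

def nelson_aalen(risk_table):
    """Return [(t_j, cumulative hazard, survival estimate)] from (t_j, n_j, d_j) rows."""
    cumulative_hazard = 0.0
    results = []
    for t_j, n_j, d_j in risk_table:
        cumulative_hazard += d_j / n_j            # add lambda_hat_j = d_j / n_j
        results.append((t_j, cumulative_hazard, math.exp(-cumulative_hazard)))
    return results

# (t_j, n_j, d_j) rows taken from the table above.
reconvictions = [(3, 95, 4), (4, 90, 3), (5, 85, 5)]
estimates = nelson_aalen(reconvictions)

for t_j, cum_hazard, survival in estimates:
    print(t_j, round(cum_hazard, 6), round(survival, 5))
```

Each output row gives the value of the estimate from time $t_j$ until the next reconviction time.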
(iii) Testing whether the reconviction rate has declined
The probability of reconviction in the first 6 months is denoted by $F(6)$, and
is equal to $1 - S(6)$.
We will be testing the following hypotheses:
$H_0: F(6) = 0.2$
$H_1: F(6) < 0.2$
This means we will need a 1-sided confidence interval of the form:
$$\left[0, F_u(6)\right]$$
where $F_u(6)$ is the 95th percentile of the probability distribution of $\tilde{F}(6)$,
which is the estimator of $F(6)$.
First we’ll need the variance of $\tilde{\Lambda}_6$. This is (from the Tables page 33):
$$\text{var}\left(\tilde{\Lambda}_6\right) = \sum_{t_j \le 6} \frac{d_j (n_j - d_j)}{n_j^3} = \frac{4 \times 91}{95^3} + \frac{3 \times 87}{90^3} + \frac{5 \times 80}{85^3} = 0.00143391$$
The standard error of $\tilde{\Lambda}_6$ is then:
$$SE = \sqrt{0.00143391} = 0.037867$$
Now, assuming $\tilde{\Lambda}_6$ is normally distributed (which is true asymptotically):
$$F_u(6) = 1 - \exp\left[-\left(\hat{\Lambda}_6 + 1.6449 \times SE\right)\right]$$
$$= 1 - \exp\left[-(0.134262 + 1.6449 \times 0.037867)\right]$$
$$= 1 - e^{-0.196549}$$
$$= 0.1784$$
The required 95% confidence interval for $F(6)$ is therefore $[0, 0.1784]$.
As this excludes the value 0.2, then we can reject H0 , at least at the 5%
level, and so conclude that there is sufficient evidence to support the
hypothesis that 6-month reconviction rates have declined since the previous
investigation.
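The variance, standard error and upper confidence limit used in this test can be checked numerically. A minimal Python sketch follows, assuming the same risk table as in part (ii) and the normal percentage point 1.6449 quoted in the solution; the variable and function names are illustrative only:

```python
import math

def nelson_aalen_variance(risk_table):
    """Approximate variance of the integrated hazard: sum of d_j (n_j - d_j) / n_j^3."""
    return sum(d_j * (n_j - d_j) / n_j**3 for _, n_j, d_j in risk_table)

# (t_j, n_j, d_j) rows for reconvictions up to 6 months.
reconvictions = [(3, 95, 4), (4, 90, 3), (5, 85, 5)]

cum_hazard = sum(d_j / n_j for _, n_j, d_j in reconvictions)
variance = nelson_aalen_variance(reconvictions)
std_error = math.sqrt(variance)

z_upper = 1.6449  # upper 5% point of the standard normal, as used in the solution
upper_bound = 1 - math.exp(-(cum_hazard + z_upper * std_error))

# Reject H0: F(6) = 0.2 in favour of F(6) < 0.2 if 0.2 lies above the upper bound.
reject_h0 = upper_bound < 0.2
print(round(variance, 8), round(std_error, 6), round(upper_bound, 4), reject_h0)
```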
We have removed the parts of this question that refer to the Balducci
assumption as this assumption is beyond the scope of Subject CS2.
(i) $q_x$ and $m_x$, and their relationship
$q_x$ is the initial rate of mortality. It represents the probability that a life,
currently aged exactly $x$ years, dies within the next year.
$m_x$ is the central rate of mortality. It is the probability of dying between
exact ages $x$ and $x + 1$ per person-year lived between exact ages $x$ and
$x + 1$; the denominator $\int_0^1 {}_tp_x \, dt$ is interpreted as the expected amount of
time spent alive between ages $x$ and $x + 1$ by a life alive at age $x$ and the
numerator is the probability of that life dying between exact ages $x$ and
$x + 1$.

The rates are related by:
$$m_x = \frac{q_x}{\int_0^1 {}_tp_x \, dt}$$
where $\int_0^1 {}_tp_x \, dt$ is the expected time spent alive in the year beginning at
exact age $x$.
(ii) Formulae for ${}_tq_x$
(a) Uniform distribution of deaths
$${}_tq_x = t \, q_x$$
(b) Constant force of mortality
$${}_tq_x = 1 - {}_tp_x = 1 - (p_x)^t = 1 - (1 - q_x)^t$$
(iii) Calculation of $m_x$
(a) Uniform distribution of deaths
Now:
$$\int_0^1 {}_tp_x \, dt = \int_0^1 (1 - {}_tq_x) \, dt = \int_0^1 (1 - t \, q_x) \, dt = \left[t - \tfrac{1}{2} t^2 q_x\right]_0^1 = 1 - \tfrac{1}{2} q_x$$
Therefore:
$$m_x = \frac{q_x}{1 - \frac{1}{2} q_x} = \frac{0.1}{1 - 0.05} = 0.10526$$

(b) Constant force of mortality
$$\int_0^1 {}_tp_x \, dt = \int_0^1 (p_x)^t \, dt = \int_0^1 (0.9)^t \, dt$$
$$= \int_0^1 e^{(\ln 0.9) t} \, dt = \frac{1}{\ln 0.9} \left[e^{(\ln 0.9) t}\right]_0^1$$
$$= \frac{0.9 - 1}{\ln 0.9} = 0.949122$$
Therefore:
$$m_x = \frac{0.1}{0.949122} = 0.10536$$
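Both central rates can be reproduced from the closed-form integrals derived in (a) and (b). The sketch below is a minimal Python check with $q_x = 0.1$; the function names are illustrative assumptions:

```python
import math

def central_rate_udd(q_x):
    """m_x when deaths are uniformly distributed: the integral of tp_x is 1 - q_x / 2."""
    return q_x / (1 - q_x / 2)

def central_rate_constant_force(q_x):
    """m_x under a constant force: the integral of (p_x)^t over [0, 1] is (p_x - 1) / ln(p_x)."""
    p_x = 1 - q_x
    expected_time_alive = (p_x - 1) / math.log(p_x)
    return q_x / expected_time_alive

m_udd = central_rate_udd(0.1)
m_cfm = central_rate_constant_force(0.1)
print(round(m_udd, 5), round(m_cfm, 5))
```

Under the constant force assumption the result coincides with the force itself, since $m_x = -\ln(1 - q_x) = \mu$.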
(iv) Comment on the differences
Rewriting the formula for $m_x$:
$$m_x = \frac{\int_0^1 {}_tp_x \, \mu_{x+t} \, dt}{\int_0^1 {}_tp_x \, dt}$$
we can see that it is a weighted average of $\mu_{x+t}$, the force of mortality at
each age between $x$ and $x + 1$, with the probability of survival (${}_tp_x$) as
weights.
The calculated value of $m_x$ will therefore be affected by the way in which
$\mu_{x+t}$ is assumed to vary over the age range.
Assuming $\mu_{x+t}$ is constant gives an $m_x$ value of 0.10536.
The UDD assumption produces a lower value, of 0.10526. UDD implies that
the force of mortality is rising over the year of age (in order to produce a
constant rate of the number dying as the number of survivors decreases
over the age range). As the weights (${}_tp_x$) are higher at the start of the year
and decrease over the year, the greatest weights will be applied to $\mu_{x+t}$
towards the start of the year, where mortality rates are lowest. This will tend
to reduce the weighted average mortality rate compared to (b).

At most points in the human lifespan, the force of mortality is rising with
increasing age. This means that UDD is likely to be the more realistic of the
two assumptions.
Both assumptions produce the same answer to the third significant figure (in
this case), so that for most practical purposes either assumption produces
acceptable results.
(i)(a) Uniform distribution of deaths assumption
The total number of deaths between the ages of 30 and 40 is:
$$98{,}617 - 97{,}952 = 665$$
If deaths occur uniformly between the ages of 30 and 40, we would expect to
get 665 / 2 = 332.5 deaths between the ages of 30 and 35 (and the other
332.5 deaths between the ages of 35 and 40). So, under this assumption,
the probability of dying between the ages of 30 and 35 is:
$${}_5q_{30} = \frac{332.5}{98{,}617} = 0.003372$$
(i)(b) Constant force of mortality assumption
Assuming a constant force of mortality of $\mu$ between the ages of 30 and 40,
we have:
$${}_{10}p_{30} = e^{-10\mu}$$
$${}_5p_{30} = e^{-5\mu} = \left({}_{10}p_{30}\right)^{1/2}$$
Using the figures given in the question:
$${}_{10}p_{30} = \frac{97{,}952}{98{,}617} = 0.993257$$

So:
$${}_5p_{30} = \sqrt{\frac{97{,}952}{98{,}617}} = 0.996623 \quad \text{and} \quad {}_5q_{30} = 1 - 0.996623 = 0.003377$$
Alternatively, you could use the equation:
$${}_{10}p_{30} = 0.993257 = e^{-10\mu}$$
to calculate $\mu = 0.0006766$ and use this value in the formula:
$${}_5q_{30} = 1 - e^{-5\mu}$$
(ii) Number of survivors at exact age 35
Under the UDD assumption, the number of survivors at exact age 35 is:
$$98{,}617 - 332.5 = 98{,}284.5$$
Under the constant force assumption, the number of survivors is:
$$98{,}617 \times {}_5p_{30} = 98{,}617 \sqrt{\frac{97{,}952}{98{,}617}} = 98{,}283.9$$
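The survivor counts in part (ii), and the probabilities in part (i), can be checked with two small interpolation functions. This is a minimal Python sketch using the two life table values quoted in the solution; the function names and the `fraction` argument (the proportion of the 10-year interval that has elapsed) are illustrative assumptions:

```python
l_30, l_40 = 98617, 97952  # numbers alive at exact ages 30 and 40

def survivors_udd(l_start, l_end, fraction):
    """Survivors part-way through the interval if deaths occur uniformly (linear interpolation)."""
    return l_start - fraction * (l_start - l_end)

def survivors_constant_force(l_start, l_end, fraction):
    """Survivors part-way through the interval under a constant force (geometric interpolation)."""
    return l_start * (l_end / l_start) ** fraction

l_35_udd = survivors_udd(l_30, l_40, 0.5)
l_35_cfm = survivors_constant_force(l_30, l_40, 0.5)

# Probability of dying between ages 30 and 35 under each assumption.
q_udd = 1 - l_35_udd / l_30
q_cfm = 1 - l_35_cfm / l_30
print(l_35_udd, round(l_35_cfm, 1), round(q_udd, 6), round(q_cfm, 6))
```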
(iii) Comment
The two assumptions give very similar answers.
The actual number of survivors is 98,359 and this is higher than the figures
given by both the UDD and the constant force assumptions. This means
that there were more deaths between the ages of 35 and 40 than there were
between the ages of 30 and 35. In other words mortality is higher between
35 and 40 than it is between 30 and 35, which suggests that the force of
mortality is increasing between 30 and 40.
The UDD assumption implies that the force of mortality is increasing

between the ages of 30 and 40 (whereas the constant force assumption
obviously says that the force of mortality is the same at all ages between 30
and 40). However, it appears that the actual force of mortality is increasing
faster than it would under the UDD assumption. So neither of the two
assumptions is very appropriate.

(i) Proof
Gompertz’ law states that:
$$\mu_x = Bc^x$$
So, under Gompertz’ law:
$$\int_0^t \mu_{x+s} \, ds = \int_0^t Bc^{x+s} \, ds = Bc^x \int_0^t c^s \, ds$$
$$= Bc^x \int_0^t e^{s \ln c} \, ds = \frac{Bc^x}{\ln c} \left[e^{s \ln c}\right]_0^t$$
$$= \frac{Bc^x}{\ln c} \left[c^s\right]_0^t = \frac{Bc^x (c^t - 1)}{\ln c}$$
Hence:
$${}_tp_x = \exp\left(-\int_0^t \mu_{x+s} \, ds\right) = \exp\left[\frac{-Bc^x (c^t - 1)}{\ln c}\right] = \left[\exp\left(\frac{-B}{\ln c}\right)\right]^{c^x (c^t - 1)}$$
(ii) Values of B and c
We have:
$${}_1p_{50} = \exp\left(\frac{-Bc^{50} (c - 1)}{\ln c}\right) = 0.995$$
and:
$${}_2p_{50} = \exp\left(\frac{-Bc^{50} (c^2 - 1)}{\ln c}\right) = 0.989$$

Taking logs gives:
$$\frac{Bc^{50} (c - 1)}{\ln c} = -\ln 0.995 \qquad (1)$$
and:
$$\frac{Bc^{50} (c^2 - 1)}{\ln c} = -\ln 0.989 \qquad (2)$$
Dividing equation (2) by equation (1) gives:
$$\frac{c^2 - 1}{c - 1} = \frac{\ln 0.989}{\ln 0.995}$$
Since $c^2 - 1 = (c - 1)(c + 1)$, it follows that:
$$c + 1 = \frac{\ln 0.989}{\ln 0.995}$$
So $c = 1.20665$.
Substituting this into equation (1) gives:
$$B = \frac{-\ln 0.995 \times \ln 1.20665}{1.20665^{50} (1.20665 - 1)} = 3.797 \times 10^{-7}$$
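The algebra above translates directly into code, which also makes it easy to confirm that the fitted parameters reproduce the two survival probabilities. A minimal Python sketch, in which the function names `fit_gompertz` and `gompertz_survival` are illustrative assumptions:

```python
import math

def fit_gompertz(x, p_one_year, p_two_year):
    """Solve for (B, c) from the 1-year and 2-year survival probabilities from age x."""
    # Dividing equation (2) by equation (1) gives c + 1 = ln(2p_x) / ln(1p_x).
    c = math.log(p_two_year) / math.log(p_one_year) - 1
    # Rearranging equation (1) for B.
    B = -math.log(p_one_year) * math.log(c) / (c**x * (c - 1))
    return B, c

def gompertz_survival(B, c, x, t):
    """tp_x under Gompertz' law: exp(-B c^x (c^t - 1) / ln c)."""
    return math.exp(-B * c**x * (c**t - 1) / math.log(c))

B, c = fit_gompertz(50, 0.995, 0.989)
print(c, B)
print(gompertz_survival(B, c, 50, 1), gompertz_survival(B, c, 50, 2))
```

Because the code keeps $c$ at full precision rather than rounding it to 1.20665, the value of $B$ may differ from the hand calculation in the last quoted digit.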
(iii) Comment
In part (ii) we solved the two equations analytically. However, we would

normally use a graduation technique to fit a formula to many more than two
crude mortality rates. The parameters could be estimated using maximum
likelihood or least squares methodology. The calculations would be much
more complicated than those in (ii) and would be carried out on a computer.
Also, in part (ii), we have used Gompertz’ law for mortality. However, in
general, we may use another member of the Gompertz-Makeham family.
The general formula for these functions is:
$$\mu_x = \text{polynomial}_1 + \exp(\text{polynomial}_2)$$

Estimating the parameters in a formula of this kind is more complicated than

estimating the parameters in part (ii).
(i)(a) Definition of $S_x(t)$
The definition is:
$$S_x(t) = P(T_x > t)$$
(i)(b) Derivation
We could say:
$$S_x(t) = {}_tp_x = \frac{{}_{x+t}p_0}{{}_xp_0} = \frac{S(x + t)}{S(x)}$$
(ii) Definition of the force of mortality
The definition of $\mu_{x+t}$ in terms of $T_x$ is:
$$\mu_{x+t} = \lim_{h \to 0} \frac{P(T_x \le t + h \mid T_x > t)}{h}$$
(iii) Expression for the Weibull force of mortality
Since:
$$\mu_{x+t} = -\frac{\partial}{\partial t} \ln {}_tp_x = -\frac{\partial}{\partial t} \ln S_x(t)$$
for the Weibull distribution, we have:
$$\mu_{x+t} = \frac{\partial}{\partial t} (\lambda t)^\beta = \beta \lambda^\beta t^{\beta - 1}$$

(iv) Graph of the Weibull force of mortality
When $\lambda = 1$ and $\beta = 0.5$, the force of mortality is given by $\mu_{x+t} = 0.5 \, t^{-0.5}$.
When $\lambda = 1$ and $\beta = 1$, the force of mortality is given by $\mu_{x+t} = 1$ for all $t$.
When $\lambda = 1$ and $\beta = 1.5$, the force of mortality is given by $\mu_{x+t} = 1.5 \, t^{0.5}$.
The diagram below shows the three force of mortality functions on the same
graph.
[Figure: force of mortality (vertical axis) against duration, t (horizontal axis, 0 to 5), with one curve for each of $\lambda = 1, \beta = 1.5$; $\lambda = 1, \beta = 1$; and $\lambda = 1, \beta = 0.5$.]
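The shape of the three curves can be seen by evaluating the Weibull force of mortality at a few durations. The following is a minimal Python sketch, assuming the survival function $S_x(t) = \exp(-(\lambda t)^\beta)$ implied by the derivative in part (iii); the function name and the chosen durations are illustrative:

```python
def weibull_hazard(t, lam, beta):
    """Force of mortality beta * lam^beta * t^(beta - 1), for t > 0."""
    return beta * lam**beta * t ** (beta - 1)

durations = [0.25, 1, 4]
decreasing = [weibull_hazard(t, 1, 0.5) for t in durations]   # beta < 1
constant = [weibull_hazard(t, 1, 1) for t in durations]       # beta = 1
increasing = [weibull_hazard(t, 1, 1.5) for t in durations]   # beta > 1

print(decreasing, constant, increasing)
```

With $\lambda = 1$ all three curves pass through the point where $t = 1$ and the hazard equals $\beta$; the curve falls for $\beta < 1$, is flat for $\beta = 1$ and rises for $\beta > 1$.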

UDD assumption
If deaths are uniformly distributed between the ages of x and y , then the
number of lives in the population decreases linearly between the ages of x
and y .
The survival function is a linearly decreasing function of t .
Constant force of mortality assumption
This assumption says that $\mu_{x+t}$ is equal to some constant $\mu$ for all $t$
between 0 and $y - x$.
In general:
$$S_x(t) = {}_tp_x = \exp\left(-\int_0^t \mu_{x+s} \, ds\right)$$
Under the constant force assumption, this simplifies to:
$$S_x(t) = e^{-t\mu}$$
This is an exponentially decreasing function of $t$.
(i) Types of censoring present
Type II censoring is present because the observation is stopped after a

predetermined number of batteries have failed (8 in this case). Type II
censoring is a special case of right censoring.
Random censoring is present because we didn’t know in advance that a

battery was going to explode, and therefore be censored, at time 110.
Non-informative censoring is also present because the battery that exploded

at time 110 gives us no information about the future lifetime of the other
batteries under investigation.

(ii) Kaplan-Meier estimate of the survival function
Suppose that time is measured in days. We have:
$j$   Failure time, $t_j$   $n_j$ = number at risk just before time $t_j$   $d_j$ = number of failures at time $t_j$   $(n_j - d_j) / n_j$
1     97                    12                                              2                                          10/12
2     120                   9                                               3                                          6/9
3     141                   6                                               2                                          4/6
4     150                   4                                               1                                          3/4
The Kaplan-Meier estimate of the survival function is:
$$\hat{S}(t) = \prod_{t_j \le t} \left(\frac{n_j - d_j}{n_j}\right) = \begin{cases}
1 & \text{for } 0 \le t < 97 \\
\frac{5}{6} & \text{for } 97 \le t < 120 \\
\frac{5}{9} & \text{for } 120 \le t < 141 \\
\frac{10}{27} & \text{for } 141 \le t < 150 \\
\frac{5}{18} & \text{for } t = 150
\end{cases}$$
The survival function is not defined at times greater than the time at which
the last censoring event took place, ie time 150 days in this case.
(iii) Explain why the figure is consistent with theft of the battery
The estimate of $S(150)$ calculated in part (ii) is $\frac{5}{18}$, or 0.2778, which is not
the same as the figure of 0.2727 in the sub-contractor’s report.

However, if the sub-contractor had stolen one of the batteries at the start of
the investigation, we would have:
$j$   Failure time, $t_j$   $n_j$ = number at risk just before time $t_j$   $d_j$ = number of failures at time $t_j$   $(n_j - d_j) / n_j$
1     97                    11                                              2                                          9/11
2     120                   9                                               3                                          6/9
3     141                   6                                               2                                          4/6
4     150                   4                                               1                                          3/4
So the Kaplan-Meier estimate of the survival function at time 150 would be:
$$\hat{S}(150) = \frac{9}{11} \times \frac{6}{9} \times \frac{4}{6} \times \frac{3}{4} = \frac{3}{11} = 0.2727$$
which is the same as the figure given in the report.
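The comparison between the two scenarios can be reproduced exactly using rational arithmetic. The sketch below is a minimal Python illustration built from the two risk tables in this solution; the function name `kaplan_meier` and the variable names are assumptions made for the example:

```python
from fractions import Fraction

def kaplan_meier(risk_table):
    """Return [(t_j, S_hat(t_j))] as exact fractions from (t_j, n_j, d_j) rows."""
    survival = Fraction(1)
    steps = []
    for t_j, n_j, d_j in risk_table:
        survival *= Fraction(n_j - d_j, n_j)   # multiply by (n_j - d_j) / n_j
        steps.append((t_j, survival))
    return steps

# (t_j, n_j, d_j) rows: all 12 batteries at risk at time 97 ...
full_sample = [(97, 12, 2), (120, 9, 3), (141, 6, 2), (150, 4, 1)]
# ... versus only 11 at risk if one battery was removed at the start.
one_stolen = [(97, 11, 2), (120, 9, 3), (141, 6, 2), (150, 4, 1)]

final_full = kaplan_meier(full_sample)[-1][1]
final_stolen = kaplan_meier(one_stolen)[-1][1]
print(final_full, float(final_full))
print(final_stolen, float(final_stolen))
```

Only the first factor changes between the two tables, which is why a single missing battery at the outset is enough to move the estimate from 5/18 to 3/11.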
(i) Proportional hazards model
A proportional hazards model is one that allows the effect of covariates on

survival rates to be analysed.
The hazard function is assumed to be the product of two terms. One of

these terms depends only on age or duration, and the other depends only on
the covariates. An example of a proportional hazards model is the Cox
regression model, which is of the form:
$$\lambda(t, Z) = \lambda_0(t) \exp\left(\beta Z^T\right)$$
where $\lambda_0(t)$ is the baseline hazard at age/duration/time $t$, $\beta$ is a vector of
parameters and $Z$ is a vector of covariates.

In a proportional hazards model, the hazards of different lives with the same
age/duration/time are in the same proportion at all times, eg in the Cox
model:
$$\frac{\lambda(t, Z_1)}{\lambda(t, Z_2)} = \frac{\lambda_0(t) \exp\left(\beta Z_1^T\right)}{\lambda_0(t) \exp\left(\beta Z_2^T\right)} = \frac{\exp\left(\beta Z_1^T\right)}{\exp\left(\beta Z_2^T\right)}$$
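The cancellation of the baseline hazard can be illustrated numerically. In the minimal Python sketch below, the baseline hazard, the parameter vector and the two covariate vectors are all hypothetical values chosen purely for illustration; only the structure of the model comes from the text:

```python
import math

def cox_hazard(baseline_hazard, beta, covariates, t):
    """Hazard at time t: baseline hazard times exp(beta . Z)."""
    linear_predictor = sum(b * z for b, z in zip(beta, covariates))
    return baseline_hazard(t) * math.exp(linear_predictor)

def hazard_ratio(beta, covariates_1, covariates_2):
    """Ratio of hazards for two lives; the baseline hazard cancels."""
    return math.exp(sum(b * (z1 - z2) for b, z1, z2 in zip(beta, covariates_1, covariates_2)))

# Hypothetical inputs (not taken from any question in this booklet).
def baseline(t):
    return 0.01 * (1 + t)

beta = [0.3, -0.2]
z_1, z_2 = [1, 2], [0, 1]

ratios = [cox_hazard(baseline, beta, z_1, t) / cox_hazard(baseline, beta, z_2, t)
          for t in (0.5, 1, 5, 10)]
print(ratios, hazard_ratio(beta, z_1, z_2))
```

Whatever time-dependent baseline is supplied, every entry of `ratios` equals the same constant, which is the defining feature of a proportional hazards model.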
(ii) Why the Cox model is a popular model for the analysis of survival data
The Cox model ensures that the hazard is always positive and gives a linear
model for the log-hazard, which is convenient in theory and practice.
In the Cox model, the general shape of the hazard function for all individuals
is determined by the baseline hazard, while the exponential term accounts
for the differences between individuals. So if we are not primarily concerned
with the precise form of the hazard, but with the effects of the covariates, we
can ignore $\lambda_0(t)$ and estimate the parameters from the data irrespective of
the shape of the baseline hazard.
(iii)(a) Equation of the model
The equation of the model is:
$$\lambda(t, Z) = \lambda_0(t) \exp(\beta_A A + \beta_S S + \beta_E E)$$
where $Z = (A, S, E)$ is the vector of covariates, $\lambda(t, Z)$ is the estimated
hazard at time $t$ for an individual with covariate vector $Z$, and $\lambda_0(t)$ is the
baseline hazard at time $t$.
(iii)(b) Baseline group
The baseline hazard applies to the group of lives for whom all the covariates
are 0, ie females who were aged exactly 16 when they started to claim
benefit and had not passed the school leaving examination in mathematics.

(iv) Estimated parameter values
The first bullet point tells us that:
$$\frac{\exp(\beta_A + \beta_S + \beta_E)}{\exp(\beta_S)} = 1.5$$
Hence:
$$\beta_A + \beta_E = \ln 1.5 \qquad (1)$$
The second bullet point tells us that:
$$\frac{\exp(\beta_E)}{\exp(\beta_S)} = 2$$
and hence:
$$\beta_E - \beta_S = \ln 2 \qquad (2)$$
Similarly, the third bullet point tells us that:
$$\frac{\exp(4\beta_A + \beta_E)}{\exp(\beta_S + \beta_E)} = 2$$
and hence:
$$4\beta_A - \beta_S = \ln 2 \qquad (3)$$
Subtracting (2) from (3) gives:
$$4\beta_A - \beta_E = 0$$
So:
$$\beta_E = 4\beta_A$$
Substituting this into (1) gives:
$$5\beta_A = \ln 1.5$$

So the parameter values are:
$$\beta_A = 0.2 \ln 1.5 = 0.08109$$
$$\beta_E = 0.8 \ln 1.5 = 0.32437$$
$$\beta_S = \beta_E - \ln 2 = -0.36878$$
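The arithmetic above can be checked numerically. The following is a minimal Python sketch, assuming only equations (1) to (3) as stated in the solution; the variable names `b_A`, `b_E` and `b_S` are illustrative.

```python
import math

# Equations taken from the worked solution:
#   (1) b_A + b_E     = ln 1.5
#   (2) b_E - b_S     = ln 2
#   (3) 4*b_A - b_S   = ln 2
# (3) - (2) gives b_E = 4*b_A, so (1) becomes 5*b_A = ln 1.5.
b_A = math.log(1.5) / 5
b_E = 4 * b_A
b_S = b_E - math.log(2)

print(f"b_A = {b_A:.5f}, b_E = {b_E:.5f}, b_S = {b_S:.5f}")
```

Substituting the three values back into equation (3) is a quick way to confirm that no sign has been lost.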
The probability density function of the complete future lifetime $T_x$ is:
$$f(t) = {}_t p_x \, \mu_{x+t}, \quad t \ge 0$$
So the mean of the complete future lifetime at age $x$ is:
$$E(T_x) = \int_0^\infty t \, f(t) \, dt = \int_0^\infty t \, {}_t p_x \, \mu_{x+t} \, dt$$
Note that $E(T_x)$ is also sometimes written as $\mathring{e}_x$. An alternative formula for
$\mathring{e}_x$ is:
$$E(T_x) = \int_0^\infty {}_t p_x \, dt$$
The variance of $T_x$ is:
$$\operatorname{var}(T_x) = E\left[T_x^2\right] - \left[E(T_x)\right]^2 = \int_0^\infty t^2 \, {}_t p_x \, \mu_{x+t} \, dt - \left(\int_0^\infty t \, {}_t p_x \, \mu_{x+t} \, dt\right)^2$$

(i) Median for those who qualified
The first row of data tells us that 11 students qualified during the observation
period.
6 8 8 9 9 9 11 11 13 13 13
The median number of sessions for them is the middle (6th) value from this
ordered list, ie 9 sessions.
If we combine the times and arrange them in ascending order (using Q to
denote qualifiers and S to denote students who stopped studying), we obtain
the following timeline, which starts with 23 new students and ends with
7 students remaining:
Sessions   4   5   6   8         9         11        13        14
Events     S   S   Q   Q, Q, S   Q, Q, Q   Q, Q, S   Q, Q, Q   S
To estimate the survival function, we construct the usual table:
$j$   Time of $j$ th decrement, $t_j$   Number of decrements at time $t_j$, $d_j$   Number at risk immediately before time $t_j$, $n_j$
1     6      1     21
2     8      2     20
3     9      3     17
4     11     2     14
5     13     3     11

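The Kaplan-Meier estimator of the survival function at time $t$ is:
$$\hat{S}(t) = \prod_{t_j \le t} \left(1 - \frac{d_j}{n_j}\right)$$
This leads to the following values:
$$0 \le t < 6: \quad \hat{S}(t) = 1$$
$$6 \le t < 8: \quad \hat{S}(t) = \frac{20}{21} = 0.95238$$
$$8 \le t < 9: \quad \hat{S}(t) = \frac{20}{21} \times \frac{18}{20} = 0.85714$$
$$9 \le t < 11: \quad \hat{S}(t) = \frac{20}{21} \times \frac{18}{20} \times \frac{14}{17} = 0.70588$$
$$11 \le t < 13: \quad \hat{S}(t) = \frac{20}{21} \times \frac{18}{20} \times \frac{14}{17} \times \frac{12}{14} = 0.60504$$
$$13 \le t: \quad \hat{S}(t) = \frac{20}{21} \times \frac{18}{20} \times \frac{14}{17} \times \frac{12}{14} \times \frac{8}{11} = 0.44003$$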
(iii) Median
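The median is the smallest time $t$ at which $\hat{S}(t)$ falls to 0.5 or below.
Since $\hat{S}(t) = 0.60504$ just before $t = 13$ and $\hat{S}(t) = 0.44003$ from $t = 13$
onwards, the median is $t = 13$ sessions.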
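The Kaplan-Meier product and the median can be reproduced with a few lines of Python. This is a minimal sketch, assuming the $(t_j, d_j, n_j)$ table from part (ii); the function name `kaplan_meier` is illustrative only.

```python
def kaplan_meier(table):
    """Return [(t_j, S_hat(t_j))] from rows of (t_j, d_j, n_j)."""
    s_hat = 1.0
    steps = []
    for t_j, d_j, n_j in table:
        s_hat *= 1 - d_j / n_j      # multiply in the factor for this decrement time
        steps.append((t_j, s_hat))
    return steps

# (time, number qualifying, number at risk) from the summary table
table = [(6, 1, 21), (8, 2, 20), (9, 3, 17), (11, 2, 14), (13, 3, 11)]
steps = kaplan_meier(table)

# Median: first time at which the estimate is 0.5 or below
median = next(t for t, s in steps if s <= 0.5)

for t, s in steps:
    print(f"t >= {t}: S_hat = {s:.5f}")
print("median =", median)
```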
(iv) Explain the difference
The median value we worked out in part (i) was the median time to qualify,
given that the student qualified during the period. This is not what we would
normally mean by the average time to qualify because this calculation
ignores the 7 students who were still studying at the end of the period.
The median value in part (iii), on the other hand, takes into account the
censored students, ie the students who dropped out and those still studying
at the end of the period. This provides a better estimate of the average time
to qualify.

(i) Hazard function
The general form of the Cox model models $\lambda(t; z_i)$, the hazard function at
time $t$ for individual $i$, as:
$$\lambda(t; z_i) = \lambda_0(t) \exp\left(\beta z_i^T\right)$$
where:
• $t$ is the duration (time)
• $z_i$ is a row vector of covariates for individual $i$
• $\beta$ is a row vector of regression parameters for the model
• $\lambda_0(t)$ is the baseline hazard rate.
(ii) Baseline bird
We can see from the table that the parameter estimates are zero for
‘chicken’, ‘old’ and ‘female’. So the baseline bird is a female chicken in the
old enclosure.
(iii)(a) Define the covariates
The hazard function for the Cox model in this example would have the
following form:
$$\lambda(t; z_1, z_2, z_3, z_4) = \lambda_0(t) \exp\left(\beta_1 z_1 + \beta_2 z_2 + \beta_3 z_3 + \beta_4 z_4\right)$$
where:
• $z_1 = 1$ for a duck and 0 for other birds
• $z_2 = 1$ for a goose and 0 for other birds
• $z_3 = 1$ for the new enclosure and 0 for the old enclosure
• $z_4 = 1$ for male and 0 for female
• the $\beta$'s are the corresponding parameters to be estimated.

(iii)(b) Confidence intervals
The parameter $\beta_1$ quantifies the effect of the type of bird (the differential
between duck and chicken) on the birds’ mortality rates.
The 95% confidence interval for this parameter is:
$$-0.210 \pm 1.96 \times \sqrt{0.002} = -0.210 \pm 0.088 \quad \text{ie} \quad -0.298 \le \beta_1 \le -0.122$$
The parameter $\beta_2$ quantifies the effect of the type of bird (the differential
between goose and chicken) on the birds’ mortality rates.
$$+0.075 \pm 1.96 \times \sqrt{0.004} = +0.075 \pm 0.124 \quad \text{ie} \quad -0.049 \le \beta_2 \le +0.199$$
The parameter $\beta_3$ quantifies the effect of the enclosure (new versus old) on
the birds’ mortality rates.
$$+0.125 \pm 1.96 \times \sqrt{0.0015} = +0.125 \pm 0.076 \quad \text{ie} \quad +0.049 \le \beta_3 \le +0.201$$
The parameter $\beta_4$ quantifies the effect of sex (male versus female) on the
birds’ mortality rates.
$$+0.2 \pm 1.96 \times \sqrt{0.0026} = +0.2 \pm 0.100 \quad \text{ie} \quad +0.100 \le \beta_4 \le +0.300$$
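The four intervals can be recomputed as follows. This is a rough Python sketch that assumes the second figure quoted for each parameter (0.002, 0.004, 0.0015, 0.0026) is a variance, since the quoted half-widths agree with 1.96 times its square root; the dictionary keys are illustrative labels.

```python
import math

# (estimate, variance) pairs; treating the second figure as a variance
# is an assumption inferred from the half-widths in the solution
params = {
    "beta_1": (-0.210, 0.002),
    "beta_2": (0.075, 0.004),
    "beta_3": (0.125, 0.0015),
    "beta_4": (0.2, 0.0026),
}

intervals = {}
for name, (estimate, variance) in params.items():
    half_width = 1.96 * math.sqrt(variance)
    intervals[name] = (estimate - half_width, estimate + half_width)
    low, high = intervals[name]
    contains_zero = low <= 0 <= high
    print(f"{name}: ({low:.3f}, {high:.3f}) contains 0: {contains_zero}")
```

Only the interval for `beta_2` contains zero, so on this basis the goose versus chicken differential is the only effect that is not significant at the 5% level.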
(iv) Effect of the new enclosure
Since the confidence interval for $\beta_3$ does not contain the value 0 and the
parameter estimate is positive, this implies that birds in the new enclosure
have a significantly higher mortality rate. So it appears that the new
enclosure will result in a reduction in the birds’ life expectancy, rather than
an increase.

(v) Calculate the probability
The hazard rate for a male goose in the old enclosure at time $t$ is:
$$\lambda(t; 0, 1, 0, 1) = \lambda_0(t) \exp(-0.210 \times 0 + 0.075 \times 1 + 0.125 \times 0 + 0.2 \times 1) = \lambda_0(t) e^{0.275}$$
We are given that at time 6 months ($= T$, say), the survival probability is:
$$S_M(T) = \exp\left(-\int_0^T \lambda_0(t) e^{0.075 + 0.2} \, dt\right) = 1 - 0.1 = 0.9$$
We can use this to find $S_0(T)$, the survival probability of the baseline bird up
to time $T$:
$$0.9 = \exp\left(-\int_0^T \lambda_0(t) e^{0.275} \, dt\right) = \exp\left(-\int_0^T \lambda_0(t) \, dt \times e^{0.275}\right)$$
$$= \left[\exp\left(-\int_0^T \lambda_0(t) \, dt\right)\right]^{1.31653} = \left[S_0(T)\right]^{1.31653}$$
$$\Rightarrow S_0(T) = 0.9^{1/1.31653} = 0.92309$$
The hazard rate for a female duck in the new enclosure at time $t$ is:
$$\lambda(t; 1, 0, 1, 0) = \lambda_0(t) \exp(-0.210 \times 1 + 0.075 \times 0 + 0.125 \times 1 + 0.2 \times 0) = \lambda_0(t) e^{-0.085}$$
The corresponding survival probability up to 6 months is:
$$S_F(T) = \exp\left(-\int_0^T \lambda_0(t) e^{-0.085} \, dt\right) = \exp\left(-\int_0^T \lambda_0(t) \, dt \times e^{-0.085}\right)$$
$$= \left[\exp\left(-\int_0^T \lambda_0(t) \, dt\right)\right]^{0.91851} = \left[S_0(T)\right]^{0.91851}$$
$$= \left[0.92309\right]^{0.91851} = 0.92913$$
The probability that the female duck in the new enclosure has been killed by
6 months is therefore $1 - 0.92913 = 0.07087$.
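The two-step calculation above (back out the baseline survival probability, then raise it to the new relative hazard) can be sketched in Python as follows. The parameter estimates are those used in the solution; the helper name `relative_hazard` is an illustrative choice.

```python
import math

# Parameter estimates for (duck, goose, new enclosure, male)
beta = (-0.210, 0.075, 0.125, 0.2)

def relative_hazard(z):
    """exp(beta . z): the multiplier applied to the baseline hazard."""
    return math.exp(sum(b * x for b, x in zip(beta, z)))

male_goose_old = (0, 1, 0, 1)
female_duck_new = (1, 0, 1, 0)

s_goose = 0.9                                           # given: 10% killed by 6 months
s_baseline = s_goose ** (1 / relative_hazard(male_goose_old))
s_duck = s_baseline ** relative_hazard(female_duck_new)

print(f"S0(T) = {s_baseline:.5f}")
print(f"P(female duck killed by 6 months) = {1 - s_duck:.5f}")
```

Under proportional hazards the baseline step can be skipped, since the duck's survival probability equals 0.9 raised to the ratio of the two relative hazards.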

We have removed the parts of this question that refer to the Balducci
assumption as this assumption is beyond the scope of Subject CS2.
(i)(a) UDD
$${}_t q_x = t \, q_x$$
(i)(b) Constant force assumption
$${}_t q_x = 1 - {}_t p_x = 1 - e^{-\mu t} = 1 - \left(e^{-\mu}\right)^t = 1 - (p_x)^t = 1 - (1 - q_x)^t$$
(ii)(a) Calculation using the UDD assumption
$${}_{0.5} p_{60} = 1 - {}_{0.5} q_{60} = 1 - 0.5 \, q_{60} = 1 - 0.5 \times 0.05 = 0.975000$$
(ii)(b) Calculation using the constant force assumption
$${}_{0.5} p_{60} = (p_{60})^{0.5} = 0.95^{0.5} = 0.974679$$
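For comparison, both assumptions can be evaluated side by side. A minimal Python sketch, assuming $q_{60} = 0.05$ as in the solution (function names are illustrative):

```python
def p_udd(t, q):
    """t_p_x under a uniform distribution of deaths, for 0 <= t <= 1."""
    return 1 - t * q

def p_constant_force(t, q):
    """t_p_x under a constant force of mortality, for 0 <= t <= 1."""
    return (1 - q) ** t

q60 = 0.05
udd = p_udd(0.5, q60)
cfm = p_constant_force(0.5, q60)
print(f"UDD: {udd:.6f}, constant force: {cfm:.6f}, difference: {udd - cfm:.6f}")
```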
(iii) Comment
There is not that much difference between the values.
The lighter the mortality in the first half of the year, the higher the value
of ${}_{0.5} p_{60}$. This will occur when the force of mortality is increasing over the
year between age $x$ and age $x + 1$.
The UDD assumption implies that the force of mortality is increasing and so
this gives a higher value for ${}_{0.5} p_{60}$.
The constant force assumption says that mortality rates are the same in the
first half and in the second half of the year and so this gives a lower value
for ${}_{0.5} p_{60}$.

(i) Types of censoring present in the study
Type I censoring is present since we know in advance that all surviving lives
(who have not previously withdrawn) will be censored 5 years after their
operations. Type I censoring is a special case of right censoring.
Random censoring occurs when patients withdraw from the study since the
withdrawal times are not known in advance. This is another special case of
right censoring.
(ii) Number of deaths, classified by duration
The estimated survival function has ‘steps’ at times 1, 3 and 4. So deaths
have been observed at these times.
The Nelson-Aalen estimate of the survival function is $\hat{S}(t) = e^{-\hat{\Lambda}(t)}$,
where $\hat{\Lambda}(t) = \sum_{t_j \le t} \frac{d_j}{n_j}$. So:
$$\sum_{t_j \le t} \frac{d_j}{n_j} = -\ln \hat{S}(t)$$
From the given data, we have:
$$\hat{S}(1) = 0.9355 \Rightarrow \frac{d_1}{n_1} = -\ln 0.9355 = 0.0667 = \frac{1}{15}$$
This means that $d_1 = 1$ and $n_1 = 15$.
$$\hat{S}(3) = 0.7122 \Rightarrow \frac{d_1}{n_1} + \frac{d_2}{n_2} = -\ln 0.7122$$
$$\Rightarrow \frac{d_2}{n_2} = -\ln 0.7122 - \frac{1}{15} = 0.2727 = \frac{3}{11}$$
This means that $d_2 = 3$ and $n_2 = 11$.

Also:
$$\hat{S}(4) = 0.6285 \Rightarrow \frac{d_1}{n_1} + \frac{d_2}{n_2} + \frac{d_3}{n_3} = -\ln 0.6285$$
$$\Rightarrow \frac{d_3}{n_3} = -\ln 0.6285 - \frac{1}{15} - \frac{3}{11} = 0.1250 = \frac{1}{8}$$
This means that $d_3 = 1$ and $n_3 = 8$.
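The back-solving of $d_j / n_j$ from the published survival estimates can be sketched in Python. This is only an illustration: it assumes the denominators cannot exceed 16 (the number of patients quoted in part (iii)) and uses `Fraction.limit_denominator` to find the nearest simple fraction, which returns $d_j / n_j$ in lowest terms.

```python
import math
from fractions import Fraction

# Published Nelson-Aalen survival estimates at the three step times
s_hat = {1: 0.9355, 3: 0.7122, 4: 0.6285}

previous_cum_hazard = 0.0
ratios = []
for t, s in sorted(s_hat.items()):
    cum_hazard = -math.log(s)                    # integrated hazard up to t
    increment = cum_hazard - previous_cum_hazard
    # nearest fraction with denominator <= 16 (assumed cap: 16 patients)
    ratios.append(Fraction(increment).limit_denominator(16))
    previous_cum_hazard = cum_hazard

print(ratios)   # d_j / n_j at times 1, 3 and 4
```

Because the result is in lowest terms, a ratio such as 2/10 would be shown as 1/5, so the output still needs to be read alongside the possible numbers at risk.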
(iii) Number of patients who were censored
11 patients must have been censored because 5 out of the original 16

patients died during the study.
(i) Probability of surviving for at least 10 days
Using the formula ${}_t p_x = \exp\left(-\int_0^t \mu_{x+s} \, ds\right)$ and working in days, we have:
$${}_{10} p_0 = \exp\left(-\int_0^{10} \mu_s \, ds\right) = \exp\left(-\int_0^{10} 0.05 \, ds\right) = \exp(-0.5) = 0.6065$$
(ii) Probability of surviving for a further 30 days
Because the form of the force of mortality changes after 30 days, we need to
split up our calculation into the parts before and after 30 days:
$${}_{30} p_{10} = {}_{20} p_{10} \times {}_{10} p_{30}$$
$$= \exp\left(-\int_0^{20} \mu_{10+s} \, ds\right) \times \exp\left(-\int_0^{10} \mu_{30+s} \, ds\right)$$
$$= \exp\left(-\int_0^{20} 0.05 \, ds\right) \times \exp\left(-\int_0^{10} 0.05 e^{0.01s} \, ds\right)$$

The first integral is:
$$\int_0^{20} 0.05 \, ds = 0.05 \times 20 = 1$$
and the second is:
$$\int_0^{10} 0.05 e^{0.01s} \, ds = \left[\frac{0.05 e^{0.01s}}{0.01}\right]_0^{10} = 5\left(e^{0.1} - 1\right)$$
So we get:
$${}_{30} p_{10} = \exp(-1) \times \exp\left[-5\left(e^{0.1} - 1\right)\right] = 0.2174$$
(iii) Age by which 90% have died
Let $y$ denote the age (in days) by which 90% have died. This will satisfy
the equation ${}_y p_0 = 0.1$. This age is likely to be greater than 30 days. So,
to evaluate the LHS, we need to split the age range as before:
$${}_y p_0 = {}_{30} p_0 \times {}_{y-30} p_{30}$$
$$= \exp\left(-\int_0^{30} \mu_s \, ds\right) \times \exp\left(-\int_0^{y-30} \mu_{30+s} \, ds\right)$$
$$= \exp\left(-\int_0^{30} 0.05 \, ds\right) \times \exp\left(-\int_0^{y-30} 0.05 e^{0.01s} \, ds\right)$$
$$= \exp(-0.05 \times 30) \times \exp\left[-5\left(e^{0.01(y-30)} - 1\right)\right]$$
So:
$$\exp(-1.5) \times \exp\left[-5\left(e^{0.01(y-30)} - 1\right)\right] = 0.1$$

Taking logs:
$$-1.5 - 5\left(e^{0.01(y-30)} - 1\right) = \ln 0.1$$
$$\Rightarrow 3.5 - 5e^{0.01(y-30)} = \ln 0.1$$
$$\Rightarrow e^{0.01(y-30)} = \frac{3.5 - \ln 0.1}{5}$$
$$\Rightarrow y - 30 = 100 \ln\left(\frac{3.5 - \ln 0.1}{5}\right) = 14.89$$
$$\Rightarrow y = 30 + 14.89 = 44.89 \text{ days}$$
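All three parts can be checked with one cumulative hazard function. The sketch below assumes the force of mortality implied by the integrals in the solution, namely 0.05 per day up to age 30 days and $0.05 e^{0.01(x - 30)}$ thereafter; the function names are illustrative.

```python
import math

def cumulative_hazard(age):
    """Integral of the force of mortality from age 0 to `age` (in days)."""
    if age <= 30:
        return 0.05 * age
    return 0.05 * 30 + 5 * (math.exp(0.01 * (age - 30)) - 1)

def survival(x, t):
    """t_p_x: probability that a life aged x days survives a further t days."""
    return math.exp(-(cumulative_hazard(x + t) - cumulative_hazard(x)))

p_i = survival(0, 10)       # part (i)
p_ii = survival(10, 30)     # part (ii)

# Part (iii): closed-form solution of y_p_0 = 0.1 for y > 30
y = 30 + 100 * math.log((3.5 - math.log(0.1)) / 5)

print(f"(i) {p_i:.4f}  (ii) {p_ii:.4f}  (iii) y = {y:.2f} days")
```

Evaluating `survival(0, y)` gives 0.1, which confirms that the split at 30 days was handled consistently.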
(i) Censoring
Right censoring
Right censoring is present when some of the periods of observation have

been cut short.
This is true for the individuals who left hospital during the study (B, H and M)
and for the individuals who were still alive and in hospital when the study
period finished (C, D, F, I, L and N). For these individuals all we can
establish is a lower limit for their time of death ti .
Left censoring
Left censoring is present when we are only able to establish an upper limit
for the time of death ti , not a precise value. This may be true to some
extent for the individuals who died during the study, since we are only given
the date of surgery and date of death, not the precise time of day.
However, this effect is very small and can probably be ignored.

Informative censoring
Informative censoring is where the different types of decrement cannot be

considered to be independent. This means that, when censoring occurs, it
might affect the survival probabilities for those remaining.

It is likely that patients will only be discharged from hospital if they are well
enough to be able to cope by themselves. So those remaining in the
hospital will tend to be the more seriously ill patients, which implies
informative censoring.
The table below shows the times of death and censoring in ascending order,
measured in days from the date of surgery. A ’+’ indicates a right-censored
observation. For duration 56, we have followed the usual convention that
deaths are assumed to occur before censoring.
Patient   Date of surgery   Date of leaving   Reason   Time in days
A June 1 June 3 Died 2
G June 16 June 21 Died 5
J June 24 June 29 Died 5
B June 3 July 2 Left 29+
E June 9 July 11 Died 32
M June 29 Aug 6 Left 38+
K June 25 Aug 20 Died 56
H June 17 Aug 12 Left 56+
N June 30 Aug 31 End of study 62+
L June 26 Aug 31 End of study 66+
I June 22 Aug 31 End of study 70+
F June 12 Aug 31 End of study 80+
D June 8 Aug 31 End of study 84+
C June 5 Aug 31 End of study 87+

We can now construct the usual summary table for calculations based on
the Kaplan-Meier model:
Counter $j$   Time of the $j$ th death, $t_j$   Number of deaths at time $t_j$, $d_j$   Number at risk just before time $t_j$, $n_j$
1 2 1 14
2 5 2 13
3 32 1 10
4 56 1 8
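The Kaplan-Meier estimator of the survival function at time $t$ is:
$$\hat{S}(t) = \prod_{t_j \le t} \left(1 - \frac{d_j}{n_j}\right)$$
This leads to the following values:
$$0 \le t < 2: \quad \hat{S}(t) = 1$$
$$2 \le t < 5: \quad \hat{S}(t) = \frac{13}{14} = 0.92857$$
$$5 \le t < 32: \quad \hat{S}(t) = \frac{13}{14} \times \frac{11}{13} = \frac{11}{14} = 0.78571$$
$$32 \le t < 56: \quad \hat{S}(t) = \frac{13}{14} \times \frac{11}{13} \times \frac{9}{10} = \frac{99}{140} = 0.70714$$
$$56 \le t < 92: \quad \hat{S}(t) = \frac{13}{14} \times \frac{11}{13} \times \frac{9}{10} \times \frac{7}{8} = \frac{99}{160} = 0.61875$$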

(iii) Graph
The graph of the Kaplan-Meier estimate of the survival function is a
decreasing step function.
[Graph: Kaplan-Meier survival function, showing the estimate of S(t) (vertical axis, 0 to 1) against duration t in days (horizontal axis, 0 to 100)]
(iv) Probability of dying within 4 weeks
The probability of dying within 4 weeks (ie 28 days) of surgery is:
$$\hat{F}(28) = 1 - \hat{S}(28) = 1 - \frac{11}{14} = \frac{3}{14} = 0.21429$$
(i) Describe the central and initial rate of mortality
The central rate of mortality $m_x$ is the probability of dying between exact
ages $x$ and $x + 1$ per person-year lived between exact ages $x$ and $x + 1$.
The initial rate of mortality $q_x$ is the probability that a life aged exactly $x$ will
die during the next year, ie before reaching exact age $x + 1$.

(ii) Circumstance in which $m_x = \mu_x$
If the force of mortality is constant between exact ages $x$ and $x + 1$, then
$m_x = \mu_x$.
(i) Types of censoring
Type 1 censoring
Type 1 censoring is present since the study was terminated after a fixed time
period.
Right censoring
Right censoring is present when some of the periods of observation have
been cut short.
The test areas that were accidentally ploughed up and the ones that still
showed no re-growth after 12 months are subject to right censoring, since
we only know that their re-growth time exceeded a given number.
Non-informative censoring
Non-informative censoring occurs when the different types of decrement can

be considered to be independent. This means that, when censoring occurs,
it does not affect the survival probabilities for those remaining.
We are told that it was an accident that the areas were ploughed up. So this
would be non-informative censoring.
Random censoring
Random censoring occurs when a decrement occurs at a time that was not
scheduled.
The areas that were ploughed up accidentally provide an example of random

censoring.

(ii) Probability of no re-growth by 9 months
The table below summarises the information given, with the times arranged
in ascending order. A ’+’ sign indicates a right-censored observation.
Month Outcome Number of plots

1 Re-growth 1
2 Re-growth 3
5 Re-growth 2
6+ Re-ploughed 5
8 Re-growth 4
12+ End of study 5
We can now construct the usual summary table for calculations of empirical
survival rates:
Counter $j$   Time of the $j$ th re-growth, $t_j$   Number of re-growths at time $t_j$, $d_j$   Number at risk just before time $t_j$, $n_j$
1 1 1 20
2 2 3 19
3 5 2 16
4 8 4 9
Kaplan-Meier approach
The Kaplan-Meier estimate of the survival function at time $t$ is:
$$\hat{S}(t) = \prod_{t_j \le t} \left(1 - \frac{d_j}{n_j}\right)$$
So the estimated probability of no re-growth by 9 months is:
$$\hat{S}(9) = \frac{19}{20} \times \frac{16}{19} \times \frac{14}{16} \times \frac{5}{9} = \frac{35}{90} = \frac{7}{18} = 0.3889$$

Nelson-Aalen approach
According to the Nelson-Aalen model, the integrated hazard is estimated as:
$$\hat{\Lambda}_t = \sum_{t_j \le t} \frac{d_j}{n_j}$$
So, at time 9 months, this is:
$$\hat{\Lambda}_9 = \frac{1}{20} + \frac{3}{19} + \frac{2}{16} + \frac{4}{9} = 0.7773$$
Survival probabilities are then calculated as:
$$\hat{S}(t) = \exp\left(-\hat{\Lambda}_t\right)$$
So the estimated probability of no re-growth by 9 months is:
$$\hat{S}(9) = e^{-0.7773} = 0.4596$$
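Both estimates can be derived directly from the raw observations, which also shows how the numbers at risk are built up. The following Python sketch assumes the plot counts summarised in the first table of this part; the tuple layout (month, re-growths, censored) is an illustrative choice.

```python
import math

# (month, number of re-growths, number censored) from the summary table
observations = [(1, 1, 0), (2, 3, 0), (5, 2, 0), (6, 0, 5), (8, 4, 0), (12, 0, 5)]

at_risk = sum(d + c for _, d, c in observations)    # 20 plots at the start
km_survival = 1.0
cum_hazard = 0.0
for month, regrowths, censored in sorted(observations):
    if month > 9:
        break
    if regrowths:
        km_survival *= 1 - regrowths / at_risk      # Kaplan-Meier factor
        cum_hazard += regrowths / at_risk           # Nelson-Aalen increment
    at_risk -= regrowths + censored                 # update the risk set

na_survival = math.exp(-cum_hazard)
print(f"Kaplan-Meier S(9) = {km_survival:.4f}")
print(f"Nelson-Aalen S(9) = {na_survival:.4f} (integrated hazard {cum_hazard:.4f})")
```

The Nelson-Aalen figure is the larger of the two here, which is always the case because $e^{-x} \ge 1 - x$ for each factor.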
(i) Why the Cox regression model is suitable
The Cox model would be suitable in this case because:
• the study can be thought of in terms of hazard rates, which form the
basis for the Cox (proportional hazards) model
• we are primarily interested in comparing the effect of covariates (gender
and exercise regime)
• we may want to analyse interactions between the covariates (which the
model can incorporate)
• we are not particularly interested in the baseline hazard rate itself (and
Cox partial likelihoods allow us to strip this out)
• the Cox model is a well-known model that is commonly used
• the structure of the Cox model ensures that the hazard rate is always
positive
• software is available for estimating the parameters and calculating the
likelihoods
• the Cox model can deal with censored data
• it is easy to test whether each parameter is statistically significant
• the results are easy to interpret.
(ii) Interaction term
The Cox model in this case would take the form:
$$\lambda(t; Z_1, Z_2, Z_3) = \lambda_0(t) e^{\beta_1 Z_1 + \beta_2 Z_2 + \beta_3 Z_3}$$
where $Z_3 = Z_1 Z_2$.
We need to test:
$$H_0: \beta_3 = 0 \quad \text{versus} \quad H_1: \beta_3 \ne 0$$
To do this we can use the likelihood ratio test to compare two models:
• a two-parameter model with parameters $\beta_1$ and $\beta_2$
• a three-parameter model with parameters $\beta_1$, $\beta_2$ and $\beta_3$.
The value of the test statistic is:
$$-2(\ell_2 - \ell_3) = -2\left(-1{,}250 - (-1{,}246)\right) = 8$$
Assuming that the sample size for the study is sufficiently large, under $H_0$,
this should come from the $\chi^2_1$ distribution. Since the observed value of the
test statistic exceeds 3.841, the upper 5% point of $\chi^2_1$, we reject $H_0$ and
conclude that the $\beta_3$ parameter is required, ie there is a significant
interaction effect present.
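The test can be sketched in Python using only the standard library. The p-value line relies on the fact that a $\chi^2_1$ variable is the square of a standard normal, so its upper tail probability at $x$ equals `erfc(sqrt(x / 2))`; the variable names are illustrative.

```python
import math

log_lik_2 = -1250.0     # maximised log-likelihood, two-parameter model
log_lik_3 = -1246.0     # maximised log-likelihood, three-parameter model

statistic = -2 * (log_lik_2 - log_lik_3)

# Upper tail of chi-squared with 1 degree of freedom:
# P(Z^2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2))
p_value = math.erfc(math.sqrt(statistic / 2))

critical_value = 3.841  # upper 5% point of chi-squared(1), from the Tables
reject_h0 = statistic > critical_value
print(f"statistic = {statistic}, p-value = {p_value:.4f}, reject H0: {reject_h0}")
```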

(iii) Interpret the results
Using the estimated parameter values, the Cox model is:
$$\lambda(t; Z_1, Z_2, Z_3) = \lambda_0(t) e^{\beta_1 Z_1 + \beta_2 Z_2 + \beta_3 Z_3} = \lambda_0(t) e^{0.2 Z_1 - 0.3 Z_2 - 0.35 Z_3}$$
The factors in the hazard rate for the four possible combinations of
individuals are:
• Female, no exercise: $e^0 = 1$ (= baseline)
• Female, with exercise: $e^{\beta_2} = e^{-0.3}$
• Male, no exercise: $e^{\beta_1} = e^{0.2}$
• Male, with exercise: $e^{\beta_1 + \beta_2 + \beta_3} = e^{0.2 - 0.3 - 0.35} = e^{-0.45}$
From these, we can see that:
• females who take exercise are less susceptible to heart disease than
females who take no exercise
• males who take no exercise are more susceptible to heart disease than
females who take no exercise
• males who take exercise are less susceptible to heart disease than
females who take exercise (and males who don’t).
Another way of interpreting the interaction term with parameter $\beta_3 = -0.35$
is to say that the beneficial effect of exercise is greater for males than for
females.

(i) Calculate the hazard
The hazard rate for each month is obtained by dividing the number of
transitions by the exposed to risk, as shown in the table below.
Month   Number of patient-months exposed to risk   Number of patients experiencing a return of their symptoms   Estimated hazard for month
1       200       5        5/200 = 0.025
2       190       8        8/190 = 0.042
3       175       15       15/175 = 0.086
4       150       10       10/150 = 0.067
5       135       6        6/135 = 0.044
6       125       3        3/125 = 0.024
(ii) Comment
Gompertz model
Gompertz’ law assumes that the hazard rate at age $x$ is equal to $\mu_x = Bc^x$.
(This will increase/decrease over time depending on whether $c > 1$
or $c < 1$.)
This formula implies that the hazard rate either increases or decreases
monotonically over time (or remains constant if c = 1 ). However, the results
from the table in part (i) indicate that the hazard rate reaches a peak in
month 3 and then starts to decline. This suggests that the Gompertz model
is not suitable here.
The Gompertz model does not allow us to model differences in the hazard
rate attributable to factors other than age or time.
Semi-parametric
A semi-parametric model specifies a formula for the hazard rate that is partly
parametric and partly non-parametric. The parametric component allows the
model to be fitted to the individual data.
One example of a semi-parametric model is the Cox model, which assumes
that the hazard rate at time $t$ is equal to $\lambda(t; z_1, z_2, \ldots) = \lambda_0(t) e^{\beta_1 z_1 + \beta_2 z_2 + \cdots}$.
The baseline hazard $\lambda_0(t)$, which is the same for all individuals, allows us to
incorporate a humped graph consistent with the results we can see in the
table.
The parametric component allows us to take into account other factors
affecting the hazard rate via the covariate values $z_1, z_2, \ldots$ for each
individual, which is not possible with the Gompertz model.
Semi-parametric models allow us to examine the effect of each factor, ie

whether it is statistically significant, its direction and its size.
However, the actual pattern of the hazard rate might not be consistent with
the proportional hazards assumption underlying the Cox model.
(i) Likelihood
For those lives who died, we know the exact values of their lifetimes. So the
contribution made to the likelihood function by the $i$ th death is $f_{T_i}(t_i)$, where
$f$ denotes the PDF of the complete future lifetime random variable for life $i$.
Using the notation of this question:
$$f_{T_i}(t_i) = S(t_i) \, h(t_i)$$
where:
$$S(t_i) = \exp\left(-\int_0^{t_i} h(s) \, ds\right)$$
$$= \exp\left(-\int_0^{t_i} (A + Bs) \, ds\right)$$
$$= \exp\left(-\left[As + \tfrac{1}{2} B s^2\right]_0^{t_i}\right)$$
$$= \exp\left(-A t_i - \tfrac{1}{2} B t_i^2\right)$$
So the total contribution made to the likelihood function by the deaths is:
$$\prod_{\text{deaths}} (A + B t_i) \exp\left(-A t_i - \tfrac{1}{2} B t_i^2\right)$$

We do not know the exact lifetime of each of the survivors. For the $i$ th
survivor, all we know is that his/her lifetime is more than $t_i$. So the
contribution made to the likelihood function by the $i$ th survivor is:
$$P(T_i > t_i) = S(t_i) = \exp\left(-A t_i - \tfrac{1}{2} B t_i^2\right)$$
and hence the total contribution made to the likelihood function by the
survivors is:
$$\prod_{\text{survivors}} \exp\left(-A t_i - \tfrac{1}{2} B t_i^2\right)$$
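So the overall likelihood function is:
$$L = \prod_{\text{deaths}} (A + B t_i) \exp\left(-A t_i - \tfrac{1}{2} B t_i^2\right) \prod_{\text{survivors}} \exp\left(-A t_i - \tfrac{1}{2} B t_i^2\right)$$
$$= \prod_{\text{deaths}} (A + B t_i) \prod_{\text{all}} \exp\left(-A t_i - \tfrac{1}{2} B t_i^2\right)$$
$$= \prod_{\text{all}} (A + B t_i)^{d_i} \prod_{\text{all}} \exp\left(-A t_i - \tfrac{1}{2} B t_i^2\right)$$
$$= \prod_{i=1}^{n} (A + B t_i)^{d_i} \exp\left(-A t_i - \tfrac{1}{2} B t_i^2\right)$$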
(ii) Simultaneous equations
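The log-likelihood is:
$$\ln L = \sum_{i=1}^{n} \left\{ d_i \ln(A + B t_i) - A t_i - \tfrac{1}{2} B t_i^2 \right\}$$
Differentiating this with respect to the parameters $A$ and $B$:
$$\frac{\partial}{\partial A} \ln L = \sum_{i=1}^{n} \left\{ \frac{d_i}{A + B t_i} - t_i \right\}$$
$$\frac{\partial}{\partial B} \ln L = \sum_{i=1}^{n} \left\{ \frac{d_i t_i}{A + B t_i} - \tfrac{1}{2} t_i^2 \right\}$$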

The maximum likelihood estimates of $A$ and $B$ are the values of these
parameters for which the two partial derivatives are equal to 0.
So the required simultaneous equations are:
$$\sum_{i=1}^{n} \left\{ \frac{d_i}{A + B t_i} - t_i \right\} = 0 \quad \text{or} \quad \sum_{i=1}^{n} \frac{d_i}{A + B t_i} = \sum_{i=1}^{n} t_i$$
$$\sum_{i=1}^{n} \left\{ \frac{d_i t_i}{A + B t_i} - \tfrac{1}{2} t_i^2 \right\} = 0 \quad \text{or} \quad \sum_{i=1}^{n} \frac{d_i t_i}{A + B t_i} = \tfrac{1}{2} \sum_{i=1}^{n} t_i^2$$
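The partial derivatives above can be sanity-checked numerically. The sketch below codes the log-likelihood and its two derivatives for the hazard $h(t) = A + Bt$ and compares them with central finite differences. The data set and the trial values of $A$ and $B$ are entirely hypothetical (they are not from the question), and $d_i$ is taken to be 1 for a death and 0 for a survivor.

```python
import math

def log_likelihood(a, b, data):
    """ln L for hazard h(t) = a + b*t; data holds (t_i, d_i) pairs."""
    return sum(d * math.log(a + b * t) - a * t - 0.5 * b * t * t for t, d in data)

def score(a, b, data):
    """Partial derivatives of ln L with respect to a and b."""
    d_a = sum(d / (a + b * t) - t for t, d in data)
    d_b = sum(d * t / (a + b * t) - 0.5 * t * t for t, d in data)
    return d_a, d_b

# Hypothetical observations: (time observed, 1 if died / 0 if survived)
data = [(0.5, 1), (1.0, 1), (2.5, 0), (3.0, 1), (4.0, 0)]
a, b, h = 0.1, 0.05, 1e-6

analytic = score(a, b, data)
numeric = (
    (log_likelihood(a + h, b, data) - log_likelihood(a - h, b, data)) / (2 * h),
    (log_likelihood(a, b + h, data) - log_likelihood(a, b - h, data)) / (2 * h),
)
print("analytic:", analytic)
print("numeric: ", numeric)
```

Setting both components of `score` to zero gives the simultaneous equations, which in practice would be solved numerically.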
(i) Time to sell 40% of the pies
For the Nelson-Aalen method, the survival probability up to time $t$ is
estimated as:
$$\hat{S}(t) = \exp\left(-\hat{\Lambda}(t)\right)$$
where the integrated hazard is calculated as $\hat{\Lambda}(t) = \sum_{t_j \le t} \frac{d_j}{n_j}$.
So the estimated time at which 40% of the pies are sold, which is the same
as when the estimated survival probability equals 60%, will satisfy:
$$\hat{S}(t) = 0.6 \quad \text{ie} \quad \exp\left(-\hat{\Lambda}(t)\right) = 0.6$$

Taking logs gives:
$$\hat{\Lambda}(t) = -\ln 0.6 = 0.5108$$
ie:
$$\sum_{t_j \le t} \frac{d_j}{n_j} = 0.5108$$

We can summarise the observations as follows:
Time of day   Hours since the start of the day ($t$)   Outcome   Number of pies affected
9am 0 Start of study
10am 1 Sale 2
11am 2 Sale 3
12 noon 3 Censored 1
1pm 4 Sale 2
2pm 5 Censored 2
3pm 6 Sale 1
5pm 8 End of study
We can now construct the usual summary table for calculations of empirical
survival rates:
Counter $j$   Time of the $j$ th sale, $t_j$   Number of sales at time $t_j$, $d_j$   Number at risk just before time $t_j$, $n_j$
1 1 2 12
2 2 3 10
3 4 2 6
4 6 1 2

We can then calculate the estimates of the hazard rate and the integrated
hazard between the times of each sale:
Time interval     Fraction of remaining pies sold, $d_j / n_j$     Integrated hazard function, $\hat{\Lambda}(t)$
$0 \le t < 1$     0                   $\hat{\Lambda}(t) = 0$
$1 \le t < 2$     $2/12 = 1/6$        $\hat{\Lambda}(t) = 1/6 = 0.1667$
$2 \le t < 4$     $3/10$              $\hat{\Lambda}(t) = 1/6 + 3/10 = 7/15 = 0.4667$
$4 \le t < 6$     $2/6 = 1/3$         $\hat{\Lambda}(t) = 7/15 + 1/3 = 4/5 = 0.8$
$6 \le t < 8$     $1/2$               $\hat{\Lambda}(t) = 4/5 + 1/2 = 13/10 = 1.3$
The integrated hazard jumps from 0.4667 to 0.8 at time $t = 4$, so this is the
first time at which it reaches 0.5108, ie
the time taken for Mr Bunn to sell 40% of his pies is estimated as 1pm.
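The search for the first time at which the integrated hazard reaches the target can be written as a short loop. A minimal Python sketch, assuming the $(t_j, d_j, n_j)$ summary table above and using exact fractions so that the running totals match the table:

```python
import math
from fractions import Fraction

# (hours since 9am, pies sold, pies at risk) from the summary table
table = [(1, 2, 12), (2, 3, 10), (4, 2, 6), (6, 1, 2)]

target = -math.log(0.6)          # integrated hazard at which S_hat = 0.6
cum_hazard = Fraction(0)
history = []
time_reached = None
for t_j, d_j, n_j in table:
    cum_hazard += Fraction(d_j, n_j)
    history.append((t_j, cum_hazard))
    if time_reached is None and cum_hazard >= target:
        time_reached = t_j       # first sale time at which the target is met

print(history)
print(f"target = {target:.4f}, reached at t = {time_reached} hours after 9am")
```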
(ii) Comment
This estimate is based on a sample of only 12 pies on a single day, which

might not be representative of the pattern of sales.
The true probability could be very different due to sampling error. It might be
better to work out a confidence interval.
It is not obvious how Mr Bunn could use this particular figure to plan
production, as the 40% figure seems an arbitrary percentage. Perhaps 90%
would be more meaningful, as this would be a time by which almost all the
pies have been sold.
Also, if he bakes twice as many pies, unless sales increase, it will take him
longer to sell 40% of them.
The Nelson-Aalen model assumes that all pies have the same probability of
being sold, which should be reasonable if they are all of a similar size and
none of them are burnt or damaged.
It also assumes that sales are independent. This is not the case here since
some of the sales and censoring events involved several pies in a single
event.

Hopefully, Mr Bunn sitting on a pie and the dog stealing two pies were one-
off events that we would not expect to be repeated.
(i) One advantage of a semi-parametric model over a fully parametric one
In a semi-parametric model, any differences between individuals can be

examined without the need to know what the general ‘shape’ is.
This allows us to parameterise the model from the data irrespective of the
shape of the baseline hazard.
(ii) Cox proportional hazards model
The general form of the Cox model models $\lambda(t; z_i)$, the hazard function at
time $t$ for individual $i$, as:
$$\lambda(t; z_i) = \lambda_0(t) \exp\left(\beta z_i^T\right)$$
where:
• $z_i$ is a row vector of covariates for individual $i$
• $\beta$ is a row vector of regression parameters for the model.
(iii) Baseline hazard
The baseline hazard relates to males who bought Whole Life Assurance
through the direct sales channel.

(iv) Probability that the term assurance is still in force after five years
We are told that 60% of whole life policies bought on the internet by males
have lapsed by the end of year five. We can interpret this as the probability
that such a policy has lapsed by the end of year 5:
$$1 - \exp\left\{-\int_0^5 \lambda(t; z_i) \, dt\right\} = 0.6$$
$$\Leftrightarrow \exp\left\{-e^{0.4} \int_0^5 \lambda_0(t) \, dt\right\} = 0.4$$
$$\Leftrightarrow \int_0^5 \lambda_0(t) \, dt = -(\ln 0.4) \, e^{-0.4}$$
We can use this to calculate the probability that a term assurance sold to a
female by an IFA is still in force after five years:
$$\exp\left\{-e^{0.2 - 0.1 - 0.2} \int_0^5 \lambda_0(t) \, dt\right\}$$
$$= \exp\left\{e^{0.2 - 0.1 - 0.2} (\ln 0.4) \, e^{-0.4}\right\}$$
$$= \exp\left\{\ln(0.4) \, e^{-0.5}\right\}$$
$$= 0.57364$$
So, the required probability is 57.36%.
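The same calculation in Python, as a minimal sketch. The linear predictors 0.4 and 0.2 - 0.1 - 0.2 are read off the worked solution above; the variable names are illustrative.

```python
import math

# Linear predictors (sum of beta * covariate) taken from the solution
lp_male_internet_whole_life = 0.4
lp_female_ifa_term = 0.2 - 0.1 - 0.2

# 60% lapsed by year 5  =>  survival probability 0.4 for the first group
baseline_cum_hazard = -math.log(0.4) * math.exp(-lp_male_internet_whole_life)

in_force = math.exp(-math.exp(lp_female_ifa_term) * baseline_cum_hazard)
print(f"P(term assurance still in force after 5 years) = {in_force:.5f}")
```

Equivalently, the answer is 0.4 raised to the power $e^{-0.5}$, the ratio of the two relative hazards.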
Interval censoring is not present here because we know the day that each
traffic warden left without having qualified.
Right censoring is present here because the investigation ends after 30 days
and not all participants have qualified.
Informative censoring is present because the traffic wardens that left without
qualifying probably did so because they knew that it was going to take them
a long time to qualify.

(ii) Kaplan-Meier estimate
The Kaplan-Meier estimate of the survival function at time $t$ is:
$$\hat{S}(t) = \prod_{t_j \le t} \left(1 - \frac{d_j}{n_j}\right)$$
We can construct a table:
Day qualified, $t_j$   Number of qualifications this day, $d_j$   Number that could have qualified on this day, $n_j$   $\hat{\lambda}_j = d_j / n_j$   $\hat{S}(t) = \prod_{t_j \le t} (1 - \hat{\lambda}_j)$
1      1     13     0.07692     0.92308
5      1     12     0.08333     0.84615
12     2     10     0.20000     0.67692
15     1     8      0.12500     0.59231
19     1     7      0.14286     0.50769
24     1     4      0.25000     0.38077
So, we have:
$$\hat{S}(t) = \begin{cases} 1 & 0 \le t < 1 \\ 0.92308 & 1 \le t < 5 \\ 0.84615 & 5 \le t < 12 \\ 0.67692 & 12 \le t < 15 \\ 0.59231 & 15 \le t < 19 \\ 0.50769 & 19 \le t < 24 \\ 0.38077 & 24 \le t \le 30 \end{cases}$$

(iii) Graph of the Kaplan-Meier estimate
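[Graph: step function of the Kaplan-Meier estimate S(t) (vertical axis, 0 to 1) against t (horizontal axis, 0 to 30)]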
(iv) How the answer would change with access to the correct data
In the Kaplan-Meier model, we assume that, if we have a death and a
censored life at the same time, the censored life is included in the $n_j$.
Therefore, the fact that the reasons for exit of candidates D and H were
accidentally transposed is irrelevant. Both candidates left at time 19 so
swapping them will only change their labelling. There will be no effect on the
answer in part (ii).
The fact that the reasons for exit of candidates B and L were accidentally
transposed means that the ‘death’ that we thought occurred at time 5
actually occurred at time 10. So, the estimate of the survival probability,
$\hat{S}(t)$, will increase for $5 \le t < 10$ and reduce for $10 \le t$.

(i) Definitions
Right censoring
Right censoring occurs when observations in progress are cut short. So the
precise duration of the event is not known, only that it exceeds a certain
value.
Type I censoring
Type I censoring occurs in a study if the censoring times are specified in

advance, for example, if there is a fixed cut-off date for the observations.
Type II censoring
With Type II censoring, observation continues until a specified number of

events has occurred, for example, if the study stops when half the people
have died.
(ii) Example of informative censoring
Censoring is informative if it provides some information about the timing of

the events under study.
For example, in an investigation into the mortality of active members of a

pension scheme, members who take early retirement will tend to be those in
worse health, so the lifetimes of those remaining will tend to be longer.
(i) Average age at death
Since the lives have a constant future force of mortality $\mu$, their future
lifetimes have an $\text{Exp}(\mu)$ distribution and their expected future lifetime is
$1/\mu$. However, they have already lived for 5 years. So their average age at
death will be $5 + 1/\mu$, which can be written in the equivalent form $\frac{5\mu + 1}{\mu}$.

(ii) Proportion that die between ages 10 and 15 years
The proportion of animals that will die between ages 10 and 15 is
${}_{10} p_0 - {}_{15} p_0$.
For ages 5 and above, the force of mortality takes a constant value $\mu$, and
we have:
$${}_t p_x = \exp\left(-\int_0^t \mu_{x+s} \, ds\right) = \exp\left(-\int_0^t \mu \, ds\right) = e^{-\mu t}$$
So, splitting the age range at age 5, we have:
$${}_{10} p_0 - {}_{15} p_0 = {}_5 p_0 \times {}_5 p_5 - {}_5 p_0 \times {}_{10} p_5$$
$$= {}_5 p_0 \left({}_5 p_5 - {}_{10} p_5\right)$$
$$= {}_5 p_0 \left(e^{-5\mu} - e^{-10\mu}\right)$$
(iii) Calculate $\mu$ and ${}_5 p_0$
We are now told that:
$${}_{10} p_0 = {}_5 p_0 \, e^{-5\mu} = 0.3 \quad \text{and} \quad {}_{15} p_0 = {}_5 p_0 \, e^{-10\mu} = 0.2$$
Dividing the first of these equations by the second, we get:
$$e^{5\mu} = \frac{0.3}{0.2} = 1.5 \Rightarrow \mu = \tfrac{1}{5} \ln 1.5 = 0.08109$$
From the first equation, we then have:
$${}_5 p_0 = 0.3 e^{5\mu} = 0.3 \times \frac{0.3}{0.2} = \frac{0.3^2}{0.2} = 0.45$$

(i) State the form of the hazard function
The hazard function for the Cox regression model is:
$$\lambda(t; z_1, z_2, \ldots) = \lambda_0(t) e^{\beta_1 z_1 + \beta_2 z_2 + \cdots}$$
where:
• $\lambda(t; z_1, z_2, \ldots)$ is the hazard rate at duration $t$ for an individual with
covariates $z_1, z_2, \ldots$
• $\lambda_0(t)$ is the baseline hazard rate
• $\beta_1, \beta_2, \ldots$ are the regression parameters estimated from the data.
(ii) Advantages of the Cox model
The Cox regression model allows us to compare individuals with different

covariates (eg males and females) without needing to consider the form of
the baseline hazard rates.
The Cox model is a commonly used model and reliable software is available
for carrying out the required calculations.
(iii) Ben versus Bill
The discrete-time hazard rates for Bill and Ben at time $t$ are:
$$\lambda_{\text{Bill}}(t; z_1, z_2, z_3) = \lambda_0(t) e^{0.4 \times 0 - 0.2 \times 1 + 1.15 \times 1} = \lambda_0(t) e^{0.95}$$
and:
$$\lambda_{\text{Ben}}(t; z_1, z_2, z_3) = \lambda_0(t) e^{0.4 \times 1 - 0.2 \times 0 + 1.15 \times 0} = \lambda_0(t) e^{0.4}$$
The ratio of these is:
$$\frac{\lambda_{\text{Ben}}(t; z_1, z_2, z_3)}{\lambda_{\text{Bill}}(t; z_1, z_2, z_3)} = \frac{\lambda_0(t) e^{0.4}}{\lambda_0(t) e^{0.95}} = e^{-0.55} = 0.577$$
So Ben is 42.3% ($= 1 - 0.577$) less likely to pass the exam than Bill.
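Because the baseline hazard cancels, the ratio depends only on the difference between the two linear predictors. A minimal Python sketch, using the parameter values and covariate vectors from the solution (names are illustrative):

```python
import math

beta = (0.4, -0.2, 1.15)     # parameter estimates used in the solution
bill = (0, 1, 1)             # Bill's covariates (z1, z2, z3)
ben = (1, 0, 0)              # Ben's covariates (z1, z2, z3)

def linear_predictor(z):
    return sum(b * x for b, x in zip(beta, z))

# lambda_0(t) cancels, so the ratio is exp of the difference in predictors
hazard_ratio = math.exp(linear_predictor(ben) - linear_predictor(bill))
print(f"Ben / Bill hazard ratio = {hazard_ratio:.3f}")
print(f"Ben is {100 * (1 - hazard_ratio):.1f}% less likely to pass")
```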

(iv) Adjusting the model
If the effect of the number of attempts is different for employees and the self-
employed, this means that there is an interaction between these two factors.
To incorporate this in the model we need to include an additional term based
on the product of $z_1$ and $z_2$ with a new regression parameter $\beta_{1,2}$ (say).
So there would be an extra term equal to $\beta_{1,2} z_1 z_2$ in the exponent, and we
would need to estimate $\beta_{1,2}$.
(i) Kaplan-Meier estimate of the survival function
For the Kaplan-Meier method, the survival function up to time $t$ is estimated
as:
$$\hat{S}(t) = \prod_{t_j \le t} \left(1 - \frac{d_j}{n_j}\right)$$
We can summarise the observations as follows:
Time (hours)   Outcome   Number of bulbs affected
0 Start of study 1,000
50 Failed 10
100 Failed 20
200 Tremor (censored) 200
250 Failed 50
400 Failed 300
450 Failed 50
500 End of study

We can now construct the usual summary table for calculations of empirical
survival rates:
Counter $j$   Time of the $j$ th failure, $t_j$   Number of failures at time $t_j$, $d_j$   Number at risk just before time $t_j$, $n_j$
1 50 10 1,000
2 100 20 990
3 250 50 770
4 400 300 720
5 450 50 420
We can then calculate the estimates of the survival function between the
times of each failure:
Time interval (hours)   Fraction surviving, $1 - d_j / n_j$   Estimated survival function, $\hat{S}(t) = \prod_{t_j \le t} (1 - d_j / n_j)$
$0 \le t < 50$        -                               $\hat{S}(t) = 1$
$50 \le t < 100$      $1 - 10/1{,}000 = 0.99$         $\hat{S}(t) = 1 \times 0.99 = 0.99$
$100 \le t < 250$     $1 - 20/990 = 0.979798$         $\hat{S}(t) = 0.99 \times 0.979798 = 0.97$
$250 \le t < 400$     $1 - 50/770 = 0.935065$         $\hat{S}(t) = 0.97 \times 0.935065 = 0.907013$
$400 \le t < 450$     $1 - 300/720 = 0.583333$        $\hat{S}(t) = 0.907013 \times 0.583333 = 0.529091$
$450 \le t < 500$     $1 - 50/420 = 0.880952$         $\hat{S}(t) = 0.529091 \times 0.880952 = 0.466104$

(ii) Sketch
The graph of the estimated survival function looks like this:
[Figure: Estimated survival function (Kaplan-Meier). Vertical axis: survival probability, 0.0 to 1.0. Horizontal axis: duration (hours), 0 to 500.]
(iii) Probabilities
The estimated probabilities at durations 300 and 400 are:
\[ \hat{S}(300) = 0.907013 \]
\[ \hat{S}(400) = 0.529091 \]
We cannot estimate S(600) since time 600 lies outside the range of our
observations.
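The product-limit calculation above can be checked with a short script. This is a minimal sketch rather than a prescribed method: the function name `kaplan_meier` and the list `bulbs` are illustrative, and the at-risk numbers are taken directly from the summary table for the light bulb study.

```python
def kaplan_meier(events):
    """Return [(time, S_hat)] from rows of (time, failures, number_at_risk).

    Rows must be sorted by failure time; the at-risk figures are assumed
    to already allow for any censoring before each failure time.
    """
    survival = 1.0
    steps = []
    for time, failures, at_risk in events:
        survival *= 1 - failures / at_risk   # multiply in this step's survival factor
        steps.append((time, survival))
    return steps


# (t_j, d_j, n_j) from the summary table for the light bulb study
bulbs = [(50, 10, 1000), (100, 20, 990), (250, 50, 770), (400, 300, 720), (450, 50, 420)]

for time, estimate in kaplan_meier(bulbs):
    print(f"S_hat(t) = {estimate:.6f} from t = {time}")
```

The estimate is a step function, so the value at any duration is the one attached to the latest failure time at or before that duration.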

(i) Explain what is meant by censoring
Censoring is where we have incomplete information about the durations at

which the decrements occurred, eg in a mortality study we might not know
the exact dates of death. The presence of censoring makes conclusions
based on the data less reliable than if we had the ‘full’ data.
(ii) Two types of censoring that are present
Type 1 censoring
Type 1 censoring is present since the study was terminated after a fixed time
period. So we do not know the time of recovery of any sufferers who were
still receiving the treatment at the end of the study.
Right censoring
Right censoring is present since some observations have
been cut short. This affects the 7 sufferers who left the study before they
had been cured.
Other possible answers are random censoring, interval censoring, and

informative or non-informative censoring, if accompanied by a suitable
description.
(iii) Nelson-Aalen estimate of the survival function
Time (days)   Outcome          Number involved
0             Start of study   100
2             Left             3
6             Recovered        2
7             Recovered        1
10            Recovered        1
10            Left             1
13            Left             3
14            Recovered        2
28            End of study     87

If we assume that censoring occurs after symptoms disappear, where these

two events occur on the same day, we can construct the usual summary
table for calculations of empirical survival rates:
Counter   Time of the j-th recovery   Number of recoveries at time t_j   Number at risk just before time t_j
j         t_j                         d_j                                n_j
1         6                           2                                  97
2         7                           1                                  95
3         10                          1                                  94
4         14                          2                                  89
According to the Nelson-Aalen model, the integrated hazard is estimated as:
\[ \hat{\Lambda}_t = \sum_{t_j \le t} \frac{d_j}{n_j} \]
and the survival probabilities are then calculated as:
\[ \hat{S}(t) = \exp\left(-\hat{\Lambda}_t\right) \]

So the survival function is as shown in the table.
Time interval (days)   Fraction recovering, d_j/n_j   Cumulative hazard, \(\hat{\Lambda}_t\)   Estimated survival function, \(\hat{S}(t) = \exp(-\hat{\Lambda}_t)\)
0 ≤ t < 6              –––                            0                                        1
6 ≤ t < 7              2/97 = 0.02062                 0.02062                                  0.97959
7 ≤ t < 10             1/95 = 0.01053                 0.03114                                  0.96934
10 ≤ t < 14            1/94 = 0.01064                 0.04178                                  0.95908
14 ≤ t ≤ 28            2/89 = 0.02247                 0.06426                                  0.93777
(iv) Sketch
[Figure: Estimated survival function (Nelson-Aalen). Vertical axis: 0.9 to 1.0. Horizontal axis: duration (days), 0 to 30.]

(v) Probability
The probability that a person using the cream will still have symptoms after
two weeks is \(\hat{S}(14)\), which equals 0.93777, ie approximately 94%.
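The Nelson-Aalen figures can be reproduced in a few lines. The sketch below is illustrative only: `nelson_aalen` and `cream` are hypothetical names, and the inputs are the (t_j, d_j, n_j) rows from the summary table for the cream study.

```python
import math


def nelson_aalen(events):
    """Return [(time, cumulative_hazard, S_hat)] from rows of (time, events, at_risk)."""
    cumulative_hazard = 0.0
    rows = []
    for time, count, at_risk in events:
        cumulative_hazard += count / at_risk          # add d_j / n_j
        rows.append((time, cumulative_hazard, math.exp(-cumulative_hazard)))
    return rows


# (t_j, d_j, n_j) for recoveries in the cream study
cream = [(6, 2, 97), (7, 1, 95), (10, 1, 94), (14, 2, 89)]

for time, hazard, survival in nelson_aalen(cream):
    print(f"t = {time}: integrated hazard = {hazard:.5f}, S_hat = {survival:.5f}")
```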
(i) Why the Gompertz model is commonly used
The Gompertz model is a simple model that is easy to apply. It has been
found to give a reasonably good description of human mortality at the older
ages where the rates increase exponentially.
(ii) Show that it is a Gompertz and a proportional hazards model
Gompertz
The model given can be expressed in the form:
 
x
 x  e  0 x  1U  2I  e  1U  2I e 0  Bc x
where B  e  1U  2I and c  e 0 are constants for each individual. So this
matches the form of a Gompertz model.
Proportional hazards
Alternatively, the model given can be expressed in the form:
  e  0 x  1U  2I  e

  0 x 1U   2I
 e  
x
 ( t ;z1,z2 ) 0 ( t ) e 1z1  2z2
where t  x is used as the measure of time, 0 (t )  e  0 x is the baseline

hazard and z1  U and z2  I are the two covariates. So this matches the
form of a Cox model, which is a proportional hazards model.

(iii) Predicted force of mortality
Here we have:
\(x = 40\), \(U = 1\) and \(I = 20{,}000\)
So the predicted force of mortality is:
\[ \mu_{40} = e^{-9.0 + 0.09 \times 40 + 0.3 \times 1 - 0.0001 \times 20{,}000} = e^{-7.1} = 0.000825 \]
(iv) Additional income
Let y denote the additional income required (with I being the rural
resident’s income). If we equate the force of mortality for the urban and rural
residents, we get:
  0 x  1  2 (I  y )    0 x   2I
e
 e 
urban rural
After cancelling, this becomes:
\[ e^{\beta_1 + \beta_2 y} = 1 \]
Taking logs and rearranging then gives:
\[ y = -\frac{\beta_1}{\beta_2} = \frac{0.3}{0.0001} = 3{,}000 \]
So the urban resident would require an extra $3,000.
(v) Survival probability
The predicted force of mortality at age \(40 + t\) for this individual, who is the
same as in part (iii), is:
\[ \mu_{40+t} = e^{-9.0 + 0.09(40 + t) + 0.3 \times 1 - 0.0001 \times 20{,}000} = e^{-7.1 + 0.09t} \]
The 10-year survival probability, \(S(10)\), for this resident satisfies:
\[
\begin{aligned}
\log S(10) &= -\int_0^{10} \mu_{40+t}\, dt \\
&= -\int_0^{10} e^{-7.1 + 0.09t}\, dt \\
&= -e^{-7.1} \int_0^{10} e^{0.09t}\, dt \\
&= -e^{-7.1} \left[ \frac{e^{0.09t}}{0.09} \right]_0^{10} \\
&= -e^{-7.1} \left( \frac{e^{0.9} - 1}{0.09} \right) = -0.01338
\end{aligned}
\]
So: \(S(10) = e^{-0.01338} = 0.98671\)
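As a numerical cross-check of the integration, the sketch below compares the closed-form answer with a trapezium-rule approximation. The function names and the step count are illustrative choices, not part of the original solution.

```python
import math


def force_of_mortality(t):
    """Force of mortality at age 40 + t for the individual in part (iii)."""
    return math.exp(-7.1 + 0.09 * t)


def survival_probability(years, steps=10_000):
    """exp(-integral of the force of mortality over [0, years]), via the trapezium rule."""
    width = years / steps
    interior = sum(force_of_mortality(i * width) for i in range(1, steps))
    integral = width * (0.5 * force_of_mortality(0) + interior + 0.5 * force_of_mortality(years))
    return math.exp(-integral)


# Closed form obtained by integrating e^{-7.1 + 0.09t} directly
closed_form = math.exp(-math.exp(-7.1) * (math.exp(0.9) - 1) / 0.09)

print(f"closed form: {closed_form:.5f}")
print(f"numerical:   {survival_probability(10):.5f}")
```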
(vi) Age with the same chance of surviving
We need to find the age \(40 + z\) for which \(S_{\text{rural}}(10) = S_{\text{urban}}(10)\), or
equivalently \(\log S_{\text{rural}}(10) = \log S_{\text{urban}}(10)\).
We already know from part (v) that, for an individual with an income of
$20,000:
\[ \log S_{\text{urban}}(10) = -e^{-7.1} \left( \frac{e^{0.9} - 1}{0.09} \right) \]
By a similar calculation, we can see that, for a rural resident:
\[ \mu_{40+t} = e^{-9.0 + 0.09(40 + t) + 0.3 \times 0 - 0.0001 \times 20{,}000} = e^{-7.4 + 0.09t} \]
\[
\begin{aligned}
\log S_{\text{rural}}(10) &= -\int_0^{10} \mu_{40+z+t}\, dt \\
&= -\int_0^{10} e^{-7.4 + 0.09z + 0.09t}\, dt \\
&= -e^{-7.4 + 0.09z} \left( \frac{e^{0.9} - 1}{0.09} \right)
\end{aligned}
\]
So, equating these two expressions shows that we need to have:
\[ -7.1 = -7.4 + 0.09z \;\Rightarrow\; z = \frac{0.3}{0.09} = 3\tfrac{1}{3} \]
So the required age is \(43\tfrac{1}{3}\).
(i) Baseline characteristics
The baseline hazard applies to individuals who were exact age 50 at the
start of the investigation, who are not heavy drinkers and who have not lived
for 12 months in a tropical country.
(ii) Calculate the \(\beta\)'s
The hazard rates for the individuals in the first bullet point are
\(h_0(t) \exp(10\beta_A + \beta_C)\) and \(h_0(t) \exp(0)\).
So:
\[ h_0(t) \exp(10\beta_A + \beta_C) = 2 h_0(t) \exp(0) \;\Rightarrow\; 10\beta_A + \beta_C = \ln 2 \]
The hazard rates for the individuals in the second bullet point are
\(h_0(t) \exp(\beta_A A + \beta_C + \beta_T)\) and \(h_0(t) \exp(\beta_A A)\).
So:
\[ h_0(t) \exp(\beta_A A + \beta_C + \beta_T) = 4 h_0(t) \exp(\beta_A A) \;\Rightarrow\; \beta_C + \beta_T = \ln 4 \]
The hazard rates for the individuals in the third bullet point are
\(h_0(t) \exp(\beta_A A + \beta_C C + \beta_T)\) and \(h_0(t) \exp(\beta_A A + \beta_C C)\).
So:
\[ h_0(t) \exp(\beta_A A + \beta_C C + \beta_T) = 3 h_0(t) \exp(\beta_A A + \beta_C C) \;\Rightarrow\; \beta_T = \ln 3 \]
Working in reverse order, we can now find the values of the three \(\beta\)'s:
\[ \beta_T = 1.0986, \quad \beta_C = 0.2877, \quad \beta_A = 0.04055 \]

(iii) Probability calculation
The first individual, whose 10-year survival probability is 0.8, is a baseline
individual.
So:
\[ S_1(10) = \exp\left( -\int_0^{10} h_0(t)\, dt \right) = 0.8 \]
The hazard rate for the second individual is \(h_0(t) e^{\beta_T}\). So their 10-year
survival probability is:
\[
\begin{aligned}
S_2(10) &= \exp\left( -\int_0^{10} h_0(t) e^{\beta_T}\, dt \right) = \exp\left( -e^{\beta_T} \int_0^{10} h_0(t)\, dt \right) \\
&= \left[ \exp\left( -\int_0^{10} h_0(t)\, dt \right) \right]^{e^{\beta_T}} = 0.8^3 = 0.512
\end{aligned}
\]
This is slightly over one half, as stated.
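The back-substitution for the three parameters, and the resulting survival probability, can be verified numerically. This is only a sketch; the variable names are chosen here for illustration.

```python
import math

# Solve the three equations in reverse order, as in part (ii)
beta_T = math.log(3)                     # third bullet: beta_T = ln 3
beta_C = math.log(4) - beta_T            # second bullet: beta_C + beta_T = ln 4
beta_A = (math.log(2) - beta_C) / 10     # first bullet: 10 * beta_A + beta_C = ln 2

# Part (iii): raise the baseline survival probability to the power e^{beta_T}
baseline_survival = 0.8
second_individual = baseline_survival ** math.exp(beta_T)

print(f"beta_T = {beta_T:.4f}, beta_C = {beta_C:.4f}, beta_A = {beta_A:.5f}")
print(f"S_2(10) = {second_individual:.3f}")
```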
(i) Censoring
Censoring is where we have incomplete information about the durations at
which the decrements occurred, eg in a mortality study we might not know
the exact dates of death or the exact dates of entry. The presence of
censoring makes conclusions based on the data less reliable than if we had
the ‘full’ data.
(ii) Right / left / interval censoring
Right censoring is where an observation is cut short, so that we don’t know

the precise time when the event of interest occurred, only that it occurred
after a certain time. An example would be policyholders who cancel their
policies during a life office mortality investigation.
Left censoring is where we don’t know the precise time of entry, only that it
occurred before a certain time. An example would be in a medical
investigation of a disease when we don’t know the precise time of onset.
Interval censoring is where we can only say that the duration at the time of
the event of interest lies within a certain interval. An example would be in a
medical investigation where patients are only observed at six-monthly
intervals.

(iii) Which forms of censoring are present?
Right censoring is present here for the toys that were unplugged / stolen.
We don’t know when they would have failed if they had been allowed to
continue operating.
Random censoring is present here for the toys that were unplugged / stolen.
These events could not have been anticipated.
Non-random censoring is present here for the toys that were still operating at
the end of the observation period (5pm on the second day). All observations
were due to stop at this point anyway.
Non-informative censoring is present for the toys that were unplugged, since
there was nothing special to distinguish the 17 that were affected.
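(iv) Nelson-Aalen estimate of the survival function
According to the Nelson-Aalen model, the survival function is
estimated as:
\[ \hat{S}(t) = \exp\left( -\hat{\Lambda}(t) \right) \]
where the integrated hazard is calculated as:
\[ \hat{\Lambda}(t) = \sum_{t_j \le t} \frac{d_j}{n_j} \]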
Time of day   Hours since the start (t)   Outcome                Number of toys affected
9am           0                           Start of study         500
1pm           4                           Failed                 12
7pm           10                          Censored (unplugged)   17
8pm           11                          Failed                 25
10pm          13                          Censored (stolen)      3
4pm           31                          Failed                 8
5pm           32                          End of study           435

We can construct the usual summary table for calculations of empirical
survival rates:
Counter   Time of the j-th failure   Number of failures at time t_j   Number at risk just before time t_j
j         t_j                        d_j                              n_j
1         4                          12                               500
2         11                         25                               471
3         31                         8                                443
We can then calculate the estimates of the integrated
hazard between each of the failure times:
Time interval   Fraction of remaining toys that fail, d_j/n_j   Integrated hazard function, \(\hat{\Lambda}(t)\)   Survival function, \(\hat{S}(t) = e^{-\hat{\Lambda}(t)}\)
0 ≤ t < 4       0                                               0                                                  1
4 ≤ t < 11      12/500 = 0.024                                  0.024                                              0.97629
11 ≤ t < 31     25/471 = 0.05308                                0.024 + 0.05308 = 0.07708                          0.92582
31 ≤ t ≤ 32     8/443 = 0.01806                                 0.07708 + 0.01806 = 0.09514                        0.90925

(v) Graph

[Figure: Estimated survival function. Vertical axis: 0.8 to 1.0. Horizontal axis: duration (hours), 0 to 35.]
(vi) Comment
At the highest duration (32 hours) we have estimated the survival probability
to be 90.92%, which is much higher than 60%. So a survival probability of
60% corresponds to a time greater than 32 hours. However, we have no
data for failures beyond this time. So all we can say is that the length of time
for which a new toy has a 60% survival probability exceeds 32 hours.
(i) Definition of the force of mortality
If \(T\) denotes a person’s total lifetime, the force of mortality \(\mu_{x+t}\) can be
defined as:
\[ \mu_{x+t} = \lim_{h \to 0} \frac{1}{h} P(T \le x + t + h \mid T > x + t) \]
(ii) Probability of surviving to age 5
The probability that an animal will survive from birth to exact age 5 years
is \(e^{-5\mu}\).

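(iii) Calculate \(\lambda\)
We are told that:
\[ {}_5p_5 = 2 \times {}_{15}p_5 \]
Since the force of mortality from age 5 onwards has a constant value \(\lambda\), this
is:
\[ e^{-5\lambda} = 2e^{-15\lambda} \]
So:
\[ e^{10\lambda} = 2 \;\Rightarrow\; \lambda = \tfrac{1}{10} \ln 2 = 0.0693 \]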
(iv) Expectation of life if \(\lambda = \mu\)
Method 1
In this case, the force of mortality takes a constant value \(\mu\) throughout the
whole of the animal’s life.
So the lifetime will have an \(\text{Exp}(\mu)\) distribution, which has a mean of \(\frac{1}{\mu}\).
Method 2
The formula for calculating the expected lifetime is \(\mathring{e}_x = \int_0^\infty {}_tp_x\, dt\). If the
force of mortality takes a constant value \(\mu\), we have:
\[ \mathring{e}_0 = \int_0^\infty {}_tp_0\, dt = \int_0^\infty e^{-\mu t}\, dt = \left[ -\frac{1}{\mu} e^{-\mu t} \right]_0^\infty = -\frac{1}{\mu}(0 - 1) = \frac{1}{\mu} \]
So the numerical value is:
\[ \frac{1}{\mu} = \frac{1}{0.0693} = 14.427 \text{ years} \]

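(v) Expectation of life if \(\lambda \ne \mu\)
In this case:
\[
\begin{aligned}
\mathring{e}_0 &= \int_0^\infty {}_tp_0\, dt \\
&= \int_0^5 {}_tp_0\, dt + \int_5^\infty {}_tp_0\, dt \\
&= \int_0^5 {}_tp_0\, dt + \int_5^\infty {}_5p_0\; {}_{t-5}p_5\, dt
\end{aligned}
\]
We can then use the facts that \({}_tp_0 = e^{-\mu t}\) when \(t \le 5\) and \({}_{t-5}p_5 = e^{-\lambda(t-5)}\)
when \(t \ge 5\).
So we get:
\[
\begin{aligned}
\mathring{e}_0 &= \int_0^5 e^{-\mu t}\, dt + \int_5^\infty e^{-5\mu} e^{-\lambda(t-5)}\, dt \\
&= \left[ -\frac{1}{\mu} e^{-\mu t} \right]_0^5 + e^{-5\mu} \left[ -\frac{1}{\lambda} e^{-\lambda(t-5)} \right]_5^\infty \\
&= \frac{1}{\mu}\left(1 - e^{-5\mu}\right) + \frac{1}{\lambda} e^{-5\mu} \quad \text{or} \quad \frac{1}{\mu} + e^{-5\mu}\left( \frac{1}{\lambda} - \frac{1}{\mu} \right)
\end{aligned}
\]
Substituting the value for \(\lambda\) found in part (iii):
\[ \mathring{e}_0 = \frac{1}{\mu}\left(1 - e^{-5\mu}\right) + 14.43 e^{-5\mu} \quad \text{or} \quad \frac{1}{\mu} + e^{-5\mu}\left( 14.43 - \frac{1}{\mu} \right) \]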
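A quick way to sanity-check the two-part formula is to confirm that it collapses to \(1/\mu\) when the two forces of mortality are equal. The sketch below assumes nothing beyond the formula derived above; the function name and the `switch_age` parameter are illustrative.

```python
import math


def complete_expectation(mu, lam, switch_age=5.0):
    """Expected lifetime when the force of mortality is mu up to switch_age and lam afterwards."""
    survive_to_switch = math.exp(-mu * switch_age)
    return (1 - survive_to_switch) / mu + survive_to_switch / lam


lam = math.log(2) / 10   # value of lambda found in part (iii)

# With mu equal to lambda the formula should reduce to 1 / mu
print(f"{complete_expectation(lam, lam):.3f}")
print(f"{1 / lam:.3f}")
```

Both printed values agree, which matches the answer to part (iv).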

(i) Types of censoring that are present
Right censoring
Right censoring is present since some observations have
been cut short. This affects those patients who died or had a second
operation.
Type I censoring
Type I censoring is present since the study was terminated after a fixed time
period. So we do not know the time of leaving the hospital for patients who
were still in the hospital at the end of the study.
Type II censoring
Type II censoring is where the study is terminated after a certain quota of

events has occurred. This is not the case here.
Random censoring
Random censoring is present for those patients who died or had a second
operation before the end of the study, as these events could not be
predicted in advance.
(ii) Informative censoring
Informative censoring is present where lives who leave the study through
censoring can be expected to influence the likelihood of decrements within
the remaining population. This is likely to be the case here since the
patients removed by censoring (death or a second operation) will be in a
worse condition than those remaining in the hospital and so would have
probably had longer stays than those who were not censored.
We first need to calculate the number of days each patient was in the study
by subtracting the date of the operation from the date that observation
ended. This leads to the values:
28, 2, 14, 31, 14, 15, 1, 36, 7, 24, 14

We can then summarise the observations as follows:
Time (days)   Outcome          Number involved
0             Start of study   11
1             Death            1
2             Death            1
7             2nd operation    1
14            Left hospital    3
15            Left hospital    1
24            Left hospital    1
28            2nd operation    1
31            Left hospital    1
36            End of study     1
We can construct the usual summary table for calculations of empirical
survival rates (where ‘failure’ corresponds to leaving the hospital in this
scenario):
Counter   Time of the j-th failure   Number of failures at time t_j   Number at risk just before time t_j
j         t_j                        d_j                              n_j
1         14                         3                                8
2         15                         1                                5
3         24                         1                                4
4         31                         1                                2

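For the Kaplan-Meier method, the survival function up to time \(t\) is estimated
as:
\[ \hat{S}(t) = \prod_{t_j \le t} \left(1 - \frac{d_j}{n_j}\right) \]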

We can then calculate the estimates of the survival function between the
times of each failure:
Time interval (days)   Fraction surviving, 1 - d_j/n_j   Estimated survival function, \(\hat{S}(t)\)
0 ≤ t < 14             –––                               1
14 ≤ t < 15            1 - 3/8 = 5/8 = 0.625             1 × 5/8 = 5/8 (= 0.625)
15 ≤ t < 24            1 - 1/5 = 4/5 = 0.8               5/8 × 4/5 = 1/2 (= 0.5)
24 ≤ t < 31            1 - 1/4 = 3/4 = 0.75              1/2 × 3/4 = 3/8 (= 0.375)
31 ≤ t ≤ 36            1 - 1/2 = 1/2 = 0.5               3/8 × 1/2 = 3/16 (= 0.1875)
(iv) Sketch
[Figure: Estimated survival function (Kaplan-Meier). Vertical axis: 0.0 to 1.0. Horizontal axis: duration (days), 0 to 30.]

(v) Comment on the results
We can see that a lot of patients left the hospital after 14 or 15 days. This
suggests that the typical recovery period is 2 to 3 weeks, or that patients are
routinely kept in hospital for 2 weeks after surgery before they are
discharged.
We can also see from the table that the two deaths occurred very soon after
the original surgery. This suggests that the mortality rate is highest in the
first few days.
Since informative censoring is present, the results may not be reliable.
(a) Survival probability for a beetle in the protected environment
The mortality rates for the beetle are \(\mu\) in the protected environment and
\(1.5\mu\) in the wild.
Using the information given:
\[ e^{-1.5\mu \times 8} = 0.58 \;\Rightarrow\; e^{-12\mu} = 0.58 \;\Rightarrow\; \mu = -\frac{1}{12} \ln 0.58 = 0.04539 \]
The probability of a beetle surviving for 8 days in the protected environment
is:
\[ e^{-\mu \times 8} = e^{-0.04539 \times 8} = 0.69548 \text{, or approximately 70\%} \]
(b) Survival probability for a beetle due to be released
We can calculate this probability by splitting the time period into two parts –
before and after time 6:
\[ \underbrace{e^{-\mu \times 6}}_{\text{in protected environment}} \times \underbrace{e^{-1.5\mu \times 2}}_{\text{in wild}} = e^{-9\mu} = e^{-9 \times 0.04539} = 0.6646 \]
or approximately 66%.

(i) Proportional hazards models
Proportional hazards models are used to describe the hazard rate of

individuals where this depends on both duration (the time since a specified
event) and other covariates.
The hazard rate for each individual consists of a baseline hazard, which is a
non-parametric component that depends only on the duration, multiplied by
a parametric function that depends only on the values of the covariates for
the individual.
The model is ‘proportional’ because the hazard rate for each individual
always remains in the same proportion to the baseline hazard (and hence
also to other individuals).
The baseline hazard rate corresponds to an individual with all covariates

equal to zero.
(ii) Advantages of the Cox model
The Cox regression model allows us to compare individuals with different

covariates (eg males and females) without needing to consider the form of
the baseline hazard rates.
The Cox model is a commonly used model and reliable software is available
for carrying out the required calculations.
The exponential function ensures that the hazard rate is always positive.
It is a semi-parametric model, so the baseline hazard rate does not need to

be specified in advance.

(i)(a) Two types of censoring
Right censoring is where an observation is cut short, so that we don’t know

the precise time when the event of interest occurred, only that it occurred
after a certain time.
Random censoring is where an observation is censored at a time that was

not specified in the observational plan, and so the precise time could not
have been anticipated.
(i)(b) Examples of censoring
Right censoring is present here for the people the researchers lost contact
with and for the people who still did not have a job at the end of the study
period. We don’t know when they would have found a job if we had
continued to monitor them.
Random censoring is also present here for the people the researchers lost
contact with. The timing of these events could not have been anticipated.
You could also have mentioned:

● interval censoring, which applies to those people who found a job, as we
only know that they found employment at some point during the
previous month
● Type I censoring, which applies to the people who still did not have a job
by the end of the study period
● non-random censoring, which applies to the people who still did not
have a job by the end of the study period
● informative censoring, which applies to the people the researchers lost
contact with if this was partly as a result of those individuals having
found a job without telling the researchers.

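For the Kaplan-Meier method, the survival function up to time \(t\) is estimated
as:
\[ \hat{S}(t) = \prod_{t_j \le t} \left(1 - \frac{d_j}{n_j}\right) \]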

We can see from the table that, at the end of the first month, we had lost
contact with 50 people. However, it is not clear from the information given
whether we should treat these people as having been censored:
(a) at random times during the previous month or
(b) at the end of the first month (when we find that they have not turned up
for their interview).
This affects the calculations because in case (a) the censoring obviously
occurred before the decrements at time 1, whereas in case (b), we would
need to assume that the censoring occurred after the decrements at time 1
(as specified in the Core Reading).
We will adopt approach (b), but the examiners have said that either
approach was equally valid, provided that the assumption made was clearly
stated.
We can construct the usual summary table for calculations of empirical

survival rates.
Counter   Time   Found employment   Censored   At risk
j         t_j    d_j                c_j        n_j
1         1      100                50         700
2         2      70                 0          550
3         3      50                 20         480
4         4      40                 20         410
5         5      20                 30         350
6         6      20                 60         300
7         7      12                 38         220
8         8      6                  164        170

We can then calculate the estimates of the survival function for each time
interval.

Time interval (months)   1 - d_j/n_j                \(\hat{S}(t) = \prod_{t_j \le t} (1 - d_j/n_j)\)
0 ≤ t < 1                –––                        1
1 ≤ t < 2                1 - 100/700 = 0.85714      1 × 0.85714 = 0.85714
2 ≤ t < 3                1 - 70/550 = 0.87273       0.85714 × 0.87273 = 0.74805
3 ≤ t < 4                1 - 50/480 = 0.89583       0.74805 × 0.89583 = 0.67013
4 ≤ t < 5                1 - 40/410 = 0.90244       0.67013 × 0.90244 = 0.60475
5 ≤ t < 6                1 - 20/350 = 0.94286       0.60475 × 0.94286 = 0.57019
6 ≤ t < 7                1 - 20/300 = 0.93333       0.57019 × 0.93333 = 0.53218
7 ≤ t < 8                1 - 12/220 = 0.94545       0.53218 × 0.94545 = 0.50315
t = 8                    1 - 6/170 = 0.96471        0.50315 × 0.96471 = 0.48539
(iii) Goodness-of-fit test of Weibull distribution
We can carry out a chi-squared test of the hypotheses:
H0 : The underlying rates of obtaining a job conform to a Weibull distribution.
H1 : The underlying rates of obtaining a job do not conform to a Weibull

distribution.

The test statistic for this test is:
\[ \chi^2 = \sum_{t=1}^{8} \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}} \]
Month (t)   \(h(t) = \lambda^\beta \beta t^{\beta - 1}\)   n_j   Expected number   Observed number (d_j)   Contribution to \(\chi^2\)
1           0.17935                                        700   125.55            100                     5.20
2           0.11040                                        550   60.72             70                      1.42
3           0.08312                                        480   39.90             50                      2.56
4           0.06796                                        410   27.86             40                      5.29
5           0.05813                                        350   20.35             20                      0.01
6           0.05117                                        300   15.35             20                      1.41
7           0.04593                                        220   10.11             12                      0.36
8           0.04184                                        170   7.11              6                       0.17
The chi-squared value is:
\[ \chi^2 = 5.20 + 1.42 + \cdots + 0.17 = 16.40 \]
This is a one-sided test and, under the null hypothesis, this statistic should
have a chi-squared distribution. There are 8 months and 2 parameters have
been estimated from the data. So the number of degrees of freedom in this
case is 6.
Since 16.40 is greater than 12.59 (the 95th percentile of the \(\chi^2_6\)
distribution), we reject the null hypothesis that the rates of finding a job are
consistent with a Weibull distribution.
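The test statistic can be recomputed from the tabulated hazards and exposures. The sketch below is illustrative: the list names are chosen here, the hazard values are copied from the table rather than derived from fitted Weibull parameters, and the critical value 12.59 is the figure quoted above.

```python
# Tabulated Weibull hazards h(t), numbers at risk n_j and observed events d_j
hazard = [0.17935, 0.11040, 0.08312, 0.06796, 0.05813, 0.05117, 0.04593, 0.04184]
at_risk = [700, 550, 480, 410, 350, 300, 220, 170]
observed = [100, 70, 50, 40, 20, 20, 12, 6]

# Expected number finding a job each month = hazard x number at risk
expected = [h * n for h, n in zip(hazard, at_risk)]

contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_squared = sum(contributions)

critical_value = 12.59   # 95th percentile of chi-squared with 6 degrees of freedom
print(f"chi-squared = {chi_squared:.2f}")
print("reject H0" if chi_squared > critical_value else "do not reject H0")
```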

(i) Censoring
Censoring is where we have incomplete information about the durations at
which the event of interest occurred.
Right censoring occurs when an observation is cut short, ie an individual is

removed from observation prior to the event of interest occurring. So we
don’t know the precise time when the event of interest occurred, only that it
occurred after a certain time.
Type I censoring occurs when the censoring times are known in advance
and individuals under observation are considered to have been censored if
the event of interest has not occurred by a specified date.
Random censoring occurs when the censoring times are not known in
advance but are considered to be random variables for individuals that are
removed from observation before the event of interest has occurred.
(ii) Which forms of censoring are present?
(ii)(a) A policyholder dies
Right censoring is present because observation of the policyholder ceased

for a different reason before a lapse occurred.
Type I censoring did not occur because the time when the policyholder was
censored was not known in advance.
Random censoring is present because the time when the policyholder was
censored was not known in advance.
(ii)(b) Migration to a new administration system
Right censoring is present because observation of the policyholder ceased
for a different reason before a lapse occurred.
Type I censoring may be present if it was known at the outset that this
migration would occur.
Random censoring may be present if it was not known at the outset that this
migration would occur.

(ii)(c) A policy matures
Right censoring is present because observation of the policyholder ceased
for a reason other than a lapse.
Type I censoring is present since it was known at the outset that the
policyholder’s policy would mature on this date.
Random censoring is not present since it was known at the outset that the
policyholder’s policy would mature on this date.
(i) Proportional hazards models
Proportional hazards models are used to describe the hazard rate of

individuals where this depends on both duration (the time since a specified
event) and other covariates.
The hazard rate for each individual consists of a baseline hazard, which is a
non-parametric component that depends only on the duration, multiplied by
a parametric function that depends only on the values of the covariates for
the individual.
The model is ‘proportional’ because the hazard rate for each individual
always remains in the same proportion to the baseline hazard (and hence
also to other individuals).
The baseline hazard rate corresponds to an individual with all covariates

equal to zero.
(ii) Baseline
The ‘baseline cow’ is one for which the covariates x and z equal zero, ie a
cow assigned to the previous treatment, with treatment starting immediately.
(iii) Number of days before treatment
The hazard rate for a cow receiving the new treatment is:
\[ h_{\text{NEW}}(t, x) = h_0(t) \exp(0.8 + 0.4x - 0.1x) = h_0(t) \exp(0.8 + 0.3x) \]
The hazard rate for a cow receiving the previous treatment is:
\[ h_{\text{OLD}}(t, x) = h_0(t) \exp(0.4x) \]
If these are equal, then:
\[ h_0(t) \exp(0.8 + 0.3x) = h_0(t) \exp(0.4x) \;\Rightarrow\; 0.8 + 0.3x = 0.4x \;\Rightarrow\; 0.8 = 0.1x \;\Rightarrow\; x = 8 \]
So treatment started after 8 days.
(iv) Proportion still having the condition after 14 days
We are told that the median recovery time for cows on the previous
treatment with \(x = 3\) was 14 days. So 50% of the cows will still have the
condition after 14 days, ie:
\[ S_{\text{OLD}}(14) = \exp\left( -\int_0^{14} h_0(t) e^{3\beta_1}\, dt \right) = \exp\left( -\int_0^{14} h_0(t) e^{1.2}\, dt \right) = 0.5 \]
The proportion having the condition after 14 days with the new treatment is:
\[ S_{\text{NEW}}(14) = \exp\left( -\int_0^{14} h_0(t) e^{\beta_0 + 3\beta_1 + 3\beta_2}\, dt \right) = \exp\left( -\int_0^{14} h_0(t) e^{1.7}\, dt \right) \]
We can calculate this as:
\[
\begin{aligned}
S_{\text{NEW}}(14) &= \exp\left( -\int_0^{14} h_0(t) e^{1.2}\, dt \times e^{0.5} \right) \\
&= \left[ \exp\left( -\int_0^{14} h_0(t) e^{1.2}\, dt \right) \right]^{e^{0.5}} = 0.5^{\left(e^{0.5}\right)} = 0.3189
\end{aligned}
\]
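The step used here is that, under proportional hazards, multiplying the hazard by a constant factor raises the survival probability to that power. A minimal sketch of this rule follows; the function name `scaled_survival` is hypothetical.

```python
import math


def scaled_survival(reference_survival, log_hazard_ratio):
    """Survival probability for a life whose hazard is exp(log_hazard_ratio) times the reference hazard."""
    return reference_survival ** math.exp(log_hazard_ratio)


# Old treatment with x = 3 has exponent 1.2 and S(14) = 0.5;
# the new treatment with x = 3 has exponent 1.7, so the log hazard ratio is 0.5
new_treatment = scaled_survival(0.5, 1.7 - 1.2)
print(f"S_NEW(14) = {new_treatment:.4f}")
```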

(i) Nelson-Aalen estimate of the survival function
Assuming that the decrement of interest is passing the exam, we can use
the data provided to construct the usual summary table required for
calculating empirical survival rates:
Counter   j-th time at which a pass occurs   Number of passes at time t_j   Number remaining just before time t_j
j         t_j                                d_j                            n_j
1         7                                  2                              27
2         14                                 5                              23
3         27                                 6                              18
4         36                                 3                              7

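According to the Nelson-Aalen model, the survival function is
estimated as:
\[ \hat{S}(t) = \exp\left( -\hat{\Lambda}(t) \right) \]
where the integrated hazard is calculated as:
\[ \hat{\Lambda}(t) = \sum_{t_j \le t} \frac{d_j}{n_j} \]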

We can then calculate the estimates of the integrated
hazard between each of the times when students passed:
Time interval   Fraction of students passing, d_j/n_j   Integrated hazard function, \(\hat{\Lambda}(t)\)   Survival function, \(\hat{S}(t) = e^{-\hat{\Lambda}(t)}\)
0 ≤ t < 7       0                                       0                                                  1
7 ≤ t < 14      2/27 = 0.07407                          0.07407                                            0.92860
14 ≤ t < 27     5/23 = 0.21739                          0.07407 + 0.21739 = 0.29147                        0.74717
27 ≤ t < 36     6/18 = 0.33333                          0.29147 + 0.33333 = 0.62480                        0.53537
36 ≤ t ≤ 39     3/7 = 0.42857                           0.62480 + 0.42857 = 1.05337                        0.34876
(ii) Graph

[Figure: Estimated survival function. Vertical axis: 0.0 to 1.0. Horizontal axis: duration (weeks), 0 to 40.]

(iii) Probability of passing by the end of the year
From our calculations based on the Nelson-Aalen method in part (i), the
probability that a student who starts the course will pass the exam by the
end of the year (after adjusting for those who drop out) is:
\[ 1 - \hat{S}(39) = 1 - 0.34876 = 0.65124 \]
(iv) Comment
The school’s logic is that 16 students passed while 4 remained at the end of
the year who had not passed, so the pass rate is \(\frac{16}{16 + 4} = \frac{16}{20} = 80\%\).
This would be a valid claim if no students had dropped out. However, a

significant number of students dropped out during the year (and therefore
never passed the exam) and this has not been taken into account.
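(i) Kaplan-Meier estimate of the survival function
For the Kaplan-Meier method, the survival function up to time \(t\) is estimated
as:
\[ \hat{S}(t) = \prod_{t_j \le t} \left(1 - \frac{d_j}{n_j}\right) \]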
We first need to calculate the duration in the queue for each customer. For
those who made a purchase, this is the difference between the ‘time joined’
and the ‘time purchase completed’. For those who were censored, ie left
without making a purchase, this is the difference between the ‘time joined’
and the ‘time left without making purchase’. Using ‘+’ to denote a censored
observation, these are:
8, 2, 6, 6, 2, 4, 10+, 6, 5+, 11, 4, 7+
Sorting these into ascending order, we have:
2, 2, 4, 4, 5+, 6, 6, 6, 7+, 8, 10+, 11

We can construct the usual summary table for estimating empirical survival
rates.
Counter   Time   At risk   Made a purchase   Censored in (t_j, t_{j+1})
j         t_j    n_j       d_j               c_j
1         2      12        2                 0
2         4      10        2                 1
3         6      7         3                 1
4         8      3         1                 1
5         11     1         1                 0
We can then calculate the Kaplan-Meier estimates of the survival function for
each time interval.

Time interval (minutes)   1 - d_j/n_j         \(\hat{S}(t) = \prod_{t_j \le t} (1 - d_j/n_j)\)
0 ≤ t < 2                 –––                 1
2 ≤ t < 4                 1 - 2/12 = 5/6      1 × 5/6 = 5/6 = 0.83333
4 ≤ t < 6                 1 - 2/10 = 4/5      5/6 × 4/5 = 2/3 = 0.66667
6 ≤ t < 8                 1 - 3/7 = 4/7       2/3 × 4/7 = 8/21 = 0.38095
8 ≤ t < 11                1 - 1/3 = 2/3       8/21 × 2/3 = 16/63 = 0.25397
t ≥ 11                    1 - 1/1 = 0         16/63 × 0 = 0
(ii) Daily cost
The estimated expected daily cost is:
\[ 20{,}000 \times \$2 \times \hat{S}(10) = \$40{,}000 \times \frac{16}{63} = \$10{,}159 \]
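The whole calculation, from the raw queue durations to the daily cost, can be sketched as follows. The function and variable names are illustrative, and the code assumes (as in the solution) that a censoring at the same time as a purchase is treated as occurring after it. Exact fractions are used so that the result can be compared directly with 16/63.

```python
from fractions import Fraction


def kaplan_meier_from_durations(durations):
    """Return {event_time: S_hat} from (duration, observed) pairs.

    observed is False for a censored duration (customer left without purchasing).
    """
    event_times = sorted({time for time, observed in durations if observed})
    survival = Fraction(1)
    estimate = {}
    for t in event_times:
        at_risk = sum(1 for time, _ in durations if time >= t)
        events = sum(1 for time, observed in durations if observed and time == t)
        survival *= 1 - Fraction(events, at_risk)
        estimate[t] = survival
    return estimate


# Durations in minutes; False marks the censored observations 10+, 5+ and 7+
queue = [(8, True), (2, True), (6, True), (6, True), (2, True), (4, True),
         (10, False), (6, True), (5, False), (11, True), (4, True), (7, False)]

km = kaplan_meier_from_durations(queue)
survival_at_10 = km[8]                      # latest purchase time at or before 10 minutes
daily_cost = 20_000 * 2 * survival_at_10
print(survival_at_10, round(float(daily_cost)))
```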

(iii) Assumptions
This calculation assumes that:

● the new scheme won’t change customers’ behaviour, eg they may be
more likely to remain in the queue, knowing they might get the $2 refund
● awarding the $2 refunds (or explaining the new system to customers)
won’t slow down the customers going through the check-out tills
● the time of day when the investigation was done is typical of the day as
a whole.
Our calculations also depend on the assumptions underlying the

Kaplan-Meier model, ie a homogeneous population, independent
decrements and non-informative censoring. So you could also mention
whether you thought these assumptions were valid here. For example,
informative censoring is likely to be present since, if someone leaves the
queue, the waiting times of the customers behind this person will be
reduced.
(i) Hazard function
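The hazard rate for a customer whose contract began \(t\) years ago will be:
\[ \lambda(t; z_1, z_2, z_3, z_4) = \lambda_0(t) \exp(\beta_1 z_1 + \beta_2 z_2 + \beta_3 z_3 + \beta_4 z_4) \]
where \(\lambda_0(t)\) is the baseline hazard at time \(t\).
The covariates are defined as follows:
\[
z_1 = \begin{cases} 1 & \text{Male} \\ 0 & \text{Female} \end{cases}
\qquad
z_2 = \begin{cases} 1 & \text{High consumption} \\ 0 & \text{Low consumption} \end{cases}
\]
\[
z_3 = \begin{cases} 1 & \text{City Centre} \\ 0 & \text{Other} \end{cases}
\qquad
z_4 = \begin{cases} 1 & \text{Rural area} \\ 0 & \text{Other} \end{cases}
\]
\(\beta_1, \beta_2, \beta_3, \beta_4\) are regression parameters. The estimated values for these
are \(\hat{\beta}_1 = -0.25\), \(\hat{\beta}_2 = 0.32\), \(\hat{\beta}_3 = 0.19\) and \(\hat{\beta}_4 = -0.35\).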

(ii) Baseline hazard
The baseline person is a female consumer with low energy consumption

living in a location designated as ‘city (not centre)’.
(iii) Confidence intervals for parameters
Assuming a normal approximation gives the following 95% confidence
intervals:
\[ \beta_1: \; -0.25 \pm 1.96\sqrt{0.015} = -0.25 \pm 0.24 = (-0.49, -0.01) \]
\[ \beta_2: \; 0.32 \pm 1.96\sqrt{0.008} = 0.32 \pm 0.18 = (0.14, 0.50) \]
\[ \beta_3: \; 0.19 \pm 1.96\sqrt{0.012} = 0.19 \pm 0.21 = (-0.02, 0.40) \]
\[ \beta_4: \; -0.35 \pm 1.96\sqrt{0.005} = -0.35 \pm 0.14 = (-0.49, -0.21) \]
(iv) Test the suggestion
We are testing the hypotheses:
\(H_0: \beta_1 = 0\) (ie women change providers with the same frequency as men).
\(H_1: \beta_1 < 0\) (ie women change providers more frequently than men).
The \(z\)-value for this one-sided test is:
\[ z = \frac{-0.25 - 0}{\sqrt{0.015}} = -2.041 \]
Since this is less than –1.6449 (the 5th percentile of the standard normal
distribution), we reject H0 and conclude that the data provides evidence at
the 5% significance level to suggest that women do change providers more
often than men.
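The intervals in part (iii) and the test statistic in part (iv) can be reproduced as below. This is a sketch under the assumption that 0.015, 0.008, 0.012 and 0.005 are the variances of the parameter estimators, which is how the figures above appear to have been used; the dictionary and function names are illustrative.

```python
import math

# Parameter estimate and assumed variance of the estimator for each covariate
estimates = {
    "beta1": (-0.25, 0.015),
    "beta2": (0.32, 0.008),
    "beta3": (0.19, 0.012),
    "beta4": (-0.35, 0.005),
}


def confidence_interval(estimate, variance, z=1.96):
    """Approximate 95% confidence interval using the normal approximation."""
    half_width = z * math.sqrt(variance)
    return estimate - half_width, estimate + half_width


for name, (estimate, variance) in estimates.items():
    lower, upper = confidence_interval(estimate, variance)
    print(f"{name}: ({lower:.2f}, {upper:.2f})")

# One-sided test of H0: beta1 = 0 against H1: beta1 < 0
z_statistic = (-0.25 - 0) / math.sqrt(0.015)
print(f"z = {z_statistic:.3f}, reject H0 at 5%: {z_statistic < -1.6449}")
```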
(v) Probability
A male who is a low energy consumer and lives in a rural area has
covariates \(z_1 = 1\), \(z_2 = 0\), \(z_3 = 0\) and \(z_4 = 1\). Using the estimated
parameter values, his hazard rate after \(t\) years is \(\lambda_0(t) e^{-0.6}\).
The probability that this customer remains with the company for at least two
years is:
\[ S_1(2) = \exp\left( -\int_0^2 \lambda_0(t) e^{-0.6}\, dt \right) = 1 - 0.7 = 0.3 \]
A male who is a high energy consumer and lives in a city centre has
covariates \(z_1 = 1\), \(z_2 = 1\), \(z_3 = 1\) and \(z_4 = 0\). Again, using the estimated
parameter values, his hazard rate after \(t\) years is \(\lambda_0(t) e^{0.26}\).
The probability that this customer remains with the company for at least two
years is:
\[
\begin{aligned}
S_2(2) &= \exp\left( -\int_0^2 \lambda_0(t) e^{0.26}\, dt \right) = \exp\left( -\int_0^2 \lambda_0(t) e^{-0.6}\, dt \times e^{0.86} \right) \\
&= \left[ \exp\left( -\int_0^2 \lambda_0(t) e^{-0.6}\, dt \right) \right]^{e^{0.86}} = (0.3)^{e^{0.86}} = 0.058
\end{aligned}
\]
(vi) Interdependence of factors
To test whether there is an interaction in the effect arising from gender and
energy consumption (say), we need to introduce an extra term \(\beta_{12} z_1 z_2\) in
the model and test whether \(\beta_{12}\) is significantly different from zero.
This can be done by applying a likelihood ratio test with the hypotheses:
\(H_0: \beta_{12} = 0\) versus \(H_1: \beta_{12} \ne 0\)
The test statistic for this test is:
\[ -2\left( \ln L_{\text{original model}} - \ln L_{\text{model with interaction}} \right) \]
Under \(H_0\), this test statistic has a chi-square distribution with 1 degree of
freedom. If the observed value exceeds 3.841, we reject \(H_0\) at the 5%
significance level and we retain this interaction term in the model.
We need to do this for each pair of variables that might involve an

interaction.
We can then consider adding interaction terms involving 3 variables.

(i) Nelson-Aalen estimate of S( x )
We will assume that the cups are stolen at the end of the day.
We can then use the data provided to construct the usual summary table
required for calculating empirical survival rates:
Counter   Time at which a cup   Number of cups       Number remaining
j         is stolen, $t_j$      stolen at $t_j$,     just before $t_j$,
                                $d_j$                $n_j$
1         3                     1                    19
2         5                     1                    18
3         8                     1                    17
4         9                     1                    16
5         14                    1                    13
6         15                    2                    12

The survival function is then estimated as:
$$\hat{S}(t) = \exp\left(-\hat{\Lambda}(t)\right)$$
where the integrated hazard is calculated as $\hat{\Lambda}(t) = \sum_{t_j \le t} \frac{d_j}{n_j}$.
The table below shows the calculation, building up the integrated hazard:
Time interval      Fraction of cups stolen   Integrated hazard function       Survival function
                   $d_j / n_j$               $\hat{\Lambda}(t)$               $\hat{S}(t) = e^{-\hat{\Lambda}(t)}$
$0 \le t < 3$      0                         0                                1
$3 \le t < 5$      1/19 = 0.05263            0.05263                          0.94873
$5 \le t < 8$      1/18 = 0.05556            0.05263 + 0.05556 = 0.10819      0.89746
$8 \le t < 9$      1/17 = 0.05882            0.10819 + 0.05882 = 0.16701      0.84619
$9 \le t < 14$     1/16 = 0.06250            0.16701 + 0.06250 = 0.22951      0.79492
$14 \le t < 15$    1/13 = 0.07692            0.22951 + 0.07692 = 0.30643      0.73607
$t = 15$           2/12 = 0.16667            0.30643 + 0.16667 = 0.47310      0.62307
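The Nelson-Aalen calculation for the stolen cups can be reproduced with a short script. This is a minimal sketch using the $(t_j, d_j, n_j)$ values from the summary table; the function name `nelson_aalen` is an illustrative choice, not part of the Core Reading.

```python
import math

def nelson_aalen(events):
    """events: list of (t_j, d_j, n_j) in increasing time order.
    Returns a list of (t_j, integrated hazard, survival estimate)."""
    results = []
    integrated_hazard = 0.0
    for t_j, d_j, n_j in events:
        integrated_hazard += d_j / n_j          # add the discrete hazard at t_j
        results.append((t_j, integrated_hazard, math.exp(-integrated_hazard)))
    return results

# (day, cups stolen, cups remaining just before that day)
cups = [(3, 1, 19), (5, 1, 18), (8, 1, 17), (9, 1, 16), (14, 1, 13), (15, 2, 12)]
cup_estimates = nelson_aalen(cups)
for t_j, hazard, survival in cup_estimates:
    print(f"t >= {t_j:2d}: integrated hazard = {hazard:.5f}, S = {survival:.5f}")
```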

(ii) Sketch
A graph of the estimated survival function looks like this:
[Figure: step plot of the estimated survival function S(x) (vertical axis, 0.0 to 1.0) against duration in days (horizontal axis, 0 to 15).]
45 Subject CT4 September 2016 Question 10 (adapted)
(i) Cox hazard function
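The hazard rate for a customer who has been in hospital for a length of time
$t$ will be:
$$\lambda(t; z_1, z_2, z_3, z_4) = \lambda_0(t) \exp(\beta_1 z_1 + \beta_2 z_2 + \beta_3 z_3 + \beta_4 z_4)$$
where $\lambda_0(t)$ is the baseline hazard rate (taken to be the rate for male
smokers who are moderate drinkers).
The covariates are defined as follows:
$$z_1 = \begin{cases} 1 & \text{Female} \\ 0 & \text{Male} \end{cases} \qquad
z_2 = \begin{cases} 1 & \text{Non-smoker} \\ 0 & \text{Smoker} \end{cases}$$
$$z_3 = \begin{cases} 1 & \text{Non-drinker} \\ 0 & \text{Other} \end{cases} \qquad
z_4 = \begin{cases} 1 & \text{Heavy drinker} \\ 0 & \text{Other} \end{cases}$$
$\beta_1, \beta_2, \beta_3, \beta_4$ are regression parameters. The estimated values for these are
$\hat{\beta}_1 = 0.065$, $\hat{\beta}_2 = -0.035$, $\hat{\beta}_3 = -0.06$ and $\hat{\beta}_4 = 0.085$.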
(ii) Probability calculation
For the male moderate drinker who does not smoke, we have $z_1 = 0$,
$z_2 = 1$, $z_3 = 0$ and $z_4 = 0$.
So:
$$\lambda_0(3) \exp(0 - 0.035 - 0 + 0) = 0.6 \;\Rightarrow\; \lambda_0(3) = 0.6e^{0.035}$$
For the female heavy drinker who smokes, we have $z_1 = 1$, $z_2 = 0$, $z_3 = 0$
and $z_4 = 1$.
So her rate at the same time point is:
$$\lambda_0(3) \exp(0.065 + 0 + 0 + 0.085) = \lambda_0(3)e^{0.15} = 0.6e^{0.035}e^{0.15} = 0.6e^{0.185} = 0.72$$
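The same answer follows from the ratio of the two individuals' proportional factors, without solving for the baseline hazard explicitly. The sketch below assumes the parameter estimates quoted in part (i); the helper name `linear_predictor` is hypothetical.

```python
import math

# Estimated parameters, in the order (female, non-smoker, non-drinker, heavy drinker)
beta_hat = [0.065, -0.035, -0.06, 0.085]

def linear_predictor(z):
    """Sum of beta_i * z_i for a covariate vector z."""
    return sum(b * z_i for b, z_i in zip(beta_hat, z))

z_known = [0, 1, 0, 0]     # male, non-smoker, moderate drinker: hazard 0.6 at t = 3
z_target = [1, 0, 0, 1]    # female, smoker, heavy drinker

hazard_ratio = math.exp(linear_predictor(z_target) - linear_predictor(z_known))
target_hazard = 0.6 * hazard_ratio
print(f"hazard ratio = {hazard_ratio:.4f}, hazard at t = 3: {target_hazard:.2f}")
```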
(iii) Testing the suggestion statistically
$$H_0: \beta_1 = 0 \quad \text{vs} \quad H_1: \beta_1 \neq 0$$
We can do this using a likelihood ratio test.
The test statistic is:
$$-2(\ell_3 - \ell_4)$$
where $\ell_3$ is the maximum log-likelihood under $H_0$ and $\ell_4$ is the maximum
log-likelihood under $H_1$. Under $H_0$ the test statistic has a $\chi^2_1$ distribution.
If the value of the test statistic exceeds the critical value 3.841 (the upper 5%
point of this distribution), we reject $H_0$ and conclude that gender does have
a material impact. Otherwise we cannot conclude that gender has a material
impact.
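The mechanics of the likelihood ratio test can be sketched as follows. The log-likelihood values used here are hypothetical (the solution does not give any), and the p-value uses the identity $P(\chi^2_1 > x) = \operatorname{erfc}(\sqrt{x/2})$, which holds only for 1 degree of freedom.

```python
import math

def likelihood_ratio_test_1df(loglik_null: float, loglik_alt: float):
    """Return (test statistic, p-value) for a likelihood ratio test with 1 df."""
    statistic = -2.0 * (loglik_null - loglik_alt)
    # For 1 df: P(chi2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(max(statistic, 0.0) / 2.0))
    return statistic, p_value

# Hypothetical maximised log-likelihoods for the models without and with the extra parameter
stat, p = likelihood_ratio_test_1df(loglik_null=-250.0, loglik_alt=-247.5)
reject_h0 = stat > 3.841    # upper 5% point of the chi-square distribution with 1 df
print(f"statistic = {stat:.3f}, p-value = {p:.4f}, reject H0: {reject_h0}")
```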

(iv) Additional factor
If we are only considering the distinction between patients who are married
and patients who are not married, we could introduce an extra covariate $z_5$
with a corresponding parameter $\beta_5$ defined as follows (with ‘Not married’
specified as the baseline level):
$$z_5 = \begin{cases} 1 & \text{Married} \\ 0 & \text{Not married} \end{cases}$$
We can then carry out a likelihood ratio test as in part (iii).
$$H_0: \beta_5 = 0 \quad \text{vs} \quad H_1: \beta_5 \neq 0$$
The test statistic is:
$$-2(\ell_4 - \ell_5)$$
where $\ell_4$ is the maximum log-likelihood under $H_0$ and $\ell_5$ is the maximum
log-likelihood under $H_1$.
Under $H_0$ the test statistic has a $\chi^2_1$ distribution.
If the value of the test statistic exceeds the critical value 3.841, we reject $H_0$
and conclude that marital status does have a material impact. Otherwise we
cannot conclude that marital status has a material impact.
The probability that a 93-year-old will receive a card at age 100 but not at
age 105 is:
$${}_7p_{93} \times {}_5q_{100} = {}_7p_{93} \times (1 - {}_5p_{100})$$
The force of mortality takes the constant value 0.20 over the range
(100, 105).
So:
$${}_5p_{100} = \exp\left(-\int_0^5 \mu_{100+s}\,ds\right) = e^{-5 \times 0.20} = e^{-1}$$
Noting that the form of the mortality function changes at age 95, we also
have:
$${}_7p_{93} = \exp\left(-\int_0^7 \mu_{93+s}\,ds\right) = e^{-(2 \times 0.10 + 5 \times 0.15)} = e^{-0.95}$$
So the required probability is:
$$e^{-0.95} \times \left(1 - e^{-1}\right) = 0.38674 \times (1 - 0.36788) = 0.24447$$
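With a piecewise-constant force of mortality, the integral in the exponent is just a sum of (force × time spent in each band). The sketch below assumes the three bands implied by the solution (0.10 up to age 95, 0.15 from 95 to 100, 0.20 from 100 to 105); the lower limit of 93 on the first band is an assumption made only because the solution does not need anything earlier.

```python
import math

# (lower age, upper age, constant force of mortality) as used in the solution
BANDS = [(93, 95, 0.10), (95, 100, 0.15), (100, 105, 0.20)]

def survival_probability(age: float, duration: float) -> float:
    """t_p_x for a piecewise-constant force of mortality."""
    end_age = age + duration
    integrated_force = 0.0
    for lower, upper, force in BANDS:
        years_in_band = max(0.0, min(end_age, upper) - max(age, lower))
        integrated_force += force * years_in_band
    return math.exp(-integrated_force)

p_7_93 = survival_probability(93, 7)
p_5_100 = survival_probability(100, 5)
card_at_100_only = p_7_93 * (1 - p_5_100)
print(f"{card_at_100_only:.5f}")
```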
(i) Proportional hazards model
In a proportional hazards model, the ratio $\frac{h_i(t)}{h_j(t)}$ of the hazard rates for two
individuals $i$ and $j$ does not depend on the duration $t$.
Alternatively, we could say that the hazard rate is a product of a baseline
function that depends on the duration $t$ and a constant factor that allows for
the covariates for each individual.
(ii) Interpretation of m and b
The $\mu$ in the formula denotes the baseline hazard rate, which is used as a
reference value for the hazard rate at age $60 + t$. It is the hazard rate for a
man who does not drink any beer.
The parameter $\beta$ specifies how much the hazard rate increases with
each extra glass of beer drunk per day: each extra glass multiplies the hazard
rate by a factor of $e^{\beta}$.

(iii) Hazard rate for a man aged 62 who drinks 2 glasses a day
The hazard rate for a man aged 62 who drinks 2 glasses of beer a day is
estimated to be:
$$\hat{h}(2) = 0.03 \exp(0.2 \times 2) = 0.03e^{0.4} = 0.04475$$
(iv)(a) Probability he is still alive in 10 years’ time
The hazard rate at age $60 + t$ for a man who drinks 3 glasses of beer a day
is estimated to be:
$$\hat{h}(t) = 0.03 \exp(0.2 \times 3) = 0.03e^{0.6} = 0.05466$$
So the probability that this man will still be alive in 10 years’ time is estimated
to be:
$$\hat{S}(10) = \exp\left(-\int_0^{10} \hat{h}(t)\,dt\right) = e^{-10 \times 0.05466} = 0.57889$$
(iv)(b) Expectation of life
This man will experience a constant force of mortality of 0.05466.
So the length of his future lifetime will have an exponential distribution with
parameter $\lambda = 0.05466$.
His expectation of life equals the mean of this distribution, which is:
$$\frac{1}{\lambda} = \frac{1}{0.05466} = 18.29 \text{ years}$$
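The three numerical results above follow directly from the constant hazard. A minimal sketch, assuming the estimates 0.03 and 0.2 used in the solution (the variable and function names are illustrative):

```python
import math

MU_HAT = 0.03     # baseline hazard estimate used in the solution
BETA_HAT = 0.2    # coefficient per glass of beer per day used in the solution

def hazard(glasses_per_day: float) -> float:
    """Constant hazard rate for a given daily beer consumption."""
    return MU_HAT * math.exp(BETA_HAT * glasses_per_day)

h2 = hazard(2)
h3 = hazard(3)
survival_10_years = math.exp(-10 * h3)   # constant hazard: S(t) = exp(-h * t)
expectation_of_life = 1 / h3             # mean of an exponential distribution
print(f"{h2:.5f} {h3:.5f} {survival_10_years:.5f} {expectation_of_life:.2f}")
```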
(v) Maximising the total amount of beer bought
Using the same method, if this man drinks $x$ glasses of beer a day, his
hazard rate will be $\mu e^{\beta x}$ and his expectation of life will be $\frac{1}{\mu e^{\beta x}}$ years. So
the expected total number of beers he will buy during his lifetime is:
$$365.25 \times \frac{1}{\mu e^{\beta x}} \times x = \frac{365.25}{\mu}\, x e^{-0.2x}$$
We need to choose the value of $x$ to maximise the function $f(x) = x e^{-0.2x}$.
Differentiating this using the product rule gives:
$$f'(x) = e^{-0.2x} - 0.2x e^{-0.2x} = (1 - 0.2x)e^{-0.2x}$$
This equals 0 when $x = \frac{1}{0.2} = 5$.
The second derivative is:
$$f''(x) = -0.2e^{-0.2x} - (1 - 0.2x)\,0.2e^{-0.2x} = (-0.4 + 0.04x)e^{-0.2x}$$
and $f''(5) = (-0.4 + 0.04 \times 5)e^{-0.2 \times 5} = -0.2e^{-1} < 0$
So the total beer sales will be maximised if he drinks 5 glasses a day.
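As a sanity check on the calculus, the maximum can also be located by a simple grid search. This is only a numerical illustration; the grid spacing of 0.01 and the upper limit of 20 glasses are arbitrary assumptions.

```python
import math

def f(x: float) -> float:
    """Function to maximise: proportional to expected lifetime beer purchases."""
    return x * math.exp(-0.2 * x)

# Evaluate f on a grid of 0.00, 0.01, ..., 20.00 glasses a day
grid = [i / 100 for i in range(0, 2001)]
best_x = max(grid, key=f)
print(f"maximum at x = {best_x}, f(x) = {f(best_x):.5f}")
```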
The shopkeeper is interested in when the cheese goes mouldy. So this is
the event of interest, whereas selling the cheese will result in censoring.
Right censoring occurs when observations are cut off so that we don’t know
when the event of interest would have occurred. The packets of cheese that
were sold or thrown out at the end were subject to right censoring.
Random censoring occurs when the time of censoring is not specified in
advance. The packets of cheese that were sold were subject to random
censoring.
Type I censoring occurs when the observational plan specifies a fixed end
date. This study was to continue for 10 days and so Type I censoring was
present.
You could also have mentioned informative or non-informative censoring.
For example, if customers inspected the cheese and always bought the
freshest looking packets, this would be informative censoring.

(ii) Removing the censoring
The right censoring and random censoring when packets of cheese were
sold could be removed by closing the shop or removing the cheese from
sale.
The Type I censoring could be removed by continuing the study beyond 10
days until all the cheese had gone mouldy.
If you mentioned informative censoring in your answer to part (i), then you
could say that this could be removed if the shopkeeper selected a packet at
random to give to the customers. Non-informative censoring could be
removed by telling customers to select the freshest cheese to buy.

The Kaplan-Meier estimate of the survival function is calculated as:
$$\hat{S}(t) = \prod_{t_j \le t} \left(1 - \frac{d_j}{n_j}\right)$$
We can construct the usual summary table for estimating empirical survival
rates, where t denotes the number of days until the cheese goes mouldy.
Counter   Day     At risk   Cheese gone   Censored in time
j         $t_j$   $n_j$     mouldy        interval after $t_j$
                            $d_j$         $c_j$
1         3       16        1             0
2         4       15        2             4
3         8       9         2             2
4         10      5         3             2

The estimated survival function can then be calculated for each time interval.
Time interval     Fraction surviving    Estimated survival function
(days)            $1 - d_j/n_j$         $\hat{S}(t) = \prod_{t_j \le t} (1 - d_j/n_j)$
$0 \le t < 3$     –––                   $\hat{S}(t) = 1$
$3 \le t < 4$     1 - 1/16 = 15/16      $\hat{S}(t) = 1 \times 15/16 = 15/16 = 0.9375$
$4 \le t < 8$     1 - 2/15 = 13/15      $\hat{S}(t) = 15/16 \times 13/15 = 13/16 = 0.8125$
$8 \le t < 10$    1 - 2/9 = 7/9         $\hat{S}(t) = 13/16 \times 7/9 = 91/144 = 0.6319$
$t = 10$          1 - 3/5 = 2/5         $\hat{S}(t) = 91/144 \times 2/5 = 91/360 = 0.2528$
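The product-limit calculation for the cheese data can be reproduced exactly using rational arithmetic. This is a minimal sketch based on the $(t_j, d_j, n_j)$ values in the summary table; the function name `kaplan_meier` is an illustrative choice.

```python
from fractions import Fraction

def kaplan_meier(events):
    """events: list of (t_j, d_j, n_j) in increasing time order.
    Returns a list of (t_j, exact survival estimate from t_j onwards)."""
    survival = Fraction(1)
    results = []
    for t_j, d_j, n_j in events:
        survival *= Fraction(n_j - d_j, n_j)    # multiply by the fraction surviving at t_j
        results.append((t_j, survival))
    return results

# (day, packets gone mouldy, packets at risk)
cheese = [(3, 1, 16), (4, 2, 15), (8, 2, 9), (10, 3, 5)]
cheese_km = kaplan_meier(cheese)
for t_j, s in cheese_km:
    print(f"from day {t_j:2d}: S = {s} = {float(s):.4f}")
```

The censored counts $c_j$ do not appear explicitly because they are already reflected in the numbers at risk.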

(i) Parameter values
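We can deduce the following relationships from the three statements from
the company.
‘A 25 year old female …’
$$e^{\beta_S + 5\beta_A + \beta_G} = 2e^{5\beta_A}$$
$$\Rightarrow e^{\beta_S + \beta_G} = 2$$
$$\Rightarrow \beta_S + \beta_G = \ln 2 \quad \ldots (1)$$
‘A 45 year old male …’
$$e^{25\beta_A} = \tfrac{1}{2} e^{23\beta_A + \beta_G}$$
$$\Rightarrow e^{2\beta_A - \beta_G} = \tfrac{1}{2}$$
$$\Rightarrow 2\beta_A - \beta_G = -\ln 2 \quad \ldots (2)$$
‘A 32 year old female …’
$$e^{\beta_S + 12\beta_A + \beta_G} = 1.6e^{\beta_S + 25\beta_A}$$
$$\Rightarrow e^{-13\beta_A + \beta_G} = 1.6$$
$$\Rightarrow -13\beta_A + \beta_G = \ln 1.6 \quad \ldots (3)$$
Adding (2) and (3) to eliminate $\beta_G$ gives:
$$-11\beta_A = -\ln 2 + \ln 1.6$$
$$\Rightarrow \beta_A = \tfrac{1}{11}(\ln 2 - \ln 1.6) = 0.02029$$
Rearranging (2) then gives:
$$\beta_G = 2\beta_A + \ln 2 = 0.73372$$
Finally, from (1):
$$\beta_S = \ln 2 - \beta_G = -0.04057$$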
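The elimination steps can be verified by substituting the solved values back into equations (1) to (3). A minimal sketch, with variable names chosen here for illustration:

```python
import math

# Solve in the same order as the written solution
beta_A = (math.log(2) - math.log(1.6)) / 11   # from (2) + (3)
beta_G = 2 * beta_A + math.log(2)             # rearranging (2)
beta_S = math.log(2) - beta_G                 # from (1)

# Residuals of the three original equations (should all be zero)
residual_1 = beta_S + beta_G - math.log(2)
residual_2 = 2 * beta_A - beta_G + math.log(2)
residual_3 = -13 * beta_A + beta_G - math.log(1.6)
print(f"beta_A = {beta_A:.5f}, beta_G = {beta_G:.5f}, beta_S = {beta_S:.5f}")
```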

(ii) Group for which the drug is most effective
$h(t)$ is the ‘hazard’ of symptoms disappearing. So the group for which the
drug is most effective will have the highest value of $h(t)$.
Since $\beta_A$ and $\beta_G$ are positive but $\beta_S$ is negative, this corresponds to a
male who was the maximum age when the drug was administered and
attended a gym.
(iii) Probability calculation
For the woman, we have $S = 1$, $A = 38 - 20 = 18$ and $G = 1$. So her hazard
rate is:
$$h_{\text{Female}}(t) = h_0(t)e^{\beta_S + 18\beta_A + \beta_G} \quad (= 2.88144\,h_0(t))$$
and her ‘survival’ probability is:
$$S_{\text{Female}}(28) = \exp\left(-\int_0^{28} h_0(t)e^{\beta_S + 18\beta_A + \beta_G}\,dt\right) = 0.75$$
For the man, we have $S = 0$, $A = 26 - 20 = 6$ and $G = 0$. So his hazard
rate is:
$$h_{\text{Male}}(t) = h_0(t)e^{6\beta_A} \quad (= 1.12943\,h_0(t))$$
We can calculate his ‘survival’ probability as:
$$\begin{aligned}
S_{\text{Male}}(28) &= \exp\left(-\int_0^{28} h_0(t)e^{6\beta_A}\,dt\right) \\
&= \exp\left(-\int_0^{28} h_0(t)e^{\beta_S + 18\beta_A + \beta_G}\,dt \times e^{-\beta_S - 12\beta_A - \beta_G}\right) \\
&= \left[\exp\left(-\int_0^{28} h_0(t)e^{\beta_S + 18\beta_A + \beta_G}\,dt\right)\right]^{\exp(-\beta_S - 12\beta_A - \beta_G)} \\
&= 0.75^{\exp(-0.93658)} \\
&= 0.75^{0.39197} = 0.89336
\end{aligned}$$
So the required probability is approximately 0.89.
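The final step can be checked by raising the woman's survival probability to the power of the hazard ratio. The sketch below recomputes the parameters from part (i) so that it is self-contained; the variable names are illustrative.

```python
import math

# Parameter values from part (i)
beta_A = (math.log(2) - math.log(1.6)) / 11
beta_G = 2 * beta_A + math.log(2)
beta_S = math.log(2) - beta_G

female_lp = beta_S + 18 * beta_A + beta_G   # S = 1, A = 18, G = 1
male_lp = 6 * beta_A                        # S = 0, A = 6,  G = 0

hazard_ratio = math.exp(male_lp - female_lp)   # male hazard / female hazard
s_female = 0.75
s_male = s_female ** hazard_ratio
print(f"hazard ratio = {hazard_ratio:.5f}, S_male(28) = {s_male:.5f}")
```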

(i) Formulae
The formulae for estimating the survival function are:
• $\hat{S}_{KM}(t) = \prod_{t_j \le t} \left(1 - \frac{d_j}{n_j}\right)$ for the Kaplan-Meier model
• $\hat{S}_{NA}(t) = \exp\left(-\sum_{t_j \le t} \frac{d_j}{n_j}\right)$ for the Nelson-Aalen model.
Here:
• $t_j$ is the $j$th time at which a hazard event occurs
• $d_j$ is the number of decrements that occurred at time $t_j$
• $n_j$ is the number of individuals who were at risk of the hazard just
before time $t_j$.
(ii) Demonstrate that the Nelson-Aalen estimator is never lower
We can see from the graph that $e^{-x} \ge 1 - x$ in the range $0 \le x \le 1$. Since
the values of $\frac{d_j}{n_j}$ must lie in this range, we can apply this inequality with
$x = \frac{d_j}{n_j}$ to deduce that:
$$\hat{S}_{NA}(t) = \exp\left(-\sum_{t_j \le t} \frac{d_j}{n_j}\right) = \prod_{t_j \le t} \exp\left(-\frac{d_j}{n_j}\right) \ge \prod_{t_j \le t} \left(1 - \frac{d_j}{n_j}\right) = \hat{S}_{KM}(t)$$
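The inequality can be illustrated numerically by computing both estimators from the same data. The sketch below uses the $(d_j, n_j)$ pairs for the treatment group in part (iv) purely as example inputs; any values with $0 \le d_j \le n_j$ would do.

```python
import math

def km_and_na(pairs):
    """pairs: list of (d_j, n_j) in time order.
    Returns a list of (Kaplan-Meier estimate, Nelson-Aalen estimate) after each event time."""
    km, integrated_hazard = 1.0, 0.0
    results = []
    for d_j, n_j in pairs:
        km *= 1 - d_j / n_j
        integrated_hazard += d_j / n_j
        results.append((km, math.exp(-integrated_hazard)))
    return results

estimates = km_and_na([(1, 10), (1, 9), (2, 6), (1, 2)])
for km, na in estimates:
    print(f"KM = {km:.4f}  NA = {na:.4f}  NA >= KM: {na >= km}")
```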
(iii) Types of censoring

Right censoring occurs when observations are cut off so that we don’t know
when the event of interest would have occurred. Patients who are still in the
trial at the end of the 5 years or who drop out before then will be subject to
right censoring.

Random censoring occurs when the time of censoring is not specified in
advance. Patients who drop out before 5 years will be subject to random
censoring.
Type I censoring occurs when the observational plan specifies a fixed end
date. Here patients who are still under observation after 5 years will be
subject to Type I censoring.
There is also interval censoring (resulting from the 3-month gap between the
tests). You could also have mentioned informative or non-informative
censoring, provided you justify these. For example, if patients drop out
because they think they have recovered, this would be informative
censoring.
(iv) Kaplan-Meier estimates of the survival function
Group receiving steroid cream
For the group receiving treatment the usual summary table required for
calculating empirical survival rates looks like this:
Counter   Time at which the $j$th   Number of           Number in remission
j         recurrence occurs,        recurrences at      just before $t_j$,
          $t_j$                     $t_j$, $d_j$        $n_j$
1         3                         1                   10
2         5                         1                   9
3         10                        2                   6
4         18                        1                   2

The estimated survival function can then be calculated for each time interval.
Time interval       Fraction surviving    Estimated survival function
(days)              $1 - d_j/n_j$         $\hat{S}(t) = \prod_{t_j \le t} (1 - d_j/n_j)$
$0 \le t < 3$       1                     $\hat{S}(t) = 1$
$3 \le t < 5$       1 - 1/10 = 9/10       $\hat{S}(t) = 1 \times 9/10 = 9/10 = 0.9$
$5 \le t < 10$      1 - 1/9 = 8/9         $\hat{S}(t) = 9/10 \times 8/9 = 8/10 = 0.8$
$10 \le t < 18$     1 - 2/6 = 2/3         $\hat{S}(t) = 8/10 \times 2/3 = 8/15 = 0.5333$
$18 \le t < 20$     1 - 1/2 = 1/2         $\hat{S}(t) = 8/15 \times 1/2 = 4/15 = 0.2667$
Control group
For the control group we have:
Counter   Time at which the $j$th   Number of           Number in remission
j         recurrence occurs,        recurrences at      just before $t_j$,
          $t_j$                     $t_j$, $d_j$        $n_j$
1         6                         1                   10
2         8                         2                   9
3         14                        1                   4
4         18                        2                   2

The estimated survival function for the control group is:
Time interval       Fraction surviving    Estimated survival function
(days)              $1 - d_j/n_j$         $\hat{S}(t) = \prod_{t_j \le t} (1 - d_j/n_j)$
$0 \le t < 6$       1                     $\hat{S}(t) = 1$
$6 \le t < 8$       1 - 1/10 = 9/10       $\hat{S}(t) = 1 \times 9/10 = 9/10 = 0.9$
$8 \le t < 14$      1 - 2/9 = 7/9         $\hat{S}(t) = 9/10 \times 7/9 = 7/10 = 0.7$
$14 \le t < 18$     1 - 1/4 = 3/4         $\hat{S}(t) = 7/10 \times 3/4 = 21/40 = 0.525$
$18 \le t < 20$     1 - 2/2 = 0           $\hat{S}(t) = 21/40 \times 0 = 0$
(v)(a) Statistical test
We could test whether there is a significant difference by calculating
confidence intervals for the survival function. If these do not overlap, this
would indicate a significant difference.
The confidence intervals can be calculated by assuming a normal
distribution, ie extending 1.6449 standard deviations beyond the mean for a
5% test (as this would be a one-sided test).
The standard deviation can be calculated using Greenwood’s formula for the
variance of the Kaplan-Meier estimator, which is given in the Tables.
(v)(b) Comment
The estimated survival functions do not seem to show a consistent large
difference in favour of the cream.
Also, we have very small sample sizes for both groups, making a statistically
significant result unlikely.
In fact, if we compare the estimates of the survival functions for both groups,
we can see that they cross over. For example:
$$\hat{S}_{\text{Cream}}(7) = 0.8 < 0.9 = \hat{S}_{\text{Control}}(7)$$
but:
$$\hat{S}_{\text{Cream}}(9) = 0.8 > 0.7 = \hat{S}_{\text{Control}}(9)$$
This means that we definitely won’t be able to conclude that the cream is
effective over the whole period.
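The crossing of the two estimated survival functions can be confirmed by evaluating each Kaplan-Meier step function at the two time points quoted. A minimal sketch using the summary tables from part (iv); the function name `km_at` is hypothetical.

```python
def km_at(events, t):
    """Kaplan-Meier estimate of S(t) from a list of (t_j, d_j, n_j)."""
    survival = 1.0
    for t_j, d_j, n_j in events:
        if t_j <= t:                      # only event times up to and including t contribute
            survival *= 1 - d_j / n_j
    return survival

cream = [(3, 1, 10), (5, 1, 9), (10, 2, 6), (18, 1, 2)]
control = [(6, 1, 10), (8, 2, 9), (14, 1, 4), (18, 2, 2)]

for t in (7, 9):
    print(f"t = {t}: cream {km_at(cream, t):.2f}, control {km_at(control, t):.2f}")
```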

FACTSHEET
This factsheet summarises the main methods, formulae and information
required for tackling questions on survival models based on future lifetimes
and the Kaplan-Meier, Nelson-Aalen and proportional hazards models.
Probabilities of death and survival
$${}_tq_x = F_x(t) = P[T_x \le t] = \int_0^t {}_sp_x\,\mu_{x+s}\,ds$$
$${}_tp_x = 1 - {}_tq_x = S_x(t) = 1 - F_x(t) = P[T_x > t] = \exp\left\{-\int_0^t \mu_{x+s}\,ds\right\}$$
$${}_{t+s}p_x = {}_tp_x \times {}_sp_{x+t} = {}_sp_x \times {}_tp_{x+s}$$
Force of mortality
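$$\mu_x = \lim_{h \to 0^+} \frac{1}{h} \times P[T \le x + h \mid T > x]$$
$$\mu_x = \lim_{h \to 0^+} \frac{1}{h} \times {}_hq_x \quad \text{so} \quad {}_hq_x \approx h\,\mu_x \text{ (for small } h\text{)}$$
Derivatives of ${}_tp_x$
$$\frac{\partial}{\partial t}\,{}_tp_x = -\,{}_tp_x\,\mu_{x+t}$$
$$\frac{\partial}{\partial x}\,{}_tp_x = -\,{}_tp_x\,(\mu_{x+t} - \mu_x)$$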
Central rate of mortality
$$m_x = \frac{q_x}{\int_0^1 {}_tp_x\,dt} = \frac{\int_0^1 {}_tp_x\,\mu_{x+t}\,dt}{\int_0^1 {}_tp_x\,dt}$$

The complete future lifetime random variable Tx
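$$f_x(t) = \frac{d}{dt}F_x(t) = {}_tp_x\,\mu_{x+t} \quad (0 \le t \le \omega - x)$$
$$\mathring{e}_x = E[T_x] = \int_0^{\omega - x} {}_tp_x\,dt$$
$$\operatorname{var}[T_x] = \int_0^{\omega - x} t^2\,{}_tp_x\,\mu_{x+t}\,dt - \mathring{e}_x^{\,2}$$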
The curtate future lifetime random variable K x
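$K_x$ is defined to be the integer part of $T_x$
$$P(K_x = k) = {}_kp_x\,q_{x+k} = {}_{k|}q_x = \frac{d_{x+k}}{l_x}$$
$$e_x = E[K_x] = \sum_{k=1}^{[\omega - x]} {}_kp_x \approx \mathring{e}_x - \tfrac{1}{2}$$
$$\operatorname{var}[K_x] = \sum_{k=0}^{[\omega - x]} k^2\,{}_kp_x\,q_{x+k} - e_x^{\,2}$$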
The shape of the human mortality curve
Human mortality is relatively high at very young ages, peaks again at ages
around 20 (the ‘accident hump’) and increases exponentially at older ages.
Exponential model
This model assumes that the future lifetime random variable $T_x$ follows an
exponential distribution. The hazard function (or force of mortality) is
constant under this model and the survival function is:
$${}_tp_x = e^{-t\mu}$$
$\mu$ can be estimated by maximum likelihood.

Weibull model
This model assumes that the future lifetime random variable $T_x$ follows a
Weibull distribution. For this model:
$${}_tp_x = 1 - F_{T_x}(t) = \exp\left(-\alpha t^{\beta}\right) \quad \text{(see page 15 of the Tables)}$$
and the hazard function is:
$$\mu_{x+t} = \alpha\beta t^{\beta - 1}$$
The parameters $\alpha$ and $\beta$ can be estimated by maximum likelihood.
If $\beta < 1$, the hazard function is a decreasing function of $t$.
If $\beta > 1$, the hazard function is an increasing function of $t$.
If $\beta = 1$, the hazard function is constant and the Weibull model is the same
as the exponential model.
Gompertz’ law
$$\mu_x = Bc^x$$
$${}_tp_x = g^{c^x(c^t - 1)} \quad \text{where } g = \exp\left(\frac{-B}{\log c}\right)$$
Makeham’s law
$$\mu_x = A + Bc^x$$
$${}_tp_x = s^t g^{c^x(c^t - 1)} \quad \text{where } g = \exp\left(\frac{-B}{\log c}\right) \text{ and } s = \exp(-A)$$
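The closed-form survival probabilities for Gompertz and Makeham can be checked against a direct numerical integration of the force of mortality. The parameter values below (A = 0.0005, B = 0.00003, c = 1.1, age 60, term 10) are hypothetical and chosen only for illustration; setting A = 0 gives the Gompertz case.

```python
import math

def makeham_tpx(x, t, A, B, c):
    """Closed-form t_p_x under Makeham's law (Gompertz when A = 0)."""
    s = math.exp(-A)
    g = math.exp(-B / math.log(c))
    return s ** t * g ** (c ** x * (c ** t - 1))

def numeric_tpx(x, t, A, B, c, steps=10_000):
    """t_p_x from exp(-integral of mu) using the midpoint rule."""
    width = t / steps
    integral = sum(A + B * c ** (x + (i + 0.5) * width) for i in range(steps)) * width
    return math.exp(-integral)

# Hypothetical parameters, for illustration only
A, B, c = 0.0005, 0.00003, 1.1
closed_form = makeham_tpx(60, 10, A, B, c)
by_integration = numeric_tpx(60, 10, A, B, c)
print(f"closed form {closed_form:.8f}, numerical {by_integration:.8f}")
```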

Summary and comparison of models
We have summarised below the key features of each model. The bullet
points cover:
• how the model is parameterised
• how other decrements are handled
• what the key assumptions are
• how the likelihood function is determined
• how the model is fitted (estimating the parameters)
• key formulae / results
• what types of exam questions tend to come up
• other specific points relating to the model
Kaplan-Meier model
• non-parametric
– models the cumulative distribution of the lifetime random
variable $T_x$
• other decrements are handled via censoring
• key assumptions
– no heterogeneity (everyone has the same probabilities)
– non-informative censoring
• likelihood function
$$L = \prod_{j=1}^{k} \lambda_j^{d_j}(1 - \lambda_j)^{n_j - d_j}$$
• fitting the model
$$\hat{\lambda}_j = \frac{d_j}{n_j} \quad \text{estimate of the discrete hazard}$$
$$\hat{F}(t) = 1 - \prod_{t_j \le t}(1 - \hat{\lambda}_j) \quad \text{estimate of distribution function}$$
$$\operatorname{var}[\tilde{F}(t)] \approx [1 - \hat{F}(t)]^2 \sum_{t_j \le t} \frac{d_j}{n_j(n_j - d_j)} \quad \text{Greenwood’s formula}$$
(on page 33 of Tables)
• exam questions
– numerical calculations
– identifying/defining types of censoring
– sketching graphs (usually the distribution function of $T_x$ or the
survival function)
– testing hypotheses about $F(t)$ or $S(t)$
– constructing confidence intervals for $F(t)$ or $S(t)$
• specific points
– also known as the ‘product-limit’ estimator
– very similar to Nelson-Aalen in terms of application
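Greenwood's formula can be applied alongside the Kaplan-Meier product. The following is a minimal sketch using the treatment-group data from the steroid cream question as example input; the function name is illustrative, and the sketch assumes $d_j < n_j$ at every event time (otherwise the Greenwood term is undefined).

```python
import math

def km_with_greenwood(events):
    """events: list of (t_j, d_j, n_j) with d_j < n_j.
    Returns a list of (t_j, survival estimate, Greenwood variance estimate)."""
    survival, greenwood_sum = 1.0, 0.0
    results = []
    for t_j, d_j, n_j in events:
        if d_j >= n_j:
            raise ValueError("Greenwood term undefined when d_j >= n_j")
        survival *= 1 - d_j / n_j
        greenwood_sum += d_j / (n_j * (n_j - d_j))
        results.append((t_j, survival, survival ** 2 * greenwood_sum))
    return results

treatment = [(3, 1, 10), (5, 1, 9), (10, 2, 6), (18, 1, 2)]
greenwood = km_with_greenwood(treatment)
t_j, s_hat, variance = greenwood[2]
print(f"S({t_j}) = {s_hat:.4f}, variance = {variance:.5f}, sd = {math.sqrt(variance):.4f}")
```

Because $[1 - \hat{F}(t)]^2 = \hat{S}(t)^2$, the same expression serves as the variance estimate for both the distribution function and the survival function.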
Nelson-Aalen model
• non-parametric
– models the integrated hazard corresponding to the lifetime random
variable $T_x$
• key assumptions
– no heterogeneity (everyone has the same hazard rate)
– non-informative censoring
• fitting the model
$$\hat{\lambda}_j = \frac{d_j}{n_j} \quad \text{estimate of the discrete hazard}$$
$$\hat{\Lambda}(t) = \sum_{t_j \le t} \frac{d_j}{n_j} \quad \text{estimate of integrated hazard}$$
$$\hat{F}(t) = 1 - \exp(-\hat{\Lambda}(t)) \quad \text{estimate of distribution function}$$
$$\operatorname{var}[\tilde{\Lambda}(t)] = \sum_{t_j \le t} \frac{d_j(n_j - d_j)}{n_j^3} \quad \text{variance formula}$$
(on page 33 of Tables)
• exam questions
– identifying/defining types of censoring
– sketching graphs (usually the distribution function of $T_x$ or the
survival function)
– testing hypotheses about $\Lambda(t)$ or $S(t)$
– constructing confidence intervals for $\Lambda(t)$ or $S(t)$
• specific points
– very similar to Kaplan-Meier in terms of application
– can be obtained as an approximation to the Kaplan-Meier model
– has better statistical properties in small samples than the Kaplan-Meier
estimation procedure

Fully parametric models
• parametric
– the lifetime distribution is assumed to belong to a given family of
parametric distributions
• other decrements must be handled using conditional probabilities, which
complicates the likelihood function
• key assumptions
– no heterogeneity (everyone has the same hazard rate)
– parameters of the model can be estimated using the method of
maximum likelihood
• possible distributions for the hazard rate include:
– exponential (constant hazard)
– Weibull (monotonic hazard)
– Gompertz-Makeham (exponential hazard)
– log-logistic (‘humped’ hazard)
• exam questions
– numerical / algebraic calculations
– using likelihood functions to estimate the parameter(s)
• specific points
– can be applied to a single homogeneous group or separately to a
small number of homogeneous groups
– also has applications in modelling loss distributions for insurance
claims data

Proportional hazards models
• parametric or non-parametric
– the baseline hazard, $\lambda_0(t)$, models a generic random variable $T_x$
– a proportional factor $g(z_i)$ is applied to take account of the
covariates.
• key assumptions
– hazard rates are proportional to baseline hazard
– parameters of the model can be estimated in a similar way to the
Cox model
$$\lambda_i(t, z_i) = \lambda_0(t)\,g(z_i)$$
• exam questions
– comparing rates and probabilities for different people
– determining whether specified models are proportional or not
• specific points
– this model can handle heterogeneity (ie different rates for different
people)
– can be extended to a form in which the effect of the covariates
changes with duration
Cox regression model
• semi-parametric
– the baseline hazard, $\lambda_0(t)$, models a generic random variable $T_x$
– a proportional factor $\exp(\beta_1 z_1 + \beta_2 z_2 + \cdots + \beta_k z_k)$ is applied, where
the $\beta$’s are parameters and the $z$’s are covariates.
• key assumptions
– hazard rates are proportional to baseline hazard
– proportional adjustment is exponential-linear
• likelihood function / fitting the model
$$L(\beta) = \prod_{j=1}^{k} \frac{\exp(\beta z_j^T)}{\sum_{i \in R(t_j)} \exp(\beta z_i^T)} \quad \text{partial likelihood}$$
– use Breslow’s approximation if there are ties
– find the MLEs of the $\beta$’s to fit the model
$$-2(\ell_p - \ell_{p+q}) \sim \chi^2_q \quad \text{likelihood ratio test}$$
• exam questions
– comparing rates and probabilities for different people
– determining whether specified models are proportional or not
– interpreting the parameters (the $\beta$’s)
– writing down the partial likelihood function
– finding the MLEs of the $\beta$’s
– applying statistical tests to the $\beta$’s
• specific points
– it is a proportional hazards model
– this model can handle heterogeneity (ie different rates for different
people)
– can use products eg $\exp(\beta_1 z_1 + \beta_2 z_2 + \beta_{12} z_1 z_2)$ to incorporate
‘interactions’ between covariates
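The partial likelihood can be evaluated directly for a small data set. The sketch below assumes a single covariate and no tied event times (so Breslow's approximation is not needed); the four subjects are entirely hypothetical and serve only to show how the risk sets $R(t_j)$ enter the calculation.

```python
import math

def log_partial_likelihood(beta, subjects):
    """subjects: list of (time, event_observed, z) with one covariate z.
    Assumes no tied event times. Returns the log of the Cox partial likelihood."""
    total = 0.0
    for t_j, event_observed, z_j in subjects:
        if not event_observed:
            continue                                  # censored lives contribute via risk sets only
        risk_set = [z_i for t_i, _, z_i in subjects if t_i >= t_j]
        total += beta * z_j - math.log(sum(math.exp(beta * z_i) for z_i in risk_set))
    return total

# Hypothetical data: (time, event observed?, covariate value)
subjects = [(1, True, 1), (2, False, 0), (3, True, 0), (4, False, 1)]
for beta in (0.0, 0.5, 1.0):
    print(f"beta = {beta}: log partial likelihood = {log_partial_likelihood(beta, subjects):.4f}")
```

With $\beta = 0$ every life in a risk set is equally likely to be the one that fails, so the partial likelihood reduces to the product of the reciprocals of the risk-set sizes (here 1/4 × 1/2).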
