Subject CS2
Revision Notes
For the 2019 exams
Exposed to risk
Booklet 5
covering
Copyright agreement
Legal action will be taken if these terms are infringed. In addition, we may
seek to take disciplinary action through the profession or through your
employer.
These conditions remain in force after you have finished using the course.
These chapter numbers refer to the 2019 edition of the ActEd course notes.
OVERVIEW
This booklet covers Syllabus objective 4.4, which relates to the estimation of
transition intensities (eg mortality rates) depending on age.
This involves calculating the number of transitions (eg deaths) and dividing
this by an appropriate exposed to risk. The data are subdivided according to
a specific age definition, enabling us to estimate mortality rates for specific
ages. The process is summarised by the formula:
\[ \hat{\mu}_x = \frac{d_x}{E_x^c} \]
Note that:
$\hat{\mu}_x$ is the estimate of $\mu_{x+f}$, the transition rate at a particular age $x+f$
(where $f$ is a value dependent upon the age definition of the data)
CORE READING
All of the Core Reading for the topics covered in this booklet is contained in
this section.
The text given in Arial Bold Italic font is additional Core Reading that is not
directly related to the topic being discussed.
____________
The multiple state and Poisson models analyses are based on the
assumption that we can observe groups of identical lives (or at least
lives whose mortality characteristics are the same). In practice, this is
never possible.
____________
2 The solution
(d) compute $E_x^c$.
(a) What happens when the dates of entry to and exit from observation
have not been recorded?
(b) What happens if the definition of age does not correspond exactly
to the age interval x to x + 1 (for integer x )?
____________
This is in fact similar to the way in which data are submitted to the CMI.
It is often quite convenient for companies to submit a total of policies
in force on a date such as 1 January.
____________
\[ E_x^c = \int_K^{K+N+1} P_{x,t}\,dt \]
10 The simplest approximation, and the one most often used, is that $P_{x,t}$
is linear between census dates, leading to the trapezium
approximation.
____________
11
\[ E_x^c = \int_K^{K+N+1} P_{x,t}\,dt \approx \sum_{t=K}^{K+N} \tfrac12 \left( P_{x,t} + P_{x,t+1} \right) \]
____________
This is the method used by the CMI. It is easily adapted to census data
available at more or less frequent intervals, or at irregular intervals.
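The trapezium approximation can be sketched in a few lines of Python. This is only an illustration: the function name `census_exposed_to_risk` and the example counts are hypothetical and do not come from the Core Reading. It assumes the population varies linearly between census dates, and it allows census dates at irregular intervals.

```python
from typing import Sequence


def census_exposed_to_risk(times: Sequence[float], counts: Sequence[float]) -> float:
    """Trapezium approximation to the integral of P_{x,t} over the census dates.

    times  : census dates in years, strictly increasing
    counts : number of lives under observation at each census date
    """
    if len(times) != len(counts) or len(times) < 2:
        raise ValueError("need at least two census dates, with one count per date")
    pairs = list(zip(times, counts))
    total = 0.0
    for (t0, p0), (t1, p1) in zip(pairs, pairs[1:]):
        if t1 <= t0:
            raise ValueError("census dates must be strictly increasing")
        # Area of one trapezium: width times the average of the two counts.
        total += (t1 - t0) * 0.5 * (p0 + p1)
    return total


# Hypothetical annual census counts (illustrative only).
example = census_exposed_to_risk([0, 1, 2], [1000, 1100, 1050])
```

With annual census dates every interval has width 1, so the sum reduces to the formula in paragraph 11.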
____________
Each of these identifies a different year of age, called the rate interval.
____________
14 Once the rate interval has been identified (from the age definition used
in $d_x$) the rule is that:
We must ensure that the census data are consistent with the death
data. We invoke the principle of correspondence; we must check the
following:
____________
15 The census data $P_{x,t}$ are consistent with the death data $d_x$ if and only
if, were any of the lives counted in $P_{x,t}$ to die on the census date, he or
she would be included in $d_x$.
____________
$P_{x,t}^{(2)}$ = Number of lives under observation, age $x$ nearest birthday
at time $t$, where $t$ = 1 January in calendar years $K$, $K+1$,
…, $K+N$, $K+N+1$
$P_{x,t}^{(3)}$ = Number of lives under observation, age $x$ next birthday at
time $t$, where $t$ = 1 January in calendar years $K$, $K+1$,
…, $K+N$, $K+N+1$
___________
17 In the event that the death data and the census data use different
definitions of age, we must adjust the census data. Unless it is
unavoidable, we never adjust the death data, since that ‘carries most
information’ when rates of mortality are small. Hence, it is always the
death data that determines what rate interval to use.
___________
For example, the CMI uses the definition ‘age nearest birthday’ in its
work; that is, death data as in $d_x^{(2)}$. However, some life offices
contribute census data classified by ‘age last birthday’, because that is
what is available from their records. The latter must be adjusted in
some way.
____________
This section contains all the relevant exam questions from 2008 to 2017 that
are related to the topics covered in this booklet.
Solutions are given after the questions. These give enough information for
you to check your answer, including working, and also show you what an
outline examination answer should look like. Further information may be
available in the Examiners’ Report, ASET or Course Notes. (ASET can be
ordered from ActEd.)
We first provide you with a cross-reference grid that indicates the main
subject areas of each exam question. You can use this, if you wish, to
select the questions that relate just to those aspects of the topic that you
may be particularly interested in reviewing.
Alternatively, you can choose to ignore the grid, and attempt each question
without having any clues as to its content.
Cross-reference grid
[The grid itself is not recoverable from this copy. Surviving column labels: Question; Problem of heterogeneity; Subdividing data; Principle of correspondence; calculation; ETR – census method; Estimating rates; Age to which the rate applies; Question attempted. Surviving question numbers: 10 to 24.]
List four factors in respect of which life insurance mortality statistics are often
subdivided. [2]
(i) List the data needed for the exact calculation of a central exposed to
risk depending on age. [2]
Persons with no date of death given were still alive when the investigation
ended.
(ii) Calculate a central exposed to risk using the data for the 10 lives in the
sample. [3]
(iii) (a) Calculate the maximum likelihood estimate of the hazard of death at
age 40 last birthday.
(b) Hence, or otherwise, estimate $q_{40}$. [2]
[Total 7]
List four factors often used to subdivide life insurance mortality statistics. [2]
Employees of the company are posted to the swamp for six month tours of
duty starting on 1 January, 1 April, 1 July or 1 October. The first employees
to be posted arrived on 1 January 2008. The swamp is so inaccessible that
no employees are allowed to leave before their six month tours of duty are
completed.
(i) Estimate the quarterly rate of being bitten by a spider for each quarter of
2008, stating any assumptions you make. [7]
(ii) Suggest reasons why the assumptions you made in (i) might not be
valid. [1]
[Total 8]
Two neighbouring small countries have for many years taken annual
censuses of their populations on 1 January in which each inhabitant must
give his or her age. Country A uses an ‘age last birthday’ definition of age,
whereas Country B uses an ‘age nearest birthday’ definition. Each country
has also operated a system in which deaths are recorded on an ‘age nearest
birthday at date of death’ basis.
(iii) Derive a formula which would allow the actuary to estimate the force of
mortality at age $x+f$, $\mu_{x+f}$, in a particular calendar year, in terms of
the available data, and derive a value for $f$. [6]
(iv) List four factors other than geographical location which a government
statistical office might use to subdivide data for national mortality
analysis. [2]
[Total 11]
(i) Explain the reasons why data are subdivided when conducting mortality
investigations. [2]
(ii) Describe the problems which can arise with subdividing data. [2]
[Total 4]
(i) List four factors other than age and smoker status by which life
insurance mortality statistics are often subdivided. [2]
Two offices in different towns of the same life insurance company write
25-year term assurance policies. Below are data from these two offices
relating to policyholders of the same age. Both deaths and policies in force
are on an age last birthday basis.
(ii) Calculate the central death rate for the calendar year 2009 at this age
for the offices in Gasperton and Great Hawking. [2]
(iii) Estimate the central death rates for smokers and non-smokers in
Gasperton and Great Hawking. [4]
(iv) Comment on the company’s pricing structure in the light of your results
from parts (ii) and (iii) above. [3]
[Total 11]
49 175 200
50 200 225
51 225 235
(ii) Estimate, using these data, the force of mortality at age 50 next birthday
for the period 1 January 2009 to 1 January 2011. [5]
(iii) State the exact age to which your answer to part (ii) relates. [1]
[Total 7]
Below are some data from the three most recent censuses.
Age in completed years   Population 2006 (thousands)   Population 2009 (thousands)   Population 2010 (thousands)
64                       300                           320                           350
65                       290                           310                           330
66                       280                           300                           320
Between the censuses of 2006 and 2009 there were a total of 3,000 deaths
to inhabitants aged 65 nearest birthday, and between the censuses of 2009
and 2010 there were a total of 1,000 deaths to inhabitants aged 65 nearest
birthday.
(i) Estimate, stating any assumptions you make, the death rate at age 65
years for each of the following periods:
the period between the 2006 and 2009 censuses
the period between the 2009 and 2010 censuses. [6]
(ii) Explain the exact age to which your estimates apply. [1]
[Total 7]
(ii) Discuss one potential problem with subdividing mortality data. [2]
(iii) List four factors which are commonly used to subdivide mortality data. [2]
[Total 6]
(i) Explain why data are subdivided into homogeneous groups when
mortality investigations are conducted. [2]
(ii) List four factors, other than age and sex, by which mortality statistics are
often subdivided. [2]
[Total 4]
Country A
Age last birthday   Population 1 February 2011   Population 1 February 2012   Population 1 February 2013
44                  382,000                      394,000                      401,000
45                  374,000                      381,000                      385,000
46                  354,000                      372,000                      375,000
Country B
Age nearest birthday   Population 1 August 2011   Population 1 August 2012   Population 1 August 2013
44                     382,000                    394,000                    401,000
45                     374,000                    381,000                    385,000
46                     354,000                    372,000                    375,000
In the combined lands of Countries A and B in the calendar year 2012 there
were 4,800 deaths of those aged 46 next birthday and 4,500 deaths of those
aged 45 next birthday.
The two countries decide to form an economic union, after which it will be
mandatory to offer the same rates for life insurance to residents of each
country.
(ii) Estimate the death rate at age 45 years last birthday for the two
countries combined. [6]
(iii) Explain the exact age to which your estimate relates. [1]
[Total 8]
(i) Explain the census approximation for calculating the exposed to risk
between any two census dates. [2]
Age and
Year Company A Company B Company C
year
Age 54 2011 3,400 1,250 5,780
2012 3,350 1,450 5,500
2013 3,000 1,500 6,010
(ii) Calculate the contribution to central exposed to risk for lives aged 55
last birthday for the calendar year 2012 for each of the companies. [6]
[Total 8]
The table below shows the number of people entering in various intervals
between 10:00pm and 2:00am on 30 June 2013. No one was admitted after
1:00am, and you may assume that all those who enter the premises stay
until 2:00am.
(ii) Calculate the rate per person-hour at which those attending the
nightclub aged 22 last birthday required medical attention for heat
exhaustion, stating any assumptions you make. [6]
[Total 7]
(i) State why it is important to divide data into homogeneous classes when
undertaking mortality investigations. [2]
(ii) List four factors, apart from smoking behaviour, by which mortality data
are often classified by life insurance companies. [2]
In a particular life insurance market, it has for many years been the practice
for all companies to charge smokers higher premiums than non-smokers for
the same term assurance policy. Suppose one company decides to switch
to charging smokers and non-smokers the same premiums for term
assurance policies. The other companies retain differential pricing for
smokers and non-smokers.
(iii) Discuss the likely implications for the company making the switch. [4]
[Total 8]
List four factors, other than age and sex, by which mortality statistics are
often subdivided. [2]
Company A and Company B are two small insurance companies which have
recently merged to form Company C. Company C is reviewing its premium
rates for a whole of life product and so is conducting an analysis of mortality
rates experienced.
Company A
Age next birthday   Number of policies 1 Jan 2012   Number of policies 1 Jan 2013   Number of policies 1 Jan 2014
51                  8,192                           6,421                           8,118
52                  7,684                           8,298                           7,187
53                  9,421                           8,016                           9,026
Company B
Age last birthday   Number of policies 1 April 2012   Number of policies 1 April 2013   Number of policies 1 April 2014
51                  4,496                             3,817                             4,872
52                  5,281                             5,218                             3,812
53                  4,992                             5,076                             5,076
(i) Estimate the force of mortality for the combined company for age 52 last
birthday, stating all assumptions that you make. [6]
(ii) Explain the exact age to which your estimate applies. [1]
[Total 7]
Write down the information required to compute the exact exposed to risk in
an investigation of mortality. [3]
$n_x(t)$ = total number of policies under which death claims are made when
the policyholder is aged $x$ last birthday for each calendar year $t$.
(b) State any assumptions that you make, indicating at which point in
your derivation each assumption is relevant. [5]
[Total 6]
(i) Calculate the contribution to the central exposed to risk for lives age 51
last birthday for the calendar year 2014 for each company individually. [5]
(ii) (a) State the assumptions you have made in order to perform
your calculations.
(i) Draw a transition diagram with three states which could be used to
analyse data from this scheme. [2]
(ii) Give the likelihood of the data, defining all the terms you use. [4]
(iii) Derive the maximum likelihood estimator of the rate of falling sick. [3]
The company records data on 1 January each year, classified by age last
birthday on:
the total number of employees (including those in receipt of sickness
benefit).
the number of employees in receipt of a sickness benefit.
The company wishes to estimate the rates of falling sick and recovery at age
52 years nearest birthday over the two-year period consisting of the calendar
years 2014 and 2015.
A study was conducted into the mortality of persons aged between exact
ages 85 and 86 years. The study took place from 1 April 2015 to 31 March
2016. The following table shows information on 10 lives observed in the
study.
1 1 August 2014 –
2 1 November 2014 –
3 1 January 2015 1 February 2016
4 1 February 2015 –
5 1 March 2015 –
6 1 April 2015 1 January 2016
7 1 June 2015 1 November 2015
8 1 July 2015 –
9 1 September 2015 1 March 2016
10 1 January 2016 –
(i) Calculate a central exposed to risk for the 10 lives in the sample,
working in months. [3]
(ii) Give the maximum likelihood estimate of the mortality hazard at age 85
last birthday. [1]
The solutions presented here are just outline solutions for you to use to
check your answers. See ASET for full solutions.
We have:
Adding the central exposed to risk values for all the lives gives a total of 53
months, or 4 years and 5 months.
\[ \hat{\mu} = \frac{2}{53/12} = 0.45283 \]
Using the MLE of the force of mortality from part (iii)(a), we have:
\[ \hat{q}_{40} = 1 - e^{-\hat{\mu}} = 1 - e^{-0.45283} = 0.36417 \]
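As a quick check of the arithmetic, the two estimates can be reproduced as below. This is a minimal sketch: the variable names are illustrative, and it assumes a constant force of mortality over the year of age, which is what the conversion to a one-year probability relies on.

```python
import math

deaths = 2              # deaths observed in the sample (from the solution above)
exposure_months = 53    # total central exposed to risk, in months

exposure_years = exposure_months / 12
mu_hat = deaths / exposure_years   # MLE of a constant hazard: deaths / exposure
q_hat = 1 - math.exp(-mu_hat)      # one-year death probability under a constant hazard
```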
It may be difficult to adhere to this principle if the age classification used for
the death data is different from the age classification used for the exposure
data. This can occur when the death data and exposure data come from
different sources.
Assuming that the force of mortality is constant between the exact ages of x
and x + 1 , it can be estimated by:
\[ \hat{\mu}_x = \frac{d_x}{E_x^c} \]
Let $P_x(t)$ denote the number of lives at time $t$ aged $x$ last birthday, where
time is measured in years from 1 January 2005. (With this definition,
$P_x(-\tfrac12) = P_{x,2004}$, $P_x(\tfrac12) = P_{x,2005}$, $P_x(2\tfrac12) = P_{x,2007}$, $P_x(3\tfrac12) = P_{x,2008}$.)
Then:
\[ E_x^c = \int_0^3 P_x(t)\,dt \]
\[ P_x(0) = \tfrac12 \left( P_x(-\tfrac12) + P_x(\tfrac12) \right) \]
\[ P_x(3) = \tfrac12 \left( P_x(2\tfrac12) + P_x(3\tfrac12) \right) \]
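So we now know (or have estimated values of) $P_x(0)$, $P_x(\tfrac12)$, $P_x(2\tfrac12)$ and
$P_x(3)$. If we further assume that $P_x(t)$ varies linearly between time ½ and
time 2½, then:
\[
\begin{aligned}
E_x^c &= \tfrac12 \times \tfrac12 \left[ P_x(0) + P_x(\tfrac12) \right] + 2 \times \tfrac12 \left[ P_x(\tfrac12) + P_x(2\tfrac12) \right] + \tfrac12 \times \tfrac12 \left[ P_x(2\tfrac12) + P_x(3) \right] \\
&= \tfrac14 \left[ \tfrac12 \left( P_x(-\tfrac12) + P_x(\tfrac12) \right) + P_x(\tfrac12) \right] + \left[ P_x(\tfrac12) + P_x(2\tfrac12) \right] + \tfrac14 \left[ P_x(2\tfrac12) + \tfrac12 \left( P_x(2\tfrac12) + P_x(3\tfrac12) \right) \right] \\
&= \tfrac18 P_x(-\tfrac12) + \tfrac{11}{8} P_x(\tfrac12) + \tfrac{11}{8} P_x(2\tfrac12) + \tfrac18 P_x(3\tfrac12) \\
&= \tfrac18 P_{x,2004} + \tfrac{11}{8} P_{x,2005} + \tfrac{11}{8} P_{x,2007} + \tfrac18 P_{x,2008}
\end{aligned}
\]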
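The weights 1/8, 11/8, 11/8 and 1/8 can be checked mechanically with exact fractions. The sketch below is illustrative only (the dictionary layout and names are assumptions, not part of the solution): each point on the time axis is written as a combination of the four known mid-year counts, and the trapezium rule is then applied to the intervals [0, ½], [½, 2½] and [2½, 3].

```python
from collections import defaultdict
from fractions import Fraction as F

half = F(1, 2)

# Each time point expressed in terms of the known counts at times
# -1/2, 1/2, 5/2 and 7/2 (ie P_{x,2004}, P_{x,2005}, P_{x,2007}, P_{x,2008}).
points = {
    F(0): {F(-1, 2): half, F(1, 2): half},   # P_x(0) = average of P_x(-1/2) and P_x(1/2)
    F(1, 2): {F(1, 2): F(1)},
    F(5, 2): {F(5, 2): F(1)},
    F(3): {F(5, 2): half, F(7, 2): half},    # P_x(3) = average of P_x(5/2) and P_x(7/2)
}

weights = defaultdict(F)   # Fraction() is zero, so missing keys start at 0
times = sorted(points)
for t0, t1 in zip(times, times[1:]):
    for t in (t0, t1):
        for known_time, coeff in points[t].items():
            # Trapezium rule: each endpoint gets half the interval width.
            weights[known_time] += (t1 - t0) * half * coeff
```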
Since the rate interval starts at exact age $x$ and ends at exact age $x+1$,
$\hat{\mu}_x$ estimates $\mu_{x+\frac12}$.
We will assume that all employees who are bitten by spiders are given the
antidote and make an immediate recovery and that the only deaths that
occur are accidental deaths.
First quarter
The first quarter started on 1 January with 90 employees in the first cohort
and ended with 80 (= 90 − 10) on 31 March. If we assume that the deaths
occurred uniformly over the quarter, this gives a total employee
exposed-to-risk for this period of ½(90 + 80) = 85 quarters.
Since there were 15 spider bites during this period, the rate of spider bites is
15/85 = 17.6% per quarter.
Second quarter
The second quarter started on 1 April with the remaining 80 employees from
the first cohort plus the 80 new arrivals in the second cohort, making a total
of 160, and ended with 152 ( = 160 - 8 ) on 30 June. If we assume that the
deaths occurred uniformly over the quarter, this gives a total employee
exposed-to-risk for the second quarter of ½(160 + 152) = 156 quarters.
At the start of the second quarter there were equal numbers (80 of each) of
employees from the first and second cohorts. If we assume that the
proportion of deaths during each quarter was the same for each cohort, we
would expect that the 8 deaths during the second quarter were equally split
between the two groups. So 4 of the employees from the first cohort would
go home on 30 June and 4 employees from the second cohort would have
been killed.
Third quarter
Based on the assumption of equal proportions of deaths for each cohort, the
third quarter started on 1 July with the remaining 76 ( = 80 - 4 ) employees
from the second cohort plus the 114 new arrivals from the third cohort,
making a total of 190, and ended with 180 ( = 190 - 10 ) on 30 September.
This gives a total employee exposed-to-risk for the third quarter of
½(190 + 180) = 185 quarters.
Again, if we assume that the deaths during the third quarter were in
proportion to the numbers at the start of the quarter, then the 10 deaths
during the third quarter would consist of (76/190) × 10 = 4 from the second
cohort and 6 from the third cohort.
Fourth quarter
The fourth quarter started on 1 October with the remaining 108 ( = 114 - 6 )
employees from the third cohort plus the 126 new arrivals from the fourth
cohort, making a total of 234, and ended with 221 ( = 234 - 13 ) on 31
December. This gives a total employee exposed-to-risk for the fourth
quarter of ½(234 + 221) = 227.5 quarters.
Again, if we assume that the deaths during the fourth quarter were in
proportion to the numbers at the start of the quarter, then the 13 deaths
during the fourth quarter would consist of (108/234) × 13 = 6 from the third
cohort and 7 from the fourth cohort.
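The four quarterly exposures follow the same pattern, which the sketch below reproduces. The start-of-quarter counts and deaths are the figures quoted in the solution; the helper name `exposure` is illustrative. Only the first-quarter bite count (15) is quoted in this text, so only that rate is computed.

```python
# (lives at the start of the quarter, deaths during the quarter)
quarters = {
    "Q1": (90, 10),
    "Q2": (160, 8),
    "Q3": (190, 10),
    "Q4": (234, 13),
}


def exposure(start: int, deaths: int) -> float:
    """Average of the start and end counts, assuming deaths are uniform over the quarter."""
    return 0.5 * (start + (start - deaths))


exposures = {name: exposure(n, d) for name, (n, d) in quarters.items()}
bite_rate_q1 = 15 / exposures["Q1"]   # 15 bites in the first quarter
```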
We have assumed that the probability of dying does not vary by duration, the
length of time the employees have been in the region.
However, we might expect that each cohort of employees will become more
experienced by the second half of their tour and will be less likely to be
killed. For the same reason, the assumption of uniform deaths over each
quarter may not be valid.
The question only considers accidental deaths. However, there may also be
deaths from other causes, eg tropical diseases.
Period of investigation
Suppose that:
the period of investigation covers an n -year period and starts on a 1st
January.
time is measured in years since the start of the investigation.
Death data
Death data are classified according to age nearest birthday for both
countries. Let:
Census data
The census data are classified according to age last birthday for Country A
and age nearest birthday for Country B. So the census data and death data
match for Country B but not for Country A.
Let $P_x^A(t)$ denote the number of lives in Country A at time $t$ aged $x$ last
birthday. The values of $P_x^A(0)$, $P_x^A(1)$, ..., $P_x^A(n)$ are known for all $x$.
However, since the census data don’t match the death data, we define
another function $P_x^{A*}(t)$ to be the number of lives in Country A at time $t$
aged $x$ nearest birthday. (This function does match the death data.)
\[ E_x^{c,A} = \int_0^n P_x^{A*}(t)\,dt \]
\[
\begin{aligned}
E_x^{c,A} &= \tfrac12 \left[ P_x^{A*}(0) + P_x^{A*}(1) \right] + \cdots + \tfrac12 \left[ P_x^{A*}(n-1) + P_x^{A*}(n) \right] \\
&= \tfrac12 \sum_{k=0}^{n-1} \left[ P_x^{A*}(k) + P_x^{A*}(k+1) \right]
\end{aligned}
\]
Assuming that birthdays are uniformly distributed over the calendar year:
\[ P_x^{A*}(t) = \tfrac12 \left[ P_{x-1}^A(t) + P_x^A(t) \right] \]
\[ E_x^{c,A} = \tfrac14 \sum_{k=0}^{n-1} \left[ P_{x-1}^A(k) + P_x^A(k) + P_{x-1}^A(k+1) + P_x^A(k+1) \right] \]
Let $P_x^B(t)$ denote the number of lives in Country B at time $t$ aged $x$ nearest
birthday. The values of $P_x^B(0)$, $P_x^B(1)$, ..., $P_x^B(n)$ are known for all $x$ and
the central exposed to risk for Country B is given by:
\[ E_x^{c,B} = \int_0^n P_x^B(t)\,dt \]
Assuming that $P_x^B(t)$ varies linearly between the census dates, we have:
\[ E_x^{c,B} = \tfrac12 \sum_{k=0}^{n-1} \left[ P_x^B(k) + P_x^B(k+1) \right] \]
The combined central exposed to risk for the unified state is then:
\[ E_x^c = E_x^{c,A} + E_x^{c,B} \]
Assuming that the force of mortality is constant over the rate interval
‘ x nearest birthday’, it can be estimated by:
\[ \hat{\mu}_x = \frac{d_x}{E_x^c} \]
The rate interval starts at exact age $x - \tfrac12$ and ends at exact age $x + \tfrac12$, so
the average age of the rate interval is $x$. Hence $\hat{\mu}_x$ is an estimate of $\mu_x$.
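The combined estimate can be sketched as below. All function names and the example figures are hypothetical. The sketch assumes annual census counts, linear variation between census dates, and birthdays spread uniformly over the calendar year (used to convert Country A's "age last birthday" counts to "age nearest birthday").

```python
def trapezium(counts):
    """Sum of 0.5 * (P(k) + P(k+1)) over consecutive annual census counts."""
    return sum(0.5 * (a + b) for a, b in zip(counts, counts[1:]))


def nearest_from_last(last_x_minus_1, last_x):
    """Approximate 'age x nearest birthday' counts from 'age last birthday' counts
    at ages x-1 and x, assuming birthdays are uniform over the calendar year."""
    return [0.5 * (a + b) for a, b in zip(last_x_minus_1, last_x)]


def combined_mu_hat(deaths, a_last_x_minus_1, a_last_x, b_nearest_x):
    """Deaths aged x nearest birthday divided by the combined central exposed to risk."""
    etr_a = trapezium(nearest_from_last(a_last_x_minus_1, a_last_x))
    etr_b = trapezium(b_nearest_x)
    return deaths / (etr_a + etr_b)


# Hypothetical figures for a one-year investigation (n = 1).
mu = combined_mu_hat(30, [1000, 1020], [980, 1000], [500, 520])
```

Only the census data are adjusted here; the death data keep their own age definition, which is what fixes the rate interval.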
The main reason why it might be difficult to ensure that the principle of
correspondence is adhered to is that the data used for the numerator and
the denominator may come from different sources.
(iii) Formula
We are also given information about the number of deaths. So we can let
$\theta_x$ denote the number of deaths aged $x$ nearest birthday that occurred
during the calendar year we are considering.
\[ \hat{\mu}_x = \frac{\theta_x}{E_x^c} \]
We can use the census method, which tells us that, for a one-year study:
\[ E_x^c = \int_0^1 P_x^*(t)\,dt \]
where $P_x^*(t)$ is the number of lives at time $t$ who are aged $x$ nearest
birthday (ie according to the age definition used in $\theta_x$ for the deaths).
If we assume that the population numbers vary linearly during the calendar
year we are considering, we can approximate this integral using the
trapezium rule:
\[ E_x^c \approx \tfrac12 \left[ P_x^*(0) + P_x^*(1) \right] \]
$P_x^*(0)$ is a count of the people who are aged $x$ nearest birthday at the start
of the year. Since we are not given this information directly, we will need to
approximate it. If a person is aged $x$ nearest birthday on a particular date,
then they must either be $x-1$ last birthday or $x$ last birthday. If we assume
that birthdays for these lives are spread uniformly over the calendar year in
question, there will be half of each type. So:
\[ P_x^*(0) \approx \tfrac12 \left[ P_{x-1}(0) + P_x(0) \right] \]
\[ P_x^*(1) \approx \tfrac12 \left[ P_{x-1}(1) + P_x(1) \right] \]
\[
\begin{aligned}
E_x^c &\approx \tfrac12 \left\{ \tfrac12 \left[ P_{x-1}(0) + P_x(0) \right] + \tfrac12 \left[ P_{x-1}(1) + P_x(1) \right] \right\} \\
&= \tfrac14 \left\{ P_{x-1}(0) + P_x(0) + P_{x-1}(1) + P_x(1) \right\}
\end{aligned}
\]
Here the rate interval is the year when a life is aged $x$ nearest birthday.
This runs from exact age $x - \tfrac12$ to exact age $x + \tfrac12$. We are assuming that
the force of mortality is constant over the year. So the age to which the
estimate relates will be the average age in the middle of the rate interval,
which is $x$. So $f = 0$.
\[ \hat{\mu}_x = \frac{\theta_x}{E_x^c} \]
where:
\[ E_x^c = \tfrac14 \left\{ P_{x-1}(0) + P_x(0) + P_{x-1}(1) + P_x(1) \right\} \]
(iv) Factors for subdividing the data
Factors that a government statistical office might use to subdivide the data
include:
sex
marital status
nationality or ethnic group
employment status.
Insurance companies and other users of the data may require mortality rates
for specific categories of lives, eg a 50-year-old male smoker with a term
assurance policy.
Many models of mortality assume that the lives involved all have the same
rates of mortality. Subdividing the data makes this assumption more
reasonable.
(i) Factors by which life insurance mortality statistics are often subdivided
Factors include:
sex
policy type
duration (ie how long ago the policyholder took out their policy)
‘first class’ versus ‘impaired’ lives (ie whether the policyholder has any
significant known health issues).
The central rate of mortality for the rate interval labelled $x$ is estimated
as $\theta_x / E_x^c$, where $\theta_x$ denotes the number of deaths during the investigation
of lives aged $x$ last birthday.
\[ E_x^c = \int_0^1 P_x(t)\,dt \]
If we assume that the population numbers varied linearly over the calendar
year, then using the census method:
\[ E_x^c = \tfrac12 \left[ P_x(0) + P_x(1) \right] \]
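\[ \theta_x = 25 \]
\[ E_x^c = \tfrac12 (2{,}000 + 2{,}100) = 2{,}050 \]
\[ \Rightarrow \hat{\mu}_{Gasp} = \frac{\theta_x}{E_x^c} = \frac{25}{2{,}050} = 0.0122 \]
\[ \theta_x = 21 \]
\[ E_x^c = \tfrac12 (1{,}770 + 1{,}674) = 1{,}722 \]
\[ \Rightarrow \hat{\mu}_{GHawk} = \frac{\theta_x}{E_x^c} = \frac{21}{1{,}722} = 0.0122 \]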
Let $\mu^{S}_{Gasp}$ and $\mu^{NS}_{Gasp}$ be the central mortality rates for smokers and
non-smokers in Gasperton respectively. 50% of the policyholders in
Gasperton are smokers and 50% are non-smokers. So:
\[ 0.5 \times 1.4\,\mu^{NS}_{Gasp} + 0.5\,\mu^{NS}_{Gasp} = 0.0122 \]
and hence:
\[ \mu^{NS}_{Gasp} = \frac{0.0122}{1.2} = 0.0102 \]
\[ \mu^{S}_{Gasp} = 1.4 \times 0.01016 = 0.0142 \]
Similarly, for Great Hawking, with 20% smokers and 80% non-smokers:
\[ 0.2 \times 1.4\,\mu^{NS}_{GHawk} + 0.8\,\mu^{NS}_{GHawk} = 0.0122 \]
and hence:
\[ \mu^{NS}_{GHawk} = \frac{0.0122}{1.08} = 0.0113 \]
\[ \mu^{S}_{GHawk} = 1.4 \times 0.0113 = 0.0158 \]
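The split into smoker and non-smoker rates is a one-line rearrangement, sketched below from the unrounded overall rates. The function name `split_rates` is illustrative. The 1.4 loading is the 40% smoker differential used in the solution, and the smoker proportions (50% and 20%) are the weights appearing in the equations above.

```python
def split_rates(overall, smoker_share, loading=1.4):
    """Solve smoker_share * loading * mu_ns + (1 - smoker_share) * mu_ns = overall."""
    mu_ns = overall / (smoker_share * loading + (1 - smoker_share))
    return mu_ns, loading * mu_ns


gasp_overall = 25 / 2050     # deaths / central exposed to risk, Gasperton
ghawk_overall = 21 / 1722    # deaths / central exposed to risk, Great Hawking

gasp_ns, gasp_s = split_rates(gasp_overall, 0.5)
ghawk_ns, ghawk_s = split_rates(ghawk_overall, 0.2)
```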
(iv) Comment
The insurer currently charges the same premiums for both towns. However,
our crude estimates indicate that the underlying rates for both smokers and
non-smokers are higher for Great Hawking. This would suggest that the
company might be under-charging policyholders in Great Hawking and
over-charging them in Gasperton.
If the company does not differentiate on the basis of geographical area in its
prices, it may lose business in Gasperton to a rival company that does
differentiate. Conversely, in Great Hawking it may attract new business from
rival companies, but will be under-charging and hence may risk its life
assurance fund becoming insolvent.
Our calculations also assume that the national differential of 40% in mortality
rates for smokers applies in both Gasperton and Great Hawking, whereas
the differentials may actually be different.
The insurer currently charges 40% higher premiums for smokers. However,
the theoretical premiums for a 25-year term assurance policy are not directly
proportional to the mortality rate for a single age. So a 40% higher mortality
rate does not imply a 40% higher premium rate. (This is because the
calculation of the premium is affected by other factors such as interest rates
and expenses.)
The data we have been given only allows us to estimate the crude rates for
a single age, whereas the premium calculation will depend on the mortality
rates over a 25-year age range.
\[ \hat{\mu}_{x+f} = \frac{\theta_x}{E_x^c} \]
We are also given information about the number of deaths. So we can let
$\theta_x(t)$ denote the number of deaths aged $x$ next birthday in calendar year $t$
and $P_x^*(t)$ denote the number of lives aged $x$ next birthday at time $t$.
So:
\[ P_x^*(t) = P_{x-1}(t) \]
We can use the census method, which tells us that, for a two-year study:
\[ E_x^c = \int_0^2 P_x^*(t)\,dt \]
We can rewrite this as:
\[ E_x^c = \int_0^2 P_x^*(t)\,dt = \int_0^1 P_x^*(t)\,dt + \int_1^{1.5} P_x^*(t)\,dt + \int_{1.5}^2 P_x^*(t)\,dt \]
\[ E_x^c \approx \tfrac12 \left[ P_x^*(0) + P_x^*(1) \right] + \tfrac12 \times \tfrac12 \left[ P_x^*(1) + P_x^*(1.5) \right] + \tfrac12 \times \tfrac12 \left[ P_x^*(1.5) + P_x^*(2) \right] \]
Here we are assuming that the population varies linearly between census
dates.
Simplifying and using the fact that $P_x^*(t) = P_{x-1}(t)$, this becomes:
\[
\begin{aligned}
E_x^c &\approx \tfrac12 P_x^*(0) + \tfrac34 P_x^*(1) + \tfrac12 P_x^*(1.5) + \tfrac14 P_x^*(2) \\
&= \tfrac12 P_{x-1}(0) + \tfrac34 P_{x-1}(1) + \tfrac12 P_{x-1}(1.5) + \tfrac14 P_{x-1}(2)
\end{aligned}
\]
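\[
\begin{aligned}
\hat{\mu}_{50+f} &= \frac{\theta_{50}(2009) + \theta_{50}(2010)}{\tfrac12 P_{49}(0) + \tfrac34 P_{49}(1) + \tfrac12 P_{49}(1.5) + \tfrac14 P_{49}(2)} \\
&= \frac{200 + 225}{\tfrac12 \times 2{,}000 + \tfrac34 \times 2{,}100 + \tfrac12 \times 2{,}300 + \tfrac14 \times 2{,}500} \\
&= \frac{425}{4{,}350} = 0.097701
\end{aligned}
\]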
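The numerical step can be checked as follows. This is a small sketch with illustrative variable names; the weights ½, ¾, ½, ¼ and the census counts are those in the formula above.

```python
deaths = 200 + 225   # deaths aged 50 next birthday in 2009 and in 2010

# (weight, census count of lives aged 49 last birthday) at t = 0, 1, 1.5 and 2
terms = [(0.5, 2000), (0.75, 2100), (0.5, 2300), (0.25, 2500)]

exposure = sum(weight * count for weight, count in terms)
mu_hat = deaths / exposure
```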
The force of mortality $\mu$ corresponds to the age in the middle of the rate
interval. $\theta_x$ denotes the number of deaths aged $x$ next birthday. So, any
life dying must be between the ages of $x-1$ and $x$ and the rate interval
is $[x-1, x)$. So the age to which our estimate of the force of mortality $\mu$
applies is 49.5.
The central exposed to risks for age x for each of the two periods are then:
\[ E_x^c(1) = \int_0^3 P_x^*(t)\,dt \quad \text{and} \quad E_x^c(2) = \int_3^4 P_x^*(t)\,dt \]
If we now assume that the population numbers vary linearly between the
census dates, we can apply the trapezium rule to approximate the integrals:
\[ E_x^c(1) \approx 3 \times \tfrac12 \left\{ P_x^*(0) + P_x^*(3) \right\} \quad \text{and} \quad E_x^c(2) \approx \tfrac12 \left\{ P_x^*(3) + P_x^*(4) \right\} \]
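So:
\[ P_x^*(t) = \tfrac12 \left\{ P_{x-1}(t) + P_x(t) \right\} \]
The census formulae then become:
\[ E_x^c(1) \approx 3 \times \tfrac12 \left\{ \tfrac12 \left( P_{x-1}(0) + P_x(0) \right) + \tfrac12 \left( P_{x-1}(3) + P_x(3) \right) \right\} \]
and:
\[ E_x^c(2) \approx \tfrac12 \left\{ \tfrac12 \left( P_{x-1}(3) + P_x(3) \right) + \tfrac12 \left( P_{x-1}(4) + P_x(4) \right) \right\} \]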
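Substituting the numerical values (expressed in thousands) when $x = 65$
gives:
\[ E_{65}^c(1) \approx 3 \times \tfrac12 \left\{ \tfrac12 \left( P_{64}(0) + P_{65}(0) \right) + \tfrac12 \left( P_{64}(3) + P_{65}(3) \right) \right\} \]
Similarly:
\[ E_{65}^c(2) \approx \tfrac12 \left\{ \tfrac12 \left( P_{64}(3) + P_{65}(3) \right) + \tfrac12 \left( P_{64}(4) + P_{65}(4) \right) \right\} \]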
\[ \hat{\mu}_{65}(1) = \frac{\theta_{65}(1)}{E_{65}^c(1)} = \frac{3}{915} = 0.00328 \]
and:
\[ \hat{\mu}_{65}(2) = \frac{\theta_{65}(2)}{E_{65}^c(2)} = \frac{1}{327.5} = 0.00305 \]
This assumes that the force of mortality is constant over each year of age.
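The two exposures and rates can be reproduced from the census table as below. This is a sketch with illustrative names; it assumes, as the solution does, uniform birthdays (to convert completed-years counts to age nearest birthday) and linear population change between censuses. Counts and deaths are in thousands.

```python
def nearest(count_x_minus_1, count_x):
    """Approximate lives aged x nearest birthday from completed-years counts at x-1 and x."""
    return 0.5 * (count_x_minus_1 + count_x)


# {census year: (aged 64, aged 65)} in completed years, in thousands
census = {2006: (300, 290), 2009: (320, 310), 2010: (350, 330)}
p_star = {year: nearest(*counts) for year, counts in census.items()}

etr_1 = (2009 - 2006) * 0.5 * (p_star[2006] + p_star[2009])
etr_2 = (2010 - 2009) * 0.5 * (p_star[2009] + p_star[2010])

mu_1 = 3 / etr_1   # 3,000 deaths between the 2006 and 2009 censuses
mu_2 = 1 / etr_2   # 1,000 deaths between the 2009 and 2010 censuses
```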
These estimates have been derived for the rate interval labelled as 65,
which is the period when the lives were aged 65 nearest birthday. This runs
from exact age 64½ to exact age 65½. In the middle of this period the lives
were aged exactly 65. So the estimate of the force of mortality applies to
exact age 65.
One problem with subdividing data is that some of the subgroups may be
very small, containing only a few individuals.
Estimates of mortality rates derived from the small groups will be unreliable,
as it will be difficult to pin down the true underlying rates with any certainty.
The other main problem is incomplete data. You can discuss this instead
here.
Insurance companies and other users of the data may require mortality rates
for specific categories of lives, eg a 50-year-old male smoker with a term
assurance policy.
Many models of mortality assume that the lives involved all have the same
rates of mortality. Subdividing the data makes this assumption more
reasonable.
Apart from age and sex, mortality statistics are often subdivided by:
smoker status
policy type (for an insurance company)
known medical conditions
location / postcode.
To estimate the death rate at age 45 last birthday for the calendar year 2012
based on the two countries combined, we need to divide the total number of
deaths aged 45 last birthday by the total central exposed to risk at age 45
last birthday.
The total number of deaths aged 45 last birthday is the same as the total
number of deaths aged 46 next birthday, which we are told is 4,800 for
calendar year 2012.
Country A
The central exposed to risk at age 45 last birthday for Country A for the
calendar year 2012 is:
\[ E_{45}^c(A) = \int_0^1 P_{45}^A(t)\,dt \]
where $P_{45}^A(t)$ denotes the number of lives in Country A aged 45 last birthday
$t$ years after 1 January 2012. This corresponds to the shaded area in the
graph below.
To approximate this area, we can divide the shaded area into two
trapeziums:
one covering the period from 1 Jan 12 to 1 Feb 12, which is of length 1
month
one covering the period from 1 Feb 12 to 1 Jan 13, which is of length
11 months.
So:
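\[ E_{45}^c(A) = \int_0^{1/12} P_{45}^A(t)\,dt + \int_{1/12}^1 P_{45}^A(t)\,dt \]
[Graph not reproduced: Country A, lives aged 45 last birthday against time, with census values 374,000, 381,000 and 385,000 and unknown values ("?") at 1 Jan 12 and 1 Jan 13.]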
Assuming that the population numbers vary linearly between the census
dates, we can approximate these integrals using the trapezium rule to get:
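\[ E_{45}^c(A) \approx \tfrac{1}{12} \times \tfrac12 \left( P_{45}^A(0) + P_{45}^A(\tfrac{1}{12}) \right) + \tfrac{11}{12} \times \tfrac12 \left( P_{45}^A(\tfrac{1}{12}) + P_{45}^A(1) \right) \]
We can find the value of $P_{45}^A(\tfrac{1}{12})$, ie at 1 Feb 12, directly from the table:
\[ P_{45}^A(\tfrac{1}{12}) = 381{,}000 \]
To find the values of $P_{45}^A(0)$ and $P_{45}^A(1)$, ie the January figures, we need to
interpolate between the February figures:
\[ P_{45}^A(0) \approx \tfrac{1}{12} \times 374{,}000 + \tfrac{11}{12} \times 381{,}000 = 380{,}417 \]
\[ P_{45}^A(1) \approx \tfrac{1}{12} \times 381{,}000 + \tfrac{11}{12} \times 385{,}000 = 384{,}667 \]
We then have:
\[ E_{45}^c(A) \approx \tfrac{1}{12} \times \tfrac12 (380{,}417 + 381{,}000) + \tfrac{11}{12} \times \tfrac12 (381{,}000 + 384{,}667) = 382{,}656 \]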
Country B
1 Aug 2011: ½(374,000 + 354,000) = 364,000
1 Aug 2012: ½(381,000 + 372,000) = 376,500
1 Aug 2013: ½(385,000 + 375,000) = 380,000
The central exposed to risk at age 45 last birthday for Country B for the
calendar year 2012 is:
\[ E_{45}^c(B) = \int_0^1 P_{45}^B(t)\,dt \]
where $P_{45}^B(t)$ denotes the number of lives in Country B aged 45 last birthday
$t$ years after 1 January 2012. This corresponds to the shaded area in the
graph below.
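[Graph: number of lives in Country B aged 45 last birthday, with 1 August values of 364,000 (2011), 376,500 (2012) and 380,000 (2013); the values at 1 Jan 12 and 1 Jan 13 are unknown.]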
Again, assuming that the population numbers vary linearly between the
census dates, we can approximate this as:
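$$E^c_{45}(B) = \int_0^{7/12} P^B_{45}(t)\,dt + \int_{7/12}^{1} P^B_{45}(t)\,dt$$
$$\approx \tfrac{7}{12} \times \tfrac{1}{2}\left(P^B_{45}(0) + P^B_{45}(\tfrac{7}{12})\right) + \tfrac{5}{12} \times \tfrac{1}{2}\left(P^B_{45}(\tfrac{7}{12}) + P^B_{45}(1)\right)$$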
We calculated the value of $P^B_{45}(\tfrac{7}{12})$, ie at 1 August 2012, above:
$$P^B_{45}(\tfrac{7}{12}) = 376{,}500$$
To find the values of $P^B_{45}(0)$ and $P^B_{45}(1)$, ie the January figures, we need to
interpolate between the August figures:
$$P^B_{45}(0) \approx \tfrac{7}{12} \times 364{,}000 + \tfrac{5}{12} \times 376{,}500 = 369{,}208$$
$$P^B_{45}(1) \approx \tfrac{7}{12} \times 376{,}500 + \tfrac{5}{12} \times 380{,}000 = 377{,}958$$
We then have:
$$E^c_{45}(B) \approx \tfrac{7}{12} \times \tfrac{1}{2}(369{,}208 + 376{,}500) + \tfrac{5}{12} \times \tfrac{1}{2}(376{,}500 + 377{,}958) = 374{,}677$$
So the estimate of the death rate for age 45 last birthday is:
$$\frac{4{,}800}{382{,}656 + 374{,}677} = 0.006338$$
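A more general way to organise this calculation is to integrate the piecewise-linear population curve directly. The sketch below is one possible arrangement, not a prescribed method: the function `census_exposure` and the convention of measuring time in years from 1 January 2012 are assumptions made for illustration.

```python
def census_exposure(census, start, end):
    """Integrate a piecewise-linear population curve over [start, end].

    census: list of (time, population) pairs sorted by time, with times in
    years; the curve is assumed linear between consecutive census points.
    """
    def value_at(t):
        for (t0, p0), (t1, p1) in zip(census, census[1:]):
            if t0 <= t <= t1:
                return p0 + (p1 - p0) * (t - t0) / (t1 - t0)
        raise ValueError("time outside the census range")

    # Break points: start, end, and every census date strictly in between.
    knots = [start] + [t for t, _ in census if start < t < end] + [end]
    return sum(
        (b - a) * 0.5 * (value_at(a) + value_at(b))
        for a, b in zip(knots, knots[1:])
    )

# Times are measured in years from 1 January 2012.
country_a = [(-11 / 12, 374_000), (1 / 12, 381_000), (13 / 12, 385_000)]
country_b = [(-5 / 12, 364_000), (7 / 12, 376_500), (19 / 12, 380_000)]

exposure = census_exposure(country_a, 0, 1) + census_exposure(country_b, 0, 1)
rate = 4_800 / exposure
print(round(exposure), round(rate, 6))
```

Because the curve is linear within each sub-interval, the trapezium rule is exact for it, so this reproduces the hand calculation up to rounding of the intermediate figures.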
We have estimated a mortality rate for the rate interval when lives were aged
45 last birthday (assuming that the force of mortality is constant over this
period). This period runs from exact age 45 to exact age 46. In the middle
of this period the lives were aged 45½. So the estimate applies to age 45½.
If we assume that the population number varies linearly over this period, this
can be approximated using the trapezium rule to give:
$$E^c_x = \int_{t_1}^{t_2} P_x(t)\,dt \approx (t_2 - t_1) \times \tfrac{1}{2}\left[P_x(t_1) + P_x(t_2)\right]$$
We need to find the central exposed to risk for age 55 last birthday for the
calendar year 2012, ie for the period from 1 January 2012 to 31 December
2012 (which we can assume is the same as 1 January 2013).
Company A
$$E^c_{55}(A) \approx \tfrac{1}{2}(3{,}205 + 3{,}025) = 3{,}115$$
Company B
The central exposed to risk at age 55 last birthday for Company B for the
calendar year 2012 is:
$$E^c_{55}(B) = \int_0^1 P^B_{55}(t)\,dt$$
where $P^B_{55}(t)$ denotes the number of lives in Company B aged 55 last
birthday $t$ years after 1 January 2012. This corresponds to the shaded area
in the graph below.
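[Graph: number of lives in Company B aged 55 last birthday, with 31 March census values of 1,190 (2011), 1,300 (2012) and 1,440 (2013); the values at 1 Jan 12 and 1 Jan 13 are unknown.]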
This graph is not to scale. It shows the population numbers assuming they
vary linearly between the census dates (31 March).
To approximate this area, we can divide the shaded area into two
trapeziums:
one covering the period from 1 Jan 12 to 31 Mar 12, which is of length 3
months
one covering the period from 31 Mar 12 to 1 Jan 13, which is of length
9 months.
So:
$$E^c_{55}(B) = \int_0^{3/12} P^B_{55}(t)\,dt + \int_{3/12}^{1} P^B_{55}(t)\,dt$$
Assuming that the population numbers vary linearly between the census
dates, we can approximate these integrals using the trapezium rule to get:
$$E^c_{55}(B) \approx \tfrac{3}{12} \times \tfrac{1}{2}\left(P^B_{55}(0) + P^B_{55}(\tfrac{3}{12})\right) + \tfrac{9}{12} \times \tfrac{1}{2}\left(P^B_{55}(\tfrac{3}{12}) + P^B_{55}(1)\right)$$
$P^B_{55}(\tfrac{3}{12})$ is the number on 31 Mar 12, which we know is 1,300.
$P^B_{55}(0)$ is the number on 1 Jan 12, which we can find by interpolation
between the 31 Mar 11 and 31 Mar 12 figures:
$$P^B_{55}(0) = \tfrac{1}{4} \times 1{,}190 + \tfrac{3}{4} \times 1{,}300 = 1{,}272.5$$
$P^B_{55}(1)$ is the number on 1 Jan 13, which we can find by interpolation
between the 31 Mar 12 and 31 Mar 13 figures:
$$P^B_{55}(1) = \tfrac{1}{4} \times 1{,}300 + \tfrac{3}{4} \times 1{,}440 = 1{,}405$$
So, finally, we can estimate the exposed to risk for Company B as:
$$E^c_{55}(B) \approx \tfrac{3}{12} \times \tfrac{1}{2}(1{,}272.5 + 1{,}300) + \tfrac{9}{12} \times \tfrac{1}{2}(1{,}300 + 1{,}405) = 1{,}335.94$$
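As a check on the Company B figures, the same steps can be carried out in exact rational arithmetic, which avoids any rounding of the interpolated values. This is only a sketch; the variable names are illustrative and the census counts are those used in the solution.

```python
from fractions import Fraction as F

mar11, mar12, mar13 = 1_190, 1_300, 1_440   # 31 March census counts

# 1 January is three quarters of the way from one 31 March to the next.
p0 = F(1, 4) * mar11 + F(3, 4) * mar12      # 1 Jan 2012
p1 = F(1, 4) * mar12 + F(3, 4) * mar13      # 1 Jan 2013

# Trapeziums of length 3 months and 9 months.
exposure_b = F(3, 12) * F(1, 2) * (p0 + mar12) + F(9, 12) * F(1, 2) * (mar12 + p1)

print(float(p0), float(p1), float(exposure_b))   # 1272.5 1405.0 1335.9375
```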
Company C
A life aged 55 last birthday on 1 January 2012 was aged 55 last birthday at
the end of the 2011 calendar year (the day before), or equivalently, aged 56
next birthday at the end of the 2011 calendar year.
Similarly, the number aged 55 last birthday on 1 January 2013 was 5,980.
We can then use the trapezium approximation to find the exposed to risk:
$$E^c_{55}(C) \approx \tfrac{1}{2}(5{,}950 + 5{,}980) = 5{,}965$$
Individuals aged 22 last birthday on 30th June 2013 were born between 1st
July 1990 (23 years earlier) and 30th June 1991 (22 years earlier). So,
assuming that births were uniform over this period, half were born in 1990
and half in 1991.
$$E^c_{22} = \int_0^4 P_{22}(t)\,dt$$
$$P_{22}(0) = 0$$
$$P_{22}(1.5) = \tfrac{1}{2}(200 + 150) = 175$$
[Graph: $P_{22}(t)$ against time $t$, with plotted values of 175, 575 and 900.]
$$E^c_{22} = \int_0^{1.5} P_{22}(t)\,dt + \int_{1.5}^{2} P_{22}(t)\,dt + \int_{2}^{3} P_{22}(t)\,dt + \int_{3}^{4} P_{22}(t)\,dt$$
Assuming that the population numbers vary linearly between the census
times, ie that people enter the nightclub uniformly over each period, we can
approximate these integrals using the trapezium rule to get:
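$$E^c_{22} \approx 1.5 \times \tfrac{1}{2}\left(P_{22}(0) + P_{22}(1.5)\right) + 0.5 \times \tfrac{1}{2}\left(P_{22}(1.5) + P_{22}(2)\right) + 1 \times \tfrac{1}{2}\left(P_{22}(2) + P_{22}(3)\right) + 1 \times \tfrac{1}{2}\left(P_{22}(3) + P_{22}(4)\right)$$
$$= 1.5 \times \tfrac{1}{2}(0 + 175) + 0.5 \times \tfrac{1}{2}(175 + 575) + 1 \times \tfrac{1}{2}(575 + 900) + 1 \times \tfrac{1}{2}(900 + 900)$$
$$= 1{,}956.25$$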
This method assumes that the entry times for the individuals within each
cohort are distributed uniformly over the relevant time period.
$$\frac{40}{1{,}956.25} = 0.02045 \text{ per hour}$$
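When the census times are unevenly spaced, the trapezium rule is simply applied interval by interval. The following sketch reproduces the figures above; the list names are illustrative, and the counts at times 0, 1.5, 2, 3 and 4 hours are those implied by the solution.

```python
times = [0, 1.5, 2, 3, 4]          # hours since the start of observation
counts = [0, 175, 575, 900, 900]   # number aged 22 present at each census time

# One trapezium per pair of consecutive census times.
exposure = sum(
    (t1 - t0) * 0.5 * (p0 + p1)
    for t0, t1, p0, p1 in zip(times, times[1:], counts, counts[1:])
)
rate = 40 / exposure   # 40 is the event count used in the solution

print(exposure, round(rate, 5))   # 1956.25 0.02045
```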
Insurance companies and other users of the data may require mortality rates
for specific categories of lives, eg a 50-year-old male smoker with a term
assurance policy.
Many models of mortality assume that the lives involved all have the same
rates of mortality. Subdividing the data makes this assumption more
reasonable.
This will result in the company attracting more smokers and fewer
non-smokers. The company will tend to make losses on the smokers and
lose the opportunity to make profits on the non-smokers. This could
eventually lead to bankruptcy.
If the company opts to apply the higher smoker rates to everyone, then
smokers will be paying a premium that is consistent with the market,
whereas the non-smoker rates will be uncompetitive.
As a result, there will be fewer policies sold to non-smokers, but policies sold
to smokers should be unaffected.
Apart from age and sex, mortality statistics are often subdivided by:
smoker status
policy type (for an insurance company)
known medical conditions
location / postcode.
To estimate the death rate at age 52 last birthday for the calendar year 2013
based on the two companies combined, we need to divide the total number
of deaths aged 52 last birthday (ie 28 + 17 = 45 ) by the total central exposed
to risk for age 52 last birthday. This assumes that the force of mortality is
constant over the year of age.
Company A
The central exposed to risk at age 52 last birthday for Company A for the
calendar year 2013 is:
$$E^c_{52}(A) = \int_0^1 P^A_{52}(t)\,dt$$
where $P^A_{52}(t)$ denotes the number of Company A policyholders aged 52 last
birthday $t$ years after 1 January 2013.
Assuming that the population numbers vary linearly between the census
dates (ie between 1 Jan 2013 and 1 Jan 2014), we can approximate this
integral using the trapezium rule to get:
$$E^c_{52}(A) \approx \tfrac{1}{2}\left(P^A_{52}(0) + P^A_{52}(1)\right)$$
The population numbers for Company A are recorded using age next
birthday. We can find the corresponding numbers age last birthday by
noting that age x last birthday is equivalent to age x + 1 next birthday.
$P^A_{52}(0) = 8{,}016$ and $P^A_{52}(1) = 9{,}026$
So:
$$E^c_{52}(A) \approx \tfrac{1}{2}(8{,}016 + 9{,}026) = 8{,}521$$
Company B
The central exposed to risk at age 52 last birthday for Company B for the
calendar year 2013 is:
$$E^c_{52}(B) = \int_0^1 P^B_{52}(t)\,dt$$
where $P^B_{52}(t)$ denotes the number of Company B policyholders aged 52 last
birthday $t$ years after 1 January 2013. This corresponds to the shaded area
in the graph below.
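[Graph: number of Company B policyholders aged 52 last birthday, with 1 April census values of 5,281, 5,218 and 3,812; the values at 1 Jan 13 and 1 Jan 14 are unknown.]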
To approximate this area, we can divide the shaded area into two
trapeziums:
one covering the period from 1 Jan 13 to 1 Apr 13, which is of length
3 months
one covering the period from 1 Apr 13 to 1 Jan 14, which is of length
9 months.
So:
$$E^c_{52}(B) = \int_0^{3/12} P^B_{52}(t)\,dt + \int_{3/12}^{1} P^B_{52}(t)\,dt$$
Assuming that the population numbers vary linearly between the census
dates, we can approximate these integrals using the trapezium rule to get:
$$E^c_{52}(B) \approx \tfrac{3}{12} \times \tfrac{1}{2}\left(P^B_{52}(0) + P^B_{52}(\tfrac{3}{12})\right) + \tfrac{9}{12} \times \tfrac{1}{2}\left(P^B_{52}(\tfrac{3}{12}) + P^B_{52}(1)\right)$$
To find the values of $P^B_{52}(0)$ and $P^B_{52}(1)$, ie the January figures, we need to
interpolate between the April figures:
$$P^B_{52}(0) \approx \tfrac{3}{12} \times 5{,}281 + \tfrac{9}{12} \times 5{,}218 = 5{,}233.75$$
$$P^B_{52}(1) \approx \tfrac{3}{12} \times 5{,}218 + \tfrac{9}{12} \times 3{,}812 = 4{,}163.5$$
We then have:
$$E^c_{52}(B) \approx \tfrac{3}{12} \times \tfrac{1}{2}(5{,}233.75 + 5{,}218) + \tfrac{9}{12} \times \tfrac{1}{2}(5{,}218 + 4{,}163.5) = 4{,}824.53$$
So the estimated combined force of mortality is:
$$\frac{28 + 17}{8{,}521 + 4{,}824.53} = \frac{45}{13{,}345.53} = 0.00337$$
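The pooled estimate divides total deaths by total exposure, rather than averaging the two companies' separate rates. A minimal sketch of that pooling, using the figures from the solution (the variable names, and the labelling of the three April census values as 2012, 2013 and 2014, are assumptions made for illustration):

```python
deaths = {"A": 28, "B": 17}

# Company A: census on 1 January, so a single trapezium suffices.
exposure_a = 0.5 * (8_016 + 9_026)

# Company B: census on 1 April, so interpolate to 1 January first.
apr12, apr13, apr14 = 5_281, 5_218, 3_812
jan13 = 0.25 * apr12 + 0.75 * apr13
jan14 = 0.25 * apr13 + 0.75 * apr14
exposure_b = 0.25 * 0.5 * (jan13 + apr13) + 0.75 * 0.5 * (apr13 + jan14)

# Pooled estimate: total deaths over total central exposed to risk.
mu_hat = sum(deaths.values()) / (exposure_a + exposure_b)

print(exposure_a, round(exposure_b, 2), round(mu_hat, 5))
```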
We have estimated a mortality rate for the rate interval when lives were aged
52 last birthday (assuming that the force of mortality is constant over this
period). This period runs from exact age 52 to exact age 53. In the middle
of this period the lives were aged 52½. So the estimate applies to age 52½.
$$E^c_x = \int_0^1 P^*_x(s)\,ds$$
Here $P^*_x(s)$ is the number of in-force policies where the policyholder was
aged $x$ last birthday at time $s$ years after the start of year $t$.
$$E^c_x = \tfrac{1}{2}\left[P^*_x(0) + P^*_x(1)\right]$$
However, the population numbers provided in the data are based on age
nearest birthday. So we need to express P * in terms of P .
We assume that the birthdays of the individuals in the study are distributed
uniformly over the calendar year. (Assumption 2)
$$P^*_x(s) = \tfrac{1}{2}\left\{P_x(t + s) + P_{x+1}(t + s)\right\}$$
To help get the ages in this relationship correct, you can imagine that we
want to know how many people are aged 25 (say) last birthday. Such
people must be either 25 or 26 nearest birthday and, under Assumption 2,
this would be an equal split.
So:
$$P^*_{25}(\cdot) = \tfrac{1}{2}\left\{P_{25}(\cdot) + P_{26}(\cdot)\right\}$$
For the times, note that our notation Px* (0) refers to time 0 after the start of
year t , ie 1st January in year t . This is the same time point as in the
symbol Px (t ) given in the question. Similarly, Px* (1) refers to time 1 after the
start of year t , ie 31st December in year t , or 1st January in year t + 1 .
This is the same time point as in the symbol Px (t + 1) given in the question.
$$E^c_x = \tfrac{1}{2}\left[\tfrac{1}{2}\left\{P_x(t) + P_{x+1}(t)\right\} + \tfrac{1}{2}\left\{P_x(t+1) + P_{x+1}(t+1)\right\}\right]$$
$$= \tfrac{1}{4}\left[P_x(t) + P_{x+1}(t) + P_x(t+1) + P_{x+1}(t+1)\right]$$
(ii)(b) Assumptions
Company A
We can approximate the number of policies for lives aged 51 last birthday on
1 January 2014 as:
$$\tfrac{1}{2}\left(P^A_{51}(2014) + P^A_{52}(2014)\right) = \tfrac{1}{2}(6{,}002 + 5{,}600) = 5{,}801$$
We can approximate the number of policies for lives aged 51 last birthday on
1 January 2015 as:
$$\tfrac{1}{2}\left(P^A_{51}(2015) + P^A_{52}(2015)\right) = \tfrac{1}{2}(5{,}056 + 4{,}906) = 4{,}981$$
So the contribution to the exposed to risk for lives aged 51 last birthday for
the calendar year 2014 for Company A is:
$$E^c_{51}(A) \approx \tfrac{1}{2}(5{,}801 + 4{,}981) = 5{,}391$$
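The Company A calculation combines two steps: converting the census counts to the age definition of the deaths, then applying the trapezium rule. A sketch of both steps in one function follows; the function name and argument names are hypothetical, and it assumes uniformly distributed birthdays and linear variation over the year, as the solution does.

```python
def exposure_last_birthday(px_start, px1_start, px_end, px1_end):
    """Central exposed to risk at age x last birthday over one year.

    Inputs are census counts at two adjacent age labels (x and x + 1) at the
    start and end of the year. Averaging the two labels converts the counts
    to age x last birthday, assuming birthdays are uniform over the year.
    """
    p_star_start = 0.5 * (px_start + px1_start)
    p_star_end = 0.5 * (px_end + px1_end)
    # Trapezium rule over the one-year period.
    return 0.5 * (p_star_start + p_star_end)

print(exposure_last_birthday(6_002, 5_600, 5_056, 4_906))   # 5391.0
```

This is the same as taking one quarter of the sum of the four census counts.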
Company B
We can approximate the number of policies for lives aged 51 last birthday on
1 January 2014 as:
$$\tfrac{10}{12} \times P^B_{51}(2013) + \tfrac{2}{12} \times P^B_{51}(2014) = \tfrac{10}{12} \times 2{,}333 + \tfrac{2}{12} \times 2{,}417 = 2{,}347$$
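We can approximate the number of policies for lives aged 51 last birthday on
1 January 2015 in the same way as:
$$\tfrac{10}{12} \times P^B_{51}(2014) + \tfrac{2}{12} \times P^B_{51}(2015) = 2{,}383$$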
So the contribution to the exposed to risk for lives aged 51 last birthday for
the calendar year 2014 for Company B is:
$$E^c_{51}(B) = \tfrac{10}{12} \times \tfrac{1}{2}(2{,}347 + 2{,}417) + \tfrac{2}{12} \times \tfrac{1}{2}(2{,}417 + 2{,}383) = 2{,}385$$
Company C
Lives aged 51 last birthday on 1 January 2014 were aged 52 next birthday
on that date (or equivalently on 31 December 2013).
So the number of policies for lives aged 51 last birthday on 1 January 2014
is:
$$P^C_{52}(2013) = 3{,}895$$
Lives aged 51 last birthday on 1 January 2015 were aged 52 next birthday
on that date (or equivalently on 31 December 2014).
This figure is not provided in the data, but we can approximate it as:
$$\tfrac{1}{2}\left(P^C_{52}(2013) + P^C_{52}(2015)\right) = \tfrac{1}{2}(3{,}895 + 4{,}367) = 4{,}131$$
So the contribution to the exposed to risk for lives aged 51 last birthday for
the calendar year 2014 for Company C is:
$$E^c_{51}(C) \approx \tfrac{1}{2}(3{,}895 + 4{,}131) = 4{,}013$$
(ii)(a) Assumptions
Company A
We have also assumed that the number of policies in force varies linearly
over the calendar year 2014.
Company B
We have assumed that the number of policies in force varies linearly over
each year.
We have also assumed that each calendar month is of equal length (or more
specifically that 1 November is 10/12ths of the way through the calendar
year).
Company C
We have also assumed that the number of policies in force varies linearly
over the period from 1 January 2013 to 31 December 2014.
The assumptions relating to the ages are needed to ensure that the exposed
to risk is based on the same age definition as is used for recording the
deaths.
[Diagram: states H and S.]
(ii) Likelihood
If $t_i$ denotes the observed waiting time in state $i$ and $n_{ij}$ denotes the
observed number of transitions from state $i$ to state $j$, the likelihood is:
where C is a constant.
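$$\frac{\partial}{\partial\sigma} \ln L = -t_H + \frac{n_{HS}}{\sigma}$$
$$\sigma = \frac{n_{HS}}{t_H}$$
$$\frac{\partial^2}{\partial\sigma^2} \ln L = -\frac{n_{HS}}{\sigma^2} < 0$$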
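A quick numerical sanity check of this result is sketched below. The function `log_likelihood` contains only the terms of ln L that depend on σ, written so that its derivative matches the expression above; the waiting time and transition count are hypothetical values chosen purely for illustration.

```python
import math

def log_likelihood(sigma, t_h, n_hs):
    # Sigma-dependent part of ln L; differentiating gives -t_h + n_hs / sigma.
    return -sigma * t_h + n_hs * math.log(sigma)

t_h, n_hs = 250.0, 20   # hypothetical waiting time (years) and transition count
sigma_hat = n_hs / t_h

# The closed-form estimate should give a higher log-likelihood than nearby values.
for other in (0.9 * sigma_hat, 1.1 * sigma_hat):
    assert log_likelihood(sigma_hat, t_h, n_hs) > log_likelihood(other, t_h, n_hs)

print(sigma_hat)   # 0.08
```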
We can calculate the central exposed to risk using a census approach based
on the formula:
$$E^c_x = \int_0^2 P^*_x(t)\,dt$$
$$E^c_x = \tfrac{1}{2}\left[P^*_x(0) + P^*_x(1)\right] + \tfrac{1}{2}\left[P^*_x(1) + P^*_x(2)\right] = \tfrac{1}{2} P^*_x(0) + P^*_x(1) + \tfrac{1}{2} P^*_x(2)$$
However, the population numbers provided in the data are based on age last
birthday. So we need to express P * in terms of Px (t ) , the number of
employees at time t who are aged x last birthday.
If we assume that the birthdays of the individuals in the study are distributed
uniformly over the calendar year, we would then have:
$$P^*_x(t) = \tfrac{1}{2}\left\{P_{x-1}(t) + P_x(t)\right\}$$
So:
$$E^c_x = \tfrac{1}{2} \times \tfrac{1}{2}\left\{P_{x-1}(0) + P_x(0)\right\} + \tfrac{1}{2}\left\{P_{x-1}(1) + P_x(1)\right\} + \tfrac{1}{2} \times \tfrac{1}{2}\left\{P_{x-1}(2) + P_x(2)\right\}$$
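$$E^c_{52}(\text{all}) = \tfrac{1}{2} \times \tfrac{1}{2}\left\{P_{51}(0) + P_{52}(0)\right\} + \tfrac{1}{2}\left\{P_{51}(1) + P_{52}(1)\right\} + \tfrac{1}{2} \times \tfrac{1}{2}\left\{P_{51}(2) + P_{52}(2)\right\}$$
$$= \tfrac{1}{2} \times \tfrac{1}{2}(148 + 146) + \tfrac{1}{2}(162 + 148) + \tfrac{1}{2} \times \tfrac{1}{2}(180 + 160)$$
$$= 313.5$$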
The denominator in the calculation of the recovery rate is the exposed to risk
for employees receiving sickness benefit. We can calculate this by just
considering the sick employees:
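$$E^c_{52}(\text{sick}) = \tfrac{1}{2} \times \tfrac{1}{2}\left\{P^S_{51}(0) + P^S_{52}(0)\right\} + \tfrac{1}{2}\left\{P^S_{51}(1) + P^S_{52}(1)\right\} + \tfrac{1}{2} \times \tfrac{1}{2}\left\{P^S_{51}(2) + P^S_{52}(2)\right\}$$
$$= \tfrac{1}{2} \times \tfrac{1}{2}(12 + 10) + \tfrac{1}{2}(20 + 18) + \tfrac{1}{2} \times \tfrac{1}{2}(8 + 7)$$
$$= \tfrac{1}{2} \times 11 + 19 + \tfrac{1}{2} \times 7.5$$
$$= 28.25$$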
The denominator in the calculation of the sickness rate is the exposed to risk
for healthy employees who are not receiving sickness benefit. By
subtraction, this is:
$$E^c_{52}(\text{healthy}) = E^c_{52}(\text{all}) - E^c_{52}(\text{sick}) = 313.5 - 28.25 = 285.25$$
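The three exposures can be reproduced with one helper applied to two sets of counts. This is a sketch under the same assumptions as the solution (uniform birthdays, linear variation between census dates); the function name `two_year_exposure` is hypothetical.

```python
def two_year_exposure(younger, older):
    """Exposure over a two-year period from census counts at times 0, 1 and 2.

    younger[k] and older[k] are the numbers aged x - 1 and x last birthday at
    time k. Averaging them converts the counts to the age definition P*.
    """
    p_star = [0.5 * (a + b) for a, b in zip(younger, older)]
    # Two one-year trapeziums: the middle census receives full weight.
    return 0.5 * p_star[0] + p_star[1] + 0.5 * p_star[2]

all_staff = two_year_exposure([148, 162, 180], [146, 148, 160])
sick = two_year_exposure([12, 20, 8], [10, 18, 7])
healthy = all_staff - sick

print(all_staff, sick, healthy)   # 313.5 28.25 285.25
```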
The dates shown in bold in the table below are the dates on which we
started and finished observing each life at age 85 last birthday during the
observation period. The number of months exposed to risk is then
calculated by subtraction.
So the number of months contributed to the exposed to risk by each life is:
4, 7, 9, 10, 11, 9, 5, 9, 6, 3
The number of deaths at age 85 last birthday in the sample is 3 (ie Lives 6, 7
and 9).
$$\hat{\mu} = \frac{d_{85}}{E^c_{85}} = \frac{3}{6\tfrac{1}{12}} = 0.49315$$
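With exact exposure data, the calculation reduces to summing the individual contributions and converting months to years. A minimal sketch using the figures above (variable names are illustrative):

```python
months = [4, 7, 9, 10, 11, 9, 5, 9, 6, 3]   # months observed at age 85 last birthday
deaths = 3                                   # Lives 6, 7 and 9

exposure_years = sum(months) / 12            # 73 months = 6 1/12 years
mu_hat = deaths / exposure_years

print(sum(months), round(mu_hat, 5))   # 73 0.49315
```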
FACTSHEET
use the date of exit and date of birth data to work out the date at which
the life was last observed between the $x$th and $(x + 1)$th birthdays
calculate the length of time between the two above dates (expressed in
years), to give the individual's contribution to $E^c_x$.
Finally sum all contributions over all lives in the investigation: this is $E^c_x$.
define a variable $P_{x,t}$ that corresponds exactly (at time $t$) to the age
definition of deaths
write down the integral of $P_{x,t}$ over $[a, b]$, where $a$ and $b$ are the start
and end dates of the investigation period respectively, ie:
$$E^c_x = \int_a^b P_{x,t}\,dt$$
split the investigation period into suitable time intervals (eg intervals
between successive census dates)
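for each interval, use the trapezium rule approximation to calculate the
value of the integral, ie:
$$\int_{t_1}^{t_2} P_{x,t}\,dt \approx (t_2 - t_1) \times \tfrac{1}{2}\left[P_{x,t_1} + P_{x,t_2}\right]$$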
Rate intervals
A rate interval is a period of one year during which a person has a particular
age label. They are determined by the age definition of the deaths.
‘x last birthday’ means ‘year starting at the xth birthday’.
‘x nearest birthday’ means ‘year with the xth birthday in the middle’.
‘x next birthday’ means ‘year ending at the xth birthday’.
work out the rate interval that is implied by the definition of deaths
work out the age in the middle of the rate interval.
work out the rate interval that is implied by the definition of deaths
work out the age at the start of the rate interval.
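The rate interval rules above can be summarised in a small lookup. The sketch below is illustrative only: the function `estimate_age` and its `at_start` flag are hypothetical names, and the offsets follow directly from the three age definitions listed under "Rate intervals".

```python
# Offset of the start of the rate interval from exact age x, in years.
RATE_INTERVAL_START = {"last": 0.0, "nearest": -0.5, "next": -1.0}

def estimate_age(x, definition, at_start=False):
    """Age at the middle (default) or start of the rate interval for label x."""
    start = x + RATE_INTERVAL_START[definition]
    return start if at_start else start + 0.5

print(estimate_age(45, "last"))                  # 45.5
print(estimate_age(45, "nearest"))               # 45.0
print(estimate_age(45, "next", at_start=True))   # 44.0
```

For example, deaths classified as aged 45 last birthday fall in the rate interval from exact age 45 to exact age 46, whose midpoint is 45½.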
NOTES