Probability and Mathematical Statistics

CRAMER · The Elements of Probability Theory and Some of Its Applications
DOOB · Stochastic Processes
FELLER · An Introduction to Probability Theory and Its Applications, Volume I, Third Edition
FELLER · An Introduction to Probability Theory and Its Applications, Volume II
FISHER · Contributions to Mathematical Statistics
FISZ · Probability Theory and Mathematical Statistics, Third Edition
FRASER · Nonparametric Methods in Statistics
HOEL · Introduction to Mathematical Statistics, Third Edition

Applied Probability and Statistics (Continued)

CHAKRAVARTI, LAHA and ROY · Handbook of Methods of Applied Statistics, Vol. II
CHERNOFF and MOSES · Elementary Decision Theory
CHEW · Experimental Designs in Industry
CHIANG · Introduction to Stochastic Processes in Biostatistics
CLELLAND, deCANI, BROWN, BURSK, and MURRAY · Basic Statistics with Business Applications
COCHRAN and COX · Experimental Designs, Second Edition
DODGE and ROMIG · Sampling Inspection Tables, Second Edition
HOEL · Elementary Statistics, Second Edition
Introduction to Stochastic Processes in Biostatistics
A WILEY PUBLICATION IN
APPLIED STATISTICS
Introduction to Stochastic
Processes in Biostatistics

CHIN LONG CHIANG
University of California, Berkeley

John Wiley & Sons, Inc.
New York · London · Sydney
Copyright © 1968 by John Wiley & Sons, Inc.

All rights reserved. No part of this book may be reproduced
by any means, nor transmitted, nor translated into a machine
language without the written permission of the publisher.
In Memory of My Parents
Preface
Time, life, and risks are three basic elements of stochastic processes in
biostatistics. Risks of death, risks of illness, risks of birth, and other
risks act continuously on man with varying degrees of intensity. Long
before the development of modern probability and statistics, men were
concerned with the chance of dying and the length of life, and they con-
structed tables to measure longevity. But it was not until the advances
in the theory of stochastic processes made in recent years that empirical
processes in the human population have been systematically studied from
a probabilistic point of view.
The purpose of this book is to present stochastic models describing
these empirical processes. Emphasis is placed on specific results and
explicit solutions rather than on the general theory of stochastic processes.
Those readers who have a greater curiosity about the theoretical arguments
are advised to consult the rich literature on the subject.
A basic knowledge of probability and statistics is required for a profit-
able reading of the text. Calculus is the only mathematics presupposed,
although some familiarity with differential equations and matrix algebra
is needed for a thorough understanding of the material.
The text is divided into two parts. Part 1 begins with one chapter on
random variables and one on probability generating functions for use in
succeeding chapters. Chapter 3 is devoted to basic models of population
growth, ranging from the Poisson process to the time-dependent birth-
death process. Some other models of practical interest that are not in-
cluded elsewhere are given in the problems at the end of the chapter.
Birth and death are undoubtedly the most important events in the
human population, but the illness process is statistically more complex.
Illnesses are potentially concurrent, repetitive, and reversible, and con-
sequently analysis is more challenging. In this book illnesses are treated
as discrete entities, and a population is visualized as consisting of discrete
states of illness. An individual is said to be in a particular state of illness
if he is affected with the corresponding diseases. Since he may leave one
illness state for another or enter a death state, consideration of illness
opens up a new domain of interest in multiple transition probability and
multiple transition time. A basic and important case is that in which there
are two illness states. Two chapters (Chapters 4 and 5) are devoted to this
simple illness-death process.
In dealing with a general illness-death process that considers any finite
number of illness states, I found myself confronted with a finite Markov
process. To avoid repetition and to maintain a reasonable graduation of
mathematical involvement, I have interrupted the development of illness
processes to discuss the Kolmogorov differential equations for a general
situation in Chapter 6. This chapter is concerned almost entirely with the
derivation of explicit solutions of these equations. For easy reference a
section (Section 3) on matrix algebra is included.
Once the Kolmogorov differential equations are solved in Chapter 6,
the discussion on the general illness-death process in Chapter 7 becomes
straightforward; however, the model contains sufficient points of interest
to require a separate chapter. The general illness-death process has been
extended in Chapter 8 in order to account for the population increase
through immigration and birth. These two possibilities lead to the emigra-
tion-immigration process and the birth-illness-death process, respectively.
But my effort failed to provide an explicit solution for the probability
distribution function in the latter case.
Part 2 is devoted to special problems in survival and mortality. The
life table and competing risks are classical and central topics in biostatistics,
while the follow-up study dealing with truncated information is of con-
siderable practical importance. I have endeavored to integrate these
topics as thoroughly as possible with probabilistic
and statistical principles.
I hope that I have done justice to these topics and to modern probability
and statistics.
It should be emphasized that although the concept of illness processes
through 8.
Watson who read some of the chapters and provided useful suggestions.
My thanks are also due to Mrs. Shirley A. Hinegardner for her expert
typing of the difficult material; to Mrs. Dorothy Wyckoff for her patience
with the numerical computations; and to Mrs. Lois Karp for secretarial
assistance.
Chin Long Chiang
University of California, Berkeley
May, 1968
Contents
PART 1
1. Random Variables
1. Introduction 3
2. Random Variables 4
3. Multivariate Probability Distributions 7
4. Mathematical Expectation 10
4.1. A Useful Inequality 11
2. Probability Generating Functions

1. Introduction 24
2. General Properties 24
3. Convolutions 26
4. Examples 27
4.1. Binomial Distribution 27
4.2. Poisson Distribution 28
4.3. Geometric and Negative Binomial Distributions 28
5. Partial Fraction Expansions 30
6. Multivariate Probability Generating Functions 31
1. Introduction 73
2. Illness Transition Probability, $P_{\alpha\beta}(t)$, and Death Transition Probability, $Q_{\alpha\delta}(t)$ 75
3. Chapman-Kolmogorov Equations 80
4. Expected Durations of Stay in Illness and Death States 81
5. Population Sizes in Illness States and Death States 82
5.1. The Limiting Distribution 85
Problems 86
1. Introduction 89
2. Multiple Exit Transition Probabilities, $P_{\alpha\beta}^{(m)}(t)$ 90
2.1. Conditional Distribution of the Number of Transitions 94
3. Multiple Transition Probabilities Leading to Death, $Q_{\alpha\delta}^{(m)}(t)$ 95
4. Chapman-Kolmogorov Equations 99
5. More Identities for Multiple Transition Probabilities 101
6. Multiple Entrance Transition Probabilities, $p_{\alpha\beta}^{(m)}(t)$ 104
7. Multiple Transition Time, $T_{\alpha\beta}^{(m)}$ 106
7.1. Multiple Transition Time Leading to Death, $T_{\alpha\delta}^{(m)}$ 109
7.2. Identities for Multiple Transition Time 110
Problems 111
4.2. First Solution for Individual Transition Probabilities, $P_{ij}(t)$ 135
4.3. Second Solution for Individual Transition Probabilities, $P_{ij}(t)$ 138
4.4. Identity of the Two Solutions 140
4.5. Chapman-Kolmogorov Equations 141
Problems 142
1. Introduction 151
2. Transition Probabilities 153
2.1. Illness Transition Probabilities, $P_{\alpha\beta}(t)$ 153
2.2. Transition Probabilities Leading to Death, $Q_{\alpha\delta}(t)$ 156
2.3. An Equality Concerning Transition Probabilities 158
2.4. Limiting Transition Probabilities 160
3.2. Multiple Transition Probabilities Leading to Death, $Q_{\alpha\delta}^{(m)}(t)$ 167
3.3. Multiple Entrance Transition Probabilities, $p_{\alpha\beta}^{(m)}(t)$ 168
Problems 169
1. Introduction 171
2. Emigration-Immigration Processes — Poisson-Markov
Processes 173
2.1. The Differential Equations 174
2.2. Solution for the Probability Generating Function 176
2.3. Relation to the Illness-Death Process and Solution for
the Probability Distribution 181
2.4. Constant Immigration 182
3. A Birth-Illness-Death Process 183
Problems 184
PART 2
1. Introduction 218
1.1. Probability Distribution of the Number of Survivors 219
2. Joint Probability Distribution of the Numbers of Survivors 221
2.1. An Urn Scheme 223
3. Joint Probability Distribution of the Numbers of Deaths 225
4. Optimum Properties of $\hat{p}_j$ and $\hat{q}_j$ 225
4.1. Maximum Likelihood Estimator of $p_j$ 226
4.2. Cramér-Rao Lower Bound for the Variance of an Unbiased Estimator of $p_j$ 229
4.3. Sufficiency and Efficiency of $\hat{p}_j$ 231
5. Distribution of the Observed Expectation of Life 233
5.1. Observed Expectation of Life and Sample Mean
Length of Life 235
5.2. Variance of the Observed Expectation of Life 237
Problems 240
References
Introduction to Stochastic Processes in Biostatistics
Part 1
CHAPTER 1
Random Variables
1. INTRODUCTION
A large body of probability theory and statistics has been developed for
the study of phenomena arising from random experiments. Some studies
take the form of mathematical models constructed to describe observable
events, while others are concerned with statistical inference regarding
random experiments. A familiar random experiment is the tossing of dice.
When a die is tossed, there are six possible outcomes and it is not certain
which one will occur. In a laboratory determination of antibody titer the
result will also vary from one trial to another, even if the same blood
specimen is used and the laboratory conditions are kept constant. Examples
of random experiments can be found almost everywhere; in fact, the
concept of random experiment may be extended so that any phenomenon
may be thought of as the result of some random experiment, be it real
or hypothetical.
As a framework for discussing random phenomena, it is convenient
to represent each conceivable outcome of a random experiment by a
point, called a sample point, denoted by s. The totality of all sample
points for a particular experiment is called the sample space, denoted by
S. Events may be represented by subsets of S; thus an event A consists
of a certain collection of possible outcomes s. If two subsets contain no
points s in common, they are said to be disjoint, and the corresponding
events are said to be mutually exclusive: they cannot both occur as a
result of a single experiment.
The probabilities of the various events in S are the starting point for
analysis of the experiment represented by S. Denoting the probability
of an event A by Pr{A}, we may state the three fundamental assumptions
about these probabilities as follows.
$$\Pr\{A\} \ge 0 \quad \text{for every event } A, \tag{1.1}$$

$$\Pr\{S\} = 1, \tag{1.2}$$

and, for mutually exclusive events $A_1, A_2, \ldots$,

$$\Pr\{A_1 \cup A_2 \cup \cdots\} = \Pr\{A_1\} + \Pr\{A_2\} + \cdots. \tag{1.3}$$

This is called the countably additive assumption. In the case of two mutually exclusive events, $A_1$ and $A_2$, we have

$$\Pr\{A_1 \cup A_2\} = \Pr\{A_1\} + \Pr\{A_2\}.$$
2. RANDOM VARIABLES
Any single-valued numerical function X(s) defined on a sample space S will be called a random variable; thus, a random variable associates with each point s in the sample space a unique real number, called its value. Let $p_i = \Pr\{X(s) = x_i\}$ denote the probability that the random variable assumes the value $x_i$. The sequence $\{p_i\}$ is called the probability distribution of X(s), and the function

$$F(x) = \Pr\{X(s) \le x\} = \sum_{x_i \le x} p_i, \qquad -\infty < x < \infty, \tag{2.2}$$

is called the distribution function of X(s).
If X(s) assumes only one value, say $x_k$, with $p_k = 1$, then X(s) is called a degenerate random variable, and is, in effect, a constant.
X(s) is a proper random variable if its distribution $\{p_i\}$ satisfies the condition

$$\sum_i p_i = 1, \tag{2.3}$$

and an improper random variable if

$$\sum_i p_i < 1. \tag{2.4}$$
corresponding probabilities $p_i = \frac{1}{6}$, $i = 1, \ldots, 6$. Now we define a binomial random variable X, the number of successes in n independent trials, each trial resulting in success with probability p or in failure with probability 1 − p. Its probability distribution is

$$\Pr\{X = k\} = \binom{n}{k} p^k (1 - p)^{n-k}, \qquad k = 0, 1, \ldots, n, \tag{2.5}$$

where the binomial coefficient

$$\binom{n}{k} = \frac{n!}{k!\,(n - k)!} \tag{2.6}$$

is the number of ways in which k successes can occur in n trials.
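As a numerical check of (2.5) and (2.6), the following sketch (in Python, with arbitrarily chosen illustrative values of n and p; the code is not part of the original text) computes the binomial probabilities and verifies that they form a proper distribution in the sense of (2.3):

```python
from math import comb

def binomial_pmf(k, n, p):
    # Pr{X = k} = C(n, k) p^k (1 - p)^(n - k), equation (2.5)
    return comb(n, k) * p**k * (1 - p)**(n - k)

# the probabilities over k = 0, 1, ..., n sum to one
n, p = 10, 0.3
probs = [binomial_pmf(k, n, p) for k in range(n + 1)]
assert abs(sum(probs) - 1.0) < 1e-12
```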
The function f(x) is called the probability density function (or density function) of X. The distribution function of X is

$$F(x) = \int_{-\infty}^{x} f(y)\,dy. \tag{2.7}$$

Values of the density function are not themselves probabilities; they are nonnegative but need not be less than unity. The density function is merely a tool which by (2.7) will yield the probability that X lies in any interval.
As in the discrete case, a continuous random variable is proper if its density function integrates to unity. An important example is the exponential distribution, with the density function

$$f(x) = \begin{cases} \mu e^{-\mu x}, & x > 0, \\ 0, & x \le 0, \end{cases} \tag{2.10}$$

and the distribution function

$$F(x) = \begin{cases} 1 - e^{-\mu x}, & x > 0, \\ 0, & x \le 0. \end{cases} \tag{2.11}$$
Another important example is the standard normal distribution, with the density function

$$f(x) = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}, \qquad -\infty < x < \infty. \tag{2.12}$$

The distribution function is

$$F(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}}\,e^{-y^2/2}\,dy. \tag{2.13}$$
$$\sum_i \sum_j p_{ij} = 1. \tag{3.2}$$

Let

$$p_i = \sum_j p_{ij} \tag{3.3}$$

and

$$q_j = \sum_i p_{ij}. \tag{3.4}$$

Then

$$p_i = \sum_j p_{ij} = \Pr\{X = x_i\} \tag{3.5}$$

and

$$q_j = \sum_i p_{ij} = \Pr\{Y = y_j\}. \tag{3.6}$$

The sequences $\{p_i\}$ and $\{q_j\}$ are called the marginal distributions of X and Y, respectively. They are, of course, simply the probability distributions of X and Y; the adjective "marginal" has meaning only in relation to
the joint probability distribution $\{p_{ij}\}$. It is clear that the marginal distributions $\{p_i\}$ and $\{q_j\}$ may be determined from the joint distribution $\{p_{ij}\}$;
however, the joint distribution cannot, in general, be determined from the
marginal distributions.
For improper random variables, a joint distribution can be defined
exactly as in (3.1), but the equations (3.2), (3.5), and (3.6) are no longer
valid. In the remainder of this chapter we consider only proper random
variables, since the concepts to be introduced are not useful for improper
random variables.
The conditional probability distribution of Y given $X = x_i$ is defined by

$$\Pr\{Y = y_j \mid X = x_i\} = \frac{p_{ij}}{p_i}, \qquad p_i > 0. \tag{3.7}$$

The random variables X and Y are independent if

$$p_{ij} = p_i q_j \tag{3.8}$$

for all i and j. Formula (3.8) is equivalent to the more intuitive condition that

$$\Pr\{Y = y_j \mid X = x_i\} = \Pr\{Y = y_j\}. \tag{3.9}$$
In the continuous case, with joint density h(x, y) and marginal densities f(x) and g(y), the conditional densities

$$f(x \mid y) = \frac{h(x, y)}{g(y)} \quad \text{and} \quad g(y \mid x) = \frac{h(x, y)}{f(x)} \tag{3.14}$$

are defined for g(y) > 0, f(x) > 0, and satisfy

$$f(x) = \int f(x \mid y)\,g(y)\,dy \quad \text{and} \quad g(y) = \int g(y \mid x)\,f(x)\,dx. \tag{3.15}$$
For three random variables X, Y, and Z defined on the same sample space, the joint probability distribution is

$$p_{ijk} = \Pr\{X = x_i, Y = y_j, Z = z_k\}. \tag{3.17}$$

Two levels of marginal distribution may now be defined: one by

$$p_{ij} = \Pr\{X = x_i, Y = y_j\} = \sum_k p_{ijk} \tag{3.18}$$

and the other by

$$p_i = \Pr\{X = x_i\} = \sum_j p_{ij} = \sum_j \sum_k p_{ijk}. \tag{3.19}$$
Two levels of conditional probability can also be defined in the same way.
Independence of X, Y, and Z is defined by the relation

$$p_{ijk} = \Pr\{X = x_i\}\,\Pr\{Y = y_j\}\,\Pr\{Z = z_k\} \tag{3.20}$$

for all i, j, and k. A collection (X, Y, …, Z) of random variables may be regarded as a random vector, each component being defined on that same sample space; in the later chapters we shall find the vector notation convenient.
4. MATHEMATICAL EXPECTATION
The mathematical expectation of a (proper) random variable X is defined as

$$E(X) = \sum_i x_i p_i \tag{4.1}$$

in the discrete case, and as $E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$ in the continuous case, provided that the sum or integral converges absolutely; otherwise X has no expectation.

For the Poisson distribution with parameter λ,

$$\Pr\{X = k\} = \frac{e^{-\lambda} \lambda^k}{k!}, \qquad k = 0, 1, \ldots, \tag{4.3}$$

the expectation of X is

$$E(X) = \sum_{k=0}^{\infty} k\,\frac{e^{-\lambda} \lambda^k}{k!} = \lambda. \tag{4.4}$$

For the exponential distribution (2.10),

$$E(X) = \int_0^{\infty} x\,\mu e^{-\mu x}\,dx = \frac{1}{\mu}. \tag{4.6}$$
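The two expectations just computed can be checked numerically. The sketch below (Python, with arbitrary parameter values chosen for illustration only) sums the Poisson series and approximates the exponential integral with a midpoint rule:

```python
from math import exp, factorial

# Truncated-series check of (4.4): E(X) = sum k e^{-lam} lam^k / k! = lam.
lam = 2.5
poisson_mean = sum(k * exp(-lam) * lam**k / factorial(k) for k in range(100))
assert abs(poisson_mean - lam) < 1e-9

# Midpoint-rule check of (4.6): integral of x mu e^{-mu x} dx over (0, 100).
mu, h = 1.5, 0.001
expo_mean = sum((i + 0.5) * h * mu * exp(-mu * (i + 0.5) * h) * h
                for i in range(100000))
assert abs(expo_mean - 1 / mu) < 1e-3
```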
Then, for any constants a, b, and c, we may use the definition of expectation to write

$$E(a + bX + cY) = a \sum_i \sum_j p_{ij} + b \sum_i x_i \sum_j p_{ij} + c \sum_j y_j \sum_i p_{ij}$$
$$= a + b \sum_i x_i p_i + c \sum_j y_j q_j = a + bE(X) + cE(Y).$$
In general we have
Theorem 1. The expectation of a linear function of random variables is equal to the same linear function of the expectations, so that if $X_1, \ldots, X_n$ are random variables and $c_0, c_1, \ldots, c_n$ are constants, then

$$E(c_0 + c_1 X_1 + \cdots + c_n X_n) = c_0 + c_1 E(X_1) + \cdots + c_n E(X_n).$$

More generally, the equality

$$E[\phi(X_1, \ldots, X_n)] = \phi[E(X_1), \ldots, E(X_n)]$$

holds true when φ is a linear function. This equality does not hold true for nonlinear functions in general.

4.1. A Useful Inequality

If the random variable X assumes only positive values, then

$$E\left(\frac{1}{X}\right) \ge \frac{1}{E(X)}. \tag{4.11}$$

The reverse inequality holds true if X assumes only negative values.
Proof. Let

$$X = E(X)\left[1 + \frac{X - E(X)}{E(X)}\right] = E(X)(1 + \Delta), \tag{4.12}$$

so that

$$E(\Delta) = E\left[\frac{X - E(X)}{E(X)}\right] = 0. \tag{4.13}$$

From (4.12),

$$E\left(\frac{1}{X}\right) = \frac{1}{E(X)}\,E\left(\frac{1}{1 + \Delta}\right) = \frac{1}{E(X)}\,E\left(1 - \Delta + \frac{\Delta^2}{1 + \Delta}\right) \ge \frac{1}{E(X)}, \tag{4.16}$$

since the random variables $\Delta^2$ and $1 + \Delta$, and hence the expectation $E[\Delta^2/(1 + \Delta)]$, are positive.
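Inequality (4.11) is easy to confirm on a concrete distribution. The following sketch (Python; the values and probabilities are arbitrary choices, not from the text) computes both sides for a positive-valued discrete random variable:

```python
# Discrete check of (4.11): for a positive random variable X,
# E(1/X) >= 1/E(X).  The support and probabilities are arbitrary.
xs = [1.0, 2.0, 5.0]
ps = [0.2, 0.5, 0.3]
EX = sum(x * p for x, p in zip(xs, ps))          # E(X) = 2.7
E_inv = sum((1 / x) * p for x, p in zip(xs, ps)) # E(1/X) = 0.51
assert E_inv >= 1 / EX
```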
The conditional expectation of Y, given $X = x_i$, is defined by

$$E(Y \mid x_i) = \sum_j y_j\,\frac{p_{ij}}{p_i}, \qquad p_i > 0, \tag{4.17}$$

in the discrete case, and by

$$E(Y \mid x) = \int_{-\infty}^{\infty} y\,\frac{h(x, y)}{f(x)}\,dy, \qquad f(x) > 0, \tag{4.18}$$

in the continuous case. $E(Y \mid X)$ is the random variable that takes on the value $E(Y \mid x_i)$ when X assumes the value $x_i$.¹ Conditional expectations of functions of several random variables are
defined analogously, with the random variable upon which the expectation
is conditioned treated as a constant when calculating the expectation;
thus, for example,
$$E[(X + Y) \mid X] = X + E(Y \mid X), \tag{4.19}$$

$$E[XY \mid X] = X\,E[Y \mid X]. \tag{4.20}$$

Theorem 3.

$$E[E(Y \mid X)] = E(Y). \tag{4.21}$$

Corollary.

$$E(XY) = E[X\,E(Y \mid X)]. \tag{4.22}$$
Proof. In the discrete case,

$$E[E(Y \mid X)] = \sum_i \sum_j y_j p_{ij} = \sum_j y_j \sum_i p_{ij} = \sum_j y_j q_j = E(Y). \tag{4.23}$$
The proof for the continuous case is left to the reader. The corollary
follows from (4.20) and from the theorem.
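Theorem 3 can be verified directly on a small joint distribution. The sketch below (Python; the joint probabilities are an arbitrary example, not from the text) computes E(Y | X = x) by (4.17) and averages it over the distribution of X:

```python
# Check of Theorem 3, E[E(Y|X)] = E(Y), on a small joint distribution p_ij.
p = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.4}  # arbitrary pmf
p_x = {x: sum(v for (i, j), v in p.items() if i == x) for x in (0, 1)}
EY = sum(j * v for (i, j), v in p.items())
# E(Y | X = x) by (4.17), then its expectation over X
E_Y_given = {x: sum(j * p[(x, j)] for j in (0, 1)) / p_x[x] for x in (0, 1)}
EEY = sum(E_Y_given[x] * p_x[x] for x in (0, 1))
assert abs(EEY - EY) < 1e-12
```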
Theorem 4. If X, Y, …, Z are independent random variables, then

$$E(XY \cdots Z) = E(X)\,E(Y) \cdots E(Z). \tag{4.24}$$
¹ As an aid in interpreting repeated expectations, remember that $E(Y \mid X)$ is a function of X.
5. MOMENTS, VARIANCE, AND COVARIANCE

The rth factorial moment of X is

$$E[X(X - 1) \cdots (X - r + 1)], \qquad r = 1, 2, \ldots. \tag{5.2}$$
The first moment, which is nothing but the expectation E(X), is called the
mean of the distribution and is a commonly used index of its location.
The second central moment

$$\sigma_X^2 = E[X - E(X)]^2 \tag{5.4}$$

is called the variance of X, and its positive square root $\sigma_X$ is called the standard deviation. It is easy to show that the variance of a constant c is zero,

$$\sigma_c^2 = 0, \tag{5.5}$$

and that it depends on the scale of measurement but not on the origin of measurement:

$$\sigma_{a+bX}^2 = b^2 \sigma_X^2. \tag{5.6}$$
For the Poisson distribution (4.3),

$$E(X^2) = \sum_{k=0}^{\infty} k^2\,\frac{e^{-\lambda} \lambda^k}{k!} = \lambda(\lambda + 1), \tag{5.7}$$

so that

$$\sigma_X^2 = \lambda(\lambda + 1) - \lambda^2 = \lambda. \tag{5.8}$$

For the exponential distribution (2.10),

$$\sigma_X^2 = \int_0^{\infty} \left(x - \frac{1}{\mu}\right)^2 \mu e^{-\mu x}\,dx = \frac{1}{\mu^2}. \tag{5.9}$$
Example 3. Suppose that we have two urns. The ith urn contains a proportion $p_i$ of white balls and $q_i$ of black balls, with $p_i + q_i = 1$, $i = 1, 2$. From the ith urn $n_i$ balls are drawn with replacement, of which $X_i$ are white. The number $n_1$ is fixed in advance, but the number $n_2$ is a random variable and is set equal to the number of white balls drawn from the first urn; in other words, $n_2 = X_1$. The proportions of white balls drawn
are

$$\hat{p}_1 = \frac{X_1}{n_1} \quad \text{and} \quad \hat{p}_2 = \frac{X_2}{n_2}. \tag{5.11}$$
In general, we have

Theorem 5. The variance of the linear function $c_0 + c_1 X_1 + \cdots + c_n X_n$ is

$$\operatorname{Var}(c_0 + c_1 X_1 + \cdots + c_n X_n) = \sum_{i=1}^{n} c_i^2 \sigma_{X_i}^2 + \sum_{i=1}^{n} \sum_{\substack{j=1\\ j \ne i}}^{n} c_i c_j\,\sigma_{X_i, X_j}. \tag{5.12}$$

If $X_1, \ldots, X_n$ are uncorrelated, then

$$\operatorname{Var}(c_1 X_1 + \cdots + c_n X_n) = c_1^2 \sigma_{X_1}^2 + \cdots + c_n^2 \sigma_{X_n}^2, \tag{5.13}$$

and if $X_1, \ldots, X_n$ are independent and have the same variance $\sigma_X^2$, then the mean $\bar{X} = (X_1 + \cdots + X_n)/n$ has the variance

$$\operatorname{Var}(\bar{X}) = \frac{\sigma_X^2}{n}. \tag{5.14}$$
Example. Consider n independent Bernoulli trials, each with probability p of success, and let $X_i$, defined as the number of successes in the ith trial, have the Bernoulli distribution

$$\Pr\{X_i = 0\} = 1 - p, \qquad \Pr\{X_i = 1\} = p,$$

so that

$$E(X_i) = p \tag{5.15}$$

and

$$\sigma_{X_i}^2 = p(1 - p). \tag{5.16}$$

Let $Z = X_1 + \cdots + X_n$. Then Z represents the number of successes in n trials and hence has the binomial distribution (2.5). Theorem 1 and (5.15) imply that

$$E(Z) = np, \tag{5.17}$$

and, since $X_1, \ldots, X_n$ are independent, Theorem 5 and (5.16) imply that

$$\sigma_Z^2 = np(1 - p). \tag{5.18}$$

Note that this approach to E(Z) and $\sigma_Z^2$ is much simpler than direct computation from (2.5).
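A Monte Carlo experiment illustrates (5.17) and (5.18). This sketch (Python; sample size, n, and p are arbitrary illustrative choices) simulates the binomial count as a sum of Bernoulli trials and compares the empirical mean and variance with np and np(1 − p):

```python
import random

# Monte Carlo check of (5.17)-(5.18): Z = X_1 + ... + X_n has mean np
# and variance np(1 - p).
random.seed(1)
n, p, reps = 20, 0.3, 20000
zs = [sum(1 for _ in range(n) if random.random() < p) for _ in range(reps)]
mean = sum(zs) / reps
var = sum((z - mean) ** 2 for z in zs) / reps
assert abs(mean - n * p) < 0.1          # np = 6
assert abs(var - n * p * (1 - p)) < 0.2 # np(1-p) = 4.2
```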
Suppose that each of n independent repetitions of an experiment results in exactly one of the mutually exclusive events $E_0, E_1, \ldots, E_m$, with respective probabilities $p_0, p_1, \ldots, p_m$, where $\sum_{i=0}^{m} p_i = 1$. Then the numbers $X_1, \ldots, X_m$ of occurrences of $E_1, \ldots, E_m$ in the n independent repetitions of the experiment have jointly the multinomial distribution given by²

$$\Pr\{X_1 = k_1, \ldots, X_m = k_m\} = \frac{n!}{k_1! \cdots k_m!\,(n - k)!}\,p_1^{k_1} \cdots p_m^{k_m}\,p_0^{\,n-k},$$
$$k_i = 0, 1, \ldots, n; \quad i = 1, \ldots, m, \tag{5.19}$$

where $k = k_1 + \cdots + k_m$. We are interested in the covariance between $X_i$ and $X_j$. It is clear that each $X_i$ has the binomial distribution with the variance

$$\sigma_{X_i}^2 = n p_i (1 - p_i), \tag{5.20}$$

and the sum $X_i + X_j$, $i \ne j$, likewise has a binomial distribution, with probability $p_i + p_j$, so that

$$\sigma_{X_i + X_j}^2 = n(p_i + p_j)(1 - p_i - p_j). \tag{5.21}$$

On the other hand, by Theorem 5,

$$\sigma_{X_i + X_j}^2 = \sigma_{X_i}^2 + \sigma_{X_j}^2 + 2\sigma_{X_i, X_j}. \tag{5.22}$$

Equating (5.21) with (5.22) and solving for the covariance give

$$\sigma_{X_i, X_j} = -n p_i p_j. \tag{5.23}$$
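The covariance formula for the multinomial distribution can be confirmed by exact enumeration. The sketch below (Python; the case m = 2 with arbitrary probabilities, not from the text) computes the moments directly from the multinomial probabilities:

```python
from itertools import product
from math import factorial

# Exact check of the multinomial covariance cov(X_i, X_j) = -n p_i p_j,
# for m = 2 with arbitrary p0, p1, p2.
n, p0, p1, p2 = 4, 0.5, 0.3, 0.2
EX1 = EX2 = EX1X2 = 0.0
for k1, k2 in product(range(n + 1), repeat=2):
    if k1 + k2 > n:
        continue
    k0 = n - k1 - k2
    pr = factorial(n) / (factorial(k0) * factorial(k1) * factorial(k2)) \
         * p0**k0 * p1**k1 * p2**k2
    EX1 += k1 * pr
    EX2 += k2 * pr
    EX1X2 += k1 * k2 * pr
cov = EX1X2 - EX1 * EX2
assert abs(cov - (-n * p1 * p2)) < 1e-12   # -4 * 0.3 * 0.2 = -0.24
```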
Let $U = a_1 X_1 + \cdots + a_m X_m$ and $V = b_1 Y_1 + \cdots + b_n Y_n$ be two linear functions of random variables. Then the covariance between U and V is

$$\sigma_{U,V} = \sum_{i=1}^{m} \sum_{j=1}^{n} a_i b_j\,\sigma_{X_i, Y_j}. \tag{5.24}$$
If X and Y are independent random variables, the variance of the product XY is

$$\operatorname{Var}(XY) = \sigma_X^2 \sigma_Y^2 + [E(Y)]^2 \sigma_X^2 + [E(X)]^2 \sigma_Y^2. \tag{5.25}$$

² A random variable $X_0$ corresponding to the occurrences of $E_0$ is not explicitly included in (5.19) because the value of $X_0$ is completely determined by the relation $X_0 = n - X_1 - \cdots - X_m$.
Proof. Because of the independence of X and Y, using (5.4a) we have

$$\operatorname{Var}(XY) = E(X^2 Y^2) - [E(XY)]^2 = E(X^2)E(Y^2) - [E(X)]^2 [E(Y)]^2. \tag{5.26}$$

Using (5.4a) again for $E(X^2)$ and $E(Y^2)$ in (5.26), we obtain (5.25). In general, the variance of the product of n independent random variables is

$$\operatorname{Var}(X_1 \cdots X_n) = \prod_{i=1}^{n} \left\{\sigma_{X_i}^2 + [E(X_i)]^2\right\} - \prod_{i=1}^{n} [E(X_i)]^2. \tag{5.27}$$
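The product-variance formula (5.25) is an algebraic identity and can be checked exactly on any pair of independent discrete random variables. A minimal sketch (Python; supports and probabilities are arbitrary):

```python
# Check of (5.25): for independent X and Y,
# Var(XY) = VX*VY + E(Y)^2 * VX + E(X)^2 * VY.
xs, px = [1, 3], [0.4, 0.6]
ys, py = [2, 5], [0.7, 0.3]
EX = sum(x * p for x, p in zip(xs, px))
VX = sum(x * x * p for x, p in zip(xs, px)) - EX**2
EY = sum(y * p for y, p in zip(ys, py))
VY = sum(y * y * p for y, p in zip(ys, py)) - EY**2
EXY2 = sum(p * q * (x * y) ** 2
           for x, p in zip(xs, px) for y, q in zip(ys, py))
var_prod = EXY2 - (EX * EY) ** 2
assert abs(var_prod - (VX * VY + EY**2 * VX + EX**2 * VY)) < 1e-12
```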
When the exact variance of a function $\phi(X_1, \ldots, X_n)$ is difficult to obtain, an approximation may be based on the linear terms of the Taylor expansion about the means $\mu_1, \ldots, \mu_n$:

$$\phi(X_1, \ldots, X_n) \doteq \phi(\mu_1, \ldots, \mu_n) + \sum_{i=1}^{n} (X_i - \mu_i)\,\phi_i, \tag{5.28}$$

where

$$\phi_i = \left.\frac{\partial \phi}{\partial X_i}\right|_{(\mu_1, \ldots, \mu_n)}. \tag{5.29}$$

Applying Theorem 5 to (5.28) gives the approximate variance

$$\operatorname{Var}[\phi(X_1, \ldots, X_n)] \doteq \sum_{i=1}^{n} \phi_i^2 \sigma_{X_i}^2 + \sum_{i=1}^{n} \sum_{\substack{j=1\\ j \ne i}}^{n} \phi_i \phi_j\,\sigma_{X_i, X_j}. \tag{5.30}$$

For the product of two independent random variables this yields

$$\operatorname{Var}(XY) \doteq [E(Y)]^2 \sigma_X^2 + [E(X)]^2 \sigma_Y^2. \tag{5.31}$$

Comparing (5.31) with the exact formula in (5.25) shows that the error is in the missing term $\sigma_X^2 \sigma_Y^2$.
The variance of X may be partitioned by conditioning on a random variable Z:

$$\sigma_X^2 = E[\sigma_{X|Z}^2] + \sigma_{E(X|Z)}^2. \tag{5.34}$$

To prove this, write

$$\sigma_X^2 = E\{[X - E(X \mid Z)] + [E(X \mid Z) - E(X)]\}^2$$
$$= E[X - E(X \mid Z)]^2 + E[E(X \mid Z) - E(X)]^2$$
$$+ 2E\{[X - E(X \mid Z)][E(X \mid Z) - E(X)]\}. \tag{5.36}$$

Applying Theorem 3, the first term after the last equality sign becomes

$$E\big(E\{[X - E(X \mid Z)]^2 \mid Z\}\big) = E[\sigma_{X|Z}^2],$$

the second term is

$$E\{E(X \mid Z) - E[E(X \mid Z)]\}^2 = \sigma_{E(X|Z)}^2,$$

and the third term is

$$2E\big(E\{[X - E(X \mid Z)][E(X \mid Z) - E(X)] \mid Z\}\big)$$
$$= 2E\big([E(X \mid Z) - E(X)]\,E\{[X - E(X \mid Z)] \mid Z\}\big) = 0,$$

since $E\{[X - E(X \mid Z)] \mid Z\} = 0$, and (5.34) is proved. The proof of (5.35) is similar.
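The decomposition just proved can be verified numerically on a small joint distribution. This sketch (Python; the joint probabilities of X and Z are arbitrary illustrative values) computes both sides of (5.34):

```python
# Check of (5.34): Var(X) = E[Var(X|Z)] + Var(E(X|Z)),
# for a small joint pmf over (x, z).
p = {(0, 0): 0.2, (1, 0): 0.1, (0, 1): 0.3, (2, 1): 0.4}  # arbitrary
pz = {z: sum(v for (x, zz), v in p.items() if zz == z) for z in (0, 1)}
EX = sum(x * v for (x, z), v in p.items())
VX = sum(x * x * v for (x, z), v in p.items()) - EX**2
cond_mean = {z: sum(x * p.get((x, z), 0) for x in (0, 1, 2)) / pz[z]
             for z in (0, 1)}
cond_var = {z: sum(x * x * p.get((x, z), 0) for x in (0, 1, 2)) / pz[z]
               - cond_mean[z] ** 2 for z in (0, 1)}
within = sum(cond_var[z] * pz[z] for z in (0, 1))      # E[Var(X|Z)]
between = sum((cond_mean[z] - EX) ** 2 * pz[z] for z in (0, 1))
assert abs(VX - (within + between)) < 1e-12
```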
Let

$$U = \frac{X - E(X)}{\sigma_X} \quad \text{and} \quad V = \frac{Y - E(Y)}{\sigma_Y}. \tag{5.37}$$

It is easy to see that $E(U) = E(V) = 0$ and that

$$\sigma_U^2 = E(U^2) = 1, \qquad \sigma_V^2 = E(V^2) = 1. \tag{5.39}$$

For these reasons, U and V are called standardized random variables. The correlation coefficient between X and Y, denoted by $\rho_{XY}$, is defined as the covariance between U and V:

$$\rho_{XY} = \sigma_{U,V} = E(UV) = E\left[\left(\frac{X - E(X)}{\sigma_X}\right)\left(\frac{Y - E(Y)}{\sigma_Y}\right)\right]. \tag{5.40}$$
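Definition (5.40) is easy to exercise numerically. The sketch below (Python; the joint pmf is an arbitrary example) computes the correlation coefficient both as E(UV) and as the covariance divided by the product of the standard deviations, and checks that the two agree and lie in [−1, 1]:

```python
# Check of (5.40): rho_XY = E(UV) = cov(X, Y) / (sigma_X sigma_Y).
p = {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.4}  # arbitrary pmf
EX = sum(i * v for (i, j), v in p.items())
EY = sum(j * v for (i, j), v in p.items())
sx = (sum(i * i * v for (i, j), v in p.items()) - EX**2) ** 0.5
sy = (sum(j * j * v for (i, j), v in p.items()) - EY**2) ** 0.5
cov = sum((i - EX) * (j - EY) * v for (i, j), v in p.items())
rho = sum(((i - EX) / sx) * ((j - EY) / sy) * v for (i, j), v in p.items())
assert abs(rho - cov / (sx * sy)) < 1e-12
assert -1.0 <= rho <= 1.0
```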
PROBLEMS
1. Let X be a binomial random variable with the probability distribution
given in (2.5), and let $Y = n - X$. Find the covariance $\sigma_{X,Y}$.
2. For constants a, b, and c, show that

$$\sigma_c^2 = 0 \tag{5.5}$$

and

$$\sigma_{a+bX}^2 = b^2 \sigma_X^2. \tag{5.6}$$
The gamma function is defined by

$$\Gamma(n) = \int_0^{\infty} x^{n-1} e^{-x}\,dx.$$
Show that the function

$$f(x) = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}$$

satisfies

$$\int_{-\infty}^{+\infty} f(x)\,dx = 1.$$
The general normal density function is

$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{(x - \mu)^2}{2\sigma^2}\right].$$
(b) Using (a), verify that the marginal density functions of X and Y are normal with means zero and variances $\sigma_X^2$ and $\sigma_Y^2$, respectively.
(c) Find the regression line of Y on X. How does the value of ρ affect the closeness of the points of the random variables (X, Y) to this line? Discuss the meaning of correlation coefficients in this light. (Note. In the case of a bivariate normal distribution, a correlation coefficient of zero implies independence of X and Y.)
(d) Verify for the bivariate normal case that $E[E(Y \mid X)] = E(Y)$ and $E(XY) = E[X\,E(Y \mid X)]$.
(e) Use (b) and (c) to verify that

the sample mean and the sample variance of $X_1, \ldots, X_n$,

$$\bar{X} = \frac{X_1 + \cdots + X_n}{n} \quad \text{and} \quad S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1},$$

respectively.
14. Let $U = a_1 X_1 + \cdots + a_m X_m$ and $V = b_1 Y_1 + \cdots + b_n Y_n$ as given in Section 5.2. Show that the covariance between U and V is given by

$$\sigma_{U,V} = \sum_{i=1}^{m} \sum_{j=1}^{n} a_i b_j\,\sigma_{X_i, Y_j}. \tag{5.24}$$
16. Referring to (5.11), show that the two proportions $\hat{p}_1$ and $\hat{p}_2$ of white balls drawn have zero covariance but that they are not independently distributed. (Hint: Compare $\Pr\{\hat{p}_2 = 1\}$ with $\Pr\{\hat{p}_2 = 1 \mid \hat{p}_1 = 1\}$.)
Chebyshev's inequality. Let X have mean μ and variance $\sigma^2$. Then, for every $t > 0$,

$$\Pr\{|X - \mu| \ge t\} \le \frac{\sigma^2}{t^2}.$$

Use the result of Problem 18 to prove the inequality.
CHAPTER 2

Probability Generating Functions

1. INTRODUCTION
The method of generating functions is the most important tool in the
study of stochastic processes with a discrete sample space. It has been used
in differential and integral calculus and in combinatorial analysis. The
generating function of an integral-valued random variable completely
determines its probability distribution and provides convenient ways to
obtain the moments of the distribution; furthermore, certain important
relations among random variables may be very simply expressed in terms
of generating functions. Detailed treatments of the subject are given in
Feller [1957] and Riordan [1958].
2. GENERAL PROPERTIES
Definition. Let $a_0, a_1, \ldots$, be a sequence of real numbers. If the power series

$$A(s) = a_0 + a_1 s + a_2 s^2 + \cdots \tag{2.1}$$

converges in an interval $|s| < s_0$, then A(s) is called the generating function of the sequence $\{a_k\}$.
When $\{p_k\}$, $k = 0, 1, \ldots$, is the probability distribution of a nonnegative, integral-valued random variable X, the generating function of $\{p_k\}$ is called the probability generating function (p.g.f.) of X and is denoted by $g_X(s)$. The probabilities may be recovered from the p.g.f. by differentiation:

$$p_k = \frac{1}{k!}\left.\frac{d^k}{ds^k}\,g_X(s)\right|_{s=0}, \qquad k = 0, 1, \ldots. \tag{2.4}$$

When X is a proper random variable, the p.g.f. may be written as the expectation

$$g_X(s) = E(s^X). \tag{2.6}$$

In this case, we may use the p.g.f. also to obtain the expectation and high-order factorial moments of X by taking appropriate derivatives:¹

$$E[X(X - 1) \cdots (X - r + 1)] = \left.\frac{d^r}{ds^r}\,g_X(s)\right|_{s=1}, \tag{2.9}$$

so that, in particular,

$$\sigma_X^2 = E[X(X - 1)] + E(X) - [E(X)]^2. \tag{2.10}$$

¹ If a power series $\sum_k c_k s^k$ converges to a function f(s) for $|s| < s_0$, then the differentiated series $\sum_k k c_k s^{k-1}$ converges at least for $|s| < s_0$ and is equal to f′(s) in that interval.
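Formulas (2.4) and (2.9) can be exercised concretely on a polynomial p.g.f., where derivatives reduce to operations on the coefficients. The sketch below (Python; the binomial case with arbitrary n and p, added as an illustration) recovers a probability and a factorial moment:

```python
from math import comb

# For the binomial p.g.f. g(s) = sum p_k s^k, check (2.4) and (2.9)
# by differentiating the polynomial via its coefficients.
n, p = 5, 0.4
coeffs = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def deriv_at(coeffs, r, s0):
    # rth derivative of sum c_k s^k evaluated at s0
    total = 0.0
    for k, c in enumerate(coeffs):
        if k >= r:
            fall = 1.0
            for j in range(r):
                fall *= (k - j)          # falling factorial k(k-1)...(k-r+1)
            total += c * fall * s0 ** (k - r)
    return total

# (2.4): p_2 = g''(0) / 2!
assert abs(deriv_at(coeffs, 2, 0.0) / 2 - coeffs[2]) < 1e-12
# (2.9): E[X(X-1)] = g''(1), which equals n(n-1)p^2 for the binomial
assert abs(deriv_at(coeffs, 2, 1.0) - n * (n - 1) * p**2) < 1e-9
```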
3. CONVOLUTIONS

Let $\{a_k\}$ and $\{b_k\}$ be two sequences of real numbers. The sequence $\{c_k\}$ with elements

$$c_k = a_0 b_k + a_1 b_{k-1} + \cdots + a_k b_0 \tag{3.1}$$

is called the convolution of $\{a_k\}$ and $\{b_k\}$ and is denoted by

$$\{c_k\} = \{a_k\} * \{b_k\}. \tag{3.2}$$

Let

$$g_a(s) = \sum_{k=0}^{\infty} a_k s^k \quad \text{and} \quad g_b(s) = \sum_{k=0}^{\infty} b_k s^k$$

be the generating functions of $\{a_k\}$ and $\{b_k\}$. Multiplying the infinite sums for $g_a(s)$ and $g_b(s)$ together and grouping terms with equal powers of s, we find that

$$g_a(s)\,g_b(s) = \sum_{k=0}^{\infty} c_k s^k. \tag{3.3}$$

Now let X and Y be independent, nonnegative, integral-valued random variables with the probability distributions $\{p_k\}$ and $\{q_k\}$, and let $Z = X + Y$. Then

$$\Pr\{Z = k\} = r_k = p_0 q_k + p_1 q_{k-1} + \cdots + p_k q_0, \tag{3.6}$$

so that the sequence $\{r_k\}$ is the convolution of $\{p_k\}$ and $\{q_k\}$. Thus the above result (3.3) implies that the p.g.f. of the sum of two independent random variables is the product of the individual p.g.f.'s. Generalizing by induction, the p.g.f. of the sum $Z = X_1 + \cdots + X_n$ of n independent random variables is

$$g_Z(s) = E(s^{X_1} s^{X_2} \cdots s^{X_n}) = g_{X_1}(s) \cdots g_{X_n}(s).$$
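The convolution rule (3.1) and the product rule (3.3) can be checked together. A minimal sketch (Python; the two distributions are arbitrary illustrative choices) convolves two pmfs and compares the generating functions at an arbitrary point:

```python
# Convolution (3.1) and product rule (3.3): the distribution of Z = X + Y
# for independent X, Y is the convolution of their distributions.
px = [0.2, 0.5, 0.3]          # distribution of X on {0, 1, 2}
qy = [0.6, 0.4]               # distribution of Y on {0, 1}

rz = [0.0] * (len(px) + len(qy) - 1)
for i, a in enumerate(px):
    for j, b in enumerate(qy):
        rz[i + j] += a * b    # r_k = sum_i p_i q_{k-i}

def g(coeffs, s):             # generating function sum c_k s^k
    return sum(c * s**k for k, c in enumerate(coeffs))

s = 0.7
assert abs(g(rz, s) - g(px, s) * g(qy, s)) < 1e-12
assert abs(sum(rz) - 1.0) < 1e-12
```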
4. EXAMPLES

4.1. Binomial Distribution

For the binomial distribution

$$\Pr\{X = k\} = \binom{n}{k} p^k (1 - p)^{n-k}, \qquad k = 0, 1, \ldots, n, \tag{4.4}$$

the p.g.f. is

$$g_X(s) = \sum_{k=0}^{n} \binom{n}{k} (ps)^k (1 - p)^{n-k} = (1 - p + ps)^n. \tag{4.5}$$

4.2. Poisson Distribution

For the Poisson distribution with parameter λ, the p.g.f. is

$$g_X(s) = \sum_{k=0}^{\infty} \frac{e^{-\lambda} \lambda^k}{k!}\,s^k = e^{-\lambda(1 - s)}. \tag{4.9}$$

If Y is another Poisson random variable with parameter μ,

$$\Pr\{Y = k\} = \frac{e^{-\mu} \mu^k}{k!}, \qquad k = 0, 1, \ldots, \tag{4.10}$$

then its p.g.f. is

$$g_Y(s) = \sum_{k=0}^{\infty} \frac{e^{-\mu} \mu^k}{k!}\,s^k = e^{-\mu(1 - s)}, \tag{4.11}$$

and, if X and Y are independent, the p.g.f. of the sum X + Y is the product $e^{-(\lambda + \mu)(1 - s)}$, so that X + Y has the Poisson distribution with parameter λ + μ.
4.3. Geometric and Negative Binomial Distributions

Consider a sequence of independent trials, each resulting in success with probability p or in failure with probability q = 1 − p. The number Y of failures preceding the first success has the geometric distribution $\Pr\{Y = k\} = pq^k$, $k = 0, 1, \ldots$, with p.g.f. $p/(1 - qs)$, expectation $E(Y) = q/p$, and variance

$$\sigma_Y^2 = \frac{q}{p^2}. \tag{4.16}$$

The number $Z_r$ of failures preceding the rth success has the expectation

$$E(Z_r) = \frac{rq}{p}, \tag{4.17}$$

and its p.g.f. is

$$g_{Z_r}(s) = \left[\frac{p}{1 - qs}\right]^r. \tag{4.18}$$

To see this, note that $Z_r = k$ means that the (k + r)th trial results in a success (the rth success). Since there are $\binom{k+r-1}{k}$ ways of having k failures and r − 1 successes in the first (k + r − 1) trials, the distribution of $Z_r$ is given by

$$\Pr\{Z_r = k\} = \binom{k + r - 1}{k} p^r q^k, \qquad k = 0, 1, \ldots. \tag{4.19}$$

We now find it useful to recall Newton's binomial expansion

$$(1 + t)^x = \sum_{k=0}^{\infty} \binom{x}{k} t^k, \qquad |t| < 1, \tag{4.20}$$

which holds true for any real number x if the binomial coefficients are defined by

$$\binom{x}{k} = \frac{x(x - 1) \cdots (x - k + 1)}{k!}. \tag{4.21}$$

Whence

$$\binom{-r}{k} = (-1)^k \binom{r + k - 1}{k}, \tag{4.22}$$

so that

$$\sum_{k=0}^{\infty} \binom{k + r - 1}{k} p^r (qs)^k = p^r (1 - qs)^{-r},$$

and (4.18) is verified.
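The equivalence of the pmf (4.19) and the closed-form p.g.f. (4.18) can be confirmed numerically. The sketch below (Python; r and p are arbitrary illustrative values) truncates the series at a point where the tail is negligible:

```python
from math import comb

# Check that expanding the p.g.f. (4.18), [p/(1-qs)]^r, reproduces the
# negative binomial probabilities (4.19), and that E(Z_r) = r q / p.
r, p = 3, 0.4
q = 1 - p
pmf = [comb(k + r - 1, k) * p**r * q**k for k in range(200)]

s = 0.5
g_from_pmf = sum(pk * s**k for k, pk in enumerate(pmf))
g_closed = (p / (1 - q * s)) ** r
assert abs(g_from_pmf - g_closed) < 1e-12
mean = sum(k * pk for k, pk in enumerate(pmf))
assert abs(mean - r * q / p) < 1e-9        # 3 * 0.6 / 0.4 = 4.5
```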
5. PARTIAL FRACTION EXPANSIONS

Exact computation of the probabilities $p_k$ from a p.g.f. may be laborious; a useful method is that of partial fractions, which we present for the common case where G(s) is a rational function. Suppose that the p.g.f. of a random variable can be expressed as a ratio

$$G(s) = \frac{U(s)}{V(s)}, \tag{5.1}$$

where U(s) and V(s) are polynomials with no common roots. If U(s) and V(s) have a common factor, it should be canceled before the method is applied. For simplicity, we shall assume that the degree of V(s) is m and the degree of U(s) is less than m. Suppose that the equation V(s) = 0 has m distinct roots, $s_1, \ldots, s_m$, so that

$$V(s) = (s - s_1)(s - s_2) \cdots (s - s_m). \tag{5.2}$$

Then G(s) may be decomposed into the partial fractions

$$G(s) = \frac{a_1}{s_1 - s} + \frac{a_2}{s_2 - s} + \cdots + \frac{a_m}{s_m - s}, \tag{5.3}$$

where the constants are

$$a_r = \frac{-U(s_r)}{V'(s_r)}, \qquad r = 1, \ldots, m. \tag{5.4}$$

To see this, multiply both sides by $(s_1 - s)$:

$$(s_1 - s)\,G(s) = \frac{-U(s)}{(s - s_2)(s - s_3) \cdots (s - s_m)}, \tag{5.5}$$

and let $s \to s_1$.
Expanding each fraction in (5.3) in a geometric series,

$$\frac{1}{s_r - s} = \frac{1}{s_r} \cdot \frac{1}{1 - s/s_r} = \sum_{k=0}^{\infty} \frac{s^k}{s_r^{k+1}}, \tag{5.6}$$

and collecting the coefficients of $s^k$, we obtain

$$p_k = \frac{a_1}{s_1^{k+1}} + \frac{a_2}{s_2^{k+1}} + \cdots + \frac{a_m}{s_m^{k+1}}, \qquad k = 0, 1, \ldots. \tag{5.7}$$

A single term often suffices for large k. Suppose that $s_1$ is smaller in absolute value than every other root. Then

$$p_k \sim \frac{a_1}{s_1^{k+1}} \qquad \text{as } k \to \infty. \tag{5.8}$$
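The single-root formula can be illustrated on a case where it is exact. This sketch (Python; the geometric p.g.f. is chosen here as an illustration, not taken from the text) applies (5.4) and (5.7) to G(s) = p/(1 − qs), whose denominator has the single root $s_1 = 1/q$ with $V'(s) = -q$:

```python
# Partial-fraction coefficients (5.4) and probabilities (5.7) for the
# geometric p.g.f. G(s) = p / (1 - qs): a1 = -U(s1)/V'(s1) = p/q,
# s1 = 1/q, and a1 / s1^(k+1) = p q^k exactly.
p = 0.3
q = 1 - p
s1 = 1 / q
a1 = -p / (-q)                        # U(s) = p, V'(s) = -q
for k in range(10):
    exact = p * q**k                  # geometric probabilities
    approx = a1 / s1 ** (k + 1)       # formula (5.7) with one root
    assert abs(exact - approx) < 1e-12
```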
6. MULTIVARIATE PROBABILITY GENERATING FUNCTIONS

Let $\mathbf{X} = (X_1, \ldots, X_m)$ be a vector of nonnegative, integral-valued random variables with the joint probability distribution

$$\Pr\{X_1 = k_1, \ldots, X_m = k_m\} = p_{k_1, \ldots, k_m}, \qquad k_i = 0, 1, \ldots;\ i = 1, \ldots, m. \tag{6.2}$$
X is called a proper random vector if

$$\sum_{k_1=0}^{\infty} \cdots \sum_{k_m=0}^{\infty} p_{k_1, \ldots, k_m} = 1, \tag{6.3}$$

and an improper random vector if the sum is less than one. The p.g.f. of X is defined, for $|s_1| \le 1, \ldots, |s_m| \le 1$, by

$$G_{\mathbf{X}}(s_1, \ldots, s_m) = E(s_1^{X_1} \cdots s_m^{X_m}). \tag{6.4}$$

The marginal p.g.f. of a single component $X_i$ is obtained by setting all the s's equal to unity except $s_i$:

$$E(s_i^{X_i}) = G_{\mathbf{X}}(1, \ldots, 1, s_i, 1, \ldots, 1), \tag{6.6}$$

so that

$$E(X_i) = \left.\frac{\partial G_{\mathbf{X}}(s_1, \ldots, s_m)}{\partial s_i}\right|_{s_1 = \cdots = s_m = 1} \tag{6.7}$$

and

$$E[X_i(X_i - 1)] = \left.\frac{\partial^2 G_{\mathbf{X}}(s_1, \ldots, s_m)}{\partial s_i^2}\right|_{s_1 = \cdots = s_m = 1}.$$
Analogously, it is again evident from (6.5) that the p.g.f. of the joint probability distribution of any two components $X_i$, $X_j$ of a proper random vector can be obtained from $G_{\mathbf{X}}(s_1, \ldots, s_m)$ by setting all the s's equal to unity except $s_i$ and $s_j$, and also that

$$E(X_i X_j) = \left.\frac{\partial^2 G_{\mathbf{X}}(s_1, \ldots, s_m)}{\partial s_i\,\partial s_j}\right|_{\mathbf{s} = \mathbf{1}}. \tag{6.8}$$

The components $X_1, \ldots, X_m$ are independent if, and only if,

$$E(s_1^{X_1} \cdots s_m^{X_m}) = E(s_1^{X_1}) \cdots E(s_m^{X_m}). \tag{6.11}$$
Theorem. If $\mathbf{Y}_1, \ldots, \mathbf{Y}_n$ are independent random vectors with p.g.f.'s $g_1, \ldots, g_n$ and $\mathbf{X} = \mathbf{Y}_1 + \cdots + \mathbf{Y}_n$, then the p.g.f. of X is

$$G_{\mathbf{X}}(s_1, \ldots, s_m) = \prod_{j=1}^{n} g_j(s_1, \ldots, s_m). \tag{6.12}$$

In particular, if the $\mathbf{Y}_j$ are identically distributed with p.g.f. $g(s_1, \ldots, s_m)$, then

$$G_{\mathbf{X}}(s_1, \ldots, s_m) = [g(s_1, \ldots, s_m)]^n. \tag{6.13}$$

Example. The multinomial distribution (5.19) of Chapter 1 has the p.g.f.

$$G_{\mathbf{X}}(s_1, \ldots, s_m) = (p_0 + p_1 s_1 + \cdots + p_m s_m)^n. \tag{6.14}$$

This is most easily seen by the device of writing each multinomial variable as the sum of n independent and identically distributed random variables and applying (6.13).
Applying (6.8) to (6.14), we have

$$E(X_i X_j) = n(n - 1)\,p_i p_j. \tag{6.15}$$

Setting all the s's except $s_i$ equal to unity in (6.14) gives the marginal p.g.f.

$$E(s_i^{X_i}) = G_{\mathbf{X}}(1, \ldots, 1, s_i, 1, \ldots, 1) = (1 - p_i + p_i s_i)^n, \tag{6.16}$$

so that each $X_i$ has the binomial distribution with parameters n and $p_i$.
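Both the marginal identity (6.16) and the mixed moment (6.15) can be checked by evaluating the multinomial p.g.f. (6.14) directly. The sketch below (Python; n and the probabilities are arbitrary illustrative values) uses a small finite-difference estimate for the mixed partial derivative:

```python
# Checks of (6.14)-(6.16) for the multinomial p.g.f.
n = 6
p0, p1, p2 = 0.5, 0.3, 0.2

def G(s1, s2):                       # (6.14) with m = 2
    return (p0 + p1 * s1 + p2 * s2) ** n

# (6.16): with s2 = 1, G reduces to the binomial p.g.f. of X1
for s in (0.0, 0.25, 0.8):
    assert abs(G(s, 1.0) - (1 - p1 + p1 * s) ** n) < 1e-12

# (6.15): central finite-difference estimate of d^2 G / ds1 ds2 at (1, 1)
h = 1e-5
mixed = (G(1 + h, 1 + h) - G(1 + h, 1 - h)
         - G(1 - h, 1 + h) + G(1 - h, 1 - h)) / (4 * h * h)
assert abs(mixed - n * (n - 1) * p1 * p2) < 1e-3   # 6*5*0.3*0.2 = 1.8
```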
Example. Consider an infinite sequence of independent trials, each of which results in a success with probability p or in a failure of one of m mutually exclusive types $F_1, \ldots, F_m$ with corresponding probabilities $q_1, q_2, \ldots, q_m$, so that

$$p + q_1 + q_2 + \cdots + q_m = 1. \tag{6.18}$$

Let $X_i$ be the number of failures of type $F_i$, for $i = 1, \ldots, m$, preceding the rth success in an infinite sequence of trials. The joint probability distribution of $X_1, \ldots, X_m$ is

$$\Pr\{X_1 = k_1, \ldots, X_m = k_m\} = \frac{(k + r - 1)!}{k_1! \cdots k_m!\,(r - 1)!}\,p^r q_1^{k_1} \cdots q_m^{k_m}, \tag{6.19}$$

where $k = k_1 + \cdots + k_m$, and the corresponding p.g.f. is

$$G_{\mathbf{X}}(s_1, \ldots, s_m) = \sum_{k_1=0}^{\infty} \cdots \sum_{k_m=0}^{\infty} \frac{(k + r - 1)!}{k_1! \cdots k_m!\,(r - 1)!}\,p^r (q_1 s_1)^{k_1} \cdots (q_m s_m)^{k_m} = \left[\frac{p}{1 - q_1 s_1 - \cdots - q_m s_m}\right]^r. \tag{6.20}$$
As with (6.13), the form of the p.g.f. shows that each $X_i$ can be regarded as the sum of r independent and identically distributed random variables. From (6.20) we use (6.7), (6.8), and (6.9) to compute the expectation

$$E(X_i) = \frac{r q_i}{p}, \tag{6.21}$$

the variance

$$\sigma_{X_i}^2 = \frac{r q_i}{p}\left(1 + \frac{q_i}{p}\right), \tag{6.22}$$

and the covariance

$$\sigma_{X_i, X_j} = \frac{r q_i q_j}{p^2}, \qquad i \ne j. \tag{6.23}$$

The marginal distribution of any of the random variables $X_i$ and the distribution of the sum $X_1 + \cdots + X_m$ are negative binomial distributions.
This is evident from the definitions of the random variables; we may see it also from the p.g.f.: setting all the s's except $s_i$ equal to unity in (6.20) gives

$$E(s_i^{X_i}) = \left[\frac{p/(p + q_i)}{1 - [q_i/(p + q_i)]\,s_i}\right]^r, \tag{6.24}$$

which is the p.g.f. of a negative binomial distribution with success probability $p/(p + q_i)$. The mean and variance of $X_i$ in (6.21) and (6.22) can also be obtained from (6.24). Also, since $Z = X_1 + \cdots + X_m$ is the total number of failures preceding the rth success, with $q = q_1 + \cdots + q_m$, we find the following relation:

$$\sigma_Z^2 = \sum_{i=1}^{m} \sigma_{X_i}^2 + \sum_{i=1}^{m} \sum_{\substack{j=1\\ j \ne i}}^{m} \sigma_{X_i, X_j}, \tag{6.27}$$

or

$$\frac{rq}{p^2} = \sum_{i=1}^{m} \frac{r q_i}{p}\left(1 + \frac{q_i}{p}\right) + \sum_{i=1}^{m} \sum_{\substack{j=1\\ j \ne i}}^{m} \frac{r q_i q_j}{p^2}, \tag{6.28}$$

which may be verified by direct computation.
7. SUM OF A RANDOM NUMBER OF RANDOM VARIABLES

Let $X_1, X_2, \ldots$ be independent, identically distributed, nonnegative, integral-valued random variables with the common p.g.f. g(s), and let N be a nonnegative, integral-valued random variable distributed independently of the $X_i$. The sum

$$Z_N = X_1 + \cdots + X_N \tag{7.2}$$

then has a number of terms N that is itself a random variable; the distribution of $Z_N$ is called a compound distribution. Let the p.g.f. of N be

$$h(s) = E(s^N). \tag{7.3}$$
Let

$$G(s) = E(s^{Z_N}). \tag{7.4}$$

Then

$$G(s) = h[g(s)].$$

Proof. Since

$$E(s^{Z_N}) = E[E(s^{Z_N} \mid N)] \tag{7.5}$$

and

$$E(s^{Z_N} \mid N) = E(s^{X_1 + \cdots + X_N} \mid N) = \{g(s)\}^N, \tag{7.6}$$

we have

$$G(s) = E[\{g(s)\}^N] = h[g(s)]. \tag{7.7}$$

This simple functional relation between the p.g.f. G(s) of the compound distribution of $Z_N$ and the p.g.f.'s h(s) and g(s) is quite useful, as we shall illustrate with an example in branching processes in Section 8.
Example. Let $X_i$ have a logarithmic distribution with

$$\Pr\{X_i = k\} = \frac{-1}{\log(1 - p)} \cdot \frac{p^k}{k}, \qquad k = 1, 2, \ldots, \tag{7.8}$$

and p.g.f.

$$g(s) = \frac{\log(1 - ps)}{\log(1 - p)}, \tag{7.9}$$

and let N have the Poisson distribution with parameter λ and p.g.f.

$$h(s) = e^{-\lambda(1 - s)}. \tag{7.10}$$

Then the p.g.f. of $Z_N$ is

$$G(s) = h[g(s)] = e^{-\lambda} (1 - ps)^{\lambda/\log(1 - p)}. \tag{7.11}$$

Now let

$$r = \frac{-\lambda}{\log(1 - p)}, \tag{7.12}$$

so that $e^{-\lambda} = (1 - p)^r$ and

$$G(s) = \left(\frac{1 - p}{1 - ps}\right)^r, \tag{7.14}$$

which is the p.g.f. of the negative binomial distribution in (4.18) with p replaced by (1 − p).
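The compounding identity G(s) = h[g(s)] can be confirmed numerically for this Poisson-logarithmic example. The sketch below (Python; λ and p are arbitrary illustrative values) compares the composed p.g.f. with the negative binomial closed form:

```python
from math import exp, log

# Numerical check of (7.11)-(7.14): a Poisson(lam) number of logarithmic
# variables has a negative binomial p.g.f.
lam, p = 2.0, 0.4

def g(s):   # logarithmic p.g.f. (7.9)
    return log(1 - p * s) / log(1 - p)

def h(s):   # Poisson p.g.f. (7.10)
    return exp(-lam * (1 - s))

r = -lam / log(1 - p)                    # (7.12)
for s in (0.0, 0.3, 0.9):
    composed = h(g(s))
    neg_binom = ((1 - p) / (1 - p * s)) ** r   # (7.14)
    assert abs(composed - neg_binom) < 1e-12
```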
8. A SIMPLE BRANCHING PROCESS

Consider a population whose members reproduce independently, and let $Z_n$ denote the size of the nth generation, with $Z_0 = 1$. Suppose (i) that each individual produces k offspring with probability $p_k$, $k = 0, 1, \ldots$, which is the same for each individual in a given generation; (ii) that this probability distribution remains fixed from generation to generation; and (iii) that individuals produce offspring independently of each other. Thus we are dealing with independent and identically distributed random variables.

Let X be the number of offspring of a single individual, let

$$g(s) = \sum_{k=0}^{\infty} p_k s^k$$

be the p.g.f. of X, and let $g_n(s)$ be the p.g.f. of $Z_n$, $n = 1, 2, \ldots$. Since $Z_0 = 1$, the size of the first generation $Z_1$ has the same probability distribution as X,

$$\Pr\{Z_1 = k\} = p_k, \tag{8.3}$$

and $g_1(s) = g(s)$. The second generation consists of the descendants of the $Z_1$ members of the first generation, so that $Z_2$ is the sum of $Z_1$ independent random variables, each of which has the probability distribution $\{p_k\}$ and the p.g.f. g(s). Therefore $Z_2$ has a compound distribution with the p.g.f. obtained from formula (7.7):

$$g_2(s) = g_1[g(s)]. \tag{8.4}$$
In general, the p.g.f. of $Z_{n+1}$ is

$$g_{n+1}(s) = g_n[g(s)]. \tag{8.5}$$

Example. Suppose that each individual produces either two offspring or none, so that

$$p_0 = 1 - p, \qquad p_2 = p, \qquad p_k = 0 \quad \text{for } k = 1, 3, 4, \ldots,$$

and $g(s) = (1 - p) + ps^2$. Let

$$q_n = \Pr\{Z_n = 0\} = g_n(0) \tag{8.8}$$

denote the probability that the population is extinct by the nth generation. Using (8.5) and (8.8) and the monotonicity property of g(s) we have
$$q_{n+1} = g(q_n), \tag{8.10}$$

and $q_1 \le q_2 \le \cdots$, so that the limit $\xi = \lim q_n$ exists. Taking limits of both sides of (8.10) we see that ξ satisfies the equation

$$\xi = g(\xi). \tag{8.11}$$
In fact, the limit ξ is the smallest root of the equation. To prove this we let x be an arbitrary positive root of the equation x = g(x); then

$$q_1 = g(0) \le g(x) = x \tag{8.12}$$

and hence

$$q_2 = g(q_1) \le g(x) = x. \tag{8.13}$$

By induction,

$$q_{n+1} = g(q_n) \le g(x) = x, \tag{8.14}$$

so that $\xi = \lim q_n \le x$.
Since g(s) has a nonnegative second derivative, the curve y = g(s) is convex and can intercept the line y = s in at most two points. One of these is (1, 1), and therefore the equation (8.11) has at most one root between 0 and 1. Whether such a root exists depends entirely upon the derivative g′(1). If g′(1) > 1, the root ξ < 1 exists; for, tracing the curve y = g(s) backward from the point (1, 1), we find that it must fall below the line y = s, and eventually cross the line to reach the point (0, p₀). If, on the other hand, g′(1) ≤ 1, then the curve must be entirely above the line and intersects the line y = s only at the point (1, 1); hence, in this case, ξ = 1 and there is no root less than unity (Figure 1).

Now the derivative g′(1) is the expected number of offspring of any individual in the population. According to the preceding argument, if this
Figure 1. The curve y = g(s) and the line y = s, for g′(1) > 1 and for g′(1) ≤ 1.

A similar induction shows that

$$s \le g_1(s) \le g_2(s) \le \cdots \le \xi \qquad \text{for } 0 \le s \le \xi$$

and

$$\xi \le \cdots \le g_2(s) \le g_1(s) \le s \qquad \text{for } \xi \le s < 1, \tag{8.16}$$

so that the limit

$$G(s) = \lim_{n \to \infty} g_n(s)$$

must exist for all s, and that G(s) = 1 only for s = 1. Passing to the limit in (8.6) shows that
$$G(s) = \begin{cases} \xi, & 0 \le s < 1, \\ 1, & s = 1. \end{cases} \tag{8.18}$$
This limiting function G(s) is clearly not a p.g.f., since it is discontinuous at s = 1. If, however, we redefine G(1) to be equal to ζ, then we may interpret G(s) as the p.g.f. of an improper random variable Z, the "ultimate size" of the population, which has the probability distribution

Pr{Z = 0} = ζ,
Pr{Z = k} = 0,   k = 1, 2, ...,   (8.19)
Pr{Z = ∞} = 1 − ζ.

Thus the population either becomes extinct with probability ζ or grows without bound with probability 1 − ζ; no intermediate course is possible.
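The fixed-point iteration q_{n+1} = g(q_n) used in the argument can be carried out numerically. The sketch below finds the smallest root ζ of s = g(s); the offspring distribution is an illustrative assumption, not taken from the text:

```python
def extinction_probability(p, tol=1e-12, max_iter=10_000):
    """Smallest root of s = g(s) via the iteration q_{n+1} = g(q_n), q_0 = 0.

    p[k] is the probability that an individual leaves k offspring, so the
    p.g.f. is g(s) = sum_k p[k] * s**k.
    """
    g = lambda s: sum(pk * s**k for k, pk in enumerate(p))
    q = 0.0
    for _ in range(max_iter):
        q_next = g(q)
        if abs(q_next - q) < tol:
            break
        q = q_next
    return q_next

# Offspring distribution with mean g'(1) = 0.25 + 2*0.5 = 1.25 > 1, so a
# root ζ < 1 exists; g(s) = 0.25 + 0.25 s + 0.5 s**2 has roots s = 0.5, 1.
zeta = extinction_probability([0.25, 0.25, 0.5])
```

Because q₀ = 0 and g is increasing, the iterates rise monotonically to the smallest root, exactly as in the proof of (8.14).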
Of course, in practical applications to the growth of real populations, the
PROBLEMS
1. Derive the p.g.f. for a binomial distribution in n trials with probability p.
any trial, if the “success” occurs at least r times in succession, we say that there
is a run of r successes. Let pn be the probability that a run of r successes will
occur in n trials. Find the p.g.f. of p_n, and from it the probability p_n for n ≥ r.
4. Verify the expansion

(1 + t)^x = Σ_{k=0}^∞ (x choose k) t^k,   |t| < 1.

5. Verify the expansions

(a₀ + a₁)^x = Σ_{k=0}^∞ (x choose k) a₀^{x−k} a₁^k,   |a₁| < |a₀|,

and

(a₀ + a₁ + ⋯ + a_m)^x = Σ_{k₁=0}^∞ ⋯ Σ_{k_m=0}^∞ (x choose k₁, ..., k_m) a₀^{x−k} a₁^{k₁} ⋯ a_m^{k_m},

where Σ_{i=1}^m |a_i| < |a₀|, k = k₁ + ⋯ + k_m, and the multinomial coefficient is defined by

(x choose k₁, ..., k_m) = x(x − 1) ⋯ (x − k + 1) / (k₁! ⋯ k_m!)

for all real x. Use this multinomial expansion to verify the p.g.f. (6.14).
6. Verify the identity

Σ_{k=0}^∞ (r + k − 1 choose k) (qs)^k = (1 − qs)^{−r}.   (4.22)
7. Prove Theorem 2.
and X3 .
14. Let a random variable X have the p.g.f. G x (s) and let a and b be constants.
Find the p.g.f. for the linear function a + bX.
15. Continuity theorem. For each n let {p_{k,n}} be a probability distribution, so that

p_{k,n} ≥ 0   and   Σ_{k=0}^∞ p_{k,n} = 1,

and let

g_n(s) = Σ_{k=0}^∞ p_{k,n} s^k.

Then there is a sequence {p_k} with p.g.f.

G(s) = Σ_{k=0}^∞ p_k s^k

such that, for each k,

lim_{n→∞} p_{k,n} = p_k

if, and only if,

lim_{n→∞} g_n(s) = G(s),

whatever may be s, 0 ≤ s < 1. Prove the theorem.
16. Let p = λ/n, λ > 0. Show that

lim_{n→∞} [1 − p + ps]^n = e^{−λ(1−s)}.
for p ≠ q and for p = q = 1/2.

18. Find the probability that gambler A will be ruined if his adversary B has an infinite amount of capital.
19. Continuation. Suppose, in Problem 18, that players A and B have initial
capitals of a and b, respectively. What is the probability that A (or B ) will be
ruined at the nth game?
20. Continuation. Suppose that, when gambler A begins with x dollars, 0 < x < a + b, the series of games has a finite expected duration D_x. Justify that

D_x = p D_{x+1} + q D_{x−1} + 1

and that, for p ≠ q,

D_x = x/(q − p) − [(a + b)/(q − p)] [1 − (q/p)^x] / [1 − (q/p)^{a+b}].
21. Branching process. In the simple branching process, let the expectation and variance of X be μ and σ², respectively. Use

E(Z_{n+1}) = E[E(Z_{n+1} | Z_n)]

and

Var(Z_{n+1}) = E[Var(Z_{n+1} | Z_n)] + Var[E(Z_{n+1} | Z_n)]

to show that E(Z_n) = μ^n and

Var(Z_n) = σ² μ^{n−1} (μ^n − 1)/(μ − 1),   μ ≠ 1,
         = n σ²,   μ = 1.
22. Continuation. Use the relation gn+ \{s) = gn [g(s)] to derive the formulas
for E(Zn ) and Var(Zn ) in Problem 21.
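Problems 21 and 22 can be checked numerically by composing p.g.f.s: since g_{n+1}(s) = g_n(g(s)), the exact distribution of Z_n is obtained by repeated polynomial composition. A small sketch, with an offspring distribution chosen for illustration:

```python
def convolve(a, b):
    """Product of two polynomials given as coefficient lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def pgf_compose(outer, inner):
    # Horner's scheme: c0 + inner*(c1 + inner*(c2 + ...))
    result = [outer[-1]]
    for c in reversed(outer[:-1]):
        result = convolve(result, inner)
        result[0] += c
    return result

p = [0.25, 0.25, 0.5]   # offspring distribution: mu = 1.25, sigma^2 = 0.6875
gn = p[:]               # coefficients of g_1(s), the p.g.f. of Z_1
for _ in range(2):      # g_{n+1}(s) = g_n(g(s)); after the loop, Z_3
    gn = pgf_compose(gn, p)

mean = sum(k * q for k, q in enumerate(gn))
var = sum(k * k * q for k, q in enumerate(gn)) - mean ** 2
```

The computed mean and variance of Z₃ agree with μ³ and σ²μ²(μ³ − 1)/(μ − 1) from Problem 21.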
23. Verify the following identity for the negative multinomial distribution:
(6.28)
where p + q = 1 and q = q₁ + ⋯ + q_m.
defined by

M_X(s) = E(e^{sX}).

26. Continuation. The m.g.f. and the p.g.f. of a random variable X have the relation

M_X(s) = G_X(e^s).
Derive from each of the above the corresponding expectations E{X) and E(X 2).
27. Continuation. Find the m.g.f.’s for:
(i) The exponential distribution with probability density function
f{x) = jie
/JX
x > 0
= 0 x < 0.
g^2tt
From the m.g.f.’s, find the corresponding expectations and variances.
CHAPTER 3

Some Stochastic Models of Population Growth

1. INTRODUCTION
Since the early work of Kolmogorov [1931] and Feller [1936], the
theory of stochastic processes has developed rapidly. It has been used
to describe empirical phenomena and to solve many practical problems.
Systematic treatments of the subject are given by Chung [1960], Doob
[1953], Feller [1957, 1966], Harris [1963]; reference should also be made
to Bartlett [1956], Bharucha-Reid [1960], Bailey [1964], Cox and Miller
[1965], Karlin [1966], Kendall [1948], Parzen [1962], Takacs [1959].
A stochastic process {X(t)} is a family of random variables describing
an empirical process whose development is governed by probabilistic
laws. The parameter t, which is often interpreted as time, is real, but may
be either discrete or continuous. The random variable X(t) may be real-
valued or complex-valued, or it may
take the form of a vector. In diffusion
processes, for example, both X(t)and t are continuous variables, whereas
in Markov chains, X(t) and t take on discrete values. In the study of
population growth we shall have a continuous time parameter t, but the
random variable X{t) will have a discrete set of possible values, namely,
the nonnegative integers. Our main interest is the probability distribution
2. THE POISSON PROCESS
occur in (0, t) and j events in (t, t + Δ), where 2 ≤ j ≤ k, with probability o(Δ).¹ Hence, considering all these possibilities and combining all quantities of order o(Δ), we have for k ≥ 1,

p_k(t + Δ) = p_k(t)(1 − λΔ) + p_{k−1}(t)λΔ + o(Δ),   (2.1)

and, for k = 0,

p₀(t + Δ) = p₀(t)(1 − λΔ) + o(Δ).   (2.2)

Transposing p_k(t) to the left side of (2.1), dividing the resulting equation through by Δ, and passing to the limit as Δ → 0, we find that the probabilities satisfy the system of differential equations

(d/dt) p₀(t) = −λ p₀(t),
(d/dt) p_k(t) = −λ p_k(t) + λ p_{k−1}(t),   k ≥ 1.   (2.3)
¹ The standard notation o(Δ) represents any function of Δ which tends to 0 faster than Δ, that is, any function such that o(Δ)/Δ → 0 as Δ → 0.
and similarly

p₀(0) = 1,   p_k(0) = 0,   k ≥ 1.   (2.4)

The first equation in (2.3) has the solution

p₀(t) = e^{−λt}.   (2.5)

For k = 1,

e^{λt} (d/dt) p₁(t) + λ e^{λt} p₁(t) = (d/dt)[e^{λt} p₁(t)] = λ,   (2.6)

so that

p₁(t) = λt e^{−λt}.   (2.7)

Continuing successively, we obtain the Poisson distribution

p_k(t) = e^{−λt} (λt)^k / k!,   k = 0, 1, ....   (2.8)
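The derivation can be checked numerically by integrating the system (2.3) with a simple Euler scheme and comparing the result with the closed form (2.8); the parameter values below are illustrative:

```python
import math

# Euler-integrate (2.3) with initial condition (2.4), truncated at k = K,
# and compare with p_k(t) = e^{-lam*t} (lam*t)^k / k!  from (2.8).
lam, t_end, steps, K = 2.0, 1.5, 50_000, 30
dt = t_end / steps
p = [1.0] + [0.0] * K
for _ in range(steps):
    dp = [-lam * p[0]] + [-lam * p[k] + lam * p[k - 1] for k in range(1, K + 1)]
    p = [pk + dt * d for pk, d in zip(p, dp)]

exact = [math.exp(-lam * t_end) * (lam * t_end) ** k / math.factorial(k)
         for k in range(K + 1)]
err = max(abs(a - b) for a, b in zip(p, exact))
```

With λt = 3, the truncation at K = 30 loses a negligible tail, so the discrepancy is governed by the Euler step size alone.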
ferential equation is first derived and solved to obtain the p.g.f. of X(t ),
and then the probability distribution is derived from the p.g.f. To illustrate
the technique, we reconsider the Poisson process.
Let the p.g.f. of the probability distribution {p_k(t)} be defined as in Chapter 2 by

G_X(s; t) = Σ_{k=0}^∞ p_k(t) s^k.   (2.9)
Differentiating (2.9) under the summation sign² and using (2.3), we obtain

(∂/∂t) G_X(s; t) = Σ_{k=0}^∞ (d/dt) p_k(t) s^k   (2.10)

= −λ Σ_{k=0}^∞ p_k(t) s^k + λ s Σ_{k=1}^∞ p_{k−1}(t) s^{k−1}.   (2.11)

Both series on the right are equal to G_X(s; t), so that G_X satisfies (∂/∂t) G_X(s; t) = −λ(1 − s) G_X(s; t), whose general solution is

G_X(s; t) = c(s) e^{−λt(1−s)}.   (2.15)

The initial condition X(0) = 0 implies G_X(s; 0) = 1, hence c(s) = 1 and

G_X(s; t) = e^{−λt(1−s)}.   (2.16)
² Since |p_k(t)s^k| ≤ 1, each of the infinite series in (2.11) converges uniformly in t for |s| < 1; hence Σ (d/dt) p_k(t) s^k converges uniformly in t and the term-by-term differentiation is justified for |s| < 1. Letting s → 1, we have lim_{s→1} G_X(s; t) = G_X(1; t); thus our argument applies also for s = 1. Justification for term-by-term differentiation in the remainder of this book is similar and will not be given explicitly.
When the intensity varies with time, λ = λ(τ), a similar derivation gives the probabilities

p_k(t) = exp[−∫₀ᵗ λ(τ) dτ] [∫₀ᵗ λ(τ) dτ]^k / k!   (2.18)

with expectation

E[X(t)] = ∫₀ᵗ λ(τ) dτ.   (2.19)

We see that the generalization does not affect the form of the distribution, but allows variation to occur as determined by the function λ(τ).
In the weighted Poisson process the intensity λ varies from individual to individual with density f(λ), so that

p_k(t) = ∫₀^∞ p_{k|λ}(t) f(λ) dλ = ∫₀^∞ [e^{−λt}(λt)^k / k!] f(λ) dλ.   (2.20)

Let f(λ) be the gamma density

f(λ) = [β^α / Γ(α)] λ^{α−1} e^{−βλ},   λ > 0, α > 0, β > 0,
     = 0,   λ ≤ 0.   (2.21)

Substituting (2.21) in (2.20) gives

p_k(t) = [β^α t^k / (Γ(α) k!)] ∫₀^∞ λ^{k+α−1} e^{−λ(β+t)} dλ   (2.22)

= [Γ(k + α) / (k! Γ(α))] β^α t^k (β + t)^{−(k+α)},   k = 0, 1, ....   (2.23)
which follows readily upon integrating (2.22) by parts. Using (2.24) above and (4.21) in Chapter 2, equation (2.23) may be rewritten as

p_k(t) = (α + k − 1 choose k) [β/(β + t)]^α [t/(β + t)]^k,   (2.25)

which is for each t a negative binomial distribution with parameters

r = α,   p = β/(β + t).   (2.26)
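The reduction of the mixture (2.20) to the closed form (2.23) can be verified numerically by evaluating the mixture integral with a midpoint rule; the parameter values are illustrative:

```python
import math

# Gamma-mixed Poisson probability (2.20) vs. the closed form (2.23).
alpha, beta, t, k = 2.5, 1.5, 2.0, 3

def gamma_density(lam):
    return beta**alpha / math.gamma(alpha) * lam**(alpha - 1) * math.exp(-beta * lam)

n, upper = 200_000, 60.0          # midpoint rule on [0, upper]
h = upper / n
mix = sum(math.exp(-lam * t) * (lam * t)**k / math.factorial(k) * gamma_density(lam)
          for lam in ((i + 0.5) * h for i in range(n))) * h

closed = (math.gamma(k + alpha) / (math.factorial(k) * math.gamma(alpha))
          * beta**alpha * t**k * (beta + t)**(-(k + alpha)))
```

The integrand decays like e^{−λ(β+t)}, so truncating the integral at λ = 60 costs nothing at this accuracy.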
3. PURE BIRTH PROCESSES

In a pure birth process with birth intensities λ_k, the probabilities p_k(t) satisfy the differential equations

(d/dt) p_{k₀}(t) = −λ_{k₀} p_{k₀}(t),   (3.1a)
(d/dt) p_k(t) = −λ_k p_k(t) + λ_{k−1} p_{k−1}(t),   k > k₀,   (3.1b)

with the initial conditions

p_{k₀}(0) = 1,   p_k(0) = 0   for k ≠ k₀.   (3.2)

When the λ_k are distinct, the solution is

p_k(t) = (−1)^{k−k₀} λ_{k₀} ⋯ λ_{k−1} Σ_{i=k₀}^k e^{−λ_i t} / ∏_{j=k₀, j≠i}^k (λ_i − λ_j),   (3.3)

which may be verified with the aid of the identity

Σ_{i=k₀}^k 1 / ∏_{j=k₀, j≠i}^k (λ_i − λ_j) = 0.   (3.4)
the reader may refer to equation (3.33) in that chapter for verification.
We now prove (3.3) by induction. Equation (3.1a) has the solution
p_{k₀}(t) = e^{−λ_{k₀} t};

hence (3.3) is true for k = k₀. Suppose (3.3) is true for k − 1, that is,

p_{k−1}(t) = (−1)^{k−1−k₀} λ_{k₀} ⋯ λ_{k−2} Σ_{i=k₀}^{k−1} e^{−λ_i t} / ∏_{j=k₀, j≠i}^{k−1} (λ_i − λ_j).   (3.5)
We shall derive (3.3) from (3.1b). Following the usual procedure for linear first-order differential equations, we multiply (3.1b) by e^{λ_k t} to obtain

e^{λ_k t} (d/dt) p_k(t) + e^{λ_k t} λ_k p_k(t) = e^{λ_k t} λ_{k−1} p_{k−1}(t)
or

(d/dt)[e^{λ_k t} p_k(t)] = λ_{k−1} e^{λ_k t} p_{k−1}(t).   (3.6)

Substituting (3.5) in (3.6) gives

(d/dt)[e^{λ_k t} p_k(t)] = (−1)^{k−1−k₀} λ_{k₀} ⋯ λ_{k−1} Σ_{i=k₀}^{k−1} e^{(λ_k − λ_i)t} / ∏_{j=k₀, j≠i}^{k−1} (λ_i − λ_j).

Therefore

e^{λ_k t} p_k(t) = (−1)^{k−k₀} λ_{k₀} ⋯ λ_{k−1} Σ_{i=k₀}^{k−1} e^{(λ_k − λ_i)t} / ∏_{j=k₀, j≠i}^{k} (λ_i − λ_j) + c.   (3.7)

At t = 0, p_k(0) = 0 for k > k₀. Using identity (3.4), we see that the constant c is given by

c = (−1)^{k−k₀} λ_{k₀} ⋯ λ_{k−1} / ∏_{j=k₀}^{k−1} (λ_k − λ_j).   (3.8)
It follows that the probability of exactly one birth among the k individuals
in (t, t + Δ) is kλΔ + o(Δ) and the probability of more than one birth is o(Δ). Thus we have a simple birth process with λ_k = kλ, and system (3.1)
becomes

(d/dt) p_{k₀}(t) = −k₀λ p_{k₀}(t),
(d/dt) p_k(t) = −kλ p_k(t) + (k − 1)λ p_{k−1}(t),   k > k₀,   (3.9)

with the initial conditions (3.2). System (3.9) is easily solved; for k ≥ k₀,

p_k(t) = (k − 1 choose k − k₀) e^{−k₀λt} (1 − e^{−λt})^{k−k₀},   (3.10)

which is obtained by substituting λ_j = jλ in the general solution (3.3) and simplifying.

The simple birth process arose first in the mathematical theory of evolution, with X(t) representing the total number of species in some genus at time t. It is instructive to consider the number Y(t) of new species generated in the interval (0, t); that is, Y(t) = X(t) − k₀. From (3.10),
Pr{Y(t) = k} = p_{k+k₀}(t) = (k₀ + k − 1 choose k) e^{−k₀λt} (1 − e^{−λt})^k,   (3.15)

so that

E[X(t)] = k₀ + E[Y(t)] = k₀ e^{λt}

and

σ²_{X(t)} = σ²_{Y(t)} = k₀ e^{λt} [e^{λt} − 1].
From the form of the negative binomial p.g.f. [see equation (4.18) in
Chapter 2], we see that Y(t) is the sum of k 0 independent and identically
distributed random variables, which are, of course, the species generated
by each of the k 0 initial species.
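Equation (3.10) identifies X(t) as a negative binomial random variable; a quick numerical check of the normalization and of E[X(t)] = k₀e^{λt}, with illustrative parameters:

```python
import math

# Yule-process distribution (3.10): negative binomial with r = k0,
# success probability e^{-lam*t}.
k0, lam, t = 3, 0.8, 2.0
q = 1.0 - math.exp(-lam * t)

def p_k(k):
    return math.comb(k - 1, k - k0) * math.exp(-k0 * lam * t) * q**(k - k0)

ks = range(k0, 500)                 # the tail beyond 500 is negligible here
total = sum(p_k(k) for k in ks)
mean = sum(k * p_k(k) for k in ks)
```

The computed mean matches k₀e^{λt} from the display above.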
Suppose now that the birth intensity is a function of time, λ_k(t) = kλ(t); then

(d/dt) p_k(t) = −kλ(t) p_k(t) + (k − 1)λ(t) p_{k−1}(t),   k > k₀.   (3.16b)
The system (3.16) could be solved successively; instead, we shall use the
method of p.g.f.’s. Let
G_X(s; t) = Σ_{k=k₀}^∞ p_k(t) s^k,   (3.17)

with the initial condition

G_X(s; 0) = s^{k₀}.   (3.18)

Differentiating (3.17) with respect to t and using (3.16), we obtain

(∂/∂t) G_X(s; t) = −λ(t)s Σ_{k=k₀}^∞ k p_k(t) s^{k−1} + λ(t)s² Σ_{k=k₀+1}^∞ (k − 1) p_{k−1}(t) s^{k−2}.   (3.19)

Each of the summations on the right is equal to the derivative of G_X(s; t) with respect to s; hence

(∂/∂t) G_X(s; t) + λ(t)s(1 − s)(∂/∂s) G_X(s; t) = 0.   (3.20)

The auxiliary equations are

dt/1 = ds/[λ(t)s(1 − s)]   and   dG_X(s; t) = 0.   (3.21)

Integrating the first of these gives [s/(1 − s)] exp[−∫₀ᵗ λ(τ) dτ] = const., so the general solution of (3.20) is

G_X(s; t) = Φ[ (s/(1 − s)) exp(−∫₀ᵗ λ(τ) dτ) ],   (3.24)

where Φ is an arbitrary differentiable function. To obtain the particular solution corresponding to the initial condition (3.18), we set t = 0 in (3.24) to obtain

Φ[s/(1 − s)] = G_X(s; 0) = s^{k₀}.   (3.25)

Equation (3.25) holds true at least for all s with |s| < 1; hence, for any θ such that |θ/(1 + θ)| < 1,

Φ(θ) = [θ/(1 + θ)]^{k₀}.   (3.26)
Letting

θ = (s/(1 − s)) exp(−∫₀ᵗ λ(τ) dτ),

we have, from (3.24) and (3.26), the solution

G_X(s; t) = s^{k₀} exp(−k₀ ∫₀ᵗ λ(τ) dτ) {1 − s[1 − exp(−∫₀ᵗ λ(τ) dτ)]}^{−k₀}.   (3.27)

The second factor on the right side is recognized as the p.g.f. of a negative binomial distribution, and the first factor informs us that X(t) is obtained from the negative binomial
moments of X(t) are the same as for the simple Yule process, except that λt is replaced by ∫₀ᵗ λ(τ) dτ.

To study the correlation between X(t₁) and X(t₂), 0 < t₁ < t₂, consider the joint p.g.f.

G(s₁, s₂; t₁, t₂) = E[s₁^{X(t₁)} s₂^{X(t₂)} | X(0)]   (3.28)

= E{s₁^{X(t₁)} E[s₂^{X(t₂)} | X(t₁)] | X(0)}.   (3.29)

By (3.27), the inner conditional expectation is

E{s₂^{X(t₂)} | X(t₁)} = { s₂ exp[−∫_{t₁}^{t₂} λ(τ) dτ] / (1 − s₂[1 − exp(−∫_{t₁}^{t₂} λ(τ) dτ)]) }^{X(t₁)}.   (3.30)

Writing

z₁ = s₂ exp[−∫_{t₁}^{t₂} λ(τ) dτ] / (1 − s₂[1 − exp(−∫_{t₁}^{t₂} λ(τ) dτ)]),   (3.31)

(3.28) becomes

G(s₁, s₂; t₁, t₂) = E[(s₁ z₁)^{X(t₁)} | X(0)].   (3.32)

Since, clearly, |z₁| ≤ 1, we can use (3.27) once again to obtain

G(s₁, s₂; t₁, t₂) = { s₁ z₁ exp[−∫₀^{t₁} λ(τ) dτ] / (1 − s₁ z₁[1 − exp(−∫₀^{t₁} λ(τ) dτ)]) }^{X(0)}.   (3.33)

Letting

π₁ = exp[−∫₀^{t₁} λ(τ) dτ]   and   π₂ = exp[−∫_{t₁}^{t₂} λ(τ) dτ],   (3.34)

we may rewrite (3.33) as

G(s₁, s₂; t₁, t₂) = { s₁π₁ s₂π₂ / [1 − s₁(1 − π₁)s₂π₂ − s₂(1 − π₂)] }^{X(0)},   (3.35)

from which the covariance Cov[X(t₁), X(t₂) | X(0)] in (3.36) may be computed (see Problem 10). Because the process is Markovian, the joint distribution satisfies

Pr{X(t₁) = k₁, X(t₂) = k₂ | X(0) = k₀}
= Pr{X(t₁) = k₁ | X(0) = k₀} Pr{X(t₂) = k₂ | X(t₁) = k₁}.   (3.37)
Each factor in (3.37) is of the form (3.10); for any t_i < t_{i+1},

Pr{X(t_{i+1}) = k_{i+1} | X(t_i) = k_i}
= (k_{i+1} − 1 choose k_{i+1} − k_i) exp[−k_i ∫_{t_i}^{t_{i+1}} λ(τ) dτ] {1 − exp[−∫_{t_i}^{t_{i+1}} λ(τ) dτ]}^{k_{i+1}−k_i}.   (3.38)
4. THE POLYA PROCESS

In the Polya process the birth intensity depends on both the size of the population and on time:

λ_k(t) = (λ + λak)/(1 + λat).   (4.1)

The differential equations (3.16) become

(d/dt) p_{k₀}(t) = −[(λ + λak₀)/(1 + λat)] p_{k₀}(t)   (4.2a)

and

(d/dt) p_k(t) = −[(λ + λak)/(1 + λat)] p_k(t) + [(λ + λa(k − 1))/(1 + λat)] p_{k−1}(t),   k > k₀.   (4.2b)
We shall once again demonstrate the use of the p.g.f., although the above
equations can also be solved successively. Let
It is easily deduced from (4.2) that the p.g.f. satisfies the partial differential equation

(1 + λat)(∂/∂t) G_X(s; t) + λas(1 − s)(∂/∂s) G_X(s; t) + λ(1 − s) G_X(s; t) = 0.   (4.5)

The auxiliary equations are

dt/(1 + λat) = ds/[λas(1 − s)] = dG_X(s; t)/[−λ(1 − s) G_X(s; t)].   (4.6)

The second and third members give dG/G = −ds/(as), so that

G_X(s; t) s^{1/a} = const.,   (4.10)

while the first and second members give [(1 − s)/s](1 + λat) = const. Hence the general solution of (4.5) is

G_X(s; t) = s^{−1/a} Φ[ ((1 − s)/s)(1 + λat) ].   (4.11)

Setting t = 0 and using G_X(s; 0) = s^{k₀} gives

Φ[(1 − s)/s] = s^{k₀ + 1/a}.   (4.12)

Equation (4.12) holds true for all s such that |s| < 1; hence, for any θ such that |1/(1 + θ)| < 1,

Φ(θ) = (1 + θ)^{−(k₀ + 1/a)},   (4.13)

and the particular solution is

G_X(s; t) = s^{k₀} [1/(1 + λat)]^{k₀ + 1/a} {1 − s[λat/(1 + λat)]}^{−(k₀ + 1/a)}.   (4.14)
Hence X(t) is, except for the additive constant k₀, a negative binomial random variable with parameters

r = k₀ + 1/a   and   p = 1/(1 + λat).
If we rewrite (4.14) as

G_X(s; t) = s^{k₀} [1 + λat − λats]^{−k₀} [1 + λat − λats]^{−1/a},   (4.15)

it is clear that

lim_{a→0} G_X(s; t) = s^{k₀} e^{−λt(1−s)}.   (4.16)
Thus we see that as a -> 0 the Polya process (except for the additive
constant k 0 ) approaches the Poisson process. This is expected in view of
(4.1).
The situation is more surprising if we compare the Polya process with
the weighted Poisson process in Section 2.2. In fact, the two processes
are identical in the sense that, except for the additive constant k₀, they have the same probability distribution for X(t), although in the Polya process the intensity (4.1) depends on both the population size and on time, whereas in the weighted Poisson case the risk was assumed to be independent of both. Thus we have two
very different mechanisms on the “micro” level which give rise to the same
behavior of the system on the “macro” level. This means that the two
underlying mechanisms cannot be distinguished by observations of the
random variables X(t).
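This identity between the two mechanisms can be illustrated numerically: Euler-integrating the Polya equations (4.2) with X(0) = 0 reproduces the weighted Poisson distribution (2.23) with α = 1/a and β = 1/(λa). A sketch with illustrative parameters:

```python
import math

lam, a, t_end, K, steps = 1.0, 0.5, 2.0, 60, 40_000
dt = t_end / steps
p = [1.0] + [0.0] * K            # X(0) = 0
for step in range(steps):
    t = step * dt
    rate = lambda k: (lam + lam * a * k) / (1 + lam * a * t)   # (4.1)
    dp = [-rate(0) * p[0]] + [
        -rate(k) * p[k] + rate(k - 1) * p[k - 1] for k in range(1, K + 1)
    ]
    p = [pk + dt * d for pk, d in zip(p, dp)]

alpha, beta = 1 / a, 1 / (lam * a)
def weighted_poisson(k):          # (2.23)
    return (math.gamma(k + alpha) / (math.factorial(k) * math.gamma(alpha))
            * beta**alpha * t_end**k / (beta + t_end)**(k + alpha))

err = max(abs(p[k] - weighted_poisson(k)) for k in range(K + 1))
```

Both distributions are the same negative binomial with r = 1/a and p = 1/(1 + λat), so the residual err reflects only the Euler step and the truncation at K.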
5. PURE DEATH PROCESS

Given X(t) = k, the k individuals are independently subject to the force of mortality μ(t); then the probability of one death occurring in (t, t + Δ) is kμ(t)Δ + o(Δ), the probability of two or more deaths occurring in (t, t + Δ) is o(Δ), and the probability of no change is 1 − kμ(t)Δ + o(Δ). Following the steps in Section 3, we can formulate the differential equations for the probabilities of X(t) and, from these, arrive at the solution

Pr{X(t) = k | X(0) = k₀} = p_k(t) = (k₀ choose k) [exp(−∫₀ᵗ μ(τ) dτ)]^k [1 − exp(−∫₀ᵗ μ(τ) dτ)]^{k₀−k},   k = 0, 1, ..., k₀.   (5.1)
is the probability that the individual will die prior to (or at) age t. Consider
now the interval (0, t +
A) and the corresponding distribution function
F_T(t + Δ). For an individual to die prior to t + Δ he must die prior to t, or else he must survive to t and die during the interval (t, t + Δ). Therefore
Equation (5.7) gives the probability that one individual alive at age 0
will survive to age t. If there are k 0 individuals alive at age 0, the number
X(t ) of survivors to age t is clearly a binomial random variable with the
probability of success 1 — FT (t). Consequently X(t) has the probability
distribution (5.1), and the solution is completed.
The survival probability in (5.7) has been known to life table students for more than two hundred years.³ Unfortunately it has not been given
due recognition by investigators in statistics, although different forms of
this function have appeared in various areas of research. We shall mention a few examples. The probability density function of the life span T is

f_T(t) = μ(t) exp[−∫₀ᵗ μ(τ) dτ],   t > 0,
       = 0,   t ≤ 0.   (5.8)
³ E. Halley's famous table for the City of Breslau was published in the year 1693.
Gompertz [1825] assumed that man's resistance to death decreases at a rate proportional to the resistance itself; writing the resistance as 1/μ(t), this means

(d/dt)[1/μ(t)] = −h [1/μ(t)],   (5.9)

where h is a positive constant. Integrating (5.9) gives the Gompertz law of mortality

μ(t) = Bc^t.   (5.11)

Makeham [1860] added a constant term for risks independent of age,

μ(t) = A + Bc^t.   (5.13)

When the intensity μ is a constant, (5.8) reduces to the exponential density

f(t) = μ e^{−μt},   (5.14)

a formula that plays a central role in the problem of life testing (Epstein and Sobel [1953]).
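For the Gompertz intensity (5.11) the exponent in the survival probability (5.7) has a closed form, which a numerical integration confirms; the constants B and c below are illustrative:

```python
import math

# Survival probability exp(-∫_0^t μ(τ)dτ) for μ(τ) = B c^τ; the integral
# has the closed form B (c^t - 1) / ln c.
B, c, t = 0.001, 1.09, 70.0

n = 100_000                       # midpoint rule on [0, t]
h = t / n
integral = sum(B * c**((i + 0.5) * h) * h for i in range(n))

closed = B * (c**t - 1) / math.log(c)
survival = math.exp(-closed)
```

The midpoint rule agrees with the closed form to well below the tolerance used in the check.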
6. BIRTH-DEATH PROCESSES
We now consider processes that permit a population to grow as well as to decline. These are more relevant to biological populations, in which both births and deaths occur, than are the processes considered in the preceding sections. We shall discuss two models in detail and briefly mention a few others of practical interest in the exercises of this chapter.
Again we let X(t) denote the size of a population at time t for 0 ≤ t < ∞, with the initial size X(0) = k₀ and p_k(t) = Pr{X(t) = k}. Given
X(t) = k, we assume that (i) the probability of exactly one birth occurring in (t, t + Δ) is λ_k(t)Δ + o(Δ); (ii) the probability of exactly one death is μ_k(t)Δ + o(Δ); and (iii) the probability of more than one change is o(Δ), so that the probability of no change in the interval is

1 − λ_k(t)Δ − μ_k(t)Δ − o(Δ).
Consequently the probability p_k(t + Δ) at time t + Δ may be expressed in the difference equation

p_k(t + Δ) = p_k(t)[1 − λ_k(t)Δ − μ_k(t)Δ] + p_{k−1}(t)λ_{k−1}(t)Δ + p_{k+1}(t)μ_{k+1}(t)Δ + o(Δ),   (6.1)

which leads to the system of differential equations

(d/dt) p₀(t) = −[λ₀(t) + μ₀(t)] p₀(t) + μ₁(t) p₁(t),
(d/dt) p_k(t) = −[λ_k(t) + μ_k(t)] p_k(t) + λ_{k−1}(t) p_{k−1}(t) + μ_{k+1}(t) p_{k+1}(t),   k ≥ 1.   (6.2)
Suppose that both λ_k(t) and μ_k(t) are independent of time but proportional to k,

λ_k(t) = kλ   and   μ_k(t) = kμ;   (6.4)

then (6.2) becomes

(d/dt) p₀(t) = μ p₁(t),
(d/dt) p_k(t) = −k(λ + μ) p_k(t) + (k − 1)λ p_{k−1}(t) + (k + 1)μ p_{k+1}(t),   k ≥ 1.   (6.5)

Since the equations (6.5) cannot be solved successively, we shall use the p.g.f.

G_X(s; t) = Σ_{k=0}^∞ s^k p_k(t).   (6.6)
It is easily deduced from (6.5) that the p.g.f. satisfies the homogeneous
partial differential equation
(∂/∂t) G_X(s; t) + (1 − s)(λs − μ)(∂/∂s) G_X(s; t) = 0   (6.7)

with the initial condition

G_X(s; 0) = s^{k₀}.   (6.8)

The auxiliary equations are

dt/1 = ds/[(1 − s)(λs − μ)]   and   dG_X(s; t) = 0.   (6.9)
In the case λ ≠ μ we may use partial fractions to rewrite the first auxiliary equation as

dt = [1/(λ − μ)] [λ/(λs − μ)] ds + [1/(λ − μ)] [1/(1 − s)] ds   (6.10)

or

(λ − μ) dt = d log[(λs − μ)/(1 − s)],   (6.11)

so that

[(λs − μ)/(1 − s)] e^{−(λ−μ)t} = const.   (6.12)
Hence the general solution of (6.7) is

G_X(s; t) = Φ[ ((1 − s)/(λs − μ)) e^{(λ−μ)t} ].   (6.14)

Setting t = 0 and using (6.8) gives

Φ[(1 − s)/(λs − μ)] = s^{k₀},   (6.15)

at least for all s with |s| < 1. Hence, for all θ such that |1 + θμ| < |1 + θλ|,

Φ(θ) = [(1 + θμ)/(1 + θλ)]^{k₀}.   (6.16)

Letting

θ = [(1 − s)/(λs − μ)] e^{(λ−μ)t},   (6.17)

we obtain the particular solution

G_X(s; t) = { [(λs − μ) + μ(1 − s) e^{(λ−μ)t}] / [(λs − μ) + λ(1 − s) e^{(λ−μ)t}] }^{k₀}.   (6.18)
Letting

α(t) = μ[1 − e^{(λ−μ)t}] / [μ − λ e^{(λ−μ)t}],   β(t) = (λ/μ) α(t),   (6.19)

(6.18) may be rewritten as

G_X(s; t) = {α(t) + [1 − α(t) − β(t)]s}^{k₀} {1 − β(t)s}^{−k₀}.   (6.20)

Expanding the two factors separately,

{α(t) + [1 − α(t) − β(t)]s}^{k₀} = Σ_{i=0}^{k₀} (k₀ choose i) [α(t)]^{k₀−i} [1 − α(t) − β(t)]^i s^i   (6.21)

and

{1 − β(t)s}^{−k₀} = Σ_{j=0}^∞ (k₀ + j − 1 choose j) [β(t)]^j s^j.   (6.22)
Hence the probability distribution is

p_k(t) = Σ_{i=0}^{min[k₀,k]} (k₀ choose i)(k₀ + k − i − 1 choose k₀ − 1) [α(t)]^{k₀−i} [β(t)]^{k−i} [1 − α(t) − β(t)]^i,   k ≥ 1,

p₀(t) = [α(t)]^{k₀}.   (6.23)

When k₀ = 1, the p.g.f. (6.20) reduces to

G_X(s; t) = {α(t) + [1 − α(t) − β(t)]s} / {1 − β(t)s},

and

p_k(t) = [1 − α(t)][1 − β(t)][β(t)]^{k−1},   k ≥ 1,
p₀(t) = α(t).   (6.24)
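The distribution (6.23) can be spot-checked numerically: it should sum to one and reproduce the mean k₀e^{(λ−μ)t}. A sketch with illustrative parameters:

```python
import math

lam, mu, t, k0 = 1.0, 0.5, 1.5, 3
w = math.exp((lam - mu) * t)
alpha = mu * (1 - w) / (mu - lam * w)   # (6.19)
beta = lam * alpha / mu

def p_k(k):                              # (6.23)
    if k == 0:
        return alpha**k0
    return sum(math.comb(k0, i) * math.comb(k0 + k - i - 1, k0 - 1)
               * alpha**(k0 - i) * beta**(k - i) * (1 - alpha - beta)**i
               for i in range(min(k0, k) + 1))

total = sum(p_k(k) for k in range(2000))
mean = sum(k * p_k(k) for k in range(2000))
```

Note that 1 − α(t) − β(t) may be negative; the individual probabilities remain nonnegative and the identities still hold.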
The expected value and variance of X(t) are, for λ ≠ μ,

E[X(t)] = k₀ e^{(λ−μ)t}   (6.25)

and

σ²_{X(t)} = k₀ [(λ + μ)/(λ − μ)] e^{(λ−μ)t} [e^{(λ−μ)t} − 1].   (6.26)

When λ = μ, the quantities in (6.19) reduce to α*(t) = β*(t) = λt/(1 + λt), so that

G_X(s; t) = { [λt + (1 − λt)s] / [1 + λt − λts] }^{k₀}   (6.28)

and, in particular,

p₀(t) = [λt/(1 + λt)]^{k₀}.   (6.29)
Formulas (6.28) and (6.29) can also be obtained directly from (6.20) and
(6.23) by letting ^ —> 2. The expected value and variance of X(t) are
E[X(t)] = k₀   and   σ²_{X(t)} = 2k₀λt;   (6.30)
thus when the birth rate equals the death rate the population size has a
constant expectation but an increasing variance.
The limiting behavior of the birth-death process is analogous to that
encountered in branching processes (see Section 8, Chapter 2). The
probability that the population becomes extinct at time t is
p₀(t) = G_X(0; t) = [α(t)]^{k₀}.

From (6.19) and (6.20),

lim_{t→∞} G_X(s; t) = (μ/λ)^{k₀},   μ < λ,
                    = 1,   μ ≥ λ.   (6.31)

Thus, if μ ≥ λ, the probability of extinction tends to unity as t → ∞, and the population is certain to die out eventually. On the other hand, if μ < λ, the limiting "p.g.f." is constant, and the probability that the population will increase without bound is 1 − (μ/λ)^{k₀}.
It is interesting to contrast this limiting behavior with that of the expectation and variance of X(t). From (6.25), (6.26), and (6.30), we see that

lim_{t→∞} E[X(t)] = 0,   μ > λ,
                  = k₀,   μ = λ,
                  = ∞,   μ < λ,   (6.32)

and

lim_{t→∞} σ²_{X(t)} = 0,   μ > λ,
                    = ∞,   μ ≤ λ.   (6.33)

When λ = μ we have an interesting case in which the probability of extinction tends to unity and yet the expected population size tends to k₀.
When the intensities depend on time, the p.g.f. satisfies

(∂/∂t) G_X(s; t) + (1 − s)[sλ(t) − μ(t)](∂/∂s) G_X(s; t) = 0   (6.34)

and the auxiliary equations are

dt/1 = ds/[(1 − s)(sλ(t) − μ(t))]   and   dG_X(s; t) = 0.   (6.35)

Since λ(t) and μ(t) are functions of t, the method of partial fractions will not work here. We introduce a new variable

z = (1 − s)^{−1}   (6.36)

and find that z satisfies the linear first-order equation

(d/dt) z − [λ(t) − μ(t)]z + λ(t) = 0.   (6.37)
Let

y(t) = ∫₀ᵗ [μ(τ) − λ(τ)] dτ;   (6.38)

then e^{y(t)} is an integrating factor for (6.37):

(d/dt)[z e^{y(t)}] = −λ(t) e^{y(t)},

or

(d/dt)[ z e^{y(t)} + ∫₀ᵗ λ(τ) e^{y(τ)} dτ ] = 0.   (6.39)

Hence

[1/(1 − s)] e^{y(t)} + ∫₀ᵗ λ(τ) e^{y(τ)} dτ = const.   (6.40)

Using (6.40) and the second auxiliary equation in (6.35) we obtain the general solution

G_X(s; t) = Φ[ (1/(1 − s)) e^{y(t)} + ∫₀ᵗ λ(τ) e^{y(τ)} dτ ].   (6.41)
At t = 0, y(0) = 0 and

Φ[1/(1 − s)] = s^{k₀},   (6.42)

so that

Φ(θ) = (1 − 1/θ)^{k₀}.   (6.43)

The particular solution is therefore

G_X(s; t) = { 1 − 1/[ (1/(1 − s)) e^{y(t)} + ∫₀ᵗ λ(τ) e^{y(τ)} dτ ] }^{k₀}   (6.44)

or

G_X(s; t) = { 1 − (1 − s)/[ e^{y(t)} + (1 − s) ∫₀ᵗ λ(τ) e^{y(τ)} dτ ] }^{k₀}.   (6.45)
Letting

α(t) = 1 − [ e^{y(t)} + ∫₀ᵗ λ(τ) e^{y(τ)} dτ ]^{−1}   (6.46)

and

β(t) = 1 − e^{y(t)}[1 − α(t)],   (6.47)

we may rewrite (6.45) as

G_X(s; t) = { [α(t) + [1 − α(t) − β(t)]s] / [1 − β(t)s] }^{k₀}.   (6.48)

Except for the definitions of α(t) and β(t), the p.g.f. (6.48) is of the same form as the one in (6.20); therefore the probability distribution {p_k(t)} is given by (6.23) with α(t) and β(t) as defined in (6.46) and (6.47).
The probability of extinction at time t may be obtained directly from (6.46) and (6.48):

p₀(t) = { 1 − [ e^{y(t)} + ∫₀ᵗ λ(τ) e^{y(τ)} dτ ]^{−1} }^{k₀}.   (6.49)

Since

e^{y(t)} + ∫₀ᵗ λ(τ) e^{y(τ)} dτ = 1 + ∫₀ᵗ μ(τ) e^{y(τ)} dτ,   (6.50)

we have

p₀(t) = { ∫₀ᵗ μ(τ) e^{y(τ)} dτ / [1 + ∫₀ᵗ μ(τ) e^{y(τ)} dτ] }^{k₀}.   (6.51)

We see that in this case p₀(t) → 1 as t → ∞ if, and only if, the integral in the above expression diverges as t → ∞.
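For constant intensities, (6.51) must agree with [α(t)]^{k₀} computed from (6.19); a numerical check with illustrative parameters:

```python
import math

lam, mu, t, k0 = 1.0, 0.6, 2.5, 2

# With constant rates, y(t) = (mu - lam) t and the integral in (6.51)
# has the closed form mu (e^{(mu-lam)t} - 1) / (mu - lam).
integral = mu * (math.exp((mu - lam) * t) - 1) / (mu - lam)
p0_general = (integral / (1 + integral)) ** k0          # (6.51)

w = math.exp((lam - mu) * t)
alpha = mu * (1 - w) / (mu - lam * w)                   # (6.19)
p0_linear = alpha ** k0
```

The two expressions are algebraically identical, so the check holds to machine precision.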
PROBLEMS
1. Poisson process. Let {X(t)} be the simple Poisson process with the probability distribution given in (2.8). Find the covariance between X(t) and X(t + τ).
2. Continuation. Let T_n be the length of time up to the occurrence of the nth event. Derive the distribution function

F_n(t) = Pr{T_n ≤ t}

and the expectation E(T_n). [Hint: Establish a relation between X(t), the number of events occurring up to t, and T_n.]
3. Continuation. Derive the distribution function for the difference
Tn+i — Tn .
6. Consider the probability distribution (2.23),

p_k(t) = [Γ(k + α)/(k! Γ(α))] β^α t^k (β + t)^{−(k+α)},   k = 0, 1, ....

The following frequency table records, for 4116 persons, the number k of events experienced per person:

k              Number of persons
0              2367
1              749
2              350
3              222
4              136
5              95
6              64
7              41
8              25
9              14
10             12
11             11
12             9
13             8
14             5
15 or more     8
Total          4116
(b) Compute the sample mean X̄ and sample variance S_X² from the frequency table above, and use these values to estimate α and β in the formula.
(c) Compute the expected number of persons F_k for each k and determine

χ₀² = Σ_k (f_k − F_k)² / F_k,

where f_k is the observed frequency.
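Part (b) can be sketched by the method of moments, equating the sample mean and variance to those of the negative binomial (2.25) with t = 1. Treating the open class "15 or more" as k = 15 is a simplifying assumption, not part of the original problem:

```python
freq = {0: 2367, 1: 749, 2: 350, 3: 222, 4: 136, 5: 95, 6: 64, 7: 41,
        8: 25, 9: 14, 10: 12, 11: 11, 12: 9, 13: 8, 14: 5, 15: 8}
n = sum(freq.values())
mean = sum(k * f for k, f in freq.items()) / n
var = sum(f * (k - mean) ** 2 for k, f in freq.items()) / (n - 1)

# Negative binomial: mean = r(1-p)/p, var = r(1-p)/p^2, so p = mean/var
# and r = mean * p / (1 - p); here r plays the role of alpha, and
# p = beta/(beta + t) with t = 1 gives beta = p/(1 - p).
p_hat = mean / var
alpha_hat = mean * p_hat / (1 - p_hat)
beta_hat = p_hat / (1 - p_hat)
```

The data are over-dispersed (variance well above the mean), which is exactly the situation the gamma-mixed Poisson model is meant to capture.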
7. Yule process. From (3.9), derive the probability distribution
10. Continuation. Derive the covariance between X(t₁) and X(t₂) in (3.36) (i) from the p.g.f. (3.35), and (ii) by using the identity

E[X(t₁)X(t₂) | X(0)] = E{X(t₁) E[X(t₂) | X(t₁)] | X(0)}.
12. Pure death process. Derive and solve the differential equations for the
probability distribution in (5.1).
13. Continuation. Use the method of p.g.f. to derive the probability dis-
tribution in (5.1).
14. Linear growth. Show that the probability function (6.23) satisfies the system of differential equations in (6.5) and that

Σ_{k=0}^∞ p_k(t) = 1.
15. Continuation. Derive the p.g.f. in (6.28) by solving the differential equation
in (6.7).
16. Continuation. Derive the p.g.f. in (6.28) from (6.18) by taking the limit as μ → λ.
17. Migration process. Suppose that the change in a population size results
only from immigration and emigration so that
Pr {X{t + A) = k + 1 |
X(t) = k} = A +rj o( A)
and
Pr {X(t + A) = k - 1 X{t) = k} = k/uA + o( A)
|
18. Continuation. Suppose that the increase in a population size results from
“birth” and “immigration” so that
Pr{X(t + Δ) = k + 1 | X(t) = k} = kλΔ + ηΔ + o(Δ).
Derive the probability distribution for X(t) given X(0) = k₀ and the corresponding p.g.f.
19. General birth-death process. Show that the p.g.f. (6.44) satisfies the partial differential equation (6.34) with X(0) = k₀. Derive the appropriate differential equations for the probabilities.
23. Queueing. A counter has s stations for serving customers. If all s stations
are occupied, newly arriving customers must form a queue (or line) and wait
until service is available. A queueing system is determined by (i) the input process ,
(ii) the queue discipline, and (iii) the service mechanism. The queueing theory has
25. Continuation. Let Q be the length of the waiting line in the limiting case
in Problem 24. Derive the expectation and variance of Q.
26. Continuation. Let W be the length of time that a customer has to wait
in line for service. Find the expected waiting time E(W) for the limiting case as t → ∞.
CHAPTER 4

A Simple Illness-Death Process

1. INTRODUCTION
In the pure death process and in the birth-death processes individuals
were assumed to have the same probability of dying within a given time
interval. This assumption does not hold true in the human population.
All individuals are not equally healthy, and the chance of dying varies
from one person to another. Since death is usually preceded by an illness,
a mortality study is incomplete unless illness is taken into consideration.
Illness and death, however, are distinct and different types of events.
Illnesses are potentially concurrent, repetitive, and reversible, whereas death is an irreversible or absorbing state. The study of illness adds a new
relapse, recovery, and death for cancer patients. Generally, in the study
of illness-death processes, a population is divided into a number of illness
given in Chapter 7.
upon the severity of his case. A patient may be transferred from one ward to another when his condition improves or worsens; he may be discharged from the hospital, transferred to another hospital, or he may die. A record
is kept for each patient from the time of admission regarding treatments
2. ILLNESS AND DEATH TRANSITION PROBABILITIES
Let (τ, t) be a time interval, with 0 ≤ τ < t < ∞. At time τ let an individual be in state S_α. During the interval (τ, t) he may travel continually between S_α and S_β, for α, β = 1, 2, and he may reach a state of death. His transitions from one state to another are governed by intensities of risk of illness (ν_αβ) and of death (μ_αδ), defined for

α ≠ β;   α, β = 1, 2;   δ = 1, ..., r,

and for each ξ, τ < ξ < t. The intensities ν_αβ and μ_αδ are assumed to be independent of ξ for τ < ξ < t. For notational convenience, we define

ν_αα = −(ν_αβ + Σ_{δ=1}^r μ_αδ),   (2.3)

so that the intensities of each state sum to zero. It is clear that ν_αα ≤ 0. If ν_αα = 0, then the state S_α is absorbing, and should be treated as a death state. Hence we shall assume that

ν_αα < 0,   α = 1, 2.   (2.5)
We assume also that death is possible from at least one of the illness states:

Σ_{δ=1}^r μ_1δ > 0   or   Σ_{δ=1}^r μ_2δ > 0.   (2.6)
transition probabilities, with the initial conditions

Q_αδ(τ, τ) = 0,   α = 1, 2;   δ = 1, ..., r.
P_αα(τ, t + Δ) = P_αα(τ, t)[1 + ν_αα Δ] + P_αβ(τ, t) ν_βα Δ + o(Δ),
P_αβ(τ, t + Δ) = P_αα(τ, t) ν_αβ Δ + P_αβ(τ, t)[1 + ν_ββ Δ] + o(Δ).   (2.12)
(d/dt) P_αα(τ, t) = P_αα(τ, t) ν_αα + P_αβ(τ, t) ν_βα,
(d/dt) P_αβ(τ, t) = P_αα(τ, t) ν_αβ + P_αβ(τ, t) ν_ββ,   α ≠ β;   α, β = 1, 2.   (2.13)
We look for solutions of the form

P_αα(τ, t) = c_αα e^{ρt},   P_αβ(τ, t) = c_αβ e^{ρt}.   (2.14)

Substituting (2.14) in (2.13) gives

(ρ − ν_αα) c_αα − ν_βα c_αβ = 0,
−ν_αβ c_αα + (ρ − ν_ββ) c_αβ = 0.   (2.15)
of (2.16) are thus the only values of ρ for which the expressions (2.14) can be solutions of (2.13). Because the discriminant

(ν_αα − ν_ββ)² + 4 ν_αβ ν_βα   (2.17)

cannot be negative, the roots ρ_i are always real; furthermore, if

ν_αβ > 0   and   ν_βα > 0,   (2.18)
the discriminant is strictly positive and the roots are therefore distinct.
We shall assume that (2.18) holds true, that is, that each state can be reached from the other state, and hence we shall consider only the case of
distinct roots. From (2.5) and (2.17), we see that p 2 is strictly negative
and />! is nonpositive. Closer examination of (2.17) in the light of (2.3),
(2.5), and (2.6) shows that p 1 is also strictly negative.
We see from equations (2.15) that the coefficients c_ααi and c_αβi corresponding to each root ρ_i are related as follows:

c_ααi / c_αβi = (ρ_i − ν_ββ) / ν_αβ.   (2.20)

Writing c_ααi = k_i(ρ_i − ν_ββ) and c_αβi = k_i ν_αβ, we have, for each ρ_i, a pair of solutions

P_αα(τ, t) = k_i (ρ_i − ν_ββ) e^{ρ_i t},
P_αβ(τ, t) = k_i ν_αβ e^{ρ_i t},   α ≠ β;   α, β = 1, 2.   (2.22)

The general solutions are linear combinations of the pairs in (2.22). The initial conditions P_αα(τ, τ) = 1 and P_αβ(τ, τ) = 0 require that

Σ_{i=1}^2 k_i (ρ_i − ν_ββ) e^{ρ_i τ} = 1,
Σ_{i=1}^2 k_i ν_αβ e^{ρ_i τ} = 0,   (2.23)

whence

k_i = e^{−ρ_i τ} / (ρ_i − ρ_j),   i ≠ j;   i, j = 1, 2.   (2.24)
Substituting (2.24) in (2.22) and summing over i, we obtain

P_αα(τ, t) = Σ_{i=1}^2 [(ρ_i − ν_ββ)/(ρ_i − ρ_j)] e^{ρ_i(t−τ)},
P_αβ(τ, t) = Σ_{i=1}^2 [ν_αβ/(ρ_i − ρ_j)] e^{ρ_i(t−τ)},   j ≠ i.   (2.25)

These probabilities depend only on the difference t − τ, so that

P_αβ(τ, t) = P_αβ(0, t − τ);   (2.26)

the process is homogeneous with respect to time, and we may set τ = 0 and write

P_αα(0, t) = P_αα(t) = Σ_{i=1}^2 [(ρ_i − ν_ββ)/(ρ_i − ρ_j)] e^{ρ_i t},
P_αβ(0, t) = P_αβ(t) = Σ_{i=1}^2 [ν_αβ/(ρ_i − ρ_j)] e^{ρ_i t},   j ≠ i.   (2.27)
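The solution (2.27) is easy to evaluate directly; the sketch below (with illustrative intensities) also spot-checks the Chapman-Kolmogorov property discussed in Section 3:

```python
import math

nu12, nu21 = 0.3, 0.2      # illness transition intensities
mu1, mu2 = 0.1, 0.4        # total death intensities Σ_δ μ_αδ
nu11 = -(nu12 + mu1)       # (2.3)
nu22 = -(nu21 + mu2)

disc = math.sqrt((nu11 - nu22) ** 2 + 4 * nu12 * nu21)   # (2.17)
rho1 = (nu11 + nu22 + disc) / 2                          # roots of (2.16)
rho2 = (nu11 + nu22 - disc) / 2

def terms(num1, num2, t):
    return (num1 * math.exp(rho1 * t) / (rho1 - rho2)
            + num2 * math.exp(rho2 * t) / (rho2 - rho1))

def P11(t): return terms(rho1 - nu22, rho2 - nu22, t)    # (2.27)
def P12(t): return terms(nu12, nu12, t)
def P21(t): return terms(nu21, nu21, t)
def P22(t): return terms(rho1 - nu11, rho2 - nu11, t)

# Chapman-Kolmogorov: P11(t) = P11(tau) P11(t - tau) + P12(tau) P21(t - tau)
t, tau = 2.0, 0.8
lhs = P11(t)
rhs = P11(tau) * P11(t - tau) + P12(tau) * P21(t - tau)
```

The identity holds exactly because (2.27) is the spectral decomposition of a matrix exponential, which satisfies the semigroup property.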
An individual in R_δ at time t may have reached that state at any time prior to t; let us consider an infinitesimal time interval (τ, τ + dτ) for a fixed τ, 0 < τ < t. The probability that an individual in state S_α at time 0 will reach the state R_δ in the interval (τ, τ + dτ) is
Q_αδ(t) = ∫₀ᵗ P_αα(τ) μ_αδ dτ + ∫₀ᵗ P_αβ(τ) μ_βδ dτ.   (2.29)

Substituting (2.27) in (2.29) and integrating give

Q_αδ(t) = Σ_{i=1}^2 [(ρ_i − ν_ββ)μ_αδ + ν_αβ μ_βδ] (e^{ρ_i t} − 1) / [ρ_i (ρ_i − ρ_j)],
   j ≠ i;   α ≠ β;   α, β = 1, 2;   δ = 1, ..., r.   (2.30)
3. CHAPMAN-KOLMOGOROV EQUATIONS
The illness-death process, described in this chapter, is a Markov process
in the sense that the future transitions of an individual are independent of
transitions made in the past (see Definition 1 in Chapter 6). An important consequence of this property is the Chapman-Kolmogorov equation in (3.2).
Direct verification from (2.27) is straightforward. In the sum P_αα(τ)P_αα(t − τ) + P_αβ(τ)P_βα(t − τ), the terms in e^{ρ_i τ} e^{ρ_i(t−τ)} = e^{ρ_i t} reproduce the right side of (3.2), while the cross terms in e^{ρ₁τ + ρ₂(t−τ)} and e^{ρ₂τ + ρ₁(t−τ)} carry the coefficient

[(ρ₁ − ν_ββ)(ρ₂ − ν_ββ) + ν_αβ ν_βα] / (ρ₁ − ρ₂)²,   (3.5)

which vanishes because ρ₁ and ρ₂ are the roots of (2.16). When (3.4) and (3.5) are substituted in (3.3), we recover the left side of equation (3.2), verifying the Chapman-Kolmogorov equation.
Chapman-Kolmogorov type equations may be established also for the
transition probabilities leading to death. Consider an individual in illness
state S_α at time 0 and the probability Q_αδ(t) that he will be in death state R_δ at time t, and let τ be a fixed point in the interval (0, t). The individual may reach state R_δ either prior to τ or after τ. The probability of the former event is Q_αδ(τ). In the latter case, he must be in either illness state
4. EXPECTED DURATION IN ILLNESS AND DEATH
and

I_δ(τ) = 1 if the individual is in R_δ at time τ,
       = 0 otherwise.   (4.4)

Taking expectations of the integrals of such indicator functions gives the expected durations of stay; for the illness state S_β,

e_αβ(t) = ∫₀ᵗ P_αβ(τ) dτ   (4.9)
and
Using formulas (2.27) and (2.30) for the transition probabilities Pap ( t)
and Qzsir), we obtain the explicit forms
e_αα(t) = ∫₀ᵗ P_αα(τ) dτ = Σ_{i=1}^2 [(ρ_i − ν_ββ)/(ρ_i(ρ_i − ρ_j))] (e^{ρ_i t} − 1),   (4.11)

e_αβ(t) = ∫₀ᵗ P_αβ(τ) dτ = Σ_{i=1}^2 [ν_αβ/(ρ_i(ρ_i − ρ_j))] (e^{ρ_i t} − 1),   (4.12)

and

ε_αδ(t) = ∫₀ᵗ Q_αδ(τ) dτ = Σ_{i=1}^2 { [(ρ_i − ν_ββ)μ_αδ + ν_αβ μ_βδ] / [ρ_i(ρ_i − ρ_j)] } [ (e^{ρ_i t} − 1)/ρ_i − t ].   (4.13)
The sum of the expected durations of stay over all states is equal to the entire length of the interval,

Σ_{β=1}^2 e_αβ(t) + Σ_{δ=1}^r ε_αδ(t) = t.

5. POPULATION SIZES IN ILLNESS STATES AND DEATH STATES

For each α, the transition probabilities satisfy

Σ_{β=1}^2 P_αβ(t) + Σ_{δ=1}^r Q_αδ(t) = 1.   (5.1)

Having derived explicit forms for the probabilities P_αβ(t) and Q_αδ(t), we can now verify equation (5.1) directly. Using (2.30) and (2.3), we have
Σ_{δ=1}^r Q_αδ(t) = Σ_{i=1}^2 { (e^{ρ_i t} − 1)/[ρ_i(ρ_i − ρ_j)] } [ (ρ_i − ν_ββ) Σ_{δ=1}^r μ_αδ + ν_αβ Σ_{δ=1}^r μ_βδ ]

= Σ_{i=1}^2 { (e^{ρ_i t} − 1)/[ρ_i(ρ_i − ρ_j)] } { −(ρ_i − ν_ββ)(ν_αα + ν_αβ) − ν_αβ(ν_ββ + ν_βα) }   (5.2)

= 1 − Σ_{i=1}^2 [e^{ρ_i t}/(ρ_i − ρ_j)] [(ρ_i − ν_ββ) + ν_αβ],   (5.3)

where the last step uses ν_αβ ν_βα = (ρ_i − ν_αα)(ρ_i − ν_ββ). On the other hand, from (2.27),

P_αα(t) + P_αβ(t) = Σ_{i=1}^2 [e^{ρ_i t}/(ρ_i − ρ_j)] [(ρ_i − ν_ββ) + ν_αβ],

and adding this to (5.3) verifies (5.1).
Pi
the sum
x(0) = x1 (0) + x 2 (0) (5.4)
be the initial size of the population. Suppose that the z(0) individuals
travel independently from one state to another, and that at the end of the
interval (0, t ), Xp (t individuals are in illness state Sp and Yd (t) individuals
are in state Rd . Obviously
where Xap (t) is the number of those people in state Sp at time t who were
in state Sa at time 0, and
where Yad (t) is the number of those in state Rs at time t who were in state
Sa On the other hand, each of the za (0) people in Sa at time 0
at time 0.
must be in one of the illness states or in one of the death states at time /;
therefore we have
[Table 1. Population sizes in the illness states and death states at times 0 and t; the column totals at time t are X₁(t), X₂(t), Y₁(t), ..., Y_r(t), with overall total x(0).]
The population sizes of the states at time 0 and at time t are summarized
in Table 1. Because of equation (5.1), for each a the random variables
on the right side of (5.8) have a multinomial distribution with p.g.f.
given by [see Equation (6.14), Chapter 2]:
Therefore the p.g.f. of the joint probability distribution for the population sizes of all the states at time t is

E[s₁^{X₁(t)} s₂^{X₂(t)} z₁^{Y₁(t)} ⋯ z_r^{Y_r(t)} | x₁(0), x₂(0)]

= ∏_{α=1}^2 [P_α1(t)s₁ + P_α2(t)s₂ + Q_α1(t)z₁ + ⋯ + Q_αr(t)z_r]^{x_α(0)},   (5.10)

and the corresponding probability of the outcome {X_αβ(t) = x_αβ, Y_αδ(t) = y_αδ} is

∏_{α=1}^2 [x_α(0)! / (x_{α1}! x_{α2}! y_{α1}! ⋯ y_{αr}!)] [P_α1(t)]^{x_{α1}} [P_α2(t)]^{x_{α2}} [Q_α1(t)]^{y_{α1}} ⋯ [Q_αr(t)]^{y_{αr}},   (5.11)

for β = 1, 2; δ = 1, ..., r, so that

x_{1β} + x_{2β} = x_β,   β = 1, 2,

and

y_{1δ} + y_{2δ} = y_δ,   δ = 1, ..., r.
The expectations are

E[X_β(t) | x₁(0), x₂(0)] = x₁(0) P_{1β}(t) + x₂(0) P_{2β}(t)   (5.12)

and

E[Y_δ(t) | x₁(0), x₂(0)] = x₁(0) Q_{1δ}(t) + x₂(0) Q_{2δ}(t).   (5.13)
The variance of Y_δ(t) is

Var[Y_δ(t) | x₁(0), x₂(0)] = Σ_{α=1}^2 x_α(0) Q_αδ(t)[1 − Q_αδ(t)].   (5.15)
we have

Pr{X₁(t) = 0, X₂(t) = 0 | x₁(0), x₂(0)} = ∏_{α=1}^2 [Q_α1(t) + ⋯ + Q_αr(t)]^{x_α(0)}

= ∏_{α=1}^2 { 1 − Σ_{i=1}^2 [e^{ρ_i t}/(ρ_i − ρ_j)] [(ρ_i − ν_ββ) + ν_αβ] }^{x_α(0)}.   (5.16)
Since ρ₁ and ρ₂ are both strictly negative,

P_αα(∞) = lim_{t→∞} P_αα(t) = lim_{t→∞} Σ_{i=1}^2 [(ρ_i − ν_ββ)/(ρ_i − ρ_j)] e^{ρ_i t} = 0,   (5.17)

and similarly P_αβ(∞) = 0, while

Q_αδ(∞) = −Σ_{i=1}^2 [(ρ_i − ν_ββ)μ_αδ + ν_αβ μ_βδ] / [ρ_i(ρ_i − ρ_j)].   (5.19)

The limiting form of the p.g.f. (5.10) is therefore

∏_{α=1}^2 [z₁ Q_α1(∞) + ⋯ + z_r Q_αr(∞)]^{x_α(0)}.   (5.21)
' (5.21)
Thus as t → ∞, the initial populations x₁(0) and x₂(0) die out, the random variables X₁(∞) and X₂(∞) degenerate to the value 0, and the limiting distribution is the convolution of two independent multinomial distributions.
PROBLEMS
1. In a stochastic model for follow-up study of cancer patients, Fix and
Neyman [1951] considered two illness states: S₁, the state of "leading normal life," and S₂, the state of being under treatment for cancer; and two death states, R₁, death from other causes, and R₂, death from cancer. For illustration,
they suggested the following two sets of numerical values for illness and death
intensities.
For each set of data, compute the transition probabilities PaP (t) and Qad (t), and
the expected durations of stay e_αβ(t) and ε_αδ(t), for α, β, δ = 1, 2 and t = 1.
3. Suppose that the coefficients in (2.15) are written as

c_αβi = k_i ν_αβ,   (2.20a)

so that the general solution becomes

P_αβ(τ, t) = Σ_{i=1}^2 k_i ν_αβ e^{ρ_i t}.   (2.22a)

Derive from (2.22a) the particular solution for the transition probabilities corresponding to the initial conditions (2.9).
4. Chapman-Kolmogorov equation. Verify that solution (2.27) satisfies the Chapman-Kolmogorov equation (3.1b).
5. Verify equations (3.4) and (3.5).
6. Substitute (2.27) and (2.30) for the transition probabilities Pap(t) and
Qa3 (t) in (3.6) and verify the resulting equation.
9. A two-dimensional process. The probability distribution of the population
sizes of the illness states at a particular time t, as discussed in Section 5, can be
derived with a different approach. Given the initial population sizes x_1(0) and
x_2(0) at time 0, let X_1(t) and X_2(t) be the population sizes in S_1 and S_2 at time t,
with the probability distribution denoted by P(x_1, x_2; t),

P(x_1, x_2; t) = Pr{X_1(t) = x_1, X_2(t) = x_2 | x_1(0), x_2(0)}.

Show that P(x_1, x_2; t) satisfies the differential equation

d/dt P(x_1, x_2; t) = P(x_1, x_2; t)(x_1ν_{11} + x_2ν_{22})
    + P(x_1 + 1, x_2; t)(x_1 + 1)μ_1 + P(x_1, x_2 + 1; t)(x_2 + 1)μ_2 + ⋯,

where μ_α = μ_{α1} + ⋯ + μ_{αr}.

10. Continuation. Let G(s_1, s_2; t)
be the p.g.f. of the probability distribution in Problem 9 with the initial con-
dition at t = 0

G(s_1, s_2; 0) = s_1^{x_1(0)} s_2^{x_2(0)}.

Establish the partial differential equation for the p.g.f. and derive the particular
solution corresponding to the above initial condition.

11. Continuation. From the p.g.f. in Problem 10, derive formulas for the
probability distribution P(x_1, x_2; t), the expectations, variances, and covariance
of X_1(t) and X_2(t). Compare your results with those given in (5.11), (5.12), and
(5.14).
to smoke again; the probability of his resuming the habit in a time interval
(t, t + Δ) is ν_{21}Δ + o(Δ), and of remaining a nonsmoker is 1 + ν_{22}Δ + o(Δ),
with ν_{22} = −ν_{21}. The transition probability P_{12}(t) then is the probability that
a person who smokes at the beginning of the time interval (0, t) will be a nonsmoker
at time t. Derive the formulas for the transition probabilities P_{αβ}(t), for
α, β = 1, 2. (The difference should be noted between a nonsmoker and a
smoker who is not smoking at a particular moment.)

13. Continuation. Show that P_{α1}(t) + P_{α2}(t) = 1, for α = 1, 2.

14. Continuation. Given a person who smokes at the beginning of a time
interval, find the expected durations of time that he will be a smoker and a
nonsmoker during the interval (0, t).
CHAPTER 5
Multiple Transitions in Simple Illness-Death Process
1. INTRODUCTION
The transition probabilities P_{αβ}(t) and Q_{αδ}(t) presented in Chapter 4
describe the end results of the movement of an individual in a given time
interval. For example, if we consider an individual in the healthy state S_1
at the beginning of the interval (0, t), then P_{11}(t) is the probability that he
will be healthy at the end of the interval, but this tells us nothing about
his health during the interval. He may become ill and recover a number of
times during (0, t), or he may remain healthy throughout the interval;
the probability P_{11}(t) does not enable us to distinguish one case from the
other. The probabilities P_{αβ}(t) and Q_{αδ}(t) are also insufficient for describing
an individual’s health during an interval. In order to present a more
detailed picture of the health condition of an individual throughout the
entire interval, we need to study multiple transitions: the number of times
that an individual leaves the initial state and the number of times that
he enters another state. The first part of this chapter is devoted to deriving
explicit formulas for the multiple transition probabilities and various related
problems.
In the formulation of multiple transition probabilities, we are concerned
with the number of transitions made in a given time interval. Alternatively,
one may study the time required for making a given number of transitions,
for example, how long it takes a patient to return to good health, or how
long will it be before the occurrence of a second illness. Clearly in such
investigations, time is a random variable. In the latter part of this chapter
we shall present the probability density functions for the multiple transition
times and problems associated with the multiple transition times.
89
is defined by

M_{αβ}(τ, t) = the number of transitions from S_α to S_β that an individual makes during (τ, t),

with P_{αβ}^{(m)}(τ, t) = Pr{M_{αβ}(τ, t) = m}. M_{αβ}(τ, t) is an improper random variable, as is demonstrated later in this
section. We now derive an explicit solution for P_{αβ}^{(m)}(τ, t).

We first derive systems of differential equations for P_{αβ}^{(m)}(τ, t) by con-
sidering two contiguous time intervals (τ, t) and (t, t + Δ) and the
probability that M_{αβ}(τ, t + Δ) = m. Direct enumeration yields

d/dt P_{αα}^{(m)}(τ, t) = P_{αα}^{(m)}(τ, t)ν_{αα} + P_{αβ}^{(m)}(τ, t)ν_{βα},
d/dt P_{αβ}^{(m)}(τ, t) = P_{αα}^{(m−1)}(τ, t)ν_{αβ} + P_{αβ}^{(m)}(τ, t)ν_{ββ},   α ≠ β; α, β = 1, 2.   (2.2)

The corresponding p.g.f.'s g_{αα}(s; τ, t) = Σ_m s^m P_{αα}^{(m)}(τ, t) and
g_{αβ}(s; τ, t) = Σ_m s^m P_{αβ}^{(m)}(τ, t) satisfy

∂/∂t g_{αα}(s; τ, t) = g_{αα}(s; τ, t)ν_{αα} + g_{αβ}(s; τ, t)ν_{βα},
∂/∂t g_{αβ}(s; τ, t) = g_{αα}(s; τ, t)sν_{αβ} + g_{αβ}(s; τ, t)ν_{ββ},   α ≠ β; α, β = 1, 2,   (2.4)

5.2] MULTIPLE EXIT TRANSITION PROBABILITIES, P_{αβ}^{(m)}(t)

with the initial conditions

g_{αα}(s; τ, τ) = 1,  g_{αβ}(s; τ, τ) = 0,   α ≠ β; α, β = 1, 2.   (2.5)

The differential equations in (2.4) can be solved in exactly the same way
as those in (2.13), Chapter 4. Here the characteristic equation is

ρ² − (ν_{αα} + ν_{ββ})ρ + (ν_{αα}ν_{ββ} − sν_{αβ}ν_{βα}) = 0   (2.6)

with the roots

ρ_1 = ½[(ν_{αα} + ν_{ββ}) + √((ν_{αα} − ν_{ββ})² + 4sν_{αβ}ν_{βα})]
and   (2.7)
ρ_2 = ½[(ν_{αα} + ν_{ββ}) − √((ν_{αα} − ν_{ββ})² + 4sν_{αβ}ν_{βα})].

Notice that, except for the appearance of s, (2.6) and (2.7) are identical
to the corresponding expressions in (2.16) and (2.17) in Chapter 4. The
particular solution of (2.4) corresponding to the initial condition (2.5)
turns out to be

g_{αα}(s; τ, t) = Σ_{i=1}^{2} [(ρ_i − ν_{ββ})/(ρ_i − ρ_j)] e^{ρ_i(t−τ)}
and   (2.8)
g_{αβ}(s; τ, t) = Σ_{i=1}^{2} [sν_{αβ}/(ρ_i − ρ_j)] e^{ρ_i(t−τ)},   j ≠ i; α ≠ β; α, β = 1, 2.

When s = 1,

g_{αα}(1; τ, t) = P_{αα}(τ, t)  and  g_{αβ}(1; τ, t) = P_{αβ}(τ, t),   α ≠ β; α, β = 1, 2.   (2.9)

The p.g.f.'s in (2.8) depend only on the difference (t − τ), and thus the
process is again homogeneous with respect to time, as we would anticipate
from the transition probabilities P_{αβ}(t). Once again we may let τ = 0
and let t be the length of the time interval in which the number
of transitions from S_α to S_β is counted, and write

M_{αβ}(0, t) = M_{αβ}(t),   (2.10)
P_{αβ}^{(m)}(0, t) = P_{αβ}^{(m)}(t).   (2.11)
With this notation the p.g.f.'s in (2.8) become

g_{αα}(s; t) = Σ_{i=1}^{2} [(ρ_i − ν_{ββ})/(ρ_i − ρ_j)] e^{ρ_i t},
g_{αβ}(s; t) = Σ_{i=1}^{2} [sν_{αβ}/(ρ_i − ρ_j)] e^{ρ_i t},   j ≠ i; α ≠ β; α, β = 1, 2.   (2.13)

The p.g.f.'s in (2.13) have s under all the square-root signs in ρ_i; therefore
it is not practical to derive the multiple transition probabilities

P_{αα}^{(m)}(t)  and  P_{αβ}^{(m)}(t)   (2.14)

by taking derivatives. A simpler way is to expand the
right side of each of the equations (2.13) as a power series in s.
Writing D = (ν_{αα} − ν_{ββ})² + 4sν_{αβ}ν_{βα} for brevity, we let

g_{αα}(s; t) = e^{(t/2)(ν_{αα}+ν_{ββ})} [ ((ν_{αα} − ν_{ββ})/(2√D))(e^{(t/2)√D} − e^{−(t/2)√D})
    + ½(e^{(t/2)√D} + e^{−(t/2)√D}) ]   (2.16)

and use the expansions

(1/(2√D))(e^{(t/2)√D} − e^{−(t/2)√D}) = Σ_{k=0}^{∞} [1/(2k + 1)!] (t/2)^{2k+1} D^k   (2.17)

and

½(e^{(t/2)√D} + e^{−(t/2)√D}) = Σ_{k=0}^{∞} [1/(2k)!] (t/2)^{2k} D^k.   (2.18)

Using (2.15) and substituting (2.17) and (2.18) in (2.16), we have

g_{αα}(s; t) = e^{(t/2)(ν_{αα}+ν_{ββ})} Σ_{k=0}^{∞} [ (ν_{αα} − ν_{ββ})(t/2)^{2k+1}/(2k + 1)! + (t/2)^{2k}/(2k)! ]
    × [(ν_{αα} − ν_{ββ})² + 4ν_{αβ}ν_{βα}s]^k.   (2.19)
Expanding [(ν_{αα} − ν_{ββ})² + 4ν_{αβ}ν_{βα}s]^k by the binomial formula and
collecting the coefficients of s^m give

g_{αα}(s; t) = e^{(t/2)(ν_{αα}+ν_{ββ})} Σ_{k=0}^{∞} Σ_{m=0}^{k} C(k, m)(ν_{αα} − ν_{ββ})^{2(k−m)}(4ν_{αβ}ν_{βα})^m s^m
    × [ (ν_{αα} − ν_{ββ})(t/2)^{2k+1}/(2k + 1)! + (t/2)^{2k}/(2k)! ].   (2.20)

The probability P_{αα}^{(m)}(t) is the coefficient of s^m in (2.20),

P_{αα}^{(m)}(t) = (4ν_{αβ}ν_{βα})^m e^{(t/2)(ν_{αα}+ν_{ββ})} Σ_{k=m}^{∞} C(k, m)(ν_{αα} − ν_{ββ})^{2(k−m)}
    × [ (ν_{αα} − ν_{ββ})(t/2)^{2k+1}/(2k + 1)! + (t/2)^{2k}/(2k)! ],
    m = 0, 1, …; α ≠ β; α, β = 1, 2.   (2.21)

In the same way,

g_{αβ}(s; t) = 2sν_{αβ} e^{(t/2)(ν_{αα}+ν_{ββ})} Σ_{k=0}^{∞} [1/(2k + 1)!] (t/2)^{2k+1}
    × [(ν_{αα} − ν_{ββ})² + 4ν_{αβ}ν_{βα}s]^k,   (2.22)

and the coefficient of s^m gives

P_{αβ}^{(m)}(t) = 2ν_{αβ}(4ν_{αβ}ν_{βα})^{m−1} e^{(t/2)(ν_{αα}+ν_{ββ})} Σ_{k=m−1}^{∞} C(k, m − 1)(ν_{αα} − ν_{ββ})^{2(k−m+1)}
    × (t/2)^{2k+1}/(2k + 1)!,
    m = 1, 2, …; α ≠ β; α, β = 1, 2.   (2.24)
For m = 1 we have the first passage probability that exactly one transition
will take place from S_α to S_β during a time period of length t,

P_{αβ}^{(1)}(t) = 2ν_{αβ} e^{(t/2)(ν_{αα}+ν_{ββ})} Σ_{k=0}^{∞} [1/(2k + 1)!] (t/2)^{2k+1} (ν_{αα} − ν_{ββ})^{2k}
    = [ν_{αβ}/(ν_{αα} − ν_{ββ})] (e^{ν_{αα}t} − e^{ν_{ββ}t}).   (2.25)
It should be noted that, for fixed α, β, and t, the sum of the probabilities
P_{αβ}^{(m)}(t) over all possible values of m is less than one. In fact, using (2.17)
and (2.27) in Chapter 4 and (2.21) above, we find the sums

Σ_{m=0}^{∞} P_{αα}^{(m)}(t) = P_{αα}(t)   (2.26)

and

Σ_{m=1}^{∞} P_{αβ}^{(m)}(t) = P_{αβ}(t).   (2.28)

Because of (5.1) in Chapter 4, both of these probabilities are less than one.
Thus there is a positive probability that the random variables M_{αβ}(t) will
not take on any value at all, and hence, as noted earlier, they are improper
random variables. Since an individual in state S_α at time 0 may be in S_α
or in S_β at time t, or may even have left the population through death
prior to time t, it is not surprising that the probabilities (2.26) and (2.28)
are less than one. The "properized" random variables M*_{αβ}(t) have the
probability distribution

Pr{M*_{αβ}(t) = m} = P_{αβ}^{(m)}(t)/P_{αβ}(t),   m = 0, 1, …,   (2.29)

with

Σ_{m=0}^{∞} Pr{M*_{αβ}(t) = m} = 1.   (2.31)
Differentiating the p.g.f.'s in (2.13) with respect to s and setting s = 1
yield explicit formulas for the expectations of the properized random
variables, E[M*_{αα}(t)] in (2.32) and E[M*_{αβ}(t)] in (2.33); both are expressed
in terms of e^{ρ_1 t}, e^{ρ_2 t}, and te^{ρ_i t}, where ρ_1 and ρ_2 are given in (2.7)
with s = 1, P_{αα}(t) in (2.26), and P_{αβ}(t) in (2.28).
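The series (2.24), together with the identity Σ_m P_{αβ}^{(m)}(t) = P_{αβ}(t) in (2.28), can be checked numerically. The sketch below assumes hypothetical intensity values (they are not from the text), and the function name is likewise illustrative.

```python
import math

# Hypothetical intensities for the two illness states (not from the text):
# nu_11, nu_22 < 0; the deficits -nu_11 - nu_12 and -nu_22 - nu_21 are the
# total death intensities mu_1 and mu_2.
a, b = -1.0, -1.5      # nu_11, nu_22
v12, v21 = 0.6, 0.8    # nu_12, nu_21
t = 1.0

def p_m_12(m, t, kmax=60):
    """P^(m)_12(t): probability of exactly m transitions S1 -> S2 during
    (0, t), with the individual in S2 at t, from the power series (2.24)."""
    s = sum(math.comb(k, m - 1) * (a - b) ** (2 * (k - m + 1))
            * (t / 2) ** (2 * k + 1) / math.factorial(2 * k + 1)
            for k in range(m - 1, kmax))
    return 2 * v12 * (4 * v12 * v21) ** (m - 1) * math.exp((a + b) * t / 2) * s

# Plain transition probability P_12(t) from Chapter 4, using the roots of
# rho^2 - (a + b) rho + (a b - nu_12 nu_21) = 0, i.e. (2.7) with s = 1.
d = math.sqrt((a - b) ** 2 + 4 * v12 * v21)
r1, r2 = (a + b + d) / 2, (a + b - d) / 2
P12 = v12 * (math.exp(r1 * t) - math.exp(r2 * t)) / (r1 - r2)

# The multiple transition probabilities are deficient: they sum to P_12(t) < 1.
total = sum(p_m_12(m, t) for m in range(1, 30))
print(total, P12)
```

The m = 1 term can also be compared with the closed form (2.25).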
Since an individual reaching R_δ by time t after m exit transitions from S_α
must make the last transition either directly from S_α or from S_β,

Q_{αδ}^{(m)}(t) = ∫_0^t P_{αα}^{(m−1)}(τ)μ_{αδ} dτ + ∫_0^t P_{αβ}^{(m)}(τ)μ_{βδ} dτ,
    m = 1, 2, …; α ≠ β; α, β = 1, 2; δ = 1, …, r.   (3.3)

The first term on the right side is the probability of the multiple transition
leading to R_δ from the original state S_α, and the second term is the prob-
ability of the transition leading to R_δ from S_β.
Substituting (2.21) for P_{αα}^{(m−1)}(τ) in the first integral of (3.3) gives

∫_0^t P_{αα}^{(m−1)}(τ)μ_{αδ} dτ = (4ν_{αβ}ν_{βα})^{m−1}μ_{αδ} Σ_{k=m−1}^{∞} C(k, m − 1)(ν_{αα} − ν_{ββ})^{2(k−m+1)}
    × ∫_0^t e^{−θτ} [ (ν_{αα} − ν_{ββ})(τ/2)^{2k+1}/(2k + 1)! + (τ/2)^{2k}/(2k)! ] dτ,   (3.4)

where θ = −½(ν_{αα} + ν_{ββ}) > 0. The integrals are incomplete gamma
functions (see Problem 13); carrying out the integration term by term gives

∫_0^t P_{αα}^{(m−1)}(τ)μ_{αδ} dτ = 2(4ν_{αβ}ν_{βα})^{m−1}μ_{αδ} Σ_{k=m−1}^{∞} C(k, m − 1)(ν_{αα} − ν_{ββ})^{2(k−m+1)}
    × { [(ν_{αα} − ν_{ββ})/(ν_{αα} + ν_{ββ})^{2(k+1)}] (1 − e^{−θt} Σ_{l=0}^{2k+1} (θt)^l/l!)
        − [1/(ν_{αα} + ν_{ββ})^{2k+1}] (1 − e^{−θt} Σ_{l=0}^{2k} (θt)^l/l!) }.   (3.7)
5.3] TRANSITION PROBABILITIES LEADING TO DEATH, Q_{αδ}^{(m)}(t)
Similarly, substituting (2.24) in the second integral of (3.3) gives

∫_0^t P_{αβ}^{(m)}(τ)μ_{βδ} dτ = 2(4ν_{αβ}ν_{βα})^{m−1} 2ν_{αβ}μ_{βδ} Σ_{k=m−1}^{∞} C(k, m − 1)
    × [(ν_{αα} − ν_{ββ})^{2(k−m+1)}/(ν_{αα} + ν_{ββ})^{2(k+1)}] (1 − e^{−θt} Σ_{l=0}^{2k+1} (θt)^l/l!).   (3.8)

The sum of (3.7) and (3.8) is the required probability Q_{αδ}^{(m)}(t),
    m = 1, 2, …; α ≠ β; α, β = 1, 2; δ = 1, …, r.   (3.9)

It should be noted that the random variables D_{αδ}(t) also are improper
and that

Σ_{m=1}^{∞} Q_{αδ}^{(m)}(t) = Q_{αδ}(t) < 1.   (3.10)
The corresponding properized random variables D*_{αδ}(t) have the distribution

Pr{D*_{αδ}(t) = m} = Q_{αδ}^{(m)}(t)/Q_{αδ}(t),   m = 1, 2, …,   (3.11)

and the p.g.f.

h_{αδ}(s; t) = Σ_{m=1}^{∞} s^m Q_{αδ}^{(m)}(t).   (3.12)

The two components of the p.g.f. corresponding to the two integrals in (3.3),

h_{αδ,α}(s; t) = Σ_{m=1}^{∞} s^m ∫_0^t P_{αα}^{(m−1)}(τ)μ_{αδ} dτ = ∫_0^t s g_{αα}(s; τ)μ_{αδ} dτ

and

h_{αδ,β}(s; t) = Σ_{m=1}^{∞} s^m ∫_0^t P_{αβ}^{(m)}(τ)μ_{βδ} dτ = ∫_0^t g_{αβ}(s; τ)μ_{βδ} dτ,

become, on substituting (2.13),

h_{αδ,α}(s; t) = Σ_{i=1}^{2} s(ρ_i − ν_{ββ})μ_{αδ}(e^{ρ_i t} − 1)/[ρ_i(ρ_i − ρ_j)]   (3.16)

and

h_{αδ,β}(s; t) = Σ_{i=1}^{2} sν_{αβ}μ_{βδ}(e^{ρ_i t} − 1)/[ρ_i(ρ_i − ρ_j)],   (3.17)

so that

h_{αδ}(s; t) = Σ_{i=1}^{2} s(ρ_i − ν_{ββ})μ_{αδ}(e^{ρ_i t} − 1)/[ρ_i(ρ_i − ρ_j)]
    + Σ_{i=1}^{2} sν_{αβ}μ_{βδ}(e^{ρ_i t} − 1)/[ρ_i(ρ_i − ρ_j)],
    i ≠ j; α ≠ β; i, j, α, β = 1, 2; δ = 1, …, r,   (3.18)

where the roots ρ_i = ρ_i(s) depend on s through (2.7).
Formulas (3.16), (3.17), and (3.18) can be used to compute the expecta-
tions and variances of the corresponding "properized" random variables.
We illustrate this with an example. Let D*_{αδ,β}(t) be the number of times
that a person leaves state S_α, given that he is in S_α at time 0 and enters R_δ
by way of S_β during the time interval (0, t). If S_α denotes the state of health
and S_β the state of illness, then D*_{αδ,β}(t) is the number of times a person
becomes ill before dying from an illness in (0, t). The corresponding
probability is

Pr{D*_{αδ,β}(t) = m} = Q_{αδ,β}^{(m)}(t)/Q_{αδ,β}(t),   m = 1, 2, …,   (3.19)

where

Q_{αδ,β}(t) = Σ_{m=1}^{∞} Q_{αδ,β}^{(m)}(t) = h_{αδ,β}(1; t)
    = Σ_{i=1}^{2} ν_{αβ}μ_{βδ}(e^{ρ_i t} − 1)/[ρ_i(ρ_i − ρ_j)],   (3.20)

with ρ_i as given in (2.7) for s = 1. The expectation of D*_{αδ,β}(t) can be
computed as follows:

E[D*_{αδ,β}(t)] = [∂h_{αδ,β}(s; t)/∂s]_{s=1} / Q_{αδ,β}(t);   (3.21)

carrying out the differentiation expresses (3.21) in terms of (e^{ρ_i t} − 1),
te^{ρ_i t}, and the quantity (ρ_1 − ρ_2)² − 2ν_{αβ}ν_{βα}.
4. CHAPMAN-KOLMOGOROV EQUATIONS
In deriving differential equations (2.2) for the multiple transition prob-
abilities, we have implicitly assumed that the future transitions of an
individual are stochastically independent of his past transitions. Since the
process is homogeneous with respect to time, this assumption implies
that, for 0 < τ < t,

P_{αα}^{(m)}(t) = Σ_{l=0}^{m} P_{αα}^{(l)}(τ)P_{αα}^{(m−l)}(t − τ) + Σ_{l=1}^{m} P_{αβ}^{(l)}(τ)P_{βα}^{(m−l+1)}(t − τ)   (4.1)

and

P_{αβ}^{(m)}(t) = Σ_{l=0}^{m−1} P_{αα}^{(l)}(τ)P_{αβ}^{(m−l)}(t − τ) + Σ_{l=1}^{m} P_{αβ}^{(l)}(τ)P_{ββ}^{(m−l)}(t − τ).   (4.2)

The first sum in (4.1),

Σ_{l=0}^{m} P_{αα}^{(l)}(τ)P_{αα}^{(m−l)}(t − τ),   (4.3)

is the convolution of the sequences {P_{αα}^{(l)}(τ)} and {P_{αα}^{(l)}(t − τ)}, whose p.g.f.
is the product g_{αα}(s; τ)g_{αα}(s; t − τ); the second sum,

Σ_{l=1}^{m} P_{αβ}^{(l)}(τ)P_{βα}^{(m−l+1)}(t − τ),   (4.5)

involves the sequences {P_{αβ}^{(l)}(τ)}
and {P_{βα}^{(l)}(t − τ)}. Making adjustment for the power of s, the p.g.f. of
(4.5) is equal to s^{−1}g_{αβ}(s; τ)g_{βα}(s; t − τ).
Equation (4.8) can be verified in the same way as was equation (3.2) in
Chapter 4; this shows that the multiple transition probabilities in (2.21)
and (2.24) satisfy equation (4.1).
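The convolution identity (4.1) can also be verified numerically from the series (2.21) and (2.24). A sketch with hypothetical intensities (the values and function names are illustrative only):

```python
import math

a, b = -1.0, -1.5      # hypothetical nu_11, nu_22 (not from the text)
v12, v21 = 0.6, 0.8    # hypothetical nu_12, nu_21

def p_aa(m, t, kmax=60):
    """P^(m)_11(t) from (2.21): in S1 at t after m transitions S1 -> S2."""
    s = sum(math.comb(k, m) * (a - b) ** (2 * (k - m))
            * ((a - b) * (t / 2) ** (2 * k + 1) / math.factorial(2 * k + 1)
               + (t / 2) ** (2 * k) / math.factorial(2 * k))
            for k in range(m, kmax))
    return (4 * v12 * v21) ** m * math.exp((a + b) * t / 2) * s

def p_ab(m, t, v_out, kmax=60):
    """P^(m)_12(t) from (2.24) when v_out = nu_12; with v_out = nu_21 it gives
    P^(m)_21(t), since (2.24) depends on nu_11, nu_22 only through
    (nu_11 - nu_22)^2 and nu_11 + nu_22."""
    s = sum(math.comb(k, m - 1) * (a - b) ** (2 * (k - m + 1))
            * (t / 2) ** (2 * k + 1) / math.factorial(2 * k + 1)
            for k in range(m - 1, kmax))
    return 2 * v_out * (4 * v12 * v21) ** (m - 1) * math.exp((a + b) * t / 2) * s

m, tau, t = 2, 0.4, 1.0
lhs = p_aa(m, t)                      # left side of (4.1)
rhs = (sum(p_aa(l, tau) * p_aa(m - l, t - tau) for l in range(m + 1))
       + sum(p_ab(l, tau, v12) * p_ab(m - l + 1, t - tau, v21)
             for l in range(1, m + 1)))
print(lhs, rhs)
```

For m = 0 the identity reduces to e^{ν_{11}t} = e^{ν_{11}τ}e^{ν_{11}(t−τ)}, a useful sanity check.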
Using the same argument we see that for equation (4.2) to hold true
for all m= 1 ,
2 ,
. . . ,
it is necessary and sufficient that the corresponding
p.g.f.’s satisfy the following equation
fixed time τ between 0 and t, the individual either (i) has already reached
R_δ after leaving S_α m times, or (ii) has left S_α l times and is now in S_α, or
(iii) has left S_α l times and is now in S_β. These possibilities are mutually
exclusive, so that

Q_{αδ}^{(m)}(t) = Q_{αδ}^{(m)}(τ) + Σ_{l=0}^{m−1} P_{αα}^{(l)}(τ)Q_{αδ}^{(m−l)}(t − τ) + Σ_{l=1}^{m} P_{αβ}^{(l)}(τ)Q̄_{βδ}^{(m−l)}(t − τ),   (4.10)

where Q̄_{βδ}^{(m−l)}(t − τ) is the probability that the individual, starting at τ
in S_β, leaves S_α m − l times during (τ, t) and is in R_δ by t. In other words,
Q̄_{βδ}^{(m−l)}(t − τ) decomposes according to whether the individual enters R_δ
from S_α or from S_β (4.11). Consequently the required equation is

Q_{αδ}^{(m)}(t) = Q_{αδ}^{(m)}(τ) + Σ_{l=0}^{m−1} P_{αα}^{(l)}(τ)Q_{αδ}^{(m−l)}(t − τ) + Σ_{l=1}^{m} P_{αβ}^{(l)}(τ)Q̄_{βδ}^{(m−l)}(t − τ),
    α ≠ β; α, β = 1, 2; δ = 1, …, r; m = 1, 2, ….   (4.12)
Direct verification of the Chapman-Kolmogorov equations (4.1) and (4.2)
with the explicit formulas of Section 2 leads to the convolution identities
(5.1)-(5.5) among the multiple transition probabilities. The convolutions
contain integrals of the form

∫_0^t (τ/2)^{2i+1}[(t − τ)/2]^{2j+1} dτ = 2(t/2)^{2k+1}(2i + 1)!(2j + 1)!/(2k + 1)!,  k = i + j + 1,

which demonstrates the identity

∫_0^1 y^{2i+1}(1 − y)^{2j+1} dy = (2i + 1)!(2j + 1)!/(2k + 1)!.   (5.6)

5.5] MORE IDENTITIES FOR MULTIPLE TRANSITION PROBABILITIES
5.5] MORE IDENTITIES FOR MULTIPLE TRANSITION PROBABILITIES 103
along the vertical (j + 1) and the horizontal (/) axes, whereas the right side
of (5.8) represents the summation taken first along the diagonal lines
lc = +j +
i 1 = and then added over
const., all the diagonals indicated.
Equality (5.8) thus becomes obvious.
(1 - c)~
m = (1 - c)-'(l - (5.10)
where |c| < 1. Using Newton’s binomial formula to expand both sides
n
of (5.10) and equating the coefficients of c yield
(-i r =k-i)
i=0
(5.11)
Im + n — 1\
_y // + n — — i 1\ im — + —
l i 1\
(5.12)
l m - 1 / ~r=o\ l- 1 j { m-l-1 )'
Evaluating the convolution of the series (2.21) and (2.24) with (5.6)
term by term gives a double sum over i and j in which k = i + j + 1. Now
(5.8) and (5.9) are used successively, and the quantity inside the brackets
is reduced to

Σ_{k=m−1}^{∞} C(k, m − 1)(ν_{αα} − ν_{ββ})^{2(k−m+1)} (t/2)^{2k+1}/(2k + 1)!,   (5.14)

in agreement with the series in (2.24) for P_{αβ}^{(m)}(t).
P̄_{αβ}^{(n)}(τ, t) = Pr{N_{αβ}(τ, t) = n}
    = Pr{an individual in S_α at τ will enter S_β n times during (τ, t) and will be in S_β at t},
    α, β = 1, 2; n = 0, 1, ….   (6.1)

To show that these multiple entrance transition probabilities coincide with
the multiple exit transition probabilities of Section 2, we derive
the p.g.f.'s of P̄_{αβ}^{(n)}(τ, t) and show that they are the same as the p.g.f.'s of
P_{αβ}^{(n)}(τ, t). Enumerating the possibilities in two contiguous intervals
(τ, t) and (t, t + Δ) gives

P̄_{αα}^{(n)}(τ, t + Δ) = P̄_{αα}^{(n)}(τ, t)(1 + ν_{αα}Δ) + P̄_{αβ}^{(n)}(τ, t)ν_{βα}Δ,
P̄_{αβ}^{(n)}(τ, t + Δ) = P̄_{αα}^{(n−1)}(τ, t)ν_{αβ}Δ + P̄_{αβ}^{(n)}(τ, t)(1 + ν_{ββ}Δ),

and hence the differential equations

d/dt P̄_{αα}^{(n)}(τ, t) = P̄_{αα}^{(n)}(τ, t)ν_{αα} + P̄_{αβ}^{(n)}(τ, t)ν_{βα},
d/dt P̄_{αβ}^{(n)}(τ, t) = P̄_{αα}^{(n−1)}(τ, t)ν_{αβ} + P̄_{αβ}^{(n)}(τ, t)ν_{ββ},   α ≠ β; α, β = 1, 2.

The p.g.f.'s

ḡ_{αα}(s; τ, t) = Σ_{n=0}^{∞} s^n P̄_{αα}^{(n)}(τ, t)  and  ḡ_{αβ}(s; τ, t) = Σ_{n=0}^{∞} s^n P̄_{αβ}^{(n)}(τ, t)   (6.5)

106 MULTIPLE TRANSITIONS IN SIMPLE ILLNESS-DEATH PROCESS

satisfy the same system of differential equations as in (2.4), with the initial
conditions

ḡ_{αα}(s; τ, τ) = 1,  ḡ_{αβ}(s; τ, τ) = 0,   (6.7)

so that

ḡ_{αα}(s; τ, t) = Σ_{i=1}^{2} [(ρ_i − ν_{ββ})/(ρ_i − ρ_j)] e^{ρ_i(t−τ)},
ḡ_{αβ}(s; τ, t) = Σ_{i=1}^{2} [sν_{αβ}/(ρ_i − ρ_j)] e^{ρ_i(t−τ)},   j ≠ i; α ≠ β; α, β = 1, 2,   (6.8)

where ρ_1 and ρ_2 are given in (2.7). The formulas in (6.8) are identical with
those in (2.8); hence P̄_{αβ}^{(n)}(τ, t) = P_{αβ}^{(n)}(τ, t).
The random variable T_{αβ}^{(1)}, the time at which an individual in S_α at
time 0 first enters S_β, is the first passage time. If S_α stands for the state of health and
S_β for the state of illness, then T_{αβ}^{(1)} is the length of time a person enjoys
his health before an illness occurs, and T_{βα}^{(1)} is the time needed to recover
from the illness.

To derive the probability density function f_{αβ}^{(m)}(t) of the multiple transition
time T_{αβ}^{(m)}, let us consider two contiguous time intervals (0, t) and (t, t + dt);
direct enumeration gives

f_{αα}^{(m)}(t) dt = P_{αβ}^{(m)}(t)ν_{βα} dt  and  f_{αβ}^{(m)}(t) dt = P_{αα}^{(m−1)}(t)ν_{αβ} dt,   m = 1, 2, ….   (7.1)

5.7] MULTIPLE TRANSITION TIME, T_{αβ}^{(m)}

Substituting (2.24) and (2.21) for the multiple transition probabilities in (7.1)
gives the explicit forms

f_{αα}^{(m)}(t) dt = ½(4ν_{αβ}ν_{βα})^m e^{(t/2)(ν_{αα}+ν_{ββ})} Σ_{k=m−1}^{∞} C(k, m − 1)(ν_{αα} − ν_{ββ})^{2(k−m+1)}
    × (t/2)^{2k+1}/(2k + 1)! dt   (7.2)

and

f_{αβ}^{(m)}(t) dt = (4ν_{αβ}ν_{βα})^{m−1}ν_{αβ} e^{(t/2)(ν_{αα}+ν_{ββ})} Σ_{k=m−1}^{∞} C(k, m − 1)(ν_{αα} − ν_{ββ})^{2(k−m+1)}
    × [ (ν_{αα} − ν_{ββ})(t/2)^{2k+1}/(2k + 1)! + (t/2)^{2k}/(2k)! ] dt.   (7.3)
The corresponding distribution functions, obtained by integrating the
densities term by term as in Section 3, are given in (7.5) and (7.6); they
involve sums of the form

[1/(ν_{αα} + ν_{ββ})^{2(k+1)}] (1 − e^{−θt} Σ_{l=0}^{2k+1} (θt)^l/l!),

where

θ = −½(ν_{αα} + ν_{ββ}).   (7.7)

In particular, the first return time T_{αα}^{(1)} has the probability density
function
f_{αα}^{(1)}(t) = ν_{αβ}ν_{βα}(e^{ν_{αα}t} − e^{ν_{ββ}t})/(ν_{αα} − ν_{ββ}),   (7.9)

and the first passage time T_{αβ}^{(1)} has the density and distribution functions

f_{αβ}^{(1)}(t) dt = e^{ν_{αα}t}ν_{αβ} dt   (7.10)

and

F_{αβ}^{(1)}(t) = (ν_{αβ}/ν_{αα})(e^{ν_{αα}t} − 1).   (7.11)

The multiple transition times are also improper random variables, since
∫_0^∞ f_{αβ}^{(m)}(t) dt < 1; properized random variables T*_{αβ}^{(m)} are defined
with the densities

f_{αβ}^{(m)}(t) / ∫_0^∞ f_{αβ}^{(m)}(τ) dτ,   m = 1, 2, …; α, β = 1, 2.   (7.15)

The expectation E[T*_{αβ}^{(m)}] means the average length of time needed for m
transitions from S_α to S_β among those individuals who reach S_β from S_α
in m transitions. Exact formulas for the expectation and variance of
T*_{αα}^{(m)} and T*_{αβ}^{(m)} follow from term-by-term integration of the series in
(7.2) and (7.3); the resulting expressions, given in (7.17) and (7.18) for
α ≠ β; α, β = 1, 2; m = 1, 2, …, reduce for m = 1 to

E[T*_{αα}^{(1)}] = −(1/ν_{αα} + 1/ν_{ββ})   (7.19)

and

E[T*_{αβ}^{(1)}] = −1/ν_{αα}.   (7.20)
Given an individual in S_α, let T̃_{αδ}^{(m)} be the length of time until he enters
a death state R_δ after making m exit transitions from S_α. The probability
density function of T̃_{αδ}^{(m)} is given by

φ_{αδ}^{(m)}(t) dt = [P_{αα}^{(m−1)}(t)μ_{αδ} + P_{αβ}^{(m)}(t)μ_{βδ}] dt,   (7.21)

so that

Q_{αδ}^{(m)}(t) = ∫_0^t φ_{αδ}^{(m)}(τ) dτ.   (7.22)

For m = 1,

φ_{αδ}^{(1)}(t) dt = { e^{ν_{αα}t}μ_{αδ} + [ν_{αβ}/(ν_{αα} − ν_{ββ})](e^{ν_{αα}t} − e^{ν_{ββ}t})μ_{βδ} } dt.   (7.23)

The random variables T̃_{αδ}^{(m)} also are improper, since ∫_0^∞ φ_{αδ}^{(m)}(t) dt =
Q_{αδ}^{(m)}(∞) < 1.
However, since the R_δ are absorbing states, starting with any state S_α
an individual will sooner or later be in one of the death states R_δ. This
means that

Σ_{δ=1}^{r} Σ_{m=1}^{∞} ∫_0^∞ φ_{αδ}^{(m)}(t) dt = 1.   (7.26)
For fixed l, this length of time may be divided into two periods: a period
of length T_{αα}^{(l)} for making l transitions from S_α to S_α, and a second period
of length T_{αα}^{(m−l)} for making the remaining (m − l) transitions. Therefore

T_{αα}^{(m)} = T_{αα}^{(l)} + T_{αα}^{(m−l)},   l = 1, …, m − 1.   (7.27)

In the same way, we have two identities for the multiple transition time
T_{αβ}^{(m)}:

T_{αβ}^{(m)} = T_{αα}^{(l)} + T_{αβ}^{(m−l)}   (7.29)

and

T_{αβ}^{(m)} = T_{αβ}^{(l)} + T_{ββ}^{(m−l)},   l = 1, …, m − 1.   (7.30)

Because the process is homogeneous with respect to time and independent of
future transitions from past transitions, (7.27) holds true if and only if the
corresponding probability density functions have the convolution relation

f_{αα}^{(m)}(t) = ∫_0^t f_{αα}^{(l)}(τ)f_{αα}^{(m−l)}(t − τ) dτ.   (7.32)

More generally, repeated application of (7.27), (7.29), and (7.30) gives

T_{αα}^{(m)} = T_{αα}^{(m_1)} + ⋯ + T_{αα}^{(m_k)},   (7.34)

with analogous decompositions (7.35)-(7.37) of T_{αα}^{(m)} and T_{αβ}^{(m)} in terms
of first passage and return times,
where m_1 + ⋯ + m_k = m.

PROBLEMS
PROBLEMS
1. Differential equations for the multiple exit transition probabilities can be
derived with a slightly different approach. By considering two contiguous
intervals (τ − Δ, τ) and (τ, t), derive backward differential equations for
P_{αβ}^{(m)}(τ, t), for m = 0, 1, ….
(c) Solve the differential equations in (b) for g_{αα}(s; τ, t) and g_{βα}(s; τ, t).
(d) Obtain the multiple transition probabilities from the p.g.f.'s in (c).
j Oo 0')%
= +
t
and also
jP't'lV)
= vO) +
6. Probability of first return. Verify Equation (2.23) for the probability of
first return to the original state within a time interval of length t.
11. Derive from the second equation in (2.13) an explicit formula for the
p.g.f. and, from this, the formula for P_{αβ}^{(m)}(t) in (2.24).
12. Verify the expectations E[M*_{αα}(t)] in (2.32) and E[M*_{αβ}(t)] in (2.33).
13. Evaluate the incomplete gamma function ∫_0^t y^n e^{−y} dy, and use the result
to verify (3.5).
14. Verify the formulas for Q_{αδ}^{(m)}(t) in (3.7) and in (3.8).
15. Find Q_{αδ,α}^{(m)}(t) and Q_{αδ,β}^{(m)}(t) from the corresponding p.g.f.'s in (3.16) and
(3.17).
∫_0^1 y^{2i+1}(1 − y)^{2j+1} dy.   (5.7)
19. Use Equations (5.6), (5.8), and (5.9) to verify the identities (5.2), (5.4),
and (5.5).
20. Verify the distribution functions of the multiple transition time in (7.5)
and (7.6).
21. Verify the density function and the distribution function of the first
return time, where 0 < τ < t, with the explicit formulas in (2.22), (2.23), and
(2.25).
22. Verify the density function and the distribution function of the first
passage time, where 0 < τ < t, with the explicit formulas in (2.22) and (2.25).
CHAPTER 6
The Kolmogorov Differential Equations
whose value indicates the state of the system; that is, the event "X(t) = j"
is the same as the event "the system is in state j at time t." In the population
growth models of Chapter 3, for example, the state of the system is the
population size and X(t) has an infinite number of possible values, whereas
in the illness-death process of Chapter 4 the state of the system is an
illness or death state of one individual, and the system has r + 2 possible
states.

P_{ij}(τ, t) = Pr{X(t) = j | X(τ) = i},   i, j = 0, 1, ….   (1.1)
6.1] MARKOV PROCESSES

When the process is homogeneous with respect to time, the transition
probabilities depend only on the difference t − τ:

P_{ij}(t − τ) = Pr{X(t) = j | X(τ) = i},   i, j = 0, 1, ….   (1.3)
The simple Poisson process and the illness-death process, for example,
are homogeneous, whereas the Polya process is nonhomogeneous.
The Markov property implies important relations among the transition
probabilities P_{ij}(τ, t). Let ξ be a fixed point in the interval (τ, t) so that
τ < ξ < t, and let X(τ), X(ξ), and X(t) be the corresponding random
variables. Because of equation (1.2) we have

Pr{X(t) = k | X(τ) = i, X(ξ) = j} = Pr{X(t) = k | X(ξ) = j} = P_{jk}(ξ, t),   (1.4)

so that

P_{ik}(t − τ) = Σ_j P_{ij}(ξ − τ)P_{jk}(t − ξ),   τ < ξ < t.
Regularity Assumptions

(i) For every integer i, there exists a continuous function ν_{ii}(τ) ≤ 0
such that

lim_{Δ→0} [1 − P_{ii}(τ, τ + Δ)]/Δ = −ν_{ii}(τ);   (2.1)

(ii) for every pair i ≠ j, there exists a continuous function ν_{ij}(τ) such that

lim_{Δ→0} P_{ij}(τ, τ + Δ)/Δ = ν_{ij}(τ);   (2.2)

furthermore, for fixed j the passage in (2.2) is uniform with respect to i.
The functions ν_{ij}(τ) are called intensity functions. If we agree to require
that

P_{ij}(τ, τ) = δ_{ij},¹   (2.3)

where δ_{ij} is the Kronecker delta, then the intensity functions are, by (2.1)
and (2.2), the derivatives of the transition probabilities at t = τ, and

Σ_j P_{ij}(τ, t) = 1.   (2.8)

¹ The Kronecker delta δ_{ij} is defined as follows: δ_{ii} = 1 and δ_{ij} = 0, j ≠ i.
Considering the intervals (τ, t) and (t, t + Δ), we have

P_{ik}(τ, t + Δ) = P_{ik}(τ, t)P_{kk}(t, t + Δ) + Σ_{j≠k} P_{ij}(τ, t)P_{jk}(t, t + Δ).   (2.10)

Transposing P_{ik}(τ, t) and dividing through by Δ give

[P_{ik}(τ, t + Δ) − P_{ik}(τ, t)]/Δ
    = P_{ik}(τ, t)[P_{kk}(t, t + Δ) − 1]/Δ + Σ_{j≠k} P_{ij}(τ, t)P_{jk}(t, t + Δ)/Δ.   (2.11)

According to Assumption (ii), P_{jk}(t, t + Δ)/Δ → ν_{jk}(t) uniformly in j; the
right side of (2.11) tends to a limit as Δ → 0; hence the left side also tends
to a limit. Therefore the transition probability P_{ik}(τ, t) satisfies the
differential equation

∂/∂t P_{ik}(τ, t) = P_{ik}(τ, t)ν_{kk}(t) + Σ_{j≠k} P_{ij}(τ, t)ν_{jk}(t).   (2.12)

Considering instead the intervals (τ − Δ, τ) and (τ, t) leads to

∂/∂τ P_{ij}(τ, t) = −Σ_k ν_{ik}(τ)P_{kj}(τ, t).   (2.14)
hold true. Then the transition probabilities satisfy the system of forward
differential equations (2.12) with the initial conditions (2.3) and the system
of backward differential equations (2.14); the solution satisfies
both the forward and the backward differential equations as well as the
Chapman-Kolmogorov equation.
In general the solution P_{ik}(τ, t) is nonhomogeneous with respect to time;
however, it may be shown that if the intensity functions are equal to
constants ν_{ij} independent of time, then the process is homogeneous with
respect to time. In this case the two systems of differential equations
become

d/dt P_{ik}(t) = Σ_j P_{ij}(t)ν_{jk}   (2.17)

and

d/dt P_{ik}(t) = Σ_j ν_{ij}P_{jk}(t),   (2.18)

with the initial conditions

P_{ik}(0) = δ_{ik}.   (2.19)

In matrix notation the forward system (2.17), for example, reads

d/dt P(t) = P(t)V,   (2.20)

where P(t) is the matrix of the transition probabilities P_{ij}(t) and V is the
matrix of intensities,

V = (ν_{ij}).   (2.21)
2.2. Examples

Example 1. In the Poisson process the intensity matrix is

V = | −λ   λ   0   0  ⋯ |
    |  0  −λ   λ   0  ⋯ |
    |  0   0  −λ   λ  ⋯ |
    |  ⋯  ⋯  ⋯   ⋯  ⋯ |

and the Kolmogorov equations are the systems (2.17) and (2.18) with these
intensities.

Example 2. In a linear birth-death process with birth intensity λ(t) and
death intensity μ(t) per individual, the intensity matrix is

V(t) = | 0     0                0                0      ⋯ |
       | μ(t)  −[λ(t) + μ(t)]   λ(t)             0      ⋯ |   (2.31)
       | 0     2μ(t)            −2[λ(t) + μ(t)]  2λ(t)  ⋯ |
       | ⋯    ⋯                ⋯                ⋯     ⋯ |

and the forward and backward equations are

∂/∂t P_{ij}(τ, t) = P_{i,j−1}(τ, t)(j − 1)λ(t) − P_{ij}(τ, t)j[λ(t) + μ(t)]
    + P_{i,j+1}(τ, t)(j + 1)μ(t)   (2.32)

and

∂/∂τ P_{ij}(τ, t) = −iλ(τ)P_{i+1,j}(τ, t) + i[λ(τ) + μ(τ)]P_{ij}(τ, t) − iμ(τ)P_{i−1,j}(τ, t).
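For the Poisson process of Example 1, the forward system (2.17) reads dP_{ij}/dt = λ[P_{i,j−1}(t) − P_{ij}(t)], and its solution is the Poisson probability P_{ij}(t) = e^{−λt}(λt)^{j−i}/(j − i)! for j ≥ i. A finite-difference check (the value of λ is hypothetical):

```python
import math

lam = 0.7  # hypothetical Poisson intensity

def p(i, j, t):
    """Poisson transition probability P_ij(t) = e^{-lam t} (lam t)^{j-i}/(j-i)!."""
    if j < i:
        return 0.0
    return math.exp(-lam * t) * (lam * t) ** (j - i) / math.factorial(j - i)

# Forward equation dP_ij/dt = lam * (P_{i,j-1} - P_ij), checked by a central
# difference at t = 1.3 for i = 2, j = 5.
i, j, t, h = 2, 5, 1.3, 1e-6
lhs = (p(i, j, t + h) - p(i, j, t - h)) / (2 * h)
rhs = lam * (p(i, j - 1, t) - p(i, j, t))
print(lhs, rhs)
```

The same check applies at any i ≤ j, and the row sums Σ_j P_{ij}(t) equal one.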
Let W = (w_{ij}) be any s × s matrix; let W′ be its transpose, |W| its
determinant, and W_{ij} the cofactor of w_{ij}. The determinant may be
expanded by the elements of any row,

|W| = Σ_{j=1}^{s} w_{ij}W_{ij},   i = 1, …, s,   (3.2)

or of any column, whereas expansion by alien cofactors vanishes:

Σ_{j=1}^{s} w_{kj}W_{ij} = 0  if  k ≠ i   (3.4)

and

Σ_{i=1}^{s} w_{ik}W_{ij} = 0  if  k ≠ j.   (3.5)
Let M denote the adjoint matrix of W, whose (i, j) element is the cofactor
W_{ji}:

M = | W_{11}  W_{21}  ⋯  W_{s1} |
    | W_{12}  W_{22}  ⋯  W_{s2} |   (3.6)
    | ⋯                        |
    | W_{1s}  W_{2s}  ⋯  W_{ss} |.

By (3.2), (3.4), and (3.5),

MW = | |W|   0   ⋯   0  |
     |  0   |W|  ⋯   0  |  = WM.   (3.7)
     |  0    0   ⋯  |W| |

Thus we have

MW = WM = |W|I,   (3.8)

and hence, when |W| ≠ 0,

W^{−1} = (1/|W|)M   (3.9)

satisfies W^{−1}W = WW^{−1} = I.   (3.10)

If W has an inverse W^{−1}, then |W||W^{−1}| = |WW^{−1}| = |I| = 1; therefore
the determinant |W| ≠ 0. Now let w_{jk}^{−1} be the (j, k) element of an inverse
of W, so that Σ_{j=1}^{s} w_{ij}w_{jk}^{−1} = δ_{ik}; from (3.2) and (3.4) we see that w_{jk}^{−1} =
W_{kj}/|W|. Thus a square matrix W has an inverse if, and only if, |W| ≠ 0;
and the matrix W^{−1} given in (3.9) is the unique inverse of W.
6.3] MATRICES, EIGENVALUES, AND DIAGONALIZATION
A number ρ and a nonzero column vector t satisfying

Wt = ρt,   (3.11)

or

(ρI − W)t = 0,   (3.12)

are called an eigenvalue and an eigenvector of W. We see that (3.11) holds
true for a nontrivial t if and only if the characteristic matrix of W,

(ρI − W) = A,   (3.13)

is singular, that is,

|A| = |ρI − W| = 0.   (3.14)

The determinant |ρI − W| is a polynomial of degree s in ρ,

|ρI − W| = ρ^s − b_1ρ^{s−1} + b_2ρ^{s−2} − ⋯ ± b_s,   (3.15)

in which b_1 = w_{11} + ⋯ + w_{ss},
b_{s−1} = W_{11} + ⋯ + W_{ss}, and b_s = |W|. The coefficient b_1 is called the
trace or spur of W, and is usually written tr W or sp W. The characteristic
polynomial may also be written in terms of the roots ρ_1, …, ρ_s as

|ρI − W| = (ρ − ρ_1) ⋯ (ρ − ρ_s).   (3.16)

Comparing coefficients in (3.15) and (3.16) gives

b_1 = Σ_{i=1}^{s} w_{ii} = Σ_{i=1}^{s} ρ_i,
b_{s−1} = Σ_{i=1}^{s} W_{ii} = ρ_1ρ_2 ⋯ ρ_{s−1} + ⋯ + ρ_2ρ_3 ⋯ ρ_s,   (3.17)
b_s = |W| = ρ_1ρ_2 ⋯ ρ_s.
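For a 2 × 2 matrix the relations (3.15)-(3.17) can be checked directly, since the eigenvalues come from the quadratic ρ² − b_1ρ + b_2 = 0. A small sketch with assumed entries (the matrix is illustrative only):

```python
import math

# Hypothetical 2x2 matrix W.
w11, w12, w21, w22 = 4.0, 2.0, 1.0, 3.0
b1 = w11 + w22               # trace: the sum of the eigenvalues, (3.17)
b2 = w11 * w22 - w12 * w21   # determinant: the product of the eigenvalues, (3.17)
disc = math.sqrt(b1 * b1 - 4 * b2)
r1, r2 = (b1 + disc) / 2, (b1 - disc) / 2

# By (3.16) the characteristic polynomial annihilates W: (W - r1 I)(W - r2 I) = 0.
m1 = [[w11 - r1, w12], [w21, w22 - r1]]
m2 = [[w11 - r2, w12], [w21, w22 - r2]]
prod = [[m1[0][0]*m2[0][0] + m1[0][1]*m2[1][0], m1[0][0]*m2[0][1] + m1[0][1]*m2[1][1]],
        [m1[1][0]*m2[0][0] + m1[1][1]*m2[1][0], m1[1][0]*m2[0][1] + m1[1][1]*m2[1][1]]]
print(r1, r2, prod)
```

Here b_1 = 7, b_2 = 10, so the eigenvalues are 5 and 2 and the product matrix vanishes.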
Let ρ_1, …, ρ_s be distinct eigenvalues of W with corresponding eigen-
vectors T_1, …, T_s, so that

WT_j = ρ_jT_j,   j = 1, …, s.   (3.18)

Now suppose that the eigenvectors are linearly dependent, so that there
exist constants c_j, not all of which are zero (say c_1 ≠ 0), such that

Σ_{j=1}^{s} c_jT_j = 0.   (3.19)

Let U = (ρ_2I − W)(ρ_3I − W) ⋯ (ρ_sI − W). Since the factors of U
commute, for r = 2, …, s,

UT_r = (ρ_2I − W) ⋯ (ρ_{r−1}I − W)(ρ_{r+1}I − W) ⋯ (ρ_sI − W)(ρ_rT_r − WT_r) = 0,   (3.21)

so that

UT_2 = UT_3 = ⋯ = UT_s = 0.   (3.22)

On the other hand,

UT_1 = (ρ_2I − W) ⋯ (ρ_{s−1}I − W)(ρ_sT_1 − WT_1)
     = (ρ_2I − W) ⋯ (ρ_{s−1}I − W)(ρ_s − ρ_1)T_1
     = (ρ_2 − ρ_1) ⋯ (ρ_s − ρ_1)T_1.   (3.23)

Applying U to (3.19) gives

0 = U Σ_{j=1}^{s} c_jT_j = Σ_{j=1}^{s} c_jUT_j = c_1(ρ_2 − ρ_1) ⋯ (ρ_s − ρ_1)T_1.   (3.24)

Since the ρ_j are all distinct, (3.24) implies that c_1 must be zero, which
contradicts the assumption that c_1 ≠ 0, and proves that the eigenvectors are
linearly independent.
Write the eigenvectors side by side as the matrix

T = (T_1, …, T_s).   (3.25)

To examine the structure of the matrix (3.26), we use (3.25) and (3.18)
to write

WT = (WT_1, …, WT_s) = (ρ_1T_1, …, ρ_sT_s)   (3.27)

and hence,

T^{−1}WT = (ρ_1T^{−1}T_1, …, ρ_sT^{−1}T_s).   (3.28)

Since

T^{−1}T = (T^{−1}T_1, …, T^{−1}T_s) = I,   (3.29)

T^{−1}T_j is the jth column of the identity matrix, and therefore

T^{−1}WT = | ρ_1  0   ⋯   0  |
          | 0   ρ_2  ⋯   0  |   (3.30)
          | 0    0   ⋯  ρ_s |.

For any matrix R whose columns are nonzero multiples of the eigenvectors
we also have

R^{−1}WR = | ρ_1  0   ⋯ |
          | 0   ρ_2  ⋯ |   (3.32)
          | 0    0   ⋯ |.

Lemma. If ρ_1, …, ρ_s are distinct, then

Σ_{l=1}^{s} ρ_l^r / ∏_{j≠l}(ρ_l − ρ_j) = 0  for  0 ≤ r < s − 1,
                                   = 1  for  r = s − 1.   (3.33)
Proof.² Consider an arbitrary polynomial of degree s − 1,

f(ρ) = a_1ρ^{s−1} + a_2ρ^{s−2} + ⋯ + a_s,   (3.34)

and the Lagrange interpolation formula

f(ρ) = Σ_{l=1}^{s} f(ρ_l) [(ρ − ρ_1) ⋯ (ρ − ρ_{l−1})(ρ − ρ_{l+1}) ⋯ (ρ − ρ_s)]
    / [(ρ_l − ρ_1) ⋯ (ρ_l − ρ_{l−1})(ρ_l − ρ_{l+1}) ⋯ (ρ_l − ρ_s)],   (3.35)

which is commonly used to
approximate an arbitrary function f(ρ). In our case, with f(ρ) of the form
(3.34), the difference between the two sides of (3.35) is a polynomial of
degree s − 1 with s distinct roots and hence must be identically zero;
therefore (3.35) holds true for all ρ.
2
I benefited from a useful discussion with Grace Lo Yang on this line of proof of
the lemma.
Differentiating both sides of (3.35) s − 1 times with respect to ρ gives on
the left side

d^{s−1}f(ρ)/dρ^{s−1} = (s − 1)! a_1   (3.36)

and on the right side

(s − 1)! Σ_{l=1}^{s} f(ρ_l) / ∏_{j≠l}(ρ_l − ρ_j).   (3.37)

Substituting

f(ρ_l) = a_1ρ_l^{s−1} + a_2ρ_l^{s−2} + ⋯ + a_s   (3.38)

in (3.37) and rearranging yield

(s − 1)! [ a_1 Σ_{l=1}^{s} ρ_l^{s−1}/∏_{j≠l}(ρ_l − ρ_j) + a_2 Σ_{l=1}^{s} ρ_l^{s−2}/∏_{j≠l}(ρ_l − ρ_j) + ⋯
    + a_s Σ_{l=1}^{s} 1/∏_{j≠l}(ρ_l − ρ_j) ].   (3.39)

Since (3.39) must be equal to the right side of (3.36) whatever may be the
a_i ≠ 0, the first sum inside the brackets must be one and the other sums
must be zero, and the lemma is proved.
For each eigenvalue ρ_j, an eigenvector may be obtained from the cofactors
of the characteristic matrix A(j) = (ρ_jI − W). Let A_{kl}(j) denote the
cofactor of the (k, l) element of A(j), and let

T_j(k) = (A_{k1}(j), A_{k2}(j), …, A_{ks}(j))′.   (3.41)

In
other words, any nonzero row of cofactors of the matrix A(j) yields an eigen-
vector. Indeed, by the expansions (3.2), (3.4), and (3.5),

(ρ_jI − W)T_j(k) = (0, …, |A(j)|, …, 0)′ = 0,   (3.42)

since every element given by an alien-cofactor expansion
must vanish, whereas the kth element is equal to the determinant
|A(j)|, which also vanishes, since ρ_j is a root of the equation |ρI − W| = 0.
Writing the eigenvectors T_1(k), …, T_s(k) side by side gives

T(k) = | A_{k1}(1)  A_{k1}(2)  ⋯  A_{k1}(s) |
       | A_{k2}(1)  A_{k2}(2)  ⋯  A_{k2}(s) |   (3.43)
       | ⋯                                |
       | A_{ks}(1)  A_{ks}(2)  ⋯  A_{ks}(s) |.

Clearly the matrix T(k) varies with the choice of k. However, the inverse
matrices T^{−1}(1), …, T^{−1}(s) have an important feature in common: the
kth column of the inverse matrix T^{−1}(k) is independent of k and is given by

(1/∏_{i≠1}(ρ_1 − ρ_i), …, 1/∏_{i≠s}(ρ_s − ρ_i))′.   (3.46)
Before presenting the proof of (3.46), let us illustrate it with an example.
For s = 2,

A(j) = | ρ_j − ν_{11}   −ν_{12}      |
       | −ν_{21}        ρ_j − ν_{22} |

and the eigenvalues are

ρ_1 = ½[(ν_{11} + ν_{22}) + √((ν_{11} − ν_{22})² + 4ν_{12}ν_{21})],
ρ_2 = ½[(ν_{11} + ν_{22}) − √((ν_{11} − ν_{22})² + 4ν_{12}ν_{21})].

For k = 1,

T(1) = | A_{11}(1)  A_{11}(2) |  =  | ρ_1 − ν_{22}   ρ_2 − ν_{22} |
       | A_{12}(1)  A_{12}(2) |     | ν_{21}         ν_{21}       |

and, with |T(1)| = ν_{21}(ρ_1 − ρ_2),

T^{−1}(1) = (1/|T(1)|) | T_{11}(1)  T_{21}(1) |,
                       | T_{12}(1)  T_{22}(1) |

whose first column is (1/(ρ_1 − ρ_2), 1/(ρ_2 − ρ_1))′, as given in (3.46). For
k = 2,

T(2) = | A_{21}(1)  A_{21}(2) |  =  | ν_{12}         ν_{12}       |
       | A_{22}(1)  A_{22}(2) |     | ρ_1 − ν_{11}   ρ_2 − ν_{11} |,

where the second column of T^{−1}(2) is given by (3.46), and is also equal to the first
column of T^{−1}(1).

Proof of (3.46). Use T(k) in (3.43) to formulate the system
of simultaneous equations

T(k)(z_1, …, z_s)′ = (0, …, 0, 1, 0, …, 0)′,   (3.47)

with the 1 in the kth place; by Cramer's rule the solution is

z_j = T_{kj}(k)/|T(k)|,   j = 1, …, s,   (3.48)

the jth element of the kth column of T^{−1}(k).
Written out, the system (3.47) is

A_{k1}(1)z_1 + ⋯ + A_{k1}(s)z_s = 0
⋯
A_{kk}(1)z_1 + ⋯ + A_{kk}(s)z_s = 1   (3.49)
⋯
A_{ks}(1)z_1 + ⋯ + A_{ks}(s)z_s = 0.

From the characteristic polynomial (3.15), the cofactors of A(j) are
polynomials in ρ_j,

A_{kk}(j) = ρ_j^{s−1} + b_{kk2}ρ_j^{s−2} + ⋯ + b_{kks}

and

A_{kl}(j) = b_{kl2}ρ_j^{s−2} + ⋯ + b_{kls},   l ≠ k,   (3.50)

so that the equations in (3.49) become

b_{k12}Σ_{j=1}^{s} z_jρ_j^{s−2} + ⋯ + b_{k1s}Σ_{j=1}^{s} z_jρ_j^0 = 0
⋯
Σ_{j=1}^{s} z_jρ_j^{s−1} + b_{kk2}Σ_{j=1}^{s} z_jρ_j^{s−2} + ⋯ + b_{kks}Σ_{j=1}^{s} z_jρ_j^0 = 1   (3.51)
⋯
b_{ks2}Σ_{j=1}^{s} z_jρ_j^{s−2} + ⋯ + b_{kss}Σ_{j=1}^{s} z_jρ_j^0 = 0.

Our problem is to solve (3.51) for z_j. Recalling the lemma in Section
3.3 [Equation (3.33)], we see that

z_j = 1/∏_{i≠j}(ρ_j − ρ_i),   j = 1, …, s,   (3.52)

makes Σ_j z_jρ_j^r = 0 for 0 ≤ r < s − 1 and Σ_j z_jρ_j^{s−1} = 1, and hence
satisfies (3.51).
Since the determinant of T(k) is not zero, system (3.47) has a unique
solution; hence solutions (3.48) and (3.52) are equal, proving Equation
(3.46) (Chiang [1968]).
Since each row of the intensity matrix V sums to zero,

ν_{ii} = −Σ_{j≠i} ν_{ij},   i = 1, …, s,   (4.2)

so that the matrix V
is singular.
In this chapter we also assume that in the system there are no absorbing
states, that is, no states i for which ν_{ii} = 0. The case in which there are absorbing states
will be treated in Chapter 7.
In finite Markov processes the Kolmogorov differential equations (2.17)
and (2.18) are finite systems, to be solved with the initial conditions

P_{ik}(0) = δ_{ik}.   (4.6)
6.4] EXPLICIT SOLUTIONS FOR KOLMOGOROV EQUATIONS

The solutions form the matrix of transition probabilities

P(t) = | P_{11}(t)  ⋯  P_{1s}(t) |
       | ⋯                     |   (4.10)
       | P_{s1}(t)  ⋯  P_{ss}(t) |,

which satisfies the Chapman-Kolmogorov equation

P_{ik}(τ + t) = Σ_{j=1}^{s} P_{ij}(τ)P_{jk}(t).   (1.7)

The eigenvalues of V are the roots of the characteristic equation³

|A| = |ρI − V| = 0.   (4.12)

³ I am grateful to Myra Jordan Samuels for useful discussions on the sign of the eigenvalues
of V. An alternative proof due to Mrs. Samuels is given in Problem 27.
Let ρ be an eigenvalue of V and C = (c_1, …, c_s)′ a corresponding eigen-
vector, such that

(ρI − V)C = 0,   (4.14)

or, equivalently,

ρc_j = ν_{j1}c_1 + ⋯ + ν_{js}c_s,   j = 1, …, s.   (4.15)

Let c_α be a component of largest absolute value,

|c_α| ≥ |c_j|,   j = 1, …, s,   (4.16)

and consider the corresponding equation in (4.15),

ρc_α = ν_{α1}c_1 + ⋯ + ν_{αα}c_α + ⋯ + ν_{αs}c_s.   (4.17)

Using (4.16) and the relation Σ_j ν_{αj} = 0, we may rewrite (4.17) to obtain
a bound showing that the nonzero eigenvalues of V have negative real
parts.   (4.18)
Let V_{jk} denote the cofactor of ν_{jk} in V. Because each row of V sums to
zero, the cofactors belonging to the same column are equal,

V_{jk} = V_{ik},   i, j, k = 1, …, s.   (4.19)

To verify this for the first column, compare the minors in (4.20) and (4.21),
which differ in one row only; replacing the first column
in (4.21) by the sum of all the columns and making the substitution⁴

ν_{11} + ν_{13} + ⋯ + ν_{1s} = −ν_{12}   (4.22)

reduce (4.21) to (4.20).
The eigenvalues and the cofactors V_{ij} of V have the relations in (3.17).
In the present case ρ_1 = 0, and we have

ρ_2ρ_3 ⋯ ρ_s = V_{11} + ⋯ + V_{ss}.   (4.24)
To solve the homogeneous system we try

P_{ij}(t) = c_{ij}e^{ρt},   i, j = 1, …, s.   (4.25)

Substituting (4.25) in the system and cancelling e^{ρt} give, for each i, a
system of linear equations for the constants C′ = (c_{i1}, …, c_{is}).
In order for C′ to have a nontrivial solution, it is necessary that the matrix
A′ = (ρI − V′) be singular, that is, the determinant |A′| = 0. This means
that the roots of

|ρI − V′| = 0   (4.29)

are the only values of ρ for which (4.25) is a valid solution of (4.7). Since
the matrix V and its transpose V′ have the same eigenvalues, the roots of
(4.29) are also the eigenvalues of V, and these are denoted by ρ_1, …, ρ_s.
For each ρ_l, the corresponding constant matrix C(l) is determined from

A′(l) = | ρ_l − ν_{11}   −ν_{21}        ⋯ |
        | −ν_{12}        ρ_l − ν_{22}   ⋯ |   (4.31)
        | ⋯                              |.

If all the s eigenvalues of V are real and distinct, the general solution of
(4.4) is a linear combination of the solutions (4.25),

P_{ij}(t) = Σ_{l=1}^{s} c_{ijl}e^{ρ_lt}.   (4.33)

Equation (4.33) is the general solution for any finite system of differential
equations (4.4), in which the P_{ij}(t) need not be transition probabilities, as
pointed out earlier.

⁴ Here we use the well-known result in algebra that replacement of any column (or
any row) of a determinant by the sum of all the columns (or rows) does not change the
value of the determinant. See Problem 4 for proof.
The constants c_{ijl} are proportional to the cofactors of A′(l), c_{ijl} =
k_{il}A′_{ij}(l),⁵ where the proportionality constants k_{il} are determined from the
initial conditions (4.6):

Σ_{l=1}^{s} k_{il}A′_{ij}(l) = 1,  j = i,
                       = 0,  j ≠ i.   (4.34)

From the characteristic polynomial, the cofactors are polynomials in ρ_l,

A′_{ii}(l) = ρ_l^{s−1} + b_{ii2}ρ_l^{s−2} + ⋯ + b_{iis}
A′_{ij}(l) = b_{ij2}ρ_l^{s−2} + ⋯ + b_{ijs},   j ≠ i,   (4.35)

so that the equations in (4.34) become

Σ_{l=1}^{s} k_{il}ρ_l^{s−1} + b_{ii2}Σ_{l=1}^{s} k_{il}ρ_l^{s−2} + ⋯ + b_{iis}Σ_{l=1}^{s} k_{il} = 1
b_{ij2}Σ_{l=1}^{s} k_{il}ρ_l^{s−2} + ⋯ + b_{ijs}Σ_{l=1}^{s} k_{il} = 0,   j ≠ i.   (4.36)

Comparing (4.36) with the lemma of Section 3.3,

Σ_{l=1}^{s} ρ_l^r / ∏_{m≠l}(ρ_l − ρ_m) = 1,  r = s − 1,
                                  = 0,  0 ≤ r < s − 1,   (4.37)

we find

k_{il} = 1/∏_{m≠l}(ρ_l − ρ_m),   i, l = 1, …, s.   (4.38)

Therefore the first explicit solution is

P_{ij}(t) = Σ_{l=1}^{s} A′_{ij}(l)e^{ρ_lt} / ∏_{m≠l}(ρ_l − ρ_m),   i, j = 1, …, s.   (4.39)

⁵ The cofactors A′_{αj}(l) of any row α would be solutions for the c_{ijl}; but the choice
of α = i is convenient for evaluation of the proportionality constants k_{il}.
If we ignore the fact that P(t) and V are matrices, the differential equation
(4.7) with the initial matrix P(0) suggests the formal solution

P(t) = e^{Vt}P(0).   (4.40)

If we define

e^{Vt} = Σ_{n=0}^{∞} V^n t^n/n!,   (4.41)

then (4.40) is indeed a solution of (4.7). It can be shown that the matrix
series in (4.41) converges uniformly in t (see Problem 20, also Bellman
[1960] and Doob [1953]). Therefore, we can take the derivative of the
infinite sum in (4.40) with respect to t term by term:

d/dt P(t) = d/dt [Σ_{n=0}^{∞} (V^n t^n/n!)] P(0)
    = Σ_{n=1}^{∞} [V^n t^{n−1}/(n − 1)!] P(0) = e^{Vt}P(0)V = P(t)V,   (4.42)

so that (4.40) satisfies the forward system.
By Section 3 the matrix of eigenvectors

T(k) = [T_1(k), …, T_s(k)]   (4.44)

diagonalizes V,

T^{−1}(k)VT(k) = ρ,   (4.45)

where ρ = diag(ρ_1, …, ρ_s). Hence V = T(k)ρT^{−1}(k),

V² = [T(k)ρT^{−1}(k)][T(k)ρT^{−1}(k)] = T(k)ρ²T^{−1}(k),   (4.47)

and in general

V^n = T(k)ρ^nT^{−1}(k),   n = 1, 2, ….   (4.48)

Substituting (4.48) in the series (4.41) gives

e^{Vt} = Σ_{n=0}^{∞} (t^n/n!) T(k)ρ^nT^{−1}(k) = T(k) [Σ_{n=0}^{∞} ρ^n t^n/n!] T^{−1}(k).   (4.49)

Since ρ^n = diag(ρ_1^n, …, ρ_s^n),

Σ_{n=0}^{∞} ρ^n t^n/n! = diag(Σ_{n} ρ_1^n t^n/n!, …, Σ_{n} ρ_s^n t^n/n!)
    = diag(e^{ρ_1t}, …, e^{ρ_st}),   (4.51)
140 THE KOLMOGOROV DIFFERENTIAL EQUATIONS [6.4
Therefore the matrix exponential has the form

    e^{Vt} = T(k) E(t) T^{-1}(k),     (4.52)

and (4.40) becomes

    P(t) = e^{Vt} P(0) = T(k) E(t) T^{-1}(k) P(0).     (4.53)

In our case

    P(0) = I,     (4.54)

so that the particular solution for the differential equation (4.7) corre-
sponding to initial condition (4.54) is

    P(t) = T(k) E(t) T^{-1}(k).     (4.55)

Expanding (4.55) we obtain the second explicit solution for the transition
probabilities P_ij(t),

    P_ij(t) = Σ_{l=1}^s [A_{ki}(l) T_{jl}(k) / |T(k)|] e^{ρ_l t},   i, j = 1, ..., s,     (4.56)

where T_{jl}(k) is the cofactor of the (j, l) element of T(k).
We must now show that the two explicit solutions in (4.39) and (4.56)
are identical, or that

    Σ_{l=1}^s [A_{ki}(l) T_{jl}(k) / |T(k)|] e^{ρ_l t}
      = Σ_{l=1}^s [A'_{ij}(l) / Π_{m≠l} (ρ_l − ρ_m)] e^{ρ_l t}.     (4.57)

Since (4.56) holds true for any k, we may let k = j to rewrite (4.57) as

    Σ_{l=1}^s [A_{ji}(l) T_{jl}(j) / |T(j)|] e^{ρ_l t}
      = Σ_{l=1}^s [A'_{ij}(l) / Π_{m≠l} (ρ_l − ρ_m)] e^{ρ_l t}.     (4.58)

Because A_{ji}(l) is the cofactor of the transpose, A_{ji}(l) = A'_{ij}(l), it is
sufficient to show that, term by term,

    A_{ji}(l) T_{jl}(j) / |T(j)| = A'_{ij}(l) / Π_{m≠l} (ρ_l − ρ_m),     (4.59)

or that

    T_{jl}(j) / |T(j)| = 1 / Π_{m≠l} (ρ_l − ρ_m).     (4.60)

It is sufficient to prove (4.60) for each l. Equation (4.60), however, has been
proved in Section 3.4 [see Equation (3.46)]; hence the proof of the identity
is complete.
The solutions must also satisfy the Chapman-Kolmogorov equation

    P_ik(τ + t) = Σ_{j=1}^s P_ij(τ) P_jk(t).     (1.7)

We leave it for the reader to verify that solutions (4.39) and (4.56) satisfy
(1.7), and we proceed here to verify (4.61). Substituting solution (4.55) in
the right side of (4.61), we compute

    P(τ) P(t) = T(k) E(τ) T^{-1}(k) T(k) E(t) T^{-1}(k)
              = T(k) E(τ + t) T^{-1}(k) = P(τ + t).

When one eigenvalue, say ρ_1, is zero and the remaining eigenvalues are
negative, the transition probabilities approach limits as t → ∞,

    lim_{t→∞} P_ij(t) = A'_{ij}(1) / Π_{m≠1} (ρ_1 − ρ_m),   i, j = 1, ..., s,     (4.64)

since e^{ρ_l t} → 0 for l ≠ 1 while e^{ρ_1 t} = 1.
Here A'(1) = (ρ_1 I − V') with ρ_1 = 0; hence the cofactors are

    A'_{ij}(1) = (−1)^{s−1} V'_{ij},     (4.65)

where V'_{ij} is the corresponding cofactor of V'. Furthermore,

    (−1)^{s−1} ρ_2 ⋯ ρ_s = (−1)^{s−1} [V'_{11} + ⋯ + V'_{ss}],     (4.66)

and an analogous relation (4.66a) holds for the remaining symmetric
functions of the nonzero eigenvalues.
PROBLEMS
1. Use Equations (2.1) and (2.2) to show that

    (∂/∂t) P_ii(τ, t) |_{t=τ} = ν_ii(τ).

2. Show that

    Σ_{α} P_iα(τ, t) = 1.     (2.8)
5. Linearly independent vectors. Show that all the column vectors of a square
matrix W are linearly independent if and only if all the row vectors are linearly
independent.
7. The sum of the diagonal elements of a square matrix W is called the trace
of W, tr W. In general the sum of the determinants of all principal minors of W
of order r is called the spur of W, sp_r W (see, for example, Aitken [1962]).
Verify the relations in (3.17) among the coefficients b, the spurs, and the eigen-
values for the following matrix.
8. Consider the system of linear equations

    a_{11} z_1 + ⋯ + a_{1s} z_s = c_1
      ⋮
    a_{s1} z_1 + ⋯ + a_{ss} z_s = c_s.

Show that if the determinant of the coefficient matrix does not vanish,

    |A| = |a_{ij}| ≠ 0,

then the system has a unique solution. Find the solution and show that the
solution is unique.

9. Continuation. Show that if in Problem 8 the c's are all zero, then a non-
trivial solution exists if and only if the determinant |A| = 0, and that the solution
is proportional to the corresponding cofactors of A; namely

    z_j = k A_{ij},   j = 1, ..., s,

whatever may be i = 1, ..., s.
( 0
T,(l) = l = 1,2
Pi 0
T _ 1 (1)WT(1) =
0 P2
Pi ~ P2
P2
— Pi,
(d) Can a statement similar to (c) be made about the matrix T(2) = (T 2 (l),
T 2 (2)) ? Explain.
1 1
W =
-2
2
IA kl (l)\ /ON
W - W) U(o)=(o)’ k = 1,2.
A kl ( 1) ^ fcl (2)\
T(*) =
A k2 ( 1 ) A k2 (2)J
and show that
Pi 0
T- J (A:)WT(A:) =
0 P2
PROBLEMS 145
(d) Verify that the kth column of the inverse matrix T -1 (^) of T(fc) is equal to
L
—
Pi P2
12.
1
\P2 ~ Pi,
13. For ρ_1 = 1, ρ_2 = 2, ρ_3 = 3, show that

    ρ_1^r / [(ρ_1 − ρ_2)(ρ_1 − ρ_3)] + ρ_2^r / [(ρ_2 − ρ_1)(ρ_2 − ρ_3)]
      + ρ_3^r / [(ρ_3 − ρ_1)(ρ_3 − ρ_2)] = 0,   r = 0, 1,
                                         = 1,   r = 2.
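The identity in Problem 13 is a special case of (4.37) and is easy to check numerically. The sketch below does so for the problem's values ρ = (1, 2, 3) and for one extra set of distinct numbers chosen here only to illustrate generality.

```python
# sum_l rho_l^r / prod_{m != l} (rho_l - rho_m): 0 for r < s-1, 1 for r = s-1.
def lagrange_sum(rhos, r):
    total = 0.0
    for l, rho_l in enumerate(rhos):
        denom = 1.0
        for m, rho_m in enumerate(rhos):
            if m != l:
                denom *= rho_l - rho_m
        total += rho_l**r / denom
    return total

for rhos in [(1.0, 2.0, 3.0), (1.5, 4.0, -2.0, 7.0)]:
    s = len(rhos)
    for r in range(s - 1):
        assert abs(lagrange_sum(rhos, r)) < 1e-9       # = 0 for 0 <= r < s-1
    assert abs(lagrange_sum(rhos, s - 1) - 1) < 1e-9   # = 1 for r = s-1
```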
14. Show that the Vandermonde determinant of order 4 satisfies

    | 1      1      1      1     |
    | ρ_1    ρ_2    ρ_3    ρ_4   |
    | ρ_1^2  ρ_2^2  ρ_3^2  ρ_4^2 | = Π_{i=1}^{3} Π_{j=i+1}^{4} (ρ_j − ρ_i).
    | ρ_1^3  ρ_2^3  ρ_3^3  ρ_4^3 |
15. (a) Write down the Vandermonde determinant of order s whose ith column
is (1, ρ_i, ρ_i^2, ..., ρ_i^{s−1})'.

(c) Let V_ij be the cofactor of a Vandermonde determinant of order s. Show
that Σ_{j=1}^s V_ij = 0 for 1 < i < s.

(d) Use part (c) and Problem 13 to show that, for distinct numbers ρ_1,
ρ_2, ..., ρ_s,

    Σ_{l=1}^s ρ_l^r / Π_{j≠l} (ρ_l − ρ_j) = 0   for 0 ≤ r < s − 1
                                        = 1   for r = s − 1.

[Hint. Introduce a number ρ_{s+1} distinct from ρ_1, ..., ρ_s and use the
expansion of 1/[(ρ_{s+1} − ρ_1) ⋯ (ρ_{s+1} − ρ_s)] in partial fractions.]
16. Show that, for distinct numbers ρ_1, ..., ρ_s with s > 1,

    Σ_{l=1}^s 1 / Π_{j≠l} (ρ_l − ρ_j) = 0.

17. Continuation. Extend the result in Problem 16. [Hint. Let x be any real
number different from ρ_1, ..., ρ_s and substitute (ρ_l − x) for ρ_l.]

18. Show that if the ρ_l are distinct, then

    Σ_{l=1}^s ρ_l^r / Π_{j≠l} (ρ_l − ρ_j) = 0,   0 ≤ r < s − 1.

19. Resolve into partial fractions the function

    G(ρ) = 1 / [(ρ − ρ_1) ⋯ (ρ − ρ_s)].
20. Consider Equation (4.3) with

    ν_ii = −Σ_{j≠i} ν_ij,   i = 1, ..., s.     (4.2)

Show that the cofactors of V satisfy the relation V_ij = V_kj, whatever may be
i, k = 1, ..., s.⁶

The matrix exponential. (See Bellman [1960], pp. 162-4.) For any s × s
matrix W with elements w_ij define

    ||W|| = Σ_{i=1}^s Σ_{j=1}^s |w_ij|.

(a) Show that

    ||W + V|| ≤ ||W|| + ||V||

⁶ Personal communication.
and

    ||WV|| ≤ ||W|| ||V||.

(b) Show that a sufficient condition for the convergence of the matrix series

    Σ_{n=0}^∞ W_n

is the convergence of the corresponding series of norms Σ_{n=0}^∞ ||W_n||.

(c) Using (a) and (b), show that the matrix exponential defined in (4.41)
converges uniformly in any bounded interval −T < t < T.

(d) Using this result, show that (4.40) is a solution of the backward differential
equation (4.8).
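The norm properties underlying parts (a)–(c) can be spot-checked numerically. In the sketch below, the 4 × 4 matrices are random examples (not from the text); it verifies that the entrywise norm is submultiplicative and that the terms of the exponential series are dominated by a convergent scalar series.

```python
import numpy as np

def entry_norm(W):
    """||W|| = sum of |w_ij| over all entries."""
    return np.abs(W).sum()

rng = np.random.default_rng(42)
W = rng.normal(size=(4, 4))
V = rng.normal(size=(4, 4))

# (a) submultiplicativity: ||WV|| <= ||W|| ||V||
assert entry_norm(W @ V) <= entry_norm(W) * entry_norm(V) + 1e-12

# (b)-(c) ||W^n t^n / n!|| <= ||W||^n t^n / n! (n >= 1), so the series of
# norms is dominated by a scalar exponential and the terms go to zero.
t = 0.5
term = np.eye(4)
norms = []
for n in range(60):
    norms.append(entry_norm(term))
    term = term @ W * (t / (n + 1))
assert norms[-1] < 1e-12
assert sum(norms) <= np.exp(entry_norm(W) * t) * 4
```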
22. Show that the solution (4.40) also satisfies the forward differential
equation (4.5).

23. Show that if the eigenvalues ρ_l of V are real and distinct, then, whatever
may be i, l = 1, ..., s, the functions

    A'_{ij}(l) e^{ρ_l t} / Π_{m≠l} (ρ_l − ρ_m)

appearing in (4.39) satisfy the system of differential equations (4.4).

24. Continuation. Show that the same holds for the terms

    A_{ki}(l) T_{jl}(k) e^{ρ_l t} / |T(k)|

appearing in (4.56).
25. To illustrate that the solution in (4.56) for the transition probabilities is
independent of k, consider the simple illness-death process in Chapter 4 with

    V = ( ν_11  ν_12 )
        ( ν_21  ν_22 )

and A(l) = (ρ_l I − V).

(a) Write out the solution (4.56) for k = 1 and for k = 2,

    P_αβ(t) = Σ_l [A_{2α}(l) T_{βl}(2) / |T(2)|] e^{ρ_l t},     (4.56b)

and show that they are equal for given α and β.
(b) Matrix V has one zero eigenvalue and two negative eigenvalues. Determine
these eigenvalues.
(c) For each eigenvalue ρ_l, formulate the characteristic matrix

    A(l) = (ρ_l I − V)

and show that the column of cofactors

    T_l(k) = (A_{k1}(l), A_{k2}(l), A_{k3}(l))'

is an eigenvector of V corresponding to ρ_l.
(d) Verify that

    T^{-1}(k) V T(k) = diag(ρ_1, ρ_2, ρ_3).

(e) Verify that

    T(k) diag(ρ_1, ρ_2, ρ_3) T^{-1}(k) = V.

(f) Compute V^2 and verify that

    V^2 = T(k) diag(ρ_1^2, ρ_2^2, ρ_3^2) T^{-1}(k).
(g) Verify that the transition probabilities satisfy the forward differential
equations

    (d/dt) P_ik(t) = Σ_j P_ij(t) ν_jk.

(h) Verify that they satisfy the backward differential equations

    (d/dt) P_ik(t) = Σ_j ν_ij P_jk(t),

assuming P_ij(0) = δ_ij.

(i) Compute

    P_ij(t) = Σ_{l=1}^3 A'_{ij}(l) e^{ρ_l t} / Π_{m≠l} (ρ_l − ρ_m),   i, j = 1, 2, 3,

and show that they satisfy the differential equations and the initial conditions
in (h).
Verify the Chapman-Kolmogorov equation

    Σ_{j=1}^s P_ij(τ) P_jk(t) = P_ik(τ + t).     (4.13)

Adding (4.15) over those j for which c_j > 0, and denoting this partial sum
by Σ*, we have

    c_1 Σ* ν_1j + ⋯ + c_s Σ* ν_sj = ρ Σ* c_j.

Show that each of the terms on the left side of the last equation is nonpositive
and hence ρ ≤ 0.
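The conclusion ρ ≤ 0 can be spot-checked numerically: for an intensity matrix (nonnegative off-diagonal entries, nonpositive row sums) every eigenvalue has nonpositive real part. The matrix below is an arbitrary example, not one from the text.

```python
import numpy as np

# Arbitrary intensity matrix: off-diagonal entries nonnegative,
# row sums <= 0 (any deficit is flow into an absorbing state).
V = np.array([[-1.0, 0.4, 0.3],
              [0.2, -0.9, 0.5],
              [0.6, 0.1, -0.8]])
assert (V - np.diag(np.diag(V)) >= 0).all()
assert (V.sum(axis=1) <= 1e-12).all()

eigs = np.linalg.eigvals(V)
# Every eigenvalue rho satisfies Re(rho) <= 0 (Gershgorin disc argument).
assert (eigs.real <= 1e-10).all()
```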
CHAPTER 7

A General Model of Illness-Death Process

1. INTRODUCTION
We can now use the results of the preceding chapter to extend the simple
illness-death process in Chapter 4 to include the variability of illness and
causes of death. In this general model there are s states of illness, which
we shall denote by S_α, α = 1, ..., s, and r states of death denoted by
R_δ, δ = 1, ..., r, where s and r are positive integers. An illness state may
be broadly defined to include the absence of illness (a health state), a
physical impairment, a single specific disease or stage of disease, or any
combination of diseases. A death state, on the other hand, is defined by
the cause of death, whether single or multiple; emigration may also be
treated as a cause of death. Transition from an illness state to a death
state specifies not only the cause of death but also any disease present.
For example, an individual in illness state S_α of tuberculosis-pneumonia
who passes to the death state R_δ of pneumonia dies of pneumonia in the
presence of tuberculosis. A slightly different classification of illness and
death states may be found in Chiang [1964a].
Transition from one state to another, and indeed the model itself, is
governed by intensity functions. We let

    μ_αδ Δ + o(Δ) = Pr{an individual in state S_α at time τ will be in state
                      R_δ at time τ + Δ},   δ = 1, ..., r.     (1.1)
152 A GENERAL MODEL OF ILLNESS-DEATH PROCESS [7.1
The illness intensities ν_αβ are defined as in Chapter 6, with

    ν_αα < 0,   ν_αβ ≥ 0,   β ≠ α.     (1.2)

The death intensities form the s × r matrix

    U = ( μ_11  ⋯  μ_1r )
        (  ⋮         ⋮  )
        ( μ_s1  ⋯  μ_sr ).     (1.3)

For a time interval (0, t), 0 < t < ∞, let us define the transition
probabilities

    P_αβ(t) = Pr{an individual in state S_α at time 0 will be in state S_β
               at time t},   α, β = 1, ..., s,     (1.6)

    Q_αδ(t) = Pr{an individual in state S_α at time 0 will be in state R_δ
               at time t},   α = 1, ..., s;  δ = 1, ..., r,     (1.7)

with the corresponding matrices P(t) = (P_αβ(t)) and Q(t) = (Q_αδ(t)),     (1.8)
7.2] TRANSITION PROBABILITIES 153
with

    P(0) = I,   Q(0) = 0.     (1.9)

Since at any time an individual must be in one of the illness or death states,

    Σ_{β=1}^s P_αβ(t) + Σ_{δ=1}^r Q_αδ(t) = 1,     (1.10)

so that the intensities and the transition probabilities have the relations:

    (d/dt) P_αβ(t) |_{t=0} = ν_αβ,   α, β = 1, ..., s,     (1.11)

and

    (d/dt) Q_αδ(t) |_{t=0} = μ_αδ,   δ = 1, ..., r.     (1.12)
2. TRANSITION PROBABILITIES
In this section we present explicit formulas for the transition probabilities
and related problems. Because of the similarity of the illness transition
probabilities Pap {t) and the transition probabilities in Chapter 6, we shall
avoid repetition by not giving minute details.
The illness transition probabilities satisfy the forward differential equations

    (d/dt) P_αγ(t) = Σ_{β=1}^s P_αβ(t) ν_βγ     (2.1)

and the backward differential equations

    (d/dt) P_αγ(t) = Σ_{β=1}^s ν_αβ P_βγ(t),   α, γ = 1, ..., s.     (2.2)

Let the characteristic equation be

    |ρI − V'| = 0     (2.3)

and A'(l) = (ρ_l I − V'), and let A'_αβ(l) be the cofactors of A'(l). If the roots
(or eigenvalues) ρ_1, ..., ρ_s are real and distinct, then equations (2.1) and
(2.2) have the common solution

    P_αγ(t) = Σ_{l=1}^s A'_αγ(l) e^{ρ_l t} / Π_{j≠l} (ρ_l − ρ_j),   α, γ = 1, ..., s,     (2.4)

and the equivalent form

    P_αγ(t) = Σ_{l=1}^s [A_{kα}(l) T_{γl}(k) / |T(k)|] e^{ρ_l t},     (2.5)

where A_αγ(l) is the transpose of A'_αγ(l) and T_γl(k) is the cofactor of the
matrix

    T(k) = ( A_{k1}(1)  ⋯  A_{k1}(s) )
           (    ⋮              ⋮    )
           ( A_{ks}(1)  ⋯  A_{ks}(s) ).     (2.6)

In matrix form,

    P(t) = T(k) E(t) T^{-1}(k),     (2.7)

where

    E(t) = diag(e^{ρ_1 t}, ..., e^{ρ_s t}).     (2.8)
Equations (2.4), (2.5), and (2.7) are similar to (4.39), (4.56), and (4.55)
in Chapter 6; the reader is referred to Chapter 6 for the derivation
of these equations and for the proof of the identity of (2.4) and (2.5)
[see Equation (4.57) in Chapter 6]. For completeness, we shall show that
the solution (2.4) satisfies both systems. Substituting (2.4) in (2.1) and
(2.2) reduces each system to relations of the form

    Σ_{β=1}^s A'_αβ(l) ν_βγ = ρ_l A'_αγ(l),     (2.9)

since the relevant sum is an expansion of the determinant |A'(l)| using
the αth row. Obviously |A'(l)| = 0, because ρ_l is a root of the characteristic
equation (2.3).
The solution also satisfies the Chapman-Kolmogorov equation

    P_αγ(τ + t) = Σ_{β=1}^s P_αβ(τ) P_βγ(t),   α, γ = 1, ..., s.     (2.13)
To verify (2.13), we substitute (2.5) in its right side and compute

    Σ_β P_αβ(τ) P_βγ(t)
      = Σ_β Σ_l Σ_j [A_{kα}(l) T_{βl}(k) / |T(k)|] e^{ρ_l τ} [A_{kβ}(j) T_{γj}(k) / |T(k)|] e^{ρ_j t}.     (2.15)

All the summations are taken from 1 to s. The right side of (2.15) after
multiplication becomes

    Σ_l Σ_j A_{kα}(l) T_{γj}(k) e^{ρ_l τ + ρ_j t} [Σ_β A_{kβ}(j) T_{βl}(k)] / |T(k)|².     (2.16)

For the terms with j = l,

    Σ_β A_{kβ}(l) T_{βl}(k) = |T(k)|,     (2.17)

since the sum is an expansion of |T(k)| by its lth column; for the terms with
j ≠ l,

    Σ_β A_{kβ}(j) T_{βl}(k) = 0,     (2.18)

since in the last sum the elements A_{kβ}(j) and the cofactors T_{βl}(k) belong
to two different columns of |T(k)|. Substituting (2.17) and (2.18) in (2.16)
leaves only

    Σ_l [A_{kα}(l) T_{γl}(k) / |T(k)|] e^{ρ_l(τ+t)} = P_αγ(τ + t),

proving (2.13).
For the initial condition (1.6), we need to show that

    Σ_{l=1}^s A'_αγ(l) / Π_{j≠l} (ρ_l − ρ_j) = 1   for γ = α
                                            = 0   for γ ≠ α.     (2.19)
The death transition probability matrix Q(t) can be derived from the illness
probabilities P_αβ(t) and the death intensity matrix U:

    Q(t) = ∫_0^t P(τ) U dτ,     (2.21)

and, using (2.7),

    Q(t) = T(k)[E(t) − I] ρ^{-1} T^{-1}(k) U,     (2.22)

where ρ is the diagonal matrix of eigenvalues

    ρ = diag(ρ_1, ρ_2, ..., ρ_s).     (2.23)

Written out in components, (2.22) is

    Q_αδ(t) = Σ_{β=1}^s Σ_{l=1}^s [A_{kα}(l) T_{βl}(k) / |T(k)|] (1/ρ_l)(e^{ρ_l t} − 1) μ_βδ,     (2.24)

or, in terms of the first explicit solution,

    Q_αδ(t) = Σ_{β=1}^s Σ_{l=1}^s A'_αβ(l) (e^{ρ_l t} − 1) μ_βδ / [ρ_l Π_{j≠l} (ρ_l − ρ_j)],
        α = 1, ..., s;  δ = 1, ..., r.     (2.25)
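The relations P(t) = e^{Vt} and Q(t) = ∫₀ᵗ P(τ)U dτ = V⁻¹[P(t) − I]U can be checked numerically. The sketch below uses a hypothetical two-illness-state, two-death-state model with arbitrary illustrative rates.

```python
import numpy as np
from scipy.linalg import expm

V = np.array([[-1.2, 0.5],
              [0.3, -0.8]])          # illness intensities, row sums < 0
U = np.array([[0.4, 0.3],
              [0.1, 0.4]])           # deficits flow into two death states
assert np.allclose(V.sum(axis=1) + U.sum(axis=1), 0)   # V 1 + U 1 = 0

t = 2.0
P = expm(V * t)                                        # (2.7)
Q = np.linalg.inv(V) @ (P - np.eye(2)) @ U             # (2.21)-(2.22)

# (1.10): each row of [P Q] sums to one.
assert np.allclose(P.sum(axis=1) + Q.sum(axis=1), 1)

# As t -> infinity, P -> 0 and each row of Q(inf) sums to one.
Q_inf = -np.linalg.inv(V) @ U
assert np.allclose(Q_inf.sum(axis=1), 1)
```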
so that

    Q(τ + t) = T(k)[E(τ + t) − I] ρ^{-1} T^{-1}(k) U.     (2.29)

Combining (2.4) and (2.25), relation (1.10) takes the explicit form

    Σ_{β=1}^s Σ_{l=1}^s A'_αβ(l) e^{ρ_l t} / Π_{j≠l} (ρ_l − ρ_j)
      + Σ_{δ=1}^r Σ_{β=1}^s Σ_{l=1}^s A'_αβ(l) μ_βδ (e^{ρ_l t} − 1) / [ρ_l Π_{j≠l} (ρ_l − ρ_j)] = 1,     (2.31)

which can be shown by direct computation and is left to the reader. The
corresponding matrix equation is verified below.
The matrix form of (1.10) is

    P(t) 1_s + Q(t) 1_r = 1_s,     (2.32)

where the first and the third column vectors are s × 1, the second one is
r × 1, and all the components are unity. We substitute (2.7) for P(t) and
(2.22) for Q(t) in the left side of (2.32), which becomes

    T(k) E(t) T^{-1}(k) 1_s + T(k)[E(t) − I] ρ^{-1} T^{-1}(k) U 1_r,

where the unit matrix I is s × s. From (1.1), (1.2), and (1.3) we have
the relation

    V 1_s + U 1_r = 0,     (2.34)

so that U 1_r = −V 1_s, and since T^{-1}(k) V = ρ T^{-1}(k),

    ρ^{-1} T^{-1}(k) V = T^{-1}(k).     (2.37)

Substituting (2.37) in (2.35) and canceling out the first two terms give us

    T(k) I T^{-1}(k) 1_s = 1_s,

verifying (2.32). Letting t → ∞ in (2.25), since every ρ_l < 0,

    Q_αδ(∞) = lim_{t→∞} Σ_{β=1}^s Σ_{l=1}^s A'_αβ(l) μ_βδ (e^{ρ_l t} − 1) / [ρ_l Π_{j≠l} (ρ_l − ρ_j)]
            = −Σ_{β=1}^s Σ_{l=1}^s A'_αβ(l) μ_βδ / [ρ_l Π_{j≠l} (ρ_l − ρ_j)],     (2.39)

and by (2.31) the Q_αδ(∞) sum to unity over δ.
Thus, starting from any illness state S_α at any given time, an individual
will eventually enter a death state with a probability of one.
Expected durations. Let e_αβ(t) denote the expected length of time that an
individual starting in S_α spends in state S_β during the interval (0, t),

    e_αβ(t) = ∫_0^t P_αβ(τ) dτ,   α, β = 1, ..., s.     (2.41)

Using formulas (2.4) and (2.25) for the transition probabilities, we compute

    e_αβ(t) = ∫_0^t Σ_{l=1}^s [A'_αβ(l) / Π_{j≠l} (ρ_l − ρ_j)] e^{ρ_l τ} dτ
            = Σ_{l=1}^s A'_αβ(l) [e^{ρ_l t} − 1] / [ρ_l Π_{j≠l} (ρ_l − ρ_j)],   α, β = 1, ..., s,     (2.45)

and, for the death states,

    e_αδ(t) = ∫_0^t Q_αδ(τ) dτ
            = Σ_{β=1}^s Σ_{l=1}^s A'_αβ(l) μ_βδ [(e^{ρ_l t} − 1)/ρ_l − t] / [ρ_l Π_{j≠l} (ρ_l − ρ_j)],
        α = 1, ..., s;  δ = 1, ..., r.     (2.46)

These quantities satisfy

    Σ_{β=1}^s e_αβ(t) + Σ_{δ=1}^r e_αδ(t) = t.     (2.47)
Equation (2.47) may be shown directly with the explicit formulas in (2.45)
and (2.46), but a simpler approach is to use (2.43) and (2.44): integrating
relation (2.30) over (0, t), the integrand is unity and hence the inte-
gral is equal to t.
Population sizes. Suppose that at time 0 there are x_α(0) individuals in
state S_α, α = 1, ..., s, so that

    x(0) = x_1(0) + ⋯ + x_s(0)     (2.49)

is the initial size of the total population. These individuals will travel
independently from one illness state to another and some will enter a
death state during the time interval (0, t). At the end of the interval, there
will be X_β(t) individuals in S_β and Y_δ(t) in R_δ, β = 1, ..., s; δ = 1, ..., r.
Clearly

    X_β(t) = X_{1β}(t) + ⋯ + X_{sβ}(t),     (2.50)

where X_αβ(t) is the number of people in state S_β at time t who were in state
S_α at time 0, and

    Y_δ(t) = Y_{1δ}(t) + ⋯ + Y_{sδ}(t),     (2.51)

where Y_αδ(t) is the number of people in state R_δ at time t who were in
state S_α at time 0. On the other hand, each of the x_α(0) people in state S_α
at time 0 must be in one of the illness states or in one of the death states
at time t; this gives us

    x_α(0) = Σ_{β=1}^s X_αβ(t) + Σ_{δ=1}^r Y_αδ(t).     (2.52)

According to (2.30),

    1 = Σ_{β=1}^s P_αβ(t) + Σ_{δ=1}^r Q_αδ(t);     (2.30)

therefore, given x_α(0), the random variables on the right side of (2.52)
have a joint multinomial distribution. Letting X_αβ(t) assume values x_αβ and
Y_αδ(t) assume values y_αδ, we have the joint probability distribution

    Π_{α=1}^s { x_α(0)! / [Π_β x_αβ! Π_δ y_αδ!] } Π_{β=1}^s [P_αβ(t)]^{x_αβ} Π_{δ=1}^r [Q_αδ(t)]^{y_αδ},     (2.53)

where x_αβ and y_αδ are nonnegative integers satisfying the condition

    Σ_β x_αβ + Σ_δ y_αδ = x_α(0).     (2.54)
Summing (2.53) gives the joint distribution of the population sizes X_β(t)
and Y_δ(t),

    Pr{X_β(t) = x_β, Y_δ(t) = y_δ; β = 1, ..., s; δ = 1, ..., r}
      = Σ Π_{α=1}^s { x_α(0)! / [Π_β x_αβ! Π_δ y_αδ!] } Π_{β=1}^s [P_αβ(t)]^{x_αβ} Π_{δ=1}^r [Q_αδ(t)]^{y_αδ},     (2.55)

where x_αβ and y_αδ satisfy (2.54); the summation is taken over x_αβ and y_αδ
so that for all β and δ

    Σ_{α=1}^s x_αβ = x_β   and   Σ_{α=1}^s y_αδ = y_δ.     (2.56)
The expected population sizes of the states and the corresponding variances
and covariances are easily obtained from (2.55) by using familiar theorems
in the multinomial distribution given in Chapter 1. The expectations,
for example, are
    E[X_β(t)] = Σ_{α=1}^s x_α(0) P_αβ(t),   β = 1, ..., s,     (2.57)
and E[Y_δ(t)] = Σ_{α=1}^s x_α(0) Q_αδ(t). As t → ∞ every individual eventually
enters a death state, and the limiting joint distribution of the Y_δ is

    Π_{α=1}^s { x_α(0)! / Π_δ y_αδ! } Π_{δ=1}^r [Q_αδ(∞)]^{y_αδ},     (2.59)

where Q_αδ(∞) is given in (2.39).
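The multinomial structure can be verified by a seeded simulation: each starting cohort is scattered over illness and death states with probabilities (P_α·, Q_α·), and the sample mean should match the expectation E[X_β(t)] = Σ_α x_α(0) P_αβ(t). All rates and cohort sizes below are arbitrary illustrative values.

```python
import numpy as np
from scipy.linalg import expm

V = np.array([[-1.0, 0.4], [0.2, -0.7]])
U = np.array([[0.6], [0.5]])                 # one death state
x0 = np.array([300, 200])                    # initial cohort sizes
t = 1.0
P = expm(V * t)
Q = np.linalg.inv(V) @ (P - np.eye(2)) @ U

rng = np.random.default_rng(0)
reps = 4000
X = np.zeros((reps, 2))
for r in range(reps):
    for a in range(2):
        probs = np.concatenate([P[a], Q[a]])     # rows of [P Q] sum to 1
        counts = rng.multinomial(x0[a], probs)   # one cohort's fate
        X[r] += counts[:2]                       # illness-state counts only

expected = x0 @ P                                # expectation formula
assert np.allclose(X.mean(axis=0), expected, rtol=0.02)
```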
during the interval. This problem was resolved for the simple illness-death
process by studying the multiple transition probabilities, the details of
which were given in Chapter 5. We now consider the same problem for
the general model.
Let us first review the symbols and definitions. P_αβ(t) was defined as the
probability that an individual in state S_α at time 0 will be in state S_β at
time t. During the interval (0, t) the individual will travel continuously
from one state to another; he may, for example, return to and exit from S_α
a number of times and he may enter S_β a number of times. We define
M_αβ(t) as the number of exit transitions from S_α, and N_αβ(t) as the number
of entrance transitions to S_β. Both M_αβ(t) and N_αβ(t) are random variables,
with the corresponding probabilities denoted by P_αβ^{(m)}(t) and p_αβ^{[n]}(t),
respectively, so that

    P_αβ^{(m)}(t) = Pr{M_αβ(t) = m}
      = Pr{an individual in S_α at time 0 will leave S_α m times during
           (0, t) and will be in S_β at time t},   m = 0, 1, ...;     (3.1)
    p_αβ^{[n]}(t) = Pr{N_αβ(t) = n}
      = Pr{an individual in S_α at time 0 will enter S_β n times during
           (0, t) and will be in S_β at time t},   n = 0, 1, ...;
        α, β = 1, ..., s.     (3.2)

Clearly

    Σ_{m=0}^∞ P_αβ^{(m)}(t) = Σ_{n=0}^∞ p_αβ^{[n]}(t) = P_αβ(t).     (3.4)
Since P_αβ(t) < 1, M_αβ(t) and N_αβ(t) are improper random variables.
Explicit formulas for the multiple transition probabilities are difficult to
obtain for an arbitrary number of illness states, but formulas for the
corresponding p.g.f.’s can be derived.
Let g_αβ(u; t) = Σ_m P_αβ^{(m)}(t) u^m be the p.g.f. of M_αβ(t). Obviously, at
u = 1,

    g_αβ(1; t) = P_αβ(t).     (3.6)

The probabilities satisfy the differential equations

    (d/dt) P_αα^{(m)}(t) = ν_αα P_αα^{(m)}(t) + Σ_{β≠α} P_αβ^{(m)}(t) ν_βα,

    (d/dt) P_αγ^{(m)}(t) = ν_γγ P_αγ^{(m)}(t) + Σ_{β≠α,γ} P_αβ^{(m)}(t) ν_βγ + P_αα^{(m−1)}(t) ν_αγ,
        m = 0, 1, ...;  γ ≠ α;  α, γ = 1, ..., s.     (3.7)
From (3.7) direct computations show that the p.g.f.'s satisfy the differential
equations

    (d/dt) g_αα(u; t) = ν_αα g_αα(u; t) + Σ_{β≠α} g_αβ(u; t) ν_βα,

    (d/dt) g_αγ(u; t) = ν_γγ g_αγ(u; t) + Σ_{β≠α,γ} g_αβ(u; t) ν_βγ + u g_αα(u; t) ν_αγ,
        γ ≠ α.     (3.8)

System (3.8) represents s differential equations and can be solved in the
same way as system (2.1), although the presence of u as the coefficient of
g_αα(u; t) in some but not all of the equations in (3.8) makes it difficult to
write down the solution at once. Writing D for the differentiation d/dt and
g_αβ for g_αβ(u; t), the system in determinant form is

    (D − ν_γγ) g_αγ − Σ_{β≠α,γ} ν_βγ g_αβ − u ν_αγ g_αα = 0,   γ = 1, ..., s.     (3.10)
where, for simplicity, the symbol D is written for the differentiation djdt
and gaP for g^u; t ). The corresponding characteristic equation is
The corresponding characteristic equation is

    | (ρ − ν_11)   −ν_21       ⋯   −u ν_α1   ⋯   −ν_s1      |
    | −ν_12        (ρ − ν_22)  ⋯   −u ν_α2   ⋯   −ν_s2      | = 0,     (3.11)
    |   ⋮                            ⋮              ⋮        |
    | −ν_1s        −ν_2s       ⋯   −u ν_αs   ⋯   (ρ − ν_ss) |

with the argument u appearing in the αth column except in the diagonal
element. Suppose that the characteristic equation has distinct real roots
ρ_l(α) for l = 1, ..., s. Following the steps in the solution of the system
(4.4) of the Kolmogorov differential equations in Section 4.2 of Chapter 6,
we obtain the general solution of (3.10),

    g_αγ(u; t) = Σ_{l=1}^s k_l A'_αγ(α, l) e^{ρ_l(α) t},     (3.12)

where A'_αγ(α, l) are cofactors of |A'(α, l)| in (3.11) with ρ = ρ_l(α),
and

    k_l = 1 / Π_{j≠l} [ρ_l(α) − ρ_j(α)].     (3.13)

From (3.12) and (3.13) follows the solution for the p.g.f.

    g_αγ(u; t) = Σ_{l=1}^s A'_αγ(α, l) e^{ρ_l(α) t} / Π_{j≠l} [ρ_l(α) − ρ_j(α)].     (3.14)

For given α and γ, the p.g.f. in (3.14) is an implicit function of the argument
u, which is involved in the eigenvalues ρ_l(α) and the cofactors A'_αγ(α, l).
When u = 1, |A'(α)| in (3.11) becomes identical to |A'| in (2.3), and the
p.g.f. reduces to the transition probability P_αγ(t), in agreement with (3.6).

Let

    P_αδ^{(m)}(t)     (3.16)

be the probability that an individual in S_α at time 0 will leave S_α m times
during (0, t) and be in death state R_δ at time t, and let

    h_αδ(u; t) = Σ_{m=0}^∞ P_αδ^{(m)}(t) u^m     (3.17)

be the corresponding p.g.f. We shall derive explicit formulas for h_αδ(u; t).
Since entrance into R_δ may occur at any time τ < t from any illness state,

    h_αδ(u; t) = ∫_0^t [Σ_{m=1}^∞ u^m P_αα^{(m−1)}(τ)] μ_αδ dτ
               + ∫_0^t Σ_{β≠α} [Σ_{m=0}^∞ u^m P_αβ^{(m)}(τ)] μ_βδ dτ.     (3.20)

Carrying out the integration with (3.14) gives

    h_αδ(u; t) = Σ_{l=1}^s u A'_αα(α, l) μ_αδ [e^{ρ_l(α) t} − 1]
                   / {Π_{j≠l} [ρ_l(α) − ρ_j(α)] ρ_l(α)}
               + Σ_{β≠α} Σ_{l=1}^s A'_αβ(α, l) μ_βδ [e^{ρ_l(α) t} − 1]
                   / {Π_{j≠l} [ρ_l(α) − ρ_j(α)] ρ_l(α)}.     (3.21)
3=1
In the general illness-death model of more than two illness states the
number of exit transitions from a state Sx and the number of entrance
transitions into another state Sp not only are conceptually different but
also have different probability distributions. This can be shown from the
corresponding p.g.f.’s. Let
    g_αγ(u; t) = Σ_{n=0}^∞ u^n p_αγ^{[n]}(t)     (3.22)

be the p.g.f. of the number of entrance transitions N_αγ(t). The probabilities
satisfy the backward differential equations

    (d/dt) p_αγ^{[n]}(t) = Σ_{β≠γ} ν_αβ p_βγ^{[n]}(t) + ν_αγ p_γγ^{[n−1]}(t),   α ≠ γ,     (3.23)

    (d/dt) p_γγ^{[n]}(t) = Σ_β ν_γβ p_βγ^{[n]}(t),

and hence the p.g.f.'s satisfy

    (d/dt) g_αγ(u; t) = Σ_{β≠γ} ν_αβ g_βγ(u; t) + ν_αγ u g_γγ(u; t),   α ≠ γ.     (3.24)
The corresponding characteristic equation is

    | (ρ − ν_11)   −ν_12       ⋯   −ν_1γ u   ⋯   −ν_1s      |
    | −ν_21        (ρ − ν_22)  ⋯   −ν_2γ u   ⋯   −ν_2s      |
    |   ⋮                            ⋮              ⋮        | = |A(γ)| = 0,     (3.26)
    | −ν_s1        −ν_s2       ⋯   −ν_sγ u   ⋯   (ρ − ν_ss) |

with the argument u in the γth column of the determinant |A(γ)|, whereas in
(3.11) u was in the αth column. In general the roots ρ_1(γ), ..., ρ_s(γ)
of (3.26) differ from ρ_1(α), ..., ρ_s(α) of (3.11). They are equal only when
γ = α. If the roots of (3.26) are real and distinct, we have the solution

    g_αγ(u; t) = Σ_{l=1}^s A_αγ(γ, l) e^{ρ_l(γ) t} / Π_{j≠l} [ρ_l(γ) − ρ_j(γ)],
        α, γ = 1, ..., s.     (3.27)
The p.g.f.’s in (3.27) are different from the p.g.f.’s gay (u; t) of the multiple
exit transition probabilities in (3.14). Thus the probability distribution
of the number of exit transitions from a state is different from that of the
number of entrance transitions into another state.
PROBLEMS
1. Show that the solution

    P_αγ(t) = Σ_{l=1}^s [A_{kα}(l) T_{γl}(k) / |T(k)|] e^{ρ_l t}

satisfies the backward differential equations (2.2).

2. Show that

    Σ_{l=1}^s A'_αγ(l) / Π_{j≠l} (ρ_l − ρ_j) = 1   for γ = α
                                            = 0   for γ ≠ α.     (2.19)
3. Verify the equation

    Q(t) = ∫_0^t P(τ) U dτ.     (2.21)
5. Verify that the transition probabilities P_αγ(t) in (2.4) and Q_αδ(t) in (2.25)
7. The matrix

    V = ( −4   1   2 )
        (  1  −3   1 )
        (  1   0  −2 )

is of rank 3 and satisfies the conditions

    Σ_{β=1}^3 ν_αβ < 0   and   ν_αβ ≥ 0   for β ≠ α,   α = 1, 2, 3.
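The matrix in Problem 7 is a valid illness-intensity matrix, and its eigenvalues can be computed numerically; they turn out to be real, distinct, and negative, as the theory of Section 2 requires.

```python
import numpy as np

V = np.array([[-4.0, 1.0, 2.0],
              [1.0, -3.0, 1.0],
              [1.0, 0.0, -2.0]])
assert np.linalg.matrix_rank(V) == 3
assert (V.sum(axis=1) < 0).all()

eigs = np.sort(np.linalg.eigvals(V).real)
# The characteristic polynomial factors as (rho + 1)(rho + 3)(rho + 5).
assert np.allclose(eigs, [-5.0, -3.0, -1.0])
```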
10. Substitute (2.45) and (2.46) in (2.47) and prove the resulting equation.
11. Verify the system of differential equations in (3.23) for the multiple
entrance transition probabilities.
CHAPTER 8
Migration Processes and Birth-Illness-Death Process
1. INTRODUCTION
In Chapter 7 emphasis was placed on transition probabilities rather
than on the distribution of population sizes, which was obtained only as a
by-product. This chapter is concerned exclusively with population sizes
in illness states, and we shall derive directly the corresponding probability
distributions. Furthermore, the populations of various states, which were
previously treated as being entirely based on the initial size, are now
allowed to change because of birth and migration. Movement between
states (internal migration) is retained through illness transition as before.
The illness states are again denoted by S_α, α = 1, ..., s. For a time
interval (τ, t), let the populations of the states at the initial time τ be
represented by a constant column vector

    x_τ = (x_{1τ}, ..., x_{sτ})'.     (1.2)
172 MIGRATION PROCESSES AND BIRTH-ILLNESS-DEATH PROCESS [8.1
Let X_t denote the population vector at time t, and define

    P(x_τ, x_t; τ, t) = Pr{X_t = x_t at t, given x_τ at τ},     (1.4)

abbreviated to

    P(x_τ, x_t; t) = P(x_τ, x_t; τ, t).     (1.5)

Let

    δ_α = (δ_{1α}, ..., δ_{sα})',     (1.8)

where the components are Kronecker deltas. It is easily seen that the
probabilities in (1.4) satisfy the following system of differential equations:

    (∂/∂t) P(x_τ, x_t)
      = −P(x_τ, x_t) Σ_{α=1}^s [λ*_α(t) − ν*_αα(t)]
        + Σ_{α=1}^s P(x_τ, x_t − δ_α) λ*_α(t)
        + Σ_{α=1}^s Σ_{β≠α} P(x_τ, x_t + δ_α − δ_β) ν*_αβ(t)
        + Σ_{α=1}^s P(x_τ, x_t + δ_α) μ*_α(t),     (1.9)

with the initial condition

    P(x_τ, x_τ; τ) = 1,   P(x_τ, x; t) = 0   for x ≠ x_τ.     (1.10)
2. EMIGRATION-IMMIGRATION PROCESSES—
POISSON-MARKOV PROCESSES
It was pointed out early in Chapter 7 that emigration may be treated
Number of
Type of Process States s
X — 0
m
1
Poisson process
1 0
0
1 0
Simple illness-
Finite Markov
process s — xii v ii —
General illness-
death process s 0 x zt v *p x*tQ*a 1 + * '
‘
+ P*}
Migration process s
w x*t v
*fi
xztt*a
Birth-illness-
death process s X*A Xat vaP x*tV*
so that

    (∂/∂t) P(x_τ, x_t)
      = −P(x_τ, x_t) Σ_{α=1}^s [λ_α(t) − x_{αt} ν_αα]
        + Σ_{α=1}^s P(x_τ, x_t − δ_α) λ_α(t)
        + Σ_{α=1}^s Σ_{β≠α} P(x_τ, x_t + δ_α − δ_β)(x_{αt} + 1) ν_αβ
        + Σ_{α=1}^s P(x_τ, x_t + δ_α)(x_{αt} + 1) μ_α.
Define the p.g.f. of the population sizes

    G_{X_t}(u; τ, t) = Σ_{x_{1t}} ⋯ Σ_{x_{st}} u_1^{x_{1t}} ⋯ u_s^{x_{st}} P(x_τ, x_t).     (2.4)

Direct calculations show that the p.g.f. satisfies the following partial
differential equation:

    (∂/∂t) G_{X_t}(u; τ, t)
      = Σ_{α=1}^s λ_α(t)(u_α − 1) G_{X_t}(u; τ, t)
        + Σ_{α=1}^s [Σ_{β≠α} ν_αβ (u_β − u_α) + μ_α (1 − u_α)] (∂/∂u_α) G_{X_t}(u; τ, t).     (2.5)

In terms of the variables z_α = 1 − u_α, α = 1, ..., s, this becomes a linear
partial differential equation (2.7), written compactly as (2.8).
Differential Equation (2.8) can be solved in the usual way. The auxiliary
equations are

    dt / 1 = dz_1 / (⋯) = ⋯ = dz_s / (⋯)
           = dG_{X_t}(z; τ, t) / [−Σ_α λ_α(t) z_α G_{X_t}(z; τ, t)],     (2.10)

or

    dz_α/dt + Σ_{β=1}^s ν_αβ z_β = 0,   α = 1, ..., s,     (2.10a)

and

    (d/dt) log G_{X_t}(z; τ, t) = −Σ_{β=1}^s λ_β(t) z_β.     (2.10b)

In matrix notation, let

    λ(t) = (λ_1(t), ..., λ_s(t))'     (2.11)

and

    V = (ν_αβ),   α, β = 1, ..., s.     (2.12)
8.2] EMIGRATION-IMMIGRATION PROCESSES 177
(D + V)z = 0 (2.13a)
and
d log GXt (z; r, t) = — \'{t)z dt (2.13 b)
The first set of auxiliary equations is governed by the characteristic
matrix

    A(j) = (ρ_j I − V),     (2.15)

whose kth column of cofactors

    T_j(k) = (A_{k1}(j), ..., A_{ks}(j))'     (2.16)

is an eigenvector of V corresponding to ρ_j, with

    T(k) = ( A_{k1}(1)  A_{k1}(2)  ⋯  A_{k1}(s) )
           ( A_{k2}(1)  A_{k2}(2)  ⋯  A_{k2}(s) )
           (    ⋮                        ⋮      )     (2.18)

where the jth column is the kth column of the adjoint matrix of A(j), and
k is arbitrary but the same for all columns [see Equation (3.43) in
Chapter 6].
Each particular solution has the form

    z = e^{−ρ_j t} c(j),     (2.19)

where c(j) is a constant vector

    c(j) = (c_1(j), ..., c_s(j))'.     (2.20)
Since −ρ_j is a solution of (2.14), the determinant |A(j)| of the matrix
A(j) is equal to zero. Cramer's rule implies that
    c(j) = T_j(k) b_j     (2.23)

for an arbitrary constant b_j; hence each particular solution is

    z = T_j(k) e^{−ρ_j t} b_j.     (2.25)

If the eigenvalues −ρ_j are real and distinct, then we have the general
solution for the first set of auxiliary equations (2.13a) as a linear combina-
tion of the particular solutions. Let

    E(t) = diag(e^{ρ_1 t}, ..., e^{ρ_s t}),     (2.27)
    b = (b_1, ..., b_s)',     (2.28)

and write

    z = T(k) E^{-1}(t) b.     (2.29)
where
1
V II
1 S' S' (2.34)
and
Rearranging (2.29) as

    b = E(t) T^{-1}(k) z     (2.38)

and substituting (2.38) and (2.39) in (2.37) gives the general solution for the
differential equation (2.8),
    Φ(E(τ) T^{-1}(k) z) = Π_{α=1}^s [1 − z_α]^{x_{ατ}}.     (2.42)

Hence for all b_τ such that every component of z in (2.44) is less than one
in absolute value,

    Φ{b_τ} = Π_{α=1}^s [1 − z_α(b_τ)]^{x_{ατ}};     (2.45)

writing

    b_τ = E(t) T^{-1}(k) z,     (2.46)

we let

    Φ{E(t) T^{-1}(k) z} = Π_{α=1}^s [1 − z'_α]^{x_{ατ}},     (2.48)

and the p.g.f. becomes

    G_{X_t}(z; τ, t) = Π_{α=1}^s [1 − z'_α]^{x_{ατ}} exp{−η'(t) E(t) T^{-1}(k) z}.     (2.49)
The quantities z'_α in (2.49) are linear in z with coefficients given by the
transition probabilities

    P_αβ(t − τ) = Σ_l [A_{kα}(l) T_{βl}(k) / |T(k)|] e^{−ρ_l (t−τ)}.     (2.51)

Therefore, since z_β = 1 − u_β,

    1 − Σ_{β=1}^s P_αβ(t − τ) = Σ_{δ=1}^r Q_αδ(t − τ).
The exponential part of the p.g.f. in (2.49) may be written a little dif-
ferently. Let

    η'(t) E(t) T^{-1}(k) z = θ'(t) z = Σ_{β=1}^s θ_β(t)(1 − u_β),     (2.54)

with components

    θ_β(t) = Σ_{l=1}^s η_l(t) e^{ρ_l t} T_{βl}(k) / |T(k)|.     (2.55)
Expanding the p.g.f. shows that, given x_τ, the population sizes are sums of
independent multinomial and Poisson components:

    Pr{X_{1t} = x_{1t}, ..., X_{st} = x_{st} | x_τ}
      = Σ Π_{α=1}^s { x_{ατ}! / [Π_β k_{αβ}! (x_{ατ} − Σ_β k_{αβ})!]
          Π_{β=1}^s [P_αβ(t − τ)]^{k_{αβ}} [1 − Σ_{β=1}^s P_αβ(t − τ)]^{x_{ατ} − Σ_β k_{αβ}} }
        × Π_{β=1}^s e^{−θ_β(t)} [θ_β(t)]^{k_{0β}} / k_{0β}!,     (2.57)

where the summation extends over all nonnegative integers k_{αβ} and k_{0β}
with Σ_{α=1}^s k_{αβ} + k_{0β} = x_{βt}.
The first two moments follow directly:

    E[X_{βt} | x_τ] = Σ_{α=1}^s x_{ατ} P_αβ(t − τ) + θ_β(t),

    Var(X_{βt} | x_τ) = Σ_{α=1}^s x_{ατ} P_αβ(t − τ)[1 − P_αβ(t − τ)] + θ_β(t),     (2.58)

and

    Cov(X_{βt}, X_{γt} | x_τ) = −Σ_{α=1}^s x_{ατ} P_αβ(t − τ) P_αγ(t − τ),   β ≠ γ.
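The mean formula in (2.58) can be checked by a seeded Monte Carlo sketch of one step of an emigration-immigration process: existing individuals move (or die) according to the transition probabilities, and Poisson immigrants with means θ_β are added. All numbers below are arbitrary illustrative values.

```python
import numpy as np

P = np.array([[0.6, 0.2],        # P_{alpha beta}(t - tau); rows sum to < 1,
              [0.1, 0.7]])       # the deficit being the chance of death
theta = np.array([3.0, 5.0])     # expected immigrants present at time t
x_tau = np.array([50, 80])

rng = np.random.default_rng(1)
reps = 10000
totals = np.zeros((reps, 2))
for r in range(reps):
    x_t = rng.poisson(theta).astype(float)        # Poisson component
    for a in range(2):
        probs = np.append(P[a], 1 - P[a].sum())   # move to state 1, 2, or die
        moves = rng.multinomial(x_tau[a], probs)  # multinomial component
        x_t += moves[:2]
    totals[r] = x_t

expected = x_tau @ P + theta                       # mean formula in (2.58)
assert np.allclose(totals.mean(axis=0), expected, rtol=0.02)
```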
When the immigration rates λ_α are constant in time, the integral defining
η'(t) can be evaluated:

    η'(t) = λ' T(k) ∫ E^{-1}(ξ) dξ = λ' T(k)[I − E^{-1}(t)] ρ^{-1},     (2.59)

so that

    θ'(t) z = λ' T(k)[E(t) − I] ρ^{-1} T^{-1}(k) z.     (2.60)

Since

    T(k) ρ^{-1} T^{-1}(k) = V^{-1},     (2.61)

we may write

    λ' T(k)[E(t) − I] ρ^{-1} T^{-1}(k) z
      = λ' T(k) ρ^{-1} T^{-1}(k)[T(k) E(t) T^{-1}(k) − I] z
      = λ' V^{-1}[P(t) − I] z.     (2.62)

Equation (2.60) thus becomes

    θ'(t) z = λ' V^{-1}[P(t) − I] z.     (2.65)
3. A BIRTH-ILLNESS-DEATH PROCESS
When λ*_α(t) is a function of the population size x_{αt} of state S_α at time t,
the process is no longer of the Poisson-Markov type. Let the random
vector X_t be the population at time t, and write the probability of X_t as

    P(x_τ, x_t) = Pr{X_t = x_t, given x_τ at τ}.     (3.2)
Substituting (3.1) in (1.9) gives the following differential equation for the
probability in (3.2)
    (∂/∂t) P(x_τ, x_t)
      = −P(x_τ, x_t) Σ_{α=1}^s x_{αt}[λ_α − ν_αα]
        + Σ_{α=1}^s P(x_τ, x_t − δ_α)(x_{αt} − 1) λ_α
        + Σ_{α=1}^s Σ_{β≠α} P(x_τ, x_t + δ_α − δ_β)(x_{αt} + 1) ν_αβ
        + Σ_{α=1}^s P(x_τ, x_t + δ_α)(x_{αt} + 1) μ_α.     (3.3)
The corresponding equation for the p.g.f. is

    (∂/∂t) G_{X_t}(u; t) = Σ_{α=1}^s [λ_α u_α (u_α − 1) + μ_α (1 − u_α)
        + Σ_{β≠α} ν_αβ (u_β − u_α)] (∂/∂u_α) G_{X_t}(u; t),     (3.4)

or, in terms of z_α = 1 − u_α,

    (∂/∂t) G_{X_t}(z; t) = Σ_{α=1}^s [λ_α (1 − z_α) z_α − μ_α z_α
        − Σ_{β≠α} ν_αβ (z_α − z_β)] (∂/∂z_α) G_{X_t}(z; t).     (3.5)
PROBLEMS
1. Show that the assumption of independence of individuals' transitions
implies the relations

    ν*_αβ = x_{αt} ν_αβ   and   μ*_α = x_{αt} μ_α,

as in (2.2).
2. Backward differential equations. Derive a system of backward differential
equations for the probabilities in (1.4).

4. One must verify that each particular solution satisfies the differential
equations in Chapters 6, 7, and 8 [for example, (2.26)] before adding the
individual solutions to obtain a general solution. Explain in detail why such
a verification or assumption is necessary.
5. Continuation . What form will the general solution have if two of the
eigenvalues are complex conjugates? Explain.
6. Continuation. What form will the general solution have if the characteristic
equation has multiple roots? Explain.
7. Show that when the eigenvalues pj are real and distinct, the vector Z in
9. Show that the p.g.f. in (2.49) satisfies the partial differential equation in
(2.8).
10. Show that the components of the vector θ in (2.47) are given in (2.52).
16. Show that when the number of illness states s = 1, the partial differential
equation (3.4) reduces to that in (6.7) for the linear growth described in Chapter 3.
Part 2
CHAPTER 9

The Life Table and Its Construction

1. INTRODUCTION

Long before the development of modern probability and statistics,
men were concerned with the length of life and they constructed tables to
measure longevity. A crude table, credited to the Praetorian Praefect
Ulpianus, was constructed as early as the middle of the third century A.D.
Modern life tables date from John Graunt's Bills of Mortality.
190 THE LIFE TABLE AND ITS CONSTRUCTION [9.1
are apparent. Statistics covering a period of 100 years are available for
only a few populations and, even when available, are likely to be unreliable.
Individuals in a given cohort may emigrate or die unrecorded, and the
life expectancy of a group of people already dead is mainly of historical
interest. The current life table, by contrast, is based on a
hypothetical cohort subjected to the actual death rates in the given
population. The current life table is then a fictitious pattern reflecting the
mortality experience of a real population during a given period of time.
Cohort and current life tables may be either complete or abridged. In
a complete life table the functions are computed for each year of life; an
abridged life table differs only in that it deals with age intervals greater
than one year, with the possible exception of the first year or the first
five years of life. A typical set of intervals is: 0-1, 1-5, 5-10, 10-15, etc.
This chapter describes a general form of the life table with interpretations
of its various functions, and offers a method of constructing the current
life table. Theoretical aspects of life table functions are discussed in detail
in Chapter 10.
defined by the two exact ages stated, except for the final age interval,
which is half open, with beginning age denoted by w.

Column 2. Proportion dying in the interval, q_x. Each entry is an
estimate of the probability that an individual alive at the exact age x will
die during the interval (x, x + 1). The figures in this column are derived
9.2] DESCRIPTION OF THE LIFE TABLE 191
from the corresponding age specific death rates of the current population.
These proportions are the basic quantities from which figures in other
columns of the table are computed. To avoid decimals, they are sometimes
expressed as the number of deaths per 1000 population.
    d_x = l_x q_x,   x = 0, 1, ..., w,     (2.1)

    l_{x+1} = l_x − d_x,   x = 0, 1, ..., w − 1.     (2.2)
Starting with the first age interval, we use Equation (2.1) for x = 0 to
obtain the number d0 dying in the interval and Equation (2.2) for x = 0
to obtain the number who survive to the end of the interval. With
persons alive at the exact age 1 we again use the relations (2.1) and (2.2)
for x = 1 to obtain the corresponding figures for the second interval. By
repeated application of (2.1) and (2.2) we compute all the figures in
columns 3 and 4.
Column 5. Fraction of last year of life for age x, a'_x. Each of the d_x
people who die during the interval (x, x + 1) has lived x complete years
plus some fraction of the year (x, x + 1). The average of these fractions,
denoted by a'_x, plays an important role not only in the construction of life
tables, but also in the stochastic studies of life table functions as presented
in Chapter 10.
Column 6. Number of years lived in the interval, L_x. Each
member of the cohort who survives the year (x, x + 1) contributes one
year to L_x, while each member who dies during the year (x, x + 1) con-
tributes, on the average, a fraction a'_x of a year, so that

    L_x = (l_x − d_x) + a'_x d_x,   x = 0, 1, ..., w − 1,     (2.3)

where the first term on the right side is the number of years lived in the
interval (x, x + 1) by the (l_x − d_x) survivors, and the second term is the
number lived by the d_x who die. Formula (2.3) may also be written

    L_x = l_x − (1 − a'_x) d_x.     (2.4)
Column 7. Total number of years remaining, T_x. This total is
equal to the sum of the number of years lived in each age interval beginning
with age x, or

    T_x = L_x + L_{x+1} + ⋯ + L_w,   x = 0, 1, ..., w,     (2.5)

which satisfies the recurrence relation

    T_x = L_x + T_{x+1}.     (2.6)

Column 8. Observed expectation of life, ê_x. This is the average
number of years yet to be lived by a person now aged x. Since the total
number of years of life remaining to the l_x individuals is T_x,

    ê_x = T_x / l_x,   x = 0, 1, ..., w.     (2.7)
The expectation of life decreases as age x
increases, with the single exception of the first year of life, when the reverse
is true due to high mortality during that year. In the 1960 California
population, for example, the expectation of life at birth is ê_0 = 70.58
years whereas at age one ê_1 = 71.30. The symbol ê_x denotes the observed
expectation of life. We also define
    p_x = 1 − q_x,     (2.8)

the proportion of survivors over the age interval (x, x + 1), and

    p_xy = p_x p_{x+1} ⋯ p_{y−1} = l_y / l_x,     (2.9)

the proportion of those living at age x who will survive to age y. When
x = 0, p_0y becomes the proportion of the total born alive who survive to
age y; clearly

    p_0y = l_y / l_0.
followed throughout its life span. The number lx of survivors and the
number dx of deaths recorded for each age constitute the basic variables
from which, using (2.1), q x is computed. The figures in the remaining
columns are obtained by the procedures used for the current life table.
    e_x = (l_{x+1} + l_{x+2} + ⋯) / l_x

was first introduced. This expectation considers only the completed years of life lived by
survivors, whereas the complete expectation of life takes into account also the fractional
years of life lived by those who die in any year. Under the assumption that each person
who dies during any year of age lives half of the year on the average, the complete
expectation of life is given by

    ê_x = e_x + ½.

Since the curtate expectation is no longer in use, in this book the symbol e_x is used to
denote the true expectation of life at age x.
more variation and should be revised when necessary; the relevant informa-
tion is often available in vital statistics publications. The data suggest
a'_0 = .10 for the white population and a'_0 = .14 for the nonwhite
population.
In a complete life table age intervals are one year in length. The basic
quantities are the number D_x of people who died at age x in a current
population and the corresponding age specific death rate M_x. To construct
the life table we begin by establishing a basic relation between the pro-
portion q_x and the rate M_x. Let N_x be the number of people alive at the
exact age x, among whom D_x deaths occur in the interval (x, x + 1).
Then by definition q_x is given by

    q_x = D_x / N_x.     (3.1)
9.3] CONSTRUCTION OF THE COMPLETE LIFE TABLE 195
On the other hand, the age specific death rate M_x is the ratio of D_x to the
total number of years lived by the N_x people during the interval (x, x + 1).
This total number is the sum of (N_x - D_x) years lived by the survivors
and a'_x D_x years lived by those dying during the year; hence

M_x = D_x / [(N_x - D_x) + a'_x D_x]. (3.2)
When the denominator is estimated with the corresponding midyear
population P_x,

(N_x - D_x) + a'_x D_x = P_x, (3.3)

so that

N_x = P_x + (1 - a'_x) D_x. (3.5)

Substituting (3.5) in (3.1) gives

q_x = D_x / [P_x + (1 - a'_x) D_x],   x = 0, 1, . . . , w - 1. (3.6)

Dividing the numerator and the denominator on the right side of (3.6) by
P_x, we obtain a basic relation between q_x and M_x:

q_x = M_x / [1 + (1 - a'_x) M_x],   x = 0, 1, . . . , w. (3.7)
Formula (3.7) is fundamental in the construction of complete life tables
by the present method and was given in Chiang [1960b, 1961].
To illustrate, let us consider the total California 1960 population as
shown in Table 1. For the first year of life, we have P_0 = 356,435 in
Column 2 and D_0 = 8663 in Column 3. Thus, the age specific rate for
x = 0 is

M_0 = D_0 / P_0 = 8663 / 356,435 = .024305.

The average fraction of the year lived by an infant who dies in his first
year of life is a'_0 = .10. Therefore, the proportion dying is

q_0 = .024305 / [1 + (1 - .10)(.024305)] = .02378.
Table 1. Construction of complete life table for total California population,
1960

x to x + 1   P_x   D_x   M_x   a'_x   q_x
(1)          (2)   (3)   (4)   (5)    (6)

[Table values not reproduced.]
9.3] CONSTRUCTION OF THE COMPLETE LIFE TABLE 199
When all the values of q_x have been computed and l_0 has been selected,
d_x and l_x for successive values of x are determined from Equations (2.1)
and (2.2) as shown in Table 2. For the 1960 California population we
determine first the number of life table infant deaths,

d_0 = l_0 q_0 = 100,000 × .02378 = 2378,

and the number living at age one,

l_1 = l_0 - d_0 = 100,000 - 2378 = 97,622.
The formula for the number of years L_x lived in the age interval (x, x + 1)
is derived also with the aid of a'_x, the fraction of the last year of life as
given in Section 2:

L_x = (l_x - d_x) + a'_x d_x,   x = 0, 1, . . . . (2.3)
Remark 4. The ratio d_x / L_x is known as the life table death rate for
age x. Since a life table is entirely based on the age specific death rates
of the current population, the death rates computed from the life table
should be equal to the corresponding rates of the current population;
symbolically,

d_x / L_x = M_x,   x = 0, 1, . . . . (3.8)

To prove equation (3.8), we substitute (2.3) in the left side of (3.8) and
divide the resulting expression by l_x to obtain

d_x / L_x = q_x / [1 - (1 - a'_x) q_x],
interval age w and over, and q_w = 1. The length of the interval is infinite,
and

L_w = l_w / M_w. (3.10)
Table 2. Complete life table for total California population, 1960

Age Interval (in Years) x to x + 1 | Proportion Dying in Interval (x, x + 1) q_x | Number Living at Age x l_x | Number Dying in Interval (x, x + 1) d_x | Fraction of Last Year of Life a'_x | Number of Years Lived in Interval (x, x + 1) L_x | Total Number of Years Lived Beyond Age x T_x | Observed Expectation of Life at Age x e_x
(1) (2) (3) (4) (5) (6) (7) (8)

[Table values not reproduced.]
L_w = l_w / M_w. (3.11)

Obviously

T_w = L_w and e_w = L_w / l_w. (3.12)
In the California 1960 life table w = 95, and l_95 = 1699. The death rate
for age 95 and over is M_95 = .351685; therefore

L_95 = 1699 / .351685 = 4831

and

T_95 = 4831 and e_95 = 4831 / 1699 = 2.84. (3.11a)
Calculations for the remaining columns, T_x and e_x, were explained
adequately in Section 2.
9.4] CONSTRUCTION OF THE ABRIDGED LIFE TABLE 203
q_i = D_i / N_i, (4.1)

and M_i is the ratio of D_i to the total number of years lived by the N_i
people in the interval (x_i, x_{i+1}):

M_i = D_i / [(N_i - D_i) n_i + a_i n_i D_i]. (4.2)

Solving (4.2) for N_i and substituting the resulting expression in (4.1) give
the basic formula in the construction of an abridged life table,

q_i = n_i M_i / [1 + (1 - a_i) n_i M_i]. (4.3)
When the age interval is a single year, (4.3) reduces to (3.7). The age
specific death rate may be estimated from

M_i = D_i / P_i, (4.4)

where P_i is the midyear population.
All other quantities in the table are functions of q_i, a_i, and the radix
l_0. The number d_i of deaths in (x_i, x_{i+1}) and the number l_{i+1} of survivors
at age x_{i+1} are computed from

d_i = l_i q_i,   i = 0, 1, . . . , w - 1, (4.5)

and

l_{i+1} = l_i - d_i,   i = 0, 1, . . . , w - 1. (4.6)

The number of years lived in (x_i, x_{i+1}) by the l_i
survivors at age x_i is

L_i = n_i(l_i - d_i) + a_i n_i d_i,   i = 0, 1, . . . , w - 1, (4.7)

and for the last interval

L_w = l_w / M_w, (4.8)

where M_w is again the specific death rate for people of age x_w and over.
The total number T_i of years remaining to all the people attaining age
x_i is the sum of L_j for j = i, i + 1, . . . , w. The observed expectation of
life at age x_i is

e_i = T_i / l_i,   i = 0, 1, . . . , w. (4.9)
As an example, the construction of the abridged life table for the 1960
total United States population is given in Tables 3 and 4.
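The whole abridged construction (4.3)–(4.9) can be sketched as one small function. This is an illustration of the method, not the text's own computation; the interval lengths, rates, and fractions passed in at the bottom are hypothetical, not the United States 1960 values.

```python
# A compact sketch of the abridged-table recursion (4.3)-(4.9).
# All numerical inputs below are hypothetical illustration values.

def abridged_life_table(n, M, a, radix=100_000):
    # (4.3): convert rates to proportions dying; final open interval has q = 1
    q = [ni * Mi / (1 + (1 - ai) * ni * Mi)
         for ni, Mi, ai in zip(n[:-1], M[:-1], a[:-1])]
    q.append(1.0)
    l, d, L = [radix], [], []
    for i, qi in enumerate(q):
        d.append(l[-1] * qi)                                        # (4.5)
        if i < len(q) - 1:
            L.append(n[i] * (l[-1] - d[-1]) + a[i] * n[i] * d[-1])  # (4.7)
        else:
            L.append(l[-1] / M[-1])                                 # (4.8)
        l.append(l[-1] - d[-1])                                     # (4.6)
    T = [sum(L[i:]) for i in range(len(L))]
    e = [Ti / li for Ti, li in zip(T, l)]                           # (4.9)
    return q, l, d, L, T, e

q, l, d, L, T, e = abridged_life_table(
    n=[1, 4, 5], M=[0.0243, 0.0010, 0.0500], a=[0.10, 0.40, 0.50])
```

Because q_w = 1, the deaths d_i account for the entire radix, which is a useful internal check on any implementation.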
of life. Let N_i be the number of people alive at x_i; among them a pro-
Table 3. Construction of abridged life table for total United States
population, 1960

Age Interval (in Years) x_i to x_{i+1} | Midyear Population in Interval (x_i, x_{i+1}) P_i | Number of Deaths in Interval (x_i, x_{i+1}) D_i | Death Rate in Interval (x_i, x_{i+1}) M_i | Fraction of Last Age Interval of Life a_i | Proportion Dying in Interval q_i
(1) (2) (3) (4) (5) (6)

[Table values not reproduced.]

Source: U.S. Department of Commerce, Bureau of the Census, United States Census of
Population 1960, United States Summary, Detailed Characteristics, Final Report.
Table 4. Abridged life table for total United States population, 1960

Age Interval (in Years) x_i to x_{i+1} | Proportion Dying in Interval (x_i, x_{i+1}) q_i | Number Living at Age x_i l_i | Number Dying in Interval (x_i, x_{i+1}) d_i | Fraction of Last Age Interval of Life a_i | Number of Years Lived in Interval (x_i, x_{i+1}) L_i | Total Number of Years Lived Beyond Age x_i T_i | Observed Expectation of Life at Age x_i e_i

[Table values not reproduced.]
years lived in (x_i, x_{i+1}) by all those who die at any age in the interval is of
course the sum over the single years of age. Here

a_i = [Σ_{x=x_i}^{x_{i+1}-1} (x - x_i + a_x) p_{ix} q_x] / [n_i (1 - p_{i,i+1})],   i = 0, 1, . . . , w - 1. (4.13)
The proportion q_x of deaths for a single year of age is computed from (3.7),
and the proportions p_{ix} and p_{i,i+1} of survivors are computed from (2.9);
of the 1960 United States population, but it should be emphasized that they
are equally applicable to other populations.² To see this, let us rewrite
formula (4.12) as follows:

p_{ix} q_x / (1 - p_{i,i+1}),   x_i ≤ x < x_{i+1}, (4.15)

is the proportion of those dying in the interval (x_i, x_{i+1}) who will die in the
year (x, x + 1). This shows that a_i depends neither on the values of q_x nor on the level of mortality
within the interval. Since the trend of mortality does not vary much from
one current population to another, the a_i may be regarded as constant
(with the possible exception of a_0) for subpopulations of the United States
and may be revised every decade. The invariant property of a_i can be
verified with empirical data. In Table 5 are the fractions a_i determined
for four populations; the single-year values used were those listed in Tables 1 and 2 except a_0 = .11 for England and
Wales. Table 5 shows a remarkable agreement among the four sets of
values of a_i.
² I express my appreciation to the National Center for Health Statistics, P.H.S., U.S.
Department of Health, Education and Welfare for permission to use their data for the
computation of a_i.
the mortality pattern over an entire interval but not on the mortality rate
for any single year. When the mortality rate increases with age in an
interval, the fraction a_i > 1/2; when the reverse pattern prevails, a_i < 1/2.
Consider, for example, the age intervals (5, 10) and (10, 15) in the 1960
United States population. Although a_x = 1/2 for each age in the two
intervals, a_i = .46 for interval (5, 10) and a_i = .54 for interval (10, 15).
Table 6. Computation of a_i for age intervals (5, 10) and (10, 15) based
on 1960 United States population experience

Age Interval x to x + 1 | Fraction of the Last Year of Life a_x | Proportion Dying in Age Interval q_x | Fraction of Last Age Interval a_i
(1) (2) (3) (4)

[Table values not reproduced.]
q̂_i = D_i / N_i, (5.1)

where D_i is the number of deaths in the interval (x_i, x_{i+1}) out of the N_i alive at age
x_i. If we assume that all individuals in the group N_i have the same prob-
ability of dying in the interval, then q̂_x and q̂_i
are binomial proportions, and their sample variances are

S²_{q̂_x} = q̂_x(1 - q̂_x) / N_x (5.3)

and

S²_{q̂_i} = q̂_i(1 - q̂_i) / N_i. (5.4)
Since N_i = D_i / q̂_i, the sample variance (5.4) may be written

S²_{q̂_i} = q̂_i²(1 - q̂_i) / D_i, (5.5)

with

q̂_i = n_i M_i / [1 + (1 - a_i) n_i M_i]. (4.3)
Substituting (4.3) in (5.5), we have the required formula for the sample
variance of q̂_i,

S²_{q̂_i} = n_i² M_i (1 - a_i n_i M_i) / {P_i [1 + (1 - a_i) n_i M_i]³}. (5.6)

For a single year of age this becomes

S²_{q̂_x} = M_x(1 - a'_x M_x) / {P_x [1 + (1 - a'_x) M_x]³}. (5.7)

If a'_x is assumed to be 1/2, (5.7) becomes

S²_{q̂_x} = 4M_x(2 - M_x) / [P_x (2 + M_x)³]. (5.8)
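The reduction of (5.7) to (5.8) is easy to confirm numerically; the sketch below uses arbitrary test values of P_x and M_x, not figures from the text.

```python
# Numerical check that (5.8) is (5.7) specialized to a'_x = 1/2.
# P and M below are arbitrary test numbers, not data from the text.

def var_q_general(P, M, a):
    """Sample variance of q_x, formula (5.7)."""
    return M * (1 - a * M) / (P * (1 + (1 - a) * M) ** 3)

def var_q_half(P, M):
    """Formula (5.8), the a'_x = 1/2 special case."""
    return 4 * M * (2 - M) / (P * (2 + M) ** 3)

P, M = 250_000, 0.0123
assert abs(var_q_general(P, M, 0.5) - var_q_half(P, M)) < 1e-18
```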
p̂_{ij} = p̂_i p̂_{i+1} · · · p̂_{j-1},   i < j; i, j = 0, 1, . . . , (5.9)

and

p̂_i = 1 - q̂_i,   i = 0, 1, . . . , (5.10)

are based on the corresponding age-specific death rates and are independent
of one another. Therefore the sample variance of p̂_{ij} may be determined
from (cf. Equation (5.30), Chapter 1):

S²_{p̂_{ij}} = p̂_{ij}² Σ_{h=i}^{j-1} p̂_h^{-2} S²_{p̂_h}. (5.11)
9.5] SAMPLE VARIANCES OF q̂_i, p̂_{ij}, AND ê_a 211
S²_{p̂_{aj}} = p̂_{aj}² Σ_{h=a}^{j-1} p̂_h^{-2} S²_{p̂_h}. (5.12)

The sample variance of the observed expectation of life is

S²_{ê_a} = Σ_{i=a}^{w-1} p̂_{ai}² [ê_{i+1} + (1 - a_i) n_i]² S²_{q̂_i},   a = 0, 1, . . . , w - 1. (5.13)
Formula (5.13) holds true, in fact, for both the current and the cohort life
tables.
Using formula (5.13) and referring to Table 7, the essential steps in the
computation of the sample variance of ê_a are as follows.
By taking the square root of the sample variance, we obtain the sample
standard error of the observed expectation of life, as shown in Column 8.
The main life table functions and the corresponding standard errors for
the 1960 total United States population are given in Table 8.
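Formula (5.13) is simple to implement; the following sketch is an illustration of that formula only, and every numerical input in the example call is hypothetical rather than taken from Table 7.

```python
# Sketch of formula (5.13): sample variance of the observed expectation
# of life. All numerical inputs in the example call are hypothetical.

def var_e(p_surv, e, a, n, S2q, alpha=0):
    """S^2_e = sum_i p_{alpha,i}^2 [e_{i+1} + (1 - a_i) n_i]^2 S^2_{q_i}.

    p_surv[i] = p_i, e[i] = observed expectation of life at x_i,
    a[i], n[i] = fraction of last interval and interval length,
    S2q[i] = sample variance of q_i.
    """
    total, p_ai = 0.0, 1.0           # p_{alpha,alpha} = 1
    for i in range(alpha, len(p_surv) - 1):
        total += p_ai**2 * (e[i + 1] + (1 - a[i]) * n[i])**2 * S2q[i]
        p_ai *= p_surv[i]            # p_{alpha,i+1} = p_{alpha,i} p_i
    return total

sv = var_e(p_surv=[0.9, 0.8, 0.0], e=[2.0, 1.5, 1.0],
           a=[0.5, 0.5, 0.5], n=[1, 1, 1], S2q=[0.01, 0.02, 0.0])
```

Taking the square root of the returned value gives the standard error reported in a table such as Table 7, Column 8.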
q̂_i = d_i / l_i and p̂_i = l_{i+1} / l_i (5.14)

are ordinary binomial proportions with the sample variance

S²_{p̂_i} = p̂_i(1 - p̂_i) / l_i. (5.15)
Table 7. Computation of the sample variance of the observed expectation
of life, total United States population, 1960

[Table values not reproduced.]
Table 8. Main life table functions and the corresponding standard errors,
total United States population, 1960

[Table values not reproduced.]

as in (5.9) for the current life table. It is proved in Chapter 10 that the
p̂_i's are uncorrelated. Therefore the sample variance of p̂_{ij} also
has the form given in (5.12),

S²_{p̂_{ij}} = p̂_{ij}² Σ_{h=i}^{j-1} p̂_h^{-2} S²_{p̂_h}. (5.12)
It is easily shown that the right side of (5.18) is equal to the right side of
(5.16).

Let Y_a denote the future lifetime of an individual at age x_a; the observed
expectation of life is the mean of these lifetimes, or

ê_a = Ȳ_a. (5.19)

These values of Y_a are recorded in the life table in the form of a frequency
distribution in which d_i is the frequency in the interval (x_i, x_{i+1}), i = a,
a + 1, . . . , w. On the average, each of the d_i people lives x_i - x_a years plus a
fraction a_i of the interval (x_i, x_{i+1}), or x_i - x_a + a_i n_i years beyond x_a:

y_i = x_i - x_a + a_i n_i,   i = a, a + 1, . . . , w. (5.20)
The sample mean is then

Ȳ_a = (1 / l_a) Σ_{i=a}^{w} (x_i - x_a + a_i n_i) d_i, (5.21)

and the sample variance of Y_a is

S²_{Y_a} = (1 / l_a) Σ_{i=a}^{w} [(x_i - x_a + a_i n_i) - ê_a]² d_i. (5.22)

The sample variance of the mean is

S²_{Ȳ_a} = S²_{Y_a} / l_a, (5.23)

or

S²_{ê_a} = (1 / l_a²) Σ_{i=a}^{w} [(x_i - x_a + a_i n_i) - ê_a]² d_i. (5.24)
It was noted at the end of Section 5.1 that formula (5.13) holds true also
for the cohort life table. This implies the equality

Σ_{i=a}^{w-1} p̂_{ai}² [ê_{i+1} + (1 - a_i) n_i]² q̂_i (1 - q̂_i) / l_i
    = (1 / l_a²) Σ_{i=a}^{w} [(x_i - x_a + a_i n_i) - ê_a]² d_i. (5.25)
PROBLEMS
1. Show that

Σ_{x=x_i}^{x_{i+1}-1} p_{ix} q_x / (1 - p_{i,i+1}) = 1.
3. Life table based on a sample of deaths. One series of United States life tables
is based on a sample of deaths. The actual number D_i of deaths in the age interval
(x_i, x_{i+1}) is not observed directly but is estimated by a sampling procedure.
The estimated values are then used as the basis for the computation of the
mortality rate and the life table functions (see Sirken [1966] and Chiang [1967]).
We have then a sample of size S taken without replacement from the total of D
deaths, with

S_0 + · · · + S_w = S.

Derive the formula for the variance of S_i and the covariance between S_i and S_j.
q̂_i = n_i D_i / [P_i + (1 - a_i) n_i D_i]. (4.3a)
(b) In this case there is a covariance between q̂_i and q̂_j. Why? Derive the
formula for the covariance.
(c) Derive the formula for the observed expectation of life ê_a.
5. The material in Table 9 is taken from Miller and Thomas [1958] in their
study of the effects of larval crowding and body size on the longevity of adult
Drosophila melanogaster. Construct a life table for males and one for females,
and compare their longevity, assuming a_i = 1/2.
6. England and Wales life tables. Table 10 gives the population size and
number of deaths by age and sex for England and Wales, 1961. Construct one
life table for males and one for females. Comment briefly.
formula (5.13) to compute the sample variance of ê_i for the total population of
England and Wales, 1961.
Table 10. Population and deaths by age and sex, and fraction of last age
interval of life, England and Wales, 1961

Age Interval | Population P_i (Males, Females)ᵃ | Deaths D_i (Males, Females)ᵃ | Fraction of Last Age Interval a_i

[Table values not reproduced.]

ᵃ The Registrar General's Statistical Review of England and Wales for the year 1961.
Table 11. Abridged life table for total England and Wales population, 1961

Age Interval (in Years) x_i to x_{i+1} | Proportion Dying in Interval (x_i, x_{i+1}) q_i | Number Living at Age x_i l_i | Number Dying in Interval (x_i, x_{i+1}) d_i | Fraction of Last Age Interval of Life a_i | Number of Years Lived in Interval (x_i, x_{i+1}) L_i | Total Number of Years Lived Beyond Age x_i T_i | Observed Expectation of Life at Age x_i ê_i

[Table values not reproduced.]
CHAPTER 10

Probability Distributions of Life Table Functions
1. INTRODUCTION
The concept of the life table originated in longevity studies of man; it
was always presented as a subject peculiar to public health, demography,
and actuarial science. As a result, its development has not received suffi-
cient attention in the field of statistics. Actually, the problems of mortality
studies are similar to those of reliability theory and life testing, and they
p_{ij},   i < j; i, j = 0, 1, . . . , (1.1)

and
10 . 1 ] INTRODUCTION 219
Age Interval (in Years) x_i to x_{i+1} | Number Living at Age x_i l_i | Proportion Dying in Interval (x_i, x_{i+1}) q̂_i | Fraction of Last Age Interval of Life a_i | Number Dying in Interval (x_i, x_{i+1}) d_i | Number of Years Lived in Interval (x_i, x_{i+1}) L_i | Total Number of Years Lived Beyond Age x_i T_i | Observed Expectation of Life at Age x_i ê_i

x_0 to x_1:  l_0, q̂_0, a_0, d_0, L_0, T_0, ê_0
. . .
x_w and over:  l_w, q̂_w, a_w, d_w, L_w, T_w, ê_w
When x_j = x_{i+1}, we drop the second subscript and write p_i for p_{i,i+1}.
All the quantities in the life table, with the exception of l_0 and the a_i, are
treated as random variables in this chapter. The radix l_0 is conventionally
set equal to a convenient number, such as l_0 = 100,000, so that the value
of l_i clearly indicates the proportion of survivors to age x_i. We adopt the
convention and consider l_0 a constant in deriving the probability distribu-
tions of other life table functions. The distributions of the quantities in
columns L_i and T_i are not discussed because of their limited use. One
remark should be made regarding the final age interval (x_w and over):
In a conventional table the last interval is usually an open interval, for
example 95 and over; statistically speaking, x w is a random variable and
is treated accordingly. This point is discussed in Section 2.1. Throughout
this chapter we consider a group in which all
individuals are subjected to the same force of mortality and in which one
individual's survival is independent of the survival of any other individual
in the group.
The various functions of the life table are usually given for integral
ages or for other discrete intervals. In the derivation of the distribution
of survivors, however, age is more conveniently treated as a continuous
variable with formulas derived for lx , the number of individuals surviving
the age interval (0, x), for all possible values of x.
220 PROBABILITY DISTRIBUTIONS OF LIFE TABLE FUNCTIONS [10.1
It has been shown in the pure death process in Section 5, Chapter 3, that

Pr{l_x = k | l_0} = C(l_0, k) p_{0x}^k (1 - p_{0x})^{l_0 - k},   k = 0, 1, . . . , l_0, (1.4)

where

p_{0x} = exp(-∫_0^x μ(τ) dτ)

is the probability that an individual alive at age 0 will survive to age x, and
the corresponding p.g.f. is given by

G_{l_x | l_0}(s) = (1 - p_{0x} + p_{0x} s)^{l_0}. (1.6)
- (1.6)
For x = x_i, the probability that an individual will survive the age
interval (0, x_i) is p_{0i}, and

G_{l_i | l_0}(s) = (1 - p_{0i} + p_{0i} s)^{l_0}, (1.8)

so that

E[l_i | l_0] = l_0 p_{0i} (1.9)

and

σ²_{l_i | l_0} = l_0 p_{0i}(1 - p_{0i}). (1.10)
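The binomial mean and variance in (1.9)–(1.10) are easy to confirm by simulation. The sketch below is illustrative only; the values of l_0 and p_{0i} are hypothetical.

```python
import random

# Monte Carlo sketch of (1.9)-(1.10): with l_0 individuals each surviving
# to age x_i independently with probability p_{0i}, the count l_i is
# binomial, so E[l_i] = l_0 p_{0i} and Var[l_i] = l_0 p_{0i}(1 - p_{0i}).
random.seed(2)
l0, p0i = 200, 0.7
trials = 5_000
samples = [sum(random.random() < p0i for _ in range(l0)) for _ in range(trials)]
mean = sum(samples) / trials
var = sum((s - mean) ** 2 for s in samples) / trials
# theory: mean l0*p0i = 140, variance l0*p0i*(1-p0i) = 42
```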
for a ≤ i < j. The p.g.f. for the conditional distribution of l_j given
l_i > 0 is

G_{l_j | l_i}(s_j) = E(s_j^{l_j} | l_i) = (1 - p_{ij} + p_{ij} s_j)^{l_i}. (1.13)
10.2] NUMBERS OF SURVIVORS 221
Although formula (1.13) holds true whatever x_i < x_j may be, the
conditional probabilities of l_j relative to l_0, l_1, . . . , l_i are the same as those
relative to l_i alone, in the sense that for each k

Pr{l_j = k | l_0, l_1, . . . , l_i} = Pr{l_j = k | l_i}. (1.15)
G_{l_1, . . . , l_u | l_0}(s_1, . . . , s_u) = E(s_1^{l_1} · · · s_u^{l_u} | l_0). (2.1)
To derive an explicit formula for the p.g.f. (2.1), we use the identity
(see Equation (4.22) in Chapter 1):

E[s_1^{l_1} · · · s_{i+1}^{l_{i+1}} | l_0] = E[s_1^{l_1} · · · s_i^{l_i} E{s_{i+1}^{l_{i+1}} | l_i} | l_0] (2.2)

and write

G_{l_1, . . . , l_u | l_0}(s_1, . . . , s_u) = E[s_1^{l_1} · · · s_i^{l_i} E{s_{i+1}^{l_{i+1}} · · · s_u^{l_u} | l_i} | l_0],
    i = 0, 1, . . . , u - 1, (2.3)

where the conditional expectation of the quantity inside the braces is the
p.g.f. of the conditional distribution of l_{i+1} given l_i, with the explicit
function presented in (1.14). Now we use the identity (2.3) for i = u - 1:

G_{l_1, . . . , l_u | l_0}(s_1, . . . , s_u) = E[s_1^{l_1} · · · s_{u-1}^{l_{u-1}} E{s_u^{l_u} | l_{u-1}} | l_0] (2.4)

    = E[s_1^{l_1} · · · s_{u-1}^{l_{u-1}} (1 - p_{u-1} + p_{u-1} s_u)^{l_{u-1}} | l_0]
    = E[s_1^{l_1} · · · s_{u-2}^{l_{u-2}} z_{u-1}^{l_{u-1}} | l_0], (2.5)

where

z_{u-1} = s_{u-1}(1 - p_{u-1} + p_{u-1} s_u). (2.6)

Repeating the argument gives

G_{l_1, . . . , l_u | l_0}(s_1, . . . , s_u) = E[s_1^{l_1} · · · s_{u-3}^{l_{u-3}} z_{u-2}^{l_{u-2}} | l_0], (2.8)

where

z_{u-2} = s_{u-2}(1 - p_{u-2} + p_{u-2} z_{u-1}). (2.9)

Continuing in this manner we arrive at the explicit formula

G_{l_1, . . . , l_u | l_0}(s_1, . . . , s_u) = [1 - {p_{01}(1 - s_1) + p_{02} s_1(1 - s_2)
    + p_{03} s_1 s_2(1 - s_3) + · · · + p_{0u} s_1 s_2 · · · s_{u-1}(1 - s_u)}]^{l_0}. (2.11)
The p.g.f. (2.11) is then used to derive the joint probability function and
moments of the random variables l_1, . . . , l_u by differentiating (2.11). The
joint probability function is

Pr{l_1 = k_1, . . . , l_u = k_u | l_0} = Π_{i=0}^{u-1} C(k_i, k_{i+1}) p_i^{k_{i+1}} (1 - p_i)^{k_i - k_{i+1}},
    k_{i+1} = 0, 1, . . . , k_i, with k_0 = l_0. (2.12)

The expectations and covariances are

E(l_i | l_0) = l_0 p_{0i} (2.13)

and

σ_{l_i l_j | l_0} = l_0 p_{0j}(1 - p_{0i}),   i < j; i, j = 1, . . . , u. (2.14)
Let W denote the last age interval of the table, that is, the interval in
which the last of the l_0 individuals dies. Then

Pr{W = w | l_0} = (1 - p_{0,w+1})^{l_0} - (1 - p_{0w})^{l_0},   w = 0, 1, . . . , (2.15)

where, for convenience,

p_{0w} = p_0 p_1 · · · p_{w-1}. (2.16)

Summing (2.15) from w = 0 to w = v gives 1 - (1 - p_{0,v+1})^{l_0}; (2.19)

letting v → ∞ and p_{0,v+1} → 0, we have from (2.19) that the probabilities
add to unity.

When p_x = p for every x, p_{0w} = p^w. For l_0 = 1, Pr{W = w} = p^w(1 - p);
in general,

Pr{W = w | l_0} = Σ_{k=1}^{l_0} (-1)^{k+1} C(l_0, k) p^{wk}(1 - p^k),   w = 0, 1, . . . , (2.22)

and the expectation, the variance, and the p.g.f. of W all have closed
forms. Using (2.22) we compute the p.g.f. of W,

G_W(s) = Σ_{k=1}^{l_0} (-1)^{k+1} C(l_0, k) (1 - p^k)(1 - p^k s)^{-1}, (2.23)

and the expectation

E(W) = Σ_{k=1}^{l_0} (-1)^{k+1} C(l_0, k) p^k (1 - p^k)^{-1}. (2.24)
The numbers of deaths satisfy

d_0 + d_1 + · · · + d_w = l_0, (3.1)

and their joint probability function is the multinomial

Pr{d_0 = δ_0, . . . , d_w = δ_w | l_0} = [l_0! / (δ_0! · · · δ_w!)] (p_{00} q_0)^{δ_0} · · · (p_{0w} q_w)^{δ_w}, (3.3)

with expectations

E(d_i | l_0) = l_0 p_{0i} q_i, (3.4)

and thus the numbers of deaths in intervals beyond x_a also have a multi-
nomial distribution.
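The multinomial cell probabilities p_{0i} q_i in (3.3)–(3.4) can be sketched as follows; the survival probabilities used are hypothetical, not taken from any table in the text.

```python
# Sketch of (3.3)-(3.4): the deaths d_0, ..., d_w out of l_0 follow a
# multinomial distribution with cell probabilities p_{0i} q_i.
# The survival probabilities p below are hypothetical.

p = [0.9, 0.8, 0.0]              # p_i per interval; the last interval has q_w = 1
q = [1 - pi for pi in p]
p0 = [1.0, p[0], p[0] * p[1]]    # p_{0i} = p_0 p_1 ... p_{i-1}
cell = [p0i * qi for p0i, qi in zip(p0, q)]
assert abs(sum(cell) - 1.0) < 1e-12   # every individual eventually dies

l0 = 100_000
expected_deaths = [l0 * c for c in cell]   # E(d_i | l_0) = l_0 p_{0i} q_i  (3.4)
```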
p_i + q_i = 1,   i = 0, 1, . . . . (4.1)

For a single individual define the random variables

ε_i = 1 if the individual dies in the interval (x_i, x_{i+1}),
    = 0 otherwise, (4.2)

so that Σ_{i=0}^∞ ε_i = 1, with

Pr{ε_i = 1} = p_{0i} q_i = p_{0i} - p_{0,i+1},
Pr{ε_i = 0} = 1 - (p_{0i} - p_{0,i+1}). (4.3)

Clearly

Pr{ε_0 = 1} + Pr{ε_1 = 1} + · · · = q_0 + p_0 q_1 + · · · = 1. (4.4)

This means the probability is one that an individual alive at age x_0 will
eventually die. The joint probability of the random variables in the
sequence is

f(ε_i; p_i) = Π_{i=0}^∞ [p_{0i}(1 - p_i)]^{ε_i}   for Σ_{i=0}^∞ ε_i = 1,
          = 0 otherwise. (4.6)

For the l_0 individuals in the original cohort, the likelihood function is

L = Π_{β=1}^{l_0} f(ε_{iβ}; p_i) = Π_{β=1}^{l_0} Π_{i=0}^∞ [p_{0i}(1 - p_i)]^{ε_{iβ}}. (4.7)
10.4] OPTIMUM PROPERTIES OF p̂_j AND q̂_j 227
Writing

Σ_{β=1}^{l_0} ε_{iβ} = d_i, (4.8)

we rewrite (4.7) in terms of the d_i. Differentiating the logarithm of the
likelihood and setting

∂ log L / ∂p_j = 0,   j = 0, 1, . . . , (4.11)

yields the maximum-likelihood estimator

p̂_j = Σ_{i=j+1}^∞ d_i / Σ_{i=j}^∞ d_i = l_{j+1} / l_j, (4.12)

where

l_j = Σ_{i=j}^∞ d_i (4.13)

is the number of survivors at age x_j. It should be noted that if for some age
x_w all the l_w individuals alive at x_w die within the interval (x_w, x_{w+1}),
that is, ε_{wβ} = 1 for all β, then ε_{iβ} = 0, d_i = 0, and l_i = 0 for all i > w.
E[p̂_j] = E[E(l_{j+1} / l_j | l_j)] = p_j, (4.14)

E[p̂_j²] = E(1 / l_j) p_j(1 - p_j) + p_j², (4.15)

and hence

σ²_{p̂_j} = p_j(1 - p_j) E(1 / l_j). (4.16)

For j < k,

E[p̂_j p̂_k] = E[p̂_j E(l_{k+1} / l_k | l_k)] = E[p̂_j] p_k = E[p̂_j] E[p̂_k], (4.18)

so that

σ_{p̂_j p̂_k} = 0. (4.19)
Observe that formula (4.19) of zero covariance holds true only for
proportions in two nonoverlapping age intervals. If the two intervals
considered both begin with age x_a but extend to the ages x_j and x_k, respec-
tively, the covariance between the proportions p̂_{aj} and p̂_{ak} is not equal to
zero. Easy computation shows that

E(1 / l_1) > 1 / E(l_1). (4.23)

Hence (4.21) holds true if

E(l_1²) / E(l_1) > E(l_1), (4.25)

or if

E(l_1²) - [E(l_1)]² > 0, (4.26)

which is always true since the left side is the variance of l_1, proving (4.21)
and the independence of p̂_0 and p̂_1 (see Problem 16 in Chapter 1).
¹ It should be pointed out that when the parameters p_j, j = 0, 1, . . . , are estimated
jointly, the lower bound is a function of the expectations of the mixed partial derivatives,
j, k = 0, 1, . . . . Formula (4.27), however, is correct, because in the present case the
expectations of the "mixed" partial derivatives vanish whatever j ≠ k may be.
Since

E[∂ log L / ∂p_j] = 0, (4.28)

we have

Var(∂ log L / ∂p_j) = E[(∂ log L / ∂p_j)²] = -E[∂² log L / ∂p_j²]. (4.29)

Furthermore,

Cov(p̂_j, ∂ log L / ∂p_j) = E[p̂_j (∂ log L / ∂p_j)] = (∂ / ∂p_j) E[p̂_j] = (∂ / ∂p_j) p_j = 1. (4.31)

Since the square of the covariance between two random variables cannot
exceed the product of the two variances, we have

Var(p̂_j) ≥ 1 / {-E[∂² log L / ∂p_j²]}. (4.32)

Direct computation gives

-E[∂² log L / ∂p_j²] = l_0 p_{0j} / [p_j(1 - p_j)], (4.33)

and the lower bound is

p_j(1 - p_j) / (l_0 p_{0j}). (4.34)
The difference between the lower bound in (4.34) and the exact formula
(4.16) lies in the difference between 1/(l_0 p_{0j}) and E(1/l_j). Therefore, relative
to the lower bound, the efficiency of p̂_j is 1/[l_0 p_{0j} E(1/l_j)]. However, we shall
show in the next section that the maximum-likelihood estimator p̂_j given
in (4.12) has the minimum variance of all the unbiased estimators of p_j.
Functions T_k = T_k(X_1, . . . , X_n), k = 1, . . . , r, are called joint sufficient
statistics for θ_1, . . . , θ_r if and only if the joint probability (density) function
f(x_1, . . . , x_n; θ_1, . . . , θ_r) can be factorized as follows:

f(x_1, . . . , x_n; θ_1, . . . , θ_r) = g(t_1, . . . , t_r; θ_1, . . . , θ_r) h(x_1, . . . , x_n).

In other
words, sufficient statistics exhaust all the information about the parameters
contained in the sample. The concept of sufficient statistics was introduced
by Fisher [1922], and the above factorization was developed by Neyman
[1935]. For a detailed discussion on sufficiency and other optimum
properties of estimators, see Lehmann [1959], Hogg and Craig [1965],
Kendall and Stuart [1961], and other standard textbooks in statistics.
In the present case, the random variables are ε_{iβ} and the parameters
are p_0, p_1, . . . ; the joint probability of the ε_{iβ} is given in (4.7) or (4.9),
and it can be factorized in the required form with

g = Π_{i=0}^∞ C(l_i, l_{i+1}) p_i^{l_{i+1}} (1 - p_i)^{l_i - l_{i+1}} (4.38)

and

h(ε_{iβ}; l_i) = Π_{i=0}^∞ [C(l_i, l_{i+1})]^{-1}, (4.39)

so that l_j and l_{j+1} are joint sufficient statistics for p_j.
σ²_{p̂*_j} ≥ σ²_{p̂_j}, (4.40)

and the equality sign holds true if and only if p̂*_j = p̂_j. To prove (4.40) we
note that, since l_j and l_{j+1} are sufficient statistics for p_j, the conditional
expectation

E(p̂*_j | l_j, l_{j+1}) = V(l_j, l_{j+1}) (4.41)

does not depend on p_j, and

p_j = E(p̂*_j) = E[E(p̂*_j | l_j, l_{j+1})] = E[V(l_j, l_{j+1})]. (4.42)

Computing the expectation in (4.42) over the joint distribution of l_j and
l_{j+1} and invoking the completeness of that family of distributions shows
that

V(l_j, l_{j+1}) = l_{j+1} / l_j = p̂_j, (4.45)

meaning that

E[p̂*_j | l_j, l_{j+1}] = p̂_j. (4.46)
10.5] DISTRIBUTION OF THE OBSERVED EXPECTATION OF LIFE 233
σ²_{p̂*_j} = E(p̂*_j - p_j)² = E[(p̂*_j - p̂_j) + (p̂_j - p_j)]² ≥ E(p̂_j - p_j)² = σ²_{p̂_j}, (4.50)

which was to be shown. The two variances are equal only in the trivial
case when p̂*_j = p̂_j.

Thus we have shown that the quantities p̂_j and q̂_j in the life table are the
unique, unbiased, efficient estimators of the corresponding probabilities p_j
and q_j.
f(y_a) = exp(-∫_{x_a}^{x_a + y_a} μ(τ) dτ) μ(x_a + y_a),   y_a > 0. (5.1)
Here

∫_{x_a}^{x_a + y_a} μ(τ) dτ = ∫_0^{y_a} μ(x_a + t) dt, (5.3)

and, substituting v for the inner integral,

∫_0^∞ f(y_a) dy_a = ∫_0^∞ e^{-v} dv = 1. (5.5)

The expectation and the variance of Y_a are

e_a = E(Y_a) = ∫_0^∞ y_a f(y_a) dy_a (5.6)

and

σ²_{Y_a} = ∫_0^∞ (y_a - e_a)² f(y_a) dy_a, (5.7)

with the alternative expression

e_a = ∫_0^∞ exp(-∫_{x_a}^{x_a + y_a} μ(τ) dτ) dy_a. (5.8)

It is instructive to prove that the two alternative definitions (5.6) and (5.8)
are identical. Let u = y_a, du = dy_a, and

v = -exp(-∫_{x_a}^{x_a + y_a} μ(τ) dτ). (5.9)

Integration by parts gives

∫_0^∞ y_a exp(-∫_{x_a}^{x_a + y_a} μ(τ) dτ) μ(x_a + y_a) dy_a
    = [-y_a exp(-∫_{x_a}^{x_a + y_a} μ(τ) dτ)]_0^∞
    + ∫_0^∞ exp(-∫_{x_a}^{x_a + y_a} μ(τ) dτ) dy_a. (5.11)

The first term on the right vanishes and the second term is the same as
(5.8), proving the identity.
each of which has the probability density function (5.1), the expectation
(5.6), and the variance (5.7). According to the central limit theorem, for
large l_a the distribution of the sample mean

Ȳ_a = (1 / l_a) Σ_{k=1}^{l_a} Y_{ak} (5.12)

is approximately normal. We shall show that

Ȳ_a = ê_a, (5.13)

where the deaths satisfy

d_a + · · · + d_w = l_a. (5.14)
exact age at which death occurs, that is, on the distribution of deaths
within each age interval. Suppose that the distribution of deaths in the
interval (x_i, x_{i+1}) is such that, on the average, each of the d_i individuals
lives a_i n_i years in the interval. Each thus lives x_i - x_a + a_i n_i years
after age x_a, and the sample mean is given by

Ȳ_a = (1 / l_a) [Σ_{i=a}^{w} (x_i - x_a) d_i + Σ_{i=a}^{w} a_i n_i d_i]. (5.15)
By definition

x_i - x_a = n_a + n_{a+1} + · · · + n_{i-1};

hence

Σ_{i=a}^{w} (x_i - x_a) d_i = Σ_{j=a}^{w-1} n_j Σ_{i=j+1}^{w} d_i = Σ_{j=a}^{w-1} n_j (l_j - d_j), (5.16)

since l_j = d_j + d_{j+1} + · · · + d_w. Substituting (5.16) in (5.15) gives

Ȳ_a = (1 / l_a) [Σ_{i=a}^{w-1} n_i(l_i - d_i) + Σ_{i=a}^{w} a_i n_i d_i]
    = (1 / l_a) [Σ_{i=a}^{w-1} {n_i(l_i - d_i) + a_i n_i d_i} + a_w n_w d_w]. (5.17)

Let

L_i = n_i(l_i - d_i) + a_i n_i d_i,   i = a, . . . , w - 1, (5.18)
and let

L_w = a_w n_w d_w (5.19)

be the number of years lived by l_w beyond age x_w. Using (5.18) and (5.19)
we rewrite (5.17) as

Ȳ_a = (1 / l_a) Σ_{i=a}^{w} L_i, (5.20)

which, of course, is ê_a, the observed expectation of life at age x_a as given in
(4.9) in Chapter 9, proving (5.13).
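The identity just proved, that the mean future lifetime in (5.15) equals ΣL_i / l_a, can be checked numerically. The cohort in the sketch below is entirely hypothetical.

```python
# Numeric check of the identity behind (5.20): the mean future lifetime
# (1/l_a) sum_i (x_i - x_a + a_i n_i) d_i equals (sum_i L_i) / l_a = T_a / l_a.
# The cohort numbers are hypothetical.

x = [0, 1, 5, 10]          # interval start points; last interval is (10, 15)
n = [1, 4, 5, 5]
a = [0.1, 0.4, 0.5, 0.5]
d = [30, 20, 25, 25]       # deaths per interval; l_0 = 100
l = [100]
for di in d:
    l.append(l[-1] - di)

mean_lifetime = sum((x[i] - x[0] + a[i] * n[i]) * d[i] for i in range(4)) / l[0]

# (5.18); in the final interval l_w = d_w, so this reduces to (5.19)
L = [n[i] * (l[i] - d[i]) + a[i] * n[i] * d[i] for i in range(4)]
e0 = sum(L) / l[0]
assert abs(mean_lifetime - e0) < 1e-9
```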
We know from (5.13) that the variance of ê_a can be obtained from the
variance of Ȳ_a. For practical purposes we need a formula for the
variance that can be estimated for both the cohort and the current life
tables. Using the relation d_i = l_i - l_{i+1}, we rewrite (5.15) as

Ȳ_a = a_a n_a + (1 / l_a) Σ_{i=a+1}^{w} c_i l_i, (5.21)

where c_i = (1 - a_{i-1}) n_{i-1} + a_i n_i.
Taking expectations, (5.21) may be rewritten² as

ê_a = a_a n_a + Σ_{i=a+1}^{w} c_i p̂_{ai},   a = 0, 1, . . . , w. (5.22)

The variance is then

Var(ê_a) = Σ_{i=a+1}^{w} c_i² Var(p̂_{ai}) + 2 Σ_{i=a+1}^{w-1} Σ_{j=i+1}^{w} c_i c_j Cov(p̂_{ai}, p̂_{aj}) (5.24)

    = E(1 / l_a) [Σ_{i=a+1}^{w} c_i² p_{ai} + 2 Σ_{i=a+1}^{w-1} c_i Σ_{j=i+1}^{w} c_j p_{aj} - (Σ_{i=a+1}^{w} c_i p_{ai})²] (5.25)

    = E(1 / l_a) [Σ_{i=a+1}^{w} c_i² p_{ai} + 2 Σ_{i=a+1}^{w-1} c_i Σ_{j=i+1}^{w} c_j p_{aj} - (e_a - a_a n_a)²]. (5.26)
² If n_i = 1 and a_i = 1/2 for all i, then c_i = 1 and (5.21) and (5.22) become

Ȳ_a = 1/2 + (1 / l_a) Σ_{i=a+1}^{w} l_i and ê_a = 1/2 + Σ_{i=a+1}^{w} p̂_{ai},

respectively.
c_i² + 2 c_i Σ_{j=i+1}^{w} c_j p_{ij} = {e_i + (1 - a_{i-1}) n_{i-1}}² - {e_{i+1} + (1 - a_i) n_i}² p_i², (5.27)

so that

Var(ê_a) = E(1 / l_a) [Σ_{i=a+1}^{w} p_{ai}({e_i + (1 - a_{i-1}) n_{i-1}}² - p_i²{e_{i+1} + (1 - a_i) n_i}²)
    - (e_a - a_a n_a)²] (5.28)

    = E(1 / l_a) Σ_{i=a}^{w-1} ({e_{i+1} + (1 - a_i) n_i}² p_{a,i+1} - {e_{i+1} + (1 - a_i) n_i}² p_i² p_{ai}²). (5.29)

Since p_{a,i+1} = p_{ai} p_i, we have the final formula for the variance of the
observed expectation of life at age x_a,

Var(ê_a) = E(1 / l_a) Σ_{i=a}^{w-1} p_{ai} p_i [e_{i+1} + (1 - a_i) n_i]² (1 - p_{ai} p_i),
    a = 0, 1, . . . , w - 1. (5.30)
Formula (5.30) reduces to the computable form (5.31) when the expectation
E(1/l_a) is replaced by 1/l_a. Thus we have the following result: if the
distribution of deaths in each interval is
such that, on the average, each of the d_i individuals lives a_i n_i years in the
interval, for i = a, a + 1, . . . , w, then as l_a approaches infinity, the
distribution of ê_a, given
by (5.21), is asymptotically normal and has the mean and variance as given
by (5.22) and (5.31), respectively.
p_{aj}, which in the current life table is computed from

p̂_{aj} = p̂_a p̂_{a+1} · · · p̂_{j-1},   j = a + 1, . . . , w. (5.32)

Hence

∂p_{aj} / ∂p_i = p_{ai} p_{i+1,j}   for a ≤ i < j;
             = 0 otherwise, (5.33)

and

Σ_{j=i+1}^{w} c_j p_{ai} p_{i+1,j} = p_{ai} [c_{i+1} + Σ_{j=i+2}^{w} c_j p_{i+1,j}].
PROBLEMS
1. Derive the probability distribution for the number l_x of survivors at age x,
given l_0 at age x = 0, by the following methods.
(a) By establishing and solving the differential equations for the probabilities
Pr{l_x = k | l_0}.
(b) By the method of the probability generating function.
l_1, . . . , l_u given l_0 is that given in (2.11).
from w = 0 to w = ∞ is unity.
Substitution of s = 1 in the last expression gives G_W(1) = 1, and taking the
derivative of G_W(s) at s = 1 yields the expectation E(W).
9. Show that

E[∂² log L / ∂p_j ∂p_k] = 0,   j ≠ k,

where log L is given in (4.10). Evaluate the expectation for k = j, and use the
result to determine the lower bound of the variance of an unbiased estimator
of p_j.
E[V(l_j, l_{j+1})] = 0

for every p_j implies that V(l_j, l_{j+1}) = 0 for all possible values of l_j and l_{j+1}.
Show that the family of the probability density functions of V(l_j, l_{j+1}) =
l_{j+1}/l_j is complete.
In general, if a function V(l_j, l_{j+1}) of sufficient statistics l_j and l_{j+1} is an unbiased
estimator of p_j and the probability density function of V(l_j, l_{j+1}) is complete,
then, according to a theorem of Blackwell [1947] and Rao [1949], V(l_j, l_{j+1}) is
the unique best estimator of p_j (see also Lehmann and Scheffé [1950]).
14. Expectation of life. Suppose the intensity of risk of death (force of mor-
tality) is independent of age so that μ(x) = μ. Find ê_x, the expectation of life at
age x.
15. Show that

(1 / l_a) Σ_{i=a}^{w} (x_i - x_a + a_i n_i) d_i = a_a n_a + Σ_{i=a+1}^{w} c_i p̂_{ai}.
CHAPTER 11

Competing Risks

1. INTRODUCTION
q_{i.δ} = Pr{an individual alive at x_i will die in the interval (x_i, x_{i+1})
    if R_δ is the only risk acting on the population};

Q_{iδ.1} = Pr{an individual alive at x_i will die in the interval (x_i, x_{i+1})
    from R_δ if R_1 is eliminated as a risk of death};

Q_{iδ.12} = Pr{an individual alive at x_i will die in the interval (x_i, x_{i+1})
    from R_δ if R_1 and R_2 are eliminated as risks of death};

and

q_i = Pr{an individual alive at x_i will die in the interval (x_i, x_{i+1})},

with p_i + q_i = 1.
The use of the terms “risk” and “cause” needs clarification. Both terms
may refer to the same condition, but are distinguished here by their
position in time relative to the occurrence of death. Prior to death the
condition referred to is a risk; after death the same condition is the cause.
For example, tuberculosis is a risk of dying to which an individual is
exposed, but tuberculosis is also the cause of death if it is the disease from
which the individual eventually dies.
In the human population the net and partial crude probabilities cannot be estimated directly, but only through their relations with the crude probability. The study of these relations is part of the problem of “competing risks.”
μ(t; 1) + · · · + μ(t; r) = μ(t) (2.2)

is the total intensity (or the total force of mortality). Although for each risk R_δ the intensity μ(t; δ) is a function of time t, we assume that within the time interval (x_i, x_{i+1}) the ratio

μ(t; δ)/μ(t) = c_{iδ} (2.3)

is independent of time t, but is a function of the interval (x_i, x_{i+1}) and risk R_δ. Assumption (2.3) permits the risk-specific intensity μ(t; δ) to vary in absolute magnitude, but requires that it remain a constant proportion of the total intensity in an interval.
11 . 2] CRUDE, NET, AND PARTIAL CRUDE PROBABILITIES 245
exp(−∫_{x_i}^{t} μ(τ) dτ) μ(t; δ) dt, (2.5)

where the exponential function is the probability of surviving from x_i to t when all risks are acting, and the factor μ(t; δ) dt is the instantaneous death probability from cause R_δ in the time interval (t, t + dt). Summing (2.5) over all possible values of t, for x_i < t < x_{i+1}, gives the crude probability

Q_{iδ} = ∫_{x_i}^{x_{i+1}} [μ(t; δ)/μ(t)] exp(−∫_{x_i}^{t} μ(τ) dτ) μ(t) dt. (2.7)

Integrating gives

Q_{iδ} = [μ(t; δ)/μ(t)] [1 − exp(−∫_{x_i}^{x_{i+1}} μ(τ) dτ)] = [μ(t; δ)/μ(t)] q_i; (2.8)

hence

Q_{i1} + · · · + Q_{ir} = q_i, i = 0, 1, . . . . (2.10)
246 COMPETING RISKS [11.2
The net probability of death in the interval (x_i, x_{i+1}) when R_δ is the only operating risk is obviously

q_{iδ} = 1 − exp(−∫_{x_i}^{x_{i+1}} μ(t; δ) dt). (2.12)

With Equation (2.9), (2.12) gives the relation between the net and the crude probabilities,

q_{iδ} = 1 − p_i^{Q_{iδ}/q_i}, (2.13)

and similarly the net probability of death when R_δ is eliminated is

q_{i·δ} = 1 − exp(−∫_{x_i}^{x_{i+1}} [μ(t) − μ(t; δ)] dt) = 1 − p_i^{(q_i − Q_{iδ})/q_i}. (2.14)

To simplify (2.17), we note from (2.9) that the ratio μ(t; δ)/[μ(t) − μ(t; 1)] is equal to Q_{iδ}/(q_i − Q_{i1}), and hence

Q_{iδ·1} = [Q_{iδ}/(q_i − Q_{i1})] {1 − exp(−∫_{x_i}^{x_{i+1}} [μ(t) − μ(t; 1)] dt)}. (2.18)

Substituting (2.14) for δ = 1 in (2.18) gives the final form

Q_{iδ·1} = [Q_{iδ}/(q_i − Q_{i1})] [1 − p_i^{(q_i − Q_{i1})/q_i}], δ = 2, . . . , r. (2.19)

The sum of Q_{iδ·1} for δ = 2, . . . , r is equal to the net probability of death when R_1 is eliminated,

Σ_{δ=2}^{r} Q_{iδ·1} = [1 − p_i^{(q_i − Q_{i1})/q_i}] Σ_{δ=2}^{r} Q_{iδ}/(q_i − Q_{i1}) = 1 − p_i^{(q_i − Q_{i1})/q_i} = q_{i·1}, (2.20)

as might have been anticipated.
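As a numerical check of relations (2.13), (2.14), (2.19), and (2.20), the following sketch computes net and partial crude probabilities from hypothetical crude probabilities; the function names and values are illustrative, not from the text.

```python
def net_prob(Q_d, p_i, q_i):
    # Net probability of dying from R_d if it acted alone, relation (2.13)
    return 1.0 - p_i ** (Q_d / q_i)

def net_prob_eliminated(Q_1, p_i, q_i):
    # Net probability of dying when R_1 is eliminated, relation (2.14)
    return 1.0 - p_i ** ((q_i - Q_1) / q_i)

def partial_crude(Q_d, Q_1, p_i, q_i):
    # Partial crude probability of dying from R_d when R_1 is eliminated, (2.19)
    return Q_d / (q_i - Q_1) * (1.0 - p_i ** ((q_i - Q_1) / q_i))

# Hypothetical crude probabilities Q_{i1}, Q_{i2}, Q_{i3} for one interval
Q = [0.02, 0.05, 0.03]
q_i = sum(Q)              # crude probability of death, relation (2.10)
p_i = 1.0 - q_i

# Each net probability exceeds the corresponding crude probability
assert all(net_prob(Qd, p_i, q_i) > Qd for Qd in Q)

# The partial crude probabilities for delta = 2, 3 sum to the net
# probability of death when R_1 is eliminated, relation (2.20)
total = sum(partial_crude(Qd, Q[0], p_i, q_i) for Qd in Q[1:])
assert abs(total - net_prob_eliminated(Q[0], p_i, q_i)) < 1e-12
```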
Formula (2.19) can be easily generalized to other cases where more than one risk is eliminated. If risks R_1 and R_2 are eliminated, the partial crude probability that an individual alive at time x_i will die from cause R_δ in the interval (x_i, x_{i+1}) is

Q_{iδ·12} = [Q_{iδ}/(q_i − Q_{i1} − Q_{i2})] [1 − p_i^{(q_i − Q_{i1} − Q_{i2})/q_i}], δ = 3, . . . , r. (2.21)

If p_i were unity (q_i = 0), the ratios Q_{iδ}/q_i, (q_i − Q_{i1} − Q_{i2})/q_i, and (q_i − Q_{i1})/q_i would be indeterminate and, consequently, formulas (2.13), (2.14), (2.19), and (2.21) would become meaningless. In other words, if an individual were certain to survive an interval, it would be meaningless to speak of his chance of dying from a specific risk. On the other hand, if p_i were zero (q_i = 1), formula (2.4) shows that the integral of the total intensity over the interval must be infinite.

3. NUMBERS OF DEATHS AND NUMBERS OF SURVIVORS

Let d_{iδ} be the number of deaths from risk R_δ in the interval (x_i, x_{i+1}), d_i the corresponding total number of deaths, and l_{i+1} the number of survivors at x_{i+1}, so that

d_i = d_{i1} + · · · + d_{ir} (3.1)

and

l_i = d_{i1} + · · · + d_{ir} + l_{i+1}, i = 0, 1, . . . . (3.2)

From (2.10),

1 = Q_{i1} + · · · + Q_{ir} + p_i. (3.3)

Given l_i individuals alive at x_i, the numbers d_{i1}, . . . , d_{ir}, l_{i+1} have the multinomial distribution

Pr{d_{i1}, . . . , d_{ir}, l_{i+1} | l_i} = [l_i!/(d_{i1}! · · · d_{ir}! l_{i+1}!)] Q_{i1}^{d_{i1}} · · · Q_{ir}^{d_{ir}} p_i^{l_{i+1}}, (3.4)

with the probability generating function

G(s_{i1}, . . . , s_{ir}, s_{i+1} | l_i) = [Q_{i1}s_{i1} + · · · + Q_{ir}s_{ir} + p_i s_{i+1}]^{l_i}, (3.5)

where |s_{iδ}| ≤ 1 and |s_{i+1}| ≤ 1.
We can see from (3.4) and (3.5) that for any positive integer u the joint probability distribution of all the random variables d_{i1}, . . . , d_{ir}, l_{i+1}, for i = 0, 1, . . . , u, is¹

Pr{d_{iδ}, l_{i+1}; i = 0, . . . , u | l_0} = Π_{i=0}^{u} [l_i!/(d_{i1}! · · · d_{ir}! l_{i+1}!)] Q_{i1}^{d_{i1}} · · · Q_{ir}^{d_{ir}} p_i^{l_{i+1}}, (3.6)

¹ For simplicity of formulas, no symbols for the values that the random variables l_i and d_{iδ} assume are introduced in this chapter.

11.3] NUMBERS OF DEATHS AND NUMBERS OF SURVIVORS 249

with d_{iδ} and l_{i+1} satisfying Equation (3.2). The corresponding p.g.f., defined as

G_{d_{iδ}, l_{i+1}}(s_{iδ}, s_{i+1} | l_0) = E[Π_{i=0}^{u} s_{i1}^{d_{i1}} · · · s_{ir}^{d_{ir}} s_{i+1}^{l_{i+1}} | l_0], (3.7)

is given by

G_{d_{iδ}, l_{i+1}}(s_{iδ}, s_{i+1} | l_0) = [Σ_{δ=1}^{r} Q_{0δ}s_{0δ} + Σ_{j=1}^{u} (Π_{i=0}^{j−1} p_i s_{i+1})(Σ_{δ=1}^{r} Q_{jδ}s_{jδ}) + Π_{i=0}^{u} p_i s_{i+1}]^{l_0}. (3.8)
We shall not reproduce the derivation of (3.8); instead, we shall prove it by induction. Clearly (3.8) holds true for u = 0, since in this case (3.8) becomes the single-interval p.g.f. (3.5):

G(s_{0δ}, s_1 | l_0) = [Σ_{δ=1}^{r} Q_{0δ}s_{0δ} + p_0 s_1]^{l_0}. (3.9)

Assuming that (3.8) holds for u − 1, we want to prove that it is true also for u. Under the inductive assumption for u − 1, we have

E[Π_{i=0}^{u−1} s_{i1}^{d_{i1}} · · · s_{ir}^{d_{ir}} s_{i+1}^{l_{i+1}} | l_0]
= [Σ_{δ=1}^{r} Q_{0δ}s_{0δ} + Σ_{j=1}^{u−1} (Π_{i=0}^{j−1} p_i s_{i+1})(Σ_{δ=1}^{r} Q_{jδ}s_{jδ}) + Π_{i=0}^{u−1} p_i s_{i+1}]^{l_0}. (3.10)

The p.g.f. for u intervals is obtained from (3.10) by conditioning on the number of survivors (l_u) at the beginning of the interval (x_u, x_{u+1}). The conditional expectation inside the brackets in (3.11) is the p.g.f. of the conditional probability distribution of d_{u1}, . . . , d_{ur} and l_{u+1} given l_u,

E[s_{u1}^{d_{u1}} · · · s_{ur}^{d_{ur}} s_{u+1}^{l_{u+1}} | l_u] = [Σ_{δ=1}^{r} Q_{uδ}s_{uδ} + p_u s_{u+1}]^{l_u}, (3.12)

so that the p.g.f. for u intervals is obtained from (3.10) by replacing s_u with

s_u [Σ_{δ=1}^{r} Q_{uδ}s_{uδ} + p_u s_{u+1}]. (3.13)

Using (3.13), we rewrite the last term inside the brackets of (3.10) in the form

(Π_{i=0}^{u−1} p_i s_{i+1})(Σ_{δ=1}^{r} Q_{uδ}s_{uδ}) + Π_{i=0}^{u} p_i s_{i+1}, (3.15)

which proves (3.8) for u. Setting all the s’s equal to unity in (3.8) gives

Σ_{δ=1}^{r} Q_{0δ} + Σ_{j=1}^{u} (Π_{i=0}^{j−1} p_i)(Σ_{δ=1}^{r} Q_{jδ}) + Π_{i=0}^{u} p_i = 1, (3.16)

which follows from repeated application of (3.3). The sum of the probabilities (3.6) over all possible values of the random variables is thus unity, as it should be.
The moments of the random variables d_{iδ} and l_{i+1} are easily computed from the p.g.f. (3.8). Taking the first derivatives of (3.8), we have the expectations

E(d_{iδ} | l_0) = l_0 p_{0i} Q_{iδ}, δ = 1, . . . , r, (3.17)

and

E(l_i | l_0) = l_0 p_{0i}, i = 0, 1, . . . , (3.18)

where p_{0i} = p_0 p_1 · · · p_{i−1} is the probability of surviving from x_0 to x_i. Taking second derivatives gives the covariances; for example,

Cov(d_{iδ}, d_{jε} | l_0) = −l_0 p_{0i}Q_{iδ} p_{0j}Q_{jε}, δ ≠ ε; δ, ε = 1, . . . , r; i, j = 0, 1, . . . . (3.20)

The covariance between the number dying and the number surviving can be derived in a similar way, and

Cov(l_i, l_j | l_0) = l_0 p_{0j}(1 − p_{0i}), i ≤ j, (3.23)

has been given in (2.14) of Chapter 10. These results show that, for each u, the random variables d_{iδ}, l_{i+1}, for i = 0, 1, . . . , u; δ = 1, . . . , r, form a chain of multinomial distributions with the probability distribution and the p.g.f. given in (3.6) and (3.8), respectively.
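The expectations (3.17) and (3.18) can be checked numerically by propagating expected survivors interval by interval; the Q values below are hypothetical.

```python
# Hypothetical crude probabilities Q_{i delta} for three intervals, r = 2 risks
Q = [[0.01, 0.04], [0.02, 0.05], [0.03, 0.06]]
l0 = 100000.0

expected_l = l0                 # E(l_0 | l_0); survival probability p_{00} = 1
expected_deaths = []
for Qi in Q:
    expected_deaths.append([expected_l * Qd for Qd in Qi])
    expected_l *= 1.0 - sum(Qi)     # E(l_{i+1} | l_0) = E(l_i | l_0) p_i

# Compare with the closed forms (3.17) and (3.18): E(d) = l_0 p_{0i} Q_{i delta}
p0i = 1.0
for i, Qi in enumerate(Q):
    for d, Qd in enumerate(Qi):
        assert abs(expected_deaths[i][d] - l0 * p0i * Qd) < 1e-9
    p0i *= 1.0 - sum(Qi)
assert abs(expected_l - l0 * p0i) < 1e-9
```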
4. ESTIMATION OF PROBABILITIES

For a single interval the conditional probability distribution (3.4) may be written

Pr{d_{i1}, . . . , d_{ir}, l_{i+1} | l_i} = [l_i!/(d_{i1}! · · · d_{ir}! l_{i+1}!)] Q_{i1}^{d_{i1}} · · · Q_{ir}^{d_{ir}} p_i^{l_{i+1}}. (3.6)

The random variables in the present case are d_{iδ} and l_{i+1}, with the expectations

E(d_{iδ} | l_0) = l_0 p_{0i} Q_{iδ}, δ = 1, . . . , r; i = 0, 1, . . . , u, (3.17)

and

E(l_{i+1} | l_0) = l_0 p_{0i} p_i = l_0 p_{0i} (1 − Σ_{δ=1}^{r} Q_{iδ}), i = 0, 1, . . . , u. (3.18)

Following Neyman, we minimize the reduced χ²

χ² = Σ_{i=0}^{u} { Σ_{δ=1}^{r} [d_{iδ} − l_0 p_{0i} Q_{iδ}]²/d_{iδ} + [l_{i+1} − l_0 p_{0i}(1 − Σ_{δ=1}^{r} Q_{iδ})]²/l_{i+1} }. (4.1)
The reduced χ² in (4.1) differs from its ordinary form in that the random variable, rather than its expectation, appears in the denominator. The substitution of

p_i = 1 − Σ_{δ=1}^{r} Q_{iδ} (4.2)

in (4.1) and differentiation with respect to Q_{iδ} lead to the equations

∂χ²/∂Q_{iδ} = 0, δ = 1, . . . , r, (4.3)

from which it follows that

d_{iδ}/Q̂_{iδ} = l_{i+1}/p̂_i = (d_{i1} + · · · + d_{ir} + l_{i+1})/(Q̂_{i1} + · · · + Q̂_{ir} + p̂_i) = l_i, (4.4)

and hence

p̂_i = l_{i+1}/l_i (4.5)

11.4] ESTIMATION OF PROBABILITIES 253

and

Q̂_{iδ} = d_{iδ}/l_i. (4.6)

For a given sample, d_{iδ} = 0 and l_{i+1} = 0 for all i > w, so that there is no contribution to the reduced χ² beyond the wth term. Consequently the estimators (4.5) and (4.6) are defined only for positive l_i.

The estimators in (4.5) and (4.6) are recognized as maximizing values of the joint probability (3.6) (see Problem 7). Thus, in this case, Neyman’s reduced χ² method yields estimators identical to those obtained by the maximum likelihood principle. Using the same argument as that in the preceding chapter, it can be shown that (4.5) and (4.6) are the unique, unbiased, efficient estimators of the corresponding probabilities.
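A minimal sketch of the estimators (4.5) and (4.6) applied to hypothetical cohort counts:

```python
# Hypothetical survivors l_i and deaths d_{i delta} (r = 2 risks, two intervals)
l = [1000, 940, 870]
deaths = [[25, 35], [30, 40]]

for i in range(len(deaths)):
    # accounting identity (3.2): l_i = d_{i1} + ... + d_{ir} + l_{i+1}
    assert l[i] - sum(deaths[i]) == l[i + 1]

p_hat = [l[i + 1] / l[i] for i in range(len(deaths))]                 # (4.5)
Q_hat = [[d / l[i] for d in deaths[i]] for i in range(len(deaths))]   # (4.6)

# The estimates satisfy 1 = Q_{i1} + ... + Q_{ir} + p_i, relation (3.3)
for i in range(len(deaths)):
    assert abs(sum(Q_hat[i]) + p_hat[i] - 1.0) < 1e-12
```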
The exact formulas for variances and covariances of the estimators in
b II b <M<S?
II tq cC (4.8)
o o
Co w(Q a Q ic \k)
,
= “Cjjj QaQn j = i
= 0 j 7* i (4.9)
Cov (p,, pj |
/
0)
= Cov (q„ qs \l 0 ) = 0 j * i (4.10)
Cov(A,e»|/0)= -E^p&s j = i
= 0 j * i. (4.11)
(4.12)
(4.13)
(4.14)
,
r;
. . ,u.
(4.15)
1
8=2
^=2 —
~ r
8=2 d ;
d
—
1
da
l
(?
^\(di-d a )/a n
(4.16)
Although formulas (4.12) to (4.15) for the estimators of the net and partial crude probabilities are quite simple, the exact formulas for their variances and covariances are difficult to obtain. Approximate formulas, however, can be developed [see Equation (5.30) in Chapter 1]. For example, for the variance of the estimator q̂_{iδ}, we write

Var(q̂_{iδ} | l_0) ≈ (∂q_{iδ}/∂Q_{iδ})² Var(Q̂_{iδ} | l_0) + 2(∂q_{iδ}/∂Q_{iδ})(∂q_{iδ}/∂p_i) Cov(Q̂_{iδ}, p̂_i | l_0) + (∂q_{iδ}/∂p_i)² Var(p̂_i | l_0), (4.17)

where the partial derivatives are taken at the true point (Q_{iδ}, p_i). Using formulas (4.7), (4.8), and (4.11) for the variances and covariances and simplifying, we have

Var(q̂_{iδ} | l_0) = E(1/l_i | l_0) [(1 − q_{iδ})²/(p_i q_i)] [p_i log(1 − q_{iδ}) log(1 − q_{i·δ}) + Q_{iδ}²], (4.18)

or, approximating E(1/l_i | l_0) by 1/(l_0 p_{0i}),

Var(q̂_{iδ} | l_0) = [(1 − q_{iδ})²/(l_0 p_{0i} p_i q_i)] [p_i log(1 − q_{iδ}) log(1 − q_{i·δ}) + Q_{iδ}²]. (4.19)

Similarly,

Cov(q̂_{iδ}, q̂_{iε} | l_0) = −[(1 − q_{iδ})(1 − q_{iε})/(l_0 p_{0i} p_i q_i)] [p_i log(1 − q_{iδ}) log(1 − q_{iε}) − Q_{iδ}Q_{iε}], δ ≠ ε, (4.20)

and

Cov(q̂_{iδ}, p̂_i | l_0) = −(1 − q_{iδ})Q_{iδ}/(l_0 p_{0i}). (4.21)
Approximate formulas for the variances and covariances of q̂_{i·δ} and Q̂_{iδ·1} may be obtained in the same way. For completeness they are listed below.

Var(q̂_{i·δ} | l_0) = [(1 − q_{i·δ})²/(l_0 p_{0i} p_i q_i)] [p_i log(1 − q_{iδ}) log(1 − q_{i·δ}) + (q_i − Q_{iδ})²], (4.22)

Cov(q̂_{i·δ}, q̂_{iδ} | l_0) = −[(1 − q_{i·δ})(1 − q_{iδ})/(l_0 p_{0i} p_i q_i)] [p_i log(1 − q_{i·δ}) log(1 − q_{iδ}) − (q_i − Q_{iδ})Q_{iδ}], (4.23)

Cov(q̂_{i·δ}, p̂_i | l_0) = −(1 − q_{i·δ})(q_i − Q_{iδ})/(l_0 p_{0i}), (4.24)

Var(Q̂_{iδ·1} | l_0) = (1/(l_0 p_{0i})) { Q_{iδ}q_{i·1}²/(q_i − Q_{i1})² + Q_{iδ}² [ (1 − q_{i·1})² ( log(1 − q_{i1}) log(1 − q_{i·1})/(q_i(q_i − Q_{i1})²) + 1/(p_iq_i) ) − q_{i·1}²/(q_i − Q_{i1})³ ] }, δ = 2, . . . , r, (4.25)

Cov(Q̂_{iδ·1}, Q̂_{iε·1} | l_0) = (Q_{iδ}Q_{iε}/(l_0 p_{0i})) { (1 − q_{i·1})² ( log(1 − q_{i1}) log(1 − q_{i·1})/(q_i(q_i − Q_{i1})²) + 1/(p_iq_i) ) − q_{i·1}²/(q_i − Q_{i1})³ }, δ ≠ ε; δ, ε = 2, . . . , r, (4.26)

and

Cov(Q̂_{iδ·1}, p̂_i | l_0) = −Q_{iδ}(1 − q_{i·1})/(l_0 p_{0i}), δ = 2, . . . , r; i = 0, 1, . . . . (4.27)
Because the estimators are connected by the relation q̂_{i·1} = Σ_{δ=2}^{r} Q̂_{iδ·1} [Equation (4.16)], the following relations hold true for the variances and covariances:

Σ_{δ=2}^{r} Var(Q̂_{iδ·1} | l_0) + 2 Σ_{δ<ε} Cov(Q̂_{iδ·1}, Q̂_{iε·1} | l_0) = Var(q̂_{i·1} | l_0) (4.28)

and

Σ_{δ=2}^{r} Cov(Q̂_{iδ·1}, p̂_i | l_0) = Cov(q̂_{i·1}, p̂_i | l_0). (4.29)

When (4.22), (4.24), (4.25), (4.26), and (4.27) are substituted into (4.28) and (4.29), we have two identities which can be verified by direct computation.
5. APPLICATION TO CURRENT MORTALITY DATA

Let P_i denote the midyear population and D_i the number of deaths in the age interval (x_i, x_{i+1}) in a current year. The age-specific death rate is

M_i = D_i/P_i, (5.1)

and the probability of dying in the interval is estimated by

q̂_i = D_i/N̂_i, (5.2)

where the number alive at the beginning of the interval is estimated by

N̂_i = (1/n_i)P_i + (1 − a_i)D_i, (5.3)

n_i being the length of the interval and a_i the average fraction of the interval lived by those dying in it. Combining (5.1) to (5.3) gives

q̂_i = n_iM_i/[1 + (1 − a_i)n_iM_i], (5.4)

as given in equation (4.3) in Chapter 9, and its complement is

p̂_i = [1 − a_in_iM_i]/[1 + (1 − a_i)n_iM_i]. (5.5)

11.5] APPLICATION TO CURRENT MORTALITY DATA 257

The D_i deaths are now further divided according to cause, with D_{iδ} deaths from risk R_δ, so that D_{i1} + · · · + D_{ir} = D_i. The corresponding cause-specific death rate is

M_{iδ} = D_{iδ}/P_i, (5.8)

and the estimate of the crude probability is

Q̂_{iδ} = n_iM_{iδ}/[1 + (1 − a_i)n_iM_i]. (5.9)

The estimates of the net and partial crude probabilities follow from Section 4:

q̂_{iδ} = 1 − p̂_i^{D_{iδ}/D_i}, (5.10)

q̂_{i·δ} = 1 − p̂_i^{(D_i − D_{iδ})/D_i}, δ = 1, . . . , r; (5.11)

Q̂_{iδ·1} = [D_{iδ}/(D_i − D_{i1})][1 − p̂_i^{(D_i − D_{i1})/D_i}], δ = 2, . . . , r; (5.12)

and

Q̂_{iδ·12} = [D_{iδ}/(D_i − D_{i1} − D_{i2})][1 − p̂_i^{(D_i − D_{i1} − D_{i2})/D_i}], δ = 3, . . . , r. (5.13)

The sample variances of these estimators, (5.14) and (5.15), are obtained by substituting the estimates in the corresponding formulas of Section 4, for δ = 1, . . . , r and i = 0, 1, . . . .
particular cause of death. For example, we consider the case where risk R_δ is eliminated from a population and give formulas for two main life table functions, the proportion of survivors and the observed expectation of life. Here the basic quantity is the estimator q̂_{i·δ} of the net probability of dying in (x_i, x_{i+1}) when R_δ is eliminated. The proportion of people alive at age x_a who will survive to age x_j is computed from

p̂_{aj·δ} = Π_{i=a}^{j−1} (1 − q̂_{i·δ}), (5.16)

with the sample variance

S²_{p̂_{aj·δ}} = p̂²_{aj·δ} Σ_{i=a}^{j−1} (1 − q̂_{i·δ})^{−2} S²_{q̂_{i·δ}}, a < j; a, j = 0, 1, . . . . (5.17)

When a = 0 the quantity in formula (5.16) is equivalent to l_j/l_0.

The formula for the observed expectation of life at age x_a when risk R_δ is eliminated, ê_{a·δ}, parallels that for ê_a in Chapter 10; since eliminating a risk of death can only improve survival, one expects

ê_{a·δ} − ê_a > 0. (5.20)
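The survival proportion (5.16) is a simple product; the sketch below uses hypothetical net probabilities and checks that eliminating a risk raises the proportion surviving, in line with (5.20).

```python
# Hypothetical net probabilities q_{i.delta} (risk R_delta eliminated) and the
# corresponding overall probabilities q_i for the same four age intervals
q_net = [0.010, 0.012, 0.015, 0.020]
q_all = [0.015, 0.020, 0.022, 0.030]   # q_i >= q_{i.delta} in every interval

def survival_proportion(q_list):
    # formula (5.16): product of (1 - q) over the intervals from x_a to x_j
    p = 1.0
    for q in q_list:
        p *= 1.0 - q
    return p

p_without_risk = survival_proportion(q_net)
p_with_risk = survival_proportion(q_all)
assert p_without_risk > p_with_risk
```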
a U.S. Census of Population 1960, U.S. Summary, Detailed Characteristics, Table 156, Bureau of the Census, U.S. Department of Commerce.
b Vital Statistics of the U.S. 1960, Vol. II, Part A, Table 5-11, National Center for Health Statistics, U.S. Department of Health, Education and Welfare.
c Category Numbers 330-334, 400-468, and 592-594 of the 1955 Revision of the International Classification of Diseases, Injuries, and Causes of Death, World Health Organization, 1957.
Table 2. The probability of dying in each age interval and the effect of eliminating cardiovascular-renal (CVR) disease as a risk of death, United States white females, 1960
Table 3. Probabilities of death and survival and the effect of eliminating cardiovascular-renal (CVR) disease as a risk of death, United States white females, 1960
Table 4. The expectation of life and the effect of eliminating cardiovascular-renal (CVR) disease as a risk of death, United States white females and males, 1960
PROBLEMS

1. Justify this formula and show that it is identical to the one given in (2.11).

3. Use the method in Section 2, Chapter 10, to derive the p.g.f. (3.8) for the joint probability distribution of d_{iδ} and l_{i+1}.

5. Since Σ_{δ=1}^{r} d_{iδ} = l_i − l_{i+1},

Σ_{δ=1}^{r} Cov(d_{iδ}, l_j) = Cov(l_i, l_j) − Cov(l_{i+1}, l_j)

and

Σ_{δ=1}^{r} Cov(l_i, d_{jδ}) = Cov(l_i, l_j) − Cov(l_i, l_{j+1}).

Justify these relations and use (3.21), (3.22), and (3.23) to verify them.

8. Let

L_i = [l_i!/(d_{i1}! · · · d_{ir}! l_{i+1}!)] Q_{i1}^{d_{i1}} · · · Q_{ir}^{d_{ir}} (1 − Σ_{δ=1}^{r} Q_{iδ})^{l_{i+1}}.

(a) Find the expectations

E[−∂² log L_i/∂Q_{iδ}∂Q_{iε}], δ, ε = 1, . . . , r.

(b) Formulate a matrix A with these expectations as its elements, and find the inverse A⁻¹ of the matrix A. Show that the diagonal elements of A⁻¹ are given by

Q_{iδ}(1 − Q_{iδ})/E(l_i)

and the off-diagonal elements by

−Q_{iδ}Q_{iε}/E(l_i), δ ≠ ε; δ, ε = 1, . . . , r.

9. In writing the first equation in (4.4) we made use of the following elementary result. Let a_i, b_i, i = 1, . . . , r, be positive numbers. If

a_1/b_1 = · · · = a_r/b_r,

then

(a_1 + · · · + a_s)/(b_1 + · · · + b_s) = a_1/b_1 for s < r.

Prove it.

10. Derive formula (4.18) for the variance of q̂_{iδ} from (4.17).

11. Substitute (4.22), (4.25), and (4.26) in (4.28) and verify the resulting equation.
12. In Tables 5 to 7 are the population sizes, the numbers of deaths, and deaths from three causes by age for the total United States population, the white population, and the nonwhite population in the year 1960. For a particular cause of your choice, compute the crude, net, and partial crude probabilities and construct appropriate life tables to study the effect of the selected cause on the longevity of people in these populations. Prepare short paragraphs to summarize the results and interpret your findings.

Table 5. Population, total deaths, and deaths from three causes by age for the total United States population, 1960

Deaths by Cause

a U.S. Census of Population 1960, U.S. Summary, Detailed Characteristics, Table 155, Bureau of the Census, U.S. Dept. of Commerce.
b Vital Statistics of the U.S. 1960, Vol. II, Part A, Table 5-11, National Center for Health Statistics, U.S. Dept. of Health, Education, and Welfare.
c Category Numbers 330-334, 400-468, and 592-594 of the 1955 Revision of the International Classification of Diseases, Injuries, and Causes of Death, WHO, 1957.
d Category Numbers 140-205.
e Category Numbers E800-E962.
266

Table 6. Population, total deaths, and deaths from three causes by age for the United States white population, 1960

Deaths by Cause

267

Table 7. Population, total deaths, and deaths from three causes by age for the United States nonwhite population, 1960

Deaths by Cause

a U.S. Census of Population 1960, U.S. Summary, Detailed Characteristics, Table 155, Bureau of the Census, U.S. Dept. of Commerce.
b Vital Statistics of the U.S. 1960, Vol. II, Part A, Table 5-11, National Center for Health Statistics, U.S. Dept. of Health, Education, and Welfare.
c Category Numbers 330-334, 400-468, and 592-594 of the 1955 Revision of the International Classification of Diseases, Injuries, and Causes of Death, WHO, 1957.
d Category Numbers 140-205.
e Category Numbers E800-E962.

268
CHAPTER 12

Medical Follow-Up Studies

1. INTRODUCTION
Statistical studies in the general category of medical follow-up and life
testing have as their common
immediate objective the estimation of life
expectancy and survival rates for a defined population at risk. These
studies must usually be terminated before all survival information is
complete and are therefore said to be truncated. The nature of the problem
in an investigation concerned with the medical follow-up of patients is the
same as in the life testing of electric bulbs, although difference in sample
size may require different approaches. For illustration, we use cancer
survival data of a large sample and therefore our terminology is the same
as that of the medical follow-up study.
In a typical follow-up study, a group of individuals with some common
morbidity experience is followed from a well-defined zero point, such as
date of hospital admission. The purpose of the study might be to evaluate
a certain therapeutic measure by comparing the expectation of life and
survival rates of treated patients with those of untreated patients, or by
comparing the expectation of life of treated and presumably cured patients
with that of the general population. When the period of observation ends,
there will usually remain a number of individuals for whom the mortality
data is incomplete. First, some patients will still be alive at the close of the
study. Second, some patients will have died from causes other than that
under study, so that the chance of dying from the specific cause cannot
be determined directly. Finally, some patients will be “lost” to the study
because of follow-up failure. These three sources of incomplete information
have created interesting statistical problems in the estimation of the expectation of life and survival rates. Many contributions have been made to
270 MEDICAL FOLLOW-UP STUDIES [12.1
the subject, among them Dorn [1950], Cutler and Ederer [1958], and Littell [1952]. For the material
presented in this chapter, reference may be made to Chiang [1961a].
The purpose of this chapter is to adapt the life table methodology to follow-up data. Consider a group of patients admitted to a study during some period and observed until death or until termination of the study, whichever comes first. The time of admission is taken as the common point of origin for all N_0 patients; thus N_0 is the number of patients with which the study begins, or the number alive at time zero. The time axis refers to the time of follow-up since admission, and x denotes the exact number of years of follow-up. A constant time interval of one year will be used for simplicity of notation, with the typical interval denoted by (x, x + 1), for x = 0, 1, . . . , y − 1. The symbol p_x will be used to denote the probability that a patient alive at time x will survive the interval (x, x + 1), and q_x the probability that he will die during the interval, with p_x + q_x = 1.
1 For easy explanation, the common closing date is used here for describing the
methods of analysis; however, these methods are equally applicable to data based on
the date of last reporting for individual patients.
12.2] PROBABILITY OF SURVIVAL AND EXPECTATION OF LIFE 271
                      Total        To be observed for the entire interval    Due to withdraw
Total                 N_x          m_x                                       n_x
Survivors             s_x + w_x    s_x^a                                     w_x^b
Deaths                D_x          d_x                                       d'_x

a Survivors among those admitted to the study more than (x + 1) years before the closing date.
b Survivors among those admitted to the study less than (x + 1) years but more than x years before the closing date.

Of the n_x patients due to withdraw, d'_x will die before the closing date and w_x will survive to the closing date of the study. The sum d_x + d'_x = D_x is the total number of deaths in the interval. Thus s_x, d_x, w_x, and d'_x are the basic random variables and will be used to estimate the probability p_x that a patient alive at x will survive the interval (x, x + 1), and its complement q_x.
If the intensity of risk of death (force of mortality) μ(τ) = μ(x) depends only on x for x ≤ τ < x + 1, then the probability of surviving the interval is given by

p_x = e^{−μ(x)}. (2.2)
For the m_x patients to be observed over the entire interval, the number of survivors s_x has the binomial distribution

f(s_x; p_x) = [m_x!/(s_x! d_x!)] p_x^{s_x} (1 − p_x)^{d_x}, (2.4)

so that s_x and d_x = m_x − s_x have the expectations

E(s_x | m_x) = m_xp_x and E(d_x | m_x) = m_x(1 − p_x), (2.5)

respectively.

The distribution of the random variables in the group of n_x patients depends upon the time of withdrawal. A plausible assumption is that the withdrawals take place at random during the interval (x, x + 1). Under this assumption the probability that a patient will survive to the closing date is
∫_x^{x+1} e^{−(τ−x)μ(x)} dτ = −(1 − p_x)/log p_x, (2.6)

which is closely approximated by

−(1 − p_x)/log p_x ≈ p_x^{1/2}. (2.7)
Table 2. Comparison between p_x^{1/2} and −(1 − p_x)/log p_x

p_x     p_x^{1/2}    −(1 − p_x)/log p_x
.70     .837         .841
.75     .866         .869
.80     .894         .896
.85     .922         .923
.90     .949         .949
.95     .975         .975
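Table 2 can be reproduced directly; the helper name below is illustrative.

```python
import math

def withdrawal_survival_exact(p):
    # Probability of surviving to a uniformly distributed withdrawal
    # time, formula (2.6): -(1 - p)/log p
    return -(1.0 - p) / math.log(p)

# For each p in Table 2 the square-root approximation (2.7) agrees
# to about three decimal places
for p in (0.70, 0.75, 0.80, 0.85, 0.90, 0.95):
    assert abs(withdrawal_survival_exact(p) - math.sqrt(p)) < 0.005
```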
Consequently, p_x^{1/2} is taken as the probability of surviving to the closing date and (1 − p_x^{1/2}) as the probability of dying before the time of withdrawal. Thus the probability distribution of the random variable w_x in the group of n_x patients due to withdraw is

g(w_x; p_x) = [n_x!/(w_x! d'_x!)] (p_x^{1/2})^{w_x} (1 − p_x^{1/2})^{d'_x}, (2.8)

so that w_x and d'_x = n_x − w_x have the expectations

E(w_x | n_x) = n_xp_x^{1/2} and E(d'_x | n_x) = n_x(1 − p_x^{1/2}), (2.9)

respectively.

Since the N_x individuals comprise two independent groups according to their withdrawal status, the likelihood function of all the random variables is the product of the two probability functions (2.4) and (2.8), or

L = c p_x^{s_x + w_x/2} (1 − p_x)^{d_x} (1 − p_x^{1/2})^{d'_x}, x = 0, 1, . . . , y − 1, (2.10)

where c stands for the product of the combinatorial factors in (2.4) and (2.8). Setting the derivative of log L with respect to p_x equal to zero gives

(s_x + ½w_x)p_x^{−1} − d_x(1 − p_x)^{−1} − ½d'_x(1 − p_x^{1/2})^{−1}p_x^{−1/2} = 0, (2.12)

which may be rewritten as

(N_x − ½n_x)p_x + ½d'_xp_x^{1/2} − (s_x + ½w_x) = 0, (2.13)

a quadratic equation in p_x^{1/2}. Since p_x^{1/2} cannot take on negative values, (2.13) has the unique solution

p̂_x = { [−½d'_x + √(¼d'_x² + 4(N_x − ½n_x)(s_x + ½w_x))] / [2(N_x − ½n_x)] }², (2.14)

with the complement

q̂_x = 1 − p̂_x, x = 0, 1, . . . , y − 1. (2.15)
To verify consistency, note that when the random variables s_x, w_x, and d'_x are replaced with their respective expectations as given by (2.5) and (2.9), the resulting expression is identical with the probability p_x^{1/2}:

[−½n_x(1 − p_x^{1/2}) + √(¼n_x²(1 − p_x^{1/2})² + 4(N_x − ½n_x)(m_xp_x + ½n_xp_x^{1/2}))] / [2(N_x − ½n_x)] = p_x^{1/2}. (2.16)

To prove (2.16), we write

m_xp_x + ½n_xp_x^{1/2} = (N_x − ½n_x)p_x + ½n_xp_x^{1/2}(1 − p_x^{1/2}). (2.17)

Then the quantity under the square root sign may be simplified to

[½n_x(1 − p_x^{1/2}) + 2(N_x − ½n_x)p_x^{1/2}]², (2.18)

so that the left side of (2.16) becomes

[−½n_x(1 − p_x^{1/2}) + ½n_x(1 − p_x^{1/2}) + 2(N_x − ½n_x)p_x^{1/2}] / [2(N_x − ½n_x)] = p_x^{1/2}, (2.19)

proving (2.16).
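The estimator (2.14) and the consistency check (2.16) can be sketched as follows; the function name and counts are illustrative.

```python
import math

def estimate_px(s, d, w, d_prime):
    # Positive root of (N - n/2) p + (d'/2) sqrt(p) - (s + w/2) = 0,
    # the quadratic (2.13); returns p_hat of (2.14)
    n = w + d_prime
    N = s + d + n
    a = N - 0.5 * n
    b = 0.5 * d_prime
    c = s + 0.5 * w
    root = (-b + math.sqrt(b * b + 4.0 * a * c)) / (2.0 * a)
    return root * root

# Consistency (2.16): replacing the counts by their expectations under a
# true survival probability p returns p itself
p, m, n = 0.9, 300, 100
p_hat = estimate_px(s=m * p, d=m * (1.0 - p),
                    w=n * math.sqrt(p), d_prime=n * (1.0 - math.sqrt(p)))
assert abs(p_hat - p) < 1e-9
```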
The exact formula for the variance of the estimator p̂_x in (2.14) is complicated.² It involves the quantities

M_x = m_x + n_x(1 + p_x^{1/2})^{−1} (2.21)

and

π_x = n_x(1 − p_x^{1/2}) / [4(1 + p_x^{1/2})²p_x^{1/2}]. (2.22)

When the bias of the estimator is neglected, an approximate formula for the variance of p̂_x is given by the inverse of the expected information,

Var(p̂_x) ≈ p_xq_x/(M_x + π_xp_xq_x). (2.23)

The quantity π_x usually is small in comparison with the preceding term and may be neglected, and (2.23) is reduced to

Var(p̂_x) ≈ p_xq_x/M_x, x = 0, 1, . . . , y − 1. (2.24)

2 In a follow-up study of children with tuberculosis, W. H. Frost [1933] introduced the concept of “person-year” and used the quantity (in the present notation)

N_x − ½(w_x + D_x)

as the “person-years of life experience” and the ratio

M̂_x = D_x/[N_x − ½(w_x + D_x)]

as the corresponding death rate.
Remark 1. If the time of death of each of the D_x patients and the time of withdrawal of each of the w_x patients are known, the probability p_x will be estimated from a different formula. In such cases there will be N_x individual observations, and obviously it is unnecessary to consider the N_x patients as two distinct groups according to their withdrawal status.

Since p_x = e^{−μ(x)}, the force of mortality may be written

μ(x) = −log p_x. (2.25)

Let τ_j < 1 be the length of observation in the interval (x, x + 1) for the jth patient withdrawn before the end of the interval, j = 1, . . . , w_x, with the survival probability

e^{−τ_jμ(x)} = p_x^{τ_j}, (2.26)

and let τ'_i be the corresponding length of observation for the ith patient dying in the interval, i = 1, . . . , D_x. Maximizing the resulting likelihood function gives the estimator

p̂_x = exp[−D_x / (s_x + Σ_{j=1}^{w_x} τ_j + Σ_{i=1}^{D_x} τ'_i)]. (2.28)

The probability of surviving from admission to time x is computed from

p̂_{0x} = p̂_0p̂_1 · · · p̂_{x−1}, x = 1, 2, . . . , y. (2.29)
The sample variance of p̂_{0x} has the same form as that given in Equation (5.12) of Chapter 9.

The observed expectation of life at time x_α is computed from

ê_α = ½ + p̂_α + p̂_αp̂_{α+1} + · · · + p̂_αp̂_{α+1} · · · p̂_{y−1} + p̂_αp̂_{α+1} · · · p̂_y + · · · . (2.31)

For patients who entered the program in its first year, no survival experience beyond year y is available, and (2.31) is written

ê_α = ½ + p̂_α + p̂_αp̂_{α+1} + · · · + p̂_αp̂_{α+1} · · · p̂_{y−1}(1 + p̂_y + p̂_yp̂_{y+1} + · · ·), (2.32)

where

p_z = exp(−∫_z^{z+1} μ(τ) dτ), z = y, y + 1, . . . . (2.33)

If the intensity function is constant beyond y, so that μ(τ) = μ, the probability of surviving the interval (z, z + 1) becomes

p_z = e^{−μ} = p, z = y, y + 1, . . . , (2.34)

and the series inside the parentheses in (2.32) is geometric with sum 1/(1 − p). In practice one looks for a time t < y such that p̂_t, p̂_{t+1}, . . . , are approximately equal, thus indicating a constant intensity beyond t, and estimates p from the pooled experience.

The sample variance of ê_α is computed from

S²_{ê_α} = Σ_t (∂ê_α/∂p̂_t)² S²_{p̂_t}, α = 0, . . . , y − 1, (2.38)

with the derivatives

∂ê_α/∂p̂_t = p̂_{αt}(ê_{t+1} + ½), α ≤ t, (2.40)

and, for the constant terminal probability p̂,

∂ê_α/∂p̂ = p̂_{αy}/(1 − p̂)², α < y. (2.41)

The value of p̂_x and the sample variance of p̂_x are obtained from formulas (2.14) and (2.24), respectively.
3. CONSIDERATION OF COMPETING RISKS

Suppose now that r risks of death are acting simultaneously on each patient in the study. For risk R_δ there is a corresponding intensity function (force of mortality) μ(t; δ), δ = 1, . . . , r, and the sum

μ(t; 1) + · · · + μ(t; r) = μ(t) (3.1)

is the total force of mortality. Let Q_{xδ}(t) be the crude probability that an individual alive at time x will die prior to x + t, 0 < t ≤ 1, from R_δ in the presence of all other risks in the population. It follows directly from Equation (2.6) in Chapter 11 that

Q_{xδ}(t) = ∫_x^{x+t} exp(−∫_x^τ μ(s) ds) μ(τ; δ) dτ. (3.2)
280 MEDICAL FOLLOW-UP STUDIES [12.3
When the intensity functions are constant within the interval, with μ(τ; δ) = μ(x; δ) and μ(τ) = μ(x) for x ≤ τ < x + 1, (3.2) gives

Q_{xδ}(t) = [μ(x; δ)/μ(x)][1 − e^{−tμ(x)}] = [μ(x; δ)/μ(x)][1 − p_x^t], 0 < t ≤ 1; δ = 1, . . . , r. (3.3)

From (3.1) we see that the sum of the crude probabilities in (3.3) is

Q_{x1}(t) + · · · + Q_{xr}(t) = 1 − p_x^t. (3.4)

For a patient due to withdraw in the interval, the probability of dying from R_δ before the time of withdrawal is, as in Section 2, evaluated at t = ½:

Q_{xδ}(½) = [μ(x; δ)/μ(x)][1 − p_x^{1/2}] = Q_{xδ}[1 + p_x^{1/2}]^{−1}, δ = 1, . . . , r, (3.5)

where Q_{xδ} = Q_{xδ}(1). Equation (3.4) implies that

Q_{x1}[1 + p_x^{1/2}]^{−1} + · · · + Q_{xr}[1 + p_x^{1/2}]^{−1} + p_x^{1/2} = 1, x = 0, 1, . . . , y − 1. (3.6)

The net and partial crude probabilities take the same form as in Chapter 11:

q_{xδ} = 1 − p_x^{Q_{xδ}/q_x}, (3.7)

q_{x·δ} = 1 − p_x^{(q_x − Q_{xδ})/q_x}, δ = 1, . . . , r; (3.8)

Q_{xδ·1} = [Q_{xδ}/(q_x − Q_{x1})][1 − p_x^{(q_x − Q_{x1})/q_x}], δ = 2, . . . , r; (3.9)

and

Q_{xδ·12} = [Q_{xδ}/(q_x − Q_{x1} − Q_{x2})][1 − p_x^{(q_x − Q_{x1} − Q_{x2})/q_x}], δ = 3, . . . , r; x = 0, 1, . . . , y − 1. (3.10)

Our immediate problem is to estimate Q_{xδ}, p_x, and q_x.
                      Total        To be observed for the entire interval    Due to withdraw
Total                 N_x          m_x                                       n_x
Survivors             s_x + w_x    s_x^a                                     w_x^b
Deaths (all causes)   D_x          d_x                                       d'_x

a Survivors among those admitted to the study more than (x + 1) years before the closing date.
b Survivors among those admitted to the study less than (x + 1) years but more than x years before the closing date.

In the group of m_x patients, the numbers of deaths d_{x1}, . . . , d_{xr} from the r risks and the number of survivors s_x have a multinomial distribution, where s_x + d_{x1} + · · · + d_{xr} = m_x. Their mathematical expectations are given by

E(s_x | m_x) = m_xp_x and E(d_{xδ} | m_x) = m_xQ_{xδ}, (3.12)

respectively.
In the group of n_x patients due to withdraw in interval (x, x + 1), w_x will be alive at the closing date of the study and d'_{xδ} will die from R_δ before the closing date. Each of the n_x individuals has the survival probability p_x^{1/2} [cf. Equation (2.7)] and the probability of dying from risk R_δ before the time of withdrawal

Q_{xδ}(½) = Q_{xδ}[1 + p_x^{1/2}]^{−1}, δ = 1, . . . , r. (3.13)

Since p_x^{1/2} and the probabilities in (3.13) add to unity, as shown in (3.6), the probability function of this group is

f_2(w_x, d'_{xδ}; p_x, Q_{xδ}) = c_2 (p_x^{1/2})^{w_x} Π_{δ=1}^{r} [Q_{xδ}(1 + p_x^{1/2})^{−1}]^{d'_{xδ}}, (3.14)

where w_x + d'_{x1} + · · · + d'_{xr} = n_x. The mathematical expectations are

E(w_x | n_x) = n_xp_x^{1/2} and E(d'_{xδ} | n_x) = n_xQ_{xδ}(1 + p_x^{1/2})^{−1}. (3.15)

The likelihood function of all the random variables is the product of the probability functions of the two groups, (3.11) and (3.14),

L = c p_x^{s_x + w_x/2} Π_{δ=1}^{r} Q_{xδ}^{d_{xδ}} [Q_{xδ}(1 + p_x^{1/2})^{−1}]^{d'_{xδ}}. (3.16)

Equation (3.16) may be simplified to give the final form of the likelihood function

L = c p_x^{s_x + w_x/2} (1 + p_x^{1/2})^{−(n_x − w_x)} Π_{δ=1}^{r} Q_{xδ}^{D_{xδ}}, (3.17)

where D_{xδ} = d_{xδ} + d'_{xδ} is the number of deaths from R_δ in the interval. The logarithm of the likelihood function (3.17) is

log L = log c + (s_x + ½w_x) log p_x − (n_x − w_x) log(1 + p_x^{1/2}) + Σ_{δ=1}^{r} D_{xδ} log Q_{xδ}. (3.18)
r
To maximize log L subject to the constraint p_x + Σ_δ Q_{xδ} = 1, we introduce a Lagrange multiplier⁴ λ and form

Φ_x = log L + λ(p_x + Σ_{δ=1}^{r} Q_{xδ} − 1). (3.20)

Differentiating Φ_x with respect to p_x, Q_{x1}, . . . , Q_{xr} and setting the derivatives equal to zero yield the equations

∂Φ_x/∂p_x = (s_x + ½w_x)/p_x − (n_x − w_x)/[2(1 + p_x^{1/2})p_x^{1/2}] + λ = 0 (3.21)

and

∂Φ_x/∂Q_{xδ} = D_{xδ}/Q_{xδ} + λ = 0, δ = 1, . . . , r. (3.22)

Summing the equations in (3.22) over δ gives

λ = −Σ_{δ=1}^{r} D_{xδ} / Σ_{δ=1}^{r} Q_{xδ}, or λ = −D_x/q_x, (3.23)

and hence

D_{xδ}/Q_{xδ} = D_x/q_x. (3.24)
Once upon a time a rich man willed eleven urns of gold to his three sons. After his
death it was revealed that by the terms of the will one half of the gold was to go to the
eldest son, one fourth to the second son, and one sixth to the youngest son; the entire
fortune must be divided up among the sons, but the gold in each urn must remain intact.
The sons found their intelligence no match for the terms of the will, and they could
not find a way to divide the gold according to the instructions. Amid the excitement
there came an old man; he was learned and generous. Touched by the sons’ anxiety, he
offered an urn of gold of his own. The three sons accepted the offer with gratitude
because it brought the problem to their level of intelligence the eldest son received six
:
urns; the second son, three urns; and the youngest son, two urns. The wise old man
recovered his urn of gold, which was left over after the three sons had taken their share.
The old man could be Lagrange; his urn played the role of Lagrange coefficient. And
itwas immaterial whether his urn contained gold or anything else, just as the exact value
of the Lagrange coefficient is of no importance in the solution of a practical problem.
Note that if there were no restriction of indivisibility on the gold within the urns, the
sons would have to divide the gold an infinite number of times, and in the end they
would have received the same amount of gold as they did by the old man’s subterfuge.
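The arithmetic of the parable is easy to verify; a minimal sketch using Python's exact rational arithmetic checks both the subterfuge and the infinite-division limit:

```python
from fractions import Fraction

# The will assigns shares 1/2, 1/4, 1/6 of 11 urns; the shares sum to
# 11/12, not 1, which is why the urns cannot be divided directly.
shares = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 6)]
total = Fraction(11)

# The old man's subterfuge: lend one urn, divide 12 urns, recover the rest.
with_loan = [s * (total + 1) for s in shares]
assert with_loan == [6, 3, 2]
assert sum(with_loan) == 11            # the borrowed urn is left over

# Dividing the remainder infinitely often gives the same limit: son i's
# total is s_i / (11/12) * 11, i.e. s_i * 12 -- a geometric series in 1/12.
limit = [s / sum(shares) * total for s in shares]
assert limit == with_loan
```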
284 MEDICAL FOLLOW-UP STUDIES [12.3
Using the second equation in (3.23), we find from (3.21) the quadratic equation

$\bigl(N_x - \tfrac{1}{2}n_x\bigr)p_x + \tfrac{1}{2}d'_x\,p_x^{1/2} - \bigl(s_x + \tfrac{1}{2}w_x\bigr) = 0, \qquad (3.25)$

whose positive root in $p_x^{1/2}$ gives the maximum likelihood estimator

$\hat{p}_x = \left[\frac{-\tfrac{1}{2}d'_x + \sqrt{\tfrac{1}{4}d'^{\,2}_x + 4\bigl(N_x - \tfrac{1}{2}n_x\bigr)\bigl(s_x + \tfrac{1}{2}w_x\bigr)}}{2\bigl(N_x - \tfrac{1}{2}n_x\bigr)}\right]^2, \qquad x = 0, 1, \ldots, y - 1, \qquad (3.26)$
and $\hat{q}_x = 1 - \hat{p}_x$ will have the same values as those in Section 2. Substituting (3.26) in (3.24) yields the estimators of the crude probabilities,
$\hat{Q}_{x\delta} = \frac{D_{x\delta}}{D_x}\,\hat{q}_x, \qquad \delta = 1, 2, \ldots, r; \quad x = 0, 1, \ldots, y - 1. \qquad (3.27)$
We now use (3.26) and (3.27) in formulas (3.7) to (3.10) to obtain the following estimators of the net and partial crude probabilities:
$\hat{q}_{x1} = 1 - \hat{p}_x^{\,D_{x1}/D_x} \qquad (3.28)$

and

$\hat{q}_{x\cdot\delta} = 1 - \hat{p}_x^{\,(D_x - D_{x\delta})/D_x}, \qquad \delta = 1, \ldots, r; \qquad (3.29)$

$\hat{Q}_{x\delta\cdot 1} = \frac{D_{x\delta}}{D_x - D_{x1}}\left[1 - \hat{p}_x^{\,(D_x - D_{x1})/D_x}\right], \qquad \delta = 2, \ldots, r; \qquad (3.30)$

$\hat{Q}_{x\delta\cdot 12} = \frac{D_{x\delta}}{D_x - D_{x1} - D_{x2}}\left[1 - \hat{p}_x^{\,(D_x - D_{x1} - D_{x2})/D_x}\right], \qquad \delta = 3, \ldots, r; \quad x = 0, 1, \ldots, y - 1. \qquad (3.31)$
In terms of $\hat{q}_x$ and $\hat{Q}_{x1}$, formula (3.30) may be written

$\hat{Q}_{x\delta\cdot 1} = \frac{\hat{Q}_{x\delta}}{\hat{q}_x - \hat{Q}_{x1}}\left[1 - \hat{p}_x^{\,(\hat{q}_x - \hat{Q}_{x1})/\hat{q}_x}\right], \qquad (3.32)$

which is identical to $Q_{x\delta\cdot 1}$ in formula (3.9), proving the consistency.
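The net and partial crude estimators are simple functions of $\hat{p}_x$ and the death counts. Continuing the hypothetical example (illustrative values only, following the formulas as reconstructed above):

```python
# p_hat as obtained from (3.26); deaths by cause (hypothetical counts).
p_hat = 0.8356
D = {1: 12, 2: 5, 3: 3}
D_total = sum(D.values())

# (3.28): net probability of dying from risk 1 when the other risks
# are eliminated.
q1_net = 1 - p_hat ** (D[1] / D_total)

# (3.29): net probability of dying when risk delta is eliminated.
q_net_elim = {d: 1 - p_hat ** ((D_total - D[d]) / D_total) for d in D}

# (3.30): partial crude probabilities when risk 1 is eliminated.
denom = D_total - D[1]
Q_partial = {d: D[d] / denom * (1 - p_hat ** (denom / D_total))
             for d in D if d != 1}

# The partial crude probabilities sum to the net probability with risk 1
# eliminated, as the consistency argument in the text requires.
assert abs(sum(Q_partial.values()) - q_net_elim[1]) < 1e-12
```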
To obtain the asymptotic variances we substitute

$Q_{x1} = 1 - p_x - Q_{x2} - \cdots - Q_{xr} \qquad (3.33)$

in (3.18). When the bias is neglected, the inverse of the asymptotic covariance matrix of the estimators $\hat{p}_x, \hat{Q}_{x2}, \ldots, \hat{Q}_{xr}$ is given by
$\Lambda_x = \begin{pmatrix} E\left(-\dfrac{\partial^2}{\partial p_x^2}\log L_x\right) & E\left(-\dfrac{\partial^2}{\partial p_x\,\partial Q_{x\epsilon}}\log L_x\right) \\ E\left(-\dfrac{\partial^2}{\partial Q_{x\delta}\,\partial p_x}\log L_x\right) & E\left(-\dfrac{\partial^2}{\partial Q_{x\delta}\,\partial Q_{x\epsilon}}\log L_x\right) \end{pmatrix} \qquad (3.34)$

with blocks of dimensions $1 \times 1$, $1 \times (r-1)$, $(r-1) \times 1$, and $(r-1) \times (r-1)$, respectively; the individual expectations are evaluated in (3.35) through (3.39),
with its determinant $|\Lambda_x|$ given in (3.40). Inverting $\Lambda_x$ yields the asymptotic variance of $\hat{p}_x$,

$\sigma^2_{\hat{p}_x} = \Lambda_x^{11} = \frac{p_x q_x}{M_x + \pi_x p_x q_x}, \qquad (3.41)$

together with the variances $\sigma^2_{\hat{Q}_{x\delta}}$ in (3.42) and (3.43) and the covariances $\operatorname{Cov}\bigl(\hat{Q}_{x\delta}, \hat{Q}_{x\epsilon}\bigr)$, $\delta \neq \epsilon$; $\delta, \epsilon = 2, \ldots, r$, in (3.44).
Since the term $Q_{x1}$ was not explicitly included in (3.34), the formulas for the variance of $\hat{Q}_{x1}$ and the covariances between $\hat{Q}_{x1}$ and the other estimators were not presented. By reason of symmetry, however, the above expressions for $\hat{Q}_{x\delta}$ are also true for $\hat{Q}_{x1}$; which is to say that formulas (3.42), (3.43), and (3.44) hold also when $\delta$ or $\epsilon$ equals 1.
12.4] LOST CASES 287
4 . LOST CASES
Every patient in a medical follow-up is exposed not only to the risk of
dying but also to the risk of being lost to the study because of follow-up
failure. Untraceable patients have caused difficulties in determining
survival rates, as have patients withdrawing due to the termination of a
study. However, lost cases and withdrawals belong to entirely different
categories. In a group of $N_x$ patients beginning the interval $(x, x+1)$, for example, everyone is exposed to the risk of being lost, but only $n_x$ patients are subject to withdrawal in the interval. Therefore, it is incorrect
to treat lost cases and withdrawals equally in estimating probabilities of
survival or death. For the purpose of determining the probability of dying
from a specific cause, patients lost due to follow-up failure are not different
from those dying of causes unrelated to the study. Being lost, therefore, may be treated as a competing risk, and the results of Section 3 carry over with a reinterpretation of the symbols.
Suppose we let $R_r$ denote the risk of being lost; for the time element $(\tau, \tau + \Delta)$ in the interval $(x, x+1)$ let

$q_{x\cdot r} = 1 - p_x^{\,(q_x - Q_{xr})/q_x}$
$= \Pr\{\text{a patient alive at time } x \text{ will die in interval } (x, x+1) \text{ if the risk } R_r \text{ of being lost is eliminated}\}; \qquad (4.5)$

$p_{x\cdot r} = p_x^{\,(q_x - Q_{xr})/q_x}$
$= \Pr\{\text{a patient alive at } x \text{ will survive to time } x+1 \text{ if the risk } R_r \text{ of being lost is eliminated}\}; \qquad (4.6)$

$Q_{x\delta\cdot r} = \frac{Q_{x\delta}}{q_x - Q_{xr}}\left[1 - p_x^{\,(q_x - Q_{xr})/q_x}\right]$
$= \Pr\{\text{a patient alive at } x \text{ will die in } (x, x+1) \text{ from cause } R_\delta \text{ if the risk } R_r \text{ of being lost is eliminated}\}. \qquad (4.7)$

The probabilities in (4.5), (4.6), and (4.7) are equivalent to $q_x$, $p_x$, and $Q_{x\delta}$, respectively, if there is no risk of being lost.
The symbol $d_{xr}$ in Table 3 now stands for the number of lost cases among the $m_x$ patients and $d'_{xr}$ for the number of lost cases among the $n_x$ patients; the sum $D_{xr} = d_{xr} + d'_{xr}$ is the total number of cases lost in the interval. The probabilities in (4.2) through (4.7) can be estimated from formulas (3.26) through (3.30) in Section 3.
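Treating loss to follow-up as an additional competing risk, the adjusted probabilities (4.5) to (4.7) can be sketched as follows; the crude probabilities here are illustrative values, not estimates from any registry data:

```python
# Illustrative crude probabilities: two causes of death plus "lost",
# which is handled exactly like a competing risk.
p_hat, q_hat = 0.80, 0.20
Q = {1: 0.10, 2: 0.06, "lost": 0.04}
assert abs(sum(Q.values()) - q_hat) < 1e-12

frac = (q_hat - Q["lost"]) / q_hat
q_adj = 1 - p_hat ** frac               # (4.5): dying, loss eliminated
p_adj = p_hat ** frac                   # (4.6): surviving, loss eliminated
Q_adj = {c: Q[c] / (q_hat - Q["lost"]) * (1 - p_hat ** frac)
         for c in Q if c != "lost"}     # (4.7): by cause, loss eliminated
assert abs(q_adj + p_adj - 1) < 1e-12
assert abs(sum(Q_adj.values()) - q_adj) < 1e-12
```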
12.5] LIFE TABLE CONSTRUCTION FOR THE FOLLOW-UP POPULATION 289
The second interval began with the $s_0 = 17{,}202$ survivors from the first interval, which is entered as $N_1$ in Column 2 of line 2. The $N_1$ patients were again divided successively by withdrawal status, by survival status,
5 I express my appreciation to George Linden and the California Tumor Registry, California Department of Public Health, for making their follow-up data available for use in this book.
The main life table functions and the corresponding sample standard errors are shown in Table 6. The basic quantity in the table is $\hat{q}_{x\cdot 4}$, the estimate of the net probability of dying in the interval when the risk of being lost is eliminated.
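The observed survival rates in a table of this kind are obtained by chaining the interval estimates $\hat{p}_x$; a minimal sketch with illustrative values (the mid-interval convention for the expectation of life is a common life-table simplification, not the book's formula (2.37)):

```python
# Illustrative interval survival estimates p_hat_x (not the registry's
# numbers) for intervals 0-1, 1-2, ..., 4-5 since diagnosis.
p = [0.84, 0.90, 0.93, 0.95, 0.96]

# Chain the interval estimates: p_hat_{0,x+1} = p_0 * p_1 * ... * p_x.
survival = []
cum = 1.0
for px in p:
    cum *= px
    survival.append(cum)
assert all(survival[i] >= survival[i + 1] for i in range(len(survival) - 1))

# A crude (truncated) expectation of life at diagnosis, assuming deaths
# fall mid-interval -- a common convention, not formula (2.37).
e0 = 0.5 + sum(survival)
```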
PROBLEMS
1. Verify equation (2.6).
2. Table 2 on page 272 shows the numerical difference between $p_x^{1/2}$ and $-(1 - p_x)/\log p_x$. Justify the approximation (2.7) by showing that

$-\frac{1 - p_x}{\log p_x} = p_x^{1/2}\left[1 + \frac{(\log p_x)^2}{24} + \cdots\right].$
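The expansion in Problem 2 (as reconstructed here) is easy to check numerically:

```python
import math

# -(1 - p)/log p agrees with p^(1/2) through the correction (log p)^2 / 24:
# the exact ratio is sinh(L/2)/(L/2) with L = log p.
for p in (0.99, 0.95, 0.90, 0.80):
    L = math.log(p)
    approx = -(1 - p) / L
    corrected = math.sqrt(p) * (1 + L ** 2 / 24)
    assert abs(approx - math.sqrt(p)) < L ** 2        # crude bound
    assert abs(approx - corrected) < abs(L) ** 3      # one order better
```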
3. Show that the estimator $\hat{p}_x$ in (2.14) is a maximizing value of the likelihood function in (2.10).
6. Verify the formulas in (2.41) and (2.42) for the sample variance of $\hat{e}_\alpha$.
10. Verify that the determinant of the matrix A in (3.39) is given in (3.40).
12. During the period from January 1, 1942, to December 31, 1954, in the State of California there were 5,982 registered patients with a diagnosis of cancer of the cervix uteri. Table 7 shows their survival experience up to December 31, 1954.
(a) Construct a life table for these cancer patients, including the sample standard errors of $\hat{p}_{0x}$, $\hat{q}_x$, and $\hat{e}_x$.
Survival experience following diagnosis of breast cancer, cases initially diagnosed 1942-1962:

x to x+1   N_x   m_x   d_x   d_{x1}   d_{x2}   d_{x3}   d_{x4}   ...
20-21      117    55    49      6       2        1        2       1
21-22       49     0     0      —       —        —        —       —

[intervening rows garbled; N_x falls from 20,858 in interval 0-1 to 49 in interval 21-22]

Source. California Tumor Registry, California State Department of Public Health.
a December 31, 1963 was the common closing date.
Table 5. Survival experience following diagnosis of breast cancer: maximum likelihood estimates of net, crude, and partial crude probabilities of both risks, R_1 (breast cancer) and R_2 (other cancers); the subscript is used here for simplicity to denote the risks. [numerical columns garbled]
Source. California Tumor Registry, California State Department of Public Health.
Table 6. Survival experience following diagnosis of breast cancer: the main life table functions and their standard errors (when risk of being lost is eliminated)

Columns: Interval Since Diagnosis (in Years); Estimate of Net Probability of Dying in Interval; Observed x-Year Survival Rate; Expectation of Life at x. [data rows garbled]

a $\hat{p}_{6\cdot 4}$ is used for $\hat{p}_i$ in formula (2.37).
Table 7. Survival experience of cases of cancer of the cervix uteri diagnosed 1942 to 1954, California. [numerical columns garbled; footnotes a and b distinguish cases diagnosed more than, and between, given numbers of years prior to the closing date]
Source. California Tumor Registry, California State Department of Public Health.
References
Aitken, A. C. [1962]. Determinants and Matrices. Oliver and Boyd, Edinburgh and
London.
Armitage, P. [1951]. The statistical theory of bacterial populations subject to mutation,
J. Royal Statist. Soc., B14, 1-33.
Armitage, P. [1959]. The comparison of survival curves, J. Royal Statist. Soc., A122,
279-300.
Bailey, N. T. J. [1964]. The Elements of Stochastic Processes with Applications to the Natural Sciences. Wiley, New York.
Barlow, R. E. and F. Proschan [1965]. Mathematical Theory of Reliability. Wiley, New York.
Bartlett, M. S. [1949]. Some evolutionary stochastic processes, J. Royal Statist. Soc.,
Bll, 211-229.
Bartlett, M. S. [1956]. An Introduction to Stochastic Processes. Cambridge University
Press.
Bates, G. E. and J. Neyman [1952]. Contribution to the theory of accident proneness.
I. An optimistic model of the correlation between light and severe accidents,
Univer. Calif Pub. Statist., 1, 215-254, University of California Press, Berkeley.
Bellman, R. [1960]. Introduction to Matrix Analysis. McGraw-Hill, New York.
Berkson, J. and L. Elveback [1960]. Competing exponential risks, with particular reference to smoking and lung cancer, J. Amer. Statist. Assoc., 55, 415-428.
Berkson, J. and R. P. Gage [1952]. Survival curve for cancer patients following treatment,
J. Amer. Statist. Assoc., 47, 501-515.
Brockmeyer, E., H. L. Halstrøm, and A. Jensen [1948]. The life and work of A. K. Erlang. Copenhagen Telephone Co., Copenhagen.
Buck, R. [1965]. Advanced Calculus (2nd ed.). McGraw-Hill, New York.
Chiang, C. L. [1960a]. A stochastic study of the life table and its applications: I.
Probability distributions of the biometric functions, Biometrics, 16 , 618-635.
Chiang, C. L. [1960b]. A stochastic study of the life table and its applications: II.
Sample variance of the observed expectation of life and other biometric functions,
Human Biol., 32 , 221-238.
Chiang, C. L. [1961a]. A stochastic study of the life table and its applications: III.
The follow-up study with the consideration of competing risks, Biometrics, 17 ,
57-78.
Chiang, C. L. [1961b]. Standard error of the age-adjusted death rate, Vital Statistics, Special
Reports Selected Studies, 47(9), 275-285. National Center for Health Statistics.
Chiang, C. L. [1964a]. A stochastic model of competing risks of illness and competing risks of death, Stochastic Models in Medicine and Biology, J. Gurland, Ed. University of Wisconsin Press, Madison, pp. 323-354.
Chiang, C. L. [1964b]. A birth-illness-death process, Ann. Math. Statist., 35, 1390-1391,
abstract.
Chiang, C. L. [1965]. An index of health: mathematical models, Vital and Health
Statistics, Ser. 2, No. 5, 1-19. National Center for Health Statistics.
Feller, W. [1936]. Zur Theorie der stochastischen Prozesse, Math. Ann., 113, 113-160.
Feller, W. [1940]. On the integrodifferential equations of purely discontinuous Markoff
processes, Trans. Amer. Math. Soc., 48, 488-515.
Feller, W. [1957]. An Introduction to Probability Theory and Its Applications, Vol. 1 (2nd ed.). Wiley, New York.
Part V: Selection and Mutation, Proc. Royal Soc. (Edinburgh), 23, 838-844.
Halley, E. [1693]. An estimate of the degrees of the mortality of mankind, drawn from
curious tables of the births and funerals at the city of Breslau, Philos. Trans. Royal
Soc. (London), 17, 596-610.
Harris, T. E. [1963]. The Theory of Branching Processes. Springer-Verlag, Berlin.
Harris, T. E., P. Meier, and J. W. Tukey [1950]. Timing of the distribution of events
between observations. Human Biol., 22, 249-270.
Hochstadt, H. [1964]. Differential Equations. Holt, New York.
Homma, T. [1955]. On a certain queueing process, Repts. Statist. Appl. Research, Union
Japan. Scientists and Engrs., 4, 14-32.
Hogg, R. V. and A. T. Craig [1965]. Introduction to Mathematical Statistics (2nd
ed.). Macmillan, New York.
Irwin, J. O. [1949]. The standard error of an estimate of expectation of life, J. Hygiene,
47, 188-189.
Kaplan, E. L. and P. Meier [1958]. Nonparametric estimation from incomplete observa-
tions, J. Amer. Statist. Assoc., 53, 457-481.
Kendall, D. G. [1951]. Some problems in the theory of queues, J. Royal Statist. Soc., B13, 151-185.
Kendall, D. G. [1954]. Stochastic processes occurring in the theory of queues and their
analysis by means of the imbedded Markov chain, Ann. Math. Statist., 24 , 338-354.
Kendall, M. G. and A. Stuart [1961]. The Advanced Theory of Statistics, Vol. 2. Griffin, London.
Kimball, A. W. [1958]. Disease incidence estimation in populations subject to multiple
causes of death, Bull. Internat. Statist. Inst., 36, 193-204.
King, G. [1914]. On a short method of constructing an abridged mortality table, J. Inst. Actuaries, 48, 294-303.
Polya, G. and G. Szegö [1964]. Aufgaben und Lehrsätze aus der Analysis, Vol. 2.
Springer-Verlag, Berlin.
Quenouille, M. H. [1949]. A relation between the logarithmic, Poisson, and negative
binomial series, Biometrics, 5, 162-164.
Rao, C. R. [1945]. Information and accuracy attainable in the estimation of statistical
parameters, Bull. Calcutta Math. Soc., 37, 81-91.
Rao, C. R. [1949]. Sufficient statistics and minimum variance estimates, Proc. Cambridge Phil. Soc., 45, 213-218.
Reed, L. J. and M. Merrell [1939]. A short method for constructing an abridged life table, Amer. J. Hygiene, 30, 33-62.
Author Index
Chiang, C. L., 50, 70, 126, 131, 135, 145, 151, 173, 194, 195, 203, 214, 215, 244, 270, 298
Chung, K. L., 45, 118, 298
Cornfield, J., 244, 298
Cox, D. R., 45, 298
Craig, A. T., 231, 299
Cramer, H., 229, 298
Cutler, S., 270, 298
Doob, J. L., 45, 118, 138, 298
Dorn, H., 270, 298
Ederer, F., 270, 298
Elveback, L., 244, 297
Epstein, B., 62, 298
Esary, J. D., 297
Farr, W., 189
Feller, W., 24, 45, 118, 299
Fisher, R. A., 37, 231, 274, 284, 299
Fix, E., 73, 74, 86, 244, 269, 299
Ford, L. R., 55, 299
Forster, F. G., 72, 299
Frost, W. H., 269, 274, 299
Irwin, J. O., 299
Kaplan, E. L., 299
Karlin, S., 45, 300
Kendall, D. G., 45, 72, 300
Kendall, M. G., 231, 300
Kimball, A. W., 244, 300
King, G., 203, 300
Kolmogorov, A., 45, 118, 300
Lagrange, J. L., 126, 283
Laplace, P. S., 243
Leadbetter, M. R., 298
Lehmann, E., 231, 241, 300
Littell, A. S., 270, 300
Lundberg, O., 50, 300
McGuire, J., 146
MacLane, S., 120, 297
Makeham, W. M., 62, 243, 300
Meier, P., 269, 299
Merrell, M., 203, 299
Miller, D. H., 45, 298
Miller, F. H., 55
Miller, R. S., 215, 300
Neyman, J., 37, 50, 73, 74, 86, 231, 244, 251, 252, 253, 269, 300
Parzen, E., 45, 300
Patil, V. T., 173, 300
Polya, G., 126, 300
Price, R., 189
Proschan, F., 297
Quenouille, M. H., 300
Rao, C. R., 229, 241, 300
Reed, L. J., 203, 301
Riordan, J., 24, 301
Ruben, H., 301
Rudin, W., 25, 301
Samuels, M. J., 133, 150
Saunders, S. C., 297
Scheffe, H., 241, 300
Scott, E. L., 37, 300
Sirken, M., 215, 301
Sobel, M., 62, 298
Stuart, A., 231, 300
Szego, G., 126, 300
Takacs, L., 45, 301
Thomas, J. L., 215, 300
Todhunter, I., 244, 301
Tukey, J., 269, 299
Watson, The Rev. H. W., 37
Weibull, W., 62, 301
Wigglesworth, E., 189
Yang, G. L., 126
Yule, G., 53, 301
Zahl, S., 301
Subject Index
in multiple exit transition probabilities, 91
of p.g.f., 166, 168
Consistent estimator, Fisher's, 274, 284
Constant relative intensity, 245
Continuity theorem for p.g.f., 42
generating function of, 26
of multinomial distributions, 86, 163
n-fold, 27
Correlation coefficient, 19, 20, 22
Cramer-Rao lower bound, for the variance of an estimator, 229
Cramer's rule, 136, 143, 178
distribution, 8, 42
gamma, 49
geometric, 28, 224
Gompertz, 61
logarithmic, 36
negative binomial, 28, 30, 36, 42, 43, 44
negative multinomial, 34, 42
normal, 21, 44
probability density function of, 107, 113
Forward differential equations, 149, 153,
characteristic equation of, 136
eigenvalues of, 133, 136, 150
square, 122
of uniform distribution, 44
in medical follow-up studies, 273, 276, 284, 290
of net probability, 294
of partial crude probability, 294
of probability of surviving, 226, 227
Mean, 14
value, 10
see also Expectation
Medical follow-up studies, 269
closing date in, 270, 281
competing risks in, 279
estimate, of net probability, 290, 295
of partial crude probability, 290
estimation, of expectation of life in, 277, 278
of survival probability in, 276
expected number of deaths, 272, 273
expected number of survivors, 272, 273
life table in, 289
likelihood function in, 270, 273, 276, 280, 282, 290
maximum likelihood estimators in, 273, 276, 284, 290, 294
a stochastic model for, 86
variance of estimator in, 275, 285, 286, 287
withdrawal status in, 271, 281
Mental illness, 74
care for, 74
incidence and prevalence of, 74
Mid-year population, 195, 205
of England and Wales, 216
Migration processes, 71, 171, 172, 173, 174
differential equation in, 173, 175
p.g.f. in, 175
solution for the p.g.f. in, 176, 180
see also Emigration-immigration processes
Moment generating function, 44
of binomial distribution, 44
of exponential distribution, 44
force of, 7, 60, 220, 244
Multinomial coefficient, 41
Multinomial distribution, 16, 17, 33, 84, 162, 181, 182, 225, 248, 287
marginal distribution in, 33
p.g.f. of, 33, 42
Multinomial expansion, 41
Multiple decrement, 243
tables, 244
Multiple entrance transition probabilities, 104, 168, 170
backward differential equations for, 168
differential equations for, 105, 168
p.g.f. of, 105, 168
Multiple exit transition probabilities, 90, 91, 93, 112, 164, 165
differential equations for, 90, 111
p.g.f. of, 90, 92, 111, 112, 164, 169
Multiple transition probabilities, 89, 163
identities for, 101, 102
leading to death, 95, 96, 167
Multiple transitions, 89
conditional distribution of the number of, 94
expected number of, 95
in illness-death process, 89
number of entrance transitions, 104, 169
number of exit transitions, 90, 104, 164, 169
Multiple transition time, 89
distribution function of, 107, 113
expectation of, 109
identities for, 110, 111
leading to death, 109
probability density function of, 106
Multivariate distribution, 7
Multivariate Pascal distribution, 34
National Center for Health Statistics, 194, 204
National Vital Statistics Division, 194
marginal, 7, 8, 42
continuous, 6
CRAMER • The Elements of Probability Theory and Some of Its Applications
FELLER • An Introduction to Probability Theory and Its Applications, Volume I, Second Edition
FELLER • An Introduction to Probability Theory and Its Applications, Volume II
HOEL • Introduction to Mathematical Statistics, Third Edition
PRABHU • Queues and Inventories: A Study of Their Basic Stochastic Processes
TAKACS • Combinatorial Methods in the Theory of Stochastic Processes
WILKS • Mathematical Statistics

John Wiley & Sons, Inc.
605 Third Avenue, New York, N. Y. 10016
New York • London • Sydney