1 author:
Stephen D Kachman
University of Nebraska at Lincoln
All content following this page was uploaded by Stephen D Kachman on 27 January 2014.
The online version of this article, along with updated information and services, is located on
the World Wide Web at:
http://jas.fass.org/content/77/E-Suppl_2/147
www.asas.org
Hazard Function
Models for survival analysis can be built from a
hazard function, which measures the risk of failure of
an individual at time t. The hazard function for
animal i at time t is
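$$\lambda(t;\eta_i) = \lim_{\Delta t \to 0} \frac{\Pr(T_i < t + \Delta t \mid T_i > t)}{\Delta t} = \frac{f(t;\eta_i)}{S(t;\eta_i)}.$$

Figure 2. Weibull hazard function, $\rho\lambda(\lambda t)^{\rho-1}$, where $\lambda = 1/5$ and rate parameter ($\rho$) of 0.5 (solid line), 1 (dashed line), or 2 (dotted line).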
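As a numerical check on this definition, the following Python sketch evaluates the Weibull hazard of Figure 2 three ways: as the ratio f/S, as the conditional failure probability over a short interval divided by the interval length, and from the closed form $\rho\lambda(\lambda t)^{\rho-1}$. It assumes the Weibull survival function $e^{-(\lambda t)^{\rho}}$ with $\lambda = 1/5$. The function names, the step size `dt`, and the evaluation time t = 3 are illustrative choices, not taken from the paper.

```python
import math

def weibull_survival(t, lam, rho):
    """S(t) = exp(-(lam*t)**rho)."""
    return math.exp(-((lam * t) ** rho))

def weibull_density(t, lam, rho):
    """f(t) = -dS/dt = rho*lam*(lam*t)**(rho-1) * S(t)."""
    return rho * lam * (lam * t) ** (rho - 1) * weibull_survival(t, lam, rho)

def hazard_from_ratio(t, lam, rho):
    """Hazard as f(t)/S(t)."""
    return weibull_density(t, lam, rho) / weibull_survival(t, lam, rho)

def hazard_from_limit(t, lam, rho, dt=1e-6):
    """Pr(T < t+dt | T > t) / dt for a small, fixed dt."""
    s_t = weibull_survival(t, lam, rho)
    s_next = weibull_survival(t + dt, lam, rho)
    return (s_t - s_next) / (s_t * dt)

def hazard_closed_form(t, lam, rho):
    """rho*lam*(lam*t)**(rho-1), the expression in the Figure 2 caption."""
    return rho * lam * (lam * t) ** (rho - 1)

# Compare the three evaluations at t = 3 for the rate parameters in Figure 2.
for rho in (0.5, 1.0, 2.0):
    print(rho,
          hazard_from_ratio(3.0, 0.2, rho),
          hazard_from_limit(3.0, 0.2, rho),
          hazard_closed_form(3.0, 0.2, rho))
```

The three columns should agree up to the discretization error of the finite step, which is the sense in which the limit and the ratio f/S define the same quantity.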
J. Anim. Sci. Vol. 77, Suppl. 2/J. Dairy Sci. Vol. 82, Suppl. 2/1999
The risk factor enters through a proportional hazards form, $\lambda(t;\eta_i) = \lambda_0(t)e^{\eta_i}$, where $\lambda_0(t) = \lambda(t; 0)$ = baseline hazard function. The survival function can then be written as

$$S(t;\eta_i) = e^{-\Lambda_0(t)e^{\eta_i}}$$

where $\Lambda_0(t) = \Lambda(t; 0)$. The role of the risk factor $\eta_i$ will be examined next.

Risk Factor $\eta_i$

The vector of risk factors is a linear combination of fixed and random effects,

$$\eta = X\beta + Zu.$$

Figure 3. Weibull survival function, $e^{-(\lambda t)^{\rho}}$, where $\lambda = 1/5$ and rate parameter ($\rho$) of 0.5 (solid line), 1 (dashed line), or 2 (dotted line).

The model for the risk factors differs from the usual mixed model in several important ways. First, it does not include a residual component ($e$). In the survival model the residual variability is modeled through the survival distribution. Second, the expected survival time, given the random effects, is not equal to $X\beta + Zu$, as it is in the mixed model. Third, larger values of the risk factor lead to shorter expected survival times.

Under the Weibull survival model, the survival function [1] for animal i can be written as

$$S(t;\eta_i) = e^{-\exp[\rho\ln(t)+\eta_i]}. \qquad [2]$$

The effect of changes in the risk factor on median survival time, $m_{\eta_i}$, can be found by solving $S(m_{\eta_i};\eta_i) = 0.5$. After a bit of algebra,

$$m_{\eta_i} = [-\ln(0.5)]^{1/\rho}\,e^{-\eta_i/\rho}.$$

Increasing the risk factor by $\Delta$ therefore rescales the median by a constant factor,

$$m_{\eta_i+\Delta} = m_{\eta_i}\,e^{-\Delta/\rho}.$$

For example, let $\rho = 2$, and the risk factor for males be 0.5 larger than the risk factor for females; then the
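To make the scaling of the median concrete, the sketch below evaluates the median formula for a rate parameter of 2 and a risk factor difference of 0.5, the values used in the example. The female risk factor of 0 is an arbitrary illustrative value (the ratio does not depend on it), and the function names are hypothetical.

```python
import math

def weibull_survival(t, eta, rho):
    """S(t; eta) = exp(-exp(rho*ln(t) + eta)), the Weibull form [2]."""
    return math.exp(-math.exp(rho * math.log(t) + eta))

def weibull_median(eta, rho):
    """Median survival time: [-ln(0.5)]**(1/rho) * exp(-eta/rho)."""
    return (-math.log(0.5)) ** (1.0 / rho) * math.exp(-eta / rho)

rho = 2.0
eta_female = 0.0          # assumed reference value, for illustration only
delta = 0.5               # males' risk factor exceeds females' by 0.5

median_female = weibull_median(eta_female, rho)
median_male = weibull_median(eta_female + delta, rho)
ratio = median_male / median_female   # equals exp(-delta/rho) = exp(-0.25), about 0.779

print(median_female, median_male, ratio)
```

Under these assumptions the ratio of medians is $e^{-0.25}$, so a risk factor that is 0.5 larger corresponds to a median survival time roughly 22% shorter when the rate parameter is 2.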
Figure 5. Effect of changes in the risk factor on median survival time with rate parameter ($\rho$) of 0.5 (solid line), 1 (dashed line), or 2 (dotted line).

ESTIMATION

The basic approaches to estimation include non-parametric, semi-parametric, and parametric. The focus of this paper is on the parametric approach. The parametric approach is better suited to handle the large complex models encountered in animal breeding. The basic parametric approach involves getting the joint likelihood of the survival times and the random effects. In simple cases, the marginal likelihood of survival time can be obtained by integrating over random effects. The marginal likelihood can also be approximated by taking a second order Taylor's series expansion of the joint log-likelihood. From a Bayesian viewpoint, that would be equivalent to obtaining the posterior mode.

Ignoring an additive constant, the joint log-likelihood for the Weibull distribution is

$$l(\beta,u,\rho) = \sum_i \left[\ln(\rho/t_i) + \rho\ln(t_i) + \eta_i - \exp(\rho\ln(t_i)+\eta_i)\right] - \tfrac{1}{2}\ln|G| - \tfrac{1}{2}u'G^{-1}u.$$

Written in a slightly more general form

$$l(\beta,u,\rho) = \sum_i \left[\ln(\lambda_0(t_i)) + \eta_i - \Lambda_0(t_i)\exp(\eta_i)\right] - \tfrac{1}{2}\ln|G| - \tfrac{1}{2}u'G^{-1}u. \qquad [3]$$

Posterior mode estimates of the fixed and random effects can then be obtained by taking the first and second partial derivatives of [3]. After a little algebra, the resulting estimation equations are

$$\begin{bmatrix} X'RX & X'RZ \\ Z'RX & Z'RZ + G^{-1} \end{bmatrix} \begin{bmatrix} \hat{\beta} \\ \hat{u} \end{bmatrix} = \begin{bmatrix} X'y^* \\ Z'y^* \end{bmatrix} \qquad [4]$$

where

$$R = -\frac{\partial^2 l}{\partial\eta\,\partial\eta'}, \qquad R_{ii} = \Lambda_0(T_i)e^{\eta_i},$$

$$y^* = \frac{\partial l}{\partial\eta} + R\eta, \qquad y^*_i = 1 - \Lambda_0(T_i)e^{\eta_i} + R_{ii}\eta_i,$$

which is very similar to the usual mixed model equations. Because these equations must be solved iteratively, the computational time will be several times greater than for a corresponding linear model when variance components are known. However, if the variance components must be estimated, computational times will be similar to the corresponding linear model. Approximate standard errors and tests are obtained as in the linear model case.

CENSORING

In this section the effect of censoring will be discussed. Unlike traits such as yearling weight, data on survival traits are often censored. That is, the survival time may either be known to be greater than a certain amount (right censored), less than a certain amount (left censored), or be within a certain range (double censored). Of the three types of censoring, right censoring is the most common. Right censoring can occur because an animal is removed before failure can be observed or because failure occurred after the end of data collection. Left censoring can occur because an animal failed before data collection began. Double censoring can occur if there is a break in data collection, and an animal fails somewhere in that interval.

In the following examination of censoring, time of censoring and survival time will be assumed to be independent. Attention will also be focused on handling right censoring. Conceptually other types of censoring are handled similarly.

For data that are right censored, the time of censoring is observed instead of the time of failure. Let $T_i$
= observed time at which an animal has failed or the time at which the record was censored. If a record is uncensored, then the density function of $T_i$ is

$$f(T_i;\eta_i) = \lambda(T_i;\eta_i)\,e^{-\Lambda(T_i;\eta_i)} = \lambda(T_i;\eta_i)\,S(T_i;\eta_i).$$

If a record is censored, then the probability mass function of $T_i$ is obtained by integrating $f(t;\eta_i)$ from $T_i$ to $\infty$, yielding

$$S(T_i;\eta_i) = e^{-\Lambda(T_i;\eta_i)}.$$

The log likelihood for animal i is

$$l_i = W_i\ln(\lambda(T_i;\eta_i)) - \Lambda(T_i;\eta_i)$$

where $W_i = 1$ if a record is uncensored and $W_i = 0$ if a record is censored. The corresponding elements in [4] are

$$R_{ii} = \Lambda_0(T_i)e^{\eta_i} \qquad [5]$$

and

$$y^*_i = W_i - \Lambda_0(T_i)e^{\eta_i} + R_{ii}\eta_i. \qquad [6]$$

Censoring can lead to difficulties in parameter estimation. When animal i is right censored at $T_i$, the log likelihood is

$$l_i = -\Lambda_0(T_i)e^{\eta_i}$$

and taking the partial with respect to $\eta_i$ yields

$$\frac{\partial l_i}{\partial\eta_i} = -\Lambda_0(T_i)e^{\eta_i}.$$

Because $\Lambda_0(T_i)$ is positive, and $e^{\eta_i}$ is positive for all values of $\eta_i$, the partial is always negative. The implication is that if all the records in a fixed effect group are right censored, then the estimate of the risk factor for that fixed effect group will go to $-\infty$.

TIME-DEPENDENT COVARIATES

Various events in an animal's life can lead to changes in its risk of failure. For example, the underlying risk in a herd can change over time because of management, disease, or economic forces. A sharp drop in the price of milk would increase a dairy cow's risk of being culled. However, we would not expect the drop to have an impact on an animal prior to the drop taking place. These changes can also be on an individual animal basis for factors such as disease and reproductive status. Figure 6 illustrates a hazard function for an animal who becomes ill at 2 yr and recovers at 3.5 yr. The risk of failure increases during the period of illness and decreases when the animal recovers. These changes in the risk of failure can be modeled using a time dependent covariate.

The record on the animal is broken into three conditionally independent observations: 1) a well animal with a survival time greater than 2, 2) an ill animal with a survival time greater than 3.5 conditioned on survival till time 2 yr, and 3) a recovered animal conditioned on survival until 3.5 yr. The resulting log likelihood for the animal at time t > 3.5 yr is

$$l_i(t) = \ln(S(2;\eta_{i0})) - \ln(S(2;\eta_{i1})) + \ln(S(3.5;\eta_{i1})) - \ln(S(3.5;\eta_{i2})) + \ln(f(t;\eta_{i2}))$$

where $\eta_{i0}$, $\eta_{i1}$, and $\eta_{i2}$ = risk factors for animal i when it is well, ill, and recovered, respectively. The likelihood is obtained by observing that

$$f(t;\eta_i \mid t > C) = \frac{f(t;\eta_i)}{S(C;\eta_i)}$$

$$l_i(t \mid t > C) = \ln(f(t;\eta_i)) - \ln(S(C;\eta_i))$$

where $f(t;\eta_i \mid t > C)$ = density of animal i surviving to time t, conditioned on its surviving to time C, and $l_i(t \mid t > C)$ = corresponding contribution to the log likelihood.

PROGRAMMING ISSUES

Existing mixed model programs can be modified to handle the analysis of a survival trait with relatively small changes. The changes that need to be made include repeatedly building and solving the mixed model equations with updated risk factors. Within the portion of the program that builds the mixed model equations, risk factors, $\eta_i$, adjusted weights, $R_{ii}$, and adjusted dependent variables, $y^*_i$, need to be calculated for each animal. The adjusted risk factors can be calculated within the main body of the program. The adjusted weights and dependent variables are
The two additions are the calculation of the risk factor for animal ETA and the addition of a link subroutine LINK(). The basics of the link subroutine, assuming $\rho$ is known, are

      SUBROUTINE LINK(Y,R,ETA,W,YSTAR)
      REAL*8 Y,R,ETA,W,YSTAR,LAMBDA,RHO
C     Known rate parameter (rho = RHO)
      RHO=1.D0
C     Integrated baseline hazard function (Lambda0(Ti) = LAMBDA)
      LAMBDA=EXP(RHO*LOG(Y))
C     Weight (Rii = R) [5]
      R=LAMBDA*EXP(ETA)
C     Adjusted dependent variable (y*i = YSTAR) [6]
      YSTAR=W-LAMBDA*EXP(ETA)+R*ETA
      RETURN
      END

Within the link subroutine, the one line that depends on the hazard function selected is the calculation of the integrated baseline hazard function LAMBDA.

Although the basics are straightforward, the usefulness of the program will depend on several additional details. First, the above modifications depend on the rate parameter RHO being known. In practice it will need to be estimated. Estimation of the rate parameter can be treated as a second trait and estimated along with the risk factor. The weight and adjusted dependent variable for animal i then become

$$R_{ii} = \begin{bmatrix} \Lambda(T_i;\eta_i,\rho_i) & \Lambda(T_i;\eta_i,\rho_i)\ln(T_i) \\ \Lambda(T_i;\eta_i,\rho_i)\ln(T_i) & \Lambda(T_i;\eta_i,\rho_i)\ln(T_i)^2 + \dfrac{1}{\rho_i^2} \end{bmatrix}$$

$$y^*_i = \begin{bmatrix} 1 - \Lambda(T_i;\eta_i,\rho_i) \\ (1 - \Lambda(T_i;\eta_i,\rho_i))\ln(T_i) + \dfrac{1}{\rho_i} \end{bmatrix} + R_{ii}\begin{bmatrix} \eta_i \\ \rho_i \end{bmatrix}$$

where $\rho_i$ = rate parameter for animal i. Typically the model equation for the rate parameter is $\rho_i = \rho$ or in matrix form

$$\boldsymbol{\rho} = \mathbf{1}\rho$$

where $\boldsymbol{\rho}$ = vector of rate parameters.

In addition, simultaneous estimation of the rate parameter and the risk factor can lead to convergence problems. Often it will be necessary to initially fix the rate parameter to obtain reasonable estimates of the risk factors. Second, risk factors have a tendency to go to $\pm\infty$. Generally this effect is due to contemporary groups in which all observations are right censored or due to inclusion of time-dependent covariates. The basic way of handling this effect is to provide bounds for the risk factors. For example, bounds for $\rho\ln(T_i) + \eta_i$ would be $-7$ and 2.5. Third, time-dependent covariates can be handled by preprocessing the data to produce multiple coded records.
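The build-and-solve cycle described under PROGRAMMING ISSUES can be sketched in Python for the simplest possible case. The `link` function is a port of the LINK subroutine with the rate parameter passed as an argument. The model is deliberately reduced to a single fixed effect with no random effects, so that equations [4] collapse to one scalar equation; that reduction, the function names, and the data are assumptions for illustration, not the paper's program.

```python
import math

def link(y, eta, w, rho=1.0):
    """Port of LINK: return (R, YSTAR) for one record."""
    lam = math.exp(rho * math.log(y))            # integrated baseline hazard
    r = lam * math.exp(eta)                      # weight, eq. [5]
    ystar = w - lam * math.exp(eta) + r * eta    # adjusted dependent variable, eq. [6]
    return r, ystar

def solve_group(times, uncensored, rho=1.0, beta=0.0, max_iter=200, tol=1e-10):
    """Repeatedly build and solve the scalar equation (sum R) * beta = sum y*."""
    for iteration in range(1, max_iter + 1):
        lhs = rhs = 0.0
        for y, w in zip(times, uncensored):
            r, ystar = link(y, beta, w, rho)     # risk factor is beta for every record
            lhs += r
            rhs += ystar
        new_beta = rhs / lhs
        if abs(new_beta - beta) < tol:
            return new_beta, iteration
        beta = new_beta
    return beta, max_iter

times = [2.0, 3.5, 1.0, 4.0]     # hypothetical survival or censoring times
status = [1, 1, 0, 1]            # W: 1 = uncensored, 0 = right censored

beta_hat, n_iter = solve_group(times, status)
print(beta_hat, n_iter)

# With every record censored, each cycle lowers the estimate by exactly 1.
drift, _ = solve_group(times, [0, 0, 0, 0], max_iter=5)
print(drift)
```

For this reduced model the iteration has a closed-form limit, the log of the number of uncensored records divided by the summed integrated baseline hazards, which gives a convenient check. The second call shows the behavior noted under CENSORING: with an all-censored group the estimate never settles and keeps moving toward $-\infty$, which is the situation the bounds on the risk factors are meant to contain.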