Estimation - Parametric
• Often we will assume that the random survival times of a population follow a
particular parametric distribution defined by $f(t)$. For example, last week we
saw an example where $f(t) = \lambda \exp(-\lambda t)$.
• Once we have assumed a particular distribution we need a means of
estimating the parameters that define the distribution.
• For the moment we will ignore censoring, that is, our data are complete.
Method of Moments (MOM)
For the exponential distribution we know that $E(X) = \frac{1}{\lambda}$ and that the first
sample moment is $n^{-1} \sum_{i=1}^{n} x_i$. Equating the sample and population moments
gives $\frac{1}{\hat{\lambda}} = n^{-1} \sum_{i=1}^{n} x_i$ and
$\hat{\lambda} = \left(n^{-1} \sum_{i=1}^{n} x_i\right)^{-1}$.
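As a quick numerical check of the method-of-moments estimator, the sketch below simulates complete exponential lifetimes and computes the estimate as the reciprocal of the first sample moment. The sample size, the true rate of 0.5 and the seed are illustrative assumptions, not values from these notes.

```r
set.seed(1)                        # assumed seed, for reproducibility only
true_rate <- 0.5                   # assumed true value of lambda
x <- rexp(1000, rate = true_rate)  # complete (uncensored) lifetimes

# First sample moment: n^{-1} * sum(x_i)
sample_moment <- sum(x) / length(x)

# MOM estimate: equate 1/lambda to the first sample moment and solve for lambda
lambda_mom <- 1 / sample_moment
lambda_mom
```

With a sample this large the estimate should land close to the assumed true rate.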
Maximum Likelihood Estimation (MLE)
Note: The above results mean that for large enough n, our MLE $\hat{\theta}$ is
approximately $N\left(\theta, \frac{1}{I(\theta)}\right)$.
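For the exponential example this approximation can be worked out explicitly. The derivation below assumes that $I(\theta)$ denotes the Fisher information of the whole sample of n observations; that is an assumption about notation, since the definition is not shown here.

```latex
\begin{align*}
\ell(\lambda) &= \sum_{i=1}^{n} \log\bigl(\lambda e^{-\lambda x_i}\bigr)
               = n \log\lambda - \lambda \sum_{i=1}^{n} x_i, \\
\ell'(\lambda) &= \frac{n}{\lambda} - \sum_{i=1}^{n} x_i, \qquad
\ell''(\lambda) = -\frac{n}{\lambda^{2}}, \\
I(\lambda) &= -E\bigl[\ell''(\lambda)\bigr] = \frac{n}{\lambda^{2}}, \\
\hat{\lambda} &\;\overset{\text{approx}}{\sim}\; N\!\left(\lambda, \frac{\lambda^{2}}{n}\right).
\end{align*}
```

Under that assumption, an approximate standard error for $\hat{\lambda}$ is $\hat{\lambda}/\sqrt{n}$.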
R Example
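# Negative Poisson log-likelihood (the term not involving mu is dropped)
poisson.L <- function(mu, x) {
  n <- length(x)
  logL <- sum(x) * log(mu) - n * mu
  return(-logL)
}
# Minimise the negative log-likelihood; the result should be close to mean(x)
optim(par = 1, fn = poisson.L, method = "BFGS", x = c(6, 6, 0, 3, 5, 4, 9, 5, 3, 7, 6))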
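For the Poisson model the MLE has a closed form, the sample mean, so a numerical optimiser can be checked against it. The sketch below is one way to do that. It uses `optimize()` on an assumed search interval of (0.01, 20) instead of `optim()`, simply to keep the search inside mu > 0; the names `neg_logL`, `mu_numeric` and `mu_closed` are illustrative.

```r
# Data from the example
x <- c(6, 6, 0, 3, 5, 4, 9, 5, 3, 7, 6)

# Negative Poisson log-likelihood, dropping the term that does not involve mu
neg_logL <- function(mu, x) {
  -(sum(x) * log(mu) - length(x) * mu)
}

# One-dimensional minimisation over an assumed interval containing the MLE
fit <- optimize(neg_logL, interval = c(0.01, 20), x = x)

mu_numeric <- fit$minimum  # numerical MLE
mu_closed  <- mean(x)      # closed-form MLE: the sample mean
c(numeric = mu_numeric, closed_form = mu_closed)
```

The two values should agree up to the optimiser's tolerance.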
R Example
[Figure: two histograms with panel titles "n=5" and "n=10,000", x-axes "mle5" and "mle10000", y-axis "Frequency".]
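The code that generated these histograms is not reproduced here. A minimal sketch of how such a comparison could be simulated is given below, assuming Poisson data with mean 3 and 1,000 replications (both purely illustrative choices). The names `mle5` and `mle10000` follow the axis labels.

```r
set.seed(42)   # assumed seed
reps <- 1000   # assumed number of simulated samples
mu   <- 3      # assumed true Poisson mean

# The Poisson MLE is the sample mean, so each replicate is one sample mean
mle5     <- replicate(reps, mean(rpois(5, mu)))
mle10000 <- replicate(reps, mean(rpois(10000, mu)))

# Side-by-side histograms of the two sampling distributions
par(mfrow = c(1, 2))
hist(mle5, main = "n=5", xlab = "mle5")
hist(mle10000, main = "n=10,000", xlab = "mle10000")
```

With the larger sample size the estimates should concentrate tightly around the true value and look roughly normal, in line with the large-sample result for the MLE.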
A Non-parametric Approach
• An alternative to specifying a particular parametric distribution is to use a
“distribution free” or non-parametric approach.
• One advantage of a non-parametric approach is that we do not need to
specify a particular distribution, so it can be more robust.
• One disadvantage of a non-parametric approach is that it can be less
“efficient” than a suitable parametric approach.
A non-parametric estimate of the distribution function is
$\hat{F}(t) = \frac{d(t)}{N}$,
where $d(t)$ is the number of the $N$ observed lifetimes that are $\le t$.
A Non-parametric Approach
$\hat{F}(t) = \frac{d(t)}{N}$,
where $d(t)$ is the number of the $N$ observed lifetimes that are $\le t$.
• $d(t)$ is binomial$(N, F(t))$.
• $E(\hat{F}(t)) = E\left(\frac{d(t)}{N}\right) = N^{-1} \times N \times F(t) = F(t)$ (unbiased).
• $V(\hat{F}(t)) = \frac{1}{N} F(t)(1 - F(t))$.
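These two properties can be checked by simulation. The sketch below assumes exponential lifetimes with rate 0.25, N = 20 and evaluation at t = 2; all of these values, and the seed, are illustrative only.

```r
set.seed(7)     # assumed seed
N    <- 20      # assumed number of observed lifetimes
rate <- 0.25    # assumed exponential rate
t0   <- 2       # assumed time point at which F is estimated
reps <- 20000   # assumed number of simulated samples

# F-hat(t0) = d(t0) / N for each simulated sample
Fhat <- replicate(reps, sum(rexp(N, rate) <= t0) / N)

F_true <- pexp(t0, rate)  # true F(t0) = 1 - exp(-rate * t0)
c(mean_Fhat = mean(Fhat), F_true = F_true)                   # unbiasedness
c(var_Fhat = var(Fhat), theory = F_true * (1 - F_true) / N)  # variance
```

The simulated mean and variance should sit close to the theoretical values, up to Monte Carlo error.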
R Example
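# Simulate 20 exponential lifetimes with rate 0.25
observed <- rexp(20, 0.25)
# MLE of the exponential rate
lammle <- 1 / mean(observed)
# Time grid on which both CDF estimates are evaluated
tgrid <- seq(0, max(observed), by = 0.01)
# Parametric estimate: exponential CDF evaluated at the MLE
Fparam <- 1 - exp(-lammle * tgrid)
# Non-parametric estimate: proportion of observed lifetimes <= t
Fnonparam <- rep(0, length(tgrid))
for (j in seq_along(tgrid)) {
  Fnonparam[j] <- sum(observed <= tgrid[j]) / length(observed)
}
plot(tgrid, Fparam, type = "l", main = "Estimation of CDF", xlab = "Time", ylab = "Probability")
lines(tgrid, Fnonparam, col = "red")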
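Base R also provides `ecdf()`, which returns the empirical distribution function as a step function and avoids the explicit loop. The sketch below simulates its own data (with an assumed seed) and checks that `ecdf()` agrees with the direct calculation; `F_loop` and `F_ecdf` are illustrative names.

```r
set.seed(123)                               # assumed seed
observed <- rexp(20, 0.25)                  # same simulation set-up as the example
tgrid <- seq(0, max(observed), by = 0.01)

# Direct calculation: proportion of lifetimes <= s at each grid point s
F_loop <- sapply(tgrid, function(s) sum(observed <= s) / length(observed))

# Built-in empirical CDF, evaluated on the same grid
F_ecdf <- ecdf(observed)(tgrid)

all.equal(F_loop, F_ecdf)
```

The object returned by `ecdf()` can also be passed straight to `plot()` to draw the step function.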
R Example
[Figure: "Estimation of CDF". The parametric estimate is drawn as a line and the non-parametric estimate in red; x-axis: Time (0 to 10), y-axis: Probability.]
Censoring
• Random censoring - let $C_i$ represent the time of censoring of the $i$th life and
$T_i$ the random lifetime of the $i$th life. Life $i$ is censored if $C_i < T_i$.
1. Definition 1: Censoring is random if $C_i$ is a random variable.
2. Definition 2: Censoring is random if $C_i$ and $T_i$ are independent random
variables. (This is the most common definition.)
• Informative/non-informative censoring - Censoring is non-informative if it
provides no information about the future lifetime ($T_i$). Definition 2 of
random censoring implies non-informative censoring; definition 1 does not.
The methods we cover in this course rely on the censoring being
non-informative.
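To make definition 2 concrete, the sketch below simulates lifetimes and censoring times independently and records what would actually be observed. The exponential distributions, their rates, the seed and the names `time` and `event` are illustrative assumptions.

```r
set.seed(2024)                  # assumed seed
n <- 1000                       # assumed number of lives

T_i <- rexp(n, rate = 0.25)     # random lifetimes T_i (assumed exponential)
C_i <- rexp(n, rate = 0.10)     # censoring times C_i, drawn independently of T_i

# Life i is censored if C_i < T_i; we then observe C_i instead of T_i
censored <- C_i < T_i
time  <- pmin(T_i, C_i)         # observed time: whichever comes first
event <- as.integer(!censored)  # 1 = lifetime observed, 0 = censored

mean(censored)                  # proportion of lives censored in this sample
```

Because the censoring times are generated without reference to the lifetimes, the censoring in this sketch is random in the sense of definition 2 and therefore non-informative.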