
Tutorial 5

Name: Ying Li, Sanyou Wu, Zhiyuan Pang, Liting Yu


The University of Hong Kong

1 Recap

Estimator properties

1. Unbiasedness

2. Efficiency

3. Consistency

2 This week

Estimator properties

1. Consistency

2.1 Consistency

The estimator θ̂n based on a random sample X1 , . . . , Xn is said to be a consistent estimator of θ
if, for any ϵ > 0,
\[
\lim_{n \to \infty} P\left\{ \left| \hat{\theta}_n - \theta \right| \le \epsilon \right\} = 1,
\]
or equivalently,
\[
\lim_{n \to \infty} P\left\{ \left| \hat{\theta}_n - \theta \right| > \epsilon \right\} = 0.
\]

The statement “θ̂n is a consistent estimator of θ” is equivalent to “θ̂n converges in probability to
θ”. That is, the sample estimator should have a high probability of being close to the population
value θ for a large sample size n.

- Sufficient condition for consistency: An (asymptotically) unbiased estimator θ̂n of θ is consistent
for θ if
\[
\lim_{n \to \infty} \operatorname{Var}\left( \hat{\theta}_n \right) = 0.
\]

Bias versus Consistency

• Unbiasedness alone does not imply consistency. (Example: unbiased but not consistent.) For
i.i.d. samples x1 , x2 , . . . , xn one can use Tn (x) = xn , the last observation, as an estimator of
E(x). Then E(Tn (x)) = E(x), so Tn is unbiased, but its variance does not shrink with n, so it
does not converge in probability to E(x).
• Biased but consistent. For example, if the mean is estimated by x̄ = (1/n) Σ xi + 1/n, it is
biased (the bias is 1/n), but it is consistent because both the bias and the variance vanish as
n → ∞.
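Both examples are easy to see in a short simulation. The sketch below is an added illustration,
not part of the tutorial; the N(5, 2²) population and numpy are assumptions made for the demo.

```python
# Minimal sketch: T_n = x_n (last observation) stays unbiased but its
# variance never shrinks, while x_bar + 1/n is biased yet consistent.
# Assumptions: N(mu = 5, sd = 2) data; numpy available.
import numpy as np

rng = np.random.default_rng(0)
mu, reps = 5.0, 10_000

for n in (10, 100, 1_000):
    x = rng.normal(mu, 2.0, size=(reps, n))
    t_last = x[:, -1]                  # unbiased, Var stays at 4
    t_bar = x.mean(axis=1) + 1.0 / n   # bias 1/n, Var = 4/n -> 0
    print(f"n={n:>5}: E(T_last)~{t_last.mean():.3f}, Var~{t_last.var():.3f} | "
          f"E(x_bar+1/n)~{t_bar.mean():.4f}, Var~{t_bar.var():.5f}")
```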

 
- Test for consistency: Let θ̂n be an estimator of θ and let Var(θ̂n ) be finite. If
\[
\lim_{n \to \infty} E\left[ \left( \hat{\theta}_n - \theta \right)^2 \right] = 0,
\]
then θ̂n is a consistent estimator of θ.

- Property: If θ̂ →p θ and θ̃ →p θ′ , then

(i) θ̂ ± θ̃ →p θ ± θ′ ;

(ii) θ̂ · θ̃ →p θ · θ′ ;

(iii) θ̂/θ̃ →p θ/θ′ , assuming that θ̃ ≠ 0 and θ′ ≠ 0;

(iv) if g is any real-valued function that is continuous at θ, then g(θ̂) →p g(θ).


Remark. (i)-(iii) are called the “Slutsky Theorems”, named after Eugen Slutsky. (iv) is called the
“Continuous Mapping Theorem”.
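A quick numerical illustration of (ii) and (iv) follows; this is an added sketch, and the
Exponential(1) population is an assumed example, not from the tutorial.

```python
# Sketch: x_bar ->p 1 for Exponential(mean 1) data, so by (ii)
# x_bar * x_bar ->p 1 and by (iv) g(x_bar) = exp(x_bar) ->p e.
import numpy as np

rng = np.random.default_rng(1)
for n in (10, 1_000, 100_000):
    xbar = rng.exponential(1.0, size=n).mean()
    print(f"n={n:>6}: x_bar={xbar:.4f}, x_bar^2={xbar**2:.4f}, "
          f"exp(x_bar)={np.exp(xbar):.4f} (e={np.e:.4f})")
```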

3 Exercises

1. Given an i.i.d. sample X1 , X2 , . . . , Xn ∼ U [θ, 2θ], where θ > 0 is an unknown parameter.

a. Find the maximum likelihood estimator θ̂ for θ.


b. Is θ̂ unbiased for θ? If not, modify this estimator to make it unbiased.
c. Show that θ̂ is consistent for θ.

Solution:

a. The likelihood is
\[
L(\theta) = \prod_{i=1}^{n} \frac{1}{\theta} = \frac{1}{\theta^{n}}
\qquad \text{if } \theta \le x_{\min} \le x_{\max} \le 2\theta .
\]
∵ dL/dθ = −n/θ^{n+1} < 0 for all θ > 0
∴ as θ decreases, L(θ) increases monotonically.
∵ xmax ⩽ 2θ
∴ θ ⩾ xmax /2
∴ θ̂MLE = xmax /2.

Note: θ also has to satisfy θ ⩽ xmin , so we should check whether xmax /2 > xmin (equivalently,
xmax > 2xmin ) can occur, which would make the MLE infeasible. This is ruled out by the joint
support of the uniform sample: θ ⩽ xmin ⩽ xmax ⩽ 2θ implies xmax ⩽ 2xmin .

b. x ∼ U [θ, 2θ], with pdf and cdf
\[
f_x(x) = \frac{1}{\theta} \quad \text{for } x \in [\theta, 2\theta],
\qquad
F_x(x) = P(X < x) =
\begin{cases}
\frac{x-\theta}{\theta} & \text{for } x \in [\theta, 2\theta], \\
1 & \text{for } x > 2\theta, \\
0 & \text{otherwise.}
\end{cases}
\]
Let F(n) (x) and f(n) (x) be the cdf and pdf of xmax . Then
\[
F_{(n)}(x) = P(\max\{X_1, \ldots, X_n\} < x)
= P(X_1 < x, X_2 < x, \ldots, X_n < x)
\overset{\text{iid}}{=} \prod_{i=1}^{n} P(X_i < x)
= [F(x)]^{n} =
\begin{cases}
\left( \frac{x-\theta}{\theta} \right)^{n} & \text{if } x \in [\theta, 2\theta], \\
1 & \text{if } x > 2\theta, \\
0 & \text{otherwise,}
\end{cases}
\]
so
\[
f_{(n)}(x) = \frac{d}{dx} F_{(n)}(x)
= n \left( \frac{x-\theta}{\theta} \right)^{n-1} \frac{1}{\theta}
= n \, \frac{(x-\theta)^{n-1}}{\theta^{n}} \quad \text{for } x \in [\theta, 2\theta].
\]
Integrating by parts,
\[
\begin{aligned}
E(x_{\max}) &= \int_{\theta}^{2\theta} x \, \frac{n (x-\theta)^{n-1}}{\theta^{n}} \, dx
= \frac{n}{\theta^{n}} \int_{\theta}^{2\theta} x (x-\theta)^{n-1} \, dx \\
&= \frac{n}{\theta^{n}} \left\{ \left[ x \, \frac{(x-\theta)^{n}}{n} \right]_{\theta}^{2\theta}
- \int_{\theta}^{2\theta} \frac{(x-\theta)^{n}}{n} \, dx \right\}
= \frac{n}{\theta^{n}} \left\{ \frac{2\theta \cdot \theta^{n}}{n}
- \left[ \frac{(x-\theta)^{n+1}}{n(n+1)} \right]_{\theta}^{2\theta} \right\} \\
&= \frac{n}{\theta^{n}} \left\{ \frac{2\theta \cdot \theta^{n}}{n}
- \frac{\theta^{n+1}}{n(n+1)} \right\}
= 2\theta - \frac{\theta}{n+1}
= \frac{2n\theta + 2\theta - \theta}{n+1}
= \frac{2n+1}{n+1} \, \theta .
\end{aligned}
\]
\[
\therefore E\left( \hat{\theta}_{MLE} \right) = E\left( \frac{x_{\max}}{2} \right)
= \frac{2n+1}{2n+2} \, \theta \ne \theta ,
\]
so θ̂MLE is biased for θ. Since
\[
E\left( \frac{2n+2}{2n+1} \, \hat{\theta}_{MLE} \right) = \theta ,
\]
the rescaled estimator
\[
\tilde{\theta} = \frac{2n+2}{2n+1} \, \hat{\theta}_{MLE}
= \frac{2n+2}{2n+1} \cdot \frac{x_{\max}}{2}
= \frac{n+1}{2n+1} \, x_{\max}
\]
is an unbiased estimator of θ.
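A short Monte Carlo sketch can verify both the bias factor (2n+1)/(2n+2) and the unbiasedness
of θ̃. This is an added illustration; θ = 3 and n = 20 are assumed values.

```python
# Sketch verifying E(x_max/2) = (2n+1)/(2n+2)*theta and that
# (n+1)/(2n+1)*x_max is unbiased. Assumed values: theta = 3, n = 20.
import numpy as np

rng = np.random.default_rng(2)
theta, n, reps = 3.0, 20, 200_000
xmax = rng.uniform(theta, 2 * theta, size=(reps, n)).max(axis=1)
print("E(MLE)      ~", (xmax / 2).mean(), " theory:", (2 * n + 1) / (2 * n + 2) * theta)
print("E(unbiased) ~", ((n + 1) / (2 * n + 1) * xmax).mean(), " theory:", theta)
```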

c. For any ε > 0 (no matter how small ε is), we have
\[
\begin{aligned}
P\left( \left| \frac{x_{\max}}{2} - \theta \right| < \varepsilon \right)
&= P\left( -\varepsilon < \frac{x_{\max}}{2} - \theta < \varepsilon \right) \\
&= P(2\theta - 2\varepsilon < x_{\max} < 2\theta + 2\varepsilon) \\
&= F_{(n)}(2\theta + 2\varepsilon) - F_{(n)}(2\theta - 2\varepsilon) \\
&= 1 - \left( \frac{2\theta - 2\varepsilon - \theta}{\theta} \right)^{n}
= 1 - \left( \frac{\theta - 2\varepsilon}{\theta} \right)^{n}
\to 1 \quad \text{as } n \to \infty ,
\end{aligned}
\]
using F(n) (2θ + 2ε) = 1 and the formula for F(n) (x) on [θ, 2θ] from part (b).
Note 1: If 2ε < θ, then 0 < (θ − 2ε)/θ < 1, so ((θ − 2ε)/θ)^n → 0 as n → ∞.
Note 2: If 2ε ≥ θ, then F(n) (2θ − 2ε) = 0 by the definition of F(n) (x), and the probability is exactly 1.
\[
\therefore \hat{\theta}_{MLE} = \frac{x_{\max}}{2} \overset{p}{\to} \theta .
\]
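The convergence can also be seen numerically by comparing the empirical probability with the
closed form 1 − ((θ − 2ε)/θ)^n. This is an added sketch; θ = 3 and ε = 0.05 are assumed values.

```python
# Sketch: empirical P(|x_max/2 - theta| < eps) vs the closed form.
# Assumed values: theta = 3, eps = 0.05.
import numpy as np

rng = np.random.default_rng(3)
theta, eps, reps = 3.0, 0.05, 10_000
for n in (10, 100, 1_000):
    xmax = rng.uniform(theta, 2 * theta, size=(reps, n)).max(axis=1)
    emp = np.mean(np.abs(xmax / 2 - theta) < eps)
    print(f"n={n:>5}: empirical {emp:.4f}, theory {1 - ((theta - 2 * eps) / theta) ** n:.4f}")
```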

2. Suppose that X1 , X2 , . . . , Xn denote a random sample from the probability density function
\[
f(x; \theta) =
\begin{cases}
\frac{2x}{\theta} \, e^{-x^2/\theta}, & \text{for } 0 < x < \infty, \\
0, & \text{otherwise.}
\end{cases}
\]

a. Find the maximum likelihood estimator (MLE) of θ.


b. Is the MLE above (i) consistent? (ii) unbiased? Give reasons.
c. Derive the Cramér-Rao Inequality, and comment on the MLE.

Solution:

a. Find θ̂MLE . The likelihood is
\[
L(\theta) = \prod_{i=1}^{n} f(x_i)
= \prod_{i=1}^{n} \frac{2 x_i}{\theta} \, e^{-x_i^2/\theta}
= \frac{2^{n}}{\theta^{n}} \, e^{-\sum x_i^2 / \theta} \prod_{i=1}^{n} x_i ,
\]
\[
\ln L(\theta) = n \ln 2 - n \ln \theta - \frac{\sum x_i^2}{\theta} + \sum \ln x_i ,
\]
\[
\frac{d \ln L(\theta)}{d\theta} = -\frac{n}{\theta} + \frac{\sum x_i^2}{\theta^2} = 0
\;\Rightarrow\; \sum x_i^2 = n\theta
\;\Rightarrow\; \theta = \frac{\sum_{i=1}^{n} x_i^2}{n} .
\]
Let’s check the second derivative to make sure θ = (1/n) Σ xᵢ² can indeed maximize ln L(θ):
\[
\frac{d^2 \ln L(\theta)}{d\theta^2} = \frac{n}{\theta^2} - \frac{2 \sum x_i^2}{\theta^3} < 0
\iff n\theta < 2 \sum x_i^2
\iff \theta < \frac{2 \sum x_i^2}{n} ,
\]
which holds at θ = (1/n) Σ xᵢ², since (1/n) Σ xᵢ² < (2/n) Σ xᵢ².
\[
\therefore \hat{\theta}_{MLE} = \frac{\sum_{i=1}^{n} x_i^2}{n} .
\]
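As a sanity check (an added sketch, not part of the original solution; θ = 2, n = 500, and scipy
are assumptions), one can maximize ln L(θ) numerically and compare with the closed form.
Sampling uses the fact, derived in part (c) below, that X² follows an exponential distribution
with mean θ.

```python
# Sketch: maximize ln L(theta) numerically and compare with mean(x^2).
# Assumptions: theta = 2, n = 500, scipy available; X is simulated as
# X = sqrt(T) with T ~ Exp(mean theta), since X^2 ~ Exp(mean theta) here.
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(8)
theta, n = 2.0, 500
x = np.sqrt(rng.exponential(theta, size=n))

def neg_log_lik(t):
    # -ln L(t), dropping the constant n*ln 2 + sum(ln x_i)
    return n * np.log(t) + np.sum(x ** 2) / t

res = minimize_scalar(neg_log_lik, bounds=(1e-6, 50.0), method="bounded")
print("numeric argmax:", res.x, "  closed form mean(x^2):", np.mean(x ** 2))
```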

b. By linearity and the i.i.d. assumption,
\[
E(\hat{\theta}) = E\left( \frac{\sum_{i=1}^{n} x_i^2}{n} \right)
\overset{\text{iid}}{=} \frac{n \, E(x_i^2)}{n} = E(x_i^2),
\]
\[
E(x_i^2) = \int_{0}^{\infty} x^2 f_x(x) \, dx
= \int_{0}^{\infty} x^2 \, \frac{2x}{\theta} \, e^{-x^2/\theta} \, dx
= \int_{0}^{\infty} \frac{x^2}{\theta} \, e^{-x^2/\theta} \cdot 2x \, dx .
\]
Let t = x², so dt = 2x dx:
\[
E(x_i^2) = \int_{0}^{\infty} \frac{t}{\theta} \, e^{-t/\theta} \, dt
= \theta \underbrace{\int_{0}^{\infty} \frac{(1/\theta)^2}{\Gamma(2)} \, t^{2-1} e^{-t/\theta} \, dt}_{\text{pdf of } \Gamma(2,\, 1/\theta) \text{ integrates to } 1}
= \theta .
\]
∴ E(θ̂) = θ and θ̂ is unbiased.

Remark. The Gamma distribution has pdf
\[
\Gamma(x; \alpha, \beta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)} \, x^{\alpha-1} e^{-\beta x}. \tag{1}
\]

∵ x1², x2², . . . , xn² are i.i.d. with E(xi²) = θ
∴ By the weak law of large numbers, (1/n) Σⁿᵢ₌₁ xi² →p E(xi²) = θ, so θ̂ is consistent.
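Both conclusions (unbiased, consistent) are easy to confirm by simulation. This is an added
sketch; θ = 2 is an assumed value, and X is drawn as √T with T ~ Exp(mean θ), using the
transformation from part (c) below.

```python
# Sketch: theta_hat = (1/n) * sum(x_i^2) should hover around theta and
# tighten as n grows. Assumption: theta = 2; X = sqrt(T), T ~ Exp(mean theta).
import numpy as np

rng = np.random.default_rng(4)
theta = 2.0
for n in (100, 10_000, 1_000_000):
    x = np.sqrt(rng.exponential(theta, size=n))
    print(f"n={n:>8}: theta_hat = {np.mean(x ** 2):.4f}  (theta = {theta})")
```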

c. The log-likelihood function of a single observation is l(θ) = ln fx (x) = ln 2x − ln θ − x²/θ, so
\[
\frac{d^2}{d\theta^2} l(\theta)
= \frac{d}{d\theta} \left( \frac{d}{d\theta} l(\theta) \right)
= \frac{d}{d\theta} \left( -\frac{1}{\theta} + \frac{x^2}{\theta^2} \right)
= \frac{1}{\theta^2} - \frac{2x^2}{\theta^3} .
\]

∵ The support of x does not depend on θ


\[
\therefore I(\theta) = -E\left[ \frac{d^2}{d\theta^2} l(\theta) \right]
= -E\left( \frac{1}{\theta^2} - \frac{2x^2}{\theta^3} \right)
= -\frac{1}{\theta^2} + \frac{2 E(x^2)}{\theta^3}
= -\frac{1}{\theta^2} + \frac{2\theta}{\theta^3}
= \frac{1}{\theta^2} ,
\]
\[
\therefore CB = \frac{1}{I_n(\theta)} = \frac{1}{n I(\theta)} = \frac{1}{n \frac{1}{\theta^2}} = \frac{\theta^2}{n} .
\]
Next,
\[
\operatorname{Var}(\hat{\theta}) = \operatorname{Var}\left( \frac{\sum x_i^2}{n} \right)
= \frac{1}{n^2} \operatorname{Var}\left( \sum x_i^2 \right)
\overset{\text{iid}}{=} \frac{n}{n^2} \operatorname{Var}(x_i^2)
= \frac{1}{n} \operatorname{Var}(x_i^2) . \tag{2}
\]

Let y = g(x) = x². Then g(·) is a monotonic function for x ∈ (0, ∞), so we can apply the
transformation formula:
\[
f_y(y) = f_x(x) \left| \frac{dx}{dy} \right|
= \frac{2x}{\theta} \, e^{-x^2/\theta} \cdot \frac{1}{2} y^{-1/2}
= \frac{2 y^{1/2}}{\theta} \, e^{-y/\theta} \cdot \frac{1}{2} y^{-1/2}
= \frac{1}{\theta} \, e^{-y/\theta} \quad \text{for } y > 0 .
\]
Note: y = x² with x > 0 gives x = √y, so dx/dy = (1/2) y^{−1/2}.
\[
\therefore y \sim \operatorname{Exp}\left( \frac{1}{\theta} \right)
\quad \Rightarrow \quad
\operatorname{Var}(x_i^2) = \operatorname{Var}(y) = \frac{1}{(1/\theta)^2} = \theta^2 .
\]
∴ By (2), Var(θ̂) = θ²/n = CB.
∴ θ̂ is unbiased and attains the Cramér-Rao lower bound, so θ̂ is the UMVUE of θ.
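A Monte Carlo check of the variance against the bound follows; this is an added sketch, and
θ = 2, n = 50 are assumed values.

```python
# Sketch: Var(theta_hat) should be close to the Cramer-Rao bound theta^2/n.
# Assumptions: theta = 2, n = 50; x_i^2 ~ Exp(mean theta) is used directly.
import numpy as np

rng = np.random.default_rng(5)
theta, n, reps = 2.0, 50, 200_000
theta_hat = rng.exponential(theta, size=(reps, n)).mean(axis=1)  # mean of x_i^2
print("Var(theta_hat) ~", theta_hat.var(), "  CRLB theta^2/n:", theta ** 2 / n)
```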

3. (Spring 2013 Class test) Consider a sample X1 , . . . , Xn with a common pdf


\[
f(x; \theta) =
\begin{cases}
\theta^2 x \, e^{-\theta x}, & \text{for } x > 0, \\
0, & \text{otherwise,}
\end{cases}
\]

where θ > 0 is an unknown parameter.

a. Write down the likelihood function based on the sample.

b. Find the Fisher information In (θ) contained in the sample.

c. Let X̄ = (1/n)(X1 + · · · + Xn ) be the sample mean, and consider T1 = 2/X̄ as an estimator
of θ. Is T1 a consistent estimator?

d. Check whether T1 is an unbiased estimator given that


\[
E_\theta\left( \frac{1}{X_1 + \cdots + X_n} \right) = \frac{\theta}{2n-1},
\qquad
E_\theta\left( \frac{1}{(X_1 + \cdots + X_n)^2} \right) = \frac{\theta^2}{(2n-1)(2n-2)} .
\]

e. Consider another estimator defined by T2 = (2n − 1)/(X1 + · · · + Xn ). Show that T2 is
unbiased.

Solution:

a. \[
L(\theta) = f(x_1, x_2, \ldots, x_n; \theta) = \prod_{i=1}^{n} f(x_i; \theta)
= \prod_{i=1}^{n} \left[ \theta^2 x_i \, e^{-\theta x_i} \right]
= \theta^{2n} \, e^{-\theta \sum_{i=1}^{n} x_i} \prod_{i=1}^{n} x_i
\quad \text{for } x_{\min} > 0 .
\]

b. The log-likelihood function of a single observation is l(θ) = ln fx (x) = 2 ln θ + ln x − θx, so
\[
\frac{d^2}{d\theta^2} l(\theta)
= \frac{d}{d\theta} \left( \frac{d}{d\theta} l(\theta) \right)
= \frac{d}{d\theta} \left( \frac{2}{\theta} - x \right)
= -\frac{2}{\theta^2} .
\]
∵ the support of x does not depend on θ
\[
\therefore I(\theta) = -E\left[ \frac{d^2}{d\theta^2} l(\theta) \right] = \frac{2}{\theta^2} ,
\qquad
I_n(\theta) = n I(\theta) = \frac{2n}{\theta^2} .
\]
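The information can be cross-checked through the identity I(θ) = Var(dl/dθ): here the score is
2/θ − x, so I(θ) = Var(x) = 2/θ², matching the result above. This is an added sketch; θ = 1.5 is
an assumed value, and note that numpy parameterizes the gamma by shape and scale = 1/rate.

```python
# Sketch: the score of one observation is d l/d theta = 2/theta - x, so
# I(theta) = Var(score) = Var(x) = 2/theta^2 for x ~ Gamma(2, rate theta).
# Assumed value: theta = 1.5.
import numpy as np

rng = np.random.default_rng(7)
theta = 1.5
x = rng.gamma(shape=2.0, scale=1.0 / theta, size=1_000_000)
score = 2.0 / theta - x
print("Var(score) ~", score.var(), "  I(theta) = 2/theta^2:", 2 / theta ** 2)
```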
θ
c. ∵ xi ∼iid Γ(2, θ)
∴ E(xi ) = 2/θ for all i.
∵ x̄ = (1/n) Σ xi →p E(xi ) = 2/θ, and g(z) = 2/z is continuous at z = 2/θ
∴ By the continuous mapping theorem,
\[
T_1 = \frac{2}{\bar{x}} \overset{p}{\to} \frac{2}{E(x_i)} = \frac{2}{2/\theta} = \theta ,
\]
so T1 is a consistent estimator of θ.

d. \[
\operatorname{Bias}(T_1) = E(T_1) - \theta
= E\left( \frac{2}{\bar{x}} \right) - \theta
= E\left( \frac{2}{\frac{1}{n} \sum x_i} \right) - \theta
= 2n \, E\left( \frac{1}{\sum x_i} \right) - \theta
= \frac{2n\theta}{2n-1} - \theta
= \theta \, \frac{2n - (2n-1)}{2n-1}
= \frac{\theta}{2n-1} \ne 0 .
\]
∴ T1 is biased for θ.

e. \[
E(T_2) = E\left( \frac{2n-1}{\sum x_i} \right)
= (2n-1) \, E\left( \frac{1}{\sum x_i} \right)
= (2n-1) \cdot \frac{\theta}{2n-1} = \theta .
\]
∴ T2 is unbiased for θ.
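Parts (d) and (e) can be confirmed in a few lines. This is an added sketch; θ = 1.5 and n = 10
are assumed values.

```python
# Sketch: E(T1) should be 2n*theta/(2n-1) (biased), E(T2) should be theta.
# Assumptions: theta = 1.5, n = 10; x_i ~ Gamma(shape 2, rate theta).
import numpy as np

rng = np.random.default_rng(6)
theta, n, reps = 1.5, 10, 500_000
s = rng.gamma(shape=2.0, scale=1.0 / theta, size=(reps, n)).sum(axis=1)
print("E(T1) ~", (2 * n / s).mean(), " theory:", 2 * n * theta / (2 * n - 1))
print("E(T2) ~", ((2 * n - 1) / s).mean(), " theory:", theta)
```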
