
Chapter 8.3.

Maximum Likelihood Estimation


Prof. Tesler

Math 283 April 12, 2011


Estimating parameters
Let $Y$ be a random variable with a distribution of known type but unknown parameter value $\theta$.
Examples: Bernoulli or geometric with unknown $p$; Poisson with unknown mean $\lambda$.

Write the pdf of $Y$ as $P_Y(y; \theta)$ to emphasize that there is a parameter $\theta$.
Do $n$ independent trials to get data $y_1, y_2, y_3, \ldots, y_n$.
The joint pdf is
$$P_{Y_1,\ldots,Y_n}(y_1, \ldots, y_n; \theta) = P_Y(y_1; \theta) \cdots P_Y(y_n; \theta)$$
Goal: Use the data to estimate $\theta$.

Likelihood function
Previously, we knew the parameter $\theta$ and regarded the $y$'s as unknowns (occurring with certain probabilities).
Define the likelihood of $\theta$ given data $y_1, \ldots, y_n$ to be
$$L(\theta; y_1, \ldots, y_n) = P_{Y_1,\ldots,Y_n}(y_1, \ldots, y_n; \theta) = P_Y(y_1; \theta) \cdots P_Y(y_n; \theta)$$
It's the exact same formula as the joint pdf; the difference is the interpretation.
Now we consider the data $y_1, \ldots, y_n$ to be given and $\theta$ to be an unknown.
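As a rough illustration of reading the same formula as a function of the parameter, here is a minimal Python sketch for the Poisson case. The function names `poisson_pmf` and `likelihood` and the example counts are assumptions made for illustration; they are not part of the slides.

```python
import math

def poisson_pmf(y, lam):
    """P_Y(y; lam) for a Poisson random variable with mean lam."""
    return math.exp(-lam) * lam ** y / math.factorial(y)

def likelihood(lam, data):
    """L(lam; y_1, ..., y_n): product of the pmf over independent trials."""
    result = 1.0
    for y in data:
        result *= poisson_pmf(y, lam)
    return result

# Example counts (illustrative); the data stay fixed while lam varies.
data = [1, 0, 0, 0, 3, 0, 2, 2, 0, 2]
for lam in (0.5, 1.0, 1.5, 2.0):
    print(f"lambda = {lam}: L = {likelihood(lam, data):.3e}")
```

Among the four trial values, the likelihood is largest at the one closest to the sample mean of these counts.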

Definition (Maximum Likelihood Estimate, or MLE)

The value $\theta = \hat\theta$ that maximizes the likelihood is the Maximum Likelihood Estimate.
Often, it is found using calculus:
$$\frac{dL}{d\theta} = 0, \qquad \frac{d^2 L}{d\theta^2} < 0$$
This may find only some of the maxima; we also need to check boundary values of $\theta$.

MLE for the Poisson distribution


$Y$ has a Poisson distribution with unknown parameter $\lambda \ge 0$.
Collect data from independent trials: $Y_1 = y_1, Y_2 = y_2, \ldots, Y_n = y_n$
Likelihood:
$$L(\lambda; y_1, \ldots, y_n) = \prod_{i=1}^{n} \frac{e^{-\lambda} \lambda^{y_i}}{y_i!} = \frac{e^{-n\lambda} \lambda^{y_1 + \cdots + y_n}}{y_1! \cdots y_n!}$$
Log likelihood is maximized at the same $\lambda$ and is easier to use:
$$\ln L(\lambda; y_1, \ldots, y_n) = -n\lambda + (y_1 + \cdots + y_n) \ln \lambda - \ln(y_1! \cdots y_n!)$$
Critical point: Solve $d(\ln L)/d\lambda = 0$:
$$\frac{d(\ln L)}{d\lambda} = -n + \frac{y_1 + \cdots + y_n}{\lambda} = 0 \qquad\text{so}\qquad \lambda = \frac{y_1 + \cdots + y_n}{n}$$

MLE for the Poisson distribution



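Check second derivative is negative: at the critical point $\lambda = (y_1 + \cdots + y_n)/n$,
$$\frac{d^2(\ln L)}{d\lambda^2} = -\frac{y_1 + \cdots + y_n}{\lambda^2} = -\frac{n^2}{y_1 + \cdots + y_n} < 0$$
provided $y_1 + \cdots + y_n > 0$. Since each $y_i \ge 0$, it's a max unless $y_1 = \cdots = y_n = 0$.
Boundaries for range $\lambda \ge 0$: Must check $\lambda \to 0^+$ and $\lambda \to \infty$. Both send $\ln L \to -\infty$, so the $\lambda$ identified above gives the max.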

The Maximum Likelihood Estimate for the Poisson distribution


$$\hat\lambda = \frac{y_1 + \cdots + y_n}{n} = \frac{0\,(\#\text{ of 0s}) + 1\,(\#\text{ of 1s}) + 2\,(\#\text{ of 2s}) + \cdots}{n}$$
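The closed-form result can be sanity-checked numerically. The sketch below is a minimal check, not a general optimizer: it evaluates the log likelihood on a grid of $\lambda$ values and compares the grid maximizer with the sample mean. The names `log_likelihood` and `poisson_mle`, the grid spacing, and the example counts are assumptions chosen for illustration.

```python
import math

def log_likelihood(lam, data):
    """ln L = -n*lam + (sum of y_i) * ln(lam) - ln(y_1! ... y_n!), valid for lam > 0."""
    n = len(data)
    total = sum(data)
    log_factorials = sum(math.lgamma(y + 1) for y in data)  # ln(y!) = lgamma(y + 1)
    return -n * lam + total * math.log(lam) - log_factorials

def poisson_mle(data):
    """Closed-form Poisson MLE: the sample mean."""
    return sum(data) / len(data)

data = [3, 2, 2, 1, 1, 1, 1, 2, 1, 1]          # illustrative counts
lam_hat = poisson_mle(data)

# Brute-force check on a grid lam = 0.001, 0.002, ..., 5.000.
grid = [k / 1000 for k in range(1, 5001)]
best_on_grid = max(grid, key=lambda lam: log_likelihood(lam, data))

print("closed form:", lam_hat)
print("grid search:", best_on_grid)
```

The grid search only confirms the formula to within the grid spacing; the calculus argument is what shows the sample mean is the exact maximizer.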

MLE for the Poisson distribution


The exceptional case on the previous slide was $y_1 + \cdots + y_n = 0$, giving $y_1 = \cdots = y_n = 0$. In this case,
$$\ln L(\lambda; y_1, \ldots, y_n) = -n\lambda + (y_1 + \cdots + y_n) \ln \lambda - \ln(y_1! \cdots y_n!) = -n\lambda + 0 \cdot \ln \lambda - \ln(0! \cdots 0!) = -n\lambda$$
As $\lambda \to -\infty$, we get $-n\lambda \to \infty$, which doesn't have a maximum.
However, $\lambda \ge 0$ for the Poisson distribution, so the full problem is to maximize $L$ (or $\ln L$) subject to $\lambda \ge 0$:
$$\ln L = 0 \text{ if } \lambda = 0 \qquad\text{and}\qquad \ln L < 0 \text{ if } \lambda > 0.$$
So the MLE is $\hat\lambda = 0$.
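A small numerical illustration of this boundary case, assuming $n = 10$ observations that are all zero (the value of $n$ and the trial values of $\lambda$ are arbitrary choices): the likelihood equals 1 at $\lambda = 0$ and shrinks as $\lambda$ grows, so the sample-mean formula still gives the constrained maximizer.

```python
import math

def likelihood(lam, data):
    """Poisson likelihood: product of e^(-lam) * lam^y / y! over the data."""
    return math.prod(math.exp(-lam) * lam ** y / math.factorial(y) for y in data)

zeros = [0] * 10                      # every trial produced a count of 0
trial_lams = (0.0, 0.1, 0.5, 1.0)
values = [likelihood(lam, zeros) for lam in trial_lams]

for lam, value in zip(trial_lams, values):
    print(f"lambda = {lam}: L = {value:.4f}")

# The sample mean is 0, matching the maximizer on the boundary lam = 0.
lam_hat = sum(zeros) / len(zeros)
```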

Repeating the estimation gives different results


A does $n$ trials $y_{A1}, y_{A2}, \ldots, y_{An}$, leading to MLE $\hat\lambda_A$.
B does $n$ trials $y_{B1}, y_{B2}, \ldots, y_{Bn}$, leading to MLE $\hat\lambda_B$.
How do $\hat\lambda_A, \hat\lambda_B, \ldots$ compare?
Treat the $n$ trials in each experiment as random variables $Y_1, \ldots, Y_n$ and the MLE as a random variable $\hat\lambda$.

Estimate Poisson parameter with $n = 10$ trials (secret: $\lambda = 1.23$)


Experiment   Y1    Y2    Y3    Y4    Y5    Y6    Y7    Y8    Y9    Y10   MLE
A            1     0     0     0     3     0     2     2     0     2     1.0
B            1     2     0     1     1     3     0     0     0     1     0.9
C            3     2     2     1     1     1     1     2     1     1     1.5
D            1     2     1     2     1     4     2     3     2     1     1.9
E            0     3     0     1     1     0     0     1     2     2     1.0
Mean         1.2   1.8   0.6   1     1.4   1.6   1     1.6   1     1.4   1.26
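The MLE column of the table can be reproduced from the raw counts. This is a minimal sketch using the data for experiments A through E; the variable names are assumptions for illustration.

```python
# Counts Y1..Y10 for each experiment, as listed in the table.
experiments = {
    "A": [1, 0, 0, 0, 3, 0, 2, 2, 0, 2],
    "B": [1, 2, 0, 1, 1, 3, 0, 0, 0, 1],
    "C": [3, 2, 2, 1, 1, 1, 1, 2, 1, 1],
    "D": [1, 2, 1, 2, 1, 4, 2, 3, 2, 1],
    "E": [0, 3, 0, 1, 1, 0, 0, 1, 2, 2],
}

# Poisson MLE for each experiment: the sample mean of its 10 counts.
mle = {name: sum(ys) / len(ys) for name, ys in experiments.items()}

# Average of the five estimates (the bottom-right entry of the table).
mean_of_mles = sum(mle.values()) / len(mle)

for name, estimate in mle.items():
    print(name, estimate)
print("mean of MLEs:", round(mean_of_mles, 2))
```

Each experiment gives a different estimate, and none equals the secret value 1.23 exactly, which is the motivation for treating the MLE as a random variable.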

Desirable properties of an estimator

$\hat\lambda$ should be narrowly distributed around the correct value of $\lambda$.
Increasing $n$ should improve the estimate.
The distribution of $\hat\lambda$ should be known.
The MLE often does this.


Bias
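Suppose $Y$ is Poisson with secret parameter $\lambda$.
Poisson MLE from data is
$$\hat\lambda = \frac{Y_1 + \cdots + Y_n}{n}$$
If many MLEs are computed from independent data sets, the average tends to
$$E(\hat\lambda) = E\left(\frac{Y_1 + \cdots + Y_n}{n}\right) = \frac{E(Y_1) + \cdots + E(Y_n)}{n} = \frac{\lambda + \cdots + \lambda}{n} = \frac{n\lambda}{n} = \lambda$$
Since $E(\hat\lambda) = \lambda$, we say $\hat\lambda$ is an unbiased estimator of $\lambda$.
If the formula were different such that we had $E(\hat\lambda) \ne \lambda$, we would say $\hat\lambda$ is a biased estimator of $\lambda$.
E.g.: $\hat\lambda' = 2Y_1$ has $E(\hat\lambda') = 2\lambda$, so it's biased (unless $\lambda = 0$).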
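A simulation can illustrate the "average over many data sets" idea. The sketch below is one possible setup, not the course's own code: it draws Poisson values with Knuth's multiplication method (an assumed choice, since Python's standard library has no Poisson sampler), and the seed, the number of repeats, and the function names are arbitrary.

```python
import math
import random

def poisson_sample(lam, rng):
    """Draw one Poisson(lam) value by Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def average_estimates(lam, n, repeats, seed=0):
    """Average the MLE and the estimator 2*Y1 over many independent data sets."""
    rng = random.Random(seed)
    total_mle = 0.0
    total_alt = 0.0
    for _ in range(repeats):
        ys = [poisson_sample(lam, rng) for _ in range(n)]
        total_mle += sum(ys) / n      # unbiased: expected value is lam
        total_alt += 2 * ys[0]        # biased: expected value is 2 * lam
    return total_mle / repeats, total_alt / repeats

avg_mle, avg_alt = average_estimates(lam=1.23, n=10, repeats=20000)
print("average of MLEs:      ", avg_mle)
print("average of 2*Y1 values:", avg_alt)
```

With this many repeats, the first average should land near 1.23 and the second near 2.46, up to simulation noise.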

Efficiency (want estimates to have small spread)


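Continue with Poisson MLE $\hat\lambda = \frac{Y_1 + \cdots + Y_n}{n}$ and secret mean $\lambda$.
The variance is
$$\mathrm{Var}(\hat\lambda) = \mathrm{Var}\left(\frac{Y_1 + \cdots + Y_n}{n}\right) = \frac{\mathrm{Var}(Y_1) + \cdots + \mathrm{Var}(Y_n)}{n^2} = \frac{n\,\mathrm{Var}(Y_1)}{n^2} = \frac{\mathrm{Var}(Y_1)}{n} = \frac{\lambda}{n}$$
Increasing $n$ makes the variance smaller ($\hat\lambda$ is more efficient).
Here's a second estimator: Use $Y_1, Y_2$ and discard $Y_3, \ldots, Y_n$.
$$\hat\lambda' = \frac{Y_1 + 2Y_2}{3}$$
$$E(\hat\lambda') = \frac{\lambda + 2\lambda}{3} = \lambda \qquad\text{so unbiased}$$
$$\mathrm{Var}(\hat\lambda') = \frac{\mathrm{Var}(Y_1) + 4\,\mathrm{Var}(Y_2)}{9} = \frac{\lambda + 4\lambda}{9} = \frac{5\lambda}{9}$$
so it has higher variance (less efficient) than the MLE, since $5\lambda/9 > \lambda/n$ for $n \ge 2$ and $\lambda > 0$.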
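The two variances can be compared by simulation. This is a rough sketch under assumed settings ($\lambda = 1.23$, $n = 10$, 20000 repeats, a fixed seed); the Poisson sampler uses Knuth's multiplication method, and all function names are illustrative.

```python
import math
import random
import statistics

def poisson_sample(lam, rng):
    """Draw one Poisson(lam) value by Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def simulate_estimators(lam, n, repeats, seed=0):
    """Return lists of the MLE and of (Y1 + 2*Y2)/3 over many independent data sets."""
    rng = random.Random(seed)
    mle_values, alt_values = [], []
    for _ in range(repeats):
        ys = [poisson_sample(lam, rng) for _ in range(n)]
        mle_values.append(sum(ys) / n)
        alt_values.append((ys[0] + 2 * ys[1]) / 3)   # uses only the first two trials
    return mle_values, alt_values

lam, n = 1.23, 10
mle_values, alt_values = simulate_estimators(lam, n, repeats=20000)
var_mle = statistics.variance(mle_values)
var_alt = statistics.variance(alt_values)

print("sample Var(MLE):        ", var_mle, " theory:", lam / n)
print("sample Var((Y1+2Y2)/3): ", var_alt, " theory:", 5 * lam / 9)
```

Both estimators are unbiased, so the comparison of spread is what separates them: the sample variances should land near $\lambda/n$ and $5\lambda/9$ respectively, up to simulation noise.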