This action might not be possible to undo. Are you sure you want to continue?
l
†l
l
Suppose we are trying to measure the true value of some quantity (xT). u We make repeated measurements of this quantity {x1, x2, … xn}. u The standard way to estimate xT from our measurements is to calculate the mean value: 1 N m x = Â xi N i=1 + set xT = mx. + DOES THIS PROCEDURE MAKE SENSE??? + MLM: a general method for estimating parameters of interest from data. Statement of the Maximum Likelihood Method u Assume we have made N measurements of x {x1, x2, … xn}. u Assume we know the probability distribution function that describes x: f(x, a). u Assume we want to determine the parameter a. + MLM: pick a to maximize the probability of getting the measurements (the xi's) that we did! How do we use the MLM? u The probability of measuring x1 is f(x1, a)dx u The probability of measuring x2 is f(x2, a)dx u The probability of measuring xn is f(xn, a)dx u If the measurements are independent, the probability of getting the measurements we did is: L = f (x1,a )dx ⋅ f (x2 ,a )dx ⋅⋅⋅ f (xn ,a )dx = f (x1,a ) ⋅ f (x2 ,a ) ⋅⋅⋅ f (xn ,a )dx n u We can drop the dxn term as it is only a proportionality constant
+
†
L = ’ f (xi , a )
i=1
N
Likelihood Function
L5: Maximum Likelihood Method 1
K.K. Gan
†
u
u
We want to pick the a that maximizes L: ∂L =0 ∂a a =a* Both L and lnL have maximum at the same location. + maximize lnL rather than L itself because lnL converts the product into a summation.
ln L = Â ln f (xi , a )
i=1
+
†
N
new maximization condition: N ∂ ∂ ln L = Â ln f (xi , a ) =0 ∂a a =a* i=1∂a a =a * † n a could be an array of parameters (e.g. slope and intercept) or just a single variable. n equations to determine a range from simple linear equations to coupled nonlinear equations. l Example: †u Let f(x, a) be given by a Gaussian distribution. u Let a = m be the mean of the Gaussian. u We want the best estimate of a from our set of n measurements {x1, x2, … xn}. u Let’s assume that s is the same for each measurement.
f (xi , a ) =
u
1 e s 2p
n

( x i a ) 2 2s 2
The likelihood function for this problem is:
L = ’ f (xi , a ) = ’
i=1 n
†
1 i=1 s 2p
( x a ) 2  i 2 e 2s
È 1 =Í Îs 2p ˙ ˚
( x1 a ) ˘n  2s 2 e
2
( x a ) 2  2 2 e 2s
( x a ) 2  n 2 Le 2s
È 1 =Í Îs 2p ˙ ˚
( x a ) Â i ˘n  i=1 2s 2 e
n
2
K.K. Gan
L5: Maximum Likelihood Method
2
†
u
Find a that maximizes the log likelihood function:
∂ ln L ∂ È Ê 1 ˆ n (xi  a )2 ˘ = ˜ Â Ín lnÁ ˙= 0 Ë s 2p ¯ i=1 2s 2 ˚ ∂a ∂a Î ∂ n Â (xi  a )2 = 0 ∂a i=1
Â 2(xi  a )(1) = 0
i=1 n
Â xi = na
i=1
n
u
†
1 n a = Â xi Average n i=1 If s are different for each data point + a is just the weighted average: n x Â i2 s Weighted average a = i=1 i n 1 Â 2 i=1s i
†
K.K. Gan L5: Maximum Likelihood Method 3
l
Example u Let f(x, a) be given by a Poisson distribution. u Let a = m be the mean of the Poisson. u We want the best estimate of a from our set of n measurements {x1, x2, … xn}. u The likelihood function for this problem is: n
e a e a e a e a e a = ... = x1! x2! xn ! x1! x2!..x n ! i=1 i=1 xi ! Find a that maximizes the log likelihood function: n ˆ d ln L d Ê 1 n = Á na + ln a ⋅ Â xi  ln(x1! x2!..x n !)˜ = n + Â xi = 0 da da Ë a i=1 ¯ i=1 L = ’ f (xi , a ) = ’
i=1
n
n
a
xi
a
x1
a
x2
a
xn
na
Â xi
u
†
1 n a = Â xi n i=1
J
Average
Some general properties of the Maximum Likelihood Method
†
J
J
J J J L
For large data samples (large n) the likelihood function, L, approaches a Gaussian distribution. Maximum likelihood estimates are usually consistent. + For large n the estimates converge to the true value of the parameters we wish to determine. Maximum likelihood estimates are usually unbiased. + For all sample sizes the parameter of interest is calculated correctly. Maximum likelihood estimate is efficient: the estimate has the smallest variance. Maximum likelihood estimate is sufficient: it uses all the information in the observations (the xi’s). The solution from MLM is unique. Bad news: we must know the correct probability distribution for the problem at hand!
K.K. Gan L5: Maximum Likelihood Method 4
Maximum Likelihood Fit of Data to a Function
l
Suppose we have a set of n measurements: x1, y1 ± s1
x2 , y2 ± s 2 ... x n , yn ± s n
⋅
l
†
Assume each measurement error (s) is a standard deviation from a Gaussian pdf. u Assume that for each measured value y, there’s an x which is known exactly. u Suppose we know the functional relationship between the y’s and the x’s: y = q(x, a , b,...) n a, b...are parameters. + MLM gives us a method to determine a, b... from our data. Example: Fitting data points to a straight line: q(x, a , b,...) = a + bx
u
u
†
2p i=1 s i 2p Find a and b by maximizing the likelihood function L likelihood function: ∂ ln L ∂ n È Ê 1 ˆ (yi  a  bxi )2 ˘ n È 2(yi  a  bxi )(1)˘ = Â ÍlnÁ ˙ = Â Í˙= 0 ˜2 2 ∂a ∂a i=1Î Ë s i 2p ¯ 2s i 2s i ˚ ˚ i=1Î ∂ ln L ∂ n È Ê 1 ˆ (yi  a  bxi )2 ˘ n È 2(yi  a  bxi )(xi )˘ = Â ÍlnÁ ˙ = Â Í˙= 0 ˜2 2 ∂b ∂b i=1Î Ë s i 2p ¯ 2s i 2s i ˚ ˚ i=1Î
i=1 i=1 s i
L = ’ f (xi , a , b ) = ’
n
n
1

e
( y i q( x i ,a ,b )) 2 2s i2
n
=’
1

e
( y i a bx i ) 2 2s i2
two linear equations with two unknowns
K.K. Gan
L5: Maximum Likelihood Method
5
†
u
Assume all s’s are the same for simplicity: n n n Â yi  Â a  Â bxi = 0
i=1 n 2 Â yi xi  Â axi  Â bxi = 0 i=1 n i=1 n
u
†
We nownhave two equations that are linear in the two unknowns, a and b. n Â yi = na + b Â xi È n ˘ È i=1 i=1 Í Â yi ˙ Í n i=1 ˙ = Í n n n Ín Matrix form 2 n Â yi xi = a Â xi + b Â xi ÍÂ y x ˙ ÍÂ x i=1 i=1 i=1 Í i i˙ Í i Îi=1 ˚ Îi=1
2 Â yi Â xi  Â yi xi Â xi n n n n
i=1
i=1
i=1
n Â xi yi  Â yi Â xi and
n
n
n
˘ Â xi ˙ Èa ˘ i=1 ˙ Í ˙ n Í ˙ Â xi2 ˙ Îb ˚ Í ˙ ˙ i=1 ˚
n
†
n
a = i=1
n Â xi2 i=1
i=1 n
 ( Â xi )
i=1
i=1 n
i=1 2
b=
i=1 n
i=1 i=1 n 2 † n Â xi  ( Â xi )2 i=1 i=1
Taylor Eqs. 8.1012
We will see this problem again when we talk about “least squares” (“chisquare”) fitting. l EXAMPLE: u A trolley moves along a track at constant speed. Suppose the following measurements of the † time vs. distance were made. From the data find the best value for the speed (v) of the trolley. Time t (seconds)
u
1.0
2.0
3.0
4.0
5.0
6.0
Distance d (mm) 11 19 33 40 49 61 Our model of the motion of the trolley tells us that: d = d0 + vt
K.K. Gan L5: Maximum Likelihood Method 6
†
u u
We want to find v, the slope (b) of the straight line describing the motion of the trolley. We need to evaluate the sums listed in the above formula:
Â xi = Â ti = 21 s
i=1 n i=1 n i=1 n n 6 i=1 6
Â yi = Â di = 213 mm
i=1 6
Â xi yi = Â ti di = 919 s ⋅ mm = 91 s 2
n n
i=1 6 xi2 = Â ti2 Â i=1 i=1 n
n Â xi yi  Â yi Â xi
i=1 n i=1 i=1 n 2 n Â xi  ( Â xi )2 i=1 i=1
v=
=
†
6 ¥ 919  21¥ 213 = 9.9 mm / s 2 6 ¥ 91 21
best estimate of the speed
d0 = 0.8 mm
†
best estimate of the starting point
K.K. Gan
L5: Maximum Likelihood Method
7
MLM fit to the data for d = d0 + vt
70 60 50 40 30 20 10 0
u u u
_____
d = 0.8 + 9.9t
1
2
time (sec)
3
4
5
6
7
The line best represents our data. Not all the data points are "on" the line. The line minimizes the sum of squares of the deviations between the line and our data (di): n n 2 2 d = Â [data i  prediction i ] = Â [di  (d0 + vti )] Least square fit
i=1 i=1
K.K. Gan
L5: Maximum Likelihood Method
distance (mm)
0
8
†
This action might not be possible to undo. Are you sure you want to continue?