
Estimation Theory


• We seek to determine, from a set of data, a set of parameters whose values yield the highest probability of obtaining the observed data.
• The unknown parameters may be viewed as deterministic quantities or as random variables.
• There are essentially two alternatives in the statistical case:
  • When no a priori distribution is assumed: Maximum Likelihood.
  • When the a priori distribution is known: Bayes.
Professor A G Constantinides©
Maximum Likelihood
• Principle: Estimate a parameter such that, for this value, the probability of obtaining the actually observed sample is as large as possible.
• That is, having obtained the observation, we "look back" and compute the probability that the given sample would be observed, as if the experiment were to be done again.
• This probability depends on a parameter, which is adjusted to give it the maximum possible value.
• Reminds you of politicians observing the movement of the crowd and then moving to the front to lead them?

Estimation Theory
• Let a random variable X have a probability distribution dependent on a parameter θ.
• The parameter θ lies in a space Θ of all possible parameters.
• Let $f_X(x \mid \theta)$ be the probability density function of X.
• Assume that the mathematical form of $f_X$ is known, but not θ.

Estimation Theory
• The joint pdf of m sample random variables, evaluated at the sample points $x_1, x_2, \ldots, x_m$, is given by
  $$ l(\theta, x_1, x_2, \ldots, x_m) = l(\theta, \mathbf{x}) = \prod_{i=1}^{m} f_X(x_i \mid \theta) $$
• The above is known as the likelihood of the sampled observation.

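The likelihood product above can be sketched in a few lines of Python. As an illustration (not from the slides) we assume a Bernoulli model, where $f_X(x_i \mid \theta)$ is θ for $x_i = 1$ and $1-\theta$ for $x_i = 0$, and a made-up sample:

```python
def likelihood(theta, xs):
    """l(theta, x): product of f_X(x_i | theta) over the sample,
    for an assumed Bernoulli(theta) model."""
    l = 1.0
    for x in xs:
        l *= theta if x == 1 else (1.0 - theta)
    return l

xs = [1, 0, 1, 1, 0, 1]        # illustrative observed sample: 4 ones, 2 zeros
print(likelihood(0.5, xs))     # 0.015625
print(likelihood(4/6, xs))     # larger: the sample proportion maximises l
```

Note how the likelihood is a function of θ with the sample held fixed, which is exactly the viewpoint the next slide takes.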
Estimation Theory
• The likelihood function is a function of the unknown parameter θ for a fixed set of observations.
• The Maximum Likelihood Principle requires us to select the value of θ that maximises the likelihood function.
• The parameter θ may also be regarded as a vector of parameters:
  $$ \boldsymbol{\theta} = [\theta_1\ \theta_2\ \ldots\ \theta_k] $$
Estimation Theory
• It is often more convenient to use the log-likelihood
  $$ \ell(\boldsymbol{\theta}, \mathbf{x}) = \log f_X(\mathbf{x} \mid \boldsymbol{\theta}) $$
• The maximum is then at
  $$ \frac{\partial \ell(\boldsymbol{\theta}, \mathbf{x})}{\partial \boldsymbol{\theta}} = 0 $$

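Locating the maximiser of the log-likelihood can be sketched numerically. Here is a simple grid search over θ, again for an assumed Bernoulli model and an illustrative sample (neither comes from the slides); the grid maximum lands where the derivative condition above is satisfied:

```python
import math

xs = [1, 0, 1, 1, 0, 1]        # illustrative sample: 4 ones, 2 zeros

def log_likelihood(theta, xs):
    """l(theta, x) for an assumed Bernoulli model, in log form:
    sums of logs replace the product over the sample."""
    return sum(math.log(theta if x == 1 else 1.0 - theta) for x in xs)

# Grid search over the open interval (0, 1); the true maximiser is the
# sample proportion 4/6, where d(log l)/d(theta) = 0.
grid = [i / 1000 for i in range(1, 1000)]
theta_hat = max(grid, key=lambda t: log_likelihood(t, xs))
print(theta_hat)               # approximately 2/3
```

Taking logs turns the product into a sum, which is both easier to differentiate and numerically safer than multiplying many small densities.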
An example
• Let $\mathbf{x} = x_1, x_2, \ldots, x_m$ be a random sample selected from a normal distribution $N(\mu, \sigma^2)$.
• The joint pdf is
  $$ f_X(\mathbf{x} \mid \mu, \sigma) = \prod_{i=1}^{m} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{(x_i - \mu)^2}{2\sigma^2} \right) $$
• We wish to find the best μ and σ².

Estimation Theory
• Form the log-likelihood function
  $$ \ell(\boldsymbol{\theta}, \mathbf{x}) = -m \log\sqrt{2\pi} - m \log\sigma - \frac{1}{2\sigma^2} \sum_{i=1}^{m} (x_i - \mu)^2 $$
• Hence
  $$ \frac{\partial \ell(\boldsymbol{\theta}, \mathbf{x})}{\partial \mu} = \frac{1}{\sigma^2} \sum_{i=1}^{m} (x_i - \mu) = 0 $$
  $$ \frac{\partial \ell(\boldsymbol{\theta}, \mathbf{x})}{\partial \sigma} = -\frac{m}{\sigma} + \frac{1}{\sigma^3} \sum_{i=1}^{m} (x_i - \mu)^2 = 0 $$
• or
  $$ \hat{\mu}_{ML} = \frac{1}{m} \sum_{i=1}^{m} x_i, \qquad \hat{\sigma}^2_{ML} = \frac{1}{m} \sum_{i=1}^{m} (x_i - \hat{\mu}_{ML})^2 $$
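The closed-form normal ML estimates (sample mean, and the 1/m-normalised variance) can be computed directly. A minimal sketch; the sample values are illustrative assumptions:

```python
data = [4.8, 5.1, 5.0, 4.9, 5.3, 4.7]   # illustrative sample, assumed values
m = len(data)

# mu_ML: the sample mean
mu_ml = sum(data) / m

# sigma^2_ML: sum of squared deviations about mu_ML, normalised by 1/m
# (note: 1/m, not the unbiased 1/(m-1) normalisation)
sigma2_ml = sum((x - mu_ml) ** 2 for x in data) / m

print(mu_ml, sigma2_ml)
```

The 1/m normalisation makes $\hat{\sigma}^2_{ML}$ slightly biased downwards for finite m, which is one reason the 1/(m−1) "sample variance" is often preferred in practice.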
Fisher and Cramer-Rao
• The Fisher Information helps in placing a bound on estimators:
  $$ J(\boldsymbol{\theta}) = E\left\{ \left( \frac{\partial}{\partial \boldsymbol{\theta}} \ell(\boldsymbol{\theta}, \mathbf{x}) \right) \left( \frac{\partial}{\partial \boldsymbol{\theta}} \ell(\boldsymbol{\theta}, \mathbf{x}) \right)^T \right\} $$
• Cramer-Rao Lower Bound: "If $t(\mathbf{X})$ is any unbiased estimator of θ, then
  $$ E\{ [t(\mathbf{X}) - \boldsymbol{\theta}][t(\mathbf{X}) - \boldsymbol{\theta}]^T \} \geq J(\boldsymbol{\theta})^{-1} $$
• I.e. $J(\boldsymbol{\theta})^{-1}$ provides a lower bound on the covariance matrix of any unbiased estimator.
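A Monte Carlo sketch of the bound: for estimating the mean of $N(\mu, \sigma^2)$ with σ known, the Fisher information is $J(\mu) = m/\sigma^2$, and the sample mean attains the bound $\sigma^2/m$ exactly. All parameter values below are illustrative assumptions:

```python
import random

random.seed(0)
mu, sigma, m = 2.0, 1.5, 20      # assumed true parameters and sample size
J = m / sigma ** 2               # Fisher information for mu (sigma known)
crlb = 1.0 / J                   # Cramer-Rao bound = sigma^2 / m

# Repeat the experiment many times, each time estimating mu by the
# sample mean (the ML estimator), and measure the estimator's variance.
estimates = []
for _ in range(20000):
    sample = [random.gauss(mu, sigma) for _ in range(m)]
    estimates.append(sum(sample) / m)

mean_est = sum(estimates) / len(estimates)
var_est = sum((e - mean_est) ** 2 for e in estimates) / len(estimates)
print(var_est, crlb)             # var_est is close to crlb
```

An estimator whose covariance meets the bound with equality, like the sample mean here, is called efficient.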
Estimation Theory
• It can be seen that if we model the observations as the output of an AR process driven by zero-mean Gaussian noise, then the Maximum Likelihood estimator is also the Least Squares estimator.

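The ML/least-squares equivalence can be sketched for an assumed AR(1) model $x[n] = a\,x[n-1] + w[n]$ with zero-mean Gaussian $w[n]$: minimising the sum of squared prediction errors gives the same coefficient as (conditional) ML. The coefficient value and record length below are illustrative assumptions:

```python
import random

random.seed(1)
a_true, m = 0.8, 5000            # assumed AR(1) coefficient and record length

# Generate an AR(1) record driven by zero-mean unit-variance Gaussian noise.
x = [0.0]
for _ in range(m):
    x.append(a_true * x[-1] + random.gauss(0.0, 1.0))

# Least-squares estimate: minimise sum of (x[n] - a*x[n-1])^2 over a,
# giving a_hat = sum(x[n]*x[n-1]) / sum(x[n-1]^2).
num = sum(x[n] * x[n - 1] for n in range(1, len(x)))
den = sum(x[n - 1] ** 2 for n in range(1, len(x)))
a_hat = num / den
print(a_hat)                     # close to a_true
```

With Gaussian driving noise, the conditional log-likelihood of the record is, up to constants, minus the sum of squared prediction errors, so maximising it is exactly the least-squares minimisation performed above.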
