ELEG 833
Gonzalo R. Arce
Fall 2008
Gonzalo R. Arce, Department of Electrical and Computer Engineering, University of Delaware (arce@ee.udel.edu), Fall 2008
Statistical Foundations of Filtering
Filtering and parameter estimation are closely related, since information is carried in one or more parameters of a signal. In AM and FM modulation, for example, the information resides in the envelope and in the instantaneous frequency, respectively. Information can be carried in a number of signal parameters: the mean, variance, phase, or frequency.
Consider observations of a constant parameter β in additive noise:

\[
X_i = \beta + Z_i, \qquad i = 1, 2, \ldots, N.
\]

Given X_1, X_2, …, X_N, the goal is to derive a "good" estimate of β. Estimates of this kind are known as location estimates, and they are a key element in the formulation of the optimal filtering problem.
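As a minimal illustration (added here, not part of the original slides), the location model can be simulated and a location estimate computed. The noise distribution, the value of β, and the use of the sample mean are illustrative choices:

```python
import random
import statistics

def simulate_observations(beta, n, sigma=1.0, seed=0):
    """Generate X_i = beta + Z_i with i.i.d. zero-mean Gaussian noise Z_i."""
    rng = random.Random(seed)
    return [beta + rng.gauss(0.0, sigma) for _ in range(n)]

# A location estimate of beta from the noisy observations:
X = simulate_observations(beta=3.0, n=1000)
beta_hat = statistics.mean(X)   # sample mean as a location estimate
print(round(beta_hat, 2))       # close to the true beta = 3.0
```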
Properties of Estimators
Unbiased Estimators
A typical conditional density f_β̂(y|β) of an estimator is shown below. A "good" estimator will have its density function clustered about β; in particular, the mean of β̂ should be close to β.

Figure: A typical estimator density f_β̂(y|β), concentrated about the true parameter value β.
The estimate β̂_N is unbiased if its mean equals the true parameter:

\[
E\{\hat\beta_N\} = \beta.
\]
For the sample mean of N i.i.d. observations, the variance is

\[
\operatorname{var}\{\hat\beta_N\}
= \operatorname{var}\left\{ \frac{1}{N}\sum_{i=1}^{N} X_i \right\}
= \frac{\sigma^2}{N}, \tag{1}
\]

where σ² is the input variance. The precision of the estimate improves as N increases.
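The 1/N decay of the variance in (1) can be checked by Monte Carlo simulation. The sketch below (illustrative settings, not from the original slides) estimates the variance of the sample mean empirically for several sample sizes:

```python
import random
import statistics

def sample_mean_variance(n, trials=2000, sigma=1.0, seed=1):
    """Empirical variance of the sample mean of n i.i.d. N(0, sigma^2) samples."""
    rng = random.Random(seed)
    means = [statistics.fmean(rng.gauss(0.0, sigma) for _ in range(n))
             for _ in range(trials)]
    return statistics.pvariance(means)

# var{beta_hat_N} should shrink roughly like sigma^2 / N:
for n in (10, 40, 160):
    print(n, round(sample_mean_variance(n), 4))
```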
Efficient Estimators
Let f(X; β) be the density function of the observations X given the value of β. If β̂ is an unbiased estimate of β, its variance is bounded below by the Cramér-Rao bound:

\[
\operatorname{var}\{\hat\beta\}
\;\ge\;
\left( E\left\{ \left( \frac{\partial}{\partial\beta} \ln f(\mathbf{X};\beta) \right)^{2} \right\} \right)^{-1} \tag{2}
\]

provided that the partial derivative of the log-likelihood function exists and is absolutely integrable.
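As a worked check (this derivation is added here; it does not appear on the original slide), for N i.i.d. Gaussian observations the bound in (2) reproduces the variance in (1):

```latex
% Log-likelihood of N i.i.d. N(beta, sigma^2) observations:
%   ln f(X; beta) = -(N/2) ln(2 pi sigma^2) - sum_i (X_i - beta)^2 / (2 sigma^2)
\frac{\partial}{\partial\beta}\,\ln f(\mathbf{X};\beta)
  = \frac{1}{\sigma^2}\sum_{i=1}^{N}\left(X_i - \beta\right)
\quad\Longrightarrow\quad
E\!\left\{\left(\frac{\partial}{\partial\beta}\ln f\right)^{\!2}\right\}
  = \frac{N}{\sigma^2},
\qquad
\operatorname{var}\{\hat\beta\} \;\ge\; \frac{\sigma^2}{N}.
```

The sample mean attains this bound, consistent with (1).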
Maximum Likelihood Estimation
For i.i.d. zero-mean Gaussian noise with variance σ², the joint density of the observations is

\[
f(X_1, X_2, \ldots, X_N; \beta)
= \prod_{i=1}^{N} f(X_i - \beta)
= \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(X_i-\beta)^2/2\sigma^2}
= \left(\frac{1}{2\pi\sigma^2}\right)^{N/2} e^{-\sum_{i=1}^{N}(X_i-\beta)^2/2\sigma^2}. \tag{5}
\]
The ML estimate of location is the value β̂ that minimizes the least-squares sum

\[
\hat\beta_{\text{ML}} = \arg\min_{\beta} \sum_{i=1}^{N} (X_i - \beta)^2. \tag{6}
\]

The value that minimizes this sum is the sample mean

\[
\hat\beta_{\text{ML}} = \frac{1}{N} \sum_{i=1}^{N} X_i. \tag{7}
\]

Note that the sample mean is unbiased, since E{β̂_ML} = (1/N) Σ E{X_i} = β. As an ML estimate it is also efficient: its variance, given in (1), attains the Cramér-Rao bound.
For the more general case of i.i.d. noise with a generalized Gaussian density, f(x) = C e^{−|x|^γ/σ^γ}, the joint density of the observations becomes

\[
f(X_1, \ldots, X_N; \beta)
= C^{N}\, e^{-\sum_{i=1}^{N} |X_i - \beta|^{\gamma}/\sigma^{\gamma}}, \tag{8}
\]

where γ > 0 controls the tail behavior of the distribution.
The corresponding ML estimate of location minimizes the sum of γ-th powers of the absolute deviations:

\[
\tilde\beta_{\text{ML}} = \arg\min_{\beta} \sum_{i=1}^{N} |X_i - \beta|^{\gamma}. \tag{9}
\]

Figure: The cost function Σ|X_i − β|^γ for γ = 2, γ = 1, and γ = 0.5, plotted over a set of samples X_1, …, X_5.
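The effect of γ can be checked numerically. The sketch below (an illustration with an arbitrary sample set, not from the original slides) scans a grid of β values and reports the minimizer of Σ|X_i − β|^γ for each γ; γ = 2 recovers the sample mean and γ = 1 the sample median:

```python
def glm_cost(X, beta, gamma):
    """Generalized ML cost: sum of gamma-th powers of absolute deviations."""
    return sum(abs(x - beta) ** gamma for x in X)

def arg_min_on_grid(X, gamma, steps=4001):
    """Brute-force minimizer of the cost over a fine grid spanning the samples."""
    lo, hi = min(X), max(X)
    grid = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    return min(grid, key=lambda b: glm_cost(X, b, gamma))

X = [-1.0, 0.0, 0.4, 0.9, 8.0]   # the sample 8.0 acts as an outlier
for gamma in (2.0, 1.0, 0.5):
    print(gamma, round(arg_min_on_grid(X, gamma), 2))
```

Note how the minimizer moves from the outlier-sensitive mean (1.66) toward the central samples as γ decreases.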
When the dispersion parameter is γ = 1, the model is Laplacian and the optimal estimator minimizes

\[
\tilde\beta_{\text{ML}} = \arg\min_{\beta} \sum_{i=1}^{N} |X_i - \beta|. \tag{10}
\]

Denote by X_(1) ≤ X_(2) ≤ ⋯ ≤ X_(N) the order statistics of the sample. For β ≤ X_(1), the cost in (10) reduces to

\[
L_1(\beta) = \sum_{i=1}^{N} \left( X_{(i)} - \beta \right) = \sum_{i=1}^{N} X_{(i)} - N\beta. \tag{11}
\]
For values of β in the range X_(j) < β ≤ X_(j+1), L_1(β) can be written as

\[
L_1(\beta)
= \sum_{i=1}^{j} \left( \beta - X_{(i)} \right) + \sum_{i=j+1}^{N} \left( X_{(i)} - \beta \right)
= \sum_{i=j+1}^{N} X_{(i)} - \sum_{i=1}^{j} X_{(i)} - (N - 2j)\,\beta, \tag{12}
\]

for j = 1, 2, …, N − 1.
Similarly, for X_(N) < β < ∞,

\[
L_1(\beta) = -\sum_{i=1}^{N} X_{(i)} + N\beta. \tag{13}
\]
Letting X_(0) = −∞ and X_(N+1) = ∞, and defining Σ_{i=m}^{n} X_(i) = 0 if m > n, we can combine (11)-(13) into the following compactly written cost function:

\[
L_1(\beta)
= \sum_{i=j+1}^{N} X_{(i)} - \sum_{i=1}^{j} X_{(i)} - (N - 2j)\,\beta,
\qquad j = 0, 1, \ldots, N, \tag{14}
\]

valid for X_(j) < β ≤ X_(j+1).
From (14), L_1(β) is piecewise linear in β, with slope −(N − 2j) over the interval (X_(j), X_(j+1)]. For N odd there is an integer k such that the slopes over the intervals (X_(k−1), X_(k)] and (X_(k), X_(k+1)] are negative and positive, respectively. From (14), these two conditions are satisfied if both

\[
k < \frac{N}{2} + 1 \qquad \text{and} \qquad k > \frac{N}{2}
\]

hold. Both constraints are met when k = (N+1)/2, so the minimizer is the sample median:

\[
\hat\beta_{\text{ML}}
= \arg\min_{\beta} \sum_{i=1}^{N} |X_i - \beta|
= \begin{cases}
X_{\left(\frac{N+1}{2}\right)} & N \text{ odd} \\[4pt]
\left[\, X_{\left(\frac{N}{2}\right)},\; X_{\left(\frac{N}{2}+1\right)} \,\right] & N \text{ even}
\end{cases}
= \operatorname{MEDIAN}(X_1, X_2, \ldots, X_N), \tag{15}
\]

where for N even any value in the indicated interval minimizes the cost.
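The result in (15) can be verified numerically. The sketch below (illustrative sample values, not from the original slides) compares a brute-force minimizer of the L1 cost against the sample median:

```python
import statistics

def l1_cost(X, beta):
    """Sum of absolute deviations, as in equation (10)."""
    return sum(abs(x - beta) for x in X)

X = [3.1, -0.4, 1.2, 7.5, 0.8]   # N = 5 (odd), so the minimizer is unique
median = statistics.median(X)

# Scan a fine grid of beta values: no point should beat the median.
lo, hi = min(X), max(X)
grid = [lo + (hi - lo) * i / 5000 for i in range(5001)]
best = min(grid, key=lambda b: l1_cost(X, b))
print(median, round(best, 3))
```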
Given k > 0, the ML estimate of location for Cauchy-distributed noise with dispersion k is known as the sample myriad and is given by

\[
\hat\beta_k
= \arg\min_{\beta} \prod_{i=1}^{N} \left[ k^2 + (X_i - \beta)^2 \right]
= \operatorname{MYRIAD}\{k;\, X_1, X_2, \ldots, X_N\}. \tag{19}
\]
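A simple way to compute the sample myriad is a grid search over the cost in (19); the sketch below (an illustration with arbitrary sample values, not the original course's implementation) uses the equivalent log-product cost for numerical stability. Note how large k drives the myriad toward the sample mean, while small k rejects the outlier:

```python
import math

def myriad(X, k, steps=20001):
    """Sample myriad: grid-search minimizer of sum(log(k^2 + (x - beta)^2))."""
    lo, hi = min(X), max(X)
    grid = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    cost = lambda b: sum(math.log(k * k + (x - b) ** 2) for x in X)
    return min(grid, key=cost)

X = [0.1, -0.2, 0.3, 0.05, 10.0]    # 10.0 is an impulsive outlier
print(round(myriad(X, k=0.2), 2))   # small k: estimate stays near the cluster
print(round(myriad(X, k=50.0), 2))  # large k: approaches the sample mean 2.05
```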
The sample myriad involves the free parameter k, referred to as the linearity parameter. The behavior of the myriad depends markedly on the value of k.

Figure: Myriad cost functions for k = 20, k = 2, and k = 0.2, plotted over a set of samples X_1, …, X_5.
Since the logarithm is a strictly monotonic function, the sample myriad also minimizes log G_k(β), where G_k(β) denotes the product in (19):

\[
\operatorname{MYRIAD}\{k;\, X_1, \ldots, X_N\}
= \arg\min_{\beta} \sum_{i=1}^{N} \log\left[ k^2 + (X_i - \beta)^2 \right]. \tag{20}
\]
If an observation in the set of input samples has a "large" magnitude, such that |X_i − β| ≫ k, the cost associated with this sample is approximately log(X_i − β)², the log of the squared deviation. The sample myriad thus (approximately) minimizes the sum of logarithmic square deviations, referred to as the LLS criterion.

Figure: Cost functions ρ(X − β) of the mean (LS), the median (LAD), and the myriad (LLS) for k = 1 and k = 0.01, plotted against |X − β|.
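The differing influence of a large deviation under the three criteria can be tabulated directly (a sketch added here; the k value matches the figure). The LS cost grows quadratically, LAD linearly, and LLS only logarithmically, which is the source of the myriad's robustness:

```python
import math

# Per-sample costs rho(d) for a deviation d = X - beta under each criterion:
ls  = lambda d: d * d                            # mean   (least squares)
lad = lambda d: abs(d)                           # median (least absolute deviation)
lls = lambda d, k=1.0: math.log(k * k + d * d)   # myriad (LLS)

for d in (0.5, 1.0, 5.0, 50.0):
    print(f"d={d:5}: LS={ls(d):9.2f}  LAD={lad(d):6.2f}  LLS={lls(d):6.2f}")
```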
Geometrical Interpretation

The observations X_1, X_2, …, X_N are placed along the real line, and a vertical bar of length k is erected perpendicular to the line at the point β. Each of the terms

\[
\sqrt{k^2 + (X_i - \beta)^2} \tag{21}
\]

represents the distance from the point A, at the top of the vertical bar, to the sample point X_i; the product in (19) is the product of the squares of these distances.
The sample myriad, β̂_k, indicates the position of the bar for which the product of distances from point A to the samples X_1, X_2, …, X_N is minimum. Any other value, such as x = β′, produces a larger product of distances.

Figure: (a) The sample myriad, β̂, minimizes the product of distances from point A to all samples; any other point β′ yields a larger product. (b) The behavior of the myriad as k is reduced.
Robust Estimation
The maximum likelihood estimates above assume that the form of the distribution is known. In practice we can seldom be certain of such distributional assumptions, and two types of questions arise:

1. How sensitive are optimal estimators to the precise nature of the assumed probability model?
2. Is it possible to construct robust estimators which perform well under deviations from the assumed model?
Sensitivity of Estimators
In the first set of experiments, a sample set of size N = 10 containing one outlier is considered: nine i.i.d. samples are distributed as N(µ, 1) and the outlier is distributed as N(µ + λ, 1).

Table: Bias of estimators of µ for N = 10 when a single observation is from N(µ + λ, 1) and the others from N(µ, 1); one column per value of λ ∈ {0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, ∞}.
Table: Mean squared error of various estimators of µ for N = 10, when a single observation is from N(µ + λ, 1) and the others from N(µ, 1); one column per value of λ ∈ {0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, ∞}.
Table: Variance of estimators of µ for N = 10, where a single observation is from N(µ, σ²) and the others from N(µ, 1); one column per value of σ ∈ {0.5, 1.0, 2.0, 3.0, 4.0, ∞}.

The mean outperforms the median as long as the variance of the outlier is not large. The trimmed mean, however, outperforms the median regardless of the variance of the outlier.
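The experiment can be reproduced in outline. The sketch below uses simplified settings (N = 10, one shifted-mean outlier, Monte Carlo averaging of the squared error); the exact table values depend on the original simulation setup, so only the qualitative behavior is asserted here:

```python
import random
import statistics

def trimmed_mean(X, trim=1):
    """Mean after discarding the `trim` smallest and largest samples."""
    s = sorted(X)
    return statistics.fmean(s[trim:len(s) - trim])

def mse(estimator, lam, mu=0.0, n=10, trials=4000, seed=7):
    """MSE of an estimator of mu when one of n samples is from N(mu + lam, 1)."""
    rng = random.Random(seed)
    err = 0.0
    for _ in range(trials):
        X = [rng.gauss(mu, 1.0) for _ in range(n - 1)] + [rng.gauss(mu + lam, 1.0)]
        err += (estimator(X) - mu) ** 2
    return err / trials

for lam in (0.0, 2.0, 4.0):
    print(lam,
          round(mse(statistics.fmean, lam), 3),   # mean: degrades as lam grows
          round(mse(statistics.median, lam), 3),  # median: nearly unaffected
          round(mse(trimmed_mean, lam), 3))       # trimmed mean
```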