
Estimation and Detection


Lecture 1: Introduction and Minimum Variance Unbiased Estimators

Dr. ir. Richard C. Hendriks & Dr. Sundeep P. Chepuri – 15/11/2016



Introduction – Why Estimation & Detection


Many signal processing applications involve estimation and/or detection tasks, e.g.:

• Hearing aid: Estimation of target speech in a noisy environment.

• Localisation of sensors/sources in a wireless sensor network.

• Estimation/detection of biomedical signals.

• Radar: Detection of a target.

• Etc.


Example: Estimation

• Sensitivity
• Dynamic range

[Figure: hearing-aid processing chain: gain and compression → noise reduction → gain and compression]

• Due to the gain, inaudible noise/disturbances can suddenly become audible.
• Hearing-impaired people have worse spectral and temporal resolution, and therefore more difficulty understanding a target under noisy conditions.

Example: Estimation
A simplified model (N microphones):

  yi = s + ni ,  i = 1, …, N

[Figure: microphone signals s + ni → noise reduction → gain and compression → ŝ]

Candidate estimators:

• ŝ1 = y1
• ŝ2 = (1/N) Σ_{i=1}^{N} yi

How to determine an estimate ŝ?
• How good are these estimators?
• Are there better estimators?
• What about ŝ3 = ( Σ_{i=1}^{N} yi/σ_{ni}² ) / ( Σ_{i=1}^{N} 1/σ_{ni}² ) ?


Example: Estimation

Model: yi = s + ni, with the estimate ŝ produced after noise reduction and gain/compression. Let's assume:

• s is deterministic.
• The noise ni at every microphone is a zero-mean iid random process.

What is Var[ŝ] for each candidate?

• Var[ŝ1] = σ_{n1}². If indeed iid ⇒ Var[ŝ1] = σ_n².
• Var[ŝ2] = Var[ (1/N) Σ_{i=1}^{N} yi ] = (1/N²) Σ_{i=1}^{N} σ_{ni}². If indeed iid ⇒ Var[ŝ2] = σ_n²/N.
• Var[ŝ3] = Var[ ( Σ_{i=1}^{N} yi/σ_{ni}² ) / ( Σ_{i=1}^{N} 1/σ_{ni}² ) ] = 1 / ( Σ_{i=1}^{N} 1/σ_{ni}² ). If indeed iid ⇒ Var[ŝ3] = σ_n²/N.

Example: Estimation

What if the noise is not iid, but only independent? Take N = 2 with

  σ_{n1}² = σ_n²  and  σ_{n2}² = 2σ_n²

Var[ŝ]?

• Var[ŝ1] = σ_n²
• Var[ŝ2] = Var[ (1/2)(y1 + y2) ] = (3/4) σ_n²
• Var[ŝ3] = Var[ ( Σ_{i=1}^{2} yi/σ_{ni}² ) / ( Σ_{i=1}^{2} 1/σ_{ni}² ) ] = (2/3) σ_n²

Questions:

• How to derive good (optimal) estimators?
• How to quantify their performance?
• What is the best estimator for a certain estimation task?
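These variances are easy to verify numerically. A minimal Monte Carlo sketch (assuming Gaussian noise and the two-microphone case above; variable names are ours):

import numpy as np

rng = np.random.default_rng(0)
s, sigma2 = 1.0, 1.0                     # deterministic target, base noise power sigma_n^2
var_n = np.array([sigma2, 2 * sigma2])   # sigma_{n1}^2 = sigma_n^2, sigma_{n2}^2 = 2 sigma_n^2
trials = 200_000

y = s + rng.normal(0.0, np.sqrt(var_n), size=(trials, 2))  # y_i = s + n_i, independent noise

s1 = y[:, 0]                             # s_hat_1 = y_1
s2 = y.mean(axis=1)                      # s_hat_2 = (1/N) sum_i y_i
w = 1.0 / var_n
s3 = (y * w).sum(axis=1) / w.sum()       # s_hat_3 = noise-weighted average

# Empirical variances should approach sigma_n^2, (3/4) sigma_n^2 and (2/3) sigma_n^2.
for name, est in [("s_hat_1", s1), ("s_hat_2", s2), ("s_hat_3", s3)]:
    print(name, est.var())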


Example: Detection

Characterization of abnormal heart rhythms (atrial fibrillation).
Common treatment: ablation. Only successful in 50% of cases...

Research project on atrial fibrillation:

• Measure signals directly on the heart using a high-dimensional sensor (188).
• Gain an understanding of the real problem in atrial fibrillation.
• Develop less invasive methods.

Example: Detection

[Figure: response at 5 different positions of the sensor, without atrial fibrillation; time samples, f_s = 10 kHz]

• How to detect the moment of atrial activity in each signal?


Example: Detection

[Figure: response at 5 different positions of the sensor, WITH atrial fibrillation; time samples, f_s = 10 kHz]

• How to detect the moment of atrial activity in each signal?

Example: Detection

[Figure: derivative of the response at 5 different positions of the sensor, WITH atrial fibrillation; time samples, f_s = 10 kHz; threshold γ indicated]

• How to detect the moment of atrial activity in each signal?
• Check whether y[n] > γ?
• How to choose the threshold γ?
• What is the optimal detector?
• How to evaluate the performance of the detector?
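As a toy illustration of this thresholding idea (not the actual atrial data: the signal, noise level, and threshold rule below are made up):

import numpy as np

fs = 10_000                                   # sampling rate, 10 kHz as in the figures
rng = np.random.default_rng(1)
x = 0.1 * rng.normal(size=fs)                 # one second of hypothetical background noise
x[2500] += 5.0                                # hypothetical sharp deflection (atrial activity)

y = np.diff(x) * fs                           # discrete derivative emphasises steep slopes
gamma = 6.0 * np.median(np.abs(y)) / 0.6745   # hypothetical robust choice of the threshold
detections = np.flatnonzero(np.abs(y) > gamma)
print(detections / fs)                        # detected activity times in seconds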


This Course
Theory on estimation and detection:
• Generally, the (optimal) solution to estimation and detection problems depends on the underlying signal model.
• Data is often random: random noise, and sometimes a random target as well. Models are therefore often based on statistical descriptions.
Key Questions:
• How to determine optimal estimators?
• What is the estimation bound?

• How to determine the optimal detector?


• How to express the detection performance?


This Course – Estimation Theory

Typical problem formulation for estimation theory:

Consider the estimation of θ from observations y[n] = θ + w[n].

While the noise is typically assumed to be stochastic, the target θ can be:

• an unknown deterministic quantity ⇒ classical estimation (lectures 1-5);
• an (unknown) random quantity ⇒ Bayesian estimation (lectures 6-8).

6
14/11/16

This Course – Detection Theory

Typical binary detection problem: determine whether a certain signal that is embedded in noise is present or not.

  H0 : x[n] = w[n]
  H1 : x[n] = s[n] + w[n]

Typical detector: decide H1 if T(x[n]) > γ.

• How to make an optimal decision? (That is, how to choose γ?)
• Detection for deterministic signals (lecture 9).
• Detection for random signals (lecture 10).
• Detection with unknown parameters (lecture 11).
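As a preview of lecture 9, here is a sketch of the classical correlator detector for a known deterministic signal in white Gaussian noise, with the threshold set from a target false-alarm rate (the signal, noise power, and false-alarm rate below are made-up examples):

from statistics import NormalDist
import numpy as np

rng = np.random.default_rng(2)
s = np.sin(2 * np.pi * 0.05 * np.arange(50))   # hypothetical known signal s[n]
sigma2, p_fa = 1.0, 1e-2                       # noise power and target false-alarm probability

# Under H0, T(x) = sum_n x[n] s[n] ~ N(0, sigma2 * sum_n s[n]^2),
# so the threshold gamma follows from the inverse Gaussian CDF.
energy = float(np.sum(s**2))
gamma = np.sqrt(sigma2 * energy) * NormalDist().inv_cdf(1 - p_fa)

x = s + rng.normal(0.0, np.sqrt(sigma2), s.size)  # one realisation under H1
T = float(np.sum(x * s))
print(T > gamma)                                  # decide H1 when the statistic exceeds gamma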

Course Information - 1

• Course website: http://ens.ewi.tudelft.nl/Education/courses/et4386/index.php

• Blackboard

• Exam:

  – Written
  – Closed book, with one 2-sided HANDWRITTEN A4 formula sheet
  – The mini-project counts for 20% of the final grade


Course Information - 2
• Books:

  – Fundamentals of Statistical Signal Processing, Volume I: Estimation Theory; S.M. Kay, Prentice Hall, 1993; ISBN-13: 978-0133457117.

  – Fundamentals of Statistical Signal Processing, Volume II: Detection Theory; S.M. Kay, Prentice Hall, 1998; ISBN-13: 978-0135041352.

• Instructors:

– Richard C. Hendriks

– Sundeep Chepuri

• Contact: et4386-EWI@TuDelft.nl

Course Information – Mini Projects


• The grade for the project counts for 20% of the final grade.

• The project can be carried out in pairs of students.

• There will be several projects to choose from. Let us know your ranked preference before November 25th (to the course e-mail address).

• Final report deadline: before January 9th!

• Access to the final exam will only be granted if the report is handed in!

• Project presentations take place during the last two lectures.


Estimation Theory – Lecture 1

Minimum Variance Unbiased (MVU) Estimation

Introductory Example

Given a process (DC level in noise):

  x[n] = A + w[n] ,  n = 0, …, N−1 ,  with w[n] zero-mean noise

How can we estimate A?

1. Â1 = ...?
2. Â2 = ...?
3. ...

What is the signal model?

• w[n] is a zero-mean, uncorrelated random process.
• A: let's for now assume A is deterministic.


Introductory Example

Given a process (DC level in noise):

  x[n] = A + w[n] ,  n = 0, …, N−1 ,  with w[n] zero-mean noise

How can we estimate A? Consider the case A = 1. Some candidates:

1. Â1 = x[0]
2. Â2 = (1/N) Σ_{n=0}^{N−1} x[n]

Which estimate is better? For one realisation:

• Â1 = x[0] = 0.93
• Â2 = (1/N) Σ_{n=0}^{N−1} x[n] = 0.9

Can the performance of an estimator be based on a single realisation?

No! The data is random. As a result, the estimator is a random variable too!

Introductory Example

Use statistical descriptors: the expected value and variance of the estimator.

• E(Â1) = A
  E(Â2) = E( (1/N) Σ_{n=0}^{N−1} x[n] ) = (1/N) Σ_{n=0}^{N−1} E(x[n]) = A

  Both estimators are unbiased.

• Look at the variance; define σ² = var(w[n]):

  var(Â1) = σ²
  var(Â2) = var( (1/N) Σ_{n=0}^{N−1} x[n] ) = (1/N²) Σ_{n=0}^{N−1} var(x[n]) = σ²/N

Â2 has the smaller variance. Also, var(Â2) → 0 as N → ∞: the estimator is consistent.

Conclusion: Estimators are random variables.

Question: What are good (or optimal) estimators?
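Both properties are easy to check empirically. A small sketch (A, σ, and the sample sizes are arbitrary choices):

import numpy as np

A, sigma = 1.0, 1.0
rng = np.random.default_rng(3)

for N in (1, 10, 100, 1000):
    x = A + sigma * rng.normal(size=(5_000, N))  # 5000 realisations of x[0..N-1]
    A2 = x.mean(axis=1)                          # sample-mean estimator A_hat_2
    # The mean stays at A (unbiased) while the variance shrinks like sigma^2 / N (consistent).
    print(N, A2.mean(), A2.var(), sigma**2 / N)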


Unbiased Estimators

Let θ̂ = g(x), with x = [x[0], x[1], ..., x[N−1]]ᵀ.

An estimator is unbiased if

  E[θ̂] = ∫ g(x) p(x; θ) dx = θ.

Is being unbiased sufficient? No; recall the example:

1. Â1 = x[0]
2. Â2 = (1/N) Σ_{n=0}^{N−1} x[n]

Both are unbiased, yet Â2 has a much smaller variance.

Minimum Variance Criterion

A natural criterion is the Mean Square Error (MSE):

  mse(θ̂) = E[ (θ̂ − θ)² ] = E[ ( (θ̂ − E(θ̂)) + (E(θ̂) − θ) )² ]
          = E[ (θ̂ − E(θ̂))² ] + (E(θ̂) − θ)² = var(θ̂) + (E(θ̂) − θ)²

i.e., the variance plus the squared bias. The MSE consists of errors due to

• the bias of the estimator,
• the variance of the estimator.


Minimum Variance Criterion

  mse(θ̂) = E[ (θ̂ − E(θ̂))² ] + (E(θ̂) − θ)² = var(θ̂) + (E(θ̂) − θ)²

Consider the (biased) estimator Â3 = (a/N) Σ_{n=0}^{N−1} x[n], with constant a.

• E[Â3] = aA
• var[Â3] = a²σ²/N
• MSE(Â3) = a²σ²/N + (a − 1)²A²

Finding the optimal estimator requires finding the optimal a:

  a_opt = A² / (A² + σ²/N)

The optimal estimator: Â3 = A²/(A² + σ²/N) · (1/N) Σ_{n=0}^{N−1} x[n].

⇒ This estimator depends on A and is therefore unrealisable!!
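A quick numerical check of this result (a sketch; A, σ², and N are arbitrary):

import numpy as np

A, sigma2, N = 1.0, 1.0, 10
a = np.linspace(0.0, 1.5, 1501)
mse = a**2 * sigma2 / N + (a - 1) ** 2 * A**2  # variance plus squared bias

a_opt = A**2 / (A**2 + sigma2 / N)
print(a[np.argmin(mse)], a_opt)                # both ~0.909: the grid minimiser matches a_opt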

Minimum Variance Unbiased Estimator

The optimal estimator depends on A (and is thus unrealisable) because the bias term in the MSE depends on A:

  MSE(Â3) = a²σ²/N + (a − 1)²A² ,

where the second term is the squared bias.

We therefore abandon the MSE criterion and constrain the bias to be zero. Hence, if E(θ̂) = θ, then

  mse(θ̂) = E[ (θ̂ − E(θ̂))² ] + (E(θ̂) − θ)² = E[ (θ̂ − E(θ̂))² ] = var(θ̂).

The unbiased estimator that minimises this variance is called the minimum variance unbiased (MVU) estimator.


Minimum Variance Unbiased Estimator

Remark: The MVU estimator does not always exist and is generally difficult to find.

[Figure: two plots of var(θ̂) versus θ for three unbiased estimators θ̂1, θ̂2, θ̂3.]

If θ̂1, θ̂2 and θ̂3 are all unbiased estimators, then in the left plot θ̂3 has the lowest variance for every θ and is the MVU; in the right plot no estimator is uniformly best, so the MVU does not exist!

MVU: Example

Consider two independent random processes x and y:

  x ∼ N(θ, 1) ,   y ∼ N(θ, 1) for θ ≥ 0 and y ∼ N(θ, 2) for θ < 0.

Consider two possible estimators for θ:

  θ̂1 = (1/2)(x + y) ,   θ̂2 = (2/3)x + (1/3)y.

Both estimators are unbiased. Look at the variances:

  var(θ̂1) = (1/4)(var(x) + var(y)) = 18/36 for θ ≥ 0 , 27/36 for θ < 0
  var(θ̂2) = (4/9)var(x) + (1/9)var(y) = 20/36 for θ ≥ 0 , 24/36 for θ < 0

Between these two estimators, there is no MVU estimator: θ̂1 is better for θ ≥ 0, θ̂2 for θ < 0.
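The conflict is visible when sweeping over all unbiased linear combinations θ̂ = αx + (1−α)y (a sketch; the two branches correspond to the two signs of θ):

# var(theta_hat) = alpha^2 var(x) + (1 - alpha)^2 var(y), minimised at
# alpha = var(y) / (var(x) + var(y)), which differs between the two branches.
var_x = 1.0
for label, var_y in (("theta >= 0", 1.0), ("theta < 0", 2.0)):
    alpha = var_y / (var_x + var_y)
    var_min = alpha**2 * var_x + (1 - alpha) ** 2 * var_y
    print(label, "best alpha:", alpha, "minimum variance:", 36 * var_min, "/36")

The optimal weights differ per branch (α = 1/2 versus α = 2/3), so no single linear unbiased estimator is uniformly best.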


Finding the MVU

Even if the MVU estimator exists, there is no standard "recipe" to find it. Possible approaches:

1. Determine the Cramér-Rao bound (next week, Ch. 3).

2. Apply the Rao-Blackwell-Lehmann-Scheffé theorem (will not be discussed).

3. Restrict estimators to be both unbiased AND linear (Ch. 6).

Towards the CRB (Ch. 3): Estimator accuracy

Goal: Estimate A from a single observation

  x[0] = A + w[0] ,  with w[0] ∼ N(0, σ²).

The pdf of x[0] is given by:

  p(x[0]; A) = (1/√(2πσ²)) exp( −(x[0] − A)²/(2σ²) )

Question: What determines how accurately we can estimate A?

The CRB is a lower bound on the variance var[Â] of an unbiased estimator.


Towards the CRB (Ch. 3): Estimator accuracy

Likelihood function: p(x[0]; A) viewed as a function of A.

[Figure: p(x[0] = 3; A) versus A, for σ² = 1/3 (sharply peaked) and σ² = 1 (broad).]

Consider the two cases σ² = 1 and σ² = 1/3.

The "sharpness" (curvature) of the likelihood function determines how accurately the unknown parameter can be estimated.

How to measure sharpness/curvature?

Towards the CRB (Ch. 3): Estimator accuracy

Measure the curvature by

  −∂² ln p(x[0]; A) / ∂A²

[Figure: p(x[0] = 3; A) for σ² = 1, together with ln p(x[0]; A) and its first and second derivatives with respect to A.]


Towards the CRB (Ch. 3): Estimator accuracy

Measure the curvature by −∂² ln p(x[0]; A)/∂A².

In this example:

  1 / ( −∂² ln p(x[0]; A)/∂A² ) = σ² = var[Â]

In general, as ∂² ln p(x[0]; A)/∂A² can depend on x[0], the curvature is measured by

  −E[ ∂² ln p(x[0]; A)/∂A² ]

The Cramér-Rao bound:

  var[Â] ≥ 1 / ( −E[ ∂² ln p(x; A)/∂A² ] )

Towards the CRB (Ch. 3): Example

Calculate the CRB for

  x[n] = A + w[n] ,  n = 0, …, N−1 ,  with w[n] ∼ N(0, σ²)

Remember the estimator Â2 = (1/N) Σ_{n=0}^{N−1} x[n].

What is the Cramér-Rao bound

  var[Â] ≥ 1 / ( −E[ ∂² ln p(x; A)/∂A² ] ) ?
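A worked sketch of the answer (the standard white-Gaussian-noise result, cf. Kay Vol. I, Ch. 3):

  ln p(x; A) = −(N/2) ln(2πσ²) − (1/(2σ²)) Σ_{n=0}^{N−1} (x[n] − A)²

  ∂ ln p(x; A)/∂A = (1/σ²) Σ_{n=0}^{N−1} (x[n] − A) ,   ∂² ln p(x; A)/∂A² = −N/σ²

so that var[Â] ≥ σ²/N. Since var[Â2] = σ²/N, the sample mean attains the bound and is therefore the MVU estimator.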


Next week:
Cramér-Rao Lower Bound

• Gives a lower bound on the variance of an unbiased estimator:

  var(θ̂) ≥ 1 / ( −E[ ∂² ln p(x; θ)/∂θ² ] ) = 1 / E[ ( ∂ ln p(x; θ)/∂θ )² ]

• Can be used to determine whether an estimator is the MVU. This is the case if the estimator attains the bound.

• Can be used as a benchmark to compare the performance of unbiased estimators.

• Can be used to establish the non-existence of better unbiased estimators.
