Detection theory
Lecture IV, Version 1.1
Andreas Austeng
March 2022
Detection versus estimation
Detection:
The ability to discern between information-bearing patterns (signals) and random
patterns that distract from the information (noise).
Used for (automated) scanning of (large) data sets for the purpose of detecting,
localizing, and identifying particular targets of interest.
Estimation:
An extension of detection. It answers the question of how much of a particular item of interest is present (i.e., a fine localization in the parameter space) and provides a measure of the accuracy or confidence.
Estimation
[Figure: the decision model. Source: Detection theory: applications and digital signal processing / Ralph D. Hippenstiel; Figure 1.1.]
Decision model (cont.)
E: The event space
- The signals are observed (i.e., obtained) after having passed through some channel (i.e., the medium).
- In general, the received data is the transmitted signal plus an additive noise component:
  $r(t) = s(t) + n(t)$
Example (Target detection, Ex 1.13)
- Events and received data:
  - $e_0$ (no target) $\to$ $r_0(t) = n(t)$, for $0 \le t \le T$.
  - $e_1$ (target present) $\to$ $r_1(t) = A\,s(t - t_R) + n(t)$, for $t_R \le t \le T + t_R$,
    where $A$ is the signal loss coefficient and $t_R$ the propagation delay (no Doppler shift assumed).
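As a rough illustration of this model, a base-Matlab sketch with all numbers hypothetical (a rectangular pulse, loss $A = 0.5$, delay $t_R = 2$ s):

    % Hypothetical example of the two received-data events.
    Fs = 100; T = 1; tR = 2; A = 0.5;     % sampling rate, pulse length, delay, loss
    t  = 0:1/Fs:T+tR;                     % time axis covering the delayed pulse
    s  = double(t >= tR & t <= tR+T);     % delayed unit pulse, s(t - tR)
    n  = 0.1*randn(size(t));              % additive white Gaussian noise n(t)
    r0 = n;                               % e0 (no target):      r0(t) = n(t)
    r1 = A*s + n;                         % e1 (target present): r1(t) = A s(t-tR) + n(t)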
Decision model (cont.)
D: The decision space
If $x \sim N(m_x, \sigma^2)$, then $f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x - m_x)^2}{2\sigma^2}}$.
[Figure: the noise-only pdf $f_n(x)$ with the interval endpoints $x_1$ and $x_2$ marked.]
- The probability that a single random value lies between $x_1$ and $x_2$ (given the probability density function $f_X(x)$):
  $P(x_1 < x < x_2) = \int_{x_1}^{x_2} f_X(x)\,dx$
- For any pdf: $\int_{-\infty}^{\infty} f_X(x)\,dx = 1$.
- If $x \sim N(m_x, \sigma^2)$:
- $P(m_x - \sigma < x < m_x + \sigma) = 0.6827$, $P(m_x - 2\sigma < x < m_x + 2\sigma) = 0.9545$
- $P(m_x - 3\sigma < x < m_x + 3\sigma) = 0.9973$ (a quick numerical check follows below)
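These interval probabilities follow from the error function, since $P(m_x - k\sigma < x < m_x + k\sigma) = \mathrm{erf}(k/\sqrt{2})$; a quick check in base Matlab (no toolboxes assumed):

    % Probability of a Gaussian sample falling within k standard deviations.
    k = 1:3;
    P = erf(k/sqrt(2))   % returns 0.6827  0.9545  0.9973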
Right tail probability
- In Matlab, given the standard normal distribution (zero mean, unit variance),
  $f_X(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{x^2}{2}}$:
  p = normspec([-Inf,1],0,1,'outside')
- The cumulative distribution is then
  $\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-\frac{t^2}{2}}\,dt$
- The probability of exceeding a given value, $Q(x)$, is given by the complementary cumulative distribution, $Q(x) = 1 - \Phi(x)$, and the threshold follows as
  $\gamma = Q^{-1}(P_{FA})$
[Figure: normspec output, the standard normal density with critical value 1; the probability greater than the upper bound is 0.15866.]
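normspec belongs to the Statistics and Machine Learning Toolbox; the same right-tail probability and its inverse can be sketched in base Matlab with the complementary error function:

    % Q(x) = 1 - Phi(x) = 0.5*erfc(x/sqrt(2)) for the standard normal.
    Q    = @(x) 0.5*erfc(x/sqrt(2));    % right-tail probability
    Qinv = @(p) sqrt(2)*erfcinv(2*p);   % its inverse
    Q(1)                % 0.1587, the tail probability in the normspec plot
    gam = Qinv(0.01)    % gamma = Q^{-1}(P_FA) for P_FA = 0.01, approx 2.33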
Set operations in probability theory
- Union: $A \cup B$
- Intersection: $A \cap B$
- $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
- Conditional probabilities:
  $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$ and $P(B \mid A) = \frac{P(A \cap B)}{P(A)}$
Bayes' rule:
- From $P(B \mid A) = \frac{P(A \cap B)}{P(A)}$ it follows that $P(A \cap B) = P(B \mid A)\,P(A)$.
- Then:
  $P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{P(B \mid A)\,P(A)}{P(B)}$
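A one-line numeric illustration of Bayes' rule (made-up probabilities):

    % Made-up numbers: P(B|A) = 0.9, P(A) = 0.1, P(B) = 0.2.
    PBgA = 0.9; PA = 0.1; PB = 0.2;
    PAgB = PBgA*PA/PB   % Bayes' rule gives P(A|B) = 0.45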
A basic signal detection problem
Noise only:
$f_n(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{x^2}{2\sigma^2}}$
[Figure: the noise-only pdf $f_n(x)$.]
A basic signal detection problem
Signal + Noise:
[Figure: the noise-only pdf $f_n(x)$ and the signal-plus-noise pdf $f_s(x)$, centered at $\theta$, with the decision threshold $x_c$ marked.]
Decision criterion:
- If $x > x_c$, decide $H_1$ (signal + noise).
- Otherwise, decide $H_0$ (noise only).
A basic signal detection problem
Decision Bias, $\beta$
[Figure: $f_n(x)$ and $f_s(x)$ with the threshold $x_c$ marked.]
- We define the decision bias, $\beta$, as
  $\beta = \frac{f_s(x_c)}{f_n(x_c)}$,
- i.e., as the likelihood of observing the threshold value $x_c$ if the signal is present, relative to that of observing $x_c$ if the signal is not present.
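For a hypothetical pair of unit-variance Gaussians ($f_n$ with mean 0 and $f_s$ with mean $\theta = 2$), $\beta$ at a threshold $x_c$ could be evaluated as:

    % Hypothetical Gaussian example: fn ~ N(0,1), fs ~ N(theta,1).
    theta = 2; xc = 1;
    fn   = @(x) exp(-x.^2/2)/sqrt(2*pi);
    fs   = @(x) exp(-(x-theta).^2/2)/sqrt(2*pi);
    beta = fs(xc)/fn(xc)   % equals exp(theta*xc - theta^2/2), here exactly 1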
Decision outcomes
True state: Noise only ($H_0$)
- Decide $H_0$: Correct Rejection, a True Negative (TN).
- Decide $H_1$: False Alarm, a False Positive (FP); a Type I Error.
True state: Signal + Noise ($H_1$)
- Decide $H_1$: Detection, a True Positive (TP).
- Decide $H_0$: Miss, a False Negative (FN); a Type II Error.
Decision outcomes
[Figure: Noise only: the pdf $f_n(x)$ split at $x_c$ into Correct Rejection (CR, left) and False Alarm (FA, right). Signal + Noise: the pdf $f_s(x)$, centered at $\theta$, split at $x_c$ into Miss (M, left) and Detection (D, right). In the example shown, $\Pr\{0_R \mid 1_T\} = 0.2$ and $\Pr\{1_R \mid 0_T\} = 0.2$.]
- If no prior probability nor any cost information is available, then comparing the likelihood functions gives a meaningful detection scheme.
- That is: "Given the data, which conditional PDF (i.e., $f_0(\boldsymbol{y})$ or $f_1(\boldsymbol{y})$) is more likely to have generated the data?"
- In the scalar case, the detection scheme becomes (a sketch follows below)
  $f_1(y) \underset{H_0}{\overset{H_1}{\gtrless}} f_0(y)$
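A minimal scalar sketch of this maximum-likelihood rule, assuming two hypothetical unit-variance Gaussians with means 0 and 1:

    % ML decision: pick the hypothesis whose conditional pdf is larger.
    f0 = @(y) exp(-y.^2/2)/sqrt(2*pi);      % hypothetical pdf under H0
    f1 = @(y) exp(-(y-1).^2/2)/sqrt(2*pi);  % hypothetical pdf under H1
    y  = 0.7;                               % observed sample
    decideH1 = f1(y) > f0(y)                % logical 1: y lies closer to mean 1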
- The criterion is used in binary communication problems where the cost of making an error is the same and the cost of making a correct decision is zero.
- We do allow non-equal prior probabilities (depending on the source coding).
- We set up the average cost in the same way as in the Bayes' case.
- The average cost becomes ($C_{00} = C_{11} = 0$, $C_{10} = C_{01} = 1$)
  $\bar{C} = P_0 \int_{R_1} f_0(\boldsymbol{y})\,d\boldsymbol{y} + P_1 \int_{R_0} f_1(\boldsymbol{y})\,d\boldsymbol{y}$
- This is the same expression (including the threshold) as that of the MAP detector!
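Under these costs the likelihood-ratio threshold is the prior ratio $P_0/P_1$; a minimal scalar sketch (hypothetical Gaussians and priors):

    % Minimum-error-probability / MAP decision: Lambda(y) vs P0/P1.
    f0 = @(y) exp(-y.^2/2)/sqrt(2*pi);      % hypothetical pdf under H0
    f1 = @(y) exp(-(y-1).^2/2)/sqrt(2*pi);  % hypothetical pdf under H1
    P0 = 0.8; P1 = 0.2;                     % non-equal priors
    y  = 0.7;
    decideH1 = f1(y)/f0(y) > P0/P1          % logical 0 here: the strong prior on H0 wins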
Min-Max criterion
- The cost information (i.e., $C_{ij}$, $i, j = 0, 1$) is available, but not the a priori probabilities $P_0$ or $P_1$.
- The best Bayes' detection scheme is then desirable under the worst possible choice of priors $P_i$, $i = 0, 1$ (Murphy's law).
- The average Bayes' cost (using $P_{FA}$ and $P_M$):
  $\bar{C} = P_0 C_{00} + P_1 C_{11} + P_1 (C_{01} - C_{11}) P_M + P_0 (C_{10} - C_{00}) P_{FA}$
- If we assume $C_{00} = C_{11} = 0$ (as in most real-life situations), we can plot $\bar{C}$ as a function of $P_1$ and get either a straight line or a curve, as in the figure.
- When $P_0 = 1$, then $P_1 = P_{FA} = 0$ and $\bar{C} = C_{00}$.
- When $P_1 = 1$, then $P_0 = P_{FA} = P_M = 0$ and $\bar{C} = C_{11}$.
- In the figure, the maximum occurs where the slope of the curve is zero.
- To find the maximum, set the partial derivative of $\bar{C}$ with respect to $P_1$ to zero.
[Figure: the average cost $\bar{C}$ as a function of $P_1$.]
Min-Max criterion ...
- So, given the maximum average cost, we minimize it; hence the name Min-Max criterion.
- With $P_0 = 1 - P_1$:
  $\frac{\partial \bar{C}}{\partial P_1} = \frac{\partial}{\partial P_1} \left\{ (1 - P_1) C_{00} + P_1 C_{11} + P_1 (C_{01} - C_{11}) P_M + (1 - P_1)(C_{10} - C_{00}) P_{FA} \right\}$
- Setting it to zero gives:
  $(C_{11} - C_{00}) + (C_{01} - C_{11}) P_M - (C_{10} - C_{00}) P_{FA} = 0$
- With $C_{00} = C_{11} = 0$, this reduces to
  $C_{01} P_M = C_{10} P_{FA}$
- The detector that follows from the Bayes' cost formula is the Bayes detector, with the threshold $\lambda$ chosen so that the above equation holds (a numerical sketch follows below).
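For a hypothetical Gaussian pair $N(0,1)$ versus $N(d,1)$ with $C_{01} = C_{10} = 1$, the condition $P_M(\gamma) = P_{FA}(\gamma)$ can be solved numerically with base-Matlab fzero:

    % Min-Max threshold for N(0,1) vs N(d,1), equal error costs.
    d   = 2;                               % hypothetical mean separation
    Q   = @(x) 0.5*erfc(x/sqrt(2));        % standard normal right tail
    PFA = @(g) Q(g);                       % P(decide H1 | H0)
    PM  = @(g) 1 - Q(g - d);               % P(decide H0 | H1)
    gam = fzero(@(g) PM(g) - PFA(g), d/2)  % by symmetry, gamma = d/2 = 1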
The Neyman-Pearson criterion
- One of the first in a class called CFAR (constant false alarm rate) detectors.
- Typically used in radar and sonar applications.
- Allows a constant false alarm rate.
- Attractive when loss of life or essential equipment is involved.
- Desirable when resource allocations are important and the maximum false alarm rate has to be kept at a tolerable level.
- In essence, the false alarm rate is fixed while the detection probability is maximized.
- Approach:
  - Fix the false alarm rate $P_{FA}$ at a value $\alpha$ (i.e., force $(P_{FA} - \alpha)$ to zero) while maximizing $P_D$ (or minimizing $P_M$, since $P_M = 1 - P_D$).
  - Easily solved with Lagrange multipliers.
- Cost function:
  $J = \underbrace{\int_{R_0} f_1(\boldsymbol{y})\,d\boldsymbol{y}}_{\text{minimize}} + \lambda \underbrace{\left[ \int_{R_1} f_0(\boldsymbol{y})\,d\boldsymbol{y} - \alpha \right]}_{\text{equality constraint}}$
  $= \int_{R_0} f_1(\boldsymbol{y})\,d\boldsymbol{y} + \lambda \left[ 1 - \int_{R_0} f_0(\boldsymbol{y})\,d\boldsymbol{y} - \alpha \right]$
  $= \lambda(1 - \alpha) + \int_{R_0} \left[ f_1(\boldsymbol{y}) - \lambda f_0(\boldsymbol{y}) \right] d\boldsymbol{y}$
The Neyman-Pearson criterion ...
- Solving the cost function:
  $J = \lambda(1 - \alpha) + \int_{R_0} \left[ f_1(\boldsymbol{y}) - \lambda f_0(\boldsymbol{y}) \right] d\boldsymbol{y}$
- For fixed $\lambda$, the first term is constant ($> 0$).
- Need only consider
  $\int_{R_0} \left[ f_1(\boldsymbol{y}) - \lambda f_0(\boldsymbol{y}) \right] d\boldsymbol{y}$
- Same as for the Bayes' detector.
- We get
  $f_1(\boldsymbol{y}) \underset{H_0}{\overset{H_1}{\gtrless}} \lambda f_0(\boldsymbol{y})$
- or
  $\Lambda(\boldsymbol{y}) = \frac{f_1(\boldsymbol{y})}{f_0(\boldsymbol{y})} \underset{H_0}{\overset{H_1}{\gtrless}} \lambda$
- By solving the above equation, we get a test statistic $z$ with
  $z \underset{H_0}{\overset{H_1}{\gtrless}} \gamma$
- The false alarm rate is then given by
  $P_{FA} = \int_{\gamma}^{\infty} f_0(z)\,dz = \alpha$
- The threshold $\gamma$ is chosen so that the $P_{FA}$ constraint is met (i.e., $P_{FA} = \alpha$); a sketch follows below.
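For the hypothetical scalar Gaussian case ($z \sim N(0,1)$ under $H_0$ and $z \sim N(d,1)$ under $H_1$), the Neyman-Pearson threshold and the resulting detection probability follow directly:

    % NP design: gamma = Q^{-1}(alpha), then P_D = Q(gamma - d).
    alpha = 0.01; d = 3;                 % hypothetical P_FA constraint and mean shift
    Q     = @(x) 0.5*erfc(x/sqrt(2));    % standard normal right tail
    Qinv  = @(p) sqrt(2)*erfcinv(2*p);   % its inverse
    gam   = Qinv(alpha)                  % threshold, approx 2.33
    PD    = Q(gam - d)                   % detection probability, approx 0.75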
Receiver operating characteristic (ROC)
Detector performance evaluation
[Figure: ROC curve, $P(D|S)$ versus $P(FA|N)$. A small $x_c$ gives a small $\beta$ (upper-right end of the curve); a large $x_c$ gives a large $\beta$ (lower-left end); $\beta$ increases along the curve, and the operating point lies on it.]
- Generally, $\beta = \lambda = \frac{\partial P_D}{\partial P_{FA}}$, i.e., the slope of the ROC curve is the threshold of the detector.
- Also, from the Bayes' detector: $\beta = \frac{f_s(x_c)}{f_n(x_c)}$.
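An ROC like the one above can be traced for the hypothetical Gaussian pair $N(0,1)$ versus $N(d,1)$ by sweeping the threshold $x_c$:

    % ROC: sweep the threshold and plot P(D|S) against P(FA|N).
    Q   = @(x) 0.5*erfc(x/sqrt(2));   % standard normal right tail
    d   = 1;                          % hypothetical detectability index d'
    xc  = linspace(-4, 5, 200);       % threshold sweep
    PFA = Q(xc);                      % false alarm probability
    PD  = Q(xc - d);                  % detection probability
    plot(PFA, PD); xlabel('P(FA|N)'); ylabel('P(D|S)');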
Receiver operator characteristic (ROC) (continuing ...)
Receiver operating characteristic (ROC) (cont.)
[Figure: family of ROC curves, $P(D|S)$ versus $P(FA|N)$, bowing toward the upper-left corner with increasing $d'$.]
Terminology ...
From wikipedia.org:
Sensitivity
- A negative result in a test with high sensitivity is useful for ruling out disease.
- https://en.wikipedia.org/wiki/Sensitivity_and_specificity
Specificity
- A positive result in a test with high specificity is useful for ruling in disease.
- https://en.wikipedia.org/wiki/Sensitivity_and_specificity
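In detector terms these are just ratios of the decision-outcome counts; a minimal sketch with made-up counts:

    % Sensitivity and specificity from confusion counts (made-up numbers).
    TP = 90; FN = 10; TN = 80; FP = 20;
    sensitivity = TP/(TP+FN)   % 0.90: P(positive test | disease present)
    specificity = TN/(TN+FP)   % 0.80: P(negative test | disease absent)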
Sensitivity and Specificity
From faktisk.no:
https://www.faktisk.no/artikler/r8q/er-14-av-15-positive-koronaprover-falske
(Norwegian: "Are 14 out of 15 positive corona tests false?")
Worked example