Detection and Estimation Theory
Lecture Notes for ECEn 672, Section 001
Prepared by Wynn Stirling
Winter Semester, 2009
Department of Electrical and Computer Engineering
Brigham Young University, Provo, Utah
Copyright © 2009, Wynn C. Stirling
Contents

1 The Formalism of Statistical Decision Theory  1-1
  1.1 Game Theory and Decision Theory  1-1
  1.2 The Mathematical Structure of Decision Theory  1-4
    1.2.1 The Formalism of Statistical Decision Theory  1-5
    1.2.2 Special Cases  1-9

2 The Multivariate Normal Distribution  2-1
  2.1 The Univariate Normal Distribution  2-1
  2.2 Development of The Multivariate Distribution  2-1
  2.3 Transformation of Variables  2-4
  2.4 The Multivariate Normal Density  2-6

3 Introductory Estimation Theory Concepts  3-1
  3.1 Notational Conventions  3-1
  3.2 Populations and Statistics  3-2
    3.2.1 Sufficient Statistics  3-3
    3.2.2 Complete Sufficient Statistics  3-9
  3.3 Exponential Families  3-13
  3.4 Minimum Variance Unbiased Estimators  3-17

4 Neyman-Pearson Theory  4-1
  4.1 Hypothesis Testing  4-1
  4.2 Simple Hypothesis versus Simple Alternative  4-2
  4.3 The Neyman-Pearson Lemma  4-3
  4.4 The Likelihood Ratio  4-8
  4.5 Receiver Operating Characteristic  4-11
  4.6 Composite Binary Hypotheses  4-18

5 Bayes Decision Theory  5-1
  5.1 The Bayes Principle  5-1
  5.2 Bayes Risk  5-2
  5.3 Bayes Tests of Simple Binary Hypotheses  5-4
  5.4 Bayes Envelope Function  5-10
  5.5 Posterior Distributions  5-12
  5.6 Randomized Decision Rules  5-15
  5.7 Minimax Rules  5-17
  5.8 Summary of Binary Decision Problems  5-18
  5.9 Multiple Decision Problems  5-18
  5.10 An Important Class of M-Ary Problems  5-24

6 Maximum Likelihood Estimation  6-1
  6.1 The Maximum Likelihood Principle  6-1
  6.2 Maximum Likelihood for Continuous Distributions  6-5
  6.3 Comments on Estimation Quality  6-8
  6.4 The Cramér-Rao Bound  6-9
  6.5 Asymptotic Properties of Maximum Likelihood Estimators  6-15
  6.6 The Multivariate Normal Case  6-20
  6.7 Appendix: Matrix Derivatives  6-23

7 Conditioning  7-1
  7.1 Conditional Densities  7-1
  7.2 σ-Fields  7-5
  7.3 Conditioning on a σ-Field  7-10
  7.4 Conditional Expectations and Least-Squares Estimation  7-13

8 Bayes Estimation Theory  8-1
  8.1 Bayes Risk  8-3
  8.2 MAP Estimates  8-6
  8.3 Conjugate Prior Distributions  8-9
  8.4 Improper Prior Distributions  8-12
  8.5 Sequential Bayes Estimation  8-13

9 Linear Estimation Theory  9-16
  9.1 Introduction  9-16
  9.2 Minimum Mean Square Estimation (MMSE)  9-18
  9.3 Estimation Given a Single Random Variable  9-19
  9.4 Estimation Given Two Random Variables  9-20
  9.5 Estimation Given N Random Variables  9-21
  9.6 Mean Square Estimation for Random Vectors  9-23
  9.7 Hilbert Space of Random Variables  9-24
  9.8 Geometric Interpretation of Mean Square Estimation  9-27
  9.9 Gram-Schmidt Procedure  9-29
  9.10 Estimation Given the Innovations Process  9-33
  9.11 Innovations and Matrix Factorizations  9-36
  9.12 LDU Decomposition  9-37
  9.13 Cholesky Decomposition  9-38
  9.14 White Noise Interpretations  9-40
  9.15 More On Modeling  9-41

10 Estimation of State Space Systems  10-42
  10.1 Innovations for Processes with State Space Models  10-42
  10.2 Innovations Representations  10-48
  10.3 A Recursion for P_{i|i-1}  10-50
  10.4 The Discrete-Time Kalman Filter  10-52
  10.5 Perspective  10-57
  10.6 Kalman Filter Example  10-59
    10.6.1 Model Equations  10-59
  10.7 Interpretation of the Kalman Gain  10-62
  10.8 Smoothing  10-63
    10.8.1 A Word About Notation  10-63
    10.8.2 Fixed-Lag and Fixed-Point Smoothing  10-64
    10.8.3 The Rauch-Tung-Striebel Fixed-Interval Smoother  10-64
  10.9 Extensions to Nonlinear Systems  10-69
    10.9.1 Linearization  10-69
    10.9.2 The Extended Kalman Filter  10-72
List of Figures

1-1 Loss function (or matrix) for Odd or Even game  1-2
1-2 Risk Matrix for Statistical Odd or Even Game  1-8
1-3 Structure of a Statistical Game  1-9
4-1 Illustration of threshold for Neyman-Pearson test  4-6
4-2 Error probabilities for normal variables with different means and equal variances: (a) P_FA calculation, (b) P_D  4-12
4-3 Receiver operating characteristic: normal variables with unequal means and equal variances  4-13
4-4 Receiver operating characteristic: normal variables with equal means and unequal variances  4-15
4-5 Demonstration of convexity property of the ROC  4-16
5-1 Bayes envelope function  5-11
5-2 Bayes envelope function: normal variables with unequal means and equal variances  5-12
5-3 Bayes envelope function  5-14
5-4 Loss Function  5-16
5-5 Geometrical interpretation of the risk  5-21
5-6 Geometrical interpretation of the minimax  5-22
5-7 Loss Function for Statistical Odd or Even Game  5-22
5-8 Risk set for "odd or even" game  5-23
5-9 Decision space for M = 3  5-28
6-1 Empiric Distribution Function  6-4
7-1 The family of rectangles {X ∈ [x − ∆x, x + ∆x], Y ∈ [y − ∆y, y + ∆y]}  7-3
7-2 The family of trapezoids {X ∈ [x − ∆x, x + ∆x], Y ∈ [y − X∆y, y + X∆y]}  7-4
9-1 Geometric interpretation of conditional expectation  9-28
9-2 Geometric illustration of Gram-Schmidt procedure  9-30
1 The Formalism of Statistical Decision Theory
1.1 Game Theory and Decision Theory
This course is primarily focused on the engineering topics of detection and estimation. These topics have their roots in probability theory and fit in the general area of statistical decision theory. In fact, the component of statistical decision theory that we will be concerned with fits in an even larger mathematical construct, that of game theory. Therefore, to establish these connections and to provide a useful context for future development, we will begin our discussion of this topic with a brief detour into the general area of mathematical games. A two-person, zero-sum mathematical game, which we will refer to from now on simply as a game, consists of three basic components:
1. A nonempty set, Θ₁, of possible actions available to Player 1.
2. A nonempty set, Θ₂, of possible actions available to Player 2.
3. A loss function, L : Θ₁ × Θ₂ → ℝ, representing the loss incurred by Player 1 (which, under the zero-sum condition, corresponds to the gain obtained by Player 2).
Any such triple (Θ₁, Θ₂, L) defines a game. Here is a simple example, taken from [3, page 2]. Example: Odd or Even. Two contestants simultaneously put up either one or two fingers. Player 1 wins if the sum of the digits showing is odd, and Player 2 wins if the sum of the digits showing is even. The winner in all cases receives in dollars the sum of the digits showing, this being paid to him by the loser. To create a triple (Θ₁, Θ₂, L) for this game we define Θ₁ = Θ₂ = {1, 2} and define the loss function by
L(1, 1) = 2
L(1, 2) = −3
L(2, 1) = −3
L(2, 2) = 4
It is customary to arrange the loss function into a loss matrix as depicted in Figure 1-1.
Figure 1-1: Loss function (or matrix) for Odd or Even game:

          Θ₂ = 1   Θ₂ = 2
  Θ₁ = 1     2       −3
  Θ₁ = 2    −3        4
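The loss values above follow directly from the payoff rule of the game. As a quick sketch (our own illustration, not part of the original notes; the function name `loss` is ours), the matrix can be generated programmatically from that rule:

```python
# Sketch: build the Odd or Even loss matrix from the rule stated above.
# Player 1's loss is +s when the sum s of fingers shown is even (Player 2
# wins and collects s dollars) and -s when s is odd (Player 1 wins).

def loss(theta1, theta2):
    """Loss to Player 1 when Player 1 shows theta1 fingers, Player 2 shows theta2."""
    s = theta1 + theta2
    return s if s % 2 == 0 else -s

L = {(i, j): loss(i, j) for i in (1, 2) for j in (1, 2)}
# Matches the four values listed above:
assert L == {(1, 1): 2, (1, 2): -3, (2, 1): -3, (2, 2): 4}
```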
We won’t get into the details of how to develop a strategy for this game and many others similar in structure to it; that is a topic in its own right. For those who may be interested in general game theory, [10] is a reasonable introduction.
Exercise 1-1 Consider the well-known game of Prisoner's Dilemma. Two agents, denoted X₁ and X₂, are accused of a crime. They are interrogated separately, but the sentences that are passed are based upon the joint outcome. If they both confess, they are both sentenced to a jail term of three years. If neither confesses, they are both sentenced to a jail term of one year. If one confesses and the other refuses to confess, then the one who confesses is set free and the one who refuses to confess is sentenced to a jail term of five years. This payoff matrix is illustrated in Table 1-1. The first entry in each quadrant of the payoff matrix corresponds to X₁'s payoff, and the second entry corresponds to X₂'s payoff. This particular game represents a slight extension to our original definition, since it is not a zero-sum game. When playing such a game, a reasonable strategy is for each agent to make a choice such that, once chosen, neither player would have an incentive to depart unilaterally from the outcome. Such a decision pair is called a Nash equilibrium point. In other words, at the Nash equilibrium point, both players can only hurt themselves by departing from their decision. What is the Nash equilibrium point for the Prisoner's Dilemma game? Explain why this problem is considered a "dilemma."
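To make the Nash-equilibrium condition concrete, here is a minimal sketch (our own illustration, not part of the notes) of an exhaustive check for pure-strategy equilibria, using the jail terms stated in the exercise. The strategy labels `C`/`R` and the helper `is_nash` are our notation; running it answers the exercise, so treat it as a check on your own reasoning:

```python
# Pure-strategy Nash equilibrium check for a two-player game given as cost
# tables (jail years; lower is better). Strategies: C = confess, R = refuse.

costs = {  # (Player 1 choice, Player 2 choice) -> (years for 1, years for 2)
    ("C", "C"): (3, 3),
    ("C", "R"): (0, 5),
    ("R", "C"): (5, 0),
    ("R", "R"): (1, 1),
}
strategies = ("C", "R")

def is_nash(s1, s2):
    """True if neither player can lower their own cost by deviating alone."""
    c1, c2 = costs[(s1, s2)]
    no_better_1 = all(costs[(a, s2)][0] >= c1 for a in strategies)
    no_better_2 = all(costs[(s1, b)][1] >= c2 for b in strategies)
    return no_better_1 and no_better_2

equilibria = [(s1, s2) for s1 in strategies for s2 in strategies if is_nash(s1, s2)]
```

Comparing the equilibrium against the (R, R) outcome, which is better for both players, is what makes the game a "dilemma."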
Table 1-1: A typical payoff matrix for the Prisoner's Dilemma.

Table 1-2: Payoff for Revelation Game: 4 = best, 3 = next best, 2 = next worst, 1 = worst.

Exercise 1-2 In his delightful book, Superior Beings: If They Exist, How Would We Know?, Steven J. Brams introduces a game called the Revelation Game. In this game, there are two agents. Player 1 we will term the superior being (SB), and Player 2 is a person (P). SB has
two strategies:
1. Reveal himself
2. Don’t reveal himself
Agent P also has two strategies:
1. Believe in SB’s existence
2. Don’t believe in SB’s existence
Table 1-2 provides the payoff matrix for this game. What is the Nash equilibrium point for this game?
We will view decision theory as a game between the decision-maker, or agent, and nature, where nature takes the role of, say, Player 1, and the agent becomes Player 2. The components of this game, which we will denote by (Θ, ∆, L), become
1. A nonempty set, Θ, of possible states of nature, sometimes referred to as the parameter space.
2. A nonempty set, ∆, of possible decisions available to the agent, sometimes called the decision space.
3. A loss function, L : Θ × ∆ → ℝ, representing the loss incurred by nature (which corresponds to the gain obtained by the agent). This function is also sometimes called the cost function.
Let’s take a minute and detail some of the important diﬀerences between game theory and decision theory.
• In a two-person game, it is usually assumed that the players are simultaneously trying to maximize their winnings (or minimize their losses), whereas with decision theory, nature assumes essentially a neutral role and only the agent is trying to extremize anything. Of course, if you are paranoid, you might want to consider nature your opponent, but most people feel content to think of nature as being neutral. If we do so, we might be willing to be a little more bold in the decision strategies we choose, since we don't need to be so careful about protecting ourselves.
• In a game, we usually assume that each player makes its decision based on exactly the same information (cheating is not allowed), whereas in decision theory, the agent may have available additional information, via observations, that may be used to gain an advantage on nature. This difference is more apparent than real, because there is nothing about game theory that says a game has to be fair. In fact, decision problems can be viewed as simply more complex games. The fact seems to be that decision theory is really a subset of the larger body of game theory, but there are enough special issues and structure involved in the way the agent may use observations to warrant its being a theory on its own, considered apart from game theory proper.
1.2 The Mathematical Structure of Decision Theory
In its most straightforward expression, the agent’s job is to guess the state of nature. A good job means small loss, so the agent is motivated to get the most out of any information available in the form of observations. We suppose that before making a decision the agent is
permitted to look at the observed value of a random variable or vector, X , whose distribution depends upon the true state of nature, θ ∈ Θ.
Before presenting the mathematical development, we need a preliminary definition. Let (Θ₁, T₁) and (Θ₂, T₂) be two measurable spaces. A transition probability is a mapping P : Θ₁ × T₂ → [0, 1] such that

1. For every θ₁ ∈ Θ₁, P(θ₁, ·) is a probability on (Θ₂, T₂).