
Decision and estimation in information processing: course nr. 5

March 22, 2021


Part I

Moments that characterize RVs

Outline: the mean; the mean theorem; moments.


The mean of a RV

Let ξ be a RV with PDF w_ξ.

We define the mean of ξ as:

\[ \overline{\xi} = \int_{-\infty}^{\infty} x\, w_\xi(x)\, dx. \]

The mean value is perceived as an average of a (large) number of instances of that RV (e.g., the average salary in the banking domain, the average lifetime in a given country, the average human height in a given geographical area).

Question: are we talking about the same thing as the mean value defined above?
Answer: yes!
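
As a quick numerical illustration (a sketch added here, assuming an exponential PDF w_ξ(x) = λe^{−λx}, whose mean is known to be 1/λ), the definition can be evaluated by direct numerical integration:

```python
import numpy as np

# Assumed example: exponential PDF with rate lam; its mean is 1/lam.
lam = 2.0
x = np.linspace(0.0, 50.0, 500_000)   # [0, 50] carries essentially all the mass
dx = x[1] - x[0]
w = lam * np.exp(-lam * x)            # w_xi(x)

mean = np.sum(x * w) * dx             # Riemann sum for the integral of x*w_xi(x)
print(mean)                           # ~0.5 = 1/lam
```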

Example: we have a set of N = 1825 resistors. We want to compute the average resistance.

We measure the resistances of all available resistors. The following table results:

Value of resistance R_i:  19Ω  20Ω  21Ω  22Ω  23Ω  24Ω  25Ω
Number of resistors:       89  211  432  501  342  171   79

The average resistance:

\[ R_{med} = \frac{1}{N} \sum_{i=1}^{N} R(r_i) = \frac{89 \cdot 19\Omega + 211 \cdot 20\Omega + \ldots + 79 \cdot 25\Omega}{1825} = \frac{89}{1825}\, 19\Omega + \frac{211}{1825}\, 20\Omega + \ldots + \frac{79}{1825}\, 25\Omega. \]
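
A direct computation of this weighted average (a small sketch; the values and counts are exactly those of the table):

```python
# Resistance values (ohms) and the number of resistors measured at each value.
values = [19, 20, 21, 22, 23, 24, 25]
counts = [89, 211, 432, 501, 342, 171, 79]

N = sum(counts)                                    # 1825 resistors in total
r_med = sum(r * n for r, n in zip(values, counts)) / N
print(N, r_med)                                    # 1825, ~21.89 ohms
```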

What is the significance of the ratios 89/1825, 211/1825, etc.? They are the experimentally determined probabilities: P(19Ω) = 89/1825, P(20Ω) = 211/1825, etc.

Hence:

\[ R_{med} \approx \sum_{R_i = 19\Omega}^{25\Omega} R_i\, P(R_i). \]

Assuming the value of the resistance is continuous (as it actually is), we can write:

\[ R_{med} \approx \int_{R} R\, w(R)\, dR. \]

Conclusion: the average of instances is a way of estimating the mean value \(\overline{\xi}\) when w_ξ is not known!
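
This conclusion is easy to check numerically (a sketch; a Gaussian ξ is assumed purely for concreteness): average many instances and compare with the true mean, without ever using the PDF.

```python
import numpy as np

rng = np.random.default_rng(0)

# Instances of xi ~ N(3, 2); pretend the PDF is unknown to us.
samples = rng.normal(loc=3.0, scale=2.0, size=1_000_000)

# The average of the instances estimates the (unknown) mean.
print(samples.mean())   # ~3.0
```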

The mean theorem

Let η = g(ξ). Then the mean of η can be directly computed as:

\[ \overline{\eta} = \int_{-\infty}^{\infty} g(x)\, w_\xi(x)\, dx. \]
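
A numerical sanity check of the theorem (a sketch under the assumption ξ ~ N(0,1) and g(x) = x²; then η is chi-square with one degree of freedom, so both integrals should give \(\overline{\eta} = 1\)):

```python
import numpy as np

# xi ~ N(0,1) and g(x) = x^2, so eta = g(xi) is chi-square with 1 dof.
x = np.linspace(-12.0, 12.0, 400_001)
dx = x[1] - x[0]
w_xi = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)

# Mean theorem: integrate g(x)*w_xi(x) directly; the PDF of eta is not needed.
mean_direct = np.sum(x**2 * w_xi) * dx

# Same mean via the PDF of eta: w_eta(y) = exp(-y/2) / sqrt(2*pi*y), y > 0.
y = np.linspace(1e-9, 60.0, 400_001)
dy = y[1] - y[0]
w_eta = np.exp(-y / 2) / np.sqrt(2 * np.pi * y)
mean_via_pdf = np.sum(y * w_eta) * dy

print(mean_direct, mean_via_pdf)      # both ~1.0
```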


The mean theorem: proof

In the context of functions of one RV, we proved that:

\[ w_\eta(y)\,\Delta y = w_\xi(x_1)\,\Delta x_1 + w_\xi(x_2)\,|\Delta x_2| + w_\xi(x_3)\,\Delta x_3. \]

By multiplying with y = g(x_1) = g(x_2) = g(x_3), we get:

\[ y\, w_\eta(y)\,\Delta y = g(x_1)\, w_\xi(x_1)\,\Delta x_1 + g(x_2)\, w_\xi(x_2)\,|\Delta x_2| + g(x_3)\, w_\xi(x_3)\,\Delta x_3. \]

We split the whole OY axis into non-overlapping intervals [y_i, y_i + Δy_i]. For each interval [y_i, y_i + Δy_i] the following holds:

\[ y_i\, w_\eta(y_i)\,\Delta y_i = \sum_k g(x_k)\, w_\xi(x_k)\,\Delta x_k. \]

The mean of η can then be computed as:

\[ \overline{\eta} \approx \sum_{\substack{j \\ \cup_j \Delta y_j = \mathbb{R}}} y_j\, w_\eta(y_j)\,\Delta y_j \;\xrightarrow[\Delta y_j \to 0]{}\; \int_{-\infty}^{\infty} y\, w_\eta(y)\, dy. \]

But g is a function: a function associates to every x a single y! Hence:

\[ \sum_{\substack{j \\ \cup_j \Delta y_j = \mathbb{R}}} y_j\, w_\eta(y_j)\,\Delta y_j \approx \sum_{\substack{i \\ \cup_i \Delta x_i = \mathbb{R}}} g(x_i)\, w_\xi(x_i)\,\Delta x_i. \]

When Δy_j → 0:

\[ \overline{\eta} = \int_{-\infty}^{\infty} y\, w_\eta(y)\, dy = \int_{-\infty}^{\infty} g(x)\, w_\xi(x)\, dx. \]

Remark: the theorem is valid for any function g.


The mean theorem: example

Compute the mean of η = aξ + b, with a, b ∈ ℝ.

We have:

\[ \overline{\eta} = \int_{-\infty}^{\infty} (ax+b)\, w_\xi(x)\, dx = a \underbrace{\int_{-\infty}^{\infty} x\, w_\xi(x)\, dx}_{\overline{\xi}} + b \underbrace{\int_{-\infty}^{\infty} w_\xi(x)\, dx}_{1} = a\overline{\xi} + b. \]

Particular cases:
a = 1 ⇒ \(\overline{\eta} = \overline{\xi} + b\).
b = 0 ⇒ \(\overline{\eta} = a\overline{\xi}\).
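
A one-line Monte Carlo check of this linearity (a sketch; a, b and the distribution of ξ are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
a, b = 2.5, -1.0
xi = rng.exponential(scale=4.0, size=1_000_000)   # any RV works; here mean = 4

eta = a * xi + b
print(eta.mean(), a * xi.mean() + b)              # both ~9.0
```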


Uncentered moments

Let RV ξ have PDF w_ξ.

We define the uncentered moment of order k of ξ as:

\[ m_k(\xi) = \overline{\xi^k} = \int_{-\infty}^{\infty} x^k\, w_\xi(x)\, dx, \qquad k = 1, 2, \ldots \]

The uncentered moment of order two is called the squared mean:

\[ m_2(\xi) = \overline{\xi^2} = \int_{-\infty}^{\infty} x^2\, w_\xi(x)\, dx. \]
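
For instance (a sketch; ξ uniform on [0, 1] is assumed, where m_k(ξ) = 1/(k+1) in closed form):

```python
import numpy as np

x = np.linspace(0.0, 1.0, 100_001)
dx = x[1] - x[0]
w = np.ones_like(x)                    # uniform PDF on [0, 1]

for k in (1, 2, 3):
    m_k = np.sum(x**k * w) * dx        # uncentered moment of order k
    print(k, m_k, 1 / (k + 1))         # matches 1/(k+1)
```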

Centered moments

Let ξ have PDF w_ξ.

We define the centered moment of order k of ξ as:

\[ M_k(\xi) = \overline{(\xi - \overline{\xi})^k} = \int_{-\infty}^{\infty} (x - \overline{\xi})^k\, w_\xi(x)\, dx, \qquad k = 2, \ldots \]

The centered moment of order two is called the variance:

\[ M_2(\xi) = \overline{(\xi - \overline{\xi})^2} = \int_{-\infty}^{\infty} (x - \overline{\xi})^2\, w_\xi(x)\, dx. \]

The square root of the variance is called the standard deviation:

\[ \sigma_\xi = \sqrt{M_2(\xi)}. \]
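
Continuing the uniform example (a sketch; for ξ uniform on [0, 1] the variance is known to be 1/12):

```python
import numpy as np

x = np.linspace(0.0, 1.0, 100_001)
dx = x[1] - x[0]
w = np.ones_like(x)                          # uniform PDF on [0, 1]

mean = np.sum(x * w) * dx                    # ~0.5
var = np.sum((x - mean)**2 * w) * dx         # centered moment of order 2
print(var, 1 / 12, np.sqrt(var))             # ~0.0833 = 1/12, sigma ~0.2887
```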

Relation between mean, squared mean, and variance

The usual notation for the variance is:

\[ M_2(\xi) = \sigma_\xi^2. \]

Between \(\overline{\xi}\), \(\overline{\xi^2}\) and \(\sigma_\xi\) we have:

\[ \sigma_\xi^2 = \overline{\xi^2} - (\overline{\xi})^2. \]

Proof:

\[ \sigma_\xi^2 = \int_{-\infty}^{\infty} \left( x^2 - 2x\overline{\xi} + \overline{\xi}^{\,2} \right) w_\xi(x)\, dx = \underbrace{\int_{-\infty}^{\infty} x^2\, w_\xi(x)\, dx}_{\overline{\xi^2}} - 2\overline{\xi} \underbrace{\int_{-\infty}^{\infty} x\, w_\xi(x)\, dx}_{\overline{\xi}} + (\overline{\xi})^2 \underbrace{\int_{-\infty}^{\infty} w_\xi(x)\, dx}_{1} = \overline{\xi^2} - (\overline{\xi})^2. \]
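
A quick check of this relation on sampled data (a sketch; the Gamma distribution is an arbitrary choice, with σ² = 12 here):

```python
import numpy as np

rng = np.random.default_rng(2)
xi = rng.gamma(shape=3.0, scale=2.0, size=1_000_000)   # Gamma(3, 2): var = 12

m1 = xi.mean()               # estimate of the mean
m2 = (xi**2).mean()          # estimate of the squared mean
print(m2 - m1**2, xi.var())  # both ~12.0
```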

The significance of standard deviation

If ξ : N(m, σ), it can be proven that \(\overline{\xi} = m\), whereas \(\sigma_\xi = \sigma\).
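
Numerically (a sketch; the sample mean and sample standard deviation are used as estimators):

```python
import numpy as np

rng = np.random.default_rng(3)
m, sigma = 5.0, 0.7
xi = rng.normal(loc=m, scale=sigma, size=1_000_000)   # xi ~ N(5, 0.7)

print(xi.mean(), xi.std())   # ~5.0 and ~0.7
```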

Part II

Statistical characterisation of a pair of RVs

Outline: introduction; the joint CDF; the joint PDF.


Introduction

ξ and η may be characterized, separately, by w_ξ(x) and w_η(y).

One cannot assess whether there is a dependence (and how strong it is) between ξ and η.

We need functions that characterize the joint statistical behaviour of ξ and η.


The joint CDF

We define F_ξη : ℝ² → [0, 1],

\[ F_{\xi\eta}(x, y) \stackrel{\text{def}}{=} P\big((\xi \le x) \cap (\eta \le y)\big). \]

Equivalently, F_ξη(x₀, y₀) = P((ξ, η) ∈ D), where D = (−∞, x₀] × (−∞, y₀]:

[Figure: the quadrant domain D ⊂ ℝ² below and to the left of the point (x₀, y₀) in the (x, y) plane.]
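
An empirical version of this definition (a sketch; the correlated Gaussian pair is an arbitrary choice): F_ξη(x₀, y₀) is estimated by the fraction of sample pairs falling in D.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 1_000_000

# Correlated pair: eta = 0.8*xi + 0.6*noise (both marginals are N(0,1)).
xi = rng.standard_normal(n)
eta = 0.8 * xi + 0.6 * rng.standard_normal(n)

x0, y0 = 0.5, -0.2
F_hat = np.mean((xi <= x0) & (eta <= y0))   # P((xi <= x0) and (eta <= y0))
print(F_hat)
```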


Properties of the joint CDF

1. F_ξη(−∞, y) = F_ξη(x, −∞) = 0
2. F_ξη(∞, ∞) = 1
3. F_ξη(x, ∞) = F_ξ(x) and F_ξη(∞, y) = F_η(y), respectively


4. P((x₁ < ξ ≤ x₂) ∩ (y₁ < η ≤ y₂)) = F_ξη(x₂, y₂) − F_ξη(x₁, y₂) − F_ξη(x₂, y₁) + F_ξη(x₁, y₁)

Proof. We have:

\[ P\big((x_1 < \xi \le x_2) \cap (y_1 < \eta \le y_2)\big) = P\big((x_1 < \xi \le x_2) \cap (\eta \le y_2)\big) - P\big((x_1 < \xi \le x_2) \cap (\eta \le y_1)\big). \]

But:

\[ P\big((x_1 < \xi \le x_2) \cap (\eta \le y_2)\big) = P\big((\xi \le x_2) \cap (\eta \le y_2)\big) - P\big((\xi \le x_1) \cap (\eta \le y_2)\big) = F_{\xi\eta}(x_2, y_2) - F_{\xi\eta}(x_1, y_2). \]

Similarly:

\[ P\big((x_1 < \xi \le x_2) \cap (\eta \le y_1)\big) = P\big((\xi \le x_2) \cap (\eta \le y_1)\big) - P\big((\xi \le x_1) \cap (\eta \le y_1)\big) = F_{\xi\eta}(x_2, y_1) - F_{\xi\eta}(x_1, y_1). \]

The expected result follows.
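
Property 4 can be verified on the empirical CDF from the earlier sketch (again with an arbitrarily chosen correlated Gaussian pair):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 1_000_000
xi = rng.standard_normal(n)
eta = 0.8 * xi + 0.6 * rng.standard_normal(n)

def F(x, y):                    # empirical joint CDF
    return np.mean((xi <= x) & (eta <= y))

x1, x2, y1, y2 = -0.5, 1.0, -1.0, 0.3
lhs = np.mean((xi > x1) & (xi <= x2) & (eta > y1) & (eta <= y2))
rhs = F(x2, y2) - F(x1, y2) - F(x2, y1) + F(x1, y1)
print(lhs, rhs)                 # equal, up to floating-point rounding
```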


The joint PDF

We define w_ξη : ℝ² → [0, ∞):

\[ w_{\xi\eta}(x, y) \stackrel{\text{def}}{=} \frac{\partial^2 F_{\xi\eta}(x, y)}{\partial x\, \partial y}. \]

As in the 1D case, it can be shown that, for small Δx and Δy:

\[ w_{\xi\eta}(x, y)\,\Delta x\,\Delta y \approx P\big((\xi \in (x, x + \Delta x]) \cap (\eta \in (y, y + \Delta y])\big). \]

By using w_ξη, we can compute the probability that ξ lies around x and η lies around y simultaneously.
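
For independent standard normals this is easy to verify (a sketch; independence is assumed, so that F_ξη(x, y) = Φ(x)Φ(y) and w_ξη(x, y) = φ(x)φ(y)); a finite difference of F approximates w:

```python
import math

def Phi(t):   # standard normal CDF
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def phi(t):   # standard normal PDF
    return math.exp(-t * t / 2.0) / math.sqrt(2.0 * math.pi)

def F(x, y):  # joint CDF of two INDEPENDENT standard normals
    return Phi(x) * Phi(y)

x, y, d = 0.4, -0.7, 1e-4
w_fd = (F(x + d, y + d) - F(x + d, y) - F(x, y + d) + F(x, y)) / (d * d)
print(w_fd, phi(x) * phi(y))   # both ~0.115
```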

Proof:

\[ \frac{\partial F_{\xi\eta}(x, y)}{\partial x} = \lim_{\Delta x \to 0} \frac{F_{\xi\eta}(x + \Delta x, y) - F_{\xi\eta}(x, y)}{\Delta x}. \]

Then:

\[ w_{\xi\eta}(x, y) = \frac{\partial^2 F_{\xi\eta}(x, y)}{\partial x\, \partial y} = \frac{\partial}{\partial y} \left( \frac{\partial F_{\xi\eta}(x, y)}{\partial x} \right) = \lim_{\Delta x \to 0} \frac{\dfrac{\partial F_{\xi\eta}(x + \Delta x, y)}{\partial y} - \dfrac{\partial F_{\xi\eta}(x, y)}{\partial y}}{\Delta x}. \]

In turn:

\[ \frac{\partial F_{\xi\eta}(x + \Delta x, y)}{\partial y} = \lim_{\Delta y \to 0} \frac{F_{\xi\eta}(x + \Delta x, y + \Delta y) - F_{\xi\eta}(x + \Delta x, y)}{\Delta y}, \qquad \frac{\partial F_{\xi\eta}(x, y)}{\partial y} = \lim_{\Delta y \to 0} \frac{F_{\xi\eta}(x, y + \Delta y) - F_{\xi\eta}(x, y)}{\Delta y}. \]

It follows:

\[ w_{\xi\eta}(x, y) = \lim_{\Delta x \to 0} \lim_{\Delta y \to 0} \frac{F_{\xi\eta}(x + \Delta x, y + \Delta y) - F_{\xi\eta}(x + \Delta x, y) - F_{\xi\eta}(x, y + \Delta y) + F_{\xi\eta}(x, y)}{\Delta x\, \Delta y}. \]

By using property nr. 4 of the joint CDF, we then have:

\[ w_{\xi\eta}(x, y) = \lim_{\Delta x \to 0} \lim_{\Delta y \to 0} \frac{P\big((\xi \in (x, x + \Delta x]) \cap (\eta \in (y, y + \Delta y])\big)}{\Delta x\, \Delta y}. \]


Properties of the joint PDF

1. w_ξη(x, y) ≥ 0, ∀(x, y) ∈ ℝ².

2. \( P\big((\xi, \eta) \in D\big) = \iint_D w_{\xi\eta}(x, y)\, dx\, dy \) for any D ⊂ ℝ².

Proof: any domain D ⊂ ℝ² can be decomposed as \( D = \bigcup_i \bigcup_j (x_i, x_i + \Delta x_i] \times (y_j, y_j + \Delta y_j] \):

[Figure: a domain D ⊂ ℝ² tiled by small rectangles (x_i, x_i + Δx_i] × (y_j, y_j + Δy_j].]

We have:

\[ P\big((\xi, \eta) \in D\big) \approx P\Big( (\xi, \eta) \in \bigcup_i \bigcup_j \big( (x_i, x_i + \Delta x_i] \times (y_j, y_j + \Delta y_j] \big) \Big) \stackrel{\text{axiom 3}}{=} \sum_i \sum_j P\big( (\xi \in (x_i, x_i + \Delta x_i]) \cap (\eta \in (y_j, y_j + \Delta y_j]) \big) = \sum_i \sum_j w_{\xi\eta}(x_i, y_j)\, \Delta x_i\, \Delta y_j \xrightarrow[\Delta x_i, \Delta y_j \to 0]{} \iint_D w_{\xi\eta}(x, y)\, dx\, dy. \]
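
Before moving on, property 2 can be checked for a non-rectangular domain (a sketch; ξ, η independent N(0,1) and D the unit disk are assumed, where the exact probability is 1 − e^{−1/2} ≈ 0.3935):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 1_000_000
xi, eta = rng.standard_normal(n), rng.standard_normal(n)

# Monte Carlo estimate of P((xi, eta) in D), with D the unit disk.
p_mc = np.mean(xi**2 + eta**2 <= 1.0)

# Same probability as the integral of w(x,y) = phi(x)phi(y) over D (grid sum).
g = np.linspace(-1.0, 1.0, 2001)
X, Y = np.meshgrid(g, g)
w = np.exp(-(X**2 + Y**2) / 2) / (2 * np.pi)
inside = (X**2 + Y**2) <= 1.0
p_int = np.sum(w[inside]) * (g[1] - g[0])**2
print(p_mc, p_int)   # both ~0.3935
```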

3. The normalization condition:

\[ \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} w_{\xi\eta}(x, y)\, dx\, dy = 1. \]

Proof: we write property 2 for D = ℝ².

4. The joint CDF:

\[ F_{\xi\eta}(x, y) = \int_{-\infty}^{x} \int_{-\infty}^{y} w_{\xi\eta}(u, v)\, du\, dv. \]

Proof: we write property 2 for D = (−∞, x] × (−∞, y].

5. Marginal PDFs:

\[ w_\xi(x) = \int_{-\infty}^{\infty} w_{\xi\eta}(x, y)\, dy, \qquad w_\eta(y) = \int_{-\infty}^{\infty} w_{\xi\eta}(x, y)\, dx. \]
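
Numerically (a sketch; a bivariate normal with correlation ρ = 0.8 is assumed, whose marginals are standard normal): integrating the joint PDF over y recovers w_ξ(x).

```python
import numpy as np

rho = 0.8                                    # assumed correlation coefficient
y = np.linspace(-10.0, 10.0, 200_001)
dy = y[1] - y[0]

def w_joint(x, y):
    # Bivariate normal PDF: zero means, unit variances, correlation rho.
    q = (x**2 - 2 * rho * x * y + y**2) / (1 - rho**2)
    return np.exp(-q / 2) / (2 * np.pi * np.sqrt(1 - rho**2))

x0 = 0.7
w_marginal = np.sum(w_joint(x0, y)) * dy     # integrate the joint PDF over y
print(w_marginal, np.exp(-x0**2 / 2) / np.sqrt(2 * np.pi))   # both ~0.3123
```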

Proof:

\[ F_\xi(x) = F_{\xi\eta}(x, \infty) = \int_{-\infty}^{x} \int_{-\infty}^{\infty} w_{\xi\eta}(u, v)\, dv\, du, \]

wherefrom:

\[ w_\xi(x) = \frac{dF_\xi(x)}{dx} = \frac{d}{dx} \int_{-\infty}^{x} \int_{-\infty}^{\infty} w_{\xi\eta}(u, v)\, dv\, du = \int_{-\infty}^{\infty} w_{\xi\eta}(x, v)\, dv. \]

We used the fact that \( \frac{d}{dx} \int_{a}^{x} f(u)\, du = f(x) \) for any real function f, here for \( f(u) = \int_{-\infty}^{\infty} w_{\xi\eta}(u, v)\, dv \).
