
Stochastic Signal Processing

Dr. Đỗ Hồng Tuấn

Department of Telecommunications, Faculty of Electrical and Electronics Engineering

Ho Chi Minh City University of Technology (HCMUT)

E-mail: do-hong@hcmut.edu.vn
Outline (1)

Chapter 1: Discrete-Time Signal Processing


z-Transform.
Linear Time-Invariant Filters.
Discrete Fourier Transform (DFT).

Chapter 2: Stochastic Processes and Models


Review of Probability and Random Variables.
Stochastic Models.
Stochastic Processes.

Chapter 3: Spectrum Analysis


Spectral Density.
Spectral Representation of Stochastic Process.
Spectral Estimation.
Other Statistical Characteristics of a Stochastic Process.

Outline (2)
Chapter 4: Eigenanalysis
Properties of Eigenvalues and Eigenvectors.
Chapter 5: Wiener Filters
Minimum Mean-Squared Error.
Wiener-Hopf Equations.
Channel Equalization.
Linearly Constrained Minimum Variance (LCMV) Filter.
Chapter 6: Linear Prediction
Forward Linear Prediction.
Backward Linear Prediction.
Levinson-Durbin Algorithm.
Chapter 7: Kalman Filters
Kalman Algorithm.
Applications of Kalman Filter: Tracking Trajectory of Object
and System Identification.

Outline (3)

Chapter 8: Linear Adaptive Filtering


Adaptive Filters and Applications.
Method of Steepest Descent.
Least-Mean-Square Algorithm.
Recursive Least-Squares Algorithm.
Examples of Adaptive Filtering: Adaptive Equalization
and Adaptive Beamforming.

Chapter 9: Estimation Theory


Fundamentals.
Minimum Variance Unbiased Estimators.
Maximum Likelihood Estimation.
Eigenanalysis Algorithms for Spectral Estimation.
Example: Direction-of-Arrival (DoA) Estimation.

References

[1] Simon Haykin, Adaptive Filter Theory, Prentice Hall, 1996 (3rd Ed.), 2001 (4th Ed.).

[2] Steven M. Kay, Fundamentals of Statistical Signal Processing: Estimation Theory, Prentice Hall, 1993.

[3] Alan V. Oppenheim, Ronald W. Schafer, Discrete-Time Signal Processing, Prentice Hall, 1989.

[4] Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, McGraw-Hill, 1991 (3rd Ed.), 2001 (4th Ed.).

[5] Dimitris G. Manolakis, Vinay K. Ingle, Stephen M. Kogon, Statistical and Adaptive Signal Processing, Artech House, 2005.

Goal of the course
Introduction to the theory and algorithms used for the analysis and
processing of random signals (stochastic signals) and their
applications to communications problems.

ƒ Understanding random signals via their statistical description (theory of probability, random variables, and stochastic processes), their modeling, and the dependence between the samples of one or more discrete-time random signals.

ƒ Developing theoretical methods and practical techniques for processing random signals to achieve a predefined, application-dependent objective.
• Major applications in communications: signal modeling, spectral estimation, frequency-selective filtering, adaptive filtering, array processing, ...

Random Signals vs. Deterministic Signals
Example: Discrete-time random signals (a) and the dependence
between the samples (b), [5].

Techniques for Processing Random Signals
‰ Signal analysis (signal modeling, spectral estimation): the primary goal is to extract useful information for understanding and classifying the signals.
Typical applications: detection of useful information from received signals, system modeling/identification, detection and classification of radar and sonar targets, speech recognition, signal representation for data compression, ...

‰ Signal filtering (frequency-selective filtering, adaptive filtering, array processing): the main objective is to improve the quality of a signal according to a criterion of performance.
Typical applications: noise and interference cancellation, echo cancellation, channel equalization, active noise control, ...

Applications of Adaptive Filters (1)
System identification [5]:

System inversion [5]:

Applications of Adaptive Filters (2)
Signal prediction [5]:

Interference cancellation [5]:

Applications of Adaptive Filters (3)
Simple model of a digital communications system with channel
equalization [5]:

Pulse trains: (a) without intersymbol interference (ISI) and (b) with ISI.
ISI distortion: the tails of adjacent pulses interfere with the current pulse and can lead to an incorrect decision.
The equalizer can compensate for the ISI distortion.

Applications of Adaptive Filters (4)
Block diagram of the basic components of an active noise control system
[5]:

Applications of Array Processing (1)
Beamforming [5]:

Applications of Array Processing (2)
Example of (adaptive) beamforming with an airborne radar for interference mitigation [5]:

Applications of Array Processing (3)
(Adaptive) Sidelobe canceler [5]:

Chapter 1:
Discrete-Time Signal Processing

‰ z-Transform.

‰ Linear Time-Invariant Filters.

‰ Discrete Fourier Transform (DFT).

1. Z-transform (1)
‰ Discrete-time signals → signals described as a time series: a sequence of uniformly spaced samples whose varying amplitudes carry the useful information content of the signal.

‰ Consider a time series (sequence) {u(n)}, or u(n), denoted by its samples u(n), u(n-1), u(n-2), …, where n is the discrete time index.
ƒ z-transform of u(n):

U(z) = \mathcal{Z}[u(n)] = \sum_{n=-\infty}^{\infty} u(n) z^{-n}    (1.1)

z: complex variable.
z-transform pair: u(n) ↔ U(z).
ƒ Region of convergence (ROC): the set of values of z for which U(z) converges uniformly.

1. Z-transform (2)
‰ Properties:
ƒ Linear transform (superposition):

a u_1(n) + b u_2(n) \;\leftrightarrow\; a U_1(z) + b U_2(z)    (1.2)

ROC of (1.2): intersection of the ROC of U_1(z) and the ROC of U_2(z).
ƒ Time shifting:

u(n) \leftrightarrow U(z) \;\Rightarrow\; u(n - n_0) \leftrightarrow z^{-n_0} U(z)    (1.3)

n_0: integer. ROC of (1.3): same as the ROC of U(z).
Special case: n_0 = 1 ⇒ u(n-1) ↔ z^{-1} U(z); z^{-1} is the unit-delay element.
ƒ Convolution theorem:

\sum_{i=-\infty}^{\infty} u_1(i)\, u_2(n-i) \;\leftrightarrow\; U_1(z)\, U_2(z)    (1.4)

ROC of (1.4): intersection of the ROC of U_1(z) and the ROC of U_2(z).
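A minimal numerical sketch of the convolution theorem (1.4) for finite-length sequences (the sample values below are arbitrary, chosen only for illustration): convolving two sequences in the time domain matches multiplying their z-transform polynomials in z^{-1}.

```python
import numpy as np

# Two arbitrary finite-length sequences (illustrative values only)
u1 = np.array([1.0, 2.0, 3.0])     # u1(0), u1(1), u1(2)
u2 = np.array([0.5, -1.0, 4.0])    # u2(0), u2(1), u2(2)

# Time-domain convolution: sum_i u1(i) u2(n-i)
conv_time = np.convolve(u1, u2)

# z-domain product: for finite sequences U(z) is a polynomial in z^{-1},
# so U1(z)U2(z) is just polynomial multiplication of the coefficient lists.
conv_z = np.polymul(u1, u2)

print(conv_time)                        # [ 0.5  0.   3.5  5.  12. ]
print(np.allclose(conv_time, conv_z))   # True: (1.4) holds
```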

1. Linear Time-Invariant (LTI) Filters (1)
‰ Definition: an LTI filter maps an input v(n) to an output u(n) such that
ƒ Linearity: if v_1(n) → u_1(n) and v_2(n) → u_2(n), then a v_1(n) + b v_2(n) → a u_1(n) + b u_2(n).
ƒ Time invariance: if v(n) → u(n), then v(n-k) → u(n-k).

‰ Impulse response h(n): the output of the LTI filter when the input is a unit impulse applied at n = 0.

For an arbitrary input v(n), the output is given by the convolution sum

u(n) = \sum_{i=-\infty}^{\infty} h(i)\, v(n-i)    (1.5)

1. LTI Filters (2)
‰ Transfer function:
ƒ Applying the z-transform to both sides of (1.5):

U(z) = H(z) V(z)    (1.6)

where U(z) ↔ u(n), V(z) ↔ v(n), H(z) ↔ h(n). H(z) is the transfer function of the filter:

H(z) = \frac{U(z)}{V(z)}    (1.7)

ƒ When the input sequence v(n) and the output sequence u(n) are related by a difference equation of order N,

\sum_{j=0}^{N} a_j\, u(n-j) = \sum_{j=0}^{N} b_j\, v(n-j)    (1.8)

with constant coefficients a_j, b_j, applying the z-transform gives

H(z) = \frac{U(z)}{V(z)} = \frac{\sum_{j=0}^{N} b_j z^{-j}}{\sum_{j=0}^{N} a_j z^{-j}} = \frac{b_0 \prod_{k=1}^{N} (1 - c_k z^{-1})}{a_0 \prod_{k=1}^{N} (1 - d_k z^{-1})}    (1.9)

where the c_k are the zeros and the d_k are the poles of H(z).

1. LTI Filters (3)
‰ From (1.9), two distinct types of LTI filters:
ƒ Finite-duration impulse response (FIR) filters: d_k = 0 for all k → all-zero filter; h(n) has finite duration.
ƒ Infinite-duration impulse response (IIR) filters: H(z) has at least one nonzero pole; h(n) has infinite duration. When c_k = 0 for all k → all-pole filter.
ƒ Examples of FIR and IIR filter structures appear on the next two slides; a short numerical illustration is sketched below.
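As a small illustration of the difference-equation form (1.8), the sketch below filters a test sequence with an FIR (all-zero) filter and an IIR (all-pole) filter using SciPy; the coefficient values are arbitrary and chosen only for the demonstration (SciPy's lfilter uses the same Σ a_k u(n-k) = Σ b_k v(n-k) convention).

```python
import numpy as np
from scipy import signal

v = np.random.default_rng(0).standard_normal(64)   # arbitrary input sequence

# FIR (all-zero) filter: u(n) = 0.5 v(n) + 0.3 v(n-1) + 0.2 v(n-2)
b_fir, a_fir = [0.5, 0.3, 0.2], [1.0]
u_fir = signal.lfilter(b_fir, a_fir, v)

# IIR (all-pole) filter: u(n) - 0.9 u(n-1) = v(n)
b_iir, a_iir = [1.0], [1.0, -0.9]
u_iir = signal.lfilter(b_iir, a_iir, v)

# Pole/zero check: the all-pole filter has a single pole at z = 0.9
z, p, k = signal.tf2zpk(b_iir, a_iir)
print(p)   # [0.9] -> inside the unit circle, hence stable
```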

1. LTI Filters (4)

[Figure: FIR (transversal) filter — the input v(n) passes through a tapped delay line of unit delays z^{-1}, producing v(n-1), …, v(n-M); the delayed samples are weighted by a_1, …, a_M and summed to form the output u(n).]

1. LTI Filters (5)
[Figure: IIR (all-pole, feedback) filter — the output u(n) is formed from v(n) plus the delayed outputs u(n-1), …, u(n-M), weighted by a_1, …, a_M and fed back through unit delays z^{-1}.]

1. LTI Filters (6)
‰ Causality and stability:
ƒ An LTI filter is causal if

h(n) = 0 \ \text{for} \ n < 0    (1.10)

ƒ An LTI filter is stable if the output sequence is bounded for every bounded input sequence. From (1.5), a necessary and sufficient condition is

\sum_{k=-\infty}^{\infty} |h(k)| < \infty    (1.11)

[Figure: the z-plane with the unit circle; the region of stability for the poles of a causal filter is the interior of the unit circle.]

A causal LTI filter is stable if and only if all of the poles of the filter's transfer function lie inside the unit circle in the z-plane (see [3] for more details).
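A minimal numerical check of this pole criterion, assuming an illustrative transfer function (the coefficients below are arbitrary): compute the poles with NumPy and test whether they all lie inside the unit circle.

```python
import numpy as np

# Illustrative H(z) = (1 + 0.5 z^-1) / (1 - 1.2 z^-1 + 0.5 z^-2)
b = np.array([1.0, 0.5])          # numerator coefficients
a = np.array([1.0, -1.2, 0.5])    # denominator coefficients

# For Σ a_j z^{-j}, the poles are the roots of a_0 z^N + a_1 z^{N-1} + ... + a_N
poles = np.roots(a)

print(poles)                        # complex-conjugate pair, |pole| ≈ 0.707
print(np.all(np.abs(poles) < 1))    # True -> the causal filter is stable
```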
1. Discrete Fourier Transform (DFT) (1)
‰ The Fourier transform of a sequence u(n) is obtained from the z-transform by setting z = exp(j2πf), where f is the real frequency variable.

ƒ When u(n) has finite duration, its Fourier representation leads to the discrete Fourier transform (DFT).

ƒ For numerical computation of the DFT, the efficient fast Fourier transform (FFT) is used.

‰ Let u(n) be a finite-duration sequence of length N. The DFT of u(n) is

U(k) = \sum_{n=0}^{N-1} u(n) \exp\!\left(-\frac{j 2\pi k n}{N}\right), \quad k = 0, \ldots, N-1    (1.12)

The inverse DFT (IDFT) of U(k) is

u(n) = \frac{1}{N} \sum_{k=0}^{N-1} U(k) \exp\!\left(\frac{j 2\pi k n}{N}\right), \quad n = 0, \ldots, N-1    (1.13)

u(n) and U(k) have the same length N → "N-point DFT".
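A short numerical sketch of (1.12)-(1.13) on an arbitrary length-4 sequence: the direct DFT sum agrees with NumPy's FFT, and the IDFT recovers the original samples.

```python
import numpy as np

u = np.array([1.0, 2.0, 0.0, -1.0])        # arbitrary length-4 sequence
N = len(u)
n = np.arange(N)

# Direct evaluation of (1.12)
U_direct = np.array([np.sum(u * np.exp(-1j * 2 * np.pi * k * n / N)) for k in range(N)])

# FFT gives the same N-point DFT, and ifft implements (1.13)
U_fft = np.fft.fft(u)
print(np.allclose(U_direct, U_fft))         # True
print(np.allclose(np.fft.ifft(U_fft), u))   # True
```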

Chapter 2:
Stochastic Processes and Models

‰ Review of Probability and Random Variables.

‰ Review of Stochastic Processes and Stochastic Models.

2. Review of Probability and Random Variables

‰ Axioms of Probability.

‰ Repeated Trials.

‰ Concepts of Random Variables.

‰ Functions of Random Variables.

‰ Moments and Conditional Statistics.

‰ Sequences of Random Variables.

2. Review of Probability Theory: Basics (1)
‰ Probability theory deals with the study of random phenomena, which
under repeated experiments yield different outcomes that have
certain underlying patterns about them. The notion of an experiment
assumes a set of repeatable conditions that allow any number of
identical repetitions. When an experiment is performed under these
conditions, certain elementary events ξi occur in different but
completely uncertain ways. We can assign nonnegative number P(ξi ),
as the probability of the event ξi in various ways:
Laplace's Classical Definition: The probability of an event A is defined a priori, without actual experimentation, as

P(A) = \frac{\text{Number of outcomes favorable to } A}{\text{Total number of possible outcomes}},

provided all these outcomes are equally likely.

2. Review of Probability Theory: Basics (2)
Relative Frequency Definition: The probability of an event A is defined
as
P(A) = \lim_{n \to \infty} \frac{n_A}{n},

where nA is the number of occurrences of A and n is the total number of


trials.

The axiomatic approach to probability, due to Kolmogorov, developed


through a set of axioms (below) is generally recognized as superior to the
above definitions, as it provides a solid foundation for complicated
applications.

2. Review of Probability Theory: Basics (3)
The totality of all ξ_i, known a priori, constitutes a set Ω, the set of all experimental outcomes:

Ω = {ξ_1, ξ_2, …, ξ_k, …}.

Ω has subsets A, B, C, …. Recall that if A is a subset of Ω, then ξ ∈ A implies ξ ∈ Ω. From A and B, we can generate other related subsets A ∪ B, A ∩ B, Ā, B̄, …:

A ∪ B = {ξ | ξ ∈ A or ξ ∈ B},
A ∩ B = {ξ | ξ ∈ A and ξ ∈ B},
and
Ā = {ξ | ξ ∉ A}.

2. Review of Probability Theory: Basics (4)

[Figure: Venn diagrams of A ∪ B, A ∩ B, and the complement Ā.]

If A ∩ B = φ (the empty set), then A and B are said to be mutually exclusive (M.E.).
A partition of Ω is a collection of mutually exclusive subsets of Ω such that their union is Ω:

A_i \cap A_j = \phi \ (i \neq j), \quad \text{and} \quad \bigcup_{i} A_i = \Omega.

[Figure: a partition A_1, A_2, …, A_n of Ω, and two mutually exclusive sets A, B with A ∩ B = φ.]
2. Review of Probability Theory: Basics (5)
De Morgan's Laws:

\overline{A \cup B} = \bar{A} \cap \bar{B}; \qquad \overline{A \cap B} = \bar{A} \cup \bar{B}

[Figure: Venn diagrams illustrating De Morgan's laws.]

Often it is meaningful to talk about at least some of the subsets of Ω as events, for which we must have a mechanism to compute their probabilities.

Example: Consider the experiment where two coins are tossed simultaneously. The various elementary events are

2. Review of Probability Theory: Basics (6)
ξ1 = (H, H ), ξ2 = (H, T ), ξ3 = (T , H ), ξ4 = (T , T )
and
Ω = { ξ 1 , ξ 2 , ξ 3 , ξ 4 }.
The subset A = { ξ 1 , ξ 2 , ξ 3 } is the same as “Head has occurred at least
once” and qualifies as an event.
Suppose two subsets A and B are both events, then consider
“Does an outcome belong to A or B = A∪ B ”
“Does an outcome belong to A and B = A∩ B ”
“Does an outcome fall outside A”?

2. Review of Probability Theory: Basics (7)
Thus the sets A ∪ B, A ∩ B, Ā, B̄, etc., also qualify as events. We shall formalize this using the notion of a field.

• Field: A collection of subsets of a nonempty set Ω forms a field F if
(i) Ω ∈ F
(ii) If A ∈ F, then Ā ∈ F
(iii) If A ∈ F and B ∈ F, then A ∪ B ∈ F.
Using (i)-(iii), it is easy to show that A ∩ B, Ā ∩ B, etc., also belong to F.

For example, from (ii) we have Ā ∈ F and B̄ ∈ F; using (iii), this gives Ā ∪ B̄ ∈ F; applying (ii) again, its complement \overline{\bar{A} \cup \bar{B}} = A ∩ B ∈ F, where we have used De Morgan's law from above.

2. Review of Probability Theory: Basics (8)
‰ Axioms of Probability
For any event A, we assign a number P(A), called the probability of the event A. This number satisfies the following three conditions, which act as the axioms of probability:

(i) P(A) ≥ 0 (probability is a nonnegative number)
(ii) P(Ω) = 1 (probability of the whole set is unity)
(iii) If A ∩ B = φ, then P(A ∪ B) = P(A) + P(B).

(Note that (iii) applies only when A and B are mutually exclusive (M.E.) events.)

2. Review of Probability Theory: Basics (9)
The following conclusions follow from these axioms:
a. P(A ∪ Ā) = P(A) + P(Ā) = 1, or P(Ā) = 1 − P(A).
b. P{φ} = 0.
c. If A and B are not mutually exclusive (M.E.), then P(A ∪ B) = P(A) + P(B) − P(AB).

‰ Conditional Probability and Independence

In N independent trials, suppose N_A, N_B, N_{AB} denote the number of times events A, B and AB occur, respectively. For large N,

P(A) \approx \frac{N_A}{N}, \quad P(B) \approx \frac{N_B}{N}, \quad P(AB) \approx \frac{N_{AB}}{N}.

Among the N_A occurrences of A, only N_{AB} of them are also found among the N_B occurrences of B. Thus the ratio

\frac{N_{AB}}{N_B} = \frac{N_{AB}/N}{N_B/N} = \frac{P(AB)}{P(B)}
2. Review of Probability Theory: Basics (10)
is a measure of “the event A given that B has already occurred”. We
denote this conditional probability by
P(A|B) = Probability of “the event A given that B has occurred”.
We define

P(A | B) = \frac{P(AB)}{P(B)},

provided P(B) ≠ 0. As we show below, this definition satisfies all the probability axioms discussed earlier.

Independence: A and B are said to be independent events if

P(AB) = P(A) \cdot P(B).

Then

P(A | B) = \frac{P(AB)}{P(B)} = \frac{P(A)\,P(B)}{P(B)} = P(A).

Thus, if A and B are independent, the event that B has occurred does not shed any more light on A: it makes no difference to A whether B has occurred or not.
2. Review of Probability Theory: Basics (11)
Let

A = A_1 \cup A_2 \cup A_3 \cup \cdots \cup A_n,    (*)

a union of n independent events. Then, by De Morgan's law,

\bar{A} = \bar{A}_1 \bar{A}_2 \cdots \bar{A}_n,

and, using their independence,

P(\bar{A}) = P(\bar{A}_1 \bar{A}_2 \cdots \bar{A}_n) = \prod_{i=1}^{n} P(\bar{A}_i) = \prod_{i=1}^{n} (1 - P(A_i)).

Thus, for any A as in (*),

P(A) = 1 - P(\bar{A}) = 1 - \prod_{i=1}^{n} (1 - P(A_i)),

a useful result.

2. Review of Probability Theory: Basics (12)
Bayes' theorem:

P(A | B) = \frac{P(B | A)}{P(B)} \cdot P(A)

Although simple enough, Bayes' theorem has an interesting interpretation: P(A) represents the a priori probability of the event A. Suppose B has occurred, and assume that A and B are not independent. How can this new information be used to update our knowledge about A? Bayes' rule above takes the new information ("B has occurred") into account and gives the a posteriori probability of A given B.

We can also view the event B as new knowledge obtained from a fresh experiment. We know something about A through P(A). The new information is available in terms of B, and it should be used to improve our knowledge/understanding of A. Bayes' theorem gives the exact mechanism for incorporating such new information.

2. Review of Probability Theory: Basics (13)
Let A_1, A_2, …, A_n be pairwise disjoint events (A_i A_j = φ for i ≠ j) whose union covers Ω. Then

P(B) = \sum_{i=1}^{n} P(B A_i) = \sum_{i=1}^{n} P(B | A_i) P(A_i).

This gives a more general version of Bayes' theorem:

P(A_i | B) = \frac{P(B | A_i) P(A_i)}{P(B)} = \frac{P(B | A_i) P(A_i)}{\sum_{j=1}^{n} P(B | A_j) P(A_j)}.

2. Review of Probability Theory: Basics (14)
Example: Three switches connected in parallel operate independently.
Each switch remains closed with probability p. (a) Find the probability of
receiving an input signal at the output. (b) Find the probability that switch S1
is open given that an input signal is received at the output.

[Figure: three switches s1, s2, s3 connected in parallel between the input and the output.]

Solution:
a. Let Ai = “Switch Si is closed”. Then P ( Ai ) = p , i = 1 → 3 .
Since switches operate independently, we have
P ( Ai A j ) = P ( Ai ) P ( A j ); P ( A1 A2 A3 ) = P ( A1 ) P ( A2 ) P ( A3 ).

2. Review of Probability Theory: Basics (15)
Let R = "input signal is received at the output". For the event R to occur, either switch 1 or switch 2 or switch 3 must remain closed, i.e.,

R = A_1 \cup A_2 \cup A_3.

Using the union-of-independent-events result derived above,

P(R) = P(A_1 \cup A_2 \cup A_3) = 1 - (1 - p)^3 = 3p - 3p^2 + p^3.

b. We need P(\bar{A}_1 | R). From Bayes' theorem,

P(\bar{A}_1 | R) = \frac{P(R | \bar{A}_1) P(\bar{A}_1)}{P(R)} = \frac{(2p - p^2)(1 - p)}{3p - 3p^2 + p^3} = \frac{2 - 3p + p^2}{3 - 3p + p^2}.

Because of the symmetry of the switches, we also have

P(\bar{A}_1 | R) = P(\bar{A}_2 | R) = P(\bar{A}_3 | R).
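A quick Monte Carlo sanity check of this example (a sketch; p = 0.3 is an arbitrary choice): simulate the three independent switches and estimate P(R) and P(Ā1 | R).

```python
import numpy as np

rng = np.random.default_rng(1)
p, n_trials = 0.3, 200_000

closed = rng.random((n_trials, 3)) < p        # switch i closed with probability p
R = closed.any(axis=1)                        # signal received if any switch is closed

P_R = R.mean()
P_A1open_given_R = (~closed[:, 0] & R).sum() / R.sum()

print(P_R, 3*p - 3*p**2 + p**3)                               # ~0.657 vs 0.657
print(P_A1open_given_R, (2 - 3*p + p**2) / (3 - 3*p + p**2))  # ~0.543 vs 0.543
```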

2. Review of Probability Theory: Repeated Trials (1)
‰ Consider two independent experiments with associated probability models
(Ω1, F1, P1) and (Ω2, F2, P2). Let ξ∈Ω1, η∈Ω2 represent elementary events.
A joint performance of the two experiments produces an elementary event ω = (ξ, η). How do we assign an appropriate probability to this "combined event"?
Consider the Cartesian product space Ω = Ω1× Ω2 generated from Ω1 and Ω2
such that if ξ ∈ Ω1 and η ∈ Ω2 , then every ω in Ω is an ordered pair of the
form ω = (ξ, η). To arrive at a probability model we need to define the
combined trio (Ω, F, P).
Suppose A∈F1 and B ∈ F2. Then A × B is the set of all pairs (ξ, η), where ξ
∈ A and η ∈ B. Any such subset of Ω appears to be a legitimate event for the
combined experiment. Let F denote the field composed of all such subsets A × B together with their unions and complements. In this combined
experiment, the probabilities of the events A × Ω2 and Ω1 × B are such that
P( A × Ω 2 ) = P1 ( A), P (Ω1 × B ) = P2 ( B ).
2. Review of Probability Theory: Repeated Trials (2)
Moreover, the events A × Ω2 and Ω1 × B are independent for any A ∈ F1 and
B ∈ F2 . Since
( A × Ω 2 ) ∩ (Ω1 × B ) = A × B,
then
P( A × B ) = P( A × Ω 2 ) ⋅ P(Ω1 × B ) = P1 ( A) P2 ( B )
for all A ∈ F1 and B ∈ F2 . This equation extends to a unique probability
measure P( ≡ P1 × P2 ) on the sets in F and defines the combined trio (Ω, F, P).

‰ Generalization: Given n experiments Ω_1, Ω_2, …, Ω_n and their associated F_i and P_i, i = 1 → n, let

Ω = Ω_1 × Ω_2 × ⋯ × Ω_n

represent their Cartesian product, whose elementary events are the ordered n-tuples (ξ_1, ξ_2, …, ξ_n), where ξ_i ∈ Ω_i. Events in this combined space are of the form

A_1 × A_2 × ⋯ × A_n,

where A_i ∈ F_i.

2. Review of Probability Theory: Repeated Trials (3)
If all these n experiments are independent, and P_i(A_i) is the probability of the event A_i in F_i, then as before

P(A_1 × A_2 × ⋯ × A_n) = P_1(A_1) P_2(A_2) ⋯ P_n(A_n).

Example: An event A has probability p of occurring in a single trial. Find the probability that A occurs exactly k times, k ≤ n, in n trials.

Solution: Let (Ω, F, P) be the probability model for a single trial. The outcome of n experiments is an n-tuple

ω = {ξ_1, ξ_2, …, ξ_n} ∈ Ω_0,

where every ξ_i ∈ Ω and Ω_0 = Ω × Ω × ⋯ × Ω. The event A occurs at trial #i if ξ_i ∈ A. Suppose A occurs exactly k times in ω. Then k of the ξ_i belong to A, say ξ_{i_1}, ξ_{i_2}, …, ξ_{i_k}, and the remaining n-k are contained in its complement Ā.

2. Review of Probability Theory: Repeated Trials (4)
Using independence, the probability of occurrence of such an ω is given by

P_0(\omega) = P(\{\xi_{i_1}, \xi_{i_2}, \ldots, \xi_{i_n}\}) = P(\{\xi_{i_1}\})\,P(\{\xi_{i_2}\}) \cdots P(\{\xi_{i_n}\}) = \underbrace{P(A)\cdots P(A)}_{k}\;\underbrace{P(\bar{A})\cdots P(\bar{A})}_{n-k} = p^k q^{n-k}

(with q = 1 − p). However, the k occurrences of A can occur at any particular locations inside ω. Let ω_1, ω_2, …, ω_N represent all such events in which A occurs exactly k times. Then

"A \text{ occurs exactly } k \text{ times in } n \text{ trials}" = \omega_1 \cup \omega_2 \cup \cdots \cup \omega_N.

But all these ω_i are mutually exclusive and equiprobable. Thus

P("A \text{ occurs exactly } k \text{ times in } n \text{ trials}") = \sum_{i=1}^{N} P_0(\omega_i) = N P_0(\omega) = N p^k q^{n-k},

2. Review of Probability Theory: Repeated Trials (5)
Recall that, starting with n possible choices, the first object can be chosen in n different ways, and for every such choice the second one in (n-1) ways, …, and the kth one in (n-k+1) ways; this gives n(n-1)⋯(n-k+1) ordered choices of k objects out of n. But this count includes the k! orderings of the same k objects, which are indistinguishable here. As a result,

N = \frac{n(n-1)\cdots(n-k+1)}{k!} = \frac{n!}{(n-k)!\,k!} = \binom{n}{k}

represents the number of combinations, or choices, of n objects taken k at a time. Thus we obtain the Bernoulli formula

P_n(k) = P("A \text{ occurs exactly } k \text{ times in } n \text{ trials}") = \binom{n}{k} p^k q^{n-k}, \quad k = 0, 1, 2, \ldots, n.
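A small numerical sketch of the Bernoulli formula (n = 12 and p = 0.5 are arbitrary illustrative values): evaluate the binomial coefficients directly and compare with scipy.stats.binom.

```python
from math import comb
from scipy.stats import binom

n, p = 12, 0.5
pmf_direct = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
pmf_scipy = binom.pmf(range(n + 1), n, p)

print(max(abs(a - b) for a, b in zip(pmf_direct, pmf_scipy)))   # ~0: both agree
print(sum(pmf_direct))                                          # 1.0: the probabilities sum to one
```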

2. Review of Probability Theory: Repeated Trials (6)
Independent repeated experiments of this nature, where the outcome is either a "success" (= A) or a "failure" (= Ā), are characterized as Bernoulli trials, and the probability of k successes in n trials is given by the Bernoulli formula, where p represents the probability of "success" in any one trial.

‰ Bernoulli trial: consists of repeated independent and identical experiments, each of which has only two outcomes, A or Ā, with P(A) = p and P(Ā) = q. The probability of exactly k occurrences of A in n such trials is given by the Bernoulli formula. Let

X_k = "exactly k occurrences in n trials".

Since the number of occurrences of A in n trials must be an integer k = 0, 1, 2, …, n, either X_0 or X_1 or X_2 or … or X_n must occur in such an experiment. Thus

P(X_0 \cup X_1 \cup \cdots \cup X_n) = 1.
2. Review of Probability Theory: Repeated Trials (7)
But the X_i, X_j are mutually exclusive. Thus

P(X_0 \cup X_1 \cup \cdots \cup X_n) = \sum_{k=0}^{n} P(X_k) = \sum_{k=0}^{n} \binom{n}{k} p^k q^{n-k} = (p+q)^n = 1.

For a given n and p, what is the most likely value of k?

[Figure: the probabilities P_n(k) versus k for n = 12, p = 1/2.]

From the figure, the most probable value of k is the number that maximizes P_n(k) in the Bernoulli formula. To obtain this value, consider the ratio

\frac{P_n(k-1)}{P_n(k)} = \frac{n!\, p^{k-1} q^{n-k+1}}{(n-k+1)!\,(k-1)!} \cdot \frac{(n-k)!\,k!}{n!\, p^k q^{n-k}} = \frac{k}{n-k+1} \cdot \frac{q}{p}.

2. Review of Probability Theory: Repeated Trials (8)

From this ratio, P_n(k) ≥ P_n(k-1) if kq ≤ (n-k+1)p, i.e., k(1-p) ≤ (n-k+1)p, or k ≤ (n+1)p. Hence P_n(k), as a function of k, increases until

k = (n+1)p,

if (n+1)p is an integer, or up to the largest integer k_max less than (n+1)p otherwise; this value represents the most likely number of successes (or heads) in n trials.
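A quick check of this rule (a sketch; the (n, p) pairs are arbitrary and chosen to avoid ties): the k that maximizes the binomial pmf coincides with ⌊(n+1)p⌋.

```python
import math
from scipy.stats import binom

for n, p in [(12, 0.5), (20, 0.3), (7, 0.9)]:
    k_argmax = int(binom.pmf(range(n + 1), n, p).argmax())   # k maximizing P_n(k)
    print(k_argmax, math.floor((n + 1) * p))                 # the two values agree
```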

2. Review of Probability Theory: Random Variables (1)
‰ Let (Ω, F, P) be a probability model for an experiment, and X a function that maps every ξ ∈ Ω to a unique point x ∈ R, the set of real numbers. Since the outcome ξ is not certain, neither is the value X(ξ) = x. Thus, if B is some subset of R, we may want to determine the probability of "X(ξ) ∈ B". To determine this probability, we can look at the set A = X^{-1}(B) ⊂ Ω that contains all ξ ∈ Ω that map into B under the function X.

[Figure: the mapping X from Ω to R; the set A ⊂ Ω maps into the subset B ⊂ R.]

Obviously, if the set A = X^{-1}(B) also belongs to the associated field F, then it is an event and the probability of A is well defined; in that case we can say

Probability of the event "X(ξ) ∈ B" = P(X^{-1}(B)).

2. Review of Probability Theory: R. V (2)
However, X^{-1}(B) may not always belong to F for all B, thus creating difficulties. The notion of a random variable (r.v) makes sure that the inverse mapping always results in an event, so that we are able to determine the probability for any B ⊂ R.

‰ Random Variable (r.v): A finite single valued function X(.) that maps
the set of all experimental outcomes Ω into the set of real numbers R is said
to be a r.v, if the set { ζ | X(ζ) ≤ x } is an event ∈ F for every x in R.

Alternatively, X is said to be a r.v if X^{-1}(B) ∈ F, where B represents semi-infinite intervals of the form {-∞ < x ≤ a} and all other sets that can be constructed from these sets by performing the set operations of union, intersection and negation any number of times. Thus, if X is a r.v, then

{ξ | X(ξ) ≤ x} = {X ≤ x}

is an event for every x.

2. Review of Probability Theory: R. V (3)
What about {a < X ≤ b} and {X = a}? Are they also events?

In fact, with b > a, since {X ≤ a} and {X ≤ b} are events, {X ≤ a}^c = {X > a} is an event, and hence

{X > a} \cap \{X \leq b\} = \{a < X \leq b\}

is also an event. Thus {a - 1/n < X ≤ a} is an event for every n. Consequently,

\bigcap_{n=1}^{\infty} \left\{ a - \frac{1}{n} < X \leq a \right\} = \{X = a\}

is also an event. All events have well-defined probability. Thus the probability of the event {ξ | X(ξ) ≤ x} must depend on x. Denote

P\{\xi \mid X(\xi) \leq x\} = F_X(x) \geq 0,

which is referred to as the Probability Distribution Function (PDF) associated with the r.v X.
2. Review of Probability Theory: R. V (4)
‰ Distribution Function: If g(x) is a distribution function, then
(i) g(+∞) = 1; g(-∞) = 0
(ii) if x1 < x2, then g(x1) ≤ g(x2)
(iii) g(x+) = g(x) for all x.
It is shown that the PDF FX(x) satisfies these properties for any r.v X.

Additional Properties of a PDF

(iv) if FX(x0) = 0 for some x0, then FX(x) = 0 for x ≤ x0


(v) P{ X (ξ ) > x } = 1 − FX ( x ).
(vi) P{ x1 < X (ξ ) ≤ x2 } = FX ( x2 ) − FX ( x1 ), x2 > x1.
(vii) P ( X (ξ ) = x ) = FX ( x ) − FX ( x − ).

2. Review of Probability Theory: R. V (5)
X is said to be a continuous-type r.v if its distribution function FX(x) is
continuous. In that case FX(x -) = FX(x) for all x, and from property (vii) we
get P{X = x} = 0.

If FX(x) is constant except for a finite number of jump discontinuities


(piece-wise constant; step-type), then X is said to be a discrete-type r.v.
If xi is such a discontinuity point, then from (vii)

pi = P{X = xi } = FX ( xi ) − FX ( xi− ).

2. Review of Probability Theory: R. V (6)
Example: X is a r.v such that X(ζ) = c, ζ ∈ Ω. Find FX(x).
Solution: For x < c, {X(ξ) ≤ x} = {φ}, so that FX(x) = 0; and for x ≥ c, {X(ξ) ≤ x} = Ω, so that FX(x) = 1.

[Figure: the two staircase distribution functions for this and the next example — a unit step at x = c, and a step of height q = 1 − p at x = 0 followed by a jump to 1 at x = 1.]

Example: Toss a coin. Ω = {H, T}. Suppose the r.v X is such that X(T) = 0,
X(H) = 1. Find FX(x).
Solution: For x < 0, {X(ζ) ≤ x } = {φ}, so that FX(x) = 0.
0 ≤ x < 1, { X (ξ ) ≤ x } = { T }, so that FX ( x) = P{ T } = 1 − p,
x ≥ 1, { X (ξ ) ≤ x } = { H , T } = Ω, so that FX ( x) = 1.
2. Review of Probability Theory: R. V (7)
Example: A fair coin is tossed twice, and let the r.v X represent the number
of heads. Find FX(x).
Solution: In this case Ω = {HH, HT, TH, TT}, and X(HH) = 2, X(HT) =
X(TH) = 1, X(TT) = 0.
x < 0:      {X(ξ) ≤ x} = φ            ⇒ FX(x) = 0,
0 ≤ x < 1:  {X(ξ) ≤ x} = {TT}          ⇒ FX(x) = P{TT} = P(T)P(T) = 1/4,
1 ≤ x < 2:  {X(ξ) ≤ x} = {TT, HT, TH}  ⇒ FX(x) = P{TT, HT, TH} = 3/4,
x ≥ 2:      {X(ξ) ≤ x} = Ω             ⇒ FX(x) = 1.

P{X = 1} = FX(1) − FX(1⁻) = 3/4 − 1/4 = 1/2.

[Figure: the staircase FX(x) with jumps of 1/4 at x = 0, 1/2 at x = 1, and 1/4 at x = 2.]
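A minimal simulation sketch of this example: draw many pairs of fair coin tosses, count heads, and compare the empirical values of P{X = 1} and FX with the results above.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(100_000, 2)).sum(axis=1)   # number of heads in two fair tosses

print((X == 1).mean())                     # ~0.5  = P{X = 1}
print((X <= 0).mean(), (X <= 1).mean())    # ~0.25, ~0.75 = FX(0), FX(1)
```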
2. Review of Probability Theory: R. V (8)
‰ Probability Density Function (p.d.f): The derivative of the distribution function FX(x) is called the probability density function fX(x) of the r.v X. Thus

f_X(x) = \frac{dF_X(x)}{dx}.

Since

\frac{dF_X(x)}{dx} = \lim_{\Delta x \to 0} \frac{F_X(x + \Delta x) - F_X(x)}{\Delta x} \geq 0,

from the monotone nondecreasing nature of FX(x), it follows that fX(x) ≥ 0 for all x. fX(x) is a continuous function if X is a continuous-type r.v. However, if X is a discrete-type r.v, then its p.d.f has the general form

f_X(x) = \sum_i p_i \, \delta(x - x_i),

[Figure: a train of impulses of weights p_i located at the points x_i.]

2. Review of Probability Theory: R. V (9)
where the x_i represent the jump-discontinuity points in FX(x). Here fX(x) represents a collection of positive discrete masses, and it is known as the probability mass function (p.m.f) in the discrete case.

We also obtain

F_X(x) = \int_{-\infty}^{x} f_X(u) \, du.

Since FX(+∞) = 1, this yields

\int_{-\infty}^{+\infty} f_X(x) \, dx = 1,

which justifies the name density function. Further, we also get

P\{x_1 < X(\xi) \leq x_2\} = F_X(x_2) - F_X(x_1) = \int_{x_1}^{x_2} f_X(x) \, dx.

2. Review of Probability Theory: R. V (10)
Thus the area under fX(x) over the interval (x1, x2) represents the probability P{x1 < X(ξ) ≤ x2} in the above equation.

[Figure: (a) the distribution function FX(x); (b) the density fX(x), with the shaded area between x1 and x2 equal to P{x1 < X(ξ) ≤ x2}.]

Often, r.vs are referred to by their specific density functions, both in the continuous and discrete cases, and in what follows we shall list a number of them in each category.

2. Review of Probability Theory: R. V (11)
‰ Continuous-type Random Variables
1. Normal (Gaussian): X is said to be a normal or Gaussian r.v if

f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-(x-\mu)^2 / 2\sigma^2}.

This is a bell-shaped curve, symmetric around the parameter μ, and its distribution function is given by

F_X(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi\sigma^2}} e^{-(y-\mu)^2 / 2\sigma^2} \, dy = G\!\left(\frac{x-\mu}{\sigma}\right),

where G(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}} e^{-y^2/2} \, dy is often tabulated. Since fX(x) depends on the two parameters μ and σ², the notation X ∼ N(μ, σ²) will be used to represent fX(x).

[Figure: the bell-shaped density fX(x) centered at x = μ.]
2. Review of Probability Theory: R. V (12)
2. Uniform: X ∼ U(a, b), a < b, if

f_X(x) = \begin{cases} \dfrac{1}{b-a}, & a \leq x \leq b, \\ 0, & \text{otherwise.} \end{cases}

3. Exponential: X ∼ ε(λ), if

f_X(x) = \begin{cases} \dfrac{1}{\lambda} e^{-x/\lambda}, & x \geq 0, \\ 0, & \text{otherwise.} \end{cases}

4. Gamma: X ∼ G(α, β), α > 0, β > 0, if

f_X(x) = \begin{cases} \dfrac{x^{\alpha-1}}{\Gamma(\alpha)\,\beta^{\alpha}} e^{-x/\beta}, & x \geq 0, \\ 0, & \text{otherwise.} \end{cases}

If α = n, an integer, then Γ(n) = (n-1)!.
2. Review of Probability Theory: R. V (13)
5. Beta: X ∼ β(a, b), a > 0, b > 0, if

f_X(x) = \begin{cases} \dfrac{1}{\beta(a,b)} x^{a-1} (1-x)^{b-1}, & 0 < x < 1, \\ 0, & \text{otherwise,} \end{cases}

where the Beta function β(a, b) is defined as

\beta(a, b) = \int_{0}^{1} u^{a-1} (1-u)^{b-1} \, du.

6. Chi-square: X ∼ χ²(n), if

f_X(x) = \begin{cases} \dfrac{1}{2^{n/2}\,\Gamma(n/2)} x^{n/2-1} e^{-x/2}, & x \geq 0, \\ 0, & \text{otherwise.} \end{cases}

Note that χ²(n) is the same as Gamma(n/2, 2).

2. Review of Probability Theory: R. V (14)
7. Rayleigh: X ∼ R(σ²), if

f_X(x) = \begin{cases} \dfrac{x}{\sigma^2} e^{-x^2/2\sigma^2}, & x \geq 0, \\ 0, & \text{otherwise.} \end{cases}

8. Nakagami-m distribution:

f_X(x) = \begin{cases} \dfrac{2}{\Gamma(m)} \left(\dfrac{m}{\Omega}\right)^{m} x^{2m-1} e^{-mx^2/\Omega}, & x \geq 0, \\ 0, & \text{otherwise.} \end{cases}

9. Cauchy: X ∼ C(α, μ), if

f_X(x) = \frac{\alpha/\pi}{\alpha^2 + (x-\mu)^2}, \quad -\infty < x < +\infty.
2. Review of Probability Theory: R. V (15)
10. Laplace:

f_X(x) = \frac{1}{2\lambda} e^{-|x|/\lambda}, \quad -\infty < x < +\infty.

11. Student's t-distribution with n degrees of freedom:

f_T(t) = \frac{\Gamma((n+1)/2)}{\sqrt{\pi n}\,\Gamma(n/2)} \left(1 + \frac{t^2}{n}\right)^{-(n+1)/2}, \quad -\infty < t < +\infty.

12. Fisher's F-distribution:

f_Z(z) = \begin{cases} \dfrac{\Gamma\{(m+n)/2\}\, m^{m/2} n^{n/2}\, z^{m/2-1}}{\Gamma(m/2)\,\Gamma(n/2)\,(n+mz)^{(m+n)/2}}, & z \geq 0, \\ 0, & \text{otherwise.} \end{cases}

2. Review of Probability Theory: R. V (16)
‰ Discrete-type Random Variables

1. Bernoulli: X takes the values (0, 1), and

P(X = 0) = q, \quad P(X = 1) = p.

2. Binomial: X ∼ B(n, p), if

P(X = k) = \binom{n}{k} p^k q^{n-k}, \quad k = 0, 1, 2, \ldots, n.

3. Poisson: X ∼ P(λ), if

P(X = k) = e^{-\lambda} \frac{\lambda^k}{k!}, \quad k = 0, 1, 2, \ldots, \infty.

4. Discrete Uniform:

P(X = k) = \frac{1}{N}, \quad k = 1, 2, \ldots, N.
2. Review of Probability Theory: Function of R. V (1)
‰ Let X be a r.v defined on the model (Ω, F, P) and suppose g(x) is a function of the variable x. Define

Y = g(X).

Is Y necessarily a r.v? If so, what are its PDF FY(y) and p.d.f fY(y)?

Consider some of the following functions to illustrate the technical details.

Example 1: Y = aX + b
ƒ Suppose a > 0. Then

F_Y(y) = P(Y(\xi) \leq y) = P(aX(\xi) + b \leq y) = P\!\left(X(\xi) \leq \frac{y-b}{a}\right) = F_X\!\left(\frac{y-b}{a}\right),

and

f_Y(y) = \frac{1}{a} f_X\!\left(\frac{y-b}{a}\right).

2. Review of Probability Theory: Function of R. V (2)
ƒ If a < 0, then

F_Y(y) = P(Y(\xi) \leq y) = P(aX(\xi) + b \leq y) = P\!\left(X(\xi) > \frac{y-b}{a}\right) = 1 - F_X\!\left(\frac{y-b}{a}\right),

and hence

f_Y(y) = -\frac{1}{a} f_X\!\left(\frac{y-b}{a}\right).

ƒ For all a:

f_Y(y) = \frac{1}{|a|} f_X\!\left(\frac{y-b}{a}\right).

2. Review of Probability Theory: Function of R. V (3)
Example 2: Y = X²

F_Y(y) = P(Y(\xi) \leq y) = P(X^2(\xi) \leq y).

If y < 0, then the event {X²(ξ) ≤ y} = φ, and hence

F_Y(y) = 0, \quad y < 0.

For y > 0, the event {Y(ξ) ≤ y} = {X²(ξ) ≤ y} is equivalent to {x_1 < X(ξ) ≤ x_2} with x_1 = -√y and x_2 = +√y. Hence

F_Y(y) = P(x_1 < X(\xi) \leq x_2) = F_X(x_2) - F_X(x_1) = F_X(\sqrt{y}) - F_X(-\sqrt{y}), \quad y > 0.

By direct differentiation, we get

f_Y(y) = \begin{cases} \dfrac{1}{2\sqrt{y}} \left( f_X(\sqrt{y}) + f_X(-\sqrt{y}) \right), & y > 0, \\ 0, & \text{otherwise.} \end{cases}

2. Review of Probability Theory: Function of R. V (4)
If fX(x) is an even function, then fY(y) reduces to

f_Y(y) = \frac{1}{\sqrt{y}} f_X(\sqrt{y}) \, U(y).

In particular, if X ∼ N(0, 1), so that

f_X(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2},

then we obtain the p.d.f of Y = X² to be

f_Y(y) = \frac{1}{\sqrt{2\pi y}} e^{-y/2} U(y).

Notice that this represents a Chi-square r.v with n = 1, since Γ(1/2) = √π. Thus, if X is a Gaussian r.v with μ = 0, then Y = X² represents a Chi-square r.v with one degree of freedom (n = 1).
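A short simulation sketch of this result: square standard normal samples and test them against the χ²(1) distribution using SciPy.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
y = rng.standard_normal(100_000) ** 2          # Y = X^2 with X ~ N(0, 1)

# Kolmogorov-Smirnov test against the chi-square distribution with 1 degree of freedom
stat, pvalue = stats.kstest(y, stats.chi2(df=1).cdf)
print(stat, pvalue)    # small statistic / large p-value: consistent with chi2(1)
```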

2. Review of Probability Theory: Function of R. V (5)
Example 3: Let

Y = g(X) = \begin{cases} X - c, & X > c, \\ 0, & -c < X \leq c, \\ X + c, & X \leq -c. \end{cases}

[Figure: the piecewise-linear characteristic g(X), equal to zero on the dead zone (-c, c].]

In this case,

P(Y = 0) = P(-c < X(\xi) \leq c) = F_X(c) - F_X(-c).

For y > 0 we have x > c and Y(ξ) = X(ξ) - c, so that

F_Y(y) = P(Y(\xi) \leq y) = P(X(\xi) - c \leq y) = P(X(\xi) \leq y + c) = F_X(y + c), \quad y > 0.

Similarly, for y < 0 we have x ≤ -c and Y(ξ) = X(ξ) + c, so that

F_Y(y) = P(Y(\xi) \leq y) = P(X(\xi) + c \leq y) = P(X(\xi) \leq y - c) = F_X(y - c), \quad y < 0.

Thus

f_Y(y) = \begin{cases} f_X(y + c), & y > 0, \\ [F_X(c) - F_X(-c)]\,\delta(y), & y = 0, \\ f_X(y - c), & y < 0. \end{cases}
2. Review of Probability Theory: Function of R. V (6)
Example 4: Half-wave rectifier

Y = g(X); \quad g(x) = \begin{cases} x, & x > 0, \\ 0, & x \leq 0. \end{cases}

In this case,

P(Y = 0) = P(X(\xi) \leq 0) = F_X(0),

and for y > 0, since Y = X,

F_Y(y) = P(Y(\xi) \leq y) = P(X(\xi) \leq y) = F_X(y).

Thus

f_Y(y) = \begin{cases} f_X(y), & y > 0, \\ F_X(0)\,\delta(y), & y = 0, \\ 0, & y < 0, \end{cases} \;=\; f_X(y)\,U(y) + F_X(0)\,\delta(y).
2. Review of Probability Theory: Function of R. V (7)
‰ Note: As a general approach, given Y = g(X), first sketch the graph
y = g(x), and determine the range space of y. Suppose a < y < b is the range
space of y = g(x). Then clearly for y < a, FY(y) = 0, and for y > b, FY(y) = 1,
so that FY(y) can be nonzero only in a < y < b. Next, determine whether there
are discontinuities in the range space of y. If so, evaluate P(Y(ζ) = yi) at these
discontinuities. In the continuous region of y, use the basic approach
FY ( y ) = P (g ( X (ξ )) ≤ y )

and determine appropriate events in terms of the r.v X for every y. Finally,
we must have FY(y) for -∞ < y < + ∞ and obtain
f_Y(y) = \frac{dF_Y(y)}{dy} \quad \text{in } a < y < b.

2. Review of Probability Theory: Function of R. V (8)
‰ However, if Y = g(X) is a continuous function, it is easy to obtain fY(y) directly as

f_Y(y) = \sum_i \frac{1}{|dy/dx|_{x_i}} f_X(x_i) = \sum_i \frac{1}{|g'(x_i)|} f_X(x_i). \qquad (♦)

The summation index i in this equation depends on y: for every y, the equation y = g(x_i) must be solved to obtain the total number of solutions and the actual solutions x_1, x_2, …, all in terms of y.

For example, if Y = X², then for all y > 0, x_1 = -√y and x_2 = +√y represent the two solutions for each y. Notice that the solutions x_i are all in terms of y, so that the right side of (♦) is a function of y only. Moreover,

\frac{dy}{dx} = 2x, \quad \text{so that} \quad \left|\frac{dy}{dx}\right|_{x = x_i} = 2\sqrt{y}.

2. Review of Probability Theory: Function of R. V (9)
Using (♦), we obtain

f_Y(y) = \begin{cases} \dfrac{1}{2\sqrt{y}}\left( f_X(\sqrt{y}) + f_X(-\sqrt{y}) \right), & y > 0, \\ 0, & \text{otherwise,} \end{cases}

which agrees with the result in Example 2.

Example 5: Y = 1/X. Find fY(y).

Here, for every y, x_1 = 1/y is the only solution, and

\frac{dy}{dx} = -\frac{1}{x^2}, \quad \text{so that} \quad \left|\frac{dy}{dx}\right|_{x = x_1} = \frac{1}{1/y^2} = y^2.

Then, from (♦), we obtain

f_Y(y) = \frac{1}{y^2} f_X\!\left(\frac{1}{y}\right).
2. Review of Probability Theory: Function of R. V (10)
In particular, suppose X is a Cauchy r.v with parameter α, so that

f_X(x) = \frac{\alpha/\pi}{\alpha^2 + x^2}, \quad -\infty < x < +\infty.

In this case, Y = 1/X has the p.d.f

f_Y(y) = \frac{1}{y^2} \cdot \frac{\alpha/\pi}{\alpha^2 + (1/y)^2} = \frac{(1/\alpha)/\pi}{(1/\alpha)^2 + y^2}, \quad -\infty < y < +\infty.

But this represents the p.d.f of a Cauchy r.v with parameter 1/α. Thus, if X ∼ C(α), then 1/X ∼ C(1/α).

Example 6: Suppose f_X(x) = 2x/π², 0 < x < π, and Y = sin X. Determine fY(y).

Since X has zero probability of falling outside the interval (0, π), y = sin x has zero probability of falling outside the interval (0, 1). Clearly fY(y) = 0 outside this interval.

2. Review of Probability Theory: Function of R. V (11)
For any 0 < y < 1, the equation y = sin x has an infinite number of solutions …, x_1, x_2, x_3, …, where x_1 = sin⁻¹ y is the principal solution. Moreover, using the symmetry, we also get x_2 = π − x_1, etc. Further,

\frac{dy}{dx} = \cos x = \sqrt{1 - \sin^2 x} = \sqrt{1 - y^2},

so that

\left|\frac{dy}{dx}\right|_{x = x_i} = \sqrt{1 - y^2}.

[Figure: (a) the density fX(x) on (0, π); (b) the curve y = sin x with the solutions x_{-1}, x_1, x_2, x_3, … of y = sin x marked.]
2. Review of Probability Theory: Function of R. V (12)
From (♦), we obtain, for 0 < y < 1,

f_Y(y) = \sum_{\substack{i=-\infty \\ i \neq 0}}^{+\infty} \frac{1}{\sqrt{1 - y^2}} f_X(x_i).

But from the figure, in this case fX(-x_1) = fX(x_3) = fX(x_4) = … = 0 (except for fX(x_1) and fX(x_2), the rest are all zero). Thus

f_Y(y) = \frac{1}{\sqrt{1 - y^2}} \left( f_X(x_1) + f_X(x_2) \right) = \frac{1}{\sqrt{1 - y^2}} \left( \frac{2x_1}{\pi^2} + \frac{2x_2}{\pi^2} \right) = \frac{2(x_1 + \pi - x_1)}{\pi^2 \sqrt{1 - y^2}} = \begin{cases} \dfrac{2}{\pi \sqrt{1 - y^2}}, & 0 < y < 1, \\ 0, & \text{otherwise.} \end{cases}

[Figure: the density fY(y) = 2/(π√(1 − y²)) on (0, 1), which grows without bound as y → 1.]
2. Review of Probability Theory: Function of R. V (13)
‰ Functions of a discrete-type r.v

Suppose X is a discrete-type r.v with P(X = x_i) = p_i, x = x_1, x_2, …, x_i, …, and Y = g(X). Clearly Y is also of discrete type: when x = x_i, y_i = g(x_i), and for those y_i,

P(Y = y_i) = P(X = x_i) = p_i, \quad y = y_1, y_2, \ldots, y_i, \ldots

Example 7: Suppose X ∼ P(λ), so that

P(X = k) = e^{-\lambda} \frac{\lambda^k}{k!}, \quad k = 0, 1, 2, \ldots

Define Y = X² + 1. Find the p.m.f of Y.

X takes the values 0, 1, 2, …, k, …, so Y only takes the values 1, 2, 5, …, k²+1, …, and P(Y = k²+1) = P(X = k). Hence, for j = k²+1,

P(Y = j) = P\!\left(X = \sqrt{j-1}\right) = e^{-\lambda} \frac{\lambda^{\sqrt{j-1}}}{(\sqrt{j-1})!}, \quad j = 1, 2, 5, \ldots, k^2 + 1, \ldots

2. Review of Probability Theory: Moments… (1)
‰ Mean or Expected Value of a r.v X is defined as

\eta_X = \bar{X} = E(X) = \int_{-\infty}^{+\infty} x \, f_X(x) \, dx.

If X is a discrete-type r.v, then we get

\eta_X = \bar{X} = E(X) = \int x \sum_i p_i \delta(x - x_i) \, dx = \sum_i x_i p_i \underbrace{\int \delta(x - x_i) \, dx}_{1} = \sum_i x_i p_i = \sum_i x_i P(X = x_i).

The mean represents the average value of the r.v in a very large number of trials. For example, if X ∼ U(a, b), then

E(X) = \int_a^b \frac{x}{b-a} \, dx = \frac{1}{b-a} \left. \frac{x^2}{2} \right|_a^b = \frac{b^2 - a^2}{2(b-a)} = \frac{a+b}{2}

is the midpoint of the interval (a, b).
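A tiny numerical sketch of this mean calculation (a = 2, b = 5 are arbitrary): integrate x f_X(x) with SciPy and compare against (a + b)/2 and a sample average.

```python
import numpy as np
from scipy import integrate

a, b = 2.0, 5.0
mean_quad, _ = integrate.quad(lambda x: x / (b - a), a, b)    # ∫ x f_X(x) dx over [a, b]
mean_mc = np.random.default_rng(0).uniform(a, b, 100_000).mean()

print(mean_quad, mean_mc, (a + b) / 2)    # all ≈ 3.5
```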

2. Review of Probability Theory: Moments… (2)
On the other hand, if X is exponential with parameter λ, then

E(X) = \int_0^{\infty} \frac{x}{\lambda} e^{-x/\lambda} \, dx = \lambda \int_0^{\infty} y e^{-y} \, dy = \lambda,

implying that the parameter λ represents the mean value of the exponential r.v.

Similarly, if X is Poisson with parameter λ, we get

E(X) = \sum_{k=0}^{\infty} k P(X = k) = \sum_{k=0}^{\infty} k e^{-\lambda} \frac{\lambda^k}{k!} = e^{-\lambda} \sum_{k=1}^{\infty} k \frac{\lambda^k}{k!} = e^{-\lambda} \sum_{k=1}^{\infty} \frac{\lambda^k}{(k-1)!} = \lambda e^{-\lambda} \sum_{i=0}^{\infty} \frac{\lambda^i}{i!} = \lambda e^{-\lambda} e^{\lambda} = \lambda.

Thus the parameter λ also represents the mean of the Poisson r.v.

2. Review of Probability Theory: Moments… (3)
In a similar manner, if X is binomial, then its mean is given by

E(X) = \sum_{k=0}^{n} k P(X = k) = \sum_{k=0}^{n} k \binom{n}{k} p^k q^{n-k} = \sum_{k=1}^{n} \frac{n!}{(n-k)!\,(k-1)!} p^k q^{n-k} = np \sum_{i=0}^{n-1} \frac{(n-1)!}{(n-i-1)!\,i!} p^i q^{n-i-1} = np (p+q)^{n-1} = np.

Thus np represents the mean of the binomial r.v.

For the normal r.v,

E(X) = \frac{1}{\sqrt{2\pi\sigma^2}} \int_{-\infty}^{+\infty} x \, e^{-(x-\mu)^2/2\sigma^2} \, dx = \frac{1}{\sqrt{2\pi\sigma^2}} \int_{-\infty}^{+\infty} (y + \mu) \, e^{-y^2/2\sigma^2} \, dy
     = \underbrace{\frac{1}{\sqrt{2\pi\sigma^2}} \int_{-\infty}^{+\infty} y \, e^{-y^2/2\sigma^2} \, dy}_{0} + \mu \cdot \underbrace{\frac{1}{\sqrt{2\pi\sigma^2}} \int_{-\infty}^{+\infty} e^{-y^2/2\sigma^2} \, dy}_{1} = \mu.

2. Review of Probability Theory: Moments… (4)
Given X ∼ fX(x), suppose Y = g(X) defines a new r.v with p.d.f fY(y). The new r.v Y has a mean μ_Y given by

\mu_Y = E(Y) = E(g(X)) = \int_{-\infty}^{+\infty} y \, f_Y(y) \, dy = \int_{-\infty}^{+\infty} g(x) \, f_X(x) \, dx.

In the discrete case,

E(Y) = \sum_i g(x_i) P(X = x_i).

From these equations, fY(y) is not required in order to evaluate E(Y) for Y = g(X).

2. Review of Probability Theory: Moments… (5)
Example: Determine the mean of Y = X², where X is a Poisson r.v.

E(X^2) = \sum_{k=0}^{\infty} k^2 P(X = k) = \sum_{k=0}^{\infty} k^2 e^{-\lambda} \frac{\lambda^k}{k!} = e^{-\lambda} \sum_{k=1}^{\infty} k \frac{\lambda^k}{(k-1)!} = e^{-\lambda} \sum_{i=0}^{\infty} (i+1) \frac{\lambda^{i+1}}{i!}
       = \lambda e^{-\lambda} \left( \sum_{i=0}^{\infty} i \frac{\lambda^i}{i!} + \sum_{i=0}^{\infty} \frac{\lambda^i}{i!} \right) = \lambda e^{-\lambda} \left( \sum_{i=1}^{\infty} \frac{\lambda^i}{(i-1)!} + e^{\lambda} \right) = \lambda e^{-\lambda} \left( \lambda \sum_{m=0}^{\infty} \frac{\lambda^m}{m!} + e^{\lambda} \right) = \lambda e^{-\lambda} (\lambda e^{\lambda} + e^{\lambda}) = \lambda^2 + \lambda.

In general, E(X^k) is known as the kth moment of the r.v X. Thus, if X ∼ P(λ), its second moment is given by the above equation.

2. Review of Probability Theory: Moments… (6)
‰ For a r.v X with mean μ, X − μ represents the deviation of the r.v from its mean. Since this deviation can be either positive or negative, consider the quantity (X − μ)², whose average value E[(X − μ)²] represents the average mean-square deviation of X around its mean. Define

\sigma_X^2 = E[(X - \mu)^2] > 0.

With g(X) = (X − μ)², we get

\sigma_X^2 = \int_{-\infty}^{+\infty} (x - \mu)^2 f_X(x) \, dx > 0,

where σ_X² is known as the variance of the r.v X, and its square root σ_X = \sqrt{E[(X - \mu)^2]} is known as the standard deviation of X. Note that the standard deviation represents the root-mean-square spread of the r.v X around its mean μ.

2. Review of Probability Theory: Moments… (7)
Alternatively, the variance can be calculated by

\mathrm{Var}(X) = \sigma_X^2 = \int_{-\infty}^{+\infty} (x^2 - 2x\mu + \mu^2) f_X(x) \, dx = \int_{-\infty}^{+\infty} x^2 f_X(x) \, dx - 2\mu \int_{-\infty}^{+\infty} x f_X(x) \, dx + \mu^2 = E(X^2) - \mu^2 = E(X^2) - [E(X)]^2 = \overline{X^2} - \bar{X}^2.

Example: Determine the variance of the Poisson r.v:

\sigma_X^2 = \overline{X^2} - \bar{X}^2 = (\lambda^2 + \lambda) - \lambda^2 = \lambda.

Thus, for a Poisson r.v, the mean and variance are both equal to its parameter λ.
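A one-line sanity check by simulation (λ = 4 is an arbitrary value): the sample mean and sample variance of Poisson draws both approach λ.

```python
import numpy as np

lam = 4.0
x = np.random.default_rng(0).poisson(lam, 1_000_000)
print(x.mean(), x.var())    # both ≈ 4.0 = λ
```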

2. Review of Probability Theory: Moments… (8)
Example: Determine the variance of the normal r.v N(μ, σ²).

We have

\mathrm{Var}(X) = E[(X-\mu)^2] = \int_{-\infty}^{+\infty} (x-\mu)^2 \frac{1}{\sqrt{2\pi\sigma^2}} e^{-(x-\mu)^2/2\sigma^2} \, dx.

Use the identity

\int_{-\infty}^{+\infty} f_X(x)\,dx = \int_{-\infty}^{+\infty} \frac{1}{\sqrt{2\pi\sigma^2}} e^{-(x-\mu)^2/2\sigma^2} \, dx = 1

for the normal p.d.f. This gives

\int_{-\infty}^{+\infty} e^{-(x-\mu)^2/2\sigma^2} \, dx = \sqrt{2\pi}\,\sigma.

Differentiating both sides with respect to σ, we get

\int_{-\infty}^{+\infty} \frac{(x-\mu)^2}{\sigma^3} e^{-(x-\mu)^2/2\sigma^2} \, dx = \sqrt{2\pi},

or

\int_{-\infty}^{+\infty} (x-\mu)^2 \frac{1}{\sqrt{2\pi\sigma^2}} e^{-(x-\mu)^2/2\sigma^2} \, dx = \sigma^2,

so that Var(X) = σ² for the normal r.v N(μ, σ²).
2. Review of Probability Theory: Moments… (9)
‰ Moments: In general,

m_n = \overline{X^n} = E(X^n), \quad n \geq 1,

are known as the moments of the r.v X, and

\mu_n = E[(X - \mu)^n]

are known as the central moments of X. Clearly, the mean μ = m_1 and the variance σ² = μ_2. It is easy to relate m_n and μ_n; in fact,

\mu_n = E[(X-\mu)^n] = E\!\left( \sum_{k=0}^{n} \binom{n}{k} X^k (-\mu)^{n-k} \right) = \sum_{k=0}^{n} \binom{n}{k} E(X^k)(-\mu)^{n-k} = \sum_{k=0}^{n} \binom{n}{k} m_k (-\mu)^{n-k}.

In general, the quantities E[(X - a)^n] are known as the generalized moments of X about a, and E[|X|^n] are known as the absolute moments of X.

2. Review of Probability Theory: Moments… (10)
‰ Characteristic Function of a r.v X is defined as

\Phi_X(\omega) = E(e^{jX\omega}) = \int_{-\infty}^{+\infty} e^{jx\omega} f_X(x) \, dx.

Thus Φ_X(0) = 1 and |Φ_X(ω)| ≤ 1 for all ω.

For discrete r.vs, the characteristic function reduces to

\Phi_X(\omega) = \sum_k e^{jk\omega} P(X = k).

Example: If X ∼ P(λ), then its characteristic function is given by

\Phi_X(\omega) = \sum_{k=0}^{\infty} e^{jk\omega} e^{-\lambda} \frac{\lambda^k}{k!} = e^{-\lambda} \sum_{k=0}^{\infty} \frac{(\lambda e^{j\omega})^k}{k!} = e^{-\lambda} e^{\lambda e^{j\omega}} = e^{\lambda(e^{j\omega} - 1)}.

Example: If X is a binomial r.v, its characteristic function is given by

\Phi_X(\omega) = \sum_{k=0}^{n} e^{jk\omega} \binom{n}{k} p^k q^{n-k} = \sum_{k=0}^{n} \binom{n}{k} (p e^{j\omega})^k q^{n-k} = (p e^{j\omega} + q)^n.

2. Review of Probability Theory: Moments… (11)
The characteristic function can be used to compute the mean, variance and other higher-order moments of any r.v X:

\Phi_X(\omega) = E(e^{jX\omega}) = E\!\left[ \sum_{k=0}^{\infty} \frac{(j\omega X)^k}{k!} \right] = \sum_{k=0}^{\infty} j^k \frac{E(X^k)}{k!} \omega^k = 1 + jE(X)\,\omega + j^2 \frac{E(X^2)}{2!}\omega^2 + \cdots + j^k \frac{E(X^k)}{k!}\omega^k + \cdots.

Taking the first derivative with respect to ω and setting ω = 0, we get

\left.\frac{\partial \Phi_X(\omega)}{\partial \omega}\right|_{\omega=0} = jE(X) \quad \text{or} \quad E(X) = \frac{1}{j} \left.\frac{\partial \Phi_X(\omega)}{\partial \omega}\right|_{\omega=0}.

Similarly, the second derivative gives

E(X^2) = \frac{1}{j^2} \left.\frac{\partial^2 \Phi_X(\omega)}{\partial \omega^2}\right|_{\omega=0},

and repeating this procedure k times, we obtain the kth moment of X as

E(X^k) = \frac{1}{j^k} \left.\frac{\partial^k \Phi_X(\omega)}{\partial \omega^k}\right|_{\omega=0}, \quad k \geq 1.

2. Review of Probability Theory: Moments… (12)
Example: If X ∼ P(λ), then

\frac{\partial \Phi_X(\omega)}{\partial \omega} = e^{-\lambda} e^{\lambda e^{j\omega}} \lambda j e^{j\omega},

so that E(X) = λ.

The second derivative gives

\frac{\partial^2 \Phi_X(\omega)}{\partial \omega^2} = e^{-\lambda} \left( e^{\lambda e^{j\omega}} (\lambda j e^{j\omega})^2 + e^{\lambda e^{j\omega}} \lambda j^2 e^{j\omega} \right),

so that

E(X^2) = \lambda^2 + \lambda.
2. Review of Probability Theory: Two r.vs (1)
‰ Let X and Y denote two random variables (r.v) based on a probability model (Ω, F, P). Then

P(x_1 < X(\xi) \leq x_2) = F_X(x_2) - F_X(x_1) = \int_{x_1}^{x_2} f_X(x) \, dx,

and

P(y_1 < Y(\xi) \leq y_2) = F_Y(y_2) - F_Y(y_1) = \int_{y_1}^{y_2} f_Y(y) \, dy.

What about the probability that the pair of r.vs (X, Y) belongs to an arbitrary region D? In other words, how does one estimate, for example,

P[(x_1 < X(\xi) \leq x_2) \cap (y_1 < Y(\xi) \leq y_2)] = ?

Towards this, we define the joint probability distribution function of X and Y as

F_{XY}(x, y) = P[(X(\xi) \leq x) \cap (Y(\xi) \leq y)] = P(X \leq x, Y \leq y) \geq 0,

where x and y are arbitrary real numbers.

2. Review of Probability Theory: Two r.vs (2)
‰ Properties

(i) F_{XY}(-\infty, y) = F_{XY}(x, -\infty) = 0, \quad F_{XY}(+\infty, +\infty) = 1.

(ii) P(x_1 < X(\xi) \leq x_2, \; Y(\xi) \leq y) = F_{XY}(x_2, y) - F_{XY}(x_1, y),
     P(X(\xi) \leq x, \; y_1 < Y(\xi) \leq y_2) = F_{XY}(x, y_2) - F_{XY}(x, y_1).

(iii) P(x_1 < X(\xi) \leq x_2, \; y_1 < Y(\xi) \leq y_2) = F_{XY}(x_2, y_2) - F_{XY}(x_2, y_1) - F_{XY}(x_1, y_2) + F_{XY}(x_1, y_1).

This is the probability that (X, Y) belongs to the rectangle R_0 = (x_1, x_2] × (y_1, y_2].
2. Review of Probability Theory: Two r.vs (3)
‰ Joint probability density function (joint p.d.f): By definition, the joint p.d.f of X and Y is given by

f_{XY}(x, y) = \frac{\partial^2 F_{XY}(x, y)}{\partial x \, \partial y},

and hence we obtain the useful formula

F_{XY}(x, y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f_{XY}(u, v) \, du \, dv.

Using property (i), we also get

\int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} f_{XY}(x, y) \, dx \, dy = 1.

The probability that (X, Y) belongs to an arbitrary region D is given by

P((X, Y) \in D) = \iint_{(x, y) \in D} f_{XY}(x, y) \, dx \, dy.

2. Review of Probability Theory: Two r.vs (4)
‰ Marginal Statistics: In the context of several r.vs, the statistics of each individual one are called marginal statistics. Thus FX(x) is the marginal probability distribution function of X, and fX(x) is the marginal p.d.f of X. It is interesting to note that all marginals can be obtained from the joint p.d.f. In fact,

F_X(x) = F_{XY}(x, +\infty), \qquad F_Y(y) = F_{XY}(+\infty, y).

Also,

f_X(x) = \int_{-\infty}^{+\infty} f_{XY}(x, y) \, dy, \qquad f_Y(y) = \int_{-\infty}^{+\infty} f_{XY}(x, y) \, dx.

If X and Y are discrete r.vs, then p_{ij} = P(X = x_i, Y = y_j) represents their joint p.d.f, and their respective marginal p.d.fs are given by

P(X = x_i) = \sum_j P(X = x_i, Y = y_j) = \sum_j p_{ij}, \qquad P(Y = y_j) = \sum_i P(X = x_i, Y = y_j) = \sum_i p_{ij}.

The joint PDF and/or the joint p.d.f represent complete information about the r.vs, and their marginal p.d.fs can be evaluated from the joint p.d.f. However, given only the marginals, it is (most often) not possible to reconstruct the joint p.d.f.
2. Review of Probability Theory: Two r.vs (5)
‰ Independence of r.vs
Definition: The random variables X and Y are said to be statistically independent if the events {X(ξ) ∈ A} and {Y(ξ) ∈ B} are independent events for any two sets A and B on the x and y axes, respectively. Applying this definition to the events {X(ξ) ≤ x} and {Y(ξ) ≤ y}, we conclude that, if the r.vs X and Y are independent, then

P((X(\xi) \leq x) \cap (Y(\xi) \leq y)) = P(X(\xi) \leq x) \, P(Y(\xi) \leq y),

i.e.,

F_{XY}(x, y) = F_X(x) F_Y(y),

or equivalently, if X and Y are independent, then we must have

f_{XY}(x, y) = f_X(x) f_Y(y).

If X and Y are discrete-type r.vs, then their independence implies

P(X = x_i, Y = y_j) = P(X = x_i) P(Y = y_j) \quad \text{for all } i, j.

2. Review of Probability Theory: Two r.vs (6)
The equations on the previous slide give us the procedure to test for independence: given f_{XY}(x, y), obtain the marginal p.d.fs fX(x) and fY(y) and examine whether the factorization holds. If so, the r.vs are independent; otherwise they are dependent.

Example: Given

f_{XY}(x, y) = \begin{cases} x y^2 e^{-y}, & 0 < y < \infty, \; 0 < x < 1, \\ 0, & \text{otherwise,} \end{cases}

determine whether X and Y are independent.

We have

f_X(x) = \int_0^{+\infty} f_{XY}(x, y) \, dy = x \int_0^{\infty} y^2 e^{-y} \, dy = x \left( \left[-y^2 e^{-y}\right]_0^{\infty} + 2\int_0^{\infty} y e^{-y} \, dy \right) = 2x, \quad 0 < x < 1.

Similarly,

f_Y(y) = \int_0^1 f_{XY}(x, y) \, dx = \frac{y^2}{2} e^{-y}, \quad 0 < y < \infty.

In this case, f_{XY}(x, y) = f_X(x) f_Y(y), and hence X and Y are independent r.vs.
2. Review of Probability Theory: Func. of 2 r.vs (1)
‰ Given two random variables X and Y and a function g(x, y), we form a new random variable Z = g(X, Y).

Given the joint p.d.f f_{XY}(x, y), how does one obtain fZ(z), the p.d.f of Z? Problems of this type are of interest from a practical standpoint. For example, a receiver output signal usually consists of the desired signal buried in noise, and the above formulation in that case reduces to Z = X + Y.

It is important to know the statistics of the incoming signal for proper receiver design. In this context, we shall analyze problems of the following type:

Z = g(X, Y) ∈ { X + Y, \; X - Y, \; XY, \; X/Y, \; \max(X, Y), \; \min(X, Y), \; \sqrt{X^2 + Y^2}, \; \tan^{-1}(X/Y) }.

2. Review of Probability Theory: Func. of 2 r.vs (2)
‰ Start with

F_Z(z) = P(Z(\xi) \leq z) = P(g(X, Y) \leq z) = P[(X, Y) \in D_z] = \iint_{(x, y) \in D_z} f_{XY}(x, y) \, dx \, dy,

where D_z in the xy-plane represents the region where g(x, y) ≤ z is satisfied. To determine FZ(z), it is enough to find the region D_z for every z and then evaluate the integral there.

Example 1: Z = X + Y. Find fZ(z).

F_Z(z) = P(X + Y \leq z) = \int_{y=-\infty}^{+\infty} \int_{x=-\infty}^{z-y} f_{XY}(x, y) \, dx \, dy,

since the region D_z of the xy-plane where x + y ≤ z is the half-plane to the left of the line x + y = z. Integrating over a horizontal strip along the x-axis first (inner integral), and then sliding that strip along the y-axis from -∞ to +∞ (outer integral), we cover the entire region.
2. Review of Probability Theory: Func. of 2 r.vs (3)
We can find fZ(z) by differentiating FZ(z) directly. In this context, it is useful
to recall the differentiation rule due to Leibnitz. Suppose
b( z )
H ( z) = ∫ h( x, z )dx.
a( z)

Then
dH ( z ) db( z ) da ( z ) b ( z ) ∂h ( x , z )
= h (b( z ), z ) − h (a ( z ), z ) + ∫ dx.
dz dz dz a ( z ) ∂z

Using the above equations, we get
f_Z(z) = \int_{-\infty}^{+\infty} \left( \frac{\partial}{\partial z} \int_{-\infty}^{z-y} f_{XY}(x,y)\,dx \right) dy = \int_{-\infty}^{+\infty} \left( f_{XY}(z-y, y) - 0 + \int_{-\infty}^{z-y} \frac{\partial f_{XY}(x,y)}{\partial z}\,dx \right) dy
       = \int_{-\infty}^{+\infty} f_{XY}(z-y, y)\,dy.

If X and Y are independent, so that fXY(x,y) = fX(x)fY(y), then we get
f_Z(z) = \int_{y=-\infty}^{+\infty} f_X(z-y)\, f_Y(y)\,dy = \int_{x=-\infty}^{+\infty} f_X(x)\, f_Y(z-x)\,dx. \quad \text{(Convolution!)}
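As a rough numerical illustration of this convolution result (an added sketch; numpy is assumed, and the Exp(1) densities are only an example): for independent X, Y ~ Exp(1), discretizing f_X * f_Y reproduces the known density z e^{-z} of Z = X + Y.

```python
import numpy as np

dz = 0.01
z = np.arange(0, 20, dz)
f_exp = np.exp(-z)                        # Exp(1) density on the grid z >= 0

# discrete approximation of the convolution integral f_Z = f_X * f_Y
f_z = np.convolve(f_exp, f_exp)[:len(z)] * dz

f_exact = z * np.exp(-z)                  # exact density of Z = X + Y (Gamma(2,1))
print(np.max(np.abs(f_z - f_exact)))      # small discretization error (about dz)
```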

2. Review of Probability Theory: Func. of 2 r.vs (4)
Example 2: X and Y are independent normal r.vs with zero mean and
common variance σ². Determine f_Z(z) for Z = X² + Y².

We have
F_Z(z) = P(X^2 + Y^2 \le z) = \iint_{x^2 + y^2 \le z} f_{XY}(x,y)\,dx\,dy.

But X² + Y² ≤ z represents the interior of a circle with radius √z, and hence
F_Z(z) = \int_{y=-\sqrt{z}}^{\sqrt{z}} \int_{x=-\sqrt{z-y^2}}^{\sqrt{z-y^2}} f_{XY}(x,y)\,dx\,dy.
[Figure: the disc X² + Y² ≤ z in the xy plane.]
This gives, after differentiation (using the Leibnitz rule),
f_Z(z) = \int_{y=-\sqrt{z}}^{\sqrt{z}} \frac{1}{2\sqrt{z-y^2}} \left( f_{XY}\big(\sqrt{z-y^2},\, y\big) + f_{XY}\big(-\sqrt{z-y^2},\, y\big) \right) dy. \qquad (*)

2. Review of Probability Theory: Func. of 2 r.vs (5)
Moreover, X and Y are said to be jointly normal (Gaussian) distributed if
their joint p.d.f has the following form:
f_{XY}(x,y) = \frac{1}{2\pi\sigma_X\sigma_Y\sqrt{1-\rho^2}} \exp\left\{ \frac{-1}{2(1-\rho^2)} \left( \frac{(x-\mu_X)^2}{\sigma_X^2} - \frac{2\rho(x-\mu_X)(y-\mu_Y)}{\sigma_X\sigma_Y} + \frac{(y-\mu_Y)^2}{\sigma_Y^2} \right) \right\},
-\infty < x < +\infty, \quad -\infty < y < +\infty, \quad |\rho| < 1,


and with zero mean
f_{XY}(x,y) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-r^2}} \exp\left\{ \frac{-1}{2(1-r^2)} \left( \frac{x^2}{\sigma_1^2} - \frac{2rxy}{\sigma_1\sigma_2} + \frac{y^2}{\sigma_2^2} \right) \right\}.

With r = 0 and σ₁ = σ₂ = σ, direct substitution into (*) gives
f_Z(z) = \int_{-\sqrt{z}}^{\sqrt{z}} \frac{1}{2\sqrt{z-y^2}} \cdot 2 \cdot \frac{1}{2\pi\sigma^2}\, e^{-(z-y^2+y^2)/2\sigma^2}\,dy = \frac{e^{-z/2\sigma^2}}{\pi\sigma^2} \int_0^{\sqrt{z}} \frac{dy}{\sqrt{z-y^2}}
= \frac{e^{-z/2\sigma^2}}{\pi\sigma^2} \int_0^{\pi/2} \frac{\sqrt{z}\cos\theta}{\sqrt{z}\cos\theta}\,d\theta = \frac{1}{2\sigma^2}\, e^{-z/2\sigma^2}\, U(z), \qquad \text{with } y = \sqrt{z}\sin\theta.
2. Review of Probability Theory: Func. of 2 r.vs (6)
Thus, if X and Y are independent zero-mean Gaussian r.vs with common
variance σ², then X² + Y² is an exponential r.v with parameter 2σ².
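This is easy to check by simulation; the following is a minimal Monte Carlo sketch (numpy assumed, parameter values arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
sigma, n = 1.5, 200_000

x = rng.normal(0.0, sigma, n)
y = rng.normal(0.0, sigma, n)
z = x**2 + y**2                      # claimed: exponential with parameter (mean) 2*sigma**2

print(z.mean(), 2 * sigma**2)        # both close to 4.5
print(np.mean(z > 2 * sigma**2))     # ~ exp(-1) ~ 0.368, as for an exponential r.v
```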

Example 3: Let Z = \sqrt{X^2 + Y^2}. Find f_Z(z).
From the figure, the present case corresponds to the disc X² + Y² ≤ z², i.e., a circle with radius z. Thus
F_Z(z) = \int_{y=-z}^{z} \int_{x=-\sqrt{z^2-y^2}}^{\sqrt{z^2-y^2}} f_{XY}(x,y)\,dx\,dy,
[Figure: the disc X² + Y² ≤ z² in the xy plane.]
and hence
f_Z(z) = \int_{-z}^{z} \frac{z}{\sqrt{z^2-y^2}} \left( f_{XY}\big(\sqrt{z^2-y^2},\, y\big) + f_{XY}\big(-\sqrt{z^2-y^2},\, y\big) \right) dy.

Now suppose X and Y are independent Gaussian r.vs as in Example 2; we obtain
f_Z(z) = 2\int_0^{z} \frac{z}{\sqrt{z^2-y^2}} \cdot \frac{2}{2\pi\sigma^2}\, e^{-(z^2-y^2+y^2)/2\sigma^2}\,dy = \frac{2z\, e^{-z^2/2\sigma^2}}{\pi\sigma^2} \int_0^{z} \frac{dy}{\sqrt{z^2-y^2}}
= \frac{2z}{\pi\sigma^2}\, e^{-z^2/2\sigma^2} \int_0^{\pi/2} \frac{z\cos\theta}{z\cos\theta}\,d\theta = \frac{z}{\sigma^2}\, e^{-z^2/2\sigma^2}\, U(z), \qquad \text{with } y = z\sin\theta. \quad \text{(Rayleigh distribution!)}
2. Review of Probability Theory: Func. of 2 r.vs (7)
Thus, if W = X + iY, where X and Y are real, independent normal r.vs with
zero mean and equal variance, then the r.v |W| = \sqrt{X^2 + Y^2} has a Rayleigh
density. W is said to be a complex Gaussian r.v with zero mean, whose real
and imaginary parts are independent r.vs and whose magnitude has a Rayleigh
distribution.

⎛X ⎞
What about its phase θ = tan −1 ⎜ ⎟?
⎝Y ⎠
Clearly, the principal value of θ lies in the interval (-π/2, +π/2). If we let
U = tan θ = X/Y , then it is shown that U has a Cauchy distribution with
f_U(u) = \frac{1/\pi}{u^2 + 1}, \quad -\infty < u < \infty.
As a result,
f_\theta(\theta) = \frac{1}{|d\theta/du|}\, f_U(\tan\theta) = \sec^2\theta \cdot \frac{1/\pi}{\tan^2\theta + 1} = \begin{cases} 1/\pi, & -\pi/2 < \theta < \pi/2, \\ 0, & \text{otherwise.} \end{cases}

2. Review of Probability Theory: Func. of 2 r.vs (8)
To summarize, the magnitude and phase of a zero-mean complex Gaussian r.v
have Rayleigh and uniform distributions, respectively. Interestingly, as we will
see later, these two derived r.vs are also independent of each other!
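A brief Monte Carlo sketch of these facts (an addition, with numpy assumed and the slide's convention θ = tan⁻¹(X/Y)):

```python
import numpy as np

rng = np.random.default_rng(1)
sigma, n = 2.0, 500_000

x = rng.normal(0.0, sigma, n)
y = rng.normal(0.0, sigma, n)

r = np.sqrt(x**2 + y**2)                       # magnitude: Rayleigh with parameter sigma
theta = np.arctan(x / y)                       # principal-value phase in (-pi/2, pi/2)

print(r.mean(), sigma * np.sqrt(np.pi / 2))    # Rayleigh mean
print(theta.std(), np.pi / np.sqrt(12))        # std of a uniform r.v on (-pi/2, pi/2)
print(np.corrcoef(r, theta)[0, 1])             # ~ 0, consistent with independence
```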

Example 4: Redo example 3, where X and Y have nonzero means μX and


μY respectively.
Since
f_{XY}(x,y) = \frac{1}{2\pi\sigma^2}\, e^{-[(x-\mu_X)^2 + (y-\mu_Y)^2]/2\sigma^2},

we proceed as in Example 3 and obtain the Rician probability density function to be
f_Z(z) = \frac{z\, e^{-(z^2+\mu^2)/2\sigma^2}}{2\pi\sigma^2} \int_{-\pi/2}^{\pi/2} \left( e^{z\mu\cos(\theta-\phi)/\sigma^2} + e^{-z\mu\cos(\theta+\phi)/\sigma^2} \right) d\theta
= \frac{z\, e^{-(z^2+\mu^2)/2\sigma^2}}{2\pi\sigma^2} \left( \int_{-\pi/2}^{\pi/2} e^{z\mu\cos(\theta-\phi)/\sigma^2}\,d\theta + \int_{\pi/2}^{3\pi/2} e^{z\mu\cos(\theta-\phi)/\sigma^2}\,d\theta \right)
= \frac{z}{\sigma^2}\, e^{-(z^2+\mu^2)/2\sigma^2}\, I_0\!\left( \frac{z\mu}{\sigma^2} \right),
2. Review of Probability Theory: Func. of 2 r.vs (9)
where x = z\cos\theta, \ y = z\sin\theta, \ \mu = \sqrt{\mu_X^2 + \mu_Y^2}, \ \mu_X = \mu\cos\phi, \ \mu_Y = \mu\sin\phi,

and
I_0(\eta) = \frac{1}{2\pi} \int_0^{2\pi} e^{\eta\cos(\theta-\phi)}\,d\theta = \frac{1}{\pi} \int_0^{\pi} e^{\eta\cos\theta}\,d\theta

is the modified Bessel function of the first kind and zeroth order.

Thus, if X and Y have nonzero means μX and μY respectively, then Z = \sqrt{X^2 + Y^2} is said to be a Rician r.v. Such a scenario arises in a fading multipath situation where there is a dominant constant component (mean) in addition to a zero-mean Gaussian r.v. The constant component may be the line-of-sight signal, and the zero-mean Gaussian part could be due to random multipath components adding up incoherently (see diagram: the line-of-sight (constant) signal and the multipath/Gaussian noise are summed, giving a Rician output). The envelope of such a signal is said to have a Rician p.d.f.
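The derived density can be compared against a simulation; this is only an illustrative sketch (numpy assumed, means and variance chosen arbitrarily, np.i0 used for the modified Bessel function I0):

```python
import numpy as np

rng = np.random.default_rng(2)
sigma, mu_x, mu_y, n = 1.0, 2.0, 1.0, 400_000
mu = np.hypot(mu_x, mu_y)

z = np.hypot(mu_x + rng.normal(0, sigma, n), mu_y + rng.normal(0, sigma, n))

hist, edges = np.histogram(z, bins=60, range=(0, 7), density=True)
c = 0.5 * (edges[:-1] + edges[1:])
rician = (c / sigma**2) * np.exp(-(c**2 + mu**2) / (2 * sigma**2)) * np.i0(c * mu / sigma**2)
print(np.max(np.abs(hist - rician)))      # small: the histogram follows the Rician p.d.f
```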

2. Review of Probability Theory: Joint Moments… (1)
‰ Given two r.vs X and Y and a function g(x,y), define the r.v Z = g(X,Y). The mean of Z can be defined as
\mu_Z = E(Z) = \int_{-\infty}^{+\infty} z\, f_Z(z)\,dz,
or, more usefully,
E(Z) = \int_{-\infty}^{+\infty} z\, f_Z(z)\,dz = \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} g(x,y)\, f_{XY}(x,y)\,dx\,dy.

If X and Y are discrete-type r.vs, then


E[ g ( X , Y )] = ∑∑ g ( xi , y j ) P( X = xi , Y = y j ).
i j

Since expectation is a linear operator, we also get


⎛ ⎞
E ⎜ ∑ ak g k ( X , Y ) ⎟ = ∑ ak E[ g k ( X , Y )].
⎝ k ⎠ k

2. Review of Probability Theory: Joint Moments… (2)
If X and Y are independent r.vs, it is easy to see that V=g(X) and W=h(Y)
are always independent of each other. We get the interesting result
E[g(X)h(Y)] = \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} g(x)h(y)\, f_X(x) f_Y(y)\,dx\,dy = \int_{-\infty}^{+\infty} g(x) f_X(x)\,dx \int_{-\infty}^{+\infty} h(y) f_Y(y)\,dy = E[g(X)]\,E[h(Y)].

In the case of one random variable, we defined the parameters mean and
variance to represent its average behavior. How does one parametrically
represent similar cross-behavior between two random variables? Towards
this, we can generalize the variance definition given as

‰ Covariance: Given any two r.vs X and Y, define


Cov ( X , Y ) = E [( X − μ X )(Y − μY )].
or
Cov( X , Y ) = E ( XY ) − μ X μY = E ( XY ) − E ( X ) E (Y )
= \overline{XY} - \overline{X}\,\overline{Y}.
2. Review of Probability Theory: Joint Moments… (3)
It is easy to see that
Cov(X,Y) \le \sqrt{Var(X)\,Var(Y)}.
We define the normalized parameter
\rho_{XY} = \frac{Cov(X,Y)}{\sqrt{Var(X)\,Var(Y)}} = \frac{Cov(X,Y)}{\sigma_X \sigma_Y}, \quad -1 \le \rho_{XY} \le 1,
and it represents the correlation coefficient between X and Y.

Uncorrelated r.vs: If ρXY = 0, then X and Y are said to be uncorrelated r.vs.


If X and Y are uncorrelated, then E(XY) = E(X)E(Y)

Orthogonality: X and Y are said to be orthogonal if E(XY) = 0

If either X or Y has zero mean, then orthogonality implies uncorrelatedness
and vice-versa. If X and Y are independent r.vs, then they are also uncorrelated.

2. Review of Probability Theory: Joint Moments… (4)
Naturally, if two random variables are statistically independent, then there
cannot be any correlation between them (ρXY = 0). However, the converse
is in general not true: as the sketch below illustrates, random variables can
be uncorrelated without being independent.
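A standard illustration of this point (added here; numpy assumed): with X ~ N(0,1) and Y = X², Cov(X,Y) = E(X³) = 0, so the pair is uncorrelated even though Y is completely determined by X.

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.standard_normal(1_000_000)
y = x**2                                  # a deterministic function of x

print(np.corrcoef(x, y)[0, 1])            # ~ 0: uncorrelated
print(np.corrcoef(x**2, y)[0, 1])         # 1.0: clearly dependent
```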

Example 5: Let Z = aX + bY. Determine the variance of Z in terms of σX, σY


and ρXY .
We have
\mu_Z = E(Z) = E(aX + bY) = a\mu_X + b\mu_Y,
\sigma_Z^2 = Var(Z) = E\left[(Z - \mu_Z)^2\right] = E\left[\left( a(X-\mu_X) + b(Y-\mu_Y) \right)^2\right]
           = a^2 E(X-\mu_X)^2 + 2ab\, E\left[(X-\mu_X)(Y-\mu_Y)\right] + b^2 E(Y-\mu_Y)^2
           = a^2\sigma_X^2 + 2ab\,\rho_{XY}\sigma_X\sigma_Y + b^2\sigma_Y^2.

In particular, if X and Y are independent, then ρXY = 0 and


σ Z2 = a 2σ X2 + b2σ Y2 .
2. Review of Probability Theory: Joint Moments… (5)
‰ Moments: the joint moment of order (k, m) of X and Y is defined as
E[X^k Y^m] = \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} x^k y^m f_{XY}(x,y)\,dx\,dy.

Following the one random variable case, we can define the joint
characteristic function between two random variables which will turn out to
be useful for moment calculations.

‰ Joint characteristic function: the joint characteristic function of X and Y is defined as
\Phi_{XY}(u,v) = E\left( e^{j(Xu+Yv)} \right) = \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} e^{j(xu+yv)} f_{XY}(x,y)\,dx\,dy.
Note that |\Phi_{XY}(u,v)| \le \Phi_{XY}(0,0) = 1.

It is easy to show that


E(XY) = \frac{1}{j^2} \left. \frac{\partial^2 \Phi_{XY}(u,v)}{\partial u\, \partial v} \right|_{u=0,\, v=0}.
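As an added symbolic check (sympy assumed), applying this formula to the jointly Gaussian characteristic function given on the next slide recovers E(XY) = μXμY + ρσXσY, i.e. Cov(X,Y) + E(X)E(Y):

```python
import sympy as sp

u, v = sp.symbols('u v', real=True)
mx, my, sx, sy, rho = sp.symbols('mu_X mu_Y sigma_X sigma_Y rho', real=True)

# jointly Gaussian characteristic function (see the next slide)
Phi = sp.exp(sp.I * (mx * u + my * v)
             - sp.Rational(1, 2) * (sx**2 * u**2 + 2 * rho * sx * sy * u * v + sy**2 * v**2))

EXY = sp.expand(sp.diff(Phi, u, v).subs({u: 0, v: 0}) / sp.I**2)
print(EXY)        # mu_X*mu_Y + rho*sigma_X*sigma_Y
```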

2. Review of Probability Theory: Joint Moments… (6)
If X and Y are independent r.vs, then we obtain
Φ XY (u, v ) = E ( e juX ) E ( e jvY ) = Φ X (u )ΦY ( v ).
Also
Φ X (u ) = Φ XY (u, 0), ΦY ( v ) = Φ XY (0, v ).

‰ More on Gaussian r.vs: the joint characteristic function of two jointly Gaussian r.vs is given by
\Phi_{XY}(u,v) = E\left( e^{j(Xu+Yv)} \right) = e^{\, j(\mu_X u + \mu_Y v) - \frac{1}{2}(\sigma_X^2 u^2 + 2\rho\sigma_X\sigma_Y uv + \sigma_Y^2 v^2)}.

Example 6: Let X and Y be jointly Gaussian r.vs with parameters


N(μX, μY, σX2, σY2, ρ). Define Z = aX + bY. Determine fZ(z).
In this case we can use characteristic function to solve.
Φ Z (u ) = E (e jZu ) = E ( e j ( aX +bY ) u ) = E ( e jauX + jbuY )
= Φ XY ( au, bu ).
2. Review of Probability Theory: Joint Moments… (7)
or
\Phi_Z(u) = e^{\, j(a\mu_X + b\mu_Y)u - \frac{1}{2}(a^2\sigma_X^2 + 2\rho ab\,\sigma_X\sigma_Y + b^2\sigma_Y^2)u^2} = e^{\, j\mu_Z u - \frac{1}{2}\sigma_Z^2 u^2},
where
\mu_Z \triangleq a\mu_X + b\mu_Y, \qquad \sigma_Z^2 \triangleq a^2\sigma_X^2 + 2\rho ab\,\sigma_X\sigma_Y + b^2\sigma_Y^2.
Thus, Z = aX + bY is also Gaussian with mean and variance as above.
We conclude that any linear combination of jointly Gaussian r.vs generates a
Gaussian r.v.

Example 7: Suppose X and Y are jointly Gaussian r.vs as in Example 6.
Define two linear combinations, Z = aX + bY and W = cX + dY. Determine
their joint distribution.

The characteristic function of Z and W is given by


Φ ZW (u, v ) = E ( e j ( Zu +Wv ) ) = E ( e j ( aX +bY ) u + j ( cX + dY ) v )
= E ( e jX ( au + cv )+ jY ( bu + dv ) ) = Φ XY ( au + cv, bu + dv ).

2. Review of Probability Theory: Joint Moments… (8)
Similar to Example 6, we get
\Phi_{ZW}(u,v) = e^{\, j(\mu_Z u + \mu_W v) - \frac{1}{2}(\sigma_Z^2 u^2 + 2\rho_{ZW}\sigma_Z\sigma_W uv + \sigma_W^2 v^2)},

where
μ Z = aμ X + bμY ,
μW = cμ X + dμY ,
σ Z2 = a 2σ X2 + 2abρσ X σ Y + b2σ Y2 ,
σ W2 = c 2σ X2 + 2cdρσ X σ Y + d 2σ Y2 ,
and
\rho_{ZW} = \frac{ac\,\sigma_X^2 + (ad+bc)\rho\sigma_X\sigma_Y + bd\,\sigma_Y^2}{\sigma_Z\,\sigma_W}.

Thus, Z and W are also jointly distributed Gaussian r.vs with means,
variances and correlation coefficient as above.

2. Review of Probability Theory: Joint Moments… (9)
To summarize, any two linear combinations of jointly Gaussian random
variables (independent or dependent) are also jointly Gaussian r.vs.

Gaussian input → Linear operator → Gaussian output

Gaussian r. vs are also interesting because of the following result:

‰ Central Limit Theorem: Suppose X1, X2, …, Xn are a set of zero-mean
independent, identically distributed (i.i.d) random variables with some
common distribution. Consider their scaled sum
Y = \frac{X_1 + X_2 + \cdots + X_n}{\sqrt{n}}.
Then, asymptotically (as n → ∞),
Y \to N(0, \sigma^2).

2. Review of Probability Theory: Joint Moments… (10)
The central limit theorem states that a large sum of independent random
variables, each with finite variance, tends to behave like a normal random
variable. Thus the individual p.d.fs become unimportant when analyzing the
collective sum behavior. If we model a noise phenomenon as the sum of
a large number of independent random variables (e.g., electron motion in
resistors), then this theorem allows us to conclude that the noise
behaves like a Gaussian r.v.
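A small simulation makes the theorem concrete (an added sketch; numpy assumed, with uniform samples chosen only as an example of a non-Gaussian common distribution):

```python
import numpy as np

rng = np.random.default_rng(4)
n, trials = 400, 20_000

x = rng.uniform(-0.5, 0.5, size=(trials, n))   # i.i.d., zero mean, sigma^2 = 1/12
y = x.sum(axis=1) / np.sqrt(n)                 # scaled sum from the theorem

print(y.mean(), y.var(), 1 / 12)               # mean ~ 0, variance ~ sigma^2
print(np.mean(np.abs(y) < np.sqrt(1 / 12)))    # ~ 0.683, as for a Gaussian within one std
```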

2. Review of Stochastic Processes and Models

‰ Mean, Autocorrelation.

‰ Stationarity.

‰ Deterministic Systems.

‰ Discrete Time Stochastic Process.

‰ Stochastic Models.

2. Review of Stochastic Processes: Introduction (1)
‰ Let ζ denote the random outcome of an experiment. To every such outcome suppose a waveform X(t, ζ) is assigned. The collection of such waveforms forms a stochastic process. The set {ζk} and the time index t can be continuous or discrete (countably infinite or finite) as well. For fixed ζi ∈ S (the set of all experimental outcomes), X(t, ζi) is a specific time function. For fixed t, X1 = X(t1, ζi) is a random variable. The ensemble of all such realizations X(t, ζ) over time represents the stochastic process (or random process) X(t) (see the figure).
[Figure: realizations X(t, ζ1), X(t, ζ2), …, X(t, ζk), …, X(t, ζn) plotted versus t, with time instants t1 and t2 marked.]
For example,
X(t) = a\cos(\omega_0 t + \varphi),
where φ is a uniformly distributed random variable in (0, 2π), represents a stochastic process.

2. Review of Stochastic Processes: Introduction (2)
If X(t) is a stochastic process, then for fixed t, X(t) represents a random
variable. Its distribution function is given by
FX ( x, t ) = P{ X (t ) ≤ x}
Notice that FX(x, t) depends on t, since for a different t, we obtain a
different random variable. Further
f_X(x,t) \triangleq \frac{dF_X(x,t)}{dx}
represents the first-order probability density function of the process X(t).

For t = t1 and t = t2, X(t) represents two different random variables


X1 = X(t1) and X2 = X(t2), respectively. Their joint distribution is given by
FX ( x1 , x2 , t1 , t2 ) = P{ X (t1 ) ≤ x1 , X (t2 ) ≤ x2 }
and
f_X(x_1, x_2, t_1, t_2) \triangleq \frac{\partial^2 F_X(x_1, x_2, t_1, t_2)}{\partial x_1\, \partial x_2}
represents the second-order density function of the process X(t).
2. Review of Stochastic Processes: Introduction (3)
Similarly, fX(x1, x2,… xn, t1, t2,… tn) represents the nth order density function
of the process X(t). Complete specification of the stochastic process X(t)
requires the knowledge of fX(x1, x2,… xn, t1, t2,… tn) for all ti , i = 1, 2…, n and
for all n. (an almost impossible task in reality!).

‰ Mean of a stochastic process:


\mu(t) \triangleq E\{X(t)\} = \int_{-\infty}^{+\infty} x\, f_X(x,t)\,dx
represents the mean value of a process X(t). In general, the mean of
a process can depend on the time index t.

‰ Autocorrelation function of a process X(t) is defined as


R_{XX}(t_1, t_2) \triangleq E\{X(t_1) X^*(t_2)\} = \iint x_1 x_2^*\, f_X(x_1, x_2, t_1, t_2)\,dx_1\,dx_2
and it represents the interrelationship between the random variables
X1 = X(t1) and X2 = X(t2) generated from the process X(t).

2. Review of Stochastic Processes: Introduction (4)
Properties:

(i) R XX (t1 , t2 ) = R XX* (t2 , t1 ) = [ E{ X (t2 ) X * (t1 )}]*

(ii) R XX (t , t ) = E{| X (t ) |2 } > 0. (Average instantaneous power)

(iii) RXX(t1, t2) represents a nonnegative definite function, i.e., for any set of constants \{a_i\}_{i=1}^{n},
\sum_{i=1}^{n} \sum_{j=1}^{n} a_i a_j^* R_{XX}(t_i, t_j) \ge 0.
This follows by noticing that E\{|Y|^2\} \ge 0 for Y = \sum_{i=1}^{n} a_i X(t_i).

The function
C XX (t1 , t 2 ) = RXX (t1 , t 2 ) − μ X (t1 ) μ *X (t 2 )
represents the autocovariance function of the process X(t).

2. Review of Stochastic Processes: Introduction (5)
Example:
X (t ) = a cos(ω 0 t + ϕ ), ϕ ~ U (0,2π ).
This gives
\mu_X(t) = E\{X(t)\} = a\,E\{\cos(\omega_0 t + \varphi)\} = a\cos\omega_0 t\, E\{\cos\varphi\} - a\sin\omega_0 t\, E\{\sin\varphi\} = 0,
since E\{\cos\varphi\} = \frac{1}{2\pi}\int_0^{2\pi} \cos\varphi\,d\varphi = 0 = E\{\sin\varphi\}.
Similarly,
R_{XX}(t_1, t_2) = a^2 E\{\cos(\omega_0 t_1 + \varphi)\cos(\omega_0 t_2 + \varphi)\}
= \frac{a^2}{2} E\{\cos\omega_0(t_1 - t_2) + \cos(\omega_0(t_1 + t_2) + 2\varphi)\}
= \frac{a^2}{2} \cos\omega_0(t_1 - t_2).
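These ensemble averages can be reproduced numerically by averaging over many realizations of φ; the snippet below is an added sketch (numpy assumed, the values of a, ω0, t1, t2 arbitrary):

```python
import numpy as np

rng = np.random.default_rng(5)
a, w0, t1, t2 = 2.0, 2 * np.pi * 5.0, 0.13, 0.37
n_real = 200_000                          # number of realizations (values of phi)

phi = rng.uniform(0, 2 * np.pi, n_real)
x1 = a * np.cos(w0 * t1 + phi)
x2 = a * np.cos(w0 * t2 + phi)

print(x1.mean())                                             # ~ 0
print(np.mean(x1 * x2), a**2 / 2 * np.cos(w0 * (t1 - t2)))   # ~ R_XX(t1, t2)
```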

2. Review of Stochastic Processes: Stationary (1)
‰ Stationary processes exhibit statistical properties that are invariant to
shift in the time index. Thus, for example, second-order stationarity implies
that the statistical properties of the pairs {X(t1) , X(t2)} and {X(t1+c),X(t2+c)}
are the same for any c. Similarly, first-order stationarity implies that the
statistical properties of X(ti) and X(ti+c) are the same for any c.

In strict terms, the statistical properties are governed by the joint probability
density function. Hence a process is nth-order Strict-Sense Stationary
(S.S.S) if
f_X(x_1, x_2, \ldots, x_n, t_1, t_2, \ldots, t_n) \equiv f_X(x_1, x_2, \ldots, x_n, t_1+c, t_2+c, \ldots, t_n+c) \qquad (*)
for any c, where the left side represents the joint density function of the
random variables X1 = X(t1), X2 = X(t2), …, Xn = X(tn), and the right side
corresponds to the joint density function of the random variables X’1=X(t1+c),
X’2 = X(t2+c), …, X’n = X(tn+c). A process X(t) is said to be strict-sense
stationary if (*) is true for all ti, i = 1, 2, …, n; n = 1, 2, … and any c.

2. Review of Stochastic Processes: Stationary (2)
For a first-order strict sense stationary process, from (*) we have
f X ( x, t ) ≡ f X ( x, t + c )
for any c. In particular c = – t gives
f X ( x, t ) = f X ( x )
i.e., the first-order density of X(t) is independent of t. In that case
E[X(t)] = \int_{-\infty}^{+\infty} x\, f(x)\,dx = \mu, \quad \text{a constant.}

Similarly, for a second-order strict-sense stationary process we have


from (*)
f X ( x1 , x2 , t1 , t 2 ) ≡ f X ( x1 , x2 , t1 + c, t 2 + c)
for any c. For c = – t2 we get
f X ( x1 , x2 , t1 , t 2 ) ≡ f X ( x1 , x2 , t1 − t 2 )
i.e., the second order density function of a strict sense stationary process
depends only on the difference of the time indices t1 – t2 = τ .

2. Review of Stochastic Processes: Stationary (3)
In that case the autocorrelation function is given by
R_{XX}(t_1, t_2) \triangleq E\{X(t_1) X^*(t_2)\} = \iint x_1 x_2^*\, f_X(x_1, x_2, \tau = t_1 - t_2)\,dx_1\,dx_2 = R_{XX}(t_1 - t_2) \triangleq R_{XX}(\tau) = R_{XX}^*(-\tau),
i.e., the autocorrelation function of a second order strict-sense stationary
process depends only on the difference of the time indices τ .

However, the basic conditions for the first and second order stationarity are
usually difficult to verify. In that case, we often resort to a looser definition
of stationarity, known as Wide-Sense Stationarity (W.S.S). Thus, a process
X(t) is said to be Wide-Sense Stationary if
(i) E{ X (t )} = μ
and
(ii) E\{X(t_1) X^*(t_2)\} = R_{XX}(t_1 - t_2),

2. Review of Stochastic Processes: Stationary (4)
i.e., for wide-sense stationary processes, the mean is a constant and the
autocorrelation function depends only on the difference between the time
indices. Strict-sense stationarity always implies wide-sense stationarity.
However, the converse is not true in general, the only exception being the
Gaussian process. If X(t) is a Gaussian process, then
wide-sense stationarity (w.s.s) ⇒ strict-sense stationarity (s.s.s).

2. Review of Stochastic Processes: Systems (1)
‰ A deterministic system transforms each input waveform X(t, ζi ) into an
output waveform Y(t, ζi) = T[X(t, ζi)] by operating only on the time variable t.
A stochastic system operates on both the variables t and ζ .

Thus, in a deterministic system, a set of realizations at the input corresponding to a process X(t) generates a new set of realizations Y(t, ζ) at the output, associated with a new process Y(t):
X(t, ζi) → T[·] → Y(t, ζi),   i.e.   X(t) → T[·] → Y(t).

Our goal is to study the output process statistics in terms of the input
process statistics and the system function.

2. Review of Stochastic Processes: Systems (2)

Deterministic Systems:
- Memoryless systems: Y(t) = g[X(t)].
- Systems with memory: time-varying systems, time-invariant systems, and linear systems, Y(t) = L[X(t)].
- Linear time-invariant (LTI) systems (X(t) → h(t) → Y(t)):
  Y(t) = \int_{-\infty}^{+\infty} h(t-\tau) X(\tau)\,d\tau = \int_{-\infty}^{+\infty} h(\tau) X(t-\tau)\,d\tau.
2. Review of Stochastic Processes: Systems (3)
‰ Memoryless Systems:
The output Y(t) in this case depends only on the present value of the input X(t).
i.e., Y ( t ) = g{ X ( t )}

Strict-sense stationary input → Memoryless system → Strict-sense stationary output.
Wide-sense stationary input → Memoryless system → Output need not be stationary in any sense.
X(t) stationary Gaussian with R_XX(τ) → Memoryless system → Y(t) stationary, but not Gaussian, with R_XY(τ) = ηR_XX(τ).

2. Review of Stochastic Processes: Systems (4)
Theorem: If X(t) is a zero mean stationary Gaussian process, and
Y(t) = g[X(t)], where g(.) represents a nonlinear memoryless device, then
R XY (τ ) = ηR XX (τ ), η = E{g ′( X )}.
where g’(x) is the derivative with respect to x.

‰ Linear Systems: L[.] represents a linear system if


L{a1 X (t1 ) + a 2 X (t2 )} = a1 L{ X (t1 )} + a 2 L{ X (t2 )}.
Let Y(t) = L{X(t)} represent the output of a linear system.

‰ Time-Invariant System: L[.] represents a time-invariant system if


Y (t ) = L{ X (t )} ⇒ L{ X (t − t0 )} = Y (t − t0 )
i.e., shift in the input results in the same shift in the output also.

‰ If L[.] satisfies both conditions for linear and time-invariant, then it


corresponds to a linear time-invariant (LTI) system.

2. Review of Stochastic Processes: Systems (5)
LTI systems can be uniquely represented in terms of their output to a delta function (the impulse response of the system):
δ(t) → LTI → h(t).
Then, for an arbitrary input X(t) → LTI → Y(t),
Y(t) = \int_{-\infty}^{+\infty} h(t-\tau) X(\tau)\,d\tau = \int_{-\infty}^{+\infty} h(\tau) X(t-\tau)\,d\tau,
where
X(t) = \int_{-\infty}^{+\infty} X(\tau)\,\delta(t-\tau)\,d\tau.

2. Review of Stochastic Processes: Systems (6)
Thus
Y(t) = L\{X(t)\} = L\left\{ \int_{-\infty}^{+\infty} X(\tau)\,\delta(t-\tau)\,d\tau \right\}
     = \int_{-\infty}^{+\infty} L\{X(\tau)\,\delta(t-\tau)\}\,d\tau \qquad \text{(by linearity)}
     = \int_{-\infty}^{+\infty} X(\tau)\, L\{\delta(t-\tau)\}\,d\tau \qquad \text{(by time-invariance)}
     = \int_{-\infty}^{+\infty} X(\tau)\, h(t-\tau)\,d\tau = \int_{-\infty}^{+\infty} h(\tau)\, X(t-\tau)\,d\tau.

Then, the mean of the output process is given by
\mu_Y(t) = E\{Y(t)\} = \int_{-\infty}^{+\infty} E\{X(\tau)\}\, h(t-\tau)\,d\tau = \int_{-\infty}^{+\infty} \mu_X(\tau)\, h(t-\tau)\,d\tau = \mu_X(t) * h(t).

2. Review of Stochastic Processes: Systems (7)
Similarly, the cross-correlation function between the input and output
processes is given by
R XY (t1 , t2 ) = E{ X (t1 )Y * (t2 )}
+∞
= E{ X (t1 ) ∫ − ∞ X * (t2 − α )h * (α )dα }
+∞
= ∫ − ∞ E{ X (t1 ) X * (t2 − α )}h * (α )dα
+∞
= ∫ − ∞ R XX (t1 , t2 − α )h * (α )dα
= R XX (t1 , t2 ) ∗ h * (t2 ).

2. Review of Stochastic Processes: Systems (8)
Finally the output autocorrelation function is given by
RYY (t1 , t2 ) = E{Y (t1 )Y * (t2 )}
+∞
= E{∫ − ∞ X (t1 − β )h ( β )dβ Y * (t2 )}
+∞
= ∫ − ∞ E{ X (t1 − β )Y * (t2 )}h ( β )dβ
+∞
= ∫ − ∞ R XY (t1 − β , t2 )h ( β )dβ
= R XY (t1 , t2 ) ∗ h(t1 ),
or
RYY (t1 , t2 ) = R XX (t1 , t2 ) ∗ h * (t2 ) ∗ h (t1 ).

In particular, if X(t) is wide-sense stationary, then we have μX(t) = μX


Also,
R_{XY}(t_1, t_2) = \int_{-\infty}^{+\infty} R_{XX}(t_1 - t_2 + \alpha)\, h^*(\alpha)\,d\alpha = R_{XX}(\tau) * h^*(-\tau) \triangleq R_{XY}(\tau), \qquad \tau = t_1 - t_2.
2. Review of Stochastic Processes: Systems (9)
Thus X(t) and Y(t) are jointly w.s.s. Further, the output autocorrelation
simplifies to
+∞
RYY (t1 , t 2 ) = ∫ −∞ RXY (t1 − β − t 2 )h( β )dβ , τ = t1 − t 2
= RXY (τ ) ∗ h(τ ) = RYY (τ ).
or
RYY (τ ) = RXX (τ ) ∗ h* ( −τ ) ∗ h(τ ).

2. Review of Stochastic Processes: Systems (10)

(a) X(t) wide-sense stationary process → LTI system h(t) → Y(t) wide-sense stationary process.
(b) X(t) strict-sense stationary process → LTI system h(t) → Y(t) strict-sense stationary process.
(c) X(t) Gaussian process (also stationary) → Linear system → Y(t) Gaussian process (also stationary).

2. Review of Stochastic Processes: Systems (11)
‰ White Noise Process: W(t) is said to be a white noise process if
RWW (t1 , t 2 ) = q (t1 )δ (t1 − t 2 ),
i.e., E[W(t1) W*(t2)] = 0 unless t1 = t2.

W(t) is said to be wide-sense stationary (w.s.s) white noise if


E[W(t)] = constant, and
RWW (t1 , t 2 ) = qδ (t1 − t 2 ) = qδ (τ ).

If W(t) is also a Gaussian process (white Gaussian process), then all of


its samples are independent random variables (why?).

For a w.s.s. white noise input W(t), we have
E[N(t)] = \mu_W \int_{-\infty}^{+\infty} h(\tau)\,d\tau, \ \text{a constant, and} \quad R_{NN}(\tau) = q\,\delta(\tau) * h^*(-\tau) * h(\tau) = q\, h^*(-\tau) * h(\tau) = q\,\rho(\tau),
where \rho(\tau) = h(\tau) * h^*(-\tau) = \int_{-\infty}^{+\infty} h(\alpha)\, h^*(\alpha+\tau)\,d\alpha.

2. Review of Stochastic Processes: Systems (12)
Thus the output of a white noise process through an LTI system represents
a (colored) noise process.
Note: White noise need not be Gaussian.
“White” and “Gaussian” are two different concepts!

White noise W(t) → LTI h(t) → Colored noise N(t) = h(t) * W(t)

2. Review of Stochastic Processes: Discrete Time (1)
‰ A discrete time stochastic process (DTStP) Xn = X(nT) is a sequence of
random variables. The mean, autocorrelation and auto-covariance functions
of a discrete-time process are given by
μ n = E{ X ( nT )}
R ( n1 , n2 ) = E{ X ( n1T ) X * ( n2T )}
C ( n1 , n2 ) = R ( n1 , n2 ) − μ n1 μ n*2
respectively. As before strict sense stationarity and wide-sense stationarity
definitions apply here also. For example, X(nT) is wide sense stationary if
E{ X ( nT )} = μ , a constant
and
E[X\{(k+n)T\}\, X^*\{kT\}] \triangleq R(n) = r_n = r_{-n}^*,
i.e., R(n_1, n_2) = R(n_1 - n_2) = R^*(n_2 - n_1).

2. Review of Stochastic Processes: Discrete Time (2)
‰ If X(nT) represents a wide-sense stationary input to a discrete-time
system {h(nT)}, and Y(nT) the system output, then as before the cross
correlation function satisfies
R XY ( n ) = R XX ( n ) ∗ h * ( − n )
and the output autocorrelation function is given by
RYY ( n ) = R XY ( n ) ∗ h( n )
or
RYY (n ) = R XX ( n ) ∗ h * ( −n ) ∗ h( n ).

Thus wide-sense stationarity from input to output is preserved for discrete-


time systems also.
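A discrete-time illustration of these relations (an added sketch, numpy assumed, with a short real FIR filter picked arbitrarily): for white input with variance σ², the estimated R_YY(n) should match σ² Σ_k h(k)h(k+n).

```python
import numpy as np

rng = np.random.default_rng(6)
h = np.array([1.0, 0.5, -0.25])             # illustrative real FIR impulse response
sigma2, N = 2.0, 1_000_000

w = rng.normal(0.0, np.sqrt(sigma2), N)     # white input, R_WW(n) = sigma2 * delta(n)
y = np.convolve(w, h, mode='full')[:N]      # output of the discrete-time LTI system

for n in range(3):
    r_hat = np.mean(y[n:] * y[:N - n])                     # estimated R_YY(n)
    r_theory = sigma2 * np.sum(h[:len(h) - n] * h[n:])     # sigma2 * sum_k h(k) h(k+n)
    print(n, r_hat, r_theory)
```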

2. Review of Stochastic Processes: Discrete Time (3)
‰ The mean (or ensemble average μ) of a stochastic process is obtained by averaging across the process, while the time average is obtained by averaging along the process as
\hat{\mu} = \frac{1}{M} \sum_{n=0}^{M-1} X_n,
where M is the total number of time samples used in the estimation.

Consider a wide-sense stationary DTStP X_n. The time average converges to the ensemble average if
\lim_{M \to \infty} E\left[ (\mu - \hat{\mu})^2 \right] = 0,
in which case the process X_n is said to be mean ergodic (in the mean-square error sense).
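A quick numerical sketch of mean ergodicity (added, numpy assumed; the MA-filtered noise is only an example of a w.s.s. process with known mean):

```python
import numpy as np

rng = np.random.default_rng(7)
mu = 3.0
M_values = [100, 10_000, 1_000_000]

w = rng.standard_normal(max(M_values) + 3)
x = mu + np.convolve(w, [0.5, 0.3, 0.2], mode='valid')   # w.s.s. process with mean mu

for M in M_values:
    mu_hat = np.mean(x[:M])                # time average over M samples
    print(M, abs(mu_hat - mu))             # typically shrinks as M grows (mean ergodic)
```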

2. Review of Stochastic Processes: Discrete Time (4)
‰ Let us define an (M×1) observation vector x_n that represents the elements of the time series X_n, X_{n-1}, …, X_{n-M+1}:
x_n = [X_n, X_{n-1}, …, X_{n-M+1}]^T

An (M×M) correlation matrix R (using the wide-sense stationarity condition) can be defined as
R = E[x_n x_n^H] = \begin{bmatrix} R(0) & R(1) & \cdots & R(M-1) \\ R(-1) & R(0) & \cdots & R(M-2) \\ \vdots & \vdots & \ddots & \vdots \\ R(-M+1) & R(-M+2) & \cdots & R(0) \end{bmatrix}

Superscript H denotes Hermitian transposition.
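The definition can be illustrated by averaging outer products of observation vectors taken from data; a minimal sketch (numpy assumed, with an arbitrary complex MA(1) test process):

```python
import numpy as np

rng = np.random.default_rng(8)
M, N = 4, 200_000

w = (rng.standard_normal(N + M) + 1j * rng.standard_normal(N + M)) / np.sqrt(2)
u = np.convolve(w, [1.0, 0.6 + 0.2j], mode='valid')      # a complex w.s.s. test process

# stack x_n = [u(n), u(n-1), ..., u(n-M+1)]^T and average x_n x_n^H
X = np.array([u[n - M + 1:n + 1][::-1] for n in range(M - 1, len(u))])
R_hat = np.einsum('ki,kj->ij', X, X.conj()) / len(X)

print(np.allclose(R_hat, R_hat.conj().T))     # Hermitian
print(np.round(R_hat, 2))                     # (approximately) Toeplitz, as expected
```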

2. Review of Stochastic Processes: Discrete Time (5)
Properties:
(i) The correlation matrix R of a stationary DTStP is Hermitian: R^H = R, or R(-k) = R^*(k). Therefore
R = \begin{bmatrix} R(0) & R(1) & \cdots & R(M-1) \\ R^*(1) & R(0) & \cdots & R(M-2) \\ \vdots & \vdots & \ddots & \vdots \\ R^*(M-1) & R^*(M-2) & \cdots & R(0) \end{bmatrix}
(ii) The matrix R of a stationary DTStP is Toeplitz: all elements on the main diagonal are equal, and the elements on any subdiagonal are also equal.
(iii) Let x be an arbitrary (nonzero) (M×1) complex-valued vector; then x^H R x ≥ 0 (R is nonnegative definite).
(iv) If x_n^B is the backward arrangement of x_n, i.e., x_n^B = [x_{n-M+1}, x_{n-M+2}, …, x_n]^T, then E[x_n^B x_n^{BH}] = R^T.

2. Review of Stochastic Processes: Discrete Time (6)
(v) Consider the correlation matrices R_M and R_{M+1}, corresponding to M and M+1 observations of the process. These matrices are related by
R_{M+1} = \begin{bmatrix} R(0) & r^H \\ r & R_M \end{bmatrix}
or
R_{M+1} = \begin{bmatrix} R_M & r^{B*} \\ r^{BT} & R(0) \end{bmatrix}
where r^H = [R(1), R(2), …, R(M)] and r^{BT} = [R(-M), R(-M+1), …, R(-1)].

2. Review of Stochastic Processes: Discrete Time (7)
‰ Consider a time series consisting of complex sine wave plus noise:
un = u (n) = α exp( jωn) + v(n), n = 0,..., N − 1
The sources of the sine wave and the noise are independent. Assume that v(n) has zero mean and autocorrelation function given by
E[v(n)\, v^*(n-k)] = \begin{cases} \sigma_v^2, & k = 0 \\ 0, & k \ne 0 \end{cases}

For a lag k, the autocorrelation function of the process u(n) is
r(k) = E[u(n)\, u^*(n-k)] = \begin{cases} |\alpha|^2 + \sigma_v^2, & k = 0 \\ |\alpha|^2 e^{j\omega k}, & k \ne 0 \end{cases}

2. Review of Stochastic Processes: Discrete Time (8)
Therefore, the correlation matrix of u(n) is
R = |\alpha|^2 \begin{bmatrix} 1 + \frac{1}{\rho} & \exp(j\omega) & \cdots & \exp(j\omega(M-1)) \\ \exp(-j\omega) & 1 + \frac{1}{\rho} & \cdots & \exp(j\omega(M-2)) \\ \vdots & \vdots & \ddots & \vdots \\ \exp(-j\omega(M-1)) & \exp(-j\omega(M-2)) & \cdots & 1 + \frac{1}{\rho} \end{bmatrix}
where \rho = \frac{|\alpha|^2}{\sigma_v^2} is the signal-to-noise ratio (SNR).
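This correlation matrix can be cross-checked against a sample estimate; the snippet below is an added sketch (numpy assumed; α, ω, σv², M chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(9)
alpha, omega, sigma2_v, M, N = 1.0 + 0.5j, 0.6, 0.4, 3, 500_000

n = np.arange(N)
v = np.sqrt(sigma2_v / 2) * (rng.standard_normal(N) + 1j * rng.standard_normal(N))
u = alpha * np.exp(1j * omega * n) + v

# theoretical R: element (i, j) = |alpha|^2 exp(jw(j - i)) + sigma_v^2 delta(i - j)
k = np.arange(M)
R = abs(alpha)**2 * np.exp(1j * omega * (k[None, :] - k[:, None])) + sigma2_v * np.eye(M)

# sample estimate from observation vectors x_n = [u(n), u(n-1), ..., u(n-M+1)]^T
X = np.array([u[i - M + 1:i + 1][::-1] for i in range(M - 1, N)])
R_hat = np.einsum('ki,kj->ij', X, X.conj()) / len(X)

print(np.max(np.abs(R - R_hat)))          # small: the estimate matches the model
```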

2. Review of Stochastic Models (1)
‰ Consider an input – output representation
X(n) = -\sum_{k=1}^{p} a_k X(n-k) + \sum_{k=0}^{q} b_k W(n-k),

where X(n) may be considered as the output of a system {h(n)} driven by


the input W(n). Using the z-transform, this gives
X(z) \sum_{k=0}^{p} a_k z^{-k} = W(z) \sum_{k=0}^{q} b_k z^{-k}, \qquad a_0 \equiv 1,
or
H(z) = \sum_{k=0}^{\infty} h(k) z^{-k} = \frac{X(z)}{W(z)} = \frac{b_0 + b_1 z^{-1} + b_2 z^{-2} + \cdots + b_q z^{-q}}{1 + a_1 z^{-1} + a_2 z^{-2} + \cdots + a_p z^{-p}} \triangleq \frac{B(z)}{A(z)}

represents the transfer function of the associated system response {h(n)} so


that
X(n) = \sum_{k=0}^{\infty} h(n-k) W(k). \qquad\qquad (W(n) \to h(n) \to X(n))
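Such a process can be generated directly from the difference equation; the following is a brief sketch (scipy assumed, and the coefficient values below are arbitrary illustrations with poles inside the unit circle):

```python
import numpy as np
from scipy.signal import lfilter

rng = np.random.default_rng(10)

a = [1.0, -0.5, 0.25]        # A(z) = 1 + a1 z^-1 + a2 z^-2  (here a1 = -0.5, a2 = 0.25)
b = [1.0, 0.4]               # B(z) = b0 + b1 z^-1

w = rng.standard_normal(100_000)     # uncorrelated zero-mean input W(n), sigma_W^2 = 1
x = lfilter(b, a, w)                 # ARMA(2,1) output: X(n) = -a1 X(n-1) - a2 X(n-2) + W(n) + 0.4 W(n-1)

print(x.mean(), x.var())             # mean ~ 0; variance is R_XX(0) of this model
```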

2. Review of Stochastic Models (2)
Notice that the transfer function H(z) is rational with p poles and q zeros that
determine the model order of the underlying system. The output X(n)
undergoes regression over p of its previous values and at the same time a
moving average based on W(n), W(n-1), …, W(n-q) of the input over (q + 1)
values is added to it, thus generating an Auto Regressive Moving Average
(ARMA (p, q)) process X(n).
Generally the input {W(n)} represents a sequence of uncorrelated random
variables of zero mean and constant variance so that
RWW (n) = σ W2δ (n).

If in addition, {W(n)} is normally distributed then the output {X(n)} also


represents a strict-sense stationary normal process.

If q = 0, then X(n) represents an Auto Regressive AR(p) process (all-pole process), and if p = 0, then X(n) represents a Moving Average MA(q) process (all-zero process).

2. Review of Stochastic Models (3)
AR(1) process: An AR(1) process has the form
X ( n ) = aX ( n − 1) + W ( n )
and the corresponding system transfer function

H(z) = \frac{1}{1 - a z^{-1}} = \sum_{n=0}^{\infty} a^n z^{-n},
provided |a| < 1. Thus
h(n) = a^n, \quad n \ge 0, \ |a| < 1,
represents the impulse response of a stable AR(1) system. We get the output autocorrelation sequence of an AR(1) process to be
R_{XX}(n) = \sigma_W^2\,\delta(n) * h(-n) * h(n) = \sigma_W^2 \sum_{k=0}^{\infty} a^{|n|+k} a^{k} = \sigma_W^2\, \frac{a^{|n|}}{1 - a^2}.
The normalized (in terms of R_{XX}(0)) output autocorrelation sequence is given by
\rho_X(n) = \frac{R_{XX}(n)}{R_{XX}(0)} = a^{|n|}, \quad |n| \ge 0.
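An added simulation sketch (numpy assumed, a = 0.8 arbitrary) confirms that the estimated normalized autocorrelation of an AR(1) process follows a^|n|:

```python
import numpy as np

rng = np.random.default_rng(11)
a, N = 0.8, 200_000

w = rng.standard_normal(N)
x = np.empty(N)
x[0] = w[0]
for n in range(1, N):                       # X(n) = a X(n-1) + W(n)
    x[n] = a * x[n - 1] + w[n]

r0 = np.mean(x * x)                         # ~ sigma_W^2 / (1 - a^2)
for n in range(4):
    rho_hat = np.mean(x[n:] * x[:N - n]) / r0
    print(n, rho_hat, a**n)                 # normalized autocorrelation ~ a^|n|
```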
2. Review of Stochastic Models (4)
It is instructive to compare an AR(1) model discussed above by
superimposing a random component to it, which may be an error term
associated with observing a first order AR process X(n). Thus
Y ( n) = X ( n) + V ( n)
where X(n) ~ AR(1), and V(n) is an uncorrelated random sequence with
zero mean and variance σ2V that is also uncorrelated with {W(n)}. Then,
we obtain the output autocorrelation of the observed process Y(n) to be
R_{YY}(n) = R_{XX}(n) + R_{VV}(n) = R_{XX}(n) + \sigma_V^2\,\delta(n) = \sigma_W^2\, \frac{a^{|n|}}{1 - a^2} + \sigma_V^2\,\delta(n),

so that its normalized version is given by


\rho_Y(n) \triangleq \frac{R_{YY}(n)}{R_{YY}(0)} = \begin{cases} 1, & n = 0 \\ c\, a^{|n|}, & n = \pm 1, \pm 2, \ldots \end{cases}
where
c = \frac{\sigma_W^2}{\sigma_W^2 + \sigma_V^2 (1 - a^2)} < 1.

2. Review of Stochastic Models (5)
The results demonstrate the effect of superimposing an error sequence on
an AR(1) model. For non-zero lags, the autocorrelation of the observed
sequence {Y(n)} is reduced by a constant factor compared to the original
process {X(n)}. The superimposed error sequence V(n) only affects the
corresponding term in Y(n) (term by term). However, a particular term in
the “input sequence” W(n) affects X(n) and Y(n) as well as all subsequent
observations.

[Plot: normalized autocorrelations versus lag k, with ρ_X(0) = ρ_Y(0) = 1 and ρ_X(k) > ρ_Y(k) for k ≠ 0.]

2. Review of Stochastic Models (6)
AR(2) Process: An AR(2) process has the form
X ( n) = a1 X ( n − 1) + a2 X (n − 2) + W (n)
and the corresponding transfer function is given by
H(z) = \sum_{n=0}^{\infty} h(n) z^{-n} = \frac{1}{1 - a_1 z^{-1} - a_2 z^{-2}} = \frac{b_1}{1 - \lambda_1 z^{-1}} + \frac{b_2}{1 - \lambda_2 z^{-1}},
so that
h(0) = 1, h(1) = a1 , h(n) = a1h(n − 1) + a2 h(n − 2), n ≥ 2
and in terms of the poles of the transfer function, we have
h(n) = b1λ1n + b2 λn2 , n≥0
that represents the impulse response of the system. We also have
λ1 + λ2 = a1 , λ1λ2 = −a2 ,
b1 + b2 = 1, b1λ1 + b2 λ2 = a1.
and H(z) stable implies | λ1 |< 1, | λ2 |< 1.
2. Review of Stochastic Models (7)
Further, the output autocorrelations satisfy the recursion
R_{XX}(n) = E\{X(n+m) X^*(m)\} = E\{[a_1 X(n+m-1) + a_2 X(n+m-2)] X^*(m)\} + \underbrace{E\{W(n+m) X^*(m)\}}_{= 0}
          = a_1 R_{XX}(n-1) + a_2 R_{XX}(n-2),
and hence their normalized version is given by
\rho_X(n) \triangleq \frac{R_{XX}(n)}{R_{XX}(0)} = a_1 \rho_X(n-1) + a_2 \rho_X(n-2).
By direct calculation, the output autocorrelations are given by
R_{XX}(n) = R_{WW}(n) * h^*(-n) * h(n) = \sigma_W^2\, h^*(-n) * h(n) = \sigma_W^2 \sum_{k=0}^{\infty} h^*(n+k)\, h(k)
= \sigma_W^2 \left( \frac{|b_1|^2 (\lambda_1^*)^n}{1 - |\lambda_1|^2} + \frac{b_1^* b_2 (\lambda_1^*)^n}{1 - \lambda_1^* \lambda_2} + \frac{b_1 b_2^* (\lambda_2^*)^n}{1 - \lambda_1 \lambda_2^*} + \frac{|b_2|^2 (\lambda_2^*)^n}{1 - |\lambda_2|^2} \right).

2. Review of Stochastic Models (8)
Then, the normalized output autocorrelations may be expressed as
\rho_X(n) = \frac{R_{XX}(n)}{R_{XX}(0)} = c_1 (\lambda_1^*)^n + c_2 (\lambda_2^*)^n.

2. Review of Stochastic Models (9)
‰ An ARMA(p, q) system has only p + q + 1 independent coefficients (a_k, k = 1, …, p; b_i, i = 0, …, q), and hence its impulse response sequence {h_k} must also exhibit a similar dependence among its terms. In fact, a classical result due to Kronecker (1881) (see also P. Dienes, 1931) states that the necessary and sufficient condition for H(z) = \sum_{k=0}^{\infty} h_k z^{-k} to represent a rational (ARMA) system is that
\det H_n = 0, \quad n \ge N \ \text{(for all sufficiently large } n\text{)},
where
H_n \triangleq \begin{pmatrix} h_0 & h_1 & h_2 & \cdots & h_n \\ h_1 & h_2 & h_3 & \cdots & h_{n+1} \\ \vdots & \vdots & \vdots & & \vdots \\ h_n & h_{n+1} & h_{n+2} & \cdots & h_{2n} \end{pmatrix},
i.e., in the case of rational systems, for all sufficiently large n, the Hankel matrices H_n all have the same rank.

