Lecture notes prepared by Dr. Gregory L. Plett. Copyright © 2016, 2010, 2008, 2006, 2004, 2002, 2001, 1999, Gregory L. Plett
ECE5530, VECTOR RANDOM PROCESSES
■ Properties that are true of all pdfs:

  1. $f_X(x) \ge 0 \;\; \forall x$.

  2. $\int_{-\infty}^{\infty} f_X(x)\,dx = 1$.

  3. $\Pr(X \le x_0) = \int_{-\infty}^{x_0} f_X(x)\,dx = F_X(x_0)$, which is the RV's cumulative
     distribution function (cdf).¹
■ Problem: Apart from simple examples it is often difficult to determine
  $f_X(x)$ accurately. ➠ Use approximations to capture the key behavior.
■ Need to define key characteristics of $f_X(x)$.
EXPECTATION: Describes the expected outcome of a random trial.

$$\bar{x} = E[X] = \int_{-\infty}^{\infty} x f_X(x)\,dx.$$

■ Expectation is a linear operator (very important for working with it).
■ So, for example, the first moment about the mean: $E[X - \bar{x}] = 0$.
STATISTICAL AVERAGE: Different from expectation.
■ Consider a (discrete) RV $X$ that can assume $n$ values $x_1, x_2, \ldots, x_n$.
■ Define the average by making many measurements $N \to \infty$, where
  $m_i$ is the number of times the value $x_i$ is measured. Then,

$$\bar{x} = \frac{1}{N}\left(m_1 x_1 + m_2 x_2 + \cdots + m_n x_n\right)
          = \frac{m_1}{N} x_1 + \frac{m_2}{N} x_2 + \cdots + \frac{m_n}{N} x_n.$$

■ In the limit, $m_i/N \to \Pr(X = x_i)$ and so forth (assuming ergodicity), so

$$\bar{x} = \sum_{i=1}^{n} x_i \Pr(X = x_i).$$
¹ Some call this a probability distribution function, with acronym PDF (uppercase), as
different from a probability density function, which has acronym pdf (lowercase).
■ So, statistical means can converge to expectation in the limit. (Can
  show a similar property for continuous RVs... but harder to do.)
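The convergence of the statistical average to the expectation is easy to check numerically. Below is a minimal NumPy sketch (the notes' own examples use MATLAB; Python is used here only for illustration), with an arbitrary three-valued discrete RV chosen for the demonstration:

```python
import numpy as np

# Arbitrary discrete RV for illustration: values x_i with Pr(X = x_i)
values = np.array([1.0, 2.0, 5.0])
probs = np.array([0.5, 0.3, 0.2])

# Expectation: sum_i x_i Pr(X = x_i)
expectation = float(np.sum(values * probs))

# Statistical average: (1/N) times the sum of N measurements
rng = np.random.default_rng(0)
N = 100_000
samples = rng.choice(values, size=N, p=probs)
statistical_average = float(samples.mean())

print(expectation)           # 2.1
print(statistical_average)   # close to 2.1 for large N
```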
VARIANCE: Second moment about the mean.

$$\mathrm{var}(X) = E[(X - \bar{x})^2]
  = \int_{-\infty}^{\infty} (x - \bar{x})^2 f_X(x)\,dx
  = E[X^2] - (E[X])^2,$$

or, equivalently, the mean-square minus the squared mean.
STANDARD DEVIATION: Measure of dispersion about the mean of the
samples of $X$: $\sigma_X = \sqrt{\mathrm{var}(X)}$.
■ The expectation and variance capture key features of the actual pdf.
Higher-order moments are available, but we won’t need them!
KEY POINT FOR UNDERSTANDING VARIANCE: Chebyshev's inequality
■ Chebyshev's inequality states (for positive $\varepsilon$)

$$\Pr(|X - \bar{x}| \ge \varepsilon) \le \frac{\sigma_X^2}{\varepsilon^2},$$

which implies that probability is concentrated around the mean.
■ It may be proven as follows:

$$\Pr(|X - \bar{x}| \ge \varepsilon)
  = \int_{-\infty}^{\bar{x}-\varepsilon} f_X(x)\,dx
  + \int_{\bar{x}+\varepsilon}^{\infty} f_X(x)\,dx.$$
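Chebyshev's inequality is distribution-free, so it can be spot-checked by Monte Carlo against any distribution. A small NumPy sketch (the exponential distribution here is an arbitrary choice for the check):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 200_000
x = rng.exponential(scale=1.0, size=N)   # any distribution works for the check

xbar = x.mean()
var = x.var()

# Pr(|X - xbar| >= eps) should never exceed var / eps^2
for eps in (0.5, 1.0, 2.0, 3.0):
    tail = np.mean(np.abs(x - xbar) >= eps)
    bound = var / eps**2
    assert tail <= bound
    print(f"eps={eps}: tail prob {tail:.4f} <= bound {bound:.4f}")
```

The bound is loose (often by an order of magnitude), which is exactly the point: it holds no matter what the pdf looks like.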
2.2: Vector random variables
■ With very little change in the preceding, we can also handle vectors of
random variables.
$$X = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_n \end{bmatrix};
\quad \text{let } x_0 = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}.$$

■ $X$ described by (scalar function) joint pdf $f_X(x)$ of vector $X$.
■ $f_X(x_0)$ means $f_X(X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n)$.
■ That is, $f_X(x_0)\,dx_1\,dx_2 \cdots dx_n$ is the probability that $X$ is between $x_0$
  and $x_0 + dx$.
■ Properties of joint pdf $f_X(x)$:

  1. $f_X(x) \ge 0 \;\; \forall x$. Same as before.

  2. $\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f_X(x)\,dx_1\,dx_2 \cdots dx_n = 1$. Basically the same.

  3. $\bar{x} = E[X] = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} x f_X(x)\,dx_1\,dx_2 \cdots dx_n$. Basically the same.

  4. Correlation matrix: Different.
$y^T \Sigma_X y \ge 0 \;\; \forall y$, since

$$0 \le E[(y^T(X - \bar{x}))^2]
  = y^T E[(X - \bar{x})(X - \bar{x})^T]\, y
  = y^T \Sigma_X y.$$

■ Notice that correlation and covariance are the same for zero-mean
  random vectors.
■ The covariance entries have specific meaning: $(\Sigma_X)_{ii} = \sigma_{X_i}^2$.

(Figure: joint pdf contours, centered at $(\bar{x}_1, \bar{x}_2)$.)
$$f_X(x) = \frac{1}{(2\pi)^{n/2} |\Sigma_X|^{1/2}}
  \exp\left(-\frac{1}{2}(x - \bar{x})^T \Sigma_X^{-1} (x - \bar{x})\right),$$

where $|\Sigma_X| = \det(\Sigma_X)$; $\Sigma_X^{-1}$ requires positive-definite $\Sigma_X$.
2.3: Uncorrelated versus independent
■ Independence implies that $E[X_1 X_2] = E[X_1]E[X_2]$, so independent RVs are
  therefore uncorrelated.
■ Does uncorrelatedness imply independence? Consider the example
  $Y_1 = \sin(2\pi X)$ and $Y_2 = \cos(2\pi X)$,
  with $X$ uniformly distributed on $[0, 1]$.
■ We can show that $E[Y_1] = E[Y_2] = E[Y_1 Y_2] = 0$.
■ So, $\mathrm{cov}(Y_1, Y_2) = 0$ and $Y_1$ and $Y_2$ are uncorrelated.
■ But $Y_1^2 + Y_2^2 = 1$, and therefore $Y_1$ and $Y_2$ are clearly not independent.
■ Note: $\mathrm{cov}(X_1, X_2) = 0$ (uncorrelated) implies that
  $E[X_1 X_2] = E[X_1]E[X_2]$, since
  $\mathrm{cov}(X_1, X_2)
  = E[X_1 X_2 - X_1\bar{x}_2 - \bar{x}_1 X_2 + \bar{x}_1\bar{x}_2]
  = E[X_1 X_2] - E[X_1]E[X_2]$.
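The sin/cos example above is easy to verify numerically; a brief NumPy sketch:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.uniform(0.0, 1.0, size=500_000)   # X uniform on [0, 1]
Y1 = np.sin(2*np.pi*X)
Y2 = np.cos(2*np.pi*X)

# E[Y1], E[Y2], and E[Y1 Y2] are all (approximately) zero, so cov(Y1, Y2) ~ 0
print(Y1.mean(), Y2.mean(), (Y1*Y2).mean())

# ...yet Y1 and Y2 are deterministically related, hence not independent
assert np.allclose(Y1**2 + Y2**2, 1.0)
```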
2.4: Functions of random variables
MAIN POINT #3: Suppose we are given two random variables $Y$ and $X$
with $Y = g(X)$. Also assume that $g^{-1}$ exists and that $g$ and $g^{-1}$ are
continuously differentiable. Then

$$f_Y(y) = f_X(g^{-1}(y)) \left| \frac{\partial g^{-1}(y)}{\partial y} \right|,$$
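The change-of-variables formula can be sanity-checked against a histogram. A NumPy sketch, using the arbitrary choice $g(x) = e^x$ with $X$ standard normal (so $g^{-1}(y) = \ln y$ and $|\partial g^{-1}/\partial y| = 1/y$):

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal(1_000_000)
Y = np.exp(X)                        # g(X) = e^X

def f_X(x):                          # standard normal pdf
    return np.exp(-x**2 / 2) / np.sqrt(2*np.pi)

def f_Y(y):                          # f_Y(y) = f_X(g^{-1}(y)) |dg^{-1}/dy|
    return f_X(np.log(y)) / y

# Histogram density of Y should track the transformed pdf
counts, edges = np.histogram(Y, bins=50, range=(0.1, 5.0))
width = edges[1] - edges[0]
density = counts / (len(Y) * width)
centers = 0.5 * (edges[:-1] + edges[1:])
max_err = np.max(np.abs(density - f_Y(centers)))
print("max histogram error:", max_err)
```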
■ $f_X(x) = \dfrac{1}{(2\pi)^{n/2}|\Sigma_X|^{1/2}} \exp\left(-\dfrac{1}{2}(x - \bar{x})^T \Sigma_X^{-1}(x - \bar{x})\right)$.
■ Therefore,

$$f_Y(y) = \frac{|A^{-1}|}{(2\pi)^{n/2}|\Sigma_X|^{1/2}}
  \exp\left(-\frac{1}{2}\left(A^{-1}(y - B) - \bar{x}\right)^T \Sigma_X^{-1} \left(A^{-1}(y - B) - \bar{x}\right)\right)$$

$$= \frac{1}{(2\pi)^{n/2}\left(|A||\Sigma_X||A^T|\right)^{1/2}}
  \exp\left(-\frac{1}{2}(y - \bar{y})^T (A^{-1})^T \Sigma_X^{-1} A^{-1} (y - \bar{y})\right)$$

$$= \frac{1}{(2\pi)^{n/2}|\Sigma_Y|^{1/2}}
  \exp\left(-\frac{1}{2}(y - \bar{y})^T \Sigma_Y^{-1} (y - \bar{y})\right),$$

if $\Sigma_Y = A\Sigma_X A^T$ and $\bar{y} = A\bar{x} + B$. That is, $Y \sim \mathcal{N}(A\bar{x} + B,\; A\Sigma_X A^T)$.
CONCLUSION : Sum of Gaussians is Gaussian—very special case.
RANDOM NOTE: How to use randn.m to simulate non-zero-mean
Gaussian noise with covariance $\Sigma_Y$?
■ $Y \sim \mathcal{N}(\bar{y}, \Sigma_Y)$, but randn.m returns $X \sim \mathcal{N}(0, I)$.
■ Try $y = \bar{y} + A^T x$, where $A$ is square with the same dimension as $\Sigma_Y$ and
  $A^T A = \Sigma_Y$. ($A$ is the Cholesky factor of the positive-definite
  symmetric matrix $\Sigma_Y$.)
ybar = [1; 2];
covar = [1, 0.5; 0.5, 1];
A = chol(covar);                 % upper triangular, A'*A = covar
x = randn([2, 1]);
y = ybar + A'*x;                 % one sample of Y ~ N(ybar, covar)

% 5000 samples, using an LDL' factorization instead:
[L,D] = ldl(covar);
x = randn([2,5000]);
y = ybar(:,ones([1 5000])) + (L*sqrt(D))*x;

(Figure: scatter plot of 5000 samples with mean [1;2] and covariance
[1,0.5; 0.5,1]; x coordinate versus y coordinate.)
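The same sampling idea can be sketched in NumPy for comparison. One difference worth flagging: numpy's cholesky returns a lower-triangular $L$ with $LL^T = \Sigma$, whereas MATLAB's chol returns an upper-triangular $A$ with $A^T A = \Sigma$:

```python
import numpy as np

ybar = np.array([1.0, 2.0])
covar = np.array([[1.0, 0.5],
                  [0.5, 1.0]])

L = np.linalg.cholesky(covar)        # lower triangular, L @ L.T = covar

rng = np.random.default_rng(4)
x = rng.standard_normal((2, 5000))   # X ~ N(0, I)
y = ybar[:, None] + L @ x            # Y ~ N(ybar, covar)

sample_mean = y.mean(axis=1)
sample_cov = np.cov(y)
print(sample_mean)   # close to [1, 2]
print(sample_cov)    # close to [[1, 0.5], [0.5, 1]]
```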
$$Z = AX + BW + a$$
$$E[Z] = \bar{z} = A\bar{x} + B\bar{w} + a$$
$$\Sigma_Z = A\Sigma_X A^T + B\Sigma_W B^T.$$
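These moment-propagation formulas can be confirmed by simulation. A NumPy sketch with arbitrary $A$, $B$, $a$, and covariances chosen only for illustration (here $W$ is taken to be zero mean, so the $B\bar{w}$ term drops out):

```python
import numpy as np

rng = np.random.default_rng(5)
N = 500_000

# Arbitrary system matrices and statistics, for illustration only
A = np.array([[1.0, 0.2], [0.0, 1.0]])
B = np.array([[0.5], [1.0]])
a = np.array([1.0, -1.0])
xbar = np.array([2.0, 0.0])
Sx = np.array([[1.0, 0.3], [0.3, 2.0]])
Sw = np.array([[0.25]])

X = rng.multivariate_normal(xbar, Sx, size=N).T    # X ~ N(xbar, Sx)
W = rng.multivariate_normal([0.0], Sw, size=N).T   # W ~ N(0, Sw), independent of X

Z = A @ X + B @ W + a[:, None]

zbar_pred = A @ xbar + a                   # wbar = 0 here
Sz_pred = A @ Sx @ A.T + B @ Sw @ B.T

print(Z.mean(axis=1), zbar_pred)
print(np.cov(Z), Sz_pred)
```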
2.5: Conditioning

$$f_{X,Y}(x, y) = f_{Y|X}(y \mid x)\, f_X(x),$$
Therefore,

$$f_{X|Y}(x \mid y) = \frac{f_{Y|X}(y \mid x)\, f_X(x)}{f_Y(y)}.$$
■ This is known as Bayes’ rule. It relates the a posteriori probability to
the a priori probability.
■ It forms a key step in the Kalman filter derivation.
■ Since the state of our dynamic system adds up the effects of lots of
independent random inputs, it is reasonable to assume that the
distribution of the state tends to the normal distribution.
■ This leads to the key assumptions for the derivation of the Kalman
filter, as we will see:
2.6: Vector random (stochastic) processes
■ A stochastic random process is a family of random vectors indexed by
a parameter set (“time” in our case).
  ! For example, the random process $X(t)$.
  ! The value of the random process at any time $t_0$ is the RV $X(t_0)$.

Properties

"White" noise
■ Correlation from one time instant to the next is a key property of a
  random process. ➠ $R_X(\tau)$ and $C_X(\tau)$ are very important.
■ Some processes have a unique autocorrelation:
1. Zero mean,
2. $R_X(\tau) = E[X(t)X(t + \tau)^T] = S_X\,\delta(\tau)$, where $\delta(\tau)$ is the Dirac delta.
■ Therefore, the process is uncorrelated in time.
■ Clearly an abstraction, but proves to be a very useful one.
(Figure: sample realizations of white noise and correlated noise; value
versus time.)
$$S_X(\omega) = \int_{-\infty}^{\infty} R_X(\tau)\, e^{-j\omega\tau}\,d\tau$$

$$R_X(\tau) = \frac{1}{2\pi} \int_{-\infty}^{\infty} S_X(\omega)\, e^{j\omega\tau}\,d\omega.$$

■ $S_X(\omega)$ gives a frequency-domain interpretation of the "power"
  (energy) in the random process $X(t)$, and is real, symmetric, and
  positive definite for real random processes.
EXAMPLE: $R_X(\tau) = \sigma^2 e^{-\alpha|\tau|}$ for $\alpha > 0$. Then,

$$S_X(\omega) = \int_{-\infty}^{\infty} \sigma^2 e^{-\alpha|\tau|} e^{-j\omega\tau}\,d\tau
             = \frac{2\sigma^2\alpha}{\omega^2 + \alpha^2}.$$

(Figure: the autocorrelation $R_X(\tau)$, with peak of $\sigma^2$, and the PSD
$S_X(\omega)$, with peak of $2\sigma^2/\alpha$.)

■ $\alpha$ is a bandwidth measure.
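This transform pair is easy to confirm by direct numerical integration; a NumPy sketch (the $\sigma^2$ and $\alpha$ values below are arbitrary test values):

```python
import numpy as np

sigma2, alpha = 1.5, 2.0                 # arbitrary sigma^2 and alpha
tau = np.linspace(-40.0, 40.0, 400_001)  # wide enough that R has decayed to ~0
h = tau[1] - tau[0]
R = sigma2 * np.exp(-alpha * np.abs(tau))

for w in (0.0, 1.0, 3.0):
    # R is even, so the Fourier transform reduces to a cosine integral
    S_numeric = np.sum(R * np.cos(w * tau)) * h
    S_closed = 2 * sigma2 * alpha / (w**2 + alpha**2)
    assert abs(S_numeric - S_closed) < 1e-3
    print(w, S_numeric, S_closed)
```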
INTERPRETATION:
2.7: Discrete-time dynamic systems with random inputs
(Figure: block diagram of a state-space system $A, B, C, D$ with
deterministic input $u(t)$, noise input $w(t)$, state $x(t)$, and output $y(t)$.)

$$\dot{x}(t) = Ax(t) + Bu(t), \quad x(0) = x_0; \quad x(0),\, u(t) \text{ known.}$$

(Figure: deterministic versus stochastic state trajectories starting
from $x(t_0)$.)
NOTATION: Until now, we have always used capital letters for random
variables. The state of a system driven by a random process is a
random vector, so we could now call it $X(t)$ or $X_k$. However, it is more
common to retain the standard notation $x(t)$ or $x_k$ and understand
from the context that we are now discussing an RV.
Discrete-time systems
■ We will start with a discrete system and then look at how to get an
equivalent answer for continuous systems.
■ Model:

$$x_k = A_{d,k-1}\, x_{k-1} + B_{d,k-1}\, u_{k-1} + w_{k-1},$$

where
1. Zero mean: $E[w_k] = 0 \;\; \forall k$.
2. White: $E[w_{k_1} w_{k_2}^T] = \Sigma_w\, \delta(k_1 - k_2)$.
■ $\Sigma_w$ is called the spectral density of the noise signal $w_k$.
■ Uncertainty at startup: statistics of the initial condition.

$$E[x_0] = \bar{x}_0; \quad E[(x_0 - \bar{x}_0) w_k^T] = 0 \;\; \forall k.$$
Mean value:

$$\bar{x}_k = E[A_d x_{k-1} + B_d u_{k-1} + w_{k-1}]
            = A_d \bar{x}_{k-1} + B_d u_{k-1}.$$

$$\bar{x}_0: \text{Given}; \quad \bar{x}_k = A_d \bar{x}_{k-1} + B_d u_{k-1}.$$
■ To study the random variations about the mean, we need to form the
second central moment of the statistics.
■ Easiest to study if we note that
  $x_k - \bar{x}_k = A_d(x_{k-1} - \bar{x}_{k-1}) + w_{k-1}$.
■ Thus,
  $\Sigma_{x,k} = E[(x_k - \bar{x}_k)(x_k - \bar{x}_k)^T]$.
■ Three terms: $A_d \Sigma_{x,k-1} A_d^T$, the cross terms (which vanish, since $w_{k-1}$
  is uncorrelated with $x_{k-1}$), and $\Sigma_w$. So,

$$\Sigma_{x,0}: \text{Given}; \quad \Sigma_{x,k} = A_d \Sigma_{x,k-1} A_d^T + \Sigma_w.$$

In this equation, $A_d \Sigma_{x,k-1} A_d^T$ is the homogeneous part; $\Sigma_w$ is the driving term.
  ! As $k \to \infty$, $\Sigma_{x,k} = \Sigma_{x,k+1} = \Sigma_x$.
  ! Then, $\Sigma_x = A_d \Sigma_x A_d^T + \Sigma_w$.

■ For example, in the scalar case with $A_d = \alpha$ and $\Sigma_w = 1$:

$$\Sigma_x = \alpha \Sigma_x \alpha^T + 1
\;\;\Longrightarrow\;\; \Sigma_x (1 - \alpha^2) = 1
\;\;\Longrightarrow\;\; \Sigma_x = \frac{1}{1 - \alpha^2}.$$

Valid for $|\alpha| < 1$; otherwise unstable.
■ In the example below, we plot 100 random trajectories for $\alpha = 0.75$
  and $\Sigma_{x,0} = 50$.
■ Compare propagation of $\Sigma_{x,k}$ with the steady-state dlyap.m solution.
$$\Sigma_x = 2.29.$$

(Figure: left, propagation of $\Sigma_{x,k}$ versus time, decaying from 50 toward
steady state; right, "Propagation of state, with error bounds", showing 100
simulated trajectories versus time.)

■ The error bounds plotted are $3\sigma$ bounds (i.e., $\pm 3\sqrt{\Sigma_{x,k}}$).
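The covariance recursion and its steady state are easy to reproduce. A scalar Python sketch with the same $\alpha = 0.75$, $\Sigma_w = 1$, $\Sigma_{x,0} = 50$ as above:

```python
alpha, Sw = 0.75, 1.0
Sx = 50.0                            # Sigma_{x,0}

for k in range(200):
    Sx = alpha * Sx * alpha + Sw     # Sigma_{x,k} = Ad Sigma_{x,k-1} Ad^T + Sigma_w

Sx_ss = 1.0 / (1.0 - alpha**2)       # closed-form steady state
print(Sx, Sx_ss)                     # both ~2.2857 (rounded to 2.29 in the notes)
```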
2.8: Continuous-time dynamic systems with random inputs
■ For continuous-time systems we have

$$\dot{x}(t) = A(t)x(t) + B_u(t)u(t) + B_w(t)w(t),$$

where $x(t)$ is the state vector, a random process.

$$\bar{x}(0): \text{Given}; \quad \dot{\bar{x}}(t) = A\bar{x}(t) + B_u u(t).$$
KEY POINT: While $S_w$ may have a simple form, $\Sigma_w$ will be a full matrix in
general.
■ To solve the integral, we can approximate: as $\Delta t \to 0$, then $k\Delta t - \tau \to 0$
  and $e^{A(k\Delta t - \tau)} \approx I + A(k\Delta t - \tau) + \cdots$
■ That is, $e^{A(k\Delta t - \tau)} \approx I$. Then,

$$\Sigma_w \approx \left(B_w S_w B_w^T\right) \Delta t.$$

■ We will see a better method to evaluate $\Sigma_w$ when $\Delta t \ne 0$, but for now
  we continue with this result to determine the continuous-time system
  covariance propagation.
$$\Sigma_x(0): \text{Given}; \quad \dot{\Sigma}_x(t) = A\Sigma_x(t) + \Sigma_x(t)A^T + B_w S_w B_w^T.$$

In this equation, $A\Sigma_x(t) + \Sigma_x(t)A^T$ is the homogeneous part; $B_w S_w B_w^T$ is the driving term.
EXAMPLE: $\dot{x}(t) = Ax(t) + B_w w(t)$, where $A$ and $B_w$ are scalars.
■ Then, $\dot{\Sigma}_x = 2A\Sigma_x + B_w^2 S_w$. This can be solved in closed form to find

$$\Sigma_x(t) = \frac{B_w^2 S_w}{2A}\left(e^{2At} - 1\right) + \Sigma_x(0)\, e^{2At}.$$

■ If $A < 0$ (stable) then the initial-condition contribution goes to zero, leaving

$$\Sigma_{x,ss} = \frac{-B_w^2 S_w}{2A},$$

independent of $\Sigma_x(0)$.
■ Example: Let $A = -1$, $B_w = 1$, $S_w = 2$, and $\Sigma_x(0) = 5$. Find $\Sigma_x(t)$.

  Solution: $\Sigma_x(t) = 1 + 4e^{-2t}$, so $\Sigma_{x,ss} = 1$.

(Figure: $\Sigma_x(t)$ decaying from 5 at $t = 0$ toward 1 as time increases.)

■ Increased $\Sigma_{x,ss}$ as more noise added: $B_w^2 S_w$ term.
■ Decreased $\Sigma_{x,ss}$ as $A$ becomes "more stable".
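The closed-form solution can be cross-checked against a direct forward-Euler integration of the variance ODE. A short Python sketch using the example's numbers:

```python
import math

A, Bw, Sw, Sx0 = -1.0, 1.0, 2.0, 5.0

def Sx_closed(t):
    # Sigma_x(t) = (Bw^2 Sw / 2A)(e^{2At} - 1) + Sigma_x(0) e^{2At}
    return Bw**2 * Sw / (2*A) * (math.exp(2*A*t) - 1.0) + Sx0 * math.exp(2*A*t)

# Forward-Euler integration of dSigma_x/dt = 2 A Sigma_x + Bw^2 Sw
dt, T = 1e-4, 5.0
Sx = Sx0
for _ in range(int(T / dt)):
    Sx += dt * (2*A*Sx + Bw**2 * Sw)

Sx_ss = -Bw**2 * Sw / (2*A)
print(Sx, Sx_closed(T), Sx_ss)   # Euler and closed form agree; steady state is 1
```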
2.9: Relating $\Sigma_w$ to $S_w$ precisely: A little trick
■ Define

$$Z = \begin{bmatrix} -A & B_w S_w B_w^T \\ 0 & A^T \end{bmatrix};
\quad C = e^{Z\Delta t} = \begin{bmatrix} c_{11} & c_{12} \\ 0 & c_{22} \end{bmatrix}.$$
■ We can compute $C = \mathcal{L}^{-1}\left[(sI - Z)^{-1}\right]\big|_{t=\Delta t}$ and find that

$$c_{11} = \mathcal{L}^{-1}\left[(sI - z_{11})^{-1}\right]\big|_{t=\Delta t}
        = \mathcal{L}^{-1}\left[(sI + A)^{-1}\right]\big|_{t=\Delta t} = e^{-A\Delta t}$$

$$c_{12} = \mathcal{L}^{-1}\left[(sI - z_{11})^{-1}\, z_{12}\, (sI - z_{22})^{-1}\right]\big|_{t=\Delta t}
        = \mathcal{L}^{-1}\left[(sI + A)^{-1} B_w S_w B_w^T (sI - A^T)^{-1}\right]\big|_{t=\Delta t}$$

$$c_{22} = \mathcal{L}^{-1}\left[(sI - z_{22})^{-1}\right]\big|_{t=\Delta t}
        = \mathcal{L}^{-1}\left[(sI - A^T)^{-1}\right]\big|_{t=\Delta t} = e^{A^T\Delta t} = A_d^T$$

■ So, $c_{22}^T = A_d$. Also, $c_{22}^T c_{12} = \Sigma_w$. With one expm.m command, we can
  quickly switch from continuous time to discrete time.
■ To show the second identity, recognize that $c_{12}$ is a convolution of two
  impulse responses: one due to the system $(sI + A)^{-1}B_w$ and the
  other due to the system $S_w B_w^T (sI - A^T)^{-1}$.
■ The first of these has impulse response $Ie^{-At}B_w$ and the second has
  impulse response $S_w B_w^T e^{A^T t}$. Convolving them we get

$$\int_0^{\Delta t} I e^{-A\tau} B_w S_w B_w^T e^{A^T(\Delta t - \tau)}\,d\tau.$$
■ Finally, consider

$$c_{22}^T c_{12} = \int_0^{\Delta t} e^{A(\Delta t - \tau)} B_w S_w B_w^T e^{A^T(\Delta t - \tau)}\,d\tau = \Sigma_w.$$
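A sketch of this trick in Python. The notes use MATLAB's expm.m; a small series-based matrix exponential stands in here so the example is self-contained. The $\Delta t$ is not stated in this excerpt, so $\Delta t = 0.4$ is assumed below (it approximately reproduces the matrices printed in the example that follows); the midpoint-rule integral gives an independent check of $\Sigma_w$.

```python
import numpy as np

def expm(M, terms=20, squarings=10):
    """Matrix exponential by scaling-and-squaring with a truncated Taylor series."""
    M = M / 2.0**squarings
    E = np.eye(M.shape[0])
    P = np.eye(M.shape[0])
    for k in range(1, terms):
        P = P @ M / k
        E = E + P
    for _ in range(squarings):
        E = E @ E
    return E

def disc_noise(A, Q, dt):
    """Build Z = [[-A, Q], [0, A^T]] and read Ad and Sigma_w from e^{Z dt}."""
    n = A.shape[0]
    Z = np.zeros((2*n, 2*n))
    Z[:n, :n] = -A
    Z[:n, n:] = Q              # Q = Bw Sw Bw^T
    Z[n:, n:] = A.T
    C = expm(Z * dt)
    Ad = C[n:, n:].T           # c22^T = e^{A dt}
    Sw_d = Ad @ C[:n, n:]      # c22^T c12 = Sigma_w
    return Ad, Sw_d

A = np.array([[0.0, 1.0], [-1.0, -1.0]])
Q = np.array([[0.0, 0.0], [0.0, 0.06]])
dt = 0.4                       # assumed; not shown in this excerpt
Ad, Sw_d = disc_noise(A, Q, dt)

# Independent check: Sigma_w = int_0^dt e^{As} Q e^{A^T s} ds by midpoint rule
M = 500
ds = dt / M
Sw_bf = np.zeros((2, 2))
for k in range(M):
    E = expm(A * (k + 0.5) * ds)
    Sw_bf += E @ Q @ E.T * ds
assert np.allclose(Sw_d, Sw_bf, atol=1e-7)

print(np.round(Ad, 2))
print(np.round(Sw_d, 4))
```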
$$Z = \begin{bmatrix} 0 & -1 & 0 & 0 \\ 1 & 1 & 0 & 0.06 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & 1 & -1 \end{bmatrix};
\quad C = \begin{bmatrix} 0.91 & -0.47 & -0.006 & -0.0046 \\ 0.47 & 1.38 & 0.0046 & 0.023 \\ 0 & 0 & 0.93 & -0.32 \\ 0 & 0 & 0.32 & 0.62 \end{bmatrix}.$$
■ So,

$$A_d = \begin{bmatrix} 0.93 & 0.32 \\ -0.32 & 0.62 \end{bmatrix}; \quad
\Sigma_w = \begin{bmatrix} 0.0009 & 0.0030 \\ 0.0030 & 0.0156 \end{bmatrix}; \quad
B_w S_w B_w^T = \begin{bmatrix} 0 & 0 \\ 0 & 0.06 \end{bmatrix}.$$
■ Exact $\Sigma_w$ and approximate $\Sigma_w$ very different due to large $\Delta t$.
■ Compare predictions of steady-state performance.
  ! Continuous:
    $A\Sigma_{x,ss} + \Sigma_{x,ss}A^T + B_w S_w B_w^T = 0; \quad
     \Sigma_{x,ss} = \begin{bmatrix} 0.03 & 0 \\ 0 & 0.03 \end{bmatrix}.$
  ! Discrete:
    $\Sigma_{x,ss} = A_d \Sigma_{x,ss} A_d^T + \Sigma_w; \quad
     \Sigma_{x,ss} = \begin{bmatrix} 0.03 & 0 \\ 0 & 0.03 \end{bmatrix}.$
■ Because we used discrete white noise scaled to be equivalent to the
continuous noise, the steady-state predictions are the same.
■ Conversion now complete:
DISCRETE: $A_d$, $\Sigma_w$, $B_d$
CONTINUOUS: $A$, $S_w$, $B_u$, $B_w$

$$\Sigma_x(0): \text{Given}; \quad \dot{\Sigma}_x(t) = A\Sigma_x(t) + \Sigma_x(t)A^T + B_w S_w B_w^T$$
$$\bar{x}(0): \text{Given}; \quad \dot{\bar{x}}(t) = A\bar{x}(t) + B_u u(t).$$
2.10: Shaping filters
■ We have assumed that the inputs to the linear dynamic systems are
white (Gaussian) noises.
■ Therefore, the PSD is flat ➠ The input has content at all frequencies.
■ Pretty limiting assumption, but one that can be easily fixed ➠ Can use
second linear system to “shape” the noise and modify the PSD as
desired.
(Figure: previous picture, white noise $w(t)$ into $G(s)$ producing $y(t)$;
new picture, white noise $w_2(t)$ into shaping filter $H(s)$ producing shaped
noise $w_1(t)$, which drives $G(s)$ to produce $y(t)$.)
■ Therefore, we can drive our linear system with noise that has a
desired PSD by introducing a shaping filter H.s/ that itself is driven by
white noise.
■ The combined system GH.s/ looks exactly the same as before, but
the system G.s/ is not driven by pure white noise any more.
■ Analysis quite simple: Augment original model with filter states.
■ Original system has

$$\dot{x}(t) = Ax(t) + B_w w_1(t)$$
$$y(t) = Cx(t)$$
■ New shaping filter with white input and desired PSD output has
$$\dot{x}_s(t) = A_s x_s(t) + B_s w_2(t)$$
$$w_1(t) = C_s x_s(t).$$
EXAMPLE: Let

$$S_{w_1}(\omega) = \frac{2\sigma^2\alpha}{\omega^2 + \alpha^2}
  = \frac{\sigma\sqrt{2\alpha}}{\alpha + j\omega} \cdot \frac{\sigma\sqrt{2\alpha}}{\alpha - j\omega}.$$
Then
$$H(s) = \frac{\sigma\sqrt{2\alpha}}{s + \alpha};$$

$$\dot{x}_s(t) = -\alpha x_s(t) + \sigma\sqrt{2\alpha}\, w_2(t)$$
$$w_1(t) = x_s(t).$$
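A quick simulation check of this shaping filter: discretizing $\dot{x}_s = -\alpha x_s + \sigma\sqrt{2\alpha}\,w_2$ exactly over a step $\Delta t$ gives a first-order AR recursion whose stationary variance should be $\sigma^2 = R_X(0)$, and whose autocorrelation should follow $\sigma^2 e^{-\alpha|\tau|}$. A NumPy sketch (the $\alpha$, $\sigma$ values are arbitrary):

```python
import numpy as np

alpha, sigma = 2.0, 1.5              # arbitrary bandwidth and RMS value
dt, N = 0.01, 500_000

rng = np.random.default_rng(6)
phi = np.exp(-alpha * dt)            # exact one-step transition
q = sigma**2 * (1.0 - phi**2)        # exact one-step noise variance

x = np.empty(N)
x[0] = rng.normal(0.0, sigma)        # start in steady state
w = rng.normal(0.0, np.sqrt(q), size=N - 1)
for k in range(N - 1):
    x[k+1] = phi * x[k] + w[k]

# Stationary variance should be sigma^2 = R_X(0)
print(x.var(), sigma**2)

# Autocorrelation at lag tau should be sigma^2 exp(-alpha tau)
lag = 50                             # tau = lag*dt = 0.5
R_hat = np.mean(x[:-lag] * x[lag:])
print(R_hat, sigma**2 * np.exp(-alpha * lag * dt))
```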