
IN5340 / IN9340 Lecture 2
Random variables, vectors and sequences
Roy Edgar Hansen
January 2022

Outline

1 Introduction
2 Random processes
3 Temporal characteristics
  Stationarity and Ergodicity
  Correlation and covariance
4 Complex numbers
  Recap complex numbers
  Complex random sequences
5 Example: sonar data
6 Summary
7 The project

What do we learn

Random sequences (discrete time random processes)
Wikipedia on Stochastic processes
Literature:

R. M. Gray and L. D. Davisson. An Introduction to Statistical Signal Processing. Cambridge University Press, 2004. URL https://ee.stanford.edu/~gray/sp.pdf.
M. H. Hayes. Statistical Digital Signal Processing and Modeling. John Wiley & Sons, 1996.

Definition of random process

Concept: Enlarging the random variable to include time.
A random variable x becomes a function of the possible outcomes (values) s of an experiment and time t

x(s, t)

The family of all such functions is called a random process

X(s, t)

Convenient short form x(t) for specific waveform of the random process X(t)
A random process becomes a random variable for fixed time
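One way to picture this in code: a discrete-time random process can be stored as a 2-D array with one row per outcome s and one column per time index n. A minimal NumPy sketch with synthetic numbers (the array sizes are arbitrary):

import numpy as np

rng = np.random.default_rng(7)
X = rng.normal(size=(1000, 256))   # ensemble: 1000 realisations, 256 time samples

x_t = X[3, :]      # fixing the outcome s gives one realisation x(n), a time series
x_rv = X[:, 100]   # fixing the time n gives a random variable: one sample per realisation
print(x_t.shape, x_rv.shape, x_rv.mean())
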
Ensemble and realisation

X(s, t) represents a family or ensemble of time functions
Convenient short form x(t) for specific waveform of the random process X(t)
Each member time function is called a realisation

Classification of processes

There are different types of random processes
One of particular interest: the discrete time random process
We will in the following use the notation x(n) using index n in a discrete time sequence
Or, the notation x(t) using time t in the time sequence (which still can be discrete)
A discrete time random process is a collection (or ensemble) of discrete-time signals.
Sloppy notation ahead: We will use x instead of X

Definition of random process

Discrete time random process
Indexed sequence of random variables
Straightforward extension of the concept of random variables
Each random variable has a probability distribution

Fx(α, n) = Pr{x(n) ≤ α}

and a probability density function

fx(α, n) = dFx(α, n)/dα

Definition of random process

A complete statistical characterisation requires the joint probability distribution

Fx(α1, ..., αk, n1, ..., nk) = Pr{x(n1) ≤ α1, ..., x(nk) ≤ αk}

or the joint probability density function

fx(α1, ..., αk, n1, ..., nk) = ∂^k Fx(α1, ..., αk, n1, ..., nk) / (∂α1 ... ∂αk)

Sometimes it is sufficient to describe the random process with the first and second order distribution (or density)

Stationarity

At a given time, the random process becomes a random variable
This random variable has statistical properties such as a probability density function, mean value, variance, moments etc
Stationarity: if all the statistical properties do not change with time, the random process is said to be stationary
Stationarity is the same as statistical time-invariance

Stationary processes

First order distribution function for the random process

Fx(α, n) = Pr{x(n) ≤ α}

The random process is called stationary to order one if

fx(α, n) = fx(α, n + k) for all n, k

This implies that the expectation becomes constant

E{x(n)} = x̄ = constant

Stationary processes

Second order joint distribution function for the random process

Fx(α1, α2, n1, n2) = Pr{x(n1) ≤ α1, x(n2) ≤ α2}

The random process is called stationary to order two if

fx(α1, α2, n1, n2) = fx(α1, α2, n1 + k, n2 + k) for all k, n1, n2

A second order stationary process is also first order stationary

Autocorrelation

We define the autocorrelation as

Rxx(n1, n2) = E{x(n1)x(n2)}

If the random process is second order stationary, we realise that we can write k = n2 − n1, which gives

Rxx(n1, n2) = Rxx(k) where k = n2 − n1

Written in time continuous notation

Rxx(t1, t2) = Rxx(τ) where τ = t2 − t1

Wide sense stationary processes

A random process is said to be wide sense stationary (WSS) if
The expected value (mean value) of the process is constant

E{x(n)} = x̄

The autocorrelation is only a function of lag

Rxx(n1, n2) = Rxx(k)

The variance of the random process is finite

µ2 < ∞

In Norwegian: svakt stasjonær (weakly stationary)

Time averages and ergodicity

The time average of any quantity is defined as

A{·} = lim_{T→∞} (1/2T) ∫_{−T}^{T} {·} dt

A denotes time average as opposed to E that denotes the ensemble average
Consider a random realisation x(t) of the random process X(t)
The time average becomes x̄ = A{x(t)}
Similarly, the time autocorrelation Rxx(τ) = A{x(t)x(t + τ)}
When all sample functions in the ensemble are considered, we realise that x̄ and Rxx(τ) become random variables
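A small sketch contrasting the two averages, assuming the ensemble is stored as a 2-D array with one realisation per row (synthetic data, so the true mean is known):

import numpy as np

rng = np.random.default_rng(6)
ensemble = 2.0 + rng.normal(size=(500, 10_000))    # 500 realisations x(s, n) with mean 2

time_avg_per_realisation = ensemble.mean(axis=1)   # A{x(n)} for each realisation
ensemble_avg_per_time = ensemble.mean(axis=0)      # estimate of E{x(n)} at each n

# For an ergodic process both concentrate around the same value (here 2)
print(time_avg_per_realisation[:3], ensemble_avg_per_time[:3])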

Time averages and ergodicity cont

By taking the expectation of these random variables, we obtain

E{x̄} = X̄

E{Rxx(τ)} = Rxx(τ)

Ergodic processes stated loosely
If the time averages equal the corresponding statistical averages, the process is said to be ergodic
What does this require? At least that the statistical properties do not change with time.

Autocorrelation

The autocorrelation function of a random process x is

Rxx(t1, t2) = E{x(t1)x(t2)}

Rxx(t, t + τ) = E{x(t)x(t + τ)}

For Wide Sense Stationary (WSS) processes

Rxx(t, t + τ) = Rxx(τ)

Properties of the autocorrelation for WSS random processes:

|Rxx(τ)| ≤ Rxx(0)
Rxx(−τ) = Rxx(τ)
Rxx(0) = E{x²(t)}

In the computer: How to estimate the expectation

Recall

E{x} = ∫_{−∞}^{∞} α fx(α) dα

How do we estimate the expected value from a random sequence?
In real life we do not have the probability density function.
In real life, we never have all realisations, which means that we often have to assume ergodicity
Given a sequence x(n) we approximate the ensemble average with the time average

E{x} ≈ A{x(n)} = (1/N) Σ_{n=1}^{N} x(n)

Why is this an approximation of the expectation?
What happened to the PDF?

In the computer: How to estimate the PDF

The PDF can be estimated by the normalised histogram
Why? Well, the histogram is defined as

Histogram
function N = hist( x(n), bins )
  for all x(n) values
    find the correct bin for x(n)
    Count: N(bin) = N(bin) + 1
  end

The histogram gives directly the count of all different values. Normalise this, and we obtain the probability that any value can occur (density).
Multiplying the normalised histogram by the total number of values gives back the counts.
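A minimal NumPy sketch of both estimators, assuming an ergodic sequence stored in a 1-D array x (the data here are simulated and the bin count is arbitrary):

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=1.0, scale=2.0, size=100_000)    # one realisation x(n)

# Time average as an estimate of the ensemble average E{x}
x_mean = np.mean(x)

# Normalised histogram as an estimate of the PDF fx(alpha)
counts, edges = np.histogram(x, bins=100)
bin_width = edges[1] - edges[0]
pdf_est = counts / (counts.sum() * bin_width)       # same as density=True: unit area

print(x_mean, pdf_est.sum() * bin_width)            # approximately 1.0, and exactly 1.0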

In the computer: How to estimate the autocorrelation

Assuming a large time interval 2T and ergodicity (which implies that the random process is WSS)

Rxx(t, t + τ) = E{x(t)x(t + τ)} ≈ A{x(t)x(t + τ)} = (1/2T) ∫_{t−T}^{t+T} x(t)x(t + τ) dt ≈ Rxx(τ)

or in discrete notation

Rxx(k) ≈ A{x(n)x(n + k)} = (1/N) Σ_{n=1}^{N} x(n)x(n + k)

This is simply a convolution without reversing

In the computer: How to estimate the autocorrelation

In matlab
Signal Processing Toolbox: Correlation and Covariance
Similar tools in Python SciPy / NumPy.
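A sketch of the discrete estimator above in NumPy, assuming a real-valued ergodic sequence in a 1-D array x (the names and sizes are illustrative):

import numpy as np

def autocorr_biased(x, max_lag):
    """Estimate Rxx(k) = (1/N) sum_n x(n) x(n+k) for k = 0..max_lag."""
    N = len(x)
    return np.array([np.sum(x[:N - k] * x[k:]) / N for k in range(max_lag + 1)])

rng = np.random.default_rng(1)
x = rng.normal(size=10_000)            # white-noise realisation
Rxx = autocorr_biased(x, 20)
print(Rxx[0], Rxx[1:4])                # Rxx(0) close to the variance, other lags close to 0

# np.correlate(x, x, mode='full') computes the same lag products without the 1/N factor
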
What is the autocorrelation

Contains information about the history of the random process

Cross correlation

Assume two random processes x(t) and y(t).
The cross correlation function is defined as

Rxy(t1, t2) = E{x(t1)y(t2)}

We say they are jointly wide sense stationary if the cross correlation is only a function of time difference and not absolute time

Rxy(t, t + τ) = E{x(t)y(t + τ)} = Rxy(τ)

Autocovariance and cross covariance

Similar to the covariance, the autocovariance is

Cxx(t, t + τ) = E{(x(t) − E{x(t)}) (x(t + τ) − E{x(t + τ)})}

The cross covariance is

Cxy(t, t + τ) = E{(x(t) − E{x(t)}) (y(t + τ) − E{y(t + τ)})}

If Cxy(t, t + τ) = 0, the two random processes x and y are said to be uncorrelated

Autocovariance and cross covariance

For jointly wide sense stationary functions (remember that the expectation becomes constant):

Cxx(τ) = Rxx(τ) − x̄²

and

Cxy(τ) = Rxy(τ) − x̄ȳ

For a WSS random process, the variance is (by direct insertion)

σx² = E{(x(t) − E{x(t)})²} = Rxx(0) − x̄²
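A quick numerical check of the relation σx² = Rxx(0) − x̄² (a sketch with a synthetic WSS sequence):

import numpy as np

rng = np.random.default_rng(2)
x = 3.0 + rng.normal(scale=2.0, size=200_000)   # WSS sequence with mean 3 and std 2

x_bar = np.mean(x)
Rxx0 = np.mean(x * x)                           # estimate of Rxx(0) = E{x^2}
print(Rxx0 - x_bar**2, np.var(x))               # both close to 4
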
Complex numbers

Mathematical description: We define a complex number as

z = x + iy, z = x + jy

where x, y are real numbers and i² = −1 or j² = −1
Convenient representation in wireless comms, radar, sonar
Complex conjugate: z* = x − iy, in matlab: conj()
The absolute value or amplitude r = √(zz*) = √(x² + y²)
Euler's formula

e^{ix} = cos x + i sin x

Note: Often common to use complex random variables
See wikipedia on complex numbers and the matlab manual

Complex vectors

Complex vectors:

⃗z = [z1, z2, ..., zN]^T

where each zn is complex
⃗z^T = [z1, z2, ..., zN] is the transpose. transpose() or .' in matlab
⃗z^H = (⃗z^T)* is the Hermitian transpose, the complex conjugate of the transpose. ctranspose() or ' in matlab
||⃗z|| = √(⃗z^H ⃗z) is the Euclidean norm or 2-norm. norm() in matlab.
Read the manual for correct usage. Note that matlab often has multiple usages.
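The same operations in NumPy for reference (a small sketch; the vector values are arbitrary):

import numpy as np

z = np.array([1 + 2j, 3 - 1j, 0 + 4j])    # complex vector

z_T = z.T                  # transpose, no conjugation (matlab .')
z_H = z.conj().T           # Hermitian transpose (matlab ')
norm_z = np.linalg.norm(z)                # Euclidean norm, sqrt(z^H z)

print(np.allclose(norm_z**2, np.vdot(z, z).real))   # True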

The narrowband complex signal representation

z(t) = A(t)e^{iϕ(t)}, x(t) = ℜ{z(t)} = A(t) cos ϕ(t)

Typically in comms, radar, and sonar:

z(t) = A(t)e^{i(2πf0 t + ϕ(t))}

where f0 is deterministic, and A(t) and ϕ(t) are slowly varying
The basebanded signal is the signal itself where the carrier frequency is removed

zBB(t) = z(t)e^{−i2πf0 t} = A(t)e^{iϕ(t)}

If the signal bandwidth is small compared to the carrier, basebanding is an effective, lossless and deterministic process that allows for reducing the sampling rate

The Hilbert transform

The Hilbert transform is defined as

H{x(t)} = (1/π) p.v. ∫_{−∞}^{∞} x(τ)/(t − τ) dτ = x(t) * 1/(πt)

where p.v. stands for the Cauchy principal value of the integral
Why do we care? What does the Hilbert transform give us?
If z(t) = x(t) + iy(t) is an analytic signal, the Hilbert transform of the real part returns the imaginary part

y(t) = H{x(t)}

An analytic signal is a complex-valued function that has no negative frequency components

H{cos(t)} = sin(t), H{sin(t)} = −cos(t)
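A small NumPy sketch of basebanding, assuming we already have complex samples of a narrowband signal with a known carrier f0 (all parameter values below are made up for illustration):

import numpy as np

fs, f0 = 100e3, 20e3                               # sample rate and carrier (arbitrary)
t = np.arange(0, 0.01, 1 / fs)
A = 1 + 0.1 * np.cos(2 * np.pi * 50 * t)           # slowly varying amplitude
phi = 0.2 * np.sin(2 * np.pi * 30 * t)             # slowly varying phase
z = A * np.exp(1j * (2 * np.pi * f0 * t + phi))    # narrowband complex signal

z_bb = z * np.exp(-1j * 2 * np.pi * f0 * t)        # basebanded signal A(t) e^{i phi(t)}
print(np.allclose(np.abs(z_bb), A), np.allclose(np.angle(z_bb), phi))
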
The Hilbert transform 2

Again, why do we care? What does the Hilbert transform give us?
Consider a recorded narrowband signal: x(t) = A(t) cos{2πf0 t + ϕ(t)}
How do we obtain the signal amplitude A(t)? The approach:
Construct the analytic signal z(t) = x(t) + iH{x(t)} = A(t)e^{i(2πf0 t + ϕ(t))}
If needed, baseband: zBB(t) = A(t)e^{iϕ(t)}
The amplitude is now easily accessible, A(t) = |z(t)|, and also the phase variation ϕ(t) = ∠z(t)
In matlab: hilbert(). Note that the analytic signal is returned
See wikipedia on hilbert transform and analytic signals.

Complex random sequences

Expansion from random vectors to complex random vectors
Let ⃗z = ⃗x + i⃗y be a complex random vector where

⃗x^T = [x1, x2, ..., xN], ⃗y^T = [y1, y2, ..., yN]

are random vectors where each element is a random variable
A complete statistical characterisation requires the joint CDF

Fz(α1, β1, ..., αk, βk) = Pr{x1 ≤ α1, y1 ≤ β1, ..., xk ≤ αk, yk ≤ βk}

or the joint PDF

fz(α1, β1, ..., αk, βk) = ∂^{2k} Fz(α1, β1, ..., αk, βk) / (∂α1 ∂β1 ... ∂αk ∂βk)

Sometimes it is sufficient with the first and second order density
And for statistical independence, it is separable
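Following the Hilbert-transform recipe above, a minimal SciPy sketch (the signal parameters are made up; scipy.signal.hilbert returns the analytic signal, not the transform itself):

import numpy as np
from scipy.signal import hilbert

fs, f0 = 100e3, 20e3                               # arbitrary sample rate and carrier
t = np.arange(0, 0.01, 1 / fs)
A = 1 + 0.1 * np.cos(2 * np.pi * 50 * t)           # slowly varying amplitude
x = A * np.cos(2 * np.pi * f0 * t)                 # real narrowband recording

z = hilbert(x)                     # analytic signal x(t) + i H{x(t)}
A_est = np.abs(z)                  # amplitude estimate A(t)
phase = np.unwrap(np.angle(z))     # 2 pi f0 t + phi(t)

print(np.max(np.abs(A_est[100:-100] - A[100:-100])))   # small, away from the edges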

The Rayleigh distribution

Consider a complex random variable z = x + iy
where x, y are independent real zero-mean Gaussian random variables with the same variance
With PDF

fx(α) = (1/√(2πσ²)) exp(−α²/(2σ²))

denoted as x ∼ N(0, σ²)
And similarly, y ∼ N(0, σ²)
The magnitude A = √(x² + y²). In matlab: A = abs(z)
The phase θ = tan⁻¹(y/x). In matlab: theta = angle(z)
What are the PDFs for phase and amplitude?

The Rayleigh distribution - derivation of PDF 1

Two statistically independent random variables
The joint PDF is fx,y(α, β) = fx(α)fy(β)

fx,y(α, β) = (1/(2πσ²)) exp(−(α² + β²)/(2σ²))

Recap the relation between probability and the PDF

Pr{x1 < x ≤ x2} = ∫_{x1}^{x2} fx(α) dα

We are going to use this and perform a coordinate transform from cartesian x, y coordinates to polar coordinates r, θ
This derivation is taken from www.dsplog.com

The Rayleigh distribution - derivation of PDF 2

Consider the joint probability that x1 < x ≤ x1 + dx and y1 < y ≤ y1 + dy

Pr{x1 < x ≤ x1 + dx, y1 < y ≤ y1 + dy} = (1/(2πσ²)) exp(−(x² + y²)/(2σ²)) dx dy

Coordinate transform: dx dy = r dr dθ

Pr{x1 < x ≤ x1 + dx, y1 < y ≤ y1 + dy} = Pr{r1 < r ≤ r1 + dr, θ1 < θ ≤ θ1 + dθ}

The Rayleigh distribution - derivation of PDF 3

Coordinate transform

Pr{r1 < r ≤ r1 + dr, θ1 < θ ≤ θ1 + dθ} = (1/(2πσ²)) exp(−r²/(2σ²)) r dr dθ
= [(r/σ²) exp(−r²/(2σ²)) dr] [(1/(2π)) dθ]

giving a joint PDF in polar coordinates

fr,θ(r, θ) = (r/σ²) exp(−r²/(2σ²)) (1/(2π))

The Rayleigh distribution - derivation of PDF 4

Since r and θ are independent, we can separate into

fr(r) = (r/σ²) exp(−r²/(2σ²))

and

fθ(θ) = 1/(2π)

Remembering (and checking) the property

∫ fx(α) dα = 1

The PDF for the amplitude is the so-called Rayleigh distribution, and the PDF for the phase is the uniform distribution with equal probability of any phase.

The Rayleigh distribution

fr(r) = (r/σ²) exp(−r²/(2σ²)), Fr(r) = 1 − exp(−r²/(2σ²))

Lord Rayleigh. Nobel Prize winner in 1904 in Physics. wikipedia.org
More details can be found in any book on probability theory, wikipedia. This derivation is taken from www.dsplog.com
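A Monte Carlo sketch of the result: draw independent zero-mean Gaussians for the real and imaginary parts and compare the amplitude histogram with the Rayleigh PDF (σ and the sample size are arbitrary):

import numpy as np

rng = np.random.default_rng(3)
sigma = 1.5
x = rng.normal(scale=sigma, size=500_000)
y = rng.normal(scale=sigma, size=500_000)
z = x + 1j * y

A = np.abs(z)                                   # amplitude
theta = np.angle(z)                             # phase

counts, edges = np.histogram(A, bins=200, density=True)
centres = 0.5 * (edges[:-1] + edges[1:])
rayleigh_pdf = centres / sigma**2 * np.exp(-centres**2 / (2 * sigma**2))
print(np.max(np.abs(counts - rayleigh_pdf)))    # small

# The phase histogram should be flat at about 1/(2 pi)
print(np.histogram(theta, bins=20, density=True)[0])
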
Example: real data collected by a sonar

The HUGIN autonomous underwater vehicle
Wideband interferometric synthetic aperture sonar
Transmitter that insonifies the seafloor with an LFM pulse
Array of receivers that collects the echoes from the seafloor
The signal scattered from the seafloor is considered to be random
The signal consists of a signal part and additive noise

Example: real data collected by a sonar

Single channel timeseries from one ping
Consider the collected data a random process.
Is the process stationary?
How do we check for stationarity?

Example: Stationarity

Sliding window mean and standard deviation (matlab):

function statarr = sliding_stats_estimation( data )
  blocksize = 512;
  step = blocksize/4;
  n_blocks = floor( (length(data) - blocksize)/step ) + 1;
  statarr = zeros(n_blocks, 2);
  for n = 1:n_blocks
    block = data( (n-1)*step+1 : (n-1)*step+blocksize );
    statarr(n,1) = mean( block );
    statarr(n,2) = std( block );
  end
end

Example: Stationarity

The random sequence is clearly non-stationary
We divide into "similar" regions before we continue our statistical analysis
Region 1: Backscattered signal from the seafloor
Region 2: Additive noise

Example: Estimating the PDF

Question: Is the probability density function Gaussian?

fx(x) = (1/√(2πσx²)) exp(−(x − mx)²/(2σx²))

where mx is the first order moment (the mean value), and σx² = µ2 is the second order central moment (or the variance)
Approach: Compare the theoretical PDF with the estimated PDF (normalised histogram)

Example: Estimating the PDF

The sonar data are complex

x(t) = xRe(t) + jxIm(t) = a(t)e^{jϕ(t)}

The complex random sequence is considered as two independent random sequences (in a vector) with joint PDF.
We can check the PDF of the real and imaginary part separately
If xRe(t) and xIm(t) are statistically independent Gaussian random processes, the PDF of the amplitude (or magnitude)

a(t) = √(xRe²(t) + xIm²(t))

should be a Rayleigh distribution, and the PDF of the phase

ϕ(t) = tan⁻¹{xIm/xRe}

should be uniform.
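A sketch of the comparison for one region, assuming the complex samples sit in a 1-D array x (simulated here; for real data the Gaussian parameters are estimated from the samples):

import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(size=50_000) + 1j * rng.normal(size=50_000)   # stand-in for one region

for part, label in [(x.real, "real"), (x.imag, "imag")]:
    m, s = np.mean(part), np.std(part)
    counts, edges = np.histogram(part, bins=100, density=True)
    centres = 0.5 * (edges[:-1] + edges[1:])
    gauss = np.exp(-(centres - m) ** 2 / (2 * s**2)) / np.sqrt(2 * np.pi * s**2)
    print(label, np.max(np.abs(counts - gauss)))   # small if the part is Gaussian

# The magnitude can be compared with a Rayleigh PDF and the phase with a uniform PDF in the same way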

Example: Estimating the PDF

Before we move on, let's expand the ensemble by using all the receiver channels for each selected region.
Is this always safe?
Can two receiver channels be statistically dependent?

Example: Estimating the PDF - Region 1

Example: Estimating the PDF - Region 2

Example: Estimating the PDF - Conclusion

In region 2, the real and imaginary part fit a Gaussian well
The phase is also uniform (all phase values are equally probable)
The magnitude also fits a Rayleigh distribution well
In region 1, this is not the case.
The histogram indicates that the PDF is heavy tailed.
This means that it is more likely to have spikes (large amplitude values) in the timeseries than in a timeseries with a Gaussian PDF.
This actually fits the theory of acoustic scattering well.
What can an estimate of the PDF be used for?

Example: Statistical dependence

Can we check for dependence between real and imaginary part?
If the normalised cross covariance is zero, the two processes are said to be uncorrelated

ρxy(τ) = Cxy(τ)/(σx σy) = 0

Example: Statistical dependence between channels

How about the dependence between channels?
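A zero-lag sketch of the normalised cross covariance between two sequences (illustrative only; a lag-dependent estimate applies the same normalisation at each lag):

import numpy as np

def norm_cross_cov(x, y):
    """Normalised cross covariance rho_xy at zero lag."""
    return np.mean((x - np.mean(x)) * (y - np.mean(y))) / (np.std(x) * np.std(y))

rng = np.random.default_rng(5)
a = rng.normal(size=100_000)
b = rng.normal(size=100_000)              # independent of a
c = a + 0.5 * rng.normal(size=100_000)    # correlated with a

print(norm_cross_cov(a, b))   # close to 0: uncorrelated
print(norm_cross_cov(a, c))   # clearly non-zero
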
Example: Statistical dependence - Conclusion

The real and imaginary part of the signal are uncorrelated
The individual channels (receiver elements) are correlated
What physical phenomenon could cause this?
The channels are strongly correlated in region 2 (the noise region)
Why is this?
Any other way to detect the correlation between channels?

Original sonar image

After characterisation of the noise and filtering

Summary of lecture 2

random process
ensemble
realisation
discrete time random process
stationarity
autocorrelation
wide sense stationary
time average
ergodicity
cross correlation
autocovariance
cross covariance
complex numbers
Hilbert transform
the Rayleigh distribution

The project

The course is topic based with 7 different topics
Each topic has one week of lectures and one week of project work
Each project consists of a set of exercises in matlab or python
Each project delivery is in the form of a presentation
At least 4 projects must be delivered and approved
Each project will be worked through in the class
Each student must deliver his/her own presentation
The student must be prepared to present at the group session
Each student must present their project at least once
