OpenStax-CNX module: m11260

Colored Gaussian Noise ∗

Don Johnson
This work is produced by OpenStax-CNX and licensed under the
Creative Commons Attribution License 1.0†

When the additive Gaussian noise in the sensors' outputs is colored (i.e., the noise values are correlated in some fashion), the linearity of beamforming algorithms means that the array processing output r also contains colored noise. The solution to the colored-noise, binary detection problem remains the likelihood ratio, but it differs in the form of the a priori densities. The noise will again be assumed zero-mean, but the noise vector has a non-trivial covariance matrix K: n ∼ N(0, K).
$$p_n(n) = \frac{1}{\sqrt{\det(2\pi K)}}\, e^{-\frac{1}{2} n^T K^{-1} n}$$
In this case, the logarithm of the likelihood ratio is
$$(r - s_0)^T K^{-1} (r - s_0) - (r - s_1)^T K^{-1} (r - s_1) \underset{M_0}{\overset{M_1}{\gtrless}} 2\ln\eta$$

which, after the usual simplifications, is written

$$r^T K^{-1} s_1 - \frac{s_1^T K^{-1} s_1}{2} - \left( r^T K^{-1} s_0 - \frac{s_0^T K^{-1} s_0}{2} \right) \underset{M_0}{\overset{M_1}{\gtrless}} \ln\eta$$

The sufficient statistic for the colored Gaussian noise detection problem is

$$\Upsilon_i(r) = r^T K^{-1} s_i \tag{1}$$
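As a concrete sketch (not part of the original module), the sufficient statistic and the resulting decision rule can be evaluated numerically. The signals, the AR(1)-style covariance, and the threshold below are hypothetical choices for illustration only.

```python
import numpy as np

# Hypothetical setup: two 4-sample signals and a colored-noise covariance
# K[i, j] = a^|i - j|, the same form used in the example later in the module.
a = 0.9
K = a ** np.abs(np.subtract.outer(np.arange(4), np.arange(4)))
Kinv = np.linalg.inv(K)

s0 = np.array([1.0, 1.0, 1.0, 1.0])
s1 = np.array([1.0, -1.0, 1.0, -1.0])

def sufficient_statistic(r, s):
    """Upsilon_i(r) = r^T K^{-1} s_i."""
    return r @ Kinv @ s

def detect(r, eta=1.0):
    """Compare the bias-corrected statistics against ln(eta)."""
    t1 = sufficient_statistic(r, s1) - 0.5 * (s1 @ Kinv @ s1)
    t0 = sufficient_statistic(r, s0) - 0.5 * (s0 @ Kinv @ s0)
    return 1 if t1 - t0 > np.log(eta) else 0

# Noiseless observations are always assigned to the matching model.
print(detect(s1), detect(s0))  # prints: 1 0
```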

The quantities computed for each signal have a similar, but more complicated, interpretation than in the white noise case. r^T K^{-1} s_i is a dot product, but with respect to the so-called kernel K^{-1}. The effect of the kernel is to weight certain components more heavily than others. A positive-definite symmetric matrix (the covariance matrix is one such example) can be expressed in terms of its eigenvectors and eigenvalues.
$$K^{-1} = \sum_k \frac{1}{\lambda_k} v_k v_k^T$$
The sufficient statistic can thus be written as the complicated summation

$$r^T K^{-1} s_i = \sum_k \frac{1}{\lambda_k} \left( r^T v_k \right)\left( v_k^T s_i \right)$$
where λ_k and v_k denote the k-th eigenvalue and eigenvector of the covariance matrix K. Each of the constituent dot products is largest when the signal and the observation vectors have strong components parallel
∗ Version 1.2: Aug 29, 2003 2:36 pm -0500

to v_k. However, each of these dot products is weighted by the reciprocal of the associated eigenvalue. Thus, components in the observation vector parallel to the signal will tend to be accentuated; those components parallel to the eigenvectors having the smaller eigenvalues will receive greater accentuation than others. The usual notions of parallelism and orthogonality become "skewed" because of the presence of the kernel. A covariance matrix's eigenvalue has "units" of variance; these accentuated directions thus correspond to small noise variance. We can therefore view the weighted dot product as a computation that simultaneously tries to select components in the observations similar to the signal while concentrating on those where the noise variance is small.
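To make the eigen-weighted form concrete, the sketch below (illustrative values, not from the original module) verifies numerically that summing the eigenvalue-weighted dot products reproduces r^T K^{-1} s_i.

```python
import numpy as np

# Illustrative covariance; lam[k] and V[:, k] are its eigenvalues/eigenvectors.
a = 0.9
K = a ** np.abs(np.subtract.outer(np.arange(4), np.arange(4)))
lam, V = np.linalg.eigh(K)

rng = np.random.default_rng(1)
r = rng.standard_normal(4)
s = np.array([1.0, -1.0, 1.0, -1.0])

direct = r @ np.linalg.inv(K) @ s
by_modes = sum((r @ V[:, k]) * (V[:, k] @ s) / lam[k] for k in range(4))
assert np.isclose(direct, by_modes)

# Modes with small eigenvalues (low noise variance) carry the largest
# weight 1/lam[k], exactly the accentuation described above.
```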
The second terms in the expressions constituting the optimal detector are of the form s_i^T K^{-1} s_i. This quantity is a special case of the dot product just discussed. The two vectors involved in this dot product are identical; they are parallel by definition. The weighting of the signal components by the reciprocal eigenvalues remains. Recalling that the eigenvalues of K have units of variance, s_i^T K^{-1} s_i has the units of a signal-to-noise ratio, which is computed in a way that enhances the contribution of those signal components parallel to the "low noise" directions.
To compute the performance probabilities, we express the detection rule in terms of the sufficient statistic.

$$r^T K^{-1} (s_1 - s_0) \underset{M_0}{\overset{M_1}{\gtrless}} \ln\eta + \frac{1}{2}\left( s_1^T K^{-1} s_1 - s_0^T K^{-1} s_0 \right)$$
The distribution of the sufficient statistic on the left side of this equation is Gaussian because it is a linear transformation of the Gaussian random vector r. Assuming the i-th model to be true,

$$r^T K^{-1}(s_1 - s_0) \sim \mathcal{N}\left( s_i^T K^{-1}(s_1 - s_0),\ (s_1 - s_0)^T K^{-1}(s_1 - s_0) \right)$$
The false-alarm probability for the optimal Gaussian colored noise detector is given by

$$P_F = Q\left( \frac{\ln\eta + \frac{1}{2}(s_1 - s_0)^T K^{-1}(s_1 - s_0)}{\left[ (s_1 - s_0)^T K^{-1}(s_1 - s_0) \right]^{1/2}} \right) \tag{2}$$
As in the white noise case, the important signal-related quantity in this expression is the signal-to-noise ratio of the difference signal. The distance interpretation of this quantity remains, but the distance is now warped by the kernel's presence in the dot product.
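Equation (2) can be evaluated directly. The sketch below uses the standard-library complementary error function to implement Q(x); the signals and covariance are hypothetical choices for illustration.

```python
import numpy as np
from math import erfc, sqrt, log

def Q(x):
    """Gaussian tail probability Q(x) = P(N(0,1) > x)."""
    return 0.5 * erfc(x / sqrt(2.0))

def false_alarm(s0, s1, K, eta):
    d = s1 - s0
    snr = d @ np.linalg.solve(K, d)          # (s1 - s0)^T K^{-1} (s1 - s0)
    return Q((log(eta) + 0.5 * snr) / sqrt(snr))

# Hypothetical signals and colored-noise covariance.
a = 0.9
K = a ** np.abs(np.subtract.outer(np.arange(4), np.arange(4)))
s0 = np.zeros(4)
s1 = np.array([1.0, -1.0, 1.0, -1.0])

print(false_alarm(s0, s1, K, eta=1.0))       # shrinks as eta grows
```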
The sufficient statistic computed for each signal can be given two signal processing interpretations in the colored noise case. Both of these rest on considering the quantity r^T K^{-1} s_i as a simple dot product, but with different ideas on grouping terms. The simplest is to group the kernel with the signal so that the sufficient statistic is the dot product between the observations and a modified version of the signal, s̃_i = K^{-1} s_i. This modified signal thus becomes the equivalent of the unit-sample response of the matched filter. In this form, the observed data are unaltered and passed through a matched filter whose unit-sample response depends on both the signal and the noise characteristics. The size of the noise covariance matrix, equal to the number of observations used by the detector, is usually large: hundreds if not thousands of samples are possible. Thus, computation of the inverse of the noise covariance matrix becomes an issue. This problem needs to be solved only once if the noise characteristics are static; the inverse can be precomputed on a general-purpose computer using well-established numerical algorithms. The signal-to-noise ratio term of the sufficient statistic is the dot product of the signal with the modified signal s̃_i. This view of the receiver structure is shown in Figure 1.
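In this matched-filter view, K^{-1} s_i plays the role of the unit-sample response and can be precomputed once when the noise is static. A brief sketch with hypothetical values:

```python
import numpy as np

# Offline: precompute the modified signal K^{-1} s once (static noise assumed).
# np.linalg.solve avoids forming the explicit inverse.
a = 0.9
K = a ** np.abs(np.subtract.outer(np.arange(4), np.arange(4)))
s = np.array([1.0, -1.0, 1.0, -1.0])
s_mod = np.linalg.solve(K, s)

# Online: each observation block only needs a dot product with s_mod.
rng = np.random.default_rng(2)
r = rng.standard_normal(4)
assert np.isclose(r @ s_mod, r @ np.linalg.inv(K) @ s)

snr_term = s @ s_mod                 # s^T K^{-1} s, the SNR-like bias term
```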
Figure 1: These diagrams depict the signal processing operations involved in the optimum detector when the additive noise is not white. The upper diagram shows a matched filter whose unit-sample response depends both on the signal and the noise characteristics. The lower diagram is often termed the whitening filter structure, where the noise components of the observed data are first whitened, then passed through a matched filter whose unit-sample response is related to the "whitened" signal.

A second and more theoretically powerful view of the computations involved in the colored noise detector emerges when we factor the covariance matrix. The Cholesky factorization of a positive-definite, symmetric matrix (such as a covariance matrix or its inverse) has the form K = LDL^T, where L is unit lower-triangular and D is diagonal. With this factorization, the sufficient statistic can be written as

$$r^T K^{-1} s_i = \left( D^{-1/2} L^{-1} r \right)^T \left( D^{-1/2} L^{-1} s_i \right)$$

The components of the dot product are multiplied by the same matrix D^{-1/2} L^{-1}, which is lower-triangular. If this matrix were also Toeplitz, the product of this kind between a Toeplitz matrix and a vector would be equivalent to the convolution of the components of the vector with the first column of the matrix. If the matrix is not Toeplitz (which, inconveniently, is the typical case), a convolution also results, but with a unit-sample response that varies with the index of the output: a time-varying, linear filtering operation. The variation of the unit-sample response corresponds to the different rows of the matrix D^{-1/2} L^{-1} running backwards from the main-diagonal entry. What is the physical interpretation of the action of this filter? The covariance of the random vector x = Ar is given by K_x = A K_r A^T. Applying this result to the current situation, we set A = D^{-1/2} L^{-1} and K_r = K = LDL^T, with the result that the covariance matrix K_x is the identity matrix! Thus, the matrix D^{-1/2} L^{-1} corresponds to a (possibly time-varying) whitening filter: we have converted the colored-noise component of the observed data to white noise! As the filter is always linear, the Gaussian observation noise remains Gaussian at the output. Thus, the colored noise problem is converted into a simpler one with the whitening filter: the whitened observations are first matched-filtered with the "whitened" signal s̃_i = D^{-1/2} L^{-1} s_i (whitened with respect to the noise characteristics only), then half the energy of the whitened signal is subtracted (Figure 1).
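The whitening-filter structure can be checked numerically. NumPy exposes only the K = CC^T form of the Cholesky factorization, so the sketch below (illustrative covariance) recovers L and D from C = L D^{1/2}:

```python
import numpy as np

a = 0.9
K = a ** np.abs(np.subtract.outer(np.arange(4), np.arange(4)))

C = np.linalg.cholesky(K)       # K = C C^T, C lower-triangular
d = np.diag(C)                  # C = L D^{1/2}, so diag(C) = sqrt(diag(D))
L = C / d                       # unit lower-triangular factor
D = np.diag(d ** 2)
assert np.allclose(L @ D @ L.T, K)

W = np.linalg.inv(C)            # W = D^{-1/2} L^{-1}, the whitening filter
assert np.allclose(W @ K @ W.T, np.eye(4))   # whitened covariance is I

# The sufficient statistic is the dot product of the whitened vectors.
rng = np.random.default_rng(3)
r = rng.standard_normal(4)
s = np.array([1.0, -1.0, 1.0, -1.0])
assert np.isclose((W @ r) @ (W @ s), r @ np.linalg.inv(K) @ s)
```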

Example 1
To demonstrate the interpretation of the Cholesky factorization of the covariance matrix as a time-varying whitening filter, consider the covariance matrix

$$K = \begin{pmatrix} 1 & a & a^2 & a^3 \\ a & 1 & a & a^2 \\ a^2 & a & 1 & a \\ a^3 & a^2 & a & 1 \end{pmatrix}$$
This covariance matrix indicates that the noise was produced by passing white Gaussian noise through a first-order filter having coefficient a: n(l) = a n(l − 1) + w(l), where w(l) is unit-variance white noise. Thus, we would expect that if a whitening filter emerged from the matrix manipulations (derived just below), it would be a first-order FIR filter having a unit-sample response proportional to

$$h(l) = \begin{cases} 1 & l = 0 \\ -a & l = 1 \\ 0 & \text{otherwise} \end{cases}$$

Simple arithmetic calculations of the Cholesky decomposition suffice to show that the matrices L and D are given by

$$L = \begin{pmatrix} 1 & 0 & 0 & 0 \\ a & 1 & 0 & 0 \\ a^2 & a & 1 & 0 \\ a^3 & a^2 & a & 1 \end{pmatrix} \qquad D = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1-a^2 & 0 & 0 \\ 0 & 0 & 1-a^2 & 0 \\ 0 & 0 & 0 & 1-a^2 \end{pmatrix}$$
and that their inverses are

$$L^{-1} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ -a & 1 & 0 & 0 \\ 0 & -a & 1 & 0 \\ 0 & 0 & -a & 1 \end{pmatrix} \qquad D^{-1} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \frac{1}{1-a^2} & 0 & 0 \\ 0 & 0 & \frac{1}{1-a^2} & 0 \\ 0 & 0 & 0 & \frac{1}{1-a^2} \end{pmatrix}$$

Because D is diagonal, the matrix D^{-1/2} equals the term-by-term square root of the inverse of D. The product of interest here is therefore given by

$$D^{-1/2} L^{-1} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ \frac{-a}{\sqrt{1-a^2}} & \frac{1}{\sqrt{1-a^2}} & 0 & 0 \\ 0 & \frac{-a}{\sqrt{1-a^2}} & \frac{1}{\sqrt{1-a^2}} & 0 \\ 0 & 0 & \frac{-a}{\sqrt{1-a^2}} & \frac{1}{\sqrt{1-a^2}} \end{pmatrix}$$
Let r̃ denote the product D^{-1/2} L^{-1} r. This vector's elements are given by

$$\tilde{r}_0 = r_0, \qquad \tilde{r}_1 = \frac{1}{\sqrt{1-a^2}}\left( r_1 - a r_0 \right), \qquad \ldots$$
Thus, the expected FIR whitening filter emerges after the first term. The first term could not be of this form because no observations were assumed to precede r_0. This edge effect is the source of the time-varying aspect of the whitening filter. If the system modeling the noise generation process has only poles, this whitening filter will always stabilize (not vary with time) once sufficient data are present within the memory of the FIR inverse filter. In contrast, the presence of zeros in the generation system would imply an IIR whitening filter. With finite data, the unit-sample response would then change on each output sample.
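The example above can also be verified numerically. The sketch below uses a hypothetical value a = 0.5 and confirms that D^{-1/2} L^{-1} has exactly the first-order FIR rows derived above, including the edge effect in row 0.

```python
import numpy as np

a = 0.5
idx = np.arange(4)
K = a ** np.abs(np.subtract.outer(idx, idx))   # the AR(1) covariance above

C = np.linalg.cholesky(K)        # K = (L D^{1/2})(L D^{1/2})^T
W = np.linalg.inv(C)             # W = D^{-1/2} L^{-1}

# Expected rows: r~_0 = r_0, then (r_l - a r_{l-1}) / sqrt(1 - a^2).
g = 1.0 / np.sqrt(1.0 - a ** 2)
expected = np.zeros((4, 4))
expected[0, 0] = 1.0
for l in range(1, 4):
    expected[l, l] = g
    expected[l, l - 1] = -a * g

assert np.allclose(W, expected)
assert np.allclose(W @ K @ W.T, np.eye(4))    # whitened covariance is I
```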