
TIME-FREQUENCY ANALYSIS OF

NON-STATIONARY PROCESSES
AN INTRODUCTION
MARIA SANDSTEN
2013
Centre for Mathematical Sciences
Contents
1 Introduction and some history 3
1.1 A short story on spectral analysis . . . . . . . . . . . . . . . . . . . . 3
1.2 Spectrum analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.3 A historical chronology of time-frequency analysis . . . . . . . . . . . 7
2 Spectrogram and Wigner distribution 11
2.1 Fourier transform and spectrum analysis . . . . . . . . . . . . . . . . 11
2.2 The spectrogram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.2.1 An adaptive window . . . . . . . . . . . . . . . . . . . . . . . 16
2.3 The Wigner distribution and Wigner spectrum . . . . . . . . . . . . . 17
2.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
3 Cross-terms and the Wigner-Ville distribution 21
3.1 Cross-terms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
3.2 The analytic signal and the Wigner-Ville distribution . . . . . . . . . 22
3.3 Cross-terms of the spectrogram . . . . . . . . . . . . . . . . . . . . . 26
3.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
4 Ambiguity functions and kernels 29
4.1 Definitions and properties . . . . . . . . . . . . . . . . . . . . . . 29
4.2 RID and product kernels . . . . . . . . . . . . . . . . . . . . . . . . . 34
4.3 Separable kernels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
4.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
5 Concentration measures 41
5.1 Concentration measures . . . . . . . . . . . . . . . . . . . . . . . . . 41
Maria Sandsten Introduction and some history
6 Quadratic distributions and Cohen's class 43
6.1 Kernel interpretation of the spectrogram . . . . . . . . . . . . . . . . 44
6.2 Some historically important distributions . . . . . . . . . . . . . . . 45
6.3 A view on the four domains of time-frequency analysis . . . . . . . . 49
6.4 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
7 The instantaneous frequency 59
7.1 Instantaneous frequency . . . . . . . . . . . . . . . . . . . . . . . . . 59
7.2 Time-frequency reassignment . . . . . . . . . . . . . . . . . . . . . . 62
7.3 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
8 The uncertainty principle and Gabor expansion 67
8.1 The uncertainty principle . . . . . . . . . . . . . . . . . . . . . . . . . 67
8.2 Gabor expansion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
9 Stochastic time-frequency analysis 71
9.1 Different definitions of non-stationary processes . . . . . . . . . . . . 72
9.2 The mean square error optimal kernel . . . . . . . . . . . . . . . . . . 75
9.3 Multitaper time-frequency analysis . . . . . . . . . . . . . . . . . . . 76
9.4 Cross-term reduction using a penalty function . . . . . . . . . . . . . 78
Chapter 1
Introduction and some history
Signals in nature vary greatly and it can sometimes be difficult to characterize them.
We usually distinguish between deterministic and stochastic (random) signals, where a
deterministic signal is explicitly known while a stochastic process is one realization
from a collection of signals characterized by properties like expected value
and variance. We also distinguish between stationary and non-stationary signals, where
a stationary signal has time-invariant properties, i.e., properties that do not change
with time. Time is a fundamental way of studying a signal, but we can also study
the signal in other representations, of which one of the most important is frequency.
Time-frequency analysis of non-stationary and time-varying signals, in general, has
been a field of research for a number of years.
1.1 A short story on spectral analysis
The short description here is only intended to give an orientation on the history behind
spectral analysis in signal processing; for a more thorough description see, e.g., [1].
Sir Isaac Newton (1642-1727) performed an experiment in 1704, where he used a
glass prism to resolve the sunbeams into the colors of the rainbow, a spectrum. The
result was an image of the frequencies contained in the sunlight. Jean Baptiste Joseph
Fourier (1768-1830) invented the mathematics to handle discontinuities in 1807; the
idea was that a discontinuous function can be expressed as a sum of continuous
frequency functions. This idea was seen as absurd by the established scientists of
that time, including Pierre Simon Laplace (1749-1827) and Joseph Louis Lagrange
(1736-1813). The application behind the idea was the most basic problem regarding
heat, a discontinuity in temperature when hot and cold objects were put together.
However, the idea was eventually accepted and is what we today call the Fourier
expansion.
Joseph von Fraunhofer (1787-1826) invented the spectroscope in 1815 for measure-
ment of the index of refraction of glasses. He discovered what today has become
known as the Fraunhofer lines. Based on this, an important discovery was made
by Robert Bunsen (1811-1899) in the 19th century. He studied the light from a
burning rag soaked with salt solution. The spectrum from the glass prism consisted
of lines, and especially a bright yellow line. Further experiments showed that every
material has its own unique spectrum, with different frequency contents. The glass
prism was the first spectrum analyzer. Based on this discovery, Bunsen and Gustav
Kirchhoff (1824-1887) discovered that light spectra can be used for recognition,
detection and classification of substances.
The spectrum of electrical signals can be obtained by using narrow bandpass-filters.
The measured output from one filter is squared and averaged, and the result is the
average power at the center-frequency of the filter. Another bandpass-filter with dif-
ferent center-frequency will give the average power at another frequency. With many
filters of different center-frequencies, an image of the frequency content is obtained.
This spectrum analyzer performance is equivalent to modulating (shifting) the signal
in frequency and using one narrow-banded lowpass-filter. The modulation procedure
will move the power of the signal at the modulation frequency to the base-band.
Thereby only one filter, the lowpass-filter, is needed. The modulation frequency is
different for every new spectrum value. This interpretation is called the heterodyne
technique.
1.2 Spectrum analysis
A first step in modern spectrum analysis (discrete-time, sampled) was made by Sir
Arthur Schuster already in 1898, [2]. He fitted Fourier series to sunspot numbers
to find hidden periodicities. The Periodogram was invented. The Periodogram can
be described with the heterodyne technique where the lowpass-filter is a filter with
impulse response corresponding to a time-limited rectangle sequence (window). The
method has been frequently used for spectrum estimation ever since James Cooley
and John Tukey developed the Fast Fourier Transform (FFT) algorithm in 1965,
[3, 4, 5]. However, it was later discovered that Carl Friedrich Gauss (1777-1855)
had invented the FFT algorithm already in 1805, although he did not publish the result.
The FFT is the efficient way of calculating the Fourier transform of a signal which
exploits the structure of the Fourier transform algorithm to minimize the number
of multiplications and summations. This made the Fourier transform actually
become a tool and not just a theoretical description. A recent and more thorough
review of Tukey's work in time series analysis and spectrum estimation can be found
in [6].
Finding a reliable spectrum with optimal frequency resolution from a sequence of (of-
ten short-length) data has become a large research field. For stationary time-series,
two major approaches can be found in the literature: the windowed Periodogram
(and lag-windowed covariance estimates), which represents the non-parametric ap-
proach or the classical spectral analysis, and the model-based approach, usually Auto-
Regressive (AR), which represents the parametric approach or the modern spec-
tral analysis. Developments and combinations of these two approaches have during
the last decades been utilized in a huge number of applications, [7]. A fairly recent,
short, pedagogical overview is given in [8, 9].
High-resolution spectral estimation of sinusoids with algorithms such as MUSIC, Pisarenko
and others is also a very large research field, and descriptions of some of these algo-
rithms can be found in, e.g., [7]. These algorithms are outstanding in certain
applications, but in some cases the usual periodogram is applicable even for sinusoids,
e.g., to detect or estimate a single sinusoid in a noisy signal, e.g., [10].
All these spectrum analysis techniques rely on the assumption that the properties, e.g., the
frequency contents, of the signal change slowly with time. If this is true, the time-
frequency spectrum of the signal can be estimated and visualized. However, if the
signal is non-stationary or time-varying, with, e.g., step-wise changes, other methods
should be applied.
We give an example where the need for time and frequency analysis becomes obvious.
We analyze two different signals, which are obviously not similar, Figure 1.1a) and b).
These two signals are not stationary, and their periodogram spectra, i.e., the squared
absolute values of the Fourier transforms, will be exactly the same although the signals
have a very different appearance in time, as seen in Figure 1.1c) and d). From the
signals alone, it is not possible to tell what frequencies they consist of, and from the
spectra alone, it is not possible to tell how the non-stationary signals evolve with
time. With use of time-frequency analysis, however, in the form of spectrograms,
these differences become clearly visible, Figure 1.2a) and b), where red color indicates
high power and blue color low power.
[Figure 1.1 shows: a) Chirp signal and b) Impulse signal (amplitude versus time, 0-400 samples); c) Spectrum of chirp and d) Spectrum of impulse (power versus frequency, 0-0.4).]
Figure 1.1: Two different signals with the same spectral estimates.
Figure 1.2: Spectrograms of the chirp and impulse signals, (Red color: high power,
blue color: low power).
1.3 A historical chronology of time-frequency analysis
Eugene Wigner (1902-1995) received the Nobel Prize together with Maria Goeppert
Mayer and Hans Jensen in 1963 for the discovery concerning the theory of the atomic
nucleus and elementary particles. The Wigner distribution, almost as well known as
the spectrogram, was suggested in a famous paper from 1932, because it seemed to
be the simplest, [11]. The paper is actually in the area of quantum mechanics and
not at all in signal analysis. The drawback of the Wigner distribution is that it gives
cross-terms, that is located in the middle between and can be twice as large as the
dierent signal components. A large area of research has been devoted to reduction
of these cross-terms, where dierent time-frequency kernels are proposed.
The design of these kernels is usually made in the so-called ambiguity domain,
also known as the doppler-lag domain. The word ambiguity is a bit ambiguous,
as it here stands for something that is not clearly defined. The name originally comes
from the radar field and was introduced in [12], describing the
equivalence between time-shifting and frequency-shifting for linear FM signals, which
are frequently used in radar. Under the name characteristic function, it was
however first introduced by Jean-Andre Ville, [13], where the concept of the Wigner
distribution was also used for signal analysis. A similar derivation was suggested
at about the same time by Jose Enrique Moyal, [14]. In his paper, Ville also redefined
the analytic signal, where a real-valued signal is converted into a positive-frequency
signal. The Wigner distribution of the analytic signal, which is usually applied in
practice today, is therefore called the Wigner-Ville distribution.
The concept of instantaneous frequency, the derivative of the phase, is closely
related to the definition of the analytic signal, as the analytic signal provides a
phase. The first interest in instantaneous frequency arose with frequency modulation
in radio transmission, with a variable frequency in an electric circuit theory context.
Dennis Gabor (1900-1979) suggested the representation of a signal in two dimen-
sions, time and frequency. In his paper from 1946, Gabor suggested certain elemen-
tary signals that occupy the smallest possible area in the two-dimensional
information diagram, one quantum of information. He had a large interest in
holography, and in 1948 he carried out the basic experiments in holography, at that
time called wavefront reconstruction. In 1971 he received the Nobel Prize for his
discovery of the principles underlying the science of holography. The Gabor expansion
was related to the short-time Fourier transform by Martin Bastiaans in 1980.
In parallel to this, in the 1940s to 60s, during and after the invention of the spectro-
gram and the Wigner-Ville distribution, a lot of other distributions and methods were
invented, e.g., the Rihaczek distribution, [15], the Page distribution, [16] and the Margenau-Hill
or Levin distribution, [17]. In 1966 a formulation was made by Leon Cohen, also in
quantum mechanics, [18], which included these and an infinite number of other meth-
ods as kernel functions. The formulation by Cohen was restricted with a constraint
on the kernels so that the marginals should be satisfied. Later, a quadratic class of
methods has been defined, also including kernels which do not satisfy the marginals
but otherwise fulfill the properties of the Cohen class. The general quadratic class
of time-frequency representation methods is the most applied, and a huge number of
time-frequency kernels can be found for different representations. Commonly, this
class is referred to as the Cohen class. We could however note that Cohen's class
from the beginning also included signal-dependent kernels, which can not be included
in the quadratic class, as the methods then no longer are a quadratic form of the
signal. The discrete-time and discrete-frequency kernel was introduced in signal
analysis by Claasen and Mecklenbräuker, [19], as late as 1980.
The reassignment principle was introduced in 1976 by Kodera, de Villedary and
Gendrin, [20], but was not applied to a larger extent until Auger and Flandrin
reintroduced the method, [21]. The idea of reassignment is to keep the localization of
a single component by reassigning mass to the center of gravity, where the cross-terms
are reduced by smoothing of the Wigner distribution.
However, many of these methods are actually developed for deterministic signals
(possibly with a disturbance) and used for a more exploratory description of multi-
component signals. In the recent 10-15 years, the attempts to find optimal estimation
algorithms for non-stationary processes have increased, [22]. The term Wigner
spectrum has been established, where the ambiguity function is replaced by the am-
biguity spectrum. A fundamental problem is the estimation of a time-frequency
spectrum from a single realization of the process. Many kernels, optimized for dif-
ferent purposes, can be found, e.g., [23, 24, 25, 26]. A well-defined framework has
also been proposed for the class of underspread processes, for which the ambiguity
domain spectrum of the process is concentrated around the origin. Additionally, it has been
shown, at least approximately, that many rules for time-invariant linear systems and
stationary processes apply, [27].
We will also find a connection between the spectrogram formulation and the Wigner-
Ville distribution/spectrum using the concept of multitapers, originally invented by
David Thomson, [28]. The calculation of the two-dimensional convolution between
the kernel and the Wigner-Ville spectrum of a process realization can be simplified
using kernel decomposition and calculation of multitaper spectrograms, e.g., [29, 30].
The time-lag estimation kernel is rotated and the corresponding eigenvectors and
eigenvalues are calculated. The estimate of the Wigner-Ville spectrum is given as the
weighted sum, using the eigenvalues as weights, of the spectrograms of the data with
the different eigenvectors as sliding windows, [31]. Different approaches to approxi-
mate a kernel with a few multitaper spectrograms have been taken, e.g., comparison
of Slepian and Hermite functions, [32], decomposition of the Reduced Interference distri-
bution, [33, 34], least squares optimization of the weights using Hermite functions as
windows, [35] and utilization of a penalty function, [36]. In [25], time-frequency kernels
and connected multitapers for statistically stable frequency marginals are developed.
Chapter 2
Spectrogram and Wigner
distribution
A spectrogram is a spectral representation that shows how the spectral density of
a signal varies with time. Other names found in different applications are
spectral waterfall and sonogram. The time-domain data is divided into shorter se-
quences, which usually overlap, and Fourier transformed to calculate the magnitude of
the frequency spectrum for each sequence. These frequency spectra are then ordered
on a corresponding time-scale and form a three-dimensional picture, (time, frequency,
magnitude). The Wigner distribution/spectrum, [11], almost as well known as the
spectrogram, is another famous spectral representation, which has the best time- and
frequency concentration of a signal. However, the Wigner distribution gives cross-
terms, which make the interpretation of the Wigner distribution difficult, and a large
area of research has been devoted to reduction of these cross-terms.
2.1 Fourier transform and spectrum analysis
The Fourier transform of a continuous signal x(t) is defined as

X(f) = \mathcal{F}\{x(t)\} = \int_{-\infty}^{\infty} x(t)\, e^{-i2\pi f t}\, dt,

and the signal x(t) can be recovered by the inverse Fourier transform,

x(t) = \mathcal{F}^{-1}\{X(f)\} = \int_{-\infty}^{\infty} X(f)\, e^{i2\pi f t}\, df.
This is a Fourier transform pair; we have a representation in both time and frequency.
The absolute value of the Fourier transform gives us the magnitude function,
|X(f)|, and the argument is the phase function, arg(X(f)). The spectrum is
given from the squared magnitude function,

S_x(f) = |X(f)|^2.
We should also remember that, using the Wiener-Khintchine theorem, the power
spectrum of a zero-mean stationary stochastic process x(t) can also be calculated
as the Fourier transform of the covariance function r_x(\tau),

S_x(f) = \mathcal{F}\{r_x(\tau)\} = \int_{-\infty}^{\infty} r_x(\tau)\, e^{-i2\pi f \tau}\, d\tau,

where r_x(\tau) is defined as

r_x(\tau) = \mathcal{E}[x(t+\tau)x^{*}(t)],

denoting \mathcal{E}[\cdot] as the expected value and * as the complex conjugate. The covariance
function r_x(\tau) can be recovered by the inverse Fourier transform of the spectrum,

r_x(\tau) = \mathcal{F}^{-1}\{S_x(f)\} = \int_{-\infty}^{\infty} S_x(f)\, e^{i2\pi f \tau}\, d\tau.
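As an aside (not part of the original notes, which use Matlab), the discrete, circular analogue of the Wiener-Khintchine relation is easy to verify numerically: the DFT of the circular autocorrelation of a sequence equals the squared magnitude of its DFT. A minimal Python/NumPy sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(64)            # one realization of a zero-mean sequence
N = len(x)

# Spectrum as squared magnitude of the Fourier transform, S = |X|^2
S = np.abs(np.fft.fft(x)) ** 2

# Circular autocorrelation r[k] = sum_n x[n+k] x*(n), with indices mod N
r = np.array([np.dot(np.roll(x, -k), x.conj()) for k in range(N)])

# Wiener-Khintchine, discrete circular form: the DFT of r equals S
S_wk = np.real(np.fft.fft(r))
```

For a stochastic process the theorem concerns the expected covariance; the circular single-realization identity above is the deterministic counterpart used in FFT-based estimators.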
2.2 The spectrogram
The short-time Fourier transform (STFT) of a signal x(t) is defined as

X(t, f) = \int_{-\infty}^{\infty} x(t_1)\, h^{*}(t_1 - t)\, e^{-i2\pi f t_1}\, dt_1, \quad (2.1)

where the window function h(t), centered at time t, is multiplied with the signal x(t)
before the Fourier transform. The window function views the signal just close to
the time t, and the Fourier transform will be an estimate locally around t. The usual
way of calculating the STFT is to use a fixed positive even window, h(t), of a certain
shape, which is centered around zero and has power \int_{-\infty}^{\infty} |h(t)|^2\, dt = 1. Similar to the
ordinary Fourier transform and spectrum, we can formulate the spectrogram as

S_x(t, f) = |X(t, f)|^2, \quad (2.2)
which is used very frequently for analyzing time-varying and non-stationary signals.
We can also extend the Wiener-Khintchine theorem, and define the power spectrum
of a zero-mean non-stationary stochastic process x(t) to be calculated as the
Fourier transform of the time-dependent covariance function r_x(t, \tau),

S_x(t, f) = \mathcal{F}\{r_x(t, \tau)\} = \int_{-\infty}^{\infty} r_x(t, \tau)\, e^{-i2\pi f \tau}\, d\tau, \quad (2.3)

where r_x(t, \tau) is defined as

r_x(t, \tau) = \mathcal{E}[x(t+\tau)x^{*}(t)]. \quad (2.4)
Using the spectrogram, the signal is divided into several shorter pieces, and a spec-
trum is estimated from each piece, which gives us information about where in time
different frequencies occur. There are several advantages of using the spectrogram,
e.g., fast implementation using the fast Fourier transform (FFT), easy interpretation
and the clear connection to the periodogram. We illustrate this with an example:
Example
In Figure 2.1a) a signal consisting of several frequency components is shown. A
musical interpretation is three tones of increasing height. A common estimate of
the spectrum is defined by the Fourier transform of a windowed sampled data vector,
x(n), n = 0, 1, 2, \ldots, N-1, (N = 512), where t = n/F_s, (F_s is the sample frequency
and in this example F_s = 1). The (modified) periodogram is defined as

S_x(l) = \frac{1}{N} \left| \sum_{n=0}^{N-1} x(n)\, h(n)\, e^{-i2\pi n \frac{l}{L}} \right|^2, \quad l = 0, \ldots, L-1, \quad (2.5)

where the frequency scale is f = \frac{l}{L} F_s. The window function h(n) is normalized
according to

h(n) = \frac{h_1(n)}{\sqrt{\frac{1}{N} \sum_{n=0}^{N-1} h_1^2(n)}}, \quad (2.6)

where h_1(n) can be any window function, including h_1(n) = 1, (the rectangle win-
dow). The periodogram (i.e., using a rectangle window), in Figure 2.1b), shows three
frequency components but can not give any information on how these signals are lo-
cated in time. Figure 2.1c) presents the modified periodogram (Hanning window),
which is a very common analysis window, giving an estimated spectrum that seems
to consist of just one strong frequency component and two weaker ones. These examples
[Figure 2.1 shows: a) Data (amplitude versus time, 0-500 samples); b) Periodogram and c) Modified periodogram (S_x(f) versus frequency f, 0-0.2).]
Figure 2.1: A three-component signal with tones of increasing frequency; a) data; b)
periodogram; c) modified periodogram.
show how important it is to be careful when using different spectral estimation techniques.
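The course material works with Matlab m-files; as a side illustration, Eqs. (2.5)-(2.6) can also be sketched in a few lines of Python/NumPy. The function name `modified_periodogram` below is my own, not from the notes:

```python
import numpy as np

def modified_periodogram(x, h1, L=None):
    """Modified periodogram, Eq. (2.5), with the window h1 normalized
    according to Eq. (2.6); frequency bins are f = l/L * Fs."""
    N = len(x)
    L = L or N
    h = h1 / np.sqrt(np.mean(h1 ** 2))   # Eq. (2.6): the 1/N sum as a mean
    X = np.fft.fft(x * h, L)
    return np.abs(X) ** 2 / N

# A cosine falling exactly on bin l0: the rectangle-window periodogram
# peaks at that bin with value N/4 (since |X[l0]| = N/2)
N, l0 = 256, 32
x = np.cos(2 * np.pi * l0 / N * np.arange(N))
S = modified_periodogram(x, np.ones(N))
```

With the rectangle window the normalization of Eq. (2.6) leaves h1 unchanged; with, e.g., a Hanning window it rescales the window so that the estimated power level stays comparable.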
The spectrogram is defined as

S_x(n, l) = \frac{1}{M} \left| \sum_{n_1=0}^{N-1} x(n_1)\, h(n_1 - n + M/2)\, e^{-i2\pi n_1 \frac{l}{L}} \right|^2, \quad (2.7)

for time-discrete signals and frequencies, where the window function is of length M
and normalized according to Eq. (2.6) with N = M. The resulting spectrogram of
the three musical tones is presented in Figure 2.2 and shows a clear view of three
frequency components and their locations in time and frequency.
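The sliding-window computation of Eq. (2.7) translates directly into code. The following Python/NumPy sketch is my own minimal illustration (not the course's mtspectrogram m-file), with time along the rows of the returned matrix:

```python
import numpy as np

def spectrogram(x, M=64, L=256, step=16):
    """Spectrogram in the spirit of Eq. (2.7): a Hanning window of length M,
    normalized as in Eq. (2.6) with N = M, slides over x; each windowed
    segment is Fourier transformed on L frequency bins f = l/L."""
    h = np.hanning(M)
    h = h / np.sqrt(np.mean(h ** 2))          # Eq. (2.6) normalization
    starts = range(0, len(x) - M + 1, step)
    S = np.empty((len(starts), L))
    for i, n in enumerate(starts):
        S[i] = np.abs(np.fft.fft(x[n:n + M] * h, L)) ** 2 / M
    return S

# A single tone at f0 = 0.1 = 25/250 shows up in every frame at bin 25
x = np.cos(2 * np.pi * 0.1 * np.arange(512))
S = spectrogram(x, M=64, L=250, step=32)
```

For a real-valued input only the bins below L/2 (frequencies below 0.5) are of interest; the upper half mirrors them.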
The length (and shape) of the window function h(t) is very important, as it deter-
mines the resolution in time and frequency, as shown in Figure 2.3. The signal shown
in Figure 2.3a) consists of two Gaussian windowed sinusoidal components located in
the same time interval around t = 100, but with different frequencies, f_0 = 0.07 and
f_0 = 0.17, and another component with frequency f_0 = 0.07 located around t = 160.
The three components show up clearly in Figure 2.3c), where the window length of
a Hanning window is M = 32. Figures 2.3b) and d) show the result when too short
or too long window lengths are chosen. For a too short window length, Figure 2.3b),
Figure 2.2: Spectrogram of the signal in Figure 2.1 using a Hanning window of length
128 samples.
the two components located in the same time interval are smeared together, as the
frequency resolution becomes really bad for a window length of M = 8. With a long
window, Figure 2.3d), the frequency resolution is good, but now the time resolution is
so bad that the two components with the same frequency are smeared together. The
time- and frequency resolutions have to fulfill the so-called uncertainty inequality,
to which we will return later. This gives a strong limitation on using the spectro-
gram for signals where the demand on time- and frequency resolution is very high.
For now, we just note that usually a proper trade-off between time- and frequency
resolution is found by choosing the length of the window about the same size as the
time-invariance or stationarity of the individual signal components. The following
simple example shows this.
Example
If the signal and the window function are

x(t) = \left(\frac{\sigma}{\pi}\right)^{1/4} e^{-\frac{\sigma}{2}t^2}, \quad h(t) = \left(\frac{\lambda}{\pi}\right)^{1/4} e^{-\frac{\lambda}{2}t^2},
Figure 2.3: Spectrogram examples with different window lengths.
respectively, then the spectrogram is

S_x(t, f) = \frac{2\sqrt{\sigma\lambda}}{\sigma + \lambda}\, e^{-\frac{\sigma\lambda}{\sigma+\lambda}t^2 - \frac{4\pi^2}{\sigma+\lambda}f^2}.

Studying the spectrogram one can note, e.g., that the area of the ellipse e^{-1} down from
the peak value is

A = \frac{\sigma + \lambda}{\sqrt{\sigma\lambda}}.

Differentiation gives \lambda = \sigma and a minimum area A = 2. The conclusion is that a
minimum time-frequency spread is given if the window length is equal to the signal
length, for a Gaussian signal and window.
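To make the conclusion concrete, the ellipse area A(\lambda) = (\sigma+\lambda)/\sqrt{\sigma\lambda} can be minimized numerically over the window parameter. This small Python check (my illustration, not in the notes) confirms the minimum A = 2 at \lambda = \sigma for an arbitrary \sigma:

```python
import numpy as np

sigma = 1.7                                   # arbitrary signal parameter
lam = np.linspace(0.1, 10.0, 100001)          # candidate window parameters
A = (sigma + lam) / np.sqrt(sigma * lam)      # ellipse area of the spectrogram

i = np.argmin(A)
best_lam, best_A = lam[i], A[i]
# the minimizer is (numerically) lambda = sigma, where A = 2
```

Since A depends only on the ratio \lambda/\sigma, the same minimum A = 2 is found whatever value of \sigma is chosen.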
2.2.1 An adaptive window
As it is not always easy to know or estimate the components' individual lengths
beforehand, and these also vary for different components, adaptive windows can be
used, [37, 38]. A variable Gaussian window is defined as

h_{t,f}(t_1) = \left(\frac{2\,\Re[c_{t,f}]}{\pi}\right)^{1/4} e^{-c_{t,f}(t - t_1)^2},
where \Re[\cdot] is the real part. The real part of the complex-valued parameter c_{t,f}
controls the time-duration of the window and the imaginary part controls the so-called
chirp rate (or tilt in the time-frequency plane). The adaptation of the parameter
of the window can be made in different ways. The local concentration at t, f is
maximized as

\max_{c_{t,f}} \frac{\iint |S_x(t_1, f_1)\,\phi(t - t_1, f - f_1)|^4\, dt_1\, df_1}{\left( \iint |S_x(t_1, f_1)\,\phi(t - t_1, f - f_1)|^2\, dt_1\, df_1 \right)^2},

where \phi(t, f) is a time-frequency window that decreases to zero away from the central
point at t = 0 and f = 0. This way of maximizing the concentration is similar to
maximizing other measures of sharpness, focus or peakiness, or minimizing, e.g., Rényi
entropy.
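The normalized fourth-power ratio above is easy to evaluate on a discrete spectrogram patch. A small Python sketch (my own illustration of the measure, evaluated on samples) shows that it indeed rewards concentrated patches:

```python
import numpy as np

def concentration(S):
    """Normalized fourth-power concentration of a spectrogram patch:
    sum |S|^4 / (sum |S|^2)^2 -- larger means more concentrated."""
    S = np.abs(np.asarray(S, dtype=float))
    return np.sum(S ** 4) / np.sum(S ** 2) ** 2

peaky = np.array([1.0, 0.0, 0.0, 0.0])   # all energy in one cell
flat = np.ones(4)                        # energy spread evenly
# concentration(peaky) = 1.0 exceeds concentration(flat) = 0.25
```

In the adaptive-window setting, this quantity would be evaluated locally (weighted by the window \phi around each t, f) for each candidate c_{t,f}, and the most concentrated candidate selected.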
2.3 The Wigner distribution and Wigner spectrum
The absolutely best time- and frequency resolution and concentration is given by the
Wigner distribution, which is defined as

W_x(t, f) = \int_{-\infty}^{\infty} x(t + \frac{\tau}{2})\, x^{*}(t - \frac{\tau}{2})\, e^{-i2\pi f \tau}\, d\tau. \quad (2.8)

For a non-stationary process, we can define a symmetrical instantaneous auto-
correlation function, (IAF), as

r_x(t, \tau) = \mathcal{E}[x(t + \frac{\tau}{2})\, x^{*}(t - \frac{\tau}{2})], \quad (2.9)

and an extension of the time-varying spectral density is given as

S_x(t, f) = \int_{-\infty}^{\infty} r_x(t, \tau)\, e^{-i2\pi f \tau}\, d\tau, \quad (2.10)

similarly as for the spectrogram. Note the difference between the two IAFs in Eq. (2.9)
and Eq. (2.4). The formulation based on Eq. (2.9) and Eq. (2.10) is called the Wigner
spectrum, which also fulfills the basic properties of the Wigner distribution. As most
of the research connected to the stochastic Wigner spectrum stems from the determin-
istic Wigner distribution, we start by introducing the properties and methods of the
Wigner distribution. The Wigner distribution always has better resolution than the
spectrogram for a so-called monocomponent signal, i.e., a complex sinusoidal signal
or a complex chirp signal (linearly increasing or decreasing frequency), if they are
defined for all values of t. The Wigner distribution gives exactly the instantaneous
frequency, i.e., perfectly localized functions for these signals.
The basic properties of the Wigner distribution and the Wigner spectrum are the
same, meaning that all formulas mentioned below are valid for both, although we restrict
ourselves to mentioning just the Wigner distribution.
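A discrete version of Eq. (2.8) can be evaluated with one FFT per time instant. The following Python/NumPy implementation is my own minimal sketch (not code from the notes); because the lag enters as 2m, frequency bin k corresponds to f = k/(2N) cycles per sample:

```python
import numpy as np

def wigner_distribution(x):
    """Discrete Wigner distribution: for each time n, form the instantaneous
    autocorrelation x[n+m] x*(n-m) over the symmetric lags m available inside
    the signal, and Fourier transform over m. Bin k maps to f = k/(2N)."""
    x = np.asarray(x, dtype=complex)
    N = len(x)
    W = np.zeros((N, N))
    for n in range(N):
        Lmax = min(n, N - 1 - n)              # largest usable symmetric lag
        m = np.arange(-Lmax, Lmax + 1)
        iaf = np.zeros(N, dtype=complex)
        iaf[m % N] = x[n + m] * np.conj(x[n - m])
        W[n] = np.real(np.fft.fft(iaf))       # real-valued by construction
    return W

# A complex sinusoid at f0 = 0.125 is perfectly localized: with N = 64 the
# peak sits at bin k = 2*f0*N = 16 for the mid-signal time instants
N, f0 = 64, 0.125
x = np.exp(1j * 2 * np.pi * f0 * np.arange(N))
W = wigner_distribution(x)
```

The shrinking lag range near the signal edges is a consequence of the weak time support discussed below: outside the signal duration, the distribution is zero.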
The Wigner distribution is always real-valued even if the signal is complex-valued. For
real-valued signals/processes the frequency domain is symmetrical, i.e., W_x(t, f) =
W_x(t, -f), (compare with the definition of the spectrum and spectrogram for real-valued
signals). The Wigner distribution also satisfies the so-called marginals, i.e.,

\int_{-\infty}^{\infty} W_x(t, f)\, df = |x(t)|^2, \quad (2.11)

and

\int_{-\infty}^{\infty} W_x(t, f)\, dt = |X(f)|^2. \quad (2.12)
If the marginals are satisfied, the total energy condition is also automatically satisfied,

\iint_{-\infty}^{\infty} W_x(t, f)\, dt\, df = \int_{-\infty}^{\infty} |x(t)|^2\, dt = \int_{-\infty}^{\infty} |X(f)|^2\, df = E_x, \quad (2.13)

where E_x is the energy of the signal. If we shift the signal in time or frequency, the
Wigner distribution will be shifted accordingly; it is time-shift and frequency-
shift invariant, i.e., if y(t) = x(t - t_0) then

W_y(t, f) = W_x(t - t_0, f), \quad (2.14)

and if z(t) = x(t)\, e^{i2\pi f_0 t} then

W_z(t, f) = W_x(t, f - f_0). \quad (2.15)
The Wigner distribution has weak time support, as it is limited by the duration
of the signal, i.e., if x(t) = 0 when t < t_1 and t > t_2, then W_x(t, f) = 0 for t < t_1
and t > t_2. Similarly, it has weak frequency support and is thereby limited by the
bandwidth of the signal, i.e., if X(f) = 0 when f < f_1 and f > f_2, then W_x(t, f) = 0
for f < f_1 and f > f_2. However, when the signal stops for a while and then starts
again, i.e., if there is an interval in the signal that is zero, this does not imply that
the Wigner distribution is zero in that time interval. The same goes for frequency
intervals where the spectrum is zero: it does not imply that the Wigner distribution
is zero in that frequency interval.
Example
The Wigner distribution of x(t) = e^{i2\pi f_0 t}, -\infty < t < \infty, (a complex-valued sinusoidal
signal of frequency f_0), is given by

W_x(t, f) = \int_{-\infty}^{\infty} e^{i2\pi f_0 (t + \frac{\tau}{2})}\, e^{-i2\pi f_0 (t - \frac{\tau}{2})}\, e^{-i2\pi f \tau}\, d\tau
= \int_{-\infty}^{\infty} e^{-i2\pi (f - f_0)\tau}\, d\tau = \delta(f - f_0).
Example
For the chirp signal x(t) = e^{i2\pi \frac{\alpha}{2} t^2 + i2\pi f_0 t}, we can compute the Wigner distribution as

W_x(t, f) = \int_{-\infty}^{\infty} e^{i2\pi (\frac{\alpha}{2}(t + \frac{\tau}{2})^2 + f_0 (t + \frac{\tau}{2}))}\, e^{-i2\pi (\frac{\alpha}{2}(t - \frac{\tau}{2})^2 + f_0 (t - \frac{\tau}{2}))}\, e^{-i2\pi f \tau}\, d\tau
= \int_{-\infty}^{\infty} e^{-i2\pi (f - f_0 - \alpha t)\tau}\, d\tau = \delta(f - f_0 - \alpha t).
We could also mention that for x(t) = \delta(t - t_0), the Wigner distribution is given as
W_x(t, f) = \delta(t - t_0), where we should note that the integral for the Wigner distribu-
tion involves the multiplication of two delta functions, which should not be allowed,
but we leave this theoretical dilemma to the serious mathematician. For these and
some other simple deterministic signals, the calculation of the Wigner distribution
is possible to perform by hand; see also examples in the exercises. Remembering that
the Wigner distribution is the most concentrated distribution, we also compute the
Wigner distribution for the Gaussian signal that was analyzed in the spectrogram
section.
Example
For the same signal as in the spectrogram window length example, i.e., x(t) = \left(\frac{\sigma}{\pi}\right)^{1/4} e^{-\frac{\sigma}{2}t^2},
the Wigner distribution is

W_x(t, f) = 2\, e^{-(\sigma t^2 + \frac{1}{\sigma} 4\pi^2 f^2)}.

Studying the area of this ellipse e^{-1} down from the peak value gives A = 1, which is
half of the corresponding area of the spectrogram.
So, why don't we always use the Wigner distribution in all calculations, and why is
there a huge field of research to find new well-resolved time-frequency distributions?
The problem is the well-known cross-terms, which we will look more closely into in
the next chapter.
2.4 Exercises
2.1 A signal x(t) = \delta(t - t_0) is windowed with a Gaussian function h(t) = (a/\pi)^{1/4} e^{-a t^2/2}.
Compute the STFT and the spectrogram.
2.2 Show that the Wigner distribution must be real-valued even if the signal is
complex-valued. Hint: Consider the complex conjugate of the Wigner distribu-
tion.
2.3 Show that the Wigner distribution is time-shift and frequency-shift invariant
by replacing the input signal to the distribution with y(t) = x(t - t_0)\, e^{i2\pi f_0 t}.
2.4 Calculate the Wigner distribution of

x(t) = (\sigma/\pi)^{1/4}\, e^{-\sigma t^2/2}\, e^{i2\pi \frac{\alpha}{2} t^2 + i2\pi f_0 t}.
C2.5 Create a one-component signal with Gaussian windowed complex-valued sinu-
soids, (using e.g., the m-file [Xc,T]=gaussdata(N,Nk,Tvect,Fvect); with total
data length N=256, component length Nk=128, centre time value Tvect=100
and centre frequency value Fvect=0.1). You can easily create the correspond-
ing real-valued sinusoid by using Xr=real(Xc);. Compute and plot the spec-
trogram using a Hanning window of length WIN=64. You can use the m-file
[S,TI,FI]=mtspectrogram(X,WIN); or alternatively Matlab's spectrogram. Use
the complex-valued sinusoid Xc and also the real-valued sinusoid Xr as input
signal. Compare the different spectrogram images and make sure that you
agree on the view. What will happen if you change the frequency to the very
low frequency Fvect=0.02? Explain the difference between the image of the
complex-valued sinusoid and the real-valued sinusoid. The following files could
be used:

The m-file [X,T]=gaussdata(N,Nk,Tvect,Fvect,Fs); computes a (N x 1)
complex-valued data vector X with K different (Nk x 1) Gaussian compo-
nents with frequencies specified in Fvect located at centre values specified
in Tvect. A sample frequency Fs can be specified.

The m-file [S,TI,FI]=mtspectrogram(X,WIN,Fs,NFFT,NSTEP,wei); com-
putes and plots the (multitaper) spectrogram using the window specified
with the vector WIN, (or with the Hanning window length WIN).
Chapter 3
Cross-terms and the Wigner-Ville
distribution
The Wigner distribution gives cross-terms, which are located in the middle between the different signal components and can be twice as large as them. It does not matter how far apart the different signal components are; cross-terms show up anyway, between all components of the signal as well as between disturbances in the case of noise. This makes the interpretation of the Wigner distribution difficult, and a large area of research has been devoted to the reduction of these cross-terms.
In his paper, [13], Jean-André Ville, among other definitions, also redefined the analytic signal. The Wigner distribution of the analytic signal, which is usually applied in practice today, is therefore called the Wigner-Ville distribution.
3.1 Cross-terms
The main problem with the Wigner distribution/spectrum shows up for multi-component signals or processes, and for signals with disturbances the method is often useless. The drawback is seen even if we just study the two-component signal x(t) = x_1(t) + x_2(t), for which the Wigner distribution is

W_x(t, f) = W_{x_1}(t, f) + W_{x_2}(t, f) + 2\Re[W_{x_1,x_2}(t, f)],   (3.1)

where W_{x_1}(t, f) and W_{x_2}(t, f), called auto-terms, are the Wigner distributions of x_1(t) and x_2(t) respectively. The cross-term is

2\Re[W_{x_1,x_2}(t, f)] = 2\Re[\mathcal{F}_{\tau \to f}\{x_1(t + \tfrac{\tau}{2})\, x_2^*(t - \tfrac{\tau}{2})\}].

This cross-term will in fact always be present and will be located midway between the auto-terms. The cross-term will oscillate proportionally to the distance between the auto-terms, and the direction of the oscillations will be orthogonal to the line connecting the auto-terms, see Figure 3.1, which shows the Wigner distribution of the analytic signal of the sequence displayed in Figure 2.1. (The concept of analytic signals will be explained later.)
Example
Comparing Figure 3.1 with Figure 2.1 we know where we should find the auto-terms. For simplicity, they are called components F_1, F_2 and F_3. We find two of them as yellow components and the middle one, F_2, marked with a red-yellow square-pattern. This pattern consists of the middle component and the added corresponding cross-term from the outer components, (oscillating and located in the middle between F_1 and F_3). Then there is a cross-term in the middle between F_1 and F_2 and one between F_2 and F_3. In conclusion, from these three signal components, the Wigner distribution will give the three components and additionally three cross-terms (one between each possible pair of components). All the cross-terms oscillate, and it should be noted that they also adopt negative values (blue colour), which seems strange if we interpret the Wigner distribution computation as a time-varying power estimate. This is a major drawback of the Wigner distribution/spectrum. This example, only including three components, shows that the result using the Wigner distribution/spectrum could easily be misinterpreted.
3.2 The analytic signal and the Wigner-Ville distribution
A signal z(t) is analytic if

Z(f) = 0 for f < 0,   (3.2)

where Z(f) = \mathcal{F}\{z(t)\}, i.e. the analytic signal requires the spectrum to be identical to the spectrum of the real signal for positive frequencies and to be zero for negative frequencies. The analytic signal was introduced by Gabor in 1946, [39], and later Ville introduced the Wigner distribution using the analytic signal, [13]. In honor of this contribution, the definition

W_z(t, f) = \int_{-\infty}^{\infty} z(t + \tfrac{\tau}{2})\, z^*(t - \tfrac{\tau}{2})\, e^{-i2\pi f\tau}\, d\tau,   (3.3)

where z(t) now is the analytic signal, is called the Wigner-Ville distribution. This is the most used definition today. We illustrate the advantages of the Wigner-Ville
Figure 3.1: The Wigner distribution of the analytic signal of the sequence in Figure 2.1.
distribution with an example.
Example
Two real-valued sinusoids windowed with Gaussian windows, with frequencies f_0 = 0.15, located around t = 80, and f_0 = 0.3 at t = 200, are combined to a sequence of length N = 256, see Figure 3.2. The spectrogram is shown in Figure 3.3a), where we clearly can identify the two components and their frequency locations. As the signal is real-valued, we also get two components located at f = -0.15 and f = -0.3. In Figure 3.3b) the Wigner distribution, calculated using Eq. (3.4), is shown, and now the picture becomes complex. The low-frequency component shows up at t = 80 and f = 0.15 and the high-frequency component at t = 200 and f = 0.3 as expected, (yellow-orange components) but we also get others that actually look like components and not oscillating cross-terms, one at t = 80, f = 0.35 and the other at t = 200, f = 0.2. These two components come from the fact that everything repeats with period 0.5. This is a well known fact for the Discrete Fourier Transform (DFT), although we are used to the period being one, not 0.5. The reason for the period of 0.5 is that we actually have down-sampled the signal a factor 2 when the discrete
Wigner distribution is calculated. The discrete Wigner distribution used in the computations is defined as

W_x[n, l] = 2 \sum_{m=-\min(n, N-1-n)}^{\min(n, N-1-n)} x[n+m]\, x^*[n-m]\, e^{-i2\pi m \frac{l}{L}},   (3.4)

for a discrete-time signal x[n], n = 0, \ldots, N-1, where L is the number of frequency values, f = l/(2L).
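Eq. (3.4) is short to implement directly. The sketch below is in Python/NumPy (the course otherwise works with Matlab m-files), with the frequency axis f = l/(2L):

```python
import numpy as np

def discrete_wigner(x, L):
    """Discrete Wigner distribution, Eq. (3.4); frequency bins f = l/(2L), l = 0..L-1."""
    N = len(x)
    W = np.zeros((N, L))
    l = np.arange(L)
    for n in range(N):
        mmax = min(n, N - 1 - n)
        m = np.arange(-mmax, mmax + 1)
        r = x[n + m] * np.conj(x[n - m])     # instantaneous autocorrelation at time n
        W[n] = 2 * np.real(np.exp(-2j*np.pi*np.outer(l, m)/L) @ r)
    return W

# A complex exponential at f0 = 0.125 concentrates at bin l = 2*L*f0
f0, N, L = 0.125, 64, 64
x = np.exp(2j*np.pi*f0*np.arange(N))
W = discrete_wigner(x, L)
print(np.argmax(W[N//2]))   # -> 16, i.e. f = 16/(2*64) = 0.125
```

The explicit sum is O(N^2 L); in practice the DFT over m is done with an FFT, but the loop above follows Eq. (3.4) term by term.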
In sampling of a signal, we get aliasing if the frequency content of the signal exceeds half the sampling frequency. A similar phenomenon appears here, as the sequence is down-sampled a factor 2, meaning that an f_0 (here 0.3) above f = 0.25 will alias to the frequency 0.5 - f_0 (here 0.2) and will show up in the spectrum image as a component. The only indication that this frequency component is something strange is that there is no cross-term midway between the frequency component at f = 0.3 and the one at f = 0.2; the cross-terms are found between f = 0.15 and f = 0.3. We also see the cross-terms midway between the frequencies located at f = f_0 and f = -f_0, (at f = 0), as well as the cross-terms midway between e.g., f = 0.15 and f = 0.3, (located at f = (0.15 + 0.3)/2 \approx 0.22). The picture of a two-component real-valued signal becomes very complex and the interpretation difficult. One elegant solution is to use the analytic signal and calculate the Wigner-Ville distribution, Figure 3.3c). The negative frequency components f = -0.15 and f = -0.3 disappear, and the only cross-term is between the two components at positive frequencies. The spectrum repeats with period 0.5, but as we know from the beginning that nothing is located at negative frequencies (definition of the analytic signal), we can just ignore the repeated picture at negative frequencies. We also see that no aliasing occurs for the frequency at f = 0.3, (as there are no negative frequencies that can repeat above 0.25). This means that the signal frequency content of an analytic signal can be up to f = 0.5 as usual for discrete-time signals. There are other possible methods to avoid aliasing for signal frequencies higher than 0.25 and still compute the Wigner distribution for real-valued signals; e.g., the signal could be up-sampled a factor two before the computation of the IAF, and thereby the aliasing is moved to the frequency 0.5 as in usual sampling.
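The analytic signal itself is easy to construct in the frequency domain: transform, zero the negative-frequency half, and transform back. A sketch in Python/NumPy (Matlab's hilbert, referenced in the exercises, does the same thing):

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal of a real sequence: zero the negative-frequency FFT bins
    (equivalent in effect to Matlab's hilbert / scipy.signal.hilbert)."""
    N = len(x)
    X = np.fft.fft(x)
    h = np.zeros(N)
    h[0] = 1.0                      # DC kept as-is
    h[1:(N + 1)//2] = 2.0           # positive frequencies doubled
    if N % 2 == 0:
        h[N//2] = 1.0               # Nyquist bin kept as-is
    return np.fft.ifft(X * h)

# For an integer number of cycles: cos -> cos + i sin
n = np.arange(64)
x = np.cos(2*np.pi*5*n/64)
z = analytic_signal(x)
assert np.allclose(z.real, x)
assert np.allclose(z.imag, np.sin(2*np.pi*5*n/64))
```

The real part of z always equals the input, so no information is lost by working with the analytic signal.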
Figure 3.2: A two-component real-valued sequence.
Figure 3.3: Dierent time-frequency distributions of the sequence in Figure 3.2.
3.3 Cross-terms of the spectrogram
Actually, the spectrogram also gives cross-terms, but they show up as strong components mainly when the signal components are close to each other. This is seen in the following example: If we have two signals x_1(t) and x_2(t), the spectrum, the squared absolute value of the Fourier transform, of the resulting sum of the signals, x(t) = x_1(t) + x_2(t), is

|X(f)|^2 = |\mathcal{F}\{x(t)\}|^2 = |X_1(f) + X_2(f)|^2
= |X_1(f)|^2 + |X_2(f)|^2 + X_1(f) X_2^*(f) + X_1^*(f) X_2(f)
= |X_1(f)|^2 + |X_2(f)|^2 + 2\Re[X_1(f) X_2^*(f)].
When the spectra of X_1(f) and X_2(f) cover the same frequency interval, the density of the sum involves not only the sum of the spectra for the two different signals but also a third term, 2\Re[X_1(f) X_2^*(f)]. This term is affected by the different phases of the signals as well as by the magnitude functions. This term is called leakage, but could also be defined as a cross-term, and it shows up in ordinary spectrum analysis as well as in the spectrogram, i.e., we extend these considerations to the time-frequency expression by

|X(t, f)|^2 = |\mathcal{F}\{x(t_1) h^*(t_1 - t)\}|^2 = |X_1(t, f) + X_2(t, f)|^2
= |X_1(t, f)|^2 + |X_2(t, f)|^2 + 2\Re[X_1(t, f) X_2^*(t, f)],   (3.5)
where the third term is included if the same time-frequency region is covered. This is the main difference if we compare to the Wigner distribution. For the spectrogram, the cross-term component shows up only if the signal components X_1(t, f) and X_2(t, f) overlap. This is seen in Figure 3.4, where two complex-valued sinusoids are located at a large distance and at a small distance. The cross-term from the Wigner distribution is clearly seen in both cases, always placed midway between the two components. For the spectrogram there is no cross-term present when the frequency distance between the two components is large, but when the two components are more closely spaced the effect of the overlapping components becomes visible, Figure 3.4. For a thorough analysis and comparison, see [40].
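The algebra behind the third term is just the expansion of a squared magnitude, and it holds pointwise for any two complex spectra or STFTs; a quick numerical check (Python/NumPy sketch with arbitrary complex vectors standing in for X_1 and X_2):

```python
import numpy as np

rng = np.random.default_rng(0)
X1 = rng.standard_normal(256) + 1j*rng.standard_normal(256)
X2 = rng.standard_normal(256) + 1j*rng.standard_normal(256)

lhs = np.abs(X1 + X2)**2
rhs = np.abs(X1)**2 + np.abs(X2)**2 + 2*np.real(X1*np.conj(X2))
assert np.allclose(lhs, rhs)
```

The cross-term 2 Re[X_1 X_2*] is zero wherever either factor is zero, which is why the spectrogram only shows it where the windowed components overlap.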
Figure 3.4: The cross-terms for the spectrogram and the Wigner distribution of two
complex Gaussian windowed sinusoids.
3.4 Exercises
3.1 Calculate the Wigner distributions of
a) x(t) = e^{i2\pi f_1 t} + e^{i2\pi f_2 t}
b) x(t) = A\cos(2\pi f_0 t).
3.2 Calculate the Wigner distribution of
x(t) = \delta(t - t_1) + \delta(t - t_2).
3.3 Calculate the Wigner distribution of
x(t) = e^{i2\pi f_0 t} + \delta(t - t_0).
C3.4 Create a three-component signal with Gaussian windowed complex-valued sinusoids with total data length N=256, component length Nk=128, centre time values Tvect=[70 70 180] and centre frequency values Fvect=[0.1 0.3 0.1] using the function gaussdata. Compute and plot the Wigner distribution using the m-file [W,TI,FI]=quadtf(Xc);. Compare the resulting image to the result of the spectrogram and explain what you see. Also, compute the corresponding real-valued signal Xr and calculate the Wigner distribution. What can you conclude from this picture? Can you explain the view?
The following additional file could be used:
The m-file [W,TI,FI]=quadtf(X,METHOD,par,Fs,NFFT); computes and plots the specified time-frequency distribution (default Wigner distribution). To compute the analytic signal as input when you have real-valued data, use the Matlab function hilbert, Xa=hilbert(X).
Chapter 4
Ambiguity functions and kernels
A large area of research has been devoted to the reduction of cross-terms, where different time-frequency kernels have been proposed. The design of these kernels is usually made in the so-called ambiguity domain, also known as the doppler-lag domain. The word ambiguity is a bit ambiguous, as it should then stand for something that is not clearly defined. The name originally comes from the radar field and was introduced in [12], describing the equivalence between time-shifting and frequency-shifting for linear FM signals, which are frequently used in radar. Under the name characteristic function, it was however first introduced by Jean-André Ville, [13], where the concept of the Wigner distribution was also used for signal analysis. A similar derivation was also suggested at about the same time by José Enrique Moyal, [14]. Ambiguity functions and ambiguity kernels have been shown to be an efficient way of reducing cross-terms.
4.1 Definitions and properties
The ambiguity function is defined as

A_z(\nu, \tau) = \int_{-\infty}^{\infty} z(t + \tfrac{\tau}{2})\, z^*(t - \tfrac{\tau}{2})\, e^{-i2\pi\nu t}\, dt,   (4.1)

where usually the analytic signal z(t) is used. (Without any restrictions, z(t) could be replaced by x(t).) We note that the formulation is similar to the Wigner-Ville distribution; the difference is that the Fourier transform is made in the t-variable instead of the \tau-variable, giving the ambiguity function dependent on the two variables \nu and \tau. The Fourier transform of the instantaneous auto-correlation function in the
variable t gives

A_x^s(\nu, \tau) = \int_{-\infty}^{\infty} r_x(t, \tau)\, e^{-i2\pi\nu t}\, dt,   (4.2)

which we call the ambiguity spectrum.
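In discrete time the ambiguity function can be computed much like the discrete Wigner distribution, only with the DFT taken over time for each lag. A Python/NumPy sketch (not the course's quadamb m-file; lag and doppler gridding are implementation choices here):

```python
import numpy as np

def ambiguity(z):
    """A_z(nu, tau): for each lag m, form z[n+m] z*[n-m] and DFT over n.
    Row i holds lag m = i - (N-1); columns are doppler bins nu = k/N."""
    N = len(z)
    A = np.zeros((2*N - 1, N), dtype=complex)
    for m in range(-(N - 1), N):
        n = np.arange(abs(m), N - abs(m))
        r = np.zeros(N, dtype=complex)
        r[n] = z[n + m] * np.conj(z[n - m])
        A[m + N - 1] = np.fft.fft(r)
    return A

# Auto-terms land at the origin: A(0,0) equals the signal energy
n = np.arange(64)
z = np.exp(-(n - 32.0)**2/50) * np.exp(2j*np.pi*0.2*n)   # shifted Gaussian component
A = ambiguity(z)
assert np.isclose(A[63, 0], np.sum(np.abs(z)**2))
assert np.unravel_index(np.argmax(np.abs(A)), A.shape) == (63, 0)
```

Whatever the time and frequency location of a single component, |A| peaks at the doppler-lag origin; this is the property the kernels below exploit.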
Example
We compare the view of the Wigner-Ville distribution and the ambiguity function for different signals. In the first case, in Figure 4.1a) and b), we see that a Gaussian function with centre frequency f = 0.05 and centre time t = 40 will be a Gaussian function in the Wigner domain at the corresponding locations in time and frequency. The ambiguity function, however, will be located at \nu = 0 and \tau = 0, but the frequency- and time-shift will show up as the frequency and direction of the oscillations respectively. In Figure 4.1c) and d), the Gaussian with higher centre frequency f = 0.2 located at t = 0 also shows up in the centre of the ambiguity function, with a different frequency and direction of the oscillation. The clear advantage of the ambiguity function shows up in the last example, Figure 4.1e) and f), where the signal now consists of the sum of the two signal components of the two other examples. The two signal components and the cross-term of the Wigner distribution are clearly visible in Figure 4.1e). In the ambiguity function, Figure 4.1f), the signal components add together at the centre, while the cross-terms now show up located at \nu = f_b - f_a and \tau = t_b - t_a as well as at \nu = f_a - f_b and \tau = t_a - t_b.
As the signal (auto-term) components always will be located at the centre of the ambiguity function, independently of where they are located in the time-frequency plane, and the cross-terms always will be located away from the centre, a natural approach is to keep the terms located at the centre and reduce the components away from the centre of the ambiguity function. This is also the ambition of many different ambiguity kernels that have been proposed.
So-called quadratic time-frequency distributions can be formulated as the multiplication of the ambiguity function and the ambiguity kernel, \phi(\nu, \tau),

A_z^Q(\nu, \tau) = A_z(\nu, \tau) \cdot \phi(\nu, \tau),   (4.3)

and the smoothed Wigner-Ville distribution is found as the 2-dimensional convolution

W_z^Q(t, f) = W_z(t, f) \ast\ast \Phi(t, f),
Figure 4.1: A time- and frequency-shifted Gaussian function and the representation as Wigner-Ville distribution and ambiguity function (real part); a) and b), f_1 = 0.1, t_1 = 20; c) and d), f_2 = 0.3, t_2 = 0; e) and f), the sum of the two Gaussian functions.
where

\Phi(t, f) = \mathcal{F}\{\phi(\nu, \tau)\} = \iint_{-\infty}^{\infty} \phi(\nu, \tau)\, e^{-i2\pi(f\tau - t\nu)}\, d\nu\, d\tau.

We can note that the Wigner-Ville distribution has the simple ambiguity domain kernel \phi(\nu, \tau) = 1 and the corresponding time-frequency (non-)smoothing kernel \Phi(t, f) = \delta(t)\delta(f).
Recalling the time marginal property, we see that if we put \tau = 0 in Eq. (4.1) we get

A_z(\nu, 0) = \int_{-\infty}^{\infty} z(t) z^*(t)\, e^{-i2\pi\nu t}\, dt = \int_{-\infty}^{\infty} |z(t)|^2\, e^{-i2\pi\nu t}\, dt,   (4.4)

which is the Fourier transform of the time marginal. Similarly, using Z(f) = \int_{-\infty}^{\infty} z(t) e^{-i2\pi t f}\, dt, we get

A_z(0, \tau) = \int_{-\infty}^{\infty} Z(f) Z^*(f)\, e^{i2\pi f\tau}\, df = \int_{-\infty}^{\infty} |Z(f)|^2\, e^{i2\pi f\tau}\, df,   (4.5)

which is the inverse Fourier transform of the frequency marginal. The frequency marginal is the usual spectral density, and the inverse Fourier transform of the spectral density, A_z(0, \tau), can then be interpreted as the usual covariance function. We can conclude from this that, in order to preserve the marginals of the time-frequency distribution, the ambiguity kernel must fulfill

\phi(0, \tau) = \phi(\nu, 0) = 1.   (4.6)

From this we also conclude that

\phi(0, 0) = 1,   (4.7)

to preserve the total energy of the signal. The Wigner distribution is real-valued, and for the filtered ambiguity function to also correspond to a real-valued distribution, the kernel must fulfill

\phi(\nu, \tau) = \phi^*(-\nu, -\tau). (Exercise!)   (4.8)
The time-invariance and frequency-invariance properties are important, and to find the restrictions on the ambiguity kernel we study the quadratic distribution for a time- and frequency-shifted signal w(t) = z(t - t_0) e^{i2\pi f_0 t}, giving

W_w^Q(t, f) = \iiint_{-\infty}^{\infty} w(u + \tfrac{\tau}{2}) w^*(u - \tfrac{\tau}{2})\, \phi(\nu, \tau)\, e^{-i2\pi(\nu t + f\tau - \nu u)}\, du\, d\tau\, d\nu
= \iiint_{-\infty}^{\infty} z(u + \tfrac{\tau}{2} - t_0) e^{i2\pi f_0 (u + \frac{\tau}{2})}\, z^*(u - \tfrac{\tau}{2} - t_0) e^{-i2\pi f_0 (u - \frac{\tau}{2})}\, \phi(\nu, \tau)\, e^{-i2\pi(\nu t + f\tau - \nu u)}\, du\, d\tau\, d\nu
= \iiint_{-\infty}^{\infty} z(u_1 + \tfrac{\tau}{2}) z^*(u_1 - \tfrac{\tau}{2})\, \phi(\nu, \tau)\, e^{-i2\pi(-f_0\tau + \nu t + f\tau - \nu u_1 - \nu t_0)}\, du_1\, d\tau\, d\nu
= \iiint_{-\infty}^{\infty} z(u_1 + \tfrac{\tau}{2}) z^*(u_1 - \tfrac{\tau}{2})\, \phi(\nu, \tau)\, e^{-i2\pi(\nu(t - t_0) + (f - f_0)\tau - \nu u_1)}\, du_1\, d\tau\, d\nu
= W_z^Q(t - t_0, f - f_0),

where we have assumed that the kernel \phi(\nu, \tau) is a function neither of time nor of frequency. This leads to the conclusion that the quadratic distribution is time shift invariant if the kernel is independent of time, and frequency shift invariant if the kernel is independent of frequency. However, the kernel can be signal dependent.
The weak time support of a quadratic distribution is satisfied if

\int_{-\infty}^{\infty} \phi(\nu, \tau)\, e^{i2\pi\nu t}\, d\nu = 0, \quad |\tau| < 2|t|,

and the weak frequency support if

\int_{-\infty}^{\infty} \phi(\nu, \tau)\, e^{-i2\pi f\tau}\, d\tau = 0, \quad |\nu| < 2|f|.

The Wigner distribution has weak time- and frequency support, but not strong time- and frequency support, as cross-terms arise in between signal components of a multi-component signal. Strong time support means that whenever the signal is zero, the distribution is also zero, giving a stronger restriction on the ambiguity kernel,

\int_{-\infty}^{\infty} \phi(\nu, \tau)\, e^{i2\pi\nu t}\, d\nu = 0, \quad |\tau| \neq 2|t|,

and similarly strong frequency support implies that

\int_{-\infty}^{\infty} \phi(\nu, \tau)\, e^{-i2\pi f\tau}\, d\tau = 0, \quad |\nu| \neq 2|f|.

For proofs see [41].
4.2 RID and product kernels
Methods to reduce the cross-terms, sometimes also referred to as interference terms, have been proposed, and a number of useful kernels that fall into the so-called Reduced Interference Distribution (RID) class can be found. Maybe the most applied RID is the Choi-Williams distribution, [42], also called the Exponential distribution (ED), with the ambiguity kernel defined as

\phi_{ED}(\nu, \tau) = e^{-\nu^2\tau^2/\sigma},   (4.9)

where \sigma is a design parameter. The Choi-Williams distribution also falls into the subclass of product kernels, which have the advantage in optimization of being dependent on one variable only, i.e., x = \nu\tau. The Choi-Williams distribution also satisfies the marginals, as

\phi_{ED}(0, \tau) = \phi_{ED}(\nu, 0) = 1.

The kernel is also Hermitian, i.e.,

\phi_{ED}(\nu, \tau) = \phi_{ED}(-\nu, -\tau),

ensuring realness of the time-frequency distribution.
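These properties are easy to inspect numerically. The sketch below builds the kernel on a doppler-lag grid and checks the marginal and Wigner-limit behavior (Python/NumPy; note that the exact normalization of \sigma varies between texts, so the kernel expression here is an assumption consistent with Eq. (4.9)):

```python
import numpy as np

def choi_williams_kernel(nu, tau, sigma):
    """Choi-Williams (exponential) ambiguity kernel on a doppler-lag grid,
    assuming the form phi(nu, tau) = exp(-nu^2 tau^2 / sigma)."""
    NU, TAU = np.meshgrid(nu, tau, indexing="ij")
    return np.exp(-(NU**2) * (TAU**2) / sigma)

nu = np.linspace(-0.5, 0.5, 101)
tau = np.arange(-50, 51, dtype=float)
phi = choi_williams_kernel(nu, tau, sigma=1.0)

# Marginal-preserving axes: phi(0, tau) = phi(nu, 0) = 1
assert np.allclose(phi[50, :], 1.0)   # nu = 0 row
assert np.allclose(phi[:, 50], 1.0)   # tau = 0 column
# Large sigma approaches the all-pass (Wigner) kernel phi = 1
assert np.allclose(choi_williams_kernel(nu, tau, sigma=1e9), 1.0, atol=1e-3)
```

Because the kernel equals one along both axes, cross-terms that happen to lie on an axis of the ambiguity plane (components sharing a time or a frequency) pass through, which is exactly what the next example shows.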
Example
Comparing the Choi-Williams distribution with the Wigner-Ville distribution for two example signals, we see in Figure 4.2 a nice view where the Choi-Williams distribution gives a reduction of the cross-terms by smearing them out, but still keeps most of the concentration of the auto-terms, when the signal components are located at different frequencies as well as different times. However, it is important to note that this nice view is not fully repeated for the two complex sinusoids located at the same frequency, see Figure 4.3, where the cross-terms still are present. Comparing their sizes shows that the cross-terms of the Wigner-Ville distribution are twice the height of the auto-terms, while the Choi-Williams cross-terms are about half the height of the auto-terms, a reduction of four times. Still, the remaining amplitude of the cross-terms could cause misinterpretation of the distributions. A similar performance would be seen for two components located at the same time.
Other ways of dealing with the cross-terms have been applied, e.g. the Born-Jordan distribution derived by Cohen, [18], using a rule defined by Born and Jordan. The
Figure 4.2: The Wigner and Choi-Williams distributions of two complex sinusoids with Gaussian envelopes located at t_1 = 64, f_1 = 0.1 and t_2 = 128, f_2 = 0.2.
Figure 4.3: The Wigner-Ville and Choi-Williams distributions of two complex sinusoids with Gaussian envelopes located at t_1 = 64, f_1 = 0.1 and t_2 = 128, f_2 = 0.1.
properties of this kernel were not fully understood until the 1990s, with the work of Jeong and Williams, [43]. The ambiguity kernel for the Born-Jordan distribution, also called the Sinc distribution, is

\phi_{BJ}(\nu, \tau) = \mathrm{sinc}(a\nu\tau) = \frac{\sin(\pi a\nu\tau)}{\pi a\nu\tau},   (4.10)

which also fits into the RID class and is a product kernel.
4.3 Separable kernels
A nice form of simple kernels are the separable kernels, defined by

\phi(\nu, \tau) = G_1(\nu)\, g_2(\tau).   (4.11)

This form transfers easily to the time-frequency domain as \Phi(t, f) = g_1(t) G_2(f), with g_1(t) = \mathcal{F}^{-1}\{G_1(\nu)\} and G_2(f) = \mathcal{F}\{g_2(\tau)\}. The quadratic time-frequency formulation becomes

W_z^Q(t, f) = g_1(t) \ast_t W_z(t, f) \ast_f G_2(f),   (4.12)

(with \ast_t and \ast_f denoting convolution in the time and frequency variable respectively), as

A_z^Q(\nu, \tau) = G_1(\nu)\, A_z(\nu, \tau)\, g_2(\tau).   (4.13)

The separable kernel replaces the 2-D convolution of the quadratic time-frequency representation with two 1-D convolutions, which might be beneficial for some signals. Two special cases can be identified: if

G_1(\nu) = 1,   (4.14)

a doppler-independent kernel is given as \phi(\nu, \tau) = g_2(\tau), and the quadratic time-frequency distribution reduces to

W_z^Q(t, f) = W_z(t, f) \ast_f G_2(f),   (4.15)

which is a smoothing in the frequency domain, meaning that only the weak time support is satisfied. The doppler-independent kernel is also given the name Pseudo-Wigner or windowed Wigner distribution. The other case is when

g_2(\tau) = 1,   (4.16)

giving the lag-independent kernel, \phi(\nu, \tau) = G_1(\nu), where the weak frequency support is satisfied and the time-frequency formulation gives only a smoothing in the variable t,

W_z^Q(t, f) = g_1(t) \ast_t W_z(t, f).   (4.17)
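The two special cases reduce to one-dimensional smoothing of a precomputed Wigner matrix, which is cheap to sketch. In the Python/NumPy sketch below, W, g1 and G2 are hypothetical discrete stand-ins for the continuous quantities above:

```python
import numpy as np

def smooth_separable(W, g1=None, G2=None):
    """Apply a separable kernel to a Wigner matrix W (time x frequency):
    convolve columns with g1(t) and/or rows with G2(f).
    g1=None (resp. G2=None) corresponds to G1(nu)=1 (resp. g2(tau)=1)."""
    out = W.astype(float).copy()
    if g1 is not None:   # lag-independent part: smoothing in t
        out = np.apply_along_axis(lambda c: np.convolve(c, g1, mode="same"), 0, out)
    if G2 is not None:   # doppler-independent part: smoothing in f
        out = np.apply_along_axis(lambda r: np.convolve(r, G2, mode="same"), 1, out)
    return out

# Smoothing in f wipes out a cross-term that oscillates along the frequency axis
t = np.arange(128)
W = np.cos(2*np.pi*0.25*t)[None, :] * np.ones((4, 1))   # oscillation along f
G2 = np.ones(8) / 8                                      # crude frequency window
assert np.abs(smooth_separable(W, G2=G2)).max() < 0.3
```

The averaging window here is a crude stand-in for G_2(f); any window covering whole periods of the cross-term oscillation removes it while broadening the auto-terms.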
Figure 4.4: The resulting time-frequency plots for three complex sinusoids with Gaussian envelopes, located at t_1 = 64, f_1 = 0.1 and t_2 = 128, f_2 = 0.1 and t_3 = 128, f_3 = 0.2; a) The Wigner distribution; b) Doppler-independent kernel; c) Lag-independent kernel.
Example
A comparison for some example signals shows the advantage of the frequency smoothing of the doppler-independent kernel: the cross-terms between components with different time locations disappear with the averaging (smoothing) in the frequency direction, as these cross-terms oscillate in the frequency direction, Figure 4.4a) and b). For the lag-independent kernel we see the advantage of the smoothing in the time direction in Figure 4.4c), where instead the cross-terms between components separated in frequency disappear.
4.4 Exercises
4.1 Compute the ambiguity functions of
a) x(t) = e^{i2\pi f_0 t}
b) x(t) = \delta(t - t_0)
c) x(t) = e^{i2\pi f_0 t} + e^{i2\pi f_1 t}
d) x(t) = A\cos(2\pi f_0 t)
4.2 Compute the ambiguity function of
x(t) = e^{i2\pi f_0 t} + \delta(t - t_0)
4.3 Show that Eq. (4.8) is the condition for the real-valued property of the quadratic class.
4.4 Calculate the Pseudo-Wigner distribution using the lag-window
g(\tau) = e^{-a\tau^2/2},
for the complex sine wave,
z(t) = e^{i2\pi f_0 t}.
Compare with the results from the Wigner-Ville distribution. Lead: the following definite integral could be helpful:

\int_{-\infty}^{\infty} e^{2\beta x - \alpha x^2}\, dx = \sqrt{\frac{\pi}{\alpha}}\, e^{\beta^2/\alpha}, \quad \alpha > 0.   (4.18)
C4.5 a) Start with the one-component complex-valued signal from exercise C2.5a). Compute and plot the ambiguity function with [A,TI,FI]=quadamb(Xc);. The plot shows the absolute value! (For analysis of real and imaginary parts, study the output A of the function). Study the ambiguity function for different one-component signals, e.g., change Tvect and Fvect as inputs to gaussdata. Does the absolute value of the ambiguity function change for the different signals? Verify by a closer study of the absolute value of the output A.
b) Construct another signal with more than one component using gaussdata, e.g. a similar one as in C3.4, and study the resulting ambiguity functions and time-frequency distributions, using the Choi-Williams distribution. To change the parameter \sigma (default \sigma = 1) use [A,TI,FI]=quadamb(Xc,'choi',sigma), where F_s = 1 and NFFT=1024 (default values). A smaller value of \sigma, e.g., \sigma = 0.1, will reduce cross-terms but will also affect the auto-terms. For large \sigma, the Choi-Williams distribution will be similar to the Wigner distribution. Study the ambiguity kernel (figure 2) and the resulting filtered ambiguity function (figure 1) and explain the effect of the kernel. Study the resulting time-frequency distribution using [W,TI,FI]=quadtf(Xc,'choi',sigma). Change to Tvect=[70 180], Fvect=[0.1 0.3] for the complex-valued signal and study the Choi-Williams distribution for different values of \sigma. Why is this signal easier to resolve? Explain from the view of the ambiguity kernel and filtered ambiguity function. How would you describe the form of the ambiguity kernel for the Choi-Williams distribution? How can you see that the marginals are satisfied?
c) Use the three-component complex-valued signal from C3.4 and investigate the difference between the doppler-independent kernel, METHOD='w-wig', and the lag-independent kernel, METHOD='l-ind', for different window lengths (Hanning, default length par = N/10). Study the filtered ambiguity function and the ambiguity kernel as well as the time-frequency distribution, and explain the differences and how different window lengths, par=WIN, affect the results.
The following additional m-file could be used and modified for your own purposes:
The m-file [A,TI,FI]=quadamb(X,METHOD,par,Fs,NFFT) computes and plots the filtered ambiguity function (absolute value) using a specified method.
Chapter 5
Concentration measures
Many of the proposed methods in time-frequency analysis are devoted to the suppression of cross-terms while maintaining the concentration of the auto-terms. To measure this, and possibly automatically determine parameters, measurement criteria for the concentration are needed. Such criteria were defined already in [39], and data-adaptive methods can be found, e.g., in [37].
The basic idea of concentration measurement can be found using a simple example from probability theory. We use N non-negative numbers, p_1, p_2, \ldots, p_N, where

p_1 + p_2 + \ldots + p_N = 1.

A quadratic test function

\mu = p_1^2 + p_2^2 + \ldots + p_N^2

will attain its minimum value when p_1 = p_2 = \ldots = p_N = 1/N (maximally spread) and its maximum value when only one p_i = 1 and all the others are zero (minimum spread). Incorporating the unity sum constraint in the test function gives

\mu = \frac{p_1^2 + p_2^2 + \ldots + p_N^2}{(p_1 + p_2 + \ldots + p_N)^2}.
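The two extremes are quickly verified with a small Python sketch of the normalized test function above:

```python
import numpy as np

def concentration(p):
    """Normalized quadratic concentration measure for non-negative numbers p."""
    p = np.asarray(p, dtype=float)
    return np.sum(p**2) / np.sum(p)**2

N = 10
spread = np.ones(N) / N                  # maximally spread
peaked = np.zeros(N); peaked[3] = 1.0    # minimally spread (one nonzero entry)

assert np.isclose(concentration(spread), 1/N)   # minimum value 1/N
assert np.isclose(concentration(peaked), 1.0)   # maximum value 1
```

Thanks to the denominator, the measure is also invariant to scaling of p, which is what makes the time-frequency version below usable without normalizing the distribution first.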
5.1 Concentration measures
We can formulate a similar measure for the time-frequency concentration, e.g., as suggested in [37],

\mu = \frac{\iint_{-\infty}^{\infty} W^4(t, f)\, dt\, df}{\left( \iint_{-\infty}^{\infty} W^2(t, f)\, dt\, df \right)^2},   (5.1)
where W(t, f) is a chosen time-frequency distribution or spectrum. Other norms could of course be used, but this one is similar to the often used kurtosis in statistics. The measure in Eq. (5.1) is however not useful for signals including several components, and a better measure is a more local function, also proposed in [37],

\mu_{(t,f)} = \frac{\iint_{-\infty}^{\infty} Q^2(t_1 - t, f_1 - f)\, W^4(t_1, f_1)\, dt_1\, df_1}{\left( \iint_{-\infty}^{\infty} Q(t_1 - t, f_1 - f)\, W^2(t_1, f_1)\, dt_1\, df_1 \right)^2},   (5.2)
where Q(t, f) is a weighting function that determines the region for the concentration
measurement. One suggestion is to use a Gaussian function as weighting function,
[37].
Another measure is based on the Rényi entropy, [44],

R_\alpha = \frac{1}{1 - \alpha} \log_2\left( \iint_{-\infty}^{\infty} W^\alpha(t, f)\, dt\, df \right), \quad \alpha > 2.   (5.3)

The behavior depends on \iint_{-\infty}^{\infty} W^\alpha(t, f)\, dt\, df for \alpha > 2, similarly to the above mentioned functions, and the logarithm is a monotone function. However, as 1/(1 - \alpha) is negative for \alpha > 2, the entropy of Eq. (5.3) becomes larger for less concentrated distributions. The Rényi entropy can be interpreted as counting the number of components in a multi-component signal. For odd values of \alpha > 1, R_\alpha is asymptotically invariant to time-frequency cross-terms and does not count them. R_\alpha attains its lowest value for the Wigner distribution of a single Gaussian windowed signal. It is also invariant to time- and frequency shifts of the signal.
The Rényi entropy can be related to the well-known Shannon entropy,

H = -\iint_{-\infty}^{\infty} W(t, f) \log_2 W(t, f)\, dt\, df,

which can be recovered from the Rényi entropy in the limit \alpha \to 1, [44]. We should note that the Shannon entropy cannot be used for a distribution taking negative values at any time-frequency point, as the logarithm is applied inside the integral. For the Rényi entropy, however, the logarithm is applied outside the integral, and for many signals, and especially smoothed distributions, the integral is positive. Nonetheless, multi-component signals and odd values of \alpha can be found for which \iint_{-\infty}^{\infty} W^\alpha(t, f)\, dt\, df < 0; see [44] for examples of such Wigner distributions. Often, the value \alpha = 3 is applied. Many of the time-frequency distributions in the literature have been based either implicitly or explicitly on information measures, e.g., [45, 46].
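The component-counting interpretation can be illustrated with two toy "distributions" (a Python/NumPy sketch, normalizing W to unit sum as an assumption): with \alpha = 3, splitting the energy over two equal, disjoint bumps adds exactly one bit.

```python
import numpy as np

def renyi(W, alpha=3):
    """Renyi entropy of a (normalized) discrete time-frequency distribution W."""
    W = W / W.sum()                      # normalize to unit energy
    return np.log2(np.sum(W**alpha)) / (1 - alpha)

one = np.zeros((64, 64)); one[16, 16] = 1.0
two = np.zeros((64, 64)); two[16, 16] = 1.0; two[48, 48] = 1.0

assert np.isclose(renyi(one), 0.0)
assert np.isclose(renyi(two) - renyi(one), 1.0)   # one extra bit ~ doubling 2^R
```

Reading 2^{R_\alpha} as an effective component count is the intuition behind using R_3 to compare distributions of multi-component signals.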
Chapter 6
Quadratic distributions and
Cohen's class
In the 1940s to 60s, during and after the invention of the spectrogram and the Wigner-Ville distribution, a lot of other distributions and methods were invented, e.g., the Rihaczek distribution, [15], the Page distribution, [16], and the Margenau-Hill or Levin distribution, [17]. In 1966 a formulation was made by Leon Cohen in quantum mechanics, [18], which included these and an infinite number of other methods as kernel functions. Many of the distributions satisfied the marginals, the instantaneous frequency condition and other properties, and all of these methods are nowadays referred to as the Cohen's class. Later a quadratic class was defined, also including kernels that do not satisfy the marginals, which was a restriction of the original Cohen's class. However, the quadratic class is nowadays often referred to as the Cohen's class. Quadratic time-frequency distributions can always be formulated as the multiplication of the ambiguity function and the ambiguity kernel, \phi(\nu, \tau),

A_z^Q(\nu, \tau) = A_z(\nu, \tau) \cdot \phi(\nu, \tau),   (6.1)
where the ambiguity function is the two-dimensional Fourier transform of the Wigner-Ville distribution, and accordingly, for any ambiguity kernel, the smoothed Wigner-Ville distribution is found as the 2-dimensional convolution

W_z^Q(t, f) = W_z(t, f) \ast\ast \Phi(t, f)
= \iint_{-\infty}^{\infty} A_z(\nu, \tau)\, \phi(\nu, \tau)\, e^{-i2\pi(f\tau - t\nu)}\, d\nu\, d\tau
= \iiint_{-\infty}^{\infty} z(u + \tfrac{\tau}{2}) z^*(u - \tfrac{\tau}{2})\, \phi(\nu, \tau)\, e^{-i2\pi(\nu t + f\tau - \nu u)}\, du\, d\tau\, d\nu,   (6.2)

where the last line is the most recognized, defining the quadratic class. The quadratic class also includes the spectrogram.
6.1 Kernel interpretation of the spectrogram
The spectrogram, using the analytic signal, is dened
S
z
(t, f) = [
_

(t t
1
)z(t
1
)e
i2ft
1
dt
1
[
2
,
= (
_

(t t
1
)z(t
1
)e
i2ft
1
dt
1
)(
_

h(t t
2
)z

(t
2
)e
i2ft
2
dt
2
).
Replacing t₁ = t′ + τ/2 and t₂ = t′ − τ/2,

    S_z(t, f) = ∫∫ z(t′ + τ/2) z*(t′ − τ/2) h(t − t′ − τ/2) h*(t − t′ + τ/2) e^{−i2πfτ} dτ dt′,
where we can identify the time-lag distribution r_z(t, τ) = z(t + τ/2) z*(t − τ/2) and a
time-lag kernel

    ρ_h(t, τ) = h(t + τ/2) h*(t − τ/2),

giving

    W^h_z(t, f) = S_z(t, f) = ∫∫ r_z(t′, τ) ρ*_h(t − t′, τ) e^{−i2πfτ} dt′ dτ.
(The proof of the time-lag formulation of the quadratic time-frequency distribution
is carried out in exercise 6.1.)
Another formulation of the spectrogram in the quadratic class is given by starting
with a kernel φ_h(ν, τ) which is the ambiguity function of a window h(t), i.e.,

    φ_h(ν, τ) = ∫ h(t + τ/2) h*(t − τ/2) e^{−i2πνt} dt.
Then we get (we drop the integral limits for simplification),

    W^h_z(t, f) = ∫∫∫∫ z(t₁ + τ/2) z*(t₁ − τ/2) h(t₂ + τ/2) h*(t₂ − τ/2) e^{−i2π(νt₁ + νt₂ − νt + fτ)} dt₁ dt₂ dν dτ,

which after integration over ν becomes

    = ∫∫∫ z(t₁ + τ/2) z*(t₁ − τ/2) h(t₂ + τ/2) h*(t₂ − τ/2) δ(t − t₁ − t₂) e^{−i2πfτ} dτ dt₁ dt₂.
Using that t₂ = t − t₁,

    = ∫∫ z(t₁ + τ/2) z*(t₁ − τ/2) h(t − t₁ + τ/2) h*(t − t₁ − τ/2) e^{−i2πfτ} dτ dt₁.
Replacing t₁ + τ/2 = t′ and t₁ − τ/2 = t″ gives

    = ∫∫ z(t′) z*(t″) h(t − t″) h*(t − t′) e^{−i2πf(t′ − t″)} dt′ dt″,
which equals the spectrogram,

    S_z(t, f) = (∫ h*(t − t′) z(t′) e^{−i2πft′} dt′)(∫ h(t − t″) z*(t″) e^{i2πft″} dt″)
              = |∫ h*(t − t′) z(t′) e^{−i2πft′} dt′|².
If we study the ambiguity kernel of the spectrogram above, we see that
the marginals can never be fulfilled. Why?
6.2 Some historically important distributions
The Rihaczek distribution (RD), [15], is derived from the energy of a complex-valued
deterministic signal over finite ranges of t and f; if these ranges become infinitesimal,
an energy density function is obtained. The energy of a complex-valued signal
is

    E = ∫ |z(t)|² dt = ∫ z(t) z*(t) dt = ∫∫ z(t) Z*(f) e^{−i2πft} df dt,   (6.3)

where then for infinitesimal ranges of t and f we define

    R_z(t, f) = z(t) Z*(f) e^{−i2πft},   (6.4)
as the Rihaczek distribution. (Sometimes this distribution is also called the Kirkwood-Rihaczek
distribution, as it actually was derived earlier in the context of quantum
mechanics, [47].) We note that the marginals are satisfied as

    ∫ R_z(t, f) df = |z(t)|²,

(seen from Eq. (6.3)), and

    ∫ R_z(t, f) dt = Z*(f) ∫ z(t) e^{−i2πft} dt = |Z(f)|².
Other important properties of the Rihaczek distribution are both strong time support
and strong frequency support, which is easily seen as the distribution is zero in the
time intervals where z(t) is zero and in the frequency intervals where Z(f) is zero.
The Fourier transform in two variables of Eq. (6.4) gives the ambiguity function, from
which the ambiguity kernel

    φ(ν, τ) = e^{−iπντ},   (6.5)

can be identified. From this we also verify that the marginals are satisfied, as φ(ν, 0) =
φ(0, τ) = 1.
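The Rihaczek marginals are easy to verify numerically. Below is a small NumPy sketch of a discrete Rihaczek distribution (my own discretization, with a 1/N normalization chosen for the NumPy DFT convention; not from the book's m-files):

```python
import numpy as np

N = 128
n = np.arange(N)
rng = np.random.default_rng(0)
z = rng.standard_normal(N) + 1j * rng.standard_normal(N)   # arbitrary test signal
Z = np.fft.fft(z)

# Discrete Rihaczek distribution, R(n, k) = z(n) Z*(k) e^{-i 2 pi n k / N},
# normalized by 1/N so that the sum over frequency gives |z(n)|^2 directly.
R = z[:, None] * np.conj(Z)[None, :] * np.exp(-2j * np.pi * np.outer(n, n) / N) / N

marg_time = R.sum(axis=1).real        # time marginal, should equal |z(n)|^2
marg_freq = R.sum(axis=0).real        # frequency marginal, |Z(k)|^2 / N here
```

The 1/N scale factor on the frequency marginal is a consequence of the unnormalized NumPy forward DFT; both marginals hold exactly for any signal, which is the discrete counterpart of the two integrals above.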
Example

Comparing the results of the Wigner-Ville and Rihaczek distributions for some signals
shows that for a sum of two complex-valued sinusoids (f₁ = 0.05 and
f₂ = 0.15), multiplied with Gaussian functions and located at the same time instant,
Figure 6.1a), the Rihaczek distribution, Figure 6.1c) (real value) and d) (imaginary
value), does not generate any cross-terms as the Wigner-Ville distribution does, Figure 6.1b).
The view of the components might be difficult to interpret as they seem to
oscillate quite strangely. However, in Figure 6.2, for the sum of two complex-valued
sinusoids (f₁ = 0.05 and f₂ = 0.15) with Gaussian envelopes located at different
time intervals and with different frequencies, the Rihaczek distribution produces twice
as many cross-terms compared to the Wigner-Ville distribution. Why the cross-terms
appear as in Figure 6.2 is intuitively easy to understand if we study Eq. (6.4), where the
signal z(t) will be present around times −30 and 30 and the Fourier transform Z(f)
will be present around frequencies f₁ = 0.05 and f₂ = 0.15. These two functions
are combined into the two-dimensional time-frequency representation, which naturally
then will have components appearing at all combinations of these time and frequency
instants.
The real part of the Rihaczek distribution is also defined as a distribution, the Levin
distribution (LD), [48], i.e.,

    L_z(t, f) = ℜ[z(t) Z*(f) e^{−i2πft}].   (6.6)

It also satisfies the marginals and has strong time and frequency support. The advantage
of the Levin distribution is that it is real-valued (similar to the Wigner-Ville
distribution), even for complex signals. It was also derived in a quantum-mechanical
context and is therefore sometimes called the Margenau-Hill distribution, [17].
Figure 6.1: The Wigner-Ville and Rihaczek distributions of a real-valued cosine signal
with Gaussian envelope.
Figure 6.2: The Wigner-Ville and Rihaczek distributions of two complex sinusoids
with Gaussian envelope located at different time intervals and with different frequencies.
The Rihaczek and Levin distributions have their obvious drawbacks, but as they
are also intuitively nice from an interpretation point of view, they are often applied where the
signal is windowed, e.g., a windowed Rihaczek distribution is defined as

    R^w_z(t, f) = z(t) [ℱ_f{z(u) w(u − t)}]* e^{−i2πft}.   (6.7)

With an appropriate size of the window, the cross-terms of the Rihaczek distribution
can be reduced.
Another suggested distribution is the running transform, Z⁻(t, f), or the Page
distribution, [16], which is the Fourier transform of the signal up to time t,

    Z⁻(t, f) = ∫_{−∞}^{t} z(t′) e^{−i2πft′} dt′.   (6.8)

If the signal is zero after time t, the marginal in frequency is |Z⁻(t, f)|². If a distribution
satisfies this marginal, then

    ∫_{−∞}^{t} P⁻(t′, f) dt′ = |Z⁻(t, f)|².
Differentiating with respect to time gives

    P⁻(t, f) = ∂/∂t [ |∫_{−∞}^{t} z(t′) e^{−i2πft′} dt′|² ]
             = ∂/∂t [ (∫_{−∞}^{t} z(t′) e^{−i2πft′} dt′)(∫_{−∞}^{t} z*(t′) e^{i2πft′} dt′) ]
             = z(t) e^{−i2πft} (Z⁻(t, f))* + Z⁻(t, f) z*(t) e^{i2πft}
             = 2ℜ{ z*(t) Z⁻(t, f) e^{i2πft} },

to be recognized as somewhat similar to the Levin distribution. Comparing with the
results of the previous examples, we see the characteristics of the Page distribution
in Figure 6.3, where one of the cross-terms from the Rihaczek distribution disappears
but the other remains. Can you see the reason why the first appearing cross-term in
time is the one that disappears? Study the definition in Eq. (6.8) and the limits of
the integral.
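In discrete time, the derivative in the Page construction becomes a first difference, and the frequency marginal follows by telescoping. A minimal NumPy sketch (my own discrete conventions, not the book's m-files):

```python
import numpy as np

N = 64
n = np.arange(N)
z = np.exp(2j * np.pi * 0.15 * n) * np.exp(-0.5 * ((n - 32) / 8) ** 2)

# Running transform of Eq. (6.8): Zrun[n, k] = sum_{n' <= n} z(n') e^{-i2pi n'k/N}.
E = np.exp(-2j * np.pi * np.outer(n, n) / N)
Zrun = np.cumsum(z[:, None] * E, axis=0)

# Page distribution as the first difference in time of |Zrun|^2 (discrete
# counterpart of the time derivative above); the time sum then telescopes.
P = np.empty((N, N))
P[0] = np.abs(Zrun[0]) ** 2
P[1:] = np.abs(Zrun[1:]) ** 2 - np.abs(Zrun[:-1]) ** 2

freq_marginal = P.sum(axis=0)          # equals |Z(k)|^2 by construction
```

The telescoping sum makes the frequency marginal exact regardless of the signal, and it also shows why the distribution is causal: P at time n only depends on the signal up to n.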
6.3 A view on the four domains of time-frequency analysis

As noted, we can also formulate the quadratic class as

    W^Q_z(t, f) = ∫∫ r_z(u, τ) ρ(t − u, τ) e^{−i2πfτ} du dτ,   (6.9)
Figure 6.3: The Wigner-Ville and Page distributions of two complex-valued sinusoids
with Gaussian envelope located at different time intervals and with different frequencies.
where the time-lag kernel is defined as

    ρ(t, τ) = ∫ φ(ν, τ) e^{i2πνt} dν,

and the Instantaneous Autocorrelation Function (IAF) as

    r_z(t, τ) = z(t + τ/2) z*(t − τ/2).   (6.10)
The Doppler distribution is found from the Fourier transform of the Wigner-Ville
distribution in t or the Fourier transform of the ambiguity function in τ, i.e.,

    D_z(ν, f) = ∫ A_z(ν, τ) e^{−i2πfτ} dτ
              = ∫ W_z(t, f) e^{−i2πνt} dt.   (6.11)

The doppler distribution is the corresponding spectral autocorrelation function,
which is also calculated from the IAF as

    D_z(ν, f) = ∫∫ r_z(t, τ) e^{−i2π(fτ + νt)} dt dτ.   (6.12)
Replacing t + τ/2 = t₁ and t − τ/2 = t₂ we reformulate the doppler distribution as

    D_z(ν, f) = ∫ z(t₁) e^{−i2π(f + ν/2)t₁} dt₁ ∫ z*(t₂) e^{i2π(f − ν/2)t₂} dt₂
              = Z(f + ν/2) Z*(f − ν/2),   (6.13)

where Z(f) is the Fourier transform of z(t). We now see that the doppler distribution
is the frequency dual of the IAF.

Now we have presented four different domains for the representation of a time-varying
signal: the instantaneous autocorrelation function (IAF) in the variables (t, τ), the
Wigner distribution in (t, f), the ambiguity function in (ν, τ) and finally the doppler
distribution in (ν, f). A schematic overview is given in Figure 6.4. Studying simple
signals in the different domains gives us information on the interpretation. A Gaussian
envelope signal centered at t = 0 will show up as a Gaussian signal in all other domains
too, as the Fourier transform also has a Gaussian shape. It is important to remember
which variable is used in the Fourier transform and which variable is fixed
when transforming from one domain to another. When we move up, down, left or
right between the figures, only one variable changes for each step.
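The four domains and the Fourier transforms connecting them can be checked numerically. The sketch below (my own periodic discrete conventions with integer half-lag m, so τ = 2m; an illustration, not the course m-files) computes the doppler distribution by both routes of Eq. (6.11) and verifies that the ambiguity function at the origin is the signal energy:

```python
import numpy as np

N = 64
n = np.arange(N)
z = np.exp(-0.5 * ((n - 32) / 6) ** 2) * np.exp(2j * np.pi * 0.25 * n)

# Periodic IAF on an integer half-lag grid m: r_z(n, m) = z(n+m) z*(n-m).
r = z[(n[:, None] + n[None, :]) % N] * np.conj(z[(n[:, None] - n[None, :]) % N])

W = np.fft.fft(r, axis=1)    # (t, tau)  -> (t, f):    Wigner distribution
A = np.fft.fft(r, axis=0)    # (t, tau)  -> (nu, tau): ambiguity function
D1 = np.fft.fft(A, axis=1)   # (nu, tau) -> (nu, f):   doppler, ambiguity route
D2 = np.fft.fft(W, axis=0)   # (t, f)    -> (nu, f):   doppler, Wigner route

energy = np.sum(np.abs(z) ** 2)   # A(0, 0) equals the signal energy
```

The two routes to the doppler distribution agree because both are the same two-dimensional DFT of the IAF, applied one variable at a time — exactly the structure of Figure 6.4.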
[Schematic (Figure 6.4): the Wigner distribution W_z(t, f), the IAF r_z(t, τ), the
doppler distribution D_z(ν, f) and the ambiguity function A_z(ν, τ), connected pairwise
by Fourier transforms ℱ_t and ℱ_τ, one variable at a time.]

Figure 6.4: The four possible domains in time-frequency analysis.
Examples

If we delay the Gaussian signal in time to t = 20, the IAF and the Wigner distribution
will shift the information by the movement of the Gaussian to t = 20, Figure 6.5.
For the ambiguity function and the doppler distribution the result is complex-valued and
we choose to plot the real value of the functions. The Fourier transform from the t-variable
to the ν-variable will give us a Gaussian envelope with a phase corresponding
to the time-shift of t = 20. The phase-shift shows up in the real value (and the
imaginary value) and is seen in the plot as a function shifting between negative and
positive values when studied in the ν-variable. Increasing the time-shift will cause
these shifts to occur more frequently, i.e., a more rapid oscillation. If we now replace

Figure 6.5: The Gaussian signal shifted to t = 20 and the representation in all four
domains.

the Gaussian signal with a complex-valued signal of frequency f = 0.2 with a Gaussian
envelope, the movement in frequency shows up in the Wigner and the doppler distributions
as the functions now are located at f = 0.2. For the IAF and the ambiguity function,
however, we now get a function shifting from positive to negative values in the τ-variable,
Figure 6.6.
Figure 6.6: The Gaussian signal shifted in frequency to f = 0.2 and the representation
in all four domains.
The combination of a shift in time of t = 20 with a shift in frequency of f = 0.2 can
clearly be seen in the Wigner distribution of Figure 6.7. For the ambiguity function,
the combination of time- and frequency-shift shows up as diagonal lines, but the function
is still located at zero both in ν and τ. The IAF shows the time shift, and in the doppler
distribution the frequency shift of f = 0.2 shows up.

With a multicomponent signal, a combination of a Gaussian signal at time t = 0 and
the complex signal with frequency f = 0.2, Gaussian envelope and located at t = 20,
a more complex pattern can be seen in all domains of Figure 6.8. For the Wigner
distribution, the two functions are seen at t = 0, f = 0 and t = 20, f = 0.2 with the
cross-term in between. The cross-term also shows up in the other domains at various
places. In the IAF domain the auto-terms are located at τ = 0 as before, whereas the
cross-terms show up at τ = −20 and τ = 20. The same thing is seen for the doppler
distribution, where the auto-terms are located at ν = 0 and the cross-terms at positive
and negative ν-values respectively. For the ambiguity function, these properties can be
used to tell auto-terms from cross-terms, as the auto-terms are located at ν = τ = 0
and the cross-terms at positive and negative ν and τ.
Figure 6.7: The Gaussian signal shifted in frequency to f = 0.2 and in time to t = 20
and the representation in all four domains.
Moving the complex sinusoids further apart in time will cause a corresponding shift
in the IAF domain as well as in the ambiguity function. For the doppler distribution,
however, this larger shift just shows up in the frequency of the oscillations of the functions;
the locations of the auto-terms as well as the cross-terms remain the same. On the
other hand, if we increase the frequency shift but keep the time shift constant, the
different terms of the IAF will remain at the same location, but the doppler distribution
locations will change, similarly to the ambiguity function. The auto-terms will still be
located at the origin and the cross-terms somewhere outside the origin of the ambiguity
function.

In the last example, in Figure 6.9, we show the four domains in the case of a real
cosine signal of frequency f = 0.2 with Gaussian envelope centered at t = 0. The
auto-terms as well as the cross-terms show up as expected in the different domains.
In the Wigner domain the auto-terms are located at f = ±0.2 and the cross-term at
f = 0. In the ambiguity function, the auto-term is at ν = τ = 0 and the cross-terms
outside. For the IAF, everything is located at t = τ = 0: as all terms are located
at t = 0, no time difference between them exists and thereby the terms are
Figure 6.8: Two Gaussian signals, one at f = 0, t = 0 and one shifted in frequency to
f = 0.2 and in time to t = 20, and the representation in all four domains.
located at τ = 0 as well. The doppler distribution has the difference in frequency,
f = 0.2, which shows up both in f and in ν. Note that the difference is doubled in
the ν-domain.
Figure 6.9: A real-valued cosine signal with f = 0.2 and Gaussian envelope and the
representation in all four domains.
6.4 Exercises

6.1 Use Eq. (6.2) and show that the quadratic class also can be written as

    W^Q_z(t, f) = ∫∫ r_z(u, τ) ρ(t − u, τ) e^{−i2πfτ} du dτ,

where the time-lag kernel is defined as

    ρ(t, τ) = ∫ φ(ν, τ) e^{i2πνt} dν.

6.2 Derive the ambiguity kernel for the Rihaczek distribution starting from the
time-frequency distribution

    W^Q_z(t, f) = R_z(t, f) = z(t) Z*(f) e^{−i2πft}.

6.3 Verify that the time-lag kernel of the Levin distribution is

    ρ(t, τ) = (1/2) [ δ(t + τ/2) + δ(t − τ/2) ].
C6.4 a) Construct a signal with several components using gaussdata. Compute
and plot the Rihaczek distribution and the windowed Rihaczek distribution
using the m-files quadtf and quadamb, and investigate the advantages
and disadvantages of the different methods. How does the window length
affect the ambiguity kernel and the time-frequency distribution? Compare
the Rihaczek distribution with the Levin distribution for one or two example
signals and point out the differences. (Hint: Study the output W
from quadtf.)

b) Study the time-lag kernel (given as output from quadtf) for some distributions
to figure out the demand on the time-lag kernel for a real-valued
time-frequency distribution. Visualize a few examples.
Chapter 7

The instantaneous frequency

The concept of instantaneous frequency is closely related to the definition of the
analytic signal, as from the analytic signal we get a phase, and the instantaneous
frequency is defined as the derivative of the phase. The first interest in instantaneous
frequency arose with frequency modulation in radio transmission, with a variable
frequency in an electric circuit theory context. Many methods for estimating the
instantaneous frequency have been proposed, e.g., more recently, iteratively using
matched adaptive spectrograms, [49], adaptive window lengths, [50], and estimation
of multi-component signals, [51].
7.1 Instantaneous frequency

Here, one of several approaches to the definition of the instantaneous frequency is
presented, where we start with a simple amplitude-modulated signal,

    x(t) = A(t) e^{iφ(t)} = A(t) e^{i2πf₀t},   (7.1)

where A(t) is the instantaneous amplitude. The frequency of this signal for all t
is found as

    f₀ = φ′(t) / 2π.

We can now extend this constant-frequency example to a signal

    x(t) = A(t) e^{iφ(t)} = A(t) e^{i2πf(t)t},   (7.2)

where f(t) is time-variable and where φ(t) is defined as the instantaneous phase.
If φ(t) is differentiable and is evaluated at t = t₁ and t = t₂, where t₂ > t₁, there
exists a time instant t between t₁ and t₂ such that

    φ(t₂) − φ(t₁) = (t₂ − t₁) φ′(t).   (7.3)

If p_I is the period of one oscillation of x(t), then the instantaneous frequency is
defined as f_I = 1/p_I. Evaluating for one period, i.e., t₂ = t₁ + p_I, will result in
φ(t₂) = φ(t₁) + 2π, and φ′(t) from Eq. (7.3) will be simplified to

    φ′(t) = 2π/p_I = 2πf_I.

The instantaneous frequency will then, similarly to the constant-frequency example,
be

    f_I = φ′(t) / 2π,   (7.4)
for the time t during that oscillation.

In the case of a real-valued signal, however, e.g.,

    x(t) = A(t) cos(2πf(t)t),   (7.5)

the instantaneous frequency will be zero, as this signal has no phase in the sense
defined above. This will of course be the case for any real-valued signal. If we could
choose, we would rather like the instantaneous frequency to be f_I also for a real-valued
signal. Therefore, it is of interest to use analytic signals.
Analytic signals

As already mentioned, a signal z(t) is analytic if

    Z(f) = 0 for f < 0,   (7.6)

where Z(f) = ℱ{z(t)}, i.e., the analytic signal requires the spectrum to be identical
to the spectrum of the real signal for positive frequencies and to be zero for negative
frequencies.

Theorem 1 The signal

    z(t) = x(t) + iy(t),   (7.7)

where x(t) and y(t) are real-valued, is analytic with a real-valued DC component, if
and only if

    Y(f) = (−i sign(f)) X(f),   (7.8)

where X(f) = ℱ{x(t)} and Y(f) = ℱ{y(t)}, and where

    sign(f) =  1 if f > 0,  0 if f = 0,  −1 if f < 0.   (7.9)

We get

    Z(f) =  2X(f) if f > 0,  X(f) if f = 0,  0 if f < 0.   (7.10)
The inverse Fourier transform of Z(f) is the analytic signal corresponding to the real
signal x(t). The calculation of the analytic signal can be made in the time domain as
well, see e.g., [31], but the interpretation in the frequency domain is much easier, using
the Hilbert transform. The Hilbert transform of a signal x(t) is computed as

    ℋ[x(t)] = ℱ⁻¹{(−i sign(f)) ℱ{x(t)}},   (7.11)

giving

    z(t) = x(t) + iℋ[x(t)].   (7.12)
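The frequency-domain recipe of Eq. (7.10) translates directly into a few lines of code. A minimal NumPy sketch (discrete-time assumptions: real input of even length; the same construction is available as scipy.signal.hilbert):

```python
import numpy as np

def analytic_signal(x):
    # Eq. (7.10): keep the DC bin, double the positive frequencies,
    # zero the negative frequencies (x real-valued, even length assumed).
    N = len(x)
    X = np.fft.fft(x)
    Z = np.zeros(N, dtype=complex)
    Z[0] = X[0]
    Z[1:N // 2] = 2 * X[1:N // 2]
    Z[N // 2] = X[N // 2]                 # Nyquist bin for even N
    return np.fft.ifft(Z)

N = 256
t = np.arange(N)
f0 = 16 / N                               # an exact DFT bin, to avoid leakage
z = analytic_signal(np.cos(2 * np.pi * f0 * t))
```

The real part of z returns the input and the imaginary part is its Hilbert transform, so for a cosine the imaginary part should be the corresponding sine, as in the example below.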
Example

The Hilbert transform introduces a phase lag of 90 degrees (π/2), caused by the
multiplication by −i = e^{−iπ/2}, and the resulting signal is then said to be in quadrature
with the original signal. For a positive constant f₀,

    ℋ[cos(2πf₀t)] = sin(2πf₀t),
    ℋ[sin(2πf₀t)] = −cos(2πf₀t).

We can also assume

    ℋ[A(t) cos(φ(t))] = A(t) sin(φ(t)),

if the variation of A(t) is sufficiently slow. Sufficiently slow in this case means that
the variation should be slow enough to ensure that the spectrum of A(t) does not overlap
the spectrum of cos(φ(t)). This means that all low frequencies are contained in A(t)
whereas the high frequencies are in φ(t). We get

    z(t) = A(t) cos(φ(t)) + iℋ[A(t) cos(φ(t))]
         = A(t) cos(φ(t)) + iA(t) sin(φ(t))
         = A(t) e^{iφ(t)}.
Example

The convolution of an analytic signal with an arbitrary function is analytic, as

    ∫ z(t₁) h(t − t₁) dt₁ = ∫₀^∞ Z(f) H(f) e^{i2πft} df,

where the resulting signal must be analytic as Z(f) = 0 for f < 0 and the integral
thereby is defined only for positive frequencies.
Example

It is important to note that when the signal is a multi-component signal, the definition
of instantaneous frequency becomes difficult. The concept of instantaneous frequency
is no longer valid, and methods that rely on the definition of instantaneous frequency
do not work properly. One example is the sum of two complex sinusoids,

    x(t) = A₁ e^{i2πf₁t} + A₂ e^{i2πf₂t},   (7.13)

where f₁ and f₂ are positive, i.e., an analytic signal. Depending on the values of the
frequencies and the amplitudes, very different instantaneous frequencies show up, see
Figure 7.1a), where the parameters are f₁ = 0.11, f₂ = 0.2, A₁ = 0.5 and A₂ = 1,
giving an instantaneous frequency varying between 0.17 and 0.29, which includes the
frequency 0.2 but not at all 0.11. In Figure 7.1b), the amplitude of the first component
is changed to A₁ = 1.5 and the resulting instantaneous frequency even becomes
negative for certain time points.
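The two cases above are easy to reproduce numerically by differentiating the unwrapped phase, Eq. (7.4). A sketch (fine time grid of my choosing, so the finite difference approximates the derivative well):

```python
import numpy as np

dt = 0.01
t = np.arange(0, 200, dt)
f1, f2 = 0.11, 0.2

def inst_freq(x):
    # Instantaneous frequency as the derivative of the unwrapped phase, Eq. (7.4).
    return np.diff(np.unwrap(np.angle(x))) / (2 * np.pi * dt)

# Case a): A1 = 0.5, A2 = 1  -> f_i oscillates between 0.17 and 0.29
fi_a = inst_freq(0.5 * np.exp(2j * np.pi * f1 * t) + 1.0 * np.exp(2j * np.pi * f2 * t))

# Case b): A1 = 1.5, A2 = 1  -> f_i even becomes negative at some time points
fi_b = inst_freq(1.5 * np.exp(2j * np.pi * f1 * t) + 1.0 * np.exp(2j * np.pi * f2 * t))
```

The extremes 0.17 and 0.29 follow from the closed-form instantaneous frequency of a two-tone signal evaluated where the beat term cos(2π(f₂ − f₁)t) equals +1 and −1, respectively.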
7.2 Time-frequency reassignment

The reassignment principle was introduced in 1976 by Kodera, de Villedary and
Gendrin, [20], but was not applied to a larger extent until Auger and Flandrin
reintroduced the method, [21]. The idea of reassignment is to keep the localization
of a single component by reassigning mass to the center of gravity, where the cross-terms
are reduced by smoothing of the Wigner distribution. The spectrogram using
the window h(t) is

    S_x(t, f) = |X(t, f)|²,

where

    X(t, f) = ∫ x(t₁) h*(t₁ − t) e^{−i2πft₁} dt₁,
Figure 7.1: Example of instantaneous frequency of a two-component signal; a) f₁ = 0.11,
f₂ = 0.2, A₁ = 0.5 and A₂ = 1; b) f₁ = 0.11, f₂ = 0.2, A₁ = 1.5 and A₂ = 1.
but could also be rewritten as the two-dimensional convolution between the Wigner
distribution and the time-frequency kernel of the window function,

    S_x(t, f) = ∫∫ W_x(u, v) W_h(t − u, f − v) du dv.   (7.14)
In this form, the spectrogram is easily seen as a weighted sum of all the Wigner
distribution values at the neighboring points. The mass at these points should be
assigned to the center of gravity, which is calculated as

    t̂_x(t, f) = (1 / S_x(t, f)) ∫∫ u W_x(u, v) W_h(t − u, f − v) du dv,   (7.15)

    f̂_x(t, f) = (1 / S_x(t, f)) ∫∫ v W_x(u, v) W_h(t − u, f − v) du dv,   (7.16)

giving

    RS^h_x(t, f) = ∫∫ S_x(u, v) δ(t − t̂_x(u, v)) δ(f − f̂_x(u, v)) du dv.   (7.17)
From the beginning, t̂ (local group delay) and f̂ (local instantaneous frequency)
have been related to the phase of the STFT as

    t̂_x(t, f) = −(1/2π) ∂φ(t, f)/∂f,   (7.18)

    f̂_x(t, f) = f + (1/2π) ∂φ(t, f)/∂t,   (7.19)

where

    φ(t, f) = arg X(t, f).

However, this does not lead to anything useful for discrete-time signals, and a much
more efficient implementation is possible as

    t̂_x(t, f) = t − ℜ{ X_{th}(t, f) / X_h(t, f) },   (7.20)

    f̂_x(t, f) = f + ℑ{ X_{dh/dt}(t, f) / (2π X_h(t, f)) },   (7.21)
using three different analysis windows, h(t), t·h(t) and dh(t)/dt. The spectrogram
kernel W_h(t, f) can be replaced by any other time-frequency kernel, and for any
other distribution belonging to Cohen's class, the reassignment process will lead
to perfectly localized single-component chirp signals, frequency tones and impulse
signals, [21]. For multi-component signals, the reassignment improves the readability,
as the cross-terms are reduced by the smoothing of the specific distribution and the
reassignment then squeezes the signal terms. An example is given in Figure 7.2.
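Eqs. (7.20)-(7.21) can be sketched in a few lines. The fragment below is my own illustration, not the course m-file reassignspectrogram; the signs in (7.20)-(7.21) depend on how the window enters the STFT, and this sketch uses the window argument t − t₁ (for a real symmetric window the spectrogram itself is unchanged). For a pure tone, the reassigned frequency should collapse onto the true frequency for every analyzed frequency near the ridge:

```python
import numpy as np

N, f0, n0 = 256, 0.2, 128
x = np.exp(2j * np.pi * f0 * np.arange(N))       # single complex tone

M, sigma = 40, 8.0
s = np.arange(-M, M + 1)
h = np.exp(-s**2 / (2 * sigma**2))               # Gaussian analysis window
th = s * h                                       # window t * h(t)
dh = -s / sigma**2 * h                           # window dh/dt for the Gaussian

frame = x[n0 - s]                                # window argument t - t1 = s
freqs = np.linspace(0.17, 0.23, 25)
F = np.exp(2j * np.pi * np.outer(freqs, s))      # STFT up to a common phase factor
Xh, Xth, Xdh = F @ (frame * h), F @ (frame * th), F @ (frame * dh)

t_hat = n0 - np.real(Xth / Xh)                   # Eq. (7.20): local group delay
f_hat = freqs + np.imag(Xdh / (2 * np.pi * Xh))  # Eq. (7.21): local inst. frequency
```

For the tone, f̂ equals f0 regardless of which frequency bin is analyzed, and t̂ stays at the analysis time, which is exactly the squeezing behavior described above; a full reassigned spectrogram repeats this at every (t, f) point and moves the spectrogram mass to (t̂, f̂).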
7.3 Exercises

7.1 Prove Theorem 1.

7.2 Calculate the analytic signal of

    x(t) = cos(2πf₁t) cos(2πf₂t),

where 0 ≤ f₁ ≤ f₂.

7.3 Calculate the instantaneous frequency of the sum of two complex sinusoids, i.e.,

    x(t) = x₁(t) + x₂(t) = A₁ e^{i2πf₁t} + A₂ e^{i2πf₂t}.
Figure 7.2: Spectrogram and reassigned spectrogram of a chirp signal using window
length 16.
C7.4 a) Use the m-file gausschirpdata to compute a Gaussian-windowed complex-valued
chirp signal, e.g., with N=256, Nk=256, Tvect=128, F0vect=0,
F1vect=0.5. Investigate the time-frequency distribution using different
methods. Which of the applied methods should actually, in theory, give
the instantaneous frequency of (an infinite-length) chirp signal?

b) Investigate the performance, by view, of the reassigned spectrogram
using the m-file reassignspectrogram on gausschirpdata. Use for example
the signal of C7.4a) and investigate for different window lengths, e.g.,
WIN=20 and WIN=250. How do the spectrogram and the reassigned
spectrogram change for these values? Especially study the direction of the
linear increase. Also investigate the resulting spectra when adding some
noise to the signal and try to put a level of the input parameter e, e.g., e = 0.1
or e = 1.

The following additional m-files could be used and modified for your own
purposes:

The m-file [X,T]=gausschirpdata(N,Nk,Tvect,F0vect,F1vect,Fs) generates
a complex-valued signal of total length N including a number of
Gaussian-windowed chirp components, each of length Nk. The centre time
values are specified in Tvect, the initial frequency values in F0vect and the
final frequency values in F1vect. A sample frequency Fs can be specified.

The m-file [SS,MSS]=reassignspectrogram(X,WIN,e,Fs,NFFT) computes
and plots the usual windowed spectrogram (SS) and the corresponding reassigned
spectrogram (MSS). A window, or a Hanning window length, could
be changed using the input variable WIN. The parameter e sets the level
below which no reassignment of spectrum values is made. It might speed up
the calculations and can also improve the result in a noisy situation.
Chapter 8

The uncertainty principle and Gabor expansion

The spectrogram and the STFT rely on the idea of Dennis Gabor, who in 1946
suggested representing a signal in two dimensions, time and frequency. He called the
two-dimensional areas information diagrams and suggested certain elementary signals
that occupy the smallest possible area in the information diagram, one quantum
of information. In 1971 he received the Nobel Prize for his discovery of the principles
underlying the science of holography. However, the Gabor expansion was not related
to the short-time Fourier transform until more recently, [52].
8.1 The uncertainty principle

A signal cannot be both time-limited and frequency-limited at the same time. This
is seen, e.g., in the example where a multiplication of any signal x(t) with a rectangle
function certainly will time-limit the signal,

    x_T(t) = x(t) · rect[t/T].

In the frequency domain, the result will be of infinite bandwidth, as the multiplication
is transferred to a convolution with an infinite sinc-function (sin x/x),

    X_T(f) = X(f) ∗ T sinc(πfT).

Similarly, a bandwidth limitation by a rectangle function in the frequency domain
will cause a convolution in the time domain with an infinite sinc-function.
We can measure the frequency bandwidth using the so-called effective bandwidth B_e,

    B_e² = ∫ f² |X(f)|² df / ∫ |X(f)|² df,   (8.1)

where the denominator is the total energy of the signal, and analogously the so-called
effective duration T_e is defined as

    T_e² = ∫ t² |x(t)|² dt / ∫ |x(t)|² dt.   (8.2)

In these definitions the signal as well as the spectrum are assumed to be located
symmetrically around t = 0 and f = 0. These expressions can be compared with
the variance of a stochastic variable, i.e., if |X(f)|² is interpreted as the probability
density function of the stochastic variable f, the denominator would be 1 and the
numerator would be the variance of f. Using these measures we can define the
bandwidth-duration product, B_e T_e, which for a Gaussian signal is

    B_e T_e = 1/(4π),

[39], and for all other signals

    B_e T_e > 1/(4π).

Another name for the bandwidth-duration product is the uncertainty principle,
which serves as a measure of the information richness in the signal. Gabor chose
the Gaussian signal as his elementary function in the Gabor expansion, [39], as this
function is the most concentrated both in time and frequency.
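The bound B_e T_e = 1/(4π) for the Gaussian is easy to check numerically from Eqs. (8.1)-(8.2). A sketch (grid and FFT scaling are my own choices; x(t) = e^{−πt²} is used because its Fourier transform is again e^{−πf²}):

```python
import numpy as np

dt = 0.01
t = np.arange(-20, 20, dt)
f = np.fft.fftshift(np.fft.fftfreq(len(t), dt))

def be_te(x):
    # Effective duration and bandwidth, Eqs. (8.1)-(8.2), as moment ratios;
    # the grid steps cancel in the ratios, so plain sums suffice.
    X = np.fft.fftshift(np.fft.fft(np.fft.ifftshift(x)))
    Te2 = np.sum(t**2 * np.abs(x)**2) / np.sum(np.abs(x)**2)
    Be2 = np.sum(f**2 * np.abs(X)**2) / np.sum(np.abs(X)**2)
    return np.sqrt(Te2 * Be2)

gauss_prod = be_te(np.exp(-np.pi * t**2))            # should be 1/(4 pi)
rect_prod = be_te((np.abs(t) <= 0.5).astype(float))  # rectangle: much larger
```

The rectangle illustrates the opening example of this section: its spectrum decays so slowly that the effective bandwidth, and hence the product, is far above the Gaussian bound.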
8.2 Gabor expansion

For a continuous signal x(t), the Gabor expansion is defined as

    x(t) = Σ_{m=−∞}^{∞} Σ_{k=−∞}^{∞} a_{m,k} g(t − mT₀) e^{i2πkF₀t},   (8.3)

where T₀ and F₀ denote the time and frequency sampling steps. The coefficients a_{m,k}
are called the Gabor coefficients. Gabor chose the Gaussian function

    g(t) = (α/π)^{1/4} e^{−(α/2)t²},
as the elementary function, or synthesis window, because it is optimally concentrated
in the joint time-frequency domain in terms of the uncertainty principle. However,
in practice the Gaussian window cannot be used as it does not possess the
property of compact support, and using the Gaussian window actually means using
a truncated version of this window. The related sampled STFT, also known as the
Gabor transform, is

    X(mT₀, kF₀) = ∫ x(t₁) w*(t₁ − mT₀) e^{−i2πkF₀t₁} dt₁,   (8.4)

where X(mT₀, kF₀) = a_{m,k} if the sampling distances T₀ and F₀ satisfy the critical
sampling relation F₀T₀ = 1. In this case there is a uniquely defined analysis
window w(t) given from the synthesis window g(t). This is seen if Eq. (8.4) is
substituted into Eq. (8.3) as
    x(t) = ∫ x(t₁) Σ_{m=−∞}^{∞} Σ_{k=−∞}^{∞} g(t − mT₀) w*(t₁ − mT₀) e^{i2π(kF₀t − kF₀t₁)} dt₁,

which is true if

    Σ_{m=−∞}^{∞} Σ_{k=−∞}^{∞} g(t − mT₀) w*(t₁ − mT₀) e^{i2π(kF₀t − kF₀t₁)} = δ(t − t₁).
Using the Poisson-sum formula and the biorthogonality, see [52], the double summation
is reduced to a single integration,

    ∫ g(t) w*(t − mT₀) e^{−i2πkF₀t} dt = δ_k δ_m,

where δ_k is the Kronecker delta, and the analysis window w(t) follows uniquely from
a given synthesis window. However, this window does not have attractive properties
from a time-frequency analysis perspective, as it is not well localized in time and
frequency. If instead an oversampled lattice is used, i.e., F₀T₀ < 1, the analysis
window is no longer unique, and a choice can be made from a time-frequency analysis
perspective.

For discrete-time signals, calculation of the analysis window is made using the Zak
transform, [53], and a mean square error solution gives an analysis window which is most
similar to the synthesis window, [54], see Figure 8.1. The Gaussian synthesis window
in Figure 8.1a) generates at critical sampling the analysis window of Figure 8.1b),
which is not attractive for generating a time-frequency distribution, but might be the
Figure 8.1: a) A Gaussian synthesis window g(t). The analysis window w(t) at: b)
critical sampling; c) oversampling by a factor 4; d) oversampling by a factor 16.
best for other applications, e.g., image analysis. For higher oversampling, Figure 8.1c)
and d), the analysis window becomes more similar to the synthesis window.

The Gabor expansion is often used in applications where analysis as well as reconstruction
of the signal is needed, but also for estimation of, e.g., linear time-variant
(LTV) filters, [55].
Chapter 9

Stochastic time-frequency analysis

Time-frequency analysis of non-stationary and time-varying signals has, in general,
been a field of research for a number of years. The general quadratic class of time-frequency
representation methods can be used for this purpose, and a huge number of
known time-frequency kernels can be found for different representations. However,
many of these are actually developed for deterministic signals (possibly with a disturbance)
and used for a more exploratory description of multicomponent signals. In
the recent 10-15 years, the attempts to find optimal estimation algorithms for non-stationary
processes have increased, [22]. A fundamental problem is the estimation
of a time-frequency spectrum from a single realization of the process. The IAF of
a non-stationary process is

    r_z(t, τ) = E[z(t + τ/2) z*(t − τ/2)],   (9.1)

with the estimate from one realization as

    r̂_z(t, τ) = z(t + τ/2) z*(t − τ/2).   (9.2)

The Wigner spectrum is defined as

    W_z(t, f) = ∫ r_z(t, τ) e^{−i2πfτ} dτ,   (9.3)

with an estimate found as

    Ŵ_z(t, f) = ∫ r̂_z(t, τ) e^{−i2πfτ} dτ.
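The single-realization estimate (9.2) is very noisy, but averaging it over independent realizations approaches the expectation in (9.1). The sketch below (my own construction: circular complex white noise with a Gaussian time modulation, periodic discrete IAF; an illustration, not the book's estimators) checks that the time marginal of the averaged Wigner-spectrum estimate recovers the modulation power g(n)²:

```python
import numpy as np

rng = np.random.default_rng(1)
N, n_real = 64, 4000
n = np.arange(N)
g = np.exp(-0.5 * ((n - 32) / 10) ** 2)      # time modulation, power q(t) = g(t)^2

ip = (n[:, None] + n[None, :]) % N           # periodic index n + m
im = (n[:, None] - n[None, :]) % N           # periodic index n - m
r_hat = np.zeros((N, N), dtype=complex)
for _ in range(n_real):
    e = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)
    z = g * e                                # one realization of the process
    r_hat += z[ip] * np.conj(z[im])          # single-realization IAF, Eq. (9.2)
r_hat /= n_real                              # Monte Carlo average of Eq. (9.1)

W_hat = np.fft.fft(r_hat, axis=1).real       # averaged Wigner-spectrum estimate
marg = W_hat.sum(axis=1) / N                 # time marginal, should approach g(n)^2
```

For white noise the averaged IAF concentrates at lag zero (the periodic wrap of the grid is a simplification), so the estimated Wigner spectrum is nearly flat in frequency with the time profile g(n)² — a simple special case of the locally stationary processes discussed in Section 9.1.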
9.1 Different definitions of non-stationary processes

The spectral decomposition theorem for a stationary process admits the Cramér
harmonic decomposition

    z(t) = ∫ e^{i2πf₁t} dZ(f₁).

This decomposition has double orthogonality, which means that

    ∫ e^{i2πf₁t} e^{−i2πf₂t} dt = δ(f₂ − f₁),

and that

    E[dZ(f₁) dZ*(f₂)] = δ(f₁ − f₂) S_z(f₁) df₁ df₂.
A zero-mean Gaussian non-stationary process is characterized by the covariance
function

    p_z(t₁, t₂) = E[z(t₁) z*(t₂)],

which is dual to the spectral distribution function P_z(f₁, f₂) by means of the Fourier-type
relation

    E[z(t₁) z*(t₂)] = ∫∫ P_z(f₁, f₂) e^{i2π(f₁t₁ − f₂t₂)} df₁ df₂.
The spectral increments of P_z(f₁, f₂) are not orthogonal, as is the case for a stationary
process, where the spectral distribution function reduces to P_z(f₁, f₂) =
δ(f₁ − f₂) S_z(f₁). However, for the existence of a decomposition, the requirement

∫∫ |P_z(f₁, f₂)| df₁ df₂ < ∞

has to be fulfilled. The corresponding process z(t) is called harmonizable, [56, 57].
If we want to maintain the double orthogonality of the decomposition, the so-called
Karhunen decomposition can be applied, where the eigenfunctions of the covariance
kernel are used, [22]. The covariance has the form

E[ z(t₁) z*(t₂) ] = ∫∫ φ(t₁, f₁) φ*(t₂, f₂) E[ dZ(f₁) dZ*(f₂) ]
                 = ∫ φ(t₁, f₁) φ*(t₂, f₁) S_z(f₁) df₁.
Using the orthogonality relation

∫ φ(t₁, f₁) φ*(t₁, f₂) dt₁ = δ(f₂ − f₁),

we find the Karhunen representation as

∫ p_z(t₁, t₂) φ(t₂, f₁) dt₂ = S_z(f₁) φ(t₁, f₁).   (9.4)
The advantage of the Karhunen representation is the orthogonality in time as well
as in frequency. However, the interpretation of the variable f₁ as frequency is no longer
obvious. The decomposition of the autocovariance leads to

E[ |z(t₁)|² ] = ∫ |φ(t₁, f₁)|² S_z(f₁) df₁,

which defines a time-dependent power spectrum as

W_z(t₁, f₁) = |φ(t₁, f₁)|² S_z(f₁).
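Numerically, the integral eigenproblem in Eq. (9.4) becomes an ordinary symmetric eigenproblem for the sampled covariance matrix. A sketch, where the covariance model is an arbitrary smooth illustration of ours (not a model from the text):

```python
import numpy as np

# sampled covariance p_z(t1, t2) on a uniform grid (illustrative model:
# a stationary Gaussian-shaped covariance modulated in time)
n = 64
t = np.arange(n)
t1, t2 = np.meshgrid(t, t, indexing="ij")
P = np.exp(-((t1 + t2) / 2 - n / 2) ** 2 / 200) * np.exp(-((t1 - t2) ** 2) / 20)

# Eq. (9.4): eigenvalues play the role of S_z(f1), eigenvector columns
# are the sampled eigenfunctions phi(t, f1)
S, Phi = np.linalg.eigh(P)
S, Phi = S[::-1], Phi[:, ::-1]   # sort with the dominant "spectrum" values first

# the columns of Phi are orthonormal, which is the discrete counterpart
# of the double-orthogonality property of the Karhunen decomposition
```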
As the non-stationarity property is very general, different restrictions have been applied.
One definition is the locally stationary process (LSP), [58], which has a
covariance of the form

p_z(t₁, t₂) = q((t₁ + t₂)/2) r(t₁ − t₂),   (9.5)

where q(t) is a non-negative function and r(t) positive definite. Studying a formulation
as a local symmetric covariance function, i.e.,

E[ z(t + τ/2) z*(t − τ/2) ] = q(t) r(τ),
we see that the LSP is the result of a modulation in time of a stationary covariance
function. When q(t) tends to a constant, the LSP becomes more stationary. Examples
of realizations of such processes are depicted in Figure 9.1 for different parameter
values.
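Realizations of this kind can be drawn by factorizing the sampled LSP covariance of Eq. (9.5). A sketch with illustrative Gaussian choices of ours for q and r (the widths below are not the c-parametrization of Figure 9.1):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
t = np.arange(n)
t1, t2 = np.meshgrid(t, t, indexing="ij")

# Eq. (9.5): q non-negative (modulation), r positive definite (stationary part)
q = np.exp(-0.5 * (((t1 + t2) / 2 - n / 2) / 40.0) ** 2)
r = np.exp(-0.5 * ((t1 - t2) / 5.0) ** 2)
C = q * r

# factor C = L L^T via its eigendecomposition, clipping tiny negative
# eigenvalues caused by finite sampling, then color white Gaussian noise
w, V = np.linalg.eigh(C)
L = V @ np.diag(np.sqrt(np.clip(w, 0.0, None)))
z = L @ rng.standard_normal(n)     # one real-valued LSP realization
```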
Another similar definition is the quasi-stationary process, which applies a deterministic
modulation directly to a stationary process, i.e.,

z(t) = d(t) w(t),
Figure 9.1: Examples of realizations of a locally stationary process for different
parameter values, c = 1.1, 1.5, 2, 4, 10 and 30.
where w(t) is a complex-valued weakly stationary process. The local covariance
function becomes

E[ z(t + τ/2) z*(t − τ/2) ] = d(t + τ/2) d*(t − τ/2) r_w(τ).
If the modulation is slow, i.e., the variation of d(t) is small in comparison with the
statistical memory of w(t), then d(t + τ/2) ≈ d(t) ≈ d(t − τ/2).
A third definition is connected to linear systems, where we can represent z(t) as
the output of a linear time-varying system H whose input is stationary white noise,
denoted n(t), with power spectral density S_n(f) = 1, [59],

z(t) = (Hn)(t) = ∫ h(t, t′) n(t′) dt′.
We note that if z(t) is stationary, H is time-invariant and the power spectral density of
z(t) is given by S_z(f) = |H(f)|², where H(f) = ∫ h(τ) e^{−i2πfτ} dτ. The system H is
called an innovations system of the process z(t), and the process is underspread if
components that are sufficiently distant from each other in the time-frequency plane
are effectively uncorrelated. Many papers relating time-frequency analysis to
linear time-varying systems can be found, e.g., [27].
Priestley, [60], suggested the oscillatory random process, which is a rather restricted
class of processes, as a linear combination of oscillatory processes is not
necessarily oscillatory. Here a basis function

φ(t, f) = A(t, f) e^{i2πft}

is applied, giving the process z(t) as

z(t) = ∫ φ(t, f) dZ(f).   (9.6)
It is worth noting that the basis functions are not necessarily orthogonal for the
Priestley decomposition. Remembering the Cramér decomposition of a stationary
process, z(t) = ∫ e^{i2πft} dZ(f), where the process is decomposed using the orthogonal
basis functions e^{i2πft}, we see that the Priestley decomposition is similar in the sense
that an amplitude modulation of each complex exponential is applied for the non-stationary
oscillatory process. If A(t, f) is slowly varying in time for each frequency,
the decomposition is close to orthogonal, as

∫ φ(t, f₁) φ*(t, f₂) dt = ∫ A(t, f₁) A*(t, f₂) e^{i2π(f₁ − f₂)t} dt ≈ δ(f₁ − f₂),
) 1. The socalled evolutionary spectrum is found as
S
z
(t, f) = [A(t, f)[
2
S
z
(f).
9.2 The mean square error optimal kernel

Many ambiguity kernels, optimized for different purposes, can be found, e.g., [23, 24,
25, 26], where different criteria and definitions of non-stationary processes have been
used. Sayeed and Jones [23] derived the optimal kernel in the mean square error
sense for Gaussian harmonizable processes by minimizing the integrated expected
squared error in the ambiguity domain,

J(Φ) = ∫∫ E[ |Â_z(ν, τ) Φ(ν, τ) − E[Â_z(ν, τ)]|² ] dν dτ,   (9.7)
resulting in the optimal ambiguity kernel

Φ_opt(ν, τ) = |E[Â_z(ν, τ)]|² / E[ |Â_z(ν, τ)|² ].   (9.8)
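The ratio in Eq. (9.8) can be approximated by Monte-Carlo averaging of ambiguity functions over realizations. A sketch, where the process model, the number of realizations and the `ambiguity` helper are illustrative choices of ours:

```python
import numpy as np

rng = np.random.default_rng(1)
n, nreal = 64, 200
t = np.arange(n)

def ambiguity(z):
    """Doppler-lag ambiguity function: FFT over time of the IAF estimate."""
    A = np.zeros((n, n), dtype=complex)       # rows: lag index m, cols: doppler
    for m in range(-(n // 2), n // 2):
        iaf = np.zeros(n, dtype=complex)
        ok = (t + m >= 0) & (t + m < n) & (t - m >= 0) & (t - m < n)
        iaf[ok] = z[t[ok] + m] * np.conj(z[t[ok] - m])
        A[m % n] = np.fft.fft(iaf)
    return A

num = np.zeros((n, n), dtype=complex)         # running estimate of E[A_hat]
den = np.zeros((n, n))                        # running estimate of E[|A_hat|^2]
for _ in range(nreal):
    # toy non-stationary process: white noise with a Gaussian envelope
    z = rng.standard_normal(n) * np.exp(-0.5 * ((t - n / 2) / 10.0) ** 2)
    A = ambiguity(z)
    num += A / nreal
    den += np.abs(A) ** 2 / nreal

Phi_opt = np.abs(num) ** 2 / np.maximum(den, 1e-12)   # Eq. (9.8)
```

By the Cauchy-Schwarz inequality the resulting kernel satisfies 0 ≤ Φ_opt ≤ 1, i.e., it acts as a doppler-lag shrinkage weight.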
The expected ambiguity function of a locally stationary process is separable (rank
one),

E[Â_z(ν, τ)] = A_z(ν, τ) = Q(ν) r(τ),

where Q(ν) = ∫ q(t) e^{−i2πνt} dt. Based on this, the optimal kernel for Gaussian
circularly symmetric processes was derived in [24] as

Φ_opt(ν, τ) = |Q(ν)|² |r(τ)|² / [ |Q(ν)|² |r(τ)|² + (∫ |r(t)|² e^{−i2πνt} dt)(q ∗ q)(τ) ].   (9.9)
The optimal kernel in the time-frequency domain is

φ_opt(t, f) = ∫∫ Φ_opt(ν, τ) e^{−i2π(fτ − νt)} dν dτ.

The transformation of the ambiguity kernel to the time-lag kernel φ(t, τ) (and thereby
to the time-frequency kernel as well) is not straightforward in the discrete-time,
discrete-frequency (DT-DF) case, [61, 62]. It has been shown that the Jeong-Williams
ambiguity domain estimator represents any covariance smoothing estimator,
but that this estimator cannot be used to compute the mean square error (MSE)
optimal covariance function estimator in the DT-DF case, in contrast to the time-continuous
case. Similarly, in [63], a DT-DF time-frequency matched filter is proposed.
9.3 Multitaper time-frequency analysis

The Slepian functions of the Thomson multitapers are well established in spectrum
analysis today, and are recognized to give orthonormal spectra (for white noise) as
well as to be the most localized tapers in the frequency domain. In time-frequency
analysis, the Hermite functions have been shown to give the best time-frequency
localization and orthonormality in the time-frequency domain. Recently, many
interesting methods where the orthonormal multitapers are based on Hermite functions
can be found, e.g., [25, 35, 64, 65, 66]. However, similarly as for the Slepian
functions, for spectra with peaks the cross-correlation between sub-spectra gives
degraded performance, [67].
It has been shown that the calculation of the two-dimensional convolution between
the kernel and the Wigner spectrum estimate of a process realization can be simplified
using kernel decomposition and calculating multitaper spectrograms, [29, 30]. The
time-lag estimation kernel is rotated and the corresponding eigenvectors and eigenvalues
are calculated. The estimate of the Wigner spectrum is given as the weighted
sum of the spectrograms of the data with the different eigenvectors as sliding windows
and the eigenvalues as weights, [31]. This can be seen if we use the quadratic class
definition based on the time-lag kernel,

W^Q_z(t, f) = ∫∫ z(u + τ/2) z*(u − τ/2) φ(t − u, τ) e^{−i2πfτ} du dτ.
Replacing u = (t₁ + t₂)/2 and τ = t₁ − t₂, we find

W^Q_z(t, f) = ∫∫ z(t₁) z*(t₂) φ(t − (t₁ + t₂)/2, t₁ − t₂) e^{−i2πf(t₁ − t₂)} dt₁ dt₂
            = ∫∫ z(t₁) z*(t₂) φ_rot(t − t₁, t − t₂) e^{−i2πft₁} e^{i2πft₂} dt₁ dt₂,   (9.10)

where

φ_rot(t₁, t₂) = φ((t₁ + t₂)/2, t₁ − t₂).
If the kernel φ_rot(t₁, t₂) satisfies the Hermitian property

φ_rot(t₁, t₂) = (φ_rot(t₂, t₁))*,

then solving the integral equation

∫ φ_rot(t₁, t₂) u(t₁) dt₁ = λ u(t₂)
results in eigenvalues λ_k and eigenfunctions u_k which form a complete set. The kernel
can be expressed as

φ_rot(t₁, t₂) = Σ_{k=1}^{∞} λ_k u*_k(t₁) u_k(t₂).
Using the eigenvalues and eigenvectors, Eq. (9.10) is rewritten as a weighted sum of
spectrograms,

W^Q_z(t, f) = Σ_{k=1}^{∞} λ_k ∫∫ z(t₁) z*(t₂) e^{−i2πft₁} e^{i2πft₂} u*_k(t − t₁) u_k(t − t₂) dt₁ dt₂
            = Σ_{k=1}^{∞} λ_k | ∫ z(t₁) e^{−i2πft₁} u*_k(t − t₁) dt₁ |².
Depending on the different λ_k, the number of spectrograms that are averaged could
be just a few or an infinite number. With just a few λ_k that differ from zero, the multiple
window spectrogram solution is effective from an implementation point of view.
Different approaches to approximate a time-varying spectrum with a few windowed
spectrograms have been taken, e.g., [35, 68, 32, 69, 25, 36, 70].
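The eigen-decomposition route above can be sketched numerically: sample a rotated kernel, take its eigenvectors as sliding windows and its eigenvalues as weights. The Gaussian kernel below is only an illustration of ours (chosen positive semi-definite by the Schur product theorem, so all weights are non-negative), not a kernel from the text:

```python
import numpy as np

m = 33                                   # window support (odd number of samples)
s = np.arange(m) - m // 2
s1, s2 = np.meshgrid(s, s, indexing="ij")

# illustrative Hermitian rotated kernel: Hadamard product of a rank-one
# taper term and a Gaussian covariance (both PSD, so K is PSD)
g = np.exp(-0.5 * (s / 6.0) ** 2)
K = np.outer(g, g) * np.exp(-((s1 - s2) ** 2) / 200.0)

lam, U = np.linalg.eigh(K)
lam, U = lam[::-1], U[:, ::-1]           # dominant eigenvalues first
lam, U = lam[:4], U[:, :4]               # keep only a few terms

def multitaper_wigner(z, lam, U, nfft=128):
    """Weighted sum of spectrograms with the eigenvectors as windows."""
    n = len(z)
    S = np.zeros((n, nfft))
    for k in range(len(lam)):
        win = U[:, k]
        for t0 in range(n):
            lo, hi = max(0, t0 - m // 2), min(n, t0 + m // 2 + 1)
            seg = np.zeros(m, dtype=complex)
            seg[lo - (t0 - m // 2):hi - (t0 - m // 2)] = z[lo:hi]
            S[t0] += lam[k] * np.abs(np.fft.fft(seg * win, nfft)) ** 2
    return S

z = np.exp(1j * 2 * np.pi * 0.2 * np.arange(64))   # test tone at f = 0.2
S = multitaper_wigner(z, lam, U)
```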
9.4 Cross-term reduction using a penalty function

In [71], penalty functions are designed and used in the computation of multitapers
corresponding to the Wigner and Choi-Williams distributions. The resulting multitaper
spectrograms will approximately fulfill the concentrations of the distributions
but will also suppress the cross-terms outside a predetermined doppler-lag bandwidth.
The amount of cross-term suppression is determined by a parameter of the penalty
function. The multitapers and weighting factors are computed as the generalized
eigenvectors and eigenvalues from the corresponding time-lag kernel and a penalty
function, controlling the time-frequency resolution and suppression of cross-terms,
∫ φ_rot(t₁, t₂) q(t₁) dt₁ = λ ∫ φ_P^rot(t₁, t₂) q(t₁) dt₁,
where the time-lag kernel of the Wigner distribution is defined as

φ(t, τ) = δ(t),

and the time-lag kernel of the Choi-Williams (CW) distribution is

φ(t, τ) = (√(πσ)/|τ|) e^{−π²σt²/τ²},
proposed in [72], Chapter 6, page 271. The penalty function is

φ_P^rot(t₁, t₂) = ρ_P(t₁ − t₂)   if |t₁| ≥ Δ/2 and |t₂| ≥ Δ/2,
                  δ(t₁ − t₂)     otherwise,

where the corresponding spectral density of ρ_P(τ) is

S_ρP(f) = P   if |f| ≥ B/2,
          1   if |f| < B/2.
Some examples are shown in Figure 9.2. We compare the usual Wigner distribution
and the Choi-Williams distribution for two different parameter values of σ, Figure 9.2a),
c) and e), with their low-rank multitaper approximations using the novel technique,
where cross-terms at a distance of more than 96 samples and 0.12 in frequency are
suppressed by 10 dB, Figure 9.2b), d) and f). It is clearly seen that the
cross-terms between the two Gaussian components, located at different frequencies
around sample 600, are still present around each of the components in the frequency
intervals 0.05 ± 0.06 and 0.2 ± 0.06. The cross-terms lying outside the predefined
time-frequency resolution are reduced, while the single-component concentration is
still high. It was also shown that the new method was less sensitive to white noise
disturbances. This is a general technique that is applicable to other kernels.
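After sampling, the generalized eigenproblem above maps onto a symmetric-definite matrix problem; `scipy.linalg.eigh(A, B)` solves it directly, and for a diagonal penalty matrix it reduces to an ordinary eigenproblem. A sketch with toy stand-ins of ours for the sampled kernels (not the actual Wigner or CW kernels, and the band limits are illustrative):

```python
import numpy as np

m = 33
s = np.arange(m) - m // 2

# toy stand-ins: A plays the role of the sampled rotated estimation kernel,
# B the penalty kernel (unit weight inside |t| < half, heavier weight P
# outside, penalizing window energy far from the analysis point)
g = np.exp(-0.5 * (s / 8.0) ** 2)
A = np.outer(g, g) * np.exp(-((s[:, None] - s[None, :]) ** 2) / 100.0)
P, half = 10.0, 8
d = np.where(np.abs(s) >= half, P, 1.0)   # diagonal of the penalty matrix
B = np.diag(d)

# generalized problem A q = lambda B q; with diagonal B, substitute
# q = D^{-1/2} v and solve an ordinary symmetric eigenproblem
As = A / np.sqrt(np.outer(d, d))
lam, V = np.linalg.eigh(As)
Q = V / np.sqrt(d)[:, None]               # columns: multitapers (B-orthonormal)
lam, Q = lam[::-1], Q[:, ::-1]            # largest weights first
```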
Figure 9.2: Time-frequency analysis using different algorithms for a signal consisting
of several different components: a) Wigner distribution, b) WIG-P, c) Choi-Williams
distribution, σ = 10, d) CW-P, σ = 10, e) Choi-Williams distribution, σ = 0.1, f)
CW-P, σ = 0.1.
Bibliography

[1] E. A. Robinson. A historical perspective of spectrum estimation. Proc. of the IEEE, 70(9):885–907, Sept 1982.
[2] A. Schuster. On the investigation of hidden periodicities with application to a supposed 26-day period of meterological phenomena. Terr. Magnet., 3:13–41, 1898.
[3] J. W. Cooley and J. W. Tukey. An algorithm for the machine calculation of Fourier series. Math. Comput., 19:297–301, 1965.
[4] J. W. Cooley, P. A. W. Lewis, and P. D. Welch. Historical notes on the fast Fourier transform. IEEE Trans. Audio Electroacoust., AU-15:76–79, 1967.
[5] J. W. Cooley, P. Lewis, and P. D. Welch. The fast Fourier transform and its applications. IEEE Trans. Education, E-12:27–34, March 1969.
[6] D. R. Brillinger. John W. Tukey's work on time series and spectrum analysis. The Annals of Statistics, 30(6):1595–1618, Dec 2002.
[7] P. Stoica and R. Moses. Introduction to Spectral Analysis. Prentice Hall, 1997.
[8] B. Rust and D. Donelly. The Fast Fourier Transform for experimentalists, part IV: Autoregressive spectral analysis. Computing in Science and Engineering, 7(6):85–90, 2005.
[9] B. Rust and D. Donelly. The Fast Fourier Transform for experimentalists, part III: Classical spectral analysis. Computing in Science and Engineering, 7(5):74–78, 2005.
[10] B. G. Quinn. Frequency estimation using tapered data. In Proc. of ICASSP, pages III-73–III-76. IEEE, 2006.
[11] E. P. Wigner. On the quantum correction for thermodynamic equilibrium. Physics Review, 40:749–759, June 1932.
[12] S. M. Sussman. Least-squares synthesis of radar ambiguity functions. IRE Trans. Information Theory, 8:246–254, April 1962.
[13] J. Ville. Theorie et applications de la notion de signal analytique. Cables et Transmissions, 2A:61–74, 1948.
[14] J. E. Moyal. Quantum mechanics as a statistical theory. Proc. Camb. Phil. Soc., 45:99–124, 1949.
[15] A. W. Rihaczek. Signal energy distribution in time and frequency. IEEE Trans. Information Theory, 14:369–374, May 1968.
[16] C. H. Page. Instantaneous power spectra. J. of Applied Physics, 23:103–106, January 1952.
[17] H. Margenau and R. N. Hill. Correlation between measurements in quantum theory. Progress of Theoretical Physics, 26:722–738, 1961.
[18] L. Cohen. Generalized phase-space distribution functions. Jour. Math. Phys., 7:781–786, 1966.
[19] T. A. C. M. Claasen and W. F. G. Mecklenbrauker. The Wigner distribution - a tool for time-frequency signal analysis - part III: Relations with other time-frequency signal analysis. Philips Jour. Research, 35:372–389, 1980.
[20] K. Kodera, C. de Villedary, and R. Gendrin. A new method for the numerical analysis of nonstationary signals. Physics of the Earth & Planetary Interiors, 12:142–150, 1976.
[21] F. Auger and P. Flandrin. Improving the readability of time-frequency and time-scale representations by the reassignment method. IEEE Trans. on Signal Processing, 43:1068–1089, May 1995.
[22] P. Flandrin. Time-Frequency/Time-Scale Analysis. Academic Press, 1999.
[23] A. M. Sayeed and D. L. Jones. Optimal kernels for nonstationary spectral estimation. IEEE Transactions on Signal Processing, 43(2):478–491, Feb. 1995.
[24] P. Wahlberg and M. Hansson. Kernels and multiple windows for estimation of the Wigner-Ville spectrum of Gaussian locally stationary processes. IEEE Trans. on Signal Processing, 55(1):73–87, January 2007.
[25] S. Aviyente and W. J. Williams. Multitaper marginal time-frequency distributions. Signal Processing, 86:279–295, 2006.
[26] I. Djurovic, V. Katkovnik, and L. Stankovic. Median filter based realizations of the robust time-frequency distributions. Signal Processing, 81(8):1771–1776, 2001.
[27] G. Matz and F. Hlawatsch. Nonstationary spectral analysis based on time-frequency operator symbols and underspread approximations. IEEE Trans. on Information Theory, 52(3):1067–1086, March 2006.
[28] D. J. Thomson. Spectrum estimation and harmonic analysis. Proc. of the IEEE, 70(9):1055–1096, Sept 1982.
[29] G. S. Cunningham and W. J. Williams. Kernel decomposition of time-frequency distributions. IEEE Trans. on Signal Processing, 42:1425–1442, June 1994.
[30] M. G. Amin. Spectral decomposition of time-frequency distribution kernels. IEEE Trans. on Signal Processing, 42(5):1156–1165, May 1994.
[31] L. Cohen. Time-Frequency Analysis. Prentice-Hall, 1995.
[32] M. Bayram and R. G. Baraniuk. Multiple window time-frequency analysis. In Proc. of Int. Symposium of Time-Frequency and Time-Scale Analysis, pages 511–514, June 1996.
[33] S. Aviyente and W. J. Williams. Multitaper reduced interference distribution. In Proc. of the Tenth IEEE Workshop on Statistical Signal and Array Processing, pages 569–573. IEEE, 2000.
[34] W. J. Williams and S. Aviyente. Spectrogram decompositions of time-frequency distributions. In Proc. of the ISSPA, pages 587–590. IEEE, 2001.
[35] F. Cakrak and P. J. Loughlin. Multiple window time-varying spectral analysis. IEEE Transactions on Signal Processing, 49(2):448–453, 2001.
[36] M. Hansson. Multiple window decomposition of time-frequency kernels using a penalty function for suppressed sidelobes. In Proc. of the ICASSP, Toulouse, France, May 14-19 2006. IEEE.
[37] D. L. Jones and T. W. Parks. A high-resolution data-adaptive time-frequency representation. IEEE Trans. Acoustics, Speech, and Signal Processing, 38:2127–2135, 1990.
[38] R. N. Czerwinski and D. J. Jones. Adaptive short-time Fourier analysis. IEEE Signal Processing Letters, 4(2):42–45, Feb. 1997.
[39] D. Gabor. Theory of communications. Journal of the IEE, 93:429–457, 1946.
[40] S. Kadambe and G. F. Boudreaux-Bartels. A comparison of the existence of cross terms in the Wigner distribution and the squared magnitude of the wavelet transform and the short time Fourier transform. IEEE Trans. on Signal Processing, 40(10):2498–2517, Oct. 1992.
[41] P. J. Loughlin, J. W. Pitton, and L. E. Atlas. Bilinear time-frequency representations: new insights and properties. IEEE Transactions on Signal Processing, 41(2):750–767, 1993.
[42] H.-I. Choi and W. J. Williams. Improved time-frequency representation of multicomponent signals using exponential kernels. IEEE Trans. Acoustics, Speech and Signal Processing, 37:862–871, June 1989.
[43] J. Jeong and W. Williams. Kernel design for reduced interference distributions. IEEE Transactions on Signal Processing, 40:402–412, 1992.
[44] R. G. Baraniuk, P. Flandrin, A. J. E. M. Janssen, and O. J. J. Michel. Measuring time-frequency information content using Renyi entropies. IEEE Trans. on Information Theory, 47(4):1391–1409, May 2001.
[45] D. L. Jones and R. G. Baraniuk. An adaptive optimal-kernel time-frequency representation. IEEE Transactions on Signal Processing, 43(10):2361–2371, 1995.
[46] S. Aviyente and W. J. Williams. Minimum entropy time-frequency distributions. IEEE Signal Processing Letters, 12:37–40, January 2005.
[47] J. G. Kirkwood. Quantum statistics of almost classical ensembles. Physics Review, 44:31–37, 1933.
[48] M. J. Levin. Instantaneous spectra and ambiguity functions. IEEE Trans. Information Theory, 10:95–97, 1964.
[49] M. K. Emresoy and A. El-Jaroudi. Iterative instantaneous frequency estimation and adaptive matched spectrogram. Signal Processing, 64:157–165, January 1998.
[50] L. Stankovic and V. Katkovnik. The Wigner distribution of noisy signals with adaptive time-frequency varying windows. IEEE Trans. on Signal Processing, 47:1099–1108, April 1999.
[51] B. Barkat and B. Boashash. A high-resolution quadratic time-frequency distribution for multi-component signals. IEEE Trans. on Signal Processing, 49:2232–2239, October 2001.
[52] M. J. Bastiaans. Gabor's expansion of a signal into Gaussian elementary signals. Proc. of the IEEE, 68(4):538–539, April 1980.
[53] A. J. E. M. Janssen. The Zak transform: a signal transform for sampled time-continuous signals. Philips J. Res., 43:23–69, 1988.
[54] J. Wexler and S. Raz. Discrete Gabor expansions. Signal Processing, 21(3):207–221, Nov 1990.
[55] X.-G. Xia. System identification using chirp signals and time-variant filters in the joint time-frequency domain. IEEE Trans. on Signal Processing, 45(8):2072–2084, August 1997.
[56] W. Mark. Spectral analysis of the convolution and filtering of nonstationary stochastic processes. J. Sound Vibr., 11(1):19–63, 1970.
[57] W. Martin. Time-frequency analysis of random signals. In Proc. ICASSP, pages 1325–1328, 1982.
[58] R. A. Silverman. Locally stationary random processes. IRE Trans. Inf. Theory, 3:182–187, 1957.
[59] H. Cramer. On some classes of nonstationary stochastic processes. In Proc. 4th Berkeley Sympos. Math. Stat. Prob., pages 57–78. University of California Press, 1961.
[60] M. B. Priestley. Evolutionary spectra and non-stationary processes. J. Roy. Statistics Soc. ser. B, 27:204–229, 1965.
[61] J. Sandberg and M. Hansson-Sandsten. A comparison between different discrete ambiguity domain definitions in stochastic time-frequency analysis. IEEE Trans. on Signal Processing, 57(3):868–877, 2009.
[62] J. Sandberg and M. Hansson-Sandsten. Optimal stochastic discrete time-frequency analysis in the ambiguity and time-lag domain. Signal Processing, 90(7):2203–2211, 2010.
[63] J. M. O'Toole, M. Mesbah, and B. Boashash. Accurate and efficient implementation of the time-frequency matched filter. IET Signal Processing, 4(4):428–437, 2010.
[64] J. Xiao and P. Flandrin. Multitaper time-frequency reassignment for nonstationary spectrum estimation and chirp enhancement. IEEE Transactions on Signal Processing, 55(6):2851–2860, 2007.
[65] I. Orovic, S. Stankovic, T. Thayaparan, and L. J. Stankovic. Multiwindow S-method for instantaneous frequency estimation and its application in radar signal analysis. IET Signal Processing, 4(4):363–370, 2010.
[66] M. Hansson-Sandsten. Optimal estimation of the time-varying spectrum of a class of locally stationary processes using Hermite functions. EURASIP Journal on Advances in Signal Processing, 2011. Article ID 980805.
[67] M. Hansson-Sandsten and J. Sandberg. Optimization of weighting factors for multiple window spectrogram of event related potentials. EURASIP Journal on Advances in Signal Processing, 2010. Article ID 391798.
[68] L. L. Scharf and B. Friedlander. Toeplitz and Hankel kernels for estimating time-varying spectra of discrete-time random processes. IEEE Transactions on Signal Processing, 49(1):179–189, 2001.
[69] S. Aviyente and W. J. Williams. A centrosymmetric kernel decomposition for time-frequency distribution computation. IEEE Transactions on Signal Processing, 52(6):1574–1584, 2004.
[70] M. Hansson-Sandsten. Multitaper Wigner distribution with predetermined doppler-lag bandwidth and sidelobe suppression. In European Signal Processing Conference (EUSIPCO), Aalborg, Denmark, 2010.
[71] M. Hansson-Sandsten. Multitaper Wigner and Choi-Williams distributions with predetermined doppler-lag bandwidth and sidelobe suppression. Signal Processing, 91(6):1457–1465, 2011.
[72] B. Boashash, editor. Time Frequency Signal Analysis and Processing. Elsevier, 1st edition, 2003.