Linear Time Variant

Dissertation
A TIME-FREQUENCY CALCULUS FOR TIME-VARYING SYSTEMS AND NONSTATIONARY PROCESSES WITH APPLICATIONS
Gerald Matz
(g.matz@ieee.org)
Institute of Communications and Radio-Frequency Engineering Vienna University of Technology
This dissertation is available online at http://www.nt.tuwien.ac.at/dspgroup/tfgroup/doc/psles/GM-phd.ps.gz
INSTITUT FÜR NACHRICHTENTECHNIK UND HOCHFREQUENZTECHNIK
submitted in fulfillment of the requirements for the academic degree of Doctor of Technical Sciences
under the supervision of Ao. Univ.-Prof. Dipl.-Ing. Dr. Franz Hlawatsch, Institut für Nachrichtentechnik und Hochfrequenztechnik
submitted to the Technische Universität Wien, Faculty of Electrical Engineering
by Gerald Matz, Servitengasse 13/9, 1090 Wien
Vienna, November 2000
This thesis was reviewed by:
1. Ao. Univ.-Prof. Dipl.-Ing. Dr. F. Hlawatsch, Institut für Nachrichtentechnik und Hochfrequenztechnik, Technische Universität Wien
2. O. Univ.-Prof. Dipl.-Ing. Dr. W. Mecklenbräuker, Institut für Nachrichtentechnik und Hochfrequenztechnik, Technische Universität Wien
To Petra
Abstract
This thesis introduces an approximate time-frequency calculus for underspread linear time-varying systems (i.e., time-varying systems that effect only small time-frequency shifts of the input signal) and underspread nonstationary random processes (i.e., nonstationary processes that feature only small time-frequency correlations).

After briefly describing the major difficulties encountered with time-varying systems and nonstationary processes, we introduce an extended definition of underspread systems. Our extended underspread concept is based on weighted integrals and moments of the system's generalized spreading function. Subsequently, numerous approximations are presented which show that in the case of underspread systems the generalized Weyl symbol constitutes an approximate time-frequency transfer function. As a mathematical underpinning of our transfer function approximations, we provide bounds on the associated approximation errors that involve the previously defined weighted integrals and moments of the generalized spreading function.

We then consider nonstationary random processes and provide an extended definition of underspread processes. This extended underspread concept is based on weighted integrals and moments of the generalized expected ambiguity function of the process. Subsequently, two fundamental classes of time-varying power spectra are introduced and analyzed: type I spectra that extend the generalized Wigner-Ville spectrum and type II spectra that extend the generalized evolutionary spectrum. We show that in the case of underspread processes, the various members of these two classes of spectra are approximately equivalent to each other and (at least) approximately satisfy several desirable properties. Our approximations are again supported by bounds on the associated approximation errors. These bounds are formulated in terms of the previously defined weighted integrals and moments of the generalized expected ambiguity function. The definition and analysis of time-frequency coherence functions concludes our discussion of time-varying power spectra.

Finally, we illustrate the practical relevance of our theoretical findings by considering several applications in the areas of statistical signal processing and wireless communications. These applications include nonstationary signal estimation and detection, the sounding of mobile radio channels, multicarrier communications over time-varying channels, and the analysis of car engine signals.
Kurzfassung (Summary)
This dissertation describes a time-frequency calculus for linear time-varying systems that cause only small time-frequency shifts (underspread systems) and for nonstationary random processes with weak time-frequency correlations (underspread processes).

After a brief account of the problems that arise with time-varying systems and nonstationary processes, we present an extended concept of underspread systems. It is based on weighted integrals and moments of the system's generalized spreading function. Subsequently, numerous approximations are formulated which show that, for the class of underspread systems, the generalized Weyl symbol can approximately be interpreted and used as a time-frequency transfer function. These approximations are mathematically underpinned by upper bounds on the associated approximation errors, with the bounds formulated in terms of the previously defined weighted integrals and moments.

We then consider nonstationary random processes and give an extended definition of underspread processes. It is based on a global characterization of the time-frequency correlation of the process by means of weighted integrals and moments of the expectation of the generalized ambiguity function. Subsequently, two classes of time-varying spectra are presented and analyzed: type I spectra (an extension of the generalized Wigner-Ville spectrum) and type II spectra (an extension of the generalized evolutionary spectrum). We show that, in the case of underspread processes, the various spectra of both classes are approximately equivalent and satisfy certain desirable properties (at least) approximately. Again, all approximations are supported by upper bounds on the corresponding approximation errors. The discussion of time-varying spectra concludes with the definition and analysis of time-frequency coherence functions.

Finally, we illustrate the application of our theoretical results to certain areas of statistical signal processing and mobile communications. These areas include nonstationary signal estimation and detection, the measurement of mobile radio channels, multicarrier transmission schemes for time-varying channels, and the analysis of nonstationary engine signals.
Acknowledgements
It is my pleasure to express my sincere thanks to several people who have contributed to this thesis in various ways.

I am particularly indebted to F. Hlawatsch who continually furthered my personal and professional development. His constant advice and thorough proofreading resulted in countless useful suggestions that helped a lot to improve this thesis with regard to both technical content and presentation.

W. Mecklenbräuker kindly agreed to act as a referee and pointed me to generalized Chebyshev inequalities. His support and continual interest in the progress of this work are gratefully acknowledged.

W. Kozek pioneered the theory of underspread systems and processes. His influence on this thesis and my research in general is sincerely appreciated.

I am grateful to J. F. Böhme, S. Carstens-Behrens, and M. Wagner for introducing me to the problems of car engine diagnosis and providing me with the car engine data used in Chapter 4.

I am indebted to A. Molisch for several enlightening discussions in which he generously shared with me his expertise on mobile radio.

Finally and most importantly, I have been permanently backed up by my wife Petra. Her sympathy, encouragement, love, and support have been vital for the completion of this thesis. I owe her more than anyone else.
Contents

1 Introduction
  1.1 General Remarks
  1.2 Review of Time-Invariant/Stationary Theory
    1.2.1 Transfer Functions of Time-Invariant and Frequency-Invariant Linear Systems
    1.2.2 Power Densities of Stationary and White Processes
  1.3 Time-Varying Systems and Nonstationary Random Processes
    1.3.1 Time-Varying Systems and the Generalized Weyl Symbol
    1.3.2 Nonstationary Processes and Time-Varying Power Spectra
  1.4 The Importance of Being Underspread
    1.4.1 Underspread Linear Time-Varying Systems
    1.4.2 Underspread Nonstationary Processes
  1.5 Signal Processing Applications
  1.6 Related Work
  1.7 Overview of Contributions

2 Underspread Systems
  2.1 Operators with Compactly Supported Spreading Function
    2.1.1 General Support Constraints
    2.1.2 Definition of Displacement-limited Underspread Operators
    2.1.3 Unitary Transformations
    2.1.4 Operator Sums, Adjoints, Products, and Inverses
  2.2 Operators with Rapidly Decaying Spreading Function
    2.2.1 Motivation
    2.2.2 Weighted Integrals and Moments of the Generalized Spreading Function
    2.2.3 Unitary Transformations
    2.2.4 Underspread Systems
    2.2.5 Operator Sums, Adjoints, Products, and Inverses
    2.2.6 Non-Band-Limited Parts of Operators with Rapidly Decaying Spreading Function
  2.3 Underspread Approximations
    2.3.1 Approximate Uniqueness of the Generalized Weyl Symbol
    2.3.2 The Generalized Weyl Symbol of Operator Adjoints
    2.3.3 Approximate Real-Valuedness of the Generalized Weyl Symbol
    2.3.4 Composition of Systems
    2.3.5 Composition of H with H+
    2.3.6 Operator Inversion Based on the Generalized Weyl Symbol, Part I
    2.3.7 Operator Inversion Based on the Generalized Weyl Symbol, Part II
    2.3.8 Approximate Eigenvalues and Eigenfunctions
    2.3.9 Approximate Diagonalization
    2.3.10 Input-Output Relation for Deterministic Signals Based on the Generalized Weyl Symbol
    2.3.11 (Multi-Window) STFT Filter Approximation of Time-Varying Systems
    2.3.12 Infimum and Supremum of the Weyl Symbol
    2.3.13 Approximate Non-Negativity of the Generalized Weyl Symbol of Positive Semi-Definite Operators
    2.3.14 Boundedness of the Generalized Weyl Symbol of Self-Adjoint Operators
    2.3.15 Maximum System Gain (Operator Norm)
    2.3.16 Approximate Commutation
    2.3.17 Approximate Normality
    2.3.18 Sampling of the Generalized Weyl Symbol of Underspread Operators

3 Underspread Processes
  3.1 Time-Frequency Correlation Analysis
    3.1.1 Motivation
    3.1.2 Time-Frequency Correlation Functions
    3.1.3 The Expected Ambiguity Function
    3.1.4 Extended Concept of Underspread Processes
    3.1.5 Innovations System
    3.1.6 An Example
  3.2 Elementary Time-Varying Power Spectra
    3.2.1 Generalized Wigner-Ville Spectrum
    3.2.2 Generalized Evolutionary Spectrum
  3.3 Type I Time-Varying Power Spectra
    3.3.1 Definition and Formulations
    3.3.2 Examples
    3.3.3 Approximate Equivalence
    3.3.4 Properties
  3.4 Type II Time-Varying Power Spectra
    3.4.1 Definition and Formulations
    3.4.2 Examples
    3.4.3 Approximate Equivalence
    3.4.4 Properties
  3.5 Equivalence of Time-Varying Power Spectra
    3.5.1 Equivalence of Generalized Wigner-Ville Spectrum and Generalized Evolutionary Spectrum
    3.5.2 Equivalence of Type I and Type II Spectra
  3.6 Input-Output Relations for Nonstationary Random Processes
    3.6.1 Input-Output Relation Based on the Generalized Wigner-Ville Spectrum
    3.6.2 Input-Output Relation Based on the Generalized Evolutionary Spectrum
  3.7 Approximate Karhunen-Loève Expansion
  3.8 Time-Frequency Coherence
    3.8.1 Spectral Coherence and Coherence Operator
    3.8.2 Time-Frequency Formulation of the Coherence Operator
    3.8.3 The Generalized Time-Frequency Coherence Function

4 Applications
  4.1 Nonstationary Signal Estimation
    4.1.1 Time-Varying Wiener Filter
    4.1.2 Time-Frequency Formulation of the Time-Varying Wiener Filter
    4.1.3 Time-Frequency Filter Design
    4.1.4 Simulation Results
  4.2 Nonstationary Signal Detection
    4.2.1 Optimal Detectors
    4.2.2 Time-Frequency Formulation of Optimal Detectors
    4.2.3 Time-Frequency Detector Design
    4.2.4 Simulation Results
  4.3 Sounding of Mobile Radio Channels
    4.3.1 Channel Sounder Model
    4.3.2 Analysis of Measurement Errors
    4.3.3 Optimization of PN Sequence Length
    4.3.4 Simulation Results
  4.4 Multicarrier Communication Systems
    4.4.1 Pulse-Shaping OFDM and BFDM Systems
    4.4.2 Approximate Input-Output Relation for OFDM/BFDM Systems
    4.4.3 Simulation Results
  4.5 Analysis of Car Engine Signals
    4.5.1 Time-Varying Spectral Analysis
    4.5.2 TF Coherence Analysis
    4.5.3 Subspace Identification

5 Conclusions
  5.1 Summary of Novel Contributions
  5.2 Open Problems for Future Research

A Linear Operator Theory
  A.1 Basic Facts about Linear Operators
  A.2 Kernel Representation of Operators
  A.3 Eigenvalue Decomposition and Singular Value Decomposition
  A.4 Special Types of Linear Operators

B Time-Frequency Analysis Tools
  B.1 Time-Frequency Representations of Linear, Time-Varying Systems
    B.1.1 Generalized Spreading Function
    B.1.2 Generalized Weyl Symbol
    B.1.3 Generalized Transfer Wigner Distribution and Generalized Input and Output Wigner Distribution
  B.2 Time-Frequency Signal Representations
    B.2.1 Short-Time Fourier Transform
    B.2.2 Generalized Wigner Distribution
    B.2.3 Spectrogram
    B.2.4 Generalized Ambiguity Function
  B.3 Time-Frequency Representations of Random Processes
    B.3.1 Generalized Wigner-Ville Spectrum
    B.3.2 Generalized Evolutionary Spectrum
    B.3.3 Physical Spectrum
    B.3.4 Generalized Expected Ambiguity Function

C The Symplectic Group and Metaplectic Operators
  C.1 The Symplectic Group
  C.2 Metaplectic Operators
  C.3 Effects on Time-Frequency Representations

Bibliography
List of Abbreviations
1 Introduction

"[. . .] there is no doubt that linear systems will continue to be an object of study for as long as one can foresee."
Thomas Kailath

LINEAR systems and random processes are the foundations of numerous signal modelling, analysis, and processing schemes. In a formal sense, the characterization of linear systems and the second-order description of random processes can be based on the same mathematics: linear operator theory. This viewpoint will be emphasized throughout this thesis.

A fundamental distinction must be made between time-invariant systems and stationary processes on the one hand and time-varying systems and nonstationary processes on the other. The majority of books and research papers restrict themselves to time-invariant systems and stationary processes, both of which can be treated in an efficient and intuitively appealing manner using Fourier analysis. In contrast, time-varying systems and nonstationary processes have received much less attention.

In this introductory chapter, we review some basic facts of the theories of time-invariant linear systems and stationary processes (Section 1.2). Further, in Section 1.3 we discuss the fundamental problems encountered when dealing with time-varying/nonstationary scenarios. Section 1.4 outlines the central results of this thesis, which consist in the (approximate) solution of these basic problems by means of an approximate time-frequency calculus of time-varying transfer functions and time-varying power spectra. This time-frequency calculus is valid for the practically important classes of underspread systems and underspread processes and it has the advantage of being conceptually simple, computationally attractive, and physically intuitive. The chapter continues with an outline of signal processing applications of our results in Section 1.5, a brief account of related work in Section 1.6, and a summary of the major contributions of this thesis in Section 1.7.
1.1 General Remarks

Linear systems and random processes are of fundamental importance in many engineering applications. In particular, linear systems provide useful models for communication channels and speech production, are vital in transmitter/receiver design, and are used as filters in processing schemes for signal separation, enhancement, and detection. Similarly, random processes are used to model phenomena as diverse as speech and audio, communication signals, visual data, biological signals, or signals arising in machine monitoring. Furthermore, undesired disturbances (noise and interference) are usually modelled as being random, too. Hence, linear systems and random processes lie at the heart of numerous (statistical) signal processing methods. The above short list of their applications is far from complete and, indeed, falls short of illustrating their ubiquity.

In a formal sense, linear systems and (second-order statistics of) random processes allow a unified mathematical treatment via linear operator theory¹ (a brief review of certain elements of linear operator theory is given in Appendix A; far more comprehensive treatments can be found in [69, 158]). In particular, a linear system can be associated with a linear operator $\mathbf{H}$ whose kernel equals the system's impulse response $h(t,t')$ relating the input signal $x(t)$ and the output signal $y(t)$ as²

$$ y(t) = (\mathbf{H}x)(t) = \int_{t'} h(t,t')\, x(t')\, dt' . \tag{1.1} $$

Throughout this thesis, when talking about linear systems we will restrict our attention to Hilbert-Schmidt (HS) operators (see Appendix A). The only exceptions to this general rule are i) linear time-invariant (LTI) systems and linear frequency-invariant (LFI) systems (which are never HS); and ii) unitary systems (see Subsection 2.1.3).

In a similar way, a correlation operator $\mathbf{R}_x$ can be used as second-order description of a random process $x(t)$ in the sense that the kernel of $\mathbf{R}_x$ equals the correlation function $r_x(t,t') = \mathrm{E}\{x(t)\, x^*(t')\}$ of $x(t)$ (here, $\mathrm{E}$ denotes expectation). We note that the set of all correlation operators equals the subclass of positive semi-definite linear operators (see Appendix A). Except in the case of stationary or white processes, we will implicitly assume the processes involved to have finite mean energy, $E_x \triangleq \mathrm{E}\{\|x\|_2^2\} < \infty$. This implies that the corresponding correlation operator is trace-class (or nuclear, see Appendix A).

LTI systems and stationary processes can be efficiently dealt with using convolution and Fourier transform techniques. The corresponding theories are well developed and yield useful insights (as discussed in Section 1.2). Unfortunately, Fourier transform techniques lose much of their appeal and usefulness in the case of linear time-varying (LTV) systems and nonstationary processes. This explains why LTV systems and nonstationary processes, while providing a very general framework (much more general than LTI systems and stationary processes), are considerably more difficult to treat (see Section 1.3). This problem motivated much of the work in this thesis, which is concerned with the development of a time-frequency (TF) calculus for LTV systems and nonstationary processes. This TF calculus is valid for underspread systems and underspread processes, which will be discussed in Section 1.4.

¹ We note that while the theoretical treatment of linear systems and random processes can be cast in the same mathematical framework, the interpretation of the corresponding objects is quite different. In particular, in the case of a linear system we are mainly interested in characterizing its effects on various input signals. In contrast, in the case of a random process we are concerned with a description of its correlative properties and its power distribution.
² Throughout this thesis, integrals are from $-\infty$ to $\infty$ unless stated otherwise.
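A finite-dimensional sketch may help fix ideas. On a grid of N samples, the kernel $h(t,t')$ in (1.1) becomes an N x N matrix, the integral becomes a matrix-vector product, and a correlation operator becomes a positive semi-definite matrix whose trace equals the mean energy. The function names `apply_system` and `correlation_from_innovations`, the grid size, and the random test data are illustrative assumptions and do not appear in the thesis.

```python
import numpy as np

def apply_system(H, x):
    """Discrete analogue of (1.1): y[n] = sum over m of H[n, m] * x[m]."""
    return H @ x

def correlation_from_innovations(A):
    """Correlation matrix R_x = E{x x^H} of x = A n, with n unit-variance white noise.

    Assumption for illustration: the process is synthesized from white noise,
    so that R_x = A A^H holds exactly and no sample averaging is needed.
    """
    return A @ A.conj().T

rng = np.random.default_rng(0)
N = 6
H = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))  # generic LTV kernel
x = rng.standard_normal(N) + 1j * rng.standard_normal(N)

y = apply_system(H, x)
# Same result written as the explicit sum that mirrors the integral in (1.1).
y_sum = np.array([sum(H[n, m] * x[m] for m in range(N)) for n in range(N)])

A = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
R_x = correlation_from_innovations(A)
eigenvalues = np.linalg.eigvalsh(R_x)   # real-valued, since R_x is Hermitian
mean_energy = np.trace(R_x).real        # E{||x||^2} equals the trace of R_x
```

In this finite setting every matrix has finite Frobenius norm and finite trace, so the Hilbert-Schmidt and trace-class conditions only become restrictive in the continuous-time case treated in the thesis.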
1.2 Review of Time-Invariant/Stationary Theory

LTI systems and their dual, LFI systems, as well as stationary processes and their dual, white processes, are quite restrictive models (see Section A.4). However, their treatment using convolution and Fourier-type methods is comparatively simple. Thus, we subsequently outline several well-known results for LTI (LFI) systems and stationary (white) random processes (details can be found in standard textbooks like [160, 163]). Our emphasis will be placed on those properties whose extension to time-varying systems and nonstationary processes will be developed in Chapters 2 and 3, respectively.
1.2.1 Transfer Functions of Time-Invariant and Frequency-Invariant Linear Systems

LTI systems are characterized by an impulse response of the form $h(t,t') = g(t-t')$, so that the general input-output relation (1.1) specializes to a convolution (denoted by an asterisk),

$$ y(t) = (g * x)(t) = \int_{t'} g(t-t')\, x(t')\, dt' . $$

LFI systems are characterized by an impulse response of the form $h(t,t') = m(t)\,\delta(t-t')$; the input-output relation (1.1) here reduces to a time-domain multiplication,

$$ y(t) = \int_{t'} m(t)\,\delta(t-t')\, x(t')\, dt' = m(t)\, x(t) . $$

LTI (LFI) systems are always normal and any two LTI systems (LFI systems) commute with each other. The spectral transfer function (frequency response) of LTI systems is given by the Fourier transform of $g(\tau)$,

$$ G(f) \triangleq \int_{\tau} g(\tau)\, e^{-j2\pi f\tau}\, d\tau , \tag{1.2} $$

and the temporal transfer function of LFI systems is given by the multiplier function $m(t)$. These spectral and temporal transfer functions are extremely simple and efficient system descriptions. This is due to the following properties:

• The complex sinusoids $\{e^{j2\pi f t}\}$ (parametrized in a physically meaningful manner by frequency $f$) are the generalized eigenfunctions [68] of any LTI system, with $G(f)$ the associated generalized eigenvalues, i.e., $(\mathbf{H}x)(t) = G(f)\, e^{j2\pi f t}$ for $x(t) = e^{j2\pi f t}$. Similarly, the Dirac impulses $\delta(t-t')$ (parametrized in a physically meaningful manner by time $t'$) are the generalized eigenfunctions of LFI systems, with $m(t')$ the associated generalized eigenvalues: $(\mathbf{H}x)(t) = m(t')\,\delta(t-t')$ for $x(t) = \delta(t-t')$.

• As a consequence of the previous property, for LTI systems the Fourier transform of $(\mathbf{H}x)(t)$ equals the Fourier transform of $x(t)$ multiplied by $G(f)$, and for LFI systems the output signal $(\mathbf{H}x)(t)$ equals the input signal $x(t)$ multiplied by $m(t)$. Hence, for LTI and LFI systems, the input-output relation simplifies to the multiplication of two functions in the frequency domain and in the time domain, respectively.

• The spectral (temporal) transfer function of the series connection (composition) of two LTI (LFI) systems with transfer functions $G_1(f)$ ($m_1(t)$) and $G_2(f)$ ($m_2(t)$) equals $G_1(f)\,G_2(f)$ ($m_1(t)\,m_2(t)$).

• The adjoint $\mathbf{H}^+$ of an LTI (LFI) system has impulse response $g^*(-\tau)$ ($m^*(t)\,\delta(t-t')$), and hence its spectral (temporal) transfer function is simply the complex conjugate of $G(f)$ ($m(t)$).

• The inverse $\mathbf{H}^{-1}$ of an LTI (LFI) system, if it exists, corresponds to the reciprocal of the spectral (temporal) transfer function.

• For LTI (LFI) systems the maximum system gain (i.e., operator norm, see Section A.1) is equal to the supremum of $|G(f)|$ ($|m(t)|$).
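The properties listed above can be checked numerically in a discrete, cyclic setting, where an LTI system becomes a circulant matrix, the Fourier transform (1.2) becomes a DFT, and an LFI system becomes a diagonal matrix. The following is a minimal sketch under those assumptions; the helper names `circulant` and `transfer_function`, the size N = 8, and the random impulse responses are hypothetical choices made for illustration only.

```python
import numpy as np

def circulant(g):
    """Matrix of cyclic convolution with g: C[n, m] = g[(n - m) mod N]."""
    n = np.arange(len(g))
    return g[(n[:, None] - n[None, :]) % len(g)]

def transfer_function(H):
    """Transfer function of a circulant matrix: DFT of its first column."""
    return np.fft.fft(H[:, 0])

rng = np.random.default_rng(1)
N = 8
g1 = rng.standard_normal(N) + 1j * rng.standard_normal(N)
g2 = rng.standard_normal(N) + 1j * rng.standard_normal(N)
H1, H2 = circulant(g1), circulant(g2)
G1, G2 = np.fft.fft(g1), np.fft.fft(g2)   # discrete counterpart of (1.2)

# Eigenfunction property: a complex sinusoid is only scaled by G1[k].
k = 3
sinusoid = np.exp(2j * np.pi * k * np.arange(N) / N)
eig_ok = np.allclose(H1 @ sinusoid, G1[k] * sinusoid)

# Commutation, composition, adjoint, inverse, and maximum gain.
commute_ok = np.allclose(H1 @ H2, H2 @ H1)
comp_ok = np.allclose(transfer_function(H2 @ H1), G2 * G1)
adj_ok = np.allclose(transfer_function(H1.conj().T), G1.conj())
inv_ok = np.allclose(transfer_function(np.linalg.inv(H1)), 1.0 / G1)
gain_ok = np.isclose(np.linalg.norm(H1, 2), np.abs(G1).max())

# LFI counterpart: a diagonal matrix, with unit impulses as eigenvectors.
m = rng.standard_normal(N) + 1j * rng.standard_normal(N)
M = np.diag(m)
impulse = np.zeros(N)
impulse[k] = 1.0
lfi_ok = np.allclose(M @ impulse, m[k] * impulse)
```

The inverse check presumes that no DFT coefficient of `g1` vanishes, which mirrors the "if it exists" qualifier in the list above.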
1.2.2 Power Densities of Stationary and White Processes

The correlation function $r_x(t_1,t_2) = \mathrm{E}\{x(t_1)\, x^*(t_2)\}$ of (wide-sense) stationary processes depends only on the difference $t_1 - t_2$, i.e., $r_x(t_1,t_2) = r_x(t_1 - t_2)$. Hence, the 1-D correlation function $r_x(\tau)$ is a complete second-order description. The power spectral density (PSD) [163] of the stationary process $x(t)$ is defined as the Fourier transform of the correlation function (Wiener-Khintchine relation), i.e.,

$$ P_x(f) \triangleq \int_{\tau} r_x(\tau)\, e^{-j2\pi f\tau}\, d\tau . \tag{1.3} $$

Similarly, the correlation function of a white process is of the form $r_x(t_1,t_2) = q_x(t_1)\,\delta(t_1 - t_2)$ and the temporal power density is given by the mean instantaneous intensity [163] $q_x(t)$. The PSD and mean instantaneous intensity are very simple, physically intuitive, and useful second-order statistical descriptions of the process. The following properties and interpretations of the PSD and mean instantaneous intensity are of fundamental importance:

• The complex exponentials $e^{j2\pi f t}$ are the generalized eigenfunctions of the (convolution type) correlation operator $\mathbf{R}_x$ of stationary processes. This implies that the Fourier transform diagonalizes the correlation operator and thus provides a decorrelation of stationary processes, $\mathrm{E}\{X(f)\, X^*(f')\} = P_x(f)\,\delta(f - f')$. Here, $X(f) = \int_t x(t)\, e^{-j2\pi f t}\, dt$ is the process' Fourier transform. For white processes, the correlation operator is diagonalized by the Dirac impulses $\delta(t-t')$ and decorrelation is obtained in the time domain, $\mathrm{E}\{x(t)\, x^*(t')\} = q_x(t)\,\delta(t-t')$.

• Let $h(\tau)$ denote the impulse response of a (time-invariant) innovations system of $x(t)$ [38, 163], i.e., $x(t) = \int_{\tau} h(\tau)\, n(t-\tau)\, d\tau$, with $n(t)$ normalized stationary white noise. Then it is known that

$$ P_x(f) = |H(f)|^2 , \tag{1.4} $$

where $H(f)$ is the transfer function of the innovations system as defined in (1.2). Similarly, if $h(t,t') = m(t)\,\delta(t-t')$ is the (frequency-invariant) innovations system of a white process, i.e., $x(t) = m(t)\, n(t)$, then $q_x(t) = |m(t)|^2$.

• The PSD (mean instantaneous intensity) is a complete second-order statistic of stationary (white) processes since the correlation function $r_x(t_1,t_2)$ can be completely recovered from $P_x(f)$ ($q_x(t)$).

• The PSD and the mean instantaneous intensity are nonnegative quantities that integrate to the mean temporal and spectral power, respectively:

$$ P_x(f) \ge 0 , \qquad q_x(t) \ge 0 , \qquad \int_f P_x(f)\, df = \mathrm{E}\{|x(t)|^2\} , \qquad \int_t q_x(t)\, dt = \mathrm{E}\{|X(f)|^2\} . $$

These properties are essential for an interpretation as average power densities.

• If a stationary process $x(t)$ is passed through an LTI system with impulse response $k(\tau)$, the output process $y(t) = (k * x)(t)$ is again stationary with PSD

$$ P_y(f) = |K(f)|^2\, P_x(f) . \tag{1.5} $$

Furthermore, the cross PSD of $y(t)$ and $x(t)$ is given by

$$ P_{y,x}(f) \triangleq \int_{\tau} r_{y,x}(\tau)\, e^{-j2\pi f\tau}\, d\tau = K(f)\, P_x(f) , \tag{1.6} $$

where $r_{y,x}(\tau) = \mathrm{E}\{y(t)\, x^*(t-\tau)\}$ is the cross correlation function of $y(t)$ and $x(t)$. Similarly, passing a white process $x(t)$ through an LFI system with temporal transfer function $m(t)$ again yields a white process $y(t) = m(t)\, x(t)$ with mean instantaneous intensity $q_y(t) = |m(t)|^2\, q_x(t)$ and mean instantaneous cross intensity $q_{y,x}(t) = m(t)\, q_x(t)$.
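Relations (1.3), (1.5), and (1.6), together with the decorrelation property, have exact counterparts for cyclically stationary vectors of length N, where correlation operators are circulant matrices. The sketch below works directly with correlation matrices rather than with simulated sample paths; this modelling choice, the helper `circulant`, and the randomly drawn PSD and impulse response are assumptions introduced for illustration.

```python
import numpy as np

def circulant(r):
    """Circulant matrix with entries C[n, m] = r[(n - m) mod N]."""
    n = np.arange(len(r))
    return r[(n[:, None] - n[None, :]) % len(r)]

rng = np.random.default_rng(2)
N = 8
P_x = rng.uniform(0.5, 2.0, N)        # assumed nonnegative PSD samples
r_x = np.fft.ifft(P_x)                # inverts the discrete form of (1.3)
R_x = circulant(r_x)                  # cyclically stationary correlation matrix

k_imp = rng.standard_normal(N) + 1j * rng.standard_normal(N)  # impulse response k
K_op = circulant(k_imp)
K_f = np.fft.fft(k_imp)               # transfer function K(f) on the DFT grid

R_y = K_op @ R_x @ K_op.conj().T      # correlation matrix of y = k * x
R_yx = K_op @ R_x                     # cross-correlation matrix of y and x

P_y = np.fft.fft(R_y[:, 0])           # PSD of y, to be compared with (1.5)
P_yx = np.fft.fft(R_yx[:, 0])         # cross PSD, to be compared with (1.6)

F = np.fft.fft(np.eye(N))             # DFT matrix, F[k, n] = exp(-j 2 pi k n / N)
spectral_corr = F @ R_x @ F.conj().T  # E{X X^H} for X = F x
```

Under these assumptions `spectral_corr` is diagonal with entries `N * P_x`, the discrete analogue of $\mathrm{E}\{X(f)\,X^*(f')\} = P_x(f)\,\delta(f-f')$, and the average of `P_x` equals `R_x[0, 0]`, mirroring the statement that the PSD integrates to the mean power.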
1.3 Time-Varying Systems and Nonstationary Random Processes

LTV systems and nonstationary random processes provide very general and thus powerful models for a large variety of engineering applications. Unfortunately, as discussed below, this generality is paid for with an increased difficulty in describing LTV systems and nonstationary processes. The situation gets more comforting if one considers the classes of underspread systems and underspread processes [117–120, 126, 127, 129, 144, 145], which can be viewed as natural extensions of LTI (LFI) systems and stationary (white) processes, respectively. The basic concepts related to underspread systems and underspread processes as well as their importance for an approximate TF calculus will be outlined in Section 1.4, while a detailed discussion of the approximate TF calculus will be provided in Chapters 2 and 3.
1.3.1 Time-Varying Systems and the Generalized Weyl Symbol
For LTV systems, the generalized Weyl symbol (GWS) is defined as [114, 118, 144]
$$L_{\mathbf{H}}^{(\alpha)}(t,f) \triangleq \int_\tau h^{(\alpha)}(t,\tau)\,e^{-j2\pi f\tau}\,d\tau , \tag{1.7}$$
with $h^{(\alpha)}(t,\tau) = h\big(t + (1/2-\alpha)\tau,\; t - (1/2+\alpha)\tau\big)$. Important properties and relations of the GWS are summarized in Section B.1.2. The GWS has been recognized as a potential candidate for a TF (or, time-varying) transfer function, i.e., as a generalization of the spectral (temporal) transfer function. Unfortunately, in contrast to LTI (LFI) systems, LTV systems and the GWS are generally much more cumbersome to work with. This is due to the following reasons:
• Contrary to LTI and LFI systems, LTV systems are not always normal. In the non-normal case, one has to deal with (numerically more expensive) singular value decompositions instead of eigenvalue decompositions [69, 158] (see also Appendix A).
• In general, the eigenfunctions or singular functions of distinct LTV systems are different and possess no simple specific structure.
• Since the singular value decomposition (eigenvalue decomposition) of LTV systems has no specific structure, it can only be interpreted using signal space concepts, thus lacking a specific physical interpretation. In particular, the parameter $k$ in (A.4) and (A.5) has in general no physical meaning. Furthermore, there exists no fast implementation (like the FFT for LTI systems) of the transform associated with these unstructured singular functions (eigenfunctions).
• None of the practically convenient properties of the spectral (temporal) transfer function of LTI (LFI) systems (see the list in Subsection 1.2.1) is valid any longer for the GWS of LTV systems.
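To make the role of the GWS as a generalized transfer function concrete, the following minimal sketch evaluates a cyclic, discrete analog of (1.7) for the particular choice $\alpha = 1/2$, where $h^{(1/2)}(t,\tau) = h(t, t-\tau)$. The discretization, the length `N`, and the example systems are assumptions made for illustration only. For a (cyclic) LTI system the symbol reduces to the spectral transfer function at every time instant, and for an LFI system it reduces to the temporal transfer function at every frequency.

```python
import cmath

N = 8  # hypothetical signal length; all indices are taken modulo N

def gws_half(H):
    """Cyclic discrete analog of (1.7) for alpha = 1/2:
    L[t][f] = sum_tau H[t][(t - tau) % N] * exp(-j 2 pi f tau / N)."""
    return [[sum(H[t][(t - tau) % N] * cmath.exp(-2j * cmath.pi * f * tau / N)
                 for tau in range(N))
             for f in range(N)]
            for t in range(N)]

# LTI (cyclic convolution) system: H[t][t'] = k[(t - t') % N]
k = [1.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, -0.25]
H_lti = [[k[(t - tp) % N] for tp in range(N)] for t in range(N)]

# LFI (multiplication) system: H[t][t'] = m[t] for t == t', zero otherwise
m = [1.0 + 0.1 * t for t in range(N)]
H_lfi = [[m[t] if t == tp else 0.0 for tp in range(N)] for t in range(N)]

L_lti = gws_half(H_lti)
L_lfi = gws_half(H_lfi)

# Spectral transfer function (DFT of the impulse response) for comparison
K = [sum(k[n] * cmath.exp(-2j * cmath.pi * f * n / N) for n in range(N))
     for f in range(N)]

# The symbol of the LTI system does not depend on t; that of the LFI system not on f.
assert all(abs(L_lti[t][f] - K[f]) < 1e-12 for t in range(N) for f in range(N))
assert all(abs(L_lfi[t][f] - m[t]) < 1e-12 for t in range(N) for f in range(N))
```

For a general matrix `H` that is neither circulant nor diagonal, `gws_half(H)` depends on both `t` and `f`, which is exactly the situation in which the difficulties listed above arise.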
1.3.2 Nonstationary Processes and Time-Varying Power Spectra
In the context of random processes, the eigenvalue decomposition of the correlation operator, referred to as Karhunen-Loève (KL) decomposition [136], has a specific interpretation that is important for statistical signal processing schemes. We shall thus discuss very briefly the KL expansion of nonstationary processes. Let us consider a finite-energy, zero-mean, nonstationary process $x(t)$. The associated correlation operator $\mathbf{R}_x$ is trace-class (cf. Appendix A) and has (orthonormal) eigenfunctions $u_k(t)$ and absolutely summable nonnegative eigenvalues $\lambda_k$. The KL theorem [136] states that $x(t)$ can be expanded into the eigenfunctions $u_k(t)$ (in the mean-square sense) and that the expansion coefficients are uncorrelated with mean power equal to the eigenvalues $\lambda_k$, i.e.,^3
$$x(t) = \sum_{k=1}^{\infty} \langle x, u_k\rangle\,u_k(t) , \qquad \mathrm{E}\{\langle x, u_k\rangle\,\langle x, u_l\rangle^*\} = \lambda_k\,\delta_{kl} . \tag{1.8}$$
The double orthogonality of the KL expansion (orthogonality of the basis functions $u_k(t)$ and orthogonality of the random coefficients $\langle x, u_k\rangle$) is the reason for the (theoretical) optimality and usefulness of the KL transform (i.e., the transform mapping the random process $x(t)$ to the coefficients $\langle x, u_k\rangle$) in various applications like transform coding or signal detection. In the case of stationary processes, the KL transform reduces to the Fourier transform. With $X(f)$ denoting the Fourier transform of $x(t)$, (1.8) reads
$$x(t) = \int_f X(f)\,e^{j2\pi ft}\,df , \qquad \mathrm{E}\{X(f)\,X^*(f')\} = P_x(f)\,\delta(f-f') ,$$
i.e., one formally has $u_k(t) \to e^{j2\pi ft}$ and $\lambda_k \to P_x(f)$, with the continuous parameter $f$ (frequency) replacing the discrete parameter $k$. For white processes there is a similar integral representation where formally $u_k(t) \to \delta(t-t')$ and $\lambda_k \to q_x(t')$, with the continuous parameter $t'$ (time) replacing the discrete parameter $k$. Hence, in a certain sense, the KL eigenvalues provide a generalization of the PSD and the mean instantaneous intensity to general nonstationary processes. Unfortunately, the KL transform in general suffers from several drawbacks:
• The KL basis functions $u_k(t)$ are not known a priori and in general are different for different processes.
• In general, the basis $\{u_k(t)\}$ lacks a simple structure; this is the reason why typically no efficient implementation of the KL transform exists.
• In general, the parameter $k$ of the KL transform has no physical meaning like frequency $f$ or time $t$.
These drawbacks are well recognized and have been a major driving force for research into approximations of the KL transform [50, 60, 61, 118, 120, 137, 138]: depending on the specific application, the goal has been either to develop approximate KL transforms using highly structured and efficiently implementable bases or to provide physically more relevant definitions of spectra for nonstationary processes. Potential candidates for the latter are the generalized Wigner-Ville spectrum (GWVS), defined as [60, 61, 63, 140, 145]
$$W_x^{(\alpha)}(t,f) \triangleq \int_\tau r_x^{(\alpha)}(t,\tau)\,e^{-j2\pi f\tau}\,d\tau ,$$
with $r_x^{(\alpha)}(t,\tau) = r_x\big(t + (1/2-\alpha)\tau,\; t - (1/2+\alpha)\tau\big)$, and the generalized evolutionary spectrum (GES), defined as [145, 148] (cf. also [49, 118, 170, 171])
$$G_x^{(\alpha)}(t,f) \triangleq \big|L_{\mathbf{H}}^{(\alpha)}(t,f)\big|^2 ,$$
with $\mathbf{H}$ an innovations system of $x(t)$, i.e., a system satisfying $\mathbf{H}\mathbf{H}^+ = \mathbf{R}_x$. Further details about the GWVS and the GES can be found in Sections B.3.1 and B.3.2, respectively. Unfortunately, none of the practically important and convenient properties of the PSD or the mean instantaneous intensity as detailed in Subsection 1.2.2 are satisfied by the GWVS or GES of general nonstationary processes.
^3 The inner product is defined as usual, $\langle x, y\rangle = \int_t x(t)\,y^*(t)\,dt$.
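The statement that the KL transform of a stationary process reduces to the Fourier transform has a simple finite-dimensional counterpart, sketched below under the assumption of a cyclically stationary toy process with hypothetical correlation sequence `r`. Its correlation matrix is circulant, the complex exponentials are its eigenvectors, and the corresponding eigenvalues are the samples of the discrete PSD.

```python
import cmath

N = 8  # hypothetical period of a cyclically stationary toy process

# Correlation sequence r[l] = E{x[t] x*[t-l]} (indices modulo N), with r[-l] = conj(r[l]).
r = [1.0, 0.4, 0.1, 0.0, 0.0, 0.0, 0.1, 0.4]
R = [[r[(t1 - t2) % N] for t2 in range(N)] for t1 in range(N)]  # circulant correlation matrix

def dft_vector(f):
    """Orthonormal complex exponential u_f[t] = exp(j 2 pi f t / N) / sqrt(N)."""
    return [cmath.exp(2j * cmath.pi * f * t / N) / N ** 0.5 for t in range(N)]

def apply(R, u):
    """Matrix-vector product R u."""
    return [sum(R[t1][t2] * u[t2] for t2 in range(N)) for t1 in range(N)]

# Discrete PSD: P[f] = sum_l r[l] exp(-j 2 pi f l / N)
P = [sum(r[l] * cmath.exp(-2j * cmath.pi * f * l / N) for l in range(N))
     for f in range(N)]

# Residual of the eigen-equation R u_f = P[f] u_f for every frequency index f
residuals = []
for f in range(N):
    u = dft_vector(f)
    Ru = apply(R, u)
    residuals.append(max(abs(Ru[t] - P[f] * u[t]) for t in range(N)))

assert max(residuals) < 1e-12
```

For a correlation matrix that is not circulant (a nonstationary process), the complex exponentials are in general no longer eigenvectors, and the eigenvectors have to be computed numerically for each process.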
1.4 The Importance of Being Underspread

In the foregoing subsections, we saw that in general the GWS of an LTV system and the GWVS or GES of a nonstationary process fail to provide tools that are as efficient and meaningful as the transfer function of LTI (LFI) systems and the power densities of stationary (white) processes, respectively. However, for underspread systems (see Chapter 2) the GWS has properties similar to those of the transfer function (at least in an approximate sense), and for underspread processes (see Chapter 3) the GWVS and GES (approximately) satisfy properties similar to those of the PSD. Making this statement precise and proving it is the main theme of this thesis. In the next two subsections, we outline the main results of the theory developed in Chapters 2 and 3.
1.4.1 Underspread Linear Time-Varying Systems
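For our subsequent discussion, it is necessary to consider the TF shifts introduced by an LTV system $\mathbf{H}$. These can be characterized by the generalized spreading function (GSF) [114, 118]
$$S_{\mathbf{H}}^{(\alpha)}(\tau,\nu) \triangleq \int_t h^{(\alpha)}(t,\tau)\,e^{-j2\pi\nu t}\,dt ,$$
with $h^{(\alpha)}(t,\tau)$ as before. Important properties and relations of the GSF are summarized in Section B.1.1. In particular,
$$y(t) = (\mathbf{H}x)(t) = \int_\tau\!\int_\nu S_{\mathbf{H}}^{(\alpha)}(\tau,\nu)\,\big(\mathbf{S}_{\tau,\nu}^{(\alpha)}x\big)(t)\,d\tau\,d\nu , \tag{1.9}$$
with $\mathbf{S}_{\tau,\nu}^{(\alpha)}$ denoting the generalized TF shift operator,
$$\big(\mathbf{S}_{\tau,\nu}^{(\alpha)}x\big)(t) = x(t-\tau)\,e^{j2\pi\nu t}\,e^{j2\pi\nu\tau(\alpha-1/2)} .$$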
It is seen from (1.9) that the spread, or extension, of the GSF about the origin of the $(\tau,\nu)$-plane provides a global characterization of the TF shifts introduced by the system. The GSF extension in the $\tau$ direction characterizes the system's length of memory, whereas the extension in the $\nu$ direction determines the rate of the system's time variations or fluctuations. Conceptually, an LTV system is called underspread if its GSF is concentrated in a small region about the origin of the $(\tau,\nu)$ plane,^4 which indicates that the system introduces only small TF shifts $\tau$, $\nu$, or, in other words, that the system's memory is short and/or its time variations are slow (see Subsections 2.1.2 and 2.2.4). In contrast, systems introducing large TF shifts $\tau$, $\nu$ are referred to as overspread.
^4 In fact, in most cases it suffices that the GSF is concentrated around some point $(\tau_0,\nu_0)$ since the corresponding offset TF shift can be split off from the system.
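The interpretation of the GSF as a description of the TF shifts of a system can be illustrated with a cyclic, discrete analog of the GSF for $\alpha = 1/2$. This is a sketch under assumed conventions (the length `N`, the $1/N$ normalization, and the example shift are hypothetical): for a system that consists of a single TF shift by $(\tau_0,\nu_0)$, the discrete GSF is a single unit peak located at $(\tau_0,\nu_0)$.

```python
import cmath

N = 8  # hypothetical signal length; indices are cyclic (mod N)

def gsf_half(H):
    """Cyclic discrete analog of the GSF for alpha = 1/2, normalized by 1/N:
    S[tau][nu] = (1/N) sum_t H[t][(t - tau) % N] * exp(-j 2 pi nu t / N)."""
    return [[sum(H[t][(t - tau) % N] * cmath.exp(-2j * cmath.pi * nu * t / N)
                 for t in range(N)) / N
             for nu in range(N)]
            for tau in range(N)]

def tf_shift(tau0, nu0):
    """Matrix of the TF shift (S x)[t] = x[t - tau0] * exp(j 2 pi nu0 t / N)."""
    return [[cmath.exp(2j * cmath.pi * nu0 * t / N) if (t - tp) % N == tau0 else 0.0
             for tp in range(N)]
            for t in range(N)]

S = gsf_half(tf_shift(2, 3))

# The GSF of a pure TF shift is concentrated in the single point (tau0, nu0) = (2, 3).
for tau in range(N):
    for nu in range(N):
        expected = 1.0 if (tau, nu) == (2, 3) else 0.0
        assert abs(S[tau][nu] - expected) < 1e-12
```

By linearity, a weighted sum of a few such shifts with small `tau0` and `nu0` yields a GSF concentrated near the origin, which is the underspread situation described above.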
The underspread concept was first used for random LTV systems in the context of fading multipath channels [11, 109, 172, 203] and for doubly spread radar targets [72, 74, 203]. Adopting the terminology of the random case, Kozek [118-120] introduced underspread deterministic LTV systems by requiring that their GSF be exactly zero outside a small rectangular support region about the origin of the $(\tau,\nu)$ plane. Systems that are underspread in Kozek's sense, i.e., that have small compact GSF support, will be treated in Section 2.1, and we will refer to them as displacement-limited (DL) operators. In practice, the condition of small compact GSF support is often not satisfied exactly but only effectively. This raises the question of how to choose the effective support region and how the modeling error resulting from a specific choice of this effective support region affects the validity of the results obtained using the compact support model. Furthermore, the small compact support requirement is often unnecessarily restrictive since several important results hold for much wider classes of systems, including systems with a GSF that satisfies certain support constraints but still has infinite support. Thus, in this thesis we introduce and use an extended underspread concept based on operators with rapidly decaying GSF (see Section 2.2). As a foundation for this extension, we use weighted integrals and moments of the GSF as measures of the global TF shifts of a system, without requiring the GSF to have compact support. For LTV systems of the underspread type (be they DL operators or operators with rapidly decaying GSF), all of the difficulties connected with LTV systems and the GWS as listed in Subsection 1.3.1 are (at least approximately) removed.
In fact, one can establish a GWS-based approximate TF transfer function calculus that yields the following useful results (the details and bounds on the associated approximation errors will be presented in Section 2.3):
• Underspread operators are approximately normal (see Subsection 2.3.17) and have approximate eigenfunctions which, being time and frequency translates of a prototype signal, are highly structured (see Subsection 2.3.8). Hence, these approximate eigenfunctions also allow an intuitive physical interpretation.
• The GWS approximately reflects the system's maximum gain (see Subsection 2.3.15).
• The GWS of the adjoint operator $\mathbf{H}^+$ is approximately equal to the complex conjugate of the GWS of $\mathbf{H}$ (see Subsection 2.3.2).
• The GWS of the product (composition) of two (jointly) underspread operators (systems) is approximately given by the product of the respective individual GWS (see Subsection 2.3.4).
• Jointly underspread LTV systems are approximately commuting (see Subsection 2.3.16).
• In a certain sense to be discussed in Subsection 2.3.6, the GWS of the inverse of $\mathbf{H}$ approximately corresponds to $1/L_{\mathbf{H}}^{(\alpha)}(t,f)$.
All these approximations can be summarized by stating that the GWS of (jointly) underspread systems can be interpreted as a TF transfer function which can be used in exactly the same manner as the spectral (temporal) transfer function of LTI (LFI) systems.
1.4.2 Underspread Nonstationary Processes
TF correlations play a similar role for random processes as TF shifts do for linear systems. They can be compactly characterized by the generalized expected ambiguity function (GEAF) [117, 118, 126]
$$A_x^{(\alpha)}(\tau,\nu) \triangleq S_{\mathbf{R}_x}^{(\alpha)}(\tau,\nu) = \int_t r_x^{(\alpha)}(t,\tau)\,e^{-j2\pi\nu t}\,dt ,$$
with $r_x^{(\alpha)}(t,\tau)$ as before (further details can be found in Sections 3.1 and B.3.4). In particular, the GEAF can be interpreted as a global measure of the TF correlation of process components separated by $\tau$ in time and by $\nu$ in frequency. Hence, the extension of the GEAF about the origin of the $(\tau,\nu)$ plane provides a global characterization of the TF correlations of the process. The GEAF extension in the $\tau$ direction describes the temporal correlation width, whereas the extension in the $\nu$ direction characterizes the spectral correlation width of the process. Conceptually, we will refer to a random process as underspread if its GEAF is sufficiently concentrated about the origin of the $(\tau,\nu)$ plane. This is equivalent to the requirement that the process features only small TF correlations. Since the GEAF is the GSF of the correlation operator $\mathbf{R}_x$, it follows that the correlation operator of an underspread process is underspread in the sense of Subsection 1.4.1. In contrast, processes with large TF correlations are referred to as overspread. Underspread processes were first introduced by Kozek [117, 118, 126] in analogy to underspread LTV systems. Kozek's definition of underspread processes was similarly based on the requirement that their GEAF is exactly zero outside a small rectangular support region about the origin of the $(\tau,\nu)$ plane. We will refer to processes that are underspread according to this original definition as correlation-limited (CL) processes. Similar to the case of LTV systems, the condition of small compact GEAF support will often be satisfied only effectively. Hence, for the same reasons as mentioned in the foregoing subsection, we shall introduce and use in this thesis an extended underspread concept that is based on rapid decay of the GEAF. The GEAF decay (and thus the global TF correlations) will be measured via weighted integrals and moments. For this extended version of the underspread property, no compact GEAF support is required.
The important fact about underspread processes is that the difficulties connected with nonstationary random processes and the GWVS/GES as listed in Subsection 1.3.2 are alleviated. In particular, one can show that the GWVS and GES (as well as other time-varying spectra, see Chapter 3) of underspread processes (approximately) satisfy the following useful properties (the details and bounds on the associated approximation errors will be presented in Chapter 3):
• Time and frequency translates of a reasonable prototype signal are approximate KL eigenfunctions of underspread processes. These approximate KL eigenfunctions allow a physically meaningful interpretation and, due to their underlying structure, the associated transform can be efficiently implemented.
• The GWVS and GES are (approximately) positive and describe the mean TF energy distribution of the process in a meaningful way.
• For both the GWVS and GES, simple and intuitive approximate input-output relations similar to (1.5) and (1.6) can be derived.
• Almost all definitions of time-varying power spectra proposed up to now in the literature (in particular, the GWVS and GES) are approximately equivalent.
• In an approximate sense, most of the time-varying power spectra are complete second-order characterizations of the process (for the GWVS, this holds exactly, see Subsection B.3.1).
In essence, the above approximations imply that the GWVS and GES of underspread processes are reasonable definitions of time-varying power spectra which extend the PSD (mean instantaneous intensity) of stationary (white) processes in a meaningful way to the nonstationary case.
1.5 Signal Processing Applications

As explained above, Chapters 2 and 3 establish that the GWS is a meaningful TF transfer function for underspread LTV systems and that the GWVS and GES are meaningful mean TF energy distributions of underspread nonstationary random processes. These findings have important implications for the following applications that will be discussed in more detail in Chapter 4.
• Nonstationary Signal Estimation and Enhancement. The Wiener filter is known to be the optimal linear signal estimator with respect to a mean square error criterion [106, 187, 197, 202]. Unfortunately, in the case of nonstationary processes, the design of the Wiener filter is based on the solution of an operator equation and requires a computationally costly and potentially unstable operator inversion. Applying the results of Subsection 2.3.7 allows the use of a computationally efficient, numerically stable, and physically intuitive approximate TF design of time-varying Wiener filters [92, 111] (see Section 4.1).
• Nonstationary Signal Detection. In the case of nonstationary Gaussian processes, the design of the likelihood ratio detector [108, 168, 187, 202] and the deflection-optimal detector [7, 168] involves the solution of an operator equation that requires expensive and potentially unstable operator inversions. Similar to the signal estimation problem, we show how the results of Subsection 2.3.6 can be used to obtain a computationally less costly, stable, and intuitive approximate TF design of time-varying signal detectors [141-143, 146] (see Section 4.2).
• Sounding of Mobile Radio Channels. Accurate wideband measurements of mobile radio channels by means of correlative channel sounders are the cornerstone of any design or simulation of mobile radio systems with high data rate [40, 52, 164]. While most channel sounders assume the channel to be time-invariant, practical mobile radio channels are time-varying. For this reason, the measurements are typically affected by systematic errors [149, 150, 156]. Using the results of Subsections 2.3.1, 2.3.4, and 2.3.18, these errors can be quantified and bounded (see Section 4.3).
• Multicarrier Communication Systems. Multicarrier communication systems like orthogonal frequency division multiplex (OFDM) and discrete multi-tone (DMT) [28-30, 133, 182, 209, 215] are closely related to TF signal expansions. Recent work showed that pulses other than the usual rectangular one as well as an extension to biorthogonal transmit and receive filters can be advantageous in the case of fast time-varying channels [18, 19, 128]. In Section 4.4, we will point out a close relation of these recent results with approximate eigenfunctions of LTV systems as discussed in Subsection 2.3.8.
• Analysis of Car Engine Signals. Pressure and vibration signals measured in combustion engines can be modelled successfully as nonstationary random processes. They are important for knock detection and other car engine diagnosis tasks [17, 26, 27, 113, 143, 146, 181, 207]. Application of the time-varying spectra and TF coherence function considered in Chapter 3 makes it possible to extract useful time-varying and nonstationary features of this type of real data (see Section 4.5).
1.6 Related Work

As mentioned above, underspread LTV systems with a GSF satisfying a rectangular support constraint were first considered in the pioneering work of W. Kozek [118-120, 127]. Several transfer function approximations have been derived within this framework [118-120, 127]. Furthermore, results in a similar spirit have been obtained for the symbol calculi in the context of quantum-mechanical quantization [64, 167, 206] and pseudo-differential operators [64, 73, 94, 112], with the difference that these theories define specific symbol classes directly in phase space whereas in our approach we formulate growth conditions in the (dual) spreading function domain. Also, whereas in quantization theory and pseudo-differential operator theory one studies the operators corresponding to a given symbol, we consider a given LTV system (operator) and investigate how close its GWS comes to the intuitive engineering notion of a TF transfer function. We will further comment on these differences and similarities in Sections 2.2 and 2.3. Several results for underspread random processes with rectangularly supported GEAF that are related to the approximations in Chapter 3 were also derived by W. Kozek in [115, 118, 120, 126]. Furthermore, results close in spirit to our considerations have been presented in [191-193]. There, the analysis and processing of random processes with slowly time-varying statistics based on innovations systems and the evolutionary spectrum is discussed. In [60, 61, 63], observations regarding the Wigner-Ville spectrum and the evolutionary spectrum were made which are closely related to our discussion of time-varying power spectra. Finally, [32] considers LTV systems and nonstationary random processes that are regionally underspread, i.e., for which the underspread condition is required to hold only for a portion of the system (process) that is localized in a specific TF region.
1.7 Overview of Contributions

We conclude this introductory chapter with an overview of the major contributions of this thesis (in order of appearance).
• LTV systems with compactly supported GSF (Section 2.1): We introduce novel parameters characterizing the TF shifts of LTV systems with compact GSF support, and we extend the previous definitions [118-120, 127] of (jointly) underspread systems to accommodate oblique orientations of GSF support regions. Furthermore, the spread parameters of unitarily transformed systems as well as of sums, products, and adjoints of underspread LTV systems are analyzed.
• LTV systems with rapidly decaying GSF (Section 2.2): We present an extension of the underspread concept to LTV systems whose GSF does not necessarily have compact support but features rapid decay. This extension uses weighted integrals and moments of the GSF that provide novel measures of the TF shifts introduced by a system. We derive various relations for weighted GSF integrals and moments of unitarily transformed LTV systems and of sums, products, and adjoints of LTV systems. Furthermore, we present Chebyshev-like inequalities that are useful for assessing the errors made by approximating an arbitrary LTV system by a system with compactly supported GSF.
• TF transfer function approximations (Section 2.3): This large section contains numerous results that establish a GWS-based approximate TF transfer function calculus for underspread LTV systems. All TF transfer function approximations are underpinned by bounds on the associated approximation errors which are formulated in terms of weighted GSF integrals and moments. While numerous TF transfer function approximations are completely new, others are extensions of existing results for the special case of LTV systems with compactly supported GSF in [118-120, 127].
• TF correlation analysis (Section 3.1): We present methods for the analysis of the TF correlations of a random process. Furthermore, we provide a novel concept of underspread processes that is based on weighted integrals and moments of the GEAF and that extends previous definitions of underspread processes that were based on the assumption of a compactly supported GEAF [117, 118, 126].
• Elementary time-varying spectra (Section 3.2): For underspread processes, the GWVS and GES are shown to be smooth, effectively real-valued, and positive quantities. We furthermore discuss the occurrence of statistical cross-terms in the case of overspread processes and we present uncertainty relations for the GWVS and GES.
• Type I and type II spectra (Sections 3.3-3.5): Type I time-varying power spectra (previously considered in [3, 60, 61, 63]) are introduced in an axiomatic fashion and shown to be extensions of the GWVS. Similarly, we extend the GES by introducing the novel class of type II time-varying power spectra. We show that these spectra satisfy (at least approximately) several desirable properties. Furthermore, we prove that for underspread processes the members of these two classes of spectra are approximately equivalent (for the special case of GWVS and GES and processes with compactly supported GEAF, part of these results has been shown previously in [118, 126]).
• Input-output relations (Section 3.6): We present novel approximations that relate the GWVS (GES) of the nonstationary output process of an LTV system to the GWVS (GES) of the nonstationary input process.
• Approximate KL expansion (Section 3.7): We present approximate KL expansions for underspread nonstationary random processes. This extends existing results derived for processes with compactly supported GEAF [115, 116, 118, 120] to more general scenarios and is furthermore related to results obtained in [137, 138] for locally stationary processes.
• TF coherence function (Section 3.8): We discuss the concept of coherence for nonstationary random processes and introduce a novel class of TF coherence functions. While a TF coherence has been defined in [210] in an ad hoc fashion, we prove that our TF coherence functions can be viewed as approximate TF formulations of a coherence operator and we provide several approximations that justify their interpretation as a coherence function.
• Applications (Chapter 4): Chapter 4 applies the theoretical results developed in Chapters 2 and 3 to the problems of signal estimation, signal detection, channel sounding, multicarrier communications, and car engine diagnosis (see Section 1.5 for an outline of these applications).
2 Underspread Systems
My work always tried to unite the truth with the beautiful, but when I had to choose one or the other, I usually chose the beautiful.
Hermann Weyl
Although there exist several useful time-frequency tools for characterizing linear time-varying systems (linear operators), such as the generalized Weyl symbol and the generalized spreading function, these time-frequency representations are in general difficult to work with. For example, series connections and inverses of linear time-varying systems in general cannot be expressed via the generalized Weyl symbol in as simple a manner as is possible with the transfer function of linear time-invariant systems. Hence, there remain several problems as to how these representations are to be interpreted and how they can be used in specific signal processing applications. A key concept that helps to answer these questions is that of underspread systems. Such systems introduce only limited time-frequency shifts and are characterized by a spreading function concentrated around the origin of the $(\tau,\nu)$-plane. Underspread systems essentially come in two flavors: The first type, proposed in this thesis and discussed in detail in Section 2.2, builds on the requirement of rapid decay of the spreading function such that specific weighted integrals and moments are sufficiently small. The second type (which can be viewed as a special case of the first) is based on a strict support constraint of the spreading function, similar to strictly bandlimited signals. This latter type of underspread systems, reviewed in Section 2.1, has been introduced and extensively analyzed in the pioneering work of W. Kozek. In Section 2.3 of this chapter, numerous approximations based on specific underspread assumptions are proved that establish an approximate time-frequency transfer function calculus. We note that some parts of our analysis are parallel to Kozek's work, although our (more general) definition of underspread systems using weighted integrals and moments and the approximations based on it are original.
2.1 Operators with Compactly Supported Spreading Function

In this section we consider operators with compactly supported GSF and review Kozek's original definition of underspread operators. We furthermore discuss the effect of unitary metaplectic transforms on the operator's GSF support and study the GSF supports of the sum, product, and inverses of operators.
2.1.1 General Support Constraints
The definition of underspread (deterministic) LTV systems by requiring the support of their GSF to be confined to a rectangular region about the origin was first proposed and extensively studied in the pioneering work of W. Kozek [117-120, 123, 126]. This support constraint on the GSF implies that the system introduces only limited time displacements $\tau$ and frequency displacements $\nu$, and hence we refer to such systems as displacement-limited (DL) systems (operators). Since GSF and GWS are 2-D Fourier transform pairs, it further follows that the GWS of a DL system is a 2-D band-limited function. The existing extensive theory of band-limited signals [162] thus serves as an additional motivation for this definition of underspread systems. In particular, a strict band-limitation of the GWS allows the GWS to be sampled on a 2-D sampling grid without information loss. The following discussion is intended as a review of some of the concepts introduced in [117-119, 123, 126], with some slight modifications and extensions comprising more general support constraints than rectangular ones (e.g., oblique regions in the $(\tau,\nu)$-plane). This will also yield bounds on the errors incurred by the transfer function approximations in Section 2.3 that are slightly tighter and/or valid under more general conditions than previous bounds in [117-119, 123, 126].
Consider an LTV system (operator) $\mathbf{H}$ with GSF $S_{\mathbf{H}}^{(\alpha)}(\tau,\nu)$. We require that the support of the GSF is confined to a compact region $\mathcal{G}_{\mathbf{H}}$, i.e., $|S_{\mathbf{H}}^{(\alpha)}(\tau,\nu)| = 0$ for $(\tau,\nu) \notin \mathcal{G}_{\mathbf{H}}$. For such systems, we can write^1 $\mathcal{G}_{\mathbf{H}} = \{(\tau,\nu) : |S_{\mathbf{H}}(\tau,\nu)| > 0\}$. Thus, with the indicator function $I_{\mathcal{G}_{\mathbf{H}}}(\tau,\nu)$ of $\mathcal{G}_{\mathbf{H}}$, defined as
$$I_{\mathcal{G}_{\mathbf{H}}}(\tau,\nu) = \begin{cases} 1 , & (\tau,\nu) \in \mathcal{G}_{\mathbf{H}} , \\ 0 , & (\tau,\nu) \notin \mathcal{G}_{\mathbf{H}} , \end{cases}$$
the GSF satisfies the condition
$$S_{\mathbf{H}}^{(\alpha)}(\tau,\nu)\,I_{\mathcal{G}_{\mathbf{H}}}(\tau,\nu) = S_{\mathbf{H}}^{(\alpha)}(\tau,\nu) . \tag{2.1}$$
While in general there exist infinitely many indicator functions which satisfy (2.1) for a given $S_{\mathbf{H}}^{(\alpha)}(\tau,\nu)$, our definition corresponds to the indicator function with minimal support, i.e., where the associated region $\mathcal{G}_{\mathbf{H}}$ has minimal area. The multiplicative relation (2.1) in the $(\tau,\nu)$-domain corresponds to a convolution relation in the TF domain, i.e.,
$$\int_{t'}\!\int_{f'} L_{\mathbf{H}}^{(\alpha)}(t',f')\,L_{\mathbf{T}}^{(\alpha)}(t-t',f-f')\,dt'\,df' = L_{\mathbf{H}}^{(\alpha)}(t,f) . \tag{2.2}$$
Here, $L_{\mathbf{T}}^{(\alpha)}(t,f)$ is the GWS of an operator $\mathbf{T}$ defined via its GSF as
$$S_{\mathbf{T}}^{(\alpha)}(\tau,\nu) = I_{\mathcal{G}_{\mathbf{H}}}(\tau,\nu) . \tag{2.3}$$
^1 Note that the GSF magnitude is independent of $\alpha$, $|S_{\mathbf{H}}^{(\alpha)}(\tau,\nu)| = |S_{\mathbf{H}}(\tau,\nu)|$ (see Subsection B.1.1).
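Note that (2.2) holds for any operator $\tilde{\mathbf{T}}$ with GSF satisfying $S_{\tilde{\mathbf{T}}}^{(\alpha)}(\tau,\nu) = 1$ for $(\tau,\nu) \in \mathcal{G}_{\mathbf{H}}$ (while being arbitrary for $(\tau,\nu) \notin \mathcal{G}_{\mathbf{H}}$). However, due to (B.9), the operator $\mathbf{T}$ defined via (2.3) has minimum norm within the class of all operators satisfying (2.2).
To each $(\tau,\nu)$-region $\mathcal{G}$ or indicator function $I_{\mathcal{G}}(\tau,\nu)$, there corresponds a linear subspace $\mathcal{S}_{\mathcal{G}}$ that consists of all linear operators $\mathbf{H}$ satisfying the corresponding support constraint (2.1),
$$\mathcal{S}_{\mathcal{G}} = \big\{\mathbf{H} : S_{\mathbf{H}}^{(\alpha)}(\tau,\nu)\,I_{\mathcal{G}}(\tau,\nu) = S_{\mathbf{H}}^{(\alpha)}(\tau,\nu)\big\} .$$
The orthogonal complement of $\mathcal{S}_{\mathcal{G}}$ corresponds to the complement $\bar{\mathcal{G}} = \mathbb{R}^2 \setminus \mathcal{G}$ of $\mathcal{G}$ whose indicator function is $I_{\bar{\mathcal{G}}}(\tau,\nu) = 1 - I_{\mathcal{G}}(\tau,\nu)$, such that for any operator $\mathbf{H}^c$ in the complement space $\mathcal{S}_{\mathcal{G}}^{\perp} = \mathcal{S}_{\bar{\mathcal{G}}}$ there is $S_{\mathbf{H}^c}^{(\alpha)}(\tau,\nu)\,I_{\mathcal{G}}(\tau,\nu) = 0$ or equivalently $S_{\mathbf{H}^c}^{(\alpha)}(\tau,\nu)\,I_{\bar{\mathcal{G}}}(\tau,\nu) = S_{\mathbf{H}^c}^{(\alpha)}(\tau,\nu)\,[1 - I_{\mathcal{G}}(\tau,\nu)] = S_{\mathbf{H}^c}^{(\alpha)}(\tau,\nu)$. Obviously, we also have
$$\mathbf{H} \in \mathcal{S}_{\mathcal{G}_1} \ \text{and}\ \mathcal{G}_1 \subseteq \mathcal{G}_2 \quad\Longrightarrow\quad \mathbf{H} \in \mathcal{S}_{\mathcal{G}_2} .$$
Furthermore, given a region $\mathcal{G}$ or indicator function $I_{\mathcal{G}}(\tau,\nu)$, any operator $\mathbf{H}$ can be split into two orthogonal parts lying respectively inside and outside the associated operator space $\mathcal{S}_{\mathcal{G}}$,
$$\mathbf{H} = \mathbf{H}^{\mathcal{G}} + \mathbf{H}^{\bar{\mathcal{G}}} . \tag{2.4}$$
The orthogonal components $\mathbf{H}^{\mathcal{G}}$ and $\mathbf{H}^{\bar{\mathcal{G}}}$ are found by projecting $\mathbf{H}$ onto the respective subspace. This projection can easily be accomplished in the GSF domain according to
$$S_{\mathbf{H}^{\mathcal{G}}}^{(\alpha)}(\tau,\nu) = S_{\mathbf{H}}^{(\alpha)}(\tau,\nu)\,I_{\mathcal{G}}(\tau,\nu) , \qquad S_{\mathbf{H}^{\bar{\mathcal{G}}}}^{(\alpha)}(\tau,\nu) = S_{\mathbf{H}}^{(\alpha)}(\tau,\nu)\,[1 - I_{\mathcal{G}}(\tau,\nu)] . \tag{2.5}$$
We call $\mathbf{H}^{\mathcal{G}}$ the DL part of $\mathbf{H}$ corresponding to the region $\mathcal{G}$. It is easily checked using (B.8) that these two systems are indeed orthogonal, i.e., $\langle \mathbf{H}^{\mathcal{G}}, \mathbf{H}^{\bar{\mathcal{G}}}\rangle = \mathrm{Tr}\{\mathbf{H}^{\mathcal{G}} (\mathbf{H}^{\bar{\mathcal{G}}})^+\} = 0$. Note that this does not in general imply $\mathbf{H}^{\mathcal{G}} \mathbf{H}^{\bar{\mathcal{G}}} = 0$ or $\mathbf{H}^{\bar{\mathcal{G}}} \mathbf{H}^{\mathcal{G}} = 0$. These concepts will be important in Sections 2.3.6 and 2.3.18 when discussing operator inverses and TF-sampling of the GWS, respectively.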
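The splitting (2.4) and the projection (2.5) can be illustrated with a cyclic, discrete analog of the GSF for $\alpha = 1/2$. The sketch below rests on assumptions made only for illustration: the size `N`, the test matrix `H`, and the rectangular region `in_G` are hypothetical. It masks the discrete GSF inside and outside the region, maps both parts back to operators, and checks that they add up to `H` and are orthogonal with respect to the inner product $\mathrm{Tr}\{\mathbf{A}\mathbf{B}^+\}$.

```python
import cmath

N = 6  # hypothetical size; indices are cyclic (mod N)

def gsf(H):
    """S[tau][nu] = (1/N) sum_t H[t][(t - tau) % N] exp(-j 2 pi nu t / N)  (alpha = 1/2 analog)."""
    return [[sum(H[t][(t - tau) % N] * cmath.exp(-2j * cmath.pi * nu * t / N)
                 for t in range(N)) / N
             for nu in range(N)] for tau in range(N)]

def from_gsf(S):
    """Inverse map: H[t][(t - tau) % N] = sum_nu S[tau][nu] exp(j 2 pi nu t / N)."""
    H = [[0j] * N for _ in range(N)]
    for t in range(N):
        for tau in range(N):
            H[t][(t - tau) % N] = sum(S[tau][nu] * cmath.exp(2j * cmath.pi * nu * t / N)
                                      for nu in range(N))
    return H

def inner(A, B):
    """Inner product <A, B> = Tr{A B^+} = sum over entries of A[t][t'] conj(B[t][t'])."""
    return sum(A[t][tp] * B[t][tp].conjugate() for t in range(N) for tp in range(N))

# Arbitrary (hypothetical) test operator and region G = {|tau| <= 1, |nu| <= 1} in cyclic indices.
H = [[complex((3 * t + 5 * tp) % 7, (t * tp) % 4) for tp in range(N)] for t in range(N)]
in_G = lambda tau, nu: min(tau, N - tau) <= 1 and min(nu, N - nu) <= 1

S = gsf(H)
# Discrete analog of (2.5): mask the GSF inside and outside the region G.
H_G = from_gsf([[S[tau][nu] if in_G(tau, nu) else 0j for nu in range(N)] for tau in range(N)])
H_Gc = from_gsf([[0j if in_G(tau, nu) else S[tau][nu] for nu in range(N)] for tau in range(N)])

# Discrete analog of (2.4) and of the orthogonality of the two parts.
assert max(abs(H[t][tp] - H_G[t][tp] - H_Gc[t][tp]) for t in range(N) for tp in range(N)) < 1e-9
assert abs(inner(H_G, H_Gc)) < 1e-9
```

The orthogonality follows from the disjoint GSF supports of the two parts; it does not require the matrix products of the two parts to vanish.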
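2.1.2 Definition of Displacement-limited Underspread Operators
In the following, we recall from [118] the definition of underspread operators in the constrained support sense:
Definition 2.1. Let $\mathbf{H}$ be a DL system with GSF support $\mathcal{G}_{\mathbf{H}}$ and let $\tau_{\mathbf{H}}^{(\max)}$ and $\nu_{\mathbf{H}}^{(\max)}$ be the maximum time and frequency shift, respectively, introduced by the system,
$$\tau_{\mathbf{H}}^{(\max)} \triangleq \max_{(\tau,\nu)\in\mathcal{G}_{\mathbf{H}}} |\tau| , \qquad \nu_{\mathbf{H}}^{(\max)} \triangleq \max_{(\tau,\nu)\in\mathcal{G}_{\mathbf{H}}} |\nu| . \tag{2.6}$$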
Figure 2.1: Illustration of (a) a rectangular and (b) a hyperbolic compact support constraint.

Then,
$$ \sigma_H \triangleq 4\,\tau_H^{(\max)}\,\nu_H^{(\max)} $$
is called the (strict-sense) displacement spread of the DL system $H$, and $H$ is called (strict-sense) DL underspread if
$$ \sigma_H = 4\,\tau_H^{(\max)}\,\nu_H^{(\max)} \ll 1 . \tag{2.7} $$
Condition (2.7) was first introduced in [118–120]. We note that $4\,\tau_H^{(\max)}\nu_H^{(\max)}$ is the area of a rectangle with sides of length $2\tau_H^{(\max)}$ and $2\nu_H^{(\max)}$, respectively, which contains the GSF support region $\mathcal G_H$. Hence, $\sigma_H$ measures the support of the GSF of $H$ via a circumscribed rectangle (cf. Fig. 2.1(a)). Sometimes, it will suffice to require the less restrictive condition
$$ \bar\sigma_H \triangleq \max_{(\tau,\nu)\in\mathcal G_H} |\tau\nu| \ll 1 . \tag{2.8} $$
Since $\bar\sigma_H$ measures only the maximum product $|\tau\nu|$ for points within $\mathcal G_H$, we have
$$ \bar\sigma_H \le \sigma_H / 4 , \tag{2.9} $$
with equality if and only if one of the four corner points $(\pm\tau_H^{(\max)}, \pm\nu_H^{(\max)})$ lies in $\mathcal G_H$. Condition (2.8) is somewhat different in spirit from (2.7) since it does not require the area of the support of the GSF to be small or even finite; however, it still implies a support constraint since the GSF has to lie within the hyperbolae defined by $|\tau\nu| = \bar\sigma_H$ (see also Fig. 2.1(b)).

It is important to note that DL operators [118–120] are a special case of operators with rapidly decaying GSF to be defined in Section 2.2. This will further be discussed in Subsection 2.2.2.
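As a small numerical illustration of (2.6)–(2.9), the following Python sketch computes both displacement spreads from a sampled GSF support. The function name `displacement_spreads`, the representation of the support region as a list of $(\tau,\nu)$ grid points, and the two example supports are illustrative assumptions of ours and are not taken from [118].

```python
def displacement_spreads(support):
    """Return (tau_max, nu_max, sigma, sigma_bar) for a sampled GSF support.

    `support` is a list of (tau, nu) points at which the GSF is nonzero,
    i.e. a hypothetical discretization of the support region G_H.
    """
    tau_max = max(abs(tau) for tau, _ in support)            # (2.6)
    nu_max = max(abs(nu) for _, nu in support)               # (2.6)
    sigma = 4.0 * tau_max * nu_max                           # circumscribed rectangle, (2.7)
    sigma_bar = max(abs(tau * nu) for tau, nu in support)    # hyperbolic constraint, (2.8)
    return tau_max, nu_max, sigma, sigma_bar


# Rectangular support: it contains its corner points, so (2.9) holds with equality.
rect = [(0.1 * i, 0.1 * j) for i in range(-2, 3) for j in range(-2, 3)]

# Cross-shaped support along both axes: large circumscribed rectangle,
# but tau * nu = 0 at every support point.
cross = ([(0.1 * i, 0.0) for i in range(-20, 21)]
         + [(0.0, 0.1 * j) for j in range(-20, 21)])

for name, region in (("rect", rect), ("cross", cross)):
    tau_max, nu_max, sigma, sigma_bar = displacement_spreads(region)
    # The last entry checks the inequality (2.9) up to rounding.
    print(name, sigma, sigma_bar, sigma_bar <= sigma / 4 + 1e-12)
```

The cross-shaped example violates the rectangular condition (2.7) while trivially satisfying the hyperbolic condition (2.8), which is the situation sketched in Fig. 2.1(b).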
2.1.3 Unitary Transformations

Let us now consider the influence of some specific unitary transformations of operators on the GSF support, i.e., we consider the transformed operator $\tilde H = U H U^{+}$ with some unitary operator $U$.

We first study the effects of TF shifts, i.e., $\tilde H = S_{t,f}\, H\, S_{t,f}^{+}$ where $S_{t,f}$ denotes the joint TF shift operator (see Appendix B). Since TF shifts leave the GSF magnitude unchanged, the quantities $\sigma_H$ and $\bar\sigma_H$ remain unchanged too, i.e.,
$$ \sigma_{\tilde H} = \sigma_H , \qquad \bar\sigma_{\tilde H} = \bar\sigma_H . $$
Hence, any class of operators defined by some type of GSF support constraint also comprises all TF shifted versions of each member of this class,
$$ H \in \mathcal S_{\mathcal G} \;\Longrightarrow\; S_{t,f}\, H\, S_{t,f}^{+} \in \mathcal S_{\mathcal G} . $$
In particular, all TF shifted versions of an underspread DL system are also DL underspread.

Next let us consider the class $\mathcal M$ of metaplectic transformations $U = \mu(A)$ that correspond to the area-preserving linear (symplectic) TF coordinate transforms (see Appendix C and [46, 64, 162, 208]). For any metaplectic operator $U = \mu(A)$, the GSF with $\alpha = 0$ of $\tilde H = U H U^{+}$ is given by (cf. (C.4))
$$ S^{(0)}_{\tilde H}(\tau,\nu) = S^{(0)}_H(a\tau + b\nu,\; c\tau + d\nu) . $$
Since $|S^{(\alpha)}_H(\tau,\nu)|$ is independent of $\alpha$ (cf. Section B.1.1), there is also$^2$
$$ \big|S^{(\alpha)}_{\tilde H}(\tau,\nu)\big| = \big|S^{(\alpha)}_H(a\tau + b\nu,\; c\tau + d\nu)\big| \qquad \text{for all } \alpha . \tag{2.10} $$
Specific metaplectic operators which depend only on a single parameter are the TF scaling operator ($a = 1/d$, $b = c = 0$), the Fourier transform operator ($a = d = 0$, $b = -1/c = T^2$), the chirp multiplication operator ($a = d = 1$, $b = 0$), the chirp convolution operator ($a = d = 1$, $c = 0$), and the fractional Fourier transform operator ($a = d = \cos\phi$, $b = T^2 \sin\phi$, $c = -(\sin\phi)/T^2$). It is straightforward to show that only for TF scalings and Fourier transforms there is $\sigma_{\tilde H} = \sigma_H$ and $\bar\sigma_{\tilde H} = \bar\sigma_H$. In all other cases, $\sigma_H$ and $\bar\sigma_H$ must be expected to change. In some cases, it is undesirable that systems whose GSFs are equal up to area-preserving linear TF coordinate transforms are assigned different displacement spreads. Hence, we extend the definition of the DL underspread property in (2.7) such that the displacement spread of all operators whose GSFs are obtained from each other by symplectic group TF coordinate transforms is equal:

Definition 2.2. An operator $H$ is called wide-sense DL underspread if there exists a metaplectic operator $U \in \mathcal M$ such that the displacement spread of $\tilde H = U H U^{+}$ satisfies $\sigma_{\tilde H} \ll 1$. We call
$$ \sigma_H^{\min} \triangleq \inf_{U \in \mathcal M} \sigma_{\tilde H} $$
the wide-sense displacement spread of $H$.

Note that all systems which are unitarily equivalent via some $U \in \mathcal M$ are thus assigned the same wide-sense displacement spread. This is of specific importance when discussing transfer function approximations for the case $\alpha = 0$, where shearings and rotations of the GSF can be of particular interest (see, e.g., Subsections 2.3.4 and 2.3.15).

$^2$ Note that for $\alpha \neq 0$ this equality is valid only for the magnitude of the GSF and not for the GSF itself.
2.1.4 Operator Sums, Adjoints, Products, and Inverses

We shall next consider the displacement spreads of sums, adjoints, products, and inverses of operators.

Sum. With regard to the sum of two operators $H_1$ and $H_2$, we obviously have
$$ \big|S^{(\alpha)}_{H_1+H_2}(\tau,\nu)\big| = \big|S^{(\alpha)}_{H_1}(\tau,\nu) + S^{(\alpha)}_{H_2}(\tau,\nu)\big| \le \big|S^{(\alpha)}_{H_1}(\tau,\nu)\big| + \big|S^{(\alpha)}_{H_2}(\tau,\nu)\big| . $$
Without making any additional assumptions, we have to deal with the worst case, i.e., GSFs $S^{(\alpha)}_{H_1}(\tau,\nu)$, $S^{(\alpha)}_{H_2}(\tau,\nu)$ which do not cancel anywhere in the sum $S^{(\alpha)}_{H_1}(\tau,\nu) + S^{(\alpha)}_{H_2}(\tau,\nu)$. Here, the region of support of $S^{(\alpha)}_{H_1+H_2}(\tau,\nu)$ is given by $\mathcal G_{H_1,H_2} \triangleq \mathcal G_{H_1} \cup \mathcal G_{H_2}$ and the corresponding indicator function is
$$ I_{\mathcal G_{H_1,H_2}}(\tau,\nu) = I_{\mathcal G_{H_1}}(\tau,\nu) + I_{\mathcal G_{H_2}}(\tau,\nu) - I_{\mathcal G_{H_1}}(\tau,\nu)\, I_{\mathcal G_{H_2}}(\tau,\nu) . $$
For the displacement spreads we then obtain
$$ \sigma_{H_1+H_2} \le \sigma_{H_1,H_2} \triangleq 4 \max\{\tau_{H_1}^{(\max)}, \tau_{H_2}^{(\max)}\}\, \max\{\nu_{H_1}^{(\max)}, \nu_{H_2}^{(\max)}\} , \qquad \bar\sigma_{H_1+H_2} \le \bar\sigma_{H_1,H_2} \triangleq \max\{\bar\sigma_{H_1}, \bar\sigma_{H_2}\} . \tag{2.11} $$
It follows that $\sigma_{H_1,H_2} \ge \max\{\sigma_{H_1}, \sigma_{H_2}\}$ and hence, in general, $\sigma_{H_1+H_2}$ must be expected to be larger than both $\sigma_{H_1}$ and $\sigma_{H_2}$ (except if the support of one GSF is totally contained within that of the other GSF, or the GSFs add to zero in specific peripheral regions of the $(\tau,\nu)$-plane). This shows that the sum of two DL underspread operators is DL but not necessarily DL underspread. On the other hand, $\bar\sigma_{H_1,H_2}$ is never larger than the maximum of $\bar\sigma_{H_1}$ and $\bar\sigma_{H_2}$. Hence, the class of operators defined by $\bar\sigma_H \le A$ is closed under addition.

In [118] two systems are called jointly underspread if the area of the union of their GSF supports can be circumscribed by an axis-parallel rectangle of area less than one. However, based on the above observations and using the definition
$$ \sigma^{\min}_{H_1,H_2} \triangleq \inf_{U \in \mathcal M} \sigma_{U H_1 U^{+},\, U H_2 U^{+}} , $$
we here introduce the following definition of jointly DL operators:

Definition 2.3. Two systems $H_1$ and $H_2$ are said to be jointly strict-sense (wide-sense) DL underspread if $\sigma_{H_1,H_2} \ll 1$ ($\sigma^{\min}_{H_1,H_2} \ll 1$).

Hence, jointly DL underspread systems are individually DL underspread and also their sum $H_1 + H_2$ is DL underspread. We will call $\sigma_{H_1,H_2}$ ($\sigma^{\min}_{H_1,H_2}$) the strict-sense (wide-sense) joint displacement spread of $H_1$ and $H_2$. In essence, jointly DL underspread operators have GSFs satisfying similar support constraints.
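The joint displacement spreads in (2.11) can be sketched numerically as follows. This is a minimal illustration only: the helper names `spreads` and `joint_spreads` and the two sampled supports (one elongated along the $\tau$ axis, one along the $\nu$ axis) are hypothetical choices of ours.

```python
def spreads(support):
    """Return (tau_max, nu_max, sigma_bar) for a list of (tau, nu) support points."""
    tau_max = max(abs(tau) for tau, _ in support)
    nu_max = max(abs(nu) for _, nu in support)
    sigma_bar = max(abs(tau * nu) for tau, nu in support)
    return tau_max, nu_max, sigma_bar


def joint_spreads(support1, support2):
    """Joint displacement spreads of two DL systems according to (2.11)."""
    t1, n1, sb1 = spreads(support1)
    t2, n2, sb2 = spreads(support2)
    sigma_joint = 4.0 * max(t1, t2) * max(n1, n2)   # rectangle around the union
    sigma_bar_joint = max(sb1, sb2)                 # hyperbolic measure of the union
    return sigma_joint, sigma_bar_joint


# H1: long memory, slow time variation (support elongated along the tau axis).
h1 = [(0.5 * i, 0.01 * j) for i in range(-4, 5) for j in range(-2, 3)]
# H2: short memory, fast time variation (support elongated along the nu axis).
h2 = [(0.01 * i, 0.5 * j) for i in range(-2, 3) for j in range(-4, 5)]

sigma1 = 4.0 * spreads(h1)[0] * spreads(h1)[1]
sigma2 = 4.0 * spreads(h2)[0] * spreads(h2)[1]
sigma_joint, sigma_bar_joint = joint_spreads(h1, h2)
print(sigma1, sigma2, sigma_joint, sigma_bar_joint)
```

Both example systems are individually DL underspread in the sense of (2.7), yet their joint displacement spread is large, whereas the hyperbolic measure of (2.8) does not grow when the two supports are combined.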
Adjoint. Since $S^{(\alpha)}_{H^{+}}(\tau,\nu) = S^{(-\alpha)*}_H(-\tau,-\nu)$ according to (B.6), the displacement spreads of a system and its adjoint are equal, $\sigma_{H^{+}} = \sigma_H$, and furthermore $\bar\sigma_{H^{+}} = \bar\sigma_H$.

Product. The following inequalities can be easily deduced from the bound (B.13):
$$ \tau^{(\max)}_{H_2 H_1} \le \tau^{(\max)}_{H_1} + \tau^{(\max)}_{H_2} , \qquad \nu^{(\max)}_{H_2 H_1} \le \nu^{(\max)}_{H_1} + \nu^{(\max)}_{H_2} . $$
In particular, the above inequalities imply that $\sigma_{H^2}$ can be as large as $4\sigma_H$ but not larger,
$$ \sigma_{H^2} = 4\, \tau^{(\max)}_{H^2}\, \nu^{(\max)}_{H^2} \le 4\, \big(2\tau^{(\max)}_H\big)\big(2\nu^{(\max)}_H\big) = 4\,\sigma_H . $$
This bound can be shown to hold also for the composition of a system and its adjoint, i.e., $\sigma_{H H^{+}} \le 4\sigma_H$ and $\sigma_{H^{+} H} \le 4\sigma_H$.

Inverse. The GSF of the inverse of an operator $H$ is difficult to analyze. From the Neumann series [158]
$$ H^{-1} = \sum_{k=0}^{\infty} (I - H)^k , $$
it is seen that due to the higher powers of $(I - H)$ the support of the GSF of $H^{-1}$ may grow ad infinitum. Thus, in general the inverse of a DL underspread operator need not even be DL. Yet, this does not imply that the GSF of $H^{-1}$ cannot be concentrated around the origin. This will further be discussed in Subsection 2.3.6.

2.2 Operators with Rapidly Decaying Spreading Function

After this discussion of DL and DL underspread operators, we now turn to a novel extended concept of underspread operators that replaces the compact support constraint on the GSF by generalized decay constraints.
2.2.1 Motivation

In many cases, the assumption that the support of the GSF of an operator is exactly confined to some small area around the origin (as used in [118, 119, 127] for the definition of underspread operators) appears to be too restrictive. Often, the GSF is merely concentrated around the origin and has rapid decay away from the origin. A useful measure of the decay of the GSF is in terms of weighted GSF integrals and moments which describe the effective, rather than exact, support of the GSF. Hence, a reasonable and practically relevant extension of the underspread concept can be based on such measures of effective GSF support. This is the point of view we adopt in this section. Based on our extended concept of underspread operators with rapidly decaying GSF, we will prove the validity of several underspread approximations in Section 2.3. This approach has several advantages as compared to the results obtained for DL operators:

• In many practical situations, a theory based on weighted GSF integrals and moments is closer to physical reality than a theory based on exact support constraints.
• The results can be used to judge how well the results obtained for DL operators apply to operators with rapidly decaying GSF (see Subsection 2.2.6).
• In general, an operator and its inverse cannot be simultaneously underspread in the DL sense, whereas it is possible that they both have fast decaying GSF.

Figure 2.2: Gray-scale plots (darker shades correspond to larger values) of several specific weighting functions: (a) $\phi(\tau,\nu) = |\tau|^k$, (b) $\phi(\tau,\nu) = |\nu|^k$, (c) $\phi(\tau,\nu) = |\tau\nu|^k$, (d) $\phi_s(\tau,\nu) = |1 - A_s^{(0)}(\tau,\nu)|$, (e) $\psi_s(\tau,\nu) = \big|1 - 1/A_s^{(0)}(\tau,\nu)\big|$, with $A_s^{(0)}(\tau,\nu)$ the ambiguity function (see Section B.2.4) of a normalized Gaussian function.

On the other hand, a drawback of our extended theory of underspread operators is that it does not allow for an exact TF sampling of the GWS as discussed in [118] for the case of DL underspread operators. We will further comment on the sampling problem in Subsection 2.3.18.
2.2.2 Weighted Integrals and Moments of the Generalized Spreading Function

To circumvent the problems and limitations associated with the DL underspread concept as mentioned above, we here propose to globally characterize the TF shifts of a system $H$ by means of the weighted GSF integrals
$$ m_H^{(\phi)} \triangleq \frac{1}{\|S_H\|_1} \int_\tau\!\int_\nu \phi(\tau,\nu)\, |S_H(\tau,\nu)|\, d\tau\, d\nu = \frac{\int_\tau\int_\nu \phi(\tau,\nu)\, |S_H(\tau,\nu)|\, d\tau\, d\nu}{\int_\tau\int_\nu |S_H(\tau,\nu)|\, d\tau\, d\nu} , \tag{2.12-a} $$
$$ M_H^{(\phi)} \triangleq \frac{1}{\|H\|_2} \left[ \int_\tau\!\int_\nu \phi^2(\tau,\nu)\, |S_H(\tau,\nu)|^2\, d\tau\, d\nu \right]^{1/2} = \left[ \frac{\int_\tau\int_\nu \phi^2(\tau,\nu)\, |S_H(\tau,\nu)|^2\, d\tau\, d\nu}{\int_\tau\int_\nu |S_H(\tau,\nu)|^2\, d\tau\, d\nu} \right]^{1/2} , \tag{2.12-b} $$
which are normalized by the $L_1$ or $L_2$ norm of $S_H(\tau,\nu)$ (recall that $\|S_H\|_2 = \|H\|_2$). We note that due to (1.9), the implicit assumption that the GSF has finite $L_1$ norm, $\|S_H\|_1 < \infty$, is a sufficient condition for the bounded-input bounded-output stability of the system $H$. Typically, $\phi(\tau,\nu)$ in (2.12-a) and (2.12-b) is a nonnegative weighting function which satisfies $\phi(\tau,\nu) \ge \phi(0,0) = 0$ and penalizes GSF contributions lying away from the origin. We note that $m_H^{(\phi)}$ and $M_H^{(\phi)}$ do not depend on the GWS parameter $\alpha$. Since the GSF magnitude of the adjoint system $H^{+}$ is given by $|S_{H^{+}}(\tau,\nu)| = |S_H(-\tau,-\nu)|$, we have $m_{H^{+}}^{(\phi)} = m_H^{(\phi)}$ and $M_{H^{+}}^{(\phi)} = M_H^{(\phi)}$ whenever the weighting function is even-symmetric, i.e., $\phi(\tau,\nu) = \phi(-\tau,-\nu)$. Fig. 2.2 shows some specific weighting functions which will be used in later subsections.
Absolute Moments of the GSF. As a special case of the above weighted integrals using the weighting functions $\phi(\tau,\nu) = |\tau|^k |\nu|^l$, we also introduce the normalized moments
$$ m_H^{(k,l)} \triangleq \frac{1}{\|S_H\|_1} \int_\tau\!\int_\nu |\tau|^k |\nu|^l\, |S_H(\tau,\nu)|\, d\tau\, d\nu , \tag{2.13-a} $$
$$ M_H^{(k,l)} \triangleq \frac{1}{\|H\|_2} \left[ \int_\tau\!\int_\nu \tau^{2k} \nu^{2l}\, |S_H(\tau,\nu)|^2\, d\tau\, d\nu \right]^{1/2} , \tag{2.13-b} $$
with integers$^3$ $k, l \in \mathbb N_0$. If one views $|S_H(\tau,\nu)| / \|S_H\|_1$ and $|S_H(\tau,\nu)|^2 / \|H\|_2^2$ as probability density functions (they are nonnegative and integrate to one), the above moments are analogous to the (absolute) moments of random variables [163]; thus, we call $m_H^{(k,l)}$ and $M_H^{(k,l)}$ the (absolute) moments of order $(k,l)$ of $H$. Note that $m_H^{(0,0)} = M_H^{(0,0)} = 1$ for any system $H$. In almost all cases we will use those (absolute) moments where either $k = 0$ or $l = 0$ or $k = l$.

Moments with $k = 0$ or $l = 0$ penalize mainly GSF contributions located away from the $\tau$ axis or away from the $\nu$ axis, respectively. Thus, systems with GSF concentrated along the $\tau$ axis (i.e., quasi-LTI systems) have small $m_H^{(0,l)}$ and small $M_H^{(0,l)}$, whereas systems with GSF concentrated along the $\nu$ axis (i.e., quasi-LFI systems) have small $m_H^{(k,0)}$ and small $M_H^{(k,0)}$ (cf. Figs. B.1 and 2.2). Moments with $k = l$ penalize mainly GSF contributions located away from the $\tau$ and $\nu$ axes, i.e., lying in oblique directions in the $(\tau,\nu)$ plane. This is due to the fact that the corresponding weighting function is constant along the hyperbolae $|\tau\nu| = c$ (cf. Fig. 2.2). In particular, a superposition (i.e., parallel connection) of a (quasi-)LTI system and a (quasi-)LFI system has a GSF concentrated along the $\tau$ and $\nu$ axes and will thus have small $m_H^{(k,k)}$ and $M_H^{(k,k)}$.

Note that $M_H^{(k,l)}$ in (2.13-b) is well-defined only for Hilbert-Schmidt (HS) operators, i.e., for systems with finite HS norm (cf. Appendix A). As such, this definition is not directly applicable to LTI and LFI systems. However, for LTI and LFI systems appropriately modified moment definitions can be given in terms of the impulse response $g(\tau)$ or the Fourier transform of the modulation function, $M(\nu) = (\mathcal F m)(\nu)$, respectively:
$$ M_{H_{\mathrm{LTI}}}^{(k,0)} \triangleq \frac{1}{\|g\|_2} \left[ \int_\tau \tau^{2k}\, |g(\tau)|^2\, d\tau \right]^{1/2} , \qquad M_{H_{\mathrm{LFI}}}^{(0,l)} \triangleq \frac{1}{\|M\|_2} \left[ \int_\nu \nu^{2l}\, |M(\nu)|^2\, d\nu \right]^{1/2} . \tag{2.14} $$
Note that these specific moments characterize the time displacements and frequency displacements of LTI and LFI systems, respectively; since LTI systems do not cause frequency displacements and LFI systems do not cause time displacements, it does not make sense to ask about moments characterizing the frequency displacements of LTI systems or the time displacements of LFI systems. Furthermore, for systems with GSF perfectly concentrated along the $\tau$ and $\nu$ axes, i.e., for any superposition of LTI and LFI systems with impulse response $h(t,t') = g(t - t') + m(t)\,\delta(t - t')$, one has $m_H^{(k,l)} = 0$ for any $k, l$ both $> 0$.

$^3$ Note that theoretically $k, l$ could be any positive real-valued numbers. However, in the subsequent sections only integer values are required.
Due to the Schwarz inequality, the moments satisfy the following inequalities:
$$ m_H^{(k,l)} \le \sqrt{ m_H^{(2k,0)}\, m_H^{(0,2l)} } , \qquad M_H^{(k,l)} \le \sqrt{ M_H^{(2k,0)}\, M_H^{(0,2l)} } . \tag{2.15} $$
Since $m_H^{(2k,0)}$ and $m_H^{(0,2l)}$, and similarly $M_H^{(2k,0)}$ and $M_H^{(0,2l)}$, measure the average extension of the GSF in the $\tau$ and $\nu$ direction, respectively, their product is a measure of the effective off-origin support of the GSF defined via an equivalent rectangle. The above inequalities hence are mathematical formulations of the fact that the effective off-axes support is always less than the effective off-origin support.

Weighted GSF Integrals/Moments of DL Operators. For the case of a DL operator $H$ with GSF exactly zero outside a compact region $\mathcal G_H$, the following result relates the GSF integrals and moments to the quantities $\tau_H^{(\max)}$, $\nu_H^{(\max)}$, and $\sigma_H$ defined in Section 2.1.
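Proposition 2.4. The GSF integrals of a DL operator $H$ satisfy the bounds
$$ m_H^{(\phi)} \le \phi_H^{(\max)} , \qquad M_H^{(\phi)} \le \phi_H^{(\max)} , \tag{2.16} $$
with $\phi_H^{(\max)} \triangleq \max_{(\tau,\nu)\in\mathcal G_H} \phi(\tau,\nu)$. For the GSF moments of a DL operator $H$, (2.16) further implies the bounds
$$ m_H^{(k,l)} \le \big(\tau_H^{(\max)}\big)^k \big(\nu_H^{(\max)}\big)^l , \qquad M_H^{(k,l)} \le \big(\tau_H^{(\max)}\big)^k \big(\nu_H^{(\max)}\big)^l , \tag{2.17} $$
and for $k = l$ one has the tighter bounds
$$ m_H^{(k,k)} \le \bar\sigma_H^{\,k} \le \Big(\frac{\sigma_H}{4}\Big)^k , \qquad M_H^{(k,k)} \le \bar\sigma_H^{\,k} \le \Big(\frac{\sigma_H}{4}\Big)^k . \tag{2.18} $$

Proof. The bound for $m_H^{(\phi)}$ is obtained by noting that
$$ m_H^{(\phi)} = \frac{1}{\|S_H\|_1} \int_\tau\!\int_\nu \phi(\tau,\nu)\, |S_H(\tau,\nu)|\, d\tau\, d\nu = \frac{1}{\|S_H\|_1} \iint_{\mathcal G_H} \phi(\tau,\nu)\, |S_H(\tau,\nu)|\, d\tau\, d\nu $$
$$ \le \frac{1}{\|S_H\|_1} \iint_{\mathcal G_H} \max_{(\tau',\nu')\in\mathcal G_H} \phi(\tau',\nu')\; |S_H(\tau,\nu)|\, d\tau\, d\nu = \phi_H^{(\max)}\, \frac{1}{\|S_H\|_1} \int_\tau\!\int_\nu |S_H(\tau,\nu)|\, d\tau\, d\nu = \phi_H^{(\max)} . $$
From this, the bound for $m_H^{(k,l)}$ follows by noting that for $\phi(\tau,\nu) = |\tau|^k |\nu|^l$
$$ \phi_H^{(\max)} = \max_{\mathcal G_H} \{ |\tau|^k |\nu|^l \} \le \max_{\mathcal G_H} \{ |\tau|^k \}\, \max_{\mathcal G_H} \{ |\nu|^l \} = \big(\tau_H^{(\max)}\big)^k \big(\nu_H^{(\max)}\big)^l , $$
and the bound for $m_H^{(k,k)}$ follows by noting that for $\phi(\tau,\nu) = |\tau|^k |\nu|^k = |\tau\nu|^k$
$$ \phi_H^{(\max)} = \max_{\mathcal G_H} \{ |\tau\nu|^k \} = \bar\sigma_H^{\,k} . $$
The bounds on $M_H^{(\phi)}$, $M_H^{(k,l)}$, and $M_H^{(k,k)}$ can be derived in a similar manner.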
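The moments (2.13-a), (2.13-b) and the bounds (2.15), (2.17), (2.18) can be checked numerically on a sampled GSF magnitude. The sketch below is only an illustration under stated assumptions: the function name `gsf_moments`, the uniform sampling grid, and the truncated Gaussian-shaped magnitude (a hypothetical DL system) are our own choices; on a uniform grid the cell area cancels in the normalized moments.

```python
import math


def gsf_moments(samples, k, l):
    """Return (m, M), discrete versions of (2.13-a) and (2.13-b).

    `samples` is a list of (tau, nu, mag) triples with mag = |S_H(tau, nu)|
    taken on a uniform grid.
    """
    l1_norm = sum(mag for _, _, mag in samples)
    l2_norm_sq = sum(mag * mag for _, _, mag in samples)
    m = sum(abs(t) ** k * abs(n) ** l * mag for t, n, mag in samples) / l1_norm
    M = math.sqrt(
        sum(t ** (2 * k) * n ** (2 * l) * mag * mag for t, n, mag in samples)
        / l2_norm_sq
    )
    return m, M


# Hypothetical DL system: Gaussian-shaped GSF magnitude truncated to
# |tau| <= 0.3 and |nu| <= 0.2 (grid spacing 0.01 in both directions).
samples = [
    (0.01 * i, 0.01 * j, math.exp(-((0.01 * i) ** 2 + (0.01 * j) ** 2) / 0.02))
    for i in range(-30, 31)
    for j in range(-20, 21)
]
tau_max = max(abs(t) for t, _, _ in samples)
nu_max = max(abs(n) for _, n, _ in samples)
sigma_bar = max(abs(t * n) for t, n, _ in samples)

m11, M11 = gsf_moments(samples, 1, 1)
m20, M20 = gsf_moments(samples, 2, 0)
m02, M02 = gsf_moments(samples, 0, 2)

print(m11 <= math.sqrt(m20 * m02), M11 <= math.sqrt(M20 * M02))  # (2.15)
print(m11 <= tau_max * nu_max, M11 <= tau_max * nu_max)          # (2.17)
print(m11 <= sigma_bar, M11 <= sigma_bar)                        # (2.18)
```

For this example the moments are considerably smaller than the support-based bounds, which reflects the fact that most of the GSF energy lies well inside the support region.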
This result implies that for systems which are DL underspread, the weighted GSF integrals and moments are small. Hence, DL underspread operators are a special case of the extended underspread framework based on weighted GSF integrals and moments. The above bounds will also allow us in Section 2.3 to compare the results obtained for operators with rapidly decaying GSF with those obtained in [118] for DL operators.

Derivatives of the GWS. Due to the 2-D Fourier relation (B.20) of the GSF and GWS, the following bounds and expressions in terms of the GSF moments can be derived for the (partial) derivatives of the GWS in (1.7):

Proposition 2.5. The $L_\infty$ and $L_2$ norms of the partial derivatives of the GWS are related to the (absolute) GSF moments via
$$ \left| \frac{\partial^{k+l} L_H^{(\alpha)}(t,f)}{\partial t^l\, \partial f^k} \right| \le (2\pi)^{k+l}\, \|S_H\|_1\, m_H^{(k,l)} , \qquad \left\| \frac{\partial^{k+l} L_H^{(\alpha)}}{\partial t^l\, \partial f^k} \right\|_2 = (2\pi)^{k+l}\, \|H\|_2\, M_H^{(k,l)} . $$

Proof. Since differentiation in the TF domain corresponds to multiplication in the spreading domain, there is
$$ \frac{\partial^{k+l} L_H^{(\alpha)}(t,f)}{\partial t^l\, \partial f^k} = \int_\tau\!\int_\nu (-1)^l (j2\pi)^{k+l}\, \tau^k \nu^l\, S_H^{(\alpha)}(\tau,\nu)\, e^{j2\pi(\tau f - \nu t)}\, d\tau\, d\nu . $$
Therefore, the magnitude of the partial derivatives of the GWS is bounded as
$$ \left| \frac{\partial^{k+l} L_H^{(\alpha)}(t,f)}{\partial t^l\, \partial f^k} \right| = \left| \int_\tau\!\int_\nu (-1)^l (j2\pi)^{k+l}\, \tau^k \nu^l\, S_H^{(\alpha)}(\tau,\nu)\, e^{j2\pi(\tau f - \nu t)}\, d\tau\, d\nu \right| \le (2\pi)^{k+l} \int_\tau\!\int_\nu |\tau|^k |\nu|^l\, \big|S_H^{(\alpha)}(\tau,\nu)\big|\, d\tau\, d\nu = (2\pi)^{k+l}\, \|S_H\|_1\, m_H^{(k,l)} , $$
from which the $L_\infty$ bound follows. Similarly, by Parseval's theorem,
$$ \left\| \frac{\partial^{k+l} L_H^{(\alpha)}}{\partial t^l\, \partial f^k} \right\|_2^2 = (2\pi)^{2(k+l)} \int_\tau\!\int_\nu \big| \tau^k \nu^l\, S_H^{(\alpha)}(\tau,\nu) \big|^2\, d\tau\, d\nu = (2\pi)^{2(k+l)} \int_\tau\!\int_\nu \tau^{2k} \nu^{2l}\, \big|S_H^{(\alpha)}(\tau,\nu)\big|^2\, d\tau\, d\nu = (2\pi)^{2(k+l)}\, \|H\|_2^2\, \big(M_H^{(k,l)}\big)^2 , $$
thus yielding an exact expression for the $L_2$ norm of the partial derivatives of the GWS in terms of $M_H^{(k,l)}$.

From Proposition 2.5 it is seen that the GSF moments $m_H^{(k,l)}$ and $M_H^{(k,l)}$ essentially determine the smoothness of the GWS. In particular, the first derivative of the GWS with respect to time will be small (implying smoothness of the GWS in the time direction) if the moments $m_H^{(0,1)}$ and $M_H^{(0,1)}$ are small. Similarly, the GWS will be smooth in the frequency direction if $m_H^{(1,0)}$ and $M_H^{(1,0)}$ are small.

We note that the first relation (bound) bears some resemblance to the definition of specific symbol classes of pseudo-differential operators [64, 73, 94, 95, 112, 206], with the difference that our bound is uniform over the entire TF plane whereas in pseudo-differential operator theory there is an additional term controlling the growth of the partial derivatives of the symbol.
Displacement Spread Parameters. A specifically interesting characterization of the TF shifts of an LTV system is in terms of the moments $M_H^{(1,0)}$ and $M_H^{(0,1)}$, for which we will use special symbols. We define the temporal displacement spread $\tau_H$ and the spectral displacement spread $\nu_H$ as
$$ \tau_H \triangleq M_H^{(1,0)} = \frac{1}{\|H\|_2} \left[ \int_\tau\!\int_\nu \tau^2\, \big|S_H^{(\alpha)}(\tau,\nu)\big|^2\, d\tau\, d\nu \right]^{1/2} = \frac{1}{\|H\|_2} \left[ \int_t\!\int_\tau \tau^2\, \big|h^{(\alpha)}(t,\tau)\big|^2\, dt\, d\tau \right]^{1/2} , \tag{2.19} $$
$$ \nu_H \triangleq M_H^{(0,1)} = \frac{1}{\|H\|_2} \left[ \int_\tau\!\int_\nu \nu^2\, \big|S_H^{(\alpha)}(\tau,\nu)\big|^2\, d\tau\, d\nu \right]^{1/2} = \frac{1}{2\pi \|H\|_2} \left[ \int_t\!\int_\tau \left| \frac{\partial h^{(\alpha)}(t,\tau)}{\partial t} \right|^2 dt\, d\tau \right]^{1/2} . \tag{2.20} $$
From the right-most expressions, it is seen that the temporal and spectral displacement spreads characterize, respectively, the memory and the time-variations of $H$.

Sometimes we will also use the displacement radius $\rho_H(T)$ [86], where $T > 0$ is an arbitrary time constant introduced for reasons of compatibility of physical dimensions. The displacement radius is defined as the weighted GSF integral $M_H^{(\phi_T)}$ with weighting function equal to the normalized Euclidean distance of $(\tau,\nu)$ from the origin, $\phi_T(\tau,\nu) = \sqrt{(\tau/T)^2 + (\nu T)^2}$:
$$ \rho_H(T) \triangleq M_H^{(\phi_T)} = \frac{1}{\|H\|_2} \left[ \int_\tau\!\int_\nu \Big[ \Big(\frac{\tau}{T}\Big)^2 + (\nu T)^2 \Big]\, |S_H(\tau,\nu)|^2\, d\tau\, d\nu \right]^{1/2} . $$
Hence, all GSF contributions are weighted by their Euclidean distance from the origin. Thus, $\rho_H(T)$ is a measure of the off-origin support of the GSF. It may be large even for systems whose off-axes support as measured by $m_H^{(k,k)}$ is small. The square of the displacement radius is easily shown to equal the normalized sum of the two squared displacement spreads, i.e.,
$$ \rho_H^2(T) = \frac{\tau_H^2}{T^2} + T^2 \nu_H^2 . $$
By using this relation and completing the square one obtains
$$ \rho_H^2(T) = \frac{\tau_H^2}{T^2} + T^2 \nu_H^2 = \Big( \frac{\tau_H}{T} - \nu_H T \Big)^2 + 2\, \tau_H \nu_H . $$
It is thus seen that the constant $T$ minimizing $\rho_H^2(T)$ is $T_0 = \sqrt{\tau_H / \nu_H}$, for which $\tau_H / T - \nu_H T = 0$, and the minimum is given by
$$ \min_T \rho_H^2(T) = \rho_H^2\big( \sqrt{\tau_H/\nu_H} \big) = 2\, \tau_H \nu_H , $$
from which it follows that $\rho_H^2(T) \ge 2\, \tau_H \nu_H$.

We will furthermore occasionally use the weighted GSF integral
$$ \kappa_H \triangleq \frac{1}{\|H\|_2^2} \int_\tau\!\int_\nu \tau \nu\, |S_H(\tau,\nu)|^2\, d\tau\, d\nu . $$
It is not a special case of $m_H^{(\phi)}$ or $M_H^{(\phi)}$ and in general does not describe the effective support of $S_H(\tau,\nu)$. Rather, it is a measure of the orientation (or skewness) of the GSF, i.e., it indicates whether the GSF is concentrated along an axis or rather along oblique directions. Since a large magnitude of $\kappa_H$ implies that the time and frequency shifts caused by $H$ are strongly coupled, i.e., large (small) time shifts imply large (small) frequency shifts and vice versa, we refer to $\kappa_H$ as the displacement correlation of $H$. An inequality which we will need later is
$$ \kappa_H^2 \le \tau_H^2\, \nu_H^2 . \tag{2.21} $$
In some cases it is convenient to arrange the (normalized) temporal and spectral displacement spreads and the displacement correlation into a matrix which will be called the displacement spread matrix,
$$ D_H^{(T)} \triangleq \begin{pmatrix} \tau_H^2 / T^2 & \kappa_H \\ \kappa_H & T^2 \nu_H^2 \end{pmatrix} . $$
The determinant of $D_H^{(T)}$ (which is independent of $T$) will be denoted by $\theta_H^2$,
$$ \theta_H^2 \triangleq \det D_H^{(T)} = \tau_H^2\, \nu_H^2 - \kappa_H^2 . $$
Note that (2.21) implies that $\det D_H^{(T)} \ge 0$ and thus $D_H^{(T)}$ is seen to be positive semi-definite. Furthermore, the squared displacement radius is the trace of the displacement spread matrix, i.e., $\rho_H^2(T) = \mathrm{Tr}\{ D_H^{(T)} \}$.

All moment quantities introduced in this subsection and their interrelations are summarized in Table 2.1.
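The displacement spread parameters can be evaluated numerically from samples of $|S_H(\tau,\nu)|^2$. The following sketch assumes a hypothetical, obliquely oriented Gaussian-shaped $|S_H|^2$ on a uniform grid; the names `displacement_parameters` and `radius_sq` as well as the variable names `tau2`, `nu2`, `kappa` (for the squared temporal spread, the squared spectral spread, and the displacement correlation) are illustrative labels of ours.

```python
import math


def displacement_parameters(samples):
    """Return (tau_H^2, nu_H^2, kappa_H) from (tau, nu, |S_H|^2) samples."""
    energy = sum(w for _, _, w in samples)
    tau2 = sum(t * t * w for t, _, w in samples) / energy
    nu2 = sum(n * n * w for _, n, w in samples) / energy
    kappa = sum(t * n * w for t, n, w in samples) / energy
    return tau2, nu2, kappa


def radius_sq(tau2, nu2, T):
    """Squared displacement radius rho_H^2(T) = tau_H^2 / T^2 + T^2 nu_H^2."""
    return tau2 / T ** 2 + T ** 2 * nu2


# Hypothetical example: |S_H|^2 = exp(-(2 tau^2 - 2 tau nu + 8 nu^2)),
# sampled on [-3, 3] x [-3, 3] with spacing 0.1 (oblique orientation).
samples = [
    (0.1 * i, 0.1 * j,
     math.exp(-(2 * (0.1 * i) ** 2 - 2 * (0.1 * i) * (0.1 * j) + 8 * (0.1 * j) ** 2)))
    for i in range(-30, 31)
    for j in range(-30, 31)
]

tau2, nu2, kappa = displacement_parameters(samples)
tau_h, nu_h = math.sqrt(tau2), math.sqrt(nu2)
det_d = tau2 * nu2 - kappa ** 2          # determinant of the displacement spread matrix
t0 = math.sqrt(tau_h / nu_h)             # minimizer of the displacement radius

print(kappa ** 2 <= tau2 * nu2)                          # inequality (2.21)
print(radius_sq(tau2, nu2, t0), 2 * tau_h * nu_h)        # minimum of rho_H^2(T)
print(all(radius_sq(tau2, nu2, T) >= 2 * tau_h * nu_h - 1e-12
          for T in (0.5, 1.0, 2.0, 4.0)))
```

Because the example GSF is oriented obliquely, the displacement correlation is nonzero and the determinant `det_d` is strictly smaller than the product of the squared displacement spreads.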
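Table 2.1: Moments and displacement spread parameters and their interrelations.

Quantity | Definition | Interrelations
---------|------------|---------------
$m_H^{(k,l)}$ | $\frac{1}{\|S_H\|_1} \int_\tau\int_\nu |\tau|^k |\nu|^l\, |S_H(\tau,\nu)|\, d\tau\, d\nu$ | $0 \le m_H^{(k,l)} \le \sqrt{m_H^{(2k,0)}\, m_H^{(0,2l)}}$
$M_H^{(k,l)}$ | $\frac{1}{\|H\|_2} \big[ \int_\tau\int_\nu \tau^{2k} \nu^{2l}\, |S_H(\tau,\nu)|^2\, d\tau\, d\nu \big]^{1/2}$ | $0 \le M_H^{(k,l)} \le \sqrt{M_H^{(2k,0)}\, M_H^{(0,2l)}}$
$\tau_H$ | $\frac{1}{\|H\|_2} \big[ \int_\tau\int_\nu \tau^2\, |S_H(\tau,\nu)|^2\, d\tau\, d\nu \big]^{1/2}$ | $\tau_H = M_H^{(1,0)}$
$\nu_H$ | $\frac{1}{\|H\|_2} \big[ \int_\tau\int_\nu \nu^2\, |S_H(\tau,\nu)|^2\, d\tau\, d\nu \big]^{1/2}$ | $\nu_H = M_H^{(0,1)}$
$\kappa_H$ | $\frac{1}{\|H\|_2^2} \int_\tau\int_\nu \tau\nu\, |S_H(\tau,\nu)|^2\, d\tau\, d\nu$ | $\kappa_H^2 \le \tau_H^2\, \nu_H^2$
$\rho_H^2(T)$ | $\frac{1}{\|H\|_2^2} \int_\tau\int_\nu \big[ (\tau/T)^2 + (\nu T)^2 \big]\, |S_H(\tau,\nu)|^2\, d\tau\, d\nu$ | $\rho_H^2(T) = \tau_H^2/T^2 + T^2 \nu_H^2 = \mathrm{Tr}\{D_H^{(T)}\}$, $\rho_H^2(T) \ge 2\,\tau_H \nu_H$
$D_H^{(T)}$ | $\begin{pmatrix} \tau_H^2/T^2 & \kappa_H \\ \kappa_H & T^2 \nu_H^2 \end{pmatrix}$ | $D_H^{(T)} \ge 0$, $\mathrm{Tr}\{D_H^{(T)}\} = \rho_H^2(T) \ge 2\,\tau_H \nu_H$
$\theta_H^2$ | $\det D_H^{(T)}$ | $\theta_H^2 = \tau_H^2\, \nu_H^2 - \kappa_H^2$

2.2.3 Unitary Transformations

Similarly to Subsection 2.1.3, we now consider the influence of some specific unitary transformations of operators on the weighted integrals and moments of the GSF.

TF Shifts. Since TF shifts of an operator $H$ leave the GSF magnitude unchanged, the weighted GSF integrals and moments as defined by (2.12-a), (2.12-b), (2.13-a), and (2.13-b) remain unchanged too, i.e., the weighted GSF integrals and moments of the TF shifted operator $\tilde H = S_{t,f}\, H\, S_{t,f}^{+}$ are given by
$$ m_{\tilde H}^{(\phi)} = m_H^{(\phi)} , \qquad M_{\tilde H}^{(\phi)} = M_H^{(\phi)} , \qquad m_{\tilde H}^{(k,l)} = m_H^{(k,l)} , \qquad \text{and} \qquad M_{\tilde H}^{(k,l)} = M_H^{(k,l)} . $$

Metaplectic Transformations. Next, let us consider unitary transformations $U \in \mathcal M$ corresponding to area-preserving linear TF coordinate transforms [64, 162] (see Subsection 2.1.3 and Appendix C). Using the metaplectic covariance (2.10) of the GSF, the following expression is obtained for the weighted GSF integrals of $\tilde H = U H U^{+}$,
$$ m_{\tilde H}^{(\phi)} = \frac{1}{\|S_{\tilde H}\|_1} \int_\tau\!\int_\nu \phi(\tau,\nu)\, |S_{\tilde H}(\tau,\nu)|\, d\tau\, d\nu = \frac{1}{\|S_H\|_1} \int_\tau\!\int_\nu \phi(\tau,\nu)\, |S_H(a\tau + b\nu,\; c\tau + d\nu)|\, d\tau\, d\nu $$
$$ = \frac{1}{\|S_H\|_1} \int_\tau\!\int_\nu \phi(d\tau - b\nu,\; a\nu - c\tau)\, |S_H(\tau,\nu)|\, d\tau\, d\nu = m_H^{(\tilde\phi)} , $$
with $\tilde\phi(\tau,\nu) = \phi(d\tau - b\nu,\; a\nu - c\tau)$. Similarly, $M_{\tilde H}^{(\phi)} = M_H^{(\tilde\phi)}$.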
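In Section 2.3, weighting functions of the type $\phi_s(\tau,\nu) = |1 - A_s^{(0)}(\tau,\nu)|$ and $\psi_s(\tau,\nu) = \big|1 - 1/A_s^{(0)}(\tau,\nu)\big|$ (with $A_s^{(0)}(\tau,\nu)$ the ambiguity function, see Section B.2.4) will be of specific importance. Here, using the fact that the ambiguity function is covariant to metaplectic transforms, it can be shown that
$$ \tilde\phi_s(\tau,\nu) = \big|1 - A_s^{(0)}(d\tau - b\nu,\; a\nu - c\tau)\big| = \big|1 - A_{\tilde s}^{(0)}(\tau,\nu)\big| = \phi_{\tilde s}(\tau,\nu) , \tag{2.22} $$
$$ \tilde\psi_s(\tau,\nu) = \left| 1 - \frac{1}{A_s^{(0)}(d\tau - b\nu,\; a\nu - c\tau)} \right| = \left| 1 - \frac{1}{A_{\tilde s}^{(0)}(\tau,\nu)} \right| = \psi_{\tilde s}(\tau,\nu) , \tag{2.23} $$
with $\tilde s(t) = (U^{+} s)(t)$. Thus, we here obtain $m_{U H U^{+}}^{(\phi_s)} = m_H^{(\phi_{U^{+} s})}$ and $m_{U H U^{+}}^{(\psi_s)} = m_H^{(\psi_{U^{+} s})}$.

Further special cases of the above result $M_{\tilde H}^{(\phi)} = M_H^{(\tilde\phi)}$ which are of particular interest are the following expressions for the temporal and spectral displacement spread,
$$ \tau_{\tilde H}^2 = d^2 \tau_H^2 + b^2 \nu_H^2 - 2bd\, \kappa_H , \qquad \nu_{\tilde H}^2 = c^2 \tau_H^2 + a^2 \nu_H^2 - 2ac\, \kappa_H . \tag{2.24} $$
In a similar manner, we obtain for the displacement correlation
$$ \kappa_{\tilde H} = -cd\, \tau_H^2 - ab\, \nu_H^2 + (ad + bc)\, \kappa_H . \tag{2.25} $$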
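Table 2.2: The effect of specific unitary transformations $U \in \mathcal M$ (corresponding to area-preserving linear TF coordinate transforms $A$) on the displacement spread parameters $\tau_H^2$, $\nu_H^2$, and $\kappa_H$.

Transformation | $A$ | $(Ux)(t)$ | $\tau_{\tilde H}^2$ | $\nu_{\tilde H}^2$ | $\kappa_{\tilde H}$
---------------|-----|-----------|---------------------|--------------------|-------------------
TF scaling | $\begin{pmatrix} 1/d & 0 \\ 0 & d \end{pmatrix}$ | $\frac{1}{\sqrt{|d|}}\, x\big(\frac{t}{d}\big)$ | $\tau_H^2 / a^2$ | $a^2 \nu_H^2$ | $\kappa_H$
Fourier transform | $\begin{pmatrix} 0 & T^2 \\ -1/T^2 & 0 \end{pmatrix}$ | $\frac{1}{T}\, X\big(\frac{t}{T^2}\big)$ | $T^4 \nu_H^2$ | $\tau_H^2 / T^4$ | $-\kappa_H$
Chirp multiplication | $\begin{pmatrix} 1 & 0 \\ c & 1 \end{pmatrix}$ | $e^{j\pi c t^2}\, x(t)$ | $\tau_H^2$ | $\nu_H^2 - 2c\,\kappa_H + c^2 \tau_H^2$ | $\kappa_H - c\, \tau_H^2$
Chirp convolution | $\begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix}$ | $\frac{1}{\sqrt{|b|}}\, x(t) * e^{j\pi t^2 / b}$ | $\tau_H^2 - 2b\,\kappa_H + b^2 \nu_H^2$ | $\nu_H^2$ | $\kappa_H - b\, \nu_H^2$

Simplifications of these expressions for the case of specific unitary transformations are shown in Table 2.2. The above three relations can be rewritten more compactly in terms of the displacement spread matrix,
$$ D_{\tilde H}^{(T)} = A_T^{-1}\, D_H^{(T)}\, A_T^{-T} , $$
where
$$ A_T = \begin{pmatrix} 1/T & 0 \\ 0 & T \end{pmatrix} \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} T & 0 \\ 0 & 1/T \end{pmatrix} $$
and $A_T^{-T}$ denotes the transpose of $A_T^{-1}$. With $\det A_T = 1$, it follows that
$$ \theta_{\tilde H}^2 = \det D_{\tilde H}^{(T)} = \det\big( A_T^{-1}\, D_H^{(T)}\, A_T^{-T} \big) = \frac{\det D_H^{(T)}}{\det^2 A_T} = \det D_H^{(T)} = \theta_H^2 , \tag{2.26} $$
i.e., the determinant of the displacement spread matrix is invariant with respect to metaplectic unitary operator transformations.

It is difficult to obtain general expressions for the moments of unitarily transformed systems. However, the following results for specific metaplectic transformations (cf. Table 2.2) can be found:

TF scaling: $m_{\tilde H}^{(k,l)} = |a|^{l-k}\, m_H^{(k,l)}$, $\quad M_{\tilde H}^{(k,l)} = |a|^{l-k}\, M_H^{(k,l)}$,
Fourier transform: $m_{\tilde H}^{(k,l)} = |b|^{k-l}\, m_H^{(l,k)}$, $\quad M_{\tilde H}^{(k,l)} = |b|^{k-l}\, M_H^{(l,k)}$,
Chirp multiplication: $m_{\tilde H}^{(k,0)} = m_H^{(k,0)}$, $\quad M_{\tilde H}^{(k,0)} = M_H^{(k,0)}$,
Chirp convolution: $m_{\tilde H}^{(0,l)} = m_H^{(0,l)}$, $\quad M_{\tilde H}^{(0,l)} = M_H^{(0,l)}$.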
Minimization of $\tau_H^2 \nu_H^2$. The following interesting result yields a closed-form solution for the minimum of the product $\tau_{\tilde H}\, \nu_{\tilde H}$ over all metaplectic operators. Its derivation is based on the determinant $\theta_H^2$ of the displacement spread matrix $D_H^{(T)}$.

Theorem 2.6. The minimum of $\tau_{\tilde H}\, \nu_{\tilde H}$, where $\tilde H = U H U^{+}$, over all metaplectic operators $U \in \mathcal M$ is equal to
$$ \inf_{U \in \mathcal M} \{ \tau_{\tilde H}\, \nu_{\tilde H} \} = \sqrt{ \det D_H^{(T)} } = \theta_H $$
and is achieved by the following systems (among others, in particular all their TF scaled and Fourier transformed versions):

• $\tilde H_1 = F_{\phi_0}^{(T)}\, H\, F_{\phi_0}^{(T)+}$, where $F_{\phi_0}^{(T)} = \mu\big(R_{\phi_0}^{(T)}\big)$ is the fractional Fourier transform operator corresponding to the rotation matrix $R_{\phi_0}^{(T)}$ (cf. Appendix C) with the angle $\phi_0$ given by
$$ \phi_0 = \frac{1}{2} \arctan \frac{2\, \kappa_H}{T^2 \nu_H^2 - \tau_H^2 / T^2} . $$
• $\tilde H_2 = C_c\, H\, C_c^{+}$, where $C_c$ is the chirp multiplication operator (cf. Appendix C) corresponding to the matrix $\begin{pmatrix} 1 & 0 \\ c & 1 \end{pmatrix}$ with chirp rate $c = \kappa_H / \tau_H^2$.
• $\tilde H_3 = B_b\, H\, B_b^{+}$, where $B_b$ is the chirp convolution operator (cf. Appendix C) corresponding to the matrix $\begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix}$ with chirp rate $1/b = \nu_H^2 / \kappa_H$.

Proof. The proof of this theorem is based on the fact that, as expressed by (2.26), the determinant of the displacement spread matrix is invariant to area-preserving linear coordinate transforms. Eq. (2.26) can be rewritten as $\theta_{\tilde H}^2 = \tau_{\tilde H}^2\, \nu_{\tilde H}^2 - \kappa_{\tilde H}^2 = \theta_H^2$, from which $\tau_{\tilde H}\, \nu_{\tilde H} = \sqrt{ \theta_H^2 + \kappa_{\tilde H}^2 }$. For fixed $H$, this expression is obviously minimized by minimizing $|\kappa_{\tilde H}|$. In fact, since there are three free parameters in (2.25) (one of the four parameters $a$, $b$, $c$, and $d$ is related to the other three by the condition $\det A = 1$), it is possible to choose two parameters arbitrarily and solve for the third free parameter such that $\kappa_{\tilde H} = 0$. Instead of giving the general solution (which actually consists of all matrices diagonalizing $D_H^{(T)}$), we discuss three important special solutions that achieve $\kappa_{\tilde H} = 0$.

First, we restrict the minimization to rotational transforms of the type
$$ R_\phi^{(T)} = \begin{pmatrix} \cos\phi & T^2 \sin\phi \\ -(\sin\phi)/T^2 & \cos\phi \end{pmatrix} $$
with $\phi$, the rotation angle, now being the only parameter (apart from the normalization constant $T$). The corresponding metaplectic representation $U$ is given by the fractional Fourier transform operator. With the assumed form of $A$, $\kappa_{\tilde H}$ can be simplified as follows (cf. (2.25)):
$$ \kappa_{\tilde H} = \sin\phi \cos\phi \Big( \frac{\tau_H^2}{T^2} - T^2 \nu_H^2 \Big) + (\cos^2\phi - \sin^2\phi)\, \kappa_H = \frac{1}{2} \Big( \frac{\tau_H^2}{T^2} - T^2 \nu_H^2 \Big) \sin 2\phi + \kappa_H \cos 2\phi $$
$$ = \left[ \frac{1}{4} \Big( \frac{\tau_H^2}{T^2} - T^2 \nu_H^2 \Big)^2 + \kappa_H^2 \right]^{1/2} \sin\!\left( 2\phi + \arctan \frac{2\, \kappa_H}{\tau_H^2 / T^2 - T^2 \nu_H^2} \right) . $$
It is thus seen that $\kappa_{\tilde H} = 0$ can be achieved via a rotation of the GSF by an angle $\phi_0 = \frac{1}{2} \arctan \frac{2 \kappa_H}{T^2 \nu_H^2 - \tau_H^2/T^2}$, which solves the minimization of $\tau_{\tilde H}\, \nu_{\tilde H}$. Note that this rotation, up to the normalization constant $T$, is an orthogonal transform that diagonalizes $D_H^{(T)}$, thus corresponding to the eigenvector matrix of $D_H^{(T)}$.

Alternatively, the minimization of $\kappa_{\tilde H}$ can be performed using chirp multiplications or chirp convolutions. It is seen from (2.25) that $\kappa_{\tilde H} = 0$ can be achieved by a chirp multiplication by choosing $a = d = 1$, $b = 0$, $c = \kappa_H / \tau_H^2$, or by a chirp convolution by setting $a = d = 1$, $c = 0$, $b = \kappa_H / \nu_H^2$, thus yielding further systems with minimal $\tau_{\tilde H}\, \nu_{\tilde H}$. Finally, since TF scalings and Fourier transforms do not affect $\tau_{\tilde H}\, \nu_{\tilde H}$, any such modification of a system with minimal $\tau_{\tilde H}\, \nu_{\tilde H}$ yields again a system with minimal $\tau_{\tilde H}\, \nu_{\tilde H}$.

Hence, any system $H$ can be unitarily transformed such that the resulting system $\tilde H = U H U^{+}$ has $\kappa_{\tilde H} = 0$ and $\tau_{\tilde H}\, \nu_{\tilde H}$ achieves the lower bound $\theta_H$.
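The statement of Theorem 2.6 can be verified numerically by applying the transformation rules for the squared temporal spread, the squared spectral spread, and the displacement correlation, i.e. $\tilde\tau^2 = d^2\tau^2 + b^2\nu^2 - 2bd\,\kappa$, $\tilde\nu^2 = c^2\tau^2 + a^2\nu^2 - 2ac\,\kappa$, and $\tilde\kappa = -cd\,\tau^2 - ab\,\nu^2 + (ad+bc)\,\kappa$ for a unit-determinant matrix with entries $a, b, c, d$. The sketch below is an illustration under assumptions: the function name `transform_spreads`, the example parameter values, the choice $T = 1$, and the sign convention $b = T^2 \sin\phi$, $c = -(\sin\phi)/T^2$ for the rotation are ours.

```python
import math


def transform_spreads(tau2, nu2, kappa, a, b, c, d):
    """Map (tau^2, nu^2, kappa) under the symplectic matrix [[a, b], [c, d]]."""
    if abs(a * d - b * c - 1.0) > 1e-9:
        raise ValueError("the matrix must have unit determinant")
    tau2_t = d * d * tau2 + b * b * nu2 - 2.0 * b * d * kappa
    nu2_t = c * c * tau2 + a * a * nu2 - 2.0 * a * c * kappa
    kappa_t = -c * d * tau2 - a * b * nu2 + (a * d + b * c) * kappa
    return tau2_t, nu2_t, kappa_t


# Hypothetical displacement spread parameters of some system H.
tau2, nu2, kappa = 8.0 / 30.0, 2.0 / 30.0, 1.0 / 30.0
theta = math.sqrt(tau2 * nu2 - kappa ** 2)   # square root of the determinant

T = 1.0
phi0 = 0.5 * math.atan(2.0 * kappa / (T ** 2 * nu2 - tau2 / T ** 2))
candidates = {
    "rotation": (math.cos(phi0), T ** 2 * math.sin(phi0),
                 -math.sin(phi0) / T ** 2, math.cos(phi0)),
    "chirp multiplication": (1.0, 0.0, kappa / tau2, 1.0),
    "chirp convolution": (1.0, kappa / nu2, 0.0, 1.0),
}

results = {}
for name, (a, b, c, d) in candidates.items():
    t2, n2, k = transform_spreads(tau2, nu2, kappa, a, b, c, d)
    results[name] = (math.sqrt(t2 * n2), k)
    # Each transform removes the displacement correlation and attains theta.
    print(name, results[name], theta)
```

In this example the product of the displacement spreads before transformation is $4/30 \approx 0.133$, whereas all three transformed systems attain the smaller value $\sqrt{1/60} \approx 0.129$ predicted by the determinant.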
2.2.4 Underspread Systems

The GSF integrals $m_H^{(\phi)}$ and $M_H^{(\phi)}$ and, specifically, the GSF moments $m_H^{(k,l)}$ and $M_H^{(k,l)}$ measure the spread of $|S_H(\tau,\nu)|$ about the origin of the $(\tau,\nu)$ plane. Hence, without being forced to assume that the GSF has finite support, we can consider a system $H$ to be underspread if suitable GSF integrals/moments are small. LTV systems that are not underspread are called overspread. We note that this is not a clear-cut definition of underspread systems. Indeed, we will see in Section 2.3 that bounds on various error quantities associated with specific transfer function approximations can be formulated using different GSF integrals/moments. Hence, in order that these approximation errors be small, different GSF integrals/moments are required to be small. This is somewhat similar to the many different quantities that can be used to measure the effective bandwidth of a signal. Our concept of underspread systems is thus more complicated but also more flexible than Kozek's definition that was based on the area of the (assumedly) compact support of the GSF. It is precisely this flexibility which, in Section 2.3, will allow us to establish error bounds for the transfer function approximation without unnecessarily restrictive assumptions.

To be specific, several of the following conditions satisfied by underspread systems will be of importance in Section 2.3:
$$ m_H^{(1,1)} \ll 1 , \qquad M_H^{(1,1)} \ll 1 , $$
$$ m_H^{(1,0)}\, m_H^{(0,1)} \ll 1 , \qquad \tau_H \nu_H = M_H^{(1,0)}\, M_H^{(0,1)} \ll 1 , $$
$$ m_H^{(\phi_s)} \ll 1 , \qquad M_H^{(\phi_s)} \ll 1 , \tag{2.27} $$
where in the last line the weighting function $\phi_s(\tau,\nu)$ is either $|1 - A_s^{(0)}(\tau,\nu)|$ or $\big|1 - 1/A_s^{(0)}(\tau,\nu)\big|$. Examples of underspread systems satisfying the above constraints are illustrated in Fig. 2.3. It should be noted that the concept of underspread systems is not equivalent to that of slowly time-varying (quasi-LTI) systems, which requires $|S_H(\tau,\nu)|$ to be narrow with respect to $\nu$ (i.e., small moments $m_H^{(0,l)}$ and $M_H^{(0,l)}$, see part (d) of Fig. 2.3). In particular, a slowly time-varying system may be overspread (i.e., not underspread) if its memory is very long, while a system with fast time-variations may be underspread if its memory is short enough.

The conditions in the first two lines of (2.27) do not allow the GSF to be oriented in oblique directions (for the weighted GSF integrals in the last line, this depends on the choice of $s(t)$). Yet, in some cases (especially when deriving transfer function approximations for the case $\alpha = 0$), oblique orientations of the GSF will be allowed. Often specific moments of a system $H$ are rather large whereas the same moments of a unitarily equivalent system $U H U^{+}$ with $U \in \mathcal M$ are small. Examples of such systems include systems having GSF oriented along oblique directions (see Fig. 2.3(c)). To cover these cases in a satisfactory way, only the minima of certain moment quantities among all systems related by metaplectic transforms may be required to be small, e.g.,
$$ \inf_{U \in \mathcal M} m_{U H U^{+}}^{(1,0)}\, m_{U H U^{+}}^{(0,1)} \ll 1 , \qquad \inf_{U \in \mathcal M} \tau_{U H U^{+}}\, \nu_{U H U^{+}} \ll 1 . \tag{2.28} $$
Here, an appropriate $U \in \mathcal M$ produces a coordinate transform such that the oblique orientation of $|S_H(\tau,\nu)|$ is converted into an orientation of $|S_{U H U^{+}}(\tau,\nu)|$ along the $\tau$ axis and/or $\nu$ axis (cf. Theorem 2.6).

Figure 2.3: Schematic representation of the GSF magnitude of (a) an underspread system with small $m_H^{(1,1)}$ and $M_H^{(1,1)}$; (b) an underspread system with small $m_H^{(1,0)}\, m_H^{(0,1)}$ and $M_H^{(1,0)}\, M_H^{(0,1)}$; (c) an underspread system with small $m_{U H U^{+}}^{(1,0)}\, m_{U H U^{+}}^{(0,1)}$ for $U$ corresponding to a rotation; (d) a quasi-LTI system (small $m_H^{(0,l)}$ and $M_H^{(0,l)}$); (e) a quasi-LFI system (small $m_H^{(k,0)}$ and $M_H^{(k,0)}$).
2.2.5
Operator Sums, Adjoints, Products, and Inverses
We conclude our discussion of operators with rapidly decaying GSF by considering the weighted GSF integrals and moments of sums, adjoints, products, and inverses of such operators. Sum. Using the inequality |SH1 +H2 (, )| = |SH1 (, ) + SH2 (, )| |SH1 (, )| + |SH2 (, )| , one straightforwardly obtains mH1 +H2 and, as a special case, mH1 +H2
(k,l) ()
SH1 1 SH1 + SH2
mH1 +
()
SH2 1 SH1 + SH2
mH2
()
SH1 1 SH1 + SH2
mH1 +
(k,l)
SH2 1 SH1 + SH2
mH2 .
(k,l)
(2.29)
Using the fact that |SH1 +H2 (, )|2 2|SH1 (, )|2 + 2|SH2 (, )|2 (due to the Schwarz inequality | x, y |2 x 2 y 2 with x = (SH1 (, ) SH2 (, ))T and y = (1 1)T ) and the inequality a2 + b2 a + b (valid for a 0, b 0), one can furthermore show that
() MH1 +H2
2 H1 2 2 H2 2 () () M + M H1 + H2 2 H1 H1 + H2 2 H2
33
and
(k,l) MH1 +H2
2 H1 2 2 H2 2 (k,l) (k,l) MH1 + M . H1 + H2 2 H1 + H2 2 H2

1
These bounds imply that the sum of two underspread operators will again be underspread. Furthermore, if the norms $\|S_{H_1}\|_1$ and $\|S_{H_2}\|_1$ or the norms $\|H_1\|_2$ and $\|H_2\|_2$ are not similar, then the moments of the operator sum tend to be determined by the moments of the operator with the larger norm. Therefore, also the sum of two operators where only one is underspread may be underspread if the underspread operator dominates the other one in an appropriate sense. This property is in striking contrast to DL operators (cf. Subsection 2.1.4).

We also generalize the concept of jointly underspread systems, introduced for DL operators in Definition 2.3, using a moment-based approach. In particular, we call two systems $H_1$ and $H_2$ jointly underspread if both systems are individually underspread and if in addition also their sum $H_1 + H_2$ is underspread. With regard to the underspread condition $m^{(1,0)}_{H_1+H_2}\, m^{(0,1)}_{H_1+H_2} \ll 1$, it follows from (2.29) that
\[ \begin{aligned} m^{(1,0)}_{H_1+H_2}\, m^{(0,1)}_{H_1+H_2} &\le \left[ \frac{\|S_{H_1}\|_1}{\|S_{H_1}+S_{H_2}\|_1}\, m^{(1,0)}_{H_1} + \frac{\|S_{H_2}\|_1}{\|S_{H_1}+S_{H_2}\|_1}\, m^{(1,0)}_{H_2} \right] \left[ \frac{\|S_{H_1}\|_1}{\|S_{H_1}+S_{H_2}\|_1}\, m^{(0,1)}_{H_1} + \frac{\|S_{H_2}\|_1}{\|S_{H_1}+S_{H_2}\|_1}\, m^{(0,1)}_{H_2} \right] \\ &= \frac{\|S_{H_1}\|_1^2}{\|S_{H_1}+S_{H_2}\|_1^2}\, m^{(1,0)}_{H_1} m^{(0,1)}_{H_1} + \frac{\|S_{H_2}\|_1^2}{\|S_{H_1}+S_{H_2}\|_1^2}\, m^{(1,0)}_{H_2} m^{(0,1)}_{H_2} \\ &\quad + \frac{\|S_{H_1}\|_1 \|S_{H_2}\|_1}{\|S_{H_1}+S_{H_2}\|_1^2} \left[ m^{(1,0)}_{H_1} m^{(0,1)}_{H_2} + m^{(1,0)}_{H_2} m^{(0,1)}_{H_1} \right]. \end{aligned} \]
Thus it is seen that in addition to $m^{(1,0)}_{H_1} m^{(0,1)}_{H_1} \ll 1$ and $m^{(1,0)}_{H_2} m^{(0,1)}_{H_2} \ll 1$, jointly underspread systems have to satisfy the condition
\[ m^{(1,0)}_{H_1} m^{(0,1)}_{H_2} + m^{(1,0)}_{H_2} m^{(0,1)}_{H_1} \ll 1. \tag{2.30} \]
This latter requirement will become particularly important in Subsection 2.3.4. In essence, the concept of jointly underspread implies that, in addition to the systems being individually underspread, their memory lengths and amounts of temporal variation are respectively comparable. For example, an LTI system with long memory (i.e., large $m^{(1,0)}_{H_{\rm LTI}}$) and an LFI system with rapidly varying multiplication function (i.e., large $m^{(0,1)}_{H_{\rm LFI}}$), although being individually underspread, are not jointly underspread since $m^{(1,0)}_{H_{\rm LTI}}\, m^{(0,1)}_{H_{\rm LFI}}$ will be large in that case.

Adjoint. With $S^{(\alpha)}_{H^+}(\tau,\nu) = S^{(-\alpha)*}_{H}(-\tau,-\nu)$, we obtain
\[ m^{(\phi)}_{H^+} = m^{(\tilde\phi)}_{H}, \qquad M^{(\phi)}_{H^+} = M^{(\tilde\phi)}_{H}, \]
with $\tilde\phi(\tau,\nu) = \phi(-\tau,-\nu)$, which further specializes to
\[ m^{(k,l)}_{H^+} = m^{(k,l)}_{H}, \qquad M^{(k,l)}_{H^+} = M^{(k,l)}_{H}. \]
Product. The GSF of the product of two operators is given by the so-called twisted convolution (cf. Eq. (B.12) in Section B.1.1) which is difficult to analyze in general. However, a simple consequence of (B.12) is
\[ |S_{H_2H_1}(\tau,\nu)| \le |S_{H_1}(\tau,\nu)| \ast\ast\, |S_{H_2}(\tau,\nu)|, \tag{2.31} \]
with $\ast\ast$ denoting 2-D convolution. Unfortunately, this bound is often very loose. Simply think of a unitary system $U$ and its adjoint $U^+$ which can have arbitrarily large TF shifts. In this case, $UU^+ = I$, whose GSF is perfectly concentrated at the origin of the $(\tau,\nu)$ plane although the above inequality suggests that the support of $S^{(\alpha)}_{H_2H_1}(\tau,\nu)$ may be larger than the support of $S^{(\alpha)}_{U}(\tau,\nu)$ or that of $S^{(\alpha)}_{U^+}(\tau,\nu)$. Using (2.31) we can derive bounds on the moments of the system $H_2H_1$. With the normalization $\bar{m}^{(k,l)}_H = m^{(k,l)}_H/(k!\, l!)$, we arrive at the following
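Proposition 2.7. The normalized moments of the product $H_2H_1$ are bounded by the 2-D convolution of the normalized moments of $H_1$ and $H_2$ up to order $k$ and $l$,
\[ \bar{m}^{(k,l)}_{H_2H_1} \le \frac{\|S_{H_1}\|_1 \|S_{H_2}\|_1}{\|S_{H_2H_1}\|_1} \sum_{i=0}^{k} \sum_{j=0}^{l} \bar{m}^{(i,j)}_{H_1}\, \bar{m}^{(k-i,l-j)}_{H_2}. \tag{2.32} \]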
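Proof. Inserting (2.31) into the moment definition (2.13-a), one has
\[ \begin{aligned} \|S_{H_2H_1}\|_1\, m^{(k,l)}_{H_2H_1} &\le \iint \left[ \iint |S_{H_1}(\tau_1,\nu_1)|\, |S_{H_2}(\tau-\tau_1,\nu-\nu_1)|\, d\tau_1\, d\nu_1 \right] |\tau|^k |\nu|^l\, d\tau\, d\nu \\ &= \iiiint |S_{H_1}(\tau_1,\nu_1)|\, |S_{H_2}(\tau_2,\nu_2)|\, |\tau_1+\tau_2|^k\, |\nu_1+\nu_2|^l\, d\tau_1\, d\nu_1\, d\tau_2\, d\nu_2 \\ &\le \iiiint |S_{H_1}(\tau_1,\nu_1)|\, |S_{H_2}(\tau_2,\nu_2)| \left[ \sum_{i=0}^{k} \binom{k}{i} |\tau_1|^i |\tau_2|^{k-i} \right] \left[ \sum_{j=0}^{l} \binom{l}{j} |\nu_1|^j |\nu_2|^{l-j} \right] d\tau_1\, d\nu_1\, d\tau_2\, d\nu_2 \\ &= \sum_{i=0}^{k} \sum_{j=0}^{l} \frac{k!\, l!}{i!\, j!\, (k-i)!\, (l-j)!} \iint |S_{H_1}(\tau_1,\nu_1)|\, |\tau_1|^i |\nu_1|^j\, d\tau_1\, d\nu_1 \iint |S_{H_2}(\tau_2,\nu_2)|\, |\tau_2|^{k-i} |\nu_2|^{l-j}\, d\tau_2\, d\nu_2 \\ &= \sum_{i=0}^{k} \sum_{j=0}^{l} \frac{k!\, l!}{i!\, j!\, (k-i)!\, (l-j)!}\, \|S_{H_1}\|_1\, m^{(i,j)}_{H_1}\, \|S_{H_2}\|_1\, m^{(k-i,l-j)}_{H_2} \\ &= \|S_{H_1}\|_1 \|S_{H_2}\|_1\, k!\, l! \sum_{i=0}^{k} \sum_{j=0}^{l} \bar{m}^{(i,j)}_{H_1}\, \bar{m}^{(k-i,l-j)}_{H_2}, \end{aligned} \]
which finally gives (2.32).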
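Important special cases of this result are obtained for $(k,l) = (1,0)$, $(k,l) = (0,1)$, and $(k,l) = (1,1)$:
\[ m^{(1,0)}_{H_2H_1} \le \frac{\|S_{H_1}\|_1 \|S_{H_2}\|_1}{\|S_{H_2H_1}\|_1} \left[ m^{(1,0)}_{H_1} + m^{(1,0)}_{H_2} \right], \tag{2.33} \]
\[ m^{(0,1)}_{H_2H_1} \le \frac{\|S_{H_1}\|_1 \|S_{H_2}\|_1}{\|S_{H_2H_1}\|_1} \left[ m^{(0,1)}_{H_1} + m^{(0,1)}_{H_2} \right], \tag{2.34} \]
\[ m^{(1,1)}_{H_2H_1} \le \frac{\|S_{H_1}\|_1 \|S_{H_2}\|_1}{\|S_{H_2H_1}\|_1} \left[ m^{(1,1)}_{H_1} + m^{(1,0)}_{H_1} m^{(0,1)}_{H_2} + m^{(0,1)}_{H_1} m^{(1,0)}_{H_2} + m^{(1,1)}_{H_2} \right]. \tag{2.35} \]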
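The chain of inequalities in the proof of Proposition 2.7 can be illustrated numerically for the majorant $|S_{H_1}| \ast\ast\, |S_{H_2}|$ of (2.31). The sketch below is only an illustration under stated assumptions: the magnitude arrays `A1` and `A2`, the grid spacing `step`, and all helper names are hypothetical, and both arrays are assumed to be sampled on odd-sized grids centered at the origin. Since the $L_1$ norm of a convolution of nonnegative functions equals the product of the $L_1$ norms, the normalized moments of the majorant are bounded by the double sum in (2.32) without the norm ratio.

```python
import numpy as np
from math import factorial

def conv2_full(A, B):
    """Full 2-D convolution of two nonnegative arrays (direct summation)."""
    out = np.zeros((A.shape[0] + B.shape[0] - 1, A.shape[1] + B.shape[1] - 1))
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            out[i:i + B.shape[0], j:j + B.shape[1]] += A[i, j] * B
    return out

def norm_moment(A, step, k, l):
    """Normalized moment m^(k,l)/(k! l!) of a magnitude array centered at the origin."""
    n0, n1 = (A.shape[0] - 1) // 2, (A.shape[1] - 1) // 2
    tau = step * (np.arange(A.shape[0]) - n0)
    nu = step * (np.arange(A.shape[1]) - n1)
    T, N = np.meshgrid(tau, nu, indexing="ij")
    m = np.sum(np.abs(T) ** k * np.abs(N) ** l * A) / np.sum(A)
    return m / (factorial(k) * factorial(l))

def product_bound(A1, A2, step, k, l):
    """Double sum on the right-hand side of (2.32), without the norm ratio."""
    return sum(norm_moment(A1, step, i, j) * norm_moment(A2, step, k - i, l - j)
               for i in range(k + 1) for j in range(l + 1))

step = 0.1
rng = np.random.default_rng(1)
A1 = rng.random((9, 9))     # hypothetical samples of |S_H1|
A2 = rng.random((13, 13))   # hypothetical samples of |S_H2|
C = conv2_full(A1, A2)      # majorant of |S_H2H1| according to (2.31)

print(norm_moment(C, step, 1, 1) <= product_bound(A1, A2, step, 1, 1))
```

The check concerns the majorant only; the actual GSF of $H_2H_1$ is given by the twisted convolution and can be much more concentrated, as the example $UU^+ = I$ shows.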
Inverse. In general, it is difficult to characterize the moments of the inverse of an operator. From the Neumann series [158]
\[ H^{-1} = \sum_{k=0}^{\infty} (I - H)^k, \tag{2.36} \]
it is seen that, due to the higher powers of $(I - H)$, it is possible that the support of the GSF of $H^{-1}$ will grow ad infinitum. Yet, under sufficient regularity conditions the inverse of an underspread operator can also be underspread in the sense of having small moments. Inverses of underspread operators are further discussed in Subsection 2.3.6.
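The growth of the support with the powers of $(I - H)$ in (2.36) has a simple finite-dimensional analogue. The following sketch uses a banded matrix as a stand-in for an operator with short memory; treating the matrix bandwidth as a proxy for the extent of the GSF support in the $\tau$ direction is an assumption made here for illustration, and the matrix `H` and the helper names are hypothetical.

```python
import numpy as np

def neumann_partial_sum(H, num_terms):
    """Partial sum of the Neumann series: sum_{k=0}^{num_terms-1} (I - H)^k."""
    R = np.eye(H.shape[0]) - H
    term = np.eye(H.shape[0])
    total = np.zeros_like(H, dtype=float)
    for _ in range(num_terms):
        total += term
        term = term @ R
    return total

def bandwidth(M, tol=1e-12):
    """Largest |i - j| for which the entry M[i, j] is non-negligible."""
    rows, cols = np.nonzero(np.abs(M) > tol)
    return int(np.max(np.abs(rows - cols))) if rows.size else 0

n = 8
shift = np.eye(n, k=1)
H = np.eye(n) - 0.2 * (shift + shift.T)   # tridiagonal, close to the identity
R = np.eye(n) - H

# The bandwidth of (I - H)^k grows linearly with k.
widths = [bandwidth(np.linalg.matrix_power(R, k)) for k in (1, 2, 3)]

# The series converges here because the norm of I - H is below 1.
H_inv_approx = neumann_partial_sum(H, 80)
print(widths, np.allclose(H_inv_approx @ H, np.eye(n)))
```

Although each power has a wider band, the entries decay geometrically, which hints at why the inverse can still have small moments under suitable regularity conditions.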
2.2.6 Non-Band-Limited Parts of Operators with Rapidly Decaying Spreading Function
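In this section, we will relate DL operators and operators with rapidly decaying GSF by studying the norms of the non-DL part of the latter. To this end, recall the splitup (2.4) of any operator $H$ into its DL part $H^{G}$ and its non-DL part $H^{\bar G}$. The next few results derive bounds on the norms of $S_{H^{\bar G}}(\tau,\nu)$ for regions $G$ of different shape using a Chebyshev inequality-like approach$^4$. The bounds on $\|S_{H^{\bar G}}\|_1 = \iint_{\bar G} |S_H(\tau,\nu)|\, d\tau\, d\nu$ and $\|H^{\bar G}\|_2^2 = \|S_{H^{\bar G}}\|_2^2 = \iint_{\bar G} |S_H(\tau,\nu)|^2\, d\tau\, d\nu$ will be important in Section 2.3. Furthermore, they serve as a measure of the errors made when replacing $H$ by its DL part $H^{G}$. Indeed, the difference $L^{(\alpha)}_H(t,f) - L^{(\alpha)}_{H^{G}}(t,f)$ satisfies
\[ \frac{\big| L^{(\alpha)}_H(t,f) - L^{(\alpha)}_{H^{G}}(t,f) \big|}{\|S_H\|_1} = \frac{\big| L^{(\alpha)}_{H^{\bar G}}(t,f) \big|}{\|S_H\|_1} \le \frac{\|S_{H^{\bar G}}\|_1}{\|S_H\|_1}, \qquad \frac{\big\| L^{(\alpha)}_H - L^{(\alpha)}_{H^{G}} \big\|_2}{\|H\|_2} = \frac{\|H^{\bar G}\|_2}{\|H\|_2}. \tag{2.37} \]
Furthermore, using the definition (A.1) of the operator norm $\|H\|_O$ and the fact that $\|H\|_O \le \|H\|_2$, the $L_2$ norm of the difference $(Hx)(t) - (H^{G}x)(t)$ can be bounded as
\[ \frac{\|Hx - H^{G}x\|_2}{\|H\|_2 \|x\|_2} = \frac{\|H^{\bar G}x\|_2}{\|H\|_2 \|x\|_2} \le \frac{\|H^{\bar G}\|_O\, \|x\|_2}{\|H\|_2 \|x\|_2} \le \frac{\|H^{\bar G}\|_2}{\|H\|_2}. \]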
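The first result considers rectangular support constraints (see Fig. 2.1(a)).

Proposition 2.8. For any rectangular region $G \triangleq [-\tau_G, \tau_G] \times [-\nu_G, \nu_G]$, the non-DL part $H^{\bar G}$ of any operator $H$ is bounded as
\[ \frac{\|S_{H^{\bar G}}\|_1}{\|S_H\|_1} \le \frac{m^{(1,0)}_H}{\tau_G} + \frac{m^{(0,1)}_H}{\nu_G}, \qquad \frac{\|H^{\bar G}\|_2}{\|H\|_2} \le \left[ \left( \frac{M^{(1,0)}_H}{\tau_G} \right)^2 + \left( \frac{M^{(0,1)}_H}{\nu_G} \right)^2 \right]^{1/2} \le \frac{M^{(1,0)}_H}{\tau_G} + \frac{M^{(0,1)}_H}{\nu_G}. \tag{2.38} \]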
$^4$ We are grateful to Prof. W. Mecklenbräuker for drawing our attention to this method.
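Figure 2.4: Illustration of Chebyshev-like inequalities. [Plot of the indicator function $I_{|\tau|>\tau_G}(\tau,\nu)$ together with the majorants $|\tau|/\tau_G$ and $\tau^2/\tau_G^2$ versus $\tau/\tau_G$.]

Proof. The basis for the proof of the above bounds is given by the inequalities
\[ I_{|\tau|>\tau_G}(\tau,\nu) \le \frac{|\tau|}{\tau_G}, \qquad I_{|\nu|>\nu_G}(\tau,\nu) \le \frac{|\nu|}{\nu_G}, \tag{2.39} \]
\[ I_{|\tau|>\tau_G}(\tau,\nu) \le \frac{\tau^2}{\tau_G^2}, \qquad I_{|\nu|>\nu_G}(\tau,\nu) \le \frac{\nu^2}{\nu_G^2}, \tag{2.40} \]
which are illustrated in Fig. 2.4. Using (2.39), the first bound in (2.38) is shown as follows:
\[ \begin{aligned} \|S_{H^{\bar G}}\|_1 &= \iint_{\bar G} |S_H(\tau,\nu)|\, d\tau\, d\nu = \iint_{|\tau|>\tau_G \,\vee\, |\nu|>\nu_G} |S_H(\tau,\nu)|\, d\tau\, d\nu \\ &\le \iint_{|\tau|>\tau_G} |S_H(\tau,\nu)|\, d\tau\, d\nu + \iint_{|\nu|>\nu_G} |S_H(\tau,\nu)|\, d\tau\, d\nu \\ &\le \iint \frac{|\tau|}{\tau_G}\, |S_H(\tau,\nu)|\, d\tau\, d\nu + \iint \frac{|\nu|}{\nu_G}\, |S_H(\tau,\nu)|\, d\tau\, d\nu = \|S_H\|_1 \left[ \frac{m^{(1,0)}_H}{\tau_G} + \frac{m^{(0,1)}_H}{\nu_G} \right]. \end{aligned} \tag{2.41} \]
The HS norm bound can be shown similarly using (2.40),
\[ \begin{aligned} \|H^{\bar G}\|_2^2 = \|S_{H^{\bar G}}\|_2^2 &= \iint_{\bar G} |S_H(\tau,\nu)|^2\, d\tau\, d\nu \le \iint_{|\tau|>\tau_G} |S_H(\tau,\nu)|^2\, d\tau\, d\nu + \iint_{|\nu|>\nu_G} |S_H(\tau,\nu)|^2\, d\tau\, d\nu \\ &\le \iint \frac{\tau^2}{\tau_G^2}\, |S_H(\tau,\nu)|^2\, d\tau\, d\nu + \iint \frac{\nu^2}{\nu_G^2}\, |S_H(\tau,\nu)|^2\, d\tau\, d\nu = \|H\|_2^2 \left[ \left( \frac{M^{(1,0)}_H}{\tau_G} \right)^2 + \left( \frac{M^{(0,1)}_H}{\nu_G} \right)^2 \right]. \end{aligned} \]
The looser bound in (2.38) follows from the inequality $x^2 + y^2 \le (x + y)^2$ for $x, y \ge 0$.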
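The Chebyshev-like bounds of Proposition 2.8 can be illustrated on a sampled GSF magnitude. The following sketch assumes a Gaussian-shaped $|S_H(\tau,\nu)|$ (a hypothetical example, not taken from the text) and compares the relative norms of the part outside a rectangle with the moment-based bounds in (2.38); sums replace integrals, and the grid spacing cancels in all ratios.

```python
import numpy as np

def tail_and_bounds(S_mag, T, N, tau_G, nu_G):
    """Relative L1 and HS norms outside [-tau_G, tau_G] x [-nu_G, nu_G] and the bounds (2.38)."""
    outside = (np.abs(T) > tau_G) | (np.abs(N) > nu_G)
    norm1 = np.sum(S_mag)
    energy = np.sum(S_mag ** 2)
    l1_tail = np.sum(S_mag[outside]) / norm1
    m10 = np.sum(np.abs(T) * S_mag) / norm1
    m01 = np.sum(np.abs(N) * S_mag) / norm1
    l1_bound = m10 / tau_G + m01 / nu_G
    hs_tail = np.sqrt(np.sum(S_mag[outside] ** 2) / energy)
    M10 = np.sqrt(np.sum(T ** 2 * S_mag ** 2) / energy)
    M01 = np.sqrt(np.sum(N ** 2 * S_mag ** 2) / energy)
    hs_bound = np.hypot(M10 / tau_G, M01 / nu_G)
    return l1_tail, l1_bound, hs_tail, hs_bound

tau = np.linspace(-4.0, 4.0, 161)
nu = np.linspace(-4.0, 4.0, 161)
T, N = np.meshgrid(tau, nu, indexing="ij")
S_mag = np.exp(-T ** 2 / (2 * 0.5 ** 2) - N ** 2 / (2 * 0.3 ** 2))  # assumed |S_H|

l1_tail, l1_bound, hs_tail, hs_bound = tail_and_bounds(S_mag, T, N, tau_G=1.0, nu_G=0.6)
print(l1_tail <= l1_bound, hs_tail <= hs_bound)
```

For rapidly decaying GSFs such as this one the bounds are typically far from tight, which is consistent with the remark that the error bounds may be rather coarse.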
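Next, we consider regions $G$ with hyperbolic boundaries (see Fig. 2.1(b)).

Proposition 2.9. For any region $G \triangleq \{(\tau,\nu) : |\tau\nu| \le \kappa_G\}$, the non-DL part $H^{\bar G}$ of any operator $H$ is bounded as
\[ \frac{\|S_{H^{\bar G}}\|_1}{\|S_H\|_1} \le \frac{m^{(1,1)}_H}{\kappa_G}, \qquad \frac{\|H^{\bar G}\|_2}{\|H\|_2} \le \frac{M^{(1,1)}_H}{\kappa_G}. \tag{2.42} \]
Proof. The integral of $|S_H(\tau,\nu)|$ over $\bar G$ can be written as $\iint |S_H(\tau,\nu)|\, I_{\bar G}(\tau,\nu)\, d\tau\, d\nu$ with $I_{\bar G}(\tau,\nu)$ the indicator function of $\bar G$. Using a Chebyshev inequality-like approach, we can bound this indicator function by the weighting function $|\tau\nu|/\kappa_G$, i.e.,$^5$ $I_{\bar G}(\tau,\nu) \le |\tau\nu|/\kappa_G$, and obtain
\[ \|S_{H^{\bar G}}\|_1 = \iint_{\bar G} |S_H(\tau,\nu)|\, d\tau\, d\nu = \iint |S_H(\tau,\nu)|\, I_{\bar G}(\tau,\nu)\, d\tau\, d\nu \le \frac{1}{\kappa_G} \iint |S_H(\tau,\nu)|\, |\tau\nu|\, d\tau\, d\nu = \|S_H\|_1\, \frac{m^{(1,1)}_H}{\kappa_G}. \]
The HS norm bound follows similarly by noting that furthermore $I_{\bar G}(\tau,\nu) \le \tau^2\nu^2/\kappa_G^2$.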
2.3 Underspread Approximations

In this section, we show that for LTV systems that are underspread in the extended sense of Section 2.2, the GWS $L^{(\alpha)}_H(t,f)$ is an approximate TF transfer function that generalizes the spectral (temporal) transfer function of LTI (LFI) systems. As a mathematical underpinning of this approximate TF transfer function calculus, we establish explicit upper bounds$^6$ on the approximation errors associated with it. These bounds are formulated in terms of the GSF integrals and/or moments introduced in Subsection 2.2.2 and do not require the GSF to have finite support. Hence, our subsequent results will show that a GWS-based transfer function calculus is valid for a significantly wider class of systems than that considered previously [118–120].
2.3.1 Approximate Uniqueness of the Generalized Weyl Symbol
In general, two GWSs with different $\alpha$ values will yield different results. For example, the Weyl symbol ($\alpha = 0$) will be different from Zadeh's function ($\alpha = 1/2$). The interrelation of the individual members of the GWS family is given by (B.21) with the dual interrelation of the corresponding members of the GSF family given by (B.5). This situation is different from the LTI and LFI cases (where the transfer function of a given system is uniquely defined) and may be considered an inconvenience since in practice it may often not be clear which $\alpha$ value should be selected. Fortunately, the subsequently derived bounds on the GWS's $\alpha$-dependence show that in the underspread case the choice of $\alpha$ is not critical. These bounds extend existing bounds of Kozek [118]. Furthermore, results in a similar spirit have been derived by Kohn and Nirenberg [112] for the special cases $\alpha = \pm 1/2$ and by Folland [64] for the difference between Weyl symbol and Zadeh's time-varying transfer function.

$^5$ That $I_{\bar G}(\tau,\nu) \le |\tau\nu|/\kappa_G$ is seen as follows: for $(\tau,\nu) \in G$ one has $|\tau\nu|/\kappa_G \ge 0 = I_{\bar G}(\tau,\nu)$, and for $(\tau,\nu) \in \bar G$, i.e., $|\tau\nu| > \kappa_G$, one has $|\tau\nu|/\kappa_G > 1 = I_{\bar G}(\tau,\nu)$.
$^6$ We note that in some situations our error bounds may be rather coarse. However, in practical situations they are still useful as there are often no other ways to assess the accuracy of specific transfer function approximations.

Theorem 2.10. For any LTV system $H$, the difference
\[ \Delta_1^{(\alpha_1,\alpha_2)}(t,f) \triangleq L^{(\alpha_1)}_H(t,f) - L^{(\alpha_2)}_H(t,f) \]
between two GWSs with parameters $\alpha_1$ and $\alpha_2$ is bounded as
\[ \frac{\big\| \Delta_1^{(\alpha_1,\alpha_2)} \big\|_\infty}{\|S_H\|_1} \le 2\pi |\alpha_1 - \alpha_2|\, m^{(1,1)}_H, \qquad \frac{\big\| \Delta_1^{(\alpha_1,\alpha_2)} \big\|_2}{\|H\|_2} \le 2\pi |\alpha_1 - \alpha_2|\, M^{(1,1)}_H. \tag{2.43} \]
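Proof. With (B.20) and (B.5), the 2-D Fourier transform of $\Delta_1^{(\alpha_1,\alpha_2)}(t,f)$ is given by
\[ \hat\Delta_1^{(\alpha_1,\alpha_2)}(\tau,\nu) \triangleq \iint \Delta_1^{(\alpha_1,\alpha_2)}(t,f)\, e^{-j2\pi(\nu t - \tau f)}\, dt\, df = S^{(\alpha_1)}_H(\tau,\nu) \left[ 1 - e^{j2\pi(\alpha_1-\alpha_2)\tau\nu} \right]. \]
It is thus seen that this difference is essentially determined by the deviation of $e^{j2\pi(\alpha_1-\alpha_2)\tau\nu}$ from 1 which is obviously small for $(\tau,\nu)$ values around the origin. The first bound ($L_\infty$ bound) in (2.43) is then shown as
\[ \begin{aligned} \big| \Delta_1^{(\alpha_1,\alpha_2)}(t,f) \big| &= \left| \iint \hat\Delta_1^{(\alpha_1,\alpha_2)}(\tau,\nu)\, e^{j2\pi(t\nu - f\tau)}\, d\tau\, d\nu \right| \le \iint \big| \hat\Delta_1^{(\alpha_1,\alpha_2)}(\tau,\nu) \big|\, d\tau\, d\nu \\ &= \iint |S_H(\tau,\nu)|\, \big| 1 - e^{j2\pi(\alpha_1-\alpha_2)\tau\nu} \big|\, d\tau\, d\nu = 2 \iint |S_H(\tau,\nu)|\, \big| \sin(\pi(\alpha_1-\alpha_2)\tau\nu) \big|\, d\tau\, d\nu \\ &\le 2\pi |\alpha_1-\alpha_2| \iint |S_H(\tau,\nu)|\, |\tau|\, |\nu|\, d\tau\, d\nu = 2\pi |\alpha_1-\alpha_2|\, \|S_H\|_1\, m^{(1,1)}_H, \end{aligned} \tag{2.44} \]
where we used $|\sin x| \le |x|$. The second bound ($L_2$ bound) is shown similarly,
\[ \begin{aligned} \big\| \Delta_1^{(\alpha_1,\alpha_2)} \big\|_2^2 = \big\| \hat\Delta_1^{(\alpha_1,\alpha_2)} \big\|_2^2 &= \iint |S_H(\tau,\nu)|^2\, \big| 1 - e^{j2\pi(\alpha_1-\alpha_2)\tau\nu} \big|^2\, d\tau\, d\nu = 4 \iint |S_H(\tau,\nu)|^2 \sin^2(\pi(\alpha_1-\alpha_2)\tau\nu)\, d\tau\, d\nu \\ &\le 4\pi^2 (\alpha_1-\alpha_2)^2 \iint |S_H(\tau,\nu)|^2\, \tau^2 \nu^2\, d\tau\, d\nu = 4\pi^2 (\alpha_1-\alpha_2)^2\, \|H\|_2^2\, \big[ M^{(1,1)}_H \big]^2. \end{aligned} \]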
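The $L_\infty$ bound of Theorem 2.10 can be verified numerically on a coarse grid. The sketch below assumes a sampled GSF `S` for the parameter $\alpha_1$ (random, with a Gaussian envelope; purely hypothetical data) and evaluates the difference of two GWSs by direct summation of the Fourier relation used in the proof. The sign conventions of the exponentials follow the reconstruction above and do not affect the magnitudes being compared.

```python
import numpy as np

def gws_difference(S, T, N, delta, t, f):
    """Delta_1 at sample points (t, f); S holds GSF samples, delta = alpha_1 - alpha_2."""
    D = S * (1.0 - np.exp(1j * 2 * np.pi * delta * T * N))  # Fourier transform of Delta_1
    out = np.empty((t.size, f.size), dtype=complex)
    for a, ta in enumerate(t):
        for b, fb in enumerate(f):
            out[a, b] = np.sum(D * np.exp(1j * 2 * np.pi * (ta * N - fb * T)))
    return out

rng = np.random.default_rng(2)
tau = np.linspace(-0.5, 0.5, 15)
nu = np.linspace(-0.5, 0.5, 15)
T, N = np.meshgrid(tau, nu, indexing="ij")
envelope = np.exp(-8.0 * (T ** 2 + N ** 2))
S = (rng.standard_normal(T.shape) + 1j * rng.standard_normal(T.shape)) * envelope

delta = 0.5  # e.g. Weyl symbol (alpha = 0) versus Zadeh's function (alpha = 1/2)
t = np.linspace(-3.0, 3.0, 13)
f = np.linspace(-3.0, 3.0, 13)

norm1 = np.sum(np.abs(S))
err = np.max(np.abs(gws_difference(S, T, N, delta, t, f))) / norm1
m11 = np.sum(np.abs(T * N) * np.abs(S)) / norm1
bound = 2 * np.pi * abs(delta) * m11
print(err <= bound)
```

Only a finite set of $(t,f)$ points is evaluated, so `err` is a lower estimate of the normalized supremum; the inequality nevertheless holds at every evaluated point.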
Discussion. The bounds (2.43) depend only on the difference of $\alpha_1$ and $\alpha_2$ and approach zero with decreasing $|\alpha_1 - \alpha_2|$, thus correctly reflecting the obvious fact that for $\alpha_1 = \alpha_2$ the difference $\Delta_1^{(\alpha_1,\alpha_2)}(t,f)$ becomes zero. From the proof of the above theorem, it is also seen that $\Delta_1^{(\alpha_1,\alpha_2)}(t,f)$ vanishes for systems whose GSF is perfectly localized along the hyperbolae $|\tau\nu| = \frac{k}{|\alpha_1-\alpha_2|}$ with integer $k$, since here $\sin(\pi(\alpha_1-\alpha_2)\tau\nu) = 0$. A specific class of such systems (corresponding to $k = 0$) is given by any superposition of LTI and LFI systems with impulse response $h(t,t') = g(t-t') + m(t)\,\delta(t-t')$ and GSF located on the $\tau$ and $\nu$ axis. The GWS of such systems is given by $L^{(\alpha)}_H(t,f) = G(f) + m(t)$ and is therefore independent of $\alpha$.

In the case that the bounds (2.43) are small, which calls for small moments $m^{(1,1)}_H$ and $M^{(1,1)}_H$, it is seen that two GWSs obtained with different parameters $\alpha_1$ and $\alpha_2$ are approximately equal,
\[ L^{(\alpha_1)}_H(t,f) \approx L^{(\alpha_2)}_H(t,f). \]
The GWS is thus approximately independent of $\alpha$ and can therefore be considered as an (approximately) unique TF transfer function. Small $m^{(1,1)}_H$ and $M^{(1,1)}_H$ in particular require $H$ to be an underspread system with GSF concentrated along the $\tau$ and/or $\nu$ axis. Examples of such systems include quasi-LTI systems (i.e., systems with slow time-variations), quasi-LFI systems (i.e., systems with short memory), and any parallel connection (superposition) thereof. On the other hand, the moments $m^{(1,1)}_H$ and $M^{(1,1)}_H$ may be large for GSF oriented in oblique directions. This is consistent with the known fact that the Weyl symbol, i.e. the GWS with $\alpha = 0$, has unique properties regarding systems having obliquely oriented GSF.

DL Operators. For the special case of DL operators as defined in Section 2.1, using (2.18) in (2.43) directly yields the bound $2\pi|\alpha_1-\alpha_2|\kappa_H$ for both the normalized $L_\infty$ and $L_2$ norm of $\Delta_1^{(\alpha_1,\alpha_2)}$. However, a slight refinement of the proof of Theorem 2.10 that uses the fact that $\max_{(\tau,\nu)\in G_H} |\sin(\pi(\alpha_1-\alpha_2)\tau\nu)| \le \sin(\pi|\alpha_1-\alpha_2|\kappa_H)$ (valid for $2|\alpha_1-\alpha_2|\kappa_H \le 1$) yields the tighter bounds
\[ \frac{\big\| \Delta_1^{(\alpha_1,\alpha_2)} \big\|_\infty}{\|S_H\|_1} \le 2 \sin(\pi|\alpha_1-\alpha_2|\kappa_H), \qquad \frac{\big\| \Delta_1^{(\alpha_1,\alpha_2)} \big\|_2}{\|H\|_2} \le 2 \sin(\pi|\alpha_1-\alpha_2|\kappa_H), \tag{2.45} \]
which are valid for DL operators satisfying$^7$ $2|\alpha_1-\alpha_2|\kappa_H \le 1$. These bounds may be compared to the following bounds obtained in [118] for DL operators with rectangular GSF support,
\[ \frac{\big\| \Delta_1^{(\alpha_1,\alpha_2)} \big\|_\infty}{\|S_H\|_1} \le 2 \sin\!\left( \pi|\alpha_1-\alpha_2| \frac{\sigma_H}{4} \right), \qquad \frac{\big\| \Delta_1^{(\alpha_1,\alpha_2)} \big\|_2}{\|H\|_2} \le 2 \sin\!\left( \pi|\alpha_1-\alpha_2| \frac{\sigma_H}{4} \right), \]
which are valid for $|\alpha_1-\alpha_2|\sigma_H \le 2$. From this comparison, it is seen that our bounds are tighter (since $\kappa_H \le \sigma_H/4$ according to Subsection 2.1.2) and valid under more general conditions (e.g., for superpositions of quasi-LTI and quasi-LFI systems for which $\sigma_H$ may be infinite).

Non-DL Operators. We will now consider the case that one erroneously assumes that the system under analysis is DL with presumed GSF support region $G$, while actually the operator is not DL but rather has a rapidly decaying GSF. Specifically, let us consider the case where the system's GSF is effectively (but not exactly) contained within $G = \{(\tau,\nu) : |\tau\nu| \le \kappa_G\}$$^8$. In that case, neglecting the GSF contributions outside of $G$, (2.45) would suggest that the $L_\infty$ and $L_2$ norms of the difference $\Delta_1^{(\alpha_1,\alpha_2)}(t,f)$ are (approximately) less than $2\sin(\pi|\alpha_1-\alpha_2|\kappa_G)$. Of course, this approximate bound might be wrong, i.e., too small. The following result presents correct bounds that also show by how much one might be wrong when erroneously using the bounds (2.45).

$^7$ This condition is not very restrictive since the error bounds in (2.45) will typically be used only for small $\kappa_H$.
$^8$ This effective support region $G$ could be determined by thresholding the GSF, i.e., $G = \{(\tau,\nu) : |S_H(\tau,\nu)| \ge \epsilon\}$, so that $\kappa_G = \max_{(\tau,\nu):\, |S_H(\tau,\nu)| \ge \epsilon} |\tau\nu|$.
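Proposition 2.11. For any LTV system $H$ and any $\kappa_G$ such that $2|\alpha_1-\alpha_2|\kappa_G \le 1$, the difference $\Delta_1^{(\alpha_1,\alpha_2)}(t,f) = L^{(\alpha_1)}_H(t,f) - L^{(\alpha_2)}_H(t,f)$ is bounded as
\[ \frac{\big\| \Delta_1^{(\alpha_1,\alpha_2)} \big\|_\infty}{\|S_H\|_1} \le 2 \sin(\pi|\alpha_1-\alpha_2|\kappa_G) + 2\, \frac{m^{(1,1)}_H}{\kappa_G}, \tag{2.46} \]
\[ \frac{\big\| \Delta_1^{(\alpha_1,\alpha_2)} \big\|_2}{\|H\|_2} \le 2 \left[ \sin^2(\pi|\alpha_1-\alpha_2|\kappa_G) + \left( \frac{M^{(1,1)}_H}{\kappa_G} \right)^2 \right]^{1/2} \le 2 \sin(\pi|\alpha_1-\alpha_2|\kappa_G) + 2\, \frac{M^{(1,1)}_H}{\kappa_G}. \tag{2.47} \]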
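Proof. Consider the region $G = \{(\tau,\nu) : |\tau\nu| \le \kappa_G\}$. Starting from (2.44) and splitting the integral yields
\[ \begin{aligned} \big| \Delta_1^{(\alpha_1,\alpha_2)}(t,f) \big| &\le 2 \iint_{G} |S_H(\tau,\nu)|\, |\sin(\pi(\alpha_1-\alpha_2)\tau\nu)|\, d\tau\, d\nu + 2 \iint_{\bar G} |S_H(\tau,\nu)|\, |\sin(\pi(\alpha_1-\alpha_2)\tau\nu)|\, d\tau\, d\nu \\ &\le 2\, |\sin(\pi(\alpha_1-\alpha_2)\kappa_G)| \iint_{G} |S_H(\tau,\nu)|\, d\tau\, d\nu + 2 \iint_{\bar G} |S_H(\tau,\nu)|\, d\tau\, d\nu \\ &\le 2\, \|S_H\|_1 \sin(\pi|\alpha_1-\alpha_2|\kappa_G) + 2\, \|S_{H^{\bar G}}\|_1, \end{aligned} \]
where the first term in the bound holds for $2|\alpha_1-\alpha_2|\kappa_G \le 1$. The second term is bounded according to (2.42), which finally yields (2.46). The proof of (2.47) is analogous.

Thus, for non-DL operators, the bounds (2.45) with an arbitrarily chosen but in any case incorrect $\kappa_G$ might deviate from the correct bounds (2.46) and (2.47) by as much as $2m^{(1,1)}_H/\kappa_G$ and $2M^{(1,1)}_H/\kappa_G$, respectively.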
2.3.2 The Generalized Weyl Symbol of Operator Adjoints

The GWS of the adjoint operator $H^+$ is given by (B.23), i.e., $L^{(\alpha)}_{H^+}(t,f) = L^{(-\alpha)*}_{H}(t,f)$, and thus is generally not equal to the complex conjugate of the GWS of $H$ (this is only true for $\alpha = 0$, i.e., $L^{(0)}_{H^+}(t,f) = L^{(0)*}_{H}(t,f)$). This is a difference from the LTI and LFI cases where the transfer function of the adjoint of a system $H$ can be obtained by complex conjugation of the transfer function of $H$. Yet, the subsequently presented results will show that for an underspread system $H$ the complex conjugate of its GWS is a good approximation to the GWS of $H^+$. Our bounds on the resulting approximation error are straightforward consequences of the approximate $\alpha$-invariance of the GWS as discussed in the preceding subsection. For the special case $\alpha = 1/2$, a result in a similar spirit has been derived by Kohn and Nirenberg [112].

Corollary 2.12. For any LTV system $H$, the difference $\Delta_2^{(\alpha)}(t,f) \triangleq L^{(\alpha)}_{H^+}(t,f) - L^{(\alpha)*}_{H}(t,f)$ is bounded as
\[ \frac{\big\| \Delta_2^{(\alpha)} \big\|_\infty}{\|S_H\|_1} \le 4\pi|\alpha|\, m^{(1,1)}_H, \qquad \frac{\big\| \Delta_2^{(\alpha)} \big\|_2}{\|H\|_2} \le 4\pi|\alpha|\, M^{(1,1)}_H. \tag{2.48} \]

Proof. With $L^{(\alpha)}_{H^+}(t,f) = L^{(-\alpha)*}_{H}(t,f)$, we have $\Delta_2^{(\alpha)}(t,f) = L^{(-\alpha)*}_{H}(t,f) - L^{(\alpha)*}_{H}(t,f) = \Delta_1^{(-\alpha,\alpha)*}(t,f)$. Hence, the above bounds follow straightforwardly by applying the bounds (2.43) with $\alpha_1 = -\alpha$ and $\alpha_2 = \alpha$.

Discussion. The bounds in Corollary 2.12 correctly reflect that $\Delta_2^{(\alpha)}(t,f) = 0$ for $\alpha = 0$. Furthermore, for an arbitrary parallel connection of LTI and LFI systems, with GWS $L^{(\alpha)}_H(t,f) = G(f) + m(t)$, there is $\Delta_2^{(\alpha)}(t,f) = 0$ since here $m^{(1,1)}_H = 0$. More generally, for an underspread system whose GSF is concentrated along the $\tau$ and $\nu$ axes so that $m^{(1,1)}_H$ and $M^{(1,1)}_H$ are small, the bounds show that the GWS of the adjoint of a system can approximately be obtained by complex conjugation of the system's GWS,
\[ L^{(\alpha)}_{H^+}(t,f) \approx L^{(\alpha)*}_{H}(t,f). \tag{2.49} \]
Due to the similar bounds in Corollary 2.12 and Theorem 2.10, the above approximation is valid in the same situations in which the GWS is approximately $\alpha$-independent, e.g. for (superpositions of) quasi-LTI and quasi-LFI systems. On the other hand, (2.49) is not valid for systems with GSF oriented in oblique directions. For such systems one has thus to resort to the case $\alpha = 0$ where $L^{(0)}_{H^+}(t,f) = L^{(0)*}_{H}(t,f)$.

DL Operators. For the special case of DL operators, setting $\alpha_1 = -\alpha$ and $\alpha_2 = \alpha$ in (2.45), one obtains
\[ \frac{\big\| \Delta_2^{(\alpha)} \big\|_\infty}{\|S_H\|_1} \le 2 \sin(2\pi|\alpha|\kappa_H) \le 4\pi|\alpha|\kappa_H, \qquad \frac{\big\| \Delta_2^{(\alpha)} \big\|_2}{\|H\|_2} \le 2 \sin(2\pi|\alpha|\kappa_H) \le 4\pi|\alpha|\kappa_H, \tag{2.50} \]
where the tighter bounds involving the sine term are valid only for $4|\alpha|\kappa_H \le 1$.

Non-DL Operators. Similarly to the results (2.46) and (2.47) derived in connection with the uniqueness of the GWS, it can be shown that for any LTV system $H$ and any $\kappa_G$ such that $4|\alpha|\kappa_G \le 1$, one has
\[ \frac{\big\| \Delta_2^{(\alpha)} \big\|_\infty}{\|S_H\|_1} \le 2 \sin(2\pi|\alpha|\kappa_G) + 2\, \frac{m^{(1,1)}_H}{\kappa_G}, \tag{2.51} \]
\[ \frac{\big\| \Delta_2^{(\alpha)} \big\|_2}{\|H\|_2} \le 2 \left[ \sin^2(2\pi|\alpha|\kappa_G) + \left( \frac{M^{(1,1)}_H}{\kappa_G} \right)^2 \right]^{1/2} \le 2 \sin(2\pi|\alpha|\kappa_G) + 2\, \frac{M^{(1,1)}_H}{\kappa_G}. \tag{2.52} \]
Thus, in the case of non-DL operators, the deviations of the bounds (2.50) from the correct bounds (2.51) and (2.52) can be as large as $2m^{(1,1)}_H/\kappa_G$ and $2M^{(1,1)}_H/\kappa_G$, respectively.
2.3.3 Approximate Real-Valuedness of the Generalized Weyl Symbol
A simple but important consequence of Corollary 2.12 concerns self-adjoint operators, i.e., operators satisfying $H = H^+$. Unlike the transfer functions of LTI and LFI systems, the GWS of such operators is generally not real-valued unless $\alpha = 0$. Fortunately, Corollary 2.12 implies that for underspread self-adjoint operators, the imaginary part of the GWS is negligible.

Corollary 2.13. For a self-adjoint operator $H$, the imaginary part of its GWS, $\Im\{L^{(\alpha)}_H(t,f)\} = \frac{1}{2j}\big[ L^{(\alpha)}_H(t,f) - L^{(\alpha)*}_H(t,f) \big]$, is bounded as
\[ \frac{\big\| \Im\{L^{(\alpha)}_H\} \big\|_\infty}{\|S_H\|_1} \le 2\pi|\alpha|\, m^{(1,1)}_H, \qquad \frac{\big\| \Im\{L^{(\alpha)}_H\} \big\|_2}{\|H\|_2} \le 2\pi|\alpha|\, M^{(1,1)}_H. \tag{2.53} \]
Proof. The above bounds are a straightforward consequence of Corollary 2.12 since for a self-adjoint operator
\[ \Im\{L^{(\alpha)}_H(t,f)\} = \frac{1}{2j}\big[ L^{(\alpha)}_H(t,f) - L^{(\alpha)*}_H(t,f) \big] = \frac{1}{2j}\big[ L^{(\alpha)}_{H^+}(t,f) - L^{(\alpha)*}_H(t,f) \big] = \frac{1}{2j}\, \Delta_2^{(\alpha)}(t,f). \]
Discussion. As a consequence of the foregoing corollary, self-adjoint operators having their GSF localized along the $\tau$ and/or $\nu$ axis, i.e., underspread operators with small $m^{(1,1)}_H$ and $M^{(1,1)}_H$, have approximately real-valued GWSs,
\[ \Im\{L^{(\alpha)}_H(t,f)\} \approx 0, \qquad \text{or} \qquad L^{(\alpha)}_H(t,f) \approx \Re\{L^{(\alpha)}_H(t,f)\}. \]
An example illustrating this approximation will be provided in Subsection 2.3.14. Furthermore, this result is important in the context of time-varying spectral analysis, since it shows that in the underspread case the GWVS in (B.45) is approximately real-valued even for $\alpha \ne 0$ (see Subsection 3.2.1).

DL Operators. For the special case of DL operators, the following bounds can be obtained from (2.50),
\[ \frac{\big\| \Im\{L^{(\alpha)}_H\} \big\|_\infty}{\|S_H\|_1} \le \sin(2\pi|\alpha|\kappa_H) \le 2\pi|\alpha|\kappa_H, \qquad \frac{\big\| \Im\{L^{(\alpha)}_H\} \big\|_2}{\|H\|_2} \le \sin(2\pi|\alpha|\kappa_H) \le 2\pi|\alpha|\kappa_H, \tag{2.54} \]
where the tighter bounds in terms of the sine function are valid for $4|\alpha|\kappa_H \le 1$. Similar to the previous subsections, the deviations involved in incorrectly using these bounds for non-DL systems are upper-bounded by $m^{(1,1)}_H/\kappa_G$ and $M^{(1,1)}_H/\kappa_G$.
2.3.4 Composition of Systems
One of the most important properties of LTI systems is the fact that the transfer function of a composition (series connection) of two LTI systems equals the product $G_1(f)\, G_2(f)$ of the transfer functions of the individual systems. Similarly, the temporal transfer function of the composition of two LFI systems is given by $m_1(t)\, m_2(t)$. This composition property of conventional transfer functions is the cornerstone of many signal processing techniques used in applications like filtering, estimation, detection, and channel equalization. Unfortunately, contrary to the LTI/LFI case, a similar composition property no longer holds true for general LTV systems: the GWS of the operator composition $H_2H_1$ cannot be obtained by multiplying the individual GWSs of $H_1$ and $H_2$. This prohibits an exact transfer function based formulation of several signal processing techniques for the time-varying/nonstationary case. (In the context of quantum mechanics, such a multiplicative symbol calculus corresponds to ideal quantization [64].) Besides the cases where $H_1$ and $H_2$ are either both LTI or both LFI (or, for $\alpha = 0$, metaplectic transformations thereof), for Hilbert-Schmidt (HS) operators the relation $L^{(\alpha)}_{H_2H_1}(t,f) = L^{(\alpha)}_{H_1}(t,f)\, L^{(\alpha)}_{H_2}(t,f)$ is correct only in the following situations (see Figure 2.5):

For $\alpha = 1/2$, if $H_1$ is an LTI system with spectral transfer function $G_1(f)$ or $H_2$ is an LFI system with temporal transfer function $m_2(t)$, it can be shown that
\[ L^{(1/2)}_{H_2H_1}(t,f) = L^{(1/2)}_{H_1}(t,f)\, L^{(1/2)}_{H_2}(t,f) = \begin{cases} G_1(f)\, L^{(1/2)}_{H_2}(t,f) & \text{for } H_1 \text{ LTI}, \\ L^{(1/2)}_{H_1}(t,f)\, m_2(t) & \text{for } H_2 \text{ LFI}. \end{cases} \tag{2.55} \]
In the dual case $\alpha = -1/2$, if $H_1$ is LFI with temporal transfer function $m_1(t)$ or $H_2$ is LTI with spectral transfer function $G_2(f)$, one obtains
\[ L^{(-1/2)}_{H_2H_1}(t,f) = L^{(-1/2)}_{H_1}(t,f)\, L^{(-1/2)}_{H_2}(t,f) = \begin{cases} m_1(t)\, L^{(-1/2)}_{H_2}(t,f) & \text{for } H_1 \text{ LFI}, \\ L^{(-1/2)}_{H_1}(t,f)\, G_2(f) & \text{for } H_2 \text{ LTI}. \end{cases} \tag{2.56} \]

Figure 2.5: Situations where $L^{(\alpha)}_{H_2H_1}(t,f) = L^{(\alpha)}_{H_2}(t,f)\, L^{(\alpha)}_{H_1}(t,f)$ with $H_1$ corresponding to the left-hand system and $H_2$ corresponding to the right-hand system. [Block diagrams: for $\alpha = 1/2$, LTI followed by LTV and LTV followed by LFI; for $\alpha = -1/2$, LFI followed by LTV and LTV followed by LTI; for $|\alpha| \le 1/2$, LTI followed by LTI and LFI followed by LFI.]

Hence, we see that the situations where a multiplicative relation of the GWS holds exactly are rare. However, the situation is more comforting if one restricts attention to underspread operators. In the following, it is shown that for underspread systems the desired composition property for the GWS is approximately valid. The bounds on the approximation error given below generalize and extend the bounds derived for DL operators in [118–120]. Results in a similar spirit have been obtained in the context of quantization [54, 64, 206] and pseudo-differential operators [64, 73, 94, 95, 112, 206]. Note that an approximate product formula for the GWS of a composition of LTV systems also implies the approximate commutation of the systems involved (cf. Subsection 2.3.16).

Using the twisted product in the form given by (B.31), it is seen that the error in replacing $L^{(\alpha)}_{H_2H_1}(t,f) = \big( L^{(\alpha)}_{H_2} \# L^{(\alpha)}_{H_1} \big)(t,f)$ by $L^{(\alpha)}_{H_2}(t,f)\, L^{(\alpha)}_{H_1}(t,f)$ is equal to
\[ \Delta_3^{(\alpha)}(t,f) \triangleq L^{(\alpha)}_{H_2H_1}(t,f) - L^{(\alpha)}_{H_2}(t,f)\, L^{(\alpha)}_{H_1}(t,f) = \sum_{k+l>0} \frac{c_{k,l}}{(j2\pi)^{k+l}}\, \frac{\partial^{k+l} L^{(\alpha)}_{H_2}(t,f)}{\partial t^l\, \partial f^k}\, \frac{\partial^{k+l} L^{(\alpha)}_{H_1}(t,f)}{\partial t^k\, \partial f^l}, \]
with $c_{k,l} = (\alpha + 1/2)^k (\alpha - 1/2)^l/(k!\, l!)$. This already shows that this error will be small for sufficiently smooth GWSs. The following result gives bounds on the magnitude of this error.

Theorem 2.14. For any two LTV systems $H_1$ and $H_2$, the difference $\Delta_3^{(\alpha)}(t,f) \triangleq L^{(\alpha)}_{H_2H_1}(t,f) - L^{(\alpha)}_{H_1}(t,f)\, L^{(\alpha)}_{H_2}(t,f)$ is bounded as
\[ \frac{\big\| \Delta_3^{(\alpha)} \big\|_\infty}{\|S_{H_1}\|_1 \|S_{H_2}\|_1} \le 2\pi B^{(\alpha)}_{H_1,H_2} \qquad \text{with} \qquad B^{(\alpha)}_{H_1,H_2} \triangleq \left| \alpha + \frac{1}{2} \right| m^{(0,1)}_{H_1} m^{(1,0)}_{H_2} + \left| \alpha - \frac{1}{2} \right| m^{(1,0)}_{H_1} m^{(0,1)}_{H_2}. \tag{2.57} \]
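Proof. The GSF of the operator product $H_2H_1$ is given by the twisted convolution (B.12). Hence, the Fourier transform of $\Delta_3^{(\alpha)}(t,f)$ is obtained as
\[ \hat\Delta_3^{(\alpha)}(\tau,\nu) = \big( S^{(\alpha)}_{H_2} \,\natural\, S^{(\alpha)}_{H_1} \big)(\tau,\nu) - \big( S^{(\alpha)}_{H_2} \ast\ast\, S^{(\alpha)}_{H_1} \big)(\tau,\nu) = \iint S^{(\alpha)}_{H_2}(\tau',\nu')\, S^{(\alpha)}_{H_1}(\tau-\tau',\nu-\nu') \left[ e^{-j2\pi\phi_\alpha(\tau,\nu,\tau',\nu')} - 1 \right] d\tau'\, d\nu', \]
with $\phi_\alpha(\tau,\nu,\tau',\nu') = (\alpha + 1/2)\, \tau'(\nu-\nu') + (\alpha - 1/2)(\tau-\tau')\, \nu'$. Thus, we obtain
\[ \begin{aligned} \big| \Delta_3^{(\alpha)}(t,f) \big| &= \left| \iint \hat\Delta_3^{(\alpha)}(\tau,\nu)\, e^{j2\pi(t\nu - f\tau)}\, d\tau\, d\nu \right| \le \iint \big| \hat\Delta_3^{(\alpha)}(\tau,\nu) \big|\, d\tau\, d\nu \\ &\le \iiiint |S_{H_2}(\tau',\nu')|\, |S_{H_1}(\tau-\tau',\nu-\nu')|\, \big| e^{-j2\pi\phi_\alpha(\tau,\nu,\tau',\nu')} - 1 \big|\, d\tau'\, d\nu'\, d\tau\, d\nu \\ &= 2 \iiiint |S_{H_2}(\tau',\nu')|\, |S_{H_1}(\tau-\tau',\nu-\nu')|\, \big| \sin(\pi\phi_\alpha(\tau,\nu,\tau',\nu')) \big|\, d\tau'\, d\nu'\, d\tau\, d\nu. \end{aligned} \tag{2.58} \]
Substituting $\tau_1 = \tau - \tau'$, $\nu_1 = \nu - \nu'$ and using $|\sin x| \le |x|$ yields further
\[ \begin{aligned} \big| \Delta_3^{(\alpha)}(t,f) \big| &\le 2 \iiiint |S_{H_2}(\tau',\nu')|\, |S_{H_1}(\tau_1,\nu_1)| \left| \sin\!\left( \pi \left[ \left( \alpha + \tfrac{1}{2} \right) \tau' \nu_1 + \left( \alpha - \tfrac{1}{2} \right) \tau_1 \nu' \right] \right) \right| d\tau'\, d\nu'\, d\tau_1\, d\nu_1 \qquad (2.60) \\ &\le 2\pi \iiiint |S_{H_2}(\tau',\nu')|\, |S_{H_1}(\tau_1,\nu_1)| \left| \left( \alpha + \tfrac{1}{2} \right) \tau' \nu_1 + \left( \alpha - \tfrac{1}{2} \right) \tau_1 \nu' \right| d\tau'\, d\nu'\, d\tau_1\, d\nu_1 \\ &\le 2\pi \left| \alpha + \tfrac{1}{2} \right| \iint |\tau'|\, |S_{H_2}(\tau',\nu')|\, d\tau'\, d\nu' \iint |\nu_1|\, |S_{H_1}(\tau_1,\nu_1)|\, d\tau_1\, d\nu_1 \\ &\quad + 2\pi \left| \alpha - \tfrac{1}{2} \right| \iint |\nu'|\, |S_{H_2}(\tau',\nu')|\, d\tau'\, d\nu' \iint |\tau_1|\, |S_{H_1}(\tau_1,\nu_1)|\, d\tau_1\, d\nu_1, \qquad (2.61) \end{aligned} \]
from which the result (2.57) follows.

Discussion. By virtue of its symmetry with respect to $H_1$ and $H_2$, the bound in (2.57) also applies to the differences $L^{(\alpha)}_{H_1H_2}(t,f) - L^{(\alpha)}_{H_1}(t,f)\, L^{(\alpha)}_{H_2}(t,f)$ and $L^{(\alpha)}_{H_1 \circ H_2}(t,f) - L^{(\alpha)}_{H_1}(t,f)\, L^{(\alpha)}_{H_2}(t,f)$, where $H_1 \circ H_2 = [H_1H_2 + H_2H_1]/2$ is the (commutative) Jordan product [64] of $H_1$ and $H_2$.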
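The constant $B^{(\alpha)}_{H_1,H_2}$ of (2.57) is a simple function of four first-order moments, and evaluating it for a few values of $\alpha$ shows how strongly the bound depends on the ordering and type of the two systems. The sketch below is only an illustration; the function name and the moment values (a quasi-LTI system $H_1$ followed by a quasi-LFI system $H_2$) are hypothetical.

```python
import math

def composition_bound_constant(alpha, m10_1, m01_1, m10_2, m01_2):
    """B^(alpha)_{H1,H2} from (2.57); m10_i, m01_i are the moments m^(1,0), m^(0,1) of H_i."""
    return abs(alpha + 0.5) * m01_1 * m10_2 + abs(alpha - 0.5) * m10_1 * m01_2

# Assumed moments: H1 quasi-LTI (small m^(0,1)), H2 quasi-LFI (small m^(1,0)).
m10_1, m01_1 = 2.0, 0.01
m10_2, m01_2 = 0.02, 1.5

B_plus = composition_bound_constant(0.5, m10_1, m01_1, m10_2, m01_2)
B_minus = composition_bound_constant(-0.5, m10_1, m01_1, m10_2, m01_2)
B_weyl = composition_bound_constant(0.0, m10_1, m01_1, m10_2, m01_2)

# The normalized error bound of Theorem 2.14 is 2*pi*B.
for name, B in (("alpha=+1/2", B_plus), ("alpha=-1/2", B_minus), ("alpha=0", B_weyl)):
    print(name, 2 * math.pi * B)
```

With these assumed moments the constant is tiny for $\alpha = 1/2$ and large for $\alpha = -1/2$, while $\alpha = 0$ gives the average of the two, which matches the role of (2.55) as the exact counterpart of the case $\alpha = 1/2$.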
Figure 2.6: Transfer function approximation (with $\alpha = 0$) for composition of systems: (a) Weyl symbol of $H_1$, (b) Weyl symbol of $H_2$, (c) Weyl symbol of $H_2H_1$, and (d) product of $L^{(0)}_{H_1}(t,f)$ and $L^{(0)}_{H_2}(t,f)$. The number of samples is 128 and (normalized) frequency ranges from $-1/4$ to $1/4$. [Four panels (a) to (d); horizontal axis $t$, vertical axis $f$.]

The bound shows that if $H_1$ and $H_2$ are jointly underspread such that $m^{(0,1)}_{H_1} m^{(1,0)}_{H_2}$ and $m^{(1,0)}_{H_1} m^{(0,1)}_{H_2}$ are both small, then we have an approximate multiplicative composition property,
\[ \big( L^{(\alpha)}_{H_2} \# L^{(\alpha)}_{H_1} \big)(t,f) = L^{(\alpha)}_{H_2H_1}(t,f) \approx L^{(\alpha)}_{H_1}(t,f)\, L^{(\alpha)}_{H_2}(t,f). \tag{2.62} \]
In other words, for underspread operators the twisted GWS product is approximately equal to the pointwise GWS product. An example illustrating the validity of this approximation (with $\alpha = 0$) is shown in Fig. 2.6. In this example, the maximum normalized error is $\max_{t,f} |\Delta_3^{(0)}(t,f)| / (\|S_{H_1}\|_1 \|S_{H_2}\|_1) = 0.065$ while the bound in (2.57) is $2\pi B^{(0)}_{H_1,H_2} = 0.084$.

In the following, we briefly discuss this approximation for the cases $\alpha = \pm 1/2$. The case $\alpha = 0$ deserves special attention and will be considered in more detail afterwards.

For $\alpha = 1/2$, $B^{(\alpha)}_{H_1,H_2}$ simplifies to $B^{(1/2)}_{H_1,H_2} = m^{(0,1)}_{H_1} m^{(1,0)}_{H_2}$, which will be small if $|S_{H_1}(\tau,\nu)|$ is located along the $\tau$ axis and/or $|S_{H_2}(\tau,\nu)|$ is located along the $\nu$ axis. Thus, for $\alpha = 1/2$ the approximation (2.62) is good when $H_1$ is a quasi-LTI system and/or $H_2$ is a quasi-LFI system (note the consistency with the exact multiplicative calculus in (2.55)).

For $\alpha = -1/2$, we have $B^{(-1/2)}_{H_1,H_2} = m^{(1,0)}_{H_1} m^{(0,1)}_{H_2}$. This will be small if $|S_{H_1}(\tau,\nu)|$ is located along the $\nu$ axis and/or $|S_{H_2}(\tau,\nu)|$ is located along the $\tau$ axis, i.e., if $H_1$ is quasi-LFI and/or $H_2$ is quasi-LTI (note the consistency with (2.56)). For both $\alpha = 1/2$ and $\alpha = -1/2$, the bound is large, and thus the approximation (2.62) is poor, if the systems' GSFs are oriented in oblique directions.

For $|\alpha| \le 1/2$ one has $|\alpha + \frac{1}{2}| + |\alpha - \frac{1}{2}| = 1$; hence $B^{(\alpha)}_{H_1,H_2}$ in (2.57) is here a convex combination$^9$ of $m^{(0,1)}_{H_1} m^{(1,0)}_{H_2}$ and $m^{(1,0)}_{H_1} m^{(0,1)}_{H_2}$ and thus assumes values between the two extreme cases $\alpha = \pm 1/2$.

Case $\alpha = 0$. The case $\alpha = 0$ is of particular importance. Here, $B^{(\alpha)}_{H_1,H_2}$ simplifies to
\[ B^{(0)}_{H_1,H_2} = \frac{1}{2} \left[ m^{(1,0)}_{H_1} m^{(0,1)}_{H_2} + m^{(0,1)}_{H_1} m^{(1,0)}_{H_2} \right], \]
which is symmetric with respect to $H_1$ and $H_2$. Thus, the approximation (2.62) will be good if both $m^{(1,0)}_{H_1} m^{(0,1)}_{H_2}$ and $m^{(0,1)}_{H_1} m^{(1,0)}_{H_2}$ are small, which amounts to the condition that $H_1$ and $H_2$ are jointly underspread in the sense of Subsection 2.2.5. This requires that the GSFs of $H_1$ and $H_2$ are both concentrated about the origin of the $(\tau,\nu)$ plane, with similar orientation parallel to the $\tau$ or $\nu$ axis. However, this requirement can be relaxed by a refinement of the bound $2\pi B^{(0)}_{H_1,H_2}$ that is stated in the next theorem and that allows for systems $H_1$ and $H_2$ with GSFs oriented along similar oblique directions. This improved bound is due to the covariance of the Weyl symbol to unitary metaplectic operators $U \in \mathcal{M}$ that correspond to area-preserving, linear TF coordinate transforms (cf. Subsection 2.1.3).

Theorem 2.15. For any two LTV systems $H_1$ and $H_2$, the difference $\Delta_3^{(0)}(t,f) = L^{(0)}_{H_2H_1}(t,f) - L^{(0)}_{H_1}(t,f)\, L^{(0)}_{H_2}(t,f)$ obeys the (generally tighter) bound
\[ \frac{\big\| \Delta_3^{(0)} \big\|_\infty}{\|S_{H_1}\|_1 \|S_{H_2}\|_1} \le 2\pi \inf_{U \in \mathcal{M}} B^{(0)}_{UH_1U^+,\, UH_2U^+}. \tag{2.63} \]

$^9$ The convex combination of two quantities $x$ and $y$ is defined by $(1-\lambda)x + \lambda y$ with $0 \le \lambda \le 1$.
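Proof. Specializing the proof of Theorem 2.14 to $\alpha = 0$, it is seen that
\[ \big| \Delta_3^{(0)}(t,f) \big| \le \pi \iiiint |S_{H_2}(\tau_1,\nu_1)|\, |S_{H_1}(\tau_2,\nu_2)|\, |\tau_1\nu_2 - \tau_2\nu_1|\, d\tau_1\, d\nu_1\, d\tau_2\, d\nu_2, \tag{2.64} \]
where $\tau_1\nu_2 - \tau_2\nu_1$ is the symplectic form on $\mathbb{R}^2$. Performing a symplectic coordinate transform $\binom{\tau_i}{\nu_i} \to A \binom{\tau_i}{\nu_i}$, where $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ with $\det A = ad - bc = 1$, and using the invariance of the symplectic form [64, 154] under this transform, the right-hand side of (2.64) becomes
\[ \pi \iiiint |S_{H_2}(a\tau_1 + b\nu_1,\, c\tau_1 + d\nu_1)|\, |S_{H_1}(a\tau_2 + b\nu_2,\, c\tau_2 + d\nu_2)|\, |\tau_1\nu_2 - \tau_2\nu_1|\, d\tau_1\, d\nu_1\, d\tau_2\, d\nu_2. \]
Let $\tilde H_i = UH_iU^+$ where $U \in \mathcal{M}$ is the unitary operator corresponding to $A$ in the sense of Subsection 2.1.3. Using the covariance property $S^{(0)}_{\tilde H}(\tau,\nu) = S^{(0)}_H(a\tau + b\nu,\, c\tau + d\nu)$ and noting that $|S^{(0)}_H(\tau,\nu)| = |S_H(\tau,\nu)|$, the above expression becomes
\[ \pi \iiiint |S_{\tilde H_2}(\tau_1,\nu_1)|\, |S_{\tilde H_1}(\tau_2,\nu_2)|\, |\tau_1\nu_2 - \tau_2\nu_1|\, d\tau_1\, d\nu_1\, d\tau_2\, d\nu_2. \]
Inserting $|\tau_1\nu_2 - \tau_2\nu_1| \le |\tau_1\nu_2| + |\tau_2\nu_1|$ and using $\|S_{\tilde H_i}\|_1 = \|S_{H_i}\|_1$ (which holds since the GSF magnitude of $\tilde H_i$ is obtained from that of $H_i$ via an area-preserving coordinate transform), we obtain
\[ \begin{aligned} \big| \Delta_3^{(0)}(t,f) \big| &\le \pi \iint |S_{\tilde H_2}(\tau_1,\nu_1)|\, |\tau_1|\, d\tau_1\, d\nu_1 \iint |S_{\tilde H_1}(\tau_2,\nu_2)|\, |\nu_2|\, d\tau_2\, d\nu_2 + \pi \iint |S_{\tilde H_2}(\tau_1,\nu_1)|\, |\nu_1|\, d\tau_1\, d\nu_1 \iint |S_{\tilde H_1}(\tau_2,\nu_2)|\, |\tau_2|\, d\tau_2\, d\nu_2 \\ &= \pi\, \|S_{H_1}\|_1 \|S_{H_2}\|_1 \left[ m^{(0,1)}_{\tilde H_1} m^{(1,0)}_{\tilde H_2} + m^{(1,0)}_{\tilde H_1} m^{(0,1)}_{\tilde H_2} \right]. \end{aligned} \]
As this bound is valid for all $U \in \mathcal{M}$, we finally obtain the bound (2.63).

Since the symplectic group contains TF rotations and TF shearings, this theorem shows that $\Delta_3^{(0)}(t,f)$ may be small even if the GSFs of $H_1$ and $H_2$ are oriented in (similar) oblique directions. This is not true for $\alpha \ne 0$, which once again shows the exceptional position of the Weyl symbol.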
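DL Operators. For DL operators, using (2.17), the bound (2.57) can be further bounded as
\[ \frac{\big\| \Delta_3^{(\alpha)} \big\|_\infty}{\|S_{H_1}\|_1 \|S_{H_2}\|_1} \le 2\pi \left[ \left| \alpha + \tfrac{1}{2} \right| \tau^{(\max)}_{H_2} \nu^{(\max)}_{H_1} + \left| \alpha - \tfrac{1}{2} \right| \tau^{(\max)}_{H_1} \nu^{(\max)}_{H_2} \right]. \tag{2.65} \]
A tighter bound is obtained by adapting the proof of Theorem 2.14 for DL operators, starting from (2.60). For $|\alpha + \frac{1}{2}|\, \tau^{(\max)}_{H_2} \nu^{(\max)}_{H_1} + |\alpha - \frac{1}{2}|\, \tau^{(\max)}_{H_1} \nu^{(\max)}_{H_2} \le 1/2$, this yields
\[ \frac{\big\| \Delta_3^{(\alpha)} \big\|_\infty}{\|S_{H_1}\|_1 \|S_{H_2}\|_1} \le 2 \sin\!\left( \pi \left[ \left| \alpha + \tfrac{1}{2} \right| \tau^{(\max)}_{H_2} \nu^{(\max)}_{H_1} + \left| \alpha - \tfrac{1}{2} \right| \tau^{(\max)}_{H_1} \nu^{(\max)}_{H_2} \right] \right). \tag{2.66} \]
This latter bound may be compared to the corresponding bound in [118] (valid for $(1 + 2|\alpha|)\, \sigma_{H_1,H_2} \le 2$ where $\sigma_{H_1,H_2}$ has been defined in (2.11))
\[ \frac{\big\| \Delta_3^{(\alpha)} \big\|_\infty}{\|S_{H_1}\|_1 \|S_{H_2}\|_1} \le 2 \sin\!\left( \pi\, \frac{\sigma_{H_1,H_2}}{2} \left[ \tfrac{1}{2} + |\alpha| \right] \right). \]
It is seen that our bound shows more explicitly which quantities have to be small for various values of $\alpha$ in order that the approximation (2.62) is valid; furthermore, our bound is also tighter since
\[ \begin{aligned} \left| \alpha + \tfrac{1}{2} \right| \tau^{(\max)}_{H_2} \nu^{(\max)}_{H_1} + \left| \alpha - \tfrac{1}{2} \right| \tau^{(\max)}_{H_1} \nu^{(\max)}_{H_2} &\le \left[ \tfrac{1}{2} + |\alpha| \right] \left[ \tau^{(\max)}_{H_2} \nu^{(\max)}_{H_1} + \tau^{(\max)}_{H_1} \nu^{(\max)}_{H_2} \right] \\ &\le 2 \left[ \tfrac{1}{2} + |\alpha| \right] \max\!\big\{ \tau^{(\max)}_{H_1}, \tau^{(\max)}_{H_2} \big\}\, \max\!\big\{ \nu^{(\max)}_{H_1}, \nu^{(\max)}_{H_2} \big\} = \frac{\sigma_{H_1,H_2}}{2} \left[ \tfrac{1}{2} + |\alpha| \right]. \end{aligned} \]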
Non-DL Operators. We now investigate the potential error resulting from using the bound (2.66) derived for DL operators in the case of non-DL operators. Proposition 2.16. For any two LTV systems H1 and H2 and any G1 , G1 , G2 , G2 such that +
1 2 () ()
G2 G1 +
1
1 2
G1 G2 1/2, the dierence 3 (t, f ) is bounded as mH1 m +2 + H1 G1 G1

(1,0) (0,1)
3 (t, f ) SH1 1 SH2
1 1 2 sin + G2 G1 + G1 G2 2 2
mH2 m + H2 . G2 G2 (2.67)
(1,0)
(0,1)
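Proof. Let $\mathcal{G}_1=[-\tau_{G_1},\tau_{G_1}]\times[-\nu_{G_1},\nu_{G_1}]$ and $\mathcal{G}_2=[-\tau_{G_2},\tau_{G_2}]\times[-\nu_{G_2},\nu_{G_2}]$. Starting with (2.60) and splitting the integral yields
\[
\begin{aligned}
|\Delta_3^{(\alpha)}(t,f)|\le{}& 2\iint_{\mathcal{G}_1}\!\iint_{\mathcal{G}_2}|S_{H_2}(\tau',\nu')|\,|S_{H_1}(\tau_1,\nu_1)|\,
\Big|\sin\!\Big(\pi\Big[\big(\alpha+\tfrac12\big)\tau'\nu_1+\big(\alpha-\tfrac12\big)\tau_1\nu'\Big]\Big)\Big|\,d\tau'\,d\nu'\,d\tau_1\,d\nu_1\\
&+2\iint_{\overline{\mathcal{G}}_1}\!\iint_{\overline{\mathcal{G}}_2}|S_{H_2}(\tau',\nu')|\,|S_{H_1}(\tau_1,\nu_1)|\,d\tau'\,d\nu'\,d\tau_1\,d\nu_1,
\end{aligned}
\tag{2.68}
\]
where we also used $|\sin x|\le 1$ to bound the second integral. The first term in (2.68) can be bounded using the inequality
\[
\max_{(\tau_1,\nu_1)\in\mathcal{G}_1,\;(\tau',\nu')\in\mathcal{G}_2}
\Big|\sin\!\Big(\pi\Big[\big(\alpha+\tfrac12\big)\tau'\nu_1+\big(\alpha-\tfrac12\big)\tau_1\nu'\Big]\Big)\Big|
\le \sin\!\Big(\pi\Big[\big|\alpha+\tfrac12\big|\,\tau_{G_2}\nu_{G_1}+\big|\alpha-\tfrac12\big|\,\tau_{G_1}\nu_{G_2}\Big]\Big),
\]
which is valid for $|\alpha+\tfrac12|\,\tau_{G_2}\nu_{G_1}+|\alpha-\tfrac12|\,\tau_{G_1}\nu_{G_2}\le 1/2$. Furthermore, the second term in (2.68) can be rewritten as
\[
\iint_{\overline{\mathcal{G}}_1}\!\iint_{\overline{\mathcal{G}}_2}|S_{H_2}(\tau',\nu')|\,|S_{H_1}(\tau_1,\nu_1)|\,d\tau'\,d\nu'\,d\tau_1\,d\nu_1
=\big\|S_{\bar H_1^{\mathcal{G}_1}}\big\|_1\,\big\|S_{\bar H_2^{\mathcal{G}_2}}\big\|_1,
\]
and the $L_1$-bound in (2.38) can be applied to $\|S_{\bar H_1^{\mathcal{G}_1}}\|_1$ and $\|S_{\bar H_2^{\mathcal{G}_2}}\|_1$. This finally establishes the proposition.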
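It is thus seen that for non-DL operators, the bound (2.66) (incorrectly used) deviates from the correct bound (2.67) by as much as
\[
2\left(\frac{m_{H_1}^{(1,0)}}{\tau_{G_1}}+\frac{m_{H_1}^{(0,1)}}{\nu_{G_1}}\right)\!\left(\frac{m_{H_2}^{(1,0)}}{\tau_{G_2}}+\frac{m_{H_2}^{(0,1)}}{\nu_{G_2}}\right).
\]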
2.3.5 Composition of $H$ with $H^+$
In some applications, the composition of $H$ with $H^+$ is of importance. Examples are the innovations system representation of random processes [148] (see also Chapter 3) and the proofs of Theorems 2.22 and 2.32. For LTI or LFI systems, the transfer function of $H^+H$ is $|G(f)|^2$ or $|m(t)|^2$, respectively. For general LTV systems, the TF transfer function (GWS) of $H^+H$ is no longer given by the squared magnitude of the GWS of $H$. However, the following result can be obtained from Theorem 2.14 and Corollary 2.12. We note that for the case $\alpha=0$ a similar result for DL operators is given in [118].

Corollary 2.17. For any LTV system $H$, the difference
\[
\Delta_4^{(\alpha)}(t,f)\triangleq L_{H^+H}^{(\alpha)}(t,f)-\big|L_H^{(\alpha)}(t,f)\big|^2
\tag{2.69}
\]
is bounded as
\[
\frac{|\Delta_4^{(\alpha)}(t,f)|}{\|S_H\|_1^2}\le 2\pi\,C_H^{(\alpha)}
\quad\text{with}\quad
C_H^{(\alpha)}\triangleq c_\alpha\,m_H^{(0,1)}m_H^{(1,0)}+2|\alpha|\,m_H^{(1,1)},
\tag{2.70}
\]
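where $c_\alpha=|\alpha+1/2|+|\alpha-1/2|$.

Proof. Subtracting and adding $L_H^{(\alpha)}(t,f)\,L_{H^+}^{(\alpha)}(t,f)$ to $\Delta_4^{(\alpha)}(t,f)$, we obtain
\[
\begin{aligned}
|\Delta_4^{(\alpha)}(t,f)|
&=\Big|L_{H^+H}^{(\alpha)}(t,f)-L_H^{(\alpha)}(t,f)L_{H^+}^{(\alpha)}(t,f)+L_H^{(\alpha)}(t,f)L_{H^+}^{(\alpha)}(t,f)-|L_H^{(\alpha)}(t,f)|^2\Big|\\
&\le\Big|L_{H^+H}^{(\alpha)}(t,f)-L_H^{(\alpha)}(t,f)L_{H^+}^{(\alpha)}(t,f)\Big|+\Big|L_H^{(\alpha)}(t,f)L_{H^+}^{(\alpha)}(t,f)-|L_H^{(\alpha)}(t,f)|^2\Big|\\
&=\Big|L_{H^+H}^{(\alpha)}(t,f)-L_H^{(\alpha)}(t,f)L_{H^+}^{(\alpha)}(t,f)\Big|+\Big|L_{H^+}^{(\alpha)}(t,f)-L_H^{(\alpha)*}(t,f)\Big|\,\big|L_H^{(\alpha)}(t,f)\big|\\
&\le\|S_H\|_1^2\;2\pi c_\alpha\,m_H^{(1,0)}m_H^{(0,1)}+\|S_H\|_1\;4\pi|\alpha|\,m_H^{(1,1)}\,\big|L_H^{(\alpha)}(t,f)\big|,
\end{aligned}
\tag{2.71}
\]
where we used (2.57) with $m_{H^+}^{(k,l)}=m_H^{(k,l)}$ and $\|S_{H^+}\|_1=\|S_H\|_1$ as well as (2.48). From this, (2.70) follows with $|L_H^{(\alpha)}(t,f)|\le\|S_H\|_1$.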
Discussion. Hence, for an underspread system with small $m_H^{(0,1)}m_H^{(1,0)}$ and small $m_H^{(1,1)}$, we have
\[
L_{H^+H}^{(\alpha)}(t,f)\approx\big|L_H^{(\alpha)}(t,f)\big|^2,
\]
such that squaring a system (i.e., composing $H$ and $H^+$) is approximately equivalent to squaring its GWS. We note that the bound (2.70), and thus the above approximation, remain valid if $H^+H$ is replaced by $HH^+$ or H H+ . For $|\alpha|\le 1/2$, we have $c_\alpha=1$ and thus
\[
C_H^{(\alpha)}=m_H^{(1,0)}m_H^{(0,1)}+2|\alpha|\,m_H^{(1,1)}\le m_H^{(1,0)}m_H^{(0,1)}+m_H^{(1,1)},\qquad|\alpha|\le 1/2.
\]
The bound $C_H^{(\alpha)}$ is tightest for $\alpha=0$, in which case $C_H^{(0)}=m_H^{(0,1)}m_H^{(1,0)}$. This is due to the fact that the Weyl symbol of the adjoint is exactly obtained by complex conjugation. Furthermore, for $\alpha=0$ the above result can be refined similarly to Theorem 2.15, yielding the tighter bound
\[
\frac{|\Delta_4^{(0)}(t,f)|}{\|S_H\|_1^2}\le 2\pi\inf_{U\in\mathcal{M}}C_{UHU^+}^{(0)}=2\pi\inf_{U\in\mathcal{M}}m_{UHU^+}^{(1,0)}\,m_{UHU^+}^{(0,1)}.
\tag{2.72}
\]
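The dependence of the constant $C_H^{(\alpha)}$ on $\alpha$ can be checked with a few lines of code. The sketch below is illustrative only: the function names and the moment values are hypothetical, and the formula is the one read from (2.70).

```python
def c_alpha(alpha):
    """c_alpha = |alpha + 1/2| + |alpha - 1/2|."""
    return abs(alpha + 0.5) + abs(alpha - 0.5)

def C_H(alpha, m10, m01, m11):
    """C_H^(alpha) = c_alpha * m^(1,0) * m^(0,1) + 2 |alpha| m^(1,1), as read from (2.70)."""
    return c_alpha(alpha) * m10 * m01 + 2.0 * abs(alpha) * m11

# Hypothetical normalized GSF moments of an underspread system.
m10, m01, m11 = 0.1, 0.2, 0.05
for alpha in (-0.5, -0.25, 0.0, 0.25, 0.5, 1.0):
    print(alpha, c_alpha(alpha), C_H(alpha, m10, m01, m11))
```

For $|\alpha|\le 1/2$ the factor $c_\alpha$ stays at 1, so only the term $2|\alpha|\,m_H^{(1,1)}$ changes, and the smallest value is attained at $\alpha=0$.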
These results for the GWS of the composition of $H$ and $H^+$ are interesting also since they show that the generalized input Wigner distribution $IW_H^{(\alpha)}(t,f)$ and the generalized output Wigner distribution $OW_H^{(\alpha)}(t,f)$ (see (B.36) and [90]) of an underspread system are approximately equal to the squared magnitude of the GWS, thus reducing essentially to the GWS. Furthermore, since both $L_{HH^+}^{(\alpha)}(t,f)$ and $L_{H^+H}^{(\alpha)}(t,f)$ are approximately equal to $|L_H^{(\alpha)}(t,f)|^2$, generalized input and output Wigner distribution are also approximately equal to each other. More precisely, by applying the triangle inequality to
\[
IW_H^{(\alpha)}(t,f)-OW_H^{(\alpha)}(t,f)=L_{H^+H}^{(\alpha)}(t,f)-L_{HH^+}^{(\alpha)}(t,f)
=\Big[L_{H^+H}^{(\alpha)}(t,f)-|L_H^{(\alpha)}(t,f)|^2\Big]+\Big[|L_H^{(\alpha)}(t,f)|^2-L_{HH^+}^{(\alpha)}(t,f)\Big]
\]
and twice using (2.70), we obtain
\[
\frac{\big|IW_H^{(\alpha)}(t,f)-OW_H^{(\alpha)}(t,f)\big|}{\|S_H\|_1^2}\le 4\pi\,C_H^{(\alpha)},
\]
which, in the case $\alpha=0$, can be refined by using (2.72), i.e.,
\[
\frac{\big|IW_H^{(0)}(t,f)-OW_H^{(0)}(t,f)\big|}{\|S_H\|_1^2}\le 4\pi\inf_{U\in\mathcal{M}}m_{UHU^+}^{(0,1)}\,m_{UHU^+}^{(1,0)}.
\]
Thus, for underspread systems the generalized input Wigner distribution and the generalized output Wigner distribution are approximately equal. The approximation $L_{H^+H}^{(\alpha)}(t,f)\approx L_{HH^+}^{(\alpha)}(t,f)$ also suggests that $HH^+\approx H^+H$, i.e., that an underspread operator $H$ is approximately normal. This subject will be discussed in more detail in Subsection 2.3.17.

DL Operators. According to (2.71), we can write
\[
|\Delta_4^{(\alpha)}(t,f)|\le|\Delta_3^{(\alpha)}(t,f)|+|\Delta_2^{(\alpha)}(t,f)|\,\big|L_H^{(\alpha)}(t,f)\big|,
\tag{2.73}
\]
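with $H_1$ and $H_2$ in $\Delta_3^{(\alpha)}(t,f)$ replaced by $H$ and $H^+$, respectively. For a DL operator, the bounds (2.66) on $\Delta_3^{(\alpha)}(t,f)$ and (2.50) on $\Delta_2^{(\alpha)}(t,f)$ apply, and we obtain (recall that $\tau_{H^+}^{(\max)}=\tau_H^{(\max)}$, $\nu_{H^+}^{(\max)}=\nu_H^{(\max)}$, and $\tau_H^{(\max)}\nu_H^{(\max)}=\sigma_H/4$)
\[
\begin{aligned}
|\Delta_4^{(\alpha)}(t,f)|
&\le 2\,\|S_H\|_1\|S_{H^+}\|_1\sin\!\Big(\pi\Big[\big|\alpha+\tfrac12\big|\,\tau_{H^+}^{(\max)}\nu_H^{(\max)}+\big|\alpha-\tfrac12\big|\,\tau_H^{(\max)}\nu_{H^+}^{(\max)}\Big]\Big)
+2\,\|S_H\|_1\sin\!\big(2\pi|\alpha|\,\tau_H^{(\max)}\nu_H^{(\max)}\big)\,\big|L_H^{(\alpha)}(t,f)\big|\\
&\le 2\,\|S_H\|_1^2\sin\!\Big(\frac{\pi}{4}\,c_\alpha\sigma_H\Big)+2\,\|S_H\|_1^2\sin\!\Big(\frac{\pi}{2}\,|\alpha|\sigma_H\Big),
\end{aligned}
\]
where we furthermore used $|L_H^{(\alpha)}(t,f)|\le\|S_H\|_1$. This yields the bound
\[
\frac{|\Delta_4^{(\alpha)}(t,f)|}{\|S_H\|_1^2}\le 2\sin\!\Big(\frac{\pi}{4}\,c_\alpha\sigma_H\Big)+2\sin\!\Big(\frac{\pi}{2}\,|\alpha|\sigma_H\Big),
\tag{2.74}
\]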
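which is valid for DL operators with $c_\alpha\sigma_H\le 2$ (recall that $c_\alpha=|\alpha+1/2|+|\alpha-1/2|$).

Non-DL Operators. For any (i.e., not necessarily DL) operator $H$ and any $\tau_G$, $\nu_G$ such that $2c_\alpha\tau_G\nu_G\le 1$, (2.67) and (2.51) (in combination with the associated condition on $\tau_G$ and $\nu_G$) can be applied to (2.73), which yields
\[
\frac{|\Delta_4^{(\alpha)}(t,f)|}{\|S_H\|_1^2}\le 2\sin(\pi c_\alpha\tau_G\nu_G)
+2\left(\frac{m_H^{(1,0)}}{\tau_G}+\frac{m_H^{(0,1)}}{\nu_G}\right)^{\!2}
+2\sin(2\pi|\alpha|\,\tau_G\nu_G)
+2\left(\frac{m_H^{(1,0)}}{\tau_G}+\frac{m_H^{(0,1)}}{\nu_G}\right).
\tag{2.75}
\]
Hence, the error incurred by applying (2.74) to a non-DL operator that is erroneously assumed to be DL with $\sigma_H=4\tau_G\nu_G$ is bounded by
\[
2\left(\frac{m_H^{(1,0)}}{\tau_G}+\frac{m_H^{(0,1)}}{\nu_G}\right)^{\!2}+2\left(\frac{m_H^{(1,0)}}{\tau_G}+\frac{m_H^{(0,1)}}{\nu_G}\right).
\]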
2.3.6 Operator Inversion Based on the Generalized Weyl Symbol: Part I
In this subsection, we consider the approximate solution of an important operator equation via approximate operator inversions in the TF domain. We will consider equations of the type
\[
H_1\,G\,H_2=H_3,
\tag{2.76}
\]
where $H_1$, $H_2$, and $H_3$ are HS operators that are jointly DL (see footnote 10), i.e., their GSFs are contained within the same centered rectangle $\mathcal{G}$, and $G$ is some operator that is to be calculated. Eq. (2.76) can be solved for $G$ by computing the (pseudo-)inverses of $H_1$ and $H_2$. However, our aim is to find approximate expressions for the GWS of $G$ since it is desirable to replace potentially numerically unstable and computationally intensive operator inversions by simple TF domain inversions (i.e., scalar divisions). Equations of the type (2.76) are important for the following reasons:

- In Section 3.8, we consider a coherence operator $\Gamma_{x,y}$ for nonstationary random processes $x(t)$ and $y(t)$ that is defined by $R_x^{1/2}\,\Gamma_{x,y}\,R_y^{1/2}=R_{x,y}$ (with $R_x$, $R_y$, and $R_{x,y}$ the auto- and cross-correlation operators of $x(t)$ and $y(t)$), i.e., by an equation of the type (2.76) with $H_1=R_x^{1/2}$, $G=\Gamma_{x,y}$, $H_2=R_y^{1/2}$, and $H_3=R_{x,y}$. Our subsequent results will enable us to formulate an approximate TF coherence function that applies to jointly underspread nonstationary processes (see Section 3.8).

Footnote 10: The restriction to DL operators is made for reasons of tractability. The extension of our results to operators with rapidly decaying spreading function would be desirable but appears to be difficult.
- In the Gaussian signal detection problem discussed in Section 4.2, the likelihood ratio test statistic is a quadratic form $\langle H_{LR}\,x,x\rangle$ with $H_{LR}$ defined by $R_0H_{LR}R_1=R_1-R_0$ (here, $R_0$ and $R_1$ are the correlation operators to be discriminated). This equation is of the type (2.76) with $H_1=R_0$, $G=H_{LR}$, $H_2=R_1$, and $H_3=R_1-R_0$. Similarly, the deflection-optimal test statistic is given by $\langle H_D\,x,x\rangle$ with $H_D$ defined by $R_1H_DR_1=R_1-R_0$. This equation is again of the type (2.76) with $H_1=H_2=R_1$, $G=H_D$, and $H_3=R_1-R_0$. The results of this subsection will be the basis for an approximate TF formulation of optimal detectors that is valid for jointly underspread processes (see Section 4.2).

In order to obtain an approximation for the GWS of $G$ in (2.76), it is tempting to twice apply the product formula (2.62) directly to (2.76). Unfortunately, in most cases $G$ will not be DL underspread, thereby prohibiting direct application of (2.62). Alternatively, one could think of applying (2.62) to the equation $G=H_1^{-1}H_3H_2^{-1}$, obtained by multiplying (2.76) by the (pseudo-)inverses $H_1^{-1}$ and $H_2^{-1}$. Unfortunately, $H_1^{-1}$ and $H_2^{-1}$ will typically not be DL either (see Subsection 2.1.4), and hence (2.62) again cannot be applied directly. In the following, we present two theorems which show that a smoothed version of the GWS of $G$ is approximately equal to the following ratio formed with the GWSs of $H_3$, $H_1$, and $H_2$,
\[
L_G^{(\alpha)}(t,f)\ast\ast\,\Phi(t,f)\approx\frac{L_{H_3}^{(\alpha)}(t,f)}{L_{H_1}^{(\alpha)}(t,f)\,L_{H_2}^{(\alpha)}(t,f)}.
\tag{2.77}
\]
Here, $\Phi(t,f)$ denotes a 2-D lowpass function to be discussed in more detail later.

DL Approximation of $G$. Our approximate TF solution of (2.76) is based on splitting the operator $G$ into a DL part and a non-DL part according to (2.4). Thus we have $G=G^{\mathcal{G}}+\bar G^{\mathcal{G}}$ with $S_{G^{\mathcal{G}}}^{(\alpha)}(\tau,\nu)=S_G^{(\alpha)}(\tau,\nu)\,I_{\mathcal{G}}(\tau,\nu)$, where $\mathcal{G}$ is the joint support of the GSFs of $H_1$, $H_2$, and $H_3$. We next show that the non-DL part $\bar G^{\mathcal{G}}=G-G^{\mathcal{G}}$ is negligible in the sense that removing it from $G$ does not greatly influence the validity of (2.76).

Theorem 2.18. Consider an operator $G$ satisfying $H_1GH_2=H_3$, where $H_1$, $H_2$, and $H_3$ are jointly DL HS operators with GSF support contained in the rectangle $\mathcal{G}=[-\tau_G,\tau_G]\times[-\nu_G,\nu_G]$ so that the joint displacement spread is given by $\sigma_G=4\tau_G\nu_G$. Let $G^{\mathcal{G}}$ denote the DL part of $G$ defined by $S_{G^{\mathcal{G}}}^{(\alpha)}(\tau,\nu)=S_G^{(\alpha)}(\tau,\nu)\,I_{\mathcal{G}}(\tau,\nu)$ and let $\bar G^{\mathcal{G}}=G-G^{\mathcal{G}}$ denote the non-DL part of $G$. Then, the difference $H_1G^{\mathcal{G}}H_2-H_3$ is bounded as
\[
\frac{\|H_1G^{\mathcal{G}}H_2-H_3\|_2}{\|H_1\|_2\,\|\bar G^{\mathcal{G}}\|_2\,\|H_2\|_2}\le 3\sqrt{\sigma_G}.
\tag{2.78}
\]
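Proof. In the $(\tau,\nu)$-domain, (2.76) corresponds to $S_{H_1GH_2}^{(\alpha)}(\tau,\nu)=S_{H_3}^{(\alpha)}(\tau,\nu)$. Since $S_{H_1GH_2}^{(\alpha)}(\tau,\nu)$ equals the twisted convolution of $S_{H_1}^{(\alpha)}(\tau,\nu)$, $S_G^{(\alpha)}(\tau,\nu)$, and $S_{H_2}^{(\alpha)}(\tau,\nu)$ (see (B.12)), we obtain
\[
\big(S_{H_1}^{(\alpha)}\,\natural\,S_G^{(\alpha)}\,\natural\,S_{H_2}^{(\alpha)}\big)(\tau,\nu)=S_{H_3}^{(\alpha)}(\tau,\nu).
\]
Splitting $G$ into a DL part and a non-DL part corresponding to the rectangular region $\mathcal{G}$, i.e., $G=G^{\mathcal{G}}+\bar G^{\mathcal{G}}$, further yields
\[
\big(S_{H_1}^{(\alpha)}\,\natural\,S_{G^{\mathcal{G}}}^{(\alpha)}\,\natural\,S_{H_2}^{(\alpha)}\big)(\tau,\nu)
+\big(S_{H_1}^{(\alpha)}\,\natural\,S_{\bar G^{\mathcal{G}}}^{(\alpha)}\,\natural\,S_{H_2}^{(\alpha)}\big)(\tau,\nu)=S_{H_3}^{(\alpha)}(\tau,\nu).
\tag{2.79}
\]
Since the supports of $S_{H_1}^{(\alpha)}(\tau,\nu)$, $S_{G^{\mathcal{G}}}^{(\alpha)}(\tau,\nu)$, and $S_{H_2}^{(\alpha)}(\tau,\nu)$ are confined to $\mathcal{G}$, (B.13) implies that the support of $(S_{H_1}^{(\alpha)}\natural S_{G^{\mathcal{G}}}^{(\alpha)}\natural S_{H_2}^{(\alpha)})(\tau,\nu)$ is confined to $\mathcal{G}'=[-3\tau_G,3\tau_G]\times[-3\nu_G,3\nu_G]$. The crucial observation now is that since $S_{H_3}^{(\alpha)}(\tau,\nu)$ is confined to $\mathcal{G}$ and $(S_{H_1}^{(\alpha)}\natural S_{G^{\mathcal{G}}}^{(\alpha)}\natural S_{H_2}^{(\alpha)})(\tau,\nu)$ is confined to $\mathcal{G}'$, it follows from (2.79) that $(S_{H_1}^{(\alpha)}\natural S_{\bar G^{\mathcal{G}}}^{(\alpha)}\natural S_{H_2}^{(\alpha)})(\tau,\nu)$ must also be confined to $\mathcal{G}'$, irrespective of the fact that the support of $S_{\bar G^{\mathcal{G}}}^{(\alpha)}(\tau,\nu)$ lies totally outside $\mathcal{G}$. Indeed, any contributions of $(S_{H_1}^{(\alpha)}\natural S_{\bar G^{\mathcal{G}}}^{(\alpha)}\natural S_{H_2}^{(\alpha)})(\tau,\nu)$ outside of $\mathcal{G}'$ cannot be canceled by $(S_{H_1}^{(\alpha)}\natural S_{G^{\mathcal{G}}}^{(\alpha)}\natural S_{H_2}^{(\alpha)})(\tau,\nu)$ and thus would contradict (2.79). Proceeding with our proof of the bound (2.78), we note that
\[
H_1G^{\mathcal{G}}H_2-H_3=H_1G^{\mathcal{G}}H_2-H_1GH_2=H_1\big(G^{\mathcal{G}}-G\big)H_2=-H_1\bar G^{\mathcal{G}}H_2,
\tag{2.80}
\]
and hence, using (B.12), there is
\[
\begin{aligned}
\|H_1G^{\mathcal{G}}H_2-H_3\|_2^2=\|H_1\bar G^{\mathcal{G}}H_2\|_2^2
&=\iint_{\mathcal{G}'}\Big|\big(S_{H_1}^{(\alpha)}\natural S_{\bar G^{\mathcal{G}}}^{(\alpha)}\natural S_{H_2}^{(\alpha)}\big)(\tau,\nu)\Big|^2d\tau\,d\nu\\
&=\iint_{\mathcal{G}'}\Big|\iint\big(S_{H_1}^{(\alpha)}\natural S_{\bar G^{\mathcal{G}}}^{(\alpha)}\big)(\tau',\nu')\,S_{H_2}^{(\alpha)}(\tau-\tau',\nu-\nu')\,e^{j2\pi\phi_\alpha(\tau,\nu,\tau',\nu')}\,d\tau'\,d\nu'\Big|^2d\tau\,d\nu,
\end{aligned}
\]
where we used the support constraint of $(S_{H_1}^{(\alpha)}\natural S_{\bar G^{\mathcal{G}}}^{(\alpha)}\natural S_{H_2}^{(\alpha)})(\tau,\nu)$ discussed above. By applying the Schwarz inequality and the inequality (B.15), and by noticing that the area of $\mathcal{G}'$ is $9\sigma_G$, we finally obtain
\[
\begin{aligned}
\|H_1G^{\mathcal{G}}H_2-H_3\|_2^2
&\le\iint_{\mathcal{G}'}\Big[\iint\big|\big(S_{H_1}^{(\alpha)}\natural S_{\bar G^{\mathcal{G}}}^{(\alpha)}\big)(\tau_1,\nu_1)\big|^2d\tau_1\,d\nu_1\Big]\Big[\iint|S_{H_2}(\tau_2,\nu_2)|^2d\tau_2\,d\nu_2\Big]d\tau\,d\nu\\
&\le\|S_{H_1}\|_2^2\,\|S_{\bar G^{\mathcal{G}}}\|_2^2\,\|S_{H_2}\|_2^2\iint_{\mathcal{G}'}d\tau\,d\nu
=\|H_1\|_2^2\,\|\bar G^{\mathcal{G}}\|_2^2\,\|H_2\|_2^2\iint_{\mathcal{G}'}d\tau\,d\nu
=\|H_1\|_2^2\,\|\bar G^{\mathcal{G}}\|_2^2\,\|H_2\|_2^2\;9\,\sigma_G.
\end{aligned}
\]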
Discussion. From the foregoing theorem, it is seen that if $\sigma_G$ is small, i.e., if $H_1$, $H_2$, and $H_3$ are jointly DL underspread, then removing the non-DL part $\bar G^{\mathcal{G}}$ from $G$ does not greatly affect the validity of $H_1GH_2=H_3$:
\[
H_1GH_2=H_3\quad\Longrightarrow\quad H_1G^{\mathcal{G}}H_2\approx H_3.
\tag{2.81}
\]
We note that small $\sigma_G$ requires the GSFs of $H_1$, $H_2$, and $H_3$ to be essentially located along the $\tau$ or $\nu$ axis, respectively. Yet, Theorem 2.18 can easily be extended to allow for arbitrary GSF orientations. Let us assume that the GSFs of $H_1$, $H_2$, and $H_3$ are supported within a region $\tilde{\mathcal{G}}$ that is obtained by a rotation of the rectangular region $\mathcal{G}$ using the rotation matrix
\[
A=\begin{pmatrix}\cos\phi_0&-(\sin\phi_0)/T^2\\T^2\sin\phi_0&\cos\phi_0\end{pmatrix}.
\]
Then, with the metaplectic operator $U\in\mathcal{M}$ that corresponds to $A$ (see Appendix C), we obtain
\[
\begin{aligned}
\big\|H_1G^{\tilde{\mathcal{G}}}H_2-H_3\big\|_2
&=\big\|U\big(H_1G^{\tilde{\mathcal{G}}}H_2-H_3\big)U^+\big\|_2
=\big\|UH_1G^{\tilde{\mathcal{G}}}H_2U^+-UH_3U^+\big\|_2\\
&=\big\|UH_1U^+\,UG^{\tilde{\mathcal{G}}}U^+\,UH_2U^+-UH_3U^+\big\|_2
=\big\|\tilde H_1\tilde G^{\mathcal{G}}\tilde H_2-\tilde H_3\big\|_2,
\end{aligned}
\tag{2.82}
\]
where the systems $\tilde H_i=UH_iU^+$, $i=1,2,3$, have GSFs supported within $\mathcal{G}$ and $\tilde G=UGU^+$. Hence, Theorem 2.18 directly applies to (2.82) and it is thus seen that the orientation of $\tilde{\mathcal{G}}$ is not relevant to (2.78); all that counts is the area $\sigma_{\tilde{\mathcal{G}}}=\sigma_G$ of the GSF support.

Since the operator norm is upper-bounded by the HS norm (see Appendix A), the approximation (2.81) is also valid in the sense that the operator norm of $H_1G^{\mathcal{G}}H_2-H_3$ is small. Furthermore, we note that the above theorem can be viewed as an improvement on inequality (B.16). Indeed, (B.16) is valid for arbitrary operators (i.e., also for non-DL operators) and, together with (2.80), implies that
\[
\|H_1G^{\mathcal{G}}H_2-H_3\|_2=\|H_1\bar G^{\mathcal{G}}H_2\|_2\le\|H_1\|_2\,\|\bar G^{\mathcal{G}}\|_2\,\|H_2\|_2
\quad\text{or}\quad
\frac{\|H_1G^{\mathcal{G}}H_2-H_3\|_2}{\|H_1\|_2\,\|\bar G^{\mathcal{G}}\|_2\,\|H_2\|_2}\le 1.
\]
Approximation for the GWS of $G$. The product (2.5) defining the GSF of $G^{\mathcal{G}}$ corresponds to a 2-D convolution of GWSs,
\[
L_{G^{\mathcal{G}}}^{(\alpha)}(t,f)=\big(L_G^{(\alpha)}\ast\ast\,L_T^{(\alpha)}\big)(t,f),
\tag{2.83}
\]
where $T$ is defined by $S_T^{(\alpha)}(\tau,\nu)=I_{\mathcal{G}}(\tau,\nu)$ (see (2.3)). Since the GWS of $T$ is a 2-D lowpass function (that equals the function $\Phi(t,f)$ in (2.77)), (2.83) means that the GWS of $G^{\mathcal{G}}$ is a smoothed version of the GWS of $G$. The previous theorem, which shows that for jointly DL underspread operators $H_1$, $H_2$, and $H_3$ the non-DL part of $G$ is negligible with respect to the operator equation (2.76), is the basis for the next result, which formulates an approximation for the GWS of $G^{\mathcal{G}}$ (i.e., for the smoothed GWS of $G$ according to (2.83)) by the ratio $L_{H_3}^{(\alpha)}(t,f)/[L_{H_1}^{(\alpha)}(t,f)\,L_{H_2}^{(\alpha)}(t,f)]$.

Theorem 2.19. Let the operators $H_1$, $H_2$, $H_3$, $G$, $G^{\mathcal{G}}$ and the rectangular region $\mathcal{G}$ be defined as in Theorem 2.18. Then, the difference
\[
\Delta_5^{(\alpha)}(t,f)\triangleq L_{H_1}^{(\alpha)}(t,f)\,L_{G^{\mathcal{G}}}^{(\alpha)}(t,f)\,L_{H_2}^{(\alpha)}(t,f)-L_{H_3}^{(\alpha)}(t,f)
\]
is bounded as
\[
\frac{\|\Delta_5^{(\alpha)}\|_\infty}{\|S_{H_1}\|_1\,\|S_G\|_\infty\,\|S_{H_2}\|_1}\le\frac{3\pi}{2}\,c_\alpha\sigma_G^2+9\,\sigma_G,
\qquad
\frac{\|\Delta_5^{(\alpha)}\|_2}{\|H_1\|_2\,\|G\|_2\,\|H_2\|_2}\le c_\alpha\sigma_G^2+8\,c_\alpha\sqrt{\sigma_G^3}+3\sqrt{\sigma_G},
\tag{2.84}
\]
with $c_\alpha=|\alpha+1/2|+|\alpha-1/2|$.
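Proof. To prove the first bound ($L_\infty$ bound), we subtract/add both $L_{H_1}^{(\alpha)}(t,f)\,L_{G^{\mathcal{G}}H_2}^{(\alpha)}(t,f)$ and $L_{H_1G^{\mathcal{G}}H_2}^{(\alpha)}(t,f)$ from/to $\Delta_5^{(\alpha)}(t,f)$ and apply the triangle inequality twice. This gives
\[
|\Delta_5^{(\alpha)}|\le\big|L_{H_1}^{(\alpha)}L_{G^{\mathcal{G}}}^{(\alpha)}L_{H_2}^{(\alpha)}-L_{H_1}^{(\alpha)}L_{G^{\mathcal{G}}H_2}^{(\alpha)}\big|
+\big|L_{H_1}^{(\alpha)}L_{G^{\mathcal{G}}H_2}^{(\alpha)}-L_{H_1G^{\mathcal{G}}H_2}^{(\alpha)}\big|
+\big|L_{H_1G^{\mathcal{G}}H_2}^{(\alpha)}-L_{H_3}^{(\alpha)}\big|,
\tag{2.85}
\]
where all symbols are evaluated at $(t,f)$. By noting that the maximum time shift and maximum frequency shift of both $G^{\mathcal{G}}$ and $H_2$ are $\tau_G$ and $\nu_G$, respectively, the first term on the right-hand side of (2.85) can be bounded by using (2.65) (with $H_1$ in (2.65) replaced by $G^{\mathcal{G}}$) and by subsequently applying the inequalities $|L_{H_1}^{(\alpha)}(t,f)|\le\|S_{H_1}\|_1$ and $\|S_{G^{\mathcal{G}}}\|_1\le\|S_G\|_\infty\,\sigma_G$:
\[
\begin{aligned}
\big|L_{H_1}^{(\alpha)}L_{G^{\mathcal{G}}}^{(\alpha)}L_{H_2}^{(\alpha)}-L_{H_1}^{(\alpha)}L_{G^{\mathcal{G}}H_2}^{(\alpha)}\big|
&=\big|L_{H_1}^{(\alpha)}\big|\,\big|L_{G^{\mathcal{G}}}^{(\alpha)}L_{H_2}^{(\alpha)}-L_{G^{\mathcal{G}}H_2}^{(\alpha)}\big|\\
&\le\big|L_{H_1}^{(\alpha)}\big|\,\|S_{G^{\mathcal{G}}}\|_1\|S_{H_2}\|_1\;2\sin\!\Big(\pi\Big[\big|\alpha+\tfrac12\big|\tau_G\nu_G+\big|\alpha-\tfrac12\big|\tau_G\nu_G\Big]\Big)\\
&\le\|S_{H_1}\|_1\|S_{G^{\mathcal{G}}}\|_1\|S_{H_2}\|_1\,\frac{\pi}{2}\,c_\alpha\sigma_G
\le\frac{\pi}{2}\,c_\alpha\sigma_G^2\,\|S_{H_1}\|_1\|S_G\|_\infty\|S_{H_2}\|_1.
\end{aligned}
\tag{2.86}
\]
In a similar way, the second term on the right-hand side of (2.85) can be bounded by noting that the GSFs of $H_1$ and $G^{\mathcal{G}}H_2$ are confined to $\mathcal{G}$ and $[-2\tau_G,2\tau_G]\times[-2\nu_G,2\nu_G]$, respectively. Hence, applying (2.65) (with $H_2$ in (2.65) replaced by $G^{\mathcal{G}}H_2$) and subsequently using the inequalities $\|S_{G^{\mathcal{G}}H_2}\|_1\le\|S_{G^{\mathcal{G}}}\|_1\|S_{H_2}\|_1$ and $\|S_{G^{\mathcal{G}}}\|_1\le\|S_G\|_\infty\,\sigma_G$ yields
\[
\begin{aligned}
\big|L_{H_1}^{(\alpha)}L_{G^{\mathcal{G}}H_2}^{(\alpha)}-L_{H_1G^{\mathcal{G}}H_2}^{(\alpha)}\big|
&\le\|S_{H_1}\|_1\|S_{G^{\mathcal{G}}H_2}\|_1\;2\sin\!\Big(\pi\Big[\big|\alpha+\tfrac12\big|(2\tau_G)\nu_G+\big|\alpha-\tfrac12\big|\tau_G(2\nu_G)\Big]\Big)\\
&\le\|S_{H_1}\|_1\|S_{G^{\mathcal{G}}}\|_1\|S_{H_2}\|_1\;4\pi c_\alpha\tau_G\nu_G
\le\pi\,c_\alpha\sigma_G^2\,\|S_{H_1}\|_1\|S_G\|_\infty\|S_{H_2}\|_1.
\end{aligned}
\tag{2.87}
\]
Finally, the third term on the right-hand side of (2.85) can be developed and bounded as
\[
\begin{aligned}
\big|L_{H_1G^{\mathcal{G}}H_2}^{(\alpha)}(t,f)-L_{H_3}^{(\alpha)}(t,f)\big|
&=\big|L_{H_1\bar G^{\mathcal{G}}H_2}^{(\alpha)}(t,f)\big|
\le\iint_{\mathcal{G}'}\Big|\big(S_{H_1}^{(\alpha)}\natural S_{\bar G^{\mathcal{G}}}^{(\alpha)}\natural S_{H_2}^{(\alpha)}\big)(\tau,\nu)\Big|\,d\tau\,d\nu\\
&\le\big\|S_{H_1}^{(\alpha)}\natural S_{\bar G^{\mathcal{G}}}^{(\alpha)}\natural S_{H_2}^{(\alpha)}\big\|_\infty\iint_{\mathcal{G}'}d\tau\,d\nu
\le 9\,\sigma_G\,\|S_{H_1}\|_1\|S_{\bar G^{\mathcal{G}}}\|_\infty\|S_{H_2}\|_1\\
&\le 9\,\sigma_G\,\|S_{H_1}\|_1\|S_G\|_\infty\|S_{H_2}\|_1,
\end{aligned}
\tag{2.88}
\]
where we twice used Young's inequality (B.14) (with $p=\infty$, $q=1$) and the fact that the support of $(S_{H_1}^{(\alpha)}\natural S_{\bar G^{\mathcal{G}}}^{(\alpha)}\natural S_{H_2}^{(\alpha)})(\tau,\nu)$ is confined to $\mathcal{G}'$, whose area is given by $9\sigma_G$ (see the proof of Theorem 2.18). The $L_\infty$ bound in (2.84) finally follows by combining (2.86), (2.87), and (2.88).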
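The second bound ($L_2$ bound) in (2.84) is derived by again subtracting/adding both $L_{H_1}^{(\alpha)}(t,f)\,L_{G^{\mathcal{G}}H_2}^{(\alpha)}(t,f)$ and $L_{H_1G^{\mathcal{G}}H_2}^{(\alpha)}(t,f)$ from/to $\Delta_5^{(\alpha)}(t,f)$ and by twice applying the triangle inequality,
\[
\|\Delta_5^{(\alpha)}\|_2\le\big\|L_{H_1}^{(\alpha)}L_{G^{\mathcal{G}}}^{(\alpha)}L_{H_2}^{(\alpha)}-L_{H_1}^{(\alpha)}L_{G^{\mathcal{G}}H_2}^{(\alpha)}\big\|_2
+\big\|L_{H_1}^{(\alpha)}L_{G^{\mathcal{G}}H_2}^{(\alpha)}-L_{H_1G^{\mathcal{G}}H_2}^{(\alpha)}\big\|_2
+\big\|L_{H_1G^{\mathcal{G}}H_2}^{(\alpha)}-L_{H_3}^{(\alpha)}\big\|_2.
\tag{2.89}
\]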
The rst term on the right hand side of (2.89) can be bounded as follows. First, it is easily seen that LH1 LGG LH2 LH1 LGG H2
() () () () () 2
LH1
()
LGG LH2 LGG H2
()
()
()
Noting that the maximum time shift and maximum frequency shift of both GG and H2 are G and G , respectively, and by applying the (slightly rened) bound [118, 119] LGG LH2 LGG H2 we obtain LH1 LGG LH2 LH1 LGG H where we further used the inequalities GG
() () () () ()
2
()
()
()
3 3 64G G GG
H2
= c
3 G GG
H2
LH1
()
c
2
3 G GG
H2
2 c G H1 2
H2
2, 1
(2.90) G H
2
inequality holds for DL operators and can be shown using the Schwarz inequality). The second term on the right hand side of (2.89) can be similarly bounded by noting that the maximum time shift and maximum frequency shift of GG H2 are 2G and 2G , respectively. Then applying the (slightly rened) bound [118, 119] LH1 LGG H LH
2
and LH
()
SH
(the last
()
()
() G 1 G H2 2
64(2G )3 (2G )3 H1
2
GG H2
2,
as well as G = 4G G , SGG H2
() ()
GG
()
H2 2 , and GG
2
G 2 , we obtain
3
LH1 LGG H2 LH1 GG H2

() () 2
c 8 c
4G
H1
2
GG H2
2
2 2.
3 G H 1 2
H2
(2.91)
By noting that $\|L_{H_1G^{\mathcal{G}}H_2}^{(\alpha)}-L_{H_3}^{(\alpha)}\|_2=\|H_1G^{\mathcal{G}}H_2-H_3\|_2$, the third term on the right-hand side of (2.89) is bounded according to (2.78). The $L_2$ bound for $\Delta_5^{(\alpha)}(t,f)$ in (2.84) is finally obtained by combining (2.90), (2.91), and (2.78).

Discussion. The above result shows that for jointly DL underspread HS operators $H_1$, $H_2$, and $H_3$, i.e., operators with GSFs supported within a rectangular region $\mathcal{G}$ of small area $\sigma_G$, it follows from $H_1GH_2=H_3$ that
\[
L_{H_1}^{(\alpha)}(t,f)\,L_{G^{\mathcal{G}}}^{(\alpha)}(t,f)\,L_{H_2}^{(\alpha)}(t,f)\approx L_{H_3}^{(\alpha)}(t,f).
\tag{2.92}
\]
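We note that small $\sigma_G$ requires the GSFs of $H_1$, $H_2$, and $H_3$ to be essentially located along the $\tau$ or $\nu$ axis, respectively. Yet, in the case of $\alpha=0$, Theorem 2.19 can be extended using a refinement of the involved bounds via metaplectic transformations like that performed in Subsection 2.3.4.

Regularized Inversion in the TF Domain. The approximation (2.92) can be used to obtain an approximate solution of (2.76) via a regularized inversion in the TF domain. To this end, let us define a new operator $\tilde G$ via its GWS as
\[
L_{\tilde G}^{(\alpha)}(t,f)\triangleq
\begin{cases}
\dfrac{L_{H_3}^{(\alpha)}(t,f)}{L_{H_1}^{(\alpha)}(t,f)\,L_{H_2}^{(\alpha)}(t,f)},&\text{for }(t,f)\in\mathcal{R},\\[2ex]
0,&\text{for }(t,f)\notin\mathcal{R},
\end{cases}
\tag{2.93}
\]
where
\[
\mathcal{R}\triangleq\left\{(t,f):\frac{\big|L_{H_1}^{(\alpha)}(t,f)\,L_{H_2}^{(\alpha)}(t,f)\big|}{\|S_{H_1}\|_1\,\|S_{H_2}\|_1}\ge\epsilon\right\}
\quad\text{with}\quad
\epsilon\triangleq\kappa\Big[\frac{3\pi}{2}\,c_\alpha\sigma_G^2+9\,\sigma_G\Big]
\tag{2.94}
\]
and $\kappa\ge 0$. Since $|L_{H_1}^{(\alpha)}(t,f)|\le\|S_{H_1}\|_1$, $|L_{H_2}^{(\alpha)}(t,f)|\le\|S_{H_2}\|_1$, and hence $|L_{H_1}^{(\alpha)}(t,f)\,L_{H_2}^{(\alpha)}(t,f)|\le\|S_{H_1}\|_1\|S_{H_2}\|_1$, the quantity $\|S_{H_1}\|_1\|S_{H_2}\|_1$ is an upper bound on the maximum value of $|L_{H_1}^{(\alpha)}(t,f)\,L_{H_2}^{(\alpha)}(t,f)|$. The region $\mathcal{R}$ thus corresponds to the TF locations where the magnitude of the denominator in (2.93) is larger than $\epsilon=\kappa\big[\tfrac{3\pi}{2}c_\alpha\sigma_G^2+9\sigma_G\big]$ times the upper bound $\|S_{H_1}\|_1\|S_{H_2}\|_1$ on its maximum value. Now, Theorem 2.19 implies that, for $(t,f)\in\mathcal{R}$,
\[
\begin{aligned}
\frac{1}{\|S_G\|_\infty}\Big|L_{G^{\mathcal{G}}}^{(\alpha)}(t,f)-L_{\tilde G}^{(\alpha)}(t,f)\Big|
&=\frac{1}{\|S_G\|_\infty}\left|L_{G^{\mathcal{G}}}^{(\alpha)}(t,f)-\frac{L_{H_3}^{(\alpha)}(t,f)}{L_{H_1}^{(\alpha)}(t,f)\,L_{H_2}^{(\alpha)}(t,f)}\right|\\
&=\frac{1}{\|S_G\|_\infty}\,\frac{\big|L_{H_1}^{(\alpha)}(t,f)\,L_{G^{\mathcal{G}}}^{(\alpha)}(t,f)\,L_{H_2}^{(\alpha)}(t,f)-L_{H_3}^{(\alpha)}(t,f)\big|}{\big|L_{H_1}^{(\alpha)}(t,f)\,L_{H_2}^{(\alpha)}(t,f)\big|}\\
&\le\frac{1}{\|S_G\|_\infty}\,\frac{\|S_{H_1}\|_1\|S_G\|_\infty\|S_{H_2}\|_1\big[\tfrac{3\pi}{2}c_\alpha\sigma_G^2+9\sigma_G\big]}{\big|L_{H_1}^{(\alpha)}(t,f)\,L_{H_2}^{(\alpha)}(t,f)\big|}\\
&=\frac{\epsilon}{\kappa}\,\frac{\|S_{H_1}\|_1\|S_{H_2}\|_1}{\big|L_{H_1}^{(\alpha)}(t,f)\,L_{H_2}^{(\alpha)}(t,f)\big|}\le\frac{\epsilon}{\kappa}\,\frac{1}{\epsilon}=\frac{1}{\kappa}.
\end{aligned}
\]
Hence, for $\kappa$ large enough, it follows that within $\mathcal{R}$
\[
L_{G^{\mathcal{G}}}^{(\alpha)}(t,f)\approx L_{\tilde G}^{(\alpha)}(t,f).
\tag{2.95}
\]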
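For the purpose of illustration, let us consider an example where $\sigma_G=10^{-5}$. In order that the regularized TF inverse $L_{\tilde G}^{(\alpha)}(t,f)$ in (2.93) approximates $L_{G^{\mathcal{G}}}^{(\alpha)}(t,f)$ with an accuracy of 1% in the sense that $\frac{1}{\|S_G\|_\infty}\big|L_{G^{\mathcal{G}}}^{(\alpha)}(t,f)-L_{\tilde G}^{(\alpha)}(t,f)\big|\le 0.01$, $\kappa=100$ is required. It follows that $\epsilon=\kappa\,[3\pi c_\alpha\sigma_G^2/2+9\sigma_G]\approx 0.01$. Hence, the region $\mathcal{R}$ where the approximation (2.95) holds with the desired accuracy of 1% is obtained from (2.94) as
\[
\mathcal{R}=\left\{(t,f):\frac{\big|L_{H_1}^{(\alpha)}(t,f)\,L_{H_2}^{(\alpha)}(t,f)\big|}{\|S_{H_1}\|_1\,\|S_{H_2}\|_1}\ge 0.01\right\},
\]
i.e., $\mathcal{R}$ consists of those TF points where $|L_{H_1}^{(\alpha)}(t,f)\,L_{H_2}^{(\alpha)}(t,f)|$ exceeds 1% of the upper bound $\|S_{H_1}\|_1\|S_{H_2}\|_1$ on its maximum value.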
GG
that for large enough, the operator G obtained by the regularized TF inversion (2.93) approximately satises (2.76) as well. It is thus possible to approximately solve operator equations of the type (2.76)
approximately satises (2.76), i.e., H1 GG H2 H3 , the approximation (2.95) implies
using algebraic operations in the TF domain, thereby avoiding computationally costly and potentially unstable operator inversions: H1 GH2 = H3 = H1 GG H2 H3

()
57
LG (t, f ) L e (t, f ) =
()
()
() G
LH1 (t, f ) LH2 (t, f ) 0,
()
LH3 (t, f )
()
, for (t, f ) R, for (t, f ) R ,

() G
with the smoothing function (t, f ) = LT (t, f ). The operator G is nally obtained from L e (t, f ) by an inverse Weyl transform (B.18). Applications of this result to the denition of a TF coherence function and to nonstationary signal detection will be presented in Subsections 3.8 and 4.2, respectively. We note that in a certain sense, the regularized TF domain inversion resembles the computation of pseudo-inverses of (numerically) rank decient matrices (or operators) via a thresholding of the singular values [70]. However, in our case the thresholding is performed in the TF domain, which is considerably more intuitive.
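On a discrete TF grid, the thresholded division of (2.93) amounts to a pointwise operation. The following is a minimal sketch under stated assumptions, not the author's implementation: the GWS samples are passed as equally shaped nested lists of complex numbers, the $L_1$ norms of the spreading functions and the relative threshold `eps` are supplied by the caller, and all names are hypothetical.

```python
def regularized_tf_inverse(L1, L2, L3, norm1, norm2, eps):
    """Pointwise regularized division L3 / (L1 * L2) in the spirit of (2.93).

    L1, L2, L3   : 2-D lists (time x frequency) of complex GWS samples.
    norm1, norm2 : upper bounds ||S_H1||_1 and ||S_H2||_1 on |L1| and |L2|.
    eps          : relative threshold; points with |L1*L2| below
                   eps*norm1*norm2 lie outside the region R and are set to 0.
    """
    floor = eps * norm1 * norm2
    result = []
    for row1, row2, row3 in zip(L1, L2, L3):
        out_row = []
        for a, b, c in zip(row1, row2, row3):
            denom = a * b
            # Inside R: plain scalar division; outside R: regularize to zero.
            out_row.append(c / denom if abs(denom) >= floor else 0j)
        result.append(out_row)
    return result

# Tiny hypothetical example with one well-conditioned and one ill-conditioned TF point.
L1 = [[1 + 0j, 0.001 + 0j]]
L2 = [[2 + 0j, 0.001 + 0j]]
L3 = [[4 + 0j, 1 + 0j]]
print(regularized_tf_inverse(L1, L2, L3, norm1=1.0, norm2=2.0, eps=0.01))
```

Setting the result to zero outside the region plays the same role as discarding small singular values in a pseudo-inverse: it avoids amplifying errors where the denominator is too small for the approximation to be trusted.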
2.3.7 Operator Inversion Based on the Generalized Weyl Symbol: Part II
We will next consider the operator equation
\[
G\,H_2=H_3,
\tag{2.96}
\]
where $H_2$ and $H_3$ are jointly DL HS operators with GSF support region $\mathcal{G}$ and $G$ is the operator to be calculated. Due to our assumption of HS operators $H_1$, $H_2$, $H_3$ in the context of (2.76), (2.96) cannot be viewed as a special case of (2.76) since the identity operator $I$ has infinite HS norm. Still, most of the preceding arguments apply to (2.96) as well, leading to similar theorems with similar proofs. We further note that all results presented below for $GH_2=H_3$ are valid for equations of the type $H_1G=H_3$ as well.

Eq. (2.96) can be solved for $G$ by computing the (pseudo-)inverse of $H_2$. However, our aim again is to find approximate expressions for the GWS of $G$ in order to replace potentially numerically unstable and computationally intensive operator inversions by simple TF domain inversions (i.e., scalar divisions). Eq. (2.96) will be important for studying nonstationary linear signal estimation in Section 4.1. There, we will see that the defining equation for the Wiener filter $H_W$ is given by $H_WR_y=R_{x,y}$, with $R_y$ being the auto-correlation operator of the observation $y(t)$ and $R_{x,y}$ being the cross-correlation operator of the desired signal $x(t)$ and the observation $y(t)$. This equation is of the type (2.96) with $G=H_W$, $H_2=R_y$, and $H_3=R_{x,y}$. The results of this subsection will be useful for an approximate TF formulation of the Wiener filter that is valid for jointly underspread processes (see Section 4.1).

In order to obtain an approximation for the GWS of $G$, it is again tempting to apply (2.62) directly to (2.96) or to the equation $G=H_3H_2^{-1}$ obtained by multiplying (2.96) by $H_2^{-1}$. Unfortunately, in most cases neither $G$ nor $H_2^{-1}$ will be DL underspread, and hence (2.62) cannot be applied directly. In analogy to the discussion of equations of the type (2.76), we subsequently present two theorems which show that a smoothed version of the GWS of $G$ is approximately equal to the ratio of the GWSs of $H_3$ and $H_2$,
\[
L_G^{(\alpha)}(t,f)\ast\ast\,\Phi(t,f)\approx\frac{L_{H_3}^{(\alpha)}(t,f)}{L_{H_2}^{(\alpha)}(t,f)}.
\]
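Here, $\Phi(t,f)=L_T^{(\alpha)}(t,f)$ is a 2-D lowpass function as in (2.83).

DL Approximation of $G$. We again split the operator $G$ into a DL part and a non-DL part such that $G=G^{\mathcal{G}}+\bar G^{\mathcal{G}}$ (with $S_{G^{\mathcal{G}}}^{(\alpha)}(\tau,\nu)=S_G^{(\alpha)}(\tau,\nu)\,I_{\mathcal{G}}(\tau,\nu)$, where $\mathcal{G}$ denotes the joint GSF support of $H_2$ and $H_3$). We will then show that the non-DL part $\bar G^{\mathcal{G}}$ is negligible for the validity of (2.96).

Theorem 2.20. Consider an operator $G$ satisfying $GH_2=H_3$, where $H_2$ and $H_3$ are jointly DL with GSF support contained in the rectangular region $\mathcal{G}=[-\tau_G,\tau_G]\times[-\nu_G,\nu_G]$ so that the joint displacement spread is given by $\sigma_G=4\tau_G\nu_G$. Let $G^{\mathcal{G}}$ denote the DL part of $G$ defined by $S_{G^{\mathcal{G}}}^{(\alpha)}(\tau,\nu)=S_G^{(\alpha)}(\tau,\nu)\,I_{\mathcal{G}}(\tau,\nu)$ and let $\bar G^{\mathcal{G}}=G-G^{\mathcal{G}}$ denote the non-DL part of $G$. Then, the difference $G^{\mathcal{G}}H_2-H_3$ is bounded as
\[
\frac{\|G^{\mathcal{G}}H_2-H_3\|_2}{\|\bar G^{\mathcal{G}}\|_2\,\|H_2\|_2}\le 2\sqrt{\sigma_G}.
\tag{2.97}
\]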
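Proof. The proof of this theorem is essentially parallel to that of Theorem 2.18. In particular, the equation
\[
\big(S_{G^{\mathcal{G}}}^{(\alpha)}\natural S_{H_2}^{(\alpha)}\big)(\tau,\nu)+\big(S_{\bar G^{\mathcal{G}}}^{(\alpha)}\natural S_{H_2}^{(\alpha)}\big)(\tau,\nu)=S_{H_3}^{(\alpha)}(\tau,\nu),
\tag{2.98}
\]
obtained by rewriting $GH_2=G^{\mathcal{G}}H_2+\bar G^{\mathcal{G}}H_2=H_3$ in the $(\tau,\nu)$-domain, implies that the support of $(S_{G^{\mathcal{G}}}^{(\alpha)}\natural S_{H_2}^{(\alpha)})(\tau,\nu)$ is confined to $\mathcal{G}'=[-2\tau_G,2\tau_G]\times[-2\nu_G,2\nu_G]$, and hence (since $S_{H_3}^{(\alpha)}(\tau,\nu)$ is confined to $\mathcal{G}$) $(S_{\bar G^{\mathcal{G}}}^{(\alpha)}\natural S_{H_2}^{(\alpha)})(\tau,\nu)$ is confined to $\mathcal{G}'$. Indeed, any contributions of $(S_{\bar G^{\mathcal{G}}}^{(\alpha)}\natural S_{H_2}^{(\alpha)})(\tau,\nu)$ outside of $\mathcal{G}'$ cannot be canceled by $(S_{G^{\mathcal{G}}}^{(\alpha)}\natural S_{H_2}^{(\alpha)})(\tau,\nu)$ (which is supported within $\mathcal{G}'$) and thus would contradict (2.98). With
\[
G^{\mathcal{G}}H_2-H_3=G^{\mathcal{G}}H_2-GH_2=\big(G^{\mathcal{G}}-G\big)H_2=-\bar G^{\mathcal{G}}H_2,
\tag{2.99}
\]
there is
\[
\begin{aligned}
\|G^{\mathcal{G}}H_2-H_3\|_2^2=\|\bar G^{\mathcal{G}}H_2\|_2^2
&=\iint_{\mathcal{G}'}\Big|\big(S_{\bar G^{\mathcal{G}}}^{(\alpha)}\natural S_{H_2}^{(\alpha)}\big)(\tau,\nu)\Big|^2d\tau\,d\nu\\
&=\iint_{\mathcal{G}'}\Big|\iint S_{\bar G^{\mathcal{G}}}^{(\alpha)}(\tau',\nu')\,S_{H_2}^{(\alpha)}(\tau-\tau',\nu-\nu')\,e^{j2\pi\phi_\alpha(\tau,\nu,\tau',\nu')}\,d\tau'\,d\nu'\Big|^2d\tau\,d\nu\\
&\le\iint_{\mathcal{G}'}\Big[\iint|S_{\bar G^{\mathcal{G}}}(\tau_1,\nu_1)|^2d\tau_1\,d\nu_1\Big]\Big[\iint|S_{H_2}(\tau_2,\nu_2)|^2d\tau_2\,d\nu_2\Big]d\tau\,d\nu\\
&=\|\bar G^{\mathcal{G}}\|_2^2\,\|H_2\|_2^2\iint_{\mathcal{G}'}d\tau\,d\nu=\|\bar G^{\mathcal{G}}\|_2^2\,\|H_2\|_2^2\;4\,\sigma_G,
\end{aligned}
\]
where we used the support constraint of $(S_{\bar G^{\mathcal{G}}}^{(\alpha)}\natural S_{H_2}^{(\alpha)})(\tau,\nu)$, applied the Schwarz inequality, and noted that the area of $\mathcal{G}'$ is $4\sigma_G$.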
Discussion. The foregoing theorem shows that if $\sigma_G$ is small, i.e., if $H_2$ and $H_3$ are jointly DL underspread, then removing the non-DL part $\bar G^{\mathcal{G}}$ from $G$ does not greatly affect the validity of $GH_2=H_3$:
\[
GH_2=H_3\quad\Longrightarrow\quad G^{\mathcal{G}}H_2\approx H_3.
\tag{2.100}
\]
Whereas small $\sigma_G$ requires the GSFs of $H_2$ and $H_3$ to be essentially located along the $\tau$ or $\nu$ axis, respectively, Theorem 2.20 can easily be extended to allow for arbitrary GSF orientations. Let us assume that the GSFs of $H_2$ and $H_3$ are supported within a rectangular region $\tilde{\mathcal{G}}$ obtained by a rotation of the rectangular region $\mathcal{G}$ using a rotation matrix $A$ that corresponds to a metaplectic operator $U$. We then have (cf. (2.82))
\[
\big\|G^{\tilde{\mathcal{G}}}H_2-H_3\big\|_2=\big\|\tilde G^{\mathcal{G}}\tilde H_2-\tilde H_3\big\|_2,
\tag{2.101}
\]
where the systems $\tilde H_i=UH_iU^+$, $i=2,3$, have GSFs supported within $\mathcal{G}$ and $\tilde G=UGU^+$. Hence, Theorem 2.20 directly applies to (2.101) and it is seen that the orientation of $\tilde{\mathcal{G}}$ is not relevant to (2.97).

Since the operator norm is upper-bounded by the HS norm (see Appendix A), the approximation (2.100) is also valid in the sense that the operator norm of $G^{\mathcal{G}}H_2-H_3$ is small. Again, the bound (2.97) can be viewed as an improvement on inequality (B.16). Note that (B.16) holds for arbitrary operators (i.e., also for non-DL operators) and, together with (2.99), implies that
\[
\|G^{\mathcal{G}}H_2-H_3\|_2=\|\bar G^{\mathcal{G}}H_2\|_2\le\|\bar G^{\mathcal{G}}\|_2\,\|H_2\|_2
\quad\text{or}\quad
\frac{\|G^{\mathcal{G}}H_2-H_3\|_2}{\|\bar G^{\mathcal{G}}\|_2\,\|H_2\|_2}\le 1.
\]
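Approximation for the GWS of $G$. The previous theorem, which shows that for $H_2$, $H_3$ jointly DL underspread the non-DL part of $G$ is negligible with respect to the operator equation (2.96), is the basis for the next theorem, which allows us to approximate the GWS of $G^{\mathcal{G}}$ (i.e., the smoothed GWS of $G$ according to (2.83)) by the ratio $L_{H_3}^{(\alpha)}(t,f)/L_{H_2}^{(\alpha)}(t,f)$.

Theorem 2.21. Let the operators $H_2$, $H_3$, $G$, and $G^{\mathcal{G}}$ be defined as in Theorem 2.20. Then, the difference
\[
\Delta_6^{(\alpha)}(t,f)\triangleq L_{G^{\mathcal{G}}}^{(\alpha)}(t,f)\,L_{H_2}^{(\alpha)}(t,f)-L_{H_3}^{(\alpha)}(t,f)
\]
is bounded as
\[
\frac{\|\Delta_6^{(\alpha)}\|_\infty}{\|S_G\|_\infty\,\|S_{H_2}\|_1}\le\frac{\pi}{2}\,c_\alpha\sigma_G^2+4\,\sigma_G,
\qquad
\frac{\|\Delta_6^{(\alpha)}\|_2}{\|G\|_2\,\|H_2\|_2}\le c_\alpha\sqrt{\sigma_G^3}+2\sqrt{\sigma_G},
\]
where $c_\alpha=|\alpha+1/2|+|\alpha-1/2|$.

Proof. To prove the first bound ($L_\infty$ bound), we subtract and add $L_{G^{\mathcal{G}}H_2}^{(\alpha)}(t,f)$ from/to $\Delta_6^{(\alpha)}(t,f)$ and apply the triangle inequality,
\[
|\Delta_6^{(\alpha)}(t,f)|\le\big|L_{G^{\mathcal{G}}}^{(\alpha)}(t,f)\,L_{H_2}^{(\alpha)}(t,f)-L_{G^{\mathcal{G}}H_2}^{(\alpha)}(t,f)\big|+\big|L_{G^{\mathcal{G}}H_2}^{(\alpha)}(t,f)-L_{H_3}^{(\alpha)}(t,f)\big|.
\tag{2.102}
\]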
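The first term on the right-hand side of (2.102) can be bounded by using (2.65) and $\|S_{G^{\mathcal{G}}}\|_1\le\|S_G\|_\infty\,\sigma_G$,
\[
\big|L_{G^{\mathcal{G}}}^{(\alpha)}(t,f)\,L_{H_2}^{(\alpha)}(t,f)-L_{G^{\mathcal{G}}H_2}^{(\alpha)}(t,f)\big|
\le\|S_{H_2}\|_1\,\|S_{G^{\mathcal{G}}}\|_1\,\frac{\pi}{2}\,c_\alpha\sigma_G
\le\|S_{H_2}\|_1\,\|S_G\|_\infty\,\frac{\pi}{2}\,c_\alpha\sigma_G^2.
\]
The second term on the right-hand side of (2.102) can be developed and bounded as
\[
\begin{aligned}
\big|L_{G^{\mathcal{G}}H_2}^{(\alpha)}(t,f)-L_{H_3}^{(\alpha)}(t,f)\big|=\big|L_{\bar G^{\mathcal{G}}H_2}^{(\alpha)}(t,f)\big|
&\le\iint_{\mathcal{G}'}\Big|\big(S_{\bar G^{\mathcal{G}}}^{(\alpha)}\natural S_{H_2}^{(\alpha)}\big)(\tau,\nu)\Big|\,d\tau\,d\nu
\le\big\|S_{\bar G^{\mathcal{G}}}^{(\alpha)}\natural S_{H_2}^{(\alpha)}\big\|_\infty\iint_{\mathcal{G}'}d\tau\,d\nu\\
&\le\|S_{\bar G^{\mathcal{G}}}\|_\infty\,\|S_{H_2}\|_1\;4\,\sigma_G\le 4\,\sigma_G\,\|S_G\|_\infty\,\|S_{H_2}\|_1,
\end{aligned}
\]
where we used Young's inequality (B.14) with $p=\infty$, $q=1$, the inequality $\|S_{\bar G^{\mathcal{G}}}\|_\infty\le\|S_G\|_\infty$, and the fact that the support of $(S_{\bar G^{\mathcal{G}}}^{(\alpha)}\natural S_{H_2}^{(\alpha)})(\tau,\nu)$ is confined to $\mathcal{G}'$, whose area is given by $4\sigma_G$ (see the proof of Theorem 2.20).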
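The second bound ($L_2$ bound) is derived by again subtracting and adding $L_{G^{\mathcal{G}}H_2}^{(\alpha)}(t,f)$ from/to $\Delta_6^{(\alpha)}(t,f)$ and applying the triangle inequality,
\[
\|\Delta_6^{(\alpha)}\|_2\le\big\|L_{G^{\mathcal{G}}}^{(\alpha)}L_{H_2}^{(\alpha)}-L_{G^{\mathcal{G}}H_2}^{(\alpha)}\big\|_2+\big\|L_{G^{\mathcal{G}}H_2}^{(\alpha)}-L_{H_3}^{(\alpha)}\big\|_2
=\big\|L_{G^{\mathcal{G}}}^{(\alpha)}L_{H_2}^{(\alpha)}-L_{G^{\mathcal{G}}H_2}^{(\alpha)}\big\|_2+\big\|G^{\mathcal{G}}H_2-H_3\big\|_2.
\tag{2.103}
\]
The first term in (2.103) obeys the bound
\[
\big\|L_{G^{\mathcal{G}}}^{(\alpha)}L_{H_2}^{(\alpha)}-L_{G^{\mathcal{G}}H_2}^{(\alpha)}\big\|_2\le c_\alpha\sqrt{\sigma_G^3}\;\|G^{\mathcal{G}}\|_2\,\|H_2\|_2\le c_\alpha\sqrt{\sigma_G^3}\;\|G\|_2\,\|H_2\|_2
\tag{2.104}
\]
that has been derived in [118, 119]. The second term in (2.103) can be bounded using (2.97). The $L_2$ bound for $\Delta_6^{(\alpha)}(t,f)$ is finally obtained by noting that $\|\bar G^{\mathcal{G}}\|_2\le\|G\|_2$.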
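Discussion. The above result shows that for jointly DL underspread operators $H_2$, $H_3$, i.e., operators having small joint displacement spread $\sigma_G$, it follows from $GH_2=H_3$ that there is approximately
\[
L_{G^{\mathcal{G}}}^{(\alpha)}(t,f)\,L_{H_2}^{(\alpha)}(t,f)\approx L_{H_3}^{(\alpha)}(t,f).
\tag{2.105}
\]
We note that small $\sigma_G$ requires the GSFs of $H_2$ and $H_3$ to be essentially located along the $\tau$ or $\nu$ axis, respectively. Yet, in the case of $\alpha=0$, Theorem 2.21 can be extended using a refinement of the involved bounds via metaplectic transformations like that performed in Subsection 2.3.4.

Regularized Inversion in the TF Domain. The approximation (2.105) can again be used as a basis for an approximate solution of (2.96) via a regularized inversion in the TF domain. Let us define the operator $\tilde G$ via its GWS as
\[
L_{\tilde G}^{(\alpha)}(t,f)\triangleq
\begin{cases}
\dfrac{L_{H_3}^{(\alpha)}(t,f)}{L_{H_2}^{(\alpha)}(t,f)},&\text{for }(t,f)\in\mathcal{R},\\[2ex]
0,&\text{for }(t,f)\notin\mathcal{R},
\end{cases}
\tag{2.106}
\]
where
\[
\mathcal{R}\triangleq\left\{(t,f):\frac{\big|L_{H_2}^{(\alpha)}(t,f)\big|}{\|S_{H_2}\|_1}\ge\epsilon\right\}
\quad\text{with}\quad
\epsilon\triangleq\kappa\Big[\frac{\pi}{2}\,c_\alpha\sigma_G^2+4\,\sigma_G\Big].
\]
Since $|L_{H_2}^{(\alpha)}(t,f)|\le\|S_{H_2}\|_1$, $\mathcal{R}$ can again be interpreted as the TF region where the magnitude of the denominator $L_{H_2}^{(\alpha)}(t,f)$ in (2.106) is larger than $\epsilon=\kappa\big[\tfrac{\pi}{2}c_\alpha\sigma_G^2+4\sigma_G\big]$ times the upper bound $\|S_{H_2}\|_1$ on its maximum value. Theorem 2.21 implies that, for $(t,f)\in\mathcal{R}$,
\[
\begin{aligned}
\frac{1}{\|S_G\|_\infty}\Big|L_{G^{\mathcal{G}}}^{(\alpha)}(t,f)-L_{\tilde G}^{(\alpha)}(t,f)\Big|
&=\frac{1}{\|S_G\|_\infty}\left|L_{G^{\mathcal{G}}}^{(\alpha)}(t,f)-\frac{L_{H_3}^{(\alpha)}(t,f)}{L_{H_2}^{(\alpha)}(t,f)}\right|
=\frac{1}{\|S_G\|_\infty}\,\frac{\big|L_{G^{\mathcal{G}}}^{(\alpha)}(t,f)\,L_{H_2}^{(\alpha)}(t,f)-L_{H_3}^{(\alpha)}(t,f)\big|}{\big|L_{H_2}^{(\alpha)}(t,f)\big|}\\
&\le\frac{1}{\|S_G\|_\infty}\,\frac{\|S_G\|_\infty\|S_{H_2}\|_1\big[\tfrac{\pi}{2}c_\alpha\sigma_G^2+4\sigma_G\big]}{\big|L_{H_2}^{(\alpha)}(t,f)\big|}
=\frac{\epsilon}{\kappa}\,\frac{\|S_{H_2}\|_1}{\big|L_{H_2}^{(\alpha)}(t,f)\big|}\le\frac{\epsilon}{\kappa}\,\frac{1}{\epsilon}=\frac{1}{\kappa}.
\end{aligned}
\]
Hence, for $\kappa$ large enough, it follows that within $\mathcal{R}$
\[
L_{G^{\mathcal{G}}}^{(\alpha)}(t,f)\approx L_{\tilde G}^{(\alpha)}(t,f).
\]
Since $G^{\mathcal{G}}$ approximately satisfies (2.96), i.e., $G^{\mathcal{G}}H_2\approx H_3$, the preceding approximation implies that the operator $\tilde G$ obtained by the regularized TF inversion (2.106) approximately satisfies (2.96) as well. This allows us to approximately solve operator equations of the type (2.96) using algebraic operations in the TF domain, thereby avoiding a computationally costly operator inversion:
\[
GH_2=H_3\;\Longrightarrow\;G^{\mathcal{G}}H_2\approx H_3\;\Longrightarrow\;
L_G^{(\alpha)}(t,f)\ast\ast\,\Phi(t,f)\approx L_{\tilde G}^{(\alpha)}(t,f)=
\begin{cases}
\dfrac{L_{H_3}^{(\alpha)}(t,f)}{L_{H_2}^{(\alpha)}(t,f)},&\text{for }(t,f)\in\mathcal{R},\\[2ex]
0,&\text{for }(t,f)\notin\mathcal{R},
\end{cases}
\]
with the smoothing function $\Phi(t,f)=L_T^{(\alpha)}(t,f)$. The operator $\tilde G$ can finally be obtained from its GWS via an inverse Weyl transformation (B.18). The application of this result to nonstationary signal estimation will be discussed in Section 4.1.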
2.3.8
Approximate Eigenvalues and Eigenfunctions
As mentioned in Subsection 1.2.1, the complex sinusoids $e_{f_0}(t)=e^{j2\pi f_0t}$, i.e., signals with perfect frequency concentration, are the generalized eigenfunctions [68] of any LTI system, with the transfer function at frequency $f_0$, $G(f_0)$, being the associated generalized eigenvalue. Thus, the response of an LTI system to $e_{f_0}(t)=e^{j2\pi f_0t}$ is $G(f_0)\,e^{j2\pi f_0t}$. Similarly, the Dirac impulses $\delta_{t_0}(t)=\delta(t-t_0)$, i.e., signals perfectly localized in time, are the generalized eigenfunctions of any LFI system, with the temporal transfer function at time $t_0$, $m(t_0)$, being the associated generalized eigenvalue. Thus, the response of an LFI system to a Dirac impulse $\delta(t-t_0)$ is given by $m(t_0)\,\delta(t-t_0)$. Note that in the LTI and LFI cases the eigenfunctions are highly structured: the complex sinusoids $e_f(t)$ are frequency-shifted versions of each other, and the Dirac impulses $\delta_t(t)$ are time-shifted versions of each other,
\[
e_{f_2}(t)=\big(F_{f_2-f_1}e_{f_1}\big)(t),\qquad\delta_{t_2}(t)=\big(T_{t_2-t_1}\delta_{t_1}\big)(t),
\]
with $T_\tau$ and $F_\nu$ denoting the time shift operator and the frequency shift operator, respectively (see Subsection A.4). Moreover, the eigenvalues are given by the values of the transfer function. Thus, for LTI and LFI systems the mathematical notion of an eigenvalue spectrum coincides with the engineering notion of a transfer function.

The situation is different in the case of general LTV systems. The eigenfunctions (singular functions) of different LTV systems are different (unless the systems commute [64, 158]). Furthermore, the eigenfunctions (singular functions) of general LTV systems are not localized and structured in any sense and the eigenvalues (singular values) are not equal to the TF transfer function (i.e., GWS) values. However, we will now show that underspread systems have a well-structured set of TF-localized approximate eigenfunctions, with the associated approximate eigenvalues given by the GWS values. (Note that it is not necessary to consider approximate singular functions since according to Subsection 2.3.17 underspread operators are approximately normal.)

Let $s(t)$ be a normalized function that is well concentrated about the origin of the TF plane (e.g., a Gaussian function). We consider the family of functions (see footnote 11)
\[
s_{t_0,f_0}(t)=\big(S_{t_0,f_0}^{(1/2)}s\big)(t)=s(t-t_0)\,e^{j2\pi f_0t}
\]
obtained by TF-shifting $s(t)$ to the TF point $(t_0,f_0)$. By construction, $s_{t_0,f_0}(t)$ is then well TF-concentrated about $(t_0,f_0)$ and two functions out of this set are related (up to a phase factor) by a TF shift,
\[
s_{t_2,f_2}(t)=e^{-j2\pi f_1(t_1-t_2)}\big(S_{t_2-t_1,f_2-f_1}^{(1/2)}s_{t_1,f_1}\big)(t).
\]
In the following, we shall state conditions under which the response of an LTV system to $s_{t_0,f_0}(t)$ is approximately $L_H^{(\alpha)}(t_0,f_0)\,s_{t_0,f_0}(t)$, which implies that $s_{t_0,f_0}(t)$ is an approximate eigenfunction of $H$ with $L_H^{(\alpha)}(t_0,f_0)$ the associated approximate eigenvalue. The families of TF-shifted functions $s_{t_0,f_0}(t)$ are sometimes called Weyl-Heisenberg function sets.
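The structure of such a Weyl-Heisenberg set can be checked numerically. The sketch below is an illustration under stated assumptions: it uses a unit-energy Gaussian as the prototype $s(t)$, the shift convention $s(t-t_0)\,e^{j2\pi f_0t}$ given in the text, and hypothetical helper names. It verifies that a TF shift preserves the energy and that two members of the family are related by a TF shift up to a phase factor.

```python
import cmath
import math

def gauss(t):
    """Unit-energy Gaussian prototype s(t) = 2^(1/4) exp(-pi t^2)."""
    return 2.0 ** 0.25 * math.exp(-math.pi * t * t)

def tf_shift(x, t0, f0):
    """Return the TF-shifted function t -> x(t - t0) exp(j 2 pi f0 t)."""
    return lambda t: x(t - t0) * cmath.exp(2j * math.pi * f0 * t)

def energy(x, lo, hi, n):
    """Midpoint-rule approximation of the integral of |x(t)|^2 over [lo, hi]."""
    dt = (hi - lo) / n
    return sum(abs(x(lo + (k + 0.5) * dt)) ** 2 for k in range(n)) * dt

t1, f1 = 0.3, 2.0
t2, f2 = 1.5, -1.0
s1 = tf_shift(gauss, t1, f1)
s2 = tf_shift(gauss, t2, f2)

# Shifting s1 by (t2 - t1, f2 - f1) reproduces s2 up to a constant phase factor.
phase = cmath.exp(-2j * math.pi * f1 * (t1 - t2))
s2_from_s1 = tf_shift(s1, t2 - t1, f2 - f1)
print(abs(s2(0.7) - phase * s2_from_s1(0.7)))
print(energy(s2, -8.0, 8.0, 16000))
```

The phase factor depends only on the shift parameters and not on $t$, which is why the whole family can be generated from a single prototype.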
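They have previously been considered as approximate eigenfunctions of DL operators [118, 127], and the next theorem is essentially an adaptation and extension of these results.

Footnote 11: Note that the specific choice of $\alpha = 1/2$ for the joint TF shift $S^{(\alpha)}_{t_0,f_0}$ is made for notational simplicity; choosing a different $\alpha$ leaves the subsequent arguments unchanged. Furthermore, this value of $\alpha$ is not related to that in $L_H^{(\alpha)}(t, f)$ used in our subsequent development.

Theorem 2.22. For any LTV system $H$, TF point $(t_0, f_0)$, and normalized function $s(t)$ (i.e., $\|s\|_2 = 1$), the difference
\[ \Delta_7^{(\alpha)}(t) \triangleq (H s_{t_0,f_0})(t) - L_H^{(\alpha)}(t_0, f_0)\, s_{t_0,f_0}(t) \]
is bounded as
\[ \frac{\|\Delta_7^{(\alpha)}\|_2}{\|S_H\|_1} \le D_{H,s}^{(\alpha)} \qquad \text{with} \qquad D_{H,s}^{(\alpha)} \triangleq \sqrt{\, 2\pi\, C_H^{(\alpha)} + m_{H^+H}^{(\phi_s)} + 2\, m_H^{(\phi_s)} \,} , \tag{2.107} \]
with the weighting function $\phi_s(\tau,\nu) = |1 - A_s^{(\alpha)}(\tau,\nu)|$, where $A_s^{(\alpha)}(\tau,\nu)$ is the generalized ambiguity function (see B.2.4) and $C_H^{(\alpha)}$ is as defined in (2.70).

Proof. First, we develop the square of $\|\Delta_7^{(\alpha)}\|_2$ as
\[ \|\Delta_7^{(\alpha)}\|_2^2 = \|H s_{t_0,f_0}\|_2^2 - 2\,\mathrm{Re}\big\{ L_H^{(\alpha)*}(t_0,f_0)\, \langle H s_{t_0,f_0}, s_{t_0,f_0} \rangle \big\} + |L_H^{(\alpha)}(t_0,f_0)|^2\, \|s_{t_0,f_0}\|_2^2 . \]
Using $\|s_{t_0,f_0}\|_2^2 = \|s\|_2^2 = 1$, subtracting and adding $2\, L_H^{(\alpha)}(t_0,f_0)\, L_H^{(\alpha)*}(t_0,f_0)$, and applying the triangle inequality yields
\[ \begin{aligned} \|\Delta_7^{(\alpha)}\|_2^2 &= \|H s_{t_0,f_0}\|_2^2 - |L_H^{(\alpha)}(t_0,f_0)|^2 - 2\,\mathrm{Re}\big\{ L_H^{(\alpha)*}(t_0,f_0) \big[ \langle H s_{t_0,f_0}, s_{t_0,f_0} \rangle - L_H^{(\alpha)}(t_0,f_0) \big] \big\} \\ &\le \Big| \|H s_{t_0,f_0}\|_2^2 - |L_H^{(\alpha)}(t_0,f_0)|^2 \Big| + 2\, |L_H^{(\alpha)}(t_0,f_0)|\, \big| \langle H s_{t_0,f_0}, s_{t_0,f_0} \rangle - L_H^{(\alpha)}(t_0,f_0) \big| . \end{aligned} \tag{2.108} \]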
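By subtracting and adding $L^{(\alpha)}_{H^+H}(t_0, f_0)$, applying (B.10) to the quadratic form $\langle H^+H\, s_{t_0,f_0}, s_{t_0,f_0} \rangle$, using $A^{(\alpha)}_{s_{t_0,f_0}}(\tau,\nu) = A^{(\alpha)}_s(\tau,\nu)\, e^{j2\pi(\tau f_0 - \nu t_0)}$, and recalling the definition of $\Delta_4^{(\alpha)}(t, f)$ in Corollary 2.17, the first term in (2.108) becomes
\[ \begin{aligned} \Big| \|H s_{t_0,f_0}\|_2^2 - |L_H^{(\alpha)}(t_0,f_0)|^2 \Big| &= \Big| \langle H^+H\, s_{t_0,f_0}, s_{t_0,f_0} \rangle - L^{(\alpha)}_{H^+H}(t_0,f_0) + L^{(\alpha)}_{H^+H}(t_0,f_0) - |L_H^{(\alpha)}(t_0,f_0)|^2 \Big| \\ &= \Big| \iint S^{(\alpha)}_{H^+H}(\tau,\nu)\, A^{(\alpha)*}_{s_{t_0,f_0}}(\tau,\nu)\, d\tau\, d\nu - \iint S^{(\alpha)}_{H^+H}(\tau,\nu)\, e^{j2\pi(\nu t_0 - \tau f_0)}\, d\tau\, d\nu + \Delta_4^{(\alpha)}(t_0,f_0) \Big| \\ &\le \Big| \iint S^{(\alpha)}_{H^+H}(\tau,\nu) \big[ A^{(\alpha)*}_s(\tau,\nu) - 1 \big] e^{j2\pi(\nu t_0 - \tau f_0)}\, d\tau\, d\nu \Big| + |\Delta_4^{(\alpha)}(t_0,f_0)| \\ &\le \iint |S_{H^+H}(\tau,\nu)|\, \big| 1 - A^{(\alpha)}_s(\tau,\nu) \big|\, d\tau\, d\nu + |\Delta_4^{(\alpha)}(t_0,f_0)| \\ &\le \|S_{H^+H}\|_1\, m_{H^+H}^{(\phi_s)} + \|S_H\|_1^2\, 2\pi\, C_H^{(\alpha)} \\ &\le \|S_H\|_1^2 \big[ m_{H^+H}^{(\phi_s)} + 2\pi\, C_H^{(\alpha)} \big] , \end{aligned} \tag{2.109} \]
where we furthermore used the triangle inequality, the bound (2.70), and Young's inequality (B.14) with $p = q = r = 1$, i.e., $\|S_{H^+H}\|_1 \le \|S_H\|_1^2$. In a similar manner, the factor of the second term in (2.108) can be developed and bounded as
\[ \begin{aligned} \big| \langle H s_{t_0,f_0}, s_{t_0,f_0} \rangle - L_H^{(\alpha)}(t_0,f_0) \big| &= \Big| \iint S_H^{(\alpha)}(\tau,\nu) \big[ A^{(\alpha)*}_s(\tau,\nu) - 1 \big] e^{j2\pi(\nu t_0 - \tau f_0)}\, d\tau\, d\nu \Big| \\ &\le \iint |S_H(\tau,\nu)|\, \big| 1 - A^{(\alpha)}_s(\tau,\nu) \big|\, d\tau\, d\nu = \|S_H\|_1\, m_H^{(\phi_s)} . \end{aligned} \tag{2.110} \]
Inserting these two bounds into (2.108) and using $|L_H^{(\alpha)}(t_0, f_0)| \le \|S_H\|_1$ yields the bound (2.107).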

Discussion. For an underspread system $H$ where $C_H^{(\alpha)}$, $m_{H^+H}^{(\phi_s)}$, and $m_H^{(\phi_s)}$ can be made small such that $D_{H,s}^{(\alpha)}$ will be small, we thus have the approximate eigenvalue relation
\[ (H s_{t_0,f_0})(t) \approx L_H^{(\alpha)}(t_0, f_0)\, s_{t_0,f_0}(t) , \tag{2.111} \]
which implies that properly TF-localized functions are approximate eigenfunctions and the GWS $L_H^{(\alpha)}(t, f)$ is an approximate eigenvalue distribution over the TF plane.

[Figure 2.7: Approximate eigenfunctions and eigenvalue interpretation of GWS: (a) Wigner distribution (top) and real and imaginary part (bottom) of input signal $s_{t_0,f_0}(t)$, (b) Weyl symbol of $H$, (c) output signal $(H s_{t_0,f_0})(t)$, and (d) input signal multiplied by corresponding TF transfer function value $L_H(t_0, f_0)$. Axes: time $t$ (horizontal), frequency $f$ (vertical). The signal length is 128 samples and the (normalized) frequency ranges from $-1/4$ to $1/4$.]

In particular, since the effective duration and bandwidth of the normalized function $s(t)$ correspond to the second-order derivatives of $A_s^{(\alpha)}(\tau,\nu)$ at the origin [162, 212],
\[ T_s^2 \triangleq \int_t t^2\, |s(t)|^2\, dt = -\frac{1}{4\pi^2} \left. \frac{\partial^2 A_s^{(\alpha)}(\tau,\nu)}{\partial \nu^2} \right|_{(0,0)} , \qquad F_s^2 \triangleq \int_f f^2\, |S(f)|^2\, df = -\frac{1}{4\pi^2} \left. \frac{\partial^2 A_s^{(\alpha)}(\tau,\nu)}{\partial \tau^2} \right|_{(0,0)} , \]
the GAF of a well TF-localized function $s(t)$ (i.e., having small $T_s F_s$) satisfies $A_s^{(\alpha)}(\tau,\nu) \approx A_s^{(\alpha)}(0,0) = 1$ around the origin, which further implies $\phi_s(\tau,\nu) \approx 0$ for small $\tau$, $\nu$. Thus, it is seen that small $m_H^{(\phi_s)}$ and $m_{H^+H}^{(\phi_s)}$ require that the effective supports of $|S_H(\tau,\nu)|$ and of $|S_{H^+H}(\tau,\nu)|$ are small, i.e., that $H$ is underspread (see footnote 12).

The bound $D_{H,s}^{(\alpha)}$ is tightest for $\alpha = 0$ since here $C_H^{(\alpha)}$ is smallest. Furthermore, $C_H^{(0)}$ can be replaced by $\inf_{U \in \mathcal{M}} C^{(0)}_{UHU^+}$, which allows for oblique orientation of the GSF of $H$. We note that $m_H^{(\phi_s)}$ and $m_{H^+H}^{(\phi_s)}$ can also be adapted to oblique GSF orientation; this can be achieved by proper choice of the function $s(t)$ according to (2.22). An example illustrating the approximation (2.111) for the case $\alpha = 0$ is shown in Fig. 2.7. In this example, the normalized error is $\|\Delta_7^{(0)}\|_2 / \|S_H\|_1 = 0.38$ while the corresponding bound in (2.107) is $D_{H,s}^{(0)} = 0.73$.
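The approximate eigenvalue relation (2.111) can be checked numerically. The following is a minimal sketch under several assumptions that are not part of the text: a cyclic discrete-time model of length `N`, a hypothetical underspread system (two delay taps `g`, slowly varying gain `a`), and the GWS with $\alpha = 1/2$ (Zadeh's time-varying transfer function), which is easy to compute from the system matrix. The helper names `zadeh_symbol` and `tf_atom` are illustrative only.

```python
import numpy as np

N = 128                                    # signal length (cyclic time axis)
n = np.arange(N)

# Hypothetical underspread system: (Hx)[n] = a[n] * sum_m g[m] * x[(n - m) mod N]
g = [1.0, 0.3]                             # short delay spread (delays 0 and 1)
a = 1.0 + 0.2 * np.cos(2 * np.pi * n / N)  # slow time variation (Doppler bins 0, +-1)
H = np.zeros((N, N), dtype=complex)
for m, gm in enumerate(g):
    H[n, (n - m) % N] += a * gm

def zadeh_symbol(H):
    """L[n, k] = sum_m H[n, (n - m) mod N] * exp(-j 2 pi k m / N)  (GWS, alpha = 1/2)."""
    size = H.shape[0]
    rows = np.arange(size)[:, None]
    lags = np.arange(size)[None, :]
    return np.fft.fft(H[rows, (rows - lags) % size], axis=1)

def tf_atom(n0, k0, size):
    """Unit-norm periodized Gaussian shifted to time n0 and frequency bin k0."""
    idx = np.arange(size)
    u = (idx - n0 + size // 2) % size - size // 2      # centered time offset
    s = np.exp(-np.pi * u**2 / size) * np.exp(2j * np.pi * k0 * idx / size)
    return s / np.linalg.norm(s)

L = zadeh_symbol(H)
n0, k0 = 40, 16
s = tf_atom(n0, k0, N)

# || H s - L(n0, k0) s ||_2 relative to |L(n0, k0)| (s has unit norm)
rel_err = np.linalg.norm(H @ s - L[n0, k0] * s) / abs(L[n0, k0])
print(f"relative eigen-equation error: {rel_err:.3f}")
```

For this slowly varying system the residual is a small fraction of $|L(n_0, k_0)|$; widening the delay spread `g` or speeding up the gain `a` makes the system less underspread and increases the residual.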
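Footnote 12: Note that $S^{(\alpha)}_{H^+H}(\tau,\nu) = (S^{(\alpha)}_{H^+}\, \natural\, S^{(\alpha)}_H)(\tau,\nu)$ (twisted convolution) and that twisted convolution enlarges the effective support maximally by a factor of four [118].

DL Operators. From the proof of Theorem 2.22 (see (2.108), (2.109), and (2.110)), it follows that
\[ \|\Delta_7^{(\alpha)}\|_2^2 \le \iint |S_{H^+H}(\tau,\nu)|\, \big| 1 - A_s^{(\alpha)}(\tau,\nu) \big|\, d\tau\, d\nu + |\Delta_4^{(\alpha)}(t_0,f_0)| + 2\, \|S_H\|_1 \iint |S_H(\tau,\nu)|\, \big| 1 - A_s^{(\alpha)}(\tau,\nu) \big|\, d\tau\, d\nu . \tag{2.112} \]
In the case of a DL operator $H$ with GSF support contained in $G_H = [-\tau_H^{(\max)}, \tau_H^{(\max)}] \times [-\nu_H^{(\max)}, \nu_H^{(\max)}]$, we can further develop (2.112) by applying (2.74) (valid for $c_\alpha \sigma_H \le 2$, which also ensures $|\alpha| \sigma_H \le 1$ since $2|\alpha| \le c_\alpha$) and by noting that the support of $|S_{H^+H}(\tau,\nu)|$ is contained in $G_H' = [-2\tau_H^{(\max)}, 2\tau_H^{(\max)}] \times [-2\nu_H^{(\max)}, 2\nu_H^{(\max)}]$:
\[ \begin{aligned} \|\Delta_7^{(\alpha)}\|_2^2 &\le \iint_{G_H'} |S_{H^+H}(\tau,\nu)|\, \big| 1 - A_s^{(\alpha)}(\tau,\nu) \big|\, d\tau\, d\nu + 2\, \|S_H\|_1 \iint_{G_H} |S_H(\tau,\nu)|\, \big| 1 - A_s^{(\alpha)}(\tau,\nu) \big|\, d\tau\, d\nu \\ &\quad + \|S_H\|_1^2 \Big[ 2 \sin\!\Big( \frac{\pi c_\alpha \sigma_H}{4} \Big) + 2 \sin\!\Big( \frac{\pi |\alpha| \sigma_H}{2} \Big) \Big] \\ &\le \phi_2^{(\max)} \iint_{G_H'} |S_{H^+H}(\tau,\nu)|\, d\tau\, d\nu + 2\, \|S_H\|_1\, \phi_1^{(\max)} \iint_{G_H} |S_H(\tau,\nu)|\, d\tau\, d\nu \\ &\quad + 2\, \|S_H\|_1^2 \Big[ \sin\!\Big( \frac{\pi c_\alpha \sigma_H}{4} \Big) + \sin\!\Big( \frac{\pi |\alpha| \sigma_H}{2} \Big) \Big] \\ &= \phi_2^{(\max)}\, \|S_{H^+H}\|_1 + 2\, \|S_H\|_1^2 \Big[ \phi_1^{(\max)} + \sin\!\Big( \frac{\pi c_\alpha \sigma_H}{4} \Big) + \sin\!\Big( \frac{\pi |\alpha| \sigma_H}{2} \Big) \Big] , \end{aligned} \]
where $\phi_1^{(\max)} \triangleq \max_{(\tau,\nu) \in G_H} \phi_s(\tau,\nu)$ and $\phi_2^{(\max)} \triangleq \max_{(\tau,\nu) \in G_H'} \phi_s(\tau,\nu)$. Thus, using $\|S_{H^+H}\|_1 \le \|S_H\|_1^2$, we finally obtain the bounds
\[ \frac{\|\Delta_7^{(\alpha)}\|_2^2}{\|S_H\|_1^2} \le \phi_2^{(\max)} + 2\, \phi_1^{(\max)} + 2 \sin\!\Big( \frac{\pi c_\alpha \sigma_H}{4} \Big) + 2 \sin\!\Big( \frac{\pi |\alpha| \sigma_H}{2} \Big) \le 3\, \phi_2^{(\max)} + 4 \sin\!\Big( \frac{\pi \sigma_H}{4} \Big) , \tag{2.113} \]
where the second (coarser but simpler) bound requires $|\alpha| \le 1/2$ (in which case $c_\alpha = 1$) in addition to $c_\alpha \sigma_H \le 2$ and exploits the fact that $\phi_1^{(\max)} \le \phi_2^{(\max)}$.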
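Non-DL Operators. Next, let us consider the case where $H$ is not DL but erroneously assumed to be DL with GSF support region $G = [-\tau_G, \tau_G] \times [-\nu_G, \nu_G]$. Then, the first term on the right-hand side of (2.112) can be developed as
\[ \begin{aligned} \iint |S_{H^+H}(\tau,\nu)|\, \big| 1 - A_s^{(\alpha)}(\tau,\nu) \big|\, d\tau\, d\nu &= \iint_{G'} |S_{H^+H}(\tau,\nu)|\, \big| 1 - A_s^{(\alpha)}(\tau,\nu) \big|\, d\tau\, d\nu + \iint_{\overline{G'}} |S_{H^+H}(\tau,\nu)|\, \big| 1 - A_s^{(\alpha)}(\tau,\nu) \big|\, d\tau\, d\nu \\ &\le \phi_2^{(\max)}\, \|S_{H^+H}\|_1 + 2 \iint_{\overline{G'}} |S_{H^+H}(\tau,\nu)|\, d\tau\, d\nu \\ &\le \|S_H\|_1^2\, \phi_2^{(\max)} + 2\, \|S_{H^+H}\|_1 \Big[ \frac{m_{H^+H}^{(1,0)}}{2\tau_G} + \frac{m_{H^+H}^{(0,1)}}{2\nu_G} \Big] \\ &\le \|S_H\|_1^2\, \phi_2^{(\max)} + 2\, \|S_H\|_1^2 \Big[ \frac{m_H^{(1,0)}}{\tau_G} + \frac{m_H^{(0,1)}}{\nu_G} \Big] , \end{aligned} \tag{2.114} \]
where we used $G' = [-2\tau_G, 2\tau_G] \times [-2\nu_G, 2\nu_G]$, $\phi_2^{(\max)} = \max_{(\tau,\nu) \in G'} \phi_s(\tau,\nu)$, the fact that $|1 - A_s^{(\alpha)}(\tau,\nu)| \le 2$, the first Chebyshev-like inequality in (2.38), as well as (2.33) and (2.34) (with $H_1 = H$ and $H_2 = H^+$). Similarly, the third term on the right-hand side of (2.112) can be developed as
\[ \begin{aligned} 2\, \|S_H\|_1 \iint |S_H(\tau,\nu)|\, \big| 1 - A_s^{(\alpha)}(\tau,\nu) \big|\, d\tau\, d\nu &= 2\, \|S_H\|_1 \iint_{G} |S_H(\tau,\nu)|\, \big| 1 - A_s^{(\alpha)}(\tau,\nu) \big|\, d\tau\, d\nu + 2\, \|S_H\|_1 \iint_{\overline{G}} |S_H(\tau,\nu)|\, \big| 1 - A_s^{(\alpha)}(\tau,\nu) \big|\, d\tau\, d\nu \\ &\le 2\, \|S_H\|_1^2\, \phi_1^{(\max)} + 4\, \|S_H\|_1 \iint_{\overline{G}} |S_H(\tau,\nu)|\, d\tau\, d\nu \\ &\le 2\, \|S_H\|_1^2\, \phi_1^{(\max)} + 4\, \|S_H\|_1^2 \Big[ \frac{m_H^{(1,0)}}{\tau_G} + \frac{m_H^{(0,1)}}{\nu_G} \Big] , \end{aligned} \tag{2.115} \]
where we used $\phi_1^{(\max)} = \max_{(\tau,\nu) \in G} \phi_s(\tau,\nu)$, $|1 - A_s^{(\alpha)}(\tau,\nu)| \le 2$, and the first Chebyshev-like inequality in (2.38). Furthermore, (2.75) can be applied to the second term $|\Delta_4^{(\alpha)}(t_0, f_0)|$ on the right-hand side of (2.112). Upon combining (2.75) with (2.114) and (2.115) and using $\sigma_G = 4 \tau_G \nu_G$, we finally obtain the following bound on $\|\Delta_7^{(\alpha)}\|_2^2 / \|S_H\|_1^2$, see (2.116) and (2.117):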

()
()
(max)
+ 21
(max)
+ 2 sin m + H G
c G + 2 sin ||G 4 2 mH +2 G
(1,0)
(2.116)
2
mH +8 G
(1,0)
(0,1)
m + H G
(0,1)
(2.117)
By comparing this expression (which is valid for any system $H$) with the first bound in (2.113), it is seen that the bound (2.113), when erroneously applied to a non-DL operator (erroneously assumed to have displacement spread $\sigma_G$), might deviate from the correct bound (2.117) by as much as 8
(0,1) mH G
mH G
(1,0)
+2
(1,0) mH G
(0,1) mH G
2.3.9 Approximate Diagonalization

A normal operator is diagonalized by its eigenfunctions $\{u_k(t)\}$, i.e., $\langle H u_k, u_{k'} \rangle = \langle \lambda_k u_k, u_{k'} \rangle = \lambda_k \langle u_k, u_{k'} \rangle = \lambda_k\, \delta_{kk'}$. This diagonalization property is of major importance in many applications. In the following, we shall establish an approximate diagonalization result for underspread operators. To this end, we consider biorthogonal (see footnote 13) Weyl-Heisenberg function sets [56, 100] $\{u_{k,l}(t)\}$, $\{v_{k,l}(t)\}$ (note that these function sets are now doubly indexed). These function sets are obtained by TF shifting two normalized functions $u(t)$, $v(t)$,
\[ u_{k,l}(t) = u(t - kT)\, e^{j2\pi l F t} , \qquad v_{k,l}(t) = v(t - kT)\, e^{j2\pi l F t} , \]
with (see footnote 14) $TF > 1$, and they are assumed to satisfy the biorthogonality condition $\langle u_{k,l}, v_{k',l'} \rangle = \delta_{kk'}\, \delta_{ll'}$.

Footnote 13: Orthogonal Weyl-Heisenberg function sets suffer from poor TF localization [56] and are thus not considered here.

Footnote 14: Note that the condition $TF > 1$ implies that the function sets $\{u_{k,l}(t)\}$, $\{v_{k,l}(t)\}$ are incomplete. This is no serious restriction since we are not concerned here with complete operator expansions (cf. [118]).

An important consequence of this biorthogonality condition is the fact that the generalized cross ambiguity function (see Subsection B.2.4) of $u(t)$ and $v(t)$ vanishes on the lattice $(kT, lF)$ for $(k, l) \ne (0, 0)$, i.e.,
\[ A_{u,v}^{(\alpha)}(kT, lF) = \delta_{k0}\, \delta_{l0} . \tag{2.118} \]
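This property will be important for the interpretation of the following theorem. We note that approximate operator diagonalization has previously been considered (partly at a mathematically much more sophisticated level) in [36, 42, 54, 64, 118, 120, 128, 137, 138, 153, 179].

Theorem 2.23. For any LTV system $H$ and any biorthogonal Weyl-Heisenberg sets $\{u_{k,l}(t)\}$, $\{v_{k,l}(t)\}$, the difference
\[ \Delta_8^{(\alpha)}[k, l; k', l'] \triangleq \langle H u_{k,l}, v_{k',l'} \rangle - L_H^{(\alpha)}(kT, lF)\, \delta_{kk'}\, \delta_{ll'} \]
is bounded as
\[ \frac{\big| \Delta_8^{(\alpha)}[k, l; k', l'] \big|}{\|S_H\|_1} \le m_H^{(\phi_{u,v}^{(k-k',\, l-l')})} , \tag{2.119} \]
with $\phi_{u,v}^{(k,l)}(\tau,\nu) = \big| \delta_{k0}\, \delta_{l0} - A_{v,u}^{(\alpha)}(\tau + kT, \nu + lF) \big|$.

Proof. First note that $\Delta_8^{(\alpha)}[k, l; k', l']$ can be developed as
\[ \begin{aligned} \Delta_8^{(\alpha)}[k, l; k', l'] &= \langle H u_{k,l}, v_{k',l'} \rangle - L_H^{(\alpha)}(kT, lF)\, \delta_{kk'}\, \delta_{ll'} \\ &= \big\langle S_H^{(\alpha)}, A^{(\alpha)}_{v_{k',l'}, u_{k,l}} \big\rangle - \delta_{kk'}\, \delta_{ll'} \iint S_H^{(\alpha)}(\tau,\nu)\, e^{j2\pi(\nu kT - \tau lF)}\, d\tau\, d\nu \\ &= \iint S_H^{(\alpha)}(\tau,\nu) \Big[ A^{(\alpha)*}_{v_{k',l'}, u_{k,l}}(\tau,\nu) - \delta_{kk'}\, \delta_{ll'}\, e^{j2\pi(\nu kT - \tau lF)} \Big] d\tau\, d\nu . \end{aligned} \]
Specializing this expression to $k = k'$, $l = l'$ and using $A^{(\alpha)*}_{v_{k,l}, u_{k,l}}(\tau,\nu) = A^{(\alpha)*}_{v,u}(\tau,\nu)\, e^{j2\pi(\nu kT - \tau lF)}$, we obtain
\[ \begin{aligned} \big| \Delta_8^{(\alpha)}[k, l; k, l] \big| &= \Big| \iint S_H^{(\alpha)}(\tau,\nu) \big[ A^{(\alpha)*}_{v,u}(\tau,\nu) - 1 \big] e^{j2\pi(\nu kT - \tau lF)}\, d\tau\, d\nu \Big| \\ &\le \iint |S_H(\tau,\nu)|\, \big| 1 - A^{(\alpha)}_{v,u}(\tau,\nu) \big|\, d\tau\, d\nu = \|S_H\|_1\, m_H^{(\phi_{u,v}^{(0,0)})} , \end{aligned} \]
which proves (2.119) for the case $k = k'$, $l = l'$. For $(k, l) \ne (k', l')$, the bound in (2.119) is shown similarly,
\[ \begin{aligned} \big| \Delta_8^{(\alpha)}[k, l; k', l'] \big| &= \Big| \iint S_H^{(\alpha)}(\tau,\nu)\, A^{(\alpha)*}_{v_{k',l'}, u_{k,l}}(\tau,\nu)\, d\tau\, d\nu \Big| \\ &\le \iint |S_H(\tau,\nu)|\, \big| A^{(\alpha)}_{v,u}\big( \tau + (k - k')T, \nu + (l - l')F \big) \big|\, d\tau\, d\nu = \|S_H\|_1\, m_H^{(\phi_{u,v}^{(k-k',\, l-l')})} , \end{aligned} \]
where we used $\big| A^{(\alpha)}_{v_{k',l'}, u_{k,l}}(\tau,\nu) \big| = \big| A^{(\alpha)}_{v,u}\big( \tau + (k - k')T, \nu + (l - l')F \big) \big|$.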
Table 2.3: Ratio $|\langle H u_{0,0}, v_{k',l'} \rangle| / |\langle H u_{0,0}, v_{0,0} \rangle|$ of off-diagonal elements to diagonal element in dB for $k' = -3, \ldots, 3$ (rows) and $l' = -3, \ldots, 3$ (columns).

  k' \ l' |    -3     -2     -1      0      1      2      3
  --------+--------------------------------------------------
    -3    |  -172   -130   -101    -84   -106   -123   -165
    -2    |  -160   -117    -86    -66    -86   -116   -159
    -1    |  -147   -104    -69    -49    -68    -98   -141
     0    |  -153   -110    -54      0    -54    -91   -134
     1    |  -155   -113    -74    -35    -70   -107   -150
     2    |  -160   -117    -91    -53    -86   -116   -158
     3    |  -166   -123   -112    -70   -106   -121   -164
Discussion. The preceding theorem shows that for underspread systems where $m_H^{(\phi_{u,v}^{(k,l)})}$ can be made small by suitable choice of $u(t)$ and $v(t)$, one has
\[ \langle H u_{k,l}, v_{k',l'} \rangle \approx L_H^{(\alpha)}(kT, lF)\, \delta_{kk'}\, \delta_{ll'} = \begin{cases} L_H^{(\alpha)}(kT, lF) , & \text{for } k = k',\ l = l' , \\ 0 , & \text{for } (k, l) \ne (k', l') . \end{cases} \tag{2.120} \]
For well TF-localized $u(t)$, $v(t)$, it can be shown that $A_{v,u}^{(\alpha)}(\tau,\nu) \approx A_{v,u}^{(\alpha)}(0,0) = 1$ or, equivalently, $\phi_{u,v}^{(0,0)}(\tau,\nu) = |1 - A_{v,u}^{(\alpha)}(\tau,\nu)| \approx 0$ about the origin of the $(\tau,\nu)$-plane. Thus, it is seen that small $m_H^{(\phi_{u,v}^{(0,0)})}$ requires that $|S_H(\tau,\nu)|$ is concentrated about the origin, i.e., that $H$ is underspread. Furthermore, (2.118) implies that for $(k, l) \ne (0, 0)$,
\[ \phi_{u,v}^{(k,l)}(0, 0) = \big| A_{u,v}^{(\alpha)}(kT, lF) \big| = 0 , \qquad (k, l) \ne (0, 0) . \]
Thus, for $H$ underspread (with $|S_H(\tau,\nu)|$ concentrated about the origin), $m_H^{(\phi_{u,v}^{(k-k',\, l-l')})}$ for $(k, l) \ne (k', l')$ will also be small. We can thus conclude that well TF-localized biorthogonal Weyl-Heisenberg function sets approximately diagonalize underspread operators in the sense that the off-diagonal elements $\langle H u_{k,l}, v_{k',l'} \rangle$, $(k, l) \ne (k', l')$, are approximately zero and the GWS values at the grid points $(kT, lF)$ approximately equal the diagonal elements $\langle H u_{k,l}, v_{k,l} \rangle$.

The approximation (2.120) is illustrated in Fig. 2.8 and Table 2.3 for $\alpha = 0$ and the same LTV system as in Fig. 2.7. It is seen from Table 2.3 that the off-diagonal values $\langle H u_{0,0}, v_{k',l'} \rangle$ (with $(k', l') \ne (0, 0)$) are substantially smaller than the diagonal element $\langle H u_{0,0}, v_{0,0} \rangle$. In particular, in this example, the normalized difference between $\langle H u_{0,0}, v_{0,0} \rangle$ and $L_H^{(0)}(0, 0)$ is $|\Delta_8^{(0)}[0,0;0,0]| / \|S_H\|_1 = 0.03$ and the corresponding bound is $m_H^{(\phi_{u,v}^{(0,0)})} = 0.14$. Furthermore, the largest normalized off-diagonal element and associated bound are $|\Delta_8^{(0)}[0,0;1,0]| / \|S_H\|_1 = |\langle H u_{0,0}, v_{1,0} \rangle| / \|S_H\|_1 = 0.021$ and $m_H^{(\phi_{u,v}^{(-1,0)})} = 0.098$, respectively.
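The approximate diagonalization (2.120) can be illustrated with a small numerical experiment. This is only a sketch under assumptions that are not taken from the text: a cyclic discrete-time model, a hypothetical underspread test system, the GWS with $\alpha = 1/2$ (Zadeh's time-varying transfer function), a Gaussian prototype $u$, and a biorthogonal set obtained from the inverse Gram matrix (one of several possible constructions). The lattice constants mirror Fig. 2.8 ($T = 16$, $F = 1/8$).

```python
import numpy as np

N, T, Fb = 128, 16, 16        # length, time step, frequency step in DFT bins (F = Fb/N = 1/8)
n = np.arange(N)

# Hypothetical underspread system: (Hx)[n] = a[n] * sum_m g[m] * x[(n - m) mod N]
a = 1.0 + 0.2 * np.cos(2 * np.pi * n / N)
g = [1.0, 0.3]
H = np.zeros((N, N), dtype=complex)
for m, gm in enumerate(g):
    H[n, (n - m) % N] += a * gm

# GWS for alpha = 1/2: L[n, k] = sum_m H[n, (n - m) mod N] exp(-j 2 pi k m / N)
rows, lags = n[:, None], n[None, :]
L = np.fft.fft(H[rows, (rows - lags) % N], axis=1)

# Weyl-Heisenberg set {u_kl} generated by a unit-norm Gaussian prototype
u_axis = (n + N // 2) % N - N // 2
proto = np.exp(-np.pi * u_axis**2 / N)
proto = proto / np.linalg.norm(proto)
grid = [(k, l) for k in range(N // T) for l in range(N // Fb)]
U = np.stack([np.roll(proto, k * T) * np.exp(2j * np.pi * l * Fb * n / N)
              for k, l in grid], axis=1)

# Biorthogonal set via the inverse Gram matrix, so that V^H U = I
V = U @ np.linalg.inv(U.conj().T @ U)

M = V.conj().T @ H @ U                       # M[i, j] = <H u_j, v_i>
diag = np.diag(M)
target = np.array([L[k * T, l * Fb] for k, l in grid])
off = M - np.diag(diag)

max_diag_dev = np.max(np.abs(diag - target))
max_off = np.max(np.abs(off))
print(f"max |<Hu,v> - L| on diagonal: {max_diag_dev:.3f}")
print(f"largest off-diagonal magnitude: {max_off:.3f}")
```

The diagonal of `M` stays close to the symbol samples `target`, and the off-diagonal entries are small compared with the diagonal, which is the behaviour described by (2.120).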
[Figure 2.8: Prototypes $u(t)$ and $v(t)$ of a biorthogonal Weyl-Heisenberg function set with $T = 16$ and $F = 1/8$. (a) Prototype $u(t)$ (top) and magnitude of its Fourier transform (bottom, in dB); (b) prototype $v(t)$ (top) and magnitude of its Fourier transform (bottom, in dB); (c) magnitude of cross ambiguity function of $u(t)$ and $v(t)$; note the zeros on the grid $(kT, lF)$. The signals have length 128 and the (normalized) frequency ranges from $-1/2$ to $1/2$.]
2.3.10 Input-Output Relation for Deterministic Signals Based on the Generalized Weyl Symbol
In this subsection, we discuss a TF input-output relation for deterministic signals based on the short-time Fourier transform [61, 84, 157, 169] (STFT) and the GWS. We note that TF input-output relations for random processes will be presented in Section 3.6. For LTI systems, due to the fact that the complex sinusoids are eigenfunctions, the Fourier transform of the output signal (Hx)(t) is given by G(f ) X(f ). Similarly, for LFI systems (Hx)(t) equals m(t) x(t). Note that these relations involve the signals and the transfer functions in a linear manner. In the case of LTV systems, a similar multiplicative input-output relation in the TF domain may be desirable; this input-output relation should involve the signals and the transfer function (GWS) in a linear manner, too. We shall select the STFT [61, 84, 157, 169] as TF signal representation since it is the only linear TF signal representation that is covariant, up to an unavoidable phase factor, to TF shifts [93]. According to Subsection B.2.1, the STFT is given by
\[ \mathrm{STFT}_x^{(s)}(t, f) \triangleq \int_{t'} x(t')\, s^*(t' - t)\, e^{-j2\pi f t'}\, dt' , \tag{2.121} \]
where $s(t)$ is a normalized window. Assuming $s(t)$ to have good TF concentration and $H$ to be underspread, we know from Subsection 2.3.8 that the TF-shifted functions $s_{t_0,f_0}(t) = s(t - t_0)\, e^{j2\pi f_0 t}$ are approximate eigenfunctions of $H$, i.e., $(H s_{t_0,f_0})(t) \approx L_H^{(\alpha)}(t_0, f_0)\, s_{t_0,f_0}(t)$. Hence, applying $H$ to both sides of the STFT inversion formula (see (B.38))
\[ x(t) = \int_{t_0} \int_{f_0} \mathrm{STFT}_x^{(s)}(t_0, f_0)\, s_{t_0,f_0}(t)\, dt_0\, df_0 , \tag{2.122} \]
we obtain
\[ \begin{aligned} (Hx)(t) &= \Big( H \int_{t_0} \int_{f_0} \mathrm{STFT}_x^{(s)}(t_0, f_0)\, s_{t_0,f_0}\, dt_0\, df_0 \Big)(t) = \int_{t_0} \int_{f_0} \mathrm{STFT}_x^{(s)}(t_0, f_0)\, (H s_{t_0,f_0})(t)\, dt_0\, df_0 \\ &\approx \int_{t_0} \int_{f_0} \mathrm{STFT}_x^{(s)}(t_0, f_0)\, L_H^{(\alpha)}(t_0, f_0)\, s_{t_0,f_0}(t)\, dt_0\, df_0 . \end{aligned} \]
Comparing this with
\[ (Hx)(t) = \int_{t_0} \int_{f_0} \mathrm{STFT}_{Hx}^{(s)}(t_0, f_0)\, s_{t_0,f_0}(t)\, dt_0\, df_0 \]
(obtained from (2.122) by replacing $x(t)$ with $(Hx)(t)$) suggests that there might be $\mathrm{STFT}_{Hx}^{(s)}(t, f) \approx L_H^{(\alpha)}(t, f)\, \mathrm{STFT}_x^{(s)}(t, f)$ (although of course equal integrals do not imply that the integrands are equal). The next theorem gives a bound on the quality of this approximation.
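The STFT (2.121) and its inversion formula (2.122) have a direct discrete counterpart. The sketch below assumes a cyclic discrete-time model in which the integrals become sums over all $N$ time shifts and all $N$ DFT bins, so that a factor $1/N$ appears in the synthesis; the function names `stft` and `istft` and the test signal are illustrative choices, not part of the text.

```python
import numpy as np

def stft(x, s):
    """C[n, k] = sum_m x[m] * conj(s[(m - n) mod N]) * exp(-j 2 pi k m / N)."""
    N = len(x)
    return np.array([np.fft.fft(x * np.conj(np.roll(s, n))) for n in range(N)])

def istft(C, s):
    """x[m] = (1/N) sum_{n,k} C[n, k] * s[(m - n) mod N] * exp(+j 2 pi k m / N)."""
    N = C.shape[0]
    x = np.zeros(N, dtype=complex)
    for n in range(N):
        x += np.roll(s, n) * np.fft.ifft(C[n])   # np.fft.ifft supplies the 1/N factor
    return x

N = 64
idx = np.arange(N)
u = (idx + N // 2) % N - N // 2
s = np.exp(-np.pi * u**2 / N)
s = s / np.linalg.norm(s)                        # normalized window, ||s||_2 = 1

rng = np.random.default_rng(1)
x = rng.standard_normal(N) + 1j * rng.standard_normal(N)

C = stft(x, s)
x_rec = istft(C, s)
recon_error = np.max(np.abs(x_rec - x))
energy_ratio = np.sum(np.abs(C) ** 2) / (N * np.sum(np.abs(x) ** 2))
print(recon_error, energy_ratio)
```

Reconstruction is exact (up to rounding) for any unit-norm window in this fully sampled setting, and the STFT energy equals $N \|x\|_2^2$, the discrete analogue of the continuous-time energy preservation.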
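Theorem 2.24. For any LTV system $H$, any signal $x(t)$, and any normalized function $s(t)$, the difference
\[ \Delta_9^{(\alpha)}(t, f) \triangleq \mathrm{STFT}_{Hx}^{(s)}(t, f) - L_H^{(\alpha)}(t, f)\, \mathrm{STFT}_x^{(s)}(t, f) \]
is bounded as
\[ \frac{|\Delta_9^{(\alpha)}(t, f)|}{\|S_H\|_1\, \|x\|_2} \le D_{H^+,s}^{(\alpha)} + 4\pi |\alpha|\, m_H^{(1,1)} , \qquad \frac{\|\Delta_9^{(\alpha)}\|_2}{\|H\|_2\, \|x\|_2} \le \sqrt{2}\, M_H^{(\tilde\phi_s)} + 4\pi |\alpha|\, M_H^{(1,1)} , \tag{2.123} \]
with the weighting functions $\phi_s(\tau,\nu) = |1 - A_s^{(\alpha)}(\tau,\nu)|$ (used in $D_{H^+,s}^{(\alpha)}$, see (2.107)) and $\tilde\phi_s(\tau,\nu) = \sqrt{1 - \mathrm{Re}\{A_s^{(\alpha)}(\tau,\nu)\}}$ (used in $M_H^{(\tilde\phi_s)}$).

Proof. Using the STFT definition (B.37), we have
\[ \begin{aligned} \Delta_9^{(\alpha)}(t, f) &= \langle Hx, s_{t,f} \rangle - L_H^{(\alpha)}(t, f)\, \langle x, s_{t,f} \rangle = \big\langle x, \big[ H^+ - L_H^{(\alpha)*}(t, f)\, I \big] s_{t,f} \big\rangle \\ &= \big\langle x, \big[ H^+ - L_{H^+}^{(\alpha)}(t, f)\, I + L_{H^+}^{(\alpha)}(t, f)\, I - L_H^{(\alpha)*}(t, f)\, I \big] s_{t,f} \big\rangle = \Delta_A^{(\alpha)}(t, f) + \Delta_B^{(\alpha)}(t, f) \end{aligned} \tag{2.124} \]
with
\[ \Delta_A^{(\alpha)}(t, f) \triangleq \big\langle x, \big[ H^+ - L_{H^+}^{(\alpha)}(t, f)\, I \big] s_{t,f} \big\rangle , \qquad \Delta_B^{(\alpha)}(t, f) \triangleq \big[ L_{H^+}^{(\alpha)*}(t, f) - L_H^{(\alpha)}(t, f) \big] \langle x, s_{t,f} \rangle . \]
Using Schwarz' inequality, (2.107), and $\|S_{H^+}\|_1 = \|S_H\|_1$, we obtain
\[ |\Delta_A^{(\alpha)}(t, f)| \le \|x\|_2\, \big\| H^+ s_{t,f} - L_{H^+}^{(\alpha)}(t, f)\, s_{t,f} \big\|_2 \le \|x\|_2\, \|S_{H^+}\|_1\, D_{H^+,s}^{(\alpha)} = \|x\|_2\, \|S_H\|_1\, D_{H^+,s}^{(\alpha)} , \tag{2.125} \]
while applying Schwarz' inequality to $\Delta_B^{(\alpha)}(t, f)$ and using (2.48) as well as $\|s_{t,f}\|_2 = 1$ yields
\[ |\Delta_B^{(\alpha)}(t, f)| \le \big| L_{H^+}^{(\alpha)*}(t, f) - L_H^{(\alpha)}(t, f) \big|\, \|x\|_2\, \|s_{t,f}\|_2 \le \|x\|_2\, \|S_H\|_1\, 4\pi |\alpha|\, m_H^{(1,1)} . \tag{2.126} \]
The first ($L_\infty$) bound in (2.123) then follows upon inserting (2.125) and (2.126) into $|\Delta_9^{(\alpha)}(t, f)| \le |\Delta_A^{(\alpha)}(t, f)| + |\Delta_B^{(\alpha)}(t, f)|$ (cf. (2.124)).

To derive the second ($L_2$) bound in (2.123), we note that
\[ \begin{aligned} \|\Delta_A^{(\alpha)}\|_2^2 &= \int_t \int_f \big| \big\langle x, H^+ s_{t,f} - L_{H^+}^{(\alpha)}(t, f)\, s_{t,f} \big\rangle \big|^2\, dt\, df \le \|x\|_2^2 \int_t \int_f \big\| H^+ s_{t,f} - L_{H^+}^{(\alpha)}(t, f)\, s_{t,f} \big\|_2^2\, dt\, df \\ &= \|x\|_2^2 \int_t \int_f \Big[ \|H^+ s_{t,f}\|_2^2 - 2\, \mathrm{Re}\big\{ L_{H^+}^{(\alpha)*}(t, f)\, \langle H^+ s_{t,f}, s_{t,f} \rangle \big\} + |L_{H^+}^{(\alpha)}(t, f)|^2\, \|s_{t,f}\|_2^2 \Big]\, dt\, df . \end{aligned} \tag{2.127} \]
Next, we develop the three terms in this expression. Using (B.10), it can be shown that
\[ \begin{aligned} \int_t \int_f \|H^+ s_{t,f}\|_2^2\, dt\, df &= \int_t \int_f \langle H H^+ s_{t,f}, s_{t,f} \rangle\, dt\, df = \int_t \int_f \iint S_{HH^+}^{(\alpha)}(\tau,\nu)\, A_s^{(\alpha)*}(\tau,\nu)\, e^{j2\pi(\nu t - \tau f)}\, d\tau\, d\nu\, dt\, df \\ &= \iint S_{HH^+}^{(\alpha)}(\tau,\nu)\, A_s^{(\alpha)*}(\tau,\nu)\, \delta(\tau)\, \delta(\nu)\, d\tau\, d\nu = S_{HH^+}^{(\alpha)}(0, 0) = \mathrm{Tr}\{H H^+\} = \|S_H\|_2^2 , \end{aligned} \tag{2.128} \]
where we used $A_s^{(\alpha)}(0, 0) = \|s\|_2^2 = 1$ and (B.7). Similarly,
\[ \begin{aligned} \int_t \int_f L_{H^+}^{(\alpha)*}(t, f)\, \langle H^+ s_{t,f}, s_{t,f} \rangle\, dt\, df &= \int_t \int_f L_{H^+}^{(\alpha)*}(t, f) \iint S_{H^+}^{(\alpha)}(\tau,\nu)\, A_s^{(\alpha)*}(\tau,\nu)\, e^{j2\pi(\nu t - \tau f)}\, d\tau\, d\nu\, dt\, df \\ &= \iint |S_{H^+}(\tau,\nu)|^2\, A_s^{(\alpha)*}(\tau,\nu)\, d\tau\, d\nu . \end{aligned} \tag{2.129} \]
Finally, with $\|s_{t,f}\|_2^2 = 1$ we have
\[ \int_t \int_f |L_{H^+}^{(\alpha)}(t, f)|^2\, \|s_{t,f}\|_2^2\, dt\, df = \|L_H^{(\alpha)}\|_2^2 = \|S_H\|_2^2 . \tag{2.130} \]
Inserting (2.128), (2.129), and (2.130) in (2.127) yields the following $L_2$ bound for $\Delta_A^{(\alpha)}(t, f)$,
\[ \begin{aligned} \|\Delta_A^{(\alpha)}\|_2^2 &\le \|x\|_2^2 \Big[ 2 \iint |S_H(\tau,\nu)|^2\, d\tau\, d\nu - 2 \iint |S_H(\tau,\nu)|^2\, \mathrm{Re}\{A_s^{(\alpha)}(\tau,\nu)\}\, d\tau\, d\nu \Big] \\ &= 2\, \|x\|_2^2 \iint |S_H(\tau,\nu)|^2 \big[ 1 - \mathrm{Re}\{A_s^{(\alpha)}(\tau,\nu)\} \big]\, d\tau\, d\nu = 2\, \|x\|_2^2\, \|S_H\|_2^2\, \big( M_H^{(\tilde\phi_s)} \big)^2 . \end{aligned} \tag{2.131} \]
The $L_2$ norm of $\Delta_B^{(\alpha)}(t, f)$ can be bounded using Schwarz' inequality and (2.48),
\[ \|\Delta_B^{(\alpha)}\|_2^2 \le \|x\|_2^2 \int_t \int_f \big| L_{H^+}^{(\alpha)*}(t, f) - L_H^{(\alpha)}(t, f) \big|^2\, \|s_{t,f}\|_2^2\, dt\, df \le \|x\|_2^2\, \|H\|_2^2\, \big( 4\pi |\alpha|\, M_H^{(1,1)} \big)^2 . \tag{2.132} \]
The second bound ($L_2$ bound) in (2.123) then follows by inserting (2.131) and (2.132) into $\|\Delta_9^{(\alpha)}\|_2 \le \|\Delta_A^{(\alpha)}\|_2 + \|\Delta_B^{(\alpha)}\|_2$ (cf. (2.124)).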
Discussion. Theorem 2.24 shows that if $D_{H^+,s}^{(\alpha)}$ (see Subsection 2.3.8) and $M_H^{(\tilde\phi_s)}$ can be made small by suitable choice of $s(t)$, and if $m_H^{(1,1)}$ and $M_H^{(1,1)}$ are small, we obtain the approximate input-output relation
\[ \mathrm{STFT}_{Hx}^{(s)}(t, f) \approx L_H^{(\alpha)}(t, f)\, \mathrm{STFT}_x^{(s)}(t, f) . \tag{2.133} \]
We note that small $M_H^{(\tilde\phi_s)}$ requires that $A_s^{(\alpha)}(\tau,\nu) \approx A_s^{(\alpha)}(0, 0) = 1$ on the effective support of $|S_H(\tau,\nu)|$, which in turn requires $H$ to be underspread with small effective GSF support. Small $m_H^{(1,1)}$ and $M_H^{(1,1)}$ require that $S_H^{(\alpha)}(\tau,\nu)$ be localized along the $\tau$ and/or $\nu$ axis. The latter condition can be relaxed in the case $\alpha = 0$: here, the second terms in the bounds (2.123) are zero and the bounds are tightest. Furthermore, these bounds can be refined by exploiting the covariance of the Weyl symbol to metaplectic transforms. These refined bounds also allow for systems with GSF oriented in oblique directions (see the discussion at the end of Subsections 2.3.4, 2.3.6, and 2.3.8).

[Figure 2.9: Approximate multiplicative input-output relation: (a) STFT magnitude of chirp-like input signal $x(t)$, (b) Weyl symbol of $H$, (c) STFT magnitude of output signal, and (d) magnitude of the product of STFT of input signal and TF transfer function, $L_H^{(0)}(t, f)\, \mathrm{STFT}_x^{(s)}(t, f)$. Axes: time $t$ (horizontal), frequency $f$ (vertical). The signal length is 128 samples and the (normalized) frequency ranges from $-1/4$ to $1/4$.]

An example of the approximation (2.133) is shown in Fig. 2.9 for $\alpha = 0$. In this example, the normalized errors were $\max_{t,f} |\Delta_9^{(0)}(t, f)| / (\|S_H\|_1\, \|x\|_2) = 0.16$ and $\|\Delta_9^{(0)}\|_2 / (\|H\|_2\, \|x\|_2) = 0.075$, while the corresponding bounds in (2.123) were $D_{H^+,s}^{(0)} = 0.62$ and $\sqrt{2}\, M_H^{(\tilde\phi_s)}$ [value missing in the source].
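The approximate multiplicative relation (2.133) can also be tried out numerically. The sketch below is an illustration under assumptions not made in the text: a cyclic discrete-time model, a hypothetical underspread test system, the GWS with $\alpha = 1/2$ (Zadeh's time-varying transfer function) instead of the Weyl symbol used in Fig. 2.9, a Gaussian window, and a windowed chirp as input. It measures the relative $L_2$ error of (2.133), normalized here by the norm of the output STFT rather than as in (2.123).

```python
import numpy as np

N = 128
n = np.arange(N)

# Hypothetical underspread system: (Hx)[n] = a[n] * sum_m g[m] * x[(n - m) mod N]
a = 1.0 + 0.2 * np.cos(2 * np.pi * n / N)
g = [1.0, 0.3]
H = np.zeros((N, N), dtype=complex)
for m, gm in enumerate(g):
    H[n, (n - m) % N] += a * gm

# GWS for alpha = 1/2: L[n, k] = sum_m H[n, (n - m) mod N] exp(-j 2 pi k m / N)
rows, lags = n[:, None], n[None, :]
L = np.fft.fft(H[rows, (rows - lags) % N], axis=1)

# Unit-norm Gaussian analysis window centered at index 0
u = (n + N // 2) % N - N // 2
s = np.exp(-np.pi * u**2 / N)
s = s / np.linalg.norm(s)

def stft(x, s):
    """C[n, k] = sum_m x[m] * conj(s[(m - n) mod N]) * exp(-j 2 pi k m / N)."""
    return np.array([np.fft.fft(x * np.conj(np.roll(s, k))) for k in range(len(x))])

# Chirp-like test input (hypothetical choice)
x = np.hanning(N) * np.exp(2j * np.pi * (0.05 * n + 0.1 * n**2 / N))

lhs = stft(H @ x, s)            # STFT of the output signal
rhs = L * stft(x, s)            # symbol times STFT of the input signal
rel_l2 = np.linalg.norm(lhs - rhs) / np.linalg.norm(lhs)
print(f"relative L2 error of the multiplicative relation: {rel_l2:.3f}")
```

The two TF representations agree up to a small relative error for this slowly varying system, even though the product `rhs` is in general not a valid STFT.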
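DL Operators. Since a comparison of the $L_\infty$ bounds for non-DL and DL operators is cumbersome, we restrict ourselves to the $L_2$ bounds. Here, one can show that for DL $H$ with GSF support contained within $G_H = [-\tau_H^{(\max)}, \tau_H^{(\max)}] \times [-\nu_H^{(\max)}, \nu_H^{(\max)}]$ and for $|\alpha| \sigma_H \le 1$, one has
\[ \frac{\|\Delta_9^{(\alpha)}\|_2}{\|H\|_2\, \|x\|_2} \le \sqrt{2}\, \tilde\phi_s^{(\max)} + 2 \sin\!\Big( \frac{\pi |\alpha| \sigma_H}{2} \Big) \tag{2.134} \]
with $\tilde\phi_s^{(\max)} = \max_{(\tau,\nu) \in G_H} \tilde\phi_s(\tau,\nu)$.

Non-DL Operators. On the other hand, for non-DL operators and any region $G = [-\tau_G, \tau_G] \times [-\nu_G, \nu_G]$ of area $\sigma_G = 4 \tau_G \nu_G$, one can show that for $|\alpha| \sigma_G \le 1$,
\[ \frac{\|\Delta_9^{(\alpha)}\|_2}{\|H\|_2\, \|x\|_2} \le \sqrt{2}\, \tilde\phi_s^{(\max)} + 2 \sin\!\Big( \frac{\pi |\alpha| \sigma_G}{2} \Big) + 2\sqrt{2}\, \Big[ \frac{M_H^{(1,0)}}{\tau_G} + \frac{M_H^{(0,1)}}{\nu_G} \Big] , \tag{2.135} \]
with $\tilde\phi_s^{(\max)} = \max_{(\tau,\nu) \in G} \tilde\phi_s(\tau,\nu)$. It is thus seen that the bound (2.134), when erroneously applied to non-DL operators, might deviate from the correct bound (2.135) by as much as $2\sqrt{2}\, \big[ M_H^{(1,0)} / \tau_G + M_H^{(0,1)} / \nu_G \big]$.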
2.3.11 (Multi-Window) STFT Filter Approximation of Time-Varying Systems

The input-output relation $\mathrm{STFT}_{Hx}^{(s)}(t, f) \approx L_H^{(\alpha)}(t, f)\, \mathrm{STFT}_x^{(s)}(t, f)$ valid in the underspread case suggests that the output signal $(Hx)(t)$ can approximately be obtained by applying an inverse STFT to $L_H^{(\alpha)}(t, f)\, \mathrm{STFT}_x^{(s)}(t, f)$ (note, however, that in general this product is not a valid STFT). This corresponds to an approximate TF implementation of $H$ that consists of calculating the STFT of the input signal, $\mathrm{STFT}_x^{(s)}(t, f)$, multiplying by the TF transfer function $L_H^{(\alpha)}(t, f)$, and applying the inverse STFT (2.122) to $L_H^{(\alpha)}(t, f)\, \mathrm{STFT}_x^{(s)}(t, f)$. The resulting LTV system is called an STFT filter and will be denoted by $\hat H_1^{(\alpha)}$. Using (2.122) and (2.121), the STFT filter's input-output relation is (the subscript in $y_1$ and $\hat H_1^{(\alpha)}$ will become clear shortly)
\[ y_1(t) \triangleq \big( \hat H_1^{(\alpha)} x \big)(t) = \int_{t'} \int_{f'} \mathrm{STFT}_x^{(s)}(t', f')\, L_H^{(\alpha)}(t', f')\, s_{t',f'}(t)\, dt'\, df' = \int_{t'} \int_{f'} \langle x, s_{t',f'} \rangle\, L_H^{(\alpha)}(t', f')\, s_{t',f'}(t)\, dt'\, df' . \]
Such STFT filters have been considered previously [24, 39, 42, 47, 118, 123, 125, 157, 169], mostly with weighting functions other than $L_H^{(\alpha)}(t, f)$. They are an intuitively appealing, practical way of implementing an LTV filter. In a TF-discretized form, they permit time-varying subband signal processing using the Gabor expansion [53, 55, 56, 66] or, equivalently, DFT filter banks [22, 23, 39, 135, 166, 196, 204]. Furthermore, in a mathematical context STFT filters are related to Toeplitz operators [118, 159, 179].

The question now is whether the STFT filter $\hat H_1^{(\alpha)}$ with weighting function $L_H^{(\alpha)}(t, f)$ is a reasonable approximation to $H$. We shall postpone the answer to this question and first discuss multi-window STFT filters [118, 123, 125], which are an extension of the conventional STFT filter using several orthonormal window functions. Here, $N$ STFTs are calculated using $N$ orthonormal window functions $\{s_k(t)\}_{k=1 \ldots N}$. Each of these STFTs is multiplied by the same weighting function (in our case $L_H^{(\alpha)}(t, f)$), and to each product $L_H^{(\alpha)}(t, f)\, \mathrm{STFT}_x^{(s_k)}(t, f)$ an inverse STFT (2.122) using the respective window $s_k(t)$ is applied. The output is finally obtained as a weighted superposition of the individual STFT filter outputs,
\[ y_N(t) \triangleq \big( \hat H_N^{(\alpha)} x \big)(t) = \sum_{k=1}^{N} \eta_k \int_{t'} \int_{f'} \mathrm{STFT}_x^{(s_k)}(t', f')\, L_H^{(\alpha)}(t', f')\, (s_k)_{t',f'}(t)\, dt'\, df' , \tag{2.136} \]
where $\hat H_N^{(\alpha)}$ denotes the overall multi-window STFT filter and the $\eta_k$ satisfy $\sum_{k=1}^{N} \eta_k = 1$. Note that for $N = 1$ this reduces to the single-window case discussed above.

In general, $H$ will be different from its multi-window STFT filter approximation $\hat H_N^{(\alpha)}$. Yet, in the following we will establish a relation and bound for the difference between the GWS of the original system $H$ and that of the multi-window STFT filter $\hat H_N^{(\alpha)}$; subsequently, we will bound the difference between the exact output signal $y(t) = (Hx)(t)$ and the multi-window STFT filter's output signal $y_N(t) = (\hat H_N^{(\alpha)} x)(t)$.

Theorem 2.25. Let $\hat H_N^{(\alpha)}$ be the multi-window STFT filter operator defined in (2.136), with orthonormal window functions $\{s_k(t)\}_{k=1 \ldots N}$ and $\sum_{k=1}^{N} \eta_k = 1$. Then, the difference
\[ \Delta_{10}^{(\alpha)}(t, f) \triangleq L_H^{(\alpha)}(t, f) - L_{\hat H_N^{(\alpha)}}^{(\alpha)}(t, f) \]
satisfies
\[ \frac{|\Delta_{10}^{(\alpha)}(t, f)|}{\|S_H\|_1} \le m_H^{(\phi_N)} , \qquad \frac{\|\Delta_{10}^{(\alpha)}\|_2}{\|H\|_2} = M_H^{(\phi_N)} , \tag{2.137} \]
with the weighting function $\phi_N(\tau,\nu) = \big| 1 - \sum_{k=1}^{N} \eta_k\, A_{s_k}^{(\alpha)}(\tau,\nu) \big| = \big| 1 - S_{T_N}^{(\alpha)}(\tau,\nu) \big|$, where $T_N = \sum_{k=1}^{N} \eta_k\, s_k \otimes s_k^*$. In the single-window case there is $\phi_1(\tau,\nu) = \big| 1 - A_{s_1}^{(\alpha)}(\tau,\nu) \big|$.
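Proof. The GSF of $\hat H_N^{(\alpha)}$ can be written as [64, 118, 125, 190]
\[ S_{\hat H_N^{(\alpha)}}^{(\alpha)}(\tau,\nu) = \sum_{k=1}^{N} \eta_k\, S_H^{(\alpha)}(\tau,\nu)\, A_{s_k}^{(\alpha)}(\tau,\nu) = S_H^{(\alpha)}(\tau,\nu) \sum_{k=1}^{N} \eta_k\, A_{s_k}^{(\alpha)}(\tau,\nu) = S_H^{(\alpha)}(\tau,\nu)\, S_{T_N}^{(\alpha)}(\tau,\nu) . \]
The first ($L_\infty$) bound is then shown as
\[ \begin{aligned} |\Delta_{10}^{(\alpha)}(t, f)| &= \Big| \iint \big[ S_H^{(\alpha)}(\tau,\nu) - S_{\hat H_N^{(\alpha)}}^{(\alpha)}(\tau,\nu) \big] e^{j2\pi(\nu t - \tau f)}\, d\tau\, d\nu \Big| = \Big| \iint S_H^{(\alpha)}(\tau,\nu) \big[ 1 - S_{T_N}^{(\alpha)}(\tau,\nu) \big] e^{j2\pi(\nu t - \tau f)}\, d\tau\, d\nu \Big| \\ &\le \iint |S_H(\tau,\nu)|\, \big| 1 - S_{T_N}^{(\alpha)}(\tau,\nu) \big|\, d\tau\, d\nu = \|S_H\|_1\, m_H^{(\phi_N)} . \end{aligned} \tag{2.138} \]
The expression for the $L_2$ norm of $\Delta_{10}^{(\alpha)}(t, f)$ is shown by noting that
\[ \begin{aligned} \|\Delta_{10}^{(\alpha)}\|_2^2 &= \big\| S_H^{(\alpha)} - S_{\hat H_N^{(\alpha)}}^{(\alpha)} \big\|_2^2 = \iint \big| S_H^{(\alpha)}(\tau,\nu) \big[ 1 - S_{T_N}^{(\alpha)}(\tau,\nu) \big] \big|^2\, d\tau\, d\nu \\ &= \iint |S_H(\tau,\nu)|^2\, \big| 1 - S_{T_N}^{(\alpha)}(\tau,\nu) \big|^2\, d\tau\, d\nu = \|S_H\|_2^2\, \big( M_H^{(\phi_N)} \big)^2 . \end{aligned} \tag{2.139} \]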
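Discussion. If $m_H^{(\phi_N)}$ and $M_H^{(\phi_N)}$ are small, the foregoing theorem implies that the GWS of the multi-window STFT filter is approximately equal to the GWS of $H$ (i.e., to the weighting function in (2.136)),
\[ L_{\hat H_N^{(\alpha)}}^{(\alpha)}(t, f) \approx L_H^{(\alpha)}(t, f) . \]
Small $m_H^{(\phi_N)}$ and $M_H^{(\phi_N)}$ require that $T_N$ (i.e., the orthonormal set $\{s_k(t)\}_{k=1,\ldots,N}$ and the coefficients $\{\eta_k\}_{k=1,\ldots,N}$) can be chosen such that $|S_{T_N}^{(\alpha)}(\tau,\nu)| \approx 1$ on the effective support of $|S_H(\tau,\nu)|$, which in turn requires that the effective support of $|S_H(\tau,\nu)|$ is small, i.e., that $H$ is underspread. Furthermore, since $\|\Delta_{10}^{(\alpha)}\|_2$ equals the HS norm $\|H - \hat H_N^{(\alpha)}\|_2$ and since the operator norm is bounded from above by the HS norm (see Section A.1), there is
\[ \big\| H - \hat H_N^{(\alpha)} \big\|_O \le \big\| H - \hat H_N^{(\alpha)} \big\|_2 = \|H\|_2\, M_H^{(\phi_N)} . \tag{2.140} \]
Thus, it follows from (2.137) that, if $M_H^{(\phi_N)}$ is small, the multi-window STFT filter $\hat H_N^{(\alpha)}$ will be close to $H$ in operator and HS norm, $\hat H_N^{(\alpha)} \approx H$. Note that the weighting function $\phi_N$ provides increasing flexibility with increasing $N$. In fact, for $N = 1$ one has $S_{T_1}^{(\alpha)}(\tau,\nu) = A_{s_1}^{(\alpha)}(\tau,\nu)$ and the situation reduces to the single-window or rank-one case. By exploiting the inherent flexibility of the rank-$N$ operator $T_N$, it is possible to achieve small $m_H^{(\phi_N)}$ and $M_H^{(\phi_N)}$ and thus a good approximation $\hat H_N^{(\alpha)} \approx H$ even in those cases where the conventional (single-window) STFT filter (which is obtained for $N = 1$) fails to yield acceptable results.

As a consequence of the foregoing theorem, the performance of the multi-window STFT filter can also be assessed in terms of the output signals $y(t) = (Hx)(t)$ and $y_N(t) = (\hat H_N^{(\alpha)} x)(t)$.

Corollary 2.26. For any LTV system $H$ and any signal $x(t)$, the difference
\[ \Delta_{11}^{(\alpha)}(t) \triangleq (Hx)(t) - \big( \hat H_N^{(\alpha)} x \big)(t) \]
is bounded as
\[ \frac{|\Delta_{11}^{(\alpha)}(t)|}{\|S_H\|_1\, \|x\|_\infty} \le m_H^{(\phi_N)} , \qquad \frac{\|\Delta_{11}^{(\alpha)}\|_2}{\|H\|_2\, \|x\|_2} \le M_H^{(\phi_N)} , \tag{2.141} \]
with the weighting function $\phi_N(\tau,\nu)$ as in Theorem 2.25.

Proof. Using (B.3), the expression $S_{\hat H_N^{(\alpha)}}^{(\alpha)}(\tau,\nu) = S_H^{(\alpha)}(\tau,\nu)\, S_{T_N}^{(\alpha)}(\tau,\nu)$, and (see footnote 15) $|x_{\tau,\nu}^{(\alpha)}(t)| \le \|x\|_\infty$, the first ($L_\infty$) bound is shown as
\[ \begin{aligned} |\Delta_{11}^{(\alpha)}(t)| &= \Big| \iint \big[ S_H^{(\alpha)}(\tau,\nu) - S_{\hat H_N^{(\alpha)}}^{(\alpha)}(\tau,\nu) \big]\, x_{\tau,\nu}^{(\alpha)}(t)\, d\tau\, d\nu \Big| = \Big| \iint S_H^{(\alpha)}(\tau,\nu) \big[ 1 - S_{T_N}^{(\alpha)}(\tau,\nu) \big]\, x_{\tau,\nu}^{(\alpha)}(t)\, d\tau\, d\nu \Big| \\ &\le \iint |S_H(\tau,\nu)|\, \big| 1 - S_{T_N}^{(\alpha)}(\tau,\nu) \big|\, |x_{\tau,\nu}^{(\alpha)}(t)|\, d\tau\, d\nu \le \|x\|_\infty\, \|S_H\|_1\, m_H^{(\phi_N)} . \end{aligned} \]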
The second ($L_2$) bound is shown by noting that $\|\Delta_{11}^{(\alpha)}\|_2^2 = \big\| (H - \hat H_N^{(\alpha)})\, x \big\|_2^2 \le \big\| H - \hat H_N^{(\alpha)} \big\|_O^2\, \|x\|_2^2$ and invoking (2.140).

Footnote 15: Here, $x_{\tau,\nu}^{(\alpha)}(t) \triangleq (S_{\tau,\nu}^{(\alpha)} x)(t)$ with the generalized joint TF shift operator $S_{\tau,\nu}^{(\alpha)}$ (see (B.4)).

Discussion. It is seen that for an underspread system $H$ for which $m_H^{(\phi_N)}$ and $M_H^{(\phi_N)}$ can be made small by suitable choice of the STFT windows $s_k(t)$ and the weighting factors $\eta_k$, we have $(\hat H_N^{(\alpha)} x)(t) \approx (Hx)(t)$, i.e., the output of the multi-window STFT filter is a good approximation to the output of the given LTV system $H$. In [118] it was shown that for DL operators, an exact multi-window STFT filter implementation is possible by choosing $T$ such that $S_T^{(\alpha)}(\tau,\nu) = I_H(\tau,\nu)$ as in (2.3) (which in general requires infinitely many windows, i.e., $N \to \infty$). However, also in our framework the approximation error can be made arbitrarily small if $N$ is sufficiently large.

DL Operators. Let us assume that $H$ is a DL operator with GSF support $G_H$. Suitably introducing (2.16) in the previous proofs yields the bounds
\[ \frac{|\Delta_{10}^{(\alpha)}(t, f)|}{\|S_H\|_1} \le \phi_N^{(\max)} , \qquad \frac{\|\Delta_{10}^{(\alpha)}\|_2}{\|H\|_2} \le \phi_N^{(\max)} \tag{2.142} \]
and
\[ \frac{|\Delta_{11}^{(\alpha)}(t)|}{\|S_H\|_1\, \|x\|_\infty} \le \phi_N^{(\max)} , \qquad \frac{\|\Delta_{11}^{(\alpha)}\|_2}{\|H\|_2\, \|x\|_2} \le \phi_N^{(\max)} , \tag{2.143} \]
with $\phi_N^{(\max)} = \max_{(\tau,\nu) \in G_H} \phi_N(\tau,\nu)$. In general, these bounds are less tight than the corresponding bounds (2.137) and (2.141), which also hold for DL operators.

Non-DL Operators. For non-DL operators and an arbitrary rectangular region $G = [-\tau_G, \tau_G] \times [-\nu_G, \nu_G]$, the following bounds for $\Delta_{10}^{(\alpha)}(t, f)$ and $\Delta_{11}^{(\alpha)}(t)$ can be derived in terms of $\phi_N^{(\max)}$ and appropriate moments.
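Proposition 2.27. For any LTV system $H$, any multi-window STFT filter $\hat H_N^{(\alpha)}$ as defined in (2.136), and any rectangular support region $G = [-\tau_G, \tau_G] \times [-\nu_G, \nu_G]$, the difference $\Delta_{10}^{(\alpha)}(t, f)$ is bounded as
\[ \frac{|\Delta_{10}^{(\alpha)}(t, f)|}{\|S_H\|_1} \le \phi_N^{(\max)} + \bar\phi_N^{(\max)} \Big[ \frac{m_H^{(1,0)}}{\tau_G} + \frac{m_H^{(0,1)}}{\nu_G} \Big] , \]
\[ \frac{\|\Delta_{10}^{(\alpha)}\|_2}{\|H\|_2} \le \Big[ \big( \phi_N^{(\max)} \big)^2 + \Big( \bar\phi_N^{(\max)}\, \frac{M_H^{(1,0)}}{\tau_G} \Big)^2 + \Big( \bar\phi_N^{(\max)}\, \frac{M_H^{(0,1)}}{\nu_G} \Big)^2 \Big]^{1/2} \le \phi_N^{(\max)} + \bar\phi_N^{(\max)} \Big[ \frac{M_H^{(1,0)}}{\tau_G} + \frac{M_H^{(0,1)}}{\nu_G} \Big] , \]
and the difference $\Delta_{11}^{(\alpha)}(t)$ is bounded as
\[ \frac{|\Delta_{11}^{(\alpha)}(t)|}{\|S_H\|_1\, \|x\|_\infty} \le \phi_N^{(\max)} + \bar\phi_N^{(\max)} \Big[ \frac{m_H^{(1,0)}}{\tau_G} + \frac{m_H^{(0,1)}}{\nu_G} \Big] , \]
\[ \frac{\|\Delta_{11}^{(\alpha)}\|_2}{\|H\|_2\, \|x\|_2} \le \Big[ \big( \phi_N^{(\max)} \big)^2 + \Big( \bar\phi_N^{(\max)}\, \frac{M_H^{(1,0)}}{\tau_G} \Big)^2 + \Big( \bar\phi_N^{(\max)}\, \frac{M_H^{(0,1)}}{\nu_G} \Big)^2 \Big]^{1/2} \le \phi_N^{(\max)} + \bar\phi_N^{(\max)} \Big[ \frac{M_H^{(1,0)}}{\tau_G} + \frac{M_H^{(0,1)}}{\nu_G} \Big] . \]
Here, $\phi_N^{(\max)} = \max_{(\tau,\nu) \in G} \{\phi_N(\tau,\nu)\}$ and $\bar\phi_N^{(\max)} = \max_{(\tau,\nu) \notin G} \{\phi_N(\tau,\nu)\}$ with $\phi_N(\tau,\nu)$ defined as in Theorem 2.25.

Proof. Starting directly with the expression (2.138) and splitting this integral yields
\[ \begin{aligned} |\Delta_{10}^{(\alpha)}(t, f)| &\le \iint_{G} |S_H(\tau,\nu)|\, \big| 1 - S_{T_N}^{(\alpha)}(\tau,\nu) \big|\, d\tau\, d\nu + \iint_{\overline{G}} |S_H(\tau,\nu)|\, \big| 1 - S_{T_N}^{(\alpha)}(\tau,\nu) \big|\, d\tau\, d\nu \\ &= \iint_{G} |S_H(\tau,\nu)|\, \phi_N(\tau,\nu)\, d\tau\, d\nu + \iint_{\overline{G}} |S_H(\tau,\nu)|\, \phi_N(\tau,\nu)\, d\tau\, d\nu \\ &\le \phi_N^{(\max)} \iint_{G} |S_H(\tau,\nu)|\, d\tau\, d\nu + \bar\phi_N^{(\max)} \iint_{\overline{G}} |S_H(\tau,\nu)|\, d\tau\, d\nu , \end{aligned} \]
where we used $\phi_N(\tau,\nu) \le \phi_N^{(\max)}$ for $(\tau,\nu) \in G$ and $\phi_N(\tau,\nu) \le \bar\phi_N^{(\max)}$ for $(\tau,\nu) \notin G$. The first term in this expression is bounded by $\|S_H\|_1\, \phi_N^{(\max)}$, and a bound for the second term is given by the first inequality in (2.38). Combining these bounds yields the result to be proved. The $L_2$ bound for $\Delta_{10}^{(\alpha)}(t, f)$ and the bounds for $\Delta_{11}^{(\alpha)}(t)$ can be derived in a completely analogous way. The coarser bounds for the $L_2$ norms of $\Delta_{10}^{(\alpha)}(t, f)$ and $\Delta_{11}^{(\alpha)}(t)$ follow from the inequality $\sqrt{a^2 + b^2} \le a + b$, valid for $a \ge 0$, $b \ge 0$.

The foregoing result shows that the bounds (2.142) and (2.143), when erroneously applied to non-DL operators, might deviate from the correct bounds on the $L_\infty$ and $L_2$ norm, respectively, of the differences $\Delta_{10}^{(\alpha)}(t, f)$ and $\Delta_{11}^{(\alpha)}(t)$ in Proposition 2.27 by as much as $\bar\phi_N^{(\max)} \big[ m_H^{(1,0)} / \tau_G + m_H^{(0,1)} / \nu_G \big]$ and $\bar\phi_N^{(\max)} \big[ M_H^{(1,0)} / \tau_G + M_H^{(0,1)} / \nu_G \big]$.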
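The single- and multi-window STFT filters of this subsection can be sketched in a few lines. The code below is an illustration only, under assumptions that go beyond the text: a cyclic discrete-time model, a hypothetical underspread test system, the GWS with $\alpha = 1/2$ (Zadeh's time-varying transfer function) as the weighting function, a Gaussian and a first-order Hermite-like window, and hypothetical weights `eta = [1.5, -0.5]` that sum to one. The function `stft_filter` mirrors (2.136).

```python
import numpy as np

N = 128
n = np.arange(N)
u = (n + N // 2) % N - N // 2                      # centered (cyclic) time axis

# Hypothetical underspread system: (Hx)[n] = a[n] * sum_m g[m] * x[(n - m) mod N]
a = 1.0 + 0.2 * np.cos(2 * np.pi * n / N)
g = [1.0, 0.3]
H = np.zeros((N, N), dtype=complex)
for m, gm in enumerate(g):
    H[n, (n - m) % N] += a * gm

# Weighting function: GWS for alpha = 1/2
rows, lags = n[:, None], n[None, :]
L = np.fft.fft(H[rows, (rows - lags) % N], axis=1)

# Two (numerically) orthonormal windows: Gaussian and first-order Hermite-like
s0 = np.exp(-np.pi * u**2 / N)
s0 = s0 / np.linalg.norm(s0)
s1 = u * np.exp(-np.pi * u**2 / N)
s1 = s1 / np.linalg.norm(s1)
windows, eta = [s0, s1], [1.5, -0.5]               # weights sum to one

def stft(x, s):
    """C[n, k] = sum_m x[m] * conj(s[(m - n) mod N]) * exp(-j 2 pi k m / N)."""
    return np.array([np.fft.fft(x * np.conj(np.roll(s, k))) for k in range(len(x))])

def stft_filter(x, weight, windows, eta):
    """y = sum_i eta_i * inverse-STFT_{s_i}( weight * STFT_{s_i} x ), cf. (2.136)."""
    y = np.zeros(len(x), dtype=complex)
    for s, e in zip(windows, eta):
        C = weight * stft(x, s)
        for k in range(len(x)):
            y += e * np.roll(s, k) * np.fft.ifft(C[k])   # ifft includes the 1/N factor
    return y

x = np.hanning(N) * np.exp(2j * np.pi * (0.05 * n + 0.1 * n**2 / N))
y = H @ x
y1 = stft_filter(x, L, [s0], [1.0])                # single-window STFT filter
y2 = stft_filter(x, L, windows, eta)               # two-window STFT filter
err1 = np.linalg.norm(y - y1) / np.linalg.norm(y)
err2 = np.linalg.norm(y - y2) / np.linalg.norm(y)
print(f"relative output error, N=1: {err1:.4f}, N=2: {err2:.4f}")
```

With a constant weighting function equal to one, `stft_filter` reduces to the identity, which is a convenient sanity check of the analysis/synthesis pair. Whether a second window improves on the single-window filter depends on the choice of windows and weights.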
2.3.12 Infimum and Supremum of the Weyl Symbol
We have seen in Subsection 2.3.8 that the GWS of an underspread system can be interpreted as an approximate eigenvalue distribution over the TF plane. We now assume $H$ to be self-adjoint, such that the eigenvalues $\lambda_k$ of $H$ are real-valued. Furthermore, we consider the Weyl symbol $L_H^{(0)}(t, f)$ of $H$ (i.e., $\alpha = 0$), which is real-valued as well. We shall investigate how close $\inf_{t,f} L_H^{(0)}(t, f)$ and $\sup_{t,f} L_H^{(0)}(t, f)$ are to $\inf \Lambda_H \triangleq \inf_k \lambda_k$ and $\sup \Lambda_H \triangleq \sup_k \lambda_k$, respectively. Note that in the case of self-adjoint LTI/LFI systems the infimum (supremum) of the eigenvalues is always equal to the infimum (supremum) of the respective transfer function. However, a similar relation unfortunately is not true for the GWS of general LTV systems. This topic will be seen to be of importance in Subsections 2.3.15 and 2.3.17. Related issues in the theory of quantization and pseudo-differential operators are the boundedness or positivity of operators corresponding to bounded or positive symbols, the results being theorems of the Calderón-Vaillancourt type [25, 64, 94, 95] and Gårding inequalities [36, 54, 64, 94], respectively. The next theorem is an extension and adaptation of previous results [118].

Theorem 2.28. For any self-adjoint LTV system $H$, there is
\[ \frac{\big| \inf_{t,f} L_H^{(0)}(t, f) - \inf \Lambda_H \big|}{\|S_H\|_1} \le m_H^{(\bar\phi_s)} , \qquad \frac{\big| \sup_{t,f} L_H^{(0)}(t, f) - \sup \Lambda_H \big|}{\|S_H\|_1} \le m_H^{(\bar\phi_s)} , \tag{2.144} \]
with the weighting function $\bar\phi_s(\tau,\nu) = \Big| 1 - \dfrac{1}{A_s^{(0)}(\tau,\nu)} \Big|$, where $s(t)$ is an arbitrary normalized function.
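Proof. Let us consider the lower symbol [64, 118]
\[ L_H^{L}(t, f) \triangleq \langle H s_{t,f}, s_{t,f} \rangle = \iint S_H^{(0)}(\tau,\nu)\, A_s^{(0)*}(\tau,\nu)\, e^{j2\pi(\nu t - \tau f)}\, d\tau\, d\nu \tag{2.145} \]
and the upper symbol [64, 118]
\[ L_H^{U}(t, f) \triangleq \iint \frac{S_H^{(0)}(\tau,\nu)}{A_s^{(0)}(\tau,\nu)}\, e^{j2\pi(\nu t - \tau f)}\, d\tau\, d\nu , \tag{2.146} \]
where $s(t)$ is a normalized function. (For Gaussian $s(t)$, the lower and upper symbols can be related to the Wick and anti-Wick symbols of quantum mechanics [14, 64].) It is known [64, 118] that (see Fig. 2.10)
\[ \inf_{t,f} L_H^{U}(t, f) \le \inf \Lambda_H \le \inf_{t,f} L_H^{L}(t, f) , \tag{2.147} \]
\[ \sup_{t,f} L_H^{L}(t, f) \le \sup \Lambda_H \le \sup_{t,f} L_H^{U}(t, f) . \tag{2.148} \]

[Figure 2.10: Illustration of inequalities (2.147) and (2.148).]

We now have
\[ \big| L_H^{(0)}(t, f) - L_H^{L}(t, f) \big| = \Big| \iint S_H^{(0)}(\tau,\nu) \big[ 1 - A_s^{(0)*}(\tau,\nu) \big] e^{j2\pi(\nu t - \tau f)}\, d\tau\, d\nu \Big| \le \iint |S_H(\tau,\nu)|\, \big| 1 - A_s^{(0)}(\tau,\nu) \big|\, d\tau\, d\nu = \|S_H\|_1\, m_H^{(\phi_s)} , \tag{2.149} \]
with $\phi_s(\tau,\nu) = \big| 1 - A_s^{(0)}(\tau,\nu) \big|$. Similarly, we have
\[ \big| L_H^{(0)}(t, f) - L_H^{U}(t, f) \big| = \Big| \iint S_H^{(0)}(\tau,\nu) \Big[ 1 - \frac{1}{A_s^{(0)}(\tau,\nu)} \Big] e^{j2\pi(\nu t - \tau f)}\, d\tau\, d\nu \Big| \le \iint |S_H(\tau,\nu)|\, \Big| 1 - \frac{1}{A_s^{(0)}(\tau,\nu)} \Big|\, d\tau\, d\nu = \|S_H\|_1\, m_H^{(\bar\phi_s)} , \tag{2.150} \]
with $\bar\phi_s(\tau,\nu) = \big| 1 - 1 / A_s^{(0)}(\tau,\nu) \big|$. Furthermore, due to $|A_s^{(0)}(\tau,\nu)| \le \|s\|_2^2 = 1$ there is
\[ m_H^{(\bar\phi_s)} = \frac{1}{\|S_H\|_1} \iint |S_H(\tau,\nu)|\, \frac{\big| 1 - A_s^{(0)}(\tau,\nu) \big|}{\big| A_s^{(0)}(\tau,\nu) \big|}\, d\tau\, d\nu \ge \frac{1}{\|S_H\|_1} \iint |S_H(\tau,\nu)|\, \big| 1 - A_s^{(0)}(\tau,\nu) \big|\, d\tau\, d\nu = m_H^{(\phi_s)} . \tag{2.151} \]
The inequality (2.149) is equivalent to the two inequalities
\[ L_H^{(0)}(t, f) - \|S_H\|_1\, m_H^{(\phi_s)} \le L_H^{L}(t, f) , \tag{2.152} \]
\[ L_H^{L}(t, f) \le L_H^{(0)}(t, f) + \|S_H\|_1\, m_H^{(\phi_s)} , \tag{2.153} \]
and similarly, (2.150) is equivalent to
\[ L_H^{(0)}(t, f) - \|S_H\|_1\, m_H^{(\bar\phi_s)} \le L_H^{U}(t, f) , \tag{2.154} \]
\[ L_H^{U}(t, f) \le L_H^{(0)}(t, f) + \|S_H\|_1\, m_H^{(\bar\phi_s)} . \tag{2.155} \]
Since $\inf_{t',f'} L_H^{L}(t', f') \le L_H^{L}(t, f)$, it follows from (2.153) that $\inf_{t',f'} L_H^{L}(t', f') \le L_H^{(0)}(t, f) + \|S_H\|_1\, m_H^{(\phi_s)}$ for all $(t, f)$ and hence that
\[ \inf_{t',f'} L_H^{L}(t', f') \le \inf_{t,f} L_H^{(0)}(t, f) + \|S_H\|_1\, m_H^{(\phi_s)} \le \inf_{t,f} L_H^{(0)}(t, f) + \|S_H\|_1\, m_H^{(\bar\phi_s)} , \tag{2.156} \]
where the second inequality is due to (2.151). In a similar manner, (2.154) can be shown to imply
\[ \inf_{t,f} L_H^{(0)}(t, f) - \|S_H\|_1\, m_H^{(\bar\phi_s)} \le \inf_{t',f'} L_H^{U}(t', f') . \tag{2.157} \]
Inserting (2.156) and (2.157) into (2.147), it follows that
\[ \inf_{t,f} L_H^{(0)}(t, f) - \|S_H\|_1\, m_H^{(\bar\phi_s)} \le \inf \Lambda_H \le \inf_{t,f} L_H^{(0)}(t, f) + \|S_H\|_1\, m_H^{(\bar\phi_s)} \]
or, equivalently, $\big| \inf_{t,f} L_H^{(0)}(t, f) - \inf \Lambda_H \big| \le \|S_H\|_1\, m_H^{(\bar\phi_s)}$. By similar reasoning, it can be shown that (2.152) (together with (2.151)) implies that
\[ \sup_{t,f} L_H^{(0)}(t, f) - \|S_H\|_1\, m_H^{(\bar\phi_s)} \le \sup_{t',f'} L_H^{L}(t', f') , \tag{2.158} \]
and that (2.155) implies
\[ \sup_{t',f'} L_H^{U}(t', f') \le \sup_{t,f} L_H^{(0)}(t, f) + \|S_H\|_1\, m_H^{(\bar\phi_s)} . \tag{2.159} \]
Inserting (2.158) and (2.159) into (2.148) shows that
\[ \sup_{t,f} L_H^{(0)}(t, f) - \|S_H\|_1\, m_H^{(\bar\phi_s)} \le \sup \Lambda_H \le \sup_{t,f} L_H^{(0)}(t, f) + \|S_H\|_1\, m_H^{(\bar\phi_s)} \]
or, equivalently, $\big| \sup_{t,f} L_H^{(0)}(t, f) - \sup \Lambda_H \big| \le \|S_H\|_1\, m_H^{(\bar\phi_s)}$.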
( )
Discussion. Whenever $m_H^{(\phi_s)}$ can be made small by suitable choice of s(t), it follows from Theorem 2.28 that
$$\inf_{t,f} L_H^{(0)}(t,f) \approx \lambda_H^{\inf}, \qquad \sup_{t,f} L_H^{(0)}(t,f) \approx \lambda_H^{\sup}.$$
Small $m_H^{(\phi_s)}$ requires $A_s^{(0)}(\tau,\nu) \approx A_s^{(0)}(0,0) = 1$ on the effective support of $|S_H(\tau,\nu)|$, thus implying that this effective support is small, i.e., that H is underspread. Since $\phi_s(\tau,\nu) = \big|1 - 1/A_s^{(0)}(\tau,\nu)\big|$ typically grows very rapidly, $|S_H(\tau,\nu)|$ must decrease extremely fast in order to allow for small $m_H^{(\phi_s)}$. Small $m_H^{(\phi_s)}$ thus seems to be the strongest constraint on H imposed in this chapter. Note, however, that $m_H^{(\phi_s)}$ can be adapted to oblique orientation of $|S_H(\tau,\nu)|$ by proper choice of the function s(t).
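To get a feeling for how rapidly the weighting function of Theorem 2.28 grows, the following minimal sketch evaluates both weighted integrals numerically. It assumes a unit-energy Gaussian s(t) = (√2/T)^{1/2} exp(−π(t/T)²), whose ambiguity function is A_s^{(0)}(τ,ν) = exp(−(π/2)(τ²/T² + T²ν²)), and a purely hypothetical Gaussian-shaped GSF magnitude of width `sigma`; the function names, the grid and the value of `sigma` are illustrative assumptions and are not taken from the thesis.

```python
import numpy as np

def gaussian_ambiguity(tau, nu, T=1.0):
    """A_s^(0) of the unit-energy Gaussian s(t) = (sqrt(2)/T)^(1/2) exp(-pi (t/T)^2)."""
    return np.exp(-0.5 * np.pi * ((tau / T) ** 2 + (T * nu) ** 2))

def weighted_integral(gsf_mag, weight):
    """m_H^(phi) = sum(phi |S_H|) / sum(|S_H|) on a uniform grid (the cell area cancels)."""
    return np.sum(weight * gsf_mag) / np.sum(gsf_mag)

# Uniform (tau, nu) grid and a hypothetical, well-concentrated GSF magnitude.
grid = np.linspace(-3.0, 3.0, 601)
tau, nu = np.meshgrid(grid, grid, indexing="ij")
sigma = 0.3
gsf_mag = np.exp(-np.pi * (tau ** 2 + nu ** 2) / sigma ** 2)

A = gaussian_ambiguity(tau, nu)
m_phi = weighted_integral(gsf_mag, np.abs(1.0 - 1.0 / A))   # weight |1 - 1/A_s|, cf. (2.150)
m_phi_tilde = weighted_integral(gsf_mag, np.abs(1.0 - A))   # weight |1 - A_s|,   cf. (2.149)

print(m_phi, m_phi_tilde)
```

For this toy GSF both integrals have closed forms, 1/(1 − σ²/2) − 1 and 1 − 1/(1 + σ²/2), so the ordering in (2.151) can be checked directly; as σ² approaches 2 the first expression diverges, which mirrors the remark that |S_H| must decay extremely fast.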
2.3.13
Approximate Non-Negativity of the Generalized Weyl Symbol of Positive Semi-Definite Operators
In contrast to positive semi-definite16 LTI (LFI) systems, which have nonnegative spectral (temporal) transfer functions, a similar property does not hold for the GWS of a general positive semi-definite HS operator H. However, since for positive semi-definite HS operators $\lambda_H^{\inf} = 0$, the discussion in the preceding subsection suggests that in the underspread case $\inf_{t,f} L_H^{(0)}(t,f) \approx \lambda_H^{\inf} = 0$, i.e., that the Weyl symbol of an underspread positive semi-definite operator is approximately nonnegative. For the general case $\alpha \ne 0$ (where $L_H^{(\alpha)}(t,f)$ is complex-valued), the nonnegativity property can be formulated as $L_H^{(\alpha)}(t,f) = P\{L_H^{(\alpha)}(t,f)\}$ with the positive real part of the GWS
$$P\{L_H^{(\alpha)}(t,f)\} \triangleq \frac{1}{2}\Big[\Re\{L_H^{(\alpha)}(t,f)\} + \big|\Re\{L_H^{(\alpha)}(t,f)\}\big|\Big] \ge 0.$$

16 We note that a completely analogous discussion applies to negative semi-definite HS operators and their GWS.

Subsequently, we will show that the GWS of underspread positive semi-definite operators is approximately nonnegative, i.e., $L_H^{(\alpha)}(t,f) \approx P\{L_H^{(\alpha)}(t,f)\}$. Specifically, we will show that i) the imaginary part $\Im\{L_H^{(\alpha)}(t,f)\}$ of the GWS is small (this was already demonstrated in Subsection 2.3.3); and ii) the negative real part of the GWS (see Fig. 2.11)
$$N\{L_H^{(\alpha)}(t,f)\} \triangleq P\{L_H^{(\alpha)}(t,f)\} - \Re\{L_H^{(\alpha)}(t,f)\} = \frac{1}{2}\Big[\big|\Re\{L_H^{(\alpha)}(t,f)\}\big| - \Re\{L_H^{(\alpha)}(t,f)\}\Big] \tag{2.160}$$
is small (note that $N\{L_H^{(\alpha)}(t,f)\} \ge 0$). To this end, we recall and adapt several of the preceding results:
Theorem 2.28 shows that the infimum of the Weyl symbol of an underspread positive HS operator approximately equals 0. In particular, for $\alpha = 0$, the GWS is real-valued and $N\{L_H^{(0)}(t,f)\} \le -\inf_{t,f} L_H^{(0)}(t,f)$ (this is trivial in the case $\inf_{t,f} L_H^{(0)}(t,f) \ge 0$ where $N\{L_H^{(0)}(t,f)\} = 0$; in the case $\inf_{t,f} L_H^{(0)}(t,f) < 0$, there is $\sup_{t,f} N\{L_H^{(0)}(t,f)\} = -\inf_{t,f} L_H^{(0)}(t,f) = \big|\inf_{t,f} L_H^{(0)}(t,f)\big|$). Further using the left-hand inequality of (2.144) with $\lambda_H^{\inf} = 0$ yields
$$N\{L_H^{(0)}(t,f)\} \le -\inf_{t,f} L_H^{(0)}(t,f) = \big|\inf_{t,f} L_H^{(0)}(t,f) - \lambda_H^{\inf}\big| \le \|S_H\|_1\, m_H^{(\phi_1)}, \tag{2.161}$$
where $\phi_1(\tau,\nu) = \big|1 - 1/A_s^{(0)}(\tau,\nu)\big|$ with s(t) any normalized function. However, this result will not be used in the following since the bound in (2.161) is always less tight than the bound (2.166) below (see (2.151) and the proof of Theorem 2.29).

The positive semi-definite square root $H_r = \sqrt{H}$ of a positive semi-definite operator H (see Appendix A) has minimal TF displacements among all systems G factoring H according to $GG^+ = H$ [79, 90]. That is, for H underspread, $H_r$ will be underspread too. Corollary 2.17 shows that for $H_r$ underspread, the GWS of H approximately equals the squared magnitude of the GWS of $H_r$, i.e., $L_H^{(\alpha)}(t,f) = L_{H_r H_r^+}^{(\alpha)}(t,f) \approx \big|L_{H_r}^{(\alpha)}(t,f)\big|^2$. This suggests that the negative part of $L_H^{(\alpha)}(t,f)$ is small.

Similarly, for an underspread positive semi-definite operator H, (2.149) shows that the Weyl symbol is approximately equivalent to the lower symbol $L_H^{L}(t,f)$ (defined in (2.145)). But $L_H^{L}(t,f) \ge \lambda_H^{\inf} = 0$ according to (2.147). Below, we will show that for an underspread system H the approximation $L_H^{(\alpha)}(t,f) \approx L_H^{L}(t,f)$ holds for arbitrary $\alpha$. This in turn indicates that the negative part of the GWS of an underspread positive semi-definite operator H is small.
Figure 2.11: Illustration of $N\{L_H^{(\alpha)}(t,f)\}$, $P\{L_H^{(\alpha)}(t,f)\}$, $L_H^{>}(t,f)$, $L_H^{<}(t,f)$, and $L_H^{L}(t,f)$ for a positive semi-definite system. The solid curve shows the real part of the GWS, whereas the thick dashed curve shows the lower symbol $L_H^{L}(t,f)$, both as a function of t for fixed $f = f_0$.
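By making the above arguments precise, the following theorem is obtained.

Theorem 2.29. For any positive semi-definite system H, the negative real part $N\{L_H^{(\alpha)}(t,f)\}$ of the GWS is bounded as
$$\frac{N\{L_H^{(\alpha)}(t,f)\}}{\|S_{H_r}\|_1^2} \le \min\{\epsilon_1, \epsilon_2\}, \tag{2.162}$$
$$\frac{\big\|N\{L_H^{(\alpha)}\}\big\|_2}{\|H\|_2} \le \inf_{\|s\|_2 = 1} M_H^{(\phi_s)}, \tag{2.163}$$
where $\epsilon_1 = \inf_{\|s\|_2 = 1} m_H^{(\phi_s)}$, $\epsilon_2 = 2\pi\, C_{H_r}^{(\alpha)}$ (with $C_H^{(\alpha)}$ defined in (2.70) and $H_r$ the positive semi-definite square root of H), and $\phi_s(\tau,\nu) = \big|1 - A_s^{(\alpha)}(\tau,\nu)\big|$ with $\|s\|_2 = 1$.

Proof. With P(t, f) denoting an arbitrary real-valued and positive TF function, the $L_p$ norm of $N\{L_H^{(\alpha)}(t,f)\}$ can be bounded as
$$\big\|N\{L_H^{(\alpha)}\}\big\|_p = \frac{1}{2}\Big\|\,\big|\Re\{L_H^{(\alpha)}\}\big| - \Re\{L_H^{(\alpha)}\}\Big\|_p = \frac{1}{2}\Big\|\,\big|\Re\{L_H^{(\alpha)}\}\big| - P + P - \Re\{L_H^{(\alpha)}\}\Big\|_p \le \frac{1}{2}\Big\|\,\big|\Re\{L_H^{(\alpha)}\}\big| - P\Big\|_p + \frac{1}{2}\Big\|P - \Re\{L_H^{(\alpha)}\}\Big\|_p \le \Big\|P - \Re\{L_H^{(\alpha)}\}\Big\|_p \le \Big\|L_H^{(\alpha)} - P\Big\|_p. \tag{2.164}$$
Here, we used the triangle inequality and the fact that the difference between P(t, f) and $\big|\Re\{L_H^{(\alpha)}(t,f)\}\big|$ is less than or equal to the difference between P(t, f) and $\Re\{L_H^{(\alpha)}(t,f)\}$ (since P(t, f) is positive), which in turn is less than or equal to the difference between P(t, f) and $L_H^{(\alpha)}(t,f)$ (since P(t, f) is real-valued).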
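The decomposition into positive and negative real parts, and the pointwise version of the chain (2.164), can be checked numerically. The sketch below uses a random complex array as a stand-in for GWS samples and a random nonnegative array as the reference function P(t, f); both are illustrative assumptions and do not correspond to an actual system.

```python
import numpy as np

def positive_real_part(L):
    """P{L} = (|Re L| + Re L) / 2, applied pointwise."""
    return 0.5 * (np.abs(L.real) + L.real)

def negative_real_part(L):
    """N{L} = (|Re L| - Re L) / 2 as in (2.160), applied pointwise."""
    return 0.5 * (np.abs(L.real) - L.real)

rng = np.random.default_rng(1)
# Stand-in for samples of a complex-valued GWS (not derived from a real operator).
L = rng.standard_normal((32, 32)) + 1j * rng.standard_normal((32, 32))
# Arbitrary real-valued, nonnegative reference function P(t, f).
P_ref = rng.random((32, 32))

P = positive_real_part(L)
N = negative_real_part(L)

# L = P{L} - N{L} + j Im{L}, and sup N{L} <= sup |L - P_ref| (the p = infinity case).
decomposition_error = np.max(np.abs(L - (P - N + 1j * L.imag)))
print(decomposition_error, N.max(), np.abs(L - P_ref).max())
```

The inequality holds for any nonnegative reference function, which is why the proof is free to choose the lower symbol or the squared magnitude of the GWS of the square-root system for P(t, f).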
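To prove (2.162) we note that with $p = \infty$ and17 $P(t,f) = L_H^{L}(t,f)$, (2.164) can be bounded as
$$\big|L_H^{(\alpha)}(t,f) - L_H^{L}(t,f)\big| \le \int_\tau\!\int_\nu \big|S_H^{(\alpha)}(\tau,\nu) - S_H^{(\alpha)}(\tau,\nu)\, A_s^{(\alpha)*}(\tau,\nu)\big|\, d\tau\, d\nu = \int_\tau\!\int_\nu |S_H(\tau,\nu)|\,\big|1 - A_s^{(\alpha)}(\tau,\nu)\big|\, d\tau\, d\nu = \|S_H\|_1\, m_H^{(\phi_s)} \le \|S_{H_r}\|_1^2\, m_H^{(\phi_s)}. \tag{2.165}$$
With (2.164), we thus obtain
$$N\{L_H^{(\alpha)}(t,f)\} \le \big|L_H^{(\alpha)}(t,f) - L_H^{L}(t,f)\big| \le \|S_H\|_1\, m_H^{(\phi_s)} \le \|S_{H_r}\|_1^2\, m_H^{(\phi_s)}, \tag{2.166}$$
which holds for all s(t) (assumed to be normalized in order that $\phi_s(0,0) = 0$). Taking the infimum with regard to s(t) finally gives
$$N\{L_H^{(\alpha)}(t,f)\} \le \|S_{H_r}\|_1^2 \inf_{\|s\|_2 = 1} m_H^{(\phi_s)} = \epsilon_1\, \|S_{H_r}\|_1^2. \tag{2.167}$$
Similarly, with $p = \infty$ and $P(t,f) = \big|L_{H_r}^{(\alpha)}(t,f)\big|^2$, (2.164) reduces to (2.70) (with H replaced by $H_r$), and thus
$$N\{L_H^{(\alpha)}(t,f)\} \le \|S_{H_r}\|_1^2\, 2\pi\, C_{H_r}^{(\alpha)} = \epsilon_2\, \|S_{H_r}\|_1^2. \tag{2.168}$$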
The final bound (2.162) follows upon combining (2.167) and (2.168). Note that $\phi_s(\tau,\nu) \le \phi_1(\tau,\nu)$ (where $\phi_1(\tau,\nu)$ is the weighting function used in (2.161)) and hence $m_H^{(\phi_s)} \le m_H^{(\phi_1)}$ for all normalized s(t). Therefore, as mentioned previously, the bound (2.161) is always less tight than that in (2.166). To prove the bound (2.163), we note that
$$\big\|L_H^{(\alpha)} - L_H^{L}\big\|_2^2 = \int_\tau\!\int_\nu \big|S_H^{(\alpha)}(\tau,\nu) - S_H^{(\alpha)}(\tau,\nu)\, A_s^{(\alpha)*}(\tau,\nu)\big|^2 d\tau\, d\nu = \int_\tau\!\int_\nu |S_H(\tau,\nu)|^2\, \big|1 - A_s^{(\alpha)}(\tau,\nu)\big|^2 d\tau\, d\nu = \|H\|_2^2\, \big[M_H^{(\phi_s)}\big]^2. \tag{2.169}$$
Since this expression holds for all s(t), we obtain with (2.164), $P(t,f) = L_H^{L}(t,f)$, and p = 2
$$\big\|N\{L_H^{(\alpha)}\}\big\|_2 \le \inf_{\|s\|_2 = 1} \big\|L_H^{(\alpha)} - L_H^{L}\big\|_2 = \|H\|_2 \inf_{\|s\|_2 = 1} M_H^{(\phi_s)},$$
which finally establishes (2.163).

17 That $L_H^{L}(t,f)$ is indeed non-negative follows from $L_H^{L}(t,f) = \langle H s_{t,f}, s_{t,f}\rangle \ge 0$. In the following derivation, note also that (B.10) implies $L_H^{L}(t,f) = \int_\tau\!\int_\nu S_H^{(\alpha)}(\tau,\nu)\, A_s^{(\alpha)*}(\tau,\nu)\, e^{j2\pi(t\nu - f\tau)}\, d\tau\, d\nu$, which generalizes the expression in (2.145).

In combination with Corollary 2.13, the foregoing theorem immediately implies the following result.

Theorem 2.30. For any positive semi-definite system H, the difference $\Delta_{12}^{(\alpha)}(t,f) \triangleq L_H^{(\alpha)}(t,f) - P\{L_H^{(\alpha)}(t,f)\}$ between the GWS and its positive real part is bounded as
$$\frac{\big|\Delta_{12}^{(\alpha)}(t,f)\big|}{\|S_{H_r}\|_1^2} \le 2\pi|\alpha|\, m_H^{(1,1)} + \min\{\epsilon_1, \epsilon_2\}, \tag{2.170}$$
$$\frac{\big\|\Delta_{12}^{(\alpha)}\big\|_2}{\|H\|_2} \le 2\pi|\alpha|\, M_H^{(1,1)} + \inf_{\|s\|_2 = 1} M_H^{(\phi_s)}, \tag{2.171}$$
where $\epsilon_1$, $\epsilon_2$, and $\phi_s(\tau,\nu)$ are as defined in Theorem 2.29.

Proof. The GWS can be written as
$$L_H^{(\alpha)}(t,f) = \Re\{L_H^{(\alpha)}(t,f)\} + j\,\Im\{L_H^{(\alpha)}(t,f)\} = P\{L_H^{(\alpha)}(t,f)\} - N\{L_H^{(\alpha)}(t,f)\} + j\,\Im\{L_H^{(\alpha)}(t,f)\}. \tag{2.172}$$
It follows that $\Delta_{12}^{(\alpha)}(t,f) = j\,\Im\{L_H^{(\alpha)}(t,f)\} - N\{L_H^{(\alpha)}(t,f)\}$, and hence by the triangle inequality that
$$\big\|\Delta_{12}^{(\alpha)}\big\|_p \le \big\|\Im\{L_H^{(\alpha)}\}\big\|_p + \big\|N\{L_H^{(\alpha)}\}\big\|_p, \tag{2.173}$$
where for our purposes p = 2 or $p = \infty$. Bounds on the first term on the right-hand side of (2.173) (involving the imaginary part of the GWS) have been provided in Corollary 2.13, i.e.,
$$\big\|\Im\{L_H^{(\alpha)}\}\big\|_\infty \le 2\pi|\alpha|\, \|S_H\|_1\, m_H^{(1,1)} \le 2\pi|\alpha|\, \|S_{H_r}\|_1^2\, m_H^{(1,1)}, \tag{2.174}$$
$$\big\|\Im\{L_H^{(\alpha)}\}\big\|_2 \le 2\pi|\alpha|\, \|H\|_2\, M_H^{(1,1)}, \tag{2.175}$$
where in the $L_\infty$ bound we further used $\|S_H\|_1 \le \|S_{H_r}\|_1^2$. The bound (2.170) then follows via (2.173) (with $p = \infty$) upon combining the bounds (2.174) and (2.162). Similarly, the bound (2.171) follows via (2.173) (with p = 2) upon combination of the bounds (2.175) and (2.163).

Discussion. The foregoing theorems show that for an underspread positive semi-definite operator having small $m_H^{(1,1)}$, $M_H^{(1,1)}$, $m_H^{(\phi_s)}$ or $C_{H_r}^{(\alpha)}$, and $M_H^{(\phi_s)}$, the GWS approximately equals its positive real part,
$$L_H^{(\alpha)}(t,f) \approx P\{L_H^{(\alpha)}(t,f)\},$$
i.e., both the imaginary part and the negative real part of the GWS are small,
$$\Im\{L_H^{(\alpha)}(t,f)\} \approx 0, \qquad N\{L_H^{(\alpha)}(t,f)\} \approx 0.$$
The bounds in the theorems are tightest for $\alpha = 0$ (since here the imaginary part $\Im\{L_H^{(0)}(t,f)\}$ vanishes completely). Furthermore, for $\alpha = 0$ the bounds allow one to account for oblique orientation of the GSF of H and $H_r$ by choosing a function s(t) with obliquely oriented ambiguity function $A_s^{(0)}(\tau,\nu)$, and furthermore, $\epsilon_2$ can be replaced by the refined bound $2\pi \inf_{U \in \mathcal{M}} m_{U H_r U^+}^{(0,1)}\, m_{U H_r U^+}^{(1,0)}$ according to (2.72).
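DL Operators. For a positive semi-definite DL operator H with GSF support $\mathcal{G}_H$, applying (2.16)-(2.18) to the bounds (2.162), (2.163), (2.170), and (2.171) yields the bounds18
$$\frac{N\{L_H^{(\alpha)}(t,f)\}}{\|S_H\|_1} \le \tilde\epsilon_1, \qquad \frac{\big\|N\{L_H^{(\alpha)}\}\big\|_2}{\|H\|_2} \le \tilde\epsilon_1, \tag{2.176}$$
$$\frac{\big|\Delta_{12}^{(\alpha)}(t,f)\big|}{\|S_H\|_1} \le \frac{\pi}{2}\,|\alpha|\, \sigma_H + \tilde\epsilon_1, \qquad \frac{\big\|\Delta_{12}^{(\alpha)}\big\|_2}{\|H\|_2} \le \frac{\pi}{2}\,|\alpha|\, \sigma_H + \tilde\epsilon_1, \tag{2.177}$$
where $\tilde\epsilon_1 = \inf_{\|s\|_2 = 1} \phi_s^{(\max)}$ with $\phi_s^{(\max)} = \max_{(\tau,\nu) \in \mathcal{G}_H} \big|1 - A_s^{(\alpha)}(\tau,\nu)\big|$, and $\sigma_H = 4\, \tau_H^{(\max)} \nu_H^{(\max)}$ as defined in (2.6).

Non-DL Operators. For any (i.e., not necessarily DL) system H, and any rectangular region $\mathcal{G} = [-\tau_{\mathcal{G}}, \tau_{\mathcal{G}}] \times [-\nu_{\mathcal{G}}, \nu_{\mathcal{G}}]$, the results of Section 2.2.6 can be used to show that
$$\frac{N\{L_H^{(\alpha)}(t,f)\}}{\|S_H\|_1} \le \tilde\epsilon_1 + 2\left(\frac{m_H^{(1,0)}}{\tau_{\mathcal{G}}} + \frac{m_H^{(0,1)}}{\nu_{\mathcal{G}}}\right), \qquad \frac{\big\|N\{L_H^{(\alpha)}\}\big\|_2}{\|H\|_2} \le \tilde\epsilon_1 + 2\left(\frac{M_H^{(1,0)}}{\tau_{\mathcal{G}}} + \frac{M_H^{(0,1)}}{\nu_{\mathcal{G}}}\right),$$
$$\frac{\big|\Delta_{12}^{(\alpha)}(t,f)\big|}{\|S_H\|_1} \le 2\pi|\alpha|\, \tau_{\mathcal{G}} \nu_{\mathcal{G}} + \tilde\epsilon_1 + 3\left(\frac{m_H^{(1,0)}}{\tau_{\mathcal{G}}} + \frac{m_H^{(0,1)}}{\nu_{\mathcal{G}}}\right), \qquad \frac{\big\|\Delta_{12}^{(\alpha)}\big\|_2}{\|H\|_2} \le 2\pi|\alpha|\, \tau_{\mathcal{G}} \nu_{\mathcal{G}} + \tilde\epsilon_1 + 3\left(\frac{M_H^{(1,0)}}{\tau_{\mathcal{G}}} + \frac{M_H^{(0,1)}}{\nu_{\mathcal{G}}}\right),$$
where $\tilde\epsilon_1 = \inf_{\|s\|_2 = 1} \phi_s^{(\max)}$ with $\phi_s^{(\max)} = \max_{(\tau,\nu) \in \mathcal{G}} \big|1 - A_s^{(\alpha)}(\tau,\nu)\big|$. Hence, it is seen that the error resulting from erroneously applying (2.176) and (2.177) (with $\mathcal{G}_H = \mathcal{G}$) to a non-DL system is upper bounded in terms of $\frac{m_H^{(1,0)}}{\tau_{\mathcal{G}}} + \frac{m_H^{(0,1)}}{\nu_{\mathcal{G}}}$ and $\frac{M_H^{(1,0)}}{\tau_{\mathcal{G}}} + \frac{M_H^{(0,1)}}{\nu_{\mathcal{G}}}$, respectively.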
2.3.14
Boundedness of the Generalized Weyl Symbol of Self-Adjoint Operators
In the preceding subsection we showed that the GWS of underspread positive semi-definite HS operators cannot be too negative (i.e., not much smaller than $\lambda_H^{\inf} = 0$). Subsequently, we present a dual discussion which shows that the GWS cannot exceed the norm $\|H\|_O$ of a self-adjoint HS operator by too much. This differs from the results of Subsection 2.3.15 in the sense that i) only the part of $|L_H^{(\alpha)}(t,f)|$ exceeding $\|H\|_O$ is considered (i.e., the case $|L_H^{(\alpha)}(t,f)| < \|H\|_O$ is not relevant here); and ii) bounds on the $L_\infty$ and $L_2$ norms of the GWS part above $\|H\|_O$ are developed.

Let H be a self-adjoint operator so that $\lambda_k \in \mathbb{R}$. While $\|H\|_O = \sup_k |\lambda_k| = \max\{\lambda_H^{\sup}, -\lambda_H^{\inf}\}$ [158], we shall subsequently consider only the case $\|H\|_O = \lambda_H^{\sup}$ and note that a completely parallel discussion applies to the case $\|H\|_O = -\lambda_H^{\inf}$.

18 Note that the term involving $\epsilon_2$ is ignored here for simplicity.
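We define the part of $\Re\{L_H^{(\alpha)}(t,f)\}$ which is less than $\lambda_H^{\sup}$ by the thresholding (see Fig. 2.11)
$$L_H^{<}(t,f) \triangleq \begin{cases} \Re\{L_H^{(\alpha)}(t,f)\}, & \text{for } \Re\{L_H^{(\alpha)}(t,f)\} \le \lambda_H^{\sup}, \\ \lambda_H^{\sup}, & \text{for } \Re\{L_H^{(\alpha)}(t,f)\} > \lambda_H^{\sup}. \end{cases}$$
The part of $\Re\{L_H^{(\alpha)}(t,f)\}$ exceeding $\lambda_H^{\sup}$ is defined similarly (see Fig. 2.11),
$$L_H^{>}(t,f) \triangleq \begin{cases} \Re\{L_H^{(\alpha)}(t,f)\} - \lambda_H^{\sup}, & \text{for } \Re\{L_H^{(\alpha)}(t,f)\} > \lambda_H^{\sup}, \\ 0, & \text{for } \Re\{L_H^{(\alpha)}(t,f)\} \le \lambda_H^{\sup}. \end{cases} \tag{2.178}$$
Note that $L_H^{<}(t,f) + L_H^{>}(t,f) = \Re\{L_H^{(\alpha)}(t,f)\}$ and hence
$$L_H^{(\alpha)}(t,f) = L_H^{<}(t,f) + L_H^{>}(t,f) + j\,\Im\{L_H^{(\alpha)}(t,f)\}. \tag{2.179}$$
The thresholding of $\Re\{L_H^{(\alpha)}(t,f)\}$ by $\lambda_H^{\sup}$ in the definition (2.178) is equivalent to a thresholding of $\Re\{L_H^{(\alpha)}(t,f)\} - \lambda_H^{\sup} = \Re\{L_{H - \lambda_H^{\sup} I}^{(\alpha)}(t,f)\}$ with threshold 0. Hence, $L_H^{>}(t,f)$ can be rewritten as
$$L_H^{>}(t,f) = N\big\{L_{\lambda_H^{\sup} I - H}^{(\alpha)}(t,f)\big\} = N\big\{\lambda_H^{\sup} - L_H^{(\alpha)}(t,f)\big\}, \tag{2.180}$$
i.e., as the negative real part of the GWS of the operator $\lambda_H^{\sup} I - H$. Similarly,
$$L_H^{<}(t,f) = \lambda_H^{\sup} - P\big\{L_{\lambda_H^{\sup} I - H}^{(\alpha)}(t,f)\big\}.$$
Since we assumed $\|H\|_O = \lambda_H^{\sup}$, the inequality $\langle Hx, x\rangle \le \|H\|_O \|x\|_2^2$ implies
$$\langle(\lambda_H^{\sup} I - H)x, x\rangle = \lambda_H^{\sup} \langle x, x\rangle - \langle Hx, x\rangle = \|H\|_O \|x\|_2^2 - \langle Hx, x\rangle \ge 0.$$
Thus, the operator $\lambda_H^{\sup} I - H$ is positive semi-definite. Together with (2.180), this allows us to take advantage of the results of the previous subsection in the derivation of the next theorem.

Theorem 2.31. For any self-adjoint, bounded HS system H with $\|H\|_O = \lambda_H^{\sup}$, the difference $\Delta_{13}^{(\alpha)}(t,f) \triangleq L_H^{(\alpha)}(t,f) - L_H^{<}(t,f)$ is bounded as
$$\frac{\big|\Delta_{13}^{(\alpha)}(t,f)\big|}{\|S_H\|_1} \le 2\pi|\alpha|\, m_H^{(1,1)} + \inf_{\|s\|_2 = 1} m_H^{(\phi_s)}, \qquad \frac{\big\|\Delta_{13}^{(\alpha)}\big\|_2}{\|H\|_2} \le 2\pi|\alpha|\, M_H^{(1,1)} + \inf_{\|s\|_2 = 1} M_H^{(\phi_s)}, \tag{2.181}$$
with $\phi_s(\tau,\nu)$ as in Theorem 2.30.

Proof. To prove this theorem, we first note that according to (2.179), $\Delta_{13}^{(\alpha)}(t,f) = L_H^{(\alpha)}(t,f) - L_H^{<}(t,f) = j\,\Im\{L_H^{(\alpha)}(t,f)\} + L_H^{>}(t,f)$, and hence, by the triangle inequality,
$$\big\|\Delta_{13}^{(\alpha)}\big\|_p \le \big\|\Im\{L_H^{(\alpha)}\}\big\|_p + \big\|L_H^{>}\big\|_p, \tag{2.182}$$
where for our purposes p = 2 or $p = \infty$. The first term in this bound (involving $\Im\{L_H^{(\alpha)}(t,f)\}$) can again be bounded using Corollary 2.13 (see (2.174) and (2.175)). The second term can be further developed by inserting the definition of the negative real part (2.160) into (2.180) and by adding/subtracting $L_H^{L}(t,f)$ followed by application of the triangle inequality,
$$\begin{aligned}
\big\|L_H^{>}\big\|_p &= \frac{1}{2}\Big\|\,\big|\Re\{L_{\lambda_H^{\sup} I - H}^{(\alpha)}\}\big| - \Re\{L_{\lambda_H^{\sup} I - H}^{(\alpha)}\}\Big\|_p \\
&= \frac{1}{2}\Big\|\,\big|\lambda_H^{\sup} - \Re\{L_H^{(\alpha)}\}\big| - \lambda_H^{\sup} + \Re\{L_H^{(\alpha)}\}\Big\|_p \\
&= \frac{1}{2}\Big\|\,\big|\lambda_H^{\sup} - \Re\{L_H^{(\alpha)}\}\big| - \lambda_H^{\sup} + L_H^{L} - L_H^{L} + \Re\{L_H^{(\alpha)}\}\Big\|_p \\
&\le \frac{1}{2}\Big\|\,\big|\lambda_H^{\sup} - \Re\{L_H^{(\alpha)}\}\big| - (\lambda_H^{\sup} - L_H^{L})\Big\|_p + \frac{1}{2}\Big\|\Re\{L_H^{(\alpha)}\} - L_H^{L}\Big\|_p \\
&\le \frac{1}{2}\Big\|\lambda_H^{\sup} - \Re\{L_H^{(\alpha)}\} - (\lambda_H^{\sup} - L_H^{L})\Big\|_p + \frac{1}{2}\Big\|\Re\{L_H^{(\alpha)}\} - L_H^{L}\Big\|_p \\
&= \frac{1}{2}\Big\|L_H^{L} - \Re\{L_H^{(\alpha)}\}\Big\|_p + \frac{1}{2}\Big\|\Re\{L_H^{(\alpha)}\} - L_H^{L}\Big\|_p \\
&\le \frac{1}{2}\Big\|L_H^{L} - L_H^{(\alpha)}\Big\|_p + \frac{1}{2}\Big\|L_H^{(\alpha)} - L_H^{L}\Big\|_p = \Big\|L_H^{(\alpha)} - L_H^{L}\Big\|_p.
\end{aligned} \tag{2.183}$$
Here, we exploited the fact that $L_H^{L}(t,f)$ is real-valued and that $\lambda_H^{\sup} - L_H^{L}(t,f) \ge 0$ according to (2.148). Combining (2.183) with (2.165) (for $p = \infty$) and (2.169) (for p = 2) and taking the infimum with regard to s(t) further yields
$$\frac{\big\|L_H^{>}\big\|_\infty}{\|S_H\|_1} \le \inf_{\|s\|_2 = 1} m_H^{(\phi_s)}, \qquad \frac{\big\|L_H^{>}\big\|_2}{\|H\|_2} \le \inf_{\|s\|_2 = 1} M_H^{(\phi_s)}. \tag{2.184}$$
The $L_\infty$ bound in (2.181) then follows by combining the left-hand inequality in (2.184) and (2.174) in (2.182) (with $p = \infty$). Similarly, the $L_2$ bound in (2.181) follows by combining the right-hand inequality in (2.184) and (2.175) in (2.182) (with p = 2).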
Discussion. The foregoing theorem shows that for underspread operators H where $m_H^{(1,1)}$, $M_H^{(1,1)}$ are small and $m_H^{(\phi_s)}$, $M_H^{(\phi_s)}$ can be made small by suitable choice of s(t), one has $L_H^{(\alpha)}(t,f) \approx L_H^{<}(t,f)$, or equivalently
$$\Im\{L_H^{(\alpha)}(t,f)\} \approx 0, \qquad L_H^{>}(t,f) \approx 0.$$
The bounds in the theorem are tightest for $\alpha = 0$ (since here the imaginary part $\Im\{L_H^{(0)}(t,f)\}$ vanishes completely). Furthermore, for $\alpha = 0$ the bounds (2.181) allow one to account for oblique orientation of the GSF of H by choosing a function s(t) with obliquely oriented ambiguity function.

Theorems 2.30 and 2.31 impose severe restrictions on the possible values of the GWS of positive semi-definite HS operators (note, however, that Theorem 2.31 holds for the broader class of self-adjoint operators). This restricted range of GWS values is illustrated in Fig. 2.12 for the case $\alpha = 1/2$ (we here do not consider $\alpha = 0$ since we want to illustrate the approximation $\Im\{L_H^{(\alpha)}(t,f)\} \approx 0$ but there is $\Im\{L_H^{(0)}(t,f)\} = 0$). In the example in part (b) of this figure, the maximum normalized magnitude and $L_2$ norm of $\Im\{L_H^{(1/2)}(t,f)\}$ are $\max_{t,f} \big|\Im\{L_H^{(1/2)}(t,f)\}\big| / \|S_H\|_1 = 0.054$ and $\big\|\Im\{L_H^{(1/2)}\}\big\|_2 / \|H\|_2 = 0.032$, respectively, while the corresponding bounds are $2\pi|\alpha|\, m_H^{(1,1)} = 0.315$ and $2\pi|\alpha|\, M_H^{(1,1)} = 0.087$, respectively. Similarly, the maximum magnitude and $L_2$ norm of $L_H^{>}(t,f)$ are $\max_{t,f} L_H^{>}(t,f) / \|S_H\|_1 = 0.007$ and $\|L_H^{>}\|_2 / \|H\|_2 = 0.026$, respectively, while the corresponding bounds are $m_H^{(\phi_s)} = 0.145$ and $M_H^{(\phi_s)} = 0.046$.19 The bounds on $L_H^{>}(t,f)$ also apply to the negative part $N\{L_H^{(1/2)}(t,f)\}$ (see Theorem 2.29), which has maximum normalized magnitude $\max_{t,f} N\{L_H^{(1/2)}(t,f)\} / \|S_H\|_1 = 0.006$ and normalized $L_2$ norm $\big\|N\{L_H^{(1/2)}\}\big\|_2 / \|H\|_2 = 0.026$, respectively. Combining the results for $\Im\{L_H^{(1/2)}(t,f)\}$, $N\{L_H^{(1/2)}(t,f)\}$, and $L_H^{>}(t,f)$ yields $\max_{t,f} \big|\Delta_{12}^{(1/2)}(t,f)\big| / \|S_H\|_1 \approx \max_{t,f} \big|\Delta_{13}^{(1/2)}(t,f)\big| / \|S_H\|_1 = 0.06$ and $\big\|\Delta_{12}^{(1/2)}\big\|_2 / \|H\|_2 \approx \big\|\Delta_{13}^{(1/2)}\big\|_2 / \|H\|_2 = 0.058$, as compared to the bounds $2\pi|\alpha|\, m_H^{(1,1)} + m_H^{(\phi_s)} = 0.46$ and $2\pi|\alpha|\, M_H^{(1,1)} + M_H^{(\phi_s)} = 0.133$.

Figure 2.12: (a) Schematic illustration of the possible values of the GWS of an underspread positive semi-definite HS operator H. The thick line represents the desired range of GWS values and the dashed line represents the bounds of Theorems 2.30 and 2.31. (b) Actual values of the GWS (with $\alpha = 1/2$) of an underspread positive semi-definite operator with $\|H\|_O = 4$.
An important special case of a positive semi-definite operator H is an orthogonal projection operator (see Section A.4 in Appendix A). If the associated signal space is non-sophisticated [81, 87], the orthogonal projection operator H will be underspread. Here, it follows that the GWS of H (termed the generalized Wigner distribution of the signal space corresponding to H in [81, 87]) essentially is limited to values between 0 and 1 (see Fig. 2.11 with $\lambda_H^{\sup} = 1$).
2.3.15
Maximum System Gain (Operator Norm)

We now return to a general (i.e., not necessarily self-adjoint) LTV system H. For LTI and LFI systems, the maximum system gain (operator norm) $\|H\|_O$ as defined by (A.1) equals the supremum of the magnitude of the transfer function. We now ask whether for a general LTV system $\|H\|_O$ is similarly related to the magnitude of $L_H^{(\alpha)}(t,f)$. The following result extends an existing bound for DL operators [118].

19 Here, instead of looking for the infimum of $m_H^{(\phi_s)}$ and $M_H^{(\phi_s)}$ within all s(t) with $\|s\|_2 = 1$, we restricted to Gaussian signals of varying length.

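Theorem 2.32. For any LTV system H, the difference between the supremum of $|L_H^{(\alpha)}(t,f)|^2$ and the squared maximum system gain is bounded as
$$\frac{\Big|\sup_{t,f} \big|L_H^{(\alpha)}(t,f)\big|^2 - \|H\|_O^2\Big|}{\|S_H\|_1^2} \le 2\pi\, m_H^{(0,1)} m_H^{(1,0)} + 4\pi|\alpha|\, m_H^{(1,1)} + m_{H^+H}^{(\phi_s)}, \tag{2.185}$$
with the weighting function $\phi_s(\tau,\nu)$ as in Theorem 2.28.

Proof. By subtracting and adding $\sup_{t,f} |L_H^{(0)}(t,f)|^2$ and $\sup_{t,f} L_{H^+H}^{(0)}(t,f)$ and by using $\|H\|_O^2 = \lambda_{H^+H}^{\sup}$ as well as the triangle inequality, $\big|\sup_{t,f} |L_H^{(\alpha)}(t,f)|^2 - \|H\|_O^2\big|$ can be bounded as
$$\Big|\sup_{t,f} \big|L_H^{(\alpha)}(t,f)\big|^2 - \|H\|_O^2\Big| \le \Big|\sup_{t,f} \big|L_H^{(\alpha)}(t,f)\big|^2 - \sup_{t,f} \big|L_H^{(0)}(t,f)\big|^2\Big| + \Big|\sup_{t,f} \big|L_H^{(0)}(t,f)\big|^2 - \sup_{t,f} L_{H^+H}^{(0)}(t,f)\Big| + \Big|\sup_{t,f} L_{H^+H}^{(0)}(t,f) - \lambda_{H^+H}^{\sup}\Big|. \tag{2.186}$$
We next derive a bound for the difference $\Delta_{14}^{(\alpha)}(t,f) \triangleq |L_H^{(\alpha)}(t,f)|^2 - |L_H^{(0)}(t,f)|^2$ by noting that the 2-D Fourier transform of $|L_H^{(\alpha)}(t,f)|^2$ is given by $\int_{\tau'}\!\int_{\nu'} S_H^{(\alpha)}(\tau',\nu')\, S_H^{(\alpha)*}(\tau'-\tau, \nu'-\nu)\, d\tau'\, d\nu'$ and by using $S_H^{(\alpha)}(\tau,\nu) = S_H^{(0)}(\tau,\nu)\, e^{-j2\pi\alpha\tau\nu}$,
$$\begin{aligned}
\big|\Delta_{14}^{(\alpha)}(t,f)\big| &\le \int_{\tau'}\!\int_{\nu'}\!\int_{\tau_1}\!\int_{\nu_1} \big|S_H^{(0)}(\tau',\nu')\big|\, \big|S_H^{(0)}(\tau_1,\nu_1)\big|\, \big|e^{-j2\pi\alpha(\tau'\nu' - \tau_1\nu_1)} - 1\big|\, d\tau'\, d\nu'\, d\tau_1\, d\nu_1 \\
&= 2 \int_{\tau'}\!\int_{\nu'}\!\int_{\tau_1}\!\int_{\nu_1} |S_H(\tau',\nu')|\, |S_H(\tau_1,\nu_1)|\, \big|\sin\big(\pi\alpha(\tau'\nu' - \tau_1\nu_1)\big)\big|\, d\tau'\, d\nu'\, d\tau_1\, d\nu_1 \\
&\le 2\pi|\alpha| \int_{\tau'}\!\int_{\nu'}\!\int_{\tau_1}\!\int_{\nu_1} |S_H(\tau',\nu')|\, |S_H(\tau_1,\nu_1)|\, \big[|\tau'\nu'| + |\tau_1\nu_1|\big]\, d\tau'\, d\nu'\, d\tau_1\, d\nu_1 \\
&= 4\pi|\alpha|\, \|S_H\|_1^2\, m_H^{(1,1)}.
\end{aligned}$$
Using arguments similar to those in the proof of Theorem 2.28, it can be shown that this bound also implies that the first term on the right-hand side of (2.186) is bounded as
$$\Big|\sup_{t,f} \big|L_H^{(\alpha)}(t,f)\big|^2 - \sup_{t,f} \big|L_H^{(0)}(t,f)\big|^2\Big| \le 4\pi|\alpha|\, \|S_H\|_1^2\, m_H^{(1,1)}. \tag{2.187}$$
Furthermore, it similarly follows from (2.70) with $\alpha = 0$ that the second term on the right-hand side of (2.186) is bounded as
$$\Big|\sup_{t,f} \big|L_H^{(0)}(t,f)\big|^2 - \sup_{t,f} L_{H^+H}^{(0)}(t,f)\Big| \le 2\pi\, \|S_H\|_1^2\, m_H^{(0,1)} m_H^{(1,0)}. \tag{2.188}$$
Finally, since $H^+H$ is self-adjoint, the bounds in (2.144) hold for $H^+H$ and we obtain that the third term on the right-hand side of (2.186) is bounded as
$$\Big|\sup_{t,f} L_{H^+H}^{(0)}(t,f) - \lambda_{H^+H}^{\sup}\Big| \le \|S_{H^+H}\|_1\, m_{H^+H}^{(\phi_s)} \le \|S_H\|_1^2\, m_{H^+H}^{(\phi_s)}, \tag{2.189}$$
where we furthermore used $\|S_{H^+H}\|_1 \le \|S_{H^+}\|_1 \|S_H\|_1 = \|S_H\|_1^2$. Inserting the bounds (2.187), (2.188), and (2.189) into (2.186) finally yields (2.185).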
Discussion. Theorem 2.32 shows that if $m_{H^+H}^{(\phi_s)}$ can be made small by suitable choice of the function s(t), and if $m_H^{(1,1)}$ and $m_H^{(0,1)} m_H^{(1,0)}$ are small too, then
$$\sup_{t,f} \big|L_H^{(\alpha)}(t,f)\big| \approx \|H\|_O.$$
In particular, small $m_{H^+H}^{(\phi_s)}$ requires that $A_s^{(0)}(\tau,\nu) \approx A_s^{(0)}(0,0) = 1$ on the effective support of $|S_{H^+H}(\tau,\nu)|$, thus implying that $H^+H$ is underspread with very rapidly decreasing GSF. Furthermore, small $m_H^{(1,1)}$ and $m_H^{(0,1)} m_H^{(1,0)}$ amount to the requirement that the GSF of H be localized along the $\tau$ or $\nu$ axis. The latter requirement can be relaxed in the case $\alpha = 0$. Here, the bound in (2.185) will be tightest since its second term vanishes. Moreover, using (2.72) we obtain the tighter bound
$$\frac{\Big|\sup_{t,f} \big|L_H^{(0)}(t,f)\big|^2 - \|H\|_O^2\Big|}{\|S_H\|_1^2} \le 2\pi \inf_{U \in \mathcal{M}} m_{UHU^+}^{(0,1)}\, m_{UHU^+}^{(1,0)} + m_{H^+H}^{(\phi_s)}. \tag{2.190}$$
Since the metaplectic operators $U \in \mathcal{M}$ allow shearings and rotations of the GSF, the first term in this refined bound may be small even if the GSF is oriented in an oblique direction in the $(\tau,\nu)$ plane. The second term, $m_{H^+H}^{(\phi_s)}$, can be adapted to oblique directions by proper choice of s(t).
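Two facts used in this subsection have simple finite-dimensional counterparts that can be checked numerically: for an LTI system the operator norm equals the supremum of the transfer function magnitude, and for any system $\|H\|_O^2 = \lambda_{H^+H}^{\sup}$. The following sketch is only an illustration under stated assumptions: a circulant matrix stands in for an LTI system (so "time-invariant" means cyclic shifts), a random complex matrix stands in for a general LTV system, and the filter taps are arbitrary.

```python
import numpy as np

N = 64

# Circulant matrix as a discrete, cyclic stand-in for an LTI system with impulse response g.
g = np.zeros(N)
g[:3] = [0.5, 0.3, 0.2]                      # arbitrary illustrative taps
idx = (np.arange(N)[:, None] - np.arange(N)[None, :]) % N
G = g[idx]                                   # G[m, n] = g[(m - n) mod N]

op_norm_lti = np.linalg.norm(G, 2)           # largest singular value
sup_transfer = np.abs(np.fft.fft(g)).max()   # supremum of |transfer function|

# Random matrix as a stand-in for a general (non-normal) LTV system.
rng = np.random.default_rng(0)
H = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
op_norm = np.linalg.norm(H, 2)
lam_sup = np.linalg.eigvalsh(H.conj().T @ H).max()   # largest eigenvalue of H^+ H

print(op_norm_lti, sup_transfer, op_norm ** 2, lam_sup)
```

The first pair of numbers coincides because a circulant matrix is diagonalized by the DFT; for the random matrix no such transfer function exists, and only the second identity holds.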
2.3.16
Approximate Commutation
In contrast to the LTI or LFI case, two LTV systems G and H typically do not commute, i.e., $GH \ne HG$ or equivalently $[G, H] \ne 0$, where $[G, H] \triangleq GH - HG$ is the commutator of G and H. A necessary and sufficient condition for two operators to commute is the existence of a common system of eigenfunctions [64]. However, even if two operators do not have a common system of eigenfunctions, the following theorem shows that these operators will approximately commute if they are jointly underspread. In the following we consider the operator norm $\|[G, H]\|_O$. We note that bounds on the HS norm $\|[G, H]\|_2$ have been formulated previously for DL operators [118].
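Theorem 2.33. The operator norm of the commutator of two LTV systems G and H is bounded as
$$\frac{\|[G, H]\|_O^2}{\|S_G\|_1^2\, \|S_H\|_1^2} \le 16\pi^2 \Big[\inf_{U \in \mathcal{M}} B_{UGU^+,\,UHU^+}^{(0)}\Big]^2 + 8\pi \inf_{U \in \mathcal{M}} \Big[m_{UGU^+}^{(0,1)} + m_{UHU^+}^{(0,1)}\Big]\Big[m_{UGU^+}^{(1,0)} + m_{UHU^+}^{(1,0)}\Big] + 4\, m_{[G,H]^+[G,H]}^{(\phi_s)}, \tag{2.191}$$
with $B_{G,H}^{(0)} = \frac{1}{2}\Big[m_G^{(0,1)} m_H^{(1,0)} + m_G^{(1,0)} m_H^{(0,1)}\Big]$ and with $\phi_s(\tau,\nu)$ as in Theorem 2.28.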
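Proof. Applying (2.190) to [G, H] yields
$$\|[G, H]\|_O^2 \le \sup_{t,f} \big|L_{[G,H]}^{(0)}(t,f)\big|^2 + \|S_{[G,H]}\|_1^2 \Big[2\pi \inf_{U \in \mathcal{M}} m_{U[G,H]U^+}^{(0,1)}\, m_{U[G,H]U^+}^{(1,0)} + m_{[G,H]^+[G,H]}^{(\phi_s)}\Big]. \tag{2.192}$$
In order to bound $\sup_{t,f} |L_{[G,H]}^{(0)}(t,f)|^2$, we consider $|L_{[G,H]}^{(0)}(t,f)|$. Subtracting and adding $L_G^{(0)}(t,f)\, L_H^{(0)}(t,f)$ and applying (2.63) twice, we obtain
$$\big|L_{[G,H]}^{(0)}(t,f)\big| = \big|L_{GH}^{(0)}(t,f) - L_{HG}^{(0)}(t,f)\big| \le \big|L_{GH}^{(0)}(t,f) - L_G^{(0)}(t,f)\, L_H^{(0)}(t,f)\big| + \big|L_G^{(0)}(t,f)\, L_H^{(0)}(t,f) - L_{HG}^{(0)}(t,f)\big| \le 4\pi\, \|S_G\|_1 \|S_H\|_1 \inf_{U \in \mathcal{M}} B_{UGU^+,\,UHU^+}^{(0)},$$
which gives the bound
$$\sup_{t,f} \big|L_{[G,H]}^{(0)}(t,f)\big|^2 \le 16\pi^2\, \|S_G\|_1^2 \|S_H\|_1^2 \Big[\inf_{U \in \mathcal{M}} B_{UGU^+,\,UHU^+}^{(0)}\Big]^2. \tag{2.193}$$
With regard to the second term on the right-hand side of (2.192), we have
$$\begin{aligned}
\|S_{[G,H]}\|_1\, m_{U[G,H]U^+}^{(0,1)} &= \int_\tau\!\int_\nu |\nu|\, \big|S_{U(GH-HG)U^+}(\tau,\nu)\big|\, d\tau\, d\nu \\
&= \int_\tau\!\int_\nu |\nu|\, \big|S_{UGU^+UHU^+}(\tau,\nu) - S_{UHU^+UGU^+}(\tau,\nu)\big|\, d\tau\, d\nu \\
&\le \int_\tau\!\int_\nu |\nu|\, \Big[\big|S_{UGU^+UHU^+}(\tau,\nu)\big| + \big|S_{UHU^+UGU^+}(\tau,\nu)\big|\Big]\, d\tau\, d\nu \\
&= \|S_{UGU^+UHU^+}\|_1\, m_{UGU^+UHU^+}^{(0,1)} + \|S_{UHU^+UGU^+}\|_1\, m_{UHU^+UGU^+}^{(0,1)} \\
&\le \|S_{UGU^+}\|_1 \|S_{UHU^+}\|_1 \Big[m_{UGU^+}^{(0,1)} + m_{UHU^+}^{(0,1)}\Big] + \|S_{UHU^+}\|_1 \|S_{UGU^+}\|_1 \Big[m_{UHU^+}^{(0,1)} + m_{UGU^+}^{(0,1)}\Big] \\
&= 2\, \|S_G\|_1 \|S_H\|_1 \Big[m_{UGU^+}^{(0,1)} + m_{UHU^+}^{(0,1)}\Big],
\end{aligned}$$
where we twice applied (2.34) (once for $H_1 = UGU^+$, $H_2 = UHU^+$ and once for $H_1 = UHU^+$, $H_2 = UGU^+$) and in the last step used the fact that $\|S_{UGU^+}\|_1 = \|S_G\|_1$, $\|S_{UHU^+}\|_1 = \|S_H\|_1$ for $U \in \mathcal{M}$. In a similar manner, one can show using (2.33) that
$$\|S_{[G,H]}\|_1\, m_{U[G,H]U^+}^{(1,0)} \le 2\, \|S_G\|_1 \|S_H\|_1 \Big[m_{UGU^+}^{(1,0)} + m_{UHU^+}^{(1,0)}\Big].$$
Inserting these bounds into (2.192) and using $\|S_{[G,H]}\|_1 \le \|S_{GH}\|_1 + \|S_{HG}\|_1 \le 2\, \|S_G\|_1 \|S_H\|_1$, the bound (2.191) follows.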
Discussion. In the case of two jointly underspread systems where the weighted GSF integrals and moments in the bound (2.191) are small, the operator norm of the commutator [G, H] is also small, which shows that two jointly underspread systems approximately commute, i.e., $GH \approx HG$. Since two systems commute if and only if they have a common set of eigenfunctions [64, 158], this result is consistent with Theorem 2.22.
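The qualitative statement of Theorem 2.33 can be illustrated with a small discrete experiment. The sketch below is a toy setup rather than a computation of the bound (2.191): a short cyclic smoothing filter (only small time shifts) is combined with a diagonal operator whose gain varies slowly or rapidly in time (small or large frequency shifts). All matrix sizes, taps and modulation rates are illustrative assumptions.

```python
import numpy as np

def op_norm(A):
    """Operator norm (largest singular value)."""
    return np.linalg.norm(A, 2)

def rel_commutator(G, H):
    """||GH - HG||_O normalized by ||G||_O ||H||_O."""
    return op_norm(G @ H - H @ G) / (op_norm(G) * op_norm(H))

N = 128
I = np.eye(N)
n = np.arange(N)

# Cyclic LTI systems: a 3-tap smoothing filter and a pure cyclic shift.
G = 0.5 * I + 0.25 * np.roll(I, 1, axis=1) + 0.25 * np.roll(I, -1, axis=1)
shift = np.roll(I, 3, axis=1)

# LFI systems (time-domain multipliers) with slow and fast gain variation.
H_slow = np.diag(1.0 + 0.5 * np.cos(2 * np.pi * n / N))
H_fast = np.diag(1.0 + 0.5 * np.cos(2 * np.pi * 32 * n / N))

rel_lti = rel_commutator(G, shift)     # two LTI systems commute exactly
rel_slow = rel_commutator(G, H_slow)   # jointly "underspread" pair
rel_fast = rel_commutator(G, H_fast)   # large frequency shifts in H

print(rel_lti, rel_slow, rel_fast)
```

In this toy setting the commutator entries are products of a filter tap and a gain difference between neighbouring time instants, so the slowly varying multiplier gives a much smaller relative commutator norm than the rapidly varying one.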
2.3.17
Approximate Normality
Normal systems have the advantage of allowing an eigenvalue decomposition instead of a (numerically more expensive) singular value decomposition. In contrast to LTI or LFI systems, general LTV systems may be non-normal, i.e., $HH^+ \ne H^+H$ or equivalently $[H, H^+] = HH^+ - H^+H \ne 0$. A bound on $\|[H, H^+]\|_O$ could be obtained as a special case of (2.191). However, the following theorem exploits the self-adjointness of $[H, H^+]$ to yield a tighter and simpler bound. We note that bounds on the HS norm $\|[H, H^+]\|_2$ have been formulated previously for DL operators [118].
Theorem 2.34. The operator norm of the commutator of an LTV system H and its adjoint $H^+$ is bounded as
$$\frac{\|[H, H^+]\|_O}{\|S_H\|_1^2} \le 4\pi \inf_{U \in \mathcal{M}} m_{UHU^+}^{(0,1)}\, m_{UHU^+}^{(1,0)} + m_{HH^+}^{(\phi_s)} + m_{H^+H}^{(\phi_s)}, \tag{2.194}$$
with the weighting function $\phi_s(\tau,\nu)$ as in Theorem 2.28.

Proof. It is known [158] that $\|[H, H^+]\|_O = \sup_k |\lambda_{[H,H^+],k}| = \max\{-\lambda_{[H,H^+]}^{\inf}, \lambda_{[H,H^+]}^{\sup}\}$. First assume that $\|[H, H^+]\|_O = \lambda_{[H,H^+]}^{\sup}$. In this case, the second bound in (2.144) yields
$$\|[H, H^+]\|_O = \lambda_{[H,H^+]}^{\sup} \le \sup_{t,f} L_{[H,H^+]}^{(0)}(t,f) + \|S_{[H,H^+]}\|_1\, m_{[H,H^+]}^{(\phi_s)} \le \sup_{t,f} \big|L_{[H,H^+]}^{(0)}(t,f)\big| + \|S_{[H,H^+]}\|_1\, m_{[H,H^+]}^{(\phi_s)}. \tag{2.195}$$
Specializing (2.193), the first term in (2.195) can be bounded as
$$\sup_{t,f} \big|L_{[H,H^+]}^{(0)}(t,f)\big| \le 4\pi\, \|S_H\|_1^2 \inf_{U \in \mathcal{M}} m_{UHU^+}^{(0,1)}\, m_{UHU^+}^{(1,0)}, \tag{2.196}$$
where we used that $B_{H,H^+}^{(0)} = m_H^{(0,1)} m_H^{(1,0)}$ due to $m_{H^+}^{(k,l)} = m_H^{(k,l)}$. The second term in (2.195) can be bounded as
$$\|S_{[H,H^+]}\|_1\, m_{[H,H^+]}^{(\phi_s)} = \int_\tau\!\int_\nu \phi_s(\tau,\nu)\, \big|S_{HH^+}(\tau,\nu) - S_{H^+H}(\tau,\nu)\big|\, d\tau\, d\nu \le \int_\tau\!\int_\nu \phi_s(\tau,\nu) \Big[\big|S_{HH^+}(\tau,\nu)\big| + \big|S_{H^+H}(\tau,\nu)\big|\Big]\, d\tau\, d\nu \tag{2.197}$$
$$= \|S_{HH^+}\|_1\, m_{HH^+}^{(\phi_s)} + \|S_{H^+H}\|_1\, m_{H^+H}^{(\phi_s)} \le \|S_H\|_1^2 \Big[m_{HH^+}^{(\phi_s)} + m_{H^+H}^{(\phi_s)}\Big], \tag{2.198}$$
where once again we used Young's inequality. Inserting (2.196) and (2.198) into (2.195), the bound (2.194) follows. If $\|[H, H^+]\|_O = -\lambda_{[H,H^+]}^{\inf} = \lambda_{-[H,H^+]}^{\sup}$, the bound is shown similarly by applying the second bound in (2.144) to $-[H, H^+]$.

Discussion. From Theorem 2.34, it follows that an underspread system H for which $\inf_{U \in \mathcal{M}} m_{UHU^+}^{(0,1)}\, m_{UHU^+}^{(1,0)}$ as well as $m_{HH^+}^{(\phi_s)}$ and $m_{H^+H}^{(\phi_s)}$ are small satisfies $HH^+ \approx H^+H$ and is thus approximately normal. In most cases, the latter two requirements will be the stronger constraints, requiring very rapid GSF decay. Note however that it is not necessary that the GSFs of H, $H^+H$, or $HH^+$ are located along the $\tau$ or $\nu$ axis since all moments and weighted integrals involved in the bound allow for oblique GSF orientation.
2.3.18
Sampling of the Generalized Weyl Symbol of Underspread Operators
In this section we shall investigate a sampling expansion of the GWS. GWS sampling is of practical interest since it yields sparse representations of the underlying system [118]. This topic has previously been considered for DL operators in [118] and in the context of random LTV channels in [11, 102, 103]. Here, we will consider two methods for the sampling of the GWS of non-DL operators and investigate the associated reconstruction errors. In what follows, we will restrict to rectangular sampling grids, although other grids might also be useful [118, 120, 121]. The following short-hand notation will be used, where T and F denote the temporal and spectral sampling periods, respectively,
$$L_H^{(\alpha)}[k,l] \triangleq L_H^{(\alpha)}(kT, lF).$$
In order to reconstruct the continuous GWS from its discrete samples, a weighted superposition of TF shifted versions of an interpolation function $\psi(t,f)$, with the GWS samples as weights, can be used,
$$\hat L_H^{(\alpha)}(t,f) = TF \sum_k \sum_l L_H^{(\alpha)}[k,l]\, \psi(t - kT, f - lF).$$
If consistency is desired in the sense that the TF samples of $\hat L_H^{(\alpha)}(t,f)$ equal those of $L_H^{(\alpha)}(t,f)$, i.e., $\hat L_H^{(\alpha)}(kT, lF) = L_H^{(\alpha)}[k,l]$, one has to require that
$$\psi(kT, lF) = \begin{cases} \dfrac{1}{TF}, & k = l = 0, \\ 0, & \text{otherwise.} \end{cases}$$
Using standard techniques of 2-D sampling theory, the 2-D Fourier transform of $\hat L_H^{(\alpha)}(t,f)$ can be shown to be given by
$$\hat S_H^{(\alpha)}(\tau,\nu) \triangleq \int_t\!\int_f \hat L_H^{(\alpha)}(t,f)\, e^{-j2\pi(\nu t - \tau f)}\, dt\, df = \Psi(\tau,\nu) \sum_k \sum_l S_H^{(\alpha)}\Big(\tau - \frac{l}{F}, \nu + \frac{k}{T}\Big), \tag{2.199}$$
where $\Psi(\tau,\nu) = \int_t\!\int_f \psi(t,f)\, e^{-j2\pi(\nu t - \tau f)}\, dt\, df$ is the 2-D Fourier transform of $\psi(t,f)$ and summations are from $-\infty$ to $\infty$. A particularly interesting choice for the reconstruction kernel is $\psi(t,f) = \frac{1}{TF}\, \mathrm{sinc}\big(\pi \frac{t}{T}\big)\, \mathrm{sinc}\big(\pi \frac{f}{F}\big)$, with $\mathrm{sinc}(x) = \sin(x)/x$, or, equivalently, $\Psi(\tau,\nu) = I_{\mathcal{G}}(\tau,\nu)$ with $\mathcal{G} = \big[-\frac{1}{2F}, \frac{1}{2F}\big] \times \big[-\frac{1}{2T}, \frac{1}{2T}\big]$. In that case, the reconstruction formulas for the GWS and GSF read
$$\hat L_H^{(\alpha)}(t,f) = \sum_k \sum_l L_H^{(\alpha)}[k,l]\; \mathrm{sinc}\Big(\frac{\pi}{T}(t - kT)\Big)\, \mathrm{sinc}\Big(\frac{\pi}{F}(f - lF)\Big) \tag{2.200}$$
and
$$\hat S_H^{(\alpha)}(\tau,\nu) = I_{\mathcal{G}}(\tau,\nu) \sum_k \sum_l S_H^{(\alpha)}\Big(\tau - \frac{l}{F}, \nu + \frac{k}{T}\Big). \tag{2.201}$$
The latter relation, which describes GWS sampling and reconstruction in the GSF domain, is illustrated in Fig. 2.13. In the case of DL operators whose GSF support is completely contained in $\mathcal{G}$, the continuous GWS is perfectly re-obtained from its samples if we use $\Psi(\tau,\nu) = I_{\mathcal{G}}(\tau,\nu)$. In fact, it follows from (2.201) that in this case $\hat S_H^{(\alpha)}(\tau,\nu) = S_H^{(\alpha)}(\tau,\nu)$ and hence $\hat L_H^{(\alpha)}(t,f) = L_H^{(\alpha)}(t,f)$.
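The separable sinc interpolation of a symbol from its samples on a rectangular TF grid can be sketched as follows. This is a minimal illustration under stated assumptions: the "symbol" is a hypothetical separable test function (a product of squared sinc functions, chosen because it is bandlimited to the region with half-widths 1/(2T) and 1/(2F)), the infinite sums are truncated to a finite index range, and all names are illustrative. Note that `np.sinc(x)` is the normalized sinc sin(πx)/(πx), so `np.sinc((t - k*T)/T)` corresponds to sinc(π(t − kT)/T) in the convention sinc(x) = sin(x)/x.

```python
import numpy as np

def sinc_interp_matrix(points, period, index):
    """Matrix with entries sinc(pi (p - k*period) / period) for all points p and indices k."""
    return np.sinc((points[:, None] - index[None, :] * period) / period)

def reconstruct_symbol(samples, k_idx, l_idx, T, F, t, f):
    """Truncated separable sinc interpolation: sum_k sum_l L[k,l] sinc(.) sinc(.)."""
    A_t = sinc_interp_matrix(t, T, k_idx)
    A_f = sinc_interp_matrix(f, F, l_idx)
    return A_t @ samples @ A_f.T

def toy_symbol(t, f, T, F):
    """Hypothetical bandlimited test symbol (not the GWS of a specific system)."""
    return np.sinc(t[:, None] / (2 * T)) ** 2 * np.sinc(f[None, :] / (2 * F)) ** 2

T, F = 0.5, 2.0                               # sampling periods in time and frequency
k_idx = np.arange(-200, 201)                  # truncated index ranges
l_idx = np.arange(-200, 201)
samples = toy_symbol(k_idx * T, l_idx * F, T, F)

# Evaluate the reconstruction off the sampling grid and compare with the true symbol.
t = np.linspace(-1.0, 1.0, 21)
f = np.linspace(-4.0, 4.0, 21)
L_hat = reconstruct_symbol(samples, k_idx, l_idx, T, F, t, f)
max_err = np.max(np.abs(L_hat - toy_symbol(t, f, T, F)))

# On the sampling grid itself the interpolation returns the samples.
on_grid = reconstruct_symbol(samples, k_idx, l_idx, T, F, k_idx[195:206] * T, l_idx[195:206] * F)
grid_err = np.max(np.abs(on_grid - samples[195:206, 195:206]))

print(max_err, grid_err)
```

The residual off-grid error in this sketch stems only from truncating the sums; a test symbol that is not bandlimited to the region would additionally show aliasing errors away from the sampling grid.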
2.3 Underspread Approximations 1/T 1/F G
93
(a)
(b)
Figure 2.13: Illustration of the eect of GWS sampling in the GSF domain: (a) GSF of original system, (b) GSF of sampled GWS, reconstruction function (, ) = IG (, ) (dotted rectangle), and GSF of reconstructed GWS (dark gray). A properly DL operator has been assumed resulting in errorfree reconstruction. In the case of an operator that is not properly DL, the various GSF components in (b) would overlap and error-free reconstruction would be impossible. In the case of non-DL operators or DL operators whose GSF support is not contained in G,
f F
GWS sampling and interpolation using (t, f ) = T1 sinc t sinc F T () () L (t, f ) which equals L (t, f ) only on the sampling grid,
H H () () LH (kT, lF ) = LH (kT, lF ) ,
yields a reconstructed GWS
but is generally dierent from LH (t, f ) otherwise. Let us consider the operator H dened by L b (t, f ) = LH (t, f ),
() () () with LH (t, f ) given by (2.200). With S b (, ) = SH (, ) and (2.201), it follows that H is DL. H () H ()
()
Furthermore, the GWS of H equals that of H on the sampling grid but is generally dierent otherwise. A general bound on the error introduced by sampling the GWS of a system H that is not properly DL, i.e., by using the reconstructed operator H instead of H, is stated in the following theorem. Theorem 2.35. For any operator H and any sampling periods T and F , the dierence 15 (t, f ) is bounded as 15 (t, f ) (0,1) (1,0) 4 mH F + mH T , SH 1 part HG according to (2.5), we obtain
() () () () () () L b (t, f ) LH (t, f ) = LH (t, f ) LH (t, f ) H
15 H
() 2 2
4 MH
(1,0)
F + MH
(0,1)
T .
(2.202)
1 1 1 1 Proof. As before, let G = 2F , 2F 2T , 2T . Splitting H into its DL part20 HG and its non-DL
() () 15 (t, f ) = L G
20
H +HG
b b Note that in general HG = H due to the aliasing components in H.
(t, f ) L
() (t, f ) HG +HG
94

() () () () = LHG (t, f ) + L G (t, f ) LHG (t, f ) L G (t, f ) H H () () = L G (t, f ) L G (t, f ) , H H
(2.203)
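since the DL part $H^G$ can be perfectly reconstructed, i.e., $\hat L_{H^G}^{(\alpha)}(t,f) = L_{H^G}^{(\alpha)}(t,f)$. Using the triangle inequality, we have
$$|\Delta_{15}^{(\alpha)}(t,f)| \le |\hat L_{H^{\bar G}}^{(\alpha)}(t,f)| + |L_{H^{\bar G}}^{(\alpha)}(t,f)|. \tag{2.204}$$
Using (2.201), the first term on the right-hand side can be bounded as
$$|\hat L_{H^{\bar G}}^{(\alpha)}(t,f)| \le \iint |\hat S_{H^{\bar G}}^{(\alpha)}(\tau,\nu)|\, d\tau\, d\nu = \iint I_G(\tau,\nu) \Big| \sum_k \sum_l S_{H^{\bar G}}^{(\alpha)}\Big(\tau + \frac{k}{F}, \nu + \frac{l}{T}\Big) \Big|\, d\tau\, d\nu$$
$$\le \iint \underbrace{\sum_k \sum_l I_G\Big(\tau - \frac{k}{F}, \nu - \frac{l}{T}\Big)}_{=1} |S_{H^{\bar G}}^{(\alpha)}(\tau,\nu)|\, d\tau\, d\nu = \iint |S_{H^{\bar G}}(\tau,\nu)|\, d\tau\, d\nu = \iint_{\bar G} |S_H(\tau,\nu)|\, d\tau\, d\nu. \tag{2.205}$$
The second term on the right-hand side of (2.204) can be bounded as
$$|L_{H^{\bar G}}^{(\alpha)}(t,f)| \le \iint |S_{H^{\bar G}}(\tau,\nu)|\, d\tau\, d\nu = \iint_{\bar G} |S_H(\tau,\nu)|\, d\tau\, d\nu. \tag{2.206}$$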
By inserting (2.205) and (2.206) in (2.204), it follows that $|\Delta_{15}^{(\alpha)}(t,f)| \le 2 \iint_{\bar G} |S_H(\tau,\nu)|\, d\tau\, d\nu$. From this, the first ($L_\infty$) bound in (2.202) finally follows upon applying the first bound in (2.38) with $\tau_G = 1/(2F)$ and $\nu_G = 1/(2T)$. With regard to the second ($L_2$) bound in (2.202), one can show that $\|\Delta_{15}^{(\alpha)}\|_2$ is bounded in terms of $\big(\iint_{\bar G} |S_H(\tau,\nu)|^2\, d\tau\, d\nu\big)^{1/2}$, from which the final bound is obtained upon applying the second bound in (2.38).

Note that (2.203) implies that the 2-D Fourier transform of $\Delta_{15}^{(\alpha)}(t,f)$ equals
$$\iint \Delta_{15}^{(\alpha)}(t,f)\, e^{-j2\pi(\nu t - \tau f)}\, dt\, df = \hat S_{H^{\bar G}}^{(\alpha)}(\tau,\nu) - S_{H^{\bar G}}^{(\alpha)}(\tau,\nu),$$
with $\hat S_H^{(\alpha)}(\tau,\nu)$ given by (2.201). According to (2.201) and (2.5), these two components are supported within the disjoint regions $G$ and $\bar G$, respectively. The first component (corresponding to $\hat L_{H^{\bar G}}^{(\alpha)}(t,f)$) is the aliasing error resulting from sampling $L_H^{(\alpha)}(t,f)$, which is not properly bandlimited. The second component (corresponding to $L_{H^{\bar G}}^{(\alpha)}(t,f)$) is the error which results since the non-DL part of $H$ cannot be reconstructed by bandlimited interpolation. We note that the application of Theorem 2.35 to the sounding of mobile radio channels will be discussed in Section 4.3.

The reconstruction method described above preserves the GWS values on the sampling grid $(t,f) = (kT, lF)$, but suffers from a reconstruction error due to aliasing if we sample the GWS of an operator $H$ that is not properly DL. In order to avoid this aliasing error, one could alternatively try to first approximate $H$ by a properly DL operator whose GWS can then be perfectly reconstructed from its samples. Let us approximate $H$ in a least-squares sense by a DL operator with GSF support contained in $G$. Straightforward application of the orthogonality principle yields as best approximation the operator $H^G$ (cf. also Subsection 2.1.1),
$$\arg\min_{\tilde H} \|H - \tilde H\|_2 = H^G,$$
where the minimization is over all operators $\tilde H$ whose GSF support is contained in $G$.
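The next theorem quantifies the errors induced by this alternative sampling method.

Theorem 2.36. For any operator $H$ and any sampling periods $T$, $F$ with corresponding region $G = \left[-\frac{1}{2F}, \frac{1}{2F}\right] \times \left[-\frac{1}{2T}, \frac{1}{2T}\right]$, the difference
$$\Delta_{16}^{(\alpha)}(t,f) \triangleq L_H^{(\alpha)}(t,f) - \hat L_{H^G}^{(\alpha)}(t,f)$$
is bounded as
$$\frac{|\Delta_{16}^{(\alpha)}(t,f)|}{\|S_H\|_1} \le 2\left(m_H^{(1,0)} F + m_H^{(0,1)} T\right), \qquad \frac{\|\Delta_{16}^{(\alpha)}\|_2}{\|H\|_2} \le 2\left(M_H^{(1,0)} F + M_H^{(0,1)} T\right). \tag{2.207}$$

Proof. Since $H^G$ is properly DL, it can be perfectly reconstructed, i.e., $\hat{H}^G = H^G$ and $\hat L_{H^G}^{(\alpha)}(t,f) = L_{H^G}^{(\alpha)}(t,f)$. Hence, the reconstruction error is given by
$$\Delta_{16}^{(\alpha)}(t,f) = L_H^{(\alpha)}(t,f) - \hat L_{H^G}^{(\alpha)}(t,f) = L_H^{(\alpha)}(t,f) - L_{H^G}^{(\alpha)}(t,f) = L_{H^{\bar G}}^{(\alpha)}(t,f).$$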
The bounds in (2.207) then follow straightforwardly upon combining (2.37) and (2.38).

By comparing (2.207) with (2.202), it is seen that the bounds on the reconstruction error $\Delta_{16}^{(\alpha)}(t,f)$ are smaller than the respective bounds on the reconstruction error $\Delta_{15}^{(\alpha)}(t,f)$ (resulting from sampling $L_H^{(\alpha)}(t,f)$ directly). This corresponds to the fact that no aliasing errors are introduced when sampling $L_{H^G}^{(\alpha)}(t,f)$. On the other hand, the GWS samples of $H$ and its DL approximation $H^G$ are not equal in general, $L_H^{(\alpha)}[k,l] \neq L_{H^G}^{(\alpha)}[k,l]$, and thus this second method does not give perfect reconstruction on the sampling grid.

An example illustrating the sampling and reconstruction of the GWS of a non-DL operator is shown in Fig. 2.14. It is seen that irrespective of the fact that $H$ is non-DL, both sampling/reconstruction methods yield satisfactory results, the reason being that $H$ is sufficiently underspread. In this example, the normalized errors introduced by sampling the GWS directly are $|\Delta_{15}^{(\alpha)}(t,f)| / \|S_H\|_1 = 1.7 \cdot 10^{-4}$ and $\|\Delta_{15}^{(\alpha)}\|_2 / \|H\|_2 = 0.002$, while the corresponding $L_\infty$ and $L_2$ bounds in (2.202) are 0.01 and 0.1, respectively. The normalized errors obtained by approximating $H$ by its DL part $H^G$ (which can be perfectly reconstructed from its GWS samples) are $|\Delta_{16}^{(\alpha)}(t,f)| / \|S_H\|_1 = 10^{-4}$ and $\|\Delta_{16}^{(\alpha)}\|_2 / \|H\|_2 = 0.001$, while the corresponding $L_\infty$ and $L_2$ bounds in (2.207) are 0.005 and 0.05, respectively. These errors as well as the corresponding bounds confirm that the second approach, based on the least-squares approximation of $H$ by $H^G$, is superior to sampling the GWS directly. Finally, we note that in this example the bounds are larger by a factor of about 50 than the true errors. This can be attributed to the fact that our error bounds are based on the Chebyshev-like inequalities in (2.38), which often tend to be rather loose.

Figure 2.14: Example illustrating the sampling/reconstruction of the GWS (with $\alpha = 0$) of a non-DL system: (a) true Weyl symbol $L_H^{(0)}(t,f)$, (b) reconstructed Weyl symbol $\hat L_H^{(0)}(t,f)$, (c) Weyl symbol of DL part $H^G$, (d) reconstruction kernel $\phi(t,f)$. In all panels, the horizontal axis is time $t$ and the vertical axis is frequency $f$. The number of samples is 128, the (normalized) frequency range covered is $[-1/4, 1/4]$, and the temporal and spectral sampling periods were $T = 16$ and $F = 1/16$, respectively.
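The right-hand sides of (2.202) and (2.207) are simple to evaluate once the GSF moments are known. The following is a minimal sketch, not part of the original text: the function name `gws_sampling_bounds` and the flag `dl_approximation` are hypothetical, and the moment values in the usage line are made-up illustration numbers rather than those of the example in Fig. 2.14.

```python
def gws_sampling_bounds(m10, m01, M10, M01, T, F, dl_approximation=False):
    """Evaluate the right-hand sides of (2.202) or (2.207).

    m10, m01 : GSF moments m_H^(1,0), m_H^(0,1)
    M10, M01 : GSF moments M_H^(1,0), M_H^(0,1)
    T, F     : temporal and spectral sampling periods
    dl_approximation : False -> direct GWS sampling, factor 4 (2.202);
                       True  -> sampling the DL part H^G, factor 2 (2.207).
    Returns (L_infinity bound, L_2 bound) on the normalized errors.
    """
    factor = 2.0 if dl_approximation else 4.0
    linf_bound = factor * (m10 * F + m01 * T)
    l2_bound = factor * (M10 * F + M01 * T)
    return linf_bound, l2_bound


# Illustration with arbitrary moment values (assumed, not from the thesis).
direct = gws_sampling_bounds(0.01, 0.001, 0.02, 0.002, T=16, F=1 / 16)
approx = gws_sampling_bounds(0.01, 0.001, 0.02, 0.002, T=16, F=1 / 16,
                             dl_approximation=True)
```

Whatever the moments are, the bounds for direct sampling are exactly twice those for the least-squares DL approximation, which matches the ratio of the bound values quoted above.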
3 Underspread Processes

"All stable processes we shall predict. All unstable processes we shall control."
John von Neumann

This chapter discusses the analysis and characterization of nonstationary random processes via time-frequency methods. In Section 3.1, we consider the description of the statistical time-frequency correlations of random processes by means of time-frequency correlation functions and the expected ambiguity function, and we discuss the concept of underspread and overspread processes. In Section 3.2, we study two different families of time-varying power spectra (the generalized Wigner-Ville spectrum and the generalized evolutionary spectrum) and we show that these two families of spectra become approximately equivalent in the case of underspread processes. We also discuss the relation of time-frequency correlations with statistical cross or interference terms appearing in time-varying power spectra. The suppression of such interference terms via smoothing provides a motivation for the definition of two fundamental classes of time-varying power spectra which we refer to as type I spectra (Section 3.3) and type II spectra (Section 3.4). These two classes of spectra extend the generalized Wigner-Ville spectrum and the generalized evolutionary spectrum, respectively. It turns out that for underspread processes, type I and type II spectra satisfy desirable mathematical properties at least in an approximate way. Furthermore, we show in Section 3.5 that in the underspread case type I and type II spectra are almost identical. Further topics discussed in this chapter include time-frequency input-output relations for nonstationary random processes (Section 3.6), approximate Karhunen-Loève expansions (Section 3.7), and the concept of time-frequency coherence functions (Section 3.8).
3.1 Time-Frequency Correlation Analysis

In Chapter 2, the characterization of time-frequency (TF) displacements of linear systems was seen to be of fundamental importance. For random processes, a similar role is played by TF correlations. In what follows, this will be explained in more detail.
3.1.1 Motivation

The basic second-order statistic of a nonstationary random process$^1$ $x(t)$ is the (temporal) correlation function $r_x(t_1,t_2) = E\{x(t_1)\, x^*(t_2)\}$ (or, equivalently, the correlation operator $R_x$). Another useful second-order characterization is the spectral correlation function, i.e., the correlation of the process' Fourier transform $X(f)$, $R_X(f_1,f_2) = E\{X(f_1)\, X^*(f_2)\}$ [60, 63, 136]. The spectral correlation function is also referred to as Loève's spectrum and has recently gained new interest (see e.g. [76, 199]). In the stationary case, the spectral correlation function is nonzero only for $f_1 = f_2$, which shows that different spectral components are uncorrelated and thus only temporal correlations are present. Similarly, the temporal correlation function of a white process is nonzero only for $t_1 = t_2$, which reflects that different temporal components are uncorrelated (while spectral correlations may well be present). In the extreme case of a stationary and white process, neither temporal nor spectral correlations are present. General nonstationary processes, however, feature both temporal and spectral correlations. Joint descriptions of these TF correlations are considered next.
3.1.2 Time-Frequency Correlation Functions

To obtain a function that jointly describes the temporal and spectral correlations of a random process, we present a modification and generalization of the line of arguments given in [118, 126]. There, the correlation function of the short-time Fourier transform (STFT, see Subsection B.2.1) of $x(t)$ using analysis window$^2$ $g(t)$,
$$R_x^{(g)}(t_1,f_1;t_2,f_2) \triangleq E\big\{\mathrm{STFT}_x^{(g)}(t_1,f_1)\, \mathrm{STFT}_x^{(g)*}(t_2,f_2)\big\},$$
was introduced as a measure of the correlation between two process components localized around the TF points $(t_1,f_1)$ and $(t_2,f_2)$, respectively. Since $\mathrm{STFT}_x^{(g)}(t,f) = \langle x, g_{t,f}\rangle$ with $g_{t,f}(t') = (S_{t,f}\, g)(t')$, $R_x^{(g)}(t_1,f_1;t_2,f_2)$ can be rewritten as
$$R_x^{(g)}(t_1,f_1;t_2,f_2) = E\big\{\langle x, g_{t_1,f_1}\rangle \langle x, g_{t_2,f_2}\rangle^*\big\} = E\big\{\langle P_g S^+_{t_1,f_1} x,\, S^+_{t_2,f_2} x\rangle\big\} = \mathrm{Tr}\big\{S_{t_2,f_2} P_g S^+_{t_1,f_1} R_x\big\}, \tag{3.1}$$
where $P_g = g \otimes g^*$ denotes the rank-one projection operator on $\mathrm{span}\{g(t)\}$. Note that the temporal and spectral correlation functions are re-obtained from $R_x^{(g)}(t_1,f_1;t_2,f_2)$ with $g(t) = \delta(t)$ and $g(t) = 1$, respectively: the choice $g(t) = \delta(t)$ yields $r_x(t_1,t_2)$, and the choice $g(t) = 1$ yields $R_X(f_1,f_2)$.

$^1$ Throughout this chapter, we assume that all random processes have zero mean.
$^2$ The window $g(t)$ is assumed to be normalized and well-localized about the origin of the TF plane.
Furthermore, for $t_1 = t_2 = t$ and $f_1 = f_2 = f$, there is
$$R_x^{(g)}(t,f;t,f) = E\big\{\mathrm{SPEC}_x^{(g)}(t,f)\big\} = \mathrm{PS}_x^{(g)}(t,f),$$
i.e., the diagonal of the TF correlation function equals the physical spectrum (see Subsections 3.3.2 and B.3.3).

By replacing $P_g$ in (3.1) with a trace-normalized TF localization operator $T$ [42-44, 80, 81, 174, 175], a generalization of $R_x^{(g)}(t_1,f_1;t_2,f_2)$ can be defined as
$$R_x^{(T)}(t_1,f_1;t_2,f_2) \triangleq E\big\{\langle T S^+_{t_1,f_1} x,\, S^+_{t_2,f_2} x\rangle\big\} = \mathrm{Tr}\big\{S_{t_2,f_2}\, T\, S^+_{t_1,f_1} R_x\big\}. \tag{3.2}$$
This is a more flexible measure of the TF correlations of $x(t)$ than $R_x^{(g)}(t_1,f_1;t_2,f_2)$. For compact and normal $T$ with eigendecomposition $T = \sum_k \lambda_k\, g_k \otimes g_k^* = \sum_k \lambda_k P_{g_k}$ (see Subsection A.3), one obtains
$$R_x^{(T)}(t_1,f_1;t_2,f_2) = \mathrm{Tr}\Big\{S_{t_2,f_2} \Big[\sum_k \lambda_k P_{g_k}\Big] S^+_{t_1,f_1} R_x\Big\} = \sum_k \lambda_k\, \mathrm{Tr}\big\{S_{t_2,f_2} P_{g_k} S^+_{t_1,f_1} R_x\big\} = \sum_k \lambda_k\, R_x^{(g_k)}(t_1,f_1;t_2,f_2).$$
This shows that $R_x^{(g)}(t_1,f_1;t_2,f_2)$ in (3.1) is a special case of $R_x^{(T)}(t_1,f_1;t_2,f_2)$ obtained using a rank-one localization operator $T = P_g$. A further special case is $T = \frac{1}{N} P$, where $P$ is the projection operator on a given $N$-dimensional subspace $\mathcal{X}$. To each non-sophisticated subspace $\mathcal{X}$, there corresponds a TF region $R$ such that $L_P^{(\alpha)}(t,f) \approx I_R(t,f)$, with $I_R(t,f)$ the indicator function of $R$ [80, 81]. Assuming that $P$ is such that the corresponding TF region $R$ is localized around the origin of the TF plane, the TF correlation $R_x^{(T)}(t_1,f_1;t_2,f_2)$ can be interpreted as the correlation of the process components localized within $I_R(t - t_1, f - f_1)$ and the process components localized within $I_R(t - t_2, f - f_2)$ (see Fig. 3.1).

In what follows, the following coordinate-transformed version of the TF correlation will be convenient,
$$\tilde R_x^{(T)}(t,f;\Delta t,\Delta f) \triangleq R_x^{(T)}\Big(t + \frac{\Delta t}{2},\, f + \frac{\Delta f}{2};\; t - \frac{\Delta t}{2},\, f - \frac{\Delta f}{2}\Big).$$
In Section 3.3, the diagonal $R_x^{(T)}(t,f;t,f) = \tilde R_x^{(T)}(t,f;0,0)$ will be seen to correspond to a specific class of time-varying spectra.
3.1.3 The Expected Ambiguity Function

The correlation function $R_x^{(T)}(t_1,f_1;t_2,f_2)$ is physically intuitive but has two drawbacks: first, it depends on the specific choice of the TF localization operator $T$ and thus is non-unique; second, as a function of four variables it is cumbersome to work with. We will see presently that the generalized expected ambiguity function [118, 120, 126] (GEAF) resolves the above two problems; however, it does not directly lend itself to an intuitive physical interpretation.

Figure 3.1: Illustration of the TF correlation $R_x^{(T)}(t_1,f_1;t_2,f_2)$ between components of the random process $x(t)$ localized in regions $I_R(t - t_1, f - f_1)$ and $I_R(t - t_2, f - f_2)$ around the TF analysis points $(t_1,f_1)$ and $(t_2,f_2)$. In this example, $T = P/3$ where $P$ is the projection operator on the space spanned by the first three Hermite functions [64, 81, 99] (each represented by a hatched annular region). The corresponding TF region $R$ has approximately area 3. The shaded region illustrates the TF support of the nonstationary process $x(t)$.
The GEAF is defined as (cf. (B.52))
$$\bar A_x^{(\alpha)}(\tau,\nu) \triangleq \int_t r_x^{(\alpha)}(t,\tau)\, e^{-j2\pi\nu t}\, dt, \tag{3.3}$$
with $r_x^{(\alpha)}(t,\tau) = r_x\big(t + (\tfrac{1}{2} - \alpha)\tau,\; t - (\tfrac{1}{2} + \alpha)\tau\big)$. Under certain weak conditions, it equals the expectation of the process' generalized ambiguity function, $\bar A_x^{(\alpha)}(\tau,\nu) = E\{A_x^{(\alpha)}(\tau,\nu)\} = E\{\langle x, S_{\tau,\nu}^{(\alpha)} x\rangle\}$. Furthermore, by comparing (3.3) with (B.1), it is seen that the GEAF is the GSF of the correlation operator $R_x$,
$$\bar A_x^{(\alpha)}(\tau,\nu) = S_{R_x}^{(\alpha)}(\tau,\nu).$$
Further properties of the GEAF can be found in Subsection B.3.4 of Appendix B and in [118, 126]. What is important in the context of the TF correlation function $\tilde R_x^{(T)}(t,f;\Delta t,\Delta f)$ is the fact that
(T) Rx (t, f ; t, f ) = ej2tf (0) x ST (t , f ) A(0) (, ) ej2( t f ) d d .
(3.4)
If $S_T(\tau,\nu)$ is well localized around the origin, (3.4) shows that $\tilde R_x^{(T)}(t,f;\Delta t,\Delta f)$, and hence also $R_x^{(T)}(t_1,f_1;t_2,f_2)$, is essentially determined by the values of $\bar A_x^{(0)}(\tau,\nu)$ around $\tau = \Delta t$ and $\nu = \Delta f$. Furthermore, (3.4) implies that the magnitude of the TF correlation function is bounded by the convolution of $|S_T(\tau,\nu)|$ and $|\bar A_x(\tau,\nu)|$, i.e.,
$$\big|\tilde R_x^{(T)}(t,f;\Delta t,\Delta f)\big| \le \iint |S_T(\tau - \Delta t,\, \nu - \Delta f)|\; |\bar A_x(\tau,\nu)|\, d\tau\, d\nu. \tag{3.5}$$
This upper bound does not depend on the absolute location (center point) $(t,f)$ of the TF analysis points $(t_1,f_1)$ and $(t_2,f_2)$ but only on their time separation $\Delta t = t_1 - t_2$ and frequency separation $\Delta f = f_1 - f_2$. Thus, for well localized $|S_T(\tau,\nu)|$, (3.5) shows that the magnitude $|\bar A_x(\tau,\nu)|$ of the GEAF at a point $(\tau,\nu)$ characterizes the correlation of all process components separated by $\Delta t = \tau$ in time and by $\Delta f = \nu$ in frequency.

It is insightful to formally consider two extreme cases$^3$ for the choice of the operator $T$ underlying $\tilde R_x^{(T)}(t,f;\Delta t,\Delta f)$:

$T = I$: Choosing $T$ as the identity operator implies that no TF localization is performed at all. Consequently, all components of $x(t)$ separated by $\Delta t$ in time and by $\Delta f$ in frequency contribute to the TF correlation function. This is reflected by the fact that $\tilde R_x^{(I)}(t,f;\Delta t,\Delta f)$ equals $\bar A_x^{(0)}(\Delta t,\Delta f)$ up to a unimodular phase factor, so that
$$\big|\tilde R_x^{(I)}(t,f;\Delta t,\Delta f)\big| = |\bar A_x(\Delta t,\Delta f)|,$$
which shows that here TF correlation function and GEAF become perfectly equivalent. In particular, $|\tilde R_x^{(T)}(t,f;\Delta t,\Delta f)|$ does not depend on the absolute location (center point) $(t,f)$ of the TF analysis points $(t_1,f_1)$ and $(t_2,f_2)$.

$T = L^{(0)}$: Choosing $T$ to equal the TF reflection operator $L^{(0)}$ defined by (B.22) yields perfect TF localization since $L_{L^{(0)}}^{(0)}(t,f) = \delta(t)\,\delta(f)$. Here, $\tilde R_x^{(T)}(t,f;\Delta t,\Delta f)$ equals the Wigner-Ville spectrum $\bar W_x^{(0)}(t,f)$ up to a unimodular phase factor, so that
$$\big|\tilde R_x^{(L^{(0)})}(t,f;\Delta t,\Delta f)\big| = \big|\bar W_x^{(0)}(t,f)\big|.$$
It is seen that in this case, the magnitude of $\tilde R_x^{(L^{(0)})}(t,f;\Delta t,\Delta f)$ is independent of the time separation $\Delta t$ and the frequency separation $\Delta f$.
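The GEAF definition (3.3) can be illustrated with a small discrete-time sketch. This is not the thesis' implementation: it assumes a finite $N \times N$ correlation matrix, cyclic (modulo-$N$) lags, and the particular choice $\alpha = 1/2$, for which $r_x^{(1/2)}(t,\tau) = r_x(t, t - \tau)$ needs no half-sample shifts. The function name `expected_ambiguity` is hypothetical.

```python
import numpy as np


def expected_ambiguity(R):
    """Discrete, cyclic sketch of the GEAF (3.3) for alpha = 1/2.

    R : (N, N) correlation matrix, R[n1, n2] = E{x[n1] conj(x[n2])}.
    Returns A with A[m, k] = sum_n R[n, (n - m) mod N] exp(-j 2 pi k n / N),
    where m is the time-lag index and k the frequency-lag index.
    """
    R = np.asarray(R)
    N = R.shape[0]
    n = np.arange(N)
    # lagged[n, m] = R[n, (n - m) mod N], i.e. r_x(t, t - tau) on a cyclic grid.
    lagged = R[n[:, None], (n[:, None] - n[None, :]) % N]
    # Fourier transform over absolute time n gives the frequency-lag axis k.
    return np.fft.fft(lagged, axis=0).T


# Stationary (circulant) example: only temporal correlations are present.
N = 8
r = 0.5 ** np.minimum(np.arange(N), N - np.arange(N))
R_stat = r[(np.arange(N)[:, None] - np.arange(N)[None, :]) % N]
A_stat = expected_ambiguity(R_stat)   # nonzero only in the column k = 0
A_white = expected_ambiguity(np.eye(N))  # single nonzero value at (0, 0)
```

In this sketch a stationary process gives a GEAF confined to zero frequency lag, and a stationary white process gives a single nonzero value at the origin, in line with the discussion of temporal and spectral correlations in Subsection 3.1.1.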
We close this section on the GEAF by considering the special case of a Gaussian random process. This sheds additional light on the interpretation of the GEAF as a TF correlation. We consider the covariance function of the generalized Wigner distribution (GWD) $W_x^{(\alpha)}(t,f)$ of $x(t)$ (see Section B.2.2), defined as
$$C_x^{(\alpha)}(t_1,f_1;t_2,f_2) \triangleq \mathrm{cov}\big\{W_x^{(\alpha)}(t_1,f_1),\, W_x^{(\alpha)}(t_2,f_2)\big\}$$
$$= E\Big\{\big[W_x^{(\alpha)}(t_1,f_1) - \bar W_x^{(\alpha)}(t_1,f_1)\big] \big[W_x^{(\alpha)}(t_2,f_2) - \bar W_x^{(\alpha)}(t_2,f_2)\big]^*\Big\}$$
$$= E\big\{W_x^{(\alpha)}(t_1,f_1)\, W_x^{(\alpha)*}(t_2,f_2)\big\} - \bar W_x^{(\alpha)}(t_1,f_1)\, \bar W_x^{(\alpha)*}(t_2,f_2). \tag{3.6}$$
In the following, we use a coordinate-transformed version of $C_x^{(\alpha)}(t_1,f_1;t_2,f_2)$ defined as
$$\tilde C_x^{(\alpha)}(t,f;\tau,\nu) = C_x^{(\alpha)}(t + \alpha_-\tau,\, f + \alpha_+\nu;\; t - \alpha_+\tau,\, f - \alpha_-\nu), \tag{3.7}$$
with $\alpha_+ = 1/2 + \alpha$ and $\alpha_- = 1/2 - \alpha$.

$^3$ Note that these two cases do not satisfy our assumption of $T$ having normalized trace.
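Theorem 3.1. For a circular complex Gaussian random process $x(t)$,
$$\iint \tilde C_x^{(\alpha)}(t,f;\tau,\nu)\, dt\, df = \big|\bar A_x^{(\alpha)}(\tau,\nu)\big|^2. \tag{3.8}$$

Proof. We start by developing the correlation function of the generalized Wigner distribution using Isserlis' formula [177] for circular complex Gaussian random processes,
$$E\{x(t_1)\, x^*(t_2)\, x(t_3)\, x^*(t_4)\} = r_x(t_1,t_2)\, r_x^*(t_4,t_3) + r_x(t_1,t_4)\, r_x^*(t_2,t_3).$$
Using this in (B.39) yields
$$E\big\{W_x^{(\alpha)}(t_1,f_1)\, W_x^{(\alpha)*}(t_2,f_2)\big\}$$
$$= E\Big\{\int_{\tau_1} x(t_1 + \alpha_-\tau_1)\, x^*(t_1 - \alpha_+\tau_1)\, e^{-j2\pi f_1\tau_1}\, d\tau_1 \int_{\tau_2} x^*(t_2 + \alpha_-\tau_2)\, x(t_2 - \alpha_+\tau_2)\, e^{j2\pi f_2\tau_2}\, d\tau_2\Big\}$$
$$= \int_{\tau_1}\!\int_{\tau_2} E\big\{x(t_1 + \alpha_-\tau_1)\, x^*(t_1 - \alpha_+\tau_1)\, x(t_2 - \alpha_+\tau_2)\, x^*(t_2 + \alpha_-\tau_2)\big\}\, e^{-j2\pi(f_1\tau_1 - f_2\tau_2)}\, d\tau_1\, d\tau_2 \tag{3.9}$$
$$= \int_{\tau_1}\!\int_{\tau_2} \big[r_x(t_1 + \alpha_-\tau_1,\, t_1 - \alpha_+\tau_1)\, r_x^*(t_2 + \alpha_-\tau_2,\, t_2 - \alpha_+\tau_2) + r_x(t_1 + \alpha_-\tau_1,\, t_2 + \alpha_-\tau_2)\, r_x^*(t_1 - \alpha_+\tau_1,\, t_2 - \alpha_+\tau_2)\big]\, e^{-j2\pi(f_1\tau_1 - f_2\tau_2)}\, d\tau_1\, d\tau_2$$
$$= \bar W_x^{(\alpha)}(t_1,f_1)\, \bar W_x^{(\alpha)*}(t_2,f_2) + \int_{\tau_1}\!\int_{\tau_2} r_x(t_1 + \alpha_-\tau_1,\, t_2 + \alpha_-\tau_2)\, r_x^*(t_1 - \alpha_+\tau_1,\, t_2 - \alpha_+\tau_2)\, e^{-j2\pi(f_1\tau_1 - f_2\tau_2)}\, d\tau_1\, d\tau_2$$
$$= \bar W_x^{(\alpha)}(t_1,f_1)\, \bar W_x^{(\alpha)*}(t_2,f_2) + W_{R_x}^{(\alpha)}(t_1,f_1;t_2,f_2), \tag{3.10}$$
where $W_{R_x}^{(\alpha)}(t_1,f_1;t_2,f_2)$ is the generalized transfer Wigner distribution of $R_x$ as defined in (B.32). Inserting (3.10) into (3.6) shows that $C_x^{(\alpha)}(t_1,f_1;t_2,f_2) = W_{R_x}^{(\alpha)}(t_1,f_1;t_2,f_2)$ and hence
$$\tilde C_x^{(\alpha)}(t,f;\tau,\nu) = C_x^{(\alpha)}(t + \alpha_-\tau,\, f + \alpha_+\nu;\; t - \alpha_+\tau,\, f - \alpha_-\nu) = W_{R_x}^{(\alpha)}(t + \alpha_-\tau,\, f + \alpha_+\nu;\; t - \alpha_+\tau,\, f - \alpha_-\nu) = \tilde W_{R_x}^{(\alpha)}(t,f;\tau,\nu),$$
with $\tilde W_H^{(\alpha)}(t,f;\tau,\nu)$ as defined in (B.33). The final result (3.8) then follows from (B.35) (with $H$ replaced by $R_x$), i.e.,
$$\iint \tilde C_x^{(\alpha)}(t,f;\tau,\nu)\, dt\, df = \iint \tilde W_{R_x}^{(\alpha)}(t,f;\tau,\nu)\, dt\, df = \big|S_{R_x}^{(\alpha)}(\tau,\nu)\big|^2,$$
by noting that $S_{R_x}^{(\alpha)}(\tau,\nu) = \bar A_x^{(\alpha)}(\tau,\nu)$.

The foregoing theorem confirms that for Gaussian processes the squared magnitude of the GEAF at $(\tau,\nu)$ can be interpreted as the integrated covariance of all pairs of GWD values separated by $\tau$ in time and by $\nu$ in frequency.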
3.1.4 Extended Concept of Underspread Processes

In many situations, a simple global characterization of TF correlations by just a few parameters is desired. Such a characterization will be provided next. The basis of our discussion is the fact that the GEAF of a random process $x(t)$ equals the GSF of the correlation operator $R_x$ of $x(t)$, i.e., $\bar A_x^{(\alpha)}(\tau,\nu) = S_{R_x}^{(\alpha)}(\tau,\nu)$. Hence, all of the considerations in Sections 2.1 and 2.2 regarding the global characterization of TF displacements of linear systems (as described in detail by the GSF) carry over to the global characterization of TF correlations of random processes (as described in detail by the GEAF). In the following, we shall thus rephrase (with minor modifications) the main ideas of Sections 2.1 and 2.2 in the context of TF correlation analysis of random processes. For a better understanding, we note that the unitary transformation $U R_x U^+$ of a correlation operator $R_x$ corresponds to the transformation $(Ux)(t)$ of the underlying random process, i.e.,
$$\tilde x(t) = (Ux)(t) \quad \Longleftrightarrow \quad R_{\tilde x} = U R_x U^+.$$
For metaplectic operators $U \in \mathcal{M}$ this corresponds to a symplectic coordinate transformation of the GEAF (see Appendix C).

Correlation-Limited Processes. In [118, 120, 126], the support of the GEAF was used to define a global measure of TF correlations that is similar to the global characterization of TF shifts discussed in Section 2.1. Let us consider a random process $x(t)$ with GEAF supported within a region $G_x$, i.e.,
$$\bar A_x^{(\alpha)}(\tau,\nu) = \bar A_x^{(\alpha)}(\tau,\nu)\, I_{G_x}(\tau,\nu), \tag{3.11}$$
where $I_{G_x}(\tau,\nu)$ is the indicator function of $G_x$ and $G_x$ is the minimal region such that (3.11) holds. Any such process will be termed correlation limited (CL). The temporal correlation width $\tau_x^{(\max)}$ and the spectral correlation width $\nu_x^{(\max)}$ of a CL process are defined as
$$\tau_x^{(\max)} \triangleq \max_{(\tau,\nu) \in G_x} |\tau|, \qquad \nu_x^{(\max)} \triangleq \max_{(\tau,\nu) \in G_x} |\nu|.$$
Assuming in addition that the area of the region $G_x$ is sufficiently small leads us to the definition of CL underspread processes:
Definition 3.2. Let $x(t)$ be a CL process with temporal correlation width $\tau_x^{(\max)}$ and spectral correlation width $\nu_x^{(\max)}$. Then,
$$\sigma_x \triangleq 4\, \tau_x^{(\max)}\, \nu_x^{(\max)}$$
is called the (strict-sense) correlation spread of the CL process $x(t)$, and $x(t)$ is called strict-sense CL underspread if
$$\sigma_x \ll 1. \tag{3.12}$$
If $x(t)$ is not strict-sense CL underspread but there exists a metaplectic operator $U \in \mathcal{M}$ such that the correlation spread of $(Ux)(t)$ satisfies $\sigma_{Ux} \ll 1$, then $x(t)$ is called wide-sense CL underspread. The quantity
$$\sigma_x^{(\min)} \triangleq \min_{U \in \mathcal{M}} \sigma_{Ux}$$
is referred to as wide-sense correlation spread of $x(t)$.

The CL underspread definition (3.12) was introduced in [117, 118, 120, 126] and further used in [92, 115, 116, 122-124, 129, 141]. We note that the wide-sense correlation spread $\sigma_x^{(\min)}$ measures the area of a rectangle circumscribed about the region $G_x$. An alternative measure of the global TF correlations of a process is given by
$$\tilde\sigma_x \triangleq \max_{(\tau,\nu) \in G_x} |\tau\nu|.$$
It can be used as basis for the alternative CL underspread condition
$$\tilde\sigma_x \ll 1, \tag{3.13}$$
which corresponds to a hyperbolic support constraint (cf. Fig. 2.1 for an illustration of rectangular and hyperbolic support constraints). Since
$$\tilde\sigma_x = \max_{(\tau,\nu) \in G_x} |\tau\nu| \le \max_{(\tau,\nu) \in G_x} |\tau| \max_{(\tau,\nu) \in G_x} |\nu| = \tau_x^{(\max)}\, \nu_x^{(\max)} = \sigma_x / 4,$$
the condition (3.13) is less restrictive than (3.12). Note that $\bar A_x^{(\alpha)}(\tau,\nu) = S_{R_x}^{(\alpha)}(\tau,\nu)$ implies that $\sigma_x = \sigma_{R_x}$ and $\tilde\sigma_x = \tilde\sigma_{R_x}$, where $\sigma_H$ and $\tilde\sigma_H$ were defined in Subsection 2.1.2.

Processes with Rapidly Decaying Expected Ambiguity Function. Processes with GEAF having exactly limited support seem unrealistic in practical applications. Using an effective GEAF support instead is problematic since there is no unique choice for the effective support and furthermore the modelling errors incurred by a specific choice are difficult to judge. Hence, similar to Section 2.2, we propose to use weighted integrals of the GEAF as alternative global measures of TF correlation, i.e. (cf. (2.12-a), (2.12-b)),
$$m_x^{(\phi)} \triangleq \frac{\iint \phi(\tau,\nu)\, |\bar A_x(\tau,\nu)|\, d\tau\, d\nu}{\iint |\bar A_x(\tau,\nu)|\, d\tau\, d\nu} = \frac{1}{\|\bar A_x\|_1} \iint \phi(\tau,\nu)\, |\bar A_x(\tau,\nu)|\, d\tau\, d\nu, \tag{3.14a}$$
$$M_x^{(\phi)} \triangleq \left[\frac{\iint \phi^2(\tau,\nu)\, |\bar A_x(\tau,\nu)|^2\, d\tau\, d\nu}{\iint |\bar A_x(\tau,\nu)|^2\, d\tau\, d\nu}\right]^{1/2} = \frac{1}{\|\bar A_x\|_2} \left[\iint \phi^2(\tau,\nu)\, |\bar A_x(\tau,\nu)|^2\, d\tau\, d\nu\right]^{1/2}, \tag{3.14b}$$
where $\phi(\tau,\nu)$ is a nonnegative weighting function as described in Subsection 2.2.2. A special case of these weighted integrals are the GEAF moments obtained with weighting function $\phi_{k,l}(\tau,\nu) = |\tau|^k |\nu|^l$ (cf. (2.13-a), (2.13-b) and Fig. 2.2),
$$m_x^{(k,l)} \triangleq m_x^{(\phi_{k,l})}, \qquad M_x^{(k,l)} \triangleq M_x^{(\phi_{k,l})}. \tag{3.15}$$
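The support-based spreads of Definition 3.2 and (3.13) and the moments in (3.14), (3.15) can be approximated numerically from a sampled GEAF magnitude. The sketch below is an illustration under assumptions not made in the text: the GEAF is given on a uniform grid of lags `tau` and `nu` (so the grid cell area cancels in the ratios), and the support is taken to be where the magnitude exceeds a hypothetical threshold `tol`. The function names are likewise hypothetical.

```python
import numpy as np


def correlation_spreads(A, tau, nu, tol=1e-12):
    """Strict-sense spread sigma = 4 tau_max nu_max and hyperbolic spread max|tau nu|.

    A   : 2-D array, A[i, j] = GEAF sample at (tau[i], nu[j])
    tol : magnitude threshold used to decide the support (an assumption).
    """
    tau, nu = np.asarray(tau, float), np.asarray(nu, float)
    support = np.abs(A) > tol
    T, V = np.meshgrid(tau, nu, indexing="ij")
    tau_max = np.abs(T[support]).max()
    nu_max = np.abs(V[support]).max()
    sigma = 4.0 * tau_max * nu_max
    sigma_hyp = np.abs(T[support] * V[support]).max()
    return sigma, sigma_hyp


def geaf_moments(A, tau, nu, k, l):
    """GEAF moments m^(k,l) and M^(k,l) with weight |tau|^k |nu|^l, cf. (3.14)-(3.15)."""
    tau, nu = np.asarray(tau, float), np.asarray(nu, float)
    absA = np.abs(A)
    weight = np.abs(tau)[:, None] ** k * np.abs(nu)[None, :] ** l
    m = np.sum(weight * absA) / np.sum(absA)
    M = np.sqrt(np.sum(weight**2 * absA**2) / np.sum(absA**2))
    return m, M
```

For any support, the hyperbolic spread returned by `correlation_spreads` is at most one quarter of the strict-sense spread, mirroring the inequality stated after (3.13).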
Due to $\bar A_x^{(\alpha)}(\tau,\nu) = S_{R_x}^{(\alpha)}(\tau,\nu)$, the weighted integrals and moments of the GEAF of $x(t)$ equal the corresponding weighted integrals and moments of the GSF of $R_x$, i.e.,
$$m_x^{(\phi)} = m_{R_x}^{(\phi)}, \qquad M_x^{(\phi)} = M_{R_x}^{(\phi)}, \tag{3.16}$$
$$m_x^{(k,l)} = m_{R_x}^{(k,l)}, \qquad M_x^{(k,l)} = M_{R_x}^{(k,l)}. \tag{3.17}$$
Since the weighted GEAF integrals $m_x^{(\phi)}$ and $M_x^{(\phi)}$ and the GEAF moments $m_x^{(k,l)}$ and $M_x^{(k,l)}$ measure the spread of $\bar A_x(\tau,\nu)$ about the origin of the $(\tau,\nu)$ plane, we can consider a nonstationary random process $x(t)$ to be underspread if suitable GEAF integrals/moments are small, without being forced to assume that the GEAF has compact support. We note that while this is not a clear-cut definition of underspread processes, this underspread concept is less restrictive and provides us with more flexibility than an underspread definition based on compact GEAF support. Some of the conditions (satisfied by underspread processes) that will be important in subsequent sections are the following:
$$m_x^{(1,1)} \ll 1, \qquad M_x^{(1,1)} \ll 1,$$
$$m_x^{(1,0)}\, m_x^{(0,1)} \ll 1, \qquad M_x^{(1,0)}\, M_x^{(0,1)} \ll 1, \tag{3.18}$$
$$m_x^{(\phi_s)} \ll 1, \qquad M_x^{(\phi_s)} \ll 1,$$
where in the last line either $\phi_s(\tau,\nu) = |1 - A_s^{(\alpha)}(\tau,\nu)|$ or $\phi_s(\tau,\nu) = \big|1 - 1/A_s^{(\alpha)}(\tau,\nu)\big|$ with some normalized function $s(t)$. Examples for underspread processes satisfying these constraints are illustrated in Fig. 3.2.

Note that this extended concept of underspread processes is not equivalent to that of quasistationary processes (which require the GEAF to be concentrated along the $\tau$ axis, see part (d) of Fig. 3.2). In particular, a quasistationary process may be overspread (i.e., not underspread) if its effective temporal correlation width is large, while a highly nonstationary process (i.e., a process with large effective spectral correlation width) may be underspread if its effective temporal correlation is short enough. Since oblique orientations of the GEAF are not allowed by the conditions in the first two lines of (3.18), in such situations the less restrictive conditions
$$\min_{U \in \mathcal{M}} m_{Ux}^{(1,0)}\, m_{Ux}^{(0,1)} \ll 1, \qquad \min_{U \in \mathcal{M}} M_{Ux}^{(1,0)}\, M_{Ux}^{(0,1)} \ll 1$$
can be used, with $\mathcal{M}$ denoting the set of metaplectic transforms. Here, an appropriate $U \in \mathcal{M}$ produces a coordinate-transformed GEAF such that the oblique orientation of $|\bar A_x(\tau,\nu)|$ is converted into an orientation of $|\bar A_{Ux}(\tau,\nu)|$ along the $\tau$ axis and/or $\nu$ axis (see Appendix C).

While the interpretation of the weighted GEAF integrals and GEAF moments in terms of correlations is different from that of the weighted GSF integrals and GSF moments defined in Subsection 2.2.2, the discussion in Subsections 2.2.2 through 2.2.6 concerning the mathematical properties of the weighted GSF integrals and GSF moments straightforwardly carries over to the weighted GEAF integrals and GEAF moments and thus need not be repeated here. Only the case of sums of random processes deserves some special attention.
Figure 3.2: Schematic representation of the GEAF magnitude of (a) an underspread process with small $m_x^{(1,1)}$ and $M_x^{(1,1)}$; (b) an underspread process with small $m_x^{(1,0)} m_x^{(0,1)}$ and $M_x^{(1,0)} M_x^{(0,1)}$; (c) an underspread process with small $m_{Ux}^{(1,0)} m_{Ux}^{(0,1)}$ with $U$ corresponding to a rotation; (d) a quasistationary process (small $m_x^{(0,l)}$ and $M_x^{(0,l)}$); (e) a quasiwhite process (small $m_x^{(k,0)}$ and $M_x^{(k,0)}$).
Sums of Random Processes. The difficulty with the sum $x(t) = x_1(t) + x_2(t)$ of two random processes $x_1(t)$ and $x_2(t)$ stems from the fact that the correlation operator $R_x$ does not equal the sum of the individual correlation operators $R_{x_1}$ and $R_{x_2}$, unless $x_1(t)$ and $x_2(t)$ are uncorrelated. In general, one has
$$R_x = R_{x_1} + R_{x_2} + R_{x_1,x_2} + R_{x_2,x_1} = R_{x_1} + R_{x_2} + 2 R^H_{x_1,x_2},$$
where $R_{x_1,x_2}$ denotes the cross-correlation operator associated to the cross-correlation function $E\{x_1(t)\, x_2^*(t')\}$ and $R^H_{x_1,x_2} \triangleq (R_{x_1,x_2} + R^+_{x_1,x_2})/2$ is the hermitian (self-adjoint) part of $R_{x_1,x_2}$. Since
$$\bar A_x^{(\alpha)}(\tau,\nu) = S_{R_x}^{(\alpha)}(\tau,\nu) = S_{R_{x_1}}^{(\alpha)}(\tau,\nu) + S_{R_{x_2}}^{(\alpha)}(\tau,\nu) + 2\, S_{R^H_{x_1,x_2}}^{(\alpha)}(\tau,\nu) = \bar A_{x_1}^{(\alpha)}(\tau,\nu) + \bar A_{x_2}^{(\alpha)}(\tau,\nu) + 2\, \mathcal{H}\big\{\bar A_{x_1,x_2}^{(\alpha)}(\tau,\nu)\big\},$$
with $\mathcal{H}\{\bar A_{x_1,x_2}^{(\alpha)}(\tau,\nu)\} = \big[\bar A_{x_1,x_2}^{(\alpha)}(\tau,\nu) + \bar A_{x_1,x_2}^{(-\alpha)*}(-\tau,-\nu)\big]/2$ denoting the (hermitian) even part of $\bar A_{x_1,x_2}^{(\alpha)}(\tau,\nu)$, it follows that the magnitude of $\bar A_x^{(\alpha)}(\tau,\nu)$ is bounded as
$$\big|\bar A_x^{(\alpha)}(\tau,\nu)\big| \le \big|\bar A_{x_1}^{(\alpha)}(\tau,\nu)\big| + \big|\bar A_{x_2}^{(\alpha)}(\tau,\nu)\big| + 2\, \big|\mathcal{H}\big\{\bar A_{x_1,x_2}^{(\alpha)}(\tau,\nu)\big\}\big|.$$
Thus, it is seen that in order that the GEAF of $x(t)$ is concentrated about the origin, not only the individual GEAFs of $x_1(t)$ and $x_2(t)$ but also the cross GEAF $\bar A_{x_1,x_2}^{(\alpha)}(\tau,\nu)$ has to be concentrated about the origin. This observation is the basis for our concept of jointly underspread processes:

Definition 3.3. Two processes $x_1(t)$ and $x_2(t)$ are referred to as being jointly underspread if they both are individually underspread and furthermore the cross-correlation operator $R_{x_1,x_2}$ is underspread.

Note that two uncorrelated underspread processes are thus jointly underspread.
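The decomposition of the correlation operator of a sum can be checked in a finite-dimensional setting. The sketch below is an illustration under assumptions of its own: both processes are modelled as $x_i = B_i n$, driven by the same normalized white vector $n$ with $E\{n n^+\} = I$, so that all correlation matrices are available in closed form. The matrices `B1`, `B2` and the function name `sum_correlation_terms` are hypothetical.

```python
import numpy as np


def sum_correlation_terms(B1, B2):
    """Correlation matrices of x1 = B1 n, x2 = B2 n and x = x1 + x2, with E{n n^H} = I.

    Returns (Rx, R1, R2, RH) where RH is the hermitian part of the
    cross-correlation matrix R12 = E{x1 x2^H} = B1 B2^H.
    """
    R1 = B1 @ B1.conj().T
    R2 = B2 @ B2.conj().T
    R12 = B1 @ B2.conj().T
    RH = 0.5 * (R12 + R12.conj().T)
    Rx = (B1 + B2) @ (B1 + B2).conj().T
    return Rx, R1, R2, RH


rng = np.random.default_rng(0)
B1 = rng.standard_normal((6, 6)) + 1j * rng.standard_normal((6, 6))
B2 = rng.standard_normal((6, 6)) + 1j * rng.standard_normal((6, 6))
Rx, R1, R2, RH = sum_correlation_terms(B1, B2)
# Rx agrees with R1 + R2 + 2 RH; it equals R1 + R2 only if RH vanishes.
```

In this model the cross term vanishes only when `B1 @ B2.conj().T` has zero hermitian part, which is the finite-dimensional counterpart of the uncorrelatedness condition mentioned above.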
3.1.5 Innovations System

An alternative characterization of the TF correlations of a process is in terms of innovations systems. The innovations system representation of a random process $x(t)$ is given by [38, 148, 152]
$$x(t) = (Hn)(t) = \int_{t'} h(t,t')\, n(t')\, dt', \tag{3.19}$$
where $n(t)$ denotes normalized stationary white noise with correlation operator $I$. An innovations system $H$ is obtained as a solution of the equation $HH^+ = R_x$. This solution is not unique since, given a valid innovations system $H$ and a unitary system $U$ such that $UU^+ = I$, also the system $\tilde H = HU$ satisfies the defining equation, $\tilde H \tilde H^+ = H U U^+ H^+ = HH^+ = R_x$, and thus also $\tilde H$ is a valid innovations system. Subsequently, the set of all innovations systems for a given random process $x(t)$ will be denoted by $\mathcal{I}_x$, i.e., $\mathcal{I}_x \triangleq \{H : HH^+ = R_x\}$.

The relation $R_x = HH^+$ implies that $\bar A_x^{(\alpha)}(\tau,\nu) = S_{R_x}^{(\alpha)}(\tau,\nu) = S_{HH^+}^{(\alpha)}(\tau,\nu)$ and thus by virtue of (B.13)
$$|\bar A_x(\tau,\nu)| \le |S_H(\tau,\nu)| ** |S_H(-\tau,-\nu)|, \tag{3.20}$$
where $**$ denotes 2-D convolution. Hence, if the innovations system $H$ is underspread (cf. Chapter 2) and thus $|S_H(\tau,\nu)|$ is concentrated about the origin, also $|\bar A_x(\tau,\nu)|$ will be concentrated about the origin and thus the process $x(t)$ will be underspread too. Conversely, if $x(t)$ is an underspread process, one can always find an underspread innovations system $H$. In particular, the innovations system $H = \sqrt{R_x}$ is maximally underspread since the positive square root has minimal displacement effects among all innovations systems $H \in \mathcal{I}_x$ [86, 90, 148]. We furthermore note that (3.20) also implies that the innovations system of an overspread process is necessarily overspread too.

From these relations, it follows that for characterizing the TF correlations of a random process $x(t)$, the weighted integrals $m_H^{(\phi)}$, $M_H^{(\phi)}$ and moments $m_H^{(k,l)}$, $M_H^{(k,l)}$ of the GSF of the innovations system $H = \sqrt{R_x}$ provide an alternative to the weighted integrals $m_x^{(\phi)}$, $M_x^{(\phi)}$ and moments $m_x^{(k,l)}$, $M_x^{(k,l)}$ of the GEAF of $x(t)$. Proposition 2.7 (with $H_2 = H$ and $H_1 = H^+$) establishes a relation between the normalized moments $\tilde m_x^{(k,l)} = m_x^{(k,l)}/(k!\, l!)$ and $\tilde m_H^{(k,l)} = m_H^{(k,l)}/(k!\, l!)$,
$$\tilde m_x^{(k,l)} \le \frac{\|S_H\|_1^2}{\|\bar A_x\|_1} \sum_{i=0}^{k} \sum_{j=0}^{l} \tilde m_H^{(i,j)}\, \tilde m_H^{(k-i,l-j)}.$$
Important special cases of this relation are (cf. (2.33)-(2.35))
$$m_x^{(1,0)} \le 2\, \frac{\|S_H\|_1^2}{\|\bar A_x\|_1}\, m_H^{(1,0)}, \qquad m_x^{(0,1)} \le 2\, \frac{\|S_H\|_1^2}{\|\bar A_x\|_1}\, m_H^{(0,1)}, \qquad m_x^{(1,1)} \le 2\, \frac{\|S_H\|_1^2}{\|\bar A_x\|_1} \Big[m_H^{(1,1)} + m_H^{(0,1)}\, m_H^{(1,0)}\Big].$$
Finally, (3.20) implies that a displacement-limited (DL) innovations system $H$ with GSF support region $G_H = [-\tau_H^{(\max)}, \tau_H^{(\max)}] \times [-\nu_H^{(\max)}, \nu_H^{(\max)}]$ generates a CL process $x(t)$ with GEAF support region $G_x = [-\tau_x^{(\max)}, \tau_x^{(\max)}] \times [-\nu_x^{(\max)}, \nu_x^{(\max)}]$, where $\tau_x^{(\max)} = 2\tau_H^{(\max)}$ and $\nu_x^{(\max)} = 2\nu_H^{(\max)}$.
An Example
We next present two example processes that illustrate the dierence between underspread and overspread processes. The processes will be specied by their Karhunen-Love (KL) expansions. e N signals that are well TF localized in two TF regions R1 and R2 , respectively. The regions R1 and Underspread Process. Let {vk (t)}k=1...N and {vk (t)}k=1...N denote two orthonormal sets of
(1) (2)
R2 are separated in time by t and in frequency by f (see Fig. 3.3(a)). We assume that the signals
108

f t t f t (a) (b) (c)
R2
f R1 t
underlying the processes KL expansion, (b) GEAF magnitude of underspread process, (c) GEAF magnitude of overspread process. In (b) and (c), the number of signal samples is 256 and the (normalized) frequency lag ranges from 1/2 to 1/2. in these sets are mutually orthogonal and TF disjoint, vk , vl
(1) (2)
Figure 3.3: Illustration of underspread and overspread processes: (a) sketch of TF regions R1 and R1
= 0,
Wv(1) (t, f ) Wv(2) (t, f ) 0 .

k l
Let us rst consider a random processes x(t) given by

N
x(t) =
k=1 (i)
(1) (1) k vk (t)
+
k=1
k vk (t) ,
(2)
(2)
(3.21)
where the k (k = 1 . . . N , i = 1, 2) are mutually orthogonal, zero-mean random coecients with mean power E k
(i) 2
= k . Since this expansion involves orthogonal basis functions and orthogonal
(i)
coecients, it constitutes the KL expansion of x(t). The corresponding correlation operator and expected ambiguity function are given by
N
Rx =
k=1 N
(1) (1) k vk
(1) vk
+
k=1 N k=1
k vk vk
(2) ()
(2)
(2)
(2)
(2)
(3.22)
A() (, ) = x
k=1
k A
(1)
() vk
(1)
(, ) +
k A
vk
(, ) .
Since the KL expansion of x(t) consists of uncorrelated, well TF localized basis functions, only small TF correlations are present and the GEAF of x(t) is concentrated about the origin (at least for N not too small). Thus, the process x(t) is underspread. The GEAF of an example process with N = 30 is shown in Fig. 3.3(b). It is indeed highly concentrated about the origin which is further corroborated by the correspondingly small moments mx
(1,0)
mx
(0,1)
= 0.035, mx
(1,1)
= 0.249, Mx
(1,0)
Mx
(0,1)
= 0.008, Mx
(1,1)
= 0.012.
Overspread Process. In contrast to the process $x(t)$ discussed previously, consider the process
$$\tilde{x}(t) = \sum_{k=1}^{N} \tilde{\xi}_k^{(1)} v_k^{(1)}(t) + \sum_{k=1}^{N} \tilde{\xi}_k^{(2)} v_k^{(2)}(t) , \tag{3.23}$$
where the power of each component is the same as for the process $x(t)$, i.e., $E\{|\tilde{\xi}_k^{(i)}|^2\} = \lambda_k^{(i)}$, but now the coefficients are correlated in the sense that $E\{\tilde{\xi}_k^{(1)} \tilde{\xi}_l^{(2)*}\} = \mu_k\, \delta_{k,l}$ with $\mu_k > 0$ (note that for the underspread process above, $\mu_k = 0$). This process features significant TF correlations since the coefficients of the components $v_k^{(1)}(t)$ and $v_k^{(2)}(t)$, which are located in the TF regions $R_1$ and $R_2$, respectively, are correlated. Since the $\tilde{\xi}_k^{(i)}$ are not mutually orthogonal, (3.23) is not the KL expansion of $\tilde{x}(t)$, which actually is given by
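$$\tilde{x}(t) = \sum_{k=1}^{N} \zeta_k^{(1)} \tilde{v}_k^{(1)}(t) + \sum_{k=1}^{N} \zeta_k^{(2)} \tilde{v}_k^{(2)}(t) , \tag{3.24}$$
where
$$\tilde{v}_k^{(1)}(t) = \cos(\theta_k)\, v_k^{(1)}(t) + \sin(\theta_k)\, v_k^{(2)}(t) , \qquad \tilde{v}_k^{(2)}(t) = -\sin(\theta_k)\, v_k^{(1)}(t) + \cos(\theta_k)\, v_k^{(2)}(t) , \tag{3.25a}$$
$$\zeta_k^{(1)} = \cos(\theta_k)\, \tilde{\xi}_k^{(1)} + \sin(\theta_k)\, \tilde{\xi}_k^{(2)} , \qquad \zeta_k^{(2)} = -\sin(\theta_k)\, \tilde{\xi}_k^{(1)} + \cos(\theta_k)\, \tilde{\xi}_k^{(2)} , \tag{3.25b}$$
with $\theta_k = \frac{1}{2} \arctan\!\Big( \frac{2\mu_k}{\lambda_k^{(1)} - \lambda_k^{(2)}} \Big)$. The coefficients $\zeta_k^{(i)}$ are mutually orthogonal with respective powers
$$\tilde{\lambda}_k^{(1)} = E\{|\zeta_k^{(1)}|^2\} = \frac{\lambda_k^{(1)} + \lambda_k^{(2)}}{2} + \frac{1}{2} \sqrt{\big(\lambda_k^{(1)} - \lambda_k^{(2)}\big)^2 + 4|\mu_k|^2} ,$$
$$\tilde{\lambda}_k^{(2)} = E\{|\zeta_k^{(2)}|^2\} = \frac{\lambda_k^{(1)} + \lambda_k^{(2)}}{2} - \frac{1}{2} \sqrt{\big(\lambda_k^{(1)} - \lambda_k^{(2)}\big)^2 + 4|\mu_k|^2} .$$
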
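The rotation (3.25) can be checked numerically for a single index $k$. The following is a minimal sketch, assuming a real-valued correlation parameter; the function names `kl_rotation` and `rotated_covariance` are illustrative and not part of the original text. It uses `atan2` instead of `arctan` so that the first rotated power is always the larger one, which is an implementation choice made here.

```python
import math

def kl_rotation(lam1, lam2, mu):
    """Angle and powers that diagonalize the coefficient covariance
    [[lam1, mu], [mu, lam2]] (real-valued mu assumed)."""
    # atan2 picks the branch for which the first power is the larger one.
    theta = 0.5 * math.atan2(2.0 * mu, lam1 - lam2)
    root = math.sqrt((lam1 - lam2) ** 2 + 4.0 * mu ** 2)
    power1 = 0.5 * (lam1 + lam2) + 0.5 * root
    power2 = 0.5 * (lam1 + lam2) - 0.5 * root
    return theta, power1, power2

def rotated_covariance(lam1, lam2, mu, theta):
    """Powers and cross-correlation of the rotated coefficients
    (cos*xi1 + sin*xi2) and (-sin*xi1 + cos*xi2)."""
    c, s = math.cos(theta), math.sin(theta)
    p1 = lam1 * c * c + 2.0 * mu * c * s + lam2 * s * s
    p2 = lam1 * s * s - 2.0 * mu * c * s + lam2 * c * c
    cross = (lam2 - lam1) * c * s + mu * (c * c - s * s)
    return p1, p2, cross

theta, power1, power2 = kl_rotation(2.0, 1.0, 0.5)
p1, p2, cross = rotated_covariance(2.0, 1.0, 0.5, theta)
print(theta, power1, power2, cross)
```

The cross-correlation of the rotated coefficients vanishes (up to rounding) and the two powers agree with the closed-form expressions, while their sum stays equal to $\lambda_k^{(1)} + \lambda_k^{(2)}$.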
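While the KL expansion (3.24) is doubly orthogonal, the underlying basis functions $\tilde{v}_k^{(i)}(t)$ are no longer well TF localized since they are supported in both $R_1$ and $R_2$. Hence, in the KL representation (3.24) the TF correlations of the process $\tilde{x}(t)$ are hidden in the TF structure of the basis functions. In subsequent sections, it will be seen that these TF correlations lead to the occurrence of statistical cross terms in time-varying spectra. The correlation operator and GEAF of $\tilde{x}(t)$ can be written as
$$R_{\tilde{x}} = \sum_{k=1}^{N} \tilde{\lambda}_k^{(1)}\, \tilde{v}_k^{(1)} \otimes \tilde{v}_k^{(1)*} + \sum_{k=1}^{N} \tilde{\lambda}_k^{(2)}\, \tilde{v}_k^{(2)} \otimes \tilde{v}_k^{(2)*} \tag{3.26}$$
$$= \sum_{k=1}^{N} \lambda_k^{(1)}\, v_k^{(1)} \otimes v_k^{(1)*} + \sum_{k=1}^{N} \lambda_k^{(2)}\, v_k^{(2)} \otimes v_k^{(2)*} + \sum_{k=1}^{N} \mu_k\, v_k^{(1)} \otimes v_k^{(2)*} + \sum_{k=1}^{N} \mu_k\, v_k^{(2)} \otimes v_k^{(1)*} , \tag{3.27}$$
$$\bar{A}_{\tilde{x}}^{(\alpha)}(\tau,\nu) = \sum_{k=1}^{N} \tilde{\lambda}_k^{(1)} A^{(\alpha)}_{\tilde{v}_k^{(1)}}(\tau,\nu) + \sum_{k=1}^{N} \tilde{\lambda}_k^{(2)} A^{(\alpha)}_{\tilde{v}_k^{(2)}}(\tau,\nu)$$
$$= \sum_{k=1}^{N} \lambda_k^{(1)} A^{(\alpha)}_{v_k^{(1)}}(\tau,\nu) + \sum_{k=1}^{N} \lambda_k^{(2)} A^{(\alpha)}_{v_k^{(2)}}(\tau,\nu) + \sum_{k=1}^{N} \mu_k A^{(\alpha)}_{v_k^{(1)},v_k^{(2)}}(\tau,\nu) + \sum_{k=1}^{N} \mu_k A^{(\alpha)}_{v_k^{(2)},v_k^{(1)}}(\tau,\nu) .$$
It is seen that the existence of TF correlations is also reflected by the GEAF of $\tilde{x}(t)$, which has significant components $A^{(\alpha)}_{v_k^{(i)},v_k^{(j)}}(\tau,\nu)$ ($i \neq j$) off the origin about the points $(\tau,\nu) = (\Delta t, \Delta f)$ and $(\tau,\nu) = (-\Delta t, -\Delta f)$. Hence, $\tilde{x}(t)$ is an overspread process.

The GEAF of an example process with $N = 30$ is shown in Fig. 3.3(c). The GEAF components off the origin about $(\tau,\nu) = (\Delta t, \Delta f)$ and $(\tau,\nu) = (-\Delta t, -\Delta f)$ lead to substantially larger moments $m_{\tilde{x}}^{(1,0)} m_{\tilde{x}}^{(0,1)} = 14.31$, $m_{\tilde{x}}^{(1,1)} = 15.25$, $M_{\tilde{x}}^{(1,0)} M_{\tilde{x}}^{(0,1)} = 7.57$, and $M_{\tilde{x}}^{(1,1)} = 11.35$.
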
3.2 Elementary Time-Varying Power Spectra

We shall next discuss two fundamental classes of time-varying power spectra: the generalized Wigner-Ville spectrum and the generalized evolutionary spectrum.

3.2.1 Generalized Wigner-Ville Spectrum

Extending the Wiener-Khintchine relation (1.3) between the temporal correlation function $r_x(\tau)$ and the power spectral density $P_x(f)$ of a stationary random process, a time-varying spectrum of a nonstationary process can be defined as the 2-D (symplectic) Fourier transform of the GEAF (TF correlation function) $\bar{A}_x^{(\alpha)}(\tau,\nu)$,
$$\overline{W}_x^{(\alpha)}(t,f) \triangleq \int_\tau \int_\nu \bar{A}_x^{(\alpha)}(\tau,\nu)\, e^{-j2\pi(\tau f - \nu t)}\, d\tau\, d\nu \tag{3.28}$$
$$= \int_\tau r_x^{(\alpha)}(t,\tau)\, e^{-j2\pi f \tau}\, d\tau = L_{R_x}^{(\alpha)}(t,f) . \tag{3.29}$$
This time-varying power spectrum is known as the generalized Wigner-Ville spectrum (GWVS, see also Section B.3.1) [60, 61, 63, 126, 140, 145]. For $\alpha = 0$, the GWVS reduces to the ordinary Wigner-Ville spectrum [12, 60, 63, 67, 140], whereas for $\alpha = 1/2$ the Rihaczek spectrum [60, 63, 178] is obtained. The GWVS satisfies a large number of desirable mathematical properties and has proved useful for certain signal processing applications [92, 111, 141, 143]. A potential drawback of the GWVS from the point of view of interpretation, namely the occurrence of negative values, will be discussed in Subsection 3.3.4.

From the 2-D Fourier relationship connecting the GWVS and the GEAF, it follows immediately that the GWVS of an underspread process is a smooth 2-D lowpass function. For convenience, we reformulate Proposition 2.5 for nonstationary random processes using the relations $\overline{W}_x^{(\alpha)}(t,f) = L_{R_x}^{(\alpha)}(t,f)$, $\bar{A}_x^{(\alpha)}(\tau,\nu) = S_{R_x}^{(\alpha)}(\tau,\nu)$, $m_x^{(k,l)} = m_{R_x}^{(k,l)}$, and $M_x^{(k,l)} = M_{R_x}^{(k,l)}$:

Corollary 3.4. For any process $x(t)$, the partial derivatives of the GWVS satisfy
$$\left| \frac{\partial^{k+l} \overline{W}_x^{(\alpha)}(t,f)}{\partial t^l\, \partial f^k} \right| \le (2\pi)^{k+l}\, \|\bar{A}_x\|_1\, m_x^{(k,l)} , \qquad \left\| \frac{\partial^{k+l} \overline{W}_x^{(\alpha)}}{\partial t^l\, \partial f^k} \right\|_2 = (2\pi)^{k+l}\, \|\bar{A}_x\|_2\, M_x^{(k,l)} .$$

It is thus seen that for underspread processes with small GEAF moments, the GWVS is a smooth function. On the other hand, if the process is overspread, (3.28) implies that the GWVS contains severe oscillations that correspond to GEAF contributions off the origin and hence can be viewed as indicators of TF correlation. These oscillatory GWVS components also comprise negative GWVS values and can be interpreted as cross terms or interference terms. Thus, TF correlations cause statistical cross or interference terms in the GWVS [145] (similar observations have been reported for a special case in [198]).

Figure 3.4: Illustration of TF points $(t_1,f_1)$, $(t_2,f_2)$ contributing to $\overline{W}_x^{(\alpha)}(t_0,f_0)$ for (a) $\alpha = 0$ (Wigner-Ville spectrum), (b) $\alpha = 1/2$ (Rihaczek spectrum); (c) TF locus of statistical cross terms of the GWVS for several values of $\alpha$.

For Gaussian random processes, the cross terms of the GWVS and their relation to TF correlations can further be analyzed using the covariance function $C_x^{(\alpha)}(t,f;\tau,\nu)$ of the GWD of $x(t)$ as defined in (3.7). We have the following corollary to Theorem 3.1.

Corollary 3.5. For a circular complex Gaussian random process $x(t)$,
$$\int_\tau \int_\nu C_x^{(\alpha)}(t,f;\tau,\nu)\, d\tau\, d\nu = \big| \overline{W}_x^{(\alpha)}(t,f) \big|^2 . \tag{3.30}$$

Proof. In the proof of Theorem 3.1 it was shown that $C_x^{(\alpha)}(t,f;\tau,\nu) = W_{R_x}^{(\alpha)}(t,f;\tau,\nu)$. From this, (3.30) follows by using (B.34) (with $H$ replaced by $R_x$) and $L_{R_x}^{(\alpha)}(t,f) = \overline{W}_x^{(\alpha)}(t,f)$.

Let us give an interpretation of Corollary 3.5. We consider (3.30) for a fixed TF analysis point $(t,f) = (t_0,f_0)$. According to (3.7),
$$C_x^{(\alpha)}(t_0,f_0;\tau,\nu) = \mathrm{cov}\Big\{ W_x^{(\alpha)}\big(t_0 + (\tfrac{1}{2} - \alpha)\tau,\, f_0 + (\tfrac{1}{2} + \alpha)\nu\big),\; W_x^{(\alpha)}\big(t_0 - (\tfrac{1}{2} + \alpha)\tau,\, f_0 - (\tfrac{1}{2} - \alpha)\nu\big) \Big\} .$$
Hence, (3.30) says that the value of the GWVS at the TF point $(t_0,f_0)$ subsumes the covariance of the GWD at all TF points $(t_1,f_1) = \big(t_0 + (\tfrac{1}{2} - \alpha)\tau,\, f_0 + (\tfrac{1}{2} + \alpha)\nu\big)$ and $(t_2,f_2) = \big(t_0 - (\tfrac{1}{2} + \alpha)\tau,\, f_0 - (\tfrac{1}{2} - \alpha)\nu\big)$. Note that while these TF points are always separated by $\tau$ in time and by $\nu$ in frequency, their particular TF location is determined by $\alpha$ (see Fig. 3.4(a),(b) for the cases $\alpha = 0$ and $\alpha = 1/2$). The other way around, if two TF points $(t_1,f_1)$ and $(t_2,f_2)$ of the generalized Wigner distribution are correlated, i.e., $C_x(t_1,f_1;t_2,f_2) \neq 0$, then this correlation will contribute to the GWVS value at the TF point $(t,f) = \big( \frac{t_1+t_2}{2} - \alpha(t_1 - t_2),\, \frac{f_1+f_2}{2} + \alpha(f_1 - f_2) \big)$ (see Fig. 3.4(c)). This explains why $|\overline{W}_x^{(\alpha)}(t,f)| > 0$ even if $x(t)$ features no energy at this TF point.

The basic mechanisms of statistical cross terms are further illustrated by the following example.

Example (continued). We reconsider the random processes $x(t)$ and $\tilde{x}(t)$ defined by (3.21) and (3.23), respectively, with $N = 30$. In Subsection 3.1.6, it was seen that $x(t)$ is an underspread process since its GEAF is highly concentrated about the origin and features small moments. From (3.22), the GWVS of $x(t)$ is obtained as
$$\overline{W}_x^{(\alpha)}(t,f) = \sum_{k=1}^{N} \lambda_k^{(1)} W^{(\alpha)}_{v_k^{(1)}}(t,f) + \sum_{k=1}^{N} \lambda_k^{(2)} W^{(\alpha)}_{v_k^{(2)}}(t,f) .$$
Since the basis functions $v_k^{(1)}$ and $v_k^{(2)}$ are well TF localized in $R_1$ and $R_2$, respectively, $\overline{W}_x^{(\alpha)}(t,f)$ essentially features energetic contributions in these two TF regions (see Fig. 3.5(a),(b)). In contrast, it follows from (3.26) that the GWVS of $\tilde{x}(t)$ is given by
$$\overline{W}_{\tilde{x}}^{(\alpha)}(t,f) = \sum_{k=1}^{N} \lambda_k^{(1)} W^{(\alpha)}_{v_k^{(1)}}(t,f) + \sum_{k=1}^{N} \lambda_k^{(2)} W^{(\alpha)}_{v_k^{(2)}}(t,f) + \sum_{k=1}^{N} \mu_k W^{(\alpha)}_{v_k^{(1)},v_k^{(2)}}(t,f) + \sum_{k=1}^{N} \mu_k W^{(\alpha)}_{v_k^{(2)},v_k^{(1)}}(t,f) .$$
Compared to $\overline{W}_x^{(\alpha)}(t,f)$, $\overline{W}_{\tilde{x}}^{(\alpha)}(t,f)$ is seen to feature additional cross GWD terms $W^{(\alpha)}_{v_k^{(1)},v_k^{(2)}}(t,f)$, $W^{(\alpha)}_{v_k^{(2)},v_k^{(1)}}(t,f)$ (see Fig. 3.5(c) and (d)). These additional contributions correspond to oscillating statistical cross terms that can be attributed to the nonzero TF correlations $\mu_k$ present in the process $\tilde{x}(t)$. While the strength of the statistical cross terms is determined by the correlation parameters $\mu_k$, their location and geometry are determined by the ($\alpha$-dependent) interference geometry of the GWD [85]. The suppression or reduction of these cross terms by smoothing the GWVS will be considered in Section 3.3.

Figure 3.5: Illustration of the GWVS of the underspread example process $x(t)$ and the overspread example process $\tilde{x}(t)$: (a) $\overline{W}_x^{(0)}(t,f)$, (b) real part of $\overline{W}_x^{(1/2)}(t,f)$, (c) $\overline{W}_{\tilde{x}}^{(0)}(t,f)$, (d) real part of $\overline{W}_{\tilde{x}}^{(1/2)}(t,f)$. In (c) and (d), statistical cross terms are clearly visible. The number of signal samples is 256, (normalized) frequency ranges from $-1/4$ to $1/4$.

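The appearance of statistical cross terms can be reproduced in a small discrete-time experiment. The following is a minimal sketch under several assumptions made only for illustration: a single pair of Gaussian atoms instead of $N = 30$ pairs, a cyclic discretization of the Rihaczek spectrum ($\alpha = 1/2$), and the hypothetical function names `gaussian_atom`, `correlation_matrix`, and `rihaczek_spectrum`.

```python
import cmath
import math

def gaussian_atom(N, n0, k0, sigma):
    """Unit-norm Gaussian atom centered at time index n0 and frequency bin k0."""
    v = [math.exp(-((n - n0) ** 2) / (2.0 * sigma ** 2))
         * cmath.exp(2j * math.pi * k0 * n / N) for n in range(N)]
    norm = math.sqrt(sum(abs(a) ** 2 for a in v))
    return [a / norm for a in v]

def correlation_matrix(v1, v2, lam1, lam2, mu):
    """R = lam1 v1 v1^H + lam2 v2 v2^H + mu (v1 v2^H + v2 v1^H), cf. (3.27)."""
    N = len(v1)
    return [[lam1 * v1[n] * v1[m].conjugate()
             + lam2 * v2[n] * v2[m].conjugate()
             + mu * (v1[n] * v2[m].conjugate() + v2[n] * v1[m].conjugate())
             for m in range(N)] for n in range(N)]

def rihaczek_spectrum(R):
    """Cyclic discrete Rihaczek spectrum W[n][k] = sum_m R[n][m] e^{-j 2 pi k (n-m)/N}."""
    N = len(R)
    return [[sum(R[n][m] * cmath.exp(-2j * math.pi * k * (n - m) / N)
                 for m in range(N)) for k in range(N)] for n in range(N)]

N = 32
v1 = gaussian_atom(N, 8, 2, 2.5)     # atom in region R1
v2 = gaussian_atom(N, 24, 14, 2.5)   # atom in region R2
W_under = rihaczek_spectrum(correlation_matrix(v1, v2, 1.0, 1.0, 0.0))
W_over = rihaczek_spectrum(correlation_matrix(v1, v2, 1.0, 1.0, 0.5))
# Cross-term location for alpha = 1/2: time of atom 1, frequency of atom 2.
print(abs(W_under[8][14]), abs(W_over[8][14]))
```

With uncorrelated coefficients the spectrum is negligible at the TF point (time of the first atom, frequency of the second atom), whereas a nonzero correlation parameter produces a pronounced contribution there, although neither atom has energy at that point.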
Approximate Uniqueness of the GWVS

In contrast to the PSD which is unique, the GWVS depends on the parameter $\alpha$. This non-uniqueness might be considered an inconvenience. An important result regarding the choice of the GWVS parameter $\alpha$ results from specializing Theorem 2.10.

Corollary 3.6. For any random process $x(t)$, the difference $\overline{W}_x^{(\alpha_1)}(t,f) - \overline{W}_x^{(\alpha_2)}(t,f)$ between two GWVS with parameters $\alpha_1$ and $\alpha_2$ is bounded as
$$\frac{\big| \overline{W}_x^{(\alpha_1)}(t,f) - \overline{W}_x^{(\alpha_2)}(t,f) \big|}{\|\bar{A}_x\|_1} \le 2\pi |\alpha_1 - \alpha_2|\, m_x^{(1,1)} , \qquad \frac{\big\| \overline{W}_x^{(\alpha_1)} - \overline{W}_x^{(\alpha_2)} \big\|_2}{\|\bar{A}_x\|_2} \le 2\pi |\alpha_1 - \alpha_2|\, M_x^{(1,1)} . \tag{3.31}$$

Proof. The corollary follows from Theorem 2.10 with $H = R_x$ and the moment equality (3.17).

The bound (3.31) shows that for small $m_x^{(1,1)}$ and $M_x^{(1,1)}$, i.e., for underspread processes, GWVS with different $\alpha$ are approximately equal,
$$\overline{W}_x^{(\alpha_1)}(t,f) \approx \overline{W}_x^{(\alpha_2)}(t,f) .$$
An example for this approximation with $\alpha_1 = 0$ and $\alpha_2 = 1/2$ is shown in parts (a) and (b) of Fig. 3.5. In this example, the normalized errors were $\max_{t,f} \big| \overline{W}_x^{(0)}(t,f) - \overline{W}_x^{(1/2)}(t,f) \big| / \|\bar{A}_x\|_1 = 10^{-4}$ and $\big\| \overline{W}_x^{(0)} - \overline{W}_x^{(1/2)} \big\|_2 / \|R_x\|_2 = 2.7 \cdot 10^{-3}$, while the corresponding bounds in (3.31) were $\pi\, m_x^{(1,1)} = 3 \cdot 10^{-3}$ and $\pi\, M_x^{(1,1)} = 3.8 \cdot 10^{-2}$.

Small $m_x^{(1,1)}$, $M_x^{(1,1)}$ essentially requires that the process' GEAF is concentrated along the $\tau$ axis and/or $\nu$ axis. Processes having a GEAF oriented in oblique directions will not have small $m_x^{(1,1)}$, $M_x^{(1,1)}$. For such processes, the Wigner-Ville spectrum (i.e., the GWVS with $\alpha = 0$) plays an outstanding role [60, 61, 63] due to its metaplectic covariance properties (which are analogous to those of the GWS with $\alpha = 0$), see Appendix C.

CL Processes. For CL processes, application of 2.18 (with mx x = Rx ) to (3.31) yields the bounds Wx
(1 ) (2 ) (1 ) (1,1) (1,1) (1,1) (1,1)
= mRx , Mx
(2 ) 2
= MRx , and
(t, f ) Wx Ax 1
(t, f )
2|1 2 |x ,
Wx
Wx Ax 2
2|1 2 |x .
Moreover, we can formulate a result analogous to Proposition 2.11 which illuminates the interrelation of the bounds for the CL case with those for the non-CL case.
Real-Valuedness and Positivity

The PSD is real-valued and positive. In contrast, the GWVS is real-valued only for $\alpha = 0$ and its real part is only rarely everywhere positive, i.e., there may be $\overline{W}_x^{(\alpha)}(t,f) \neq P\{\overline{W}_x^{(\alpha)}(t,f)\}$, where
$$P\{\overline{W}_x^{(\alpha)}(t,f)\} \triangleq \frac{1}{2} \Big[ \Re\{\overline{W}_x^{(\alpha)}(t,f)\} + \big| \Re\{\overline{W}_x^{(\alpha)}(t,f)\} \big| \Big]$$
denotes the positive real part of the GWVS $\overline{W}_x^{(\alpha)}(t,f)$. However, the approximate $\alpha$-invariance of the GWVS stated in the foregoing corollary suggests that for underspread processes the imaginary part of the GWVS with $\alpha \neq 0$ is negligible. Furthermore, our qualitative discussion of statistical cross terms suggested that these cross terms are the main source of negative GWVS values. Since the GWVS of underspread processes contain only small statistical cross terms, they might be suspected to be almost positive. Indeed, reformulation of Corollary 2.13 and Theorem 2.29 yields
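
Corollary 3.7. For any random process $x(t)$, the imaginary part of the GWVS, $\Im\{\overline{W}_x^{(\alpha)}(t,f)\} \triangleq \frac{1}{2j} \big[ \overline{W}_x^{(\alpha)}(t,f) - \overline{W}_x^{(\alpha)*}(t,f) \big]$, is bounded as
$$\frac{\big| \Im\{\overline{W}_x^{(\alpha)}(t,f)\} \big|}{\|\bar{A}_x\|_1} \le 2\pi |\alpha|\, m_x^{(1,1)} , \qquad \frac{\big\| \Im\{\overline{W}_x^{(\alpha)}\} \big\|_2}{\|\bar{A}_x\|_2} \le 2\pi |\alpha|\, M_x^{(1,1)} . \tag{3.32}$$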
Furthermore, the negative real part of the GWVS, N Wx (t, f ) is bounded as N Wx (t, f ) SHp
2 1 ()
P Wx (t, f ) Wx (t, f ) ,
2 ( Mx s ) , ()
min{1 , 2 } ,
N Wx Ax
()
() 2
inf
s
2 =1
(3.33)
where 1 = inf
2 =1
mx
(s )
semi-denite innovations system of x(t).
with s (, ) = 1 As (, ) and 2 = 2 CHp with Hp the positive Ax SHp

2 , 1
Proof. Using the moment equalities (3.16), (3.17), and the inequality bounds follow from Corollary 2.13 and Theorem 2.29 with H = Rx .
the above
Combination of (3.32) and (3.33) gives the following reformulation of Corollary 2.30. Corollary 3.8. For any random process x(t), the dierence Wx (t, f ) P Wx (t, f ) is bounded as Wx (t, f ) P Wx (t, f ) SHp
2 1 () () () ()
2|| m(1,1) + min{1 , 2 } , x

2 (1,1) 2|| Mx + inf s ( Mx s ) ,
(3.34a)
Wx Proof. We have
()
P Wx Ax
2
()
2 =1
(3.34b)
Wx (t, f ) P Wx (t, f ) = j Wx (t, f ) + N Wx (t, f ) , and thus, by the triangle inequality, Wx

()
()
()
()
()
P Wx
() p
Wx
() p
+ N Wx
() p
(3.35)
where for our purposes either p = or p = 2. Applying the bounds on the left-hand sides of (3.32) bounds on the right-hand sides of (3.32) and (3.33) to (3.35) (with p = 2). For inf
s
2 =1
and (3.33) to (3.35) (with p = ) then yields (3.34a). Similarly, (3.34b) is obtained by applying the underspread
( ) Mx s
processes,
() CHp
i.e.,
processes
where
mx
(1,1)
Mx
(1,1)
inf
2 =1
mx
(s )
, and
are small, the above bounds show that

()
Wx (t, f ) 0 , and thus

()
N Wx (t, f ) 0 ,
()
()
Wx (t, f ) P Wx (t, f ) .
Hence, for underspread processes the imaginary and negative real part of the GWVS will be negligible. Parts (a) and (b) of Fig. 3.5 are examples for this approximation with = 0 and = 1/2, respectively. For = 0, the normalized errors were maxt,f duration) were mx errors were maxt,f 3.6 103 and
(s ) N W x (t,f ) SHp 2 1
(0)
3.3 104 , while the corresponding bounds in (3.33) (obtained with a Gaussian signal s(t) of optimal
(1/2) Wx (t,f )P
= 4 106 and
N Wx Ax
(0) 2 2
= 5.2 104 and Mx

(1/2) Wx (t,f ) SHp 2 1
(s )
= 2.9 103 , respectively. For = 1/2, the normalized

Wx
(1/2)
= 8.3 105 and
P W x Ax
2
(1/2)
= 0.033, while the

(1,1)
corresponding bounds in (3.34) (using the same Gaussian signal s(t) as before) were mx
(1,1) Mx
+ mx
(s )
( ) Mx s
If the GEAF of x(t) and the GSF of Hp are oriented along the axis or axis, all of the relevant
weighted integrals and moments will be small. For processes having an obliquely oriented GEAF, only the GWVS with = 0 (which is always real-valued as correctly reected by (3.32)) will approximately equal its positive part. In such cases, the weighted integrals mx replacing CHp with minUM CUHp U+ (cf. (2.72)). (with mx CL Processes. If x(t) is a CL process with GEAF support region Gx , applying (2.18) and (2.16)
(1,1) (0) (0) (s )
and Mx
(s )
can be made small by
suitable choice of s(t), and 2 can be rened using the metaplectic covariance of the Weyl symbol by
= mRx , Mx
(1,1)
(1,1)
= MRx , and Mx
()
(1,1)
(s )
= MRxs ) to (3.34) yields the bounds4 2|| x + inf {(max) } , s

s
2 =1
( )
Wx (t, f ) P Wx (t, f ) SHp

2 1
()
Wx
()
P Wx Ax 2
() 2
2|| x + inf {(max) } , s

s
2 =1
where s
(max)
= max(,)Gx s (, ).
Uncertainty Relations

The effective duration $T_x$ and the effective bandwidth $F_x$ of a finite-energy, deterministic signal $x(t)$ are defined by
$$T_x^2 \triangleq \frac{\int_t t^2 |x(t)|^2\, dt}{\int_t |x(t)|^2\, dt} = \frac{1}{\|x\|_2^2} \int_t t^2 |x(t)|^2\, dt , \qquad F_x^2 \triangleq \frac{\int_f f^2 |X(f)|^2\, df}{\int_f |X(f)|^2\, df} = \frac{1}{\|x\|_2^2} \int_f f^2 |X(f)|^2\, df .$$
Furthermore, a joint measure of the duration and bandwidth of $x(t)$ is given by
$$\rho_x^2 \triangleq \Big( \frac{T_x}{T} \Big)^2 + (T F_x)^2 ,$$
where $T$ is an arbitrary normalization constant. According to the uncertainty principle [45, 64–66], $\rho_x$ is bounded from below as
$$\rho_x^2 = \Big( \frac{T_x}{T} \Big)^2 + (T F_x)^2 \ge \frac{1}{2\pi} . \tag{3.36}$$
For simplicity, we here disregard the term in (3.34a) involving 2 .
A special case of this inequality, obtained with $T^2 = T_x / F_x$, reads
$$T_x F_x \ge \frac{1}{4\pi} , \tag{3.37}$$
which shows that $T_x$ and $F_x$ cannot be simultaneously small. Equality in (3.36) and (3.37) is attained if and only if $x(t)$ is a Gaussian signal [64–66]. The uncertainty inequality can also be viewed as a lower bound on the spread of the GWD since the GWD's marginal properties imply
$$T_x^2 = \frac{1}{\|x\|_2^2} \int_t \int_f t^2\, W_x^{(\alpha)}(t,f)\, dt\, df , \qquad F_x^2 = \frac{1}{\|x\|_2^2} \int_t \int_f f^2\, W_x^{(\alpha)}(t,f)\, dt\, df ,$$
and thus (3.36) can be rewritten as
$$\rho_x^2 = \frac{1}{\|x\|_2^2} \int_t \int_f \Big[ \Big( \frac{t}{T} \Big)^2 + (T f)^2 \Big] W_x^{(\alpha)}(t,f)\, dt\, df \ge \frac{1}{2\pi} . \tag{3.38}$$
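The deterministic uncertainty relation (3.37) can be checked numerically. The following is a minimal sketch, assuming uniformly sampled signals centered at $t = 0$; the bandwidth is evaluated in the time domain via Parseval's relation, $\int f^2 |X(f)|^2 df = (2\pi)^{-2} \int |x'(t)|^2 dt$, using a central-difference derivative. The function names `time_grid` and `duration_bandwidth` are illustrative assumptions.

```python
import math

def time_grid(N, dt):
    """Symmetric time grid with N samples and spacing dt."""
    return [(n - (N - 1) / 2.0) * dt for n in range(N)]

def duration_bandwidth(x, dt):
    """Effective duration T_x and bandwidth F_x of the samples x[n] = x(t_n)."""
    N = len(x)
    t = time_grid(N, dt)
    energy = sum(abs(v) ** 2 for v in x) * dt
    T2 = sum(tn ** 2 * abs(v) ** 2 for tn, v in zip(t, x)) * dt / energy
    # Central differences approximate x'(t) at the interior samples.
    dx = [(x[n + 1] - x[n - 1]) / (2.0 * dt) for n in range(1, N - 1)]
    F2 = sum(abs(d) ** 2 for d in dx) * dt / (4.0 * math.pi ** 2 * energy)
    return math.sqrt(T2), math.sqrt(F2)

N, dt = 2001, 0.005
t = time_grid(N, dt)
gauss = [math.exp(-math.pi * tn ** 2) for tn in t]          # Gaussian signal
hermite1 = [tn * math.exp(-math.pi * tn ** 2) for tn in t]  # first-order Hermite-type signal
T_g, F_g = duration_bandwidth(gauss, dt)
T_h, F_h = duration_bandwidth(hermite1, dt)
print(T_g * F_g, 1.0 / (4.0 * math.pi))  # Gaussian: close to the bound 1/(4 pi)
print(T_h * F_h, 3.0 / (4.0 * math.pi))  # about three times the bound
```

Up to discretization error, the Gaussian attains the lower bound $1/(4\pi)$, while the second signal has a duration-bandwidth product of about $3/(4\pi)$.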
The relation (3.38) shows that $\rho_x$ can be interpreted as a TF radius measuring the TF concentration of $W_x^{(\alpha)}(t,f)$.

Next, let $x(t)$ be a nonstationary random process. Since (3.36) holds for any (finite-energy) realization of a random process, multiplying by $\|x\|_2^2$, taking the expectation, and renormalizing by $E_x^2 = E\{\|x\|_2^2\}$ yields
$$\bar{\rho}_x^2 \ge \frac{1}{2\pi} , \qquad \bar{T}_x \bar{F}_x \ge \frac{1}{4\pi} , \tag{3.39}$$
with
$$\bar{T}_x^2 \triangleq \frac{\int_t t^2\, r_x(t,t)\, dt}{\int_t r_x(t,t)\, dt} = \frac{1}{E_x^2} \int_t t^2\, r_x(t,t)\, dt , \tag{3.40a}$$
$$\bar{F}_x^2 \triangleq \frac{\int_f f^2\, r_X(f,f)\, df}{\int_f r_X(f,f)\, df} = \frac{1}{E_x^2} \int_f f^2\, r_X(f,f)\, df , \tag{3.40b}$$
and $\bar{\rho}_x^2 \triangleq \big( \bar{T}_x / T \big)^2 + (T \bar{F}_x)^2$. Note, however, that in general $\bar{T}_x^2 \neq E\{T_x^2\}$, $\bar{F}_x^2 \neq E\{F_x^2\}$, $\bar{\rho}_x^2 \neq E\{\rho_x^2\}$. Similarly, we straightforwardly obtain from (3.38) that
$$\bar{\rho}_x^2 = \frac{1}{E_x^2} \int_t \int_f \Big[ \Big( \frac{t}{T} \Big)^2 + (T f)^2 \Big] \overline{W}_x^{(\alpha)}(t,f)\, dt\, df , \tag{3.41}$$
and hence, the TF radius $\bar{\rho}_x$ is a measure of the TF concentration of $\overline{W}_x^{(\alpha)}(t,f)$. Together with the left-hand inequality in (3.39), this indicates that the GWVS of a finite-energy process cannot be too much concentrated. In the following we shall present bounds that are tighter than those in (3.39) and that relate the spread of the GWVS to an effective rank parameter of the correlation operator. We note that uncertainty relations for positive (semi-)definite operators (like correlation operators) are well known in quantum mechanics (cf. [158]) and have previously been considered in [81, 99, 101]. The following theorem is essentially an adapted/rephrased result from [99]. We recall that the KL expansion of a correlation operator is $R_x = \sum_{k=1}^{\infty} \lambda_k\, u_k \otimes u_k^*$, with the KL eigenvalues $\lambda_k \ge 0$ (sorted in decreasing order) and the (orthogonal) KL eigenfunctions $u_k(t)$.
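
Theorem 3.9. The TF radius $\bar{\rho}_x$ and the duration-bandwidth product $\bar{T}_x \bar{F}_x$ of any finite-energy random process $x(t)$ are bounded from below as
$$\bar{\rho}_x^2 \ge \frac{1}{\pi} \Big( \kappa_x - \frac{1}{2} \Big) , \qquad \bar{T}_x \bar{F}_x \ge \frac{1}{2\pi} \Big( \kappa_x - \frac{1}{2} \Big) , \qquad \text{where } \kappa_x \triangleq \frac{\sum_{k=1}^{\infty} k\, \lambda_k}{\sum_{k=1}^{\infty} \lambda_k} . \tag{3.42}$$

Proof. We need the Hermite operator $H_T$ defined as
$$(H_T x)(t) = \Big( \frac{t}{T} \Big)^2 x(t) - \Big( \frac{T}{2\pi} \Big)^2 \frac{d^2}{dt^2} x(t) = \big( (M^2 + D^2)\, x \big)(t) ,$$
where $M$ is an LFI operator performing a multiplication by $t/T$ and $D = \frac{T}{j2\pi} \frac{d}{dt}$. The operator $H_T$ is positive definite and unbounded [69, 158]. Its eigenvalues are given by $\gamma_k = \frac{1}{2\pi}(2k - 1)$ and its eigenfunctions are the Hermite functions [45, 64, 81, 99], which we denote by $h_k(t)$. Thus, we have $H_T = \sum_{k=1}^{\infty} \gamma_k\, h_k \otimes h_k^*$. Furthermore, it can straightforwardly be shown that $L_{H_T}^{(\alpha)}(t,f) = \big( \frac{t}{T} \big)^2 + (T f)^2$ [45, 81, 99]. Starting with (3.41), the unitarity of the GWS (cf. (B.25)) together with $\overline{W}_x^{(\alpha)}(t,f) = L_{R_x}^{(\alpha)}(t,f)$ then implies
$$\bar{\rho}_x^2 = \frac{1}{E_x^2} \big\langle L_{H_T}^{(\alpha)}, \overline{W}_x^{(\alpha)} \big\rangle = \frac{1}{E_x^2} \langle H_T, R_x \rangle = \frac{1}{E_x^2} \sum_{k=1}^{\infty} \lambda_k \langle H_T u_k, u_k \rangle .$$
According to Lemma 3.1 in [99], there is
$$\sum_{k=1}^{\infty} c_k \langle H_T f_k, f_k \rangle \ge \sum_{k=1}^{\infty} c_k\, \gamma_k , \tag{3.43}$$
for any orthonormal basis $\{f_k(t)\}$ and any ordered nonnegative sequence $c_k \ge 0$, $c_1 \ge c_2 \ge \cdots$. With $c_k = \lambda_k$ and $f_k(t) = u_k(t)$, we hence obtain
$$\bar{\rho}_x^2 \ge \frac{1}{E_x^2} \sum_{k=1}^{\infty} \lambda_k \gamma_k = \frac{1}{E_x^2} \frac{1}{2\pi} \sum_{k=1}^{\infty} \lambda_k (2k - 1) = \frac{1}{\pi} \bigg[ \frac{\sum_{k=1}^{\infty} k \lambda_k}{\sum_{k=1}^{\infty} \lambda_k} - \frac{1}{2} \bigg] = \frac{1}{\pi} \Big( \kappa_x - \frac{1}{2} \Big) ,$$
where we further used $\sum_{k=1}^{\infty} \lambda_k = \mathrm{Tr}\{R_x\} = E_x^2$. This concludes the proof of the left-hand inequality in (3.42). The right-hand inequality in (3.42) is a special case of the left-hand inequality obtained by setting $T^2 = \bar{T}_x / \bar{F}_x$ in the expression $\bar{\rho}_x^2 = \big( \bar{T}_x / T \big)^2 + (T \bar{F}_x)^2$.
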
Discussion. The parameter $\kappa_x$ can be viewed as a measure of the effective rank of the correlation operator $R_x = \sum_{k=1}^{\infty} \lambda_k\, u_k \otimes u_k^*$. Hence, Theorem 3.9 shows that the lower bound for the TF radius $\bar{\rho}_x$ and the duration-bandwidth product $\bar{T}_x \bar{F}_x$ is effectively determined by the eigenvalue spread of $R_x$. It can furthermore be shown that $\kappa_x \ge 1$, and hence the lower bounds in (3.42) are tighter (larger) than the lower bounds in (3.39). In particular, the lower bounds in (3.42) and (3.39) become equivalent in the case $\kappa_x = 1$, i.e., for a rank-one correlation operator $R_x = E_x^2\, u_1 \otimes u_1^*$. Finally, we note that equality in (3.43) is attained for $u_k(t) = h_k(t)$ and hence the lower bounds in (3.42) are attained with equality for $R_x = \sum_{k=1}^{\infty} \lambda_k\, h_k \otimes h_k^*$, i.e., in the case of random processes whose KL eigenfunctions are the Hermite functions.

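The effective rank parameter and the resulting lower bounds of Theorem 3.9 are simple functions of the KL eigenvalues. The following is a minimal sketch; the function names `effective_rank`, `tf_radius_bound`, and `duration_bandwidth_bound` are illustrative assumptions, and the eigenvalue lists are made-up test data.

```python
import math

def effective_rank(eigvals):
    """kappa_x = sum_k k*lam_k / sum_k lam_k, eigenvalues sorted in decreasing order."""
    lam = sorted(eigvals, reverse=True)
    return sum(k * l for k, l in enumerate(lam, start=1)) / sum(lam)

def tf_radius_bound(eigvals):
    """Lower bound (kappa_x - 1/2) / pi on the squared TF radius, cf. (3.42)."""
    return (effective_rank(eigvals) - 0.5) / math.pi

def duration_bandwidth_bound(eigvals):
    """Lower bound (kappa_x - 1/2) / (2 pi) on the duration-bandwidth product."""
    return (effective_rank(eigvals) - 0.5) / (2.0 * math.pi)

print(effective_rank([5.0]))                 # rank-one operator
print(effective_rank([1.0, 1.0, 1.0, 1.0]))  # four equal eigenvalues
print(duration_bandwidth_bound([5.0]), 1.0 / (4.0 * math.pi))
```

For a rank-one correlation operator the bounds reduce to those in (3.39); for $N$ equal eigenvalues, $\kappa_x = (N+1)/2$, so the bounds grow linearly with the rank.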
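Since the GWVS in general is neither positive nor real-valued (except for $\alpha = 0$), we next consider an alternative measure of GWVS concentration,
$$\tilde{\rho}_x^2(\alpha) \triangleq \frac{\int_t \int_f \big[ \big( \frac{t}{T} \big)^2 + (T f)^2 \big] \big| \overline{W}_x^{(\alpha)}(t,f) \big|^2 dt\, df}{\int_t \int_f \big| \overline{W}_x^{(\alpha)}(t,f) \big|^2 dt\, df} = \frac{1}{\big\| \overline{W}_x^{(\alpha)} \big\|_2^2} \int_t \int_f \Big[ \Big( \frac{t}{T} \Big)^2 + (T f)^2 \Big] \big| \overline{W}_x^{(\alpha)}(t,f) \big|^2 dt\, df .$$
A lower bound on this modified TF radius is slightly more difficult to establish. The following theorem builds upon results from [99] and [101] and shows that the Wigner-Ville spectrum, i.e., the GWVS with $\alpha = 0$, has maximum TF concentration, i.e., minimum TF radius.

Theorem 3.10. The TF radius $\tilde{\rho}_x^2(\alpha)$ of any finite-energy random process $x(t)$ satisfies
$$\tilde{\rho}_x^2(\alpha) = \tilde{\rho}_x^2(0) + \alpha^2 \bigg[ \Big( \frac{M_x^{(1,0)}}{T} \Big)^2 + \big( T M_x^{(0,1)} \big)^2 \bigg] . \tag{3.44}$$
It is bounded from below as
$$\tilde{\rho}_x^2(\alpha) \ge \frac{1}{2\pi} \Big( \tilde{\kappa}_x - \frac{1}{2} \Big) + \alpha^2 \bigg[ \Big( \frac{M_x^{(1,0)}}{T} \Big)^2 + \big( T M_x^{(0,1)} \big)^2 \bigg] , \tag{3.45}$$
where $\tilde{\kappa}_x \triangleq \frac{\sum_{k=1}^{\infty} k\, \lambda_k^2}{\sum_{k=1}^{\infty} \lambda_k^2}$.
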
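Proof. We first consider the case $\alpha = 0$. Using $\overline{W}_x^{(0)}(t,f) = L_{R_x}^{(0)}(t,f) = \sum_{k=1}^{\infty} \lambda_k W_{u_k}^{(0)}(t,f)$ (see (B.29)), we have
$$\begin{aligned}
\int_t \int_f \Big( \frac{t}{T} \Big)^2 \big| \overline{W}_x^{(0)}(t,f) \big|^2 dt\, df
&= \int_t \int_f \Big( \frac{t}{T} \Big)^2 \bigg[ \sum_{k=1}^{\infty} \lambda_k \int_{\tau_1} u_k\big(t + \tfrac{\tau_1}{2}\big)\, u_k^*\big(t - \tfrac{\tau_1}{2}\big)\, e^{-j2\pi f \tau_1} d\tau_1 \bigg] \bigg[ \sum_{l=1}^{\infty} \lambda_l \int_{\tau_2} u_l^*\big(t + \tfrac{\tau_2}{2}\big)\, u_l\big(t - \tfrac{\tau_2}{2}\big)\, e^{j2\pi f \tau_2} d\tau_2 \bigg] dt\, df \\
&= \int_t \int_{\tau_1} \int_{\tau_2} \Big( \frac{t}{T} \Big)^2 \sum_{k=1}^{\infty} \sum_{l=1}^{\infty} \lambda_k \lambda_l\, u_k\big(t + \tfrac{\tau_1}{2}\big)\, u_k^*\big(t - \tfrac{\tau_1}{2}\big)\, u_l^*\big(t + \tfrac{\tau_2}{2}\big)\, u_l\big(t - \tfrac{\tau_2}{2}\big)\, \delta(\tau_1 - \tau_2)\, d\tau_1\, d\tau_2\, dt \\
&= \int_t \int_{\tau_1} \Big( \frac{t}{T} \Big)^2 \sum_{k=1}^{\infty} \sum_{l=1}^{\infty} \lambda_k \lambda_l\, u_k\big(t + \tfrac{\tau_1}{2}\big)\, u_k^*\big(t - \tfrac{\tau_1}{2}\big)\, u_l^*\big(t + \tfrac{\tau_1}{2}\big)\, u_l\big(t - \tfrac{\tau_1}{2}\big)\, d\tau_1\, dt \\
&= \int_{t_1} \int_{t_2} \Big( \frac{t_1 + t_2}{2T} \Big)^2 \sum_{k=1}^{\infty} \sum_{l=1}^{\infty} \lambda_k \lambda_l\, u_k(t_1)\, u_k^*(t_2)\, u_l^*(t_1)\, u_l(t_2)\, dt_1\, dt_2 \\
&= \frac{1}{4T^2} \int_{t_1} \int_{t_2} \big( t_1^2 + 2 t_1 t_2 + t_2^2 \big) \sum_{k=1}^{\infty} \sum_{l=1}^{\infty} \lambda_k \lambda_l\, u_k(t_1)\, u_k^*(t_2)\, u_l^*(t_1)\, u_l(t_2)\, dt_1\, dt_2 \\
&= \frac{1}{4} \sum_{k=1}^{\infty} \sum_{l=1}^{\infty} \lambda_k \lambda_l \Big[ \langle M^2 u_k, u_l \rangle \langle u_l, u_k \rangle + 2 \langle M u_k, u_l \rangle \langle M u_l, u_k \rangle + \langle M^2 u_l, u_k \rangle \langle u_k, u_l \rangle \Big] \\
&= \frac{1}{2} \sum_{k=1}^{\infty} \sum_{l=1}^{\infty} \lambda_k \lambda_l \Big[ \langle M^2 u_k, u_k \rangle\, \delta_{k,l} + \big| \langle M u_k, u_l \rangle \big|^2 \Big] \\
&\ge \frac{1}{2} \sum_{k=1}^{\infty} \lambda_k^2\, \langle M^2 u_k, u_k \rangle .
\end{aligned}$$
In a similar way, one can show that
$$\int_t \int_f (T f)^2 \big| \overline{W}_x^{(0)}(t,f) \big|^2 dt\, df \ge \frac{1}{2} \sum_{k=1}^{\infty} \lambda_k^2\, \langle D^2 u_k, u_k \rangle .$$
Hence, recalling that $H_T = M^2 + D^2$, we obtain
$$\int_t \int_f \Big[ \Big( \frac{t}{T} \Big)^2 + (T f)^2 \Big] \big| \overline{W}_x^{(0)}(t,f) \big|^2 dt\, df \ge \frac{1}{2} \sum_{k=1}^{\infty} \lambda_k^2\, \langle H_T u_k, u_k \rangle \ge \frac{1}{2} \sum_{k=1}^{\infty} \lambda_k^2\, \gamma_k = \frac{1}{4\pi} \sum_{k=1}^{\infty} \lambda_k^2\, (2k - 1) ,$$
where we have used (3.43) with $c_k = \lambda_k^2$ and $f_k(t) = u_k(t)$. Hence, with $\big\| \overline{W}_x^{(\alpha)} \big\|_2^2 = \sum_{k=1}^{\infty} \lambda_k^2$, we have
$$\tilde{\rho}_x^2(0) = \frac{1}{\big\| \overline{W}_x^{(0)} \big\|_2^2} \int_t \int_f \Big[ \Big( \frac{t}{T} \Big)^2 + (T f)^2 \Big] \big| \overline{W}_x^{(0)}(t,f) \big|^2 dt\, df \ge \frac{1}{4\pi} \frac{\sum_{k=1}^{\infty} \lambda_k^2 (2k - 1)}{\sum_{k=1}^{\infty} \lambda_k^2} = \frac{1}{2\pi} \Big( \tilde{\kappa}_x - \frac{1}{2} \Big) . \tag{3.46}$$
Next, we consider the case $\alpha \neq 0$. We note that the 2-D Fourier transform of $t\, \overline{W}_x^{(\alpha)}(t,f)$ equals
$$-\frac{1}{j2\pi} \frac{\partial}{\partial \nu} \bar{A}_x^{(\alpha)}(\tau,\nu) = -\frac{1}{j2\pi} \frac{\partial}{\partial \nu} \Big[ e^{-j2\pi\alpha\tau\nu} \bar{A}_x^{(0)}(\tau,\nu) \Big] = -\frac{1}{j2\pi}\, e^{-j2\pi\alpha\tau\nu} \Big[ \frac{\partial}{\partial \nu} \bar{A}_x^{(0)}(\tau,\nu) - j2\pi\alpha\tau\, \bar{A}_x^{(0)}(\tau,\nu) \Big] ,$$
where (B.54) has been used. Using Parseval's relation, we then obtain
$$\begin{aligned}
\int_t \int_f \Big( \frac{t}{T} \Big)^2 \big| \overline{W}_x^{(\alpha)}(t,f) \big|^2 dt\, df
&= \frac{1}{T^2} \int_\tau \int_\nu \bigg| \frac{1}{j2\pi}\, e^{-j2\pi\alpha\tau\nu} \Big[ \frac{\partial}{\partial \nu} \bar{A}_x^{(0)}(\tau,\nu) - j2\pi\alpha\tau\, \bar{A}_x^{(0)}(\tau,\nu) \Big] \bigg|^2 d\tau\, d\nu \\
&= \frac{1}{(2\pi T)^2} \bigg[ \int_\tau \int_\nu \Big| \frac{\partial}{\partial \nu} \bar{A}_x^{(0)}(\tau,\nu) \Big|^2 d\tau\, d\nu + \int_\tau \int_\nu I(\tau,\nu)\, d\tau\, d\nu + 4\pi^2 \alpha^2 \int_\tau \int_\nu \tau^2 \big| \bar{A}_x^{(0)}(\tau,\nu) \big|^2 d\tau\, d\nu \bigg] ,
\end{aligned}$$
where $I(\tau,\nu) = 2\, \Re\big\{ j2\pi\alpha\tau\, \bar{A}_x^{(0)*}(\tau,\nu)\, \frac{\partial}{\partial \nu} \bar{A}_x^{(0)}(\tau,\nu) \big\}$ collects the cross terms. Since $\bar{A}_x^{(0)}(-\tau,-\nu) = \bar{A}_x^{(0)*}(\tau,\nu)$, there is $I(-\tau,-\nu) = -I(\tau,\nu)$, so that the second term vanishes and we have
$$\begin{aligned}
\int_t \int_f \Big( \frac{t}{T} \Big)^2 \big| \overline{W}_x^{(\alpha)}(t,f) \big|^2 dt\, df
&= \frac{1}{(2\pi T)^2} \int_\tau \int_\nu \Big| \frac{\partial}{\partial \nu} \bar{A}_x^{(0)}(\tau,\nu) \Big|^2 d\tau\, d\nu + \frac{\alpha^2}{T^2} \int_\tau \int_\nu \tau^2 \big| \bar{A}_x^{(0)}(\tau,\nu) \big|^2 d\tau\, d\nu \\
&= \int_t \int_f \Big( \frac{t}{T} \Big)^2 \big| \overline{W}_x^{(0)}(t,f) \big|^2 dt\, df + \frac{\alpha^2}{T^2}\, \|\bar{A}_x\|_2^2\, \big( M_x^{(1,0)} \big)^2 .
\end{aligned} \tag{3.47}$$
In a similar way it can be shown that
$$\int_t \int_f (T f)^2 \big| \overline{W}_x^{(\alpha)}(t,f) \big|^2 dt\, df = \int_t \int_f (T f)^2 \big| \overline{W}_x^{(0)}(t,f) \big|^2 dt\, df + (\alpha T)^2\, \|\bar{A}_x\|_2^2\, \big( M_x^{(0,1)} \big)^2 . \tag{3.48}$$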
The expression (3.44) then follows upon combination of (3.47) and (3.48). Finally, (3.45) follows by applying (3.46) to (3.44).

The foregoing theorem shows two things. First, from (3.44) it is seen that the Wigner-Ville spectrum, i.e., the GWVS with $\alpha = 0$, has maximum TF concentration within the entire GWVS family. Only in the case of processes with small moments $M_x^{(1,0)}$ and $M_x^{(0,1)}$, i.e., for underspread processes, will the TF concentration of the other GWVS members be nearly as good as that of the Wigner-Ville spectrum, i.e., $\tilde{\rho}_x^2(\alpha) \approx \tilde{\rho}_x^2(0)$. Second, according to (3.45), the TF concentration is bounded from below, with the lower bound being determined by $\tilde{\kappa}_x$ and the moments $M_x^{(1,0)}$ and $M_x^{(0,1)}$. The parameter $\tilde{\kappa}_x$ again is a measure of the spread of the eigenvalues $\lambda_k$ of the correlation operator $R_x$ and hence can be interpreted as the effective rank of $R_x$. We note that $\alpha = 0$ yields the smallest lower bound in (3.45), i.e.,
$$\tilde{\rho}_x^2(0) \ge \frac{1}{2\pi} \Big( \tilde{\kappa}_x - \frac{1}{2} \Big) .$$
Since $\tilde{\kappa}_x \ge 1$, it finally is seen that $\tilde{\rho}_x^2(0) \ge \frac{1}{4\pi}$ for any finite-energy process.

From the uncertainty relations derived above, we can conclude an important general rule: the greater the effective rank (the broader the KL eigenvalue spectrum) of the correlation operator $R_x$, the larger the TF support region of the GWVS.
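The dependence of the lower bound (3.45) on the GWVS parameter and on the eigenvalue spread can be made concrete with a few lines of code. The following is a minimal sketch; the function names `squared_effective_rank` and `gwvs_radius_bound` as well as the numerical moment values are assumptions chosen purely for illustration.

```python
import math

def squared_effective_rank(eigvals):
    """Effective rank based on squared eigenvalues: sum_k k*lam_k^2 / sum_k lam_k^2."""
    lam = sorted(eigvals, reverse=True)
    return (sum(k * l ** 2 for k, l in enumerate(lam, start=1))
            / sum(l ** 2 for l in lam))

def gwvs_radius_bound(eigvals, alpha, M10, M01, T=1.0):
    """Lower bound (3.45) on the modified squared TF radius of the GWVS."""
    rank_term = (squared_effective_rank(eigvals) - 0.5) / (2.0 * math.pi)
    spread_term = alpha ** 2 * ((M10 / T) ** 2 + (T * M01) ** 2)
    return rank_term + spread_term

eigvals = [4.0, 2.0, 1.0]  # hypothetical KL eigenvalues
for alpha in (0.0, 0.25, 0.5):
    print(alpha, gwvs_radius_bound(eigvals, alpha, M10=0.1, M01=0.1))
```

The bound is smallest for $\alpha = 0$ and grows quadratically in $\alpha$, with a slope set by the GEAF moments; for small moments (underspread processes) the $\alpha$-dependence is weak.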
3.2.2 Generalized Evolutionary Spectrum

Besides the GWVS, a second approach to defining a time-varying spectrum is based on the innovations system representation (3.19) of the random process $x(t)$. The generalized evolutionary spectrum (GES) can be defined in analogy to (1.4) as the squared magnitude of the TF transfer function (generalized Weyl symbol) of an innovations system $H \in I_x$ of $x(t)$ [147, 148, 170],
$$G_x^{(\alpha)}(t,f) \triangleq \big| L_H^{(\alpha)}(t,f) \big|^2 .$$
Note that $G_x^{(\alpha)}(t,f)$ depends on the choice of $H$ (we recall from Subsection 3.1.5 that $H$ is not uniquely defined). From the above definition, it is seen that the GES, in contrast to the GWVS, is always nonnegative. For $\alpha = 1/2$, the GES reduces to the (ordinary) evolutionary spectrum as originally defined by Priestley [170]. For $\alpha = -1/2$, the transitory evolutionary spectrum [49] is obtained. (We note that in the case of a self-adjoint innovations system $H = H^+$, the evolutionary spectrum and the transitory evolutionary spectrum are equal since $L_{H^+}^{(\alpha)}(t,f) = L_H^{(-\alpha)*}(t,f)$.) A further special case of the GES is the Weyl spectrum [147] obtained with $\alpha = 0$ and the positive semi-definite innovations system $H = \sqrt{R_x}$. Desirable properties of the GES are discussed in detail in [148].

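The GES definition can be illustrated in discrete time for $\alpha = 1/2$. The following is a minimal sketch, assuming a process specified by its KL expansion, the positive semi-definite innovations system $H_p = \sum_k \sqrt{\lambda_k}\, u_k \otimes u_k^*$, and a cyclic discretization of the symbol; the function names `innovations_system` and `evolutionary_spectrum` are illustrative assumptions. As a sanity check, a cyclically stationary process (DFT vectors as KL eigenfunctions) should yield a GES that equals its power spectrum at every time instant.

```python
import cmath
import math

def innovations_system(eigvals, eigvecs):
    """Positive semi-definite innovations system H_p = sum_k sqrt(lam_k) u_k u_k^H."""
    N = len(eigvecs[0])
    return [[sum(math.sqrt(l) * u[n] * u[m].conjugate()
                 for l, u in zip(eigvals, eigvecs))
             for m in range(N)] for n in range(N)]

def evolutionary_spectrum(H):
    """GES for alpha = 1/2: |L[n][k]|^2 with L[n][k] = sum_m H[n][m] e^{-j 2 pi k (n-m)/N}."""
    N = len(H)
    G = []
    for n in range(N):
        row = []
        for k in range(N):
            L = sum(H[n][m] * cmath.exp(-2j * math.pi * k * (n - m) / N)
                    for m in range(N))
            row.append(abs(L) ** 2)
        G.append(row)
    return G

N = 8
psd = [2.0 + math.cos(2.0 * math.pi * k / N) for k in range(N)]  # made-up power spectrum
dft_vectors = [[cmath.exp(2j * math.pi * q * n / N) / math.sqrt(N) for n in range(N)]
               for q in range(N)]
G = evolutionary_spectrum(innovations_system(psd, dft_vectors))
print([round(g, 6) for g in G[0]])  # first time instant; compare with psd
```

The nonnegativity of the GES is built into the squared magnitude, in contrast to the GWVS; the choice of innovations system, however, is an assumption of this sketch and other choices give other GES.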
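The usefulness of the GES definition relies on the interpretation of the GWS $L_H^{(\alpha)}(t,f)$ as a TF transfer function (i.e., TF domain weighting). In Chapter 2 it was seen that such an interpretation requires the system $H$ to be underspread. According to Subsection 3.1.5, underspread innovations systems $H$ produce underspread processes $x(t)$, and thus it follows that the GES has a useful interpretation only for underspread processes. For overspread processes, the innovations system $H$ is necessarily overspread as well. Here, the GWS of $H$ will feature oscillating cross terms whose squared magnitude will appear in the GES. Thus, for overspread processes, the GES features statistical cross terms just as the GWVS. However, the appearance of the GES cross terms is different from that of the GWVS cross terms (see further below).

The following result establishes smoothness properties of the GES in terms of moments of the innovations system (see also Section 3.1.5).

Theorem 3.11. For any random process $x(t)$ and any innovations system $H \in I_x$, the partial derivatives of the GES (defined using $H$) are bounded as
$$\left| \frac{\partial^{k+l} G_x^{(\alpha)}(t,f)}{\partial t^l\, \partial f^k} \right| \le (2\pi)^{k+l}\, \|S_H\|_1^2 \sum_{i=0}^{k} \sum_{j=0}^{l} \binom{k}{i} \binom{l}{j}\, m_H^{(i,j)}\, m_H^{(k-i,l-j)} .$$

Proof. The 2-D Fourier transform of $G_x^{(\alpha)}(t,f) = L_H^{(\alpha)}(t,f)\, L_H^{(\alpha)*}(t,f)$ is given by the convolution of $S_H^{(\alpha)}(\tau,\nu)$ and $S_H^{(\alpha)*}(-\tau,-\nu)$. Since $S_H^{(\alpha)*}(-\tau,-\nu) = S_{H^+}^{(-\alpha)}(\tau,\nu)$, there is
$$G_x^{(\alpha)}(t,f) = \int_\tau \int_\nu \big( S_H^{(\alpha)} ** S_{H^+}^{(-\alpha)} \big)(\tau,\nu)\, e^{j2\pi(\nu t - \tau f)}\, d\tau\, d\nu ,$$
and thus
$$\frac{\partial^{k+l} G_x^{(\alpha)}(t,f)}{\partial t^l\, \partial f^k} = (-1)^k (j2\pi)^{k+l} \int_\tau \int_\nu \tau^k \nu^l\, \big( S_H^{(\alpha)} ** S_{H^+}^{(-\alpha)} \big)(\tau,\nu)\, e^{j2\pi(\nu t - \tau f)}\, d\tau\, d\nu .$$
Therefore, the magnitude of the partial derivatives of the GES is bounded as
$$\begin{aligned}
\left| \frac{\partial^{k+l} G_x^{(\alpha)}(t,f)}{\partial t^l\, \partial f^k} \right|
&\le (2\pi)^{k+l} \int_\tau \int_\nu |\tau|^k |\nu|^l\, \bigg| \int_{\tau'} \int_{\nu'} S_H^{(\alpha)}(\tau',\nu')\, S_{H^+}^{(-\alpha)}(\tau - \tau', \nu - \nu')\, d\tau'\, d\nu' \bigg|\, d\tau\, d\nu \\
&\le (2\pi)^{k+l} \int_\tau \int_\nu \int_{\tau'} \int_{\nu'} |\tau|^k |\nu|^l\, \big| S_H^{(\alpha)}(\tau',\nu') \big|\, \big| S_{H^+}^{(-\alpha)}(\tau - \tau', \nu - \nu') \big|\, d\tau'\, d\nu'\, d\tau\, d\nu \\
&= (2\pi)^{k+l} \int_{\tau_1} \int_{\nu_1} \int_{\tau'} \int_{\nu'} |\tau_1 + \tau'|^k\, |\nu_1 + \nu'|^l\, |S_H(\tau',\nu')|\, |S_{H^+}(\tau_1,\nu_1)|\, d\tau'\, d\nu'\, d\tau_1\, d\nu_1 \\
&\le (2\pi)^{k+l} \int_{\tau_1} \int_{\nu_1} \int_{\tau'} \int_{\nu'} \bigg[ \sum_{i=0}^{k} \binom{k}{i} |\tau_1|^i |\tau'|^{k-i} \bigg] \bigg[ \sum_{j=0}^{l} \binom{l}{j} |\nu_1|^j |\nu'|^{l-j} \bigg] |S_H(\tau',\nu')|\, |S_{H^+}(\tau_1,\nu_1)|\, d\tau'\, d\nu'\, d\tau_1\, d\nu_1 \\
&= (2\pi)^{k+l} \sum_{i=0}^{k} \sum_{j=0}^{l} \binom{k}{i} \binom{l}{j} \int_{\tau_1} \int_{\nu_1} |\tau_1|^i |\nu_1|^j\, |S_{H^+}(\tau_1,\nu_1)|\, d\tau_1\, d\nu_1 \int_{\tau'} \int_{\nu'} |\tau'|^{k-i} |\nu'|^{l-j}\, |S_H(\tau',\nu')|\, d\tau'\, d\nu' \\
&= (2\pi)^{k+l}\, \|S_H\|_1^2 \sum_{i=0}^{k} \sum_{j=0}^{l} \binom{k}{i} \binom{l}{j}\, m_{H^+}^{(i,j)}\, m_H^{(k-i,l-j)} ,
\end{aligned}$$
from which the final result follows since $m_{H^+}^{(k,l)} = m_H^{(k,l)}$.

It is thus seen that for underspread processes with underspread innovations systems having small GSF moments, the GES is a 2-D lowpass function, i.e., smooth. In contrast, for overspread processes the GES features statistical cross terms. These statistical GES cross terms can be related to the TF displacements of the innovations system $H$ as follows.

Corollary 3.12. For any innovations system $H \in I_x$ of a nonstationary random process $x(t)$, there is
$$\int_\tau \int_\nu W_H^{(\alpha)}(t,f;\tau,\nu)\, d\tau\, d\nu = G_x^{(\alpha)}(t,f) , \tag{3.49}$$
with $G_x^{(\alpha)}(t,f)$ defined using $H$.

Proof. Equation (3.49) is a restatement of (B.34) with $\big| L_H^{(\alpha)}(t,f) \big|^2 = G_x^{(\alpha)}(t,f)$.

Since according to [90] the generalized transfer Wigner distribution $W_H^{(\alpha)}(t,f;\tau,\nu)$ characterizes the energy transfer between the TF points $(t_1,f_1) = \big(t + (\tfrac{1}{2} - \alpha)\tau,\, f + (\tfrac{1}{2} + \alpha)\nu\big)$ and $(t_2,f_2) = \big(t - (\tfrac{1}{2} + \alpha)\tau,\, f - (\tfrac{1}{2} - \alpha)\nu\big)$, which are separated by $\tau$ in time and by $\nu$ in frequency, Corollary 3.12 states that the value of the GES at any particular TF analysis point $(t_0,f_0)$ subsumes all TF displacements of the innovations system $H$. The other way around, if $H$ causes TF displacements between the TF points $(t_1,f_1)$ and $(t_2,f_2)$, i.e., $W_H(t_1,f_1;t_2,f_2) \neq 0$, then these displacements will contribute to the GES value at the TF point $(t,f) = \big( \frac{t_1+t_2}{2} - \alpha(t_1 - t_2),\, \frac{f_1+f_2}{2} + \alpha(f_1 - f_2) \big)$.
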
The cross terms of the GES will be further illustrated by the following example.

Example (continued). We again reconsider the example processes $x(t)$ and $\tilde{x}(t)$ from Subsections 3.1.6 and 3.2.1. From the eigenexpansion (3.22) of the correlation operator $R_x$, the positive semi-definite innovations system $H_p = \sqrt{R_x}$ of the process $x(t)$ is obtained as
$$H_p = \sum_{k=1}^{N} \sqrt{\lambda_k^{(1)}}\; v_k^{(1)} \otimes v_k^{(1)*} + \sum_{k=1}^{N} \sqrt{\lambda_k^{(2)}}\; v_k^{(2)} \otimes v_k^{(2)*} .$$
Since the basis functions $v_k^{(i)}(t)$ are well TF localized, the above eigenexpansion shows that the innovations system $H_p$ introduces only small TF displacements, i.e., it is underspread. The GES of $x(t)$ is given by
$$\begin{aligned}
G_x^{(\alpha)}(t,f) &= \bigg| \sum_{k=1}^{N} \sqrt{\lambda_k^{(1)}}\, W^{(\alpha)}_{v_k^{(1)}}(t,f) + \sum_{k=1}^{N} \sqrt{\lambda_k^{(2)}}\, W^{(\alpha)}_{v_k^{(2)}}(t,f) \bigg|^2 \\
&= \bigg| \sum_{k=1}^{N} \sqrt{\lambda_k^{(1)}}\, W^{(\alpha)}_{v_k^{(1)}}(t,f) \bigg|^2 + \bigg| \sum_{k=1}^{N} \sqrt{\lambda_k^{(2)}}\, W^{(\alpha)}_{v_k^{(2)}}(t,f) \bigg|^2 + 2\, \Re\bigg\{ \sum_{k=1}^{N} \sum_{l=1}^{N} \sqrt{\lambda_k^{(1)} \lambda_l^{(2)}}\; W^{(\alpha)}_{v_k^{(1)}}(t,f)\, W^{(\alpha)*}_{v_l^{(2)}}(t,f) \bigg\} .
\end{aligned}$$
Since the basis functions $v_k^{(1)}$ and $v_k^{(2)}$ are assumed to be TF disjoint, i.e., $W^{(\alpha)}_{v_k^{(1)}}(t,f)\, W^{(\alpha)*}_{v_l^{(2)}}(t,f) = 0$, the last term in the above expression vanishes. Evaluating the square in the remaining two terms and neglecting the cross terms $W^{(\alpha)}_{v_k^{(i)}}(t,f)\, W^{(\alpha)*}_{v_l^{(i)}}(t,f)$, $k \neq l$, on the basis of the orthogonality$^5$ of the sets $\{v_k^{(i)}\}_{k=1\ldots N}$, the following approximation for the GES of $x(t)$ is obtained,
$$G_x^{(\alpha)}(t,f) \approx \sum_{k=1}^{N} \lambda_k^{(1)} \big| W^{(\alpha)}_{v_k^{(1)}}(t,f) \big|^2 + \sum_{k=1}^{N} \lambda_k^{(2)} \big| W^{(\alpha)}_{v_k^{(2)}}(t,f) \big|^2 .$$
Thus, it is seen that the GES of $x(t)$ essentially features contributions only in the TF regions $R_1$ and $R_2$ where the basis functions $v_k^{(1)}$ and $v_k^{(2)}$ are respectively localized. This is illustrated in Figs. 3.6(a) and (b) for an example process with $N = 30$.

$^5$ Via Moyal's relation [151], the orthogonality condition $\langle v_k^{(i)}, v_l^{(i)} \rangle = \delta_{kl}$ actually implies $\big\langle W^{(\alpha)}_{v_k^{(i)}}, W^{(\alpha)}_{v_l^{(i)}} \big\rangle = \delta_{kl}$. This is equivalent to $W^{(\alpha)}_{v_k^{(i)}}(t,f)\, W^{(\alpha)*}_{v_l^{(i)}}(t,f) \approx 0$, $k \neq l$, if the GWDs involved are effectively positive.

Next, we consider the overspread process $\tilde{x}(t)$. Using (3.25), (3.26), and (3.27), it can be verified that the positive semi-definite innovations system $\tilde{H}_p = \sqrt{R_{\tilde{x}}}$ of $\tilde{x}(t)$ is given by
$$\begin{aligned}
\tilde{H}_p &= \sum_{k=1}^{N} \sqrt{\tilde{\lambda}_k^{(1)}}\; \tilde{v}_k^{(1)} \otimes \tilde{v}_k^{(1)*} + \sum_{k=1}^{N} \sqrt{\tilde{\lambda}_k^{(2)}}\; \tilde{v}_k^{(2)} \otimes \tilde{v}_k^{(2)*} \\
&= \sum_{k=1}^{N} \beta_k^{(1)}\, v_k^{(1)} \otimes v_k^{(1)*} + \sum_{k=1}^{N} \beta_k^{(2)}\, v_k^{(2)} \otimes v_k^{(2)*} + \sum_{k=1}^{N} \beta_k \Big[ v_k^{(1)} \otimes v_k^{(2)*} + v_k^{(2)} \otimes v_k^{(1)*} \Big]
\end{aligned}$$
with
$$\beta_k^{(1)} = \sqrt{\tilde{\lambda}_k^{(1)}} \cos^2(\theta_k) + \sqrt{\tilde{\lambda}_k^{(2)}} \sin^2(\theta_k) , \quad \beta_k^{(2)} = \sqrt{\tilde{\lambda}_k^{(1)}} \sin^2(\theta_k) + \sqrt{\tilde{\lambda}_k^{(2)}} \cos^2(\theta_k) , \quad \beta_k = \Big( \sqrt{\tilde{\lambda}_k^{(1)}} - \sqrt{\tilde{\lambda}_k^{(2)}} \Big) \cos(\theta_k) \sin(\theta_k) .$$
From this expression, it is seen that the innovations system $\tilde{H}_p$ is able to transfer energy between the TF regions $R_1$ and $R_2$, and thus represents an overspread system. The GES obtained with the innovations system $\tilde{H}_p$ is (approximately) given by
$$\begin{aligned}
G_{\tilde{x}}^{(\alpha)}(t,f) = \big| L_{\tilde{H}_p}^{(\alpha)}(t,f) \big|^2
&\approx \bigg| \sum_{k=1}^{N} \beta_k^{(1)} W^{(\alpha)}_{v_k^{(1)}}(t,f) \bigg|^2 + \bigg| \sum_{k=1}^{N} \beta_k^{(2)} W^{(\alpha)}_{v_k^{(2)}}(t,f) \bigg|^2 + \bigg| \sum_{k=1}^{N} \beta_k \Big[ W^{(\alpha)}_{v_k^{(1)},v_k^{(2)}}(t,f) + W^{(\alpha)}_{v_k^{(2)},v_k^{(1)}}(t,f) \Big] \bigg|^2 \\
&\approx \sum_{k=1}^{N} \big( \beta_k^{(1)} \big)^2 \big| W^{(\alpha)}_{v_k^{(1)}}(t,f) \big|^2 + \sum_{k=1}^{N} \big( \beta_k^{(2)} \big)^2 \big| W^{(\alpha)}_{v_k^{(2)}}(t,f) \big|^2 + \bigg| \sum_{k=1}^{N} \beta_k \Big[ W^{(\alpha)}_{v_k^{(1)},v_k^{(2)}}(t,f) + W^{(\alpha)}_{v_k^{(2)},v_k^{(1)}}(t,f) \Big] \bigg|^2 ,
\end{aligned}$$
where we used the TF disjointness of $\{v_k^{(1)}(t)\}$ and $\{v_k^{(2)}(t)\}$ and neglected all terms involving $W^{(\alpha)}_{v_k^{(i)}}(t,f) \big[ W^{(\alpha)}_{v_k^{(i)},v_k^{(j)}}(t,f) + W^{(\alpha)}_{v_k^{(j)},v_k^{(i)}}(t,f) \big]^*$ (which is justified if the TF regions $R_1$ and $R_2$ are well separated). It is seen that $G_{\tilde{x}}^{(\alpha)}(t,f)$ features cross GWDs $W^{(\alpha)}_{v_k^{(i)},v_k^{(j)}}(t,f)$; these correspond to oscillating statistical cross terms that are due to TF correlations (see Figs. 3.6(c),(d)). In the GES, these cross terms appear squared and thus cannot be reduced by smoothing the GES. However, they can be suppressed or reduced by smoothing the GWS of the innovations system before squaring. This will be considered in Section 3.4.

Figure 3.6: Illustration of the GES of the underspread example process $x(t)$ and the overspread example process $\tilde{x}(t)$ (the same processes as in Fig. 3.5): (a) $G_x^{(0)}(t,f)$, (b) $G_x^{(1/2)}(t,f)$, (c) $G_{\tilde{x}}^{(0)}(t,f)$, (d) $G_{\tilde{x}}^{(1/2)}(t,f)$. In (c) and (d), statistical cross terms are clearly visible. The number of signal samples is 256, (normalized) frequency ranges from $-1/4$ to $1/4$.

Approximate Uniqueness of the GES

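Like the GWVS, the GES depends on the parameter $\alpha$. However, we next show that for underspread processes the GES is approximately independent of $\alpha$. (A related result for DL underspread innovations systems can be found in [118].) We recall from Subsection 3.1.5 that for an underspread process $x(t)$, we can always find an underspread innovations system $\mathbf{H} \in \mathcal{I}_x$.

Theorem 3.13. For any random process $x(t)$ and any innovations system $\mathbf{H} \in \mathcal{I}_x$, the difference between two GES with parameters $\alpha_1$ and $\alpha_2$, both defined using $\mathbf{H}$, is bounded as
\[
\frac{\big| G_x^{(\alpha_1)}(t,f) - G_x^{(\alpha_2)}(t,f) \big|}{\| S_{\mathbf{H}} \|_1^2} \le 4\pi\, |\alpha_1 - \alpha_2|\, m_{\mathbf{H}}^{(1,1)} . \tag{3.50}
\]

Proof. There is
\[
\big| G_x^{(\alpha_1)}(t,f) - G_x^{(\alpha_2)}(t,f) \big| \le \int_\tau \int_\nu \big| \hat{G}_x^{(\alpha_1)}(\tau,\nu) - \hat{G}_x^{(\alpha_2)}(\tau,\nu) \big| \, d\tau\, d\nu , \tag{3.51}
\]
with $\hat{G}_x^{(\alpha)}(\tau,\nu)$ denoting the 2-D Fourier transform of $G_x^{(\alpha)}(t,f)$. Since $\hat{G}_x^{(\alpha)}(\tau,\nu)$ equals the convolution of $S_{\mathbf{H}}^{(\alpha)}(\tau,\nu)$ and $S_{\mathbf{H}}^{(\alpha)*}(-\tau,-\nu)$, using (B.5) the 2-D Fourier transform of $G_x^{(\alpha_1)}(t,f) - G_x^{(\alpha_2)}(t,f)$ can be written as
\begin{align*}
\hat{G}_x^{(\alpha_1)}(\tau,\nu) - \hat{G}_x^{(\alpha_2)}(\tau,\nu)
&= \int_{\tau'} \int_{\nu'} \Big[ S_{\mathbf{H}}^{(\alpha_1)}(\tau',\nu')\, S_{\mathbf{H}}^{(\alpha_1)*}(\tau'-\tau,\nu'-\nu) - S_{\mathbf{H}}^{(\alpha_2)}(\tau',\nu')\, S_{\mathbf{H}}^{(\alpha_2)*}(\tau'-\tau,\nu'-\nu) \Big] d\tau'\, d\nu' \\
&= \int_{\tau'} \int_{\nu'} S_{\mathbf{H}}^{(\alpha_1)}(\tau',\nu')\, S_{\mathbf{H}}^{(\alpha_1)*}(\tau'-\tau,\nu'-\nu) \Big[ 1 - e^{j2\pi(\alpha_1-\alpha_2)[\tau'\nu' - (\tau'-\tau)(\nu'-\nu)]} \Big] d\tau'\, d\nu' .
\end{align*}
Inserting the last expression in (3.51) and substituting $\tau_1 = \tau' - \tau$ and $\nu_1 = \nu' - \nu$, we further obtain
\begin{align*}
\big| G_x^{(\alpha_1)}(t,f) - G_x^{(\alpha_2)}(t,f) \big|
&\le \int_{\tau'} \int_{\nu'} \int_{\tau_1} \int_{\nu_1} |S_{\mathbf{H}}(\tau',\nu')|\, |S_{\mathbf{H}}(\tau_1,\nu_1)|\, \Big| 1 - e^{j2\pi(\alpha_1-\alpha_2)[\tau'\nu' - \tau_1\nu_1]} \Big| \, d\tau'\, d\nu'\, d\tau_1\, d\nu_1 \\
&= 2 \int_{\tau'} \int_{\nu'} \int_{\tau_1} \int_{\nu_1} |S_{\mathbf{H}}(\tau',\nu')|\, |S_{\mathbf{H}}(\tau_1,\nu_1)|\, \Big| \sin\!\big( \pi(\alpha_1-\alpha_2)[\tau'\nu' - \tau_1\nu_1] \big) \Big| \, d\tau'\, d\nu'\, d\tau_1\, d\nu_1 \\
&\le 2\pi\, |\alpha_1-\alpha_2| \int_{\tau'} \int_{\nu'} \int_{\tau_1} \int_{\nu_1} |S_{\mathbf{H}}(\tau',\nu')|\, |S_{\mathbf{H}}(\tau_1,\nu_1)|\, \big( |\tau'\nu'| + |\tau_1\nu_1| \big) \, d\tau'\, d\nu'\, d\tau_1\, d\nu_1 \\
&= 4\pi\, |\alpha_1-\alpha_2|\, \| S_{\mathbf{H}} \|_1^2\, m_{\mathbf{H}}^{(1,1)} .
\end{align*}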
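The step from the phase factor to the sine term in the proof rests on the elementary relation $|1 - e^{jx}| = 2|\sin(x/2)| \le |x|$ for real $x$. The following Python sketch checks this relation numerically on a grid of phases; the function names are illustrative and not part of the thesis.

```python
import cmath
import math


def phase_gap(x: float) -> float:
    """Return |1 - exp(jx)| for a real phase x."""
    return abs(1 - cmath.exp(1j * x))


def check_phase_bound(samples) -> bool:
    """Check |1 - e^{jx}| = 2|sin(x/2)| <= |x| on the given phases."""
    for x in samples:
        gap = phase_gap(x)
        # Identity used to pass from the exponential to the sine.
        assert math.isclose(gap, 2 * abs(math.sin(x / 2)), abs_tol=1e-12)
        # Inequality used to pass from the sine to the moment m^(1,1).
        assert gap <= abs(x) + 1e-12
    return True
```

Calling `check_phase_bound` on any list of real phases should return `True`; the inequality is tight only near $x = 0$, which is why the resulting bound is most useful when the spreading function is concentrated where $|\tau\nu|$ is small.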
The bound (3.50) shows that for small $m_{\mathbf{H}}^{(1,1)}$, GES with different $\alpha$ (but using the same innovations system) are approximately equal,
\[
G_x^{(\alpha_1)}(t,f) \approx G_x^{(\alpha_2)}(t,f) .
\]
An example for this approximation with $\alpha_1 = 0$, $\alpha_2 = 1/2$, and $\mathbf{H} = \sqrt{\mathbf{R}_x}$ is shown in parts (a) and (b) of Fig. 3.6. The maximum normalized error in this example was $\max_{t,f} \big| G_x^{(0)}(t,f) - G_x^{(1/2)}(t,f) \big| / \| S_{\mathbf{H}} \|_1^2 = 0.005$ while the corresponding bound in (3.50) was $2\pi\, m_{\mathbf{H}}^{(1,1)} = 0.054$. Note that small $m_{\mathbf{H}}^{(1,1)}$ implies that the innovations system $\mathbf{H}$ (and hence the process $x(t)$) is underspread with GSF concentrated along the $\tau$ axis and/or $\nu$ axis. Innovations systems whose GSF is oriented in oblique directions will not have small $m_{\mathbf{H}}^{(1,1)}$. Indeed, the GES with $\alpha = 0$ is known to be particularly suited to processes with components that are obliquely oriented in the TF plane [147, 148].

CL Processes. If the innovations system $\mathbf{H}$ is DL, then the resulting process will be CL and furthermore $\sigma_x \le 4\sigma_{\mathbf{H}}$. In that case, straightforward application of (2.18) to (3.50) yields the bound
\[
\frac{\big| G_x^{(\alpha_1)}(t,f) - G_x^{(\alpha_2)}(t,f) \big|}{\| S_{\mathbf{H}} \|_1^2} \le 4\pi\, |\alpha_1 - \alpha_2|\, \sigma_{\mathbf{H}} .
\]
Uncertainty Relations for the Generalized Evolutionary Spectrum

The GES, like the GWVS, can be shown to obey a TF uncertainty principle if the underlying innovations system $\mathbf{H}$ is positive (semi)definite. Hence, in the following we assume $\mathbf{H} = \mathbf{H}_p = \sqrt{\mathbf{R}_x}$. Let us measure the GES extension by the TF radius of the GES defined as
\[
\rho_x^2(\alpha) \triangleq \frac{\int_t \int_f \big[ (t/T)^2 + (Tf)^2 \big]\, G_x^{(\alpha)}(t,f)\, dt\, df}{\int_t \int_f G_x^{(\alpha)}(t,f)\, dt\, df}
= \frac{1}{\bar{E}_x} \int_t \int_f \big[ (t/T)^2 + (Tf)^2 \big]\, G_x^{(\alpha)}(t,f)\, dt\, df .
\]
By adapting and combining results from [99] and [101], we then obtain the following theorem.

Theorem 3.14. For any finite-energy process $x(t)$, the TF radius $\rho_x^2(\alpha)$ of the GES (defined with the positive (semi)definite innovations system $\mathbf{H}_p$) satisfies
\[
\rho_x^2(\alpha) = \rho_x^2(0) + \alpha^2 \Big[ \big( M_{\mathbf{H}_p}^{(1,0)} / T \big)^2 + \big( T\, M_{\mathbf{H}_p}^{(0,1)} \big)^2 \Big] .
\]
It is bounded from below as
\[
\rho_x^2(\alpha) \ge \frac{1}{2\pi} \Big( \kappa_x - \frac{1}{2} \Big) + \alpha^2 \Big[ \big( M_{\mathbf{H}_p}^{(1,0)} / T \big)^2 + \big( T\, M_{\mathbf{H}_p}^{(0,1)} \big)^2 \Big] , \tag{3.52}
\]
where
\[
\kappa_x = \frac{\sum_{k=1}^{\infty} k\, \lambda_k}{\sum_{k=1}^{\infty} \lambda_k}
\]
with $\lambda_k$ denoting the KL eigenvalues of $x(t)$.

Proof. The proof of this theorem is completely analogous to that of Theorem 3.10 with $\mathbf{R}_x$ replaced by $\mathbf{H}_p$.

The foregoing theorem shows that the TF concentration of the GES (using the positive semidefinite innovations system) is maximal for $\alpha = 0$. For $\alpha \neq 0$, the TF radius will be larger by an amount determined by the moments $M_{\mathbf{H}_p}^{(1,0)}$ and $M_{\mathbf{H}_p}^{(0,1)}$. Hence, only in the case of underspread systems where $M_{\mathbf{H}_p}^{(1,0)}$ and $M_{\mathbf{H}_p}^{(0,1)}$ are small will the TF concentration of the GES with $\alpha \neq 0$ be comparable to that of the GES with $\alpha = 0$. Furthermore, the theorem shows that $\rho_x^2(\alpha)$ is bounded from below with the lower bound determined by $\kappa_x$, which measures the effective rank of the correlation operator $\mathbf{R}_x = \sum_{k=1}^{\infty} \lambda_k\, u_k \otimes u_k^*$ (and of the positive semidefinite innovations system $\mathbf{H}_p = \sum_{k=1}^{\infty} \sqrt{\lambda_k}\, u_k \otimes u_k^*$). For the important special case $\alpha = 0$, the lower bound in (3.52) is minimal, i.e.,
\[
\rho_x^2(0) \ge \frac{1}{2\pi} \Big( \kappa_x - \frac{1}{2} \Big) .
\]
Furthermore, since $\kappa_x \ge 1$ (with $\kappa_x = 1$ for a rank-one correlation operator), there is the bound
\[
\rho_x^2(0) \ge \frac{1}{4\pi} ,
\]
which is independent of the KL eigenvalues $\lambda_k$. We conclude that the TF concentration of the GES depends on the eigenvalue spread of $\mathbf{R}_x$ and $\mathbf{H}_p$, i.e., the TF support region of the GES is larger for processes whose correlation operator has a large eigenvalue spread $\kappa_x$.
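To make the role of the eigenvalue-spread measure in Theorem 3.14 concrete, the following Python sketch evaluates the index-weighted mean $\sum_k k\lambda_k / \sum_k \lambda_k$ and the resulting lower bound on the TF radius for $\alpha = 0$. It assumes that the KL eigenvalues are arranged in nonincreasing order (the code sorts them accordingly); the function names are hypothetical.

```python
import math


def effective_rank(eigenvalues) -> float:
    """Index-weighted mean sum_k k*lambda_k / sum_k lambda_k.

    Assumption: eigenvalues are meant in nonincreasing order, so they
    are sorted that way before weighting with k = 1, 2, ...
    """
    lam = sorted((float(v) for v in eigenvalues), reverse=True)
    total = sum(lam)
    if total <= 0:
        raise ValueError("need at least one positive eigenvalue")
    return sum(k * v for k, v in enumerate(lam, start=1)) / total


def radius_lower_bound(eigenvalues) -> float:
    """Lower bound (1/(2*pi)) * (effective rank - 1/2) for alpha = 0."""
    return (effective_rank(eigenvalues) - 0.5) / (2 * math.pi)
```

For a rank-one correlation operator the measure equals 1 and the bound reduces to $1/(4\pi)$; four equal eigenvalues give 2.5, so the guaranteed minimum TF radius grows with the number of significant eigenvalues.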
3.3 Type I Time-Varying Power Spectra

According to Subsection 3.2.1, the GWVS of an overspread process contains cross terms that are due to the TF correlations of the process. While these statistical cross terms are potentially useful as they indicate TF correlations inherent in an overspread process, they may also be inconvenient as they tend to mask the auto terms that characterize the mean TF energy distribution of the process and thus indicate the TF location of the process components. In order to suppress or reduce cross terms, a TF smoothing can be applied, similar to the TF smoothing used in deterministic quadratic TF analysis [61, 84, 85]. This motivates a generalization of the GWVS which will be termed the class of type I time-varying power spectra. In this section, we provide an axiomatic definition of this class of time-varying power spectra. Furthermore, we discuss some specific members of this class and we show that for underspread processes all type I spectra are approximately equivalent and approximately satisfy desirable mathematical properties.
3.3.1 Definition and Formulations

The PSD $P_x(f)$ of a stationary process $x(t)$ is a linear transformation of the correlation function $r_x(\tau)$ that is frequency-shift covariant in the sense that
\[
\tilde{x}(t) = x(t)\, e^{j2\pi f_0 t} \quad \Longrightarrow \quad P_{\tilde{x}}(f) = P_x(f - f_0) . \tag{3.53}
\]
Similarly, for nonstationary processes, we will use linearity and the TF shift covariance property to define a class of time-varying spectra which we term type I time-varying power spectra. This class of spectra has previously been defined and studied in [3, 60, 61, 63].

The cross-correlation operator $\mathbf{R}_{x,y}$ depends linearly on $x(t)$ and sesquilinearly on $y(t)$, i.e.,
\begin{align*}
x(t) = a_1 x_1(t) + a_2 x_2(t) \quad &\Longrightarrow \quad \mathbf{R}_{x,y} = a_1 \mathbf{R}_{x_1,y} + a_2 \mathbf{R}_{x_2,y} , \\
y(t) = b_1 y_1(t) + b_2 y_2(t) \quad &\Longrightarrow \quad \mathbf{R}_{x,y} = b_1^* \mathbf{R}_{x,y_1} + b_2^* \mathbf{R}_{x,y_2} .
\end{align*}
Let us require that the time-varying (cross) power spectrum to be defined, denoted $\bar{C}_{x,y}(t,f)$ hereafter, behaves in the same way:
\begin{align*}
x(t) = a_1 x_1(t) + a_2 x_2(t) \quad &\Longrightarrow \quad \bar{C}_{x,y}(t,f) = a_1 \bar{C}_{x_1,y}(t,f) + a_2 \bar{C}_{x_2,y}(t,f) , \tag{3.54a} \\
y(t) = b_1 y_1(t) + b_2 y_2(t) \quad &\Longrightarrow \quad \bar{C}_{x,y}(t,f) = b_1^* \bar{C}_{x,y_1}(t,f) + b_2^* \bar{C}_{x,y_2}(t,f) . \tag{3.54b}
\end{align*}
This condition is satisfied if and only if $\bar{C}_{x,y}(t,f)$ depends linearly on the cross-correlation operator $\mathbf{R}_{x,y}$, i.e.,
\[
\bar{C}_{x,y}(t,f) = \int_{t_1} \int_{t_2} c(t,f;t_1,t_2)\, r_{x,y}(t_1,t_2)\, dt_1\, dt_2 ,
\]
where $c(t,f;t_1,t_2)$ is the kernel of the linear transformation $r_{x,y}(t_1,t_2) \to \bar{C}_{x,y}(t,f)$. This expression of $\bar{C}_{x,y}(t,f)$ can alternatively be viewed as the inner product of the correlation operator $\mathbf{R}_{y,x}$ and a TF-parameterized operator $\mathbf{C}_{t,f}$,
\[
\bar{C}_{x,y}(t,f) = \langle \mathbf{C}_{t,f}, \mathbf{R}_{y,x} \rangle = \mathrm{Tr}\{ \mathbf{C}_{t,f} \mathbf{R}_{x,y} \} ,
\]
where the kernel of $\mathbf{C}_{t,f}$ is given by $c_{t,f}(t_1,t_2) = c(t,f;t_2,t_1)$. If $\mathbf{C}_{t,f}$ is a TF localization operator about the TF point $(t,f)$ [42–44, 81, 88, 174, 175], then by the above formulation, $\bar{C}_x(t,f)$ can be interpreted as the mean amount of signal energy the TF localization operator $\mathbf{C}_{t,f}$ picks up about the TF point $(t,f)$.

In the above general formulation, the localization properties of the operator $\mathbf{C}_{t,f}$ may still vary with the TF analysis point $(t,f)$. This is no longer the case if, in addition to (3.54), one requires that the time-varying power spectrum is TF shift covariant, i.e.,
\[
\tilde{x}(t) = \big( \mathbf{S}^{(\alpha)}_{t_0,f_0} x \big)(t) , \quad \tilde{y}(t) = \big( \mathbf{S}^{(\alpha)}_{t_0,f_0} y \big)(t) \quad \Longrightarrow \quad \bar{C}_{\tilde{x},\tilde{y}}(t,f) = \bar{C}_{x,y}(t - t_0, f - f_0) \tag{3.55}
\]
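(note that $\mathbf{R}_{\tilde{x},\tilde{y}} = \mathbf{S}^{(\alpha)}_{t_0,f_0} \mathbf{R}_{x,y} \mathbf{S}^{(\alpha)+}_{t_0,f_0}$). This covariance property is the TF analog of the frequency-shift covariance in (3.53). It can be shown [93] that (3.55) is satisfied if and only if $\mathbf{C}_{t,f}$ has the form$^6$ [60, 63]
\[
\mathbf{C}_{t,f} = \mathbf{S}^{(\alpha)}_{t,f}\, \mathbf{C}\, \mathbf{S}^{(\alpha)+}_{t,f} \quad \Longleftrightarrow \quad c_{t,f}(t_1,t_2) = c(t_1 - t, t_2 - t)\, e^{j2\pi f (t_1 - t_2)} ,
\]
where $c(t_1,t_2)$ is the kernel of a fixed TF localization operator $\mathbf{C}$. This means that the TF localization operator $\mathbf{C}_{t,f}$ with parameters $t$ and $f$ equals a fixed TF localization operator $\mathbf{C}$ shifted by $t$ in time and by $f$ in frequency. We thus arrive at the following general definition of type I time-varying (cross) power spectra:
\[
\bar{C}_{x,y}(t,f) \triangleq \mathrm{Tr}\{ \mathbf{C}_{t,f} \mathbf{R}_{x,y} \} = \langle \mathbf{C}_{t,f}, \mathbf{R}_{y,x} \rangle \quad \text{with} \quad \mathbf{C}_{t,f} = \mathbf{S}^{(\alpha)}_{t,f}\, \mathbf{C}\, \mathbf{S}^{(\alpha)+}_{t,f} . \tag{3.56}
\]
This class of spectra is axiomatically defined by the two properties (3.54) and (3.55). The class is parameterized by the operator $\mathbf{C}$ or, equivalently, its kernel $c(t_1,t_2)$. Subsequently, we shall mostly consider auto spectra for which we simply write $\bar{C}_x(t,f) = \bar{C}_{x,x}(t,f)$. Furthermore, we assume that the spectra integrate to the total mean energy,
\[
\int_t \int_f \bar{C}_x(t,f)\, dt\, df = \bar{E}_x ,
\]
which amounts to a trace normalization of $\mathbf{C}$, i.e.,
\[
\mathrm{Tr}\{\mathbf{C}\} = \int_t \int_f L^{(\alpha)}_{\mathbf{C}}(t,f)\, dt\, df = S^{(\alpha)}_{\mathbf{C}}(0,0) = 1 . \tag{3.57}
\]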
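The definition (3.56), the normalization (3.57), and the covariance (3.55) can be tried out in a finite-dimensional setting. The following Python sketch is only an illustration under stated assumptions: signals live on a cyclic grid of length $L$, TF shifts are cyclic delays combined with modulation by integer frequency bins, and operators are $L \times L$ nested lists. All function names are hypothetical and not taken from the thesis.

```python
import cmath


def type1_spectrum(c, r, t, f):
    """Tr{C_{t,f} R} on a cyclic grid; c and r are L x L nested lists.

    The shifted kernel is c[(n1-t) % L][(n2-t) % L] * exp(j*2*pi*f*(n1-n2)/L),
    a discrete stand-in for the kernel c_{t,f}(t1, t2) of the text.
    """
    L = len(r)
    acc = 0j
    for n1 in range(L):
        for n2 in range(L):
            phase = cmath.exp(2j * cmath.pi * f * (n1 - n2) / L)
            acc += c[(n1 - t) % L][(n2 - t) % L] * phase * r[n2][n1]
    return acc


def correlation_of(x):
    """Rank-one correlation matrix r[n1][n2] = x[n1] * conj(x[n2])."""
    return [[a * b.conjugate() for b in x] for a in x]


def tf_shift(x, t0, f0):
    """Cyclic TF shift: delay by t0 samples, modulate by f0 frequency bins."""
    L = len(x)
    return [x[(n - t0) % L] * cmath.exp(2j * cmath.pi * f0 * n / L)
            for n in range(L)]
```

In this setting, summing `type1_spectrum` over all $(t,f)$ and dividing by $L$ gives $\mathrm{Tr}\{\mathbf{C}\}\,\mathrm{Tr}\{\mathbf{R}\}$, so a unit-trace kernel reproduces the total energy, and replacing `x` by `tf_shift(x, t0, f0)` shifts the spectrum by `(t0, f0)` modulo $L$.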
By comparing (3.56) with (3.2), it is seen that $\bar{C}_x(t,f)$ can be viewed as the diagonal of the TF correlation function $R_x^{(\mathbf{T})}(t_1,f_1;t_2,f_2)$ (with $\mathbf{T}$ replaced by $\mathbf{C}$), i.e.,
\[
\bar{C}_x(t,f) = R_x^{(\mathbf{C})}(t,f;t,f) \quad \text{or} \quad \bar{C}_x(t,f) = R_x^{(\mathbf{C})}(t,f;0,0) .
\]
Furthermore, under weak conditions
\[
\bar{C}_x(t,f) = E\{ C_x(t,f) \} , \quad \text{with} \quad C_x(t,f) \triangleq \langle \mathbf{C}_{t,f}\, x, x \rangle ,
\]
where $C_x(t,f)$ is recognized as a member of Cohen's class of quadratic TF signal representations [35, 61, 84]. Hence, type I spectra can be viewed as the stochastic analogue of Cohen's class of deterministic TF representations.

There exist four canonical formulations of type I spectra that are based on representations of the operators $\mathbf{C}$ and $\mathbf{R}_x$ in four different domains (related by Fourier transforms):$^7$
\begin{align*}
\bar{C}_x(t,f) = \langle \mathbf{C}_{t,f}, \mathbf{R}_x \rangle
&= \int_{t_1} \int_{t_2} c(t_1 - t, t_2 - t)\, r_x^*(t_1,t_2)\, e^{j2\pi f(t_1 - t_2)}\, dt_1\, dt_2 \tag{3.58a} \\
&= \int_{f_1} \int_{f_2} B_{\mathbf{C}}(f_1 - f, f_2 - f)\, r_X^*(f_1,f_2)\, e^{-j2\pi t(f_1 - f_2)}\, df_1\, df_2 \tag{3.58b} \\
&= \int_{t'} \int_{f'} L^{(0)}_{\mathbf{C}}(t' - t, f' - f)\, \bar{W}_x^{(0)}(t',f')\, dt'\, df' \tag{3.58c} \\
&= \int_\tau \int_\nu S^{(0)}_{\mathbf{C}}(\tau,\nu)\, \bar{A}_x^{(0)*}(\tau,\nu)\, e^{-j2\pi(t\nu - f\tau)}\, d\tau\, d\nu . \tag{3.58d}
\end{align*}

Footnote 6: Note that TF shifts $\mathbf{S}^{(\alpha)}_{t,f}\, \mathbf{H}\, \mathbf{S}^{(\alpha)+}_{t,f}$ of an operator $\mathbf{H}$ are independent of $\alpha$.

Footnote 7: The expression in (3.58b) uses the bi-frequency function $B_{\mathbf{C}}(f_1,f_2)$ that is defined in Section A.2. Furthermore, in the formulations (3.58c) and (3.58d) that involve the GWS and GSF with $\alpha = 0$, other choices for $\alpha$ are possible as well. However, the choice $\alpha = 0$ leads to more convenient expressions.

From (3.58c), it is seen that the time-varying power spectrum $\bar{C}_x(t,f)$ results from convolving the Wigner-Ville spectrum with the Weyl symbol of $\mathbf{C}$. Hence, the 2-D Fourier transform of $\bar{C}_x(t,f)$ equals the product of the expected ambiguity function of $x(t)$ and the spreading function of $\mathbf{C}$, as expressed by (3.58d). If $\mathbf{C}$ is underspread, i.e., $|S_{\mathbf{C}}(\tau,\nu)|$ is concentrated about the origin of the $(\tau,\nu)$ plane, $L^{(0)}_{\mathbf{C}}(t,f)$ will be a 2-D lowpass function and $\bar{C}_x(t,f)$ will hence be a smoothed version of the Wigner-Ville spectrum. This subclass of type I time-varying power spectra, generated by underspread TF localization operators $\mathbf{C}$, will be termed smoothed type I spectra. Obviously, smoothing can be used to suppress or at least reduce the oscillatory cross terms occurring in the Wigner-Ville spectrum of overspread processes (see also Fig. 3.8).
3.3.2 Examples

We now consider a few examples of type I spectra, all of which correspond to prominent members of Cohen's class.

Generalized Wigner-Ville Spectrum [60, 63, 140]: The GWVS $\bar{W}_x^{(\alpha)}(t,f)$ discussed in Subsection 3.2.1 represents a specific subclass of type I spectra obtained with $\mathbf{C} = \mathbf{L}^{(\alpha)}$ (see Subsection B.1.2), i.e., $S^{(0)}_{\mathbf{C}}(\tau,\nu) = e^{-j2\pi\alpha\tau\nu}$. Note that $\mathbf{L}^{(\alpha)}$ is not underspread, i.e., the GWS of $\mathbf{L}^{(\alpha)}$ is not a lowpass function and hence the GWVS members are not smoothed type I spectra.

Page's (Instantaneous Power) Spectrum [71, 77, 161]: Page was concerned about the causality of time-varying power spectra, which means that the spectrum should only depend on past values of the process. This condition led him to the definition of the instantaneous power spectrum,
\[
\bar{C}_x(t,f) = E\Big\{ \frac{\partial}{\partial t} \big| X_t^-(f) \big|^2 \Big\} \quad \text{with} \quad X_t^-(f) = \int_{-\infty}^{t} x(t')\, e^{-j2\pi f t'}\, dt' . \tag{3.59}
\]
The instantaneous power spectrum can be rewritten as
\[
\bar{C}_x(t,f) = 2\, \Re\Big\{ \int_0^\infty r_x(t, t - \tau)\, e^{-j2\pi f\tau}\, d\tau \Big\}
\]
and can be shown to be a type I spectrum with the corresponding operator $\mathbf{C}$ defined by $S^{(0)}_{\mathbf{C}}(\tau,\nu) = e^{j\pi|\tau|\nu}$.

Levin's Spectrum [71, 77, 161]: Levin augmented Page's definition by adding an anti-causal analogue of (3.59), i.e.,
\[
\bar{C}_x(t,f) = \frac{1}{2} \bigg[ E\Big\{ \frac{\partial}{\partial t} \big| X_t^-(f) \big|^2 \Big\} - E\Big\{ \frac{\partial}{\partial t} \big| X_t^+(f) \big|^2 \Big\} \bigg] ,
\]
with $X_t^+(f) = \int_t^\infty x(t')\, e^{-j2\pi f t'}\, dt'$ and $X_t^-(f)$ as in (3.59). Levin's spectrum can alternatively be expressed as
\[
\bar{C}_x(t,f) = \Re\Big\{ \int_\tau r_x(t, t - \tau)\, e^{-j2\pi f\tau}\, d\tau \Big\} .
\]
The spreading function of the corresponding operator $\mathbf{C}$ is given by $S^{(0)}_{\mathbf{C}}(\tau,\nu) = \cos(\pi\tau\nu)$. We note that Levin's spectrum can also be viewed as the real part of the Rihaczek spectrum (i.e., the GWVS with $\alpha = 1/2$).

Physical Spectrum [67, 139]: Guided by the idea of measuring the mean energy around a TF analysis point, Mark introduced the physical spectrum (see also Subsection B.3.3), which can be shown to equal the expectation of the spectrogram (see Subsection B.2.3), i.e.,
\begin{align*}
\mathrm{PS}_x^{(g)}(t,f) &\triangleq E\big\{ \mathrm{SPEC}_x^{(g)}(t,f) \big\} = E\big\{ \big| \mathrm{STFT}_x^{(g)}(t,f) \big|^2 \big\} = E\big\{ | \langle x, g_{t,f} \rangle |^2 \big\} \\
&= \langle \mathbf{R}_x\, g_{t,f}, g_{t,f} \rangle = \big\langle \mathbf{S}_{t,f} (g \otimes g^*) \mathbf{S}^+_{t,f}, \mathbf{R}_x \big\rangle ,
\end{align*}
with $g(t)$ a normalized analysis window. The localization operator underlying the physical spectrum has rank one and is given by $\mathbf{C} = g \otimes g^*$, so that $L^{(0)}_{\mathbf{C}}(t,f) = W^{(0)}_g(t,f)$ and $S^{(0)}_{\mathbf{C}}(\tau,\nu) = A^{(0)}_g(\tau,\nu)$.

Multi-Window Physical Spectrum: If the localization operator is compact and normal, it has an eigenexpansion
\[
\mathbf{C} = \sum_k \gamma_k\, g_k \otimes g_k^* .
\]
Plugging this decomposition into (3.56), it is seen that
\[
\bar{C}_x(t,f) = \Big\langle \mathbf{S}_{t,f} \Big( \sum_k \gamma_k\, g_k \otimes g_k^* \Big) \mathbf{S}^+_{t,f}, \mathbf{R}_x \Big\rangle = \sum_k \gamma_k \big\langle \mathbf{S}_{t,f} (g_k \otimes g_k^*) \mathbf{S}^+_{t,f}, \mathbf{R}_x \big\rangle = \sum_k \gamma_k\, \mathrm{PS}_x^{(g_k)}(t,f) . \tag{3.60}
\]
Hence, the time-varying spectrum $\bar{C}_x(t,f)$ corresponds to a multi-window physical spectrum with orthonormal analysis windows $g_k(t)$. We note that the number of physical spectra involved equals the rank of the localization operator $\mathbf{C}$. Furthermore, $L^{(0)}_{\mathbf{C}}(t,f) = \sum_k \gamma_k\, W^{(0)}_{g_k}(t,f)$ and $S^{(0)}_{\mathbf{C}}(\tau,\nu) = \sum_k \gamma_k\, A^{(0)}_{g_k}(\tau,\nu)$. For typical localization operators with fast decaying eigenvalues $\gamma_k$, a few terms in the multi-window expansion (3.60) are sufficient for a reasonable approximation of $\bar{C}_x(t,f)$. Choosing the TF localization operator proportional to the orthogonal projection operator $\mathbf{P} = \sum_{k=1}^{N} g_k \otimes g_k^*$ on the $N$-dimensional space $\mathrm{span}\{ g_1(t), \ldots, g_N(t) \}$, i.e., $\mathbf{C} = \mathbf{P}/N$, we obtain $\bar{C}_x(t,f) = \frac{1}{N} \sum_{k=1}^{N} \mathrm{PS}_x^{(g_k)}(t,f)$. In that case, the type I spectrum is simply the arithmetic mean of $N$ physical spectra computed with $N$ orthogonal windows $g_k(t)$. We note that multi-window expansions of TF representations have previously been considered, e.g., in [4, 41, 118].
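The physical spectrum and its multi-window average can be illustrated with a small discrete example. The Python sketch below assumes a cyclic grid of length $L$, a correlation matrix given as an $L \times L$ nested list, and unit-norm windows; the function names are hypothetical. It evaluates $\langle \mathbf{R}_x g_{t,f}, g_{t,f} \rangle$ for one window and the arithmetic mean over several windows, corresponding to the choice $\mathbf{C} = \mathbf{P}/N$.

```python
import cmath


def physical_spectrum(r, g, t, f):
    """<R g_tf, g_tf> for the cyclically TF-shifted window g_tf.

    r: L x L correlation matrix (Hermitian), g: window of length L.
    """
    L = len(g)
    g_tf = [g[(n - t) % L] * cmath.exp(2j * cmath.pi * f * n / L)
            for n in range(L)]
    acc = 0j
    for n1 in range(L):
        for n2 in range(L):
            acc += g_tf[n1].conjugate() * r[n1][n2] * g_tf[n2]
    # The quadratic form is real for Hermitian r; drop rounding residue.
    return acc.real


def multiwindow_spectrum(r, windows, t, f):
    """Arithmetic mean of the physical spectra of all given windows."""
    return sum(physical_spectrum(r, g, t, f) for g in windows) / len(windows)
```

Because each term is a quadratic form in a positive semidefinite matrix, the result is nonnegative at every $(t,f)$; with unit-norm windows, the sum over the full grid divided by $L$ equals the trace of the correlation matrix.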
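3.3.3 Approximate Equivalence

Many definitions for a type I time-varying power spectrum can be given via different choices of the TF localization operator $\mathbf{C}$. Of course, the specific properties of the resulting spectrum depend on this choice. While in general the spectra resulting from different TF localization operators are quite different, the following shows that for underspread processes, most type I time-varying power spectra yield effectively similar results.

Theorem 3.15. For any random process $x(t)$, the difference of two type I time-varying power spectra $\bar{C}_x^{(1)}(t,f) = \langle \mathbf{C}^{(1)}_{t,f}, \mathbf{R}_x \rangle$ and $\bar{C}_x^{(2)}(t,f) = \langle \mathbf{C}^{(2)}_{t,f}, \mathbf{R}_x \rangle$ generated by operators $\mathbf{C}^{(1)}$ and $\mathbf{C}^{(2)}$, respectively, satisfies
\[
\frac{\big| \bar{C}_x^{(1)}(t,f) - \bar{C}_x^{(2)}(t,f) \big|}{\| \bar{A}_x \|_1} \le m_x^{(\phi)} , \qquad
\frac{\big\| \bar{C}_x^{(1)} - \bar{C}_x^{(2)} \big\|_2}{\| \bar{A}_x \|_2} = M_x^{(\phi)} , \tag{3.61}
\]
where $\phi(\tau,\nu) = \big| S^{(0)}_{\mathbf{C}^{(1)}}(\tau,\nu) - S^{(0)}_{\mathbf{C}^{(2)}}(\tau,\nu) \big|$.

Proof. Using (3.58d), the 2-D Fourier transform of $\Delta(t,f) \triangleq \bar{C}_x^{(1)}(t,f) - \bar{C}_x^{(2)}(t,f)$ is obtained as $\hat{\Delta}(\tau,\nu) = \bar{A}_x^{(0)*}(\tau,\nu) \big[ S^{(0)}_{\mathbf{C}^{(1)}}(\tau,\nu) - S^{(0)}_{\mathbf{C}^{(2)}}(\tau,\nu) \big]$. The bound and expression for the $L_\infty$ and $L_2$ norms of $\Delta(t,f)$ then follow from $|\Delta(t,f)| \le \int_\tau \int_\nu |\hat{\Delta}(\tau,\nu)|\, d\tau\, d\nu$ and $\| \Delta \|_2 = \| \hat{\Delta} \|_2$, respectively.

Discussion. The above theorem shows that for processes with small $m_x^{(\phi)}$ and $M_x^{(\phi)}$, i.e., for underspread processes, two different type I time-varying spectra are approximately equal,
\[
\bar{C}_x^{(1)}(t,f) \approx \bar{C}_x^{(2)}(t,f) .
\]
Note that $\phi(0,0) = 0$ due to our assumption of trace-normalized TF localization operators (cf. (3.57)). The requirements for small $m_x^{(\phi)}$ and $M_x^{(\phi)}$ shall be illustrated next by considering some special cases.

The approximate equivalence of two GWVS with different $\alpha$ has already been discussed in Corollary 3.6. It can be viewed as a specialization of (3.61) with $\mathbf{C}^{(1)} = \mathbf{L}^{(\alpha_1)}$, $\mathbf{C}^{(2)} = \mathbf{L}^{(\alpha_2)}$, and the additional inequality $\phi(\tau,\nu) = \big| e^{-j2\pi\alpha_1\tau\nu} - e^{-j2\pi\alpha_2\tau\nu} \big| \le 2\pi\, |\alpha_1 - \alpha_2|\, |\tau|\, |\nu|$, which yields $m_x^{(\phi)} \le 2\pi\, |\alpha_1 - \alpha_2|\, m_x^{(1,1)}$ and $M_x^{(\phi)} \le 2\pi\, |\alpha_1 - \alpha_2|\, M_x^{(1,1)}$.

A further interesting special case is $\mathbf{C}^{(1)} = \mathbf{L}^{(0)}$ and $\mathbf{C}^{(2)} = g \otimes g^*$, where $\phi(\tau,\nu) = \big| 1 - A^{(0)}_g(\tau,\nu) \big|$. Note that in this case $m_x^{(\phi)}$ and $M_x^{(\phi)}$ will be small if $x(t)$ is underspread and the effective support region of $A^{(0)}_g(\tau,\nu)$ covers the effective support region of $|\bar{A}_x(\tau,\nu)|$. Here, (3.61) implies the approximate equivalence $\bar{W}_x^{(0)}(t,f) \approx \mathrm{PS}_x^{(g)}(t,f)$. Together with the positivity of the physical spectrum, this implies that the Wigner-Ville spectrum is approximately positive. This issue was explored in Subsection 3.2.1.

CL Processes. In the case of CL processes, i.e., processes with compact GEAF support region $\mathcal{G}_x$, straightforward application of (2.16) yields the bounds
\[
\frac{\big| \bar{C}_x^{(1)}(t,f) - \bar{C}_x^{(2)}(t,f) \big|}{\| \bar{A}_x \|_1} \le \phi_x^{(\max)} , \qquad
\frac{\big\| \bar{C}_x^{(1)} - \bar{C}_x^{(2)} \big\|_2}{\| \bar{A}_x \|_2} \le \phi_x^{(\max)} ,
\]
with $\phi_x^{(\max)} = \max_{(\tau,\nu) \in \mathcal{G}_x} \big| S^{(0)}_{\mathbf{C}^{(1)}}(\tau,\nu) - S^{(0)}_{\mathbf{C}^{(2)}}(\tau,\nu) \big|$. In particular, this implies that in cases where the DL parts of $\mathbf{C}^{(1)}$ and $\mathbf{C}^{(2)}$ are equal, i.e., $S^{(0)}_{\mathbf{C}^{(1)}}(\tau,\nu) = S^{(0)}_{\mathbf{C}^{(2)}}(\tau,\nu)$ for $(\tau,\nu) \in \mathcal{G}_x$ or equivalently $[\mathbf{C}^{(1)}]^{\mathcal{G}_x} = [\mathbf{C}^{(2)}]^{\mathcal{G}_x}$, one has $\phi_x^{(\max)} = 0$ and thus $\bar{C}_x^{(1)}(t,f) = \bar{C}_x^{(2)}(t,f)$ for a CL process with GEAF support region $\mathcal{G}_x$.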
3.3.4 Properties

A large part of the literature on deterministic TF representations is dedicated to the investigation of their mathematical properties [35, 61, 84]. Among the more important desirable properties are the time and frequency marginal property, unitarity (Moyal's relation), TF shift and scale covariance, real-valuedness, and positivity. The properties of a type I spectrum $\bar{C}_x(t,f)$ are completely characterized by the operator $\mathbf{C}$ or, equivalently, by the Weyl symbol $L^{(0)}_{\mathbf{C}}(t,f)$ or the spreading function $S^{(0)}_{\mathbf{C}}(\tau,\nu)$. A given property is satisfied by $\bar{C}_x(t,f)$ if the associated operator $\mathbf{C}$ (or, equivalently, $L^{(0)}_{\mathbf{C}}(t,f)$ or $S^{(0)}_{\mathbf{C}}(\tau,\nu)$) satisfies a corresponding condition. In the following, we will consider some specific properties and show that in a stochastic setting these properties are satisfied by many type I time-varying spectra at least in an approximate manner if the process under consideration is underspread.
Real-valuedness
Whereas the PSD is a real-valued function, this is not necessarily true for a type I spectrum. Hence, the first property we consider is real-valuedness, i.e., $\bar{C}_x(t,f) = \bar{C}_x^*(t,f)$. According to (3.56), this property is satisfied if and only if the operator $\mathbf{C}$ is self-adjoint (i.e., its eigenvalues have to be real, $\Im\{\gamma_k\} = 0$). From (B.6) it is seen that $\mathbf{C}$ is self-adjoint if and only if the GSF of $\mathbf{C}$ features Hermitian symmetry, $S^{(0)}_{\mathbf{C}}(\tau,\nu) = S^{(0)*}_{\mathbf{C}}(-\tau,-\nu)$. However, the next result shows that even if $\mathbf{C}$ is not self-adjoint, $\bar{C}_x(t,f)$ is approximately real-valued in the case of underspread processes.
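The link between real-valuedness and self-adjointness of $\mathbf{C}$ can be made explicit with a short calculation. The following is a sketch that uses only the definition (3.56), the self-adjointness of $\mathbf{R}_x$, and the cyclic property of the trace:

```latex
\begin{align*}
\bar{C}_x^*(t,f)
  &= \big( \mathrm{Tr}\{ \mathbf{C}_{t,f} \mathbf{R}_x \} \big)^*
   = \mathrm{Tr}\{ ( \mathbf{C}_{t,f} \mathbf{R}_x )^+ \}
   = \mathrm{Tr}\{ \mathbf{R}_x \mathbf{C}_{t,f}^+ \}
   = \mathrm{Tr}\{ \mathbf{C}_{t,f}^+ \mathbf{R}_x \} , \\
\mathbf{C}_{t,f}^+
  &= \big( \mathbf{S}_{t,f}\, \mathbf{C}\, \mathbf{S}_{t,f}^+ \big)^+
   = \mathbf{S}_{t,f}\, \mathbf{C}^+\, \mathbf{S}_{t,f}^+ .
\end{align*}
```

Thus the complex conjugate of a type I spectrum is the type I spectrum generated by $\mathbf{C}^+$, and the two coincide for every correlation operator exactly when $\mathbf{C} = \mathbf{C}^+$.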
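Theorem 3.16. For any random process $x(t)$, the imaginary part $\Im\{\bar{C}_x(t,f)\} = \frac{1}{2j} \big[ \bar{C}_x(t,f) - \bar{C}_x^*(t,f) \big]$ of any type I spectrum $\bar{C}_x(t,f) = \langle \mathbf{C}_{t,f}, \mathbf{R}_x \rangle$ satisfies
\[
\frac{\big| \Im\{\bar{C}_x(t,f)\} \big|}{\| \bar{A}_x \|_1} \le \frac{1}{2}\, m_x^{(\phi)} , \qquad
\frac{\big\| \Im\{\bar{C}_x\} \big\|_2}{\| \bar{A}_x \|_2} = \frac{1}{2}\, M_x^{(\phi)} , \tag{3.62}
\]
with the weight function $\phi(\tau,\nu) = \big| S^{(0)}_{\mathbf{C}}(\tau,\nu) - S^{(0)*}_{\mathbf{C}}(-\tau,-\nu) \big|$.

Proof. Using (3.58d), it follows that
\[
\Im\{\bar{C}_x(t,f)\} = \frac{1}{2j} \int_\tau \int_\nu \Big[ S^{(0)}_{\mathbf{C}}(\tau,\nu) - S^{(0)*}_{\mathbf{C}}(-\tau,-\nu) \Big] \bar{A}_x^{(0)*}(\tau,\nu)\, e^{-j2\pi(t\nu - f\tau)}\, d\tau\, d\nu .
\]
The $L_\infty$ bound in (3.62) is then shown as
\begin{align*}
\big| \Im\{\bar{C}_x(t,f)\} \big|
&= \frac{1}{2} \bigg| \int_\tau \int_\nu \Big[ S^{(0)}_{\mathbf{C}}(\tau,\nu) - S^{(0)*}_{\mathbf{C}}(-\tau,-\nu) \Big] \bar{A}_x^{(0)*}(\tau,\nu)\, e^{-j2\pi(t\nu - f\tau)}\, d\tau\, d\nu \bigg| \\
&\le \frac{1}{2} \int_\tau \int_\nu \Big| S^{(0)}_{\mathbf{C}}(\tau,\nu) - S^{(0)*}_{\mathbf{C}}(-\tau,-\nu) \Big|\, \big| \bar{A}_x(\tau,\nu) \big|\, d\tau\, d\nu
= \frac{1}{2}\, m_x^{(\phi)}\, \| \bar{A}_x \|_1 .
\end{align*}
In a similar way one obtains
\[
\big\| \Im\{\bar{C}_x\} \big\|_2^2 = \frac{1}{4} \int_\tau \int_\nu \Big| S^{(0)}_{\mathbf{C}}(\tau,\nu) - S^{(0)*}_{\mathbf{C}}(-\tau,-\nu) \Big|^2 \big| \bar{A}_x(\tau,\nu) \big|^2\, d\tau\, d\nu
= \Big[ \frac{1}{2}\, M_x^{(\phi)}\, \| \bar{A}_x \|_2 \Big]^2 ,
\]
which proves the expression for the $L_2$ norm in (3.62).

Discussion. The above result shows that for underspread processes with small $m_x^{(\phi)}$ and $M_x^{(\phi)}$, there is
\[
\bar{C}_x(t,f) \approx \bar{C}_x^*(t,f) \quad \text{or} \quad \Im\{\bar{C}_x(t,f)\} \approx 0 .
\]
Due to $\phi(\tau,\nu) = \big| S^{(0)}_{\mathbf{C}}(\tau,\nu) - S^{(0)*}_{\mathbf{C}}(-\tau,-\nu) \big|$, small $m_x^{(\phi)}$ and $M_x^{(\phi)}$ requires that $S^{(0)}_{\mathbf{C}}(\tau,\nu)$ approximately features Hermitian symmetry on the effective support of $\bar{A}_x(\tau,\nu)$. For typical TF localization operators $\mathbf{C}$, this in turn implies that $\bar{A}_x(\tau,\nu)$ has to be concentrated about the origin, i.e., the process $x(t)$ has to be underspread.

Of the example spectra in Subsection 3.3.2, all but the GWVS with $\alpha \neq 0$ and the multi-window spectra with $\Im\{\gamma_k\} \neq 0$ are real-valued. For the GWVS case, i.e., $\bar{C}_x(t,f) = \bar{W}_x^{(\alpha)}(t,f)$ or equivalently $\mathbf{C} = \mathbf{L}^{(\alpha)}$, simpler but coarser bounds in terms of the GEAF moments $m_x^{(1,1)}$ and $M_x^{(1,1)}$ are provided by Corollary 3.7. These bounds can also be obtained from Theorem 3.16 by noting that here $\phi(\tau,\nu) = \big| e^{-j2\pi\alpha\tau\nu} - e^{j2\pi\alpha\tau\nu} \big| = 2\, |\sin(2\pi\alpha\tau\nu)| \le 4\pi\, |\alpha|\, |\tau|\, |\nu|$, so that $m_x^{(\phi)} \le 4\pi\, |\alpha|\, m_x^{(1,1)}$ and $M_x^{(\phi)} \le 4\pi\, |\alpha|\, M_x^{(1,1)}$.

CL Processes. In the case of CL processes with compact GEAF support region $\mathcal{G}_x$, straightforward application of (2.16) yields the bounds
\[
\frac{\big| \Im\{\bar{C}_x(t,f)\} \big|}{\| \bar{A}_x \|_1} \le \frac{1}{2}\, \phi_x^{(\max)} , \qquad
\frac{\big\| \Im\{\bar{C}_x\} \big\|_2}{\| \bar{A}_x \|_2} \le \frac{1}{2}\, \phi_x^{(\max)} ,
\]
with $\phi_x^{(\max)} = \max_{(\tau,\nu) \in \mathcal{G}_x} \big| S^{(0)}_{\mathbf{C}}(\tau,\nu) - S^{(0)*}_{\mathbf{C}}(-\tau,-\nu) \big|$. In particular, this implies that in cases where the DL part of $\mathbf{C}$ is self-adjoint, i.e., $S^{(0)}_{\mathbf{C}}(\tau,\nu) = S^{(0)*}_{\mathbf{C}}(-\tau,-\nu)$ for $(\tau,\nu) \in \mathcal{G}_x$ or equivalently $\mathbf{C}^{\mathcal{G}_x} = [\mathbf{C}^{\mathcal{G}_x}]^+$, we have $\phi_x^{(\max)} = 0$ and thus $\bar{C}_x(t,f)$ will be real-valued.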
Marginal Properties
In a stochastic context, the marginal property in time requires that
\[
\int_f \bar{C}_x(t,f)\, df = E\{ |x(t)|^2 \} = r_x(t,t) .
\]
Similarly, the marginal property in frequency requires that
\[
\int_t \bar{C}_x(t,f)\, dt = E\{ |X(f)|^2 \} = r_X(f,f) .
\]
For a type I spectrum $\bar{C}_x(t,f)$, (3.58a) and (3.58b) respectively imply that
\[
\int_f \bar{C}_x(t,f)\, df = \int_{t'} c(t' - t, t' - t)\, r_x(t',t')\, dt' , \qquad
\int_t \bar{C}_x(t,f)\, dt = \int_{f'} B_{\mathbf{C}}(f' - f, f' - f)\, r_X(f',f')\, df' .
\]
Hence, the marginal property in time is satisfied if and only if $c(t,t) = \delta(t)$, or equivalently $S^{(0)}_{\mathbf{C}}(0,\nu) \equiv 1$. Similarly, the marginal property in frequency is satisfied if and only if $B_{\mathbf{C}}(f,f) = \delta(f)$, or equivalently $S^{(0)}_{\mathbf{C}}(\tau,0) \equiv 1$. Even if these conditions are not met exactly, the following result shows that in the case of underspread processes the marginal properties will approximately be satisfied.
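The time-marginal condition on the kernel diagonal can be illustrated in a discrete setting. The Python sketch below assumes a cyclic grid of length $L$ with operators stored as $L \times L$ nested lists (an illustrative discretization, with hypothetical names); it sums the discrete type I spectrum over all frequency bins and normalizes by $L$.

```python
import cmath


def time_marginal(c, r, t):
    """(1/L) * sum over f of Tr{C_{t,f} R} on a cyclic grid of length L.

    The frequency sum removes all terms with n1 != n2, leaving
    sum_n c[(n-t) % L][(n-t) % L] * r[n][n].
    """
    L = len(r)
    acc = 0j
    for f in range(L):
        for n1 in range(L):
            for n2 in range(L):
                phase = cmath.exp(2j * cmath.pi * f * (n1 - n2) / L)
                acc += c[(n1 - t) % L][(n2 - t) % L] * phase * r[n2][n1]
    return acc / L
```

If the diagonal of `c` is a unit impulse at index 0, the result equals `r[t][t]` regardless of the off-diagonal entries, which is the discrete counterpart of $c(t,t) = \delta(t)$; a kernel with a spread-out diagonal returns a smoothed version of the instantaneous power instead.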
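Theorem 3.17. For any random process $x(t)$ and any type I spectrum $\bar{C}_x(t,f) = \langle \mathbf{C}_{t,f}, \mathbf{R}_x \rangle$, the differences $\epsilon_1(t) \triangleq \int_f \bar{C}_x(t,f)\, df - r_x(t,t)$ and $\epsilon_2(f) \triangleq \int_t \bar{C}_x(t,f)\, dt - r_X(f,f)$ satisfy
\begin{align*}
\frac{|\epsilon_1(t)|}{\| \bar{A}_x \|_{1,\nu}} &\le m_x^{(1)} , & \frac{\| \epsilon_1 \|_2}{\| \bar{A}_x \|_{2,\nu}} &= M_x^{(1)} , \\
\frac{|\epsilon_2(f)|}{\| \bar{A}_x \|_{1,\tau}} &\le m_x^{(2)} , & \frac{\| \epsilon_2 \|_2}{\| \bar{A}_x \|_{2,\tau}} &= M_x^{(2)} , \tag{3.63}
\end{align*}
where
\begin{align*}
m_x^{(1)} &\triangleq \frac{1}{\| \bar{A}_x \|_{1,\nu}} \int_\nu \big| 1 - S^{(0)}_{\mathbf{C}}(0,\nu) \big|\, |\bar{A}_x(0,\nu)|\, d\nu , &
M_x^{(1)} &\triangleq \frac{1}{\| \bar{A}_x \|_{2,\nu}} \bigg[ \int_\nu \big| 1 - S^{(0)}_{\mathbf{C}}(0,\nu) \big|^2 |\bar{A}_x(0,\nu)|^2\, d\nu \bigg]^{1/2} , \\
m_x^{(2)} &\triangleq \frac{1}{\| \bar{A}_x \|_{1,\tau}} \int_\tau \big| 1 - S^{(0)}_{\mathbf{C}}(\tau,0) \big|\, |\bar{A}_x(\tau,0)|\, d\tau , &
M_x^{(2)} &\triangleq \frac{1}{\| \bar{A}_x \|_{2,\tau}} \bigg[ \int_\tau \big| 1 - S^{(0)}_{\mathbf{C}}(\tau,0) \big|^2 |\bar{A}_x(\tau,0)|^2\, d\tau \bigg]^{1/2} ,
\end{align*}
with $\| \bar{A}_x \|_{p,\nu} \triangleq \big[ \int_\nu |\bar{A}_x(0,\nu)|^p\, d\nu \big]^{1/p}$ and $\| \bar{A}_x \|_{p,\tau} \triangleq \big[ \int_\tau |\bar{A}_x(\tau,0)|^p\, d\tau \big]^{1/p}$.

Proof. First note that (B.53) and (B.55) imply that $r_x(t,t) = \int_\nu \bar{A}_x(0,\nu)\, e^{j2\pi\nu t}\, d\nu$; since $r_x(t,t)$ is real-valued, it also equals $\int_\nu \bar{A}_x^*(0,\nu)\, e^{-j2\pi\nu t}\, d\nu$. Furthermore, due to (3.58d) there is $\int_f \bar{C}_x(t,f)\, df = \int_\nu S^{(0)}_{\mathbf{C}}(0,\nu)\, \bar{A}_x^{(0)*}(0,\nu)\, e^{-j2\pi\nu t}\, d\nu$. We thus obtain
\[
|\epsilon_1(t)| = \bigg| \int_\nu \big[ S^{(0)}_{\mathbf{C}}(0,\nu) - 1 \big]\, \bar{A}_x^{(0)*}(0,\nu)\, e^{-j2\pi\nu t}\, d\nu \bigg|
\le \int_\nu \big| 1 - S^{(0)}_{\mathbf{C}}(0,\nu) \big|\, |\bar{A}_x(0,\nu)|\, d\nu = \| \bar{A}_x \|_{1,\nu}\, m_x^{(1)} .
\]
Similarly,
\[
\| \epsilon_1 \|_2^2 = \int_\nu \big| 1 - S^{(0)}_{\mathbf{C}}(0,\nu) \big|^2 |\bar{A}_x(0,\nu)|^2\, d\nu = \| \bar{A}_x \|_{2,\nu}^2 \big[ M_x^{(1)} \big]^2 .
\]
The expressions for $\epsilon_2(f)$ can be shown in an analogous way by expressing $r_X(f,f)$ and $\int_t \bar{C}_x(t,f)\, dt$ as Fourier integrals over $\tau$ of $\bar{A}_x(\tau,0)$ and of $S^{(0)}_{\mathbf{C}}(\tau,0)\, \bar{A}_x(\tau,0)$, respectively.

Discussion. The above theorem shows that for underspread processes where $m_x^{(1)}$, $m_x^{(2)}$, $M_x^{(1)}$, and $M_x^{(2)}$ are small, type I spectra $\bar{C}_x(t,f)$ that might not satisfy the marginal properties exactly will satisfy them at least approximately, i.e.,
\[
\int_f \bar{C}_x(t,f)\, df \approx E\{ |x(t)|^2 \} , \qquad \int_t \bar{C}_x(t,f)\, dt \approx E\{ |X(f)|^2 \} .
\]
Note that $m_x^{(1)}$, $M_x^{(1)}$, $m_x^{(2)}$, and $M_x^{(2)}$ can be viewed as degenerate weighted GEAF integrals that measure the extension of $\bar{A}_x(\tau,\nu)$ along the $\nu$ axis and $\tau$ axis, respectively. Small $m_x^{(1)}$ and $M_x^{(1)}$ requires that $E\{ |x(t)|^2 \} = r_x(t,t) = r_x^{(\alpha)}(t,0)$ varies slowly over time, and small $m_x^{(2)}$ and $M_x^{(2)}$ requires that $E\{ |X(f)|^2 \} = r_X(f,f) = r_X^{(\alpha)}(f,0)$ varies slowly over frequency. Note, however, that $r_x^{(\alpha)}(t,\tau)$ and $r_X^{(\alpha)}(f,\nu)$ may vary arbitrarily fast for $|\tau| > 0$ and $|\nu| > 0$, respectively.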
Of the example spectra in Subsection 3.3.2, all but the physical spectrum and the multi-window physical spectrum satisfy the marginal properties in time and frequency. For the physical spectrum, i.e., the case $\mathbf{C} = g \otimes g^*$ (with $g(t)$ a normalized window function), there is $S^{(0)}_{\mathbf{C}}(\tau,\nu) = A^{(0)}_g(\tau,\nu)$ and thus $S^{(0)}_{\mathbf{C}}(0,\nu) = \int_f G(f + \nu/2)\, G^*(f - \nu/2)\, df$ and $S^{(0)}_{\mathbf{C}}(\tau,0) = \int_t g(t + \tau/2)\, g^*(t - \tau/2)\, dt$. With the truncated Taylor series expansion of $A^{(0)}_g(\tau,\nu)$ about the origin (valid for small $|\tau|$, $|\nu|$),
\[
A^{(0)}_g(\tau,\nu) \approx 1 - 4\pi^2 F_g^2\, \tau^2 - 4\pi^2 T_g^2\, \nu^2 , \tag{3.64}
\]
where $T_g^2 = \int_t t^2\, |g(t)|^2\, dt$ and $F_g^2 = \int_f f^2\, |G(f)|^2\, df$ measure the effective duration and bandwidth of $g(t)$, we obtain the approximations
\begin{align*}
m_x^{(1)} &\approx 4\pi^2 T_g^2\, \frac{\int_\nu \nu^2\, |\bar{A}_x(0,\nu)|\, d\nu}{\| \bar{A}_x \|_{1,\nu}} , &
M_x^{(1)} &\approx 4\pi^2 T_g^2\, \frac{\big[ \int_\nu \nu^4\, |\bar{A}_x(0,\nu)|^2\, d\nu \big]^{1/2}}{\| \bar{A}_x \|_{2,\nu}} , \tag{3.65} \\
m_x^{(2)} &\approx 4\pi^2 F_g^2\, \frac{\int_\tau \tau^2\, |\bar{A}_x(\tau,0)|\, d\tau}{\| \bar{A}_x \|_{1,\tau}} , &
M_x^{(2)} &\approx 4\pi^2 F_g^2\, \frac{\big[ \int_\tau \tau^4\, |\bar{A}_x(\tau,0)|^2\, d\tau \big]^{1/2}}{\| \bar{A}_x \|_{2,\tau}} , \tag{3.66}
\end{align*}
which are valid for $|\bar{A}_x(\tau,\nu)|$ being concentrated about the origin. The fractions in (3.65) and (3.66) can be interpreted as measures of the time variation of $r_x(t,t)$ and of the frequency variation of $r_X(f,f)$, respectively. Thus it is seen that in order to have small $m_x^{(1)}$, $M_x^{(1)}$ in the case of the physical spectrum, the instantaneous power $r_x(t,t) = E\{ |x(t)|^2 \}$ must vary slowly compared to the window length $T_g$. Similarly, small $m_x^{(2)}$, $M_x^{(2)}$ requires that the expected energy density $r_X(f,f) = E\{ |X(f)|^2 \}$ varies slowly compared to the window bandwidth $F_g$. We conclude that in order that the physical spectrum approximately satisfies the marginal properties, the duration and bandwidth of the window $g(t)$ must be matched to the GEAF extension of the process $x(t)$ in the $\nu$ and $\tau$ direction, respectively. In particular, quasistationary processes require a long, narrowband analysis window, whereas quasiwhite processes require a short, wideband window.

CL Processes. In the case of CL processes with compact GEAF support region $\mathcal{G}_x$, it can be shown using similar arguments as in Proposition 2.4 that
\[
m_x^{(1)} \le \phi_1^{(\max)} , \quad M_x^{(1)} \le \phi_1^{(\max)} , \quad m_x^{(2)} \le \phi_2^{(\max)} , \quad M_x^{(2)} \le \phi_2^{(\max)} ,
\]
where $\phi_1^{(\max)} = \max_{(\tau,\nu) \in \mathcal{G}_x} \big| 1 - S^{(0)}_{\mathbf{C}}(0,\nu) \big|$ and $\phi_2^{(\max)} = \max_{(\tau,\nu) \in \mathcal{G}_x} \big| 1 - S^{(0)}_{\mathbf{C}}(\tau,0) \big|$. Applying these inequalities to (3.63) yields the bounds
\[
\frac{|\epsilon_1(t)|}{\| \bar{A}_x \|_{1,\nu}} \le \phi_1^{(\max)} , \quad \frac{\| \epsilon_1 \|_2}{\| \bar{A}_x \|_{2,\nu}} \le \phi_1^{(\max)} , \quad
\frac{|\epsilon_2(f)|}{\| \bar{A}_x \|_{1,\tau}} \le \phi_2^{(\max)} , \quad \frac{\| \epsilon_2 \|_2}{\| \bar{A}_x \|_{2,\tau}} \le \phi_2^{(\max)} .
\]
Hence, if the spreading function of the operator $\mathbf{C}$ satisfies $S^{(0)}_{\mathbf{C}}(0,\nu) = 1$ and/or $S^{(0)}_{\mathbf{C}}(\tau,0) = 1$ for $(\tau,\nu) \in \mathcal{G}_x$, we have, respectively, $\phi_1^{(\max)} = 0$ and/or $\phi_2^{(\max)} = 0$, so that the marginal property in time and/or frequency is satisfied.
Moyal-Type Relation
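The Moyal-type relation
\[
\langle \bar{C}_x, \bar{C}_y \rangle = \mathrm{Tr}\{ \mathbf{R}_x \mathbf{R}_y \} = \langle \mathbf{R}_x, \mathbf{R}_y \rangle \quad \text{with} \quad \langle \bar{C}_x, \bar{C}_y \rangle = \int_t \int_f \bar{C}_x(t,f)\, \bar{C}_y^*(t,f)\, dt\, df \tag{3.67}
\]
can be shown to be satisfied if and only if $|S_{\mathbf{C}}(\tau,\nu)| \equiv 1$ (note that this condition cannot be met by HS operators $\mathbf{C}$). Even if this condition is not satisfied exactly, the following result shows that in the case of underspread processes, (3.67) holds at least approximately.

Theorem 3.18. For any two random processes $x(t)$ and $y(t)$ and any type I spectrum $\bar{C}_x(t,f) = \langle \mathbf{C}_{t,f}, \mathbf{R}_x \rangle$, the difference between $\langle \bar{C}_x, \bar{C}_y \rangle$ and $\langle \mathbf{R}_x, \mathbf{R}_y \rangle$ is bounded as
\[
\frac{\big| \langle \bar{C}_x, \bar{C}_y \rangle - \langle \mathbf{R}_x, \mathbf{R}_y \rangle \big|}{\| \mathbf{R}_x \|_2\, \| \mathbf{R}_y \|_2} \le \min\Big\{ M_x^{(\psi)} M_y^{(\psi)} ,\; M_x^{(\phi)} ,\; M_y^{(\phi)} \Big\} , \tag{3.68}
\]
with $\psi(\tau,\nu) = \sqrt{ \big| 1 - |S_{\mathbf{C}}(\tau,\nu)|^2 \big| }$ and $\phi(\tau,\nu) = \big| 1 - |S_{\mathbf{C}}(\tau,\nu)|^2 \big|$.

Proof. Using $\langle \mathbf{R}_x, \mathbf{R}_y \rangle = \langle \bar{A}_x^{(0)}, \bar{A}_y^{(0)} \rangle$ (which is real-valued) and (3.58d) in combination with Parseval's relation, we have
\[
\big| \langle \bar{C}_x, \bar{C}_y \rangle - \langle \mathbf{R}_x, \mathbf{R}_y \rangle \big|
= \bigg| \int_\tau \int_\nu \bar{A}_x^{(0)*}(\tau,\nu)\, \bar{A}_y^{(0)}(\tau,\nu) \Big[ |S_{\mathbf{C}}(\tau,\nu)|^2 - 1 \Big]\, d\tau\, d\nu \bigg| .
\]
From this, we obtain the intermediate bound
\[
\big| \langle \bar{C}_x, \bar{C}_y \rangle - \langle \mathbf{R}_x, \mathbf{R}_y \rangle \big| \le \int_\tau \int_\nu |\bar{A}_x(\tau,\nu)|\, |\bar{A}_y(\tau,\nu)|\, \Big| 1 - |S_{\mathbf{C}}(\tau,\nu)|^2 \Big|\, d\tau\, d\nu . \tag{3.69}
\]
Applying the Schwarz inequality to the right-hand side of (3.69) in three different ways yields, respectively, the upper bounds
\begin{align*}
&\bigg[ \int_\tau \int_\nu |\bar{A}_x(\tau,\nu)|^2\, \Big| 1 - |S_{\mathbf{C}}(\tau,\nu)|^2 \Big|\, d\tau\, d\nu \bigg]^{1/2} \bigg[ \int_\tau \int_\nu |\bar{A}_y(\tau,\nu)|^2\, \Big| 1 - |S_{\mathbf{C}}(\tau,\nu)|^2 \Big|\, d\tau\, d\nu \bigg]^{1/2} = \| \bar{A}_x \|_2\, \| \bar{A}_y \|_2\, M_x^{(\psi)} M_y^{(\psi)} , \\
&\bigg[ \int_\tau \int_\nu |\bar{A}_x(\tau,\nu)|^2\, d\tau\, d\nu \bigg]^{1/2} \bigg[ \int_\tau \int_\nu |\bar{A}_y(\tau,\nu)|^2\, \Big| 1 - |S_{\mathbf{C}}(\tau,\nu)|^2 \Big|^2 d\tau\, d\nu \bigg]^{1/2} = \| \bar{A}_x \|_2\, \| \bar{A}_y \|_2\, M_y^{(\phi)} , \\
&\bigg[ \int_\tau \int_\nu |\bar{A}_x(\tau,\nu)|^2\, \Big| 1 - |S_{\mathbf{C}}(\tau,\nu)|^2 \Big|^2 d\tau\, d\nu \bigg]^{1/2} \bigg[ \int_\tau \int_\nu |\bar{A}_y(\tau,\nu)|^2\, d\tau\, d\nu \bigg]^{1/2} = \| \bar{A}_x \|_2\, \| \bar{A}_y \|_2\, M_x^{(\phi)} .
\end{align*}
The bound (3.68) finally follows by simultaneously invoking the above three inequalities and by noting that $\| \bar{A}_x \|_2 = \| \mathbf{R}_x \|_2$ and $\| \bar{A}_y \|_2 = \| \mathbf{R}_y \|_2$.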
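Discussion. The foregoing theorem shows that for underspread processes $x(t)$ and $y(t)$ where $M_x^{(\psi)} M_y^{(\psi)}$ or $M_x^{(\phi)}$ or $M_y^{(\phi)}$ is small, type I spectra that might not satisfy the Moyal-type relation (3.67) exactly will satisfy it at least approximately, i.e.,
\[
\langle \bar{C}_x, \bar{C}_y \rangle \approx \langle \mathbf{R}_x, \mathbf{R}_y \rangle .
\]
We further note that the intermediate result (3.69) gives a tighter bound that shows that only the product of the individual GEAFs has to be concentrated about the origin. However, the bound (3.68), while being coarser, has the advantage of being expressed directly in terms of weighted integrals of the individual processes. For the special case $y(t) = x(t)$, (3.69) yields the bound
\[
\frac{\big| \| \bar{C}_x \|_2^2 - \| \mathbf{R}_x \|_2^2 \big|}{\| \mathbf{R}_x \|_2^2} \le \big[ M_x^{(\psi)} \big]^2 .
\]
Hence, Theorem 3.18 also shows that for an underspread process $x(t)$ the $L_2$ norm of $\bar{C}_x(t,f)$ approximately equals the HS norm of the correlation operator $\mathbf{R}_x$,
\[
\| \bar{C}_x \|_2 \approx \| \mathbf{R}_x \|_2 ,
\]
where the relative error of this approximation is bounded in terms of $M_x^{(\psi)}$.

Of the example spectra in Subsection 3.3.2, only the GWVS and Page's spectrum satisfy the Moyal-type relation (3.67) exactly. For Levin's spectrum where $S^{(0)}_{\mathbf{C}}(\tau,\nu) = \cos(\pi\tau\nu)$, we obtain $\psi(\tau,\nu) = \sqrt{1 - |S_{\mathbf{C}}(\tau,\nu)|^2} = |\sin(\pi\tau\nu)| \le \pi\, |\tau|\, |\nu|$ and $\phi(\tau,\nu) = 1 - |S_{\mathbf{C}}(\tau,\nu)|^2 = \sin^2(\pi\tau\nu) \le \pi^2 \tau^2 \nu^2$, so that $M_x^{(\psi)} \le \pi\, M_x^{(1,1)}$ and $M_x^{(\phi)} \le \pi^2 M_x^{(2,2)}$. For the physical spectrum, approximations for $M_x^{(\psi)}$ and $M_x^{(\phi)}$ can be obtained using a truncated Taylor series expansion in a similar way as was demonstrated for the marginal properties.

CL Processes. In the case of CL processes with GEAF support intersection region $\mathcal{G}_{x,y} = \{ (\tau,\nu) : |\bar{A}_x(\tau,\nu)\, \bar{A}_y(\tau,\nu)| > 0 \}$, starting from (3.69) we obtain
\begin{align*}
\big| \langle \bar{C}_x, \bar{C}_y \rangle - \langle \mathbf{R}_x, \mathbf{R}_y \rangle \big|
&\le \iint_{\mathcal{G}_{x,y}} |\bar{A}_x(\tau,\nu)|\, |\bar{A}_y(\tau,\nu)|\, \Big| 1 - |S_{\mathbf{C}}(\tau,\nu)|^2 \Big|\, d\tau\, d\nu \\
&\le \phi_{x,y}^{(\max)} \iint_{\mathcal{G}_{x,y}} |\bar{A}_x(\tau,\nu)|\, |\bar{A}_y(\tau,\nu)|\, d\tau\, d\nu
\le \phi_{x,y}^{(\max)}\, \| \bar{A}_x \|_2\, \| \bar{A}_y \|_2 ,
\end{align*}
where $\phi_{x,y}^{(\max)} = \max_{(\tau,\nu) \in \mathcal{G}_{x,y}} \phi(\tau,\nu) = \max_{(\tau,\nu) \in \mathcal{G}_{x,y}} \big| 1 - |S_{\mathbf{C}}(\tau,\nu)|^2 \big|$. Hence,
\[
\frac{\big| \langle \bar{C}_x, \bar{C}_y \rangle - \langle \mathbf{R}_x, \mathbf{R}_y \rangle \big|}{\| \mathbf{R}_x \|_2\, \| \mathbf{R}_y \|_2} \le \phi_{x,y}^{(\max)} .
\]
This furthermore implies that if $\mathbf{C}$ satisfies $|S_{\mathbf{C}}(\tau,\nu)| = 1$ for $(\tau,\nu) \in \mathcal{G}_{x,y}$ (or equivalently $|S_{\mathbf{C}^{\mathcal{G}_{x,y}}}(\tau,\nu)| = I_{\mathcal{G}_{x,y}}(\tau,\nu)$), then $\phi_{x,y}^{(\max)} = 0$ and thus $\langle \bar{C}_x, \bar{C}_y \rangle = \langle \mathbf{R}_x, \mathbf{R}_y \rangle$. Similar remarks apply to the special case $y(t) = x(t)$.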
Positivity
The PSD $P_x(f)$ of a stationary process and the mean instantaneous intensity $q_x(t)$ of a white process are positive quantities. Positivity is necessary for a proper interpretation as mean power or energy distribution. Hence, positivity of time-varying power spectra has been a topic of longstanding interest. In [57, 60, 63], positivity of the Wigner-Ville spectrum was related to a "sufficient amount of randomness" of the process to be analyzed. Reinterpreting "sufficient amount of randomness" as limited TF correlations, we may expect that type I spectra of underspread processes are at least approximately (real-valued and) positive, i.e.,
\[
\bar{C}_x(t,f) \approx \mathcal{P}\{ \bar{C}_x(t,f) \} ,
\]
with the positive real-valued part of $\bar{C}_x(t,f)$ defined as in Subsection 2.3.13,
\[
\mathcal{P}\{ \bar{C}_x(t,f) \} \triangleq \frac{1}{2} \Big[ \big| \Re\{ \bar{C}_x(t,f) \} \big| + \Re\{ \bar{C}_x(t,f) \} \Big] .
\]
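The positive real-valued part and the complementary negative real part used in the positivity analysis act pointwise on the complex value of the spectrum. The following Python sketch (hypothetical function names; a pointwise illustration only) implements both and makes the decomposition of a complex value into positive real part, negative real part, and imaginary part explicit.

```python
def positive_part(z: complex) -> float:
    """P{z} = (|Re z| + Re z) / 2: equals Re z if Re z >= 0, else 0."""
    return 0.5 * (abs(z.real) + z.real)


def negative_part(z: complex) -> float:
    """N{z} = (|Re z| - Re z) / 2: equals -Re z if Re z < 0, else 0."""
    return 0.5 * (abs(z.real) - z.real)


def recompose(z: complex) -> complex:
    """Rebuild z as P{z} - N{z} + j * Im{z}."""
    return positive_part(z) - negative_part(z) + 1j * z.imag
```

Since `recompose(z)` returns `z` for every complex input, the deviation of a spectrum value from its positive part is carried entirely by the imaginary part and the negative real part, which is why the bounds for positivity combine a term for each.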
A necessary and sucient condition for a time-varying power spectrum Cx (t, f ) to be real-valued and positive (strictly speaking, non-negative) for all nonstationary random processes is that the corresponding operator C be positive (semi-)denite, i.e., C 0 or equivalently k 0. However, positivity can be obtained. We note that by virtue of (3.29), Theorem 2.30 (with H replaced by Rx ) directly applies to the GWVS. Hence for underspread processes the GWVS is approximately real-valued and positive. This was also stated in Corollary 3.8. In the following, we will generalize this result to arbitrary type I spectra. Theorem 3.19. For any random process x(t) and any type I spectrum Cx (t, f ) = Ct,f , Rx , the dierence between Cx (t, f ) and P Cx (t, f ) is bounded as Cx (t, f ) P Cx (t, f ) x A 1 Cx P Cx Ax 2
(0) (0) 2 () mx + inf C 0
the subsequent results show that if one restricts to underspread random processes, (approximate)
mx
(C )
(3.70a)
() Mx + inf (0)
C 0
Mx
(C )
(3.70b)
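with $\phi(\tau,\nu) = \big|S_C^{(0)}(\tau,\nu) - S_C^{(0)*}(-\tau,-\nu)\big|$ and $\phi_{\tilde C}(\tau,\nu) = \big|S_C^{(0)}(\tau,\nu) - S_{\tilde C}^{(0)}(\tau,\nu)\big|$.

Proof. The proof is largely analogous to that of Theorem 2.30. With the negative real part of $C_x(t,f)$,
$$N\{C_x(t,f)\} \triangleq P\{C_x(t,f)\} - \Re\{C_x(t,f)\} = \frac{1}{2}\Big[\big|\Re\{C_x(t,f)\}\big| - \Re\{C_x(t,f)\}\Big] \ge 0 ,$$
the type I spectrum $C_x(t,f)$ can be split up as $C_x(t,f) = P\{C_x(t,f)\} - N\{C_x(t,f)\} + j\,\Im\{C_x(t,f)\}$, so that $C_x(t,f) - P\{C_x(t,f)\} = j\,\Im\{C_x(t,f)\} - N\{C_x(t,f)\}$. Hence, the triangle inequality yields
$$\big\|C_x - P\{C_x\}\big\|_p \le \big\|\Im\{C_x\}\big\|_p + \big\|N\{C_x\}\big\|_p \tag{3.71}$$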
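where for our purposes $p = 2$ or $p = \infty$. Bounds on the first term on the right-hand side of (3.71) (involving the imaginary part of $C_x(t,f)$) have been provided in Theorem 3.16. With $\tilde{C}_x(t,f)$ denoting an arbitrary positive type I spectrum generated by a positive semi-definite operator $\tilde{\mathbf{C}} \ge 0$, the second term (involving the negative real part of $C_x(t,f)$) can be further bounded as
$$\begin{aligned}
\big\|N\{C_x\}\big\|_p &= \frac{1}{2}\,\big\|\,|\Re\{C_x\}| - \Re\{C_x\}\big\|_p
= \frac{1}{2}\,\big\|\,|\Re\{C_x\}| - \tilde{C}_x + \tilde{C}_x - \Re\{C_x\}\big\|_p \\
&\le \frac{1}{2}\,\big\|\,|\Re\{C_x\}| - \tilde{C}_x\big\|_p + \frac{1}{2}\,\big\|\tilde{C}_x - \Re\{C_x\}\big\|_p
\le \big\|\Re\{C_x\} - \tilde{C}_x\big\|_p \le \big\|C_x - \tilde{C}_x\big\|_p .
\end{aligned} \tag{3.72}$$
Here, we used the triangle inequality and the fact that the difference between $\tilde{C}_x(t,f)$ and the magnitude of the real part of $C_x(t,f)$ is bounded by the difference between $\tilde{C}_x(t,f)$ and the real part of $C_x(t,f)$ (since $\tilde{C}_x(t,f)$ is positive), which in turn is bounded by the difference between $\tilde{C}_x(t,f)$ and $C_x(t,f)$ (since $\tilde{C}_x(t,f)$ is real-valued). By noting that (3.72) holds for arbitrary positive $\tilde{C}_x(t,f)$ and by applying (3.61) to (3.72) (with $\mathbf{C}^{(1)} = \mathbf{C}$, $\mathbf{C}^{(2)} = \tilde{\mathbf{C}}$, and $p = \infty$ or $p = 2$), we obtain further
$$\frac{\big\|N\{C_x\}\big\|_\infty}{\|A_x\|_1} \le \inf_{\tilde{\mathbf{C}} \ge 0} m_x^{(\phi_{\tilde C})} , \qquad
\frac{\big\|N\{C_x\}\big\|_2}{\|A_x\|_2} \le \inf_{\tilde{\mathbf{C}} \ge 0} M_x^{(\phi_{\tilde C})} . \tag{3.73}$$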
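The bounds in (3.70) finally follow by inserting these bounds in (3.71).

Discussion. The foregoing theorem shows that type I spectra of processes with small $m_x^{(\phi)}$, $M_x^{(\phi)}$, $\inf_{\tilde{\mathbf{C}} \ge 0} m_x^{(\phi_{\tilde C})}$, and $\inf_{\tilde{\mathbf{C}} \ge 0} M_x^{(\phi_{\tilde C})}$ are approximately nonnegative,
$$C_x(t,f) \approx P\{C_x(t,f)\} , \qquad\text{or}\qquad \Im\{C_x(t,f)\} \approx 0 , \quad N\{C_x(t,f)\} \approx 0 .$$
The bounds (3.70) are tightest for real-valued type I spectra since here $m_x^{(\phi)} = 0$ and $M_x^{(\phi)} = 0$. Furthermore, if $C_x(t,f)$ is indeed positive (i.e., $\mathbf{C} \ge 0$), this is correctly indicated by the bounds (3.70) since with $\tilde{\mathbf{C}} = \mathbf{C}$ we have $m_x^{(\phi_{\tilde C})} = 0$ and $M_x^{(\phi_{\tilde C})} = 0$ in addition to $m_x^{(\phi)} = 0$, $M_x^{(\phi)} = 0$. Note that oblique orientation of $|A_x(\tau,\nu)|$ and/or $|S_{\tilde C}(\tau,\nu)|$ can be accommodated via metaplectic transformations of $\tilde{\mathbf{C}}$, since $\tilde{\mathbf{C}} \ge 0$ and $\mathbf{U} \in \mathcal{M}$ imply $\tilde{\mathbf{C}}' = \mathbf{U}\tilde{\mathbf{C}}\mathbf{U}^+ \ge 0$ (and $\tilde{\mathbf{C}}'$ has spreading function magnitude $|S_{\tilde C}(a\tau + b\nu, c\tau + d\nu)|$).

As noted above, type I spectra induced by positive semi-definite operators $\mathbf{C}$ are real-valued and positive. In particular, this is true for the physical spectrum ($\mathbf{C} = g \otimes g^* \ge 0$) and multi-window physical spectra ($\mathbf{C} = \sum_{k=1}^{N} \lambda_k\, g_k \otimes g_k^*$) with real-valued $\lambda_k \ge 0$. For the GWVS, the bounds (3.70) can further be specialized since
$$\phi(\tau,\nu) = \big|e^{-j2\pi\alpha\tau\nu} - e^{j2\pi\alpha\tau\nu}\big| = 2\,|\sin(2\pi\alpha\tau\nu)| \le 4\pi\,|\alpha|\,|\tau|\,|\nu|$$
and
$$\phi_{\tilde C}(\tau,\nu) = \big|e^{-j2\pi\alpha\tau\nu} - S_{\tilde C}^{(0)}(\tau,\nu)\big| = \big|1 - S_{\tilde C}^{(-\alpha)}(\tau,\nu)\big| .$$
In the case of rank-one TF localization operators $\tilde{\mathbf{C}} = g \otimes g^*$, i.e., $\tilde{C}_x(t,f) = PS_x^{(g)}(t,f)$, we have $\phi_{\tilde C}(\tau,\nu) = |1 - A_g^{(-\alpha)}(\tau,\nu)|$ so that $M_x^{(\phi_{\tilde C})}$ can be approximated using a truncated Taylor series of $A_g^{(-\alpha)}(\tau,\nu)$ similarly as it was done for the marginal properties.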
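The decomposition used in the proof of Theorem 3.19 can be checked numerically. The following is a minimal sketch, assuming a spectrum sampled on a finite TF grid and stored as a complex NumPy array; the names `positive_part`, `negative_part`, `C`, and `C_tilde` are illustrative and do not appear in the thesis. It verifies the split $C_x = P\{C_x\} - N\{C_x\} + j\,\Im\{C_x\}$, the triangle inequality (3.71) for $p = \infty$ and $p = 2$, and the pointwise version of (3.72) for an arbitrary nonnegative reference spectrum.

```python
import numpy as np

def positive_part(C):
    """P{C} = (Re C + |Re C|) / 2, the positive real-valued part."""
    re = np.real(C)
    return 0.5 * (re + np.abs(re))

def negative_part(C):
    """N{C} = (|Re C| - Re C) / 2 = P{C} - Re C, which is nonnegative."""
    re = np.real(C)
    return 0.5 * (np.abs(re) - re)

rng = np.random.default_rng(0)
# Toy complex-valued "spectrum" on an 8 x 8 TF grid (assumption: random data).
C = rng.standard_normal((8, 8)) + 1j * rng.standard_normal((8, 8))
# Arbitrary nonnegative reference spectrum playing the role of C-tilde.
C_tilde = rng.random((8, 8))

P, N = positive_part(C), negative_part(C)

# Split C = P - N + j Im{C}.
split_ok = np.allclose(C, P - N + 1j * np.imag(C))

# Triangle inequality (3.71) for p = infinity and p = 2.
tri_inf = np.max(np.abs(C - P)) <= np.max(np.abs(np.imag(C))) + np.max(N) + 1e-12
tri_2 = np.linalg.norm(C - P) <= np.linalg.norm(np.imag(C)) + np.linalg.norm(N) + 1e-12

# Pointwise form of (3.72): N{C} <= |C - C_tilde| for any nonnegative C_tilde.
neg_ok = bool(np.all(N <= np.abs(C - C_tilde) + 1e-12))
```

The pointwise inequality implies the norm inequalities in (3.72) for every $p$, which is why a single nonnegative reference spectrum close to $C_x$ suffices to certify approximate positivity.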
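CL Processes. In the case of CL processes with compact GEAF support region $G_x$, straightforward application of (2.16) yields the bounds
$$\frac{\big|C_x(t,f) - P\{C_x(t,f)\}\big|}{\|A_x\|_1} \le \phi^{(\max)} + \inf_{\tilde{\mathbf{C}} \ge 0} \phi_{\tilde C}^{(\max)} , \qquad
\frac{\big\|C_x - P\{C_x\}\big\|_2}{\|A_x\|_2} \le \phi^{(\max)} + \inf_{\tilde{\mathbf{C}} \ge 0} \phi_{\tilde C}^{(\max)} ,$$
with $\phi^{(\max)} = \max_{(\tau,\nu)\in G_x} \phi(\tau,\nu)$ and $\phi_{\tilde C}^{(\max)} = \max_{(\tau,\nu)\in G_x} \phi_{\tilde C}(\tau,\nu)$. This further implies that in cases where the DL part of $\mathbf{C}$ is self-adjoint, i.e., $S_C^{(0)}(\tau,\nu) = S_C^{(0)*}(-\tau,-\nu)$ for $(\tau,\nu) \in G_x$ (or equivalently $\mathbf{C}^{G_x} = [\mathbf{C}^{G_x}]^+$), and if there is a positive semi-definite operator $\tilde{\mathbf{C}}$ whose DL part equals that of $\mathbf{C}$, i.e., $S_C^{(0)}(\tau,\nu) = S_{\tilde C}^{(0)}(\tau,\nu)$ for $(\tau,\nu) \in G_x$ (or equivalently $\mathbf{C}^{G_x} = \tilde{\mathbf{C}}^{G_x}$), we have $\phi^{(\max)} = 0$ and $\phi_{\tilde C}^{(\max)} = 0$, respectively, and thus $C_x(t,f)$ will be real-valued and positive in this case.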
3.4 Type II Time-Varying Power Spectra

The GES of an overspread process contains cross terms that are due to the TF correlations of the process (see Subsection 3.2.2). Sometimes these statistical cross terms are undesirable since they may mask the auto terms characterizing the process' mean TF energy distribution or they may erroneously be taken for auto terms. The GES cross terms can be suppressed or reduced by an appropriate TF smoothing. This motivates a generalization of the GES which we call type II time-varying power spectra.
3.4.1 Definition and Formulations

The PSD $P_x(f)$ can be interpreted as squared magnitude of the frequency response $H(f)$ of an LTI innovations system $\mathbf{H} \in \mathcal{I}_x$ (see (1.4)), $P_x(f) = |H(f)|^2$. The frequency response $H(f)$ is obtained from the innovations system's impulse response $h(\tau)$ via a Fourier transform, which is a linear, frequency shift-covariant transform. In a similar spirit, we use the innovations system representation (3.19) of nonstationary processes as a basis for defining type II time-varying power spectra as
$$G_x(t,f) \triangleq \big|D_H(t,f)\big|^2 , \tag{3.74}$$
where $\mathbf{H} \in \mathcal{I}_x$ is an innovations system of $x(t)$ and $D_H(t,f)$ is a time-varying transfer function of the innovations system $\mathbf{H}$. The definition of this time-varying transfer function needs further specification. In particular, if one requires that $D_H(t,f)$ is a linear transform of $\mathbf{H}$ that is moreover TF shift covariant in the sense that
$$\tilde{\mathbf{H}} = \mathbf{S}_{t_0,f_0}\,\mathbf{H}\,\mathbf{S}^+_{t_0,f_0} \quad\Longrightarrow\quad D_{\tilde H}(t,f) = D_H(t - t_0, f - f_0) ,$$
then it can be shown that $D_H(t,f)$ must be of the form
$$D_H(t,f) = \mathrm{Tr}\{\mathbf{H}\,\mathbf{D}_{t,f}\} = \langle \mathbf{H}, \mathbf{D}^+_{t,f}\rangle = \big\langle \mathbf{H}, \mathbf{S}^{(\alpha)}_{t,f}\,\mathbf{D}^+\,\mathbf{S}^{(\alpha)+}_{t,f}\big\rangle \tag{3.75}$$
$$\phantom{D_H(t,f)} = \int_{t'}\!\int_{f'} L_D^{(0)}(t' - t, f' - f)\, L_H^{(0)}(t',f')\, dt'\, df' , \tag{3.76}$$
where the operator $\mathbf{D}$ completely specifies the TF transfer function $D_H(t,f)$. Usually, $\mathbf{D}$ will be chosen as a TF localization operator [42-44, 81, 88, 174, 175]. For convenience, we henceforth assume that $|S_D(\tau,\nu)| \le 1$ (this merely amounts to a proper normalization of $\mathbf{D}$).

The class of TF transfer functions given by (3.75) has previously been considered in a quantum-mechanical context in [33, 34] and in a signal analysis context in [62, 96]. The lower symbol $L_H^{L}(t,f)$ in (2.145) and the upper symbol $L_H^{U}(t,f)$ in (2.146) are specific members of this class obtained with $S_D^{(0)}(\tau,\nu) = A_s^{(0)}(\tau,\nu)$ and $S_D^{(0)}(\tau,\nu) = 1/A_s^{(0)}(\tau,\nu)$, respectively.
Like type I spectra, type II spectra admit four different canonical formulations in terms of representations of the operator $\mathbf{D}$ in four different domains:
$$\begin{aligned}
G_x(t,f) = \big|\langle \mathbf{H}, \mathbf{D}^+_{t,f}\rangle\big|^2
&= \bigg|\int_{t_1}\!\int_{t_2} d(t_2 - t, t_1 - t)\, h(t_1,t_2)\, e^{-j2\pi f(t_1 - t_2)}\, dt_1\, dt_2\bigg|^2 && \text{(3.77a)}\\
&= \bigg|\int_{f_1}\!\int_{f_2} B_D(f_2 - f, f_1 - f)\, B_H(f_1,f_2)\, e^{j2\pi t(f_1 - f_2)}\, df_1\, df_2\bigg|^2 && \text{(3.77b)}\\
&= \bigg|\int_{t'}\!\int_{f'} L_D^{(0)}(t' - t, f' - f)\, L_H^{(0)}(t',f')\, dt'\, df'\bigg|^2 && \text{(3.77c)}\\
&= \bigg|\int_{\tau}\!\int_{\nu} S_{D^+}^{(0)*}(\tau,\nu)\, S_H^{(0)}(\tau,\nu)\, e^{j2\pi(t\nu - f\tau)}\, d\tau\, d\nu\bigg|^2 . && \text{(3.77d)}
\end{aligned}$$
It is seen from (3.77c) that for underspread operators $\mathbf{D}$ whose Weyl symbol $L_D^{(0)}(t,f)$ is a 2-D lowpass function, $G_x(t,f)$ incorporates a smoothing of $L_H^{(0)}(t,f)$. This smoothing can be used to suppress or at least reduce oscillating components that are contained in $L_H^{(0)}(t,f)$ when the innovations system $\mathbf{H}$ (and hence $x(t)$) is overspread (see Fig. 3.8). Type II spectra with underspread $\mathbf{D}$ will henceforth be referred to as smoothed type II spectra. Since the smoothing is applied before taking the squared magnitude, $G_x(t,f)$ cannot be interpreted as a smoothed version of the GES. Indeed, the cross terms of the GES are positive and could not be suppressed by a post-smoothing. This is different from type I spectra, which can be interpreted as smoothed versions of the Wigner-Ville spectrum.
3.4.2 Examples

We briefly consider two classes of type II spectra which are of particular interest.

Generalized Evolutionary Spectrum: The GES $G_x^{(\alpha)}(t,f)$ discussed in Subsection 3.2.2 represents a specific family of type II spectra obtained with operator $\mathbf{D} = \mathbf{L}^{(\alpha)}$ (see Subsection B.1.2), i.e., $S_D^{(0)}(\tau,\nu) = e^{-j2\pi\alpha\tau\nu}$. Note, however, that $\mathbf{L}^{(\alpha)}$ is not underspread, i.e., the GWS of $\mathbf{L}^{(\alpha)}$ is not a lowpass function and hence the GES members are not smoothed type II spectra. The Weyl spectrum [147, 148], the (ordinary) evolutionary spectrum [170, 171], and the transitory evolutionary spectrum [49, 148] are important special cases of the GES obtained with $\alpha = 0$, $\alpha = 1/2$, and $\alpha = -1/2$, respectively.

Rank-One TF Localization Operator: The lower symbol $L_H^{L}(t,f) = \langle \mathbf{H} g_{t,f}, g_{t,f}\rangle$ in (2.145) [64, 118] is a TF transfer function that measures the gain of $\mathbf{H}$ around the TF analysis point $(t,f)$. This corresponds to using a rank-one TF localization operator $\mathbf{D} = g \otimes g^*$, i.e., $S_D^{(0)}(\tau,\nu) = A_g^{(0)}(\tau,\nu)$. Since this approach parallels that taken in the definition of the physical spectrum [139] (see Subsection 3.3.2), we refer to
$$G_x(t,f) = \big|L_H^{L}(t,f)\big|^2 \tag{3.78}$$
as type II physical spectrum.
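To make the GES examples concrete, the following is a minimal sketch of a cyclic (mod-$N$) discrete-time analogue of the GWS for $\alpha = \pm 1/2$ and of the resulting evolutionary and transitory evolutionary spectra. The discretization, the function names `gws` and `ges`, and the test system are assumptions made for illustration only; the thesis works in continuous time. The matrix `H` plays the role of the kernel $h(t,t')$ of the innovations system.

```python
import numpy as np

def gws(H, alpha):
    """Cyclic discrete analogue of the GWS for alpha = +1/2 or -1/2.

    alpha = +1/2: L[n, k] = sum_m h[n, n - m] exp(-j 2 pi k m / N)
    alpha = -1/2: L[n, k] = sum_m h[n + m, n] exp(-j 2 pi k m / N)
    """
    N = H.shape[0]
    n = np.arange(N)[:, None]   # time index
    m = np.arange(N)[None, :]   # lag index
    if alpha == 0.5:
        K = H[n, (n - m) % N]
    elif alpha == -0.5:
        K = H[(n + m) % N, n]
    else:
        raise ValueError("this sketch only covers alpha = +1/2 and -1/2")
    return np.fft.fft(K, axis=1)    # Fourier transform with respect to the lag

def ges(H, alpha):
    """GES analogue: squared magnitude of the GWS of the innovations system."""
    return np.abs(gws(H, alpha)) ** 2

# Sanity check with a time-invariant (circulant) innovations system.
rng = np.random.default_rng(1)
N = 16
h = rng.standard_normal(N)
idx = np.arange(N)
H_lti = h[(idx[:, None] - idx[None, :]) % N]
G_lti = ges(H_lti, 0.5)
```

For a circulant system both members reduce to the ordinary PSD $|H(f)|^2$ at every time index, which is the stationary special case the GES is meant to extend.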
3.4.3 Approximate Equivalence

Depending on the choice of the TF localization operator $\mathbf{D}$, different TF transfer functions $D_H(t,f)$ and hence different type II time-varying power spectra are obtained. In some situations this nonuniqueness might be considered an inconvenience. However, the next result shows that for underspread processes most type II spectra yield effectively equivalent results.

Theorem 3.20. For any random process $x(t)$ and any innovations system $\mathbf{H} \in \mathcal{I}_x$, the difference of two type II time-varying power spectra $G_x^{(1)}(t,f) = |\langle \mathbf{H}, \mathbf{D}^{(1)+}_{t,f}\rangle|^2$ and $G_x^{(2)}(t,f) = |\langle \mathbf{H}, \mathbf{D}^{(2)+}_{t,f}\rangle|^2$ generated by the operators $\mathbf{D}^{(1)}$ and $\mathbf{D}^{(2)}$, respectively, is bounded as
$$\frac{\big|G_x^{(1)}(t,f) - G_x^{(2)}(t,f)\big|}{\|S_H\|_1^2} \le 2\, m_H^{(\phi)} , \tag{3.79}$$
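with $\phi(\tau,\nu) = \big|S^{(0)}_{D^{(1)+}}(\tau,\nu) - S^{(0)}_{D^{(2)+}}(\tau,\nu)\big|$.

Proof. With (3.77d), the 2-D Fourier transform of $G_x^{(i)}(t,f)$, denoted here by $\tilde{G}_x^{(i)}(\tau,\nu)$, is obtained as
$$\tilde{G}_x^{(i)}(\tau,\nu) = \int_{\tau'}\!\int_{\nu'} S^{(0)*}_{D^{(i)+}}(\tau',\nu')\, S_H^{(0)}(\tau',\nu')\; S^{(0)}_{D^{(i)+}}(\tau' - \tau,\nu' - \nu)\, S_H^{(0)*}(\tau' - \tau,\nu' - \nu)\, d\tau'\, d\nu' .$$
Hence, it follows that
$$\begin{aligned}
\big|G_x^{(1)}(t,f) - G_x^{(2)}(t,f)\big|
&\le \int_{\tau}\!\int_{\nu} \big|\tilde{G}_x^{(1)}(\tau,\nu) - \tilde{G}_x^{(2)}(\tau,\nu)\big|\, d\tau\, d\nu \\
&\le \int\!\!\int\!\!\int\!\!\int \Big|S^{(0)*}_{D^{(1)+}}(\tau_1,\nu_1)\, S^{(0)}_{D^{(1)+}}(\tau_2,\nu_2) - S^{(0)*}_{D^{(2)+}}(\tau_1,\nu_1)\, S^{(0)}_{D^{(2)+}}(\tau_2,\nu_2)\Big|\, |S_H(\tau_1,\nu_1)|\, |S_H(\tau_2,\nu_2)|\, d\tau_1\, d\nu_1\, d\tau_2\, d\nu_2 \\
&\le \int\!\!\int\!\!\int\!\!\int \Big[\big|S^{(0)}_{D^{(1)+}}(\tau_1,\nu_1) - S^{(0)}_{D^{(2)+}}(\tau_1,\nu_1)\big| + \big|S^{(0)}_{D^{(1)+}}(\tau_2,\nu_2) - S^{(0)}_{D^{(2)+}}(\tau_2,\nu_2)\big|\Big]\, |S_H(\tau_1,\nu_1)|\, |S_H(\tau_2,\nu_2)|\, d\tau_1\, d\nu_1\, d\tau_2\, d\nu_2 \\
&= \int\!\!\int \phi(\tau_1,\nu_1)\, |S_H(\tau_1,\nu_1)|\, d\tau_1\, d\nu_1 \int\!\!\int |S_H(\tau_2,\nu_2)|\, d\tau_2\, d\nu_2
 + \int\!\!\int |S_H(\tau_1,\nu_1)|\, d\tau_1\, d\nu_1 \int\!\!\int \phi(\tau_2,\nu_2)\, |S_H(\tau_2,\nu_2)|\, d\tau_2\, d\nu_2 \\
&= 2\, \|S_H\|_1^2\; m_H^{(\phi)} ,
\end{aligned}$$
where we used the fact that $|ab - cd| \le |a - c| + |b - d|$ for $|a|, |b|, |c|, |d| \le 1$ together with our general assumption that $|S_{D^{(i)}}(\tau,\nu)| \le 1$.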
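The elementary inequality invoked at the end of the proof follows in one line; the short derivation below is added for completeness and uses only the stated assumption that all four numbers have magnitude at most one.

```latex
\begin{align*}
|ab - cd| &= |a(b - d) + d(a - c)| \\
          &\le |a|\,|b - d| + |d|\,|a - c| \\
          &\le |b - d| + |a - c| , \qquad |a| \le 1,\ |d| \le 1 .
\end{align*}
```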
Discussion. The foregoing theorem implies that for an underspread innovations system $\mathbf{H}$ with small $m_H^{(\phi)}$ (which is possible only if the corresponding process $x(t)$ is underspread), two different type II spectra using $\mathbf{H}$ are approximately equal,
$$G_x^{(1)}(t,f) \approx G_x^{(2)}(t,f) .$$
The following special cases illustrate the requirements for small $m_H^{(\phi)}$.

The approximate equivalence of GES with different $\alpha$ has previously been considered in Theorem 3.13. It can alternatively be obtained as a special case of Theorem 3.20 with $\mathbf{D}^{(1)} = \mathbf{L}^{(\alpha_1)}$, $\mathbf{D}^{(2)} = \mathbf{L}^{(\alpha_2)}$, and the inequality
$$\phi(\tau,\nu) = \big|e^{j2\pi\alpha_1\tau\nu} - e^{j2\pi\alpha_2\tau\nu}\big| = 2\,\big|\sin\!\big(\pi(\alpha_1 - \alpha_2)\tau\nu\big)\big| \le 2\pi\,|\alpha_1 - \alpha_2|\,|\tau|\,|\nu| ,$$
whence $m_H^{(\phi)} \le 2\pi\,|\alpha_1 - \alpha_2|\, m_H^{(1,1)}$.

A further interesting special case is $\mathbf{D}^{(1)} = \mathbf{L}^{(\alpha)}$ and $\mathbf{D}^{(2)} = s \otimes s^*$ (i.e., the difference between a GES and a type II physical spectrum). Here, $\phi(\tau,\nu) = |e^{j2\pi\alpha\tau\nu} - A_s^{(0)}(\tau,\nu)| = |1 - A_s^{(\alpha)}(\tau,\nu)|$ so that $m_H^{(\phi)}$ will be small if $\mathbf{H}$ is underspread and if the effective support region of $|A_s(\tau,\nu)|$ covers the effective support region of $|S_H(\tau,\nu)|$.

CL Processes. A DL innovations system $\mathbf{H}$ with compact GSF support region $G_H$ produces a CL process whose GEAF support region is twice as large. In this case, applying (2.16) to (3.79) results in the bound
$$\frac{\big|G_x^{(1)}(t,f) - G_x^{(2)}(t,f)\big|}{\|S_H\|_1^2} \le 2\,\phi_H^{(\max)}$$
with $\phi_H^{(\max)} = \max_{(\tau,\nu)\in G_H} \big|S^{(0)}_{D^{(1)+}}(\tau,\nu) - S^{(0)}_{D^{(2)+}}(\tau,\nu)\big|$. In particular, this implies that in cases where the DL parts of $\mathbf{D}^{(1)}$ and $\mathbf{D}^{(2)}$ are equal, i.e., $S^{(0)}_{D^{(1)+}}(\tau,\nu) = S^{(0)}_{D^{(2)+}}(\tau,\nu)$ for $(\tau,\nu) \in G_H$ (or equivalently $[\mathbf{D}^{(1)+}]^{G_H} = [\mathbf{D}^{(2)+}]^{G_H}$), we have $\phi_H^{(\max)} = 0$ and thus $G_x^{(1)}(t,f) = G_x^{(2)}(t,f)$.
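The effect described by Theorem 3.20 can be illustrated with a small numerical experiment. The sketch below is an assumption-laden toy, not the thesis's simulation: it uses a cyclic discrete analogue of the GES for $\alpha = \pm 1/2$ (function `ges`, a hypothetical name) and compares the evolutionary and transitory evolutionary spectra for a time-invariant system, a slowly modulated system (small TF spread), and a rapidly modulated system (large TF spread).

```python
import numpy as np

def ges(H, alpha):
    """Cyclic discrete GES analogue for alpha = +1/2 or -1/2 (illustrative)."""
    N = H.shape[0]
    n = np.arange(N)[:, None]
    m = np.arange(N)[None, :]
    K = H[n, (n - m) % N] if alpha > 0 else H[(n + m) % N, n]
    return np.abs(np.fft.fft(K, axis=1)) ** 2

def max_diff(H):
    """Largest pointwise gap between the alpha = 1/2 and alpha = -1/2 members."""
    return np.max(np.abs(ges(H, 0.5) - ges(H, -0.5)))

N = 64
idx = np.arange(N)
h = np.zeros(N)
h[:3] = [1.0, 0.5, 0.25]                        # short impulse response
circ = h[(idx[:, None] - idx[None, :]) % N]     # time-invariant part

slow = 1 + 0.5 * np.cos(2 * np.pi * idx / N)        # one cycle per record
fast = 1 + 0.5 * np.cos(2 * np.pi * 16 * idx / N)   # sixteen cycles per record

d_lti = max_diff(circ)                      # no time variation
d_slow = max_diff(np.diag(slow) @ circ)     # weak time variation
d_fast = max_diff(np.diag(fast) @ circ)     # strong time variation
```

In this toy the two spectra coincide for the time-invariant system, stay close for the slowly modulated one, and separate markedly for the rapidly modulated one, mirroring the role of $m_H^{(1,1)}$ in the bound.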
3.4.4 Properties

Type II spectra have the advantage of being always real-valued and nonnegative. Thus, in the following we only consider the total mean energy property and the marginal properties.

Mean Energy

It can be shown that in order for a type II spectrum to integrate to the total mean energy of the process, i.e.,
$$\int_t\!\int_f G_x(t,f)\, dt\, df = E_x = \mathrm{Tr}\{\mathbf{R}_x\} , \tag{3.80}$$
the operator $\mathbf{D}$ underlying $G_x(t,f)$ has to satisfy $|S_D(\tau,\nu)| = 1$. This condition is met by all GES members but not by most other type II spectra. However, the following result shows that for an underspread innovations system $\mathbf{H}$, (3.80) is satisfied at least in an approximate way.
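Theorem 3.21. For any random process $x(t)$ and any innovations system $\mathbf{H} \in \mathcal{I}_x$, the difference $\Delta \triangleq E_x - \int_t\!\int_f G_x(t,f)\, dt\, df$ with $G_x(t,f) = |\langle \mathbf{H}, \mathbf{D}^+_{t,f}\rangle|^2$ equals
$$\frac{\Delta}{\|\mathbf{H}\|_2^2} = \big[M_H^{(\phi)}\big]^2 \tag{3.81}$$
with the weighting function $\phi(\tau,\nu) = \sqrt{1 - |S_{D^+}(\tau,\nu)|^2}$.

Proof. Since the 2-D Fourier transform of $D_H(t,f) = \langle \mathbf{H}, \mathbf{D}^+_{t,f}\rangle$ equals $\tilde{D}_H(\tau,\nu) = S^{(0)*}_{D^+}(\tau,\nu)\, S_H^{(0)}(\tau,\nu)$, Parseval's relation implies that
$$\int_t\!\int_f G_x(t,f)\, dt\, df = \int_t\!\int_f \big|\langle \mathbf{H}, \mathbf{D}^+_{t,f}\rangle\big|^2 dt\, df = \int_\tau\!\int_\nu \big|\tilde{D}_H(\tau,\nu)\big|^2 d\tau\, d\nu = \int_\tau\!\int_\nu |S_H(\tau,\nu)|^2\, |S_{D^+}(\tau,\nu)|^2\, d\tau\, d\nu .$$
On the other hand, the mean energy $E_x$ can be rewritten as
$$E_x = \mathrm{Tr}\{\mathbf{R}_x\} = \mathrm{Tr}\{\mathbf{H}\mathbf{H}^+\} = \|\mathbf{H}\|_2^2 = \int_\tau\!\int_\nu |S_H(\tau,\nu)|^2\, d\tau\, d\nu .$$
Hence, we obtain
$$\Delta = \int_\tau\!\int_\nu |S_H(\tau,\nu)|^2 \big[1 - |S_{D^+}(\tau,\nu)|^2\big]\, d\tau\, d\nu ,$$
which implies the final result. Note that we have assumed $|S_D(\tau,\nu)| \le 1$ so that $1 - |S_D(\tau,\nu)|^2 \ge 0$ and hence $\Delta \ge 0$.

Discussion. The foregoing theorem implies that for an innovations system with small $M_H^{(\phi)}$, there holds the approximation
$$\int_t\!\int_f G_x(t,f)\, dt\, df \approx E_x .$$
The weighted GSF integral $M_H^{(\phi)}$ will be small as long as $|S_{D^+}(\tau,\nu)| \approx 1$ on the effective support of $|S_H(\tau,\nu)|$, which is favored by a small effective support of $|S_H(\tau,\nu)|$, i.e., by an underspread innovations system (and hence, an underspread process).

As a special case, let us consider the type II physical spectrum where $|S_{D^+}(\tau,\nu)|^2 = |A_s(\tau,\nu)|^2 \ge 1 - 4\pi^2 T_s^2 \nu^2 - 4\pi^2 F_s^2 \tau^2$ (see (3.64)) and hence
$$\big[M_H^{(\phi)}\big]^2 \le \frac{4\pi^2}{\|S_H\|_2^2} \int_\tau\!\int_\nu |S_H(\tau,\nu)|^2 \big[T_s^2 \nu^2 + F_s^2 \tau^2\big]\, d\tau\, d\nu = 4\pi^2 T_s^2 \big[M_H^{(0,1)}\big]^2 + 4\pi^2 F_s^2 \big[M_H^{(1,0)}\big]^2 .$$
Achieving small $M_H^{(\phi)}$ in this case requires a well-balanced choice of the duration/bandwidth of the analysis window $s(t)$ as compared to the innovations system's moments $M_H^{(1,0)}$ and $M_H^{(0,1)}$.

DL Innovations System. For a DL innovations system $\mathbf{H}$ with compact GSF support region $G_H$, it follows upon combining (3.81) and (2.16) that
$$\frac{\Delta}{\|\mathbf{H}\|_2^2} \le \big[\phi_H^{(\max)}\big]^2 \qquad\text{with}\qquad \big[\phi_H^{(\max)}\big]^2 = \max_{(\tau,\nu)\in G_H} \big[1 - |S_D(\tau,\nu)|^2\big] .$$
Furthermore, if $|S_D(\tau,\nu)| = 1$ for $(\tau,\nu) \in G_H$ (or equivalently $|S_{[\mathbf{D}^+]^{G_H}}(\tau,\nu)| = I_{G_H}(\tau,\nu)$), then $\phi_H^{(\max)} = 0$ and thus $\int_t\!\int_f G_x(t,f)\, dt\, df = E_x$.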
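The mean energy property can be checked in a finite-dimensional setting. The following sketch assumes a cyclic discrete analogue (length-$N$ signals, $N \times N$ TF grid, normalization by $1/N$); the function names and the Gaussian window are illustrative choices, not taken from the thesis. It compares the total of the evolutionary spectrum and of the type II physical spectrum with $\mathrm{Tr}\{\mathbf{H}\mathbf{H}^+\}$.

```python
import numpy as np

def evolutionary_spectrum(H):
    """GES analogue with alpha = 1/2: |sum_m h[n, n - m] exp(-j 2 pi k m / N)|^2."""
    N = H.shape[0]
    n = np.arange(N)[:, None]
    m = np.arange(N)[None, :]
    return np.abs(np.fft.fft(H[n, (n - m) % N], axis=1)) ** 2

def type2_physical_spectrum(H, g):
    """|<H g_{n,k}, g_{n,k}>|^2 with cyclically shifted and modulated window g."""
    N = H.shape[0]
    l = np.arange(N)
    G = np.empty((N, N))
    for n in range(N):
        g_shift = np.roll(g, n)                             # g[(l - n) mod N]
        for k in range(N):
            g_nk = g_shift * np.exp(2j * np.pi * k * l / N)
            G[n, k] = np.abs(np.vdot(g_nk, H @ g_nk)) ** 2  # |g^H H g|^2
    return G

rng = np.random.default_rng(2)
N = 16
H = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
E_x = np.trace(H @ H.conj().T).real          # Tr{R_x} with R_x = H H^+

g = np.exp(-0.5 * ((np.arange(N) - N / 2) / 2.0) ** 2)
g /= np.linalg.norm(g)                        # unit-energy window

E_ges = evolutionary_spectrum(H).sum() / N
E_phys = type2_physical_spectrum(H, g).sum() / N
```

In this discretization the evolutionary spectrum reproduces the mean energy exactly (Parseval over the lag variable), while the physical spectrum can only fall short of it, in line with $\Delta \ge 0$.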
Marginal Properties

A type II spectrum is said to satisfy the marginal properties in time and frequency if, respectively,
$$\int_f G_x(t,f)\, df = r_x(t,t) , \qquad \int_t G_x(t,f)\, dt = r_X(f,f) .$$
These properties will be satisfied if the operator $\mathbf{D}$ respectively satisfies the following two conditions:
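$$S_D^{(0)}(\tau,\nu)\, S_D^{(0)*}(\tau,\nu')\, e^{j\pi\tau(\nu - \nu')} = 1 , \qquad
S_D^{(0)}(\tau,\nu)\, S_D^{(0)*}(\tau',\nu)\, e^{-j\pi\nu(\tau - \tau')} = 1 .$$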
In particular, the condition on the left-hand side, and thus the marginal property in time, is satisfied for $\mathbf{D} = \mathbf{L}^{(1/2)}$, i.e., in the case of the evolutionary spectrum (the GES with $\alpha = 1/2$). Similarly, the condition on the right-hand side, and thus the marginal property in frequency, is satisfied for $\mathbf{D} = \mathbf{L}^{(-1/2)}$, i.e., in the case of the transitory evolutionary spectrum (the GES with $\alpha = -1/2$). While explicit but slightly involved expressions for the differences $\int_f G_x(t,f)\, df - r_x(t,t)$ and $\int_t G_x(t,f)\, dt - r_X(f,f)$ can be derived, we here consider bounds on these differences only for the GES, i.e., the special case $\mathbf{D} = \mathbf{L}^{(\alpha)}$ for which $G_x(t,f) = G_x^{(\alpha)}(t,f)$. These bounds extend previous bounds in [148].

Theorem 3.22. For any random process $x(t)$ and any innovations system $\mathbf{H} \in \mathcal{I}_x$, the differences $\Delta_1(t) \triangleq \int_f G_x^{(\alpha)}(t,f)\, df - r_x(t,t)$ and $\Delta_2(f) \triangleq \int_t G_x^{(\alpha)}(t,f)\, dt - r_X(f,f)$ (with $G_x^{(\alpha)}(t,f)$ using $\mathbf{H}$) are bounded as
$$|\Delta_1(t)| \le 4\pi\,\Big|\alpha - \frac{1}{2}\Big|\; m_H^{(1,1)}\; \|S_H\|_1\, \sup_{\tau} \int_\nu |S_H(\tau,\nu)|\, d\nu , \qquad
|\Delta_2(f)| \le 4\pi\,\Big|\alpha + \frac{1}{2}\Big|\; m_H^{(1,1)}\; \|S_H\|_1\, \sup_{\nu} \int_\tau |S_H(\tau,\nu)|\, d\tau . \tag{3.82}$$
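Proof. The twisted convolution (B.12) (with $\alpha = 0$) implies that
$$A_x^{(0)}(\tau,\nu) = S^{(0)}_{HH^+}(\tau,\nu) = \int_{\tau'}\!\int_{\nu'} S_H^{(0)}(\tau',\nu')\, S_{H^+}^{(0)}(\tau - \tau',\nu - \nu')\, e^{-j\pi(\tau'\nu - \tau\nu')}\, d\tau'\, d\nu' .$$
Since furthermore $S_{H^+}^{(0)}(\tau,\nu) = S_H^{(0)*}(-\tau,-\nu)$, the mean power $r_x(t,t)$ of $x(t)$ can be rewritten as
$$r_x(t,t) = \int_\nu A_x^{(0)}(0,\nu)\, e^{j2\pi t\nu}\, d\nu = \int_\nu\!\int_{\tau'}\!\int_{\nu'} S_H^{(0)}(\tau',\nu')\, S_H^{(0)*}(\tau',\nu' - \nu)\, e^{-j\pi\tau'\nu}\, d\tau'\, d\nu'\; e^{j2\pi t\nu}\, d\nu .$$
In contrast, using (3.77d) and the fact that for the GES $S_D^{(0)}(\tau,\nu) = e^{-j2\pi\alpha\tau\nu}$, the time marginal of the GES can be shown to equal
$$\int_f G_x^{(\alpha)}(t,f)\, df = \int_\nu\!\int_{\tau'}\!\int_{\nu'} S_H^{(0)}(\tau',\nu')\, S_H^{(0)*}(\tau',\nu' - \nu)\, e^{-j2\pi\alpha\tau'\nu}\, d\tau'\, d\nu'\; e^{j2\pi t\nu}\, d\nu .$$
Hence, we obtain (substituting $\nu_1 = \nu' - \nu$)
$$\begin{aligned}
|\Delta_1(t)| &\le \int_\nu\!\int_{\tau'}\!\int_{\nu'} |S_H(\tau',\nu')|\, |S_H(\tau',\nu' - \nu)|\, \big|e^{-j2\pi\alpha\tau'\nu} - e^{-j\pi\tau'\nu}\big|\, d\tau'\, d\nu'\, d\nu \\
&= \int_{\nu_1}\!\int_{\tau'}\!\int_{\nu'} |S_H(\tau',\nu')|\, |S_H(\tau',\nu_1)|\, \Big|1 - e^{-j2\pi(\alpha - \frac{1}{2})\tau'(\nu' - \nu_1)}\Big|\, d\tau'\, d\nu'\, d\nu_1 \\
&= 2 \int_{\nu_1}\!\int_{\tau'}\!\int_{\nu'} |S_H(\tau',\nu')|\, |S_H(\tau',\nu_1)|\, \Big|\sin\!\Big(\pi\big(\alpha - \tfrac{1}{2}\big)\tau'(\nu' - \nu_1)\Big)\Big|\, d\tau'\, d\nu'\, d\nu_1 \\
&\le 2\pi\,\Big|\alpha - \frac{1}{2}\Big| \int_{\nu_1}\!\int_{\tau'}\!\int_{\nu'} |S_H(\tau',\nu')|\, |S_H(\tau',\nu_1)|\, |\tau'|\,\big(|\nu'| + |\nu_1|\big)\, d\tau'\, d\nu'\, d\nu_1 \\
&\le 2\pi\,\Big|\alpha - \frac{1}{2}\Big| \bigg[\sup_{\tau'}\!\int_{\nu_1} |S_H(\tau',\nu_1)|\, d\nu_1 \int_{\tau'}\!\int_{\nu'} |\tau'|\,|\nu'|\,|S_H(\tau',\nu')|\, d\tau'\, d\nu' + \sup_{\tau'}\!\int_{\nu'} |S_H(\tau',\nu')|\, d\nu' \int_{\tau'}\!\int_{\nu_1} |\tau'|\,|\nu_1|\,|S_H(\tau',\nu_1)|\, d\tau'\, d\nu_1\bigg] \\
&= 4\pi\,\Big|\alpha - \frac{1}{2}\Big|\; m_H^{(1,1)}\; \|S_H\|_1\, \sup_{\tau} \int_\nu |S_H(\tau,\nu)|\, d\nu ,
\end{aligned}$$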
which proves the bound for the time marginal on the left-hand side of (3.82). The bound for the frequency marginal on the right-hand side of (3.82) can be shown in an analogous manner.

Discussion. The foregoing theorem shows that for an underspread innovations system with small $m_H^{(1,1)}$ (which is possible only if $x(t)$ is underspread), the marginal properties will be approximately satisfied,
$$\int_f G_x^{(\alpha)}(t,f)\, df \approx r_x(t,t) , \qquad \int_t G_x^{(\alpha)}(t,f)\, dt \approx r_X(f,f) .$$
We recall that small $m_H^{(1,1)}$ requires $|S_H(\tau,\nu)|$ to be concentrated along the $\tau$ axis and/or $\nu$ axis. Furthermore, the bounds in (3.82) correctly reflect the fact that the marginal property in time is exactly satisfied by the evolutionary spectrum (i.e., the GES with $\alpha = 1/2$) and the marginal property in frequency is exactly satisfied by the transitory evolutionary spectrum (i.e., the GES with $\alpha = -1/2$).

CL Processes. A DL innovations system $\mathbf{H}$ with compact GSF support region $G_H$ generates a CL process. In this case, it follows upon combining (3.82) and (2.18) that
$$|\Delta_1(t)| \le 4\pi\,\Big|\alpha - \frac{1}{2}\Big|\; \sigma_H\; \|S_H\|_1\, \sup_{\tau} \int_\nu |S_H(\tau,\nu)|\, d\nu , \qquad
|\Delta_2(f)| \le 4\pi\,\Big|\alpha + \frac{1}{2}\Big|\; \sigma_H\; \|S_H\|_1\, \sup_{\nu} \int_\tau |S_H(\tau,\nu)|\, d\tau ,$$
with $\sigma_H$ as defined in (2.8).
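The exactness of the two marginal properties for $\alpha = \pm 1/2$ has a direct finite-dimensional counterpart. The sketch below assumes a cyclic discrete analogue with a unitary DFT defining the frequency-domain correlation; `ges_half` and the variable names are hypothetical and serve only this illustration.

```python
import numpy as np

def ges_half(H, sign):
    """Cyclic discrete GES analogue for alpha = sign / 2, sign in {+1, -1}."""
    N = H.shape[0]
    n = np.arange(N)[:, None]
    m = np.arange(N)[None, :]
    K = H[n, (n - m) % N] if sign > 0 else H[(n + m) % N, n]
    return np.abs(np.fft.fft(K, axis=1)) ** 2

rng = np.random.default_rng(3)
N = 16
H = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
R = H @ H.conj().T                              # correlation matrix R_x = H H^+

F = np.fft.fft(np.eye(N)) / np.sqrt(N)          # unitary DFT matrix
r_time = np.real(np.diag(R))                    # r_x[n, n]
r_freq = np.real(np.diag(F @ R @ F.conj().T))   # r_X[k, k]

time_marg_ev = ges_half(H, +1).sum(axis=1) / N  # evolutionary, summed over k
freq_marg_tr = ges_half(H, -1).sum(axis=0) / N  # transitory, summed over n
time_marg_tr = ges_half(H, -1).sum(axis=1) / N  # transitory, summed over k
```

For a generic (non-normal) `H`, the evolutionary spectrum matches the time marginal and the transitory evolutionary spectrum matches the frequency marginal, whereas the transitory spectrum summed over frequency does not reproduce `r_time`; this is the discrete counterpart of the factors $|\alpha \mp 1/2|$ in (3.82).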
3.5 Equivalence of Time-Varying Power Spectra

In Sections 3.3 and 3.4, we have introduced the classes of type I and type II time-varying power spectra with the GWVS and GES, respectively, as central members. These classes present two alternative broad frameworks for the definition of time-varying spectra. However, it was seen in Subsections 3.2.1 and 3.2.2 that in the case of an underspread process, the GWVS and GES are essentially independent of $\alpha$. Moreover, Subsections 3.3.3 and 3.4.3 generalized this approximate equivalence to any two members within either the class of type I spectra or the class of type II spectra. In the following, we further generalize these equivalence results. First, we prove the approximate equivalence of GWVS and GES for underspread processes. Then, we combine the available equivalence results to show that for underspread processes any type I spectrum is approximately equal to any type II spectrum.
3.5.1 Equivalence of Generalized Wigner-Ville Spectrum and Generalized Evolutionary Spectrum

The common element of the GWVS and GES is their formulation in terms of the GWS. Keeping in mind the relation $\mathbf{R}_x = \mathbf{H}\mathbf{H}^+$ with $\mathbf{H} \in \mathcal{I}_x$, it is seen that the GWVS is the GWS of the "squared" innovations system $\mathbf{H}\mathbf{H}^+$,
$$W_x^{(\alpha)}(t,f) = L^{(\alpha)}_{HH^+}(t,f) , \tag{3.83}$$
whereas the GES is the squared magnitude of the GWS of the innovations system $\mathbf{H}$,
$$G_x^{(\alpha)}(t,f) = \big|L_H^{(\alpha)}(t,f)\big|^2 . \tag{3.84}$$
This observation and Corollary 2.17 are the basis for the following result.

Corollary 3.23. For any random process $x(t)$ and any innovations system $\mathbf{H} \in \mathcal{I}_x$, the difference between the GWVS and the GES (using $\mathbf{H}$) with the same parameter $\alpha$ is bounded as
$$\frac{\big|W_x^{(\alpha)}(t,f) - G_x^{(\alpha)}(t,f)\big|}{\|S_H\|_1^2} \le 2\pi\, C_H^{(\alpha)} \qquad\text{with}\qquad C_H^{(\alpha)} \triangleq c_\alpha\, m_H^{(0,1)}\, m_H^{(1,0)} + 2\,|\alpha|\, m_H^{(1,1)} , \tag{3.85}$$
where $c_\alpha = |\alpha + 1/2| + |\alpha - 1/2|$.

Proof. With (3.83) and (3.84), we have $W_x^{(\alpha)}(t,f) - G_x^{(\alpha)}(t,f) = L^{(\alpha)}_{HH^+}(t,f) - |L_H^{(\alpha)}(t,f)|^2$, which equals $\Delta_4^{(\alpha)}(t,f)$ in (2.69) except for the fact that $\mathbf{H}^+\mathbf{H}$ in (2.69) is replaced by $\mathbf{H}\mathbf{H}^+$. As mentioned in Subsection 2.3.5, the bound (2.70) remains valid even if $\mathbf{H}^+\mathbf{H}$ is replaced by $\mathbf{H}\mathbf{H}^+$. This gives (3.85).

Discussion. If an innovations system $\mathbf{H}$ exists such that $m_H^{(0,1)} m_H^{(1,0)}$ and $m_H^{(1,1)}$ are small (which essentially requires the GSF of $\mathbf{H}$, and hence the GEAF of $x(t)$, to be concentrated along the $\tau$ axis and/or $\nu$ axis), then the foregoing corollary shows that
$$W_x^{(\alpha)}(t,f) \approx G_x^{(\alpha)}(t,f) . \tag{3.86}$$
For $\alpha = 0$, this approximation is illustrated by the similarity of Fig. 3.5(a) and Fig. 3.6(a). Similarly, for $\alpha = 1/2$ the approximation (3.86) is illustrated by the similarity of Fig. 3.5(b) and Fig. 3.6(b). In these examples, the normalized errors $|W_x^{(0)}(t,f) - G_x^{(0)}(t,f)| / \|S_H\|_1^2$ and $|W_x^{(1/2)}(t,f) - G_x^{(1/2)}(t,f)| / \|S_H\|_1^2$ were both 0.003, while the corresponding bounds were $2\pi C_H^{(0)} = 0.018$ and $2\pi C_H^{(1/2)} = 0.073$, respectively.

The fact that the positive semi-definite root $\mathbf{H}_p = \sqrt{\mathbf{R}_x}$ causes minimal TF displacements within the class of normal innovations systems [90, 148] suggests that the GWVS and GES agree best when the positive (semi-)definite innovations system is used in the GES. Furthermore, the bound (3.85) is tightest for $\alpha = 0$, where the second term of $C_H^{(\alpha)}$ vanishes. The case $\alpha = 0$ deserves further attention due to the metaplectic covariance of the GWS with $\alpha = 0$. Indeed, it follows from (2.72) that for $\alpha = 0$ one has the refined bound
$$\frac{\big|W_x^{(0)}(t,f) - G_x^{(0)}(t,f)\big|}{\|S_H\|_1^2} \le 2\pi \inf_{\mathbf{U} \in \mathcal{M}} m^{(1,0)}_{UHU^+}\, m^{(0,1)}_{UHU^+} . \tag{3.87}$$

By combining Corollary 3.6, Theorem 3.13, and Corollary 3.23, we finally obtain the following equivalence result for GWVS and GES with arbitrary $\alpha$.

Corollary 3.24. For any random process $x(t)$ and any innovations system $\mathbf{H} \in \mathcal{I}_x$, the difference $W_x^{(\alpha_1)}(t,f) - G_x^{(\alpha_2)}(t,f)$ (with $G_x^{(\alpha_2)}(t,f)$ using $\mathbf{H}$) is bounded as
$$\frac{\big|W_x^{(\alpha_1)}(t,f) - G_x^{(\alpha_2)}(t,f)\big|}{\|S_H\|_1^2} \le 2\pi \min\{B_1, B_2\} , \tag{3.88}$$
where
$$B_1 \triangleq c_{\alpha_1}\, m_H^{(0,1)} m_H^{(1,0)} + 2\big(|\alpha_1| + |\alpha_1 - \alpha_2|\big)\, m_H^{(1,1)} , \qquad
B_2 \triangleq c_{\alpha_2}\, m_H^{(0,1)} m_H^{(1,0)} + 2\,|\alpha_2|\, m_H^{(1,1)} + |\alpha_1 - \alpha_2|\, m_x^{(1,1)} \tag{3.89}$$
and
$$B_2 \le \big(c_{\alpha_2} + 2\,|\alpha_1 - \alpha_2|\big)\, m_H^{(0,1)} m_H^{(1,0)} + 2\big(|\alpha_2| + |\alpha_1 - \alpha_2|\big)\, m_H^{(1,1)} , \tag{3.90}$$
with $c_\alpha = |\alpha + 1/2| + |\alpha - 1/2|$.

Proof. Subtracting/adding $G_x^{(\alpha_1)}(t,f)$ from/to $W_x^{(\alpha_1)}(t,f) - G_x^{(\alpha_2)}(t,f)$, we have
$$\big|W_x^{(\alpha_1)}(t,f) - G_x^{(\alpha_2)}(t,f)\big| = \big|W_x^{(\alpha_1)}(t,f) - G_x^{(\alpha_1)}(t,f) + G_x^{(\alpha_1)}(t,f) - G_x^{(\alpha_2)}(t,f)\big| \le \big|W_x^{(\alpha_1)}(t,f) - G_x^{(\alpha_1)}(t,f)\big| + \big|G_x^{(\alpha_1)}(t,f) - G_x^{(\alpha_2)}(t,f)\big| .$$
Applying (3.85) (with $\alpha = \alpha_1$) and (3.50), we obtain the bound $2\pi B_1$. Alternatively, by subtracting/adding $W_x^{(\alpha_2)}(t,f)$ from/to $W_x^{(\alpha_1)}(t,f) - G_x^{(\alpha_2)}(t,f)$, we have
$$\big|W_x^{(\alpha_1)}(t,f) - G_x^{(\alpha_2)}(t,f)\big| = \big|W_x^{(\alpha_1)}(t,f) - W_x^{(\alpha_2)}(t,f) + W_x^{(\alpha_2)}(t,f) - G_x^{(\alpha_2)}(t,f)\big| \le \big|W_x^{(\alpha_1)}(t,f) - W_x^{(\alpha_2)}(t,f)\big| + \big|W_x^{(\alpha_2)}(t,f) - G_x^{(\alpha_2)}(t,f)\big| .$$
Applying (3.31) (in combination with $\|A_x\|_1 = \|S_{HH^+}\|_1 \le \|S_H\|_1^2$ according to (B.14)) and (3.85) (with $\alpha = \alpha_2$), we obtain the bound $2\pi B_2$. The bound (3.90) on $B_2$ follows from the fact that, according to (2.35), $m_x^{(1,1)} \le 2\big[m_H^{(1,1)} + m_H^{(0,1)} m_H^{(1,0)}\big]$. Combining the bounds $2\pi B_1$ and $2\pi B_2$ gives (3.88).
Discussion. Due to (3.89) and (3.90), the moments $m_H^{(0,1)}$, $m_H^{(1,0)}$ and $m_H^{(1,1)}$ of the innovations system suffice for discussing the foregoing corollary. For innovations systems $\mathbf{H}$ such that $m_H^{(0,1)} m_H^{(1,0)}$ and $m_H^{(1,1)}$ are small, i.e., for underspread innovations systems and thus for underspread processes, the corollary implies that
$$W_x^{(\alpha_1)}(t,f) \approx G_x^{(\alpha_2)}(t,f) .$$
Small $m_H^{(0,1)} m_H^{(1,0)}$ and $m_H^{(1,1)}$ essentially requires that the GSF of $\mathbf{H}$ is oriented along the $\tau$ axis and/or $\nu$ axis.
3.5.2 Equivalence of Type I and Type II Spectra

The following result shows that for underspread processes all type I and type II spectra yield effectively equivalent results. This general result on the equivalence of time-varying power spectra is obtained by combining the bounds in (3.61), (3.79), and (3.87) via the triangle inequality.

Theorem 3.25. For any random process $x(t)$ and any innovations system $\mathbf{H} \in \mathcal{I}_x$, the difference between a type I spectrum $C_x(t,f) = \langle \mathbf{C}_{t,f}, \mathbf{R}_x\rangle$ and a type II spectrum $G_x(t,f) = |\langle \mathbf{H}, \mathbf{D}^+_{t,f}\rangle|^2$ induced by operators $\mathbf{C}$ and $\mathbf{D}$, respectively, is bounded as
$$\frac{\big|C_x(t,f) - G_x(t,f)\big|}{\|S_H\|_1^2} \le m_x^{(\phi_1)} + 2\, m_H^{(\phi_2)} + 2\pi \inf_{\mathbf{U} \in \mathcal{M}} m^{(0,1)}_{UHU^+}\, m^{(1,0)}_{UHU^+} , \tag{3.91}$$
with $\phi_1(\tau,\nu) = |1 - S_C^{(0)}(\tau,\nu)|$ and $\phi_2(\tau,\nu) = |1 - S_{D^+}^{(0)}(\tau,\nu)|$.

Proof. Subtracting/adding $W_x^{(0)}(t,f)$ and $G_x^{(0)}(t,f)$ to $C_x(t,f) - G_x(t,f)$ and applying the triangle inequality yields
$$\begin{aligned}
\big|C_x(t,f) - G_x(t,f)\big| &= \big|C_x(t,f) - W_x^{(0)}(t,f) + W_x^{(0)}(t,f) - G_x^{(0)}(t,f) + G_x^{(0)}(t,f) - G_x(t,f)\big| \\
&\le \big|C_x(t,f) - W_x^{(0)}(t,f)\big| + \big|W_x^{(0)}(t,f) - G_x^{(0)}(t,f)\big| + \big|G_x^{(0)}(t,f) - G_x(t,f)\big| .
\end{aligned} \tag{3.92}$$
According to Theorem 3.15, with $\mathbf{C}^{(1)}$ replaced by $\mathbf{C}$ and $\mathbf{C}^{(2)}$ replaced by $\mathbf{L}^{(0)}$, the first term in (3.92) is bounded as
$$\big|C_x(t,f) - W_x^{(0)}(t,f)\big| \le \|A_x\|_1\, m_x^{(\phi_1)} . \tag{3.93}$$
A bound on the second term in (3.92) is given by (3.87). Finally, Theorem 3.20, with $\mathbf{D}^{(1)}$ replaced by $\mathbf{L}^{(0)}$ and $\mathbf{D}^{(2)}$ replaced by $\mathbf{D}$, implies that the third term in (3.92) is bounded as
$$\big|G_x^{(0)}(t,f) - G_x(t,f)\big| \le 2\, \|S_H\|_1^2\, m_H^{(\phi_2)} . \tag{3.94}$$
The bound (3.91) then follows by combining (3.93), (3.87), and (3.94), and by noting that according to (B.14) $\|A_x\|_1 = \|S_{HH^+}\|_1 \le \|S_H\|_1^2$.

Discussion. The foregoing result shows that if the process $x(t)$ and the innovations system $\mathbf{H}$ are underspread such that the respective weighted integrals and moments are small, we have
$$C_x(t,f) \approx G_x(t,f) .$$
Hence, in the underspread case, type I and type II power spectra will yield effectively equivalent results. On the other hand, for overspread processes, different spectra can yield dramatically different results.
Figure 3.7: (a)-(c) Type I power spectra and (d)-(f) type II power spectra (with positive semi-definite innovations system) of an underspread process: (a) Wigner-Ville spectrum, $W_x^{(0)}(t,f)$, (b) real part of Rihaczek spectrum, $\Re\{W_x^{(1/2)}(t,f)\}$, (c) (type I) physical spectrum, $PS_x^{(g)}(t,f)$, with Gaussian window $g(t)$, (d) Weyl spectrum, $G_x^{(0)}(t,f)$, (e) evolutionary spectrum, $G_x^{(1/2)}(t,f)$ (simultaneously the transitory evolutionary spectrum, $G_x^{(-1/2)}(t,f)$), (f) type II physical spectrum, $G_x(t,f)$ in (3.78) with Gaussian window $g(t)$. The signal length is 256 samples, normalized frequency ranges from $-1/4$ to $1/4$. [Panels (a)-(f): horizontal axis $t$, vertical axis $f$.]
An example illustrating the approximate equivalence of time-varying power spectra in the case of an underspread process (which has been synthesized following [89]) is shown in Fig. 3.7. Here, all spectra yield practically identical results and correctly characterize the process' mean TF energy distribution. As a counterexample, Fig. 3.8 shows the same time-varying spectra for the case of an overspread process that has been obtained from the underspread process by introducing correlations between the T component and the F component. These TF correlations are indicated by oscillating cross terms in parts (a), (b), (d), and (e) of Fig. 3.8. On the other hand, in the type I and type II physical spectra in parts (c) and (f), respectively, these cross terms are effectively suppressed and the process' mean TF energy distribution is better visible. However, these smoothed spectra no longer indicate the existing strong TF correlations and hence do not completely characterize the process' second-order statistics.
Figure 3.8: The same time-varying spectra as in Fig. 3.7, but now for an overspread process. [Panels (a)-(f): horizontal axis $t$, vertical axis $f$.]
3.6 Input-Output Relations for Nonstationary Random Processes

One reason for the importance of the PSD is its usefulness for describing the action of LTI systems. When a wide-sense stationary random process x(t) with power spectral density Px (f ) is passed through an LTI system K with frequency response K(f ), the output y(t) = (Kx)(t) is again wide-sense stationary with power spectral density Py (f ) = |K(f )|2 Px (f ). Similarly, the response of an LFI system with temporal transfer function m(t) to a (nonstationary) white process x(t) with mean instantaneous intensity [163] qx (t) is again white with mean instantaneous intensity qy (t) = |m(t)|2 qx (t). In
this section, we investigate whether these simple multiplicative input-output relations relating the second-order statistics of x(t) with those of y(t) = (Kx)(t) can be extended to the general case where nonstationary random processes are passed through LTV systems. Both GWVS-based and GES-based input-output relations will be considered.
3.6.1
Input-Output Relation Based on the Generalized Wigner-Ville Spectrum
Let us consider an LTV system $\mathbf{K}$ whose input $x(t)$ is a zero-mean, nonstationary random process. The system output $y(t) = (\mathbf{K}x)(t)$ is a zero-mean, nonstationary random process whose correlation operator is given by
$$\mathbf{R}_y = E\{(\mathbf{K}x) \otimes (\mathbf{K}x)^*\} = \mathbf{K}\, E\{x \otimes x^*\}\, \mathbf{K}^+ = \mathbf{K}\mathbf{R}_x\mathbf{K}^+ .$$
Note that in this relation, like in the LTI/LFI case, the system $\mathbf{K}$ enters in a quadratic manner. We now look for a TF reformulation of the above input-output relation in terms of the GWS. Specifically, if $\mathbf{K}$ and $\mathbf{R}_x$ are jointly underspread in the sense of Subsection 2.3.4, then (2.62) and (2.49) imply
$$L^{(\alpha)}_{R_y}(t,f) = L^{(\alpha)}_{KR_xK^+}(t,f) \approx L^{(\alpha)}_{K}(t,f)\, L^{(\alpha)}_{R_x}(t,f)\, L^{(\alpha)}_{K^+}(t,f) \approx \big|L^{(\alpha)}_{K}(t,f)\big|^2\, L^{(\alpha)}_{R_x}(t,f) ,$$
or, equivalently, using the GWVS,
$$W_y^{(\alpha)}(t,f) \approx \big|L_K^{(\alpha)}(t,f)\big|^2\, W_x^{(\alpha)}(t,f) . \tag{3.95}$$
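The next theorem bounds the error of this approximation.

Theorem 3.26. For any random process $x(t)$ and any LTV system $\mathbf{K}$, the difference
$$\Delta^{(\alpha)}(t,f) \triangleq W_{Kx}^{(\alpha)}(t,f) - \big|L_K^{(\alpha)}(t,f)\big|^2\, W_x^{(\alpha)}(t,f)$$
is bounded as
$$\frac{\big|\Delta^{(\alpha)}(t,f)\big|}{\|S_K\|_1^2\, \|A_x\|_1} \le 2\pi\, c_\alpha \Big[m_K^{(0,1)} m_x^{(1,0)} + m_K^{(1,0)} m_x^{(0,1)} + m_K^{(1,0)} m_K^{(0,1)}\Big] + 4\pi\,|\alpha|\, m_K^{(1,1)} \tag{3.96}$$
with $c_\alpha = |\alpha + 1/2| + |\alpha - 1/2|$.

Proof. Applying the triangle inequality after subtracting and adding $L_K^{(\alpha)}(t,f)\, W_x^{(\alpha)}(t,f)\, L_{K^+}^{(\alpha)}(t,f)$ yields
$$\big|\Delta^{(\alpha)}(t,f)\big| = \big|\Delta_A^{(\alpha)}(t,f) + \Delta_B^{(\alpha)}(t,f)\big| \le \big|\Delta_A^{(\alpha)}(t,f)\big| + \big|\Delta_B^{(\alpha)}(t,f)\big| \tag{3.97}$$
with
$$\Delta_A^{(\alpha)}(t,f) \triangleq L^{(\alpha)}_{KR_xK^+}(t,f) - L_K^{(\alpha)}(t,f)\, W_x^{(\alpha)}(t,f)\, L_{K^+}^{(\alpha)}(t,f) , \qquad
\Delta_B^{(\alpha)}(t,f) \triangleq L_K^{(\alpha)}(t,f)\, W_x^{(\alpha)}(t,f)\, \Big[L_{K^+}^{(\alpha)}(t,f) - L_K^{(\alpha)*}(t,f)\Big] .$$
The 2-D Fourier transform of $L^{(\alpha)}_{KR_xK^+}(t,f)$ is $S^{(\alpha)}_{KR_xK^+}(\tau,\nu)$, which is the twisted convolution of $S_K^{(\alpha)}$, $A_x^{(\alpha)}$, and $S_{K^+}^{(\alpha)}$ (see (B.12)), whereas the 2-D Fourier transform of $L_K^{(\alpha)}(t,f)\, W_x^{(\alpha)}(t,f)\, L_{K^+}^{(\alpha)}(t,f)$ is the ordinary 2-D convolution of the same three functions. With $\theta_\alpha$ denoting the phase function of the twisted convolution in (B.12), the 2-D Fourier transform of $\Delta_A^{(\alpha)}(t,f)$ is therefore obtained as
$$\tilde{\Delta}_A^{(\alpha)}(\tau,\nu) = \int\!\!\int\!\!\int\!\!\int S_K^{(\alpha)}(\tau_1,\nu_1)\, A_x^{(\alpha)}(\tau_2,\nu_2)\, S_{K^+}^{(\alpha)}(\tau - \tau_1 - \tau_2, \nu - \nu_1 - \nu_2)\, \Big[e^{-j2\pi[\theta_\alpha(\tau - \tau_1,\nu - \nu_1,\tau_2,\nu_2) + \theta_\alpha(\tau,\nu,\tau_1,\nu_1)]} - 1\Big]\, d\tau_1\, d\nu_1\, d\tau_2\, d\nu_2 ,$$
where (B.12) has been used. This leads to the following bound on $\Delta_A^{(\alpha)}(t,f)$,
$$\big|\Delta_A^{(\alpha)}(t,f)\big| \le \int_\tau\!\int_\nu \big|\tilde{\Delta}_A^{(\alpha)}(\tau,\nu)\big|\, d\tau\, d\nu \le 2 \int\!\cdots\!\int |S_K(\tau_1,\nu_1)|\, |A_x(\tau_2,\nu_2)|\, |S_{K^+}(\tau - \tau_1 - \tau_2,\nu - \nu_1 - \nu_2)|\, \Big|\sin\!\Big(\pi\big[\theta_\alpha(\tau - \tau_1,\nu - \nu_1,\tau_2,\nu_2) + \theta_\alpha(\tau,\nu,\tau_1,\nu_1)\big]\Big)\Big|\, d\tau_1\, d\nu_1\, d\tau_2\, d\nu_2\, d\tau\, d\nu .$$
Using $|\sin x| \le |x|$ and substituting $\tau_3 = \tau - \tau_1 - \tau_2$ and $\nu_3 = \nu - \nu_1 - \nu_2$, this expression can further be developed as
$$\begin{aligned}
\big|\Delta_A^{(\alpha)}(t,f)\big| &\le 2\pi \int\!\cdots\!\int |S_K(\tau_1,\nu_1)|\, |A_x(\tau_2,\nu_2)|\, |S_{K^+}(\tau_3,\nu_3)|\, \Big[\big|\theta_\alpha(\tau_2 + \tau_3,\nu_2 + \nu_3,\tau_2,\nu_2)\big| + \big|\theta_\alpha(\tau_1 + \tau_2 + \tau_3,\nu_1 + \nu_2 + \nu_3,\tau_1,\nu_1)\big|\Big]\, d\tau_1\, d\nu_1\, d\tau_2\, d\nu_2\, d\tau_3\, d\nu_3 \\
&\le 2\pi \int\!\cdots\!\int |S_K(\tau_1,\nu_1)|\, |A_x(\tau_2,\nu_2)|\, |S_{K^+}(\tau_3,\nu_3)|\, \Big[\Big|\alpha - \frac{1}{2}\Big| \big(|\tau_2\nu_1| + |\tau_3\nu_1| + |\tau_3\nu_2|\big) + \Big|\alpha + \frac{1}{2}\Big| \big(|\tau_1\nu_2| + |\tau_1\nu_3| + |\tau_2\nu_3|\big)\Big]\, d\tau_1\, d\nu_1\, d\tau_2\, d\nu_2\, d\tau_3\, d\nu_3 \\
&= 2\pi\, \|S_K\|_1^2\, \|A_x\|_1\; c_\alpha \Big[m_K^{(0,1)} m_x^{(1,0)} + m_K^{(1,0)} m_x^{(0,1)} + m_K^{(1,0)} m_K^{(0,1)}\Big] ,
\end{aligned} \tag{3.98}$$
where the final expression is obtained by collecting corresponding terms in the integral and using $m_{K^+}^{(k,l)} = m_K^{(k,l)}$. A bound on $\Delta_B^{(\alpha)}(t,f)$ is obtained by using $|L_K^{(\alpha)}(t,f)| \le \|S_K\|_1$, $|W_x^{(\alpha)}(t,f)| \le \|A_x\|_1$, and by applying (2.48),
$$\big|\Delta_B^{(\alpha)}(t,f)\big| \le \|S_K\|_1\, \|A_x\|_1\, \big|L_{K^+}^{(\alpha)}(t,f) - L_K^{(\alpha)*}(t,f)\big| \le \|S_K\|_1^2\, \|A_x\|_1\; 4\pi\,|\alpha|\, m_K^{(1,1)} . \tag{3.99}$$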
Inserting (3.98) and (3.99) in (3.97) finally yields the bound (3.96).

Discussion. Due to Theorem 3.26, the approximate input-output relation (3.95) will be valid if the LTV system K and the correlation operator $R_x$ are jointly underspread such that the moments appearing in the bound (3.96) are small. This requires the GSF of K and the GEAF of x(t) to be similarly localized along the $\tau$ axis and/or $\nu$ axis. The bound (3.96) is tightest for $\alpha = 0$, in which case a refined bound similar to (2.63) and (2.72) can also be obtained by using metaplectic transformations of K and x(t); this allows the GSF of K and the GEAF of x(t) to be oriented in (similar) oblique directions. An example illustrating the approximation (3.95) for the case $\alpha = 0$ is shown in Fig. 3.9. In this example, the normalized error was $\max_{t,f} |\Delta^{(0)}(t,f)| / (\|S_K\|_1^2 \|\bar{A}_x\|_1) = 1.3 \cdot 10^{-3}$ while the corresponding bound in (3.96) was $3.3 \cdot 10^{-3}$.
[Figure 3.9, panels (a) to (d); horizontal axis t, vertical axis f.]
Figure 3.9: Illustration of approximate input-output relation for the GWVS: (a) Wigner-Ville spectrum $\overline{W}_x^{(0)}(t,f)$ of input process x(t), (b) Weyl symbol $L_K^{(0)}(t,f)$ of LTV system K, (c) Wigner-Ville spectrum $\overline{W}_y^{(0)}(t,f)$ of filtered process y(t) = (Kx)(t), (d) approximation $|L_K^{(0)}(t,f)|^2\,\overline{W}_x^{(0)}(t,f)$ for Wigner-Ville spectrum of y(t). The signal length is 256 samples, the normalized frequency range is [−1/4, 1/4].
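CL Processes. In the case of a CL process x(t) with GEAF support contained in the rectangular region $\mathcal{G}_x = [-\tau_x,\tau_x] \times [-\nu_x,\nu_x]$ and a DL system K with GSF support contained in the rectangular region $\mathcal{G}_K = [-\tau_K,\tau_K] \times [-\nu_K,\nu_K]$, applying (2.17) and (2.18) to (3.96) and using (2.9) yields

$$\frac{|\Delta^{(\alpha)}(t,f)|}{\|S_K\|_1^2\,\|\bar{A}_x\|_1} \le 2\pi c_\alpha\big(\nu_K\tau_x + \tau_K\nu_x + \tau_K\nu_K\big) + 4\pi|\alpha|\,\tau_K\nu_K \le \pi\Big(\tfrac32\,c_\alpha + |\alpha|\Big)\,\sigma_{K,R_x}\,, \qquad (3.100)$$

where $\sigma_{K,R_x}$ is the joint displacement spread of K and $R_x$ as defined in (2.11). For $|\alpha| \le 1/2$, we have $c_\alpha = 1$ so that the bound in (3.100) is further bounded by the simple expression $2\pi\,\sigma_{K,R_x}$.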
3.6.2 Input-Output Relation Based on the Generalized Evolutionary Spectrum
An input-output relation for the GES was presented in [148] without a bound on the associated approximation error and based on the assumption that a CL process x(t) is passed through a DL system K. The following result is less restrictive in that the GEAF of x(t) and the GSF of K are only required to have rapid decay. Furthermore, it exploits the fact that according to y(t) = (Kx)(t) = (KHn)(t), the operator KH is an innovations system of y(t) if H is an innovations system of x(t), i.e., $H \in \mathcal{I}_x \Rightarrow KH \in \mathcal{I}_y$.

In the following, we assume that the GES of x(t) is computed using the innovations system H and the GES of y(t) is computed using the innovations system KH. We note that if H is the positive semidefinite root of $R_x$, KH will not be the positive semidefinite root of $R_y$ unless K and H commute. In contrast, if H is a causal innovations system of x(t) and if K is causal, too, then KH is also a causal innovations system of y(t).

Theorem 3.27. For any random process x(t) and any LTV system K, the difference

$$\Delta^{(\alpha)}(t,f) \triangleq G_{Kx}^{(\alpha)}(t,f) - |L_K^{(\alpha)}(t,f)|^2\,G_x^{(\alpha)}(t,f)$$

(with $G_x^{(\alpha)}(t,f)$ based on the innovations system H and $G_{Kx}^{(\alpha)}(t,f)$ based on the innovations system KH) is bounded as

$$\frac{|\Delta^{(\alpha)}(t,f)|}{\|S_K\|_1^2\,\|S_H\|_1^2} \le 4\pi c_\alpha\Big(m_H^{(1,0)} m_K^{(0,1)} + m_K^{(1,0)} m_H^{(0,1)}\Big) \qquad (3.101)$$

with $c_\alpha = \big|\alpha + \tfrac12\big| + \big|\alpha - \tfrac12\big|$.
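Proof. With $G_x^{(\alpha)}(t,f) = |L_H^{(\alpha)}(t,f)|^2$ and $G_{Kx}^{(\alpha)}(t,f) = |L_{KH}^{(\alpha)}(t,f)|^2$, we have

$$\Delta^{(\alpha)}(t,f) = L_{KH}^{(\alpha)}(t,f)\,L_{KH}^{(\alpha)*}(t,f) - L_K^{(\alpha)}(t,f)\,L_K^{(\alpha)*}(t,f)\,L_H^{(\alpha)}(t,f)\,L_H^{(\alpha)*}(t,f)\,.$$

Hence, the 2-D Fourier transform of $\Delta^{(\alpha)}(t,f)$ is given by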
() () () () () () () (, ) = SKH S(KH)+ SK SH SK+ SH+ (, )
()
=
()
SK SH
()
()
SH+ SK+
()
()
()
(, ) SK SH SK+ SH+
()
()
()
()
(, )
where we used SH (, ) = SH+ (, ) (see (B.6)). Using (B.12), we further obtain |() (t, f )| =
1 1
|() (, )| d d SK SH (1 , 1 ) SH+ SK+

() () () () () () ()
( 1 , 1 ) d1 d1 ( 1 , 1 ) d1 d1 d d
SK SH (1 , 1 ) SH+ SK+
() () 1 2 2
()
SK (2 , 2 ) SH (1 2 , 1 2 ) ej2 (1 ,1 ,2 ,2 ) d2 d2 1 3 , 1 3 )
() () SK+ (3 , 3 )SH+ (
ej2 ( 1 ,1 ,3 ,3 )] d3 d3 d1 d1 =
1 1 2 2 3 3 1 1 2 2
SK (2 , 2 ) SH (1 2 , 1 2 ) d2 d2 1 3 , 1 3 ) d3 d3 d1 d1 d d
() ()
()
()
() () SK+ (3 , 3 )SH+ ( ()
SK (2 , 2 ) SH (1 2 , 1 2 ) SK+ (3 , 3 )
() SH+ (
1 3 , 1 3 ) ej2[ (1 ,1 ,2 ,2)+ ( 1 ,1 ,3 ,3 )] 1
d1 d1 d2 d2 d3 d3 d d
4 4
SK (2 , 2 ) SH ( , ) SK+ (3 , 3 ) SH+ (4 , 4 )
2 2 3 3
e 2
j2[ ( +2 , +2 ,2 ,2 )+ (3 +4 ,3 +4 ,3 ,3 )]
1 d4 d4 d d d2 d2 d3 d3
SK (2 , 2 ) SH ( , ) SK+ (3 , 3 ) SH+ (4 , 4 )
2 2 3 3
[Figure 3.10, panels (a) to (d); horizontal axis t, vertical axis f.]
Figure 3.10: Illustration of approximate input-output relation for the GES: (a) GES (Weyl spectrum) $G_x^{(0)}(t,f)$ of input process x(t), (b) Weyl symbol $L_K^{(0)}(t,f)$ of LTV system K, (c) GES $G_y^{(0)}(t,f)$ of filtered process y(t) = (Kx)(t) (using innovations system $K\sqrt{R_x}$), (d) approximation $|L_K^{(0)}(t,f)|^2\,G_x^{(0)}(t,f)$ for GES of y(t). The signal length is 256 samples, the normalized frequency range is [−1/4, 1/4].
1 1 + |2 | + |4 3 | + | 2 | + |3 4 | 2 2
2 1
d4 d4 d d d2 d2 d3 d3
$$= 4\pi c_\alpha\,\|S_H\|_1^2\,\|S_K\|_1^2\,\Big(m_H^{(1,0)} m_K^{(0,1)} + m_K^{(1,0)} m_H^{(0,1)}\Big)\,,$$

where in the last step we split the integral, collected corresponding terms, and used $m_{H^+}^{(k,l)} = m_H^{(k,l)}$.

Discussion. The foregoing theorem shows that if $m_H^{(1,0)} m_K^{(0,1)} + m_K^{(1,0)} m_H^{(0,1)}$ is small, the GES of y(t) = (Kx)(t) with innovations system KH can be approximated as

$$G_y^{(\alpha)}(t,f) \approx |L_K^{(\alpha)}(t,f)|^2\,G_x^{(\alpha)}(t,f)\,. \qquad (3.102)$$

Small $m_H^{(1,0)} m_K^{(0,1)} + m_K^{(1,0)} m_H^{(0,1)}$ requires that H and K (and hence x(t) and y(t)) are jointly underspread (see (2.30)), i.e., that the GSFs of H and K are both effectively concentrated in a similar way along the $\tau$ axis and/or $\nu$ axis. In the case $\alpha = 0$, a refined bound similar to (2.63) can be obtained by using metaplectic transformations of H and K; this allows the GSFs of H and K to be oriented in (similar) oblique directions. An example illustrating the approximation (3.102) for $\alpha = 0$ is shown in Fig. 3.10. In this example, the normalized error was $\max_{t,f} |\Delta^{(0)}(t,f)| / (\|S_K\|_1^2 \|S_H\|_1^2) = 6 \cdot 10^{-4}$ while the corresponding bound in (3.101) was $2.3 \cdot 10^{-3}$.
CL Processes. In the case of a DL innovations system H with GSF support contained in the rectangular region $\mathcal{G}_H = [-\tau_H,\tau_H] \times [-\nu_H,\nu_H]$, the process x(t) is CL with GEAF support region $\mathcal{G}_x = [-2\tau_H, 2\tau_H] \times [-2\nu_H, 2\nu_H]$. If in addition K is a DL system with GSF support contained in the rectangular region $\mathcal{G}_K = [-\tau_K,\tau_K] \times [-\nu_K,\nu_K]$, then y(t) = (Kx)(t) is also a CL process with GEAF support region $\mathcal{G}_y = [-2(\tau_H+\tau_K), 2(\tau_H+\tau_K)] \times [-2(\nu_H+\nu_K), 2(\nu_H+\nu_K)]$. In this case, applying (2.17) to (3.101) yields the bound

$$\frac{|\Delta^{(\alpha)}(t,f)|}{\|S_K\|_1^2\,\|S_H\|_1^2} \le 4\pi c_\alpha\big(\tau_H\nu_K + \tau_K\nu_H\big) \le 2\pi c_\alpha\,\sigma_{K,H}\,,$$

where $\sigma_{K,H}$ is the joint displacement spread of H and K as defined in (2.11). For $|\alpha| \le 1/2$ we have $c_\alpha = 1$ so that the last bound becomes $2\pi\,\sigma_{K,H}$.

3.7 Approximate Karhunen-Loève Expansion

As mentioned in Subsection 1.3.2, the complex sinusoids $e_{f_0}(t) = e^{j2\pi f_0 t}$ are the Karhunen-Loève (KL) eigenfunctions of any stationary random process x(t), with the PSD at frequency $f_0$, $P_x(f_0)$, the associated KL eigenvalue. Similarly, the Dirac impulses $\delta_{t_0}(t) = \delta(t - t_0)$ are the KL eigenfunctions of any white random process, with the mean instantaneous intensity at time $t_0$, $q_x(t_0)$, the associated KL eigenvalue. Note that in the stationary and white cases the KL eigenfunctions are highly structured, i.e., they are related by frequency shifts and time shifts, respectively.

The situation is different in the case of general nonstationary processes. The KL eigenfunctions of different processes are different (unless the associated correlation operators commute [64, 158]). Furthermore, the KL eigenfunctions of a general nonstationary process are not localized and structured in any sense and the KL eigenvalues are not equal to the values of any conventional time-varying power spectrum. However, we will now show that underspread processes have a well-structured set of TF-localized approximate KL eigenfunctions, with the associated approximate KL eigenvalues given by the GWVS values. We note that our discussion of approximate KL eigenvalues and eigenfunctions is essentially an adaptation of previous results of Kozek [115, 116, 118, 120] and is also conceptually similar to the approach presented in [137, 138]. Furthermore, the subsequent discussion is closely related to the results in Subsections 2.3.8 and 2.3.9.

Let s(t) be a normalized function that is well concentrated about the origin of the TF plane (e.g., a Gaussian function). We consider the family of functions $s_{t_0,f_0}(t) = s(t - t_0)\,e^{j2\pi f_0 t}$ obtained by TF-shifting s(t) to the TF point $(t_0, f_0)$. By construction, $s_{t_0,f_0}(t)$ is then well TF-concentrated about $(t_0, f_0)$.
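The stationary case recalled at the beginning of this section can be checked numerically in a discrete, circular setting. The following is only a minimal sketch under assumptions that are not part of the text: time is discrete and periodic with period N, so that the correlation operator becomes a circulant matrix, and the autocorrelation `r` is a hypothetical example (a periodized exponential). Under these assumptions the sampled complex sinusoids are exact eigenvectors of the correlation matrix, with the DFT of `r` (a discrete PSD) as eigenvalues.

```python
import numpy as np

N = 64
n = np.arange(N)

# Hypothetical autocorrelation: periodized exponential (symmetric, positive PSD).
rho = 0.9
r = rho ** n + rho ** (N - n)

# Circulant correlation matrix R[i, j] = r[(i - j) mod N] of a circularly stationary process.
R = r[(n[:, None] - n[None, :]) % N]

# Discrete PSD: DFT of the autocorrelation (real because r is symmetric).
P = np.fft.fft(r).real

# Sampled complex sinusoid at frequency index k.
k = 5
e_k = np.exp(2j * np.pi * k * n / N)

# KL eigen-equation R e_k = P[k] e_k; the residual should vanish up to rounding.
residual = np.linalg.norm(R @ e_k - P[k] * e_k)
print(residual)
```

For a general nonstationary (non-circulant) correlation matrix, the same sinusoids are no longer eigenvectors, which is the situation addressed in the remainder of this section.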
If x(t) is an underspread process, then $R_x$ is an underspread operator and Theorem 2.23, with H replaced by $R_x$, states that the $s_{t_0,f_0}(t)$ are approximate eigenfunctions of $R_x$ with $\overline{W}_x^{(\alpha)}(t_0,f_0) = L_{R_x}^{(\alpha)}(t_0,f_0)$ the associated approximate eigenvalues. This suggests that for an underspread process x(t), the TF shifted functions $s_{t_0,f_0}(t)$ are approximate KL eigenfunctions with $\overline{W}_x^{(\alpha)}(t_0,f_0)$ the associated approximate KL eigenvalues. However, an essential feature of the KL expansion is its double orthogonality, i.e., (see (1.8))

$$x(t) = \sum_{k=1}^{\infty} \langle x, u_k\rangle\,u_k(t)\,, \qquad E\{\langle x, u_k\rangle \langle x, u_l\rangle^*\} = \lambda_k\,\delta_{kl}\,, \qquad \langle u_k, u_l\rangle = \delta_{kl}\,.$$

A similar property cannot be achieved by the function set $\{s_{t_0,f_0}(t)\}$, which is continuously parameterized by time $t_0$ and frequency $f_0$. Hence, we alternatively consider the approximate KL expansion

$$x(t) = \sum_{k}\sum_{l} \langle x, u_{k,l}\rangle\,v_{k,l}(t) \qquad (3.103)$$

with discrete parameters k and l. Here, $\{u_{k,l}(t)\}$, $\{v_{k,l}(t)\}$ are biorthogonal Weyl-Heisenberg sets
[56, 100] obtained by TF shifting two functions u(t), v(t) (for convenience we assume $\|u\|_2 = 1$),

$$u_{k,l}(t) = u(t - kT)\,e^{j2\pi lFt}\,, \qquad v_{k,l}(t) = v(t - kT)\,e^{j2\pi lFt}\,,$$

with $TF \le 1$. The biorthogonality condition reads $\langle u_{k,l}, v_{k',l'}\rangle = \delta_{kk'}\,\delta_{ll'}$. In order for (3.103) to constitute an approximate KL expansion, we furthermore require (statistical) orthogonality of the expansion coefficients,

$$E\{\langle x, u_{k,l}\rangle \langle x, u_{k',l'}\rangle^*\} = \langle R_x u_{k,l}, u_{k',l'}\rangle \approx \overline{W}_x^{(\alpha)}(kT, lF)\,\delta_{kk'}\,\delta_{ll'}\,.$$

The error incurred by this approximation is bounded in the next corollary, which is a modified version of Theorem 2.23 with $H = R_x$.

Corollary 3.28. For any random process x(t) and any biorthogonal Weyl-Heisenberg set $\{u_{k,l}(t)\}$, the difference

$$\Delta^{(\alpha)}[k,l;k',l'] \triangleq \langle R_x u_{k,l}, u_{k',l'}\rangle - \overline{W}_x^{(\alpha)}(kT, lF)\,\delta_{kk'}\,\delta_{ll'}$$

is bounded as

$$\frac{\big|\Delta^{(\alpha)}[k,l;k',l']\big|}{\|\bar{A}_x\|_1} \le m_x^{(\phi_u^{(k-k',\,l-l')})} \qquad (3.104)$$

with $\phi_u^{(k,l)}(\tau,\nu) = \big|\delta_{k0}\,\delta_{l0} - A_u^{(\alpha)}(\tau + kT, \nu + lF)\big|$.

Discussion. The preceding corollary shows that for underspread processes where $m_x^{(\phi_u^{(k,l)})}$ can be made small by suitable choice of u(t) one has

$$E\{\langle x, u_{k,l}\rangle \langle x, u_{k',l'}\rangle^*\} = \langle R_x u_{k,l}, u_{k',l'}\rangle \approx \overline{W}_x^{(\alpha)}(kT, lF)\,\delta_{kk'}\,\delta_{ll'}\,, \qquad (3.105)$$

and hence approximate double (bi-)orthogonality of the expansion (3.103). For well TF-localized u(t) it can be shown that $A_u^{(\alpha)}(\tau,\nu) \approx A_u^{(\alpha)}(0,0) = 1$ or equivalently $\phi_u^{(0,0)}(\tau,\nu) = |1 - A_u^{(\alpha)}(\tau,\nu)| \approx 0$ about the origin of the $(\tau,\nu)$-plane. Thus, it is seen that small $m_x^{(\phi_u^{(0,0)})}$ requires that $|\bar{A}_x(\tau,\nu)|$ is concentrated about the origin, i.e., that x(t) is an underspread process. Furthermore, for $k \ne 0$, $l \ne 0$ there is $\phi_u^{(k,l)}(0,0) = |A_u^{(\alpha)}(kT, lF)|$. Thus, small $m_x^{(\phi_u^{(k,l)})}$ requires that $|\bar{A}_x(\tau,\nu)|$ is concentrated about the origin and $A_u^{(\alpha)}(kT, lF) \approx 0$ for $k \ne 0$ or $l \ne 0$, i.e., the ambiguity function of u(t) should decay quickly outside the effective support of $|\bar{A}_x(\tau,\nu)|$. This means that the effective support of $|A_u(\tau,\nu)|$ should be matched to the effective support of $|\bar{A}_x(\tau,\nu)|$.
3.8 Time-Frequency Coherence

The last section in this chapter is dedicated to the introduction and discussion of a TF coherence function. We will first briefly review the definition and properties of the ordinary spectral coherence function for stationary processes. For nonstationary processes, a coherence operator will be introduced. Using results from Subsection 2.3.6, we show that the GWS of the coherence operator can be approximated by a ratio involving the (cross) GWVS of the processes under consideration. Some efforts regarding the definition of a TF coherence function have previously been presented in [210].
3.8.1 Spectral Coherence and Coherence Operator

For jointly stationary processes x(t) and y(t) with PSDs $P_x(f)$, $P_y(f)$ and cross-PSD $P_{x,y}(f)$, the (spectral) coherence function is defined as [13, 67, 171]

$$\Gamma_{x,y}(f) \triangleq \frac{P_{x,y}(f)}{\sqrt{P_x(f)\,P_y(f)}}\,.$$

This is similar to a correlation coefficient of the Fourier transforms of x(t) and y(t) at frequency f. In particular, the spectral coherence function satisfies

$$0 \le |\Gamma_{x,y}(f)|^2 \le 1\,. \qquad (3.106)$$

The coherence function is important since it has unit magnitude in the case of processes related by linear transforms,$^8$ i.e., for $y(t) = (k * x)(t)$ one has

$$|\Gamma_{x,y}(f)|^2 = \frac{|P_{x,y}(f)|^2}{P_x(f)\,P_y(f)} = \frac{|P_x(f)\,K^*(f)|^2}{P_x(f)\,|K(f)|^2\,P_x(f)} = 1\,. \qquad (3.107)$$
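As a quick numerical illustration of (3.106) and (3.107), the following sketch evaluates the spectral coherence on a frequency grid. The input PSD `P_x`, the frequency response `K`, and the additive white noise level `P_n` are hypothetical choices made only for this example; the noisy case is included merely to show that the squared coherence then drops below one.

```python
import numpy as np

f = np.linspace(-0.5, 0.5, 101)          # normalized frequency grid

P_x = 1.0 / (1.0 + (4.0 * f) ** 2)        # hypothetical input PSD (positive)
K = 1.0 / (1.0 + 2j * np.pi * f)          # hypothetical LTI frequency response

# Output and cross PSD for y = k * x (noise-free linear relation).
P_y = np.abs(K) ** 2 * P_x
P_xy = P_x * np.conj(K)

# Squared spectral coherence, cf. (3.107): equals one at every frequency.
coh2 = np.abs(P_xy) ** 2 / (P_x * P_y)

# Adding uncorrelated white noise of PSD P_n to y leaves P_xy unchanged
# and increases P_y, so the squared coherence falls below one.
P_n = 0.1
coh2_noisy = np.abs(P_xy) ** 2 / (P_x * (P_y + P_n))

print(coh2.min(), coh2.max(), coh2_noisy.max())
```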
This property has been exploited in numerous signal processing applications [13].

Let x(t), y(t) be two zero-mean, generally nonstationary random processes. As a nonstationary counterpart of the coherence function, we now introduce a coherence operator as the cross-correlation operator of the whitened processes $(R_x^{-1/2}x)(t)$ and $(R_y^{-1/2}y)(t)$, i.e.,$^9$

$$\boldsymbol{\Gamma}_{x,y} \triangleq E\big\{(R_x^{-1/2}x) \otimes (R_y^{-1/2}y)^*\big\} = R_x^{-1/2}\,R_{x,y}\,R_y^{-1/2}\,, \qquad (3.108)$$

where $R_{x,y} = E\{x \otimes y^*\}$, and $R_x$ and $R_y$ are assumed invertible with positive definite roots $R_x^{1/2}$, $R_y^{1/2}$. Note that there is $\boldsymbol{\Gamma}_{y,x} = \boldsymbol{\Gamma}_{x,y}^+$. Furthermore, in the case of jointly stationary processes x(t) and y(t), $R_x$, $R_y$, and $R_{x,y}$ are convolution operators. Hence, in this case $\boldsymbol{\Gamma}_{x,y}$ is a convolution operator as well and its kernel $(\boldsymbol{\Gamma}_{x,y})(t_1,t_2) = \gamma_{x,y}(t_1 - t_2)$ corresponds to the inverse Fourier transform of $\Gamma_{x,y}(f)$, i.e., $\gamma_{x,y}(\tau) = \int_f \Gamma_{x,y}(f)\,e^{j2\pi f\tau}\,df$. In this sense, the definition of $\boldsymbol{\Gamma}_{x,y}$ is consistent with the coherence function of the stationary case.

Apart from the consistency with the stationary case, a meaningful interpretation of $\boldsymbol{\Gamma}_{x,y}$ as coherence requires that it satisfies properties similar to (3.106) and (3.107). The property (3.106) can be extended to the nonstationary case as follows:

Theorem 3.29. The operator norm of the coherence operator satisfies

$$\|\boldsymbol{\Gamma}_{x,y}\|_O \le 1\,. \qquad (3.109)$$

Furthermore, the quadratic forms induced by the squared coherence operators $\boldsymbol{\Gamma}_{x,y}\boldsymbol{\Gamma}_{x,y}^+$ and $\boldsymbol{\Gamma}_{x,y}^+\boldsymbol{\Gamma}_{x,y}$ satisfy

$$0 \le \langle \boldsymbol{\Gamma}_{x,y}\boldsymbol{\Gamma}_{x,y}^+ g, g\rangle \le 1\,, \qquad 0 \le \langle \boldsymbol{\Gamma}_{x,y}^+\boldsymbol{\Gamma}_{x,y} g, g\rangle \le 1\,, \qquad (3.110)$$

where g(t) is any normalized function (i.e., $\|g\|_2 = 1$).

8 Note that due to the assumption of joint stationarity of x(t) and y(t), admissible linear transforms necessarily correspond to LTI systems.
9 A similar approach in a discrete-time framework has recently been taken in [189].
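Proof. We start from the singular value decomposition of the coherence operator,

$$\boldsymbol{\Gamma}_{x,y} = \sum_k \sigma_k\,u_k \otimes v_k^*\,.$$

Here, the orthonormal bases $\{u_k(t)\}$ and $\{v_k(t)\}$ are the left and right singular functions, respectively, and $\sigma_k > 0$ are the singular values given by $\sigma_k = \langle \boldsymbol{\Gamma}_{x,y} v_k, u_k\rangle$. According to (3.108), $\boldsymbol{\Gamma}_{x,y} = E\{\tilde{x} \otimes \tilde{y}^*\}$ where $\tilde{x}(t) = (R_x^{-1/2}x)(t)$ and $\tilde{y}(t) = (R_y^{-1/2}y)(t)$ are stationary white processes with correlation $R_{\tilde{x}} = R_{\tilde{y}} = I$. Hence, we further obtain

$$\sigma_k = \langle \boldsymbol{\Gamma}_{x,y} v_k, u_k\rangle = \langle (E\{\tilde{x} \otimes \tilde{y}^*\}) v_k, u_k\rangle = E\{\langle \tilde{x}, u_k\rangle \langle \tilde{y}, v_k\rangle^*\} \le \sqrt{E\{|\langle \tilde{x}, u_k\rangle|^2\}\,E\{|\langle \tilde{y}, v_k\rangle|^2\}} = \sqrt{\langle R_{\tilde{x}} u_k, u_k\rangle \langle R_{\tilde{y}} v_k, v_k\rangle} = \sqrt{\|u_k\|_2^2\,\|v_k\|_2^2} = 1\,,$$

where we used the Schwarz inequality for random variables. Since the supremum of the singular values determines the operator norm [69], i.e., $\|\boldsymbol{\Gamma}_{x,y}\|_O = \sup_k\{\sigma_k\}$, the inequality (3.109) follows. The left-hand inequality in (3.110) is obtained by noting that $\boldsymbol{\Gamma}_{x,y}\boldsymbol{\Gamma}_{x,y}^+$ is a positive semidefinite operator, and the right-hand inequality follows from

$$\langle \boldsymbol{\Gamma}_{x,y}\boldsymbol{\Gamma}_{x,y}^+ g, g\rangle = \|\boldsymbol{\Gamma}_{x,y}^+ g\|_2^2 \le \|\boldsymbol{\Gamma}_{x,y}\|_O^2\,\|g\|_2^2 \le 1\,,$$

where in the last step we used the previously derived bound $\|\boldsymbol{\Gamma}_{x,y}\|_O \le 1$. The inequality involving $\boldsymbol{\Gamma}_{x,y}^+\boldsymbol{\Gamma}_{x,y}$ is shown in a similar way.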
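A finite-dimensional sanity check of Theorem 3.29 can be sketched as follows. This is only an illustration under assumptions not made in the text: the processes are replaced by random vectors of length N, the operators by matrices, and the joint correlation matrix `R` of the stacked vector is drawn at random. The helper `inv_sqrt` is a hypothetical name introduced for this example.

```python
import numpy as np

def inv_sqrt(M):
    """Inverse of the positive definite square root of a Hermitian matrix M."""
    w, V = np.linalg.eigh(M)
    return (V / np.sqrt(w)) @ V.conj().T

rng = np.random.default_rng(0)
N = 8

# Random (positive definite) joint correlation matrix of the stacked vector [x; y].
A = rng.standard_normal((2 * N, 2 * N)) + 1j * rng.standard_normal((2 * N, 2 * N))
R = A @ A.conj().T + 1e-3 * np.eye(2 * N)
R_x, R_y, R_xy = R[:N, :N], R[N:, N:], R[:N, N:]   # R_xy = E{x y^H}

# Coherence matrix, the matrix analogue of (3.108).
Gamma = inv_sqrt(R_x) @ R_xy @ inv_sqrt(R_y)

# Operator norm = largest singular value, cf. (3.109).
singular_values = np.linalg.svd(Gamma, compute_uv=False)

# Quadratic form of the squared coherence matrix for a normalized g, cf. (3.110).
g = rng.standard_normal(N) + 1j * rng.standard_normal(N)
g /= np.linalg.norm(g)
q = np.real(np.vdot(g, Gamma @ Gamma.conj().T @ g))

print(singular_values.max(), q)
```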
With regard to (3.107), let us consider two processes related by an invertible linear system K, i.e., y(t) = (Kx)(t). Here, we have $R_{x,y} = R_x K^+$ and $R_y = K R_x K^+$, and hence

$$\boldsymbol{\Gamma}_{x,y}\boldsymbol{\Gamma}_{x,y}^+ = R_x^{-1/2}\,R_{x,y}\,R_y^{-1}\,R_{x,y}^+\,R_x^{-1/2} = R_x^{-1/2}\,R_x K^+ (K^+)^{-1} R_x^{-1} K^{-1} K R_x\,R_x^{-1/2} = R_x^{1/2}\,R_x^{-1}\,R_x^{1/2} = I\,.$$

That is, the squared coherence operator $\boldsymbol{\Gamma}_{x,y}\boldsymbol{\Gamma}_{x,y}^+$ equals the identity operator. Similarly, it can be shown that $\boldsymbol{\Gamma}_{x,y}^+\boldsymbol{\Gamma}_{x,y} = I$. Hence, we conclude that the coherence operator of linearly related processes is unitary. This extends the central property (3.107) of the stationary coherence function $\Gamma_{x,y}(f)$ to the nonstationary case.
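The unitarity of the coherence operator for linearly related processes can likewise be checked with matrices. The sketch below assumes a finite-dimensional setting with a randomly chosen positive definite `R_x` and a well-conditioned, invertible matrix `K`; both are hypothetical example data, and `inv_sqrt` is a helper name introduced here.

```python
import numpy as np

def inv_sqrt(M):
    """Inverse of the positive definite square root of a Hermitian matrix M."""
    w, V = np.linalg.eigh(M)
    return (V / np.sqrt(w)) @ V.conj().T

rng = np.random.default_rng(1)
N = 6

# Hypothetical positive definite correlation matrix of x.
B = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
R_x = B @ B.conj().T + np.eye(N)

# Hypothetical invertible system matrix (small perturbation of 2 I).
K = 2.0 * np.eye(N) + 0.1 * (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N)))

# Correlations for y = K x: R_xy = R_x K^+ and R_y = K R_x K^+.
R_xy = R_x @ K.conj().T
R_y = K @ R_x @ K.conj().T

Gamma = inv_sqrt(R_x) @ R_xy @ inv_sqrt(R_y)

# Both squared coherence matrices should equal the identity (unitary Gamma).
err_left = np.linalg.norm(Gamma @ Gamma.conj().T - np.eye(N))
err_right = np.linalg.norm(Gamma.conj().T @ Gamma - np.eye(N))
print(err_left, err_right)
```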
3.8.2 Time-Frequency Formulation of the Coherence Operator
While the coherence operator $\boldsymbol{\Gamma}_{x,y}$ satisfies mathematical properties similar to those of the coherence function, it involves computationally costly and unstable operator inversions and lacks physical intuition. The GWS of $\boldsymbol{\Gamma}_{x,y}$,

$$L_{\boldsymbol{\Gamma}_{x,y}}^{(\alpha)}(t,f) = L^{(\alpha)}_{R_x^{-1/2} R_{x,y} R_y^{-1/2}}(t,f)\,,$$

yields a TF formulation of the coherence operator; however, it still involves operator inverses. Therefore, we now introduce a more convenient TF reformulation of $\boldsymbol{\Gamma}_{x,y}$ and show that it is an approximation to $L_{\boldsymbol{\Gamma}_{x,y}}^{(\alpha)}(t,f)$.
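We first note that the coherence operator can be defined alternatively by

$$H_x\,\boldsymbol{\Gamma}_{x,y}\,H_y = R_{x,y}\,, \qquad (3.111)$$

where $H_x = \sqrt{R_x}$ and $H_y = \sqrt{R_y}$ are the positive (semi-)definite innovations systems of x(t) and y(t), respectively. Next, we assume that $H_x$, $H_y$, and $R_{x,y}$ are jointly DL operators with joint GSF support region $\mathcal{G}$. Note that this implies that $R_x$ and $R_y$ are jointly DL underspread as well; hence, x(t) and y(t) are jointly CL underspread. In order to obtain an approximation to the GWS of $\boldsymbol{\Gamma}_{x,y}$, one could think of directly applying (2.62) to (3.111), i.e.,

$$L_{H_x \boldsymbol{\Gamma}_{x,y} H_y}^{(\alpha)}(t,f) \approx L_{H_x}^{(\alpha)}(t,f)\,L_{\boldsymbol{\Gamma}_{x,y}}^{(\alpha)}(t,f)\,L_{H_y}^{(\alpha)}(t,f) = L_{R_{x,y}}^{(\alpha)}(t,f)\,.$$

Unfortunately, even though $H_x$, $H_y$, and $R_{x,y}$ are DL underspread operators, it is not obvious whether $\boldsymbol{\Gamma}_{x,y}$ is underspread as well, i.e., whether the GSF of $\boldsymbol{\Gamma}_{x,y}$ is sufficiently concentrated about the origin. However, by noting that (3.111) is of the form (2.76) with $H_1 = H_x$, $H_2 = H_y$, $H_3 = R_{x,y}$, and $G = \boldsymbol{\Gamma}_{x,y}$, we can use the results of Subsection 2.3.6.

DL approximation of $\boldsymbol{\Gamma}_{x,y}$. We split $\boldsymbol{\Gamma}_{x,y}$ into a DL part and a non-DL part according to (2.4), i.e., $\boldsymbol{\Gamma}_{x,y} = \boldsymbol{\Gamma}_{x,y}^{\mathcal{G}} + \boldsymbol{\Gamma}_{x,y}^{\bar{\mathcal{G}}}$. Then, Theorem 2.18 states that the non-DL part $\boldsymbol{\Gamma}_{x,y}^{\bar{\mathcal{G}}} = \boldsymbol{\Gamma}_{x,y} - \boldsymbol{\Gamma}_{x,y}^{\mathcal{G}}$ is negligible in the sense that removing it from $\boldsymbol{\Gamma}_{x,y}$ does not greatly influence the validity of (3.111). For convenience, we reformulate this theorem as a corollary incorporating the necessary notational modifications.

Corollary 3.30. Consider a coherence operator $\boldsymbol{\Gamma}_{x,y}$ defined by $H_x \boldsymbol{\Gamma}_{x,y} H_y = R_{x,y}$, where $H_x = \sqrt{R_x}$, $H_y = \sqrt{R_y}$, and $R_{x,y}$ are jointly DL with GSF support contained in the rectangle $\mathcal{G} = [-\tau_{\mathcal{G}}, \tau_{\mathcal{G}}] \times [-\nu_{\mathcal{G}}, \nu_{\mathcal{G}}]$, so that the joint displacement spread is given by $\sigma_{\mathcal{G}} = 4\tau_{\mathcal{G}}\nu_{\mathcal{G}}$. Let $\boldsymbol{\Gamma}_{x,y}^{\mathcal{G}}$ denote the DL part of $\boldsymbol{\Gamma}_{x,y}$ defined by $S^{(\alpha)}_{\boldsymbol{\Gamma}_{x,y}^{\mathcal{G}}}(\tau,\nu) = S^{(\alpha)}_{\boldsymbol{\Gamma}_{x,y}}(\tau,\nu)\,I_{\mathcal{G}}(\tau,\nu)$ and let $\boldsymbol{\Gamma}_{x,y}^{\bar{\mathcal{G}}} = \boldsymbol{\Gamma}_{x,y} - \boldsymbol{\Gamma}_{x,y}^{\mathcal{G}}$ denote the non-DL part of $\boldsymbol{\Gamma}_{x,y}$. Then, the difference $H_x \boldsymbol{\Gamma}_{x,y}^{\mathcal{G}} H_y - R_{x,y}$ is bounded as

$$\frac{\|H_x \boldsymbol{\Gamma}_{x,y}^{\mathcal{G}} H_y - R_{x,y}\|_2}{\|H_x\|_2\,\|\boldsymbol{\Gamma}_{x,y}\|_2\,\|H_y\|_2} \le 3\pi\,\sigma_{\mathcal{G}}\,. \qquad (3.112)$$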
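Hence, it is seen that if $\sigma_{\mathcal{G}}$ is small, i.e., if x(t) and y(t) are jointly CL underspread, then removing the non-DL part $\boldsymbol{\Gamma}_{x,y}^{\bar{\mathcal{G}}}$ from $\boldsymbol{\Gamma}_{x,y}$ does not greatly affect the validity of $H_x \boldsymbol{\Gamma}_{x,y} H_y = R_{x,y}$:

$$H_x\,\boldsymbol{\Gamma}_{x,y}\,H_y = R_{x,y} \quad\Longrightarrow\quad H_x\,\boldsymbol{\Gamma}_{x,y}^{\mathcal{G}}\,H_y \approx R_{x,y}\,.$$

Note that since $R_x = H_x^2$ and $R_y = H_y^2$, x(t) and y(t) are CL processes with GEAF support region $\mathcal{G}_2 = [-2\tau_{\mathcal{G}}, 2\tau_{\mathcal{G}}] \times [-2\nu_{\mathcal{G}}, 2\nu_{\mathcal{G}}]$. Since by assumption the cross-GEAF of x(t) and y(t) (i.e., the GSF of $R_{x,y}$) is contained in $\mathcal{G} \subset \mathcal{G}_2$, it follows that x(t) and y(t) are jointly CL processes with joint correlation spread $\sigma_{x,y} = 4\sigma_{\mathcal{G}}$. Hence, small $\sigma_{\mathcal{G}}$ essentially requires small $\sigma_{x,y}$, i.e., jointly CL underspread processes.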
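Approximation of the GWS of $\boldsymbol{\Gamma}_{x,y}$. The foregoing corollary is the basis for the next result, which states an approximation of the GWS of the DL part $\boldsymbol{\Gamma}_{x,y}^{\mathcal{G}}$ of the coherence operator. We note that $L^{(\alpha)}_{\boldsymbol{\Gamma}_{x,y}^{\mathcal{G}}}(t,f)$ is a smoothed version of the GWS of $\boldsymbol{\Gamma}_{x,y}$,

$$L^{(\alpha)}_{\boldsymbol{\Gamma}_{x,y}^{\mathcal{G}}}(t,f) = \big(L^{(\alpha)}_{\boldsymbol{\Gamma}_{x,y}} ** L^{(\alpha)}_{T}\big)(t,f)\,,$$

where T is defined via $S_T^{(\alpha)}(\tau,\nu) = I_{\mathcal{G}}(\tau,\nu)$ (see Subsection 2.1.1). The next corollary is a reformulation of Theorem 2.19 with some notational adaptations.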
G Corollary 3.31. Let Hx , Hy , Rx,y , x,y , x,y , and the rectangular region G be dened as in Corollary ()
()
3.30. Then, the dierence
() (t, f )
()
LHx (t, f )LG (t, f ) LHy (t, f ) Wx,y (t, f ) ,

x,y
()
()
()
where Wx,y (t, f ) is the cross-GWVS as dened in (B.45), is bounded as () (t, f ) 1 Sx,y SHy 3 2 c G + 9 G , 2 () x,y
2 2 2 c G + 8 c 3 G + 3 G ,
SHx
Hx
Hy
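with $c_\alpha = |\alpha + 1/2| + |\alpha - 1/2|$. The above result shows that for small $\sigma_{\mathcal{G}}$, it follows from $H_1 G H_2 = H_3$ that

$$L_{H_x}^{(\alpha)}(t,f)\,L^{(\alpha)}_{\boldsymbol{\Gamma}_{x,y}^{\mathcal{G}}}(t,f)\,L_{H_y}^{(\alpha)}(t,f) \approx L_{R_{x,y}}^{(\alpha)}(t,f) = \overline{W}_{x,y}^{(\alpha)}(t,f)\,.$$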
As explained above, small G essentially requires that x(t) and y(t) are jointly CL underspread processes. Using the regularized inversion techniques of Subsection 2.3.6, we can now obtain a TF approximation for the GWS of (the DL part of) x,y . In particular, let us dene an operator x,y via its GWS as (cf. (2.93), (2.94)) Wx,y (t, f ) LHx (t, f ) LHy (t, f )
() () ()
Le
() (t, f ) x,y
where
()
0,
()
, for (t, f ) R , for (t, f ) R ,
(3.113)
(t, f ) :
LHx (t, f ) LHy (t, f ) SHx

() 1
SHy
with
3 2 c G + 9 G , 2
is the TF region where LHx (t, f ) LHy (t, f ) is essentially nonzero. According to the relevant derivation in Subsection 2.3.6, Corollary 3.31 now implies that within R, 1 Sx,y
() () (t, f ) x,y
()
LG (t, f ) L e
x,y
1 .
Hence, for large enough it follows that within R LG (t, f ) L e

x,y
()
() (t, f ) . x,y
(3.114)
G G Since x,y approximately satises (3.111), i.e., Hx x,y Hy Rx,y , the approximation (3.114) shows
that for large enough, the operator x,y dened via (3.113) approximately satises (3.111) as well: Hx x,y Hy = Rx,y
()
=
()
G Hx x,y Hy Rx,y () (t, f ) x,y
Lx,y LT (t, f ) L e
Wx,y (t, f ) LHx (t, f ) LHy (t, f ) 0,

() ()
()
, for (t, f ) R, for (t, f ) R .
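Finally, we can rephrase the TF approximation of $L^{(\alpha)}_{\boldsymbol{\Gamma}_{x,y}^{\mathcal{G}}}(t,f)$ in a still more suggestive form. Since $H_x$ and $H_y$ are positive semidefinite DL underspread innovations systems, combining Corollaries 3.23 and 3.8 yields the approximations

$$|L_{H_x}^{(\alpha)}(t,f)|^2 \approx \overline{W}_x^{(\alpha)}(t,f) \approx \mathcal{P}\{\overline{W}_x^{(\alpha)}(t,f)\}\,, \qquad |L_{H_y}^{(\alpha)}(t,f)|^2 \approx \overline{W}_y^{(\alpha)}(t,f) \approx \mathcal{P}\{\overline{W}_y^{(\alpha)}(t,f)\}\,.$$

By virtue of Theorem 2.30, there is $|L_{H_x}^{(\alpha)}(t,f)| \approx L_{H_x}^{(\alpha)}(t,f)$ and $|L_{H_y}^{(\alpha)}(t,f)| \approx L_{H_y}^{(\alpha)}(t,f)$. Hence, we obtain $L_{H_x}^{(\alpha)}(t,f) \approx \sqrt{\mathcal{P}\{\overline{W}_x^{(\alpha)}(t,f)\}}$ and $L_{H_y}^{(\alpha)}(t,f) \approx \sqrt{\mathcal{P}\{\overline{W}_y^{(\alpha)}(t,f)\}}$. Plugging these approximations into (3.113), we can finally rewrite (3.114) as

$$L^{(\alpha)}_{\boldsymbol{\Gamma}_{x,y}^{\mathcal{G}}}(t,f) \approx \Gamma_{x,y}^{(\alpha)}(t,f) \triangleq \begin{cases} \dfrac{\overline{W}_{x,y}^{(\alpha)}(t,f)}{\sqrt{\mathcal{P}\{\overline{W}_x^{(\alpha)}(t,f)\}\,\mathcal{P}\{\overline{W}_y^{(\alpha)}(t,f)\}}}\,, & \text{for } (t,f) \in \mathcal{R}\,, \\[2ex] 0\,, & \text{for } (t,f) \notin \mathcal{R}\,. \end{cases} \qquad (3.115)$$

Here, $\Gamma_{x,y}^{(\alpha)}(t,f)$ defines a TF coherence function that is formulated in terms of GWVS. For $\alpha = 0$, the TF coherence function $\Gamma_{x,y}^{(0)}(t,f)$ is similar to the TF coherence function introduced in [210]. There, however, $\mathcal{R}$ was chosen as the TF region where $\overline{W}_x^{(0)}(t,f)$ and $\overline{W}_y^{(0)}(t,f)$ are positive, and the so-defined TF coherence was proposed in an ad hoc fashion without establishing its relation to the coherence operator.

Finally, with Corollary 2.17 it follows from (3.115) that the GWS of the square of the DL part of the coherence operator is approximately given by

$$L^{(\alpha)}_{\boldsymbol{\Gamma}_{x,y}^{\mathcal{G}} \boldsymbol{\Gamma}_{x,y}^{\mathcal{G}+}}(t,f) \approx \big|L^{(\alpha)}_{\boldsymbol{\Gamma}_{x,y}^{\mathcal{G}}}(t,f)\big|^2 \approx \big|\Gamma_{x,y}^{(\alpha)}(t,f)\big|^2 = \begin{cases} \dfrac{|\overline{W}_{x,y}^{(\alpha)}(t,f)|^2}{\mathcal{P}\{\overline{W}_x^{(\alpha)}(t,f)\}\,\mathcal{P}\{\overline{W}_y^{(\alpha)}(t,f)\}}\,, & \text{for } (t,f) \in \mathcal{R}\,, \\[2ex] 0\,, & \text{for } (t,f) \notin \mathcal{R}\,. \end{cases}$$

Discussion. The foregoing results have shown that for processes where $\sqrt{R_x}$, $\sqrt{R_y}$, and $R_{x,y}$ are jointly DL with GSF support region $\mathcal{G}$, the non-DL part $\boldsymbol{\Gamma}_{x,y}^{\bar{\mathcal{G}}}$ is negligible in the sense that $\sqrt{R_x}\,\boldsymbol{\Gamma}_{x,y}^{\mathcal{G}}\,\sqrt{R_y} \approx R_{x,y}$. Furthermore, the GWS of $\boldsymbol{\Gamma}_{x,y}^{\mathcal{G}}$ (which is a smoothed version of the GWS of $\boldsymbol{\Gamma}_{x,y}$) is approximately equal to the TF coherence function $\Gamma_{x,y}^{(\alpha)}(t,f)$. We note that if $\sqrt{R_x}$, $\sqrt{R_y}$, and $R_{x,y}$ are jointly DL with GSF support area $\sigma_{\mathcal{G}}$, the processes x(t) and y(t) are jointly CL with joint correlation spread $\sigma_{x,y} = 4\sigma_{\mathcal{G}}$ (see Subsection 3.1.5).

To illustrate the approximation (3.115), we analyze the coherence of the input process x(t) and the noise-contaminated output process y(t) = (Kx)(t) + n(t) of an underspread, self-adjoint LTV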
[Figure 3.11, panels (a) to (c); horizontal axis t, vertical axis f.]
Figure 3.11: Illustration of approximation of $L^{(0)}_{\boldsymbol{\Gamma}_{x,y}}(t,f)$ by TF coherence function $\Gamma_{x,y}^{(0)}(t,f)$ for y(t) = (Kx)(t) + n(t): (a) Weyl symbol of the LTV system K, $L_K^{(0)}(t,f)$; (b) magnitude of Weyl symbol of coherence operator $\boldsymbol{\Gamma}_{x,y}$, $|L^{(0)}_{\boldsymbol{\Gamma}_{x,y}}(t,f)|$; (c) magnitude of TF coherence function, $|\Gamma_{x,y}^{(0)}(t,f)|$. The signal length is 128 samples, normalized frequency ranges from −1/4 to 1/4.
system K (the Weyl symbol of K is shown in Fig. 3.11(a)). Input signal x(t) and noise n(t) are assumed to be zero-mean, uncorrelated, with respective correlations $R_x = I$ and $R_n = \eta I$. It follows that $R_y = KK^+ + \eta I$ and $R_{x,y} = K^+$. In the noise-free case, x(t) and y(t) would be completely coherent, i.e., $\boldsymbol{\Gamma}_{x,y}\boldsymbol{\Gamma}_{x,y}^+ = I$. However, the noise n(t) causes an SNR-dependent reduction of coherence. In particular, with $K = \sum_k \lambda_k\,w_k \otimes w_k^*$ one can show that

$$\boldsymbol{\Gamma}_{x,y}\boldsymbol{\Gamma}_{x,y}^+ = \sum_k \frac{|\lambda_k|^2}{|\lambda_k|^2 + \eta}\,w_k \otimes w_k^*\,.$$

The SNR dependence of coherence is clearly visible in the Weyl symbol of $\boldsymbol{\Gamma}_{x,y}$ that is shown in Fig. 3.11(b). In particular, in those regions of the TF plane where $L_K^{(0)}(t,f)$ is large, the output SNR is large (i.e., (Kx)(t) is the dominant part of the output signal) and the coherence of x(t) and y(t) in these regions is large. Similarly, in TF regions where $L_K^{(0)}(t,f)$ is small, the output SNR is small (i.e., the output signal is dominated by n(t)) and the coherence in these TF regions will be small. The TF coherence function $\Gamma_{x,y}^{(0)}(t,f)$ is shown for comparison in Fig. 3.11(c). It is seen to be practically identical to the Weyl symbol of $\boldsymbol{\Gamma}_{x,y}$, thereby confirming the approximation (3.115).
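The eigen-expansion of the squared coherence operator in the noisy example can be verified in a matrix setting. The sketch below is only an illustration: the self-adjoint system is replaced by a random real symmetric matrix `K`, and the noise power `eta` is a hypothetical value (its symbol is likewise a notational choice of this example).

```python
import numpy as np

def inv_sqrt(M):
    """Inverse of the positive definite square root of a Hermitian matrix M."""
    w, V = np.linalg.eigh(M)
    return (V / np.sqrt(w)) @ V.conj().T

rng = np.random.default_rng(2)
N = 6
eta = 0.5                                  # hypothetical noise power

# Hypothetical self-adjoint system matrix with eigen-decomposition K = W diag(lam) W^T.
M = rng.standard_normal((N, N))
K = (M + M.T) / 2
lam, W = np.linalg.eigh(K)

# White input (R_x = I) and white noise (R_n = eta I): R_y = K K^+ + eta I, R_xy = K^+.
R_y = K @ K.T + eta * np.eye(N)
R_xy = K.T

# Coherence matrix; R_x^{-1/2} is the identity here.
Gamma = R_xy @ inv_sqrt(R_y)
GG = Gamma @ Gamma.conj().T

# Expected expansion: same eigenvectors as K, eigenvalues |lam|^2 / (|lam|^2 + eta) < 1.
gains = lam ** 2 / (lam ** 2 + eta)
expected = (W * gains) @ W.T

print(np.linalg.norm(GG - expected), gains)
```

Eigen-directions of `K` with large $|\lambda_k|$ (high SNR) retain a coherence close to one, while those with small $|\lambda_k|$ are attenuated.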
3.8.3 The Generalized Time-Frequency Coherence Function
In the previous subsection, we considered an approximate TF formulation of the coherence operator in terms of the (cross-) GWVS. For (jointly) underspread processes, the GWVS could be replaced by other type I time-varying power spectra to which it is approximately equal (cf. Theorem 3.15). This motivates the following definition of a generalized TF coherence function using type I (cross-)spectra of the processes x(t) and y(t) as defined in Subsection 3.3,

$$\Gamma_{x,y}(t,f) \triangleq \frac{C_{x,y}(t,f)}{\sqrt{\mathcal{P}\{C_x(t,f)\}\,\mathcal{P}\{C_y(t,f)\}}}\,. \qquad (3.116)$$

Here, the positive parts of $C_x(t,f)$ and $C_y(t,f)$ are used in order to make the square root in the denominator well-defined. Note that according to Theorem 3.19, neglecting the negative parts of $C_x(t,f)$ and $C_y(t,f)$ does not result in a loss of information if x(t) and y(t) are underspread processes. This does not restrict the applicability of $\Gamma_{x,y}(t,f)$ since, as will be seen below, for a meaningful interpretation of a TF coherence function the underspread condition is required anyway. Taking the positive parts is not necessary if only the squared magnitude of the TF coherence function is of interest. Here, we simply write

$$|\Gamma_{x,y}(t,f)|^2 \triangleq \frac{|C_{x,y}(t,f)|^2}{C_x(t,f)\,C_y(t,f)}\,. \qquad (3.117)$$

A particularly intuitive interpretation of the above TF coherence function is obtained when the type I spectrum $C(t,f)$ is chosen as the physical spectrum (which is always positive, $\mathcal{P}\{PS_x^{(g)}(t,f)\} = PS_x^{(g)}(t,f)$), i.e.,

$$\Gamma_{x,y}(t,f) = \frac{PS_{x,y}^{(g)}(t,f)}{\sqrt{PS_x^{(g)}(t,f)\,PS_y^{(g)}(t,f)}} = \frac{E\{\langle x, g_{t,f}\rangle \langle y, g_{t,f}\rangle^*\}}{\sqrt{E\{|\langle x, g_{t,f}\rangle|^2\}\,E\{|\langle y, g_{t,f}\rangle|^2\}}}\,.$$
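Here, the correlation and mean power of the nonstationary processes x(t) and y(t) at the TF analysis point (t, f) are measured via a TF shifted analysis window $g_{t,f}(t') = g(t' - t)\,e^{j2\pi f t'}$, thereby endowing $\Gamma_{x,y}(t,f)$ with an immediate physical interpretation as a TF correlation coefficient. Due to the following result, positive type I spectra (satisfying $\mathcal{P}\{C_x(t,f)\} = C_x(t,f)$ for all x(t)) are of particular interest.

Theorem 3.32. The squared magnitude of the TF coherence function in (3.117) is bounded as

$$0 \le |\Gamma_{x,y}(t,f)|^2 \le 1$$

if and only if the underlying type I spectrum $C(t,f) = \mathrm{Tr}\{C_{t,f} R\}$ is induced by a semidefinite operator, i.e., $C \ge 0$ or $C \le 0$.

Proof. The lower bound $0 \le |\Gamma_{x,y}(t,f)|$ is of course trivial. Since

$$C_{x,y}(t,f) = E\{\langle C_{t,f} x, y\rangle\} = E\{\langle C S_{t,f}^+ x, S_{t,f}^+ y\rangle\} = E\{\langle C\tilde{x}, \tilde{y}\rangle\} = C_{\tilde{x},\tilde{y}}(0,0)\,,$$

with $\tilde{x}(t) = (S_{t,f}^+ x)(t)$, $\tilde{y}(t) = (S_{t,f}^+ y)(t)$, we can restrict attention to $\Gamma_{x,y}(0,0)$. Furthermore, since only the magnitudes of $C_{x,y}(t,f)$, $C_x(t,f)$, and $C_y(t,f)$ are of interest, the case $C \le 0$ can be reduced to the case $C \ge 0$ by the substitution $C \to -C$. Hence, let us next assume $C \ge 0$ so that $C = \sum_{k=1}^{K} \lambda_k\,g_k \otimes g_k^*$ with $\lambda_k > 0$ (where possibly $K = \infty$). We obtain

$$C_{x,y}(0,0) = E\{\langle Cx, y\rangle\} = E\Big\{\sum_{k=1}^{K} \lambda_k\,\langle x, g_k\rangle \langle y, g_k\rangle^*\Big\} = \langle \mathbf{x}, \mathbf{y}\rangle_E$$

with the vectors $\mathbf{x} = (\langle x, g_1\rangle, \ldots, \langle x, g_K\rangle)^T$, $\mathbf{y} = (\langle y, g_1\rangle, \ldots, \langle y, g_K\rangle)^T$, and the inner product $\langle \mathbf{x}, \mathbf{y}\rangle_E \triangleq E\{\mathbf{y}^H \boldsymbol{\Lambda}\,\mathbf{x}\}$ where $\boldsymbol{\Lambda} = \mathrm{diag}\{\lambda_1, \ldots, \lambda_K\}$ (it is easily checked that this defines a valid inner product for the vectors $\mathbf{x}$ and $\mathbf{y}$). With $\|\mathbf{x}\|_E^2 = \langle \mathbf{x}, \mathbf{x}\rangle_E$ and the Schwarz inequality, we further obtain

$$|\Gamma_{x,y}(0,0)|^2 = \frac{|C_{x,y}(0,0)|^2}{C_x(0,0)\,C_y(0,0)} = \frac{|\langle \mathbf{x}, \mathbf{y}\rangle_E|^2}{\|\mathbf{x}\|_E^2\,\|\mathbf{y}\|_E^2} \le \frac{\|\mathbf{x}\|_E^2\,\|\mathbf{y}\|_E^2}{\|\mathbf{x}\|_E^2\,\|\mathbf{y}\|_E^2} = 1\,,$$

which proves the "if" part of the theorem. To prove the "only if" part, we show that for any indefinite operator C one can find processes x(t) and y(t) such that $|\Gamma_{x,y}(0,0)|^2 > 1$, i.e., $|C_{x,y}(0,0)|^2 > C_x(0,0)\,C_y(0,0)$ or

$$|\langle C, R_{y,x}\rangle|^2 > \langle C, R_x\rangle\,\langle C, R_y\rangle\,. \qquad (3.118)$$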

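To this end, we use the eigenfunctions $g_k(t)$ of $C = \sum_k \lambda_k\,g_k \otimes g_k^*$ to construct two correlated random processes $x(t) = \sum_k \langle x, g_k\rangle\,g_k(t)$ and $y(t) = \sum_k \langle y, g_k\rangle\,g_k(t)$ with $E\{\langle x, g_k\rangle \langle x, g_l\rangle^*\} = \xi_k\,\delta_{kl}$, $E\{\langle y, g_k\rangle \langle y, g_l\rangle^*\} = \zeta_k\,\delta_{kl}$, and $E\{\langle x, g_k\rangle \langle y, g_l\rangle^*\} = \rho_k\,\delta_{kl}$ so that $R_x = \sum_k \xi_k\,g_k \otimes g_k^*$, $R_y = \sum_k \zeta_k\,g_k \otimes g_k^*$, and $R_{x,y} = \sum_k \rho_k\,g_k \otimes g_k^*$. The parameters $\xi_k$, $\zeta_k$, $\rho_k$ have to satisfy the conditions $\xi_k \ge 0$, $\zeta_k \ge 0$, and $|\rho_k| \le \sqrt{\xi_k \zeta_k}$ (the latter being due to the Schwarz inequality) and will be further specified below. Since C is indefinite, at least two of its eigenvalues have different sign. Without loss of generality, we assume that the eigenvalues are ordered such that the first eigenvalue is positive, $\lambda_1 > 0$, and the second eigenvalue is negative, $\lambda_2 < 0$. By choosing $\xi_2 = -\frac{\lambda_1}{\lambda_2}\,\xi_1 > 0$ and $\xi_k = 0$ for $k \ge 3$, we achieve $\langle C, R_x\rangle = \sum_k \lambda_k \xi_k = 0$, so that the right-hand side of (3.118) equals zero. In contrast, the left-hand side of (3.118) equals $|\langle C, R_{y,x}\rangle|^2 = |\lambda_1\rho_1 + \lambda_2\rho_2|^2$ since enforcing the condition $|\rho_k| \le \sqrt{\xi_k \zeta_k}$ in our case implies $\rho_k = 0$ for $k \ge 3$. However, $\rho_1$ and $\rho_2$ can be chosen arbitrarily since $\xi_1$, $\xi_2$, $\zeta_1$, $\zeta_2$ can be chosen as arbitrarily large positive numbers (while still obeying $\xi_2 = -\frac{\lambda_1}{\lambda_2}\,\xi_1$). We conclude that $|\langle C, R_{y,x}\rangle|^2 = |\lambda_1\rho_1 + \lambda_2\rho_2|^2$ can be made arbitrarily large by suitable choice of $\rho_1$, $\rho_2$,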
thereby verifying (3.118). This proves the "only if" part of the theorem.

The above theorem establishes a property of the TF coherence function $\Gamma_{x,y}(t,f)$ that is analogous to (3.106). We next present an analogue of property (3.107) for the case of nonstationary random processes related by an LTV system. To this end, we restrict to the case where the type I spectrum chosen is the GWVS.

Theorem 3.33. For any two linearly related random processes y(t) = (Kx)(t), the difference

$$\Delta^{(\alpha)}(t,f) \triangleq \big|\overline{W}_{x,y}^{(\alpha)}(t,f)\big|^2 - \overline{W}_x^{(\alpha)}(t,f)\,\overline{W}_y^{(\alpha)}(t,f)$$

is bounded as

$$\frac{|\Delta^{(\alpha)}(t,f)|}{\|\bar{A}_x\|_1^2\,\|S_K\|_1^2} \le 2\pi\,B_{x,K}^{(\alpha)} + 4\pi|\alpha|\Big(m_x^{(1,1)} + m_K^{(1,1)}\Big) \qquad (3.119)$$

with

$$B_{x,K}^{(\alpha)} \triangleq c_\alpha\Big[5\,m_K^{(0,1)} m_x^{(1,0)} + 5\,m_K^{(1,0)} m_x^{(0,1)} + m_K^{(0,1)} m_K^{(1,0)}\Big]\,,$$

where $c_\alpha = |1/2 + \alpha| + |1/2 - \alpha|$.
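The construction used in the "only if" part of Theorem 3.32 can be illustrated numerically in a three-dimensional setting. The sketch below works directly in the eigenbasis of C, so that all operators become diagonal matrices; the eigenvalues `lam` and the second-order parameters `xi`, `zeta`, `rho` are hypothetical numbers chosen to satisfy the constraints of the proof (the Greek names are notational choices of this example).

```python
import numpy as np

# Indefinite operator in its eigenbasis: lam[0] > 0, lam[1] < 0.
lam = np.array([1.0, -0.5, 0.3])
C = np.diag(lam)

# Second-order statistics of the expansion coefficients (diagonal in the same basis).
xi1 = 4.0
xi = np.array([xi1, -lam[0] / lam[1] * xi1, 0.0])   # forces <C, R_x> = 0
zeta = np.array([4.0, 8.0, 0.0])
rho = np.array([3.0, 1.0, 0.0])                      # |rho_k| <= sqrt(xi_k * zeta_k)

R_x, R_y, R_xy = np.diag(xi), np.diag(zeta), np.diag(rho)

# The joint correlation matrix of [x; y] must be positive semidefinite.
joint = np.block([[R_x, R_xy], [R_xy.conj().T, R_y]])
min_eig = np.linalg.eigvalsh(joint).min()

# Both sides of (3.118): |C_xy|^2 versus C_x * C_y.
lhs = np.abs(np.trace(C @ R_xy)) ** 2
rhs = np.trace(C @ R_x) * np.trace(C @ R_y)

# For comparison: the semidefinite operator |C| keeps the ratio at or below one.
C_pos = np.diag(np.abs(lam))
ratio_pos = np.abs(np.trace(C_pos @ R_xy)) ** 2 / (np.trace(C_pos @ R_x) * np.trace(C_pos @ R_y))

print(lhs, rhs, ratio_pos)
```

With the indefinite C the denominator of (3.117) vanishes while the numerator does not, so the squared TF coherence is unbounded; with the semidefinite operator the same statistics give a ratio below one.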

()
Proof. Noting that Wx,y (t, f ) = LRx K+ (t, f ) and Wy Wx (t, f )

() 2 ()
(t, f ) = LKRx K+ (t, f ), subtracting/adding
()
LK+ (t, f ) , and applying the triangle inequality, we obtain () (t, f ) A (t, f ) + B (t, f )
where A (t, f ) = LRx K+ (t, f ) B (t, f ) = Wx (t, f )

() () () () 2 () 2
LK+ (t, f )
()
Wx (t, f )
2
()
LK+ (t, f ) ,
() ()
()
Wx (t, f ) Wy
(t, f ) .
Let us rst consider A (t, f ). Applying the inequality |a|2 |b|2 2|a b|(|a| + |b|) with a = LRx K+ (t, f ) and b = Wx (t, f ) LK+ (t, f ) to A (t, f ) yields A (t, f ) = LRx K+ (t, f )
() 2
Wx (t, f )
()
LK+ (t, f )
()

() () () () () ()
2 LRx K+ (t, f ) Wx (t, f ) LK+ (t, f )
LRx K+ (t, f ) + Wx (t, f ) LK+ (t, f ) .

()
mK , we further have
Applying (2.57) with H1 = Rx and H2 = K+ and subsequently using the relations Wx (t, f ) (k,l) () Ax 1 , LK (t, f ) SK 1 , SRx K+ 1 SRx 1 SK+ 1 (see (B.14)), SK+ 1 = SK 1 , mK+ =
(k,l)
A (t, f ) 2 2 Ax 2 2 Ax 8 Ax
2 1
1 1
SK + SK +
2 1
() 1 BRx ,K () 1 BRx ,K
LRx K+ (t, f ) + Wx (t, f ) LK+ (t, f ) SRx K +

1
()
()
()
+ Ax
SK +
SK
2 1
+
2 1
8 c Ax
SK
1 1 (0,1) (1,0) (0,1) mx mK + m(1,0) mK x 2 2 (1,0) (0,1) (0,1) (1,0) mx mK + mx mK ,
(3.120)
where in the last step we used | 1/2| c . For B (t, f ) we obtain B (t, f ) = Wx (t, f ) = Wx (t, f )
() 2 () () () 2
LK+ (t, f )
() 2
()
Wx (t, f ) Wy
()
()
()
(t, f )
LK (t, f ) Wx
(t, f ) WKx (t, f ) .
()
Subtracting/adding LK (t, f ) Wx (t, f ) inside the second term and applying the triangle inequality, we obtain further B (t, f ) = Wx (t, f )
() () 2 () ()
LK (t, f ) +
Wx
(t, f ) Wx (t, f )
2 () ()
LK (t, f ) Wx (t, f ) WKx (t, f )

2 ()
()
Wx (t, f )
()
()
LK (t, f ) 2 {Wx (t, f )} +

() 2 ()
()
LK (t, f ) Wx (t, f ) WKx (t, f ) .

()
()
()
()
() () Successively applying the inequalities Wx (t, f ) Ax 1 , LK (t, f ) SK 1 , the bound (3.32) on
{Wx (t, f )} , and the bound (3.96) on B (t, f ) Ax Ax Ax Ax = Ax

() 2 1 1 1 1 2 12 2 1 2 1 ()
LK (t, f ) Wx (t, f ) WKx (t, f ) , we obtain LK (t, f ) Wx (t, f ) WKx (t, f )

2 () () () 2 () () () 2 () () ()
LK (t, f ) 2 {Wx (t, f )} + SK SK SK {Wx (t, f )} +

(1,1) 1 4 || mx (1,1) 1 4 || mx
()
LK (t, f ) Wx (t, f ) WKx (t, f ) LK (t, f ) Wx (t, f ) WKx (t, f ) + c mK m(1,0) + mK m(0,1) + mK mK x x
(0,1) (1,0) (0,1) (0,1) (1,0) (0,1) (1,0) (1,0)
Ax Ax
1
+ +
2 Ax
2 1
SK
2 1
2 || mK
(1,1)
SK
2 1
4|| m(1,1) + mK x
(1,1)
+ 2 c mK m(1,0) + mK m(0,1) + mK mK x x
(3.121)
The final bound (3.119) follows upon combining (3.120) and (3.121).

Discussion. The foregoing theorem shows that for small $m_K^{(0,1)} m_x^{(1,0)}$, $m_K^{(1,0)} m_x^{(0,1)}$, $m_x^{(1,1)}$, and $m_K^{(1,1)}$, there is
$$|W_{x,y}^{(\alpha)}(t,f)|^2 \approx W_x^{(\alpha)}(t,f)\, W_y^{(\alpha)}(t,f)$$
and thus
$$|\Gamma_{x,y}^{(\alpha)}(t,f)|^2 \approx 1 . \qquad (3.122)$$
In order that the above moments be small, $R_x$ and $K$ have to be jointly strictly underspread operators, i.e., x(t) has to be an underspread process, K has to be an underspread system, and their GEAF/GSF have to be similarly oriented along the $\tau$ axis or $\nu$ axis. Note that this implies that y(t) = (Kx)(t) will be an underspread process as well. In the case $\alpha = 0$, the bound (3.119) is tightest and can be tightened even further by using metaplectic transformations of $R_x$ and $K$ in a similar manner as in Theorem 2.15.

As an illustration of the approximation (3.122), we reconsider the example given at the end of Subsection 3.8.2 without noise, i.e., y(t) = (Kx)(t). Here, x(t) is stationary white noise and K is an LTV system with GWS as shown in Fig. 3.11(a). Since x(t) and y(t) are linearly related, there is $\Gamma_{x,y}\Gamma_{x,y}^{+} = I$. The TF coherence function $\Gamma_{x,y}^{(0)}(t,f)$ computed using the Wigner-Ville spectrum (i.e., $\alpha = 0$) also correctly indicates the linear relationship since there is $|\Gamma_{x,y}^{(0)}(t,f)| \approx 1$ with $\max_{(t,f)} \big|\, |\Gamma_{x,y}^{(0)}(t,f)| - 1 \big| = 0.0164$, thereby confirming the validity of (3.122).

We can finally conclude from the foregoing two theorems that within the underspread framework, the TF coherence function is a meaningful concept since it features properties similar to those of the spectral coherence function of the stationary case. However, for overspread (i.e., not jointly underspread) processes x(t) and y(t), the TF coherence function in general is only of limited usefulness, i.e., its magnitude may get larger than one and/or it may not correctly indicate existing linear relationships.

As an example, consider two correlated random processes $x(t) = a\, u(t+t_0)\, e^{-j2\pi f_0 t}$ and $y(t) = a\, u(t-t_0)\, e^{j2\pi f_0 t}$ where $u(t) = e^{-\pi t^2/T^2} / \sqrt{\sqrt{2}\,T}$, $t_0$ and $f_0$ are fixed, and $a$ is random with $E\{|a|^2\} = \sigma^2 > 0$. One obtains
$$W_{x,y}^{(0)}(t,f) = \sigma^2\, e^{-2\pi[t^2/T^2 + f^2 T^2]}\, e^{j4\pi(t_0 f - f_0 t)},$$
$$W_x^{(0)}(t,f) = \sigma^2\, e^{-2\pi[(t+t_0)^2/T^2 + (f+f_0)^2 T^2]},$$
$$W_y^{(0)}(t,f) = \sigma^2\, e^{-2\pi[(t-t_0)^2/T^2 + (f-f_0)^2 T^2]}.$$
It is seen that $W_x^{(0)}(t,f)$ and $W_y^{(0)}(t,f)$ are localized about $(-t_0, -f_0)$ and $(t_0, f_0)$, respectively. However, $W_{x,y}^{(0)}(t,f)$ is localized (and oscillatory) about (0, 0), corresponding to a statistical cross term (see Section 3.2). It follows that
$$|\Gamma_{x,y}^{(0)}(t,f)| = e^{2\pi[t_0^2/T^2 + f_0^2 T^2]} \ge 1,$$
which for increasing $t_0$, $f_0$ can become arbitrarily large. Furthermore, apart from not being upper bounded by 1, the TF coherence function $|\Gamma_{x,y}^{(0)}(t,f)|$ in this example also fails to indicate that x(t) and y(t) are linearly related. In fact, $y(t) = e^{-j4\pi f_0 t_0}\, (S^{(1/2)}_{2t_0,2f_0}\, x)(t)$ but $|\Gamma_{x,y}^{(0)}(t,f)| \neq 1$ for $t_0 \neq 0$, $f_0 \neq 0$. The large values of $|\Gamma_{x,y}^{(0)}(t,f)|$ are seen to be due to TF correlations, i.e., correlations between components of x(t) and y(t) located in different parts of the TF plane. These TF correlations are indicated by statistical cross terms in $W_{x,y}^{(\alpha)}(t,f)$ and by the fact that $|A_{x,y}^{(\alpha)}(\tau,\nu)|$ is localized about the point $(2t_0, 2f_0)$ in the $(\tau,\nu)$ plane (i.e., not about the origin). Hence, it is seen that the processes x(t) and y(t) indeed are not jointly underspread.
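The value of the coherence magnitude in the Gaussian example can be checked by a short calculation. The following worked derivation is only a sketch: it assumes that the TF coherence function is the cross Wigner-Ville spectrum normalized by the square root of the product of the two auto spectra, and it writes the common power factor of the three spectra as $\sigma^2$ (a symbol chosen here for illustration).

```latex
% Product of the auto spectra: the cross terms in t t_0 and f f_0 cancel,
% because (t+t_0)^2 + (t-t_0)^2 = 2t^2 + 2t_0^2 (and likewise for f).
W_x^{(0)}(t,f)\, W_y^{(0)}(t,f)
  = \sigma^4 \exp\!\Big(-4\pi\big[t^2/T^2 + f^2 T^2\big]\Big)
             \exp\!\Big(-4\pi\big[t_0^2/T^2 + f_0^2 T^2\big]\Big)

% Magnitude of the cross spectrum (the oscillating phase factor drops out):
\big|W_{x,y}^{(0)}(t,f)\big| = \sigma^2 \exp\!\Big(-2\pi\big[t^2/T^2 + f^2 T^2\big]\Big)

% Ratio: the (t,f)-dependent Gaussian cancels, leaving a constant >= 1.
\big|\Gamma_{x,y}^{(0)}(t,f)\big|
  = \frac{\big|W_{x,y}^{(0)}(t,f)\big|}{\sqrt{W_x^{(0)}(t,f)\, W_y^{(0)}(t,f)}}
  = \exp\!\Big(2\pi\big[t_0^2/T^2 + f_0^2 T^2\big]\Big)
```

The result does not depend on (t, f) and grows without bound as the two components are moved apart in the TF plane, which is the behavior described in the text.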
4 Applications

Before creation God did just pure mathematics. Then He thought it would be a pleasant change to do some applied.
John E. Littlewood

This chapter discusses several practical applications of the theory developed in Chapters 2 and 3. First, in Sections 4.1 and 4.2 we show that the results of Subsections 2.3.6 and 2.3.7 allow a time-frequency formulation and design of mean-square optimal time-varying Wiener filters and time-varying likelihood ratio detectors. In Section 4.3, we demonstrate the relevance of Subsections 2.3.1, 2.3.4, and 2.3.18 to the analysis of systematic measurement errors of mobile radio channel sounders. Then, in Section 4.4 we apply results of Subsection 2.3.8 to (bi-)orthogonal frequency division multiplexing communication systems operating over linear time-varying channels. Finally, in Section 4.5, we illustrate the usefulness of time-varying spectra and time-frequency coherence functions (see Chapter 3) for the analysis of signals measured in car engines.
4.1 Nonstationary Signal Estimation

Signal estimation is important in many practical applications such as speech enhancement and interference excision. In the following, we briefly outline the relevance of the results of Chapters 2 and 3 to this problem. Further details are discussed in [92, 111]. Other work on TF aspects of time-varying Wiener filters can be found in [9, 110, 131, 132, 186, 191, 192].

4.1.1 Time-Varying Wiener Filter

Let us assume that we observe a zero-mean nonstationary random process y(t) and we would like to form an estimate $\hat{x}(t)$ of a desired nonstationary zero-mean (signal) process x(t) by passing y(t) through a linear, generally time-varying system H, i.e., $\hat{x}(t) = (Hy)(t)$. The cross-correlation operator $R_{x,y}$, describing the statistical relation of x(t) and y(t), and the correlation operator $R_y$ are assumed to be known. Adopting a minimum mean-square error (MSE) criterion, the optimal estimator minimizes the expected energy $E_e = E\{\|e\|_2^2\}$ of the estimation error $e(t) = x(t) - \hat{x}(t)$, i.e., $H_W \triangleq \arg\min_H E_e$. Using the orthogonality principle [106, 187, 197, 202], this optimization problem can be reduced to the solution of the Wiener-Hopf equation
$$H_W R_y = R_{x,y} . \qquad (4.1)$$
With $R_y^{-1}$ denoting the (pseudo-)inverse of $R_y$, the time-varying Wiener filter is given by [92, 106, 111, 187, 197, 202]
$$H_W = R_{x,y} R_y^{-1} . \qquad (4.2)$$
Note that this involves a computationally intensive and potentially unstable operator inversion. The minimum MSE achieved with $H_W$ is given by
$$E_e^W = E_x - \langle H_W, R_{x,y} \rangle = E_x - \mathrm{Tr}\{H_W R_{x,y}^+\},$$
where $E_x = \mathrm{Tr}\{R_x\}$ is the expected energy of x(t).

In the special case of jointly stationary processes, the design of the Wiener filter can be performed in the physically intuitive frequency domain. The transfer function of $H_W$ is then given by
$$H_W(f) = \begin{cases} \dfrac{P_{x,y}(f)}{P_y(f)}, & \text{for } P_y(f) > 0, \\ 0, & \text{for } P_y(f) = 0, \end{cases} \qquad (4.3)$$
where $P_{x,y}(f)$, $P_y(f)$ denote the (cross) power spectral densities (PSDs) of x(t) and y(t).
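The frequency-domain design (4.3) amounts to a pointwise division with a guard for frequencies where the observation carries no power. The following Python sketch illustrates this on sampled PSDs; the function name `wiener_transfer_function`, the tolerance `tol`, and the toy spectra are assumptions made for this illustration and do not come from the thesis.

```python
import numpy as np

def wiener_transfer_function(P_xy, P_y, tol=0.0):
    """Pointwise evaluation of (4.3): P_xy/P_y where P_y > tol, 0 elsewhere."""
    P_xy = np.asarray(P_xy, dtype=complex)
    P_y = np.asarray(P_y, dtype=float)
    H = np.zeros_like(P_xy)
    mask = P_y > tol                      # frequencies where division is allowed
    H[mask] = P_xy[mask] / P_y[mask]
    return H

# Toy example (hypothetical spectra): signal plus uncorrelated noise,
# so that P_xy = P_x and P_y = P_x + P_n.
f = np.linspace(-0.5, 0.5, 101)
P_x = np.exp(-(f / 0.1) ** 2)                    # narrowband signal PSD
P_n = np.where(np.abs(f) < 0.4, 0.5, 0.0)        # band-limited noise PSD
H_W = wiener_transfer_function(P_x, P_x + P_n)
```

In this signal-plus-noise setting the resulting transfer function lies between 0 and 1 and is close to 1 wherever the signal PSD dominates the noise PSD.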
4.1.2 Time-Frequency Formulation of the Time-Varying Wiener Filter

We next assume that x(t) and y(t) are jointly CL underspread processes (cf. Section 3.1) with GEAF support region $\mathcal{G}$ and joint correlation spread $\sigma_{x,y} = |\mathcal{G}|$. Then, (4.1) is seen to be of the type (2.96) with $G = H_W$, $H_2 = R_y$, and $H_3 = R_{x,y}$. Hence, Theorem 2.20 implies that the DL part $H_W^{\mathcal{G}}$ of the Wiener filter $H_W$, defined by
$$S^{(\alpha)}_{H_W^{\mathcal{G}}}(\tau,\nu) = S^{(\alpha)}_{H_W}(\tau,\nu)\, I_{\mathcal{G}}(\tau,\nu),$$
constitutes an approximate solution of (4.1), i.e. (see also (2.100)),
$$H_W^{\mathcal{G}} R_y \approx R_{x,y}, \qquad (4.4)$$
with the associated approximation error being bounded as HG Ry Rx,y W HG W

2 2
Ry
x,y .
(4.5)
This further implies the following result on the (suboptimal) MSE $E_e^{\mathcal{G}}$ achieved with $H_W^{\mathcal{G}}$.

Corollary 4.1. The excess MSE $E_e^{\mathcal{G}} - E_e^W$ is bounded as
0 EeG EeW 2 HW
W
2 2
Ry
x,y .
(4.6)
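Proof. The left-hand inequality in (4.6) is trivial since $E_e^W$ is by definition the minimum achievable MSE. Furthermore, it can be shown that the MSE achieved with an arbitrary system H can be written as
$$E_e = \mathrm{Tr}\{R_x - H R_{x,y}^+ - R_{x,y} H^+ + H R_y H^+\}.$$
With $H = H_W$ as defined in (4.2), this simplifies to $E_e^W = \mathrm{Tr}\{R_x - H_W R_{x,y}^+\}$, so that we obtain
$$E_e^{\mathcal{G}} - E_e^W = \mathrm{Tr}\{R_x - H_W^{\mathcal{G}} R_{x,y}^+ - R_{x,y} H_W^{\mathcal{G}+} + H_W^{\mathcal{G}} R_y H_W^{\mathcal{G}+}\} - \mathrm{Tr}\{R_x - H_W R_{x,y}^+\}$$
$$= \mathrm{Tr}\{R_x\} + \mathrm{Tr}\{(H_W^{\mathcal{G}} R_y - R_{x,y}) H_W^{\mathcal{G}+}\} - \mathrm{Tr}\{H_W^{\mathcal{G}} R_{x,y}^+\} - \mathrm{Tr}\{R_x\} + \mathrm{Tr}\{H_W R_{x,y}^+\}$$
$$= \mathrm{Tr}\{(H_W^{\mathcal{G}} R_y - R_{x,y}) H_W^{\mathcal{G}+}\} + \mathrm{Tr}\{(H_W - H_W^{\mathcal{G}}) R_{x,y}^+\}.$$
Due to the unitarity of the GSF (see (B.8)) and the fact that the supports of $S^{(\alpha)}_{H_W - H_W^{\mathcal{G}}}(\tau,\nu)$ and $A^{(\alpha)}_{x,y}(\tau,\nu)$ do not overlap, there is
$$\mathrm{Tr}\{(H_W - H_W^{\mathcal{G}}) R_{x,y}^+\} = \langle S^{(\alpha)}_{H_W - H_W^{\mathcal{G}}}, A^{(\alpha)}_{x,y} \rangle = 0.$$
By applying the Schwarz inequality, we further arrive at
$$E_e^{\mathcal{G}} - E_e^W = \mathrm{Tr}\{(H_W^{\mathcal{G}} R_y - R_{x,y}) H_W^{\mathcal{G}+}\} = \langle H_W^{\mathcal{G}} R_y - R_{x,y}, H_W^{\mathcal{G}} \rangle \le \|H_W^{\mathcal{G}} R_y - R_{x,y}\|_2\, \|H_W^{\mathcal{G}}\|_2 .$$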
The nal bound (4.6) now follows by applying (4.5) and by noting that HG W HG 2 W HW
2.
HW
and
The preceding corollary shows that in the case of jointly CL underspread processes with small joint correlation spread $\sigma_{x,y}$, the MSE achieved with $H_W^{\mathcal{G}}$ is close to the minimum achievable MSE $E_e^W$.

We next turn to a TF formulation of the time-varying Wiener filter. Since the Weyl symbols of $R_{x,y}$ and $R_y$ equal the (cross) Wigner-Ville spectra of x(t) and y(t), Theorem 2.21 (with $G = H_W$, $H_2 = R_y$, $H_3 = R_{x,y}$ and $|\mathcal{G}| = \sigma_{x,y}$) implies that the Weyl symbol of $H_W^{\mathcal{G}}$ satisfies (cf. (2.105) and [92, 111])
$$L^{(0)}_{H_W^{\mathcal{G}}}(t,f)\, W_y^{(0)}(t,f) \approx W_{x,y}^{(0)}(t,f),$$
where the approximation error is bounded as LHG (t, f ) Wy (t, f ) Wx,y (t, f )
W
(0)
(0)
(0)
SHW
Ay
2 x,y + 4 x,y , 2
LHG Wy
W
(0)
(0)
Wx,y Ry
2
(0) 2
HW
3 x,y + 2 x,y .
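Hence, using regularized inversion as described in Subsection 2.3.7 (cf. (2.106)), we obtain
$$L^{(0)}_{H_W^{\mathcal{G}}}(t,f) \approx \begin{cases} \dfrac{W_{x,y}^{(0)}(t,f)}{W_y^{(0)}(t,f)}, & \text{for } (t,f) \in \mathcal{R}, \\ 0, & \text{for } (t,f) \notin \mathcal{R}, \end{cases}$$
where R = (t, f ) :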
Wy (t,f ) SH2 1
(0)
(4.7)
with =
2 c x,y +4 x,y is the TF region where the Wigner-
Ville spectrum of y(t) is essentially positive, and the regularization parameter is chosen to meet prescribed accuracy requirements (see Subsection 2.3.7). The approximation (4.7) constitutes a TF formulation of the time-varying Wiener filter that extends the frequency-domain formulation (4.3) of the time-invariant Wiener filter to the nonstationary case. We recall that $H_W^{\mathcal{G}}$ is nearly optimal in the sense that it achieves an MSE only slightly larger than that achieved by $H_W$. Furthermore, we recall that $L^{(0)}_{H_W^{\mathcal{G}}}(t,f)$ is a smoothed version of the Weyl symbol of $H_W$.
4.1.3 Time-Frequency Filter Design

While (4.7) provides an approximate expression for the Weyl symbol of the Wiener filter, let us now define another LTV system $\tilde{H}_W$ by setting its Weyl symbol equal to the right-hand side of (4.7) [92, 111]:
$$L^{(0)}_{\tilde{H}_W}(t,f) \triangleq \begin{cases} \dfrac{W_{x,y}^{(0)}(t,f)}{W_y^{(0)}(t,f)}, & \text{for } (t,f) \in \mathcal{R}, \\ 0, & \text{for } (t,f) \notin \mathcal{R}. \end{cases} \qquad (4.8)$$
We refer to $\tilde{H}_W$ as TF pseudo-Wiener filter [92, 111]. For jointly CL underspread processes x(t) and y(t), where (4.7) is a good approximation, combination of (4.8) and (4.7) yields $L^{(0)}_{\tilde{H}_W}(t,f) \approx L^{(0)}_{H_W^{\mathcal{G}}}(t,f)$ and thus $\tilde{H}_W \approx H_W^{\mathcal{G}}$, i.e., the TF pseudo-Wiener filter $\tilde{H}_W$ is a close approximation to the DL part $H_W^{\mathcal{G}}$ of the Wiener filter $H_W$ and therefore nearly optimal. For processes that are not jointly underspread, however, the performance of $\tilde{H}_W$ must be expected to be far from optimal (see the overspread simulation example below). Compared to the Wiener filter $H_W$, the TF pseudo-Wiener filter $\tilde{H}_W$ has two advantages:

• Modified a priori knowledge. The design of $\tilde{H}_W$ is based on the (cross) Wigner-Ville spectra $W_{x,y}^{(0)}(t,f)$ and $W_y^{(0)}(t,f)$, and thus it is physically more intuitive than the design of $H_W$ which is based on correlation operators.

• Reduced and stable computation. The design of $H_W$ according to (4.2) requires a computationally intensive and potentially unstable operator inversion. By using (4.8), this operator inversion is replaced by a simple and stable scalar division.

We note that for jointly underspread processes, the (cross) Wigner-Ville spectra occurring in the right-hand side of (4.8) can be replaced by the GWVS or any other type I spectrum since in the underspread case these spectra are approximately equivalent (cf. Section 3.5.1 and [118, 126, 148]).
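The scalar-division design (4.8) can be sketched on a sampled TF grid as follows. This is only an illustration under stated assumptions: the region R is taken here to be the set of TF points where the spectrum of y(t) is at least a threshold `eps` (the exact definition of R is given in Subsection 2.3.7), and the function name and the toy spectra are hypothetical.

```python
import numpy as np

def pseudo_wiener_symbol(W_xy, W_y, eps):
    """Weyl symbol following (4.8): W_xy / W_y inside R, 0 outside.

    Assumption for this sketch: R = {(t, f) : W_y(t, f) >= eps}.
    """
    W_xy = np.asarray(W_xy, dtype=float)
    W_y = np.asarray(W_y, dtype=float)
    L = np.zeros_like(W_xy)
    region = W_y >= eps                   # simple stand-in for the region R
    L[region] = W_xy[region] / W_y[region]
    return L

# Toy spectra (hypothetical): a TF-localized signal in flat noise, with
# y = x + n and uncorrelated x, n, so W_xy = W_x and W_y = W_x + W_n.
t = np.linspace(-1.0, 1.0, 64)
f = np.linspace(-1.0, 1.0, 64)
T, F = np.meshgrid(t, f, indexing="ij")
W_x = np.exp(-8.0 * (T ** 2 + F ** 2))
W_n = 0.2 * np.ones_like(W_x)
L_tilde = pseudo_wiener_symbol(W_x, W_x + W_n, eps=1e-3)
```

No operator inversion is involved; the only numerically delicate step is the division, which the threshold keeps away from small or negative values of the spectrum.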
Figure 4.1: Illustration of the TF formulation of the Wiener filter: (a) Wigner-Ville spectrum of x(t), (b) Wigner-Ville spectrum of n(t), (c) expected ambiguity function of x(t), (d) expected ambiguity function of n(t), (e) real part of Weyl symbol of Wiener filter $H_W$, (f) real part of Weyl symbol of DL part $H_W^{\mathcal{G}}$ of Wiener filter, (g) Weyl symbol of TF pseudo-Wiener filter $\tilde{H}_W$. The rectangles in (c) and (d) have area 1 and thus allow an assessment of the underspread property of x(t) and n(t). The signal length is 128 samples.
4.1.4 Simulation Results

Underspread Example. Using the TF synthesis technique introduced in [89], we synthesized a signal process x(t) and a noise process n(t) (uncorrelated with x(t)), whose sum constitutes the observation, i.e., y(t) = x(t) + n(t). Hence, $R_y = R_x + R_n$ and $R_{x,y} = R_x$. The expected energies of these processes were $E_x = 9.09$ and $E_n = 11.89$, respectively, corresponding to a mean input SNR of $E_x/E_n = -1.17$ dB. The Wigner-Ville spectra and expected ambiguity functions of x(t) and n(t) are shown in Fig. 4.1(a)–(d). The expected ambiguity functions in parts (c) and (d) show that the processes are jointly CL underspread. The Weyl symbols of the Wiener filter $H_W$, its DL part $H_W^{\mathcal{G}}$, and the TF pseudo-Wiener filter $\tilde{H}_W$ are shown in Fig. 4.1(e)–(g). It is verified that $\tilde{H}_W$ closely approximates $H_W^{\mathcal{G}}$ and also $H_W$. This is further corroborated by the similarity of the mean SNR improvements¹ of 6.14 dB and 6.11 dB achieved by $H_W$ and $\tilde{H}_W$, respectively.

¹ The SNR improvement is defined as the difference $\mathrm{SNR}_{out} - \mathrm{SNR}_{in}$ of the output SNR $\mathrm{SNR}_{out} = E_x/E_e$ and the input SNR $\mathrm{SNR}_{in} = E_x/E_n$.

Overspread Example. We next consider an example where the underspread assumption is violated. Again, the observation y(t) = x(t) + n(t) is the sum of a signal process x(t) and a noise process n(t). Parts (a) and (c) of Fig. 4.2 show the Wigner-Ville spectrum and expected ambiguity function, respectively, of the signal process x(t) that is seen to be a CL underspread process. In contrast, the noise process n(t), shown in parts (b) and (d) of Fig. 4.2, is fairly quasi-stationary but nonetheless overspread since its temporal correlation width is too large (this is also indicated by the statistical cross terms in the Wigner-Ville spectrum of n(t)). Consequently, the TF pseudo-Wiener filter $\tilde{H}_W$ (shown in Fig. 4.2(g)) is significantly different from the Wiener filter $H_W$ (shown in Fig. 4.2(e)) and the DL part $H_W^{\mathcal{G}}$ of the Wiener filter (see Fig. 4.2(f)) does not approximate $H_W$. The processes x(t) and n(t) were constructed such that they lie in linearly independent (disjoint) signal spaces. Here, $H_W$ can be shown to be an oblique projection operator [10] that perfectly reconstructs x(t), thereby achieving zero MSE and infinite SNR improvement. On the other hand, the TF pseudo-Wiener filter $\tilde{H}_W$ and the DL part $H_W^{\mathcal{G}}$ of the Wiener filter merely achieve SNR improvements of 4.79 dB and 4.24 dB, respectively. We conclude that in an overspread scenario, the TF designed pseudo-Wiener filter achieves an SNR improvement that is far from optimal.

Figure 4.2: A filtering experiment involving an overspread noise process: (a) Wigner-Ville spectrum of x(t), (b) Wigner-Ville spectrum of n(t), (c) expected ambiguity function of x(t), (d) expected ambiguity function of n(t), (e) real part of Weyl symbol of Wiener filter $H_W$, (f) real part of Weyl symbol of DL part $H_W^{\mathcal{G}}$ of Wiener filter, (g) Weyl symbol of TF pseudo-Wiener filter $\tilde{H}_W$. The rectangles in parts (c) and (d) have area 1 and thus allow an assessment of the under-/overspread property of x(t) and n(t). (In particular, part (d) shows that n(t) is overspread.) The signal length is 128 samples.
4.2 Nonstationary Signal Detection

In this section, we briefly discuss the relevance of the results of Chapters 2 and 3 to the practically important problem of signal detection [108, 168, 187, 202]. Further details on TF detectors can be found in [58, 59, 141–143, 146, 183, 185].

4.2.1 Optimal Detectors

We consider the binary hypothesis test
$$\mathcal{H}_0 : y(t) = x_0(t) \quad \text{vs.} \quad \mathcal{H}_1 : y(t) = x_1(t), \qquad (4.9)$$
where $x_0(t)$ and $x_1(t)$ are zero-mean, nonstationary, mutually uncorrelated random processes with known correlation operators $R_0$ and $R_1$, respectively. Typically, a decision between the hypotheses is obtained by comparing a suitable test statistic $\Lambda(y)$ derived from the observation y(t) to a prescribed threshold. Here, we will consider test statistics that can be written as a quadratic form
$$\Lambda(y) = \langle H y, y \rangle \qquad (4.10)$$
induced by a linear operator H. Note that the test statistic is completely specified by H.
Likelihood Ratio Detector

It can be shown that the optimal test statistic (optimal both in the Neyman-Pearson and in the Bayesian sense) is given by the (log-)likelihood ratio [108, 168, 187, 202]. If both $x_0(t)$ and $x_1(t)$ are Gaussian processes, the likelihood ratio leads to the equivalent test statistic $\Lambda_{LR}(y) = \langle H_{LR}\, y, y \rangle$, where the operator $H_{LR}$ is defined by [108, 168, 187, 202]
$$R_0 H_{LR} R_1 = R_1 - R_0 . \qquad (4.11)$$
With the (pseudo-)inverses $R_0^{-1}$, $R_1^{-1}$ of $R_0$ and $R_1$, the solution of (4.11) can be written as
$$H_{LR} = R_0^{-1} (R_1 - R_0) R_1^{-1} .$$
The computation of $H_{LR}$ requires two operator inversions which are computationally costly and potentially unstable. The design of the likelihood ratio detector simplifies if $x_0(t)$ and $x_1(t)$ are stationary processes. In this case, one has
$$\Lambda_{LR}(y) = \int_f H_{LR}(f)\, |Y(f)|^2\, df \quad \text{with} \quad H_{LR}(f) = \begin{cases} \dfrac{P_{x_1}(f) - P_{x_0}(f)}{P_{x_0}(f)\, P_{x_1}(f)}, & \text{for } P_{x_0}(f)\, P_{x_1}(f) > 0, \\ 0, & \text{for } P_{x_0}(f)\, P_{x_1}(f) = 0, \end{cases} \qquad (4.12)$$
where Y(f) is the Fourier transform of y(t) and $P_{x_0}(f)$, $P_{x_1}(f)$ are the PSDs of $x_0(t)$ and $x_1(t)$, respectively.
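For the stationary case, (4.12) can be evaluated directly on an FFT grid. The following Python sketch is an illustration only: the function names, the sampling interval `dt`, and the Riemann-sum approximation of the frequency integral are assumptions made here, and the PSDs are assumed to be sampled at the FFT bin frequencies.

```python
import numpy as np

def lr_transfer_function(P0, P1):
    """Pointwise evaluation of H_LR(f) in (4.12); 0 where P0 * P1 = 0."""
    P0 = np.asarray(P0, dtype=float)
    P1 = np.asarray(P1, dtype=float)
    prod = P0 * P1
    H = np.zeros_like(prod)
    mask = prod > 0
    H[mask] = (P1[mask] - P0[mask]) / prod[mask]
    return H

def lr_statistic(y, H, dt=1.0):
    """Riemann-sum approximation of the integral of H(f) |Y(f)|^2 over f.

    H must be sampled at the FFT bin frequencies np.fft.fftfreq(len(y), dt).
    """
    Y = np.fft.fft(y) * dt                # approximates the Fourier transform
    df = 1.0 / (len(y) * dt)              # frequency bin width
    return float(np.sum(H * np.abs(Y) ** 2) * df)

# Toy example (hypothetical PSDs): white noise of power 1 under H0,
# white noise of power 2 under H1.
H_LR = lr_transfer_function(np.ones(8), 2.0 * np.ones(8))
y = np.zeros(8)
y[0] = 1.0
stat = lr_statistic(y, H_LR)
```

The statistic would then be compared to a threshold; how that threshold is chosen (for example from a prescribed false alarm probability) is outside the scope of this sketch.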
Deflection-Optimal Detector

There are situations where the distribution of the observation under hypothesis $\mathcal{H}_1$ is not known or the corresponding likelihood ratio is difficult to compute. In such situations, the deflection-optimal detector can be used as an alternative to the optimal likelihood ratio detector. This is the quadratic test statistic (4.10) with the operator H chosen to maximize the deflection, defined as [7, 108]
$$d^2 \triangleq \frac{\big(E\{\Lambda|\mathcal{H}_1\} - E\{\Lambda|\mathcal{H}_0\}\big)^2}{\mathrm{var}\{\Lambda|\mathcal{H}_0\}}$$
with $E\{\Lambda|\mathcal{H}_i\}$ and $\mathrm{var}\{\Lambda|\mathcal{H}_i\}$ denoting the expectation and variance of $\Lambda$ under hypothesis $\mathcal{H}_i$. The deflection measures how well the conditional PDFs of $\Lambda$ under $\mathcal{H}_0$ and $\mathcal{H}_1$ are separated. If the distribution of y(t) under $\mathcal{H}_0$ is Gaussian, the deflection can be shown [7] to equal
$$d^2 = \frac{\mathrm{Tr}^2\{H (R_1 - R_0)\}}{\mathrm{Tr}\{(H R_1)^2\}} .$$
It can furthermore be shown [7] that the maximal deflection equals $d^2_{\max} = \|R_1^{-1/2} (R_1 - R_0) R_1^{-1/2}\|_2^2$ and is achieved with the operator $H_D$ that satisfies
$$R_1 H_D R_1 = R_1 - R_0 . \qquad (4.13)$$
The operator equation (4.13) can be solved using the (pseudo-)inverse $R_1^{-1}$ of $R_1$:
$$H_D = R_1^{-1} (R_1 - R_0) R_1^{-1} .$$
However, the computation of $R_1^{-1}$ requires a large computational expense and may be numerically unstable. A simplification is possible if $x_0(t)$ and $x_1(t)$ are stationary processes with PSDs $P_{x_0}(f)$ and $P_{x_1}(f)$; here,
$$\Lambda_D(y) = \int_f H_D(f)\, |Y(f)|^2\, df \quad \text{with} \quad H_D(f) = \begin{cases} \dfrac{P_{x_1}(f) - P_{x_0}(f)}{P_{x_1}^2(f)}, & \text{for } P_{x_1}(f) > 0, \\ 0, & \text{for } P_{x_1}(f) = 0. \end{cases} \qquad (4.14)$$

4.2.2 Time-Frequency Formulation of Optimal Detectors
We next consider the TF reformulation of the above optimal detectors. To this end, we first note that according to (B.27) the quadratic test statistic (4.10) can be rewritten as
$$\Lambda(y) = \langle L_H^{(0)}, W_y^{(0)} \rangle = \int_t \int_f L_H^{(0)}(t,f)\, W_y^{(0)}(t,f)\, dt\, df ,$$
where in the last expression we used the fact that the Wigner distribution $W_y^{(0)}(t,f)$ is always real-valued. In the following, we will present approximations for the Weyl symbol of the operators $H_{LR}$ and $H_D$ that are valid for jointly CL processes and allow for a simple and intuitive TF formulation of the likelihood-ratio detector and the deflection-optimal detector.
Time-Frequency Formulation of the Likelihood Ratio Detector

Let us assume that $x_0(t)$ and $x_1(t)$ are jointly CL underspread processes (cf. Section 3.1) with GEAF support region $\mathcal{G}$ and correlation spread $\sigma_{x_0,x_1} = |\mathcal{G}|$. It is then recognized that (4.11) is of the type (2.76) with $G = H_{LR}$, $H_1 = R_0$, $H_2 = R_1$, and $H_3 = R_1 - R_0$. Hence, Theorem 2.18 implies that the DL part $H_{LR}^{\mathcal{G}}$ of $H_{LR}$ approximately satisfies (4.11), i.e. (cf. (2.81))
$$R_0 H_{LR}^{\mathcal{G}} R_1 \approx R_1 - R_0 ,$$
and the resulting approximation error is bounded as
R0 HG R1 (R1 R0 ) LR R0 2 HG 2 LR R1 2 2 x0 ,x1 .
With regard to an approximate TF formulation of the likelihood ratio detector, we next note that it follows from Theorem 2.19 (with $G = H_{LR}$, $H_1 = R_0$, $H_2 = R_1$, $H_3 = R_1 - R_0$, and $|\mathcal{G}| = \sigma_{x_0,x_1}$) that the Weyl symbol of $H_{LR}^{\mathcal{G}}$ approximately satisfies (see (2.92))
$$W_{x_0}^{(0)}(t,f)\, L^{(0)}_{H_{LR}^{\mathcal{G}}}(t,f)\, W_{x_1}^{(0)}(t,f) \approx W_{x_1}^{(0)}(t,f) - W_{x_0}^{(0)}(t,f) ,$$
where we used the fact that the Weyl symbol of $R_i$ equals the Wigner-Ville spectrum of $x_i(t)$, i.e., $L^{(0)}_{R_i}(t,f) = W^{(0)}_{x_i}(t,f)$. The associated approximation error is bounded as (cf. (2.84))
Wx0 (t, f )LHG (t, f ) Wx1 (t, f ) Wx0 (t, f ) Wx1 (t, f )
LR
(0)
(0)
(0)
(0)
(0)
(0)
(0)
Ax0
SHLR
(0)
Ax1
(0)
3 2 + 9 x0 ,x1 , 2 x0 ,x1
W x0 LHG W x1 W x0 W x1
LR
(0)
(0)
(0)
R0 that
HLR
R1
2 x0 ,x1 + 8
3 x0 ,x1 + 3 x0 ,x1 .
Using the regularized TF inversion technique described at the end of Subsection 2.3.6, it finally follows that
$$L^{(0)}_{H_{LR}^{\mathcal{G}}}(t,f) \approx \begin{cases} \dfrac{W_{x_1}^{(0)}(t,f) - W_{x_0}^{(0)}(t,f)}{W_{x_0}^{(0)}(t,f)\, W_{x_1}^{(0)}(t,f)}, & (t,f) \in \mathcal{R}, \\ 0, & (t,f) \notin \mathcal{R}. \end{cases} \qquad (4.15)$$
Here, $\mathcal{R}$ is the TF region where the product $W_{x_0}^{(0)}(t,f)\, W_{x_1}^{(0)}(t,f)$ exceeds a threshold that is appropriately chosen to meet specific accuracy requirements (see Subsection 2.3.6). The approximate TF formulation (4.15) extends the frequency domain formulation (4.12) valid in the stationary case to the case of jointly CL nonstationary processes.
Time-Frequency Formulation of the Deflection-Optimal Detector

The TF formulation of the deflection-optimal detector is completely analogous. It is again based on the assumption that $x_0(t)$ and $x_1(t)$ are jointly CL underspread processes whose GEAFs are supported within a region $\mathcal{G}$ such that the joint correlation spread of $x_0(t)$ and $x_1(t)$ is $\sigma_{x_0,x_1} = |\mathcal{G}|$. Then, (4.13) is again recognized to be of the type (2.76), now with $G = H_D$, $H_1 = H_2 = R_1$, and $H_3 = R_1 - R_0$. Hence, according to Theorem 2.18, the DL part $H_D^{\mathcal{G}}$ of $H_D$ approximately satisfies (4.13),
$$R_1 H_D^{\mathcal{G}} R_1 \approx R_1 - R_0 .$$
The resulting approximation error is bounded as
R1 HG R1 (R1 R0 ) D R1 2 2 HG 2 LR
2
3 x0 ,x1 .
(0) (0)
To obtain an approximate TF formulation of the deflection-optimal detector, we next apply Theorem 2.19 (with $G = H_D$, $H_1 = H_2 = R_1$, $H_3 = R_1 - R_0$, and $|\mathcal{G}| = \sigma_{x_0,x_1}$). With $L^{(0)}_{R_i}(t,f) = W^{(0)}_{x_i}(t,f)$, this yields the approximation
$$L^{(0)}_{H_D^{\mathcal{G}}}(t,f)\, \big[W_{x_1}^{(0)}(t,f)\big]^2 \approx W_{x_1}^{(0)}(t,f) - W_{x_0}^{(0)}(t,f)$$
for the Weyl symbol of $H_D^{\mathcal{G}}$. Furthermore, the corresponding approximation error is bounded as
LHG (t, f ) Wx1 (t, f )
D
(0)
(0)
Ax1 LHG
(0)
D
2 1
Wx0 (t, f ) Wx1 (t, f ) SHD

(0) (0) 2
(0)
(0)
3 2 + 9 x0 ,x1 , 2 x0 ,x1
(0) 2 W x1
W x0 W x1
2 2
R1 the approximation
HD
2 x0 ,x1 + 8
3 x0 ,x1 + 3 x0 ,x1 .
Hence, with the regularized TF inversion technique described in Subsection 2.3.6, we finally obtain the approximation
$$L^{(0)}_{H_D^{\mathcal{G}}}(t,f) \approx \begin{cases} \dfrac{W_{x_1}^{(0)}(t,f) - W_{x_0}^{(0)}(t,f)}{\big[W_{x_1}^{(0)}(t,f)\big]^2}, & (t,f) \in \mathcal{R}, \\ 0, & (t,f) \notin \mathcal{R}. \end{cases} \qquad (4.16)$$
Here, $\mathcal{R}$ is the TF region where $W_{x_1}^{(0)}(t,f)$ exceeds a threshold chosen according to certain accuracy requirements (see Subsection 2.3.6). This approximate TF formulation of the deflection-optimal detector extends the frequency domain formulation (4.14) valid for jointly stationary processes to the case of jointly CL underspread nonstationary processes.
4.2.3 Time-Frequency Detector Design

The expressions (4.15) and (4.16) provide approximate TF formulations of the DL parts of the operators $H_{LR}$ and $H_D$, respectively. This motivates the definition of alternative test statistics (termed TF pseudo-likelihood ratio detector and TF pseudo-deflection-optimal detector) that are based on a TF design of the underlying operators. The TF designed detectors proposed below offer the same advantages as the TF pseudo-Wiener filter discussed in Subsection 4.1.3:

• Modified a priori knowledge. The a priori information required for the design of the TF detectors is specified in the physically intuitive TF domain.

• Reduced and stable computation. The design of $H_{LR}$ and $H_D$ requires computation of the operator inverses $R_0^{-1}$ and $R_1^{-1}$. In contrast, the TF designed detectors are based on simple and stable scalar inversions.

Time-Frequency Pseudo-Likelihood Ratio Detector

Motivated by the approximation (4.15), we define a TF test statistic
$$\tilde{\Lambda}_{LR}(y) \triangleq \langle \tilde{H}_{LR}\, y, y \rangle = \langle L^{(0)}_{\tilde{H}_{LR}}, W_y^{(0)} \rangle ,$$
where the operator $\tilde{H}_{LR}$ is defined by setting its Weyl symbol equal to the right-hand side of (4.15), i.e.,
$$L^{(0)}_{\tilde{H}_{LR}}(t,f) \triangleq \begin{cases} \dfrac{W_{x_1}^{(0)}(t,f) - W_{x_0}^{(0)}(t,f)}{W_{x_0}^{(0)}(t,f)\, W_{x_1}^{(0)}(t,f)}, & (t,f) \in \mathcal{R}, \\ 0, & (t,f) \notin \mathcal{R}. \end{cases}$$
We refer to $\tilde{\Lambda}_{LR}(y)$ as TF pseudo-likelihood ratio detector. For jointly CL underspread processes, (4.15) implies that $L^{(0)}_{\tilde{H}_{LR}}(t,f) \approx L^{(0)}_{H_{LR}^{\mathcal{G}}}(t,f)$ and thus $\tilde{H}_{LR} \approx H_{LR}^{\mathcal{G}} \approx H_{LR}$ as well as $\tilde{\Lambda}_{LR}(y) \approx \Lambda_{LR}(y)$. Hence, in the underspread case the TF pseudo-likelihood ratio detector will perform nearly as well as the likelihood ratio detector.

Time-Frequency Pseudo-Deflection-Optimal Detector

Motivated by the approximation (4.16), we consider the TF test statistic
$$\tilde{\Lambda}_D(y) \triangleq \langle \tilde{H}_D\, y, y \rangle = \langle L^{(0)}_{\tilde{H}_D}, W_y^{(0)} \rangle$$
defined with the operator $\tilde{H}_D$ whose Weyl symbol is chosen to equal the right-hand side of (4.16), i.e.,
$$L^{(0)}_{\tilde{H}_D}(t,f) \triangleq \begin{cases} \dfrac{W_{x_1}^{(0)}(t,f) - W_{x_0}^{(0)}(t,f)}{\big[W_{x_1}^{(0)}(t,f)\big]^2}, & (t,f) \in \mathcal{R}, \\ 0, & (t,f) \notin \mathcal{R}. \end{cases}$$
We refer to $\tilde{\Lambda}_D(y)$ as TF pseudo-deflection-optimal detector. For jointly CL underspread processes, (4.16) implies that $L^{(0)}_{\tilde{H}_D}(t,f) \approx L^{(0)}_{H_D^{\mathcal{G}}}(t,f)$ and thus $\tilde{H}_D \approx H_D^{\mathcal{G}} \approx H_D$ as well as $\tilde{\Lambda}_D(y) \approx \Lambda_D(y)$. Thus, in the underspread case the TF pseudo-deflection-optimal detector performs nearly as well as the deflection-optimal detector.
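The two TF detector designs reduce to scalar operations on sampled spectra, followed by a TF inner product with the Wigner distribution of the observation. The Python sketch below illustrates this under stated assumptions: the region R is modeled by a simple threshold `eps` (the precise definition is in Subsection 2.3.6), the Wigner distribution of the observation is assumed to be available as an array `W_obs` on the same grid, and all function names are hypothetical.

```python
import numpy as np

def pseudo_lr_symbol(W0, W1, eps):
    """Weyl symbol following (4.15): (W1 - W0) / (W0 * W1) where W0 * W1 >= eps."""
    W0 = np.asarray(W0, dtype=float)
    W1 = np.asarray(W1, dtype=float)
    prod = W0 * W1
    L = np.zeros_like(prod)
    region = prod >= eps                  # assumed stand-in for the region R
    L[region] = (W1[region] - W0[region]) / prod[region]
    return L

def pseudo_deflection_symbol(W0, W1, eps):
    """Weyl symbol following (4.16): (W1 - W0) / W1**2 where W1 >= eps."""
    W0 = np.asarray(W0, dtype=float)
    W1 = np.asarray(W1, dtype=float)
    L = np.zeros_like(W1)
    region = W1 >= eps                    # assumed stand-in for the region R
    L[region] = (W1[region] - W0[region]) / W1[region] ** 2
    return L

def tf_test_statistic(L, W_obs, dt=1.0, df=1.0):
    """Riemann-sum approximation of the TF inner product <L, W_obs>."""
    return float(np.sum(np.asarray(L) * np.asarray(W_obs)) * dt * df)

# Tiny hand-checkable example (hypothetical values on a 1 x 2 TF grid).
W0 = [[1.0, 0.0]]
W1 = [[2.0, 3.0]]
L_lr = pseudo_lr_symbol(W0, W1, eps=0.5)          # [[0.5, 0.0]]
L_d = pseudo_deflection_symbol(W0, W1, eps=0.5)   # [[0.25, 1/3]]
stat = tf_test_statistic(L_lr, [[2.0, 5.0]])      # 0.5 * 2 + 0 * 5 = 1.0
```

Both designs avoid operator inversion entirely; the threshold only excludes TF points where the scalar division would be ill-conditioned.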
4.2.4 Simulation Results

Underspread Example. We next present Monte Carlo simulations for the detection of a nonstationary Gaussian signal s(t) in uncorrelated nonstationary Gaussian noise n(t) (both processes have been synthesized using the technique introduced in [89]). This corresponds to the hypothesis test (4.9) with $x_0(t) = n(t)$ and $x_1(t) = s(t) + n(t)$, i.e., $R_0 = R_n$ and $R_1 = R_s + R_n$. Figs. 4.3(a),(c) show the Wigner-Ville spectrum and expected ambiguity function of s(t) and Figs. 4.3(b),(d) show the Wigner-Ville spectrum and expected ambiguity function of n(t). It is seen that s(t) and n(t) are jointly CL underspread processes. The Weyl symbols of the operators $H_{LR}$ and $\tilde{H}_{LR}$ are shown in Figs. 4.3(e) and (f), from which the approximation $L^{(0)}_{H_{LR}}(t,f) \approx L^{(0)}_{\tilde{H}_{LR}}(t,f)$ can be verified. Furthermore, the receiver operating characteristics (ROCs) are shown in Figs. 4.3(g) and (h) (the ROCs, empirically obtained by averaging over $3 \cdot 10^5$ experiments, show the detection probability $P_D$ versus the false alarm probability $P_F$ [108, 168, 187, 202]). Since the ROCs are practically indistinguishable, the likelihood ratio detector and the TF pseudo-likelihood ratio detector perform equally well. Similar remarks apply to the Weyl symbols of $H_D$ and $\tilde{H}_D$ (Fig. 4.3(i),(j)) and the ROCs of the deflection-optimal detector and the TF pseudo-deflection-optimal detector (Fig. 4.3(k),(l)).

Figure 4.3: Illustration of the TF formulation of likelihood ratio and deflection-optimal detector: (a) Wigner-Ville spectrum of s(t), (b) Wigner-Ville spectrum of n(t), (c) magnitude of expected ambiguity function of s(t), (d) magnitude of expected ambiguity function of n(t), (e) Weyl symbol of $H_{LR}$, (f) Weyl symbol of $\tilde{H}_{LR}$, (g) ROC of $\Lambda_{LR}(y)$, (h) ROC of $\tilde{\Lambda}_{LR}(y)$, (i) Weyl symbol of $H_D$, (j) Weyl symbol of $\tilde{H}_D$, (k) ROC of $\Lambda_D(y)$, (l) ROC of $\tilde{\Lambda}_D(y)$. The rectangles in (c) and (d) have area 1 and thus allow an assessment of the underspread property of s(t) and n(t). The signal length is 128 samples.
Figure 4.4: Discrimination between an overspread process $x_0(t)$ and an underspread process $x_1(t)$: (a) Wigner-Ville spectrum of $x_0(t)$, (b) Wigner-Ville spectrum of $x_1(t)$, (c) magnitude of expected ambiguity function of $x_0(t)$, (d) magnitude of expected ambiguity function of $x_1(t)$, (e) Weyl symbol of $H_{LR}$, (f) Weyl symbol of $\tilde{H}_{LR}$, (g) ROC of $\Lambda_{LR}(y)$, (h) ROC of $\tilde{\Lambda}_{LR}(y)$. The rectangles in (c) and (d) have area 1 and thus allow an assessment of the underspread/overspread property of $x_0(t)$ and $x_1(t)$. The signal length is 128 samples.

Overspread Example. We next consider the discrimination between an overspread process $x_0(t)$ and an underspread process $x_1(t)$. The Wigner-Ville spectra and expected ambiguity functions of $x_0(t)$ and $x_1(t)$ are shown in Figs. 4.4(a)–(d). It is seen that both processes consist of two energetic components that are concentrated in disjoint TF regions. However, while these components are uncorrelated in the case of $x_1(t)$, they are strongly correlated in the case of the overspread process $x_0(t)$. The hypothesis test is thus intended to find out whether the two TF disjoint process components are correlated (the overspread case $x_0(t)$) or uncorrelated (the underspread case $x_1(t)$). Figs. 4.4(e)–(h) compare the likelihood ratio detector $H_{LR}$ and the TF pseudo-likelihood ratio detector $\tilde{H}_{LR}$. Since $x_0(t)$ is an overspread process, the Weyl symbol of the TF pseudo-likelihood ratio detection operator $\tilde{H}_{LR}$ (shown in Fig. 4.4(f)) significantly differs from the Weyl symbol of $H_{LR}$ (shown in Fig. 4.4(e)). This difference is furthermore reflected by the fact that the ROC of $\tilde{\Lambda}_{LR}(y)$ (see Fig. 4.4(h)) is significantly worse than the (optimal) ROC of $\Lambda_{LR}(y)$ (see Fig. 4.4(g)). We conclude that in overspread scenarios, the performance of TF designed detectors is far from optimum.
4.3 Sounding of Mobile Radio Channels

Accurate wideband measurements of mobile radio channels are important for the design or simulation of mobile radio systems with high data rate. The typical channel sounders used to obtain such measurements are based on correlation/pulse compression techniques [40, 52, 164]. While these techniques are theoretically exact in the case of time-invariant channels, in time-varying environments the measurements obtained are affected by systematic measurement errors. In this section, these errors will briefly be discussed for the case of uncalibrated sounders. For further details and a discussion of calibrated channel sounders, we refer to [149, 150, 156].

Figure 4.5: Generic channel sounder model.
4.3.1
Channel Sounder Model

m (tmT )
We rst describe a generic model for correlative channel sounders (see Fig. 4.5). The sounding signal (i.e., the input to the mobile radio channel H) is obtained by periodic excitation (t) = of an LTI transmit lter G with impulse response g(t), i.e., x(t) = (G)(t) = (g )(t) = g(t mT ) .
In the following, we assume that the duration of g(t) is less than T . The output signal of the equivalent complex baseband channel H (assumed to be bandlimited to [B, B]) is given by y(t) = (Hx)(t). At the receiver front end, y(t) is passed through an LTI receive lter R with impulse response r(t). This results in the sounder output signal u(t) = (r y)(t) = (Ry)(t) = (RHx)(t) = (RHG)(t). We note that for proper operation, g(t) and r(t) should be designed such that (r f )(t) (t) (pulse compression), equivalently RG I . Assuming that channel and receive lter commute, i.e., RH = HR , the pulse compression property (4.17) implies u(t) = (RHG)(t) = (HRG)(t) (H)(t) = h(t, mT ).
m
(4.17)
(4.18)
Therefore, under these assumptions channel sounders essentially achieve a direct impulse sounding of the channel (i.e., the effective sounding signal is the impulse train $\delta_T(t)$) and $u(t)$ approximately consists of subsequent channel impulse response snapshots. An estimate of $h^{(-1/2)}(t, \tau) = h(t + \tau, t)$ (cf. (B.2)) at $t = mT_{\mathrm{rep}} = mKT$ (with the repetition factor $K$; note that $T_{\mathrm{rep}} = KT$) can be calculated from $u(t)$ as
\[ \hat h^{(-1/2)}(mT_{\mathrm{rep}}, \tau) = u(\tau + mT_{\mathrm{rep}})\, w(\tau) , \tag{4.19} \]
where $w(\tau)$ equals 1 for $0 \le \tau < T$ and 0 else. Furthermore, an estimate of $h^{(-1/2)}(t, \tau)$ can then be obtained by interpolation according to
\[ \hat h^{(-1/2)}(t, \tau) = \sum_m \hat h^{(-1/2)}(mT_{\mathrm{rep}}, \tau)\, \mathrm{sinc}\Big(\frac{t - mT_{\mathrm{rep}}}{T_{\mathrm{rep}}}\Big) = \sum_m u(\tau + mT_{\mathrm{rep}})\, w(\tau)\, \mathrm{sinc}\Big(\frac{t - mT_{\mathrm{rep}}}{T_{\mathrm{rep}}}\Big) = (\Theta u)(t, \tau) = (\Theta RHG\delta_T)(t, \tau) , \tag{4.20} \]
with $\mathrm{sinc}(x) = \sin(\pi x)/(\pi x)$. Here, $\Theta$ denotes the mapping of the output signal $u(t)$ to the measured impulse response. We note that the above generic model covers the three most important practical channel sounders, namely the PN-sounder [40, 52, 164], swept time-delay cross-correlator [37], and chirp sounder [180, 194].
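The snapshot extraction (4.19) and the interpolation (4.20) can be illustrated in discrete time. The following is only a minimal sketch: the function names, the assumption that the sounder output is sampled with an integer number of samples per sounding period $T$, and the use of `numpy.sinc` (which implements $\sin(\pi x)/(\pi x)$) are choices made for this example and are not part of the sounder model itself.

```python
import numpy as np

def snapshots_from_output(u, samples_per_period, K=1):
    """Cut the sampled sounder output u into snapshots, cf. (4.19).

    Row m holds u(tau + m*T_rep) for 0 <= tau < T, i.e. the rectangular
    window w(tau) is realized by slicing one sounding period.
    """
    P = samples_per_period
    step = K * P                              # T_rep = K*T, in samples
    n_snap = (len(u) - P) // step + 1         # number of complete snapshots
    return np.stack([u[m * step:m * step + P] for m in range(n_snap)])

def interpolate_snapshots(snaps, t_over_Trep):
    """Sinc interpolation between snapshots, cf. (4.20).

    t_over_Trep is the time instant in units of T_rep (may be fractional).
    """
    m = np.arange(snaps.shape[0])
    weights = np.sinc(t_over_Trep - m)        # sin(pi x)/(pi x)
    return weights @ snaps

# Toy usage with a hypothetical output signal (not measured data).
u = np.arange(12, dtype=float)
snaps = snapshots_from_output(u, samples_per_period=3, K=2)
h_at_1 = interpolate_snapshots(snaps, 1.0)    # reproduces snapshot m = 1
```

At integer multiples of $T_{\mathrm{rep}}$ the interpolation returns the stored snapshot unchanged, which is the usual consistency check for sinc interpolation.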
4.3.2 Analysis of Measurement Errors

As was shown in [149, 150], the difference (error) between the measured impulse response $\hat h^{(-1/2)}(t, \tau) = (\Theta RHG\delta_T)(t, \tau)$ and the impulse response $h^{(1/2)}(t, \tau) = h(t, t - \tau)$ usually desired in mobile radio applications can be split into four components, i.e.,
\[ \hat h^{(-1/2)}(t, \tau) - h^{(1/2)}(t, \tau) = \sum_{i=1}^{4} e_i(t, \tau) . \tag{4.21} \]
We next discuss the definitions and interpretations of the error components $e_i(t, \tau)$. Furthermore, upper bounds on specific error norms are provided. These bounds are formulated in terms of important channel and sounder parameters. We note that for an arbitrary norm $\|\cdot\|$, application of the triangle inequality to (4.21) gives
\[ \big\| \hat h^{(-1/2)} - h^{(1/2)} \big\| \le \sum_{i=1}^{4} \| e_i \| . \tag{4.22} \]
Thus, our upper bounds on the error components $e_i(t, \tau)$ also yield upper bounds on the total error $\hat h^{(-1/2)}(t, \tau) - h^{(1/2)}(t, \tau)$.
Commutation Error
For time-varying channels, the commutation property $RH = HR$ in (4.18) is not satisfied exactly (recall that correlative sounding assumes the operator ordering $HRG$ but in fact we have $RHG$). This causes a commutation error
\[ e_1(t, \tau) \triangleq \hat h^{(-1/2)}(t, \tau) - (\Theta HRG\delta_T)(t, \tau) = (\Theta RHG\delta_T)(t, \tau) - (\Theta HRG\delta_T)(t, \tau) = (\Theta [R, H] x)(t, \tau) , \]
where $[R, H] = RH - HR$ denotes the commutator of $R$ and $H$. In Subsection 2.3.16, we saw that (jointly) underspread systems approximately commute. Indeed, it can be shown (by adapting the proof of Theorem 2.14) [149] that the commutation error is bounded as
\[ \frac{|e_1(mT_{\mathrm{rep}}, \tau)|}{\|g\|_\infty \|S_H\|_1 \|r\|_1} \le 2\pi\, m_R^{(1,0)} m_H^{(0,1)} , \qquad \frac{\int_\tau |e_1(mT_{\mathrm{rep}}, \tau)|\, d\tau}{\|g\|_\infty \|S_H\|_1 \|r\|_1} \le 2\pi\, T\, m_R^{(1,0)} m_H^{(0,1)} , \tag{4.23} \]
where $m_R^{(1,0)}$ measures the effective duration of the receive filter impulse response $r(t)$ and $m_H^{(0,1)}$ measures the channel's effective Doppler spread. From (4.23), it is seen that the commutation error will be small if the receive filter is not too long and the channel does not vary too fast. We note that the error bound (4.23) also proved important in the context of scattering function estimation for random time-varying channels [5, 6].
Pulse Compression Error

According to (4.17), correlative channel sounders theoretically require that the transmit filter and the receive filter have perfect pulse compression properties. Unfortunately, practical transmit and receive filters do not yield perfect pulse compression, i.e., $(r * g)(t) \neq \delta(t)$ or $RG \neq I$. This leads to a pulse compression error
\[ e_2(t, \tau) \triangleq (\Theta HRG\delta_T)(t, \tau) - (\Theta H\delta_T)(t, \tau) . \]
This error is determined by the transmit and receive filters and is not related to the channel's time variation. In practical channel sounders, calibration is used to reduce the effects of imperfect pulse compression. However, it is shown in [150, 156] that for time-varying channels conventional calibration procedures are affected by systematic errors as well. The pulse compression error is bounded as [149]
\[ \frac{|e_2(mT_{\mathrm{rep}}, \tau)|}{\|S_H\|_1} \le \epsilon_2 \triangleq \frac{1}{T} \sum_{|k| \le BT} \Big| R\Big(\frac{k}{T}\Big) G\Big(\frac{k}{T}\Big) - 1 \Big| , \qquad \frac{\int_\tau |e_2(mT_{\mathrm{rep}}, \tau)|\, d\tau}{\|S_H\|_1} \le T \epsilon_2 , \tag{4.24} \]
where $B$ is the channel bandwidth. The bound (4.24) is intuitive since the error resulting from imperfect correlation/pulse compression properties of transmit and receive filter is obviously determined by the deviation of the composite transfer function $R(f)G(f)$ from the ideal value 1.
Aliasing Error
In order that subsequent snapshots do not overlap, the channel's maximum delay $\tau_H^{(\max)}$ must satisfy $\tau_H^{(\max)} \le T$. Similarly, in order that the channel variation is properly tracked using the repetition period $T_{\mathrm{rep}} = KT$, the channel's maximum Doppler shift $\nu_H^{(\max)}$ must satisfy $\nu_H^{(\max)} \le 1/(2KT)$. Combining these two requirements yields the perfect identification condition [103, 104]
\[ \tau_H^{(\max)} \le T \le \frac{1}{2K \nu_H^{(\max)}} . \tag{4.25} \]
Note that it is necessary for this condition to hold that $\tau_H^{(\max)} \nu_H^{(\max)} \le 1/(2K)$, i.e., that the channel is underspread. If the condition in (4.25) is not met, there will occur aliasing errors (due to overlapping snapshots and/or insufficient channel tracking). The associated error component is given by
\[ e_3(t, \tau) \triangleq (\Theta H\delta_T)(t, \tau) - h^{(-1/2)}(t, \tau) . \]
Since correlative channel sounding can be shown to correspond to a sampling of the channel's time-varying transfer function (i.e., of the GWS $L_H^{(\alpha)}(t, f)$), Theorem 2.35 directly applies (with $F = 1/(2KT)$). It follows that the aliasing error $e_3(t, \tau)$ is bounded as [149]
\[ \frac{\int_\tau |e_3(t, \tau)|\, d\tau}{\|S_H\|_1} \le 2 \Big[ \frac{m_H^{(1,0)}}{T} + 2KT\, m_H^{(0,1)} \Big] , \qquad \frac{\|e_3\|_2}{\|S_H\|_2} \le 2 \Big[ \frac{M_H^{(1,0)}}{T} + 2KT\, M_H^{(0,1)} \Big] . \]
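The perfect identification condition (4.25) is easy to check numerically when choosing the sounding period. The sketch below is only an illustration; the function names are hypothetical, and the numerical values in the usage lines are those quoted for the two-path example in Subsection 4.3.4 (delay 2 µs, Doppler shift up to 1000 Hz, $T$ = 12.7 µs, $K$ = 1).

```python
def perfect_identification(tau_max, nu_max, T, K=1):
    """Check condition (4.25): tau_max <= T <= 1 / (2 * K * nu_max)."""
    no_snapshot_overlap = tau_max <= T                 # delay fits into one period
    variation_tracked = nu_max <= 1.0 / (2 * K * T)    # Doppler below half the snapshot rate
    return no_snapshot_overlap and variation_tracked

def admissible_period_range(tau_max, nu_max, K=1):
    """Interval of sounding periods T allowed by (4.25), or None if it is empty.

    The interval is non-empty only if tau_max * nu_max <= 1 / (2 * K),
    i.e. only for sufficiently underspread channels.
    """
    lower, upper = tau_max, 1.0 / (2 * K * nu_max)
    return (lower, upper) if lower <= upper else None

# Usage with the parameters of the two-path example.
ok = perfect_identification(tau_max=2e-6, nu_max=1000.0, T=12.7e-6, K=1)
t_range = admissible_period_range(tau_max=2e-6, nu_max=1000.0, K=1)
```

A large repetition factor $K$ shrinks the admissible interval from above, which is why slowly repeated snapshots can fail to track a fast channel even when the delay requirement is met.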
Misinterpretation Error
Usually, in the context of practical channel sounders the measured function $\hat h^{(-1/2)}(t, \tau) = (\Theta u)(t, \tau)$ is erroneously interpreted as an estimate of the impulse response $h^{(1/2)}(t, \tau)$ rather than of $h^{(-1/2)}(t, \tau)$. This corresponds to a misinterpretation error
\[ e_4(t, \tau) \triangleq h^{(-1/2)}(t, \tau) - h^{(1/2)}(t, \tau) . \]
Using the $\alpha$-invariance results of Subsection 2.3.1, $e_4(t, \tau)$ can be bounded as
\[ \frac{\int_\tau |e_4(t, \tau)|\, d\tau}{\|S_H\|_1} \le 2\pi\, m_H^{(1,1)} , \qquad \frac{\|e_4\|_2}{\|S_H\|_2} \le 2\pi\, M_H^{(1,1)} . \tag{4.26} \]
Note that the effect of misinterpretation can easily be avoided by converting $\hat h^{(-1/2)}(t, \tau)$ into $\hat h^{(1/2)}(t, \tau)$ via the relation $\hat h^{(1/2)}(t, \tau) = \hat h^{(-1/2)}(t - \tau, \tau)$.
4.3.3 Optimization of PN Sequence Length

We next analyze the dependence of the measurement errors of a PN sounder (i.e., a sounder using pseudo-noise sequences as transmit signal) on the PN sequence length $N$ for constant measurement bandwidth. Disregarding aliasing and misinterpretation error and further developing and bounding (4.23) and (4.24), the following bound can be shown:
\[ \frac{\big| \hat h_H^{(-1/2)}(mT_{\mathrm{rep}}, \tau) - h_H^{(-1/2)}(mT_{\mathrm{rep}}, \tau) \big|}{\|S_H\|_1} \le \frac{|e_1(mT_{\mathrm{rep}}, \tau)|}{\|S_H\|_1} + \frac{|e_2(mT_{\mathrm{rep}}, \tau)|}{\|S_H\|_1} \le N\, \frac{\pi\, m_H^{(0,1)}}{2B} + \frac{2}{N} . \tag{4.27} \]
It is seen that the bound for the commutation error, $N \pi m_H^{(0,1)} / (2B)$, increases with increasing $N$ while the bound for the pulse compression error, $2/N$, decreases with increasing $N$. The total bound on the right-hand side of (4.27) is minimized by the following value of the PN sequence length $N$:
\[ N = \sqrt{\frac{4B}{\pi\, m_H^{(0,1)}}} . \]
However, for practical implementations $N + 1$ has to be a power of two so that proper PN sequences can be used. Thus, our final design rule is to choose a PN sequence length
\[ N_{\mathrm{opt}} = 2^{l_{\mathrm{opt}}} - 1 \quad \text{where} \quad l_{\mathrm{opt}} = \mathrm{round}\{ \mathrm{ld}(N + 1) \} . \tag{4.28} \]
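The design rule (4.28) can be evaluated with a few lines of code. This is a sketch under explicit assumptions: it takes the total bound in (4.27) to have the form $cN + 2/N$ with $c = \pi m_H^{(0,1)}/(2B)$, and the Doppler moment used in the usage lines ($0.4 \cdot 60 / 1.4$ Hz) is computed here for a two-path channel with path amplitudes 1 and 0.4 and Doppler shift 60 Hz, assuming the moment is normalized by $\|S_H\|_1$. The function names are hypothetical.

```python
import math

def total_bound(N, B, m01):
    """Right-hand side of (4.27) under the assumed form c*N + 2/N."""
    return N * math.pi * m01 / (2.0 * B) + 2.0 / N

def optimal_pn_length(B, m01):
    """Real-valued minimizer of the bound and the rounded PN length, cf. (4.28)."""
    n_real = math.sqrt(4.0 * B / (math.pi * m01))   # d/dN (c*N + 2/N) = 0
    l_opt = round(math.log2(n_real + 1.0))          # ld = logarithm to base 2
    return n_real, l_opt, 2 ** l_opt - 1

# Usage: B = 5 MHz (sampling frequency 10 MHz), assumed Doppler moment in Hz.
B = 5e6
m01 = 0.4 * 60.0 / 1.4
n_real, l_opt, n_opt = optimal_pn_length(B, m01)
```

Because $N + 1$ must be a power of two, the rounded length can differ noticeably from the real-valued minimizer; the bound is fairly flat around its minimum, so this rounding is usually harmless.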
Figure 4.6: Systematic measurement errors (dashed line) and corresponding upper bounds (solid line) for a synthetic two-path channel (in dB): (a) Maximum integrated commutation error, (b) maximum integrated pulse-compression error, (c) maximum integrated misinterpretation error. (In this example, there was no aliasing error.) The horizontal axis shows the channel's Doppler shift $\nu_1$ in Hz.
4.3.4 Simulation Results

Sounding of a Two-path Channel. To illustrate the systematic sounding errors described above, we simulated the sounding of a synthetic two-path channel with carrier frequency 1.8 GHz. The baseband channel's impulse response is $h^{(1/2)}(t, \tau) = a_0 \delta(\tau) + a_1 \cos(2\pi\nu_1 t)\, \delta(\tau - \tau_1)$. This channel consists of a direct path with constant amplitude $a_0 = 1$ and a second path with delay $\tau_1 = 2\,\mu$s and sinusoidally varying amplitude (peak amplitude $a_1 = 0.4$). The Doppler shift $\nu_1$ was varied between 0.1 Hz and 1000 Hz, corresponding to a velocity ranging from 0.06 km/h to 600 km/h. We assumed a PN sounder using a PN sequence of length $N = 127$ and repetition factor $K = 1$ (i.e., $T_{\mathrm{rep}} = T$). With the sampling frequency (double measurement bandwidth $B$) assumed as 10 MHz, the duration of the PN sequence (= sounding period) is $T = 12.7\,\mu$s.

In this example, the aliasing error is zero since with $\tau_1 = 2\,\mu$s, $\nu_1 \le 1000$ Hz, $T = 12.7\,\mu$s, and $K = 1$ condition (4.25) is satisfied. The other three error components and their upper bounds are compared for various values of $\nu_1$ in Fig. 4.6. The short PN sequence caused the pulse-compression error to dominate. The maximum integrated pulse-compression error $\max_m \frac{1}{T} \int_\tau |e_2(mT, \tau)|\, d\tau$ and the corresponding upper bound $\|S_H\|_1 \epsilon_2$ according to (4.24) were calculated as $5.9 \cdot 10^{-3}$ and $7.8 \cdot 10^{-3}$, respectively, independently of $\nu_1$ (see Fig. 4.6(b)). The maximum integrated commutation error $\max_m \frac{1}{T} \int_\tau |e_1(mT, \tau)|\, d\tau$ and the corresponding upper bound $2\pi \|g\|_\infty \|S_H\|_1 \|r\|_1\, m_R^{(1,0)} m_H^{(0,1)}$ according to (4.23) are shown in Fig. 4.6(a) as a function of $\nu_1$. Similarly, the maximum integrated misinterpretation error $\max_m \frac{1}{T} \int_\tau |e_4(mT, \tau)|\, d\tau$ and the corresponding upper bound $2\pi \|S_H\|_1 m_H^{(1,1)} / T$ according to (4.26) are shown in Fig. 4.6(c). It is seen that the commutation and misinterpretation errors and the corresponding upper bounds grow with increasing $\nu_1$; however, in this example the misinterpretation error always stays well below the commutation and pulse-compression errors.
Figure 4.7: Total systematic measurement error (dashed line) and corresponding upper bound (solid line) in dB for a synthetic two-path channel: (a) for PN sequence length $N = 127$, (b) for PN sequence length $N = 1023$. The horizontal axis shows the channel's Doppler shift $\nu_1$ in Hz.

Finally, the total error $\max_m \frac{1}{T} \int_\tau \big| \hat h_H^{(-1/2)}(mT, \tau) - h_H^{(1/2)}(mT, \tau) \big|\, d\tau$ and the associated bound according to (4.22) are shown in Fig. 4.7(a). Comparing with Fig. 4.6, we see that whereas up to about $\nu_1 = 200$ Hz the (constant) pulse-compression error dominates, for $\nu_1 > 200$ Hz the commutation error dominates. For comparison, Fig. 4.7(b) shows the total error and corresponding bound when the same two-path channel is sounded with a PN sequence of length $N = 1023$. It is seen that the value of $\nu_1$ where the commutation error starts to dominate has dropped to about 50 Hz. This shows that for different maximum Doppler shifts $\nu_1$ different values of $N$ are preferable. The choice of $N$ for a given $\nu_{\max}$ is considered next.

Optimization of PN Sequence Length. To illustrate our result regarding optimal PN sequence length, we sounded the same two-path channel as in the previous example, with $\nu_1 = 60$ Hz, using PN sequences of length $N = 2^l - 1$ with $l = 5, \ldots, 11$. Fig. 4.8 shows the maximum magnitude of the commutation error, $\max_{m,\tau} |e_1(mT, \tau)|$, of the pulse-compression error, $\max_{m,\tau} |e_2(mT, \tau)|$, and of their sum, $\max_{m,\tau} |e_1(mT, \tau) + e_2(mT, \tau)|$, as a function of $N$. It is seen that these errors are best balanced for $N = N_{\min} = 511$ since $\max_{m,\tau} |e_1(mT, \tau) + e_2(mT, \tau)|$ is minimal at this point. This agrees with our theoretical guideline (4.28) which yields $l_{\mathrm{opt}} = 9$ and thus $N_{\mathrm{opt}} = 2^{l_{\mathrm{opt}}} - 1 = 511$ as well.

Figure 4.8: Maximum magnitude (in dB) of the commutation error (dotted line), of the pulse-compression error (dashed line), and of their sum (solid line) as a function of the PN sequence length $N$ for a synthetic two-path channel with $\nu_1 = 60$ Hz.

4.4 Multicarrier Communication Systems

Orthogonal frequency division multiplexing (OFDM) [28, 30, 133, 182, 209, 215] is a multicarrier communication scheme used or proposed for digital audio broadcasting (DAB), digital video broadcasting (DVB), wireless local area networks (WLANs), and data transmission over digital subscriber lines (xDSL services, often also referred to as discrete multi-tone (DMT)). Recently, biorthogonal frequency division multiplexing (BFDM) has been proposed as a generalization of OFDM that is particularly attractive in time-varying environments [20, 21, 128]. Subsequently, we will briefly discuss the relevance of Subsection 2.3.8 (which deals with approximate eigenfunctions and approximate diagonalization) to pulse-shaping OFDM systems. We note that similar results have been presented for random time-varying channels in [127].
4.4.1 Pulse-Shaping OFDM and BFDM Systems

Recently, pulse-shaping OFDM and BFDM systems [19–21, 128, 201] have been recognized to offer improved robustness to time-frequency dispersion caused by time-varying (mobile radio) channels. Subsequently, we consider an OFDM/BFDM system as shown in Fig. 4.9, with $L$ subcarriers, symbol duration $T$, and subcarrier spacing $F$ (chosen such that $TF \ge 1$). The baseband transmit signal for such a system can be written as
\[ x(t) = \sum_{k=-\infty}^{\infty} \sum_{l=0}^{L-1} a_{k,l}\, g_{k,l}(t) , \]
where $a_{k,l}$ is the transmit symbol associated to the $k$th symbol period and the $l$th subcarrier and $g_{k,l}(t) = g(t - kT)\, e^{j2\pi lF(t - kT)}$ is the corresponding transmit pulse that is obtained by TF shifting a prototype filter $g(t)$. The receiver determines the symbol estimates according to
\[ \hat a_{k,l} = \langle y, \gamma_{k,l} \rangle , \tag{4.29} \]
where $\gamma_{k,l}(t) = \gamma(t - kT)\, e^{j2\pi lF(t - kT)}$ is the corresponding receive pulse (obtained by TF shifting the prototype receive filter² $\gamma(t)$) and $y(t) = (Hx)(t)$ is the received signal when transmitting $x(t)$ over an LTV channel $H$.³ Transmit and receive pulses are assumed to satisfy the (bi)orthogonality condition $\langle g_{k,l}, \gamma_{k',l'} \rangle = \delta_{kk'} \delta_{ll'}$. For $H = I$, (4.29) implies $\hat a_{k,l} = a_{k,l}$, i.e., perfect recovery of the transmit symbols.

2 Note that $\gamma(t) = g(t)$ in the case of OFDM.
3 For simplicity, we here consider a noise-free scenario.
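The (bi)orthogonality condition can be verified numerically for the simplest case of conventional OFDM with rectangular pulses. The sketch below works in discrete time with $TF = 1$ and $\gamma = g$; the sampling grid, the number of subcarriers and symbol periods, and the helper name `tf_shifted_pulse` are assumptions made for this illustration.

```python
import numpy as np

def tf_shifted_pulse(proto, k, l, n_sym, n_total):
    """Discrete-time TF shift g_{k,l}[n] = g[n - k*n_sym] * exp(j*2*pi*l*(n - k*n_sym)/n_sym)."""
    n = np.arange(n_total)
    start = k * n_sym
    shifted = np.zeros(n_total, dtype=complex)
    shifted[start:start + len(proto)] = proto          # time shift by k symbol periods
    return shifted * np.exp(2j * np.pi * l * (n - start) / n_sym)  # frequency shift by l*F

n_sym, n_sub, n_blocks = 8, 8, 3                       # samples per symbol, subcarriers, symbols
g = np.ones(n_sym) / np.sqrt(n_sym)                    # unit-energy rectangular prototype
pulses = np.array([tf_shifted_pulse(g, k, l, n_sym, n_sym * n_blocks)
                   for k in range(n_blocks) for l in range(n_sub)])

# Gram matrix of inner products <g_{k,l}, g_{k',l'}>; orthonormality means it is the identity.
gram = pulses @ pulses.conj().T
```

For BFDM one would use a different receive prototype $\gamma$ and test `pulses_g @ pulses_gamma.conj().T` against the identity instead.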
Figure 4.9: OFDM/BFDM communication over a time-varying channel $H$.
4.4.2 Approximate Input-Output Relation for OFDM/BFDM Systems

We first note that the received symbols can be written as (cf. also [128])
\[ \hat a_{k,l} = \langle Hx, \gamma_{k,l} \rangle = \sum_{k'=-\infty}^{\infty} \sum_{l'=0}^{L-1} a_{k',l'}\, \langle Hg_{k',l'}, \gamma_{k,l} \rangle . \]
Hence, due to the time and frequency dispersion of the time-varying channel, a given received symbol $\hat a_{k,l}$ depends not only on $a_{k,l}$ but also on $a_{k',l'}$ with $k' \neq k$ and $l' \neq l$. This parasitic dependence is known as intersymbol interference (ISI) and interchannel interference (ICI) [128]. However, the results of Subsection 2.3.8 imply that for underspread LTV channels and properly chosen prototype pulses $g(t)$ and $\gamma(t)$, this ISI/ICI is small. Specifically, Theorem 2.23 (with $u(t)$ and $v(t)$ replaced by $g(t)$ and $\gamma(t)$, respectively) directly applies to the BFDM transmission system considered here. It implies that for a reasonably underspread channel $H$ and well TF-localized transmit and receive filter prototypes⁴ $g(t)$ and $\gamma(t)$, one has
\[ \langle Hg_{k',l'}, \gamma_{k,l} \rangle \approx L_H^{(\alpha)}(kT, lF)\, \delta_{kk'} \delta_{ll'} , \tag{4.30} \]
where the approximation error is bounded as (cf. (2.119))
\[ \frac{\big| \langle Hg_{k',l'}, \gamma_{k,l} \rangle - L_H^{(\alpha)}(kT, lF)\, \delta_{kk'} \delta_{ll'} \big|}{\|S_H\|_1} \le m_H^{(\phi_{g,\gamma}^{(k-k',\,l-l')})} , \]
with $\phi_{g,\gamma}^{(k,l)}(\tau, \nu) = \big| \delta_{k0} \delta_{l0} - A_{\gamma,g}^{(\alpha)}(\tau + kT, \nu + lF) \big|$. The approximation (4.30) further implies
\[ \hat a_{k,l} \approx L_H^{(\alpha)}(kT, lF)\, a_{k,l} . \tag{4.31} \]
This approximate multiplicative input-output relation for pulse-shaping OFDM and BFDM systems operating over underspread LTV channels is practically important since it allows the use of simple methods for channel estimation and equalization [29, 48, 51, 75, 134, 182, 209]. Consequently, a similar input-output relation is almost always used in the relevant literature. Note that the results in Subsection 2.3.9 provide a theoretical justification of this approximate input-output relation.
4.4.3 Simulation Results

We next illustrate the approximate input-output relation (4.31) using a practical OFDM transmission scheme with rectangular transmit and receive pulses $g(t) = \gamma(t)$ of duration $T = 1$ ms and 64 subcarriers with subcarrier spacing $F = 1$ kHz. No cyclic prefix [165] was used since it cannot help avoid interchannel interference in the case of time-varying channels. A single symbol was transmitted at symbol period $k = k_0$ and subcarrier $l = 32$. The channel was obtained as a single realization of a standard random WSSUS channel [11] with Jakes Doppler profile (maximum Doppler frequency $\nu_{\max} = 156$ Hz) and exponential delay profile (decay $\tau_{\mathrm{decay}} \approx 120\,\mu$s) [176] (note that due to $\nu_{\max} \tau_{\mathrm{decay}} = 0.018$ this channel is reasonably underspread). Fig. 4.10 shows the power of the received symbols $\hat a_{k,l}$ for the symbol periods $k_0$ and $k_0 + 1$ as a function of the subcarrier index $l$. Since the channel was causal and its memory was shorter than the symbol period $T$, all received symbols $\hat a_{k,l}$ for $k < k_0$ and $k > k_0 + 1$ were exactly zero. On the other hand, it is seen that the received symbols $\hat a_{k_0,l}$, $l \neq 32$, as well as $\hat a_{k_0+1,l}$ are not exactly zero, which means that there is some ISI/ICI. However, the power of these symbols is more than 20 dB weaker than that of $\hat a_{k_0,32}$, thereby confirming the approximation (4.31).

4 We note that a procedure for optimally matching the pulses $g(t)$ and $\gamma(t)$ to the channel has been proposed in [128].

Figure 4.10: Intersymbol and interchannel interference occurring in a conventional OFDM system with 64 subcarriers. The single symbol $a_{k_0,32}$ was transmitted. (All other $a_{k,l}$ were zero.) The solid line shows $|\hat a_{k_0,l}|^2$ and the dashed line shows $|\hat a_{k_0+1,l}|^2$ (in dB relative to $|\hat a_{k_0,32}|^2$) as a function of the subcarrier index $l$.
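An experiment of this kind can be mimicked with a few lines of code. The following sketch is not the simulation reported above: instead of a WSSUS realization it uses a deterministic two-path channel whose gains, delay (3 samples) and Doppler shift (100 Hz) are arbitrary illustrative assumptions, and it works at a sampling rate of 64 kHz so that $T$ = 1 ms corresponds to 64 samples. It transmits a single symbol on subcarrier 32 and measures how much power leaks into the other subcarriers and into the next symbol period.

```python
import numpy as np

N = 64          # samples per symbol = number of subcarriers (T*F = 1)
FS = 64e3       # sampling rate in Hz, so T = 1 ms and F = 1 kHz
# (gain, delay in samples, Doppler shift in Hz): illustrative values, not from the text
PATHS = [(1.0, 0, 0.0), (0.5, 3, 100.0)]

def channel(x):
    """Apply the assumed two-path LTV channel to the sampled signal x."""
    n = np.arange(len(x))
    y = np.zeros(len(x), dtype=complex)
    for gain, delay, nu in PATHS:
        delayed = np.concatenate([np.zeros(delay, dtype=complex), x[:len(x) - delay]])
        y += gain * np.exp(2j * np.pi * nu * n / FS) * delayed
    return y

def demodulate(y, k):
    """Inner products <y, gamma_{k,l}> for all l, with rectangular receive pulses."""
    return np.fft.fft(y[k * N:(k + 1) * N]) / np.sqrt(N)

# Transmit a single unit symbol at symbol period 0 on subcarrier 32.
x = np.zeros(2 * N, dtype=complex)
x[:N] = np.exp(2j * np.pi * 32 * np.arange(N) / N) / np.sqrt(N)
y = channel(x)

# Received symbol powers for symbol periods 0 and 1 (rows) and all subcarriers (columns).
power = np.abs(np.array([demodulate(y, 0), demodulate(y, 1)])) ** 2
main = power[0, 32]
interference = power.copy()
interference[0, 32] = 0.0
worst_db = 10 * np.log10(interference.max() / main)
print(f"strongest ISI/ICI term: {worst_db:.1f} dB relative to the desired symbol")
```

Increasing the Doppler shift or the delay of the second path in `PATHS` makes the channel less underspread and raises the interference level, in line with the bound following (4.30).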
4.5 Analysis of Car Engine Signals

In this section, we consider time-varying spectral analysis of pressure and vibration data measured in a car engine.⁵ A TF analysis of this kind of data has previously been considered in [17, 26, 27, 113, 143, 146, 148, 155, 181]. Our data consisted of vibration signals obtained from acceleration sensors mounted on the engine housing and corresponding pressure signals obtained from pressure sensors mounted inside the cylinder of a BMW engine. Several data sets were recorded at different engine speeds. Each data set contains the vibration and pressure signals corresponding to several knocking combustion cycles (for more details on knock and its detection see [17, 26, 27, 113, 143, 146]). The load and the engine speed were kept constant during each measurement. The measured signal segments corresponding to different combustion cycles can be viewed as independent realizations of a nonstationary discrete-time and finite-length random process. Thus, the data constitutes an ensemble (i.e., multiple realizations) on which the subsequent spectral analysis can be based.

Let us denote the $k$th measured pressure signal at engine speed $i \cdot 1000$ rpm, $i = 2, 3, 4$, by $x_k^{(i)}[n]$ and similarly for the vibration signals $y_k^{(i)}[n]$. Here, $n = 1, \ldots, L_i$ and $k = 1, \ldots, N_i$ with $L_i$ the signal length and $N_i$ the number of combustions measured at speed $i \cdot 1000$ rpm (see Table 4.1). Note that the signal length depends on engine speed since the sampling rate was kept constant while the duration of a combustion decreases with engine speed.

5 These data have kindly been made available to us by J. F. Böhme, D. König, S. Carstens-Behrens, and M. Wagner (courtesy of Aral-Forschung, Bochum).

Table 4.1: Number $N_i$ and length $L_i$ of available combustion signals at various engine speeds.

i    speed (rpm)    L_i    N_i
2    2000           360    126
3    3000           248    343
4    4000           186    199
4.5.1 Time-Varying Spectral Analysis

We first discuss time-varying spectral analysis of the pressure and vibration signals using the GWVS and GES. For each engine speed, sample (i.e., estimated) correlation operators were computed from the available data according to⁶
\[ \hat R_x^{(i)} = \frac{1}{N_i} \sum_{k=1}^{N_i} x_k^{(i)} \otimes x_k^{(i)*} , \qquad \hat R_y^{(i)} = \frac{1}{N_i} \sum_{k=1}^{N_i} y_k^{(i)} \otimes y_k^{(i)*} . \]
Note that $N_2 = 126 < L_2 = 360$ so that $\hat R_x^{(2)}$ and $\hat R_y^{(2)}$ were singular. Furthermore, we also determined the positive semi-definite square roots $\hat H_x^{(i)}$ and $\hat H_y^{(i)}$ of $\hat R_x^{(i)}$ and $\hat R_y^{(i)}$, respectively. Finally, estimates of the GWVS and GES with $\alpha = 0$ and $\alpha = 1/2$ were computed according to (B.45) and (B.49),
\[ \hat{\overline{W}}{}^{(\alpha)}_{x^{(i)}}(t, f) = L^{(\alpha)}_{\hat R_x^{(i)}}(t, f) , \qquad \hat G^{(\alpha)}_{x^{(i)}}(t, f) = \big| L^{(\alpha)}_{\hat H_x^{(i)}}(t, f) \big|^2 , \]
\[ \hat{\overline{W}}{}^{(\alpha)}_{y^{(i)}}(t, f) = L^{(\alpha)}_{\hat R_y^{(i)}}(t, f) , \qquad \hat G^{(\alpha)}_{y^{(i)}}(t, f) = \big| L^{(\alpha)}_{\hat H_y^{(i)}}(t, f) \big|^2 . \]
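In the discrete-time setting of this example, the sample correlation operator is a matrix and its positive semi-definite square root can be obtained from an eigendecomposition. The sketch below illustrates both steps on synthetic data; the array layout (one realization per row), the function names, and the random test ensemble are assumptions for this example and do not reproduce the engine data.

```python
import numpy as np

def sample_correlation(X):
    """Sample correlation matrix (1/N) * sum_k x_k x_k^H.

    X has shape (N, L): N realizations (rows), each of length L.
    """
    return X.T @ X.conj() / X.shape[0]

def psd_sqrt(R):
    """Positive semi-definite square root of a Hermitian PSD matrix via eigendecomposition."""
    eigvals, eigvecs = np.linalg.eigh(R)
    eigvals = np.clip(eigvals, 0.0, None)      # remove tiny negative values due to rounding
    return (eigvecs * np.sqrt(eigvals)) @ eigvecs.conj().T

# Synthetic ensemble with fewer realizations than samples (N < L), as for 2000 rpm in Table 4.1.
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 8)) + 1j * rng.standard_normal((5, 8))
R_hat = sample_correlation(X)
H_hat = psd_sqrt(R_hat)
rank = np.linalg.matrix_rank(R_hat)            # at most N, hence R_hat is singular here
```

The rank of the sample correlation matrix cannot exceed the number of realizations, which is the reason for the singularity noted in the text.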
The resulting spectrum estimates are shown in Figs. 4.11 and 4.12. It can be seen from Fig. 4.11 that all spectra succeed in displaying the time-varying and nonstationary features of the pressure signals. In particular, it can be recognized that at all engine speeds the pressure signals consist of several resonances with decreasing resonance frequencies. These results are consistent with physical considerations involving cylinder geometry, coupling of resonance frequencies, and decreasing cylinder temperature [17, 113]. Fig. 4.12 shows that the vibration signals feature a similar behavior although the spectra are much more affected by engine noise (especially at higher engine speeds) and by dispersion effects caused by the engine housing. With regard to the different spectra used, it is seen that the Wigner-Ville spectrum and Weyl spectrum (i.e., GWVS and GES with $\alpha = 0$) are preferable to the Rihaczek spectrum and evolutionary spectrum (i.e., GWVS and GES with $\alpha = 1/2$) since they better display the decreasing resonance frequencies. Specifically, they feature a better concentration of the resonance components along the decreasing resonance frequencies. This can be attributed to the fact that due to its metaplectic covariance, the Weyl symbol (i.e., the GWS with $\alpha = 0$, which underlies the definition of Wigner-Ville spectrum and Weyl spectrum) is particularly suited for structures with oblique locations in the TF plane. We finally note that using (slightly) smoothed type I and type II spectra (see Sections 3.3 and 3.4) tends to yield further improvements regarding the readability of the resulting spectra (in particular for the noisier vibration signals).

6 Note that all signals and operators involved in this example are discrete-time, and thus the correlation operators $R_x^{(i)}$ and $\hat R_x^{(i)}$ become matrices of size $L_i \times L_i$. Nevertheless, we continue using the continuous-time notation for convenience.

Figure 4.11: Time-varying spectra of cylinder pressure signals $x^{(i)}[n]$. First row: 2000 rpm, second row: 3000 rpm, third row: 4000 rpm; first column: Wigner-Ville spectrum, second column: real part of Rihaczek spectrum, third column: Weyl spectrum, fourth column: evolutionary spectrum (equal to transitory evolutionary spectrum). Horizontal axis: crank angle (in degrees, proportional to time), vertical axis: frequency (in kHz).

Figure 4.12: Time-varying spectra of engine vibration signals $y^{(i)}[n]$. First row: 2000 rpm, second row: 3000 rpm, third row: 4000 rpm; first column: Wigner-Ville spectrum, second column: real part of Rihaczek spectrum, third column: Weyl spectrum, fourth column: evolutionary spectrum (equal to transitory evolutionary spectrum). Horizontal axis: crank angle (in degrees, proportional to time), vertical axis: frequency (in kHz).
4.5.2 TF Coherence Analysis

We next analyze the TF coherence of pressure and vibration signals. The goal is to see whether corresponding pressure and vibration signals are linearly related (as assumed e.g. in [27]), i.e., whether the relation between pressure signal and vibration signal can be modelled by an LTV system (such an approach was taken in [17, 113]). The magnitudes of the estimated TF coherence functions (cf. (3.116))
\[ \hat\Gamma_{x^{(i)},y^{(i)}}(t, f) = \frac{\hat C_{x^{(i)},y^{(i)}}(t, f)}{\sqrt{\hat C_{x^{(i)}}(t, f)\, \hat C_{y^{(i)}}(t, f)}} \]
are shown in Fig. 4.13 for engine speeds of 2000, 3000, and 4000 rpm. Here, the spectrum estimates $\hat C(t, f)$ were obtained by averaging rank two multiwindow spectrograms (see Subsection B.2.3), i.e.,
\[ \hat C_{x^{(i)},y^{(i)}}(t, f) = \frac{1}{N_i} \sum_{k=1}^{N_i} \Big[ \frac{1}{2}\, \mathrm{SPEC}^{(g_1)}_{x_k^{(i)},y_k^{(i)}}(t, f) + \frac{1}{2}\, \mathrm{SPEC}^{(g_2)}_{x_k^{(i)},y_k^{(i)}}(t, f) \Big] \]
and similarly for $\hat C_{x^{(i)}}(t, f)$ and $\hat C_{y^{(i)}}(t, f)$. The windows $g_1(t)$ and $g_2(t)$ were chosen as chirped versions of the first and second Hermite function with window length and chirp rate chosen to match the duration and frequency decay of the resonances in the combustion signals. It is seen that in those TF regions where the resonances are localized, $|\hat\Gamma_{x^{(i)},y^{(i)}}(t, f)|$ is significantly larger than zero. In particular, at all engine speeds the estimated TF coherence function in the region corresponding to the first resonance frequency is $\approx 0.9$, which clearly indicates a linear relationship between pressure and vibration signals. The TF coherence of the higher resonances is smaller but still suggests a linear relationship of pressure and vibration, though apparently contaminated by measurement noise and interference from extraneous sources (note that coherence drops with decreasing SNR).

Figure 4.13: Magnitude of estimated TF coherence function of corresponding pressure and vibration signals for (a) 2000 rpm, (b) 3000 rpm, (c) 4000 rpm. The upper plots show gray scale representations where the horizontal axis is crank angle (in degrees, proportional to time) and the vertical axis is frequency (in kHz). The lower plots show cuts in the frequency direction taken at crank angle 30 degrees (indicated by the dashed line in the gray scale plots).
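The ensemble-averaged coherence estimate can be sketched as follows. This is a simplified stand-in, not the estimator used above: it employs a single Hann window instead of two chirped Hermite windows, takes the cross-spectrogram as the product of one STFT with the complex conjugate of the other (an assumed convention), and runs on synthetic noise rather than engine data. All function names are hypothetical.

```python
import numpy as np

def stft(x, window, hop):
    """Short-time Fourier transform; rows are time frames, columns are frequency bins."""
    n_win = len(window)
    starts = range(0, len(x) - n_win + 1, hop)
    frames = np.array([x[s:s + n_win] * window for s in starts])
    return np.fft.rfft(frames, axis=1)

def tf_coherence(X, Y, window, hop=4):
    """Ensemble-averaged TF coherence estimate from paired realizations X[k], Y[k]."""
    Sx = np.array([stft(x, window, hop) for x in X])
    Sy = np.array([stft(y, window, hop) for y in Y])
    C_xy = np.mean(Sx * np.conj(Sy), axis=0)       # averaged cross-spectrogram
    C_x = np.mean(np.abs(Sx) ** 2, axis=0)         # averaged auto-spectrograms
    C_y = np.mean(np.abs(Sy) ** 2, axis=0)
    return C_xy / np.sqrt(C_x * C_y + 1e-12)       # small constant guards against 0/0

rng = np.random.default_rng(0)
win = np.hanning(32)
X = rng.standard_normal((50, 128))
coh_linear = tf_coherence(X, 2.0 * X, win)                        # y is a linear function of x
coh_indep = tf_coherence(X, rng.standard_normal((50, 128)), win)  # unrelated signals
```

By the Cauchy-Schwarz inequality the magnitude of this estimate never exceeds one; it is close to one for linearly related signals and small (on the order of one over the square root of the ensemble size) for unrelated ones.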
4.5.3 Subspace Identification

Recently, matched subspace detectors [187, 188] have been proposed for the detection of knock in car engines [146]. The prior knowledge required to design such detectors consists of the subspace $X$ in which the signals to be detected are supposed to live. A conventional method for estimating the subspace $X$ (whose dimension $p$ is assumed known) from $N$ observations $y_i(t)$ is maximum-likelihood (ML) subspace identification [187]. This method consists of the following steps:

i) Compute the sample correlation
\[ \hat R_y = \frac{1}{N} \sum_{i=1}^{N} y_i \otimes y_i^* . \]

ii) Solve the eigenproblem for the sample correlation $\hat R_y$, i.e., compute the real numbers $\lambda_k$ and the orthonormal functions $u_k(t)$ satisfying
\[ (\hat R_y u_k)(t) = \lambda_k u_k(t) . \tag{4.32} \]
It is assumed that the eigenvalues (and corresponding eigenfunctions) are sorted in order of decreasing magnitude.

iii) Finally, the estimated subspace $\hat X$ is obtained as $\hat X = \mathrm{span}\{u_1(t), \ldots, u_p(t)\}$ and the corresponding estimated orthogonal projection is given by $\hat P_X = \sum_{k=1}^{p} u_k \otimes u_k^*$.
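In discrete time, the three steps of ML subspace identification reduce to a sample correlation matrix, a Hermitian eigendecomposition, and a projector built from the dominant eigenvectors. The sketch below illustrates this on synthetic observations that lie exactly in a known two-dimensional subspace; the function name, the data layout (one observation per row), and the test data are assumptions for this example.

```python
import numpy as np

def ml_subspace_projector(Y, p):
    """Estimated orthogonal projector onto the p-dimensional dominant subspace.

    Y has shape (N, L): N observations (rows), each of length L.
    """
    R_hat = Y.T @ Y.conj() / Y.shape[0]            # step i): sample correlation matrix
    eigvals, eigvecs = np.linalg.eigh(R_hat)       # step ii): eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]              # sort in decreasing order
    U_p = eigvecs[:, order[:p]]                    # p dominant eigenvectors
    return U_p @ U_p.conj().T                      # step iii): projector sum_k u_k u_k^H

# Synthetic observations living in the span of the first two coordinate vectors.
rng = np.random.default_rng(1)
basis = np.eye(6)[:2]                              # shape (2, 6)
Y = rng.standard_normal((50, 2)) @ basis
P_hat = ml_subspace_projector(Y, p=2)
```

The resulting matrix is Hermitian and idempotent by construction, in contrast to the TF-designed operator discussed next.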
Unfortunately, the solution of the eigenproblem (4.32) is computationally expensive and sensitive to noise. However, if $y(t)$ is an underspread process (so that $R_y$ and, for $N$ large enough, also $\hat R_y$ are underspread operators), the results of Subsection 2.3.8 can be applied. There, it was shown that TF shifted versions $s_{kT,lF}(t)$ of a well-localized prototype function $s(t)$ are approximate eigenfunctions of underspread operators and the corresponding GWS values $L_H^{(\alpha)}(kT, lF)$ are the associated approximate eigenvalues (cf. (2.111)),
\[ (H s_{kT,lF})(t) \approx L_H^{(\alpha)}(kT, lF)\, s_{kT,lF}(t) . \]
For $TF = 1$, each approximate eigenfunction $s_{kT,lF}(t)$ approximately covers a separate TF region of area 1. Hence, the space spanned by $p$ approximate eigenfunctions $s_{kT,lF}(t)$ approximately corresponds to a TF region of area $p$. Picking the TF region with the largest GWS values thus approximately corresponds to picking the subspace spanned by the eigenfunctions associated with the largest eigenvalues. (A more comprehensive treatment of the correspondence between subspaces and TF regions can be found in [81].) Recalling that the GWS of $R_y$ is the GWVS $\overline{W}{}_y^{(\alpha)}(t, f)$, and using $\alpha = 0$, the above considerations suggest the following TF version of ML subspace identification [146]:

i) Compute the Wigner-Ville spectrum estimate
\[ \hat{\overline{W}}{}_y^{(0)}(t, f) = L^{(0)}_{\hat R_y}(t, f) = \frac{1}{N} \sum_{i=1}^{N} W^{(0)}_{y_i}(t, f) . \]
For underspread processes or low SNR, an additional TF smoothing is advantageous [61, 129, 140].

ii) Determine an estimate $\hat{\mathcal R}$ of the TF region $\mathcal R$ by thresholding $\hat{\overline{W}}{}_y^{(0)}(t, f)$,
\[ \hat{\mathcal R} = \big\{ (t, f) : \hat{\overline{W}}{}_y^{(0)}(t, f) \ge \eta \big\} , \]
with $\eta$ chosen such that the area of $\hat{\mathcal R}$ equals $p$, i.e., $\int_t \int_f I_{\hat{\mathcal R}}(t, f)\, dt\, df = p$.

iii) Compute the kernel of the approximate orthogonal projection operator $\tilde P$ via an inverse Weyl transform (cf. (B.18)) of the indicator function of $\hat{\mathcal R}$,
\[ \tilde p(t, t') = \int_f I_{\hat{\mathcal R}}\Big( \frac{t + t'}{2}, f \Big)\, e^{j2\pi f(t - t')}\, df . \]
(Thus, the Weyl symbol of $\tilde P$ equals the indicator function $I_{\hat{\mathcal R}}(t, f)$.) Note that $\tilde P$ is self-adjoint but not idempotent. If idempotency is indispensable, a least squares approach can be used to determine the orthogonal projection operator that best approximates $\tilde P$ [81].

Numerical Simulations. Examples illustrating ML subspace identification and its TF analogue for the vibration signals $y_k^{(i)}[n]$ discussed further above are shown in Fig. 4.14. Here, all $N_i$ available signals corresponding to a given engine speed were used to determine the ML estimate of the orthogonal projection operator $\hat P_X$ as well as the estimated TF region $\hat{\mathcal R}$ according to the two procedures outlined above. It is seen that the TF version of ML subspace identification yields clearer pictures of the dominant components of the vibration signals. The corresponding TF designed approximate orthogonal projection operators $\tilde P$ were also observed to yield better performance than the estimated orthogonal projection operators $\hat P_X$ when used for the design of TF subspace detectors [146].

Figure 4.14: Illustration of ML subspace identification and its TF counterpart for engine vibration signals at (a) 2000 rpm, (b) 3000 rpm, and (c) 4000 rpm. Upper row: Weyl symbols $L^{(0)}_{\hat P_X}(t, f)$ of orthogonal projection operators $\hat P_X$ estimated using ML subspace identification. Lower row: TF regions $\hat{\mathcal R}$ estimated using the TF technique for subspace identification. Horizontal axis: crank angle (in degrees, proportional to time); vertical axis: frequency (in kHz).
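The thresholding step ii) of the TF procedure can be sketched on a sampled TF grid. Rather than searching for the threshold explicitly, the sketch selects the grid cells with the largest spectrum values until the prescribed area $p$ is reached, which is equivalent when there are no ties. The grid spacings, the function name `tf_region`, and the toy spectrum are assumptions for this illustration.

```python
import numpy as np

def tf_region(W, dt, df, p):
    """Indicator of the TF region with the largest spectrum values and area close to p.

    W is the spectrum estimate sampled on a grid with cell size dt x df.
    """
    n_cells = int(round(p / (dt * df)))                  # number of cells giving area p
    largest = np.argsort(W, axis=None)[::-1][:n_cells]   # flat indices of the largest values
    mask = np.zeros(W.shape, dtype=bool)
    mask[np.unravel_index(largest, W.shape)] = True
    return mask

# Toy spectrum on a 2 x 2 grid with unit cells; a region of area p = 2 is requested.
W = np.array([[1.0, 5.0],
              [3.0, 2.0]])
region = tf_region(W, dt=1.0, df=1.0, p=2)
threshold = W[region].min()                              # implied threshold value
```

The implied threshold is the smallest spectrum value inside the selected region; applying a slight TF smoothing to `W` beforehand, as mentioned in step i), tends to produce more compact regions.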
Chapter 5. Conclusions

"Whoever in the pursuit of science seeks after immediate practical utility may rest assured that he seeks in vain."
Hermann von Helmholtz

In this concluding chapter, we first give a concise summary of the novel contributions of this thesis. In particular, we review underspread linear systems, time-frequency transfer function approximations, underspread random processes, time-varying power spectra, and their applications. In addition, we outline several open problems that extend the results of this thesis into various directions and may serve as suggestions for future research.
5.1 Summary of Novel Contributions

The following summary of novel contributions reviews the main theoretical results and practical applications discussed in the previous chapters. For each subject, a reference to the corresponding section(s) is provided.
Generalized Concept of Underspread LTV Systems

Sections 2.1 and 2.2

The concept of underspread systems is of central importance for most of the results derived in this thesis. Our contributions here are threefold:

• First, a previous definition of (jointly) underspread linear time-varying (LTV) systems [118–120, 127] that is based on the assumption of compactly supported generalized spreading function (GSF) was generalized in order to accommodate oblique orientations of compact GSF support regions. We also introduced novel parameters for characterizing the amount of time-frequency (TF) shifts introduced by such systems and investigated their behavior when the system is subjected to various transformations.

• Second, we extended the underspread concept to LTV systems whose GSF does not have compact support but features rapid decay. This extension was based on the introduction of weighted integrals and moments of the GSF as measures of the amount of TF shifts introduced by a system. Again, the effect of various system transformations on these novel parameters was studied.

• Finally, generalized Chebyshev inequalities were used to derive bounds on the error made by approximating an arbitrary LTV system by a system with compactly supported GSF. In particular, this allows relating the two different concepts of underspread systems (compact support/rapid decay).
Time-Frequency Transfer Function Approximations
Section 2.3
A major part of this thesis was concerned with the development of a TF transfer function calculus for LTV systems that is based on the generalized Weyl symbol (GWS). We provided various approximations that are valid in the case of (jointly) underspread systems and allow an interpretation of the GWS similar to the transfer function (frequency response) of linear time-invariant systems. In particular, we proved that for underspread systems the following GWS properties are valid in an approximate sense:
The composition of jointly underspread systems approximately corresponds to a multiplication of the corresponding GWSs.
TF translates of a well TF localized function are approximate eigenfunctions of underspread LTV systems, with the associated GWS values being the approximate eigenvalues.
The input-output relation of underspread LTV systems can approximately be formulated in terms of a multiplication of the input signal's short-time Fourier transform by the GWS.
The maximum gain of an underspread LTV system approximately equals the supremum of the GWS.
Underspread LTV systems are approximately normal and they approximately commute.
The GWS of positive semi-definite systems is approximately nonnegative.
In a certain sense, operator inversion for underspread LTV systems can approximately be replaced by pointwise inversion of the GWS.
As a mathematical underpinning, each approximation was accompanied by an upper bound on the associated approximation error that is formulated in terms of weighted GSF integrals and moments.
Time-Frequency Correlation Analysis and Underspread Nonstationary Random Processes
Section 3.1
TF correlations play a similar role for nonstationary random processes as TF displacements do for LTV systems. Our contributions regarding TF correlations are as follows:
We presented intuitive methods for the TF correlation analysis of nonstationary random processes that involve TF correlation functions and the generalized expected ambiguity function (GEAF); they show some similarity to the analysis of TF displacements of LTV systems.
We reviewed and generalized an existing definition of underspread random processes that is based on the assumption of a compactly supported GEAF. Furthermore, we provided a novel, extended concept of underspread processes that is based on weighted integrals and moments of the GEAF and does not require the GEAF to have compact support.
We also considered innovations system representations of random processes and related the TF correlations of a process to the TF displacements of its innovations systems.
Elementary Time-Varying Power Spectra
Sections 3.2, 3.6, and 3.7
The generalized Wigner-Ville spectrum (GWVS) and the generalized evolutionary spectrum (GES) are well-established elementary classes of time-varying power spectra. We provided the following novel results regarding the GWVS and GES:
We showed that for an underspread process the spectra within both classes are smooth, (approximately) real-valued, and (approximately) nonnegative. In contrast, for overspread (i.e., not underspread) processes we demonstrated that the GWVS and GES contain statistical cross-terms that are indicative of TF correlations.
We furthermore derived uncertainty relations for both the GWVS and GES that link the maximum TF concentration of these spectra to the effective rank of the correlation operator of the underlying process.
We provided approximations that relate the GWVS (GES) of the nonstationary output process of an underspread LTV system to the GWVS (GES) of an underspread nonstationary input process.
Finally, we provided an approximate Karhunen-Loève (KL) expansion in which the GWVS acts as an approximate KL eigenvalue distribution and two biorthogonal bases obtained by TF shifting two well TF localized prototype functions act as approximate KL eigenfunctions.
Type I and Type II Spectra
Sections 3.3-3.5
Another major part of this thesis is dedicated to two classes of generalized time-varying power spectra: type I spectra and type II spectra, which are obtained by incorporating a TF domain convolution in the GWVS and GES, respectively. Whereas type I spectra have already been considered in the literature, our definition of type II spectra is new. We showed that in the case of underspread processes, any type I and type II spectrum
satisfies desirable mathematical properties (at least approximately);
is approximately equivalent to any other (type I or type II) spectrum and approximately constitutes a complete second-order statistic;
describes the mean TF energy distribution of the process in a satisfactory way.
On the other hand, in the case of an overspread process different (type I and/or type II) spectra may differ dramatically. Non-smoothed spectra are complete second-order statistics, but in the overspread case they contain statistical cross-terms. While indicating TF correlations inherent in overspread processes, these cross-terms tend to obscure the mean TF energy distribution of the process. Smoothed spectra, on the other hand, are not complete second-order statistics (they fail to indicate TF correlations) but feature attenuated cross-terms and thereby better represent the mean TF energy distribution of the process.
Time-Frequency Coherence Functions
Section 3.8
We introduced a coherence operator and TF coherence functions for the coherence analysis of two nonstationary random processes. We proved that the TF coherence functions constitute an approximate TF formulation of the coherence operator. Furthermore, both the coherence operator and the TF coherence functions possess properties similar or analogous to those of the ordinary coherence function of stationary processes. Compared to the coherence operator, the TF coherence functions have the advantage of being more intuitive and computationally less intensive.
Applications
Sections 4.1-4.5
An important part of this thesis was concerned with the application of our theoretical results to practically important problems in the areas of statistical signal processing and communications.
We showed that our results yield an approximate TF formulation and design of time-varying Wiener filters for (jointly) underspread processes. The resulting TF Wiener filters are computationally efficient and feature nearly optimal performance.
Similarly, we derived an efficient approximate TF formulation and design of optimal detectors for (jointly) underspread random processes.
We used several of the bounds derived previously to develop upper bounds on the systematic measurement errors of correlative mobile radio channel sounders.
The eigenfunction approximation for underspread LTV systems was used to quantify the intersymbol and interchannel interference occurring in orthogonal frequency division multiplexing (OFDM) communications systems.
Finally, we considered the application of time-varying spectral analysis and TF coherence analysis to car engine signals.
5.2 Open Problems for Future Research

The following discussion suggests some topics for future research. Most of these topics concern the extension of the approach taken in this thesis in various directions.
Our discussion of approximate GWS-based inversion of LTV systems for the solution of operator equations of the types $H_1 G H_2 = H_3$ and $G H_2 = H_3$ was restricted to displacement-limited (DL) systems (see Subsections 2.3.6 and 2.3.7). Up to now, we were not able to derive similar results for the non-DL case. While simulation results indicate that approximate GWS-based inversion of non-DL underspread operators is possible, a theoretical analysis and verification of this empirical observation is still an open problem.
The main results of this thesis concern the development of various approximations that establish a TF calculus for underspread LTV systems and underspread nonstationary random processes. In general, these approximations are not valid for overspread systems and processes. However, we feel that at least part of our results can be extended to specific subclasses of overspread scenarios. To be specific, we note that an arbitrary LTV system $H$ can be written as a superposition of TF shifted underspread systems $H_{k,l}$, i.e.,
\[ H = \sum_k \sum_l S^{(\alpha)}_{k\tau_0, l\nu_0} \, H_{k,l} \, S^{(\alpha)+}_{k\tau_0, l\nu_0} . \]
Here, the individual subsystems $H_{k,l}$ are defined by
\[ S^{(\alpha)}_{H_{k,l}}(\tau,\nu) = S^{(\alpha)}_{H}(\tau - k\tau_0, \nu - l\nu_0) \, \psi(\tau,\nu), \]
where $\psi(\tau,\nu)$ is a function that is concentrated in a region $[-\tau_0/2, \tau_0/2] \times [-\nu_0/2, \nu_0/2]$ about the origin (with $\tau_0 \nu_0 \ll 1$ in order that $H_{k,l}$ is underspread) and satisfies
\[ \sum_k \sum_l \psi(\tau - k\tau_0, \nu - l\nu_0) = 1 . \]
If $H$ is an overspread system with GSF effectively contained in a region $[-\tau_H, \tau_H] \times [-\nu_H, \nu_H]$ with $\tau_H \nu_H \gg 1$, there will effectively be $K = (\tau_H \nu_H)/(\tau_0 \nu_0)$ subsystems $H_{k,l}$. While our approximate TF transfer function then applies to the individual subsystems $H_{k,l}$ with errors on the order of $\tau_0 \nu_0$, the total error for $H$ will generally be on the order of $K \tau_0 \nu_0 = \tau_H \nu_H$, so that nothing is gained. However, if only $K' \ll K$ subsystems $H_{k,l}$ are effectively nonzero, tolerable approximation errors on the order of $K' \tau_0 \nu_0 = \frac{K'}{K} \tau_H \nu_H$ can be expected. A detailed analysis of the usefulness of such an approach is left for future work.
This thesis was restricted to deterministic LTV systems. In some applications (e.g., mobile communications), the underlying systems (channels) are often modelled as being random in addition to being time-varying. Some TF transfer function approximations for underspread random LTV systems satisfying the so-called wide-sense stationary uncorrelated scattering (WSSUS) assumption have been provided in [91, 127]. However, a systematic study of such approximations for underspread random LTV systems is still lacking. Furthermore, since the WSSUS assumption is satisfied only approximately by real channels, non-WSSUS random LTV systems are also practically relevant and thus should be studied in more detail.
Another possible extension of our TF calculus for underspread systems and processes concerns multi-input/multi-output (MIMO) LTV systems and multivariate nonstationary random processes (i.e., vector processes). Recently, these systems and processes have gained considerable interest in connection with communications systems using antenna arrays. MIMO LTV systems can be viewed as matrices of operators, i.e.,
\[ \begin{pmatrix} y_1(t) \\ \vdots \\ y_N(t) \end{pmatrix} = \begin{pmatrix} H_{11} & \cdots & H_{1M} \\ \vdots & \ddots & \vdots \\ H_{N1} & \cdots & H_{NM} \end{pmatrix} \begin{pmatrix} x_1(t) \\ \vdots \\ x_M(t) \end{pmatrix} , \]
with $M$ and $N$ the dimensionality at the input and output side, respectively. In a similar manner, the correlations of multivariate processes can be described by matrices of correlation operators, i.e.,
\[ \mathrm{E}\left\{ \begin{pmatrix} x_1(t) \\ \vdots \\ x_N(t) \end{pmatrix} \begin{pmatrix} y_1^*(t') & \cdots & y_M^*(t') \end{pmatrix} \right\} \quad \text{corresponds to} \quad \begin{pmatrix} R_{11} & \cdots & R_{1M} \\ \vdots & \ddots & \vdots \\ R_{N1} & \cdots & R_{NM} \end{pmatrix} , \]
where $R_{kl} = R_{x_k, y_l}$. The formulation of an underspread concept and the development of an associated approximate TF calculus pose an interesting problem for future research. In fact, it is known from the case of LTI systems that MIMO systems can feature properties very different from single-input/single-output systems [105].
Recently, generalized TF symbols for LTV systems (operators) and nonstationary random processes have been introduced which are covariant to TF displacements other than TF shifts [96, 97]. The basic theory of these generalized symbols and associated generalized spreading functions is reasonably developed and some applications have also been investigated. However, a modification of the underspread concept that is adapted to these new symbol classes and the formulation of an associated approximate symbolic calculus have not been considered until now. (We note that a symbolic calculus for affine symbols in a mathematical/quantum-mechanical context has been presented in [15].)
Instead of developing an underspread theory for various generalized TF symbols separately, it may be possible to take advantage of the generalized covariance theory for linear and quadratic TF signal representations [82, 83, 93, 184]. The steps to be taken here are: i) the development of an operator symbol that is covariant to an arbitrary (but fixed) TF displacement operator (this can be based on the eigenvalue or singular value decomposition of operators as in (B.28)); ii) the derivation of the associated generalization of the GSF; iii) the formulation of an associated underspread property for systems and processes with regard to the underlying TF displacement operator; iv) the establishment of an approximate TF symbol calculus for these generalized underspread systems and processes.
While an LTV system or a nonstationary random process might not satisfy the underspread assumption for all times and/or frequencies, its restriction to certain TF regions might well be reasonably underspread. This observation motivates the concept of regionally underspread systems and processes. Such an approach has recently been introduced and studied in [32], where some regional TF transfer function approximations (essentially based on [144]) are also provided. However, there are still some open questions regarding regionally underspread systems and processes that might be studied in future work.
Finally, we suggest considering further applications of our TF calculus of underspread systems and processes. We feel that our results may be particularly useful in the areas of communications over LTV channels (with potential applications such as channel estimation, channel equalization, interference cancellation, and diversity combining) and nonstationary statistical signal processing (with potential applications such as nonstationary prediction and transform coding using TF signal expansions). Furthermore, nonstationary statistical modelling of time-varying channels involves aspects of both linear time-varying systems and nonstationary random processes.
Appendices
Linear Operator Theory

The theory of [. . .] operators [. . .] has attracted the ever increasing attention of mathematicians and physicists, and sometimes of engineers also. Israel C. Gohberg and Mark G. Krein
Linear operators (linear systems) play a fundamental role in this thesis and generally in many areas of engineering. This is due to the fact that they are a sufficiently general concept to yield satisfactory and tractable models for a wide variety of physical phenomena. Hence, in this appendix we present a self-contained discussion of linear operator theory as far as is necessary for this thesis. We consider norms of linear operators, representations of linear operators via kernel functions, eigenvalue and singular value decompositions, and important special types of operators. Excellent and far more comprehensive treatments of linear operator theory can be found in [69, 158].
Appendix A. Linear Operator Theory
A.1 Basic Facts about Linear Operators

Operators are abstract mathematical objects which describe the interrelation between an input quantity $x$ and an output quantity $y$ which are elements of two linear spaces $\mathcal{X}$ and $\mathcal{Y}$, respectively. This interrelation or transformation can symbolically be written as $y = Hx$, where $H$ denotes the operator and $Hx$ denotes the result of $H$ acting on $x$ (the output associated with the input $x$). Linear operators (systems) are a special type of operators which satisfy the following two requirements:
Homogeneity: $H(cx) = c\,Hx$ for all complex numbers $c$ and for all $x \in \mathcal{X}$.
Additivity: $H(x_1 + x_2) = Hx_1 + Hx_2$ for all $x_1 \in \mathcal{X}$ and $x_2 \in \mathcal{X}$.
The above two conditions are satisfied if and only if
\[ H\left( \sum_{k=1}^{N} c_k x_k \right) = \sum_{k=1}^{N} c_k \, H x_k , \]
which is well known as the superposition principle.
In almost all cases we consider, the linear spaces $\mathcal{X}$ and $\mathcal{Y}$ are Hilbert spaces, i.e., complete linear spaces equipped with an inner product denoted by^1 $\langle x_1, x_2 \rangle$. In particular, in most cases $\mathcal{X}$ and $\mathcal{Y}$ are equal to the space of square-integrable functions $L_2(\mathbb{R})$ with the usual inner product
\[ \langle x_1, x_2 \rangle = \int_t x_1(t) \, x_2^*(t) \, dt . \]
^1 Note that the inner products in $\mathcal{X}$ and $\mathcal{Y}$ are possibly different. Nevertheless, we do not use different symbols since in most cases it will be clear from the context to which space we refer.
Using these inner products, the adjoint operator $H^+$ is defined by
\[ \langle Hx, y \rangle = \langle x, H^+ y \rangle \quad \text{for all } x \in \mathcal{X},\ y \in \mathcal{Y} . \]
The adjoint can be shown to satisfy $(H^+)^+ = H$. Operators which are equal to their adjoint, i.e., $H = H^+$, are called self-adjoint or Hermitian.
Note that by the relation $\|x\| = \sqrt{\langle x, x \rangle}$ an inner product always induces a norm. Based on this norm, we can define the important subclass of bounded linear operators, i.e., those satisfying $\|Hx\| \le M \|x\|$ with finite $M$. Bounded linear operators themselves constitute a normed linear space, with the operator norm defined as
\[ \|H\|_O \triangleq \sup_{\|x\| = 1} \|Hx\| = \sup_{x \neq 0} \frac{\|Hx\|}{\|x\|} . \tag{A.1} \]
Note that $\|H^+\|_O = \|H\|_O$.
Further subclasses of linear operators can be obtained by imposing additional constraints. Of specific interest to us are Hilbert-Schmidt (HS) operators, which satisfy
\[ \|H\|_2^2 \triangleq \sum_k \|H x_k\|^2 < \infty , \]
where $\{x_k\}$ is an arbitrary orthonormal basis of $\mathcal{X}$ and $\|H\|_2$ denotes the Hilbert-Schmidt (HS) norm. Observe that $\|H^+\|_2 = \|H\|_2$. We note that $\|H\|_O \le \|H\|_2$, which in particular implies that any HS operator is bounded. HS operators on a particular space $\mathcal{X}$ can themselves be viewed as elements of a Hilbert space with an inner product defined as
\[ \langle H, G \rangle \triangleq \sum_k \langle H x_k, G x_k \rangle . \]
The trace of an operator is defined by
\[ \mathrm{Tr}\{H\} \triangleq \sum_k \langle H x_k, x_k \rangle , \]
which by comparison shows that $\langle H, G \rangle = \mathrm{Tr}\{G^+ H\} = \mathrm{Tr}\{H G^+\}$ and furthermore $\|H\|_2^2 = \mathrm{Tr}\{H^+ H\} = \mathrm{Tr}\{H H^+\}$. Operators satisfying
\[ \sum_k |\langle H x_k, x_k \rangle| < \infty \tag{A.2} \]
are referred to as trace class (or nuclear) operators [69] and form a subclass of HS operators.
A further fundamental concept is the bilinear form of an operator, given by the inner product
\[ Q_H(x, y) \triangleq \langle Hx, y \rangle , \]
which in the case of $y = x$ is referred to as the quadratic form. Note that $Q_H(x, y) = Q^*_{H^+}(y, x)$, which implies that the quadratic form of a self-adjoint operator is always real-valued. If furthermore $Q_H(x, x)$ is positive (non-negative) for all $x$, the operator is referred to as being positive definite (positive semi-definite).
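The relations of this section can be checked numerically in a finite-dimensional setting. The following is a minimal sketch, assuming that operators are replaced by complex $n \times n$ matrices and signals by vectors in $\mathbb{C}^n$; the variable names and the use of NumPy are illustrative assumptions and not part of the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6


def inner(x, y):
    """Inner product <x, y> = sum_k x[k] * conj(y[k]) (conjugate on the second argument)."""
    return np.sum(x * np.conj(y))


# Random complex matrices and vectors stand in for operators and signals (assumption).
H = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
G = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
y = rng.standard_normal(n) + 1j * rng.standard_normal(n)

H_adj = H.conj().T  # adjoint H^+ of a matrix: conjugate transpose

# Defining relation of the adjoint: <Hx, y> = <x, H^+ y>
adjoint_gap = abs(inner(H @ x, y) - inner(x, H_adj @ y))

# Operator norm (A.1) is the largest singular value; HS norm is the Frobenius norm.
op_norm = np.linalg.norm(H, 2)
hs_norm = np.linalg.norm(H, "fro")

# HS inner product with the standard basis x_k = e_k, compared with Tr{G^+ H}.
hs_inner = sum(inner(H[:, k], G[:, k]) for k in range(n))
trace_form = np.trace(G.conj().T @ H)

# Quadratic form of a self-adjoint operator is real-valued.
A = H + H_adj
quad = inner(A @ x, x)

print(adjoint_gap, op_norm, hs_norm, abs(hs_inner - trace_form), quad.imag)
```

In finite dimensions every matrix is an HS operator, so the sketch cannot illustrate the distinction between bounded, HS, and trace class operators; it only confirms the algebraic identities.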
A.2 Kernel Representation of Operators

It is often useful to view the action of a linear operator as an integral of $x(t)$ with kernel function (impulse response) $h(t, t')$, i.e.,
\[ (Hx)(t) = \int_{t'} h(t, t') \, x(t') \, dt' \tag{A.3} \]
(note that for HS operators such a representation is guaranteed to exist with $h(t, t') \in L_2(\mathbb{R}^2)$). The kernel of the adjoint operator $H^+$ is given by $h^*(t', t)$, and the kernel of the operator $H_c = H_2 H_1$ composed of $H_1$ and $H_2$ reads
\[ h_c(t, t') = \int_{t''} h_2(t, t'') \, h_1(t'', t') \, dt'' . \]
The trace, HS norm, inner product, and bilinear form can be reformulated in terms of the kernel as
\[ \mathrm{Tr}\{H\} = \int_t h(t, t) \, dt , \qquad \|H\|_2^2 = \int_t \int_{t'} |h(t, t')|^2 \, dt \, dt' , \]
\[ \langle H, G \rangle = \int_t \int_{t'} h(t, t') \, g^*(t, t') \, dt \, dt' , \qquad Q_H(x, y) = \int_t \int_{t'} h(t, t') \, x(t') \, y^*(t) \, dt \, dt' . \]
We note that there exists a similar kernel representation in the frequency domain, which is referred to as the bi-frequency function [214] and is defined as
\[ B_H(f, f') = \int_t \int_{t'} h(t, t') \, e^{-j2\pi (f t - f' t')} \, dt \, dt' . \]
The bi-frequency function maps the Fourier transform $X(f)$ of the input signal to the Fourier transform $Y(f)$ of the output signal $y(t) = (Hx)(t)$,
\[ Y(f) = \int_{f'} B_H(f, f') \, X(f') \, df' . \]
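As an illustration of the kernel relations, the sketch below discretizes the integrals by Riemann sums on a uniform grid. The grid size, the step `dt`, the two example kernels, and the use of a unitary DFT matrix as a discrete stand-in for the Fourier transform are all assumptions made for this example only.

```python
import numpy as np

n, dt = 64, 0.1
t = np.arange(n) * dt
T, Tp = np.meshgrid(t, t, indexing="ij")  # T[i, j] = t_i, Tp[i, j] = t'_j

# Two hypothetical smooth kernels h1(t, t') and h2(t, t') and an input signal x(t).
h1 = np.exp(-(T - Tp) ** 2) * np.cos(T)
h2 = np.exp(-2.0 * (T - Tp) ** 2) * (1.0 + 0.5 * np.sin(Tp))
x = np.exp(-(t - 3.0) ** 2)


def apply_kernel(h, x):
    """Riemann-sum version of (A.3): (Hx)(t_i) ~ sum_j h[i, j] x[j] dt."""
    return (h @ x) * dt


# Kernel of the composition H2 H1: integral over the intermediate variable t''.
hc = (h2 @ h1) * dt
y_two_steps = apply_kernel(h2, apply_kernel(h1, x))
y_composed = apply_kernel(hc, x)

# Discrete analogue of the bi-frequency function: B maps the DFT of x to the DFT of H1 x.
F = np.fft.fft(np.eye(n)) / np.sqrt(n)  # unitary DFT matrix
B = F @ (h1 * dt) @ F.conj().T
Y_direct = F @ apply_kernel(h1, x)
Y_bifreq = B @ (F @ x)

print(np.max(np.abs(y_two_steps - y_composed)), np.max(np.abs(Y_direct - Y_bifreq)))
```

Both differences are zero up to rounding because the discretized composition and the change of basis are exact matrix identities; the accuracy with respect to the continuous-time integrals depends on the grid.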
A.3 Eigenvalue Decomposition and Singular Value Decomposition

Operators satisfying $HH^+ = H^+H$ (which necessarily requires $\mathcal{X} = \mathcal{Y}$) are called normal. Normal HS operators possess an eigenexpansion or eigenvalue decomposition (EVD) in the sense that they can be written as^2
\[ H = \sum_k \lambda_k \, u_k \otimes u_k^* , \qquad h(t, t') = \sum_k \lambda_k \, u_k(t) \, u_k^*(t') , \tag{A.4} \]
with generally complex-valued eigenvalues $\lambda_k$ and orthonormal eigenfunctions $u_k(t)$.
^2 Here, $x \otimes y^*$ denotes a rank-one operator with kernel $x(t) \, y^*(t')$.
This in particular implies that $H$ leaves its eigenfunctions unchanged besides a multiplication by a scalar, i.e., $(Hu_k)(t) = \lambda_k u_k(t)$. Inserting (A.4) into (A.3) yields
\[ (Hx)(t) = \sum_{k=0}^{\infty} \lambda_k \, \langle x, u_k \rangle \, u_k(t) , \]
which is a representation of the system output as a superposition of the eigenfunctions $u_k(t)$ weighted by the eigenvalues $\lambda_k$ and the inner product of the input signal $x(t)$ with the eigenfunctions $u_k(t)$. In the case of self-adjoint operators it can be shown that the eigenvalues are real-valued. If in addition the operator is positive definite, then furthermore $\lambda_k > 0$. We further note that a fundamental result of operator theory states that two operators commute (i.e., their order can be interchanged, $HG = GH$) if and only if they have identical sets of eigenfunctions. In that case the eigenvalues of the operator product are given by
\[ \lambda_k^{(HG)} = \lambda_k^{(GH)} = \lambda_k^{(H)} \lambda_k^{(G)} . \]
In the case of non-normal HS operators, the eigenvalue decomposition is replaced by the singular value decomposition (SVD)
\[ H = \sum_k \sigma_k \, u_k \otimes v_k^* , \qquad h(t, t') = \sum_k \sigma_k \, u_k(t) \, v_k^*(t') , \tag{A.5} \]
with real, nonnegative singular values $\sigma_k \ge 0$ and $\{u_k(t)\}$ and $\{v_k(t)\}$ being two sets of orthonormal functions referred to as left and right singular functions, respectively. The singular values and left/right singular functions can be found by solving two eigenproblems for the operators $HH^+$ and $H^+H$, respectively:
\[ (HH^+ u_k)(t) = \sigma_k^2 \, u_k(t) , \qquad (H^+H v_k)(t) = \sigma_k^2 \, v_k(t) . \]
Inserting (A.5) into (A.3) yields
\[ (Hx)(t) = \sum_{k=0}^{\infty} \sigma_k \, \langle x, v_k \rangle \, u_k(t) , \]
which is a representation of the system output as a superposition of the left singular functions $u_k(t)$ weighted by the singular values $\sigma_k$ and the inner product of the input signal $x(t)$ with the right singular functions $v_k(t)$. The operator thus projects the input $x(t)$ onto the signal space spanned by $\{v_k(t)\}$ and uses the resulting coefficients to compose the output as a superposition of the signals $\{u_k(t)\}$.
Sometimes, the following expressions for the trace, operator norm, HS norm, and bilinear form of $H$ in terms of the eigenvalues (singular values) and eigenfunctions (singular functions) are useful. In terms of the EVD (A.4):
\[ \mathrm{Tr}\{H\} = \sum_k \lambda_k , \quad \|H\|_O = \sup_k \{|\lambda_k|\} , \quad \|H\|_2^2 = \sum_k |\lambda_k|^2 , \quad Q_H(x, y) = \sum_k \lambda_k \, \langle x, u_k \rangle \, \langle y, u_k \rangle^* . \]
In terms of the SVD (A.5):
\[ \mathrm{Tr}\{H\} = \sum_k \sigma_k \, \langle u_k, v_k \rangle , \quad \|H\|_O = \sup_k \{\sigma_k\} , \quad \|H\|_2^2 = \sum_k \sigma_k^2 , \quad Q_H(x, y) = \sum_k \sigma_k \, \langle x, v_k \rangle \, \langle y, u_k \rangle^* . \]
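The SVD-based expressions can be verified for matrices. The sketch below is a finite-dimensional illustration, assuming a random complex matrix in place of an HS operator and using `numpy.linalg.svd`, whose third return value holds the conjugated right singular vectors as rows; all variable names are chosen for this example.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
H = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)

# H = U diag(s) Vh, so u_k = U[:, k] and v_k = conj(Vh[k]).
U, s, Vh = np.linalg.svd(H)

# Output as a superposition of left singular vectors: sum_k sigma_k <x, v_k> u_k,
# where <x, v_k> = sum_i x[i] conj(v_k[i]) = Vh[k] @ x.
y_svd = sum(s[k] * (Vh[k] @ x) * U[:, k] for k in range(n))

# Trace via the SVD: sum_k sigma_k <u_k, v_k>, with <u_k, v_k> = Vh[k] @ U[:, k].
trace_svd = sum(s[k] * (Vh[k] @ U[:, k]) for k in range(n))

# Norms in terms of the singular values.
op_norm_svd = s.max()
hs_norm_sq_svd = np.sum(s ** 2)

# Eigenproblem of H H^+ for the first left singular vector.
residual = np.linalg.norm(H @ H.conj().T @ U[:, 0] - s[0] ** 2 * U[:, 0])

print(np.linalg.norm(H @ x - y_svd), abs(np.trace(H) - trace_svd), residual)
```

Note that the trace is not the plain sum of the singular values unless the left and right singular vectors coincide, which is the case for positive semi-definite operators.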
A.4 Special Types of Linear Operators

We conclude this brief discussion of linear operator theory with a survey of some important special types of linear operators.
Normal Operators. Normal operators are defined by the equation $H^+H = HH^+$ and possess an eigenexpansion (see (A.4)).
Self-Adjoint Operators. A particularly important special case of normal operators is given by self-adjoint or Hermitian operators, defined by $H = H^+$. It is straightforward to show that their eigenvalues $\lambda_k$ as well as their quadratic form $Q_H(x, x)$ are real-valued.
Positive Definite Operators. Positive definite (positive semi-definite) operators are special self-adjoint operators characterized by having a positive (nonnegative) quadratic form $Q_H(x, x)$ for all $x \neq 0$. We write $H > 0$ ($H \ge 0$). All eigenvalues of a positive definite (semi-definite) operator are positive (nonnegative). Note that operators of the form $HH^+$ and $H^+H$ are always positive semi-definite. The most important positive semi-definite operators in this thesis are correlation operators $R$ of random processes, whose kernel equals the correlation function $r(t, t')$.
The positive (semi-)definite square root $H_r = \sqrt{H}$ of a positive (semi-)definite operator $H$ can be defined by $H_r H_r = H$ and $H_r \ge 0$. Then, it is possible to assign to each normal operator a corresponding magnitude operator via $|H| = \sqrt{HH^+}$. We note that the trace class condition (A.2) can equivalently be written as $\mathrm{Tr}\{|H|\} < \infty$ [69]. The positive part of a self-adjoint operator $H$ is defined as $H_+ = (|H| + H)/2$.
Unitary Operators. Unitary operators $U$ satisfy $UU^+ = U^+U = I$, from which it follows that $U^{-1} = U^+$. The magnitude of the eigenvalues of a unitary operator is equal to one, i.e., $|\lambda_k| = 1$. An important example of a unitary operator is the TF shift operator (see Section B.1.1).
Projection Operators. Projection operators are characterized by being idempotent, $P^2 = P$. If furthermore $P$ is self-adjoint, it is referred to as an orthogonal projection; otherwise the projector is called oblique. Any eigenvalue of an orthogonal projector equals either zero or one.
Linear Time-Invariant Systems. Linear time-invariant (LTI) systems are systems which commute with a time shift, i.e., $H T_\tau = T_\tau H$. Here, $T_\tau$ is the time shift operator defined by $(T_\tau x)(t) = x(t - \tau)$. The impulse response (kernel) of such systems is of the type $h(t, t') = g(t - t')$, which shows that the I/O relation (A.3) reduces to a convolution, $y(t) = (Hx)(t) = \int_{t'} g(t - t') \, x(t') \, dt'$, which, via the Fourier transform, corresponds to a multiplication in the frequency domain, $Y(f) = G(f) \, X(f)$. LTI systems are the most prominent operators that are not HS.
Linear Frequency-Invariant Systems. Linear frequency-invariant (LFI) systems are systems which commute with the frequency shift operator $F_\nu$, i.e., $H F_\nu = F_\nu H$, where $F_\nu$ is defined by $(F_\nu x)(t) = x(t) \, e^{j2\pi\nu t}$. The impulse response (kernel) of such systems is of the type $h(t, t') = m(t) \, \delta(t - t')$. Thus, LFI systems are dual to LTI systems in the sense that the I/O relation (A.3) here reduces to a multiplication in the time domain, i.e., $y(t) = (Hx)(t) = m(t) \, x(t)$. LFI systems also do not belong to the HS class.
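The commutation properties that define LTI and LFI systems have a simple discrete counterpart. The following sketch assumes a cyclic (periodic) discrete-time setting of length $n$, in which an LTI system becomes a circulant matrix and an LFI system a diagonal matrix; the cyclic model and all names are assumptions made for illustration.

```python
import numpy as np

n = 8
idx = np.arange(n)
rng = np.random.default_rng(2)

g = rng.standard_normal(n)  # impulse response of the cyclic LTI system
m = rng.standard_normal(n)  # multiplier function of the LFI system
x = rng.standard_normal(n)

H_lti = g[(idx[:, None] - idx[None, :]) % n]  # kernel h[t, t'] = g[(t - t') mod n]
H_lfi = np.diag(m)                            # kernel h[t, t'] = m[t] delta[t - t']

T = np.roll(np.eye(n), 1, axis=0)             # time shift: (T x)[t] = x[(t - 1) mod n]
F = np.diag(np.exp(2j * np.pi * idx / n))     # frequency shift: (F x)[t] = x[t] exp(j 2 pi t / n)

lti_commutes_with_T = np.allclose(H_lti @ T, T @ H_lti)
lfi_commutes_with_F = np.allclose(H_lfi @ F, F @ H_lfi)
lti_commutes_with_F = np.allclose(H_lti @ F, F @ H_lti)

# Convolution in time corresponds to multiplication in the (discrete) frequency domain.
Y = np.fft.fft(H_lti @ x)
Y_product = np.fft.fft(g) * np.fft.fft(x)

print(lti_commutes_with_T, lfi_commutes_with_F, lti_commutes_with_F)
```

For a generic impulse response the LTI system does not commute with the frequency shift, which is why the third flag is expected to be false. The statement that LTI and LFI systems are not HS concerns the continuous-time, infinite-dimensional case and has no counterpart in this finite-dimensional sketch.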
B
Time-Frequency Analysis Tools
The simplicities of natural laws arise through the complexities of the languages we use for their expression. Eugene P. Wigner
In this appendix we present a self-contained discussion of time-frequency representations of linear, time-varying systems, of deterministic signals, and of random processes as far as is necessary for this thesis. In the case of time-varying systems, we describe mainly two concepts: the generalized spreading function and the generalized Weyl symbol. The former is related to a decomposition of a system into elementary time-frequency shifts, and the latter is a time-frequency parametrized representation of the system with the interpretation of a time-frequency weighting or a time-varying transfer function. These linear time-frequency system representations are complemented by the generalized transfer Wigner distribution, the generalized input Wigner distribution, and the generalized output Wigner distribution, which are quadratic time-frequency representations of linear systems. For deterministic signals, there exist many time-frequency representations. However, for our purposes, it suffices to discuss the short-time Fourier transform, the generalized Wigner distribution, the spectrogram, and the generalized ambiguity function. Finally, the time-frequency representations of random processes we discuss are the generalized Wigner-Ville spectrum, the generalized evolutionary spectrum, the physical spectrum, and the generalized expected ambiguity function. For these time-frequency representations, the properties relevant to this thesis are discussed, and several interrelations and parallels are pointed out. We note that further details on time-frequency analysis can be found in several textbooks and tutorial papers [16, 35, 56, 61, 84, 151, 173, 200].
Appendix B. Time-Frequency Analysis Tools
B.1 Time-Frequency Representations of Linear, Time-Varying Systems

In Appendix A we gave a brief discussion of linear operators, which are mathematical models for linear time-varying (LTV) systems. In addition to the system descriptions considered there, there also exist time-frequency (TF) representations that characterize LTV systems in different ways. These TF descriptions will be reviewed in the following two subsections.
B.1.1
Generalized Spreading Function
In contrast to linear time-invariant and linear frequency-invariant systems, which produce only time shifts and frequency shifts, respectively, of various components of the input signal, general LTV systems introduce both time shifts and frequency shifts. A transparent description of the TF shifts of a linear system $H$ with kernel $h(t, t')$ is given by the generalized spreading function (GSF) [114, 118]
\[ S_H^{(\alpha)}(\tau, \nu) \triangleq \int_t h^{(\alpha)}(t, \tau) \, e^{-j2\pi\nu t} \, dt , \tag{B.1} \]
with
\[ h^{(\alpha)}(t, \tau) \triangleq h\left( t + \left( \tfrac{1}{2} - \alpha \right) \tau ,\; t - \left( \tfrac{1}{2} + \alpha \right) \tau \right) , \tag{B.2} \]
where $\alpha$ is a real-valued parameter. The kernel of $H$ can be reobtained from the GSF via
\[ h(t, t') = \int_\nu S_H^{(\alpha)}(t - t', \nu) \, e^{j2\pi\nu \left( \left( \frac{1}{2} + \alpha \right) t + \left( \frac{1}{2} - \alpha \right) t' \right)} \, d\nu , \]
which shows that the GSF contains all information about $H$. For a specific LTV system $H$, the GSF describes the weights in a representation of the output signal $y(t)$ as a weighted superposition of TF shifted versions of the input signal $x(t)$,
\[ y(t) = (Hx)(t) = \int_\tau \int_\nu S_H^{(\alpha)}(\tau, \nu) \, \left( S^{(\alpha)}_{\tau,\nu} x \right)(t) \, d\tau \, d\nu , \tag{B.3} \]
with $S^{(\alpha)}_{\tau,\nu}$ denoting the generalized TF shift operator,^1
\[ \left( S^{(\alpha)}_{\tau,\nu} x \right)(t) = x(t - \tau) \, e^{j2\pi\nu t} \, e^{j2\pi\nu\tau(\alpha - 1/2)} . \tag{B.4} \]
^1 The parameter $\alpha$ is due to the infinitely many possibilities to define a joint TF shift by combining the time shift operator $T_\tau$ with the frequency shift operator $F_\nu$. The case $\alpha = 1/2$ corresponds to first shifting in time and then shifting in frequency, $S^{(1/2)}_{\tau,\nu} = F_\nu T_\tau$, while $\alpha = -1/2$ corresponds to first shifting in frequency and then in time, $S^{(-1/2)}_{\tau,\nu} = T_\tau F_\nu$. Note that the generalized TF shift operator corresponds to the Schrödinger representation of the Heisenberg group. The map $S_H^{(\alpha)}(\tau, \nu) \mapsto H$ is then referred to as the integrated representation of the convolution algebra on the Heisenberg group [64].
With $\alpha = 1/2$ the ordinary spreading function (delay-Doppler spread function) as introduced by Bello [11, 109, 195] is reobtained from the GSF definition, and for $\alpha = -1/2$ Bello's Doppler-delay spread function is reobtained,
\[ S_H^{(1/2)}(\tau, \nu) = \int_t h(t, t - \tau) \, e^{-j2\pi\nu t} \, dt , \qquad S_H^{(-1/2)}(\tau, \nu) = \int_t h(t + \tau, t) \, e^{-j2\pi\nu t} \, dt . \]
For different $\alpha$ values the various GSFs differ from each other merely by a phase factor:
\[ S_H^{(\alpha_2)}(\tau, \nu) = S_H^{(\alpha_1)}(\tau, \nu) \, e^{j2\pi(\alpha_1 - \alpha_2)\tau\nu} . \tag{B.5} \]
Due to (B.5), the magnitude of the GSF is $\alpha$-invariant, $|S_H^{(\alpha_1)}(\tau, \nu)| = |S_H^{(\alpha_2)}(\tau, \nu)|$. We will therefore usually neglect the superscript $\alpha$ and simply write $|S_H(\tau, \nu)|$.
Figure B.1: Schematic representation of the GSF magnitude of some generic types of linear systems: (a) identity operator, (b) TF shift operator, (c) LTI system, (d) LFI system, (e) quasi-LTI system, (f) quasi-LFI system.
We briefly consider some special cases which help clarify the interpretation of the GSF (see Fig. B.1):
The GSF of the identity operator $I$ with impulse response $h(t, t') = \delta(t - t')$ is given by $S_I^{(\alpha)}(\tau, \nu) = \delta(\tau)\,\delta(\nu)$. This is consistent with the fact that the identity operator shifts a signal neither in time nor in frequency.
For a TF shift operator $S^{(\alpha_0)}_{\tau_0,\nu_0}$ with impulse response $h(t, t') = \delta(t - t' - \tau_0) \, e^{j2\pi\nu_0 t} \, e^{j2\pi\nu_0\tau_0(\alpha_0 - 1/2)}$, the GSF is obtained as $S^{(\alpha)}_{S^{(\alpha_0)}_{\tau_0,\nu_0}}(\tau, \nu) = \delta(\tau - \tau_0)\,\delta(\nu - \nu_0) \, e^{j2\pi(\alpha_0 - \alpha)\tau_0\nu_0}$, i.e., a 2-D Dirac impulse at the point $(\tau_0, \nu_0)$. This agrees with the fact that $S^{(\alpha_0)}_{\tau_0,\nu_0}$ shifts a signal by $\tau_0$ in time and by $\nu_0$ in frequency.
For an LTI system $H$ with impulse response $h(t, t') = g(t - t')$, we obtain $S_H^{(\alpha)}(\tau, \nu) = g(\tau)\,\delta(\nu)$, which is consistent with the fact that LTI systems do not introduce any frequency shifts.
For an LFI system $H$ with $h(t, t') = m(t)\,\delta(t - t')$, the GSF is given by $S_H^{(\alpha)}(\tau, \nu) = M(\nu)\,\delta(\tau)$, where $M(\nu)$ is the Fourier transform of $m(t)$. This correctly reflects that LFI systems do not introduce any time shifts.
Sometimes we will need the GSF of the adjoint system $H^+$, which is related to the GSF of $H$ as
\[ S_{H^+}^{(\alpha)}(\tau, \nu) = S_H^{(-\alpha)*}(-\tau, -\nu) = S_H^{(\alpha)*}(-\tau, -\nu) \, e^{-j4\pi\alpha\nu\tau} . \tag{B.6} \]
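The interpretation of the GSF as a weight function for TF shifts can be made concrete in a discrete setting. The sketch below is an illustrative assumption rather than the thesis's construction: it uses a cyclic model of length $n$, the case $\alpha = 1/2$ (so that the TF shift is $F_\nu T_\tau$), and a normalization in which the synthesis formula carries a factor $1/n$. The function names `tf_shift`, `spreading_function`, and `synthesize` are hypothetical.

```python
import numpy as np


def tf_shift(n, tau, nu):
    """Cyclic TF shift F_nu T_tau: (S x)[t] = x[(t - tau) mod n] * exp(j 2 pi nu t / n)."""
    t = np.arange(n)
    return np.diag(np.exp(2j * np.pi * nu * t / n)) @ np.roll(np.eye(n), tau, axis=0)


def spreading_function(H):
    """Discrete alpha = 1/2 GSF: S[tau, nu] = sum_t H[t, (t - tau) mod n] exp(-j 2 pi nu t / n)."""
    n = H.shape[0]
    t = np.arange(n)
    lagged = np.array([H[t, (t - tau) % n] for tau in range(n)])  # lagged[tau, t]
    return np.fft.fft(lagged, axis=1)


def synthesize(S):
    """Discrete analogue of (B.3): H = (1/n) sum over (tau, nu) of S[tau, nu] F_nu T_tau."""
    n = S.shape[0]
    H = np.zeros((n, n), dtype=complex)
    for tau in range(n):
        for nu in range(n):
            H += S[tau, nu] * tf_shift(n, tau, nu)
    return H / n


n = 8
rng = np.random.default_rng(3)
H = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

S_H = spreading_function(H)
H_rec = synthesize(S_H)

# GSF of a pure TF shift: a single peak at (tau0, nu0) = (3, 5).
S_shift = spreading_function(tf_shift(n, 3, 5))
peak = np.unravel_index(np.argmax(np.abs(S_shift)), S_shift.shape)

# GSF of a cyclic LTI system is confined to nu = 0, that of an LFI system to tau = 0.
idx = np.arange(n)
g = rng.standard_normal(n)
S_lti = spreading_function(g[(idx[:, None] - idx[None, :]) % n])
S_lfi = spreading_function(np.diag(rng.standard_normal(n)))

print(peak, np.max(np.abs(H_rec - H)), abs(S_H[0, 0] - np.trace(H)))
```

With this normalization the value of the GSF at the origin equals the trace of the matrix, and the squared magnitudes of the GSF sum to $n$ times the squared Frobenius norm, which are the discrete counterparts of the trace and norm relations of the continuous-time GSF.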
216
It is seen directly from the denition of the GSF that SH (0, 0) = Tr{H} .
()
(B.7)
Furthermore, the GSF is also a unitary operator representation in the sense that it preserves inner products, i.e., SH , SG
() ()
=
()
SH (, ) SG (, ) d d = H, G ,
()
()
(B.8)
which implies that the L2 norm of SH (, ) equals the HS norm of H, SH

() 2 2
SH (, ) d d = H
()
2 2,
(B.9)
and, furthermore, that the bilinear form QH (x, y) can be reformulated as a (, )-domain inner product, QH (x, y) = Hx, y = SH , A() = y,x
() () () SH (, ) Ay,x (, ) d d , ()
(B.10)
where Ay,x (, ) denotes the generalized cross ambiguity function of y(t) and x(t) (cf. Subsection B.2.4). Using the SVD (A.5) of H, the GSF can also be written as a weighted superposition of generalized cross ambiguity functions of the (left and right) singular functions, i.e. SH (, ) =
k ()
k A() k (, ) . uk ,v
(B.11)
For normal operators, this simplies to the following weighted superposition of generalized auto ambiguity functions of the eigenfunctions (cf. (A.4)), SH (, ) =
k ()
k A() k (, ) . uk ,u
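In Subsection 2.3.4 we consider operator products of the type $\mathbf{H}_2\mathbf{H}_1$. The GSF of such operator products is given by the so-called twisted convolution [64, 118],
\[ S^{(\alpha)}_{\mathbf{H}_2\mathbf{H}_1}(\tau,\nu) = \big( S^{(\alpha)}_{\mathbf{H}_2} \,\natural\, S^{(\alpha)}_{\mathbf{H}_1} \big)(\tau,\nu) \triangleq \int_{\tau'}\!\int_{\nu'} S^{(\alpha)}_{\mathbf{H}_2}(\tau',\nu')\, S^{(\alpha)}_{\mathbf{H}_1}(\tau-\tau',\nu-\nu')\, e^{-j2\pi\phi_\alpha(\tau,\nu,\tau',\nu')}\, d\tau'\, d\nu' \tag{B.12} \]
with $\phi_\alpha(\tau,\nu,\tau',\nu') = (\alpha+1/2)\,\tau'(\nu-\nu') + (\alpha-1/2)\,(\tau-\tau')\,\nu'$. In the case $\alpha = 0$, the phase $\phi_\alpha(\tau,\nu,\tau',\nu')$ simplifies (up to the factor of 1/2) to the symplectic form on $\mathbb{R}^2$, i.e.,
\[ \phi_0(\tau,\nu,\tau',\nu') = \tfrac{1}{2}\,(\nu\tau' - \tau\nu'). \]
The name twisted convolution stems from the fact that apart from the phase factor involving $\phi_\alpha(\tau,\nu,\tau',\nu')$, (B.12) looks like an ordinary convolution. Using (B.12), the magnitude of $S^{(\alpha)}_{\mathbf{H}_2\mathbf{H}_1}(\tau,\nu)$ can be shown to be bounded as
\[ \big| S^{(\alpha)}_{\mathbf{H}_2\mathbf{H}_1}(\tau,\nu) \big| = \big| \big( S^{(\alpha)}_{\mathbf{H}_2} \,\natural\, S^{(\alpha)}_{\mathbf{H}_1} \big)(\tau,\nu) \big| \le \big| S^{(\alpha)}_{\mathbf{H}_1}(\tau,\nu) \big| ** \big| S^{(\alpha)}_{\mathbf{H}_2}(\tau,\nu) \big|, \tag{B.13} \]
with $**$ denoting ordinary 2-D convolution. The twisted convolution also satisfies Young's inequality [64],
\[ \big\| S^{(\alpha)}_{\mathbf{H}_2} \,\natural\, S^{(\alpha)}_{\mathbf{H}_1} \big\|_r \le \big\| S^{(\alpha)}_{\mathbf{H}_2} \big\|_p\, \big\| S^{(\alpha)}_{\mathbf{H}_1} \big\|_q, \qquad \frac{1}{p} + \frac{1}{q} = \frac{1}{r} + 1, \tag{B.14} \]
and even the stronger bound
\[ \big\| S^{(\alpha)}_{\mathbf{H}_2} \,\natural\, S^{(\alpha)}_{\mathbf{H}_1} \big\|_2 \le \big\| S^{(\alpha)}_{\mathbf{H}_2} \big\|_2\, \big\| S^{(\alpha)}_{\mathbf{H}_1} \big\|_2. \tag{B.15} \]
We note that with (B.9) and (B.12), this is equivalent to
\[ \| \mathbf{H}_2\mathbf{H}_1 \|_2 \le \| \mathbf{H}_2 \|_2\, \| \mathbf{H}_1 \|_2. \tag{B.16} \]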
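However, the twisted convolution is not commutative. In particular, interchanging $\mathbf{H}_2$ and $\mathbf{H}_1$ reverses the sign of $\phi_0(\tau,\nu,\tau',\nu')$. By expanding the phase factor into its Taylor series,
\[ \begin{aligned} e^{-j2\pi\phi_\alpha(\tau,\nu,\tau',\nu')} &= e^{-j2\pi(\alpha+1/2)\,\tau'(\nu-\nu')}\; e^{-j2\pi(\alpha-1/2)\,(\tau-\tau')\,\nu'} \\ &= \sum_{k=0}^{\infty} \frac{[-j2\pi(\alpha+1/2)\,\tau'(\nu-\nu')]^k}{k!}\; \sum_{l=0}^{\infty} \frac{[-j2\pi(\alpha-1/2)\,(\tau-\tau')\,\nu']^l}{l!} \\ &= \sum_{k,l=0}^{\infty} (-j2\pi)^{k+l}\, c_{kl}\; \tau'^k \nu'^l\, (\tau-\tau')^l (\nu-\nu')^k, \end{aligned} \]
with $c_{kl} = (\alpha+1/2)^k (\alpha-1/2)^l / (k!\, l!)$, and suitably substituting, the twisted convolution (B.12) can be expressed as an infinite sum of ordinary convolutions (for $\alpha = 0$, this was already done in [64]),
\[ \begin{aligned} S^{(\alpha)}_{\mathbf{H}_2\mathbf{H}_1}(\tau,\nu) &= \sum_{k,l=0}^{\infty} (-j2\pi)^{k+l}\, c_{kl} \int_{\tau'}\!\int_{\nu'} \tau'^k \nu'^l\, S^{(\alpha)}_{\mathbf{H}_2}(\tau',\nu')\; (\tau-\tau')^l (\nu-\nu')^k\, S^{(\alpha)}_{\mathbf{H}_1}(\tau-\tau',\nu-\nu')\, d\tau'\, d\nu' \\ &= \sum_{k,l=0}^{\infty} (-j2\pi)^{k+l}\, c_{kl}\, \big[ \tau^k \nu^l\, S^{(\alpha)}_{\mathbf{H}_2}(\tau,\nu) \big] ** \big[ \tau^l \nu^k\, S^{(\alpha)}_{\mathbf{H}_1}(\tau,\nu) \big] \\ &= \big( S^{(\alpha)}_{\mathbf{H}_1} ** S^{(\alpha)}_{\mathbf{H}_2} \big)(\tau,\nu) + \sum_{k+l>0} (-j2\pi)^{k+l}\, c_{kl}\, \big[ \tau^k \nu^l\, S^{(\alpha)}_{\mathbf{H}_2}(\tau,\nu) \big] ** \big[ \tau^l \nu^k\, S^{(\alpha)}_{\mathbf{H}_1}(\tau,\nu) \big], \end{aligned} \]
where in the last step we have split off the $k = l = 0$ term since it gives the ordinary convolution of $S^{(\alpha)}_{\mathbf{H}_1}(\tau,\nu)$ and $S^{(\alpha)}_{\mathbf{H}_2}(\tau,\nu)$.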
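The twisted convolution and the magnitude bound (B.13) can be illustrated in a small discrete example. This is a sketch under assumptions, not the formulation used in the text: systems are $N \times N$ matrices with cyclic indexing, $\alpha = 1/2$ is chosen (so that $\phi_{1/2} = \tau'(\nu-\nu')$), and the helper names `spreading_function` and `twisted_convolution` are hypothetical. The factor $1/N$ is the discrete counterpart of the integration measure.

```python
import numpy as np

def spreading_function(h):
    """Discrete cyclic GSF for alpha = 1/2: S[m, k] = sum_n h[n, n-m] e^{-j 2 pi k n / N}."""
    N = h.shape[0]
    n = np.arange(N)
    h_lag = np.stack([h[n, (n - m) % N] for m in range(N)], axis=1)
    return np.fft.fft(h_lag, axis=0).T

def twisted_convolution(S2, S1):
    """Discrete twisted convolution for alpha = 1/2, phase exp(-j 2 pi m' (k - k') / N)."""
    N = S2.shape[0]
    out = np.zeros((N, N), dtype=complex)
    for m in range(N):
        for k in range(N):
            for mp in range(N):
                for kp in range(N):
                    phase = np.exp(-2j * np.pi * mp * (k - kp) / N)
                    out[m, k] += S2[mp, kp] * S1[(m - mp) % N, (k - kp) % N] * phase
    return out / N

rng = np.random.default_rng(0)
N = 6
H1 = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
H2 = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
S1, S2 = spreading_function(H1), spreading_function(H2)

S21_direct = spreading_function(H2 @ H1)      # GSF of the operator product
S21_twisted = twisted_convolution(S2, S1)     # discrete analogue of (B.12)

# Ordinary cyclic 2-D convolution of the magnitudes, cf. the bound (B.13).
bound = np.real(np.fft.ifft2(np.fft.fft2(np.abs(S2)) * np.fft.fft2(np.abs(S1)))) / N
```

Swapping the arguments of `twisted_convolution` generally gives a different result, which reflects the noncommutativity noted above.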
B.1.2 Generalized Weyl Symbol

An alternative TF description of an LTV system that is related to TF weightings instead of TF shifts (see Section 2.3) is the generalized Weyl symbol (GWS), defined by [114, 118]
\[ L^{(\alpha)}_{\mathbf{H}}(t,f) \triangleq \int_\tau h^{(\alpha)}(t,\tau)\, e^{-j2\pi f\tau}\, d\tau, \tag{B.17} \]
with $h^{(\alpha)}(t,\tau)$ given by (B.2). For $\alpha = 0$, the GWS reduces to the ordinary Weyl symbol [64, 99, 114, 118, 190],
\[ L^{(0)}_{\mathbf{H}}(t,f) = \int_\tau h\!\left( t + \frac{\tau}{2},\, t - \frac{\tau}{2} \right) e^{-j2\pi f\tau}\, d\tau. \]
With $\alpha = 1/2$, Zadeh's time-varying transfer function [214] is re-obtained,
\[ L^{(1/2)}_{\mathbf{H}}(t,f) = \int_\tau h(t, t-\tau)\, e^{-j2\pi f\tau}\, d\tau, \]
and for $\alpha = -1/2$ the GWS reduces to the Kohn-Nirenberg symbol [64, 112] which is equivalent to Bello's frequency-dependent modulation function [11],
\[ L^{(-1/2)}_{\mathbf{H}}(t,f) = \int_\tau h(t+\tau, t)\, e^{-j2\pi f\tau}\, d\tau. \]
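The kernel of $\mathbf{H}$ can be reobtained from the GWS via
\[ h(t,t') = \int_f L^{(\alpha)}_{\mathbf{H}}\!\left( \frac{t+t'}{2} + \alpha(t-t'),\, f \right) e^{j2\pi f(t-t')}\, df, \tag{B.18} \]
which shows that the GWS contains all information about $\mathbf{H}$. Based on (B.18), the I/O relation (A.3) can be reformulated as
\[ y(t) = (\mathbf{H}x)(t) = \int_{t'}\!\int_f L^{(\alpha)}_{\mathbf{H}}\!\left( \frac{t+t'}{2} + \alpha(t-t'),\, f \right) x(t')\, e^{j2\pi f(t-t')}\, df\, dt'. \]
For $|\alpha| = 1/2$ this reduces to the particularly simple expressions
\[ y(t) = \int_f L^{(1/2)}_{\mathbf{H}}(t,f)\, X(f)\, e^{j2\pi f t}\, df, \qquad Y(f) = \int_t L^{(-1/2)}_{\mathbf{H}}(t,f)\, x(t)\, e^{-j2\pi f t}\, dt. \tag{B.19} \]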
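The first relation in (B.19) says that for $\alpha = 1/2$ the GWS acts like a time-varying transfer function applied to the input spectrum. A minimal numerical sketch follows, assuming a discrete cyclic model in which the system is an $N \times N$ matrix; the variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 16
H = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
x = rng.standard_normal(N) + 1j * rng.standard_normal(N)
n = np.arange(N)

# h_lag[n, m] = h[n, (n - m) mod N]: impulse response as a function of time n and lag m.
h_lag = np.stack([H[n, (n - m) % N] for m in range(N)], axis=1)

# Discrete GWS for alpha = 1/2: L[n, k] = sum_m h[n, n - m] exp(-j 2 pi k m / N).
L = np.fft.fft(h_lag, axis=1)

# Discrete analogue of y(t) = int L(t, f) X(f) exp(j 2 pi f t) df.
X = np.fft.fft(x)
E = np.exp(2j * np.pi * np.outer(n, n) / N)   # E[n, k] = exp(j 2 pi k n / N)
y_symbol = np.sum(L * X[None, :] * E, axis=1) / N

y_direct = H @ x                              # plain matrix-vector I/O relation
```

For an LTI (circulant) matrix, every row of `L` equals the DFT of the impulse response, and the formula reduces to ordinary frequency-domain filtering.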
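It is important to note that the GWS is in 2-D Fourier correspondence with the GSF, i.e.,
\[ S^{(\alpha)}_{\mathbf{H}}(\tau,\nu) = \int_t\!\int_f L^{(\alpha)}_{\mathbf{H}}(t,f)\, e^{-j2\pi(\nu t - \tau f)}\, dt\, df. \tag{B.20} \]
Hence, as a consequence of (B.5), for two different parameters $\alpha_1$ and $\alpha_2$ the corresponding GWSs are connected via a two-dimensional convolution,
\[ L^{(\alpha_2)}_{\mathbf{H}}(t,f) = L^{(\alpha_1)}_{\mathbf{H}}(t,f) ** \frac{1}{|\alpha_1-\alpha_2|}\, e^{j2\pi \frac{f t}{\alpha_1-\alpha_2}}. \tag{B.21} \]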
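Furthermore, taking the 2-D Fourier transform of (B.3), one obtains the alternative I/O relation
\[ (\mathbf{H}x)(t') = \int_t\!\int_f L^{(\alpha)}_{\mathbf{H}}(t,f)\, \big( \mathbf{L}^{(\alpha)}_{t,f}\, x \big)(t')\, dt\, df, \qquad \text{with} \quad \mathbf{L}^{(\alpha)}_{t,f} = \mathbf{S}^{(\alpha)}_{t,f}\, \mathbf{L}^{(\alpha)}\, \mathbf{S}^{(\alpha)+}_{t,f}, \]
where $\mathbf{L}^{(\alpha)}$ is a linear operator with kernel $l^{(\alpha)}(t,t')$ given by
\[ l^{(\alpha)}(t,t') = \delta\!\left( \frac{t+t'}{2} + \alpha(t-t') \right). \tag{B.22} \]
For $|\alpha| = 1/2$ and $\alpha = 0$ the action of the operator $\mathbf{L}^{(\alpha)}_{t,f}$ is given by
\[ \begin{aligned} \big( \mathbf{L}^{(1/2)}_{t,f}\, x \big)(t') &= X(f)\, \delta(t'-t)\, e^{j2\pi f t'}, \\ \big( \mathbf{L}^{(-1/2)}_{t,f}\, x \big)(t') &= x(t)\, e^{j2\pi f(t'-t)}, \\ \big( \mathbf{L}^{(0)}_{t,f}\, x \big)(t') &= 2\, x(2t-t')\, e^{j4\pi f(t'-t)}. \end{aligned} \]
Note that $\mathbf{L}^{(0)}_{t,f}$ performs a TF reflection about $(t,f)$.

Some examples for the GWS of specific systems are as follows:

The GWS of the identity operator $\mathbf{I}$ with impulse response $h(t,t') = \delta(t-t')$ is given by $L^{(\alpha)}_{\mathbf{I}}(t,f) \equiv 1$, which correctly reflects that the identity operator leaves all signals unchanged.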
For a TF shift operator $\mathbf{S}^{(\alpha_0)}_{\tau_0,\nu_0}$ with impulse response $h(t,t') = \delta(t-t'-\tau_0)\, e^{j2\pi\nu_0 t}\, e^{j2\pi\nu_0\tau_0(\alpha_0-1/2)}$, the GWS is given by
\[ L^{(\alpha)}_{\mathbf{S}^{(\alpha_0)}_{\tau_0,\nu_0}}(t,f) = e^{j2\pi(\nu_0 t - \tau_0 f)}\, e^{j2\pi(\alpha_0-\alpha)\tau_0\nu_0}. \]

The GWS of an LTI system with impulse response $h(t,t') = g(t-t')$ reduces to the ordinary transfer function for all $t$, $L^{(\alpha)}_{\mathbf{H}}(t,f) = G(f)$, where $G(f)$ is the Fourier transform of $g(\tau)$.

The GWS of an LFI system with impulse response $h(t,t') = m(t)\,\delta(t-t')$ reduces to the temporal transfer function for all $f$, i.e., $L^{(\alpha)}_{\mathbf{H}}(t,f) = m(t)$.
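Due to (B.20) and (B.6), the GWS of the adjoint system $\mathbf{H}^+$ can be expressed (for $\alpha \neq 0$) in terms of the GWS of $\mathbf{H}$ as
\[ L^{(\alpha)}_{\mathbf{H}^+}(t,f) = L^{(-\alpha)*}_{\mathbf{H}}(t,f) = L^{(\alpha)*}_{\mathbf{H}}(t,f) ** \frac{1}{2|\alpha|}\, e^{-j\pi \frac{f t}{\alpha}}, \tag{B.23} \]
and for $\alpha = 0$ there is simply
\[ L^{(0)}_{\mathbf{H}^+}(t,f) = L^{(0)*}_{\mathbf{H}}(t,f), \tag{B.24} \]
from which it follows that the Weyl symbol (GWS with $\alpha = 0$) of a self-adjoint system is real-valued. By integrating (B.17) over $t$ and $f$, it is seen that
\[ \int_t\!\int_f L^{(\alpha)}_{\mathbf{H}}(t,f)\, dt\, df = \mathrm{Tr}\{\mathbf{H}\}. \]
Furthermore, like the GSF, the GWS preserves inner products and norms of HS operators,
\[ \langle L^{(\alpha)}_{\mathbf{H}}, L^{(\alpha)}_{\mathbf{G}} \rangle = \int_t\!\int_f L^{(\alpha)}_{\mathbf{H}}(t,f)\, L^{(\alpha)*}_{\mathbf{G}}(t,f)\, dt\, df = \langle \mathbf{H}, \mathbf{G} \rangle, \tag{B.25} \]
\[ \| L^{(\alpha)}_{\mathbf{H}} \|_2^2 = \int_t\!\int_f | L^{(\alpha)}_{\mathbf{H}}(t,f) |^2\, dt\, df = \| \mathbf{H} \|_2^2. \tag{B.26} \]
Furthermore, the bilinear form can be expressed in terms of the GWS as
\[ Q_{\mathbf{H}}(x,y) = \langle \mathbf{H}x, y \rangle = \langle L^{(\alpha)}_{\mathbf{H}}, W^{(\alpha)}_{y,x} \rangle = \int_t\!\int_f L^{(\alpha)}_{\mathbf{H}}(t,f)\, W^{(\alpha)*}_{y,x}(t,f)\, dt\, df, \tag{B.27} \]
where $W^{(\alpha)}_{y,x}(t,f)$ is the generalized cross Wigner distribution (see Subsection B.2.2) of $y(t)$ and $x(t)$. Combining the SVD (A.5) of $\mathbf{H}$ with the GWS definition (B.17), it is seen that the GWS can be written as a weighted superposition of generalized cross Wigner distributions of the singular functions,
\[ L^{(\alpha)}_{\mathbf{H}}(t,f) = \sum_k \sigma_k\, W^{(\alpha)}_{u_k,v_k}(t,f). \tag{B.28} \]
In the case of normal operators, this simplifies to the weighted superposition of generalized auto Wigner distributions of the eigenfunctions,
\[ L^{(\alpha)}_{\mathbf{H}}(t,f) = \sum_k \lambda_k\, W^{(\alpha)}_{u_k}(t,f). \tag{B.29} \]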
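The GWS of operator products $\mathbf{H}_2\mathbf{H}_1$ can be obtained by taking the 2-D Fourier transform of (B.12), which yields
\[ \begin{aligned} L^{(\alpha)}_{\mathbf{H}_2\mathbf{H}_1}(t,f) = \big( L^{(\alpha)}_{\mathbf{H}_2} \# L^{(\alpha)}_{\mathbf{H}_1} \big)(t,f) &\triangleq \sum_{k,l=0}^{\infty} \frac{c_{kl}}{(j2\pi)^{k+l}}\; \frac{\partial^{k+l} L^{(\alpha)}_{\mathbf{H}_2}(t,f)}{\partial t^l\, \partial f^k}\; \frac{\partial^{k+l} L^{(\alpha)}_{\mathbf{H}_1}(t,f)}{\partial t^k\, \partial f^l} && \text{(B.30)} \\ &= L^{(\alpha)}_{\mathbf{H}_2}(t,f)\, L^{(\alpha)}_{\mathbf{H}_1}(t,f) + \sum_{k+l>0} \frac{c_{kl}}{(j2\pi)^{k+l}}\; \frac{\partial^{k+l} L^{(\alpha)}_{\mathbf{H}_2}(t,f)}{\partial t^l\, \partial f^k}\; \frac{\partial^{k+l} L^{(\alpha)}_{\mathbf{H}_1}(t,f)}{\partial t^k\, \partial f^l}, && \text{(B.31)} \end{aligned} \]
with $c_{kl} = (\alpha+1/2)^k (\alpha-1/2)^l / (k!\, l!)$. Note that if the phase factor in the twisted convolution (B.12) were not present, then due to the 2-D Fourier relation (B.20), the GWS of $\mathbf{H}_2\mathbf{H}_1$ would be exactly equal to the product of the individual GWSs. According to the naming convention used for $\alpha = 0$ in the mathematics literature [64], we refer to $\big( L^{(\alpha)}_{\mathbf{H}_2} \# L^{(\alpha)}_{\mathbf{H}_1} \big)(t,f)$ as twisted product (or star product) of $L^{(\alpha)}_{\mathbf{H}_2}(t,f)$ and $L^{(\alpha)}_{\mathbf{H}_1}(t,f)$.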
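B.1.3 Generalized Transfer Wigner Distribution and Generalized Input and Output Wigner Distribution

An interesting alternative approach to describe the TF weightings and TF displacements of an LTV system $\mathbf{H}$ is in terms of the generalized transfer Wigner distribution, defined as [8, 90, 130]
\[ W^{(\alpha)}_{\mathbf{H}}(t_1,f_1;t_2,f_2) = \int_{\tau_1}\!\int_{\tau_2} h\!\left( t_1 + \big(\tfrac{1}{2}-\alpha\big)\tau_1,\; t_2 + \big(\tfrac{1}{2}-\alpha\big)\tau_2 \right) h^{*}\!\left( t_1 - \big(\tfrac{1}{2}+\alpha\big)\tau_1,\; t_2 - \big(\tfrac{1}{2}+\alpha\big)\tau_2 \right) e^{-j2\pi(f_1\tau_1 - f_2\tau_2)}\, d\tau_1\, d\tau_2. \tag{B.32} \]
The generalized transfer Wigner distribution describes the mapping of the generalized Wigner distribution (see Subsection B.2.2) of the input signal $x(t)$ to the generalized Wigner distribution of the output signal $(\mathbf{H}x)(t)$ according to
\[ W^{(\alpha)}_{\mathbf{H}x}(t_1,f_1) = \int_{t_2}\!\int_{f_2} W^{(\alpha)}_{\mathbf{H}}(t_1,f_1;t_2,f_2)\, W^{(\alpha)}_{x}(t_2,f_2)\, dt_2\, df_2. \]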
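We next present a theorem that will be important in Chapter 3 and that generalizes a result presented for the special case $\alpha = 0$ in [90]. To this end, we define a coordinate-transformed version of the generalized transfer Wigner distribution as
\[ \widetilde{W}^{(\alpha)}_{\mathbf{H}}(t,f;\tau,\nu) = W^{(\alpha)}_{\mathbf{H}}\big( t + \alpha_-\tau,\; f + \alpha_+\nu;\; t - \alpha_+\tau,\; f - \alpha_-\nu \big), \tag{B.33} \]
where $\alpha_+ = 1/2 + \alpha$ and $\alpha_- = 1/2 - \alpha$.

Theorem B.1. The generalized transfer Wigner distribution of any linear system $\mathbf{H}$ satisfies
\[ \int_\tau\!\int_\nu \widetilde{W}^{(\alpha)}_{\mathbf{H}}(t,f;\tau,\nu)\, d\tau\, d\nu = \big| L^{(\alpha)}_{\mathbf{H}}(t,f) \big|^2, \tag{B.34} \]
\[ \int_t\!\int_f \widetilde{W}^{(\alpha)}_{\mathbf{H}}(t,f;\tau,\nu)\, dt\, df = \big| S^{(\alpha)}_{\mathbf{H}}(\tau,\nu) \big|^2. \tag{B.35} \]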
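Proof. To prove (B.34), we insert (B.32) into (B.33) and integrate with respect to $\tau$ and $\nu$,
\[ \begin{aligned} \int_\tau\!\int_\nu \widetilde{W}^{(\alpha)}_{\mathbf{H}}(t,f;\tau,\nu)\, d\tau\, d\nu &= \int_\tau\!\int_\nu\!\int_{\tau_1}\!\int_{\tau_2} h(t + \alpha_-\tau + \alpha_-\tau_1,\; t - \alpha_+\tau + \alpha_-\tau_2)\; h^{*}(t + \alpha_-\tau - \alpha_+\tau_1,\; t - \alpha_+\tau - \alpha_+\tau_2) \\ &\qquad \cdot\, e^{-j2\pi[(f+\alpha_+\nu)\tau_1 - (f-\alpha_-\nu)\tau_2]}\, d\tau_1\, d\tau_2\, d\tau\, d\nu \\ &= \int_\tau\!\int_{\tau_1}\!\int_{\tau_2} h(t + \alpha_-\tau + \alpha_-\tau_1,\; t - \alpha_+\tau + \alpha_-\tau_2)\; h^{*}(t + \alpha_-\tau - \alpha_+\tau_1,\; t - \alpha_+\tau - \alpha_+\tau_2) \\ &\qquad \cdot \underbrace{\int_\nu e^{-j2\pi\nu(\alpha_+\tau_1 + \alpha_-\tau_2)}\, d\nu}_{\delta(\alpha_+\tau_1 + \alpha_-\tau_2)}\; e^{-j2\pi f(\tau_1-\tau_2)}\, d\tau_1\, d\tau_2\, d\tau \\ &= \int_\tau\!\int_{\tau_1}\!\int_{\tau_2} h\big(t + \alpha_-\tau_1,\; t - \alpha_+\tau + \alpha_-(\tau_2-\tau)\big)\; h^{*}\big(t + \alpha_-\tau - \alpha_+(\tau_1-\tau),\; t - \alpha_+\tau_2\big) \\ &\qquad \cdot\, \delta(\alpha_+\tau_1 + \alpha_-\tau_2 - \tau)\; e^{-j2\pi f(\tau_1-\tau_2)}\, d\tau_1\, d\tau_2\, d\tau \\ &= \int_{\tau_1}\!\int_{\tau_2} h(t + \alpha_-\tau_1,\; t - \alpha_+\tau_1)\; h^{*}(t + \alpha_-\tau_2,\; t - \alpha_+\tau_2)\; e^{-j2\pi f(\tau_1-\tau_2)}\, d\tau_1\, d\tau_2 \\ &= L^{(\alpha)}_{\mathbf{H}}(t,f)\, L^{(\alpha)*}_{\mathbf{H}}(t,f), \end{aligned} \]
where the third line follows from the substitutions $\tau_1 \to \tau_1 - \tau$, $\tau_2 \to \tau_2 - \tau$. This completes the proof of (B.34). In a similar way, integrating $\widetilde{W}^{(\alpha)}_{\mathbf{H}}(t,f;\tau,\nu)$ with respect to $t$ and $f$ yields
\[ \begin{aligned} \int_t\!\int_f \widetilde{W}^{(\alpha)}_{\mathbf{H}}(t,f;\tau,\nu)\, dt\, df &= \int_t\!\int_f\!\int_{\tau_1}\!\int_{\tau_2} h(t + \alpha_-\tau + \alpha_-\tau_1,\; t - \alpha_+\tau + \alpha_-\tau_2)\; h^{*}(t + \alpha_-\tau - \alpha_+\tau_1,\; t - \alpha_+\tau - \alpha_+\tau_2) \\ &\qquad \cdot\, e^{-j2\pi[(f+\alpha_+\nu)\tau_1 - (f-\alpha_-\nu)\tau_2]}\, d\tau_1\, d\tau_2\, dt\, df \\ &= \int_t\!\int_{\tau_1}\!\int_{\tau_2} h(t + \alpha_-\tau + \alpha_-\tau_1,\; t - \alpha_+\tau + \alpha_-\tau_2)\; h^{*}(t + \alpha_-\tau - \alpha_+\tau_1,\; t - \alpha_+\tau - \alpha_+\tau_2) \\ &\qquad \cdot \underbrace{\int_f e^{-j2\pi f(\tau_1-\tau_2)}\, df}_{\delta(\tau_1-\tau_2)}\; e^{-j2\pi\nu(\alpha_+\tau_1 + \alpha_-\tau_2)}\, d\tau_1\, d\tau_2\, dt \\ &= \int_t\!\int_{\tau_1} h(t + \alpha_-\tau + \alpha_-\tau_1,\; t - \alpha_+\tau + \alpha_-\tau_1)\; h^{*}(t + \alpha_-\tau - \alpha_+\tau_1,\; t - \alpha_+\tau - \alpha_+\tau_1)\; e^{-j2\pi\nu\tau_1}\, d\tau_1\, dt \\ &= \int_{t_1}\!\int_{t_2} h(t_1 + \alpha_-\tau,\; t_1 - \alpha_+\tau)\; h^{*}(t_2 + \alpha_-\tau,\; t_2 - \alpha_+\tau)\; e^{-j2\pi\nu(t_1-t_2)}\, dt_1\, dt_2 \\ &= S^{(\alpha)}_{\mathbf{H}}(\tau,\nu)\, S^{(\alpha)*}_{\mathbf{H}}(\tau,\nu), \end{aligned} \]
where the last integral uses the substitution $t_1 = t + \alpha_-\tau_1$, $t_2 = t - \alpha_+\tau_1$, and hence finally (3.8) since $|S^{(\alpha)}_{\mathbf{H}}(\tau,\nu)|$ is independent of $\alpha$.

Integrating the generalized transfer Wigner distribution $W^{(\alpha)}_{\mathbf{H}}(t_1,f_1;t_2,f_2)$ with respect to $(t_1,f_1)$ and $(t_2,f_2)$, respectively, yields the generalized input Wigner distribution and the generalized output Wigner distribution of $\mathbf{H}$ [90], i.e.,
\[ IW^{(\alpha)}_{\mathbf{H}}(t,f) \triangleq \int_{t_1}\!\int_{f_1} W^{(\alpha)}_{\mathbf{H}}(t_1,f_1;t,f)\, dt_1\, df_1, \qquad OW^{(\alpha)}_{\mathbf{H}}(t,f) \triangleq \int_{t_2}\!\int_{f_2} W^{(\alpha)}_{\mathbf{H}}(t,f;t_2,f_2)\, dt_2\, df_2. \]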
The generalized input Wigner distribution describes the TF regions where the system $\mathbf{H}$ can pick up energy, and the generalized output Wigner distribution describes the TF regions where the output signals are potentially located. We note that the generalized input and output Wigner distribution can be related to the GWS via
\[ IW^{(\alpha)}_{\mathbf{H}}(t,f) = L^{(\alpha)}_{\mathbf{H}^+\mathbf{H}}(t,f) \qquad \text{and} \qquad OW^{(\alpha)}_{\mathbf{H}}(t,f) = L^{(\alpha)}_{\mathbf{H}\mathbf{H}^+}(t,f). \tag{B.36} \]
Hence, it is seen that in the case of normal systems, the generalized input Wigner distribution and the generalized output Wigner distribution coincide and are then simply referred to as generalized Wigner distribution of the system $\mathbf{H}$.
B.2 Time-Frequency Signal Representations

In this section, we give a brief account of linear and quadratic TF signal representations [16, 35, 56, 61, 84, 151, 173, 200]. Quadratic TF representations of a signal $x(t)$ can be viewed as special cases of TF operator representations (see Section B.1) obtained for the rank-one operator $\mathbf{H} = x \otimes x^*$.

B.2.1 Short-Time Fourier Transform

An obvious way of analyzing a signal as to which frequencies occur at a given time is to apply a sliding window to the original signal and to compute the Fourier transform of the windowed signal. The resulting linear TF representation is referred to as short-time Fourier transform (STFT) [84, 157] and is given by
\[ \mathrm{STFT}^{(g)}_x(t,f) \triangleq \big\langle x,\, \mathbf{S}^{(1/2)}_{t,f}\, g \big\rangle = \int_{t'} x(t')\, g^*(t'-t)\, e^{-j2\pi f t'}\, dt', \tag{B.37} \]
where $g(t)$ is an analysis window. Thus, the STFT describes the local frequency content of the signal in a neighbourhood of the respective analysis time $t$. The STFT is unique in the sense that it is the only TF shift covariant linear TFR (up to a phase factor). It is possible to recover the signal $x(t)$ from its STFT using the synthesis formula
\[ x(t) = \int_{t'}\!\int_{f'} \mathrm{STFT}^{(g)}_x(t',f')\, \big( \mathbf{S}^{(1/2)}_{t',f'}\, w \big)(t)\, dt'\, df' = \int_{t'}\!\int_{f'} \mathrm{STFT}^{(g)}_x(t',f')\, w(t-t')\, e^{j2\pi f' t}\, dt'\, df', \tag{B.38} \]
provided that the analysis window $g(t)$ and the synthesis window $w(t)$ satisfy the (not very restrictive) condition $\langle g, w \rangle = 1$ [84]. Clearly, the STFT is significantly influenced by the choice of the window function $g(t)$. In order that the STFT correctly localizes the signal in the TF plane, the window has to be localized around the origin of the TF plane.
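The analysis and synthesis formulas (B.37) and (B.38) can be tried out on a short discrete signal. The sketch below assumes a cyclic (length-$N$) signal model and full sampling of the TF plane; the windows are random vectors and all variable names are illustrative. The synthesis window is rescaled so that the discrete counterpart of $\langle g, w \rangle = 1$ holds.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 16
x = rng.standard_normal(N) + 1j * rng.standard_normal(N)
g = rng.standard_normal(N) + 1j * rng.standard_normal(N)    # analysis window
w0 = rng.standard_normal(N) + 1j * rng.standard_normal(N)   # unnormalized synthesis window

# Rescale w so that <g, w> = sum_n g[n] conj(w[n]) = 1 (np.vdot conjugates its first argument).
w = w0 / np.conj(np.vdot(w0, g))

# Discrete STFT: STFT[m, k] = sum_n x[n] conj(g[n - m]) exp(-j 2 pi k n / N).
stft = np.stack([np.fft.fft(x * np.conj(np.roll(g, m))) for m in range(N)], axis=0)

# Synthesis: x_rec[n] = (1/N) sum_{m,k} STFT[m, k] w[n - m] exp(j 2 pi k n / N).
x_rec = np.zeros(N, dtype=complex)
for m in range(N):
    x_rec += np.roll(w, m) * np.fft.ifft(stft[m])

window_condition = np.vdot(w, g)   # discrete <g, w>
```

Any pair of windows with nonzero inner product can be normalized in this way, which is why the condition is described above as not very restrictive.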
B.2.2 Generalized Wigner Distribution

The generalized (cross) Wigner distribution (GWD) is a bilinear TF representation defined as [31, 46, 98, 151]
\[ W^{(\alpha)}_{x,y}(t,f) \triangleq \int_\tau q^{(\alpha)}_{x,y}(t,\tau)\, e^{-j2\pi f\tau}\, d\tau, \tag{B.39} \]
where
\[ q^{(\alpha)}_{x,y}(t,\tau) \triangleq q_{x,y}\!\left( t + \big(\tfrac{1}{2}-\alpha\big)\tau,\; t - \big(\tfrac{1}{2}+\alpha\big)\tau \right), \qquad \text{with} \quad q_{x,y}(t,t') = x(t)\, y^*(t'). \tag{B.40} \]
It is an extension of the ordinary (cross) Wigner distribution which was originally introduced in quantum mechanics by Wigner [211], rediscovered for signal analysis by Ville [205] and more recently made popular by Claasen and Mecklenbräuker [31]. The Wigner distribution is re-obtained from the GWD with $\alpha = 0$,
\[ W^{(0)}_{x,y}(t,f) = \int_\tau x\!\left( t + \frac{\tau}{2} \right) y^*\!\left( t - \frac{\tau}{2} \right) e^{-j2\pi f\tau}\, d\tau. \tag{B.41} \]
A further special GWD member, obtained with $\alpha = 1/2$, is the (cross) Rihaczek distribution [178],
\[ W^{(1/2)}_{x,y}(t,f) = \int_\tau x(t)\, y^*(t-\tau)\, e^{-j2\pi f\tau}\, d\tau = x(t)\, Y^*(f)\, e^{-j2\pi f t}. \]
Furthermore, two GWD members with different $\alpha$ values are related by a 2-D convolution as
\[ W^{(\alpha_2)}_{x,y}(t,f) = W^{(\alpha_1)}_{x,y}(t,f) ** \frac{1}{|\alpha_1-\alpha_2|}\, e^{j2\pi \frac{f t}{\alpha_1-\alpha_2}}. \]
The GWD is a bilinear TF representation well-known for the exceptionally large number of desirable properties it satisfies. It allows a loose interpretation as an energy distribution in the TF plane (we note that a pointwise interpretation as TF energy density is a priori prohibited by the uncertainty principle). We note that the GWD is the deterministic counterpart of the generalized Wigner-Ville spectrum (see Subsection B.3.1). The cross GWD of $x(t)$ and $y(t)$ can also be written in terms of the GWS of the rank-one operator $x \otimes y^*$ whose kernel is the outer signal product $q_{x,y}(t,t')$ in (B.40),
\[ W^{(\alpha)}_{x,y}(t,f) = L^{(\alpha)}_{x \otimes y^*}(t,f). \]
Thus, any GWS property (see Subsection B.1.2) carries over to a corresponding GWD property.
B.2.3 Spectrogram

An important quadratic TF representation is the spectrogram that is defined as the squared magnitude of the STFT (B.37) [84],
\[ \mathrm{SPEC}^{(g)}_x(t,f) \triangleq \big| \mathrm{STFT}^{(g)}_x(t,f) \big|^2. \tag{B.42} \]
Correspondingly, the cross spectrogram is defined as
\[ \mathrm{SPEC}^{(g)}_{x,y}(t,f) = \mathrm{STFT}^{(g)}_x(t,f)\; \mathrm{STFT}^{(g)*}_y(t,f). \]
The spectrogram is connected with the GWD via a double convolution,
\[ \mathrm{SPEC}^{(g)}_x(t,f) = \int_{t'}\!\int_{f'} W^{(\alpha)}_x(t',f')\, W^{(\alpha)*}_g(t'-t,\, f'-f)\, dt'\, df'. \]
Thus, for usual windows $g(t)$, the spectrogram can be interpreted as a smoothed GWD.
B.2.4 Generalized Ambiguity Function

Unlike the WD and the spectrogram, the generalized (cross) ambiguity function (GAF) [78]
\[ A^{(\alpha)}_{x,y}(\tau,\nu) \triangleq \int_t q^{(\alpha)}_{x,y}(t,\tau)\, e^{-j2\pi\nu t}\, dt = \big\langle x,\, \mathbf{S}^{(\alpha)}_{\tau,\nu}\, y \big\rangle \tag{B.43} \]
describes signals in a correlative domain. The GAF is an extension of the ordinary ambiguity function, originally introduced by Woodward in 1953 [213], and re-obtained from the GAF with $\alpha = 0$,
\[ A^{(0)}_{x,y}(\tau,\nu) = \int_t x\!\left( t + \frac{\tau}{2} \right) y^*\!\left( t - \frac{\tau}{2} \right) e^{-j2\pi\nu t}\, dt. \]
It is furthermore in (two-dimensional) Fourier correspondence with the GWD,
\[ A^{(\alpha)}_{x,y}(\tau,\nu) = \int_t\!\int_f W^{(\alpha)}_{x,y}(t,f)\, e^{-j2\pi(\nu t - \tau f)}\, dt\, df. \]
The GAFs for various $\alpha$ values differ from each other merely by a phase factor, i.e.,
\[ A^{(\alpha_2)}_{x,y}(\tau,\nu) = A^{(\alpha_1)}_{x,y}(\tau,\nu)\, e^{j2\pi(\alpha_1-\alpha_2)\tau\nu}. \tag{B.44} \]
In many situations we are interested only in the magnitude of the GAF, which is $\alpha$-invariant,
\[ \big| A^{(\alpha_2)}_{x,y}(\tau,\nu) \big| = \big| A^{(\alpha_1)}_{x,y}(\tau,\nu) \big|. \]
The auto GAF $A^{(\alpha)}_x(\tau,\nu) \triangleq A^{(\alpha)}_{x,x}(\tau,\nu)$ features Hermitian even symmetry,
\[ A^{(\alpha)}_x(\tau,\nu) = A^{(-\alpha)*}_x(-\tau,-\nu), \]
and satisfies the inequality²
\[ \big| A^{(\alpha)}_x(\tau,\nu) \big| \le A^{(\alpha)}_x(0,0) = E_x, \]
which indicates that the GAF behaves like a correlation function. The shape of the auto GAF is constrained by the so-called radar uncertainty principle,
\[ \int_\tau\!\int_\nu \big| A^{(\alpha)}_x(\tau,\nu) \big|^2\, d\tau\, d\nu = \big[ A^{(\alpha)}_x(0,0) \big]^2 = E_x^2, \]
which imposes a strong limitation on the relation of volume and height of the auto GAF. The GAF is furthermore equal to the GSF of the outer signal product operator $x \otimes y^*$,
\[ A^{(\alpha)}_{x,y}(\tau,\nu) = S^{(\alpha)}_{x \otimes y^*}(\tau,\nu). \]
Thus, the discussion of the GSF properties in Subsection B.1.1 applies to the GAF as well.
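The peak property and the radar uncertainty principle of the auto GAF have simple discrete counterparts. The following sketch assumes a cyclic length-$N$ signal and $\alpha = 1/2$; the variable names are illustrative. In this DFT-based model the volume under $|A_x|^2$ equals $N E_x^2$, where the factor $N$ is a normalization artifact of the discrete Doppler axis.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 32
x = rng.standard_normal(N) + 1j * rng.standard_normal(N)

# Discrete auto GAF for alpha = 1/2:
# A[m, k] = sum_n x[n] conj(x[n - m]) exp(-j 2 pi k n / N).
A = np.stack([np.fft.fft(x * np.conj(np.roll(x, m))) for m in range(N)], axis=0)

energy = np.sum(np.abs(x) ** 2)           # E_x
peak_value = A[0, 0]                      # value at the origin of the (lag, Doppler) plane
max_magnitude = np.max(np.abs(A))         # should not exceed E_x
volume = np.sum(np.abs(A) ** 2) / N       # discrete counterpart of the radar uncertainty integral
```

Because the volume is fixed by the signal energy, a narrow peak at the origin has to be paid for by sidelobes elsewhere in the plane.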
B.3 Time-Frequency Representations of Random Processes

In this section, we discuss TF representations of nonstationary random processes. These TF representations are statistical characterizations of the process considered. Some of them can be viewed as special cases of the TF operator representations discussed in Section B.1, obtained with the (positive semi-definite) correlation operator $\mathbf{R}_x$ of the random process $x(t)$, or, equivalently, as expectations of some of the TF signal representations discussed in Section B.2.

² $E_x$ denotes the energy of the signal $x(t)$, i.e., $E_x = \|x\|_2^2 = \int_t |x(t)|^2\, dt$.
B.3.1 Generalized Wigner-Ville Spectrum

The generalized (cross) Wigner-Ville spectrum (GWVS) [60, 139, 140] of two (generally nonstationary) random processes $x(t)$, $y(t)$ is defined by
\[ \overline{W}^{(\alpha)}_{x,y}(t,f) \triangleq \int_\tau r^{(\alpha)}_{x,y}(t,\tau)\, e^{-j2\pi f\tau}\, d\tau, \tag{B.45} \]
with $r^{(\alpha)}_{x,y}(t,\tau) \triangleq r_{x,y}\big( t + (\tfrac{1}{2}-\alpha)\tau,\; t - (\tfrac{1}{2}+\alpha)\tau \big)$, where $r_{x,y}(t,t') = E\{x(t)\, y^*(t')\}$ is the cross correlation function of $x(t)$ and $y(t)$ ($E$ denotes expectation). With $\alpha = 0$ the ordinary Wigner-Ville spectrum [12, 60, 139, 140] is obtained,
\[ \overline{W}^{(0)}_{x,y}(t,f) = \int_\tau r_{x,y}\!\left( t + \frac{\tau}{2},\, t - \frac{\tau}{2} \right) e^{-j2\pi f\tau}\, d\tau, \]
while $\alpha = 1/2$ yields the Rihaczek spectrum [60]
\[ \overline{W}^{(1/2)}_{x,y}(t,f) = \int_\tau r_{x,y}(t, t-\tau)\, e^{-j2\pi f\tau}\, d\tau. \]
The GWVS satisfies a large number of desirable properties and thus is an interesting approach to defining a TF function that describes the spectral properties of nonstationary random processes. Note that (B.45) is a one-to-one mapping that can be inverted as
\[ r_{x,y}(t,t') = \int_f \overline{W}^{(\alpha)}_{x,y}\!\left( \frac{t+t'}{2} + \alpha(t-t'),\, f \right) e^{j2\pi f(t-t')}\, df. \]
Thus, for $y(t) = x(t)$ the GWVS is a complete second-order statistic of the process $x(t)$. Under certain mild conditions, the GWVS equals the expectation of the GWD in (B.39),
\[ \overline{W}^{(\alpha)}_{x,y}(t,f) = E\big\{ W^{(\alpha)}_{x,y}(t,f) \big\}. \]
Comparing the definition of the GWVS with (B.17) shows that the GWVS can equivalently be written as the GWS of the correlation operator $\mathbf{R}_{x,y} = E\{x \otimes y^*\}$ with kernel $r_{x,y}(t,t')$,
\[ \overline{W}^{(\alpha)}_{x,y}(t,f) = L^{(\alpha)}_{\mathbf{R}_{x,y}}(t,f). \]
It can furthermore be shown that the GWVS constitutes a 2-D Fourier transform pair with the GEAF (to be discussed in Subsection B.3.4),
\[ \overline{W}^{(\alpha)}_{x,y}(t,f) = \int_\tau\!\int_\nu \overline{A}^{(\alpha)}_{x,y}(\tau,\nu)\, e^{j2\pi(\nu t - \tau f)}\, d\tau\, d\nu. \tag{B.46} \]
This is of particular importance since it makes explicit the connection between the smoothness of the GWVS and the extension of the GEAF about the origin of the $(\tau,\nu)$ plane (see Subsection 3.2.1). To further illustrate the interpretation of the GWVS, we finally consider specific examples of random processes:

The GWVS of a stationary and white process with correlation operator $\mathbf{R}_x = \mathbf{I}$ is given by $\overline{W}^{(\alpha)}_x(t,f) \equiv 1$. This shows that stationary white processes cover the entire TF plane in an ideally homogeneous manner.

For a stationary process with correlation function $r_x(t,t') = r_x(t-t')$, the GWVS equals the power spectral density for all $t$, $\overline{W}^{(\alpha)}_x(t,f) \equiv S_x(f)$.

For a (generally nonstationary) white process with correlation function $r_x(t,t') = q_x(t)\,\delta(t-t')$, the GWVS equals the mean instantaneous intensity $q_x(t)$ for all $f$, $\overline{W}^{(\alpha)}_x(t,f) \equiv q_x(t)$.
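The stationary example can be reproduced numerically. The sketch below assumes a cyclically stationary discrete process, i.e., a circulant $N \times N$ correlation matrix built from an arbitrary nonnegative power spectral density, and uses $\alpha = 1/2$ (the Rihaczek spectrum); all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
N = 16
psd = rng.uniform(0.5, 2.0, size=N)     # assumed power spectral density S_x[k] >= 0
r0 = np.fft.ifft(psd)                   # correlation function r_x[m] of the stationary process

n = np.arange(N)
R = r0[(n[:, None] - n[None, :]) % N]   # correlation matrix r_x[n, n'] = r_x[(n - n') mod N]

# Discrete GWVS for alpha = 1/2: W[n, k] = sum_m r_x[n, n - m] exp(-j 2 pi k m / N).
r_lag = np.stack([R[n, (n - m) % N] for m in range(N)], axis=1)
W = np.fft.fft(r_lag, axis=1)
```

Every row of `W` (every time instant) carries the same spectrum, namely the PSD, which is the discrete version of the statement that the GWVS of a stationary process does not depend on $t$.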
B.3.2 Generalized Evolutionary Spectrum

In 1965, Priestley [170] defined the evolutionary spectrum which is based on considerations related to the Karhunen-Loève expansion [1, 107, 136]. Another interpretation of the evolutionary spectrum is in terms of an innovations system representation of the underlying process. Based on this innovations system interpretation, the generalized evolutionary spectrum (GES) has been introduced recently [145, 147, 148]. Any nonstationary process can be modelled as output of an LTV innovations system $\mathbf{H}$ which is excited by stationary white noise $n(t)$,
\[ x(t) = (\mathbf{H}n)(t) = \int_{t'} h(t,t')\, n(t')\, dt', \qquad \text{with} \quad E\{ n(t)\, n^*(t') \} = \delta(t-t'). \tag{B.47} \]
Computing the correlation operator of $x(t)$ from (B.47) results in
\[ \mathbf{R}_x = \mathbf{H}\mathbf{H}^+. \tag{B.48} \]
There exists no unique innovations system since for any valid innovations system $\mathbf{H}$ and any system $\mathbf{U}$ satisfying $\mathbf{U}\mathbf{U}^+ = \mathbf{I}$, it can easily be checked that $\mathbf{H}\mathbf{U}$ is a valid innovations system as well. In analogy to the power spectral density of stationary processes, the GES can then be defined as squared magnitude of the innovations system's transfer function (i.e., GWS),
\[ G^{(\alpha)}_x(t,f) \triangleq \big| L^{(\alpha)}_{\mathbf{H}}(t,f) \big|^2. \tag{B.49} \]
To compute the GES, the so-called factorization problem has to be solved, i.e., one has to find a valid innovations system $\mathbf{H}$ satisfying (B.48) for given $\mathbf{R}_x$. Note that the GES is nonunique since different choices of the innovations system $\mathbf{H}$ yield different results for $G^{(\alpha)}_x(t,f)$. The evolutionary spectrum is a special case of the GES re-obtained with $\alpha = 1/2$. Another special case is given by the transitory evolutionary spectrum [49, 147, 148], which is dual to the evolutionary spectrum and re-obtained from the GES with $\alpha = -1/2$. A further special case, the Weyl spectrum [147, 148], is obtained by choosing $\alpha = 0$ and using the positive semidefinite root of $\mathbf{R}_x$ (see Appendix A) as innovations system.
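The nonuniqueness of the innovations system, and hence of the GES, can be made concrete with matrices. The sketch below is a finite-dimensional illustration under assumptions: correlation operators are $N \times N$ matrices with cyclic indexing, $\alpha = 1/2$ is used for the symbol, and the helper name `gws_half` is hypothetical. It builds two valid innovations systems for the same correlation matrix, the positive semidefinite root and a unitarily rotated version of it.

```python
import numpy as np

def gws_half(h):
    """Discrete GWS for alpha = 1/2: L[n, k] = sum_m h[n, (n - m) mod N] exp(-j 2 pi k m / N)."""
    N = h.shape[0]
    n = np.arange(N)
    h_lag = np.stack([h[n, (n - m) % N] for m in range(N)], axis=1)
    return np.fft.fft(h_lag, axis=1)

rng = np.random.default_rng(5)
N = 8
H0 = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
R = H0 @ H0.conj().T                               # some correlation matrix R_x

# Positive semidefinite root of R_x via its eigendecomposition.
lam, V = np.linalg.eigh(R)
H_root = (V * np.sqrt(np.clip(lam, 0.0, None))) @ V.conj().T

# Any unitary U gives another valid innovations system H_root U.
U, _ = np.linalg.qr(rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N)))
H_alt = H_root @ U

G_root = np.abs(gws_half(H_root)) ** 2             # GES from the PSD root
G_alt = np.abs(gws_half(H_alt)) ** 2               # GES from the rotated system
```

Both spectra integrate to the same total, since the HS norm of the innovations system is fixed by the trace of the correlation matrix, but their TF shapes generally differ.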
B.3.3 Physical Spectrum

Mark, in his 1970 paper [139], defined the physical spectrum as the expectation of the spectrogram (see (B.42)),
\[ \mathrm{PS}^{(g)}_x(t,f) \triangleq E\big\{ \mathrm{SPEC}^{(g)}_x(t,f) \big\} = \big\langle \mathbf{R}_x\, \mathbf{S}^{(1/2)}_{t,f}\, g,\; \mathbf{S}^{(1/2)}_{t,f}\, g \big\rangle. \tag{B.50} \]
It can readily be verified that (B.50) can be written as 2-D convolution of the GWVS and the GWD of the analysis window [139],
\[ \mathrm{PS}^{(g)}_x(t,f) = \int_{t'}\!\int_{f'} \overline{W}^{(\alpha)}_x(t',f')\, W^{(\alpha)*}_g(t'-t,\, f'-f)\, dt'\, df'. \tag{B.51} \]
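B.3.4 Generalized Expected Ambiguity Function

The generalized expected (cross) ambiguity function (GEAF) is defined as [118, 126]
\[ \overline{A}^{(\alpha)}_{x,y}(\tau,\nu) \triangleq \int_t r^{(\alpha)}_{x,y}(t,\tau)\, e^{-j2\pi\nu t}\, dt. \tag{B.52} \]
The cross correlation function $r_{x,y}(t,t')$ can be recovered from the GEAF by
\[ r_{x,y}(t,t') = \int_\nu \overline{A}^{(\alpha)}_{x,y}(t-t',\nu)\, e^{j2\pi\nu\left( (\frac{1}{2}+\alpha)\,t + (\frac{1}{2}-\alpha)\,t' \right)}\, d\nu. \tag{B.53} \]
Thus, for $y(t) = x(t)$ the GEAF is a complete second-order statistic of the random process $x(t)$. Under mild conditions, the GEAF can be shown to equal the expectation of the GAF in (B.43),
\[ \overline{A}^{(\alpha)}_{x,y}(\tau,\nu) = E\big\{ A^{(\alpha)}_{x,y}(\tau,\nu) \big\}. \]
Furthermore it equals the GSF (see (B.1)) of the correlation operator $\mathbf{R}_{x,y}$,
\[ \overline{A}^{(\alpha)}_{x,y}(\tau,\nu) = S^{(\alpha)}_{\mathbf{R}_{x,y}}(\tau,\nu). \]
Thus, it follows from (B.5) or from (B.44) that the members of the GEAF family obtained with different $\alpha$ values differ from each other merely by a phase factor,
\[ \overline{A}^{(\alpha_2)}_{x,y}(\tau,\nu) = \overline{A}^{(\alpha_1)}_{x,y}(\tau,\nu)\, e^{j2\pi(\alpha_1-\alpha_2)\tau\nu}. \tag{B.54} \]
Clearly, the properties of the GAF/GSF carry over to the GEAF. In particular, the (auto) GEAF $\overline{A}^{(\alpha)}_x(\tau,\nu) \triangleq \overline{A}^{(\alpha)}_{x,x}(\tau,\nu)$ features Hermitian symmetry,
\[ \overline{A}^{(\alpha)}_x(\tau,\nu) = \overline{A}^{(-\alpha)*}_x(-\tau,-\nu), \]
and attains its maximum at the origin,
\[ \big| \overline{A}^{(\alpha)}_x(\tau,\nu) \big| \le \overline{A}^{(\alpha)}_x(0,0) = \overline{E}_x, \tag{B.55} \]
with $\overline{E}_x \triangleq E\{ \|x\|_2^2 \} = \int_t r_x(t,t)\, dt$ the expected energy of $x(t)$. Furthermore, the GEAF is in 2-D Fourier correspondence with the GWVS (cf. (B.46)),
\[ \overline{A}^{(\alpha)}_{x,y}(\tau,\nu) = \int_t\!\int_f \overline{W}^{(\alpha)}_{x,y}(t,f)\, e^{-j2\pi(\nu t - \tau f)}\, dt\, df. \]
An important interpretation of the GEAF as TF correlation function is presented in Subsection 3.1.3. We finally consider some specific examples to clarify the interpretation of the GEAF (see Fig. B.2):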


Figure B.2: Schematic representation of the GEAF magnitude of some generic types of random processes: (a) stationary white process, (b) stationary (non-white) process, (c) white (nonstationary) process, (d) quasi-stationary process, (e) quasi-white process.

The GEAF of a stationary and white process with correlation operator $\mathbf{R}_x = \mathbf{I}$ is given by $\overline{A}^{(\alpha)}_x(\tau,\nu) = \delta(\tau)\,\delta(\nu)$. This is consistent with the fact that stationary white processes feature neither temporal correlations nor spectral correlations.

For a stationary process with correlation function $r_x(t,t') = r_x(t-t')$, we obtain $\overline{A}^{(\alpha)}_x(\tau,\nu) = r_x(\tau)\,\delta(\nu)$, which is consistent with the fact that stationary processes feature no spectral correlations.

For a (generally nonstationary) white process with correlation function $r_x(t,t') = q_x(t)\,\delta(t-t')$, the GEAF is given by $\overline{A}^{(\alpha)}_x(\tau,\nu) = Q_x(\nu)\,\delta(\tau)$, where $Q_x(\nu)$ is the Fourier transform of $q_x(t)$. This result is intuitive since white processes feature no temporal correlations.
C The Symplectic Group and Metaplectic Operators

"God exists since mathematics is consistent, and the Devil exists since we cannot prove it."
André Weil

Linear coordinate transforms in the time-frequency or time lag/frequency lag domain are important in time-frequency analysis. They can be associated to what is known in mathematics as the symplectic group and its metaplectic representation. The latter in essence yields a class of unitary operators (termed metaplectic operators) that includes the time-frequency scaling operator, the Fourier transform operator, and the chirp convolution and multiplication operators as special cases. In this appendix, we outline the definition and essential properties of the symplectic group and metaplectic operators. Furthermore, we study the relevance of these concepts to certain time-frequency representations of linear systems and random processes.
C.1 The Symplectic Group

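We first discuss the two-dimensional¹ symplectic group in a notation adapted to our purposes. More detailed discussions can be found in [46, 64, 154, 208].

A symplectic form on a vector space $V$ over a field $F$ is a function $f(x,y)$ (defined for all $x, y \in V$ and taking values in $F$) that satisfies
\[ f(a\,x_1 + b\,x_2,\, y) = a\, f(x_1,y) + b\, f(x_2,y) \qquad \text{and} \qquad f(y,x) = -f(x,y) \]
for all $x_1, x_2, x, y \in V$ and $a, b \in F$. Hence, $f(x,y)$ can be viewed as an antisymmetric bilinear form. For $V = \mathbb{R}^2$ and $F = \mathbb{R}$, it can be shown that any symplectic form $f(\mathbf{z}_1,\mathbf{z}_2)$ of two vectors $\mathbf{z}_1 = (\tau_1\ \nu_1)^T$, $\mathbf{z}_2 = (\tau_2\ \nu_2)^T$ can be written as
\[ f(\mathbf{z}_1,\mathbf{z}_2) = \gamma\, \mathbf{z}_1^T \mathbf{J}\, \mathbf{z}_2, \qquad \text{with} \quad \mathbf{J} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \quad \gamma \in \mathbb{R}. \]
(Note that $\mathbf{J}^T = \mathbf{J}^{-1} = -\mathbf{J}$.) The standard symplectic form is obtained with $\gamma = 1$ and denoted by [64]
\[ [\mathbf{z}_1,\mathbf{z}_2] \triangleq \mathbf{z}_1^T \mathbf{J}\, \mathbf{z}_2 = \nu_1\tau_2 - \tau_1\nu_2. \]
The set of real-valued $2 \times 2$ matrices $\mathbf{A} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ preserving the symplectic form in the sense that
\[ [\mathbf{A}\mathbf{z}_1, \mathbf{A}\mathbf{z}_2] = [\mathbf{z}_1, \mathbf{z}_2] \]
is referred to as symplectic group² [64, 154] and denoted by $Sp$. Any $\mathbf{A} \in Sp$ will be called symplectic matrix. The following equivalences can be shown:
\[ \mathbf{A} \in Sp \iff \mathbf{A}^T \mathbf{J} \mathbf{A} = \mathbf{J}, \qquad \mathbf{A} \in Sp \iff \mathbf{A}^{-1} = \mathbf{J} \mathbf{A}^T \mathbf{J}^{-1} = \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}, \qquad \mathbf{A} \in Sp \iff \det \mathbf{A} = 1. \]
The last property shows that the symplectic group consists of all matrices that correspond to area-preserving linear coordinate transforms $\mathbf{z}' = \mathbf{A}\mathbf{z}$.

We next consider some important subsets of the symplectic group $Sp$. In particular, it can be shown that the matrices
\[ \mathbf{B}_b = \begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix}, \qquad \mathbf{C}_c = \begin{pmatrix} 1 & 0 \\ c & 1 \end{pmatrix}, \qquad \mathbf{D}_d = \begin{pmatrix} 1/d & 0 \\ 0 & d \end{pmatrix}, \qquad \text{and} \qquad \mathbf{R}_\phi = e^{\mathbf{J}\phi} = \begin{pmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{pmatrix} \tag{C.1} \]
(with $b, c, d, \phi \in \mathbb{R}$, $|d| > 0$) generate subgroups of $Sp$. For reasons of proper physical dimension, we will sometimes use a modified version of $\mathbf{R}_\phi$ that incorporates a normalization, i.e.,
\[ \mathbf{R}^{(T)}_\phi \triangleq \begin{pmatrix} \cos\phi & -T^2 \sin\phi \\ (\sin\phi)/T^2 & \cos\phi \end{pmatrix} = \mathbf{D}_{1/T}\, \mathbf{R}_\phi\, \mathbf{D}_T. \]
Any $\mathbf{A} = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in Sp$ can be written as a decomposition of the three³ matrices $\mathbf{B}_b$, $\mathbf{C}_c$, and $\mathbf{D}_d$ in (C.1) in at least one of the following forms [64, 162]:
\[ \begin{aligned} a \neq 0:& \quad \mathbf{A} = \mathbf{C}_{c_1} \mathbf{D}_{1/a} \mathbf{B}_{b_1}, && \text{with } b_1 = \frac{b}{a},\ c_1 = \frac{c}{a}, && \text{(C.2a)} \\ b \neq 0:& \quad \mathbf{A} = \mathbf{C}_{c_1} \mathbf{B}_{b} \mathbf{C}_{c_2}, && \text{with } c_1 = \frac{d-1}{b},\ c_2 = \frac{a-1}{b}, && \text{(C.2b)} \\ c \neq 0:& \quad \mathbf{A} = \mathbf{B}_{b_1} \mathbf{C}_{c} \mathbf{B}_{b_2}, && \text{with } b_1 = \frac{a-1}{c},\ b_2 = \frac{d-1}{c}, && \text{(C.2c)} \\ d \neq 0:& \quad \mathbf{A} = \mathbf{B}_{b_1} \mathbf{D}_{d} \mathbf{C}_{c_1}, && \text{with } b_1 = \frac{b}{d},\ c_1 = \frac{c}{d}. && \text{(C.2d)} \end{aligned} \]
These decompositions will be important in the following. Note that there exists no unique decomposition of an arbitrary symplectic matrix $\mathbf{A}$ [64, 162].

¹ The symplectic group is generally considered for $\mathbb{R}^{2n}$ [64]. For our purposes, the case $n = 1$ is sufficient.
² The corresponding group operation is matrix multiplication, the identity element equals the identity matrix, and the inverse element is given by the matrix inverse.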
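The symplectic condition and the four decompositions (C.2a) to (C.2d) are easy to verify numerically. The following sketch uses hypothetical helper names (`B`, `C`, `D`, `is_symplectic`, `decompose_*`) and one example matrix with unit determinant whose entries are all nonzero, so that every decomposition applies. The sign convention chosen for `J` is an assumption; the symplectic test itself gives the same answer for either sign.

```python
import numpy as np

J = np.array([[0.0, -1.0], [1.0, 0.0]])   # assumed sign convention for J

def B(b):
    return np.array([[1.0, b], [0.0, 1.0]])

def C(c):
    return np.array([[1.0, 0.0], [c, 1.0]])

def D(d):
    return np.array([[1.0 / d, 0.0], [0.0, d]])

def is_symplectic(A):
    """Check A^T J A = J, which for 2x2 real matrices is equivalent to det A = 1."""
    return np.allclose(A.T @ J @ A, J)

def decompose_a(A):   # (C.2a), requires a != 0
    (a, b), (c, d) = A
    return C(c / a) @ D(1.0 / a) @ B(b / a)

def decompose_b(A):   # (C.2b), requires b != 0
    (a, b), (c, d) = A
    return C((d - 1.0) / b) @ B(b) @ C((a - 1.0) / b)

def decompose_c(A):   # (C.2c), requires c != 0
    (a, b), (c, d) = A
    return B((a - 1.0) / c) @ C(c) @ B((d - 1.0) / c)

def decompose_d(A):   # (C.2d), requires d != 0
    (a, b), (c, d) = A
    return B(b / d) @ D(d) @ C(c / d)

A = np.array([[2.0, 3.0], [1.0, 2.0]])    # det A = 2*2 - 3*1 = 1
reconstructions = [f(A) for f in (decompose_a, decompose_b, decompose_c, decompose_d)]
```

The four factorizations use different parameter triples for the same matrix, which illustrates the nonuniqueness noted above.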
C.2 Metaplectic Operators

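To each symplectic matrix $\mathbf{A} \in Sp$, one can associate in a unique manner a unitary operator $\mathbf{U} = \mu(\mathbf{A})$ on $L_2(\mathbb{R})$ (for details see [64]). The mapping $\mathbf{A} \to \mu(\mathbf{A})$ is termed the metaplectic representation of the symplectic group $Sp$. We will refer to any unitary operator $\mathbf{U} = \mu(\mathbf{A})$ as metaplectic operator and denote the class of all $\mu(\mathbf{A})$ by $\mathcal{M}$. The group structure of $Sp$ implies the important composition property
\[ \mathbf{A}_1 \in Sp,\ \mathbf{A}_2 \in Sp \quad \Longrightarrow \quad \mu(\mathbf{A}_2 \mathbf{A}_1) = \mu(\mathbf{A}_1)\, \mu(\mathbf{A}_2). \tag{C.3} \]
Unfortunately, it is difficult to calculate $\mu(\mathbf{A})$ explicitly for arbitrary $\mathbf{A}$. However, it is comparatively simple to determine $\mu(\mathbf{A})$ for the following special cases [64, 162] (see also Table C.1):

Scaling ($\mathbf{A} = \mathbf{D}_d$): The metaplectic operator $\mathcal{D}_d = \mu(\mathbf{D}_d)$ corresponding to $\mathbf{D}_d$ (with $|d| > 0$) performs a scaling according to
\[ (\mathcal{D}_d\, x)(t) = \frac{1}{\sqrt{|d|}}\; x\!\left( \frac{t}{d} \right). \]

Fourier Transform ($\mathbf{A} = \mathbf{R}^{(T)}_{\pi/2}$): The metaplectic operator $\mathcal{F}^{(T)} = \mu\big( \mathbf{R}^{(T)}_{\pi/2} \big)$ corresponding to $\mathbf{R}^{(T)}_{\pi/2} = \mathbf{D}_{1/T}\, \mathbf{R}_{\pi/2}\, \mathbf{D}_T = \mathbf{D}_{1/T}\, \mathbf{J}\, \mathbf{D}_T$ can be shown to act like a Fourier transform, i.e.,
\[ \big( \mathcal{F}^{(T)} x \big)(t) = \frac{1}{T}\; X\!\left( \frac{t}{T^2} \right). \]

³ Note that due to $\det \mathbf{A} = 1$, any symplectic matrix $\mathbf{A}$ has only three free parameters.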
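| | TF scaling | Fourier transform | Chirp multiplication | Chirp convolution |
| $\mathbf{A}$ | $\mathbf{D}_d = \begin{pmatrix} 1/d & 0 \\ 0 & d \end{pmatrix}$ | $\mathbf{R}^{(T)}_{\pi/2} = \begin{pmatrix} 0 & -T^2 \\ 1/T^2 & 0 \end{pmatrix}$ | $\mathbf{C}_c = \begin{pmatrix} 1 & 0 \\ c & 1 \end{pmatrix}$ | $\mathbf{B}_b = \begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix}$ |
| $\big( \mu(\mathbf{A})\, x \big)(t)$ | $\frac{1}{\sqrt{|d|}}\, x\big( \frac{t}{d} \big)$ | $\frac{1}{T}\, X\big( \frac{t}{T^2} \big)$ | $e^{-j\pi c t^2}\, x(t)$ | $x(t) * \frac{1}{\sqrt{|b|}}\, e^{-j\pi t^2/b}$ |

Table C.1: Basic symplectic matrices and associated metaplectic operators.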
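Chirp Convolution ($\mathbf{A} = \mathbf{B}_b$): The metaplectic operator $\mathcal{B}_b = \mu(\mathbf{B}_b)$ corresponding to $\mathbf{B}_b$ acts as an LTI system with impulse response $\frac{1}{\sqrt{|b|}}\, e^{-j\pi t^2/b}$, i.e.,
\[ (\mathcal{B}_b\, x)(t) = \frac{1}{\sqrt{|b|}} \int_{t'} x(t')\, e^{-j\pi (t-t')^2/b}\, dt'. \]

Chirp Multiplication ($\mathbf{A} = \mathbf{C}_c$): The metaplectic operator $\mathcal{C}_c = \mu(\mathbf{C}_c)$ corresponding to $\mathbf{C}_c$ performs a multiplication by $e^{-j\pi c t^2}$,
\[ (\mathcal{C}_c\, x)(t) = x(t)\, e^{-j\pi c t^2}. \]
Note that this corresponds to an LFI system that is dual to the LTI system $\mathcal{B}_b$.

Since any symplectic matrix $\mathbf{A} \in Sp$ can be decomposed according to at least one of the equations in (C.2), the foregoing simple unitary operators can be combined according to (C.3) to describe the action of any $\mu(\mathbf{A})$. According to (C.2), the following cases have to be distinguished.

Case 1 ($a \neq 0$): For $a \neq 0$, we can apply (C.3) twice to (C.2a) in order to obtain
\[ \mu(\mathbf{A}) = \mu(\mathbf{C}_{c_1} \mathbf{D}_{1/a} \mathbf{B}_{b_1}) = \mathcal{B}_{b_1}\, \mathcal{D}_{1/a}\, \mathcal{C}_{c_1}, \qquad \text{with} \quad b_1 = \frac{b}{a}, \quad c_1 = \frac{c}{a}. \]
The action of this operator can be compactly written as
\[ \big( \mu(\mathbf{A})\, x \big)(t) = \sqrt{\frac{|a|}{|b_1|}} \int_{t'} e^{-j\pi (t-t')^2/b_1}\, x(a t')\, e^{-j\pi c_1 (a t')^2}\, dt'. \]

Case 2 ($b \neq 0$): In the case $b \neq 0$, application of (C.3) to (C.2b) yields
\[ \mu(\mathbf{A}) = \mu(\mathbf{C}_{c_1} \mathbf{B}_{b} \mathbf{C}_{c_2}) = \mathcal{C}_{c_2}\, \mathcal{B}_{b}\, \mathcal{C}_{c_1}, \qquad \text{with} \quad c_1 = \frac{d-1}{b}, \quad c_2 = \frac{a-1}{b}. \]
The action of this operator can be compactly written as
\[ \big( \mu(\mathbf{A})\, x \big)(t) = e^{-j\pi c_2 t^2}\, \frac{1}{\sqrt{|b|}} \int_{t'} x(t')\, e^{-j\pi c_1 t'^2}\, e^{-j\pi (t-t')^2/b}\, dt'. \]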
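Case 3 ($c \neq 0$): The case $c \neq 0$ is dual to the case $b \neq 0$. Here, applying (C.3) to (C.2c) gives
\[ \mu(\mathbf{A}) = \mu(\mathbf{B}_{b_1} \mathbf{C}_{c} \mathbf{B}_{b_2}) = \mathcal{B}_{b_2}\, \mathcal{C}_{c}\, \mathcal{B}_{b_1}, \qquad \text{with} \quad b_1 = \frac{a-1}{c}, \quad b_2 = \frac{d-1}{c}. \]
This operator acts as
\[ \big( \mu(\mathbf{A})\, x \big)(t) = \frac{1}{\sqrt{|b_1 b_2|}} \int_{t_1}\!\int_{t_2} x(t_2)\, e^{-j\pi (t_1-t_2)^2/b_1}\, e^{-j\pi (t-t_1)^2/b_2}\, e^{-j\pi c\, t_1^2}\, dt_1\, dt_2. \]

Case 4 ($d \neq 0$): Finally, for $d \neq 0$, application of (C.3) to (C.2d) yields the operator
\[ \mu(\mathbf{A}) = \mu(\mathbf{B}_{b_1} \mathbf{D}_{d} \mathbf{C}_{c_1}) = \mathcal{C}_{c_1}\, \mathcal{D}_{d}\, \mathcal{B}_{b_1}, \qquad \text{with} \quad b_1 = \frac{b}{d}, \quad c_1 = \frac{c}{d}, \]
which acts according to
\[ \big( \mu(\mathbf{A})\, x \big)(t) = e^{-j\pi c_1 t^2}\, \frac{1}{\sqrt{|d\, b_1|}} \int_{t'} x(t')\, e^{-j\pi (t/d - t')^2/b_1}\, dt'. \]

We finally note that the fractional Fourier transform (see [2] and the references therein) is a specialization of the foregoing expressions for $\mu(\mathbf{A})$ to the case $\mathbf{A} = \mathbf{R}^{(T)}_\phi$, i.e., for $\mathcal{F}^{(T)}_\phi = \mu\big( \mathbf{R}^{(T)}_\phi \big)$ we have
\[ \big( \mathcal{F}^{(T)}_\phi\, x \big)(t) = \begin{cases} \dfrac{1}{T\sqrt{|\sin\phi|}} \displaystyle\int_{t'} x(t')\, e^{j\pi \frac{\cos\phi}{\sin\phi} \frac{t^2 + t'^2}{T^2}}\, e^{-j2\pi \frac{t\, t'}{T^2 \sin\phi}}\, dt', & \sin\phi \neq 0, \\[2mm] x(t), & \phi = 0, 2\pi, \ldots, \\[1mm] x(-t), & \phi = \pi, 3\pi, \ldots. \end{cases} \]

C.3 Effects on Time-Frequency Representations

We next consider the behavior of TF representations of LTV systems and nonstationary random processes when the system/process is subjected to unitary transformations by a metaplectic operator.

LTV Systems. We first consider unitarily transformed LTV systems $\widetilde{\mathbf{H}} = \mathbf{U}\mathbf{H}\mathbf{U}^+$. It can be shown that for $\mathbf{U} = \mu(\mathbf{A})$ the spreading function (i.e., the GSF with $\alpha = 0$, see Subsection B.1.1) of $\widetilde{\mathbf{H}}$ can be written as
\[ S^{(0)}_{\widetilde{\mathbf{H}}}(\tau,\nu) = S^{(0)}_{\mathbf{H}}\big( \mathbf{A}(\tau,\nu) \big) = S^{(0)}_{\mathbf{H}}(a\tau + b\nu,\; c\tau + d\nu). \tag{C.4} \]
We emphasize that this property holds only for $\alpha = 0$. However, since $|S^{(\alpha)}_{\mathbf{H}}(\tau,\nu)|$ is independent of $\alpha$ (cf. Subsection B.1.1), there is also
\[ \big| S^{(\alpha)}_{\widetilde{\mathbf{H}}}(\tau,\nu) \big| = \big| S^{(\alpha)}_{\mathbf{H}}(a\tau + b\nu,\; c\tau + d\nu) \big| \qquad \text{for all } \alpha. \]
Similarly, the Weyl symbol (i.e., the GWS with $\alpha = 0$, see Subsection B.1.2) of $\widetilde{\mathbf{H}}$ can be shown to be given by
\[ L^{(0)}_{\widetilde{\mathbf{H}}}(t,f) = L^{(0)}_{\mathbf{H}}\big( \mathbf{A}(t,f) \big) = L^{(0)}_{\mathbf{H}}(at + bf,\; ct + df). \tag{C.5} \]
Again, this property holds only for the case $\alpha = 0$. From (C.4) and (C.5), we conclude that metaplectic transformations of LTV systems indeed correspond to area-preserving linear TF coordinate transforms. The following special cases are of particular practical importance (see also Fig. C.1):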
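Figure C.1: GSF magnitude of unitarily transformed systems: (a) original system, (b)-(f) systems transformed by (b) TF scaling, (c) Fourier transform, (d) chirp multiplication, (e) chirp convolution, (f) fractional Fourier transform.

Scaling: $\widetilde{\mathbf{H}} = \mathcal{D}_d\, \mathbf{H}\, \mathcal{D}_d^+$ $\Longrightarrow$ $S^{(0)}_{\widetilde{\mathbf{H}}}(\tau,\nu) = S^{(0)}_{\mathbf{H}}\big( \tfrac{\tau}{d},\, d\nu \big)$, $\quad L^{(0)}_{\widetilde{\mathbf{H}}}(t,f) = L^{(0)}_{\mathbf{H}}\big( \tfrac{t}{d},\, df \big)$.

Fourier transform: $\widetilde{\mathbf{H}} = \mathcal{F}^{(T)}\, \mathbf{H}\, \mathcal{F}^{(T)+}$ $\Longrightarrow$ $S^{(0)}_{\widetilde{\mathbf{H}}}(\tau,\nu) = S^{(0)}_{\mathbf{H}}\big( -T^2\nu,\, \tfrac{\tau}{T^2} \big)$, $\quad L^{(0)}_{\widetilde{\mathbf{H}}}(t,f) = L^{(0)}_{\mathbf{H}}\big( -fT^2,\, \tfrac{t}{T^2} \big)$.

Chirp convolution: $\widetilde{\mathbf{H}} = \mathcal{B}_b\, \mathbf{H}\, \mathcal{B}_b^+$ $\Longrightarrow$ $S^{(0)}_{\widetilde{\mathbf{H}}}(\tau,\nu) = S^{(0)}_{\mathbf{H}}(\tau + b\nu,\, \nu)$, $\quad L^{(0)}_{\widetilde{\mathbf{H}}}(t,f) = L^{(0)}_{\mathbf{H}}(t + bf,\, f)$.

Chirp multiplication: $\widetilde{\mathbf{H}} = \mathcal{C}_c\, \mathbf{H}\, \mathcal{C}_c^+$ $\Longrightarrow$ $S^{(0)}_{\widetilde{\mathbf{H}}}(\tau,\nu) = S^{(0)}_{\mathbf{H}}(\tau,\, \nu + c\tau)$, $\quad L^{(0)}_{\widetilde{\mathbf{H}}}(t,f) = L^{(0)}_{\mathbf{H}}(t,\, f + ct)$.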
Nonstationary Random Processes. The discussion of unitary metaplectic transformations of random processes parallels that of LTV systems given above. Indeed, the correlation operators of the random processes $x(t)$ and $\tilde x(t) = (\mathbf U x)(t)$ are related by $\mathbf R_{\tilde x} = \mathbf U \mathbf R_x \mathbf U^+$. By noting that the expected ambiguity function (i.e., the GEAF with $\alpha = 0$, see Subsection B.3.4) and the Wigner-Ville spectrum (i.e., the GWVS with $\alpha = 0$, see Subsection B.3.1) can be written as $A^{(0)}_x(\tau,\nu) = S^{(0)}_{\mathbf R_x}(\tau,\nu)$ and $W^{(0)}_x(t,f) = L^{(0)}_{\mathbf R_x}(t,f)$, respectively, the foregoing results in (C.4), (C.5) can be applied immediately:
\[
\tilde x(t) = \big(\mu(\mathbf A)\,x\big)(t) \quad\Longrightarrow\quad
\begin{cases}
A^{(0)}_{\tilde x}(\tau,\nu) = A^{(0)}_x\big(\mathbf A(\tau,\nu)\big) = A^{(0)}_x(a\tau + b\nu,\, c\tau + d\nu), \\[2pt]
W^{(0)}_{\tilde x}(t,f) = W^{(0)}_x\big(\mathbf A(t,f)\big) = W^{(0)}_x(at + bf,\, ct + df).
\end{cases}
\]
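A quick numerical sanity check of the Wigner-Ville relation can be made with a deterministic signal, for which the Wigner-Ville spectrum reduces to the Wigner distribution. The sketch below is illustrative only: the helper `wigner`, the Gaussian test signal, the parameter values, and the operator definitions $(\mathbf D_d x)(t) = x(t/d)/\sqrt{d}$ and $(\mathbf C_c x)(t) = x(t)\,e^{-j\pi c t^2}$ are assumptions made here, not definitions taken from the text.

```python
import numpy as np

def wigner(x, t, f, tau_max=12.0, n=4801):
    """Wigner distribution of the callable x at the TF point (t, f).

    Evaluated as a Riemann sum over the lag variable tau; adequate for
    rapidly decaying signals such as the Gaussian used below.
    """
    tau = np.linspace(-tau_max, tau_max, n)
    dtau = tau[1] - tau[0]
    integrand = (x(t + tau / 2) * np.conj(x(t - tau / 2))
                 * np.exp(-2j * np.pi * f * tau))
    return float(np.real(np.sum(integrand) * dtau))

def gauss(t):
    # Unit-energy Gaussian test signal (hypothetical choice).
    return 2 ** 0.25 * np.exp(-np.pi * t ** 2)

d, c = 1.7, 0.6  # scaling factor and chirp rate (arbitrary values)

def scaled(t):
    # Assumed TF scaling operator: (D_d x)(t) = x(t/d) / sqrt(d).
    return gauss(t / d) / np.sqrt(d)

def chirped(t):
    # Assumed chirp multiplication: (C_c x)(t) = x(t) exp(-j pi c t^2).
    return gauss(t) * np.exp(-1j * np.pi * c * t ** 2)

t0, f0 = 0.4, -0.3  # arbitrary test point in the TF plane

# Scaling: A = diag(1/d, d), so W_xt(t, f) should equal W_x(t/d, d f).
scaling_gap = abs(wigner(scaled, t0, f0) - wigner(gauss, t0 / d, d * f0))

# Chirp multiplication: A = [[1, 0], [c, 1]], so W_xt(t, f) = W_x(t, f + c t).
chirp_gap = abs(wigner(chirped, t0, f0) - wigner(gauss, t0, f0 + c * t0))

print(scaling_gap, chirp_gap)
```

Both gaps should be at the level of floating-point rounding. A check of this kind covers only the two operators whose definitions are assumed above and does not replace the general argument via (C.5).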
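Next, we consider the Weyl spectrum (i.e., the GES with $\alpha = 0$, see Subsection B.3.2) $G^{(0)}_x(t,f) = \big|L^{(0)}_{\mathbf H_x}(t,f)\big|^2$, where $\mathbf H_x$ is an innovations system of $x(t)$. If $\mathbf H_x$ is an innovations system of $x(t)$, it can easily be checked that $\tilde{\mathbf H}_x = \mu(\mathbf A)\, \mathbf H_x\, \mu(\mathbf A)^+$ is an innovations system of $\tilde x(t) = \big(\mu(\mathbf A)\,x\big)(t)$: since $\mu(\mathbf A)$ is unitary, $\tilde{\mathbf H}_x \tilde{\mathbf H}_x^+ = \mu(\mathbf A)\, \mathbf H_x \mathbf H_x^+\, \mu(\mathbf A)^+ = \mu(\mathbf A)\, \mathbf R_x\, \mu(\mathbf A)^+ = \mathbf R_{\tilde x}$. Thus, the Weyl spectrum of $\tilde x(t)$ using $\tilde{\mathbf H}_x$ can be related to the Weyl spectrum of $x(t)$ using $\mathbf H_x$ according to
\[
G^{(0)}_{\tilde x}(t,f) = G^{(0)}_x\big(\mathbf A(t,f)\big) = G^{(0)}_x(at + bf,\, ct + df).
\]
We emphasize that all of the above relations for the GWVS, GEAF, and GES hold only for $\alpha = 0$. For convenience, we explicitly state the following special cases:
\[
\text{Scaling:}\quad \tilde x(t) = (\mathbf D_d\, x)(t) \quad\Longrightarrow\quad
\begin{cases}
A^{(0)}_{\tilde x}(\tau,\nu) = A^{(0)}_x\big(\tfrac{\tau}{d},\, d\nu\big) \\[2pt]
W^{(0)}_{\tilde x}(t,f) = W^{(0)}_x\big(\tfrac{t}{d},\, df\big) \\[2pt]
G^{(0)}_{\tilde x}(t,f) = G^{(0)}_x\big(\tfrac{t}{d},\, df\big)
\end{cases}
\]
\[
\text{Fourier transform:}\quad \tilde x(t) = (\mathbf F_T\, x)(t) \quad\Longrightarrow\quad
\begin{cases}
A^{(0)}_{\tilde x}(\tau,\nu) = A^{(0)}_x\big({-T^2}\nu,\, \tfrac{\tau}{T^2}\big) \\[2pt]
W^{(0)}_{\tilde x}(t,f) = W^{(0)}_x\big({-f}T^2,\, \tfrac{t}{T^2}\big) \\[2pt]
G^{(0)}_{\tilde x}(t,f) = G^{(0)}_x\big({-f}T^2,\, \tfrac{t}{T^2}\big)
\end{cases}
\]
\[
\text{Chirp convolution:}\quad \tilde x(t) = (\mathbf B_b\, x)(t) \quad\Longrightarrow\quad
\begin{cases}
A^{(0)}_{\tilde x}(\tau,\nu) = A^{(0)}_x(\tau + b\nu,\, \nu) \\[2pt]
W^{(0)}_{\tilde x}(t,f) = W^{(0)}_x(t + bf,\, f) \\[2pt]
G^{(0)}_{\tilde x}(t,f) = G^{(0)}_x(t + bf,\, f)
\end{cases}
\]
\[
\text{Chirp multiplication:}\quad \tilde x(t) = (\mathbf C_c\, x)(t) \quad\Longrightarrow\quad
\begin{cases}
A^{(0)}_{\tilde x}(\tau,\nu) = A^{(0)}_x(\tau,\, \nu + c\tau) \\[2pt]
W^{(0)}_{\tilde x}(t,f) = W^{(0)}_x(t,\, f + ct) \\[2pt]
G^{(0)}_{\tilde x}(t,f) = G^{(0)}_x(t,\, f + ct).
\end{cases}
\]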
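The coordinate maps in the special cases can be collected as $2 \times 2$ matrices $\mathbf A$ and checked for unit determinant, which is the area-preservation property noted after (C.5). The sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, the value of `T` is arbitrary, and the placement of the minus sign in the Fourier-transform matrix is an assumed convention (unit determinant requires exactly one off-diagonal entry to be negative).

```python
import numpy as np

def scaling(d):
    # (t, f) -> (t/d, d f)
    return np.array([[1.0 / d, 0.0], [0.0, d]])

def fourier(T):
    # (t, f) -> (-f T^2, t / T^2); the sign placement is an assumption.
    return np.array([[0.0, -T ** 2], [1.0 / T ** 2, 0.0]])

def chirp_convolution(b):
    # (t, f) -> (t + b f, f)
    return np.array([[1.0, b], [0.0, 1.0]])

def chirp_multiplication(c):
    # (t, f) -> (t, f + c t)
    return np.array([[1.0, 0.0], [c, 1.0]])

T = 1.5  # arbitrary normalization time for the example

cases = {
    "scaling": scaling(2.0),
    "fourier": fourier(T),
    "chirp_convolution": chirp_convolution(0.8),
    "chirp_multiplication": chirp_multiplication(-0.3),
}

# Every coordinate transform should have determinant 1 (area preserving).
dets = {name: float(np.linalg.det(A)) for name, A in cases.items()}

# Products of such matrices again have determinant 1. This symmetric
# product of two chirp multiplications around one chirp convolution
# reproduces the (assumed) Fourier-transform matrix.
composed = (chirp_multiplication(1.0 / T ** 2)
            @ chirp_convolution(-T ** 2)
            @ chirp_multiplication(1.0 / T ** 2))

print(dets)
print(np.allclose(composed, fourier(T)))
```

Because the product is symmetric in its outer factors, the check does not depend on the order in which the corresponding operators are taken to act.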
Bibliography
[1] V. R. Algazi and D. J. Sakrison, On the optimality of Karhunen-Loeve expansion, IEEE Trans. Inf. Theory, (1969), pp. 319321. [2] L. B. Almeida, The fractional Fourier transform and time-frequency representations, IEEE Trans. Signal Processing, 42 (1994), pp. 30843091. [3] M. Amin, Time-frequency spectrum analysis and estimation for non-stationary random processes, in Advances in Spectrum Estimation, B. Boashash, ed., Melbourne: Longman Cheshire, 1992, pp. 208232. [4] M. G. Amin, Spectral decomposition of timefrequency distribution kernels, IEEE Trans. Signal Processing, 42 (1994), pp. 11561165. [5] H. Arts, G. Matz, and F. Hlawatsch, An unbiased scattering function estimator for fast timee varying channels, in Proc. 2nd IEEE Workshop on Signal Processing Advances in Wireless Communications, Annapolis, MD, May 1999, pp. 411414. [6] , A novel unbiased scattering function estimator allowing data-driven operation, , (to be submitted).
[7] C. R. Baker, Optimum quadratic detection of a random vector in Gaussian noise, IEEE Trans. Comm. Technol., 14 (1966), pp. 802805. [8] M. J. Bastiaans, Application of the Wigner distribution function in optics, in The Wigner Distribution Theory and Applications in Signal Processing, W. Mecklenbruker and F. Hlawatsch, eds., Elsevier, a Amsterdam (The Netherlands), 1997, pp. 375424. [9] A. A. Beex and M. Xie, Time-varying ltering via multiresolution parametric spectral estimation, in Proc. IEEE ICASSP-95, Detroit, MI, May 1995, pp. 15651568. [10] R. T. Behrens and L. L. Scharf, Signal processing applications of oblique projection operators, IEEE Trans. Signal Processing, 42 (1994), pp. 14131424. [11] P. A. Bello, Characterization of randomly time-variant linear channels, IEEE Trans. Comm. Syst., 11 (1963), pp. 360393. [12] J. S. Bendat and A. G. Piersol, Measurement and Analysis of Random Data, Wiley, New York, 1966. [13] , Engineering Applications of Correlation and Spectral Analysis, Wiley, New York, 2nd ed., 1993.
[14] F. A. Berezin, Wick and anti-Wick operator symbols, Math. USSR Sb., 15 (1971), pp. 577606. [15] J. Bertrand and P. Bertrand, Symbolic calculus on the time-frequency half-plane, J. Math. Phys., 39 (1998), pp. 40714090. [16] B. Boashash, ed., Time-Frequency Signal Analysis, Melbourne: Longman Cheshire and New York: Wiley, 1992.
237
238
Bibliography
[17] J. F. Bohme and D. Konig, Statistical processing of car engine signals for combustion diagnosis, in Proc. IEEE-SP Workshop on Statistical Signal and Array Proc., Quebec, CA, June 1994, pp. 369374. [18] H. Bolcskei, Blind estimation of symbol timing and carrier frequency oset in pulse shaping OFDM systems, in Proc. IEEE ICASSP-99, Phoenix, Arizona, March 1999. to appear. [19] , Design of pulse shaping OFDM systems for mobile radio applications, in Proc. Second IEEE Workshop on Signal Processing Applications in Wireless Communication, Annapolis, MD, May 1999. submitted.
[20] H. Bolcskei, P. Duhamel, and R. Hleiss, Design of pulse shaping OFDM/OQAM systems for wireless communications with high spectral eciency, submitted to IEEE Trans. Signal Processing, (1998). [21] H. Bolcskei, P. Duhamel, and R. Hleiss, Design of pulse shaping OFDM/OQAM systems for high data-rate transmission over wireless channels, in IEEE-ICC99, Vancouver, Canada, June 1999, pp. 559 564. [22] H. Bolcskei and F. Hlawatsch, Oversampled modulated lter banks, in Gabor Analysis and Algorithms: Theory and Applications, H. G. Feichtinger and T. Strohmer, eds., Birkhuser, Boston, MA, a 1998, ch. 9, pp. 295322. [23] H. Bolcskei, F. Hlawatsch, and H. G. Feichtinger, Oversampled FIR and IIR DFT lter banks and Weyl-Heisenberg frames, in Proc. IEEE ICASSP-96, vol. 3, Atlanta, GA, May 1996, pp. 13911394. [24] R. Bourdier, J. F. Allard, and K. Trumpf, Eective frequency response and signal replica generation for ltering algorithms using multiplicative modications of the STFT, Signal Processing, 15 (1988), pp. 193201. [25] A. Calderon and R. Vaillancourt, On the boundedness of pseudodierential operators, J. Math. Soc. Japan, 23 (1971), pp. 374378. [26] S. Carstens-Behrens, M. Wagner, and J. F. Bohme, Detection of multiple resonances in noise, 52 (1998), pp. 285292. Int. J. Electron. Commun. (AEU), [27] S. Carstens-Behrens, M. Wagner, and J. F. Bohme, Improved knock detection by time-variant ltered structure-borne sound, in Proc. IEEE ICASSP-99, Phoenix, AZ, March 1999, pp. 22552258. [28] R. W. Chang, Synthesis of band-limited orthogonal signals for multi-channel data transmission, Bell Syst. Tech. J., 45 (1966), pp. 17751796. [29] J. S. Chow, J. C. Tu, and J. M. Cioffi, A discrete multitone transceiver system for HDSL applications, IEEE J. Sel. Areas Comm., 9 (1991), pp. 895908. [30] L. J. Cimini, Analysis and simulation of a digital mobile channel using orthogonal frequency division multiplexing, IEEE Trans. Comm., 33 (1985), pp. 665675. [31] T. A. C. M. Claasen and W. F. G. 
Mecklenbrauker, The Wigner distributionA tool for timefrequency signal analysis; Part I: Continuous-time signals, Part II: Discrete-time signals, Part III: Relations with other time-frequency signal transformations, Philips J. Research, 35 (1980), pp. 217250, 276300, and 372389. [32] M. Coates, Time-Frequency Modelling, PhD thesis, Department of Engineering, University of Cambridge, 1998. [33] L. Cohen, Generalized phase-space distribution functions, J. Math. Phys., 7 (1966), pp. 781786. [34] [35] , Quantization problem and variational principle in the phase-space formulation of quantum mechanics, J. Math. Phys., 17 (1976), pp. 18631866. , Time-Frequency Analysis, Prentice Hall, Englewood Clis (NJ), 1995.
Bibliography
239
[36] A. Cordoba and C. Fefferman, Wave packets and Fourier integral operators, Comm. Partial Di. Eq., 3 (1978), pp. 9791005. [37] D. Cox, Delay Doppler characteristics of multipath propagation in a suburban mobile radio environment, IEEE Trans. Antennas and Propagation, 20 (1972), pp. 625635. [38] H. Cramr, On some classes of nonstationary stochastic processes, in Proc. 4th Berkeley Symp. on Math. e Stat. and Prob., Univ. Calif. Press, 1961, pp. 5778. [39] R. E. Crochiere and L. R. Rabiner, Multirate Digital Signal Processing, Prentice Hall, Englewood Clis (NJ), 1983. [40] P. J. Cullen, P. C. Fannin, and A. Molina, Wide-band measurement and analysis techniques for the mobile radio channel, IEEE Trans. Vehicular Technology, 42 (1993), pp. 589603. [41] G. S. Cunningham and W. J. Williams, Kernel decomposition of time-frequency distributions, IEEE Trans. Signal Processing, 42 (1994), pp. 14251442. [42] I. Daubechies, Time-frequency localization operators: A geometric phase space approach, IEEE Trans. Inf. Theory, 34 (1988), pp. 605612. [43] , The wavelet transform, time-frequency localization and signal analysis, IEEE Trans. Inf. Theory, 36 (1990), pp. 9611005.
[44] I. Daubechies and T. Paul, Time-frequency localization operators: A geometric phase space approach II. The use of dilations, Inverse Problems, 4 (1988), pp. 661680. [45] N. G. de Bruijn, Uncertainty principles in Fourier analysis, in Inequalities, O. Shisha, ed., Academic Press, New York, 1967, pp. 5771. [46] , A theory of generalized functions, with applications to Wigner distribution and Weyl correspondence, Nieuw Archief voor Wiskunde (3), XXI (1973), pp. 205280.
[47] A. Dembo and D. Malah, Signal synthesis from modied discrete short-time transform, IEEE Trans. Acoust., Speech, Signal Processing, 36 (1988), pp. 168181. [48] V. Dmoulin and M. Pcot, Vector equalization: An alternative approach for OFDM systems, Annales e e des Telecommunications, 52 (1997), pp. 411. [49] C. S. Detka and A. El-Jaroudi, The transitory evolutionary spectrum, in Proc. IEEE ICASSP-94, Adelaide, Australia, April 1994, pp. 289292. [50] D. L. Donoho, M. Vetterli, R. A. DeVore, and I. Daubechies, Data compression and harmonic analysis, IEEE Trans. Inf. Theory, 44 (1998), pp. 24352477. [51] O. Edfors, M. Sandell, J.-J. van de Beek, S. Wilson, and P. Borjesson, OFDM channel estimation by singular value decomposition, IEEE Trans. Comm., 46 (1998), pp. 931939. [52] P. C. Fannin, A. Molina, S. S. Swords, and P. J. Cullen, Digital signal processing techniques applied to mobile radio channel sounding, Proc. IEE-F, 138 (1991), pp. 502508. [53] S. Farkash and S. Raz, Linear systems in Gabor time-frequency space, IEEE Trans. Signal Processing, 42 (1994), pp. 611617. [54] C. L. Fefferman, The uncertainty principle, Bull. Amer. Math. Soc., 9 (1983), pp. 129206. [55] H. G. Feichtinger and K. Grochenig, Gabor wavelets and the Heisenberg group: Gabor expansions and short time Fourier transform from the group theoretical point of view, in Wavelets A Tutorial in Theory and Applications, C. Chui, ed., Academic Press, Boston, 1992, pp. 359397. [56] H. G. Feichtinger and T. Strohmer, eds., Gabor Analysis and Algorithms: Theory and Applications, Birkhuser, Boston (MA), 1998. a
240
Bibliography
[57] P. Flandrin, On the positivity of the Wigner-Ville spectrum, Signal Processing, 11 (1986), pp. 187189. [58] P. Flandrin, A time-frequency formulation of optimum detection, IEEE Trans. Acoust., Speech, Signal Processing, 36 (1988), pp. 13771384. [59] P. Flandrin, Time-frequency receivers for locally optimum detection, in Proc. IEEE ICASSP-83, New York, 1988, pp. 27252728. [60] [61] , Time-dependent spectra for nonstationary stochastic processes, in Time and Frequency Representation of Signals and Systems, G. Longo and B. Picinbono, eds., Springer, Wien, 1989, pp. 69124. , Time-Frequency/Time-Scale Analysis, Academic Press, San Diego (CA), 1999.
[62] P. Flandrin, B. Escudi, and J. Gra, R`gles des correspondance et -produits dans le plan tempse e e frquence, C. R. Acad. Sc. Paris, srie I, 294 (1982), pp. 281284. e e [63] P. Flandrin and W. Martin, The Wigner-Ville spectrum of nonstationary random signals, in The Wigner Distribution Theory and Applications in Signal Processing, W. Mecklenbruker and a F. Hlawatsch, eds., Elsevier, Amsterdam (The Netherlands), 1997, pp. 211267. [64] G. B. Folland, Harmonic Analysis in Phase Space, vol. 122 of Annals of Mathematics Studies, Princeton University Press, Princeton (NJ), 1989. [65] G. B. Folland and A. Sitaram, The uncertainty principle: A mathematical survey, J. Fourier Anal. Appl., 3 (1997), pp. 207238. [66] D. Gabor, Theory of communication, J. IEE, 93 (1946), pp. 429457. [67] W. A. Gardner, Statistical Spectral Analysis, Prentice Hall, New Jersey, 1988. [68] I. M. Gelfand and G. E. Schilow, Verallgemeinerte Funktionen (Distributionen), VEB, Berlin, 1960. [69] I. C. Gohberg and M. G. Krein, Introduction to the Theory of Linear Nonselfadjoint Operators, Amer. Math. Soc., Providence (RI), 1969. [70] G. H. Golub and C. F. Van Loan, Matrix Computations, Johns Hopkins University Press, Baltimore, 2 ed., 1989. [71] O. D. Grace, Instantaneous power spectra, J. Acoust. Soc. Amer., 69 (1981), pp. 191198. [72] P. E. Green, Jr., Radar measurements of target scattering properties, in Radar Astronomy, J. V. Evans and T. Hagfors, eds., McGraw-Hill, New York, 1968, ch. 1. [73] A. Grossmann, G. Loupias, and E. M. Stein, An algebra of pseudodierential operators and quantum mechanics in phase space, Ann. Inst. Fourier, 18 (1968), pp. 343368. [74] T. Hagfors and J. M. Moran, Detection and estimation practices in radio and radar astronomy, Proc. IEEE, 58 (1970), pp. 743759. [75] L. Hanzo, W. Webb, and T. Keller, Single- and multi-carrier quadrature amplitude modulation, Wiley, Chichester, UK, 2000. [76] S. Haykin and D. J. 
Thomson, Signal detection in a nonstationary environment reformulated as an adaptive pattern classication problem, Proc. IEEE, 86 (1998), pp. 23252345. [77] R. D. Hippenstiel and P. M. De Oliveira, Time-varying spectral estimation using the instantaneous power spectrum (IPS), IEEE Trans. Acoust., Speech, Signal Processing, 38 (1990), pp. 17521759. [78] F. Hlawatsch, Duality and classication of bilinear time-frequency signal representations, IEEE Trans. Signal Processing, 39 (1991), pp. 15641574. [79] , Wigner distribution analysis of linear, time-varying systems, in Proc. IEEE ISCAS-92, San Diego, CA, May 1992, pp. 14591462.
Bibliography
241
[80] [81] [82]
, Time-frequency analysis and synthesis of linear signal spaces, Tech. Rep. #95-03, Institute of Communications and Radio-Frequency Engineering, Vienna University of Technology, Vienna, 1996. , Time-Frequency Analysis and Synthesis of Linear Signal Spaces: Time-Frequency Filters, Signal Detection and Estimation, and Range-Doppler Estimation, Kluwer, Boston, 1998. , The covariance principle in time-frequency analysis, in Encyclopedia of Signal Processing, A. Poularikas, ed., CRC Press, Boca Raton (FL), 2001.
[83] F. Hlawatsch and H. Bolcskei, Covariant time-frequency distributions based on conjugate operators, IEEE Signal Processing Letters, 3 (1996), pp. 4446. [84] F. Hlawatsch and G. F. Boudreaux-Bartels, Linear and quadratic time-frequency signal representations, IEEE Signal Processing Magazine, 9 (1992), pp. 2167. [85] F. Hlawatsch and P. Flandrin, The interference structure of the Wigner distribution and related time-frequency signal representations, in The Wigner Distribution Theory and Applications in Signal Processing, W. Mecklenbruker and F. Hlawatsch, eds., Elsevier, Amsterdam, The Netherlands, 1997, a pp. 59133. [86] F. Hlawatsch and W. Kozek, Time-frequency weighting and displacement eects in linear, timevarying systems, in Proc. IEEE ISCAS-92, San Diego, CA, May 1992, pp. 14551458. [87] [88] [89] , The Wigner distribution of a linear signal space, IEEE Trans. Signal Processing, 41 (1993), pp. 12481258. , Time-frequency projection lters and time-frequency signal expansions, IEEE Trans. Signal Processing, 42 (1994), pp. 33213334. , Second-order time-frequency synthesis of nonstationary random processes, IEEE Trans. Inf. Theory, 41 (1995), pp. 255267.
[90] F. Hlawatsch and G. Matz, Quadratic time-frequency analysis of linear time-varying systems, in Wavelet Transforms and Time-Frequency Signal Analysis, L. Debnath, ed., Birkhuser, Boston (MA), a 2000, ch. 9. [91] , Time-frequency characterization of random time-varying channels, in Encyclopedia of Signal Processing, A. Poularikas, ed., CRC Press, Boca Raton (FL), 2001.
[92] F. Hlawatsch, G. Matz, H. Kirchauer, and W. Kozek, Time-frequency formulation, design, and implementation of time-varying optimal lters for signal estimation, IEEE Trans. Signal Processing, 48 (2000), pp. 14171432. [93] F. Hlawatsch, G. Taubock, and T. Twaroch, Covariant time-frequency analysis, in Wavelets and Signal Processing, L. Debnath, ed., Birkhuser, Boston (MA), 2001, ch. 7. a [94] L. Hormander, The Weyl calculus of pseudo-dierential operators, Comm. Pure Appl. Math., 32 (1979), pp. 359443. [95] R. Howe, Quantum mechanics and partial dierential equations, J. Funct. Anal., 38 (1980), pp. 188254. [96] B. Iem, Generalization of the Weyl symbol and the spreading function via time-frequency warpings: Theory and applications, PhD thesis, Univ. Rhode Island, 1998. [97] B. Iem, A. Papandreou-Suppappola, and G. F. Boudreaux-Bartels, Time-frequency symbols for statistical signal processing, in Encyclopedia of Signal Processing, A. Poularikas, ed., CRC Press, Boca Raton (FL), 2001. [98] A. J. E. M. Janssen, On the locus and spread of pseudo-density functions in the time-frequency plane, Philips J. Research, 37 (1982), pp. 79110.
242
Bibliography
[99] [100] [101]
, Wigner weight functions and Weyl symbols of non-negative denite linear operators, Philips J. Research, 44 (1989), pp. 742. , Duality and biorthogonality for Weyl-Heisenberg frames, J. Fourier Analysis and Applications, 1 (1995), pp. 403436. , Positivity and spread of bilinear time-frequency distributions, in The Wigner Distribution Theory and Applications in Signal Processing, W. Mecklenbruker and F. Hlawatsch, eds., Elsevier, Amsterdam a (The Netherlands), 1997, pp. 158.
[102] T. Kailath, Channel characterization: time-variant dispersive channels, in Lectures on Communication Theory, E. J. Baghdady, ed., McGrawHill, 1961, pp. 95123. [103] [104] [105] [106] , Measurements on time-variant communication channels, IEEE Trans. Inf. Theory, 8 (1962), pp. 229236. , Report on progress in information theory in the USA, 19601963, Part IV: Time-variant communication channels, IEEE Trans. Inf. Theory, 9 (1963), pp. 233237. , Linear Systems, Prentice Hall, Englewood Clis (NJ), 1980. , Lectures on Wiener and Kalman Filtering, vol. 140 of CISM Courses and Lectures, Springer, Wien, 1981.
[107] K. Karhunen, Uber lineare Methoden in der Wahrscheinlichkeitsrechnung, Ann. Acad. Sci. Fennicae, ser. A (I), 18 (1947). [108] S. M. Kay, Fundamentals of Statistical Signal Processing: Detection Theory, Prentice Hall, Upper Saddle River (NJ), 1998. [109] R. S. Kennedy, Fading Dispersive Communication Channels, Wiley, New York, 1969. [110] H. A. Khan and L. F. Chaparro, Nonstationary Wiener ltering based on evolutionary spectral theory, in Proc. IEEE ICASSP-97, Munich, Germany, May 1997, pp. 36773680. [111] H. Kirchauer, F. Hlawatsch, and W. Kozek, Time-frequency formulation and design of nonstationary Wiener lters, in Proc. IEEE ICASSP-95, Detroit, MI, May 1995, pp. 15491552. [112] J. J. Kohn and L. Nirenberg, An algebra of pseudo-dierential operators, Comm. Pure Appl. Math., 18 (1965), pp. 269305. [113] D. Konig and J. F. Bohme, Application of cyclostationary and time-frequency signal analysis to car engine diagnosis, in Proc. IEEE ICASSP-94, Adelaide, Australia, April 1994, pp. 149152. [114] W. Kozek, On the generalized Weyl correspondence and its application to time-frequency analysis of linear time-varying systems, in Proc. IEEE-SP Int. Sympos. Time-Frequency Time-Scale Analysis, Victoria, Canada, Oct. 1992, pp. 167170. [115] [116] [117] , Matched generalized Gabor expansion of nonstationary processes, in Proc. 27th Asilomar Conf. Signals, Systems, Computers, Pacic Grove, CA, Nov. 1993, pp. 499503. , Optimally Karhunen-Loeve-like STFT expansion of nonstationary random processes, in Proc. IEEE ICASSP-93, Minneapolis, MN, April 1993, pp. 428431. , On the underspread/overspread classication of nonstationary random processes, in Proc. Int. Conf. Industrial and Applied Mathematics (Hamburg, July 1995), K. Kirchgssner, O. Mahrenholtz, a and R. Mennicken, eds., vol. 3 of Mathematical Research, Berlin, 1996, Akademieverlag, pp. 6366. , Matched Weyl-Heisenberg Expansions of Nonstationary Environments, PhD thesis, Vienna University of Technology, March 1997.
[118]
Bibliography
243
[119] [120]
, On the transfer function calculus for underspread LTV channels, IEEE Trans. Signal Processing, 45 (1997), pp. 219223. , Adaptation of Weyl-Heisenberg frames to underspread environments, in Gabor Analysis and Algorithms: Theory and Applications, H. G. Feichtinger and T. Strohmer, eds., Birkhuser, Boston (MA), a 1998, ch. 10, pp. 323352.
[121] W. Kozek and H. Feichtinger, Time-frequency structured decorrelation of speech signals via nonseparable Gabor frames, in Proc. IEEE ICASSP-97, Munich, Germany, 1997. [122] W. Kozek and H. G. Feichtinger, Time-frequency structured decorrelation of speech signals via nonseparable Gabor frames, in Proc. IEEE ICASSP-97, Munich, Germany, April 1997, pp. 14391442. [123] W. Kozek, H. G. Feichtinger, and J. Scharinger, Matched multiwindow methods for the estimation and ltering of nonstationary processes, in Proc. IEEE ISCAS-96, Atlanta, GA, May 1996, pp. 509512. [124] W. Kozek, H. G. Feichtinger, and T. Strohmer, Time-frequency synthesis of statistically matched Weyl-Heisenberg prototype signals, in Proc. IEEE Int. Symp. Time-Frequency/Time-Scale Analysis, Philadelphia (PA), Oct. 1994, pp. 417420. [125] W. Kozek and F. Hlawatsch, A comparative study of linear and nonlinear time-frequency lters, in Proc. IEEE-SP Int. Sympos. Time-Frequency Time-Scale Analysis, Victoria, Canada, Oct. 1992, pp. 163 166. [126] W. Kozek, F. Hlawatsch, H. Kirchauer, and U. Trautwein, Correlative time-frequency analysis and classication of nonstationary random processes, in Proc. IEEE-SP Int. Sympos. Time-Frequency Time-Scale Analysis, Philadelphia, PA, Oct. 1994, pp. 417420. [127] W. Kozek and A. F. Molisch, On the eigenstructure of underspread WSSUS channels, in Proc. IEEE Workshop on Signal Processing Applications in Wireless Communication, Paris, France, April 1997, pp. 325328. [128] , Nonorthogonal pulseshapes for multicarrier communications in doubly dispersive channels, IEEE J. Sel. Areas Comm., 16 (1998), pp. 15791589.
[129] W. Kozek and K. Riedel, Quadratic time-varying spectral estimation for underspread processes, in Proc. IEEE-SP Int. Sympos. Time-Frequency Time-Scale Analysis, Philadelphia, PA, Oct. 1994, pp. 460 463. [130] B. V. K. Kumar and K. J. deVos, Linear system description using Wigner distribution functions, in Proc. SPIE, Advanced Algorithms and Architectures for Signal Processing II, vol. 826, 1987, pp. 115 124. [131] P. Lander and E. J. Berbari, Time-frequency plane Wiener ltering of the high-resolution ECG: Background and time-frequency representations, IEEE Trans. Biomedical Engineering, 44 (1997), pp. 247 255. [132] , Time-frequency plane Wiener ltering of the high-resolution ECG: Development and application, IEEE Trans. Biomedical Engineering, 44 (1997), pp. 256265.
[133] B. LeFloch, M. Alard, and C. Berrou, Coded orthogonal frequency division multiplex, Proc. of IEEE, 83 (1995), pp. 982996. [134] Y. Li, L. Cimini, and N. Sollenberger, Robust channel estimation for OFDM systems with rapid dispersive fading channels, IEEE Trans. Comm., 46 (1998), pp. 902915. [135] Y. Lin and P. P. Vaidyanathan, Application of DFT lter banks and cosine modulated lter banks in ltering, in Proc. APCCAS-94, Taipei, Taiwan, 1994, pp. 254259. [136] M. Lo`ve, Probability Theory, Van Nostrand, Princeton (NJ), 3rd ed., 1962. e
244
Bibliography
[137] S. G. Mallat, A Wavelet Tour of Signal Processing, Academic Press, San Diego, 1998. [138] S. G. Mallat, G. Papanicolaou, and Z. Zhang, Adaptive covariance estimation of locally stationary processes, Annals of Stat., 26 (1998). [139] W. D. Mark, Spectral analysis of the convolution and ltering of non-stationary stochastic processes, J. Sound Vib., 11 (1970), pp. 1963. [140] W. Martin and P. Flandrin, Wigner-Ville spectral analysis of nonstationary processes, IEEE Trans. Acoust., Speech, Signal Processing, 33 (1985), pp. 14611470. [141] G. Matz and F. Hlawatsch, Time-frequency formulation and design of optimal detectors, in Proc. IEEE-SP Int. Sympos. Time-Frequency Time-Scale Analysis, Paris, France, June 1996, pp. 213216. [142] , Time-frequency formulation and design of optimal detectors in nonstationary environments, Tech. Rep. #96-05, Dept. of Communications and Radio-Frequency Engineering, Vienna University of Technology, Sept. 1996. , Time-frequency methods for signal detection with application to the detection of knock in car engines, in Proc. IEEE-SP Workshop on Statistical Signal and Array Proc., Portland, OR, Sept. 1998, pp. 196199. , Time-frequency transfer function calculus (symbolic calculus) of linear time-varying systems (linear operators) based on a generalized underspread theory, J. Math. Phys., Special Issue on Wavelet and TimeFrequency Analysis, 39 (1998), pp. 40414071. , Time-varying spectra for underspread and overspread nonstationary processes, in Proc. 32nd Asilomar Conf. Signals, Systems, Computers, Pacic Grove, CA, Nov. 1998, pp. 282286. , Time-frequency subspace detectors and application to knock detection, Int. J. Electron. Commun. (AEU), 53 (1999), pp. 379385.
[143] [144]
[145] [146]
[147] G. Matz, F. Hlawatsch, and W. Kozek, Weyl spectral analysis of nonstationary random processes, in IEEE UK Sympos. Applications of Time-Frequency and Time-Scale Methods, Univ. of Warwick, Coventry, UK, Aug. 1995, pp. pp. 120127. [148] , Generalized evolutionary spectral analysis and the Weyl spectrum of nonstationary random processes, IEEE Trans. Signal Processing, 45 (1997), pp. 15201534.
[149] G. Matz, A. F. Molisch, F. Hlawatsch, M. Steinbauer, and I. Gaspard, On the systematic measurement errors of correlative mobile radio channel sounders. Submitted to IEEE Trans. Comm. [150] G. Matz, A. F. Molisch, M. Steinbauer, F. Hlawatsch, I. Gaspard, and H. Arts, Bounds on e the systematic measurement errors of channel sounders for time-varying mobile radio channels, in Proc. IEEE VTC-99 Fall, Amsterdam, The Netherlands, Sept. 1999, pp. 14651470. [151] W. Mecklenbrauker and F. Hlawatsch, eds., The Wigner Distribution Theory and Applications in Signal Processing, Elsevier, Amsterdam (The Netherlands), 1997. [152] G. Mlard, Proprits du spectre volutif dun processus non-stationnaire, Ann. Inst. H. Poincar B, e ee e e XIV (1978), pp. 411424. [153] Y. Meyer, Wavelets and operators, in AMS Proc. of Symposia in Appl. Math.: Dierent Perspectives on Wavelets, I. Daubechies, ed., AMS, Providence (RI), 1993, pp. 3558. [154] W. Miller, Jr., Topics in harmonic analysis with applications to radar and sonar, in Radar and Sonar, Part I, R. E. Blahut, W. Miller, Jr., and C. H. Wilcox, eds., Springer, New York, 1991, pp. 66168. [155] F. Molinaro and F. Castani, A comparison of time-frequency methods, in Signal Processing V: e Theories and Applications, L. Torres, E. Masgrau, and M. A. Lagunas, eds., Amsterdam, 1990, Elsevier, pp. 145148.
Bibliography
245
[156] A. F. Molisch, G. Matz, and F. Hlawatsch, Systematic errors of calibrated correlative channel sounders and improved calibration. To be submitted. [157] S. H. Nawab and T. F. Quatieri, Short-time Fourier transform, in Advanced Topics in Signal Processing, J. S. Lim and A. V. Oppenheim, eds., Prentice Hall, Englewood Clis, NJ, 1988, ch. 6, pp. 289337. [158] A. W. Naylor and G. R. Sell, Linear Operator Theory in Engineering and Science, Springer, New York, 2nd ed., 1982. [159] K. Nowak, Some eigenvalue estimates for wavelet related Toeplitz operators, Colloquium Mathematicum, LXV, Fasc. 1 (1993), pp. 149156. [160] A. V. Oppenheim, A. S. Willsky, and I. T. Young, Signals and Systems, Prentice Hall, Englewood Clis (NJ), 1983. [161] C. H. Page, Instantaneous power spectra, J. Appl. Phys., 23 (1952), pp. 103106. [162] A. Papoulis, Signal Analysis, McGraw-Hill, Singapore, 1984. [163] , Probability, Random Variables, and Stochastic Processes, McGraw-Hill, New York, 3rd ed., 1991.
[164] J. D. Parsons, D. A. Demery, and A. M. D. Turkmani, Sounding techniques for wideband mobile radio channels: A review, Proc. IEE-I, 138 (1991), pp. 437446. [165] A. Peled and A. Ruiz, Frequency domain data transmission using reduced computational complexity algorithms, in Proc. IEEE ICASSP-80, Denver, CO, 1980, pp. 964967. [166] M. Poize, M. Renaudin, and P. Venier, The Gabor transform as a modulated lter bank system, in Quatorzi`me Colloque GRETSI, Juan-Les-Pins, France, Sept. 1993, pp. 351354. e [167] J. C. T. Pool, Mathematical aspects of the Weyl correspondence, J. Math. Phys., 7 (1966), pp. 6676. [168] H. V. Poor, An Introduction to Signal Detection and Estimation, Springer, New York, 1988. [169] M. R. Portnoff, Time-frequency representation of digital signals and systems based on short-time Fourier analysis, IEEE Trans. Acoust., Speech, Signal Processing, 28 (1980), pp. 5569. [170] M. B. Priestley, Evolutionary spectra and non-stationary processes, J. Roy. Stat. Soc. Ser. B, 27 (1965), pp. 204237. [171] , Spectral Analysis and Time Series Part II, Academic Press, London, 1981.
[172] J. G. Proakis, Digital Communications, McGraw-Hill, New York, 3rd ed., 1995. [173] S. Qian and D. Chen, Joint Time-Frequency Analysis, Prentice Hall, Englewood Clis (NJ), 1996. [174] J. Ramanathan and P. Topiwala, Time-frequency localization via the Weyl correspondence, SIAM J. Matrix Anal. Appl., 24 (1993), pp. 13781393. [175] , Time-frequency localization and the spectrogram, Applied and Computational Harmonic Analysis, 1 (1994), pp. 209215.
[176] T. S. Rappaport, Wireless Communications: Principles & Practice, Prentice Hall, Upper Saddle River, New Jersey, 1996. [177] K. Riedel, Optimal data-based kernel estimation of evolutionary spectra, IEEE Trans. Signal Processing, 41 (1993), pp. 24392447. [178] A. W. Rihaczek, Signal energy distribution in time and frequency, IEEE Trans. Inf. Theory, 14 (1968), pp. 369374. [179] R. Rochberg, Toeplitz and Hankel operators, wavelets, NWO sequences, and almost diagonalization of operators, Lecture Notes in Pure and Appl. Math., 51 (1990), pp. 425444.
246
Bibliography
[180] S. Salous, N. Nikandrou, and N. Bajj, Digital techniques for mobile radio chirp sounders, IEE Proc. Commun., 145 (1998), pp. 191–196.
[181] B. Samimy and G. Rizzoni, Mechanical signature analysis using time-frequency signal processing: Application to internal combustion engine knock detection, Proc. IEEE, 84 (1996), pp. 1330–1343.
[182] M. Sandell, Design and analysis of estimators for multicarrier modulation and ultrasonic imaging, PhD thesis, Lulea University of Technology, Lulea, Sweden, 1996.
[183] A. M. Sayeed and D. L. Jones, Optimal detection using bilinear time-frequency and time-scale representations, IEEE Trans. Signal Processing, 43 (1995), pp. 2872–2883.
[184] A. M. Sayeed and D. L. Jones, A canonical covariance-based method for generalized joint signal representations, IEEE Signal Processing Letters, 3 (1996), pp. 121–123.
[185] A. M. Sayeed and D. L. Jones, Optimal reduced-rank time-frequency/time-scale detectors, in Proc. IEEE-SP Int. Sympos. Time-Frequency Time-Scale Analysis, Paris, June 1996, pp. 209–212.
[186] A. M. Sayeed, P. Lander, and D. L. Jones, Improved time-frequency filtering of signal-averaged electrocardiograms, J. of Electrocardiology, 28 (1995), pp. 53–58.
[187] L. L. Scharf, Statistical Signal Processing, Addison Wesley, Reading (MA), 1991.
[188] L. L. Scharf and B. Friedlander, Matched subspace detectors, IEEE Trans. Signal Processing, 42 (1994), pp. 2146–2157.
[189] L. L. Scharf and J. B. Thomas, Wiener filters in canonical coordinates for transform coding, filtering, and quantizing, IEEE Trans. Signal Processing, 46 (1998), pp. 647–654.
[190] R. G. Shenoy and T. W. Parks, The Weyl correspondence and time-frequency analysis, IEEE Trans. Signal Processing, 42 (1994), pp. 318–331.
[191] J. A. Sills, Nonstationary Signal Modeling, Filtering, and Parameterization, PhD thesis, Georgia Institute of Technology, Atlanta, March 1995.
[192] J. A. Sills and E. W. Kamen, Wiener filtering of nonstationary signals based on spectral density functions, in Proc. 34th IEEE Conf. Decision and Control, Kobe, Japan, Dec. 1995, pp. 2521–2526.
[193] J. A. Sills and E. W. Kamen, Time-varying matched filters, Circuits, Systems, Signal Processing, 15 (1996), pp. 609–630.
[194] M. Skolnik, Radar Handbook, McGraw-Hill, New York, 1984.
[195] K. A. Sostrand, Mathematics of the time-varying channel, Proc. NATO Advanced Study Inst. on Signal Processing with Emphasis on Underwater Acoustics, 2 (1968), pp. 25.1–25.20.
[196] K. Swaminathan and P. P. Vaidyanathan, Theory and design of uniform DFT, parallel, quadrature mirror filter banks, IEEE Trans. Circuits and Systems, 33 (1986), pp. 1170–1191.
[197] C. W. Therrien, Discrete Random Signals and Statistical Signal Processing, Prentice Hall, Englewood Cliffs (NJ), 1992.
[198] R. Thoma, J. Steffens, and U. Trautwein, Statistical cross-terms in quadratic time-frequency distributions, in Proc. Int. Conf. on DSP, Nicosia, Cyprus, July 1993.
[199] D. J. Thomson, Multi-window spectrum estimation for non-stationary data, in Proc. IEEE-SP Workshop on Statistical Signal and Array Proc., Portland, OR, Sept. 1998, pp. 344–347.
[200] R. Tolimieri and M. An, Time-Frequency Representations, Birkhäuser, Boston, 1998.
[201] A. Vahlin and N. Holte, Optimal finite duration pulses for OFDM, IEEE Trans. Comm., 4 (1996), pp. 10–14.
[202] H. L. Van Trees, Detection, Estimation, and Modulation Theory, Part I: Detection, Estimation, and Linear Modulation Theory, Wiley, New York, 1968.
[203] H. L. Van Trees, Detection, Estimation, and Modulation Theory, Part III: Radar-Sonar Signal Processing and Gaussian Signals in Noise, Krieger, Malabar (FL), 1992.
[204] M. Vetterli, A theory of multirate filter banks, IEEE Trans. Acoust., Speech, Signal Processing, 35 (1987), pp. 356–372.
[205] J. Ville, Théorie et applications de la notion de signal analytique, Câbles & Transm., 2ème A. (1948), pp. 61–74.
[206] A. Voros, An algebra of pseudodifferential operators and the asymptotics of quantum mechanics, J. Funct. Anal., 29 (1978), pp. 104–132.
[207] M. Wagner, E. Karlsson, D. Konig, and C. Tork, Time variant system identification for car engine signal analysis, in Proc. EUSIPCO-94, Edinburgh, Sept. 1994, pp. 1409–1412.
[208] A. Weil, Sur certains groupes d'opérateurs unitaires, Acta Math., 111 (1964), pp. 143–211.
[209] S. B. Weinstein and P. M. Ebert, Data transmission by frequency division multiplexing using the discrete Fourier transform, IEEE Trans. Comm. Tech., 19 (1971), pp. 628–634.
[210] L. B. White and B. Boashash, Cross spectral analysis of nonstationary processes, IEEE Trans. Inf. Theory, 36 (1990), pp. 830–835.
[211] E. P. Wigner, On the quantum correction for thermodynamic equilibrium, Phys. Rev., 40 (1932), pp. 749–759.
[212] C. H. Wilcox, The synthesis problem for radar ambiguity functions, in Radar and Sonar, Part I, R. E. Blahut, W. Miller, Jr., and C. H. Wilcox, eds., Springer, New York, 1991, pp. 229–260.
[213] P. M. Woodward, Probability and Information Theory with Application to Radar, Pergamon Press, London, 1953.
[214] L. A. Zadeh, Frequency analysis of variable networks, Proc. of IRE, 76 (1950), pp. 291–299.
[215] W. Y. Zou and Y. Wu, COFDM: An overview, IEEE Trans. Broadc., 41 (1995), pp. 1–8.
List of Abbreviations
BFDM    biorthogonal frequency division multiplexing
CL      correlation-limited
DL      displacement-limited
EVD     eigenvalue decomposition
GAF     generalized ambiguity function
GEAF    generalized expected ambiguity function
GES     generalized evolutionary spectrum
GSF     generalized spreading function
GWD     generalized Wigner distribution
GWS     generalized Weyl symbol
GWVS    generalized Wigner-Ville spectrum
HS      Hilbert-Schmidt
ICI     interchannel interference
ISI     intersymbol interference
KL      Karhunen-Loève
LFI     linear frequency-invariant
LTI     linear time-invariant
LTV     linear time-varying
OFDM    orthogonal frequency division multiplexing
PSD     power spectral density
ROC     receiver operating characteristics
STFT    short-time Fourier transform
SVD     singular value decomposition
TF      time-frequency

