The material you are about to see is protected under copyright and other applicable proprietary rights. You are given permission to use this material provided that it is used solely for internal and personal purposes and is not re-sold for monetary gain. The material is provided "as is", "as available", and "with all faults", and it is provided without warranties of any sort. This material should be treated like other self-published works of science or engineering because it may contain any number of honest mistakes and oversights. Facts should be checked against other sources before being relied on, and the assumptions and mathematical manipulations used to derive formulas should be scrutinized carefully for correctness and applicability before being trusted. --The Author

Performance of Standard

Fourier-Transform Spectrometers
(or, more than you probably
wanted to know about Fourier
transforms, random-signal theory,
and Michelson interferometers)
by Douglas Cohen
Volume One, Chapters 1-4
Performance Analysis of Standard Fourier-Transform
Spectrometers

Copyright © 2007 by Douglas Cohen

PRINTED IN THE UNITED STATES OF AMERICA

All rights reserved. No part of this publication may be reproduced,
stored in a retrieval system, or transmitted in any form or by any
means, electronic, mechanical, photocopying, recording, or otherwise,
without the prior permission of the author.

To Sophie and Phoebe who do not know calculus,
and to Clara who does

CONTENTS

Preface

1 Ether Wind, Spectral Lines, and Michelson Interferometers
1.1 The First Michelson Interferometer
1.2 Historical Reasoning Behind the Ether-Wind Experiment
1.3 Monochromatic Light and Spectral Lines
1.4 Applying the Michelson Interferometer to Spectral Lines
1.5 Interference Equation for the Ideal Michelson Interferometer
1.6 Fringe Patterns of Finite-Width Spectral Lines
1.7 Fourier-Transform Spectrometers
1.8 Laser-Based Control Systems

2 Fourier Theory
2.1 Basic Concept of a Fourier Transform
2.2 Fourier Sine and Cosine Transforms
2.3 Even, Odd, and Mixed Functions
2.4 Extended Sine and Cosine Transforms
2.5 Forward and Inverse Fourier Transforms
2.6 Fourier Transform as a Linear Operator
2.7 Mathematical Symmetries of the Fourier Transform
2.8 Basic Fourier Identities
2.9 Fourier Convolution Theorem
2.10 Fourier Transforms and Divergent Integrals
2.11 Generalized Functions
2.12 Generalized Limits
2.13 Fourier Transforms of Generalized Functions
2.14 The Delta Function
2.15 Derivative of the Delta Function
2.16 Fourier Transform of the Delta Function
2.17 Fourier Convolution Theorem with Generalized Functions
2.18 The Shah Function
2.19 Fourier Transform of the Shah Function
2.20 Fourier Series
2.21 Discrete Fourier Transform
2.22 Aliasing as an Error
2.23 Aliasing as a Tool
2.24 Sampling Theorem
2.25 Fourier Transforms in Two and Three Dimensions
Table 2.1
Table 2.2

3 Random Variables, Random Functions, and Power Spectra
3.1 Random and Nonrandom Variables
3.2 Random and Nonrandom Functions
3.3 Probability Density Distributions: Mean, Variance, Standard Deviation
3.4 The Expectation Operator
3.5 Independent and Dependent Random Variables
3.6 Analyzing Independent Random Variables
3.7 Large Numbers of Random Variables
3.8 Single-Variable Means from Multivariable Distributions
3.9 Analyzing Dependent Random Variables
3.10 Linearity of the Expectation Operator
3.11 The Central Limit Theorem
3.12 Averaging to Improve Experimental Accuracy
3.13 Mean, Autocorrelation, Autocovariance of Random Functions of Time
3.14 Ensembles
3.15 Stationary Random Functions
3.16 Gaussian Random Processes
3.17 Products of Two, Three, and Four Jointly Normal Random Variables
3.18 Ergodic Random Functions
3.19 Experimental Noise
3.20 The Power Spectrum
3.21 Random Inputs and Outputs of Linear Systems
3.22 The Sign of the Power Spectrum
3.23 The Power Spectrum and Fourier Transforms of Random Functions
3.24 The Multidimensional Wiener-Khinchin Theorem
3.25 Band-Limited White Noise
3.26 Even and Odd Components of Random Functions
3.27 Analyzing the Noise in Artificially Created Even Signals

4 From Maxwell's Equations to the Michelson Interferometer
4.1 Deriving the Electromagnetic Wave Equations
4.2 Electromagnetic Plane Waves
4.3 Monochromatic Wave Trains
4.4 Linear Polarization of Monochromatic Plane Waves
4.5 Transmitted Plane Waves
4.6 Reflected Plane Waves
4.7 Polychromatic Wave Fields
4.8 Angle-Wavenumber Transforms
4.9 Beam-Chopped and Direction-Chopped Radiation
4.10 Time-Chopped and Band-Limited Radiation
4.11 Top-Level Description of a Standard Michelson Interferometer
4.12 Monochromatic Plane Waves and Michelson Interferometers
4.13 Multiple Plane Waves and Michelson Interferometers
4.14 Energy Flux of Time-Chopped and Beam-Chopped Radiation Fields
4.15 Energy Flux of the Balanced Radiation Fields
4.16 Simplified Formulas for the Optical Power in the Balanced Signal
4.17 Energy Flux in the Unbalanced Radiation Fields
4.18 Simplified Formulas Describing Unbalanced Background Radiation
Appendix 4A
Appendix 4B
Appendix 4C
Appendix 4D
Appendix 4E
Appendix 4F

5 Description of Practical Interferometer Measurements
5.1 Radiometric Description of Electromagnetic Fields
5.2 Radiance Fields in Space
5.3 Radiance, Brightness, and the Inverse-Square Law
5.4 The Balanced Signal of a Michelson Interferometer
5.5 The Unbalanced Signal of a Michelson Interferometer
5.6 The Off-Axis Signal of a Michelson Interferometer
5.7 The Standard Michelson Interferometer with Central Detector
5.8 The Fore and Aft Optics
5.9 The Detector Signal
5.10 The Detector Circuit
5.11 The Effective Spectrum
5.12 Symmetries of the Interferogram Signal and Effective Spectrum
5.13 Background Radiation Inside a Standard Michelson Interferometer
5.14 Removing the Background Spectra
5.15 Double-Sided Interferograms
5.16 Apodization of Spectra
5.17 The Effect of a Finite Field of View
5.18 Single-Sided Interferograms
5.19 Calibration
5.20 Nonflat Optical Surfaces
5.21 An Example of How to Analyze Nonflat Optical Surfaces
5.22 Sampling the Interferogram Signal
5.23 Setting Up the Discrete Fourier Transform of the Sampled Signal
5.24 Oversampling the Interferogram
5.25 Undersampling the Interferogram
5.26 Off-Center Sampling of the Interferogram Signal
Appendix 5A
Appendix 5B
Appendix 5C

6 NEdN and Detector Noise
6.1 Definition of NEdN
6.2 Signal from the Spectral Radiance
6.3 Signal from the Background Radiance
6.4 Inverse Fourier Transform of the Background Radiance
6.5 Background Radiance, Total Error, and Signal Noise
6.6 Detector Noise
6.7 1/f Noise in Detectors
6.8 Avoidable and Unavoidable Noise in Double-Sided Signals
6.9 Passing the Detector Noise Through the Detector Circuit
6.10 Total Detector Noise in Double-Sided Signals
6.11 Measuring the Noise-Contaminated Spectrum
6.12 Characterizing the Detector Noise
6.13 Detector Noise with a Band-Limited, White-Noise Power Spectrum
6.14 An Example of Simulated Detector Noise in a Double-Sided Signal
6.15 Photon Noise in Detectors
6.16 Detector-Noise NEdN in Double-Sided Signals
6.17 Real and Imaginary Parts of the Detector Noise
6.18 Detector Noise in a Single-Sided Signal
6.19 Uncalibrated Spectra of Single-Sided Signals with Detector Noise
6.20 Calibrated Spectra of Single-Sided Signals with Detector Noise
6.21 Detector-Noise NEdN in a Single-Sided Signal
6.22 Detector Circuit as an Anti-Aliasing Filter
Appendix 6A
Appendix 6B

7 Mirror-Misalignment NEdN in Double-Sided Interferograms
7.1 Setting Up the Signal Equations
7.2 Specifying the Random Misalignment Angle of the Moving Mirror
7.3 χ-Based Signal Contaminated by Misalignment Noise
7.4 Misalignment Noise and the Detector Circuit (or Anti-Aliasing Filter)
7.5 Misalignment Noise in Uncalibrated Spectra of Double-Sided Signals
7.6 Calibrated Spectra Contaminated by Misalignment Noise
7.7 Avoidable and Unavoidable Mirror-Misalignment Noise in χ-Based Signals
7.8 Avoidable and Unavoidable Mirror-Misalignment Noise in the Signal Spectrum
7.9 Power Spectrum of $\tilde{n}^{(2\theta)}$
7.10 Calculating the Variance of δL
7.11 Formula for the Misalignment NEdN of Double-Sided Signals
7.12 Connection Between the $p_{\tilde{n}\tilde{n}}^{(2\theta)}$ Power Spectrum and the Power Spectra of $\tilde{\theta}_x$, $\tilde{\theta}_y$
7.13 The Shape of the $p_{\tilde{n}\tilde{n}}^{(2\theta)}$ Power Spectrum
7.14 The Size of the $p_{\tilde{n}\tilde{n}}^{(2\theta)}$ Power Spectrum
7.15 Simulated Misalignment Noise
Appendix 7A
Appendix 7B

8 The Sampling-Error NEdN in Double-Sided Interferograms
8.1 Noise-Free Signal at the A/D Converter
8.2 Sampling Noise at the A/D Converter
8.3 Power Spectrum and Autocorrelation Function of the Sampling Noise
8.4 Uncalibrated Spectral Signal
8.5 Calibrating the Spectral Signal Contaminated by Sampling Noise
8.6 Random Sampling Error in the Measured Spectrum
8.7 Calculating the NEdN from the Random Sampling Error
8.8 Black-Body Spectrum Contaminated by Sampling Noise
8.9 Sampling Noise and an Isolated Lorentz Emission Line
8.10 Error from Quasi-Static Sampling Noise
8.11 Comparing the Sampling-Error, Misalignment, and Detector NEdNs

Bibliography

PREFACE
Over the past three or four decades, Fourier-transform spectrometers based on Michelson
interferometers have become an ever more popular way to measure spectral radiance, especially
in the infrared region of the electromagnetic spectrum. The equations and formulas used to
characterize the performance of these instruments—how accurate they are and in what ways they
distort measured spectra—are usually presented in a very approximate form. It is easy to
understand why this is so: optical imperfections and random disturbances have to interact with
the Fourier transform before they affect the spectral measurement. Although engineering intuition
and simple statistics are often all that is needed to evaluate even the most complicated measuring
system, here they are not enough.
Fortunately the problem is not inherently very difficult, although the knowledge needed to
handle it is spread over the fields of optics, Fourier transforms, and random-signal theory. This
book, after briefly outlining the historical development of the Michelson interferometer, starts off
with an overview of both random signal theory and Fourier transform analysis. Maxwell’s
equations are then used to introduce the optical concepts required to understand Michelson
interferometers, leading to formulas for the balanced, unbalanced, and off-axis signals. This
analysis includes the effects of misaligned optics, polarized radiation, and nonuniform fields of
view; the formulas derived here contain all the information needed to construct professional-
quality computer simulations of these instruments. The typical distortions present in Fourier-
transform measurements are thoroughly analyzed, and there are detailed explanations of the
random measurement errors due to imperfect detectors, unsteady optical alignment, background
radiation, and mistakes in sampling the signal.
Optical engineers and scientists interested in evaluating the performance of Fourier-transform
spectrometers often face an unappealing choice between equations that are
too simple-minded and computer simulations that are too complicated and specific. The
convolution-based formulas presented here occupy the middle ground between these extremes—
sophisticated enough to give accurate, dependable answers and simple enough to be evaluated
without much trouble. All derivations are explained at length, making it easy to adapt them to the
nonstandard types of Michelson interferometers not covered here. By the end of the book, the
reader knows how to analyze nonideal Fourier-transform spectrometers operating in an imperfect
world.
1
ETHER WIND, SPECTRAL LINES, AND
MICHELSON INTERFEROMETERS
The Michelson interferometer is named after Albert Abraham Michelson, who designed and built
it in 1881 to detect the ether wind caused by the Earth’s orbital motion. Michelson’s attempt
failed; his interferometer, sensitive enough to detect stamping feet 100 meters away,¹ could not
detect the Earth's orbital motion. So important and difficult to explain was this result that
Michelson and Edward Morley repeated the experiment with a larger and more sensitive
interferometer in 1887. This second attempt, which is today called the Michelson-Morley
experiment, also yielded a negative result: The Earth’s motion could not be detected. The
Michelson-Morley experiment is one of the most important negative findings of 19th-century
science; it encouraged physics to discard the idea of a luminiferous ether and prepared the way
for Einstein’s relativity theories at the beginning of the 20th century.
The idea of a luminiferous ether—a plenum pervading both (transparent) matter and empty
space—had been widely accepted ever since Young and Fresnel established around 1820 that
light behaved like a transverse vibration or wavefield as it propagated past obstacles. There were
recognized difficulties with the concept; for example, the ether provided no detectable resistance
to the motion of material bodies yet was elastic enough to transmit light vibrations without
measurable energy loss. In the 1820s and ’30s, Poisson, Cauchy, and Green, famous
mathematical scientists, derived equations of motion for transverse waves in an elastic medium,
but when these equations were applied to the already known behavior of light, the results were at
best mixed.²
In 1867 James Clerk Maxwell modified the formulas describing the interdependent
behavior of electric and magnetic fields to make them a self-consistent set of equations; he
believed himself to be constructing a mechanical analogy for the ether. After showing that the
new set of equations predicted transverse electromagnetic waves traveling at the speed of light,
Maxwell not only asserted that light was a propagating electromagnetic disturbance, but he also
used his discovery to connect electric and magnetic properties to the behavior of the luminiferous
ether. It was not until 1888 that Hertz demonstrated experimentally that propagating
electromagnetic disturbances actually exist; and the optical community itself did not
acknowledge until 1896, with the discovery of the Lorentz-Zeeman effect, that light had to be


1
A. Michelson, “The Relative Motion of the Earth and the Luminiferous Ether,” American Journal of Science 22,
Series 2 (1881), p. 120–129.
2
E. Whittaker, A History of the Theories of Aether and Electricity, Vol. I, The Classical Theories (Thomas Nelson &
Sons, Ltd., New York, 1951), pp. 129–142.
1 · Ether Wind, Spectral Lines, and Michelson Interferometers
- 2 -
such a propagating electromagnetic wavefield.
3
So the ether concept was not only alive and well
at the time of Michelson’s experiments, but it could also be said, with the growing acceptance of
Maxwell’s equations to describe the behavior of the luminiferous ether, that it had never been
healthier.


3
D. Goldstein, Polarized Light, 2nd ed. (Marcel Dekker, Inc., New York, 2003), p. 298.
1.1 The First Michelson Interferometer
Figure 1.1(a) is a drawing of the instrument Michelson described in his 1881 paper, and Fig.
1.1(b) shows how the interferometer works. Incident light enters from the left, as shown by the
dark solid arrow, and hits a glass plate whose back is a partly reflecting, partly transmitting
surface. Ideally, half the incident light is transmitted through to mirror C and half is reflected up
to mirror D. Mirrors C and D then return the light to the beam splitter, as shown by the dashed
arrows. At the beam splitter, the light is again half transmitted and half reflected to send two
equal-intensity beams into the observer’s telescope. The light that is first transmitted and then
reflected at the beam splitter is called beam TR, and the light that is first reflected and then
transmitted at the beam splitter is called beam RT. These beams are drawn as two side-by-side
dotted arrows, but in reality they should be thought of as lying one on top of the other, filling the
same volume of space as they travel from the beam splitter to the telescope.
Michelson, thinking then in terms of 19th-century optical theory, would have regarded light as
transverse and elastic vibrations in the ether. The ether’s plane of vibration might be horizontal,
as shown in Fig. 1.2(a), or vertical, as shown in Fig. 1.2(b). It was assumed, in fact, that the ether
could undergo transverse vibrations in any plane at all—horizontal, vertical, or something in
between, as shown in Fig. 1.2(c)—although not all at the same time. At any given point in the
light beam, there could be only one plane of vibration, with different colors of light characterized
by different wavelengths of vibration. If a “snapshot” of a light beam could be taken, the plane of
vibration could well be changing along its length, as shown in Fig. 1.3(a). At some slightly later
time, the snapshot would show the same configuration advanced in the direction of propagation,
as shown in Fig. 1.3(b). White light, then as now, was taken to be a composite beam consisting of
many different wavelengths simultaneously traveling in the same direction. Different colors of
light correspond to disturbances of different wavelengths. Combining or adding together many
different-colored disturbances produces a total transverse vibration having no particular or unique
wavelength and with the plane of vibration free to change in an irregular fashion along the length
of the beam, as shown in Fig. 1.3(c). The situation depicted in Figs. 1.3(a)–1.3(c) is actually very
close to the physical models used today to explain the behavior of light; all we need to do is
accept Maxwell’s equations—but not Maxwell’s ether—and say that the sinusoidal curves in
Figs. 1.3(a)–1.3(c) describe the changing length and orientations of the tip of the wavefield's
oscillating electric or magnetic field vectors.⁴

FIGURE 1.1(a). The first Michelson interferometer.

Suppose length a in Fig. 1.1(b) is adjusted until the distance from mirror C to the beam splitter
is exactly the same as the distance from mirror D to the beam splitter. When monochromatic
light—that is, light having a unique wavelength—enters the interferometer as shown in Figs.
1.4(a) and 1.4(b), then the beams reflected from C and D recombine when leaving the
interferometer in such a way that their planes of vibration, as well as their state of oscillation,
exactly match. Since the planes of vibration match, we can disregard the planes' orientation and
just add together the two beams' sinusoidal curves. Figure 1.5(a) shows that if the RT and TR
beams line up exactly—as they must when the distances from mirrors C and D to the beam
splitter are equal—then the summed oscillation is a maximum because the two wavefields are in
phase. If the distances from mirrors C and D to the beam splitter are unequal, then beams RT and
TR shift with respect to each other, as shown in Figs. 1.5(b)–1.5(e). The two beams can be out of
phase by any fraction of a wavelength, depending on the amount of inequality in the two distances.

__________
⁴ See, for example, the discussion in Secs. 4.2 through 4.4 of Chapter 4. Figures 1.2(a) and 1.2(b) can be profitably compared to Figs. 4.5 and 4.6 in Chapter 4.
FIGURE 1.1(b). [Schematic of the interferometer: incident light strikes the partially reflective surface on the back of the beam splitter; mirror C lies beyond the compensator plate in one arm, mirror D at distance a in the other; beam TR is first transmitted then reflected at the beam splitter, beam RT first reflected then transmitted; both exit toward the observing telescope.]
FIGURE 1.2(a) and FIGURE 1.2(b). [Cuts in a transverse wavefield showing its vibrations in a plane perpendicular to the direction of propagation: one plane of vibration in 1.2(a), another in 1.2(b).]
FIGURE 1.2(c). [Three different planes of vibration for a transverse wavefield along its propagation direction.]
FIGURE 1.3(a) and FIGURE 1.3(b). [Snapshots of the vibration wavelength along the beam at two slightly different times.]
FIGURE 1.3(c). [White light—no unique wavelength.]
The closer this fraction is to one-half, the smaller the summed oscillation; and if they are out of
phase by exactly a half-wavelength, then their sum is zero and the combined beam disappears.
When one beam is shifted against the other by exactly one wavelength, and the planes of
vibration still match, then once again the monochromatic RT and TR beams are in phase,
producing a bright combined oscillation.⁵ There seems to be a real possibility that a
monochromatic beam cannot be used to confirm that mirrors C and D are the same distance from
the beam splitter, because the recombined exit beam may look the same as it does when no shift at
all exists if one wavefield is shifted against the other by one, two, etc., wavelengths.
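
The phase dependence just described is easy to check numerically. The following sketch is an illustration added to this discussion, not part of the original text; the unit amplitudes and the sampling grid are arbitrary assumptions. It adds two identical sinusoidal wavetrains shifted by a chosen fraction of a wavelength and reports the peak amplitude of their sum.

```python
import numpy as np

def combined_amplitude(shift_fraction, n_points=100_000):
    """Peak amplitude of two unit-amplitude wavetrains (beams TR and RT)
    summed after one is shifted by shift_fraction of a wavelength."""
    phase = np.linspace(0.0, 2.0 * np.pi, n_points)   # one full wavelength
    beam_tr = np.sin(phase)
    beam_rt = np.sin(phase - 2.0 * np.pi * shift_fraction)
    return np.max(np.abs(beam_tr + beam_rt))

for frac in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"shift = {frac:4.2f} wavelength -> peak amplitude = "
          f"{combined_amplitude(frac):.3f}")
# shift 0.00 and 1.00 give 2.000 (in phase), 0.25 and 0.75 give about
# 1.414, and 0.50 gives 0.000 (complete cancellation).
```

The output illustrates the ambiguity noted above: a full-wavelength shift reproduces the zero-shift brightness exactly, so a single monochromatic beam cannot distinguish the two cases.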
Suppose two monochromatic beams with two different wavelengths are sent through the
interferometer at the same time. If the distances from mirrors C and D to the beam splitter are
equal, then both the monochromatic beams, even though they have different wavelengths, must
be in phase when leaving the interferometer, producing a maximally bright oscillation in the
recombined exit beam. When the distances to the beam splitter are not exactly equal, however,
one of the monochromatic beams may end up shifted against itself by one, two, etc., wavelengths,
but there is no reason for the other beam to be shifted against itself the same way. When three
monochromatic beams are sent through the interferometer while the distances to the beam splitter
are not equal, matching all three wavetrains becomes even more unlikely. Hence, if we pass
white light containing innumerable distinct monochromatic wavetrains through the instrument,
then the RT and TR beams will recombine to produce a maximally bright output beam if and only
if the distances from mirrors C and D to the beam splitter are equal.
To make the white-light beam work as intended, the interferometer needs a glass compensator
plate between mirror C and the beam splitter [see Fig. 1.1(b)]. The compensator plate must be the
same thickness and orientation—and made from the same type of glass—as the glass in front of
the beam splitter’s partially reflecting surface. Figure 1.6(a) shows how light waves reflect from
mirrors C and D; the wavelength does not change while reflecting. In Fig. 1.6(b), however, light
waves inside the glass are somewhat shorter than they are outside the glass; the wavelength of the
light with respect to the glass thickness is greatly exaggerated to show this effect.
Therefore, a given distance traveled inside the glass corresponds to more wavelengths of a
monochromatic beam than the same distance in empty space. Moreover, different colors or
wavelengths of light shrink by different amounts, and this effect was a familiar one to 19th-
century optical scientists. If the compensator plate is not present, then the RT beam in Fig. 1.1(b)
passes through the glass in the beam splitter three times, whereas the TR beam passes through the
beam-splitter glass only once. The RT beam thus contains more wavelengths than the TR beam
even though the distances between the mirrors and the beam splitter are equal. With the
compensator plate present, however, both the TR and the RT beams pass through three glass
thicknesses.


__________
⁵ In fact, we now know that a strictly monochromatic beam of light must have matching planes of vibration when shifted against itself by exactly one, two, etc., wavelengths.
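
To attach numbers to the wavelength-counting argument above, here is a small sketch added for illustration; the 5 mm plate thickness, the refractive index of 1.5, and the 500 nm vacuum wavelength are assumed values, not taken from the text.

```python
# Number of wavelengths that fit in a geometric path d:
#   in vacuum: d / lam;   in glass of index n: d / (lam / n) = n * d / lam
d = 5e-3        # glass thickness in meters (assumed value)
n = 1.5         # refractive index of the glass (assumed value)
lam = 500e-9    # vacuum wavelength in meters (assumed value)

waves_vacuum = d / lam
waves_glass = n * d / lam
print(f"wavelengths in {d*1e3:.0f} mm of vacuum: {waves_vacuum:,.0f}")
print(f"wavelengths in {d*1e3:.0f} mm of glass:  {waves_glass:,.0f}")

# Without the compensator plate, beam RT crosses the beam-splitter glass
# three times while beam TR crosses it once, leaving an uncompensated,
# wavelength-dependent difference of 2 * (waves_glass - waves_vacuum) waves.
print(f"uncompensated difference: {2*(waves_glass - waves_vacuum):,.0f} waves")
```

With these assumed numbers the two extra glass passes add 10,000 wavelengths to the RT beam, and the count changes with color, which is why the compensator plate is needed to make the white-light fringe work.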

FIGURE 1.4(a). Figure 1.4(a) shows a segment of radiation entering the interferometer [panel label: before passing through the interferometer], and Fig. 1.4(b) shows what that segment becomes when it leaves the interferometer if the distance it travels up and back each interferometer arm is the same.

FIGURE 1.4(b). [The same segment after leaving the interferometer; beams RT and TR shown side by side.]




FIGURES 1.5(a)–1.5(e). [Beam TR, beam RT, and their total when the two beams are: in phase; out of phase by a quarter wavelength; out of phase by a half wavelength; out of phase by three-quarters of a wavelength; and in phase again after a full-wavelength shift.]

FIGURE 1.6(a). [Incident and reflected wavefields at a mirror; the wavelength is unchanged by reflection.]
FIGURE 1.6(b). [Incident, reflected, and transmitted wavefields at the beamsplitting film on its glass substrate; the wavelength is shorter inside the glass.]
Now each monochromatic component has its own unique number of wavelengths in each arm
of the interferometer; thus, the blue-light component in one arm has the same number of
wavelengths as the blue-light component in the other arm, the red-light component in one arm
has the same number of wavelengths as the red-light component in the other arm, and the same
can be said about all the other colors in the white-light beam.
Michelson wanted to do more than just make the distances traveled by light going back and
forth between the C, D mirrors and the beam splitter equal; he also wanted to see how the
distances traveled by the light beams changed when he rotated the interferometer on its stand [see
Fig. 1.1(a)]. Up to now, we have assumed that mirrors C and D are exactly perpendicular to the
line of sight between their centers and the beam splitter, but nothing stops us from tilting one of
them a very slight amount, as shown in Fig. 1.7. The degree of tilt is, of course, greatly
exaggerated to show what is happening. When the tilt is imposed after the distances of mirrors C
and D to the beam splitter have been made equal, the center line of the tilted mirror remains at the
same distance from the beam splitter as it was before the tilt occurred. If the tilt is so small that
the slight change in direction of the beam can be disregarded, then that part of the beam reflecting
off the mirror's center line still recombines with light from the other mirror in such a way as to
produce the maximally bright oscillation already discussed above. The off-center parts of the
recombined beam are, of course, dimmer because the off-center parts of the tilted mirror no
longer match up properly to the untilted mirror.⁶ An observer looking through the telescope
shown in Figs. 1.1(a) and 1.1(b) sees a bright central band, called a "fringe," corresponding to the
central strip lying along the center line of the tilted mirror, with dark and less bright bands or
fringes on either side. If the distance that the light travels between the tilted mirror and the beam
splitter changes slightly, we expect the central fringe to shift as the distance out and back to one
side or another of the tilted mirror—instead of to its center line—becomes equal to the distance
traveled by the light in the other arm of the interferometer. It is exactly this sort of fringe shift
that Michelson hoped to see when he rotated the interferometer on its stand, changing the
direction in space of the light going up and back the arms of the interferometer.
One last point we need to make is that many beam splitters of the type shown in Fig. 1.1(b)
reflect differently from the glass side and the nonglass side of the partially reflecting surface,
reversing the direction of vibration in the TR beam reflecting off the nonglass side and not
reversing it in the RT beam reflecting off the glass side.⁷ Figure 1.5(c) shows that reversing the
direction of vibration is the same as changing the phase of the beam by one half-wavelength or
180°, so the phenomenon is often referred to as a 180° phase shift on reflection. Michelson used
this sort of phase-shifting beam splitter, so the RT and TR beams in his interferometer did not
match up the way they are shown in Fig. 1.4(b) when the distances of mirrors C and D from the
beam splitter are equal but instead match up as shown in Fig. 1.8.

__________
⁶ See Secs. 5.20 and 5.21 in Chapter 5 for a more detailed discussion of how to analyze a tilted mirror.
⁷ F. Jenkins and H. White, Fundamentals of Optics, 3rd ed. (McGraw-Hill Book Company, New York, 1957), p. 251.
FIGURE 1.7. [The centerline of the tilted mirror, its line of sight to the beam splitter, and the angle of tilt, which is greatly exaggerated in this diagram.]
Now the central fringe coming from the center line of the tilted mirror is dark because
all the monochromatic components of the two beams cancel out rather than add together. When
Michelson sent white light through his interferometer, he thus saw a central dark fringe with
parallel multicolored fringes on either side. The colored fringes come from the off-center strips of
the tilted mirror where one or another monochromatic wavetrain is shifted against itself by
exactly one, two, etc., wavelengths, increasing the amplitude of its oscillation with respect to the
wavetrains of other colors inside the recombined beam. In this setup, the central dark fringe is
unique, making it easy for Michelson to see how its position changes as the interferometer is
rotated.
1.2 Historical Reasoning Behind the Ether-Wind Experiment
Physical theory has changed a great deal since 1881, but it is still relatively easy to understand
the reasoning behind Michelson’s experiment. As soon as light is taken to be a wavefield in a
medium at rest, such as waves on the surface of water, and the Earth’s motion through space is
regarded as carrying the interferometer through the medium, everything falls into place.
The first point worth mentioning is that the velocity at the equator due to the Earth’s daily
rotation is 0.46 km/sec, much less than the Earth’s orbital velocity around the sun of 29.67
km/sec. Consequently, the rotational velocity of Michelson’s laboratory—well north of the
equator—was only about 1% of the orbital velocity, and Michelson did not have to pay any
attention to it. The interferometer in Fig. 1.1(a) can be rotated on its stand, so at noon and
midnight, Michelson could always arrange for one arm to be aligned with the Earth’s orbital
velocity. Figures 1.9(a) and 1.9(b) show light traveling along the arms of a Michelson
interferometer when the interferometer is viewed as moving with a velocity v through a stationary
medium—that is, a luminiferous ether—and one of the arms is aligned with v. To keep life
simple, we have dropped the compensator plate from the two diagrams. Figure 1.9(a) shows light
traveling out and back along the arm aligned with v, with the interferometer rotated so that this is
the arm holding mirror C in Fig. 1.1(b). Figure 1.9(b) shows light traveling out and back along
the arm holding mirror D in Fig. 1.1(b). The positions of mirrors C and D are adjusted so that
each one is the same distance a from the beam splitter.
Figure 1.9(a) shows the beam splitter at three different positions as a single crest of the light’s
wavefield moves through the interferometer: when the wavecrest first enters the arm of the
interferometer, when the wavecrest reflects off mirror C, and when the wavecrest returns to the
beam splitter for the second time. Mirror C is shown at the same three times—when the
wavecrest enters the arm, when it reflects off C, and when it returns to the beam splitter. The
velocity of the wavecrest with respect to the ether is c, and time $t_1$ elapses as the wavecrest goes
from the beam splitter to mirror C. Hence, the wavecrest covers a distance $a + vt_1$ in the
stationary ether while traveling at velocity c, with

$$a + v t_1 = c t_1 . \qquad (1.1a)$$
FIGURE 1.8. [Beams TR and RT as they recombine when the beam splitter reverses the direction of vibration in one of them; the two wavetrains cancel.]

FIGURE 1.9(a). [The arm aligned with the direction of the Earth's motion: positions of the beam splitter and of mirror C at three instants, with the beam splitter displaced by $vt_1$ while the wavecrest covers the arm length a to the mirror and by $vt_2$ on the return; incident light enters and the exit beam heads toward the telescope.]

FIGURE 1.9(b). [The arm perpendicular to the direction of the Earth's motion: mirror D at distance a, with the positions of the beam splitter separated by $vt_3$ as the wavecrest goes out and back; incident light enters and the exit beam heads toward the telescope.]

Time $t_2$ elapses while the wavecrest returns from mirror C to the beam splitter, and similar
reasoning shows that

$$a - v t_2 = c t_2 . \qquad (1.1b)$$

Solving for $t_1$ and $t_2$ in Eqs. (1.1a) and (1.1b) gives

$$t_1 = \frac{a}{c - v} \quad\text{and}\quad t_2 = \frac{a}{c + v} .$$

The wavecrest spends time

$$t_1 + t_2 = \frac{a}{c - v} + \frac{a}{c + v} = \frac{2ac}{c^2 - v^2}$$

going out to mirror C and back to the beam splitter, and it does so while traveling at velocity c, so
it covers a total distance

$$c\,(t_1 + t_2) = \frac{2ac^2}{c^2 - v^2} . \qquad (1.1c)$$

Figure 1.9(a) also shows the wavecrest traveling at an angle, instead of straight down, after it
reflects off the beam splitter when leaving the interferometer’s arm. This allows it to head toward
where the observing telescope will be by the time the wavecrest reaches it; there is thus no
danger of the telescope missing the wavecrest because it has moved out of position. Figures
1.10(a) and 1.10(b) show why this happens. Figure 1.10(a) shows a single wavecrest reflecting
off a 45° stationary mirror. The large dots indicate where the “corner” of the reflecting wavecrest
is now and has been in the past as it reflects from the stationary mirror. The reflected wavecrest
travels upward at 90° from its original direction, as expected. Figure 1.10(b) shows what happens
when the same type of wavecrest reflects off a moving 45° mirror. The four thin solid lines show
the positions of the mirror at four equally spaced instants in time, and the large dots again show
where the corner of the reflecting wavecrest is at these times. Connecting these dots with a thick
dashed line, we see that the wavecrest feels an effective stationary mirror that is slanted at an
angle somewhat greater than 45°. This means the reflected wavecrest does not travel straight up
as in Fig. 1.10(a) but instead moves a little off to the right.
Figure 1.9(b) shows how the wavecrest travels up and back the interferometer arm
perpendicular to velocity v. In time $t_3$, the wavecrest travels a distance $\sqrt{a^2 + v^2 t_3^2}$ from the beam
splitter to mirror D; and, because it does this at velocity c, we must have

$$c t_3 = \sqrt{a^2 + v^2 t_3^2}$$

or

$$t_3 = \frac{a}{\sqrt{c^2 - v^2}} .$$

Figure 1.9(b) shows that the total distance traveled from the beam splitter to mirror D and
back again must be

$$2 c t_3 = \frac{2ac}{\sqrt{c^2 - v^2}} . \qquad (1.2)$$


Even though the two interferometer arms are both of length a, if the interferometer is moving
then a single wavecrest splitting at the beam splitter does not travel the same distance in each arm
before recombining at the beam splitter. The difference Δs between the distances traveled out and
back in each arm is, according to Eqs. (1.2) and (1.1c),

$$\Delta s = c\,(t_1 + t_2) - 2 c t_3
= \frac{2ac^2}{c^2 - v^2} - \frac{2ac}{\sqrt{c^2 - v^2}}
= \frac{2a}{1 - \dfrac{v^2}{c^2}}\left[\,1 - \sqrt{1 - \frac{v^2}{c^2}}\,\right] .$$


The Earth's orbital velocity is about $10^{-4}$ of the speed of light c, so we can make the
approximation

$$\sqrt{1 - \frac{v^2}{c^2}} \cong 1 - \frac{v^2}{2c^2}
\quad\text{and}\quad
\left(1 - \frac{v^2}{c^2}\right)^{-1} \cong 1 + \frac{v^2}{c^2} .$$

This gives

$$\Delta s \cong 2a\left(1 + \frac{v^2}{c^2}\right)\frac{v^2}{2c^2}
= a\left[\frac{v^2}{c^2} + O\!\left(\frac{v^4}{c^4}\right)\right] .$$
FIGURE 1.10(a). An incident wavecrest enters from the right and is reflected up from a stationary
surface. The dots show where the corner of the wavecrest is at equally spaced time intervals while it is
reflecting off the surface.















FIGURE 1.10(b). The same wavecrest is shown here at four instants of time, each instant
separated from the next by a time interval of Δt, as it enters from the right and reflects off a flat
surface traveling from left to right across the page. The dots show where the corner of the wavecrest
is at these four instants of time, and the thick dashed line shows the effective slant of the surface
experienced by the wavecrest as it reflects.
Since $v^2/c^2 \cong 10^{-8}$ and $v^4/c^4 \cong 10^{-16}$, it makes sense to neglect the $v^4/c^4$ terms and write

$$\Delta s \cong a\,\frac{v^2}{c^2} \approx 10^{-8}\, a . \qquad (1.3a)$$
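
As a numerical check on Eqs. (1.1c), (1.2), and (1.3a), the sketch below evaluates both the exact path difference and the approximation. It is an added illustration; the 1.2 m arm length is an assumed, merely representative value (the text does not state the arm length of the 1881 instrument), and 570 nm is an assumed average visible wavelength.

```python
import math

c = 2.998e8    # speed of light, m/s
v = 29.67e3    # Earth's orbital speed quoted in the text, m/s
a = 1.2        # arm length in meters (assumed, for illustration only)

beta2 = (v / c) ** 2

# Exact difference between the round-trip distances in the two arms:
#   delta_s = [2a / (1 - v^2/c^2)] * [1 - sqrt(1 - v^2/c^2)]
exact = (2.0 * a / (1.0 - beta2)) * (1.0 - math.sqrt(1.0 - beta2))

approx = a * beta2          # Eq. (1.3a): delta_s ~ a v^2/c^2 ~ 1e-8 a

print(f"exact  delta_s = {exact:.4e} m")
print(f"approx delta_s = {approx:.4e} m")

# Rotating the instrument by 90 degrees changes the optical path
# difference by 2*delta_s [Eq. (1.4) below]; Sec. 1.2 compares this with
# a quarter of the average white-light wavelength, the change that moves
# the dark fringe's center to the old position of its edge.
lam_av = 570e-9             # assumed average wavelength, m
print(f"2*delta_s as a fraction of lam_av/4: {2*approx/(lam_av/4):.3f}")
```

The exact and approximate values of Δs agree to many digits, confirming that the neglected terms are of order $10^{-16}$ relative to the arm length.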

It is perhaps of interest to point out that Michelson, by mistakenly assuming that the light
traveling up and back the arm perpendicular to the orbital velocity covered a distance 2a instead
of $2ac/\sqrt{c^2 - v^2}$, ended up with

$$\Delta s \cong 2a\,\frac{v^2}{c^2} \approx 2 \times 10^{-8}\, a \qquad (1.3b)$$

in his 1881 paper. This incorrect formula did not affect Michelson’s overall analysis because, as
he explained in the paper, the data was good enough to rule out an effect ten times smaller than
what he expected to see.
As pointed out in Sec. 1.1, when white light passed through the interferometer with one of the
end mirrors slightly tilted, Michelson saw a central dark band or fringe from the centerline of the
tilted mirror because the centerline is the same distance from the beam splitter as the untilted
mirror. Remembering that Michelson used a beam splitter that reversed the direction of vibration
in one of the recombining beams, we know that at the center of the dark fringe each
monochromatic wavetrain in the white-light beam cancels itself out. At the first colored band or
fringe on either side of the centerline, the wavetrains go from cancelling themselves out to
reinforcing themselves, becoming bright at those positions on the tilted mirror where the length
traveled out and back the tilted mirror arm is a half-wavelength longer than at the center of the
dark band [see, for example, the transition from Fig. 1.5(c) to Fig. 1.5(e)]. Hence, for each
monochromatic wavetrain, the transition from dark to bright is halfway complete where the
length traveled out and back the tilted-mirror arm is a quarter wavelength different from what it is
at the center of the dark band. Considering the joint actions of all the monochromatic wavetrains
in the white-light beam, Michelson then knew that going from the center to the edge of the dark
fringe corresponded to shifting from a position on the tilted mirror where the length out and back
in both interferometer arms was equal to a position where the length out and back the tilted
mirror arm was different by one quarter of the average wavelength $\lambda_{av}$ of the white-light beam.
Thus the fringe widths inside the telescope’s field of view gave him an extremely fine-grained
scale for measuring the difference in distance between the two arms. For greater accuracy, a
monochromatic beam could be sent through the interferometer and the tilted mirror adjusted until
the fringes matched up with the scale marks of the telescope’s eyepiece.
If the interferometer is rotated so that the arm originally parallel to v is now perpendicular to
v, then the distance out and back one arm is shorter by Δs and the distance out and back in the
other arm is longer by Δs, so there is—according to Eq. (1.3a)—a shift of

$$2\,\Delta s \cong 2a\,\frac{v^2}{c^2} \approx 2 \times 10^{-8}\, a \qquad (1.4)$$

of the wavefield from one arm when compared to the wavefield from the other arm. If $2\Delta s$ equals
$\lambda_{av}/4$, the dark fringe shifts until its center is located at the previous position of one of its edges;
if $2\Delta s$ is larger, then the dark fringe shifts more; and if $2\Delta s$ is smaller, then the dark fringe shifts
less. For the value of a he chose, Michelson expected the fringe to shift by approximately one-
tenth its width. To within experimental error, he did not see the dark fringe shift at all. Michelson
concluded that

    the hypothesis of the stationary ether is thus shown to be incorrect, and the necessary conclusion
    follows that the hypothesis is erroneous.⁸

The existence of the ether was accepted by many scientists, so this experiment was by no
means the last word in the matter; indeed, it inaugurated 50 years of ever more painstaking
attempts to detect an ether wind using larger and more sensitive Michelson interferometers.
Michelson himself took the first step down this road when, in 1887, he collaborated with Edward
Morley to repeat his experiment; Fig. 1.11 shows the optical diagram of the interferometer they
constructed. They concluded that the velocity v of the interferometer with respect to the ether was
probably less than a sixth of the Earth's orbital velocity, an upper limit suggested by
experimental error.⁹ Michelson and Morley regarded this as another negative result. Many
scientists, including Michelson, at first interpreted these experiments as showing that the Earth
dragged along a layer of ether near its surface, making it hard to say just how fast the
interferometer might be moving with respect to the ether in the laboratory. Interferometers were
set up on tops of mountains and sent up in high-altitude balloons in the hope of getting outside
the ether layer dragged along by the Earth, but no one came up with any results convincingly
larger than experimental error. According to Einstein's special theory of relativity, published in
1905, there is no reason to expect "ether drift" at all, because the speed of light is the same in all
inertial frames of reference. After 1905, attempts to detect ether drift were basically attempts to
disprove relativity theory, and scientists who pursued them were regarded by their peers as ever
more eccentric. Perhaps the last serious attempt to detect an ether wind using a Michelson
interferometer took place on top of Mount Wilson, where Dayton Miller ran an extremely large
and sensitive Michelson experiment in the 1920s. When publishing the results in the early 1930s,
he claimed to detect ether-wind velocities on the order of 10 km/sec,¹⁰,¹¹ but the data remained
controversial. After his death, the results were attributed to slight but systematic temperature
changes in the instrument during the measurements.¹²

__________
⁸ Michelson, "The Relative Motion of the Earth."
⁹ A. Michelson and E. Morley, "On the Relative Motion of the Earth and the Luminiferous Ether," American Journal of Science 34, Series 3 (1887), pp. 333–345.
¹⁰ D. Miller, "The Ether-Drift Experiment and the Determination of the Absolute Motion of the Earth," Reviews of Modern Physics 5, no. 2 (July 1933), pp. 203–242.
¹¹ D. Miller, "The Ether-Drift Experiment and the Determination of the Absolute Motion of the Earth," Nature (February 3, 1934), pp. 162–164.
¹² R. Shankland, S. McCuskey, F. Leone, and G. Kuerti, "New Analysis of the Interferometer Observations of Dayton C. Miller," Reviews of Modern Physics 27, no. 2 (April 1955), pp. 167–178.
1.3 Monochromatic Light and Spectral Lines
The wavelength λ of a monochromatic light wave and the frequency f in cycles per unit time of
that same monochromatic light wave are connected by

$$\lambda f = c\,, \qquad (1.5)$$

where c is the velocity of light. By the second half of the 19th century, it was known that the light
emitted by free atoms, such as from the atoms inside a hot dilute gas, is often emitted at specific
frequencies called spectral lines. Equation (1.5) then requires the light from a spectral line to
have a precise wavelength λ = c/f. Michelson used these spectral lines to generate the
monochromatic light sent through his interferometer. When, for example, a spectroscope was
used to separate out the cadmium red line and send it through the interferometer, he would see a
regular pattern of red fringes; when the mercury green line was sent through, he would see
regular green fringes; and so on. Many of these lines are in reality clumped groups of spectral
lines, all having nearly the same wavelength; they masquerade as a single bright line when
observed by low-resolution spectroscopes and spectrometers.
1.4 Applying the Michelson Interferometer to Spectral Lines
After the first ether-wind experiments, Michelson demonstrated that his interferometer could also
be used both as an extremely accurate, practical ruler for measuring fundamental lengths and as
an extremely high-resolution spectrometer. To understand Michelson’s approach, we must keep
in mind that the only “optical detectors” available back then were cameras (whose images had to
be chemically developed in darkrooms) and the human eye.
When the interferometer is used as a ruler or spectrometer, one of the arms is modified so that
its mirror is easily moved, as shown in Fig. 1.12. This moving mirror and the fixed mirror on the
other arm are still slightly tilted with respect to each other; that is, when extended indefinitely,
the planes of the mirror surfaces do not meet at exactly 90°. In this discussion, we refer to the
moving mirror as being tilted and the fixed mirror as being untilted. To keep things consistent
with the discussion in Sec. 1.1, the beam splitter is assumed to be the same type used in the 1881











FIGURE 1.11. [Optical diagram of the interferometer constructed by Michelson and Morley in 1887.]
ether-wind experiment. Hence, when a white-light beam is sent through the instrument, an
observer notes a central dark fringe if the center of the tilted moving mirror is the same distance
from the beam splitter as the center of the fixed mirror. This equidistant position of the moving
mirror is today often called the position of zero-path difference (ZPD) because the light’s path up
and back each arm of the interferometer is the same when there is no tilt present.
The position and tilt of the moving mirror can be adjusted until the central dark fringe is
centered on rulings marked in the telescope’s eyepiece. When the white-light beam is replaced by
a monochromatic beam from a spectral line, the observer sees a sequence of light and dark bands
forming a regular pattern of fringes having the same color as the spectral line. The marked
position of the central dark fringe in the center of the eyepiece is now occupied by a dark null of
the monochromatic fringe pattern. This null corresponds to the centerline strip of the tilted
mirror’s surface being the same distance from the beam splitter as the untilted mirror’s surface.
The two bright fringes on either side of the marked null separate that null from the two
neighboring nulls, with the neighboring nulls corresponding to two strips of the tilted mirror’s
surface that are a half-wavelength closer to, and a half-wavelength further away from, the beam
splitter. A half-wavelength difference in distance from the beam splitter creates, of course, a full
wavelength’s difference in the distance traveled up and back the interferometer’s arm, which is
why we see another null. Depending on the configuration of the telescope, the amount of tilt in
the tilted mirror, and the wavelength of the monochromatic beam, there will be some number of
additional fringes alternating bright and dark across the field of view, with the nulls
corresponding to strips of the tilted mirror’s surface that are one half-wavelength closer to and
further away from the beam splitter, two halves or one full wavelength closer to and further away
from the beam splitter, three halves closer to and further away from the beam splitter, and so on.
The observer can slowly move the tilted mirror out along its arm, watching as the fringe
pattern moves across the telescope’s field of view. The movement occurs, of course, because the
strips of the moving mirror’s tilted surface that are 1/2, 1, 3/2, etc., wavelengths closer to or
further away from the beam splitter are now no longer where they used to be. The marked null
shifts and, after the mirror moves half a wavelength from its original position, the null that used
to be immediately to one side shifts into the marked location. The fringe pattern looks the same
as just before the mirror began moving, but the observer knows there has been a half-wavelength
shift in the position of the moving mirror because the fringes have been carefully watched as their
positions changed. As the mirror moves, old fringes move out of sight on one side of the field of
view while new fringes replace them on the other side of the field of view. The observer checks
that the tilt of the moving mirror does not change by making sure that there is always the same
number of bright-null repetitions in the fringe pattern. Since the position of the moving mirror is
always known to within a small fraction of a wavelength, the interferometer has now become an
extremely accurate way to measure distance.







































FIGURE 1.12. [The Michelson interferometer configured as a ruler or spectrometer: source radiance containing spectral lines passes through the beam splitter and compensator plate to the fixed mirror in one arm and the moving mirror, displaced a distance p, in the other; the recombined beam goes to a telescope.]
Michelson did not hesitate to measure distances with his interferometer. In 1892 he
established that the standard meter bar in Paris corresponded, to an accuracy of one part in two
million, to 1,553,163.5 wavelengths of monochromatic light from the red cadmium spectral line.
At Yerkes Observatory in Wisconsin, he measured the extremely small tidal distortions of the
planet Earth due to the moon’s gravity, helping to establish that the Earth has an iron core, and
published the results in 1919. There is, however, a fundamental difficulty limiting his ability to
use the interferometer as a ruler: As the moving mirror gets further and further away from its
equidistant or ZPD position, the pattern of fringes starts to fade and eventually disappears. This
phenomenon is caused by the beam from the spectral line not being exactly monochromatic—
either because what looks like a single spectral line is in reality a group of two or more lines
having almost the same wavelength, or because the line itself has a finite spectral “width,”
simultaneously emitting light at a very large number of wavelengths all very close to each other
in value.
To see why the fade-out occurs for a closely spaced group of spectral lines, we first analyze
what happens when the light from a pair of equal-intensity, closely spaced spectral lines,
sometimes called a spectral doublet, is sent through the interferometer. Inside the interferometer,
the doublet behaves like two monochromatic beams—each having a slightly different
wavelength—simultaneously passing through the instrument. After using white light to put the
moving, tilted mirror at its ZPD position, we begin sending the doublet beam through the
interferometer. Each monochromatic beam produces a fringe pattern. To the human eye, the
fringe patterns have the same color and their nulls seem to be at exactly the same places in the
telescope’s field of view. Because the wavelengths of the beams are nearly identical, the two
fringe patterns lie almost exactly on top of each other, reinforcing each other the same way the
dashed and solid oscillations lie on top of each other to create a thicker line at the left-hand edge
of Fig. 1.13. When, for example, there is a null in one beam’s fringe pattern because that strip of
the tilted mirror’s surface is an integer number of half-wavelengths closer to or further away from
the beam splitter, the null from the other beam’s fringe pattern falls in almost exactly the same
place because it has almost exactly the same wavelength. As we shift the moving mirror further
away from ZPD and watch the fringes move, we know that when each new fringe forms at the
leading edge of the field of view, it shows that the edge of the tilted moving mirror is an ever
larger number of half-wavelengths further from the beam splitter. Sooner or later, however, the
same thing happens to the two beams’ fringe patterns that happens in Fig. 1.13 as we look away
from its left-hand edge—the oscillations get out of phase. Just as the dashed and solid lines in
Fig. 1.13 no longer match up exactly because they have slightly different repetition lengths, so do
the two fringe patterns of the two beams match up less well because they have slightly different
wavelengths. There always comes a point—perhaps when the next null is forming at 10,000 or
50,000 or more half-wavelengths from the ZPD position of the moving mirror—where the
monochromatic beam with the slightly shorter wavelength λ
1
is ready to form a null somewhat
before the beam with the slightly longer wavelength λ
2
. The nulls and brights from one
monochromatic fringe pattern shift enough with respect to the other that we begin to notice a
change: the pattern begins to fade. Eventually, the two fringe patterns are completely out of
phase, with the brights and nulls of one pattern lying on, respectively, the nulls and brights of the
other. If the two beams are of equal intensity, then the fringe pattern fades away completely.
Suppose the $\lambda_1$ set of fringes first becomes exactly out of phase with the $\lambda_2$ set of fringes when
the moving mirror has traveled a distance of approximately N/2 wavelengths of the $\lambda_2$ beam from
its equidistant or ZPD location. At this point, N satisfies the approximate equation

$$\frac{1}{2}N\lambda_2 \;\cong\; \frac{1}{2}\left(N+\frac{1}{2}\right)\lambda_1\,, \qquad (1.6a)$$
which can also be written as

$$\frac{\lambda_2-\lambda_1}{\lambda_1} \;\cong\; \frac{1}{2N}\,. \qquad (1.6b)$$

This gives the formula for the fractional spread $(\lambda_2-\lambda_1)/\lambda_1$
between the doublet’s wavelengths in terms of N. If N is too large for convenient counting and
only several digits of accuracy are needed, we can directly measure the distance p in Fig. 1.12 at
which the fringe pattern disappears. Recognizing that both sides of Eq. (1.6a) are formulas for p
at the fade-out point, we can approximate either side of Eq. (1.6a) by $\tfrac{1}{2}N\lambda_{av}$, where $\lambda_{av}$ is the
approximate wavelength of the doublet, and write

$$p \;\cong\; \frac{N\lambda_{av}}{2}\,. \qquad (1.6c)$$

Solving for N gives the formula

$$N \;\cong\; \frac{2p}{\lambda_{av}} \qquad (1.6d)$$

to estimate N in terms of the known values of p and $\lambda_{av}$. This approximate value of N can then
be put into Eq. (1.6b) to find the fractional spread in the doublet. Hence, we see that the fade-out
is both a “bug” and a “feature” of the interferometer—although it sets a limit on the distances that
can be measured, it also specifies the exact separation of spectral lines too close to be resolved by
other types of spectrometers. This exercise also establishes the basic idea behind Michelson-
based spectroscopy: examining the behavior of the interference signal to measure the beam’s
spectral shape.
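Equations (1.6b)–(1.6d) are easy to exercise numerically. The Python fragment below is a minimal sketch, not anything taken from the text: it assumes the two sodium D wavelengths as a test case, computes the fade-out travel p that Eqs. (1.6a)–(1.6c) predict, and then treats that p as the measured quantity in order to recover the doublet’s fractional spread.

```python
# Minimal sketch of Eqs. (1.6b)-(1.6d); the sodium D wavelengths below are
# assumed demo values, not data taken from this book.
lam1 = 588.995e-9                 # shorter doublet wavelength (m)
lam2 = 589.592e-9                 # longer doublet wavelength (m)
lam_av = 0.5 * (lam1 + lam2)      # approximate doublet wavelength

spread_exact = (lam2 - lam1) / lam1        # exact fractional spread
N = lam1 / (2.0 * (lam2 - lam1))           # Eq. (1.6b) solved for N
p = 0.5 * N * lam_av                       # Eq. (1.6c): mirror travel at fade-out

# Invert the procedure: pretend p was the measured fade-out distance.
N_est = 2.0 * p / lam_av                   # Eq. (1.6d)
spread_est = 1.0 / (2.0 * N_est)           # Eq. (1.6b)

print(f"fade-out at p ~ {p * 1e3:.3f} mm (N ~ {N:.0f} fringes)")
print(f"fractional spread: exact {spread_exact:.3e}, recovered {spread_est:.3e}")
```

A fade-out after a few hundred fringes, at a fraction of a millimeter of mirror travel, is the sort of distance that can be read directly off the instrument in the way just described.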
FIGURE 1.13. The solid oscillation represents the fringe pattern of one spectral line in the doublet and
the dashed oscillation represents the fringe pattern of the other spectral line in the doublet. The
wavelengths of both spectral lines are almost the same, so their fringe patterns slowly change from being
in-phase, to being out-of-phase, and then back to being in-phase.














Now that we understand why the fringe pattern of a doublet fades, it is easy to see why the
same sort of thing happens with any size group—or multiplet—of closely spaced spectral lines.
Each line of intrinsically greater or lesser intensity generates a fringe pattern of intrinsically
greater or lesser intensity connected to its wavelength. Near ZPD, all the fringe patterns are in
phase, but as the moving mirror shifts away from ZPD, the fringe patterns, since each is produced
by a slightly different wavelength, go out of phase, causing the fringes to fade. Figure 1.14 even
suggests a quick way of understanding something about why a single, finite-width spectral line
also produces fading fringe patterns; approximating it as a closely spaced multiplet, we might
expect its fringes to behave the same way any other multiplet’s would. We should, however, be
careful about carrying this sort of reasoning too far. Figure 1.13 suggests that if, after reaching
the fade-out point, we keep moving the tilted mirror away from its ZPD position, then the
doublet’s fringe pattern starts to reappear, eventually becoming as strong as it was near ZPD. The
same sort of phenomenon should also occur for any multiplet consisting of a finite number of
exact wavelengths; if we go far enough from ZPD, then there should be a region where the fringe
patterns are all back in phase. In reality, when moving away from ZPD, there are indeed regions
where a multiplet’s fringe pattern first fades then grows stronger, but the finite width of each
spectral line inside the multiplet stops the fringes from ever regaining their full ZPD strength.
The fringes always, eventually, fade away completely. To explain this behavior, it is enough to
examine how and why the fringe pattern of a single, finite-width spectral line fades away. This is
done in the next three sections, where we show how a fringe pattern is connected to the Fourier
transform of the spectral intensity.
[Graph for FIGURE 1.13: the two superposed oscillations plotted against mirror position p, with regions marked “strong fringes,” “weak fringes,” and “no fringes” as the oscillations drift out of and back into phase.]
1.5 Interference Equation for the Ideal Michelson Interferometer
When using a Michelson interferometer for Fourier-transform spectroscopy, the end mirrors in
each arm are aligned to be perpendicular to the line of sight between their centers and the center
of the beam splitter. In effect, we remove the tilt from the moving mirror so that its central fringe
fills the detector’s field of view in Fig. 1.15. The light beam passing through the interferometer
should be collimated, shown schematically in Fig. 1.15, by putting the point source of the beam
at the focus of a thin lens. The beam leaving the interferometer is concentrated onto a detector by
another thin lens. The dashed line shows the ZPD position of the moving mirror in Figs. 1.15 and
1.16. The moving mirror is a distance p from ZPD in these two figures, with p taken to be
positive when the mirror is further away from the beam splitter than its ZPD position and
negative when it is closer to the beam splitter than its ZPD position. The moving mirror should
remain perpendicular to the line of sight between it and the beam splitter as p changes, and the
detector records the changing intensity I of the collimated beam leaving the interferometer.
Even though Michelson did not usually set up his interferometers this way, optical theory was
advanced enough then for him to predict how I depends on p. The first step is to set up an x, y, z
Cartesian coordinate system such as the one shown in Fig. 1.16, with the collimated exit beam
traveling down the z axis. There are dimensionless unit vectors $\hat{x}$, $\hat{y}$, $\hat{z}$ pointing in the direction
of the positive x, y, z coordinate axes. Still treating a light beam as a transverse wavefield of the
type shown in Figs. 1.2(a)–1.2(c) and 1.3(a)–1.3(c), we assume that beam TR in Fig. 1.16 is
monochromatic light and write its transverse disturbance as


$$\vec{A}_f = \hat{x}\,U_f\cos\!\left(\frac{2\pi z}{\lambda_f} - 2\pi f t + \delta_U\right) + \hat{y}\,V_f\cos\!\left(\frac{2\pi z}{\lambda_f} - 2\pi f t + \delta_V\right). \qquad (1.7a)$$

Here, t is the time coordinate, f is the frequency of the monochromatic disturbance, and $\lambda_f$ is the
wavelength corresponding to frequency f. The period of the disturbance is, of course, 1/f, and Eq.
(1.5) reminds us that the wavelength $\lambda_f$ is connected to the frequency f by

$$\lambda_f\, f = c\,,$$
where again c is the speed of light. Vector $\vec{A}_f$ has no $\hat{z}$ component, allowing it to represent a
transverse disturbance in the “ether” of the type shown in Figs. 1.2(a)–1.2(c) and 1.3(a)–1.3(c).
The $\hat{x}$ and $\hat{y}$ components of $\vec{A}_f$ are the real-valued expressions

$$U_f\cos\!\left(\frac{2\pi z}{\lambda_f} - 2\pi f t + \delta_U\right)$$
FIGURE 1.14. [Two panels plotting spectral intensity against frequency f: a spectral multiplet, and a single finite-width line viewed as a closely spaced multiplet.]


FIGURE 1.15. [Michelson interferometer arranged for Fourier-transform spectroscopy: a point source at the focus of a collimating lens illuminates a beam splitter set at 45 deg.; the fixed mirror and the moving mirror (displaced a distance p) sit at 90 deg. in the two arms behind a compensator plate, and the recombined beam is focused onto a detector.]

and

$$V_f\cos\!\left(\frac{2\pi z}{\lambda_f} - 2\pi f t + \delta_V\right)$$

respectively. These components must both oscillate at the same frequency f because the light
beam is monochromatic, but they can have different constant phase shifts $\delta_U$ and $\delta_V$. This allows
$\vec{A}_f$ to point in different directions in the x, y plane when we move along the beam, as suggested
by the changing orientations of the arrows in beams RT and TR of Fig. 1.16. The $U_f$ and $V_f$
amplitudes of the x and y oscillations do not have to be equal. To simplify the notation, and
because the concept will be routinely used in the rest of the book, we define


$$\sigma_f = \frac{1}{\lambda_f} \qquad (1.7b)$$

to be the wavenumber of the monochromatic disturbance. Now Eqs. (1.7a) and (1.5) can be
written as

$$\vec{A}_f = \hat{x}\,U_f\cos\!\left(2\pi\sigma_f z - 2\pi f t + \delta_U\right) + \hat{y}\,V_f\cos\!\left(2\pi\sigma_f z - 2\pi f t + \delta_V\right) \qquad (1.7c)$$

with

$$\sigma_f = f/c\,. \qquad (1.7d)$$

This is the same monochromatic disturbance as before; all that changes is the notation used to
specify how its phase changes with z.
The power transported by a physical wavefield of any type is usually proportional to its squared
amplitude,[13,14] and in optics it is now, as it was in Michelson’s time, customary to set the time
average of the squared amplitude equal to the intensity of the transverse wavefield.[15] Visible
light has a wavelength on the order of $5\times10^{-7}$ meters, so by Eq. (1.5) its frequency is about

$$f \;\cong\; \frac{c}{5\times10^{-7}\ \text{meters}} \;\cong\; 6\times10^{14}\ \text{Hz} \qquad (1.8a)$$
given that $c \cong 3\times10^{8}$ m/sec. Hence one cycle of the transverse wavefield has a period of about


[13] H. Lamb, Hydrodynamics, 6th ed. (Dover Publications, New York, 1945 reprint; first published 1879), p. 370.
[14] P. Morse and K. Ingard, Theoretical Acoustics (McGraw-Hill, New York, 1968), p. 250.
[15] G. Stokes, Mathematical and Physical Papers, Vol. III (Cambridge University Press, 1901), pp. 233–258.









































FIGURE 1.16. [The x, y, z coordinate system for the ideal Michelson interferometer: the collimated exit beam travels down the z axis; beam TR and beam RT pass through the beam splitter and compensator plate to the fixed mirror and to the moving mirror, which is displaced a distance p so that χ = 2p.]

$$\frac{1}{6\times10^{14}\ \text{Hz}} \;\cong\; 2\times10^{-15}\ \text{sec}\,. \qquad (1.8b)$$

The response time of the unaided human eye is perhaps as short as $10^{-2}$ s, and $2\times10^{-15}$ s is
shorter than that by a factor of about $10^{13}$. The response of the fastest optical detectors available
today is on the order of $10^{-9}$ s, which is still an incredibly long time compared to $2\times10^{-15}$ s.
Therefore, we might as well take the time over which the squared amplitude is averaged to be
infinitely long, because compared to the wavefield’s period, that’s what it effectively is.
Following standard notation, the time average of a function g(t) is taken to be

$$\big\langle g(t)\big\rangle = \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T} g(t)\,dt\,. \qquad (1.9a)$$

For any two functions g(t) and h(t), we then have

$$\big\langle g(t)+h(t)\big\rangle = \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T}\big[g(t)+h(t)\big]\,dt = \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T}g(t)\,dt + \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T}h(t)\,dt$$

or

$$\big\langle g(t)+h(t)\big\rangle = \big\langle g(t)\big\rangle + \big\langle h(t)\big\rangle\,. \qquad (1.9b)$$

Multiplying g(t) by a constant K and then averaging, we get

$$\big\langle K\,g(t)\big\rangle = \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T}\big[K g(t)\big]\,dt = K\lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T}g(t)\,dt$$

or

$$\big\langle K\,g(t)\big\rangle = K\,\big\langle g(t)\big\rangle\,. \qquad (1.9c)$$

The squared amplitude of the monochromatic wavefield in Eq. (1.7c) is

$$\vec{A}_f\cdot\vec{A}_f = U_f^{2}\cos^{2}\!\left(2\pi\sigma_f z - 2\pi f t + \delta_U\right) + V_f^{2}\cos^{2}\!\left(2\pi\sigma_f z - 2\pi f t + \delta_V\right).$$

Time averaging both sides to get the intensity gives

$$\big\langle \vec{A}_f\cdot\vec{A}_f\big\rangle = \Big\langle U_f^{2}\cos^{2}\!\left(2\pi\sigma_f z - 2\pi f t + \delta_U\right) + V_f^{2}\cos^{2}\!\left(2\pi\sigma_f z - 2\pi f t + \delta_V\right)\Big\rangle\,, \qquad (1.10a)$$

which becomes, applying Eqs. (1.9b) and (1.9c),

$$\big\langle \vec{A}_f\cdot\vec{A}_f\big\rangle = U_f^{2}\,\Big\langle \cos^{2}\!\left(2\pi\sigma_f z - 2\pi f t + \delta_U\right)\Big\rangle + V_f^{2}\,\Big\langle \cos^{2}\!\left(2\pi\sigma_f z - 2\pi f t + \delta_V\right)\Big\rangle\,. \qquad (1.10b)$$

The average of the squared cosine is 1/2 over one of its cycles.[16] As the averaging time gets
longer, it contains ever more cycles of the squared cosine, as well as—almost certainly—some
fraction of a cycle. The contribution of the squared cosine over a fractional cycle has practically
no influence compared to the squared cosine’s average value of 1/2 over a large number of
complete cycles. In the limit as T → ∞, it follows that

$$\big\langle \cos^{2}(at+b)\big\rangle = 1/2 \qquad (1.10c)$$

for all real values of a and b. Hence, the formula for the intensity of the monochromatic beam in
Eq. (1.10b) now reduces to

$$\big\langle \vec{A}_f\cdot\vec{A}_f\big\rangle = \frac{1}{2}\left(U_f^{2} + V_f^{2}\right). \qquad (1.10d)$$

Although the squared cosine is always positive, the cosine itself is negative as often as it is
positive and averages to zero over one cycle. As the averaging time increases, it includes an ever
larger number of cycles as well as (probably) some leftover fraction of a cycle. Again, the zero
average from the large number of complete cycles outweighs the contribution of whatever
fractional cycle may be present, and in the limit as T → ∞

$$\big\langle \cos(at+b)\big\rangle = 0 \qquad (1.11)$$
for all real values of a and b.
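Both limits are easy to confirm numerically. The following Python fragment is a minimal sketch with arbitrarily chosen a, b, and averaging times (none of them taken from the text); it shows the finite-time averages settling onto the values given by Eqs. (1.10c) and (1.11):

```python
# Numerical check of Eqs. (1.10c) and (1.11): as the averaging time T grows,
# <cos^2(a t + b)> -> 1/2 and <cos(a t + b)> -> 0. All numbers are demo values.
import numpy as np

a, b = 2 * np.pi * 5.0, 0.7          # 5 cycles per unit time, arbitrary phase
for T in (10.0, 100.0, 1000.0):      # ever longer averaging intervals
    t = np.linspace(-T, T, 200_001)  # dense sampling of the interval [-T, T]
    avg_cos2 = np.trapz(np.cos(a * t + b) ** 2, t) / (2 * T)
    avg_cos = np.trapz(np.cos(a * t + b), t) / (2 * T)
    print(f"T = {T:6.0f}:  <cos^2> = {avg_cos2:.6f}   <cos> = {avg_cos:+.2e}")
```

The leftover fraction of a cycle mentioned above is what keeps the finite-T averages from being exactly 1/2 and 0; its influence shrinks as 1/T.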
The wavefield of a beam of light containing two monochromatic wavetrains of frequencies $f_1$
and $f_2$ can be written as

$$\vec{A} = \vec{A}_{f_1} + \vec{A}_{f_2}\,, \qquad (1.12a)$$

where

$$\vec{A}_{f_1} = \hat{x}\,U_{f_1}\cos\!\left(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_U^{(1)}\right) + \hat{y}\,V_{f_1}\cos\!\left(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_V^{(1)}\right) \qquad (1.12b)$$

and

$$\vec{A}_{f_2} = \hat{x}\,U_{f_2}\cos\!\left(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_U^{(2)}\right) + \hat{y}\,V_{f_2}\cos\!\left(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_V^{(2)}\right). \qquad (1.12c)$$



[16] D. Griffiths, Introduction to Electrodynamics, 2nd ed. (Prentice Hall, Englewood Cliffs, NJ, 1989), p. 359.
The beam’s intensity is the time average of its squared amplitude, which is

$$\big\langle \vec{A}\cdot\vec{A}\big\rangle = \big\langle (\vec{A}_{f_1}+\vec{A}_{f_2})\cdot(\vec{A}_{f_1}+\vec{A}_{f_2})\big\rangle = \big\langle \vec{A}_{f_1}\cdot\vec{A}_{f_1} + \vec{A}_{f_2}\cdot\vec{A}_{f_2} + 2\,\vec{A}_{f_1}\cdot\vec{A}_{f_2}\big\rangle\,.$$

Equations (1.9b) and (1.9c) can be applied to get

$$\big\langle \vec{A}\cdot\vec{A}\big\rangle = \big\langle \vec{A}_{f_1}\cdot\vec{A}_{f_1}\big\rangle + \big\langle \vec{A}_{f_2}\cdot\vec{A}_{f_2}\big\rangle + 2\big\langle \vec{A}_{f_1}\cdot\vec{A}_{f_2}\big\rangle\,. \qquad (1.12d)$$


Substituting Eqs. (1.12b) and (1.12c) into the cross term in Eq. (1.12d) gives

$$\begin{aligned}
\big\langle \vec{A}_{f_1}\cdot\vec{A}_{f_2}\big\rangle = \Big\langle\; &U_{f_1}U_{f_2}\cos\!\big(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_U^{(1)}\big)\cos\!\big(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_U^{(2)}\big)\\
+\; &V_{f_1}V_{f_2}\cos\!\big(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_V^{(1)}\big)\cos\!\big(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_V^{(2)}\big)\Big\rangle\,.
\end{aligned}$$

Again, Eqs. (1.9b) and (1.9c) are applied to get

$$\begin{aligned}
\big\langle \vec{A}_{f_1}\cdot\vec{A}_{f_2}\big\rangle = \; &U_{f_1}U_{f_2}\Big\langle \cos\!\big(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_U^{(1)}\big)\cos\!\big(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_U^{(2)}\big)\Big\rangle\\
+\; &V_{f_1}V_{f_2}\Big\langle \cos\!\big(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_V^{(1)}\big)\cos\!\big(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_V^{(2)}\big)\Big\rangle\,. \qquad (1.12e)
\end{aligned}$$

There is a trigonometric identity

$$(\cos\alpha)(\cos\beta) = \frac{1}{2}\cos(\alpha+\beta) + \frac{1}{2}\cos(\alpha-\beta)\,, \qquad (1.12f)$$
which shows that

$$\begin{aligned}
\cos\!\big(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_U^{(1)}\big)&\cos\!\big(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_U^{(2)}\big)\\
&= \frac{1}{2}\cos\!\big(2\pi(\sigma_{f_1}+\sigma_{f_2})z - 2\pi(f_1+f_2)t + (\delta_U^{(1)}+\delta_U^{(2)})\big)\\
&\quad + \frac{1}{2}\cos\!\big(2\pi(\sigma_{f_1}-\sigma_{f_2})z - 2\pi(f_1-f_2)t + (\delta_U^{(1)}-\delta_U^{(2)})\big)\,. \qquad (1.12g)
\end{aligned}$$

Taking the time average of both sides and applying Eqs. (1.9b) and (1.9c), we see that


$$\begin{aligned}
\Big\langle \cos\!\big(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_U^{(1)}\big)&\cos\!\big(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_U^{(2)}\big)\Big\rangle\\
&= \frac{1}{2}\Big\langle \cos\!\big(2\pi(\sigma_{f_1}+\sigma_{f_2})z - 2\pi(f_1+f_2)t + (\delta_U^{(1)}+\delta_U^{(2)})\big)\Big\rangle\\
&\quad + \frac{1}{2}\Big\langle \cos\!\big(2\pi(\sigma_{f_1}-\sigma_{f_2})z - 2\pi(f_1-f_2)t + (\delta_U^{(1)}-\delta_U^{(2)})\big)\Big\rangle\,.
\end{aligned}$$


Equation (1.11) requires both terms on the right-hand side to be zero, which gives


$$\Big\langle \cos\!\big(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_U^{(1)}\big)\cos\!\big(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_U^{(2)}\big)\Big\rangle = 0\,. \qquad (1.12h)$$


Replacing $\delta_U^{(1,2)}$ by $\delta_V^{(1,2)}$ in the algebra used to reach this result does not change the
conclusion, which means that


$$\Big\langle \cos\!\big(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_V^{(1)}\big)\cos\!\big(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_V^{(2)}\big)\Big\rangle = 0 \qquad (1.12i)$$

also. Substituting these two formulas into Eq. (1.12e) leads to

$$\big\langle \vec{A}_{f_1}\cdot\vec{A}_{f_2}\big\rangle = 0 \qquad (1.12j)$$

for any two frequencies $f_1$ and $f_2$ such that $f_1 \neq f_2$. Hence, Eq. (1.12d) can be written as

$$\big\langle \vec{A}\cdot\vec{A}\big\rangle = \big\langle \vec{A}_{f_1}\cdot\vec{A}_{f_1}\big\rangle + \big\langle \vec{A}_{f_2}\cdot\vec{A}_{f_2}\big\rangle\,. \qquad (1.12k)$$


Comparing the formula in (1.12k) for the intensity of a beam containing two monochromatic
wavefields to the left-hand side of the formula in (1.10d) for the intensity of a single
monochromatic wavefield, we note that the intensity of the beam with two monochromatic
wavefields is the sum of the intensities of each monochromatic wavefield.
The wavefield of a beam of light containing three monochromatic wavetrains of frequencies
$f_1$, $f_2$, and $f_3$ can be written as

$$\vec{A} = \vec{A}_{f_1} + \vec{A}_{f_2} + \vec{A}_{f_3} \qquad (1.13a)$$

with $\vec{A}_{f_1}$, $\vec{A}_{f_2}$ specified by formulas (1.12b) and (1.12c) respectively and $\vec{A}_{f_3}$ specified by

$$\vec{A}_{f_3} = \hat{x}\,U_{f_3}\cos\!\left(2\pi\sigma_{f_3} z - 2\pi f_3 t + \delta_U^{(3)}\right) + \hat{y}\,V_{f_3}\cos\!\left(2\pi\sigma_{f_3} z - 2\pi f_3 t + \delta_V^{(3)}\right). \qquad (1.13b)$$

Following the same analysis as before, we note that the intensity of this three-frequency light
beam is

$$\begin{aligned}
\big\langle \vec{A}\cdot\vec{A}\big\rangle &= \big\langle (\vec{A}_{f_1}+\vec{A}_{f_2}+\vec{A}_{f_3})\cdot(\vec{A}_{f_1}+\vec{A}_{f_2}+\vec{A}_{f_3})\big\rangle\\
&= \big\langle \vec{A}_{f_1}\cdot\vec{A}_{f_1}\big\rangle + \big\langle \vec{A}_{f_2}\cdot\vec{A}_{f_2}\big\rangle + \big\langle \vec{A}_{f_3}\cdot\vec{A}_{f_3}\big\rangle\\
&\quad + 2\big\langle \vec{A}_{f_1}\cdot\vec{A}_{f_2}\big\rangle + 2\big\langle \vec{A}_{f_1}\cdot\vec{A}_{f_3}\big\rangle + 2\big\langle \vec{A}_{f_2}\cdot\vec{A}_{f_3}\big\rangle\,.
\end{aligned}$$

Equation (1.12j) shows that

$$\big\langle \vec{A}_{f_1}\cdot\vec{A}_{f_2}\big\rangle = 0$$

for any two distinct frequencies $f_1$ and $f_2$. The only thing different about $\langle \vec{A}_{f_1}\cdot\vec{A}_{f_3}\rangle$ and
$\langle \vec{A}_{f_2}\cdot\vec{A}_{f_3}\rangle$ is the subscripts assigned to the distinct frequencies, so the same algebra showing
that $\langle \vec{A}_{f_1}\cdot\vec{A}_{f_2}\rangle$ is zero also shows that

$$\big\langle \vec{A}_{f_1}\cdot\vec{A}_{f_3}\big\rangle = \big\langle \vec{A}_{f_2}\cdot\vec{A}_{f_3}\big\rangle = 0\,.$$

Hence, the three-frequency formula for $\langle \vec{A}\cdot\vec{A}\rangle$ reduces to

$$\big\langle \vec{A}\cdot\vec{A}\big\rangle = \big\langle \vec{A}_{f_1}\cdot\vec{A}_{f_1}\big\rangle + \big\langle \vec{A}_{f_2}\cdot\vec{A}_{f_2}\big\rangle + \big\langle \vec{A}_{f_3}\cdot\vec{A}_{f_3}\big\rangle\,. \qquad (1.13c)$$

Here again, the intensity of the beam equals the sum of the intensities of its monochromatic
wavetrains.
This same argument can obviously be generalized to a beam consisting of N monochromatic
wavetrains. Since N may be left unspecified and can be made as large as we please, this is the
same as extending it to a beam of white light. The white-light wavefield can be written as


$$\vec{A} = \sum_{i=1}^{N}\vec{A}_{f_i}\,, \qquad (1.14a)$$

where

$$\vec{A}_{f_i} = \hat{x}\,U_{f_i}\cos\!\left(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_U^{(i)}\right) + \hat{y}\,V_{f_i}\cos\!\left(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_V^{(i)}\right) \qquad (1.14b)$$
with $f_i \neq f_j$ whenever $i \neq j$. The intensity of this beam is


$$\big\langle \vec{A}\cdot\vec{A}\big\rangle = \Big\langle \Big(\sum_{i=1}^{N}\vec{A}_{f_i}\Big)\cdot\Big(\sum_{j=1}^{N}\vec{A}_{f_j}\Big)\Big\rangle = \Big\langle \sum_{i=1}^{N}\sum_{j=1}^{N}\vec{A}_{f_i}\cdot\vec{A}_{f_j}\Big\rangle\,,$$

or, applying Eq. (1.9b),

$$\big\langle \vec{A}\cdot\vec{A}\big\rangle = \sum_{i=1}^{N}\sum_{j=1}^{N}\big\langle \vec{A}_{f_i}\cdot\vec{A}_{f_j}\big\rangle\,. \qquad (1.14c)$$
Equation (1.12j) requires

$$\big\langle \vec{A}_{f_i}\cdot\vec{A}_{f_j}\big\rangle = 0 \qquad (1.14d)$$

whenever $i \neq j$, so Eq. (1.14c) reduces to

$$\big\langle \vec{A}\cdot\vec{A}\big\rangle = \big\langle \vec{A}_{f_1}\cdot\vec{A}_{f_1}\big\rangle + \big\langle \vec{A}_{f_2}\cdot\vec{A}_{f_2}\big\rangle + \cdots + \big\langle \vec{A}_{f_N}\cdot\vec{A}_{f_N}\big\rangle = \sum_{i=1}^{N}\big\langle \vec{A}_{f_i}\cdot\vec{A}_{f_i}\big\rangle \qquad (1.14e)$$

because all the $i \neq j$ terms disappear. Equation (1.14e) shows that the intensity of any beam, even
a white-light beam, is the sum of the intensities of its monochromatic wavetrains. This is
sometimes called the principle of independent superposition,[17] and can be written as

$$I = I_{f_1} + I_{f_2} + \cdots + I_{f_N} = \sum_{i=1}^{N} I_{f_i}\,, \qquad (1.14f)$$
where

$$I = \big\langle \vec{A}\cdot\vec{A}\big\rangle \qquad (1.14g)$$

is the total intensity of the beam and

$$I_{f_i} = \big\langle \vec{A}_{f_i}\cdot\vec{A}_{f_i}\big\rangle \qquad (1.14h)$$

is the intensity of the beam’s monochromatic wavetrain of frequency $f_i$.
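A quick numerical check makes the principle concrete: square the sum of two wavetrains with distinct frequencies, time average, and compare with the sum of the individual intensities. Everything in the Python sketch below (frequencies, amplitudes, phases, averaging time) is an arbitrary demo value, scaled far below optical magnitudes:

```python
# Sketch of the principle of independent superposition, Eq. (1.14f):
# the time-averaged cross term between distinct frequencies vanishes,
# so the intensity of the combined beam is the sum of the intensities.
import numpy as np

f1, f2 = 3.0, 4.0                       # distinct frequencies (arbitrary units)
U1, U2 = 1.0, 0.5                       # amplitudes of the x-oscillations
T = 500.0                               # long averaging time
t = np.linspace(-T, T, 1_000_001)

A1 = U1 * np.cos(2 * np.pi * f1 * t + 0.3)   # wavetrain 1, x-component
A2 = U2 * np.cos(2 * np.pi * f2 * t + 1.1)   # wavetrain 2, x-component

I_total = np.trapz((A1 + A2) ** 2, t) / (2 * T)
I1 = np.trapz(A1 ** 2, t) / (2 * T)          # tends to U1**2 / 2, Eq. (1.10d)
I2 = np.trapz(A2 ** 2, t) / (2 * T)          # tends to U2**2 / 2
print(f"I_total = {I_total:.6f}   I1 + I2 = {I1 + I2:.6f}")
```

The two printed numbers agree to within the leftover fractional-cycle error, which is the numerical face of Eq. (1.14d).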
Returning now to Fig. 1.16, we suppose that Eqs. (1.14f)–(1.14h) refer to beam TR and
consider how to write the disturbance for beam RT. In an ideal Michelson interferometer, the
only difference between beam RT and beam TR is that the wavefields in beam RT lag behind the
wavefields in beam TR by a distance χ = 2p that is usually called the optical-path difference.
Using the notation specified in Eq. (1.14b), we see that for every monochromatic wavetrain


[17] J. Chamberlain, The Principles of Interferometric Spectroscopy (John Wiley & Sons, New York, 1979), p. 98.

$$\vec{A}^{\,(TR)}_{f_i} = \hat{x}\,U_{f_i}\cos\!\left(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_U^{(i)}\right) + \hat{y}\,V_{f_i}\cos\!\left(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_V^{(i)}\right) \qquad (1.15a)$$

in beam TR, there must be, according to Fig. 1.16, a corresponding monochromatic wavetrain


$$\vec{A}^{\,(RT)}_{f_i} = \hat{x}\,U_{f_i}\cos\!\left(2\pi\sigma_{f_i}(z+\chi) - 2\pi f_i t + \delta_U^{(i)}\right) + \hat{y}\,V_{f_i}\cos\!\left(2\pi\sigma_{f_i}(z+\chi) - 2\pi f_i t + \delta_V^{(i)}\right) \qquad (1.15b)$$

in beam RT. The total disturbance for the combined beams’ $f_i$th wavetrain is then

$$\vec{A}^{\,(RT)}_{f_i} + \vec{A}^{\,(TR)}_{f_i}$$

in Fig. 1.16. We also note, however, that the beam splitter in Fig. 1.16 is evidently not the same
sort of beam splitter as the one used by Michelson, because it does not reverse the direction of the
oscillation of the TR beam the way the beam splitter in Fig. 1.8 did. For this sort of beam
splitter, the total disturbance of the combined beam’s $f_i$th wavetrain should be

$$\vec{A}^{\,(RT)}_{f_i} - \vec{A}^{\,(TR)}_{f_i}$$

according to the discussion at the end of Sec. 1.1. To accommodate both possibilities, we write
the $f_i$th wavetrain of the combined beam as

$$\vec{A}^{\,(cb)}_{f_i} = \vec{A}^{\,(RT)}_{f_i} + W\,\vec{A}^{\,(TR)}_{f_i}\,, \qquad (1.15c)$$

where parameter W is −1 for Michelson-type beam splitters and 1 for non-Michelson beam
splitters. The superscript (cb) indicates that the disturbance $\vec{A}^{\,(cb)}_{f_i}$ is the $f_i$th wavetrain of two
beams combined in a balanced way—that is, each beam has undergone one transmission and one
reflection at the beam splitter. The intensity of the combined $f_i$th wavetrain is

$$\begin{aligned}
I^{(cb)}_{f_i} = \big\langle \vec{A}^{\,(cb)}_{f_i}\cdot\vec{A}^{\,(cb)}_{f_i}\big\rangle &= \big\langle (\vec{A}^{\,(RT)}_{f_i} + W\vec{A}^{\,(TR)}_{f_i})\cdot(\vec{A}^{\,(RT)}_{f_i} + W\vec{A}^{\,(TR)}_{f_i})\big\rangle\\
&= \big\langle \vec{A}^{\,(RT)}_{f_i}\cdot\vec{A}^{\,(RT)}_{f_i} + W^{2}\,\vec{A}^{\,(TR)}_{f_i}\cdot\vec{A}^{\,(TR)}_{f_i} + 2W\,\vec{A}^{\,(RT)}_{f_i}\cdot\vec{A}^{\,(TR)}_{f_i}\big\rangle\,.
\end{aligned}$$

Applying Eqs. (1.9b) and (1.9c) gives

$$I^{(cb)}_{f_i} = \big\langle \vec{A}^{\,(RT)}_{f_i}\cdot\vec{A}^{\,(RT)}_{f_i}\big\rangle + \big\langle \vec{A}^{\,(TR)}_{f_i}\cdot\vec{A}^{\,(TR)}_{f_i}\big\rangle + 2W\big\langle \vec{A}^{\,(RT)}_{f_i}\cdot\vec{A}^{\,(TR)}_{f_i}\big\rangle\,, \qquad (1.15d)$$

where we have recognized that W² = 1 because W = ±1. Since both disturbances have the same
frequency $f_i$, Eq. (1.12j) cannot be used to say that $\langle \vec{A}^{\,(RT)}_{f_i}\cdot\vec{A}^{\,(TR)}_{f_i}\rangle$ is zero. Substituting from
(1.15a) and (1.15b) gives


$$\begin{aligned}
\big\langle \vec{A}^{\,(RT)}_{f_i}\cdot\vec{A}^{\,(TR)}_{f_i}\big\rangle = \Big\langle\; &U_{f_i}^{2}\cos\!\big(2\pi\sigma_{f_i}(z+\chi) - 2\pi f_i t + \delta_U^{(i)}\big)\cos\!\big(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_U^{(i)}\big)\\
+\; &V_{f_i}^{2}\cos\!\big(2\pi\sigma_{f_i}(z+\chi) - 2\pi f_i t + \delta_V^{(i)}\big)\cos\!\big(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_V^{(i)}\big)\Big\rangle
\end{aligned}$$

or

$$\begin{aligned}
\big\langle \vec{A}^{\,(RT)}_{f_i}\cdot\vec{A}^{\,(TR)}_{f_i}\big\rangle = \; &U_{f_i}^{2}\Big\langle \cos\!\big(2\pi\sigma_{f_i} z + 2\pi\sigma_{f_i}\chi - 2\pi f_i t + \delta_U^{(i)}\big)\cos\!\big(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_U^{(i)}\big)\Big\rangle\\
+\; &V_{f_i}^{2}\Big\langle \cos\!\big(2\pi\sigma_{f_i} z + 2\pi\sigma_{f_i}\chi - 2\pi f_i t + \delta_V^{(i)}\big)\cos\!\big(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_V^{(i)}\big)\Big\rangle\,. \qquad (1.15e)
\end{aligned}$$


Formula (1.12f) shows that


$$\begin{aligned}
\cos\!\big(2\pi\sigma_{f_i} z + 2\pi\sigma_{f_i}\chi - 2\pi f_i t + \delta_U^{(i)}\big)&\cos\!\big(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_U^{(i)}\big)\\
&= \frac{1}{2}\cos\!\big(4\pi\sigma_{f_i} z + 2\pi\sigma_{f_i}\chi - 4\pi f_i t + 2\delta_U^{(i)}\big) + \frac{1}{2}\cos\!\big(2\pi\sigma_{f_i}\chi\big)\,.
\end{aligned}$$


Applying (1.9b) and (1.9c), we get that


$$\begin{aligned}
\Big\langle \cos\!\big(2\pi\sigma_{f_i} z + 2\pi\sigma_{f_i}\chi - 2\pi f_i t + \delta_U^{(i)}\big)&\cos\!\big(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_U^{(i)}\big)\Big\rangle\\
&= \frac{1}{2}\Big\langle \cos\!\big(4\pi\sigma_{f_i} z + 2\pi\sigma_{f_i}\chi - 4\pi f_i t + 2\delta_U^{(i)}\big)\Big\rangle + \frac{1}{2}\Big\langle \cos\!\big(2\pi\sigma_{f_i}\chi\big)\Big\rangle\,. \qquad (1.15f)
\end{aligned}$$

The time average of any time-independent quantity equals that quantity—that is,

$$\big\langle K\big\rangle = K \qquad (1.15g)$$

for any constant K. Equation (1.11) shows that

$$\Big\langle \cos\!\big(4\pi\sigma_{f_i} z + 2\pi\sigma_{f_i}\chi - 4\pi f_i t + 2\delta_U^{(i)}\big)\Big\rangle = 0\,.$$

These two results can be substituted into (1.15f) to get


$$\Big\langle \cos\!\big(2\pi\sigma_{f_i} z + 2\pi\sigma_{f_i}\chi - 2\pi f_i t + \delta_U^{(i)}\big)\cos\!\big(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_U^{(i)}\big)\Big\rangle = \frac{1}{2}\cos\!\big(2\pi\sigma_{f_i}\chi\big)\,. \qquad (1.15h)$$

Replacing $\delta_U^{(i)}$ by $\delta_V^{(i)}$ does not change the algebra used to derive (1.15h). It follows that

$$\Big\langle \cos\!\big(2\pi\sigma_{f_i} z + 2\pi\sigma_{f_i}\chi - 2\pi f_i t + \delta_V^{(i)}\big)\cos\!\big(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_V^{(i)}\big)\Big\rangle = \frac{1}{2}\cos\!\big(2\pi\sigma_{f_i}\chi\big)\,. \qquad (1.15i)$$

Substituting (1.15h) and (1.15i) into (1.15e) now gives


$$\big\langle \vec{A}^{\,(RT)}_{f_i}\cdot\vec{A}^{\,(TR)}_{f_i}\big\rangle = \frac{1}{2}\left(U_{f_i}^{2}+V_{f_i}^{2}\right)\cos\!\big(2\pi\sigma_{f_i}\chi\big)\,, \qquad (1.15j)$$

and this result can be put into (1.15d) to get


$$I^{(cb)}_{f_i} = \big\langle \vec{A}^{\,(RT)}_{f_i}\cdot\vec{A}^{\,(RT)}_{f_i}\big\rangle + \big\langle \vec{A}^{\,(TR)}_{f_i}\cdot\vec{A}^{\,(TR)}_{f_i}\big\rangle + W\left(U_{f_i}^{2}+V_{f_i}^{2}\right)\cos\!\big(2\pi\sigma_{f_i}\chi\big)\,. \qquad (1.15k)$$

For an ideal Michelson interferometer, the intensity of the $f_i$th monochromatic wavetrain in
the RT beam and the intensity of the $f_i$th monochromatic wavetrain in the TR beam must be
identical because they arise in a symmetric way from the $f_i$th wavetrain of the white-light beam
entering the instrument. We can imagine taking the moving mirror out of its interferometer
arm so that only the TR beam is reflected back to the beam splitter. This means that only the
$\vec{A}^{\,(TR)}_{f_i}$ monochromatic disturbance leaves the interferometer in the proper direction, and its
intensity is, of course, $\langle \vec{A}^{\,(TR)}_{f_i}\cdot\vec{A}^{\,(TR)}_{f_i}\rangle$. Taking out the fixed mirror in the other arm and
replacing the moving mirror in the first arm ensures that only the RT beam reflects back to the
beam splitter. Now $\langle \vec{A}^{\,(RT)}_{f_i}\cdot\vec{A}^{\,(RT)}_{f_i}\rangle$ is the intensity of the monochromatic disturbance leaving
the interferometer in the proper direction. Since we have just said that these two intensities must
be equal, it follows that


$$\big\langle \vec{A}^{\,(RT)}_{f_i}\cdot\vec{A}^{\,(RT)}_{f_i}\big\rangle = \big\langle \vec{A}^{\,(TR)}_{f_i}\cdot\vec{A}^{\,(TR)}_{f_i}\big\rangle\,. \qquad (1.16a)$$

Equation (1.10d) holds true for any monochromatic wavetrain $\vec{A}_f$ of frequency f, so it must
apply to wavetrain $\vec{A}^{\,(TR)}_{f_i}$ of frequency $f_i$. Hence, Eq. (1.15a) must mean that

$$\big\langle \vec{A}^{\,(TR)}_{f_i}\cdot\vec{A}^{\,(TR)}_{f_i}\big\rangle = \frac{1}{2}\left(U_{f_i}^{2}+V_{f_i}^{2}\right). \qquad (1.16b)$$

Equation (1.10d) also applies to wavetrain $\vec{A}^{\,(RT)}_{f_i}$ of frequency $f_i$ in Eq. (1.15b), which
similarly leads to

$$\big\langle \vec{A}^{\,(RT)}_{f_i}\cdot\vec{A}^{\,(RT)}_{f_i}\big\rangle = \frac{1}{2}\left(U_{f_i}^{2}+V_{f_i}^{2}\right). \qquad (1.16c)$$

The right-hand sides of (1.16b) and (1.16c) are the same, which makes sense since the left-hand
sides of (1.16b) and (1.16c) must satisfy Eq. (1.16a).
Again taking out the moving mirror, we note that then, in an ideal interferometer, one quarter
of the entering beam’s power ends up leaving the interferometer as beam TR traveling along the z
axis in Fig. 1.16. Hence, if $I^{(0)}_{f_i}$ is the intensity of the $f_i$th monochromatic wavetrain entering this
interferometer, we must have

$$\big\langle \vec{A}^{\,(TR)}_{f_i}\cdot\vec{A}^{\,(TR)}_{f_i}\big\rangle = \frac{1}{4}\,I^{(0)}_{f_i}\,. \qquad (1.17a)$$

Consulting Eq. (1.16a), we see that this means

$$\big\langle \vec{A}^{\,(RT)}_{f_i}\cdot\vec{A}^{\,(RT)}_{f_i}\big\rangle = \frac{1}{4}\,I^{(0)}_{f_i} \qquad (1.17b)$$

and, of course, Eqs. (1.16b) and (1.16c) then reveal that

$$I^{(0)}_{f_i} = 2\left(U_{f_i}^{2}+V_{f_i}^{2}\right). \qquad (1.17c)$$

Substituting Eqs. (1.17a)–(1.17c) into (1.15k) then leads to

$$I^{(cb)}_{f_i} = \frac{1}{2}\,I^{(0)}_{f_i} + \frac{W}{2}\,I^{(0)}_{f_i}\cos\!\big(2\pi\sigma_{f_i}\chi\big)$$

or

$$I^{(cb)}_{f_i} = \frac{1}{2}\,I^{(0)}_{f_i}\left[1 + W\cos\!\big(2\pi\sigma_{f_i}\chi\big)\right]. \qquad (1.17d)$$

Equation (1.17d) is the basic equation for the intensity of a monochromatic wavetrain leaving
an ideal Michelson interferometer when the intensity of the corresponding wavetrain entering the
interferometer is $I^{(0)}_{f_i}$ and the moving mirror is displaced from its ZPD position by a distance
$p = \chi/2$, as shown in Fig. 1.16. We note that for those values of χ = 2p where
$W\cos(2\pi\sigma_{f_i}\chi) = 1$, the intensity of the $f_i$th monochromatic wavetrain leaving the interferometer is
the same as the intensity of the $f_i$th monochromatic wavetrain entering the interferometer. This
corresponds to constructive interference of the $f_i$th monochromatic component of the RT and TR
beams. Suppose the beam entering the interferometer consists of just this one monochromatic
component. Glancing back at Fig. 1.1(b), we see that the power of the beam entering an ideal
Michelson interferometer can leave either by the combined RT and TR dotted beams or by the
two combined dash-dot beams traveling in the opposite direction to the incident beam. The dotted
beams are often called the balanced output of the interferometer, because each one has undergone
one transmission and one reflection at the beam splitter; similarly, the dash-dot beams are called
the unbalanced output, because one beam has undergone two reflections and the other beam has
undergone two transmissions. Conservation of energy requires that the power in all the
monochromatic beams leaving the ideal interferometer must equal the power in the one
monochromatic beam entering the interferometer. Hence, when constructive interference of the
balanced RT and TR beams makes their combined intensity equal to that of the beam entering the
interferometer, we know that destructive interference of the two unbalanced beams must make
their combined intensity equal to zero. Consequently, at each χ = 2p value where
$W\cos(2\pi\sigma_{f_i}\chi) = 1$, not only is the intensity of the balanced monochromatic beams the same as
that of the monochromatic beam entering the interferometer, but also the intensity of the
unbalanced monochromatic beams is zero. On the other hand, for moving-mirror positions where
χ = 2p has a value such that $W\cos(2\pi\sigma_{f_i}\chi) = -1$, the intensity of the combined monochromatic
RT and TR beams in Fig. 1.1(b) is zero according to Eq. (1.17d). At these moving-mirror
locations, the balanced output undergoes destructive interference. Conservation of energy then
requires the unbalanced output to undergo constructive interference and have the same intensity
as the monochromatic beam entering the interferometer.
This analysis can be generalized to any mirror position and value of χ = 2p. If $I^{(cu)}_{f_i}$ is the
intensity of the unbalanced monochromatic wavetrain and, as before, $I^{(0)}_{f_i}$ and $I^{(cb)}_{f_i}$ are the
intensities of the incident monochromatic wavetrain and balanced monochromatic wavetrain
respectively, then conservation of energy forces us to write

$$I^{(0)}_{f_i} = I^{(cb)}_{f_i} + I^{(cu)}_{f_i}\,. \qquad (1.18a)$$

Substituting from Eq. (1.17d), we get

$$I^{(0)}_{f_i} = \frac{1}{2}\,I^{(0)}_{f_i}\left[1 + W\cos\!\big(2\pi\sigma_{f_i}\chi\big)\right] + I^{(cu)}_{f_i}\,,$$

which can be solved for $I^{(cu)}_{f_i}$ to get

$$I^{(cu)}_{f_i} = \frac{1}{2}\,I^{(0)}_{f_i}\left[1 - W\cos\!\big(2\pi\sigma_{f_i}\chi\big)\right]. \qquad (1.18b)$$
This specifies the intensity of the $f_i$th monochromatic wavetrain in the unbalanced output of an
ideal Michelson interferometer.
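Both monochromatic output formulas are simple enough to tabulate directly. The Python sketch below is a minimal illustration with an assumed wavelength and a Michelson-type beam splitter (W = −1); it evaluates Eqs. (1.17d) and (1.18b) over a few wavelengths of optical-path difference and confirms the energy bookkeeping of Eq. (1.18a):

```python
# Sketch of Eqs. (1.17d) and (1.18b): balanced and unbalanced outputs of an
# ideal interferometer versus optical-path difference chi = 2p. The wavelength
# and W below are assumed demo values, not numbers taken from the text.
import numpy as np

lam = 632.8e-9                    # assumed monochromatic wavelength (m)
sigma = 1.0 / lam                 # wavenumber, Eq. (1.7b)
W = -1                            # Michelson-type beam splitter
I0 = 1.0                          # intensity of the entering wavetrain

chi = np.linspace(0.0, 5 * lam, 1001)                        # chi = 2p
I_cb = 0.5 * I0 * (1 + W * np.cos(2 * np.pi * sigma * chi))  # Eq. (1.17d)
I_cu = 0.5 * I0 * (1 - W * np.cos(2 * np.pi * sigma * chi))  # Eq. (1.18b)

assert np.allclose(I_cb + I_cu, I0)   # conservation of energy, Eq. (1.18a)
print(f"balanced output ranges from {I_cb.min():.3f} to {I_cb.max():.3f}")
print(f"at chi = 0 the balanced output is {I_cb[0]:.3f} (dark for W = -1)")
```

With W = −1 the balanced output is dark at ZPD, which is the central dark fringe of the white-light alignment procedure seen from the detector's point of view.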
The dashed lines in Fig. 1.17 show the positions of the moving mirror at which

$$\chi = \ldots,\ \frac{n}{\sigma_{f_i}},\ \frac{n+1}{\sigma_{f_i}},\ \frac{n+2}{\sigma_{f_i}},\ \ldots.$$

These are the positions where $I^{(cb)}_{f_i} = 0$ in Eq. (1.17d) when W = −1 for an interferometer using a
Michelson-type beam splitter. Substituting from Eq. (1.7b), this can also be written as

$$\chi = n\lambda_{f_i},\ (n+1)\lambda_{f_i},\ (n+2)\lambda_{f_i},\ \ldots,$$

where $\lambda_{f_i}$ is the wavelength of the $f_i$th monochromatic wavetrain. For beam splitters where
W = 1, of course, these dashed lines represent the moving-mirror positions at which $I^{(cb)}_{f_i} = I^{(0)}_{f_i}$. If
the moving mirror is slightly tilted, so that its surface crosses more than one dashed line, and the
beam entering the interferometer contains only the $f_i$th monochromatic wavetrain, then the
combined RT and TR beams leaving the interferometer have light and dark strips as the surface
of the tilted mirror crosses through those planes in space where an untilted mirror would produce
an all-bright or an all-dark balanced output. This connects Eq. (1.17d) to the bright and null
fringe patterns from a spectral line discussed in Sec. 1.4.
When a beam of white light passes through the interferometer—that is, a beam having many
different frequencies—the principle of independent superposition in Eq. (1.14f) requires the
intensity of the interferometer’s balanced output to be the sum of the intensities of each
monochromatic wavetrain,

$$I^{(cb)} = \sum_{i=1}^{N} I^{(cb)}_{f_i}\,,$$

which becomes, substituting from Eq. (1.17d),


$$I^{(cb)} = \sum_{i=1}^{N}\frac{1}{2}\,I^{(0)}_{f_i}\left[1 + W\cos\!\big(2\pi\sigma_{f_i}\chi\big)\right]. \qquad (1.19a)$$





FIGURE 1.17. [Dashed lines marking the moving-mirror positions at which χ equals $n\lambda_{f_i}$, $(n+1)\lambda_{f_i}$, $(n+2)\lambda_{f_i}$, and $(n+3)\lambda_{f_i}$—the nth through (n+3)rd crossings; the distance between adjacent dashed lines is $\lambda_{f_i}/2$.]
When describing natural sources of light, we often replace sums of discrete quantities with
integrals over continuous functions, and this transformation was perhaps even more characteristic
of late 19th-century science than it is of today’s physics. So it would be an automatic step for
Michelson and his contemporaries to define a spectral intensity function $I^{(0)}(f)$ to describe the
radiation entering the instrument. In this sort of mathematical formalism, we say that
$I^{(0)}(f)\,df$ is the optical intensity of all the radiation having frequency values between f and f + df
entering the interferometer. The intensity of the balanced output is then


$$I^{(cb)} = \frac{1}{2}\int_{0}^{\infty} I^{(0)}(f)\left[1 + W\cos\!\big(2\pi\sigma_f\chi\big)\right]df\,. \qquad (1.19b)$$

The physical meaning of Eq. (1.19b) is exactly the same as that of Eq. (1.19a); we have just replaced
$I^{(0)}_{f_i}$ by $I^{(0)}(f)\,df$ and changed the sum to an integral. We have also relied on variable f itself
instead of index i to label the different frequencies. To make this last tactic work, we just assume
that $I^{(0)}(f)$ is zero for those frequencies f that are not part of the original sum over i; this also
lets us specify the integral to be over all possible frequencies f between 0 and ∞. The
wavenumber $\sigma_f$ can be eliminated by substituting from the formula for f in (1.7d) to get


$$I^{(cb)} = \frac{1}{2}\int_{0}^{\infty} I^{(0)}(f)\left[1 + W\cos\!\left(\frac{2\pi f\chi}{c}\right)\right]df\,. \qquad (1.19c)$$

The only problem with this equation is the unreasonably large numbers required to represent f
at optical frequencies—when going from one extreme to the other across the visible spectrum, for
example, frequency f changes from about $4\times10^{14}$ Hz to about $7.5\times10^{14}$ Hz. Consequently,
today’s Fourier spectroscopists often use Eq. (1.7d) to eliminate f rather than σ from Eq. (1.19b).
To do this, we differentiate both sides of (1.7d) to get

$$df = c\,d\sigma \quad\text{or}\quad d\sigma = \frac{1}{c}\,df$$

and define

$$S(\sigma) = c\,I^{(0)}(c\sigma) \qquad (1.19d)$$

so that

$$S(\sigma)\,d\sigma = c\,I^{(0)}(c\sigma)\cdot\frac{1}{c}\,df$$

simplifies to

$$S(\sigma)\,d\sigma = I^{(0)}(c\sigma)\,df\,. \qquad (1.19e)$$

Now Eq. (1.7d) can be applied to (1.19c) to get

$$I^{(cb)} = \frac{1}{2}\int_{0}^{\infty} S(\sigma)\left[1 + W\cos\!\big(2\pi\sigma\chi\big)\right]d\sigma\,. \qquad (1.19f)$$
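The change of variables in Eqs. (1.19d) and (1.19e) is easy to sanity-check numerically: the wavenumber spectrum S(σ) must carry exactly the same total intensity as the frequency spectrum it came from. The Python fragment below is a minimal sketch; the Gaussian line shape, center frequency, and width are assumed demo values:

```python
# Check of Eqs. (1.19d)-(1.19e): with S(sigma) = c * I0(c * sigma),
# the integral of S over wavenumber equals the integral of I0 over frequency.
import numpy as np

c = 2.998e8                                    # speed of light (m/s)
f0, df = 5.0e14, 1.0e12                        # assumed line center and width (Hz)
I0 = lambda f: np.exp(-((f - f0) / df) ** 2)   # assumed spectral intensity

f = np.linspace(f0 - 8 * df, f0 + 8 * df, 20001)
sigma = f / c                                  # wavenumbers matching f, Eq. (1.7d)
S = c * I0(c * sigma)                          # Eq. (1.19d)

print(f"integral of I0 df     = {np.trapz(I0(f), f):.6e}")
print(f"integral of S  dsigma = {np.trapz(S, sigma):.6e}")   # same number
```

The factor of c in Eq. (1.19d) is exactly what keeps the area under the curve unchanged when the horizontal axis is rescaled from f to σ = f/c.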

To get the white-light intensity formulas for the unbalanced output, we can apply to the
unbalanced monochromatic formula the same analysis used on the balanced monochromatic
formula. Comparing the unbalanced formula (1.18b) to the balanced formula (1.17d), we see that
changing the sign of W is all that needs to be done to go from one to the other. Hence, when we
apply to the unbalanced formula the same algebra used on the balanced formula, the only
difference all the way through the derivation—and, of course, in the final results—is that W is
replaced by −W. Consequently, we can write down at once the unbalanced white-light formulas
corresponding to (1.19b), (1.19c), and (1.19f) as


$$I^{(cu)} = \frac{1}{2}\int_{0}^{\infty} I^{(0)}(f)\left[1 - W\cos\!\big(2\pi\sigma_f\chi\big)\right]df\,, \qquad (1.20a)$$

$$I^{(cu)} = \frac{1}{2}\int_{0}^{\infty} I^{(0)}(f)\left[1 - W\cos\!\left(\frac{2\pi f\chi}{c}\right)\right]df\,, \qquad (1.20b)$$

and

$$I^{(cu)} = \frac{1}{2}\int_{0}^{\infty} S(\sigma)\left[1 - W\cos\!\big(2\pi\sigma\chi\big)\right]d\sigma \qquad (1.20c)$$


respectively. Formulas (1.19b), (1.19c), and (1.19f) contain all the basic information needed to
understand how Fourier-transform spectroscopy works, and they were derived here using only those
facts that Michelson knew over 100 years ago about the nature of light. Unfortunately, they apply
only to an ideal interferometer; not surprisingly, the 19th-century approach used to derive them is
difficult to adapt to the study of both the random and nonrandom errors present in even the most
accurate of today’s Michelson interferometers. For this reason, in Chapter 4 we return to basic
principles and rederive the formula for $I^{(cb)}$ starting from the modern form of Maxwell’s
equations, this time being careful to include all the nonideal terms needed for the error analysis.
Formula (1.19f) is, however, already good enough—if we borrow several mathematical results
from Chapter 2—to explain why the fringes from even the thinnest of spectral lines discussed in
Sec. 1.4 must eventually fade away as χ = 2p increases.
1.6 Fringe Patterns of Finite-Width Spectral Lines
Finite-width spectral lines, such as the one in the top graph of Fig. 1.18, can be represented by a
spectral intensity function $I^{(0)}(f)$. We can also follow the standard practice of Fourier
spectroscopists and represent the finite-width spectral line by the $S(\sigma)$ function defined in Eq.
(1.19d) and plotted in the bottom graph of Fig. 1.18. If the intensity of a spectral line is described
by a narrow $I^{(0)}(f)$ function such as the one in the top graph of Fig. 1.18, which is significantly
different from zero only between two very closely spaced frequencies $f_1$ and $f_2$, then the
corresponding $S(\sigma)$ curve is significantly different from zero only between the two closely spaced
wavenumbers $\sigma_1 = f_1/c$ and $\sigma_2 = f_2/c$, as shown in the bottom graph of Fig. 1.18.
The right-hand side of Eq. (1.19f) can be split into the sum of a constant term and a term
that changes as the location coordinate p = χ/2 of the moving mirror changes,

$$I^{(cb)} = \frac{1}{2}\int_{0}^{\infty} S(\sigma)\,d\sigma + \frac{W}{2}\int_{0}^{\infty} S(\sigma)\cos\!\big(2\pi\sigma\chi\big)\,d\sigma\,. \qquad (1.21a)$$

Since σ > 0 in the integrals over dσ, nothing stops us from replacing $S(\sigma)$ by $S(|\sigma|)$ in the
second term to get

$$\int_{0}^{\infty} S(\sigma)\cos\!\big(2\pi\sigma\chi\big)\,d\sigma = \int_{0}^{\infty} S(|\sigma|)\cos\!\big(2\pi\sigma\chi\big)\,d\sigma\,. \qquad (1.21b)$$

Anticipating some of the Fourier material in Chapter 2, we note that, according to Eq. (2.11a)
in Chapter 2, the function $S(|\sigma|)$ is even because

$$S(|\sigma|) = S(|-\sigma|)\,,$$

and, of course, it is real because it represents a real physical quantity—the intensity of the
spectral line. Turning next to Eq. (2.34g) in Chapter 2, we see that because $S(|\sigma|)$ is a real and
even function, the cosine integral on the right-hand side of Eq. (1.21b) is one half of the Fourier
transform of S [if we specify that parameter σ in (1.21b) corresponds to variable t in (2.34g) and
that parameter χ in (1.21b) corresponds to variable f in (2.34g)]. Anticipating the material in
Chapter 2 one last time, we consult Eq. (2.35k) and note that if the nth derivative of S has a well-
defined Fourier transform, then for large values of its argument the Fourier transform of S
approaches zero like the reciprocal of the nth power of the absolute value of its argument. Since S
describes a spectral line—that is, a natural phenomenon—we expect it to have derivatives of all
orders and also expect those derivatives to have Fourier transforms. The argument of the Fourier
transform of S is χ, and we already know that the right-hand side of (1.21b) is half the Fourier
transform of S, so we can now conclude that
$$\int_{0}^{\infty} S(\sigma)\cos\!\big(2\pi\sigma\chi\big)\,d\sigma = \int_{0}^{\infty} S(|\sigma|)\cos\!\big(2\pi\sigma\chi\big)\,d\sigma = O\!\left(|\chi|^{-n}\right) \qquad (1.21c)$$

for positive values of n as χ → ∞. Applying this to Eq. (1.21a) shows that

$$I^{(cb)} = \frac{1}{2}\int_{0}^{\infty} S(\sigma)\,d\sigma + O\!\left(|\chi|^{-n}\right) \qquad (1.21d)$$

for large values of χ. Hence, as the moving mirror gets further and further from its ZPD location,
increasing the value of χ = 2p, the value of $I^{(cb)}$ eventually stops changing and approaches the
constant value

$$\lim_{\chi\to\infty} I^{(cb)} = \frac{1}{2}\int_{0}^{\infty} S(\sigma)\,d\sigma\,. \qquad (1.21e)$$

This happens for all types of intensity curves, not just those associated with spectral lines. If S
does represent a spectral line such as the one in Fig. 1.18, the brights and nulls associated with
the dashed lines in Fig. 1.17 eventually fade away. Consequently, no matter how the moving
mirror is tilted, no fringes can be seen. If the Michelson interferometer is being used as a ruler,
the fringe counting must stop. When the spectral line is a closely spaced multiplet, each line in
the group has a finite spectral width, ensuring that—no matter how the lines interact with each
other to form bright and dim regions in the overall fringe pattern—eventually any and all fringe
traces must disappear. Every spectral line found in nature produces light having some finite
spectral width, no matter how small, so this sort of fade-out is a universal phenomenon.
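All of this can be watched happening in a few lines of code. The Python sketch below evaluates the split form of Eq. (1.21a) for an assumed Gaussian line shape (every number in it is a demo value) and shows the balanced output collapsing onto the constant of Eq. (1.21e) as χ grows:

```python
# Sketch of Eqs. (1.21a) and (1.21e): for an assumed Gaussian spectral line
# S(sigma), the cosine term of the balanced output dies away as chi grows,
# so the fringes fade. W = -1 models a Michelson-type beam splitter.
import numpy as np

W = -1
s0, ds = 15000.0, 5.0                         # line center and width (cm^-1), assumed
sigma = np.linspace(s0 - 10 * ds, s0 + 10 * ds, 4001)
S = np.exp(-((sigma - s0) / ds) ** 2)         # assumed line shape

const = 0.5 * np.trapz(S, sigma)              # the chi -> infinity limit, Eq. (1.21e)

def I_cb(chi):
    """Balanced output of Eq. (1.21a) at optical-path difference chi (cm)."""
    fringe = np.trapz(S * np.cos(2 * np.pi * sigma * chi), sigma)
    return const + 0.5 * W * fringe

for chi in (0.0, 0.01, 0.05, 0.2):            # chi = 2p in cm
    print(f"chi = {chi:5.2f} cm:  I_cb = {I_cb(chi):8.4f}   (limit {const:.4f})")
```

At χ = 0 the Michelson-type output is completely dark; by χ = 0.2 cm the cosine term has decayed to numerical noise and the output sits at the constant value, which is exactly the fade-out just described.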
1.7 Fourier-Transform Spectrometers
In Michelson’s time there was no easy way to measure the intensity of the exit beam leaving the
interferometer, so it was not practical to measure the change in $I^{(cb)}$ as a function of χ = 2p in
order to determine the χ-dependent curve

$$\int_{0}^{\infty} S(\sigma)\cos\!\big(2\pi\sigma\chi\big)\,d\sigma$$

coming from the second term on the right-hand side of Eq. (1.21a). In the previous section we
found that this curve is half the Fourier transform of S. This means that if the curve could be












FIGURE 1.18. [Top: the spectral intensity $I^{(0)}(f)$ of a finite-width spectral line plotted against frequency f, significantly different from zero only between $f_1$ and $f_2$. Bottom: the corresponding $S(\sigma) = cI^{(0)}(c\sigma)$ plotted against wavenumber σ, nonzero only between $\sigma_1 = f_1/c$ and $\sigma_2 = f_2/c$.]
measured, then the Fourier transform could be reversed to get the shape of the S spectrum
entering the interferometer. In the 1950s, both optical detectors to measure $I^{(cb)}$ and digital
computers to reverse the Fourier transform became widely available. Spectroscopists began to
design and build spectrometers based on measuring $I^{(cb)}$ as a function of χ and then reversing the
Fourier transform to find S. Today, these sorts of instruments are usually called Fourier-transform
spectrometers.
Equation (1.21a) is an idealized form of the fundamental equation of Fourier-transform
spectroscopy. It describes the intensity of the beam leaving an interferometer whenever we

1) divide the beam into equal-amplitude secondary beams, and
2) recombine the two secondary beams after the wavefield of one is shifted a distance χ
with respect to the wavefield of the other.

Although this is exactly what happens inside a standard Michelson interferometer, Figs. 1.19(a)–
1.19(d) show that there are many other combinations of beam splitters and mirrors that divide and
recombine beams in this way.[18]

[18] To keep things simple, compensation plates and other secondary optical components have been omitted.
Figure 1.19(a) shows the first and perhaps most obvious modification. Michelson put the arms
of his interferometer at right angles to maximize the fringe shift due to the ether wind thought to
exist by 19th-century scientists. If all that is desired, however, is to divide and recombine beams,
then the two arms can be at any (reasonable) angle with respect to each other, as shown in Fig.
1.19(a). The setup in Fig. 1.19(a) may in fact have some advantages over the standard Michelson
interferometer; arranging for near-normal reflections off the beam splitter usually modifies the
polarization of the wavefields less than large-angle reflections (see Sec. 4.4 of Chapter 4 for an
explanation of polarization).
Figure 1.19(b) shows that the end mirrors can be replaced by retroreflectors like corner cubes
or cat’s-eyes. For best results, both arms should have the same type of retroreflector.
The discussion following Eq. (1.17d) above explains the difference between the balanced and unbalanced optical outputs leaving the standard Michelson interferometer. In Figs. 1.19(a) and 1.19(b), the unbalanced output cannot be detected because it goes back out along the entrance beam, making it impossible to separate the two. The interferometer in Fig. 1.19(c), however, shows that there are ways to keep the entrance beam separate from the unbalanced output, giving us access to both the balanced and unbalanced optical signals. According to Eqs. (1.19f) and (1.20c), if $I^{(cb)}$ is the intensity of the balanced output and $I^{(cu)}$ is the intensity of the unbalanced output, then

$$I^{(cb)} - I^{(cu)} = \int_0^\infty S(\sigma)\cos(2\pi\sigma\chi)\,d\sigma \qquad (1.22a)$$

and

$$I^{(cb)} + I^{(cu)} = \int_0^\infty S(\sigma)\,d\sigma. \qquad (1.22b)$$

¹⁸ To keep things simple, compensation plates and other secondary optical components have been omitted.

Equation (1.22a) shows that subtracting the output of the detectors measuring the balanced and
unbalanced signals eliminates the constant term and doubles the size of the signal component
containing the Fourier transform. Adding the detectors’ outputs in Eq. (1.22b) eliminates the
Fourier transform, producing the integrated spectral intensity of the entrance beam. This
integrated source intensity should, of course, remain constant during a spectral measurement
because Fourier-transform spectrometers are vulnerable to source fluctuations. Astronomers often
design their Fourier-transform spectrometers so that both the balanced and unbalanced outputs
are available. When they investigate the spectra of weak and fluctuating sources (such as
twinkling stars), these instruments allow them both to double the signal from—and to check the
constancy of—the radiances being measured. If the source fluctuates, formula (1.22b) can be
used to measure the fluctuation. Sometimes this allows the astronomer to rescale the Fourier
signal in (1.22a) to correct the spectral measurement.
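As an illustrative aside, Eqs. (1.22a) and (1.22b) are easy to check numerically. The Python sketch below is not a model of any real instrument: the two-line spectrum, the grids, and the idealized half-and-half split of the two outputs are assumptions made purely for illustration.

```python
import numpy as np

# Hypothetical spectrum: two narrow Gaussian lines on a wavenumber grid (arbitrary units).
sigma = np.linspace(0.0, 50.0, 4000)
d_sigma = sigma[1] - sigma[0]
S = np.exp(-0.5 * ((sigma - 20.0) / 0.3) ** 2) + 0.5 * np.exp(-0.5 * ((sigma - 23.0) / 0.3) ** 2)

chi = np.linspace(0.0, 2.0, 500)   # OPD values chi = 2p

# Idealized outputs: half the integrated intensity plus/minus half the cosine term.
cos_term = np.array([np.sum(S * np.cos(2 * np.pi * sigma * x)) * d_sigma for x in chi])
total = np.sum(S) * d_sigma
I_cb = 0.5 * total + 0.5 * cos_term   # balanced output
I_cu = 0.5 * total - 0.5 * cos_term   # unbalanced output

# Eq. (1.22a): the difference doubles the cosine (Fourier) signal component.
assert np.allclose(I_cb - I_cu, cos_term)
# Eq. (1.22b): the sum is the chi-independent integrated spectral intensity.
assert np.allclose(I_cb + I_cu, total)
print("integrated intensity:", total)
```

Subtracting the simulated detector outputs isolates the doubled cosine component, while their sum stays fixed, mirroring the use of Eq. (1.22b) to monitor source fluctuations.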
In a standard Michelson interferometer such as the one shown in Fig. 1.1(b), and in the setups shown in Figs. 1.19(a)–1.19(c), the wavefield of one recombining beam is displaced a distance χ with respect to the wavefield of the other whenever the moving mirror or corner cube is displaced from ZPD by a distance χ/2. In Fig. 1.19(d), however, the corner cube only has to move a distance χ/4 to displace one wavefield by χ with respect to the other. Equation (5.67) in Chapter 5 shows that larger values of χ lead to more detailed spectral measurements in standard Michelson interferometers, and the same holds true for the nonstandard interferometers discussed here. In particular, a setup such as the one shown in Fig. 1.19(d) lets us achieve larger χ values with smaller displacements of the corner cube. The moving corner cube is also, strictly speaking, no longer the retroreflector; plane mirrors in both arms are used to reverse the beam directions.
During the 1950s, it was established that Fourier-transform spectrometers had two basic advantages—often called the Jacquinot advantage and the Fellgett advantage—over contemporary types of prism-based and grating-based spectrometers.¹⁹ These advantages revealed that under many circumstances spectra measured by Fourier-transform spectrometers had a better signal-to-noise ratio than equivalent prism-based or grating-based instruments. With the popularization of fast Fourier transform (FFT) algorithms in the 1960s, Fourier-transform spectrometers soon established themselves as usually the first and best choice for measuring infrared spectra (electromagnetic radiation having wavelengths between 1 and 100 μm). The growing availability of personal and desktop computers in the late 1970s and 1980s made Fourier-transform systems more compact, powerful, and user-friendly. Over the past two decades, there has been a tendency to use standard Michelson configurations, such as those in Figs. 1.1(b) or 1.19(a), when designing the optics of Fourier-transform spectrometers. Standard Michelsons are well suited to the laser-based servo controls often used to maintain the alignment of the fixed and moving mirrors.

¹⁹ J. Chamberlain, The Principles of Interferometric Spectroscopy, p. 16.














FIGURE 1.19(a). [Michelson interferometer with its two arms at an oblique angle: entrance beam, beam splitter, fixed mirror, moving mirror (p = χ/2), and balanced signal detector.]
FIGURE 1.19(b). [Interferometer with fixed and moving corner cubes replacing the end mirrors (p = χ/2).]
FIGURE 1.19(c). [Corner-cube interferometer giving access to both outputs: entrance beam, beam splitter, fixed and moving corner cubes (p = χ/2), balanced signal detector, and unbalanced signal detector.]

FIGURE 1.19(d). [Interferometer in which the moving corner cube need travel only χ/4: entrance beam, beam splitter, fixed mirror, moving corner cube (p = χ/4), and balanced signal detector.]
1.8 Laser-Based Control Systems
Today’s Fourier-transform spectrometers often rely on laser-based servo systems to maintain
alignment and control the motion of the moving mirror. The average wavelength of the measured
spectra determines the standards of alignment and control required for good spectral
measurement. Systems designed to measure infrared spectra typically have lasers that work in the
visible. Not only do modest standards of alignment and control in the visible correspond to
extremely accurate standards of alignment and control in the infrared—because visible
wavelengths are much shorter than infrared wavelengths—but the infrared detectors responsible
for the spectral measurements are also easily shielded from stray laser light. The laser servo
systems follow many different designs. Figures 1.20(a) and 1.20(b) show a typical setup that may
not be exactly like any system now in use but that does present the basic ideas behind them.
In Fig. 1.20(a), a single laser beam is separated into beams A, B, and C by laser-beam
splitters. Separating one beam into three ensures that all three beams have the same wavelength.
The three beams enter the interferometer parallel to, and at the edges of, the entrance beam.
Figure 1.20(b) shows the path of beams A and B through the instrument; beam C is not shown
because it is out of the plane of the page, but it is assumed to follow a path similar to beams A
and B. The solid lines representing the laser beams are always parallel to the dotted lines showing
the path of the entrance beam through the interferometer; and the laser beams interact with the
interferometer’s beam splitter, fixed mirror, and moving mirror exactly the same way the
entrance beam does. Because all three laser beams are monochromatic wavetrains of wavelength
λ, the same reasoning used to produce Fig. 1.17 shows that we can draw a sequence of dashed
lines perpendicular to the laser beams to represent the moving-mirror positions where the laser
beams would form fringes. Just like in Fig. 1.17, each dashed line is separated from its two
nearest neighbors by λ/2. Taking the dashed lines to represent nulls, we note that if the moving
mirror has a slight tilt, as shown in Fig. 1.20(b), then the laser detector for beam B will see a near
null in the beam B fringe while the laser detector for beam A will see a near bright in the beam A
fringe. If the moving mirror is aligned in the plane of Fig. 1.20(b) but has a small out-of-plane
tilt, then the laser detector for beam C is sure to see a different fringe brightness than the laser
detectors for beams A and B. The three laser detectors send their signals to a servomechanism that readjusts the mirror tilt until all three detectors see the same fringe intensity, keeping the interferometer aligned while the moving mirror changes position. Often these servomechanisms
readjust the tilt of the fixed mirror instead of directly correcting the moving mirror’s tilt. It is not
difficult to design systems of this sort that can detect changes of λ/100 in the position of the
moving-mirror’s surface. The A, B, and C laser detectors can also be used to count fringes as the
moving mirror changes position, keeping a record of where the moving mirror is and how fast it
is moving. This information is almost always used to sample the interferometer’s output signal at
equally spaced positions of the moving mirror, and it is often sent to a servomechanism
responsible for producing steady motion in the moving mirror.
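As a rough illustration of the fringe-counting idea, the Python sketch below detects sign changes of a simulated laser fringe signal, each marking λ/4 of mirror travel, and samples a stand-in infrared signal at those instants. The wavelength, the uneven mirror trajectory, and the noise-free signals are all invented for the example; no real servo design is implied.

```python
import numpy as np

lam = 0.6328                                # assumed laser wavelength (micrometers)
t = np.linspace(0.0, 1.0, 200000)           # time during one scan (arbitrary units)
p = 50.0 * t + 0.5 * np.sin(3.0 * t)        # assumed mirror position: monotonic but uneven

fringe = np.cos(4.0 * np.pi * p / lam)      # laser fringe signal (chi = 2p)
ir_signal = np.exp(-(p - 25.0) ** 2)        # stand-in for the infrared interferogram

# Each sign change of the fringe marks lam/4 of mirror travel, so sampling the IR
# signal at the crossings gives equally spaced mirror positions despite uneven motion.
sign = np.signbit(fringe)
crossings = np.where(sign[1:] != sign[:-1])[0]
sampled_p = p[crossings]
sampled_ir = ir_signal[crossings]

print("mean position step:", np.diff(sampled_p).mean(), "expected:", lam / 4.0)
```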

___________


Chapters 2 and 3 spell out the mathematical ideas needed to analyze the performance of
Fourier-transform spectrometers, and they also establish the notation used to describe these ideas
in subsequent chapters. Readers who are already familiar with Fourier theory and random
functions can skip ahead to Chapter 4, returning to Chapters 2 and 3 as needed to refresh their
understanding. Chapter 4 starts with Maxwell’s equations, working with them to derive the
nonideal versions of Eq. (1.19f) and (1.20c) needed to understand both the nonrandom and
random sources of error in Fourier-transform spectrometers. We always assume a standard
Michelson configuration, such as the ones shown in Fig. 1.1(b) or 1.19(a), controlled by laser-
based metrology and alignment systems similar to the ones shown in Figs. 1.20(a) and 1.20(b).
These are arguably the most common type of Fourier-transform spectrometer in use today. Most
of the basic ideas applied here to these standard Michelson systems are also relevant to other
types of Fourier-transform spectrometers; anyone who reads and understands the analysis
presented in Chapters 4 through 8 will be able to modify the equations presented there so that
they apply to nonstandard Michelson configurations. One possible exception to this rule is Michelsons such as the one shown in Fig. 1.19(b) that use nonstandard retroreflectors to return
the split entrance beam to the beam splitter. These sorts of systems, which are outside the scope
of this book, are spared many forms of the “tilt” misalignment possible in a standard Michelson,
which is an advantage, but on the other hand exhibit shear types of misalignments, which
standard Michelsons do not have. The equations governing shear misalignment turn out to be
similar to those for tilt misalignment, but it does not necessarily make sense to analyze them as a
source of random error, the way tilt is analyzed in Chapter 7.

FIGURE 1.20(a). [A single laser beam is divided by laser-beam splitters into beams A, B, and C, which enter the interferometer parallel to, and at the edges of, the entrance beam.]
FIGURE 1.20(b). [Paths of laser beams A and B through the interferometer: laser-beam splitters, interferometer beam splitter, fixed and moving mirrors, laser fringe positions, laser detectors A and B, and the infrared detector.]
2
FOURIER THEORY
Many single-chapter introductions to Fourier theory follow a top-down approach, defining what a
Fourier transform is and then listing the mathematical consequences. Here, on the other hand, we
begin with more of a bottom-up approach, seeking not only to present the mathematical
formalism of Fourier transforms but also to give an intuitive feel for how they work and what
they mean. Once the basic idea is established, we need to know which data sequences and
functions have well-defined Fourier transforms. This topic is often scanted because Fourier
theory is notorious for providing no simple mathematical answers to this simple mathematical
question. Indeed, engineers, scientists, and applied mathematicians have a long tradition of using
Fourier transforms in mathematically improper—yet extremely useful—ways that usually give
the correct answer. To show why these techniques work, and also when they cannot be trusted,
there is a brief sketch of generalized function theory. This is followed by a discussion of the
Fourier series and the discrete Fourier transform, including an exact description of how they are
connected to the integral Fourier transform. The discrete Fourier transform is particularly
important because, almost without exception, the only type of Fourier transform calculated on
today’s computers is the discrete Fourier transform; without it, the Michelson interferometer
would be a much more limited instrument. The chapter then concludes with a brief discussion of
how Fourier transforms are applied to two-dimensional and three-dimensional functions.
2.1 Basic Concept of a Fourier Transform
The idea of a Fourier transform develops naturally from a simple idea for comparing the shape of two sequences of measurements. A sequence of measurements is really just a list of numbers, so when we compare sequences of measurements we compare the shapes of number lists graphed in the order of their measurement. We can suppose without any loss of generality that two lists, $u_k$ and $v_k$, have the same number of members with $k = 1, 2, \ldots, N$. Figures 2.1(a) and 2.1(b) show two lists $u_k$ and $v_k$ graphed against their index value k. Defining $\bar{u}$ and $\bar{v}$ to be the mean values of $u_k$ and $v_k$,

$$\bar{u} = \frac{1}{N}\sum_{k=1}^{N} u_k \qquad (2.1a)$$

and

$$\bar{v} = \frac{1}{N}\sum_{k=1}^{N} v_k, \qquad (2.1b)$$




FIGURE 2.1(a). [List $u_k$ graphed against increasing index k.]
FIGURE 2.1(b). [List $v_k$ graphed against increasing index k.]
we form the sum S of the products of the differences from the mean,

$$S = \sum_{k=1}^{N} (u_k - \bar{u})(v_k - \bar{v}). \qquad (2.2)$$

If the graphs of $u_k$ and $v_k$ have similar shapes, so that $u_k - \bar{u} \cong v_k - \bar{v}$ for most values of k, then $(u_k - \bar{u})$ and $(v_k - \bar{v})$ are very likely to have the same sign for most values of k. This means few terms in the sum are negative and S ends up being a large positive number. If $u_k$ and $v_k$ have little similarity in shape, then $(u_k - \bar{u})$ and $(v_k - \bar{v})$ are as likely to have opposite signs as the same sign and the terms in the sum are just as likely to be positive as they are to be negative. When this happens, S is a sum of terms that tend to cancel out, and the magnitude of S is likely to be small.
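A quick numerical check of this behavior is easy to set up. In the Python sketch below, where the lists themselves are made up for the example, the sum S of Eq. (2.2) comes out large and positive for two lists with similar shapes and comparatively small for dissimilar ones.

```python
import numpy as np

def shape_sum(u, v):
    """Sum of products of differences from the mean, Eq. (2.2)."""
    return np.sum((u - u.mean()) * (v - v.mean()))

k = np.arange(1, 201)
u = np.sin(0.1 * k) + 3.0                  # a smooth list
v_similar = 2.0 * np.sin(0.1 * k) - 1.0    # same shape, different scale and offset
rng = np.random.default_rng(0)
v_dissimilar = rng.normal(size=k.size)     # no shared shape

print(shape_sum(u, v_similar))      # large and positive
print(shape_sum(u, v_dissimilar))   # much smaller in magnitude
```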
The same basic idea can be applied to continuous functions u(t) and v(t). To create a formal correspondence between functions and lists, we define an interval Δt in t and match $u_k$ and $v_k$ to u(t) and v(t) with the equations

$$u_k = u(k\,\Delta t)\,\Delta t \quad\text{and}\quad v_k = v(k\,\Delta t).$$

Because u and v are continuous functions of time, we can assume that they vary in an unsurprising manner between the isolated points at $t = \Delta t, 2\Delta t, \ldots, N\Delta t$ at which they have been specified. Traditionally, the argument of functions u and v is called t and assumed to be time, but it is worth remembering that t can stand for any relevant physical parameter, such as length, voltage, current, etc. Now we can approximate Eq. (2.2) as

$$S \cong \int_{\Delta t}^{N\Delta t} \left(u(t) - \bar{u}\right)\left(v(t) - \bar{v}\right)dt, \qquad (2.3a)$$

where now

$$\bar{u} \cong \frac{1}{N\,\Delta t}\int_{\Delta t}^{N\Delta t} u(t)\,dt \qquad (2.3b)$$

and

$$\bar{v} \cong \frac{1}{N\,\Delta t}\int_{\Delta t}^{N\Delta t} v(t)\,dt. \qquad (2.3c)$$

Equations (2.3b) and (2.3c) just ensure that $\bar{u}$ and $\bar{v}$ are now the average values of u(t) and v(t) respectively. We note that the value of $\bar{u}$ has been redefined from what it was in Eq. (2.1a) above,

$$\bar{u}_{\text{new}} \cong \bar{u}_{\text{old}}/\Delta t,$$

whereas $\bar{v}$ has basically the same value as in Eq. (2.1b)—the only change is to replace the sum by the equivalent integral. At this point, the finite value of Δt is just a distraction, because it is the shapes of the continuous functions u(t) and v(t) that are being compared. Taking the limit as $\Delta t \to 0$ and $N \to \infty$ in such a way that

$$\lim_{\substack{\Delta t \to 0 \\ N \to \infty}} N\,\Delta t = T_{\max} = \text{constant}, \qquad (2.4a)$$

we get

$$S = \int_0^{T_{\max}} \left(u(t) - \bar{u}\right)\left(v(t) - \bar{v}\right)dt, \qquad (2.4b)$$

where

$$\bar{u} = \frac{1}{T_{\max}}\int_0^{T_{\max}} u(t)\,dt \qquad (2.4c)$$

and

$$\bar{v} = \frac{1}{T_{\max}}\int_0^{T_{\max}} v(t)\,dt. \qquad (2.4d)$$

We still expect S to be large when functions u and v have similar shapes and S to be small when
they have dissimilar shapes.
Equation (2.4b) can be written as

$$\begin{aligned}
S &= \int_0^{T_{\max}} \left(u(t) - \bar{u}\right)v(t)\,dt - \bar{v}\int_0^{T_{\max}} \left(u(t) - \bar{u}\right)dt \\
&= \int_0^{T_{\max}} \left(u(t) - \bar{u}\right)v(t)\,dt - \bar{v}\left[\int_0^{T_{\max}} u(t)\,dt - \bar{u}\,T_{\max}\right] \\
&= \int_0^{T_{\max}} u(t)\,v(t)\,dt - \bar{u}\int_0^{T_{\max}} v(t)\,dt - \bar{v}\left[\int_0^{T_{\max}} u(t)\,dt - \bar{u}\,T_{\max}\right] \\
&= \int_0^{T_{\max}} u(t)\,v(t)\,dt - \bar{u}\,\bar{v}\,T_{\max},
\end{aligned} \qquad (2.5)$$

where in the last step (2.4c) ensures that the term in the square brackets [ ] is zero and (2.4d) is used to replace the integral over v by $\bar{v}\,T_{\max}$. To get to Fourier theory from Eq. (2.5), we suppose v(t) to be an oscillatory function like $\sin(2\pi ft)$ or $\cos(2\pi ft)$ with $f \ne 0$. This makes function u the data—that is, the value of our measurement at time t is u(t). Equation (2.4d) then reveals, depending on whether we choose v to be a sine curve or a cosine curve, that

$$\bar{v}\,T_{\max} = \int_0^{T_{\max}} \sin(2\pi ft)\,dt = \frac{1}{2\pi f}\left[1 - \cos(2\pi f T_{\max})\right] \qquad (2.6a)$$

or

$$\bar{v}\,T_{\max} = \int_0^{T_{\max}} \cos(2\pi ft)\,dt = \frac{1}{2\pi f}\sin(2\pi f T_{\max}). \qquad (2.6b)$$

When v is a sine curve, $\bar{v}\,T_{\max}$ oscillates between $(\pi f)^{-1}$ and 0 as $T_{\max}$ increases; and when v is a cosine curve, $\bar{v}\,T_{\max}$ oscillates between $(2\pi f)^{-1}$ and $-(2\pi f)^{-1}$ as $T_{\max}$ increases. Keeping in mind that u(t) represents a function measured in a laboratory, if we want to compare the shape of u to either $\sin(2\pi ft)$ or $\cos(2\pi ft)$, common sense requires $T_{\max}$, the range of t over which data is gathered, to be much greater than 1/ƒ, the period of the sine or cosine curve to which we want to compare the data. Unless u entirely lacks a resemblance to the sine or cosine so that

$$\int_0^{T_{\max}} u(t)\,v(t)\,dt \cong 0$$

no matter how large $\bar{u}$ or $T_{\max}$ become, we expect

$$\int_0^{T_{\max}} u(t)\,v(t)\,dt$$

to be large when the u measurements are large, and small when the u measurements are small—and the integral’s magnitude should also increase as $T_{\max}$ increases. So when u represents a typical set of data that is not completely unlike v in shape, then

$$\int_0^{T_{\max}} u(t)\,v(t)\,dt = O(\bar{u}\,T_{\max})$$

or

$$\frac{1}{\bar{u}}\int_0^{T_{\max}} u(t)\,v(t)\,dt = O(T_{\max}).$$

Equations (2.6a) and (2.6b) show that $\bar{v}\,T_{\max}$ must remain somewhere between the two values $(\pi f)^{-1}$ and $-(2\pi f)^{-1}$ no matter how large $T_{\max}$ gets, which means

$$\bar{v}\,T_{\max} = O(f^{-1}).$$

Having already concluded that $T_{\max}$ has been chosen much larger than 1/ƒ, we expect

$$\frac{1}{\bar{u}}\int_0^{T_{\max}} u(t)\,v(t)\,dt = O(T_{\max}) \gg O(f^{-1}) = \bar{v}\,T_{\max},$$

which, of course, reduces to

$$\frac{1}{\bar{u}}\int_0^{T_{\max}} u(t)\,v(t)\,dt \gg \bar{v}\,T_{\max}.$$

Therefore, Eq. (2.5) can be approximated as

$$S = \bar{u}\left[\frac{1}{\bar{u}}\int_0^{T_{\max}} u(t)\,v(t)\,dt - \bar{v}\,T_{\max}\right] \cong \bar{u}\left[\frac{1}{\bar{u}}\int_0^{T_{\max}} u(t)\,v(t)\,dt\right] = \int_0^{T_{\max}} u(t)\,v(t)\,dt. \qquad (2.7)$$

The integral in (2.7) can be regarded as assigning the number S to the similarity in shape of u and
v, when v is a sine or cosine curve of frequency ƒ. Remembering where S came from, we realize
that this number is large when u and v have similar shapes and small when u and v have
dissimilar shapes.
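Evaluated over a range of trial frequencies, this integral already behaves like a rudimentary spectrum analyzer. In the Python sketch below, where the 1.7 Hz tone, the grids, and the noise level are invented for the example, a Riemann-sum approximation of the integral of u(t)cos(2πft) peaks when the comparison curve matches the frequency hidden in the data.

```python
import numpy as np

T_max = 50.0
t = np.linspace(0.0, T_max, 20000)
dt = t[1] - t[0]
rng = np.random.default_rng(1)
u = np.cos(2 * np.pi * 1.7 * t) + 0.5 * rng.normal(size=t.size)  # data with a 1.7 Hz tone

freqs = np.linspace(0.5, 3.0, 251)
# S(f): integral of u(t) cos(2 pi f t) dt over 0..T_max, as a Riemann sum
S_of_f = np.array([np.sum(u * np.cos(2 * np.pi * f * t)) * dt for f in freqs])

print("largest |S| at f =", freqs[np.argmax(np.abs(S_of_f))])   # close to 1.7
```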
2.2 Fourier Sine and Cosine Transforms
To make the ideas of the previous section mathematically rigorous, we define the Fourier sine transform of function u to be

$$\mathcal{S}^{(ft)}\left(u(t)\right) = 2\int_0^\infty u(t)\sin(2\pi ft)\,dt \qquad (2.8a)$$

and the Fourier cosine transform of u to be

$$\mathcal{C}^{(ft)}\left(u(t)\right) = 2\int_0^\infty u(t)\cos(2\pi ft)\,dt. \qquad (2.8b)$$

The notation $\mathcal{S}^{(ft)}(u(t))$ and $\mathcal{C}^{(ft)}(u(t))$ shows that the function u(t) is being multiplied by, respectively, the sine or cosine function having—as indicated by the superscript—an argument ft multiplied by 2π. The order of the ft product in the superscript does not matter because it does not matter in the arguments of the sine and cosine, so

$$\mathcal{S}^{(ft)}\left(u(t)\right) = \mathcal{S}^{(tf)}\left(u(t)\right) \quad\text{and}\quad \mathcal{C}^{(ft)}\left(u(t)\right) = \mathcal{C}^{(tf)}\left(u(t)\right).$$

In particular we know, because t is repeated in both u(t) and the superscript of $\mathcal{S}$ and $\mathcal{C}$, that t is the dummy variable of integration whereas ƒ, which is only contained in the superscript, is an independent parameter. This means the transforms $\mathcal{S}^{(ft)}(u(t))$ and $\mathcal{C}^{(ft)}(u(t))$ are themselves functions of the parameter ƒ,

$$U^{\mathcal{S}}(f) = 2\int_0^\infty u(t)\sin(2\pi ft)\,dt \qquad (2.8c)$$

and

$$U^{\mathcal{C}}(f) = 2\int_0^\infty u(t)\cos(2\pi ft)\,dt. \qquad (2.8d)$$

The “capital U” names of functions $U^{\mathcal{S}}$ and $U^{\mathcal{C}}$ show that they are mathematically associated with the original function u(t), created from u(t) by the integrals in (2.8c) and (2.8d).
Although the upper limit of integration is now ∞ in Eqs. (2.8a) and (2.8b), this should not be interpreted as taking the limit as $T_{\max} \to \infty$ in Eq. (2.7). The upper limit is put at ∞ just to eliminate $T_{\max}$ as an explicit parameter, and the idea behind the presence of $T_{\max}$—that u(t) represents the result of a measurement—is kept alive by placing restrictions on the type of function u can be. In particular, we expect u(t), in some sense, to diminish or get small as t gets large, because it is impossible to measure data for all the times t out to ∞. It turns out that when the right sorts of restrictions are placed on u, the Fourier sine and cosine transforms can be inverted to recover the original functions,

$$u(t) = 2\int_0^\infty U^{\mathcal{S}}(f)\sin(2\pi ft)\,df \qquad (2.8e)$$

and

$$u(t) = 2\int_0^\infty U^{\mathcal{C}}(f)\cos(2\pi ft)\,df \qquad (2.8f)$$

for $t > 0$.
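As a concrete check of Eqs. (2.8b), (2.8d), and (2.8f), the decaying exponential $u(t) = e^{-at}$ with $a > 0$ has the closed-form cosine transform $U^{\mathcal{C}}(f) = 2a/(a^2 + 4\pi^2 f^2)$, and inverting it numerically returns u(t) for $t > 0$. The Python sketch below, in which the choice a = 1 and the sample points are arbitrary, uses SciPy quadrature.

```python
import numpy as np
from scipy.integrate import quad

a = 1.0
u = lambda t: np.exp(-a * t)   # test function for t > 0

def U_cosine(f):
    # Fourier cosine transform, Eq. (2.8d): 2 * integral of u(t) cos(2 pi f t) dt
    val, _ = quad(lambda t: 2.0 * u(t) * np.cos(2.0 * np.pi * f * t), 0.0, np.inf)
    return val

# Compare with the closed form 2a / (a^2 + 4 pi^2 f^2)
print(U_cosine(0.3), 2.0 * a / (a**2 + (2.0 * np.pi * 0.3) ** 2))

def u_recovered(t):
    # Inverse, Eq. (2.8f): 2 * integral of U_C(f) cos(2 pi f t) df (closed form used here)
    U = lambda f: 2.0 * a / (a**2 + (2.0 * np.pi * f) ** 2)
    val, _ = quad(lambda f: 2.0 * U(f) * np.cos(2.0 * np.pi * f * t), 0.0, np.inf, limit=200)
    return val

print(u_recovered(0.7), u(0.7))   # the two values should agree closely
```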
If we adopt the strictest definition of what is meant by the integral of a function between 0 and ∞, then Eqs. (2.8a)–(2.8f) are true when function u(t) satisfies the following four requirements:
(I) It is absolutely integrable.
(II) It is continuous except for a finite number of jump discontinuities.
(III) It is bounded on any finite interval $0 < a < t < b < \infty$.
(IV) It has finite variation on any finite interval $0 < a < t < b < \infty$.
We now show why function u(t) naturally satisfies all these restrictions when it represents a
(possibly idealized) measurement controlled or described by a continuous parameter t.
No matter what the argument t of function u represents—time, voltage, energy, etc.—function
u(t) can only be measured over a finite range of t. Although there may be no reason to think u is
zero or negligible when measured outside this range, we obviously cannot “make up” values for
what it might be. If we extrapolate to get the unmeasured t values, the extrapolation should not
dominate the information contained in u. In general, the measurement should be carried out in
such a way that the unmeasured or extrapolated values are of negligible importance compared to
the measured values. Mathematically we might say that there exists a positive, finite value of t, which we call $T_{\max}$, such that the important measured values of u are all at $t \le T_{\max}$. One way of expressing this constraint is to require

$$\int_0^{T_{\max}} |u(t)|\,dt \cong \int_0^\infty |u(t)|\,dt. \qquad (2.9a)$$

Since the left-hand integral ought to be finite, when (2.9a) is true, it follows that

$$\int_0^\infty |u(t)|\,dt < \infty. \qquad (2.9b)$$

Functions u that satisfy (2.9b) are said to be absolutely integrable; clearly, all functions representing possible measurements share this quality, satisfying requirement (I) above.
Understanding requirement (II) requires some discussion of what it means to call an
experimental measurement continuous. To assign, with negligible experimental error, a definite
value of t to a measurement u, some minimum and finite change in t must occur between adjacent
measurements. In practice, continuous measurements are constructed by connecting sequences of
adjacent but separate points. We then assume that if u were measured between these already
known points, it would equal (to within experimental error) the values selected by connecting the
points. Thus, the continuity of u is a requirement that the measurement captures all the relevant
detail. In this sense, asserting that u is continuous is a type of idealization—just another way of
saying that the measurement is accurate and representative. This takes care of the first part of
requirement (II), but there is a second part permitting u to have a finite number of jump
discontinuities. Figure 2.2 shows a jump discontinuity in u(t). Jump discontinuities represent
another type of idealization—what can occur when, for example, instruments are turned on or off
during a measurement. Because it is unrealistic to have this happen an infinite number of times
over a finite range of t, it makes sense to say that all functions u representing measurements are
continuous over any finite range of t except for a finite number of jump discontinuities.
Consequently, we can expect all functions representing measurements to satisfy requirement (II).
Standard proofs that the Fourier transform of the Fourier transform returns the original function u usually end up showing as their final step that

$$2\int_0^\infty U^{\mathcal{S}}(f)\sin(2\pi ft)\,df = \lim_{\varepsilon\to 0^+}\frac{1}{2}\left[u(t+\varepsilon) + u(t-\varepsilon)\right] \qquad (2.9c)$$

and

$$2\int_0^\infty U^{\mathcal{C}}(f)\cos(2\pi ft)\,df = \lim_{\varepsilon\to 0^+}\frac{1}{2}\left[u(t+\varepsilon) + u(t-\varepsilon)\right]. \qquad (2.9d)$$

When u is continuous, this immediately reduces to the desired result, but when the integrals are evaluated at a jump discontinuity, such as at $t = t_0$ in Fig. 2.2, the limits on the right-hand side of (2.9c) and (2.9d) give u a value at the jump discontinuity that is probably different from the original value of u at the jump discontinuity. To keep this from happening, we define the value of u to be, for all values $t = t_{\text{jump}}$ marking the location of a jump discontinuity,

$$u(t_{\text{jump}}) = \lim_{\varepsilon\to 0^+}\frac{1}{2}\left[u(t_{\text{jump}}+\varepsilon) + u(t_{\text{jump}}-\varepsilon)\right]. \qquad (2.9e)$$

Modifying u this way cannot change the value of any integral whose integrand is the product of u
with another smooth function. The sine and cosine are smooth functions, so using (2.9e) to
modify the value of u at jump discontinuities does not change the values of the sine or cosine
transforms.
Measurements must be done with physically realizable equipment, which necessarily
produces finite values of u. This means there always exists a finite real number $B < \infty$ such that
$$|u(t)| < B \qquad (2.9f)$$

over any finite interval $0 < a < t < b < \infty$ when function u represents a measurement. Functions obeying this inequality are called bounded functions, so functions representing measurements always satisfy requirement (III).

FIGURE 2.2. [A function u(t) with a jump discontinuity at $t = t_0$.]
Requirement (IV) is a little bit more complicated to explain. Any function u(t) can be written as the difference of two other functions $u_1(t)$ and $u_2(t)$, as shown in Figs. 2.3(a) and 2.3(b),

$$u(t) = u_1(t) - u_2(t). \qquad (2.9g)$$

In Fig. 2.3(a), function u is drawn with a continuous line where it is increasing and with a dashed line where it is decreasing. In Fig. 2.3(b), we see that functions $u_1$ and $u_2$ are constructed so that every time u increases, $u_1$ also increases while $u_2$ remains the same, and every time u decreases, $u_2$ increases while $u_1$ remains the same. Consequently, for any function u and time values $b > a$, the differences $u_1(b) - u_1(a)$ and $u_2(b) - u_2(a)$ are non-negative and can only increase, which means that their sum




FIGURE 2.3(a). [Function u(t) between a and b, drawn solid where it increases and dashed where it decreases, with turning points at $t_1$, $t_2$, $t_3$.]
FIGURE 2.3(b). [The nondecreasing component functions $u_1(t)$ and $u_2(t)$ constructed from u(t).]

$$V_{ab}(u) = \left[u_1(b) - u_1(a)\right] + \left[u_2(b) - u_2(a)\right] \qquad (2.9h)$$

is also non-negative. Functions $u_1$ and $u_2$ have been constructed so that every time u goes up and down, the differences $u_1(b) - u_1(a)$ and $u_2(b) - u_2(a)$ increase, making the size of $V_{ab}(u)$ a record of how many times u oscillates in the interval $a < t < b$. We define $V_{ab}(u)$ to be the variation of u over the interval $a < t < b$, and if

$$V_{ab}(u) < \infty, \qquad (2.9i)$$

we say that u has finite variation over the interval $a < t < b$. Requirement (IV), that u have finite variation in any interval $0 < a < t < b < \infty$, means that u can only oscillate a finite number of times in that interval. The function $\sin\left((t-1)^{-1}\right)$, for example, does not have finite variation over any interval containing $t = 1$. If we attempted to measure a quantity that had infinite variation inside a finite interval, we would be blocked by the realization, already discussed above in connection with requirement (II), that adjacent measurements must be separated by some minimum value of t. If the measurement were repeated over and over, it would seem as if u were changing unpredictably in the region of infinite variation, leading us to wonder whether our measurement reflected the same physical reality. Therefore, our measurements cannot have infinite variation, and so any function u(t) representing a realistic measurement must also satisfy requirement (IV).
We see that requirements (I) through (IV) are always satisfied by functions representing
physically realizable measurements. It should be emphasized that requirements (I) through (IV)
are sufficient to ensure that Eqs. (2.8a)–(2.8f) hold true, but not necessary. It is easy to show that
there exist functions that do not meet requirements (I) through (IV) yet still satisfy Eqs. (2.8a)–
(2.8f). Consider, for example,

$$g(t) = \begin{cases} \pi & \text{for } 0 \le t < (2\pi)^{-1} \\ \pi/2 & \text{for } t = (2\pi)^{-1} \\ 0 & \text{for } t > (2\pi)^{-1}. \end{cases} \qquad (2.10a)$$

This test function clearly satisfies (I) through (IV) and so must have a Fourier cosine transform,

$$G^{\mathcal{C}}(f) = 2\int_0^{(2\pi)^{-1}} \pi\cos(2\pi ft)\,dt = \frac{\sin(f)}{f} \qquad (2.10b)$$

such that we return to the original function g by taking the cosine transform of the $G^{\mathcal{C}}$ transform,

$$g(t) = 2\int_0^\infty G^{\mathcal{C}}(f)\cos(2\pi ft)\,df = 2\int_0^\infty \frac{\sin(f)}{f}\cos(2\pi ft)\,df. \qquad (2.10c)$$

We could, however, just as easily have started with the function

$$h(t) = \frac{\sin(t)}{t}$$

and taken its cosine transform to get

$$H^{\mathcal{C}}(f) = 2\int_0^\infty \frac{\sin(t)}{t}\cos(2\pi ft)\,dt. \qquad (2.10d)$$

The integral in (2.10d) is clearly the same as the first integral in (2.10c) with the variables ƒ and t interchanged. Therefore,

$$H^{\mathcal{C}}(f) = g(f) = \begin{cases} \pi & \text{for } 0 \le f < (2\pi)^{-1} \\ \pi/2 & \text{for } f = (2\pi)^{-1} \\ 0 & \text{for } f > (2\pi)^{-1}. \end{cases}$$
Hence we know that h(t) satisfies Eqs. (2.8b), (2.8d), and (2.8f)—it is both cosine transformable
and its cosine transform returns the original function when cosine transformed—exactly because
g(t) in (2.10a) satisfies Eqs. (2.8b), (2.8d), and (2.8f). Yet h(t), unlike g(t), does not satisfy
requirements (I) through (IV)—in particular, it violates requirement (I) because it is not
absolutely integrable. To see that this is true, note that


$$\int_0^\infty \frac{|\sin(t)|}{t}\,dt = \sum_{j=1}^\infty \int_{\pi(j-1)}^{\pi j} \frac{|\sin(t)|}{t}\,dt \ge \sum_{j=1}^\infty \frac{1}{\pi j}\int_{\pi(j-1)}^{\pi j} \left|\sin(t)\right|dt = \frac{2}{\pi}\sum_{j=1}^\infty \frac{1}{j} \to \infty,$$

where the last step uses a well-known property of the harmonic series,

$$\sum_{j=1}^\infty \frac{1}{j} = \infty,$$

that it grows large without limit. This simple example also shows that just because a function g(t) satisfies requirements (I) through (IV), so that the transform of the transform returns the original function g(t), it does not necessarily follow that the transform itself satisfies requirements (I) through (IV).
Here is another example to show that, even though the transform of a function may exist, if requirements (I) through (IV) are violated, then the transform of the transform does not necessarily return the original function. We consider another test function,

$$z(t) = t^{-1}, \qquad (2.10e)$$

which is clearly not absolutely integrable because

$$\int_0^\infty \frac{dt}{t} = \lim_{\substack{A\to\infty \\ \varepsilon\to 0}} \int_\varepsilon^A \frac{dt}{t} = \lim_{\substack{A\to\infty \\ \varepsilon\to 0}} \left[\ln\left(\frac{A}{\varepsilon}\right)\right] = \infty,$$

violating requirement (I). The sine transform of z is

$$Z^{\mathcal{S}}(f) = 2\int_0^\infty \frac{\sin(2\pi ft)}{t}\,dt.$$

Any handbook of definite integrals shows that

$$Z^{\mathcal{S}}(f) = \begin{cases} 0 & \text{for } f = 0 \\ \pi & \text{for } f > 0. \end{cases} \qquad (2.10f)$$


Therefore, the sine transform $Z^{\mathcal{S}}$ of $z(t) = t^{-1}$ exists, yet the sine transform of the sine transform does not return z:

$$2\int_0^\infty Z^{\mathcal{S}}(f)\sin(2\pi ft)\,df = \lim_{F\to\infty} 2\pi\int_0^F \sin(2\pi ft)\,df = \lim_{F\to\infty} \frac{1}{t}\left[1 - \cos(2\pi Ft)\right] \ne \frac{1}{t}. \qquad (2.10g)$$

Clearly, if a function violates requirements (I) through (IV) yet has a well-defined sine or cosine transform, the sine transform of the sine transform and the cosine transform of the cosine transform must be checked explicitly to confirm that the original function is returned. The only exception is when the transform itself satisfies (I) through (IV) even though the original test function does not. Because we could just as easily have started with the transform itself instead of the original test function, we can conclude that the transform of the transform of the original function must return the original function. In general, repeatedly applying the sine or cosine transform just takes us back and forth between the same two functions, and the transformations are mathematically justified whenever at least one of those functions satisfies requirements (I) through (IV).
2.3 Even, Odd, and Mixed Functions
Fourier transform theory can be extended to include functions that are evaluated for negative as well as positive values of their arguments. To assist our analysis of these extended transforms, we decide to classify u as an even, odd, or mixed function. An even function u satisfies the constraint

$$u(-t) = u(t) \qquad (2.11a)$$

for all values of t, negative as well as positive; an odd function satisfies the constraint

$$u(-t) = -u(t) \qquad (2.11b)$$

for all values of t, negative as well as positive; and a mixed function is partly even and partly odd in the sense that it is the sum of an even function and an odd function, neither of which is identically zero. Any function u(t)—whether even, odd, or mixed—can be written as the sum of two functions, $u_e$ and $u_o$, with $u_e$ being an even function obeying (2.11a) and $u_o$ being an odd function obeying (2.11b),

$$u(t) = u_e(t) + u_o(t), \qquad (2.11c)$$

where

$$u_e(t) = \frac{1}{2}\left[u(t) + u(-t)\right] \qquad (2.11d)$$

and

$$u_o(t) = \frac{1}{2}\left[u(t) - u(-t)\right]. \qquad (2.11e)$$

Clearly,

$$u_e(-t) = \frac{1}{2}\left[u(-t) + u(t)\right] = u_e(t)$$

and

$$u_o(-t) = \frac{1}{2}\left[u(-t) - u(t)\right] = -u_o(t).$$

If u starts off as an even function, then $u_e = u$, and $u_o$ is identically zero; if u starts off as an odd function, then $u_o = u$, and $u_e$ is identically zero; and if u starts off as a mixed function, then neither $u_e$ nor $u_o$ is identically zero. If u is identically zero, it can be regarded as either even or odd, according to the classifier’s convenience.
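The decomposition in Eqs. (2.11c)–(2.11e) is easy to verify numerically. In the Python sketch below the mixed test function is an arbitrary choice; the code splits it into $u_e$ and $u_o$ on a symmetric grid and checks both the parity rules and the reconstruction.

```python
import numpy as np

u = lambda t: np.exp(-t**2) + 0.5 * t**3   # a mixed function: even piece plus odd piece

t = np.linspace(-3.0, 3.0, 601)            # grid symmetric about t = 0
u_e = 0.5 * (u(t) + u(-t))                 # Eq. (2.11d), even part
u_o = 0.5 * (u(t) - u(-t))                 # Eq. (2.11e), odd part

assert np.allclose(u_e, u_e[::-1])         # u_e(-t) = u_e(t)
assert np.allclose(u_o, -u_o[::-1])        # u_o(-t) = -u_o(t)
assert np.allclose(u_e + u_o, u(t))        # Eq. (2.11c): the parts rebuild u
print("odd part at t = 0:", u_o[300])      # zero, as Eq. (2.12a) below requires
```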
Figures 2.4(a) and 2.4(b) graph examples of even and odd functions respectively, and Fig. 2.4(c) shows a mixed function that is split up into its even and odd parts. We note that $\cos(2\pi ft)$ is an even function of both ƒ and t and $\sin(2\pi ft)$ is an odd function of both ƒ and t. One point worth remembering is that the behavior of even and odd functions is severely constrained near $t = 0$. For any odd function at $t = 0$, we have

$$u(0) = -u(-0) = -u(0)$$

from Eq. (2.11b). Since the only number equal to its own negative value is zero, all odd functions u(t) that have a well-defined value at $t = 0$ must be zero at $t = 0$,

$$u(0) = 0 \quad\text{if } u(0) \text{ exists and } u \text{ is odd.} \qquad (2.12a)$$

Because $u(-t) = u(t)$ for even functions, when t is near zero the value of u (if u is continuous) is almost constant. Therefore, when t is exactly zero the derivative of any even function u(t), if it is well defined, must be zero,

$$\left.\frac{du}{dt}\right|_{t=0} = 0 \quad\text{if the derivative at zero exists and } u \text{ is even.} \qquad (2.12b)$$

In fact, using the definition of the derivative

$$\frac{du}{dt} = \lim_{\varepsilon\to 0}\left[\frac{u(t+\varepsilon) - u(t)}{\varepsilon}\right] = \lim_{\varepsilon\to 0}\left[\frac{u(t) - u(t-\varepsilon)}{\varepsilon}\right],$$

when u is even we see that

$$\left.\frac{du}{dt}\right|_{t=t_o} = \lim_{\varepsilon\to 0}\left[\frac{u(t_o+\varepsilon) - u(t_o)}{\varepsilon}\right] = \lim_{\varepsilon\to 0}\left[\frac{u(-t_o-\varepsilon) - u(-t_o)}{\varepsilon}\right] = -\left.\frac{du}{dt}\right|_{t=-t_o}.$$

This shows that when u is even, the derivative of u is odd, and so from (2.12a), which states that odd functions are zero when their argument is zero, we know that (2.12b) must be true. Similarly, for any odd function u,
FIGURE 2.4(a). [An even function u(t).]
FIGURE 2.4(b). [An odd function u(t).]
FIGURE 2.4(c). [A mixed function u(t) graphed together with its even part $u_e(t)$ and odd part $u_o(t)$.]
$$\left.\frac{du}{dt}\right|_{t=t_o} = \lim_{\varepsilon\to 0}\left[\frac{u(t_o+\varepsilon) - u(t_o)}{\varepsilon}\right] = \lim_{\varepsilon\to 0}\left[\frac{-u(-t_o-\varepsilon) + u(-t_o)}{\varepsilon}\right] = \left.\frac{du}{dt}\right|_{t=-t_o},$$
showing that when u is odd, its derivative is even. The second derivative $d^2u/dt^2$ of an even function u is the first derivative of $du/dt$, which is odd, and so $d^2u/dt^2$ must be even; similarly, the third derivative $d^3u/dt^3$ is the first derivative of $d^2u/dt^2$, which is even, and so must be odd. Examining in this fashion ever higher derivatives of the even function u, we conclude that

odd function for 1, 3, 5,
when is even.
even function for 2, 4,
n
n
n
d u
u
n dt
­ ½

® ¾

¯ ¿





(2.12c)

The same reasoning applied to the derivatives of an odd function u shows that


even function for 1, 3, 5,
when is odd.
odd function for 2, 4, 6,
n
n
n
d u
u
n dt
­ ½

® ¾

¯ ¿





(2.12d)

Equation (2.12c) states that the odd-numbered derivatives of an even function are odd while the even-numbered derivatives of an even function are even, and Eq. (2.12d) states that the odd-numbered derivatives of an odd function are even while the even-numbered derivatives of an odd function are odd. Therefore, an immediate consequence of (2.12a), (2.12c), and (2.12d) is that the odd-numbered derivatives of an even function—if they exist and are well defined—are zero at $t = 0$ and the even-numbered derivatives of an odd function—if they exist and are well defined—are zero at $t = 0$.
2.4 Extended Sine and Cosine Transforms
We can now extend the sine and cosine transforms to include functions u(t) evaluated for negative as well as positive values of t while generalizing requirements (I) through (IV) previously applied to u for $t > 0$ in Sec. 2.2. The extended requirements are

(V) Function u(t) must satisfy
$$\int_{-\infty}^\infty |u(t)|\,dt < \infty. \qquad (2.13a)$$

(VI) Function u(t) must be continuous except for a finite number of jump discontinuities over any finite interval $-\infty < a < t < b < \infty$.

(VII) There must exist a finite positive number B such that
$$|u(t)| < B. \qquad (2.13b)$$

(VIII) The non-negative variation $V_{ab}(u)$ of function u(t) as defined in Eqs. (2.9g) and (2.9h) is finite over any finite interval $-\infty < a < t < b < \infty$,
$$V_{ab}(u) < \infty. \qquad (2.13c)$$

We also define the value of u at all its jump discontinuities to be given by Eq. (2.9e). These new
requirements are clearly just the old set of requirements extended to cover negative as well as
positive values of t.
The extended Fourier sine transform of u is

$$\mathcal{S}_E^{(ft)}\left(u(t)\right) = \int_{-\infty}^\infty u(t)\sin(2\pi ft)\,dt, \qquad (2.14a)$$

and the extended Fourier cosine transform of u is

$$\mathcal{C}_E^{(ft)}\left(u(t)\right) = \int_{-\infty}^\infty u(t)\cos(2\pi ft)\,dt. \qquad (2.14b)$$

Just like in Eqs. (2.8a) and (2.8b), defining the standard sine and cosine transforms, the order of the ft product in the superscript does not matter:

$$\mathcal{S}_E^{(ft)}\left(u(t)\right) = \mathcal{S}_E^{(tf)}\left(u(t)\right) \quad\text{and}\quad \mathcal{C}_E^{(ft)}\left(u(t)\right) = \mathcal{C}_E^{(tf)}\left(u(t)\right).$$

We can write u as the sum of even and odd functions, $u(t) = u_e(t) + u_o(t)$, as described in Eq. (2.11c), and substitute this sum into the definitions of the extended sine and cosine transforms in (2.14a) and (2.14b) to get

$$\mathcal{S}_E^{(ft)}\left(u(t)\right) = \int_{-\infty}^\infty u_e(t)\sin(2\pi ft)\,dt + \int_{-\infty}^\infty u_o(t)\sin(2\pi ft)\,dt \qquad (2.15a)$$

and

$$\mathcal{C}_E^{(ft)}\left(u(t)\right) = \int_{-\infty}^\infty u_e(t)\cos(2\pi ft)\,dt + \int_{-\infty}^\infty u_o(t)\cos(2\pi ft)\,dt. \qquad (2.15b)$$

We note that the product of an even function $u_e$ and the sine, as well as the product of an odd function $u_o$ and the cosine, must be an odd function,

$$u_e(-t)\sin\left[2\pi f(-t)\right] = -\left[u_e(t)\sin(2\pi ft)\right] \qquad (2.16a)$$

and

$$u_o(-t)\cos\left[2\pi f(-t)\right] = -\left[u_o(t)\cos(2\pi ft)\right]. \qquad (2.16b)$$

The integral between $-\infty$ and $+\infty$ of any odd function $\varphi_o(t)$ can be thought of as the limit of the sum of a large number of small terms,

$$\int_{-\infty}^\infty \varphi_o(t)\,dt \cong \cdots + \varphi_o(-2\,dt)\,dt + \varphi_o(-dt)\,dt + \varphi_o(0)\,dt + \varphi_o(dt)\,dt + \varphi_o(2\,dt)\,dt + \cdots.$$

Because $\varphi_o$ is odd, $\varphi_o(0)$ is zero; $\varphi_o(-dt)\,dt = -\varphi_o(dt)\,dt$ and cancels $\varphi_o(dt)\,dt$; $\varphi_o(-2\,dt)\,dt = -\varphi_o(2\,dt)\,dt$ and cancels $\varphi_o(2\,dt)\,dt$; and so on. Therefore,²⁰

$$\int_{-\infty}^\infty \varphi_o(t)\,dt = 0, \qquad (2.17)$$

and Eqs. (2.15a) and (2.15b) can be written as

$$\mathcal{S}_E^{(ft)}\left(u(t)\right) = \int_{-\infty}^\infty u_o(t)\sin(2\pi ft)\,dt \qquad (2.18a)$$

and

$$\mathcal{C}_E^{(ft)}\left(u(t)\right) = \int_{-\infty}^\infty u_e(t)\cos(2\pi ft)\,dt. \qquad (2.18b)$$

The integral between $-\infty$ and $+\infty$ of any even function $\varphi_e(t)$ can be thought of as

$$\int_{-\infty}^\infty \varphi_e(t)\,dt \cong \cdots + \varphi_e(-2\,dt)\,dt + \varphi_e(-dt)\,dt + \varphi_e(0)\,dt + \varphi_e(dt)\,dt + \varphi_e(2\,dt)\,dt + \cdots.$$

Because $\varphi_e$ is even, $\varphi_e(-dt) = \varphi_e(dt)$, $\varphi_e(-2\,dt) = \varphi_e(2\,dt)$, and so on. Therefore, the integral over negative t has the same value as the integral over positive t and we can write

²⁰ Strictly speaking, we are here treating the integral between $-\infty$ and $+\infty$ as a Cauchy principal value, a concept introduced in Sec. 2.10 below.
$$\int_{-\infty}^\infty \varphi_e(t)\,dt = 2\int_0^\infty \varphi_e(t)\,dt. \qquad (2.19)$$

The product of $u_o$ and the sine is an even function,

$$u_o(-t)\sin\left[2\pi f(-t)\right] = \left[-u_o(t)\right]\left[-\sin(2\pi ft)\right] = u_o(t)\sin(2\pi ft), \qquad (2.20)$$

and the product of $u_e$ and the cosine, both of them even functions, is another even function.
Consequently, the extended sine and cosine transforms in Eqs. (2.18a) and (2.18b) are, according to (2.19), (2.8a), and (2.8b),

$$\mathcal{S}_E^{(ft)}\left(u(t)\right) = \int_{-\infty}^\infty u_o(t)\sin(2\pi ft)\,dt = 2\int_0^\infty u_o(t)\sin(2\pi ft)\,dt = \mathcal{S}^{(ft)}\left(u_o(t)\right) \qquad (2.21a)$$

and

$$\mathcal{C}_E^{(ft)}\left(u(t)\right) = \int_{-\infty}^\infty u_e(t)\cos(2\pi ft)\,dt = 2\int_0^\infty u_e(t)\cos(2\pi ft)\,dt = \mathcal{C}^{(ft)}\left(u_e(t)\right). \qquad (2.21b)$$

Equation (2.21a) shows that the extended sine transform of a function u(t) is the unextended sine transform of $u_o$, the odd component of u; and Eq. (2.21b) shows that the extended cosine transform of u(t) is the unextended cosine transform of $u_e$, the even component of u. Because the result will be needed later, we also show that the extended sine transform defined in Eq. (2.14a) is an odd function of ƒ,

$$\mathcal{S}_E^{((-f)t)}\left(u(t)\right) = \int_{-\infty}^\infty u(t)\sin(-2\pi ft)\,dt = -\int_{-\infty}^\infty u(t)\sin(2\pi ft)\,dt = -\mathcal{S}_E^{(ft)}\left(u(t)\right); \qquad (2.22a)$$

and a similar manipulation shows that the extended cosine transform defined in (2.14b) is an even function of ƒ,

$$\mathcal{C}_E^{((-f)t)}\left(u(t)\right) = \int_{-\infty}^\infty u(t)\cos(-2\pi ft)\,dt = \int_{-\infty}^\infty u(t)\cos(2\pi ft)\,dt = \mathcal{C}_E^{(ft)}\left(u(t)\right). \qquad (2.22b)$$

We now examine what happens when the extended sine and cosine transforms are applied twice to the same function. We define

$$U_E^{\mathcal{S}}(f) = \mathcal{S}_E^{(ft)}\left(u(t)\right) = \mathcal{S}^{(ft)}\left(u_o(t)\right) \qquad (2.23a)$$

and

$$U_E^{\mathcal{C}}(f) = \mathcal{C}_E^{(ft)}\left(u(t)\right) = \mathcal{C}^{(ft)}\left(u_e(t)\right), \qquad (2.23b)$$

where the second step in Eqs. (2.23a) and (2.23b) comes from (2.21a) and (2.21b). Taking the extended Fourier sine and cosine transforms of $U_E^{\mathcal{S}}$ and $U_E^{\mathcal{C}}$ respectively, we get

$$\mathcal{S}_E^{(tf)}\left(U_E^{\mathcal{S}}(f)\right) = \mathcal{S}_E^{(ft)}\left(U_E^{\mathcal{S}}(f)\right) = \int_{-\infty}^\infty U_E^{\mathcal{S}}(f)\sin(2\pi ft)\,df \qquad (2.24a)$$

and

$$\mathcal{C}_E^{(tf)}\left(U_E^{\mathcal{C}}(f)\right) = \mathcal{C}_E^{(ft)}\left(U_E^{\mathcal{C}}(f)\right) = \int_{-\infty}^\infty U_E^{\mathcal{C}}(f)\cos(2\pi ft)\,df. \qquad (2.24b)$$

The second step in (2.24a) and (2.24b) is there just to emphasize that we are allowed to change the order of the ft product in the superscripts.
Equation (2.22a) shows that the extended sine transform $U_E^{\mathcal{S}}$ is an odd function of ƒ, so its product with the sine is an even function of ƒ; and Eq. (2.22b) shows that the extended cosine transform $U_E^{\mathcal{C}}$ is an even function of ƒ, so its product with the cosine is also an even function of ƒ. Hence, according to (2.19), Eqs. (2.24a) and (2.24b) become

$$\mathcal{S}_E^{(tf)}\left(U_E^{\mathcal{S}}(f)\right) = 2\int_0^\infty U_E^{\mathcal{S}}(f)\sin(2\pi ft)\,df \qquad (2.25a)$$

and

$$\mathcal{C}_E^{(tf)}\left(U_E^{\mathcal{C}}(f)\right) = 2\int_0^\infty U_E^{\mathcal{C}}(f)\cos(2\pi ft)\,df. \qquad (2.25b)$$

But Eq. (2.23a) shows that $U_E^{\mathcal{S}}$ is also the unextended sine transform of $u_o$, so from (2.25a) we see that $\mathcal{S}_E^{(tf)}(U_E^{\mathcal{S}}(f))$ equals the unextended sine transform of the unextended sine transform of $u_o$, the odd component of function u. According to Eqs. (2.8a), (2.8c), and (2.8e), the unextended sine transform of the unextended sine transform returns the original function for positive values of t. This means that the extended sine transform of the extended sine transform, $\mathcal{S}_E^{(tf)}(U_E^{\mathcal{S}}(f))$, which we have just seen to be equal to the unextended sine transform of the unextended sine transform, must return $u_o$ for positive values of t. Consequently, for positive values of t, Eq. (2.25a) becomes

$$\mathcal{S}_E^{(tf)}\left(U_E^{\mathcal{S}}(f)\right) = 2\int_0^\infty U_E^{\mathcal{S}}(f)\sin(2\pi ft)\,df = u_o(t). \qquad (2.26a)$$

Function $u_o$ is, however, defined for all values of t according to the rule for odd functions $u_o(-t) = -u_o(t)$, and the integral

$$2\int_0^\infty U_E^{\mathcal{S}}(f)\sin\left(2\pi f(-t)\right)df$$

is also an odd function of t when we allow t to be both positive and negative,

$$2\int_0^\infty U_E^{\mathcal{S}}(f)\sin\left(2\pi f(-t)\right)df = -2\int_0^\infty U_E^{\mathcal{S}}(f)\sin(2\pi ft)\,df.$$

Consequently, the integral exists and is well defined for negative t whenever the integral exists and is well defined for positive t. We conclude that Eq. (2.26a) holds true for negative as well as positive t. Hence, using Eq. (2.23a) to substitute for $U_E^{\mathcal{S}}$ in Eq. (2.26a), we can write

$$\mathcal{S}_E^{(tf)}\left(\mathcal{S}_E^{(ft')}\left(u(t')\right)\right) = u_o(t). \qquad (2.26b)$$

This shows that taking the extended sine transform of the extended sine transform returns the odd component $u_o$ of function u for all values of t, both positive and negative. Switching now to the extended cosine transform $U_E^{\mathcal{C}}$, we see that Eq. (2.23b) shows the extended cosine transform $U_E^{\mathcal{C}}$ is also the unextended cosine transform of $u_e$, the even component of function u. From the right-hand side of Eq. (2.25b), we then know that $\mathcal{C}_E^{(tf)}(U_E^{\mathcal{C}}(f))$ is equal to the unextended cosine transform of the unextended cosine transform of $u_e$. Equations (2.8b), (2.8d), and (2.8f) show that the unextended cosine transform of the unextended cosine transform returns the original function for positive values of t. Consequently, the extended cosine transform of the extended cosine transform, $\mathcal{C}_E^{(tf)}(U_E^{\mathcal{C}}(f))$, which we have just seen to be equal to the unextended cosine transform of the unextended cosine transform of $u_e$, must also equal $u_e$ for positive values of t. This means that Eq. (2.25b) becomes (for positive values of t)

$$\mathcal{C}_E^{(tf)}\left(U_E^{\mathcal{C}}(f)\right) = 2\int_0^\infty U_E^{\mathcal{C}}(f)\cos(2\pi ft)\,df = u_e(t). \qquad (2.26c)$$

But $u_e(t)$ is defined for negative as well as positive values of t according to the rule $u_e(-t) = u_e(t)$ for even functions of t, and the integral

$$2\int_0^\infty U_E^{\mathcal{C}}(f)\cos(2\pi ft)\,df$$

is also an even function of t when t is allowed to be both positive and negative:

$$2\int_0^\infty U_E^{\mathcal{C}}(f)\cos\left(2\pi f(-t)\right)df = 2\int_0^\infty U_E^{\mathcal{C}}(f)\cos(2\pi ft)\,df.$$

Consequently, the integral exists and is well defined for negative t if it exists and is well defined for positive t. We conclude that Eq. (2.26c) is valid for both negative and positive t and that, substituting Eq. (2.23b) into Eq. (2.26c),

$$\mathcal{C}_E^{(tf)}\left(\mathcal{C}_E^{(ft')}\left(u(t')\right)\right) = u_e(t). \qquad (2.26d)$$

This shows that taking the extended cosine transform of the extended cosine transform returns $u_e$, the even component of function u, for all values of t both positive and negative. Equations (2.11d) and (2.11e), the original definitions of the even and odd components of a function u, show that Eqs. (2.26b) and (2.26d) can be written as

$$\mathcal{S}_E^{(tf)}\left(\mathcal{S}_E^{(ft')}\left(u(t')\right)\right) = \frac{1}{2}\left[u(t) - u(-t)\right] \qquad (2.26e)$$

and

$$\mathcal{C}_E^{(tf)}\left(\mathcal{C}_E^{(ft')}\left(u(t')\right)\right) = \frac{1}{2}\left[u(t) + u(-t)\right]. \qquad (2.26f)$$

Adding together the extended sine transform of the extended sine transform and the extended cosine transform of the extended cosine transform then gives

$$\mathcal{S}_E^{(tf)}\left(\mathcal{S}_E^{(ft')}\left(u(t')\right)\right) + \mathcal{C}_E^{(tf)}\left(\mathcal{C}_E^{(ft')}\left(u(t')\right)\right) = \frac{1}{2}\left[u(t) - u(-t)\right] + \frac{1}{2}\left[u(t) + u(-t)\right] = u(t). \qquad (2.26g)$$

We conclude that for any function u(t), the sum of the extended sine transform of the extended
sine transform and the extended cosine transform of the extended cosine transform returns the
original function.
One obvious way to proceed from this point is to define the Hartley transform


$$\begin{aligned}
\mathcal{H}^{(ft)}\left(u(t)\right) &= \int_{-\infty}^\infty u(t)\left[\cos(2\pi ft) + \sin(2\pi ft)\right]dt \\
&= \int_{-\infty}^\infty u(t)\cos(2\pi ft)\,dt + \int_{-\infty}^\infty u(t)\sin(2\pi ft)\,dt \\
&= \mathcal{C}_E^{(ft)}\left(u(t)\right) + \mathcal{S}_E^{(ft)}\left(u(t)\right) \\
&= U_E^{\mathcal{C}}(f) + U_E^{\mathcal{S}}(f),
\end{aligned} \qquad (2.26h)$$

where in the next-to-last step we use definitions (2.14a) and (2.14b) of the extended sine and cosine transforms and in the last step Eqs. (2.23a) and (2.23b) are used to write the extended sine and cosine transforms as functions of ƒ. The order of the ft product in the superscript is not important because, just like in the sine and cosine transforms, we have

$$\mathcal{H}^{(ft)}\left(u(t)\right) = \mathcal{H}^{(tf)}\left(u(t)\right).$$

Working with this definition, we see that the Hartley transform of the Hartley transform gives

$$\begin{aligned}
\mathcal{H}^{(tf)}\left(\mathcal{H}^{(ft')}\left(u(t')\right)\right) &= \mathcal{H}^{(tf)}\left(U_E^{\mathcal{C}}(f) + U_E^{\mathcal{S}}(f)\right) \\
&= \int_{-\infty}^\infty \left[U_E^{\mathcal{C}}(f) + U_E^{\mathcal{S}}(f)\right]\left[\cos(2\pi ft) + \sin(2\pi ft)\right]df.
\end{aligned} \qquad (2.26i)$$

According to Eqs. (2.22a) and (2.22b), the extended sine transform $U_E^{\mathcal{S}}$ is an odd function of ƒ and the extended cosine transform $U_E^{\mathcal{C}}$ is an even function of ƒ. Using the same reasoning as in Eqs. (2.16a) and (2.16b) above,

$$U_E^{\mathcal{C}}(-f)\sin\left[2\pi t(-f)\right] = -\left[U_E^{\mathcal{C}}(f)\sin(2\pi ft)\right]$$

and

$$U_E^{\mathcal{S}}(-f)\cos\left[2\pi t(-f)\right] = -\left[U_E^{\mathcal{S}}(f)\cos(2\pi ft)\right].$$

We see that $U_E^{\mathcal{C}}(f)\sin(2\pi ft)$ and $U_E^{\mathcal{S}}(f)\cos(2\pi ft)$ are both odd functions of ƒ, and Eq. (2.17) states that the integral between $-\infty$ and $+\infty$ of any odd function is zero. Therefore,

$$\int_{-\infty}^\infty U_E^{\mathcal{C}}(f)\sin(2\pi ft)\,df = \int_{-\infty}^\infty U_E^{\mathcal{S}}(f)\cos(2\pi ft)\,df = 0.$$

Now the Hartley transform of the Hartley transform in Eq. (2.26i) can be simplified to

$$\begin{aligned}
\mathcal{H}^{(tf)}\left(\mathcal{H}^{(ft')}\left(u(t')\right)\right) &= \int_{-\infty}^\infty \left[U_E^{\mathcal{C}}(f) + U_E^{\mathcal{S}}(f)\right]\left[\cos(2\pi ft) + \sin(2\pi ft)\right]df \\
&= \int_{-\infty}^\infty U_E^{\mathcal{C}}(f)\cos(2\pi ft)\,df + \int_{-\infty}^\infty U_E^{\mathcal{C}}(f)\sin(2\pi ft)\,df \\
&\qquad + \int_{-\infty}^\infty U_E^{\mathcal{S}}(f)\cos(2\pi ft)\,df + \int_{-\infty}^\infty U_E^{\mathcal{S}}(f)\sin(2\pi ft)\,df \\
&= \int_{-\infty}^\infty U_E^{\mathcal{C}}(f)\cos(2\pi ft)\,df + \int_{-\infty}^\infty U_E^{\mathcal{S}}(f)\sin(2\pi ft)\,df \\
&= \mathcal{C}_E^{(tf)}\left(U_E^{\mathcal{C}}(f)\right) + \mathcal{S}_E^{(tf)}\left(U_E^{\mathcal{S}}(f)\right).
\end{aligned}$$


Because $U_E^{\mathcal{S}}$ and $U_E^{\mathcal{C}}$ are respectively the extended sine and cosine transforms of u [see Eqs. (2.23a) and (2.23b)], we have

$$\mathcal{H}^{(tf)}\left(\mathcal{H}^{(ft')}\left(u(t')\right)\right) = \mathcal{C}_E^{(tf)}\left(\mathcal{C}_E^{(ft')}\left(u(t')\right)\right) + \mathcal{S}_E^{(tf)}\left(\mathcal{S}_E^{(ft')}\left(u(t')\right)\right),$$

which becomes, substituting from (2.26g),

$$\mathcal{H}^{(tf)}\left(\mathcal{H}^{(ft')}\left(u(t')\right)\right) = u(t). \qquad (2.26j)$$

We see that the Hartley transform of the Hartley transform returns the original function for both
positive and negative values of t. The Hartley transform was never very popular and is only rarely
encountered today. What is done instead, as we shall see in the next section, is to combine the
extended sine and cosine transforms into a single Fourier transform based on a complex
exponential.
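Although rarely used, the Hartley transform is simple to experiment with in its discrete form. In the Python sketch below, the cas kernel cos + sin is applied twice to a random test vector; the 1/N normalization and the test data are conventional choices for the discrete analogue rather than anything defined in this chapter, and the round trip mirrors Eq. (2.26j).

```python
import numpy as np

def dht(x):
    """Discrete Hartley transform built from the cas kernel cos + sin."""
    n = np.arange(x.size)
    k = n.reshape(-1, 1)
    cas = np.cos(2 * np.pi * k * n / x.size) + np.sin(2 * np.pi * k * n / x.size)
    return cas @ x

rng = np.random.default_rng(2)
x = rng.normal(size=64)

# The discrete transform applied twice returns N times the input, so dividing
# by N recovers x, the discrete counterpart of Eq. (2.26j).
assert np.allclose(dht(dht(x)) / x.size, x)
print("Hartley transform of the Hartley transform returns the original sequence.")
```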
2.5 Forward and Inverse Fourier Transforms
The Fourier transform is based on the well-known identity

$$e^{i\varphi} = \cos(\varphi) + i\sin(\varphi), \qquad (2.27)$$

where $i = \sqrt{-1}$.
For any real function u(t) satisfying requirements (V) through (VIII) in Sec. 2.4, we can add the extended cosine transform to i times the extended sine transform to get

$$\mathcal{C}_E^{(ft)}\left(u(t)\right) + i\,\mathcal{S}_E^{(ft)}\left(u(t)\right) = \int_{-\infty}^\infty u(t)\left[\cos(2\pi ft) + i\sin(2\pi ft)\right]dt = \int_{-\infty}^\infty e^{2\pi ift}\,u(t)\,dt. \qquad (2.28a)$$

From Eqs. (2.23a) and (2.23b), we have

$$\mathcal{C}_E^{(ft)}\left(u(t)\right) = U_E^{\mathcal{C}}(f) \quad\text{and}\quad \mathcal{S}_E^{(ft)}\left(u(t)\right) = U_E^{\mathcal{S}}(f),$$

which means (2.28a) can be written as

$$\int_{-\infty}^\infty e^{2\pi ift}\,u(t)\,dt = U_E^{\mathcal{C}}(f) + i\,U_E^{\mathcal{S}}(f). \qquad (2.28b)$$

Taking the extended sine transform of both sides of (2.28b) gives

$$\int_{-\infty}^\infty df\,\sin(2\pi ft)\int_{-\infty}^\infty dt'\,e^{2\pi ift'}u(t') = \int_{-\infty}^\infty U_E^{\mathcal{C}}(f)\sin(2\pi ft)\,df + i\int_{-\infty}^\infty U_E^{\mathcal{S}}(f)\sin(2\pi ft)\,df = i\int_{-\infty}^\infty U_E^{\mathcal{S}}(f)\sin(2\pi ft)\,df \qquad (2.28c)$$

because $U_E^{\mathcal{C}}(f)\sin(2\pi ft)$ is an odd function of ƒ and integrates to zero [see discussion after Eq. (2.26i) above]. Taking the extended cosine transform of both sides of Eq. (2.28b) gives

$$\int_{-\infty}^\infty df\,\cos(2\pi ft)\int_{-\infty}^\infty dt'\,e^{2\pi ift'}u(t') = \int_{-\infty}^\infty U_E^{\mathcal{C}}(f)\cos(2\pi ft)\,df + i\int_{-\infty}^\infty U_E^{\mathcal{S}}(f)\cos(2\pi ft)\,df = \int_{-\infty}^\infty U_E^{\mathcal{C}}(f)\cos(2\pi ft)\,df \qquad (2.28d)$$

because $U_E^{\mathcal{S}}(f)\cos(2\pi ft)$ is an odd function of ƒ and integrates to zero. Substitution of Eqs. (2.24a) and (2.24b) into (2.28c) and (2.28d) gives

$$\int_{-\infty}^\infty df\,\sin(2\pi ft)\int_{-\infty}^\infty dt'\,e^{2\pi ift'}u(t') = i\,\mathcal{S}_E^{(tf)}\left(U_E^{\mathcal{S}}(f)\right) \qquad (2.28e)$$

and

$$\int_{-\infty}^\infty df\,\cos(2\pi ft)\int_{-\infty}^\infty dt'\,e^{2\pi ift'}u(t') = \mathcal{C}_E^{(tf)}\left(U_E^{\mathcal{C}}(f)\right). \qquad (2.28f)$$

Since $\mathcal{C}_E^{(ft)}(u(t)) = U_E^{\mathcal{C}}(f)$ and $\mathcal{S}_E^{(ft)}(u(t)) = U_E^{\mathcal{S}}(f)$ [see Eqs. (2.23a) and (2.23b)], Eqs. (2.28e) and (2.28f) can be written as

$$\int_{-\infty}^\infty df\,\sin(2\pi ft)\int_{-\infty}^\infty dt'\,e^{2\pi ift'}u(t') = i\,\mathcal{S}_E^{(tf)}\left(\mathcal{S}_E^{(ft')}\left(u(t')\right)\right) \qquad (2.28g)$$

and

$$\int_{-\infty}^\infty df\,\cos(2\pi ft)\int_{-\infty}^\infty dt'\,e^{2\pi ift'}u(t') = \mathcal{C}_E^{(tf)}\left(\mathcal{C}_E^{(ft')}\left(u(t')\right)\right). \qquad (2.28h)$$

We now multiply both sides of (2.28g) by (−i) and sum the resulting equation with Eq. (2.28h) to get

$$\int_{-\infty}^\infty df\,\cos(2\pi ft)\int_{-\infty}^\infty dt'\,e^{2\pi ift'}u(t') - i\int_{-\infty}^\infty df\,\sin(2\pi ft)\int_{-\infty}^\infty dt'\,e^{2\pi ift'}u(t') = \mathcal{C}_E^{(tf)}\left(\mathcal{C}_E^{(ft')}\left(u(t')\right)\right) + \mathcal{S}_E^{(tf)}\left(\mathcal{S}_E^{(ft')}\left(u(t')\right)\right)$$

or, using the identity $e^{-i\varphi} = \cos(\varphi) - i\sin(\varphi)$,

$$\int_{-\infty}^\infty df\,e^{-2\pi ift}\int_{-\infty}^\infty dt'\,e^{2\pi ift'}u(t') = \mathcal{C}_E^{(tf)}\left(\mathcal{C}_E^{(ft')}\left(u(t')\right)\right) + \mathcal{S}_E^{(tf)}\left(\mathcal{S}_E^{(ft')}\left(u(t')\right)\right). \qquad (2.28i)$$

Equation (2.26g) simplifies this to

$$\int_{-\infty}^\infty df\,e^{-2\pi ift}\int_{-\infty}^\infty dt'\,e^{2\pi ift'}u(t') = u(t). \qquad (2.28j)$$

If, in Eq. (2.28a), we start out by adding the extended cosine transform to (−i) times the extended sine transform, then instead of Eqs. (2.28g) and (2.28h), we get [just replace i by (−i) everywhere]

$$\int_{-\infty}^\infty df\,\sin(2\pi ft)\int_{-\infty}^\infty dt'\,e^{-2\pi ift'}u(t') = -i\,\mathcal{S}_E^{(tf)}\left(\mathcal{S}_E^{(ft')}\left(u(t')\right)\right)$$

and

$$\int_{-\infty}^\infty df\,\cos(2\pi ft)\int_{-\infty}^\infty dt'\,e^{-2\pi ift'}u(t') = \mathcal{C}_E^{(tf)}\left(\mathcal{C}_E^{(ft')}\left(u(t')\right)\right).$$

Now we must multiply the top equation by i before summing it with the bottom equation to get

$$\int_{-\infty}^\infty df\,\cos(2\pi ft)\int_{-\infty}^\infty dt'\,e^{-2\pi ift'}u(t') + i\int_{-\infty}^\infty df\,\sin(2\pi ft)\int_{-\infty}^\infty dt'\,e^{-2\pi ift'}u(t') = \mathcal{C}_E^{(tf)}\left(\mathcal{C}_E^{(ft')}\left(u(t')\right)\right) + \mathcal{S}_E^{(tf)}\left(\mathcal{S}_E^{(ft')}\left(u(t')\right)\right)$$

or

$$\int_{-\infty}^\infty df\,e^{2\pi ift}\int_{-\infty}^\infty dt'\,e^{-2\pi ift'}u(t') = u(t). \qquad (2.28k)$$

Clearly, Eqs. (2.28j) and (2.28k) are basically the same identity, which can be written as

$$\int_{-\infty}^\infty df\,e^{\mp 2\pi ift}\int_{-\infty}^\infty dt'\,e^{\pm 2\pi ift'}u(t') = u(t). \qquad (2.28\ell)$$

As long as the exponent of e changes sign in the two integrals over ƒ and t, we get back the
original function. Looking at how Eqs. (2.28j) and (2.28k) are derived, we see that if the sign of
the exponent does not change, we get
$$\mathcal{C}_E^{(tf)}\left(\mathcal{C}_E^{(ft')}\left(u(t')\right)\right) - \mathcal{S}_E^{(tf)}\left(\mathcal{S}_E^{(ft')}\left(u(t')\right)\right)$$

instead of

$$\mathcal{C}_E^{(tf)}\left(\mathcal{C}_E^{(ft')}\left(u(t')\right)\right) + \mathcal{S}_E^{(tf)}\left(\mathcal{S}_E^{(ft')}\left(u(t')\right)\right).$$

Equations (2.26e) and (2.26f) then show that

$$\mathcal{C}_E^{(tf)}\left(\mathcal{C}_E^{(ft')}\left(u(t')\right)\right) - \mathcal{S}_E^{(tf)}\left(\mathcal{S}_E^{(ft')}\left(u(t')\right)\right) = u(-t),$$

which gives

$$\int_{-\infty}^\infty df\,e^{\pm 2\pi ift}\int_{-\infty}^\infty dt'\,e^{\pm 2\pi ift'}u(t') = u(-t). \qquad (2.28m)$$

This interesting result shows that when u is even so that $u(-t) = u(t)$, we still get back the original function, and when u is odd so that $u(-t) = -u(t)$, we just have to multiply by (−1) to retrieve u. Even when u is mixed, no information is lost; reversing the sign of the argument still gets us back to the original function. Replacing t by −t in (2.28m) takes us back to the original formula (2.28ℓ).
Up to this point, we have taken u to be real, but if Eq. (2.28A ) holds true when u is a real
function of a real argument, it must also hold true when u is a complex function of a real
argument. To show why this is so, we break complex functions u(t) of a real argument t into real
and imaginary parts,
( ) ( ) ( )
r i
u t u t iu t + ,

where
r
u and
i
u are both real functions of t. Substituting this complex-valued u(t) into the left-
hand side of (2.28A ) gives


[ ]
2 2
2 2 2 2
( ) ( )
( ) ( ) .
ift ift
r i
ift ift ift ift
r i
df e dt e u t iu t
df e dt e u t i df e dt e u t
r r
r r r r
· ·
´ ±
÷· ÷·
· · · ·
´ ´ ± ±
÷· ÷· ÷· ÷·
´ ´ ´ +
´ ´ ´ ´ +
³ ³
³ ³ ³ ³
B
B B



Since (2.28A ) holds for real functions
r
u and
i
u , this last expression must be equal to the
original complex function u,

( ) ( ) ( )
r i
u t iu t u t + ,
- 92 -
Forward and Inverse Fourier Transforms · 2.5
- 93 -
showing that Eq. (2.28A ) is true for complex functions of t as well as strictly real functions of t.
Similar reasoning shows that (2.28m) also holds true for complex functions of real variables.
Indeed, we can even apply this analysis to the unextended sine and cosine transforms to show that
the unextended sine transform of the unextended sine transform and the unextended cosine
transform of the unextended cosine transform return the original function (for positive values of
the argument) when the original function is complex.
We now define the Fourier transform of a complex function u with real argument t to be

$$\mathbf{F}^{(-ift)}\big(u(t)\big) = \int_{-\infty}^{\infty} u(t)\,e^{-2\pi ift}\,dt. \qquad (2.29a)$$

The notation for **F** introduced in (2.29a) explicitly shows that t, being repeated inside both upper and lower parentheses, is the dummy variable of integration, and that **F** produces a function of ƒ because ƒ is listed only in the upper parentheses. We call (2.29a) the forward Fourier transform and, when convenient, follow the custom of writing it with the upper-case letter of the transformed function,

$$U(f) = \int_{-\infty}^{\infty} u(t)\,e^{-2\pi ift}\,dt. \qquad (2.29b)$$

If (2.29a) is the forward transform, then the inverse Fourier transform is

$$\mathbf{F}^{(itf)}\big(U(f)\big) = \int_{-\infty}^{\infty} U(f)\,e^{2\pi ift}\,df. \qquad (2.29c)$$

In both the forward and inverse transform, the order of the tf product in the superscript is irrelevant, just as it is for the sine, cosine, and Hartley transforms,

$$\mathbf{F}^{(\pm itf)}\big(u(t)\big) = \mathbf{F}^{(\pm ift)}\big(u(t)\big) \quad\text{and}\quad \mathbf{F}^{(\pm itf)}\big(U(f)\big) = \mathbf{F}^{(\pm ift)}\big(U(f)\big).$$

What is important is the sign inside the superscript, since it determines whether the forward or inverse transform is being performed. Equation (2.28ℓ) shows, of course, that

$$\mathbf{F}^{(itf)}\Big(\mathbf{F}^{(-ift')}\big(u(t')\big)\Big) = \mathbf{F}^{(itf)}\big(U(f)\big) = \int_{-\infty}^{\infty} U(f)\,e^{2\pi ift}\,df = u(t). \qquad (2.29d)$$

It is entirely a matter of convention which Fourier transform is called the forward transform and which is called the reverse transform; all that matters is for (2.28ℓ) to be satisfied. Some authors change the sign of the exponent $2\pi ift$, defining the forward Fourier transform to be $\mathbf{F}^{(ift)}$,

$$\mathbf{F}^{(ift)}\big(u(t)\big) = \int_{-\infty}^{\infty} u(t)\,e^{2\pi ift}\,dt,$$

and the inverse Fourier transform to be $\mathbf{F}^{(-itf)}$,

$$\mathbf{F}^{(-itf)}\big(U(f)\big) = \int_{-\infty}^{\infty} U(f)\,e^{-2\pi ift}\,df.$$

Clearly, this convention also satisfies (2.28ℓ), with the inverse Fourier transform of the forward Fourier transform still returning the original function.
In physics and related disciplines, the frequency variable is often changed to $\omega = 2\pi f$, so that (2.28ℓ) becomes

$$\frac{1}{2\pi}\int_{-\infty}^{\infty} d\omega\,e^{\mp i\omega t}\!\int_{-\infty}^{\infty} dt'\,e^{\pm i\omega t'}u(t') = u(t). \qquad (2.30a)$$

Authors using the frequency variable ω allocate the factor of $1/(2\pi)$ in different ways when defining the forward and inverse Fourier transforms in terms of ω, with all reasonable possibilities chosen at one time or another:

$$\text{Forward Fourier transform of } u(t) \text{ is } \int_{-\infty}^{\infty} u(t)\,e^{\mp i\omega t}\,dt = U(\omega), \qquad (2.30b)$$
$$\text{Inverse Fourier transform of } U(\omega) \text{ is } \frac{1}{2\pi}\int_{-\infty}^{\infty} U(\omega)\,e^{\pm i\omega t}\,d\omega;$$

$$\text{Forward Fourier transform of } u(t) \text{ is } \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} u(t)\,e^{\mp i\omega t}\,dt = U(\omega), \qquad (2.30c)$$
$$\text{Inverse Fourier transform of } U(\omega) \text{ is } \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} U(\omega)\,e^{\pm i\omega t}\,d\omega;$$

$$\text{Forward Fourier transform of } u(t) \text{ is } \frac{1}{2\pi}\int_{-\infty}^{\infty} u(t)\,e^{\mp i\omega t}\,dt = U(\omega), \qquad (2.30d)$$
$$\text{Inverse Fourier transform of } U(\omega) \text{ is } \int_{-\infty}^{\infty} U(\omega)\,e^{\pm i\omega t}\,d\omega.$$

In each of the three pairs of definitions listed above, the plus and minus signs are synchronized, so if the top (bottom) sign is chosen for the first member of the pair, then the top (bottom) sign must also be chosen for the second member of the pair. This gives a total of six different ways of defining the forward and inverse Fourier transforms, and all six satisfy Eq. (2.30a).
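A quick numerical sketch can make the bookkeeping concrete. The Python fragment below is purely illustrative (the grid limits and sample count are arbitrary choices): it approximates the ƒ-based forward transform of Eq. (2.29a) for $u(t) = e^{-\pi t^2}$ by direct summation and reproduces the known analytic transform $e^{-\pi f^2}$, confirming that this convention carries no stray factors of $2\pi$.

```python
import numpy as np

# Illustrative grid, wide and fine enough that exp(-pi t^2) has decayed.
t = np.linspace(-8.0, 8.0, 4001)
dt = t[1] - t[0]
u = np.exp(-np.pi * t**2)

for f in (0.0, 0.5, 1.0):
    kernel = np.exp(-2j * np.pi * f * t)     # forward kernel of Eq. (2.29a)
    U = np.sum(u * kernel) * dt              # Riemann-sum approximation
    print(f, U.real, np.exp(-np.pi * f**2))  # numerical vs. analytic U(f)
```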
The unextended sine and cosine transforms—usually called just the sine and cosine transforms—can also be defined in many different ways. Equations (2.8a), (2.8c), (2.8e) and (2.8b), (2.8d), (2.8f) can be combined to write

$$4\int_0^{\infty} df\,\sin(2\pi ft)\!\int_0^{\infty} dt'\,u(t')\sin(2\pi ft') = u(t) \quad\text{for } t > 0 \qquad (2.31a)$$

and

$$4\int_0^{\infty} df\,\cos(2\pi ft)\!\int_0^{\infty} dt'\,u(t')\cos(2\pi ft') = u(t) \quad\text{for } t > 0. \qquad (2.31b)$$

Changing the frequency variable to $\omega = 2\pi f$ gives

$$\frac{2}{\pi}\int_0^{\infty} d\omega\,\sin(\omega t)\!\int_0^{\infty} dt'\,u(t')\sin(\omega t') = u(t) \quad\text{for } t > 0 \qquad (2.31c)$$

and

$$\frac{2}{\pi}\int_0^{\infty} d\omega\,\cos(\omega t)\!\int_0^{\infty} dt'\,u(t')\cos(\omega t') = u(t) \quad\text{for } t > 0. \qquad (2.31d)$$

Just like the factor of $1/(2\pi)$ in Eq. (2.30a), the factor of $2/\pi$ in (2.31c) and (2.31d) can be allocated three different ways when defining the forward and inverse sine and cosine transforms:

$$\text{Forward sine transform of } u(t) \text{ for } t > 0 \text{ is } \int_0^{\infty} u(t)\sin(\omega t)\,dt = U^{(\mathcal{S})}(\omega), \qquad (2.31e)$$
$$\text{Forward cosine transform of } u(t) \text{ for } t > 0 \text{ is } \int_0^{\infty} u(t)\cos(\omega t)\,dt = U^{(\mathcal{C})}(\omega),$$
$$\text{Inverse sine transform of } U^{(\mathcal{S})} \text{ is } \frac{2}{\pi}\int_0^{\infty} U^{(\mathcal{S})}(\omega)\sin(\omega t)\,d\omega = u(t) \text{ for } t > 0,$$
$$\text{Inverse cosine transform of } U^{(\mathcal{C})} \text{ is } \frac{2}{\pi}\int_0^{\infty} U^{(\mathcal{C})}(\omega)\cos(\omega t)\,d\omega = u(t) \text{ for } t > 0;$$

$$\text{Forward sine transform of } u(t) \text{ for } t > 0 \text{ is } \sqrt{\tfrac{2}{\pi}}\int_0^{\infty} u(t)\sin(\omega t)\,dt = U^{(\mathcal{S})}(\omega), \qquad (2.31f)$$
$$\text{Forward cosine transform of } u(t) \text{ for } t > 0 \text{ is } \sqrt{\tfrac{2}{\pi}}\int_0^{\infty} u(t)\cos(\omega t)\,dt = U^{(\mathcal{C})}(\omega),$$
$$\text{Inverse sine transform of } U^{(\mathcal{S})} \text{ is } \sqrt{\tfrac{2}{\pi}}\int_0^{\infty} U^{(\mathcal{S})}(\omega)\sin(\omega t)\,d\omega = u(t) \text{ for } t > 0,$$
$$\text{Inverse cosine transform of } U^{(\mathcal{C})} \text{ is } \sqrt{\tfrac{2}{\pi}}\int_0^{\infty} U^{(\mathcal{C})}(\omega)\cos(\omega t)\,d\omega = u(t) \text{ for } t > 0;$$

$$\text{Forward sine transform of } u(t) \text{ for } t > 0 \text{ is } \frac{2}{\pi}\int_0^{\infty} u(t)\sin(\omega t)\,dt = U^{(\mathcal{S})}(\omega), \qquad (2.31g)$$
$$\text{Forward cosine transform of } u(t) \text{ for } t > 0 \text{ is } \frac{2}{\pi}\int_0^{\infty} u(t)\cos(\omega t)\,dt = U^{(\mathcal{C})}(\omega),$$
$$\text{Inverse sine transform of } U^{(\mathcal{S})} \text{ is } \int_0^{\infty} U^{(\mathcal{S})}(\omega)\sin(\omega t)\,d\omega = u(t) \text{ for } t > 0,$$
$$\text{Inverse cosine transform of } U^{(\mathcal{C})} \text{ is } \int_0^{\infty} U^{(\mathcal{C})}(\omega)\cos(\omega t)\,d\omega = u(t) \text{ for } t > 0.$$

The reader should expect to encounter all three classes of definitions given in (2.31e)–(2.31g). The symmetric definitions in (2.31f) are the most popular, probably because they remove the distinction between the forward and inverse transform, letting us say that the sine transform of the sine transform and the cosine transform of the cosine transform return the original function for $t > 0$.

In today's optical-engineering textbooks—and user manuals for the fast Fourier transform—there is a tendency to choose Eqs. (2.29a)–(2.29d) as the definitions of the forward and inverse Fourier transform, and that is the convention followed here. It is perhaps somewhat unconventional not to use the frequency variable $\omega = 2\pi f$ when defining the sine and cosine transforms, but using ƒ rather than ω brings their definitions into conformity with the definitions chosen for the forward and inverse Fourier transforms.
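As a concrete check of the symmetric definitions in (2.31f), the short sketch below (illustrative only; the integration grid is an arbitrary choice) numerically evaluates the forward cosine transform of $u(t) = e^{-t}$, whose symmetric-convention transform is $\sqrt{2/\pi}\,/(1+\omega^2)$.

```python
import numpy as np

# Illustrative half-axis grid; e^{-t} is negligible by t = 40.
t = np.linspace(0.0, 40.0, 200001)
dt = t[1] - t[0]
u = np.exp(-t)

for w in (0.0, 1.0, 3.0):
    Uc = np.sqrt(2.0 / np.pi) * np.sum(u * np.cos(w * t)) * dt
    print(w, Uc, np.sqrt(2.0 / np.pi) / (1.0 + w**2))  # columns agree
```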
2.6 Fourier Transform as a Linear Operation

The forward and inverse Fourier transforms are linear operations. If α, β are any two complex constants and u(t), v(t) are two complex-valued functions of a real variable t, then the definition of a linear operator **L** is that

$$\mathbf{L}\big(\alpha u(t) + \beta v(t)\big) = \alpha\,\mathbf{L}\big(u(t)\big) + \beta\,\mathbf{L}\big(v(t)\big). \qquad (2.32a)$$

Examples of linear operators are multiplication by a specified function g(t),

$$\mathbf{L}_1\big(u(t)\big) = g(t)\,u(t),$$

differentiation with respect to t,

$$\mathbf{L}_2\big(u(t)\big) = \frac{du(t)}{dt},$$

and integration over the interval $t_1 < t < t_2$,

$$\mathbf{L}_3\big(u(t)\big) = \int_{t_1}^{t_2} u(t)\,dt.$$

We see that for these three examples

$$\mathbf{L}_1\big(\alpha u(t) + \beta v(t)\big) = \alpha\,g(t)\,u(t) + \beta\,g(t)\,v(t) = \alpha\,\mathbf{L}_1\big(u(t)\big) + \beta\,\mathbf{L}_1\big(v(t)\big),$$

$$\mathbf{L}_2\big(\alpha u(t) + \beta v(t)\big) = \alpha\,\frac{du(t)}{dt} + \beta\,\frac{dv(t)}{dt} = \alpha\,\mathbf{L}_2\big(u(t)\big) + \beta\,\mathbf{L}_2\big(v(t)\big),$$

and

$$\mathbf{L}_3\big(\alpha u(t) + \beta v(t)\big) = \alpha\!\int_{t_1}^{t_2} u(t)\,dt + \beta\!\int_{t_1}^{t_2} v(t)\,dt = \alpha\,\mathbf{L}_3\big(u(t)\big) + \beta\,\mathbf{L}_3\big(v(t)\big).$$

Combinations of linear operators are always linear; for example, the operator **Z** defined by

$$\mathbf{Z}\big(u(t)\big) = \mathbf{L}_3\Big(\mathbf{L}_1\big(u(t)\big)\Big)$$

must be linear because

$$\mathbf{Z}\big(\alpha u(t) + \beta v(t)\big) = \mathbf{L}_3\Big(\mathbf{L}_1\big(\alpha u(t) + \beta v(t)\big)\Big) = \mathbf{L}_3\Big(\alpha\,\mathbf{L}_1\big(u(t)\big) + \beta\,\mathbf{L}_1\big(v(t)\big)\Big) = \alpha\,\mathbf{L}_3\Big(\mathbf{L}_1\big(u(t)\big)\Big) + \beta\,\mathbf{L}_3\Big(\mathbf{L}_1\big(v(t)\big)\Big) = \alpha\,\mathbf{Z}\big(u(t)\big) + \beta\,\mathbf{Z}\big(v(t)\big). \qquad (2.32b)$$

We note that the forward Fourier transform

$$\mathbf{F}^{(-ift)}\big(u(t)\big) = \int_{-\infty}^{\infty} u(t)\,e^{-2\pi ift}\,dt$$

as defined in Eq. (2.29a) is, in fact, just $\mathbf{L}_3\big(\mathbf{L}_1(u(t))\big)$ with $g(t) = e^{-2\pi ift}$ in the $\mathbf{L}_1$ multiplication and $t_1 = -\infty$, $t_2 = \infty$ in the $\mathbf{L}_3$ integration. Similarly, the inverse Fourier transform is, interchanging the roles of the ƒ and t variables in Eq. (2.29b),

$$\mathbf{F}^{(ift)}\big(U(t)\big) = \int_{-\infty}^{\infty} U(t)\,e^{2\pi ift}\,dt,$$

showing it to be $\mathbf{L}_3\big(\mathbf{L}_1(U(t))\big)$ with $g(t) = e^{2\pi ift}$ in the $\mathbf{L}_1$ multiplication and $t_1 = -\infty$, $t_2 = \infty$ in the $\mathbf{L}_3$ integration. Equation (2.32b) thus shows that both the forward and inverse Fourier transforms are linear. The unextended and extended sine transforms in Eqs. (2.8a) and (2.14a),

$$\mathcal{S}^{(ft)}\big(u(t)\big) = 2\int_0^{\infty} u(t)\sin(2\pi ft)\,dt \quad\text{and}\quad \mathcal{S}_E^{(ft)}\big(u(t)\big) = \int_{-\infty}^{\infty} u(t)\sin(2\pi ft)\,dt,$$

are also both $\mathbf{L}_3\big(\mathbf{L}_1(u(t))\big)$: the unextended sine transform has $g(t) = 2\sin(2\pi ft)$ in the $\mathbf{L}_1$ multiplication and $t_1 = 0$, $t_2 = \infty$ in the $\mathbf{L}_3$ integration; and the extended sine transform has $g(t) = \sin(2\pi ft)$ in the $\mathbf{L}_1$ multiplication and $t_1 = -\infty$, $t_2 = \infty$ in the $\mathbf{L}_3$ integration. The unextended and extended cosine transforms in Eqs. (2.8b) and (2.14b),

$$\mathcal{C}^{(ft)}\big(u(t)\big) = 2\int_0^{\infty} u(t)\cos(2\pi ft)\,dt \quad\text{and}\quad \mathcal{C}_E^{(ft)}\big(u(t)\big) = \int_{-\infty}^{\infty} u(t)\cos(2\pi ft)\,dt,$$

are, of course, identical to the unextended and extended sine transforms in being $\mathbf{L}_3\big(\mathbf{L}_1(u(t))\big)$; the only change is that the sines change to cosines in the $\mathbf{L}_1$ multiplications. From Eq. (2.32b), all four transforms—the extended sine transform, the unextended sine transform, the extended cosine transform, and the unextended cosine transform—are linear operations. The only other transform discussed so far, the Hartley transform

$$\mathcal{H}^{(ft)}\big(u(t)\big) = \int_{-\infty}^{\infty} u(t)\big[\cos(2\pi ft) + \sin(2\pi ft)\big]\,dt$$

in Eq. (2.26h), must also be linear because it is $\mathbf{L}_3\big(\mathbf{L}_1(u(t))\big)$ with $g(t) = \cos(2\pi ft) + \sin(2\pi ft)$ in the $\mathbf{L}_1$ multiplication and has $t_1 = -\infty$, $t_2 = \infty$ in the $\mathbf{L}_3$ integration.
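Linearity survives discretization, so the same property can be spot-checked on the discrete Fourier transform. The fragment below is an illustrative sketch using NumPy's FFT (which follows the $e^{-2\pi ift}$ forward convention of Eq. (2.29a)); the random inputs and constants are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
u = rng.normal(size=256) + 1j * rng.normal(size=256)
v = rng.normal(size=256) + 1j * rng.normal(size=256)
alpha, beta = 2.0 - 1.0j, 0.5 + 3.0j

lhs = np.fft.fft(alpha * u + beta * v)
rhs = alpha * np.fft.fft(u) + beta * np.fft.fft(v)
print(np.allclose(lhs, rhs))   # True: Eq. (2.32a) holds for the DFT
```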
2.7 Mathematical Symmetries of the Fourier Transform
There are a large number of symmetry relations that hold for any function u(t) and its Fourier transform

$$U(f) = \mathbf{F}^{(-ift)}\big(u(t)\big) = \int_{-\infty}^{\infty} u(t)\,e^{-2\pi ift}\,dt. \qquad (2.33a)$$

We have already seen that the inverse Fourier transform of U(ƒ) returns the original function,

$$\mathbf{F}^{(itf)}\big(U(f)\big) = \int_{-\infty}^{\infty} U(f)\,e^{2\pi ift}\,df = u(t). \qquad (2.33b)$$

Replacing t by $-t$ changes this to

$$u(-t) = \mathbf{F}^{(-itf)}\big(U(f)\big).$$

Interchanging the roles of variables ƒ and t, we get

$$u(-f) = \mathbf{F}^{(-ift)}\big(U(t)\big), \qquad (2.33c)$$

which shows that u(−ƒ) is the forward Fourier transform of U(t). We expect, then, that U(t) is the inverse Fourier transform of u(−ƒ). To show this is true, we interchange the roles of variables ƒ and t in (2.33a) and then make $f' = -f$ the new variable of integration to get

$$U(t) = \mathbf{F}^{(-itf)}\big(u(f)\big) = \int_{-\infty}^{\infty} u(f)\,e^{-2\pi ift}\,df = -\int_{\infty}^{-\infty} u(-f')\,e^{2\pi if't}\,df' = \int_{-\infty}^{\infty} u(-f')\,e^{2\pi if't}\,df' = \mathbf{F}^{(itf)}\big(u(-f)\big). \qquad (2.33d)$$

Not only does this show that U(t) is the inverse Fourier transform of u(−ƒ) but also, by comparing the two expressions involving the **F** operator, we see that changing the sign of the integration variable ƒ does not change the value of the Fourier operation **F**. It does, however, change its name—the first **F** operation in (2.33d) is the forward Fourier transform of u(ƒ) and the second **F** operation in (2.33d) is the inverse Fourier transform of u(−ƒ). Taking the complex conjugate of all three expressions in Eq. (2.33b) gives

$$u(t)^* = \int_{-\infty}^{\infty} U(f)^*\,e^{-2\pi ift}\,df = \mathbf{F}^{(-itf)}\big(U(f)^*\big),$$

which shows that we get the complex conjugate of operator **F** by taking the complex conjugates of the quantities inside both parentheses. Starting with the original Fourier-transform relationship between U and u,

$$U(f) = \mathbf{F}^{(-ift)}\big(u(t)\big) \qquad (2.33e)$$

and

$$u(t) = \mathbf{F}^{(itf)}\big(U(f)\big), \qquad (2.33f)$$

we take the complex conjugates of both sides of (2.33e),

$$U(f)^* = \mathbf{F}^{(ift)}\big(u(t)^*\big),$$

and then change the sign of ƒ to get

$$U(-f)^* = \mathbf{F}^{(-ift)}\big(u(t)^*\big). \qquad (2.33g)$$

This shows that U(−ƒ)* is the forward Fourier transform of u(t)*. Since U(−ƒ)* is the forward Fourier transform of u(t)*, we expect the inverse Fourier transform of U(−ƒ)* to be u(t)*. To show this is true, we just change the sign of the integration variable in Eq. (2.33f),

$$u(t) = \mathbf{F}^{(-itf)}\big(U(-f)\big),$$

and then take the complex conjugate to get

$$u(t)^* = \mathbf{F}^{(itf)}\big(U(-f)^*\big). \qquad (2.33h)$$

Hence, u(t)* is indeed the inverse Fourier transform of U(−ƒ)*.
When u(t) is a strictly real function, as it is for much of the Fourier-transform work done in this book, u equals its complex conjugate, so that

$$\mathbf{F}^{(-ift)}\big(u(t)\big) = \mathbf{F}^{(-ift)}\big(u(t)^*\big),$$

and Eq. (2.33g) becomes

$$U(-f)^* = \mathbf{F}^{(-ift)}\big(u(t)\big).$$

But $\mathbf{F}^{(-ift)}\big(u(t)\big)$ is just U(ƒ), the forward Fourier transform of u, so

$$U(-f)^* = U(f)$$

or, taking the complex conjugate of both sides,

$$U(-f) = U(f)^*. \qquad (2.34a)$$

Functions U(ƒ) that obey Eq. (2.34a) are called Hermitian. If u(t) is purely imaginary, so that $u(t) = -u(t)^*$, then Eq. (2.33g) becomes

$$U(-f)^* = -\mathbf{F}^{(-ift)}\big(u(t)\big)$$

or

$$-U(-f)^* = \mathbf{F}^{(-ift)}\big(u(t)\big), \qquad (2.34b)$$

where the linearity of **F** is used to take $(-1)$ outside the transform and shift it over to the other side of the equation. Since $\mathbf{F}^{(-ift)}\big(u(t)\big)$ is just U(ƒ), Eq. (2.34b) shows that

$$U(f) = -U(-f)^* \quad\text{or}\quad U(-f) = -U(f)^* \qquad (2.34c)$$

when u is purely imaginary. Functions U(ƒ) that obey Eq. (2.34c) are called anti-Hermitian. A special and very important case occurs when u is both real and even. Then, since U is the forward Fourier transform of u with $U(f) = \mathbf{F}^{(-ift)}\big(u(t)\big)$, we take the complex conjugate of both sides to get

$$U(f)^* = \mathbf{F}^{(ift)}\big(u(t)^*\big).$$

Because u is real this becomes, changing the sign of the variable of integration,

$$U(f)^* = \mathbf{F}^{(ift)}\big(u(t)\big) = \mathbf{F}^{(-ift)}\big(u(-t)\big).$$

Because u is even, this simplifies to

$$U(f)^* = \mathbf{F}^{(-ift)}\big(u(t)\big) = U(f)$$

so that

$$U(f) = U(f)^*. \qquad (2.34d)$$

Hence, U equals its own complex conjugate, which shows it must be real. Because u is real, we already know that U is Hermitian and (2.34a) must hold true; now that U is known to be real, Eq. (2.34a) can be written as

$$U(f) = U(-f). \qquad (2.34e)$$

This shows that U must be real and even when u is real and even. Taking the real part of Eq. (2.33a) now gives, since both U and u are known to be real,

$$U(f) = \mathrm{Re}\!\left(\int_{-\infty}^{\infty} u(t)\,e^{-2\pi ift}\,dt\right) = \int_{-\infty}^{\infty} u(t)\,\mathrm{Re}\!\left(e^{-2\pi ift}\right)dt,$$

which becomes, applying Eq. (2.27),

$$U(f) = \int_{-\infty}^{\infty} u(t)\cos(2\pi ft)\,dt. \qquad (2.34f)$$

Because u(t) is also even, we know that the product $u(t)\cos(2\pi ft)$ is even with respect to t, which means that (2.34f) can be written as [see formula (2.19) above]

$$U(f) = 2\int_0^{\infty} u(t)\cos(2\pi ft)\,dt. \qquad (2.34g)$$

The right-hand side is the unextended cosine transform of u, showing that when u(t) is real and even, its Fourier transform equals its cosine transform. According to Eq. (2.8f), it follows that u must then be the cosine transform of U,

$$u(t) = 2\int_0^{\infty} U(f)\cos(2\pi ft)\,df. \qquad (2.34h)$$
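These symmetries are easy to observe numerically. The sketch below is illustrative (the random sequence and its symmetrized version are arbitrary): it checks the Hermitian property (2.34a) for a real input and the real-transform property (2.34d) for a real, even input, with the discrete transform standing in for the integral.

```python
import numpy as np

rng = np.random.default_rng(1)
u = rng.normal(size=64)                 # real input, no special symmetry
U = np.fft.fft(u)
# Hermitian property (2.34a): U(-f) = U(f)^*; index N-k plays the role of -f.
print(np.allclose(U[1:][::-1], np.conj(U[1:])))    # True

u_even = u + np.roll(u[::-1], 1)        # u_even[n] = u[n] + u[(N-n) mod N]
print(np.allclose(np.fft.fft(u_even).imag, 0.0))   # True: real, even -> real U
```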
2.8 Basic Fourier Identities
There are a number of simple Fourier identities that are true for the transforms of any function u. One very simple identity—surprisingly easy to overlook—is that when U(ƒ) is the forward or inverse Fourier transform of u(t), the value of U at the origin is the total integral of u:

$$U(f)\Big|_{f=0} = \left[\int_{-\infty}^{\infty} u(t)\,e^{\mp 2\pi ift}\,dt\right]_{f=0}$$

or

$$U(0) = \int_{-\infty}^{\infty} u(t)\,dt. \qquad (2.35a)$$

Similarly, u(0) is the total integral of U(ƒ):

$$u(t)\Big|_{t=0} = \left[\int_{-\infty}^{\infty} U(f)\,e^{\pm 2\pi ift}\,df\right]_{t=0}$$

or

$$u(0) = \int_{-\infty}^{\infty} U(f)\,df. \qquad (2.35b)$$

When U(ƒ) is the forward Fourier transform of u(t), the nth derivative of U is

$$\frac{d^nU}{df^n} = \frac{\partial^n}{\partial f^n}\int_{-\infty}^{\infty} u(t)\,e^{-2\pi ift}\,dt = \int_{-\infty}^{\infty} \big[(-2\pi it)^n\,u(t)\big]\,e^{-2\pi ift}\,dt; \qquad (2.35c)$$

and, because Eqs. (2.29a) and (2.29d) require u to be the inverse transform of U when U is the forward transform of u, the nth derivative of u is

$$\frac{d^nu}{dt^n} = \frac{\partial^n}{\partial t^n}\int_{-\infty}^{\infty} U(f)\,e^{2\pi ift}\,df = \int_{-\infty}^{\infty} \big[(2\pi if)^n\,U(f)\big]\,e^{2\pi ift}\,df. \qquad (2.35d)$$

Therefore, when both u and $d^nu/dt^n$ satisfy requirements (V) through (VIII) in Sec. 2.4 and U(ƒ) is the forward Fourier transform of u(t), Eq. (2.35d) shows that $(2\pi if)^n\,U(f)$ must be the forward Fourier transform of $d^nu/dt^n$ because $d^nu/dt^n$ is the inverse Fourier transform of $(2\pi if)^n\,U(f)$. Equation (2.35c) similarly shows that when u(t) and $t^n u(t)$ satisfy requirements (V) through (VIII) in Sec. 2.4 and U(ƒ) is the forward Fourier transform of u(t), the forward Fourier transform of $t^n u(t)$ is

$$\frac{1}{(-2\pi i)^n}\,\frac{d^nU}{df^n}.$$

We introduce the notation "↔" to show this sort of Fourier-transform relationship between functions, adopting the convention that the function on the right is always the forward Fourier transform of the function on the left and the function on the left is always the inverse Fourier transform of the function on the right. The results of the above analysis can then be written as

$$\frac{d^nu}{dt^n} \;\leftrightarrow\; (2\pi if)^n\,U(f) \qquad (2.35e)$$

and

$$t^n\,u(t) \;\leftrightarrow\; \frac{1}{(-2\pi i)^n}\,\frac{d^nU}{df^n}. \qquad (2.35f)$$
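The derivative identity (2.35e) is the basis of spectral differentiation. The following sketch is a minimal illustration (the Gaussian test function and grid sizes are arbitrary choices): it multiplies the discrete transform by $2\pi if$ and inverts, recovering the analytic derivative of $e^{-\pi t^2}$ almost to machine precision.

```python
import numpy as np

N, T = 1024, 20.0                        # illustrative record size and length
t = (np.arange(N) - N // 2) * (T / N)    # grid centered on t = 0
u = np.exp(-np.pi * t**2)                # smooth, absolutely integrable

f = np.fft.fftfreq(N, d=T / N)           # frequency samples (cycles per unit t)
U = np.fft.fft(np.fft.ifftshift(u))      # align t = 0 with sample 0
du = np.fft.fftshift(np.fft.ifft(2j * np.pi * f * U)).real  # Eq. (2.35e), n = 1

exact = -2.0 * np.pi * t * u             # analytic d/dt of exp(-pi t^2)
print(np.max(np.abs(du - exact)))        # near machine precision
```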

For the integral of any complex function c(t), the inequality

$$\left|\int_a^b c(t)\,dt\right| \le \int_a^b \big|c(t)\big|\,dt \qquad (2.35g)$$

must hold true for any two real values of a and b where $a \le b$. When u(t) is real, so is its nth derivative, and we can write

$$\left|\int_{-\infty}^{\infty} \frac{d^nu}{dt^n}\,e^{-2\pi ift}\,dt\right| \le \int_{-\infty}^{\infty} \left|\frac{d^nu}{dt^n}\,e^{-2\pi ift}\right|dt = \int_{-\infty}^{\infty} \left|\frac{d^nu}{dt^n}\right|\big|e^{-2\pi ift}\big|\,dt,$$

which reduces to, since $\big|e^{-2\pi ift}\big| = 1$,

$$\left|\int_{-\infty}^{\infty} \frac{d^nu}{dt^n}\,e^{-2\pi ift}\,dt\right| \le \int_{-\infty}^{\infty} \left|\frac{d^nu}{dt^n}\right|dt. \qquad (2.35h)$$

Because we are supposing the Fourier transform of $d^nu/dt^n$ to exist, the existence requirement in Eq. (2.13a) shows that

$$\int_{-\infty}^{\infty} \left|\frac{d^nu}{dt^n}\right|dt$$

is finite. Hence, inequality (2.35h) requires

$$\left|\int_{-\infty}^{\infty} \frac{d^nu}{dt^n}\,e^{-2\pi ift}\,dt\right|$$

also to be finite, which means that we can assume it is less than or equal to some finite real and non-negative number B for all values of ƒ:

$$\left|\int_{-\infty}^{\infty} \frac{d^nu}{dt^n}\,e^{-2\pi ift}\,dt\right| \le B. \qquad (2.35i)$$

Formula (2.35e) states that

$$\int_{-\infty}^{\infty} \frac{d^nu}{dt^n}\,e^{-2\pi ift}\,dt = (2\pi if)^n\,U(f), \qquad (2.35j)$$

where

$$U(f) = \int_{-\infty}^{\infty} u(t)\,e^{-2\pi ift}\,dt$$

is, of course, the Fourier transform of u(t). Taking the magnitude of the complex values of both sides of (2.35j) and remembering that $|i^n| = 1$ shows that

$$\left|\int_{-\infty}^{\infty} \frac{d^nu}{dt^n}\,e^{-2\pi ift}\,dt\right| = \big(2\pi|f|\big)^n\,\big|U(f)\big|,$$

which becomes, applying inequality (2.35i),

$$B \ge \big(2\pi|f|\big)^n\,\big|U(f)\big| \quad\text{or}\quad \big|U(f)\big| \le \frac{B}{(2\pi)^n}\,|f|^{-n}. \qquad (2.35k)$$

Hence, when the Fourier transform of the nth derivative of u(t) exists, we know that the magnitude $|U(f)|$ of the Fourier transform of u decreases at least as fast as $|f|^{-n}$ for large values of ƒ.
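The bound in (2.35k) can be watched numerically: a discontinuous box, whose derivative is not absolutely integrable in the ordinary sense, has a transform decaying like $1/|f|$, while a continuous triangle decays like $1/|f|^2$. The sketch below is illustrative (the shapes are arbitrary, and the frequencies are chosen away from the transforms' zeros); it scales out the expected decay and prints roughly constant residuals.

```python
import numpy as np

t = np.linspace(-4.0, 4.0, 16001)
dt = t[1] - t[0]
box = (np.abs(t) <= 1.0).astype(float)      # jump discontinuities at |t| = 1
tri = np.clip(1.0 - np.abs(t), 0.0, None)   # continuous, with corners

for f in (5.25, 10.25, 20.25):              # avoid zeros of sin(2*pi*f)
    kern = np.exp(-2j * np.pi * f * t)
    print(f,
          abs(np.sum(box * kern) * dt) * f,     # ~1/pi: |U| falls like 1/f
          abs(np.sum(tri * kern) * dt) * f**2)  # ~0.05: |U| falls like 1/f^2
```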
We next examine a set of identities often called the Fourier shift theorem. When U(ƒ) is the forward Fourier transform of u(t),

$$U(f) = \int_{-\infty}^{\infty} u(t)\,e^{-2\pi ift}\,dt,$$

and u(t) is shifted to the right by an amount a,

$$u(t) \to u(t-a),$$

then the forward Fourier transform of $u(t-a)$ is, changing the variable of integration to $t' = t - a$,

$$\int_{-\infty}^{\infty} u(t-a)\,e^{-2\pi ift}\,dt = \int_{-\infty}^{\infty} u(t')\,e^{-2\pi if(t'+a)}\,dt' = e^{-2\pi ifa}\!\int_{-\infty}^{\infty} u(t')\,e^{-2\pi ift'}\,dt' = e^{-2\pi ifa}\,U(f).$$

Hence the forward Fourier transform of $u(t-a)$ is $e^{-2\pi ifa}\,U(f)$ when the forward Fourier transform of u(t) is U(ƒ), which we can write as

$$\text{If } u(t) \leftrightarrow U(f), \text{ then } u(t-a) \leftrightarrow e^{-2\pi ifa}\,U(f). \qquad (2.36a)$$

In terms of the Fourier **F** operator, we have

$$\mathbf{F}^{(-ift)}\big(u(t-a)\big) = e^{-2\pi ifa}\,\mathbf{F}^{(-ift)}\big(u(t)\big). \qquad (2.36b)$$

Working with the reverse Fourier transform of $U(f-f_0)$ and changing the variable of integration to $f' = f - f_0$, we see that

$$\int_{-\infty}^{\infty} U(f-f_0)\,e^{2\pi ift}\,df = e^{2\pi if_0t}\!\int_{-\infty}^{\infty} U(f')\,e^{2\pi if't}\,df' = e^{2\pi if_0t}\,u(t) \qquad (2.36c)$$

or

$$e^{2\pi if_0t}\,u(t) \;\leftrightarrow\; U(f-f_0). \qquad (2.36d)$$

The **F** operator lets us write this result as

$$\mathbf{F}^{(itf)}\big(U(f-f_0)\big) = e^{2\pi if_0t}\,\mathbf{F}^{(itf)}\big(U(f)\big) \qquad (2.36e)$$

or

$$\mathbf{F}^{(-ift)}\big(e^{2\pi if_0t}\,u(t)\big) = U(f-f_0) = \mathbf{F}^{(-i(f-f_0)t)}\big(u(t)\big). \qquad (2.36f)$$

Equations (2.36d)–(2.36f) show that multiplying u(t) by $e^{2\pi if_0t}$ shifts U(ƒ), the forward Fourier transform of u(t), to the right by a frequency $f_0$. By interchanging the roles of t and ƒ—and replacing u by U and $f_0$ by a—in (2.36e) and comparing the result to (2.36b), we see the two equations can be combined into one formula:

$$\mathbf{F}^{(\pm ift)}\big(u(t-a)\big) = e^{\pm 2\pi ifa}\,\mathbf{F}^{(\pm ift)}\big(u(t)\big). \qquad (2.36g)$$

This last result can also be written as, defining a new constant $b = -a$,

$$\int_{-\infty}^{\infty} u(t+b)\,e^{\pm 2\pi ift}\,dt = e^{\mp 2\pi ifb}\!\int_{-\infty}^{\infty} u(t)\,e^{\pm 2\pi ift}\,dt \qquad (2.36h)$$

or

$$\mathbf{F}^{(\pm ift)}\big(u(t+b)\big) = e^{\mp 2\pi ifb}\,\mathbf{F}^{(\pm ift)}\big(u(t)\big). \qquad (2.36i)$$
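The shift theorem carries over to sampled data, where a delay of m samples multiplies the DFT by $e^{-2\pi ifm\,\Delta t}$. The sketch below is illustrative (a circular shift stands in for translation, and the sizes are arbitrary); it verifies this discrete analogue of Eq. (2.36a).

```python
import numpy as np

N, dt, m = 256, 0.01, 7                   # samples, spacing, shift in samples
rng = np.random.default_rng(2)
u = rng.normal(size=N)

f = np.fft.fftfreq(N, d=dt)
lhs = np.fft.fft(np.roll(u, m))           # circular stand-in for u(t - a), a = m*dt
rhs = np.exp(-2j * np.pi * f * m * dt) * np.fft.fft(u)
print(np.allclose(lhs, rhs))              # True: discrete form of Eq. (2.36a)
```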

The next set of identities is sometimes called the Fourier scaling theorem. If U(ƒ) is the forward Fourier transform of u(t) and the argument of u is scaled by the real constant a,

$$u(t) \to u(at),$$

then the forward Fourier transform of $u(at)$ is, letting $t' = at$,

$$\int_{-\infty}^{\infty} u(at)\,e^{-2\pi ift}\,dt = \frac{1}{|a|}\int_{-\infty}^{\infty} u(t')\,e^{-2\pi i(f/a)t'}\,dt' = \frac{1}{|a|}\,U\!\left(\frac{f}{a}\right).$$

This can be written as

$$u(at) \;\leftrightarrow\; \frac{1}{|a|}\,U\!\left(\frac{f}{a}\right) \qquad (2.37a)$$

or

$$\mathbf{F}^{(-ift)}\big(u(at)\big) = \frac{1}{|a|}\,\mathbf{F}^{(-i(f/a)t)}\big(u(t)\big). \qquad (2.37b)$$

We also have, scaling the frequency by a positive constant a and letting $f' = af$, that

$$\int_{-\infty}^{\infty} U(af)\,e^{2\pi ift}\,df = \frac{1}{a}\int_{-\infty}^{\infty} U(f')\,e^{2\pi if'(t/a)}\,df' = \frac{1}{a}\,u\!\left(\frac{t}{a}\right).$$

This can be written as

$$\frac{1}{a}\,u\!\left(\frac{t}{a}\right) \;\leftrightarrow\; U(af) \quad\text{for } a > 0 \qquad (2.37c)$$

or

$$\mathbf{F}^{(itf)}\big(U(af)\big) = \frac{1}{a}\,\mathbf{F}^{(i(t/a)f)}\big(U(f)\big) \quad\text{for } a > 0. \qquad (2.37d)$$

Equation (2.37b) and (after interchanging the roles of ƒ and t) Eq. (2.37d) can be combined into the single formula

$$\mathbf{F}^{(\pm ift)}\big(u(at)\big) = \frac{1}{a}\,\mathbf{F}^{(\pm i(f/a)t)}\big(u(t)\big) \quad\text{for } a > 0. \qquad (2.37e)$$
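A direct-summation spot-check of the scaling theorem (2.37a), again with the arbitrary choice $u(t) = e^{-\pi t^2}$: compressing the time argument by a = 2 halves the amplitude of the transform and stretches it by the same factor.

```python
import numpy as np

t = np.linspace(-10.0, 10.0, 8001)       # illustrative grid
dt = t[1] - t[0]
a = 2.0

for f in (0.0, 1.0, 2.0):
    lhs = np.sum(np.exp(-np.pi * (a * t)**2)
                 * np.exp(-2j * np.pi * f * t)) * dt  # transform of u(at)
    rhs = np.exp(-np.pi * (f / a)**2) / abs(a)        # (1/|a|) U(f/a)
    print(f, lhs.real, rhs)                           # columns agree
```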

Because u(t) must satisfy requirements (V) through (VIII) in Sec. 2.4 for these results to be true—and in particular it must satisfy requirement (V) that it be absolutely integrable—there may well be only a finite region of t over which u(t) is significantly different from zero. When $0 < a < 1$, so that the range of t over which u is significantly different from zero expands, formula (2.37a) shows that the region of ƒ over which U(ƒ) is significantly different from zero shrinks; and, of course, when $a > 1$, just the opposite occurs. For $0 < a < 1$, function $u(at)$ more closely resembles $\sin(2\pi ft)$ and $\cos(2\pi ft)$ for smaller values of ƒ, explaining why the region of ƒ for which U is significantly different from zero shrinks; and when $a > 1$, function $u(at)$ more closely resembles $\sin(2\pi ft)$ and $\cos(2\pi ft)$ for larger values of ƒ, explaining why the region of ƒ for which U is significantly different from zero expands. We also note that if $f \cong 1/(2\pi)$, so that $\sin(2\pi ft) \cong \sin(t)$ and $\cos(2\pi ft) \cong \cos(t)$, then the sine and cosine can change significantly in value only when t changes by at least

$$\Delta t_{\min} = O(1).$$

Suppose t must also change by at least $\Delta t_{\min} = O(1)$ for a significant change in u(t) to occur, which means that $\sin(2\pi ft) \cong \sin(t)$ and $\cos(2\pi ft) \cong \cos(t)$ vary about as fast with respect to t as u does—that is, $\sin(t)$ and $\cos(t)$ "resemble" u somewhat. Recalling the heuristic reasoning used in Sec. 2.1 to introduce and justify the sine and cosine integrals, we now expect U(ƒ) to be significantly different from zero when $f \cong 1/(2\pi)$. Suppose next that t changes by less than $\Delta t_{\min} = O(1)$ so that u does not change significantly in value, remaining almost constant. Now when ƒ becomes significantly larger than $1/(2\pi)$, the functions $\sin(2\pi ft)$ and $\cos(2\pi ft)$ oscillate ever more rapidly, so that they change significantly in value for changes in t that are ever smaller than $\Delta t_{\min}$. For these larger values of ƒ, the sine and cosine do not much resemble u(t), forcing the Fourier transform U(ƒ) to be negligible or zero for $f > O\big(1/(2\pi)\big)$. We can modify the original function u by creating a new function $u_\beta(t) = u(t/\beta)$ for $\beta > 0$. Now t must change by at least an $O(\beta)$ amount for $u_\beta$ to change significantly; and when t changes by less than $O(\beta)$, function $u_\beta$ does not change significantly in value. We know from (2.37a) with $a = 1/\beta$ that the forward Fourier transform of $u_\beta$ is $U_\beta(f) = \beta\,U(\beta f)$. Hence, when ƒ is larger than $O\big(1/(2\pi\beta)\big)$, it must be true that $U_\beta(f)$ is negligible or zero, since this is the same as having $f > O\big(1/(2\pi)\big)$ in U(ƒ). Because $2\pi$ is often regarded as an $O(1)$ quantity, this result can also be interpreted as showing that $U_\beta(f)$ must be negligible or zero for $f > O(1/\beta)$. Since the original Fourier-transform pair

$$u(t) \;\leftrightarrow\; U(f)$$

is left unspecified, $u_\beta$ in fact represents any function v(t) where t must change by at least an $O(\beta)$ amount for a significant change in v to occur. Consequently, we can conclude that if t must change by at least an $O(\beta)$ amount for v(t) to change significantly, then the forward Fourier transform of v(t) must be negligible or zero for $f > O(1/\beta)$. The arguments leading to this conclusion work just as well when we consider the inverse Fourier transform in Eqs. (2.37c) and (2.37e). Therefore, this more general result is also true: if v(t) is a function such that t must change by at least an $O(\beta)$ amount for a significant change in v to occur, then the forward or inverse Fourier transform,

$$V(f) = \int_{-\infty}^{\infty} v(t)\,e^{\pm 2\pi ift}\,dt,$$

is negligible or zero for $f > O(1/\beta)$.
2.9 Fourier Convolution Theorem

It is hard to overstate the importance of the Fourier convolution theorem; it plays a fundamental role in linear signal theory and structures the thinking of many different engineering disciplines—signal processing, electrical engineering, image analysis, and servomechanism design, to name but a few.

We define the convolution of two functions u(t) and v(t) to be

$$u(t) * v(t) = \int_{-\infty}^{\infty} u(t')\,v(t-t')\,dt'. \qquad (2.38a)$$

Here, u and v may be complex functions, but their argument t is assumed to be real. The convolution is commutative and associative. It is commutative because making the substitution $t'' = t - t'$ gives

$$u(t) * v(t) = \int_{-\infty}^{\infty} u(t')\,v(t-t')\,dt' = -\int_{\infty}^{-\infty} u(t-t'')\,v(t'')\,dt'' = \int_{-\infty}^{\infty} v(t'')\,u(t-t'')\,dt'',$$

showing that

$$u(t) * v(t) = v(t) * u(t). \qquad (2.38b)$$

The convolution is associative because for three complex functions u(t), v(t), and h(t) with real argument t we can write, changing the variable of integration to $t''' = t'' - t'$,

$$\big[u(t) * v(t)\big] * h(t) = \int_{-\infty}^{\infty} dt''\,h(t-t'')\!\int_{-\infty}^{\infty} dt'\,u(t')\,v(t''-t') = \int_{-\infty}^{\infty} dt'\,u(t')\!\int_{-\infty}^{\infty} dt'''\,v(t''')\,h(t-t'-t''') = u(t) * \big[v(t) * h(t)\big].$$

Hence,

$$\big[u(t) * v(t)\big] * h(t) = u(t) * \big[v(t) * h(t)\big]. \qquad (2.38c)$$


The convolution is a linear operation, because for any two complex constants α and β,

$$h(t) * \big[\alpha u(t) + \beta v(t)\big] = \int_{-\infty}^{\infty} h(t')\,\big[\alpha\,u(t-t') + \beta\,v(t-t')\big]\,dt' = \alpha\!\int_{-\infty}^{\infty} h(t')\,u(t-t')\,dt' + \beta\!\int_{-\infty}^{\infty} h(t')\,v(t-t')\,dt',$$

showing that

$$h(t) * \big(\alpha u(t) + \beta v(t)\big) = \alpha\,\big[h(t) * u(t)\big] + \beta\,\big[h(t) * v(t)\big]. \qquad (2.38d)$$

Because the convolution is commutative, the equation can also be written as

$$\big(\alpha u(t) + \beta v(t)\big) * h(t) = \alpha\,\big[u(t) * h(t)\big] + \beta\,\big[v(t) * h(t)\big]. \qquad (2.38e)$$

This shows that the convolution is linear on both the left-hand and right-hand sides of the ∗.

The convolution of two even functions or two odd functions is an even function. If u(t) and v(t) are both even or both odd, then we have, using $t'' = -t'$,

$$u(-t) * v(-t) = \int_{-\infty}^{\infty} u(t')\,v(-t-t')\,dt' = -\int_{\infty}^{-\infty} u(-t'')\,v(-t+t'')\,dt'' = \int_{-\infty}^{\infty} u(t'')\,v(t-t'')\,dt'' = u(t) * v(t). \qquad (2.38f)$$

When u is even and v is odd, or u is odd and v is even, then we have

$$u(-t) * v(-t) = \int_{-\infty}^{\infty} u(t')\,v(-t-t')\,dt' = -\int_{\infty}^{-\infty} u(-t'')\,v(-t+t'')\,dt'' = -\int_{-\infty}^{\infty} u(t'')\,v(t-t'')\,dt'' = -\big[u(t) * v(t)\big]. \qquad (2.38g)$$

Hence, the convolution of an even and an odd function is always odd.

If u and v have more than one argument, so that they are written $u(y, x_1, x_2, \ldots)$ and $v(y, x_1', x_2', \ldots)$, then we adopt the convention that the convolution

$$u(y, x_1, x_2, \ldots) * v(y, x_1', x_2', \ldots)$$

is over variable y rather than variables $x_1, x_1', x_2, x_2', \ldots$,

$$u(y, x_1, x_2, \ldots) * v(y, x_1', x_2', \ldots) = \int_{-\infty}^{\infty} u(y', x_1, x_2, \ldots)\,v(y-y', x_1', x_2', \ldots)\,dy',$$

because y is the only argument repeated on both sides of the ∗.
To derive the Fourier convolution theorem, we take the forward or inverse transform of $u(t) * v(t)$ to get

$$\mathbf{F}^{(\pm ift)}\big(u(t) * v(t)\big) = \int_{-\infty}^{\infty} e^{\pm 2\pi ift}\,\big[u(t) * v(t)\big]\,dt = \int_{-\infty}^{\infty} dt\,e^{\pm 2\pi ift}\!\int_{-\infty}^{\infty} dt'\,u(t')\,v(t-t') = \int_{-\infty}^{\infty} dt'\,u(t')\!\int_{-\infty}^{\infty} dt\,e^{\pm 2\pi ift}\,v(t-t').$$

Changing the variable of integration in the inner integral to $t'' = t - t'$ gives

$$\mathbf{F}^{(\pm ift)}\big(u(t) * v(t)\big) = \int_{-\infty}^{\infty} dt'\,u(t')\,e^{\pm 2\pi ift'}\!\int_{-\infty}^{\infty} dt''\,e^{\pm 2\pi ift''}\,v(t'') = \left[\int_{-\infty}^{\infty} u(t')\,e^{\pm 2\pi ift'}\,dt'\right]\left[\int_{-\infty}^{\infty} v(t'')\,e^{\pm 2\pi ift''}\,dt''\right]$$

or

$$\mathbf{F}^{(\pm ift)}\big(u(t) * v(t)\big) = \mathbf{F}^{(\pm ift)}\big(u(t)\big)\cdot\mathbf{F}^{(\pm ift)}\big(v(t)\big). \qquad (2.39a)$$

If U(ƒ) and V(ƒ) are the forward Fourier transforms of u(t) and v(t) respectively, we can choose the minus sign of (2.39a) to get

$$\int_{-\infty}^{\infty} e^{-2\pi ift}\,\big[u(t) * v(t)\big]\,dt = U(f)\,V(f), \qquad (2.39b)$$

which shows that

$$u(t) * v(t) \;\leftrightarrow\; U(f)\,V(f). \qquad (2.39c)$$
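Equation (2.39c) is the workhorse identity behind FFT-based filtering. The sketch below is illustrative (it uses circular convolution, the discrete setting in which the DFT version of the theorem is exact, with arbitrary random inputs): the transform of the convolution equals the product of the transforms.

```python
import numpy as np

rng = np.random.default_rng(3)
u = rng.normal(size=128)
v = rng.normal(size=128)

# Circular convolution computed directly from its definition:
conv = np.array([np.sum(u * np.roll(v[::-1], k + 1)) for k in range(u.size)])
print(np.allclose(np.fft.fft(conv), np.fft.fft(u) * np.fft.fft(v)))  # True
```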

Equation (2.28ℓ) can be written as, for any function g(t) after interchanging the roles of t and t′,

$$\mathbf{F}^{(\pm it'f)}\Big(\mathbf{F}^{(\mp ift)}\big(g(t)\big)\Big) = g(t'). \qquad (2.39d)$$

We replace $\mathbf{F}^{(\pm)}$ by $\mathbf{F}^{(\mp)}$ on the right-hand side of Eq. (2.39a), which is just a change in the order in which the two possible signs of the exponent are listed, and then take $\mathbf{F}^{(\pm it'f)}$ of both sides to get, applying (2.39d) with $g(t) = u(t) * v(t)$,

$$u(t') * v(t') = \mathbf{F}^{(\pm it'f)}\Big(\mathbf{F}^{(\mp ift)}\big(u(t)\big)\cdot\mathbf{F}^{(\mp ift)}\big(v(t)\big)\Big). \qquad (2.39e)$$

Because u(t) and v(t) represent arbitrary, Fourier-transformable functions of t, $\mathbf{F}^{(\mp ift)}\big(u(t)\big)$ and $\mathbf{F}^{(\mp ift)}\big(v(t)\big)$ must be arbitrary, Fourier-transformable functions of ƒ, which we can call $U^{(\mp)}$ and $V^{(\mp)}$ respectively,

$$U^{(\mp)}(f) = \mathbf{F}^{(\mp ift)}\big(u(t)\big) \qquad (2.39f)$$

and

$$V^{(\mp)}(f) = \mathbf{F}^{(\mp ift)}\big(v(t)\big). \qquad (2.39g)$$

Applying this notation to (2.39d), first with $g(t) = u(t)$ and then with $g(t) = v(t)$, we see that

$$u(t') = \mathbf{F}^{(\pm it'f)}\big(U^{(\mp)}(f)\big) \qquad (2.39h)$$

and

$$v(t') = \mathbf{F}^{(\pm it'f)}\big(V^{(\mp)}(f)\big). \qquad (2.39i)$$

Hence Eq. (2.39e) can be written as

$$\mathbf{F}^{(\pm it'f)}\big(U^{(\mp)}(f)\big) * \mathbf{F}^{(\pm it'f')}\big(V^{(\mp)}(f')\big) = \mathbf{F}^{(\pm it'f'')}\Big(U^{(\mp)}(f'')\,V^{(\mp)}(f'')\Big),$$

where the convolution is over t′ because it is the only argument repeated on both sides of the ∗. Since $U^{(\mp)}$ and $V^{(\mp)}$ are arbitrary, transformable functions, we can replace them by the arbitrary transformable functions u and v to get, after interchanging the roles of ƒ and t′,

$$\mathbf{F}^{(\pm ift')}\big(u(t')\big) * \mathbf{F}^{(\pm ift'')}\big(v(t'')\big) = \mathbf{F}^{(\pm ift''')}\Big(u(t''')\,v(t''')\Big).$$

This can be simplified by dropping a prime from each of the t's:

$$\mathbf{F}^{(\pm ift)}\big(u(t)\big) * \mathbf{F}^{(\pm ift')}\big(v(t')\big) = \mathbf{F}^{(\pm ift'')}\Big(u(t'')\,v(t'')\Big). \qquad (2.39j)$$
If U(ƒ) and V(ƒ) are the forward Fourier transforms of u(t) and v(t) respectively, we can choose the minus sign of (2.39j) to get

$$\int_{-\infty}^{\infty} e^{-2\pi ift}\,u(t)\,v(t)\,dt = U(f) * V(f) \qquad (2.39k)$$

or

$$u(t)\,v(t) \;\leftrightarrow\; U(f) * V(f). \qquad (2.39\ell)$$

Equation (2.39b) shows that the forward Fourier transform of the convolution of two functions is the product of the forward Fourier transforms of each function, and (2.39k) shows that the forward Fourier transform of the product of two functions is the convolution of the forward Fourier transforms of each function. Equations (2.39a) and (2.39j) show that everything we just said about the forward Fourier transform still holds true when we take the reverse Fourier transform of the product of two functions or of the convolution of two functions.
When using the Fourier convolution theorem, we usually regard one of the two convolved functions as representing the undisturbed signal—that is, the true set of values for what is to be measured—and the other—usually much more narrow—function as specifying the blurring or smearing effect of an imperfect measurement. The blurring or smearing function has different names in different engineering disciplines; optical engineers often call it the instrument-response or instrument line-shape function. In Fig. 2.5(a), function u is taken to be the true signal, and in Fig. 2.5(b) function v is the instrument-response or instrument line-shape function. The convolution

$$u(t) * v(t) = \int_{-\infty}^{\infty} u(t')\,v(t-t')\,dt' = u_{\mathrm{blur}}(t)$$

defines the new function $u_{\mathrm{blur}}(t)$ as shown in Figs. 2.5(c)–2.5(e). The function v is flipped left to right and slid along the t′ axis in Fig. 2.5(c) by changing the value of t. Figure 2.5(d) is a close-up of v at a specific value of t, with the shaded region being the area under the product $u(t')\,v(t-t')$. Since $u(t')\,v(t-t')$ is zero where $v(t-t')$ is zero, the area of the shaded region can be found by integrating $u(t')\,v(t-t')$ over t′ between −∞ and +∞. This is, of course, just the convolution of u and v for this particular value of t, which means the area of the shaded region must be $u_{\mathrm{blur}}(t)$ for this value of t. Figure 2.5(e) represents the complete $u_{\mathrm{blur}}(t)$ function for all values of t; clearly $u_{\mathrm{blur}}$ has less detail than the original signal u.

The v(t) function in Fig. 2.5(b) is an unusual type of instrument response because it is not an even function of t. Figure 2.5(f) shows a typical even instrument response $v_e(t)$. When the instrument-response function is $v_e$, the blurred signal is

$$u_{e,\mathrm{blur}}(t) = u(t) * v_e(t). \qquad (2.40a)$$

The instrument-response function is even, so $v_e(t) = v_e(-t)$ and we can write

$$u_{e,\mathrm{blur}}(t) = \int_{-\infty}^{\infty} u(t')\,v_e(t-t')\,dt' = \int_{-\infty}^{\infty} u(t')\,v_e(t'-t)\,dt', \qquad (2.40b)$$

with the last integral in (2.40b) making it perhaps more obvious that $u_{e,\mathrm{blur}}$ is a localized and weighted average of u centered on t. Instrument-response or line-shape functions are usually designed to be even because an even instrument-response function does not shift the center point of isolated peaks in the true data u.
As described in the first chapter, when using Michelson interferometers, we do not much care about the exact shape of the optical intensity signal u but are instead interested in the shape of its transform,

$$U(f) = \mathbf{F}^{(-ift)}\big(u(t)\big). \qquad (2.40c)$$

In many types of interferometers, u is a signal of time t, which means U can be analyzed as a function of ƒ, the signal frequency. The electrical circuits transmitting and recording the signal u can never do a perfect job—they always blur and smooth the original signal to some extent—so what we end up with is not u(t) and U(ƒ) but rather $u_{e,\mathrm{blur}}(t)$ and the associated Fourier transform

$$U_{e,\mathrm{blur}}(f) = \mathbf{F}^{(-ift)}\big(u_{e,\mathrm{blur}}(t)\big). \qquad (2.40d)$$

The relationship between $U_{e,\mathrm{blur}}$ and U must be understood to design the electrical circuits properly. Here is an important example of how to use the Fourier convolution theorem. Substitution of (2.40a) into (2.40d) gives

$$U_{e,\mathrm{blur}}(f) = \mathbf{F}^{(-ift)}\big(u(t) * v_e(t)\big).$$

Using the Fourier convolution theorem as presented in Eq. (2.39a), this is rewritten as

$$U_{e,\mathrm{blur}}(f) = \mathbf{F}^{(-ift)}\big(u(t)\big)\cdot\mathbf{F}^{(-ift)}\big(v_e(t)\big)$$

or

$$U_{e,\mathrm{blur}}(f) = U(f)\,V_e(f), \qquad (2.40e)$$

where U(ƒ) comes from (2.40c) and we define

$$V_e(f) = \mathbf{F}^{(-ift)}\big(v_e(t)\big).$$

[Figure 2.5, panels (a)–(f): the true signal u(t); the instrument response v(t); the flipped response v(t − t′) slid along the t′ axis; a close-up of the product u(t′)v(t − t′), whose area gives u_blur(t) at one value of t; the complete blurred signal u_blur(t); and a typical even instrument response v_e(t).]

Equation (2.40e) is a very reassuring result, stating that as long as $V_e(f)$ is known and not zero, we can recover the Fourier transform of the true signal U(ƒ) from $U_{e,\mathrm{blur}}(f)$ by calculating

$$U(f) = \frac{U_{e,\mathrm{blur}}(f)}{V_e(f)}. \qquad (2.40f)$$

To design the circuits of a Michelson interferometer, we find the frequencies ƒ for which U(ƒ) must be known and arrange for $V_e$ to be as constant as possible—and definitely not zero—over these frequencies. It turns out that preserving certain signal frequencies while neglecting others is a standard problem in electrical-circuit design, and it is usually easy to arrange for this to occur. There is, in fact, a whole branch of electrical engineering called filter theory that describes exactly how to design circuits where $V_e$ is zero or very small at some frequencies while being large and quasi-constant at others.
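The recipe in Eq. (2.40f) can be tried on synthetic data. The sketch below is an idealized, noise-free illustration (the test signal and the Gaussian stand-in for $v_e$ are arbitrary choices): dividing the blurred spectrum by $V_e(f)$ recovers the original signal. With measurement noise present, the division becomes dangerous wherever $V_e(f)$ is close to zero, which is exactly why the designer tries to keep $V_e$ large and flat over the frequencies of interest.

```python
import numpy as np

N = 512
n = np.arange(N)
u = ((np.abs(n - 200) < 8) | (np.abs(n - 320) < 4)).astype(float)  # "true" signal
k = np.minimum(n, N - n)                  # circular distance from sample 0
v = np.exp(-0.5 * (k / 1.5)**2)           # even response: v[n] = v[(N-n) mod N]
v /= v.sum()

V = np.fft.fft(v)
u_blur = np.fft.ifft(np.fft.fft(u) * V).real       # blurring, Eq. (2.40e)
u_rec = np.fft.ifft(np.fft.fft(u_blur) / V).real   # recovery, Eq. (2.40f)
print(np.max(np.abs(u_rec - u)))          # small: the division undoes the blur
```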
2.10 Fourier Transforms and Divergent Integrals
Fourier-transform theory has a history of treating with extreme kindness engineers and scientists
who blindly use its formalism without worrying about whether their manipulations make
mathematical sense. The rule of thumb seems to be that if the final result is mathematically
sound—such as a finite integral or the transform of an obviously transformable function—it
almost never matters whether intermediate steps involve the transforms of functions that
obviously cannot be transformed or even, strictly speaking, are not true functions at all. Any
reasonably comprehensive table of Fourier transforms contains functions that not only violate
requirements (V) through (VIII) in Sec. 2.4 but also have transform integrals that, according to
the standard definition of integration, either diverge or have no well-defined value. This book
shows that these puzzling entries are the modest but ubiquitous legacy of mathematicians who
have extended the meaning of what is meant by an integral and what is meant by a function in
Fourier-transform theory. Their work has not only benefited many scientists and engineers who
no longer have to apologize for the way they solve Fourier-transform problems but has also
helped their students who no longer need to accept without good explanations divergent integrals
and the transforms of poorly defined functions.
The standard definition of an improper integral

$$\int_{-\infty}^{\infty} u(t)\,dt$$

for the function u(t) is that

$$\int_{-\infty}^{\infty} u(t)\,dt = \lim_{\substack{T_1\to\infty \\ T_2\to\infty}} \int_{-T_1}^{T_2} u(t)\,dt.$$

If there is any singular point $t_s$ where $\lim_{t\to t_s} u(t) = \pm\infty$, the definition becomes

$$\int_{-\infty}^{\infty} u(t)\,dt = \lim_{\substack{T_1\to\infty,\;T_2\to\infty \\ \varepsilon_1\to0,\;\varepsilon_2\to0}} \left[\int_{-T_1}^{t_s-\varepsilon_1} u(t)\,dt + \int_{t_s+\varepsilon_2}^{T_2} u(t)\,dt\right]. \qquad (2.41a)$$

In this definition, the limits as $T_1\to\infty$, $T_2\to\infty$, $\varepsilon_1\to0$, and $\varepsilon_2\to0$ occur independently; no matter how $T_1$, $T_2$, $\varepsilon_1$, and $\varepsilon_2$ approach their limits, the same answer is expected if the integral exists. We now decide, in the interest of expanding Fourier-transform theory, to change this standard definition of the improper integral by connecting $\varepsilon_1$ to $\varepsilon_2$ and $T_1$ to $T_2$ as we take the limit,

$$\int_{-\infty}^{\infty} u(t)\,dt = \lim_{\substack{T\to\infty \\ \varepsilon\to0}} \left[\int_{-T}^{t_s-\varepsilon} u(t)\,dt + \int_{t_s+\varepsilon}^{T} u(t)\,dt\right]. \qquad (2.41b)$$

The limiting process in definition (2.41b) is said to give the Cauchy principal value of the integral, sometimes written as

$$\mathrm{PV}\!\int_{-\infty}^{\infty} u(t)\,dt$$

or with a small bar drawn through the integral sign. If u(t) has multiple singular points, the definition is expanded in the obvious way. For example, with two singular points at $t_{s1}$ and $t_{s2}$ with $t_{s1} < t_{s2}$, we have

$$\mathrm{PV}\!\int_{-\infty}^{\infty} u(t)\,dt = \lim_{\substack{T\to\infty \\ \varepsilon_1\to0,\;\varepsilon_2\to0}} \left[\int_{-T}^{t_{s1}-\varepsilon_1} u(t)\,dt + \int_{t_{s1}+\varepsilon_1}^{t_{s2}-\varepsilon_2} u(t)\,dt + \int_{t_{s2}+\varepsilon_2}^{T} u(t)\,dt\right] \qquad (2.41c)$$

and so on for three, four, etc., interior points of singularity in u(t). If an improper integral converges to a finite value in the standard sense of (2.41a), then its Cauchy principal value also converges to the same answer, but many improper integrals that do not converge in the sense of (2.41a) nevertheless have well-defined Cauchy principal values. For this reason, it is customary in Fourier-transform theory to interpret all improper integrals—such as the forward and inverse Fourier transforms—as Cauchy principal values, and that is what we shall do from now on. There will be no special notation used to distinguish Cauchy principal values from ordinary improper integrals.
To show the relevance of the Cauchy principal value, we calculate the Fourier transform of $1/t$, an example already considered above in connection with the sine transform [see discussion following Eq. (2.10e)]. Using the identity $e^{i\alpha} = \cos(\alpha) + i\sin(\alpha)$, we have

$$\mathbf{F}^{(-ift)}\big(t^{-1}\big) = \int_{-\infty}^{\infty} t^{-1}\,e^{-2\pi ift}\,dt = \int_{-\infty}^{\infty} t^{-1}\cos(2\pi ft)\,dt - i\!\int_{-\infty}^{\infty} t^{-1}\sin(2\pi ft)\,dt. \qquad (2.42a)$$

There is no problem evaluating the imaginary part of this transform. Because $t^{-1}\sin(2\pi ft)$ is an even function of t, we can apply formulas (2.19) and (2.10f) to get

$$i\!\int_{-\infty}^{\infty} t^{-1}\sin(2\pi ft)\,dt = 2i\!\int_0^{\infty} t^{-1}\sin(2\pi ft)\,dt = i\pi \quad\text{for } f > 0.$$

When $f < 0$, we have

$$i\!\int_{-\infty}^{\infty} t^{-1}\sin(2\pi ft)\,dt = -i\!\int_{-\infty}^{\infty} t^{-1}\sin(2\pi |f| t)\,dt = -i\pi,$$

allowing us to write

$$i\!\int_{-\infty}^{\infty} t^{-1}\sin(2\pi ft)\,dt = i\pi\,\mathrm{sgn}(f), \qquad (2.42b)$$

where we define

$$\mathrm{sgn}(f) = \begin{cases} 1 & \text{for } f > 0 \\ 0 & \text{for } f = 0 \\ -1 & \text{for } f < 0 \end{cases}. \qquad (2.42c)$$

The specification that $\mathrm{sgn}(0) = 0$ makes $\mathrm{sgn}(f)$ a proper odd function, equal to zero at $f = 0$, even though it has a jump discontinuity there. It also, of course, makes sense considering that (2.42b) is the integral of the zero function when $f = 0$. Evaluation of the real part of the transform in (2.42a) shows the usefulness of interpreting improper integrals as Cauchy principal values. When $f = 0$, the real part of the left-hand side of (2.42a) becomes, using the standard interpretation of an improper integral in (2.41a),

$$\int_{-\infty}^{\infty} \frac{dt}{t} = \lim_{\substack{T_1,T_2\to\infty \\ \varepsilon_1,\varepsilon_2\to0}} \left[\int_{-T_1}^{-\varepsilon_1} \frac{dt}{t} + \int_{\varepsilon_2}^{T_2} \frac{dt}{t}\right] = \lim_{\substack{T_1,T_2\to\infty \\ \varepsilon_1,\varepsilon_2\to0}} \left[\ln\!\left(\frac{\varepsilon_1}{T_1}\right) + \ln\!\left(\frac{T_2}{\varepsilon_2}\right)\right] = \lim_{\substack{T_1,T_2\to\infty \\ \varepsilon_1,\varepsilon_2\to0}} \left[\ln\!\left(\frac{\varepsilon_1}{\varepsilon_2}\right) + \ln\!\left(\frac{T_2}{T_1}\right)\right]. \qquad (2.43a)$$

The expression $\ln(\varepsilon_1/\varepsilon_2)$ can be made anything we want depending on the limiting ratio chosen for $\varepsilon_1/\varepsilon_2$ as $\varepsilon_1\to0$ and $\varepsilon_2\to0$; the same is true of $\ln(T_2/T_1)$ as $T_1\to\infty$ and $T_2\to\infty$. Therefore, under the standard interpretation of an improper integral, the limit in (2.43a) does not exist. Comparison of (2.41a) to (2.41b) shows that (2.43a) can be converted to a Cauchy principal value by setting $\varepsilon_1 = \varepsilon_2 = \varepsilon$, $T_1 = T_2 = T$, and taking the limit as $T\to\infty$, $\varepsilon\to0$. This leads to

$$\lim_{\substack{T\to\infty \\ \varepsilon\to0}} \left[\ln\!\left(\frac{\varepsilon}{\varepsilon}\right) + \ln\!\left(\frac{T}{T}\right)\right] = 0,$$

allowing us to give a well-defined value to the expression

$$\int_{-\infty}^{\infty} \frac{dt}{t}.$$

In general, the Cauchy principal value of the integral of any odd function is always zero,

$$\int_{-\infty}^{\infty} u(t)\,dt = 0 \quad\text{for any function } u \text{ such that } u(-t) = -u(t), \qquad (2.43b)$$

because when taking the limit we are always simultaneously adding $u(t)\,dt$ increments to the integral at values of t and −t, with the balanced addition of increments always cancelling out. Hence, interpreted as a Cauchy principal value,

$$\int_{-\infty}^{\infty} t^{-1}\cos(2\pi ft)\,dt = 0 \qquad (2.43c)$$

because $t^{-1}\cos(2\pi ft)$ is an odd function of t. Therefore we can now assign a well-defined meaning to the forward Fourier transform of $1/t$ in (2.42a) using (2.43c) and (2.42b):

$$\mathbf{F}^{(-ift)}\big(t^{-1}\big) = -i\pi\,\mathrm{sgn}(f). \qquad (2.43d)$$
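Equation (2.43d) can be checked numerically, because the principal-value prescription is easy to imitate with symmetric cutoffs. In the sketch below (illustrative; the cutoff, upper limit, and sample count are arbitrary), the even cosine part of the integrand cancels between t and −t by symmetry, and the surviving sine part reproduces $-i\pi\,\mathrm{sgn}(f)$ to roughly three decimal places, limited by the finite cutoff and upper limit.

```python
import numpy as np

eps, T, N = 1.0e-4, 500.0, 1_000_001     # symmetric cutoff eps, upper limit T
t = np.linspace(eps, T, N)               # positive half-axis only
dt = t[1] - t[0]

for f in (0.5, -0.5, 2.0):
    # cosine part cancels between t and -t; sine part doubles:
    val = -2j * np.sum(np.sin(2.0 * np.pi * f * t) / t) * dt
    print(f, val.imag, -np.pi * np.sign(f))   # agree to ~1e-3
```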
For this answer to be a true extension to Fourier-transform theory, however, $1/t$ must satisfy Eq. (2.28ℓ); that is, the inverse transform

$$\mathbf{F}^{(itf)}\big(-i\pi\,\mathrm{sgn}(f)\big)$$

has to give back the original function $1/t$. Direct evaluation of the inverse transform gives

$$\mathbf{F}^{(itf)}\big(-i\pi\,\mathrm{sgn}(f)\big) = -i\pi\!\int_{-\infty}^{\infty} \mathrm{sgn}(f)\,e^{2\pi ift}\,df = -i\pi\!\int_{-\infty}^{\infty} \cos(2\pi ft)\,\mathrm{sgn}(f)\,df + \pi\!\int_{-\infty}^{\infty} \sin(2\pi ft)\,\mathrm{sgn}(f)\,df. \qquad (2.43e)$$

The cosine integral is again the integral of an odd function, so its Cauchy principal value is zero, but it is still not clear what value to assign the integral of $\sin(2\pi ft)\,\mathrm{sgn}(f)$. As the integral of an even function, we might try applying formula (2.19) to get

$$\pi\!\int_{-\infty}^{\infty} \sin(2\pi ft)\,\mathrm{sgn}(f)\,df \;\stackrel{?}{=}\; 2\pi\!\int_0^{\infty} \sin(2\pi ft)\,\mathrm{sgn}(f)\,df = 2\pi\!\int_0^{\infty} \sin(2\pi ft)\,df, \qquad (2.43f)$$

but then we have the same difficulty already encountered when trying to evaluate the sine transform

$$2\pi\!\int_0^{\infty} \sin(2\pi ft)\,df$$

in Eq. (2.10g). To evaluate the inverse transform of $-i\pi\,\mathrm{sgn}(f)$, we need to create a new class of mathematical entities, called generalized functions, together with a set of rules for how they behave inside integrals. This extension to Fourier-transform theory is often called distribution theory, with the generalized functions called distributions.
2.11 Generalized Functions
Generalized functions are based on the well-established mathematical concept of a functional. A functional is a rule for assigning a complex number to each member of a set of test functions, where each test function φ has only one number assigned to it and the same number may end up assigned to different test functions. The Fourier transform of a function φ(t) at a specific frequency $f = f_0$ is a functional because it assigns the number $\mathbf{F}^{(-if_0t)}\big(\phi(t)\big)$ to the test function φ. In general, we can use any complex function u(t) having a real argument t as a weighting function inside an integral to create a functional. This functional, called $u_{\smallint}$, is defined to be

$$u_{\smallint}(\phi) = \int_{-\infty}^{\infty} u(t)\,\phi(t)\,dt = \text{complex number}. \qquad (2.44)$$

According to this definition the functional $u_{\smallint}$ is linear, like the Fourier transform, because

$$u_{\smallint}(\alpha\phi_1 + \beta\phi_2) = \int_{-\infty}^{\infty} u(t)\,\big[\alpha\phi_1(t) + \beta\phi_2(t)\big]\,dt = \alpha\!\int_{-\infty}^{\infty} u(t)\,\phi_1(t)\,dt + \beta\!\int_{-\infty}^{\infty} u(t)\,\phi_2(t)\,dt = \alpha\,u_{\smallint}(\phi_1) + \beta\,u_{\smallint}(\phi_2) \qquad (2.45)$$

for any two complex constants α, β and test functions $\phi_1$, $\phi_2$.

From the notation $u_{\smallint}$, it is clear that all functions u, as long as the integral in Eq. (2.44) exists, have associated with them the functional $u_{\smallint}$ defined for the test functions φ. There are also functionals that behave in every way like the functionals $u_{\smallint}$, but for which no corresponding true function u can be defined. We can, however, associate with these functionals a new class of mathematical objects, called generalized functions, which can be shown to have many of the properties of true functions. For this reason, it is customary to use function notation when referring to generalized functions. If an already-understood functional has no true function u(t) associated with it, we can use the properties of this already-understood functional to define a generalized function called $u_G(t)$, with the subscript G reminding us that $u_G$ is a generalized function. By analogy with the true function u(t) associated with the functional $u_{\smallint}$, the generalized function and its behavior inside integrals is defined in terms of the already-known functional, which we call $u_{G\smallint}$, using the definition

$$\int_{-\infty}^{\infty} u_G(t)\,\phi(t)\,dt = u_{G\smallint}(\phi) \qquad (2.46)$$

for any test function φ. Since we already know what complex number the functional $u_{G\smallint}$ gives for any test function φ, Eq. (2.46) is not a definition of $u_{G\smallint}$ but rather a definition of what it means to put $\big[u_G(t)\,\phi(t)\big]$ inside an integral. Clearly, the generalized function itself is well defined only when its product with a test function is integrated over t. Because the functional $u_{G\smallint}$ behaves in every way like the functionals $u_{\smallint}$ based on the Cauchy-principal-value integration of true functions, we have established a new type of integration using the product of generalized functions $u_G(t)$ with test functions φ(t). Hence, we have not only generalized what is meant by a function but have also extended again what is meant by integration.
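The functional point of view is easy to imitate numerically. In the sketch below (an illustration in which the ordinary function sgn(t) plays the role of the weighting function u, and the test functions are arbitrary Gaussian-type choices), we evaluate $u_{\smallint}(\phi)$ for a few test functions and confirm the linearity property (2.45).

```python
import numpy as np

t = np.linspace(-20.0, 20.0, 40001)
dt = t[1] - t[0]
u = np.sign(t)                           # weighting function u(t) = sgn(t)

def u_int(phi):                          # the functional of Eq. (2.44)
    return np.sum(u * phi(t)) * dt

phi1 = lambda x: np.exp(-x**2)           # even test function
phi2 = lambda x: x * np.exp(-x**2)       # odd test function
print(u_int(phi1))                       # ~0: sgn(t) is odd
print(u_int(phi2))                       # ~1.0 analytically
print(np.isclose(u_int(lambda x: 2.0 * phi1(x) + 3.0 * phi2(x)),
                 2.0 * u_int(phi1) + 3.0 * u_int(phi2)))  # linearity, Eq. (2.45)
```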
To handle algebraic expressions involving both generalized functions and true functions, we must define what it means to say two generalized functions $u_G(t)$ and $v_G(t)$ are equal. We say that when

$$\int_{-\infty}^{\infty} u_G(t)\,\phi(t)\,dt = \int_{-\infty}^{\infty} v_G(t)\,\phi(t)\,dt \qquad (2.47a)$$

for all appropriate test functions φ, then

$$u_G(t) = v_G(t). \qquad (2.47b)$$

We also define a generalized function $u_G(t)$, which we know only from its associated functional $u_{G\smallint}$ using definition (2.46), to be equal to a true function v(t) when

$$u_{G\smallint}(\phi) = v_{\smallint}(\phi) \qquad (2.48a)$$

for all appropriate test functions φ. Another way of stating this is that whenever

$$\int_{-\infty}^{\infty} u_G(t)\,\phi(t)\,dt = \int_{-\infty}^{\infty} v(t)\,\phi(t)\,dt \qquad (2.48b)$$

for all the test functions φ, we say that

$$u_G(t) = v(t). \qquad (2.48c)$$

Two generalized functions $u_G(t)$ and $v_G(t)$ are defined to be equal over an interval $a < t < b$ when

$$u_{G\smallint}(\phi_{ab}) = v_{G\smallint}(\phi_{ab}) \qquad (2.48d)$$

or

$$\int_{-\infty}^{\infty} u_G(t)\,\phi_{ab}(t)\,dt = \int_{-\infty}^{\infty} v_G(t)\,\phi_{ab}(t)\,dt \qquad (2.48e)$$

for all test functions $\phi_{ab}(t)$ that are identically zero for all $t < a$ and for all $t > b$. The key point here is that we are explicitly allowing $\phi_{ab}(t)$ to be nonzero only inside the interval $a < t < b$. We also say that a true function v(t) equals a generalized function $u_G(t)$ in the interval $a < t < b$,

$$u_G(t) = v(t) \quad\text{for } a < t < b, \qquad (2.48f)$$

whenever

$$\int_{-\infty}^{\infty} u_G(t)\,\phi_{ab}(t)\,dt = \int_{-\infty}^{\infty} v(t)\,\phi_{ab}(t)\,dt \qquad (2.48g)$$

for all the $\phi_{ab}(t)$ test functions. In Eqs. (2.48d)–(2.48g), we allow for half-infinite intervals by permitting constant b to be +∞ with constant a finite, and constant a to be −∞ with constant b finite.
The definitions of equality between two generalized functions, or between a generalized function and a true function, can be, depending on the set of test functions φ chosen, either very much looser than the standard idea of equality or very much the same. Suppose, by way of analogy, we define two true functions $u_1(t)$ and $u_2(t)$ to be "equal" when

$$\int_{-\infty}^{\infty} u_1(t)\,\phi(t)\,dt = \int_{-\infty}^{\infty} u_2(t)\,\phi(t)\,dt \qquad (2.49)$$

for all test functions φ. If the only allowed test function is $\phi(t) = 0$, then any two functions $u_1(t)$ and $u_2(t)$ are "equal." If, on the other hand, the allowed test functions are $\phi(t) = e^{\pm 2\pi ift}$ for all real values of ƒ, we are saying that $u_1(t)$ and $u_2(t)$ are "equal" when their Fourier transforms $\mathbf{F}^{(\pm ift)}\big(u_1(t)\big)$ and $\mathbf{F}^{(\pm ift)}\big(u_2(t)\big)$ are the same. From the Fourier inversion formulas, it then follows that $u_1(t)$ must be identical to $u_2(t)$, except possibly at jump discontinuities and isolated points, for all reasonably well-behaved functions $u_1(t)$ and $u_2(t)$. In general, we expect the set of test functions to be diverse enough that serious thought and some mathematical ingenuity are required to find two functions $u_1(t)$ and $u_2(t)$ that satisfy Eq. (2.49) yet are not basically the same function. Of course, the integrals used in Eq. (2.49)—and all the other integrals involving only true functions in Eqs. (2.44) through (2.48g), for that matter—must be known to exist. Often the finiteness of these integrals and the general smoothness of the test functions are enforced by the requirement that

$$\lim_{|t|\to\infty}\big[t^N\phi(t)\big] = 0 \quad\text{for } N = 0, 1, 2, \ldots, \qquad (2.50a)$$

with the Mth derivative, $\phi^{(M)}(t) = d^M\phi/dt^M$, satisfying

$$\lim_{|t|\to\infty}\big[t^N\phi^{(M)}(t)\big] = 0 \quad\text{for } N = 0, 1, 2, \ldots \text{ and } M = 1, 2, \ldots. \qquad (2.50b)$$

A function such as $e^{-at^2}$ for $a > 0$ satisfies (2.50a) and (2.50b), and in general all functions representing physically realistic measurements can be taken to satisfy these two requirements. It turns out, however, that the most useful and popular generalized function used in Fourier theory can handle a wider variety of test functions, requiring only that the test functions φ be continuous at $t = 0$ (see Sec. 2.14 below).
Continuing to develop what is meant by the = sign applied to generalized functions, we say that the product of a true function w(t) and a generalized function $u_G(t)$ is another generalized function $v_G(t)$,

$$v_G(t) = w(t)\,u_G(t), \qquad (2.51a)$$

which is defined to mean that

$$\int_{-\infty}^{\infty} v_G(t)\,\phi(t)\,dt = \int_{-\infty}^{\infty} u_G(t)\,\big[w(t)\,\phi(t)\big]\,dt$$

for all test functions φ(t). A linear combination of true functions and generalized functions specified by

$$w_G(t) = u_1(t)\,v_{G1}(t) + u_2(t)\,v_{G2}(t) + \cdots \qquad (2.51b)$$

is defined to mean that

$$\int_{-\infty}^{\infty} w_G(t)\,\phi(t)\,dt = \int_{-\infty}^{\infty} u_1(t)\,v_{G1}(t)\,\phi(t)\,dt + \int_{-\infty}^{\infty} u_2(t)\,v_{G2}(t)\,\phi(t)\,dt + \cdots$$

for all test functions φ(t). In general, there is no difficulty assigning a meaning to equations such as

$$u_1(t)\,v_{G1}(t) + u_2(t)\,v_{G2}(t) + \cdots + u_N(t)\,v_{GN}(t) = U_1(t)\,V_{G1}(t) + U_2(t)\,V_{G2}(t) + \cdots + U_M(t)\,V_{GM}(t) \qquad (2.51c)$$

for true functions $u_1(t), u_2(t), \ldots, u_N(t), U_1(t), U_2(t), \ldots, U_M(t)$ and generalized functions $v_{G1}(t), v_{G2}(t), \ldots, v_{GN}(t), V_{G1}(t), V_{G2}(t), \ldots, V_{GM}(t)$. As long as both sides of the equation are just linear combinations of generalized functions and true functions, we interpret their equality to mean that

$$\int_{-\infty}^{\infty} u_1(t)\,v_{G1}(t)\,\phi(t)\,dt + \cdots + \int_{-\infty}^{\infty} u_N(t)\,v_{GN}(t)\,\phi(t)\,dt = \int_{-\infty}^{\infty} U_1(t)\,V_{G1}(t)\,\phi(t)\,dt + \cdots + \int_{-\infty}^{\infty} U_M(t)\,V_{GM}(t)\,\phi(t)\,dt$$

for all test functions φ(t). Even the simplest nonlinear expressions, however, such as

$$v_G(t) \stackrel{?}{=} \big[u_G(t)\big]^2,$$

cannot be resolved by putting both sides inside an integral, because the right-hand side of

$$\int_{-\infty}^{\infty} v_G(t)\,\phi(t)\,dt \stackrel{?}{=} \int_{-\infty}^{\infty} \big[u_G(t)\big]^2\,\phi(t)\,dt$$

is still undefined. We know that the left-hand side is the same as applying an already-understood functional to φ, and that

$$\int_{-\infty}^{\infty} u_G(t)\,\phi(t)\,dt = u_{G\smallint}(\phi),$$

but no definition has been given to

$$\int_{-\infty}^{\infty} \big[u_G(t)\big]^2\,\phi(t)\,dt$$

in terms of the functional $u_{G\smallint}$. It turns out that, in general, nonlinear expressions involving generalized functions cannot be given useful interpretations. Hence, generalized functions must be treated with caution unless they are used inside linear combinations of the type shown in (2.51b) and (2.51c).
Although generalized functions do have limitations, there are many things that can be done with them. We can give meaning to $u_G(t-a)$ for any real constant a by defining that

$$\int_{-\infty}^{\infty} u_G(t-a)\,\phi(t)\,dt = \int_{-\infty}^{\infty} u_G(t)\,\phi(t+a)\,dt \qquad (2.52a)$$

for all test functions φ. This definition is, of course, consistent with what happens when the formal substitution $t' = t - a$ is made inside the original integral,

$$\int_{-\infty}^{\infty} u_G(t-a)\,\phi(t)\,dt = \int_{-\infty}^{\infty} u_G(t')\,\phi(t'+a)\,dt' = \int_{-\infty}^{\infty} u_G(t)\,\phi(t+a)\,dt,$$

treating $u_G(t-a)$ like a true function $u(t-a)$. We can give meaning to $u_G(at)$ for any real constant a by defining that

$$\int_{-\infty}^{\infty} u_G(at)\,\phi(t)\,dt = \frac{1}{|a|}\int_{-\infty}^{\infty} u_G(t)\,\phi(t/a)\,dt \qquad (2.52b)$$

for all test functions φ. This definition is consistent with what happens when we make the formal substitution $t' = at$ in the integral

$$\int_{-\infty}^{\infty} u_G(at)\,\phi(t)\,dt$$

and treat $u_G(at)$ like a true function,

$$\int_{-\infty}^{\infty} u_G(at)\,\phi(t)\,dt = \left\{\begin{array}{ll} \dfrac{1}{a}\displaystyle\int_{-\infty}^{\infty} u_G(t')\,\phi(t'/a)\,dt' & \text{for } a > 0 \\[2ex] -\dfrac{1}{a}\displaystyle\int_{-\infty}^{\infty} u_G(t')\,\phi(t'/a)\,dt' & \text{for } a < 0 \end{array}\right\} = \frac{1}{|a|}\int_{-\infty}^{\infty} u_G(t)\,\phi(t/a)\,dt.$$

When the argument of $u_G$ is the linear combination $at + c$ for real constants a and c, we define

$$\int_{-\infty}^{\infty} u_G(at+c)\,\phi(t)\,dt = \frac{1}{|a|}\int_{-\infty}^{\infty} u_G(t)\,\phi\big((t-c)/a\big)\,dt \qquad (2.52c)$$

and, combining the arguments used to explain definitions (2.52a) and (2.52b), we see that transforming the variable of integration to $t' = at + c$ gives

$$\int_{-\infty}^{\infty} u_G(at+c)\,\phi(t)\,dt = \frac{1}{|a|}\int_{-\infty}^{\infty} u_G(t')\,\phi\big((t'-c)/a\big)\,dt',$$

justifying definition (2.52c). In general, any variable transformation that is permitted for the argument of a true function we also permit for the argument of a generalized function, unless it results in an inappropriate test function.
We define a generalized function ( )
G
u t to be even if
- 127 -
2 · Fourier Theory
- 128 -
( ) ( ) 0
G o
u t t dt o
·
÷·

³
(2.52d)

for all odd test functions
o
o , and we define ( )
G
u t to be odd if

( ) ( ) 0
G e
u t t dt o
·
÷·

³
(2.52e)

for all even test functions
e
o . This gives ( )
G
u t the same behavior it would have if it were an even
or odd true function multiplied by
e
o or
o
o and integrated over all t. Putting a subscript e on the
generalized function ( )
Ge
u t to show that it obeys the above definition for an even generalized
function, we note that, as described in Eq. (2.11c) above, any test function ( ) t o can be written as
the sum of an even function ( )
e
t o and an odd function ( )
o
t o . Hence, for any test function o and
an even generalized function ( )
Ge
u t , we can write, using definition (2.52d),


[ ]
( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( )
( ) ( ) .
Ge Ge e o Ge e Ge o
Ge e
u t t dt u t t t dt u t t dt u t t dt
u t t dt
o o o o o
o
· · · ·
÷· ÷· ÷· ÷·
·
÷·
+ +

³ ³ ³ ³
³



Definition (2.52b) gives, again using that ( ) ( ) ( )
e o
t t t o o o + ,


[ ]
( ) ( ) ( ) ( ) ( ) ( ) ( )
( ) ( ) ( ) ( )
Ge Ge Ge e o
Ge e Ge o
G
u t t dt u t t dt u t t t dt
u t t dt u t t dt
u
o o o o
o o
· · ·
÷· ÷· ÷·
· ·
÷· ÷·
÷ ÷ ÷ + ÷
÷ + ÷

³ ³ ³
³ ³

( ) ( ) ( ) ( )
( ) ( ) ,
e e Ge o
Ge e
t t dt u t t dt
u t t dt
o o
o
· ·
÷· ÷·
·
÷·
÷

³ ³
³



where in the last two steps we use ( ) ( )
o o
t t o o ÷ ÷ , ( ) ( )
e e
t t o o ÷ , and definition (2.52d). We see
that both
- 128 -
Generalized Functions · 2.11
- 129 -
( ) ( )
Ge
u t t dt o
·
÷·
³
and ( ) ( )
Ge
u t t dt o
·
÷·
÷
³

are equal to
( ) ( )
Ge e
u t t dt o
·
÷·
³


for any test function o , so by definition (2.47a) for the equality of two generalized functions, it
follows that
( ) ( )
Ge Ge
u t u t ÷ (2.52f)

for any even generalized function ( )
Ge
u t . If ( )
Go
u t is any odd generalized function, we can use
( ) ( ) ( )
e o
t t t o o o + and definition (2.52e) to get


[ ]
( ) ( ) ( ) ( ) ( ) ( ) ( )
Go Go e o Go o
u t t dt u t t t dt u t t dt o o o o
· · ·
÷· ÷· ÷·
+
³ ³ ³


and definition (2.52b) to get


( ) ( ) ( ) ( ) ( ) ( ) ( ) ( )
( ) ( ) [ ( )] ( )
Go Go Go e Go o
Go e Go o
u t t dt u t t dt u t t dt u t t dt
u t t dt u t t dt
o o o o
o o
· · · ·
÷· ÷· ÷· ÷·
· ·
÷· ÷·
÷ ÷ ÷ + ÷
+ ÷
³ ³ ³ ³
³ ³

[ ( ) ( )]
Go o
u t t dt o
·
÷·
÷
³


or
[ ( )] ( ) ( ) ( )
Go Go o
u t t dt u t t dt o o
· ·
÷· ÷·
÷ ÷
³ ³
.

Clearly, ( ) ( )
Go
u t t dt o
·
÷·
³
and [ ( )] ( )
Go
u t t dt o
·
÷·
÷ ÷
³
are equal to each other because they are both
equal to ( ) ( )
Go o
u t t dt o
·
÷·
³
for any test function o , so by definition (2.47a) we conclude that
( ) ( )
Go Go
u t u t ÷ ÷
- 129 -
2 · Fourier Theory
- 130 -
or
( ) ( )
Go Go
u t u t ÷ ÷ . (2.52g)

We define the derivative of a generalized function ( )
G
u t to be another generalized function


(1)
( ) ( )
G G
u t u t ´ .

The generalized function ( )
G
u t is defined in terms of the already-known functional
G
u ³ , but
what functional
G
u´ ³ defines the generalized function ( )
G
u t ´ ? We specify this new functional
G
u´ ³
with the definition

( ) ( ) G G
u u o o ´ ´ ³ ÷ ³
or

( )
( ) ( ) ( )
G G G
d
u u t t dt u t dt
dt
o
o o
· ·
÷· ÷·
§ ·
´ ´ ³ ÷ ÷
¨ ¸
© ¹
³ ³
(2.53a)

for any test function o . Therefore, the new generalized function ( )
G
u t ´ satisfies the equation

( ) ( ) ( )
G G
d
u t t dt u t dt
dt
o
o
· ·
÷· ÷·
§ ·
´ ÷
¨ ¸
© ¹
³ ³
(2.53b)

for any test function o . We note that this definition is consistent with a formal integration by
parts, treating ( )
G
u t ´ like a true function ( ) u t ´ to get


[ ]
( ) ( ) ( ) ( ) ( ) ( )
G G G G
d d
u t t dt u t t u t dt u t dt
dt dt
o o
o o
· · ·
·
÷·
÷· ÷· ÷·
§ · § ·
´ ÷ ÷
¨ ¸ ¨ ¸
© ¹ © ¹
³ ³ ³
,

with the term in square brackets [ ] zero for all test functions o . We can make this first term zero
either by requiring o to approach zero as t ÷±· or by having ( )
G
u t equal a true function in the
sense of (2.48g) with the true function becoming zero as t ÷±·. The integral involving
( ) t d dt o o ´ must also, of course, have a well-defined meaning for all the test functions o .
The convolution of two generalized functions ( )
G
u t and ( )
G
v t is defined to be another
generalized function
( ) ( ) ( )
G G G
w t u t v t · . (2.54a)

From Eqs. (2.47a) and (2.47b), we know that (2.54a) must mean that
- 130 -
Generalized Functions · 2.11
- 131 -

[ ]
( ) ( ) ( ) ( ) ( )
G G G
w t t dt u t v t t dt o o
· ·
÷· ÷·
·
³ ³
(2.54b)

for all test functions o . We now give meaning to both sides of (2.54b) by defining that, for all
test functions o ,


[ ]
( ) ( ) ( ) ( ) ( ) ( ) ( ) ( )
G G G G G
w t t dt u t v t t dt dt u t dt v t t t o o o
· · · ·
÷· ÷· ÷· ÷·
´ ´ ´´ ´´ ´ ´´ · +
³ ³ ³ ³
. (2.54c)

Note that the right-hand side of (2.54c) is as well defined as our previous definitions, since

( ) ( )
v G
v t t t dt o
·
÷·
´´ ´ ´´ ´´ d +
³


is just another complex number depending on the real parameter t´ , which can be treated as
another true test function ( )
v
t´ d inside the double integral of (2.54c),

( ) ( ) ( ) ( ) ( )
G G G v
dt u t dt v t t t u t t dt o
· · ·
÷· ÷· ÷·
´ ´ ´´ ´´ ´ ´´ ´ ´ ´ + d
³ ³ ³
.

As long as ( ) t t o ´ ´´ + and ( )
v
t´ d are both test functions whenever o is a test function,
definition (2.54c) should present no difficulties. To justify this definition, we note that formally
treating ( )
G
u t and ( )
G
v t as true functions gives


[ ]
( ) ( ) ( ) ( ) ( ) ( )
( ) ( ) ( ) ,
G G G G
G G
u t v t t dt dt t dt u t v t t
dt u t dt t v t t
o o
o
· · ·
÷· ÷· ÷·
· ·
÷· ÷·
´´ ´´ ´´ ´´ ´´ ´´ ´ ´ ´´ ´ · ÷
´ ´ ´´ ´´ ´´ ´ ÷
³ ³ ³
³ ³



where the last step interchanges the order of integration. We now use (2.52a) to write

( ) ( ) ( ) ( )
G G
t v t t dt v t t t dt o o
· ·
÷· ÷·
´´ ´´ ´ ´´ ´´ ´´ ´ ´´ ÷ +
³ ³
,
which leads to
- 131 -
2 · Fourier Theory
- 132 -

[ ]
( ) ( ) ( ) ( ) ( ) ( )
G G G G
u t v t t dt dt u t dt v t t t o o
· · ·
÷· ÷· ÷·
´´ ´´ ´´ ´´ ´ ´ ´´ ´´ ´´ ´ · +
³ ³ ³
,

justifying the definition given in (2.54c). Note that the order of integration inside the double
integral of (2.54c) can be freely interchanged,

( ) ( ) ( ) ( ) ( ) ( )
G G G G
dt u t dt v t t t dt v t dt u t t t o o
· · · ·
÷· ÷· ÷· ÷·
´ ´ ´´ ´´ ´ ´´ ´´ ´´ ´ ´ ´ ´´ + +
³ ³ ³ ³
,

showing that ( ) ( ) ( ) ( )
G G G G
u t v t v t u t · · for generalized functions as well as true functions.
Because the convolution itself is defined as an integral, there is no problem giving a meaning to
the convolution of a true function with a generalized function as long as the true function is an
acceptable test function. For a generalized function ( )
G
u t and test function ( ) t o , we have

( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( )
G G G G
u t t u t t t dt u t t t dt u t t t dt o o o o
· · ·
÷· ÷· ÷·
´ ´ ´ ´ ´ ´ ´ ´ ´ · ÷ ÷ ÷ ÷
³ ³ ³
, (2.55a)

where definition (2.52c) with 1 a ÷ and c t is used in the last step of (2.55a). It clearly makes
sense to say that
( ) ( ) ( ) ( )
G G
u t t t dt t u t o o
·
÷·
´ ´ ´ ÷ ·
³
,
which means that
( ) ( ) ( ) ( )
G G
u t t t u t o o · · (2.55b)

for the convolution of a generalized function with any test function o .
2.12 Generalized Limits
Given a sequence of true functions
1 2
( ), ( ), , ( ),
n
u t u t u t … …, we can form a corresponding
sequence of integrals with the test functions o ,


1 2
( ) ( ) , ( ) ( ) , , ( ) ( ) ,
n
u t t dt u t t dt u t t dt o o o
· · ·
÷· ÷· ÷·
³ ³ ³
… ….

We define Glim, the generalized limit of the sequence of true functions ( )
n
u t , by taking the
standard limit of the sequence of integrals,
- 132 -
Generalized Limits · 2.12
- 133 -
lim ( ) ( )
n
n
u t t dt o
·
÷·
÷·
³
,

and requiring that the generalized limit of the sequence of true functions ( )
n
u t , written as

lim ( )
n
n
G u t
÷·
,

satisfy the equation
lim ( ) ( ) lim ( ) ( )
n n
n n
u t t dt G u t t dt o o
· ·
÷· ÷·
÷· ÷·
ª º

¬ ¼
³ ³
(2.56a)

for any test function o . In effect, the generalized limit Glim is what we get when we insist on
moving the standard limit inside the integral. Almost always, of course, it turns out that the
generalized limit is the same as the standard limit,

lim ( ) lim ( )
n n
n n
G u t u t
÷· ÷·
,
so that
lim ( ) ( ) lim ( ) ( )
n n
n n
u t t dt u t t dt o o
· ·
÷· ÷·
÷· ÷·
ª º

¬ ¼
³ ³
, (2.56b)

but this is not always the case. If we define the H function (see Fig. 2.6) by


1 for
( , ) 1 2 for
0 for
t T
t T t T
t T
­ <
°
H
®
°
>
¯



, (2.56c)

we can construct a sequence of true functions by


1
( ) ,1
n
t
u t
n n
§ ·
H
¨ ¸
© ¹
. (2.56d)

Function ( ,1) t n H is 1 only when n t n ÷ < < , so when

( ) 1 t o
- 133 -
2 · Fourier Theory

- 134 -
is an acceptable test function, it is always true that


1
( ) ,1 2
n
t
u t dt dt
n n
· ·
÷· ÷·
§ ·
H
¨ ¸
© ¹
³ ³
,

which makes
lim ( ) 2
n
n
u t dt
·
÷·
÷·

³
. (2.56e)
On the other hand,

1
lim ( ) lim ,1 0
n
n n
t
u t
n n
÷· ÷·
ª º
§ ·
H
¨ ¸ « »
© ¹
¬ ¼
,
which gives
lim ( ) 0
n
n
u t dt
·
÷·
÷·
ª º

¬ ¼
³
. (2.56f)
______________________________________________________________________________

FIGURE 2.6. ( , ) t T H

t
t T t T ÷
- 134 -
Generalized Limits · 2.12
- 135 -
The disagreement of (2.56e) and (2.56f) shows that there can be a very important difference
between the generalized limit and the standard limit, because Eq. (2.56b) does not always hold
true. We cannot avoid this problem by ruling out constant test functions such as ( ) 1 t o .
Consider, for example,

2
1
( )
1
t
t
o
+


and construct a sequence of true functions

( ) sin( )
n
u t t t n .

We find that
21


1
2
sin( )
1
n
t t n
dt e
t
r
·
÷
÷·

+
³
, (2.57a)
which gives

2
sin( )
lim
1
n
t t n
dt
t
r
·
÷·
÷·

+
³
. (2.57b)

This is not the same as

[ ]
2
{lim sin( ) }
0
1
n
t t n
dt
t
·
÷·
÷·

+
³
. (2.57c)

Once again, we have found a sequence of true functions ( )
n
u t that does not satisfy (2.56b). This
second example can, in fact, be seen to fail (2.56b) for much the same reason as the first. Since an
even function is being integrated, we can write that [see Eq. (2.19)]


2 2
0
sin( ) sin( )
lim 2lim
1 1
n n
t t n t t n
dt dt
t t
· ·
÷· ÷·
÷·

+ +
³ ³
. (2.57d)

Consider what happens to the first, positive hump of the sine as n increases in the integral on the
right-hand side of Eq. (2.57d). The values of t for which sin( ) t n is significantly different from
zero, say from ( 4) n r to (3 4) n r , comprise an interval ( 2) t n r A with a width that
increases linearly with n, just like the interval 2n in (2.56d) over which ( ,1) t n H equals one. The


21
I. S. Gradshteyn and I. M. Ryzhik, Table of Integrals, Series, and Products, edited by Alan Jeffrey, 5th ed.
(Academic Press, New York, 1994), p. 445, formula 4 in Sec. 3.723 with a=1/n and þ=1.
- 135 -
- 136 -
2.13 Fourier Transforms of Generalized Functions
For every generalized function ( )
G
u t , there is at least one sequence of true functions
1 2
( ), ( ), , ( ),
n
u t u t u t … … such that
lim ( ) ( )
n G
n
G u t u t
÷·
. (2.58a)

This formula should be interpreted in the sense of (2.47b) and (2.56a); that is, it means

lim ( ) ( ) lim ( ) ( ) ( ) ( )
n n G
n
n
G u t t dt u t t dt u t t dt o o o
· · ·
÷·
÷·
÷· ÷· ÷·
ª º

« »
¬ ¼
³ ³ ³
(2.58b)

for all test functions o . We use the sequence of true functions whose generalized limit is the
generalized function to define the Fourier transform of the generalized function. If a sequence of
true functions
1 2
( ), ( ), , ( ),
n
w t w t w t … … can be forward Fourier transformed to give another
2 · Fourier Theory

center of this hump is at ( 2) t n r , so as n increases, the hump’s center appears at ever larger
values of t. Hence, we can make the approximation that for large n


1
2
2
1
t
t
t nr
÷
e =
+
.

This means the characteristic size of

2
sin( )
1
t t n
t +


at the hump decreases as 1 n , while the hump’s width, ( 2) t n r A , increases as n. The product
of the size and width therefore tends to a constant as n gets large, preventing the integral from
shrinking as n ÷·. This is the same phenomenon that caused our first example
1
( ,1) n t n
÷
H to
fail Eq. (2.56b). Up to this point, we have, of course, only discussed the contribution of the first
- 136 -
Fourier Transforms of Generalized Functions · 2.13
- 137 -
sequence of true functions
1 2
( ), ( ), , ( ),
n
W f W f W f … … such that


2
( ) ( )
ift
n n
W f w t e dt
r
·
÷
÷·

³
(2.59a)
and

2
( ) ( )
ift
n n
w t W f e df
r
·
÷·

³
(2.59b)

for all values of n, we then define the forward Fourier transform of the generalized function

( ) lim ( )
G n
n
w t G w t
÷·
(2.59c)
to be
( )
( )
( ) lim ( )
ift
G n
n
w t G W f
÷
÷·
F . (2.59d)

We expect the sequence of true functions
1 2
( ), ( ), , ( ),
n
W f W f W f … … also to give a generalized
function when we take the generalized limit of the sequence,

( ) lim ( )
G n
n
W f G W f
÷·
, (2.59e)

and we define the inverse Fourier transform of this generalized function to be ( )
G
w t ,

( )
( )
( ) lim ( ) ( )
itf
G n G
n
W f G w t w t
÷·
F . (2.59f)

The double-arrow notation ÷ introduced in the discussion after Eq. (2.35d) can be used to
restate this definition more concisely. We define that whenever


1 2
( ), ( ), , ( )
G
w t w t w t …

is true, and that whenever

1 2
( ), ( ), , ( )
G
W f W f W f …

is true, and that whenever


1 1 2 2
( ) ( ), ( ) ( ), , ( ) ( ),
n n
w t W f w t W f w t W f ÷ ÷ ÷ … …
- 137 -
2 · Fourier Theory

- 138 -
is true for all n, it must also be true that

( ) ( )
G G
w t W t ÷ (2.59g)

for the generalized functions given by the generalized limits of sequences


1 2
( ), ( ), w t w t … and
1 2
( ), ( ), W f W f … .

Now at last we can attach a meaning to the Fourier transform pair that could not be completed
in Eqs. (2.43d)–(2.43f). The explicit development that follows is perhaps somewhat long, but
worth doing to show how to construct the Fourier transforms of some of the functions violating
one or more of requirements (V) through (VIII) in Sec. 2.4. We create the sequence

sgn( ) ( ,1), sgn( ) ( , 2), , sgn( ) ( , ), f f f f f f n H H H … …

and define the generalized sgn function by


[ ]
"sgn( )" lim sgn( ) ( , )
n
f G f f n
÷·
H , (2.60a)

where quotes “ ” are used to indicate that the “sgn( ) f ” is a generalized function instead of the
true function sgn( ) f defined in Eq. (2.42c) above. The reason for this choice of sequence is
straightforward—function [sgn( ) ( , )] f f n H satisfies requirements (V) through (VIII) in Sec. 2.4
for every finite value of n and so has a well-defined Fourier transform; as n increases, function
[sgn( ) ( , )] f f n H resembles ever more closely the sgn( ) f function to which we want to give a
Fourier transform. We note that for any test function o


[ ]
( ) "sgn( )" ( ) lim sgn( ) ( , )
lim ( ) sgn( ) ( , )
lim ( ) sgn( )
( ) sgn( )
n
n
n
n
n
f f df f G f f n df
f f f n df
f f df
f f df
o o
o
o
o
· ·
÷·
÷· ÷·
·
÷·
÷·
÷·
÷
H
H

³ ³
³
³


,
·
÷·
³

so
"sgn( )" sgn( ) f f (2.60b)
- 138 -
Fourier Transforms of Generalized Functions · 2.13
- 139 -
in the sense of Eq. (2.48c). This equivalence can be used to justify dropping the distinction
between “sgn( ) f ” and sgn( ) f . Applied mathematicians who work with generalized functions
often drop the distinction between a generalized function and the true function to which it is
equivalent, and the double-quote notation introduced here is not standard usage. There is,
however, no harm in keeping track of the distinction between the two types of functions, and the
double quotes acknowledge the close relationship of the two functions while reminding us that
they are not the same.
The inverse Fourier transform of [ sgn( ) ( , )] i f f n r ÷ H is, using the identity
cos sin
i
e i
¢
¢ ¢ + ,

( )
( ) 2
0
sgn( ) ( , ) sgn( ) ( , ) 2 sin(2 )
n
itf ift
i f f n i e f f n df ft df
r
r r r r
·
÷·
÷ H ÷ H
³ ³
F .

In the last step, we use that the integral of

[cos(2 ) sgn( ) ( , )] ft f f n r H ,

which is an odd function in ƒ, has an integral that is zero according to Eq. (2.17); and the integral
between (ín) and n of [sin(2 ) sgn( )] ft f r , which is an even function in ƒ, is twice the value of its
integral from zero to n according to Eq. (2.19). Making the substitution 2 f tf r ´ gives

( ) [ ]
2
( )
0
1
sgn( ) ( , ) cos
nt
itf
i f f n f
t
r
r ´ ÷ H ÷ F .

This shows that the inverse Fourier transform of [ sgn( ) ( , )] i f f n r ÷ H is

( ) [ ]
( ) 1
sgn( ) ( , ) 1 cos(2 )
itf
i f f n t nt r r
÷
÷ H ÷ F .

Now we calculate the forward Fourier transform of (1/ )[1 cos(2 )] t nt r ÷ . We get


( )
( ) 1 2 1
2 2
[1 cos(2 )] [1 cos(2 )]
1
cos(2 )
1
sgn( ) cos(2 ) sin(2
ift ift
ift ift
t nt e t nt dt
dt
e e nt dt
t t
i f i nt ft
t
r
r r
r r
r
r r r
·
÷ ÷ ÷ ÷
÷·
· ·
÷ ÷
÷· ÷·
÷ ÷
÷
÷ +
³
³ ³


F
) . dt
·
÷·
³

- 139 -
2 · Fourier Theory

- 140 -
In the last step, Eq. (2.43d) is used to evaluate the integral of
2 1
[ ]
ift
e t
r ÷ ÷
; we also substitute
cos sin
i
e i
¢
¢ ¢ + into the integral of
2 1
[ cos(2 )]
ift
e t nt
r
r
÷ ÷
, discovering that the Cauchy principle
value of the integral of
1
[ cos(2 ) cos(2 )] t ft nt r r
÷
, which is an odd function in t, is zero [see Eq.
(2.17)]. The remaining integral over the even function


1
[ sin(2 ) cos(2 )] t ft nt r r
÷


can be simplified by applying Eq. (2.19) and then consulting a table of definite integrals,
22



0
1 1
cos(2 ) sin(2 ) 2sgn( ) cos(2 ) sin(2 )
sgn( ) (2 , 2 ) sgn( ) ( , ) .
nt ft dt f nt f t dt
t t
f n f f n f
r r r r
r r r r
· ·
÷·

H H
³ ³




We conclude that the forward Fourier transform of (1/ )[1 cos(2 )] t nt r ÷ is


( )
( ) 1
[1 cos(2 )] sgn( ) ( , ) sgn( ) 1 ( , )
sgn( ) ( , ) .
ift
t nt f i i n f i f n f
i f f n
r r r r
r
÷ ÷
ª º ª º ÷ ÷ + H ÷ ÷H
¬ ¼ ¬ ¼
÷ H
F



Hence, (1/ )[1 cos(2 )] t nt r ÷ and [ sgn( ) ( , )] i f f n r ÷ H are a Fourier-transform pair,


[ ]
1
1 cos(2 ) sgn( ) ( , ) nt i f f n
t
r r ÷ ÷ ÷ H .

This confirms that there are two sequences


[ ] [ ] [ ]
1 1 1
1 cos(2 ) , 1 cos(4 ) , , 1 cos(2 ) , t t nt
t t t
r r r ÷ ÷ ÷ … … (2.60c)
and

sgn( ) ( ,1), sgn( ) ( , 2), , sgn( ) ( , ), i f f i f f i f f n r r r ÷ H ÷ H ÷ H … …



22
I. S. Gradshteyn and I. M. Ryzhik, Table of Integrals, Series, and Products, p. 453, formula 2 in Sec. 3.741 with
a=2r|f| and b=2rn.
- 140 -
Fourier Transforms of Generalized Functions · 2.13
- 141 -
such that each member of the lower sequence is the forward Fourier transform of the
corresponding member of the upper sequence and each member of the upper sequence is the
inverse Fourier transform of the corresponding member of the lower sequence. We know from
(2.60a) and (2.60b) that the generalized function given by the generalized limit of the lower
sequence is


[ ] [ ]
lim sgn( ) ( , ) lim sgn( ) ( , ) "sgn( )"
sgn( ) ,
n n
G i f f n i G f f n i f
i f
r r r
r
÷· ÷·
÷ H ÷ H ÷
÷
(2.60d)

but what is the generalized function given by the generalized limit of the upper sequence? We
have for any test function o


¦ ¦ [ ]
1
1
( ) lim [1 cos(2 )] lim ( ) 1 cos(2 )
lim ( ) ( ) cos(2 )
n n
n
t G t nt dt t nt dt
t
dt dt
t t nt
t t
o r o r
o o r
· ·
÷
÷· ÷·
÷· ÷·
· ·
÷·
÷· ÷·
÷ ÷
­ ½
° °
÷
® ¾
° °
¯ ¿
³ ³
³ ³


1
( ) lim ( ) cos(2 ) .
n
dt
t t nt dt
t t
o o r
· ·
÷·
÷· ÷·
÷
³ ³

(2.60e)

Working with the limit of the integral containing cos(2 ) nt r , we write


1 1
lim ( ) cos(2 ) lim ( ) cos(2 )
1
lim ( ) cos(2 )
1
lim ( ) cos(2 ) ,
n n
n
n
t nt dt t nt dt
t t
t nt dt
t
t nt dt
t
r
r
r
r
o r o r
o r
o r
·
÷· ÷·
÷· ÷·
÷·
÷
·
÷·

+
+
³ ³
³
³


(2.60f)

where r is a small positive number. By making all the test functions ( ) t o have finite variation as
in requirement (VIII) in Sec. 2.4, we recognize the first and third integrals on the right-hand side
of (2.60f) become zero as n ÷·, because eventually the cosine oscillates both positive and
negative over each infinitesimal interval while ( ) t t o barely changes at all—the integrals can be
made as small as desired by picking a large enough value of n. For future use, we note that for
any continuous, finite-variation test function o ,
- 141 -
2 · Fourier Theory

- 142 -

int
lim ( ) sin( ) lim ( ) cos( ) lim ( ) 0
n n n
t nt dt t nt dt t e dt o o o
· · ·
±
÷· ÷· ÷·
÷· ÷· ÷·

³ ³ ³
,

so that

int
limsin( ) limcos( ) lim 0
n n n
G nt G nt G e
±
÷· ÷· ÷·
. (2.60g)

The middle integral in Eq. (2.60f) can be written as


1
( ) cos(2 ) (0) ( , ) cos(2 )
dt
t nt t nt dt
t t
r
r
o r o r r
·
÷ ÷·
e H
³ ³
,

where we have chosen r small enough that ( ) t o barely changes over the integral, letting us
replace it by (0) o . Now the middle integral on the right-hand side of (2.60f) can be recognized as
the Cauchy principle value of the integral of (1 ) ( , ) cos(2 ) t t nt r r H , which is an odd function of t
and must be zero according to Eq. (2.17). Hence, (2.60f) becomes


1
lim ( ) cos(2 ) 0
n
t nt dt
t
o r
·
÷·
÷·

³
,

which shows that (2.60e) simplifies to


¦ ¦
1
( ) lim [1 cos(2 )] ( )
n
dt
t G t nt dt t
t
o r o
· ·
÷
÷·
÷· ÷·
÷
³ ³
(2.60h)

for any test function o . Since (2.60h) denotes equality in the sense of Eq. (2.48c), we can define
the generalized function “
1
t
÷
” to be


¦ ¦
1 1
" " lim [1 cos(2 )]
n
t G t nt r
÷ ÷
÷·
÷ (2.60i)

and then note that Eq. (2.60h) now states that


1 1
" " t t
÷ ÷
. (2.60j)

Equations (2.60d) and (2.60j) show that [ "sgn( )"] i f r ÷ and “
1
t
÷
” are the generalized limits of the
two sequences in (2.60c). Because all the sequence members are Fourier transform pairs, we
- 142 -
Fourier Transforms of Generalized Functions · 2.13
- 143 -
know, according to (2.59g), that [ "sgn( )"] i f r ÷ and
1
" " t
÷
are a Fourier transform pair even
though [ sgn( )] i f r ÷ and
1
t
÷
do not satisfy requirements (V) through (VIII) in Sec. 2.4 and, as
shown in Eqs. (2.43a) and (2.43f), their transforms cannot be evaluated as standard integrals. In
this sense, we can write that


( ) 1
( ) sgn( )
ift
t i f r
÷ ÷
÷ F (2.60k)
and
( )
( ) 1
sgn( )
ift
i f t r
÷
÷ F . (2.60A )

This can also be written as, reversing the sign of ƒ in (2.60k), the sign of t in (2.60A ), and using
Eq. (2.42c) to get that sgn( ) sgn( ) f f ÷ ÷ ,


( ) 1
( ) sgn( )
ift
t i f r
÷
F (2.60m)
and
( )
( ) 1
sgn( )
ift
i f t r
÷ ÷
÷ ÷ F . (2.60n)

It is important to remember that Eqs. (2.60k) and (2.60m) are true only when integrals between
í’ and +’ are interpreted as Cauchy principle values and (2.60A ) and (2.60n) are true only
when equality is defined as in Eq. (2.48c) using generalized function theory. Strictly speaking, it
might be better to say that the Cauchy principle value of


2 ift
dt
e
t
r
·
±
÷·
³
is sgn( ) i f r ±
and that

[ ]
2 1
"sgn( )" " "
ift
e i f df t
r
r
·
± ÷
÷·
÷ ±
³
.

This is the reason that

2
sgn( )
ift
dt
e i f
t
r
r
·
±
÷·
±
³
(2.61a)

is usually not listed in standard tables of improper integrals without notation showing that it is a
Cauchy principle value, and the equality


[ ]
2
sgn( )
ift
i
e f df
t
r
r
·
±
÷·
±
³
(2.61b)
- 143 -
2 · Fourier Theory

- 144 -
is usually not listed in these tables under any circumstances. It is also true, however, that (2.61a)
and (2.61b) are constantly used either explicitly or implicitly in Fourier-transform theory; and
lists of Fourier-transform pairs often contain (2.61a) and (2.61b). Unfortunately, it is standard
practice in the Fourier-transform tables that do list these integrals to omit any explanation that
they are only true when interpreted as the Fourier transforms of generalized functions. In general,
when using tables of Fourier transforms, all those transforms that do not exist as standard
integrals or Cauchy principle values should be interpreted as the transforms of generalized
functions and used only in the context of generalized function theory.
2.14 The Delta Function
The most popular and useful generalized function is the Dirac delta function, a name usually
shortened to just the delta function. In a sense, the Secs. 2.11–2.13 describing generalized
function theory are there just so we can give a mathematically exact description of the delta
function. The delta function is often inexactly described in elementary textbooks as that function
( ) t o such that

for 0
( )
0 for 0
t
t
t
o
· ­

®
=
¯


(2.62a)

with

(0) for 0
( ) ( )
0 for 0 or 0
b
a
f a b
t f t dt
a b a b
o
< < ­

®
< < < <
¯
³


. (2.62b)

More sophisticated textbooks may define it as a standard limit, for example,


1
( ) lim[ ( , )]
n
t n t n o
÷
÷·
H (2.63a)
or

2
( ) lim
nt
n
n
t e o
r
÷
÷·
§ ·

¨ ¸
¨ ¸
© ¹
. (2.63b)

There are, in fact, two different—but equivalent—mathematically exact ways to define the delta
function. The first way is to create a well-defined functional o ³ that, when operating on a
complex-valued test function ( ) t o with a real argument t, produces as its complex number (0) o ,
the value of o at t equal to zero,

( )
(0) o o o ³ . (2.64a)
- 144 -
The Delta Function · 2.14
- 145 -
This makes ( ) t o the generalized function associated with functional o ³ , with ( ) t o having the
property that
( ) ( ) (0) t t dt o o o
·
÷·

³
(2.64b)

for all test functions o . The second way to define ( ) t o is to say it is the generalized limit of a
sequence such as the ones specified in (2.63a) and (2.63b),


1
( ) lim[ ( , )]
n
t G n t n o
÷
÷·
H (2.65a)
or

2
( ) lim
nt
n
n
t G e o
r
÷
÷·
§ ·

¨ ¸
¨ ¸
© ¹
. (2.65b)

Although the delta function is a generalized function in every sense of the term, we follow
standard notation and do not add the G subscript—or add the quotes “ ”—used to label other
generalized functions in this chapter.
Defining ( ) t o with a functional, as in (2.64a), shows that this generalized function can be
used on an extremely large set of test functions—any true function that is continuous at the origin
is an acceptable and appropriate test function. The subset of test functions
ab
o used in Eqs.
(2.48d)–(2.48g) has a b < with ( )
ab
t o automatically set to zero when t does not lie inside the
interval a t b < < . These functions can be used in (2.64b) to show that

( ) ( ) (0) 0
ab ab
t t dt o o o
·
÷·

³


when 0 a b < < or 0 a b < < . Therefore, we have

( ) 0 for 0 t t o = (2.65c)

in the sense of definition (2.48f)—that is, we know that

( ) ( ) 0 ( ) 0
ab ab
t t dt t dt o o o
· ·
÷· ÷·

³ ³


- 145 -
2 · Fourier Theory

- 146 -
for all test functions
ab
o where the interval a t b < < does not include 0 t . This is a
mathematically exact way of stating the lower level of Eq. (2.62a). If ( ) t o is defined using
generalized limits, as in Eqs. (2.65a) and (2.65b), then we must show why Eq. (2.64b) is true. The
sequence in (2.65b), for example, leads to


2 2 2
2
( ) lim lim ( ) lim (0)
(0) lim
nt nt nt
n n n
nt
n
n n n
t G e dt e t dt e dt
n
e dt
o o o
r r r
o
r
· · ·
÷ ÷ ÷
÷· ÷· ÷·
÷· ÷· ÷·
·
÷
÷·
÷·
ª º
e
« »
¬ ¼

³ ³ ³
³

(0) o
(2.66)

for any test function o . As n gets large in (2.66), only the value of o at 0 t can contribute
significantly to the integral. Replacing ( ) t o by (0) o quickly reduces the whole expression to
(0) o , showing that the generalized limit of the sequence in (2.65b) is indeed the delta function.
Some commonly used sequences that have the delta function as their generalized limits are


( )
2 2
( ) lim
1
n
n
t G
n t
r
o
÷·

+
, (2.67a)


2
2
sin ( )
( ) lim
n
nt
t G
n t
o
r
÷·
, (2.67b)


sin(2 )
( ) lim
n
nt
t G
t
r
o
r
÷·
, (2.67c)

and so on. Perhaps the most interesting of these sequences is (2.67c). We know from (2.65c) that
one important property of the delta function is

( ) ( ) 0
ab
t t dt o o
·
÷·

³


whenever the interval a t b < < does not include 0 t . The reason that


sin(2 ) sin(2 )
lim ( ) lim ( ) 0
ab ab
n n
nt nt
G t dt t dt
t t
r r
o o
r r
· ·
÷· ÷·
÷· ÷·
ª º ª º

« » « »
¬ ¼ ¬ ¼
³ ³

- 146 -
The Delta Function · 2.14
- 147 -
when the interval a t b < < does not include 0 t is that for extremely large n values the sine
oscillates rapidly between +1 and í1 while ( )
ab
t t o stays essentially constant for 0 t = , averaging
the integrand to zero. Hence,


sin(2 )
lim ( ) 0 for 0
n
nt
G t t
t
r
o
r
÷·
=

for the same reason that

int
lim 0
n
G e
±
÷·


in Eq. (2.60g). To understand the behavior near 0 t , we construct function
0
( )
a b
t o in which the
interval a t b < < does include 0 t . Now we can write, transforming the variable of integration
to 2 t nt r ´ ,


( ) ( )
0 0
0 0 0
sin(2 ) 1 sin( )
lim ( ) lim
2
1 sin( )
0 lim 0 ( ) ( ) ,
a b a b
n n
a b a b a b
n
nt t t
G t dt dt
t t n
t
dt t t dt
t
r
o o
r r r
o o o o
r
· ·
÷· ÷·
÷· ÷·
· ·
÷·
÷· ÷·
´ ´
ª º ª º § ·
´
¨ ¸
« » « »
´
¬ ¼ ¬ ¼ © ¹
´
ª º
´
« »
´
¬ ¼
³ ³
³ ³



where in the second-to-last step we use

sin( ) t
dt
t
r
·
÷·
´
´
´
³
.

Any arbitrary test function can be written as a function
0
( )
a b
t o whose interval of nonzero values
includes 0 t plus other test functions whose intervals of nonzero values do not include 0 t ;
that is, we can always write
0
( ) ( ) [other functions zero at the origin]
a b
t t o o + . When this ( ) t o is
multiplied by limsin(2 ) ( )
n
G nt t r r
÷·
and integrated over t between í’ and +’, we realize that the
value of the integral is
0
(0) (0)
a b
o o because the other functions that are zero at the origin give
zero contribution to the integral as n ÷·. Consequently,

( )
sin(2 )
lim ( ) 0 ( ) ( )
n
nt
G t dt t t dt
t
r
o o o o
r
· ·
÷·
÷· ÷·
ª º

« »
¬ ¼
³ ³
,

indicating that the generalized limit of the sequence
(see any handbook of definite integrals)
- 147 -
2 · Fourier Theory

- 148 -

sin(2 ) nt
t
r
r


equals the delta function in the only sense that two generalized functions can ever be equal—the
integral of the left-hand side with any test function o is always the same as the integral of the
right-hand side with any test function o [see discussion after Eq. (2.47b)]. Figures 2.7(a)–2.7(c)
and 2.8(a)–2.8(c) plot the behavior of
2
nt
n e r
÷
and
1
( ) sin(2 ) t nt r r
÷
sequences, showing the
two different ways these sequences change into delta functions.
We note that for any odd test function ( )
o
t o

( ) ( ) (0) 0
o o
t t dt o o o
·
÷·

³


because, according to Eq. (2.12a), odd functions are zero at the origin. Therefore, from the
definitions of even and odd generalized functions in Eqs. (2.52d) and (2.52e), we conclude that
the delta function is an even generalized function because its integral with all odd test functions is
always zero. This means we can write [see Eq. (2.52f)]

( ) ( ) t t o o ÷ . (2.68a)

From the behavior of generalized functions specified in Eq. (2.52a), we have


0 0 0
( ) ( ) ( ) ( ) ( ) t t t dt t t t dt t o o o o o
· ·
÷· ÷·
÷ +
³ ³


and, because the delta function equals the zero function for 0 t = , this result can be written as


0 0
0
0 0
0 for or
( ) ( )
( ) for
b
a
a b t t a b
t t t dt
t a t b
o o
o
< < < < ­
÷
®
< <
¯
³


. (2.68b)

From Eq. (2.52b), we have


1 1
( ) ( ) ( ) ( / ) (0) c t t dt t t c dt
c c
o o o o o
· ·
÷· ÷·

³ ³
,

from which we conclude that
- 148 -
The Delta Function · 2.14
- 149 -



















Figures 2.7(a)–2.7(c) show how
2
/
nt
n e r
÷
changes into a delta function of t as n increases.
FIGURE 2.7(a).
FIGURE 2.7(b).
FIGURE 2.7(c).
t
t
t
0
0
0
- 149 -
2 · Fourier Theory

- 150 -












Figures 2.8(a)–2.8(c) show how
-1
(ʌt) sin(2ʌnt) changes into a delta function of t as n increases.
0
0
0
0
0
FIGURE 2.8(A).
FIGURE 2.8(b).
FIGURE 2.8(c).
t
t
t
0
0
0
- 150 -
The Delta Function · 2.14
- 151 -

1
( ) ( ) ct t
c
o o (2.68c)
because

1 1
( ) ( ) (0) ( ) ( ) c t t dt t t dt
c c
o o o o o
· ·
÷· ÷·
ª º

« »
« »
¬ ¼
³ ³


for all test functions o . We note that this last rule, Eq. (2.68c), can also be used to show that the
delta function is even, since (2.68a) is just a special case of (2.68c) with 1 c ÷ .
Equation (2.52c) shows that there is no difficulty handling a general linear transformation of
the delta function’s argument, because for any two real constants a and c, we have


1 1 1
( ) ( ) ( ) (( ) / ) ( )
c c
a t c t dt t t c a dt t t dt
a a a a a
o o o o o o o
· · ·
÷· ÷· ÷·
ª º
§ · § ·
÷ + ÷
« »
¨ ¸ ¨ ¸
© ¹ © ¹ « »
¬ ¼
³ ³ ³


for all test functions o . Consequently,

1
( )
c
a t c t
a a
o o
§ ·
÷ ÷
¨ ¸
© ¹
. (2.68d)

This is the same answer we would get from factoring a out of the delta function argument and
then using (2.68c) to rescale the delta function.
When the delta function is multiplied by a true function v(t), we have


[ ] [ ] [ ]
0 0 0 0 0 0
( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) t t v t t dt t t v t t dt v t t t t v t t dt o o o o o o o
· · ·
÷· ÷· ÷·
÷ ÷ ÷
³ ³ ³


for any test function o , from which we conclude that


0 0 0
( ) ( ) ( ) ( ) v t t t v t t t o o ÷ ÷ . (2.68e)

A useful generalization of (2.68d) is, for continuous true functions u(t),

( )
all
1
( ) ( )
( )
k
k k
u t t t
u t
o o ÷
´
¦
, (2.68f)

where ( ) u t du dt ´ and
1 2
, , t t … are the values of t for which ( ) 0 u t . This formula only makes
sense, of course, when ( ) 0
k
u t ´ = for
1 2
, , t t …. Perhaps the easiest way to see that (2.68f) must be
- 151 -
2 · Fourier Theory

- 152 -
true is to note that the delta function equals the zero function whenever its argument is not zero.
Therefore,

all
( ( )) ( ) ( ( )) ( )
k
k
t
k
t
u t t dt u t t dt
r
r
o o o o
+ ·
÷· ÷
ª º

« »
« »
¬ ¼
¦
³ ³
(2.68g)

with 0 r > taken to be small enough that each interval
k k
t t t r r ÷ < < + only includes one of the
k
t values for which u is zero. Nothing stops us from making r as small as we please—as long as
it does not become zero—and eventually each integral on the right-hand side of (2.68g) can be
written as
( ) ( ) ( ) ( ) ( ) ( ) ( )
k k
k k
t t
k k
t t
u t t dt t t u t t dt
r r
r r
o o o o
+ +
÷ ÷
´ ÷
³ ³
,
where we expand u as

( ) ( ) ( ) ( ) ( ) ( )
k k k k k
u t u t t t u t t t u t ´ ´ e + ÷ ÷

since ( ) 0
k
u t . Next, we use (2.68d) to write

( )
1
( ) ( ) ( )
( )
k k k
k
t t u t t t
u t
o o ´ ÷ ÷
´
,

so that

( )
1
( ) ( ) ( ) ( )
( )
1
( ) ( ) .
( )
k k
k k
t t
k
k
t t
k
k
u t t dt t t t dt
u t
t t t dt
u t
r r
r r
o o o o
o o
+ +
÷ ÷
·
÷·
ª º
÷
« »
´
« »
¬ ¼
ª º
÷
« »
´
« »
¬ ¼
³ ³
³



Substitution of this result back into (2.68g) gives


all all
1 1
( ( )) ( ) ( ) ( ) ( ) ( )
( ) ( )
k k
k k k k
u t t dt t t t dt t t t dt
u t u t
o o o o o o
· · ·
÷· ÷· ÷·
ª º ª º
÷ ÷
« » « »
´ ´
« » « »
¬ ¼ ¬ ¼
¦ ¦
³ ³ ³


for all test functions o . This justifies Eq. (2.68f) according to the definition for the equality of
generalized functions [see Eqs. (2.47a) and (2.47b)].
- 152 -
Derivatives of the Delta Function · 2.15
- 153 -
2.15 Derivatives of the Delta Function
We have already remarked that the set of test functions for ( ) t o contains all functions that are
continuous at the origin. Changing the argument of the delta function changes the set of
appropriate test functions. In Eq. (2.68b), for example, the test functions must be continuous at
0
t t ; in (2.68d) they must be continuous at / t c a ; and in (2.68f) they must be continuous at
all
k
t t . When Eq. (2.53b) is used to define the derivative of a delta function, ( ) t o´ , we have

( ) ( ) ( ) ( ) (0) t t dt t t dt o o o o o
· ·
÷· ÷·
´ ´ ´ ÷ ÷
³ ³
, (2.69a)

which shows that now the first derivative of all the test functions must be continuous at the
origin. If we start out with a test function ( )
ab
t o that must be identically zero for all t a < and for
all t b > , then Eq. (2.69a) becomes

( ) ( ) ( ) ( ) (0) 0
ab ab ab
t t dt t t dt o o o o o
· ·
÷· ÷·
´ ´ ´ ÷ ÷
³ ³


whenever the interval a t b < < does not contain the origin. Hence, we can write

( ) ( ) 0 0 ( )
ab ab
t t dt t dt o o o
· ·
÷· ÷·
´
³ ³


for 0 a b < < or 0 a b < < , showing that ( ) t o´ equals the zero function in the sense of Eq. (2.48f)
for 0 t = . Equation (2.52a) can be used in conjunction with (2.53b) to evaluate ( ) t o´ when it is
shifted from the origin by an amount
0
t ,


0 0 0 0
( ) ( ) ( ) ( ) ( ) ( ) ( ) t t t dt t t t dt t t t dt t o o o o o o o
· · ·
÷· ÷· ÷·
´ ´ ´ ´ ÷ + ÷ + ÷
³ ³ ³
, (2.69b)

where now we require the first derivative of the test functions to be continuous at
0
t t . This
result can be applied to test functions ( )
ab
t o to get


0 0 0 0
( ) ( ) ( ) ( ) ( ) ( ) ( ) 0
ab ab ab ab
t t t dt t t t dt t t t dt t o o o o o o o
· · ·
÷· ÷· ÷·
´ ´ ´ ´ ÷ + ÷ + ÷
³ ³ ³

- 153 -
2 · Fourier 1heory

- 154 -
whenever the interval D W E < < does not contain
0
W W = . ThereIore,


0
( ) ( ) 0 0 ( )
DE DE
W W W GW W GW δ φ φ
∞ ∞
−∞ −∞
′ − = = ⋅
³ ³


whenever 0 D E < < or 0 D E < < , showing that
0
( ) W W δ′ − equals the zero Iunction |in the sense oI
Eq. (2.48I)| Ior
0
W W ≠ . Equations (2.52a) and (2.53b) can be applied any number oI times to get
( ) Q
δ , the Qth derivative oI the delta Iunction, shiIted away Irom the origin by an amount
0
W . We
have

( ) ( 1) (1) ( 2) (2)
0 0 0
( ) ( ) ( ) ( ) ( ) ( )
Q Q Q
W W W GW W W W GW W W W GW δ φ δ φ δ φ
∞ ∞ ∞
− −
−∞ −∞ −∞
− = − + = + =
³ ³ ³
" ,

which eventually becomes

( ) ( )
0
( ) ( )
0 0
( ) ( ) 1 ( ) 1
Q
Q Q
Q Q
Q
W W
G
W W W GW W
GW
φ
δ φ φ

−∞ =
− = − = −
³
. (2.69c)

Again, this latest result can be applied to test Iunctions ( )
DE
W φ to get

( )
( ) ( )
0 0
( ) ( ) 1 ( ) 0
Q
Q Q
DE DE
W W W GW W δ φ φ

−∞
− = − =
³


whenever the interval D W E < < does not contain
0
W W = . Because


( )
0
( ) ( ) 0 0 ( )
Q
DE DE
W W W GW W GW δ φ φ
∞ ∞
−∞ −∞
− = = ⋅
³ ³


whenever
0
W W = lies outside this interval, we end up with |using the deIinition oI equality in
(2.48I)|

( )
0 0
( ) 0 Ior
Q
W W W W δ − = ≠ . (2.69d)

The test Iunctions integrated with
( )
0
( )
Q
W W δ − must, oI course, have their Qth derivatives
continuous at
0
W W = .
0
W
0
W
- 154 -
Derivatives of the Delta Function · 2.15
- 155 -
We define the function ( ) t E to be


1 for 0
( ) 1 2 for 0
0 for 0
t
t t
t
> ­
°
E
®
°
<
¯



. (2.70a)

Function E is often called the Heaviside step function. If we take


(1)
( ) ( )
d
t t
dt
E E (2.70b)

to be the first derivative of the E function, then
(1)
( ) 0 t E for all 0 t = . To evaluate
(1)
( ) t E at
the origin, we decide to turn ( ) t E and
(1)
( ) t E into generalized functions that we call “ ( ) t E ” and

(1)
( ) t E ” respectively. We define

" ( )" ( ) ( ) ( ) t t dt t t dt o o
· ·
÷· ÷·
E E
³ ³


for all test functions o , which means that, according to Eqs. (2.48b) and (2.48c),

" ( )" ( ) t t E E . (2.70c)

Having established the generalized function “ ( ) t E ”, we know from Eq. (2.53b) that the
generalized function “
(1)
( ) t E ” must satisfy


(1)
" ( )" ( ) " ( )" ( ) t t dt t t dt o o
· ·
÷· ÷·
´ E ÷ E
³ ³
. (2.70d)

A formal integration by parts of the left-hand side gives


[ ]
(1)
" ( )" ( ) " ( )" ( ) " ( )" ( ) t t dt t t t t dt o o o
· ·
·
÷·
÷· ÷·
´ E E ÷ E
³ ³
.

This becomes, using (2.70c) to remove the double quotes,
- 155 -
2 · Fourier Theory

- 156 -

(1)
0
" ( )" ( ) lim ( ) ( ) ( )
lim ( ) (0) lim ( )
(0) ( ) ( ) .
t
t t
t t dt t t t dt
t t
t t dt
o o o
o o o
o o o
· ·
÷·
÷·
÷· ÷·
·
÷·
ª º
´ E ÷ E
¬ ¼
ª º ª º
+ ÷
¬ ¼ ¬ ¼

³ ³
³




Hence, for all test functions o continuous at the origin (note that they do not have to approach
zero at ’), we have

(1)
" ( )" ( ) ( ) ( ) t t dt t t dt o o o
· ·
÷· ÷·
E
³ ³
,
so

(1)
" ( )" " ( )" ( )
d
t t t
dt
o E E (2.70e)

in the sense of Eq. (2.47b). There is nothing unique about the Heaviside step function. We can
also show, using the generalized function "sgn( )" t introduced in Eqs. (2.60a) and (2.60b) above,
that for any test function o

(1)
1
"sgn ( )" ( ) ( ) ( )
2
t t dt t t dt o o o
· ·
÷· ÷·

³ ³
, (2.70f)

where
(1)
"sgn ( )" t is the first derivative of "sgn( )" t . To show this is true, we do a formal
integration by parts,


[ ]
(1)
1 1 1
"sgn ( )" ( ) "sgn( )" ( ) "sgn( )" ( )
2 2 2
t t dt t t t t dt o o o
· ·
·
÷·
÷· ÷·
´ ÷
³ ³
.

This becomes, using Eqs. (2.60b) and (2.42c),


0
(1)
0
1 1 1 1 1
"sgn ( )" ( ) lim ( ) lim ( ) ( ) ( )
2 2 2 2 2
1 1 1 1 1 1
lim ( ) lim ( ) lim ( ) (0) (0) lim ( )
2 2 2 2 2 2
t t
t t t t
t t dt t t t dt t dt
t t t t
o o o o o
o o o o o o
· ·
÷· ÷÷·
÷· ÷·
÷· ÷÷· ÷÷· ÷·
ª º ª º
´ ´ + + ÷
¬ ¼ ¬ ¼
ª º ª º ª º ª º
+ ÷ + + ÷
¬ ¼ ¬ ¼ ¬ ¼ ¬ ¼
³ ³ ³

(0) ( ) ( ) . t t dt o o o
·
÷·

³

This shows Eq. (2.70f) is true. Again, we get a formula
- 156 -
Derivatives of the Delta Function · 2.15
- 157 -


(1)
1
"sgn ( )" ( )
2
t t o (2.70g)

in the sense of Eq. (2.47b), where the only major restriction on the test functions is that they be
continuous at the origin.
2.16 Fourier Transform of the Delta Function
To find the Fourier transform of the delta function, we construct two sequences of functions
having the relationship specified in (2.59a)–(2.59g) above. It is easiest to start with the delta-
function sequence in Eq. (2.67c). Any standard table of Fourier transforms gives
23



2 ( )
sin(2 ) sin(2 )
( , )
ift ift
nt nt
e dt f n
t t
r
r r
r r
·
÷ ÷
÷·
§ ·
H
¨ ¸
© ¹
³
F
and
( )
2 ( )
sin(2 )
( , ) ( , )
ift ift
nt
e f n df f n
t
r
r
r
·
÷·
H H
³
F
so that

sin(2 )
( , )
nt
f n
t
r
r
÷ H . (2.71a)

Although Eq. (2.71a) holds true for all real n, it is here used only for integer values of n. We
know from (2.67c) that the generalized limit as n ÷· of the left-hand side of (2.71a) is ( ) t o ,
but what is the corresponding generalized limit of the right-hand side? We have

( ) lim ( , ) lim ( , ) ( ) lim ( ) 1 ( )
n
n n n
n
f df G f n f n f df f df f df o o o o
· · ·
÷· ÷· ÷·
÷· ÷· ÷ ÷·
ª º
H H
¬ ¼
³ ³ ³ ³


for any test function o . This shows that

lim ( , ) 1
n
G f n
÷·
H ,

which is no surprise. Therefore, taking the generalized limit as n ÷· of both sides of (2.71a)


23
Jack D. Gaskill, Linear Systems, Fourier Transforms, and Optics (John Wiley & Sons, New York, 1978), p. 201,
with the sinc, rect function pair corresponding to formula (2.71a) above.
- 157 -
2 · Fourier Theory

- 158 -
gives
( ) 1 t o ÷ , (2.71b)
or, restating this result,

2
( ) 1
ift
t e dt
r
o
·
÷
÷·

³
(2.71c)
and

2
( )
ift
e df t
r
o
·
+
÷·

³
. (2.71d)

Equation (2.71c) is just what we expect from Eq. (2.64b), since


2 0
1
if
e
r ÷
;

but Eq. (2.71d) is true only in the sense of Eq. (2.47b), and it is only safe to substitute freely from
(2.71d) when the substitution takes place inside an integral.
Because the sine is an odd function of its argument, we have according to Eq. (2.17), and
assuming the integral is a Cauchy principle value, that

sin(2 ) 0 ft df r
·
÷·

³
.
Therefore, Eq. (2.71d) becomes


[ ]
0
cos(2 ) sin(2 ) 2 cos(2 ) ( ) ft i ft df ft df t r r r o
· ·
÷·
+
³ ³
.

Since the integral over the sine always disappears, we can also write


[ ]
2
( ) cos(2 ) sin(2 )
ift
t ft i ft df e df
r
o r r
· ·
±
÷· ÷·
±
³ ³
.

Hence, two additional formulas for the delta function are


0
2 cos(2 ) ( ) ft df t r o
·

³
(2.71e)
, using Eq. (2.19) and that the cosine is even,
- 158 -
Fourier Transform of the Delta Function · 2.16

- 159 -
and

2
( )
ift
e df t
r
o
·
±
÷·

³
. (2.71f)

As was the case for Eq. (2.71d), these formulas are meant to be used inside integrals.
2.17 Fourier Convolution Theorem with Generalized Functions
Now that we have defined what is meant by the Fourier transform of a generalized function, it is
surprisingly easy to show that the Fourier convolution theorem holds for the product of a
generalized function and a true function.
We start with two sequences of true functions, one of them labeled with a superscript minus
sign for reasons that will become shortly become apparent, called


1 2
( ), ( ), , ( ),
n
v t v t v t … … and
( ) ( ) ( )
1 2
( ), ( ), , ( ),
n
V f V f V f
÷ ÷ ÷
… ….

If these two sequences obey the relationship


( )
1 1
( )
2 2
( )
( ) ( )
( ) ( )
( ) ( )
n n
v t V f
v t V f
v t V f
÷
÷
÷
÷
÷
÷
#
#


,
we know from Eq. (2.59g) that the generalized functions ( )
G
v t and
( )
( )
G
V f
÷
specified by

( ) lim ( )
G n
n
v t G v t
÷·
(2.72a)
and

( ) ( )
( ) lim ( )
G n
n
V f G V f
÷ ÷
÷·
(2.72b)
form a Fourier transform pair,

( )
( ) ( )
G G
v t V f
÷
÷ . (2.72c)

We also suppose that there exists a third sequence of true functions labeled with a superscript
plus sign,

( ) ( ) ( )
1 2
( ), ( ), , ( ),
n
V t V t V t
+ + +
… …,
such that
- 159 -
2 · Fourier Theory

- 160 -

( )
1 1
( )
2 2
( )
( ) ( )
( ) ( )
( ) ( )
n n
V t v f
V t v f
V t v f
+
+
+
÷
÷
÷
#
#


.

If this third sequence has a generalized function as its generalized limit,


( ) ( )
( ) lim ( )
G n
n
V t G V t
+ +
÷·
, (2.72d)

then the generalized functions
( )
( )
G
V t
+
and ( )
G
v f are also a Fourier transform pair,


( )
( ) ( )
G G
V t v f
+
÷ . (2.72e)

Definitions (2.72b) and (2.72d) taken together show that


( ) ( )
( ) lim ( )
G n
n
V f G V f
± ±
÷·
, (2.72f)

where we have replaced t by ƒ in (2.72d); and Eqs. (2.72c) and (2.72e) taken together give

( )
( ) ( )
( ) ( )
ift
G G
V f v t
± ±
F , (2.72g)

where we have interchanged the roles of t and ƒ in Eq. (2.72e).
From the Fourier convolution theorem for true functions [see Eq. (2.39j)], it follows that for
any true function u(t)

( ) ( ) ( )
( ) ( ) ( )
( ) ( ) ( ) ( )
ift ift ift
n n
u t v t u t v t
´ ´´ ± ± ±
´ ´´ · F F F
or

2 ( ) ( )
( ) ( ) ( ) ( )
ift
n n
e u t v t dt U f V f f df
r
· ·
± ± ±
÷· ÷·
´ ´ ´ ÷
³ ³
,
where

( ) 2
( ) ( )
ift
U f e u t dt
r
·
± ±
÷·

³
and
( ) 2
( ) ( )
ift
n n
V f e v t dt
r
·
± ±
÷·

³
.

The integral formula for
( )
( )
n
V f
±
just restates the definitions given to
( )
n
V
+
and
( )
n
V
÷
on the two
previous pages. Taking the limit of both sides as n ÷· gives
- 160 -
Fourier Convolution Theorem with Generalized Functions · 2.17

- 161 -


2 ( ) ( )
lim ( ) ( ) lim ( ) ( )
ift
n n
n n
e u t v t dt U f V f f df
r
· ·
± ± ±
÷· ÷·
÷· ÷·
´ ´ ´ ÷
³ ³


or, moving the limiting process inside the integral so that it becomes a generalized limit [see
discussion after Eq. (2.56a)],


2 ( ) ( )
( ) lim ( ) ( ) lim ( )
ift
n n
n n
e u t G v t dt U f G V f f df
r
· ·
± ± ±
÷· ÷·
÷· ÷·
´ ´ ´ ÷
³ ³
.

From the definitions of ( )
G
v t and
( )
( )
G
V f
±
[see Eqs. (2.72a) and (2.72f)], we get


2 ( ) ( )
( ) ( ) ( ) ( )
ift
G G
e u t v t dt U f V f f df
r
· ·
± ± ±
÷· ÷·
´ ´ ´ ÷
³ ³
,

which becomes

2 ( ) ( )
( ) ( ) ( ) ( )
ift
G G
e u t v t dt U f V f
r
·
± ± ±
÷·
·
³
(2.72h)

or, substituting from Eq. (2.72g),

( ) ( ) ( )
( ) ( ) ( )
( ) ( ) ( ) ( )
ift ift ift
G G
u t v t u t v t
´ ´´ ± ± ±
´ ´´ · F F F . (2.72i)

Consulting Eq. (2.55b) above, we note that convolution with a generalized function is
commutative, just like the convolution of two standard functions, so Eqs. (2.72h) and (2.72i) can
also be written as

2 ( ) ( )
( ) ( ) ( ) ( )
ift
G G
e u t v t dt V f U f
r
·
± ± ±
÷·
·
³
(2.72j)
and
( ) ( ) ( )
( ) ( ) ( )
( ) ( ) ( ) ( )
ift ift ift
G G
u t v t v t u t
´´ ´ ± ± ±
´´ ´ · F F F . (2.72k)

This establishes the generalized-function counterpart to Eq. (2.39j) whenever
2
( )
ift
e u t
r ±
and
( )
( ) U f
±
qualify as acceptable test functions. Since almost all well-behaved, continuous functions
are acceptable test functions when used with linear combinations of delta functions or the
derivatives of delta functions, Eqs. (2.72h) and (2.72i) are valid whenever ( )
G
v t is a linear
combination of delta functions or the derivatives of delta functions.
- 161 -
2 · Fourier Theory

- 162 -
Establishing the Fourier convolution theorem in the other direction is even easier. We just
write, making the variable substitution t t t ´´ ´ ÷ and remembering that the convolutions are
commutative,


2 2
2
[ ( ) ( )] ( ) lim ( )
lim ( ) ( )
lim ( ) (
ift ift
G n
n
ift
n
n
n
n
e u t v t dt dt e dt u t t G v t
dt G v t dt u t t e
dt v t dt u t
r r
r
· · ·
± ±
÷·
÷· ÷· ÷·
· ·
±
÷·
÷· ÷·
÷·
´ ´ ´ · ÷
´ ´ ´ ÷
´ ´ ´ ÷
³ ³ ³
³ ³


2
2 2
2 2
)
lim ( ) ( )
[ lim ( ) ] [ ( ) ] .
ift
ift ift
n
n
ift ift
n
n
t e
dt v t e dt u t e
e G v t dt u t e dt
r
r r
r r
· ·
±
÷· ÷·
· ·
´ ´´ ± ±
÷·
÷· ÷·
· ·
´ ´´ ± ±
÷·
÷· ÷·
´ ´ ´´ ´´
´ ´ ´´ ´´
³ ³
³ ³
³ ³




We conclude that
( ) ( ) ( )
( ) ( ) ( )
( ) ( ) ( ) ( )
ift ift ift
G G
u t v t u t v t
´ ´´ ± ± ±
´ ´´ · F F F , (2.72A )

showing that Eq. (2.39a) holds true for the convolution of a true function and a generalized
function as well as for the convolution of two true functions.
2.18 The Shah Function
The shah function, often written as 1 I I , can be defined as the generalized limit


1
sin 2
2 1
( , ) lim
sin
n
t
n
T
t T G
t T
T
r
r
÷·
§ ·
§ ·
+
¨ ¸ ¨ ¸
© ¹
© ¹
1
§ ·
¨ ¸
© ¹
I I . (2.73)
For any test function ( ) t o , we have


( ) ( )
( )
( ) ( )
( )
1 1
1 1
sin 2 (1 2) sin 2 (1 2)
( ) lim lim ( )
sin sin
n n
tT n tT n
t G dt t dt
tT tT
r r
o o
r r
÷ ÷
· ·
÷ ÷
÷· ÷·
÷· ÷·
ª º ­ ½
+ +
° °
« »
® ¾
« »
° °
¬ ¼ ¯ ¿
³ ³
(2.74a)
- 162 -
The Shah Function · 2.18

- 163 -
As n gets large in (2.74a), the term in braces { } oscillates ever more rapidly between +1 and í1,
causing the more slowly varying function o to make only a negligible contribution to the
integral. The only place this might not hold true is at the isolated t values

0, , 2 , t T T ± ± … . (2.74b)

It is easy to see why these isolated values are different. Suppose t differs from one of these
isolated values by only a small amount ¨t so that

for 0,1, 2, t t mT m A ± … . (2.74c)

Then the term in braces becomes


( ) ( )
( )
( ) ( )
( ) ( )
1 1
1
1
1
1
sin 2 ( ) (1 2) sin 2 (1 2) 2
sin( )
sin ( )
sin 2 (1 2)
.
sin( )
t mT T n tT n nm m
tT m
t mT T
tT n
tT
r r r r
r r
r
r
r
÷ ÷
÷
÷
÷
÷
A ± + A + ± ±

A ±
A ±
A +

A



To explain the last step, we note that the sine does not change when a ±nm number of 2ʌ’s is
added to its argument, and adding a ±m number of ʌ’s to the sine’s argument either leaves the
sine unchanged (if m is even) or multiplies it by í1 (if m is odd). Since the sine values in both the
numerator and denominator have the same number of ʌ’s added to their arguments, we do not
care if m is odd because the factor of í1 cancels, leaving the sine ratio unchanged. As ¨t is taken
to be ever smaller in magnitude for a fixed value of n, there comes a time when the arguments of
both sines are small in magnitude, allowing each sine to be approximated by its argument. We
then have

( ) ( )
( )
( ) ( )
( )
( )
( )
1 1
1 1
1
1
sin 2 ( ) (1 2) sin 2 (1 2)
sin ( ) sin
2 (1 2)
2 (1 2) .
t mT T n tT n
t mT T tT
tT n
n
tT
r r
r r
r
r
÷ ÷
÷ ÷
÷
÷
A ± + A +

A ± A
A +
e +
A



Consequently, the peak values of the term in braces get ever larger at the isolated points in
(2.74b) as n increases, as shown in Figs. 2.9(a)–2.9(c). We see that the triangular peaks at the
isolated points in (2.74b) have widths equal to ( (1 2)) T n + . As n gets ever larger, the term in
braces oscillates so rapidly between +1 and í1 compared to the test function o that there is no
contribution made to the integral on the right-hand side of (2.74a) except at the isolated t values
shown in Figs. 2.9(a)–2.9(c). At these t values, we have
- 163 -
2 · Fourier Theory

- 164 -


( ) ( )
( )
¦ ¦
¦ ¦
1
1
sin 2 (1 2)
lim ( ) ( ) area of triangular peak
sin
(0) area of triangular peak
n
tT n
t dt T
tT
r
o o
r
o
÷
·
÷
÷·
÷·
­ ½
+
° °
+ ÷
® ¾
° °
¯ ¿
+
³
"

¦ ¦
( )¦ ¦
( ) area of triangular peak
1
2 (1 2) ( ) (0) ( ) ,
2 (1 2)
T
T
n T T
n
o
o o o
+ +
+ + ÷ + + +
+
"
" "



which simplifies to


( ) ( )
( )
1
1
sin 2 (1 2)
lim ( ) ( )
sin
k
n
k
tT n
t dt T kT
tT
r
o o
r
÷
·
·
÷
÷·
÷·
÷·
­ ½
+
° °

® ¾
° °
¯ ¿
¦
³
. (2.75a)

But ( )
k
k
T kT o
·
÷·
¦
can be thought of as what we get when evaluating the integral

( ) ( ) ( ) ( ) ( )
k k k
k k k
t T t kT dt T t kT t dt T kT o o o o o
· ·
· · ·
÷· ÷· ÷·
÷· ÷·
ª º
÷ ÷
« »
¬ ¼
¦ ¦ ¦
³ ³
.

This lets us write (2.75a) as


( ) ( )
( )
1
1
sin 2 (1 2)
lim ( ) ( ) ( )
sin
k
n
k
tT n
t dt t T t kT dt
tT
r
o o o
r
÷
· ·
·
÷
÷·
÷·
÷· ÷·
­ ½
+
ª º ° °
÷
® ¾
« »
¬ ¼
° °
¯ ¿
¦
³ ³
(2.75b)

or, using (2.56a) to take the limit inside the integral as a generalized limit,


( ) ( )
( )
1
1
sin 2 (1 2)
( ) lim ( ) ( )
sin
k
n
k
tT n
t G dt t T t kT dt
tT
r
o o o
r
÷
· ·
·
÷
÷·
÷·
÷· ÷·
­ ½
+
ª º ° °
÷
® ¾
« »
¬ ¼
° °
¯ ¿
¦
³ ³
.

Since this last result is true for any test function o , we conclude that

can be regarded as what we get when evaluating the integral
- 164 -
The Shah Function · 2.18

- 165 -

( ) ( )
( )
1
1
sin 2 (1 2)
lim ( )
sin
k
n
k
tT n
G T t kT
tT
r
o
r
÷
·
÷
÷·
÷·
­ ½
+
° °
÷
® ¾
° °
¯ ¿
¦
(2.75c)

in the sense of Eq. (2.47b). Comparison of this result to the definition of the shah function in Eq.
(2.73) above shows that
( , ) ( )
k
t T t kT o
·
÷·
1 ÷
¦
I I . (2.75d)

We note that variable t can be replaced by ƒ in Eq. (2.75c) to get


( ) ( )
( )
1
1
sin 2 (1 2)
lim ( )
sin
k
n
k
fT n
G T f kT
fT
r
o
r
÷
·
÷
÷·
÷·
­ ½
+
° °
÷
® ¾
° °
¯ ¿
¦
.

Parameter T is arbitrary throughout this derivation, so nothing stops us from replacing it by
1
T
÷

everywhere to get

( ) ( )
( )
sin 2 (1 2)
1
lim
sin
k
n
k
fT n
k
G f
fT T T
r
o
r
·
÷·
÷·
­ ½
+
° ° § ·
÷
® ¾
¨ ¸
© ¹
° °
¯ ¿
¦
. (2.75e)

This is another useful version of the formula in Eq. (2.75d).
2.19 Fourier Transform of the Shah Function
To get the Fourier transform of the shah function, we construct the sequence of true functions
1 2
( , ), ( , ), , ( , ),
n
G t T G t T G t T … … such that
( , ) ( )
n
n n
k n
G t T g t kT
÷
÷
¦
, (2.76a)
where

sin(2 ( 1) )
( )
n
n t
g t
t
r
r
+
. (2.76b)

From Eq. (2.67c), we have

1
sin(2 )
lim ( ) lim ( )
n
n n
nt
G g t G t
t
r
o
r
÷
÷· ÷·
.


- 165 -
2 · Fourier Theory

- 166 -




The formula for the t interval between the arrows is /( 1/ 2) T n + in all three plots. Figures 2.9(a), 2.9(b),
and 2.9(c) show how the base width of the central lobe becomes ever narrower as n increases.

FIGURE 2.9(a).
FIGURE 2.9(b).
FIGURE 2.9(c).
- 166 -
Fourier Transform of the Shah Function · 2.19

- 167 -
Since adding one to n does not make any difference in the limit, we end up with

lim ( ) ( )
n
n
G g t t o
÷·
; (2.76c)

and from (2.71a) we get, again adding one to n,


( ) sin 2 ( 1)
( , 1) for 1, 2,
n t
f n n
t
r
r
+
÷ H + … . (2.76d)

To find the generalized function that is the forward Fourier transform of the generalized limit of
n
G as n ÷·, we must evaluate the forward Fourier transform of
n
G for finite n,


( )
( ) 2 2
2 2
( ) ( ) ( )
( )
n
ift ift ift
n n n
k n
n
ifkT ift
n
k n
G t e G t dt e g t kT dt
e e g t dt
r r
r r
· ·
÷ ÷ ÷
÷
÷· ÷·
·
´ ÷ ÷
÷
÷·
÷
´ ´
¦
³ ³
¦
³
,
F


where in the last step the variable of integration has been changed to t t kT ´ ÷ . The Fourier
transform inside the sum can be done using (2.76b) and (2.76d) to get

( )
( ) 2
( ) ( , 1)
n
ift ifkT
n
k n
G t f n e
r ÷ ÷
÷
H +
¦
F . (2.77a)

The sum
2
n
ifkT
k n
e
r ÷
÷
¦
is just a disguised form of geometric series. We can write

2
n n
ifkT k
k n k n
e w
r ÷
÷ ÷

¦ ¦
, (2.77b)

where

2 ifT
w e
r ÷


and define

2
n n
k ifkT
n
k n k n
S w e
r ÷
÷ ÷

¦ ¦
.

- 167 -
2 · Fourier Theory

- 168 -
Using the standard approach for calculating the sum of a geometric series, we note that
multiplying every term in the sum by w increases each power of w in the sum by one. This is the
same as adding
1 n
w
+
and subtracting
n
w
÷
from the original sum, giving


1
1
1
n
k n n
n n
k n
wS w S w w
+
+ ÷
÷ +
+ ÷
¦

or

1
1
n n
n
w w
S
w
+ ÷
÷

÷
.

Hence, (2.77b) becomes


( ) ( ) ( ) ( )
( ) ( ) ( )
2 1 2 2 1 2
2 ( 1) 2 ( )
2
2
1
sin 2 1 2
,
sin( )
ifT n ifT n
ifT n ifT n n
ifkT
ifT ifT ifT
k n
e e e e
e
e e e
fT n
fT
r r
r r
r
r r r
r
r
÷ + +
÷ +
÷
÷ ÷
÷
÷ ÷

÷ ÷
+

¦

(2.77c)

which means Eq. (2.77a) can be written as


( ) ( ) ( )
( )
sin 2 1 2
( ( )) ( , 1)
sin( )
ift
n
fT n
G t f n
fT
r
r
÷
+
H + F . (2.77d)

The inverse Fourier transform of the forward Fourier transform returns the original function [see
Eqs. (2.29b) and (2.29d)], so this last result lets us write


( ) ( ) ( )
sin 2 1 2
( ) ( , 1)
sin( )
n
fT n
G t f n
fT
r
r
+
÷ H + . (2.77e)

From the definition of the Fourier transform of a generalized function [see (2.59g)], we know that
taking the generalized limit of both sides of (2.77e) gives a Fourier transform relationship
between two generalized functions—all that needs to be done now is to find out what these
generalized functions are.
To find the generalized function that is the generalized limit of
n
G as n ÷·, we write for
any test function o , using Eq. (2.76a), that
- 168 -
Fourier Transform of the Shah Function · 2.19

- 169 -

[ ]
( ) lim ( ) lim ( ) ( ) lim ( ) ( )
lim ( ) ( )
n
n n n
n n n
k n
n
n
n
k n
t G G t dt t G t dt t g t kT dt
t g t kT dt
o o o
o
· · ·
÷· ÷· ÷·
÷
÷· ÷· ÷·
·
÷·
÷
÷·
ª º
ª º
÷
« »
¬ ¼
¬ ¼
÷
¦
³ ³ ³
¦
³
.
(2.77f)

Equation (2.76c) states that the generalized limit of
n
g is the delta function, so

lim ( ) ( ) ( ) lim ( ) ( ) ( ) ( )
n n
n n
t g t kT dt t G g t kT dt t t kT dt kT o o o o o
· · ·
÷· ÷·
÷· ÷· ÷·
÷ ÷ ÷
³ ³ ³
,

which means that
lim ( ) ( ) ( )
n
n
n
k n k
t g t kT dt kT o o
·
·
÷·
÷ ÷·
÷·
÷
¦ ¦
³
.

Hence, Eq. (2.77f) can be written as

( ) lim ( ) ( )
n
n
k
t G G t dt kT o o
·
·
÷·
÷·
÷·
ª º

¬ ¼
¦
³
. (2.77g)

But, just as in the discussion following Eq. (2.75a) above, we can regard

( )
k
kT o
·
÷·
¦


as the result of integrating the shah generalized function

( , ) ( )
k
t T t kT o
·
÷·
1 ÷
¦
I I
with any test function o , since

( , ) ( ) ( ) ( ) ( )
k k
t T t dt t kT t dt kT o o o o
· ·
· ·
÷· ÷·
÷· ÷·
ª º
1 ÷
« »
¬ ¼
¦ ¦
³ ³
I I .

Therefore, (2.77g) can be written as
- 169 -
2 · Fourier Theory

- 170 -
( ) lim ( ) ( ) ( )
n
n
k
dt t G G t t kT t dt o o o
· ·
·
÷·
÷·
÷· ÷·
ª º
ª º
÷
« »
¬ ¼
¬ ¼
¦
³ ³
(2.77h)


for any test function o , showing that

lim ( ) ( ) ( , )
n
n
k
G G t t kT t T o
·
÷·
÷·
÷ 1
¦
I I (2.77i)


in the sense of Eq. (2.47b).
The generalized function that is the generalized limit of the right-hand side of (2.77e) is
multiplied by an arbitrary test function ( ) f o and integrated over all ƒ to get



( ) ( ) ( )
( ) ( ) ( )
1
( 1)
sin 2 1 2
( ) lim ( , 1)
sin( )
sin 2 1 2
lim ( )
sin( )
lim
n
n
n
n
n
fT n
f G f n df
fT
fT n
f df
fT
r
o
r
r
o
r
·
÷·
÷·
+
÷·
÷ +
÷
­ ½
ª º
+
° °
« »
H +
® ¾
« »
° °
¬ ¼
¯ ¿
ª º
+
« »

« »
¬ ¼

³
³


( ) ( ) ( )
sin 2 1 2
( ) ,
sin( )
fT n
f df
fT
r
o
r
·
·
÷·
ª º
+
« »
« »
¬ ¼
³
(2.78a)


where in the last step we recognize that the behavior of the sine ratio inside the square brackets
[ ] is not affected by the endpoints for the region of integration as n ÷·. Equations (2.56a) and
(2.75e) show that


( ) ( )
( )
1 1
sin 2 (1 2)
lim ( ) ( ) ( )
sin
k
n
k
fT n
f df f T f kT df
fT
r
o o o
r
· ·
·
÷ ÷
÷·
÷·
÷· ÷·
­ ½
+
ª º ° °
÷
® ¾
« »
¬ ¼
° °
¯ ¿
¦
³ ³
,


which means that (2.78a) simplifies to
- 170 -
Fourier Transform of the Shah Function · 2.19

- 171 -

( ) ( ) ( )
1 1
sin 2 1 2
( ) lim ( , 1)
sin( )
( ) ( )
n
k
k
fT n
f G f n df
fT
f T f kT df
r
o
r
o o
·
÷·
÷·
·
·
÷ ÷
÷·
÷·
­ ½
ª º
+
° °
« »
H +
® ¾
« »
° °
¬ ¼
¯ ¿
ª º
÷
« »
¬ ¼
³
¦
³


for any test function ( ) f o . Therefore,


( ) ( ) ( )
sin 2 1 2
1
lim ( , 1)
sin( )
k
n
k
fT n
k
G f n f
fT T T
r
o
r
·
÷·
÷·
ª º
+
§ ·
« »
H + ÷
¨ ¸
« » © ¹
¬ ¼
¦
(2.78b)

in the sense of Eq. (2.47b). Since the right-hand side of (2.78b) is, according to (2.75d),
proportional to the shah function, we end up with


1
1 1
( , )
k
k
f f T
T T T
o
·
÷
÷·
§ ·
÷ 1
¨ ¸
© ¹
¦
I I . (2.78c)

Equations (2.78b) and (2.77i) let us take the generalized limits as n ÷· of both sides (2.77e) to
get

1
( )
k k
k
t kT f
T T
o o
· ·
÷· ÷·
§ ·
÷ ÷ ÷
¨ ¸
© ¹
¦ ¦
. (2.78d)

According to Eq. (2.75d), this can also be written as


1
1
( , ) ( , ) t T f T
T
÷
1 ÷ 1 I I I I . (2.78e)

These last two results can be transformed directly to show explicitly that both the forward and
inverse Fourier transform of the shah function produce another shah function. We first write
(2.78d) as the forward and inverse Fourier transforms,


2
1
( )
ift
k j
j
e t kT dt f
T T
r
o o
·
· ·
÷
÷· ÷·
÷·
ª º § ·
÷ ÷
¨ ¸ « »
© ¹ ¬ ¼
¦ ¦
³
(2.79a)
and

2
1
( )
ift
j k
j
e f df t kT
T T
r
o o
·
· ·
÷· ÷·
÷·
ª º
§ ·
÷ ÷
« » ¨ ¸
© ¹
¬ ¼
¦ ¦
³
. (2.79b)
These last two results can be modified to generalize how both the forward and inverse Fourier
transform of the shah function produce another shah function. We first write (2.78d) as the forward
and inverse Fourier transforms,
- 171 -
2 · Fourier Theory

- 172 -
The discussion following Eq. (2.52c) above shows that linear transformations of the variables of
integration are allowed when using generalized functions, so we can change to t t ´ ÷ in Eqs.
(2.79a) and (2.79b) to get


2
1
( )
ift
k j
j
e t kT dt f
T T
r
o o
·
· ·
´
÷· ÷·
÷·
ª º § ·
´ ´ ÷ ÷ ÷
¨ ¸ « »
© ¹ ¬ ¼
¦ ¦
³

and

2
1
( )
ift
j k
j
e f df t kT
T T
r
o o
·
· ·
´ ÷
÷· ÷·
÷·
ª º
§ ·
´ ÷ ÷ ÷
« » ¨ ¸
© ¹
¬ ¼
¦ ¦
³
.

The sum over index k goes over all positive and negative integers, so we can change the sum’s
index to k k ´ ÷ and use that the delta function is even [see Eq. (2.68a)] to get


2
1
( )
ift
k j
j
e t k T dt f
T T
r
o o
·
· ·
´
´÷· ÷·
÷·
ª º § ·
´ ´ ´ ÷ ÷
¨ ¸ « »
© ¹ ¬ ¼
¦ ¦
³

and

2
1
( )
ift
j k
j
e f df t k T
T T
r
o o
·
· ·
´ ÷
´ ÷· ÷·
÷·
ª º
§ ·
´ ´ ÷ ÷
« » ¨ ¸
© ¹
¬ ¼
¦ ¦
³
.

Dropping the primes and combining these results with Eqs. (2.79a) and (2.79b) produces the
more general formulas


2
1
( )
ift
k j
j
e t kT dt f
T T
r
o o
·
· ·
±
÷· ÷·
÷·
ª º § ·
÷ ÷
¨ ¸ « »
© ¹ ¬ ¼
¦ ¦
³
(2.79c)
and

2
1
( )
ift
j k
j
e f df t kT
T T
r
o o
·
· ·
±
÷· ÷·
÷·
ª º
§ ·
÷ ÷
« » ¨ ¸
© ¹
¬ ¼
¦ ¦
³
. (2.79d)

In fact, we can easily show that Eqs. (2.79c) and (2.79d) are really the same formula. First, we
interchange the j, k indices and the ƒ, t variables in Eq. (2.79c) so that it becomes


2
1
( )
ift
j k
k
e f jT df t
T T
r
o o
·
· ·
±
÷· ÷·
÷·
ª º
§ ·
÷ ÷
« » ¨ ¸
© ¹
¬ ¼
¦ ¦
³
.

Parameter T is arbitrary, so—just like in the analysis following Eq. (2.75d) above—it can be
replaced everywhere by
1
T
÷
to get
- 172 -
Fourier Transform of the Shah Function · 2.19

- 173 -


2 ift
j k
j k
e f df T t
T T
r
o o
·
· ·
±
÷· ÷·
÷·
ª º
§ · § ·
÷ ÷
« » ¨ ¸ ¨ ¸
© ¹ © ¹
¬ ¼
¦ ¦
³
.

After dividing through by T, we see that this last result is the same as Eq. (2.79d), showing that
Eqs. (2.79c) and (2.79d) are really the same formula.
2.20 Fourier Series
Integral Fourier transforms are connected in a direct and straightforward way to both the Fourier
series and the discrete Fourier transform. This section shows the connection to the Fourier series
and the next section shows the connection to the discrete Fourier transform.
24

We begin with an arbitrary, nonpathological function u(t) that has a well-defined Fourier
integral transform. Function u can be complex-valued but its argument t must be real, and U(ƒ) is
the forward Fourier transform of u(t), so

( )
( ) 2
( ) ( ) ( )
ift ift
U f u t u t e dt
r
·
÷ ÷
÷·

³
F (2.80a)
and
( ) ( ) u t U f ÷ . (2.80b)

From u(t), we create a new function
[ ]
( , ) u t T
·
that repeats forever along the t axis at intervals of
T,

[ ]
( , ) ( )
k
u t T u t kT
·
·
÷·
÷
¦
. (2.81a)

Although perhaps redundant, it turns out that listing T as one of the arguments of
[ ]
u
·
is a
convenient way to keep track of the connection between u and
[ ]
u
·
. Function
[ ]
u
·
is called a
periodic function of period T because, for any finite positive or negative integer m,


[ ] [ ]
( , ) ( , ) u t mT T u t T
· ·
+ . (2.81b)

Figures 2.10(a) and 2.10(b) show the plots for both u and
[ ]
u
·
as functions of t. Since function u
is left unspecified,
[ ]
u
·
can be thought of as representing an arbitrary periodic function. We can


24
The analysis in Secs. 2.20 and 2.21 is adapted from A. Papoulis, Signal Analysis (McGraw-Hill Book Company,
New York, 1977), pp. 76–81.
kT
- 173 -
2 · Fourier Theory

- 174 -
also define a function
[ ]
( , )
N
u t T by the formula


[ ]
( , ) ( )
N
N
k N
u t T u t kT
÷
÷
¦
. (2.81c)
Clearly,

[ ] [ ]
lim ( , ) ( , )
N
N
u t T u t T
·
÷·
. (2.81d)

We assume that
[ ] N
u is well behaved with respect to the test functions o , so that


[ ] [ ]
lim ( ) ( , ) ( ) ( , )
N
N
t u t T dt t u t T dt o o
· ·
·
÷·
÷· ÷·

³ ³
. (2.81e)

_____________________________________________________________________________



Figure 2.10(a) is a plot of ( ) u t . The solid curve in Fig. 2.10(b), shifted upward from its true position, is
[ ]
( , ) u t T
·
and the dashed curves represent ( ) u t displaced by multiples of T .
t
t
FIGURE 2.10(a).
FIGURE 2.10(b).

[ ]
( , ) u t T
·

( ) u t
T
- 174 -
Fourier Series · 2.20
- 175 -
From (2.81e) and the definition of the generalized limit [see Eq. (2.56a)], we then know that


[ ] [ ] [ ]
lim ( ) ( , ) ( ) lim ( , ) ( ) ( , )
N N
N N
t u t T dt t G u t T dt t u t T dt o o o
· · ·
·
÷· ÷·
÷· ÷· ÷·
ª º

¬ ¼
³ ³ ³
,

from which it follows that

[ ] [ ]
lim ( , ) ( , )
N
N
G u t T u t T
·
÷·
(2.81f)
in the sense of Eq. (2.48c).
Following the pattern of the definitions in (2.81a) and (2.81c), we define


[ ]
( , ) ( )
N
N
k N
t T t kT o o
÷
÷
¦
(2.82a)
and

[ ]
( , ) ( )
k
t T t kT o o
·
·
÷·
÷
¦
. (2.82b)

Function
[ ]
( , ) t T o
·
is clearly just another way of writing the shah function ( , ) t T 1 I I . [The shah
function is defined in Eq. (2.73) and shown equal to ( )
k
t kT o
·
÷·
÷
¦
in Eq. (2.75d).] The
convolution of the generalized function


[ ]
( , ) ( )
N
N
k N
t T t kT o o
÷
÷
¦


with the true function u(t) is


( )
[ ] [ ]
( ) ( , ) ( ) ( , ) ( ) ( )
( ) ,
N
N N
k N
N
k N
u t t T u t t t T dt u t t t kT
u t kT
o o o
· ·
÷
÷· ÷·
÷
´ ´ ´ ´ ´ · ÷ ÷ ÷
÷
¦
³ ³
¦



where the next-to-last step uses ( ) ( ) x x o o ÷ as shown in Eq. (2.68a). The definition of
[ ] N
u in
(2.81c) then gives


[ ] [ ]
( , ) ( ) ( , )
N N
u t T u t t T o · . (2.82c)
- 175 -
2 · Fourier Theory

- 176 -
Taking the integral Fourier transform of both sides, using the Fourier convolution theorem [see
Eq. (2.72A )], and remembering that U(ƒ) is the forward Fourier transform of u(t), we get


( ) ( ) ( )
( )
( ) [ ] ( ) ( ) [ ]
2
2
( , ) ( ) ( , )
( ) ( )
( )
sin 2 ( 1 2)
( ) ,
sin( )
ift N ift ift N
N
ift
k N
N
ikfT
k N
u t T u t t T
U f e t kT dt
U f e
fT N
U f
fT
r
r
o
o
r
r
÷ ÷ ÷
·
÷
÷
÷·
÷
÷

÷

+

¦
³
¦



F F F
(2.83a)

where in the last step we substitute from Eq. (2.77c) above. Having now found that


( )
( )
( ) [ ]
sin 2 ( 1 2)
( , ) ( )
sin( )
ift N
fT N
u t T U f
fT
r
r
÷
+
F ,

we take the inverse Fourier transform of both sides to get


( )
[ ] 2
sin 2 ( 1 2)
( , ) ( )
sin( )
N ift
fT N
u t T e U f df
fT
r
r
r
·
÷·
+

³
. (2.83b)

Taking the limit of both sides as N ÷·, we get, using (2.81d), that


( )
[ ] 2
sin 2 ( 1 2)
( , ) lim ( )
sin( )
ift
N
fT N
u t T e U f df
fT
r
r
r
·
·
÷·
÷·
+

³
. (2.83c)

Equations (2.56a) and (2.75e) can now be used to write


( )
[ ] 2
2
sin 2 ( 1 2)
( , ) ( ) lim
sin( )
1
( )
ift
N
ift
k
fT N
u t T e U f G df
fT
k
e U f f df
T T
r
r
r
r
o
·
·
÷·
÷·
·
·
÷·
÷·
ª º +

« »
¬ ¼
ª º
§ ·
÷
¨ ¸ « »
© ¹
¬ ¼
³
¦
³


or

2
[ ] 1
( , ) ( )
kt
i
T
k
u t T T U k T e
r
·
· ÷
÷·
ª º
¬ ¼
¦
. (2.83d)
- 176 -
Fourier Series · 2.20
- 177 -
Equation (2.83d) specifies the Fourier series for an arbitrary periodic function
[ ]
u
·
, showing that
[ ]
u
·
can be written as the infinite sum of complex exponentials multiplied by the complex
constants
1
[ ( )] T U k T
÷
. To get these complex constants directly from
[ ]
u
·
, we note that for any
real number t and integer m,


( 1)
2 2
( 1) ( 2)
2 2
( 1)
2
1 1 1
( ) lim ( )
1
lim ( ) ( )
( )
N T m m
i t i t
T T
N
NT
N T N T m m
i t i t
T T
N
NT N T
m
i t
T
m
U u t e dt u t e dt
T T T T
u t e dt u t e dt
T
u t e dt
t
r r
t
t t
r r
t t
r
t
+ + ·
÷ ÷
÷·
÷· ÷
÷ ÷ ÷ ÷
÷ ÷
÷·
÷ ÷ ÷
÷
÷
­ ½
° ° § ·

® ¾
¨ ¸
© ¹
° °
¯ ¿
­
°
+ +
®
°
¯
+
³ ³
³ ³
"

2
( 1) 2
2 2
( )
( ) ( ) .
T m
i t
T
T
N T T m m
i t i t
T T
T NT
u t e dt
u t e dt u t e dt
t t
r
t
t t
r r
t t
+
÷
+ + +
÷ ÷
+ +
+
½
°
+ + +
¾
°
¿
³ ³
³ ³
"

This can be simplified to


( 1)
2
1 1
lim ( )
k T m
N
i t
T
N
k N
kT
m
U e u t dt
T T T
t
r
t
+ +
÷
÷·
÷
+
§ ·

¨ ¸
© ¹
¦
³
. (2.83e)

For each value of k, we change the variable of integration to t t kT ´ ÷ so that


( 1)
2 2 2
2
( ) ( ) ( )
k T T T m m m
i t i t i t
imk
T T T
kT
e u t dt e e u t kT dt e u t kT dt
t t t
r r r
r
t t t
+ + + +
´ ´ ÷ ÷ ÷
÷
+
´ ´ ´ ´ + +
³ ³ ³
,

where we use that
2
1
imk
e
r ÷
. Substituting this into (2.83e) gives


2 2
1 1 1
lim ( ) lim ( )
T T m m
N N
i t i t
T T
N N
k N k N
m
U e u t kT dt e u t k T dt
T T T T
t t
r r
t t
+ +
´ ´ ÷ ÷
÷· ÷·
´ ÷ ÷
ª º § ·
´ ´ ´ ´ ´ + ÷
¨ ¸ « »
© ¹ ¬ ¼
¦ ¦
³ ³
,

where in the last step we have replaced index k by index k k ´ ÷ . Now, taking the limit inside the
integral to get the generalized limit [see Eq. (2.56a) above], we rely on (2.81f) to get


2 2
[ ]
1 1 1
lim ( ) ( , )
T T m m
N
i t i t
T T
N
k N
m
U e G u t k T dt e u t T dt
T T T T
t t
r r
t t
+ +
´ ´ ÷ ÷
·
÷·
´÷
ª º § ·
´ ´ ´ ´ ´ ÷
¨ ¸ « »
© ¹ ¬ ¼
¦
³ ³
. (2.83f)
- 177 -
2 · Fourier Theory

- 178 -
Equations (2.83d) and (2.83f) let us put the Fourier series into its standard form. For any
periodic function

[ ]
( ) ( , ) ( )
k
v t u t T u t kT
·
·
÷·
÷
¦

of period T, we have found that

2
( )
t
ik
T
k
k
v t A e
r
·
÷·

¦
, (2.84a)

where

2
1
( )
T k
i t
T
k
A e v t dt
T
t
r
t
+
÷

³
. (2.84b)

for any finite value of t . Because we did not require u(t) to be real in (2.80a), Eqs. (2.83d),
(2.83f), (2.84a), and (2.84b) still hold true for complex periodic functions with real arguments t.
It is customary—but of course not mandatory—to choose 0 t or 2 T t ÷ in (2.84b).
Using
[ ]
( ) ( , ) v t u t T
·
, we know from Eqs. (2.83d), (2.83f), (2.84a), and (2.84b) that the
k
A
coefficients can be specified in terms of the forward Fourier transform U(ƒ) of u(t),


1
k
k
A U
T T
§ ·

¨ ¸
© ¹
. (2.85a)

When u is real—which means that
[ ]
( ) ( , ) v t u t T
·
is also real—we know from entry 7 of Table
2.1 (located at the end of this chapter) that U(ƒ) must be Hermitian so that

( ) ( ) U f U f
·
÷ .

Hence, when v(t) is real in (2.84a), it then follows from (2.85a) that


k k
A A
·
÷
(2.85b)

in (2.84b). This procedure can be extended to all the entries in Table 2.1, giving us the entries in
Table 2.2 (also located at the end of this chapter). To go through another example, if u is
imaginary and odd, we know from entry 3 of Table 2.1 that U is real and odd, so

( ) ( ) U f U f ÷ ÷ and ( ) Im ( ) 0 U f .

- 178 -
Fourier Series · 2.20
- 179 -
Equation (2.85a) then shows that


k k
A A
÷
÷ and ( ) Im 0
k
A . (2.85c)

We can show that
[ ]
( ) ( , ) v t u t T
·
is imaginary and odd when u is imaginary and odd (let
k k ´ ÷ ),


[ ]
[ ]
( ) ( , ) ( ) ( ) ( )
( , ) ( )
k k k
v t u t T u t kT u t k T u t k T
u t T v t
· · ·
·
´ ´ ÷· ÷· ÷·
·
´ ´ ÷ ÷ ÷ ÷ ÷ + ÷ ÷
÷ ÷
¦ ¦ ¦
,

and
( ) ( ) Re ( ) Re ( ) 0
k
v t u t kT
·
÷·
÷
¦
.

This shows that we end up with (2.85c) associated with v(t) being imaginary and odd, as stated in
entry 3 of Table 2.2.
A final point worth mentioning about Fourier series is that the A
k
coefficients are often
reshuffled so that the series can be written as a sum of sines and cosines. Equation (2.84a) can be
rewritten as, using cos sin
i
e i
o
o o + ,


2 2
0
1
0
1 1
( )
2 2
cos sin .
t t
i k i k
T T
k k
k
k k k k
k k
v t A A e A e
k t k t
A A A i A A
T T
r r
r r
·
÷
÷

· ·
÷ ÷

ª º
+ +
« »
¬ ¼
§ · § ·
ª º ª º
+ + + ÷
¨ ¸ ¨ ¸
¬ ¼ ¬ ¼
© ¹ © ¹
¦
¦ ¦

(2.86a)

From Eq. (2.84b), we get

0
1
( )
T
A v t dt
T
t
t
+

³
, (2.86b)


2 2 2
1 2
( ) ( ) cos
T T t t
i k i k
T T
k k
k t
A A v t e e dt v t dt
T T T
t t
r r
t t
r
+ +
÷
÷
§ · ª º
+ +
¨ ¸ « »
¬ ¼ © ¹
³ ³
, (2.86c)
and

2 2 2
2
( ) ( ) sin
T T t t
i k i k
T T
k k
k t
i
i A A v t e e dt v t dt
T T T
t t
r r
t t
r
+ +
÷
÷
§ · ª º
ª º
÷ ÷
¨ ¸ « »
¬ ¼
¬ ¼ © ¹
³ ³
. (2.86d)
- 179 -
2 · Fourier Theory

- 180 -
Putting these results together, we can write


0
1 1
2 2
( ) cos sin
2
k k
k k
c kt kt
v t c s
T T
r r
· ·

§ · § ·
+ +
¨ ¸ ¨ ¸
© ¹ © ¹
¦ ¦
, (2.87a)
where

2 2
( ) cos for 0,1, 2,
T
k
kt
c v t k
T T
t
t
r
+
§ ·

¨ ¸
© ¹
³
… (2.87b)
and

2 2
( ) sin for 1, 2, 3,
T
k
kt
s v t k
T T
t
t
r
+
§ ·

¨ ¸
© ¹
³
… . (2.87c)

The absolute value signs are dropped from index k because it is defined positive in (2.87a), and
0
A is replaced by
0
2 c so that the formula for
0
c can be folded into the general formula for
k
c in
(2.87b). Although it is still not mandatory, parameter t is usually given the value 0 or 2 T ÷ .
Nowhere has v been required to be real, so Eqs. (2.87a)–(2.87c), just like Eqs. (2.84a) and
(2.84b), still hold true when v is a complex-valued periodic function of (real) period T. Indeed, if
v is a complex-valued function of a real argument t, both its real part

( ) ( ) Re ( )
R
v t v t
and its imaginary part
( ) ( ) Im ( )
I
v t v t

are real-valued periodic functions of period T. This means that when, for any integer m, we have

( ) ( ) v t mT v t ± (2.88a)

for a complex-valued function v of a real argument, then

( ) ( )
R R
v t mT v t ± (2.88b)
and
( ) ( )
I I
v t mT v t ± . (2.88c)

Since sines and cosines of real arguments are strictly real, we can now take the real and
imaginary parts of (2.87a)–(2.87c) to get
- 180 -
Fourier Series · 2.20
- 181 -

[ ] [ ]
0
1 1
Re( ) 2 2
( ) Re( ) cos Re( ) sin
2
R k k
k k
c kt kt
v t c s
T T
r r
· ·

§ · § ·
+ +
¨ ¸ ¨ ¸
© ¹ © ¹
¦ ¦
, (2.89a)
with

2 2
Re( ) ( ) cos for 0,1, 2,
T
k R
kt
c v t k
T T
t
t
r
+
§ ·

¨ ¸
© ¹
³
… (2.89b)

and

2 2
Re( ) ( ) sin for 1, 2, 3,
T
k R
kt
s v t k
T T
t
t
r
+
§ ·

¨ ¸
© ¹
³
… , (2.89c)
as well as

[ ] [ ]
0
1 1
Im( ) 2 2
( ) Im( ) cos Im( ) sin
2
I k k
k k
c kt kt
v t c s
T T
r r
· ·

§ · § ·
+ +
¨ ¸ ¨ ¸
© ¹ © ¹
¦ ¦
, (2.90a)
with

2 2
Im( ) ( ) cos for 0,1, 2,
T
k I
kt
c v t k
T T
t
t
r
+
§ ·

¨ ¸
© ¹
³
… (2.90b)
and

2 2
Im( ) ( ) sin for 1, 2, 3,
T
k I
kt
s v t k
T T
t
t
r
+
§ ·

¨ ¸
© ¹
³
… . (2.90c)
2.21 Discrete Fourier Transform
The first step in going from the integral Fourier transform to the discrete Fourier transform is to
repeat the procedure used in Sec. 2.20 to get the Fourier series. We pick a nonpathological
function u(t) having a forward Fourier transform


2
( ) ( )
ift
U f u t e dt
r
·
÷
÷·

³
(2.91a)

and, following the same procedure used in Eq. (2.81a) above, create a periodic function of period
T:

[ ]
( , ) ( )
k
u t T u t kT
·
·
÷·
÷
¦
. (2.91b)

As was shown Sec. 2.20, we can now write the associated Fourier series as [see Eq. (2.83d)]

- 181 -
2 · Fourier Theory

- 182 -

2
[ ]
1
( , )
kt
i
T
k
k
u t T U e
T T
r
·
·
÷·
§ ·

¨ ¸
© ¹
¦
, (2.91c)

where, as specified in (2.91a), U is the forward Fourier transform of u.
Next we divide the period T of
[ ]
u
·
into N equal lengths, t T N A , and evaluate (2.91c) only
for t m t A with 0,1, 2, , 1 m N ÷ … ,


2
[ ]
1
( , )
km
i
N
k
k
u m t T U e
T T
r
·
·
÷·
§ ·
A
¨ ¸
© ¹
¦
, (2.92a)

where we have used
N t T A (2.92b)

to simplify the exponent of (2.92a). The infinite sum in (2.92a) can be split in two by making the
substitution k n rN + with 0,1, 2, , 1 n N ÷ … and 0, 1, 2, r ± ± …. This gives


1
2
[ ] 2
0
1
( , )
nm
N
i
irm
N
r n
n rN
u m t T U e e
T T
r
r
· ÷
·
÷·
+ § ·
A
¨ ¸
© ¹
¦ ¦
.

Since
2
1
irm
e
r
and T N t A , this becomes, making the index substitution r r ´ ÷ ,


1
2
[ ]
0
1
( , )
nm
N
i
N
n r
n r
u m t T e U
T T t
r
÷ ·
·
´ ÷·
´
§ ·
A ÷
¨ ¸
A
© ¹
¦ ¦


or

1
2
[ ] [ ]
0
1 1
( , ) ,
nm
N
i
N
n
n
u m t T e U
T T t
r
÷
· ·

§ ·
A
¨ ¸
A
© ¹
¦
, (2.93a)

where we follow the pattern of Eqs. (2.81a) and (2.91b) and define


[ ]
( , ) ( )
r
U f F U f rF
·
·
÷·
÷
¦
(2.93b)
for any two frequencies ƒ and F.
Equation (2.93a) is a somewhat disguised version of the discrete Fourier transform (DFT).
Figures 2.11(a) and 2.11(b) show the relationship of the two periodic functions
[ ]
u
·
and
[ ]
U
·
,
graphed with solid lines, to the two original functions u and U graphed with dashed lines. [In
graphs such as these, u(t) typically stands for data and is usually real, making it easy to represent
- 182 -
Discrete Fourier Transform · 2.21
- 183 -
with a two-dimensional plot; but its transform U(ƒ) is often complex, so it makes more sense to
plot ( ) U f if we just want to show where U(ƒ) is different from zero.] When function
[ ]
u
·
has
period T and is uniformly sampled at intervals of ¨t, then function
[ ]
U
·
has period


1
F
t

A
(2.93c)
and is uniformly sampled at intervals of

1
f
T
A . (2.93d)

Note, of course, we could also say that
[ ]
u
·
has period 1 f A and is uniformly sampled at
intervals of 1 F when
[ ]
U
·
has period F and is sampled at intervals of ¨ƒ. When both ¨ƒ and ¨t
are known, we have from (2.92b) and (2.93d) that


1
f t
N
A A (2.93e)

Figures 2.12(a) and 2.12(b) show that if T and F are large and functions u(t) and U(ƒ) die away
relatively quickly when t and f are large—which means that u and U are localized near the t
and ƒ origins—then the corresponding periodic functions
[ ]
( , ) u t T
·
and
[ ]
( , ) U f F
·
can be used
to approximate the non-negligible regions of u and U. Almost always when the DFT is used, its
users have in mind a situation such as that shown in Figs. 2.12(a) and 2.12(b), with
[ ]
u
·
and
[ ]
U
·

being good approximations of u and U for small to moderately large values of t and ƒ.
To complete the DFT transform pair, we define


2 i
N
N
w e
r
(2.94a)

and write (2.93a) as

1
[ ] [ ]
0
1 1
( , ) ,
N
nm
N
n
n
u m t T w U
T T t
÷
· ·

§ ·
A
¨ ¸
A
© ¹
¦
. (2.94b)

Multiplying both sides by
mk
N
w
÷
and summing over m gives


1 1 1
[ ] [ ] ( )
0 0 0
1 1
( , ) ,
N N N
mk m n k
N N
m n m
n
u m t T w U w
T T t
÷ ÷ ÷
· ÷ · ÷

­ ½
ª º § ·
A
® ¾
¨ ¸ « »
A
© ¹ ¬ ¼
¯ ¿
¦ ¦ ¦
. (2.94c)
- 183 -
2 · Fourier Theory

- 184 -








The sum over m on the right-hand side is the sum of a geometric series,


1
[ ] ( )
,
0
N
N m n k
n k N
m
V w
÷
÷

¦
. (2.94d)

This can be solved using the standard procedure for geometric sums [see the analysis following
Eq. (2.77b) above], multiplying every term in the sum by
n k
N
w
÷
to get


[ ] [ ] ( )
, ,
1
N n k N N n k
n k n k N
V w V w
÷ ÷
+ ÷ . (2.94e)

Solving for
[ ]
,
N
n k
V gives

( ) 2 ( )
[ ]
,
2
1 1
1
1
N n k i n k
N N
n k
n k n k
i
N
N
w e
V
w
e
r
r
÷ ÷
÷ ÷ § ·
¨ ¸
© ¹
÷ ÷

÷
÷
, (2.94f)
FIGURE 2.11(a).
FIGURE 2.11(b).
t
f

[ ]
( , ) u t T
·


[ ]
( , ) U f F
·

1/ f T A
1/ t F A

1
T
f

A


1
F
t

A

- 184 -
Discrete Fourier Transform · 2.21
- 185 -
where in the last step definition (2.94a) is used to eliminate
N
w . Index n goes from zero to 1 N ÷
for each value of k [see Eqs. (2.94b) and (2.94c)]. Deciding also to restrict k to one of the integers
0,1, 2, , 1 k N ÷ … , we see that the denominator in (2.94f) can be zero only when n k . This
looks like it could be a problem, but when n = k, we can return to the original formula in (2.94d),
noting that for n = k the sum
[ ]
,
N
n k
V is equal to N. When n  k, the right-hand side of (2.94f) shows
that
[ ]
,
N
n k
V is zero because
2 ( )
1
i n k
e
r ÷
. We conclude that


[ ]
, ,
for
0 for
N
n k k n
N n k
V N
n k
o
­ ½

® ¾
=
¯ ¿


, (2.94g)

where
, k n
o is the Kronecker delta,

,
1 for
0 for
k n
n k
n k
o
­

®
=
¯


. (2.94h)

Substitution of (2.94d) into (2.94c) gives


1 1
[ ] [ ] [ ]
,
0 0
1 1
( , ) ,
N N
mk N
N n k
m n
n
u m t T w U V
T T t
÷ ÷
· ÷ ·

­ ½
§ ·
A
® ¾
¨ ¸
A
© ¹
¯ ¿
¦ ¦
.

Substituting from (2.94g), we get


1
[ ] [ ]
0
1
, ( , )
N
mk
N
m
N k
U u m t T w
T T t
÷
· · ÷

§ ·
A
¨ ¸
A
© ¹
¦
. (2.94i)

This equation is the other half of the DFT [the first half is specified by Eqs. (2.94a) and (2.94b)].
Using Eqs. (2.94a) and (2.92b) to replace
N
w by
(2 ) / i N
e
r
and N T by 1 t A , we write (2.94b)
and (2.94i) as

1
2
[ ] [ ]
0
1 1
( , ) ,
mn
N
i
N
n
n
u m t T e U
T T t
r
§ ·
÷
¨ ¸
· ·
© ¹

§ ·
A
¨ ¸
A
© ¹
¦
(2.95a)
and

1
2
[ ] [ ]
0
1
, ( , )
mn
N
i
N
m
n
U t u m t T e
T t
r
§ ·
÷
÷
¨ ¸
· ·
© ¹

§ ·
A A
¨ ¸
A
© ¹
¦
, (2.95b)
- 185 -
2 · Fourier Theory

- 186 -









FIGURE 2.12(a).
FIGURE 2.12(b).

[ ]
( , ) u t T
·


[ ]
( , ) U f F
·


1
T
f

A


1
F
t

A

t
f
1/ f T A
1/ t F A
region over which
[ ]
u u
·

region over which
[ ]
U U
·

- 186 -
Discrete Fourier Transform · 2.21
- 187 -
where index k has been replaced by n in (2.94i). This can also be written as, using Eqs. (2.93c)
and (2.93d),
( )
1
2
[ ] [ ]
0
( , ) ,
mn
N
i
N
n
u m t T f e U n f F
r
§ ·
÷
¨ ¸
· ·
© ¹

A A A
¦
(2.95c)
and
( )
1
2
[ ] [ ]
0
, ( , )
mn
N
i
N
m
U n f F t u m t T e
r
§ ·
÷
÷
¨ ¸
· ·
© ¹

A A A
¦
. (2.95d)

The forward and inverse DFTs shown in (2.95c) and (2.95d) are often written as


1
2
0
mn
N
i
N
m n
n
u U e
r
§ ·
÷
¨ ¸
© ¹

¦
(2.96a)
and

1
2
0
1
mn
N
i
N
n m
m
U u e
N
r
§ ·
÷
÷
¨ ¸
© ¹

¦
. (2.96b)

To get Eq. (2.96a) from (2.95c), we define


[ ]
( , )
m
u u m t T
·
A (2.96c)
and

[ ]
( , )
n
U f U n f F
·
A A , (2.96d)

and to get Eq. (2.96b), both sides of (2.95d) are multiplied by ¨ƒ, using (2.93e) to replace f t A A
by 1 N . We can also define

[ ]
( , )
n
U U n f F
·
A

(2.97a)
and

[ ]
( , )
m
u t u m t T
·
A A (2.97b)

to transform Eqs. (2.95c) and (2.95d) into


1
2
0
1
mn
N
i
N
m n
n
u U e
N
r
§ ·
÷
¨ ¸
© ¹

¦

(2.97c)
and

1
2
0
mn
N
i
N
n m
m
U u e
r
§ ·
÷
÷
¨ ¸
© ¹

¦

, (2.97d)
- 187 -
2 · Fourier Theory

- 188 -
where now we have multiplied both sides of (2.95c) by ¨t before replacing f t A A by 1 N .
Figures 2.13(a) and 2.13(b) show how the u
[’]
and U
[’]
continuous functions are sampled to
create the DFT formulas in the previous paragraph. The values of the original functions u and U
are ignored for negative values of t and ƒ; instead, we sample u
[’]
and U
[’]
out to t = T and f = F,
picking up the original u and U values at negative t and ƒ where they repeat near t = T and f = F.
Many times DFT plots show u
m
and U
n
with n and m running from 0 to N í 1. When this is done,
it is with the understanding that the large index values greater than N/2 represent u and U for
negative t and ƒ values respectively.
2.22 Aliasing as an Error
The DFT is important because there is an algorithm, called the fast Fourier transform (FFT), that
allows computers to calculate the sums in Eqs. (2.96a), (2.96b), (2.97c), and (2.97d) rapidly when
N is a multiple of 2. The FFT performs best when 2
j
N for j a positive integer. In fact, when
faced with calculating an integral Fourier transform


2
( ) ( )
ift
U f u t e dt
r
·
÷
÷·

³


over a range of ƒ values for an arbitrary function u(t), it is standard practice to convert the
integral to a DFT and do the job on a computer with a FFT. As we saw in the previous section,
the DFT deals directly with
[ ]
u
·
and
[ ]
U
·
rather than u and U. Thus, successfully using the DFT
to calculate the integral transform requires that
[ ]
u
·
and
[ ]
U
·
consist of well-separated, repetitive
regions of u and U, as shown in Figs. 2.12(a) and 2.12(b), instead of overlapping regions of u and
U, as shown in Figs. 2.11(a) and 2.11(b). Ensuring that
[ ]
u
·
consists of nonoverlapping regions
of u tends to occur naturally; the shape of u is already known so there is no real difficulty in
picking T large enough to prevent significant amounts of overlap in
[ ]
u
·
. The shape of U,
however, is not known in advance, so care must be taken to avoid significant amounts of overlap
in U.
Consider what happens when the DFT is used to analyze a real signal u(t) having the spectrum
U(ƒ) and we know that U(ƒ) is zero for all
max
f f > and nonzero for
max
0 f f < < . Because u is
real, we know from entry 7 in Table 2.1 that ( ) ( ) U f U f
·
÷ , ensuring that U(ƒ) is also nonzero
for negative frequency values
max
0 f f > > ÷ ; that is, for every positive ƒ at which U is nonzero
there must be a íƒ at which U is nonzero, and because U is zero for
max
f f > it follows that U is
zero for all
max
f f s ÷ . Hence U can be represented schematically by the solid triangle centered
on the origin of Fig. 2.14. To construct
[ ]
U
·
, we write
- 188 -
Aliasing as an Error · 2.22
- 189 -










FIGURE 2.13(a).
FIGURE 2.13(b).

[ ]
( , ) u t T
·

t
f

[ ]
( , ) U f F
·


1
T
f

A


1
F
t

A

1/ t F A
1/ f T A
region over which
[ ]
U U
·

region over which
[ ]
u u
·

- 189 -
2 · Fourier Theory

- 190 -

[ ]
( , ) ( )
k
U f F U f kF
·
·
÷·
÷
¦
, (2.98a)

where the smallest we can make F and still avoid overlap is, as shown by the dotted triangles in
Fig. 2.14,

max
2 F f . (2.98b)

From Eq. (2.93c), we see that in Fig. 2.14

1
F
t

A
,

where ¨t is the interval in t between adjacent samples of u(t). If ¨t is made smaller, then F
increases, moving the regions of nonzero U further apart in Fig. 2.14; and if ¨t is made larger,
then F decreases, forcing the regions of nonzero U to overlap in Fig. 2.14. Making ¨t smaller is
wasteful, in that more effort than is needed goes into sampling u(t), and making ¨t larger
damages the integrity of the U calculations for large values of ƒ near
max
f . Clearly, the frequency
value F/2 plays an important role in DFT analysis, because optimum performance requires
max
/ 2 f F . For this reason frequency F/2 is given a special name: the Nyquist frequency
/ 2
Nyq
f F . From (2.93c), we see that

1
2
Nyq
f
t

A
. (2.99a)

A realistic system, of course, is designed with some built-in margin for error. The requirement
then becomes that ¨t be small enough to separate unexpectedly high frequencies when the
highest expected frequency is
max
f . To provide this margin, we take


max
1
2
Nyq
f f
t
>
A
(2.99b)
or

max
1
2
t
f
A < . (2.99c)

Now the region between
max
f and
Nyq
f is available for analysis of unexpectedly high frequencies.
Suppose U(ƒ) is negligible everywhere except at two frequencies, the positive frequency
0
f
and the corresponding negative frequency ( )
0
f ÷ . Since U(ƒ) is the transform of a real signal,
entry 7 of Table 2.1 requires ( ) ( ) U f U f
·
÷ , forcing the existence of a non-negligible transform
- 190 -
Aliasing as an Error · 2.22
- 191 -
value at ( )
0
f ÷ when there is a non-negligible transform value at
0
f . The two frequencies are
represented by wide, solid-sided arrows in Fig. 2.15. The arrows represent isolated, narrow
regions where U is very large, so we can think of them as proportional to delta functions and
write U(ƒ) as
0 0
( ) ( ) ( ) U f A f f B f f o o ÷ + + .

Variables A and B are arbitrary complex constants. We have just seen that Table 2.1 requires
( ) ( ) U f U f
·
÷ . Because the delta functions are real, the equation ( ) ( ) U f U f
·
÷ can be
written as

0 0 0 0
( ) ( ) ( ) ( ) A f f B f f A f f B f f o o o o
· ·
÷ ÷ + ÷ + ÷ + +

or, since the delta functions are also even [see Eq. (2.68a)],


0 0 0 0
( ) ( ) ( ) ( ) A f f B f f A f f B f f o o o o
· ·
+ + ÷ ÷ + + .

This can only be true if A B
·
(which is, of course, the same thing as having B A
·
).
Therefore, we have the freedom to choose only one arbitrary complex constant, say A, and after
making that choice function U(ƒ) becomes

______________________________________________________________________________

FIGURE 2.14.


f
( ) U f

max
f -
max
f
F - F
[ ]
( , ) U f F
·
- 191 -
2 · Fourier Theory

- 192 -

0 0
( ) ( ) ( ) U f A f f A f f o o
·
÷ + + . (2.100a)

It is not difficult to figure out what happens when the DFT is used to calculate this double-delta
frequency spectrum. If the double-delta U(ƒ) is used to construct U
[’]
(f, F) according to formula
(2.98a), we get multiple isolated regions where U
[’]

is very large, as shown by the wide dashed
arrows in Fig. 2.15. The curved single arrows show which wide dashed arrows come from the
wide, solid-sided arrow at f
0
and which wide dashed arrows come from the wide solid-sided
arrow at ( )
0
f ÷ . For example, the wide dashed arrow closest to f
0
comes from the wide solid-
sided arrow at (–f
0
), and the wide dashed arrow closest to (–f
0
) comes from the wide solid-sided
arrow at f
0
. The two wide solid-sided arrows at f
0
and –f
0
lie a distance a inside the positions of
the positive and negative Nyquist frequencies f
Nyq
and –f
Nyq
, and the two wide dashed arrows that
are closest to f
0
and –f
0
lie a distance a outside the positive and negative Nyquist frequencies f
Nyq

and –f
Nyq
. We see that the original double-delta U(ƒ) transform can be written as [from Eq.
(2.100a)]

( ) ( ) ( )
Nyq Nyq
U f A f f a A f f a o o
·
÷ + + + ÷ , (2.100b)

and we can pair up the two wide dashed arrows closest to f
0
and –f
0
to create the transform


[1]
( ) ( ) ( )
Nyq Nyq
U f A f f a A f f a o o
·
÷ ÷ + + +

. (2.100c)

Because the delta function
0
( ) ( )
Nyq
f f a f f o o + ÷ + has the coefficient A
·
in (2.100b), the
curved single arrow going from ( )
0
f ÷ to
Nyq
f a + shows that the delta function ( )
Nyq
f f a o ÷ ÷
at
Nyq
f a + must have the coefficient A
·
in Eq. (2.100c); similarly, the curved single arrow going
from
0
f to
Nyq
f a ÷ ÷ shows that the delta function ( )
Nyq
f f a o + + at
Nyq
f a ÷ ÷ must have the
coefficient A in Eq. (2.100c). Nothing stops us from continuing out from the origin, pairing the
wide dashed arrows at 3
Nyq
f f a ÷ and 3
Nyq
f f a ÷ + to get


[2]
( ) ( 3 ) ( 3 )
Nyq Nyq
U f A f f a A f f a o o
·
÷ + + + ÷

(2.100d)

and pairing the wide dashed arrows at 3
Nyq
f f a + and 3
Nyq
f f a ÷ ÷ to get


[3]
( ) ( 3 ) ( 3 )
Nyq Nyq
U f A f f a A f f a o o
·
÷ ÷ + + +

. (2.100e)
- 192 -
Aliasing as an Error · 2.22
- 193 -
FIGURE 2.15.

















Each time, the curved single arrows in Fig. 2.15 are consulted to find the coefficients of the delta
functions. This can obviously be continued out to indefinitely large values of ƒ, creating the
paired transforms
[4] [5]
, , U U

…, etc. The general formula for
[ ] k
U

turns out to be


[ ]
( )
( ) for even
( )
( ( 1) )
( ( 1) ) for odd
Nyq Nyq
Nyq Nyq
k
Nyq Nyq
Nyq Nyq
A f f kf a
A f f kf a k
U f
A f f k f a
A f f k f a k
o
o
o
o
·
·
÷ ÷ + ­
°
+ + + ÷
°
°

®
°
÷ ÷ ÷ ÷
°
°
+ + + ÷ +
¯

. (2.100f)
frequency
0
f frequency –
0
f
frequency
Nyq
f frequency –
Nyq
f
a a a a
nyq
f F 2

- 193 -
2 · Fourier Theory

- 194 -
We started out with the double-delta U(ƒ) being the forward Fourier transform of u(t), which
means that u(t) is the inverse Fourier transform of the double-delta U(ƒ),


2
( ) ( )
ift
u t U f e df
r
·
÷·

³
.

We now show that u(t), the inverse transform of the double-delta U(ƒ), and
[1] [2]
( ), ( ), u t u t … the
inverse transforms of
[1] [2]
, , U U

…, all have the same values at for 0, 1, 2, t m t m A ± ± … ,



[1] [2] [ ]
( ) ( ) ( ) ( )
k
u m t u m t u m t u m t A A A A " " . (2.100g)


We begin by taking the inverse Fourier transform of the double-delta U(ƒ) function specified
in (2.100b),


2
2 ( ) 2 ( ) 2 ( )
( ) [ ( ) ( )]
2Re[ ]
Nyq Nyq Nyq
ift
Nyq Nyq
it f a it a f it f a
u t A f f a A f f a e df
Ae A e Ae
r
r r r
o o
·
·
÷·
÷ ÷ ÷ ·
÷ + + + ÷
+
³
.
(2.101a)

Similarly, we can take the inverse Fourier transform of
[ ]
( )
k
U f

in (2.100f) to get



2 ( )
[ ]
2 ( ( 1) )
2Re[ ] for even
( )
2Re[ ] for odd
Nyq Nyq
Nyq Nyq
it f kf a
k
it f k f a
Ae k
u t
Ae k
r
r
+ ÷
÷ + ÷ +
­
°

®
°
¯

. (2.101b)


Substituting t m t A from (2.100g) and 1 (2 )
Nyq
f t A from (2.99a) into Eq. (2.101a) gives



1
2 ((2 ) ) 2
2
( ) 2Re[ ] 2Re[ ]
2Re[( 1) ]
im t t a i m ima t
m ima t
u m t Ae Ae e
Ae
r r r
r
÷
A A ÷ ÷ A
÷ A
A
÷ .
(2.101c)

Making the same substitutions into Eq. (2.101b) gives
- 194 -
Aliasing as an Error · 2.22
- 195 -

1 1
1 1
2 ((2 ) (2 ) )
2
[ ]
2 ((2 ) ( 1)(2 ) )
( 1) 2
2Re[ ]
2Re[ ] for even
( )
2Re[ ]
2Re[
im t t k t a
i m i mk ima t
k
im t t k t a
i m i m k ima
Ae
Ae e e k
u m t
Ae
Ae e e
r
r r r
r
r r r
÷ ÷
÷ ÷
A A + A ÷
÷ A
÷ A A + ÷ A +
÷ ÷ ÷ ÷ A

A

] for odd
t
k
­
°
°
°
®
°
°
°
¯

. (2.101d)

But ( 1) 1
i mk mk
e
r ±
÷ when k is even and
( 1) ( 1)
( 1) 1
i m k m k
e
r ± ÷ ÷
÷ when k is odd, so this last
result can be written as


2
[ ]
2
2Re[ ( 1) ] for even
( )
2Re[ ( 1) ] for odd
m ima t
k
m ima t
A e k
u m t
A e k
r
r
÷ A
÷ A
­ ÷
A
®
÷
¯

. (2.101e)

Comparing this with (2.101c), we conclude that
[ ]
( ) ( )
k
u m t u m t A A for all values of m and k,
showing that (2.100g) must be true. Because the
[ ] k
u functions have exactly the same values as
the u functions at for 0, 1, 2, t m t m A ± ± … , the
[ ] k
u functions are called aliases of function
u. Figure 2.16 graphs an example of u(t) and to show how u and its alias
[1]
u can have identical
values at all the sample positions on the t axis.
The term “alias” is an interesting one; it suggests that there is no real way to distinguish these
functions if all we know are the values of the sample points at t m t A . Yet in Figs. 2.14 and
2.15, there is really no question as to which is the correct region of
[ ]
U
·
; spectral values whose
frequencies do not lie between +f
Nyq
and –f
Nyq
can clearly be disregarded. Consider, however, that
before u(t) is analyzed there is no guarantee as to what the correct value of f
max
is. Figure 2.17, for
example, shows a pattern for
[ ]
U
·
that seems to have well-separated regions for U and all its
aliases when in fact there is a high-frequency triangle that is hidden by aliasing. The unwary
analyst might conclude that U has the shape shown in Fig. 2.18(a) when its true shape is the one
shown in Fig. 2.18(b). There is really no way to be sure of the true shape of U when all that is
known is the DFT of the sampled signal u(t). The basic problem, which is that the DFT is the
sampled version of
[ ]
U
·
instead of U, does not disappear when 1 F t A is made larger by
decreasing the sampling interval ¨t; there is always the possibility that the true U curve is broad
enough to overlap. Returning to Fig. 2.16, we see that no matter how small ¨t is made, the
information thrown away from between the samples inevitably allows high frequencies to
masquerade as low frequencies. There is no foolproof method for both sampling the data and
avoiding this possibility.
Fortunately, there are usually ways of avoiding this logical dead end. As is pointed out in Sec.
2.2 above [see discussion after Eq. (2.9b)], in practice all measurements are sampled and, before
representing them by continuous functions, we must know that the samples capture all the
- 195 -
2 · Fourier Theory

- 196 -
relevant detail. In other words, there must be some way of knowing, based on past experience or
knowledge of how the data is gathered, that the sampling is rapid enough to represent faithfully
all the important high-frequency details. In terms of the notation used to discuss Fig. 2.14, we
must eventually be prepared to say that, for some specific ƒ
max
, no higher frequencies are present
to create aliasing—that is, we must know that if more closely spaced sampling is done all that
would be found is a smooth, quasi-linear variation between the current samples. Many times the
electronic instruments used to make the measurements cannot sense high-frequency data, so even
if high-frequency components exist, they cannot be recorded. Other times, all that can be done is
to look at the data samples and decide whether it is reasonable to suspect the presence of unseen
high-frequency components. The data in Fig. 2.19(a), for example, almost certainly do not
contain significant amounts of unseen high frequencies, whereas unseen high frequencies could
well be present in Fig. 2.19(b). There may be cases where all that can be done is to shorten ¨t and
see whether previously aliased frequency components suddenly appear. The question of whether
aliasing is present is analogous to the question of whether experimental error is present. Just as it
is always logically possible that data contain significant amounts of undetected error, so it is






1.1
1.1
y
i
Y
i
4.5 4.5 x
i
5 4 3 2 1 0 1 2 3 4 5
1
0.5
0
0.5
1
FIGURE 2.16.
The solid line represents a sinusoidal oscillation at a frequency that is 0.8 times the Nyquist
frequency, and the dashed line represents a sinusoidal oscillation that is 1.2 times the
Nyquist frequency. When the curves are sampled at the rate represented by the black dots—
which in this case is the Nyquist frequency—there is no way to tell them apart in the sampled
data.
t
- 196 -
Aliasing as an Error · 2.22
- 197 -
always logically possible that significant amounts of aliasing are being overlooked. Just as we
often expect insignificant amounts of error to occur no matter what precautions are taken, so we
often expect insignificant amounts of aliasing to occur in the calculated DFT. What is needed is
the presence of good engineering and scientific judgment; there must always be someone willing
to pick a value for ƒ
max
, allowing us to specify the sampling interval
max
1 (2 ) t f A s that prevents
significant aliasing in the DFT.
2.23 Aliasing as a Tool
The previous section presented the bad aspects of aliasing, treating it as a form of data corruption.
There are, however, occasions when aliasing is more of a feature than a bug. Many times, a real
function u(t) is known to have a Fourier transform


2
( ) ( )
ift
U f u t e dt
r
·
÷
÷·

³
,

which is zero for all positive frequencies ƒ that do not lie between the two positive numbers ƒ
min

and ƒ
max
; that is, U(ƒ) is zero when
min
0 f f s s and
max
f f > . Because u(t) is real, U(ƒ) must be
Hermitian (see entry 7 of Table 2.1), which means

( ) ( ) U f U f
·
÷ .

This shows that U(ƒ) must also be strictly zero for negative frequencies ƒ where
min
0 f f ÷ s s
and
max
f f s ÷ . The U(ƒ) transform is schematically represented in Fig. 2.20 with the two blocks
showing that U is zero unless ƒ lies between
max min
( , ) f f ÷ ÷ or
min max
( , ) f f .
The situation shown in Fig. 2.20 describes the signal produced by Michelson interferometers.
At the beginning of this chapter, we mentioned that interferometers produce interferograms that
must then be Fourier transformed to produce the desired spectral measurement. As explained
later in Chapter 4 (see Sec. 4.10), interferometers use optical filters to block out undesired
electromagnetic frequencies, which means there always exist values of ƒ
min
and ƒ
max
such that the
transform U(ƒ) of the interferogram signal u(t) is zero unless ƒ lies between
max min
( , ) f f ÷ ÷ or
min max
( , ) f f . Suppose we sample the interferogram signal with a sampling interval ¨t such that
the Nyquist frequency
1
(2 )
Nyq
f t
÷
A is slightly larger than ƒ
max
. Repeating the reasoning used to
get Fig. 2.15 above, we see that

[ ]
( , ) ( )
k
U f F U f kF
·
·
÷·
÷
¦


This shows that U( f ) must also be strictly zero for negative frequencies f where
- 197 -
2 · Fourier Theory

- 198 -





The ) , (
] [
F f U
·
data in Fig. 2.17 contains hidden aliasing that can lead spectral analysts to assume
that the Fig. 2.18(a) rather than 2.18(b) depicts the true frequency spectrum.

FIGURE 2.17.
FIGURE 2.18(a).
FIGURE 2.18(b).
f
f
f

Nyq
f
Nyq
f ÷

Nyq
f F 2
Nyq
f F 2 ÷ ÷
) , (
] [
F f U
·

) ( f U
) ( f U
- 198 -
Aliasing as a Tool · 2.23
- 199 -


















This curve varies rapidly in three locations, suggesting the presence of high-frequency
components in the data.
FIGURE 2.19(a).
This data is relatively smooth, suggesting that it does not contain high-frequency components.
FIGURE 2.19(b).
- 199 -
2 · Fourier Theory

- 200 -
now has the form shown in Fig. 2.21. Again, the solid blocks show the original U(ƒ), the dashed
blocks show the aliases created by turning U(ƒ) into
[ ]
( , ) U f F
·
, and the curved arrows drawn
show exactly how the aliased blocks are created from the original blocks. No solid blocks overlap
with the dashed blocks, so aliasing is not a problem.
Now consider what happens when we force aliasing to occur by choosing ¨t to be half its
original size, creating the
[ ]
U
·
plot shown in Fig. 2.22. As in Fig. 2.21, none of the solid blocks
overlap with the dashed blocks. Because the dashed blocks come from turning U into
[ ]
U
·
, the
spectral shapes represented by the solid and dashed blocks are all identical. This means that the
aliasing does not cause spectral information to be lost; either the solid blocks or the dashed
blocks can be used to recover the true shape of U(ƒ). The electronic equipment used to sample
u(t) only needs to sample half as often as before, which usually makes it less expensive to build,
and as a bonus the rate at which data flows from the interferometer ends up being cut in half. This
last point is often a significant consideration when the interferometer is on a satellite and all the
data has to be communicated to the ground. The scheme shown in Fig. 2.22 is called
undersampling. There is nothing special about undersampling by a factor of 2; if the distance
between ƒ
min
and ƒ
max
is small enough, and ƒ
min
is far enough from 0 f , we can undersample
by much higher factors. Figure 2.23 shows a scheme that undersamples by a factor of 5.
2.24 Sampling Theorem
We define a band-limited function u(t) to be a function for which there exists a positive
frequency ƒ
max
such that the forward Fourier transform of u(t),

2
( ) ( )
ift
U f u t e dt
r
·
÷
÷·

³
,

is strictly zero when
max
f f s ÷ or
max
f f > . The previous section indicated that the interferogram
of a Michelson interferometer is a special case of a band-limited function; not only is its
transform zero for
max
f f > , but there is also a positive frequency ƒ
min
such that its transform is
zero for
min
f f s (see Fig. 2.20). It can be shown that whenever a continuous function u(t) is
also band limited, then its samples ( ) u m t A (with 0, 1, 2, m ± ± …) can be used to reconstruct the
complete function—including the values of u between the samples—as long as we choose


max
1
2
t
f
A < (2.102)
to prevent aliasing.
We start by forming the mathematical construct

with 4 aliases rather than one.
- 200 -
Sampling Theorem · 2.24
- 201 -








FIGURE 2.20.
FIGURE 2.21.
) ( f U
f
f
) , (
] [
F f U
·


min
f
max
f
min
f ÷
max
f ÷

min
f

max
f

min
f ÷

max
f ÷

Nyq
f
Nyq
f ÷
F
F ÷
Frequency F is twice the Nyquist frequency
Nyq
f in Fig. 2.21.
- 201 -
2 · Fourier Theory

- 202 -
( ) ( ) ( )
m
v t u m t t m t o
·
÷·
A ÷ A
¦
. (2.103)

Clearly, the ( ) u m t A sample values of function u are the only data used to set up function v(t).
Because
0 0 0
( ) ( ) ( ) ( ) u t t t u t t t o o ÷ ÷ for any continuous function u [see Eq. (2.68e) above], this
can be written as
( ) ( ) ( )
m
v t u t t m t o
·
÷·
÷ A
¦

or
( ) ( ) ( )
m
v t u t t m t o
·
÷·
ª º
÷ A
« »
¬ ¼
¦
.

Note that here t has returned to being a continuous, not a sampled, variable. Taking the Fourier
transform of both sides gives, using the Fourier convolution theorem [see Eq. (2.72i)],


1
( ) ( )
k
k
V f U f f
t t
o
·
÷·
ª º
§ ·
· ÷
¨ ¸ « »
A A
© ¹
¬ ¼
¦
, (2.104a)
where

2
( ) ( )
ift
V f v t e dt
r
·
÷
÷·

³
, (2.104b)


2
( ) ( )
ift
U f u t e dt
r
·
÷
÷·

³
, (2.104c)
and

2
1
( )
ift
k k
k
t k t e dt f
t t
r
o o
·
· ·
÷
÷· ÷·
÷·
ª º § ·
÷ A ÷
¨ ¸ « »
A A
© ¹ ¬ ¼
¦ ¦
³
(2.104d)

from formula (2.78d). Note that here both ƒ and t are continuous, not sampled, variables. We can
now use the linearity of the convolution [see discussion after Eq. (2.38c)] and the definition of
the convolution in Eq. (2.38a) to write (2.104a) as


[ ]
( ) ( ) ( )
1
, ,
k k
k
k k
t V f U f f U f f f df
t t
k
U f U f
t t
o o
·
· ·
÷· ÷·
÷·
·
·
÷·
§ · § ·
´ ´ ´ A · ÷ ÷ ÷
¨ ¸ ¨ ¸
A A
© ¹ © ¹
§ · § ·
÷
¨ ¸ ¨ ¸
A A
© ¹ © ¹
¦ ¦
³
¦

(2.105a)
Note that here t in the function u has returned to being a continuous variable.
- 202 -
Sampling Theorem · 2.24
- 203 -












In both Figs. 2.22 and 2.23, frequency F is twice the Nyquist frequency
Nyq
f .


where
[ ]
U
·
is as defined in Eq. (2.93b) above. Inequality (2.102) ensures that the separate
regions of U that combine to create
[ ]
U
·
do not overlap, giving us the graph of
[ ]
U
·
shown in
Fig. 2.24. Hence, we can use the H function defined in Eq. (2.56c) to select just the region of
nonzero
[ ]
U
·
between
1 1
(2 ) and (2 ) t t
÷ ÷
+ A ÷ A , recreating the original U(ƒ) transform.
Multiplication of (2.105a) by
( )
1
, (2 ) f t
÷
H A then gives


[ ]
1 1 1
( ) , , ( ) ,
2 2
U f f U f t V f f
t t t
·
§ · § · § ·
H A H
¨ ¸ ¨ ¸ ¨ ¸
A A A
© ¹ © ¹ © ¹
. (2.105b)

FIGURE 2.22.
FIGURE 2.23.

Nyq
f

Nyq
f ÷

min
f
max
f

min
f

max
f

min
f ÷

min
f ÷

max
f ÷

max
f ÷
) , (
] [
F f U
·

) , (
] [
F f U
·


Nyq
f
Nyq
f ÷
F
F
F ÷
F ÷
f
f
- 203 -
2 · Fourier Theory

- 204 -
Having recovered the original U(ƒ), an inverse Fourier transform of U(ƒ) gives back the original
unsampled u(t). Using the Fourier convolution theorem again to take the inverse Fourier
transform of both sides of (2.105b), we get [applying Eq. (2.39j) after interchanging the roles of ƒ
and t]

2
2 2
1
( ) ( ) ,
2
1
( ) , ,
2
ift
ift if t
u t t V f f e df
t
t V f e df f e df
t
r
r r
·
÷·
· ·
´
÷· ÷·
§ ·
A H
¨ ¸
A
© ¹
ª º ª º
§ ·
´ ´ A · H
« » « » ¨ ¸
A
© ¹
¬ ¼ ¬ ¼
³
³ ³

(2.106a)

where the convolution between the two expressions inside square brackets [ ] is over the variable
t. From (2.104b), function V(ƒ) is the forward Fourier transform of v(t), making v(t) equal to the
inverse Fourier transform of V(ƒ) in (2.106a), with v(t) defined as

( ) ( ) ( )
m
v t u m t t m t o
·
÷·
A ÷ A
¦


in Eq. (2.103). From Eq. (2.71a) above, the inverse Fourier transform of H is


( ) 2
1 1 1
, , sin
2 2
ift ift
t
f e f df
t t t t
r
r
r
·
÷·
§ ·
§ · § · § ·
H H
¨ ¸ ¨ ¸ ¨ ¸ ¨ ¸
A A A
© ¹ © ¹ © ¹
© ¹
³
F .

Equation (2.106a) can now be written as


1
( ) ( ) ( ) sin
m
t
u t t u m t t m t
t t
r
o
r
·
÷·
ª º ª º § ·
A A ÷ A ·
¨ ¸ « » « »
A
© ¹ ¬ ¼ ¬ ¼
¦
. (2.106b)

Again, the linearity of the convolution can be used to simplify (2.106b),


1
( ) ( ) ( ) sin
m
t
u t t u m t t m t
t t
r
o
r
·
÷·
­ ½
ª º
§ ·
A A ÷ A ·
® ¾
¨ ¸ « »
A
© ¹
¬ ¼
¯ ¿
¦


or, using that
0 0
( ) ( ) ( ) t t u t u t t o ÷ · ÷ for any continuous function u,


1 ( )
( ) ( ) sin
(( ) )
m
t m t
u t u m t
t m t t t
r
r
·
÷·
­ ½
ª º ÷ A ° ° § ·
A
® ¾
¨ ¸ « »
÷ A A A
© ¹ ° ° ¬ ¼
¯ ¿
¦
. (2.106c)

- 204 -
Sampling Theorem · 2.24
- 205 -

FIGURE 2.24.










This formula gives us u(t) everywhere in terms of the samples ( ) u m t A and the function


1
sin
( )
t
t t t
r
r
§ ·
¨ ¸
A A
© ¹
.

We now define the function

sin( )
sinc( )
x
x
x
(2.106d)

and write (2.106c) as

( )
( ) ( )sinc
m
t m t
u t u m t
t
r
·
÷·
÷ A § ·
A
¨ ¸
A
© ¹
¦
. (2.106e)

[ ]
1
, U f
t
·
§ ·
¨ ¸
A
© ¹

) ( f U

max
f
max
f ÷
max
1
f
t
÷
A

¸
¹
·
¨
©
§
÷
A
÷
max
1
f
t


t A 2
1

t A
÷
2
1

f
- 205 -
2 · Fourier Theory

- 206 -
Many authors use a different definition of the sinc function, which we call here sinc
alt
, with

sin( )
sinc ( )
alt
x
x
x
r
r
.

In terms of sinc
alt
, Eq. (2.106e) becomes


( )
( ) ( )sinc
alt
m
t m t
u t u m t
t
·
÷·
÷ A § ·
A
¨ ¸
A
© ¹
¦
.

For the rest of this book, the symbol sinc will refer to
sin( ) x
x
instead of
sin( ) x
x
r
r
. We also
note that the Fourier transform pair in (2.71a) can be written in terms of sinc( ) x as


2
[2 sinc(2 )] ( , )
ift
e F Ft dt f F
r
r
·
÷
÷·
H
³

and

2
( , ) 2 sinc(2 )
ift
e f F df F Ft
r
r
·
÷·
H
³
.

Replacing ƒ by íƒ in the top integral and t by ít in the bottom integral gives


2
[2 sinc(2 )] ( , ) ( , )
ift
e F Ft dt f F f F
r
r
·
÷·
H ÷ H
³

and

2
( , ) 2 sinc( 2 ) 2 sinc(2 )
ift
e f F df F Ft F Ft
r
r r
·
÷
÷·
H ÷
³
,

where we have used that ( , ) f F H and sinc(2 ) Ft r are even functions of their arguments:

sinc( ) sinc( ) x x ÷ (2.107a)
and
( , ) ( , ) f F f F H ÷ H . (2.107b)

This means we can write this Fourier relationship using the more general formulas
- 206 -
Sampling Theorem · 2.24
- 207 -
( )
( ) 2
2 sinc(2 ) [2 sinc(2 )] ( , )
ift ift
F Ft e F Ft dt f F
r
r r
·
± ±
÷·
H
³
F (2.108a)
and

( ) ( )
( ) ( ) 2
( , ) ( , ) ( , ) 2 sinc(2 )
ift itf ift
f F f F e f F df F Ft
r
r
·
± ± ±
÷·
H H H
³
F F . (2.108b)
2.25 Fourier Transforms in Two and Three Dimensions
The integral Fourier transform extends easily and naturally to two- and three-dimensional
functions. We can, for example, define the integral Fourier transform of any two-dimensional
function u(x,y) to be

2 ( )
( , ) ( , )
i x y
U dx dy e u x y
r ç q
ç q
· ·
÷ +
÷· ÷·

³ ³
. (2.109a)

The inverse Fourier transform of U returns the original function,


2 ( )
( , ) ( , )
i x y
u x y d d e U
r ç q
ç q ç q
· ·
+
÷· ÷·

³ ³
. (2.109b)

In three dimensions we can write, for the function ( , , ) u x y z , that


2 ( )
( , , ) ( , , )
i x y z
U dx dy dz e u x y z
r ç q ¸
ç q ¸
· · ·
÷ + +
÷· ÷· ÷·

³ ³ ³
(2.109c)
and

2 ( )
( , , ) ( , , )
i x y z
u x y z d d d e U
r ç q ¸
ç q ¸ ç q ¸
· · ·
+ +
÷· ÷· ÷·

³ ³ ³
. (2.109d)

This pattern of forward and inverse transforms can be extended indefinitely to functions u and U
with ever larger numbers of arguments, but for the purposes of this book there is no need to go
beyond the two- and three-dimensional transforms given in Eqs. (2.109a)–(2.109d). As a matter
of notation, we often use the standard Cartesian ˆ x and ˆ y unit vectors pointing along the x and y
axes of a Cartesian coordinate system to define vectors

ˆ ˆ xx yy p +
G
and ˆ ˆ q x y ç q +
G
.

- 207 -
2 · Fourier Theory

- 208 -
We introduce the symbol ( ) u p
G
as a shorthand for u(x,y) and the symbol ( ) U q
G
as a shorthand for
( , ) U ç q . Now Eqs. (2.109a) and (2.109b) can be written as


2 2
( ) ( )
i q
U q d e u
r p
p p
·
÷ -
÷·

³ ³
G G
G G
(2.110a)
and

2 2
( ) ( )
i q
u d q e U q
r p
p
·
-
÷·

³ ³
G G
G G
. (2.110b)

We can also define vectors for the three-dimensional case,

ˆ ˆ ˆ r xx yy zz + +
G
and ˆ ˆ ˆ s x y z ç q ¸ + +
G
,

and then write Eqs. (2.109c) and (2.109d) as


3 2
( ) ( )
ir s
U s d r e u r
r
·
÷ -
÷·

³ ³ ³
G G
G G
(2.110c)
and

3 2
( ) ( )
ir s
u r d s e U s
r
·
-
÷·

³ ³ ³
G G
G G
. (2.110d)
Vector notation is sometimes used to group families of associated forward and inverse Fourier
transforms into a single equation. We might, for example, write the six scalar equations


3 2
( ) ( )
ir s
x x
U s d r e u r
r
·
÷ -
÷·

³ ³ ³
G G
G G
,
3 2
( ) ( )
ir s
x x
u r d s e U s
r
·
-
÷·

³ ³ ³
G G
G G
,


3 2
( ) ( )
ir s
y y
U s d r e u r
r
·
÷ -
÷·

³ ³ ³
G G
G G
,
3 2
( ) ( )
ir s
y y
u r d s e U s
r
·
-
÷·

³ ³ ³
G G
G G
,
and

3 2
( ) ( )
ir s
z z
U s d r e u r
r
·
÷ -
÷·

³ ³ ³
G G
G G
,
3 2
( ) ( )
ir s
z z
u r d s e U s
r
·
-
÷·

³ ³ ³
G G
G G


as the pair of vector equations

3 2
( ) ( )
ir s
U s d r e u r
r
·
÷ -
÷·

³ ³ ³
G G G
G G G
(2.110e)
- 208 -
Fourier Transforms in Two and Three Dimensions · 2.25
- 209 -
and

3 2
( ) ( )
ir s
u r d s e U s
r
·
-
÷·

³ ³ ³
G G G
G G G
, (2.110f)
where

) ( ˆ ) ( ˆ ) ( ˆ ) ( r u z r u y r u x r u
z y x
G G G G G
+ + and ) ( ˆ ) ( ˆ ) ( ˆ ) ( s U z s U y s U x s U
z y x
G G G G
G
+ + .

We call ( ) U s
G
G
the vector Fourier transform of ( ) u r
G G
and ( ) u r
G G
the vector inverse Fourier
transform of ( ) U s
G
G
. Just as in the one-dimensional case, it makes no difference which Fourier
transform is labeled the forward transform and which is labeled the inverse transform as long as
there is a change in sign of the exponent of e. Following the pattern of Eq. (2.28A ), we can also
write

2 2 2 2
( ) ( )
i q i q
d q e d e u u
r p r p
p p p
· ·
´ ± - -
÷· ÷·
´ ´
³³ ³ ³
G G G G
B
G G
(2.110g)
and

3 2 3 2
( ) ( )
ir s ir s
d s e d r e v r v r
r r
· ·
´ ± - -
÷· ÷·
´ ´
³³ ³ ³ ³ ³
G G G G
B
G G
(2.110h)

for two-dimensional and three-dimensional scalar functions ( ) u p
G
and ( ) v r
G
. For three-
dimensional vector functions, this becomes


3 2 3 2
( ) ( )
ir s ir s
d s e d r e v r v r
r r
· ·
´ ± - ± -
÷· ÷·
´ ´
³³ ³ ³ ³ ³
G G G G
G G G G
. (2.110i)

Many one-dimensional Fourier identities have two-dimensional and three-dimensional
counterparts. For example, the Fourier shift theorem [see Eq. (2.36h) above] in two dimensions
becomes, for a two-dimensional vector constant ˆ ˆ
x y
a xa ya +
G
,


2 2 2 ( )
2 ( ) 2 ( )
( ) ( , )
( , ) ,
x y
i q i x y
x y
i a a i x y
d e u a dx dy e u x a y a
dx dy e e u x y
r p r ç q
r ç q r ç q
p p
-
· · ·
± ± +
÷· ÷· ÷·
· ·
+ ´ ´ ± +
÷· ÷·
+ + +
´ ´ ´ ´
³ ³ ³ ³
³ ³
G G
B
G G



where in the last step we define
x
x x a ´ + and
x
y y a ´ + . We now see that (dropping the
primes inside the double integral)
- 209 -
2 · Fourier Theory

- 210 -

2 2 2 2 2
( ) ( )
i q ia q i q
d e u a e d e u
r p r r p
p p p p
- - -
· ·
± ±
÷· ÷·
+
³ ³ ³ ³
G G G G G G
B
G G G
. (2.110j)

This shows the forward or inverse two-dimensional Fourier transform of ( ) u a p +
G G
to be
2 ia q
e
r -
G G
B

multiplied by the forward or inverse two-dimensional Fourier transform of ( ) u p
G
. Similarly in
three dimensions, we have, for a three-dimensional constant vector ˆ ˆ ˆ
x y z
b xb yb zb + +
G
, that


3 2 ( )
2 ( ) 2 ( )
2
( ) ( , , )
( , , ) ,
x y z
i x y z
x y z
i b b b i x y z
ir s
d r e v r b dx dy dz e v x b y b z b
e dx dy dz e v x y z
r ç q ¸
r ç q ¸ r ç q ¸
r
· · · ·
- ± + +
÷· ÷· ÷· ÷·
· · ·
+ + ´ ´ ´ ± + +
÷· ÷· ÷·
±
+ + + +
´ ´ ´ ´ ´ ´
³ ³ ³ ³ ³ ³
³ ³ ³
B
G G G
G



where
x
x x b ´ + ,
y
y y b ´ + , and
z
z z b ´ + . This time we find that the forward or inverse three-
dimensional Fourier transform of ( ) v r b +
G
G
is
2 is b
e
r -
G
G
B
multiplied by the forward or inverse three-
dimensional Fourier transform of ( ) v r
G
,


3 2 2 3 2
( ) ( )
ir s is b ir s
d r e v r b e d r e v r
r r r - - -
· ·
± ±
÷· ÷·
+
³ ³ ³ ³ ³ ³
G
G G G G G
B
G
G G
. (2.110k)

There is also a two-dimensional and a three-dimensional version of the one-dimensional Fourier scaling theorem discussed in Sec. 2.8 above [see Eq. (2.37a)]. In two dimensions, when we have

$$V^{(\pm)}(\vec q) = \iint_{-\infty}^{\infty} d^2\rho\; e^{\pm 2\pi i\,\vec\rho\cdot\vec q}\; v(\vec\rho) \qquad (2.110l)$$

and $v(\vec\rho)$ is replaced by $v(\alpha\vec\rho)$, where α is a real scalar, then we can substitute $\vec\rho\,' = \alpha\vec\rho$ to get

$$\iint_{-\infty}^{\infty} d^2\rho\; e^{\pm 2\pi i\,\vec\rho\cdot\vec q}\; v(\alpha\vec\rho) = \frac{1}{\alpha^2}\iint_{-\infty}^{\infty} d^2\rho'\; e^{\pm 2\pi i\,\vec\rho\,'\cdot(\vec q/\alpha)}\; v(\vec\rho\,') = \frac{1}{\alpha^2}\,V^{(\pm)}\!\left(\frac{\vec q}{\alpha}\right). \qquad (2.110m)$$

Suppose there is a function of $\vec\rho$ called $u(\vec\rho)$ such that $\vec\rho$ has to change by a vector distance $\Delta\vec\rho$ whose magnitude $|\Delta\vec\rho|$ must be at least β for there to be a significant change in the value of $u(\vec\rho)$. Using the same reasoning as was applied to the one-dimensional Fourier scaling theorem [see the analysis following Eq. (2.37e)], we can show that $U^{(\pm)}(\vec q)$, the two-dimensional forward or inverse Fourier transform of u, must be negligible or zero for all vectors $\vec q$ whose magnitude $|\vec q|$ exceeds $1/\beta$. The Fourier scaling theorem in three dimensions starts with

$$V^{(\pm)}(\vec s) = \iiint_{-\infty}^{\infty} d^3r\; e^{\pm 2\pi i\,\vec r\cdot\vec s}\; v(\vec r)\,, \qquad (2.110n)$$

from which we discover, replacing $\vec r$ by $\vec r\,' = \alpha\vec r$, that

$$\iiint_{-\infty}^{\infty} d^3r\; e^{\pm 2\pi i\,\vec r\cdot\vec s}\; v(\alpha\vec r) = \frac{1}{\alpha^3}\iiint_{-\infty}^{\infty} d^3r'\; e^{\pm 2\pi i\,\vec r\,'\cdot(\vec s/\alpha)}\; v(\vec r\,') = \frac{1}{\alpha^3}\,V^{(\pm)}\!\left(\frac{\vec s}{\alpha}\right). \qquad (2.110o)$$

Again we can conclude that if there is a function $u(\vec r)$ such that $|\Delta\vec r|$ must be at least β for there to be a significant change in u, then $U^{(\pm)}(\vec s)$, the three-dimensional forward or inverse Fourier transform of u, must be negligible or zero for all vector arguments $\vec s$ whose magnitude $|\vec s|$ exceeds $1/\beta$.
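This reciprocal-width rule is easy to watch numerically. The sketch below (again Python with numpy, used only as an illustration; the helper name spectrum_width and all grid parameters are arbitrary choices, not anything defined in the text) squeezes a two-dimensional Gaussian by a factor α and confirms that the width of its discrete power spectrum grows by roughly the same factor.

    import numpy as np

    def spectrum_width(v):
        """RMS radial width of the centered power spectrum of a square 2-D array."""
        V = np.fft.fftshift(np.abs(np.fft.fft2(v))**2)
        n = v.shape[0]
        q = np.arange(n) - n // 2
        QX, QY = np.meshgrid(q, q, indexing="ij")
        r2 = QX**2 + QY**2
        return np.sqrt((r2 * V).sum() / V.sum())

    n = 256
    x = np.arange(n) - n // 2
    X, Y = np.meshgrid(x, x, indexing="ij")
    rho2 = X**2 + Y**2

    w = 20.0        # spatial 1/e width of the test Gaussian
    alpha = 2.0     # scale factor: v(rho) -> v(alpha * rho)
    v1 = np.exp(-rho2 / w**2)
    v2 = np.exp(-rho2 * alpha**2 / w**2)   # the same function of alpha*rho

    # Narrowing the function by alpha widens its spectrum by about alpha.
    print(spectrum_width(v2) / spectrum_width(v1))  # close to alpha = 2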
The two-dimensional convolution of scalar functions u(x,y) and v(x,y) is written using the symbol ∗∗ and defined to be

$$u(x,y) ** v(x,y) \equiv \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} dx'\,dy'\; u(x', y')\, v(x - x',\, y - y')\,, \qquad (2.111a)$$

or

$$u(\vec\rho) ** v(\vec\rho) \equiv \iint_{-\infty}^{\infty} d^2\rho'\; u(\vec\rho\,')\, v(\vec\rho - \vec\rho\,') \qquad (2.111b)$$

using the more concise vector notation. The vector notation may make the connection between the one- and two-dimensional convolutions in Eqs. (2.38a) and (2.111b) easier to see. The two-dimensional convolution, like the one-dimensional convolution, is both commutative and associative. Using the same type of reasoning as in the analysis in Sec. 2.9, we have for the two-dimensional functions $u(\vec\rho)$, $v(\vec\rho)$, and $h(\vec\rho)$ that

$$u(\vec\rho) ** v(\vec\rho) = \iint_{-\infty}^{\infty} d^2\rho'\; u(\vec\rho\,')\, v(\vec\rho - \vec\rho\,') = \iint_{-\infty}^{\infty} d^2\rho''\; u(\vec\rho - \vec\rho\,'')\, v(\vec\rho\,'') = v(\vec\rho) ** u(\vec\rho) \qquad (2.111c)$$

and

$$\begin{aligned}
\big[u(\vec\rho) ** v(\vec\rho)\big] ** h(\vec\rho) &= \iint_{-\infty}^{\infty} d^2\rho'\; h(\vec\rho - \vec\rho\,')\iint_{-\infty}^{\infty} d^2\rho''\; u(\vec\rho\,'')\, v(\vec\rho\,' - \vec\rho\,'')\\
&= \iint_{-\infty}^{\infty} d^2\rho''\; u(\vec\rho\,'')\iint_{-\infty}^{\infty} d^2\rho'\; h(\vec\rho - \vec\rho\,')\, v(\vec\rho\,' - \vec\rho\,'')\\
&= \iint_{-\infty}^{\infty} d^2\rho''\; u(\vec\rho\,'')\iint_{-\infty}^{\infty} d^2\rho'''\; v(\vec\rho\,''')\, h\big((\vec\rho - \vec\rho\,'') - \vec\rho\,'''\big)\\
&= u(\vec\rho) ** \big[v(\vec\rho) ** h(\vec\rho)\big]\,,
\end{aligned} \qquad (2.111d)$$

where to show that the two-dimensional convolution is commutative we make the variable substitution $\vec\rho\,'' = \vec\rho - \vec\rho\,'$ in (2.111c); and to show it is associative, we make the variable substitution $\vec\rho\,''' = \vec\rho\,' - \vec\rho\,''$ in (2.111d). The two-dimensional convolution is also linear. For any two complex constants α and β, we have

$$\begin{aligned}
u(\vec\rho) ** \big[\alpha\,v(\vec\rho) + \beta\,h(\vec\rho)\big] &= \iint_{-\infty}^{\infty} d^2\rho'\; u(\vec\rho\,')\big[\alpha\,v(\vec\rho - \vec\rho\,') + \beta\,h(\vec\rho - \vec\rho\,')\big]\\
&= \alpha\iint_{-\infty}^{\infty} d^2\rho'\; u(\vec\rho\,')\, v(\vec\rho - \vec\rho\,') + \beta\iint_{-\infty}^{\infty} d^2\rho'\; u(\vec\rho\,')\, h(\vec\rho - \vec\rho\,')\\
&= \alpha\big[u(\vec\rho) ** v(\vec\rho)\big] + \beta\big[u(\vec\rho) ** h(\vec\rho)\big]\,,
\end{aligned} \qquad (2.111e)$$

and because the two-dimensional convolution is commutative it follows that

$$\big[\alpha\,v(\vec\rho) + \beta\,h(\vec\rho)\big] ** u(\vec\rho) = \alpha\big[v(\vec\rho) ** u(\vec\rho)\big] + \beta\big[h(\vec\rho) ** u(\vec\rho)\big]\,. \qquad (2.111f)$$

It is easy to show that the Fourier convolution theorem holds true in two dimensions. We start with

$$\begin{aligned}
\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} dx\,dy\; &e^{\pm 2\pi i(\xi x + \eta y)}\,\big[u(x,y) ** v(x,y)\big]\\
&= \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} dx\,dy\; e^{\pm 2\pi i(\xi x + \eta y)}\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} dx'\,dy'\; u(x',y')\, v(x - x',\, y - y')\\
&= \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} dx'\,dy'\; u(x',y')\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} dx\,dy\; e^{\pm 2\pi i(\xi x + \eta y)}\, v(x - x',\, y - y')\,.
\end{aligned}$$

Now we replace the x, y integration variables by $x'' = x - x'$ and $y'' = y - y'$, with $dx'' = dx$ and $dy'' = dy$, so that

$$\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} dx\,dy\; e^{\pm 2\pi i(\xi x + \eta y)}\,\big[u(x,y) ** v(x,y)\big] = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} dx'\,dy'\; u(x',y')\, e^{\pm 2\pi i(\xi x' + \eta y')}\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} dx''\,dy''\; e^{\pm 2\pi i(\xi x'' + \eta y'')}\, v(x'',y'')$$

or

$$\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} dx\,dy\; e^{\pm 2\pi i(\xi x + \eta y)}\,\big[u(x,y) ** v(x,y)\big] = U^{(\pm)}(\xi,\eta)\; V^{(\pm)}(\xi,\eta)\,, \qquad (2.112a)$$

where $U^{(\pm)}$ is the two-dimensional forward or inverse Fourier transform of u,

$$U^{(\pm)}(\xi,\eta) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} dx\,dy\; e^{\pm 2\pi i(\xi x + \eta y)}\, u(x,y)\,, \qquad (2.112b)$$

and $V^{(\pm)}$ is the two-dimensional forward or inverse Fourier transform of v,

$$V^{(\pm)}(\xi,\eta) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} dx\,dy\; e^{\pm 2\pi i(\xi x + \eta y)}\, v(x,y)\,. \qquad (2.112c)$$

This gives the first half of the two-dimensional Fourier convolution theorem. To get the second half, we reverse the transform in (2.112a): if the plus sign is used in (2.112a), we take the forward two-dimensional Fourier transform of both sides, and if the minus sign is used we take the inverse two-dimensional Fourier transform of both sides. This leads to

$$\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} d\xi\,d\eta\; e^{\mp 2\pi i(\xi x + \eta y)}\; U^{(\pm)}(\xi,\eta)\, V^{(\pm)}(\xi,\eta) = u(x,y) ** v(x,y)\,, \qquad (2.113a)$$

where, reversing the transforms in Eqs. (2.112b) and (2.112c),

$$u(x,y) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} d\xi\,d\eta\; e^{\mp 2\pi i(\xi x + \eta y)}\; U^{(\pm)}(\xi,\eta) \qquad (2.113b)$$

and

$$v(x,y) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} d\xi\,d\eta\; e^{\mp 2\pi i(\xi x + \eta y)}\; V^{(\pm)}(\xi,\eta)\,. \qquad (2.113c)$$
The first half of the two-dimensional Fourier convolution theorem, Eqs. (2.112a)–(2.112c), shows that the forward or inverse two-dimensional Fourier transform of the two-dimensional convolution of two functions u and v is the product of the forward or inverse two-dimensional Fourier transforms of u and v. Because no restrictions are placed on the nature of u and v, other than that they are transformable, there are also no restrictions on the nature of their $U^{(\pm)}$ and $V^{(\pm)}$ transforms. This means we can think of $U^{(\pm)}$ and $V^{(\pm)}$ as arbitrary transformable functions. The (±) superscripts on U and V in Eqs. (2.113a)–(2.113c) then just tell us that, according to Eqs. (2.112b) and (2.112c),

$$U^{(\pm)}(\xi,\eta) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} dx\,dy\; e^{\pm 2\pi i(\xi x + \eta y)}\, u(x,y)$$

and

$$V^{(\pm)}(\xi,\eta) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} dx\,dy\; e^{\pm 2\pi i(\xi x + \eta y)}\, v(x,y)\,.$$

We already know this, however, from looking at Eqs. (2.113b) and (2.113c)—just take the opposite-sign Fourier transform of both sides. Hence, we can drop the (±) superscripts on U and V in Eqs. (2.113a)–(2.113c) as long as (∓) superscripts are added to u and v to distinguish between the two choices of sign in (2.113b) and (2.113c). Now Eqs. (2.113a)–(2.113c) become

$$\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} d\xi\,d\eta\; e^{\mp 2\pi i(\xi x + \eta y)}\; U(\xi,\eta)\, V(\xi,\eta) = u^{(\mp)}(x,y) ** v^{(\mp)}(x,y)\,, \qquad (2.114a)$$

where

$$u^{(\mp)}(x,y) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} d\xi\,d\eta\; e^{\mp 2\pi i(\xi x + \eta y)}\; U(\xi,\eta) \qquad (2.114b)$$

and

$$v^{(\mp)}(x,y) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} d\xi\,d\eta\; e^{\mp 2\pi i(\xi x + \eta y)}\; V(\xi,\eta)\,. \qquad (2.114c)$$

The letters used to label the functions and variables are, of course, arbitrary, so nothing stops us from interchanging the letters u and U, v and V, x and ξ, y and η, and the vertical order of the ± signs to get

$$\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} dx\,dy\; e^{\pm 2\pi i(\xi x + \eta y)}\; u(x,y)\, v(x,y) = U^{(\pm)}(\xi,\eta) ** V^{(\pm)}(\xi,\eta)\,, \qquad (2.115a)$$

where

$$U^{(\pm)}(\xi,\eta) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} dx\,dy\; e^{\pm 2\pi i(\xi x + \eta y)}\, u(x,y) \qquad (2.115b)$$

and

$$V^{(\pm)}(\xi,\eta) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} dx\,dy\; e^{\pm 2\pi i(\xi x + \eta y)}\, v(x,y)\,. \qquad (2.115c)$$

Equations (2.115a)–(2.115c) are the other half of the two-dimensional Fourier convolution
theorem—they show that the forward or inverse two-dimensional Fourier transform of the
product of two functions u and v is the two-dimensional convolution of the forward or inverse
two-dimensional Fourier transforms of u and v.
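Because the discrete Fourier transform obeys the same convolution theorem with cyclic (circular) convolution taking the place of Eq. (2.111a), both halves of the theorem can be spot-checked in a few lines. The sketch below is a Python/numpy illustration rather than anything used later in the text; it compares a directly summed two-dimensional circular convolution against the inverse FFT of the product of FFTs.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 32
    u = rng.standard_normal((n, n))
    v = rng.standard_normal((n, n))

    # Direct 2-D circular convolution:
    # conv[x, y] = sum over x', y' of u[x', y'] * v[(x - x') mod n, (y - y') mod n].
    conv = np.zeros((n, n))
    for xp in range(n):
        for yp in range(n):
            conv += u[xp, yp] * np.roll(np.roll(v, xp, axis=0), yp, axis=1)

    # Convolution theorem: FFT(u ** v) = FFT(u) * FFT(v), so
    # u ** v = IFFT(FFT(u) * FFT(v)).
    conv_fft = np.fft.ifft2(np.fft.fft2(u) * np.fft.fft2(v)).real
    print("max |difference| =", np.abs(conv - conv_fft).max())  # rounding error only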
The three-dimensional convolution is written using the symbol ∗∗∗ and defined to be

$$u(x,y,z) *\!*\!* v(x,y,z) \equiv \int\!\!\int\!\!\int_{-\infty}^{\infty} dx'\,dy'\,dz'\; u(x',y',z')\, v(x - x',\, y - y',\, z - z') \qquad (2.116a)$$

or

$$u(\vec r) *\!*\!* v(\vec r) \equiv \iiint_{-\infty}^{\infty} d^3r'\; u(\vec r\,')\, v(\vec r - \vec r\,')\,. \qquad (2.116b)$$

Using three-dimensional vector notation, the three-dimensional convolution has the same commutative, associative, and linearity properties as the two-dimensional convolution, as can be seen by returning to Eqs. (2.111c)–(2.111f), mentally adding an extra ∗, an extra integral sign, and replacing all the superscript 2's by superscript 3's:

$$u(\vec\rho) *\!*\!* v(\vec\rho) = v(\vec\rho) *\!*\!* u(\vec\rho)\,, \qquad (2.117a)$$

$$\big[u(\vec\rho) *\!*\!* v(\vec\rho)\big] *\!*\!* h(\vec\rho) = u(\vec\rho) *\!*\!* \big[v(\vec\rho) *\!*\!* h(\vec\rho)\big]\,, \qquad (2.117b)$$

$$u(\vec\rho) *\!*\!* \big[\alpha\,v(\vec\rho) + \beta\,h(\vec\rho)\big] = \alpha\big[u(\vec\rho) *\!*\!* v(\vec\rho)\big] + \beta\big[u(\vec\rho) *\!*\!* h(\vec\rho)\big]\,, \qquad (2.117c)$$

and

$$\big[\alpha\,v(\vec\rho) + \beta\,h(\vec\rho)\big] *\!*\!* u(\vec\rho) = \alpha\big[v(\vec\rho) *\!*\!* u(\vec\rho)\big] + \beta\big[h(\vec\rho) *\!*\!* u(\vec\rho)\big]\,. \qquad (2.117d)$$
Looking carefully at the variable manipulations used to derive Eqs. (2.112a)–(2.112c), the first half of the two-dimensional Fourier convolution theorem, we see that working with an extra product ζz in the exponent of e and an extra integration over dz does not affect the end result. We can therefore say that

$$\int\!\!\int\!\!\int_{-\infty}^{\infty} dx\,dy\,dz\; e^{\pm 2\pi i(\xi x + \eta y + \zeta z)}\,\big[u(x,y,z) *\!*\!* v(x,y,z)\big] = U^{(\pm)}(\xi,\eta,\zeta)\; V^{(\pm)}(\xi,\eta,\zeta)\,, \qquad (2.118a)$$

where

$$U^{(\pm)}(\xi,\eta,\zeta) = \int\!\!\int\!\!\int_{-\infty}^{\infty} dx\,dy\,dz\; e^{\pm 2\pi i(\xi x + \eta y + \zeta z)}\, u(x,y,z) \qquad (2.118b)$$

and

$$V^{(\pm)}(\xi,\eta,\zeta) = \int\!\!\int\!\!\int_{-\infty}^{\infty} dx\,dy\,dz\; e^{\pm 2\pi i(\xi x + \eta y + \zeta z)}\, v(x,y,z)\,. \qquad (2.118c)$$

The argument about relabeling the functions and variables used to go from (2.112a)–(2.112c) to (2.115a)–(2.115c) works equally well here, giving us at once the other half of the three-dimensional Fourier convolution theorem,

$$\int\!\!\int\!\!\int_{-\infty}^{\infty} dx\,dy\,dz\; e^{\pm 2\pi i(\xi x + \eta y + \zeta z)}\; u(x,y,z)\, v(x,y,z) = U^{(\pm)}(\xi,\eta,\zeta) *\!*\!* V^{(\pm)}(\xi,\eta,\zeta)\,, \qquad (2.119a)$$

where

$$U^{(\pm)}(\xi,\eta,\zeta) = \int\!\!\int\!\!\int_{-\infty}^{\infty} dx\,dy\,dz\; e^{\pm 2\pi i(\xi x + \eta y + \zeta z)}\, u(x,y,z) \qquad (2.119b)$$

and

$$V^{(\pm)}(\xi,\eta,\zeta) = \int\!\!\int\!\!\int_{-\infty}^{\infty} dx\,dy\,dz\; e^{\pm 2\pi i(\xi x + \eta y + \zeta z)}\, v(x,y,z)\,. \qquad (2.119c)$$

One last matter of notation worth mentioning is that we can create two-dimensional and three-dimensional delta functions from the products of the already-discussed one-dimensional delta function:

$$\delta(\vec\rho) = \delta(x)\,\delta(y) \qquad (2.120a)$$

and

$$\delta(\vec r) = \delta(x)\,\delta(y)\,\delta(z)\,. \qquad (2.120b)$$

For any two-dimensional continuous function u(x,y), we have

$$\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} dx\,dy\; u(x,y)\,\delta(x - x_o)\,\delta(y - y_o) = \int_{-\infty}^{\infty} dx\;\delta(x - x_o)\int_{-\infty}^{\infty} dy\; u(x,y)\,\delta(y - y_o) = \int_{-\infty}^{\infty} dx\;\delta(x - x_o)\, u(x, y_o) = u(x_o, y_o)\,; \qquad (2.121a)$$

and similarly for any continuous three-dimensional function v(x,y,z), we have

$$\begin{aligned}
\int\!\!\int\!\!\int_{-\infty}^{\infty} dx\,dy\,dz\; v(x,y,z)\,&\delta(x - x_o)\,\delta(y - y_o)\,\delta(z - z_o)\\
&= \int_{-\infty}^{\infty} dx\;\delta(x - x_o)\int_{-\infty}^{\infty} dy\;\delta(y - y_o)\int_{-\infty}^{\infty} dz\; v(x,y,z)\,\delta(z - z_o)\\
&= \int_{-\infty}^{\infty} dx\;\delta(x - x_o)\int_{-\infty}^{\infty} dy\; v(x, y, z_o)\,\delta(y - y_o) = v(x_o, y_o, z_o)\,.
\end{aligned} \qquad (2.121b)$$

These equations can be written in vector notation as

$$\iint_{-\infty}^{\infty} d^2\rho\; u(\vec\rho)\,\delta(\vec\rho - \vec\rho_o) = u(\vec\rho_o) \qquad (2.121c)$$

and

$$\iiint_{-\infty}^{\infty} d^3r\; v(\vec r)\,\delta(\vec r - \vec r_o) = v(\vec r_o)\,. \qquad (2.121d)$$

Combining Eq. (2.71f) for the one-dimensional delta function with Eqs. (2.120a) and (2.120b), we see that in two dimensions

$$\delta(\vec\rho) = \delta(x)\,\delta(y) = \left[\int_{-\infty}^{\infty} d\xi\; e^{\pm 2\pi i x\xi}\right]\left[\int_{-\infty}^{\infty} d\eta\; e^{\pm 2\pi i y\eta}\right] = \iint_{-\infty}^{\infty} d^2q\; e^{\pm 2\pi i\,\vec\rho\cdot\vec q} \qquad (2.122a)$$

using the vector notation $\vec q = \hat x\,\xi + \hat y\,\eta$; and in three dimensions

$$\delta(\vec r) = \delta(x)\,\delta(y)\,\delta(z) = \left[\int_{-\infty}^{\infty} d\xi\; e^{\pm 2\pi i x\xi}\right]\left[\int_{-\infty}^{\infty} d\eta\; e^{\pm 2\pi i y\eta}\right]\left[\int_{-\infty}^{\infty} d\zeta\; e^{\pm 2\pi i z\zeta}\right] = \iiint_{-\infty}^{\infty} d^3s\; e^{\pm 2\pi i\,\vec r\cdot\vec s} \qquad (2.122b)$$

using the vector notation $\vec s = \hat x\,\xi + \hat y\,\eta + \hat z\,\zeta$.

__________


This chapter provides both an intuitive understanding and a rigorous explanation of how
Fourier transforms work. Sine and cosine transforms are introduced as a way to measure how
much functions resemble sine and cosine curves, and these transforms are then combined to
create the standard complex Fourier transform. We describe convolutions and how they produce
new functions by blurring old ones. The Fourier convolution theorem—whose importance is
difficult to overstate—directly connects the convolution to Fourier-transform theory. Generalized
limits are explained to show in what sense some of the more puzzling functions found in lists of
Fourier transforms belong there, and a brief outline of generalized functions is presented to show
how delta functions can be described without making them sound like obvious nonsense.
Computers use discrete Fourier transforms to handle Fourier calculations, and we explain how
the discrete Fourier transform can be used to approximate the integral Fourier transform. The
discrete Fourier transform produces aliasing; we show when aliasing is desirable, when it is not
desirable, and when it can be neglected. All the major concepts explained in this chapter—the
linearity of the Fourier transform, the linearity of the convolution, the Fourier convolution
theorem, the idea of even and odd functions, and the delta function—have important roles to play
in the pages that follow.

Table 2.1

Transform pair:  $U(f) = \mathfrak{F}^{(-ift)}\big(u(t)\big)$   and   $u(t) = \mathfrak{F}^{(ift)}\big(U(f)\big)$

(1)  U real and even:  Im(U(f)) = 0,  U(−f) = U(f)
     ↔  u real and even:  Im(u(t)) = 0,  u(−t) = u(t)

(2)  U imaginary and even:  Re(U(f)) = 0,  U(−f) = U(f)
     ↔  u imaginary and even:  Re(u(t)) = 0,  u(−t) = u(t)

(3)  U real and odd:  Im(U(f)) = 0,  U(−f) = −U(f)
     ↔  u imaginary and odd:  Re(u(t)) = 0,  u(−t) = −u(t)

(4)  U imaginary and odd:  Re(U(f)) = 0,  U(−f) = −U(f)
     ↔  u real and odd:  Im(u(t)) = 0,  u(−t) = −u(t)

(5)  U complex and even:  Re(U(f)) ≠ 0 for some f,  Im(U(f)) ≠ 0 for some f,  U(−f) = U(f)
     ↔  u complex and even:  Re(u(t)) ≠ 0 for some t,  Im(u(t)) ≠ 0 for some t,  u(−t) = u(t)

(6)  U complex and odd:  Re(U(f)) ≠ 0 for some f,  Im(U(f)) ≠ 0 for some f,  U(−f) = −U(f)
     ↔  u complex and odd:  Re(u(t)) ≠ 0 for some t,  Im(u(t)) ≠ 0 for some t,  u(−t) = −u(t)

(7)  U Hermitian:  U(−f) = U(f)*
     ↔  u real:  Im(u(t)) = 0

(8)  U real:  Im(U(f)) = 0
     ↔  u Hermitian:  u(−t) = u(t)*

(9)  U anti-Hermitian:  U(−f) = −U(f)*
     ↔  u imaginary:  Re(u(t)) = 0

(10) U imaginary:  Re(U(f)) = 0
     ↔  u anti-Hermitian:  u(−t) = −u(t)*

(11) U complex with no symmetry
     ↔  u complex with no symmetry
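The pairings in Table 2.1 carry over to the discrete Fourier transform, where any particular row can be verified numerically. The sketch below (Python with numpy, offered only as an illustration and not part of the text) builds a real, even periodic sequence and confirms that its DFT is real and even to rounding error, as row (1) requires.

    import numpy as np

    n = 64
    t = np.arange(n)
    # A real, even sequence on the periodic grid: u[t] = u[(-t) mod n].
    u = np.cos(2 * np.pi * 3 * t / n) + 0.5 * np.cos(2 * np.pi * 7 * t / n)

    U = np.fft.fft(u)
    # Row (1) of Table 2.1: the transform should be real and even.
    print("max |Im(U)|        =", np.abs(U.imag).max())           # ~1e-14
    print("max |U(-f) - U(f)| =", np.abs(U - U[(-t) % n]).max())  # ~1e-14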
Table 2.2

Fourier-series pair for a function v(t) of period T:

$$A_k = \frac{1}{T}\int_0^T v(t)\, e^{-\frac{2\pi i k t}{T}}\, dt \qquad\text{and}\qquad v(t) = \sum_{k=-\infty}^{\infty} A_k\, e^{\frac{2\pi i k t}{T}}$$

(1)  A_k real and even:  Im(A_k) = 0,  A_{−k} = A_k
     ↔  v real and even:  Im(v(t)) = 0,  v(−t) = v(t)

(2)  A_k imaginary and even:  Re(A_k) = 0,  A_{−k} = A_k
     ↔  v imaginary and even:  Re(v(t)) = 0,  v(−t) = v(t)

(3)  A_k real and odd:  Im(A_k) = 0,  A_{−k} = −A_k
     ↔  v imaginary and odd:  Re(v(t)) = 0,  v(−t) = −v(t)

(4)  A_k imaginary and odd:  Re(A_k) = 0,  A_{−k} = −A_k
     ↔  v real and odd:  Im(v(t)) = 0,  v(−t) = −v(t)

(5)  A_k complex and even:  Re(A_k) ≠ 0 for some k,  Im(A_k) ≠ 0 for some k,  A_{−k} = A_k
     ↔  v complex and even:  Re(v(t)) ≠ 0 for some t,  Im(v(t)) ≠ 0 for some t,  v(−t) = v(t)

(6)  A_k complex and odd:  Re(A_k) ≠ 0 for some k,  Im(A_k) ≠ 0 for some k,  A_{−k} = −A_k
     ↔  v complex and odd:  Re(v(t)) ≠ 0 for some t,  Im(v(t)) ≠ 0 for some t,  v(−t) = −v(t)

(7)  A_k Hermitian:  A_{−k} = A_k*
     ↔  v real:  Im(v(t)) = 0

(8)  A_k real:  Im(A_k) = 0
     ↔  v Hermitian:  v(−t) = v(t)*

(9)  A_k anti-Hermitian:  A_{−k} = −A_k*
     ↔  v imaginary:  Re(v(t)) = 0

(10) A_k imaginary:  Re(A_k) = 0
     ↔  v anti-Hermitian:  v(−t) = −v(t)*

(11) A_k complex with no symmetry
     ↔  v complex with no symmetry
3
RANDOM VARIABLES, RANDOM
FUNCTIONS, AND POWER SPECTRA
Engineers and scientists are taught many statistical concepts in school, but all too often this is
done in an informal manner that does a good job of explaining how to eliminate random errors
and noise from real experimental data and a poor job of explaining how to analyze random errors
and noise in physical models. Understanding the correct way to represent random errors and
noise requires formal knowledge of the statistical concepts used to describe random signals;
otherwise, basic equations can be misunderstood and misused. For this reason, we here take a
more formal approach to the subject. Starting off with an explanation of the basics—random
functions, independent and dependent random variables, the expectation operator E, stationarity
and ergodicity—that do not require the Fourier theory discussed in the previous chapter, we then
move on to topics that do, such as autocorrelation functions, white noise, the noise-power
spectrum, and the Wiener-Khinchin theorem. The techniques explained in this chapter are used a
few times in the next chapter during the derivation of the Michelson interference equations and
then over and over again in Chapters 6, 7, and 8 to analyze the random errors and noise found in
Michelson systems.
3.1 Random and Nonrandom Variables
Random variables can be thought of as uncontrolled variables and nonrandom variables can be
thought of as controlled variables. When, for example, a computer program is being written, the
programmer controls the values of nonrandom program variables using inputs or lines of code,
but the programmer has no desire to control the program’s random variables—a pseudo-random
number generator gives them values instead. In a similar spirit, a statistician constructing a set of
model equations always ends up controlling the nonrandom variables—either directly by saying
this variable can be measured like this and that variable can be measured like that, or indirectly,
by saying these variables must solve that set of equations. Even when a statistician plots a
function against its argument, the graph is constructed by specifying the argument’s values and
then calculating the function according to its definition, which puts both the nonrandom argument
and the nonrandom value of the function under the statistician’s control. The statistician always,
on the other hand, treats random variables in a model as if they cannot be controlled. They must
be handled as if coins will be flipped, dice rolled, or needles spun on dials to determine their
values after the model is written down. All the statistician can know is the probability this
random variable takes on that value and the probability that random variable takes on this value;
that is, he knows what the chances are that the coins, dice, or needles return one set of numbers
rather than another. Most scientists and engineers do not pay much attention to the difference
between controlled and uncontrolled variables—perhaps because most of their “controlled”
variables are usually a little “uncontrolled” in the sense that they come from imperfectly accurate
measurements—but it is very convenient when analyzing a statistical model to keep careful track
of this distinction. To help us remember which variables are random and which are not, we put a
wavy line or tilde over the random variables while writing the nonrandom variables in the usual
way. As an example of how this looks, we note that u, $a_0$, and $z'$ are all nonrandom variables whereas $\tilde u$, $\tilde a_0$, and $\tilde z'$ are all random.
3.2 Random and Nonrandom Functions
When the argument of a function is a random variable, the value of the function is also random.
If, for example, $\tilde x$ is a random variable and f is a function, then

$$\tilde y = f(\tilde x) \qquad (3.1a)$$

is another random variable. To give an example of how this works, we create a nonrandom time variable t and a random angular frequency $\tilde\omega$, multiply them together, and take the sine of their product to get

$$\tilde y = \sin(\tilde\omega t)\,. \qquad (3.1b)$$

The value of $\tilde y$ is clearly uncontrolled; for each unpredictable value of $\tilde\omega$ at time t, there is a corresponding unpredictable number $\tilde y$ that is given by $\sin(\tilde\omega t)$. This example also shows that when a function has several arguments, its value becomes random when only one of the arguments is random. In Eq. (3.1b) the sine of $\tilde\omega t$, regarded as a function of both $\tilde\omega$ and t, is random even though only one of its arguments, $\tilde\omega$, is random.

Many times when a function has multiple arguments, the controlled argument or arguments are more interesting than the uncontrolled argument or arguments that make the function random. One way to handle this situation is to list only the nonrandom arguments and say that what we have is a random function with nonrandom arguments. To show what is going on, we put a wavy line over the function name, indicating that even though all the listed arguments are nonrandom, the function itself is random. If, for example, we are only interested in the nonrandom time t, we could define

$$\tilde R(t) = \sin(\tilde\omega t) \qquad (3.2a)$$

to be a random function of the nonrandom variable t. Now whenever there is a list of time values $t_1, t_2, \ldots$, there is a corresponding list of random variables

$$\tilde u_1 = \tilde R(t_1) = \sin(\tilde\omega t_1)\,,\quad \tilde u_2 = \tilde R(t_2) = \sin(\tilde\omega t_2)\,,\quad\ldots \qquad (3.2b)$$

Although Eq. (3.2b) implicitly assumes a list of distinct and separate t values, this reasoning still holds up when t is explicitly made a continuous variable. Nothing, for example, stops us from saying that for each value of t between −∞ and +∞, there corresponds a different random variable

$$\tilde u_t = \tilde R(t) = \sin(\tilde\omega t)\,. \qquad (3.2c)$$

The idea of a random function of nonrandom arguments becomes more attractive when there is no realistic possibility of analyzing the effect of multiple random arguments on a single nonrandom function. We might, for example, know exactly how N random parameters $\tilde r_1, \tilde r_2, \ldots, \tilde r_N$ interact to cause an error e in an electrical signal s at time t. This lets us write the error as a nonrandom function

$$e(t, \tilde r_1, \tilde r_2, \ldots, \tilde r_N)\,.$$

Rather than investigating how $\tilde r_1, \tilde r_2, \ldots, \tilde r_N$ are behaving, it usually makes more sense to say that there is a random noise

$$\tilde n(t) = e(t, \tilde r_1, \tilde r_2, \ldots, \tilde r_N) \qquad (3.3a)$$

contaminating electrical signal s. Now we can put the error into our model as a random function ñ that depends on a nonrandom parameter t instead of as a nonrandom function e that depends on t and N random parameters $\tilde r_1, \tilde r_2, \ldots, \tilde r_N$. Sometimes the signal s in our model depends on more than one nonrandom parameter, such as the x, y coordinates of an image point at time t. If the corresponding error e in the signal s depends on x, y, and t as well as the random parameters $\tilde r_1, \tilde r_2, \ldots, \tilde r_N$, then we can say there is a random noise

$$\tilde n(x, y, t) = e(x, y, t, \tilde r_1, \tilde r_2, \ldots, \tilde r_N) \qquad (3.3b)$$

contaminating signal s(x, y, t). Note that we can think in terms of a signal noise ñ(t) or ñ(x, y, t) even when we are not sure what random arguments $\tilde r_1, \tilde r_2, \ldots, \tilde r_N$ make the nonrandom function e behave randomly. This is, of course, why the idea of a random function is so useful. In this book, we use the term “random function” to refer to what statisticians often prefer to call a random or stochastic process.
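A short simulation may make the idea concrete. In the sketch below (Python with numpy; the uniform distribution chosen for $\tilde\omega$ is an arbitrary assumption made only for illustration), each draw of the random frequency produces one complete realization of the random function $\tilde R(t)$, indexed by the nonrandom time grid t.

    import numpy as np

    rng = np.random.default_rng(1)
    t = np.linspace(0.0, 10.0, 501)   # nonrandom argument

    # Each draw of the random angular frequency omega~ fixes one complete
    # realization of the random function R~(t) = sin(omega~ * t).
    for trial in range(3):
        omega = rng.uniform(0.5, 2.0)   # assumed distribution for omega~
        R = np.sin(omega * t)           # one realization over all t
        print(f"realization {trial}: omega = {omega:.3f}, "
              f"R(2.0) = {np.interp(2.0, t, R):+.3f}")

Each pass through the loop plays the role of one value of $\tilde\omega$ drawn "after the model is written down"; the whole curve R is the random variable $\tilde u_t$ of Eq. (3.2c) evaluated at every grid time at once.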
3.3 Probability Density Distributions: Mean, Variance, Standard Deviation
With every random variable $\tilde r$, we associate a nonrandom probability density distribution $p_{\tilde r}(x)$ such that $p_{\tilde r}(x)\,dx$ is the probability that the random variable $\tilde r$ takes on a value between x and x + dx. The nonrandom argument x of $p_{\tilde r}$ is a dummy variable, and nothing stops us from calling it r instead—in fact, that is the convention. The usual way to introduce a probability density distribution for a random variable $\tilde r$ is to say that $p_{\tilde r}(r)\,dr$ is the probability that $\tilde r$ takes on a value between r and r + dr. The dummy argument of a probability density distribution p must be nonrandom, and the subscript of the probability density distribution p must be random—the subscript, after all, labels p to show which random variable is being described. Since $\tilde r$ must always take on some sort of value between −∞ and +∞, the sum of all the probabilities $p_{\tilde r}(r)\,dr$ between −∞ and +∞ must always be one. Consequently, for any probability density distribution $p_{\tilde r}(r)$, we have

$$\int_{-\infty}^{\infty} p_{\tilde r}(r)\,dr = 1\,. \qquad (3.4)$$

For Eq. (3.4) to make sense, the probability density distribution $p_{\tilde r}(r)$ must be defined for all r between −∞ and +∞ with the understanding that $p_{\tilde r}(r) = 0$ for those values of r to which the random variable $\tilde r$ can never be equal.

The predicted average or mean value of $\tilde r$ can be written as

$$\mu_{\tilde r} = \int_{-\infty}^{\infty} p_{\tilde r}(r)\, r\, dr\,. \qquad (3.5a)$$

Note that $\mu_{\tilde r}$, just like $p_{\tilde r}$, is nonrandom even though it has a random subscript. The predicted variance of $\tilde r$, which is defined to be the predicted average or mean squared difference between $\tilde r$ and $\mu_{\tilde r}$, is another nonrandom quantity,

$$v_{\tilde r} = \int_{-\infty}^{\infty} p_{\tilde r}(r)\,(r - \mu_{\tilde r})^2\, dr\,. \qquad (3.5b)$$

Many people prefer to characterize a random number $\tilde r$ by its standard deviation $\sigma_{\tilde r}$ instead of its variance $v_{\tilde r}$. The standard deviation of a random number $\tilde r$ is defined to be the square root of the variance,

$$\sigma_{\tilde r} = \sqrt{v_{\tilde r}}\,. \qquad (3.5c)$$

Of course $\sigma_{\tilde r}$, like $v_{\tilde r}$, is a nonrandom quantity. In general, the probability density distribution $p_{\tilde r}$ lets us find the predicted average or mean value of any nonrandom function f of the random variable $\tilde r$ by calculating the nonrandom quantity

$$\text{predicted mean value of } f(\tilde r) = \int_{-\infty}^{\infty} p_{\tilde r}(r)\, f(r)\, dr\,. \qquad (3.5d)$$

When f(r) = r, this equation reduces to formula (3.5a) for $\mu_{\tilde r}$; and when $f(r) = (r - \mu_{\tilde r})^2$, this equation reduces to formula (3.5b) for $v_{\tilde r}$.
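Formulas (3.4) through (3.5d) are straightforward to evaluate numerically for any concrete density. The sketch below (Python with numpy, an illustration only) uses the standard exponential density p(r) = e^(−r) for r ≥ 0, whose mean, variance, and standard deviation all equal one, and recovers those values by direct numerical integration.

    import numpy as np

    # Probability density of an exponentially distributed random variable:
    # p(r) = exp(-r) for r >= 0 and p(r) = 0 otherwise (a standard example).
    dr = 0.001
    r = np.arange(0.0, 40.0, dr)
    p = np.exp(-r)

    print("Eq. (3.4) :", (p * dr).sum())                # ~ 1.0 (normalization)
    mu = (p * r * dr).sum()                             # Eq. (3.5a): ~ 1.0
    v = (p * (r - mu)**2 * dr).sum()                    # Eq. (3.5b): ~ 1.0
    print("mean      :", mu)
    print("variance  :", v, " std dev:", np.sqrt(v))    # Eq. (3.5c): ~ 1.0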
Many random variables found in nature appear to obey a Gaussian, or “normal,” probability distribution:

$$p_{\tilde r}(r) = \frac{1}{\sigma_{\tilde r}\sqrt{2\pi}}\; e^{-\frac{(r - \mu_{\tilde r})^2}{2\sigma_{\tilde r}^2}}\,. \qquad (3.6a)$$

This can in part be explained as a consequence of the central limit theorem,²⁵ which is described in Sec. 3.11 below. It is easy to show that parameter $\mu_{\tilde r}$ in Eq. (3.6a) is the mean of the Gaussian distribution. Consulting formula (3.5a) above, we see that the mean of the distribution in (3.6a) must be

$$\frac{1}{\sigma_{\tilde r}\sqrt{2\pi}}\int_{-\infty}^{\infty} r\; e^{-\frac{(r - \mu_{\tilde r})^2}{2\sigma_{\tilde r}^2}}\,dr = \frac{1}{\sigma_{\tilde r}\sqrt{2\pi}}\int_{-\infty}^{\infty} (r' + \mu_{\tilde r})\; e^{-\frac{r'^2}{2\sigma_{\tilde r}^2}}\,dr'\,, \qquad (3.6b)$$

where on the right-hand side the variable of integration is changed to $r' = r - \mu_{\tilde r}$. This becomes, consulting Eq. (7A.3d) in Appendix 7A of Chapter 7,

$$\frac{1}{\sigma_{\tilde r}\sqrt{2\pi}}\int_{-\infty}^{\infty} (r' + \mu_{\tilde r})\; e^{-\frac{r'^2}{2\sigma_{\tilde r}^2}}\,dr' = \frac{1}{\sigma_{\tilde r}\sqrt{2\pi}}\int_{-\infty}^{\infty} r'\; e^{-\frac{r'^2}{2\sigma_{\tilde r}^2}}\,dr' + \mu_{\tilde r}\,\frac{1}{\sigma_{\tilde r}\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-\frac{r'^2}{2\sigma_{\tilde r}^2}}\,dr' = \frac{1}{\sigma_{\tilde r}\sqrt{2\pi}}\int_{-\infty}^{\infty} r'\; e^{-\frac{r'^2}{2\sigma_{\tilde r}^2}}\,dr' + \mu_{\tilde r}\cdot 1\,. \qquad (3.6c)$$

If we replace r′ by −r′ in

$$g(r') = r'\, e^{-\frac{r'^2}{2\sigma_{\tilde r}^2}}\,,$$

it is the same as multiplying g by −1, which makes g an odd function [see Eq. (2.11b) in Chapter 2]. Hence, according to Eq. (2.17) in Chapter 2,

$$\int_{-\infty}^{\infty} r'\, e^{-\frac{r'^2}{2\sigma_{\tilde r}^2}}\,dr' = 0$$

because it is the integral of an odd function between −∞ and +∞. Therefore, Eq. (3.6c) simplifies to

$$\frac{1}{\sigma_{\tilde r}\sqrt{2\pi}}\int_{-\infty}^{\infty} (r' + \mu_{\tilde r})\; e^{-\frac{r'^2}{2\sigma_{\tilde r}^2}}\,dr' = \mu_{\tilde r}\,, \qquad (3.6d)$$

which can be substituted back into (3.6b) to get

$$\frac{1}{\sigma_{\tilde r}\sqrt{2\pi}}\int_{-\infty}^{\infty} r\; e^{-\frac{(r - \mu_{\tilde r})^2}{2\sigma_{\tilde r}^2}}\,dr = \mu_{\tilde r}\,. \qquad (3.6e)$$

This shows that, as claimed above, parameter $\mu_{\tilde r}$ is the mean of the probability distribution specified in Eq. (3.6a). It is just as easy to show that $\sigma_{\tilde r}$ is the standard deviation of the distribution in (3.6a). From (3.5b) we know that the variance of this distribution is

$$\frac{1}{\sigma_{\tilde r}\sqrt{2\pi}}\int_{-\infty}^{\infty} (r - \mu_{\tilde r})^2\; e^{-\frac{(r - \mu_{\tilde r})^2}{2\sigma_{\tilde r}^2}}\,dr = \frac{1}{\sigma_{\tilde r}\sqrt{2\pi}}\int_{-\infty}^{\infty} r'^2\; e^{-\frac{r'^2}{2\sigma_{\tilde r}^2}}\,dr'$$

when the variable of integration is changed to $r' = r - \mu_{\tilde r}$. According to Eq. (7A.3b) in Appendix 7A of Chapter 7, we can write

$$\frac{1}{\sigma_{\tilde r}\sqrt{2\pi}}\int_{-\infty}^{\infty} r'^2\; e^{-\frac{r'^2}{2\sigma_{\tilde r}^2}}\,dr' = \sigma_{\tilde r}^2\,. \qquad (3.6f)$$

Consequently, $\sigma_{\tilde r}^2$ is the variance of this probability density distribution. The square root of the variance is the standard deviation according to (3.5c). Hence, it is, as claimed, easy to see that $\sigma_{\tilde r}$ is the standard deviation of the probability density distribution in Eq. (3.6a).

    ²⁵ Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, 3rd ed. (McGraw-Hill, Inc., New York, 1991), p. 214.
When $\tilde r$ can only take on the values $r_1, r_2, \ldots, r_N$, then $p_{\tilde r}$ can be written as a sum of delta functions. If, for example, $p_1$ is the probability that $\tilde r$ is $r_1$, $p_2$ is the probability that $\tilde r$ is $r_2$, …, $p_N$ is the probability that $\tilde r$ is $r_N$, then

$$p_{\tilde r}(r) = \sum_{k=1}^{N} p_k\cdot\delta(r - r_k)\,. \qquad (3.7a)$$

The integral for the predicted mean value of $\tilde r$ in Eq. (3.5a) now reduces to

$$\mu_{\tilde r} = \int_{-\infty}^{\infty}\Big[\sum_{k=1}^{N} p_k\,\delta(r - r_k)\Big]\, r\, dr = \sum_{k=1}^{N} p_k\int_{-\infty}^{\infty} r\,\delta(r - r_k)\, dr = \sum_{k=1}^{N} p_k\, r_k\,, \qquad (3.7b)$$

as we expect. Similarly, according to Eq. (3.5b), the predicted variance of $\tilde r$ becomes

$$v_{\tilde r} = \int_{-\infty}^{\infty}\Big[\sum_{k=1}^{N} p_k\,\delta(r - r_k)\Big]\,(r - \mu_{\tilde r})^2\, dr = \sum_{k=1}^{N} p_k\int_{-\infty}^{\infty} (r - \mu_{\tilde r})^2\,\delta(r - r_k)\, dr = \sum_{k=1}^{N} p_k\,(r_k - \mu_{\tilde r})^2\,; \qquad (3.7c)$$

and, according to Eq. (3.5d), the predicted mean value of $f(\tilde r)$ becomes

$$\int_{-\infty}^{\infty}\Big[\sum_{k=1}^{N} p_k\,\delta(r - r_k)\Big]\, f(r)\, dr = \sum_{k=1}^{N} p_k\int_{-\infty}^{\infty} f(r)\,\delta(r - r_k)\, dr = \sum_{k=1}^{N} p_k\, f(r_k)\,. \qquad (3.7d)$$

Again, the integral formulas reduce to the correct probability-weighted sums. Looking at the limiting case where N = 1 and $p_1 = 1$, we get

$$p_{\tilde r}(r) = \delta(r - r_1)$$

so that

$$\mu_{\tilde r} = \int_{-\infty}^{\infty} r\,\delta(r - r_1)\, dr = r_1 \qquad (3.7e)$$

and the variance about $\mu_{\tilde r} = r_1$ is

$$v_{\tilde r} = \int_{-\infty}^{\infty} (r - r_1)^2\,\delta(r - r_1)\, dr = (r_1 - r_1)^2 = 0\,. \qquad (3.7f)$$

Results (3.7e) and (3.7f) show that the value of $\tilde r$ is now completely controlled; it must be equal to $r_1$ and no longer needs to be treated like a random variable. Hence, the limiting case where N = 1 and $p_1 = 1$ can be regarded as changing a random variable into a nonrandom variable.
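The probability-weighted sums in Eqs. (3.7b) and (3.7c) can be checked against brute-force sampling. The sketch below (Python with numpy; the fair die is just a convenient discrete example, not anything used later in the text) computes the mean and variance both ways.

    import numpy as np

    # A fair six-sided die: r_k = 1..6, each with probability p_k = 1/6.
    rk = np.arange(1, 7, dtype=float)
    pk = np.full(6, 1.0 / 6.0)

    mu = (pk * rk).sum()                # Eq. (3.7b): 3.5
    v = (pk * (rk - mu)**2).sum()       # Eq. (3.7c): 35/12 ~ 2.9167

    # Sample estimates converge to the same values.
    rng = np.random.default_rng(2)
    samples = rng.choice(rk, size=1_000_000, p=pk)
    print(mu, samples.mean())           # 3.5 vs ~3.5
    print(v, samples.var())             # 2.9167 vs ~2.9167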
3.4 The Expectation Operator
Statisticians avoid the mathematical awkwardness of probability density distributions and their associated integrals by defining an expectation operator E. For any nonrandom function f with a random argument $\tilde x$, we say that $\mathbf{E}\big(f(\tilde x)\big)$ is the predicted mean, or average, value of $f(\tilde x)$. We also call $\mathbf{E}\big(f(\tilde x)\big)$ the expectation value of $f(\tilde x)$. Mathematically we define

$$\mathbf{E}\big(f(\tilde x)\big) = \int_{-\infty}^{\infty} p_{\tilde x}(x)\, f(x)\, dx\,. \qquad (3.8a)$$

Just like before, $p_{\tilde x}(x)\,dx$ is the probability that the random variable $\tilde x$ takes on a value between x and x + dx. We can find $\mathbf{E}(\tilde x)$, the expectation value of $\tilde x$, by choosing f(x) = x in Eq. (3.8a) to get

$$\mathbf{E}(\tilde x) = \int_{-\infty}^{\infty} p_{\tilde x}(x)\, x\, dx\,. \qquad (3.8b)$$

Comparing this to Eq. (3.5a) above, we see that the expectation value of $\tilde x$ is the same as the predicted mean or average value of $\tilde x$,

$$\mathbf{E}(\tilde x) = \mu_{\tilde x}\,, \qquad (3.8c)$$

which makes good intuitive sense. Choosing $f(x) = (x - \mu_{\tilde x})^2$ gives

$$\mathbf{E}\big((\tilde x - \mu_{\tilde x})^2\big) = \int_{-\infty}^{\infty} p_{\tilde x}(x)\,(x - \mu_{\tilde x})^2\, dx\,. \qquad (3.8d)$$

Comparing this to Eq. (3.5b) above, we see that $\mathbf{E}\big((\tilde x - \mu_{\tilde x})^2\big)$ is the variance of $\tilde x$,

$$v_{\tilde x} = \mathbf{E}\big((\tilde x - \mu_{\tilde x})^2\big)\,. \qquad (3.8e)$$

A notation often used for the variance of $\tilde x$ instead of $v_{\tilde x}$ is

$$\mathrm{Var}(\tilde x) = \mathbf{E}\big((\tilde x - \mu_{\tilde x})^2\big)\,. \qquad (3.8f)$$

When the E operator is applied to any sort of random variable or function—for example, $f(\tilde x)$—the result is always a nonrandom variable or function, namely

$$\int_{-\infty}^{\infty} p_{\tilde x}(x)\, f(x)\, dx\,.$$

For example, the characteristic function $\Phi_{\tilde x}$ of a random variable $\tilde x$, which is the nonrandom Fourier transform of the probability density distribution of $\tilde x$,

$$\Phi_{\tilde x}(\nu) = \int_{-\infty}^{\infty} p_{\tilde x}(x)\, e^{2\pi i\nu x}\, dx\,, \qquad (3.9a)$$

can be written as, using the E operator,

$$\Phi_{\tilde x}(\nu) = \mathbf{E}\big(e^{2\pi i\nu\tilde x}\big)\,. \qquad (3.9b)$$

To specify what happens when E is applied to a nonrandom variable c, we set up a random variable $\tilde\rho$ that has the probability density distribution

$$p_{\tilde\rho}(\rho) = \delta(\rho - c)\,. \qquad (3.9c)$$

According to the discussion following Eqs. (3.7e,f) above, this makes $\tilde\rho$ equivalent to the nonrandom variable c. Consequently, we can say that

$$\mathbf{E}(c) = \mathbf{E}(\tilde\rho) \qquad (3.9d)$$

and use Eq. (3.8b) above to get

$$\mathbf{E}(c) = \int_{-\infty}^{\infty} p_{\tilde\rho}(\rho)\,\rho\, d\rho = \int_{-\infty}^{\infty} \delta(\rho - c)\,\rho\, d\rho = c\,. \qquad (3.9e)$$

This justifies the general rule—which also makes good intuitive sense—that

$$\mathbf{E}(c) = c \qquad (3.9f)$$

for any nonrandom quantity c.

The expectation operator E can be applied to multiple random variables at the same time—all that we need is the appropriate probability density distribution. Suppose, for example, that the behavior of two random variables $\tilde x$ and $\tilde X$ is described by a two-argument probability density distribution $p_{\tilde x\tilde X}(x, X)$, with $p_{\tilde x\tilde X}(x, X)\,dx\,dX$ being the probability that the random variable $\tilde x$ takes on a value between x and x + dx while the random variable $\tilde X$ takes on a value between X and X + dX. No matter what the behavior of random variables $\tilde x$ and $\tilde X$, we can always construct an appropriate probability density distribution $p_{\tilde x\tilde X}$. Since $\tilde x$ and $\tilde X$ must always take on some values in the intervals

$$-\infty < x < \infty \quad\text{and}\quad -\infty < X < \infty\,,$$

the same reasoning used to produce Eq. (3.4) now shows that

$$\int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} dx\, dX\; p_{\tilde x\tilde X}(x, X) = 1 \qquad (3.10a)$$

for any probability density distribution $p_{\tilde x\tilde X}$. The expectation value of any function of the random variables $\tilde x$ and $\tilde X$, such as $f(\tilde x, \tilde X)$, is defined to be

$$\mathbf{E}\big(f(\tilde x, \tilde X)\big) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} dx\, dX\; p_{\tilde x\tilde X}(x, X)\, f(x, X)\,. \qquad (3.10b)$$

In particular, we can always set $f(\tilde x, \tilde X) = \tilde x\tilde X$ to get the expected value of the random variables’ product,

$$\mathbf{E}(\tilde x\tilde X) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} dx\, dX\; x\, X\; p_{\tilde x\tilde X}(x, X)\,. \qquad (3.10c)$$
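A quick numerical check of Eq. (3.9b) is possible whenever the characteristic function is known in closed form. For a Gaussian random variable with mean μ and standard deviation σ, the standard result (stated here as an outside fact, not derived in the text) is $\Phi(\nu) = e^{2\pi i\nu\mu - 2\pi^2\nu^2\sigma^2}$; the sketch below (Python with numpy, an illustration only) compares that closed form against a sample average of $e^{2\pi i\nu\tilde x}$.

    import numpy as np

    rng = np.random.default_rng(7)
    mu, sigma, nu = 1.0, 0.5, 0.3
    x = rng.normal(mu, sigma, 2_000_000)

    # Eq. (3.9b): the characteristic function as an expectation value.
    phi_sample = np.exp(2j * np.pi * nu * x).mean()

    # Known closed form for a Gaussian random variable.
    phi_exact = np.exp(2j * np.pi * nu * mu - 2 * (np.pi * nu * sigma)**2)
    print(phi_sample, phi_exact)   # agree to ~1e-3 (sampling error)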
3.5 Independent and Dependent Random Variables
When comparing two random variables such as $\tilde x$ and $\tilde X$, one of the first questions that arises is whether they are dependent or independent. When two random variables are dependent, the random variables influence each other; and when two random variables are independent, they do not.
Independent random variables are used to describe random quantities for which no cause-and-
effect relationship can be found. When, for example, we pick a car randomly from all the cars
sold in a given year, there is no reason to expect that the random variable representing the
brightness of the car’s headlights is associated with any particular value of the random variable
representing the car’s length. Lacking any evidence to the contrary, then, we say that these two
random variables ought to be independent. Similarly, if we pick someone at random from a
collection of adults, there is no obvious reason to assume that the random variable representing
the person’s yearly income is associated with any particular value of the person’s shoe size.
Again, we might assume that these are independent random variables. In general, when there is
no reason to connect the values of random quantities, we set them up in our models as
independent random variables.
Many times random variables turn out to be dependent in surprising ways. Returning to the
first of the previous examples, when we examine the connection between a car’s length and the
brightness of its headlights, it might turn out that very short cars are more likely to be European
sports cars frequently washed by their owners, making them more likely to have cleaner and thus
brighter headlights. Similarly, returning to the second example, a person’s shoe size and height
are connected; and statisticians have in fact shown that tall people, who are more likely to wear
large shoes, are also more likely to earn large incomes (if only because people living in the
United States, Australia, Canada, and Europe are more likely to be tall). Just as in these two
examples, many random variables that look like they ought to be unconnected and independent
turn out, after closer examination, to be dependent; in this sense, the independence of random
variables is the ideal case from which realistic random variables tend to deviate to a greater or
lesser degree.
3.6 Analyzing Independent Random Variables
When $\tilde x$ and $\tilde X$ are independent random variables, their probability density distribution can be written as²⁶

$$p_{\tilde x\tilde X}(x, X) = p_{\tilde x}(x)\cdot p_{\tilde X}(X)\,, \qquad (3.11a)$$

where $p_{\tilde x}$ and $p_{\tilde X}$ are the standard probability density distributions for $\tilde x$ and $\tilde X$ when $\tilde x$ and $\tilde X$ are treated as solitary random variables. This means that $p_{\tilde x}(x)\,dx$ is the probability that $\tilde x$ lies between x and x + dx regardless of the value of $\tilde X$, and $p_{\tilde X}(X)\,dX$ is the probability that $\tilde X$ lies between X and X + dX regardless of the value of $\tilde x$. We see that, according to Eqs. (3.10c) and (3.11a), the expectation value of the product $\tilde x\tilde X$ of two independent random variables is

$$\mathbf{E}(\tilde x\tilde X) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} dx\, dX\; x\, X\; p_{\tilde x\tilde X}(x, X) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} dx\, dX\; x\, X\; p_{\tilde x}(x)\, p_{\tilde X}(X) = \Big[\int_{-\infty}^{\infty} p_{\tilde x}(x)\, x\, dx\Big]\cdot\Big[\int_{-\infty}^{\infty} p_{\tilde X}(X)\, X\, dX\Big].$$

According to Eqs. (3.8b) and (3.8c), this can be written as

$$\mathbf{E}(\tilde x\tilde X) = \mathbf{E}(\tilde x)\cdot\mathbf{E}(\tilde X) \qquad (3.11b)$$

or

$$\mathbf{E}(\tilde x\tilde X) = \mu_{\tilde x}\,\mu_{\tilde X}\,. \qquad (3.11c)$$

    ²⁶ Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, p. 132.
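Equation (3.11b) is easy to test by sampling. In the sketch below (Python with numpy; the particular distributions are arbitrary choices made only for illustration), two independent random variables are drawn a million times and the mean of their product is compared with the product of their means.

    import numpy as np

    rng = np.random.default_rng(3)
    n = 1_000_000

    # Independent draws: x~ uniform on [0, 1), X~ normal with mean 2.
    x = rng.uniform(0.0, 1.0, n)
    X = rng.normal(2.0, 1.0, n)

    # Eq. (3.11b): for independent variables the mean of the product
    # equals the product of the means (here 0.5 * 2.0 = 1.0).
    print((x * X).mean())        # ~1.0
    print(x.mean() * X.mean())   # ~1.0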
3.7 Large Numbers of Random Variables
Our analysis of two random variables can be extended in a straightforward way to large collections of random variables. If there are N random variables $\tilde x_1, \tilde x_2, \ldots, \tilde x_N$, then we can always construct a probability density distribution

$$p_{\tilde x_1\tilde x_2\cdots\tilde x_N}(x_1, x_2, \ldots, x_N)$$

such that

$$p_{\tilde x_1\tilde x_2\cdots\tilde x_N}(x_1, x_2, \ldots, x_N)\, dx_1\, dx_2\cdots dx_N$$

is the probability that $\tilde x_1$ lies between $x_1$ and $x_1 + dx_1$, that $\tilde x_2$ lies between $x_2$ and $x_2 + dx_2$, …, that $\tilde x_N$ lies between $x_N$ and $x_N + dx_N$. The expectation value of any function $f(\tilde x_1, \tilde x_2, \ldots, \tilde x_N)$ of these N random variables is

$$\mathbf{E}\big(f(\tilde x_1, \tilde x_2, \ldots, \tilde x_N)\big) = \int_{-\infty}^{\infty}\!\!\cdots\!\!\int_{-\infty}^{\infty} dx_1\, dx_2\cdots dx_N\; f(x_1, \ldots, x_N)\; p_{\tilde x_1\tilde x_2\cdots\tilde x_N}(x_1, \ldots, x_N)\,. \qquad (3.12a)$$

Note that nothing has been said so far about the connections between these N random variables; they could be either dependent or independent. If we now assume that these N random variables are all independent with respect to one another, then

$$p_{\tilde x_1\tilde x_2\cdots\tilde x_N}(x_1, x_2, \ldots, x_N) = p_{\tilde x_1}(x_1)\, p_{\tilde x_2}(x_2)\cdots p_{\tilde x_N}(x_N)\,, \qquad (3.12b)$$

where $p_{\tilde x_1}(x_1)\,dx_1$ is the probability that $\tilde x_1$ lies between $x_1$ and $x_1 + dx_1$ regardless of the values of the other N − 1 random variables, $p_{\tilde x_2}(x_2)\,dx_2$ is the probability that $\tilde x_2$ lies between $x_2$ and $x_2 + dx_2$ regardless of the values of the other N − 1 random variables, …, $p_{\tilde x_N}(x_N)\,dx_N$ is the probability that $\tilde x_N$ lies between $x_N$ and $x_N + dx_N$ regardless of the values of the other N − 1 random variables. The expectation value of the product of these N random variables can now be written as, setting $f(\tilde x_1, \ldots, \tilde x_N) = \tilde x_1\tilde x_2\cdots\tilde x_N$ in Eq. (3.12a),

$$\mathbf{E}(\tilde x_1\tilde x_2\cdots\tilde x_N) = \int_{-\infty}^{\infty}\!\!\cdots\!\!\int_{-\infty}^{\infty} dx_1\cdots dx_N\; x_1 x_2\cdots x_N\; p_{\tilde x_1\cdots\tilde x_N}(x_1, \ldots, x_N) = \Big[\int_{-\infty}^{\infty} p_{\tilde x_1}(x_1)\, x_1\, dx_1\Big]\cdots\Big[\int_{-\infty}^{\infty} p_{\tilde x_N}(x_N)\, x_N\, dx_N\Big].$$

Again, we consult Eqs. (3.8b) and (3.8c) to get

$$\mathbf{E}(\tilde x_1\tilde x_2\cdots\tilde x_N) = \mathbf{E}(\tilde x_1)\,\mathbf{E}(\tilde x_2)\cdots\mathbf{E}(\tilde x_N) \qquad (3.12c)$$

or

$$\mathbf{E}(\tilde x_1\tilde x_2\cdots\tilde x_N) = \mu_{\tilde x_1}\,\mu_{\tilde x_2}\cdots\mu_{\tilde x_N}\,. \qquad (3.12d)$$
3.8 Single-Variable Means from Multivariable Distributions
We can calculate the predicted mean values of $\tilde x$ and $\tilde X$ by choosing $f(\tilde x, \tilde X) = \tilde x$ and $f(\tilde x, \tilde X) = \tilde X$ in Eq. (3.10b) above. This gives

$$\mu_{\tilde x} = \mathbf{E}(\tilde x) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} dx\, dX\; x\; p_{\tilde x\tilde X}(x, X) \qquad (3.13a)$$

and

$$\mu_{\tilde X} = \mathbf{E}(\tilde X) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} dx\, dX\; X\; p_{\tilde x\tilde X}(x, X)\,. \qquad (3.13b)$$

Writing the double integrals as

$$\mathbf{E}(\tilde x) = \int_{-\infty}^{\infty} x\,\Big[\int_{-\infty}^{\infty} p_{\tilde x\tilde X}(x, X)\, dX\Big]\, dx \qquad (3.13c)$$

and

$$\mathbf{E}(\tilde X) = \int_{-\infty}^{\infty} X\,\Big[\int_{-\infty}^{\infty} p_{\tilde x\tilde X}(x, X)\, dx\Big]\, dX\,, \qquad (3.13d)$$

we compare them to the formula for the expected value of a random variable given in Eq. (3.8b). This comparison suggests that, if we want to specify the behavior of one random variable while disregarding the presence of the other, we can construct the single-argument probability density distributions of $\tilde x$ and $\tilde X$ by writing

$$p_{\tilde x}(x) = \int_{-\infty}^{\infty} p_{\tilde x\tilde X}(x, X)\, dX \qquad (3.13e)$$

and

$$p_{\tilde X}(X) = \int_{-\infty}^{\infty} p_{\tilde x\tilde X}(x, X)\, dx\,. \qquad (3.13f)$$

Up to this point, none of the integrations have required assumptions about the dependence or independence of the random variables, so Eqs. (3.13e) and (3.13f) hold true both for dependent and independent random variables $\tilde x$ and $\tilde X$. If we specify that $\tilde x$ and $\tilde X$ are independent, then Eq. (3.11a) can be substituted into (3.13e) and (3.13f) to get

$$p_{\tilde x}(x) = \int_{-\infty}^{\infty} p_{\tilde x}(x)\, p_{\tilde X}(X)\, dX = p_{\tilde x}(x)\int_{-\infty}^{\infty} p_{\tilde X}(X)\, dX$$

and

$$p_{\tilde X}(X) = \int_{-\infty}^{\infty} p_{\tilde x}(x)\, p_{\tilde X}(X)\, dx = p_{\tilde X}(X)\int_{-\infty}^{\infty} p_{\tilde x}(x)\, dx\,.$$

Glancing back at Eq. (3.4), we note that these last two equalities are trivially true, because in both cases the right-most integrals must be one.
3.9 Analyzing Dependent Random Variables
Having found formulas for $\mu_{\tilde x}$ and $\mu_{\tilde X}$ that hold true for any pair of dependent or independent random variables $\tilde x$ and $\tilde X$, we now use $\mu_{\tilde x}$ and $\mu_{\tilde X}$ to define a new random variable

$$\tilde y = (\tilde x - \mu_{\tilde x})(\tilde X - \mu_{\tilde X})\,. \qquad (3.14a)$$

From Eq. (3.8c), we know that

$$\mathbf{E}(\tilde y) = \mathbf{E}\big((\tilde x - \mu_{\tilde x})(\tilde X - \mu_{\tilde X})\big) \qquad (3.14b)$$

is just the predicted average value of $\tilde y$. We can imagine, each time we acquire a random pair of $\tilde x$ and $\tilde X$ values, comparing the sizes of $\tilde x$ and $\tilde X$ to their respective averages $\mu_{\tilde x}$ and $\mu_{\tilde X}$ by subtracting $\mu_{\tilde x}$ and $\mu_{\tilde X}$ from them. If $\tilde x$ and $\tilde X$ are both simultaneously greater than, or both simultaneously less than, their averages, then $\tilde y$ is positive; and if one is greater than its average when the other is less than its average, then $\tilde y$ is negative. If there is a tendency for one of the random variables to exceed its average whenever the other exceeds its average, or a tendency for one of the random variables to fall below its average whenever the other falls below its average, then $\tilde y$ has a greater probability of being positive than negative, so

$$\mathbf{E}(\tilde y) > 0\,.$$

If, on the other hand, there is a tendency for one of the random variables to exceed its average when the other falls below its average, then $\tilde y$ has a greater probability of being negative than positive, so

$$\mathbf{E}(\tilde y) < 0\,.$$

If $\mathbf{E}(\tilde y)$ is zero, it indicates that $\tilde y$ is just as likely to be negative as positive, which means that knowing one variable lies above or below its average tells us nothing about the likelihood that the other variable lies above or below its average. Writing out the integral formula for $\mathbf{E}(\tilde y)$ in terms of the probability density distribution $p_{\tilde x\tilde X}(x, X)$ gives

$$\mathbf{E}(\tilde y) = \mathbf{E}\big((\tilde x - \mu_{\tilde x})(\tilde X - \mu_{\tilde X})\big) = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} dx\, dX\;\big[(x - \mu_{\tilde x})(X - \mu_{\tilde X})\big]\; p_{\tilde x\tilde X}(x, X)\,. \qquad (3.14c)$$

We say that the value of the integral in Eq. (3.14c) measures the covariance of random variables $\tilde x$ and $\tilde X$. When $\mathbf{E}(\tilde y) = \mathbf{E}\big((\tilde x - \mu_{\tilde x})(\tilde X - \mu_{\tilde X})\big)$ is greater than zero, $\tilde x$ and $\tilde X$ are said to be positively correlated; when it is less than zero, $\tilde x$ and $\tilde X$ are said to be negatively correlated; and when it equals zero, $\tilde x$ and $\tilde X$ are said to be uncorrelated.

Evaluating $\mathbf{E}(\tilde y)$ and finding it not equal to zero is a standard way of showing that two random variables $\tilde x$ and $\tilde X$ are correlated and so cannot be independent. We cannot, however, say that $\tilde x$ and $\tilde X$ are independent just because $\mathbf{E}(\tilde y)$ is zero; that is, saying that $\tilde x$ and $\tilde X$ are uncorrelated is a weaker statement than saying that $\tilde x$ and $\tilde X$ are independent. To show why this is so, we set up a random variable $\tilde\varphi$ which has a probability density distribution

$$p_{\tilde\varphi}(\varphi) = \begin{cases} 1/(2\pi) & \text{for } 0 \le \varphi < 2\pi\\[2pt] 0 & \text{for } \varphi < 0 \text{ or } \varphi \ge 2\pi\,. \end{cases} \qquad (3.15a)$$

The probability density distribution $p_{\tilde\varphi}$ shows that $\tilde\varphi$ is equally likely to take on any value between zero and 2π, and that $\tilde\varphi$ never takes on values less than zero or greater than 2π. We next define two random variables $\tilde u$ and $\tilde v$ such that

$$\tilde u = \sin(\tilde\varphi) \qquad (3.15b)$$

and

$$\tilde v = \cos(\tilde\varphi)\,. \qquad (3.15c)$$

It follows that

$$\mu_{\tilde u} = \mathbf{E}(\tilde u) = \mathbf{E}\big(\sin\tilde\varphi\big) = \int_{-\infty}^{\infty} p_{\tilde\varphi}(\varphi)\,\sin(\varphi)\, d\varphi = \frac{1}{2\pi}\int_0^{2\pi}\sin(\varphi)\, d\varphi = 0\,, \qquad (3.15d)$$

and similar reasoning shows that

$$\mu_{\tilde v} = \mathbf{E}(\tilde v) = \frac{1}{2\pi}\int_0^{2\pi}\cos(\varphi)\, d\varphi = 0\,. \qquad (3.15e)$$

Note that

$$\mathbf{E}\big((\tilde u - \mu_{\tilde u})(\tilde v - \mu_{\tilde v})\big) = \mathbf{E}(\tilde u\tilde v) = \mathbf{E}\big(\sin\tilde\varphi\,\cos\tilde\varphi\big) = \frac{1}{2\pi}\int_0^{2\pi}\sin(\varphi)\cos(\varphi)\, d\varphi = \frac{1}{4\pi}\int_0^{2\pi}\sin(2\varphi)\, d\varphi = 0\,, \qquad (3.15f)$$

which means that $\tilde u$ and $\tilde v$ are uncorrelated random variables. On the other hand, we also know that

$$\tilde u^2 + \tilde v^2 = \sin^2\tilde\varphi + \cos^2\tilde\varphi = 1\,,$$

which means that whenever $\tilde u$ takes on a particular random value, say 1/2, then $\tilde v$ must take on one of the two random values

$$\pm\sqrt{1 - (1/2)^2} = \pm\sqrt{3}/2\,.$$

Consequently, $\tilde u$ and $\tilde v$ are by no means independent random variables even though by definition they are uncorrelated random variables.
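The sine–cosine pair above makes a tidy numerical demonstration that "uncorrelated" does not imply "independent." The sketch below (Python with numpy, an illustration only) draws $\tilde\varphi$ from the density of Eq. (3.15a), confirms the near-zero sample covariance predicted by Eq. (3.15f), and then exhibits the exact constraint $\tilde u^2 + \tilde v^2 = 1$ that ties the two variables together.

    import numpy as np

    rng = np.random.default_rng(4)
    phi = rng.uniform(0.0, 2.0 * np.pi, 1_000_000)   # Eq. (3.15a)
    u = np.sin(phi)                                  # Eq. (3.15b)
    v = np.cos(phi)                                  # Eq. (3.15c)

    # Sample covariance is near zero, matching Eq. (3.15f): uncorrelated.
    print("covariance:", np.mean((u - u.mean()) * (v - v.mean())))  # ~0

    # Yet u~ and v~ are completely dependent: u^2 + v^2 = 1 exactly.
    s = u**2 + v**2
    print("u^2 + v^2 :", s.min(), s.max())           # both 1.0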
3.10 Linearity of the Expectation Operator
The expectation operator is linear with respect to all random quantities. To see why, we take any two functions f and g whose arguments are the N random variables $\tilde x_1, \tilde x_2, \ldots, \tilde x_N$ and multiply them by two nonrandom variables α and β. The expectation operator E applied to

$$\alpha f(\tilde x_1, \tilde x_2, \ldots, \tilde x_N) + \beta g(\tilde x_1, \tilde x_2, \ldots, \tilde x_N)$$

then gives, according to Eq. (3.12a) above,

$$\begin{aligned}
\mathbf{E}\big(\alpha f(\tilde x_1, \ldots, \tilde x_N) &+ \beta g(\tilde x_1, \ldots, \tilde x_N)\big)\\
&= \int_{-\infty}^{\infty}\!\!\cdots\!\!\int_{-\infty}^{\infty} dx_1\cdots dx_N\,\big[\alpha f(x_1, \ldots, x_N) + \beta g(x_1, \ldots, x_N)\big]\, p_{\tilde x_1\cdots\tilde x_N}(x_1, \ldots, x_N)\\
&= \alpha\int_{-\infty}^{\infty}\!\!\cdots\!\!\int_{-\infty}^{\infty} dx_1\cdots dx_N\, f(x_1, \ldots, x_N)\, p_{\tilde x_1\cdots\tilde x_N}(x_1, \ldots, x_N)\\
&\qquad + \beta\int_{-\infty}^{\infty}\!\!\cdots\!\!\int_{-\infty}^{\infty} dx_1\cdots dx_N\, g(x_1, \ldots, x_N)\, p_{\tilde x_1\cdots\tilde x_N}(x_1, \ldots, x_N)\\
&= \alpha\,\mathbf{E}\big(f(\tilde x_1, \ldots, \tilde x_N)\big) + \beta\,\mathbf{E}\big(g(\tilde x_1, \ldots, \tilde x_N)\big)\,.
\end{aligned} \qquad (3.16a)$$

Note that in the last step Eq. (3.12a) is applied again to return to the expectation operator. According to Eq. (2.32a) in Chapter 2, the definition of a linear operator L is that

$$\mathbf{L}(\alpha f + \beta g) = \alpha\,\mathbf{L}(f) + \beta\,\mathbf{L}(g) \qquad (3.16b)$$

for any two functions f, g and any two constants α, β. When we think of the nonrandom variables α and β as “constants,” we see that Eqs. (3.16a) and (3.16b) provide plenty of justification for calling the expectation operator E a linear operator with respect to all random quantities.

The linearity of E can be used to show that multiplying any random variable $\tilde x$ by a nonrandom parameter α results in the mean of $\tilde x$ being multiplied by α and the variance of $\tilde x$ being multiplied by α². Starting with Eq. (3.8c), we multiply both sides by α to get

$$\alpha\,\mathbf{E}(\tilde x) = \alpha\mu_{\tilde x}\,. \qquad (3.16c)$$

Because E is linear, $\alpha\,\mathbf{E}(\tilde x) = \mathbf{E}(\alpha\tilde x)$, which means that Eq. (3.16c) can be written as

$$\mathbf{E}(\alpha\tilde x) = \alpha\mu_{\tilde x}\,. \qquad (3.16d)$$

This shows that multiplying $\tilde x$ by α changes its average value from $\mu_{\tilde x}$ to $\alpha\mu_{\tilde x}$. As for the variance $v_{\tilde x}$ of random variable $\tilde x$, according to Eq. (3.8e) we have

$$\mathbf{E}\big((\tilde x - \mu_{\tilde x})^2\big) = v_{\tilde x} \qquad (3.16e)$$

from the definition of the variance of $\tilde x$. Multiplying both sides by α² gives

$$\alpha^2\,\mathbf{E}\big((\tilde x - \mu_{\tilde x})^2\big) = \alpha^2 v_{\tilde x}\,. \qquad (3.16f)$$

Again the linearity of E lets us write

$$\alpha^2\,\mathbf{E}\big((\tilde x - \mu_{\tilde x})^2\big) = \mathbf{E}\big(\alpha^2(\tilde x - \mu_{\tilde x})^2\big)\,,$$

and taking α inside the square gives

$$\mathbf{E}\big(\alpha^2(\tilde x - \mu_{\tilde x})^2\big) = \mathbf{E}\big((\alpha\tilde x - \alpha\mu_{\tilde x})^2\big)\,.$$

This can be substituted into (3.16f) to get

$$\mathbf{E}\big((\alpha\tilde x - \alpha\mu_{\tilde x})^2\big) = \alpha^2 v_{\tilde x}\,. \qquad (3.16g)$$

Since $\alpha\tilde x$ is the new random variable which comes from multiplying $\tilde x$ by α and [according to Eq. (3.16d)] the quantity $\alpha\mu_{\tilde x}$ is the mean of this new random variable, we now realize—consulting the definition of the variance in Eq. (3.8e)—that $\mathbf{E}\big((\alpha\tilde x - \alpha\mu_{\tilde x})^2\big)$ must be the variance of the new random variable $\alpha\tilde x$. Equation (3.16e) reminds us that $v_{\tilde x}$ is the variance of the old random variable $\tilde x$. Hence, Eq. (3.16g) states that if $\tilde x$ is multiplied by α then its variance must be multiplied by α².

The expectation operator usually can be moved inside an integral over a nonrandom variable. Suppose function f depends on one nonrandom variable z in addition to N random variables $\tilde x_1, \tilde x_2, \ldots, \tilde x_N$. Then, again using Eq. (3.12a), the expectation value of the integral

$$\int_{z_A}^{z_B} f(z, \tilde x_1, \tilde x_2, \ldots, \tilde x_N)\, dz$$

is

$$\mathbf{E}\Big(\int_{z_A}^{z_B} f(z, \tilde x_1, \ldots, \tilde x_N)\, dz\Big) = \int_{-\infty}^{\infty}\!\!\cdots\!\!\int_{-\infty}^{\infty} dx_1\cdots dx_N\; p_{\tilde x_1\cdots\tilde x_N}(x_1, \ldots, x_N)\int_{z_A}^{z_B} f(z, x_1, \ldots, x_N)\, dz\,.$$

As long as we can interchange the order of these integrations—which is almost always allowed when dealing with physically realistic integrals—the expectation value can also be written as

$$\int_{z_A}^{z_B} dz\,\Big[\int_{-\infty}^{\infty}\!\!\cdots\!\!\int_{-\infty}^{\infty} dx_1\cdots dx_N\; p_{\tilde x_1\cdots\tilde x_N}(x_1, \ldots, x_N)\, f(z, x_1, \ldots, x_N)\Big].$$

This can, again applying Eq. (3.12a), be written as

$$\mathbf{E}\Big(\int_{z_A}^{z_B} f(z, \tilde x_1, \ldots, \tilde x_N)\, dz\Big) = \int_{z_A}^{z_B} \mathbf{E}\big(f(z, \tilde x_1, \ldots, \tilde x_N)\big)\, dz\,. \qquad (3.17a)$$

The same reasoning can be extended to M integrals over M nonrandom variables $z_1, z_2, \ldots, z_M$. We have

$$\mathbf{E}\Big(\int_{z_{1A}}^{z_{1B}}\!\!\cdots\!\!\int_{z_{MA}}^{z_{MB}} dz_1\cdots dz_M\; f(z_1, \ldots, z_M, \tilde x_1, \ldots, \tilde x_N)\Big) = \int_{z_{1A}}^{z_{1B}}\!\!\cdots\!\!\int_{z_{MA}}^{z_{MB}} dz_1\cdots dz_M\; \mathbf{E}\big(f(z_1, \ldots, z_M, \tilde x_1, \ldots, \tilde x_N)\big)\,. \qquad (3.17b)$$

The expectation operator can even be moved inside the integral of a random function

$$\tilde f(z_1, z_2, \ldots, z_M)\,.$$

According to our definition of a random function in Sec. 3.2 above, we have

$$\tilde f(z_1, z_2, \ldots, z_M) = f(z_1, z_2, \ldots, z_M, \tilde x_1, \tilde x_2, \ldots, \tilde x_N)$$

for some set of random variables $\tilde x_1, \tilde x_2, \ldots, \tilde x_N$. Hence, we can just suppress the random variables $\tilde x_1, \tilde x_2, \ldots, \tilde x_N$ in Eq. (3.17b) to get

$$\mathbf{E}\Big(\int_{z_{1A}}^{z_{1B}}\!\!\cdots\!\!\int_{z_{MA}}^{z_{MB}} dz_1\cdots dz_M\; \tilde f(z_1, \ldots, z_M)\Big) = \int_{z_{1A}}^{z_{1B}}\!\!\cdots\!\!\int_{z_{MA}}^{z_{MB}} dz_1\cdots dz_M\; \mathbf{E}\big(\tilde f(z_1, \ldots, z_M)\big)\,. \qquad (3.17c)$$

This result is referred to more than once in the following chapters.
3.11 The Central Limit Theorem
The central limit theorem states that if there is a random variable
N
s equal to the sum of N
independent random variables
1
r ,
2
r ,…,
N
r , then


1 2 N N
s r r r = + + + " (3.18a)

has a probability density distribution ( )
N
s N
p s

that resembles a Gaussian or normal probability
density distribution more and more as N gets large,


2
2
( )
2 1
( )
2
N s
N
s
N
N
N
s
s N
s
p s e
µ
σ
σ π


. (3.18b)

In Eq. (3.18b),
N
s
µ

is the mean or average value of
N
s and
N
s
σ

is the standard deviation of
N
s
about its mean. Figure 3.1 is a plot of the Gaussian distribution specified on the right-hand side of
(3.18b). For large but finite values of N, this Gaussian distribution tends to be a relatively good
approximation of ( )
N
s N
p s

for
N
s values near the peak in Fig. 3.1 and a not-so-good
approximation of ( )
N
s N
p s

for
N
s values in the tails of Fig. 3.1—that is, for
N
s values far from
the peak.
The mean of
N
s comes from applying the expectation operator E to both sides of Eq. (3.18a).
Remembering that E is linear with respect to random quantities [see Eq. (3.16a) above], we get


1 2 1 2
( ) ( ) ( ) ( ) ( )
N N N
s r r r r r r = + + + = + + + " " E E E E E ,
3 · Random Variables, Random Functions, and Power Spectra

- 244 -
FIGURE 3.1.






which becomes, applying Eq. (3.8c) above,


1 2 N N
s r r r
µ µ µ µ = + + +

" . (3.19a)

The variance of
N
s is, according to Eq. (3.8e),


( )
2
( )
N N
s N s
v s µ = −

E ,

which becomes, after substituting from Eqs. (3.18a) and (3.19a),


N
s
~ µ
N
s
~ σ
N
s
~ σ

N
s
) ( ~
N s
s p
N

The Central Limit Theorem · 3.11
- 245 -

2 2
1 1 1
( )
N j j
N N N
s j r j r
j j j
v r r µ µ
= = =
§ · § ·
§ · § ·
¨ ¸ ¨ ¸
= − = −
¨ ¸ ¨ ¸
¨ ¸ ¨ ¸
© ¹ © ¹
© ¹ © ¹
¦ ¦ ¦
E E .

Expanding the square inside the expectation operator gives


2
1 1 1
( ) [( )( )]
N j j k
N N N
s j r j r k r
j j k
k j
v r r r µ µ µ
= = =

§ ·
¨ ¸
= − + − −
¨ ¸
¨ ¸
© ¹
¦ ¦¦
E ,

and the linearity of the expectation operator with respect to random quantities then lets us write
this as


( ) ( )
2
1 1 1
( ) ( )( )
N j j k
N N N
s j r j r k r
j j k
k j
v r r r µ µ µ
= = =

= − + − −
¦ ¦¦
E E . (3.19b)

Since
1
r ,
2
r ,…,
N
r are independent random quantities, so must the random quantities
1
1 r
r µ −

,
2
2 r
r µ −

,…,
N
N r
r µ −

also be independent. Hence, according to Eq. (3.11b), we see that when
j k ≠


( )
( )( ) ( ) ( )
j k j k
j r k r j r k r
r r r r µ µ µ µ − − = − ⋅ −

E E E . (3.19c)

But, applying the linearity of the expectation operator and Eqs. (3.8c) and (3.9f), we have

( ) ( ) ( ) 0
j j j j
j r j r r r
r r µ µ µ µ − = − = − =

E E E .

Consequently, Eq. (3.19c) becomes


( )
( )( ) 0
j k
j r k r
r r µ µ − − =

E (3.19d)

when j k ≠ . Substituting this into (3.19b) gives


( )
2
1
( )
N j
N
s j r
j
v r µ
=
= −
¦
E ,

3 · Random Variables, Random Functions, and Power Spectra

- 246 -
which becomes, after applying Eq. (3.8e),


1 2 N N
s r r r
v v v v = + + +

" , (3.19e)
where

( )
2
( )
j j
j r r
r v µ − =

E (3.19f)

is the variance of
j
r for 1, 2, , j N = … . The standard deviation of a random quantity is the square
root of its variance [see Eq. (3.5c)], so formulas (3.19e) and (3.19f) can also be written as


1 2
2 2 2 2
N N
s r r r
σ σ σ σ = + + +

" , (3.19g)
where

( )
2
( )
j j
j r r
r µ σ − =

E (3.19h)

is the standard deviation of
j
r for 1, 2, , j N = … and
N
s
σ

is the standard deviation of
N
s .
Returning to the approximation in Eq. (3.18b) used to explain the central limit theorem, we
notice that some care must be exercised in interpreting the limit as N →∞; in particular, it is
clear from Eqs. (3.19a) and (3.19g) that there is a tendency for both
N
s
µ

and
N
s
σ

to become large
without limit as N increases, making the expression on the right-hand side of (3.18b) difficult to
interpret in the limit of large N. The central limit theorem can be written in terms of a
mathematically well-defined limit as N →∞ if we are careful how the arguments of the
Gaussian or normal distribution are defined. To state the central limit theorem precisely, we
define a new random variable

N
N
N s
N
s
s
z
µ
σ

=

(3.20a)

that has a probability density distribution $p_{\tilde z_N}(z_N)$. Now we can present the central limit theorem exactly by stating that

$$ \lim_{N \to \infty} \big[ p_{\tilde z_N}(z) \big] = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}. \tag{3.20b} $$

The right-hand side of (3.20b) is the Gaussian or normal distribution introduced above in Eq. (3.6a) where the random variable has a mean of zero and a standard deviation of one. For any large but finite value of $N$, we can recover the approximation in (3.18b) by assuming that $p_{\tilde z_N}$ is near its limit and then replacing $z$ in (3.20b) by $z_N$ as defined in (3.20a). [The extra factor of $\sigma_{\tilde s_N}$ multiplying the $\sqrt{2\pi}$ on the right-hand side of (3.18b) can be regarded as coming from Eq. (3.4) above—if it isn't there, then the integral of the probability density distribution between $-\infty$ and $+\infty$ does not equal one.]
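The limit in Eq. (3.20b) is easy to check numerically. The short Python sketch below is not part of the original derivation; it assumes the numpy library and makes an arbitrary choice (the uniform distribution) for the $\tilde r_j$, yet it shows the standardized sum $\tilde z_N$ settling onto the zero-mean, unit-standard-deviation Gaussian:

```python
import numpy as np

rng = np.random.default_rng(seed=0)
N, trials = 50, 100_000              # terms per sum, and number of sums

# Each r_j is uniform on [0, 1): mean 1/2 and variance 1/12, so by
# Eqs. (3.19a) and (3.19g) the sum s_N has mean N/2 and standard
# deviation sqrt(N/12).
s_N = rng.random((trials, N)).sum(axis=1)
z_N = (s_N - N / 2) / np.sqrt(N / 12.0)          # Eq. (3.20a)

# Compare the sampled density of z_N against the limit in Eq. (3.20b).
hist, edges = np.histogram(z_N, bins=60, range=(-4.0, 4.0), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
gauss = np.exp(-centers**2 / 2) / np.sqrt(2 * np.pi)
print("max deviation from N(0,1):", np.abs(hist - gauss).max())
```

Already at $N = 50$ the histogram and the Gaussian differ by only about a percent, even though each $\tilde r_j$ is far from normally distributed.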
3.12 Averaging to Improve Experimental Accuracy
It is now easy to explain why averaging together many identical but independent measurements from the same experiment improves the accuracy of the result. Suppose $N$ independent measurements are to be averaged together this way. We can say that each measurement is an independent random number $\tilde r_j$ for $j = 1, 2, \ldots, N$ having the same mean value $\mu$, with $\mu$ taken to be the true value of the experimental quantity being measured. Since the measurements are all identical, all the $\tilde r_j$ have the same standard deviation $\sigma$ due to the same sorts of random errors occurring in each independent measurement. When all the experimental results are averaged, we create a new random number—namely, the sum of all the $\tilde r_j$ divided by $N$. Let's call this new random number $\tilde a_N$. The work done in the previous section lets us write this as [see Eq. (3.18a)]

$$ \tilde a_N = \frac{\tilde s_N}{N}. \tag{3.21a} $$

Applying the expectation operator $\mathbf{E}$ to both sides gives, using the linearity of the expectation operator (see Sec. 3.10 above),

$$ \mathbf{E}(\tilde a_N) = \frac{1}{N}\,\mathbf{E}(\tilde s_N). \tag{3.21b} $$

Since $\mathbf{E}(\tilde s_N) = \mu_{\tilde s_N}$, Eq. (3.19a) shows that, since all the $\tilde r_j$ have the same mean value $\mu$,

$$ \mathbf{E}(\tilde s_N) = \mu_{\tilde r_1} + \mu_{\tilde r_2} + \cdots + \mu_{\tilde r_N} = N\mu. \tag{3.21c} $$

Hence, Eq. (3.21b) now becomes

$$ \mathbf{E}(\tilde a_N) = \frac{1}{N}(N\mu) = \mu. \tag{3.21d} $$

Equation (3.21d) states that the expected value of the experimental average $\tilde a_N$ is $\mu$, the true value of the experimental quantity being measured. This is no great surprise, because the averaging process would not make sense unless it were true. The typical size of the error left after the $\tilde r_j$ are averaged together—that is, the amount by which $\tilde a_N$ is likely to be different from its average value—is just its standard deviation [see Eqs. (3.5c) and (3.8e) above],

$$ \sigma_{\tilde a_N} = \sqrt{ \mathbf{E}\big( (\tilde a_N - \mu)^2 \big) }, $$

which can also be written as, after substituting from Eq. (3.21a) and using the linearity of the expectation operator,

$$ \sigma_{\tilde a_N}^2 = \mathbf{E}\!\left( \Big( \frac{\tilde s_N}{N} - \mu \Big)^{\!2} \right) = \frac{1}{N^2}\,\mathbf{E}\big( (\tilde s_N - N\mu)^2 \big). \tag{3.21e} $$

According to (3.21c), $N\mu$ is the mean value of $\tilde s_N$, which makes

$$ \mathbf{E}\big( (\tilde s_N - N\mu)^2 \big) $$

the variance $v_{\tilde s_N}$ of $\tilde s_N$ [see Eq. (3.8e) above]. Hence, (3.21e) can be written as

$$ \sigma_{\tilde a_N} = \frac{1}{N}\sqrt{v_{\tilde s_N}} = \frac{1}{N}\,\sigma_{\tilde s_N} $$

because the variance is the square of the standard deviation $\sigma_{\tilde s_N}$. Substituting from (3.19g) now gives

$$ \sigma_{\tilde a_N} = \frac{1}{N}\sqrt{ \sigma_{\tilde r_1}^2 + \sigma_{\tilde r_2}^2 + \cdots + \sigma_{\tilde r_N}^2 }. $$

As already mentioned above, we can assume that all the $\tilde r_j$ have the same standard deviation $\sigma$. Hence,

$$ \sigma_{\tilde a_N} = \frac{1}{N}\sqrt{N\sigma^2} = \frac{\sigma}{\sqrt{N}}. \tag{3.21f} $$

This shows that when the standard deviation or expected error in one measurement is $\sigma$, then the standard deviation or expected error in the average $\tilde a_N$ of $N$ identical but independent measurements is $\sigma/\sqrt{N}$, a significantly smaller number. Although we use several formulas from the previous section on the central limit theorem to get this result, there is no assumption here that the $\tilde r_j$ obey any particular probability density distribution. In order to derive Eqs. (3.21d) and (3.21f), all that is needed is that the $\tilde r_j$ are independent and that the probability density distributions of the $\tilde r_j$ have the same mean and standard deviation.
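A quick Monte Carlo check of Eq. (3.21f) is sketched below. It assumes only numpy, and the exponential distribution is an arbitrary choice made to emphasize that no particular probability density is required; the scatter of $\tilde a_N$ is seen to shrink like $\sigma/\sqrt{N}$:

```python
import numpy as np

rng = np.random.default_rng(seed=1)
sigma, trials = 1.0, 200_000    # an exponential with scale 1 has sigma = 1

for N in (1, 4, 16, 64):
    # a_N is the average of N independent, identically distributed measurements
    a_N = rng.exponential(scale=1.0, size=(trials, N)).mean(axis=1)
    print(N, a_N.std(), sigma / np.sqrt(N))   # the last two columns agree
```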
When spectrometers are used to make independent measurements of the same radiance spectra, we can extend the above analysis to the spectral measurements by regarding the independent but identical random variables $\tilde r_j$ as random functions of the spectral wavelength or frequency, with different values of index $j$ now representing different spectral curves from independent spectral measurements. We can now repeat all the algebraic manipulations used in (3.21a)–(3.21f) above while regarding every quantity except $N$ as a function of the spectral wavelength or frequency and end up with the same results. If, for example, the quantities are regarded as functions of the spectral wavelength $\lambda$, then we just need to visualize a $(\lambda)$ immediately following the relevant variables. In a sense, all that is happening is that we have decided to repeat the algebra of Eqs. (3.21a)–(3.21f) at each spectral wavelength. Equation (3.21d), for example, becomes

$$ \mathbf{E}\big( \tilde a_N(\lambda) \big) = \mu(\lambda), \tag{3.22a} $$

showing that the point-by-point average of the $\tilde r_j(\lambda)$ spectral curves creates another curve $\tilde a_N(\lambda)$ whose expected value is the true spectrum $\mu(\lambda)$. The average spectrum $\tilde a_N(\lambda)$ is allowed to have a different expected value $\mu(\lambda)$ at each wavelength $\lambda$ because it is now, of course, taken to be a function of $\lambda$. Similarly Eq. (3.21f) becomes

$$ \sigma_{\tilde a_N}(\lambda) = \frac{\sigma(\lambda)}{\sqrt{N}}. \tag{3.22b} $$

This shows that the expected error $\sigma_{\tilde a_N}(\lambda)$ at wavelength $\lambda$ of the average spectrum $\tilde a_N(\lambda)$ is smaller by a factor of $\sqrt{N}$ than the expected error $\sigma(\lambda)$ at wavelength $\lambda$ of a single spectral measurement. The expected error $\sigma(\lambda)$, just like the average $\mu(\lambda)$, is allowed to be different at different wavelengths. As long as the expected value $\mu(\lambda)$ of $\tilde a_N(\lambda)$ is the true spectral curve, Eq. (3.22b) shows that we can approach this true spectrum as closely as we desire—that is, make the error in our point-by-point average spectrum arbitrarily small—by making $N$ as large as necessary.
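In code, repeating the algebra at each spectral wavelength just means doing the scalar computation on whole arrays at once. The following sketch assumes numpy; the "true" spectrum and the wavelength-dependent error used here are arbitrary stand-ins, not data from the text:

```python
import numpy as np

rng = np.random.default_rng(seed=2)
lam = np.linspace(8.0, 12.0, 200)         # arbitrary wavelength grid
mu_lam = 1.0 + 0.3 * np.sin(lam)          # stand-in for the true spectrum mu(lambda)
sigma_lam = 0.05 * (1.0 + lam / 10.0)     # single-measurement error sigma(lambda)

N = 100                                   # independent spectral measurements
spectra = mu_lam + sigma_lam * rng.standard_normal((N, lam.size))
a_N = spectra.mean(axis=0)                # point-by-point average, Eq. (3.22a)

# The rms error of the average is close to the mean of sigma(lambda)/sqrt(N).
print(np.sqrt(((a_N - mu_lam) ** 2).mean()), (sigma_lam / np.sqrt(N)).mean())
```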
3.13 Mean, Autocorrelation, Autocovariance of Random Functions of Time
Using the same notation as in the discussion following Eq. (3.2a) above, we write $\tilde n(t)$ to represent a random function $\tilde n$ of a nonrandom time $t$. As we already mentioned at the end of Sec. 3.2, $\tilde n(t)$ is often called a random or stochastic process. Having specified a random function—or stochastic process or random process—called $\tilde n(t)$, we know that for each time $t$ there is a random variable $\tilde n(t)$; and when there are two different time values $t_1$ and $t_2$ with $t_1 \neq t_2$, there is no reason to expect the random variables $\tilde n(t_1)$ and $\tilde n(t_2)$ to behave the same way.
We also know the behavior of random variables can be described by probability density distributions. Associated with any $N$ sequential random variables $\tilde n(t_1)$, $\tilde n(t_2)$, …, $\tilde n(t_N)$ specified by the time values $t_1 < t_2 < \cdots < t_N$ there is a probability density distribution

$$ p_{\tilde n(t_1)\tilde n(t_2)\cdots\tilde n(t_N)}(n_1, n_2, \ldots, n_N), $$

such that

$$ p_{\tilde n(t_1)\tilde n(t_2)\cdots\tilde n(t_N)}(n_1, n_2, \ldots, n_N)\, dn_1\, dn_2 \cdots dn_N $$

is the probability first that $\tilde n(t_1)$ takes on a value between $n_1$ and $n_1 + dn_1$, and then that $\tilde n(t_2)$ takes on a value between $n_2$ and $n_2 + dn_2$, and then that $\tilde n(t_3)$ takes on a value between $n_3$ and $n_3 + dn_3$, …, and then that $\tilde n(t_N)$ takes on a value between $n_N$ and $n_N + dn_N$. The expectation operator $\mathbf{E}$ has the same meaning as before: the expected or mean value of any function $f$ of the $N$ random variables $\tilde n(t_1)$, $\tilde n(t_2)$, …, $\tilde n(t_N)$ is


( ) ( )
1 2
1 2
1 2 1 2 ( ) ( ) ( ) 1 2
( ), ( ), , ( )
( , , , ) ( , , , ) .
N
N
N N n t n t n t N
f n t n t n t
dn dn dn f n n n p n n n
∞ ∞ ∞
−∞ −∞ −∞
=
³ ³ ³
"

" … …
E

(3.23a)


One of the most important expectation values associated with ñ occurs when we set 2 N = and
specify that
( )
1 2 1 2
( ), ( ), , ( ) ( ) ( )
N
f n t n t n t n t n t = ⋅ …

to get the autocorrelation function

( )
1 2
1 2 1 2 1 2 1 2 ( ) ( ) 1 2
( , ) ( ) ( ) [ ] ( , )
nn n t n t
R t t n t n t dn dn n n p n n
∞ ∞
−∞ −∞
= ⋅ =
³ ³

E . (3.23b)


Other important expectation values are the mean of $\tilde n$ as a function of time,

$$ \mu_{\tilde n(t)} = \mathbf{E}\big( \tilde n(t) \big) = \int_{-\infty}^{\infty} n\, p_{\tilde n(t)}(n)\, dn, \tag{3.23c} $$

and the autocovariance of $\tilde n$,


$$ \begin{aligned} C_{\tilde n\tilde n}(t_1, t_2) &= \mathbf{E}\big( [\tilde n(t_1) - \mu_{\tilde n(t_1)}][\tilde n(t_2) - \mu_{\tilde n(t_2)}] \big) \\ &= \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} dn_1\, dn_2\, (n_1 - \mu_{\tilde n(t_1)})(n_2 - \mu_{\tilde n(t_2)})\; p_{\tilde n(t_1)\tilde n(t_2)}(n_1, n_2). \end{aligned} \tag{3.23d} $$

Clearly, when $\mu_{\tilde n(t)} = 0$ for all $t$, we have

$$ R_{\tilde n\tilde n}(t_1, t_2) = C_{\tilde n\tilde n}(t_1, t_2). \tag{3.23e} $$

Almost always, the random functions used to represent noise in a physical system are specified in such a way that $\mu_{\tilde n(t)} = 0$, which means the distinction between the autocorrelation function and the autocovariance function becomes irrelevant.
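The ensemble averages defined in Eqs. (3.23b)–(3.23d) can be estimated by brute force: generate many member functions and average their products point by point. The sketch below assumes numpy; the two-coefficient sinusoidal ensemble is an arbitrary example, chosen because its autocorrelation is known in closed form:

```python
import numpy as np

rng = np.random.default_rng(seed=3)
t = np.linspace(0.0, 1.0, 101)                 # nonrandom time grid
members = 50_000                               # member functions drawn

# Example ensemble: n(t) = a*cos(2*pi*t) + b*sin(2*pi*t) with a, b
# independent standard normals, so E(n(t)) = 0 at every t.
a = rng.standard_normal((members, 1))
b = rng.standard_normal((members, 1))
n = a * np.cos(2 * np.pi * t) + b * np.sin(2 * np.pi * t)

# Ensemble estimate of R(t1, t2) = E(n(t1) n(t2)), Eq. (3.23b); for this
# ensemble the exact answer is cos(2*pi*(t2 - t1)).
i1, i2 = 10, 35
print((n[:, i1] * n[:, i2]).mean(), np.cos(2 * np.pi * (t[i2] - t[i1])))
```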
3.14 Ensembles
Just as random variables are often regarded as taking on one or another specific value chosen randomly from some collection of allowed nonrandom values, so too do we often think of random functions as becoming one or another specific, nonrandom function chosen randomly from a collection—or ensemble—of allowed nonrandom functions. We can visualize this situation by imagining an infinitely long row of biased and crooked slot machines, one for every value of $t$ on the time axis.^{27} The slot machines do not necessarily behave identically and they are wired together so that they can influence each other. When a slot machine's lever is pulled, there is never any jackpot; all that happens is that another number appears inside its window. Each time we simultaneously pull all the levers of the slot machines, we randomly choose another member of the ensemble of allowed functions. The probability $p_{\tilde n(t)}(n)\, dn$ that random variable $\tilde n(t)$ takes on a value between $n$ and $n + dn$ is just the probability that the slot machine at $t$ takes on a value between $n$ and $n + dn$, and it is also the probability that some member function randomly chosen from the ensemble of allowed functions has a value between $n$ and $n + dn$ at time $t$. In fact, we can say that

$$ p_{\tilde n(t_1)\tilde n(t_2)\cdots\tilde n(t_N)}(n_1, n_2, \ldots, n_N)\, dn_1\, dn_2 \cdots dn_N $$

is the probability, after the slot machine levers are pulled, that the slot machine at $t_1$ has a value between $n_1$ and $n_1 + dn_1$, that the slot machine at $t_2$ has a value between $n_2$ and $n_2 + dn_2$, …, and


^{27} An objection that could be raised here is that an infinite number of slot machines is only what is called countably infinite whereas the number of points on the time axis is uncountably infinite, a much "larger" type of infinity. For our purposes, the distinction between these two types of infinity is not important.
that the slot machine at $t_N$ has a value between $n_N$ and $n_N + dn_N$. It can also, of course, be thought of as the probability that a member function randomly chosen from the ensemble of allowed functions has values at times $t_1 < t_2 < \cdots < t_N$ that lie between $n_1$ and $n_1 + dn_1$, $n_2$ and $n_2 + dn_2$, …, $n_N$ and $n_N + dn_N$ respectively.
3.15 Stationary Random Functions
A random function $\tilde n(t)$ is strictly stationary,^{28} or strict-sense stationary,^{29} if all its statistical properties are unaffected when the origin of its time axis is changed (that is, when we change the point at which $t = 0$). Mathematically we require, for any $t_1 < t_2 < \cdots < t_N$, that the probability density distribution

$$ p_{\tilde n(t_1+\tau)\tilde n(t_2+\tau)\cdots\tilde n(t_N+\tau)}(n_1, n_2, \ldots, n_N) = p_{\tilde n(t_1)\tilde n(t_2)\cdots\tilde n(t_N)}(n_1, n_2, \ldots, n_N) \tag{3.24a} $$

for any value of $\tau$ and all $N = 1, 2, \ldots, \infty$. Thus, for any integrable function $f$ with $N$ arguments,

$$ \begin{aligned} &\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty}\!\!\cdots\!\!\int_{-\infty}^{\infty} dn_1\, dn_2 \cdots dn_N\; f(n_1, n_2, \ldots, n_N)\; p_{\tilde n(t_1)\tilde n(t_2)\cdots\tilde n(t_N)}(n_1, n_2, \ldots, n_N) \\ &\quad = \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty}\!\!\cdots\!\!\int_{-\infty}^{\infty} dn_1\, dn_2 \cdots dn_N\; f(n_1, n_2, \ldots, n_N)\; p_{\tilde n(t_1+\tau)\tilde n(t_2+\tau)\cdots\tilde n(t_N+\tau)}(n_1, n_2, \ldots, n_N), \end{aligned} \tag{3.24b} $$

where $t_1 < t_2 < \cdots < t_N$ and $N = 1, 2, \ldots, \infty$. This means that, according to Eq. (3.23a),

$$ \mathbf{E}\big( f(\tilde n(t_1), \tilde n(t_2), \ldots, \tilde n(t_N)) \big) = \mathbf{E}\big( f(\tilde n(t_1+\tau), \tilde n(t_2+\tau), \ldots, \tilde n(t_N+\tau)) \big) \tag{3.24c} $$

for any integrable function $f$, any value of $\tau$, and $N = 1, 2, \ldots, \infty$. We note that when Eq. (3.24c) holds true, $\mathbf{E}\big( f(\tilde n(t_1), \tilde n(t_2), \ldots, \tilde n(t_N)) \big)$ cannot depend on all the $N$ independent time values $t_1, t_2, \ldots, t_N$ as we might at first suppose. To see why this is so, we just set $\tau = -t_1$ in (3.24c) to get



^{28} Paul H. Wirsching, Thomas L. Paez, and Keith Ortiz, Random Vibrations: Theory and Practice (John Wiley and Sons, Inc., New York, 1995), p. 80.
^{29} Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, p. 297.

$$ \mathbf{E}\big( f(\tilde n(t_1), \tilde n(t_2), \ldots, \tilde n(t_N)) \big) = \mathbf{E}\big( f(\tilde n(0), \tilde n(t_2 - t_1), \tilde n(t_3 - t_1), \ldots, \tilde n(t_N - t_1)) \big). \tag{3.24d} $$

This shows that $\mathbf{E}\big( f(\tilde n(t_1), \tilde n(t_2), \ldots, \tilde n(t_N)) \big)$ must be a function of just the nonrandom time parameters $(t_2 - t_1)$, $(t_3 - t_1)$, …, $(t_N - t_1)$ and there are, of course, only $N - 1$ of these.
Equations (3.24b)–(3.24d) can be understood in terms of the following thought experiment. We randomly pick some function from the ensemble of allowed functions and choose $N$ time values $t_1 < t_2 < \cdots < t_N$. The randomly picked function has values $n_1, n_2, \ldots, n_N$ at times $t_1, t_2, \ldots, t_N$ respectively. Next, we create some nonrandom function $f$ that has $N$ arguments and is not one of those physically unreasonable abstractions that mathematicians specialize in. We calculate and store the value of $f(n_1, n_2, \ldots, n_N)$. Randomly choosing another function from the ensemble of allowed functions for $\tilde n(t)$, we again use $n_1, n_2, \ldots, n_N$ at $t_1, t_2, \ldots, t_N$ to calculate and store a new value of $f(n_1, n_2, \ldots, n_N)$. Repeating this procedure enough times to get a large collection of $f$ values, we average them all together to get a good estimate of

$$ \mathbf{E}\big( f(\tilde n(t_1), \tilde n(t_2), \ldots, \tilde n(t_N)) \big). $$

Shifting to a new set of time values $t_1 + \tau$, $t_2 + \tau$, …, $t_N + \tau$, we again generate another large collection of $f$ values, this time averaging them together to get a good estimate of

$$ \mathbf{E}\big( f(\tilde n(t_1+\tau), \tilde n(t_2+\tau), \ldots, \tilde n(t_N+\tau)) \big). $$

Since $\tilde n$ is strict-sense stationary, we know that no matter what the positive integer $N$ is, and no matter what the function $f$ is, and no matter what the value of $\tau$ is, both collections of $f$ values always have approximately the same average, with the difference between the averages becoming less and less as the collections of $f$ values get larger and larger.
To give an example of a random function $\tilde n(t)$ that is strict-sense stationary, we define

$$ \tilde n(t) = \tilde a \cos(\omega t) + \tilde b \sin(\omega t), \tag{3.25a} $$

where $\tilde a$ and $\tilde b$ obey a probability density distribution $p_{\tilde a\tilde b}(a, b)$ such that $p_{\tilde a\tilde b}(a, b)\, da\, db$ is the probability that $\tilde a$ takes on a value between $a$ and $a + da$ when $\tilde b$ takes on a value between $b$ and $b + db$. We can also, just as correctly, say that $p_{\tilde a\tilde b}(a, b)\, da\, db$ is the probability that $\tilde b$ takes on a value between $b$ and $b + db$ when $\tilde a$ takes on a value between $a$ and $a + da$. We next require

$$ p_{\tilde a\tilde b}(a, b) = p_{\tilde a\tilde b}\big( \sqrt{a^2 + b^2} \big). \tag{3.25b} $$

Equation (3.25b) says that $p_{\tilde a\tilde b}(a, b)$ is circularly symmetric because it depends on $a$ and $b$ only through $\sqrt{a^2 + b^2}$, the "radius length" of a point whose $x$ and $y$ coordinates are $a$, $b$. Returning to the slot-machine model for $\tilde n(t)$ explained in Sec. 3.14, we note that randomly choosing values for $\tilde a$ and $\tilde b$ is the same as simultaneously pulling the levers of all the slot machines representing $\tilde n(t)$ in Eq. (3.25a). Having pulled the levers and gotten, say, values $a_1$ for $\tilde a$ and $b_1$ for $\tilde b$, we then know that the number in the window of the slot machine located at time value $t_1$ is

$$ a_1 \cos(\omega t_1) + b_1 \sin(\omega t_1), $$

we know that the number in the window of the slot machine located at time value $t_2$ is

$$ a_1 \cos(\omega t_2) + b_1 \sin(\omega t_2), $$

and so on. If we pull all the levers again and get values $a_2$ for $\tilde a$ and $b_2$ for $\tilde b$, then we know that the slot machine at $t_1$ has a number

$$ a_2 \cos(\omega t_1) + b_2 \sin(\omega t_1), $$

we know the slot machine at $t_2$ has a number

$$ a_2 \cos(\omega t_2) + b_2 \sin(\omega t_2), $$

and so on. Because the probability density distribution $p_{\tilde a\tilde b}(a, b)$ completely determines the statistics of random variables $\tilde a$ and $\tilde b$, we see that it must also completely determine the statistics of $\tilde n(t)$ in Eq. (3.25a).
It is not difficult to show that $\tilde n(t)$ in Eq. (3.25a) is strict-sense stationary when $p_{\tilde a\tilde b}$ is circularly symmetric.^{30} Picking an arbitrary time interval $\tau$, we construct two new random variables

$$ \tilde A = \tilde a \cos(\omega\tau) + \tilde b \sin(\omega\tau) \tag{3.26a} $$

and

$$ \tilde B = \tilde b \cos(\omega\tau) - \tilde a \sin(\omega\tau). \tag{3.26b} $$

^{30} Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, p. 301.

The reverse transformation to Eqs. (3.26a) and (3.26b) is, of course,

$$ \tilde a = \tilde A \cos(\omega\tau) - \tilde B \sin(\omega\tau) \tag{3.26c} $$

and

$$ \tilde b = \tilde B \cos(\omega\tau) + \tilde A \sin(\omega\tau), \tag{3.26d} $$

which we can find by solving Eqs. (3.26a) and (3.26b) for $\tilde a$ and $\tilde b$ in terms of $\tilde A$ and $\tilde B$. Equations (3.26a) and (3.26b) state that if random variables $\tilde a$ and $\tilde b$ take on the values $a$ and $b$, then random variables $\tilde A$ and $\tilde B$ must take on the values

$$ a \cos(\omega\tau) + b \sin(\omega\tau) \quad\text{and}\quad b \cos(\omega\tau) - a \sin(\omega\tau) $$

respectively. Similarly Eqs. (3.26c) and (3.26d) state that if random variables $\tilde A$ and $\tilde B$ take on values $A$ and $B$, then random variables $\tilde a$ and $\tilde b$ must take on values

$$ A \cos(\omega\tau) - B \sin(\omega\tau) \quad\text{and}\quad B \cos(\omega\tau) + A \sin(\omega\tau) $$
respectively. Whenever there are two random variables $\tilde x$ and $\tilde y$ that have a probability density distribution $p_{\tilde x\tilde y}(x, y)$ and we use constants $\alpha_1$, $\alpha_2$, $\alpha_3$, and $\alpha_4$ to construct from $\tilde x$ and $\tilde y$ two new random variables

$$ \tilde z = \alpha_1 \tilde x + \alpha_2 \tilde y \tag{3.27a} $$

and

$$ \tilde w = \alpha_3 \tilde x + \alpha_4 \tilde y, \tag{3.27b} $$

then we can find the probability density distribution $p_{\tilde z\tilde w}$ for $\tilde z$ and $\tilde w$ by calculating the reverse transformation

$$ \tilde x = \beta_1 \tilde z + \beta_2 \tilde w \tag{3.27c} $$

and

$$ \tilde y = \beta_3 \tilde z + \beta_4 \tilde w, \tag{3.27d} $$

and requiring that^{31}

$$ p_{\tilde z\tilde w}(z, w) = \frac{1}{\lvert \alpha_1\alpha_4 - \alpha_2\alpha_3 \rvert}\; p_{\tilde x\tilde y}(\beta_1 z + \beta_2 w,\; \beta_3 z + \beta_4 w). \tag{3.27e} $$

^{31} Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, p. 144.

Comparing Eqs. (3.26a)–(3.26d) to Eqs. (3.27a)–(3.27d), we see that

$$ \alpha_1 = \cos(\omega\tau), \quad \alpha_2 = \sin(\omega\tau), \quad \alpha_3 = -\sin(\omega\tau), \quad \alpha_4 = \cos(\omega\tau) $$

and

$$ \beta_1 = \cos(\omega\tau), \quad \beta_2 = -\sin(\omega\tau), \quad \beta_3 = \sin(\omega\tau), \quad \beta_4 = \cos(\omega\tau). $$

Consequently,

$$ \alpha_1\alpha_4 - \alpha_2\alpha_3 = \cos^2(\omega\tau) + \sin^2(\omega\tau) = 1, $$

and so the probability density distribution of $\tilde A$ and $\tilde B$ must be

$$ p_{\tilde A\tilde B}(A, B) = p_{\tilde a\tilde b}\big( A\cos(\omega\tau) - B\sin(\omega\tau),\; A\sin(\omega\tau) + B\cos(\omega\tau) \big). \tag{3.28a} $$


Since $p_{\tilde a\tilde b}$ is circularly symmetric, obeying Eq. (3.25b), this becomes

$$ \begin{aligned} p_{\tilde A\tilde B}(A, B) &= p_{\tilde a\tilde b}\Big( \big[ A^2\cos^2(\omega\tau) + B^2\sin^2(\omega\tau) - 2AB\sin(\omega\tau)\cos(\omega\tau) \\ &\qquad\qquad + A^2\sin^2(\omega\tau) + B^2\cos^2(\omega\tau) + 2AB\sin(\omega\tau)\cos(\omega\tau) \big]^{1/2} \Big) \\ &= p_{\tilde a\tilde b}\Big( \sqrt{ A^2\big[ \cos^2(\omega\tau) + \sin^2(\omega\tau) \big] + B^2\big[ \cos^2(\omega\tau) + \sin^2(\omega\tau) \big] } \Big) \\ &= p_{\tilde a\tilde b}\big( \sqrt{A^2 + B^2} \big). \end{aligned} $$

From Eqs. (3.26c) and (3.26d), we know that, whenever $\tilde A$ and $\tilde B$ take on the values $A$ and $B$, $\tilde a$ and $\tilde b$ must then take on the values

$$ A\cos(\omega\tau) - B\sin(\omega\tau) \quad\text{and}\quad B\cos(\omega\tau) + A\sin(\omega\tau). $$

Hence,

$$ a^2 + b^2 = \big[ A\cos(\omega\tau) - B\sin(\omega\tau) \big]^2 + \big[ B\cos(\omega\tau) + A\sin(\omega\tau) \big]^2 = A^2 + B^2, $$

so that

$$ p_{\tilde A\tilde B}(A, B) = p_{\tilde a\tilde b}\big( \sqrt{a^2 + b^2} \big) = p_{\tilde a\tilde b}(a, b), $$

where Eq. (3.25b) is reversed to make the last step in this equality. We have now shown that Eq. (3.28a) can be written as

$$ p_{\tilde A\tilde B}(A, B) = p_{\tilde a\tilde b}(a, b), \tag{3.28b} $$

where the equal probability densities do not depend on $\omega\tau$, because $p_{\tilde a\tilde b}$ is circularly symmetric.
Equation (3.28b) is a very restrictive statement applied to random variables $\tilde A$ and $\tilde B$ because it requires $\tilde A$ and $\tilde B$ to obey exactly the same statistics as $\tilde a$ and $\tilde b$. Consequently, we can set up a random function

$$ \tilde N(t) = \tilde A \cos(\omega t) + \tilde B \sin(\omega t) \tag{3.29a} $$

and know that it has exactly the same random behavior as $\tilde n(t)$ in Eq. (3.25a). Substituting Eqs. (3.26a) and (3.26b) into (3.29a) gives

$$ \begin{aligned} \tilde N(t) &= [\tilde a \cos(\omega\tau) + \tilde b \sin(\omega\tau)]\cos(\omega t) + [\tilde b \cos(\omega\tau) - \tilde a \sin(\omega\tau)]\sin(\omega t) \\ &= \tilde a \cos\big( \omega(t+\tau) \big) + \tilde b \sin\big( \omega(t+\tau) \big). \end{aligned} \tag{3.29b} $$

According to Eq. (3.25a), this is the same as writing

$$ \tilde N(t) = \tilde n(t + \tau). \tag{3.29c} $$

This means that not only does $\tilde N(t)$ have the same random behavior as $\tilde n(t)$, it also has the same random behavior as $\tilde n(t+\tau)$. Consequently, $\tilde n(t)$ and $\tilde n(t+\tau)$ must both have the same random behavior. We have made no assumptions about the value of $\tau$; hence, Eq. (3.29c) holds true for any $\tau$ value. We have therefore demonstrated that

$$ \tilde n(t) = \tilde a \cos(\omega t) + \tilde b \sin(\omega t) $$

is strict-sense stationary when the probability density distribution $p_{\tilde a\tilde b}$ is circularly symmetric with

$$ p_{\tilde a\tilde b}(a, b) = p_{\tilde a\tilde b}\big( \sqrt{a^2 + b^2} \big). $$
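This invariance is easy to watch numerically. In the sketch below (numpy assumed), $\tilde a$ and $\tilde b$ are taken to be independent standard normals, one convenient circularly symmetric choice since their joint density depends only on $a^2 + b^2$, and the rotated pair $(\tilde A, \tilde B)$ of Eqs. (3.26a) and (3.26b) is seen to obey the same statistics:

```python
import numpy as np

rng = np.random.default_rng(seed=4)
a = rng.standard_normal(500_000)   # independent standard normals: one
b = rng.standard_normal(500_000)   # circularly symmetric choice of p_ab

wt = 0.7                           # an arbitrary value of omega*tau
A = a * np.cos(wt) + b * np.sin(wt)        # Eq. (3.26a)
B = b * np.cos(wt) - a * np.sin(wt)        # Eq. (3.26b)

# (A, B) should obey the same statistics as (a, b): compare sample moments.
print(a.mean(), A.mean(), a.var(), A.var())
print(b.mean(), B.mean(), b.var(), B.var())
print((a * b).mean(), (A * B).mean())      # both near zero
```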

A random function $\tilde n(t)$ is called wide-sense stationary^{32} when

$$ \mathbf{E}\big( \tilde n(t) \big) = \mu_{\tilde n} = \text{the same finite constant for all values of } t \tag{3.30a} $$

and

$$ \mathbf{E}\big( \tilde n(t_1)\,\tilde n(t_2) \big) = R_{\tilde n\tilde n}(t_2 - t_1). \tag{3.30b} $$

Other terms applied to random functions $\tilde n(t)$ that satisfy these two restrictions are weakly stationary or covariance stationary.^{33} Equation (3.30a) requires the average value of $\tilde n(t)$ to be finite and independent of time. We call this average $\mu_{\tilde n}$ instead of $\mu_{\tilde n(t)}$ as in Eq. (3.23c) to emphasize that it does not depend on time. Equation (3.30b) requires the autocorrelation function $R_{\tilde n\tilde n}(t_1, t_2)$ defined in Eq. (3.23b) to depend only on $(t_2 - t_1)$, the difference between times $t_2$ and $t_1$. Glancing back at the definition of $C_{\tilde n\tilde n}(t_1, t_2)$ in Eq. (3.23d), we see that when Eqs. (3.30a) and (3.30b) are satisfied,


$$ \begin{aligned} C_{\tilde n\tilde n}(t_1, t_2) &= \mathbf{E}\big( [\tilde n(t_1) - \mu_{\tilde n}][\tilde n(t_2) - \mu_{\tilde n}] \big) \\ &= \mathbf{E}\big( \tilde n(t_1)\tilde n(t_2) - \mu_{\tilde n}\tilde n(t_1) - \mu_{\tilde n}\tilde n(t_2) + \mu_{\tilde n}^2 \big) \\ &= \mathbf{E}\big( \tilde n(t_1)\tilde n(t_2) \big) - \mu_{\tilde n}\,\mathbf{E}\big( \tilde n(t_1) \big) - \mu_{\tilde n}\,\mathbf{E}\big( \tilde n(t_2) \big) + \mu_{\tilde n}^2. \end{aligned} $$

The last step uses the linearity of the expectation operator (see Sec. 3.10 above) and Eq. (3.9f). Consequently, the formula for $C_{\tilde n\tilde n}$ becomes, using Eqs. (3.30a) and (3.30b),

$$ C_{\tilde n\tilde n}(t_1, t_2) = R_{\tilde n\tilde n}(t_2 - t_1) - \mu_{\tilde n}^2. \tag{3.30c} $$

This result shows that the autocovariance $C_{\tilde n\tilde n}(t_1, t_2)$ of random functions that are wide-sense stationary also depends only on $(t_2 - t_1)$, the difference between times $t_2$ and $t_1$. We note that random functions that are wide-sense stationary need not be strict-sense stationary, but random functions that are strict-sense stationary must also be wide-sense stationary. For future use, we note that two random functions $\tilde n_\alpha(t)$ and $\tilde n_\beta(t)$ are defined to be jointly wide-sense stationary^{34} when each one is itself wide-sense stationary and when

$$ \mathbf{E}\big( \tilde n_\alpha(t_1)\,\tilde n_\beta(t_2) \big) = R_{\tilde n_\alpha\tilde n_\beta}(t_2 - t_1), \tag{3.30d} $$

which is called their cross-correlation function, depends only on the difference between times $t_1$ and $t_2$.

^{32} Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, p. 298.
^{33} T. T. Soong, Random Differential Equations in Science and Engineering (Academic Press, New York, 1973), p. 43.
Returning to the $\tilde n(t)$ defined in Eq. (3.25a) above,

$$ \tilde n(t) = \tilde a \cos(\omega t) + \tilde b \sin(\omega t), $$

we stop assuming that $p_{\tilde a\tilde b}(a, b)$ is circularly symmetric and examine the weaker conditions that must be put on random variables $\tilde a$ and $\tilde b$ to make $\tilde n$ wide-sense stationary.^{35} The expectation value of $\tilde n(t)$ must be time independent, so by the linearity of the expectation operator

$$ \mathbf{E}\big( \tilde n(t) \big) = \mathbf{E}(\tilde a)\cos(\omega t) + \mathbf{E}(\tilde b)\sin(\omega t). $$

Hence, for $\mathbf{E}\big( \tilde n(t) \big)$ to obey Eq. (3.30a) and so be time independent, we must have

$$ \mathbf{E}(\tilde a) = 0 \tag{3.31a} $$

and

$$ \mathbf{E}(\tilde b) = 0. \tag{3.31b} $$

These are the first two restrictions that must be placed on $\tilde a$ and $\tilde b$ for $\tilde n(t)$ to be wide-sense stationary. We also know from Eq. (3.30b) that $R_{\tilde n\tilde n}$ must have the same value whenever $t_2 - t_1 = 0$ or $t_2 = t_1$, so (remember that nothing has been said about what the value of time $t_2 = t_1$ is)

$$ \mathbf{E}\big( \tilde n(t_3)\,\tilde n(t_3) \big) = \mathbf{E}\big( \tilde n(t_4)\,\tilde n(t_4) \big) $$

must hold true for all values of $t_3$ and $t_4$. In particular, this must hold true when $t_3 = 0$ and $t_4 = \pi/(2\omega)$. But from Eq. (3.25a)

$$ \tilde n(0) = \tilde a \quad\text{and}\quad \tilde n\big( \pi/(2\omega) \big) = \tilde b, $$

so it must be true that

$$ \mathbf{E}(\tilde a^2) = \mathbf{E}(\tilde b^2). \tag{3.31c} $$

^{34} Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, p. 299.
^{35} This treatment is taken from Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, p. 300.

This is the third restriction that must be placed on $\tilde a$ and $\tilde b$ for $\tilde n(t)$ to be wide-sense stationary. To find the fourth and last restriction, we evaluate the left-hand side of Eq. (3.30b) for $t_1 \neq t_2$, using (3.25a) and the linearity of the expectation operator (see Sec. 3.10) to get

$$ \begin{aligned} \mathbf{E}\big( \tilde n(t_1)\tilde n(t_2) \big) &= \mathbf{E}\big( [\tilde a\cos(\omega t_1) + \tilde b\sin(\omega t_1)][\tilde a\cos(\omega t_2) + \tilde b\sin(\omega t_2)] \big) \\ &= \mathbf{E}\big( \tilde a^2\cos(\omega t_1)\cos(\omega t_2) + \tilde a\tilde b\cos(\omega t_1)\sin(\omega t_2) \\ &\qquad + \tilde a\tilde b\cos(\omega t_2)\sin(\omega t_1) + \tilde b^2\sin(\omega t_1)\sin(\omega t_2) \big) \\ &= \mathbf{E}(\tilde a^2)\,[\cos(\omega t_1)\cos(\omega t_2)] + \mathbf{E}(\tilde b^2)\,[\sin(\omega t_1)\sin(\omega t_2)] \\ &\qquad + \mathbf{E}(\tilde a\tilde b)\,[\cos(\omega t_1)\sin(\omega t_2) + \cos(\omega t_2)\sin(\omega t_1)]. \end{aligned} $$

This becomes, using $\mathbf{E}(\tilde a^2) = \mathbf{E}(\tilde b^2)$ from Eq. (3.31c),

$$ \mathbf{E}\big( \tilde n(t_1)\tilde n(t_2) \big) = \mathbf{E}(\tilde a^2)\cdot\cos\big( \omega(t_2 - t_1) \big) + \mathbf{E}(\tilde a\tilde b)\cdot\sin\big( \omega(t_1 + t_2) \big). \tag{3.31d} $$

The first term on the right-hand side of (3.31d) depends only on $(t_2 - t_1)$, which is what Eq. (3.30b) requires, but the second term on the right-hand side does not. Therefore, the last restriction on random variables $\tilde a$ and $\tilde b$ is

$$ \mathbf{E}(\tilde a\tilde b) = 0. \tag{3.31e} $$

Equations (3.31a), (3.31b), (3.31c), and (3.31e) list all the restrictions on random variables $\tilde a$ and $\tilde b$ needed to ensure that $\tilde n(t)$ in Eq. (3.25a) is a wide-sense stationary random function.

If $\tilde a$ and $\tilde b$ are independent random variables that obey the same probability density distribution, and this probability density distribution assigns a mean value of zero to random variables obeying it, then Eqs. (3.31a)–(3.31c) are automatically satisfied and, since $\tilde a$ and $\tilde b$ are independent, Eqs. (3.31a) and (3.31b) show that (3.31e) is also satisfied:

$$ \mathbf{E}(\tilde a\tilde b) = \mathbf{E}(\tilde a)\cdot\mathbf{E}(\tilde b) = 0 \cdot 0 = 0. $$

This is sufficient to make $\tilde n(t)$ wide-sense stationary, but there are other ways to do the job. We can, for example, set $\tilde a = \tilde u$ and $\tilde b = \tilde v$ where $\tilde u$ and $\tilde v$ are the random variables defined in Eqs. (3.15b) and (3.15c) above. Equations (3.15d) and (3.15e) then show that Eqs. (3.31a) and (3.31b) are satisfied, and Eq. (3.15f) shows that (3.31e) is satisfied. The only requirement left is (3.31c), which can be checked now by writing

$$ \mathbf{E}(\tilde a^2) = \mathbf{E}(\tilde u^2) = \frac{1}{2\pi}\int_0^{2\pi} \sin^2\phi\; d\phi = \frac{1}{2} \tag{3.32a} $$

and

$$ \mathbf{E}(\tilde b^2) = \mathbf{E}(\tilde v^2) = \frac{1}{2\pi}\int_0^{2\pi} \cos^2\phi\; d\phi = \frac{1}{2}. \tag{3.32b} $$

Clearly, Eq. (3.31c) is also satisfied. We conclude that even though $\tilde a = \tilde u$ and $\tilde b = \tilde v$ are not, as is pointed out in the discussion following Eq. (3.15f), independent random variables, the random function $\tilde n(t)$ in Eq. (3.25a) is still wide-sense stationary. Note that Eqs. (3.15b) and (3.15c) can now be used to write $\tilde n(t)$ as

$$ \tilde n(t) = \sin(\tilde\phi)\cos(\omega t) + \cos(\tilde\phi)\sin(\omega t) = \sin(\omega t + \tilde\phi). \tag{3.32c} $$

In (3.32c), random variable $\tilde\phi$ can, according to Eq. (3.15a), be regarded as a random phase equally likely to take on any value between zero and $2\pi$. Adding this sort of random phase to the argument of a sinusoidal oscillation always produces a wide-sense stationary random function.
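A short numerical check of this last claim (a sketch assuming numpy, with $\omega = 2\pi$ chosen arbitrarily) shows the mean of $\sin(\omega t + \tilde\phi)$ staying at zero and the product $\mathbf{E}\big( \tilde n(t_1)\tilde n(t_2) \big)$ depending only on $t_2 - t_1$:

```python
import numpy as np

rng = np.random.default_rng(seed=5)
omega, members = 2 * np.pi, 200_000
phi = rng.uniform(0.0, 2 * np.pi, size=(members, 1))  # random phase, Eq. (3.32c)

def n(t):
    """Ensemble of random-phase sinusoids evaluated at the single time t."""
    return np.sin(omega * t + phi)

# Two pairs of times with the same separation tau = 0.25: the means stay
# near zero and both products approach 0.5*cos(omega*tau), independent of t1.
for t1, t2 in ((0.10, 0.35), (0.60, 0.85)):
    print(n(t1).mean(), (n(t1) * n(t2)).mean())
print(0.5 * np.cos(omega * 0.25))
```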
3.16 Gaussian Random Processes
A random function $\tilde n(t)$ is called a Gaussian random process or normal process when for any $N$ time values $t_1 < t_2 < \cdots < t_N$ the random variables $\tilde n(t_1)$, $\tilde n(t_2)$, …, $\tilde n(t_N)$ obey a probability density distribution

$$ p_{\tilde n(t_1)\tilde n(t_2)\cdots\tilde n(t_N)}(n_1, n_2, \ldots, n_N), $$

which is multivariate Gaussian. To write this multivariate Gaussian in a reasonably compact form, we define the vectors

$$ \vec n = (n_1, n_2, \ldots, n_N), \tag{3.33a} $$

$$ \vec{\tilde n}(\vec t\,) = \big( \tilde n(t_1), \tilde n(t_2), \ldots, \tilde n(t_N) \big), \tag{3.33b} $$

and

$$ \vec\mu_{\vec{\tilde n}(\vec t\,)} = \mathbf{E}\big( \vec{\tilde n}(\vec t\,) \big) = \Big( \mathbf{E}\big( \tilde n(t_1) \big),\; \mathbf{E}\big( \tilde n(t_2) \big),\; \ldots,\; \mathbf{E}\big( \tilde n(t_N) \big) \Big). \tag{3.33c} $$

Glancing back at Eq. (3.23c), we remember that $\mu_{\tilde n(t)}$ is the expected or mean value of the random variable $\tilde n(t)$, so Eq. (3.33c) can also be written as

$$ \vec\mu_{\vec{\tilde n}(\vec t\,)} = \big( \mu_{\tilde n(t_1)}, \mu_{\tilde n(t_2)}, \ldots, \mu_{\tilde n(t_N)} \big). \tag{3.33d} $$

We define the covariance matrix $\mathbf{C}$ to be the $N \times N$ square matrix whose $i,j$th element is given by

$$ (\mathbf{C})_{ij} = \mathbf{E}\big( [\tilde n(t_i) - \mu_{\tilde n(t_i)}][\tilde n(t_j) - \mu_{\tilde n(t_j)}] \big). \tag{3.33e} $$

Equation (3.14c) reminds us that $(\mathbf{C})_{ij}$ is measuring the covariance of the two random variables $\tilde n(t_i)$ and $\tilde n(t_j)$. A $T$ superscript applied to a matrix or vector specifies the transpose of that matrix or vector; so, for example,

$$ \vec n^{\,T} = \begin{pmatrix} n_1 \\ n_2 \\ \vdots \\ n_N \end{pmatrix}. $$
Now the multivariate Gaussian distribution $p_{\tilde n(t_1)\tilde n(t_2)\cdots\tilde n(t_N)}(n_1, n_2, \ldots, n_N)$ can be written as

$$ p_{\vec{\tilde n}(\vec t\,)}(\vec n\,) = (2\pi)^{-N/2}\, [\det(\mathbf{C})]^{-1/2} \exp\!\left( -\frac{1}{2}\big( \vec n - \vec\mu_{\vec{\tilde n}(\vec t\,)} \big) \cdot \mathbf{C}^{-1} \cdot \big( \vec n - \vec\mu_{\vec{\tilde n}(\vec t\,)} \big)^T \right). \tag{3.33f} $$

In this formula, $\det(\mathbf{C})$ stands for the determinant of $\mathbf{C}$, and $\mathbf{C}^{-1}$ is the inverse matrix of $\mathbf{C}$. Nothing said so far about Gaussian random processes requires them to be stationary in any sense of the term, and in fact not all Gaussian random processes are stationary. They are often good models for the noise found in mechanical processes and electrical signals. Perhaps the most interesting thing about them, however, is that it can be shown that if they are wide-sense stationary, then they are also strict-sense stationary.^{36,37}
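Samples of the vector $\big( \tilde n(t_1), \ldots, \tilde n(t_N) \big)$ obeying Eq. (3.33f) are commonly generated from the Cholesky factor of the covariance matrix. The sketch below assumes numpy, and the exponential covariance model $(\mathbf{C})_{ij} = e^{-|t_i - t_j|}$ is an arbitrary stationary example rather than anything prescribed by the text:

```python
import numpy as np

rng = np.random.default_rng(seed=6)
t = np.array([0.0, 0.5, 1.0, 1.5])            # the N chosen time values

C = np.exp(-np.abs(t[:, None] - t[None, :]))  # assumed covariance matrix
mu = np.zeros(t.size)                         # zero means vector

# If z is a vector of independent standard normals and C = L L^T, then
# mu + L z has mean mu and covariance C, i.e., it obeys Eq. (3.33f).
L = np.linalg.cholesky(C)
samples = mu + rng.standard_normal((100_000, t.size)) @ L.T

print(np.cov(samples, rowvar=False))          # reproduces the matrix C
```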

3.17 Products of Two, Three, and Four Jointly Normal Random Variables
Random variables such as $\tilde n(t_1)$, $\tilde n(t_2)$, …, $\tilde n(t_N)$ that obey a multivariate Gaussian distribution such as the one in Eq. (3.33f) are often called jointly normal random variables.^{38} There are a number of useful product identities that apply to groups of two, three, and four jointly normal random variables. Since the derivation of these identities does not involve $t$, our notation can be simplified by writing

$$ \tilde n(t_1) = \tilde n_1, \quad \tilde n(t_2) = \tilde n_2, \quad \text{etc.} $$

Each random variable is also assumed to have a mean of zero:

$$ \mu_{\tilde n_1} = 0, \quad \mu_{\tilde n_2} = 0, \quad \text{etc.} $$

We start by specifying three jointly normal, zero-mean random variables $\tilde n_1$, $\tilde n_2$, and $\tilde n_3$. Consulting Eq. (3.33f) above, we note that the jointly normal probability density function for $\tilde n_1$, $\tilde n_2$, and $\tilde n_3$ can be written as, by expanding the matrix product in the exponent after setting the means vector $\vec\mu$ to zero,

$$ p_{\tilde n_1\tilde n_2\tilde n_3}(n_1, n_2, n_3) = K\, e^{-\sum_{j=1}^3\sum_{k=1}^3 \alpha_{jk}\, n_j n_k} \tag{3.34a} $$

for real constants $K$ and $\alpha_{jk}$ (with $j, k = 1, 2, 3$). Note that these three random variables can be either independent or dependent random variables and still obey the probability density distribution in (3.34a). The expected value of the triple product $\tilde n_1\tilde n_2\tilde n_3$ is [applying Eq. (3.12a) above]


$$ \mathbf{E}(\tilde n_1\tilde n_2\tilde n_3) = K \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} dn_1\, dn_2\, dn_3\; (n_1 n_2 n_3)\; e^{-\sum_{j=1}^3\sum_{k=1}^3 \alpha_{jk}\, n_j n_k}. \tag{3.34b} $$

^{36} Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, p. 300.
^{37} Paul H. Wirsching et al., Random Vibrations: Theory and Practice, p. 83.
^{38} Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, p. 197.

Changing the dummy variables of integration to

$$ u_1 = -n_1, \quad u_2 = -n_2, \quad u_3 = -n_3 $$

gives

$$ \mathbf{E}(\tilde n_1\tilde n_2\tilde n_3) = K \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} du_1\, du_2\, du_3\; (-u_1)(-u_2)(-u_3)\; e^{-\sum_{j=1}^3\sum_{k=1}^3 \alpha_{jk}(-u_j)(-u_k)} $$

or

$$ \mathbf{E}(\tilde n_1\tilde n_2\tilde n_3) = -K \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} du_1\, du_2\, du_3\; (u_1 u_2 u_3)\; e^{-\sum_{j=1}^3\sum_{k=1}^3 \alpha_{jk}\, u_j u_k}. \tag{3.34c} $$

Comparing the right-hand sides of (3.34b) and (3.34c) shows that

$$ \mathbf{E}(\tilde n_1\tilde n_2\tilde n_3) = -\mathbf{E}(\tilde n_1\tilde n_2\tilde n_3). $$

The only number that is equal to $(-1)$ times itself is zero, so we conclude that

$$ \mathbf{E}(\tilde n_1\tilde n_2\tilde n_3) = 0 \tag{3.34d} $$

for any three distinct, jointly normal, and zero-mean random variables.

When $\tilde n_1$, $\tilde n_2$, and $\tilde n_3$ are not three distinct random variables—or, what amounts to the same thing, two or more are perfectly correlated—we can redo the analysis to see what happens.

If two of the three random variables $\tilde n_1$, $\tilde n_2$, and $\tilde n_3$ are perfectly correlated, there are really only two distinct, jointly normal, zero-mean random variables that we call $\tilde n_1$ and $\tilde n_2$. Their multivariate probability density distribution can be written as


$$ p_{\tilde n_1\tilde n_2}(n_1, n_2) = K\, e^{-\sum_{j=1}^2\sum_{k=1}^2 \alpha_{jk}\, n_j n_k} $$

for real constants $K$ and $\alpha_{jk}$ (with $j, k = 1, 2$). If necessary, we renumber the random variables so that $\tilde n_2$ represents the two perfectly correlated random variables that used to be distinct. Equation (3.34b) now simplifies to

$$ \mathbf{E}(\tilde n_1\tilde n_2^2) = K \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} dn_1\, dn_2\; (n_1 n_2^2)\; e^{-\sum_{j=1}^2\sum_{k=1}^2 \alpha_{jk}\, n_j n_k}. \tag{3.35a} $$

Again the dummy variables of integration are changed, this time to

$$ u_1 = -n_1 \quad\text{and}\quad u_2 = -n_2, $$

which gives

$$ \mathbf{E}(\tilde n_1\tilde n_2^2) = K \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} du_1\, du_2\; (-u_1)(-u_2)^2\; e^{-\sum_{j=1}^2\sum_{k=1}^2 \alpha_{jk}(-u_j)(-u_k)} $$

or

$$ \mathbf{E}(\tilde n_1\tilde n_2^2) = -K \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} du_1\, du_2\; (u_1 u_2^2)\; e^{-\sum_{j=1}^2\sum_{k=1}^2 \alpha_{jk}\, u_j u_k}. \tag{3.35b} $$
E . (3.35b)

Comparing the right-hand sides of (3.35a) and (3.35b) shows that


2 2
1 2 1 2
( ) ( ) n n n n = − E E ,

so using the same reasoning as before—that only zero can be equal to (í1) times itself—we get


2
1 2
( ) 0 n n = E . (3.35c)

Hence, Eq. (3.34d) still holds true when any two of the jointly normal, zero-mean random
variables
1
n ,
2
n ,
3
n are perfectly correlated.
When all three of these random variables are perfectly correlated, there is really just one zero-mean random variable $\tilde n_1$ obeying the normal probability distribution [see Eq. (3.6a) above],

$$ p_{\tilde n_1}(n_1) = \frac{1}{\sigma_{\tilde n_1}\sqrt{2\pi}}\; e^{-\frac{n_1^2}{2\sigma_{\tilde n_1}^2}}. $$

The left-hand side of (3.34d) now becomes $\mathbf{E}(\tilde n_1^3)$, which satisfies the formula

$$ \mathbf{E}(\tilde n_1^3) = \frac{1}{\sigma_{\tilde n_1}\sqrt{2\pi}} \int_{-\infty}^{\infty} n_1^3\; e^{-\frac{n_1^2}{2\sigma_{\tilde n_1}^2}}\, dn_1. \tag{3.36a} $$

Since this is the integral between $-\infty$ and $+\infty$ of an odd function, it must be zero [see Eq. (2.17) in Chapter 2]. Consequently,

$$ \mathbf{E}(\tilde n_1^3) = 0 \tag{3.36b} $$

for any zero-mean, normally distributed random variable $\tilde n_1$. We conclude that Eq. (3.34d) holds for any three jointly normal and zero-mean random variables even if they are not distinct.
To construct a formula for $\mathbf{E}(\tilde n_1\tilde n_2\tilde n_3\tilde n_4)$ for four zero-mean, jointly normal random variables $\tilde n_1$, $\tilde n_2$, $\tilde n_3$, $\tilde n_4$, we construct a new random variable,

$$ \tilde w = \omega_1\tilde n_1 + \omega_2\tilde n_2 + \omega_3\tilde n_3 + \omega_4\tilde n_4 = \sum_{j=1}^4 \omega_j \tilde n_j. \tag{3.37a} $$

There is no requirement that $\tilde n_1$, $\tilde n_2$, $\tilde n_3$, and $\tilde n_4$ be distinct random variables, but we do assume that the real parameters $\omega_1$, $\omega_2$, $\omega_3$, and $\omega_4$ can independently take on any value between $-\infty$ and $+\infty$. Since $\tilde n_1$, $\tilde n_2$, $\tilde n_3$, and $\tilde n_4$ are jointly normal, $\tilde w$ is also a normal variable.^{39} Using the linearity of the expectation operator with respect to random variables (see Sec. 3.10 above) and remembering that $\tilde n_1$, $\tilde n_2$, $\tilde n_3$, and $\tilde n_4$ are zero mean, we have

$$ \mathbf{E}(\tilde w) = \mathbf{E}\!\left( \sum_{j=1}^4 \omega_j \tilde n_j \right) = \sum_{j=1}^4 \omega_j\, \mathbf{E}(\tilde n_j) = 0, \tag{3.37b} $$

showing that $\tilde w$ is also zero-mean. For future use we note, applying (3.37b) to Eq. (3.8e), that the variance of $\tilde w$ is

$$ v_{\tilde w} = \mathbf{E}(\tilde w^2) = \mathbf{E}\!\left( \left( \sum_{j=1}^4 \omega_j \tilde n_j \right)\!\left( \sum_{k=1}^4 \omega_k \tilde n_k \right) \right) = \mathbf{E}\!\left( \sum_{j=1}^4\sum_{k=1}^4 \omega_j\omega_k\, \tilde n_j\tilde n_k \right), $$

which can also be written as, recognizing that [according to Eq. (3.5c)] the variance $v_{\tilde w}$ is the square of the standard deviation $\sigma_{\tilde w}$ of $\tilde w$,

$$ \sigma_{\tilde w}^2 = \sum_{j=1}^4\sum_{k=1}^4 \omega_j\omega_k\, \mathbf{E}(\tilde n_j\tilde n_k). \tag{3.37c} $$

^{39} This analysis is an expanded version of a treatment given in Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, pp. 197–198.
The characteristic function of $\tilde w$ is [see Eqs. (3.9a) and (3.9b) above]

$$ \mathbf{E}\big( e^{-2\pi i\nu\tilde w} \big) = \int_{-\infty}^{\infty} p_{\tilde w}(w)\, e^{-2\pi i\nu w}\, dw, $$

where $p_{\tilde w}(w)$ is the probability density distribution of random variable $\tilde w$. Since $\tilde w$ obeys a zero-mean normal distribution [defined in Eq. (3.6a)], this becomes

$$ \mathbf{E}\big( e^{-2\pi i\nu\tilde w} \big) = \frac{1}{\sigma_{\tilde w}\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-2\pi i\nu w}\; e^{-\frac{w^2}{2\sigma_{\tilde w}^2}}\, dw. \tag{3.38a} $$

Substituting the identity $e^{i\phi} = \cos\phi + i\sin\phi$ into (3.38a) gives

$$ \mathbf{E}\big( e^{-2\pi i\nu\tilde w} \big) = \frac{1}{\sigma_{\tilde w}\sqrt{2\pi}} \int_{-\infty}^{\infty} \cos(2\pi\nu w)\; e^{-\frac{w^2}{2\sigma_{\tilde w}^2}}\, dw - \frac{i}{\sigma_{\tilde w}\sqrt{2\pi}} \int_{-\infty}^{\infty} \sin(2\pi\nu w)\; e^{-\frac{w^2}{2\sigma_{\tilde w}^2}}\, dw. $$


When we replace $w$ by $-w$ in

$$ Y(w) = \sin(2\pi\nu w)\; e^{-\frac{w^2}{2\sigma_{\tilde w}^2}}, $$

we see that

$$ Y(-w) = \sin(-2\pi\nu w)\; e^{-\frac{(-w)^2}{2\sigma_{\tilde w}^2}} = -\sin(2\pi\nu w)\; e^{-\frac{w^2}{2\sigma_{\tilde w}^2}} = -Y(w), $$

showing that $Y$ is an odd function. Hence, according to Eq. (2.17) in Chapter 2, its integral between $-\infty$ and $+\infty$ is zero. The formula for $\mathbf{E}\big( e^{-2\pi i\nu\tilde w} \big)$ must then reduce to

$$ \mathbf{E}\big( e^{-2\pi i\nu\tilde w} \big) = \frac{1}{\sigma_{\tilde w}\sqrt{2\pi}} \int_{-\infty}^{\infty} \cos(2\pi\nu w)\; e^{-\frac{w^2}{2\sigma_{\tilde w}^2}}\, dw. \tag{3.38b} $$

A table of integrals^{40} shows that, for any two real parameters $a$ and $b$,

$$ \int_0^{\infty} e^{-a^2x^2} \cos(bx)\, dx = \frac{\sqrt{\pi}}{2a}\; e^{-\frac{b^2}{4a^2}}. $$

^{40} Formula 679 of the Handbook of Chemistry and Physics, edited by Robert C. Weast, 51st ed. (The Chemical Rubber Company, Cleveland, OH, 1970–1971), p. A-215.

=
³
.
Setting

2 2
( ) cos( )
a x
Z x bx e

= ,

we note that Z is an even function because


2 2 2 2
( ) ( )
( ) cos( ) cos( ) ( )
a x a x
Z x bx e bx e Z x
− − −
− = − = = .

Hence, according to Eq. (2.19) in Chapter 2, we can write

$$ \int_{-\infty}^{\infty} e^{-a^2x^2} \cos(bx)\, dx = \frac{\sqrt{\pi}}{a}\; e^{-\frac{b^2}{4a^2}}. \tag{3.38c} $$

Applying formula (3.38c) to Eq. (3.38b) by specifying that

$$ a = \frac{1}{\sigma_{\tilde w}\sqrt{2}} \quad\text{and}\quad b = 2\pi\nu, $$

we get

$$ \mathbf{E}\big( e^{-2\pi i\nu\tilde w} \big) = e^{-2\pi^2\nu^2\sigma_{\tilde w}^2}. \tag{3.38d} $$

Equation (3.38d) holds true for any value of $\nu$; in particular, when $\nu = 1/(2\pi)$, it must still be true:

$$ \mathbf{E}\big( e^{-i\tilde w} \big) = e^{-\sigma_{\tilde w}^2/2}. \tag{3.38e} $$

Formula (3.38e) applies to any zero-mean, normal random variable, which means it applies to $\tilde w$ for any set of $\omega_1$, $\omega_2$, $\omega_3$, $\omega_4$ values in Eq. (3.37a) above.

We can expand the left-hand side of (3.38e) in powers of $\tilde w$ to get, using the linearity of the expectation operator with respect to random variables (see Sec. 3.10 above),

$$ \mathbf{E}\big( e^{-i\tilde w} \big) = \mathbf{E}\!\left( 1 - i\tilde w - \frac{\tilde w^2}{2} + i\frac{\tilde w^3}{6} + \frac{\tilde w^4}{24} + \cdots \right) = 1 - i\,\mathbf{E}(\tilde w) - \frac{\mathbf{E}(\tilde w^2)}{2} + i\,\frac{\mathbf{E}(\tilde w^3)}{6} + \frac{\mathbf{E}(\tilde w^4)}{24} + \cdots. $$

According to Eqs. (3.37b) and (3.36b), both $\mathbf{E}(\tilde w)$ and $\mathbf{E}(\tilde w^3)$ are zero [the discussion following Eq. (3.37a) shows that $\tilde w$, like $\tilde n_1$, is a zero-mean, normally distributed random variable, which means that it must satisfy both Eqs. (3.37b) and (3.36b)]. Hence, we can write, remembering that $\mathbf{E}(\tilde w^2) = \sigma_{\tilde w}^2$ because $\sigma_{\tilde w}$ is the standard deviation of $\tilde w$ and $\tilde w$ is zero mean, that

$$ \mathbf{E}\big( e^{-i\tilde w} \big) = 1 - \frac{\sigma_{\tilde w}^2}{2} + \frac{\mathbf{E}(\tilde w^4)}{24} + \cdots. \tag{3.39a} $$

The right-hand side of (3.38e) can be expanded in powers of $\sigma_{\tilde w}$ to get

$$ e^{-\sigma_{\tilde w}^2/2} = 1 - \frac{\sigma_{\tilde w}^2}{2} + \frac{\sigma_{\tilde w}^4}{8} + \cdots. \tag{3.39b} $$

Substitution of (3.39a) and (3.39b) into (3.38e) now gives

$$ 1 - \frac{\sigma_{\tilde w}^2}{2} + \frac{\mathbf{E}(\tilde w^4)}{24} + \cdots = 1 - \frac{\sigma_{\tilde w}^2}{2} + \frac{\sigma_{\tilde w}^4}{8} + \cdots $$

or

$$ \frac{\mathbf{E}(\tilde w^4)}{24} + \cdots = \frac{\sigma_{\tilde w}^4}{8} + \cdots. \tag{3.39c} $$

Equation (3.37c) reminds us that $\sigma_{\tilde w}^2$ is the weighted sum of $\omega_j\omega_k$ products, so for small $\omega$ it follows that $\sigma_{\tilde w}^2$ is of order $\omega^2$. This means that $\sigma_{\tilde w}^4$ on the right-hand side of (3.39c) is of order $\omega^4$. Similarly, Eq. (3.37a) reminds us that $\mathbf{E}(\tilde w^4)$ on the left-hand side of (3.39c) is order $\omega^4$ when the $\omega$ values are small. Formula (3.39c) must hold true for all values of $\omega_1$, $\omega_2$, $\omega_3$, and $\omega_4$. If we choose $\omega_1$ through $\omega_4$ to be small, we must have

$$ \mathbf{E}(\tilde w^4) = 3\,\sigma_{\tilde w}^4. \tag{3.39d} $$

If (3.39d) is false, then the higher powers of $\tilde w$ and $\sigma_{\tilde w}$ in (3.39c), which are represented by "$+\cdots$" on both sides of the formula, cannot make (3.39c) hold true because these $+\cdots$ terms contain only order $\omega^6$ and higher powers of $\omega_1$ through $\omega_4$, making them too small to rescue the equality.
The next step is to expand $\mathbf{E}(\tilde w^4)$. Raising $\tilde w$ to the fourth power in (3.37a) gives

$$ \tilde w^4 = (\omega_1\tilde n_1 + \omega_2\tilde n_2 + \omega_3\tilde n_3 + \omega_4\tilde n_4)^2\, (\omega_1\tilde n_1 + \omega_2\tilde n_2 + \omega_3\tilde n_3 + \omega_4\tilde n_4)^2 $$

or

$$ \begin{aligned} \tilde w^4 = \big( &\omega_1^2\tilde n_1^2 + \omega_2^2\tilde n_2^2 + \omega_3^2\tilde n_3^2 + \omega_4^2\tilde n_4^2 \\ &+ 2\omega_1\omega_2\tilde n_1\tilde n_2 + 2\omega_1\omega_3\tilde n_1\tilde n_3 + 2\omega_1\omega_4\tilde n_1\tilde n_4 \\ &+ 2\omega_2\omega_3\tilde n_2\tilde n_3 + 2\omega_2\omega_4\tilde n_2\tilde n_4 + 2\omega_3\omega_4\tilde n_3\tilde n_4 \big)^2. \end{aligned} $$

Paying attention only to those terms whose coefficients are proportional to $\omega_1\omega_2\omega_3\omega_4$, we have

$$ \tilde w^4 = \cdots + 24\,\omega_1\omega_2\omega_3\omega_4\; \tilde n_1\tilde n_2\tilde n_3\tilde n_4 + \cdots. \tag{3.40a} $$

Formula (3.37c) gives, again concentrating only on terms whose coefficients are proportional to $\omega_1\omega_2\omega_3\omega_4$,

$$ \begin{aligned} \sigma_{\tilde w}^4 = \big[ &\omega_1^2\mathbf{E}(\tilde n_1^2) + \omega_1\omega_2\mathbf{E}(\tilde n_1\tilde n_2) + \omega_1\omega_3\mathbf{E}(\tilde n_1\tilde n_3) + \omega_1\omega_4\mathbf{E}(\tilde n_1\tilde n_4) \\ &+ \omega_2\omega_1\mathbf{E}(\tilde n_2\tilde n_1) + \omega_2^2\mathbf{E}(\tilde n_2^2) + \omega_2\omega_3\mathbf{E}(\tilde n_2\tilde n_3) + \omega_2\omega_4\mathbf{E}(\tilde n_2\tilde n_4) \\ &+ \omega_3\omega_1\mathbf{E}(\tilde n_3\tilde n_1) + \omega_3\omega_2\mathbf{E}(\tilde n_3\tilde n_2) + \omega_3^2\mathbf{E}(\tilde n_3^2) + \omega_3\omega_4\mathbf{E}(\tilde n_3\tilde n_4) \\ &+ \omega_4\omega_1\mathbf{E}(\tilde n_4\tilde n_1) + \omega_4\omega_2\mathbf{E}(\tilde n_4\tilde n_2) + \omega_4\omega_3\mathbf{E}(\tilde n_4\tilde n_3) + \omega_4^2\mathbf{E}(\tilde n_4^2) \big]^2, \end{aligned} $$

which becomes

$$ \begin{aligned} \sigma_{\tilde w}^4 = \cdots &+ 8\,\omega_1\omega_2\omega_3\omega_4\, \mathbf{E}(\tilde n_1\tilde n_2)\,\mathbf{E}(\tilde n_3\tilde n_4) + 8\,\omega_1\omega_2\omega_3\omega_4\, \mathbf{E}(\tilde n_1\tilde n_3)\,\mathbf{E}(\tilde n_2\tilde n_4) \\ &+ 8\,\omega_1\omega_2\omega_3\omega_4\, \mathbf{E}(\tilde n_2\tilde n_3)\,\mathbf{E}(\tilde n_1\tilde n_4) + \cdots. \end{aligned} \tag{3.40b} $$

Equations (3.40a) and (3.40b) can be substituted into (3.39d) to get

$$ \begin{aligned} \cdots &+ 24\,\omega_1\omega_2\omega_3\omega_4\, \mathbf{E}(\tilde n_1\tilde n_2\tilde n_3\tilde n_4) + \cdots \\ &= 3 \cdot \big[ \cdots + 8\,\omega_1\omega_2\omega_3\omega_4\, \mathbf{E}(\tilde n_1\tilde n_2)\,\mathbf{E}(\tilde n_3\tilde n_4) + 8\,\omega_1\omega_2\omega_3\omega_4\, \mathbf{E}(\tilde n_1\tilde n_3)\,\mathbf{E}(\tilde n_2\tilde n_4) \\ &\qquad\quad + 8\,\omega_1\omega_2\omega_3\omega_4\, \mathbf{E}(\tilde n_2\tilde n_3)\,\mathbf{E}(\tilde n_1\tilde n_4) + \cdots \big], \end{aligned} $$

which simplifies to, using the linearity of the expectation operator (see Sec. 3.10),

$$ \cdots + 24\,\omega_1\omega_2\omega_3\omega_4\, \mathbf{E}(\tilde n_1\tilde n_2\tilde n_3\tilde n_4) + \cdots = \cdots + 24\,\omega_1\omega_2\omega_3\omega_4 \big[ \mathbf{E}(\tilde n_1\tilde n_2)\mathbf{E}(\tilde n_3\tilde n_4) + \mathbf{E}(\tilde n_1\tilde n_3)\mathbf{E}(\tilde n_2\tilde n_4) + \mathbf{E}(\tilde n_2\tilde n_3)\mathbf{E}(\tilde n_1\tilde n_4) \big] + \cdots. $$


This must hold true for any combination of
1
ω ,
2
ω ,
3
ω , and
4
ω values, large or small, so the
coefficients of all the
1 2 3 4
ωω ω ω terms must be the same on both sides of this equation. Therefore,


1 2 3 4 1 2 3 4 1 3 2 4 2 3 1 4
( ) ( ) ( ) ( ) ( ) ( ) ( ) n n n n n n n n n n n n n n n n = + + E E E E E E E (3.40c)

for any collection of zero-mean, jointly normal random variables
1
n ,
2
n ,
3
n , and
4
n .
Equation (3.40c) requires $\omega_1$ through $\omega_4$ to be distinct real parameters, but it does not require the $\tilde n_1$, $\tilde n_2$, $\tilde n_3$, and $\tilde n_4$ random variables to be distinct. Consequently, if $\tilde n_1$ and $\tilde n_2$ are the same, we can relabel the jointly random variables using

$$ \tilde n_1 = \tilde n_2 = \tilde n_a, \quad \tilde n_3 = \tilde n_b, \quad \tilde n_4 = \tilde n_c $$

to get

$$ \mathbf{E}(\tilde n_a^2\tilde n_b\tilde n_c) = \mathbf{E}(\tilde n_a^2)\,\mathbf{E}(\tilde n_b\tilde n_c) + 2\,\mathbf{E}(\tilde n_a\tilde n_b)\,\mathbf{E}(\tilde n_a\tilde n_c). \tag{3.41a} $$

Similarly, if $\tilde n_3$ and $\tilde n_4$ are also identical, we can relabel $\tilde n_1$ through $\tilde n_4$ as

$$ \tilde n_1 = \tilde n_2 = \tilde n_a \quad\text{and}\quad \tilde n_3 = \tilde n_4 = \tilde n_b, $$

so that

$$ \mathbf{E}(\tilde n_a^2\tilde n_b^2) = \mathbf{E}(\tilde n_a^2)\,\mathbf{E}(\tilde n_b^2) + 2\,\mathbf{E}(\tilde n_a\tilde n_b)^2. \tag{3.41b} $$

When all four random variables are the same, Eq. (3.40c) collapses to

$$ \mathbf{E}(\tilde n^4) = 3\,\mathbf{E}(\tilde n^2)^2, \tag{3.41c} $$

which holds true for any zero-mean random variable $\tilde n$ obeying a normal distribution.
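The product identities (3.34d), (3.40c), and (3.41c) are easy to confirm by Monte Carlo. The sketch below assumes numpy; the covariance matrix is an arbitrary positive-definite example, not one taken from the text:

```python
import numpy as np

rng = np.random.default_rng(seed=7)

# An arbitrary positive-definite covariance matrix for n1, n2, n3, n4.
C = np.array([[1.0, 0.6, 0.3, 0.2],
              [0.6, 1.0, 0.4, 0.1],
              [0.3, 0.4, 1.0, 0.5],
              [0.2, 0.1, 0.5, 1.0]])
n1, n2, n3, n4 = np.linalg.cholesky(C) @ rng.standard_normal((4, 1_000_000))

E = lambda x: x.mean()                                   # sample expectation
print(E(n1 * n2 * n3))                                   # near zero, Eq. (3.34d)
print(E(n1 * n2 * n3 * n4))                              # left side of (3.40c)
print(E(n1*n2)*E(n3*n4) + E(n1*n3)*E(n2*n4) + E(n2*n3)*E(n1*n4))  # right side
print(E(n1**4), 3 * E(n1**2)**2)                         # Eq. (3.41c)
```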
3.18 Ergodic Random Functions
Ergodic random functions are random functions where time averages can be used to calculate ensemble averages. Just as stationary random functions can be stationary in many different ways, so can ergodic random functions be ergodic in many different ways.

We start with a simple example, discussing what is meant by saying that a random function $\tilde n(t)$ is "ergodic in the mean."^{41} Equation (3.23c) defines the mean of $\tilde n(t)$ to be the ensemble average created by the expectation operator,

$$ \mu_{\tilde n(t)} = \mathbf{E}\big( \tilde n(t) \big). $$

To find the mean using a time average, we must calculate

$$ \frac{1}{2T} \int_{-T}^{T} \tilde n(t)\, dt $$

^{41} Paul H. Wirsching et al., Random Vibrations: Theory and Practice, p. 82.
and take the limit as $T \to \infty$. Since "ergodic" refers to using time averages to calculate ensemble averages, we might expect that a random function that is ergodic in the mean would satisfy the equation

$$ \mu_{\tilde n(t)} = \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} \tilde n(t)\, dt. \tag{3.42a} $$

There are two problems with Eq. (3.42a). The first is that $\mu_{\tilde n(t)}$ is allowed to be a function of time $t$, whereas

$$ \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} \tilde n(t)\, dt $$

is not. This means Eq. (3.42a) can only be true when $\mu_{\tilde n(t)}$ does not depend on time. Consequently, for $\tilde n$ to be ergodic in the mean, we must also require $\tilde n$ to be stationary in the mean with [see Eq. (3.30a) above]

$$ \mathbf{E}\big( \tilde n(t) \big) = \mu_{\tilde n} = \text{constant with respect to time}. $$

Now Eq. (3.42a) can be written as

$$ \mu_{\tilde n} = \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} \tilde n(t)\, dt. \tag{3.42b} $$

The second problem is more difficult to deal with. We note that the value of

$$ \frac{1}{2T} \int_{-T}^{T} \tilde n(t)\, dt $$

must be a random value because it is proportional to the integral of a random function. Hence, we expect

$$ \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} \tilde n(t)\, dt $$

also to be a random value. This means Eq. (3.42b) sets a random value equal to $\mu_{\tilde n}$, a nonrandom
value, which is in general not allowed. The way out of this impasse is to put a restriction on the limiting process used to get the right-hand side of (3.42b). Clearly,

$$ \tilde\xi(T) = \frac{1}{2T} \int_{-T}^{T} \tilde n(t)\, dt \tag{3.42c} $$

is a random function of $T$. This means there must be a probability density distribution $p_{\tilde\xi(T)}(\xi)$ such that $p_{\tilde\xi(T)}(\xi)\, d\xi$ is the probability that $\tilde\xi(T)$ takes on a value between $\xi$ and $\xi + d\xi$. We now require the limiting random variable


$$ \tilde\xi_\infty = \lim_{T\to\infty} \tilde\xi(T) = \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} \tilde n(t)\, dt \tag{3.42d} $$

to obey the limiting probability density distribution

$$ p_{\tilde\xi_\infty}(\xi_\infty) = \delta(\xi_\infty - \mu_{\tilde n}). \tag{3.42e} $$

According to the discussion following Eqs. (3.7e) and (3.7f) above, this turns $\tilde\xi_\infty$ into a random variable that behaves like a constant, since

$$ \mathbf{E}(\tilde\xi_\infty) = \int_{-\infty}^{\infty} \delta(\xi_\infty - \mu_{\tilde n}) \cdot \xi_\infty \cdot d\xi_\infty = \mu_{\tilde n} $$

and

$$ \mathbf{E}\big( (\tilde\xi_\infty - \mu_{\tilde n})^2 \big) = \int_{-\infty}^{\infty} \delta(\xi_\infty - \mu_{\tilde n}) \cdot (\xi_\infty - \mu_{\tilde n})^2 \cdot d\xi_\infty = 0. $$

Now we can note that, yes, strictly speaking, Eq. (3.42b) does equate a random variable to a nonrandom variable, but this does not matter because Eq. (3.42e) makes the random variable

$$ \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} \tilde n(t)\, dt $$

equivalent to a nonrandom quantity.
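The collapse of the time average onto $\mu_{\tilde n}$ can be watched directly. In the sketch below (numpy assumed), one member function of a wide-sense stationary ensemble, here a constant offset plus a random-phase sinusoid (an arbitrary choice), is averaged over longer and longer intervals, and $\tilde\xi(T)$ of Eq. (3.42c) closes in on the ensemble mean:

```python
import numpy as np

rng = np.random.default_rng(seed=8)
mu_n = 2.0                                 # the stationary ensemble mean
phi = rng.uniform(0.0, 2 * np.pi)          # one member of the ensemble

def n(t):
    """One randomly chosen member function of the ensemble."""
    return mu_n + np.sin(2 * np.pi * t + phi)

for T in (1.0, 10.0, 100.0, 1000.0):
    t = np.linspace(-T, T, 200_001)
    xi_T = n(t).mean()                     # uniform grid: ~ (1/2T) * integral
    print(T, xi_T)                         # approaches mu_n = 2.0
```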
A random function $\tilde n(t)$ is "ergodic in the autocorrelation function"^{42} if the autocorrelation function defined as an ensemble average in Eq. (3.23b) can also be calculated with a time average. Glancing back at (3.23b), we define $\tau = t_2 - t_1$ and set the ensemble average equal to the time average by writing

$$ \mathbf{E}\big( \tilde n(t_1)\,\tilde n(t_1 + \tau) \big) = \lim_{T\to\infty} \left( \frac{1}{2T} \int_{-T}^{T} \tilde n(t)\,\tilde n(t + \tau)\, dt \right). \tag{3.43a} $$

Once again we face the same two problems: the left-hand side of this equation is allowed to be a function of $t_1$ whereas the right-hand side is not, and the left-hand side of this equation is nonrandom whereas the right-hand side is random.

Dealing with the $t_1$ problem first, we again say that

$$ \mathbf{E}\big( \tilde n(t_1)\,\tilde n(t_1 + \tau) \big) $$

does not depend on $t_1$, making $\tilde n(t)$ stationary with respect to its autocorrelation function. Now Eq. (3.43a) can be written as

$$ \mathbf{E}\big( \tilde n(t_1)\,\tilde n(t_1 + \tau) \big) = R_{\tilde n\tilde n}(\tau) = \lim_{T\to\infty} \left( \frac{1}{2T} \int_{-T}^{T} \tilde n(t)\,\tilde n(t + \tau)\, dt \right). \tag{3.43b} $$

Both in Eqs. (3.42a) and (3.42b) describing what it means to be ergodic in the mean, and in Eqs. (3.43a) and (3.43b) describing what it means to be ergodic in the autocorrelation function, the time dependence that ensemble averaging preserves is lost in the time average. This is clearly going to happen whenever some sort of ensemble average is set equal to the corresponding time average. We conclude that when a random function is ergodic in some way, it must also be stationary in that same way. In this sense, ergodic random functions are always stationary.^{43}

Moving on to the second problem with Eq. (3.43a)—that of equating random and nonrandom quantities—we follow the same procedure as before. This time the random function $\tilde\xi$ is defined to be

$$ \tilde\xi(T, \tau) = \frac{1}{2T} \int_{-T}^{T} \tilde n(t)\,\tilde n(t + \tau)\, dt \tag{3.44a} $$

and the random function $\tilde\xi_\infty(\tau)$ is defined to be

^{42} Paul H. Wirsching et al., Random Vibrations: Theory and Practice, p. 82.
^{43} Paul H. Wirsching et al., Random Vibrations: Theory and Practice, p. 82.
$$ \tilde\xi_\infty(\tau) = \lim_{T\to\infty} \tilde\xi(T, \tau). \tag{3.44b} $$

Associated with $\tilde\xi_\infty(\tau)$ is the probability density distribution $p_{\tilde\xi_\infty(\tau)}$ such that $p_{\tilde\xi_\infty(\tau)}(\xi_\infty)\, d\xi_\infty$ is the probability that $\tilde\xi_\infty(\tau)$ has a value between $\xi_\infty$ and $\xi_\infty + d\xi_\infty$. We again require

$$ p_{\tilde\xi_\infty(\tau)}(\xi_\infty) = \delta\big( \xi_\infty - R_{\tilde n\tilde n}(\tau) \big) \tag{3.44c} $$

so that

$$ \mathbf{E}\big( \tilde\xi_\infty(\tau) \big) = \int_{-\infty}^{\infty} p_{\tilde\xi_\infty(\tau)}(\xi_\infty)\, \xi_\infty\, d\xi_\infty = \int_{-\infty}^{\infty} \delta\big( \xi_\infty - R_{\tilde n\tilde n}(\tau) \big)\, \xi_\infty\, d\xi_\infty = R_{\tilde n\tilde n}(\tau) \tag{3.44d} $$

and

$$ \begin{aligned} \mathbf{E}\big( [\tilde\xi_\infty(\tau) - R_{\tilde n\tilde n}(\tau)]^2 \big) &= \int_{-\infty}^{\infty} p_{\tilde\xi_\infty(\tau)}(\xi_\infty)\, [\xi_\infty - R_{\tilde n\tilde n}(\tau)]^2\, d\xi_\infty \\ &= \int_{-\infty}^{\infty} \delta\big( \xi_\infty - R_{\tilde n\tilde n}(\tau) \big)\, [\xi_\infty - R_{\tilde n\tilde n}(\tau)]^2\, d\xi_\infty = 0. \end{aligned} \tag{3.44e} $$

This shows, according to the discussion following Eqs. (3.7e) and (3.7f), that the random variable $\tilde\xi_\infty(\tau)$ behaves like a nonrandom quantity. We have now solved the second problem with Eq. (3.43a) and therefore can make sense of the idea that a random function can be ergodic in the autocorrelation function.
The pattern used in analyzing the ergodic qualities of a random function $\tilde n(t)$ has by now been set. There is some mathematically useful and reasonable function $f$ that has $N$ arguments. We pick $N$ time values $t_1, t_2, \ldots, t_N$ and calculate an ensemble expectation value or average

$$ \mathbf{E}\big( f(\tilde n(t_1), \tilde n(t_2), \ldots, \tilde n(t_N)) \big), $$

which is then set equal to the time average

$$ \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} f\big( \tilde n(t), \tilde n(t+\tau_2), \tilde n(t+\tau_3), \ldots, \tilde n(t+\tau_N) \big)\, dt. $$

We define

$$ \tau_2 = t_2 - t_1, \quad \tau_3 = t_3 - t_1, \quad \ldots, \quad \tau_N = t_N - t_1 $$

and set the expectation value equal to the time average by writing


$$ \begin{aligned} \mathbf{E}\big( f(\tilde n(t_1), \tilde n(t_2), \ldots, \tilde n(t_N)) \big) &= \mathbf{E}\big( f(\tilde n(t_1), \tilde n(t_1+\tau_2), \tilde n(t_1+\tau_3), \ldots, \tilde n(t_1+\tau_N)) \big) \\ &= \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} f\big( \tilde n(t), \tilde n(t+\tau_2), \tilde n(t+\tau_3), \ldots, \tilde n(t+\tau_N) \big)\, dt. \end{aligned} \tag{3.45a} $$

In order for Eq. (3.45a) to make sense, the expectation value

$$ \mathbf{E}\big( f(\tilde n(t_1), \tilde n(t_2), \ldots, \tilde n(t_N)) \big) = \mathbf{E}\big( f(\tilde n(t_1), \tilde n(t_1+\tau_2), \tilde n(t_1+\tau_3), \ldots, \tilde n(t_1+\tau_N)) \big) $$

cannot be a function of $t_1$. This means the right-hand side of this relationship still has the same value when $t_1$ is increased by any time value $\tau$; hence we can write, increasing $t_1$ by $\tau$ only on the right-hand side,

$$ \mathbf{E}\big( f(\tilde n(t_1), \tilde n(t_2), \ldots, \tilde n(t_N)) \big) = \mathbf{E}\big( f(\tilde n(t_1+\tau), \tilde n(t_1+\tau_2+\tau), \tilde n(t_1+\tau_3+\tau), \ldots, \tilde n(t_1+\tau_N+\tau)) \big). $$


Remembering that

$$ \tau_2 = t_2 - t_1, \quad \tau_3 = t_3 - t_1, \quad \ldots, \quad \tau_N = t_N - t_1, $$

we eliminate $\tau_2, \tau_3, \ldots, \tau_N$ from the equation to get

$$ \mathbf{E}\big( f(\tilde n(t_1), \tilde n(t_2), \ldots, \tilde n(t_N)) \big) = \mathbf{E}\big( f(\tilde n(t_1+\tau), \tilde n(t_2+\tau), \ldots, \tilde n(t_N+\tau)) \big). \tag{3.45b} $$

This is the same as Eq. (3.24c) above. We conclude that Eq. (3.24c) must be true whenever Eq. (3.45a) is true. According to the discussion following Eq. (3.24c), whenever Eq. (3.45a) is true, the expectation value

$$ \mathbf{E}\big( f(\tilde n(t_1), \tilde n(t_2), \ldots, \tilde n(t_N)) \big) $$

must be a function of only the $N - 1$ independent time values

$$ \tau_2 = t_2 - t_1, \quad \tau_3 = t_3 - t_1, \quad \ldots, \quad \tau_N = t_N - t_1. $$

Consequently, the expectation values and the time integral in Eq. (3.45a) have the same number of independent time parameters, which we can show by writing

$$ S(\tau_2, \tau_3, \ldots, \tau_N) = \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} f\big( \tilde n(t), \tilde n(t+\tau_2), \tilde n(t+\tau_3), \ldots, \tilde n(t+\tau_N) \big)\, dt, \tag{3.45c} $$

where

$$ S(\tau_2, \tau_3, \ldots, \tau_N) = \mathbf{E}\big( f(\tilde n(t_1), \tilde n(t_2), \ldots, \tilde n(t_N)) \big). \tag{3.45d} $$

Equation (3.45a) needs to have one more requirement imposed on it—the random quantity on the right-hand side must be equivalent to the nonrandom quantity on the left. This means the random quantity

$$ \tilde\xi(\tau_2, \tau_3, \ldots, \tau_N) = \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} f\big( \tilde n(t), \tilde n(t+\tau_2), \tilde n(t+\tau_3), \ldots, \tilde n(t+\tau_N) \big)\, dt \tag{3.45e} $$

must become equivalent to the nonrandom quantity $S$ by having

$$ \mathbf{E}\big( \tilde\xi(\tau_2, \tau_3, \ldots, \tau_N) \big) = S(\tau_2, \tau_3, \ldots, \tau_N) \tag{3.45f} $$

and

$$ \mathbf{E}\Big( \big[ \tilde\xi(\tau_2, \tau_3, \ldots, \tau_N) - S(\tau_2, \tau_3, \ldots, \tau_N) \big]^2 \Big) = 0. \tag{3.45g} $$

Now, by requiring Eqs. (3.45b)–(3.45g) to hold true, we can be sure that Eq. (3.45a) is mathematically self-consistent.
It is not difficult to relate this mathematical machinery to the analysis of what it means to say that $\tilde n(t)$ is ergodic in the mean or ergodic in the autocorrelation function. When specifying what it means to say that $\tilde n(t)$ is ergodic in the mean, we take $N = 1$ and define function $f$ to be $f(x) = x$; and when specifying what it means to say that $\tilde n(t)$ is ergodic in the autocorrelation function, we take $N = 2$ and define function $f$ to be $f(x, y) = xy$. To give another example of how to use Eqs. (3.45a)–(3.45g), we examine an often encountered type of ergodicity called "ergodic in the variance."^{44} We define ergodic in the variance for a random function $\tilde n(t)$ by setting $N = 1$ and $f(x) = (x - \mu_{\tilde n})^2$, with $\mu_{\tilde n}$ in function $f$ being the stationary mean of $\tilde n$,

$$ \mu_{\tilde n} = \mathbf{E}\big( \tilde n(t) \big), $$

specified by Eq. (3.30a) above. When a random function $\tilde n(t)$ is ergodic in the variance, Eq. (3.45a) becomes

$$ \mathbf{E}\big( [\tilde n(t) - \mu_{\tilde n}]^2 \big) = \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} [\tilde n(t) - \mu_{\tilde n}]^2\, dt. \tag{3.46a} $$

^{44} Paul H. Wirsching et al., Random Vibrations: Theory and Practice, p. 82.

The requirements imposed by Eq. (3.45b) can be written as

$$ \mathbf{E}\big( [\tilde n(t) - \mu_{\tilde n}]^2 \big) = \mathbf{E}\big( [\tilde n(t+\tau) - \mu_{\tilde n}]^2 \big) \tag{3.46b} $$

for all values of $\tau$, which means that

$$ \mathbf{E}\big( [\tilde n(t) - \mu_{\tilde n}]^2 \big) = v_{\tilde n} = \text{a nonrandom variable independent of time}. \tag{3.46c} $$

Here, we write $v_{\tilde n}$ instead of $v_{\tilde n(t)}$ for the variance of $\tilde n(t)$ to emphasize that $v_{\tilde n}$ does not depend on time. Equation (3.46c) can be interpreted as saying that $\tilde n$ is stationary with respect to its variance $v_{\tilde n}$. We note that variance $v_{\tilde n}$ is equivalent to $S$ in Eq. (3.45d), so Eqs. (3.45e), (3.45f), and (3.45g) now reduce to

$$ \tilde\xi = \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} [\tilde n(t) - \mu_{\tilde n}]^2\, dt, \tag{3.46d} $$

$$ \mathbf{E}(\tilde\xi) = v_{\tilde n}, \tag{3.46e} $$

and

$$ \mathbf{E}\big( [\tilde\xi - v_{\tilde n}]^2 \big) = 0. \tag{3.46f} $$

A random function $\tilde n(t)$ is called weakly ergodic if it is ergodic in the mean, ergodic in the variance, and ergodic in the autocorrelation function.^{45} It is called strongly ergodic if Eqs. (3.45a)–(3.45g) are satisfied for all $N = 1, 2, \ldots, \infty$ and for any reasonable choice of function $f$. This is equivalent to requiring that all reasonable ensemble averages of the random function $\tilde n(t)$ be equal to their corresponding time averages.

The distinction made between weakly ergodic and strongly ergodic is reminiscent of the distinction made between wide-sense stationary and strict-sense stationary. Just as all strict-sense stationary random functions are also wide-sense stationary, but not all wide-sense stationary random functions are strict-sense stationary, so too are all strongly ergodic random functions also weakly ergodic, but not all weakly ergodic random functions are strongly ergodic. The Gaussian random processes discussed in Sec. 3.16 above are an important special case. We have already said that when Gaussian random processes are wide-sense stationary they must also be strict-sense stationary; it can also be shown that whenever Gaussian random processes are weakly ergodic they must also be strongly ergodic.^{46}

^{45} Paul H. Wirsching et al., Random Vibrations: Theory and Practice, p. 82.

Although we have seen that all ergodic random functions are also stationary, it is easy to show that not all stationary random functions are ergodic. The random function

$$ \tilde n(t) = \tilde c, \tag{3.47a} $$

where $\tilde c$ is a random constant chosen from a probability density distribution $p_{\tilde c}(c)$, is clearly strict-sense stationary. To see why this is so, we just observe that Eq. (3.24c) is automatically satisfied, since

$$ \mathbf{E}\big( f(\tilde n(t_1), \tilde n(t_2), \ldots, \tilde n(t_N)) \big) = \int_{-\infty}^{\infty} p_{\tilde c}(c)\, f(c, c, \ldots, c)\, dc = \mathbf{E}\big( f(\tilde c, \tilde c, \ldots, \tilde c) \big) = \mathbf{E}\big( f(\tilde n(t_1+\tau), \tilde n(t_2+\tau), \ldots, \tilde n(t_N+\tau)) \big) \tag{3.47b} $$

for any value of $\tau$ and any integrable function $f$ with $N = 1, 2, \ldots, \infty$ arguments. On the other hand, $\tilde n(t) = \tilde c$ cannot be ergodic because once a value for $\tilde c$ is chosen from the ensemble, it must stay the same for all time values. Looking at even the simplest type of ergodicity, ergodicity in the mean, we get from Eq. (3.42d)

$$ \tilde\xi_\infty = \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} \tilde n(t)\, dt = \lim_{T\to\infty} \left( \frac{1}{2T} \cdot (2T\tilde c) \right) = \tilde c. \tag{3.47c} $$

Hence, the probability density distribution of $\tilde\xi_\infty$ is the same as the probability density distribution $p_{\tilde c}$, which, unless $p_{\tilde c}$ is a delta function, violates requirement (3.42e) for ergodic in the mean.


46
Paul H. Wirsching et al., Random Vibrations: Theory and Practice, p. 83.
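
To make these definitions concrete, here is a small numerical sketch (an illustration of my own, in Python with NumPy; all names and parameter values are assumptions, not anything from the text). It compares the ensemble variance with the time-average variance of Eq. (3.46a) for white noise, which behaves ergodically, and for the random constant of Eq. (3.47a), which does not:

```python
import numpy as np

rng = np.random.default_rng(0)
n_members, n_samples = 2000, 5000       # ensemble size, samples per member function

# Ergodic example: zero-mean white Gaussian noise with variance 4.
white = 2.0 * rng.standard_normal((n_members, n_samples))

# Non-ergodic example of Eq. (3.47a): each member function is a random
# constant c drawn once and then held fixed for all time.
c = 2.0 * rng.standard_normal((n_members, 1))
constant = np.broadcast_to(c, (n_members, n_samples))

mu = 0.0                                # stationary (ensemble) mean of both processes
for name, ens in [("white noise", white), ("n(t) = c", constant)]:
    ensemble_var = ((ens - mu) ** 2).mean(axis=0).mean()
    time_var = ((ens - mu) ** 2).mean(axis=1)   # Eq. (3.46a) time average, one per member
    print(f"{name}: ensemble variance {ensemble_var:.3f}, "
          f"one member's time average {time_var[0]:.3f}, "
          f"spread of time averages {time_var.std():.3f}")
```

For the white-noise process every member function returns nearly the same time-average variance, matching the ensemble value; for ñ(t) = c̃ the time average equals c̃², a different number for each member, so requirement (3.46f) fails.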
3.19 Experimental Noise
We almost always analyze noise in experimental signals as a random function of time ñ(t). The
signal noise in any given experiment is then a member function chosen at random from the
ensemble of allowed functions because it corresponds to pulling the levers of all the slot
machines simultaneously in Sec. 3.14 above. This suggests that the straightforward way to
calculate an expectation value or ensemble average is to acquire many different member
functions by running the experiment many different times. This is, of course, unlikely to happen;
there is usually not much incentive to do the same experiment over and over in exactly the same
way, because the point of most experiments is to measure a signal, not the noise associated with
it. Sometimes repeating an experiment is literally impossible. If, for example, stock-market prices
are treated as random functions of time, there is no way to repeat last year to see what happens
this time around. Consequently, when examining random functions of time, there is usually only
one, or at best a few, member functions of the ensemble to examine. In practice, then, most
experimental statisticians are forced to assume that their random functions are ergodic as well as
stationary; otherwise, they cannot calculate the ensemble averages needed for their analysis.
Another point worth making about stationarity and ergodicity is that, strictly speaking, no
experimental data can be truly stationary or truly ergodic in even the weakest sense, because
before an experiment begins or after an experiment ends the random function representing the
noise must be strictly zero. One way of handling this is to regard the noise data as a finite-length
sample of some random function stretching between t = −∞ and t = +∞, but we should also
acknowledge that stationarity and ergodicity are ideals that experimental noise can only realize to
some degree of approximation. Just as, in Sec. 3.5 above, many pairs of independent random
variables turn out after all to depend slightly on each other, so too do many recordings of
experimental noise turn out, after close analysis, to be stationary and ergodic only to some degree
of approximation.
3.20 The Power Spectrum
A random function ñ(t) that is wide-sense stationary has an autocorrelation function R_ññ, which
according to Eq. (3.30b) can be written as

$$ R_{\tilde{n}\tilde{n}}(t_2 - t_1) = \mathbf{E}\big( \tilde{n}(t_1)\, \tilde{n}(t_2) \big) \qquad (3.48a) $$

for any two time values t₂ and t₁. We note that

$$ \mathbf{E}\big( \tilde{n}(t_1)\, \tilde{n}(t_2) \big) = \mathbf{E}\big( \tilde{n}(t_2)\, \tilde{n}(t_1) \big) $$

automatically. This means that

$$ R_{\tilde{n}\tilde{n}}(t_2 - t_1) = R_{\tilde{n}\tilde{n}}(t_1 - t_2) $$

or, setting τ = t₂ − t₁,

$$ R_{\tilde{n}\tilde{n}}(\tau) = R_{\tilde{n}\tilde{n}}(-\tau) , \qquad (3.48b) $$

making R_ññ an even function when ñ is wide-sense stationary. Since R_ññ is a function of only
the single real parameter τ, we can set up the one-dimensional Fourier transform of R_ññ, getting

$$ S_{\tilde{n}\tilde{n}}(f) = \int_{-\infty}^{\infty} R_{\tilde{n}\tilde{n}}(\tau)\, e^{-2\pi i f \tau}\, d\tau . \qquad (3.48c) $$

This Fourier transform S_ññ(f) of R_ññ almost always exists, and we define it to be the power
spectrum⁴⁷,⁴⁸ of the random function ñ(t). Over the next few sections of this chapter, we examine
the properties of S_ññ, showing as we go along why it makes sense to call it the power spectrum.

Functions ñ that have power spectra must be wide-sense stationary because we are assuming
that the autocorrelation R_ññ is a function with only a single real argument. Given that S_ññ exists,
we can always reverse the transform in Eq. (3.48c) and write the autocorrelation function of ñ as
the inverse Fourier transform of the power spectrum,

$$ R_{\tilde{n}\tilde{n}}(\tau) = \int_{-\infty}^{\infty} S_{\tilde{n}\tilde{n}}(f)\, e^{2\pi i f \tau}\, df . \qquad (3.48d) $$

When two random functions ñ_α(t) and ñ_β(t) are jointly wide-sense stationary, as defined in the
discussion following Eq. (3.30c), we can define their cross-power spectrum to be

$$ S_{\tilde{n}_\alpha \tilde{n}_\beta}(f) = \int_{-\infty}^{\infty} R_{\tilde{n}_\alpha \tilde{n}_\beta}(\tau)\, e^{-2\pi i f \tau}\, d\tau , \qquad (3.48e) $$

where

$$ R_{\tilde{n}_\alpha \tilde{n}_\beta}(t_2 - t_1) = \mathbf{E}\big( \tilde{n}_\alpha(t_1)\, \tilde{n}_\beta(t_2) \big) $$

is their cross-correlation function introduced in Eq. (3.30d).

We know that R_ññ in Eq. (3.48a) is always real because E(ñ(t₁)ñ(t₂)) is always real.
According to Eq. (3.48b), R_ññ is an even function of its argument. Therefore its Fourier
transform, the power spectrum S_ññ, is the Fourier transform of a real and even function. Because
the Fourier transform of a real and even function is always another real and even function,⁴⁹ it
follows that S_ññ is also real and even:

$$ \operatorname{Im}\big( S_{\tilde{n}\tilde{n}}(f) \big) = 0 \qquad (3.49a) $$

and

$$ S_{\tilde{n}\tilde{n}}(-f) = S_{\tilde{n}\tilde{n}}(f) . \qquad (3.49b) $$

⁴⁷ Paul H. Wirsching et al., Random Vibrations: Theory and Practice, p. 124.
⁴⁸ Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, p. 319.
⁴⁹ See entry 1 of Table 2.1 in Chapter 2.
We note in passing that the cross-power spectrum S_ñαñβ in (3.48e) is not necessarily a real-valued
function. It is, however, the Fourier transform of a real-valued function R_ñαñβ, so it must be
Hermitian,⁵⁰

$$ S_{\tilde{n}_\alpha \tilde{n}_\beta}(f)^{\ast} = S_{\tilde{n}_\alpha \tilde{n}_\beta}(-f) . \qquad (3.49c) $$

Equation (3.49a) shows that S_ññ behaves like a power spectrum by being strictly real; Eq.
(3.49b) shows that S_ññ is double-sided, having the same value at +f and −f. The next step is to
show that S_ññ behaves like a power spectrum by being non-negative for all values of f, but that
has to wait until we examine what happens to S_ññ when a wide-sense stationary random function
ñ(t) is put through an arbitrary linear system.

⁵⁰ See entry 7 of Table 2.1 in Chapter 2.
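
Before moving on, here is a short numerical sketch (my own illustration in Python/NumPy; every name and parameter is an assumption) of Eqs. (3.48c) and (3.49a)–(3.49b): estimate R_ññ from an ensemble of a simple stationary process and check that its Fourier transform is real and even.

```python
import numpy as np

rng = np.random.default_rng(1)
members, samples, dt = 4000, 256, 1.0

# Wide-sense stationary test process: white noise smoothed by a short moving
# average, so neighboring samples are correlated.
kernel = np.ones(5) / 5.0
ens = np.array([np.convolve(rng.standard_normal(samples + 4), kernel, "valid")
                for _ in range(members)])

# Ensemble estimate of R(tau) = E(n(t) n(t+tau)) for a range of lags.
lags = np.arange(-8, 9)
R = np.array([(ens[:, 8:-8] * ens[:, 8 + k:samples - 8 + k]).mean() for k in lags])

# Discrete stand-in for Eq. (3.48c): S(f) as the Fourier transform of R(tau).
S = np.fft.fftshift(np.fft.fft(np.fft.ifftshift(R))) * dt

print("R even:", np.allclose(R, R[::-1], atol=1e-2))
print("max |Im S|:", np.abs(S.imag).max())                       # small, Eq. (3.49a)
print("S even:", np.allclose(S.real, S.real[::-1], atol=1e-6))   # Eq. (3.49b)
```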
3.21 Random Inputs and Outputs of Linear Systems
Section 2.9 in Chapter 2 describes what a convolution is and the role it plays in Fourier-transform
theory. A linear system can be represented by a convolution, with the u(t) input being convolved
with the linear system’s impulse-response function h(t) to get the v(t) output,

$$ v(t) = u(t) \ast h(t) . $$

According to the definition of convolution in Chapter 2 [see Eq. (2.38a)], this can be written as

$$ v(t) = \int_{-\infty}^{\infty} h(\tau')\, u(t - \tau')\, d\tau' . $$

When a random function ñ(t) is the input to a linear system characterized by an impulse-
response function h(t), the output is another random function m̃(t) given by

$$ \tilde{m}(t) = \int_{-\infty}^{\infty} h(\tau')\, \tilde{n}(t - \tau')\, d\tau' . \qquad (3.50a) $$
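
In discrete form, Eq. (3.50a) is just a convolution of a sampled impulse response with a sampled noise record, as in this sketch (illustrative only; the RC-like impulse response and all parameters are my own choices):

```python
import numpy as np

rng = np.random.default_rng(2)
dt = 0.01
t = np.arange(0, 20, dt)

# A causal, decaying impulse response h(t) = exp(-t / 0.5) for t >= 0.
h = np.exp(-t / 0.5)

# One member function of a white-noise input ensemble.
n = rng.standard_normal(t.size)

# Discrete version of Eq. (3.50a): m(t) = integral of h(tau') n(t - tau') dtau'.
m = np.convolve(n, h)[: t.size] * dt

print("input std :", n.std())
print("output std:", m.std())   # smaller and smoother: the system low-pass filters the noise
```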
We define the correlation function between m̃(t) and ñ(t) to be⁵¹

$$ R_{\tilde{m}\tilde{n}}(t_1, t_2) = \mathbf{E}\big( \tilde{m}(t_1)\, \tilde{n}(t_2) \big) . \qquad (3.50b) $$

Function R_mñ(t₁, t₂) is called the cross-correlation function of m̃ and ñ. Substitution of (3.50a)
gives

$$ R_{\tilde{m}\tilde{n}}(t_1, t_2) = \mathbf{E}\left( \tilde{n}(t_2) \int_{-\infty}^{\infty} h(\tau')\, \tilde{n}(t_1 - \tau')\, d\tau' \right) = \mathbf{E}\left( \int_{-\infty}^{\infty} h(\tau')\, \tilde{n}(t_2)\, \tilde{n}(t_1 - \tau')\, d\tau' \right) . $$

Using Eq. (3.17c) to move the expectation operator inside the integral, and using (3.16a) to put h
outside the expectation operator because it is a nonrandom quantity, we get

$$ R_{\tilde{m}\tilde{n}}(t_1, t_2) = \int_{-\infty}^{\infty} h(\tau')\, \mathbf{E}\big( \tilde{n}(t_2)\, \tilde{n}(t_1 - \tau') \big)\, d\tau' . $$

Assuming that ñ is wide-sense stationary, we use Eq. (3.30b) to write

$$ \mathbf{E}\big( \tilde{n}(t_2)\, \tilde{n}(t_1 - \tau') \big) = R_{\tilde{n}\tilde{n}}(t_1 - t_2 - \tau') $$

so that

$$ R_{\tilde{m}\tilde{n}}(t_1, t_2) = \int_{-\infty}^{\infty} h(\tau')\, R_{\tilde{n}\tilde{n}}(t_1 - t_2 - \tau')\, d\tau' . \qquad (3.50c) $$

This shows that R_mñ depends only on the difference between t₁ and t₂. Nothing then stops us
from regarding R_mñ as a function of τ = t₂ − t₁, which gives

$$ R_{\tilde{m}\tilde{n}}(\tau) = \int_{-\infty}^{\infty} h(\tau')\, R_{\tilde{n}\tilde{n}}(-\tau - \tau')\, d\tau' $$

or, using Eq. (3.48b),

$$ R_{\tilde{m}\tilde{n}}(\tau) = \int_{-\infty}^{\infty} h(\tau')\, R_{\tilde{n}\tilde{n}}(\tau + \tau')\, d\tau' . $$

Changing the variable of integration to τ″ = −τ′ changes this into a convolution,

$$ R_{\tilde{m}\tilde{n}}(\tau) = \int_{-\infty}^{\infty} h(-\tau'')\, R_{\tilde{n}\tilde{n}}(\tau - \tau'')\, d\tau'' = h(-\tau) \ast R_{\tilde{n}\tilde{n}}(\tau) . \qquad (3.50d) $$

⁵¹ This derivation comes from Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, pp.
323–324.

Equation (3.50a) can also be used to evaluate the autocorrelation function of the random
output m̃(t), giving

$$ R_{\tilde{m}\tilde{m}}(t_1, t_2) = \mathbf{E}\big( \tilde{m}(t_1)\, \tilde{m}(t_2) \big) = \mathbf{E}\left( \tilde{m}(t_1) \int_{-\infty}^{\infty} h(\tau')\, \tilde{n}(t_2 - \tau')\, d\tau' \right) = \mathbf{E}\left( \int_{-\infty}^{\infty} h(\tau')\, \tilde{m}(t_1)\, \tilde{n}(t_2 - \tau')\, d\tau' \right) . $$

Again moving the expectation operator inside the integral, we use Eq. (3.50b) to write

$$ R_{\tilde{m}\tilde{m}}(t_1, t_2) = \int_{-\infty}^{\infty} h(\tau')\, \mathbf{E}\big( \tilde{m}(t_1)\, \tilde{n}(t_2 - \tau') \big)\, d\tau' = \int_{-\infty}^{\infty} h(\tau')\, R_{\tilde{m}\tilde{n}}(t_1, t_2 - \tau')\, d\tau' . $$

From (3.50c) we know that R_mñ depends only on the difference between times t₁ and t₂, which
means we can write

$$ R_{\tilde{m}\tilde{n}}(t_1, t_2) = R_{\tilde{m}\tilde{n}}(t_2 - t_1) . $$

Hence, the formula for R_mm(t₁, t₂) simplifies to

$$ R_{\tilde{m}\tilde{m}}(t_1, t_2) = \int_{-\infty}^{\infty} h(\tau')\, R_{\tilde{m}\tilde{n}}(t_2 - t_1 - \tau')\, d\tau' . \qquad (3.51a) $$

This is an important result because it shows that the autocorrelation of the output random
function m̃ depends only on τ = t₂ − t₁. Substituting τ for (t₂ − t₁) gives

$$ R_{\tilde{m}\tilde{m}}(\tau) = \int_{-\infty}^{\infty} h(\tau')\, R_{\tilde{m}\tilde{n}}(\tau - \tau')\, d\tau' = h(\tau) \ast R_{\tilde{m}\tilde{n}}(\tau) . \qquad (3.51b) $$

Glancing back at Eqs. (3.30a) and (3.30b) above, and having shown that the autocorrelation
function R_mm(t₁, t₂) depends only on (t₂ − t₁), we realize that m̃ must be wide-sense stationary if
E(m̃(t)) is time-independent and finite. Taking the expectation value of both sides of (3.50a)
gives

$$ \mathbf{E}\big( \tilde{m}(t) \big) = \mathbf{E}\left( \int_{-\infty}^{\infty} h(\tau')\, \tilde{n}(t - \tau')\, d\tau' \right) = \int_{-\infty}^{\infty} h(\tau')\, \mathbf{E}\big( \tilde{n}(t - \tau') \big)\, d\tau' = \mu_{\tilde{n}} \int_{-\infty}^{\infty} h(\tau')\, d\tau' , \qquad (3.51c) $$

where we have again assumed that ñ(t) is wide-sense stationary so that, according to Eq. (3.30a),

$$ \mathbf{E}\big( \tilde{n}(t) \big) = \mu_{\tilde{n}} = \text{same finite constant for all values of } t . $$

Equation (3.51c) makes E(m̃(t)) a time-independent quantity. The Fourier transform of the
impulse-response function h is called the transfer function,

$$ H(f) = \int_{-\infty}^{\infty} h(t)\, e^{-2\pi i f t}\, dt , \qquad (3.51d) $$

of the linear system. (The idea of a transfer function is discussed in greater detail below in
Appendix 5A of Chapter 5.) Therefore Eq. (3.51c) can also be written as

$$ \mathbf{E}\big( \tilde{m}(t) \big) = \mu_{\tilde{n}} \cdot H(0) . \qquad (3.51e) $$

This shows that when H(0), the zero-frequency value of the transfer function, is finite, so is
E(m̃(t)). We conclude that the output m̃(t) of the linear system is wide-sense stationary when
the input ñ(t) is wide-sense stationary and the H(0) value of the transfer function is finite.
Because the H(f) transfer function is the Fourier transform of h(t), which is a strictly real
function, we can take the complex conjugate of both sides of Eq. (3.51d) to get

$$ H(f)^{\ast} = \int_{-\infty}^{\infty} h(t)\, e^{2\pi i f t}\, dt = \int_{-\infty}^{\infty} h(-t')\, e^{-2\pi i f t'}\, dt' . \qquad (3.52a) $$

In the last step of (3.52a), we change the variable of integration to t′ = −t. Equation (3.52a) can
also be written as, dropping the prime,

$$ H(f)^{\ast} = \int_{-\infty}^{\infty} h(-t)\, e^{-2\pi i f t}\, dt . \qquad (3.52b) $$

Clearly, H(f)*, the complex conjugate of the transfer function H(f), is the Fourier transform of
h(−t). Since H is the Fourier transform of a real function h, it must, according to entry 7 of Table
2.1 in Chapter 2, be Hermitian,

$$ H(-f) = H(f)^{\ast} . \qquad (3.52c) $$

We now define S_mñ(f) to be the Fourier transform of R_mñ(τ), giving

$$ S_{\tilde{m}\tilde{n}}(f) = \int_{-\infty}^{\infty} R_{\tilde{m}\tilde{n}}(\tau)\, e^{-2\pi i f \tau}\, d\tau . \qquad (3.53a) $$

Function S_mñ(f) is the cross-power spectrum of m̃ and ñ [see Eq. (3.48e)]. The transform can,
of course, be reversed to get

$$ R_{\tilde{m}\tilde{n}}(\tau) = \int_{-\infty}^{\infty} S_{\tilde{m}\tilde{n}}(f)\, e^{2\pi i f \tau}\, df . \qquad (3.53b) $$

Applying the Fourier convolution theorem to Eq. (3.50d) above gives, according to Eq. (2.39a) in
Chapter 2,

[Fourier transform of R_mñ] = [Fourier transform of h(−t)] · [Fourier transform of R_ññ] .

This can be written as, using Eqs. (3.53a), (3.52b), and (3.48c),

$$ S_{\tilde{m}\tilde{n}}(f) = H(f)^{\ast} \cdot S_{\tilde{n}\tilde{n}}(f) . \qquad (3.53c) $$

Applying Eq. (2.39a) again, this time to Eq. (3.51b), gives

$$ S_{\tilde{m}\tilde{m}}(f) = H(f)\, S_{\tilde{m}\tilde{n}}(f) , \qquad (3.53d) $$

where

$$ S_{\tilde{m}\tilde{m}}(f) = \int_{-\infty}^{\infty} R_{\tilde{m}\tilde{m}}(\tau)\, e^{-2\pi i f \tau}\, d\tau \qquad (3.53e) $$

is the Fourier transform of R_mm. Following the nomenclature introduced in Eq. (3.48c), this must
be the power spectrum of m̃(t); and the Fourier transforms of h and R_mñ come from (3.51d) and
(3.53a) respectively. The Fourier transform in (3.53e) can, of course, be reversed to get

$$ R_{\tilde{m}\tilde{m}}(\tau) = \int_{-\infty}^{\infty} S_{\tilde{m}\tilde{m}}(f)\, e^{2\pi i f \tau}\, df . \qquad (3.53f) $$

Substitution of (3.53c) into (3.53d) gives the result we have been working toward:

$$ S_{\tilde{m}\tilde{m}}(f) = |H(f)|^2\, S_{\tilde{n}\tilde{n}}(f) . \qquad (3.53g) $$

This result shows that the power spectrum of the random input function ñ(t) gives, when
multiplied by the squared modulus of the transfer function, the power spectrum of the random
output function m̃(t) of the linear system.
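
A numerical sketch of this result (my own illustration; the discrete filter and all parameters are assumptions, not anything from the text): pass an ensemble of white-noise sequences through a simple linear system and compare the ensemble-averaged output periodogram with |H(f)|² times the flat input spectrum, as Eq. (3.53g) predicts.

```python
import numpy as np

rng = np.random.default_rng(3)
members, nt = 500, 1024

# Decaying-exponential impulse response (its transfer function has no zeros).
h = np.exp(-np.arange(nt) / 10.0)
h /= h.sum()
H = np.fft.fft(h)                      # discrete stand-in for Eq. (3.51d)

# Ensemble of unit-variance white-noise inputs: flat input spectrum S_nn(f) = 1.
n = rng.standard_normal((members, nt))
m = np.fft.ifft(np.fft.fft(n, axis=1) * H, axis=1).real   # Eq. (3.50a), circularly

# Ensemble-averaged periodogram of the output versus the prediction of Eq. (3.53g).
S_mm = (np.abs(np.fft.fft(m, axis=1)) ** 2).mean(axis=0) / nt
ratio = S_mm / (np.abs(H) ** 2 * 1.0)
print("mean ratio:", ratio.mean(), "(should be close to 1)")
```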
3.22 The Sign of the Power Spectrum
Equation (3.53g) can be used to show that the power spectrum S_ññ of any wide-sense stationary
random function cannot be negative. To show how this is done, we set up a linear system that has
the transfer function

$$ H_B(f) = \begin{cases} -i & \text{for } f_1 \le f \le f_2 \\ \;\; i & \text{for } -f_2 \le f \le -f_1 \\ \;\; 0 & \text{for } |f| < f_1 \\ \;\; 0 & \text{for } |f| > f_2 \end{cases} , \qquad (3.54a) $$

where f₁ and f₂ are both non-negative frequencies. Function H_B(f) is (−i) when f lies
between f₁ and f₂ and i when f lies between (−f₂) and (−f₁); otherwise it is zero. The transfer
function H_B satisfies

$$ H_B(-f) = H_B(f)^{\ast} , \qquad (3.54b) $$

which [see Eq. (3.52c)] makes it an acceptable transfer function because it is Hermitian. By
reversing the Fourier transform in (3.51d), we find that the impulse-response function for this
linear system must be the inverse Fourier transform of the transfer function,

$$ h_B(t) = \int_{-\infty}^{\infty} H_B(f)\, e^{2\pi i f t}\, df . $$
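
A quick numerical sketch (my own; the grid and band edges are arbitrary assumptions) confirms what the Hermitian symmetry (3.54b) guarantees and what the next paragraph argues—the impulse response of this filter comes out real:

```python
import numpy as np

nt = 4096
f = np.fft.fftfreq(nt)            # frequency grid holding negative and positive values
f1, f2 = 0.05, 0.15

# Discrete version of Eq. (3.54a): -i on [f1, f2], +i on [-f2, -f1], 0 elsewhere.
H_B = (np.where((f >= f1) & (f <= f2), -1j, 0)
       + np.where((f <= -f1) & (f >= -f2), 1j, 0))

h_B = np.fft.ifft(H_B)            # impulse response = inverse transform of H_B
print("max |Im h_B|:", np.abs(h_B.imag).max())   # ~ 0 because H_B is Hermitian
print("H_B(0):", H_B[f == 0][0])                 # finite (zero), so the output stays WSS
```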

According to entry 7 in Table 2.1 of Chapter 2, since H_B(f) is Hermitian, its inverse Fourier
transform h_B(t) must be real. We can take any random function ñ(t) that is wide-sense stationary
and run it through the H_B linear system. Looking at the resulting output m̃(t), we know from the
discussion following Eq. (3.51e) that m̃(t) must also be wide-sense stationary because H_B(0) is
finite. This means that m̃ has a well-defined autocorrelation function

$$ R_{\tilde{m}\tilde{m}}(t_2 - t_1) = \mathbf{E}\big( \tilde{m}(t_1)\, \tilde{m}(t_2) \big) $$

and a well-defined power spectrum S_mm(f). Setting t₁ = t₂ in the autocorrelation function gives,
since m̃ is real,

$$ R_{\tilde{m}\tilde{m}}(0) = \mathbf{E}\big( \tilde{m}(t_1)^2 \big) \ge 0 . \qquad (3.54c) $$

From Eq. (3.53f) we know

$$ R_{\tilde{m}\tilde{m}}(0) = \int_{-\infty}^{\infty} S_{\tilde{m}\tilde{m}}(f)\, df . \qquad (3.54d) $$

Combining Eqs. (3.53g) and (3.54a) gives

$$ S_{\tilde{m}\tilde{m}}(f) = |H_B(f)|^2\, S_{\tilde{n}\tilde{n}}(f) . $$

This can be substituted into (3.54d) to get, noting the definition of H_B in (3.54a), that

$$ R_{\tilde{m}\tilde{m}}(0) = \int_{-f_2}^{-f_1} S_{\tilde{n}\tilde{n}}(f)\, df + \int_{f_1}^{f_2} S_{\tilde{n}\tilde{n}}(f)\, df . $$

Equation (3.49b) reminds us that S_ññ is an even function of f, which means that this formula for
R_mm(0) can be written as

$$ R_{\tilde{m}\tilde{m}}(0) = 2 \int_{f_1}^{f_2} S_{\tilde{n}\tilde{n}}(f)\, df . \qquad (3.54e) $$

Substitution of (3.54e) into inequality (3.54c) gives

$$ \int_{f_1}^{f_2} S_{\tilde{n}\tilde{n}}(f)\, df \ge 0 . \qquad (3.54f) $$

No assumptions have been made about the values of f₁ and f₂ other than

$$ 0 \le f_1 \le f_2 . $$

Therefore, because inequality (3.54f) must hold true for all allowed values of f₁ and f₂ no
matter where they are on the positive f axis or how close together they are, we conclude that
S_ññ(f) ≥ 0 for all f ≥ 0. Because

$$ S_{\tilde{n}\tilde{n}}(-f) = S_{\tilde{n}\tilde{n}}(f) $$

in Eq. (3.49b), it then follows that

$$ S_{\tilde{n}\tilde{n}}(f) \ge 0 \qquad (3.54g) $$

for all positive and negative values of f.

We have already demonstrated that S_ññ is real and even, and now we know that it must also
be a non-negative function of frequency f. These are all attributes that a double-sided power
spectrum ought to have. The final step in justifying the label “power spectrum” for S_ññ is to show
that it satisfies a power-spectrum type of formula with regard to the random function ñ(t).

3.23 The Power Spectrum and Fourier Transforms of Random Functions
The power spectrum P_zz(f) of a nonrandom function z(t) can be written as⁵²

$$ P_{zz}(f) = \lim_{T\to\infty} \frac{|Z_T(f)|^2}{2T} . \qquad (3.55a) $$

Here, Z_T(f) is the Fourier transform between times t = −T and t = T of a real signal z(t):

$$ Z_T(f) = \int_{-T}^{T} z(t)\, e^{-2\pi i f t}\, dt . \qquad (3.55b) $$

⁵² B. P. Lathi, An Introduction to Random Signals and Communication Theory (International Textbook Company,
Scranton, PA, 1968), p. 59.
We now justify the label “power spectrum” for the function S_ññ(f) defined in Eq. (3.48c) by
deriving a formula for S_ññ in terms of the random function ñ(t) that closely resembles formula
(3.55a) for the power spectrum P_zz(f) of the nonrandom function z(t).

We define Ñ_T(f) to be the Fourier transform of the random function ñ(t) between times
t = −T and t = T:

$$ \tilde{N}_T(f) = \int_{-T}^{T} \tilde{n}(t)\, e^{-2\pi i f t}\, dt . \qquad (3.56a) $$

In effect, Ñ is a random function of the two nonrandom variables f and T, and it could be written
as Ñ(f, T) to emphasize this fact. When ñ(t) is a random function that is wide-sense stationary,
we have, since ñ is real,

$$ \mathbf{E}\big( |\tilde{N}_T(f)|^2 \big) = \mathbf{E}\big( \tilde{N}_T(f)^{\ast} \cdot \tilde{N}_T(f) \big) = \mathbf{E}\left( \left[ \int_{-T}^{T} \tilde{n}(t_1)\, e^{2\pi i f t_1}\, dt_1 \right] \left[ \int_{-T}^{T} \tilde{n}(t_2)\, e^{-2\pi i f t_2}\, dt_2 \right] \right) = \mathbf{E}\left( \int_{-T}^{T}\!\!\int_{-T}^{T} \tilde{n}(t_1)\, \tilde{n}(t_2)\, e^{-2\pi i (t_2 - t_1) f}\, dt_1\, dt_2 \right) . \qquad (3.56b) $$

Applying Eqs. (3.17c) and (3.16a), the expectation operator E is taken inside the double integral
to get

$$ \mathbf{E}\big( |\tilde{N}_T(f)|^2 \big) = \int_{-T}^{T}\!\!\int_{-T}^{T} \mathbf{E}\big( \tilde{n}(t_1)\, \tilde{n}(t_2) \big)\, e^{-2\pi i (t_2 - t_1) f}\, dt_1\, dt_2 = \int_{-T}^{T}\!\!\int_{-T}^{T} R_{\tilde{n}\tilde{n}}(t_2 - t_1)\, e^{-2\pi i (t_2 - t_1) f}\, dt_1\, dt_2 . \qquad (3.56c) $$

In the last step, Eq. (3.30b) is used to replace E(ñ(t₁)ñ(t₂)) for the wide-sense stationary ñ by
the autocorrelation function R_ññ(t₂ − t₁).

The rightmost expression in Eq. (3.56c) is a double integral of a function

$$ \psi(t_2 - t_1) = R_{\tilde{n}\tilde{n}}(t_2 - t_1)\, e^{-2\pi i (t_2 - t_1) f} $$

over the square region of the t₁, t₂ plane specified by
$$ -T \le t_1 \le T \quad \text{and} \quad -T \le t_2 \le T . $$

Figure 3.2 shows that the value of ψ must be constant along any line given by

$$ t_2 - t_1 = \tau = \text{constant} $$

in the t₁, t₂ plane. To lowest order in dτ in Fig. 3.2, the shaded area is, when t₂ ≥ t₁ so that
τ ≥ 0,

$$ d\tau \cdot (2T - \tau) - \frac{(d\tau)^2}{2} \cong (2T - \tau)\, d\tau . $$

When t₂ < t₁, as shown in Fig. 3.3, the value of τ is negative, so the formula for the shaded area
in Fig. 3.3 is

$$ d\tau \cdot (2T + \tau) - \frac{(d\tau)^2}{2} \cong (2T + \tau)\, d\tau = (2T - |\tau|)\, d\tau . $$

Consequently, the rightmost double integral in Eq. (3.56c) can be written as

$$ \int_{-T}^{T}\!\!\int_{-T}^{T} R_{\tilde{n}\tilde{n}}(t_2 - t_1)\, e^{-2\pi i (t_2 - t_1) f}\, dt_1\, dt_2 = \int_{-2T}^{0} R_{\tilde{n}\tilde{n}}(\tau)\, e^{-2\pi i f \tau} (2T + \tau)\, d\tau + \int_{0}^{2T} R_{\tilde{n}\tilde{n}}(\tau)\, e^{-2\pi i f \tau} (2T - \tau)\, d\tau = \int_{-2T}^{2T} R_{\tilde{n}\tilde{n}}(\tau)\, e^{-2\pi i f \tau} (2T - |\tau|)\, d\tau . $$
Taking the factor of 2T outside the integral and substituting the result back into Eq. (3.56c) gives

$$ \mathbf{E}\big( |\tilde{N}_T(f)|^2 \big) = 2T \int_{-2T}^{2T} \left( 1 - \frac{|\tau|}{2T} \right) R_{\tilde{n}\tilde{n}}(\tau)\, e^{-2\pi i f \tau}\, d\tau . \qquad (3.57a) $$

This can be written as

$$ \frac{1}{2T}\, \mathbf{E}\big( |\tilde{N}_T(f)|^2 \big) = \int_{-\infty}^{\infty} \Lambda(\tau, 2T)\, R_{\tilde{n}\tilde{n}}(\tau)\, e^{-2\pi i f \tau}\, d\tau , \qquad (3.57b) $$

where

$$ \Lambda(t_a, t_b) = \begin{cases} 1 - \dfrac{|t_a|}{t_b} & \text{for } |t_a| \le t_b \\ 0 & \text{for } |t_a| > t_b \end{cases} . \qquad (3.57c) $$


Function Λ is graphed in Fig. 3.4. The Fourier transform of Λ(t, 2T) is

$$ \int_{-\infty}^{\infty} \Lambda(t, 2T)\, e^{-2\pi i f t}\, dt = 2T \left[ \frac{\sin(2\pi f T)}{2\pi f T} \right]^2 . \qquad (3.57d) $$

The right-hand side of Eq. (3.57b) is the Fourier transform of the product of functions Λ and R_ññ.
According to the Fourier convolution theorem [see Eq. (2.39k) in Chapter 2], this must equal the
convolution of the Fourier transforms of Λ and R_ññ. Therefore, Eq. (3.57b) can be written as,
according to (3.57d) and (3.48c),

$$ \frac{\mathbf{E}\big( |\tilde{N}_T(f)|^2 \big)}{2T} = \left\{ 2T \left[ \frac{\sin(2\pi f T)}{2\pi f T} \right]^2 \right\} \ast S_{\tilde{n}\tilde{n}}(f) . \qquad (3.57e) $$

In the limit as T → ∞, it can be shown that⁵³

$$ 2T \left[ \frac{\sin(2\pi f T)}{2\pi f T} \right]^2 \to \delta(f) . \qquad (3.57f) $$

⁵³ John B. Thomas, An Introduction to Applied Probability and Random Processes (John Wiley & Sons, Inc., New
York, 1971), p. 231. Formula (3.57f) is also a slightly disguised version of Eq. (2.67b) in Chapter 2.
FIGURE 3.2. [The square integration region −T ≤ t₁, t₂ ≤ T, showing the strip of width dτ along the line t₂ − t₁ = τ for τ ≥ 0; its area is (2T − τ)dτ to lowest order.]

FIGURE 3.3. [The same integration region, showing the strip of width dτ for τ < 0, with area (2T − |τ|)dτ to lowest order.]

FIGURE 3.4. [The triangle function Λ(t, t_b): equal to 1.0 at t = 0, falling linearly to zero at t = ±t_b.]
Consequently, we can take the limit of both sides of (3.57e) as T → ∞ to get [using Eq. (2.55a)
in Chapter 2]

$$ \lim_{T\to\infty} \left[ \frac{\mathbf{E}\big( |\tilde{N}_T(f)|^2 \big)}{2T} \right] = \delta(f) \ast S_{\tilde{n}\tilde{n}}(f) = \int_{-\infty}^{\infty} \delta(f - f')\, S_{\tilde{n}\tilde{n}}(f')\, df' = S_{\tilde{n}\tilde{n}}(f) $$

or

$$ S_{\tilde{n}\tilde{n}}(f) = \lim_{T\to\infty} \left[ \frac{\mathbf{E}\big( |\tilde{N}_T(f)|^2 \big)}{2T} \right] . \qquad (3.57g) $$

Comparing this result to the similar formula in Eq. (3.55a) for the power spectrum of a
nonrandom function z, we see that the formulas are similar enough to justify the definition of S_ññ
as the power spectrum of the random function ñ.
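
Here is a hedged numerical sketch of Eq. (3.57g) (my own illustration; the first-order autoregressive process is chosen only because its discrete-time spectrum has a simple closed form): averaging |Ñ_T(f)|²/(2T) over a large ensemble should reproduce the power spectrum.

```python
import numpy as np

rng = np.random.default_rng(4)
members, nt, a = 2000, 512, 0.8

# Ensemble of AR(1) processes x[k] = a x[k-1] + w[k], whose discrete-time
# power spectrum is 1 / |1 - a exp(-2 pi i f)|^2 for unit-variance white w.
w = rng.standard_normal((members, nt + 200))
x = np.empty_like(w)
x[:, 0] = w[:, 0]
for k in range(1, w.shape[1]):
    x[:, k] = a * x[:, k - 1] + w[:, k]
x = x[:, 200:]                      # discard start-up transient so x is stationary

# Discrete stand-in for Eq. (3.57g): average the squared transform over members.
f = np.fft.rfftfreq(nt)
S_est = (np.abs(np.fft.rfft(x, axis=1)) ** 2).mean(axis=0) / nt
S_true = 1.0 / np.abs(1 - a * np.exp(-2j * np.pi * f)) ** 2

print("max relative error:", np.abs(S_est / S_true - 1).max())  # a few percent here
```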
The S_ññ(f) power spectrum specified in Eq. (3.48c) and used later in (3.57g), (3.49b),
(3.54g), and so on, is often called the double-sided power spectrum because it is defined for both
positive and negative values of its argument f. It is typically found as a weighting function in
integrals of the form

$$ \int_{-\infty}^{\infty} S_{\tilde{n}\tilde{n}}(f)\, \phi_e(f)\, df , $$

where φ_e(f), like S_ññ(f), is an even function of f. Because the S_ññ(f)φ_e(f) product must also
be even, this integral can also be written as [see Eq. (2.19) in Chapter 2]

$$ \int_{-\infty}^{\infty} S_{\tilde{n}\tilde{n}}(f)\, \phi_e(f)\, df = 2 \int_{0}^{\infty} S_{\tilde{n}\tilde{n}}(f)\, \phi_e(f)\, df . \qquad (3.58a) $$

Many analysts define a single-sided power spectrum S⁽¹⁾_ññ to be

$$ S^{(1)}_{\tilde{n}\tilde{n}}(f) = 2\, S_{\tilde{n}\tilde{n}}(f) \quad \text{for } f \ge 0 \qquad (3.58b) $$

and use it to write equations like (3.58a) as

$$ \int_{-\infty}^{\infty} S_{\tilde{n}\tilde{n}}(f)\, \phi_e(f)\, df = \int_{0}^{\infty} S^{(1)}_{\tilde{n}\tilde{n}}(f)\, \phi_e(f)\, df . \qquad (3.58c) $$

The motivation for this procedure is often the feeling that only positive frequencies f are
meaningful, so we ought to restrict ourselves to using power spectra with positive arguments.⁵⁴
Many times articles and textbooks refer to “the” power spectrum without making it clear whether
they are referring to the double-sided or single-sided power spectrum. Casual references to power
spectra should be treated with caution until it becomes clear which type of power spectrum the
author has in mind.
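
The bookkeeping between Eqs. (3.58a)–(3.58c) can be checked in a few lines (again an illustrative sketch of my own; with φ_e = 1 the integrals reduce to the variance):

```python
import numpy as np

rng = np.random.default_rng(5)
nt = 1024
x = rng.standard_normal(nt)                     # unit-variance white noise

# Double-sided periodogram on the full FFT grid (positive and negative f).
S2 = np.abs(np.fft.fft(x)) ** 2 / nt            # double-sided estimate
var_double = S2.sum() / nt                      # integral of S over all f

# Single-sided version, Eq. (3.58b): fold negative frequencies onto positive ones.
S1 = S2[: nt // 2 + 1].copy()
S1[1:-1] *= 2.0                                 # f = 0 and Nyquist have no mirror partner
var_single = S1.sum() / nt

print(var_double, var_single, x.var())          # all three nearly equal
```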


⁵⁴ There is, of course, no more problem in using negative f values when f represents a frequency than there is in
using negative x values when x represents a length along the axis of a coordinate system. Lengths can never be
negative, so when we allow x to be negative we are implicitly talking about a length coordinate rather than a length.
Similarly, when we allow f to be negative we are implicitly talking about a frequency coordinate rather than a
frequency.
3.24 The Multidimensional Wiener-Khinchin Theorem
Equation (3.57g) derived in Sec. 3.23 is often referred to as the Wiener-Khinchin theorem. This
theorem can easily be extended to multiple dimensions.

A random function with more than one nonrandom argument is often called a random scalar
field. We can write a random scalar field ñ as ñ(t₁, t₂, …, t_K) when it is a function of K
nonrandom arguments t₁, t₂, …, t_K. The property for a random field that is analogous to
stationarity for a one-dimensional random function is called homogeneity. A random function ñ is
called a (wide-sense) homogeneous random field ñ(t₁, t₂, …, t_K) when there is a correlation
function R_ññ such that

$$ R_{\tilde{n}\tilde{n}}(t_1 - t_1', t_2 - t_2', \ldots, t_K - t_K') = \mathbf{E}\big( \tilde{n}(t_1, t_2, \ldots, t_K)\, \tilde{n}(t_1', t_2', \ldots, t_K') \big) . \qquad (3.59a) $$

The multidimensional Fourier transform of R_ññ is the multidimensional power spectrum of the
random field,

$$ S_{\tilde{n}\tilde{n}}(f_1, f_2, \ldots, f_K) = \int_{-\infty}^{\infty}\!\!\cdots\!\!\int_{-\infty}^{\infty} d\tau_1\, d\tau_2 \cdots d\tau_K\; R_{\tilde{n}\tilde{n}}(\tau_1, \tau_2, \ldots, \tau_K)\, e^{-2\pi i (f_1 \tau_1 + f_2 \tau_2 + \cdots + f_K \tau_K)} . \qquad (3.59b) $$


This transform can, of course, be reversed to get

$$ R_{\tilde{n}\tilde{n}}(\tau_1, \tau_2, \ldots, \tau_K) = \int_{-\infty}^{\infty}\!\!\cdots\!\!\int_{-\infty}^{\infty} df_1\, df_2 \cdots df_K\; S_{\tilde{n}\tilde{n}}(f_1, f_2, \ldots, f_K)\, e^{2\pi i (f_1 \tau_1 + f_2 \tau_2 + \cdots + f_K \tau_K)} . \qquad (3.59c) $$

The multidimensional Wiener-Khinchin theorem states that

$$ S_{\tilde{n}\tilde{n}}(f_1, f_2, \ldots, f_K) = \lim_{\substack{T_1 \to \infty \\ T_2 \to \infty \\ \cdots \\ T_K \to \infty}} \frac{1}{(2T_1)(2T_2)\cdots(2T_K)}\, \mathbf{E}\Big( \big| \tilde{N}_{T_1 T_2 \cdots T_K}(f_1, f_2, \ldots, f_K) \big|^2 \Big) , \qquad (3.59d) $$

where

$$ \tilde{N}_{T_1 T_2 \cdots T_K}(f_1, f_2, \ldots, f_K) = \int_{-T_1}^{T_1}\!\!\int_{-T_2}^{T_2}\!\!\cdots\!\!\int_{-T_K}^{T_K} dt_1\, dt_2 \cdots dt_K\; \tilde{n}(t_1, t_2, \ldots, t_K)\, e^{-2\pi i (f_1 t_1 + f_2 t_2 + \cdots + f_K t_K)} . \qquad (3.59e) $$

The next chapter uses the three-dimensional Wiener-Khinchin theorem with one time
coordinate t and two space coordinates x and y. Using the vector notation introduced in Chapter 2
(see Sec. 2.25), we write the random field ñ as

$$ \tilde{n}(x, y, t) = \tilde{n}(\vec{\rho}, t) , \qquad (3.60a) $$

with

$$ \vec{\rho} = \hat{x} x + \hat{y} y \qquad (3.60b) $$

being the position vector defined in terms of the x̂ and ŷ unit vectors corresponding to the x and
y coordinates. We also define a vector u⃗ with u_x and u_y components such that

$$ \vec{u} = \hat{x} u_x + \hat{y} u_y . \qquad (3.60c) $$

Here, u_x and u_y are the spatial frequencies corresponding to the x and y coordinates respectively.
The frequency corresponding to time t is called w. The truncated time and space Fourier
transform of ñ(ρ⃗, t) can now be written as

$$ \tilde{N}_{T,A}(u_x, u_y, w) = \int_{-T}^{T} dt \iint_{\text{area } A} dx\, dy\; \tilde{n}(x, y, t)\, e^{-2\pi i (x u_x + y u_y + w t)} $$

or

$$ \tilde{N}_{T,A}(\vec{u}, w) = \int_{-T}^{T} dt \iint_{\text{area } A} d^2\rho\; \tilde{n}(\vec{\rho}, t)\, e^{-2\pi i (\vec{u} \bullet \vec{\rho} + w t)} . \qquad (3.60d) $$

Random field ñ(ρ⃗, t) has an autocorrelation function

$$ R_{\tilde{n}\tilde{n}}(x - x', y - y', t - t') = \mathbf{E}\big( \tilde{n}(x, y, t)\, \tilde{n}(x', y', t') \big) , \qquad (3.61a) $$

which can be written as

$$ R_{\tilde{n}\tilde{n}}(\vec{\rho} - \vec{\rho}\,', t - t') = \mathbf{E}\big( \tilde{n}(\vec{\rho}, t)\, \tilde{n}(\vec{\rho}\,', t') \big) . \qquad (3.61b) $$

Because R_ññ depends only on the difference between the unprimed and primed coordinates, we
say that field ñ is (wide-sense) stationary and homogeneous. The corresponding power spectrum
is

$$ S_{\tilde{n}\tilde{n}}(\vec{u}, w) = \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\; R_{\tilde{n}\tilde{n}}(\vec{\rho}, t)\, e^{-2\pi i (\vec{u} \bullet \vec{\rho} + w t)} . \qquad (3.61c) $$

The transform can be reversed to get

$$ R_{\tilde{n}\tilde{n}}(\vec{\rho}, t) = \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u\; S_{\tilde{n}\tilde{n}}(\vec{u}, w)\, e^{2\pi i (\vec{u} \bullet \vec{\rho} + w t)} . \qquad (3.61d) $$

Glancing back at the notation for the truncated Fourier transform of ñ in Eq. (3.60d), we see that
the three-dimensional Wiener-Khinchin theorem for this case can be stated as

$$ S_{\tilde{n}\tilde{n}}(\vec{u}, w) = \lim_{\substack{T \to \infty \\ A \to \infty}} \frac{1}{2TA}\, \mathbf{E}\Big( \big| \tilde{N}_{T,A}(\vec{u}, w) \big|^2 \Big) . \qquad (3.61e) $$
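
As an illustrative sketch of Eq. (3.59d) in two dimensions (all construction details here are my own assumptions): generate an ensemble of homogeneous random fields by circularly smoothing white noise, then compare the ensemble-averaged squared transform with the spectrum implied by the smoothing kernel.

```python
import numpy as np

rng = np.random.default_rng(6)
members, n = 200, 64

# A smoothing kernel applied by circular convolution keeps the field homogeneous.
y, x = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
r2 = (np.minimum(x, n - x) ** 2 + np.minimum(y, n - y) ** 2).astype(float)
kernel = np.exp(-r2 / 8.0)
K = np.fft.fft2(kernel)                     # real and positive for this symmetric kernel

# Ensemble of homogeneous random fields: white noise smoothed by the kernel.
white = rng.standard_normal((members, n, n))
fields = np.fft.ifft2(np.fft.fft2(white, axes=(1, 2)) * K, axes=(1, 2)).real

# Two-dimensional analog of Eq. (3.59d): ensemble-averaged squared transform.
S_est = (np.abs(np.fft.fft2(fields, axes=(1, 2))) ** 2).mean(axis=0) / n**2
S_true = np.abs(K) ** 2                     # flat input spectrum times |kernel transform|^2

print("median ratio:", np.median(S_est / S_true))   # close to 1
```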

3.25 Band-Limited White Noise
A random function ñ(t) is band-limited white noise when it is wide-sense stationary and has a
power spectrum

$$ S_{\tilde{n}\tilde{n}}(f) = W_{\tilde{n}\tilde{n}}(f) = \begin{cases} W_0 & \text{for } |f| \le F \\ 0 & \text{for } |f| > F \end{cases} \qquad (3.62a) $$

with

$$ \mathbf{E}\big( \tilde{n}(t) \big) = 0 . \qquad (3.62b) $$
FIGURE 3.5. [The band-limited white-noise power spectrum W_ññ(f): constant value W₀ for −F ≤ f ≤ F and zero elsewhere.]
The bandwidth of this white noise is said to be F (see Fig. 3.5). Equation (3.48d) shows that the
autocorrelation function of this band-limited white noise must be

$$ R_{\tilde{n}\tilde{n}}(\tau) = W_0 \int_{-F}^{F} e^{2\pi i f \tau}\, df = W_0\, \frac{\sin(2\pi F \tau)}{\pi \tau} . \qquad (3.62c) $$

Glancing back at Eq. (3.48a), we see that

$$ \mathbf{E}\big( \tilde{n}(t) \cdot \tilde{n}(t) \big) = \mathbf{E}\big( \tilde{n}(t)^2 \big) = R_{\tilde{n}\tilde{n}}(0) , $$

so that, according to Eq. (3.62c),

$$ \mathbf{E}\big( \tilde{n}(t)^2 \big) = W_0 \int_{-F}^{F} df = 2 F W_0 . \qquad (3.62d) $$

According to (3.62b), ñ is a zero-mean random function, so Eq. (3.62d) shows that the product
2FW₀ must be the variance of ñ(t) when ñ is band-limited white noise.
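
A short numerical sketch (mine, with arbitrary values for F and W₀) of Eq. (3.62d): synthesize band-limited white noise in the frequency domain and check that its variance comes out close to 2FW₀.

```python
import numpy as np

rng = np.random.default_rng(7)
nt, dt = 2 ** 16, 1.0e-3
f = np.fft.rfftfreq(nt, dt)
F, W0 = 50.0, 2.0                        # band limit (Hz) and spectral level

# Shape white noise in the frequency domain so its double-sided power
# spectrum is W0 for |f| <= F and zero beyond.
X = np.fft.rfft(rng.standard_normal(nt)) * np.sqrt(W0 / dt)
X[f > F] = 0.0
n = np.fft.irfft(X, nt)

print("sample variance :", n.var())
print("2 F W0 (3.62d)  :", 2 * F * W0)   # the two should nearly agree
```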
Sometimes we take the limit as F → ∞ in Eqs. (3.62a)–(3.62d) to get white noise that has no
band limits. Now the power spectrum of ñ(t) is

$$ W_{\tilde{n}\tilde{n}}(f) = W_0 \qquad (3.63a) $$

for all values of f. According to formula (3.62c) and Eq. (2.71f) in Chapter 2, this makes the
autocorrelation function R_ññ proportional to a delta function,

$$ R_{\tilde{n}\tilde{n}}(\tau) = W_0 \int_{-\infty}^{\infty} e^{2\pi i f \tau}\, df = W_0\, \delta(\tau) , \qquad (3.63b) $$

with of course

$$ \lim_{F\to\infty} \big[ \mathbf{E}\big( \tilde{n}(t)^2 \big) \big] = \infty \qquad (3.63c) $$

and

$$ \mathbf{E}\big( \tilde{n}(t) \big) = 0 . \qquad (3.63d) $$

Just like the concepts of stationarity and ergodicity, the concept of white noise (even of band-
limited white noise) is an idealization that is often useful for approximating random processes
seen in nature. When a poor-quality recording is played on an audio system, the noise
contaminating it is often white in nature, showing up as unwanted hissing, crackling, and an
overall “shushing” sound. This white noise is band limited, with the band specified by the finite
range of frequencies produced by the audio system and heard by the audience. Setting a TV set to
a channel or station that does not exist, or that cannot be picked up, often produces hissing in the
speakers and a rapidly changing speckle (sometimes called snow) on the screen; both the snow
and the hissing come from quasi white-noise processes that the TV is treating like a nonrandom
signal.
3.26 Even and Odd Components of Random Functions
A useful approach often applied to random functions Ñ(t) that are wide-sense stationary is to
divide them up into even and odd components, as shown in Eqs. (2.11a)–(2.11e) in Chapter 2.
Instead of using e and o subscripts as is done in Chapter 2, this time the even component has a +
superscript and the odd component has a − superscript:

$$ \tilde{N}(t) = \tilde{N}^{(+)}(t) + \tilde{N}^{(-)}(t) , \qquad (3.64a) $$

where

$$ \tilde{N}^{(+)}(t) = \frac{1}{2} \big[ \tilde{N}(t) + \tilde{N}(-t) \big] \qquad (3.64b) $$

and

$$ \tilde{N}^{(-)}(t) = \frac{1}{2} \big[ \tilde{N}(t) - \tilde{N}(-t) \big] . \qquad (3.64c) $$

We now apply to Ñ(t) the time-limited Fourier transform shown in Eq. (3.56a),

$$ \mathcal{N}_T(f) = \int_{-T}^{T} \tilde{N}(t)\, e^{-2\pi i f t}\, dt = \int_{-\infty}^{\infty} \Pi(t, T)\, \tilde{N}(t)\, e^{-2\pi i f t}\, dt . \qquad (3.65a) $$

Here, the Π(t, T) function [defined in Eq. (2.56c) of Chapter 2] is used to convert the integral
between +T and −T into a true Fourier transform. Substituting (3.64a) into (3.65a) gives

$$ \mathcal{N}_T(f) = \int_{-\infty}^{\infty} \Pi(t, T)\, \tilde{N}^{(+)}(t)\, e^{-2\pi i f t}\, dt + \int_{-\infty}^{\infty} \Pi(t, T)\, \tilde{N}^{(-)}(t)\, e^{-2\pi i f t}\, dt , $$

which can be written as

$$ \mathcal{N}_T(f) = \mathcal{N}_T^{(+)}(f) + \mathcal{N}_T^{(-)}(f) , \qquad (3.65b) $$

where

$$ \mathcal{N}_T^{(+)}(f) = \int_{-\infty}^{\infty} \Pi(t, T)\, \tilde{N}^{(+)}(t)\, e^{-2\pi i f t}\, dt \qquad (3.65c) $$

and

$$ \mathcal{N}_T^{(-)}(f) = \int_{-\infty}^{\infty} \Pi(t, T)\, \tilde{N}^{(-)}(t)\, e^{-2\pi i f t}\, dt . \qquad (3.65d) $$

According to entries 1 and 4 of Table 2.1 in Chapter 2, random function 𝒩_T⁽⁺⁾ must be a real and
even function of f because it is the forward Fourier transform of a real and even function of t;
and random function 𝒩_T⁽⁻⁾ must be an imaginary and odd function of f because it is the forward
Fourier transform of a real and odd function of t. This means that every function in the ensemble
of functions associated with random function 𝒩_T⁽⁺⁾ is real and even, and every function in the
ensemble of functions associated with random function 𝒩_T⁽⁻⁾ is imaginary and odd. It also reveals
that in Eq. (3.65b) function 𝒩_T⁽⁺⁾ is the real part of 𝒩_T(f) and 𝒩_T⁽⁻⁾/i is the imaginary part of
𝒩_T(f). This can be written mathematically as

$$ \mathcal{N}_T^{(+)}(f) = \operatorname{Re}\big( \mathcal{N}_T(f) \big) \qquad (3.65e) $$

and

$$ \mathcal{N}_T^{(-)}(f) = i \operatorname{Im}\big( \mathcal{N}_T(f) \big) . \qquad (3.65f) $$


There is a simple connection between the expectation values of the squared magnitudes of
𝒩_T⁽±⁾ and 𝒩_T, that is, between

$$ \mathbf{E}\Big( \big| \mathcal{N}_T^{(\pm)}(f) \big|^2 \Big) \quad \text{and} \quad \mathbf{E}\Big( \big| \mathcal{N}_T(f) \big|^2 \Big) , $$

which is worth taking the time to analyze in detail.

We start by applying formulas (3.65c) and (3.65d) to E(|𝒩_T⁽±⁾(f)|²) to get

$$ \mathbf{E}\Big( \big| \mathcal{N}_T^{(\pm)}(f) \big|^2 \Big) = \mathbf{E}\Big( \mathcal{N}_T^{(\pm)}(f)\, \mathcal{N}_T^{(\pm)}(f)^{\ast} \Big) = \mathbf{E}\left( \left[ \int_{-\infty}^{\infty} \Pi(t, T)\, \tilde{N}^{(\pm)}(t)\, e^{-2\pi i f t}\, dt \right] \left[ \int_{-\infty}^{\infty} \Pi(t', T)\, \tilde{N}^{(\pm)}(t')\, e^{2\pi i f t'}\, dt' \right] \right) . $$

Everything inside the integral over dt′ is real except for e^{2πift′}, so we can write this as

$$ \mathbf{E}\Big( \big| \mathcal{N}_T^{(\pm)}(f) \big|^2 \Big) = \mathbf{E}\left( \int_{-\infty}^{\infty} \Pi(t, T)\, \tilde{N}^{(\pm)}(t)\, e^{-2\pi i f t}\, dt \int_{-\infty}^{\infty} \Pi(t', T)\, \tilde{N}^{(\pm)}(t')\, e^{2\pi i f t'}\, dt' \right) . $$

Substituting from (3.64b) and (3.64c) gives

$$ \mathbf{E}\Big( \big| \mathcal{N}_T^{(\pm)}(f) \big|^2 \Big) = \frac{1}{4}\, \mathbf{E}\left( \int_{-\infty}^{\infty} \Pi(t, T) \big[ \tilde{N}(t) \pm \tilde{N}(-t) \big] e^{-2\pi i f t}\, dt \int_{-\infty}^{\infty} \Pi(t', T) \big[ \tilde{N}(t') \pm \tilde{N}(-t') \big] e^{2\pi i f t'}\, dt' \right) , $$

which becomes, applying the linearity of operator E discussed in Sec. 3.10 above,

$$ \mathbf{E}\Big( \big| \mathcal{N}_T^{(\pm)}(f) \big|^2 \Big) = \frac{1}{4} \int_{-\infty}^{\infty} dt\, \Pi(t, T)\, e^{-2\pi i f t} \int_{-\infty}^{\infty} dt'\, \Pi(t', T)\, e^{2\pi i f t'}\, \mathbf{E}\Big( \big[ \tilde{N}(t) \pm \tilde{N}(-t) \big] \big[ \tilde{N}(t') \pm \tilde{N}(-t') \big] \Big) . \qquad (3.66a) $$

The linearity of E can also be used to write

$$ \mathbf{E}\Big( \big[ \tilde{N}(t) \pm \tilde{N}(-t) \big] \big[ \tilde{N}(t') \pm \tilde{N}(-t') \big] \Big) = \mathbf{E}\big( \tilde{N}(t)\tilde{N}(t') \big) \pm \mathbf{E}\big( \tilde{N}(t)\tilde{N}(-t') \big) \pm \mathbf{E}\big( \tilde{N}(-t)\tilde{N}(t') \big) + \mathbf{E}\big( \tilde{N}(-t)\tilde{N}(-t') \big) . $$

Equation (3.30b), which specifies the autocorrelation function of wide-sense stationary random
functions like Ñ(t), can now be applied to get

$$ \mathbf{E}\Big( \big[ \tilde{N}(t) \pm \tilde{N}(-t) \big] \big[ \tilde{N}(t') \pm \tilde{N}(-t') \big] \Big) = R_{\tilde{N}\tilde{N}}(t - t') \pm R_{\tilde{N}\tilde{N}}(t + t') \pm R_{\tilde{N}\tilde{N}}(-t - t') + R_{\tilde{N}\tilde{N}}(-t + t') . $$

According to Eq. (3.48b) the autocorrelation function R_ÑÑ is even, so the right-hand side can be
simplified to

$$ \mathbf{E}\Big( \big[ \tilde{N}(t) \pm \tilde{N}(-t) \big] \big[ \tilde{N}(t') \pm \tilde{N}(-t') \big] \Big) = 2 R_{\tilde{N}\tilde{N}}(t - t') \pm 2 R_{\tilde{N}\tilde{N}}(t + t') . $$

Putting this result back into Eq. (3.66a) gives

$$ \mathbf{E}\Big( \big| \mathcal{N}_T^{(\pm)}(f) \big|^2 \Big) = \frac{1}{2} \int_{-\infty}^{\infty} dt\, \Pi(t, T)\, e^{-2\pi i f t} \int_{-\infty}^{\infty} dt'\, \Pi(t', T)\, R_{\tilde{N}\tilde{N}}(t - t')\, e^{2\pi i f t'} \;\pm\; \frac{1}{2} \int_{-\infty}^{\infty} dt\, \Pi(t, T)\, e^{-2\pi i f t} \int_{-\infty}^{\infty} dt'\, \Pi(t', T)\, R_{\tilde{N}\tilde{N}}(t + t')\, e^{2\pi i f t'} . \qquad (3.66b) $$

Equation (3.48d) states that there exists a power spectrum S_ÑÑ(f) such that

$$ R_{\tilde{N}\tilde{N}}(t \pm t') = \int_{-\infty}^{\infty} S_{\tilde{N}\tilde{N}}(f')\, e^{2\pi i f' (t \pm t')}\, df' . $$

Substituting this expression into the first term on the right-hand side of the formula for
E(|𝒩_T⁽±⁾(f)|²) and moving the integral over S_ÑÑ to the front, we get

$$ \mathbf{E}\Big( \big| \mathcal{N}_T^{(\pm)}(f) \big|^2 \Big) = \frac{1}{2} \int_{-\infty}^{\infty} df'\, S_{\tilde{N}\tilde{N}}(f') \int_{-\infty}^{\infty} dt\, \Pi(t, T)\, e^{-2\pi i t (f - f')} \int_{-\infty}^{\infty} dt'\, \Pi(t', T)\, e^{2\pi i t' (f - f')} \;\pm\; \frac{1}{2} \int_{-\infty}^{\infty} dt\, \Pi(t, T)\, e^{-2\pi i f t} \int_{-\infty}^{\infty} dt'\, \Pi(t', T)\, R_{\tilde{N}\tilde{N}}(t + t')\, e^{2\pi i f t'} . \qquad (3.66c) $$

Interchanging the roles of f, t and then replacing F by T in Eq. (2.108b) of Chapter 2 gives

$$ \int_{-\infty}^{\infty} \Pi(t, T)\, e^{-2\pi i t (f - f')}\, dt = \int_{-\infty}^{\infty} \Pi(t, T)\, e^{2\pi i t (f - f')}\, dt = 2T \operatorname{sinc}\big( 2\pi (f - f') T \big) , \qquad (3.66d) $$

with Eq. (2.106d) showing that the definition of the sinc function is

$$ \operatorname{sinc}(x) = \frac{\sin(x)}{x} . \qquad (3.66e) $$

Substitution of this formula into Eq. (3.66c) leads to

$$ \mathbf{E}\Big( \big| \mathcal{N}_T^{(\pm)}(f) \big|^2 \Big) = \frac{1}{2} \int_{-\infty}^{\infty} S_{\tilde{N}\tilde{N}}(f') \big[ 2T \operatorname{sinc}\big( 2\pi (f - f') T \big) \big]^2\, df' \;\pm\; \frac{1}{2} \int_{-\infty}^{\infty} dt\, \Pi(t, T)\, e^{-2\pi i f t} \int_{-\infty}^{\infty} dt'\, \Pi(t', T)\, R_{\tilde{N}\tilde{N}}(t + t')\, e^{2\pi i f t'} . \qquad (3.66f) $$

To evaluate the integral over df′ in (3.66f), we assume that T is chosen large enough that

$$ \big[ \operatorname{sinc}(2\pi f' T) \big]^2 = \left[ \frac{\sin(2\pi f' T)}{2\pi f' T} \right]^2 $$

varies rapidly as a function of f′ compared to S_ÑÑ(f′). Hence, if Δf_S is the change in f′
required to cause a significant change in S_ÑÑ(f′), we must have

$$ \Delta f_S \cdot T \gg 1 \quad \text{or} \quad T \gg \frac{1}{\Delta f_S} . \qquad (3.67a) $$

Then we can follow the lead of (3.57f) and approximate

$$ 2T \operatorname{sinc}^2(2\pi f' T) = 2T \left[ \frac{\sin(2\pi f' T)}{2\pi f' T} \right]^2 \cong \delta(f') . \qquad (3.67b) $$

Applying this approximation to the integral over df′ on the right-hand side of (3.66f), we replace

$$ \big[ 2T \operatorname{sinc}\big( 2\pi (f - f') T \big) \big]^2 = 2T \cdot \left\{ 2T \left[ \frac{\sin\big( 2\pi (f - f') T \big)}{2\pi (f - f') T} \right]^2 \right\} $$

by 2Tδ(f′ − f) to get

$$ \int_{-\infty}^{\infty} S_{\tilde{N}\tilde{N}}(f') \big[ 2T \operatorname{sinc}\big( 2\pi (f - f') T \big) \big]^2\, df' \cong 2T \int_{-\infty}^{\infty} S_{\tilde{N}\tilde{N}}(f')\, \delta(f' - f)\, df' = 2T\, S_{\tilde{N}\tilde{N}}(f) . $$

This result can now be substituted back into Eq. (3.66f) to get

$$ \mathbf{E}\Big( \big| \mathcal{N}_T^{(\pm)}(f) \big|^2 \Big) \cong T\, S_{\tilde{N}\tilde{N}}(f) \pm \frac{1}{2}\, \Lambda_T , \qquad (3.67c) $$

where we define Λ_T to be the value of the remaining double integral,

$$ \Lambda_T = \int_{-\infty}^{\infty} dt\, \Pi(t, T)\, e^{-2\pi i f t} \int_{-\infty}^{\infty} dt'\, \Pi(t', T)\, R_{\tilde{N}\tilde{N}}(t + t')\, e^{2\pi i f t'} . \qquad (3.67d) $$

To evaluate Λ_T, we change the variable of integration in the inner integral from t′ to
t″ = −(t + t′) to get

$$ \Lambda_T = \int_{-\infty}^{\infty} dt\, \Pi(t, T)\, e^{-4\pi i f t} \int_{-\infty}^{\infty} dt''\, \Pi\big( -(t + t''), T \big)\, R_{\tilde{N}\tilde{N}}(-t'')\, e^{-2\pi i f t''} . $$

According to Eq. (2.56c) in Chapter 2, function Π(t, T) is an even function of t, so

$$ \Pi\big( -(t + t''), T \big) = \Pi(t + t'', T) . $$

Similarly, according to Eq. (3.48b) above,

$$ R_{\tilde{N}\tilde{N}}(-t'') = R_{\tilde{N}\tilde{N}}(t'') . $$

Applying these two formulas to the Λ_T double integral gives, after interchanging the order of the
integrals over dt and dt″,

$$ \Lambda_T = \int_{-\infty}^{\infty} dt''\, R_{\tilde{N}\tilde{N}}(t'')\, e^{-2\pi i f t''} \int_{-\infty}^{\infty} dt\, \Pi(t, T)\, \Pi(t + t'', T)\, e^{-4\pi i f t} . \qquad (3.67e) $$

To simplify the inner integral on the right-hand side of (3.67e), we note that only when both
Π(t, T) and Π(t + t″, T) are one is their product one—in other words, when either Π(t, T) or
Π(t + t″, T) is zero, then their product is zero and no contribution is made to the integral. Figure
3.6(a) shows what happens for positive values of t″, and Fig. 3.6(b) shows what happens for
negative values of t″.

FIGURE 3.6(a). [Π(t, T) (solid) and Π(t + t″, T) (dashed) for t″ > 0; the dashed block is shifted to the left.]
FIGURE 3.6(b). [Π(t, T) (solid) and Π(t + t″, T) (dashed) for t″ < 0; the dashed block is shifted to the right.]
In both Figs. 3.6(a) and 3.6(b), the dark solid line is a plot of Π(t, T) and the dashed line is a plot
of Π(t + t″, T). When t″ > 0, the dashed block shifts to the left; when t″ < 0, the dashed block
shifts to the right. Only in the region of overlap of the solid and dashed lines in Figs. 3.6(a) and
3.6(b) does the product function

$$ \Pi(t, T)\, \Pi(t + t'', T) $$

allow a contribution to be made to the inner integral. Hence, we can write

$$ \Pi(t, T)\, \Pi(t + t'', T) = \begin{cases} 1 & \text{when } 0 < t'' < 2T \text{ and } -T < t < T - t'' \\ 1 & \text{when } 0 > t'' > -2T \text{ and } -T - t'' < t < T \\ 0 & \text{outside these regions} \end{cases} , \qquad (3.67f) $$

disregarding the edge points of the Π functions because these single-point values do not
contribute to the integral. Equation (3.67e) thus reduces to

$$ \Lambda_T = \int_{-2T}^{0} dt''\, R_{\tilde{N}\tilde{N}}(t'')\, e^{-2\pi i f t''} \int_{-T - t''}^{T} dt\, e^{-4\pi i f t} + \int_{0}^{2T} dt''\, R_{\tilde{N}\tilde{N}}(t'')\, e^{-2\pi i f t''} \int_{-T}^{T - t''} dt\, e^{-4\pi i f t} . \qquad (3.67g) $$

We note that

$$ \int_{a}^{b} e^{-4\pi i f t}\, dt = \frac{1}{4\pi i f} \big[ e^{-4\pi i f a} - e^{-4\pi i f b} \big] . \qquad (3.67h) $$

Applying (3.67h) to (3.67g) gives

$$ \Lambda_T = \frac{1}{4\pi i f} \int_{-2T}^{0} R_{\tilde{N}\tilde{N}}(t'')\, e^{-2\pi i f t''} \big[ e^{4\pi i f (T + t'')} - e^{-4\pi i f T} \big]\, dt'' + \frac{1}{4\pi i f} \int_{0}^{2T} R_{\tilde{N}\tilde{N}}(t'')\, e^{-2\pi i f t''} \big[ e^{4\pi i f T} - e^{-4\pi i f (T - t'')} \big]\, dt'' . $$

Changing the variable of integration in the first integral to t‴ = −t″ leads to [remember to apply
Eq. (3.48b)]

$$ \Lambda_T = \frac{1}{4\pi i f} \int_{0}^{2T} R_{\tilde{N}\tilde{N}}(t''') \big[ e^{4\pi i f T} e^{-2\pi i f t'''} - e^{-4\pi i f T} e^{2\pi i f t'''} \big]\, dt''' + \frac{1}{4\pi i f} \int_{0}^{2T} R_{\tilde{N}\tilde{N}}(t'') \big[ e^{4\pi i f T} e^{-2\pi i f t''} - e^{-4\pi i f T} e^{2\pi i f t''} \big]\, dt'' = \frac{e^{4\pi i f T}}{2\pi i f} \int_{0}^{2T} R_{\tilde{N}\tilde{N}}(t)\, e^{-2\pi i f t}\, dt - \frac{e^{-4\pi i f T}}{2\pi i f} \int_{0}^{2T} R_{\tilde{N}\tilde{N}}(t)\, e^{2\pi i f t}\, dt , $$

where in the last step we have dropped the primes from the variables of integration. The second
term is the complex conjugate of the first with a minus sign, so this formula can be written as

$$ \Lambda_T = \operatorname{Re}\left[ \frac{e^{4\pi i f T}}{\pi i f} \int_{0}^{2T} R_{\tilde{N}\tilde{N}}(t)\, e^{-2\pi i f t}\, dt \right] \qquad (3.67i) $$

because Re(c) = (c/2) + (c*/2) for any complex number c.
= + for any complex number c.
The Heaviside step function is defined to be


1 for 0
( ) 1 2 for 0
0 for 0
t
t t
t
> ­
°
Ξ = =
®
°
<
¯



(3.67j)

in Eq. (2.70a) of Chapter 2. The integral on the right-hand side of (3.67i) can now be written as


2
2 2
0
( ) ( ) ( , 2 ) ( )
T
ift ift
NN NN
R t e dt t t T R t e dt
π π

− −
−∞
= Ξ Π
³ ³

. (3.67k)

The right-hand side is the Fourier transform of

( ) ( , 2 ) ( )
NN
t t T R t Ξ Π



and the Fourier-transform operator F defined in Eq. (2.29a) of Chapter 2 can be used to write it as


( )
( ( ) ( , 2 ) ( ))
ift
NN
t t T R t

Ξ Π

F .

The Fourier convolution theorem [see Eq. (2.39j) in Chapter 2] can be applied to get

$$ \mathbf{F}^{(-ift)}\big( \Xi(t)\, \Pi(t, 2T)\, R_{\tilde{N}\tilde{N}}(t) \big) = \mathbf{F}^{(-ift')}\big( \Xi(t')\, \Pi(t', 2T) \big) \ast \mathbf{F}^{(-ift'')}\big( R_{\tilde{N}\tilde{N}}(t'') \big) . \qquad (3.68a) $$

According to Eq. (3.48c) there exists a power spectrum S_ÑÑ(f) such that

$$ S_{\tilde{N}\tilde{N}}(f) = \int_{-\infty}^{\infty} R_{\tilde{N}\tilde{N}}(t'')\, e^{-2\pi i f t''}\, dt'' = \mathbf{F}^{(-ift'')}\big( R_{\tilde{N}\tilde{N}}(t'') \big) . \qquad (3.68b) $$

Evaluating F^(−ift′)(Ξ(t′)Π(t′, 2T)) is not much more difficult. Writing the Fourier transform as an
integral gives [remember that e^{iφ} = cos(φ) + i sin(φ)]

$$ \mathbf{F}^{(-ift')}\big( \Xi(t')\, \Pi(t', 2T) \big) = \int_{0}^{2T} e^{-2\pi i f t'}\, dt' = \frac{1}{2\pi i f} \big[ 1 - e^{-4\pi i f T} \big] = \frac{e^{-2\pi i f T}}{2\pi i f} \big[ e^{2\pi i f T} - e^{-2\pi i f T} \big] = \frac{1}{\pi f} \big[ \cos(2\pi f T) - i \sin(2\pi f T) \big] \sin(2\pi f T) = \frac{\sin(4\pi f T)}{2\pi f} - \frac{i\, \sin^2(2\pi f T)}{\pi f} , $$

where in the last step we use that

$$ \sin\theta \cos\theta = \frac{1}{2} \sin(2\theta) . $$

Applying the formula for the sinc function from Eq. (3.66e), we end up with

$$ \mathbf{F}^{(-ift')}\big( \Xi(t')\, \Pi(t', 2T) \big) = 2T \operatorname{sinc}(4\pi f T) - i\, (2\pi f T) \big[ 2T \operatorname{sinc}^2(2\pi f T) \big] . \qquad (3.68c) $$

Equations (3.68b) and (3.68c) are substituted into (3.68a) to get

$$ \mathbf{F}^{(-ift)}\big( \Xi(t)\, \Pi(t, 2T)\, R_{\tilde{N}\tilde{N}}(t) \big) = \Big\{ 2T \operatorname{sinc}(4\pi f T) - i\, (2\pi f T) \big[ 2T \operatorname{sinc}^2(2\pi f T) \big] \Big\} \ast S_{\tilde{N}\tilde{N}}(f) , $$

which can then be substituted into (3.67k), giving

$$ \int_{0}^{2T} R_{\tilde{N}\tilde{N}}(t)\, e^{-2\pi i f t}\, dt = \Big\{ 2T \operatorname{sinc}(4\pi f T) - i\, (2\pi f T) \big[ 2T \operatorname{sinc}^2(2\pi f T) \big] \Big\} \ast S_{\tilde{N}\tilde{N}}(f) . \qquad (3.68d) $$

Equation (2.67c) in Chapter 2 and the discussion following it show that

$$ \frac{\sin(2\pi n f)}{\pi f} \to \delta(f) \quad \text{as } n \to \infty , \qquad (3.68e) $$

where t in (2.67c) is here replaced by f. We note that, working with Eq. (3.66e),

$$ 2T \operatorname{sinc}(4\pi f T) = \frac{\sin(4\pi f T)}{2\pi f} = \frac{1}{2} \cdot \frac{\sin\big( 2\pi (2T) f \big)}{\pi f} . $$

Hence, applying (3.68e), we have

$$ 2T \operatorname{sinc}(4\pi f T) \to \frac{1}{2}\, \delta(f) \quad \text{as } (2T) \to \infty . \qquad (3.68f) $$

As n gets large in (3.68e), the sine oscillates ever more rapidly with f. Similarly, as 2T gets large
in (3.68f)—which is, of course, the same as T getting large—the sinc oscillates ever more rapidly
with f. In order to approximate the sinc in (3.68f) by a delta function, then, we need to have the
other functions of f that are also present varying slowly compared to the original oscillation.
Again assuming, as in the discussion following Eq. (3.66f), that T is large enough for the first
sinc function on the right-hand side of Eq. (3.68d) to oscillate rapidly compared to the noise-
power spectrum S_ÑÑ, we expand the convolution in (3.68d), writing it as [apply Eq. (2.38e) in
Chapter 2]

$$ \int_{0}^{2T} R_{\tilde{N}\tilde{N}}(t)\, e^{-2\pi i f t}\, dt = \big\{ [2T \operatorname{sinc}(4\pi f T)] \ast S_{\tilde{N}\tilde{N}}(f) \big\} - i \Big\{ \big( 2\pi f T \big[ 2T \operatorname{sinc}^2(2\pi f T) \big] \big) \ast S_{\tilde{N}\tilde{N}}(f) \Big\} , $$

and then apply (3.68f) to get, since δ(f) ∗ S_ÑÑ(f) = S_ÑÑ(f), that

$$ \int_{0}^{2T} R_{\tilde{N}\tilde{N}}(t)\, e^{-2\pi i f t}\, dt \cong \frac{1}{2}\, S_{\tilde{N}\tilde{N}}(f) - i \Big\{ \big( 2\pi f T \big[ 2T \operatorname{sinc}^2(2\pi f T) \big] \big) \ast S_{\tilde{N}\tilde{N}}(f) \Big\} . \qquad (3.68g) $$
The remaining convolution on the right-hand side can be written as [see Eqs. (2.38a) and (2.38b)
in Chapter 2]

$$ \big( 2\pi f T \big[ 2T \operatorname{sinc}^2(2\pi f T) \big] \big) \ast S_{\tilde{N}\tilde{N}}(f) = \int_{-\infty}^{\infty} S_{\tilde{N}\tilde{N}}(f')\; 2\pi (f - f') T \big[ 2T \operatorname{sinc}^2\big( 2\pi (f - f') T \big) \big]\, df' . $$

Both functions (f − f′) and S_ÑÑ(f′) vary slowly with f′ compared to

$$ \big[ 2T \operatorname{sinc}^2\big( 2\pi (f - f') T \big) \big] $$

for large values of T, so (3.67b) can be applied to the integral to get

$$ \big( 2\pi f T \big[ 2T \operatorname{sinc}^2(2\pi f T) \big] \big) \ast S_{\tilde{N}\tilde{N}}(f) \cong \int_{-\infty}^{\infty} S_{\tilde{N}\tilde{N}}(f')\; 2\pi (f - f') T\, \delta(f - f')\, df' = 0 . \qquad (3.68h) $$

Substituting this into (3.68g) gives

$$ \int_{0}^{2T} R_{\tilde{N}\tilde{N}}(t)\, e^{-2\pi i f t}\, dt \cong \frac{1}{2}\, S_{\tilde{N}\tilde{N}}(f) , \qquad (3.68i) $$

which can then be put back into (3.67i) to get [using e^{iφ} = cos(φ) + i sin(φ)]

$$ \Lambda_T \cong \operatorname{Re}\left[ \frac{\cos(4\pi f T) + i \sin(4\pi f T)}{\pi i f} \cdot \frac{1}{2}\, S_{\tilde{N}\tilde{N}}(f) \right] = \frac{\sin(4\pi f T)}{2\pi f}\, S_{\tilde{N}\tilde{N}}(f) . $$

Equation (3.66e) simplifies this to

$$ \Lambda_T \cong \big[ 2T \operatorname{sinc}(4\pi f T) \big]\, S_{\tilde{N}\tilde{N}}(f) . \qquad (3.68j) $$

Substituting this approximation into (3.67c) lets us write, at last, that

$$ \mathbf{E}\Big( \big| \mathcal{N}_T^{(\pm)}(f) \big|^2 \Big) \cong T\, S_{\tilde{N}\tilde{N}}(f) \pm T \operatorname{sinc}(4\pi f T)\, S_{\tilde{N}\tilde{N}}(f) = T\, S_{\tilde{N}\tilde{N}}(f) \big[ 1 \pm \operatorname{sinc}(4\pi f T) \big] . \qquad (3.68k) $$
The approximation in (3.68k) makes sense whenever T is large enough for sinc(2πfT) and
sinc(4πfT) to oscillate rapidly with frequency f compared to S_ÑÑ(f), which is usually true for
white-noise-like power spectra. When fT ≫ 1, the sinc function’s value in formula (3.68k) is
small compared to one [see, for example, Figs. 3.7(a) and 3.7(b)] and we can write

$$ \mathbf{E}\Big( \big| \mathcal{N}_T^{(\pm)}(f) \big|^2 \Big) \cong T\, S_{\tilde{N}\tilde{N}}(f) . \qquad (3.69a) $$

When f = 0, it is of course no longer true that fT ≫ 1. For this special case, the sinc function
is one; and, according to (3.68k), no matter how large T is we have

$$ \mathbf{E}\Big( \big| \mathcal{N}_T^{(-)}(0) \big|^2 \Big) \cong 0 \qquad (3.69b) $$

and

$$ \mathbf{E}\Big( \big| \mathcal{N}_T^{(+)}(0) \big|^2 \Big) \cong 2T\, S_{\tilde{N}\tilde{N}}(0) . \qquad (3.69c) $$


Equation (3.69b) is easy to understand after reviewing the discussion following Eq. (3.65d)
above. Since 𝒩_T⁽⁻⁾ is always an odd function of f, it must be zero at f = 0 according to Eq.
(2.12a) of Chapter 2. To understand Eq. (3.69c), we consult Eqs. (3.65e) and (3.65f) and note that

$$ \big| \mathcal{N}_T^{(+)}(f) \big|^2 + \big| \mathcal{N}_T^{(-)}(f) \big|^2 = \big[ \operatorname{Re}\big( \mathcal{N}_T(f) \big) \big]^2 + \big[ \operatorname{Im}\big( \mathcal{N}_T(f) \big) \big]^2 = \big| \mathcal{N}_T(f) \big|^2 . $$

Applying the expectation operator E to both sides and using its linearity with respect to random
quantities (see Sec. 3.10 above), we get

$$ \mathbf{E}\Big( \big| \mathcal{N}_T^{(+)}(f) \big|^2 \Big) + \mathbf{E}\Big( \big| \mathcal{N}_T^{(-)}(f) \big|^2 \Big) = \mathbf{E}\Big( \big| \mathcal{N}_T(f) \big|^2 \Big) . \qquad (3.69d) $$
FIGURE 3.7(a). [Plot of sinc(4πfT) against f; its first zero falls at f = 1/(4T).]
FIGURE 3.7(b). [Plot of sinc(2πfT) against f; its first zero falls at f = 1/(2T).]
Glancing back at formula (3.57g), we realize, because T is assumed to be large in our analysis
here, that

$$ \frac{\mathbf{E}\Big( \big| \mathcal{N}_T(f) \big|^2 \Big)}{2T} $$

is close to its limiting value as T → ∞. Hence, (3.57g) lets us write

$$ \mathbf{E}\Big( \big| \mathcal{N}_T(f) \big|^2 \Big) \cong 2T\, S_{\tilde{N}\tilde{N}}(f) \qquad (3.69e) $$

for large values of T. This approximation works well no matter what the value of f is. Therefore,
at f = 0 we can substitute (3.69e) into (3.69d) to get

$$ \mathbf{E}\Big( \big| \mathcal{N}_T^{(+)}(0) \big|^2 \Big) + \mathbf{E}\Big( \big| \mathcal{N}_T^{(-)}(0) \big|^2 \Big) \cong 2T\, S_{\tilde{N}\tilde{N}}(0) . \qquad (3.69f) $$

Having already justified (3.69b), we can now apply it to (3.69f) to get

$$ \mathbf{E}\Big( \big| \mathcal{N}_T^{(+)}(0) \big|^2 \Big) \cong 2T\, S_{\tilde{N}\tilde{N}}(0) . $$

This result then justifies formula (3.69c) above.
Equation (3.69d) can also be used to justify the assumption behind formula (3.69e) that, when
f ≠ 0, the ratio

$$ \frac{\mathbf{E}\Big( \big| \mathcal{N}_T(f) \big|^2 \Big)}{2T} $$

is, for large values of T, close to its limiting value of S_ÑÑ(f). When f ≠ 0 and T is large so that
fT ≫ 1, we can substitute (3.69a) into (3.69d) to rederive (3.69e),

$$ \mathbf{E}\Big( \big| \mathcal{N}_T(f) \big|^2 \Big) \cong 2T\, S_{\tilde{N}\tilde{N}}(f) . $$

According to (3.69a), then, it follows that when fT ≫ 1 and T is large, both

$$ \mathbf{E}\Big( \big| \mathcal{N}_T^{(+)}(f) \big|^2 \Big) = \mathbf{E}\Big( \big[ \operatorname{Re}\big( \mathcal{N}_T(f) \big) \big]^2 \Big) \cong T\, S_{\tilde{N}\tilde{N}}(f) $$

and

$$ \mathbf{E}\Big( \big| \mathcal{N}_T^{(-)}(f) \big|^2 \Big) = \mathbf{E}\Big( \big[ \operatorname{Im}\big( \mathcal{N}_T(f) \big) \big]^2 \Big) \cong T\, S_{\tilde{N}\tilde{N}}(f) $$

contribute equally to E(|𝒩_T(f)|²).
Having arrived at the formula

$$ \mathbf{E}\Big( \big| \mathcal{N}_T(f) \big|^2 \Big) \cong 2T\, S_{\tilde{N}\tilde{N}}(f) $$

without using Eq. (3.57g)—that is, without thinking about what the limiting value of the ratio

$$ \frac{\mathbf{E}\Big( \big| \mathcal{N}_T(f) \big|^2 \Big)}{2T} $$

might be as T gets large—we can now work in reverse to get that

$$ S_{\tilde{N}\tilde{N}}(f) \cong \frac{\mathbf{E}\Big( \big| \mathcal{N}_T(f) \big|^2 \Big)}{2T} . $$

Not only does this result demonstrate that the ratio

$$ \frac{\mathbf{E}\Big( \big| \mathcal{N}_T(f) \big|^2 \Big)}{2T} $$

is indeed about equal to S_ÑÑ(f) when fT ≫ 1 and T is large, but we have also seen, when
fT ≫ 1 and T is large, that the expected value of the squared real component of 𝒩_T and the
expected value of the squared imaginary component of 𝒩_T contribute equally to the expected
value of the squared magnitude of 𝒩_T. In other words, both

$$ \mathbf{E}\Big( \big| \mathcal{N}_T^{(+)}(f) \big|^2 \Big) = \mathbf{E}\Big( \big[ \operatorname{Re}\big( \mathcal{N}_T(f) \big) \big]^2 \Big) $$

and

$$ \mathbf{E}\Big( \big| \mathcal{N}_T^{(-)}(f) \big|^2 \Big) = \mathbf{E}\Big( \big[ \operatorname{Im}\big( \mathcal{N}_T(f) \big) \big]^2 \Big) $$

have turned out to be about half the expected value of the squared magnitude of 𝒩_T, which lets
us write

$$ \mathbf{E}\Big( \big| \mathcal{N}_T(f) \big|^2 \Big) \cong 2\, \mathbf{E}\Big( \big| \mathcal{N}_T^{(+)}(f) \big|^2 \Big) = 2\, \mathbf{E}\Big( \big[ \operatorname{Re}\big( \mathcal{N}_T(f) \big) \big]^2 \Big) \qquad (3.69g) $$

and

$$ \mathbf{E}\Big( \big| \mathcal{N}_T(f) \big|^2 \Big) \cong 2\, \mathbf{E}\Big( \big| \mathcal{N}_T^{(-)}(f) \big|^2 \Big) = 2\, \mathbf{E}\Big( \big[ \operatorname{Im}\big( \mathcal{N}_T(f) \big) \big]^2 \Big) . \qquad (3.69h) $$
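
Equations (3.69g) and (3.69h) are easy to check numerically (an illustrative sketch of my own; the light smoothing used to make the spectrum vary slowly is an arbitrary choice): away from f = 0 the real and imaginary parts of the truncated transform each carry about half of E(|𝒩_T(f)|²), while at f = 0 the imaginary part vanishes.

```python
import numpy as np

rng = np.random.default_rng(8)
members, nt = 4000, 512

# Stationary noise with a smooth spectrum (white noise lightly low-passed).
x = rng.standard_normal((members, nt + 2))
n = (x[:, :-2] + x[:, 1:-1] + x[:, 2:]) / 3.0

N = np.fft.fft(n, axis=1)            # discrete stand-in for the truncated transform
k = nt // 8                          # a frequency well away from f = 0 (fT >> 1)

re2 = (N[:, k].real ** 2).mean()
im2 = (N[:, k].imag ** 2).mean()
print("E([Re N]^2):", re2)           # Eqs. (3.69g)-(3.69h): the two are nearly equal,
print("E([Im N]^2):", im2)           # each about half of E(|N|^2)
print("E(|N|^2)/2 :", (np.abs(N[:, k]) ** 2).mean() / 2)

print("Im part at f = 0:", np.abs(N[:, 0].imag).max())  # zero: the split fails at f = 0
```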

A not-very-rigorous argument often used to derive Eqs. (3.69a), (3.69g), and (3.69h) starts out
by breaking 𝒩_T(f) into real and imaginary parts. (This step is sound—we did the same thing in
our analysis above.) Writing

$$ \big| \mathcal{N}_T(f) \big|^2 = \big[ \operatorname{Re}\big( \mathcal{N}_T(f) \big) \big]^2 + \big[ \operatorname{Im}\big( \mathcal{N}_T(f) \big) \big]^2 , \qquad (3.70a) $$

we next assume that 𝒩_T is equally likely to be real or imaginary, which means that

$$ \mathbf{E}\Big( \big[ \operatorname{Re}\big( \mathcal{N}_T(f) \big) \big]^2 \Big) = \mathbf{E}\Big( \big[ \operatorname{Im}\big( \mathcal{N}_T(f) \big) \big]^2 \Big) . \qquad (3.70b) $$

This is the result, of course, that we have gone to some trouble to justify analytically rather
than just assuming it applies; it is sometimes true and sometimes very wrong, for example, when
f ≅ 0 or when S_ÑÑ varies rapidly with f. Applying the E expectation operator to both sides of
(3.70a) gives, using the linearity of E explained in Sec. 3.10,

$$ \mathbf{E}\Big( \big| \mathcal{N}_T(f) \big|^2 \Big) = \mathbf{E}\Big( \big[ \operatorname{Re}\big( \mathcal{N}_T(f) \big) \big]^2 \Big) + \mathbf{E}\Big( \big[ \operatorname{Im}\big( \mathcal{N}_T(f) \big) \big]^2 \Big) . \qquad (3.70c) $$

Substitution of (3.70b) into (3.70c) then leads to

$$ \mathbf{E}\Big( \big| \mathcal{N}_T(f) \big|^2 \Big) = 2\, \mathbf{E}\Big( \big[ \operatorname{Re}\big( \mathcal{N}_T(f) \big) \big]^2 \Big) \qquad (3.70d) $$

and

$$ \mathbf{E}\Big( \big| \mathcal{N}_T(f) \big|^2 \Big) = 2\, \mathbf{E}\Big( \big[ \operatorname{Im}\big( \mathcal{N}_T(f) \big) \big]^2 \Big) . \qquad (3.70e) $$

Consulting Eqs. (3.65e) and (3.65f), we see that formulas (3.70d) and (3.70e) are identical to
(3.69g) and (3.69h). Fortunately, since a more rigorous line of reasoning has already been used to
derive Eqs. (3.69g) and (3.69h), there is no need to rely on the assumption that (3.70b) is true to
establish the truth of (3.70d) and (3.70e). Having derived these results more rigorously, we also
now know that formulas (3.69g) and (3.69h) and formulas (3.70d) and (3.70e) are approximations
that should be used only when T is large, when fT ≫ 1, and when S_ÑÑ varies slowly with
frequency f.
3.27 Analyzing the Noise in Artificially Created Even Signals
Many times in interferometer measurements we take all the data recorded for times t > 0 and,
assuming the signal is an even function of time, use the positive-time data to specify what the
data “ought to be” at t < 0. This means that the noise in the data for −∞ < t < ∞ ends up being an
even function of time; that is, the real-valued random function ñ_E(t) that characterizes the noise
at t > 0 in the original recording also characterizes the noise for all negative time values because
of the way we construct the data set. Mathematically we say that

$$ \tilde{n}_E(-t) = \tilde{n}_E(t) \quad \text{for all } -\infty < t < \infty . \qquad (3.71a) $$

Although random function ñ_E(t) is neither ergodic nor stationary, we can assume that a real-
valued and stationary random function ñ(t) exists such that

$$ \tilde{n}_E(t) = \tilde{n}(t) \quad \text{for } t \ge 0 . \qquad (3.71b) $$

Just like any other stationary random function, ñ(t) has an autocorrelation function [see Eq.
(3.30b)]

$$ R_{\tilde{n}\tilde{n}}(t - t') = \mathbf{E}\big( \tilde{n}(t)\, \tilde{n}(t') \big) . \qquad (3.71c) $$

Following the conventions of Sec. 3.20 above [see Eqs. (3.48a)–(3.48c)], we note that R_ññ is an
even function,

$$ R_{\tilde{n}\tilde{n}}(-\tau) = R_{\tilde{n}\tilde{n}}(\tau) , \qquad (3.71d) $$

and that autocorrelation R_ññ and the power spectrum S_ññ make up a Fourier-transform pair,

$$ S_{\tilde{n}\tilde{n}}(f) = \int_{-\infty}^{\infty} R_{\tilde{n}\tilde{n}}(\tau)\, e^{-2\pi i f \tau}\, d\tau \qquad (3.71e) $$

and

$$ R_{\tilde{n}\tilde{n}}(\tau) = \int_{-\infty}^{\infty} S_{\tilde{n}\tilde{n}}(f)\, e^{2\pi i f \tau}\, df . \qquad (3.71f) $$
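
The construction described by Eqs. (3.71a) and (3.71b) amounts to reflecting the positive-time record, as in this small sketch (my own illustration):

```python
import numpy as np

rng = np.random.default_rng(9)
nt = 9                                  # samples at t = 0, 1, ..., nt-1

n_pos = rng.standard_normal(nt)         # stationary noise recorded for t >= 0

# Eq. (3.71a): reflect the positive-time record to create an even function of t.
t = np.arange(-(nt - 1), nt)            # t = -(nt-1), ..., 0, ..., nt-1
n_E = np.concatenate([n_pos[:0:-1], n_pos])

print(np.allclose(n_E, n_E[::-1]))      # True: n_E(-t) = n_E(t)
print(n_E[t == 0], n_pos[0])            # the t = 0 sample is shared
```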

Following the same pattern as in Eq. (3.65a), we define

$$ \tilde{N}_T(f) = \int_{-T}^{T} \tilde{n}(t)\, e^{-2\pi i f t}\, dt = \int_{-\infty}^{\infty} \Pi(t, T)\, \tilde{n}(t)\, e^{-2\pi i f t}\, dt \qquad (3.72a) $$

and

$$ \tilde{N}_{TE}(f) = \int_{-T}^{T} \tilde{n}_E(t)\, e^{-2\pi i f t}\, dt = \int_{-\infty}^{\infty} \Pi(t, T)\, \tilde{n}_E(t)\, e^{-2\pi i f t}\, dt . \qquad (3.72b) $$

For large values of T, we can derive a simple approximation for

$$ \mathbf{E}\Big( \big| \tilde{N}_{TE}(f) \big|^2 \Big) , $$

the expectation value of the squared magnitude of Ñ_TE, in terms of

$$ \mathbf{E}\Big( \big| \tilde{N}_T(f) \big|^2 \Big) $$

and the power spectrum S_ññ(f).

We start by specifying the Heaviside step function to be the same as in Eq. (3.67j):

$$ \Xi(t) = \begin{cases} 1 & \text{for } t > 0 \\ 1/2 & \text{for } t = 0 \\ 0 & \text{for } t < 0 \end{cases} . \qquad (3.73a) $$

This is the same step function defined in Eq. (2.70a) in Chapter 2. It follows that ñ_E(t) can be
written as [see Eqs. (3.71a) and (3.71b)]

$$ \tilde{n}_E(t) = \tilde{n}(t)\, \Xi(t) + \tilde{n}(-t)\, \Xi(-t) . \qquad (3.73b) $$
We note that for t > 0, the first term has Ξ(t) = 1 and the second term has Ξ(−t) = 0, so

$$ \tilde{n}_E(t) = \tilde{n}(t) . $$

For t < 0, the first term has Ξ(t) = 0 and the second term has Ξ(−t) = 1, so

$$ \tilde{n}_E(t) = \tilde{n}(-t) , $$

and when t = 0 both Ξ(t) and Ξ(−t) are 1/2, so

$$ \tilde{n}_E(0) = \tilde{n}(0) . $$

We can now write, using Eq. (3.72b) and remembering that ñ_E is real, that

$$ \mathbf{E}\Big( \big| \tilde{N}_{TE}(f) \big|^2 \Big) = \mathbf{E}\Big( \tilde{N}_{TE}(f) \cdot \tilde{N}_{TE}(f)^{\ast} \Big) = \mathbf{E}\left( \int_{-\infty}^{\infty} \Pi(t, T)\, \tilde{n}_E(t)\, e^{-2\pi i f t}\, dt \int_{-\infty}^{\infty} \Pi(t', T)\, \tilde{n}_E(t')\, e^{2\pi i f t'}\, dt' \right) . $$


Using the linearity of E described in Sec. 3.10 above, we bring the expectation operator inside
the double integral over dt and dt′ to get

$$ \mathbf{E}\Big( \big| \tilde{N}_{TE}(f) \big|^2 \Big) = \int_{-\infty}^{\infty} dt\, \Pi(t, T)\, e^{-2\pi i f t} \int_{-\infty}^{\infty} dt'\, \Pi(t', T)\, e^{2\pi i f t'}\, \mathbf{E}\big( \tilde{n}_E(t)\, \tilde{n}_E(t') \big) . \qquad (3.73c) $$

Equation (3.73b) shows that, again using the linearity of the expectation operator,

$$ \mathbf{E}\big( \tilde{n}_E(t)\, \tilde{n}_E(t') \big) = \mathbf{E}\Big( \big[ \tilde{n}(t)\Xi(t) + \tilde{n}(-t)\Xi(-t) \big] \big[ \tilde{n}(t')\Xi(t') + \tilde{n}(-t')\Xi(-t') \big] \Big) $$
$$ = \Xi(t)\Xi(t')\, \mathbf{E}\big( \tilde{n}(t)\tilde{n}(t') \big) + \Xi(-t)\Xi(-t')\, \mathbf{E}\big( \tilde{n}(-t)\tilde{n}(-t') \big) + \Xi(t)\Xi(-t')\, \mathbf{E}\big( \tilde{n}(t)\tilde{n}(-t') \big) + \Xi(-t)\Xi(t')\, \mathbf{E}\big( \tilde{n}(-t)\tilde{n}(t') \big) . $$

Substituting from Eq. (3.71c) gives

$$ \mathbf{E}\big( \tilde{n}_E(t)\, \tilde{n}_E(t') \big) = \Xi(t)\Xi(t')\, R_{\tilde{n}\tilde{n}}(t - t') + \Xi(-t)\Xi(-t')\, R_{\tilde{n}\tilde{n}}(-t + t') + \Xi(t)\Xi(-t')\, R_{\tilde{n}\tilde{n}}(t + t') + \Xi(-t)\Xi(t')\, R_{\tilde{n}\tilde{n}}(-t - t') . $$

Because the autocorrelation is even [see Eq. (3.71d)], this simplifies to

$$ \mathbf{E}\big( \tilde{n}_E(t)\, \tilde{n}_E(t') \big) = R_{\tilde{n}\tilde{n}}(t - t') \big[ \Xi(t)\Xi(t') + \Xi(-t)\Xi(-t') \big] + R_{\tilde{n}\tilde{n}}(t + t') \big[ \Xi(t)\Xi(-t') + \Xi(-t)\Xi(t') \big] . \qquad (3.73d) $$

Substituting the right-hand side of (3.73d) into the double integral in (3.73c) gives

$$ \mathbf{E}\Big( \big| \tilde{N}_{TE}(f) \big|^2 \Big) = \int_{-\infty}^{\infty} dt\, \Pi(t, T)\, e^{-2\pi i f t} \int_{-\infty}^{\infty} dt'\, \Pi(t', T)\, e^{2\pi i f t'}\, R_{\tilde{n}\tilde{n}}(t - t') \big[ \Xi(t)\Xi(t') + \Xi(-t)\Xi(-t') \big] $$
$$ \quad + \int_{-\infty}^{\infty} dt\, \Pi(t, T)\, e^{-2\pi i f t} \int_{-\infty}^{\infty} dt'\, \Pi(t', T)\, e^{2\pi i f t'}\, R_{\tilde{n}\tilde{n}}(t + t') \big[ \Xi(t)\Xi(-t') + \Xi(-t)\Xi(t') \big] = \Lambda_1 + \Lambda_2 , \qquad (3.73e) $$

where

$$ \Lambda_1 = \int_{-\infty}^{\infty} dt\, \Pi(t, T)\, e^{-2\pi i f t} \int_{-\infty}^{\infty} dt'\, \Pi(t', T)\, e^{2\pi i f t'}\, R_{\tilde{n}\tilde{n}}(t - t') \big[ \Xi(t)\Xi(t') + \Xi(-t)\Xi(-t') \big] \qquad (3.73f) $$

and

$$ \Lambda_2 = \int_{-\infty}^{\infty} dt\, \Pi(t, T)\, e^{-2\pi i f t} \int_{-\infty}^{\infty} dt'\, \Pi(t', T)\, e^{2\pi i f t'}\, R_{\tilde{n}\tilde{n}}(t + t') \big[ \Xi(t)\Xi(-t') + \Xi(-t)\Xi(t') \big] . \qquad (3.73g) $$

The dark solid line in Fig. 3.8(a) is a plot of the Heaviside step function Ξ(t) and the dashed
line is a plot of Π(t, T). Disregarding the edge points whose values do not contribute to the
integrals in (3.73f) and (3.73g), the product [Ξ(t) · Π(t, T)] is zero unless both Ξ and Π are
one—that is, the product is zero unless t lies inside the region where both the solid and dashed
plots are one in Fig. 3.8(a). Comparing this region to the plot of

$$ \Pi\!\left( t - \frac{T}{2},\, \frac{T}{2} \right) $$

in Fig. 3.8(b), we see that

$$ \Xi(t) \cdot \Pi(t, T) = \Pi\!\left( t - \frac{T}{2},\, \frac{T}{2} \right) . \qquad (3.74a) $$

In Fig. 3.8(c), the dashed line is again a plot of Π(t, T), but now the dark solid line is a plot of
Ξ(−t). Comparing the region where both Ξ(−t) and Π(t, T) are one in Fig. 3.8(c) to the plot of

$$ \Pi\!\left( t + \frac{T}{2},\, \frac{T}{2} \right) $$

in Fig. 3.8(d), we see that

$$ \Xi(-t) \cdot \Pi(t, T) = \Pi\!\left( t + \frac{T}{2},\, \frac{T}{2} \right) . \qquad (3.74b) $$







T − T
T − T
T − T T − T
t t
t t
FIGURE 3.8(a). FIGURE 3.8(c).
FIGURE 3.8(b). FIGURE 3.8(d).
3 · Random Variables, Random Functions, and Power Spectra

- 324 -
Splitting the formula in Eq. (3.73f) into two double integrals, we get that


2 2
2 2
1
( , ) ( , ) ( )
( , ) ( , ) ( ) ,
( ) ( )
( ) ( )
ift ift
nn
ift ift
nn
dt t T e dt t T e R t t
dt t T e dt t T e R t t
t t
t t
π π
π π
∞ ∞
′ −
−∞ −∞
∞ ∞
′ −
−∞ −∞
′ ′ ′ = Π Π −
′ ′ ′ + Π Π −
′ Λ
Ξ Ξ
′ − −
Ξ Ξ
³ ³
³ ³





which becomes, applying (3.74a) and (3.74b),


2 2
2 2
1
, , ( )
2 2 2 2
, , ( )
2 2 2 2
ift ift
nn
ift ift
nn
T T T T
dt t e dt t e R t t
T T T T
dt t e dt t e R t t
π π
π π
∞ ∞
′ −
−∞ −∞
∞ ∞
′ −
−∞ −∞
§ · § ·
′ ′ ′ = Π − Π − −
¨ ¸ ¨ ¸
© ¹ © ¹
§ · § ·
′ ′ ′ + Π + Π + −
¨ ¸ ¨ ¸
© ¹ © ¹
Λ
³ ³
³ ³


.


After changing the variables of integration in the first double integral from , t t′ to ( / 2) t T τ = −
and ( / 2) t T τ ′ ′ = − , and changing the variables of integration in the second double integral from
, t t′ to ( / 2) t T τ ′′ = + and ( / 2) t T τ ′′′ ′ = + , we see that


2 2
2 2
2 2
2 2
1
, , ( )
2 2
, , ( )
2 2
T T
if if
nn
T T
if if
nn
T T
d e d e R
T T
d e d e R
π τ π τ
π τ π τ
τ τ τ τ τ τ
τ τ τ τ τ τ
§ · § · ∞ ∞
′ − + +
¨ ¸ ¨ ¸
© ¹ © ¹
−∞ −∞
§ · § · ∞ ∞
′′′ ′′ − − −
¨ ¸ ¨ ¸
© ¹ © ¹
−∞ −∞
§ · § ·
′ ′ ′ = Π Π −
¨ ¸ ¨ ¸
© ¹ © ¹
§ · § ·
′′′ ′′′ ′′ ′′ ′′′ ′′ + Π Π −
¨ ¸ ¨ ¸
© ¹ © ¹
Λ
³ ³
³ ³


.


Since

2 ( / 2) 2 ( / 2)
1
if T if T
e e
π π − ± ±
⋅ = ,

the double integral over dτ ′′′ and dτ ′′ has the same value as the double integral over dτ ′ and
dτ , which means that


2 2
1
2 , , ( )
2 2
if if
nn
T T
d e d e R
π τ π τ
τ τ τ τ τ τ
∞ ∞
′ −
−∞ −∞
§ · § ·
′ ′ ′ = Π Π −
¨ ¸ ¨ ¸
© ¹ © ¹
Λ
³ ³

.

This type of double integral has already been evaluated in Sec. 3.26 while simplifying Eq.
(3.66b), but there is no harm in quickly repeating the procedure. Applying Eq. (3.71f), we get

Analyzing the Noise in Artificially Created Even Signals · 3.27
- 325 -

2 2 2 ( )
2 ( ) 2 ( )
1
2 , , ( )
2 2
2 ( ) , ,
2 2
if if if
nn
i f f i f f
nn
T T
d e d e df S f e
T T
df S f d e d e
π τ π τ π τ τ
π τ π τ
τ τ τ τ
τ τ τ τ
∞ ∞ ∞
′ ′ ′ − −
−∞ −∞ −∞
∞ ∞ ∞
′ ′ ′ − − −
−∞ −∞ −∞
§ · § ·
′ ′ ′ ′ = Π Π
¨ ¸ ¨ ¸
© ¹ © ¹
§ · § ·
′ ′ ′ ′ = Π Π
¨ ¸ ¨ ¸
© ¹ © ¹
Λ
³ ³ ³
³ ³ ³


.


This expression can be simplified further using Eq. (3.66d). Equation (3.66d) still holds true if T
is replaced by T/2 because the original T is a dummy parameter. So, replacing T by T/2 and
substituting the result in the formula for ȁ
1
,

( )
2
1
2 ( ) sinc ( )
nn
T S f T T f f df π

−∞
′ ′ ′ ª º = −
¬ ¼
Λ
³

. (3.75a)

According to Eq. (3.66e),

( )
( )
2
2
2
sin 2
sin 2
2
sinc ( ) 2 2
2 2
2
2
T
f
T f
T
T Tf T
T T f
f
π
π
π
π
π
ª º
§ ·
⋅ ⋅
¨ ¸
« »
ª º ′
§ ·
© ¹
′ « » = ⋅ ⋅ =
« »
¨ ¸
′ § ·
© ¹ « » « »
¬ ¼
⋅ ⋅
¨ ¸
« »
© ¹
¬ ¼
,

where 2 T T ′ = . In the limit T →∞ we also have, of course, that T′ →∞, so according to Eq.
(3.57f) it follows that


2
sinc ( ) ( )
as .
T Tf f
T
π δ →
→∞
(3.75b)

Again, we assume that T is large enough to make

( )
2
sinc ( ) ( ) T T f f f f π δ ′ ′ − ≅ −

in Eq. (3.75a). Consequently,


1
2 ( ) ( )
nn
T S f f f df δ

−∞
′ ′ ′ ≅ − Λ
³


or

1
2 ( )
nn
TS f ≅ Λ

. (3.75c)

3 · Random Variables, Random Functions, and Power Spectra

- 326 -
To evaluate
2
Λ , we apply Eqs. (3.74a) and (3.74b) to the right-hand side of Eq. (3.73g) to get


2 2
2 2
2 2
( , ) ( , ) ( )
( , ) ( , ) ( )
, ,
2 2 2 2
( ) ( )
( ) ( )
ift ift
nn
ift ift
nn
ift ift
dt t T e dt t T e R t t
dt t T e dt t T e R t t
T T T T
dt t e dt t e R
t t
t t
π π
π π
π π
∞ ∞
′ −
−∞ −∞
∞ ∞
′ −
−∞ −∞

′ −
−∞
′ ′ ′ = Π Π +
′ ′ ′ + Π Π +
§ · § ·
′ ′ = Π + Π −
¨ ¸ ¨ ¸
© ¹ © ¹
′ Λ −
Ξ Ξ
′ −
Ξ Ξ
³ ³
³ ³
³


2


2 2
( )
, , ( )
2 2 2 2
nn
ift ift
nn
t t
T T T T
dt t e dt t e R t t
π π

−∞
∞ ∞
′ −
−∞ −∞
′ +
§ · § ·
′ ′ ′ + Π − Π + +
¨ ¸ ¨ ¸
© ¹ © ¹
³
³ ³


.



In the first double integral, the t′ , t variables of integration are replaced by ( / 2) t T τ ′ ′ = + and
( / 2) t T τ = − respectively; and in the second double integral, the t′ , t variables of integration are
replaced by ( / 2) t T τ ′′′ ′ = − and ( / 2) t T τ ′′ = + respectively. This leads to



2 2
2 2
2 2
2 2
, , ( )
2 2
, , ( )
2 2
T T
if if
nn
T T
if if
nn
T T
d e d e R
T T
d e d e R
π τ π τ
π τ π τ
τ τ τ τ τ τ
τ τ τ τ τ τ
§ · § · ∞ ∞
′ − − +
¨ ¸ ¨ ¸
© ¹ © ¹
−∞ −∞
§ · § · ∞ ∞
′′′ ′′ − + −
¨ ¸ ¨ ¸
© ¹ © ¹
−∞ −∞
§ · § ·
′ ′ ′ = Π Π +
¨ ¸ ¨ ¸
© ¹ © ¹
§ · § ·
′′′ ′′′ ′′ ′′ ′′ ′′′ + Π Π +
¨ ¸ ¨ ¸
© ¹ © ¹
Λ
³ ³
³ ³


2


or

2 2 2
2 2 2
, , ( )
2 2
, , ( )
2 2
ifT if if
nn
ifT if if
nn
T T
e d e d e R
T T
e d e d e R
π π τ π τ
π π τ π τ
τ τ τ τ τ τ
τ τ τ τ τ τ
∞ ∞
′ −
−∞ −∞
∞ ∞
′′′ ′′ − −
−∞ −∞
§ · § ·
′ ′ ′ = Π Π +
¨ ¸ ¨ ¸
© ¹ © ¹
§ · § ·
′′′ ′′′ ′′ ′′ ′′ ′′′ + Π Π +
¨ ¸ ¨ ¸
© ¹ © ¹
Λ
³ ³
³ ³


2
.
(3.75d)


Everything on the right-hand side of (3.75d) is real except the complex exponentials, so the
second term is the complex conjugate of the first term. It is easy to show that this is true. Starting
with the first term we have

Analyzing the Noise in Artificially Created Even Signals · 3.27
- 327 -

2 2 2
2 2 2
2 2
, , ( )
2 2
, , ( )
2 2
,
2
ifT if if
nn
ifT if if
nn
ifT
T T
e d e d e R
T T
e d e d e R
T
e d e
π π τ π τ
π π τ π τ
π π
τ τ τ τ τ τ
τ τ τ τ τ τ
τ τ

∞ ∞
′ −
−∞ −∞
∞ ∞
′ − −
−∞ −∞
− −
ª º
§ · § ·
′ ′ ′ Π Π +
« » ¨ ¸ ¨ ¸
© ¹ © ¹
¬ ¼
§ · § ·
′ ′ ′ = Π Π +
¨ ¸ ¨ ¸
© ¹ © ¹
§ ·
′′′ ′′′ = Π
¨ ¸
© ¹
³ ³
³ ³




2
, ( ) ,
2
if if
nn
T
d e R
τ π τ
τ τ τ τ
∞ ∞
′′′
−∞ −∞
§ ·
′′ ′′ ′′ ′′′ Π +
¨ ¸
© ¹
³ ³



where in the last step we interchange the order of the double integral and replace the dummy
variables of integration τ , τ ′ by τ ′′ , τ ′′′ respectively. Clearly, the second term in (3.75d) is the
complex conjugate of the first. Since 2Re( ) c c c

= + for any complex number c, it follows that
Eq. (3.75d) can be written as


2 2 2
2Re , , ( )
2 2
ifT if if
nn
T T
e d e d e R
π π τ π τ
τ τ τ τ τ τ
∞ ∞
′ −
−∞ −∞
§ ·
§ · § ·
′ ′ ′ = Π Π +
¨ ¸
¨ ¸ ¨ ¸
© ¹ © ¹
© ¹
Λ
³ ³

2
. (3.75e)

After the variable of integration of the inner integral is changed to ( ) t τ τ ′′ ′ = − + , it can be written
as

2 2 ( )
, ( ) , ( )
2 2
if if t
nn nn
T T
d e R dt t e R t
π τ π τ
τ τ τ τ τ
∞ ∞
′′ ′ − +
−∞ −∞
§ · § ·
′ ′′ ′′ ′ ′′ Π + = Π − − −
¨ ¸ ¨ ¸
© ¹ © ¹
³ ³

. (3.75f)

According to Eq. (3.48b) above and Eq. (2.56c) in Chapter 2, both Π and
nn
R

are even functions,
which means that
, ,
2 2
T T
t t τ τ
§ · § ·
′′ ′ ′′ ′ Π − − = Π +
¨ ¸ ¨ ¸
© ¹ © ¹


and
( ) ( )
nn nn
R t R t ′′ ′′ − =

.

Substituting these two formulas into the right-hand side of (3.75f) gives


2 2 ( )
, ( ) , ( )
2 2
if if t
nn nn
T T
d e R dt t e R t
π τ π τ
τ τ τ τ τ
∞ ∞
′′ ′ − +
−∞ −∞
§ · § ·
′ ′′ ′′ ′ ′′ Π + = Π +
¨ ¸ ¨ ¸
© ¹ © ¹
³ ³

,

3 · Random Variables, Random Functions, and Power Spectra

- 328 -
which can in turn be substituted into (3.75e) to get


2 2 2 ( )
2Re , , ( )
2 2
ifT if if t
nn
T T
e d e dt t e R t
r r t r t
t t t
· ·
´ ´´ ´ ÷ ÷ +
÷· ÷·
§ ·
§ · § ·
´ ´ ´´ ´´ ´ ´´ H H +
¨ ¸
¨ ¸ ¨ ¸
© ¹ © ¹
© ¹
A
³ ³

2
.

Interchanging the order of integration and replacing the variable t ´ by t, we end up with


2 2 4
2Re ( ) , ,
2 2
ifT ift ift
nn
T T
e dt R t e dt t t t e
r r r
· ·
´´ ÷ ÷
÷· ÷·
§ ·
§ · § ·
´´ ´´ ´´ H H +
¨ ¸
¨ ¸ ¨ ¸
© ¹ © ¹
© ¹
A
³ ³

2
. (3.75g)

Comparing (3.75g) with (3.67e), we note that the double integral in the formula for
2
A can be
written as

2 4
/ 2
( ) , ,
2 2
ift ift
nn T
T T
dt R t e dt t t t e
r r
· ·
´´ ÷ ÷
÷· ÷·
§ · § ·
´´ ´´ ´´ H H + A
¨ ¸ ¨ ¸
© ¹ © ¹
³ ³



with the understanding that the random function is now ñ(t) instead of Ñ(t) as in Eq. (3.67e). This
leads to a simpler—well, shorter—formula for
2
A ,


( )
2
/ 2
2Re
ifT
T
e
r
A A
2
. (3.75h)

We have already found the appropriate approximation for
T
A and
/ 2 T
A when T and T/2 are large
enough to make the sinc functions oscillate rapidly with f compared to the noise-power
spectrum. Hence, we now apply formula (3.68j) to (3.75h), which gives, after remembering to
replace Ñ by ñ and T by T/2,


( )
2
2Re [ sinc(2 )] ( )
ifT
nn
e T fT S f
r
r e A

2
.

Since

2
cos(2 ) sin(2 )
ifT
e fT i fT
r
r r + ,

the formula for
2
A can be written as

2 cos(2 ) sinc(2 )] ( )
nn
T fT fT S f r r e A

2
. (3.75i)

Having found good approximations for
1
A and
2
A , we can substitute (3.75c) and (3.75i) into
( )
nn
S f

.
Analyzing the Noise in Artificially Created Even Signals · 3.27
- 329 -
(3.73e) to get

( )
2
( ) 2 ( ) 2 cos(2 ) sinc(2 )] ( )
TE nn nn
N f TS f T fT fT S f r r e +

E
or

( )
2
( ) 2 ( ) [1 cos(2 ) sinc(2 )]
TE nn
N f TS f fT fT r r e +

E . (3.76a)

For large values of T, so that
1 fT >> , (3.76b)

we know that [apply Eq. (3.66e)]


cos(2 ) sin(2 )
1
cos(2 ) sinc(2 ) 1
2 2
fT fT
fT fT
fT fT
r r
r r
r r
s <<

because (i) the absolute value of the product of the sine and cosine must always be less than or
equal to one and (ii) the value of 1/ 2 fT r must be small when fT is large. The formula in
(3.76a) now simplifies to

( )
2
( ) 2 ( )
TE nn
N f TS f e

E . (3.76c)

This will be a useful approximation to know when analyzing detector noise in Chapter 6.

__________


The basic concepts introduced in this chapter—such as random variables and functions, the
autocorrelation function, the noise-power spectrum, stationarity and ergodicity—may not be as
important as the Fourier theory covered in Chapter 2, but they turn up over and over again in the
following pages. The Wiener-Khinchin theorem is used to transform electromagnetic wavefields
into the spectral radiances that Michelson interferometers are built to measure. Stationary random
functions are added to interference signals to represent what happens when the interference
signals become contaminated by noise. The expectation operator E is applied to the products of
random quantities to turn them into autocorrelation functions, and the autocorrelation functions
are then transformed into noise-power spectra in formulas for the random-measurement error.
This chapter has explained the statistical ideas behind these procedures—and the context in
which the ideas arise—to show what the formulas mean and why they make sense.
( )
nn
S f



- 330 -
4
FROM MAXWELL’S EQUATIONS TO
THE MICHELSON INTERFEROMETER
The interference formulas for a highly idealized version of the standard Michelson interferometer
can be derived in a page or two, and that is what is done in most textbooks. Section 1.5 of
Chapter 1 lays out the basic approach of this derivation, pointing out that all we really need is the
19th-century ether-wave theory of light because a full knowledge of Maxwell’s equations is not
required. Afterwards, these ideal interference formulas can, with some difficulty and an appeal to
ad hoc arguments, be modified to handle the measurement errors and distortions present in
nonideal instruments, but this is difficult to do in a straightforward and convincing way.
Consequently, in this chapter we prefer to start with first principles, carefully tracing the plane-
wave solutions to Maxwell’s equations through the standard Michelson interferometer and then
applying the Fourier methodology and random-signal theory explained in the previous two
chapters to describe the electromagnetic wavefields leaving the instrument. Although longer than
the standard textbook procedure, this approach leads naturally to detailed formulas describing
what happens when the optical setup is slightly misaligned, what happens when the input
radiation is polarized, and what happens when the interferometer measures an input spectrum that
is nonuniform over its field of view. We do this both for the interferometer’s balanced
interference signal and its unbalanced background signal, explaining first the reasoning behind
the formulas for the balanced input signal and then showing how the same sort of analysis
produces similar formulas for the unbalanced background signal. At the end of this process, the
reader has a detailed understanding of how the formulas describing ideal Michelson
interferometers should be modified and expanded to describe nonideal instruments in an
imperfect world.
4.1 Deriving the Electromagnetic Wave Equations
In SI units, Maxwell’s equations for empty space are


o o
E
B
t
µ r
o

o
G
G G
, (4.1a)


B
E
t
o
V× ÷
o
G
G G
, (4.1b)

interferometers should be modified and expanded to describe optical imperfections and non-
ideal inputs.
Deriving the Electromagnetic Wave Equations · 4.1
- 331 -
0 E

∇ =
G G
, (4.1c)
and
0 B

∇ =
G G
(4.1d)
where

7
4 10 henry meter
o
µ π

= ⋅
and

2
1
o
o
c
ε
µ
= . (4.1e)

In these equations, E
G
is the electric field, which is a function of position and time; B
G
is the
magnetic-induction field, which is also a function of position and time; t is the time coordinate;
o
µ is the magnetic permeability of free space;
o
ε is the permittivity of free space; c is the
velocity of light; and ∇
G
is the standard vector-derivative “del” operator [see Eq. (4A.7a) in
Appendix 4A for a definition]. We take the curl of both sides in Eqs. (4.1a) and (4.1b) to get


( )
[ ]
o o
B E
t
µ ε

∇× ∇× = ∇×

G G G G G
(4.2a)
and

( )
[ ] E B
t

∇× ∇× = − ∇×

G G G G G
. (4.2b)

But for any vector field v
G
, we have the identity


( )
2
[ ] v v v • ∇× ∇× = ∇ ∇ −∇
G G G G
G G G
. (4.2c)

Substitution of (4.2c) into (4.2a) and (4.2b) gives


( ) ( )
2
o o
B B E
t
µ ε •

∇ ∇ −∇ = ∇×

G G G G G G
,


( ) ( )
2
E E B
t


∇ ∇ −∇ = − ∇×

G G G G G G
,
or

2
2
2
0
o o
B
B
t
µ ε

∇ − =

G
G
,

4 · From Maxwell’s Equations to the Michelson Interferometer
- 332 -

2
2
2
0
o o
E
E
t
µ ε

∇ − =

G
G
,

where we have used 0 B E • • ∇ = ∇ =
G G G G
from (4.1c) and (4.1d) and

E B t ∇× = −∂ ∂
G G G
,
o o
B E t µ ε ∇× = ∂ ∂
G G G


from (4.1a) and (4.1b) to simplify our results. The substitution
2
o o
c µ ε

= from (4.1e) now gives


2
2
2 2
1
0
B
B
c t

∇ − =

G
G
(4.3a)
and

2
2
2 2
1
0
E
E
c t

∇ − =

G
G
. (4.3b)

Equation (4.3a) is the wave equation for E
G
, the electric field as a function of position and time;
and (4.3b) is the wave equation for B
G
, the magnetic-induction field as a function of position and
time. Because E
G
and B
G
are vectors and the wave equation is usually applied to scalar fields, we
now rewrite Eqs. (4.3a) and (4.3b) as a collection of six scalar wave equations to show the
meaning of the two vector wave equations. The first step is to identify the E
G
and B
G
Cartesian
field components. Figure 4.1 specifies a three-dimensional Cartesian coordinate system for the E
G

and B
G
field vectors located at a single point P. We use the ˆ x , ˆ y , ˆ z unit vectors of the coordinate
system to write
ˆ ˆ ˆ
x y z
E xE yE zE = + +
G
(4.4a)
and
ˆ ˆ ˆ
x y z
B xB yB zB = + +
G
, (4.4b)

where, as shown in Fig. 4.1,
x
E ,
y
E ,
z
E are the real x, y, z components of the electric field and
x
B ,
y
B ,
z
B are the real x, y, z components of the magnetic-induction field. Both
, , x y z
E and
, , x y z
B
are, of course, functions of position and time. We define a position vector

ˆ ˆ ˆ r xx yy zz = + +
G
(4.4c)

and show the dependence of the E
G
and B
G
fields on position and time by rewriting (4.4a) and
(4.4b) as
Deriving the Electromagnetic Wave Equations · 4.1
- 333 -

FIGURE 4.1.


Point P at the
same x, y, z
coordinates
x
y
z
x
y
y
z
z
x
E
G

B
G

Draw only the E
G
field and
its x, y, z components
Draw only the B
G
field and
its x, y, z components
0
y
B >
0
z
B <
0
x
B >
0
x
E <
0
z
E >
0
y
E >
4 · From Maxwell’s Equations to the Michelson Interferometer
- 334 -
ˆ ˆ ˆ ( , ) ( , ) ( , ) ( , )
x y z
E r t xE r t yE r t zE r t = + +
G
G G G G

and
ˆ ˆ ˆ ( , ) ( , ) ( , ) ( , )
x y z
B r t xB r t yB r t zB r t = + +
G
G G G G
.

This notation is best regarded as a shorthand for [see the discussion after Eq. (2.109d) in Sec.
2.25 of Chapter 2]

ˆ ˆ ˆ ( , , , ) ( , , , ) ( , , , ) ( , , , )
x y z
E x y z t xE x y z t yE x y z t zE x y z t = + +
G

and
ˆ ˆ ˆ ( , , , ) ( , , , ) ( , , , ) ( , , , )
x y z
B x y z t xB x y z t yB x y z t zB x y z t = + +
G
.

For any vector v
G
we have, according to Eq. (4A.11c) in Appendix 4A,


2 2 2 2
ˆ ˆ ˆ
x y z
v x v y v z v ∇ = ∇ + ∇ + ∇
G


where
x
v ,
y
v ,
z
v are the real x, y, z components of real vector v
G
. It follows that substitution of
Eqs. (4.4a) and (4.4b) into (4.3a) and (4.3b) gives six scalar wave equations, one for each
Cartesian component of the two vector equations (4.3a) and (4.3b):


2 2 2 2 2
2
2 2 2 2 2 2 2
1 1
0
x x x x x
x
E E E E E
E
c t x y z c t
∂ ∂ ∂ ∂ ∂
∇ − = + + − =
∂ ∂ ∂ ∂ ∂
, (4.5a)


2 2 2 2 2
2
2 2 2 2 2 2 2
1 1
0
y y y y y
y
E E E E E
E
c t x y z c t
∂ ∂ ∂ ∂ ∂
∇ − = + + − =
∂ ∂ ∂ ∂ ∂
, (4.5b)


2 2 2 2 2
2
2 2 2 2 2 2 2
1 1
0
z z z z z
z
E E E E E
E
c t x y z c t
∂ ∂ ∂ ∂ ∂
∇ − = + + − =
∂ ∂ ∂ ∂ ∂
, (4.5c)


2 2 2 2 2
2
2 2 2 2 2 2 2
1 1
0
x x x x x
x
B B B B B
B
c t x y z c t
∂ ∂ ∂ ∂ ∂
∇ − = + + − =
∂ ∂ ∂ ∂ ∂
, (4.5d)


2 2 2 2 2
2
2 2 2 2 2 2 2
1 1
0
y y y y y
y
B B B B B
B
c t x y z c t
∂ ∂ ∂ ∂ ∂
∇ − = + + − =
∂ ∂ ∂ ∂ ∂
, (4.5e)

Deriving the Electromagnetic Wave Equations · 4.1
- 335 -

2 2 2 2 2
2
2 2 2 2 2 2 2
1 1
0
z z z z z
z
B B B B B
B
c t x y z c t
∂ ∂ ∂ ∂ ∂
∇ − = + + − =
∂ ∂ ∂ ∂ ∂
. (4.5f)

Here,
2 2 2 2 2 2 2
x y z ∇ = ∂ ∂ + ∂ ∂ + ∂ ∂ is used to write these equations using explicit partial
derivatives of x, y, and z. These six equations are just the scalar wave equation for
x
E ,
y
E ,
z
E
and
x
B ,
y
B ,
z
B . They are not really that difficult to solve when they have simple boundary
conditions. In fact, if at some time t the E
G
and B
G
electromagnetic fields are zero everywhere,
then the solution to these equations is the trivial one that the E
G
and B
G
fields remain identically
zero everywhere. If, however, at some time t there is a region of space where the fields are not
identically zero, then we expect nontrivial solutions having nonzero values of the E
G
and B
G

fields.
4.2 Electromagnetic Plane Waves
Equations (4.1a)–(4.1d), (4.3a), and (4.3b) contain five different differential operators—the
divergence ( • ∇
G
), the curl ( ∇×
G
), the Laplacian (
2
∇ ), and the first and second partial derivatives
with respect to time ( t ∂ ∂ ,
2 2
t ∂ ∂ )—and all five are real linear operators as defined in Appendix
4A. According to the discussion following Eqs. (4A.19a) and (4A.19b), we can therefore find real
solutions for E
G
and B
G
by first solving for them as complex vector fields and then, at the end,
taking their real parts to get the desired real solutions. Following this procedure, we begin looking
for complex solutions to (4.3a) and (4.3b) that have the form


2
( , ) ( )
if t
E r t E r e
π −
=
¦
A
A
A
G G
G G
(4.6a)
and

2
( , ) ( )
if t
B r t B r e
π −
=
¦
A
A
A
G G
G G
, (4.6b)

where all the f
A
values are real and E
A
G
, B
A
G
may be complex vector functions of position.
Substituting (4.6a) and (4.6b) into (4.3a) and (4.3b) shows that then we end up with


2 2 2 2
[( 4 ) ] 0
if t
E E e
π
π σ

∇ + =
¦
A
A A A
A
G G

and

2 2 2 2
[( 4 ) ] 0
if t
B B e
π
π σ

∇ + =
¦
A
A A A
A
G G

if we define
4 · From Maxwell’s Equations to the Michelson Interferometer
- 336 -

f
c
σ =
A
A
. (4.7a)


The only way these sums can be identically zero for all times t is to set


2 2 2
4 0 E E π σ ∇ + =
A A A
G G
(4.7b)

and

2 2 2
4 0 B B π σ ∇ + =
A A A
G G
(4.7c)


for each value of A in the sums. We next look for solutions


2 ( )
( )
j
i k r
j
j
E r E e
π •
=
¦
A
G
G
A A
G G
G
(4.8a)

and

2 ( )
( )
j
i k r
j
j
B r B e
π •
=
¦
A
G
G
A A
G G
G
, (4.8b)


where all the
j
k
A
G
are constant, real, three-dimensional vectors and
j
E
A
G
,
j
B
A
G
are complex, constant,
three-dimensional vectors. In terms of the ˆ x , ˆ y , ˆ z unit vectors of Fig. 4.1,


ˆ ˆ ˆ
j jx jy jz
k xk yk zk = + +
A A A A
G
,


so that, substituting from Eq. (4.4c),


j j jx jy jz
k r r k xk yk zk • • = = + +
A A A A A
G G
G G
.


From Eq. (4A.12a) of Appendix 4A,

Electromagnetic Plane Waves · 4.2
- 337 -

( )
( )
( )
2
2 2
2
2
2
2
2
2
2
2
2
( )
j
jx jy jz
jx jy jz
jx jy j
i k r
j
j
i xk yk zk
j
j
i xk yk zk
i xk yk zk
E r E e
E e
x
e
y
e
z
π
π
π
π

+ +
+ +
+ +
ª º
∇ = ∇
« »
¬ ¼
ª ∂
=
«

¬

+


+

¦
¦
A
A A A
A A A
A A A
G
G
A A
A
G G
G
G



( )
( )
( ) ( )
2
2 2
2 2 2 2 2
4 4
z
j j
i k r i k r
jx jy jz j j j
j j
k k k E e k E e
π π
π π
• •
º
»
¼
= − + + = −
¦ ¦
A A
G G
G G
A A A A A A
G G G



and similarly,

( )
2
2
2 2
( ) 4
j
i k r
j j
j
B r k B e
π
π

∇ = −
¦
A
G
G
A A A
G G G
G
.

Substitution of these two results and Eqs. (4.8a) and (4.8b) into (4.7b) and (4.7c) gives


( )
2
2 ( )
2
0
j
i k r
j j
j
E e k
π
σ

ª º
− =
« »
¬ ¼
¦
A
G
G
A A A
G G
(4.9a)
and

( )
2
2 ( )
2
0
j
i k r
j j
j
B e k
π
σ

ª º
− =
« »
¬ ¼
¦
A
G
G
A A A
G G
. (4.9b)

This can be true over all values of r
G
with nonzero values of
j
E
A
G
and
j
B
A
G
only when


2
2
j
k σ =
A A
G
(4.9c)

for all values of A and j. Equation (4.9c) requires the real vector
j
k
A
G
to have a magnitude
j
k σ =
A A
G
that depends only on index A . This suggests that the j index specifies the different
directions taken on by the
j
k
A
G
vectors, giving


ˆ
j j
k σ = ⋅ Ω
A A A
G
.

4 · From Maxwell’s Equations to the Michelson Interferometer
- 338 -
Here
ˆ
j

A
is a dimensionless unit vector, called the propagation vector, which for a specified
value of A points in different directions for different values of j. In fact, nothing stops us from
assuming that the
ˆ
j

A
propagation vectors range over the same (indefinitely large) set of j
directions for each A value; if we want to leave out some j direction for a given A , we can always
remove those directions by making both
j
E
A
G
and
j
B
A
G
zero for the unwanted values of A and j. We
can thus write

ˆ
j j
k σ = ⋅ Ω
A A
G
. (4.9d)

Substitution of (4.8a), (4.8b), (4.7a), and (4.9d) into (4.6a) and (4.6b) gives


( )
ˆ
2 ( )
( , )
j
i r ct
j
j
E r t E e
π σ σ σ Ω • −
=
¦¦
A A A
G
A
A
G G
G
(4.10a)
and

( )
ˆ
2 ( )
( , )
j
i r ct
j
j
B r t B e
π σ σ σ Ω • −
=
¦¦
A A A
G
A
A
G G
G
. (4.10b)

The phase term in Eqs. (4.10a) and (4.10b) is


ˆ
2 ( )
j
r ct π σ • Ω −
A

if
1 σ σ =
A A

and

ˆ
2 ( )
j
r ct π σ • Ω +
A
if 1 σ σ = −
A A
.

When
1 σ σ =
A A
,

Eq. (4.9c) has been solved with
0
j
k σ = ≥
A A
G
;
and when
1 σ σ = −
A A
,

Eq. (4.9c) has been solved with
0
j
k σ = − ≤
A A
G
.

Electromagnetic Plane Waves · 4.2
- 339 -
Figure 4.2 shows that the choice made here is to have the phase increasing in the direction of
ˆ
j

as time increases, hence the solution to (4.9c) is chosen to be

0
j
k σ = ≥
A A
G
(4.10c)
and Eqs. (4.10a) and (4.10b) become


( )
ˆ
2
( , )
j
i r ct
j
j
E r t E e
π σ Ω • −
=
¦¦
A
G
A
A
G G
G
(4.11a)
and

( )
ˆ
2
( , )
j
i r ct
j
j
B r t B e
π σ Ω • −
=
¦¦
A
G
A
A
G G
G
. (4.11b)

The next section explains why these double sums are called electromagnetic plane waves.
We define

ˆ
ˆ ˆ ˆ
j jx jy jz
x y z ε ε ε Ω = + + (4.12a)

so that
jx
ε ,
jy
ε ,
jz
ε are the direction cosines of
ˆ
j
Ω with respect to the x, y, z axes shown in Fig.
4.3,


ˆ
ˆ cos( )
jx j jx
x ε θ • = Ω = ,
ˆ
ˆ cos( )
jy j jy
y ε θ • = Ω = ,
ˆ
ˆ cos( )
jz j jz
z ε θ • = Ω = . (4.12b)

The standard relationship between direction cosines—that the sum of their squares is one—is the
same as the requirement that
ˆ
j
Ω have unit length


2 2 2 2 2 2
cos cos cos 1
jx jy jz jx jy jz
ε ε ε θ θ θ + + = + + = . (4.12c)

Although we have chosen E
G
and B
G
to satisfy the vector wave equations (4.3a) and (4.3b),
they must also satisfy the full set of Maxwell conditions, Eqs. (4.1a)–(4.1d). Substituting (4.11a)
into (4.1c) gives, using Eq. (4A.12b) from Appendix 4A,


ˆ ˆ
2 ( ) 2 ( )
] [ ] 0 [
j j
i r ct i r ct
j j
j j
E e E e
π σ π σ Ω • − Ω • −
• • ∇ = ∇ =
¦¦ ¦¦
A A
G G
A A
A A
G G G G
. (4.13a)

Simplifying the gradient gives


4 · From Maxwell’s Equations to the Michelson Interferometer
- 340 -
FIGURE 4.2.







x
z
y
unit vector
ˆ
j

The planes of constant phase are specified by
ˆ
j
r ct • Ω = =
G
constant , with each value of ct specifying
a different plane perpendicular to
ˆ
j
Ω .
Electromagnetic Plane Waves · 4.2
- 341 -
FIGURE 4.3.





x
z
y
unit vector
ˆ
j


jx
θ

jy
θ

jz
θ
4 · From Maxwell’s Equations to the Michelson Interferometer
- 342 -

ˆ
2 ( ) 2 ( )
ˆ
2 ( )
ˆ
2 (
ˆ ˆ ˆ [ ]
ˆ ˆ ˆ (2 ) (2 ) (2 )
ˆ
2
j x y z
j
j
i r ct i x y z ct
i r ct
x y z
i r c
j
e x y z e
x y z
x i y i z i e
i e
π σ π σ ε ε ε
π σ
π σ
π σ ε π σ ε π σ ε
π σ
Ω • − + + −
Ω • −
Ω • −
§ · ∂ ∂ ∂
ª º
∇ = + +
¨ ¸
¬ ¼
∂ ∂ ∂
© ¹
ª º = + +
¬ ¼
= Ω
A A
A
A
G
G
A A A
G
A
G


) t
(4.13b)

Hence, Eq. (4.13a) becomes


( )
ˆ
2 ( )
ˆ
2 0
j
i r ct
j j
j
i E e
π σ
π σ
Ω • −

ª º
Ω =
¬ ¼
¦¦
A
G
A A
A
G
. (4.14a)

Similarly, substituting (4.11b) into (4.1d) and simplifying gives


( )
ˆ
2 ( )
ˆ
2 0
j
i r ct
j j
j
i B e
π σ
π σ
Ω • −

ª º
Ω =
¬ ¼
¦¦
A
G
A A
A
G
. (4.14b)

The only way (4.14a) and (4.14b) can hold true for all values of r
G
and t with nonzero σ
A
is to
require

ˆ
0
j j
E • Ω =
A
G
(4.14c)
and

ˆ
0
j j
B • Ω =
A
G
(4.14d)

for all values of A and j . Working next with Eq. (4.1a), we substitute (4.11a) and (4.11b) to get


ˆ ˆ
2 ( ) 2 ( )
] [ ] [
j j
i r ct i r ct
j o o j
j j
B e E e
t
π σ π σ
µ ε
Ω • − Ω • −

∇ =

×
¦¦ ¦¦
A A
G G
A A
A A
G G G


which becomes, using Eq. (4A.12c) in Appendix 4A,


ˆ ˆ
2 ( ) 2 ( )
] ( 2 )[ ] [
j j
i r ct i r ct
j o o j
j j
B e i c E e
π σ π σ
µ ε π σ
Ω • − Ω • −
− ×∇ = −
¦¦ ¦¦
A A
G G
A A A
A A
G G G
.

Substituting from Eq. (4.13b) and using
2
o o
c µ ε

= [see Eq. (4.1e)] gives


ˆ
2 ( )
1
ˆ
2 0
j
i r ct
j j j
j
i e B E
c
π σ
π σ
Ω • − ª º
×Ω − =
« »
¬ ¼
¦¦
A
G
A A A
A
G G
. (4.15a)
Electromagnetic Plane Waves · 4.2
- 343 -
The only way this can be true for all r
G
and t with nonzero σ
A
is if


ˆ
( )
j j j
c B E ×Ω =
A A
G G
(4.15b)

for all values of A and j. Similarly, substitution of (4.11a) and (4.11b) into (4.1b) gives


ˆ
2 ( )
ˆ
2 0
j
i r ct
j j j
j
i e E cB
π σ
π σ
Ω • −
ª º
×Ω + =
¬ ¼
¦¦
A
G
A A A
A
G G
. (4.15c)

The only way (4.15c) can hold true for all r
G
and t with nonzero σ
A
is if


ˆ
j j j
E cB ×Ω = −
A A
G G
(4.15d)

for all values of A and j. It is not difficult to show that (4.15b) and (4.15d) are just different forms
of the same equation. Taking the cross product of the left-hand side of (4.15d) with
ˆ
j
Ω gives,
using Eq. (4A.14) in Appendix 4A,


ˆ ˆ ˆ ˆ ˆ ˆ ˆ ˆ
( ) ( ) ( ) ( )
j j j j j j j j j j j j j
E E E E E • • Ω × ×Ω = −Ω × Ω × = − Ω Ω + Ω Ω =
A A A A A
G G G G G
,


where we use
ˆ
0
j j
E • Ω =
A
G
from Eq. (4.14c) and that
ˆ ˆ
1
j j
• Ω Ω = because
ˆ
j
Ω has unit length.
Therefore taking the cross product of both sides of (4.15d) with
ˆ
j
Ω gives


ˆ ˆ
j j j j j
E c B cB = − Ω × = ×Ω
A A A
G G G
,

which is the same as Eq. (4.15b). We can also take the cross product of the left-hand side of
(4.15b) with
ˆ
j
Ω and use
ˆ
0
j j
B • Ω =
A
G
from (4.14d) and Eq. (4A.14) in Appendix 4A to get


ˆ ˆ
[ ( )]
j j j j
c B cB Ω × ×Ω =
A A
G G
.

Taking the cross product of both the right-hand and left-hand sides of (4.15b) with
ˆ
j
Ω now must
give

ˆ ˆ
j j j j j
cB E E = Ω × = − ×Ω
A A A
G G G
.

4 · From Maxwell’s Equations to the Michelson Interferometer
- 344 -
This is the same formula as Eq. (4.15d). Hence, as stated above, the restrictions placed on
ˆ
j
Ω and
the complex vectors
j
E
A
G
,
j
B
A
G
in Eqs. (4.14c) and (4.14d) make (4.15b) and (4.15d) the same
equality. We see that the double sums shown in (4.11a) and (4.11b) lead to acceptable complex
solutions to the vector wave equations for E
G
and B
G
in (4.3a) and (4.3b); and when the
restrictions (4.14c), (4.14d), and either (4.15b) or (4.15d) are placed on
ˆ
j
Ω ,
j
E
A
G
, and
j
B
A
G
, the
double sums also satisfy (4.1a)–(4.1d), Maxwell’s equations for empty space. No limits are
placed on the size of these double sums. This means we can create two different double sums,
both matching the criteria of this section and so solving Maxwell’s equations, and add them
together to get one big double sum matching the criteria of this section and solving Maxwell’s
equations. In general we can add together any number of plane-wave solutions to Maxwell’s
equations to create a new and larger collection of plane waves solving Maxwell’s equations.
4.3 Monochromatic Wave Trains
To show why Eqs. (4.11a) and (4.11b) are called plane-wave sums, we focus attention on a single
component of the sums in Eqs. (4.11a) and (4.11b) by assuming there to be only one nonzero pair
of
j
E
A
G
,
j
B
A
G
terms. Then the formulas for ( , ) E r t
G
G
and ( , ) B r t
G
G
in (4.11a) and (4.11b) become


ˆ
2 ( )
( , )
j
i r ct
j
E r t E e
π σ Ω • −
=
A
G
A
G G
G
(4.16a)
and

ˆ
2 ( )
( , )
j
i r ct
j
B r t B e
π σ Ω • −
=
A
G
A
G G
G
(4.16b)
with

ˆ ˆ
0
j j j j
E B • • Ω = Ω =
A A
G G
and
1
ˆ
( )
j j j
B c E

= Ω ×
A A
G G
(4.16c)

from (4.14c), (4.14d), and (4.15d). Although it is customary to leave wave formulas in complex
form, strictly speaking only the real parts (or imaginary parts, see discussion at end of Appendix
4A) of the right-hand sides of (4.16a) and (4.16b) provide acceptable physical solutions to wave
Eqs. (4.3a) and (4.3b). Since an x, y, z coordinate system has not yet been specified, nothing stops
us from choosing the z axis to be parallel to
ˆ
j
Ω ; and because both ˆ z and
ˆ
j
Ω are dimensionless,
real, unit-length vectors, we then have
ˆ
ˆ
j
z Ω = . Equations (4.14c) and (4.14d) now show that the
complex vectors
j
E
A
G
and
j
B
A
G
have zero z components, allowing us to write

ˆ ˆ
j jx jy
E xE yE = +
A A A
G
(4.17a)
and
ˆ ˆ
j jx jy
B xB yB = +
A A A
G
(4.17b)
Monochromatic Wave Trains · 4.3
- 345 -
where
jx
E
A
,
jy
E
A
,
jx
B
A
,
jy
B
A
are all complex numbers. Substituting into (4.15b) gives, using
ˆ
ˆ ˆ ˆ ˆ
j
x x z y ×Ω = × = − and
ˆ
ˆ ˆ ˆ ˆ
j
y y z x ×Ω = × = ,


( )
( ) ( )
j
ˆ ˆ
ˆ ˆ ˆ ˆ
ˆ ˆ ,
jx jy jx j jy
jx jy
xE yE c x B y B
y cB x cB
+ = ×Ω + ×Ω
= − +
A A A A
A A

(4.17c)
which means that

jx jy
E cB =
A A
(4.17d)
and

jy jx
E cB = −
A A
. (4.17e)

If we write

jx
i
jx jx
E E e
φ
=
A
A A
(4.18a)

and

jy
i
jy jy
E E e
φ
=
A
A A
(4.18b)

using real phase terms
jx
φ
A
and
jy
φ
A
to describe the
jx
E
A
,
jy
E
A
complex constants, it then follows
from (4.17d) and (4.17e), because c is real, that


1
jx
i
jy jx
B E e
c
φ
=
A
A A
(4.18c)
and

1
jy
i
jx jy
B E e
c
φ
= −
A
A A
. (4.18d)

Hence, (4.17a) and (4.17b) become

ˆ ˆ
jx jy
i i
j jx jy
E x E e y E e
φ φ
= +
A A
A A A
G
(4.18e)
and

1 1
ˆ ˆ
jy jx
i i
j jy jx
B x E e y E e
c c
φ φ
= − +
A A
A A A
K
, (4.18f)

so that taking the real part of the right-hand sides of Eqs. (4.16a) and (4.16b) gives, using
ˆ
ˆ
j
z Ω =
and cos sin
i
e i
ψ
ψ ψ = + ,
4 · From Maxwell’s Equations to the Michelson Interferometer
- 346 -

( )
( )
( )
( ) ( )
ˆ 2
ˆ 2
Re[ ]
ˆ ˆ Re
ˆ ˆ cos 2 ( ) cos 2 ( )
jx jy
i z r ct
j
i i i z r ct
jx jy
jx jx jy jy
E e
x E e y E e e
x E z ct y E z ct
π σ
φ φ π σ
πσ φ πσ φ
• −
• −
ª º
= +
¬ ¼
= − + + − +
A
A A A
G
A
G
A A
A A A A A A
G


(4.19a)
and

( )
( )
( ) ( )
ˆ 2
ˆ 2
Re[ ]
1 1
ˆ ˆ Re
1 1
ˆ ˆ cos 2 ( ) cos 2 ( ) .
jy jx
i z r ct
j
i i i z r ct
jy jx
jy jy jx jx
B e
x E e y E e e
c c
x E z ct y E z ct
c c
π σ
φ φ π σ
πσ φ πσ φ
• −
• −
ª º
§ ·
= − +
¨ ¸ « »
© ¹
¬ ¼
= − − + + − +
A
A A A
G
A
G
A A
A A A A A A
G


(4.19b)

When z is held constant, all the x and y components of the E
G
and B
G
fields in (4.19a) and (4.19b)
oscillate at the same frequency f c σ =
A
. We can recognize what is going on by keeping z
constant and noting that if t increases (or decreases) by 1/( ) c σ
A
, then the phases of all the cosines
in Eqs. (4.19a) and (4.19b) increase (or decrease) by 2ʌ. This makes the wavefield specified in
(4.19a) and (4.19b) a plane wavefield, since every point on a plane specified by z = constant has
the same real E
G
field and B
G
field at all times t. Figure 4.4 shows that when t is held constant in
Eqs. (4.19a) and (4.19b) and z increases (or decreases) in value by 1 σ
A
, the phases of all the
cosines also increase (or decrease) by 2ʌ. Consequently, planes in Fig. 4.4 that are separated by
1 σ
A
have the same phase and thus the same real E
G
and B
G
fields. This distance is called the
wavelength Ȝ of the plane wavefield. Parameter σ
A
is called the wavenumber, already defined in
Eq. (1.7b) of Chapter 1 to be 1/Ȝ. The plane wave is called monochromatic because it is specified
by a single frequency f c σ =
A
and wavelength Ȝ. Its wavenumber σ
A
is 1/Ȝ, so the equality

f c σ =
A
(4.19c)

can now be interpreted as
f c λ = ,

the classic relationship between wavelength, frequency, and velocity for any wavefield. We
conclude that Eqs. (4.19a) and (4.19b) describe a wavefield traveling in the
ˆ
ˆ
j
z Ω = direction at
velocity c, the speed of light.
This analysis obviously applies to any

Monochromatic Plane Waves · 4.3
- 347 -

FIGURE 4.4.




x
z
y
E
G

E
G

B
G

B
G

E
G

E
G

B
G

B
G


1
z
σ
=
A

unit vector
ˆ
j

4 · From Maxwell’s Equations to the Michelson Interferometer
- 348 -

ˆ
2 ( )
j
r ct
j
E e
πσ Ω • −
A
G
A
G
and
ˆ
2 ( )
j
r ct
j
B e
πσ Ω • −
A
G
A
G


pair of terms from formulas (4.11a) and (4.11b). Since the pair of sums in (4.11a) and (4.11b) is a
general solution to the vector wave equations, this sort of general solution can now be interpreted
as a sum over an arbitrary collection of monochromatic plane waves characterized by different
wavenumbers and directions of propagation, where for each wavenumber σ
A
, there is a unique
frequency cσ
A
.
From Eqs. (4.19a) and (4.19b), we get


( ) ( )
( ) ( )
( ) ( )
ˆ ˆ 2 2
{Re[ ]} {Re[ ]}
1
cos 2 ( ) cos 2 ( )
1
cos 2 ( ) cos 2 ( ) 0 ,
i z r ct i z r ct
j j
jx jy jx jy
jx jy jx jy
E e B e
E E z ct z ct
c
E E z ct z ct
c
π σ π σ
πσ φ πσ φ
πσ φ πσ φ
• − • −

= − − + − +
+ − + − + =
A A
G G
A A
A A A A A A
A A A A A A
G G


(4.20)

showing that the real E
G
and B
G
fields of a monochromatic plane wave are always perpendicular
to each other while they oscillate. From (4.17a), (4.17b), (4.17d), and (4.17e), we get


1 1
0
j j jx jx jy jy jx jy jy jx
E B E B E B E E E E
c c

§ · § ·
= + = − + =
¨ ¸ ¨ ¸
© ¹ © ¹
A A A A A A A A A A
G G
. (4.21a)


It follows that in Eqs. (4.16a) and (4.16b)


( )
4 ( )
( , ) ( , ) 0
i z ct
j j
E r t B r t E B e
π σ −
• • = =
A
A A
G G G G
G G
. (4.21b)


In this sense, we can say that the complex monochromatic plane wave E
G
and B
G
fields are also
perpendicular to each other. Another result worth deriving, again using Eqs. (4.17a), (4.17b),
(4.17d), and (4.17e), is that


( ) ( )
1 1
ˆ ˆ ˆ ˆ ˆ ˆ [ ] [ ]
ˆ [ ]
j j jx jy jx jy jx jy jy jx
jx jx jy jy
E B xE yE xB yB z E B z E B
E c E E c E z
∗ ∗ ∗ ∗ ∗
− ∗ − ∗
× = + × + = −
= +
A A A A A A A A A A
A A A A
G G


( ) ( )
1 1
ˆ
ˆ ,
j j j j j
E E z E E
c c
∗ ∗
• • = = Ω
A A A A
G G G G

(4.21c)

Monochromatic Plane Waves · 4.3
- 349 -
where we use ˆ ˆ ˆ ˆ 0 x x y y × = × = and
ˆ
ˆ ˆ ˆ
j
x y z × = = Ω . Vector identities that, like Eqs. (4.21a) and
(4.21c), can be written using only dot products and cross products, hold true in all (proper)
coordinate systems if they hold true in any one (proper) coordinate system.
55
Choosing a new
coordinate system where the ˆ z unit vector is not the same as the
ˆ
j
Ω propagation vector is
geometrically equivalent to specifying a new direction for the propagation vector that is not
parallel to the original ˆ z unit vector. Since (4.21a) and (4.21c) use only dot and cross products,
they must also hold true in those coordinate systems where
ˆ
j
Ω is not parallel to ˆ z . Hence we can
conclude that Eqs. (4.21a) and (4.21c) must be obeyed when the A , j monochromatic plane wave
propagates in any direction, not just when it propagates parallel to the z axis. Therefore the
double sums over A and j in Eqs. (4.11a) and (4.11b) must all have coefficients
j
E
A
G
and
j
B
A
G

satisfying Eqs. (4.21a) and (4.21c), with

0
j j
E B • =
A A
G G
(4.22a)
and

( )
1
ˆ
j j j j j
E B E E
c
∗ ∗
• × = Ω
A A A A
G G G G
. (4.22b)

Similarly, the perpendicularity of the real, physical E
G
and B
G
fields as they oscillate in Eq. (4.20)
cannot be affected by the choice of coordinate system, which means the oscillating E
G
and B
G

fields stay perpendicular when ˆ z is not chosen parallel to
ˆ
j
Ω . Since, once again, this is
geometrically equivalent to specifying a new direction of propagation, we conclude that the real
oscillating E
G
and B
G
fields are perpendicular for all
ˆ
j
Ω vectors—that is, they are perpendicular
no matter in what direction the wavefield propagates.
4.4 Linear Polarization of Monochromatic Plane Waves
Equations (4.19a) and (4.19b) specify an acceptable monochromatic plane wave—that is, they
specify an acceptable term in the double-sum solutions in Eqs. (4.11a) and (4.11b)—no matter
what values are given to the real constants
jx
E
A
,
jy
E
A
,
jx
φ
A
, and
jy
φ
A
. If we again use a Cartesian
coordinate system with
ˆ
ˆ
j
z = Ω and choose 0
jy
E =
A
, then from Eqs. (4.18e) and (4.18a) we get

ˆ ˆ
jx
i
j jx jx
E x E e xE
φ
= =
A
A A A
G
. (4.23a)

55
The cross product is invariant only if the coordinate systems are always chosen to be left-handed or right-handed.
This book uses right-handed coordinate systems, sometimes referred to as proper coordinate systems, where the xˆ ,
yˆ , zˆ vectors are always chosen so that z y x ˆ ˆ ˆ = × .
4 · From Maxwell’s Equations to the Michelson Interferometer
- 350 -
Since 0
jy
E
A
, Eqs. (4.18f) and (4.18a) give


1 1
ˆ ˆ
jx
i
j jx jx
B y E e y E
c c
o

A
A A A
G
. (4.23b)
Setting 0
jy
E
A
in Eqs. (4.19a) and (4.19b) now leads to


( )
( )
ˆ 2
ˆ Re[ ] cos 2 ( )
i z r ct
j jx jx
E e x E z ct
r o
ro o
- ÷
÷ +
A
G
A A A A
G
(4.23c)
and

( )
( )
ˆ 2
1
ˆ Re[ ] cos 2 ( )
i z r ct
j jx jx
B e y E z ct
c
r o
ro o
- ÷
÷ +
A
G
A A A A
G
. (4.23d)

Equations (4.23a)–(4.23d) describe a plane wave whose real electric-field vector always points
strictly along the x axis and whose real magnetic-induction vector always points strictly along the
y axis. Characterizing this wave by the direction of the electric-field vector, we call it linearly
polarized along the x axis, or x-polarized for short (see Fig. 4.5). Equation (4.23a) shows that in
an x-polarized plane wave the complex vector
j
E
A
G
is the ˆ x unit vector multiplied by a complex
constant
jx
E
A
—which, of course, means that in (4.23b) the complex vector
j
B
A
G
must be the ˆ y
unit vector multiplied by the complex constant
jx
E c
A
.
To get a monochromatic plane wave that is linearly polarized in the y direction, we choose
0
jx
E
A
. Then, repeating the analysis used to find Eqs. (4.23a)–(4.23d), we have

ˆ ˆ
jy
i
j jy jy
E y E e yE
o

A
A A A
G
, (4.24a)


1 1
ˆ ˆ
jy
i
j jy jy
B x E e x E
c c
o
÷ ÷
A
A A A
G
, (4.24b)


( )
( )
ˆ 2
ˆ Re[ ] cos 2 ( )
i z r ct
j jy jy
E e y E z ct
r o
ro o
- ÷
÷ +
A
G
A A A A
G
, (4.24c)
and

( )
( )
ˆ 2
1
ˆ Re[ ] cos 2 ( )
i z r ct
j jy jy
B e x E z ct
c
r o
ro o
- ÷
÷ ÷ +
A
G
A A A A
G
. (4.24d)

The monochromatic plane wave described by Eqs. (4.24a)–(4.23d) has an electric-field vector
that always points along the y axis and a magnetic induction vector that always points along the
íx axis (see Fig. 4.6). Equation (4.24a) shows that y polarization can be recognized by noting that
the complex vector
j
E
A
G
is the ˆ y unit vector multiplied by a complex constant
jy
E
A
[with,
4.24d
Linear Polarization of Monochromatic Plane Waves · 4.4
- 351 -


FIGURE 4.5.















according to (4.24b), complex vector
j
B
A
G
being the ˆ x unit vector multiplied by the complex
constant ( )
jy
E c −
A
].
Writing down Eqs. (4.19a) and (4.19b) again while switching the order of addition in the
second equation gives


( )
( ) ( )
ˆ 2
ˆ ˆ Re[ ] cos 2 ( ) cos 2 ( )
i z r ct
j jx jx jy jy
E e x E z ct y E z ct
π σ
πσ φ πσ φ
• −
= − + + − +
A
G
A A A A A A A
G

and

E field vectors
B field vectors
One wavelength of a monochromatic plane wave linearly polarized in the
x direction and propagating in the z direction
x
z
y
4 · From Maxwell’s Equations to the Michelson Interferometer
- 352 -
FIGURE 4.6.















( )
( ) ( )
ˆ
ˆ ˆ Re[ ] cos cos
l
2ʌiı z•r -ct
lj ljx l ljx ljy l ljy
1 1
B e = y E 2ʌı (z - ct)+ij - x E 2ʌı (z - ct)+ij
c c
G
G
.

Clearly, the first term in the general formula for the E field and the first term in the general
formula for the B field can be grouped together and called an x-polarized wave, and similarly the
second terms in the general formulas can be grouped together and called a y-polarized wave. This
shows that the E field of an arbitrary monochromatic plane wave—that is, a plane wave where
neither
jx
E
A
nor
jy
E
A
is automatically zero—can be represented as the sum of the E field of a
monochromatic plane wave linearly polarized in the x direction and the sum of the E field of a
monochromatic plane wave linearly polarized in the y direction. Similarly, the B field of that
same monochromatic plane wave can be represented as the sum of the B field of the
corresponding x-polarized plane wave and the B field of the corresponding y-polarized plane
x
z
y
E field vectors
B field vectors
One wavelength of a monochromatic plane wave linearly polarized in the
y direction and propagating in the z direction
monochromatic plane wave linearly polarized in the x direction and the E field of a
Linear Polarization of Monochromatic Plane Waves · 4.4
- 353 -
wave. This point is often made by stating that any monochromatic plane wave can be written as
the sum of an x-polarized plane wave and a y-polarized plane wave.
4.5 Transmitted Plane Waves
Figure 4.7 shows a monochromatic plane wave incident on a thin film of optical material placed
at an angle to the axis of propagation. Note that we have again chosen the ˆ z unit vector equal to
ˆ
j
Ω , the propagation vector of the incident plane wave. This means, according to Eqs. (4.19a) and
(4.19b), that the incident plane wave can be represented by the real part of



( )
( )
( ) ˆ 2 2
ˆ ˆ
jx jy
i i i z r ct i z ct
j jx jy
E e x E e y E e e
φ φ π σ π σ • − −
= +
A A A A
G
A A A
G
(4.25a)

and the real part of


( ) ( ) ˆ 2 2
1 1
ˆ ˆ
jy jx
i i i z r ct i z ct
j jy jx
B e x E e y E e e
c c
φ φ π σ π σ • − − § ·
= − +
¨ ¸
© ¹
A A A A
G
A A A
G
. (4.25b)

The thin film divides the space in Fig. 4.7 into two regions labeled A and B. Equations (4.25a)
and (4.25b) only apply to points in region A, the region occupied by the incident wavefield. The
unit normal vector ˆ n of the surface on which the plane wave is incident lies in the y, z plane of
the coordinate system, making an angle
j
ψ with respect to the z axis. Angle
j
ψ is called the
angle of incidence, and we give it an index j because it specifies the direction of the
ˆ
j

propagation vector with respect to ˆ n . The interaction of the plane wave with the film creates a
transmitted radiation field in region B that also propagates in the
ˆ
ˆ
j
z Ω = direction, and a
reflected radiation field in region A that propagates in the direction


( )
( )
ˆ ˆ ˆ
ˆ ˆ 2
r
j j j
n n • Ω = Ω − Ω (4.26a)
or

( )
( )
ˆ
ˆ ˆ 2 cos
r
j j
z n ψ Ω = + . (4.26b)

Both the transmitted and reflected wavefields have the same σ
A
wavenumber as the incident
wave. For any wavefield incident on a flat surface, the plane of incidence is defined to be that
plane containing both the surface normal ˆ n and the incident propagation vector
ˆ
j
Ω . Equation
(4.26a) shows that the
( )
ˆ
r
j
Ω propagation vector of the reflected wave automatically lies in the
4 · From Maxwell’s Equations to the Michelson Interferometer
- 354 -
FIGURE 4.7.




A B
y
x
z
propagation
vector
ˆ
ˆ
j
z Ω =
propagation vector
( )
ˆ
r
j

surface normal ˆ n

j
ψ

j
ψ
Transmitted Plane Waves · 4.5
- 355 -
same plane as ˆ n and
ˆ
j
Ω . In Fig. 4.7, the plane of incidence is the y, z plane of the coordinate
system.
Since the transmitted radiation field is also a monochromatic plane wave traveling down the z
axis, the E and B fields of the wave can still be found from the real parts of complex plane wave
solutions such as the ones given in Eqs. (4.16a) and (4.16b),


ˆ
2 ( )
2 ( ) ( ) ( ) j
i r ct
i z ct t t
j j
E e E e
π σ
π σ
Ω • −

=
A
A
G
A A
G G
(4.27a)
and

ˆ
2 ( )
2 ( ) ( ) ( ) j
i r ct
i z ct t t
j j
B e B e
π σ
π σ
Ω • −

=
A
A
G
A A
G G
, (4.27b)

where the (t) superscript specifies the transmitted wavefield and Eqs. (4.27a) and (4.27b) are
assumed to apply only to region B in Fig. 4.7. The complex vector
( ) t
j
E
A
G
can be written as


( ) ( ) ( )
ˆ ˆ
t t t
j jx jy
E xE yE = +
A A A
G


with the two complex numbers
( ) t
jx
E
A
and
( ) t
jy
E
A
representing its x and y components. Equations
(4.18e) and (4.18f) show that the complex vectors
( ) t
j
E
A
G
,
( ) t
j
B
A
G
can now be written as


( ) ( )
( ) ( ) ( )
ˆ ˆ
t t
jx jy
i i
t t t
j jx jy
E x E e y E e
φ φ
= +
A A
A A A
G
(4.27c)
and

( ) ( )
( ) ( ) ( )
1 1
ˆ ˆ
t t
jy jx
i i
t t t
j jy jx
B x E e y E e
c c
φ φ
= − +
A A
A A A
K
, (4.27d)

where we have used the two real constants
( ) t
jx
φ
A
and
( ) t
jy
φ
A
to represent the phases of
( ) t
jx
E
A
and
( ) t
jy
E
A

respectively. We require the film to be nonbirefringent, nonoptically active, and to have an index
of refraction that is constant in layers parallel to its surface; that is, the index of refraction can
only depend on the distance from the film’s surface. If the film absorbs radiant energy, we
account for it in the usual way by making its index of refraction complex.
56
This sort of film turns
out to be an adequate model for the partially transmitting, partially reflecting layer of a Michelson
interferometer’s beam splitter.
When the plane wave incident on the film has 0
jy
E =
A
or 0
jx
E =
A
, making the wave in Eqs.
(4.25a) and (4.25b) linearly x-polarized or linearly y-polarized respectively, the transmitted wave

56
Leonard Eyges, The Classical Electromagnetic Field (Dover Publications, Inc., New York, 1972), p. 340.
4 · From Maxwell’s Equations to the Michelson Interferometer
- 356 -
must have the same type of linear polarization.
57
Hence, when 0
jy
E =
A
in (4.25a) and (4.25b),
the transmitted plane wave must also be linearly polarized along the x axis, making
( )
0
t
jy
E =
A
in
Eqs. (4.27c) and (4.27d); and when 0
jx
E =
A
, the transmitted plane wave, which must be linearly
polarized along the y axis, has
( )
0
t
jx
E =
A
in (4.27c) and (4.27d).
Consulting Eqs. (4.25a) and (4.25b), we see that for linear polarization along the x axis with
0
jy
E =
A
, the incident plane wave is given by the real part of


( ) 2
ˆ
jx
i i z ct
jx
x E e e
φ π σ −
A A
A
(4.28a)

for the electric field and the real part of


( ) 2
1
ˆ
jx
i i z ct
jx
y E e e
c
φ π σ −
A A
A
(4.28b)

for the magnetic induction. The corresponding transmitted plane wave is given by the real part of


( )
( )
2 ( )
ˆ
t
jx
i i z ct t
jx
x E e e
φ π σ −
A A
A
(4.29a)

for the electric field and the real part of


( )
( )
2 ( )
1
ˆ
t
jx
i i z ct t
jx
y E e e
c
φ π σ −
A A
A
(4.29b)

for the magnetic induction [see Eqs. (4.27c) and (4.27d) with
( )
0
t
jy
E =
A
). The ratio of the complex
transmitted electric field’s x component in (4.29a) to the complex incident electric field’s x
component in (4.28a) is the complex coefficient


( )
( )
( )
t
jx jx
t
i jx
s
jx
E
t e
E
φ φ −
=
A A
A
A
. (4.30a)

We see by inspection that this is the same as the ratio of the two complex magnetic inductions in
(4.29b) and (4.28b). Consequently, no matter what happens inside the film to produce the

57
Max Born and Emil Wolf, Principles of Optics: Electromagnetic Theory of Propagation, Interference, and
Diffraction of Light, 7th (expanded) ed. (Cambridge University Press, New York, 1999), p. 55.
Transmitted Plane Waves · 4.5
- 357 -
transmitted x-polarized wave, the process can be described by a complex parameter
s
t , which in
general is a function of the wavenumber σ
A
and
j
ψ , the angle of incidence in Fig. 4.7,

( , )
s s j
t t σ ψ =
A
. (4.30b)

The subscript s in Eqs. (4.30a) and (4.30b) is traditionally applied to incident plane waves whose
electric field is linearly polarized perpendicular to the plane of incidence, and parameter
s
t is
called the s-wave amplitude-transmission coefficient.
58

It is important to note that
s
t does not depend on either
jx
E
A
or
jx
φ
A
, giving it the same value
for all monochromatic plane waves having equal wavenumbers and angles of incidence.
59

Equations (4.28a), (4.28b), (4.29a), and (4.29b) and the definition of parameter ( , )
s j
t σ ψ
A
in
(4.30a) let us write

( ) ( ) 2 2 ( )
( , )
i z ct i z ct t
js s js
E e t E e
π σ π σ
σ φ
− −
= ⋅
A A
A A A
G G
(4.31a)

and

( ) ( ) 2 2 ( )
( , )
i z ct i z ct t
js s js
B e t B e
π σ π σ
σ φ
− −
= ⋅
A A
A A A
G G
, (4.31b)
where
ˆ
jx
i
js jx
E x E e
φ
=
A
A A
G
,
1
ˆ
jx
i
js jx
B y E e
c
φ
=
A
A A
G
, (4.31c)
and

( )
( ) ( )
ˆ
t
jx
i
t t
js jx
E x E e
φ
=
A
A A
G
, and
( )
( ) ( )
1
ˆ
t
jx
i
t t
js jx
B y E e
c
φ
=
A
A A
G
. (4.31d)

This shows that to get the complex formula for the transmitted plane wave linearly polarized
perpendicular to the plane of incidence, we need only multiply the complex formula for the
incident plane wave by ( , )
s j
t σ ψ
A
. If the plane wavefield incident on the optical film at an angle
j
ψ contains more than one wavenumber (but is still polarized perpendicular to the plane of
incidence), then its electric field is given by the real part of


( ) 2 i z ct
js
E e
π σ −
¦
A
A
A
G


and its magnetic induction is given by the real part of

58
This notation can be traced back to the German word for perpendicular, senkrecht.
59
O. S. Heavens, Optical Properties of Thin Solid Films (London, Butterworths Scientific Publications, 1955), pp.
46–95.
4 · From Maxwell’s Equations to the Michelson Interferometer
- 358 -

( ) 2 i z ct
js
B e
π σ −
¦
A
A
A
G
,

where an s subscript has been added to show that all the waves are linearly polarized
perpendicular to the plane of incidence. The s-wave amplitude-transmission coefficient can now
be used to write the complex formulas for the transmitted radiation fields as


( ) ( ) 2 2 ( )
( , )
i z ct i z ct t
js s j js
E e t E e
π σ π σ
σ ψ
− −
= ⋅
¦ ¦
A A
A A A
A A
G G
(4.31e)
and

( ) ( ) 2 2 ( )
( , )
i z ct i z ct t
js s j js
B e t B e
π σ π σ
σ ψ
− −
= ⋅
¦ ¦
A A
A A A
A A
G G
(4.31f)
because

( )
( , )
t
js s j js
E t E σ ψ = ⋅
A A A
G G
and
( )
( , )
t
js s j js
B t B σ ψ = ⋅
A A A
G G
(4.31g)

for all values of A .
For linear polarization along the y axis with 0
jx
E =
A
, Eqs. (4.25a) and (4.25b) show that the
electric field of the incident plane wave is given by the real part of


( ) 2
ˆ
jy
i i z ct
jy
y E e e
φ π σ −
A A
A
(4.32a)

and the magnetic induction of the incident plane wave is given by the real part of


( ) 2
1
ˆ
jy
i i z ct
jy
x E e e
c
φ π σ −

A A
A
. (4.32b)

Recalling that the corresponding transmitted plane wave must have the same type of linear
polarization as the incident wave, we set
( )
0
t
jx
E =
A
in Eqs. (4.27c) and (4.27d) to get that the
electric field of the transmitted plane wave is the real part of


( )
( )
2 ( )
ˆ
t
jy
i i z ct t
jy
y E e e
φ π σ −
A A
A
(4.33a)

and the magnetic induction of the transmitted plane wave is the real part of


( )
( )
2 ( )
1
ˆ
t
jy
i i z ct t
jy
x E e e
c
φ π σ −

A A
A
. (4.33b)

Transmitted Plane Waves · 4.5
- 359 -
The ratio of the complex transmitted electric field in (4.33a) to the complex incident electric field
in (4.32a) is

( )
( )
( )
t
jy jy
t
i jy
p
jy
E
t e
E
φ φ −
=
A A
A
A
. (4.34a)

Again, this is the same as the ratio of the two complex magnetic inductions in (4.33b) and
(4.32b)—so again the process of transmission is described by a single complex parameter that is a
function of σ
A
and
j
ψ but not of
jy
E
A
or
jy
φ
A
,

( , )
p p j
t t σ ψ =
A
. (4.34b)

The p subscript is traditionally applied to incident plane waves whose electric field is linearly
polarized parallel to the plane of incidence, and parameter
p
t is called the p-wave amplitude-
transmission coefficient.
60
When the incident wavefield contains more than one wavenumber and
every monochromatic component is a p-type plane wave, its electric field is given by the real part
of

( ) 2 i z ct
jp
E e
π σ −
¦
A
A
A
G
(4.35a)

and its magnetic induction is given by the real part of


( ) 2 i z ct
jp
B e
π σ −
¦
A
A
A
G
, (4.35b)
where
ˆ
jy
i
jp jy
E y E e
φ
=
A
A A
G
and
1
ˆ
jy
i
jp jy
B x E e
c
φ
= −
A
A A
G
(4.35c)

with the p subscript showing that the waves are linearly polarized parallel to the plane of
incidence. To get the complex formula for the transmitted plane wave linearly polarized parallel
to the plane of incidence, we need only multiply the complex term for each incident plane wave
by ( , )
p j
t σ ψ
A
to get

( ) ( ) 2 2 ( )
( , )
i z ct i z ct t
jp p j jp
E e t E e
π σ π σ
σ ψ
− −
= ⋅
¦ ¦
A A
A A A
A A
G G
(4.35d)
and

60
This notation can also be traced back to German scientists, with the German word for parallel spelled the same as
in English, parallel.
4 · From Maxwell’s Equations to the Michelson Interferometer
- 360 -

( ) ( ) 2 2 ( )
( , )
i z ct i z ct t
jp p j jp
B e t B e
r o r o
o ¢
÷ ÷

¦ ¦
A A
A A A
A A
G G
. (4.35e)

The details of the mathematics used here to represent the incident and transmitted wavefields
have an unfortunate tendency to conceal the basic ideas behind what is being done. No matter
what the orientation of the E field in the incident monochromatic plane wave—parallel or
perpendicular to the plane of incidence—terms having the form


( ) i z ct
Ae
o ÷
A


are used to describe the electromagnetic wavefields on the incident side of the thin film, and
terms such as


( ) i z ct
Ae
o
t
÷
A


are used to describe the electromagnetic wavefields on the transmitted side of the thin film. Here,
t is a complex number standing for either
s
t or
p
t in the above formulas; and A is a complex
number standing for either the x or y components of the E and B fields’ complex amplitudes—for
example,
jx
E
A
,
jy
B
A
, etc.. If we write the complex A value as


A
i
A A e
o
,
then

( ) ( )
A
i z ct i z ct
Ae A e
o o o ÷ ÷ +

A A

and

( ) ( )
A
i z ct i z ct
Ae A e
o o o
t t
÷ ÷ +

A A
.

If the incident monochromatic wavefield is shifted forward or back along the z axis—that is,
along its direction of propagation—by a distance
0
z , then
0
z z z ÷ ± so that


( )
0 0
( ) ( ) ( )
A A A
i z ct z i z i z ct i z ct
A e A e A e e
o o o o o o o o ÷ + ± ± ÷ + ÷ +
÷
A A A A A


To change the amplitude of the incident wavefield to some fraction of its original value, we
multiply A by a real number Į between zero and one to get


( ) ( )
0 0
( ) ( )
A A
i z i z i z ct i z ct
A e e A e e
o o o o o o
o
± ± ÷ + ÷ +
÷
A A A A
.

The complex t parameter can be written as
)
A
ct o ÷ +
.
( )
0
) (
A
i z t i z
A e e
o o o
o
± + ÷
÷
A A
( )
0 0
) (
A
t z i z i z ct
A e e
o o o o + ± ± ÷

A A A
( )
A
i z c ct
A e
o o ÷ +
÷
A
)
A
ct o +

)
A
ct o ÷ +
.
) ( ct i z ct
A e
o ÷

A
) ( ct i z c
A e
o
t
÷

A
b t) þ
þ
b þ þ
þ b þ
þ þ þ þ þ
þ þ þ þ
m (with = 2 þ
A
r
A
and b = 2 c) r o
A
b t)
A
A A
o
A
b
A
b
A
b
A
b
A
b
A
b
A
b
A
b
A
)
A
t o +
b
A
A
Transmitted Plane Waves · 4.5
- 361 -

i
e
t
o
t t ,

which means the transmitted wavefield that is specified above to be


( ) ( )
A
i z ct i z ct
Ae A e
o o o
t t
÷ ÷ +

A A

becomes

( )
( ) ( )
A
i i z ct i z ct
Ae A e e
t
o o o o
t t
÷ ÷ +

A A
.

Comparing the right-hand side of this equation to the expression


( )
0
( )
A
i z i z ct
A e e
o o o
o
± ÷ +
A A


for the incident wavefield shifted by
0
z and diminished by a real factor Į, we note that

o t .
and

0
z
t
o o ± .
A
.

Hence, all that happens when we multiply a wavefield specified to be

$$A\,e^{2\pi i\sigma_\ell(z-ct)}$$

by a complex parameter $\tau$ to get

$$\tau A\,e^{2\pi i\sigma_\ell(z-ct)}$$

is that the amplitude $|A|$ of the original wavefield changes to $|\tau|\,|A|$ and the oscillations of the wavefield are moved forward or back by a distance

$$z_0 = \frac{\arg(\tau)}{2\pi\sigma_\ell} = \frac{\phi_\tau}{2\pi\sigma_\ell}$$

along the direction of propagation. This mathematical fact—knowing what happens when the complex expression for a monochromatic wavefield is multiplied by a complex parameter—gives meaning to the formulas derived in the first part of this section. Monochromatic wavefields transmitted through the thin film in Fig. 4.7 have their amplitudes diminished by $t_s$ if the E field is perpendicular to the plane of incidence and by $t_p$ if the E field is parallel to the plane of incidence. The oscillations of the transmitted wavefields are also moved forward or back with respect to the incident wavefield as specified by the complex phases or arguments of $t_s$ and $t_p$. How much the wavefields shift and change in amplitude depends on the angle of incidence and wavenumber—that is why $t_s$ and $t_p$ are written as functions of $\psi_j$ and $\sigma_\ell$.
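This amplitude-and-shift interpretation is easy to confirm numerically. The minimal sketch below, in which every number (wavenumber, coefficient, amplitude) is an invented illustration rather than a value from the text, multiplies a monochromatic wavefield by a complex $\tau$ and checks that the product equals the original wavefield rescaled by $|\tau|$ and shifted by $\arg(\tau)/(2\pi\sigma_\ell)$:

```python
import numpy as np

sigma = 800.0                      # wavenumber in cm^-1 (illustrative)
tau = 0.7 * np.exp(1j * 0.9)       # hypothetical transmission coefficient
A = 1.3 * np.exp(1j * 0.2)         # hypothetical complex amplitude
z = np.linspace(0.0, 0.02, 4000)   # positions in cm, at the instant t = 0

original = A * np.exp(2j * np.pi * sigma * z)
transmitted = tau * original

# Shift the original forward by z0 = arg(tau) / (2*pi*sigma) and rescale
# it by |tau|; this should reproduce the transmitted wavefield exactly.
z0 = np.angle(tau) / (2 * np.pi * sigma)
shifted = abs(tau) * A * np.exp(2j * np.pi * sigma * (z + z0))

print(np.allclose(transmitted, shifted))   # True
```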
From the work done in Sec. 4.4, we know that any monochromatic plane wave having a
propagation vector parallel to the z axis can be analyzed as the sum of a monochromatic plane
wave linearly polarized along the x axis and a monochromatic plane wave linearly polarized
along the y axis. This means that any monochromatic plane wave incident on the optical film in
Fig. 4.7 can be treated as the sum of an s-type monochromatic plane wave and a p-type
monochromatic plane wave. Consequently, we expect an arbitrary plane wavefield incident along
the z axis in region A of Fig. 4.7 to have both s-type and p-type components, with its electric field
given by the real part of


$$\sum_\ell \vec{E}_{\ell js}\,e^{2\pi i\sigma_\ell(z-ct)} + \sum_\ell \vec{E}_{\ell jp}\,e^{2\pi i\sigma_\ell(z-ct)} \qquad (4.36a)$$

and its magnetic induction given by the real part of


$$\sum_\ell \vec{B}_{\ell js}\,e^{2\pi i\sigma_\ell(z-ct)} + \sum_\ell \vec{B}_{\ell jp}\,e^{2\pi i\sigma_\ell(z-ct)}. \qquad (4.36b)$$

The recipe for taking this combined wavefield through the optical film into region B of Fig. 4.7 is
to multiply each s-wave component and p-wave component by the appropriate s-wave and p-
wave amplitude-transmission coefficients. Hence, the electric field for the transmitted wave in
region B is the real part of


$$\sum_\ell t_s(\sigma_\ell,\psi_j)\,\vec{E}_{\ell js}\,e^{2\pi i\sigma_\ell(z-ct)} + \sum_\ell t_p(\sigma_\ell,\psi_j)\,\vec{E}_{\ell jp}\,e^{2\pi i\sigma_\ell(z-ct)} \qquad (4.36c)$$

and the magnetic induction is the real part of


$$\sum_\ell t_s(\sigma_\ell,\psi_j)\,\vec{B}_{\ell js}\,e^{2\pi i\sigma_\ell(z-ct)} + \sum_\ell t_p(\sigma_\ell,\psi_j)\,\vec{B}_{\ell jp}\,e^{2\pi i\sigma_\ell(z-ct)}. \qquad (4.36d)$$

Thus the transmission of any plane wavefield containing many different wavenumbers—that is,
the transmission of any polychromatic plane wave—can be handled by writing each incident
monochromatic wave as the sum of an s-wave and a p-wave, as shown in (4.36a) and (4.36b), and
then multiplying each s-wave and p-wave in that sum by the correct s-wave and p-wave
amplitude-transmission coefficient, as shown in (4.36c) and (4.36d).
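As a schematic illustration of this recipe, the sketch below pushes a small polychromatic plane wave through a film whose transmission coefficients are supplied as arrays. The coefficient and amplitude values are invented placeholders, since the text has not given explicit formulas for $t_s$ and $t_p$; the point is only the bookkeeping of Eqs. (4.36a) and (4.36c):

```python
import numpy as np

c = 2.998e10                                   # speed of light, cm/s
sigma = np.array([600.0, 800.0, 1000.0])       # wavenumbers sigma_l, cm^-1

# Made-up complex s- and p-amplitudes of the incident wave at each sigma_l.
E_s = np.array([1.0 + 0.0j, 0.5 + 0.5j, 0.0 + 0.2j])
E_p = np.array([0.3 + 0.1j, 0.8 + 0.0j, 0.1 + 0.4j])

# Placeholder amplitude-transmission coefficients t_s(sigma_l, psi_j) and
# t_p(sigma_l, psi_j) at one fixed angle of incidence (invented values).
t_s = np.array([0.9, 0.8, 0.7]) * np.exp(1j * np.array([0.2, 0.3, 0.4]))
t_p = np.array([0.95, 0.85, 0.75]) * np.exp(1j * np.array([0.1, 0.2, 0.3]))

def real_field(amps, z, t):
    """Real part of sum_l amps_l * exp(2*pi*i*sigma_l*(z - c*t))."""
    return np.real(np.sum(amps * np.exp(2j * np.pi * sigma * (z - c * t))))

z, t = 0.001, 0.0
incident_s = real_field(E_s, z, t)            # s-wave sum in Eq. (4.36a)
transmitted_s = real_field(t_s * E_s, z, t)   # s-wave sum in Eq. (4.36c)
transmitted_p = real_field(t_p * E_p, z, t)   # p-wave sum in Eq. (4.36c)
print(incident_s, transmitted_s, transmitted_p)
```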
4.6 Reflected Plane Waves
If the incident wavefield in region A of Figs. 4.7 and 4.8 is a monochromatic plane wave with propagation vector $\hat{\Omega}_j = \hat{z}$ and wavenumber $\sigma_\ell$, then the reflected wavefield in region A is a monochromatic plane wave with wavenumber $\sigma_\ell$ and propagation vector $\hat{\Omega}_j^{(r)}$. In Fig. 4.8, we construct a special $x^{(r)}$, $y^{(r)}$, $z^{(r)}$ coordinate system to analyze the reflected plane wave. The $z^{(r)}$ axis is set parallel to the $\hat{\Omega}_j^{(r)}$ propagation vector, so that $\hat{z}^{(r)} = \hat{\Omega}_j^{(r)}$. Note that, according to the discussion at the end of Sec. 4.2, the sum of the incident and reflected plane waves is still a solution to Maxwell's equations in region A. We see that the $x^{(r)}$, $y^{(r)}$, $z^{(r)}$ coordinate system is just the x, y, z coordinate system rotated about the x axis to make $\hat{z}$ parallel to $\hat{\Omega}_j^{(r)}$, so the two coordinate systems have the same origin. Both coordinate systems have the same x axis, so $\hat{x}^{(r)} = \hat{x}$, and to get the y axis of the new coordinate system, we specify $\hat{y}^{(r)} = \hat{z}^{(r)} \times \hat{x}^{(r)} = \hat{\Omega}_j^{(r)} \times \hat{x}^{(r)}$.
When an x, y, z coordinate system is rotated by an angle $\beta$ about its x axis to create a new $x^{(r)}$, $y^{(r)}$, $z^{(r)}$ coordinate system (see Fig. 4.9), the relationship between the $\hat{x}$, $\hat{y}$, $\hat{z}$ unit vectors and the $\hat{x}^{(r)}$, $\hat{y}^{(r)}$, $\hat{z}^{(r)}$ unit vectors is

$$\hat{x}^{(r)} = \hat{x}, \qquad (4.37a)$$

$$\hat{y}^{(r)} = \hat{y}\cos\beta + \hat{z}\sin\beta, \qquad (4.37b)$$

and

$$\hat{z}^{(r)} = \hat{z}\cos\beta - \hat{y}\sin\beta. \qquad (4.37c)$$

Equations (4.37a)–(4.37c) provide another way of specifying the $\hat{x}^{(r)}$, $\hat{y}^{(r)}$, $\hat{z}^{(r)}$ unit vectors in terms of the $\hat{x}$, $\hat{y}$, $\hat{z}$ unit vectors. Comparing Figs. 4.8 and 4.9, we see that to create the desired $x^{(r)}$, $y^{(r)}$, $z^{(r)}$ coordinate system in Fig. 4.8, the original x, y, z coordinate system should be rotated around the x axis by an angle in radians of $\beta = 2\psi_j - \pi$ or, equivalently, $\beta = 2\psi_j + \pi$.
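Equations (4.37a)–(4.37c) amount to a rotation about the x axis, and the construction is easy to sanity-check numerically. In the sketch below, β = 0.7 rad is an arbitrary test value:

```python
import numpy as np

beta = 0.7                       # arbitrary rotation angle about the x axis
x = np.array([1.0, 0.0, 0.0])
y = np.array([0.0, 1.0, 0.0])
z = np.array([0.0, 0.0, 1.0])

# Eqs. (4.37a)-(4.37c): rotate the y and z axes about the shared x axis.
x_r = x
y_r = y * np.cos(beta) + z * np.sin(beta)
z_r = z * np.cos(beta) - y * np.sin(beta)

# The rotated triplet should still be orthonormal and right-handed, with
# y_r = z_r x x_r as required in the construction of Fig. 4.8.
print(np.dot(y_r, z_r), np.dot(x_r, y_r))      # both ~ 0
print(np.allclose(np.cross(z_r, x_r), y_r))    # True
```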
Because the reflected plane wave is traveling down the $z^{(r)}$ axis rather than the z axis, when the E and B fields of the wave are specified by the real parts of complex expressions, such as the ones shown in Eqs. (4.16a) and (4.16b), we must replace $\hat{\Omega}_j$ and z by $\hat{\Omega}_j^{(r)}$ and $z^{(r)}$ respectively,

$$\vec{E}_{\ell j}^{(r)}\,e^{2\pi i\sigma_\ell(\hat{\Omega}_j^{(r)}\cdot\vec{r}-ct)} = \vec{E}_{\ell j}^{(r)}\,e^{2\pi i\sigma_\ell(z^{(r)}-ct)} \qquad (4.38a)$$

and

$$\vec{B}_{\ell j}^{(r)}\,e^{2\pi i\sigma_\ell(\hat{\Omega}_j^{(r)}\cdot\vec{r}-ct)} = \vec{B}_{\ell j}^{(r)}\,e^{2\pi i\sigma_\ell(z^{(r)}-ct)}. \qquad (4.38b)$$
FIGURE 4.8. [Reflection geometry: regions A and B on either side of the film, surface normal $\hat{n}$, incident propagation vector $\hat{\Omega}_j$ and reflected propagation vector $\hat{\Omega}_j^{(r)}$, each at angle $\psi_j$ to the normal, and the rotated $x^{(r)}$, $y^{(r)}$, $z^{(r)}$ axes.]
The r superscript on the complex $\vec{E}_{\ell j}^{(r)}$ and $\vec{B}_{\ell j}^{(r)}$ vectors shows that they belong to the reflected wave. Vector $\vec{E}_{\ell j}^{(r)}$ in (4.38a) can be written as

$$\vec{E}_{\ell j}^{(r)} = \hat{x}\,E_{\ell jx}^{(r)} + \hat{y}^{(r)}\,E_{\ell jy^{(r)}}^{(r)} \qquad (4.38c)$$

using two complex numbers $E_{\ell jx}^{(r)}$ and $E_{\ell jy^{(r)}}^{(r)}$ to represent its $\hat{x}$ and $\hat{y}^{(r)}$ components. Although the y subscripts and unit vectors have an r superscript to show that they belong to the $x^{(r)}$, $y^{(r)}$, $z^{(r)}$ coordinate system, the x subscripts and unit vectors do not need one because $\hat{x}$ and $\hat{x}^{(r)}$ are identical in the two coordinate systems.



FIGURE 4.9. [The x, y, z axes and the $x^{(r)}$, $y^{(r)}$, $z^{(r)}$ axes sharing the same x axis, with the y and z axes rotated through the angle β.]
Following the pattern of Eqs. (4.27c) and (4.27d), we write the complex vectors $\vec{E}_{\ell j}^{(r)}$ and $\vec{B}_{\ell j}^{(r)}$ as


$$\vec{E}_{\ell j}^{(r)} = \hat{x}\,\bigl|E_{\ell jx}^{(r)}\bigr|\,e^{i\phi_{\ell jx}^{(r)}} + \hat{y}^{(r)}\,\bigl|E_{\ell jy^{(r)}}^{(r)}\bigr|\,e^{i\phi_{\ell jy^{(r)}}^{(r)}} \qquad (4.38d)$$

and

$$\vec{B}_{\ell j}^{(r)} = -\hat{x}\,\frac{1}{c}\,\bigl|E_{\ell jy^{(r)}}^{(r)}\bigr|\,e^{i\phi_{\ell jy^{(r)}}^{(r)}} + \hat{y}^{(r)}\,\frac{1}{c}\,\bigl|E_{\ell jx}^{(r)}\bigr|\,e^{i\phi_{\ell jx}^{(r)}} \qquad (4.38e)$$

using the real constants $\phi_{\ell jx}^{(r)}$ and $\phi_{\ell jy^{(r)}}^{(r)}$ to represent the phases of the complex values of $E_{\ell jx}^{(r)}$ and $E_{\ell jy^{(r)}}^{(r)}$ respectively.

When the plane wave incident on the optical film is linearly polarized along the x axis or y axis, the reflected wave is linearly polarized along the $\hat{x}^{(r)} = \hat{x}$ axis or the $\hat{y}^{(r)}$ axis respectively.⁶¹

⁶¹ Max Born and Emil Wolf, Principles of Optics, p. 55.
Equations (4.28a) and (4.28b), which give the complex formulas for an incident plane wave that is linearly x-polarized, force the reflected plane wave to be linearly polarized along the $\hat{x}^{(r)} = \hat{x}$ axis. According to Eq. (4.38d), this reflected wave must have

$$E_{\ell jy^{(r)}}^{(r)} = 0$$

for it to be linearly polarized along the $\hat{x}^{(r)} = \hat{x}$ axis. Equations (4.38a)–(4.38e) then show that the E field of the reflected wave is given by the real part of

$$\hat{x}\,\bigl|E_{\ell jx}^{(r)}\bigr|\,e^{i\phi_{\ell jx}^{(r)}}\,e^{2\pi i\sigma_\ell(z^{(r)}-ct)} \qquad (4.39a)$$

and the B field of the reflected wave is given by the real part of


$$\hat{y}^{(r)}\,\frac{1}{c}\,\bigl|E_{\ell jx}^{(r)}\bigr|\,e^{i\phi_{\ell jx}^{(r)}}\,e^{2\pi i\sigma_\ell(z^{(r)}-ct)}. \qquad (4.39b)$$

Comparing these two complex formulas to the complex formulas (4.28a) and (4.28b) for the incident wave, we note that if we consider only the scalar factors that do not depend on position or time, then the $\hat{x}^{(r)} = \hat{x}$ components of the complex E fields together with the $\hat{y}$, $\hat{y}^{(r)}$ components of the complex B fields have the same complex ratio

$$\frac{\bigl|E_{\ell jx}^{(r)}\bigr|}{\bigl|E_{\ell jx}\bigr|}\,e^{i\left(\phi_{\ell jx}^{(r)} - \phi_{\ell jx}\right)} = r_s. \qquad (4.40a)$$

Parameter $r_s$ is called the s-wave amplitude-reflection coefficient, with s again referring to the incident plane wave's being polarized perpendicular to the plane of incidence. In general,

$$r_s = r_s(\sigma_\ell, \psi_j), \qquad (4.40b)$$

where $r_s$, like the amplitude-transmission coefficients $t_s$ and $t_p$, does not depend on either $\bigl|E_{\ell jx}\bigr|$ or $\phi_{\ell jx}$; it is the same for all incident plane waves having the same $\sigma_\ell$ and $\psi_j$. Comparing the x-polarized reflected wave in (4.39a) and (4.39b) to the x-polarized incident wave in (4.28a) and (4.28b), we see that multiplying the complex formulas in (4.28a) and (4.28b) by $r_s$ converts them to the complex formulas in (4.39a) and (4.39b) if $\hat{y}$ is replaced by $\hat{y}^{(r)}$ and z is replaced by $z^{(r)}$.
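The text deliberately leaves the functional form of $r_s(\sigma_\ell, \psi_j)$ open, since it depends on the particular film. For a concrete feel, the sketch below evaluates the standard Fresnel amplitude-reflection coefficient of s-polarized light at a single interface between vacuum and a medium of refractive index n. This familiar special case is offered only as an illustration; it is not the thin-film coefficient of Fig. 4.7, and for a lossless interface it happens to be independent of $\sigma_\ell$:

```python
import numpy as np

def fresnel_rs(psi, n):
    """Fresnel s-wave amplitude-reflection coefficient for a single
    vacuum-to-index-n interface at angle of incidence psi (radians).
    A standard special case, not the thin-film r_s of the text."""
    cos_i = np.cos(psi)
    cos_t = np.sqrt(1.0 - (np.sin(psi) / n) ** 2)   # Snell's law
    return (cos_i - n * cos_t) / (cos_i + n * cos_t)

psi = np.radians(30.0)
r_s = fresnel_rs(psi, 1.5)
print(abs(r_s), np.angle(r_s))   # amplitude factor and phase shift
```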
Turning to the case of the y-polarized incident wave specified by the complex formulas (4.32a) and (4.32b), we remember that now the reflected wave must be polarized along the $\hat{y}^{(r)}$ axis. This forces $E_{\ell jx}^{(r)} = 0$ in Eqs. (4.38a)–(4.38e), showing the reflected E field is given by the real part of

$$\hat{y}^{(r)}\,\bigl|E_{\ell jy^{(r)}}^{(r)}\bigr|\,e^{i\phi_{\ell jy^{(r)}}^{(r)}}\,e^{2\pi i\sigma_\ell(z^{(r)}-ct)} \qquad (4.41a)$$

and the reflected B field is given by the real part of


$$-\hat{x}\,\frac{1}{c}\,\bigl|E_{\ell jy^{(r)}}^{(r)}\bigr|\,e^{i\phi_{\ell jy^{(r)}}^{(r)}}\,e^{2\pi i\sigma_\ell(z^{(r)}-ct)}. \qquad (4.41b)$$


Comparing these two formulas to (4.32a) and (4.32b) for the incident wave, we again see that if we consider only the scalar factors that do not depend on position or time then the $\hat{y}$, $\hat{y}^{(r)}$ components of the complex E fields together with the $\hat{x}^{(r)} = \hat{x}$ components of the complex B fields have the same complex ratio

$$\frac{\bigl|E_{\ell jy^{(r)}}^{(r)}\bigr|}{\bigl|E_{\ell jy}\bigr|}\,e^{i\left(\phi_{\ell jy^{(r)}}^{(r)} - \phi_{\ell jy}\right)} = r_p. \qquad (4.42a)$$
Parameter $r_p$ is called the p-wave amplitude-reflection coefficient, where again p refers to the incident wave's being polarized parallel to the plane of incidence. This coefficient, like $r_s$, $t_s$, and $t_p$, in general depends only on the wavenumber and incidence angle,

$$r_p = r_p(\sigma_\ell, \psi_j). \qquad (4.42b)$$

Multiplying the complex formulas in (4.32a) and (4.32b) by $r_p$ converts them to (4.41a) and (4.41b) if $\hat{y}$ is replaced by $\hat{y}^{(r)}$ and z is replaced by $z^{(r)}$.
Having analyzed how to create the reflected wavefield when the incident wavefield is a
monochromatic s-wave or monochromatic p-wave, we are now prepared to handle the reflection
of an arbitrary polychromatic plane wavefield incident along the z axis. Splitting each
monochromatic term into an s-wave component and a p-wave component as in formulas (4.36a)
and (4.36b), we can write the incident wave’s E field as the real part of


$$\sum_\ell \vec{E}_{\ell js}\,e^{2\pi i\sigma_\ell(z-ct)} + \sum_\ell \vec{E}_{\ell jp}\,e^{2\pi i\sigma_\ell(z-ct)}$$

or, using Eqs. (4.31c) and (4.35c), as the real part of

$$\hat{x}\sum_\ell \bigl|E_{\ell jx}\bigr|\,e^{i\phi_{\ell jx}}\,e^{2\pi i\sigma_\ell(z-ct)} + \hat{y}\sum_\ell \bigl|E_{\ell jy}\bigr|\,e^{i\phi_{\ell jy}}\,e^{2\pi i\sigma_\ell(z-ct)}. \qquad (4.43a)$$

Similarly, the incident wave’s B field is, using Eqs. (4.31c) and (4.35c), the real part of


$$\hat{y}\,\frac{1}{c}\sum_\ell \bigl|E_{\ell jx}\bigr|\,e^{i\phi_{\ell jx}}\,e^{2\pi i\sigma_\ell(z-ct)} - \hat{x}\,\frac{1}{c}\sum_\ell \bigl|E_{\ell jy}\bigr|\,e^{i\phi_{\ell jy}}\,e^{2\pi i\sigma_\ell(z-ct)}. \qquad (4.43b)$$

In these latest formulas, (4.43a) and (4.43b), the first term is the sum over the s-wave components
of the incident wavefield and the second term is the sum over the p-wave components of the
incident wavefield. To get the corresponding polychromatic reflected wavefield, we follow the
just-described recipes for finding the reflected monochromatic plane waves generated by each
incident monochromatic plane wave. The electric field of the reflected wavefield is then found to
be the real part of


$$\sum_\ell r_s(\sigma_\ell,\psi_j)\,\hat{x}\,\bigl|E_{\ell jx}\bigr|\,e^{i\phi_{\ell jx}}\,e^{2\pi i\sigma_\ell(z^{(r)}-ct)} + \sum_\ell r_p(\sigma_\ell,\psi_j)\,\hat{y}^{(r)}\,\bigl|E_{\ell jy}\bigr|\,e^{i\phi_{\ell jy}}\,e^{2\pi i\sigma_\ell(z^{(r)}-ct)} \qquad (4.43c)$$

and the magnetic-induction field of the reflected wavefield is found to be the real part of

$$\sum_\ell r_s(\sigma_\ell,\psi_j)\,\hat{y}^{(r)}\,\frac{1}{c}\,\bigl|E_{\ell jx}\bigr|\,e^{i\phi_{\ell jx}}\,e^{2\pi i\sigma_\ell(z^{(r)}-ct)} - \sum_\ell r_p(\sigma_\ell,\psi_j)\,\hat{x}\,\frac{1}{c}\,\bigl|E_{\ell jy}\bigr|\,e^{i\phi_{\ell jy}}\,e^{2\pi i\sigma_\ell(z^{(r)}-ct)}. \qquad (4.43d)$$

These reflected-wave formulas are, of course, the counterpart equations to (4.36c) and (4.36d) for
the transmitted wavefields.
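The reflection recipe of Eqs. (4.43c) and (4.43d) is mechanical enough to be worth a short sketch. Below, made-up incident spectra and made-up coefficient arrays stand in for the quantities the text leaves general; the essential bookkeeping is that every s-component picks up $r_s(\sigma_\ell, \psi_j)$ and stays along $\hat{x}$, while every p-component picks up $r_p(\sigma_\ell, \psi_j)$ and ends up along $\hat{y}^{(r)}$:

```python
import numpy as np

rng = np.random.default_rng(0)
c = 2.998e10                                  # speed of light, cm/s

sigma = np.array([500.0, 700.0, 900.0])       # wavenumbers, cm^-1
E_x = rng.normal(size=3) + 1j * rng.normal(size=3)   # made-up s-amplitudes
E_y = rng.normal(size=3) + 1j * rng.normal(size=3)   # made-up p-amplitudes
r_s = 0.6 * np.exp(1j * np.array([0.1, 0.2, 0.3]))   # placeholder r_s(sigma)
r_p = 0.5 * np.exp(1j * np.array([0.3, 0.2, 0.1]))   # placeholder r_p(sigma)

def reflected_E(z_r, t):
    """Real reflected E components along x-hat and y(r)-hat, Eq. (4.43c)."""
    phase = np.exp(2j * np.pi * sigma * (z_r - c * t))
    Ex_r = np.real(np.sum(r_s * E_x * phase))    # stays along x-hat
    Eyr_r = np.real(np.sum(r_p * E_y * phase))   # now along y(r)-hat
    return Ex_r, Eyr_r

print(reflected_E(0.0, 0.0))
```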
4.7 Polychromatic Wave Fields
Having found and at least to some extent analyzed the complex E-field and B-field plane-wave
solutions in Eqs. (4.11a) and (4.11b), we can write their associated real-valued radiation fields as


$$\vec{E}^{(\mathrm{rad})}(\vec{r},t) = \mathrm{Re}\left\{\sum_j\sum_\ell \vec{E}_{\ell j}\,e^{2\pi i\sigma_\ell(\hat{\Omega}_j\cdot\vec{r}-ct)}\right\} = \sum_j\left\{\frac{1}{2}\sum_\ell \vec{E}_{\ell j}\,e^{2\pi i\sigma_\ell(\hat{\Omega}_j\cdot\vec{r}-ct)} + \frac{1}{2}\sum_\ell \vec{E}_{\ell j}^{\,*}\,e^{-2\pi i\sigma_\ell(\hat{\Omega}_j\cdot\vec{r}-ct)}\right\} \qquad (4.44a)$$
and

$$\vec{B}^{(\mathrm{rad})}(\vec{r},t) = \mathrm{Re}\left\{\sum_j\sum_\ell \vec{B}_{\ell j}\,e^{2\pi i\sigma_\ell(\hat{\Omega}_j\cdot\vec{r}-ct)}\right\} = \sum_j\left\{\frac{1}{2}\sum_\ell \vec{B}_{\ell j}\,e^{2\pi i\sigma_\ell(\hat{\Omega}_j\cdot\vec{r}-ct)} + \frac{1}{2}\sum_\ell \vec{B}_{\ell j}^{\,*}\,e^{-2\pi i\sigma_\ell(\hat{\Omega}_j\cdot\vec{r}-ct)}\right\}. \qquad (4.44b)$$

In Eq. (4.44a), to convert the first inside sum over $\vec{E}_{\ell j}$ into an integral, we replace $\sigma_\ell \ge 0$ with the continuous variable $\sigma \ge 0$. To convert the sum over $\vec{E}_{\ell j}^{\,*}$ into an integral, we use negative values of the same continuous variable σ; that is, we replace $-\sigma_\ell$ with $\sigma < 0$. To set up these conversions, we define

$$\vec{E}_j(\sigma)\,\Delta\sigma_\ell = \frac{1}{2}\vec{E}_{\ell j} \quad \text{for } \sigma = \sigma_\ell > 0, \qquad (4.45a)$$

and

$$\vec{E}_j(\sigma)\,\Delta\sigma_\ell = \frac{1}{2}\vec{E}_{\ell j}^{\,*} \quad \text{for } \sigma = -\sigma_\ell < 0 \qquad (4.45b)$$

with

$$\Delta\sigma_\ell = \sigma_{\ell+1} - \sigma_\ell.$$

A similar conversion of sums into integrals can be applied to Eq. (4.44b) if we define

$$\vec{B}_j(\sigma)\,\Delta\sigma_\ell = \frac{1}{2}\vec{B}_{\ell j} \quad \text{for } \sigma = \sigma_\ell > 0, \qquad (4.45c)$$

and

$$\vec{B}_j(\sigma)\,\Delta\sigma_\ell = \frac{1}{2}\vec{B}_{\ell j}^{\,*} \quad \text{for } \sigma = -\sigma_\ell < 0. \qquad (4.45d)$$

Equations (4.45a) and (4.45c) associate positive σ arguments in $\vec{E}_j(\sigma)$ and $\vec{B}_j(\sigma)$ with the original $\vec{E}_{\ell j}$ and $\vec{B}_{\ell j}$ vectors, and Eqs. (4.45b) and (4.45d) associate negative σ arguments in $\vec{E}_j(\sigma)$ and $\vec{B}_j(\sigma)$ with the complex-conjugate $\vec{E}_{\ell j}^{\,*}$ and $\vec{B}_{\ell j}^{\,*}$ vectors. In the limit of decreasing $\Delta\sigma_\ell$ and increasing numbers of $\sigma_\ell$ values per unit wavenumber interval, Eqs. (4.44a) and (4.44b) become

$$\vec{E}^{(\mathrm{rad})}(\vec{r},t) = \sum_j \int_{-\infty}^{\infty} \vec{E}_j(\sigma)\,e^{2\pi i\sigma(\hat{\Omega}_j\cdot\vec{r}-ct)}\,d\sigma \qquad (4.46a)$$

and

$$\vec{B}^{(\mathrm{rad})}(\vec{r},t) = \sum_j \int_{-\infty}^{\infty} \vec{B}_j(\sigma)\,e^{2\pi i\sigma(\hat{\Omega}_j\cdot\vec{r}-ct)}\,d\sigma. \qquad (4.46b)$$

For this limit to make sense, we have to set $\vec{E}_j(\sigma) = 0$ and $\vec{B}_j(\sigma) = 0$ in (4.45a)–(4.45d) at those wavenumbers for which there are no specified ℓ index values in (4.44a) and (4.44b); in effect, the indices left out of the sums are now included but assigned zero for their complex vector coefficients $\vec{E}_{\ell j}$ and $\vec{B}_{\ell j}$. Although Eqs. (4.44a) and (4.44b) force vectors $\vec{E}^{(\mathrm{rad})}$ and $\vec{B}^{(\mathrm{rad})}$ to be real, vectors $\vec{E}_j(\sigma)$ and $\vec{B}_j(\sigma)$ are allowed to be complex.
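A discrete sketch of these definitions may help; the amplitudes below are made up. Half of each original $\vec{E}_{\ell j}$ is assigned to $+\sigma_\ell$ and half of its complex conjugate to $-\sigma_\ell$, so the resulting spectrum automatically satisfies $E_j(-\sigma) = E_j(\sigma)^*$, a property the text returns to in Eqs. (4.47a) and (4.47b):

```python
import numpy as np

d_sigma = 10.0                               # grid spacing, cm^-1
sigma_l = np.array([500.0, 510.0, 520.0])    # positive wavenumbers
E_l = np.array([1.0 + 2.0j, 0.5 - 1.0j, -0.3 + 0.4j])  # made-up E_lj values

# Spectrum samples on a symmetric grid of positive and negative sigma,
# following Eqs. (4.45a) and (4.45b): E_j(sigma) = E_lj / (2 * d_sigma)
# for sigma = +sigma_l and conj(E_lj) / (2 * d_sigma) for sigma = -sigma_l.
sigma = np.concatenate([-sigma_l[::-1], sigma_l])
E_of_sigma = np.concatenate([np.conj(E_l[::-1]), E_l]) / (2.0 * d_sigma)

# Hermitian property E(-sigma) = E(sigma)*:
print(np.allclose(E_of_sigma[::-1], np.conj(E_of_sigma)))   # True
```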
Equations (4.46a) and (4.46b) are a vector shorthand for the six scalar equations

$$E_x^{(\mathrm{rad})}(\vec{r},t) = \sum_j \int_{-\infty}^{\infty} E_{jx}(\sigma)\,e^{2\pi i\sigma(\hat{\Omega}_j\cdot\vec{r}-ct)}\,d\sigma,$$

$$E_y^{(\mathrm{rad})}(\vec{r},t) = \sum_j \int_{-\infty}^{\infty} E_{jy}(\sigma)\,e^{2\pi i\sigma(\hat{\Omega}_j\cdot\vec{r}-ct)}\,d\sigma,$$

$$E_z^{(\mathrm{rad})}(\vec{r},t) = \sum_j \int_{-\infty}^{\infty} E_{jz}(\sigma)\,e^{2\pi i\sigma(\hat{\Omega}_j\cdot\vec{r}-ct)}\,d\sigma,$$

and

$$B_x^{(\mathrm{rad})}(\vec{r},t) = \sum_j \int_{-\infty}^{\infty} B_{jx}(\sigma)\,e^{2\pi i\sigma(\hat{\Omega}_j\cdot\vec{r}-ct)}\,d\sigma,$$

$$B_y^{(\mathrm{rad})}(\vec{r},t) = \sum_j \int_{-\infty}^{\infty} B_{jy}(\sigma)\,e^{2\pi i\sigma(\hat{\Omega}_j\cdot\vec{r}-ct)}\,d\sigma,$$

$$B_z^{(\mathrm{rad})}(\vec{r},t) = \sum_j \int_{-\infty}^{\infty} B_{jz}(\sigma)\,e^{2\pi i\sigma(\hat{\Omega}_j\cdot\vec{r}-ct)}\,d\sigma,$$
where

$$\vec{E}^{(\mathrm{rad})}(\vec{r},t) = \hat{x}\,E_x^{(\mathrm{rad})}(\vec{r},t) + \hat{y}\,E_y^{(\mathrm{rad})}(\vec{r},t) + \hat{z}\,E_z^{(\mathrm{rad})}(\vec{r},t)$$

with

$$\vec{E}_j(\sigma) = \hat{x}\,E_{jx}(\sigma) + \hat{y}\,E_{jy}(\sigma) + \hat{z}\,E_{jz}(\sigma)$$

and

$$\vec{B}^{(\mathrm{rad})}(\vec{r},t) = \hat{x}\,B_x^{(\mathrm{rad})}(\vec{r},t) + \hat{y}\,B_y^{(\mathrm{rad})}(\vec{r},t) + \hat{z}\,B_z^{(\mathrm{rad})}(\vec{r},t)$$

with

$$\vec{B}_j(\sigma) = \hat{x}\,B_{jx}(\sigma) + \hat{y}\,B_{jy}(\sigma) + \hat{z}\,B_{jz}(\sigma)$$


for any $\hat{x}$, $\hat{y}$, $\hat{z}$ triplet of mutually perpendicular Cartesian unit vectors. The integrals in (4.46a) and (4.46b) are inverse Fourier transforms, so we can define, using $\xi = \hat{\Omega}_j\cdot\vec{r} - ct$,

$$\mathcal{E}_{jx}(\xi) = \int_{-\infty}^{\infty} E_{jx}(\sigma)\,e^{2\pi i\sigma\xi}\,d\sigma, \qquad \mathcal{E}_{jy}(\xi) = \int_{-\infty}^{\infty} E_{jy}(\sigma)\,e^{2\pi i\sigma\xi}\,d\sigma, \qquad \mathcal{E}_{jz}(\xi) = \int_{-\infty}^{\infty} E_{jz}(\sigma)\,e^{2\pi i\sigma\xi}\,d\sigma$$

and

$$\mathcal{B}_{jx}(\xi) = \int_{-\infty}^{\infty} B_{jx}(\sigma)\,e^{2\pi i\sigma\xi}\,d\sigma, \qquad \mathcal{B}_{jy}(\xi) = \int_{-\infty}^{\infty} B_{jy}(\sigma)\,e^{2\pi i\sigma\xi}\,d\sigma, \qquad \mathcal{B}_{jz}(\xi) = \int_{-\infty}^{\infty} B_{jz}(\sigma)\,e^{2\pi i\sigma\xi}\,d\sigma.$$

In our shorthand vector notation, this becomes

$$\vec{\mathcal{E}}_j(\xi) = \int_{-\infty}^{\infty} \vec{E}_j(\sigma)\,e^{2\pi i\sigma\xi}\,d\sigma \qquad (4.46c)$$

and

$$\vec{\mathcal{B}}_j(\xi) = \int_{-\infty}^{\infty} \vec{B}_j(\sigma)\,e^{2\pi i\sigma\xi}\,d\sigma, \qquad (4.46d)$$

where

$$\vec{\mathcal{E}}_j(\xi) = \hat{x}\,\mathcal{E}_{jx}(\xi) + \hat{y}\,\mathcal{E}_{jy}(\xi) + \hat{z}\,\mathcal{E}_{jz}(\xi) \qquad (4.46e)$$

and

$$\vec{\mathcal{B}}_j(\xi) = \hat{x}\,\mathcal{B}_{jx}(\xi) + \hat{y}\,\mathcal{B}_{jy}(\xi) + \hat{z}\,\mathcal{B}_{jz}(\xi). \qquad (4.46f)$$

Now Eqs. (4.46a) and (4.46b) can be written as (remember that $\xi = \hat{\Omega}_j\cdot\vec{r} - ct$)

$$\vec{E}^{(\mathrm{rad})}(\vec{r},t) = \sum_j \vec{\mathcal{E}}_j(\hat{\Omega}_j\cdot\vec{r} - ct) \qquad (4.46g)$$

and

$$\vec{B}^{(\mathrm{rad})}(\vec{r},t) = \sum_j \vec{\mathcal{B}}_j(\hat{\Omega}_j\cdot\vec{r} - ct). \qquad (4.46h)$$

Returning to the definitions of $\vec{E}_j$ and $\vec{B}_j$ in Eqs. (4.45a)–(4.45d), we see that

$$\vec{E}_j(-\sigma) = \vec{E}_j(\sigma)^* \qquad (4.47a)$$

and

$$\vec{B}_j(-\sigma) = \vec{B}_j(\sigma)^*. \qquad (4.47b)$$

This shows that $\vec{E}_j$ and $\vec{B}_j$ are Hermitian, and entry 7 in Table 2.1 of Chapter 2 requires the inverse Fourier transforms of Hermitian functions to be real. Consequently, because they are inverse Fourier transforms of Hermitian functions, each $\vec{\mathcal{E}}_j(\hat{\Omega}_j\cdot\vec{r}-ct)$ and $\vec{\mathcal{B}}_j(\hat{\Omega}_j\cdot\vec{r}-ct)$ vector function in (4.46g) and (4.46h) is real. Every $\vec{\mathcal{E}}_j(\hat{\Omega}_j\cdot\vec{r}-ct)$ and $\vec{\mathcal{B}}_j(\hat{\Omega}_j\cdot\vec{r}-ct)$ pair of vector functions can be thought of as the real electric and magnetic-induction fields of a single polychromatic plane wave traveling in direction $\hat{\Omega}_j$ at velocity c. Hence these two equations demonstrate that electromagnetic radiation fields in empty space can be represented as the sum of polychromatic plane waves traveling in a specified collection of different directions.
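This Hermitian-spectrum-gives-a-real-signal fact is exactly what numerical FFT routines exploit. The sketch below builds a random spectrum that satisfies Eq. (4.47a) on a discrete grid and confirms that its inverse discrete Fourier transform, standing in for $\vec{\mathcal{E}}_j(\xi)$, is real to machine precision (np.fft.irfft encodes the same convention):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 256

# Random complex samples for the positive-sigma bins; the negative-sigma
# bins are forced to be the mirrored complex conjugates, as in
# Eqs. (4.45a)-(4.45d). The DC and Nyquist bins are left at zero, which
# is trivially Hermitian.
pos = rng.normal(size=N - 1) + 1j * rng.normal(size=N - 1)
spectrum = np.zeros(2 * N, dtype=complex)
spectrum[1:N] = pos                       # sigma > 0 bins
spectrum[N + 1:] = np.conj(pos)[::-1]     # sigma < 0 bins: Hermitian mirror

# The inverse DFT plays the role of the inverse Fourier transform that
# produces the polychromatic plane-wave profile.
profile = np.fft.ifft(spectrum)
print(np.max(np.abs(profile.imag)))   # ~ 1e-17: effectively real
```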
From Eqs. (4.14c) and (4.14d), we know that $\vec{B}_{\ell j}\cdot\hat{\Omega}_j = 0$ and $\vec{E}_{\ell j}\cdot\hat{\Omega}_j = 0$. Taking the complex conjugate of these two relationships gives $\vec{B}_{\ell j}^{\,*}\cdot\hat{\Omega}_j = 0$ and $\vec{E}_{\ell j}^{\,*}\cdot\hat{\Omega}_j = 0$. We can now take the dot product of both sides of Eqs. (4.45a) and (4.45b) with $\hat{\Omega}_j$ to get

$$\vec{E}_j(\sigma)\cdot\hat{\Omega}_j = 0 \qquad (4.48a)$$

and the dot product of both sides of Eqs. (4.45c) and (4.45d) with $\hat{\Omega}_j$ to get

$$\vec{B}_j(\sigma)\cdot\hat{\Omega}_j = 0 \qquad (4.48b)$$

for all positive and negative values of σ. Taking the dot product with $\hat{\Omega}_j$ of both sides of Eqs. (4.46c) and (4.46d) gives

$$\vec{\mathcal{E}}_j(\xi)\cdot\hat{\Omega}_j = \int_{-\infty}^{\infty}\left[\vec{E}_j(\sigma)\cdot\hat{\Omega}_j\right]e^{2\pi i\sigma\xi}\,d\sigma$$

and

$$\vec{\mathcal{B}}_j(\xi)\cdot\hat{\Omega}_j = \int_{-\infty}^{\infty}\left[\vec{B}_j(\sigma)\cdot\hat{\Omega}_j\right]e^{2\pi i\sigma\xi}\,d\sigma$$

because $\hat{\Omega}_j$ is a constant unit vector. Substituting from Eqs. (4.48a) and (4.48b) and remembering that $\xi = \hat{\Omega}_j\cdot\vec{r} - ct$ now leads to


$$\vec{\mathcal{E}}_j(\hat{\Omega}_j\cdot\vec{r}-ct)\cdot\hat{\Omega}_j = 0 \qquad (4.49a)$$

and

$$\vec{\mathcal{B}}_j(\hat{\Omega}_j\cdot\vec{r}-ct)\cdot\hat{\Omega}_j = 0 \qquad (4.49b)$$

for any polychromatic plane wave $\vec{\mathcal{E}}_j(\hat{\Omega}_j\cdot\vec{r}-ct)$ and $\vec{\mathcal{B}}_j(\hat{\Omega}_j\cdot\vec{r}-ct)$. Consequently, the E and B fields of a polychromatic plane wave, just like the E and B fields of a monochromatic plane wave, are transverse to the wave's direction of propagation. From Eq. (4.22a) we note that, taking the complex conjugates of the original equality,

$$\vec{E}_{\ell j}\cdot\vec{B}_{\ell j} = \vec{E}_{\ell j}^{\,*}\cdot\vec{B}_{\ell j}^{\,*} = 0.$$

Hence from Eqs. (4.45a) and (4.45c) it follows that

$$\vec{E}_j(\sigma)\cdot\vec{B}_j(\sigma) = \frac{1}{4(\Delta\sigma_\ell)^2}\,\vec{E}_{\ell j}\cdot\vec{B}_{\ell j} = 0$$

for $\sigma > 0$ and

$$\vec{E}_j(\sigma)\cdot\vec{B}_j(\sigma) = \frac{1}{4(\Delta\sigma_\ell)^2}\,\vec{E}_{\ell j}^{\,*}\cdot\vec{B}_{\ell j}^{\,*} = 0$$

for $\sigma < 0$. We conclude, in the limit of decreasing $\Delta\sigma_\ell$ and increasing numbers of $\sigma_\ell$ values, that

$$\vec{E}_j(\sigma)\cdot\vec{B}_j(\sigma) = 0 \qquad (4.49c)$$

for all positive and negative values of σ. We divide both sides of Eq. (4.22b) by $4(\Delta\sigma_\ell)^2$ to get

$$\frac{1}{4(\Delta\sigma_\ell)^2}\,\vec{E}_{\ell j}\times\vec{B}_{\ell j}^{\,*} = \frac{1}{c}\left[\frac{1}{4(\Delta\sigma_\ell)^2}\left(\vec{E}_{\ell j}\cdot\vec{E}_{\ell j}^{\,*}\right)\right]\hat{\Omega}_j. \qquad (4.49d)$$

Consulting Eq. (4.45a), the complex conjugate of Eq. (4.45a), and the complex conjugate of Eq. (4.45c), we note that in the limit of decreasing $\Delta\sigma_\ell$ and increasing numbers of $\sigma_\ell$ it follows that

$$\vec{E}_j(\sigma)\times\vec{B}_j(\sigma)^* = \frac{1}{c}\left(\vec{E}_j(\sigma)\cdot\vec{E}_j(\sigma)^*\right)\hat{\Omega}_j \qquad (4.49e)$$

for $\sigma > 0$. For $\sigma < 0$ we have, using (4.45b) and the complex conjugate of (4.45d), that

$$\vec{E}_j(\sigma)\times\vec{B}_j(\sigma)^* = \frac{1}{4(\Delta\sigma_\ell)^2}\,\vec{E}_{\ell j}^{\,*}\times\vec{B}_{\ell j}.$$

Substituting this into the complex conjugate of (4.49d) gives

$$\vec{E}_j(\sigma)\times\vec{B}_j(\sigma)^* = \frac{1}{c}\left[\frac{1}{4(\Delta\sigma_\ell)^2}\left(\vec{E}_{\ell j}^{\,*}\cdot\vec{E}_{\ell j}\right)\right]\hat{\Omega}_j.$$

Remembering that $\sigma < 0$, we now use (4.45b) and the complex conjugate of (4.45b) to write, in the limit of decreasing $\Delta\sigma_\ell$ and increasing numbers of $\sigma_\ell$, that

$$\vec{E}_j(\sigma)\times\vec{B}_j(\sigma)^* = \frac{1}{c}\left(\vec{E}_j(\sigma)\cdot\vec{E}_j(\sigma)^*\right)\hat{\Omega}_j = \frac{1}{c}\left(\vec{E}_j(\sigma)^*\cdot\vec{E}_j(\sigma)\right)\hat{\Omega}_j.$$

Comparing the results for $\sigma > 0$ and $\sigma < 0$, we conclude that

$$\vec{E}_j(\sigma)\times\vec{B}_j(\sigma)^* = \frac{1}{c}\left(\vec{E}_j(\sigma)\cdot\vec{E}_j(\sigma)^*\right)\hat{\Omega}_j \qquad (4.49f)$$

holds true for all positive and negative values of σ. Glancing back at Eq. (4.47a), we see that this can also be written as

$$\vec{E}_j(\sigma)\times\vec{B}_j(\sigma)^* = \frac{1}{c}\left(\vec{E}_j(\sigma)\cdot\vec{E}_j(-\sigma)\right)\hat{\Omega}_j \qquad (4.49g)$$

for all positive and negative values of σ.
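These spectral-domain identities are easy to check numerically. The sketch below picks an arbitrary propagation direction and an arbitrary complex transverse $\vec{E}_j(\sigma)$ (both invented), constructs the matching $\vec{B}_j(\sigma)$ as $(1/c)\,\hat{\Omega}_j \times \vec{E}_j(\sigma)$, which is the monochromatic plane-wave relation carried over to the spectra for $\sigma > 0$, and then verifies Eqs. (4.48a), (4.48b), and (4.49f):

```python
import numpy as np

rng = np.random.default_rng(2)
c = 2.998e10   # speed of light, cm/s

# An arbitrary unit propagation vector with a positive z component.
omega = np.array([0.2, -0.3, 0.0])
omega[2] = np.sqrt(1.0 - omega[0]**2 - omega[1]**2)

# An arbitrary complex E_j(sigma), made transverse by projecting out
# its component along the propagation direction.
E = rng.normal(size=3) + 1j * rng.normal(size=3)
E -= np.dot(E, omega) * omega

# For a plane wave, B_j(sigma) = (1/c) * omega-hat x E_j(sigma).
B = np.cross(omega, E) / c

print(np.allclose(np.dot(E, omega), 0))   # Eq. (4.48a): True
print(np.allclose(np.dot(B, omega), 0))   # Eq. (4.48b): True
lhs = np.cross(E, np.conj(B))
rhs = (np.dot(E, np.conj(E)) / c) * omega
print(np.allclose(lhs, rhs))              # Eq. (4.49f): True
```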
4.8 Angle-Wavenumber Transforms
The next step is to convert the sums over j in Eqs. (4.46a) and (4.46b) into integrals. Remembering that the $\hat{\Omega}_j$ are defined in Eq. (4.12a) to be $\hat{\Omega}_j = \hat{x}\,\varepsilon_{jx} + \hat{y}\,\varepsilon_{jy} + \hat{z}\,\varepsilon_{jz}$, we require that $\varepsilon_{jz} > 0$. Now all the plane waves in Eqs. (4.46a) and (4.46b) are traveling more or less along the positive z axis of the Cartesian coordinate system—that is, the angle between $\hat{\Omega}_j$ and $\hat{z}$ is always less than $\pi/2$. We use

$$\bigl|\hat{\Omega}_j\bigr|^2 = 1 = \varepsilon_{jx}^2 + \varepsilon_{jy}^2 + \varepsilon_{jz}^2$$

[see Eq. (4.12c)] to write

$$\hat{\Omega}_j = \hat{x}\,\varepsilon_{jx} + \hat{y}\,\varepsilon_{jy} + \hat{z}\sqrt{1 - \varepsilon_{jx}^2 - \varepsilon_{jy}^2}. \qquad (4.50a)$$

This makes it clear that the two real parameters $\varepsilon_{jx}$ and $\varepsilon_{jy}$ specify the propagation direction $\hat{\Omega}_j$ of the jth plane wave. Consequently, each plane wave in the sums over j in Eqs. (4.46a) and (4.46b) can be specified by a single point in the $\varepsilon_x$, $\varepsilon_y$ plane. Figure 4.10 shows how this works for the sum of the five plane waves specified by the points $(\varepsilon_{x1}, \varepsilon_{y1})$, $(\varepsilon_{x2}, \varepsilon_{y2})$, $(\varepsilon_{x3}, \varepsilon_{y3})$, $(\varepsilon_{x4}, \varepsilon_{y4})$, and $(\varepsilon_{x5}, \varepsilon_{y5})$. We can construct a grid of $\varepsilon_x$, $\varepsilon_y$ values such that each plane wave is located at a node in the grid, where if necessary the grid lines are unevenly spaced as in Fig. 4.10. After numbering the grid lines, we can replace the single index j by a pair of indices m and n. The five plane waves in Fig. 4.10, for example, become

$$(\varepsilon_{x1}, \varepsilon_{y1}) \rightarrow (\varepsilon_{x2}, \varepsilon_{y4}), \qquad (\varepsilon_{x2}, \varepsilon_{y2}) \rightarrow (\varepsilon_{x5}, \varepsilon_{y1}),$$

$$(\varepsilon_{x3}, \varepsilon_{y3}) \rightarrow (\varepsilon_{x3}, \varepsilon_{y2}), \qquad (\varepsilon_{x4}, \varepsilon_{y4}) \rightarrow (\varepsilon_{x4}, \varepsilon_{y5}),$$

and

$$(\varepsilon_{x5}, \varepsilon_{y5}) \rightarrow (\varepsilon_{x1}, \varepsilon_{y3}).$$

Replacing index j by a pair of indices m and n lets us write the sums in Eqs. (4.46a) and (4.46b) as

$$\vec{E}^{(\mathrm{rad})}(\vec{r},t) = \sum_{n=-\infty}^{\infty}\sum_{m=-\infty}^{\infty}\int_{-\infty}^{\infty}\vec{E}_{nm}(\sigma)\,e^{2\pi i\sigma(\hat{\Omega}_{nm}\cdot\vec{r}-ct)}\,d\sigma \qquad (4.51a)$$

and

$$\vec{B}^{(\mathrm{rad})}(\vec{r},t) = \sum_{n=-\infty}^{\infty}\sum_{m=-\infty}^{\infty}\int_{-\infty}^{\infty}\vec{B}_{nm}(\sigma)\,e^{2\pi i\sigma(\hat{\Omega}_{nm}\cdot\vec{r}-ct)}\,d\sigma, \qquad (4.51b)$$

where we define $\vec{E}_{nm}(\sigma) = \vec{B}_{nm}(\sigma) = 0$ for those grid points that do not correspond to propagation directions specified in the original sums over j. The new set of $\hat{\Omega}_{nm}$ propagation vectors can be written as

$$\hat{\Omega}_{nm} = \hat{x}\,\varepsilon_{nx} + \hat{y}\,\varepsilon_{my} + \hat{z}\sqrt{1 - \varepsilon_{nx}^2 - \varepsilon_{my}^2}. \qquad (4.51c)$$


For each m and n propagation direction in Eqs. (4.51a) and (4.51b), we now define that

$$\vec{e}(\varepsilon_{nx}, \varepsilon_{my}, \sigma)\,\Delta\varepsilon_{nx}\,\Delta\varepsilon_{my} = \vec{E}_{nm}(\sigma) \qquad (4.52a)$$

and

$$\vec{b}(\varepsilon_{nx}, \varepsilon_{my}, \sigma)\,\Delta\varepsilon_{nx}\,\Delta\varepsilon_{my} = \vec{B}_{nm}(\sigma) \qquad (4.52b)$$

with

$$\Delta\varepsilon_{nx} = \varepsilon_{n+1,x} - \varepsilon_{n,x} \qquad (4.52c)$$

and

$$\Delta\varepsilon_{my} = \varepsilon_{m+1,y} - \varepsilon_{m,y}. \qquad (4.52d)$$
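The change of variables behind Eq. (4.51c) is straightforward to set up numerically. The sketch below uses an arbitrary, unevenly spaced grid of direction parameters and checks that every $\hat{\Omega}_{nm}$ it produces is a unit vector tilted less than π/2 from the z axis:

```python
import numpy as np

# Unevenly spaced grid lines in eps_x and eps_y (arbitrary values, chosen
# so that eps_x**2 + eps_y**2 < 1 everywhere on the grid).
eps_x = np.array([-0.30, -0.10, 0.05, 0.22])
eps_y = np.array([-0.25, 0.00, 0.15])

EX, EY = np.meshgrid(eps_x, eps_y, indexing="ij")
EZ = np.sqrt(1.0 - EX**2 - EY**2)          # z component from Eq. (4.51c)
omega = np.stack([EX, EY, EZ], axis=-1)    # omega[n, m] = unit vector

norms = np.linalg.norm(omega, axis=-1)
print(np.allclose(norms, 1.0))             # all unit vectors: True
print((EZ > 0).all())                      # all within pi/2 of the z axis
```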


FIGURE 4.10. [Five plane waves plotted as points $(\varepsilon_{x1},\varepsilon_{y1})$ through $(\varepsilon_{x5},\varepsilon_{y5})$ in the $\varepsilon_x$, $\varepsilon_y$ plane, each sitting on a node of an unevenly spaced grid.]

In the limit of decreasing $\Delta\varepsilon_{nx}$, $\Delta\varepsilon_{my}$ and increasing numbers of specified propagation directions per unit interval in $\varepsilon_x$ and $\varepsilon_y$, Eqs. (4.51a) and (4.51b) can be written as

$$\vec{E}^{(\mathrm{rad})}(\vec{r},t) = \int_{-\infty}^{\infty} d\sigma \iint\limits_{\varepsilon_x^2+\varepsilon_y^2<1} d\varepsilon_x\,d\varepsilon_y\;\vec{e}(\varepsilon_x, \varepsilon_y, \sigma)\,e^{2\pi i\sigma(\hat{\Omega}\cdot\vec{r}-ct)}$$