Interferometer DLC

Performance of Standard
Fourier-Transform Spectrometers
(or, more than you probably
wanted to know about Fourier
transforms, random-signal theory,
and Michelson interferometers)
by Douglas Cohen
Volume One, Chapters 1-4

Performance Analysis of Standard Fourier-Transform
Spectrometers

Copyright © 2007 by Douglas Cohen

PRINTED IN THE UNITED STATES OF AMERICA

All rights reserved. No part of this publication may be reproduced,
stored in a retrieval system, or transmitted in any form or by any
means, electronic, mechanical, photocopying, recording, or otherwise,
without the prior permission of the author.


To Sophie and Phoebe who do not know calculus,
and to Clara who does

CONTENTS
Preface ..........................................................................................................................................vii

1 Ether Wind, Spectral Lines, and Michelson Interferometers ...................................... 1

1.1 The First Michelson Interferometer........................................................................ 2
1.2 Historical Reasoning Behind the Ether Wind Experiment .................................. 14
1.3 Monochromatic Light and Spectral Lines ............................................................ 24
1.4 Applying the Michelson Interferometer to Spectral Lines................................... 24
1.5 Interference Equation for the Ideal Michelson Interferometer............................. 31
1.6 Fringe Patterns of Finite-Width Spectral Lines.................................................... 51
1.7 Fourier-Transform Spectrometers ....................................................................... 52
1.8 Laser-Based Control Systems............................................................................... 57

2 Fourier Theory................................................................................................................. 62

2.1 Basic Concept of a Fourier Transform ................................................................ 62
2.2 Fourier Sine and Cosine Transforms.................................................................... 67
2.3 Even, Odd, and Mixed Functions ........................................................................ 76
2.4 Extended Sine and Cosine Transforms................................................................. 80
2.5 Forward and Inverse Fourier Transforms............................................................. 89
2.6 Fourier Transform as a Linear Operator............................................................... 97
2.7 Mathematical Symmetries of the Fourier Transform .......................................... 99
2.8 Basic Fourier Identities....................................................................................... 103
2.9 Fourier Convolution Theorem............................................................................ 110
2.10 Fourier Transforms and Divergent Integrals ...................................................... 117
2.11 Generalized Functions ....................................................................................... 121
2.12 Generalized Limits ............................................................................................. 132
2.13 Fourier Transforms of Generalized Functions ................................................... 136
2.14 The Delta Function ............................................................................................ 144
2.15 Derivative of the Delta Function ....................................................................... 153
2.16 Fourier Transform of the Delta Function ........................................................... 157
2.17 Fourier Convolution Theorem with Generalized Functions............................... 159
2.18 The Shah Function.............................................................................................. 162
2.19 Fourier Transform of the Shah Function ........................................................... 165
2.20 Fourier Series...................................................................................................... 173
2.21 Discrete Fourier Transform ............................................................................... 181
2.22 Aliasing as an Error ........................................................................................... 188
2.23 Aliasing as a Tool ............................................................................................... 197

2.24 Sampling Theorem............................................................................................. 200
2.25 Fourier Transforms in Two and Three Dimensions........................................... 207
Table 2.1.............................................................................................................. 219
Table 2.2.............................................................................................................. 221

3 Random Variables, Random Functions, and Power Spectra ................................... 223

3.1 Random and Nonrandom Variables ................................................................... 223
3.2 Random and Nonrandom Functions................................................................... 224
3.3 Probability Density Distributions: Mean, Variance, Standard Deviation.......... 226
3.4 The Expectation Operator .................................................................................. 230
3.5 Independent and Dependent Random Variables ................................................ 233
3.6 Analyzing Independent Random Variables ...................................................... 233
3.7 Large Numbers of Random Variables ............................................................... 234
3.8 Single-Variable Means from Multivariable Distributions ................................. 235
3.9 Analyzing Dependent Random Variables.......................................................... 236
3.10 Linearity of the Expectation Operator .............................................................. 239
3.11 The Central Limit Theorem .............................................................................. 243
3.12 Averaging to Improve Experimental Accuracy ................................................. 247
3.13 Mean, Autocorrelation, Autocovariance of Random Functions of Time .......... 249
3.14 Ensembles ......................................................................................................... 251
3.15 Stationary Random Functions ............................................................................ 252
3.16 Gaussian Random Processes .............................................................................. 261
3.17 Products of Two, Three, and Four Jointly Normal Random Variables ............. 263
3.18 Ergodic Random Functions................................................................................ 272
3.19 Experimental Noise............................................................................................ 279
3.20 The Power Spectrum.......................................................................................... 280
3.21 Random Inputs and Outputs of Linear Systems ................................................ 282
3.22 The Sign of the Power Spectrum ...................................................................... 287
3.23 The Power Spectrum and Fourier Transforms of Random Functions .............. 289
3.24 The Multidimensional Wiener-Khinchin Theorem............................................ 297
3.25 Band-Limited White Noise ................................................................................ 299
3.26 Even and Odd Components of Random Functions ............................................ 302
3.27 Analyzing the Noise in Artificially Created Even Signals ............................... 319

4 From Maxwell's Equations to the Michelson Interferometer ................................. 330

4.1 Deriving the Electromagnetic Wave Equations ................................................. 330
4.2 Electromagnetic Plane Waves............................................................................ 335
4.3 Monochromatic Wave Trains............................................................................. 344
4.4 Linear Polarization of Monochromatic Plane Waves ....................................... 349

4.5 Transmitted Plane Waves .................................................................................. 353
4.6 Reflected Plane Waves ...................................................................................... 363
4.7 Polychromatic Wave Fields ............................................................................... 369
4.8 Angle-Wavenumber Transforms ........................................................................ 375
4.9 Beam-Chopped and Direction-Chopped Radiation ........................................... 383
4.10 Time-Chopped and Band-Limited Radiation .................................................... 390
4.11 Top-Level Description of a Standard Michelson Interferometer ....................... 394
4.12 Monochromatic Plane Waves and Michelson Interferometers .......................... 395
4.13 Multiple Plane Waves and Michelson Interferometers ...................................... 416
4.14 Energy Flux of Time-Chopped and Beam-Chopped Radiation Fields .............. 427
4.15 Energy Flux of the Balanced Radiation Fields .................................................. 438
4.16 Simplified Formulas for the Optical Power in the Balanced Signal .................. 454
4.17 Energy Flux in the Unbalanced Radiation Fields .............................................. 464
4.18 Simplified Formulas Describing Unbalanced Background Radiation ............... 483
Appendix 4A .................................................................................................................. 490
Appendix 4B ................................................................................................................... 499
Appendix 4C ................................................................................................................... 522
Appendix 4D ................................................................................................................... 528
Appendix 4E ................................................................................................................... 532
Appendix 4F ................................................................................................................... 551

5 Description of Practical Interferometer Measurements............................................ 555

5.1 Radiometric Description of Electromagnetic Fields ......................................... 555
5.2 Radiance Fields in Space.................................................................................... 566
5.3 Radiance, Brightness, and the Inverse-Square Law .......................................... 571
5.4 The Balanced Signal of a Michelson Interferometer.......................................... 573
5.5 The Unbalanced Signal of a Michelson Interferometer .................................... 585
5.6 The Off-Axis Signal of a Michelson Interferometer ......................................... 588
5.7 The Standard Michelson Interferometer with Central Detector ......................... 599
5.8 The Fore and Aft Optics .................................................................................... 605
5.9 The Detector Signal ............................................................................................ 611
5.10 The Detector Circuit .......................................................................................... 617
5.11 The Effective Spectrum...................................................................................... 622
5.12 Symmetries of the Interferogram Signal and Effective Spectrum...................... 624
5.13 Background Radiation Inside a Standard Michelson Interferometer ................ 626
5.14 Removing the Background Spectra ................................................................... 640
5.15 Double-Sided Interferograms ............................................................................. 643
5.16 Apodization of Spectra ...................................................................................... 650
5.17 The Effect of a Finite Field of View................................................................... 656
5.18 Single-Sided Interferograms............................................................................... 667

5.19 Calibration ......................................................................................................... 682
5.20 Nonflat Optical Surfaces ................................................................................... 686
5.21 An Example of How to Analyze Nonflat Optical Surfaces .............................. 692
5.22 Sampling the Interferogram Signal .................................................................... 696
5.23 Setting Up the Discrete Fourier Transform of the Sampled Signal ................... 699
5.24 Oversampling the Interferogram........................................................................ 704
5.25 Undersampling the Interferogram...................................................................... 715
5.26 Off-Center Sampling of the Interferogram Signal ............................................. 723
Appendix 5A ................................................................................................................. 727
Appendix 5B ................................................................................................................. 731
Appendix 5C ................................................................................................................. 738

6 NEdN and Detector Noise............................................................................................. 742

6.1 Definition of NEdN............................................................................................ 742
6.2 Signal from the Spectral Radiance..................................................................... 748
6.3 Signal from the Background Radiance .............................................................. 752
6.4 Inverse Fourier Transform of the Background Radiance................................... 753
6.5 Background Radiance, Total Error, and Signal Noise ....................................... 759
6.6 Detector Noise.................................................................................................... 763
6.7 1/f Noise in Detectors......................................................................................... 764
6.8 Avoidable and Unavoidable Noise in Double-Sided Signals ............................ 767
6.9 Passing the Detector Noise Through the Detector Circuit ................................. 769
6.10 Total Detector Noise in Double-Sided Signals .................................................. 772
6.11 Measuring the Noise-Contaminated Spectrum.................................................. 782
6.12 Characterizing the Detector Noise ..................................................................... 792
6.13 Detector Noise with a Band-Limited, White-Noise Power Spectrum............... 795
6.14 An Example of Simulated Detector Noise in a Double-Sided Signal................ 800
6.15 Photon Noise in Detectors.................................................................................. 806
6.16 Detector-Noise NEdN in Double-Sided Signals................................................ 814
6.17 Real and Imaginary Parts of the Detector Noise................................................ 820
6.18 Detector Noise in a Single-Sided Signal............................................................ 821
6.19 Uncalibrated Spectra of Single-Sided Signals with Detector Noise ................. 829
6.20 Calibrated Spectra of Single-Sided Signals with Detector Noise...................... 840
6.21 Detector-Noise NEdN in a Single-Sided Signal ................................................ 844
6.22 Detector Circuit as an Anti-Aliasing Filter ........................................................ 849
Appendix 6A ................................................................................................................. 857
Appendix 6B ................................................................................................................. 861

7 Mirror-Misalignment NEdN in Double-Sided Interferograms................................. 865

7.1 Setting Up the Signal Equations......................................................................... 865
7.2 Specifying the Random Misalignment Angle of the Moving Mirror................. 867
7.3 -Based Signal Contaminated by Misalignment Noise ...................................... 873
7.4 Misalignment Noise and the Detector Circuit (or Anti-Aliasing Filter) ............ 879
7.5 Misalignment Noise in Uncalibrated Spectra of Double-Sided Signals ............ 882
7.6 Calibrated Spectra Contaminated by Misalignment Noise ................................ 891
7.7 Avoidable and Unavoidable Mirror-Misalignment Noise in -Based Signals... 895
7.8 Avoidable and Unavoidable Mirror-Misalignment Noise in the Signal Spectrum ............ 898
7.9 Power Spectrum of $\tilde{n}^{(2)}$ .................................................................... 903
7.10 Calculating the Variance of L ........................................................................ 905
7.11 Formula for the Misalignment NEdN of Double-Sided Signals ........................ 909
7.12 Connection Between $p^{(2)}_{\tilde{n}\tilde{n}}$ Power Spectrum and the Power Spectra of $\tilde{x}$, $\tilde{y}$ ............ 911
7.13 The Shape of the $p^{(2)}_{\tilde{n}\tilde{n}}$ Power Spectrum............................................................ 921
7.14 The Size of the $p^{(2)}_{\tilde{n}\tilde{n}}$ Power Spectrum............................................................... 927
7.15 Simulated Misalignment Noise .......................................................................... 929
Appendix 7A .................................................................................................................. 945
Appendix 7B .................................................................................................................. 948

8 The Sampling-Error NEdN in Double-Sided Interferograms................................... 953

8.1 Noise-Free Signal at the A/D Converter.............................................................. 953
8.2 Sampling Noise at the A/D Converter ................................................................. 954
8.3 Power Spectrum and Autocorrelation Function of the Sampling Noise ............ 956
8.4 Uncalibrated Spectral Signal .............................................................................. 959
8.5 Calibrating the Spectral Signal Contaminated by Sampling Noise.................... 964
8.6 Random Sampling Error in the Measured Spectrum.......................................... 969
8.7 Calculating the NEdN from the Random Sampling Error.................................. 972
8.8 Black-Body Spectrum Contaminated by Sampling Noise ................................. 986
8.9 Sampling Noise and an Isolated Lorentz Emission Line.................................... 996
8.10 Error from Quasi-Static Sampling Noise.......................................................... 1007
8.11 Comparing the Sampling-Error, Misalignment, and Detector NEdNs............. 1024

Bibliography ............................................................................................................................. 1039


PREFACE
Over the past three or four decades, Fourier-transform spectrometers based on Michelson
interferometers have become an ever more popular way to measure spectral radiance, especially
in the infrared region of the electromagnetic spectrum. The equations and formulas used to
characterize the performance of these instruments (how accurate they are and in what ways they
distort measured spectra) are usually presented in a very approximate form. It is easy to
understand why this is so: optical imperfections and random disturbances have to interact with
the Fourier transform before they affect the spectral measurement. Although engineering intuition
and simple statistics are often all that is needed to evaluate even the most complicated measuring
system, here they are not enough.
Fortunately the problem is not inherently very difficult, although the knowledge needed to
handle it is spread over the fields of optics, Fourier transforms, and random-signal theory. This
book, after briefly outlining the historical development of the Michelson interferometer, starts off
with an overview of both random signal theory and Fourier transform analysis. Maxwell's
equations are then used to introduce the optical concepts required to understand Michelson
interferometers, leading to formulas for the balanced, unbalanced, and off-axis signals. This
analysis includes the effects of misaligned optics, polarized radiation, and nonuniform fields of
view; the formulas derived here contain all the information needed to construct professional-
quality computer simulations of these instruments. The typical distortions present in Fourier-
transform measurements are thoroughly analyzed, and there are detailed explanations of the
random measurement errors due to imperfect detectors, unsteady optical alignment, background
radiation, and mistakes in sampling the signal.
Many times optical engineers and scientists interested in evaluating the performance of
Fourier-transform spectrometers are faced with an unappealing choice between equations that are
too simple-minded and computer simulations that are too complicated and specific. The
convolution-based formulas presented here occupy the middle ground between these extremes:
sophisticated enough to give accurate, dependable answers and simple enough to be evaluated
without much trouble. All derivations are explained at length, making it easy to adapt them to the
nonstandard types of Michelson interferometers not covered here. By the end of the book, the
reader knows how to analyze nonideal Fourier-transform spectrometers operating in an imperfect
world.
1
ETHER WIND, SPECTRAL LINES, AND
MICHELSON INTERFEROMETERS
The Michelson interferometer is named after Albert Abraham Michelson, who designed and built
it in 1881 to detect the ether wind caused by the Earth's orbital motion. Michelson's attempt
failed; his interferometer, sensitive enough to detect stamping feet 100 meters away,¹ could not
detect the Earth's orbital motion. So important and difficult to explain was this result that
Michelson and Edward Morley repeated the experiment with a larger and more sensitive
interferometer in 1887. This second attempt, which is today called the Michelson-Morley
experiment, also yielded a negative result: the Earth's motion could not be detected. The
Michelson-Morley experiment is one of the most important negative findings of 19th-century
science; it encouraged physics to discard the idea of a luminiferous ether and prepared the way
for Einstein's relativity theories at the beginning of the 20th century.
The idea of a luminiferous ether, a plenum pervading both (transparent) matter and empty
space, had been widely accepted ever since Young and Fresnel established around 1820 that
light behaved like a transverse vibration or wavefield as it propagated past obstacles. There were
recognized difficulties with the concept; for example, the ether provided no detectable resistance
to the motion of material bodies yet was elastic enough to transmit light vibrations without
measurable energy loss. In the 1820s and '30s, Poisson, Cauchy, and Green, famous
mathematical scientists, derived equations of motion for transverse waves in an elastic medium,
but when these equations were applied to the already known behavior of light, the results were at
best mixed.² In 1867 James Clerk Maxwell modified the formulas describing the interdependent
behavior of electric and magnetic fields to make them a self-consistent set of equations; he
believed himself to be constructing a mechanical analogy for the ether. After showing that the
new set of equations predicted transverse electromagnetic waves traveling at the speed of light,
Maxwell not only asserted that light was a propagating electromagnetic disturbance, but he also
used his discovery to connect electric and magnetic properties to the behavior of the luminiferous
ether. It was not until 1888 that Hertz demonstrated experimentally that propagating
electromagnetic disturbances actually exist; and the optical community itself did not
acknowledge until 1896, with the discovery of the Lorentz-Zeeman effect, that light had to be
such a propagating electromagnetic wavefield.³ So the ether concept was not only alive and well
at the time of Michelson's experiments, but it could also be said, with the growing acceptance of
Maxwell's equations to describe the behavior of the luminiferous ether, that it had never been
healthier.

¹ A. Michelson, "The Relative Motion of the Earth and the Luminiferous Ether," American Journal of Science 22,
Series 2 (1881), pp. 120–129.
² E. Whittaker, A History of the Theories of Aether and Electricity, Vol. I, The Classical Theories (Thomas Nelson &
Sons, Ltd., New York, 1951), pp. 129–142.
³ D. Goldstein, Polarized Light, 2nd ed. (Marcel Dekker, Inc., New York, 2003), p. 298.

1.1 The First Michelson Interferometer
Figure 1.1(a) is a drawing of the instrument Michelson described in his 1881 paper, and Fig.
1.1(b) shows how the interferometer works. Incident light enters from the left, as shown by the
dark solid arrow, and hits a glass plate whose back is a partly reflecting, partly transmitting
surface. Ideally, half the incident light is transmitted through to mirror C and half is reflected up
to mirror D. Mirrors C and D then return the light to the beam splitter, as shown by the dashed
arrows. At the beam splitter, the light is again half transmitted and half reflected to send two
equal-intensity beams into the observer's telescope. The light that is first transmitted and then
reflected at the beam splitter is called beam TR, and the light that is first reflected and then
transmitted at the beam splitter is called beam RT. These beams are drawn as two side-by-side
dotted arrows, but in reality they should be thought of as lying one on top of the other, filling the
same volume of space as they travel from the beam splitter to the telescope.
Michelson, thinking then in terms of 19th-century optical theory, would have regarded light as
transverse and elastic vibrations in the ether. The ether's plane of vibration might be horizontal,
as shown in Fig. 1.2(a), or vertical, as shown in Fig. 1.2(b). It was assumed, in fact, that the ether
could undergo transverse vibrations in any plane at all (horizontal, vertical, or something in
between, as shown in Fig. 1.2(c)), although not all at the same time. At any given point in the
light beam, there could be only one plane of vibration, with different colors of light characterized
by different wavelengths of vibration. If a snapshot of a light beam could be taken, the plane of
vibration could well be changing along its length, as shown in Fig. 1.3(a). At some slightly later
time, the snapshot would show the same configuration advanced in the direction of propagation,
as shown in Fig. 1.3(b). White light, then as now, was taken to be a composite beam consisting of
many different wavelengths simultaneously traveling in the same direction. Different colors of
light correspond to disturbances of different wavelengths. Combining or adding together many
different-colored disturbances produces a total transverse vibration having no particular or unique
wavelength and with the plane of vibration free to change in an irregular fashion along the length
of the beam, as shown in Fig. 1.3(c). The situation depicted in Figs. 1.3(a)–1.3(c) is actually very
close to the physical models used today to explain the behavior of light; all we need to do is
accept Maxwell's equations (but not Maxwell's ether) and say that the sinusoidal curves in
Figs. 1.3(a)–1.3(c) describe the changing length and orientation of the tip of the wavefield's
oscillating electric or magnetic field vectors.⁴

FIGURE 1.1(a). The first Michelson interferometer.

Suppose length a in Fig. 1.1(b) is adjusted until the distance from mirror C to the beam splitter
is exactly the same as the distance from mirror D to the beam splitter. When monochromatic
light (that is, light having a unique wavelength) enters the interferometer as shown in Figs.
1.4(a) and 1.4(b), then the beams reflected from C and D recombine when leaving the
interferometer in such a way that their planes of vibration, as well as their state of oscillation,
exactly match. Since the planes of vibration match, we can disregard the planes' orientation and
just add together the two beams' sinusoidal curves. Figure 1.5(a) shows that if the RT and TR
beams line up exactly, as they must when the distances from mirrors C and D to the beam
splitter are equal, then the summed oscillation is a maximum because the two wavefields are in
phase. If the distances from mirrors C and D to the beam splitter are unequal, then beams RT and
TR shift with respect to each other, as shown in Figs. 1.5(b)–1.5(e). The two beams can be out of
phase by any fraction of a wavelength, depending on the amount of inequality in the two distances.

⁴ See, for example, the discussion in Secs. 4.2 through 4.4 of Chapter 4. Figures 1.2(a) and 1.2(b) can be profitably
compared to Figs. 4.5 and 4.6 in Chapter 4.

FIGURE 1.1(b). [Schematic of how the interferometer works. Labels: incident light; beam splitter with partially reflective surface; compensator plate; mirror C; mirror D; length a; observing telescope; beam RT (first reflected, then transmitted at beam splitter); beam TR (first transmitted, then reflected at beam splitter).]

FIGURE 1.2(a).
FIGURE 1.2(b).
[Labels in Figs. 1.2(a) and 1.2(b): plane perpendicular to direction of propagation; direction of propagation; vibrations of transverse wavefield; cut in wavefield.]

FIGURE 1.2(c). [Label: three different planes of vibration.]
FIGURE 1.3(a).
FIGURE 1.3(b).
FIGURE 1.3(c).
[Labels in Figs. 1.3(a)–1.3(c): propagation direction; vibration wavelength; white light, no unique wavelength.]

The closer this fraction is to one-half, the smaller the summed oscillation; and if they are out of
phase by exactly a half-wavelength, then their sum is zero and the combined beam disappears.
When one beam is shifted against the other by exactly one wavelength, and the planes of
vibration still match, then once again the monochromatic RT and TR beams are in phase and
produce a bright combined oscillation.⁵ There seems to be a real possibility that a
monochromatic beam cannot be used to confirm that mirrors C and D are the same distance from
the beam splitter because the recombined exit beam may look the same as it does when no shift at
all exists if one wavefield is shifted against the other by one, two, etc., wavelengths.
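
The dependence of the summed oscillation on the shift between the two beams can be illustrated with a short numerical sketch. It assumes, as an idealization not spelled out in the text, that beams RT and TR are equal-amplitude sinusoids with matching planes of vibration; the function name `summed_amplitude` is purely illustrative. Under that assumption the identity sin(x) + sin(x + φ) = 2 cos(φ/2) sin(x + φ/2) gives the peak amplitude of the combined beam.

```python
import numpy as np

def summed_amplitude(shift_in_wavelengths, amplitude=1.0):
    """Peak amplitude of the sum of two equal sinusoids, one shifted against
    the other by the given number (or fraction) of wavelengths."""
    # A shift of one full wavelength corresponds to a phase difference of 2*pi.
    phase = 2.0 * np.pi * np.asarray(shift_in_wavelengths, dtype=float)
    # A*sin(x) + A*sin(x + phase) = 2A*cos(phase/2)*sin(x + phase/2)
    return 2.0 * amplitude * np.abs(np.cos(phase / 2.0))

shifts = [0.0, 0.25, 0.5, 1.0, 2.0]
amps = summed_amplitude(shifts)
for s, a in zip(shifts, amps):
    print(f"shift = {s:4.2f} wavelengths -> summed amplitude = {a:.3f}")
```

In this sketch, shifts of zero, one, and two wavelengths all give the same maximum amplitude, which is the ambiguity described above, while a half-wavelength shift makes the combined beam vanish.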
Suppose two monochromatic beams with two different wavelengths are sent through the
interferometer at the same time. If the distances from mirrors C and D to the beam splitter are
equal, then both the monochromatic beams, even though they have different wavelengths, must
be in phase when leaving the interferometer, producing a maximally bright oscillation in the
recombined exit beam. When the distances to the beam splitter are not exactly equal, however,
one of the monochromatic beams may end up shifted against itself by one, two, etc., wavelengths,
but there is no reason for the other beam to be shifted against itself the same way. When three
monochromatic beams are sent through the interferometer while the distances to the beam splitter
are not equal, matching all three wavetrains becomes even more unlikely. Hence, if we pass
white light containing innumerable distinct monochromatic wavetrains through the instrument,
then the RT and TR beams will recombine to produce a maximally bright output beam if and only
if the distances from mirrors C and D to the beam splitter are equal.
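The white-light argument above can be checked numerically. The following is a minimal sketch under assumptions that are not in the text: an idealized scalar model in which the two copies of each monochromatic component recombine with relative brightness cos^2(pi d / wavelength), where d is the extra out-and-back path in one arm, and a 400 nm to 700 nm band standing in for white light. The function name is illustrative only.

```python
import math

def combined_brightness(path_difference_nm, wavelengths_nm):
    """Average relative brightness of the recombined beam (1.0 = every component in phase).

    Assumed model: each monochromatic component contributes cos^2(pi * d / wavelength),
    which equals 1 when its two copies are shifted by a whole number of wavelengths.
    """
    terms = [math.cos(math.pi * path_difference_nm / lam) ** 2 for lam in wavelengths_nm]
    return sum(terms) / len(terms)

single_line = [600.0]                                 # one monochromatic beam, in nm
white_light = [400.0 + 2.0 * k for k in range(151)]   # 400 nm to 700 nm in 2 nm steps

for d in (0.0, 600.0, 1200.0, 6000.0):                # out-and-back path differences in nm
    mono = combined_brightness(d, single_line)
    white = combined_brightness(d, white_light)
    print(f"d = {d:7.1f} nm   single line: {mono:.3f}   white light: {white:.3f}")
```

In this model the single line returns to full brightness at every whole-wavelength shift, so it cannot distinguish those shifts from zero, whereas the white-light average reaches full brightness only at d = 0.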
To make the white-light beam work as intended, the interferometer needs a glass compensator
plate between mirror C and the beam splitter [see Fig. 1.1(b)]. The compensator plate must be the
same thickness and orientation, and made from the same type of glass, as the glass in front of
the beam splitter's partially reflecting surface. Figure 1.6(a) shows how light waves reflect from
mirrors C and D; the wavelength does not change while reflecting. In Fig. 1.6(b), however, light
waves inside the glass are somewhat shorter than they are outside the glass; the wavelength of the
light with respect to the glass thickness is greatly exaggerated to show this effect.
Therefore, a given distance traveled inside the glass corresponds to more wavelengths of a
monochromatic beam than the same distance in empty space. Moreover, different colors or
wavelengths of light shrink by different amounts, and this effect was a familiar one to 19th-
century optical scientists. If the compensator plate is not present, then the RT beam in Fig. 1.1(b)
passes through the glass in the beam splitter three times, whereas the TR beam passes through the
beam-splitter glass only once. The RT beam thus contains more wavelengths than the TR beam
even though the distances between the mirrors and the beam splitter are equal. With the
compensator plate present, however, both the TR and the RT beams pass through three glass
thicknesses.
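A rough numerical sketch may help show why the uncompensated glass matters. Everything numeric here is an assumption for illustration and not from the text: a 5 mm glass path per pass, rounded refractive indices typical of a crown glass, and straight-through passes (the oblique path through a tilted plate is ignored). The wavelength inside the glass is taken to be the vacuum wavelength divided by the refractive index.

```python
def wavelengths_in_glass(thickness_m, vacuum_wavelength_m, refractive_index):
    """Number of wavelengths that fit in the glass; the wavelength inside is lambda / n."""
    return thickness_m * refractive_index / vacuum_wavelength_m

# Illustrative values only (assumptions): glass path per pass and rounded indices.
thickness = 5.0e-3
colors = {
    "blue (486 nm)": (486e-9, 1.522),
    "yellow (589 nm)": (589e-9, 1.517),
    "red (656 nm)": (656e-9, 1.514),
}

for name, (lam, n) in colors.items():
    in_glass = wavelengths_in_glass(thickness, lam, n)
    in_space = thickness / lam
    # Without the compensator plate, the RT beam makes two more glass passes than the TR beam.
    rt_excess = 2.0 * (in_glass - in_space)
    print(f"{name}: {in_glass:.1f} wavelengths in glass, {in_space:.1f} in empty space, "
          f"RT excess without compensator = {rt_excess:.1f}")
```

Because the excess differs from color to color, no single mirror position can cancel it for every wavelength at once, which is the reason the compensator plate is needed for white light.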

5
In fact, we now know that a strictly monochromatic beam of light must have matching planes of vibration when
shifted against itself by exactly one, two, etc., wavelengths.

FIGURE 1.4(a). Figure 1.4(a) shows a segment of radiation entering the interferometer and Fig. 1.4(b)
shows what that segment becomes when it leaves the interferometer if the distance it travels up and back
each interferometer arm is the same.

[Fig. 1.4(a) label: before passing through the interferometer.]

FIGURE 1.4(b). [Labels: after leaving the interferometer; Beam RT; Beam TR.]

FIGURES 1.5(a) through 1.5(e). [Each panel plots Beam TR, Beam RT, and their Total. Panel titles: (a) In Phase; (b) Out-of-Phase by a Quarter Wavelength; (c) Out-of-Phase by a Half Wavelength; (d) Out-of-Phase by Three-Quarters Wavelength; (e) In Phase.]

FIGURES 1.6(a) and 1.6(b). [Labels: Incident Wavefield; Reflected Wavefield; Transmitted Wavefield; Glass Substrate; Beamsplitting Film.]
Now each monochromatic component has its own unique number of wavelengths in each arm
of the interferometer; thus, the blue-light component in one arm has the same number of
wavelengths as the blue-light component in the other arm, the red-light component in one arm
has the same number of wavelengths as the red-light component in the other arm, and the same
can be said about all the other colors in the white-light beam.
Michelson wanted to do more than just make the distances traveled by light going back and
forth between the C, D mirrors and the beam splitter equal; he also wanted to see how the
distances traveled by the light beams changed when he rotated the interferometer on its stand [see
Fig. 1.1(a)]. Up to now, we have assumed that mirrors C and D are exactly perpendicular to the
line of sight between their centers and the beam splitter, but nothing stops us from tilting one of
them a very slight amount, as shown in Fig. 1.7. The degree of tilt is, of course, greatly
exaggerated to show what is happening. When the tilt is imposed after the distances of mirrors C
and D to the beam splitter have been made equal, the center line of the tilted mirror remains at the
same distance from the beam splitter as it was before the tilt occurred. If the tilt is so small that
the slight change in direction of the beam can be disregarded, then that part of the beam reflecting
off the mirror's center line still recombines with light from the other mirror in such a way as to
produce the maximally bright oscillation already discussed above. The off-center parts of the
recombined beam are, of course, dimmer because the off-center parts of the tilted mirror no
longer match up properly to the untilted mirror.
6
An observer looking through the telescope
shown in Figs. 1.1(a) and 1.1(b) sees a bright central band, called a fringe, corresponding to the
central strip lying along the center line of the tilted mirror, with dark and less bright bands or
fringes on either side. If the distance that the light travels between the tilted mirror and the beam
splitter changes slightly, we expect the central fringe to shift as a strip on one side or the other of
the tilted mirror, instead of its center line, comes to match the distance traveled by the light in the
other arm of the interferometer. It is exactly this sort of fringe shift that Michelson hoped to see when
he rotated the interferometer on its stand, changing the direction in space of the light going up
and back the arms of the interferometer.
One last point we need to make is that many beam splitters of the type shown in Fig. 1.1(b)
reflect differently from the glass side and the nonglass side of the partially reflecting surface,
reversing the direction of vibration in the TR beam reflecting off the nonglass side and not
reversing it in the RT beam reflecting off the glass side.
7

Figure 1.5(c) shows that reversing the direction of vibration is the same as changing the phase
of the beam by one half-wavelength or 180°, so the phenomenon is often referred to as a 180°
phase shift on reflection. Michelson used this sort of phase-shifting beam splitter, so the RT and
TR beams in his interferometer did not match up the way they are shown in Fig. 1.4(b) when the
distances of mirrors C and D from the beam splitter are equal but instead match up as shown in

6
See Secs. 5.20 and 5.21 in Chapter 5 for a more detailed discussion of how to analyze a tilted mirror.
7
F. Jenkins and H. White, Fundamentals of Optics, 3rd ed. (McGraw-Hill Book Company, New York, 1957), p.
251.
FIGURE 1.7. [Labels: Centerline of Tilted Mirror; Line of Sight to Beam Splitter; Angle of Tilt. Note: The angle of tilt is greatly exaggerated in this diagram.]
Fig. 1.8. Now the central fringe coming from the center line of the tilted mirror is dark because
all the monochromatic components of the two beams cancel out rather than add together. When
Michelson sent white light through his interferometer, he thus saw a central dark fringe with
parallel multicolored fringes on either side. The colored fringes come from the off-center strips of
the tilted mirror where one or another monochromatic wavetrain is shifted against itself by
exactly one, two, etc., wavelengths, increasing the amplitude of its oscillation with respect to the
wavetrains of other colors inside the recombined beam. In this setup, the central dark fringe is
unique, making it easy for Michelson to see how its position changes as the interferometer is
rotated.
1.2 Historical Reasoning Behind the Ether-Wind Experiment
Physical theory has changed a great deal since 1881, but it is still relatively easy to understand
the reasoning behind Michelson's experiment. As soon as light is taken to be a wavefield in a
medium at rest, such as waves on the surface of water, and the Earth's motion through space is
regarded as carrying the interferometer through the medium, everything falls into place.
The first point worth mentioning is that the velocity at the equator due to the Earth's daily
rotation is 0.46 km/sec, much less than the Earth's orbital velocity around the sun of 29.67
km/sec. Consequently, the rotational velocity of Michelson's laboratory, well north of the
equator, was only about 1% of the orbital velocity, and Michelson did not have to pay any
attention to it. The interferometer in Fig. 1.1(a) can be rotated on its stand, so at noon and
midnight, Michelson could always arrange for one arm to be aligned with the Earth's orbital
velocity. Figures 1.9(a) and 1.9(b) show light traveling along the arms of a Michelson
interferometer when the interferometer is viewed as moving with a velocity v through a stationary
medium (that is, a luminiferous ether) and one of the arms is aligned with v. To keep life
simple, we have dropped the compensator plate from the two diagrams. Figure 1.9(a) shows light
traveling out and back along the arm aligned with v, with the interferometer rotated so that this is
the arm holding mirror C in Fig. 1.1(b). Figure 1.9(b) shows light traveling out and back along
the arm holding mirror D in Fig. 1.1(b). The positions of mirrors C and D are adjusted so that
each one is the same distance a from the beam splitter.
Figure 1.9(a) shows the beam splitter at three different positions as a single crest of the light's
wavefield moves through the interferometer: when the wavecrest first enters the arm of the
interferometer, when the wavecrest reflects off mirror C, and when the wavecrest returns to the
beam splitter for the second time. Mirror C is shown at the same three times: when the
wavecrest enters the arm, when it reflects off C, and when it returns to the beam splitter. The
velocity of the wavecrest with respect to the ether is c, and time $t_1$ elapses as the wavecrest goes
from the beam splitter to mirror C. Hence, the wavecrest covers a distance $a + v t_1$ in the
stationary ether while traveling at velocity c, with

$$a + v t_1 = c t_1 . \tag{1.1a}$$

FIGURE 1.8. [Labels: Beam TR; Beam RT.]

FIGURE 1.9(a). [Labels: Direction of Earth's Motion; Positions of Mirror C; Positions of the Beam Splitter; Incident Light; To Telescope; distances a, $v t_1$, and $v t_2$.]

FIGURE 1.9(b). [Labels: Mirror D; Direction of Earth's Motion; Positions of the Beam Splitter; Incident Light; To Telescope; distances a and $v t_3$.]

Time $t_2$ elapses while the wavecrest returns from mirror C to the beam splitter, and similar
reasoning shows that

$$a - v t_2 = c t_2 . \tag{1.1b}$$

Solving for $t_1$ and $t_2$ in Eqs. (1.1a) and (1.1b) gives

$$t_1 = \frac{a}{c - v} \quad \text{and} \quad t_2 = \frac{a}{c + v} .$$

The wavecrest spends time

$$t_1 + t_2 = \frac{a}{c - v} + \frac{a}{c + v} = \frac{2ac}{c^2 - v^2}$$

going out to mirror C and back to the beam splitter, and it does so while traveling at velocity c, so
it covers a total distance

$$c(t_1 + t_2) = \frac{2ac^2}{c^2 - v^2} . \tag{1.1c}$$
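As a quick sanity check on Eqs. (1.1a) through (1.1c), the sketch below solves for the two travel times and confirms that they satisfy the original relations. The numbers (a 1 m arm, c rounded to 3 x 10^8 m/s, v equal to 10^-4 of c) and the function name are assumptions chosen only for illustration.

```python
import math

def parallel_arm(a, v, c):
    """Out-and-back travel of a wavecrest along the arm aligned with the velocity v."""
    t1 = a / (c - v)          # outbound: solves a + v*t1 = c*t1, Eq. (1.1a)
    t2 = a / (c + v)          # return:   solves a - v*t2 = c*t2, Eq. (1.1b)
    distance = c * (t1 + t2)  # total distance covered in the ether, Eq. (1.1c)
    return t1, t2, distance

# Illustrative numbers (assumptions): a 1 m arm, c = 3e8 m/s, v = 1e-4 * c.
a, c = 1.0, 3.0e8
v = 1.0e-4 * c
t1, t2, distance = parallel_arm(a, v, c)

# The solved times satisfy the two relations they came from ...
assert math.isclose(a + v * t1, c * t1, rel_tol=1e-12)
assert math.isclose(a - v * t2, c * t2, rel_tol=1e-12)
# ... and the summed distance matches the closed form 2ac^2 / (c^2 - v^2).
assert math.isclose(distance, 2 * a * c**2 / (c**2 - v**2), rel_tol=1e-12)
print(f"t1 = {t1:.6e} s, t2 = {t2:.6e} s, distance = {distance:.10f} m")
```

The outbound leg takes slightly longer than the return leg, and the total distance comes out slightly greater than 2a.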

Figure 1.9(a) also shows the wavecrest traveling at an angle, instead of straight down, after it
reflects off the beam splitter when leaving the interferometer's arm. This allows it to head toward
where the observing telescope will be by the time the wavecrest reaches it; there is thus no
danger of the telescope missing the wavecrest because it has moved out of position. Figures
1.10(a) and 1.10(b) show why this happens. Figure 1.10(a) shows a single wavecrest reflecting
off a 45° stationary mirror. The large dots indicate where the corner of the reflecting wavecrest
is now and has been in the past as it reflects from the stationary mirror. The reflected wavecrest
travels upward at 90° from its original direction, as expected. Figure 1.10(b) shows what happens
when the same type of wavecrest reflects off a moving 45° mirror. The four thin solid lines show
the positions of the mirror at four equally spaced instants in time, and the large dots again show
where the corner of the reflecting wavecrest is at these times. Connecting these dots with a thick
dashed line, we see that the wavecrest feels an effective stationary mirror that is slanted at an
angle somewhat greater than 45°. This means the reflected wavecrest does not travel straight up
as in Fig. 1.10(a) but instead moves a little off to the right.
Figure 1.9(b) shows how the wavecrest travels up and back the interferometer arm
perpendicular to velocity v. In time $t_3$, the wavecrest travels a distance $\sqrt{a^2 + v^2 t_3^2}$ from the beam
splitter to mirror D; and, because it does this at velocity c, we must have

$$c t_3 = \sqrt{a^2 + v^2 t_3^2}$$

or

$$t_3 = \frac{a}{\sqrt{c^2 - v^2}} .$$
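
The algebra connecting the two expressions for the perpendicular-arm time can be written out explicitly. As a short sketch, squaring both sides of the first relation and collecting the terms in the travel time gives:

```latex
\begin{align*}
c^2 t_3^2 &= a^2 + v^2 t_3^2 \\
(c^2 - v^2)\, t_3^2 &= a^2 \\
t_3 &= \frac{a}{\sqrt{c^2 - v^2}}
  \qquad (\text{positive root, since } t_3 > 0 \text{ and } v < c).
\end{align*}
```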

Figure 1.9(b) shows that the total distance traveled from the beam splitter to mirror D and
back again must be

$$2 c t_3 = \frac{2ac}{\sqrt{c^2 - v^2}} . \tag{1.2}$$

Even though the two interferometer arms are both of length a, if the interferometer is moving
then a single wavecrest splitting at the beam splitter does not travel the same distance in each arm
before recombining at the beam splitter. The difference $\Delta s$ between the distances traveled out and
back in each arm is, according to Eqs. (1.2) and (1.1c),

$$\Delta s = c(t_1 + t_2) - 2 c t_3 = \frac{2ac^2}{c^2 - v^2} - \frac{2ac}{\sqrt{c^2 - v^2}} = 2a\left[\frac{1}{1 - v^2/c^2} - \frac{1}{\sqrt{1 - v^2/c^2}}\right] .$$

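The Earth's orbital velocity is about $10^{-4}$ of the speed of light c, so we can make the
approximation

$$\left(1 - \frac{v^2}{c^2}\right)^{-1/2} \cong 1 + \frac{v^2}{2c^2} .$$

This gives

$$\Delta s \cong 2a\left[1 + \frac{v^2}{c^2} - 1 - \frac{1}{2}\frac{v^2}{c^2}\right] + O(v^4/c^4) = \frac{av^2}{c^2} + O(v^4/c^4) .$$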
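
The quality of this small-velocity approximation can be checked numerically. The sketch below compares the exact bracketed expression for the path difference with its leading-order form; it also relies on the companion expansion 1/(1 - x) = 1 + x + ..., which the text uses implicitly. The 1 m arm length, the rounded value of c, and the function names are assumptions for illustration.

```python
import math

def path_difference_exact(a, v, c):
    """Exact difference: 2a * [1/(1 - v^2/c^2) - 1/sqrt(1 - v^2/c^2)]."""
    beta_sq = (v / c) ** 2
    return 2.0 * a * (1.0 / (1.0 - beta_sq) - 1.0 / math.sqrt(1.0 - beta_sq))

def path_difference_approx(a, v, c):
    """Leading-order difference: a * v^2 / c^2."""
    return a * (v / c) ** 2

a = 1.0      # arm length in meters (illustrative assumption)
c = 3.0e8    # speed of light in m/s, rounded
for ratio in (1e-4, 1e-3, 1e-2, 1e-1):
    v = ratio * c
    exact = path_difference_exact(a, v, c)
    approx = path_difference_approx(a, v, c)
    print(f"v/c = {ratio:.0e}: exact = {exact:.6e} m, approx = {approx:.6e} m, "
          f"relative difference = {abs(exact - approx) / exact:.2e}")
```

At the Earth's orbital speed the two forms agree far more closely than any measurement could distinguish; the neglected terms only become noticeable when v is an appreciable fraction of c.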
FIGURE 1.10(a). An incident wavecrest enters from the right and is reflected up from a stationary
surface. The dots show where the corner of the wavecrest is at equally spaced time intervals while it is
reflecting off the surface.

[Fig. 1.10(a) labels: incident wavecrest moving to the left; reflected wavecrest moving up; reflecting surface.]

FIGURE 1.10(b). The same wavecrest is shown here at four instants of time, each instant
separated from the next by a time interval of $\Delta t$, as it enters from the right and reflects off a flat
surface traveling from left to right across the page. The dots show where the corner of the wavecrest
is at these four instants of time, and the thick dashed line shows the effective slant of the surface
experienced by the wavecrest as it reflects.
[Labels: direction of travel of incident wavecrest; direction of travel of reflected wavecrest; reflecting surface at four equally spaced instants of time ($t$, $t + \Delta t$, $t + 2\Delta t$, $t + 3\Delta t$); same incident wavecrest at four equally spaced instants of time ($t$, $t + \Delta t$, $t + 2\Delta t$, $t + 3\Delta t$).]
Since $v^2/c^2 \cong 10^{-8}$ and $v^4/c^4 \cong 10^{-16}$, it makes sense to neglect the $v^4/c^4$ terms and write

$$\Delta s \cong \frac{av^2}{c^2} = 10^{-8}\,a . \tag{1.3a}$$


It is perhaps of interest to point out that Michelson, by mistakenly assuming that the light
traveling up and back the arm perpendicular to the orbital velocity covered a distance 2a instead
of $2ac/\sqrt{c^2 - v^2}$, ended up with

$$\Delta s \cong \frac{2av^2}{c^2} = 2 \times 10^{-8}\,a \tag{1.3b}$$

in his 1881 paper. This incorrect formula did not affect Michelson's overall analysis because, as
he explained in the paper, the data was good enough to rule out an effect ten times smaller than
what he expected to see.
As pointed out in Sec. 1.1, when white light passed through the interferometer with one of the
end mirrors slightly tilted, Michelson saw a central dark band or fringe from the centerline of the
tilted mirror because the centerline is the same distance from the beam splitter as the untilted
mirror. Remembering that Michelson used a beam splitter that reversed the direction of vibration
in one of the recombining beams, we know that at the center of the dark fringe each
monochromatic wavetrain in the white-light beam cancels itself out. At the first colored band or
fringe on either side of the centerline, the wavetrains go from cancelling themselves out to
reinforcing themselves, becoming bright at those positions on the tilted mirror where the length
traveled out and back the tilted mirror arm is a half-wavelength longer than at the center of the
dark band [see, for example, the transition from Fig. 1.5(c) to Fig. 1.5(e)]. Hence, for each
monochromatic wavetrain, the transition from dark to bright is halfway complete where the
length traveled out and back the tilted-mirror arm is a quarter wavelength different from what it is
at the center of the dark band. Considering the joint actions of all the monochromatic wavetrains
in the white-light beam, Michelson then knew that going from the center to the edge of the dark
fringe corresponded to shifting from a position on the tilted mirror where the length out and back
in both interferometer arms was equal to a position where the length out and back the tilted
mirror arm was different by one quarter of the average wavelength $\lambda_{av}$ of the white-light beam.
Thus the fringe widths inside the telescope's field of view gave him an extremely fine-grained
scale for measuring the difference in distance between the two arms. For greater accuracy, a
monochromatic beam could be sent through the interferometer and the tilted mirror adjusted until
the fringes matched up with the scale marks of the telescope's eyepiece.
If the interferometer is rotated so that the arm originally parallel to v is now perpendicular to
v, then the distance out and back one arm is shorter by $\Delta s$ and the distance out and back in the
other arm is longer by $\Delta s$, so there is, according to Eq. (1.3a), a shift of

$$2\Delta s \cong \frac{2av^2}{c^2} = 2 \times 10^{-8}\,a \tag{1.4}$$


of the wavefield from one arm when compared to the wavefield from the other arm. If $2\Delta s$ equals
$\lambda_{av}/4$, the dark fringe shifts until its center is located at the previous position of one of its edges;
if $2\Delta s$ is larger, then the dark fringe shifts more; and if $2\Delta s$ is smaller, then the dark fringe shifts
less. For the value of a he chose, Michelson expected the fringe to shift by approximately one-
tenth its width. To within experimental error, he did not see the dark fringe shift at all. Michelson
concluded that
the hypothesis of the stationary ether is thus shown to be incorrect, and the necessary conclusion follows that
the hypothesis is erroneous.
8
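
To make the size of the expected effect concrete, the sketch below evaluates the rotation shift of Eq. (1.4) as a fraction of the dark fringe's width, taking the center-to-edge distance of the fringe to correspond to a quarter of the average wavelength as described above. The text does not state the arm length Michelson chose, so the 1.2 m arm and the 570 nm average wavelength used here are assumptions, as is the function name.

```python
def expected_shift_fraction(arm_length_m, v_over_c, mean_wavelength_m):
    """Expected shift of the dark fringe, as a fraction of the fringe's full width."""
    shift = 2.0 * arm_length_m * v_over_c ** 2   # rotation shift from Eq. (1.4)
    fringe_width = mean_wavelength_m / 2.0       # center-to-edge corresponds to lambda_av / 4
    return shift / fringe_width

# Assumed values for illustration: a = 1.2 m, v/c = 1e-4, lambda_av = 570 nm.
fraction = expected_shift_fraction(1.2, 1.0e-4, 570e-9)
print(f"expected shift = {fraction:.3f} of the dark fringe's width")
```

With these assumed values the result is of the same order as the one-tenth of a fringe width quoted in the text.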

The existence of the ether was accepted by a lot of scientists, so this experiment was by no
means the last word in the matter; indeed, it inaugurated 50 years of ever more painstaking
attempts to detect an ether wind using larger and more sensitive Michelson interferometers.
Michelson himself took the first step down this road when, in 1887, he collaborated with Edward
Morley to repeat his experiment; Fig. 1.11 shows the optical diagram of the interferometer they
constructed. They concluded that the velocity v of the interferometer with respect to the ether was
probably less than a sixth of the Earth's orbital velocity, an upper limit suggested by
experimental error.
9
Michelson and Morley regarded this as another negative result. Many
scientists, including Michelson, at first interpreted these experiments as showing that the Earth
dragged along a layer of ether near its surface, making it hard to say just how fast the
interferometer might be moving with respect to the ether in the laboratory. Interferometers were
set up on tops of mountains and sent up in high-altitude balloons, hoping to get outside the ether
layer dragged along by the Earth, but no one came up with any results convincingly larger than
experimental error. According to Einstein's special theory of relativity, published in 1905, there
is no reason to expect ether drift at all, because the speed of light is the same in all inertial
frames of reference. After 1905, attempts to detect ether drift were basically attempts to disprove
relativity theory, and scientists who pursued them were regarded by their peers as ever more
eccentric. Perhaps the last serious attempt to detect an ether wind using a Michelson
interferometer took place on top of Mount Wilson, where Dayton Miller ran an extremely large
and sensitive Michelson experiment in the 1920s. When publishing the results in the early 1930s,
he claimed to detect ether-wind velocities on the order of 10 km/sec,
10,11
but the data remained

8
Michelson, "The Relative Motion of the Earth."
9
A. Michelson and E. Morley, "On the Relative Motion of the Earth and the Luminiferous Ether," American Journal
of Science 34, Series 3 (1887), 333-345.
10
D. Miller, "The Ether-Drift Experiment and the Determination of the Absolute Motion of the Earth," Reviews of
Modern Physics 5, no. 2 (July 1933), 203-242.
controversial. After his death, the results were attributed to slight but systematic temperature
changes in the instrument during the measurements.
12

11
D. Miller, "The Ether-Drift Experiment and the Determination of the Absolute Motion of the Earth," Nature
(February 3, 1934), 162-164.
12
R. Shankland, S. McCuskey, F. Leone, and G. Kuerti, "New Analysis of the Interferometer Observations of
Dayton C. Miller," Reviews of Modern Physics 27, no. 2 (April 1955), 167-178.
1.3 Monochromatic Light and Spectral Lines
The wavelength $\lambda$ of a monochromatic light wave and the frequency f in cycles per unit time of
that same monochromatic light wave are connected by

$$\lambda f = c , \tag{1.5}$$

where c is the velocity of light. By the second half of the 19th century, it was known that the light
emitted by free atoms, such as from the atoms inside a hot dilute gas, is often emitted at specific
frequencies called spectral lines. Equation (1.5) then requires the light from a spectral line to
have a precise wavelength $\lambda = c/f$. Michelson used these spectral lines to generate the
monochromatic light sent through his interferometer. When, for example, a spectroscope was
used to separate out the cadmium red line and send it through the interferometer, he would see a
regular pattern of red fringes; when the mercury green line was sent through, he would see
regular green fringes; and so on. Many of these lines are in reality clumped groups of spectral
lines, all having nearly the same wavelength; they masquerade as a single bright line when
observed by low-resolution spectroscopes and spectrometers.
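As a small worked example of Eq. (1.5), the sketch below converts a line's wavelength into its frequency. The two wavelengths are approximate values supplied here as assumptions; they do not come from the text, and the function name is illustrative.

```python
C_M_PER_S = 299_792_458.0  # speed of light in vacuum, in m/s

def frequency_hz(wavelength_m):
    """Frequency from Eq. (1.5): f = c / wavelength."""
    return C_M_PER_S / wavelength_m

# Approximate line wavelengths in nm (assumed values, not taken from the text).
lines_nm = {"cadmium red": 643.85, "mercury green": 546.07}
for name, lam_nm in lines_nm.items():
    print(f"{name}: {lam_nm} nm -> {frequency_hz(lam_nm * 1e-9):.4e} Hz")
```

The shorter-wavelength green line has the higher frequency, as Eq. (1.5) requires.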
1.4 Applying the Michelson Interferometer to Spectral Lines
After the first ether-wind experiments, Michelson demonstrated that his interferometer could also
be used both as an extremely accurate, practical ruler for measuring fundamental lengths and as
an extremely high-resolution spectrometer. To understand Michelson's approach, we must keep
in mind that the only optical detectors available back then were cameras (whose images had to
be chemically developed in darkrooms) and the human eye.
When the interferometer is used as a ruler or spectrometer, one of the arms is modified so that
its mirror is easily moved, as shown in Fig. 1.12. This moving mirror and the fixed mirror on the
other arm are still slightly tilted with respect to each other; that is, when extended indefinitely,
the planes of the mirror surfaces do not meet at exactly 90°. In this discussion, we refer to the
moving mirror as being tilted and the fixed mirror as being untilted. To keep things consistent
with the discussion in Sec. 1.1, the beam splitter is assumed to be the same type used in the 1881

FIGURE 1.11.
ether-wind experiment. Hence, when a white-light beam is sent through the instrument, an
observer notes a central dark fringe if the center of the tilted moving mirror is the same distance
from the beam splitter as the center of the fixed mirror. This equidistant position of the moving
mirror is today often called the position of zero-path difference (ZPD) because the light's path up
and back each arm of the interferometer is the same when there is no tilt present.
The position and tilt of the moving mirror can be adjusted until the central dark fringe is
centered on rulings marked in the telescope's eyepiece. When the white-light beam is replaced by
a monochromatic beam from a spectral line, the observer sees a sequence of light and dark bands
forming a regular pattern of fringes having the same color as the spectral line. The marked
position of the central dark fringe in the center of the eyepiece is now occupied by a dark null of
the monochromatic fringe pattern. This null corresponds to the centerline strip of the tilted
mirror's surface being the same distance from the beam splitter as the untilted mirror's surface.
The two bright fringes on either side of the marked null separate that null from the two
neighboring nulls, with the neighboring nulls corresponding to two strips of the tilted mirror's
surface that are a half-wavelength closer to, and a half-wavelength further away from, the beam
splitter. A half-wavelength difference in distance from the beam splitter creates, of course, a full
wavelength's difference in the distance traveled up and back the interferometer's arm, which is
why we see another null. Depending on the configuration of the telescope, the amount of tilt in
the tilted mirror, and the wavelength of the monochromatic beam, there will be some number of
additional fringes alternating bright and dark across the field of view, with the nulls
corresponding to strips of the tilted mirror's surface that are one half-wavelength closer to and
further away from the beam splitter, two halves or one full wavelength closer to and further away
from the beam splitter, three halves closer to and further away from the beam splitter, and so on.
The observer can slowly move the tilted mirror out along its arm, watching as the fringe
pattern moves across the telescope's field of view. The movement occurs, of course, because the
strips of the moving mirror's tilted surface that are 1/2, 1, 3/2, etc., wavelengths closer to or
further away from the beam splitter are now no longer where they used to be. The marked null
shifts and, after the mirror moves half a wavelength from its original position, the null that used
to be immediately to one side shifts into the marked location. The fringe pattern looks the same
as just before the mirror began moving, but the observer knows there has been a half-wavelength
shift in the position of the moving mirror because the fringes have been carefully watched as their
positions changed. As the mirror moves, old fringes move out of sight on one side of the field of
view while new fringes replace them on the other side of the field of view. The observer checks
that the tilt of the moving mirror does not change by making sure that there is always the same
number of bright-null repetitions in the fringe pattern. Since the position of the moving mirror is
always known to within a small fraction of a wavelength, the interferometer has now become an
extremely accurate way to measure distance.
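The fringe-counting procedure described above amounts to simple bookkeeping: each time a new null moves into the marked position, the mirror has moved half a wavelength. A minimal sketch, with hypothetical function names and an assumed wavelength of roughly 643.85 nm (not a value from the text):

```python
def mirror_displacement_m(nulls_counted, wavelength_m):
    """Mirror travel after a number of nulls have passed the eyepiece mark."""
    return nulls_counted * wavelength_m / 2.0

def nulls_for_displacement(displacement_m, wavelength_m):
    """Number of nulls that pass the mark while the mirror moves a given distance."""
    return 2.0 * displacement_m / wavelength_m

wavelength = 643.85e-9  # assumed wavelength in meters, used only for illustration
print(f"1000 nulls -> {mirror_displacement_m(1000, wavelength) * 1e3:.5f} mm of mirror travel")
print(f"1 mm of travel -> {nulls_for_displacement(1.0e-3, wavelength):.1f} nulls")
```

Because fractions of a fringe can also be estimated against the eyepiece rulings, the mirror position is known to a small fraction of a wavelength, which is what makes the instrument useful as a ruler.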

FIGURE 1.12. [Labels: Moving Mirror; Fixed Mirror; Compensator Plate; Beam Splitter; Source Radiance Containing Spectral Lines; To Telescope; mirror displacement p.]

Michelson did not hesitate to measure distances with his interferometer. In 1892 he
established that the standard meter bar in Paris corresponded, to an accuracy of one part in two
million, to 1,553,163.5 wavelengths of monochromatic light from the red cadmium spectral line.
At Yerkes Observatory in Wisconsin, he measured the extremely small tidal distortions of the
planet Earth due to the moon's gravity, helping to establish that the Earth has an iron core, and
published the results in 1919. There is, however, a fundamental difficulty limiting his ability to
use the interferometer as a ruler: As the moving mirror gets further and further away from its
equidistant or ZPD position, the pattern of fringes starts to fade and eventually disappears. This
phenomenon is caused by the beam from the spectral line not being exactly monochromatic,
either because what looks like a single spectral line is in reality a group of two or more lines
having almost the same wavelength, or because the line itself has a finite spectral width,
simultaneously emitting light at a very large number of wavelengths all very close to each other
in value.
To see why the fade-out occurs for a closely spaced group of spectral lines, we first analyze
what happens when the light from a pair of equal-intensity, closely spaced spectral lines,
sometimes called a spectral doublet, is sent through the interferometer. Inside the interferometer,
the doublet behaves like two monochromatic beams, each having a slightly different
wavelength, simultaneously passing through the instrument. After using white light to put the
moving, tilted mirror at its ZPD position, we begin sending the doublet beam through the
interferometer. Each monochromatic beam produces a fringe pattern. To the human eye, the
fringe patterns have the same color and their nulls seem to be at exactly the same places in the
telescope's field of view. Because the wavelengths of the beams are nearly identical, the two
fringe patterns lie almost exactly on top of each other, reinforcing each other the same way the
dashed and solid oscillations lie on top of each other to create a thicker line at the left-hand edge
of Fig. 1.13. When, for example, there is a null in one beam's fringe pattern because that strip of
the tilted mirror's surface is an integer number of half-wavelengths closer to or further away from
the beam splitter, the null from the other beam's fringe pattern falls in almost exactly the same
place because it has almost exactly the same wavelength. As we shift the moving mirror further
away from ZPD and watch the fringes move, we know that when each new fringe forms at the
leading edge of the field of view, it shows that the edge of the tilted moving mirror is an ever
larger number of half-wavelengths further from the beam splitter. Sooner or later, however, the
same thing happens to the two beams' fringe patterns that happens in Fig. 1.13 as we look away
from its left-hand edge: the oscillations get out of phase. Just as the dashed and solid lines in
Fig. 1.13 no longer match up exactly because they have slightly different repetition lengths, so do
the two fringe patterns of the two beams match up less well because they have slightly different
wavelengths. There always comes a point (perhaps when the next null is forming at 10,000 or
50,000 or more half-wavelengths from the ZPD position of the moving mirror) where the
monochromatic beam with the slightly shorter wavelength $\lambda_1$ is ready to form a null somewhat
before the beam with the slightly longer wavelength $\lambda_2$. The nulls and brights from one
monochromatic fringe pattern shift enough with respect to the other that we begin to notice a
change: the pattern begins to fade. Eventually, the two fringe patterns are completely out of
phase, with the brights and nulls of one pattern lying on, respectively, the nulls and brights of the
other. If the two beams are of equal intensity, then the fringe pattern fades away completely.
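The fade-out of a doublet's fringes can be reproduced with a short simulation. This is a sketch under stated assumptions rather than the author's method: each line is modeled as contributing a brightness 1 - cos(4 pi p / wavelength) at mirror displacement p (dark at ZPD, as with a phase-reversing beam splitter), the two lines have equal intensity, and the sodium D wavelengths are used purely as an illustrative doublet. The function names are hypothetical.

```python
import math

def doublet_intensity(p, lam1, lam2):
    """Summed brightness of two equal-intensity fringe patterns at mirror displacement p."""
    return sum(1.0 - math.cos(4.0 * math.pi * p / lam) for lam in (lam1, lam2))

def fringe_contrast(p_start, lam1, lam2, samples=400):
    """(max - min)/(max + min) of the summed pattern over one fringe starting at p_start."""
    step = 0.5 * lam1 / samples   # one fringe spans about half a wavelength of mirror travel
    values = [doublet_intensity(p_start + k * step, lam1, lam2) for k in range(samples + 1)]
    return (max(values) - min(values)) / (max(values) + min(values))

lam1, lam2 = 588.995e-9, 589.592e-9            # illustrative doublet wavelengths in meters
p_fade = lam1 * lam2 / (4.0 * (lam2 - lam1))   # displacement where the patterns are out of phase

for label, p in (("at ZPD", 0.0), ("halfway to fade-out", 0.5 * p_fade), ("at fade-out", p_fade)):
    contrast = fringe_contrast(p, lam1, lam2)
    print(f"{label}: p = {p * 1e6:8.3f} micrometers, contrast = {contrast:.3f}")
```

In this model the contrast falls smoothly from full near ZPD to essentially zero at the fade-out displacement, where the brights of one pattern sit on the nulls of the other.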
Suppose the $\lambda_1$ set of fringes first becomes exactly out of phase with the $\lambda_2$ set of fringes when
the moving mirror has traveled a distance of approximately N/2 wavelengths of the $\lambda_2$ beam from
its equidistant or ZPD location. At this point, N satisfies the approximate equation

$$\frac{1}{2} N \lambda_2 \cong \frac{1}{2}\left(N + \frac{1}{2}\right)\lambda_1 , \tag{1.6a}$$

which can also be written as

$$\frac{\lambda_2 - \lambda_1}{\lambda_1} \cong \frac{1}{2N} . \tag{1.6b}$$

This gives the formula for the fractional spread $(\lambda_2 - \lambda_1)/\lambda_1$
between the doublet's wavelengths in terms of N. If N is too large for convenient counting and
only several digits of accuracy are needed, we can directly measure the distance p in Fig. 1.12 at
which the fringe pattern disappears. Recognizing that both sides of Eq. (1.6a) are formulas for p
at the fade-out point, we can approximate either side of Eq. (1.6a) by $N\lambda_{av}/2$, where
$\lambda_{av}$ is the approximate wavelength of the doublet, and write

$$p \cong \frac{N \lambda_{av}}{2} . \tag{1.6c}$$

Solving for N gives the formula

$$N \cong \frac{2p}{\lambda_{av}} \tag{1.6d}$$

to estimate N in terms of the known values of p and $\lambda_{av}$. This approximate value of N can then
be put into Eq. (1.6b) to find the fractional spread in the doublet. Hence, we see that the fade-out
is both a bug and a feature of the interferometer: although it sets a limit on the distances that
can be measured, it also specifies the exact separation of spectral lines too close to be resolved by
other types of spectrometers. This exercise also establishes the basic idea behind Michelson-based
spectroscopy: examining the behavior of the interference signal to measure the beam's
spectral shape.
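The two-step estimate in Eqs. (1.6d) and (1.6b) can be sketched numerically. The sodium D wavelengths below are illustrative values chosen for this sketch, not numbers taken from the text, and the function names are hypothetical. The "measured" fade-out distance is generated from Eq. (1.6a) so that the estimate can be compared with the known spread.

```python
def fringe_count_at_fadeout(p, lam_av):
    """Estimate N from Eq. (1.6d): N ~ 2 p / lambda_av."""
    return 2.0 * p / lam_av


def fractional_spread(n_fringes):
    """Fractional spread (lambda_2 - lambda_1) / lambda_1 from Eq. (1.6b)."""
    return 1.0 / (2.0 * n_fringes)


# Illustrative values only (an assumption of this sketch): the sodium D doublet.
lam_1 = 589.0e-9   # shorter wavelength, meters
lam_2 = 589.6e-9   # longer wavelength, meters
lam_av = 0.5 * (lam_1 + lam_2)

# Fade-out position p implied by Eq. (1.6a), standing in for a measurement.
n_exact = lam_1 / (2.0 * (lam_2 - lam_1))
p_fadeout = 0.5 * n_exact * lam_2

n_est = fringe_count_at_fadeout(p_fadeout, lam_av)
spread_est = fractional_spread(n_est)
spread_true = (lam_2 - lam_1) / lam_1
print(f"p at fade-out ~ {p_fadeout * 1e3:.3f} mm, N ~ {n_est:.0f}")
print(f"estimated spread {spread_est:.3e}, true spread {spread_true:.3e}")
```

The small disagreement between the two printed spreads comes from replacing $\lambda_2$ by $\lambda_{av}$ in Eq. (1.6c), which is the approximation the text accepts when only several digits are needed.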
FIGURE 1.13. The solid oscillation represents the fringe pattern of one spectral line in the doublet and
the dashed oscillation represents the fringe pattern of the other spectral line in the doublet. The
wavelengths of both spectral lines are almost the same, so their fringe patterns slowly change from being
in-phase, to being out-of-phase, and then back to being in-phase.

Now that we understand why the fringe pattern of a doublet fades, it is easy to see why the
same sort of thing happens with any size group, or multiplet, of closely spaced spectral lines.
Each line of intrinsically greater or lesser intensity generates a fringe pattern of intrinsically
greater or lesser intensity connected to its wavelength. Near ZPD, all the fringe patterns are in
phase, but as the moving mirror shifts away from ZPD, the fringe patterns, since each is produced
by a slightly different wavelength, go out of phase, causing the fringes to fade. Figure 1.14 even
suggests a quick way of understanding something about why a single, finite-width spectral line
also produces fading fringe patterns; approximating it as a closely spaced multiplet, we might
expect its fringes to behave the same way any other multiplet's would. We should, however, be
careful about carrying this sort of reasoning too far. Figure 1.13 suggests that if, after reaching
the fade-out point, we keep moving the tilted mirror away from its ZPD position, then the
doublet's fringe pattern starts to reappear, eventually becoming as strong as it was near ZPD. The
same sort of phenomenon should also occur for any multiplet consisting of a finite number of
exact wavelengths; if we go far enough from ZPD, then there should be a region where the fringe
patterns are all back in phase. In reality, when moving away from ZPD, there are indeed regions
where a multiplet's fringe pattern first fades then grows stronger, but the finite width of each
spectral line inside the multiplet stops the fringes from ever regaining their full ZPD strength.
The fringes always, eventually, fade away completely. To explain this behavior, it is enough to
examine how and why the fringe pattern of a single, finite-width spectral line fades away. This is
done in the next three sections, where we show how a fringe pattern is connected to the Fourier
transform of the spectral intensity.
1.5 Interference Equation for the Ideal Michelson Interferometer
When using a Michelson interferometer for Fourier-transform spectroscopy, the end mirrors in
each arm are aligned to be perpendicular to the line of sight between their centers and the center
of the beam splitter. In effect, we remove the tilt from the moving mirror so that its central fringe
fills the detector's field of view in Fig. 1.15. The light beam passing through the interferometer
should be collimated, as shown schematically in Fig. 1.15, by putting the point source of the beam
at the focus of a thin lens. The beam leaving the interferometer is concentrated onto a detector by
another thin lens. The dashed line shows the ZPD position of the moving mirror in Figs. 1.15 and
1.16. The moving mirror is a distance p from ZPD in these two figures, with p taken to be
positive when the mirror is further away from the beam splitter than its ZPD position and
negative when it is closer to the beam splitter than its ZPD position. The moving mirror should
remain perpendicular to the line of sight between it and the beam splitter as p changes, and the
detector records the changing intensity I of the collimated beam leaving the interferometer.
Even though Michelson did not usually set up his interferometers this way, optical theory was
advanced enough then for him to predict how I depends on p. The first step is to set up an x, y, z
Cartesian coordinate system such as the one shown in Fig. 1.16, with the collimated exit beam
traveling down the z axis. There are dimensionless unit vectors $\hat{x}$, $\hat{y}$, $\hat{z}$ pointing in the direction
of the positive x, y, z coordinate axes. Still treating a light beam as a transverse wavefield of the
type shown in Figs. 1.2(a)–1.2(c) and 1.3(a)–1.3(c), we assume that beam TR in Fig. 1.16 is
monochromatic light and write its transverse disturbance as

$$\vec{A}_f = \hat{x}\,U_f \cos\!\left(\frac{2\pi z}{\lambda_f} - 2\pi f t + \delta_U\right) + \hat{y}\,V_f \cos\!\left(\frac{2\pi z}{\lambda_f} - 2\pi f t + \delta_V\right) . \tag{1.7a}$$

Here, t is the time coordinate, f is the frequency of the monochromatic disturbance, and $\lambda_f$ is the
wavelength corresponding to frequency f. The period of the disturbance is, of course, 1/f, and Eq.
(1.5) reminds us that the wavelength $\lambda_f$ is connected to the frequency f by

$$\lambda_f f = c ,$$

where again c is the speed of light. Vector $\vec{A}_f$ has no z component, allowing it to represent a
transverse disturbance in the ether of the type shown in Figs. 1.2(a)–1.2(c) and 1.3(a)–1.3(c).
The x and y components of $\vec{A}_f$ are the real-valued expressions

$$U_f \cos\!\left(\frac{2\pi z}{\lambda_f} - 2\pi f t + \delta_U\right)$$

and

$$V_f \cos\!\left(\frac{2\pi z}{\lambda_f} - 2\pi f t + \delta_V\right)$$

respectively. These components must both oscillate at the same frequency f because the light
beam is monochromatic, but they can have different constant phase shifts $\delta_U$ and $\delta_V$. This allows
$\vec{A}_f$ to point in different directions in the x, y plane when we move along the beam, as suggested
by the changing orientations of the arrows in beams RT and TR of Fig. 1.16. The $U_f$ and $V_f$
amplitudes of the x and y oscillations do not have to be equal. To simplify the notation, and
because the concept will be routinely used in the rest of the book, we define

$$\sigma_f = \frac{1}{\lambda_f} \tag{1.7b}$$

to be the wavenumber of the monochromatic disturbance. Now Eqs. (1.7a) and (1.5) can be
written as

$$\vec{A}_f = \hat{x}\,U_f \cos(2\pi\sigma_f z - 2\pi f t + \delta_U) + \hat{y}\,V_f \cos(2\pi\sigma_f z - 2\pi f t + \delta_V) \tag{1.7c}$$

with

$$\sigma_f = f/c . \tag{1.7d}$$

This is the same monochromatic disturbance as before; all that changes is the notation used to
specify how its phase changes with z.

FIGURE 1.14. [Plot labels: Spectral Intensity versus frequency f (two panels); Spectral Multiplet.]

FIGURE 1.15. [Diagram labels: 90 deg., 90 deg., 45 deg., source at focus, p, Moving Mirror, Fixed Mirror, Beam Splitter, Compensator Plate, Detector.]
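As a quick check on the notation, the following minimal sketch converts between wavelength, frequency, and wavenumber using Eqs. (1.5), (1.7b), and (1.7d). The function names, the rounded speed of light, and the sample wavelength are assumptions made for illustration.

```python
C_M_PER_S = 3.0e8  # rounded speed of light in m/sec (an assumption of this sketch)


def wavenumber_from_wavelength(lam_m):
    """Eq. (1.7b): sigma = 1 / lambda, in cycles per meter."""
    return 1.0 / lam_m


def wavenumber_from_frequency(f_hz, c=C_M_PER_S):
    """Eq. (1.7d): sigma = f / c, in cycles per meter."""
    return f_hz / c


lam = 5.0e-7                      # an illustrative visible wavelength in meters
f = C_M_PER_S / lam               # Eq. (1.5): lambda * f = c
sigma_a = wavenumber_from_wavelength(lam)
sigma_b = wavenumber_from_frequency(f)
print(f"f = {f:.2e} Hz, sigma = {sigma_a:.2e} 1/m = {sigma_a / 100:.0f} 1/cm")
```

Both routes give the same wavenumber, which is the point of Eqs. (1.7b) and (1.7d): the phase of the disturbance can be written with either $\lambda_f$ or $\sigma_f$ without changing the physics.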
The power transported by a physical wavefield of any type is usually proportional to its
squared amplitude;$^{13,14}$ and in optics it is now, as it was in Michelson's time, customary to set the
time average of the squared amplitude equal to the intensity of the transverse wavefield.$^{15}$ Visible
light has a wavelength on the order of $5 \times 10^{-7}$ meters, so by Eq. (1.5) its frequency is about

$$f \cong \frac{c}{5 \times 10^{-7}\ \text{meters}} \cong 6 \times 10^{14}\ \text{Hz} \tag{1.8a}$$

given that $c \cong 3 \times 10^{8}$ m/sec. Hence one cycle of the transverse wavefield has a period of about

$$\frac{1}{6 \times 10^{14}\ \text{Hz}} \cong 2 \times 10^{-15}\ \text{sec} . \tag{1.8b}$$

13 H. Lamb, Hydrodynamics (6th edition), Dover Publications, New York, 1945 copy of the 6th edition first published in 1879, p. 370.
14 P. Morse and K. Ingard, Theoretical Acoustics, McGraw-Hill, Inc., New York, 1968, p. 250.
15 G. Stokes, Mathematical and Physical Papers, Vol. III, Cambridge at the University Press, 1901, pp. 233-258.

FIGURE 1.16. [Diagram labels: x axis, y axis, z axis; unit vectors $\hat{x}$, $\hat{y}$, $\hat{z}$; Beam TR; Beam RT; Compensator Plate; Fixed Mirror; Moving Mirror; p; $\chi = 2p$; Beam Splitter.]

The response time of the unaided human eye is perhaps as short as $10^{-2}$ s, and $2 \times 10^{-15}$ s is
shorter than that by a factor of about $10^{13}$. The response of the fastest optical detectors available
today is on the order of $10^{-9}$ s, which is still an incredibly long time compared to $2 \times 10^{-15}$ s.
Therefore, we might as well take the time over which the squared amplitude is averaged to be
infinitely long, because compared to the wavefield's period, that's what it effectively is.
Following the notation of the time, the time average of a function g(t) is taken to be

$$\langle g(t) \rangle = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} g(t)\, dt . \tag{1.9a}$$

For any two functions g(t) and h(t), we then have

$$\langle g(t) + h(t) \rangle = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} [g(t) + h(t)]\, dt = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} g(t)\, dt + \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} h(t)\, dt$$

or

$$\langle g(t) + h(t) \rangle = \langle g(t) \rangle + \langle h(t) \rangle . \tag{1.9b}$$

Multiplying g(t) by a constant K and then averaging, we get

$$\langle K g(t) \rangle = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} [K g(t)]\, dt = K \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} g(t)\, dt$$

or

$$\langle K g(t) \rangle = K \langle g(t) \rangle . \tag{1.9c}$$

The squared amplitude of the monochromatic wavefield in Eq. (1.7c) is

$$\vec{A}_f \cdot \vec{A}_f = U_f^2 \cos^2(2\pi\sigma_f z - 2\pi f t + \delta_U) + V_f^2 \cos^2(2\pi\sigma_f z - 2\pi f t + \delta_V) .$$

Time averaging both sides to get the intensity gives

$$\langle \vec{A}_f \cdot \vec{A}_f \rangle = \left\langle U_f^2 \cos^2(2\pi\sigma_f z - 2\pi f t + \delta_U) + V_f^2 \cos^2(2\pi\sigma_f z - 2\pi f t + \delta_V) \right\rangle , \tag{1.10a}$$

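which becomes, applying Eqs. (1.9b) and (1.9c),

$$\langle \vec{A}_f \cdot \vec{A}_f \rangle = U_f^2 \left\langle \cos^2(2\pi\sigma_f z - 2\pi f t + \delta_U) \right\rangle + V_f^2 \left\langle \cos^2(2\pi\sigma_f z - 2\pi f t + \delta_V) \right\rangle . \tag{1.10b}$$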

The average of the squared cosine is 1/2 over one of its cycles.$^{16}$ As the averaging time gets
longer, it contains ever more cycles of the squared cosine, as well as, almost certainly, some
fraction of a cycle. The contribution of the squared cosine over a fractional cycle has practically
no influence compared to the squared cosine's average value of 1/2 over a large number of
complete cycles. In the limit as $T \to \infty$, it follows that

$$\langle \cos^2(at + b) \rangle = 1/2 \tag{1.10c}$$

for all real values of b and all nonzero real values of a. Hence, the formula for the intensity of the
monochromatic beam in Eq. (1.10b) now reduces to

$$\langle \vec{A}_f \cdot \vec{A}_f \rangle = \frac{1}{2}\left(U_f^2 + V_f^2\right) . \tag{1.10d}$$

Although the squared cosine is never negative, the cosine itself is negative as often as it is
positive and averages to zero over one cycle. As the averaging time increases, it includes an ever
larger number of cycles as well as (probably) some leftover fraction of a cycle. Again, the
influence of the zero from the large number of complete cycles outweighs the contribution of
whatever fractional cycle may be present, and as $T \to \infty$ in the limit

$$\langle \cos(at + b) \rangle = 0 \tag{1.11}$$

for all real values of b and all nonzero real values of a.
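The limits in Eqs. (1.10c) and (1.11) can be watched numerically. The sketch below is only an illustration: it approximates the finite-T average in Eq. (1.9a) with a midpoint rule (a numerical method chosen here, not one used in the text), and the values of a and b are arbitrary.

```python
import math


def time_average(g, half_width, n_steps=200_000):
    """Midpoint-rule estimate of (1 / 2T) * integral of g(t) from -T to T."""
    dt = 2.0 * half_width / n_steps
    total = 0.0
    for k in range(n_steps):
        t = -half_width + (k + 0.5) * dt
        total += g(t)
    return total * dt / (2.0 * half_width)


a, b = 3.0, 0.7  # arbitrary illustrative values, with a nonzero

for half_width in (1.0, 10.0, 1000.0):
    avg_cos2 = time_average(lambda t: math.cos(a * t + b) ** 2, half_width)
    avg_cos = time_average(lambda t: math.cos(a * t + b), half_width)
    print(f"T = {half_width:7.1f}: <cos^2> = {avg_cos2:.5f}, <cos> = {avg_cos:+.5f}")
```

As T grows, the leftover fractional cycle matters less and less, so the two averages settle toward 1/2 and 0 respectively.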
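The wavefield of a beam of light containing two monochromatic wavetrains of frequencies $f_1$
and $f_2$ can be written as

$$\vec{A} = \vec{A}_{f_1} + \vec{A}_{f_2} , \tag{1.12a}$$

where

$$\vec{A}_{f_1} = \hat{x}\,U_{f_1} \cos\!\left(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_U^{(1)}\right) + \hat{y}\,V_{f_1} \cos\!\left(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_V^{(1)}\right) \tag{1.12b}$$

and

$$\vec{A}_{f_2} = \hat{x}\,U_{f_2} \cos\!\left(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_U^{(2)}\right) + \hat{y}\,V_{f_2} \cos\!\left(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_V^{(2)}\right) . \tag{1.12c}$$

16 D. Griffiths, Introduction to Electrodynamics, 2nd ed. (Prentice Hall, Englewood Cliffs, NJ, 1989), p. 359.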
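The beam's intensity is the time average of its squared amplitude, which is

$$\langle \vec{A} \cdot \vec{A} \rangle = \left\langle (\vec{A}_{f_1} + \vec{A}_{f_2}) \cdot (\vec{A}_{f_1} + \vec{A}_{f_2}) \right\rangle = \left\langle \vec{A}_{f_1} \cdot \vec{A}_{f_1} + \vec{A}_{f_2} \cdot \vec{A}_{f_2} + 2 \vec{A}_{f_1} \cdot \vec{A}_{f_2} \right\rangle .$$

Equations (1.9b) and (1.9c) can be applied to get

$$\langle \vec{A} \cdot \vec{A} \rangle = \langle \vec{A}_{f_1} \cdot \vec{A}_{f_1} \rangle + \langle \vec{A}_{f_2} \cdot \vec{A}_{f_2} \rangle + 2 \langle \vec{A}_{f_1} \cdot \vec{A}_{f_2} \rangle . \tag{1.12d}$$

Substituting Eqs. (1.12b) and (1.12c) into the cross term in Eq. (1.12d) gives

$$\begin{aligned}
\langle \vec{A}_{f_1} \cdot \vec{A}_{f_2} \rangle = \Big\langle\; & U_{f_1} U_{f_2} \cos\!\left(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_U^{(1)}\right) \cos\!\left(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_U^{(2)}\right) \\
& + V_{f_1} V_{f_2} \cos\!\left(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_V^{(1)}\right) \cos\!\left(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_V^{(2)}\right) \Big\rangle .
\end{aligned}$$

Again, Eqs. (1.9b) and (1.9c) are applied to get

$$\begin{aligned}
\langle \vec{A}_{f_1} \cdot \vec{A}_{f_2} \rangle = \; & U_{f_1} U_{f_2} \left\langle \cos\!\left(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_U^{(1)}\right) \cos\!\left(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_U^{(2)}\right) \right\rangle \\
& + V_{f_1} V_{f_2} \left\langle \cos\!\left(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_V^{(1)}\right) \cos\!\left(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_V^{(2)}\right) \right\rangle .
\end{aligned} \tag{1.12e}$$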

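There is a trigonometric identity

$$(\cos\alpha)(\cos\beta) = \frac{1}{2}\cos(\alpha + \beta) + \frac{1}{2}\cos(\alpha - \beta) , \tag{1.12f}$$

which shows that

$$\begin{aligned}
& \cos\!\left(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_U^{(1)}\right) \cos\!\left(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_U^{(2)}\right) \\
& \qquad = \frac{1}{2} \cos\!\left[2\pi z(\sigma_{f_1} + \sigma_{f_2}) - 2\pi t (f_1 + f_2) + \delta_U^{(1)} + \delta_U^{(2)}\right] \\
& \qquad\quad + \frac{1}{2} \cos\!\left[2\pi z(\sigma_{f_1} - \sigma_{f_2}) - 2\pi t (f_1 - f_2) + \delta_U^{(1)} - \delta_U^{(2)}\right] .
\end{aligned} \tag{1.12g}$$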

Taking the time average of both sides and applying Eqs. (1.9b) and (1.9c), we see that

$$\begin{aligned}
& \left\langle \cos\!\left(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_U^{(1)}\right) \cos\!\left(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_U^{(2)}\right) \right\rangle \\
& \qquad = \frac{1}{2} \left\langle \cos\!\left[2\pi z(\sigma_{f_1} + \sigma_{f_2}) - 2\pi t (f_1 + f_2) + \delta_U^{(1)} + \delta_U^{(2)}\right] \right\rangle \\
& \qquad\quad + \frac{1}{2} \left\langle \cos\!\left[2\pi z(\sigma_{f_1} - \sigma_{f_2}) - 2\pi t (f_1 - f_2) + \delta_U^{(1)} - \delta_U^{(2)}\right] \right\rangle .
\end{aligned}$$

Equation (1.11) requires both terms on the right-hand side to be zero, which gives

$$\left\langle \cos\!\left(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_U^{(1)}\right) \cos\!\left(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_U^{(2)}\right) \right\rangle = 0 . \tag{1.12h}$$

Replacing $\delta_U^{(1,2)}$ by $\delta_V^{(1,2)}$ in the algebra used to reach this result does not change the
conclusion, which means that

$$\left\langle \cos\!\left(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_V^{(1)}\right) \cos\!\left(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_V^{(2)}\right) \right\rangle = 0 \tag{1.12i}$$

also. Substituting these two formulas into Eq. (1.12e) leads to

$$\langle \vec{A}_{f_1} \cdot \vec{A}_{f_2} \rangle = 0 \tag{1.12j}$$

for any two frequencies $f_1$ and $f_2$ such that $f_1 \neq f_2$. Hence, Eq. (1.12d) can be written as

$$\langle \vec{A} \cdot \vec{A} \rangle = \langle \vec{A}_{f_1} \cdot \vec{A}_{f_1} \rangle + \langle \vec{A}_{f_2} \cdot \vec{A}_{f_2} \rangle . \tag{1.12k}$$

Comparing the formula in (1.12k) for the intensity of a beam containing two monochromatic
wavefields to the left-hand side of the formula in (1.10d) for the intensity of a single
monochromatic wavefield, we note that the intensity of the beam with two monochromatic
wavefields is the sum of the intensities of each monochromatic wavefield.
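The vanishing cross term in Eq. (1.12j) can be illustrated with a minimal numerical sketch. Everything specific here is an assumption made for illustration: the beam is evaluated at z = 0 with only an x component, the amplitudes, angular frequencies, and phases are arbitrary, and the infinite-time average is approximated by a long finite window.

```python
import math


def time_average(g, half_width, n_steps=400_000):
    """Midpoint-rule estimate of (1 / 2T) * integral of g(t) from -T to T."""
    dt = 2.0 * half_width / n_steps
    total = sum(g(-half_width + (k + 0.5) * dt) for k in range(n_steps))
    return total * dt / (2.0 * half_width)


# Hypothetical two-wavetrain beam at z = 0, x component only (V = 0).
u1, w1, d1 = 1.0, 5.0, 0.3   # amplitude, angular frequency 2*pi*f1, phase
u2, w2, d2 = 0.6, 5.5, 1.1   # a second, distinct angular frequency 2*pi*f2


def wave1(t):
    return u1 * math.cos(-w1 * t + d1)


def wave2(t):
    return u2 * math.cos(-w2 * t + d2)


T = 2000.0
total_intensity = time_average(lambda t: (wave1(t) + wave2(t)) ** 2, T)
cross_term = time_average(lambda t: wave1(t) * wave2(t), T)
sum_of_intensities = 0.5 * u1**2 + 0.5 * u2**2   # Eq. (1.10d) applied to each wavetrain
print(f"cross term      = {cross_term:+.5f}")
print(f"total intensity = {total_intensity:.5f}")
print(f"sum of parts    = {sum_of_intensities:.5f}")
```

The closer the two frequencies are, the longer the averaging window must be before the cross term becomes negligible, which is consistent with the doublet fringes of Sec. 1.4 persisting over many half-wavelengths.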
The wavefield of a beam of light containing three monochromatic wavetrains of frequencies
$f_1$, $f_2$, and $f_3$ can be written as

$$\vec{A} = \vec{A}_{f_1} + \vec{A}_{f_2} + \vec{A}_{f_3} \tag{1.13a}$$

with $\vec{A}_{f_1}$, $\vec{A}_{f_2}$ specified by formulas (1.12b) and (1.12c) respectively and $\vec{A}_{f_3}$ specified by

$$\vec{A}_{f_3} = \hat{x}\,U_{f_3} \cos\!\left(2\pi\sigma_{f_3} z - 2\pi f_3 t + \delta_U^{(3)}\right) + \hat{y}\,V_{f_3} \cos\!\left(2\pi\sigma_{f_3} z - 2\pi f_3 t + \delta_V^{(3)}\right) . \tag{1.13b}$$

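Following the same analysis as before, we note that the intensity of this three-frequency light
beam is

$$\begin{aligned}
\langle \vec{A} \cdot \vec{A} \rangle &= \left\langle (\vec{A}_{f_1} + \vec{A}_{f_2} + \vec{A}_{f_3}) \cdot (\vec{A}_{f_1} + \vec{A}_{f_2} + \vec{A}_{f_3}) \right\rangle \\
&= \left\langle \vec{A}_{f_1} \cdot \vec{A}_{f_1} + \vec{A}_{f_2} \cdot \vec{A}_{f_2} + \vec{A}_{f_3} \cdot \vec{A}_{f_3} + 2 \vec{A}_{f_1} \cdot \vec{A}_{f_2} + 2 \vec{A}_{f_1} \cdot \vec{A}_{f_3} + 2 \vec{A}_{f_2} \cdot \vec{A}_{f_3} \right\rangle \\
&= \langle \vec{A}_{f_1} \cdot \vec{A}_{f_1} \rangle + \langle \vec{A}_{f_2} \cdot \vec{A}_{f_2} \rangle + \langle \vec{A}_{f_3} \cdot \vec{A}_{f_3} \rangle \\
&\quad + 2 \langle \vec{A}_{f_1} \cdot \vec{A}_{f_2} \rangle + 2 \langle \vec{A}_{f_1} \cdot \vec{A}_{f_3} \rangle + 2 \langle \vec{A}_{f_2} \cdot \vec{A}_{f_3} \rangle .
\end{aligned}$$

Equation (1.12j) shows that

$$\langle \vec{A}_{f_1} \cdot \vec{A}_{f_2} \rangle = 0$$

for any two distinct frequencies $f_1$ and $f_2$. The only thing different about
$\langle \vec{A}_{f_1} \cdot \vec{A}_{f_3} \rangle$ and $\langle \vec{A}_{f_2} \cdot \vec{A}_{f_3} \rangle$
is the subscripts assigned to the distinct frequencies, so the same algebra showing
that $\langle \vec{A}_{f_1} \cdot \vec{A}_{f_2} \rangle$ is zero also shows that

$$\langle \vec{A}_{f_1} \cdot \vec{A}_{f_3} \rangle = \langle \vec{A}_{f_2} \cdot \vec{A}_{f_3} \rangle = 0 .$$

Hence, the three-frequency formula for $\langle \vec{A} \cdot \vec{A} \rangle$ reduces to

$$\langle \vec{A} \cdot \vec{A} \rangle = \langle \vec{A}_{f_1} \cdot \vec{A}_{f_1} \rangle + \langle \vec{A}_{f_2} \cdot \vec{A}_{f_2} \rangle + \langle \vec{A}_{f_3} \cdot \vec{A}_{f_3} \rangle . \tag{1.13c}$$

Here again, the intensity of the beam equals the sum of the intensities of its monochromatic
wavetrains.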
This same argument can obviously be generalized to a beam consisting of N monochromatic
wavetrains. Since N may be left unspecified and can be made as large as we please, this is the
same as extending it to a beam of white light. The white-light wavefield can be written as

$$\vec{A} = \sum_{i=1}^{N} \vec{A}_{f_i} , \tag{1.14a}$$

where

$$\vec{A}_{f_i} = \hat{x}\,U_{f_i} \cos\!\left(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_U^{(i)}\right) + \hat{y}\,V_{f_i} \cos\!\left(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_V^{(i)}\right) \tag{1.14b}$$

with $f_i \neq f_j$ whenever $i \neq j$. The intensity of this beam is

$$\langle \vec{A} \cdot \vec{A} \rangle = \left\langle \left( \sum_{i=1}^{N} \vec{A}_{f_i} \right) \cdot \left( \sum_{j=1}^{N} \vec{A}_{f_j} \right) \right\rangle = \left\langle \sum_{i=1}^{N} \sum_{j=1}^{N} \vec{A}_{f_i} \cdot \vec{A}_{f_j} \right\rangle ,$$

or, applying Eq. (1.9b),

$$\langle \vec{A} \cdot \vec{A} \rangle = \sum_{i=1}^{N} \sum_{j=1}^{N} \langle \vec{A}_{f_i} \cdot \vec{A}_{f_j} \rangle . \tag{1.14c}$$
Equation (1.12j) requires

$$\langle \vec{A}_{f_i} \cdot \vec{A}_{f_j} \rangle = 0 \tag{1.14d}$$

whenever $i \neq j$, so Eq. (1.14c) reduces to

$$\langle \vec{A} \cdot \vec{A} \rangle = \langle \vec{A}_{f_1} \cdot \vec{A}_{f_1} \rangle + \langle \vec{A}_{f_2} \cdot \vec{A}_{f_2} \rangle + \cdots + \langle \vec{A}_{f_N} \cdot \vec{A}_{f_N} \rangle = \sum_{i=1}^{N} \langle \vec{A}_{f_i} \cdot \vec{A}_{f_i} \rangle \tag{1.14e}$$

because all the $i \neq j$ terms disappear. Equation (1.14e) shows that the intensity of any beam, even
a white-light beam, is the sum of the intensities of its monochromatic wavetrains. This is
sometimes called the principle of independent superposition,$^{17}$ and can be written as

$$I = I_{f_1} + I_{f_2} + \cdots + I_{f_N} = \sum_{i=1}^{N} I_{f_i} , \tag{1.14f}$$

where

$$I = \langle \vec{A} \cdot \vec{A} \rangle \tag{1.14g}$$

is the total intensity of the beam and

$$I_{f_i} = \langle \vec{A}_{f_i} \cdot \vec{A}_{f_i} \rangle \tag{1.14h}$$

is the intensity of the beam's monochromatic wavetrain of frequency $f_i$.
17 J. Chamberlain, The Principles of Interferometric Spectroscopy (John Wiley & Sons, New York, 1979), p. 98.

Returning now to Fig. 1.16, we suppose that Eqs. (1.14f)–(1.14h) refer to beam TR and
consider how to write the disturbance for beam RT. In an ideal Michelson interferometer, the
only difference between beam RT and beam TR is that the wavefields in beam RT lag behind the
wavefields in beam TR by a distance $\chi = 2p$ that is usually called the optical-path difference.
Using the notation specified in Eq. (1.14b), we see that for every monochromatic wavetrain

$$\vec{A}^{(TR)}_{f_i} = \hat{x}\,U_{f_i} \cos\!\left(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_U^{(i)}\right) + \hat{y}\,V_{f_i} \cos\!\left(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_V^{(i)}\right) \tag{1.15a}$$

in beam TR, there must be, according to Fig. 1.16, a corresponding monochromatic wavetrain

$$\vec{A}^{(RT)}_{f_i} = \hat{x}\,U_{f_i} \cos\!\left(2\pi\sigma_{f_i} (z + \chi) - 2\pi f_i t + \delta_U^{(i)}\right) + \hat{y}\,V_{f_i} \cos\!\left(2\pi\sigma_{f_i} (z + \chi) - 2\pi f_i t + \delta_V^{(i)}\right) \tag{1.15b}$$

in beam RT. The total disturbance for the combined beam's $f_i$th wavetrain is then

$$\vec{A}^{(RT)}_{f_i} + \vec{A}^{(TR)}_{f_i}$$
in Fig. 1.16. We also note, however, that the beam splitter in Fig. 1.16 is evidently not the same
sort of beam splitter as the one used by Michelson because it does not reverse the direction of the
oscillation of the TR beam the way that the beam splitter in Fig. 1.8 did. For Michelson's sort of
beam splitter, the total disturbance of the combined beam's $f_i$th wavetrain should be

$$\vec{A}^{(RT)}_{f_i} - \vec{A}^{(TR)}_{f_i}$$

according to the discussion at the end of Sec. 1.1. To accommodate both possibilities, we write
the $f_i$th wavetrain of the combined beam as

$$\vec{A}^{(cb)}_{f_i} = \vec{A}^{(RT)}_{f_i} + W \vec{A}^{(TR)}_{f_i} , \tag{1.15c}$$

where parameter W is −1 for Michelson-type beam splitters and 1 for non-Michelson beam
splitters. The superscript (cb) indicates that the disturbance $\vec{A}^{(cb)}_{f_i}$ is the $f_i$th wavetrain of two
beams combined in a balanced way; that is, each beam has undergone one transmission and one
reflection at the beam splitter. The intensity of the combined $f_i$th wavetrain is

$$\begin{aligned}
I^{(cb)}_{f_i} = \langle \vec{A}^{(cb)}_{f_i} \cdot \vec{A}^{(cb)}_{f_i} \rangle &= \left\langle \left(\vec{A}^{(RT)}_{f_i} + W \vec{A}^{(TR)}_{f_i}\right) \cdot \left(\vec{A}^{(RT)}_{f_i} + W \vec{A}^{(TR)}_{f_i}\right) \right\rangle \\
&= \left\langle \vec{A}^{(RT)}_{f_i} \cdot \vec{A}^{(RT)}_{f_i} + W^2 \vec{A}^{(TR)}_{f_i} \cdot \vec{A}^{(TR)}_{f_i} + 2W \vec{A}^{(RT)}_{f_i} \cdot \vec{A}^{(TR)}_{f_i} \right\rangle .
\end{aligned}$$

Applying Eqs. (1.9b) and (1.9c) gives

$$I^{(cb)}_{f_i} = \langle \vec{A}^{(RT)}_{f_i} \cdot \vec{A}^{(RT)}_{f_i} \rangle + \langle \vec{A}^{(TR)}_{f_i} \cdot \vec{A}^{(TR)}_{f_i} \rangle + 2W \langle \vec{A}^{(RT)}_{f_i} \cdot \vec{A}^{(TR)}_{f_i} \rangle , \tag{1.15d}$$
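where we have recognized that $W^2 = 1$ because $W = \pm 1$. Since both disturbances have the same $f_i$
frequency, Eq. (1.12j) cannot be used to say that $\langle \vec{A}^{(RT)}_{f_i} \cdot \vec{A}^{(TR)}_{f_i} \rangle$ is zero. Substituting from
(1.15a) and (1.15b) gives

$$\begin{aligned}
\langle \vec{A}^{(RT)}_{f_i} \cdot \vec{A}^{(TR)}_{f_i} \rangle = \Big\langle\; & U_{f_i}^2 \cos\!\left(2\pi\sigma_{f_i}(z + \chi) - 2\pi f_i t + \delta_U^{(i)}\right) \cos\!\left(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_U^{(i)}\right) \\
& + V_{f_i}^2 \cos\!\left(2\pi\sigma_{f_i}(z + \chi) - 2\pi f_i t + \delta_V^{(i)}\right) \cos\!\left(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_V^{(i)}\right) \Big\rangle ,
\end{aligned}$$

or

$$\begin{aligned}
\langle \vec{A}^{(RT)}_{f_i} \cdot \vec{A}^{(TR)}_{f_i} \rangle = \; & U_{f_i}^2 \left\langle \cos\!\left(2\pi\sigma_{f_i} z + 2\pi\sigma_{f_i}\chi - 2\pi f_i t + \delta_U^{(i)}\right) \cos\!\left(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_U^{(i)}\right) \right\rangle \\
& + V_{f_i}^2 \left\langle \cos\!\left(2\pi\sigma_{f_i} z + 2\pi\sigma_{f_i}\chi - 2\pi f_i t + \delta_V^{(i)}\right) \cos\!\left(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_V^{(i)}\right) \right\rangle .
\end{aligned} \tag{1.15e}$$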

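Formula (1.12f) shows that

$$\begin{aligned}
& \left\langle \cos\!\left(2\pi\sigma_{f_i} z + 2\pi\sigma_{f_i}\chi - 2\pi f_i t + \delta_U^{(i)}\right) \cos\!\left(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_U^{(i)}\right) \right\rangle \\
& \qquad = \left\langle \frac{1}{2} \cos\!\left(4\pi\sigma_{f_i} z + 2\pi\sigma_{f_i}\chi - 4\pi f_i t + 2\delta_U^{(i)}\right) + \frac{1}{2} \cos\!\left(2\pi\sigma_{f_i}\chi\right) \right\rangle .
\end{aligned}$$

Applying (1.9b) and (1.9c), we get that

$$\begin{aligned}
& \left\langle \cos\!\left(2\pi\sigma_{f_i} z + 2\pi\sigma_{f_i}\chi - 2\pi f_i t + \delta_U^{(i)}\right) \cos\!\left(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_U^{(i)}\right) \right\rangle \\
& \qquad = \frac{1}{2} \left\langle \cos\!\left(4\pi\sigma_{f_i} z + 2\pi\sigma_{f_i}\chi - 4\pi f_i t + 2\delta_U^{(i)}\right) \right\rangle + \frac{1}{2} \left\langle \cos\!\left(2\pi\sigma_{f_i}\chi\right) \right\rangle .
\end{aligned} \tag{1.15f}$$

The time average of any time-independent quantity equals that quantity; that is,

$$\langle K \rangle = K \tag{1.15g}$$

for any constant K. Equation (1.11) shows that

$$\left\langle \cos\!\left(4\pi\sigma_{f_i} z + 2\pi\sigma_{f_i}\chi - 4\pi f_i t + 2\delta_U^{(i)}\right) \right\rangle = 0 .$$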

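These two results can be substituted into (1.15f) to get

$$\left\langle \cos\!\left(2\pi\sigma_{f_i} z + 2\pi\sigma_{f_i}\chi - 2\pi f_i t + \delta_U^{(i)}\right) \cos\!\left(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_U^{(i)}\right) \right\rangle = \frac{1}{2} \cos\!\left(2\pi\sigma_{f_i}\chi\right) . \tag{1.15h}$$

Replacing $\delta_U^{(i)}$ by $\delta_V^{(i)}$ does not change the algebra used to derive (1.15h). It follows that

$$\left\langle \cos\!\left(2\pi\sigma_{f_i} z + 2\pi\sigma_{f_i}\chi - 2\pi f_i t + \delta_V^{(i)}\right) \cos\!\left(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_V^{(i)}\right) \right\rangle = \frac{1}{2} \cos\!\left(2\pi\sigma_{f_i}\chi\right) . \tag{1.15i}$$

Substituting (1.15h) and (1.15i) into (1.15e) now gives

$$\langle \vec{A}^{(RT)}_{f_i} \cdot \vec{A}^{(TR)}_{f_i} \rangle = \frac{1}{2} \left(U_{f_i}^2 + V_{f_i}^2\right) \cos\!\left(2\pi\sigma_{f_i}\chi\right) , \tag{1.15j}$$

and this result can be put into (1.15d) to get

$$I^{(cb)}_{f_i} = \langle \vec{A}^{(RT)}_{f_i} \cdot \vec{A}^{(RT)}_{f_i} \rangle + \langle \vec{A}^{(TR)}_{f_i} \cdot \vec{A}^{(TR)}_{f_i} \rangle + W \left(U_{f_i}^2 + V_{f_i}^2\right) \cos\!\left(2\pi\sigma_{f_i}\chi\right) . \tag{1.15k}$$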

For an ideal Michelson interferometer, the intensity of the $f_i$th monochromatic wavetrain in
the RT beam and the intensity of the $f_i$th monochromatic wavetrain in the TR beam must be
identical because they arise in a symmetric way from the $f_i$th wavetrain of the white-light beam
entering the instrument. We can imagine taking out the moving mirror from its interferometer
arm so that only the TR beam is reflected back to the beam splitter. This means that only the
$\vec{A}^{(TR)}_{f_i}$ monochromatic disturbance leaves the interferometer in the proper direction, and its
intensity is, of course, $\langle \vec{A}^{(TR)}_{f_i} \cdot \vec{A}^{(TR)}_{f_i} \rangle$. Taking out the fixed mirror in the other arm and
replacing the moving mirror in the first arm ensures that only the RT beam reflects back to the
beam splitter. Now $\langle \vec{A}^{(RT)}_{f_i} \cdot \vec{A}^{(RT)}_{f_i} \rangle$ is the intensity of the monochromatic disturbance leaving
the interferometer in the proper direction. Since we have just said that these two intensities must
be equal, it follows that

$$\langle \vec{A}^{(RT)}_{f_i} \cdot \vec{A}^{(RT)}_{f_i} \rangle = \langle \vec{A}^{(TR)}_{f_i} \cdot \vec{A}^{(TR)}_{f_i} \rangle . \tag{1.16a}$$

Equation (1.10d) holds true for any monochromatic wavetrain $\vec{A}_f$ of frequency f, so it must
apply to wavetrain $\vec{A}^{(TR)}_{f_i}$ of frequency $f_i$. Hence, Eq. (1.15a) must mean that

$$\langle \vec{A}^{(TR)}_{f_i} \cdot \vec{A}^{(TR)}_{f_i} \rangle = \frac{1}{2} \left(U_{f_i}^2 + V_{f_i}^2\right) . \tag{1.16b}$$

Equation (1.10d) also applies to wavetrain $\vec{A}^{(RT)}_{f_i}$ of frequency $f_i$ in Eq. (1.15b), which
similarly leads to

$$\langle \vec{A}^{(RT)}_{f_i} \cdot \vec{A}^{(RT)}_{f_i} \rangle = \frac{1}{2} \left(U_{f_i}^2 + V_{f_i}^2\right) . \tag{1.16c}$$

The right-hand sides of (1.16b) and (1.16c) are the same, which makes sense since the left-hand
sides of (1.16b) and (1.16c) must satisfy Eq. (1.16a).
Again taking out the moving mirror, we note that then, in an ideal interferometer, one quarter
of the entering beam's power ends up leaving the interferometer as beam TR traveling along the z
axis in Fig. 1.16. Hence, if $I^{(0)}_{f_i}$ is the intensity of the $f_i$th monochromatic wavetrain entering this
interferometer, we must have

$$\langle \vec{A}^{(TR)}_{f_i} \cdot \vec{A}^{(TR)}_{f_i} \rangle = \frac{1}{4} I^{(0)}_{f_i} . \tag{1.17a}$$

Consulting Eq. (1.16a), we see that this means

$$\langle \vec{A}^{(RT)}_{f_i} \cdot \vec{A}^{(RT)}_{f_i} \rangle = \frac{1}{4} I^{(0)}_{f_i} \tag{1.17b}$$

and, of course, Eqs. (1.16b) and (1.16c) then reveal that

$$I^{(0)}_{f_i} = 2\left(U_{f_i}^2 + V_{f_i}^2\right) . \tag{1.17c}$$

Substituting Eqs. (1.17a)–(1.17c) into (1.15k) then leads to

$$I^{(cb)}_{f_i} = \frac{1}{2} I^{(0)}_{f_i} + \frac{W}{2} I^{(0)}_{f_i} \cos\!\left(2\pi\sigma_{f_i}\chi\right)$$

or

$$I^{(cb)}_{f_i} = \frac{1}{2} I^{(0)}_{f_i} \left[ 1 + W \cos\!\left(2\pi\sigma_{f_i}\chi\right) \right] . \tag{1.17d}$$
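Equation (1.17d) is simple enough to evaluate directly. The following minimal sketch tabulates the balanced output as the mirror moves in quarter-wavelength steps; the function name, the wavelength, and the unit incident intensity are assumptions made for illustration.

```python
import math


def balanced_output(i0, sigma, chi, w):
    """Eq. (1.17d): I_cb = (1/2) * I0 * (1 + W * cos(2*pi*sigma*chi))."""
    if w not in (1, -1):
        raise ValueError("W must be +1 or -1")
    return 0.5 * i0 * (1.0 + w * math.cos(2.0 * math.pi * sigma * chi))


lam = 5.0e-7            # illustrative wavelength in meters
sigma = 1.0 / lam       # wavenumber, Eq. (1.7b)
i0 = 1.0                # incident intensity in arbitrary units

for n_half in range(5):
    chi = 0.5 * n_half * lam          # optical-path difference in half-wavelength steps
    p = 0.5 * chi                     # mirror displacement, since chi = 2p
    row = [balanced_output(i0, sigma, chi, w) for w in (1, -1)]
    print(f"p = {p:.3e} m: W=+1 -> {row[0]:.3f}, W=-1 -> {row[1]:.3f}")
```

Each quarter-wavelength of mirror travel changes the optical-path difference by half a wavelength and swaps the output between all-bright and all-dark, with the two signs of W giving complementary patterns.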

Equation (1.17d) is the basic equation for the intensity of a monochromatic wavetrain leaving
an ideal Michelson interferometer when the intensity of the corresponding wavetrain entering the
interferometer is $I^{(0)}_{f_i}$ and the moving mirror is displaced from its ZPD position by a distance
$p = \chi/2$, as shown in Fig. 1.16. We note that for those values of $\chi = 2p$ where
$W \cos(2\pi\sigma_{f_i}\chi) = 1$, the intensity of the $f_i$th monochromatic wavetrain leaving the interferometer is
the same as the intensity of the $f_i$th monochromatic wavetrain entering the interferometer. This
corresponds to constructive interference of the $f_i$th monochromatic component of the RT and TR
beams. Suppose the beam entering the interferometer consists of just this one monochromatic
component. Glancing back at Fig. 1.1(b), we see that the power of the beam entering an ideal
Michelson interferometer can leave by either the combined RT and TR dotted beams or by the
two combined dash-dot beams traveling in the opposite direction to the incident beam. The dotted
beams are often called the balanced output of the interferometer, because each one has undergone
one transmission and one reflection at the beam splitter; similarly, the dash-dot beams are called
the unbalanced output, because one beam has undergone two reflections and the other beam has
undergone two transmissions. Conservation of energy requires that the power in all the
monochromatic beams leaving the ideal interferometer must equal the power in the one
monochromatic beam entering the interferometer. Hence, when constructive interference of the
balanced RT and TR beams makes their combined intensity equal to that of the beam entering the
interferometer, we know that destructive interference of the two unbalanced beams must make
their combined intensity equal to zero. Consequently, at each $\chi = 2p$ value where
$W \cos(2\pi\sigma_{f_i}\chi) = 1$, not only is the intensity of the balanced monochromatic beams the same as
that of the monochromatic beam entering the interferometer, but also the intensity of the
unbalanced monochromatic beams is zero. On the other hand, for moving-mirror positions where
$\chi = 2p$ has a value such that $W \cos(2\pi\sigma_{f_i}\chi) = -1$, the intensity of the combined monochromatic
RT and TR beams in Fig. 1.1(b) is zero according to Eq. (1.17d). At these moving-mirror
locations, the balanced output undergoes destructive interference. Conservation of energy then
requires the unbalanced output to undergo constructive interference and have the same intensity
as the monochromatic beam entering the interferometer.
This analysis can be generalized to any mirror position and value of $\chi = 2p$. If $I^{(cu)}_{f_i}$ is the
intensity of the unbalanced monochromatic wavetrain and, as before, $I^{(0)}_{f_i}$ and $I^{(cb)}_{f_i}$ are the
intensities of the incident monochromatic wavetrain and balanced monochromatic wavetrain
respectively, then conservation of energy forces us to write

$$I^{(0)}_{f_i} = I^{(cb)}_{f_i} + I^{(cu)}_{f_i} . \tag{1.18a}$$

Substituting from Eq. (1.17d), we get

$$I^{(0)}_{f_i} = \frac{1}{2} I^{(0)}_{f_i} \left[ 1 + W \cos\!\left(2\pi\sigma_{f_i}\chi\right) \right] + I^{(cu)}_{f_i} ,$$

which can be solved for $I^{(cu)}_{f_i}$ to get

$$I^{(cu)}_{f_i} = \frac{1}{2} I^{(0)}_{f_i} \left[ 1 - W \cos\!\left(2\pi\sigma_{f_i}\chi\right) \right] . \tag{1.18b}$$

This specifies the intensity of the $f_i$th monochromatic wavetrain in the unbalanced output of an
ideal Michelson interferometer.
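The energy bookkeeping behind Eqs. (1.18a) and (1.18b) can be checked with a short sketch. The function names and numerical values are illustrative assumptions; W = −1 is taken here to represent a Michelson-type beam splitter.

```python
import math


def balanced_output(i0, sigma, chi, w):
    """Eq. (1.17d): balanced-output intensity of one monochromatic wavetrain."""
    return 0.5 * i0 * (1.0 + w * math.cos(2.0 * math.pi * sigma * chi))


def unbalanced_output(i0, sigma, chi, w):
    """Eq. (1.18b): unbalanced-output intensity of the same wavetrain."""
    return 0.5 * i0 * (1.0 - w * math.cos(2.0 * math.pi * sigma * chi))


i0 = 2.5                 # incident intensity, arbitrary units (illustrative)
sigma = 2.0e6            # wavenumber in 1/m, i.e. a 500 nm wavelength (illustrative)
w = -1                   # assumed here to represent a Michelson-type beam splitter

for chi in (0.0, 1.0e-7, 2.5e-7, 3.3e-7, 5.0e-7):
    i_cb = balanced_output(i0, sigma, chi, w)
    i_cu = unbalanced_output(i0, sigma, chi, w)
    print(f"chi = {chi:.2e} m: I_cb = {i_cb:.4f}, I_cu = {i_cu:.4f}, sum = {i_cb + i_cu:.4f}")
```

Whatever the optical-path difference, the two outputs add up to the incident intensity, so light missing from one output has simply left through the other.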
The dashed lines in Fig. 1.17 show the positions of the moving mirror at which

$$\chi = \ldots,\ \frac{n}{\sigma_{f_i}},\ \frac{n+1}{\sigma_{f_i}},\ \frac{n+2}{\sigma_{f_i}},\ \ldots .$$

These are the positions where $I^{(cb)}_{f_i} = 0$ in Eq. (1.17d) when W = −1 for an interferometer using a
Michelson-type beam splitter. This can also be written as, substituting from Eq. (1.7b),

$$\chi = \ldots,\ n\lambda_{f_i},\ (n+1)\lambda_{f_i},\ (n+2)\lambda_{f_i},\ \ldots ,$$

where $\lambda_{f_i}$ is the wavelength of the $f_i$th monochromatic wavetrain. For beam splitters where
W = 1, of course, these dashed lines represent the moving-mirror positions at which
$I^{(cb)}_{f_i} = I^{(0)}_{f_i}$. If
the moving mirror is slightly tilted, so that its surface crosses more than one dashed line, and the
beam entering the interferometer contains only the $f_i$th monochromatic wavetrain, then the
combined RT and TR beams leaving the interferometer have light and dark strips as the surface
of the tilted mirror crosses through those planes in space where an untilted mirror would produce
an all-bright or an all-dark balanced output. This connects Eq. (1.17d) to the bright and null
fringe patterns from a spectral line discussed in Sec. 1.4.
When a beam of white light passes through the interferometer (that is, a beam having many
different frequencies), the principle of independent superposition in Eq. (1.14f) requires the
intensity of the interferometer's balanced output to be the sum of the intensities of each
monochromatic wavetrain,

$$I^{(cb)} = \sum_{i=1}^{N} I^{(cb)}_{f_i} ,$$

which becomes, substituting from Eq. (1.17d),

$$I^{(cb)} = \sum_{i=1}^{N} \frac{1}{2} I^{(0)}_{f_i} \left[ 1 + W \cos\!\left(2\pi\sigma_{f_i}\chi\right) \right] . \tag{1.19a}$$
FIGURE 1.17. [Diagram labels: nth crossing, (n + 1)st crossing, (n + 2)nd crossing, (n + 3)rd crossing; positions where $\chi = n\lambda_{f_i}$, $\chi = (n+1)\lambda_{f_i}$, $\chi = (n+2)\lambda_{f_i}$, $\chi = (n+3)\lambda_{f_i}$; distance between dashed lines is $\lambda_{f_i}/2$.]

When describing natural sources of light, we often replace sums of discrete quantities with integrals over continuous functions, and this transformation was perhaps even more characteristic of late 19th-century science than it is of today's physics. So it would be an automatic process for Michelson and his contemporaries to define a spectral intensity function $I^{(0)}(f)$ to describe the radiation entering the instrument. When using this sort of mathematical formalism, we say that $I^{(0)}(f)\,df$ is the optical intensity of all the radiation having frequency values between $f$ and $f + df$ entering the interferometer. The intensity of the balanced output is then

$$I^{(cb)} = \int_0^\infty \frac{1}{2}\, I^{(0)}(f) \left[ 1 + W \cos\left( 2\pi \sigma_f \chi \right) \right] df. \tag{1.19b}$$

The physical meaning of Eq. (1.19b) is exactly the same as Eq. (1.19a); we have just replaced $I^{(0)}_{f_i}$ by $I^{(0)}(f)\,df$ and changed the sum to an integral. We have also relied on variable $f$ itself instead of index $i$ to label the different frequencies. To make this last tactic work, we just assume that $I^{(0)}(f)$ is zero for those frequencies $f$ that are not part of the original sum over $i$; this also lets us specify the integral to be over all possible frequencies $f$ between 0 and $\infty$. The wavenumber $\sigma_f$ can be eliminated by substituting from the formula for $\sigma_f$ in (1.7d) to get

$$I^{(cb)} = \int_0^\infty \frac{1}{2}\, I^{(0)}(f) \left[ 1 + W \cos\left( \frac{2\pi f \chi}{c} \right) \right] df. \tag{1.19c}$$

The only problem with this equation is the unreasonably high numbers required to represent $f$ at optical frequencies; when going from one extreme to the other across the visible spectrum, for example, frequency $f$ changes from $4 \times 10^{14}$ Hz to $7.5 \times 10^{14}$ Hz (approximately). Consequently, today's Fourier spectroscopists often use Eq. (1.7d) to eliminate $f$ rather than $\sigma_f$ from Eq. (1.19b). To do this, we drop the subscript on the wavenumber, differentiate both sides of (1.7d) to get

$$df = c\,d\sigma \quad \text{or} \quad d\sigma = \frac{1}{c}\,df,$$

and define

$$S(\sigma) = c\,I^{(0)}(c\sigma) \tag{1.19d}$$

so that

$$S(\sigma)\,d\sigma = c\,I^{(0)}(c\sigma)\,\frac{1}{c}\,df$$

simplifies to

$$S(\sigma)\,d\sigma = I^{(0)}(c\sigma)\,df. \tag{1.19e}$$

Now Eq. (1.7d) can be applied to (1.19c) to get

$$I^{(cb)} = \int_0^\infty \frac{1}{2}\, S(\sigma) \left[ 1 + W \cos\left( 2\pi \sigma \chi \right) \right] d\sigma. \tag{1.19f}$$
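The change of variable from frequency to wavenumber can be checked numerically. The sketch below is only an illustration under stated assumptions: the Gaussian `spectral_intensity`, its center and width, the finite integration limits, and the midpoint-rule helper are all hypothetical choices, not part of the original derivation. It evaluates the right-hand sides of Eqs. (1.19c) and (1.19f) for the same path difference and should return the same number from both.

```python
import math

C = 2.99792458e8  # speed of light in m/s


def spectral_intensity(f, f0=5.0e14, width=2.0e13):
    """Hypothetical Gaussian I^(0)(f), chosen only for illustration."""
    return math.exp(-((f - f0) / width) ** 2)


def midpoint_integral(func, lo, hi, n=20000):
    """Midpoint-rule approximation to the integral of func from lo to hi."""
    step = (hi - lo) / n
    return step * sum(func(lo + (k + 0.5) * step) for k in range(n))


def balanced_output_frequency(chi, w=1.0, f_lo=3.0e14, f_hi=7.0e14):
    """Eq. (1.19c): integrate over frequency f (Hz); chi is in metres."""
    def integrand(f):
        fringe = math.cos(2.0 * math.pi * f * chi / C)
        return 0.5 * spectral_intensity(f) * (1.0 + w * fringe)
    return midpoint_integral(integrand, f_lo, f_hi)


def balanced_output_wavenumber(chi, w=1.0, f_lo=3.0e14, f_hi=7.0e14):
    """Eq. (1.19f): integrate over wavenumber sigma = f / c (1/m)."""
    def s(sigma):
        return C * spectral_intensity(C * sigma)  # Eq. (1.19d)

    def integrand(sigma):
        fringe = math.cos(2.0 * math.pi * sigma * chi)
        return 0.5 * s(sigma) * (1.0 + w * fringe)
    return midpoint_integral(integrand, f_lo / C, f_hi / C)


chi = 1.0e-6  # a path difference of one micrometre
print(balanced_output_frequency(chi), balanced_output_wavenumber(chi))
```

The infinite upper limit is replaced by a finite band outside of which the assumed Gaussian is negligible; a different spectral shape would need its own limits.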

To get the white-light intensity formulas for the unbalanced output, we can apply to the unbalanced monochromatic formula the same analysis used on the balanced monochromatic formula. Comparing the unbalanced formula (1.18b) to the balanced formula (1.17d), we see that changing the sign of $W$ is all that needs to be done to go from the balanced formula to the unbalanced formula. Hence, when we apply to the unbalanced formula the same algebra used on the balanced formula, we know that all the way through the derivation (and, of course, in the final results) the only difference would be that $W$ is replaced by $-W$. Consequently, we can write down at once the unbalanced white-light formulas corresponding to (1.19b), (1.19c), and (1.19f) as

$$I^{(cu)} = \int_0^\infty \frac{1}{2}\, I^{(0)}(f) \left[ 1 - W \cos\left( 2\pi \sigma_f \chi \right) \right] df, \tag{1.20a}$$

$$I^{(cu)} = \int_0^\infty \frac{1}{2}\, I^{(0)}(f) \left[ 1 - W \cos\left( \frac{2\pi f \chi}{c} \right) \right] df, \tag{1.20b}$$

and

$$I^{(cu)} = \int_0^\infty \frac{1}{2}\, S(\sigma) \left[ 1 - W \cos\left( 2\pi \sigma \chi \right) \right] d\sigma \tag{1.20c}$$

respectively. Formulas (1.19b), (1.19c), and (1.19f) contain all the basic information needed to understand how Fourier-transform spectroscopy works, and they were derived here using only those facts that Michelson knew over 100 years ago about the nature of light. Unfortunately, they apply only to an ideal interferometer; not surprisingly, the 19th-century approach used to derive them is difficult to adapt to the study of both the random and nonrandom errors present in even the most accurate of today's Michelson interferometers. For this reason, in Chapter 4 we return to basic principles and rederive the formula for $I^{(cb)}$ starting from the modern form of Maxwell's equations, this time being careful to include all the nonideal terms needed for the error analysis. Formula (1.19f) is, however, already good enough (if we borrow several mathematical results from Chapter 2) to explain why the fringes from even the thinnest of spectral lines discussed in Sec. 1.4 must eventually fade away as $\chi = 2p$ increases.

1.6 Fringe Patterns of Finite-Width Spectral Lines
Finite-width spectral lines, such as the one in the top graph of Fig. 1.18, can be represented by a spectral intensity function $I^{(0)}(f)$. We can also follow the standard practice of Fourier spectroscopists and represent the finite-width spectral line by the $S(\sigma)$ function defined in Eq. (1.19d) and plotted in the bottom graph of Fig. 1.18. If the intensity of a spectral line is described by a narrow $I^{(0)}(f)$ function such as the one in the top graph of Fig. 1.18, which is significantly different from zero only between two very closely spaced frequencies $f_1$ and $f_2$, then the corresponding $S(\sigma)$ curve is significantly different from zero only between the two closely spaced wavenumbers $\sigma_1 = f_1/c$ and $\sigma_2 = f_2/c$, as shown in the bottom graph of Fig. 1.18.
The right-hand side of Eq. (1.19f) can be split up into the sum of a constant term and a term that changes as the location coordinate $p = \chi/2$ of the moving mirror changes,

$$I^{(cb)} = \frac{1}{2} \int_0^\infty S(\sigma)\,d\sigma + \frac{W}{2} \int_0^\infty S(\sigma) \cos\left( 2\pi \sigma \chi \right) d\sigma. \tag{1.21a}$$

Since $\sigma \ge 0$ in the integrals over $d\sigma$, nothing stops us from replacing $S(\sigma)$ by $S(|\sigma|)$ in the second term to get

$$\int_0^\infty S(\sigma) \cos\left( 2\pi \sigma \chi \right) d\sigma = \int_0^\infty S(|\sigma|) \cos\left( 2\pi \sigma \chi \right) d\sigma. \tag{1.21b}$$

Anticipating some of the Fourier material in Chapter 2, we note that, according to Eq. (2.11a) in Chapter 2, function $S(|\sigma|)$ is even because

$$S(|-\sigma|) = S(|\sigma|),$$

and, of course, it is real because it represents a real physical quantity: the intensity of the spectral line. Turning next to Eq. (2.34g) in Chapter 2, we see that because $S(|\sigma|)$ is a real and even function, the cosine integral on the right-hand side of Eq. (1.21b) is one half of the Fourier transform of $S$ [if we specify that parameter $\sigma$ in (1.21b) corresponds to variable $t$ in (2.34g) and that parameter $\chi$ in (1.21b) corresponds to variable $f$ in (2.34g)]. Anticipating the material in Chapter 2 one last time, we consult Eq. (2.35k) and note that if the $n$th derivative of $S$ has a well-defined Fourier transform, then for large values of its argument the Fourier transform of $S$ approaches zero at least as fast as the inverse $n$th power of the absolute value of its argument. Since $S$ describes a spectral line (that is, a natural phenomenon), we expect it to have derivatives of all orders and also expect those derivatives to have Fourier transforms. The argument of the Fourier transform of $S$ is $\chi$, and we already know that the right-hand side of (1.21b) is half the Fourier transform of $S$, so we can now conclude that

$$\int_0^\infty S(\sigma) \cos\left( 2\pi \sigma \chi \right) d\sigma = \int_0^\infty S(|\sigma|) \cos\left( 2\pi \sigma \chi \right) d\sigma = O\left( \chi^{-n} \right) \tag{1.21c}$$

for positive values of $n$ as $\chi \to \infty$. Applying this to Eq. (1.21a) shows that

$$I^{(cb)} = \frac{1}{2} \int_0^\infty S(\sigma)\,d\sigma + O\left( \chi^{-n} \right) \tag{1.21d}$$

for large values of $\chi$. Hence, as the moving mirror gets further and further from its ZPD location, increasing the value of $\chi = 2p$, the value of $I^{(cb)}$ eventually stops changing and approaches the constant value

$$\lim_{\chi \to \infty} I^{(cb)} = \frac{1}{2} \int_0^\infty S(\sigma)\,d\sigma. \tag{1.21e}$$

This happens for all types of intensity curves, not just those associated with spectral lines. If $S$ does represent a spectral line such as the one in Fig. 1.18, the brights and nulls associated with the dashed lines in Fig. 1.17 eventually fade away. Consequently, no matter how the moving mirror is tilted, no fringes can be seen. If the Michelson interferometer is being used as a ruler, the fringe counting must stop. When the spectral line is a closely spaced multiplet, each line in the group has a finite spectral width, ensuring that, no matter how the lines interact with each other to form bright and dim regions in the overall fringe pattern, eventually any and all fringe traces must disappear. Every spectral line found in nature produces light having some finite spectral width, no matter how small, so this sort of fade-out is a universal phenomenon.
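The fade-out can be seen numerically with a short sketch. Everything specific here is an assumption made for illustration: the Gaussian `line_shape`, its center wavenumber and width, the finite integration band, and the midpoint rule. The code evaluates the cosine integral from Eq. (1.21b) at path differences that sit on fringe crests and shows the crest height shrinking as the path difference grows.

```python
import math


def line_shape(sigma, sigma0=1.0e6, width=1.0e3):
    """Hypothetical Gaussian spectral line S(sigma); sigma in 1/m."""
    return math.exp(-((sigma - sigma0) / width) ** 2)


def modulated_term(chi, sigma0=1.0e6, width=1.0e3, n=4000):
    """Midpoint-rule estimate of the cosine integral in Eq. (1.21b).

    The infinite range is truncated to sigma0 +/- 8 widths, outside of
    which the assumed Gaussian is negligible.
    """
    lo, hi = sigma0 - 8.0 * width, sigma0 + 8.0 * width
    step = (hi - lo) / n
    total = 0.0
    for k in range(n):
        sigma = lo + (k + 0.5) * step
        total += line_shape(sigma, sigma0, width) * math.cos(2.0 * math.pi * sigma * chi)
    return step * total


# Path differences equal to whole numbers of line-centre wavelengths,
# so each evaluation lands on the crest of a fringe.
for wavelengths in (0, 250, 500, 1000):
    chi = wavelengths / 1.0e6
    print(f"chi = {chi:.2e} m, fringe crest = {modulated_term(chi):.4e}")
```

For this particular Gaussian the crest height falls off as $\exp[-(\pi \chi\,\Delta)^2]$, where $\Delta$ is the assumed width, which is faster than any inverse power of $\chi$ and therefore consistent with Eq. (1.21c) for every positive $n$.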
1.7 Fourier-Transform Spectrometers
In Michelson's time there was no easy way to measure the intensity of the exit beam leaving the interferometer, so it was not practical to measure the change in $I^{(cb)}$ as a function of $\chi = 2p$ in order to determine the $\chi$-dependent curve,

$$\int_0^\infty S(\sigma) \cos\left( 2\pi \sigma \chi \right) d\sigma,$$

coming from the second term on the right-hand side of Eq. (1.21a). In the previous section we found that this curve is half the Fourier transform of $S$. This means that if the curve could be measured, then the Fourier transform could be reversed to get the shape of the $S$ spectrum entering the interferometer. In the 1950s, both optical detectors to measure $I^{(cb)}$ and digital computers to reverse the Fourier transform became widely available. Spectroscopists began to design and build spectrometers based on measuring $I^{(cb)}$ as a function of $\chi$ and then reversing the Fourier transform to find $S$. Today, these sorts of instruments are usually called Fourier-transform spectrometers.

FIGURE 1.18. Top graph: spectral intensity $I^{(0)}(f)$ plotted against frequency $f$, nonzero between $f_1$ and $f_2$. Bottom graph: $S(\sigma) = cI^{(0)}(c\sigma)$ plotted against wavenumber $\sigma$, nonzero between $\sigma_1 = f_1/c$ and $\sigma_2 = f_2/c$.

Equation (1.21a) is an idealized form of the fundamental equation of Fourier-transform spectroscopy. It describes the intensity of the beam leaving an interferometer whenever we
1) Divide the beam into equal-amplitude secondary beams, and
2) Recombine the two secondary beams after the wavefield of one is shifted a distance $\chi$ with respect to the wavefield of the other.
Although this is exactly what happens inside a standard Michelson interferometer, Figs. 1.19(a) through 1.19(d) show that there are many other combinations of beam splitters and mirrors that divide and recombine beams in this way.$^{18}$

Figure 1.19(a) shows the first and perhaps most obvious modification. Michelson put the arms
of his interferometer at right angles to maximize the fringe shift due to the ether wind thought to
exist by 19th-century scientists. If all that is desired, however, is to divide and recombine beams,
then the two arms can be at any (reasonable) angle with respect to each other, as shown in Fig.
1.19(a). The setup in Fig. 1.19(a) may in fact have some advantages over the standard Michelson
interferometer; arranging for near-normal reflections off the beam splitter usually modifies the
polarization of the wavefields less than large-angle reflections (see Sec. 4.4 of Chapter 4 for an
explanation of polarization).
Figure 1.19(b) shows that the end mirrors can be replaced by retroreflectors like corner cubes or cat's-eyes. For best results, both arms should have the same type of retroreflector.
The discussion following Eq. (1.17d) above explains the difference between the balanced and unbalanced optical outputs leaving the standard Michelson interferometer. In Figs. 1.19(a) and 1.19(b), the unbalanced output cannot be detected because it goes back out along the entrance beam, making it impossible to separate the two. The interferometer in Fig. 1.19(c), however, shows that there are ways to keep the entrance beam separate from the unbalanced output, giving us access to both the balanced and unbalanced optical signals. According to Eqs. (1.19f) and (1.20c), if $I^{(cb)}$ is the intensity of the balanced output and $I^{(cu)}$ is the intensity of the unbalanced output, then

$$I^{(cb)} - I^{(cu)} = W \int_0^\infty S(\sigma) \cos\left( 2\pi \sigma \chi \right) d\sigma \tag{1.22a}$$

and

$$I^{(cb)} + I^{(cu)} = \int_0^\infty S(\sigma)\,d\sigma. \tag{1.22b}$$

$^{18}$ To keep things simple, compensation plates and other secondary optical components have been omitted.

Equation (1.22a) shows that subtracting the output of the detectors measuring the balanced and unbalanced signals eliminates the constant term and doubles the size of the signal component containing the Fourier transform. Adding the detectors' outputs in Eq. (1.22b) eliminates the Fourier transform, producing the integrated spectral intensity of the entrance beam. This integrated source intensity should, of course, remain constant during a spectral measurement because Fourier-transform spectrometers are vulnerable to source fluctuations. Astronomers often design their Fourier-transform spectrometers so that both the balanced and unbalanced outputs are available. When they investigate the spectra of weak and fluctuating sources (such as twinkling stars), these instruments allow them both to double the signal from, and to check the constancy of, the radiances being measured. If the source fluctuates, formula (1.22b) can be used to measure the fluctuation. Sometimes this allows the astronomer to rescale the Fourier signal in (1.22a) to correct the spectral measurement.
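A minimal sketch of how the two outputs might be combined follows. It is not a description of any particular instrument: the function name `combine_outputs`, the synthetic monochromatic source, the brightness values, and especially the choice of dividing the difference by the sum are assumptions made for illustration. The ratio removes a fluctuation only if the source brightens or dims by the same factor at every wavenumber.

```python
import math


def combine_outputs(balanced, unbalanced):
    """Return (difference, total, ratio) lists for paired detector readings.

    difference follows Eq. (1.22a), total follows Eq. (1.22b), and the
    ratio difference / total is one hypothetical way to rescale the signal.
    """
    difference = [b - u for b, u in zip(balanced, unbalanced)]
    total = [b + u for b, u in zip(balanced, unbalanced)]
    ratio = [d / t for d, t in zip(difference, total)]
    return difference, total, ratio


# Synthetic monochromatic source at wavenumber SIGMA0 whose overall
# brightness drifts from sample to sample (values invented for this demo).
W = 0.9
SIGMA0 = 1.0e5                                        # 1/m
path_differences = [k * 1.0e-6 for k in range(8)]     # metres
brightness = [1.0, 1.3, 0.8, 1.1, 0.6, 1.4, 0.9, 1.2]

balanced, unbalanced = [], []
for chi, g in zip(path_differences, brightness):
    fringe = W * math.cos(2.0 * math.pi * SIGMA0 * chi)
    balanced.append(0.5 * g * (1.0 + fringe))     # Eq. (1.19f), one line
    unbalanced.append(0.5 * g * (1.0 - fringe))   # Eq. (1.20c), one line

difference, total, ratio = combine_outputs(balanced, unbalanced)
for chi, t, r in zip(path_differences, total, ratio):
    print(f"chi = {chi:.1e} m, sum = {t:.3f}, rescaled signal = {r:+.4f}")
```

In this synthetic case the sum reproduces the drifting brightness sample by sample, and the rescaled signal reduces to $W\cos(2\pi\sigma_0\chi)$ regardless of the drift.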
In a standard Michelson interferometer such as the one shown in Fig. 1.1(b), and in the setups shown in Figs. 1.19(a) through 1.19(c), the wavefield of one recombining beam is displaced a distance $\chi$ with respect to the wavefield of the other whenever the moving mirror or corner cube is displaced from ZPD by a distance $\chi/2$. In Fig. 1.19(d), however, the corner cube only has to move a distance $\chi/4$ to displace one wavefield by $\chi$ with respect to the other. Equation (5.67) in Chapter 5 shows that larger values of $\chi$ lead to more detailed spectral measurements in standard Michelson interferometers, and the same holds true for the nonstandard interferometers discussed here. In particular, a setup such as the one shown in Fig. 1.19(d) lets us achieve larger $\chi$ values with smaller displacements of the corner cube. The moving corner cube is also, strictly speaking, no longer the retroreflector; plane mirrors in both arms are used to reverse the beam directions.
During the 1950s, it was established that Fourier-transform spectrometers had two basic advantages, often called the Jacquinot advantage and the Fellgett advantage, over contemporary types of prism-based and grating-based spectrometers.$^{19}$ These advantages revealed that under many circumstances spectra measured by Fourier-transform spectrometers had a better signal-to-noise ratio than those measured by equivalent prism-based or grating-based instruments. With the popularization of the fast Fourier transform (FFT) algorithms in the 1960s, Fourier-transform spectrometers soon established themselves as usually the first and best choice for measuring infrared spectra (electromagnetic radiation having wavelengths between 1 and 100 μm). The growing availability of personal and desktop computers in the late 1970s and 1980s made Fourier-transform systems more compact, powerful, and user-friendly. Over the past two decades, there has been a tendency to use standard Michelson configurations, such as those in Figs. 1.1(b) or 1.19(a), when designing the optics of Fourier-transform spectrometers. Standard Michelsons are well suited to the laser-based servo controls often used to maintain the alignment of the fixed and moving mirrors.

$^{19}$ J. Chamberlain, The Principles of Interferometric Spectroscopy, p. 16.

FIGURE 1.19(a).
FIGURE 1.19(b).
FIGURE 1.19(c).
Panel labels in Figs. 1.19(a) through 1.19(c): Entrance Beam, Beam Splitter, Fixed Mirror, Moving Mirror, Fixed Corner Cube, Moving Corner Cube, To Balanced Signal Detector, To Unbalanced Signal Detector; the moving element is displaced by $p = \chi/2$ in each panel.
FIGURE 1.19(d).
Panel labels in Fig. 1.19(d): Entrance Beam, Beam Splitter, Fixed Mirror, Moving Corner Cube, To Balanced Signal Detector; the corner cube is displaced by $p = \chi/4$.
1.8 Laser-Based Control Systems
Today's Fourier-transform spectrometers often rely on laser-based servo systems to maintain alignment and control the motion of the moving mirror. The average wavelength of the measured spectra determines the standards of alignment and control required for good spectral measurement. Systems designed to measure infrared spectra typically have lasers that work in the visible. Not only do modest standards of alignment and control in the visible correspond to extremely accurate standards of alignment and control in the infrared (because visible wavelengths are much shorter than infrared wavelengths), but the infrared detectors responsible for the spectral measurements are also easily shielded from stray laser light. The laser servo systems follow many different designs. Figures 1.20(a) and 1.20(b) show a typical setup that may not be exactly like any system now in use but that does present the basic ideas behind them.
In Fig. 1.20(a), a single laser beam is separated into beams A, B, and C by laser-beam splitters. Separating one beam into three ensures that all three beams have the same wavelength. The three beams enter the interferometer parallel to, and at the edges of, the entrance beam. Figure 1.20(b) shows the path of beams A and B through the instrument; beam C is not shown because it is out of the plane of the page, but it is assumed to follow a path similar to beams A and B. The solid lines representing the laser beams are always parallel to the dotted lines showing the path of the entrance beam through the interferometer; and the laser beams interact with the interferometer's beam splitter, fixed mirror, and moving mirror exactly the same way the entrance beam does. Because all three laser beams are monochromatic wavetrains of wavelength $\lambda$, the same reasoning used to produce Fig. 1.17 shows that we can draw a sequence of dashed lines perpendicular to the laser beams to represent the moving-mirror positions where the laser beams would form fringes. Just as in Fig. 1.17, each dashed line is separated from its two nearest neighbors by $\lambda/2$. Taking the dashed lines to represent nulls, we note that if the moving mirror has a slight tilt, as shown in Fig. 1.20(b), then the laser detector for beam B will see a near null in the beam B fringe while the laser detector for beam A will see a near bright in the beam A fringe. If the moving mirror is aligned in the plane of Fig. 1.20(b) but has a small out-of-plane tilt, then the laser detector for beam C is sure to see a different fringe brightness than the laser detectors for beams A and B. The three laser detectors send their signals to a servomechanism that readjusts the mirror tilt until all three detectors see the same fringe intensity, keeping the interferometer aligned while the moving mirror changes position. Often these servomechanisms readjust the tilt of the fixed mirror instead of directly correcting the moving mirror's tilt. It is not difficult to design systems of this sort that can detect changes of $\lambda/100$ in the position of the moving mirror's surface. The A, B, and C laser detectors can also be used to count fringes as the moving mirror changes position, keeping a record of where the moving mirror is and how fast it is moving. This information is almost always used to sample the interferometer's output signal at equally spaced positions of the moving mirror, and it is often sent to a servomechanism responsible for producing steady motion in the moving mirror.
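The idea of using laser fringes to sample at equally spaced mirror positions can be sketched in a few lines. This is a toy model under loudly stated assumptions, not a description of any real control system: the helium-neon wavelength, the unsteady `mirror_position` function, the 1 MHz time grid, and the choice of triggering on upward zero crossings of the fringe signal (with its mean level removed) are all hypothetical. Successive triggers should fall half a laser wavelength apart in mirror position even though the mirror speed changes.

```python
import math

LASER_WAVELENGTH = 632.8e-9  # metres; a helium-neon line, assumed for the demo


def mirror_position(t):
    """Hypothetical mirror displacement p(t) in metres with unsteady speed."""
    return 1.0e-3 * (t + 30.0 * t * t)


def laser_fringe(t):
    """Monochromatic fringe signal with the mean level removed; chi = 2p."""
    chi = 2.0 * mirror_position(t)
    return math.cos(2.0 * math.pi * chi / LASER_WAVELENGTH)


def upward_zero_crossings(times, signal):
    """Linearly interpolated times at which signal passes from < 0 to >= 0."""
    crossings = []
    for k in range(1, len(times)):
        s0, s1 = signal[k - 1], signal[k]
        if s0 < 0.0 <= s1:
            fraction = -s0 / (s1 - s0)
            crossings.append(times[k - 1] + fraction * (times[k] - times[k - 1]))
    return crossings


times = [k * 1.0e-6 for k in range(10001)]           # 10 ms at 1 MHz
signal = [laser_fringe(t) for t in times]
trigger_times = upward_zero_crossings(times, signal)

# In the model the true mirror position is known, so the spacing between
# successive trigger positions can be checked directly.
trigger_positions = [mirror_position(t) for t in trigger_times]
spacings = [b - a for a, b in zip(trigger_positions, trigger_positions[1:])]
print(len(trigger_times), min(spacings), max(spacings), LASER_WAVELENGTH / 2.0)
```

The time intervals between triggers shrink as the modeled mirror speeds up, while the position intervals stay fixed, which is the property that makes fringe-triggered sampling attractive.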

___________

Chapters 2 and 3 spell out the mathematical ideas needed to analyze the performance of Fourier-transform spectrometers, and they also establish the notation used to describe these ideas in subsequent chapters. Readers who are already familiar with Fourier theory and random functions can skip ahead to Chapter 4, returning to Chapters 2 and 3 as needed to refresh their understanding. Chapter 4 starts with Maxwell's equations, working with them to derive the nonideal versions of Eqs. (1.19f) and (1.20c) needed to understand both the nonrandom and random sources of error in Fourier-transform spectrometers. We always assume a standard Michelson configuration, such as the ones shown in Fig. 1.1(b) or 1.19(a), controlled by laser-based metrology and alignment systems similar to the ones shown in Figs. 1.20(a) and 1.20(b). These are arguably the most common type of Fourier-transform spectrometer in use today. Most of the basic ideas applied here to these standard Michelson systems are also relevant to other types of Fourier-transform spectrometers; anyone who reads and understands the analysis presented in Chapters 4 through 8 will be able to modify the equations presented there so that they apply to nonstandard Michelson configurations. One possible exception to this rule is the class of Michelsons, such as the one shown in Fig. 1.19(b), that use nonstandard retroreflectors to return the split entrance beam to the beam splitter. These sorts of systems, which are outside the scope of this book, are spared many forms of the tilt misalignment possible in a standard Michelson, which is an advantage, but on the other hand exhibit shear types of misalignments, which standard Michelsons do not have. The equations governing shear misalignment turn out to be similar to those for tilt misalignment, but it does not necessarily make sense to analyze them as a source of random error, the way tilt is analyzed in Chapter 7.

FIGURE 1.20(a). Panel labels: Laser, Laser Beam Splitters, Beam A, Beam B, Beam C, Entrance Beam, Interferometer Beam Splitter.

FIGURE 1.20(b). Panel labels: Laser, Laser Beam Splitters, Beam A, Beam B, Beam C, Entrance Beam, Interferometer Beam Splitter, Fixed Mirror, Moving Mirror, Laser Fringe Positions, To Laser Detector A, To Laser Detector B, To Infrared Detector.

2
FOURIER THEORY
Many single-chapter introductions to Fourier theory follow a top-down approach, defining what a
Fourier transform is and then listing the mathematical consequences. Here, on the other hand, we
begin with more of a bottom-up approach, seeking not only to present the mathematical
formalism of Fourier transforms but also to give an intuitive feel for how they work and what
they mean. Once the basic idea is established, we need to know which data sequences and
functions have well-defined Fourier transforms. This topic is often scanted because Fourier
theory is notorious for providing no simple mathematical answers to this simple mathematical
question. Indeed, engineers, scientists, and applied mathematicians have a long tradition of using
Fourier transforms in mathematically improper, yet extremely useful, ways that usually give
the correct answer. To show why these techniques work, and also when they cannot be trusted,
there is a brief sketch of generalized function theory. This is followed by a discussion of the
Fourier series and the discrete Fourier transform, including an exact description of how they are
connected to the integral Fourier transform. The discrete Fourier transform is particularly
important because, almost without exception, the only type of Fourier transform calculated on
today's computers is the discrete Fourier transform; without it, the Michelson interferometer
would be a much more limited instrument. The chapter then concludes with a brief discussion of
how Fourier transforms are applied to two-dimensional and three-dimensional functions.
2.1 Basic Concept of a Fourier Transform
The idea of a Fourier transform develops naturally from a simple idea for comparing the shape of two sequences of measurements. A sequence of measurements is really just a list of numbers, so when we compare sequences of measurements we compare the shapes of number lists graphed in the order of their measurement. We can suppose without any loss of generality that two lists, $u_k$ and $v_k$, have the same number of members with $k = 1, 2, \ldots, N$. Figures 2.1(a) and 2.1(b) show two lists $u_k$ and $v_k$ graphed against their index value $k$. Defining $\bar{u}$ and $\bar{v}$ to be the mean values of $u_k$ and $v_k$,

$$\bar{u} = \frac{1}{N} \sum_{k=1}^{N} u_k \tag{2.1a}$$

and

$$\bar{v} = \frac{1}{N} \sum_{k=1}^{N} v_k, \tag{2.1b}$$

we form the sum $S$ of the products of the differences from the mean,

$$S = \sum_{k=1}^{N} \left( u_k - \bar{u} \right)\left( v_k - \bar{v} \right). \tag{2.2}$$

FIGURE 2.1(a). List $u_k$ plotted against increasing index $k$.
FIGURE 2.1(b). List $v_k$ plotted against increasing index $k$.

If the graphs of $u_k$ and $v_k$ have similar shapes, so that $u_k - \bar{u} \approx v_k - \bar{v}$ for most values of $k$, then $(u_k - \bar{u})$ and $(v_k - \bar{v})$ are very likely to have the same sign for most values of $k$. This means few terms in the sum are negative and $S$ ends up being a large positive number. If $u_k$ and $v_k$ have little similarity in shape, then $(u_k - \bar{u})$ and $(v_k - \bar{v})$ are as likely to have opposite signs as the same sign and the terms in the sum are just as likely to be positive as they are to be negative. When this happens, $S$ is a sum of terms that tend to cancel out, and the magnitude of $S$ is likely to be small.
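A small numerical illustration of Eq. (2.2) may help. The three lists below are invented for the example, and the function name `similarity` is a label chosen here, not notation from the text.

```python
def similarity(u, v):
    """Sum of products of differences from the mean, as in Eq. (2.2)."""
    n = len(u)
    u_mean = sum(u) / n
    v_mean = sum(v) / n
    return sum((a - u_mean) * (b - v_mean) for a, b in zip(u, v))


rising = [1.0, 2.0, 3.0, 4.0, 5.0]
also_rising = [2.0, 4.0, 6.0, 8.0, 10.0]   # same shape as `rising`
zigzag = [3.0, 1.0, 3.0, 1.0, 3.0]         # unrelated shape

print(similarity(rising, also_rising))     # every product is non-negative
print(similarity(rising, zigzag))          # products cancel in pairs
```

For the two rising lists every term in the sum has the same sign, so the sum is large; for the rising list paired with the zigzag list the positive and negative products cancel.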
The same basic idea can be applied to continuous functions $u(t)$ and $v(t)$. To create a formal correspondence between functions and lists, we define an interval $\Delta t$ in $t$ and match $u_k$ and $v_k$ to $u(t)$ and $v(t)$ with the equations

$$u(k\,\Delta t) = \frac{u_k}{\Delta t}$$

and

$$v(k\,\Delta t) = v_k.$$

Because $u$ and $v$ are continuous functions of time, we can assume that they vary in an unsurprising manner between the isolated points at $\Delta t, 2\Delta t, \ldots, N\Delta t$ at which they have been specified. Traditionally, the argument of functions $u$ and $v$ is called $t$ and assumed to be time, but it is worth remembering that $t$ can stand for any relevant physical parameter, such as length, voltage, current, etc. Now we can approximate Eq. (2.2) as

$$S \cong \int_{\Delta t}^{N \Delta t} \left( u(t) - \bar{u} \right)\left( v(t) - \bar{v} \right) dt, \tag{2.3a}$$

where now

$$\bar{u} \cong \frac{1}{N \Delta t} \int_{\Delta t}^{N \Delta t} u(t)\,dt \tag{2.3b}$$

and

$$\bar{v} \cong \frac{1}{N \Delta t} \int_{\Delta t}^{N \Delta t} v(t)\,dt. \tag{2.3c}$$

Equations (2.3b) and (2.3c) just ensure that $\bar{u}$ and $\bar{v}$ are now the average values of $u(t)$ and $v(t)$ respectively. We note that the value of $\bar{u}$ has been redefined from what it was in Eq. (2.1a) above,

$$\bar{u}_{\text{new}} \cong \bar{u}_{\text{old}} / \Delta t,$$

whereas $\bar{v}$ has basically the same value as in Eq. (2.1b); the only change is to replace the sum by the equivalent integral. At this point, the finite value of $\Delta t$ is just a distraction, because it is the shapes of the continuous functions $u(t)$ and $v(t)$ that are being compared. Taking the limit as $\Delta t \to 0$ and $N \to \infty$ in such a way that

$$\lim_{\substack{\Delta t \to 0 \\ N \to \infty}} N \Delta t = T_{\max} = \text{constant}, \tag{2.4a}$$

we get

$$S = \int_0^{T_{\max}} \left( u(t) - \bar{u} \right)\left( v(t) - \bar{v} \right) dt, \tag{2.4b}$$

where

$$\bar{u} = \frac{1}{T_{\max}} \int_0^{T_{\max}} u(t)\,dt \tag{2.4c}$$

and

$$\bar{v} = \frac{1}{T_{\max}} \int_0^{T_{\max}} v(t)\,dt. \tag{2.4d}$$

We still expect S to be large when functions u and v have similar shapes and S to be small when
they have dissimilar shapes.
Equation (2.4b) can be written as

$$\begin{aligned}
S &= \int_0^{T_{\max}} \left( u(t) - \bar{u} \right) v(t)\,dt - \bar{v} \int_0^{T_{\max}} \left( u(t) - \bar{u} \right) dt \\
  &= \int_0^{T_{\max}} \left( u(t) - \bar{u} \right) v(t)\,dt - \bar{v} \left[ \int_0^{T_{\max}} u(t)\,dt - \bar{u}\,T_{\max} \right] \\
  &= \int_0^{T_{\max}} u(t)\,v(t)\,dt - \bar{u} \int_0^{T_{\max}} v(t)\,dt - \bar{v} \left[ \int_0^{T_{\max}} u(t)\,dt - \bar{u}\,T_{\max} \right] \\
  &= \int_0^{T_{\max}} u(t)\,v(t)\,dt - \bar{u}\,\bar{v}\,T_{\max},
\end{aligned} \tag{2.5}$$

where in the last step (2.4c) ensures that the term in the square brackets [ ] is zero and (2.4d) is used to replace the integral over $v$ by $\bar{v}\,T_{\max}$. To get to Fourier theory from Eq. (2.5), we suppose $v(t)$ to be an oscillatory function like $\sin(2\pi f t)$ or $\cos(2\pi f t)$ with $f \neq 0$. This makes function $u$ the data; that is, the value of our measurement at time $t$ is $u(t)$. Equation (2.4d) then reveals, depending on whether we choose $v$ to be a sine curve or a cosine curve, that

$$\bar{v}\,T_{\max} = \int_0^{T_{\max}} \sin(2\pi f t)\,dt = \frac{1}{2\pi f} \left[ 1 - \cos(2\pi f T_{\max}) \right] \tag{2.6a}$$

or

$$\bar{v}\,T_{\max} = \int_0^{T_{\max}} \cos(2\pi f t)\,dt = \frac{1}{2\pi f} \sin(2\pi f T_{\max}). \tag{2.6b}$$

When $v$ is a sine curve, $\bar{v}\,T_{\max}$ oscillates between $1/(\pi f)$ and 0 as $T_{\max}$ increases; and when $v$ is a cosine curve, $\bar{v}\,T_{\max}$ oscillates between $-1/(2\pi f)$ and $1/(2\pi f)$ as $T_{\max}$ increases. Keeping in mind that $u(t)$ represents a function measured in a laboratory, if we want to compare the shape of $u$ to either $\sin(2\pi f t)$ or $\cos(2\pi f t)$, common sense requires $T_{\max}$, the range of $t$ over which data is gathered, to be much greater than $1/f$, the period of the sine or cosine curve to which we want to compare the data. Unless $u$ entirely lacks a resemblance to the sine or cosine so that

$$\int_0^{T_{\max}} u(t)\,v(t)\,dt \cong 0$$

no matter how large $u$ or $T_{\max}$ become, we expect

$$\int_0^{T_{\max}} u(t)\,v(t)\,dt$$

to be large when the $u$ measurements are large, and small when the $u$ measurements are small; the integral's magnitude should also increase as $T_{\max}$ increases. So when $u$ represents a typical set of data that is not completely unlike $v$ in shape, then

$$\int_0^{T_{\max}} u(t)\,v(t)\,dt = O\left( \bar{u}\,T_{\max} \right)$$

or

$$\frac{1}{\bar{u}} \int_0^{T_{\max}} u(t)\,v(t)\,dt = O\left( T_{\max} \right).$$

Equations (2.6a) and (2.6b) show that $\bar{v}\,T_{\max}$ must remain somewhere between the two values $1/(\pi f)$ and $-1/(2\pi f)$ no matter how large $T_{\max}$ gets, which means

$$\bar{v}\,T_{\max} = O\left( f^{-1} \right).$$

Having already concluded that $T_{\max}$ has been chosen much larger than $1/f$, we expect

$$\frac{1}{\bar{u}} \int_0^{T_{\max}} u(t)\,v(t)\,dt = O\left( T_{\max} \right) \gg O\left( f^{-1} \right) = \bar{v}\,T_{\max},$$

which, of course, reduces to

$$\frac{1}{\bar{u}} \int_0^{T_{\max}} u(t)\,v(t)\,dt \gg \bar{v}\,T_{\max}.$$

Therefore, Eq. (2.5) can be approximated as

$$S = \bar{u} \left[ \frac{1}{\bar{u}} \int_0^{T_{\max}} u(t)\,v(t)\,dt - \bar{v}\,T_{\max} \right] \cong \bar{u} \left[ \frac{1}{\bar{u}} \int_0^{T_{\max}} u(t)\,v(t)\,dt \right] = \int_0^{T_{\max}} u(t)\,v(t)\,dt. \tag{2.7}$$

The integral in (2.7) can be regarded as assigning the number $S$ to the similarity in shape of $u$ and $v$, when $v$ is a sine or cosine curve of frequency $f$. Remembering where $S$ came from, we realize that this number is large when $u$ and $v$ have similar shapes and small when $u$ and $v$ have dissimilar shapes.
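The behavior of the integral in Eq. (2.7) can be tried out numerically. The sketch below rests on assumptions made only for the demonstration: the data function is taken to be a pure 5 Hz cosine (so its mean is zero), the measurement lasts 10 s, and the integral is approximated by a midpoint rule. The integral comes out large when the comparison frequency matches the data and close to zero otherwise.

```python
import math


def similarity_integral(u, f, t_max, n=20000, kind="cos"):
    """Midpoint-rule estimate of the integral of u(t) v(t) dt over [0, t_max],
    where v(t) is cos(2 pi f t) or sin(2 pi f t)."""
    wave = math.cos if kind == "cos" else math.sin
    step = t_max / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * step
        total += u(t) * wave(2.0 * math.pi * f * t)
    return step * total


def data(t):
    """Hypothetical measurement: a 5 Hz cosine."""
    return math.cos(2.0 * math.pi * 5.0 * t)


T_MAX = 10.0  # seconds, much longer than the 0.2 s period of the data
for f in (3.0, 5.0, 7.0):
    print(f, similarity_integral(data, f, T_MAX))
```

Scanning the comparison frequency $f$ in this way and recording the integral for each value is exactly the operation that the next section formalizes as a Fourier cosine transform.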
2.2 Fourier Sine and Cosine Transforms
To make the ideas of the previous section mathematically rigorous, we define the Fourier sine transform of function $u$ to be

$$\mathcal{S}^{(ft)}\left( u(t) \right) = 2 \int_0^\infty u(t) \sin(2\pi f t)\,dt \tag{2.8a}$$

and the Fourier cosine transform of $u$ to be

$$\mathcal{C}^{(ft)}\left( u(t) \right) = 2 \int_0^\infty u(t) \cos(2\pi f t)\,dt. \tag{2.8b}$$

The notation $\mathcal{S}^{(ft)}(u(t))$ and $\mathcal{C}^{(ft)}(u(t))$ shows that the function $u(t)$ is being multiplied by, respectively, the sine or cosine function having (as indicated by the superscript) an argument $ft$ multiplied by $2\pi$. The order of the $ft$ product in the superscript does not matter because it does not matter in the arguments of the sine and cosine, so

$$\mathcal{S}^{(ft)}\left( u(t) \right) = \mathcal{S}^{(tf)}\left( u(t) \right) \quad \text{and} \quad \mathcal{C}^{(ft)}\left( u(t) \right) = \mathcal{C}^{(tf)}\left( u(t) \right).$$

In particular we know, because $t$ is repeated in both $u(t)$ and the superscript of $\mathcal{S}$ and $\mathcal{C}$, that $t$ is the dummy variable of integration whereas $f$, which is only contained in the superscript, is an independent parameter. This means the transforms $\mathcal{S}^{(ft)}(u(t))$ and $\mathcal{C}^{(ft)}(u(t))$ are themselves functions of the parameter $f$,

$$U_{\mathcal{S}}(f) = 2 \int_0^\infty u(t) \sin(2\pi f t)\,dt \tag{2.8c}$$

and

$$U_{\mathcal{C}}(f) = 2 \int_0^\infty u(t) \cos(2\pi f t)\,dt. \tag{2.8d}$$

The capital $U$ names of functions $U_{\mathcal{S}}$ and $U_{\mathcal{C}}$ show that they are mathematically associated with the original function $u(t)$, created from $u(t)$ by the integrals in (2.8c) and (2.8d).
Although the upper limit of integration is now $\infty$ in Eqs. (2.8a) and (2.8b), this should not be interpreted as taking the limit as $T_{\max} \to \infty$ in Eq. (2.7). The upper limit is put at $\infty$ just to eliminate $T_{\max}$ as an explicit parameter, and the idea behind the presence of $T_{\max}$ (that $u(t)$ represents the result of a measurement) is kept alive by placing restrictions on the type of function $u$ can be. In particular, we expect $u(t)$, in some sense, to diminish or get small as $t$ gets large, because it is impossible to measure data for all the times $t$ out to $\infty$. It turns out that when the right sorts of restrictions are placed on $u$, the Fourier sine and cosine transforms can be inverted to recover the original functions,

$$u(t) = 2 \int_0^\infty U_{\mathcal{S}}(f) \sin(2\pi f t)\,df \tag{2.8e}$$

and

$$u(t) = 2 \int_0^\infty U_{\mathcal{C}}(f) \cos(2\pi f t)\,df \tag{2.8f}$$

for $t > 0$.
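The cosine-transform pair in Eqs. (2.8d) and (2.8f) can be exercised on a simple test function. The sketch below is an illustration only: the choice $u(t) = e^{-t}$, the truncation of both infinite upper limits at finite values, and the midpoint rule are assumptions made here. For this $u$, the transform has the closed form $U_{\mathcal{C}}(f) = 2/[1 + (2\pi f)^2]$, which the forward routine should reproduce and the inverse routine should map back to $e^{-t}$.

```python
import math


def cosine_transform(u, f, t_max=40.0, n=40000):
    """Eq. (2.8d) with the upper limit truncated at t_max (midpoint rule)."""
    step = t_max / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * step
        total += u(t) * math.cos(2.0 * math.pi * f * t)
    return 2.0 * step * total


def inverse_cosine_transform(big_u, t, f_max=200.0, n=200000):
    """Eq. (2.8f) with the upper limit truncated at f_max (midpoint rule)."""
    step = f_max / n
    total = 0.0
    for k in range(n):
        f = (k + 0.5) * step
        total += big_u(f) * math.cos(2.0 * math.pi * f * t)
    return 2.0 * step * total


def decay(t):
    """Test function u(t) = exp(-t), absolutely integrable on t >= 0."""
    return math.exp(-t)


def decay_transform(f):
    """Closed-form cosine transform of exp(-t): 2 / (1 + (2 pi f)^2)."""
    return 2.0 / (1.0 + (2.0 * math.pi * f) ** 2)


print(cosine_transform(decay, 0.5), decay_transform(0.5))
print(inverse_cosine_transform(decay_transform, 1.0), math.exp(-1.0))
```

Because the transform of $e^{-t}$ falls off only as $f^{-2}$, the inverse integral needs a far larger truncation limit than the forward one; the limits used here are ad hoc and would have to be revisited for other functions.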
If we adopt the strictest definition of what is meant by the integral of a function between 0 and $\infty$, then Eqs. (2.8a) through (2.8f) are true when function $u(t)$ satisfies the following four requirements:
(I) It is absolutely integrable.
(II) It is continuous except for a finite number of jump discontinuities.
(III) It is bounded on any finite interval $0 < a < t < b < \infty$.
(IV) It has finite variation on any finite interval $0 < a < t < b < \infty$.
We now show why function $u(t)$ naturally satisfies all these restrictions when it represents a (possibly idealized) measurement controlled or described by a continuous parameter $t$.
No matter what the argument t of function u representstime, voltage, energy, etc.function
u(t) can only be measured over a finite range of t. Although there may be no reason to think u is
zero or negligible when measured outside this range, we obviously cannot make up values for
what it might be. If we extrapolate to get the unmeasured t values, the extrapolation should not
dominate the information contained in u. In general, the measurement should be carried out in
such a way that the unmeasured or extrapolated values are of negligible importance compared to
the measured values. Mathematically we might say that there exists a positive, finite value of t,
which we call $T_{\max}$, such that the important measured values of u are all at $t \le T_{\max}$. One way of
expressing this constraint is to require

\[ \int_0^{T_{\max}} |u(t)|\,dt \cong \int_0^{\infty} |u(t)|\,dt . \tag{2.9a} \]

Since the left-hand integral ought to be finite, when (2.9a) is true, it follows that

\[ \int_0^{\infty} |u(t)|\,dt < \infty . \tag{2.9b} \]

Functions u that satisfy (2.9b) are said to be absolutely integrable; clearly, all functions
representing possible measurements share this quality, satisfying requirement (I) above.
Understanding requirement (II) requires some discussion of what it means to call an
experimental measurement continuous. To assign, with negligible experimental error, a definite
value of t to a measurement u, some minimum and finite change in t must occur between adjacent
measurements. In practice, continuous measurements are constructed by connecting sequences of
adjacent but separate points. We then assume that if u were measured between these already
known points, it would equal (to within experimental error) the values selected by connecting the
points. Thus, the continuity of u is a requirement that the measurement captures all the relevant
detail. In this sense, asserting that u is continuous is a type of idealization, just another way of
saying that the measurement is accurate and representative. This takes care of the first part of
requirement (II), but there is a second part permitting u to have a finite number of jump
discontinuities. Figure 2.2 shows a jump discontinuity in u(t). Jump discontinuities represent
another type of idealization: what can occur when, for example, instruments are turned on or off
during a measurement. Because it is unrealistic to have this happen an infinite number of times
over a finite range of t, it makes sense to say that all functions u representing measurements are
continuous over any finite range of t except for a finite number of jump discontinuities.
Consequently, we can expect all functions representing measurements to satisfy requirement (II).
Standard proofs that the Fourier transform of the Fourier transform returns the original
function u usually end up showing as their final step that

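\[ 2\int_0^{\infty} U_{\mathcal{S}}(f)\,\sin(2\pi f t)\,df = \lim_{\varepsilon \to 0} \frac{1}{2}\,[\,u(t+\varepsilon) + u(t-\varepsilon)\,] \tag{2.9c} \]
and
\[ 2\int_0^{\infty} U_{\mathcal{C}}(f)\,\cos(2\pi f t)\,df = \lim_{\varepsilon \to 0} \frac{1}{2}\,[\,u(t+\varepsilon) + u(t-\varepsilon)\,] . \tag{2.9d} \]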

When u is continuous, this immediately reduces to the desired result, but when the integrals are
evaluated at a jump discontinuity, such as at $t = t_o$ in Fig. 2.2, the limits on the right-hand side of
(2.9c) and (2.9d) give u a value at the jump discontinuity that is probably different from the
original value of u at the jump discontinuity. To keep this from happening, we define the value of
u to be, for all values $t = t_{\rm jump}$ marking the location of a jump discontinuity,

\[ u(t_{\rm jump}) = \lim_{\varepsilon \to 0} \frac{1}{2}\,[\,u(t_{\rm jump}+\varepsilon) + u(t_{\rm jump}-\varepsilon)\,] . \tag{2.9e} \]

Modifying u this way cannot change the value of any integral whose integrand is the product of u
with another smooth function. The sine and cosine are smooth functions, so using (2.9e) to
modify the value of u at jump discontinuities does not change the values of the sine or cosine
transforms.
Measurements must be done with physically realizable equipment, which necessarily
produces finite values of u. This means there always exists a finite real number $B < \infty$ such that

\[ |u(t)| < B \tag{2.9f} \]

over any finite interval $0 < a < t < b < \infty$ when function u represents a measurement. Functions
obeying this inequality are called bounded functions, so functions representing measurements
always satisfy requirement (III).

[Figure 2.2. Plot of $u(t)$ versus $t$, with a jump discontinuity at $t = t_o$.]
Requirement (IV) is a little bit more complicated to explain. Any function u(t) can be written
as the difference of two other functions $u_1(t)$ and $u_2(t)$, as shown in Figs. 2.3(a) and 2.3(b),

\[ u(t) = u_1(t) - u_2(t) . \tag{2.9g} \]

[FIGURE 2.3(a). Plot of $u(t)$ versus $t$ between $a$ and $b$, with the times $t_1$, $t_2$, $t_3$ marked.]
[FIGURE 2.3(b). Plot of $u_1(t)$ and $u_2(t)$ versus $t$ between $a$ and $b$, with the times $t_1$, $t_2$, $t_3$ marked.]

In Fig. 2.3(a), function u is drawn with a continuous line where it is increasing and with a dashed
line where it is decreasing. In Fig. 2.3(b), we see that functions $u_1$ and $u_2$ are constructed so that
every time u increases, $u_1$ also increases while $u_2$ remains the same, and every time u decreases,
$u_2$ increases while $u_1$ remains the same. Consequently, for any function u and time values $b > a$,
the differences $u_1(b) - u_1(a)$ and $u_2(b) - u_2(a)$ are non-negative and can only increase, which
means that their sum

\[ V_{ab}(u) = [\,u_1(b) - u_1(a)\,] + [\,u_2(b) - u_2(a)\,] \tag{2.9h} \]

is also non-negative. Functions $u_1$ and $u_2$ have been constructed so that every time u goes up and
down, the differences $u_1(b) - u_1(a)$ and $u_2(b) - u_2(a)$ increase, making the size of $V_{ab}(u)$ a
record of how many times u oscillates in the interval $a < t < b$. We define $V_{ab}(u)$ to be the
variation of u over the interval $a < t < b$, and if

\[ V_{ab}(u) < \infty , \tag{2.9i} \]

we say that u has finite variation over the interval $a < t < b$. Requirement (IV), that u have finite
variation in any interval $0 < a < t < b < \infty$, means that u can only oscillate a finite number of
times in that interval. The function $\sin\!\big((t-1)^{-1}\big)$, for example, does not have finite variation over
any interval containing $t = 1$. If we attempted to measure a quantity that had infinite variation
inside a finite interval, we would be blocked by the realization, already discussed above in
connection with requirement (II), that adjacent measurements must be separated by some
minimum value of t. If the measurement were repeated over and over, it would seem as if u were
changing unpredictably in the region of infinite variation, leading us to wonder whether our
measurement reflected the same physical reality. Therefore, our measurements cannot have
infinite variation, and so any function u(t) representing a realistic measurement must also satisfy
requirement (IV).
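
The construction of $u_1$, $u_2$, and $V_{ab}(u)$ in Eqs. (2.9g) and (2.9h) can be tried out on sampled data. The sketch below is only an illustration under stated assumptions: the names `split_variation`, `sampled_variation`, and `wild`, the sample values, and the sampling density are all hypothetical, and a variation computed from samples can only underestimate the variation of the underlying continuous function.

```python
import math

def split_variation(samples):
    """Split samples of u into non-decreasing u1 and u2 with u = u1 - u2."""
    u1 = [samples[0]]  # u1 starts at u(a) and accumulates every rise of u
    u2 = [0.0]         # u2 starts at zero and accumulates every fall of u
    for prev, curr in zip(samples, samples[1:]):
        step = curr - prev
        u1.append(u1[-1] + max(step, 0.0))
        u2.append(u2[-1] + max(-step, 0.0))
    # V_ab(u) = [u1(b) - u1(a)] + [u2(b) - u2(a)], following Eq. (2.9h)
    variation = (u1[-1] - u1[0]) + (u2[-1] - u2[0])
    return u1, u2, variation

def sampled_variation(func, a, b, n):
    """Variation of func estimated from n + 1 equally spaced samples on [a, b]."""
    samples = [func(a + (b - a) * k / n) for k in range(n + 1)]
    return split_variation(samples)[2]

# Hypothetical sample values with steps of +2, -1, +2, -2.5
u1, u2, variation = split_variation([0.0, 2.0, 1.0, 3.0, 0.5])

def wild(t):
    """sin((t - 1)^-1), the example from the text, evaluated for t > 1."""
    return math.sin(1.0 / (t - 1.0))

# Moving the left end of the interval closer to t = 1 keeps raising the variation.
growth = [sampled_variation(wild, 1.0 + gap, 2.0, 20000) for gap in (0.2, 0.05, 0.0125)]
print(variation, growth)
```

The values in `growth` keep increasing as the interval approaches $t = 1$, which is the numerical signature of the infinite variation described above.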
We see that requirements (I) through (IV) are always satisfied by functions representing
physically realizable measurements. It should be emphasized that requirements (I) through (IV)
are sufficient to ensure that Eqs. (2.8a) through (2.8f) hold true, but not necessary. It is easy to show that
there exist functions that do not meet requirements (I) through (IV) yet still satisfy Eqs. (2.8a)
through (2.8f). Consider, for example,

\[ g(t) = \begin{cases}
\pi & \text{for } 0 \le t < 1/(2\pi) \\
\pi/2 & \text{for } t = 1/(2\pi) \\
0 & \text{for } t > 1/(2\pi)
\end{cases} \tag{2.10a} \]

This test function clearly satisfies (I) through (IV) and so must have a Fourier cosine transform,

\[ G_{\mathcal{C}}(f) = 2\int_0^{1/(2\pi)} \pi \cos(2\pi f t)\,dt = \frac{\sin(f)}{f} , \tag{2.10b} \]

such that we return to the original function g by taking the cosine transform of the $G_{\mathcal{C}}$ transform,

\[ g(t) = 2\int_0^{\infty} G_{\mathcal{C}}(f)\,\cos(2\pi f t)\,df = 2\int_0^{\infty} \frac{\sin(f)}{f}\,\cos(2\pi f t)\,df . \tag{2.10c} \]

We could, however, just as easily have started with the function

\[ h(t) = \frac{\sin(t)}{t} \]

and taken its cosine transform to get

\[ H_{\mathcal{C}}(f) = 2\int_0^{\infty} \frac{\sin(t)}{t}\,\cos(2\pi f t)\,dt . \tag{2.10d} \]

The integral in (2.10d) is clearly the same as the first integral in (2.10c) with the variables f and t
interchanged. Therefore,

\[ H_{\mathcal{C}}(f) = g(f) = \begin{cases}
\pi & \text{for } 0 \le f < 1/(2\pi) \\
\pi/2 & \text{for } f = 1/(2\pi) \\
0 & \text{for } f > 1/(2\pi) .
\end{cases} \]

Hence we know that h(t) satisfies Eqs. (2.8b), (2.8d), and (2.8f) (it is both cosine transformable
and its cosine transform returns the original function when cosine transformed) exactly because
g(t) in (2.10a) satisfies Eqs. (2.8b), (2.8d), and (2.8f). Yet h(t), unlike g(t), does not satisfy
requirements (I) through (IV); in particular, it violates requirement (I) because it is not
absolutely integrable. To see that this is true, note that

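\[ \int_0^{\infty} \left|\frac{\sin(t)}{t}\right| dt
 = \sum_{j=1}^{\infty} \int_{(j-1)\pi}^{j\pi} \left|\frac{\sin(t)}{t}\right| dt
 \ge \sum_{j=1}^{\infty} \frac{1}{j\pi} \int_{(j-1)\pi}^{j\pi} |\sin(t)|\,dt
 = \frac{2}{\pi} \sum_{j=1}^{\infty} \frac{1}{j} = \infty , \]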

where the last step uses a well-known property of the harmonic series,

\[ \sum_{j=1}^{\infty} \frac{1}{j} , \]

that it grows large without limit. This simple example also shows that just because a function g(t)
satisfies requirements (I) through (IV), so that the transform of the transform returns the original
function g(t), it does not necessarily follow that the transform itself satisfies requirements (I) through
(IV).
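
The lower bound used above can be checked numerically. The following is a minimal sketch, assuming a simple midpoint rule is accurate enough for illustration; the function names `abs_sinc_integral` and `harmonic_bound` and the choice of 200 points per lobe are hypothetical. It compares the integral of $|\sin(t)/t|$ over the first $N$ lobes with the partial sum $(2/\pi)\sum_{j=1}^{N} 1/j$.

```python
import math

def abs_sinc_integral(num_lobes, points_per_lobe=200):
    """Midpoint-rule estimate of the integral of |sin(t)/t| from 0 to num_lobes*pi."""
    n = num_lobes * points_per_lobe
    dt = num_lobes * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt  # midpoints never land on t = 0
        total += abs(math.sin(t)) / t * dt
    return total

def harmonic_bound(num_lobes):
    """Partial sum (2/pi) * sum_{j=1}^{N} 1/j of the lower bound in the text."""
    return (2.0 / math.pi) * sum(1.0 / j for j in range(1, num_lobes + 1))

for lobes in (1, 10, 100):
    print(lobes, abs_sinc_integral(lobes), harmonic_bound(lobes))
```

Both columns keep growing as more lobes are included, with the integral staying above the harmonic partial sum, which is the behavior the inequality predicts.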
Here is another example to show that, even though the transform of a function may exist, if
requirements (I) through (IV) are violated, then the transform of the transform does not
necessarily return the original function. We consider another test function,

\[ z(t) = t^{-1} , \tag{2.10e} \]

which is clearly not absolutely integrable because

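\[ \int_0^{\infty} \frac{dt}{t}
 = \lim_{\substack{A \to \infty \\ \varepsilon \to 0}} \int_{\varepsilon}^{A} \frac{dt}{t}
 = \lim_{\substack{A \to \infty \\ \varepsilon \to 0}} \ln\!\left(\frac{A}{\varepsilon}\right) = \infty , \]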

violating requirement (I). The sine transform of z is

\[ Z_{\mathcal{S}}(f) = 2\int_0^{\infty} \frac{\sin(2\pi f t)}{t}\,dt . \]

Any handbook of definite integrals shows that

\[ Z_{\mathcal{S}}(f) = \begin{cases}
0 & \text{for } f = 0 \\
\pi & \text{for } f > 0 .
\end{cases} \tag{2.10f} \]

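Therefore, the sine transform $Z_{\mathcal{S}}$ of $z(t) = t^{-1}$ exists, yet the sine transform of the sine transform
does not return z:

\[ 2\int_0^{\infty} \pi \sin(2\pi f t)\,df
 = \lim_{F \to \infty} 2\pi \int_0^{F} \sin(2\pi f t)\,df
 = \lim_{F \to \infty} \frac{1}{t}\,[\,1 - \cos(2\pi F t)\,] \ne \frac{1}{t} . \tag{2.10g} \]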
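
The failure to converge in Eq. (2.10g) is easy to see numerically. The sketch below is illustrative only: the function names, the value $t = 0.5$, and the cutoff values $F$ are hypothetical choices. It evaluates the closed form $(1/t)[1 - \cos(2\pi F t)]$ for several cutoffs and cross-checks one of them against a midpoint-rule estimate of $2\pi\int_0^F \sin(2\pi f t)\,df$.

```python
import math

def partial_double_sine(t, F):
    """Closed form (1/t) * [1 - cos(2*pi*F*t)] from Eq. (2.10g)."""
    return (1.0 - math.cos(2.0 * math.pi * F * t)) / t

def partial_double_sine_numeric(t, F, n=11000):
    """Midpoint-rule estimate of 2*pi * integral_0^F sin(2*pi*f*t) df."""
    df = F / n
    total = sum(math.sin(2.0 * math.pi * (k + 0.5) * df * t) for k in range(n))
    return 2.0 * math.pi * total * df

t = 0.5
cutoffs = (10.0, 10.5, 11.0, 11.5)
values = [partial_double_sine(t, F) for F in cutoffs]
numeric_check = partial_double_sine_numeric(t, 11.0)
print(values, numeric_check, 1.0 / t)
```

As the cutoff $F$ increases, the values keep swinging between $0$ and $2/t$ instead of settling at $z(t) = 1/t$.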

Clearly, if a function violates requirements (I) through (IV) yet has a well-defined sine or
cosine transform, the sine transform of the sine transform and the cosine transform of the cosine
transform must be checked explicitly to confirm that the original function is returned. The only
exception is when the transform itself satisfies (I) through (IV) even though the original test
function does not. Because we could just as easily have started with the transform itself instead of
the original test function, we can conclude that the transform of the transform of the original
function must return the original function. In general, repeatedly applying the sine or cosine
transform just takes us back and forth between the same two functions, and the transformations
are mathematically justified whenever at least one of those functions satisfies requirements (I)
through (IV).
2.3 Even, Odd, and Mixed Functions
Fourier transform theory can be extended to include functions that are evaluated for negative as
well as positive values of their arguments. To assist our analysis of these extended transforms, we
decide to classify u as an even, odd, or mixed function. An even function u satisfies the constraint

\[ u(-t) = u(t) \tag{2.11a} \]

for all values of t, negative as well as positive; an odd function satisfies the constraint

\[ u(-t) = -u(t) \tag{2.11b} \]

for all values of t, negative as well as positive; and a mixed function is partly even and partly odd
in the sense that it is the sum of an even function and an odd function, neither of which is
identically zero. Any function u(t), whether even, odd, or mixed, can be written as the sum of
two functions, $u_e$ and $u_o$, with $u_e$ being an even function obeying (2.11a) and $u_o$ being an odd
function obeying (2.11b),

\[ u(t) = u_e(t) + u_o(t) , \tag{2.11c} \]
where
\[ u_e(t) = \frac{1}{2}\,[\,u(t) + u(-t)\,] \tag{2.11d} \]
and
\[ u_o(t) = \frac{1}{2}\,[\,u(t) - u(-t)\,] . \tag{2.11e} \]

Clearly,

\[ u_e(-t) = \frac{1}{2}\,[\,u(-t) + u(t)\,] = \frac{1}{2}\,[\,u(t) + u(-t)\,] = u_e(t) \]
and
\[ u_o(-t) = \frac{1}{2}\,[\,u(-t) - u(t)\,] = -\frac{1}{2}\,[\,u(t) - u(-t)\,] = -u_o(t) . \]

If u starts off as an even function, then $u_e = u$, and $u_o$ is identically zero; if u starts off as an odd
function, then $u_o = u$, and $u_e$ is identically zero; and if u starts off as a mixed function, then
neither $u_e$ nor $u_o$ is identically zero. If u is identically zero, it can be regarded as either even or
odd, according to the classifier's convenience.
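
The decomposition in Eqs. (2.11c) through (2.11e) translates directly into code. The following minimal sketch uses hypothetical helper names `even_part` and `odd_part`, and takes $u(t) = e^{t}$ as an assumed test function because its even and odd parts are the familiar $\cosh(t)$ and $\sinh(t)$.

```python
import math

def even_part(u):
    """Return u_e(t) = [u(t) + u(-t)] / 2, as in Eq. (2.11d)."""
    return lambda t: 0.5 * (u(t) + u(-t))

def odd_part(u):
    """Return u_o(t) = [u(t) - u(-t)] / 2, as in Eq. (2.11e)."""
    return lambda t: 0.5 * (u(t) - u(-t))

u = math.exp          # a mixed function: neither even nor odd
u_e = even_part(u)    # expected to behave like cosh(t)
u_o = odd_part(u)     # expected to behave like sinh(t)

t = 0.7
print(u_e(t), math.cosh(t))
print(u_o(t), math.sinh(t))
print(u_e(t) + u_o(t), u(t))
```

Applying `even_part` to an already even function returns the same values, and `odd_part` of an even function returns zero everywhere, matching the three cases described above.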
Figures 2.4(a) and 2.4(b) graph examples of even and odd functions respectively, and Fig.
2.4(c) shows a mixed function that is split up into its even and odd parts. We note that $\cos(2\pi f t)$
is an even function of both f and t and $\sin(2\pi f t)$ is an odd function of both f and t. One point
worth remembering is that the behavior of even and odd functions is severely constrained near
$t = 0$. For any odd function at $t = 0$, we have

\[ u(0) = u(-0) = -u(0) \]

from Eq. (2.11b). Since the only number equal to its own negative value is zero, all odd functions
u(t) that have a well-defined value at $t = 0$ must be zero at $t = 0$,

\[ u\big|_{t=0} = 0 \quad \text{if } u(0) \text{ exists and } u \text{ is odd.} \tag{2.12a} \]

Because $u(-t) = u(t)$ for even functions, when t is near zero the value of u (if u is continuous) is
almost constant. Therefore, when t is exactly zero the derivative of any even function u(t), if it is
well defined, must be zero,

\[ \left.\frac{du}{dt}\right|_{t=0} = 0 \quad \text{if the derivative at zero exists and } u \text{ is even.} \tag{2.12b} \]

In fact, using the definition of the derivative

\[ \frac{du}{dt} = \lim_{\varepsilon \to 0} \frac{u(t+\varepsilon) - u(t)}{\varepsilon}
 = \lim_{\varepsilon \to 0} \frac{u(t) - u(t-\varepsilon)}{\varepsilon} , \]

when u is even we see that

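\[ \left.\frac{du}{dt}\right|_{t=-t_o}
 = \lim_{\varepsilon \to 0} \frac{u(-t_o+\varepsilon) - u(-t_o)}{\varepsilon}
 = \lim_{\varepsilon \to 0} \frac{u(t_o-\varepsilon) - u(t_o)}{\varepsilon}
 = -\left.\frac{du}{dt}\right|_{t=t_o} . \]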

This shows that when u is even, the derivative of u is odd, and so from (2.12a), which states that
odd functions are zero when their argument is zero, we know that (2.12b) must be true. Similarly,
for any odd function u,

\[ \left.\frac{du}{dt}\right|_{t=-t_o}
 = \lim_{\varepsilon \to 0} \frac{u(-t_o+\varepsilon) - u(-t_o)}{\varepsilon}
 = \lim_{\varepsilon \to 0} \frac{u(t_o) - u(t_o-\varepsilon)}{\varepsilon}
 = \left.\frac{du}{dt}\right|_{t=t_o} , \]

showing that when u is odd, its derivative is even. The second derivative $d^2u/dt^2$ of an even
function u is the first derivative of $du/dt$, which is odd, and so $d^2u/dt^2$ must be even; similarly, the
third derivative $d^3u/dt^3$ is the first derivative of $d^2u/dt^2$, which is even, and so must be odd.
Examining in this fashion ever higher derivatives of the even function u, we conclude that

\[ \frac{d^n u}{dt^n} = \begin{cases}
\text{odd function for } n = 1, 3, 5, \ldots \\
\text{even function for } n = 2, 4, 6, \ldots
\end{cases} \quad \text{when } u \text{ is even.} \tag{2.12c} \]

[FIGURE 2.4(a). Plot of an even function $u(t)$ versus $t$.]
[FIGURE 2.4(b). Plot of an odd function $u(t)$ versus $t$.]
[FIGURE 2.4(c). Plot of a mixed function $u(t)$ together with its even part $u_e(t)$ and odd part $u_o(t)$ versus $t$.]

The same reasoning applied to the derivatives of an odd function u shows that

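\[ \frac{d^n u}{dt^n} = \begin{cases}
\text{even function for } n = 1, 3, 5, \ldots \\
\text{odd function for } n = 2, 4, 6, \ldots
\end{cases} \quad \text{when } u \text{ is odd.} \tag{2.12d} \]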

Equation (2.12c) states that the odd-numbered derivatives of an even function are odd while the
even-numbered derivatives of an even function are even, and Eq. (2.12d) states that the odd-
numbered derivatives of an odd function are even while the even-numbered derivatives of an odd
function are odd. Therefore, an immediate consequence of (2.12a), (2.12c), and (2.12d) is that the
odd-numbered derivatives of an even function (if they exist and are well defined) are zero at
$t = 0$ and the even-numbered derivatives of an odd function (if they exist and are well defined)
are zero at $t = 0$.
2.4 Extended Sine and Cosine Transforms
We can now extend the sine and cosine transforms to include functions u(t) evaluated for
negative as well as positive values of t while generalizing requirements (I) through (IV)
previously applied to u for $t > 0$ in Sec. 2.2. The extended requirements are

(V) Function $u(t)$ must satisfy

\[ \int_{-\infty}^{\infty} |u(t)|\,dt < \infty . \tag{2.13a} \]

(VI) Function $u(t)$ must be continuous except for a finite number of jump discontinuities
over any finite interval $-\infty < a < t < b < \infty$.
(VII) There must exist a finite positive number B such that

\[ |u(t)| < B . \tag{2.13b} \]

(VIII) The non-negative variation $V_{ab}(u)$ of function u(t) as defined in Eqs. (2.9g) and (2.9h)
is finite over any finite interval $-\infty < a < t < b < \infty$,

\[ V_{ab}(u) < \infty . \tag{2.13c} \]

We also define the value of u at all its jump discontinuities to be given by Eq. (2.9e). These new
requirements are clearly just the old set of requirements extended to cover negative as well as
positive values of t.
The extended Fourier sine transform of u is

\[ \mathcal{S}_E^{(ft)}\big(u(t)\big) = \int_{-\infty}^{\infty} u(t)\,\sin(2\pi f t)\,dt , \tag{2.14a} \]

and the extended Fourier cosine transform of u is

\[ \mathcal{C}_E^{(ft)}\big(u(t)\big) = \int_{-\infty}^{\infty} u(t)\,\cos(2\pi f t)\,dt . \tag{2.14b} \]

Just like in Eqs. (2.8a) and (2.8b), defining the standard sine and cosine transforms, the order of
the ft product in the superscript does not matter:

\[ \mathcal{S}_E^{(ft)}\big(u(t)\big) = \mathcal{S}_E^{(tf)}\big(u(t)\big)
 \quad \text{and} \quad
 \mathcal{C}_E^{(ft)}\big(u(t)\big) = \mathcal{C}_E^{(tf)}\big(u(t)\big) . \]

We can write u as the sum of even and odd functions, $u(t) = u_e(t) + u_o(t)$, as described in Eq.
(2.11c), and substitute this sum into the definitions of the extended sine and cosine transforms in
(2.14a) and (2.14b) to get

\[ \mathcal{S}_E^{(ft)}\big(u(t)\big) = \int_{-\infty}^{\infty} u_e(t)\,\sin(2\pi f t)\,dt + \int_{-\infty}^{\infty} u_o(t)\,\sin(2\pi f t)\,dt \tag{2.15a} \]
and
\[ \mathcal{C}_E^{(ft)}\big(u(t)\big) = \int_{-\infty}^{\infty} u_e(t)\,\cos(2\pi f t)\,dt + \int_{-\infty}^{\infty} u_o(t)\,\cos(2\pi f t)\,dt . \tag{2.15b} \]

We note that the product of an even function $u_e$ and the sine, as well as the product of an odd
function $u_o$ and the cosine, must be an odd function,

\[ u_e(-t)\,\sin\!\big(2\pi f(-t)\big) = [\,u_e(t)\,]\cdot[\,-\sin(2\pi f t)\,] = -[\,u_e(t)\,\sin(2\pi f t)\,] , \tag{2.16a} \]
and
\[ u_o(-t)\,\cos\!\big(2\pi f(-t)\big) = [\,-u_o(t)\,]\cdot[\,\cos(2\pi f t)\,] = -[\,u_o(t)\,\cos(2\pi f t)\,] . \tag{2.16b} \]

The integral between $-\infty$ and $+\infty$ of any odd function $\phi_o(t)$ can be thought of as the limit of
the sum of a large number of small terms,

\[ \int_{-\infty}^{\infty} \phi_o(t)\,dt \cong \cdots + \phi_o(-2\,dt)\,dt + \phi_o(-dt)\,dt + \phi_o(0)\,dt + \phi_o(dt)\,dt + \phi_o(2\,dt)\,dt + \cdots . \]

Because $\phi_o$ is odd, $\phi_o(0)$ is zero; $\phi_o(-dt)\,dt = -\phi_o(dt)\,dt$ and cancels $\phi_o(dt)\,dt$;
$\phi_o(-2\,dt)\,dt = -\phi_o(2\,dt)\,dt$ and cancels $\phi_o(2\,dt)\,dt$; and so on. Therefore,$^{20}$

\[ \int_{-\infty}^{\infty} \phi_o(t)\,dt = 0 , \tag{2.17} \]

and Eqs. (2.15a) and (2.15b) can be written as

\[ \mathcal{S}_E^{(ft)}\big(u(t)\big) = \int_{-\infty}^{\infty} u_o(t)\,\sin(2\pi f t)\,dt \tag{2.18a} \]
and
\[ \mathcal{C}_E^{(ft)}\big(u(t)\big) = \int_{-\infty}^{\infty} u_e(t)\,\cos(2\pi f t)\,dt . \tag{2.18b} \]

The integral between $-\infty$ and $+\infty$ of any even function $\phi_e(t)$ can be thought of as

\[ \int_{-\infty}^{\infty} \phi_e(t)\,dt \cong \cdots + \phi_e(-2\,dt)\,dt + \phi_e(-dt)\,dt + \phi_e(0)\,dt + \phi_e(dt)\,dt + \phi_e(2\,dt)\,dt + \cdots . \]

Because $\phi_e$ is even, $\phi_e(-dt) = \phi_e(dt)$, $\phi_e(-2\,dt) = \phi_e(2\,dt)$, and so on. Therefore, the integral over
negative t has the same value as the integral over positive t and we can write

\[ \int_{-\infty}^{\infty} \phi_e(t)\,dt = 2\int_0^{\infty} \phi_e(t)\,dt . \tag{2.19} \]

20 Strictly speaking, we are here treating the integral between $-\infty$ and $+\infty$ as a Cauchy principal value, a concept
introduced in Sec. 2.10 below.

The product of $u_o$ and the sine is an even function,

\[ u_o(-t)\,\sin\!\big(2\pi f(-t)\big) = [\,-u_o(t)\,]\cdot[\,-\sin(2\pi f t)\,] = [\,u_o(t)\,\sin(2\pi f t)\,] , \tag{2.20} \]

and the product of $u_e$ and the cosine, both of them even functions, is another even function.
Consequently, the extended sine and cosine transforms in Eqs. (2.18a) and (2.18b) are, according
to (2.19), (2.8a), and (2.8b),

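\[ \mathcal{S}_E^{(ft)}\big(u(t)\big) = \int_{-\infty}^{\infty} u_o(t)\,\sin(2\pi f t)\,dt = 2\int_0^{\infty} u_o(t)\,\sin(2\pi f t)\,dt = \mathcal{S}^{(ft)}\big(u_o(t)\big) \tag{2.21a} \]
and
\[ \mathcal{C}_E^{(ft)}\big(u(t)\big) = \int_{-\infty}^{\infty} u_e(t)\,\cos(2\pi f t)\,dt = 2\int_0^{\infty} u_e(t)\,\cos(2\pi f t)\,dt = \mathcal{C}^{(ft)}\big(u_e(t)\big) . \tag{2.21b} \]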

Equation (2.21a) shows that the extended sine transform of a function u(t) is the unextended sine
transform of $u_o$, the odd component of u; and Eq. (2.21b) shows that the extended cosine
transform of u(t) is the unextended cosine transform of $u_e$, the even component of u. Because the
result will be needed later, we also show that the extended sine transform defined in Eq. (2.14a)
is an odd function of f,

\[ \mathcal{S}_E^{(-ft)}\big(u(t)\big) = \int_{-\infty}^{\infty} u(t)\,\sin(-2\pi f t)\,dt = -\int_{-\infty}^{\infty} u(t)\,\sin(2\pi f t)\,dt = -\mathcal{S}_E^{(ft)}\big(u(t)\big) ; \tag{2.22a} \]

and a similar manipulation shows that the extended cosine transform defined in (2.14b) is an even
function of f,

\[ \mathcal{C}_E^{(-ft)}\big(u(t)\big) = \int_{-\infty}^{\infty} u(t)\,\cos(-2\pi f t)\,dt = \int_{-\infty}^{\infty} u(t)\,\cos(2\pi f t)\,dt = \mathcal{C}_E^{(ft)}\big(u(t)\big) . \tag{2.22b} \]

We now examine what happens when the extended sine and cosine transforms are applied
twice to the same function. We define
\[ U_{\mathcal{S}E}(f) = \mathcal{S}_E^{(ft)}\big(u(t)\big) = \mathcal{S}^{(ft)}\big(u_o(t)\big) \tag{2.23a} \]
and
\[ U_{\mathcal{C}E}(f) = \mathcal{C}_E^{(ft)}\big(u(t)\big) = \mathcal{C}^{(ft)}\big(u_e(t)\big) , \tag{2.23b} \]

where the second step in Eqs. (2.23a) and (2.23b) comes from (2.21a) and (2.21b). Taking the
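extended Fourier sine and cosine transforms of $U_{\mathcal{S}E}$ and $U_{\mathcal{C}E}$ respectively, we get

\[ \mathcal{S}_E^{(tf)}\big(U_{\mathcal{S}E}(f)\big) = \mathcal{S}_E^{(ft)}\big(U_{\mathcal{S}E}(f)\big) = \int_{-\infty}^{\infty} U_{\mathcal{S}E}(f)\,\sin(2\pi f t)\,df \tag{2.24a} \]
and
\[ \mathcal{C}_E^{(tf)}\big(U_{\mathcal{C}E}(f)\big) = \mathcal{C}_E^{(ft)}\big(U_{\mathcal{C}E}(f)\big) = \int_{-\infty}^{\infty} U_{\mathcal{C}E}(f)\,\cos(2\pi f t)\,df . \tag{2.24b} \]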

The second step in (2.24a) and (2.24b) is there just to emphasize that we are allowed to change
the order of the ft product in the superscripts.
Equation (2.22a) shows that the extended sine transform $U_{\mathcal{S}E}$ is an odd function of f, so its
product with the sine is an even function of f; and Eq. (2.22b) shows that the extended cosine
transform $U_{\mathcal{C}E}$ is an even function of f, so its product with the cosine is also an even function of
f. Hence, according to (2.19), Eqs. (2.24a) and (2.24b) become

\[ \mathcal{S}_E^{(tf)}\big(U_{\mathcal{S}E}(f)\big) = 2\int_0^{\infty} U_{\mathcal{S}E}(f)\,\sin(2\pi f t)\,df \tag{2.25a} \]
and
\[ \mathcal{C}_E^{(tf)}\big(U_{\mathcal{C}E}(f)\big) = 2\int_0^{\infty} U_{\mathcal{C}E}(f)\,\cos(2\pi f t)\,df . \tag{2.25b} \]

But Eq. (2.23a) shows that $U_{\mathcal{S}E}$ is also the unextended sine transform of $u_o$, so from (2.25a) we
see that $\mathcal{S}_E^{(tf)}\big(U_{\mathcal{S}E}(f)\big)$ equals the unextended sine transform of the unextended sine
transform of $u_o$, the odd component of function u. According to Eqs. (2.8a), (2.8c), and (2.8e), the
unextended sine transform of the unextended sine transform returns the original function for
positive values of t. This means that the extended sine transform of the extended sine transform,
$\mathcal{S}_E^{(tf)}\big(U_{\mathcal{S}E}(f)\big)$, which we have just seen to be equal to the unextended sine transform of the
unextended sine transform, must return $u_o$ for positive values of t. Consequently, for positive
values of t, Eq. (2.25a) becomes

\[ \mathcal{S}_E^{(tf)}\big(U_{\mathcal{S}E}(f)\big) = 2\int_0^{\infty} U_{\mathcal{S}E}(f)\,\sin(2\pi f t)\,df = u_o(t) . \tag{2.26a} \]

Function $u_o$ is, however, defined for all values of t according to the rule for odd functions
$u_o(-t) = -u_o(t)$, and the integral

\[ 2\int_0^{\infty} U_{\mathcal{S}E}(f)\,\sin(2\pi f t)\,df \]

is also an odd function of t when we allow t to be both positive and negative,

\[ 2\int_0^{\infty} U_{\mathcal{S}E}(f)\,\sin\!\big(2\pi f(-t)\big)\,df = -2\int_0^{\infty} U_{\mathcal{S}E}(f)\,\sin(2\pi f t)\,df . \]

Consequently, the integral exists and is well defined for negative t whenever the integral exists
and is well defined for positive t. We conclude that Eq. (2.26a) holds true for negative as well as
positive t. Hence, using Eq. (2.23a) to substitute for $U_{\mathcal{S}E}$ in Eq. (2.26a), we can write

\[ \mathcal{S}_E^{(tf)}\Big(\mathcal{S}_E^{(ft)}\big(u(t)\big)\Big) = u_o(t) . \tag{2.26b} \]

This shows that taking the extended sine transform of the extended sine transform returns the odd
component $u_o$ of function u for all values of t, both positive and negative. Switching now to the
extended cosine transform $U_{\mathcal{C}E}$, we see that Eq. (2.23b) shows the extended cosine transform
$U_{\mathcal{C}E}$ is also the unextended cosine transform of $u_e$, the even component of function u. From the
right-hand side of Eq. (2.25b), we then know that $\mathcal{C}_E^{(tf)}\big(U_{\mathcal{C}E}(f)\big)$ is equal to the unextended
cosine transform of the unextended cosine transform of $u_e$. Equations (2.8b), (2.8d), and (2.8f)
show that the unextended cosine transform of the unextended cosine transform returns the
original function for positive values of t. Consequently, the extended cosine transform of the
extended cosine transform, $\mathcal{C}_E^{(tf)}\big(U_{\mathcal{C}E}(f)\big)$, which we have just seen to be equal to the unextended
cosine transform of the unextended cosine transform of $u_e$, must also equal $u_e$ for positive values
of t. This means that Eq. (2.25b) becomes (for positive values of t),

\[ \mathcal{C}_E^{(tf)}\big(U_{\mathcal{C}E}(f)\big) = 2\int_0^{\infty} U_{\mathcal{C}E}(f)\,\cos(2\pi f t)\,df = u_e(t) . \tag{2.26c} \]

But $u_e(t)$ is defined for negative as well as positive values of t according to the rule
$u_e(-t) = u_e(t)$ for even functions of t, and the integral

\[ 2\int_0^{\infty} U_{\mathcal{C}E}(f)\,\cos(2\pi f t)\,df \]

is also an even function of t when t is allowed to be both positive and negative:

\[ 2\int_0^{\infty} U_{\mathcal{C}E}(f)\,\cos\!\big(2\pi f(-t)\big)\,df = 2\int_0^{\infty} U_{\mathcal{C}E}(f)\,\cos\!\big(2\pi f(t)\big)\,df . \]

Consequently, the integral exists and is well defined for negative t if it exists and is well defined
for positive t. We conclude that Eq. (2.26c) is valid for both negative and positive t and that,
substituting Eq. (2.23b) into Eq. (2.26c),

\[ \mathcal{C}_E^{(tf)}\Big(\mathcal{C}_E^{(ft)}\big(u(t)\big)\Big) = u_e(t) . \tag{2.26d} \]

This shows that taking the extended cosine transform of the extended cosine transform returns
$u_e$, the even component of function u, for all values of t both positive and negative. Equations
(2.11d) and (2.11e), the original definitions of the even and odd components of a function u,
show that Eqs. (2.26b) and (2.26d) can be written as

\[ \mathcal{S}_E^{(tf)}\Big(\mathcal{S}_E^{(ft)}\big(u(t)\big)\Big) = \frac{1}{2}\,[\,u(t) - u(-t)\,] \tag{2.26e} \]
and
\[ \mathcal{C}_E^{(tf)}\Big(\mathcal{C}_E^{(ft)}\big(u(t)\big)\Big) = \frac{1}{2}\,[\,u(t) + u(-t)\,] . \tag{2.26f} \]

Adding together the extended sine transform of the extended sine transform and the extended
cosine transform of the extended cosine transform then gives

\[ \mathcal{S}_E^{(tf)}\Big(\mathcal{S}_E^{(ft)}\big(u(t)\big)\Big) + \mathcal{C}_E^{(tf)}\Big(\mathcal{C}_E^{(ft)}\big(u(t)\big)\Big)
 = \frac{1}{2}\,[\,u(t) - u(-t)\,] + \frac{1}{2}\,[\,u(t) + u(-t)\,] = u(t) . \tag{2.26g} \]

We conclude that for any function u(t), the sum of the extended sine transform of the extended
sine transform and the extended cosine transform of the extended cosine transform returns the
original function.
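
Equations (2.26e) through (2.26g) can be checked numerically. The sketch below is a rough illustration under stated assumptions: the test function `u` (a Gaussian centred at $t = 0.3$, so that it is mixed), the grid limits of $\pm 6$, the step of 0.05, and all function names are hypothetical choices, and the infinite integrals are replaced by plain Riemann sums, which is adequate only because this particular function and its transforms decay very quickly.

```python
import math

STEP = 0.05
GRID = [k * STEP for k in range(-120, 121)]  # -6 ... 6, used for both t and f

def u(t):
    """Hypothetical mixed test function: a Gaussian centred at t = 0.3."""
    return math.exp(-math.pi * (t - 0.3) ** 2)

def extended_transform(values, trig, x):
    """Riemann-sum estimate of the integral of values(y) * trig(2*pi*x*y) over y."""
    return sum(v * trig(2.0 * math.pi * x * y) for v, y in zip(values, GRID)) * STEP

u_samples = [u(t) for t in GRID]
U_SE = [extended_transform(u_samples, math.sin, f) for f in GRID]  # Eq. (2.14a)
U_CE = [extended_transform(u_samples, math.cos, f) for f in GRID]  # Eq. (2.14b)

def sine_of_sine(t):
    """Extended sine transform of the extended sine transform, Eq. (2.24a)."""
    return extended_transform(U_SE, math.sin, t)

def cosine_of_cosine(t):
    """Extended cosine transform of the extended cosine transform, Eq. (2.24b)."""
    return extended_transform(U_CE, math.cos, t)

t0 = 0.4
print(sine_of_sine(t0), 0.5 * (u(t0) - u(-t0)))      # odd part, Eq. (2.26e)
print(cosine_of_cosine(t0), 0.5 * (u(t0) + u(-t0)))  # even part, Eq. (2.26f)
print(sine_of_sine(t0) + cosine_of_cosine(t0), u(t0))  # sum, Eq. (2.26g)
```

Each printed pair should agree closely, and the same holds for negative values of `t0`.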
One obvious way to proceed from this point is to define the Hartley transform

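\[ \begin{aligned}
\mathcal{H}^{(ft)}\big(u(t)\big) &= \int_{-\infty}^{\infty} u(t)\,[\,\cos(2\pi f t) + \sin(2\pi f t)\,]\,dt \\
 &= \int_{-\infty}^{\infty} u(t)\,\cos(2\pi f t)\,dt + \int_{-\infty}^{\infty} u(t)\,\sin(2\pi f t)\,dt \\
 &= \mathcal{C}_E^{(ft)}\big(u(t)\big) + \mathcal{S}_E^{(ft)}\big(u(t)\big) \\
 &= U_{\mathcal{C}E}(f) + U_{\mathcal{S}E}(f) ,
\end{aligned} \tag{2.26h} \]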

where in the next-to-last step we use definitions (2.14a) and (2.14b) of the extended sine and
cosine transforms and in the last step Eqs. (2.23a) and (2.23b) are used to write the extended sine
and cosine transforms as functions of f. The order of the ft product in the superscript is not
important because, just like in the sine and cosine transforms, we have

\[ \mathcal{H}^{(ft)}\big(u(t)\big) = \mathcal{H}^{(tf)}\big(u(t)\big) . \]

Working with this definition, we see that the Hartley transform of the Hartley transform gives

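\[ \begin{aligned}
\mathcal{H}^{(tf)}\Big(\mathcal{H}^{(ft)}\big(u(t)\big)\Big) &= \mathcal{H}^{(tf)}\big(U_{\mathcal{C}E}(f) + U_{\mathcal{S}E}(f)\big) \\
 &= \int_{-\infty}^{\infty} [\,U_{\mathcal{C}E}(f) + U_{\mathcal{S}E}(f)\,]\,[\,\cos(2\pi f t) + \sin(2\pi f t)\,]\,df .
\end{aligned} \tag{2.26i} \]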

According to Eqs. (2.22a) and (2.22b), the extended sine transform $U_{\mathcal{S}E}$ is an odd function of f
and the extended cosine transform $U_{\mathcal{C}E}$ is an even function of f. Using the same reasoning as in
Eqs. (2.16a) and (2.16b) above,

\[ U_{\mathcal{C}E}(-f)\,\sin\!\big(2\pi t(-f)\big) = [\,U_{\mathcal{C}E}(f)\,]\cdot[\,-\sin(2\pi f t)\,] = -[\,U_{\mathcal{C}E}(f)\,\sin(2\pi f t)\,] \]
and
\[ U_{\mathcal{S}E}(-f)\,\cos\!\big(2\pi t(-f)\big) = [\,-U_{\mathcal{S}E}(f)\,]\cdot[\,\cos(2\pi f t)\,] = -[\,U_{\mathcal{S}E}(f)\,\cos(2\pi f t)\,] . \]

We see that $U_{\mathcal{C}E}(f)\,\sin(2\pi f t)$ and $U_{\mathcal{S}E}(f)\,\cos(2\pi f t)$ are both odd functions of f, and Eq. (2.17)
states that the integral between $-\infty$ and $+\infty$ of any odd function is zero. Therefore,

\[ \int_{-\infty}^{\infty} U_{\mathcal{C}E}(f)\,\sin(2\pi f t)\,df = \int_{-\infty}^{\infty} U_{\mathcal{S}E}(f)\,\cos(2\pi f t)\,df = 0 . \]

Now the Hartley transform of the Hartley transform in Eq. (2.26i) can be simplified to

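\[ \begin{aligned}
\mathcal{H}^{(tf)}\Big(\mathcal{H}^{(ft)}\big(u(t)\big)\Big)
 &= \int_{-\infty}^{\infty} [\,U_{\mathcal{C}E}(f) + U_{\mathcal{S}E}(f)\,]\,[\,\cos(2\pi f t) + \sin(2\pi f t)\,]\,df \\
 &= \int_{-\infty}^{\infty} U_{\mathcal{C}E}(f)\,\cos(2\pi f t)\,df + \int_{-\infty}^{\infty} U_{\mathcal{S}E}(f)\,\sin(2\pi f t)\,df \\
 &\quad + \int_{-\infty}^{\infty} U_{\mathcal{S}E}(f)\,\cos(2\pi f t)\,df + \int_{-\infty}^{\infty} U_{\mathcal{C}E}(f)\,\sin(2\pi f t)\,df \\
 &= \int_{-\infty}^{\infty} U_{\mathcal{C}E}(f)\,\cos(2\pi f t)\,df + \int_{-\infty}^{\infty} U_{\mathcal{S}E}(f)\,\sin(2\pi f t)\,df \\
 &= \mathcal{C}_E^{(tf)}\big(U_{\mathcal{C}E}(f)\big) + \mathcal{S}_E^{(tf)}\big(U_{\mathcal{S}E}(f)\big) .
\end{aligned} \]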

Because $U_{\mathcal{C}E}$ and $U_{\mathcal{S}E}$ are respectively the extended cosine and sine transforms of u [see Eqs.
(2.23a) and (2.23b)], we have

\[ \mathcal{H}^{(tf)}\Big(\mathcal{H}^{(ft)}\big(u(t)\big)\Big) = \mathcal{S}_E^{(tf)}\Big(\mathcal{S}_E^{(ft)}\big(u(t)\big)\Big) + \mathcal{C}_E^{(tf)}\Big(\mathcal{C}_E^{(ft)}\big(u(t)\big)\Big) , \]

which becomes, substituting from (2.26g),

\[ \mathcal{H}^{(tf)}\Big(\mathcal{H}^{(ft)}\big(u(t)\big)\Big) = u(t) . \tag{2.26j} \]
We see that the Hartley transform of the Hartley transform returns the original function for both
positive and negative values of t. The Hartley transform was never very popular and is only rarely
encountered today. What is done instead, as we shall see in the next section, is to combine the
extended sine and cosine transforms into a single Fourier transform based on a complex
exponential.
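
The self-inverse property in Eq. (2.26j) can also be illustrated numerically. The sketch below makes the same kind of assumptions as any quick quadrature check: the Gaussian test function `u`, the grid limits and spacing, and the names `cas` and `hartley` are hypothetical, and Riemann sums stand in for the infinite integrals.

```python
import math

STEP = 0.05
GRID = [k * STEP for k in range(-120, 121)]  # -6 ... 6, used for both t and f

def u(t):
    """Hypothetical mixed test function: a Gaussian centred at t = 0.3."""
    return math.exp(-math.pi * (t - 0.3) ** 2)

def cas(x):
    """Kernel of the Hartley transform: cos(x) + sin(x)."""
    return math.cos(x) + math.sin(x)

def hartley(values, x):
    """Riemann-sum estimate of the Hartley transform in Eq. (2.26h)."""
    return sum(v * cas(2.0 * math.pi * x * y) for v, y in zip(values, GRID)) * STEP

u_samples = [u(t) for t in GRID]
H = [hartley(u_samples, f) for f in GRID]  # Hartley transform sampled on the f grid

for t0 in (0.4, -0.9):
    print(t0, hartley(H, t0), u(t0))  # the second transform should reproduce u(t0)
```

Unlike the double sine or double cosine transform alone, the double Hartley transform returns the full mixed function at both positive and negative arguments.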
2.5 Forward and Inverse Fourier Transforms
The Fourier transform is based on the well-known identity

\[ e^{i\phi} = \cos(\phi) + i\,\sin(\phi) , \tag{2.27} \]

where $i = \sqrt{-1}$.
For any real function u(t) satisfying requirements (V) through (VIII) in Sec. 2.4, we can add
the extended cosine transform to i times the extended sine transform to get

\[ \mathcal{C}_E^{(ft)}\big(u(t)\big) + i\,\mathcal{S}_E^{(ft)}\big(u(t)\big)
 = \int_{-\infty}^{\infty} u(t)\,[\,\cos(2\pi f t) + i\,\sin(2\pi f t)\,]\,dt
 = \int_{-\infty}^{\infty} e^{2\pi i f t}\,u(t)\,dt . \tag{2.28a} \]

From Eqs. (2.23a) and (2.23b), we have

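\[ \mathcal{C}_E^{(ft)}\big(u(t)\big) = U_{\mathcal{C}E}(f) \quad \text{and} \quad \mathcal{S}_E^{(ft)}\big(u(t)\big) = U_{\mathcal{S}E}(f) , \]

which means (2.28a) can be written as

\[ \int_{-\infty}^{\infty} e^{2\pi i f t}\,u(t)\,dt = U_{\mathcal{C}E}(f) + i\,U_{\mathcal{S}E}(f) . \tag{2.28b} \]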

Taking the extended sine transform of both sides of (2.28b) gives

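\[ \begin{aligned}
\int_{-\infty}^{\infty} df\,\sin(2\pi f t) \int_{-\infty}^{\infty} dt'\,e^{2\pi i f t'}\,u(t')
 &= \int_{-\infty}^{\infty} U_{\mathcal{C}E}(f)\,\sin(2\pi f t)\,df + i \int_{-\infty}^{\infty} U_{\mathcal{S}E}(f)\,\sin(2\pi f t)\,df \\
 &= i \int_{-\infty}^{\infty} U_{\mathcal{S}E}(f)\,\sin(2\pi f t)\,df
\end{aligned} \tag{2.28c} \]

because $U_{\mathcal{C}E}(f)\,\sin(2\pi f t)$ is an odd function of f and integrates to zero [see discussion after Eq.
(2.26i) above]. Taking the extended cosine transform of both sides of Eq. (2.28b) gives

\[ \begin{aligned}
\int_{-\infty}^{\infty} df\,\cos(2\pi f t) \int_{-\infty}^{\infty} dt'\,e^{2\pi i f t'}\,u(t')
 &= \int_{-\infty}^{\infty} U_{\mathcal{C}E}(f)\,\cos(2\pi f t)\,df + i \int_{-\infty}^{\infty} U_{\mathcal{S}E}(f)\,\cos(2\pi f t)\,df \\
 &= \int_{-\infty}^{\infty} U_{\mathcal{C}E}(f)\,\cos(2\pi f t)\,df
\end{aligned} \tag{2.28d} \]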

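because $U_{\mathcal{S}E}(f)\,\cos(2\pi f t)$ is an odd function of f and integrates to zero. Substitution of Eqs.
(2.24a) and (2.24b) into (2.28c) and (2.28d) gives

\[ \int_{-\infty}^{\infty} df\,\sin(2\pi f t) \int_{-\infty}^{\infty} dt'\,e^{2\pi i f t'}\,u(t') = i\,\mathcal{S}_E^{(tf)}\big(U_{\mathcal{S}E}(f)\big) \tag{2.28e} \]
and
\[ \int_{-\infty}^{\infty} df\,\cos(2\pi f t) \int_{-\infty}^{\infty} dt'\,e^{2\pi i f t'}\,u(t') = \mathcal{C}_E^{(tf)}\big(U_{\mathcal{C}E}(f)\big) . \tag{2.28f} \]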

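Since $\mathcal{C}_E^{(ft)}\big(u(t)\big) = U_{\mathcal{C}E}(f)$ and $\mathcal{S}_E^{(ft)}\big(u(t)\big) = U_{\mathcal{S}E}(f)$ [see Eqs. (2.23a) and (2.23b)], Eqs.
(2.28e) and (2.28f) can be written as

\[ \int_{-\infty}^{\infty} df\,\sin(2\pi f t) \int_{-\infty}^{\infty} dt'\,e^{2\pi i f t'}\,u(t') = i\,\mathcal{S}_E^{(tf)}\Big(\mathcal{S}_E^{(ft)}\big(u(t)\big)\Big) \tag{2.28g} \]
and
\[ \int_{-\infty}^{\infty} df\,\cos(2\pi f t) \int_{-\infty}^{\infty} dt'\,e^{2\pi i f t'}\,u(t') = \mathcal{C}_E^{(tf)}\Big(\mathcal{C}_E^{(ft)}\big(u(t)\big)\Big) . \tag{2.28h} \]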

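We now multiply both sides of (2.28g) by $(-i)$ and sum the resulting equation with Eq. (2.28h) to
get

\[ \begin{aligned}
\int_{-\infty}^{\infty} df\,\cos(2\pi f t) \int_{-\infty}^{\infty} dt'\,e^{2\pi i f t'}\,u(t')
 &- i \int_{-\infty}^{\infty} df\,\sin(2\pi f t) \int_{-\infty}^{\infty} dt'\,e^{2\pi i f t'}\,u(t') \\
 &= \mathcal{C}_E^{(tf)}\Big(\mathcal{C}_E^{(ft)}\big(u(t)\big)\Big) + \mathcal{S}_E^{(tf)}\Big(\mathcal{S}_E^{(ft)}\big(u(t)\big)\Big)
\end{aligned} \]

or, using the identity $e^{-i\phi} = \cos(\phi) - i\,\sin(\phi)$,

\[ \int_{-\infty}^{\infty} df\,e^{-2\pi i f t} \int_{-\infty}^{\infty} dt'\,e^{2\pi i f t'}\,u(t')
 = \mathcal{S}_E^{(tf)}\Big(\mathcal{S}_E^{(ft)}\big(u(t)\big)\Big) + \mathcal{C}_E^{(tf)}\Big(\mathcal{C}_E^{(ft)}\big(u(t)\big)\Big) . \tag{2.28i} \]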

Equation (2.26g) simplifies this to

\[ \int_{-\infty}^{\infty} df\,e^{-2\pi i f t} \int_{-\infty}^{\infty} dt'\,e^{2\pi i f t'}\,u(t') = u(t) . \tag{2.28j} \]

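If, in Eq. (2.28a), we start out by adding the extended cosine transform to $(-i)$ times the extended
sine transform, then instead of Eqs. (2.28g) and (2.28h), we get [just replace $i$ by $(-i)$
everywhere]

\[ \int_{-\infty}^{\infty} df\,\sin(2\pi f t) \int_{-\infty}^{\infty} dt'\,e^{-2\pi i f t'}\,u(t') = -i\,\mathcal{S}_E^{(tf)}\Big(\mathcal{S}_E^{(ft)}\big(u(t)\big)\Big) \]
and
\[ \int_{-\infty}^{\infty} df\,\cos(2\pi f t) \int_{-\infty}^{\infty} dt'\,e^{-2\pi i f t'}\,u(t') = \mathcal{C}_E^{(tf)}\Big(\mathcal{C}_E^{(ft)}\big(u(t)\big)\Big) . \]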

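Now we must multiply the top equation by $i$ before summing it with the bottom equation to get

\[ \begin{aligned}
\int_{-\infty}^{\infty} df\,\cos(2\pi f t) \int_{-\infty}^{\infty} dt'\,e^{-2\pi i f t'}\,u(t')
 &+ i \int_{-\infty}^{\infty} df\,\sin(2\pi f t) \int_{-\infty}^{\infty} dt'\,e^{-2\pi i f t'}\,u(t') \\
 &= \mathcal{S}_E^{(tf)}\Big(\mathcal{S}_E^{(ft)}\big(u(t)\big)\Big) + \mathcal{C}_E^{(tf)}\Big(\mathcal{C}_E^{(ft)}\big(u(t)\big)\Big)
\end{aligned} \]

or

\[ \int_{-\infty}^{\infty} df\,e^{2\pi i f t} \int_{-\infty}^{\infty} dt'\,e^{-2\pi i f t'}\,u(t') = u(t) . \tag{2.28k} \]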

Clearly, Eqs. (2.28j) and (2.28k) are basically the same identity, which can be written as

\[ \int_{-\infty}^{\infty} df\,e^{\pm 2\pi i f t} \int_{-\infty}^{\infty} dt'\,e^{\mp 2\pi i f t'}\,u(t') = u(t) . \tag{2.28$\ell$} \]

As long as the exponent of e changes sign in the two integrals over f and t, we get back the
original function. Looking at how Eqs. (2.28j) and (2.28k) are derived, we see that if the sign of
the exponent does not change, we get

\[ \mathcal{C}_E^{(tf)}\Big(\mathcal{C}_E^{(ft)}\big(u(t)\big)\Big) - \mathcal{S}_E^{(tf)}\Big(\mathcal{S}_E^{(ft)}\big(u(t)\big)\Big) \]
instead of
\[ \mathcal{C}_E^{(tf)}\Big(\mathcal{C}_E^{(ft)}\big(u(t)\big)\Big) + \mathcal{S}_E^{(tf)}\Big(\mathcal{S}_E^{(ft)}\big(u(t)\big)\Big) . \]

Equations (2.26e) and (2.26f) then show that

\[ \mathcal{C}_E^{(tf)}\Big(\mathcal{C}_E^{(ft)}\big(u(t)\big)\Big) - \mathcal{S}_E^{(tf)}\Big(\mathcal{S}_E^{(ft)}\big(u(t)\big)\Big) = u(-t) , \]

which gives

\[ \int_{-\infty}^{\infty} df\,e^{\pm 2\pi i f t} \int_{-\infty}^{\infty} dt'\,e^{\pm 2\pi i f t'}\,u(t') = u(-t) . \tag{2.28m} \]

This interesting result shows that when $u$ is even so that $u(-t) = u(t)$, we still get back the
original function, and when $u$ is odd so that $u(-t) = -u(t)$, we just have to multiply by $(-1)$ to
retrieve $u$. Even when $u$ is mixed, no information is lost; reversing the sign of the argument still
gets us back to the original function. Replacing $t$ by $-t$ in (2.28m) takes us back to the original
formula (2.28ℓ).
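The difference between changing and not changing the sign of the exponent can be checked numerically. The following is only a rough sketch under stated assumptions: the test function `u`, the sampling grid, and the helper name `double_transform` are hypothetical choices made for illustration, and the infinite integrals are approximated by Riemann sums over a finite interval.

```python
import cmath
import math

def u(t):
    # Hypothetical test function of mixed parity (neither even nor odd).
    return (1.0 + t) * math.exp(-math.pi * t * t)

STEP = 0.05
GRID = [k * STEP for k in range(-100, 101)]  # samples covering [-5, 5]

def double_transform(t, outer_sign, inner_sign):
    """Riemann-sum estimate of
    int df e^{outer_sign*2*pi*i*f*t} int dt' e^{inner_sign*2*pi*i*f*t'} u(t')."""
    total = 0j
    for f in GRID:
        inner = sum(u(tp) * cmath.exp(inner_sign * 2j * math.pi * f * tp)
                    for tp in GRID) * STEP
        total += inner * cmath.exp(outer_sign * 2j * math.pi * f * t) * STEP
    return total

t0 = 0.3
opposite_signs = double_transform(t0, +1, -1)  # sign changes: expect u(t0)
same_signs = double_transform(t0, +1, +1)      # sign does not change: expect u(-t0)
print(abs(opposite_signs - u(t0)), abs(same_signs - u(-t0)))
```

Both printed differences should be at the level of rounding error for this smooth, rapidly decaying test function; a function with slower decay would need a wider grid.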
Up to this point, we have taken $u$ to be real, but if Eq. (2.28ℓ) holds true when $u$ is a real
function of a real argument, it must also hold true when $u$ is a complex function of a real
argument. To show why this is so, we break complex functions $u(t)$ of a real argument $t$ into real
and imaginary parts,
$$
u(t) = u_r(t) + i\,u_i(t),
$$
where $u_r$ and $u_i$ are both real functions of $t$. Substituting this complex-valued $u(t)$ into the
left-hand side of (2.28ℓ) gives

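$$
\begin{aligned}
&\int_{-\infty}^{\infty} df\,e^{\pm 2\pi ift}\int_{-\infty}^{\infty} dt'\,e^{\mp 2\pi ift'}\left[u_r(t') + i\,u_i(t')\right] \\
&\qquad = \int_{-\infty}^{\infty} df\,e^{\pm 2\pi ift}\int_{-\infty}^{\infty} dt'\,e^{\mp 2\pi ift'}u_r(t')
+ i\int_{-\infty}^{\infty} df\,e^{\pm 2\pi ift}\int_{-\infty}^{\infty} dt'\,e^{\mp 2\pi ift'}u_i(t').
\end{aligned}
$$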

Since (2.28ℓ) holds for real functions $u_r$ and $u_i$, this last expression must be equal to the
original complex function $u$,

$$
u_r(t) + i\,u_i(t) = u(t),
$$
showing that Eq. (2.28ℓ) is true for complex functions of $t$ as well as strictly real functions of $t$.
Similar reasoning shows that (2.28m) also holds true for complex functions of real variables.
Indeed, we can even apply this analysis to the unextended sine and cosine transforms to show that
the unextended sine transform of the unextended sine transform and the unextended cosine
transform of the unextended cosine transform return the original function (for positive values of
the argument) when the original function is complex.
We now define the Fourier transform of a complex function u with real argument t to be

$$
\mathcal{F}^{(-ift)}\big(u(t)\big) = \int_{-\infty}^{\infty} u(t)\,e^{-2\pi ift}\,dt. \tag{2.29a}
$$

The notation for $\mathcal{F}$ introduced in (2.29a) explicitly shows that $t$, being repeated inside both upper
and lower parentheses, is the dummy variable of integration; and that $\mathcal{F}$ produces a function of
$f$ because $f$ is only listed in the upper parentheses. We call (2.29a) the forward Fourier transform
and, when convenient, follow the custom of writing it with the upper-case letter of the
transformed function,

$$
U(f) = \int_{-\infty}^{\infty} u(t)\,e^{-2\pi ift}\,dt. \tag{2.29b}
$$

If (2.29a) is the forward transform, then the inverse Fourier transform is

$$
\mathcal{F}^{(itf)}\big(U(f)\big) = \int_{-\infty}^{\infty} U(f)\,e^{2\pi ift}\,df. \tag{2.29c}
$$

In both the forward and inverse transform the order of the $tf$ product in the superscript is
irrelevant, just as it is for the sine, cosine, and Hartley transforms,

$$
\mathcal{F}^{(-itf)}\big(u(t)\big) = \mathcal{F}^{(-ift)}\big(u(t)\big)
\quad\text{and}\quad
\mathcal{F}^{(itf)}\big(U(f)\big) = \mathcal{F}^{(ift)}\big(U(f)\big).
$$

What is important is the sign inside the superscript, since it determines whether the forward or
inverse transform is being performed. Equation (2.28ℓ) shows, of course, that

$$
\mathcal{F}^{(itf)}\Big(\mathcal{F}^{(-ift')}\big(u(t')\big)\Big) = \mathcal{F}^{(itf)}\big(U(f)\big)
= \int_{-\infty}^{\infty} U(f)\,e^{2\pi ift}\,df = u(t). \tag{2.29d}
$$

It is entirely a matter of convention which Fourier transform is called the forward transform and
which is called the reverse transform; all that matters is for (2.28ℓ) to be satisfied. Some authors
change the sign of the exponent $(-)2\pi ift$, defining the forward Fourier transform to be
$\mathcal{F}^{(ift)}$,

$$
\mathcal{F}^{(ift)}\big(u(t)\big) = \int_{-\infty}^{\infty} u(t)\,e^{2\pi ift}\,dt,
$$

and the inverse Fourier transform to be $\mathcal{F}^{(-itf)}$,

$$
\mathcal{F}^{(-itf)}\big(U(f)\big) = \int_{-\infty}^{\infty} U(f)\,e^{-2\pi ift}\,df.
$$

Clearly, this convention also satisfies (2.28ℓ), with the inverse Fourier transform of the forward
Fourier transform still returning the original function.
In physics and related disciplines, the frequency variable is often changed to $\omega = 2\pi f$, so that
(2.28ℓ) becomes

$$
\frac{1}{2\pi}\int_{-\infty}^{\infty} d\omega\,e^{\pm i\omega t}\int_{-\infty}^{\infty} dt'\,e^{\mp i\omega t'}u(t') = u(t). \tag{2.30a}
$$

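Authors using the frequency variable $\omega$ allocate the factor of $1/(2\pi)$ different ways when
defining the forward and inverse Fourier transforms in terms of $\omega$, with all reasonable
possibilities chosen at one time or another:

Forward Fourier transform of $u(t)$ is $\displaystyle\int_{-\infty}^{\infty} u(t)\,e^{\mp i\omega t}\,dt = U(\omega)$, (2.30b)

Inverse Fourier transform of $U(\omega)$ is $\displaystyle\frac{1}{2\pi}\int_{-\infty}^{\infty} U(\omega)\,e^{\pm i\omega t}\,d\omega = u(t)$,

Forward Fourier transform of $u(t)$ is $\displaystyle\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} u(t)\,e^{\mp i\omega t}\,dt = U(\omega)$, (2.30c)

Inverse Fourier transform of $U(\omega)$ is $\displaystyle\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} U(\omega)\,e^{\pm i\omega t}\,d\omega = u(t)$,

Forward Fourier transform of $u(t)$ is $\displaystyle\frac{1}{2\pi}\int_{-\infty}^{\infty} u(t)\,e^{\mp i\omega t}\,dt = U(\omega)$, (2.30d)

Inverse Fourier transform of $U(\omega)$ is $\displaystyle\int_{-\infty}^{\infty} U(\omega)\,e^{\pm i\omega t}\,d\omega = u(t)$.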

In each of the three pairs of definitions listed above, the plus and minus signs are synchronized;
so if the top (bottom) sign is chosen for the first member of the pair then the top (bottom) sign
must also be chosen for the second member of the pair. This gives a total of six different ways of
defining the forward and inverse Fourier transforms, and all six satisfy Eq. (2.30a).
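To make the bookkeeping concrete, the six choices can be compared numerically. This is a minimal sketch, not a definitive implementation: the test function `u`, the grid, and the helper `round_trip` are hypothetical, and the integrals over $t$ and $\omega$ are replaced by Riemann sums on a finite interval. Each convention is described by a forward factor, an inverse factor, and the sign used in the forward exponent.

```python
import cmath
import math

def u(t):
    # Hypothetical test function with both even and odd parts.
    return (1.0 + 0.5 * t) * math.exp(-0.5 * t * t)

STEP = 0.1
PTS = [k * STEP for k in range(-100, 101)]   # samples covering [-10, 10]

def round_trip(t, forward_factor, inverse_factor, forward_sign):
    """Forward transform using e^{forward_sign*i*w*t'}, then the inverse
    transform using the opposite sign, evaluated at the single point t."""
    total = 0j
    for w in PTS:
        U = forward_factor * sum(
            u(tp) * cmath.exp(forward_sign * 1j * w * tp) for tp in PTS) * STEP
        total += U * cmath.exp(-forward_sign * 1j * w * t) * STEP
    return inverse_factor * total

two_pi = 2.0 * math.pi
conventions = {
    "1 and 1/(2 pi)": (1.0, 1.0 / two_pi),
    "symmetric 1/sqrt(2 pi)": (1.0 / math.sqrt(two_pi), 1.0 / math.sqrt(two_pi)),
    "1/(2 pi) and 1": (1.0 / two_pi, 1.0),
}
errors = {}
for name, (fwd, inv) in conventions.items():
    for sign in (-1, +1):
        errors[(name, sign)] = abs(round_trip(0.7, fwd, inv, sign) - u(0.7))
print(max(errors.values()))
```

All six round trips should return $u(0.7)$ to within rounding error, because only the product of the two factors, $1/(2\pi)$, and the sign reversal between the two exponents matter.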
The unextended sine and cosine transforms, usually called just the sine and cosine
transforms, can also be defined in many different ways. Equations (2.8a), (2.8c), (2.8e), and
(2.8b), (2.8d), (2.8f) can be combined to write

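$$
4\int_0^{\infty} df\,\sin(2\pi ft)\int_0^{\infty} dt'\,u(t')\sin(2\pi ft') = u(t) \quad\text{for } t > 0 \tag{2.31a}
$$
and
$$
4\int_0^{\infty} df\,\cos(2\pi ft)\int_0^{\infty} dt'\,u(t')\cos(2\pi ft') = u(t) \quad\text{for } t > 0. \tag{2.31b}
$$

Changing the frequency variable to $\omega = 2\pi f$ gives

$$
\frac{2}{\pi}\int_0^{\infty} d\omega\,\sin(\omega t)\int_0^{\infty} dt'\,u(t')\sin(\omega t') = u(t) \quad\text{for } t > 0 \tag{2.31c}
$$
and
$$
\frac{2}{\pi}\int_0^{\infty} d\omega\,\cos(\omega t)\int_0^{\infty} dt'\,u(t')\cos(\omega t') = u(t) \quad\text{for } t > 0. \tag{2.31d}
$$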

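Just like the factor of $1/(2\pi)$ in Eq. (2.30a), the factor of $2/\pi$ in (2.31c) and (2.31d) can be
allocated three different ways when defining the forward and inverse sine and cosine transforms:

Forward sine transform of $u(t)$ for $t > 0$ is $\displaystyle\int_0^{\infty} u(t)\sin(\omega t)\,dt = U_{\mathcal{S}}(\omega)$, (2.31e)
Forward cosine transform of $u(t)$ for $t > 0$ is $\displaystyle\int_0^{\infty} u(t)\cos(\omega t)\,dt = U_{\mathcal{C}}(\omega)$,
Inverse sine transform of $U_{\mathcal{S}}(\omega)$ is $\displaystyle\frac{2}{\pi}\int_0^{\infty} U_{\mathcal{S}}(\omega)\sin(\omega t)\,d\omega = u(t)$ for $t > 0$,
Inverse cosine transform of $U_{\mathcal{C}}(\omega)$ is $\displaystyle\frac{2}{\pi}\int_0^{\infty} U_{\mathcal{C}}(\omega)\cos(\omega t)\,d\omega = u(t)$ for $t > 0$,

Forward sine transform of $u(t)$ for $t > 0$ is $\displaystyle\sqrt{\frac{2}{\pi}}\int_0^{\infty} u(t)\sin(\omega t)\,dt = U_{\mathcal{S}}(\omega)$, (2.31f)
Forward cosine transform of $u(t)$ for $t > 0$ is $\displaystyle\sqrt{\frac{2}{\pi}}\int_0^{\infty} u(t)\cos(\omega t)\,dt = U_{\mathcal{C}}(\omega)$,
Inverse sine transform of $U_{\mathcal{S}}(\omega)$ is $\displaystyle\sqrt{\frac{2}{\pi}}\int_0^{\infty} U_{\mathcal{S}}(\omega)\sin(\omega t)\,d\omega = u(t)$ for $t > 0$,
Inverse cosine transform of $U_{\mathcal{C}}(\omega)$ is $\displaystyle\sqrt{\frac{2}{\pi}}\int_0^{\infty} U_{\mathcal{C}}(\omega)\cos(\omega t)\,d\omega = u(t)$ for $t > 0$,

Forward sine transform of $u(t)$ for $t > 0$ is $\displaystyle\frac{2}{\pi}\int_0^{\infty} u(t)\sin(\omega t)\,dt = U_{\mathcal{S}}(\omega)$, (2.31g)
Forward cosine transform of $u(t)$ for $t > 0$ is $\displaystyle\frac{2}{\pi}\int_0^{\infty} u(t)\cos(\omega t)\,dt = U_{\mathcal{C}}(\omega)$,
Inverse sine transform of $U_{\mathcal{S}}(\omega)$ is $\displaystyle\int_0^{\infty} U_{\mathcal{S}}(\omega)\sin(\omega t)\,d\omega = u(t)$ for $t > 0$,
Inverse cosine transform of $U_{\mathcal{C}}(\omega)$ is $\displaystyle\int_0^{\infty} U_{\mathcal{C}}(\omega)\cos(\omega t)\,d\omega = u(t)$ for $t > 0$.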

The reader should expect to encounter all three classes of definitions given in (2.31e)–(2.31g).
The symmetric definitions in (2.31f) are the most popular, probably because they remove the
distinction between the forward and inverse transform, letting us say that the sine transform of
the sine transform and the cosine transform of the cosine transform return the original function
for $t > 0$.
In today's optical-engineering textbooks, and in user manuals for the fast Fourier transform,
there is a tendency to choose Eqs. (2.29a)–(2.29d) as the definitions of the forward and inverse
Fourier transform, and that is the convention followed here. It is perhaps somewhat
unconventional not to use the frequency variable $\omega = 2\pi f$ when defining the sine and cosine
transforms, but using $f$ rather than $\omega$ brings their definitions into conformity with the definitions
chosen for the forward and inverse Fourier transforms.
2.6 Fourier Transform as a Linear Operation
The forward and inverse Fourier transforms are linear operations. If $\alpha$, $\beta$ are any two complex
constants and $u(t)$, $v(t)$ are two complex-valued functions of a real variable $t$, then the definition
of a linear operator $\mathcal{L}$ is that

$$
\mathcal{L}\big(\alpha u(t) + \beta v(t)\big) = \alpha\,\mathcal{L}\big(u(t)\big) + \beta\,\mathcal{L}\big(v(t)\big). \tag{2.32a}
$$

Examples of linear operators are multiplication by a specified function $g(t)$

$$
\mathcal{L}_1\big(u(t)\big) = g(t)\,u(t),
$$

differentiation with respect to $t$
$$
\mathcal{L}_2\big(u(t)\big) = \frac{du(t)}{dt},
$$

and integration over the interval $t_1 < t < t_2$

$$
\mathcal{L}_3\big(u(t)\big) = \int_{t_1}^{t_2} u(t)\,dt.
$$
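We see that for these three examples

$$
\mathcal{L}_1\big(\alpha u(t) + \beta v(t)\big) = \alpha\,g(t)u(t) + \beta\,g(t)v(t)
= \alpha\,\mathcal{L}_1\big(u(t)\big) + \beta\,\mathcal{L}_1\big(v(t)\big),
$$

$$
\mathcal{L}_2\big(\alpha u(t) + \beta v(t)\big) = \alpha\frac{du(t)}{dt} + \beta\frac{dv(t)}{dt}
= \alpha\,\mathcal{L}_2\big(u(t)\big) + \beta\,\mathcal{L}_2\big(v(t)\big),
$$
and
$$
\mathcal{L}_3\big(\alpha u(t) + \beta v(t)\big) = \alpha\int_{t_1}^{t_2} u(t)\,dt + \beta\int_{t_1}^{t_2} v(t)\,dt
= \alpha\,\mathcal{L}_3\big(u(t)\big) + \beta\,\mathcal{L}_3\big(v(t)\big).
$$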

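Combinations of linear operators are always linear; for example, the operator $\mathcal{Z}$ defined by

$$
\mathcal{Z}\big(u(t)\big) = \mathcal{L}_3\Big(\mathcal{L}_1\big(u(t)\big)\Big)
$$
must be linear because

$$
\begin{aligned}
\mathcal{Z}\big(\alpha u(t) + \beta v(t)\big)
&= \mathcal{L}_3\Big(\mathcal{L}_1\big(\alpha u(t) + \beta v(t)\big)\Big)
= \mathcal{L}_3\Big(\alpha\,\mathcal{L}_1\big(u(t)\big) + \beta\,\mathcal{L}_1\big(v(t)\big)\Big) \\
&= \alpha\,\mathcal{L}_3\Big(\mathcal{L}_1\big(u(t)\big)\Big) + \beta\,\mathcal{L}_3\Big(\mathcal{L}_1\big(v(t)\big)\Big) \\
&= \alpha\,\mathcal{Z}\big(u(t)\big) + \beta\,\mathcal{Z}\big(v(t)\big).
\end{aligned} \tag{2.32b}
$$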

We note that the forward Fourier transform

$$
\mathcal{F}^{(-ift)}\big(u(t)\big) = \int_{-\infty}^{\infty} u(t)\,e^{-2\pi ift}\,dt
$$

as defined in Eq. (2.29a) is, in fact, just $\mathcal{L}_3\big(\mathcal{L}_1(u(t))\big)$ with $g(t) = e^{-2\pi ift}$ in the
$\mathcal{L}_1$ multiplication and $t_1 = -\infty$, $t_2 = \infty$ in the $\mathcal{L}_3$ integration. Similarly, the inverse
Fourier transform is, interchanging the roles of the $f$ and $t$ variables in Eq. (2.29c),

$$
\mathcal{F}^{(ift)}\big(U(t)\big) = \int_{-\infty}^{\infty} U(t)\,e^{2\pi ift}\,dt,
$$

showing it to be $\mathcal{L}_3\big(\mathcal{L}_1(U(t))\big)$ with $g(t) = e^{2\pi ift}$ in the $\mathcal{L}_1$ multiplication and
$t_1 = -\infty$, $t_2 = \infty$ in the $\mathcal{L}_3$ integration. Equation (2.32b) thus shows that both the forward and
inverse Fourier transforms are linear. The unextended and extended sine transforms in Eqs. (2.8a) and (2.14a),

$$
\mathcal{S}^{(ft)}\big(u(t)\big) = 2\int_0^{\infty} u(t)\sin(2\pi ft)\,dt
\quad\text{and}\quad
\mathcal{S}_E^{(ft)}\big(u(t)\big) = \int_{-\infty}^{\infty} u(t)\sin(2\pi ft)\,dt,
$$

are also both $\mathcal{L}_3\big(\mathcal{L}_1(u(t))\big)$: the unextended sine transform has $g(t) = 2\sin(2\pi ft)$ in the
$\mathcal{L}_1$ multiplication and $t_1 = 0$, $t_2 = \infty$ in the $\mathcal{L}_3$ integration; and the extended sine
transform has $g(t) = \sin(2\pi ft)$ in the $\mathcal{L}_1$ multiplication and $t_1 = -\infty$, $t_2 = \infty$ in the
$\mathcal{L}_3$ integration. The
unextended and extended cosine transforms in Eqs. (2.8b) and (2.14b),

$$
\mathcal{C}^{(ft)}\big(u(t)\big) = 2\int_0^{\infty} u(t)\cos(2\pi ft)\,dt
\quad\text{and}\quad
\mathcal{C}_E^{(ft)}\big(u(t)\big) = \int_{-\infty}^{\infty} u(t)\cos(2\pi ft)\,dt,
$$

are, of course, identical to the unextended and extended sine transforms in being
$\mathcal{L}_3\big(\mathcal{L}_1(u(t))\big)$; the only change is that the sines change to cosines in the $\mathcal{L}_1$
multiplications. From Eq. (2.32b), all four transforms (the extended sine transform, the unextended sine
transform, the extended cosine transform, and the unextended cosine transform) are linear operations.
We see that the only other transform discussed so far, the Hartley transform

$$
\mathcal{H}^{(ft)}\big(u(t)\big) = \int_{-\infty}^{\infty} u(t)\left[\cos(2\pi ft) + \sin(2\pi ft)\right]dt
$$

in Eq. (2.26h), must also be linear because it is

$\mathcal{L}_3\big(\mathcal{L}_1(u(t))\big)$ with $g(t) = \cos(2\pi ft) + \sin(2\pi ft)$

in the $\mathcal{L}_1$ multiplication and has $t_1 = -\infty$, $t_2 = \infty$ in the $\mathcal{L}_3$ integration.
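The linearity property (2.32a) for the forward Fourier transform can be illustrated with a short numerical check. This is only a sketch under stated assumptions: the functions `u` and `v`, the constants `alpha` and `beta`, and the Riemann-sum helper `forward` are hypothetical choices, and the infinite integral is truncated to a finite interval.

```python
import cmath
import math

STEP = 0.01
N = 600  # samples cover [-6, 6]

def forward(func, f):
    """Riemann-sum estimate of the integral of func(t) * e^{-2 pi i f t} dt."""
    return sum(func(k * STEP) * cmath.exp(-2j * math.pi * f * k * STEP)
               for k in range(-N, N + 1)) * STEP

def u(t):
    return math.exp(-math.pi * t * t)              # real test function

def v(t):
    return cmath.exp(1j * t) * math.exp(-t * t)    # complex test function

alpha, beta = 2.0 - 1.0j, 0.5 + 3.0j               # arbitrary complex constants

def combination(t):
    return alpha * u(t) + beta * v(t)

f0 = 0.8
lhs = forward(combination, f0)                          # F(alpha*u + beta*v)
rhs = alpha * forward(u, f0) + beta * forward(v, f0)    # alpha*F(u) + beta*F(v)
print(abs(lhs - rhs))
```

Because the Riemann sum is itself a multiplication followed by a summation, the two sides agree to rounding error for any choice of test functions, which mirrors the $\mathcal{L}_3(\mathcal{L}_1(\cdot))$ argument in the text.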
2.7 Mathematical Symmetries of the Fourier Transform
There are a large number of symmetry relations that hold for any function $u(t)$ and its Fourier
transform
$$
U(f) = \mathcal{F}^{(-ift)}\big(u(t)\big) = \int_{-\infty}^{\infty} u(t)\,e^{-2\pi ift}\,dt. \tag{2.33a}
$$

We have already seen that the inverse Fourier transform of $U(f)$ returns the original function,

$$
\int_{-\infty}^{\infty} U(f)\,e^{2\pi ift}\,df = \mathcal{F}^{(itf)}\big(U(f)\big) = u(t). \tag{2.33b}
$$

Replacing $t$ by $-t$ changes this to
$$
u(-t) = \mathcal{F}^{(-itf)}\big(U(f)\big).
$$

Interchanging the roles of variables $f$ and $t$, we get

$$
u(-f) = \mathcal{F}^{(-ift)}\big(U(t)\big), \tag{2.33c}
$$

which shows that $u(-f)$ is the forward Fourier transform of $U(t)$. We expect, then, that $U(t)$ is the
inverse Fourier transform of $u(-f)$. To show this is true, we interchange the roles of variables $f$
and $t$ in (2.33a) and then make $f' = -f$ the new variable of integration to get

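$$
\begin{aligned}
U(t) = \mathcal{F}^{(-itf)}\big(u(f)\big)
&= \int_{-\infty}^{\infty} u(f)\,e^{-2\pi ift}\,df
= \int_{-\infty}^{\infty} u(-f')\,e^{2\pi if't}\,df'
= \int_{-\infty}^{\infty} u(-f)\,e^{2\pi ift}\,df \\
&= \mathcal{F}^{(itf)}\big(u(-f)\big).
\end{aligned} \tag{2.33d}
$$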

Not only does this show that $U(t)$ is the inverse Fourier transform of $u(-f)$ but also, by comparing
the two expressions involving the $\mathcal{F}$ operator, we see that changing the sign of the integration
variable does not change the value of the Fourier operation $\mathcal{F}$. It does, however, change its
name: the first $\mathcal{F}$ operation in (2.33d) is the forward Fourier transform of $u(f)$ and the second
$\mathcal{F}$ operation in (2.33d) is the inverse Fourier transform of $u(-f)$. Taking the complex conjugate of
all three expressions in Eq. (2.33b) gives

$$
u(t)^* = \int_{-\infty}^{\infty} U(f)^*\,e^{-2\pi ift}\,df = \mathcal{F}^{(-itf)}\big(U(f)^*\big),
$$

which shows that we get the complex conjugate of operator $\mathcal{F}$ by taking the complex conjugates
of the quantities inside both parentheses. Starting with the original Fourier transform relationship
between $U$ and $u$,
$$
U(f) = \mathcal{F}^{(-ift)}\big(u(t)\big) \tag{2.33e}
$$
and
$$
u(t) = \mathcal{F}^{(itf)}\big(U(f)\big), \tag{2.33f}
$$

we take the complex conjugates of both sides of (2.33e),

$$
U(f)^* = \mathcal{F}^{(ift)}\big(u(t)^*\big),
$$

and then change the sign of $f$ to get

$$
U(-f)^* = \mathcal{F}^{(-ift)}\big(u(t)^*\big). \tag{2.33g}
$$

This shows that $U(-f)^*$ is the forward Fourier transform of $u(t)^*$. Since $U(-f)^*$ is the forward
Fourier transform of $u(t)^*$, we expect the inverse Fourier transform of $U(-f)^*$ to be $u(t)^*$. To show
this is true, we just change the sign of integration variable $f$ in Eq. (2.33f),

$$
u(t) = \mathcal{F}^{(-itf)}\big(U(-f)\big),
$$

and then take the complex conjugate to get

$$
u(t)^* = \mathcal{F}^{(itf)}\big(U(-f)^*\big). \tag{2.33h}
$$

Hence, $u(t)^*$ is indeed the inverse Fourier transform of $U(-f)^*$.
When $u(t)$ is a strictly real function, as it is for much of the Fourier-transform work done in
this book, $u$ equals its complex conjugate so that

$$
\mathcal{F}^{(-ift)}\big(u(t)^*\big) = \mathcal{F}^{(-ift)}\big(u(t)\big),
$$

and Eq. (2.33g) becomes
$$
U(-f)^* = \mathcal{F}^{(-ift)}\big(u(t)\big).
$$

But $\mathcal{F}^{(-ift)}\big(u(t)\big)$ is just $U(f)$, the forward Fourier transform of $u$, so

$$
U(-f)^* = U(f)
$$

or, taking the complex conjugate of both sides,

$$
U(-f) = U(f)^*. \tag{2.34a}
$$

Functions $U(f)$ that obey Eq. (2.34a) are called Hermitian. If $u(t)$ is purely imaginary, so that
$u(t)^* = -u(t)$, then Eq. (2.33g) becomes

$$
U(-f)^* = \mathcal{F}^{(-ift)}\big(-u(t)\big)
$$
or
$$
\mathcal{F}^{(-ift)}\big(u(t)\big) = -U(-f)^*, \tag{2.34b}
$$

where the linearity of $\mathcal{F}$ is used to take $(-1)$ outside the transform and shift it over to the other
side of the equation. Since $\mathcal{F}^{(-ift)}\big(u(t)\big)$ is just $U(f)$, Eq. (2.34b) shows that

$$
U(f) = -U(-f)^*
$$
or
$$
U(-f) = -U(f)^* \tag{2.34c}
$$

when $u$ is purely imaginary. Functions $U(f)$ that obey Eq. (2.34c) are called anti-Hermitian. A
special and very important case occurs when $u$ is both real and even. Then, since $U$ is the forward
Fourier transform of $u$ with $U(f) = \mathcal{F}^{(-ift)}\big(u(t)\big)$, we take the complex conjugate of both
sides to get

$$
U(f)^* = \mathcal{F}^{(ift)}\big(u(t)^*\big).
$$

Because $u$ is real this becomes, changing the sign of the variable of integration,

$$
U(f)^* = \mathcal{F}^{(ift)}\big(u(t)\big) = \mathcal{F}^{(-ift)}\big(u(-t)\big).
$$

Because $u$ is even, this simplifies to

$$
U(f)^* = \mathcal{F}^{(-ift)}\big(u(t)\big) = U(f)
$$
so that
$$
U(f)^* = U(f). \tag{2.34d}
$$

Hence, $U$ equals its own complex conjugate, which shows it must be real. Because $u$ is real, we
already know that $U$ is Hermitian and (2.34a) must hold true; now that $U$ is known to be real, Eq.
(2.34a) can be written as
$$
U(-f) = U(f). \tag{2.34e}
$$

This shows that $U$ must be real and even when $u$ is real and even. Taking the real part of Eq.
(2.33a) now gives, since both $U$ and $u$ are known to be real,

$$
U(f) = \mathrm{Re}\left(\int_{-\infty}^{\infty} u(t)\,e^{-2\pi ift}\,dt\right) = \int_{-\infty}^{\infty} u(t)\,\mathrm{Re}\left(e^{-2\pi ift}\right)dt,
$$

which becomes, applying Eq. (2.27),

$$
U(f) = \int_{-\infty}^{\infty} u(t)\cos(2\pi ft)\,dt. \tag{2.34f}
$$

Because $u(t)$ is also even, we know that the product $u(t)\cos(2\pi ft)$ is even with respect to $t$,
which means that (2.34f) can be written as [see formula (2.19) above]

$$
U(f) = 2\int_0^{\infty} u(t)\cos(2\pi ft)\,dt. \tag{2.34g}
$$
The right-hand side is the unextended cosine transform of $u$, showing that when $u(t)$ is real and
even, its Fourier transform equals its cosine transform. According to Eq. (2.8f), it follows that $u$
must then be the cosine transform of $U$,

$$
u(t) = 2\int_0^{\infty} U(f)\cos(2\pi ft)\,df. \tag{2.34h}
$$
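These symmetry relations lend themselves to a quick numerical spot check. The sketch below is illustrative only: the three test functions, the grid, and the helper names `forward` and `cosine_transform` are assumptions made for the example, and the integrals are approximated by sums over a finite interval.

```python
import cmath
import math

STEP = 0.01
N = 800  # samples cover [-8, 8]

def forward(func, f):
    """Riemann-sum estimate of U(f) = integral of func(t) e^{-2 pi i f t} dt."""
    return sum(func(k * STEP) * cmath.exp(-2j * math.pi * f * k * STEP)
               for k in range(-N, N + 1)) * STEP

def cosine_transform(func, f):
    """Trapezoid estimate of 2 * integral_0^inf func(t) cos(2 pi f t) dt."""
    total = 0.5 * func(0.0)
    total += sum(func(k * STEP) * math.cos(2.0 * math.pi * f * k * STEP)
                 for k in range(1, N + 1))
    return 2.0 * total * STEP

def u_real(t):   # real, neither even nor odd
    return (1.0 + t) * math.exp(-math.pi * t * t)

def u_imag(t):   # purely imaginary
    return 1j * u_real(t)

def u_even(t):   # real and even
    return math.exp(-math.pi * t * t) * math.cos(3.0 * t)

f0 = 0.4
hermitian_gap = abs(forward(u_real, -f0) - forward(u_real, f0).conjugate())
anti_hermitian_gap = abs(forward(u_imag, -f0) + forward(u_imag, f0).conjugate())
U_even = forward(u_even, f0)
cosine_gap = abs(U_even - cosine_transform(u_even, f0))
print(hermitian_gap, anti_hermitian_gap, abs(U_even.imag), cosine_gap)
```

The first two numbers test the Hermitian and anti-Hermitian properties, the third tests that a real, even $u$ gives a real $U$, and the fourth compares the Fourier transform with the unextended cosine transform.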
2.8 Basic Fourier Identities
There are a number of simple Fourier identities that are true for the transforms of any function $u$.
One very simple identity, surprisingly easy to overlook, is that when $U(f)$ is the forward or
inverse Fourier transform of $u(t)$, the value of $U$ at the origin is the total integral of $u$:

$$
U(f)\Big|_{f=0} = \left.\int_{-\infty}^{\infty} u(t)\,e^{\mp 2\pi ift}\,dt\,\right|_{f=0}
$$
or
$$
U(0) = \int_{-\infty}^{\infty} u(t)\,dt. \tag{2.35a}
$$

Similarly, $u(0)$ is the total integral of $U(f)$:

$$
u(t)\Big|_{t=0} = \left.\int_{-\infty}^{\infty} U(f)\,e^{\pm 2\pi ift}\,df\,\right|_{t=0}
$$
or
$$
u(0) = \int_{-\infty}^{\infty} U(f)\,df. \tag{2.35b}
$$

When $U(f)$ is the forward Fourier transform of $u(t)$, the $n$th derivative of $U$ is

$$
\frac{d^n U}{df^n} = \frac{\partial^n}{\partial f^n}\int_{-\infty}^{\infty} u(t)\,e^{-2\pi ift}\,dt
= (-2\pi i)^n\int_{-\infty}^{\infty} t^n u(t)\,e^{-2\pi ift}\,dt; \tag{2.35c}
$$

and, because Eqs. (2.29a) and (2.29d) require $u$ to be the inverse transform of $U$ when $U$ is the
forward transform of $u$, the $n$th derivative of $u$ is

$$
\frac{d^n u}{dt^n} = \frac{\partial^n}{\partial t^n}\int_{-\infty}^{\infty} U(f)\,e^{2\pi ift}\,df
= (2\pi i)^n\int_{-\infty}^{\infty} f^n U(f)\,e^{2\pi ift}\,df. \tag{2.35d}
$$

Therefore, when both $u$ and $d^n u/dt^n$ satisfy requirements (V) through (VIII) in Sec. 2.4 and $U(f)$
is the forward Fourier transform of $u(t)$, Eq. (2.35d) shows that $[(2\pi i)^n f^n U(f)]$ must be the
forward Fourier transform of $d^n u/dt^n$ because $d^n u/dt^n$ is the inverse Fourier transform of
$[(2\pi i)^n f^n U(f)]$. Equation (2.35c) similarly shows that when $u(t)$ and $[t^n u(t)]$ satisfy
requirements (V) through (VIII) in Sec. 2.4 and $U(f)$ is the forward Fourier transform of $u(t)$, the
forward Fourier transform of $[t^n u(t)]$ is

$$
\frac{1}{(-2\pi i)^n}\,\frac{d^n U}{df^n}.
$$

We introduce the notation $\leftrightarrow$ to show this sort of Fourier-transform relationship between
functions, adopting the convention that the function on the right is always the forward Fourier
transform of the function on the left and the function on the left is always the inverse Fourier
transform of the function on the right. The results of the above analysis can then be written as

$$
\frac{d^n u}{dt^n} \leftrightarrow (2\pi i)^n f^n U(f) \tag{2.35e}
$$
and
$$
t^n u(t) \leftrightarrow \frac{1}{(-2\pi i)^n}\,\frac{d^n U}{df^n}. \tag{2.35f}
$$
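The $n = 1$ cases of the derivative identities can be spot-checked numerically. This is a hedged sketch rather than a general-purpose routine: the Gaussian test function, its hand-coded derivative `du_dt`, the grid, the helper `forward`, and the finite-difference step `h` are all assumptions made for the example.

```python
import cmath
import math

STEP = 0.01
N = 800  # samples cover [-8, 8]

def forward(func, f):
    """Riemann-sum estimate of the integral of func(t) e^{-2 pi i f t} dt."""
    return sum(func(k * STEP) * cmath.exp(-2j * math.pi * f * k * STEP)
               for k in range(-N, N + 1)) * STEP

def u(t):
    return math.exp(-math.pi * t * t)

def du_dt(t):
    return -2.0 * math.pi * t * math.exp(-math.pi * t * t)  # exact derivative of u

def t_times_u(t):
    return t * u(t)

f0, h = 0.4, 1e-4

# du/dt  <->  (2 pi i f) U(f)
gap_derivative = abs(forward(du_dt, f0) - 2j * math.pi * f0 * forward(u, f0))

# t u(t)  <->  (1 / (-2 pi i)) dU/df, with dU/df from a central difference
dU_df = (forward(u, f0 + h) - forward(u, f0 - h)) / (2.0 * h)
gap_moment = abs(forward(t_times_u, f0) - dU_df / (-2j * math.pi))
print(gap_derivative, gap_moment)
```

The second difference is limited by the central-difference error in `dU_df`, so it is expected to be small but not at rounding level.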

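For the integral of any complex function $c(t)$, the inequality

$$
\left|\int_a^b c(t)\,dt\right| \le \int_a^b \left|c(t)\right|dt \tag{2.35g}
$$
must hold true for any two real values of $a$ and $b$ where $a \le b$. When $u(t)$ is real, so is its $n$th
derivative, and we can write

$$
\left|\int_{-\infty}^{\infty}\frac{d^n u}{dt^n}\,e^{-2\pi ift}\,dt\right|
\le \int_{-\infty}^{\infty}\left|\frac{d^n u}{dt^n}\,e^{-2\pi ift}\right|dt
= \int_{-\infty}^{\infty}\left|\frac{d^n u}{dt^n}\right|\left|e^{-2\pi ift}\right|dt,
$$

which reduces to, since $\left|e^{-2\pi ift}\right| = 1$,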

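$$
\left|\int_{-\infty}^{\infty}\frac{d^n u}{dt^n}\,e^{-2\pi ift}\,dt\right| \le \int_{-\infty}^{\infty}\left|\frac{d^n u}{dt^n}\right|dt. \tag{2.35h}
$$

Because we are supposing the Fourier transform of $d^n u/dt^n$ to exist, the existence requirement
in Eq. (2.13a) shows that

$$
\int_{-\infty}^{\infty}\left|\frac{d^n u}{dt^n}\right|dt
$$

is finite. Hence, inequality (2.35h) requires

$$
\left|\int_{-\infty}^{\infty}\frac{d^n u}{dt^n}\,e^{-2\pi ift}\,dt\right|
$$

also to be finite, which means that we can assume that it is less than or equal to some finite real
and non-negative number $B$ for all values of $f$:

$$
\left|\int_{-\infty}^{\infty}\frac{d^n u}{dt^n}\,e^{-2\pi ift}\,dt\right| \le B. \tag{2.35i}
$$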

Formula (2.35e) states that

$$
\int_{-\infty}^{\infty}\frac{d^n u}{dt^n}\,e^{-2\pi ift}\,dt = (2\pi i)^n f^n U(f), \tag{2.35j}
$$
where

$$
U(f) = \int_{-\infty}^{\infty} u(t)\,e^{-2\pi ift}\,dt
$$

is, of course, the Fourier transform of $u(t)$. Taking the magnitude of the complex values of both
sides of (2.35j) and remembering that $\left|i^n\right| = 1$ shows that

$$
\left|\int_{-\infty}^{\infty}\frac{d^n u}{dt^n}\,e^{-2\pi ift}\,dt\right| = (2\pi)^n \left|f\right|^n \left|U(f)\right|,
$$

which becomes, applying inequality (2.35i),
$$
B \ge (2\pi)^n \left|f\right|^n \left|U(f)\right|
$$
or
$$
\left|U(f)\right| \le \frac{B}{(2\pi)^n}\left|f\right|^{-n}. \tag{2.35k}
$$

Hence, when the Fourier transform of the $n$th derivative of $u(t)$ exists, we know that the
magnitude $|U(f)|$ of the Fourier transform of $u$ decreases at least as fast as $|f|^{-n}$ for large
values of $|f|$.
We next examine a set of identities often called the Fourier shift theorem. When $U(f)$ is the
forward Fourier transform of $u(t)$,

$$
U(f) = \int_{-\infty}^{\infty} u(t)\,e^{-2\pi ift}\,dt,
$$

and $u(t)$ is shifted to the right by an amount $a$,

$$
u(t) \to u(t-a),
$$

then the forward Fourier transform of $u(t-a)$ is, changing the variable of integration to
$t' = t - a$,

$$
\begin{aligned}
\int_{-\infty}^{\infty} u(t-a)\,e^{-2\pi ift}\,dt &= \int_{-\infty}^{\infty} u(t')\,e^{-2\pi if(t'+a)}\,dt' \\
&= e^{-2\pi ifa}\int_{-\infty}^{\infty} u(t')\,e^{-2\pi ift'}\,dt' = e^{-2\pi ifa}\,U(f).
\end{aligned}
$$

Hence the forward Fourier transform of $u(t-a)$ is $e^{-2\pi ifa}U(f)$ when the forward Fourier
transform of $u(t)$ is $U(f)$, which we can write as

If $u(t) \leftrightarrow U(f)$ then $u(t-a) \leftrightarrow e^{-2\pi ifa}\,U(f)$. (2.36a)

In terms of the Fourier $\mathcal{F}$ operator, we have

$$
\mathcal{F}^{(-ift)}\big(u(t-a)\big) = e^{-2\pi ifa}\,\mathcal{F}^{(-ift)}\big(u(t)\big). \tag{2.36b}
$$

Working with the reverse Fourier transform of $U(f - f_0)$ and changing the variable of
integration to $f' = f - f_0$, we see that

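$$
\int_{-\infty}^{\infty} U(f-f_0)\,e^{2\pi ift}\,df = e^{2\pi if_0 t}\int_{-\infty}^{\infty} U(f')\,e^{2\pi if't}\,df' = e^{2\pi if_0 t}\,u(t) \tag{2.36c}
$$
or
$$
e^{2\pi if_0 t}\,u(t) \leftrightarrow U(f-f_0). \tag{2.36d}
$$

The $\mathcal{F}$ operator lets us write this result as

$$
\mathcal{F}^{(itf)}\big(U(f-f_0)\big) = e^{2\pi if_0 t}\,\mathcal{F}^{(itf)}\big(U(f)\big) \tag{2.36e}
$$
or
$$
\mathcal{F}^{(-ift)}\big(e^{2\pi if_0 t}\,u(t)\big) = U(f-f_0) = \mathcal{F}^{(-i(f-f_0)t)}\big(u(t)\big). \tag{2.36f}
$$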

Equations (2.36d)–(2.36f) show that multiplying $u(t)$ by $e^{2\pi if_0 t}$ shifts $U(f)$, the forward Fourier
transform of $u(t)$, to the right by a frequency $f_0$. By interchanging the roles of $t$ and $f$ and
replacing $U$ by $u$ and $f_0$ by $a$ in (2.36e) and comparing the result to (2.36b), we see the two
equations can be combined into one formula:

$$
\mathcal{F}^{(\pm ift)}\big(u(t-a)\big) = e^{\pm 2\pi ifa}\,\mathcal{F}^{(\pm ift)}\big(u(t)\big). \tag{2.36g}
$$

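This last result can also be written as, defining a new constant $b = -a$,

$$
\int_{-\infty}^{\infty} u(t+b)\,e^{\pm 2\pi ift}\,dt = e^{\mp 2\pi ifb}\int_{-\infty}^{\infty} u(t)\,e^{\pm 2\pi ift}\,dt \tag{2.36h}
$$
or
$$
\mathcal{F}^{(\pm ift)}\big(u(t+b)\big) = e^{\mp 2\pi ifb}\,\mathcal{F}^{(\pm ift)}\big(u(t)\big). \tag{2.36i}
$$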
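Both halves of the shift theorem can be illustrated numerically. The following is a sketch under stated assumptions: the test function `u`, the delay `a`, the frequency offset `f_shift`, and the Riemann-sum helper `forward` are hypothetical choices, and the forward transform is taken with the minus sign in the exponent.

```python
import cmath
import math

STEP = 0.01
N = 800  # samples cover [-8, 8]

def forward(func, f):
    """Riemann-sum estimate of the integral of func(t) e^{-2 pi i f t} dt."""
    return sum(func(k * STEP) * cmath.exp(-2j * math.pi * f * k * STEP)
               for k in range(-N, N + 1)) * STEP

def u(t):
    return (1.0 + t) * math.exp(-math.pi * t * t)

a = 0.75        # shift of u to the right (hypothetical value)
f_shift = 1.5   # shift of U to the right (hypothetical value)
f0 = 0.4        # frequency at which the identities are compared

def u_delayed(t):
    return u(t - a)

def u_modulated(t):
    return cmath.exp(2j * math.pi * f_shift * t) * u(t)

# u(t - a)  <->  e^{-2 pi i f a} U(f)
gap_delay = abs(forward(u_delayed, f0)
                - cmath.exp(-2j * math.pi * f0 * a) * forward(u, f0))
# e^{2 pi i f_shift t} u(t)  <->  U(f - f_shift)
gap_modulation = abs(forward(u_modulated, f0) - forward(u, f0 - f_shift))
print(gap_delay, gap_modulation)
```

A delay changes only the phase of the transform, while multiplying by a complex exponential moves the transform along the frequency axis.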

The next set of identities is sometimes called the Fourier scaling theorem. If $U(f)$ is the
forward Fourier transform of $u(t)$ and the argument of $u$ is scaled by the real, nonzero constant $a$,

$$
u(t) \to u(at),
$$

then the forward Fourier transform of $u(at)$ is, letting $t' = at$,

$$
\int_{-\infty}^{\infty} u(at)\,e^{-2\pi ift}\,dt = \frac{1}{|a|}\int_{-\infty}^{\infty} u(t')\,e^{-2\pi i\frac{f}{a}t'}\,dt' = \frac{1}{|a|}\,U\!\left(\frac{f}{a}\right).
$$
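This can be written as

$$
u(at) \leftrightarrow \frac{1}{|a|}\,U\!\left(\frac{f}{a}\right) \tag{2.37a}
$$
or
$$
\mathcal{F}^{(-ift)}\big(u(at)\big) = \frac{1}{|a|}\,\mathcal{F}^{(-i(f/a)t)}\big(u(t)\big). \tag{2.37b}
$$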

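We also have, scaling the frequency by a positive constant $a$ and letting $f' = af$, that

$$
\int_{-\infty}^{\infty} U(af)\,e^{2\pi ift}\,df = \frac{1}{a}\int_{-\infty}^{\infty} U(f')\,e^{2\pi i\frac{f't}{a}}\,df' = \frac{1}{a}\,u\!\left(\frac{t}{a}\right).
$$

This can be written as

$$
\frac{1}{a}\,u\!\left(\frac{t}{a}\right) \leftrightarrow U(af) \quad\text{for } a > 0 \tag{2.37c}
$$
or
$$
\mathcal{F}^{(itf)}\big(U(af)\big) = \frac{1}{a}\,\mathcal{F}^{(i(t/a)f)}\big(U(f)\big) \quad\text{for } a > 0. \tag{2.37d}
$$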

Equation (2.37b) and (after interchanging the roles of $f$ and $t$) Eq. (2.37d) can be combined into
the single formula,
$$
\mathcal{F}^{(\pm ift)}\big(u(at)\big) = \frac{1}{a}\,\mathcal{F}^{(\pm i(f/a)t)}\big(u(t)\big) \quad\text{for } a > 0. \tag{2.37e}
$$
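The scaling relation between $u(at)$ and $U(f/a)$ can be spot-checked for one compressing and one stretching (and sign-reversing) scale factor. This is only a sketch: the test function, the two values of `a`, the grid, and the helper `forward` are assumptions for illustration, and the factor $1/|a|$ is used so that a negative `a` can be included.

```python
import cmath
import math

STEP = 0.01
N = 1500  # samples cover [-15, 15], wide enough for the stretched function

def forward(func, f):
    """Riemann-sum estimate of the integral of func(t) e^{-2 pi i f t} dt."""
    return sum(func(k * STEP) * cmath.exp(-2j * math.pi * f * k * STEP)
               for k in range(-N, N + 1)) * STEP

def u(t):
    return (1.0 + t) * math.exp(-math.pi * t * t)

def scaled(a):
    # Returns the function t -> u(a t) for a fixed scale factor a.
    return lambda t: u(a * t)

f0 = 0.4
gaps = {}
for a in (2.5, -0.5):   # hypothetical scale factors
    lhs = forward(scaled(a), f0)          # transform of u(a t)
    rhs = forward(u, f0 / a) / abs(a)     # (1/|a|) U(f/a)
    gaps[a] = abs(lhs - rhs)
print(gaps)
```

With `a = 2.5` the function is compressed in $t$ and its transform spreads out in $f$; with `a = -0.5` the function is stretched and reversed, and its transform narrows.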

Because $u(t)$ must satisfy requirements (V) through (VIII) in Sec. 2.4 for these results to be
true, and in particular it must satisfy requirement (V) that it be absolutely integrable, there may
well be only a finite region of $t$ over which $u(t)$ is significantly different from zero. When
$0 < a < 1$ so that the range of $t$ over which $u$ is significantly different from zero expands, formula
(2.37a) shows that the region of $f$ over which $U(f)$ is significantly different from zero shrinks;
and, of course, when $a > 1$, just the opposite occurs. For $0 < a < 1$, function $u(at)$ more closely
resembles $\sin(2\pi ft)$ and $\cos(2\pi ft)$ for smaller values of $f$, explaining why the region of $f$ for
which $U$ is significantly different from zero shrinks; and when $a > 1$, function $u(at)$ more closely
resembles $\sin(2\pi ft)$ and $\cos(2\pi ft)$ for larger values of $f$, explaining why the region of $f$ for
which $U$ is significantly different from zero expands. We also note that if $f \approx 1/(2\pi)$, so that
$\sin(2\pi ft) \approx \sin(t)$ and $\cos(2\pi ft) \approx \cos(t)$, then the sine and cosine can change significantly in
value only when $t$ changes by at least

$$
\Delta t_{\min} \approx O(1).
$$

Suppose $t$ must also change by at least $\Delta t_{\min} \approx O(1)$ for a significant change in $u(t)$ to occur,
which means that $\sin(2\pi ft) \approx \sin(t)$ and $\cos(2\pi ft) \approx \cos(t)$ vary about as fast with respect to $t$ as
$u$ does; that is, $\sin(t)$ and $\cos(t)$ resemble $u$ somewhat. Recalling the heuristic reasoning used
in Sec. 2.1 to introduce and justify the sine and cosine integrals, we now expect $U(f)$ to be
significantly different from zero when $f \approx 1/(2\pi)$. Suppose next that $t$ changes by less than
$\Delta t_{\min} \approx O(1)$ so that $u$ does not change significantly in value, remaining almost constant. Now
when $f$ becomes significantly larger than $1/(2\pi)$, functions $\sin(2\pi ft)$ and $\cos(2\pi ft)$ oscillate
ever more rapidly so that they change significantly in value for changes in $t$ that are ever smaller
than $\Delta t_{\min}$. For these larger values of $f$, the sine and cosine do not much resemble $u(t)$, forcing
the Fourier transform $U(f)$ to be negligible or zero for $f > O(1/(2\pi))$. We can modify the
original function $u$ by creating a new function $u_\varepsilon(t) = u(t/\varepsilon)$ for $\varepsilon > 0$. Now $t$ must change by at
least an $O(\varepsilon)$ amount for $u_\varepsilon$ to change significantly; and when $t$ changes by less than $O(\varepsilon)$,
function $u_\varepsilon$ does not change significantly in value. We know from (2.37a) with $a = 1/\varepsilon$ that the
forward Fourier transform of $u_\varepsilon$ is $U_\varepsilon(f) = \varepsilon\,U(\varepsilon f)$. Hence, when $f$ is larger than
$O(1/(2\pi\varepsilon))$, it must be true that $U_\varepsilon(f)$ is negligible or zero, since this is the same as having
$\varepsilon f > O(1/(2\pi))$ in $U(\varepsilon f)$. Because $2\pi$ is often regarded as an $O(1)$ quantity, this result can also be
interpreted as showing that $U_\varepsilon(f)$ must be negligible or zero for $f > O(1/\varepsilon)$. Since the original
Fourier transform pair

$$
u(t) \leftrightarrow U(f)
$$

is left unspecified, $u_\varepsilon$ in fact represents any function $v(t)$ where $t$ must change by at least an
$O(\varepsilon)$ amount for a significant change in $v$ to occur. Consequently, we can conclude if $t$ must
change by at least an $O(\varepsilon)$ amount for $v(t)$ to change significantly, then the forward Fourier
transform of $v(t)$ must be negligible or zero for $f > O(1/\varepsilon)$. The arguments leading to this
conclusion work just as well when we consider the inverse Fourier transform in Eqs. (2.37c) and
(2.37e). Therefore, this more general result is also true: if $v(t)$ is a function such that $t$ must
change by at least an $O(\varepsilon)$ amount for a significant change in $v$ to occur, then the forward or
inverse Fourier transform,

$$
V(f) = \int_{-\infty}^{\infty} v(t)\,e^{\mp 2\pi ift}\,dt,
$$
is negligible or zero for $f > O(1/\varepsilon)$.
2.9 Fourier Convolution Theorem
It is hard to overstate the importance of the Fourier convolution theorem; it plays a fundamental
role in linear signal theory and structures the thinking of many different engineering
disciplines: signal processing, electrical engineering, image analysis, and servomechanism
design, to name but a few.
We define the convolution of two functions $u(t)$ and $v(t)$ to be

$$
u(t) * v(t) = \int_{-\infty}^{\infty} u(t')\,v(t-t')\,dt'. \tag{2.38a}
$$

Here, $u$ and $v$ may be complex functions but their argument $t$ is assumed to be real. The
convolution is commutative and associative. It is commutative because making the substitution
$t'' = t - t'$ gives

$$
u(t) * v(t) = \int_{-\infty}^{\infty} u(t')\,v(t-t')\,dt' = \int_{-\infty}^{\infty} u(t-t'')\,v(t'')\,dt'' = \int_{-\infty}^{\infty} v(t')\,u(t-t')\,dt',
$$

showing that
$$
u(t) * v(t) = v(t) * u(t). \tag{2.38b}
$$

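The convolution is associative because for three complex functions $u(t)$, $v(t)$, and $h(t)$ with real
argument $t$ we can write, changing the variable of integration to $t''' = t' - t''$,

$$
\begin{aligned}
\left[u(t) * v(t)\right] * h(t)
&= \int_{-\infty}^{\infty} dt'\,h(t-t')\int_{-\infty}^{\infty} dt''\,u(t'')\,v(t'-t'')
= \int_{-\infty}^{\infty} dt''\,u(t'')\int_{-\infty}^{\infty} dt'\,h(t-t')\,v(t'-t'') \\
&= \int_{-\infty}^{\infty} dt''\,u(t'')\int_{-\infty}^{\infty} dt'''\,v(t''')\,h(t-t''-t''')
= u(t) * \left[v(t) * h(t)\right].
\end{aligned}
$$

Hence,

$$
\left[u(t) * v(t)\right] * h(t) = u(t) * \left[v(t) * h(t)\right]. \tag{2.38c}
$$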

The convolution is a linear operation, because for any two complex constants $\alpha$ and $\beta$,

$$
\begin{aligned}
h(t) * \left(\alpha u(t) + \beta v(t)\right) &= \int_{-\infty}^{\infty} h(t')\left[\alpha u(t-t') + \beta v(t-t')\right]dt' \\
&= \alpha\int_{-\infty}^{\infty} h(t')\,u(t-t')\,dt' + \beta\int_{-\infty}^{\infty} h(t')\,v(t-t')\,dt',
\end{aligned}
$$

showing that

$$
h(t) * \left(\alpha u(t) + \beta v(t)\right) = \alpha\,h(t) * u(t) + \beta\,h(t) * v(t). \tag{2.38d}
$$

Because the convolution is commutative, the equation can also be written as

$$
\left(\alpha u(t) + \beta v(t)\right) * h(t) = \alpha\,u(t) * h(t) + \beta\,v(t) * h(t). \tag{2.38e}
$$

This shows that the convolution is linear on both the left-hand and right-hand sides of the $*$.
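The convolution of two even functions or two odd functions is an even function. If $u(t)$ and
$v(t)$ are both even or both odd, then we have, using $t'' = -t'$,

$$
\begin{aligned}
\left[u(t) * v(t)\right]_{t\to -t} &= \int_{-\infty}^{\infty} u(t')\,v(-t-t')\,dt' = \int_{-\infty}^{\infty} u(-t'')\,v(-t+t'')\,dt'' \\
&= \int_{-\infty}^{\infty} u(t'')\,v(t-t'')\,dt'' = u(t) * v(t).
\end{aligned} \tag{2.38f}
$$

When $u$ is even and $v$ is odd, or $u$ is odd and $v$ is even, then we have

$$
\begin{aligned}
\left[u(t) * v(t)\right]_{t\to -t} &= \int_{-\infty}^{\infty} u(t')\,v(-t-t')\,dt' = \int_{-\infty}^{\infty} u(-t'')\,v(-t+t'')\,dt'' \\
&= -\int_{-\infty}^{\infty} u(t'')\,v(t-t'')\,dt'' = -\,u(t) * v(t).
\end{aligned} \tag{2.38g}
$$

Hence, the convolution of an even and an odd function is always odd.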
If $u$ and $v$ have more than one argument so that they are written $u(y, x_1, x_2, \ldots)$ and
$v(y, x_1', x_2', \ldots)$, then we adopt the convention that the convolution

$$
u(y, x_1, x_2, \ldots) * v(y, x_1', x_2', \ldots)
$$

is over variable $y$ rather than variables $x_1, x_1', x_2, x_2', \ldots$,

$$
u(y, x_1, x_2, \ldots) * v(y, x_1', x_2', \ldots) = \int_{-\infty}^{\infty} u(y', x_1, x_2, \ldots)\,v(y-y', x_1', x_2', \ldots)\,dy',
$$

because $y$ is the only argument repeated on both sides of the $*$.
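To derive the Fourier convolution theorem, we take the forward or inverse transform of
$u(t) * v(t)$ to get

$$
\begin{aligned}
\mathcal{F}^{(\pm ift)}\big(u(t) * v(t)\big) &= \int_{-\infty}^{\infty} e^{\pm 2\pi ift}\left[u(t) * v(t)\right]dt = \int_{-\infty}^{\infty} dt\,e^{\pm 2\pi ift}\int_{-\infty}^{\infty} dt'\,u(t')\,v(t-t') \\
&= \int_{-\infty}^{\infty} dt'\,u(t')\int_{-\infty}^{\infty} dt\,e^{\pm 2\pi ift}\,v(t-t').
\end{aligned}
$$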

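Changing the variable of integration in the inner integral to $t'' = t - t'$ gives

$$
\begin{aligned}
\mathcal{F}^{(\pm ift)}\big(u(t) * v(t)\big) &= \int_{-\infty}^{\infty} dt'\,u(t')\,e^{\pm 2\pi ift'}\int_{-\infty}^{\infty} dt''\,e^{\pm 2\pi ift''}\,v(t'') \\
&= \left[\int_{-\infty}^{\infty} dt\,u(t)\,e^{\pm 2\pi ift}\right]\cdot\left[\int_{-\infty}^{\infty} dt\,e^{\pm 2\pi ift}\,v(t)\right]
\end{aligned}
$$
or
$$
\mathcal{F}^{(\pm ift)}\big(u(t) * v(t)\big) = \mathcal{F}^{(\pm ift)}\big(u(t)\big)\cdot\mathcal{F}^{(\pm ift)}\big(v(t)\big). \tag{2.39a}
$$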

If $U(f)$ and $V(f)$ are the forward Fourier transforms of $u(t)$ and $v(t)$ respectively, we can choose
the minus sign of (2.39a) to get

$$
\int_{-\infty}^{\infty} e^{-2\pi ift}\left[u(t) * v(t)\right]dt = U(f)\cdot V(f), \tag{2.39b}
$$
which shows that
$$
u(t) * v(t) \leftrightarrow U(f)\cdot V(f). \tag{2.39c}
$$

Equation (2.28ℓ) can be written as, for any function $g(t)$ after interchanging the roles of $t$ and $t'$,
$$
\mathcal{F}^{(\pm it'f)}\Big(\mathcal{F}^{(\mp ift)}\big(g(t)\big)\Big) = g(t'). \tag{2.39d}
$$

We replace $\mathcal{F}^{(\pm ift)}$ by $\mathcal{F}^{(\mp ift)}$ throughout Eq. (2.39a), which is just a change in the order
in which the two possible signs of the exponent are listed, and then take $\mathcal{F}^{(\pm it'f)}$ of both sides to
get that, applying (2.39d) with $g(t) = u(t) * v(t)$,

$$
u(t') * v(t') = \mathcal{F}^{(\pm it'f)}\Big(\mathcal{F}^{(\mp ift)}\big(u(t)\big)\cdot\mathcal{F}^{(\mp ift)}\big(v(t)\big)\Big). \tag{2.39e}
$$

Because $u(t)$ and $v(t)$ represent arbitrary, Fourier-transformable functions of $t$,
$\mathcal{F}^{(\mp ift)}(u(t))$ and $\mathcal{F}^{(\mp ift)}(v(t))$ must be arbitrary, Fourier-transformable functions of $f$,
which we can call $U^{(\mp)}$ and $V^{(\mp)}$ respectively,
$$
U^{(\mp)}(f) = \mathcal{F}^{(\mp ift)}\big(u(t)\big) \tag{2.39f}
$$
and
$$
V^{(\mp)}(f) = \mathcal{F}^{(\mp ift)}\big(v(t)\big). \tag{2.39g}
$$

Applying this notation to (2.39d), first with $g(t) = u(t)$ and then with $g(t) = v(t)$, we see that

$$
\mathcal{F}^{(\pm it'f)}\big(U^{(\mp)}(f)\big) = u(t') \tag{2.39h}
$$
and
$$
\mathcal{F}^{(\pm it'f)}\big(V^{(\mp)}(f)\big) = v(t'). \tag{2.39i}
$$

Hence Eq. (2.39e) can be written as

$$
\mathcal{F}^{(\pm it'f)}\big(U^{(\mp)}(f)\big) * \mathcal{F}^{(\pm it'f)}\big(V^{(\mp)}(f)\big) = \mathcal{F}^{(\pm it'f)}\big(U^{(\mp)}(f)\cdot V^{(\mp)}(f)\big),
$$

where the convolution is over $t'$ because it is the only argument repeated on both sides of the $*$.
Since $U^{(\mp)}$ and $V^{(\mp)}$ are arbitrary, transformable functions, we can replace them by the arbitrary
transformable functions $u$ and $v$ to get, after interchanging the roles of $f$ and $t'$,

$$
\mathcal{F}^{(\pm ift')}\big(u(t')\big) * \mathcal{F}^{(\pm ift')}\big(v(t')\big) = \mathcal{F}^{(\pm ift')}\big(u(t')\cdot v(t')\big).
$$

This can be simplified by dropping a prime from each of the $t'$s:

$$
\mathcal{F}^{(\pm ift)}\big(u(t)\big) * \mathcal{F}^{(\pm ift)}\big(v(t)\big) = \mathcal{F}^{(\pm ift)}\big(u(t)\cdot v(t)\big). \tag{2.39j}
$$
If $U(f)$ and $V(f)$ are the forward Fourier transforms of $u(t)$ and $v(t)$ respectively, we can choose
the minus sign of (2.39j) to get

$$
\int_{-\infty}^{\infty} e^{-2\pi ift}\left[u(t)\cdot v(t)\right]dt = U(f) * V(f) \tag{2.39k}
$$
or
$$
u(t)\cdot v(t) \leftrightarrow U(f) * V(f). \tag{2.39$\ell$}
$$

Equation (2.39b) shows that the forward Fourier transform of the convolution of two functions
is the product of the forward Fourier transform of each function, and (2.39k) shows that the
forward Fourier transform of the product of two functions is the convolution of the forward
Fourier transform of each function. Equations (2.39a) and (2.39j) show that everything we just
said about the forward Fourier transform still holds true when we take the reverse Fourier
transform of the product of two functions or of the convolution of two functions.
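Both directions of the convolution theorem can be checked numerically for a pair of smooth, rapidly decaying functions. The sketch below is illustrative only: the functions `u` and `v`, the grid, and the helpers `forward` and `convolve` are hypothetical, and every integral is replaced by a Riemann sum on a finite interval.

```python
import cmath
import math

STEP = 0.05
GRID = [k * STEP for k in range(-100, 101)]  # samples covering [-5, 5]

def u(t):
    return math.exp(-math.pi * t * t)

def v(t):
    return (1.0 + t) * math.exp(-2.0 * math.pi * t * t)

def forward(func, f):
    """Riemann-sum estimate of the integral of func(t) e^{-2 pi i f t} dt."""
    return sum(func(t) * cmath.exp(-2j * math.pi * f * t) for t in GRID) * STEP

def convolve(a, b, x):
    """Riemann-sum estimate of the integral of a(x') b(x - x') dx'."""
    return sum(a(xp) * b(x - xp) for xp in GRID) * STEP

f0 = 0.4

# Transform of a convolution equals the product of the transforms.
conv_gap = abs(forward(lambda t: convolve(u, v, t), f0)
               - forward(u, f0) * forward(v, f0))

# Transform of a product equals the convolution of the transforms.
U = lambda f: forward(u, f)
V = lambda f: forward(v, f)
prod_gap = abs(forward(lambda t: u(t) * v(t), f0) - convolve(U, V, f0))
print(conv_gap, prod_gap)
```

The same helper `convolve` is used once over $t$ and once over $f$, which reflects the symmetry between the two statements of the theorem.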
When using the Fourier convolution theorem, we usually regard one of the two convolved
functions as representing the undisturbed signal (that is, the true set of values for what is to be
measured) and the other, usually much more narrow, function as specifying the blurring or
smearing effect of an imperfect measurement. The blurring or smearing function has different
names in different engineering disciplines; optical engineers often call it the instrument-response
or instrument line-shape function. In Fig. 2.5(a), function $u$ is taken to be the true signal, and in
Fig. 2.5(b) function $v$ is the instrument-response or instrument line-shape function. The
convolution
$$
u(t) * v(t) = \int_{-\infty}^{\infty} u(t')\,v(t-t')\,dt' = u_{\text{blur}}(t)
$$
defines the new function $u_{\text{blur}}(t)$ as shown in Figs. 2.5(c)–2.5(e). The function $v$ is flipped left to
right and slid along the $t'$ axis in Fig. 2.5(c) by changing the value of $t$. Figure 2.5(d) is a close-up
of $v$ at a specific value of $t$, with the shaded region being the area under the product
$u(t')\,v(t-t')$. Since $u(t')\,v(t-t')$ is zero where $v(t-t')$ is zero, the area of the shaded region can
be found by integrating $u(t')\,v(t-t')$ over $t'$ between $-\infty$ and $+\infty$. This is, of course, just the
convolution of $u$ and $v$ for this particular value of $t$, which means the area of the shaded region
must be $u_{\text{blur}}(t)$ for this value of $t$. Figure 2.5(e) represents the complete $u_{\text{blur}}(t)$ function for all
values of $t$; clearly $u_{\text{blur}}$ has less detail than the original signal $u$.
The v(t) function in Fig. 2.5(b) is an unusual type of instrument response because it is not an even function of t. Figure 2.5(f) shows a typical even instrument response $v_e(t)$. When the instrument-response function is $v_e$, the blurred signal is

$$u_{e,\mathrm{blur}}(t) = u(t) * v_e(t). \tag{2.40a}$$

The instrument-response function is even, so $v_e(-t) = v_e(t)$ and we can write

$$u_{e,\mathrm{blur}}(t) = \int_{-\infty}^{\infty} u(t')\,v_e(t-t')\,dt' = \int_{-\infty}^{\infty} u(t')\,v_e(t'-t)\,dt' \tag{2.40b}$$

with the last integral in (2.40b) making it perhaps more obvious that $u_{e,\mathrm{blur}}$ is a localized and weighted average of u centered on t. Instrument-response or line-shape functions are usually designed to be even because an even instrument-response function does not shift the center point of isolated peaks in the true data u.
As described in the first chapter, when using Michelson interferometers, we do not much care about the exact shape of the optical intensity signal u but are instead interested in the shape of its transform,

$$U(f) = \mathcal{F}^{(-ift)}\big(u(t)\big). \tag{2.40c}$$

In many types of interferometers, u is a signal of time t, which means U can be analyzed as a function of f, the signal frequency. The electrical circuits transmitting and recording the signal u can never do a perfect job (they always blur and smooth the original signal to some extent), so what we end up with is not $u(t)$ and $U(f)$ but rather $u_{e,\mathrm{blur}}(t)$ and the associated Fourier transform

$$U_{e,\mathrm{blur}}(f) = \mathcal{F}^{(-ift)}\big(u_{e,\mathrm{blur}}(t)\big). \tag{2.40d}$$

The relationship between $U_{e,\mathrm{blur}}$ and U must be understood to design the electrical circuits properly. Here is an important example of how to use the Fourier convolution theorem. Substitution of (2.40a) into (2.40d) gives

$$U_{e,\mathrm{blur}}(f) = \mathcal{F}^{(-ift)}\big(u(t) * v_e(t)\big).$$

Using the Fourier convolution theorem as presented in Eq. (2.39a), this is rewritten as

$$U_{e,\mathrm{blur}}(f) = \mathcal{F}^{(-ift)}\big(u(t)\big)\cdot \mathcal{F}^{(-ift)}\big(v_e(t)\big)$$

or

$$U_{e,\mathrm{blur}}(f) = U(f)\cdot V_e(f), \tag{2.40e}$$

where $U(f)$ comes from (2.40c) and we define

$$V_e(f) = \mathcal{F}^{(-ift)}\big(v_e(t)\big).$$

FIGURE 2.5(a)-(f). Panels showing $u(t)$, $v(t)$, the flipped function $v(t-t')$ sliding across $u(t')$ with the product $u(t')\,v(t-t')$ shaded at one value of t, $u_{\mathrm{blur}}(t)$, and the even response $v_e(t)$.

Equation (2.40e) is a very reassuring result, stating that as long as $V_e(f)$ is known and not zero, we can recover the Fourier transform of the true signal $U(f)$ from $U_{e,\mathrm{blur}}(f)$ by calculating

$$U(f) = \frac{U_{e,\mathrm{blur}}(f)}{V_e(f)}. \tag{2.40f}$$

To design the circuits of a Michelson interferometer, we find the frequencies for which $U(f)$ must be known and arrange for $V_e$ to be as constant as possible, and definitely not zero, over these frequencies. It turns out that preserving certain signal frequencies while neglecting others is a standard problem in electrical circuit design, and it is usually easy to arrange for this to occur. There is, in fact, a whole branch of electrical engineering called filter theory that describes exactly how to design circuits where $V_e$ is zero or very small at some frequencies while being large and quasi-constant at others.
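
The relationship in (2.40e) and the recovery step in (2.40f) can be tried out numerically. The sketch below is only a discrete stand-in for the continuous theory: it assumes NumPy is available, uses the FFT with circular convolution in place of the infinite integrals, and the signal shape, the Gaussian response width, and the threshold `1e-3` are all illustrative choices that do not come from the text.

```python
import numpy as np

# Hypothetical sampled "true" signal u: two Gaussian peaks (illustrative only).
n = 256
t = np.arange(n)
u = np.exp(-0.5 * ((t - 100) / 3.0) ** 2) + 0.5 * np.exp(-0.5 * ((t - 140) / 2.0) ** 2)

# Even instrument response v_e: a unit-area Gaussian, symmetric about index 0
# in the circular (wrap-around) sense.
d = np.minimum(t, n - t)              # circular distance from index 0
v_e = np.exp(-0.5 * (d / 1.5) ** 2)
v_e /= v_e.sum()

# Blur by direct circular convolution: u_blur[k] = sum_j u[j] * v_e[(k - j) mod n].
u_blur = np.array([np.sum(u * v_e[(k - t) % n]) for k in range(n)])

U = np.fft.fft(u)
V_e = np.fft.fft(v_e)
U_blur = np.fft.fft(u_blur)

# Discrete analogue of (2.40e): the transform of the convolution is the product.
product_ok = np.allclose(U_blur, U * V_e)

# Discrete analogue of (2.40f): divide only where V_e is safely nonzero.
mask = np.abs(V_e) > 1e-3
U_rec = np.where(mask, U_blur / np.where(mask, V_e, 1.0), 0.0)
recover_ok = np.allclose(U_rec[mask], U[mask])

print(product_ok, recover_ok, int(mask.sum()), "of", n, "frequencies recovered")
```

The mask plays the role of the design requirement stated above: the division is trusted only over the frequencies where the response transform stays well away from zero.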
2.10 Fourier Transforms and Divergent Integrals
Fourier-transform theory has a history of treating with extreme kindness engineers and scientists
who blindly use its formalism without worrying about whether their manipulations make
mathematical sense. The rule of thumb seems to be that if the final result is mathematically
sound (such as a finite integral or the transform of an obviously transformable function), it
almost never matters whether intermediate steps involve the transforms of functions that
obviously cannot be transformed or even, strictly speaking, are not true functions at all. Any
reasonably comprehensive table of Fourier transforms contains functions that not only violate
requirements (V) through (VIII) in Sec. 2.4 but also have transform integrals that, according to
the standard definition of integration, either diverge or have no well-defined value. This book
shows that these puzzling entries are the modest but ubiquitous legacy of mathematicians who
have extended the meaning of what is meant by an integral and what is meant by a function in
Fourier-transform theory. Their work has not only benefited many scientists and engineers who
no longer have to apologize for the way they solve Fourier-transform problems but has also
helped their students who no longer need to accept without good explanations divergent integrals
and the transforms of poorly defined functions.
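The standard definition of an improper integral

$$\int_{-\infty}^{\infty} u(t)\,dt$$

for the function u(t) is that

$$\int_{-\infty}^{\infty} u(t)\,dt = \lim_{\substack{T_1\to\infty \\ T_2\to\infty}} \int_{-T_1}^{T_2} u(t)\,dt.$$

If there is any singular point $t_s$ where $\lim_{t\to t_s} u(t) = \pm\infty$, the definition becomes

$$\int_{-\infty}^{\infty} u(t)\,dt = \lim_{\substack{T_1,\,T_2\to\infty \\ \varepsilon_1\to 0,\ \varepsilon_2\to 0}} \left[\int_{-T_1}^{t_s-\varepsilon_1} u(t)\,dt + \int_{t_s+\varepsilon_2}^{T_2} u(t)\,dt\right]. \tag{2.41a}$$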
In this definition, the limits as $T_1\to\infty$, $T_2\to\infty$, $\varepsilon_1\to 0$, and $\varepsilon_2\to 0$ occur independently; no matter how $T_1$, $T_2$, $\varepsilon_1$, and $\varepsilon_2$ approach their limits, the same answer is expected if the integral exists. We now decide, in the interest of expanding Fourier-transform theory, to change this standard definition of improper integral by connecting $\varepsilon_1$ to $\varepsilon_2$ and $T_1$ to $T_2$ as we take the limit,

$$\int_{-\infty}^{\infty} u(t)\,dt = \lim_{\substack{T\to\infty \\ \varepsilon\to 0}} \left[\int_{-T}^{t_s-\varepsilon} u(t)\,dt + \int_{t_s+\varepsilon}^{T} u(t)\,dt\right]. \tag{2.41b}$$

The limiting process in definition (2.41b) is said to give the Cauchy principal value of the integral, sometimes written as

$$\mathrm{PV}\!\int_{-\infty}^{\infty} u(t)\,dt \quad\text{or}\quad \fint_{-\infty}^{\infty} u(t)\,dt.$$

If u(t) has multiple singular points, the definition is expanded in the obvious way. For example, with two singular points at $t_{1s}$ and $t_{2s}$ with $t_{1s} < t_{2s}$, we have

$$\mathrm{PV}\!\int_{-\infty}^{\infty} u(t)\,dt = \lim_{\substack{T\to\infty \\ \varepsilon_1\to 0 \\ \varepsilon_2\to 0}} \left[\int_{-T}^{t_{1s}-\varepsilon_1} u(t)\,dt + \int_{t_{1s}+\varepsilon_1}^{t_{2s}-\varepsilon_2} u(t)\,dt + \int_{t_{2s}+\varepsilon_2}^{T} u(t)\,dt\right] \tag{2.41c}$$

and so on for three, four, etc., interior points of singularity in u(t). If an improper integral converges to a finite value in the standard sense of (2.41a), then its Cauchy principal value also converges to the same answer, but many improper integrals that do not converge in the sense of (2.41a) nevertheless have well-defined Cauchy principal values. For this reason, it is customary in Fourier-transform theory to interpret all improper integrals, such as the forward and inverse Fourier transforms, as Cauchy principal values, and that is what we shall do from now on. There will be no special notation used to distinguish Cauchy principal values from ordinary improper integrals.
To show the relevance of the Cauchy principal value, we calculate the Fourier transform of $1/t$, an example already considered above in connection with the sine transform [see discussion following Eq. (2.10e)]. Using the identity $e^{i\phi} = \cos(\phi) + i\sin(\phi)$, we have

$$\mathcal{F}^{(-ift)}\big(t^{-1}\big) = \int_{-\infty}^{\infty} e^{-2\pi ift}\,t^{-1}\,dt = \int_{-\infty}^{\infty} \cos(2\pi ft)\,t^{-1}\,dt - i\int_{-\infty}^{\infty} \sin(2\pi ft)\,t^{-1}\,dt. \tag{2.42a}$$

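There is no problem evaluating the imaginary part of this transform. Because $[t^{-1}\sin(2\pi ft)]$ is an even function of t, we can apply formulas (2.19) and (2.10f) to get

$$-i\int_{-\infty}^{\infty} \sin(2\pi ft)\,t^{-1}\,dt = -2i\int_{0}^{\infty} \sin(2\pi ft)\,t^{-1}\,dt = -i\pi \quad\text{for } f > 0.$$

When $f < 0$, we have

$$-i\int_{-\infty}^{\infty} \sin(2\pi ft)\,t^{-1}\,dt = i\int_{-\infty}^{\infty} \sin(2\pi |f| t)\,t^{-1}\,dt = i\pi,$$

allowing us to write

$$-i\int_{-\infty}^{\infty} \sin(2\pi ft)\,t^{-1}\,dt = -i\pi\,\mathrm{sgn}(f), \tag{2.42b}$$

where we define

$$\mathrm{sgn}(f) = \begin{cases} 1 & \text{for } f > 0 \\ 0 & \text{for } f = 0 \\ -1 & \text{for } f < 0. \end{cases} \tag{2.42c}$$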

The specification that $\mathrm{sgn}(0) = 0$ makes $\mathrm{sgn}(f)$ a proper odd function, equal to zero at $f = 0$, even though it has a jump discontinuity there. It also, of course, makes sense considering that (2.42b) is the integral of the zero function when $f = 0$. Evaluation of the real part of the transform in (2.42a) shows the usefulness of interpreting improper integrals as Cauchy principal values. When $f = 0$, the real part of the left-hand side of (2.42a) becomes, using the standard interpretation of an improper integral in (2.41a),

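$$\begin{aligned}
\lim_{\substack{T_1,\,T_2\to\infty \\ \varepsilon_1\to 0,\ \varepsilon_2\to 0}} \left[\int_{-T_1}^{-\varepsilon_1} \frac{dt}{t} + \int_{\varepsilon_2}^{T_2} \frac{dt}{t}\right]
&= \lim_{\substack{T_1,\,T_2\to\infty \\ \varepsilon_1\to 0,\ \varepsilon_2\to 0}} \left[-\int_{\varepsilon_1}^{T_1} \frac{dt}{t} + \int_{\varepsilon_2}^{T_2} \frac{dt}{t}\right] \\
&= \lim_{\substack{T_1,\,T_2\to\infty \\ \varepsilon_1\to 0,\ \varepsilon_2\to 0}} \left[\ln\!\left(\frac{T_2}{\varepsilon_2}\right) - \ln\!\left(\frac{T_1}{\varepsilon_1}\right)\right] \\
&= \lim_{\substack{T_1,\,T_2\to\infty \\ \varepsilon_1\to 0,\ \varepsilon_2\to 0}} \left[\ln\!\left(\frac{T_2}{T_1}\right) + \ln\!\left(\frac{\varepsilon_1}{\varepsilon_2}\right)\right].
\end{aligned} \tag{2.43a}$$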

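The expression $\ln(\varepsilon_1/\varepsilon_2)$ can be made anything we want depending on the limiting ratio chosen for $\varepsilon_1/\varepsilon_2$ as $\varepsilon_1\to 0$ and $\varepsilon_2\to 0$; the same is true of $\ln(T_1/T_2)$ as $T_1\to\infty$ and $T_2\to\infty$. Therefore, under the standard interpretation of an improper integral, the limit in (2.43a) does not exist. Comparison of (2.41a) to (2.41b) shows that (2.43a) can be converted to a Cauchy principal value by setting $\varepsilon_1 = \varepsilon_2 = \varepsilon$, $T_1 = T_2 = T$, and taking the limit as $T\to\infty$, $\varepsilon\to 0$. This leads to

$$\lim_{\substack{T\to\infty \\ \varepsilon\to 0}} \left[\ln\!\left(\frac{T}{T}\right) + \ln\!\left(\frac{\varepsilon}{\varepsilon}\right)\right] = 0,$$

allowing us to give a well-defined value to the expression $\int_{-\infty}^{\infty} dt/t$.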
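
The dependence of (2.43a) on how the limits are taken can be seen with a few lines of arithmetic. The following sketch evaluates the two pieces of the integral of $1/t$ in closed form; the function name `integral_of_one_over_t` and the particular cutoff values are hypothetical choices made only for illustration.

```python
import math

def integral_of_one_over_t(T1, T2, eps1, eps2):
    """Integral of dt/t over [-T1, -eps1] plus [eps2, T2], in closed form."""
    left = math.log(eps1) - math.log(T1)    # integral from -T1 to -eps1 is ln(eps1/T1)
    right = math.log(T2) - math.log(eps2)   # integral from eps2 to T2 is ln(T2/eps2)
    return left + right

# Independent limits: hold the ratio eps1/eps2 = k fixed while both shrink.
# The result is ln(k), so the "answer" depends on the route taken to the limit.
for k in (0.5, 1.0, 2.0):
    value = integral_of_one_over_t(T1=1e6, T2=1e6, eps1=k * 1e-9, eps2=1e-9)
    print(f"eps1/eps2 = {k}: integral = {value:+.6f}   ln(k) = {math.log(k):+.6f}")

# Cauchy principal value: eps1 = eps2 and T1 = T2, so the two pieces cancel.
pv = integral_of_one_over_t(T1=1e6, T2=1e6, eps1=1e-9, eps2=1e-9)
print("principal value:", pv)
```

Only the tied choice, with equal cutoffs on both sides, gives a value that does not depend on an arbitrary ratio.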
In general, the Cauchy principal value of the integral of any odd function is always zero,

$$\int_{-\infty}^{\infty} u(t)\,dt = 0 \quad\text{for any function } u \text{ such that } u(-t) = -u(t), \tag{2.43b}$$

because when taking the limit we are always simultaneously adding $u(t)\,dt$ increments to the integral at values of $t$ and $-t$ with the balanced addition of increments always cancelling out. Hence, interpreted as a Cauchy principal value,

$$\int_{-\infty}^{\infty} \cos(2\pi ft)\,t^{-1}\,dt = 0 \tag{2.43c}$$

because $[t^{-1}\cos(2\pi ft)]$ is an odd function of t. Therefore we can now assign a well-defined meaning to the forward Fourier transform of $1/t$ in (2.42a) using (2.43c) and (2.42b):

$$\mathcal{F}^{(-ift)}\big(t^{-1}\big) = -i\pi\,\mathrm{sgn}(f). \tag{2.43d}$$
For this answer to be a true extension to Fourier-transform theory, however, $1/t$ must satisfy Eq. (2.28$\ell$); that is, the inverse transform

$$\mathcal{F}^{(itf)}\big(-i\pi\,\mathrm{sgn}(f)\big)$$

has to give back the original function $1/t$.
Direct evaluation of the inverse transform gives

$$\begin{aligned}
\mathcal{F}^{(itf)}\big(-i\pi\,\mathrm{sgn}(f)\big) &= -i\pi\int_{-\infty}^{\infty} e^{2\pi ift}\,\mathrm{sgn}(f)\,df \\
&= -i\pi\int_{-\infty}^{\infty} \cos(2\pi ft)\,\mathrm{sgn}(f)\,df + \pi\int_{-\infty}^{\infty} \sin(2\pi ft)\,\mathrm{sgn}(f)\,df.
\end{aligned} \tag{2.43e}$$

The cosine integral is again the integral of an odd function so its Cauchy principal value is zero, but it is still not clear what value to assign the integral of $[\sin(2\pi ft)\cdot\mathrm{sgn}(f)]$. As the integral of an even function, we might try applying formula (2.19) to get

$$\pi\int_{-\infty}^{\infty} \sin(2\pi ft)\,\mathrm{sgn}(f)\,df \stackrel{?}{=} 2\pi\int_{0}^{\infty} \sin(2\pi ft)\,\mathrm{sgn}(f)\,df = 2\pi\int_{0}^{\infty} \sin(2\pi ft)\,df, \tag{2.43f}$$

but then we have the same difficulty already encountered when trying to evaluate the sine transform

$$2\pi\int_{0}^{\infty} \sin(2\pi ft)\,df$$

in Eq. (2.10g). To evaluate the inverse transform of $-i\pi\,\mathrm{sgn}(f)$, we need to create a new class of mathematical entities, called generalized functions, together with a set of rules for how they behave inside integrals. This extension to Fourier-transform theory is often called distribution theory, with the generalized functions called distributions.
2.11 Generalized Functions
Generalized functions are based on the well-established mathematical concept of a functional. A functional is a rule for assigning a complex number to each member of a set of test functions, where each test function $\phi$ has only one number assigned to it and the same number may end up assigned to different test functions. The Fourier transform of a function $\phi(t)$ at a specific frequency $f = f_0$ is a functional because it assigns the number $\mathcal{F}^{(-if_0 t)}\big(\phi(t)\big)$ to the test function $\phi$. In general, we can use any complex function u(t) having a real argument t as a weighting function inside an integral to create a functional. This functional, called $\underline{u}$, is defined to be

$$\underline{u}(\phi) = \int_{-\infty}^{\infty} u(t)\,\phi(t)\,dt = \text{complex number}. \tag{2.44}$$

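According to this definition the functional $\underline{u}$ is linear, like the Fourier transform, because

$$\begin{aligned}
\underline{u}(\alpha\phi_1 + \beta\phi_2) &= \int_{-\infty}^{\infty} u(t)\,[\alpha\phi_1(t) + \beta\phi_2(t)]\,dt = \alpha\int_{-\infty}^{\infty} u(t)\,\phi_1(t)\,dt + \beta\int_{-\infty}^{\infty} u(t)\,\phi_2(t)\,dt \\
&= \alpha\,\underline{u}(\phi_1) + \beta\,\underline{u}(\phi_2)
\end{aligned} \tag{2.45}$$

for any two complex constants $\alpha$, $\beta$ and test functions $\phi_1$, $\phi_2$.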
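
The idea of a functional built from a weighting function, and its linearity as stated in (2.45), can be sketched in code. This is only an illustration under stated assumptions: the helper `make_functional`, the midpoint-rule quadrature on a finite interval, and the particular real-valued choices of u, the test functions, and the constants are all hypothetical (the text allows complex values).

```python
import math

def make_functional(u, t_min=-10.0, t_max=10.0, steps=20000):
    """Return the map phi -> integral of u(t)*phi(t) dt, via the midpoint rule."""
    dt = (t_max - t_min) / steps

    def functional(phi):
        total = 0.0
        for k in range(steps):
            t = t_min + (k + 0.5) * dt
            total += u(t) * phi(t)
        return total * dt

    return functional

u = lambda t: 1.0 + t * t                  # a true function used as the weight
phi1 = lambda t: math.exp(-t * t)          # rapidly decaying test function
phi2 = lambda t: t * math.exp(-t * t)      # a second, odd test function
alpha, beta = 2.0, -3.0

u_functional = make_functional(u)

# Linearity: apply the functional to a combination, or combine the two numbers.
lhs = u_functional(lambda t: alpha * phi1(t) + beta * phi2(t))
rhs = alpha * u_functional(phi1) + beta * u_functional(phi2)
print(lhs, rhs)
```

Each call to `u_functional` returns a single number for a given test function, which is all that the definition of a functional requires.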
From the notation $\underline{u}$, it is clear that all functions u, as long as the integral in Eq. (2.44) exists, have associated with them the functional $\underline{u}$ defined for the test functions $\phi$. There are also functionals that behave in every way like the functionals $\underline{u}$, but for which no corresponding true function u can be defined. We can, however, associate with these functionals a new class of mathematical objects, called generalized functions, which can be shown to have many of the properties of true functions. For this reason, it is customary to use function notation when referring to generalized functions. If an already-understood functional has no true function u(t) associated with it, we can use the properties of this already-understood functional to define a generalized function called $u_G(t)$, with the subscript G reminding us that $u_G$ is a generalized function. By analogy with the true function u(t) associated with the functional $\underline{u}$, the generalized function and its behavior inside integrals is defined in terms of the already-known functional, which we call $\underline{u}_G$, using the definition

$$\int_{-\infty}^{\infty} u_G(t)\,\phi(t)\,dt = \underline{u}_G(\phi) \tag{2.46}$$

for any test function $\phi$. Since we already know what complex number the functional $\underline{u}_G$ gives for any test function $\phi$, Eq. (2.46) is not a definition of $\underline{u}_G$ but rather a definition of what it means to put $[u_G(t)\,\phi(t)]$ inside an integral. Clearly, the generalized function itself is well defined only when its product with a test function is integrated over t. Because the functional $\underline{u}_G$ behaves in every way like the functionals $\underline{u}$ based on the Cauchy-principal-value integration of true functions, we have established a new type of integration using the product of generalized functions $u_G(t)$ with test functions $\phi(t)$. Hence, we have not only generalized what is meant by a function but have also extended again what is meant by integration.
To handle algebraic expressions involving both generalized functions and true functions, we must define what it means to say two generalized functions $u_G(t)$ and $v_G(t)$ are equal. We say that when

$$\int_{-\infty}^{\infty} u_G(t)\,\phi(t)\,dt = \int_{-\infty}^{\infty} v_G(t)\,\phi(t)\,dt \tag{2.47a}$$

for all appropriate test functions $\phi$, then

$$u_G(t) = v_G(t). \tag{2.47b}$$

We also define a generalized function $u_G(t)$, which we know only from its associated functional $\underline{u}_G$ using definition (2.46), to be equal to a true function v(t) when

$$\underline{u}_G(\phi) = \underline{v}(\phi) \tag{2.48a}$$

for all appropriate test functions $\phi$. Another way of stating this is that whenever

$$\int_{-\infty}^{\infty} u_G(t)\,\phi(t)\,dt = \int_{-\infty}^{\infty} v(t)\,\phi(t)\,dt \tag{2.48b}$$

for all the test functions $\phi$, we say that

$$u_G(t) = v(t). \tag{2.48c}$$

Two generalized functions $u_G(t)$ and $v_G(t)$ are defined to be equal over an interval $a < t < b$ when

$$\underline{u}_G(\phi_{ab}) = \underline{v}_G(\phi_{ab}) \tag{2.48d}$$

or

$$\int_{-\infty}^{\infty} u_G(t)\,\phi_{ab}(t)\,dt = \int_{-\infty}^{\infty} v_G(t)\,\phi_{ab}(t)\,dt \tag{2.48e}$$

for all test functions $\phi_{ab}(t)$ that are identically zero for all $t < a$ and for all $t > b$. The key point here is that we are explicitly allowing $\phi_{ab}(t)$ to be nonzero only inside the interval $a < t < b$. We also say that a true function v(t) equals a generalized function $u_G(t)$ in the interval $a < t < b$,

$$u_G(t) = v(t) \quad\text{for } a < t < b, \tag{2.48f}$$

whenever

$$\int_{-\infty}^{\infty} u_G(t)\,\phi_{ab}(t)\,dt = \int_{-\infty}^{\infty} v(t)\,\phi_{ab}(t)\,dt \tag{2.48g}$$

for all the $\phi_{ab}(t)$ test functions. In Eqs. (2.48d) through (2.48g), we allow for half-infinite intervals by permitting constant b to be $+\infty$ with constant a finite and constant a to be $-\infty$ with constant b finite.
The definitions of equality between two generalized functions or between a generalized function and a true function can be, depending on the set of test functions $\phi$ chosen, either very much looser than the standard idea of equality or very much the same. Suppose, by way of analogy, we define two true functions $u_1(t)$ and $u_2(t)$ to be equal when

$$\int_{-\infty}^{\infty} u_1(t)\,\phi(t)\,dt = \int_{-\infty}^{\infty} u_2(t)\,\phi(t)\,dt \tag{2.49}$$

for all test functions $\phi$. If the only allowed test function is $\phi(t) = 0$, then any two functions $u_1(t)$ and $u_2(t)$ are equal. If, on the other hand, the allowed test functions are $\phi(t) = e^{-2\pi ift}$ for all real values of f, we are saying that $u_1(t)$ and $u_2(t)$ are equal when their Fourier transforms $\mathcal{F}^{(-ift)}\big(u_1(t)\big)$ and $\mathcal{F}^{(-ift)}\big(u_2(t)\big)$ are the same. From the Fourier inversion formulas, it then follows that $u_1(t)$ must be identical to $u_2(t)$, except possibly at jump discontinuities and isolated points, for all reasonably well-behaved functions $u_1(t)$ and $u_2(t)$. In general, we expect the set of test functions to be diverse enough that serious thought and some mathematical ingenuity are required to find two functions $u_1(t)$ and $u_2(t)$ that satisfy Eq. (2.49) yet are not basically the same function. Of course, the integrals used in Eq. (2.49), and all the other integrals involving only true functions in Eqs. (2.44) through (2.48g), for that matter, must be known to exist. Often the finiteness of these integrals and the general smoothness of the test functions are enforced by the requirement that

$$\lim_{t\to\pm\infty} [t^N \phi(t)] = 0 \quad\text{for } N = 0, 1, 2, \ldots, \tag{2.50a}$$

with the Mth derivative, $\phi^{(M)}(t) = d^M\phi/dt^M$, satisfying

$$\lim_{t\to\pm\infty} [t^N \phi^{(M)}(t)] = 0 \quad\text{for } N = 0, 1, 2, \ldots \text{ and } M = 1, 2, \ldots. \tag{2.50b}$$

A function such as $e^{-at^2}$ for $a > 0$ satisfies (2.50a) and (2.50b), and in general all functions representing physically realistic measurements can be taken to satisfy these two requirements. It turns out, however, that the most useful and popular generalized function used in Fourier theory can handle a wider variety of test functions, requiring only that the test functions $\phi$ be continuous at $t = 0$ (see Sec. 2.14 below).
Continuing to develop what is meant by the $=$ sign applied to generalized functions, we say that the product of a true function w(t) and a generalized function $u_G(t)$ is another generalized function $v_G(t)$,

$$v_G(t) = w(t)\cdot u_G(t), \tag{2.51a}$$

which is defined to mean that

$$\int_{-\infty}^{\infty} v_G(t)\,\phi(t)\,dt = \int_{-\infty}^{\infty} w(t)\,u_G(t)\,\phi(t)\,dt$$

for all test functions $\phi(t)$. A linear combination of true functions and generalized functions specified by

$$w_G(t) = u_1(t)\,v_{G1}(t) + u_2(t)\,v_{G2}(t) + \cdots \tag{2.51b}$$

is defined to mean that

$$\int_{-\infty}^{\infty} w_G(t)\,\phi(t)\,dt = \int_{-\infty}^{\infty} u_1(t)\,v_{G1}(t)\,\phi(t)\,dt + \int_{-\infty}^{\infty} u_2(t)\,v_{G2}(t)\,\phi(t)\,dt + \cdots$$

for all test functions $\phi(t)$. In general, there is no difficulty assigning a meaning to equations such as

$$u_1(t)\,v_{G1}(t) + u_2(t)\,v_{G2}(t) + \cdots + u_N(t)\,v_{GN}(t) = U_1(t)\,V_{G1}(t) + U_2(t)\,V_{G2}(t) + \cdots + U_M(t)\,V_{GM}(t) \tag{2.51c}$$

for true functions $u_1(t), u_2(t), \ldots, u_N(t), U_1(t), U_2(t), \ldots, U_M(t)$ and generalized functions $v_{G1}(t), v_{G2}(t), \ldots, v_{GN}(t), V_{G1}(t), V_{G2}(t), \ldots, V_{GM}(t)$. As long as both sides of the equation are just linear combinations of generalized functions and true functions, we interpret their equality to mean that

$$\begin{aligned}
&\int_{-\infty}^{\infty} u_1(t)\,v_{G1}(t)\,\phi(t)\,dt + \int_{-\infty}^{\infty} u_2(t)\,v_{G2}(t)\,\phi(t)\,dt + \cdots + \int_{-\infty}^{\infty} u_N(t)\,v_{GN}(t)\,\phi(t)\,dt \\
&\quad = \int_{-\infty}^{\infty} U_1(t)\,V_{G1}(t)\,\phi(t)\,dt + \int_{-\infty}^{\infty} U_2(t)\,V_{G2}(t)\,\phi(t)\,dt + \cdots + \int_{-\infty}^{\infty} U_M(t)\,V_{GM}(t)\,\phi(t)\,dt
\end{aligned}$$

for all test functions $\phi(t)$. Even the simplest nonlinear expressions, however, such as

$$v_G(t) \stackrel{?}{=} [u_G(t)]^2,$$

cannot be resolved by putting both sides inside an integral, because the right-hand side of

$$\int_{-\infty}^{\infty} v_G(t)\,\phi(t)\,dt \stackrel{?}{=} \int_{-\infty}^{\infty} [u_G(t)]^2\,\phi(t)\,dt$$

is still undefined. We know that applying the already-understood functional $\underline{u}_G$ to $\phi$ gives

$$\int_{-\infty}^{\infty} u_G(t)\,\phi(t)\,dt = \underline{u}_G(\phi),$$

but no definition has been given to

$$\int_{-\infty}^{\infty} [u_G(t)]^2\,\phi(t)\,dt$$

in terms of the functional $\underline{u}_G$. It turns out that, in general, nonlinear expressions involving generalized functions cannot be given useful interpretations. Hence, generalized functions must be treated with caution unless they are used inside linear combinations of the type shown in (2.51b) and (2.51c).
Although generalized functions do have limitations, there are many things that can be done with them. We can give meaning to $u_G(t-a)$ for any real constant a by defining that

$$\int_{-\infty}^{\infty} u_G(t-a)\,\phi(t)\,dt = \int_{-\infty}^{\infty} u_G(t)\,\phi(t+a)\,dt \tag{2.52a}$$

for all test functions $\phi$. This definition is, of course, consistent with what happens when the formal substitution $t' = t - a$ is made inside the original integral,

$$\int_{-\infty}^{\infty} u_G(t-a)\,\phi(t)\,dt = \int_{-\infty}^{\infty} u_G(t')\,\phi(t'+a)\,dt' = \int_{-\infty}^{\infty} u_G(t)\,\phi(t+a)\,dt,$$

treating $u_G(t-a)$ like a true function $u(t-a)$. We can give meaning to $u_G(at)$ for any nonzero real constant a by defining that

$$\int_{-\infty}^{\infty} u_G(at)\,\phi(t)\,dt = \frac{1}{|a|}\int_{-\infty}^{\infty} u_G(t)\,\phi(t/a)\,dt \tag{2.52b}$$

for all test functions $\phi$. This definition is consistent with what happens when we make the formal substitution $t' = at$ in the integral

$$\int_{-\infty}^{\infty} u_G(at)\,\phi(t)\,dt$$

and treat $u_G(at)$ like a true function,

$$\int_{-\infty}^{\infty} u_G(at)\,\phi(t)\,dt = \left\{\begin{array}{ll} \dfrac{1}{a}\displaystyle\int_{-\infty}^{\infty} u_G(t')\,\phi(t'/a)\,dt' & \text{for } a > 0 \\[2ex] -\dfrac{1}{a}\displaystyle\int_{-\infty}^{\infty} u_G(t')\,\phi(t'/a)\,dt' & \text{for } a < 0 \end{array}\right\} = \frac{1}{|a|}\int_{-\infty}^{\infty} u_G(t)\,\phi(t/a)\,dt.$$

When the argument of $u_G$ is the linear combination $at + c$ for real constants a and c, with a nonzero, we define

$$\int_{-\infty}^{\infty} u_G(at+c)\,\phi(t)\,dt = \frac{1}{|a|}\int_{-\infty}^{\infty} u_G(t)\,\phi\big((t-c)/a\big)\,dt \tag{2.52c}$$

and, combining the arguments used to explain definitions (2.52a) and (2.52b), we see that transforming the variable of integration to $t' = at + c$ gives

$$\int_{-\infty}^{\infty} u_G(at+c)\,\phi(t)\,dt = \frac{1}{|a|}\int_{-\infty}^{\infty} u_G(t')\,\phi\big((t'-c)/a\big)\,dt',$$

justifying definition (2.52c). In general, any variable transformation that is permitted for the argument of a true function we also permit for the argument of a generalized function unless it results in an inappropriate test function.
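
Because (2.52c) is modeled on what happens for true functions, it can be checked numerically with a true function standing in for $u_G$. The sketch below is an illustration only: the helper `integrate`, the choices $u(t) = \cos t$ and $\phi(t) = e^{-t^2}$, the constants a and c, and the finite integration ranges are all assumptions made for the example.

```python
import math

def integrate(g, lo, hi, steps=100000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    dt = (hi - lo) / steps
    return sum(g(lo + (k + 0.5) * dt) for k in range(steps)) * dt

u = math.cos                               # true function standing in for u_G
phi = lambda t: math.exp(-t * t)           # rapidly decaying test function
a, c = -2.0, 0.5                           # note the negative scale factor

# Left side of (2.52c): the argument of u is at + c.
lhs = integrate(lambda t: u(a * t + c) * phi(t), -12.0, 12.0)

# Right side of (2.52c): the change of variable is moved onto the test function.
rhs = integrate(lambda t: u(t) * phi((t - c) / a), -30.0, 30.0) / abs(a)

# Closed form for this particular pair of functions.
exact = math.sqrt(math.pi) * math.exp(-1.0) * math.cos(0.5)
print(lhs, rhs, exact)
```

The factor `1 / abs(a)` rather than `1 / a` is what keeps the two sides equal when a is negative.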
We define a generalized function $u_G(t)$ to be even if

$$\int_{-\infty}^{\infty} u_G(t)\,\phi_o(t)\,dt = 0 \tag{2.52d}$$

for all odd test functions $\phi_o$, and we define $u_G(t)$ to be odd if

$$\int_{-\infty}^{\infty} u_G(t)\,\phi_e(t)\,dt = 0 \tag{2.52e}$$

for all even test functions $\phi_e$. This gives $u_G(t)$ the same behavior it would have if it were an even or odd true function multiplied by $\phi_e$ or $\phi_o$ and integrated over all t. Putting a subscript e on the generalized function $u_{Ge}(t)$ to show that it obeys the above definition for an even generalized function, we note that, as described in Eq. (2.11c) above, any test function $\phi(t)$ can be written as the sum of an even function $\phi_e(t)$ and an odd function $\phi_o(t)$. Hence, for any test function $\phi$ and an even generalized function $u_{Ge}(t)$, we can write, using definition (2.52d),

$$\begin{aligned}
\int_{-\infty}^{\infty} u_{Ge}(t)\,\phi(t)\,dt &= \int_{-\infty}^{\infty} u_{Ge}(t)\,[\phi_e(t) + \phi_o(t)]\,dt = \int_{-\infty}^{\infty} u_{Ge}(t)\,\phi_e(t)\,dt + \int_{-\infty}^{\infty} u_{Ge}(t)\,\phi_o(t)\,dt \\
&= \int_{-\infty}^{\infty} u_{Ge}(t)\,\phi_e(t)\,dt.
\end{aligned}$$

Definition (2.52b) with $a = -1$ gives, again using that $\phi(t) = \phi_e(t) + \phi_o(t)$,

$$\begin{aligned}
\int_{-\infty}^{\infty} u_{Ge}(-t)\,\phi(t)\,dt &= \int_{-\infty}^{\infty} u_{Ge}(t)\,\phi(-t)\,dt = \int_{-\infty}^{\infty} u_{Ge}(t)\,[\phi_e(-t) + \phi_o(-t)]\,dt \\
&= \int_{-\infty}^{\infty} u_{Ge}(t)\,\phi_e(-t)\,dt + \int_{-\infty}^{\infty} u_{Ge}(t)\,\phi_o(-t)\,dt \\
&= \int_{-\infty}^{\infty} u_{Ge}(t)\,\phi_e(t)\,dt - \int_{-\infty}^{\infty} u_{Ge}(t)\,\phi_o(t)\,dt \\
&= \int_{-\infty}^{\infty} u_{Ge}(t)\,\phi_e(t)\,dt,
\end{aligned}$$

where in the last two steps we use $\phi_o(-t) = -\phi_o(t)$, $\phi_e(-t) = \phi_e(t)$, and definition (2.52d). We see that both

$$\int_{-\infty}^{\infty} u_{Ge}(t)\,\phi(t)\,dt \quad\text{and}\quad \int_{-\infty}^{\infty} u_{Ge}(-t)\,\phi(t)\,dt$$

are equal to

$$\int_{-\infty}^{\infty} u_{Ge}(t)\,\phi_e(t)\,dt$$

for any test function $\phi$, so by definition (2.47a) for the equality of two generalized functions, it follows that

$$u_{Ge}(t) = u_{Ge}(-t) \tag{2.52f}$$

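for any even generalized function $u_{Ge}(t)$. If $u_{Go}(t)$ is any odd generalized function, we can use $\phi(t) = \phi_e(t) + \phi_o(t)$ and definition (2.52e) to get

$$\int_{-\infty}^{\infty} u_{Go}(t)\,\phi(t)\,dt = \int_{-\infty}^{\infty} u_{Go}(t)\,[\phi_e(t) + \phi_o(t)]\,dt = \int_{-\infty}^{\infty} u_{Go}(t)\,\phi_o(t)\,dt$$

and definition (2.52b) to get

$$\begin{aligned}
\int_{-\infty}^{\infty} u_{Go}(-t)\,\phi(t)\,dt &= \int_{-\infty}^{\infty} u_{Go}(t)\,\phi(-t)\,dt = \int_{-\infty}^{\infty} u_{Go}(t)\,\phi_e(-t)\,dt + \int_{-\infty}^{\infty} u_{Go}(t)\,\phi_o(-t)\,dt \\
&= \int_{-\infty}^{\infty} u_{Go}(t)\,\phi_e(t)\,dt + \int_{-\infty}^{\infty} u_{Go}(t)\,[-\phi_o(t)]\,dt \\
&= -\int_{-\infty}^{\infty} [u_{Go}(t)\,\phi_o(t)]\,dt
\end{aligned}$$

or

$$\int_{-\infty}^{\infty} [-u_{Go}(-t)]\,\phi(t)\,dt = \int_{-\infty}^{\infty} u_{Go}(t)\,\phi_o(t)\,dt.$$

Clearly, $\int_{-\infty}^{\infty} u_{Go}(t)\,\phi(t)\,dt$ and $\int_{-\infty}^{\infty} [-u_{Go}(-t)]\,\phi(t)\,dt$ are equal to each other because they are both equal to $\int_{-\infty}^{\infty} u_{Go}(t)\,\phi_o(t)\,dt$ for any test function $\phi$, so by definition (2.47a) we conclude that

$$u_{Go}(t) = -u_{Go}(-t)$$

or

$$u_{Go}(-t) = -u_{Go}(t). \tag{2.52g}$$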

We define the derivative of a generalized function $u_G(t)$ to be another generalized function

$$u_G^{(1)}(t) = u_G'(t).$$

The generalized function $u_G(t)$ is defined in terms of the already-known functional $\underline{u}_G$, but what functional $\underline{u}_G'$ defines the generalized function $u_G'(t)$? We specify this new functional $\underline{u}_G'$ with the definition

$$\underline{u}_G'(\phi) = -\underline{u}_G(\phi')$$

or

$$\underline{u}_G'(\phi) = \int_{-\infty}^{\infty} u_G'(t)\,\phi(t)\,dt = -\int_{-\infty}^{\infty} u_G(t)\,\frac{d\phi}{dt}\,dt \tag{2.53a}$$

for any test function $\phi$. Therefore, the new generalized function $u_G'(t)$ satisfies the equation

$$\int_{-\infty}^{\infty} u_G'(t)\,\phi(t)\,dt = -\int_{-\infty}^{\infty} u_G(t)\,\frac{d\phi}{dt}\,dt \tag{2.53b}$$

for any test function $\phi$. We note that this definition is consistent with a formal integration by parts, treating $u_G(t)$ like a true function $u(t)$ to get

$$\int_{-\infty}^{\infty} u_G'(t)\,\phi(t)\,dt = \big[u_G(t)\,\phi(t)\big]_{-\infty}^{\infty} - \int_{-\infty}^{\infty} u_G(t)\,\frac{d\phi}{dt}\,dt = -\int_{-\infty}^{\infty} u_G(t)\,\frac{d\phi}{dt}\,dt,$$

with the term in square brackets [ ] zero for all test functions $\phi$. We can make this first term zero either by requiring $\phi$ to approach zero as $t\to\pm\infty$ or by having $u_G(t)$ equal a true function in the sense of (2.48g) with the true function becoming zero as $t\to\pm\infty$. The integral involving $\phi'(t) = d\phi/dt$ must also, of course, have a well-defined meaning for all the test functions $\phi$.
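
Since definition (2.53b) is modeled on integration by parts, it must reproduce the ordinary derivative whenever $u_G$ happens to be a differentiable true function. The sketch below checks this for one hypothetical case, $u(t) = \sin t$ with the test function $\phi(t) = e^{-t^2}$; the helper `integrate` and the finite integration range are assumptions made for the example.

```python
import math

def integrate(g, lo=-12.0, hi=12.0, steps=48000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    dt = (hi - lo) / steps
    return sum(g(lo + (k + 0.5) * dt) for k in range(steps)) * dt

u = math.sin                                     # bounded true function
du = math.cos                                    # its ordinary derivative
phi = lambda t: math.exp(-t * t)                 # test function vanishing at infinity
dphi = lambda t: -2.0 * t * math.exp(-t * t)     # derivative of the test function

# Left side of (2.53b): the derivative of u integrated against phi.
lhs = integrate(lambda t: du(t) * phi(t))

# Right side of (2.53b): the derivative is moved onto phi, with a minus sign.
rhs = -integrate(lambda t: u(t) * dphi(t))

# Closed form for this pair: sqrt(pi) * exp(-1/4).
exact = math.sqrt(math.pi) * math.exp(-0.25)
print(lhs, rhs, exact)
```

Here $\sin t$ does not decay, so the boundary term vanishes only because the test function does, which is the first of the two options described above.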
The convolution of two generalized functions $u_G(t)$ and $v_G(t)$ is defined to be another generalized function

$$w_G(t) = u_G(t) * v_G(t). \tag{2.54a}$$

From Eqs. (2.47a) and (2.47b), we know that (2.54a) must mean that

$$\int_{-\infty}^{\infty} w_G(t)\,\phi(t)\,dt = \int_{-\infty}^{\infty} [u_G(t) * v_G(t)]\,\phi(t)\,dt \tag{2.54b}$$

for all test functions $\phi$. We now give meaning to both sides of (2.54b) by defining that, for all test functions $\phi$,

$$\int_{-\infty}^{\infty} w_G(t)\,\phi(t)\,dt = \int_{-\infty}^{\infty} [u_G(t) * v_G(t)]\,\phi(t)\,dt = \int_{-\infty}^{\infty} dt'\,u_G(t') \int_{-\infty}^{\infty} dt''\,v_G(t'')\,\phi(t'+t''). \tag{2.54c}$$

Note that the right-hand side of (2.54c) is as well defined as our previous definitions, since

$$\phi_v(t') = \int_{-\infty}^{\infty} v_G(t'')\,\phi(t'+t'')\,dt''$$

is just another complex number depending on the real parameter $t'$, which can be treated as another true test function $\phi_v(t')$ inside the double integral of (2.54c),

$$\int_{-\infty}^{\infty} dt'\,u_G(t') \int_{-\infty}^{\infty} dt''\,v_G(t'')\,\phi(t'+t'') = \int_{-\infty}^{\infty} u_G(t')\,\phi_v(t')\,dt'.$$
As long as $\phi(t'+t'')$ and $\phi_v(t')$ are both test functions whenever $\phi$ is a test function, definition (2.54c) should present no difficulties. To justify this definition, we note that formally treating $u_G(t)$ and $v_G(t)$ as true functions gives

$$\begin{aligned}
\int_{-\infty}^{\infty} [u_G(t) * v_G(t)]\,\phi(t)\,dt &= \int_{-\infty}^{\infty} dt\,\phi(t) \int_{-\infty}^{\infty} dt'\,u_G(t')\,v_G(t-t') \\
&= \int_{-\infty}^{\infty} dt'\,u_G(t') \int_{-\infty}^{\infty} dt\,\phi(t)\,v_G(t-t'),
\end{aligned}$$

where the last step interchanges the order of integration. We now use (2.52a) to write

$$\int_{-\infty}^{\infty} \phi(t)\,v_G(t-t')\,dt = \int_{-\infty}^{\infty} v_G(t'')\,\phi(t''+t')\,dt'',$$

which leads to

$$\int_{-\infty}^{\infty} [u_G(t) * v_G(t)]\,\phi(t)\,dt = \int_{-\infty}^{\infty} dt'\,u_G(t') \int_{-\infty}^{\infty} dt''\,v_G(t'')\,\phi(t'+t''),$$

justifying the definition given in (2.54c). Note that the order of integration inside the double integral of (2.54c) can be freely interchanged,

$$\int_{-\infty}^{\infty} dt'\,u_G(t') \int_{-\infty}^{\infty} dt''\,v_G(t'')\,\phi(t'+t'') = \int_{-\infty}^{\infty} dt''\,v_G(t'') \int_{-\infty}^{\infty} dt'\,u_G(t')\,\phi(t'+t''),$$

showing that $u_G(t) * v_G(t) = v_G(t) * u_G(t)$ for generalized functions as well as true functions.
Because the convolution itself is defined as an integral, there is no problem giving a meaning to the convolution of a true function with a generalized function as long as the true function is an acceptable test function. For a generalized function $u_G(t)$ and test function $\phi(t)$, we have

$$u_G(t) * \phi(t) = \int_{-\infty}^{\infty} u_G(t')\,\phi(t-t')\,dt' = \int_{-\infty}^{\infty} u_G(t')\,\phi\big(-(t'-t)\big)\,dt' = \int_{-\infty}^{\infty} u_G(t-t')\,\phi(t')\,dt', \tag{2.55a}$$

where definition (2.52c) with $a = -1$ and $c = t$ is used in the last step of (2.55a). It clearly makes sense to say that

$$\int_{-\infty}^{\infty} u_G(t-t')\,\phi(t')\,dt' = \phi(t) * u_G(t),$$

which means that

$$u_G(t) * \phi(t) = \phi(t) * u_G(t) \tag{2.55b}$$

for the convolution of a generalized function with any test function $\phi$.
2.12 Generalized Limits
Given a sequence of true functions $u_1(t), u_2(t), \ldots, u_n(t), \ldots$, we can form a corresponding sequence of integrals with the test functions $\phi$,

$$\int_{-\infty}^{\infty} u_1(t)\,\phi(t)\,dt,\ \int_{-\infty}^{\infty} u_2(t)\,\phi(t)\,dt,\ \ldots,\ \int_{-\infty}^{\infty} u_n(t)\,\phi(t)\,dt,\ \ldots.$$

We define Glim, the generalized limit of the sequence of true functions $u_n(t)$, by taking the standard limit of the sequence of integrals,

$$\lim_{n\to\infty} \int_{-\infty}^{\infty} u_n(t)\,\phi(t)\,dt,$$

and requiring that the generalized limit of the sequence of true functions $u_n(t)$, written as

$$\operatorname*{Glim}_{n\to\infty} u_n(t),$$

satisfy the equation

$$\lim_{n\to\infty} \int_{-\infty}^{\infty} u_n(t)\,\phi(t)\,dt = \int_{-\infty}^{\infty} \Big[\operatorname*{Glim}_{n\to\infty} u_n(t)\Big]\,\phi(t)\,dt \tag{2.56a}$$

for any test function $\phi$. In effect, the generalized limit Glim is what we get when we insist on moving the standard limit inside the integral. Almost always, of course, it turns out that the generalized limit is the same as the standard limit,

$$\operatorname*{Glim}_{n\to\infty} u_n(t) = \lim_{n\to\infty} u_n(t),$$

so that

$$\lim_{n\to\infty} \int_{-\infty}^{\infty} u_n(t)\,\phi(t)\,dt = \int_{-\infty}^{\infty} \Big[\lim_{n\to\infty} u_n(t)\Big]\,\phi(t)\,dt, \tag{2.56b}$$

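but this is not always the case. If we define the $\Pi$ function (see Fig. 2.6) by

$$\Pi(t,T) = \begin{cases} 1 & \text{for } |t| < T \\ 1/2 & \text{for } |t| = T \\ 0 & \text{for } |t| > T, \end{cases} \tag{2.56c}$$

we can construct a sequence of true functions by

$$u_n(t) = \frac{1}{n}\,\Pi\!\left(\frac{t}{n}, 1\right). \tag{2.56d}$$

Function $\Pi(t/n, 1)$ is 1 only when $-n < t < n$, so when

$$\phi(t) = 1$$

is an acceptable test function, it is always true that

$$\int_{-\infty}^{\infty} u_n(t)\,dt = \frac{1}{n}\int_{-\infty}^{\infty} \Pi\!\left(\frac{t}{n}, 1\right) dt = 2,$$

which makes

$$\lim_{n\to\infty} \int_{-\infty}^{\infty} u_n(t)\,dt = 2. \tag{2.56e}$$

On the other hand,

$$\lim_{n\to\infty} u_n(t) = \lim_{n\to\infty} \frac{1}{n}\,\Pi\!\left(\frac{t}{n}, 1\right) = 0,$$

which gives

$$\int_{-\infty}^{\infty} \Big[\lim_{n\to\infty} u_n(t)\Big]\,dt = 0. \tag{2.56f}$$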

FIGURE 2.6. The function $\Pi(t,T)$ plotted against t, with its edges at $t = -T$ and $t = T$.

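
The contrast between (2.56e) and (2.56f) can be checked by direct computation. In the sketch below the function names `Pi`, `u_n`, and `integral_of_u_n`, the step size, and the sample values of n are hypothetical choices for illustration; the integral is approximated with a midpoint rule over a range slightly wider than the region where $u_n$ is nonzero.

```python
def Pi(t, T):
    """The function of (2.56c): 1 for |t| < T, 1/2 at |t| = T, 0 for |t| > T."""
    if abs(t) < T:
        return 1.0
    if abs(t) == T:
        return 0.5
    return 0.0

def u_n(t, n):
    """The sequence of (2.56d): a box of height 1/n and width 2n."""
    return Pi(t / n, 1.0) / n

def integral_of_u_n(n, dt=0.01):
    """Midpoint-rule integral of u_n(t) * phi(t) with phi(t) = 1 over [-(n+1), n+1]."""
    lo = -(n + 1)
    steps = int(round(2 * (n + 1) / dt))
    return sum(u_n(lo + (k + 0.5) * dt, n) for k in range(steps)) * dt

# The integrals stay at 2 no matter how large n becomes ...
integrals = [integral_of_u_n(n) for n in (1, 10, 100)]

# ... while the value at any fixed t shrinks toward zero.
pointwise = [u_n(3.0, n) for n in (10, 1000, 100000)]

print(integrals)
print(pointwise)
```

The box gets lower at exactly the rate it gets wider, so its area never changes even though its height at every fixed t goes to zero.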
The disagreement of (2.56e) and (2.56f) shows that there can be a very important difference between the generalized limit and the standard limit, because Eq. (2.56b) does not always hold true. We cannot avoid this problem by ruling out constant test functions such as $\phi(t) = 1$. Consider, for example,

$$\phi(t) = \frac{1}{1+t^2}$$

and construct a sequence of true functions

$$u_n(t) = t\,\sin(t/n).$$

We find that (see footnote 21)

$$\int_{-\infty}^{\infty} \frac{t\,\sin(t/n)}{1+t^2}\,dt = \pi e^{-1/n}, \tag{2.57a}$$

which gives

$$\lim_{n\to\infty} \int_{-\infty}^{\infty} \frac{t\,\sin(t/n)}{1+t^2}\,dt = \pi. \tag{2.57b}$$

This is not the same as

$$\int_{-\infty}^{\infty} \frac{\{\lim_{n\to\infty} [t\,\sin(t/n)]\}}{1+t^2}\,dt = 0. \tag{2.57c}$$

Once again, we have found a sequence of true functions $u_n(t)$ that does not satisfy (2.56b). This second example can, in fact, be seen to fail (2.56b) for much the same reason as the first. Since an even function is being integrated, we can write that [see Eq. (2.19)]

$$\lim_{n\to\infty} \int_{-\infty}^{\infty} \frac{t\,\sin(t/n)}{1+t^2}\,dt = 2\lim_{n\to\infty} \int_{0}^{\infty} \frac{t\,\sin(t/n)}{1+t^2}\,dt. \tag{2.57d}$$

Consider what happens to the first, positive hump of the sine as n increases in the integral on the right-hand side of Eq. (2.57d). The values of t for which $\sin(t/n)$ is significantly different from zero, say from $t = n\pi/4$ to $t = 3n\pi/4$, comprise an interval $\Delta t = n\pi/2$ with a width that increases linearly with n, just like the interval 2n in (2.56d) over which $\Pi(t/n, 1)$ equals one. The center of this hump is at $t = n\pi/2$, so as n increases, the hump's center appears at ever larger values of t. Hence, we can make the approximation that for large n

$$\frac{t}{1+t^2} \approx \frac{1}{t} \cong \frac{2}{n\pi}.$$

This means the characteristic size of

$$\frac{t\,\sin(t/n)}{1+t^2}$$

at the hump decreases as $1/n$, while the hump's width, $\Delta t = n\pi/2$, increases as n. The product of the size and width therefore tends to a constant as n gets large, preventing the integral from shrinking as $n\to\infty$. This is the same phenomenon that caused our first example $n^{-1}\,\Pi(t/n, 1)$ to fail Eq. (2.56b). Up to this point, we have, of course, only discussed the contribution of the first

21 I. S. Gradshteyn and I. M. Ryzhik, Table of Integrals, Series, and Products, edited by Alan Jeffrey, 5th ed. (Academic Press, New York, 1994), p. 445, formula 4 in Sec. 3.723 with $a = 1/n$ and $\beta = 1$.

2.13 Fourier Transforms of Generalized Functions
For every generalized function $u_G(t)$, there is at least one sequence of true functions $u_1(t), u_2(t), \ldots, u_n(t), \ldots$ such that

$$\operatorname*{Glim}_{n\to\infty} u_n(t) = u_G(t). \tag{2.58a}$$

This formula should be interpreted in the sense of (2.47b) and (2.56a); that is, it means

$$\int_{-\infty}^{\infty} \Big[\operatorname*{Glim}_{n\to\infty} u_n(t)\Big]\,\phi(t)\,dt = \lim_{n\to\infty} \int_{-\infty}^{\infty} u_n(t)\,\phi(t)\,dt = \int_{-\infty}^{\infty} u_G(t)\,\phi(t)\,dt \tag{2.58b}$$

for all test functions $\phi$. We use the sequence of true functions whose generalized limit is the generalized function to define the Fourier transform of the generalized function. If a sequence of true functions $w_1(t), w_2(t), \ldots, w_n(t), \ldots$ can be forward Fourier transformed to give another
sequence of true functions $W_1(f), W_2(f), \ldots, W_n(f), \ldots$ such that

$$W_n(f) = \int_{-\infty}^{\infty} w_n(t)\,e^{-2\pi i f t}\,dt \tag{2.59a}$$

and

$$w_n(t) = \int_{-\infty}^{\infty} W_n(f)\,e^{2\pi i f t}\,df \tag{2.59b}$$

for all values of $n$, we then define the forward Fourier transform of the generalized function

$$w_G(t) = G\lim_{n\to\infty} w_n(t) \tag{2.59c}$$

to be

$$\mathcal{F}^{(-ift)}\big(w_G(t)\big) = G\lim_{n\to\infty} W_n(f). \tag{2.59d}$$

We expect the sequence of true functions $W_1(f), W_2(f), \ldots, W_n(f), \ldots$ also to give a generalized
function when we take the generalized limit of the sequence,

$$W_G(f) = G\lim_{n\to\infty} W_n(f), \tag{2.59e}$$

and we define the inverse Fourier transform of this generalized function to be $w_G(t)$,

$$\mathcal{F}^{(itf)}\big(W_G(f)\big) = G\lim_{n\to\infty} w_n(t) = w_G(t). \tag{2.59f}$$

The double-arrow notation introduced in the discussion after Eq. (2.35d) can be used to
restate this definition more concisely. We define that whenever

$$w_1(t),\ w_2(t),\ \ldots \ \to\ w_G(t)$$

is true, and that whenever

$$W_1(f),\ W_2(f),\ \ldots \ \to\ W_G(f)$$

is true, and that whenever

$$w_1(t) \leftrightarrow W_1(f),\quad w_2(t) \leftrightarrow W_2(f),\quad \ldots,\quad w_n(t) \leftrightarrow W_n(f),\quad \ldots$$

is true for all $n$, it must also be true that

$$w_G(t) \leftrightarrow W_G(f) \tag{2.59g}$$

for the generalized functions given by the generalized limits of sequences
$w_1(t), w_2(t), \ldots$ and $W_1(f), W_2(f), \ldots$.

Now at last we can attach a meaning to the Fourier transform pair that could not be completed
in Eqs. (2.43d)–(2.43f). The explicit development that follows is perhaps somewhat long, but
worth doing to show how to construct the Fourier transforms of some of the functions violating
one or more of requirements (V) through (VIII) in Sec. 2.4. We create the sequence

$$\operatorname{sgn}(f)\,\Pi(f,1),\quad \operatorname{sgn}(f)\,\Pi(f,2),\quad \ldots,\quad \operatorname{sgn}(f)\,\Pi(f,n),\quad \ldots$$

and define the generalized sgn function by

$$"\operatorname{sgn}(f)" = G\lim_{n\to\infty}\big[\operatorname{sgn}(f)\,\Pi(f,n)\big], \tag{2.60a}$$

where quotes are used to indicate that the "sgn(f)" is a generalized function instead of the
true function sgn(f) defined in Eq. (2.42c) above. The reason for this choice of sequence is
straightforward: function $[\operatorname{sgn}(f)\,\Pi(f,n)]$ satisfies requirements (V) through (VIII) in Sec. 2.4
for every finite value of $n$ and so has a well-defined Fourier transform; as $n$ increases, function
$[\operatorname{sgn}(f)\,\Pi(f,n)]$ resembles ever more closely the sgn(f) function to which we want to give a
Fourier transform. We note that for any test function $\phi$

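$$\begin{aligned}
\int_{-\infty}^{\infty} \phi(f)\,"\operatorname{sgn}(f)"\,df &= \int_{-\infty}^{\infty} \phi(f)\,G\lim_{n\to\infty}\big[\operatorname{sgn}(f)\,\Pi(f,n)\big]\,df \\
&= \lim_{n\to\infty}\int_{-\infty}^{\infty} \phi(f)\,\operatorname{sgn}(f)\,\Pi(f,n)\,df \\
&= \lim_{n\to\infty}\int_{-n}^{n} \phi(f)\,\operatorname{sgn}(f)\,df \\
&= \int_{-\infty}^{\infty} \phi(f)\,\operatorname{sgn}(f)\,df,
\end{aligned}$$

so

$$"\operatorname{sgn}(f)" = \operatorname{sgn}(f) \tag{2.60b}$$

in the sense of Eq. (2.48c). This equivalence can be used to justify dropping the distinction
between "sgn(f)" and sgn(f). Applied mathematicians who work with generalized functions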
often drop the distinction between a generalized function and the true function to which it is
equivalent, and the double-quote notation introduced here is not standard usage. There is,
however, no harm in keeping track of the distinction between the two types of functions, and the
double quotes acknowledge the close relationship of the two functions while reminding us that
they are not the same.
The inverse Fourier transform of $[-i\pi\,\operatorname{sgn}(f)\,\Pi(f,n)]$ is, using the identity
$e^{i\theta} = \cos\theta + i\sin\theta$,

$$\mathcal{F}^{(itf)}\big(-i\pi\,\operatorname{sgn}(f)\,\Pi(f,n)\big) = -i\pi\int_{-\infty}^{\infty} e^{2\pi i f t}\,\operatorname{sgn}(f)\,\Pi(f,n)\,df = 2\pi\int_{0}^{n} \sin(2\pi f t)\,df.$$

In the last step, we use that

$$[\cos(2\pi f t)\,\operatorname{sgn}(f)\,\Pi(f,n)],$$

which is an odd function of $f$, has an integral that is zero according to Eq. (2.17); and that the integral
between $-n$ and $n$ of $[\sin(2\pi f t)\,\operatorname{sgn}(f)]$, which is an even function of $f$, is twice the value of its
integral from zero to $n$ according to Eq. (2.19). Making the substitution $f' = 2\pi t f$ gives

$$\mathcal{F}^{(itf)}\big(-i\pi\,\operatorname{sgn}(f)\,\Pi(f,n)\big) = -\frac{1}{t}\Big[\cos f'\Big]_{0}^{2\pi n t}.$$

This shows that the inverse Fourier transform of $[-i\pi\,\operatorname{sgn}(f)\,\Pi(f,n)]$ is

$$\mathcal{F}^{(itf)}\big(-i\pi\,\operatorname{sgn}(f)\,\Pi(f,n)\big) = t^{-1}\big[1 - \cos(2\pi n t)\big].$$

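Now we calculate the forward Fourier transform of $(1/t)[1 - \cos(2\pi n t)]$. We get

$$\begin{aligned}
\mathcal{F}^{(-ift)}\big(t^{-1}[1 - \cos(2\pi n t)]\big) &= \int_{-\infty}^{\infty} e^{-2\pi i f t}\,t^{-1}\,[1 - \cos(2\pi n t)]\,dt \\
&= \int_{-\infty}^{\infty} e^{-2\pi i f t}\,\frac{dt}{t} - \int_{-\infty}^{\infty} e^{-2\pi i f t}\,\frac{1}{t}\cos(2\pi n t)\,dt \\
&= -i\pi\,\operatorname{sgn}(f) + i\int_{-\infty}^{\infty} \frac{1}{t}\cos(2\pi n t)\,\sin(2\pi f t)\,dt.
\end{aligned}$$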
In the last step, Eq. (2.43d) is used to evaluate the integral of $[e^{-2\pi i f t}\,t^{-1}]$; we also substitute
$e^{i\theta} = \cos\theta + i\sin\theta$ into the integral of $[e^{-2\pi i f t}\,t^{-1}\cos(2\pi n t)]$, discovering that the Cauchy principal
value of the integral of $[t^{-1}\cos(2\pi f t)\cos(2\pi n t)]$, which is an odd function in $t$, is zero [see Eq.
(2.17)]. The remaining integral over the even function

$$[t^{-1}\sin(2\pi f t)\cos(2\pi n t)]$$

can be simplified by applying Eq. (2.19) and then consulting a table of definite integrals,²²

$$\begin{aligned}
\int_{-\infty}^{\infty} \frac{1}{t}\cos(2\pi n t)\,\sin(2\pi f t)\,dt &= 2\operatorname{sgn}(f)\int_{0}^{\infty} \frac{1}{t}\cos(2\pi n t)\,\sin(2\pi |f| t)\,dt \\
&= \pi\,\operatorname{sgn}(f)\,\Pi(2\pi n, 2\pi |f|) = \pi\,\operatorname{sgn}(f)\,\Pi(n, |f|).
\end{aligned}$$

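We conclude that the forward Fourier transform of $(1/t)[1 - \cos(2\pi n t)]$ is

$$\begin{aligned}
\mathcal{F}^{(-ift)}\big(t^{-1}[1 - \cos(2\pi n t)]\big) &= \operatorname{sgn}(f)\big[-i\pi + i\pi\,\Pi(n,|f|)\big] = -i\pi\,\operatorname{sgn}(f)\big[1 - \Pi(n,|f|)\big] \\
&= -i\pi\,\operatorname{sgn}(f)\,\Pi(f,n).
\end{aligned}$$

Hence, $(1/t)[1 - \cos(2\pi n t)]$ and $[-i\pi\,\operatorname{sgn}(f)\,\Pi(f,n)]$ are a Fourier-transform pair,

$$\frac{1}{t}\big[1 - \cos(2\pi n t)\big] \leftrightarrow -i\pi\,\operatorname{sgn}(f)\,\Pi(f,n).$$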

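This confirms that there are two sequences

$$\frac{1}{t}\big[1 - \cos(2\pi t)\big],\quad \frac{1}{t}\big[1 - \cos(4\pi t)\big],\quad \ldots,\quad \frac{1}{t}\big[1 - \cos(2\pi n t)\big],\quad \ldots \tag{2.60c}$$

and

$$-i\pi\,\operatorname{sgn}(f)\,\Pi(f,1),\quad -i\pi\,\operatorname{sgn}(f)\,\Pi(f,2),\quad \ldots,\quad -i\pi\,\operatorname{sgn}(f)\,\Pi(f,n),\quad \ldots$$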

22
I. S. Gradshteyn and I. M. Ryzhik, Table of Integrals, Series, and Products, p. 453, formula 2 in Sec. 3.741 with
$a = 2\pi|f|$ and $b = 2\pi n$.
such that each member of the lower sequence is the forward Fourier transform of the
corresponding member of the upper sequence and each member of the upper sequence is the
inverse Fourier transform of the corresponding member of the lower sequence. We know from
(2.60a) and (2.60b) that the generalized function given by the generalized limit of the lower
sequence is

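$$G\lim_{n\to\infty}\big[-i\pi\,\operatorname{sgn}(f)\,\Pi(f,n)\big] = -i\pi\,G\lim_{n\to\infty}\big[\operatorname{sgn}(f)\,\Pi(f,n)\big] = -i\pi\,"\operatorname{sgn}(f)" = -i\pi\,\operatorname{sgn}(f), \tag{2.60d}$$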

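but what is the generalized function given by the generalized limit of the upper sequence? We
have for any test function $\phi$

$$\begin{aligned}
\int_{-\infty}^{\infty} \phi(t)\,G\lim_{n\to\infty}\big\{t^{-1}[1 - \cos(2\pi n t)]\big\}\,dt &= \lim_{n\to\infty}\int_{-\infty}^{\infty} \phi(t)\,\frac{1}{t}\big[1 - \cos(2\pi n t)\big]\,dt \\
&= \lim_{n\to\infty}\left[\int_{-\infty}^{\infty} \phi(t)\,\frac{dt}{t} - \int_{-\infty}^{\infty} \phi(t)\,\frac{dt}{t}\cos(2\pi n t)\right] \\
&= \int_{-\infty}^{\infty} \phi(t)\,\frac{dt}{t} - \lim_{n\to\infty}\int_{-\infty}^{\infty} \phi(t)\,\frac{1}{t}\cos(2\pi n t)\,dt.
\end{aligned} \tag{2.60e}$$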

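Working with the limit of the integral containing $\cos(2\pi n t)$, we write

$$\begin{aligned}
\lim_{n\to\infty}\int_{-\infty}^{\infty} \phi(t)\,\frac{1}{t}\cos(2\pi n t)\,dt &= \lim_{n\to\infty}\int_{-\infty}^{-\epsilon} \phi(t)\,\frac{1}{t}\cos(2\pi n t)\,dt \\
&\quad + \lim_{n\to\infty}\int_{-\epsilon}^{\epsilon} \phi(t)\,\frac{1}{t}\cos(2\pi n t)\,dt \\
&\quad + \lim_{n\to\infty}\int_{\epsilon}^{\infty} \phi(t)\,\frac{1}{t}\cos(2\pi n t)\,dt,
\end{aligned} \tag{2.60f}$$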

where $\epsilon$ is a small positive number. By making all the test functions $\phi(t)$ have finite variation as
in requirement (VIII) in Sec. 2.4, we recognize the first and third integrals on the right-hand side
of (2.60f) become zero as $n \to \infty$, because eventually the cosine oscillates both positive and
negative over each infinitesimal interval while $\phi(t)/t$ barely changes at all; the integrals can be
made as small as desired by picking a large enough value of $n$. For future use, we note that for
any continuous, finite-variation test function $\phi$,

$$\lim_{n\to\infty}\int_{-\infty}^{\infty} \phi(t)\sin(nt)\,dt = \lim_{n\to\infty}\int_{-\infty}^{\infty} \phi(t)\cos(nt)\,dt = \lim_{n\to\infty}\int_{-\infty}^{\infty} \phi(t)\,e^{int}\,dt = 0,$$

so that

$$G\lim_{n\to\infty}\sin(nt) = G\lim_{n\to\infty}\cos(nt) = G\lim_{n\to\infty} e^{int} = 0. \tag{2.60g}$$

The middle integral in Eq. (2.60f) can be written as

$$\int_{-\epsilon}^{\epsilon} \phi(t)\cos(2\pi n t)\,\frac{dt}{t} \cong \phi(0)\int_{-\infty}^{\infty} \frac{1}{t}\,\Pi(t,\epsilon)\cos(2\pi n t)\,dt,$$

where we have chosen $\epsilon$ small enough that $\phi(t)$ barely changes over the interval of integration, letting us
replace it by $\phi(0)$. Now the middle integral on the right-hand side of (2.60f) can be recognized as
the Cauchy principal value of the integral of $(1/t)\,\Pi(t,\epsilon)\cos(2\pi n t)$, which is an odd function of $t$
and must be zero according to Eq. (2.17). Hence, (2.60f) becomes

$$\lim_{n\to\infty}\int_{-\infty}^{\infty} \phi(t)\,\frac{1}{t}\cos(2\pi n t)\,dt = 0,$$

which shows that (2.60e) simplifies to

$$\int_{-\infty}^{\infty} \phi(t)\,G\lim_{n\to\infty}\big\{t^{-1}[1 - \cos(2\pi n t)]\big\}\,dt = \int_{-\infty}^{\infty} \phi(t)\,\frac{dt}{t} \tag{2.60h}$$

for any test function $\phi$. Since (2.60h) denotes equality in the sense of Eq. (2.48c), we can define
the generalized function "$t^{-1}$" to be

$$"t^{-1}" = G\lim_{n\to\infty}\big\{t^{-1}[1 - \cos(2\pi n t)]\big\} \tag{2.60i}$$

and then note that Eq. (2.60h) now states that

$$"t^{-1}" = t^{-1}. \tag{2.60j}$$
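As a rough numerical illustration of (2.60h), the sketch below integrates one member of the upper sequence in (2.60c) against a test function and compares the result with the principal-value integral on the right-hand side. The test function $\phi(t) = (1+t)e^{-t^2}$ is an assumption chosen for this example (it does not appear in the text); for it the principal value of $\int \phi(t)\,dt/t$ is $\sqrt{\pi}$, because the odd part $e^{-t^2}/t$ integrates to zero. The helper names and the midpoint rule are likewise illustrative choices.

```python
import math

def integrate_midpoint(g, a, b, steps):
    """Midpoint rule on [a, b]. On a symmetric interval with an even number
    of steps the sample points straddle t = 0 and never land on it."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))

def phi(t):
    # Hypothetical test function (not from the text): (1 + t) exp(-t^2).
    return (1.0 + t) * math.exp(-t * t)

def sequence_member(t, n):
    # n-th member of the upper sequence in (2.60c): [1 - cos(2 pi n t)] / t
    return (1.0 - math.cos(2.0 * math.pi * n * t)) / t

def smeared(n, half_width=8.0, steps=40_000):
    # Integral of phi(t) times the n-th sequence member over a finite window;
    # the Gaussian factor makes the neglected tails negligible.
    return integrate_midpoint(lambda t: phi(t) * sequence_member(t, n),
                              -half_width, half_width, steps)

principal_value = math.sqrt(math.pi)  # Cauchy principal value of the integral of phi(t)/t
for n in (1, 2, 4):
    print(n, smeared(n), principal_value)
```

For this particular $\phi$ the integral can also be done in closed form, giving $\sqrt{\pi}\,(1 - e^{-\pi^2 n^2})$, so the printed values should approach $\sqrt{\pi}$ very quickly as $n$ grows.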

Equations (2.60d) and (2.60j) show that $[-i\pi\,"\operatorname{sgn}(f)"]$ and "$t^{-1}$" are the generalized limits of the
two sequences in (2.60c). Because all the sequence members are Fourier transform pairs, we
know, according to (2.59g), that $[-i\pi\,"\operatorname{sgn}(f)"]$ and "$t^{-1}$" are a Fourier transform pair even
though $[-i\pi\,\operatorname{sgn}(f)]$ and $t^{-1}$ do not satisfy requirements (V) through (VIII) in Sec. 2.4 and, as
shown in Eqs. (2.43a) and (2.43f), their transforms cannot be evaluated as standard integrals. In
this sense, we can write that

$$\mathcal{F}^{(-ift)}\big(t^{-1}\big) = -i\pi\,\operatorname{sgn}(f) \tag{2.60k}$$

and

$$\mathcal{F}^{(ift)}\big(-i\pi\,\operatorname{sgn}(f)\big) = t^{-1}. \tag{2.60$\ell$}$$

This can also be written as, reversing the sign of $f$ in (2.60k), the sign of $t$ in (2.60ℓ), and using
Eq. (2.42c) to get that $\operatorname{sgn}(-f) = -\operatorname{sgn}(f)$,

$$\mathcal{F}^{(ift)}\big(t^{-1}\big) = i\pi\,\operatorname{sgn}(f) \tag{2.60m}$$

and

$$\mathcal{F}^{(-ift)}\big(i\pi\,\operatorname{sgn}(f)\big) = t^{-1}. \tag{2.60n}$$

It is important to remember that Eqs. (2.60k) and (2.60m) are true only when integrals between
$-\infty$ and $+\infty$ are interpreted as Cauchy principal values and (2.60ℓ) and (2.60n) are true only
when equality is defined as in Eq. (2.48c) using generalized function theory. Strictly speaking, it
might be better to say that the Cauchy principal value of

$$\int_{-\infty}^{\infty} e^{-2\pi i f t}\,\frac{dt}{t} \quad\text{is}\quad -i\pi\,\operatorname{sgn}(f)$$

and that

$$\int_{-\infty}^{\infty} e^{2\pi i f t}\,\big[-i\pi\,"\operatorname{sgn}(f)"\big]\,df = "t^{-1}".$$

This is the reason that

$$\int_{-\infty}^{\infty} e^{-2\pi i f t}\,\frac{dt}{t} = -i\pi\,\operatorname{sgn}(f) \tag{2.61a}$$

is usually not listed in standard tables of improper integrals without notation showing that it is a
Cauchy principal value, and the equality

$$\int_{-\infty}^{\infty} e^{2\pi i f t}\,\big[\operatorname{sgn}(f)\big]\,df = \frac{i}{\pi t} \tag{2.61b}$$

is usually not listed in these tables under any circumstances. It is also true, however, that (2.61a)
and (2.61b) are constantly used either explicitly or implicitly in Fourier-transform theory; and
lists of Fourier-transform pairs often contain (2.61a) and (2.61b). Unfortunately, it is standard
practice in the Fourier-transform tables that do list these integrals to omit any explanation that
they are only true when interpreted as the Fourier transforms of generalized functions. In general,
when using tables of Fourier transforms, all those transforms that do not exist as standard
integrals or Cauchy principal values should be interpreted as the transforms of generalized
functions and used only in the context of generalized function theory.
2.14 The Delta Function
The most popular and useful generalized function is the Dirac delta function, a name usually
shortened to just the delta function. In a sense, the Secs. 2.11–2.13 describing generalized
function theory are there just so we can give a mathematically exact description of the delta
function. The delta function is often inexactly described in elementary textbooks as that function
$\delta(t)$ such that

$$\delta(t) = \begin{cases} \infty & \text{for } t = 0 \\ 0 & \text{for } t \neq 0 \end{cases} \tag{2.62a}$$

with

$$\int_{a}^{b} \delta(t)\,f(t)\,dt = \begin{cases} f(0) & \text{for } a < 0 < b \\ 0 & \text{for } a < b < 0 \text{ or } 0 < a < b. \end{cases} \tag{2.62b}$$

More sophisticated textbooks may define it as a standard limit, for example,

1
( ) lim[ ( , )]
n
t n t n o

H (2.63a)
or

$$\delta(t) = \lim_{n\to\infty} \sqrt{\frac{n}{\pi}}\;e^{-n t^2}. \tag{2.63b}$$

There are, in fact, two different (but equivalent) mathematically exact ways to define the delta
function. The first way is to create a well-defined functional $\delta$ that, when operating on a
complex-valued test function $\phi(t)$ with a real argument $t$, produces as its complex number $\phi(0)$,
the value of $\phi$ at $t$ equal to zero,

$$\delta(\phi) = \phi(0). \tag{2.64a}$$
This makes $\delta(t)$ the generalized function associated with functional $\delta$, with $\delta(t)$ having the
property that

$$\int_{-\infty}^{\infty} \delta(t)\,\phi(t)\,dt = \phi(0) \tag{2.64b}$$

for all test functions $\phi$. The second way to define $\delta(t)$ is to say it is the generalized limit of a
sequence such as the ones specified in (2.63a) and (2.63b),

1
( ) lim[ ( , )]
n
t G n t n o

H (2.65a)
or

$$\delta(t) = G\lim_{n\to\infty} \sqrt{\frac{n}{\pi}}\;e^{-n t^2}. \tag{2.65b}$$

Although the delta function is a generalized function in every sense of the term, we follow
standard notation and do not add the G subscript, or add the quotes used to label other
generalized functions in this chapter.
Defining $\delta(t)$ with a functional, as in (2.64a), shows that this generalized function can be
used on an extremely large set of test functions: any true function that is continuous at the origin
is an acceptable and appropriate test function. The subset of test functions $\phi_{ab}$ used in Eqs.
(2.48d)–(2.48g) has $a < b$ with $\phi_{ab}(t)$ automatically set to zero when $t$ does not lie inside the
interval $a < t < b$. These functions can be used in (2.64b) to show that

$$\int_{-\infty}^{\infty} \delta(t)\,\phi_{ab}(t)\,dt = \phi_{ab}(0) = 0$$

when $a < b < 0$ or $0 < a < b$. Therefore, we have

$$\delta(t) = 0 \quad\text{for } t \neq 0 \tag{2.65c}$$

in the sense of definition (2.48f); that is, we know that

$$\int_{-\infty}^{\infty} \delta(t)\,\phi_{ab}(t)\,dt = 0 = \int_{-\infty}^{\infty} \phi_{ab}(t)\cdot 0\;dt$$

for all test functions $\phi_{ab}$ where the interval $a < t < b$ does not include $t = 0$. This is a
mathematically exact way of stating the lower level of Eq. (2.62a). If $\delta(t)$ is defined using
generalized limits, as in Eqs. (2.65a) and (2.65b), then we must show why Eq. (2.64b) is true. The
sequence in (2.65b), for example, leads to

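$$\begin{aligned}
\int_{-\infty}^{\infty} \Big[G\lim_{n\to\infty}\sqrt{\frac{n}{\pi}}\,e^{-nt^2}\Big]\phi(t)\,dt &= \lim_{n\to\infty}\int_{-\infty}^{\infty} \sqrt{\frac{n}{\pi}}\,e^{-nt^2}\,\phi(t)\,dt = \lim_{n\to\infty}\int_{-\infty}^{\infty} \sqrt{\frac{n}{\pi}}\,e^{-nt^2}\,\phi(0)\,dt \\
&= \phi(0)\,\lim_{n\to\infty}\sqrt{\frac{n}{\pi}}\int_{-\infty}^{\infty} e^{-nt^2}\,dt \\
&= \phi(0)
\end{aligned} \tag{2.66}$$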

for any test function $\phi$. As $n$ gets large in (2.66), only the value of $\phi$ at $t = 0$ can contribute
significantly to the integral. Replacing $\phi(t)$ by $\phi(0)$ quickly reduces the whole expression to
$\phi(0)$, showing that the generalized limit of the sequence in (2.65b) is indeed the delta function.
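The limit in (2.66) can be watched numerically. The sketch below, offered only as an illustration, integrates the Gaussian sequence of (2.65b) against the assumed test function $\phi(t) = \cos t$ (a choice made for this example, not taken from the text). For that $\phi$ the integral has the closed form $e^{-1/(4n)}$, which tends to $\phi(0) = 1$.

```python
import math

def integrate_midpoint(g, a, b, steps):
    # Plain midpoint rule on [a, b].
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))

def delta_n(t, n):
    # n-th member of the sequence in (2.65b): sqrt(n/pi) * exp(-n t^2)
    return math.sqrt(n / math.pi) * math.exp(-n * t * t)

def smeared(n, half_width=10.0, steps=40_000):
    # Integral of delta_n(t) * phi(t) with the hypothetical choice phi(t) = cos(t).
    return integrate_midpoint(lambda t: delta_n(t, n) * math.cos(t),
                              -half_width, half_width, steps)

for n in (1, 10, 100, 1000):
    # Columns: n, numerical integral, closed form exp(-1/(4n)); both tend to cos(0) = 1.
    print(n, smeared(n), math.exp(-1.0 / (4.0 * n)))
```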
Some commonly used sequences that have the delta function as their generalized limits are

$$\delta(t) = G\lim_{n\to\infty} \frac{n}{\pi\,(1 + n^2 t^2)}, \tag{2.67a}$$

$$\delta(t) = G\lim_{n\to\infty} \frac{\sin^2(nt)}{n\pi t^2}, \tag{2.67b}$$

$$\delta(t) = G\lim_{n\to\infty} \frac{\sin(2\pi n t)}{\pi t}, \tag{2.67c}$$

and so on. Perhaps the most interesting of these sequences is (2.67c). We know from (2.65c) that
one important property of the delta function is

$$\int_{-\infty}^{\infty} \delta(t)\,\phi_{ab}(t)\,dt = 0$$

whenever the interval $a < t < b$ does not include $t = 0$. The reason that

$$\int_{-\infty}^{\infty} \Big[G\lim_{n\to\infty}\frac{\sin(2\pi n t)}{\pi t}\Big]\phi_{ab}(t)\,dt = \lim_{n\to\infty}\int_{-\infty}^{\infty} \frac{\sin(2\pi n t)}{\pi t}\,\phi_{ab}(t)\,dt = 0$$

when the interval $a < t < b$ does not include $t = 0$ is that for extremely large $n$ values the sine
oscillates rapidly between $+1$ and $-1$ while $\phi_{ab}(t)/t$ stays essentially constant for $t \neq 0$, averaging
the integrand to zero. Hence,

$$G\lim_{n\to\infty}\frac{\sin(2\pi n t)}{\pi t} = 0 \quad\text{for } t \neq 0$$

for the same reason that

$$G\lim_{n\to\infty} e^{int} = 0$$

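in Eq. (2.60g). To understand the behavior near $t = 0$, we construct function $\phi_{a0b}(t)$ in which the
interval $a < t < b$ does include $t = 0$. Now we can write, transforming the variable of integration
to $t' = 2\pi n t$,

$$\begin{aligned}
\int_{-\infty}^{\infty} \Big[G\lim_{n\to\infty}\frac{\sin(2\pi n t)}{\pi t}\Big]\phi_{a0b}(t)\,dt &= \lim_{n\to\infty}\int_{-\infty}^{\infty} \frac{1}{\pi}\,\frac{\sin(t')}{t'}\,\phi_{a0b}\!\Big(\frac{t'}{2\pi n}\Big)\,dt' \\
&= \phi_{a0b}(0)\,\lim_{n\to\infty}\frac{1}{\pi}\int_{-\infty}^{\infty} \frac{\sin(t')}{t'}\,dt' = \phi_{a0b}(0) = \int_{-\infty}^{\infty} \delta(t)\,\phi_{a0b}(t)\,dt,
\end{aligned}$$

where in the second-to-last step we use (see any handbook of definite integrals)

$$\int_{-\infty}^{\infty} \frac{\sin(t)}{t}\,dt = \pi.$$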

Any arbitrary test function can be written as a function $\phi_{a0b}(t)$ whose interval of nonzero values
includes $t = 0$ plus other test functions whose intervals of nonzero values do not include $t = 0$;
that is, we can always write $\phi(t) = \phi_{a0b}(t) + [\text{other functions zero at the origin}]$. When this $\phi(t)$ is
multiplied by $G\lim_{n\to\infty}\sin(2\pi n t)/(\pi t)$ and integrated over $t$ between $-\infty$ and $+\infty$, we realize that the
value of the integral is $\phi_{a0b}(0) = \phi(0)$ because the other functions that are zero at the origin give
zero contribution to the integral as $n \to \infty$. Consequently,

$$\int_{-\infty}^{\infty} \Big[G\lim_{n\to\infty}\frac{\sin(2\pi n t)}{\pi t}\Big]\phi(t)\,dt = \phi(0) = \int_{-\infty}^{\infty} \delta(t)\,\phi(t)\,dt,$$

indicating that the generalized limit of the sequence

$$\frac{\sin(2\pi n t)}{\pi t}$$

equals the delta function in the only sense that two generalized functions can ever be equal: the
integral of the left-hand side with any test function $\phi$ is always the same as the integral of the
right-hand side with any test function $\phi$ [see discussion after Eq. (2.47b)]. Figures 2.7(a)–2.7(c)
and 2.8(a)–2.8(c) plot the behavior of the $\sqrt{n/\pi}\;e^{-nt^2}$ and $(\pi t)^{-1}\sin(2\pi n t)$ sequences, showing the
two different ways these sequences change into delta functions.
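The oscillatory sequence (2.67c) can be checked in the same spirit. This sketch assumes a narrow Gaussian test function $\phi(t) = e^{-at^2}$ with $a = 100$ (an illustrative choice, not from the text), for which the integral against $\sin(2\pi n t)/(\pi t)$ equals $\operatorname{erf}(\pi n/\sqrt{a})$ and so tends to $\phi(0) = 1$ as $n$ grows. Unlike the Gaussian sequence, nothing here becomes narrow in a pointwise sense; the convergence comes from cancellation of the rapid oscillations.

```python
import math

def integrate_midpoint(g, a, b, steps):
    # Midpoint rule; with an even step count on a symmetric interval, t = 0 is never sampled.
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))

SHARPNESS = 100.0  # hypothetical parameter a in phi(t) = exp(-a t^2)

def phi(t):
    return math.exp(-SHARPNESS * t * t)

def sinc_member(t, n):
    # n-th member of the sequence in (2.67c): sin(2 pi n t) / (pi t)
    return math.sin(2.0 * math.pi * n * t) / (math.pi * t)

def smeared(n, half_width=2.0, steps=40_000):
    return integrate_midpoint(lambda t: sinc_member(t, n) * phi(t),
                              -half_width, half_width, steps)

def closed_form(n):
    # Value of the same integral for this Gaussian phi, from a standard Gaussian-sine integral.
    return math.erf(math.pi * n / math.sqrt(SHARPNESS))

for n in (1, 5, 20):
    print(n, smeared(n), closed_form(n))
```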
We note that for any odd test function $\phi_o(t)$

$$\int_{-\infty}^{\infty} \delta(t)\,\phi_o(t)\,dt = \phi_o(0) = 0$$

because, according to Eq. (2.12a), odd functions are zero at the origin. Therefore, from the
definitions of even and odd generalized functions in Eqs. (2.52d) and (2.52e), we conclude that
the delta function is an even generalized function because its integral with all odd test functions is
always zero. This means we can write [see Eq. (2.52f)]

$$\delta(-t) = \delta(t). \tag{2.68a}$$

From the behavior of generalized functions specified in Eq. (2.52a), we have

$$\int_{-\infty}^{\infty} \delta(t - t_0)\,\phi(t)\,dt = \int_{-\infty}^{\infty} \delta(t)\,\phi(t + t_0)\,dt = \phi(t_0)$$

and, because the delta function equals the zero function for $t \neq 0$, this result can be written as

$$\int_{a}^{b} \delta(t - t_0)\,\phi(t)\,dt = \begin{cases} 0 & \text{for } a < b < t_0 \text{ or } t_0 < a < b \\ \phi(t_0) & \text{for } a < t_0 < b. \end{cases} \tag{2.68b}$$

From Eq. (2.52b), we have

$$\int_{-\infty}^{\infty} \delta(ct)\,\phi(t)\,dt = \frac{1}{|c|}\int_{-\infty}^{\infty} \delta(t)\,\phi(t/c)\,dt = \frac{1}{|c|}\,\phi(0),$$

from which we conclude that

$$\delta(ct) = \frac{1}{|c|}\,\delta(t) \tag{2.68c}$$

because

$$\int_{-\infty}^{\infty} \delta(ct)\,\phi(t)\,dt = \frac{1}{|c|}\,\phi(0) = \int_{-\infty}^{\infty} \Big[\frac{1}{|c|}\,\delta(t)\Big]\phi(t)\,dt$$

for all test functions $\phi$. We note that this last rule, Eq. (2.68c), can also be used to show that the
delta function is even, since (2.68a) is just a special case of (2.68c) with $c = -1$.

FIGURES 2.7(a)–2.7(c). How $\sqrt{n/\pi}\;e^{-nt^2}$ changes into a delta function of $t$ as $n$ increases (horizontal axis $t$, origin marked 0).

FIGURES 2.8(a)–2.8(c). How $(\pi t)^{-1}\sin(2\pi n t)$ changes into a delta function of $t$ as $n$ increases (horizontal axis $t$, origin marked 0).

Equation (2.52c) shows that there is no difficulty handling a general linear transformation of
the delta function's argument, because for any two real constants $a$ and $c$, we have

$$\int_{-\infty}^{\infty} \delta(at - c)\,\phi(t)\,dt = \frac{1}{|a|}\int_{-\infty}^{\infty} \delta(t)\,\phi\big((t + c)/a\big)\,dt = \frac{1}{|a|}\,\phi\Big(\frac{c}{a}\Big) = \frac{1}{|a|}\int_{-\infty}^{\infty} \delta\Big(t - \frac{c}{a}\Big)\phi(t)\,dt$$

for all test functions $\phi$. Consequently,

$$\delta(at - c) = \frac{1}{|a|}\,\delta\Big(t - \frac{c}{a}\Big). \tag{2.68d}$$

This is the same answer we would get from factoring $a$ out of the delta function argument and
then using (2.68c) to rescale the delta function.
When the delta function is multiplied by a true function $v(t)$, we have

$$\int_{-\infty}^{\infty} \big[\delta(t - t_0)\,v(t)\big]\phi(t)\,dt = \int_{-\infty}^{\infty} \delta(t - t_0)\big[v(t)\,\phi(t)\big]\,dt = v(t_0)\,\phi(t_0) = \int_{-\infty}^{\infty} \big[\delta(t - t_0)\,v(t_0)\big]\phi(t)\,dt$$

for any test function $\phi$, from which we conclude that

$$v(t)\,\delta(t - t_0) = v(t_0)\,\delta(t - t_0). \tag{2.68e}$$

A useful generalization of (2.68d) is, for continuous true functions $u(t)$,

$$\delta\big(u(t)\big) = \sum_{\text{all } k} \frac{1}{|u'(t_k)|}\,\delta(t - t_k), \tag{2.68f}$$

where $u'(t) = du/dt$ and $t_1, t_2, \ldots$ are the values of $t$ for which $u(t) = 0$. This formula only makes
sense, of course, when $u'(t_k) \neq 0$ for $t_1, t_2, \ldots$. Perhaps the easiest way to see that (2.68f) must be
true is to note that the delta function equals the zero function whenever its argument is not zero.
Therefore,

$$\int_{-\infty}^{\infty} \delta\big(u(t)\big)\,\phi(t)\,dt = \sum_{\text{all } k}\int_{t_k - \epsilon}^{t_k + \epsilon} \delta\big(u(t)\big)\,\phi(t)\,dt \tag{2.68g}$$

with $\epsilon > 0$ taken to be small enough that each interval $t_k - \epsilon < t < t_k + \epsilon$ only includes one of the
$t_k$ values for which $u$ is zero. Nothing stops us from making $\epsilon$ as small as we please, as long as
it does not become zero, and eventually each integral on the right-hand side of (2.68g) can be
written as

$$\int_{t_k - \epsilon}^{t_k + \epsilon} \delta\big(u(t)\big)\,\phi(t)\,dt = \int_{t_k - \epsilon}^{t_k + \epsilon} \delta\big((t - t_k)\,u'(t_k)\big)\,\phi(t)\,dt,$$

where we expand $u$ as

$$u(t) \cong u(t_k) + (t - t_k)\,u'(t_k) = (t - t_k)\,u'(t_k)$$

since $u(t_k) = 0$. Next, we use (2.68d) to write

$$\delta\big((t - t_k)\,u'(t_k)\big) = \frac{1}{|u'(t_k)|}\,\delta(t - t_k),$$

so that

$$\begin{aligned}
\int_{t_k - \epsilon}^{t_k + \epsilon} \delta\big(u(t)\big)\,\phi(t)\,dt &= \frac{1}{|u'(t_k)|}\int_{t_k - \epsilon}^{t_k + \epsilon} \delta(t - t_k)\,\phi(t)\,dt \\
&= \frac{1}{|u'(t_k)|}\int_{-\infty}^{\infty} \delta(t - t_k)\,\phi(t)\,dt.
\end{aligned}$$

Substitution of this result back into (2.68g) gives

$$\int_{-\infty}^{\infty} \delta\big(u(t)\big)\,\phi(t)\,dt = \sum_{\text{all } k}\frac{1}{|u'(t_k)|}\int_{-\infty}^{\infty} \delta(t - t_k)\,\phi(t)\,dt = \int_{-\infty}^{\infty} \Big[\sum_{\text{all } k}\frac{1}{|u'(t_k)|}\,\delta(t - t_k)\Big]\phi(t)\,dt$$

for all test functions $\phi$. This justifies Eq. (2.68f) according to the definition for the equality of
generalized functions [see Eqs. (2.47a) and (2.47b)].
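Formula (2.68f) can be spot-checked numerically by standing in a narrow member of the Gaussian sequence (2.65b) for the delta function. The sketch below assumes $u(t) = t^2 - 1$ (zeros at $t = \pm 1$, where $|u'(\pm 1)| = 2$) and an arbitrary smooth test function; both are choices made for this example rather than anything taken from the text. According to (2.68f) the integral should approach $[\phi(1) + \phi(-1)]/2$.

```python
import math

def delta_n(x, n):
    # Narrow Gaussian standing in for the delta function, as in (2.65b).
    return math.sqrt(n / math.pi) * math.exp(-n * x * x)

def u(t):
    # Hypothetical example: zeros at t = -1 and t = +1, with u'(t) = 2t.
    return t * t - 1.0

def phi(t):
    # Hypothetical smooth test function.
    return math.exp(-0.5 * (t - 0.5) ** 2)

def lhs(n, a=-3.0, b=3.0, steps=120_000):
    # Midpoint-rule estimate of the integral of delta_n(u(t)) * phi(t) over [a, b].
    h = (b - a) / steps
    total = 0.0
    for k in range(steps):
        t = a + (k + 0.5) * h
        total += delta_n(u(t), n) * phi(t)
    return h * total

def rhs():
    # Right-hand side of (2.68f) for this u: sum over the zeros of phi(t_k) / |u'(t_k)|.
    return phi(1.0) / 2.0 + phi(-1.0) / 2.0

print(lhs(10_000), rhs())
```

The two printed numbers agree only approximately, because a finite $n$ is used; the gap shrinks as $n$ is increased (with the step count raised to keep the narrow peaks resolved).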
2.15 Derivatives of the Delta Function
We have already remarked that the set of test functions for $\delta(t)$ contains all functions that are
continuous at the origin. Changing the argument of the delta function changes the set of
appropriate test functions. In Eq. (2.68b), for example, the test functions must be continuous at
$t = t_0$; in (2.68d) they must be continuous at $t = c/a$; and in (2.68f) they must be continuous at
all $t = t_k$. When Eq. (2.53b) is used to define the derivative of a delta function, $\delta'(t)$, we have

$$\int_{-\infty}^{\infty} \delta'(t)\,\phi(t)\,dt = -\int_{-\infty}^{\infty} \delta(t)\,\phi'(t)\,dt = -\phi'(0), \tag{2.69a}$$

which shows that now the first derivative of all the test functions must be continuous at the
origin. If we start out with a test function $\phi_{ab}(t)$ that must be identically zero for all $t < a$ and for
all $t > b$, then Eq. (2.69a) becomes

$$\int_{-\infty}^{\infty} \delta'(t)\,\phi_{ab}(t)\,dt = -\int_{-\infty}^{\infty} \delta(t)\,\phi'_{ab}(t)\,dt = -\phi'_{ab}(0) = 0$$

whenever the interval $a < t < b$ does not contain the origin. Hence, we can write

$$\int_{-\infty}^{\infty} \delta'(t)\,\phi_{ab}(t)\,dt = 0 = \int_{-\infty}^{\infty} 0\cdot\phi_{ab}(t)\,dt$$

for $a < b < 0$ or $0 < a < b$, showing that $\delta'(t)$ equals the zero function in the sense of Eq. (2.48f)
for $t \neq 0$. Equation (2.52a) can be used in conjunction with (2.53b) to evaluate $\delta'(t)$ when it is
shifted from the origin by an amount $t_0$,

$$\int_{-\infty}^{\infty} \delta'(t - t_0)\,\phi(t)\,dt = \int_{-\infty}^{\infty} \delta'(t)\,\phi(t + t_0)\,dt = -\int_{-\infty}^{\infty} \delta(t)\,\phi'(t + t_0)\,dt = -\phi'(t_0), \tag{2.69b}$$

where now we require the first derivative of the test functions to be continuous at $t = t_0$. This
result can be applied to test functions $\phi_{ab}(t)$ to get

$$\int_{-\infty}^{\infty} \delta'(t - t_0)\,\phi_{ab}(t)\,dt = \int_{-\infty}^{\infty} \delta'(t)\,\phi_{ab}(t + t_0)\,dt = -\int_{-\infty}^{\infty} \delta(t)\,\phi'_{ab}(t + t_0)\,dt = -\phi'_{ab}(t_0) = 0$$

whenever the interval $a < t < b$ does not contain $t = t_0$. Therefore,

$$\int_{-\infty}^{\infty} \delta'(t - t_0)\,\phi_{ab}(t)\,dt = 0 = \int_{-\infty}^{\infty} 0\cdot\phi_{ab}(t)\,dt$$

whenever $a < b < t_0$ or $t_0 < a < b$, showing that $\delta'(t - t_0)$ equals the zero function [in the sense of
Eq. (2.48f)] for $t \neq t_0$. Equations (2.52a) and (2.53b) can be applied any number of times to get
$\delta^{(n)}$, the $n$th derivative of the delta function, shifted away from the origin by an amount $t_0$. We
have

$$\int_{-\infty}^{\infty} \delta^{(n)}(t - t_0)\,\phi(t)\,dt = -\int_{-\infty}^{\infty} \delta^{(n-1)}(t)\,\phi^{(1)}(t + t_0)\,dt = \int_{-\infty}^{\infty} \delta^{(n-2)}(t)\,\phi^{(2)}(t + t_0)\,dt = \cdots,$$

which eventually becomes

$$\int_{-\infty}^{\infty} \delta^{(n)}(t - t_0)\,\phi(t)\,dt = (-1)^n\,\phi^{(n)}(t_0) = (-1)^n\,\frac{d^n\phi}{dt^n}\bigg|_{t = t_0}. \tag{2.69c}$$

Again, this latest result can be applied to test functions $\phi_{ab}(t)$ to get

$$\int_{-\infty}^{\infty} \delta^{(n)}(t - t_0)\,\phi_{ab}(t)\,dt = (-1)^n\,\phi_{ab}^{(n)}(t_0) = 0$$

whenever the interval $a < t < b$ does not contain $t = t_0$. Because

$$\int_{-\infty}^{\infty} \delta^{(n)}(t - t_0)\,\phi_{ab}(t)\,dt = 0 = \int_{-\infty}^{\infty} 0\cdot\phi_{ab}(t)\,dt$$

whenever $t = t_0$ lies outside this interval, we end up with [using the definition of equality in
(2.48f)]

$$\delta^{(n)}(t - t_0) = 0 \quad\text{for } t \neq t_0. \tag{2.69d}$$

The test functions integrated with $\delta^{(n)}(t - t_0)$ must, of course, have their $n$th derivatives
continuous at $t = t_0$.
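The first-derivative rule (2.69b) can be illustrated by differentiating a member of the Gaussian sequence (2.65b) and integrating it against a test function. The sketch assumes $\phi(t) = \sin t$ and a shift $t_0 = 0.3$, both chosen only for this example. For this $\phi$ the integral with the $n$th Gaussian works out to $-\cos(t_0)\,e^{-1/(4n)}$, which tends to $-\phi'(t_0)$ as (2.69b) requires.

```python
import math

def delta_prime_n(t, n):
    # Derivative with respect to t of sqrt(n/pi) * exp(-n t^2), the sequence in (2.65b).
    return -2.0 * n * t * math.sqrt(n / math.pi) * math.exp(-n * t * t)

def smeared(n, t0, steps=40_000):
    # Midpoint-rule integral of delta_prime_n(t - t0) * sin(t) over [t0 - 1, t0 + 1];
    # for the n used here the Gaussian is negligible outside that window.
    a, b = t0 - 1.0, t0 + 1.0
    h = (b - a) / steps
    total = 0.0
    for k in range(steps):
        t = a + (k + 0.5) * h
        total += delta_prime_n(t - t0, n) * math.sin(t)
    return h * total

t0 = 0.3  # hypothetical shift
for n in (100, 1000, 10_000):
    # Columns: n, numerical integral, and the limiting value -phi'(t0) = -cos(t0).
    print(n, smeared(n, t0), -math.cos(t0))
```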
We define the function $\Theta(t)$ to be

$$\Theta(t) = \begin{cases} 1 & \text{for } t > 0 \\ 1/2 & \text{for } t = 0 \\ 0 & \text{for } t < 0. \end{cases} \tag{2.70a}$$

Function $\Theta$ is often called the Heaviside step function. If we take

$$\Theta^{(1)}(t) = \frac{d}{dt}\,\Theta(t) \tag{2.70b}$$

to be the first derivative of the $\Theta$ function, then $\Theta^{(1)}(t) = 0$ for all $t \neq 0$. To evaluate $\Theta^{(1)}(t)$ at
the origin, we decide to turn $\Theta(t)$ and $\Theta^{(1)}(t)$ into generalized functions that we call "$\Theta(t)$" and
"$\Theta^{(1)}(t)$" respectively. We define

$$\int_{-\infty}^{\infty} "\Theta(t)"\,\phi(t)\,dt = \int_{-\infty}^{\infty} \Theta(t)\,\phi(t)\,dt$$

for all test functions $\phi$, which means that, according to Eqs. (2.48b) and (2.48c),

$$"\Theta(t)" = \Theta(t). \tag{2.70c}$$

Having established the generalized function "$\Theta(t)$", we know from Eq. (2.53b) that the
generalized function "$\Theta^{(1)}(t)$" must satisfy

$$\int_{-\infty}^{\infty} "\Theta^{(1)}(t)"\,\phi(t)\,dt = -\int_{-\infty}^{\infty} "\Theta(t)"\,\phi'(t)\,dt. \tag{2.70d}$$

A formal integration by parts of the left-hand side gives

$$\int_{-\infty}^{\infty} "\Theta^{(1)}(t)"\,\phi(t)\,dt = \Big["\Theta(t)"\,\phi(t)\Big]_{-\infty}^{\infty} - \int_{-\infty}^{\infty} "\Theta(t)"\,\phi'(t)\,dt.$$

This becomes, using (2.70c) to remove the double quotes,

$$\begin{aligned}
\int_{-\infty}^{\infty} "\Theta^{(1)}(t)"\,\phi(t)\,dt &= \lim_{t\to\infty}\Theta(t)\,\phi(t) - \int_{0}^{\infty} \phi'(t)\,dt \\
&= \lim_{t\to\infty}\phi(t) + \phi(0) - \lim_{t\to\infty}\phi(t) \\
&= \phi(0) = \int_{-\infty}^{\infty} \delta(t)\,\phi(t)\,dt.
\end{aligned}$$

Hence, for all test functions $\phi$ continuous at the origin (note that they do not have to approach
zero at $\pm\infty$), we have

$$\int_{-\infty}^{\infty} "\Theta^{(1)}(t)"\,\phi(t)\,dt = \int_{-\infty}^{\infty} \delta(t)\,\phi(t)\,dt,$$

so

$$"\Theta^{(1)}(t)" = \frac{d}{dt}\,"\Theta(t)" = \delta(t) \tag{2.70e}$$

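in the sense of Eq. (2.47b). There is nothing unique about the Heaviside step function. We can
also show, using the generalized function "sgn(t)" introduced in Eqs. (2.60a) and (2.60b) above,
that for any test function $\phi$

$$\int_{-\infty}^{\infty} \frac{1}{2}\,"\operatorname{sgn}^{(1)}(t)"\,\phi(t)\,dt = \int_{-\infty}^{\infty} \delta(t)\,\phi(t)\,dt, \tag{2.70f}$$

where "$\operatorname{sgn}^{(1)}(t)$" is the first derivative of "sgn(t)". To show this is true, we do a formal
integration by parts,

$$\int_{-\infty}^{\infty} \frac{1}{2}\,"\operatorname{sgn}^{(1)}(t)"\,\phi(t)\,dt = \Big[\frac{1}{2}\,"\operatorname{sgn}(t)"\,\phi(t)\Big]_{-\infty}^{\infty} - \frac{1}{2}\int_{-\infty}^{\infty} "\operatorname{sgn}(t)"\,\phi'(t)\,dt.$$

This becomes, using Eqs. (2.60b) and (2.42c),

$$\begin{aligned}
\int_{-\infty}^{\infty} \frac{1}{2}\,"\operatorname{sgn}^{(1)}(t)"\,\phi(t)\,dt &= \frac{1}{2}\lim_{t\to\infty}\phi(t) + \frac{1}{2}\lim_{t\to-\infty}\phi(t) + \frac{1}{2}\int_{-\infty}^{0} \phi'(t)\,dt - \frac{1}{2}\int_{0}^{\infty} \phi'(t)\,dt \\
&= \frac{1}{2}\lim_{t\to\infty}\phi(t) + \frac{1}{2}\lim_{t\to-\infty}\phi(t) - \frac{1}{2}\lim_{t\to-\infty}\phi(t) + \frac{1}{2}\phi(0) + \frac{1}{2}\phi(0) - \frac{1}{2}\lim_{t\to\infty}\phi(t) \\
&= \phi(0) = \int_{-\infty}^{\infty} \delta(t)\,\phi(t)\,dt.
\end{aligned}$$

This shows Eq. (2.70f) is true. Again, we get a formula

$$\frac{1}{2}\,"\operatorname{sgn}^{(1)}(t)" = \delta(t) \tag{2.70g}$$

in the sense of Eq. (2.47b), where the only major restriction on the test functions is that they be
continuous at the origin.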
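Both (2.70e) and (2.70g) can be checked numerically through the defining relation (2.53b), which moves the derivative onto the test function. The sketch below assumes a test function $\phi(t) = e^{-t^2}(2 + \sin t)$ that vanishes at $\pm\infty$ (chosen for this example only, with $\phi(0) = 2$), so the boundary terms drop out and both $-\int \text{step}(t)\,\phi'(t)\,dt$ and $-\tfrac12\int \operatorname{sgn}(t)\,\phi'(t)\,dt$ should return $\phi(0)$. The function names are illustrative.

```python
import math

def step(t):
    # Step function of (2.70a): 1 for t > 0, 1/2 at t = 0, 0 for t < 0.
    return 1.0 if t > 0 else (0.5 if t == 0 else 0.0)

def sgn(t):
    return 1.0 if t > 0 else (-1.0 if t < 0 else 0.0)

def phi(t):
    # Hypothetical test function that decays at both ends; phi(0) = 2.
    return math.exp(-t * t) * (2.0 + math.sin(t))

def phi_prime(t):
    # Analytic derivative of phi.
    return math.exp(-t * t) * (math.cos(t) - 2.0 * t * (2.0 + math.sin(t)))

def minus_integral(weight, scale=1.0, half_width=8.0, steps=40_000):
    # Midpoint-rule value of -scale * integral of weight(t) * phi'(t) dt.
    # With an even step count, t = 0 is a cell boundary, so the jump is handled cleanly.
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps):
        t = -half_width + (k + 0.5) * h
        total += weight(t) * phi_prime(t)
    return -scale * h * total

print(minus_integral(step), minus_integral(sgn, scale=0.5), phi(0.0))
```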
2.16 Fourier Transform of the Delta Function
To find the Fourier transform of the delta function, we construct two sequences of functions
having the relationship specified in (2.59a)–(2.59g) above. It is easiest to start with the
delta-function sequence in Eq. (2.67c). Any standard table of Fourier transforms gives²³

$$\int_{-\infty}^{\infty} e^{-2\pi i f t}\,\frac{\sin(2\pi n t)}{\pi t}\,dt = \mathcal{F}^{(-ift)}\Big(\frac{\sin(2\pi n t)}{\pi t}\Big) = \Pi(f,n)$$

and

$$\int_{-\infty}^{\infty} e^{2\pi i f t}\,\Pi(f,n)\,df = \mathcal{F}^{(ift)}\big(\Pi(f,n)\big) = \frac{\sin(2\pi n t)}{\pi t}$$

so that

$$\frac{\sin(2\pi n t)}{\pi t} \leftrightarrow \Pi(f,n). \tag{2.71a}$$

Although Eq. (2.71a) holds true for all real $n$, it is here used only for integer values of $n$. We
know from (2.67c) that the generalized limit as $n \to \infty$ of the left-hand side of (2.71a) is $\delta(t)$,
but what is the corresponding generalized limit of the right-hand side? We have

$$\int_{-\infty}^{\infty} \phi(f)\,\Big[G\lim_{n\to\infty}\Pi(f,n)\Big]\,df = \lim_{n\to\infty}\int_{-\infty}^{\infty} \Pi(f,n)\,\phi(f)\,df = \lim_{n\to\infty}\int_{-n}^{n} \phi(f)\,df = \int_{-\infty}^{\infty} 1\cdot\phi(f)\,df$$

for any test function $\phi$. This shows that

$$G\lim_{n\to\infty}\Pi(f,n) = 1,$$

which is no surprise. Therefore, taking the generalized limit as $n \to \infty$ of both sides of (2.71a)

23
Jack D. Gaskill, Linear Systems, Fourier Transforms, and Optics (John Wiley & Sons, New York, 1978), p. 201,
with the sinc, rect function pair corresponding to formula (2.71a) above.
gives

$$\delta(t) \leftrightarrow 1, \tag{2.71b}$$

or, restating this result,

$$\int_{-\infty}^{\infty} \delta(t)\,e^{-2\pi i f t}\,dt = 1 \tag{2.71c}$$

and

$$\int_{-\infty}^{\infty} e^{2\pi i f t}\,df = \delta(t). \tag{2.71d}$$

Equation (2.71c) is just what we expect from Eq. (2.64b), since

$$e^{-2\pi i f\cdot 0} = 1;$$

but Eq. (2.71d) is true only in the sense of Eq. (2.47b), and it is only safe to substitute freely from
(2.71d) when the substitution takes place inside an integral.
Because the sine is an odd function of its argument, we have according to Eq. (2.17), and
assuming the integral is a Cauchy principal value, that

$$\int_{-\infty}^{\infty} \sin(2\pi f t)\,df = 0.$$

Therefore, Eq. (2.71d) becomes, using Eq. (2.19) and that the cosine is even,

$$\int_{-\infty}^{\infty} \big[\cos(2\pi f t) + i\sin(2\pi f t)\big]\,df = 2\int_{0}^{\infty} \cos(2\pi f t)\,df = \delta(t).$$

Since the integral over the sine always disappears, we can also write

$$\delta(t) = \int_{-\infty}^{\infty} \big[\cos(2\pi f t) - i\sin(2\pi f t)\big]\,df = \int_{-\infty}^{\infty} e^{-2\pi i f t}\,df.$$

Hence, two additional formulas for the delta function are

$$2\int_{0}^{\infty} \cos(2\pi f t)\,df = \delta(t) \tag{2.71e}$$

and

$$\int_{-\infty}^{\infty} e^{-2\pi i f t}\,df = \delta(t). \tag{2.71f}$$

As was the case for Eq. (2.71d), these formulas are meant to be used inside integrals.
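One way to see why (2.71e) is plausible is to truncate its upper limit at a finite value $n$ and carry out the elementary integral. This short worked step is an added illustration; it simply recovers the sequence already listed in (2.67c):

```latex
2\int_{0}^{n} \cos(2\pi f t)\,df
  = 2\left[\frac{\sin(2\pi f t)}{2\pi t}\right]_{f=0}^{f=n}
  = \frac{\sin(2\pi n t)}{\pi t}
```

Taking the generalized limit of the right-hand side as $n \to \infty$ gives $\delta(t)$ by (2.67c), which is the content of (2.71e) when the formula is used inside an integral with a test function.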
2.17 Fourier Convolution Theorem with Generalized Functions
Now that we have defined what is meant by the Fourier transform of a generalized function, it is
surprisingly easy to show that the Fourier convolution theorem holds for the product of a
generalized function and a true function.
We start with two sequences of true functions, one of them labeled with a superscript minus
sign for reasons that will shortly become apparent, called

$$v_1(t),\ v_2(t),\ \ldots,\ v_n(t),\ \ldots \quad\text{and}\quad V_1^{(-)}(f),\ V_2^{(-)}(f),\ \ldots,\ V_n^{(-)}(f),\ \ldots.$$

If these two sequences obey the relationship

$$\begin{aligned}
v_1(t) &\leftrightarrow V_1^{(-)}(f) \\
v_2(t) &\leftrightarrow V_2^{(-)}(f) \\
&\;\;\vdots \\
v_n(t) &\leftrightarrow V_n^{(-)}(f) \\
&\;\;\vdots
\end{aligned}$$

we know from Eq. (2.59g) that the generalized functions $v_G(t)$ and $V_G^{(-)}(f)$ specified by

$$v_G(t) = G\lim_{n\to\infty} v_n(t) \tag{2.72a}$$

and

$$V_G^{(-)}(f) = G\lim_{n\to\infty} V_n^{(-)}(f) \tag{2.72b}$$

form a Fourier transform pair,

$$v_G(t) \leftrightarrow V_G^{(-)}(f). \tag{2.72c}$$

We also suppose that there exists a third sequence of true functions labeled with a superscript
plus sign,

$$V_1^{(+)}(t),\ V_2^{(+)}(t),\ \ldots,\ V_n^{(+)}(t),\ \ldots,$$

such that

$$\begin{aligned}
V_1^{(+)}(t) &\leftrightarrow v_1(f) \\
V_2^{(+)}(t) &\leftrightarrow v_2(f) \\
&\;\;\vdots \\
V_n^{(+)}(t) &\leftrightarrow v_n(f) \\
&\;\;\vdots
\end{aligned}$$

If this third sequence has a generalized function as its generalized limit,

$$V_G^{(+)}(t) = G\lim_{n\to\infty} V_n^{(+)}(t), \tag{2.72d}$$

then the generalized functions $V_G^{(+)}(t)$ and $v_G(f)$ are also a Fourier transform pair,

$$V_G^{(+)}(t) \leftrightarrow v_G(f). \tag{2.72e}$$

Definitions (2.72b) and (2.72d) taken together show that

$$V_G^{(\pm)}(f) = G\lim_{n\to\infty} V_n^{(\pm)}(f), \tag{2.72f}$$

where we have replaced $t$ by $f$ in (2.72d); and Eqs. (2.72c) and (2.72e) taken together give

$$V_G^{(\pm)}(f) = \mathcal{F}^{(\pm ift)}\big(v_G(t)\big), \tag{2.72g}$$

where we have interchanged the roles of $t$ and $f$ in Eq. (2.72e).
From the Fourier convolution theorem for true functions [see Eq. (2.39j)], it follows that for
any true function $u(t)$

$$\mathcal{F}^{(\pm ift)}\big(u(t)\,v_n(t)\big) = \mathcal{F}^{(\pm ift)}\big(u(t)\big) * \mathcal{F}^{(\pm ift)}\big(v_n(t)\big)$$

or

$$\int_{-\infty}^{\infty} e^{\pm 2\pi i f t}\,u(t)\,v_n(t)\,dt = \int_{-\infty}^{\infty} U^{(\pm)}(f')\,V_n^{(\pm)}(f - f')\,df',$$

where

$$U^{(\pm)}(f) = \int_{-\infty}^{\infty} e^{\pm 2\pi i f t}\,u(t)\,dt \quad\text{and}\quad V_n^{(\pm)}(f) = \int_{-\infty}^{\infty} e^{\pm 2\pi i f t}\,v_n(t)\,dt.$$

The integral formula for $V_n^{(\pm)}(f)$ just restates the definitions given to $V_n^{(+)}$ and $V_n^{(-)}$ on the two
previous pages. Taking the limit of both sides as $n \to \infty$ gives
- 160 -
Fourier Convolution Theorem with Generalized Functions 2.17

- 161 -

2 ( ) ( )
lim ( ) ( ) lim ( ) ( )
ift
n n
n n
r

or, moving the limiting process inside the integral so that it becomes a generalized limit [see
discussion after Eq. (2.56a)],

2 ( ) ( )
( ) lim ( ) ( ) lim ( )
ift
n n
n n
e u t G v t dt U f G V f f df
r

.

From the definitions of ( )
G
v t and
( )
( )
G
V f
[see Eqs. (2.72a) and (2.72f)], we get

2 ( ) ( )
( ) ( ) ( ) ( )
ift
G G
r

,

which becomes

2 ( ) ( )
( ) ( ) ( ) ( )
ift
G G
r
(2.72h)

or, substituting from Eq. (2.72g),

( ) ( ) ( )
( ) ( ) ( )
( ) ( ) ( ) ( )
ift ift ift
G G
u t v t u t v t

F F F . (2.72i)

Consulting Eq. (2.55b) above, we note that convolution with a generalized function is
commutative, just like the convolution of two standard functions, so Eqs. (2.72h) and (2.72i) can
also be written as

\[ \int_{-\infty}^{\infty} e^{-2\pi i f t}\, u(t)\, v_G(t)\, dt = V_G^{(-)}(f) * U^{(-)}(f) \tag{2.72j} \]

and

\[ \mathcal{F}^{(-ift)}\big(u(t)\, v_G(t)\big) = \mathcal{F}^{(-ift)}\big(v_G(t)\big) * \mathcal{F}^{(-ift)}\big(u(t)\big). \tag{2.72k} \]

This establishes the generalized-function counterpart to Eq. (2.39j) whenever $e^{-2\pi i f t} u(t)$ and $U^{(-)}(f)$ qualify as acceptable test functions. Since almost all well-behaved, continuous functions are acceptable test functions when used with linear combinations of delta functions or the derivatives of delta functions, Eqs. (2.72h) and (2.72i) are valid whenever $v_G(t)$ is a linear combination of delta functions or the derivatives of delta functions.
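Establishing the Fourier convolution theorem in the other direction is even easier. We just write, making the variable substitution $t'' = t - t'$ and remembering that the convolutions are commutative,

\[
\begin{aligned}
\int_{-\infty}^{\infty} e^{-2\pi i f t}\, [u(t) * v_G(t)]\, dt
&= \int_{-\infty}^{\infty} dt\, e^{-2\pi i f t} \int_{-\infty}^{\infty} dt'\, u(t - t') \operatorname*{G\,lim}_{n\to\infty} v_n(t') \\
&= \int_{-\infty}^{\infty} dt' \operatorname*{G\,lim}_{n\to\infty} v_n(t') \int_{-\infty}^{\infty} dt\, u(t - t')\, e^{-2\pi i f t} \\
&= \lim_{n\to\infty} \int_{-\infty}^{\infty} dt'\, v_n(t') \int_{-\infty}^{\infty} dt\, u(t - t')\, e^{-2\pi i f t} \\
&= \lim_{n\to\infty} \int_{-\infty}^{\infty} dt'\, v_n(t')\, e^{-2\pi i f t'} \int_{-\infty}^{\infty} dt''\, u(t'')\, e^{-2\pi i f t''} \\
&= \left[ \int_{-\infty}^{\infty} e^{-2\pi i f t'} \operatorname*{G\,lim}_{n\to\infty} v_n(t')\, dt' \right] \cdot \left[ \int_{-\infty}^{\infty} u(t'')\, e^{-2\pi i f t''}\, dt'' \right].
\end{aligned}
\]

We conclude that

\[ \mathcal{F}^{(-ift)}\big(u(t) * v_G(t)\big) = \mathcal{F}^{(-ift)}\big(u(t)\big) \cdot \mathcal{F}^{(-ift)}\big(v_G(t)\big), \tag{2.72l} \]

showing that Eq. (2.39a) holds true for the convolution of a true function and a generalized function as well as for the convolution of two true functions.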
2.18 The Shah Function
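The shah function, often written as Ш, can be defined as the generalized limit

\[ \text{Ш}(t, T) = \operatorname*{G\,lim}_{n\to\infty} \frac{1}{T}\, \frac{\sin\!\left(2\pi \frac{t}{T}\left(n + \frac{1}{2}\right)\right)}{\sin\!\left(\pi \frac{t}{T}\right)}. \tag{2.73} \]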
For any test function $\phi(t)$, we have

\[ \int_{-\infty}^{\infty} \phi(t) \operatorname*{G\,lim}_{n\to\infty} \left\{ \frac{\sin\!\left(2\pi tT^{-1}(n + 1/2)\right)}{\sin\!\left(\pi tT^{-1}\right)} \right\} dt = \lim_{n\to\infty} \int_{-\infty}^{\infty} \phi(t) \left\{ \frac{\sin\!\left(2\pi tT^{-1}(n + 1/2)\right)}{\sin\!\left(\pi tT^{-1}\right)} \right\} dt. \tag{2.74a} \]

As $n$ gets large in (2.74a), the term in braces { } oscillates ever more rapidly between +1 and −1, causing the more slowly varying function $\phi$ to make only a negligible contribution to the integral. The only place this might not hold true is at the isolated $t$ values

\[ t = 0,\ \pm T,\ \pm 2T,\ \ldots. \tag{2.74b} \]

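It is easy to see why these isolated values are different. Suppose $t$ differs from one of these isolated values by only a small amount $\Delta t$ so that

\[ \Delta t = t - mT \quad\text{for } m = 0, \pm 1, \pm 2, \ldots. \tag{2.74c} \]

Then the term in braces becomes

\[
\begin{aligned}
\frac{\sin\!\left(2\pi(\Delta t + mT)T^{-1}(n + 1/2)\right)}{\sin\!\left(\pi(\Delta t + mT)T^{-1}\right)}
&= \frac{\sin\!\left(2\pi\,\Delta t\,T^{-1}(n + 1/2) + 2\pi nm + \pi m\right)}{\sin\!\left(\pi\,\Delta t\,T^{-1} + \pi m\right)} \\
&= \frac{\sin\!\left(2\pi\,\Delta t\,T^{-1}(n + 1/2)\right)}{\sin\!\left(\pi\,\Delta t\,T^{-1}\right)}.
\end{aligned}
\]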

To explain the last step, we note that the sine does not change when an $nm$ number of $2\pi$'s is added to its argument, and adding an $m$ number of $\pi$'s to the sine's argument either leaves the sine unchanged (if $m$ is even) or multiplies it by −1 (if $m$ is odd). Since the sine values in both the numerator and denominator have the same number of $\pi$'s added to their arguments, we do not care if $m$ is odd because the factor of −1 cancels, leaving the sine ratio unchanged. As $\Delta t$ is taken to be ever smaller in magnitude for a fixed value of $n$, there comes a time when the arguments of both sines are small in magnitude, allowing each sine to be approximated by its argument. We then have

\[
\frac{\sin\!\left(2\pi(\Delta t + mT)T^{-1}(n + 1/2)\right)}{\sin\!\left(\pi(\Delta t + mT)T^{-1}\right)}
= \frac{\sin\!\left(2\pi\,\Delta t\,T^{-1}(n + 1/2)\right)}{\sin\!\left(\pi\,\Delta t\,T^{-1}\right)}
\cong \frac{2\pi\,\Delta t\,T^{-1}(n + 1/2)}{\pi\,\Delta t\,T^{-1}} = 2(n + 1/2).
\]

Consequently, the peak values of the term in braces get ever larger at the isolated points in (2.74b) as $n$ increases, as shown in Figs. 2.9(a)-2.9(c). We see that the triangular peaks at the isolated points in (2.74b) have widths equal to $T/(n + 1/2)$. As $n$ gets ever larger, the term in braces oscillates so rapidly between +1 and −1 compared to the test function $\phi$ that there is no contribution made to the integral on the right-hand side of (2.74a) except at the isolated $t$ values shown in Figs. 2.9(a)-2.9(c). At these $t$ values, we have

\[
\begin{aligned}
\lim_{n\to\infty} \int_{-\infty}^{\infty} \phi(t) \left\{ \frac{\sin\!\left(2\pi tT^{-1}(n + 1/2)\right)}{\sin\!\left(\pi tT^{-1}\right)} \right\} dt
&= \cdots + \phi(-T)\cdot[\text{area of triangular peak}] + \phi(0)\cdot[\text{area of triangular peak}] \\
&\qquad + \phi(T)\cdot[\text{area of triangular peak}] + \cdots \\
&= \left[ \frac{1}{2}\cdot\frac{T}{n + 1/2}\cdot 2(n + 1/2) \right] \cdot \big( \cdots + \phi(-T) + \phi(0) + \phi(T) + \cdots \big),
\end{aligned}
\]

which simplifies to

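\[ \lim_{n\to\infty} \int_{-\infty}^{\infty} \phi(t) \left\{ \frac{\sin\!\left(2\pi tT^{-1}(n + 1/2)\right)}{\sin\!\left(\pi tT^{-1}\right)} \right\} dt = T \sum_{k=-\infty}^{\infty} \phi(kT). \tag{2.75a} \]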

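But $T \sum_{k=-\infty}^{\infty} \phi(kT)$ can be thought of as what we get when evaluating the integral

\[ \int_{-\infty}^{\infty} \phi(t) \left[ T \sum_{k=-\infty}^{\infty} \delta(t - kT) \right] dt = T \sum_{k=-\infty}^{\infty} \int_{-\infty}^{\infty} \delta(t - kT)\, \phi(t)\, dt = T \sum_{k=-\infty}^{\infty} \phi(kT). \]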

This lets us write (2.75a) as

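\[ \lim_{n\to\infty} \int_{-\infty}^{\infty} \phi(t) \left\{ \frac{\sin\!\left(2\pi tT^{-1}(n + 1/2)\right)}{\sin\!\left(\pi tT^{-1}\right)} \right\} dt = \int_{-\infty}^{\infty} \phi(t) \left[ T \sum_{k=-\infty}^{\infty} \delta(t - kT) \right] dt \tag{2.75b} \]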

or, using (2.56a) to take the limit inside the integral as a generalized limit,

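\[ \int_{-\infty}^{\infty} \phi(t) \operatorname*{G\,lim}_{n\to\infty} \left\{ \frac{\sin\!\left(2\pi tT^{-1}(n + 1/2)\right)}{\sin\!\left(\pi tT^{-1}\right)} \right\} dt = \int_{-\infty}^{\infty} \phi(t) \left[ T \sum_{k=-\infty}^{\infty} \delta(t - kT) \right] dt. \]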

Since this last result is true for any test function $\phi$, we conclude that

\[ \operatorname*{G\,lim}_{n\to\infty} \left\{ \frac{\sin\!\left(2\pi tT^{-1}(n + 1/2)\right)}{\sin\!\left(\pi tT^{-1}\right)} \right\} = T \sum_{k=-\infty}^{\infty} \delta(t - kT) \tag{2.75c} \]

in the sense of Eq. (2.47b). Comparison of this result to the definition of the shah function in Eq. (2.73) above shows that

\[ \text{Ш}(t, T) = \sum_{k=-\infty}^{\infty} \delta(t - kT). \tag{2.75d} \]
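
The limit in Eq. (2.75a) can be checked numerically. The sketch below is only an illustration under stated assumptions: the Gaussian test function `phi`, the integration window, and the midpoint rule are choices made here, not part of the derivation above. It integrates the test function against the sine ratio of (2.74a) for a finite n and compares the result with T times the sum of phi(kT).

```python
import math

def sine_ratio(t, T, n):
    """Term in braces of (2.74a): sin(2*pi*(t/T)*(n + 1/2)) / sin(pi*t/T)."""
    denom = math.sin(math.pi * t / T)
    if abs(denom) < 1e-12:
        # At t = m*T the ratio tends to 2*(n + 1/2), as shown in the text.
        return 2.0 * n + 1.0
    return math.sin(2.0 * math.pi * (t / T) * (n + 0.5)) / denom

def phi(t):
    """Hypothetical test function (an assumption of this sketch): a Gaussian."""
    return math.exp(-t * t)

def lhs_integral(T, n, half_width=8.0, steps=32000):
    """Midpoint-rule estimate of the integral of phi(t) times the sine ratio."""
    h = 2.0 * half_width / steps
    total = 0.0
    for j in range(steps):
        t = -half_width + (j + 0.5) * h
        total += phi(t) * sine_ratio(t, T, n)
    return total * h

def rhs_sum(T, kmax=50):
    """T times the sum of phi(k*T), the right-hand side of (2.75a)."""
    return T * sum(phi(k * T) for k in range(-kmax, kmax + 1))

T = 1.0
lhs = lhs_integral(T, n=5)
rhs = rhs_sum(T)
print(lhs, rhs)
```

For a test function this smooth, the two numbers already agree closely at small n; a test function with more high-frequency content would need a larger n before the agreement becomes visible.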

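We note that variable $t$ can be replaced by $f$ in Eq. (2.75c) to get

\[ \operatorname*{G\,lim}_{n\to\infty} \left\{ \frac{\sin\!\left(2\pi fT^{-1}(n + 1/2)\right)}{\sin\!\left(\pi fT^{-1}\right)} \right\} = T \sum_{k=-\infty}^{\infty} \delta(f - kT). \]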

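Parameter $T$ is arbitrary throughout this derivation, so nothing stops us from replacing it by $T^{-1}$ everywhere to get

\[ \operatorname*{G\,lim}_{n\to\infty} \left\{ \frac{\sin\!\left(2\pi fT(n + 1/2)\right)}{\sin(\pi fT)} \right\} = \frac{1}{T} \sum_{k=-\infty}^{\infty} \delta\!\left(f - \frac{k}{T}\right). \tag{2.75e} \]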

This is another useful version of the formula in Eq. (2.75d).
2.19 Fourier Transform of the Shah Function
To get the Fourier transform of the shah function, we construct the sequence of true functions $G_1(t, T),\ G_2(t, T),\ \ldots,\ G_n(t, T),\ \ldots$ such that

\[ G_n(t, T) = \sum_{k=-n}^{n} g_n(t - kT), \tag{2.76a} \]

where

\[ g_n(t) = \frac{\sin\!\left(2\pi(n + 1)t\right)}{\pi t}. \tag{2.76b} \]

From Eq. (2.67c), we have

\[ \operatorname*{G\,lim}_{n\to\infty} g_{n-1}(t) = \operatorname*{G\,lim}_{n\to\infty} \frac{\sin(2\pi nt)}{\pi t} = \delta(t). \]

[FIGURES 2.9(a), 2.9(b), and 2.9(c). The t interval between the arrows is T/(n + 1/2) in all three plots. The figures show how the base width of the central lobe becomes ever narrower as n increases.]

Since adding one to n does not make any difference in the limit, we end up with

\[ \operatorname*{G\,lim}_{n\to\infty} g_n(t) = \delta(t); \tag{2.76c} \]

and from (2.71a) we get, again adding one to $n$,

\[ \frac{\sin\!\left(2\pi(n + 1)t\right)}{\pi t} \leftrightarrow \Pi(f, n + 1) \quad\text{for } n = 1, 2, \ldots. \tag{2.76d} \]

To find the generalized function that is the forward Fourier transform of the generalized limit of $G_n$ as $n \to \infty$, we must evaluate the forward Fourier transform of $G_n$ for finite $n$,

\[
\begin{aligned}
\mathcal{F}^{(-ift)}\big(G_n(t)\big) &= \int_{-\infty}^{\infty} e^{-2\pi i f t}\, G_n(t)\, dt = \sum_{k=-n}^{n} \int_{-\infty}^{\infty} e^{-2\pi i f t}\, g_n(t - kT)\, dt \\
&= \sum_{k=-n}^{n} e^{-2\pi i f kT} \int_{-\infty}^{\infty} e^{-2\pi i f t'}\, g_n(t')\, dt',
\end{aligned}
\]

where in the last step the variable of integration has been changed to $t' = t - kT$. The Fourier transform inside the sum can be done using (2.76b) and (2.76d) to get

\[ \mathcal{F}^{(-ift)}\big(G_n(t)\big) = \Pi(f, n + 1) \sum_{k=-n}^{n} e^{-2\pi i f kT}. \tag{2.77a} \]

The sum $\sum_{k=-n}^{n} e^{-2\pi i f kT}$ is just a disguised form of geometric series. We can write

\[ \sum_{k=-n}^{n} e^{-2\pi i f kT} = \sum_{k=-n}^{n} w^k, \tag{2.77b} \]

where

\[ w = e^{-2\pi i f T}, \]

and define

\[ S_n = \sum_{k=-n}^{n} w^k = \sum_{k=-n}^{n} e^{-2\pi i f kT}. \]

Using the standard approach for calculating the sum of a geometric series, we note that multiplying every term in the sum by $w$ increases each power of $w$ in the sum by one. This is the same as adding $w^{n+1}$ and subtracting $w^{-n}$ from the original sum, giving

\[ wS_n = \sum_{k=-n}^{n} w^{k+1} = S_n + w^{n+1} - w^{-n} \]

or

\[ S_n = \frac{w^{n+1} - w^{-n}}{w - 1}. \]

Hence, (2.77b) becomes

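\[
\begin{aligned}
\sum_{k=-n}^{n} e^{-2\pi i f kT}
&= \frac{e^{-2\pi i f T(n+1)} - e^{2\pi i f T n}}{e^{-2\pi i f T} - 1}
 = \frac{e^{-2\pi i f T(n + 1/2)} - e^{2\pi i f T(n + 1/2)}}{e^{-\pi i f T} - e^{\pi i f T}} \\
&= \frac{\sin\!\left(2\pi fT(n + 1/2)\right)}{\sin(\pi fT)},
\end{aligned}
\tag{2.77c}
\]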
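
Equation (2.77c) is easy to spot-check numerically. The following minimal sketch sums the complex exponentials directly and compares the result with the closed-form sine ratio; the particular values of f, T, and n are arbitrary choices made for illustration (any values with sin(πfT) ≠ 0 would do).

```python
import cmath
import math

def exponential_sum(f, T, n):
    """Left-hand side of (2.77c): sum over k = -n..n of exp(-2*pi*i*f*k*T)."""
    return sum(cmath.exp(-2j * math.pi * f * k * T) for k in range(-n, n + 1))

def closed_form(f, T, n):
    """Right-hand side of (2.77c): sin(2*pi*f*T*(n + 1/2)) / sin(pi*f*T)."""
    return math.sin(2.0 * math.pi * f * T * (n + 0.5)) / math.sin(math.pi * f * T)

# Arbitrary illustrative values, chosen so that sin(pi*f*T) is not zero.
f, T, n = 0.37, 1.3, 6
direct = exponential_sum(f, T, n)
ratio = closed_form(f, T, n)
print(direct, ratio)
```

The imaginary part of the direct sum vanishes (to rounding error) because the terms for k and −k are complex conjugates of each other.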

which means Eq. (2.77a) can be written as

\[ \mathcal{F}^{(-ift)}\big(G_n(t)\big) = \Pi(f, n + 1)\, \frac{\sin\!\left(2\pi fT(n + 1/2)\right)}{\sin(\pi fT)}. \tag{2.77d} \]

The inverse Fourier transform of the forward Fourier transform returns the original function [see
Eqs. (2.29b) and (2.29d)], so this last result lets us write

\[ G_n(t) \leftrightarrow \Pi(f, n + 1)\, \frac{\sin\!\left(2\pi fT(n + 1/2)\right)}{\sin(\pi fT)}. \tag{2.77e} \]

From the definition of the Fourier transform of a generalized function [see (2.59g)], we know that taking the generalized limit of both sides of (2.77e) gives a Fourier transform relationship between two generalized functions; all that needs to be done now is to find out what these generalized functions are.
To find the generalized function that is the generalized limit of $G_n$ as $n \to \infty$, we write for any test function $\phi$, using Eq. (2.76a), that

\[
\begin{aligned}
\int_{-\infty}^{\infty} \phi(t) \left[ \operatorname*{G\,lim}_{n\to\infty} G_n(t) \right] dt
&= \lim_{n\to\infty} \int_{-\infty}^{\infty} \phi(t)\, G_n(t)\, dt = \lim_{n\to\infty} \int_{-\infty}^{\infty} \phi(t) \sum_{k=-n}^{n} g_n(t - kT)\, dt \\
&= \lim_{n\to\infty} \sum_{k=-n}^{n} \int_{-\infty}^{\infty} \phi(t)\, g_n(t - kT)\, dt.
\end{aligned}
\tag{2.77f}
\]

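Equation (2.76c) states that the generalized limit of $g_n$ is the delta function, so

\[ \lim_{n\to\infty} \int_{-\infty}^{\infty} \phi(t)\, g_n(t - kT)\, dt = \int_{-\infty}^{\infty} \phi(t) \operatorname*{G\,lim}_{n\to\infty} g_n(t - kT)\, dt = \int_{-\infty}^{\infty} \phi(t)\, \delta(t - kT)\, dt = \phi(kT), \]

which means that

\[ \lim_{n\to\infty} \sum_{k=-n}^{n} \int_{-\infty}^{\infty} \phi(t)\, g_n(t - kT)\, dt = \sum_{k=-\infty}^{\infty} \phi(kT). \]

Hence, Eq. (2.77f) can be written as

\[ \int_{-\infty}^{\infty} \phi(t) \left[ \operatorname*{G\,lim}_{n\to\infty} G_n(t) \right] dt = \sum_{k=-\infty}^{\infty} \phi(kT). \tag{2.77g} \]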

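But, just as in the discussion following Eq. (2.75a) above, we can regard

\[ \sum_{k=-\infty}^{\infty} \phi(kT) \]

as the result of integrating the shah generalized function

\[ \text{Ш}(t, T) = \sum_{k=-\infty}^{\infty} \delta(t - kT) \]

with any test function $\phi$, since

\[ \int_{-\infty}^{\infty} \text{Ш}(t, T)\, \phi(t)\, dt = \sum_{k=-\infty}^{\infty} \int_{-\infty}^{\infty} \delta(t - kT)\, \phi(t)\, dt = \sum_{k=-\infty}^{\infty} \phi(kT). \]

Therefore, (2.77g) can be written as

\[ \int_{-\infty}^{\infty} dt\, \phi(t) \left[ \operatorname*{G\,lim}_{n\to\infty} G_n(t) \right] = \int_{-\infty}^{\infty} \left[ \sum_{k=-\infty}^{\infty} \delta(t - kT) \right] \phi(t)\, dt \tag{2.77h} \]

for any test function $\phi$, showing that

\[ \operatorname*{G\,lim}_{n\to\infty} G_n(t) = \sum_{k=-\infty}^{\infty} \delta(t - kT) = \text{Ш}(t, T) \tag{2.77i} \]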

in the sense of Eq. (2.47b).
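The generalized function that is the generalized limit of the right-hand side of (2.77e) is multiplied by an arbitrary test function $\phi(f)$ and integrated over all $f$ to get

\[
\begin{aligned}
\int_{-\infty}^{\infty} \phi(f) \operatorname*{G\,lim}_{n\to\infty} \left[ \Pi(f, n + 1)\, \frac{\sin\!\left(2\pi fT(n + 1/2)\right)}{\sin(\pi fT)} \right] df
&= \lim_{n\to\infty} \int_{-(n+1)}^{n+1} \phi(f) \left[ \frac{\sin\!\left(2\pi fT(n + 1/2)\right)}{\sin(\pi fT)} \right] df \\
&= \lim_{n\to\infty} \int_{-\infty}^{\infty} \phi(f) \left[ \frac{\sin\!\left(2\pi fT(n + 1/2)\right)}{\sin(\pi fT)} \right] df,
\end{aligned}
\tag{2.78a}
\]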

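where in the last step we recognize that the behavior of the sine ratio inside the square brackets [ ] is not affected by the endpoints for the region of integration as $n \to \infty$. Equations (2.56a) and (2.75e) show that

\[ \lim_{n\to\infty} \int_{-\infty}^{\infty} \phi(f) \left[ \frac{\sin\!\left(2\pi fT(n + 1/2)\right)}{\sin(\pi fT)} \right] df = \int_{-\infty}^{\infty} \phi(f) \left[ T^{-1} \sum_{k=-\infty}^{\infty} \delta\!\left(f - kT^{-1}\right) \right] df, \]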

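which means that (2.78a) simplifies to

\[ \int_{-\infty}^{\infty} \phi(f) \operatorname*{G\,lim}_{n\to\infty} \left[ \Pi(f, n + 1)\, \frac{\sin\!\left(2\pi fT(n + 1/2)\right)}{\sin(\pi fT)} \right] df = \int_{-\infty}^{\infty} \phi(f) \left[ T^{-1} \sum_{k=-\infty}^{\infty} \delta\!\left(f - kT^{-1}\right) \right] df \]

for any test function $\phi(f)$. Therefore,

\[ \operatorname*{G\,lim}_{n\to\infty} \left[ \Pi(f, n + 1)\, \frac{\sin\!\left(2\pi fT(n + 1/2)\right)}{\sin(\pi fT)} \right] = \frac{1}{T} \sum_{k=-\infty}^{\infty} \delta\!\left(f - \frac{k}{T}\right) \tag{2.78b} \]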

in the sense of Eq. (2.47b). Since the right-hand side of (2.78b) is, according to (2.75d), proportional to the shah function, we end up with

\[ \frac{1}{T} \sum_{k=-\infty}^{\infty} \delta\!\left(f - \frac{k}{T}\right) = \frac{1}{T}\, \text{Ш}\!\left(f, T^{-1}\right). \tag{2.78c} \]

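Equations (2.78b) and (2.77i) let us take the generalized limits as $n \to \infty$ of both sides of (2.77e) to get

\[ \sum_{k=-\infty}^{\infty} \delta(t - kT) \leftrightarrow \frac{1}{T} \sum_{k=-\infty}^{\infty} \delta\!\left(f - \frac{k}{T}\right). \tag{2.78d} \]

According to Eq. (2.75d), this can also be written as

\[ \text{Ш}(t, T) \leftrightarrow \frac{1}{T}\, \text{Ш}\!\left(f, T^{-1}\right). \tag{2.78e} \]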

These last two results can be transformed directly to show explicitly that both the forward and
inverse Fourier transform of the shah function produce another shah function. We first write
(2.78d) as the forward and inverse Fourier transforms,

\[ \int_{-\infty}^{\infty} e^{-2\pi i f t} \sum_{k=-\infty}^{\infty} \delta(t - kT)\, dt = \frac{1}{T} \sum_{j=-\infty}^{\infty} \delta\!\left(f - \frac{j}{T}\right) \tag{2.79a} \]

and

\[ \int_{-\infty}^{\infty} e^{2\pi i f t}\, \frac{1}{T} \sum_{j=-\infty}^{\infty} \delta\!\left(f - \frac{j}{T}\right) df = \sum_{k=-\infty}^{\infty} \delta(t - kT). \tag{2.79b} \]

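The discussion following Eq. (2.52c) above shows that linear transformations of the variables of integration are allowed when using generalized functions, so we can change $t$ to $-t$ in Eqs. (2.79a) and (2.79b) to get

\[ \int_{-\infty}^{\infty} e^{2\pi i f t} \sum_{k=-\infty}^{\infty} \delta(-t - kT)\, dt = \frac{1}{T} \sum_{j=-\infty}^{\infty} \delta\!\left(f - \frac{j}{T}\right) \]

and

\[ \int_{-\infty}^{\infty} e^{-2\pi i f t}\, \frac{1}{T} \sum_{j=-\infty}^{\infty} \delta\!\left(f - \frac{j}{T}\right) df = \sum_{k=-\infty}^{\infty} \delta(-t - kT). \]

The sum over index $k$ goes over all positive and negative integers, so we can change the sum's index to $k' = -k$ and use that the delta function is even [see Eq. (2.68a)] to get

\[ \int_{-\infty}^{\infty} e^{2\pi i f t} \sum_{k'=-\infty}^{\infty} \delta(t - k'T)\, dt = \frac{1}{T} \sum_{j=-\infty}^{\infty} \delta\!\left(f - \frac{j}{T}\right) \]

and

\[ \int_{-\infty}^{\infty} e^{-2\pi i f t}\, \frac{1}{T} \sum_{j=-\infty}^{\infty} \delta\!\left(f - \frac{j}{T}\right) df = \sum_{k'=-\infty}^{\infty} \delta(t - k'T). \]

Dropping the primes and combining these results with Eqs. (2.79a) and (2.79b) produces the more general formulas

\[ \int_{-\infty}^{\infty} e^{\mp 2\pi i f t} \sum_{k=-\infty}^{\infty} \delta(t - kT)\, dt = \frac{1}{T} \sum_{j=-\infty}^{\infty} \delta\!\left(f - \frac{j}{T}\right) \tag{2.79c} \]

and

\[ \int_{-\infty}^{\infty} e^{\pm 2\pi i f t}\, \frac{1}{T} \sum_{j=-\infty}^{\infty} \delta\!\left(f - \frac{j}{T}\right) df = \sum_{k=-\infty}^{\infty} \delta(t - kT). \tag{2.79d} \]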
In fact, we can easily show that Eqs. (2.79c) and (2.79d) are really the same formula. First, we interchange the $j$, $k$ indices and the $f$, $t$ variables in Eq. (2.79c) so that it becomes

\[ \int_{-\infty}^{\infty} e^{\mp 2\pi i f t} \sum_{j=-\infty}^{\infty} \delta(f - jT)\, df = \frac{1}{T} \sum_{k=-\infty}^{\infty} \delta\!\left(t - \frac{k}{T}\right). \]

Parameter $T$ is arbitrary, so (just like in the analysis following Eq. (2.75d) above) it can be replaced everywhere by $T^{-1}$ to get

\[ \int_{-\infty}^{\infty} e^{\mp 2\pi i f t} \sum_{j=-\infty}^{\infty} \delta\!\left(f - \frac{j}{T}\right) df = T \sum_{k=-\infty}^{\infty} \delta(t - kT). \]

After dividing through by $T$, we see that this last result is the same as Eq. (2.79d), showing that Eqs. (2.79c) and (2.79d) are really the same formula.
2.20 Fourier Series
Integral Fourier transforms are connected in a direct and straightforward way to both the Fourier
series and the discrete Fourier transform. This section shows the connection to the Fourier series
and the next section shows the connection to the discrete Fourier transform.
24

We begin with an arbitrary, nonpathological function $u(t)$ that has a well-defined Fourier integral transform. Function $u$ can be complex-valued but its argument $t$ must be real, and $U(f)$ is the forward Fourier transform of $u(t)$, so

\[ U(f) = \mathcal{F}^{(-ift)}\big(u(t)\big) = \int_{-\infty}^{\infty} u(t)\, e^{-2\pi i f t}\, dt \tag{2.80a} \]

and

\[ u(t) \leftrightarrow U(f). \tag{2.80b} \]

From $u(t)$, we create a new function $u^{[\infty]}(t, T)$ that repeats forever along the $t$ axis at intervals of $T$,

\[ u^{[\infty]}(t, T) = \sum_{k=-\infty}^{\infty} u(t - kT). \tag{2.81a} \]

Although perhaps redundant, it turns out that listing $T$ as one of the arguments of $u^{[\infty]}$ is a convenient way to keep track of the connection between $u$ and $u^{[\infty]}$. Function $u^{[\infty]}$ is called a periodic function of period $T$ because, for any finite positive or negative integer $m$,

\[ u^{[\infty]}(t + mT, T) = u^{[\infty]}(t, T). \tag{2.81b} \]

Figures 2.10(a) and 2.10(b) show the plots for both $u$ and $u^{[\infty]}$ as functions of $t$. Since function $u$ is left unspecified, $u^{[\infty]}$ can be thought of as representing an arbitrary periodic function. We can also define a function $u^{[N]}(t, T)$ by the formula

\[ u^{[N]}(t, T) = \sum_{k=-N}^{N} u(t - kT). \tag{2.81c} \]

Clearly,

\[ \lim_{N\to\infty} u^{[N]}(t, T) = u^{[\infty]}(t, T). \tag{2.81d} \]

We assume that $u^{[N]}$ is well behaved with respect to the test functions $\phi$, so that

\[ \lim_{N\to\infty} \int_{-\infty}^{\infty} \phi(t)\, u^{[N]}(t, T)\, dt = \int_{-\infty}^{\infty} \phi(t)\, u^{[\infty]}(t, T)\, dt. \tag{2.81e} \]

[FIGURE 2.10(a), FIGURE 2.10(b). Figure 2.10(a) is a plot of $u(t)$. The solid curve in Fig. 2.10(b), shifted upward from its true position, is $u^{[\infty]}(t, T)$, and the dashed curves represent $u(t)$ displaced by multiples of $T$.]

24 The analysis in Secs. 2.20 and 2.21 is adapted from A. Papoulis, Signal Analysis (McGraw-Hill Book Company, New York, 1977), pp. 76-81.

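From (2.81e) and the definition of the generalized limit [see Eq. (2.56a)], we then know that

\[ \lim_{N\to\infty} \int_{-\infty}^{\infty} \phi(t)\, u^{[N]}(t, T)\, dt = \int_{-\infty}^{\infty} \phi(t) \operatorname*{G\,lim}_{N\to\infty} u^{[N]}(t, T)\, dt = \int_{-\infty}^{\infty} \phi(t)\, u^{[\infty]}(t, T)\, dt, \]

from which it follows that

\[ \operatorname*{G\,lim}_{N\to\infty} u^{[N]}(t, T) = u^{[\infty]}(t, T) \tag{2.81f} \]

in the sense of Eq. (2.48c).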
Following the pattern of the definitions in (2.81a) and (2.81c), we define

\[ \delta^{[N]}(t, T) = \sum_{k=-N}^{N} \delta(t - kT) \tag{2.82a} \]

and

\[ \delta^{[\infty]}(t, T) = \sum_{k=-\infty}^{\infty} \delta(t - kT). \tag{2.82b} \]

Function $\delta^{[\infty]}(t, T)$ is clearly just another way of writing the shah function Ш$(t, T)$. [The shah function is defined in Eq. (2.73) and shown equal to $\sum_{k=-\infty}^{\infty} \delta(t - kT)$ in Eq. (2.75d).] The convolution of the generalized function

\[ \delta^{[N]}(t, T) = \sum_{k=-N}^{N} \delta(t - kT) \]

with the true function $u(t)$ is

\[
\begin{aligned}
u(t) * \delta^{[N]}(t, T) &= \int_{-\infty}^{\infty} u(t')\, \delta^{[N]}(t - t', T)\, dt' = \sum_{k=-N}^{N} \int_{-\infty}^{\infty} u(t')\, \delta(t' - t + kT)\, dt' \\
&= \sum_{k=-N}^{N} u(t - kT),
\end{aligned}
\]

where the next-to-last step uses $\delta(x) = \delta(-x)$ as shown in Eq. (2.68a). The definition of $u^{[N]}$ in (2.81c) then gives

\[ u^{[N]}(t, T) = u(t) * \delta^{[N]}(t, T). \tag{2.82c} \]

Taking the integral Fourier transform of both sides, using the Fourier convolution theorem [see Eq. (2.72l)], and remembering that $U(f)$ is the forward Fourier transform of $u(t)$, we get

\[
\begin{aligned}
\mathcal{F}^{(-ift)}\big(u^{[N]}(t, T)\big) &= \mathcal{F}^{(-ift)}\big(u(t)\big) \cdot \mathcal{F}^{(-ift)}\big(\delta^{[N]}(t, T)\big) \\
&= U(f) \int_{-\infty}^{\infty} e^{-2\pi i f t} \sum_{k=-N}^{N} \delta(t - kT)\, dt \\
&= U(f) \sum_{k=-N}^{N} e^{-2\pi i k f T} \\
&= U(f)\, \frac{\sin\!\left(2\pi fT(N + 1/2)\right)}{\sin(\pi fT)},
\end{aligned}
\tag{2.83a}
\]

where in the last step we substitute from Eq. (2.77c) above. Having now found that

\[ \mathcal{F}^{(-ift)}\big(u^{[N]}(t, T)\big) = U(f)\, \frac{\sin\!\left(2\pi fT(N + 1/2)\right)}{\sin(\pi fT)}, \]

we take the inverse Fourier transform of both sides to get

\[ u^{[N]}(t, T) = \int_{-\infty}^{\infty} e^{2\pi i f t}\, U(f)\, \frac{\sin\!\left(2\pi fT(N + 1/2)\right)}{\sin(\pi fT)}\, df. \tag{2.83b} \]

Taking the limit of both sides as $N \to \infty$, we get, using (2.81d), that

\[ u^{[\infty]}(t, T) = \lim_{N\to\infty} \int_{-\infty}^{\infty} e^{2\pi i f t}\, U(f)\, \frac{\sin\!\left(2\pi fT(N + 1/2)\right)}{\sin(\pi fT)}\, df. \tag{2.83c} \]

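Equations (2.56a) and (2.75e) can now be used to write

\[
\begin{aligned}
u^{[\infty]}(t, T) &= \int_{-\infty}^{\infty} e^{2\pi i f t}\, U(f) \operatorname*{G\,lim}_{N\to\infty} \left\{ \frac{\sin\!\left(2\pi fT(N + 1/2)\right)}{\sin(\pi fT)} \right\} df \\
&= \int_{-\infty}^{\infty} e^{2\pi i f t}\, U(f) \left[ \frac{1}{T} \sum_{k=-\infty}^{\infty} \delta\!\left(f - \frac{k}{T}\right) \right] df
\end{aligned}
\]

or

\[ u^{[\infty]}(t, T) = \sum_{k=-\infty}^{\infty} \left[ T^{-1}\, U\!\left(kT^{-1}\right) \right] e^{2\pi i \frac{kt}{T}}. \tag{2.83d} \]
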
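Equation (2.83d) specifies the Fourier series for an arbitrary periodic function $u^{[\infty]}$, showing that $u^{[\infty]}$ can be written as the infinite sum of complex exponentials multiplied by the complex constants $[T^{-1}\, U(kT^{-1})]$. To get these complex constants directly from $u^{[\infty]}$, we note that for any real number $\tau$ and integer $m$,

\[
\begin{aligned}
\frac{1}{T}\, U\!\left(\frac{m}{T}\right)
&= \frac{1}{T} \int_{-\infty}^{\infty} u(t)\, e^{-2\pi i \frac{m}{T} t}\, dt
 = \frac{1}{T} \lim_{N\to\infty} \int_{\tau - NT}^{\tau + (N+1)T} u(t)\, e^{-2\pi i \frac{m}{T} t}\, dt \\
&= \frac{1}{T} \lim_{N\to\infty} \Bigg[ \int_{\tau - NT}^{\tau - (N-1)T} u(t)\, e^{-2\pi i \frac{m}{T} t}\, dt + \int_{\tau - (N-1)T}^{\tau - (N-2)T} u(t)\, e^{-2\pi i \frac{m}{T} t}\, dt + \cdots \\
&\qquad + \int_{\tau - T}^{\tau} u(t)\, e^{-2\pi i \frac{m}{T} t}\, dt + \int_{\tau}^{\tau + T} u(t)\, e^{-2\pi i \frac{m}{T} t}\, dt + \int_{\tau + T}^{\tau + 2T} u(t)\, e^{-2\pi i \frac{m}{T} t}\, dt + \cdots \\
&\qquad + \int_{\tau + NT}^{\tau + (N+1)T} u(t)\, e^{-2\pi i \frac{m}{T} t}\, dt \Bigg].
\end{aligned}
\]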

This can be simplified to

\[ \frac{1}{T}\, U\!\left(\frac{m}{T}\right) = \frac{1}{T} \lim_{N\to\infty} \sum_{k=-N}^{N} \int_{\tau + kT}^{\tau + (k+1)T} e^{-2\pi i \frac{m}{T} t}\, u(t)\, dt. \tag{2.83e} \]

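For each value of $k$, we change the variable of integration to $t' = t - kT$ so that

\[ \int_{\tau + kT}^{\tau + (k+1)T} e^{-2\pi i \frac{m}{T} t}\, u(t)\, dt = e^{-2\pi i m k} \int_{\tau}^{\tau + T} e^{-2\pi i \frac{m}{T} t'}\, u(t' + kT)\, dt' = \int_{\tau}^{\tau + T} e^{-2\pi i \frac{m}{T} t'}\, u(t' + kT)\, dt', \]

where we use that $e^{-2\pi i m k} = 1$. Substituting this into (2.83e) gives

\[ \frac{1}{T}\, U\!\left(\frac{m}{T}\right) = \frac{1}{T} \lim_{N\to\infty} \sum_{k=-N}^{N} \int_{\tau}^{\tau + T} e^{-2\pi i \frac{m}{T} t'}\, u(t' + kT)\, dt' = \frac{1}{T} \lim_{N\to\infty} \sum_{k'=-N}^{N} \int_{\tau}^{\tau + T} e^{-2\pi i \frac{m}{T} t'}\, u(t' - k'T)\, dt', \]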

where in the last step we have replaced index $k$ by index $k' = -k$. Now, taking the limit inside the integral to get the generalized limit [see Eq. (2.56a) above], we rely on (2.81f) to get

\[ \frac{1}{T}\, U\!\left(\frac{m}{T}\right) = \frac{1}{T} \int_{\tau}^{\tau + T} e^{-2\pi i \frac{m}{T} t'} \left[ \operatorname*{G\,lim}_{N\to\infty} \sum_{k'=-N}^{N} u(t' - k'T) \right] dt' = \frac{1}{T} \int_{\tau}^{\tau + T} e^{-2\pi i \frac{m}{T} t'}\, u^{[\infty]}(t', T)\, dt'. \tag{2.83f} \]

Equations (2.83d) and (2.83f) let us put the Fourier series into its standard form. For any periodic function

\[ v(t) = u^{[\infty]}(t, T) = \sum_{k=-\infty}^{\infty} u(t - kT) \]

of period $T$, we have found that

\[ v(t) = \sum_{k=-\infty}^{\infty} A_k\, e^{2\pi i k \frac{t}{T}}, \tag{2.84a} \]

where

\[ A_k = \frac{1}{T} \int_{\tau}^{\tau + T} e^{-2\pi i \frac{k}{T} t}\, v(t)\, dt \tag{2.84b} \]

for any finite value of $\tau$. Because we did not require $u(t)$ to be real in (2.80a), Eqs. (2.83d), (2.83f), (2.84a), and (2.84b) still hold true for complex periodic functions with real arguments $t$. It is customary, but of course not mandatory, to choose $\tau = 0$ or $\tau = -T/2$ in (2.84b).
Using $v(t) = u^{[\infty]}(t, T)$, we know from Eqs. (2.83d), (2.83f), (2.84a), and (2.84b) that the $A_k$ coefficients can be specified in terms of the forward Fourier transform $U(f)$ of $u(t)$,

\[ A_k = \frac{1}{T}\, U\!\left(\frac{k}{T}\right). \tag{2.85a} \]
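
Equation (2.85a) can be illustrated with a concrete case. The sketch below assumes the Gaussian u(t) = exp(−πt²), whose forward transform under the e^{−2πift} convention of (2.80a) is U(f) = exp(−πf²); this choice of u, the period T = 2, and the midpoint-rule integration are assumptions made here for illustration only. It computes A_k from the integral (2.84b) and compares it with U(k/T)/T, then rebuilds v(t) from a truncated version of the series (2.84a).

```python
import cmath
import math

def u(t):
    """Assumed example function: a Gaussian."""
    return math.exp(-math.pi * t * t)

def U(f):
    """Forward transform of the Gaussian above (kernel exp(-2*pi*i*f*t))."""
    return math.exp(-math.pi * f * f)

def v(t, T, kmax=20):
    """Periodic function of (2.81a), with the infinite sum truncated at |k| <= kmax."""
    return sum(u(t - k * T) for k in range(-kmax, kmax + 1))

def A_numeric(k, T, tau=0.0, steps=2000):
    """Coefficient A_k of (2.84b), estimated with the midpoint rule."""
    h = T / steps
    total = 0j
    for j in range(steps):
        t = tau + (j + 0.5) * h
        total += cmath.exp(-2j * math.pi * k * t / T) * v(t, T)
    return total * h / T

def series(t, T, kmax=10):
    """Truncated Fourier series (2.84a) using A_k = U(k/T)/T from (2.85a)."""
    return sum((U(k / T) / T) * cmath.exp(2j * math.pi * k * t / T)
               for k in range(-kmax, kmax + 1))

T = 2.0
coefficient_errors = [abs(A_numeric(k, T) - U(k / T) / T) for k in range(-3, 4)]
series_error = abs(series(0.3, T) - v(0.3, T))
print(max(coefficient_errors), series_error)
```

Both errors are at the level of floating-point rounding here because the Gaussian and its transform die away so quickly; a slowly decaying u would need larger truncation limits.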

When $u$ is real, which means that $v(t) = u^{[\infty]}(t, T)$ is also real, we know from entry 7 of Table 2.1 (located at the end of this chapter) that $U(f)$ must be Hermitian so that

\[ U(-f) = U(f)^{*}. \]

Hence, when $v(t)$ is real in (2.84a), it then follows from (2.85a) that

\[ A_{-k} = A_k^{*} \tag{2.85b} \]

in (2.84b). This procedure can be extended to all the entries in Table 2.1, giving us the entries in Table 2.2 (also located at the end of this chapter). To go through another example, if $u$ is imaginary and odd, we know from entry 3 of Table 2.1 that $U$ is real and odd, so

\[ U(-f) = -U(f) \quad\text{and}\quad \operatorname{Im}\big(U(f)\big) = 0. \]

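Equation (2.85a) then shows that

\[ A_{-k} = -A_k \quad\text{and}\quad \operatorname{Im}(A_k) = 0. \tag{2.85c} \]

We can show that $v(t) = u^{[\infty]}(t, T)$ is imaginary and odd when $u$ is imaginary and odd (let $k' = -k$),

\[
\begin{aligned}
v(-t) = u^{[\infty]}(-t, T) &= \sum_{k=-\infty}^{\infty} u(-t - kT) = \sum_{k'=-\infty}^{\infty} u(-t + k'T) = -\sum_{k'=-\infty}^{\infty} u(t - k'T) \\
&= -u^{[\infty]}(t, T) = -v(t),
\end{aligned}
\]

and

\[ \operatorname{Re}\big(v(t)\big) = \sum_{k=-\infty}^{\infty} \operatorname{Re}\big(u(t - kT)\big) = 0. \]

This shows that we end up with (2.85c) associated with $v(t)$ being imaginary and odd, as stated in entry 3 of Table 2.2.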
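A final point worth mentioning about Fourier series is that the $A_k$ coefficients are often reshuffled so that the series can be written as a sum of sines and cosines. Equation (2.84a) can be rewritten as, using $e^{i\theta} = \cos\theta + i\sin\theta$,

\[
\begin{aligned}
v(t) &= A_0 + \sum_{|k|=1}^{\infty} \left[ A_{|k|}\, e^{2\pi i |k| \frac{t}{T}} + A_{-|k|}\, e^{-2\pi i |k| \frac{t}{T}} \right] \\
&= A_0 + \sum_{|k|=1}^{\infty} \left( A_{|k|} + A_{-|k|} \right) \cos\!\left(\frac{2\pi |k| t}{T}\right) + \sum_{|k|=1}^{\infty} i \left( A_{|k|} - A_{-|k|} \right) \sin\!\left(\frac{2\pi |k| t}{T}\right).
\end{aligned}
\tag{2.86a}
\]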

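From Eq. (2.84b), we get

\[ A_0 = \frac{1}{T} \int_{\tau}^{\tau + T} v(t)\, dt, \tag{2.86b} \]

\[ A_{|k|} + A_{-|k|} = \frac{1}{T} \int_{\tau}^{\tau + T} v(t) \left[ e^{-2\pi i |k| \frac{t}{T}} + e^{2\pi i |k| \frac{t}{T}} \right] dt = \frac{2}{T} \int_{\tau}^{\tau + T} v(t) \cos\!\left(\frac{2\pi |k| t}{T}\right) dt, \tag{2.86c} \]

and

\[ i \left( A_{|k|} - A_{-|k|} \right) = \frac{i}{T} \int_{\tau}^{\tau + T} v(t) \left[ e^{-2\pi i |k| \frac{t}{T}} - e^{2\pi i |k| \frac{t}{T}} \right] dt = \frac{2}{T} \int_{\tau}^{\tau + T} v(t) \sin\!\left(\frac{2\pi |k| t}{T}\right) dt. \tag{2.86d} \]
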
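Putting these results together, we can write

\[ v(t) = \frac{c_0}{2} + \sum_{k=1}^{\infty} c_k \cos\!\left(\frac{2\pi kt}{T}\right) + \sum_{k=1}^{\infty} s_k \sin\!\left(\frac{2\pi kt}{T}\right), \tag{2.87a} \]

where

\[ c_k = \frac{2}{T} \int_{\tau}^{\tau + T} v(t) \cos\!\left(\frac{2\pi kt}{T}\right) dt \quad\text{for } k = 0, 1, 2, \ldots \tag{2.87b} \]

and

\[ s_k = \frac{2}{T} \int_{\tau}^{\tau + T} v(t) \sin\!\left(\frac{2\pi kt}{T}\right) dt \quad\text{for } k = 1, 2, 3, \ldots. \tag{2.87c} \]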
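
As an illustration of (2.87b) and (2.87c), the sketch below estimates the cosine and sine coefficients of a simple periodic function by numerical integration. The example function `v`, the period T = 2, and the midpoint rule are assumptions chosen so that the expected coefficients can be read off by inspection: c_0 = 3 (so c_0/2 = 1.5), c_1 = 2, s_3 = −0.5, and all others zero.

```python
import math

def cosine_coefficient(v, k, T, tau=0.0, steps=4000):
    """c_k of (2.87b), estimated with the midpoint rule on [tau, tau + T]."""
    h = T / steps
    total = 0.0
    for j in range(steps):
        t = tau + (j + 0.5) * h
        total += v(t) * math.cos(2.0 * math.pi * k * t / T)
    return 2.0 * total * h / T

def sine_coefficient(v, k, T, tau=0.0, steps=4000):
    """s_k of (2.87c), estimated with the midpoint rule on [tau, tau + T]."""
    h = T / steps
    total = 0.0
    for j in range(steps):
        t = tau + (j + 0.5) * h
        total += v(t) * math.sin(2.0 * math.pi * k * t / T)
    return 2.0 * total * h / T

T = 2.0

def v(t):
    """Assumed example: constant 1.5, one cosine term, one sine term."""
    return (1.5
            + 2.0 * math.cos(2.0 * math.pi * t / T)
            - 0.5 * math.sin(2.0 * math.pi * 3.0 * t / T))

c = [cosine_coefficient(v, k, T) for k in range(5)]      # c_0 .. c_4
s = [sine_coefficient(v, k, T) for k in range(1, 5)]     # s_1 .. s_4
print(c, s)
```

Changing `tau` leaves the coefficients unchanged, which is the numerical counterpart of the statement that (2.84b) holds for any finite value of τ.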

The absolute value signs are dropped from index $k$ because it is defined positive in (2.87a), and $A_0$ is replaced by $c_0/2$ so that the formula for $c_0$ can be folded into the general formula for $c_k$ in (2.87b). Although it is still not mandatory, parameter $\tau$ is usually given the value 0 or $-T/2$.
Nowhere has $v$ been required to be real, so Eqs. (2.87a)-(2.87c), just like Eqs. (2.84a) and (2.84b), still hold true when $v$ is a complex-valued periodic function of (real) period $T$. Indeed, if $v$ is a complex-valued function of a real argument $t$, both its real part

\[ v_R(t) = \operatorname{Re}\big(v(t)\big) \]

and its imaginary part

\[ v_I(t) = \operatorname{Im}\big(v(t)\big) \]

are real-valued periodic functions of period $T$. This means that when, for any integer $m$, we have

\[ v(t - mT) = v(t) \tag{2.88a} \]

for a complex-valued function $v$ of a real argument, then

\[ v_R(t - mT) = v_R(t) \tag{2.88b} \]

and

\[ v_I(t - mT) = v_I(t). \tag{2.88c} \]

Since sines and cosines of real arguments are strictly real, we can now take the real and imaginary parts of (2.87a)-(2.87c) to get

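\[ v_R(t) = \frac{\operatorname{Re}(c_0)}{2} + \sum_{k=1}^{\infty} \left[ \operatorname{Re}(c_k) \right] \cos\!\left(\frac{2\pi kt}{T}\right) + \sum_{k=1}^{\infty} \left[ \operatorname{Re}(s_k) \right] \sin\!\left(\frac{2\pi kt}{T}\right), \tag{2.89a} \]

with

\[ \operatorname{Re}(c_k) = \frac{2}{T} \int_{\tau}^{\tau + T} v_R(t) \cos\!\left(\frac{2\pi kt}{T}\right) dt \quad\text{for } k = 0, 1, 2, \ldots \tag{2.89b} \]

and

\[ \operatorname{Re}(s_k) = \frac{2}{T} \int_{\tau}^{\tau + T} v_R(t) \sin\!\left(\frac{2\pi kt}{T}\right) dt \quad\text{for } k = 1, 2, 3, \ldots, \tag{2.89c} \]

as well as

\[ v_I(t) = \frac{\operatorname{Im}(c_0)}{2} + \sum_{k=1}^{\infty} \left[ \operatorname{Im}(c_k) \right] \cos\!\left(\frac{2\pi kt}{T}\right) + \sum_{k=1}^{\infty} \left[ \operatorname{Im}(s_k) \right] \sin\!\left(\frac{2\pi kt}{T}\right), \tag{2.90a} \]

with

\[ \operatorname{Im}(c_k) = \frac{2}{T} \int_{\tau}^{\tau + T} v_I(t) \cos\!\left(\frac{2\pi kt}{T}\right) dt \quad\text{for } k = 0, 1, 2, \ldots \tag{2.90b} \]

and

\[ \operatorname{Im}(s_k) = \frac{2}{T} \int_{\tau}^{\tau + T} v_I(t) \sin\!\left(\frac{2\pi kt}{T}\right) dt \quad\text{for } k = 1, 2, 3, \ldots. \tag{2.90c} \]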
2.21 Discrete Fourier Transform
The first step in going from the integral Fourier transform to the discrete Fourier transform is to repeat the procedure used in Sec. 2.20 to get the Fourier series. We pick a nonpathological function $u(t)$ having a forward Fourier transform

\[ U(f) = \int_{-\infty}^{\infty} u(t)\, e^{-2\pi i f t}\, dt \tag{2.91a} \]

and, following the same procedure used in Eq. (2.81a) above, create a periodic function of period $T$:

\[ u^{[\infty]}(t, T) = \sum_{k=-\infty}^{\infty} u(t - kT). \tag{2.91b} \]

As was shown in Sec. 2.20, we can now write the associated Fourier series as [see Eq. (2.83d)]

\[ u^{[\infty]}(t, T) = \frac{1}{T} \sum_{k=-\infty}^{\infty} U\!\left(\frac{k}{T}\right) e^{2\pi i \frac{kt}{T}}, \tag{2.91c} \]

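where, as specified in (2.91a), $U$ is the forward Fourier transform of $u$.
Next we divide the period $T$ of $u^{[\infty]}$ into $N$ equal lengths, $\Delta t = T/N$, and evaluate (2.91c) only for $t = m\,\Delta t$ with $m = 0, 1, 2, \ldots, N - 1$,

\[ u^{[\infty]}(m\,\Delta t, T) = \frac{1}{T} \sum_{k=-\infty}^{\infty} U\!\left(\frac{k}{T}\right) e^{2\pi i \frac{km}{N}}, \tag{2.92a} \]

where we have used

\[ N\,\Delta t = T \tag{2.92b} \]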

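to simplify the exponent of (2.92a). The infinite sum in (2.92a) can be split in two by making the substitution $k = n + rN$ with $n = 0, 1, 2, \ldots, N - 1$ and $r = 0, \pm 1, \pm 2, \ldots$. This gives

\[ u^{[\infty]}(m\,\Delta t, T) = \frac{1}{T} \sum_{r=-\infty}^{\infty} \sum_{n=0}^{N-1} U\!\left(\frac{n + rN}{T}\right) e^{2\pi i \frac{nm}{N}}\, e^{2\pi i r m}. \]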

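Since $e^{2\pi i r m} = 1$ and $T = N\,\Delta t$, this becomes, making the index substitution $r' = -r$,

\[ u^{[\infty]}(m\,\Delta t, T) = \frac{1}{T} \sum_{n=0}^{N-1} e^{2\pi i \frac{nm}{N}} \sum_{r'=-\infty}^{\infty} U\!\left(\frac{n}{T} - \frac{r'}{\Delta t}\right) \]

or

\[ u^{[\infty]}(m\,\Delta t, T) = \frac{1}{T} \sum_{n=0}^{N-1} e^{2\pi i \frac{nm}{N}}\, U^{[\infty]}\!\left(\frac{n}{T}, \frac{1}{\Delta t}\right), \tag{2.93a} \]

where we follow the pattern of Eqs. (2.81a) and (2.91b) and define

\[ U^{[\infty]}(f, F) = \sum_{r=-\infty}^{\infty} U(f - rF) \tag{2.93b} \]

for any two frequencies $f$ and $F$.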
Equation (2.93a) is a somewhat disguised version of the discrete Fourier transform (DFT). Figures 2.11(a) and 2.11(b) show the relationship of the two periodic functions $u^{[\infty]}$ and $U^{[\infty]}$, graphed with solid lines, to the two original functions $u$ and $U$ graphed with dashed lines. [In graphs such as these, $u(t)$ typically stands for data and is usually real, making it easy to represent with a two-dimensional plot; but its transform $U(f)$ is often complex, so it makes more sense to plot $|U(f)|$ if we just want to show where $U(f)$ is different from zero.] When function $u^{[\infty]}$ has period $T$ and is uniformly sampled at intervals of $\Delta t$, then function $U^{[\infty]}$ has period

\[ F = \frac{1}{\Delta t} \tag{2.93c} \]

and is uniformly sampled at intervals of

\[ \Delta f = \frac{1}{T}. \tag{2.93d} \]

Note, of course, we could also say that $u^{[\infty]}$ has period $1/\Delta f$ and is uniformly sampled at intervals of $1/F$ when $U^{[\infty]}$ has period $F$ and is sampled at intervals of $\Delta f$. When both $\Delta f$ and $\Delta t$ are known, we have from (2.92b) and (2.93d) that

\[ \Delta f\, \Delta t = \frac{1}{N}. \tag{2.93e} \]

Figures 2.12(a) and 2.12(b) show that if $T$ and $F$ are large and functions $u(t)$ and $U(f)$ die away relatively quickly when $|t|$ and $|f|$ are large, which means that $u$ and $U$ are localized near the $t$ and $f$ origins, then the corresponding periodic functions $u^{[\infty]}(t, T)$ and $U^{[\infty]}(f, F)$ can be used to approximate the non-negligible regions of $u$ and $U$. Almost always when the DFT is used, its users have in mind a situation such as that shown in Figs. 2.12(a) and 2.12(b), with $u^{[\infty]}$ and $U^{[\infty]}$ being good approximations of $u$ and $U$ for small to moderately large values of $t$ and $f$.
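To complete the DFT transform pair, we define

\[ w_N = e^{\frac{2\pi i}{N}} \tag{2.94a} \]

and write (2.93a) as

\[ u^{[\infty]}(m\,\Delta t, T) = \frac{1}{T} \sum_{n=0}^{N-1} w_N^{nm}\, U^{[\infty]}\!\left(\frac{n}{T}, \frac{1}{\Delta t}\right). \tag{2.94b} \]

Multiplying both sides by $w_N^{-mk}$ and summing over $m$ gives

\[ \sum_{m=0}^{N-1} u^{[\infty]}(m\,\Delta t, T)\, w_N^{-mk} = \frac{1}{T} \sum_{n=0}^{N-1} U^{[\infty]}\!\left(\frac{n}{T}, \frac{1}{\Delta t}\right) \sum_{m=0}^{N-1} w_N^{m(n-k)}. \tag{2.94c} \]
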
The sum over $m$ on the right-hand side is the sum of a geometric series,

\[ V_{n,k}^{[N]} = \sum_{m=0}^{N-1} w_N^{m(n-k)}. \tag{2.94d} \]

This can be solved using the standard procedure for geometric sums [see the analysis following Eq. (2.77b) above], multiplying every term in the sum by $w_N^{n-k}$ to get

\[ V_{n,k}^{[N]}\, w_N^{n-k} = V_{n,k}^{[N]} + w_N^{N(n-k)} - 1. \tag{2.94e} \]

Solving for $V_{n,k}^{[N]}$ gives

\[ V_{n,k}^{[N]} = \frac{w_N^{N(n-k)} - 1}{w_N^{n-k} - 1} = \frac{e^{2\pi i (n-k)} - 1}{e^{2\pi i \frac{n-k}{N}} - 1}, \tag{2.94f} \]

where in the last step definition (2.94a) is used to eliminate $w_N$. Index $n$ goes from zero to $N - 1$ for each value of $k$ [see Eqs. (2.94b) and (2.94c)]. Deciding also to restrict $k$ to one of the integers $k = 0, 1, 2, \ldots, N - 1$, we see that the denominator in (2.94f) can be zero only when $n = k$. This looks like it could be a problem, but when $n = k$, we can return to the original formula in (2.94d), noting that for $n = k$ the sum $V_{n,k}^{[N]}$ is equal to $N$. When $n \neq k$, the right-hand side of (2.94f) shows that $V_{n,k}^{[N]}$ is zero because $e^{2\pi i (n-k)} = 1$. We conclude that

\[ V_{n,k}^{[N]} = \begin{cases} N & \text{for } n = k \\ 0 & \text{for } n \neq k \end{cases} \;=\; N\,\delta_{k,n}, \tag{2.94g} \]

where $\delta_{k,n}$ is the Kronecker delta,

\[ \delta_{k,n} = \begin{cases} 1 & \text{for } n = k \\ 0 & \text{for } n \neq k. \end{cases} \tag{2.94h} \]

[FIGURE 2.11(a), FIGURE 2.11(b). Plots of $u^{[\infty]}(t, T)$ versus $t$ and $U^{[\infty]}(f, F)$ versus $f$, with the sampling intervals $\Delta t = 1/F$ and $\Delta f = 1/T$ and the periods $T = 1/\Delta f$ and $F = 1/\Delta t$ marked.]
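
The orthogonality result (2.94g) can be confirmed by brute force. This minimal sketch evaluates the geometric sum (2.94d) directly for every pair (n, k) with a small N; the value N = 8 is an arbitrary choice for illustration.

```python
import cmath
import math

def V(n, k, N):
    """Geometric sum of (2.94d): sum over m = 0..N-1 of w_N**(m*(n - k))."""
    w = cmath.exp(2j * math.pi / N)          # w_N of (2.94a)
    return sum(w ** (m * (n - k)) for m in range(N))

N = 8
worst = 0.0
for n in range(N):
    for k in range(N):
        expected = N if n == k else 0        # N times the Kronecker delta
        worst = max(worst, abs(V(n, k, N) - expected))
print(worst)
```

The printed value is the largest deviation from N·δ over all index pairs, which should be at the level of floating-point rounding.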

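Substitution of (2.94d) into (2.94c) gives

\[ \sum_{m=0}^{N-1} u^{[\infty]}(m\,\Delta t, T)\, w_N^{-mk} = \frac{1}{T} \sum_{n=0}^{N-1} U^{[\infty]}\!\left(\frac{n}{T}, \frac{1}{\Delta t}\right) V_{n,k}^{[N]}. \]

Substituting from (2.94g), we get

\[ \frac{N}{T}\, U^{[\infty]}\!\left(\frac{k}{T}, \frac{1}{\Delta t}\right) = \sum_{m=0}^{N-1} u^{[\infty]}(m\,\Delta t, T)\, w_N^{-mk}. \tag{2.94i} \]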

This equation is the other half of the DFT [the first half is specified by Eqs. (2.94a) and (2.94b)]. Using Eqs. (2.94a) and (2.92b) to replace $w_N$ by $e^{(2\pi i)/N}$ and $N/T$ by $1/\Delta t$, we write (2.94b) and (2.94i) as

\[ u^{[\infty]}(m\,\Delta t, T) = \frac{1}{T} \sum_{n=0}^{N-1} e^{2\pi i \frac{mn}{N}}\, U^{[\infty]}\!\left(\frac{n}{T}, \frac{1}{\Delta t}\right) \tag{2.95a} \]

and

\[ U^{[\infty]}\!\left(\frac{n}{T}, \frac{1}{\Delta t}\right) = \Delta t \sum_{m=0}^{N-1} u^{[\infty]}(m\,\Delta t, T)\, e^{-2\pi i \frac{mn}{N}}, \tag{2.95b} \]

where index $k$ has been replaced by $n$ in (2.94i). This can also be written as, using Eqs. (2.93c) and (2.93d),

\[ u^{[\infty]}(m\,\Delta t, T) = \Delta f \sum_{n=0}^{N-1} e^{2\pi i \frac{mn}{N}}\, U^{[\infty]}(n\,\Delta f, F) \tag{2.95c} \]

and

\[ U^{[\infty]}(n\,\Delta f, F) = \Delta t \sum_{m=0}^{N-1} u^{[\infty]}(m\,\Delta t, T)\, e^{-2\pi i \frac{mn}{N}}. \tag{2.95d} \]

[FIGURE 2.12(a), FIGURE 2.12(b). Plots of $u^{[\infty]}(t, T)$ versus $t$ and $U^{[\infty]}(f, F)$ versus $f$, with $\Delta t = 1/F$, $\Delta f = 1/T$, $T = 1/\Delta f$, and $F = 1/\Delta t$ marked, along with the region over which $u^{[\infty]} \cong u$ and the region over which $U^{[\infty]} \cong U$.]

The forward and inverse DFTs shown in (2.95c) and (2.95d) are often written as

\[ u_m = \sum_{n=0}^{N-1} U_n\, e^{2\pi i \frac{mn}{N}} \tag{2.96a} \]

and

\[ U_n = \frac{1}{N} \sum_{m=0}^{N-1} u_m\, e^{-2\pi i \frac{mn}{N}}. \tag{2.96b} \]

To get Eq. (2.96a) from (2.95c), we define

\[ u_m = u^{[\infty]}(m\,\Delta t, T) \tag{2.96c} \]

and

\[ U_n = \Delta f\; U^{[\infty]}(n\,\Delta f, F), \tag{2.96d} \]

and to get Eq. (2.96b), both sides of (2.95d) are multiplied by $\Delta f$, using (2.93e) to replace $\Delta f\,\Delta t$ by $1/N$. We can also define

\[ U_n = U^{[\infty]}(n\,\Delta f, F) \tag{2.97a} \]

and

\[ u_m = \Delta t\; u^{[\infty]}(m\,\Delta t, T) \tag{2.97b} \]

to transform Eqs. (2.95c) and (2.95d) into

\[ u_m = \frac{1}{N} \sum_{n=0}^{N-1} U_n\, e^{2\pi i \frac{mn}{N}} \tag{2.97c} \]

and

\[ U_n = \sum_{m=0}^{N-1} u_m\, e^{-2\pi i \frac{mn}{N}}, \tag{2.97d} \]

where now we have multiplied both sides of (2.95c) by $\Delta t$ before replacing $\Delta f\,\Delta t$ by $1/N$.
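
The DFT pair (2.97c)-(2.97d) and its link to the integral transform through (2.95d) can be sketched in a few lines. The code below is an illustration under stated assumptions rather than a practical implementation: it uses direct O(N²) sums instead of an FFT, and the Gaussian u(t) = exp(−πt²) with transform U(f) = exp(−πf²), the period T = 8, and N = 64 are choices made here so that overlap between neighboring copies is negligible.

```python
import cmath
import math

def dft_forward(u):
    """U_n = sum over m of u_m * exp(-2*pi*i*m*n/N), as in (2.97d)."""
    N = len(u)
    return [sum(u[m] * cmath.exp(-2j * math.pi * m * n / N) for m in range(N))
            for n in range(N)]

def dft_inverse(U):
    """u_m = (1/N) * sum over n of U_n * exp(+2*pi*i*m*n/N), as in (2.97c)."""
    N = len(U)
    return [sum(U[n] * cmath.exp(2j * math.pi * m * n / N) for n in range(N)) / N
            for m in range(N)]

# Round trip on arbitrary illustrative data.
data = [1.0, 2.0, 0.5, -1.0, 0.0, 3.0, -2.0, 0.25]
round_trip = dft_inverse(dft_forward(data))

# Approximating the integral transform of an assumed Gaussian via (2.95d).
def u(t):
    return math.exp(-math.pi * t * t)

T, N = 8.0, 64
dt = T / N            # sampling interval, Eq. (2.92b)
df = 1.0 / T          # frequency spacing, Eq. (2.93d)

# Samples of the periodic function u^[inf](m*dt, T), Eq. (2.91b), truncated to |k| <= 3.
u_periodic = [sum(u(m * dt - k * T) for k in range(-3, 4)) for m in range(N)]

# Eq. (2.95d): U^[inf](n*df, F) = dt * sum over m of u^[inf](m*dt, T) * exp(-2*pi*i*m*n/N).
U_estimate = [dt * value for value in dft_forward(u_periodic)]

for n in range(5):
    print(n * df, U_estimate[n].real, math.exp(-math.pi * (n * df) ** 2))
```

In practice the direct sums would be replaced by an FFT routine; the direct form is used here only because it mirrors the equations term by term.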
Figures 2.13(a) and 2.13(b) show how the $u^{[\infty]}$ and $U^{[\infty]}$ continuous functions are sampled to create the DFT formulas in the previous paragraph. The values of the original functions $u$ and $U$ are ignored for negative values of $t$ and $f$; instead, we sample $u^{[\infty]}$ and $U^{[\infty]}$ out to $t = T$ and $f = F$, picking up the original $u$ and $U$ values at negative $t$ and $f$ where they repeat near $t = T$ and $f = F$. Many times DFT plots show $u_m$ and $U_n$ with $n$ and $m$ running from 0 to $N - 1$. When this is done, it is with the understanding that the large index values greater than $N/2$ represent $u$ and $U$ for negative $t$ and $f$ values respectively.
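
The index convention just described can be made explicit with a small helper. This is a sketch only; the function name `signed_frequency` is hypothetical, and it simply reads indices greater than N/2 as negative frequencies, using the fact that $U^{[\infty]}$ has period $F = N\,\Delta f$.

```python
def signed_frequency(n, N, delta_f):
    """Frequency represented by DFT index n (0 <= n <= N - 1).

    Indices greater than N/2 are read as negative frequencies, because
    U^[inf](n*delta_f, F) repeats with period F = N*delta_f.
    """
    if n > N / 2:
        return (n - N) * delta_f
    return n * delta_f

N, delta_f = 8, 0.5
frequencies = [signed_frequency(n, N, delta_f) for n in range(N)]
print(frequencies)
```

The same mapping, with `delta_f` replaced by the sampling interval in t, gives the signed t value represented by index m.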
2.22 Aliasing as an Error
The DFT is important because there is an algorithm, called the fast Fourier transform (FFT), that allows computers to calculate the sums in Eqs. (2.96a), (2.96b), (2.97c), and (2.97d) rapidly when $N$ is a multiple of 2. The FFT performs best when $N = 2^j$ for $j$ a positive integer. In fact, when faced with calculating an integral Fourier transform

\[ U(f) = \int_{-\infty}^{\infty} u(t)\, e^{-2\pi i f t}\, dt \]

over a range of $f$ values for an arbitrary function $u(t)$, it is standard practice to convert the integral to a DFT and do the job on a computer with an FFT. As we saw in the previous section, the DFT deals directly with $u^{[\infty]}$ and $U^{[\infty]}$ rather than $u$ and $U$. Thus, successfully using the DFT to calculate the integral transform requires that $u^{[\infty]}$ and $U^{[\infty]}$ consist of well-separated, repetitive regions of $u$ and $U$, as shown in Figs. 2.12(a) and 2.12(b), instead of overlapping regions of $u$ and $U$, as shown in Figs. 2.11(a) and 2.11(b). Ensuring that $u^{[\infty]}$ consists of nonoverlapping regions of $u$ tends to occur naturally; the shape of $u$ is already known so there is no real difficulty in picking $T$ large enough to prevent significant amounts of overlap in $u^{[\infty]}$. The shape of $U$, however, is not known in advance, so care must be taken to avoid significant amounts of overlap in U.
Consider what happens when the DFT is used to analyze a real signal $u(t)$ having the spectrum $U(f)$ and we know that $U(f)$ is zero for all $f > f_{\max}$ and nonzero for $0 < f < f_{\max}$. Because $u$ is real, we know from entry 7 in Table 2.1 that $U(-f) = U(f)^{*}$, ensuring that $U(f)$ is also nonzero for negative frequency values $0 > f > -f_{\max}$; that is, for every positive $f$ at which $U$ is nonzero there must be a $-f$ at which $U$ is nonzero, and because $U$ is zero for $f > f_{\max}$ it follows that $U$ is zero for all $f \le -f_{\max}$. Hence $U$ can be represented schematically by the solid triangle centered on the origin of Fig. 2.14. To construct $U^{[\,]}$, we write

$$U^{[\,]}(f,F) = \sum_{k=-\infty}^{\infty} U(f - kF), \tag{2.98a}$$

where the smallest we can make $F$ and still avoid overlap is, as shown by the dotted triangles in Fig. 2.14,

$$F = 2 f_{\max}. \tag{2.98b}$$

FIGURE 2.13(a). [Plot of $u^{[\,]}(t,T)$ versus $t$, marking the sample spacing $\Delta t = 1/F$, the repeat distance $T = 1/\Delta f$, and the region over which $u^{[\,]} = u$.]
FIGURE 2.13(b). [Plot of $U^{[\,]}(f,F)$ versus $f$, marking the sample spacing $\Delta f = 1/T$, the repeat distance $F = 1/\Delta t$, and the region over which $U^{[\,]} = U$.]

From Eq. (2.93c), we see that in Fig. 2.14

$$F = \frac{1}{\Delta t},$$

where $\Delta t$ is the interval in $t$ between adjacent samples of $u(t)$. If $\Delta t$ is made smaller, then $F$ increases, moving the regions of nonzero $U$ further apart in Fig. 2.14; and if $\Delta t$ is made larger, then $F$ decreases, forcing the regions of nonzero $U$ to overlap in Fig. 2.14. Making $\Delta t$ smaller is wasteful, in that more effort than is needed goes into sampling $u(t)$, and making $\Delta t$ larger damages the integrity of the $U$ calculations for large values of $f$ near $f_{\max}$. Clearly, the frequency value $F/2$ plays an important role in DFT analysis, because optimum performance requires $f_{\max} = F/2$. For this reason frequency $F/2$ is given a special name: the Nyquist frequency $f_{Nyq} = F/2$. From (2.93c), we see that

$$f_{Nyq} = \frac{1}{2\Delta t}. \tag{2.99a}$$

A realistic system, of course, is designed with some built-in margin for error. The requirement then becomes that $\Delta t$ be small enough to separate unexpectedly high frequencies when the highest expected frequency is $f_{\max}$. To provide this margin, we take

$$f_{Nyq} = \frac{1}{2\Delta t} > f_{\max} \tag{2.99b}$$

or

$$\Delta t < \frac{1}{2 f_{\max}}. \tag{2.99c}$$

Now the region between $f_{\max}$ and $f_{Nyq}$ is available for analysis of unexpectedly high frequencies.
Suppose $U(f)$ is negligible everywhere except at two frequencies, the positive frequency $f_0$ and the corresponding negative frequency $(-f_0)$. Since $U(f)$ is the transform of a real signal, entry 7 of Table 2.1 requires $U(-f) = U(f)^{*}$, forcing the existence of a non-negligible transform value at $(-f_0)$ when there is a non-negligible transform value at $f_0$. The two frequencies are represented by wide, solid-sided arrows in Fig. 2.15. The arrows represent isolated, narrow regions where $U$ is very large, so we can think of them as proportional to delta functions and write $U(f)$ as

$$U(f) = A\,\delta(f - f_0) + B\,\delta(f + f_0).$$

Variables $A$ and $B$ are arbitrary complex constants. We have just seen that Table 2.1 requires $U(-f) = U(f)^{*}$. Because the delta functions are real, the equation $U(-f) = U(f)^{*}$ can be written as

$$A\,\delta(-f - f_0) + B\,\delta(-f + f_0) = A^{*}\,\delta(f - f_0) + B^{*}\,\delta(f + f_0)$$

or, since the delta functions are also even [see Eq. (2.68a)],

$$A\,\delta(f + f_0) + B\,\delta(f - f_0) = A^{*}\,\delta(f - f_0) + B^{*}\,\delta(f + f_0).$$

This can only be true if $A = B^{*}$ (which is, of course, the same thing as having $B = A^{*}$). Therefore, we have the freedom to choose only one arbitrary complex constant, say $A$, and after making that choice function $U(f)$ becomes


$$U(f) = A\,\delta(f - f_0) + A^{*}\,\delta(f + f_0). \tag{2.100a}$$

FIGURE 2.14. [Schematic plot versus $f$: $U(f)$ drawn as a solid triangle centered on the origin between $-f_{\max}$ and $f_{\max}$, with the dotted triangles of $U^{[\,]}(f,F)$ centered on $-F$ and $F$.]

It is not difficult to figure out what happens when the DFT is used to calculate this double-delta frequency spectrum. If the double-delta $U(f)$ is used to construct $U^{[\,]}(f,F)$ according to formula (2.98a), we get multiple isolated regions where $U^{[\,]}$ is very large, as shown by the wide dashed arrows in Fig. 2.15. The curved single arrows show which wide dashed arrows come from the wide, solid-sided arrow at $f_0$ and which wide dashed arrows come from the wide solid-sided arrow at $(-f_0)$. For example, the wide dashed arrow closest to $f_0$ comes from the wide solid-sided arrow at $(-f_0)$, and the wide dashed arrow closest to $(-f_0)$ comes from the wide solid-sided arrow at $f_0$. The two wide solid-sided arrows at $f_0$ and $-f_0$ lie a distance $a$ inside the positions of the positive and negative Nyquist frequencies $f_{Nyq}$ and $-f_{Nyq}$, and the two wide dashed arrows that are closest to $f_0$ and $-f_0$ lie a distance $a$ outside the positive and negative Nyquist frequencies $f_{Nyq}$ and $-f_{Nyq}$. We see that the original double-delta $U(f)$ transform can be written as [from Eq. (2.100a)]

$$U(f) = A\,\delta(f - f_{Nyq} + a) + A^{*}\,\delta(f + f_{Nyq} - a), \tag{2.100b}$$

and we can pair up the two wide dashed arrows closest to $f_0$ and $-f_0$ to create the transform

$$U^{[1]}(f) = A^{*}\,\delta(f - f_{Nyq} - a) + A\,\delta(f + f_{Nyq} + a). \tag{2.100c}$$

Because the delta function $\delta(f + f_0) = \delta(f + f_{Nyq} - a)$ has the coefficient $A^{*}$ in (2.100b), the curved single arrow going from $(-f_0)$ to $f_{Nyq} + a$ shows that the delta function $\delta(f - f_{Nyq} - a)$ at $f_{Nyq} + a$ must have the coefficient $A^{*}$ in Eq. (2.100c); similarly, the curved single arrow going from $f_0$ to $-f_{Nyq} - a$ shows that the delta function $\delta(f + f_{Nyq} + a)$ at $-f_{Nyq} - a$ must have the coefficient $A$ in Eq. (2.100c). Nothing stops us from continuing out from the origin, pairing the wide dashed arrows at $f = 3f_{Nyq} - a$ and $f = -3f_{Nyq} + a$ to get

$$U^{[2]}(f) = A\,\delta(f - 3f_{Nyq} + a) + A^{*}\,\delta(f + 3f_{Nyq} - a) \tag{2.100d}$$

and pairing the wide dashed arrows at $f = 3f_{Nyq} + a$ and $f = -3f_{Nyq} - a$ to get

$$U^{[3]}(f) = A^{*}\,\delta(f - 3f_{Nyq} - a) + A\,\delta(f + 3f_{Nyq} + a). \tag{2.100e}$$
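FIGURE 2.15. [Schematic plot versus $f$: wide solid-sided arrows at frequencies $f_0$ and $-f_0$, each a distance $a$ inside $f_{Nyq}$ and $-f_{Nyq}$; wide dashed arrows for the aliases, the nearest lying a distance $a$ outside $f_{Nyq}$ and $-f_{Nyq}$; curved single arrows linking each dashed arrow to its source; $F = 2f_{Nyq}$.]

Each time, the curved single arrows in Fig. 2.15 are consulted to find the coefficients of the delta functions. This can obviously be continued out to indefinitely large values of $f$, creating the paired transforms $U^{[4]}$, $U^{[5]}$, etc. The general formula for $U^{[k]}$ turns out to be

$$U^{[k]}(f) = \begin{cases} A\,\delta(f - f_{Nyq} - k f_{Nyq} + a) + A^{*}\,\delta(f + f_{Nyq} + k f_{Nyq} - a) & \text{for even } k \\[4pt] A^{*}\,\delta(f - f_{Nyq} - (k-1) f_{Nyq} - a) + A\,\delta(f + f_{Nyq} + (k-1) f_{Nyq} + a) & \text{for odd } k. \end{cases} \tag{2.100f}$$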
We started out with the double-delta $U(f)$ being the forward Fourier transform of $u(t)$, which means that $u(t)$ is the inverse Fourier transform of the double-delta $U(f)$,

$$u(t) = \int_{-\infty}^{\infty} U(f)\,e^{2\pi i f t}\,df.$$

We now show that $u(t)$, the inverse transform of the double-delta $U(f)$, and $u^{[1]}(t), u^{[2]}(t), \ldots$, the inverse transforms of $U^{[1]}, U^{[2]}, \ldots$, all have the same values at $t = m\Delta t$ for $m = 0, \pm 1, \pm 2, \ldots$,

$$u(m\Delta t) = u^{[1]}(m\Delta t) = u^{[2]}(m\Delta t) = \cdots = u^{[k]}(m\Delta t) = \cdots. \tag{2.100g}$$

We begin by taking the inverse Fourier transform of the double-delta $U(f)$ function specified in (2.100b),

$$\begin{aligned} u(t) &= \int_{-\infty}^{\infty} \left[A\,\delta(f - f_{Nyq} + a) + A^{*}\,\delta(f + f_{Nyq} - a)\right] e^{2\pi i f t}\,df \\ &= A\,e^{2\pi i t (f_{Nyq} - a)} + A^{*} e^{2\pi i t (a - f_{Nyq})} = 2\,\mathrm{Re}\!\left[A\,e^{2\pi i t (f_{Nyq} - a)}\right]. \end{aligned} \tag{2.101a}$$

Similarly, we can take the inverse Fourier transform of $U^{[k]}(f)$ in (2.100f) to get

$$u^{[k]}(t) = \begin{cases} 2\,\mathrm{Re}\!\left[A\,e^{2\pi i t (f_{Nyq} + k f_{Nyq} - a)}\right] & \text{for even } k \\[4pt] 2\,\mathrm{Re}\!\left[A\,e^{-2\pi i t (f_{Nyq} + (k-1) f_{Nyq} + a)}\right] & \text{for odd } k. \end{cases} \tag{2.101b}$$

Substituting $t = m\Delta t$ from (2.100g) and $f_{Nyq} = 1/(2\Delta t)$ from (2.99a) into Eq. (2.101a) gives

$$\begin{aligned} u(m\Delta t) &= 2\,\mathrm{Re}\!\left[A\,e^{2\pi i m \Delta t\,((2\Delta t)^{-1} - a)}\right] = 2\,\mathrm{Re}\!\left[A\,e^{i\pi m} e^{-2\pi i m a \Delta t}\right] \\ &= 2\,\mathrm{Re}\!\left[(-1)^m A\,e^{-2\pi i m a \Delta t}\right]. \end{aligned} \tag{2.101c}$$

Making the same substitutions into Eq. (2.101b) gives
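
$$u^{[k]}(m\Delta t) = \begin{cases} 2\,\mathrm{Re}\!\left[A\,e^{2\pi i m \Delta t\,((2\Delta t)^{-1} + k(2\Delta t)^{-1} - a)}\right] = 2\,\mathrm{Re}\!\left[A\,e^{i\pi m} e^{i\pi m k} e^{-2\pi i m a \Delta t}\right] & \text{for even } k \\[4pt] 2\,\mathrm{Re}\!\left[A\,e^{-2\pi i m \Delta t\,((2\Delta t)^{-1} + (k-1)(2\Delta t)^{-1} + a)}\right] = 2\,\mathrm{Re}\!\left[A\,e^{-i\pi m} e^{-i\pi m (k-1)} e^{-2\pi i m a \Delta t}\right] & \text{for odd } k. \end{cases} \tag{2.101d}$$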

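But $e^{i\pi m k} = (-1)^{mk} = 1$ when $k$ is even and $e^{-i\pi m (k-1)} = (-1)^{m(k-1)} = 1$ when $k$ is odd, so this last result can be written as

$$u^{[k]}(m\Delta t) = \begin{cases} 2\,\mathrm{Re}\!\left[A\,(-1)^m e^{-2\pi i m a \Delta t}\right] & \text{for even } k \\[4pt] 2\,\mathrm{Re}\!\left[A\,(-1)^m e^{-2\pi i m a \Delta t}\right] & \text{for odd } k. \end{cases} \tag{2.101e}$$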

Comparing this with (2.101c), we conclude that $u^{[k]}(m\Delta t) = u(m\Delta t)$ for all values of $m$ and $k$, showing that (2.100g) must be true. Because the $u^{[k]}$ functions have exactly the same values as the $u$ functions at $t = m\Delta t$ for $m = 0, \pm 1, \pm 2, \ldots$, the $u^{[k]}$ functions are called aliases of function $u$. Figure 2.16 graphs an example of $u(t)$ and $u^{[1]}(t)$ to show how $u$ and its alias $u^{[1]}$ can have identical values at all the sample positions on the $t$ axis.
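The alias property can be checked numerically. The sketch below assumes NumPy and uses hypothetical values for the complex amplitude `A`, the offset `a`, and the sampling interval `dt`; it evaluates the even-k and odd-k forms of Eq. (2.101b) and compares them with $u(t)$ (the k = 0 case) at the sample times and at a point between samples.

```python
import numpy as np

dt = 1.0                       # sampling interval (illustrative)
f_nyq = 1.0 / (2.0 * dt)       # Nyquist frequency, Eq. (2.99a)
a = 0.2 * f_nyq                # offset of f_0 below f_nyq, so f_0 = 0.8 f_nyq
A = 0.5 * np.exp(0.3j)         # hypothetical complex amplitude

def u_k(t, k):
    """k-th alias from Eq. (2.101b); k = 0 gives u(t) of Eq. (2.101a)."""
    if k % 2 == 0:
        return 2.0 * np.real(A * np.exp(2j * np.pi * t * (f_nyq + k * f_nyq - a)))
    return 2.0 * np.real(A * np.exp(-2j * np.pi * t * (f_nyq + (k - 1) * f_nyq + a)))

m = np.arange(-5, 6)
t_samples = m * dt
base = u_k(t_samples, 0)

# Largest disagreement with u(m*dt) for the first few aliases.
max_diffs = [np.max(np.abs(u_k(t_samples, k) - base)) for k in range(1, 6)]

# Between the samples the functions are genuinely different.
between = abs(u_k(0.25 * dt, 1) - u_k(0.25 * dt, 0))
print(max_diffs, between)
```

With these choices the k = 0 and k = 1 curves oscillate at 0.8 and 1.2 times the Nyquist frequency, the same pair of frequencies the caption of Fig. 2.16 describes.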
The term alias is an interesting one; it suggests that there is no real way to distinguish these functions if all we know are the values of the sample points at $t = m\Delta t$. Yet in Figs. 2.14 and 2.15, there is really no question as to which is the correct region of $U^{[\,]}$; spectral values whose frequencies do not lie between $+f_{Nyq}$ and $-f_{Nyq}$ can clearly be disregarded. Consider, however, that before $u(t)$ is analyzed there is no guarantee as to what the correct value of $f_{\max}$ is. Figure 2.17, for example, shows a pattern for $U^{[\,]}$ that seems to have well-separated regions for $U$ and all its aliases when in fact there is a high-frequency triangle that is hidden by aliasing. The unwary analyst might conclude that $U$ has the shape shown in Fig. 2.18(a) when its true shape is the one shown in Fig. 2.18(b). There is really no way to be sure of the true shape of $U$ when all that is known is the DFT of the sampled signal $u(t)$. The basic problem, which is that the DFT is the sampled version of $U^{[\,]}$ instead of $U$, does not disappear when $F = 1/\Delta t$ is made larger by decreasing the sampling interval $\Delta t$; there is always the possibility that the true $U$ curve is broad enough to overlap. Returning to Fig. 2.16, we see that no matter how small $\Delta t$ is made, the information thrown away from between the samples inevitably allows high frequencies to masquerade as low frequencies. There is no foolproof method for both sampling the data and avoiding this possibility.
Fortunately, there are usually ways of avoiding this logical dead end. As is pointed out in Sec. 2.2 above [see discussion after Eq. (2.9b)], in practice all measurements are sampled and, before representing them by continuous functions, we must know that the samples capture all the relevant detail. In other words, there must be some way of knowing, based on past experience or knowledge of how the data is gathered, that the sampling is rapid enough to represent faithfully all the important high-frequency details. In terms of the notation used to discuss Fig. 2.14, we must eventually be prepared to say that, for some specific $f_{\max}$, no higher frequencies are present to create aliasing; that is, we must know that if more closely spaced sampling is done all that would be found is a smooth, quasi-linear variation between the current samples. Many times the electronic instruments used to make the measurements cannot sense high-frequency data, so even if high-frequency components exist, they cannot be recorded. Other times, all that can be done is to look at the data samples and decide whether it is reasonable to suspect the presence of unseen high-frequency components. The data in Fig. 2.19(a), for example, almost certainly do not contain significant amounts of unseen high frequencies, whereas unseen high frequencies could well be present in Fig. 2.19(b). There may be cases where all that can be done is to shorten $\Delta t$ and see whether previously aliased frequency components suddenly appear. The question of whether aliasing is present is analogous to the question of whether experimental error is present. Just as it is always logically possible that data contain significant amounts of undetected error, so it is always logically possible that significant amounts of aliasing are being overlooked. Just as we often expect insignificant amounts of error to occur no matter what precautions are taken, so we often expect insignificant amounts of aliasing to occur in the calculated DFT. What is needed is the presence of good engineering and scientific judgment; there must always be someone willing to pick a value for $f_{\max}$, allowing us to specify the sampling interval $\Delta t \le 1/(2 f_{\max})$ that prevents significant aliasing in the DFT.

FIGURE 2.16. The solid line represents a sinusoidal oscillation at a frequency that is 0.8 times the Nyquist frequency, and the dashed line represents a sinusoidal oscillation that is 1.2 times the Nyquist frequency. When the curves are sampled at the rate represented by the black dots, which in this case is twice the Nyquist frequency, there is no way to tell them apart in the sampled data. [Horizontal axis: $t$, from $-5$ to 5; vertical axis from $-1$ to 1.]
2.23 Aliasing as a Tool
The previous section presented the bad aspects of aliasing, treating it as a form of data corruption. There are, however, occasions when aliasing is more of a feature than a bug. Many times, a real function $u(t)$ is known to have a Fourier transform

$$U(f) = \int_{-\infty}^{\infty} u(t)\,e^{-2\pi i f t}\,dt,$$

which is zero for all positive frequencies that do not lie between the two positive numbers $f_{\min}$ and $f_{\max}$; that is, $U(f)$ is zero when $0 \le f \le f_{\min}$ and when $f > f_{\max}$. Because $u(t)$ is real, $U(f)$ must be Hermitian (see entry 7 of Table 2.1), which means

$$U(-f) = U(f)^{*}.$$

This shows that $U(f)$ must also be strictly zero for negative frequencies $f$ where $-f_{\min} \le f \le 0$ and where $f \le -f_{\max}$. The $U(f)$ transform is schematically represented in Fig. 2.20 with the two blocks showing that $U$ is zero unless $f$ lies between $(-f_{\max}, -f_{\min})$ or $(f_{\min}, f_{\max})$.
The situation shown in Fig. 2.20 describes the signal produced by Michelson interferometers. At the beginning of this chapter, we mentioned that interferometers produce interferograms that must then be Fourier transformed to produce the desired spectral measurement. As explained later in Chapter 4 (see Sec. 4.10), interferometers use optical filters to block out undesired electromagnetic frequencies, which means there always exist values of $f_{\min}$ and $f_{\max}$ such that the transform $U(f)$ of the interferogram signal $u(t)$ is zero unless $f$ lies between $(-f_{\max}, -f_{\min})$ or $(f_{\min}, f_{\max})$. Suppose we sample the interferogram signal with a sampling interval $\Delta t$ such that the Nyquist frequency $f_{Nyq} = (2\Delta t)^{-1}$ is slightly larger than $f_{\max}$. Repeating the reasoning used to get Fig. 2.15 above, we see that

$$U^{[\,]}(f,F) = \sum_{k=-\infty}^{\infty} U(f - kF)$$

now has the form shown in Fig. 2.21. Again, the solid blocks show the original $U(f)$, the dashed blocks show the aliases created by turning $U(f)$ into $U^{[\,]}(f,F)$, and the curved arrows drawn show exactly how the aliased blocks are created from the original blocks. No solid blocks overlap with the dashed blocks, so aliasing is not a problem.

FIGURE 2.17. [Plot of $U^{[\,]}(f,F)$ versus $f$, with $-f_{Nyq}$ and $f_{Nyq}$ marked and $F = 2f_{Nyq}$.]
FIGURE 2.18(a). [Plot of $U(f)$ versus $f$.]
FIGURE 2.18(b). [Plot of $U(f)$ versus $f$.]
The $U^{[\,]}(f,F)$ data in Fig. 2.17 contains hidden aliasing that can lead spectral analysts to assume that Fig. 2.18(a) rather than Fig. 2.18(b) depicts the true frequency spectrum.
FIGURE 2.19(a). This data is relatively smooth, suggesting that it does not contain high-frequency components.
FIGURE 2.19(b). This curve varies rapidly in three locations, suggesting the presence of high-frequency components in the data.
Now consider what happens when we force aliasing to occur by choosing $\Delta t$ to be twice its original size, creating the $U^{[\,]}$ plot shown in Fig. 2.22. As in Fig. 2.21, none of the solid blocks overlap with the dashed blocks. Because the dashed blocks come from turning $U$ into $U^{[\,]}$, the spectral shapes represented by the solid and dashed blocks are all identical. This means that the aliasing does not cause spectral information to be lost; either the solid blocks or the dashed blocks can be used to recover the true shape of $U(f)$. The electronic equipment used to sample $u(t)$ only needs to sample half as often as before, which usually makes it less expensive to build, and as a bonus the rate at which data flows from the interferometer ends up being cut in half. This last point is often a significant consideration when the interferometer is on a satellite and all the data has to be communicated to the ground. The scheme shown in Fig. 2.22 is called undersampling. There is nothing special about undersampling by a factor of 2; if the distance between $f_{\min}$ and $f_{\max}$ is small enough, and $f_{\min}$ is far enough from $f = 0$, we can undersample by much higher factors. Figure 2.23 shows a scheme that undersamples by a factor of 5.
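A small numerical sketch of undersampling follows, assuming NumPy. The band edges, the tone frequency `f_tone`, the sampling rate `F`, and the helper `folded_frequency` are hypothetical illustrations, not values from the text. A tone at 9 inside an assumed band (8, 10) is sampled at F = 5, a factor of 4 below the rate 2 f_max = 20 that ordinary sampling would require, and the DFT peak shows up at the predictable folded frequency.

```python
import numpy as np

def folded_frequency(f, F):
    """Frequency in [0, F/2] at which a tone at f appears after sampling at rate F."""
    r = f % F
    return min(r, F - r)

f_tone = 9.0     # hypothetical tone inside an assumed band (f_min, f_max) = (8, 10)
F = 5.0          # sampling rate, far below 2 * f_max = 20
N = 200          # number of samples
dt = 1.0 / F

t = np.arange(N) * dt
u = np.cos(2.0 * np.pi * f_tone * t)

# One-sided DFT of the real samples and the frequency of its largest peak.
spectrum = np.abs(np.fft.rfft(u))
freqs = np.fft.rfftfreq(N, d=dt)
f_peak = freqs[np.argmax(spectrum)]

print(f_peak, folded_frequency(f_tone, F))
print(folded_frequency(8.0, F), folded_frequency(10.0, F))
```

In this example the whole band (8, 10) folds into the interval between 0 and 2 without overlapping itself, which is why no spectral information is lost. The band arrives frequency-reversed (8 maps to 2, 10 maps to 0), so the recovered spectrum has to be flipped back when the band sits in such a zone.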
2.24 Sampling Theorem
We define a band-limited function $u(t)$ to be a function for which there exists a positive frequency $f_{\max}$ such that the forward Fourier transform of $u(t)$,

$$U(f) = \int_{-\infty}^{\infty} u(t)\,e^{-2\pi i f t}\,dt,$$

is strictly zero when $f \le -f_{\max}$ or $f > f_{\max}$. The previous section indicated that the interferogram of a Michelson interferometer is a special case of a band-limited function; not only is its transform zero for $|f| > f_{\max}$, but there is also a positive frequency $f_{\min}$ such that its transform is zero for $|f| \le f_{\min}$ (see Fig. 2.20). It can be shown that whenever a continuous function $u(t)$ is also band limited, then its samples $u(m\Delta t)$ (with $m = 0, \pm 1, \pm 2, \ldots$) can be used to reconstruct the complete function, including the values of $u$ between the samples, as long as we choose

$$\Delta t < \frac{1}{2 f_{\max}} \tag{2.102}$$

to prevent aliasing.
We start by forming the mathematical construct


$$v(t) = \sum_{m=-\infty}^{\infty} u(m\Delta t)\,\delta(t - m\Delta t). \tag{2.103}$$

FIGURE 2.20. [Schematic plot of $U(f)$ versus $f$: two blocks, one between $-f_{\max}$ and $-f_{\min}$ and one between $f_{\min}$ and $f_{\max}$.]
FIGURE 2.21. [Schematic plot of $U^{[\,]}(f,F)$ versus $f$, marking $\pm f_{\min}$, $\pm f_{\max}$, $\pm f_{Nyq}$, and $\pm F$.] Frequency $F$ is twice the Nyquist frequency $f_{Nyq}$ in Fig. 2.21.

Clearly, the $u(m\Delta t)$ sample values of function $u$ are the only data used to set up function $v(t)$. Because $u(t_0)\,\delta(t - t_0) = u(t)\,\delta(t - t_0)$ for any continuous function $u$ [see Eq. (2.68e) above], this can be written as

$$v(t) = \sum_{m=-\infty}^{\infty} u(t)\,\delta(t - m\Delta t)$$

or

$$v(t) = u(t) \sum_{m=-\infty}^{\infty} \delta(t - m\Delta t).$$

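Note that here $t$ has returned to being a continuous, not a sampled, variable. Taking the Fourier transform of both sides gives, using the Fourier convolution theorem [see Eq. (2.72i)],

$$V(f) = U(f) * \left[\frac{1}{\Delta t} \sum_{k=-\infty}^{\infty} \delta\!\left(f - \frac{k}{\Delta t}\right)\right], \tag{2.104a}$$

where

$$V(f) = \int_{-\infty}^{\infty} v(t)\,e^{-2\pi i f t}\,dt, \tag{2.104b}$$

$$U(f) = \int_{-\infty}^{\infty} u(t)\,e^{-2\pi i f t}\,dt, \tag{2.104c}$$

and

$$\int_{-\infty}^{\infty} \left[\sum_{k=-\infty}^{\infty} \delta(t - k\Delta t)\right] e^{-2\pi i f t}\,dt = \frac{1}{\Delta t} \sum_{k=-\infty}^{\infty} \delta\!\left(f - \frac{k}{\Delta t}\right) \tag{2.104d}$$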

from formula (2.78d). Note that here both $f$ and $t$ are continuous, not sampled, variables. We can now use the linearity of the convolution [see discussion after Eq. (2.38c)] and the definition of the convolution in Eq. (2.38a) to write (2.104a) as

$$\begin{aligned} \Delta t\,V(f) &= \sum_{k=-\infty}^{\infty} U(f) * \delta\!\left(f - \frac{k}{\Delta t}\right) = \sum_{k=-\infty}^{\infty} \int_{-\infty}^{\infty} U(f')\,\delta\!\left(f - f' - \frac{k}{\Delta t}\right) df' \\ &= \sum_{k=-\infty}^{\infty} U\!\left(f - \frac{k}{\Delta t}\right) = U^{[\,]}\!\left(f, \frac{1}{\Delta t}\right), \end{aligned} \tag{2.105a}$$

where $U^{[\,]}$ is as defined in Eq. (2.93b) above. Inequality (2.102) ensures that the separate regions of $U$ that combine to create $U^{[\,]}$ do not overlap, giving us the graph of $U^{[\,]}$ shown in Fig. 2.24. Hence, we can use the $\Pi$ function defined in Eq. (2.56c) to select just the region of nonzero $U^{[\,]}$ between $-(2\Delta t)^{-1}$ and $+(2\Delta t)^{-1}$, recreating the original $U(f)$ transform. Multiplication of (2.105a) by $\Pi\!\left(f, (2\Delta t)^{-1}\right)$ then gives

$$U(f) = \Pi\!\left(f, \frac{1}{2\Delta t}\right) U^{[\,]}\!\left(f, \frac{1}{\Delta t}\right) = \Delta t\,V(f)\,\Pi\!\left(f, \frac{1}{2\Delta t}\right). \tag{2.105b}$$

In both Figs. 2.22 and 2.23, frequency $F$ is twice the Nyquist frequency $f_{Nyq}$.

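FIGURE 2.22. [Schematic plot of $U^{[\,]}(f,F)$ versus $f$ for undersampling by a factor of 2, marking $\pm f_{\min}$, $\pm f_{\max}$, $\pm f_{Nyq}$, and $\pm F$.]
FIGURE 2.23. [Schematic plot of $U^{[\,]}(f,F)$ versus $f$ for undersampling by a factor of 5, marking $\pm f_{\min}$, $\pm f_{\max}$, $\pm f_{Nyq}$, and $\pm F$.]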
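Having recovered the original $U(f)$, an inverse Fourier transform of $U(f)$ gives back the original unsampled $u(t)$. Using the Fourier convolution theorem again to take the inverse Fourier transform of both sides of (2.105b), we get [applying Eq. (2.39j) after interchanging the roles of $f$ and $t$]

$$\begin{aligned} u(t) &= \Delta t \int_{-\infty}^{\infty} V(f)\,\Pi\!\left(f, \frac{1}{2\Delta t}\right) e^{2\pi i f t}\,df \\ &= \Delta t \left[\int_{-\infty}^{\infty} V(f)\,e^{2\pi i f t}\,df\right] * \left[\int_{-\infty}^{\infty} \Pi\!\left(f, \frac{1}{2\Delta t}\right) e^{2\pi i f t}\,df\right], \end{aligned} \tag{2.106a}$$

where the convolution between the two expressions inside square brackets [ ] is over the variable $t$. From (2.104b), function $V(f)$ is the forward Fourier transform of $v(t)$, making $v(t)$ equal to the inverse Fourier transform of $V(f)$ in (2.106a), with $v(t)$ defined as

$$v(t) = \sum_{m=-\infty}^{\infty} u(m\Delta t)\,\delta(t - m\Delta t)$$

in Eq. (2.103). From Eq. (2.71a) above, the inverse Fourier transform of $\Pi$ is

$$\mathcal{F}^{(ift)}\!\left(\Pi\!\left(f, \frac{1}{2\Delta t}\right)\right) = \int_{-\infty}^{\infty} e^{2\pi i f t}\,\Pi\!\left(f, \frac{1}{2\Delta t}\right) df = \frac{1}{\pi t} \sin\!\left(\frac{\pi t}{\Delta t}\right).$$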

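Equation (2.106a) can now be written as

$$u(t) = \Delta t \left[\sum_{m=-\infty}^{\infty} u(m\Delta t)\,\delta(t - m\Delta t)\right] * \left[\frac{1}{\pi t} \sin\!\left(\frac{\pi t}{\Delta t}\right)\right]. \tag{2.106b}$$

Again, the linearity of the convolution can be used to simplify (2.106b),

$$u(t) = \Delta t \sum_{m=-\infty}^{\infty} u(m\Delta t) \left[\delta(t - m\Delta t) * \frac{1}{\pi t} \sin\!\left(\frac{\pi t}{\Delta t}\right)\right]$$

or, using that $\delta(t - t_0) * u(t) = u(t - t_0)$ for any continuous function $u$,

$$u(t) = \sum_{m=-\infty}^{\infty} u(m\Delta t)\,\frac{1}{\pi (t - m\Delta t)/\Delta t}\,\sin\!\left(\frac{\pi (t - m\Delta t)}{\Delta t}\right). \tag{2.106c}$$

FIGURE 2.24. [Plot versus $f$ of $U(f)$ and $U^{[\,]}(f, 1/\Delta t)$, marking $\pm f_{\max}$, $\pm 1/(2\Delta t)$, and $\pm(1/\Delta t - f_{\max})$.]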

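This formula gives us $u(t)$ everywhere in terms of the samples $u(m\Delta t)$ and the function

$$\frac{1}{(\pi t/\Delta t)} \sin\!\left(\frac{\pi t}{\Delta t}\right).$$

We now define the function

$$\mathrm{sinc}(x) = \frac{\sin(x)}{x} \tag{2.106d}$$

and write (2.106c) as

$$u(t) = \sum_{m=-\infty}^{\infty} u(m\Delta t)\,\mathrm{sinc}\!\left(\frac{\pi (t - m\Delta t)}{\Delta t}\right). \tag{2.106e}$$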

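Many authors use a different definition of the sinc function, which we call here $\mathrm{sinc}_{alt}$, with

$$\mathrm{sinc}_{alt}(x) = \frac{\sin(\pi x)}{\pi x}.$$

In terms of $\mathrm{sinc}_{alt}$, Eq. (2.106e) becomes

$$u(t) = \sum_{m=-\infty}^{\infty} u(m\Delta t)\,\mathrm{sinc}_{alt}\!\left(\frac{t - m\Delta t}{\Delta t}\right).$$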
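The reconstruction formula can be tried out numerically. The sketch below assumes NumPy, whose `np.sinc` follows the $\mathrm{sinc}_{alt}$ convention $\sin(\pi x)/(\pi x)$, so it implements the $\mathrm{sinc}_{alt}$ form of Eq. (2.106e). The test function, the sampling interval, and the helper name `sinc_reconstruct` are illustrative assumptions. The infinite sum is truncated to a finite number of samples, so the result is only approximate.

```python
import numpy as np

def sinc_reconstruct(samples, m, dt, t):
    """Truncated sum over m of u(m*dt) * sinc_alt((t - m*dt)/dt)."""
    t = np.asarray(t, dtype=float)
    # kernel[i, j] = sinc_alt((t_i - m_j*dt) / dt); np.sinc is sin(pi x)/(pi x).
    kernel = np.sinc((t[:, None] - m[None, :] * dt) / dt)
    return kernel @ samples

def u(t):
    # Band-limited test function: sinc_alt(t)**2 has a triangular spectrum
    # that vanishes for |f| > 1, so f_max = 1 here.
    return np.sinc(t) ** 2

dt = 0.4                       # satisfies dt < 1/(2*f_max) = 0.5, Eq. (2.102)
m = np.arange(-500, 501)       # finite stand-in for the infinite range of m
samples = u(m * dt)

t_fine = np.linspace(-2.0, 2.0, 401)   # includes points between the samples
u_rec = sinc_reconstruct(samples, m, dt, t_fine)
max_error = np.max(np.abs(u_rec - u(t_fine)))
print(max_error)
```

The test function was chosen because it decays quickly, which keeps the truncation error of the finite sum small; a pure sinusoid would need far more terms for comparable accuracy.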

For the rest of this book, the symbol sinc will refer to $\sin(x)/x$ instead of $\sin(\pi x)/(\pi x)$. We also note that the Fourier transform pair in (2.71a) can be written in terms of $\mathrm{sinc}(x)$ as

$$\int_{-\infty}^{\infty} e^{-2\pi i f t}\,[2F\,\mathrm{sinc}(2\pi F t)]\,dt = \Pi(f, F)$$

and

$$\int_{-\infty}^{\infty} e^{2\pi i f t}\,\Pi(f, F)\,df = 2F\,\mathrm{sinc}(2\pi F t).$$

Replacing $f$ by $-f$ in the top integral and $t$ by $-t$ in the bottom integral gives

$$\int_{-\infty}^{\infty} e^{2\pi i f t}\,[2F\,\mathrm{sinc}(2\pi F t)]\,dt = \Pi(-f, F) = \Pi(f, F)$$

and

$$\int_{-\infty}^{\infty} e^{-2\pi i f t}\,\Pi(f, F)\,df = 2F\,\mathrm{sinc}(-2\pi F t) = 2F\,\mathrm{sinc}(2\pi F t),$$

where we have used that $\Pi(f, F)$ and $\mathrm{sinc}(2\pi F t)$ are even functions of their arguments:

$$\mathrm{sinc}(-x) = \mathrm{sinc}(x) \tag{2.107a}$$

and

$$\Pi(-f, F) = \Pi(f, F). \tag{2.107b}$$

This means we can write this Fourier relationship using the more general formulas

$$\mathcal{F}^{(\pm ift)}\big(2F\,\mathrm{sinc}(2\pi F t)\big) = \int_{-\infty}^{\infty} e^{\pm 2\pi i f t}\,[2F\,\mathrm{sinc}(2\pi F t)]\,dt = \Pi(f, F) \tag{2.108a}$$

and

$$\mathcal{F}^{(\pm ift)}\big(\Pi(f, F)\big) = \mathcal{F}^{(\pm itf)}\big(\Pi(f, F)\big) = \int_{-\infty}^{\infty} e^{\pm 2\pi i f t}\,\Pi(f, F)\,df = 2F\,\mathrm{sinc}(2\pi F t). \tag{2.108b}$$
2.25 Fourier Transforms in Two and Three Dimensions
The integral Fourier transform extends easily and naturally to two- and three-dimensional functions. We can, for example, define the integral Fourier transform of any two-dimensional function $u(x,y)$ to be

$$U(\xi, \eta) = \iint_{-\infty}^{\infty} dx\,dy\; e^{-2\pi i (x\xi + y\eta)}\,u(x, y). \tag{2.109a}$$

The inverse Fourier transform of $U$ returns the original function,

$$u(x, y) = \iint_{-\infty}^{\infty} d\xi\,d\eta\; e^{2\pi i (x\xi + y\eta)}\,U(\xi, \eta). \tag{2.109b}$$

In three dimensions we can write, for the function $u(x, y, z)$, that

$$U(\xi, \eta, \zeta) = \iiint_{-\infty}^{\infty} dx\,dy\,dz\; e^{-2\pi i (x\xi + y\eta + z\zeta)}\,u(x, y, z) \tag{2.109c}$$

and

$$u(x, y, z) = \iiint_{-\infty}^{\infty} d\xi\,d\eta\,d\zeta\; e^{2\pi i (x\xi + y\eta + z\zeta)}\,U(\xi, \eta, \zeta). \tag{2.109d}$$

This pattern of forward and inverse transforms can be extended indefinitely to functions $u$ and $U$ with ever larger numbers of arguments, but for the purposes of this book there is no need to go beyond the two- and three-dimensional transforms given in Eqs. (2.109a) through (2.109d). As a matter of notation, we often use the standard Cartesian $\hat{x}$ and $\hat{y}$ unit vectors pointing along the $x$ and $y$ axes of a Cartesian coordinate system to define vectors

$$\vec{\rho} = x\hat{x} + y\hat{y} \quad\text{and}\quad \vec{q} = \xi\hat{x} + \eta\hat{y}.$$

We introduce the symbol $u(\vec{\rho})$ as a shorthand for $u(x,y)$ and the symbol $U(\vec{q})$ as a shorthand for $U(\xi, \eta)$. Now Eqs. (2.109a) and (2.109b) can be written as

$$U(\vec{q}) = \iint_{-\infty}^{\infty} d^2\rho\; e^{-2\pi i \vec{\rho}\cdot\vec{q}}\,u(\vec{\rho}) \tag{2.110a}$$

and

$$u(\vec{\rho}) = \iint_{-\infty}^{\infty} d^2q\; e^{2\pi i \vec{\rho}\cdot\vec{q}}\,U(\vec{q}). \tag{2.110b}$$

We can also define vectors for the three-dimensional case,

$$\vec{r} = x\hat{x} + y\hat{y} + z\hat{z} \quad\text{and}\quad \vec{s} = \xi\hat{x} + \eta\hat{y} + \zeta\hat{z},$$

and then write Eqs. (2.109c) and (2.109d) as

$$U(\vec{s}) = \iiint_{-\infty}^{\infty} d^3r\; e^{-2\pi i \vec{r}\cdot\vec{s}}\,u(\vec{r}) \tag{2.110c}$$

and

$$u(\vec{r}) = \iiint_{-\infty}^{\infty} d^3s\; e^{2\pi i \vec{r}\cdot\vec{s}}\,U(\vec{s}). \tag{2.110d}$$
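Vector notation is sometimes used to group families of associated forward and inverse Fourier transforms into a single equation. We might, for example, write the six scalar equations

$$U_x(\vec{s}) = \iiint_{-\infty}^{\infty} d^3r\; e^{-2\pi i \vec{r}\cdot\vec{s}}\,u_x(\vec{r}), \qquad u_x(\vec{r}) = \iiint_{-\infty}^{\infty} d^3s\; e^{2\pi i \vec{r}\cdot\vec{s}}\,U_x(\vec{s}),$$

$$U_y(\vec{s}) = \iiint_{-\infty}^{\infty} d^3r\; e^{-2\pi i \vec{r}\cdot\vec{s}}\,u_y(\vec{r}), \qquad u_y(\vec{r}) = \iiint_{-\infty}^{\infty} d^3s\; e^{2\pi i \vec{r}\cdot\vec{s}}\,U_y(\vec{s}),$$

and

$$U_z(\vec{s}) = \iiint_{-\infty}^{\infty} d^3r\; e^{-2\pi i \vec{r}\cdot\vec{s}}\,u_z(\vec{r}), \qquad u_z(\vec{r}) = \iiint_{-\infty}^{\infty} d^3s\; e^{2\pi i \vec{r}\cdot\vec{s}}\,U_z(\vec{s})$$

as the pair of vector equations

$$\vec{U}(\vec{s}) = \iiint_{-\infty}^{\infty} d^3r\; e^{-2\pi i \vec{r}\cdot\vec{s}}\,\vec{u}(\vec{r}) \tag{2.110e}$$

and

$$\vec{u}(\vec{r}) = \iiint_{-\infty}^{\infty} d^3s\; e^{2\pi i \vec{r}\cdot\vec{s}}\,\vec{U}(\vec{s}), \tag{2.110f}$$

where

$$\vec{u}(\vec{r}) = \hat{x}\,u_x(\vec{r}) + \hat{y}\,u_y(\vec{r}) + \hat{z}\,u_z(\vec{r}) \quad\text{and}\quad \vec{U}(\vec{s}) = \hat{x}\,U_x(\vec{s}) + \hat{y}\,U_y(\vec{s}) + \hat{z}\,U_z(\vec{s}).$$

We call $\vec{U}(\vec{s})$ the vector Fourier transform of $\vec{u}(\vec{r})$ and $\vec{u}(\vec{r})$ the vector inverse Fourier transform of $\vec{U}(\vec{s})$. Just as in the one-dimensional case, it makes no difference which Fourier transform is labeled the forward transform and which is labeled the inverse transform as long as there is a change in sign of the exponent of $e$. Following the pattern of Eq. (2.28$\ell$), we can also write

$$\iint_{-\infty}^{\infty} d^2q\; e^{\pm 2\pi i \vec{\rho}\cdot\vec{q}} \iint_{-\infty}^{\infty} d^2\rho'\; e^{\mp 2\pi i \vec{\rho}\,'\cdot\vec{q}}\,u(\vec{\rho}\,') = u(\vec{\rho}) \tag{2.110g}$$

and

$$\iiint_{-\infty}^{\infty} d^3s\; e^{\pm 2\pi i \vec{r}\cdot\vec{s}} \iiint_{-\infty}^{\infty} d^3r'\; e^{\mp 2\pi i \vec{r}\,'\cdot\vec{s}}\,v(\vec{r}\,') = v(\vec{r}) \tag{2.110h}$$

for two-dimensional and three-dimensional scalar functions $u(\vec{\rho})$ and $v(\vec{r})$. For three-dimensional vector functions, this becomes

$$\iiint_{-\infty}^{\infty} d^3s\; e^{\pm 2\pi i \vec{r}\cdot\vec{s}} \iiint_{-\infty}^{\infty} d^3r'\; e^{\mp 2\pi i \vec{r}\,'\cdot\vec{s}}\,\vec{v}(\vec{r}\,') = \vec{v}(\vec{r}). \tag{2.110i}$$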

Many one-dimensional Fourier identities have two-dimensional and three-dimensional counterparts. For example, the Fourier shift theorem [see Eq. (2.36h) above] in two dimensions becomes, for a two-dimensional vector constant $\vec{a} = \hat{x}a_x + \hat{y}a_y$,

$$\begin{aligned} \iint_{-\infty}^{\infty} d^2\rho\; e^{\pm 2\pi i \vec{\rho}\cdot\vec{q}}\,u(\vec{\rho} + \vec{a}) &= \iint_{-\infty}^{\infty} dx\,dy\; e^{\pm 2\pi i (x\xi + y\eta)}\,u(x + a_x, y + a_y) \\ &= e^{\mp 2\pi i (a_x\xi + a_y\eta)} \iint_{-\infty}^{\infty} dx'\,dy'\; e^{\pm 2\pi i (x'\xi + y'\eta)}\,u(x', y'), \end{aligned}$$

where in the last step we define $x' = x + a_x$ and $y' = y + a_y$. We now see that (dropping the primes inside the double integral)

$$\iint_{-\infty}^{\infty} d^2\rho\; e^{\pm 2\pi i \vec{\rho}\cdot\vec{q}}\,u(\vec{\rho} + \vec{a}) = e^{\mp 2\pi i \vec{a}\cdot\vec{q}} \iint_{-\infty}^{\infty} d^2\rho\; e^{\pm 2\pi i \vec{\rho}\cdot\vec{q}}\,u(\vec{\rho}). \tag{2.110j}$$

This shows the forward or inverse two-dimensional Fourier transform of $u(\vec{\rho} + \vec{a})$ to be $e^{\mp 2\pi i \vec{a}\cdot\vec{q}}$ multiplied by the forward or inverse two-dimensional Fourier transform of $u(\vec{\rho})$. Similarly in three dimensions, we have, for a three-dimensional constant vector $\vec{b} = \hat{x}b_x + \hat{y}b_y + \hat{z}b_z$, that

$$\begin{aligned} \iiint_{-\infty}^{\infty} d^3r\; e^{\pm 2\pi i \vec{r}\cdot\vec{s}}\,v(\vec{r} + \vec{b}) &= \iiint_{-\infty}^{\infty} dx\,dy\,dz\; e^{\pm 2\pi i (x\xi + y\eta + z\zeta)}\,v(x + b_x, y + b_y, z + b_z) \\ &= e^{\mp 2\pi i (b_x\xi + b_y\eta + b_z\zeta)} \iiint_{-\infty}^{\infty} dx'\,dy'\,dz'\; e^{\pm 2\pi i (x'\xi + y'\eta + z'\zeta)}\,v(x', y', z'), \end{aligned}$$

where $x' = x + b_x$, $y' = y + b_y$, and $z' = z + b_z$. This time we find that the forward or inverse three-dimensional Fourier transform of $v(\vec{r} + \vec{b})$ is $e^{\mp 2\pi i \vec{s}\cdot\vec{b}}$ multiplied by the forward or inverse three-dimensional Fourier transform of $v(\vec{r})$,

$$\iiint_{-\infty}^{\infty} d^3r\; e^{\pm 2\pi i \vec{r}\cdot\vec{s}}\,v(\vec{r} + \vec{b}) = e^{\mp 2\pi i \vec{s}\cdot\vec{b}} \iiint_{-\infty}^{\infty} d^3r\; e^{\pm 2\pi i \vec{r}\cdot\vec{s}}\,v(\vec{r}). \tag{2.110k}$$

There is also a two-dimensional and three-dimensional version of the one-dimensional Fourier scaling theorem discussed in Sec. 2.8 above [see Eq. (2.37a)]. In two dimensions when we have

$$V^{(\pm)}(\vec{q}) = \iint_{-\infty}^{\infty} d^2\rho\; e^{\pm 2\pi i \vec{\rho}\cdot\vec{q}}\,v(\vec{\rho}) \tag{2.110$\ell$}$$

and $v(\vec{\rho})$ is replaced by $v(\alpha\vec{\rho})$, where $\alpha$ is a real scalar, then we can substitute $\vec{\rho}\,' = \alpha\vec{\rho}$ to get

$$\iint_{-\infty}^{\infty} d^2\rho\; e^{\pm 2\pi i \vec{\rho}\cdot\vec{q}}\,v(\alpha\vec{\rho}) = \frac{1}{\alpha^2} \iint_{-\infty}^{\infty} d^2\rho'\; e^{\pm 2\pi i \vec{\rho}\,'\cdot(\vec{q}/\alpha)}\,v(\vec{\rho}\,') = \frac{1}{\alpha^2}\,V^{(\pm)}\!\left(\frac{\vec{q}}{\alpha}\right). \tag{2.110m}$$

Suppose there is a function of $\vec{\rho}$ called $u(\vec{\rho})$ such that $\vec{\rho}$ has to change by a vector distance $\Delta\vec{\rho}$ whose magnitude must be at least $|\Delta\vec{\rho}| \approx \varepsilon$ for there to be a significant change in the value of $u(\vec{\rho})$. Using the same reasoning as was applied to the one-dimensional Fourier scaling theorem [see the analysis following Eq. (2.37e)], we can show that $U^{(\pm)}(\vec{q})$, the two-dimensional forward or inverse Fourier transform of $u$, must be negligible or zero for all vectors $\vec{q}$ whose magnitude $|\vec{q}|$ exceeds $1/\varepsilon$. The Fourier scaling theorem in three dimensions starts with

$$V^{(\pm)}(\vec{s}) = \iiint_{-\infty}^{\infty} d^3r\; e^{\pm 2\pi i \vec{r}\cdot\vec{s}}\,v(\vec{r}), \tag{2.110n}$$

from which we discover, replacing $\vec{r}$ by $\vec{r}\,' = \alpha\vec{r}$, that

$$\iiint_{-\infty}^{\infty} d^3r\; e^{\pm 2\pi i \vec{r}\cdot\vec{s}}\,v(\alpha\vec{r}) = \frac{1}{|\alpha|^3} \iiint_{-\infty}^{\infty} d^3r'\; e^{\pm 2\pi i \vec{r}\,'\cdot(\vec{s}/\alpha)}\,v(\vec{r}\,') = \frac{1}{|\alpha|^3}\,V^{(\pm)}\!\left(\frac{\vec{s}}{\alpha}\right). \tag{2.110o}$$

Again we can conclude that if there is a function $u(\vec{r})$ such that $|\Delta\vec{r}|$ must be at least $\varepsilon$ for there to be a significant change in $u$, then $U^{(\pm)}(\vec{s})$, the three-dimensional forward or inverse Fourier transform of $u$, must be negligible or zero for all vector arguments $\vec{s}$ whose magnitude $|\vec{s}|$ exceeds $1/\varepsilon$.
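The two-dimensional convolution of scalar functions $u(x,y)$ and $v(x,y)$ is written using the symbol $*$ and defined to be

$$u(x,y) * v(x,y) = \iint_{-\infty}^{\infty} dx'\,dy'\; u(x', y')\,v(x - x', y - y'), \tag{2.111a}$$

or

$$u(\vec{\rho}) * v(\vec{\rho}) = \iint_{-\infty}^{\infty} d^2\rho'\; u(\vec{\rho}\,')\,v(\vec{\rho} - \vec{\rho}\,') \tag{2.111b}$$

using the more concise vector notation. The vector notation may make the connection between the one- and two-dimensional convolutions in Eqs. (2.38a) and (2.111b) easier to see. The two-dimensional convolution, like the one-dimensional convolution, is both commutative and associative. Using the same type of reasoning as in the analysis in Sec. 2.9, we have for the two-dimensional functions $u(\vec{\rho})$, $v(\vec{\rho})$, and $h(\vec{\rho})$ that

$$\begin{aligned} u(\vec{\rho}) * v(\vec{\rho}) &= \iint_{-\infty}^{\infty} d^2\rho'\; u(\vec{\rho}\,')\,v(\vec{\rho} - \vec{\rho}\,') = \iint_{-\infty}^{\infty} d^2\rho''\,(-1)^2\,u(\vec{\rho} - \vec{\rho}\,'')\,v(\vec{\rho}\,'') \\ &= \iint_{-\infty}^{\infty} d^2\rho''\; v(\vec{\rho}\,'')\,u(\vec{\rho} - \vec{\rho}\,'') = v(\vec{\rho}) * u(\vec{\rho}) \end{aligned} \tag{2.111c}$$

and

$$\begin{aligned} \left[u(\vec{\rho}) * v(\vec{\rho})\right] * h(\vec{\rho}) &= \iint_{-\infty}^{\infty} d^2\rho'\; h(\vec{\rho} - \vec{\rho}\,') \iint_{-\infty}^{\infty} d^2\rho''\; u(\vec{\rho}\,'')\,v(\vec{\rho}\,' - \vec{\rho}\,'') \\ &= \iint_{-\infty}^{\infty} d^2\rho''\; u(\vec{\rho}\,'') \iint_{-\infty}^{\infty} d^2\rho'\; h(\vec{\rho} - \vec{\rho}\,')\,v(\vec{\rho}\,' - \vec{\rho}\,'') \\ &= \iint_{-\infty}^{\infty} d^2\rho''\; u(\vec{\rho}\,'') \iint_{-\infty}^{\infty} d^2\rho'''\; v(\vec{\rho}\,''')\,h\big((\vec{\rho} - \vec{\rho}\,'') - \vec{\rho}\,'''\big) \\ &= u(\vec{\rho}) * \left[v(\vec{\rho}) * h(\vec{\rho})\right], \end{aligned} \tag{2.111d}$$

where to show that the two-dimensional convolution is commutative we make the variable substitution $\vec{\rho}\,'' = \vec{\rho} - \vec{\rho}\,'$ in (2.111c); and to show it is associative, we make the variable substitution $\vec{\rho}\,''' = \vec{\rho}\,' - \vec{\rho}\,''$ in (2.111d). The two-dimensional convolution is also linear. For any two complex constants $\alpha$ and $\beta$, we have

$$\begin{aligned} u(\vec{\rho}) * \left[\alpha\,v(\vec{\rho}) + \beta\,h(\vec{\rho})\right] &= \iint_{-\infty}^{\infty} d^2\rho'\; u(\vec{\rho}\,') \left[\alpha\,v(\vec{\rho} - \vec{\rho}\,') + \beta\,h(\vec{\rho} - \vec{\rho}\,')\right] \\ &= \alpha \iint_{-\infty}^{\infty} d^2\rho'\; u(\vec{\rho}\,')\,v(\vec{\rho} - \vec{\rho}\,') + \beta \iint_{-\infty}^{\infty} d^2\rho'\; u(\vec{\rho}\,')\,h(\vec{\rho} - \vec{\rho}\,') \\ &= \alpha \left[u(\vec{\rho}) * v(\vec{\rho})\right] + \beta \left[u(\vec{\rho}) * h(\vec{\rho})\right], \end{aligned} \tag{2.111e}$$

and because the two-dimensional convolution is commutative it follows that

$$\left[\alpha\,v(\vec{\rho}) + \beta\,h(\vec{\rho})\right] * u(\vec{\rho}) = \alpha \left[v(\vec{\rho}) * u(\vec{\rho})\right] + \beta \left[h(\vec{\rho}) * u(\vec{\rho})\right]. \tag{2.111f}$$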

It is easy to show that the Fourier convolution theorem holds true in two dimensions. We start with

$$\begin{aligned} \iint_{-\infty}^{\infty} dx\,dy\; &e^{\pm 2\pi i (x\xi + y\eta)}\,[u(x,y) * v(x,y)] \\ &= \iint_{-\infty}^{\infty} dx\,dy\; e^{\pm 2\pi i (x\xi + y\eta)} \iint_{-\infty}^{\infty} dx'\,dy'\; u(x', y')\,v(x - x', y - y') \\ &= \iint_{-\infty}^{\infty} dx'\,dy'\; u(x', y') \iint_{-\infty}^{\infty} dx\,dy\; e^{\pm 2\pi i (x\xi + y\eta)}\,v(x - x', y - y'). \end{aligned}$$

Now we replace the $x$, $y$ integration variables by $x'' = x - x'$ and $y'' = y - y'$, with $dx'' = dx$ and $dy'' = dy$, so that

$$\begin{aligned} \iint_{-\infty}^{\infty} dx\,dy\; &e^{\pm 2\pi i (x\xi + y\eta)}\,[u(x,y) * v(x,y)] \\ &= \iint_{-\infty}^{\infty} dx'\,dy'\; u(x', y')\,e^{\pm 2\pi i (x'\xi + y'\eta)} \iint_{-\infty}^{\infty} dx''\,dy''\; e^{\pm 2\pi i (x''\xi + y''\eta)}\,v(x'', y'') \end{aligned}$$

or

$$\iint_{-\infty}^{\infty} dx\,dy\; e^{\pm 2\pi i (x\xi + y\eta)}\,[u(x,y) * v(x,y)] = U^{(\pm)}(\xi, \eta)\,V^{(\pm)}(\xi, \eta), \tag{2.112a}$$

where $U^{(\pm)}$ is the two-dimensional forward or inverse Fourier transform of $u$,

$$U^{(\pm)}(\xi, \eta) = \iint_{-\infty}^{\infty} dx\,dy\; e^{\pm 2\pi i (x\xi + y\eta)}\,u(x, y), \tag{2.112b}$$

and $V^{(\pm)}$ is the two-dimensional forward or inverse Fourier transform of $v$,

$$V^{(\pm)}(\xi, \eta) = \iint_{-\infty}^{\infty} dx\,dy\; e^{\pm 2\pi i (x\xi + y\eta)}\,v(x, y). \tag{2.112c}$$

This gives the first half of the two-dimensional Fourier convolution theorem. To get the second half, we reverse the transform in (2.112a). If the plus sign is used in (2.112a), take the forward two-dimensional Fourier transform of both sides, and if the minus sign is used take the inverse two-dimensional Fourier transform of both sides. This leads to

$$\iint_{-\infty}^{\infty} d\xi\,d\eta\; e^{\mp 2\pi i (x\xi + y\eta)}\,U^{(\pm)}(\xi, \eta)\,V^{(\pm)}(\xi, \eta) = u(x,y) * v(x,y), \tag{2.113a}$$

where, reversing the transforms in Eqs. (2.112b) and (2.112c),

$$u(x, y) = \iint_{-\infty}^{\infty} d\xi\,d\eta\; e^{\mp 2\pi i (x\xi + y\eta)}\,U^{(\pm)}(\xi, \eta) \tag{2.113b}$$

and

$$v(x, y) = \iint_{-\infty}^{\infty} d\xi\,d\eta\; e^{\mp 2\pi i (x\xi + y\eta)}\,V^{(\pm)}(\xi, \eta). \tag{2.113c}$$
- 213 -
2 Fourier Theory

- 214 -
The first half of the two-dimensional Fourier convolution theorem, Eqs. (2.112a)(2.112c),
shows that the forward or inverse two-dimensional Fourier transform of the two-dimensional
convolution of two functions u and v is the product of the forward or inverse two-dimensional
Fourier transforms of u and v. Because no restrictions are placed on the nature of u and v, other
than that they are transformable, there are also no restrictions on the nature of their
( )
U

and
( )
V

transforms. This means we can think of
( )
U

and
( )
V

as arbitrary transformable functions. The
( ) superscripts on U and V in Eqs. (2.113a)(2.113c) then just tell us that, according to Eqs.
(2.112b) and (2.112c),

( ) 2 ( )
( , ) ( , )
i x y
U dx dy e u x y
r q
q

+

and

( ) 2 ( )
( , ) ( , )
i x y
V dx dy e v x y
r q
q

+

.

We already know this, however, from looking at Eqs. (2.113b) and (2.113c)just take the
opposite-sign Fourier transform of both sides. Hence, we can drop the ( ) superscripts on U and
V in Eqs. (2.113a)(2.113c) as long as ( ) B superscripts are added to u and v to distinguish
between the two choices of sign in (2.113b) and (2.113c). Now Eqs. (2.113a)(2.113c) become

2 ( ) ( ) ( )
( , ) ( , ) ( , ) ( , )
i x y
d d e U V u x y v x y
r q
q q q

+

B B B
, (2.114a)
where

( ) 2 ( )
( , ) ( , )
i x y
u x y d d e U
r q
q q

+

B B
(2.114b)
and

( ) 2 ( )
( , ) ( , )
i x y
v x y d d e V
r q
q q

+

B B
. (2.114c)

The letters used to label the functions and variables are, of course, arbitrary, so nothing stops us
from interchanging the letters u and U, v and V, x and $\xi$, y and $\eta$, and the vertical order of the
$\mp$ signs to get

$$
\iint_{-\infty}^{\infty} dx\,dy\; e^{\pm 2\pi i (x\xi + y\eta)}\, u(x,y)\, v(x,y) = U^{(\pm)}(\xi,\eta) \ast\ast V^{(\pm)}(\xi,\eta), \tag{2.115a}
$$

where

$$
U^{(\pm)}(\xi,\eta) = \iint_{-\infty}^{\infty} dx\,dy\; e^{\pm 2\pi i (x\xi + y\eta)}\, u(x,y) \tag{2.115b}
$$

and

$$
V^{(\pm)}(\xi,\eta) = \iint_{-\infty}^{\infty} dx\,dy\; e^{\pm 2\pi i (x\xi + y\eta)}\, v(x,y). \tag{2.115c}
$$

Equations (2.115a)–(2.115c) are the other half of the two-dimensional Fourier convolution
theorem: they show that the forward or inverse two-dimensional Fourier transform of the
product of two functions u and v is the two-dimensional convolution of the forward or inverse
two-dimensional Fourier transforms of u and v.
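
Both halves of the theorem have a discrete counterpart that is easy to test. The following is only a sketch under stated assumptions, not the author's method: the arrays are periodic 4×4 grids, the convolution is circular, the transform is an unnormalized discrete Fourier transform with the minus-sign exponent, and the names `dft2`, `circ_conv2`, and `max_diff` are hypothetical. Under those assumptions the second half picks up a factor of 1/n² that the integral form does not have.

```python
import cmath

def dft2(a, sign=-1):
    """Unnormalized 2-D DFT of a square list of lists; sign sets the exponent sign."""
    n = len(a)
    return [[sum(a[x][y] * cmath.exp(sign * 2j * cmath.pi * (x * k + y * m) / n)
                 for x in range(n) for y in range(n))
             for m in range(n)] for k in range(n)]

def circ_conv2(u, v):
    """Circular 2-D convolution of two equal-size square arrays."""
    n = len(u)
    return [[sum(u[a][b] * v[(x - a) % n][(y - b) % n]
                 for a in range(n) for b in range(n))
             for y in range(n)] for x in range(n)]

def max_diff(p, q):
    """Largest absolute difference between corresponding elements."""
    n = len(p)
    return max(abs(p[i][j] - q[i][j]) for i in range(n) for j in range(n))

n = 4
u = [[1, 2, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0], [3, 0, 0, 1]]
v = [[0, 1, 0, 0], [1, 0, 0, 2], [0, 0, 1, 0], [0, 0, 0, 0]]
U, V = dft2(u), dft2(v)

# First half: the transform of the convolution equals the product of the transforms.
first_half_err = max_diff(
    dft2(circ_conv2(u, v)),
    [[U[k][m] * V[k][m] for m in range(n)] for k in range(n)])

# Second half: the transform of the product equals the convolution of the
# transforms, divided by n**2 for this unnormalized discrete transform.
uv = [[u[x][y] * v[x][y] for y in range(n)] for x in range(n)]
UV = circ_conv2(U, V)
second_half_err = max_diff(
    dft2(uv),
    [[UV[k][m] / n**2 for m in range(n)] for k in range(n)])
```

Calling `dft2` with `sign=+1` throughout checks the other sign choice in the same way.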
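The three-dimensional convolution is written using the symbol $\ast\ast\ast$ and defined to be

$$
u(x,y,z) \ast\ast\ast v(x,y,z) = \iiint_{-\infty}^{\infty} dx'\,dy'\,dz'\; u(x',y',z')\, v(x-x', y-y', z-z') \tag{2.116a}
$$

or

$$
u(\vec{r}) \ast\ast\ast v(\vec{r}) = \iiint_{-\infty}^{\infty} d^3r'\; u(\vec{r}\,')\, v(\vec{r}-\vec{r}\,'). \tag{2.116b}
$$

Using three-dimensional vector notation, the three-dimensional convolution has the same
commutative, associative, and linearity properties as the two-dimensional convolution, as can be
seen by returning to Eqs. (2.111c)–(2.111f), mentally adding an extra $\ast$, an extra integral sign,
and replacing all the superscript 2s by superscript 3s. This gives

$$
u(\vec{\rho}) \ast\ast\ast v(\vec{\rho}) = v(\vec{\rho}) \ast\ast\ast u(\vec{\rho}), \tag{2.117a}
$$

$$
\left[u(\vec{\rho}) \ast\ast\ast v(\vec{\rho})\right] \ast\ast\ast h(\vec{\rho}) = u(\vec{\rho}) \ast\ast\ast \left[v(\vec{\rho}) \ast\ast\ast h(\vec{\rho})\right], \tag{2.117b}
$$

$$
u(\vec{\rho}) \ast\ast\ast \left[\alpha v(\vec{\rho}) + \beta h(\vec{\rho})\right] = \alpha \left[u(\vec{\rho}) \ast\ast\ast v(\vec{\rho})\right] + \beta \left[u(\vec{\rho}) \ast\ast\ast h(\vec{\rho})\right], \tag{2.117c}
$$

and

$$
\left[\alpha v(\vec{\rho}) + \beta h(\vec{\rho})\right] \ast\ast\ast u(\vec{\rho}) = \alpha \left[v(\vec{\rho}) \ast\ast\ast u(\vec{\rho})\right] + \beta \left[h(\vec{\rho}) \ast\ast\ast u(\vec{\rho})\right]. \tag{2.117d}
$$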

Looking carefully at the variable manipulations used to derive Eqs. (2.112a)–(2.112c), the first
half of the two-dimensional Fourier convolution theorem, we see that working with an extra
product $z\zeta$ in the exponent of e and an extra integration over dz does not affect the end result.
We can therefore say that

$$
\iiint_{-\infty}^{\infty} dx\,dy\,dz\; e^{\pm 2\pi i (x\xi + y\eta + z\zeta)} \left[u(x,y,z) \ast\ast\ast v(x,y,z)\right] = U^{(\pm)}(\xi,\eta,\zeta)\, V^{(\pm)}(\xi,\eta,\zeta), \tag{2.118a}
$$

where

$$
U^{(\pm)}(\xi,\eta,\zeta) = \iiint_{-\infty}^{\infty} dx\,dy\,dz\; e^{\pm 2\pi i (x\xi + y\eta + z\zeta)}\, u(x,y,z) \tag{2.118b}
$$

and

$$
V^{(\pm)}(\xi,\eta,\zeta) = \iiint_{-\infty}^{\infty} dx\,dy\,dz\; e^{\pm 2\pi i (x\xi + y\eta + z\zeta)}\, v(x,y,z). \tag{2.118c}
$$

The argument about relabeling the functions and variables used to go from (2.112a)–(2.112c) to
(2.115a)–(2.115c) works equally well here, giving us at once the other half of the
three-dimensional Fourier convolution theorem,

$$
\iiint_{-\infty}^{\infty} dx\,dy\,dz\; e^{\pm 2\pi i (x\xi + y\eta + z\zeta)}\, u(x,y,z)\, v(x,y,z) = U^{(\pm)}(\xi,\eta,\zeta) \ast\ast\ast V^{(\pm)}(\xi,\eta,\zeta), \tag{2.119a}
$$

where

$$
U^{(\pm)}(\xi,\eta,\zeta) = \iiint_{-\infty}^{\infty} dx\,dy\,dz\; e^{\pm 2\pi i (x\xi + y\eta + z\zeta)}\, u(x,y,z) \tag{2.119b}
$$

and

$$
V^{(\pm)}(\xi,\eta,\zeta) = \iiint_{-\infty}^{\infty} dx\,dy\,dz\; e^{\pm 2\pi i (x\xi + y\eta + z\zeta)}\, v(x,y,z). \tag{2.119c}
$$

One last matter of notation worth mentioning is that we can create two-dimensional and
three-dimensional delta functions from the products of the already-discussed one-dimensional delta
function:

$$
\delta(\vec{\rho}) = \delta(x)\,\delta(y) \tag{2.120a}
$$

and

$$
\delta(\vec{r}) = \delta(x)\,\delta(y)\,\delta(z). \tag{2.120b}
$$

For any two-dimensional continuous function u(x,y), we have

$$
\begin{aligned}
\iint_{-\infty}^{\infty} dx\,dy\; u(x,y)\, \delta(x-x_o)\, \delta(y-y_o)
&= \int_{-\infty}^{\infty} dx\; \delta(x-x_o) \int_{-\infty}^{\infty} dy\; u(x,y)\, \delta(y-y_o) \\
&= \int_{-\infty}^{\infty} dx\; \delta(x-x_o)\, u(x,y_o) = u(x_o,y_o);
\end{aligned}
\tag{2.121a}
$$

and similarly for any continuous three-dimensional function $v(x,y,z)$, we have

$$
\begin{aligned}
\iiint_{-\infty}^{\infty} dx\,dy\,dz\; v(x,y,z)\, \delta(x-x_o)\, \delta(y-y_o)\, \delta(z-z_o)
&= \int_{-\infty}^{\infty} dx\; \delta(x-x_o) \int_{-\infty}^{\infty} dy\; v(x,y,z_o)\, \delta(y-y_o) \\
&= \int_{-\infty}^{\infty} dx\; \delta(x-x_o)\, v(x,y_o,z_o) = v(x_o,y_o,z_o).
\end{aligned}
\tag{2.121b}
$$

These equations can be written in vector notation as

$$
\iint_{-\infty}^{\infty} d^2\rho\; u(\vec{\rho})\, \delta(\vec{\rho}-\vec{\rho}_o) = u(\vec{\rho}_o) \tag{2.121c}
$$

and

$$
\iiint_{-\infty}^{\infty} d^3r\; v(\vec{r})\, \delta(\vec{r}-\vec{r}_o) = v(\vec{r}_o). \tag{2.121d}
$$
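
The sifting property in two dimensions can be illustrated numerically by standing in for each delta function with a narrow unit-area Gaussian. This is only a sketch under that assumption (a Gaussian is one of many functions that approach a delta function in the limit), and the test function `u`, the width, and the grid settings are arbitrary illustrative choices.

```python
import math

def narrow_gaussian(x, width):
    """Unit-area Gaussian that stands in for a delta function as width shrinks."""
    return math.exp(-x * x / (2 * width * width)) / (width * math.sqrt(2 * math.pi))

def sift2(u, x0, y0, width=0.01, half_span=10, steps=200):
    """Riemann-sum estimate of the double integral of u(x,y) d(x-x0) d(y-y0)."""
    h = 2 * half_span * width / steps          # grid spacing
    total = 0.0
    for i in range(steps + 1):
        dx = -half_span * width + i * h
        wx = narrow_gaussian(dx, width)
        for j in range(steps + 1):
            dy = -half_span * width + j * h
            total += u(x0 + dx, y0 + dy) * wx * narrow_gaussian(dy, width) * h * h
    return total

def u(x, y):
    """Arbitrary smooth test function."""
    return math.cos(x) + x * y * y

approx = sift2(u, 0.3, -0.5)   # integral against the narrow stand-in
exact = u(0.3, -0.5)           # value picked out by the true delta function
```

Shrinking `width` moves `approx` closer to `exact`, which is the sense in which the double integral picks out the value of u at the point $(x_o, y_o)$.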

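Combining Eq. (2.71f) for the one-dimensional delta function with Eqs. (2.120a) and (2.120b),
we see that in two dimensions

$$
\delta(\vec{\rho}) = \delta(x)\,\delta(y) = \int_{-\infty}^{\infty} d\xi\; e^{2\pi i x \xi} \int_{-\infty}^{\infty} d\eta\; e^{2\pi i y \eta} = \iint_{-\infty}^{\infty} d^2q\; e^{2\pi i\, \vec{\rho} \cdot \vec{q}} \tag{2.122a}
$$

using the vector notation $\vec{q} = \xi \hat{x} + \eta \hat{y}$; and in three dimensions

$$
\begin{aligned}
\delta(\vec{r}) = \delta(x)\,\delta(y)\,\delta(z)
&= \int_{-\infty}^{\infty} d\xi\; e^{2\pi i x \xi} \int_{-\infty}^{\infty} d\eta\; e^{2\pi i y \eta} \int_{-\infty}^{\infty} d\zeta\; e^{2\pi i z \zeta} \\
&= \iiint_{-\infty}^{\infty} d^3s\; e^{2\pi i\, \vec{r} \cdot \vec{s}}
\end{aligned}
\tag{2.122b}
$$

using the vector notation $\vec{s} = \xi \hat{x} + \eta \hat{y} + \zeta \hat{z}$.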

__________

This chapter provides both an intuitive understanding and a rigorous explanation of how
Fourier transforms work. Sine and cosine transforms are introduced as a way to measure how
much functions resemble sine and cosine curves, and these transforms are then combined to
create the standard complex Fourier transform. We describe convolutions and how they produce
new functions by blurring old ones. The Fourier convolution theorem, whose importance is
difficult to overstate, directly connects the convolution to Fourier-transform theory. Generalized
limits are explained to show in what sense some of the more puzzling functions found in lists of
Fourier transforms belong there, and a brief outline of generalized functions is presented to show
how delta functions can be described without making them sound like obvious nonsense.
Computers use discrete Fourier transforms to handle Fourier calculations, and we explain how
the discrete Fourier transform can be used to approximate the integral Fourier transform. The
discrete Fourier transform produces aliasing; we show when aliasing is desirable, when it is not
desirable, and when it can be neglected. All the major concepts explained in this chapter (the
linearity of the Fourier transform, the linearity of the convolution, the Fourier convolution
theorem, the idea of even and odd functions, and the delta function) have important roles to play
in the pages that follow.


Table 2.1

Row | $U(f) = \mathcal{F}^{(-ift)}(u(t))$ | $u(t) = \mathcal{F}^{(ift)}(U(f))$
(1) | [real, even] $\mathrm{Im}(U(f)) = 0$, $U(-f) = U(f)$ | [real, even] $\mathrm{Im}(u(t)) = 0$, $u(-t) = u(t)$
(2) | [imag., even] $\mathrm{Re}(U(f)) = 0$, $U(-f) = U(f)$ | [imag., even] $\mathrm{Re}(u(t)) = 0$, $u(-t) = u(t)$
(3) | [real, odd] $\mathrm{Im}(U(f)) = 0$, $U(-f) = -U(f)$ | [imag., odd] $\mathrm{Re}(u(t)) = 0$, $u(-t) = -u(t)$
(4) | [imag., odd] $\mathrm{Re}(U(f)) = 0$, $U(-f) = -U(f)$ | [real, odd] $\mathrm{Im}(u(t)) = 0$, $u(-t) = -u(t)$
(5) | [complex, even] $\mathrm{Re}(U(f)) \neq 0$ for some f, $\mathrm{Im}(U(f)) \neq 0$ for some f, $U(-f) = U(f)$ | [complex, even] $\mathrm{Re}(u(t)) \neq 0$ for some t, $\mathrm{Im}(u(t)) \neq 0$ for some t, $u(-t) = u(t)$
(6) | [complex, odd] $\mathrm{Re}(U(f)) \neq 0$ for some f, $\mathrm{Im}(U(f)) \neq 0$ for some f, $U(-f) = -U(f)$ | [complex, odd] $\mathrm{Re}(u(t)) \neq 0$ for some t, $\mathrm{Im}(u(t)) \neq 0$ for some t, $u(-t) = -u(t)$
(7) | [Hermitian] $U(-f) = U(f)^*$ | [real] $\mathrm{Im}(u(t)) = 0$
(8) | [real] $\mathrm{Im}(U(f)) = 0$ | [Hermitian] $u(-t) = u(t)^*$
(9) | [anti-Hermitian] $U(-f) = -U(f)^*$ | [imag.] $\mathrm{Re}(u(t)) = 0$
(10) | [imag.] $\mathrm{Re}(U(f)) = 0$ | [anti-Hermitian] $u(-t) = -u(t)^*$
(11) | [complex, no symmetry] | [complex, no symmetry]

Table 2.2

Row | $A_k = \dfrac{1}{T} \displaystyle\int_0^T e^{-2\pi i k t / T}\, v(t)\, dt$ | $v(t) = \displaystyle\sum_{k=-\infty}^{\infty} A_k\, e^{2\pi i k t / T}$
(1) | [real, even] $\mathrm{Im}(A_k) = 0$, $A_{-k} = A_k$ | [real, even] $\mathrm{Im}(v(t)) = 0$, $v(-t) = v(t)$
(2) | [imag., even] $\mathrm{Re}(A_k) = 0$, $A_{-k} = A_k$ | [imag., even] $\mathrm{Re}(v(t)) = 0$, $v(-t) = v(t)$
(3) | [real, odd] $\mathrm{Im}(A_k) = 0$, $A_{-k} = -A_k$ | [imag., odd] $\mathrm{Re}(v(t)) = 0$, $v(-t) = -v(t)$
(4) | [imag., odd] $\mathrm{Re}(A_k) = 0$, $A_{-k} = -A_k$ | [real, odd] $\mathrm{Im}(v(t)) = 0$, $v(-t) = -v(t)$
(5) | [complex, even] $\mathrm{Re}(A_k) \neq 0$ for some k, $\mathrm{Im}(A_k) \neq 0$ for some k, $A_{-k} = A_k$ | [complex, even] $\mathrm{Re}(v(t)) \neq 0$ for some t, $\mathrm{Im}(v(t)) \neq 0$ for some t, $v(-t) = v(t)$
(6) | [complex, odd] $\mathrm{Re}(A_k) \neq 0$ for some k, $\mathrm{Im}(A_k) \neq 0$ for some k, $A_{-k} = -A_k$ | [complex, odd] $v(-t) = -v(t)$
(7) | [Hermitian] $A_{-k} = A_k^*$ | [real] $\mathrm{Im}(v(t)) = 0$
(8) | [real] $\mathrm{Im}(A_k) = 0$ | [Hermitian] $v(-t) = v(t)^*$
(9) | [anti-Hermitian] $A_{-k} = -A_k^*$ | [imag.] $\mathrm{Re}(v(t)) = 0$
(10) | [imag.] $\mathrm{Re}(A_k) = 0$ | [anti-Hermitian] $v(-t) = -v(t)^*$
(11) | [complex, no symmetry] | [complex, no symmetry]

3
RANDOM VARIABLES, RANDOM
FUNCTIONS, AND POWER SPECTRA
Engineers and scientists are taught many statistical concepts in school, but all too often this is
done in an informal manner that does a good job of explaining how to eliminate random errors
and noise from real experimental data and a poor job of explaining how to analyze random errors
and noise in physical models. Understanding the correct way to represent random errors and
noise requires formal knowledge of the statistical concepts used to describe random signals;
otherwise, basic equations can be misunderstood and misused. For this reason, we here take a
more formal approach to the subject. Starting off with an explanation of the basics (random
functions, independent and dependent random variables, the expectation operator E, stationarity
and ergodicity) that do not require the Fourier theory discussed in the previous chapter, we then
move on to topics that do, such as autocorrelation functions, white noise, the noise-power
spectrum, and the Wiener-Khinchin theorem. The techniques explained in this chapter are used a
few times in the next chapter during the derivation of the Michelson interference equations and
then over and over again in Chapters 6, 7, and 8 to analyze the random errors and noise found in
Michelson systems.
3.1 Random and Nonrandom Variables
Random variables can be thought of as uncontrolled variables and nonrandom variables can be
thought of as controlled variables. When, for example, a computer program is being written, the
programmer controls the values of nonrandom program variables using inputs or lines of code,
but the programmer has no desire to control the program's random variables; a pseudo-random
number generator gives them values instead. In a similar spirit, a statistician constructing a set of
model equations always ends up controlling the nonrandom variables, either directly by saying
this variable can be measured like this and that variable can be measured like that, or indirectly,
by saying these variables must solve that set of equations. Even when a statistician plots a
function against its argument, the graph is constructed by specifying the argument's values and
then calculating the function according to its definition, which puts both the nonrandom argument
and the nonrandom value of the function under the statistician's control. The statistician always,
on the other hand, treats random variables in a model as if they cannot be controlled. They must
be handled as if coins will be flipped, dice rolled, or needles spun on dials to determine their
values after the model is written down. All the statistician can know is the probability this
random variable takes on that value and the probability that random variable takes on this value;
that is, he knows what the chances are that the coins, dice, or needles return one set of numbers
rather than another. Most scientists and engineers do not pay much attention to the difference
between controlled and uncontrolled variables (perhaps because most of their controlled
variables are usually a little uncontrolled in the sense that they come from imperfectly accurate
measurements), but it is very convenient when analyzing a statistical model to keep careful track
of this distinction. To help us remember which variables are random and which are not, we put a
wavy line or tilde over the random variables while writing the nonrandom variables in the usual
way. As an example of how this looks, we note that $u$, $a_0$, and $z$ are all nonrandom variables
whereas $\tilde{u}$, $\tilde{a}_0$, and $\tilde{z}$ are all random.
3.2 Random and Nonrandom Functions
When the argument of a function is a random variable, the value of the function is also random.
If, for example, $\tilde{x}$ is a random variable and f is a function, then

$$
\tilde{y} = f(\tilde{x}) \tag{3.1a}
$$

is another random variable. To give an example of how this works, we create a nonrandom time
variable t and a random angular frequency $\tilde{\omega}$, multiply them together and take the sine of their
product to get

$$
\tilde{y} = \sin(\tilde{\omega} t). \tag{3.1b}
$$

The value of $\tilde{y}$ is clearly uncontrolled; for each unpredictable value of $\tilde{\omega}$ at time t, there is a
corresponding unpredictable number $\tilde{y}$ that is given by $\sin(\tilde{\omega} t)$. This example also shows that
when a function has several arguments, its value becomes random when only one of the
arguments is random. In Eq. (3.1b) the sine of $\tilde{\omega} t$, regarded as a function of both $\tilde{\omega}$ and t, is
random even though only one of its arguments, $\tilde{\omega}$, is random.
Many times when a function has multiple arguments, the controlled argument or arguments
are more interesting than the uncontrolled argument or arguments that make the function random.
One way to handle this situation is to list only the nonrandom arguments and say that what we
have is a random function with nonrandom arguments. To show what is going on, we put a wavy
line over the function name, indicating that even though all the listed arguments are nonrandom,
the function itself is random. If, for example, we are only interested in the nonrandom time t, we
could define

$$
\tilde{R}(t) = \sin(\tilde{\omega} t) \tag{3.2a}
$$

to be a random function of the nonrandom variable t. Now whenever there is a list of time values
$t_1, t_2, \ldots$, there is a corresponding list of random variables

$$
\begin{aligned}
\tilde{u}_1 &= \tilde{R}(t_1) = \sin(\tilde{\omega} t_1), \\
\tilde{u}_2 &= \tilde{R}(t_2) = \sin(\tilde{\omega} t_2), \\
&\;\;\vdots
\end{aligned}
\tag{3.2b}
$$

Although Eq. (3.2b) implicitly assumes a list of distinct and separate t values, this reasoning still
holds up when t is explicitly made a continuous variable. Nothing, for example, stops us from
saying that for each value of t between $-\infty$ and $+\infty$, there corresponds a different random variable

$$
\tilde{u}_t = \tilde{R}(t) = \sin(\tilde{\omega} t). \tag{3.2c}
$$
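
The idea of a random function of a nonrandom argument can be sketched in a few lines of code. The example below is illustrative only: it assumes the random angular frequency is drawn uniformly between 1 and 3, which the text does not specify, and the helper name `make_realization` is hypothetical. Each call draws one frequency and returns the resulting function of the nonrandom time t.

```python
import math
import random

def make_realization(rng, omega_low=1.0, omega_high=3.0):
    """Draw one random angular frequency and return the resulting function of t."""
    omega = rng.uniform(omega_low, omega_high)  # assumed distribution for the frequency
    return lambda t: math.sin(omega * t)

rng = random.Random(0)          # seeded so the sketch is repeatable
times = [0.0, 0.5, 1.0, 1.5]    # controlled, nonrandom time values t1, t2, ...

# Three realizations of the random function, each evaluated at the same times.
realizations = [make_realization(rng) for _ in range(3)]
samples = [[R(t) for t in times] for R in realizations]
```

Each row of `samples` is one outcome of the list of random variables attached to the chosen time values; the time values themselves never change from row to row.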

The idea of a random function of nonrandom arguments becomes more attractive when there is
no realistic possibility of analyzing the effect of multiple random arguments on a single
nonrandom function. We might, for example, know exactly how N random parameters
$\tilde{r}_1, \tilde{r}_2, \ldots, \tilde{r}_N$ interact to cause an error e in an electrical signal s at time t. This lets us write the error as a
nonrandom function

$$
e(t, \tilde{r}_1, \tilde{r}_2, \ldots, \tilde{r}_N).
$$

Rather than investigating how $\tilde{r}_1, \tilde{r}_2, \ldots, \tilde{r}_N$ are behaving, it usually makes more sense to say that
there is a random noise

$$
\tilde{n}(t) = e(t, \tilde{r}_1, \tilde{r}_2, \ldots, \tilde{r}_N) \tag{3.3a}
$$

contaminating electrical signal s. Now we can put the error into our model as a random function $\tilde{n}$
that depends on a nonrandom parameter t instead of as a nonrandom function e that depends on t
and N random parameters $\tilde{r}_1, \tilde{r}_2, \ldots, \tilde{r}_N$. Sometimes the signal s in our model depends on more
than one nonrandom parameter, such as the x, y coordinates of an image point at time t. If the
corresponding error e in the signal s depends on x, y, and t as well as the random parameters
$\tilde{r}_1, \tilde{r}_2, \ldots, \tilde{r}_N$, then we can say there is a random noise

$$
\tilde{n}(x, y, t) = e(x, y, t, \tilde{r}_1, \tilde{r}_2, \ldots, \tilde{r}_N) \tag{3.3b}
$$

contaminating signal s(x, y, t). Note that we can think in terms of a signal noise $\tilde{n}(t)$ or $\tilde{n}(x,y,t)$
even when we are not sure what random arguments $\tilde{r}_1, \tilde{r}_2, \ldots, \tilde{r}_N$ make the nonrandom function e
behave randomly. This is, of course, why the idea of a random function is so useful. In this book,
we use the term random function to refer to what statisticians often prefer to call a random or
stochastic process.

3.3 Probability Density Distributions: Mean, Variance, Standard
Deviation
With every random variable $\tilde{r}$, we associate a nonrandom probability density distribution $p_{\tilde{r}}(x)$
such that $p_{\tilde{r}}(x)\,dx$ is the probability that the random variable $\tilde{r}$ takes on a value between x and
$x + dx$. The nonrandom argument x of $p_{\tilde{r}}$ is a dummy variable, and nothing stops us from calling
it r instead; in fact, that is the convention. The usual way to introduce a probability density
distribution for a random variable $\tilde{r}$ is to say that $p_{\tilde{r}}(r)\,dr$ is the probability that $\tilde{r}$ takes on a
value between r and $r + dr$. The dummy argument of a probability density distribution p must be
nonrandom, and the subscript of the probability density distribution p must be random; the
subscript, after all, labels p to show which random variable is being described. Since $\tilde{r}$ must
always take on some sort of value between $-\infty$ and $+\infty$, the sum of all the probabilities $p_{\tilde{r}}(r)\,dr$
between $-\infty$ and $+\infty$ must always be one. Consequently, for any probability density distribution
$p_{\tilde{r}}(r)$, we have

$$
\int_{-\infty}^{\infty} p_{\tilde{r}}(r)\, dr = 1. \tag{3.4}
$$

For Eq. (3.4) to make sense, the probability density distribution $p_{\tilde{r}}(r)$ must be defined for all r
between $-\infty$ and $+\infty$ with the understanding that

$$
p_{\tilde{r}}(r) = 0
$$

for those values of r to which the random variable $\tilde{r}$ can never be equal.
The predicted average or mean value of $\tilde{r}$ can be written as

$$
\mu_{\tilde{r}} = \int_{-\infty}^{\infty} p_{\tilde{r}}(r)\, r\, dr. \tag{3.5a}
$$

Note that $\mu_{\tilde{r}}$, just like $p_{\tilde{r}}$, is nonrandom even though it has a random subscript. The predicted
variance of $\tilde{r}$, which is defined to be the predicted average or mean squared difference between
$\tilde{r}$ and $\mu_{\tilde{r}}$, is another nonrandom quantity

$$
v_{\tilde{r}} = \int_{-\infty}^{\infty} p_{\tilde{r}}(r)\, (r - \mu_{\tilde{r}})^2\, dr. \tag{3.5b}
$$

Many people prefer to characterize a random number $\tilde{r}$ by its standard deviation $\sigma_{\tilde{r}}$ instead of its
variance $v_{\tilde{r}}$. The standard deviation of a random number $\tilde{r}$ is defined to be the square root of the
variance,

$$
\sigma_{\tilde{r}} = \sqrt{v_{\tilde{r}}}. \tag{3.5c}
$$

Of course $\sigma_{\tilde{r}}$, like $v_{\tilde{r}}$, is a nonrandom quantity. In general, the probability density distribution $p_{\tilde{r}}$
lets us find the predicted average or mean value of any nonrandom function f of the random
variable $\tilde{r}$ by calculating the nonrandom quantity

$$
\text{predicted mean value of } f = \int_{-\infty}^{\infty} p_{\tilde{r}}(r)\, f(r)\, dr. \tag{3.5d}
$$

When $f(r) = r$, this equation reduces to formula (3.5a) for $\mu_{\tilde{r}}$; and when $f(r) = (r - \mu_{\tilde{r}})^2$, this
equation reduces to formula (3.5b) for $v_{\tilde{r}}$.
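Many random variables found in nature appear to obey a Gaussian, or normal, probability
distribution:

$$
p_{\tilde{r}}(r) = \frac{1}{\sigma_{\tilde{r}} \sqrt{2\pi}}\, e^{-\frac{(r - \mu_{\tilde{r}})^2}{2\sigma_{\tilde{r}}^2}}. \tag{3.6a}
$$

This can in part be explained as a consequence of the central limit theorem,$^{25}$ which is described
in Sec. 3.11 below. It is easy to show that parameter $\mu_{\tilde{r}}$ in Eq. (3.6a) is the mean of the Gaussian
distribution. Consulting formula (3.5a) above, we see that the mean of the distribution in (3.6a)
must be

$$
\int_{-\infty}^{\infty} \frac{r}{\sigma_{\tilde{r}} \sqrt{2\pi}}\, e^{-\frac{(r - \mu_{\tilde{r}})^2}{2\sigma_{\tilde{r}}^2}}\, dr
= \frac{1}{\sigma_{\tilde{r}} \sqrt{2\pi}} \int_{-\infty}^{\infty} (r' + \mu_{\tilde{r}})\, e^{-\frac{(r')^2}{2\sigma_{\tilde{r}}^2}}\, dr', \tag{3.6b}
$$

where on the right-hand side the variable of integration is changed to $r' = r - \mu_{\tilde{r}}$. This becomes,
consulting Eq. (7A.3d) in Appendix 7A of Chapter 7,

$$
\begin{aligned}
\frac{1}{\sigma_{\tilde{r}} \sqrt{2\pi}} \int_{-\infty}^{\infty} (r' + \mu_{\tilde{r}})\, e^{-\frac{(r')^2}{2\sigma_{\tilde{r}}^2}}\, dr'
&= \frac{1}{\sigma_{\tilde{r}} \sqrt{2\pi}} \int_{-\infty}^{\infty} r'\, e^{-\frac{(r')^2}{2\sigma_{\tilde{r}}^2}}\, dr'
 + \frac{\mu_{\tilde{r}}}{\sigma_{\tilde{r}} \sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{(r')^2}{2\sigma_{\tilde{r}}^2}}\, dr' \\
&= \frac{1}{\sigma_{\tilde{r}} \sqrt{2\pi}} \int_{-\infty}^{\infty} r'\, e^{-\frac{(r')^2}{2\sigma_{\tilde{r}}^2}}\, dr' + \mu_{\tilde{r}}.
\end{aligned}
\tag{3.6c}
$$

25. Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, 3rd ed. (McGraw-Hill, Inc., New
York, 1991), p. 214.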

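If we replace $r'$ by $-r'$ in

$$
g(r') = r'\, e^{-\frac{(r')^2}{2\sigma_{\tilde{r}}^2}},
$$

it is the same as multiplying g by $-1$, which makes g an odd function [see Eq. (2.11b) in Chapter
2]. Hence, according to Eq. (2.17) in Chapter 2,

$$
\int_{-\infty}^{\infty} r'\, e^{-\frac{(r')^2}{2\sigma_{\tilde{r}}^2}}\, dr' = 0
$$

because it is the integral of an odd function between $-\infty$ and $+\infty$. Therefore, Eq. (3.6c) simplifies
to

$$
\frac{1}{\sigma_{\tilde{r}} \sqrt{2\pi}} \int_{-\infty}^{\infty} (r' + \mu_{\tilde{r}})\, e^{-\frac{(r')^2}{2\sigma_{\tilde{r}}^2}}\, dr' = \mu_{\tilde{r}}, \tag{3.6d}
$$

which can be substituted back into (3.6b) to get

$$
\int_{-\infty}^{\infty} \frac{r}{\sigma_{\tilde{r}} \sqrt{2\pi}}\, e^{-\frac{(r - \mu_{\tilde{r}})^2}{2\sigma_{\tilde{r}}^2}}\, dr = \mu_{\tilde{r}}. \tag{3.6e}
$$

This shows that, as claimed above, parameter $\mu_{\tilde{r}}$ is the mean of the probability distribution
specified in Eq. (3.6a). It is just as easy to show that $\sigma_{\tilde{r}}$ is the standard deviation of the
distribution in (3.6a). From (3.5b) we know that the variance of this distribution is

$$
\int_{-\infty}^{\infty} \frac{(r - \mu_{\tilde{r}})^2}{\sigma_{\tilde{r}} \sqrt{2\pi}}\, e^{-\frac{(r - \mu_{\tilde{r}})^2}{2\sigma_{\tilde{r}}^2}}\, dr
= \int_{-\infty}^{\infty} \frac{(r')^2}{\sigma_{\tilde{r}} \sqrt{2\pi}}\, e^{-\frac{(r')^2}{2\sigma_{\tilde{r}}^2}}\, dr'
$$

when the variable of integration is changed to $r' = r - \mu_{\tilde{r}}$. According to Eq. (7A.3b) in Appendix
7A of Chapter 7, we can write

$$
\int_{-\infty}^{\infty} \frac{(r - \mu_{\tilde{r}})^2}{\sigma_{\tilde{r}} \sqrt{2\pi}}\, e^{-\frac{(r - \mu_{\tilde{r}})^2}{2\sigma_{\tilde{r}}^2}}\, dr = \sigma_{\tilde{r}}^2. \tag{3.6f}
$$

Consequently, $\sigma_{\tilde{r}}^2$ is the variance of this probability density distribution. The square root of the
variance is the standard deviation according to (3.5c). Hence, it is, as claimed, easy to see that $\sigma_{\tilde{r}}$
is the standard deviation of the probability density distribution in Eq. (3.6a).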
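
The three Gaussian results (unit area, mean, and variance) can be confirmed by direct numerical integration. The sketch below uses a simple midpoint rule over ten standard deviations on either side of the mean; the parameter values and the helper names `gaussian_pdf` and `integrate` are illustrative assumptions, not anything taken from the text.

```python
import math

def gaussian_pdf(r, mu, sigma):
    """Gaussian probability density with mean mu and standard deviation sigma."""
    return math.exp(-(r - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def integrate(func, lo, hi, steps=20000):
    """Midpoint-rule approximation of the integral of func from lo to hi."""
    width = (hi - lo) / steps
    return sum(func(lo + (k + 0.5) * width) for k in range(steps)) * width

mu, sigma = 1.5, 0.7
lo, hi = mu - 10 * sigma, mu + 10 * sigma   # tails beyond 10 sigma are negligible

total = integrate(lambda r: gaussian_pdf(r, mu, sigma), lo, hi)          # area under the density
mean = integrate(lambda r: r * gaussian_pdf(r, mu, sigma), lo, hi)       # first moment
variance = integrate(lambda r: (r - mean) ** 2 * gaussian_pdf(r, mu, sigma), lo, hi)
std_dev = math.sqrt(variance)
```

With these settings `total`, `mean`, and `std_dev` should agree with 1, `mu`, and `sigma` to well within the accuracy of the integration rule.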
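When $\tilde{r}$ can only take on the values $r_1, r_2, \ldots, r_N$, then $p_{\tilde{r}}$ can be written as a sum of delta
functions. If, for example, $p_1$ is the probability that $\tilde{r}$ is $r_1$, $p_2$ is the probability that $\tilde{r}$ is $r_2$, and so on
up to $p_N$, the probability that $\tilde{r}$ is $r_N$, then

$$
p_{\tilde{r}}(r) = \sum_{k=1}^{N} p_k\, \delta(r - r_k). \tag{3.7a}
$$

The integral for the predicted mean value of $\tilde{r}$ in Eq. (3.5a) now reduces to

$$
\mu_{\tilde{r}} = \int_{-\infty}^{\infty} \left[\sum_{k=1}^{N} p_k\, \delta(r - r_k)\right] r\, dr
= \sum_{k=1}^{N} p_k \int_{-\infty}^{\infty} r\, \delta(r - r_k)\, dr
= \sum_{k=1}^{N} p_k r_k \tag{3.7b}
$$

as we expect. Similarly, according to Eq. (3.5b), the predicted variance of $\tilde{r}$ becomes

$$
\begin{aligned}
v_{\tilde{r}} &= \int_{-\infty}^{\infty} \left[\sum_{k=1}^{N} p_k\, \delta(r - r_k)\right] (r - \mu_{\tilde{r}})^2\, dr
= \sum_{k=1}^{N} p_k \int_{-\infty}^{\infty} (r - \mu_{\tilde{r}})^2\, \delta(r - r_k)\, dr \\
&= \sum_{k=1}^{N} p_k (r_k - \mu_{\tilde{r}})^2;
\end{aligned}
\tag{3.7c}
$$

and, according to Eq. (3.5d), the predicted mean value of $f(\tilde{r})$ becomes

$$
\int_{-\infty}^{\infty} \left[\sum_{k=1}^{N} p_k\, \delta(r - r_k)\right] f(r)\, dr
= \sum_{k=1}^{N} p_k \int_{-\infty}^{\infty} f(r)\, \delta(r - r_k)\, dr
= \sum_{k=1}^{N} p_k f(r_k). \tag{3.7d}
$$

Again, the integral formulas reduce to the correct probability-weighted sums. Looking at the
limiting case where $N = 1$ and $p_1 = 1$, we get

$$
p_{\tilde{r}}(r) = \delta(r - r_1)
$$

so that

$$
\mu_{\tilde{r}} = \int_{-\infty}^{\infty} r\, \delta(r - r_1)\, dr = r_1 \tag{3.7e}
$$

and the variance about $\mu_{\tilde{r}} = r_1$ is

$$
v_{\tilde{r}} = \int_{-\infty}^{\infty} (r - r_1)^2\, \delta(r - r_1)\, dr = (r_1 - r_1)^2 = 0. \tag{3.7f}
$$

Results (3.7e) and (3.7f) show that the value of $\tilde{r}$ is now completely controlled; it must be equal
to $r_1$ and no longer needs to be treated like a random variable. Hence, the limiting case where
$N = 1$ and $p_1 = 1$ can be regarded as changing a random variable into a nonrandom variable.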
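
The probability-weighted sums in Eqs. (3.7b) through (3.7d) translate directly into code. The sketch below uses a fair six-sided die as an assumed example and exact rational arithmetic; the function names are hypothetical. It also evaluates the limiting case of a single value with probability one.

```python
from fractions import Fraction

def discrete_mean(values, probs, f=lambda r: r):
    """Probability-weighted sum of f(r_k), as in Eq. (3.7d); f(r) = r gives the mean."""
    return sum(p * f(r) for r, p in zip(values, probs))

def discrete_variance(values, probs):
    """Probability-weighted sum of squared deviations from the mean, as in Eq. (3.7c)."""
    mu = discrete_mean(values, probs)
    return discrete_mean(values, probs, lambda r: (r - mu) ** 2)

# Assumed example: a fair die with faces 1..6, each with probability 1/6.
faces = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6
die_mean = discrete_mean(faces, probs)
die_var = discrete_variance(faces, probs)

# Limiting case N = 1, p_1 = 1: the variable is no longer random.
sure_mean = discrete_mean([4], [Fraction(1)])
sure_var = discrete_variance([4], [Fraction(1)])
```

Here `die_mean` is 7/2 and `die_var` is 35/12, while the single-value case returns the value itself with zero variance.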
3.4 The Expectation Operator
Statisticians avoid the mathematical awkwardness of probability density distributions and their
associated integrals by defining an expectation operator E. For any nonrandom function f with a
random argument $\tilde{x}$, we say that

$$
\mathrm{E}\big(f(\tilde{x})\big)
$$

is the predicted mean, or average, value of $f(\tilde{x})$. We also call $\mathrm{E}(f(\tilde{x}))$ the expectation value of
$f(\tilde{x})$. Mathematically we define

$$
\mathrm{E}\big(f(\tilde{x})\big) = \int_{-\infty}^{\infty} p_{\tilde{x}}(x)\, f(x)\, dx. \tag{3.8a}
$$

Just like before, $p_{\tilde{x}}(x)\,dx$ is the probability that the random variable $\tilde{x}$ takes on a value between
x and $x + dx$. We can find $\mathrm{E}(\tilde{x})$, the expectation value of $\tilde{x}$, by choosing $f(x) = x$ in Eq. (3.8a)
to get

$$
\mathrm{E}(\tilde{x}) = \int_{-\infty}^{\infty} p_{\tilde{x}}(x)\, x\, dx. \tag{3.8b}
$$

Comparing this to Eq. (3.5a) above, we see that the expectation value of $\tilde{x}$ is the same as the
predicted mean or average value of $\tilde{x}$,

$$
\mathrm{E}(\tilde{x}) = \mu_{\tilde{x}}, \tag{3.8c}
$$

which makes good intuitive sense. Choosing $f(x) = (x - \mu_{\tilde{x}})^2$ gives

$$
\mathrm{E}\big((\tilde{x} - \mu_{\tilde{x}})^2\big) = \int_{-\infty}^{\infty} p_{\tilde{x}}(x)\, (x - \mu_{\tilde{x}})^2\, dx. \tag{3.8d}
$$

Comparing this to Eq. (3.5b) above, we see that $\mathrm{E}\big((\tilde{x} - \mu_{\tilde{x}})^2\big)$ is the variance of $\tilde{x}$,

$$
v_{\tilde{x}} = \mathrm{E}\big((\tilde{x} - \mu_{\tilde{x}})^2\big). \tag{3.8e}
$$

A notation often used for the variance of $\tilde{x}$ instead of $v_{\tilde{x}}$ is

$$
\mathrm{Var}(\tilde{x}) = \mathrm{E}\big((\tilde{x} - \mu_{\tilde{x}})^2\big). \tag{3.8f}
$$

When the E operator is applied to any sort of random variable or function (for example,
$f(\tilde{x})$), the result is always a nonrandom variable or function, namely

$$
\int_{-\infty}^{\infty} p_{\tilde{x}}(x)\, f(x)\, dx.
$$

For example, the characteristic function $\Phi_{\tilde{x}}$ of a random variable $\tilde{x}$, which is the nonrandom
Fourier transform of the probability density distribution of $\tilde{x}$,

$$
\Phi_{\tilde{x}}(\nu) = \int_{-\infty}^{\infty} p_{\tilde{x}}(x)\, e^{-2\pi i \nu x}\, dx, \tag{3.9a}
$$

can be written as, using the E operator,

$$
\Phi_{\tilde{x}}(\nu) = \mathrm{E}\big(e^{-2\pi i \nu \tilde{x}}\big). \tag{3.9b}
$$
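
One practical reading of the E operator is as a long-run average over many independent draws. The sketch below estimates the mean, the variance, and a characteristic function this way for a Gaussian random variable. Several things here are assumptions rather than statements from the text: the sample average is only an estimate of E, the exponent is taken with a minus sign, the frequency symbol `nu` and the parameter values are arbitrary, and the closed form used for comparison is the standard Gaussian result under that sign convention.

```python
import cmath
import math
import random

def expectation(f, samples):
    """Estimate E(f(x)) by averaging f over independent draws of the random variable."""
    return sum(f(x) for x in samples) / len(samples)

rng = random.Random(1)                 # seeded so the sketch is repeatable
mu, sigma, nu = 0.4, 1.2, 0.1
samples = [rng.gauss(mu, sigma) for _ in range(100000)]

mean_est = expectation(lambda x: x, samples)                    # estimate of E(x)
var_est = expectation(lambda x: (x - mean_est) ** 2, samples)   # estimate of the variance

# Characteristic function with an assumed minus-sign exponent, exp(-2*pi*i*nu*x).
char_est = expectation(lambda x: cmath.exp(-2j * math.pi * nu * x), samples)
# Closed form for a Gaussian under the same assumed convention.
char_exact = cmath.exp(-2j * math.pi * nu * mu - 2 * (math.pi * sigma * nu) ** 2)
```

The estimates approach the exact values as the number of draws grows, with an error that shrinks roughly as one over the square root of the sample count.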

To specify what happens when E is applied to a nonrandom variable c, we set up a random
variable $\tilde{\gamma}$ that has the probability density distribution

$$
p_{\tilde{\gamma}}(\gamma) = \delta(\gamma - c). \tag{3.9c}
$$

According to the discussion following Eqs. (3.7e,f) above, this makes $\tilde{\gamma}$ equivalent to the
nonrandom variable c. Consequently, we can say that

$$
\mathrm{E}(c) = \mathrm{E}(\tilde{\gamma}) \tag{3.9d}
$$

and use Eq. (3.8b) above to get

$$
\mathrm{E}(c) = \int_{-\infty}^{\infty} p_{\tilde{\gamma}}(\gamma)\, \gamma\, d\gamma = \int_{-\infty}^{\infty} \delta(\gamma - c)\, \gamma\, d\gamma = c. \tag{3.9e}
$$

This justifies the general rule, which also makes good intuitive sense, that

$$
\mathrm{E}(c) = c \tag{3.9f}
$$

for any nonrandom quantity c.
The expectation operator E can be applied to multiple random variables at the same time; all
that we need is the appropriate probability density distribution. Suppose, for example, that the
behavior of two random variables $\tilde{x}$ and $\tilde{X}$ is described by a two-argument probability density
distribution $p_{\tilde{x}\tilde{X}}(x, X)$, with $p_{\tilde{x}\tilde{X}}(x, X)\,dx\,dX$ being the probability that the random variable $\tilde{x}$
takes on a value between x and $x + dx$ while the random variable $\tilde{X}$ takes on a value between X
and $X + dX$. No matter what the behavior of random variables $\tilde{x}$ and $\tilde{X}$, we can always
construct an appropriate probability density distribution $p_{\tilde{x}\tilde{X}}$. Since $\tilde{x}$ and $\tilde{X}$ must always take
on some values in the intervals

$$
-\infty < \tilde{x} < \infty \quad \text{and} \quad -\infty < \tilde{X} < \infty,
$$

the same reasoning used to produce Eq. (3.4) now shows that

$$
\int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dX\; p_{\tilde{x}\tilde{X}}(x, X) = 1 \tag{3.10a}
$$

for any probability density distribution $p_{\tilde{x}\tilde{X}}$. The expectation value of any function of the random
variables $\tilde{x}$ and $\tilde{X}$, such as $f(\tilde{x}, \tilde{X})$, is defined to be

$$
\mathrm{E}\big(f(\tilde{x}, \tilde{X})\big) = \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dX\; p_{\tilde{x}\tilde{X}}(x, X)\, f(x, X). \tag{3.10b}
$$

In particular, we can always set $f(\tilde{x}, \tilde{X}) = \tilde{x}\tilde{X}$ to get the expected value of the random variables'
product,

$$
\mathrm{E}(\tilde{x}\tilde{X}) = \int_{-\infty}^{\infty} x\, dx \int_{-\infty}^{\infty} dX\; X\, p_{\tilde{x}\tilde{X}}(x, X). \tag{3.10c}
$$
3.5 Independent and Dependent Random Variables
When comparing two random variables such as $\tilde{x}$ and $\tilde{X}$, one of the first questions that arises is
whether they are dependent or independent. When two random variables are dependent, the
random variables influence each other; and when two random variables are independent, they do
not.
Independent random variables are used to describe random quantities for which no cause-and-
effect relationship can be found. When, for example, we pick a car randomly from all the cars
sold in a given year, there is no reason to expect that the random variable representing the
brightness of the car's headlights is associated with any particular value of the random variable
representing the car's length. Lacking any evidence to the contrary, then, we say that these two
random variables ought to be independent. Similarly, if we pick someone at random from a
collection of adults, there is no obvious reason to assume that the random variable representing
the person's yearly income is associated with any particular value of the person's shoe size.
Again, we might assume that these are independent random variables. In general, when there is
no reason to connect the values of random quantities, we set them up in our models as
independent random variables.
Many times random variables turn out to be dependent in surprising ways. Returning to the
first of the previous examples, when we examine the connection between a car's length and the
brightness of its headlights, it might turn out that very short cars are more likely to be European
sports cars frequently washed by their owners, making them more likely to have cleaner and thus
brighter headlights. Similarly, returning to the second example, a person's shoe size and height
are connected; and statisticians have in fact shown that tall people, who are more likely to wear
large shoes, are also more likely to earn large incomes (if only because people living in the
United States, Australia, Canada, and Europe are more likely to be tall). Just as in these two
examples, many random variables that look like they ought to be unconnected and independent
turn out, after closer examination, to be dependent; in this sense, the independence of random
variables is the ideal case from which realistic random variables tend to deviate to a greater or
lesser degree.
3.6 Analyzing Independent Random Variables
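When $\tilde{x}$ and $\tilde{X}$ are independent random variables, their probability density distribution can be
written as$^{26}$

$$
p_{\tilde{x}\tilde{X}}(x, X) = p_{\tilde{x}}(x)\, p_{\tilde{X}}(X), \tag{3.11a}
$$

where $p_{\tilde{x}}$ and $p_{\tilde{X}}$ are the standard probability density distributions for $\tilde{x}$ and $\tilde{X}$ when $\tilde{x}$ and $\tilde{X}$
are treated as solitary random variables. This means that $p_{\tilde{x}}(x)\,dx$ is the probability that $\tilde{x}$ lies
between x and $x + dx$ regardless of the value of $\tilde{X}$, and $p_{\tilde{X}}(X)\,dX$ is the probability that $\tilde{X}$ lies
between X and $X + dX$ regardless of the value of $\tilde{x}$. We see that, according to Eqs. (3.10c) and
(3.11a), the expectation value of the product $\tilde{x}\tilde{X}$ of two independent random variables is

$$
\begin{aligned}
\mathrm{E}(\tilde{x}\tilde{X}) &= \int_{-\infty}^{\infty} x\, dx \int_{-\infty}^{\infty} dX\; X\, p_{\tilde{x}\tilde{X}}(x, X)
= \int_{-\infty}^{\infty} x\, dx \int_{-\infty}^{\infty} dX\; X\, p_{\tilde{x}}(x)\, p_{\tilde{X}}(X) \\
&= \left[\int_{-\infty}^{\infty} p_{\tilde{x}}(x)\, x\, dx\right] \cdot \left[\int_{-\infty}^{\infty} p_{\tilde{X}}(X)\, X\, dX\right].
\end{aligned}
$$

According to Eqs. (3.8b) and (3.8c), this can be written as

$$
\mathrm{E}(\tilde{x}\tilde{X}) = \mathrm{E}(\tilde{x})\, \mathrm{E}(\tilde{X}) \tag{3.11b}
$$

or

$$
\mathrm{E}(\tilde{x}\tilde{X}) = \mu_{\tilde{x}}\, \mu_{\tilde{X}}. \tag{3.11c}
$$

26. Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, p. 132.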
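
The product rule for independent random variables can be seen in a quick simulation. The sketch below is an illustration under assumed distributions (one uniform variable and one Gaussian variable drawn independently), with sample averages standing in for E. It also includes a dependent pair, where the second variable is simply a copy of the first, to show that the rule fails without independence.

```python
import random

def mean(values):
    """Sample average, used here as an estimate of the expectation value."""
    return sum(values) / len(values)

rng = random.Random(2)   # seeded so the sketch is repeatable
n = 200000
x = [rng.uniform(0.0, 2.0) for _ in range(n)]   # assumed distribution, mean 1
X = [rng.gauss(3.0, 1.0) for _ in range(n)]     # assumed distribution, mean 3, drawn independently of x

# Independent pair: the mean of the product matches the product of the means.
independent_product = mean([a * b for a, b in zip(x, X)])
product_of_means = mean(x) * mean(X)

# Dependent pair: the second variable is a copy of the first, so the mean of
# the product is E(x^2), which exceeds E(x)^2 by the variance of x.
dependent_product = mean([a * a for a in x])
dependent_means = mean(x) * mean(x)
```

For the independent pair the two numbers agree to within sampling error; for the dependent pair they differ by roughly 1/3, the variance of the assumed uniform distribution.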
3.7 Large Numbers of Random Variables
Our analysis of two random variables can be extended in a straightforward way to large
collections of random variables. If there are N random variables $\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_N$, then we can
always construct a probability density distribution

$$
p_{\tilde{x}_1 \tilde{x}_2 \cdots \tilde{x}_N}(x_1, x_2, \ldots, x_N)
$$

such that

$$
p_{\tilde{x}_1 \tilde{x}_2 \cdots \tilde{x}_N}(x_1, x_2, \ldots, x_N)\, dx_1\, dx_2 \cdots dx_N
$$

is the probability that $\tilde{x}_1$ lies between $x_1$ and $x_1 + dx_1$, that $\tilde{x}_2$ lies between $x_2$ and $x_2 + dx_2$, ... ,
that $\tilde{x}_N$ lies between $x_N$ and $x_N + dx_N$. The expectation value of any function $f(\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_N)$ of
these N random variables is

$$
\mathrm{E}\big(f(\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_N)\big)
= \int_{-\infty}^{\infty} dx_1 \int_{-\infty}^{\infty} dx_2 \cdots \int_{-\infty}^{\infty} dx_N\; f(x_1, x_2, \ldots, x_N)\, p_{\tilde{x}_1 \tilde{x}_2 \cdots \tilde{x}_N}(x_1, x_2, \ldots, x_N). \tag{3.12a}
$$
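
For discrete random variables the multiple integral in Eq. (3.12a) becomes a sum over every possible outcome, which is easy to write out. The sketch below assumes a hypothetical joint distribution (three fair coins recorded as 0 or 1) and uses exact rational arithmetic; the function name `expectation` is an illustrative choice.

```python
from fractions import Fraction
from itertools import product

# Hypothetical joint distribution: three fair coins, recorded as 0 or 1,
# with every outcome (x1, x2, x3) having probability 1/8.
outcomes = list(product([0, 1], repeat=3))
joint = {outcome: Fraction(1, 8) for outcome in outcomes}

def expectation(f, joint):
    """Sum f(x1, ..., xN) weighted by the joint probability of each outcome."""
    return sum(p * f(*outcome) for outcome, p in joint.items())

total_prob = sum(joint.values())                                 # must equal 1
mean_sum = expectation(lambda a, b, c: a + b + c, joint)         # expected number of ones
mean_max = expectation(lambda a, b, c: max(a, b, c), joint)      # chance of at least one 1
```

Any other table of joint probabilities, including one describing dependent variables, can be dropped into `joint` without changing `expectation`.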

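Note that nothing has been said so far about the connections between these N random variables;
they could be either dependent or independent. If we now assume that these N random variables
are all independent with respect to one another, then

$$
p_{\tilde{x}_1 \tilde{x}_2 \cdots \tilde{x}_N}(x_1, x_2, \ldots, x_N) = p_{\tilde{x}_1}(x_1)\, p_{\tilde{x}_2}(x_2) \cdots p_{\tilde{x}_N}(x_N), \tag{3.12b}
$$

where $p_{\tilde{x}_1}(x_1)\,dx_1$ is the probability that $\tilde{x}_1$ lies between $x_1$ and $x_1 + dx_1$ regardless of the values
of the other $N - 1$ random variables, $p_{\tilde{x}_2}(x_2)\,dx_2$ is the probability that $\tilde{x}_2$ lies between $x_2$ and
$x_2 + dx_2$ regardless of the values of the other $N - 1$ random variables, ..., and $p_{\tilde{x}_N}(x_N)\,dx_N$ is the
probability that $\tilde{x}_N$ lies between $x_N$ and $x_N + dx_N$ regardless of the values of the other $N - 1$
random variables. The expectation value of the product of these N random variables can now be
written as, setting $f(\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_N) = \tilde{x}_1 \tilde{x}_2 \cdots \tilde{x}_N$ in Eq. (3.12a),

$$
\begin{aligned}
\mathrm{E}(\tilde{x}_1 \tilde{x}_2 \cdots \tilde{x}_N)
&= \int_{-\infty}^{\infty} dx_1 \int_{-\infty}^{\infty} dx_2 \cdots \int_{-\infty}^{\infty} dx_N\; [x_1 x_2 \cdots x_N]\, p_{\tilde{x}_1 \tilde{x}_2 \cdots \tilde{x}_N}(x_1, x_2, \ldots, x_N) \\
&= \left[\int_{-\infty}^{\infty} p_{\tilde{x}_1}(x_1)\, x_1\, dx_1\right] \cdot \left[\int_{-\infty}^{\infty} p_{\tilde{x}_2}(x_2)\, x_2\, dx_2\right] \cdots \left[\int_{-\infty}^{\infty} p_{\tilde{x}_N}(x_N)\, x_N\, dx_N\right].
\end{aligned}
$$

Again, we consult Eqs. (3.8b) and (3.8c) to get

$$
\mathrm{E}(\tilde{x}_1 \tilde{x}_2 \cdots \tilde{x}_N) = \mathrm{E}(\tilde{x}_1)\, \mathrm{E}(\tilde{x}_2) \cdots \mathrm{E}(\tilde{x}_N) \tag{3.12c}
$$

or

$$
\mathrm{E}(\tilde{x}_1 \tilde{x}_2 \cdots \tilde{x}_N) = \mu_{\tilde{x}_1}\, \mu_{\tilde{x}_2} \cdots \mu_{\tilde{x}_N}. \tag{3.12d}
$$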
3.8 Single-Variable Means from Multivariable Distributions
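We can calculate the predicted mean values of $\tilde{x}$ and $\tilde{X}$ by choosing $f(x, X) = x$ and $f(x, X) = X$ in Eq. (3.10b) above. This gives

\[ \mu_{\tilde{x}} = \mathrm{E}(\tilde{x}) = \int_{-\infty}^{\infty}\! dx \int_{-\infty}^{\infty}\! dX \; x \, p_{\tilde{x}\tilde{X}}(x, X) \tag{3.13a} \]

and

\[ \mu_{\tilde{X}} = \mathrm{E}(\tilde{X}) = \int_{-\infty}^{\infty}\! dx \int_{-\infty}^{\infty}\! dX \; X \, p_{\tilde{x}\tilde{X}}(x, X). \tag{3.13b} \]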

Writing the double integrals as

\[ \mathrm{E}(\tilde{x}) = \int_{-\infty}^{\infty} x \Big[ \int_{-\infty}^{\infty} p_{\tilde{x}\tilde{X}}(x, X)\, dX \Big] dx \tag{3.13c} \]

and

\[ \mathrm{E}(\tilde{X}) = \int_{-\infty}^{\infty} X \Big[ \int_{-\infty}^{\infty} p_{\tilde{x}\tilde{X}}(x, X)\, dx \Big] dX, \tag{3.13d} \]

we compare them to the formula for the expected value of a random variable given in Eq. (3.8b). This comparison suggests that, if we want to specify the behavior of one random variable while disregarding the presence of the other, we can construct the single-argument probability density distributions of $\tilde{x}$ and $\tilde{X}$ by writing

\[ p_{\tilde{x}}(x) = \int_{-\infty}^{\infty} p_{\tilde{x}\tilde{X}}(x, X)\, dX \tag{3.13e} \]

and

\[ p_{\tilde{X}}(X) = \int_{-\infty}^{\infty} p_{\tilde{x}\tilde{X}}(x, X)\, dx. \tag{3.13f} \]

Up to this point, none of the integrations have required assumptions about the dependence or independence of the random variables, so Eqs. (3.13e) and (3.13f) hold true both for dependent and independent random variables $\tilde{x}$ and $\tilde{X}$. If we specify that $\tilde{x}$ and $\tilde{X}$ are independent, then Eq. (3.11a) can be substituted into (3.13e) and (3.13f) to get

\[ p_{\tilde{x}}(x) = \int_{-\infty}^{\infty} p_{\tilde{x}}(x)\, p_{\tilde{X}}(X)\, dX = p_{\tilde{x}}(x) \int_{-\infty}^{\infty} p_{\tilde{X}}(X)\, dX \]

and

\[ p_{\tilde{X}}(X) = \int_{-\infty}^{\infty} p_{\tilde{x}}(x)\, p_{\tilde{X}}(X)\, dx = p_{\tilde{X}}(X) \int_{-\infty}^{\infty} p_{\tilde{x}}(x)\, dx. \]

Glancing back at Eq. (3.4), we note that these last two equalities are trivially true, because in both cases the right-most integrals must be one.
3.9 Analyzing Dependent Random Variables
Having found formulas for $\mu_{\tilde{x}}$ and $\mu_{\tilde{X}}$ that hold true for any pair of dependent or independent random variables $\tilde{x}$ and $\tilde{X}$, we now use $\mu_{\tilde{x}}$ and $\mu_{\tilde{X}}$ to define a new random variable

\[ \tilde{y} = (\tilde{x} - \mu_{\tilde{x}})(\tilde{X} - \mu_{\tilde{X}}). \tag{3.14a} \]

From Eq. (3.8c), we know that

\[ \mathrm{E}(\tilde{y}) = \mathrm{E}\big( (\tilde{x} - \mu_{\tilde{x}})(\tilde{X} - \mu_{\tilde{X}}) \big) \tag{3.14b} \]

is just the predicted average value of $\tilde{y}$. We can imagine, each time we acquire a random pair of $\tilde{x}$ and $\tilde{X}$ values, comparing the sizes of $\tilde{x}$ and $\tilde{X}$ to their respective averages $\mu_{\tilde{x}}$ and $\mu_{\tilde{X}}$ by subtracting $\mu_{\tilde{x}}$ and $\mu_{\tilde{X}}$ from them. If $\tilde{x}$ and $\tilde{X}$ are both simultaneously greater than, or both simultaneously less than, their averages, then $\tilde{y}$ is positive; and if one is greater than its average when the other is less than its average, then $\tilde{y}$ is negative. If there is a tendency for one of the random variables to exceed its average whenever the other exceeds its average, or a tendency for one of the random variables to fall below its average whenever the other falls below its average, then $\tilde{y}$ has a greater probability of being positive than negative, so

\[ \mathrm{E}(\tilde{y}) > 0. \]

If, on the other hand, there is a tendency for one of the random variables to exceed its average when the other falls below its average, then $\tilde{y}$ has a greater probability of being negative than positive, so

\[ \mathrm{E}(\tilde{y}) < 0. \]

If $\mathrm{E}(\tilde{y})$ is zero, it indicates that $\tilde{y}$ is just as likely to be negative as positive, which means that knowing one variable lies above or below its average tells us nothing about the likelihood that the other variable lies above or below its average. Writing out the integral formula for $\mathrm{E}(\tilde{y})$ in terms of the probability density distribution $p_{\tilde{x}\tilde{X}}(x, X)$ gives

\[
\mathrm{E}(\tilde{y}) = \mathrm{E}\big( (\tilde{x} - \mu_{\tilde{x}})(\tilde{X} - \mu_{\tilde{X}}) \big)
= \int_{-\infty}^{\infty}\! dx \int_{-\infty}^{\infty}\! dX \; [(x - \mu_{\tilde{x}})(X - \mu_{\tilde{X}})]\; p_{\tilde{x}\tilde{X}}(x, X).
\tag{3.14c}
\]

We say that the value of the integral in Eq. (3.14c) measures the covariance of random variables $\tilde{x}$ and $\tilde{X}$. When

\[ \mathrm{E}(\tilde{y}) = \mathrm{E}\big( (\tilde{x} - \mu_{\tilde{x}})(\tilde{X} - \mu_{\tilde{X}}) \big) \]

is greater than zero, $\tilde{x}$ and $\tilde{X}$ are said to be positively correlated; when it is less than zero, $\tilde{x}$ and $\tilde{X}$ are said to be negatively correlated; and when it equals zero, $\tilde{x}$ and $\tilde{X}$ are said to be uncorrelated.

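Evaluating $\mathrm{E}(\tilde{y})$ and finding it not equal to zero is a standard way of showing that two random variables $\tilde{x}$ and $\tilde{X}$ are correlated and so cannot be independent. We cannot, however, say that $\tilde{x}$ and $\tilde{X}$ are independent just because $\mathrm{E}(\tilde{y})$ is zero; that is, saying that $\tilde{x}$ and $\tilde{X}$ are uncorrelated is a weaker statement than saying that $\tilde{x}$ and $\tilde{X}$ are independent. To show why this is so, we set up a random variable $\tilde{\theta}$ which has a probability density distribution

\[
p_{\tilde{\theta}}(\theta) =
\begin{cases}
1/(2\pi) & \text{for } 0 \le \theta < 2\pi \\
0 & \text{for } \theta < 0 \text{ or } 2\pi \le \theta .
\end{cases}
\tag{3.15a}
\]

The probability density distribution $p_{\tilde{\theta}}$ shows that $\tilde{\theta}$ is equally likely to take on any value between zero and $2\pi$, and that $\tilde{\theta}$ never takes on values less than zero or greater than $2\pi$. We next define two random variables $\tilde{u}$ and $\tilde{v}$ such that

\[ \tilde{u} = \sin(\tilde{\theta}) \tag{3.15b} \]

and

\[ \tilde{v} = \cos(\tilde{\theta}). \tag{3.15c} \]

It follows that

\[
\mu_{\tilde{u}} = \mathrm{E}(\tilde{u}) = \mathrm{E}(\sin\tilde{\theta}) = \int_{-\infty}^{\infty} p_{\tilde{\theta}}(\theta) \sin(\theta)\, d\theta = \frac{1}{2\pi} \int_{0}^{2\pi} \sin(\theta)\, d\theta = 0,
\tag{3.15d}
\]

and similar reasoning shows that

\[
\mu_{\tilde{v}} = \mathrm{E}(\tilde{v}) = \frac{1}{2\pi} \int_{0}^{2\pi} \cos(\theta)\, d\theta = 0.
\tag{3.15e}
\]

Note that

\[
\begin{aligned}
\mathrm{E}\big( (\tilde{u} - \mu_{\tilde{u}})(\tilde{v} - \mu_{\tilde{v}}) \big) = \mathrm{E}(\tilde{u}\tilde{v}) &= \mathrm{E}\big( (\sin\tilde{\theta})(\cos\tilde{\theta}) \big) \\
&= \frac{1}{2\pi} \int_{0}^{2\pi} \sin(\theta)\cos(\theta)\, d\theta \\
&= \frac{1}{4\pi} \int_{0}^{2\pi} \sin(2\theta)\, d\theta = 0,
\end{aligned}
\tag{3.15f}
\]

which means that $\tilde{u}$ and $\tilde{v}$ are uncorrelated random variables. On the other hand, we also know that

\[ \tilde{u}^2 + \tilde{v}^2 = \sin^2\tilde{\theta} + \cos^2\tilde{\theta} = 1, \]

which means that whenever $\tilde{u}$ takes on a particular random value, say $1/2$, then $\tilde{v}$ must take on one of the two random values

\[ \pm\sqrt{1 - (1/2)^2} = \pm\sqrt{3}/2. \]

Consequently, $\tilde{u}$ and $\tilde{v}$ are by no means independent random variables even though by definition they are uncorrelated random variables.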
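The distinction between uncorrelated and independent can be illustrated with a short simulation. This is a minimal sketch, not part of the original argument: the function names are hypothetical, and the extra check on the squares of u and v is an added illustration (for the uniform angle of Eq. (3.15a), the covariance of the squares works out to -1/8, which could not happen if u and v were independent).

```python
import math
import random


def sample_u_v(num_samples=100_000, seed=1):
    """Draw theta uniformly on [0, 2*pi) and return u = sin(theta), v = cos(theta)."""
    rng = random.Random(seed)
    thetas = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_samples)]
    return [math.sin(t) for t in thetas], [math.cos(t) for t in thetas]


def sample_covariance(a, b):
    """Sample estimate of E((a - mean_a)(b - mean_b)), as in Eq. (3.14c)."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)


if __name__ == "__main__":
    u, v = sample_u_v()
    print("cov(u, v)     =", sample_covariance(u, v))      # close to zero
    u_sq = [x * x for x in u]
    v_sq = [y * y for y in v]
    print("cov(u^2, v^2) =", sample_covariance(u_sq, v_sq))  # close to -1/8
```

The first covariance is near zero, matching Eq. (3.15f), while the second is clearly nonzero, which is one way of seeing that knowledge of u constrains v.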
3.10 Linearity of the Expectation Operator
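The expectation operator is linear with respect to all random quantities. To see why, we take any two functions $f$ and $g$ whose arguments are the $N$ random variables $\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_N$ and multiply them by two nonrandom variables $\alpha$ and $\beta$. The expectation operator E applied to

\[ \alpha f(\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_N) + \beta g(\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_N) \]

then gives, according to Eq. (3.12a) above,

\[
\begin{aligned}
\mathrm{E}\big( \alpha f(\tilde{x}_1, \ldots, \tilde{x}_N) &+ \beta g(\tilde{x}_1, \ldots, \tilde{x}_N) \big) \\
&= \int \!\!\cdots\! \int dx_1 \cdots dx_N \; [\alpha f(x_1, \ldots, x_N) + \beta g(x_1, \ldots, x_N)]\; p_{\tilde{x}_1 \cdots \tilde{x}_N}(x_1, \ldots, x_N) \\
&= \alpha \int \!\!\cdots\! \int dx_1 \cdots dx_N \; f(x_1, \ldots, x_N)\, p_{\tilde{x}_1 \cdots \tilde{x}_N}(x_1, \ldots, x_N) \\
&\quad + \beta \int \!\!\cdots\! \int dx_1 \cdots dx_N \; g(x_1, \ldots, x_N)\, p_{\tilde{x}_1 \cdots \tilde{x}_N}(x_1, \ldots, x_N) \\
&= \alpha\, \mathrm{E}\big( f(\tilde{x}_1, \ldots, \tilde{x}_N) \big) + \beta\, \mathrm{E}\big( g(\tilde{x}_1, \ldots, \tilde{x}_N) \big).
\end{aligned}
\tag{3.16a}
\]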

Note that in the last step Eq. (3.12a) is applied again to return to the expectation operator. According to Eq. (2.32a) in Chapter 2, the definition of a linear operator L is that

\[ \mathrm{L}(\alpha f + \beta g) = \alpha\, \mathrm{L}(f) + \beta\, \mathrm{L}(g) \tag{3.16b} \]

for any two functions $f$, $g$ and any two constants $\alpha$, $\beta$. When we think of the nonrandom variables $\alpha$ and $\beta$ as constants, we see that Eqs. (3.16a) and (3.16b) provide plenty of justification for calling the expectation operator E a linear operator with respect to all random quantities.

The linearity of E can be used to show that multiplying any random variable $\tilde{x}$ by a nonrandom parameter $\alpha$ results in the mean of $\tilde{x}$ being multiplied by $\alpha$ and the variance of $\tilde{x}$ being multiplied by $\alpha^2$. Starting with Eq. (3.8c), we multiply both sides by $\alpha$ to get

\[ \alpha\, \mathrm{E}(\tilde{x}) = \alpha \mu_{\tilde{x}}. \tag{3.16c} \]

Because E is linear, $\alpha\, \mathrm{E}(\tilde{x}) = \mathrm{E}(\alpha\tilde{x})$, which means that Eq. (3.16c) can be written as

\[ \mathrm{E}(\alpha\tilde{x}) = \alpha \mu_{\tilde{x}}. \tag{3.16d} \]

This shows that multiplying $\tilde{x}$ by $\alpha$ changes its average value from $\mu_{\tilde{x}}$ to $\alpha\mu_{\tilde{x}}$. As for the variance $v_{\tilde{x}}$ of random variable $\tilde{x}$, according to Eq. (3.8e) we have

\[ \mathrm{E}\big( (\tilde{x} - \mu_{\tilde{x}})^2 \big) = v_{\tilde{x}} \tag{3.16e} \]

from the definition of the variance of $\tilde{x}$. Multiplying both sides by $\alpha^2$ gives

\[ \alpha^2\, \mathrm{E}\big( (\tilde{x} - \mu_{\tilde{x}})^2 \big) = \alpha^2 v_{\tilde{x}}. \tag{3.16f} \]

Again the linearity of E lets us write

\[ \alpha^2\, \mathrm{E}\big( (\tilde{x} - \mu_{\tilde{x}})^2 \big) = \mathrm{E}\big( \alpha^2 (\tilde{x} - \mu_{\tilde{x}})^2 \big), \]

and taking $\alpha$ inside the square gives

\[ \mathrm{E}\big( \alpha^2 (\tilde{x} - \mu_{\tilde{x}})^2 \big) = \mathrm{E}\big( (\alpha\tilde{x} - \alpha\mu_{\tilde{x}})^2 \big). \]

This can be substituted into (3.16f) to get

\[ \mathrm{E}\big( (\alpha\tilde{x} - \alpha\mu_{\tilde{x}})^2 \big) = \alpha^2 v_{\tilde{x}}. \tag{3.16g} \]

Since $\alpha\tilde{x}$ is the new random variable which comes from multiplying $\tilde{x}$ by $\alpha$ and [according to Eq. (3.16d)] the quantity $\alpha\mu_{\tilde{x}}$ is the mean of this new random variable, we now realize, consulting the definition of the variance in Eq. (3.8e), that $\mathrm{E}\big( (\alpha\tilde{x} - \alpha\mu_{\tilde{x}})^2 \big)$ must be the variance of the new random variable $\alpha\tilde{x}$. Equation (3.16e) reminds us that $v_{\tilde{x}}$ is the variance of the old random variable $\tilde{x}$. Hence, Eq. (3.16g) states that if $\tilde{x}$ is multiplied by $\alpha$ then its variance must be multiplied by $\alpha^2$.
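The scaling rules in Eqs. (3.16d) and (3.16g) also hold for sample means and sample variances, which makes them easy to check. A minimal sketch follows, assuming an arbitrary Gaussian sample and a scale factor of 3 chosen only for illustration; the helper names are hypothetical.

```python
import random


def mean(values):
    """Sample mean."""
    return sum(values) / len(values)


def variance(values):
    """Sample estimate of E((x - mean)^2), following the definition in Eq. (3.16e)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def scaling_check(alpha=3.0, num_samples=50_000, seed=2):
    """Return mean and variance of x and of alpha*x for one random sample."""
    rng = random.Random(seed)
    xs = [rng.gauss(5.0, 2.0) for _ in range(num_samples)]  # assumed example data
    scaled = [alpha * x for x in xs]
    return mean(xs), variance(xs), mean(scaled), variance(scaled)


if __name__ == "__main__":
    alpha = 3.0
    mean_x, var_x, mean_ax, var_ax = scaling_check(alpha)
    print(mean_ax, alpha * mean_x)       # mean is multiplied by alpha
    print(var_ax, alpha ** 2 * var_x)    # variance is multiplied by alpha squared
```

The two printed pairs agree up to floating-point rounding, because the identities are algebraic and do not depend on the sample size.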
The expectation operator usually can be moved inside an integral over a nonrandom variable. Suppose function $f$ depends on one nonrandom variable $z$ in addition to $N$ random variables $\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_N$. Then, again using Eq. (3.12a), the expectation value of the integral

\[ \int_{z_A}^{z_B} f(z, \tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_N)\, dz \]

is

\[
\mathrm{E}\Big( \int_{z_A}^{z_B} f(z, \tilde{x}_1, \ldots, \tilde{x}_N)\, dz \Big)
= \int \!\!\cdots\! \int dx_1 \cdots dx_N \; p_{\tilde{x}_1 \cdots \tilde{x}_N}(x_1, \ldots, x_N) \int_{z_A}^{z_B} f(z, x_1, \ldots, x_N)\, dz.
\]

As long as we can interchange the order of these integrations (which is almost always allowed when dealing with physically realistic integrals), the expectation value can also be written as

\[
\mathrm{E}\Big( \int_{z_A}^{z_B} f(z, \tilde{x}_1, \ldots, \tilde{x}_N)\, dz \Big)
= \int_{z_A}^{z_B} dz \int \!\!\cdots\! \int dx_1 \cdots dx_N \; p_{\tilde{x}_1 \cdots \tilde{x}_N}(x_1, \ldots, x_N)\, f(z, x_1, \ldots, x_N).
\]

This can, again applying Eq. (3.12a), be written as

\[
\mathrm{E}\Big( \int_{z_A}^{z_B} f(z, \tilde{x}_1, \ldots, \tilde{x}_N)\, dz \Big)
= \int_{z_A}^{z_B} \mathrm{E}\big( f(z, \tilde{x}_1, \ldots, \tilde{x}_N) \big)\, dz.
\tag{3.17a}
\]


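The same reasoning can be extended to $M$ integrals over $M$ nonrandom variables $z_1, z_2, \ldots, z_M$. We have

\[
\begin{aligned}
\mathrm{E}\Big( \int_{z_{1A}}^{z_{1B}}\! dz_1 &\int_{z_{2A}}^{z_{2B}}\! dz_2 \cdots \int_{z_{MA}}^{z_{MB}}\! dz_M \; f(z_1, \ldots, z_M, \tilde{x}_1, \ldots, \tilde{x}_N) \Big) \\
&= \int \!\!\cdots\! \int dx_1 \cdots dx_N \; p_{\tilde{x}_1 \cdots \tilde{x}_N}(x_1, \ldots, x_N) \int_{z_{1A}}^{z_{1B}}\! dz_1 \cdots \int_{z_{MA}}^{z_{MB}}\! dz_M \; f(z_1, \ldots, z_M, x_1, \ldots, x_N) \\
&= \int_{z_{1A}}^{z_{1B}}\! dz_1 \cdots \int_{z_{MA}}^{z_{MB}}\! dz_M \int \!\!\cdots\! \int dx_1 \cdots dx_N \; p_{\tilde{x}_1 \cdots \tilde{x}_N}(x_1, \ldots, x_N)\, f(z_1, \ldots, z_M, x_1, \ldots, x_N),
\end{aligned}
\]

so that

\[
\begin{aligned}
\mathrm{E}\Big( \int_{z_{1A}}^{z_{1B}}\! dz_1 &\int_{z_{2A}}^{z_{2B}}\! dz_2 \cdots \int_{z_{MA}}^{z_{MB}}\! dz_M \; f(z_1, z_2, \ldots, z_M, \tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_N) \Big) \\
&= \int_{z_{1A}}^{z_{1B}}\! dz_1 \int_{z_{2A}}^{z_{2B}}\! dz_2 \cdots \int_{z_{MA}}^{z_{MB}}\! dz_M \; \mathrm{E}\big( f(z_1, z_2, \ldots, z_M, \tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_N) \big).
\end{aligned}
\tag{3.17b}
\]

The expectation operator can even be moved inside the integral of a random function

\[ \tilde{f}(z_1, z_2, \ldots, z_M). \]

According to our definition of a random function in Sec. 3.2 above, we have

\[ \tilde{f}(z_1, z_2, \ldots, z_M) = f(z_1, z_2, \ldots, z_M, \tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_N) \]

for some set of random variables $\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_N$. Hence, we can just suppress the random variables $\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_N$ in Eq. (3.17b) to get

\[
\begin{aligned}
\mathrm{E}\Big( \int_{z_{1A}}^{z_{1B}}\! dz_1 &\int_{z_{2A}}^{z_{2B}}\! dz_2 \cdots \int_{z_{MA}}^{z_{MB}}\! dz_M \; \tilde{f}(z_1, z_2, \ldots, z_M) \Big) \\
&= \int_{z_{1A}}^{z_{1B}}\! dz_1 \int_{z_{2A}}^{z_{2B}}\! dz_2 \cdots \int_{z_{MA}}^{z_{MB}}\! dz_M \; \mathrm{E}\big( \tilde{f}(z_1, z_2, \ldots, z_M) \big).
\end{aligned}
\tag{3.17c}
\]

This result is referred to more than once in the following chapters.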
3.11 The Central Limit Theorem
The central limit theorem states that if there is a random variable $\tilde{s}_N$ equal to the sum of $N$ independent random variables $\tilde{r}_1, \tilde{r}_2, \ldots, \tilde{r}_N$, then

\[ \tilde{s}_N = \tilde{r}_1 + \tilde{r}_2 + \cdots + \tilde{r}_N \tag{3.18a} \]

has a probability density distribution $p_{\tilde{s}_N}(s_N)$ that resembles a Gaussian or normal probability density distribution more and more as $N$ gets large,

\[
p_{\tilde{s}_N}(s_N) \cong \frac{1}{\sigma_{\tilde{s}_N} \sqrt{2\pi}} \exp\!\Big( -\frac{(s_N - \mu_{\tilde{s}_N})^2}{2\sigma_{\tilde{s}_N}^2} \Big).
\tag{3.18b}
\]

In Eq. (3.18b), $\mu_{\tilde{s}_N}$ is the mean or average value of $\tilde{s}_N$ and $\sigma_{\tilde{s}_N}$ is the standard deviation of $\tilde{s}_N$ about its mean. Figure 3.1 is a plot of the Gaussian distribution specified on the right-hand side of (3.18b). For large but finite values of $N$, this Gaussian distribution tends to be a relatively good approximation of $p_{\tilde{s}_N}(s_N)$ for $s_N$ values near the peak in Fig. 3.1 and a not-so-good approximation of $p_{\tilde{s}_N}(s_N)$ for $s_N$ values in the tails of Fig. 3.1, that is, for $s_N$ values far from the peak.
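The mean of $\tilde{s}_N$ comes from applying the expectation operator E to both sides of Eq. (3.18a). Remembering that E is linear with respect to random quantities [see Eq. (3.16a) above], we get

\[ \mathrm{E}(\tilde{s}_N) = \mathrm{E}(\tilde{r}_1 + \tilde{r}_2 + \cdots + \tilde{r}_N) = \mathrm{E}(\tilde{r}_1) + \mathrm{E}(\tilde{r}_2) + \cdots + \mathrm{E}(\tilde{r}_N), \]

which becomes, applying Eq. (3.8c) above,

\[ \mu_{\tilde{s}_N} = \mu_{\tilde{r}_1} + \mu_{\tilde{r}_2} + \cdots + \mu_{\tilde{r}_N}. \tag{3.19a} \]

FIGURE 3.1. [Plot of the Gaussian distribution on the right-hand side of Eq. (3.18b): $p_{\tilde{s}_N}(s_N)$ versus $s_N$.]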

The variance of $\tilde{s}_N$ is, according to Eq. (3.8e),

\[ v_{\tilde{s}_N} = \mathrm{E}\big( (\tilde{s}_N - \mu_{\tilde{s}_N})^2 \big), \]

which becomes, after substituting from Eqs. (3.18a) and (3.19a),

\[
v_{\tilde{s}_N} = \mathrm{E}\Big( \Big[ \sum_{j=1}^{N} \tilde{r}_j - \sum_{j=1}^{N} \mu_{\tilde{r}_j} \Big]^2 \Big)
= \mathrm{E}\Big( \Big[ \sum_{j=1}^{N} (\tilde{r}_j - \mu_{\tilde{r}_j}) \Big]^2 \Big).
\]

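Expanding the square inside the expectation operator gives

\[
v_{\tilde{s}_N} = \mathrm{E}\Big( \sum_{j=1}^{N} (\tilde{r}_j - \mu_{\tilde{r}_j})^2 + \sum_{j=1}^{N} \sum_{\substack{k=1 \\ k \ne j}}^{N} [(\tilde{r}_j - \mu_{\tilde{r}_j})(\tilde{r}_k - \mu_{\tilde{r}_k})] \Big),
\]

and the linearity of the expectation operator with respect to random quantities then lets us write this as

\[
v_{\tilde{s}_N} = \sum_{j=1}^{N} \mathrm{E}\big( (\tilde{r}_j - \mu_{\tilde{r}_j})^2 \big) + \sum_{j=1}^{N} \sum_{\substack{k=1 \\ k \ne j}}^{N} \mathrm{E}\big( (\tilde{r}_j - \mu_{\tilde{r}_j})(\tilde{r}_k - \mu_{\tilde{r}_k}) \big).
\tag{3.19b}
\]

Since $\tilde{r}_1, \tilde{r}_2, \ldots, \tilde{r}_N$ are independent random quantities, so must the random quantities $\tilde{r}_1 - \mu_{\tilde{r}_1}$, $\tilde{r}_2 - \mu_{\tilde{r}_2}$, ..., $\tilde{r}_N - \mu_{\tilde{r}_N}$ also be independent. Hence, according to Eq. (3.11b), we see that when $j \ne k$

\[
\mathrm{E}\big( (\tilde{r}_j - \mu_{\tilde{r}_j})(\tilde{r}_k - \mu_{\tilde{r}_k}) \big) = \mathrm{E}(\tilde{r}_j - \mu_{\tilde{r}_j})\, \mathrm{E}(\tilde{r}_k - \mu_{\tilde{r}_k}).
\tag{3.19c}
\]

But, applying the linearity of the expectation operator and Eqs. (3.8c) and (3.9f), we have

\[ \mathrm{E}(\tilde{r}_j - \mu_{\tilde{r}_j}) = \mathrm{E}(\tilde{r}_j) - \mathrm{E}(\mu_{\tilde{r}_j}) = \mu_{\tilde{r}_j} - \mu_{\tilde{r}_j} = 0. \]

Consequently, Eq. (3.19c) becomes

\[ \mathrm{E}\big( (\tilde{r}_j - \mu_{\tilde{r}_j})(\tilde{r}_k - \mu_{\tilde{r}_k}) \big) = 0 \tag{3.19d} \]

when $j \ne k$. Substituting this into (3.19b) gives

\[ v_{\tilde{s}_N} = \sum_{j=1}^{N} \mathrm{E}\big( (\tilde{r}_j - \mu_{\tilde{r}_j})^2 \big), \]

which becomes, after applying Eq. (3.8e),

\[ v_{\tilde{s}_N} = v_{\tilde{r}_1} + v_{\tilde{r}_2} + \cdots + v_{\tilde{r}_N}, \tag{3.19e} \]

where

\[ v_{\tilde{r}_j} = \mathrm{E}\big( (\tilde{r}_j - \mu_{\tilde{r}_j})^2 \big) \tag{3.19f} \]

is the variance of $\tilde{r}_j$ for $j = 1, 2, \ldots, N$. The standard deviation of a random quantity is the square root of its variance [see Eq. (3.5c)], so formulas (3.19e) and (3.19f) can also be written as

\[ \sigma_{\tilde{s}_N}^2 = \sigma_{\tilde{r}_1}^2 + \sigma_{\tilde{r}_2}^2 + \cdots + \sigma_{\tilde{r}_N}^2, \tag{3.19g} \]

where

\[ \sigma_{\tilde{r}_j} = \sqrt{ \mathrm{E}\big( (\tilde{r}_j - \mu_{\tilde{r}_j})^2 \big) } \tag{3.19h} \]

is the standard deviation of $\tilde{r}_j$ for $j = 1, 2, \ldots, N$ and $\sigma_{\tilde{s}_N}$ is the standard deviation of $\tilde{s}_N$.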
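The additivity of variances in Eq. (3.19e) does not require the summed variables to share a distribution, only that they be independent. The following is a minimal sketch under assumed example distributions (uniform, exponential, and Gaussian, none of which come from the text); the function names are hypothetical.

```python
import random


def variance(values):
    """Sample estimate of E((r - mean)^2), as in Eq. (3.19f)."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def variance_of_sum_check(num_samples=200_000, seed=3):
    """Return (variance of r1 + r2 + r3, sum of the three separate variances)."""
    rng = random.Random(seed)
    r1 = [rng.uniform(0.0, 1.0) for _ in range(num_samples)]   # variance 1/12
    r2 = [rng.expovariate(2.0) for _ in range(num_samples)]    # variance 1/4
    r3 = [rng.gauss(0.0, 0.5) for _ in range(num_samples)]     # variance 1/4
    sums = [a + b + c for a, b, c in zip(r1, r2, r3)]
    return variance(sums), variance(r1) + variance(r2) + variance(r3)


if __name__ == "__main__":
    v_sum, v_parts = variance_of_sum_check()
    print(f"variance of the sum: {v_sum:.4f}   sum of variances: {v_parts:.4f}")
```

The two numbers differ only by the sampled cross terms of Eq. (3.19b), which Eq. (3.19d) says have zero expectation.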
Returning to the approximation in Eq. (3.18b) used to explain the central limit theorem, we notice that some care must be exercised in interpreting the limit as $N \to \infty$; in particular, it is clear from Eqs. (3.19a) and (3.19g) that there is a tendency for both $\mu_{\tilde{s}_N}$ and $\sigma_{\tilde{s}_N}$ to become large without limit as $N$ increases, making the expression on the right-hand side of (3.18b) difficult to interpret in the limit of large $N$. The central limit theorem can be written in terms of a mathematically well-defined limit as $N \to \infty$ if we are careful how the arguments of the Gaussian or normal distribution are defined. To state the central limit theorem precisely, we define a new random variable

\[ \tilde{z}_N = \frac{\tilde{s}_N - \mu_{\tilde{s}_N}}{\sigma_{\tilde{s}_N}} \tag{3.20a} \]

that has a probability density distribution $p_{\tilde{z}_N}(z_N)$. Now we can present the central limit theorem exactly by stating that

\[ \lim_{N \to \infty} p_{\tilde{z}_N}(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}. \tag{3.20b} \]

The right-hand side of (3.20b) is the Gaussian or normal distribution introduced above in Eq. (3.6a) where the random variable has a mean of zero and a standard deviation of one. For any large but finite value of $N$, we can recover the approximation in (3.18b) by assuming that $p_{\tilde{z}_N}$ is near its limit and then replacing $z$ in (3.20b) by $z_N$ as defined in (3.20a). [The extra factor of $\sigma_{\tilde{s}_N}$ multiplying the $\sqrt{2\pi}$ on the right-hand side of (3.18b) can be regarded as coming from Eq. (3.4) above; if it isn't there, then the integral of the probability density distribution between $-\infty$ and $+\infty$ does not equal one.]
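The standardized variable of Eq. (3.20a) can be simulated to see the limit in Eq. (3.20b) emerging. This is a minimal sketch, assuming a sum of 30 independent uniform variables on (0, 1) as the example (the choice of distribution, the value of N, and the function names are illustrative assumptions). It estimates the probability that the standardized sum lies within one standard deviation of zero and compares it with the unit-Gaussian value erf(1/sqrt(2)).

```python
import math
import random


def standardized_sum(rng, n_terms):
    """z_N of Eq. (3.20a) for a sum of n_terms independent uniform(0, 1) variables."""
    s = sum(rng.random() for _ in range(n_terms))
    mean_s = n_terms * 0.5                  # Eq. (3.19a): the means add
    sigma_s = math.sqrt(n_terms / 12.0)     # Eq. (3.19g): the variances (1/12 each) add
    return (s - mean_s) / sigma_s


def fraction_within_one_sigma(n_terms=30, num_trials=20_000, seed=4):
    """Fraction of simulated z_N values with |z_N| < 1."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(num_trials)
               if abs(standardized_sum(rng, n_terms)) < 1.0)
    return hits / num_trials


if __name__ == "__main__":
    gaussian_value = math.erf(1.0 / math.sqrt(2.0))  # P(|z| < 1) for a unit Gaussian
    print(fraction_within_one_sigma(), gaussian_value)
```

The agreement is expected to be good here because the interval sits near the peak of the distribution; probabilities far out in the tails converge more slowly, as the text notes for Fig. 3.1.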
3.12 Averaging to Improve Experimental Accuracy
It is now easy to explain why averaging together many identical but independent measurements from the same experiment improves the accuracy of the result. Suppose $N$ independent measurements are to be averaged together this way. We can say that each measurement is an independent random number $\tilde{r}_j$ for $j = 1, 2, \ldots, N$ having the same mean value $\mu$, with $\mu$ taken to be the true value of the experimental quantity being measured. Since the measurements are all identical, all the $\tilde{r}_j$ have the same standard deviation $\sigma$ due to the same sorts of random errors occurring in each independent measurement. When all the experimental results are averaged, we create a new random number, namely the sum of all the $\tilde{r}_j$ divided by $N$. Let's call this new random number $\tilde{a}_N$. The work done in the previous section lets us write this as [see Eq. (3.18a)]

\[ \tilde{a}_N = \frac{\tilde{s}_N}{N}. \tag{3.21a} \]

Applying the expectation operator E to both sides gives, using the linearity of the expectation operator (see Sec. 3.10 above),

\[ \mathrm{E}(\tilde{a}_N) = \frac{1}{N}\, \mathrm{E}(\tilde{s}_N). \tag{3.21b} \]

Because $\mathrm{E}(\tilde{s}_N) = \mu_{\tilde{s}_N}$ and all the $\tilde{r}_j$ have the same mean value $\mu$, Eq. (3.19a) shows that

\[ \mathrm{E}(\tilde{s}_N) = \mu_{\tilde{r}_1} + \mu_{\tilde{r}_2} + \cdots + \mu_{\tilde{r}_N} = N\mu. \tag{3.21c} \]

Hence, Eq. (3.21b) now becomes

\[ \mathrm{E}(\tilde{a}_N) = \frac{1}{N}(N\mu) = \mu. \tag{3.21d} \]

Equation (3.21d) states that the expected value of the experimental average $\tilde{a}_N$ is $\mu$, the true value of the experimental quantity being measured. This is no great surprise, because the averaging process would not make sense unless it were true. The typical size of the error left after the $\tilde{r}_j$ are averaged together (that is, the amount by which $\tilde{a}_N$ is likely to be different from its average value) is just its standard deviation [see Eqs. (3.5c) and (3.8e) above],

\[ \sigma_{\tilde{a}_N} = \sqrt{ \mathrm{E}\big( (\tilde{a}_N - \mu)^2 \big) }, \]

which can also be written as, after substituting from Eq. (3.21a) and using the linearity of the expectation operator,

\[
\sigma_{\tilde{a}_N} = \sqrt{ \mathrm{E}\Big( \Big( \frac{\tilde{s}_N}{N} - \mu \Big)^2 \Big) } = \frac{1}{N} \sqrt{ \mathrm{E}\big( (\tilde{s}_N - N\mu)^2 \big) }.
\tag{3.21e}
\]

According to (3.21c), $N\mu$ is the mean value of $\tilde{s}_N$, which makes

\[ \mathrm{E}\big( (\tilde{s}_N - N\mu)^2 \big) \]

the variance $v_{\tilde{s}_N}$ of $\tilde{s}_N$ [see Eq. (3.8e) above]. Hence, (3.21e) can be written as

\[ \sigma_{\tilde{a}_N} = \frac{1}{N} \sqrt{v_{\tilde{s}_N}} = \frac{1}{N} \sqrt{\sigma_{\tilde{s}_N}^2} \]

because the variance is the square of the standard deviation $\sigma_{\tilde{s}_N}$. Substituting from (3.19g) now gives

\[ \sigma_{\tilde{a}_N} = \frac{1}{N} \sqrt{v_{\tilde{s}_N}} = \frac{1}{N} \sqrt{ \sigma_{\tilde{r}_1}^2 + \sigma_{\tilde{r}_2}^2 + \cdots + \sigma_{\tilde{r}_N}^2 }. \]

As already mentioned above, we can assume that all the $\tilde{r}_j$ have the same standard deviation $\sigma$. Hence,

\[ \sigma_{\tilde{a}_N} = \frac{1}{N} \sqrt{N\sigma^2} = \frac{\sigma}{\sqrt{N}}. \tag{3.21f} \]

This shows that when the standard deviation or expected error in one measurement is $\sigma$, then the standard deviation or expected error in the average $\tilde{a}_N$ of $N$ identical but independent measurements is $\sigma/\sqrt{N}$, a significantly smaller number. Although we use several formulas from the previous section on the central limit theorem to get this result, there is no assumption here that the $\tilde{r}_j$ obey any particular probability density distribution. In order to derive Eqs. (3.21d) and (3.21f), all that is needed is that the $\tilde{r}_j$ are independent and that the probability density distributions of the $\tilde{r}_j$ have the same mean and standard deviation.
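The reduction of the expected error by the square root of N in Eq. (3.21f) can be seen in a simulated experiment. This is a minimal sketch, assuming Gaussian measurement errors with a true value of 10 and a single-measurement standard deviation of 2 (these numbers, the Gaussian choice, and the function name are illustrative assumptions; the derivation itself needs no particular distribution).

```python
import math
import random


def spread_of_averages(n_per_average, mu=10.0, sigma=2.0,
                       num_experiments=20_000, seed=5):
    """Repeat the N-measurement average many times; return its mean and std."""
    rng = random.Random(seed)
    averages = []
    for _ in range(num_experiments):
        s_n = sum(rng.gauss(mu, sigma) for _ in range(n_per_average))
        averages.append(s_n / n_per_average)      # a_N = s_N / N, Eq. (3.21a)
    mean_a = sum(averages) / len(averages)
    sigma_a = math.sqrt(sum((a - mean_a) ** 2 for a in averages) / len(averages))
    return mean_a, sigma_a


if __name__ == "__main__":
    for n in (1, 4, 25):
        mean_a, sigma_a = spread_of_averages(n)
        print(f"N={n:2d}  mean={mean_a:.3f}  std={sigma_a:.3f}  "
              f"sigma/sqrt(N)={2.0 / math.sqrt(n):.3f}")
```

For each N the measured spread of the averages should track sigma divided by the square root of N, while the mean of the averages stays at the assumed true value.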
When spectrometers are used to make independent measurements of the same radiance spectra, we can extend the above analysis to the spectral measurements by regarding the independent but identical random variables $\tilde{r}_j$ as random functions of the spectral wavelength or frequency, with different values of index $j$ now representing different spectral curves from independent spectral measurements. We can now repeat all the algebraic manipulations used in (3.21a) through (3.21f) above while regarding every quantity except $N$ as a function of the spectral wavelength or frequency and end up with the same results. If, for example, the quantities are regarded as functions of the spectral wavelength $\lambda$, then we just need to visualize a $(\lambda)$ immediately following the relevant variables. In a sense, all that is happening is that we have decided to repeat the algebra of Eqs. (3.21a) through (3.21f) at each spectral wavelength. Equation (3.21d), for example, becomes

\[ \mathrm{E}\big( \tilde{a}_N(\lambda) \big) = \mu(\lambda), \tag{3.22a} \]

showing that the point-by-point average of the $\tilde{r}_j(\lambda)$ spectral curves creates another curve $\tilde{a}_N(\lambda)$ whose expected value is the true spectrum $\mu(\lambda)$. The average spectrum $\tilde{a}_N(\lambda)$ is allowed to have a different expected value $\mu(\lambda)$ at each wavelength because it is now, of course, taken to be a function of $\lambda$. Similarly Eq. (3.21f) becomes

\[ \sigma_{\tilde{a}_N}(\lambda) = \frac{\sigma(\lambda)}{\sqrt{N}}. \tag{3.22b} \]

This shows that the expected error $\sigma_{\tilde{a}_N}(\lambda)$ at wavelength $\lambda$ of the average spectrum $\tilde{a}_N(\lambda)$ is smaller by a factor of $\sqrt{N}$ than the expected error $\sigma(\lambda)$ at wavelength $\lambda$ of a single spectral measurement. The expected error $\sigma(\lambda)$, just like the average $\mu(\lambda)$, is allowed to be different at different wavelengths. As long as the expected value $\mu(\lambda)$ of $\tilde{a}_N(\lambda)$ is the true spectral curve, Eq. (3.22b) shows that we can approach this true spectrum as closely as we desire (that is, make the error in our point-by-point average spectrum arbitrarily small) by making $N$ as large as necessary.
3.13 Mean, Autocorrelation, Autocovariance of Random Functions of Time
Using the same notation as in the discussion following Eq. (3.2a) above, we write $\tilde{n}(t)$ to represent a random function of a nonrandom time $t$. As we already mentioned at the end of Sec. 3.2, $\tilde{n}(t)$ is often called a random or stochastic process. Having specified a random function (or stochastic process or random process) called $\tilde{n}(t)$, we know that for each time $t$ there is a random variable $\tilde{n}(t)$; and when there are two different time values $t_1$ and $t_2$ with $t_1 \ne t_2$, there is no reason to expect the random variables $\tilde{n}(t_1)$ and $\tilde{n}(t_2)$ to behave the same way.

We also know the behavior of random variables can be described by probability density distributions. Associated with any $N$ sequential random variables $\tilde{n}(t_1), \tilde{n}(t_2), \ldots, \tilde{n}(t_N)$ specified by the time values $t_1 < t_2 < \cdots < t_N$ there is a probability density distribution

\[ p_{\tilde{n}(t_1)\tilde{n}(t_2)\cdots\tilde{n}(t_N)}(n_1, n_2, \ldots, n_N), \]

such that

\[ p_{\tilde{n}(t_1)\tilde{n}(t_2)\cdots\tilde{n}(t_N)}(n_1, n_2, \ldots, n_N)\, dn_1\, dn_2 \cdots dn_N \]

is the probability first that $\tilde{n}(t_1)$ takes on a value between $n_1$ and $n_1 + dn_1$, and then that $\tilde{n}(t_2)$ takes on a value between $n_2$ and $n_2 + dn_2$, and then that $\tilde{n}(t_3)$ takes on a value between $n_3$ and $n_3 + dn_3$, ..., and then that $\tilde{n}(t_N)$ takes on a value between $n_N$ and $n_N + dn_N$. The expectation operator E has the same meaning as before: the expected or mean value of any function $f$ of the $N$ random variables $\tilde{n}(t_1), \tilde{n}(t_2), \ldots, \tilde{n}(t_N)$ is

\[
\mathrm{E}\big( f(\tilde{n}(t_1), \tilde{n}(t_2), \ldots, \tilde{n}(t_N)) \big)
= \int \!\!\cdots\! \int dn_1\, dn_2 \cdots dn_N \; f(n_1, n_2, \ldots, n_N)\, p_{\tilde{n}(t_1)\tilde{n}(t_2)\cdots\tilde{n}(t_N)}(n_1, n_2, \ldots, n_N).
\tag{3.23a}
\]

One of the most important expectation values associated with $\tilde{n}$ occurs when we set $N = 2$ and specify that

\[ f(\tilde{n}(t_1), \tilde{n}(t_2), \ldots, \tilde{n}(t_N)) = \tilde{n}(t_1)\, \tilde{n}(t_2) \]

to get the autocorrelation function

\[
R_{\tilde{n}\tilde{n}}(t_1, t_2) = \mathrm{E}\big( \tilde{n}(t_1)\, \tilde{n}(t_2) \big) = \int\!\!\int dn_1\, dn_2 \; [n_1 n_2]\; p_{\tilde{n}(t_1)\tilde{n}(t_2)}(n_1, n_2).
\tag{3.23b}
\]

Other important expectation values are the mean of $\tilde{n}$ as a function of time,

\[ \mu_{\tilde{n}(t)} = \mathrm{E}\big( \tilde{n}(t) \big) = \int n \, p_{\tilde{n}(t)}(n)\, dn, \tag{3.23c} \]

and the autocovariance of $\tilde{n}$,

\[
\begin{aligned}
C_{\tilde{n}\tilde{n}}(t_1, t_2) &= \mathrm{E}\big( (\tilde{n}(t_1) - \mu_{\tilde{n}(t_1)})(\tilde{n}(t_2) - \mu_{\tilde{n}(t_2)}) \big) \\
&= \int\!\!\int dn_1\, dn_2 \; (n_1 - \mu_{\tilde{n}(t_1)})(n_2 - \mu_{\tilde{n}(t_2)})\; p_{\tilde{n}(t_1)\tilde{n}(t_2)}(n_1, n_2).
\end{aligned}
\tag{3.23d}
\]

Clearly, when $\mu_{\tilde{n}(t)} = 0$ for all $t$, we have

\[ R_{\tilde{n}\tilde{n}}(t_1, t_2) = C_{\tilde{n}\tilde{n}}(t_1, t_2). \tag{3.23e} \]

Almost always, the random functions used to represent noise in a physical system are specified in such a way that $\mu_{\tilde{n}(t)} = 0$, which means the distinction between the autocorrelation function and the autocovariance function becomes irrelevant.
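The quantities in Eqs. (3.23b) through (3.23d) can be estimated by averaging over many randomly drawn realizations of the random function. The sketch below uses a purely hypothetical ensemble, n(t) = offset + a t with a drawn from a unit Gaussian, chosen only because its autocorrelation (offset squared plus t1 t2) and autocovariance (t1 t2) are easy to work out by hand; the function names are assumptions as well.

```python
import random


def draw_realization(rng, offset):
    """Hypothetical ensemble member n(t) = offset + a*t, with a ~ unit Gaussian."""
    a = rng.gauss(0.0, 1.0)
    return lambda t: offset + a * t


def ensemble_statistics(t1, t2, offset=2.0, num_members=200_000, seed=6):
    """Estimate mean(t1), mean(t2), R(t1, t2), and C(t1, t2) from the ensemble."""
    rng = random.Random(seed)
    n1, n2 = [], []
    for _ in range(num_members):
        n = draw_realization(rng, offset)
        n1.append(n(t1))
        n2.append(n(t2))
    mu1 = sum(n1) / num_members                                     # Eq. (3.23c) at t1
    mu2 = sum(n2) / num_members                                     # Eq. (3.23c) at t2
    r = sum(x * y for x, y in zip(n1, n2)) / num_members            # Eq. (3.23b)
    c = sum((x - mu1) * (y - mu2) for x, y in zip(n1, n2)) / num_members  # Eq. (3.23d)
    return mu1, mu2, r, c


if __name__ == "__main__":
    print(ensemble_statistics(1.0, 3.0, offset=2.0))  # R near 7, C near 3
    print(ensemble_statistics(1.0, 3.0, offset=0.0))  # zero mean: R and C nearly equal
```

With a nonzero offset the two functions differ by the product of the means; with the offset set to zero they coincide up to sampling error, which is the content of Eq. (3.23e).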
3.14 Ensembles
Just as random variables are often regarded as taking on one or another specific value chosen randomly from some collection of allowed nonrandom values, so too do we often think of random functions as becoming one or another specific, nonrandom function chosen randomly from a collection, or ensemble, of allowed nonrandom functions. We can visualize this situation by imagining an infinitely long row of biased and crooked slot machines, one for every value of $t$ on the time axis.[27] The slot machines do not necessarily behave identically and they are wired together so that they can influence each other. When a slot machine's lever is pulled, there is never any jackpot; all that happens is that another number appears inside its window. Each time we simultaneously pull all the levers of the slot machines, we randomly choose another member of the ensemble of allowed functions. The probability $p_{\tilde{n}(t)}(n)\,dn$ that random variable $\tilde{n}(t)$ takes on a value between $n$ and $n + dn$ is just the probability that the slot machine at $t$ takes on a value between $n$ and $n + dn$, and it is also the probability that some member function randomly chosen from the ensemble of allowed functions has a value between $n$ and $n + dn$ at time $t$. In fact, we can say that

\[ p_{\tilde{n}(t_1)\tilde{n}(t_2)\cdots\tilde{n}(t_N)}(n_1, n_2, \ldots, n_N)\, dn_1\, dn_2 \cdots dn_N \]

is the probability, after the slot machine levers are pulled, that the slot machine at $t_1$ has a value between $n_1$ and $n_1 + dn_1$, that the slot machine at $t_2$ has a value between $n_2$ and $n_2 + dn_2$, ..., and that the slot machine at $t_N$ has a value between $n_N$ and $n_N + dn_N$. It can also, of course, be thought of as the probability that a member function randomly chosen from the ensemble of allowed functions has values at times $t_1 < t_2 < \cdots < t_N$ that lie between $n_1$ and $n_1 + dn_1$, $n_2$ and $n_2 + dn_2$, ..., $n_N$ and $n_N + dn_N$ respectively.

[27] An objection that could be raised here is that an infinite number of slot machines is only what is called countably infinite whereas the number of points on the time axis is uncountably infinite, a much larger type of infinity. For our purposes, the distinction between these two types of infinity is not important.
3.15 Stationary Random Functions
A random function $\tilde{n}(t)$ is strictly stationary,[28] or strict-sense stationary,[29] if all its statistical properties are unaffected when the origin of its time axis is changed (that is, when we change the point at which $t = 0$). Mathematically we require, for any $t_1 < t_2 < \cdots < t_N$, that the probability density distribution

\[
p_{\tilde{n}(t_1)\tilde{n}(t_2)\cdots\tilde{n}(t_N)}(n_1, n_2, \ldots, n_N) = p_{\tilde{n}(t_1+\tau)\tilde{n}(t_2+\tau)\cdots\tilde{n}(t_N+\tau)}(n_1, n_2, \ldots, n_N)
\tag{3.24a}
\]

for any value of $\tau$ and all $N = 1, 2, \ldots$. Thus, for any integrable function $f$ with $N$ arguments,

\[
\begin{aligned}
\int \!\!\cdots\! \int dn_1 \cdots dn_N \; & f(n_1, n_2, \ldots, n_N)\, p_{\tilde{n}(t_1)\tilde{n}(t_2)\cdots\tilde{n}(t_N)}(n_1, n_2, \ldots, n_N) \\
&= \int \!\!\cdots\! \int dn_1 \cdots dn_N \; f(n_1, n_2, \ldots, n_N)\, p_{\tilde{n}(t_1+\tau)\tilde{n}(t_2+\tau)\cdots\tilde{n}(t_N+\tau)}(n_1, n_2, \ldots, n_N),
\end{aligned}
\tag{3.24b}
\]

where $t_1 < t_2 < \cdots < t_N$ and $N = 1, 2, \ldots$. This means that, according to Eq. (3.23a),

\[
\mathrm{E}\big( f(\tilde{n}(t_1), \tilde{n}(t_2), \ldots, \tilde{n}(t_N)) \big) = \mathrm{E}\big( f(\tilde{n}(t_1+\tau), \tilde{n}(t_2+\tau), \ldots, \tilde{n}(t_N+\tau)) \big)
\tag{3.24c}
\]

for any integrable function $f$, any value of $\tau$, and $N = 1, 2, \ldots$. We note that when Eq. (3.24c) holds true,

\[ \mathrm{E}\big( f(\tilde{n}(t_1), \tilde{n}(t_2), \ldots, \tilde{n}(t_N)) \big) \]

cannot depend on all the $N$ independent time values $t_1, t_2, \ldots, t_N$ as we might at first suppose. To see why this is so, we just set $\tau = -t_1$ in (3.24c) to get
[28] Paul H. Wirsching, Thomas L. Paez, and Keith Ortiz, Random Vibrations: Theory and Practice (John Wiley and Sons, Inc., New York, 1995), p. 80.
[29]

\[
\mathrm{E}\big( f(\tilde{n}(t_1), \tilde{n}(t_2), \ldots, \tilde{n}(t_N)) \big)
= \mathrm{E}\big( f(\tilde{n}(0), \tilde{n}(t_2 - t_1), \tilde{n}(t_3 - t_1), \ldots, \tilde{n}(t_N - t_1)) \big).
\tag{3.24d}
\]

This shows that

\[ \mathrm{E}\big( f(\tilde{n}(t_1), \tilde{n}(t_2), \ldots, \tilde{n}(t_N)) \big) \]

must be a function of just the nonrandom time parameters $(t_2 - t_1), (t_3 - t_1), \ldots, (t_N - t_1)$ and there are, of course, only $N - 1$ of these.
Equations (3.24b) through (3.24d) can be understood in terms of the following thought experiment. We randomly pick some function from the ensemble of allowed functions and choose $N$ time values $t_1 < t_2 < \cdots < t_N$. The randomly picked function has values $n_1, n_2, \ldots, n_N$ at times $t_1, t_2, \ldots, t_N$ respectively. Next, we create some nonrandom function $f$ that has $N$ arguments and is not one of those physically unreasonable abstractions that mathematicians specialize in. We calculate and store the value of $f(n_1, n_2, \ldots, n_N)$. Randomly choosing another function from the ensemble of allowed functions for $\tilde{n}(t)$, we again use $n_1, n_2, \ldots, n_N$ at $t_1, t_2, \ldots, t_N$ to calculate and store a new value of $f(n_1, n_2, \ldots, n_N)$. Repeating this procedure enough times to get a large collection of $f$ values, we average them all together to get a good estimate of

\[ \mathrm{E}\big( f(\tilde{n}(t_1), \tilde{n}(t_2), \ldots, \tilde{n}(t_N)) \big). \]

Shifting to a new set of time values $t_1 + \tau, t_2 + \tau, \ldots, t_N + \tau$, we again generate another large collection of $f$ values, this time averaging them together to get a good estimate of

\[ \mathrm{E}\big( f(\tilde{n}(t_1+\tau), \tilde{n}(t_2+\tau), \ldots, \tilde{n}(t_N+\tau)) \big). \]

Since $\tilde{n}$ is strict-sense stationary, we know that no matter what the positive integer $N$ is, and no matter what the function $f$ is, and no matter what the value of $\tau$ is, both collections of $f$ values always have approximately the same average, with the difference between the averages becoming less and less as the collections of $f$ values get larger and larger.
To give an example of a random function $\tilde{n}(t)$ that is strict-sense stationary, we define

\[ \tilde{n}(t) = \tilde{a}\cos(\omega t) + \tilde{b}\sin(\omega t), \tag{3.25a} \]

where $\tilde{a}$ and $\tilde{b}$ obey a probability density distribution $p_{\tilde{a}\tilde{b}}(a, b)$ such that $p_{\tilde{a}\tilde{b}}(a, b)\, da\, db$ is the probability that $\tilde{a}$ takes on a value between $a$ and $a + da$ when $\tilde{b}$ takes on a value between $b$ and $b + db$. We can also, just as correctly, say that $p_{\tilde{a}\tilde{b}}(a, b)\, da\, db$ is the probability that $\tilde{b}$ takes on a value between $b$ and $b + db$ when $\tilde{a}$ takes on a value between $a$ and $a + da$. We next require

\[ p_{\tilde{a}\tilde{b}}(a, b) = p_{\tilde{a}\tilde{b}}\big( \sqrt{a^2 + b^2} \big). \tag{3.25b} \]

Equation (3.25b) says that $p_{\tilde{a}\tilde{b}}(a, b)$ is circularly symmetric because it depends on $a$ and $b$ only through $\sqrt{a^2 + b^2}$, the radius length of a point whose $x$ and $y$ coordinates are $a$, $b$. Returning to the slot-machine model for $\tilde{n}(t)$ explained in Sec. 3.14, we note that randomly choosing values for $\tilde{a}$ and $\tilde{b}$ is the same as simultaneously pulling the levers of all the slot machines representing $\tilde{n}(t)$ in Eq. (3.25a). Having pulled the levers and gotten, say, values $a_1$ for $\tilde{a}$ and $b_1$ for $\tilde{b}$, we then know that the number in the window of the slot machine located at time value $t_1$ is

\[ a_1 \cos(\omega t_1) + b_1 \sin(\omega t_1), \]

we know that the number in the window of the slot machine located at time value $t_2$ is

\[ a_1 \cos(\omega t_2) + b_1 \sin(\omega t_2), \]

and so on. If we pull all the levers again and get values $a_2$ for $\tilde{a}$ and $b_2$ for $\tilde{b}$, then we know that the slot machine at $t_1$ has a number

\[ a_2 \cos(\omega t_1) + b_2 \sin(\omega t_1), \]

we know the slot machine at $t_2$ has a number

\[ a_2 \cos(\omega t_2) + b_2 \sin(\omega t_2), \]

and so on. Because the probability density distribution $p_{\tilde{a}\tilde{b}}(a, b)$ completely determines the statistics of random variables $\tilde{a}$ and $\tilde{b}$, we see that it must also completely determine the statistics of $\tilde{n}(t)$ in Eq. (3.25a).
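The slot-machine picture for Eq. (3.25a) translates directly into a simulation: each pull of the levers is one draw of the pair (a, b). The sketch below is only an illustration under stated assumptions. It takes a and b to be independent unit Gaussians, which is one particular circularly symmetric choice for the joint density (the text does not single out any), fixes an arbitrary value for the angular frequency, and uses a hypothetical function name. It then estimates the two-time average E(n(t1) n(t2)) before and after shifting both times by the same amount; for this assumed density both estimates should be close to cos(omega (t2 - t1)).

```python
import math
import random


def two_time_average(t1, t2, omega=2.0, num_members=200_000, seed=7):
    """Ensemble estimate of E(n(t1) n(t2)) for n(t) = a cos(omega t) + b sin(omega t)."""
    rng = random.Random(seed)
    c1, s1 = math.cos(omega * t1), math.sin(omega * t1)
    c2, s2 = math.cos(omega * t2), math.sin(omega * t2)
    total = 0.0
    for _ in range(num_members):
        # One "pull of all the levers": independent unit Gaussians give a joint
        # density proportional to exp(-(a^2 + b^2) / 2), which depends on a and b
        # only through a^2 + b^2 (an assumed example of Eq. (3.25b)).
        a = rng.gauss(0.0, 1.0)
        b = rng.gauss(0.0, 1.0)
        total += (a * c1 + b * s1) * (a * c2 + b * s2)
    return total / num_members


if __name__ == "__main__":
    tau = 5.0
    print(two_time_average(0.3, 1.1))               # unshifted times
    print(two_time_average(0.3 + tau, 1.1 + tau))   # both times shifted by tau
    print(math.cos(2.0 * (1.1 - 0.3)))              # value expected for this density
```

This checks only one choice of the function f and one time shift, so it illustrates the stationarity property rather than proving it.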
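It is not difficult to show that $\tilde{n}(t)$ in Eq. (3.25a) is strict-sense stationary when $p_{\tilde{a}\tilde{b}}$ is circularly symmetric.[30] Picking an arbitrary time interval $\tau$, we construct two new random variables

\[ \tilde{A} = \tilde{a}\cos(\omega\tau) + \tilde{b}\sin(\omega\tau) \tag{3.26a} \]

and

\[ \tilde{B} = \tilde{b}\cos(\omega\tau) - \tilde{a}\sin(\omega\tau). \tag{3.26b} \]

The reverse transformation to Eqs. (3.26a) and (3.26b) is, of course,

\[ \tilde{a} = \tilde{A}\cos(\omega\tau) - \tilde{B}\sin(\omega\tau) \tag{3.26c} \]

and

\[ \tilde{b} = \tilde{B}\cos(\omega\tau) + \tilde{A}\sin(\omega\tau), \tag{3.26d} \]

which we can find by solving Eqs. (3.26a) and (3.26b) for $\tilde{a}$ and $\tilde{b}$ in terms of $\tilde{A}$ and $\tilde{B}$. Equations (3.26a) and (3.26b) state that if random variables $\tilde{a}$ and $\tilde{b}$ take on the values $a$ and $b$, then random variables $\tilde{A}$ and $\tilde{B}$ must take on the values

\[ a\cos(\omega\tau) + b\sin(\omega\tau) \quad \text{and} \quad b\cos(\omega\tau) - a\sin(\omega\tau) \]

respectively. Similarly Eqs. (3.26c) and (3.26d) state that if random variables $\tilde{A}$ and $\tilde{B}$ take on values $A$ and $B$, then random variables $\tilde{a}$ and $\tilde{b}$ must take on values

\[ A\cos(\omega\tau) - B\sin(\omega\tau) \quad \text{and} \quad B\cos(\omega\tau) + A\sin(\omega\tau) \]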

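respectively. Whenever there are two random variables $\tilde{x}$ and $\tilde{y}$ that have a probability density distribution $p_{\tilde{x}\tilde{y}}(x, y)$ and we use constants $\alpha_1$, $\alpha_2$, $\alpha_3$, and $\alpha_4$ to construct from $\tilde{x}$ and $\tilde{y}$ two new random variables

\[ \tilde{z} = \alpha_1 \tilde{x} + \alpha_2 \tilde{y} \tag{3.27a} \]

and

\[ \tilde{w} = \alpha_3 \tilde{x} + \alpha_4 \tilde{y}, \tag{3.27b} \]

then we can find the probability density distribution $p_{\tilde{z}\tilde{w}}$ for $\tilde{z}$ and $\tilde{w}$ by calculating the reverse transformation

\[ \tilde{x} = \beta_1 \tilde{z} + \beta_2 \tilde{w} \tag{3.27c} \]

and

\[ \tilde{y} = \beta_3 \tilde{z} + \beta_4 \tilde{w}, \tag{3.27d} \]

and requiring that[31]

\[
p_{\tilde{z}\tilde{w}}(z, w) = \frac{1}{|\alpha_1\alpha_4 - \alpha_2\alpha_3|}\; p_{\tilde{x}\tilde{y}}(\beta_1 z + \beta_2 w,\; \beta_3 z + \beta_4 w).
\tag{3.27e}
\]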

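Comparing Eqs. (3.26a)–(3.26d) to Eqs. (3.27a)–(3.27d), we see that

$$\alpha_1 = \cos(\omega\tau),\quad \alpha_2 = \sin(\omega\tau),\quad \alpha_3 = -\sin(\omega\tau),\quad \alpha_4 = \cos(\omega\tau)$$
and
$$\beta_1 = \cos(\omega\tau),\quad \beta_2 = -\sin(\omega\tau),\quad \beta_3 = \sin(\omega\tau),\quad \beta_4 = \cos(\omega\tau).$$

Consequently,

$$\alpha_1\alpha_4 - \alpha_2\alpha_3 = \cos^2(\omega\tau) + \sin^2(\omega\tau) = 1,$$

and so the probability density distribution of $\tilde{A}$ and $\tilde{B}$ must be

$$p_{AB}(A,B) = p_{ab}\big(A\cos(\omega\tau) - B\sin(\omega\tau),\ A\sin(\omega\tau) + B\cos(\omega\tau)\big). \tag{3.28a}$$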

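Since $p_{ab}$ is circularly symmetric, obeying Eq. (3.25b), this becomes

$$\begin{aligned}
p_{AB}(A,B) &= p_{ab}\Big(\big[A^2\cos^2(\omega\tau) + B^2\sin^2(\omega\tau) - 2AB\sin(\omega\tau)\cos(\omega\tau) \\
&\qquad\quad + A^2\sin^2(\omega\tau) + B^2\cos^2(\omega\tau) + 2AB\sin(\omega\tau)\cos(\omega\tau)\big]^{1/2}\Big) \\
&= p_{ab}\Big(\sqrt{A^2[\cos^2(\omega\tau) + \sin^2(\omega\tau)] + B^2[\cos^2(\omega\tau) + \sin^2(\omega\tau)]}\Big) \\
&= p_{ab}\big(\sqrt{A^2 + B^2}\big).
\end{aligned}$$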

From Eqs. (3.26c) and (3.26d), we know that whenever $\tilde{A}$ and $\tilde{B}$ take on the values $A$ and $B$, $\tilde{a}$ and $\tilde{b}$ must then take on the values

$$a = A\cos(\omega\tau) - B\sin(\omega\tau)$$
and
$$b = B\cos(\omega\tau) + A\sin(\omega\tau).$$

Hence,

$$a^2 + b^2 = \big(A\cos(\omega\tau) - B\sin(\omega\tau)\big)^2 + \big(A\sin(\omega\tau) + B\cos(\omega\tau)\big)^2 = A^2 + B^2$$

so that

$$p_{AB}(A,B) = p_{ab}\big(\sqrt{a^2 + b^2}\big) = p_{ab}(a,b),$$

where Eq. (3.25b) is reversed to make the last step in this equality. We have now shown that Eq. (3.28a) can be written as

$$p_{AB}(A,B) = p_{ab}(a,b) \tag{3.28b}$$

because $p_{ab}$ is circularly symmetric.
Equation (3.28b) is a very restrictive statement applied to random variables $\tilde{A}$ and $\tilde{B}$ because it requires $\tilde{A}$ and $\tilde{B}$ to obey exactly the same statistics as $\tilde{a}$ and $\tilde{b}$. Consequently, we can set up a random function

$$\tilde{N}(t) = \tilde{A}\cos(\omega t) + \tilde{B}\sin(\omega t) \tag{3.29a}$$

and know that it has exactly the same random behavior as $\tilde{n}(t)$ in Eq. (3.25a). Substituting Eqs. (3.26a) and (3.26b) into (3.29a) gives

$$\begin{aligned}
\tilde{N}(t) &= [\tilde{a}\cos(\omega\tau) + \tilde{b}\sin(\omega\tau)]\cos(\omega t) + [\tilde{b}\cos(\omega\tau) - \tilde{a}\sin(\omega\tau)]\sin(\omega t) \\
&= \tilde{a}\cos\big(\omega(t+\tau)\big) + \tilde{b}\sin\big(\omega(t+\tau)\big).
\end{aligned} \tag{3.29b}$$

According to Eq. (3.25a), this is the same as writing

$$\tilde{N}(t) = \tilde{n}(t+\tau). \tag{3.29c}$$

This means that not only does $\tilde{N}(t)$ have the same random behavior as $\tilde{n}(t)$, it also has the same random behavior as $\tilde{n}(t+\tau)$. Consequently, $\tilde{n}(t)$ and $\tilde{n}(t+\tau)$ must both have the same random behavior. We have made no assumptions about the value of $\tau$; hence, Eq. (3.29c) holds true for any $\tau$ value. We have therefore demonstrated that

$$\tilde{n}(t) = \tilde{a}\cos(\omega t) + \tilde{b}\sin(\omega t)$$

is strict-sense stationary when the probability density distribution $p_{ab}$ is circularly symmetric with

$$p_{ab}(a,b) = p_{ab}\big(\sqrt{a^2 + b^2}\big).$$

[Footnote fragment carried over from the preceding page: "where the equal probability densities do not depend on $\omega\tau$."]
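The algebra behind Eqs. (3.26a)–(3.29c) can be spot-checked numerically. The sketch below is only an illustration under assumed values (the numbers chosen for `a`, `b`, `omega`, and `tau`, and the helper names `n` and `rotate`, are not from the text): it confirms that the rotation preserves $a^2 + b^2$, the only quantity a circularly symmetric density depends on, and that $A\cos(\omega t) + B\sin(\omega t)$ reproduces $n(t+\tau)$.

```python
import math

def n(a, b, omega, t):
    """One realization of n(t) = a cos(omega t) + b sin(omega t), Eq. (3.25a)."""
    return a * math.cos(omega * t) + b * math.sin(omega * t)

def rotate(a, b, omega, tau):
    """Values A and B produced from values a and b by Eqs. (3.26a) and (3.26b)."""
    A = a * math.cos(omega * tau) + b * math.sin(omega * tau)
    B = b * math.cos(omega * tau) - a * math.sin(omega * tau)
    return A, B

# Arbitrary illustrative values (assumptions, not taken from the text).
a, b, omega, tau = 0.7, -1.3, 2.0, 0.45
A, B = rotate(a, b, omega, tau)

# The rotation leaves a^2 + b^2 unchanged.
radius_preserved = math.isclose(a * a + b * b, A * A + B * B)

# Eqs. (3.29b) and (3.29c): A cos(omega t) + B sin(omega t) equals n(t + tau).
shift_holds = all(
    math.isclose(n(A, B, omega, t), n(a, b, omega, t + tau), abs_tol=1e-12)
    for t in (-3.0, 0.0, 0.25, 1.0, 10.0)
)
print(radius_preserved, shift_holds)
```

Both flags should come out true for any choice of the four numbers, since the identities are purely trigonometric.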

A random function $\tilde{n}(t)$ is called wide-sense stationary$^{32}$ when

$$\mathrm{E}\big(\tilde{n}(t)\big) = \mu_n = \text{same finite constant for all values of } t \tag{3.30a}$$
and
$$\mathrm{E}\big(\tilde{n}(t_1)\,\tilde{n}(t_2)\big) = R_{nn}(t_2 - t_1). \tag{3.30b}$$

Other terms applied to random functions $\tilde{n}(t)$ that satisfy these two restrictions are weakly stationary or covariance stationary.$^{33}$ Equation (3.30a) requires the average value of $\tilde{n}(t)$ to be finite and independent of time. We call this average $\mu_n$ instead of $\mu_{n(t)}$ as in Eq. (3.23c) to emphasize that it does not depend on time. Equation (3.30b) requires the autocorrelation function $R_{nn}(t_1,t_2)$ defined in Eq. (3.23b) to depend only on $(t_2 - t_1)$, the difference between times $t_2$ and $t_1$. Glancing back at the definition of $C_{nn}(t_1,t_2)$ in Eq. (3.23d), we see that when Eqs. (3.30a) and (3.30b) are satisfied,

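$$\begin{aligned}
C_{nn}(t_1,t_2) &= \mathrm{E}\big([\tilde{n}(t_1) - \mu_n][\tilde{n}(t_2) - \mu_n]\big) \\
&= \mathrm{E}\big(\tilde{n}(t_1)\tilde{n}(t_2) - \mu_n\tilde{n}(t_1) - \mu_n\tilde{n}(t_2) + \mu_n^2\big) \\
&= \mathrm{E}\big(\tilde{n}(t_1)\tilde{n}(t_2)\big) - \mu_n\mathrm{E}\big(\tilde{n}(t_1)\big) - \mu_n\mathrm{E}\big(\tilde{n}(t_2)\big) + \mu_n^2 .
\end{aligned}$$

The last step uses the linearity of the expectation operator (see Sec. 3.10 above) and Eq. (3.9f). Consequently, the formula for $C_{nn}$ becomes, using Eqs. (3.30a) and (3.30b),

$$C_{nn}(t_1,t_2) = R_{nn}(t_2 - t_1) - \mu_n^2. \tag{3.30c}$$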

This result shows that the autocovariance $C_{nn}(t_1,t_2)$ of random functions that are wide-sense stationary also depends only on $(t_2 - t_1)$, the difference between times $t_2$ and $t_1$. We note that random functions that are wide-sense stationary need not be strict-sense stationary, but random functions that are strict-sense stationary must also be wide-sense stationary. For future use, we note that two random functions $\tilde{n}(t)$ and $\tilde{n}'(t)$ are defined to be jointly wide-sense stationary$^{34}$ when each one is itself wide-sense stationary and when

$$\mathrm{E}\big(\tilde{n}(t_1)\,\tilde{n}'(t_2)\big) = R_{nn'}(t_2 - t_1), \tag{3.30d}$$

which is called their cross-correlation function, depends only on the difference between times $t_1$ and $t_2$.

32
33 T. T. Soong, Random Differential Equations in Science and Engineering (Academic Press, New York, 1973), p. 43.
Returning to the $\tilde{n}(t)$ defined in Eq. (3.25a) above,

$$\tilde{n}(t) = \tilde{a}\cos(\omega t) + \tilde{b}\sin(\omega t),$$

we stop assuming that $p_{ab}(a,b)$ is circularly symmetric and examine the weaker conditions that must be put on random variables $\tilde{a}$ and $\tilde{b}$ to make $\tilde{n}$ wide-sense stationary.$^{35}$ The expectation value of $\tilde{n}(t)$ must be time independent, so by the linearity of the expectation operator

$$\mathrm{E}\big(\tilde{n}(t)\big) = \mathrm{E}(\tilde{a})\cos(\omega t) + \mathrm{E}(\tilde{b})\sin(\omega t).$$

Hence, for $\mathrm{E}\big(\tilde{n}(t)\big)$ to obey Eq. (3.30a) and so be time independent, we must have

$$\mathrm{E}(\tilde{a}) = 0 \tag{3.31a}$$
and
$$\mathrm{E}(\tilde{b}) = 0. \tag{3.31b}$$

These are the first two restrictions that must be placed on $\tilde{a}$ and $\tilde{b}$ for $\tilde{n}(t)$ to be wide-sense stationary. We also know from Eq. (3.30b) that $R_{nn}$ must have the same value whenever $t_2 - t_1 = 0$ or $t_2 = t_1$, so (remember that nothing has been said about what the value of time $t_2 = t_1$ is)

$$\mathrm{E}\big(\tilde{n}(t_3)\,\tilde{n}(t_3)\big) = \mathrm{E}\big(\tilde{n}(t_4)\,\tilde{n}(t_4)\big)$$

must hold true for all values of $t_3$ and $t_4$. In particular, this must hold true when $t_3 = 0$ and $t_4 = \pi/(2\omega)$. But from Eq. (3.25a)

$$\tilde{n}(0) = \tilde{a} \quad\text{and}\quad \tilde{n}\big(\pi/(2\omega)\big) = \tilde{b},$$

so it must be true that

$$\mathrm{E}(\tilde{a}^2) = \mathrm{E}(\tilde{b}^2). \tag{3.31c}$$

This is the third restriction that must be placed on $\tilde{a}$ and $\tilde{b}$ for $\tilde{n}(t)$ to be wide-sense stationary.

34
35 This treatment is taken from Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, p. 300.
To find the fourth and last restriction, we evaluate the left-hand side of Eq. (3.30b) for $t_1 \neq t_2$, using (3.25a) and the linearity of the expectation operator (see Sec. 3.10) to get

$$\begin{aligned}
\mathrm{E}\big(\tilde{n}(t_1)\tilde{n}(t_2)\big) &= \mathrm{E}\big([\tilde{a}\cos(\omega t_1) + \tilde{b}\sin(\omega t_1)][\tilde{a}\cos(\omega t_2) + \tilde{b}\sin(\omega t_2)]\big) \\
&= \mathrm{E}\big(\tilde{a}^2\cos(\omega t_1)\cos(\omega t_2) + \tilde{a}\tilde{b}\cos(\omega t_1)\sin(\omega t_2) \\
&\qquad + \tilde{a}\tilde{b}\cos(\omega t_2)\sin(\omega t_1) + \tilde{b}^2\sin(\omega t_1)\sin(\omega t_2)\big) \\
&= \mathrm{E}(\tilde{a}^2)[\cos(\omega t_1)\cos(\omega t_2)] + \mathrm{E}(\tilde{b}^2)[\sin(\omega t_1)\sin(\omega t_2)] \\
&\qquad + \mathrm{E}(\tilde{a}\tilde{b})[\cos(\omega t_1)\sin(\omega t_2) + \cos(\omega t_2)\sin(\omega t_1)].
\end{aligned}$$

This becomes, using $\mathrm{E}(\tilde{a}^2) = \mathrm{E}(\tilde{b}^2)$ from Eq. (3.31c),

$$\mathrm{E}\big(\tilde{n}(t_1)\tilde{n}(t_2)\big) = \mathrm{E}(\tilde{a}^2)\cos\big(\omega(t_2 - t_1)\big) + \mathrm{E}(\tilde{a}\tilde{b})\sin\big(\omega(t_1 + t_2)\big). \tag{3.31d}$$

The first term on the right-hand side of (3.31d) depends only on $(t_2 - t_1)$, which is what Eq. (3.30b) requires, but the second term on the right-hand side does not. Therefore, the last restriction on random variables $\tilde{a}$ and $\tilde{b}$ is

$$\mathrm{E}(\tilde{a}\tilde{b}) = 0. \tag{3.31e}$$

Equations (3.31a), (3.31b), (3.31c), and (3.31e) list all the restrictions on random variables $\tilde{a}$ and $\tilde{b}$ needed to ensure that $\tilde{n}(t)$ in Eq. (3.25a) is a wide-sense stationary random function.
If $\tilde{a}$ and $\tilde{b}$ are independent random variables that obey the same probability density distribution, and this probability density distribution assigns a mean value of zero to random variables obeying it, then Eqs. (3.31a)–(3.31c) are automatically satisfied and, since $\tilde{a}$ and $\tilde{b}$ are independent, Eqs. (3.31a) and (3.31b) show that (3.31e) is also satisfied:

$$\mathrm{E}(\tilde{a}\tilde{b}) = \mathrm{E}(\tilde{a})\,\mathrm{E}(\tilde{b}) = 0 \cdot 0 = 0.$$

This is sufficient to make $\tilde{n}(t)$ wide-sense stationary, but there are other ways to do the job. We can, for example, set $\tilde{a} = \tilde{u}$ and $\tilde{b} = \tilde{v}$, where $\tilde{u}$ and $\tilde{v}$ are the random variables defined in Eqs. (3.15b) and (3.15c) above. Equations (3.15d) and (3.15e) then show that Eqs. (3.31a) and (3.31b) are satisfied, and Eq. (3.15f) shows that (3.31e) is satisfied. The only requirement left is (3.31c), which can be checked now by writing

$$\mathrm{E}(\tilde{a}^2) = \mathrm{E}(\tilde{u}^2) = \frac{1}{2\pi}\int_0^{2\pi}\sin^2\theta\,d\theta = \frac{1}{2} \tag{3.32a}$$
and
$$\mathrm{E}(\tilde{b}^2) = \mathrm{E}(\tilde{v}^2) = \frac{1}{2\pi}\int_0^{2\pi}\cos^2\theta\,d\theta = \frac{1}{2}. \tag{3.32b}$$

Clearly, Eq. (3.31c) is also satisfied. We conclude that even though $\tilde{a} = \tilde{u}$ and $\tilde{b} = \tilde{v}$ are not, as is pointed out in the discussion following Eq. (3.15f), independent random variables, the random function $\tilde{n}(t)$ in Eq. (3.25a) is still wide-sense stationary. Note that Eqs. (3.15b) and (3.15c) can now be used to write $\tilde{n}(t)$ as

$$\tilde{n}(t) = \sin(\tilde{\theta})\cos(\omega t) + \cos(\tilde{\theta})\sin(\omega t) = \sin(\omega t + \tilde{\theta}). \tag{3.32c}$$

In (3.32c), random variable $\tilde{\theta}$ can, according to Eq. (3.15a), be regarded as a random phase equally likely to take on any value between zero and $2\pi$. Adding this sort of random phase to the argument of a sinusoidal oscillation always produces a wide-sense stationary random function.
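The wide-sense stationarity of the random-phase sinusoid can be illustrated numerically. The following is a minimal sketch, assuming an angular frequency `omega = 3.0` and writing the random phase as `theta`; the ensemble average over a phase that is uniform on $[0, 2\pi)$ is approximated by averaging over a uniform grid of phases. The function names are illustrative only.

```python
import numpy as np

omega = 3.0  # assumed angular frequency
# Uniform grid over one full period of the random phase; averaging over it
# approximates the ensemble average (1 / 2 pi) * integral_0^{2 pi} ... d(theta).
theta = np.linspace(0.0, 2.0 * np.pi, 4096, endpoint=False)

def ensemble_mean(t):
    """Approximate E(n(t)) for n(t) = sin(omega t + theta)."""
    return np.mean(np.sin(omega * t + theta))

def ensemble_autocorr(t1, t2):
    """Approximate E(n(t1) n(t2)) for n(t) = sin(omega t + theta)."""
    return np.mean(np.sin(omega * t1 + theta) * np.sin(omega * t2 + theta))

# The same time difference (0.4) at two very different absolute times.
r_early = ensemble_autocorr(0.1, 0.5)
r_late = ensemble_autocorr(7.3, 7.7)
# Value expected from Eq. (3.31d) with E(a^2) = 1/2 and E(ab) = 0.
expected = 0.5 * np.cos(omega * 0.4)
print(ensemble_mean(0.1), ensemble_mean(7.3), r_early, r_late, expected)
```

The mean comes out at zero for every time, and the autocorrelation depends only on the time difference, which are the two conditions in Eqs. (3.30a) and (3.30b).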
3.16 Gaussian Random Processes
A random function $\tilde{n}(t)$ is called a Gaussian random process or normal process when for any $N$ time values $t_1 < t_2 < \cdots < t_N$ the random variables $\tilde{n}(t_1)$, $\tilde{n}(t_2)$, …, $\tilde{n}(t_N)$ obey a probability density distribution

$$p_{n(t_1)n(t_2)\cdots n(t_N)}(n_1, n_2, \ldots, n_N),$$

which is multivariate Gaussian. To write this multivariate Gaussian in a reasonably compact form, we define the vectors

$$\vec{n} = (n_1, n_2, \ldots, n_N), \tag{3.33a}$$

$$\tilde{n}(\vec{t}\,) = \big(\tilde{n}(t_1), \tilde{n}(t_2), \ldots, \tilde{n}(t_N)\big), \tag{3.33b}$$
and
$$\vec{\mu}_{n(\vec{t}\,)} = \mathrm{E}\big(\tilde{n}(\vec{t}\,)\big) = \Big(\mathrm{E}\big(\tilde{n}(t_1)\big), \mathrm{E}\big(\tilde{n}(t_2)\big), \ldots, \mathrm{E}\big(\tilde{n}(t_N)\big)\Big). \tag{3.33c}$$

Glancing back at Eq. (3.23c), we remember that $\mu_{n(t)}$ is the expected or mean value of the random variable $\tilde{n}(t)$, so Eq. (3.33c) can also be written as

$$\vec{\mu}_{n(\vec{t}\,)} = \big(\mu_{n(t_1)}, \mu_{n(t_2)}, \ldots, \mu_{n(t_N)}\big). \tag{3.33d}$$
We define the covariance matrix $\mathbf{C}$ to be the $N \times N$ square matrix whose $i,j$th element is given by

$$(\mathbf{C})_{ij} = \mathrm{E}\big([\tilde{n}(t_i) - \mu_{n(t_i)}][\tilde{n}(t_j) - \mu_{n(t_j)}]\big). \tag{3.33e}$$

Equation (3.14c) reminds us that $(\mathbf{C})_{ij}$ is measuring the covariance of the two random variables $\tilde{n}(t_i)$ and $\tilde{n}(t_j)$. A $T$ superscript applied to a matrix or vector specifies the transpose of that matrix or vector; so, for example,

$$\vec{n}^{\,T} = \begin{pmatrix} n_1 \\ n_2 \\ \vdots \\ n_N \end{pmatrix}.$$
Now the multivariate Gaussian distribution

$$p_{n(t_1)n(t_2)\cdots n(t_N)}(n_1, n_2, \ldots, n_N)$$

can be written as

$$\begin{aligned}
p_{n(t_1)n(t_2)\cdots n(t_N)}(n_1, n_2, \ldots, n_N) &= p_{n(\vec{t}\,)}(\vec{n}) \\
&= (2\pi)^{-N/2}[\det(\mathbf{C})]^{-1/2}\exp\Big[-\frac{1}{2}\big(\vec{n} - \vec{\mu}_{n(\vec{t}\,)}\big)\,\mathbf{C}^{-1}\big(\vec{n} - \vec{\mu}_{n(\vec{t}\,)}\big)^{T}\Big].
\end{aligned} \tag{3.33f}$$

In this formula, $\det(\mathbf{C})$ stands for the determinant of $\mathbf{C}$, and $\mathbf{C}^{-1}$ is the inverse matrix of $\mathbf{C}$.
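A direct numerical evaluation of Eq. (3.33f) may help make the notation concrete. This is only a sketch: the function name `gaussian_density` and the test values of the mean vector, variances, and evaluation point are assumptions chosen for illustration. With a diagonal covariance matrix the joint density should factor into one-dimensional normal densities, which gives a simple consistency check.

```python
import numpy as np

def gaussian_density(n, mu, C):
    """Evaluate Eq. (3.33f) for row vectors n and mu and covariance matrix C."""
    n = np.asarray(n, dtype=float)
    mu = np.asarray(mu, dtype=float)
    C = np.asarray(C, dtype=float)
    N = n.size
    d = n - mu
    # d @ solve(C, d) equals (n - mu) C^{-1} (n - mu)^T without forming C^{-1}.
    quad = d @ np.linalg.solve(C, d)
    return (2.0 * np.pi) ** (-N / 2.0) * np.linalg.det(C) ** (-0.5) * np.exp(-0.5 * quad)

# Consistency check with a diagonal C (assumed values): the joint density
# must equal the product of one-dimensional normal densities.
mu = np.array([0.5, -1.0, 2.0])
var = np.array([1.0, 4.0, 0.25])
n = np.array([0.2, 0.3, 1.7])
joint = gaussian_density(n, mu, np.diag(var))
product = np.prod(np.exp(-(n - mu) ** 2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var))
print(joint, product)
```

Solving the linear system instead of inverting $\mathbf{C}$ is a common numerical choice; it is not something the text prescribes.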
Nothing said so far about Gaussian random processes requires them to be stationary in any sense of the term, and in fact not all Gaussian random processes are stationary. They are often good models for the noise found in mechanical processes and electrical signals. Perhaps the most interesting thing about them, however, is that it can be shown that if they are wide-sense stationary, then they are also strict-sense stationary.$^{36,37}$

3.17 Products of Two, Three, and Four Jointly Normal Random
Variables
Random variables such as $\tilde{n}(t_1)$, $\tilde{n}(t_2)$, …, $\tilde{n}(t_N)$ that obey a multivariate Gaussian distribution such as the one in Eq. (3.33f) are often called jointly normal random variables.$^{38}$ There are a number of useful product identities that apply to groups of two, three, and four jointly normal random variables. Since the derivation of these identities does not involve $t$, our notation can be simplified by writing

$$\tilde{n}(t_1) \to \tilde{n}_1,\qquad \tilde{n}(t_2) \to \tilde{n}_2,\qquad \text{etc.}$$

Each random variable is also assumed to have a mean of zero:

$$\mu_{n_1} = 0,\qquad \mu_{n_2} = 0,\qquad \text{etc.}$$
We start by specifying three jointly normal, zero-mean random variables $\tilde{n}_1$, $\tilde{n}_2$, and $\tilde{n}_3$. Consulting Eq. (3.33f) above, we note that the jointly normal probability density function for $\tilde{n}_1$, $\tilde{n}_2$, and $\tilde{n}_3$ can be written as, by expanding the matrix product in the exponent after setting the means vector $\vec{\mu}$ to zero,

$$p_{n_1 n_2 n_3}(n_1, n_2, n_3) = K\exp\Big(-\sum_{j=1}^{3}\sum_{k=1}^{3}\gamma_{jk}n_j n_k\Big) \tag{3.34a}$$

for real constants $K$ and $\gamma_{jk}$ (with $j, k = 1, 2, 3$). Note that these three random variables can be either independent or dependent random variables and still obey the probability density distribution in (3.34a). The expected value of the triple product $\tilde{n}_1\tilde{n}_2\tilde{n}_3$ is [applying Eq. (3.12a) above]

$$\mathrm{E}(\tilde{n}_1\tilde{n}_2\tilde{n}_3) = K\int_{-\infty}^{\infty}dn_1\int_{-\infty}^{\infty}dn_2\int_{-\infty}^{\infty}dn_3\,(n_1 n_2 n_3)\exp\Big(-\sum_{j=1}^{3}\sum_{k=1}^{3}\gamma_{jk}n_j n_k\Big). \tag{3.34b}$$

36
37 Paul H. Wirsching et al., Random Vibrations: Theory and Practice, p. 83.
38

Changing the dummy variables of integration to

$$u_1 = -n_1,\qquad u_2 = -n_2,\qquad u_3 = -n_3$$

gives

$$\mathrm{E}(\tilde{n}_1\tilde{n}_2\tilde{n}_3) = K\int_{-\infty}^{\infty}du_1\int_{-\infty}^{\infty}du_2\int_{-\infty}^{\infty}du_3\,(-u_1)(-u_2)(-u_3)\exp\Big(-\sum_{j=1}^{3}\sum_{k=1}^{3}\gamma_{jk}(-u_j)(-u_k)\Big)$$

or

$$\mathrm{E}(\tilde{n}_1\tilde{n}_2\tilde{n}_3) = -K\int_{-\infty}^{\infty}du_1\int_{-\infty}^{\infty}du_2\int_{-\infty}^{\infty}du_3\,(u_1 u_2 u_3)\exp\Big(-\sum_{j=1}^{3}\sum_{k=1}^{3}\gamma_{jk}u_j u_k\Big). \tag{3.34c}$$

Comparing the right-hand sides of (3.34b) and (3.34c) shows that

$$\mathrm{E}(\tilde{n}_1\tilde{n}_2\tilde{n}_3) = -\mathrm{E}(\tilde{n}_1\tilde{n}_2\tilde{n}_3).$$

The only number that is equal to $(-1)$ times itself is zero, so we conclude that

$$\mathrm{E}(\tilde{n}_1\tilde{n}_2\tilde{n}_3) = 0 \tag{3.34d}$$

for any three distinct, jointly normal, and zero-mean random variables.
When $\tilde{n}_1$, $\tilde{n}_2$, and $\tilde{n}_3$ are not three distinct random variables (or, what amounts to the same thing, two or more are perfectly correlated), we can redo the analysis to see what happens.
If two of the three random variables $\tilde{n}_1$, $\tilde{n}_2$, and $\tilde{n}_3$ are perfectly correlated, there are really only two distinct, jointly normal, zero-mean random variables that we call $\tilde{n}_1$ and $\tilde{n}_2$. Their multivariate probability density distribution can be written as

$$p_{n_1 n_2}(n_1, n_2) = K\exp\Big(-\sum_{j=1}^{2}\sum_{k=1}^{2}\gamma_{jk}n_j n_k\Big)$$

for real constants $K$ and $\gamma_{jk}$ (with $j, k = 1, 2$). If necessary, we renumber the random variables so that $\tilde{n}_2$ represents the two perfectly correlated random variables that used to be distinct. Equation (3.34b) now simplifies to

$$\mathrm{E}(\tilde{n}_1\tilde{n}_2^2) = K\int_{-\infty}^{\infty}dn_1\int_{-\infty}^{\infty}dn_2\,(n_1 n_2^2)\exp\Big(-\sum_{j=1}^{2}\sum_{k=1}^{2}\gamma_{jk}n_j n_k\Big). \tag{3.35a}$$

Again the dummy variables of integration are changed, this time to

$$u_1 = -n_1 \quad\text{and}\quad u_2 = -n_2,$$

which gives

$$\mathrm{E}(\tilde{n}_1\tilde{n}_2^2) = K\int_{-\infty}^{\infty}du_1\int_{-\infty}^{\infty}du_2\,(-u_1)(-u_2)^2\exp\Big(-\sum_{j=1}^{2}\sum_{k=1}^{2}\gamma_{jk}(-u_j)(-u_k)\Big)$$

or

$$\mathrm{E}(\tilde{n}_1\tilde{n}_2^2) = -K\int_{-\infty}^{\infty}du_1\int_{-\infty}^{\infty}du_2\,(u_1 u_2^2)\exp\Big(-\sum_{j=1}^{2}\sum_{k=1}^{2}\gamma_{jk}u_j u_k\Big). \tag{3.35b}$$

Comparing the right-hand sides of (3.35a) and (3.35b) shows that

$$\mathrm{E}(\tilde{n}_1\tilde{n}_2^2) = -\mathrm{E}(\tilde{n}_1\tilde{n}_2^2),$$

so using the same reasoning as before, that only zero can be equal to $(-1)$ times itself, we get

$$\mathrm{E}(\tilde{n}_1\tilde{n}_2^2) = 0. \tag{3.35c}$$

Hence, Eq. (3.34d) still holds true when any two of the jointly normal, zero-mean random variables $\tilde{n}_1$, $\tilde{n}_2$, $\tilde{n}_3$ are perfectly correlated.
When all three of these random variables are perfectly correlated, there is really just one zero-mean random variable $\tilde{n}_1$ obeying the normal probability distribution [see Eq. (3.6a) above],

$$p_{n_1}(n_1) = \frac{1}{\sigma_{n_1}\sqrt{2\pi}}\,e^{-n_1^2/(2\sigma_{n_1}^2)}.$$

The left-hand side of (3.34d) now becomes $\mathrm{E}(\tilde{n}_1^3)$, which satisfies the formula

$$\mathrm{E}(\tilde{n}_1^3) = \frac{1}{\sigma_{n_1}\sqrt{2\pi}}\int_{-\infty}^{\infty}n_1^3\,e^{-n_1^2/(2\sigma_{n_1}^2)}\,dn_1. \tag{3.36a}$$

Since this is the integral between $-\infty$ and $+\infty$ of an odd function, it must be zero [see Eq. (2.17) in Chapter 2]. Consequently,

$$\mathrm{E}(\tilde{n}_1^3) = 0 \tag{3.36b}$$

for any zero-mean, normally distributed random variable $\tilde{n}_1$. We conclude that Eq. (3.34d) holds for any three jointly normal and zero-mean random variables even if they are not distinct.
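To construct a formula for $\mathrm{E}(\tilde{n}_1\tilde{n}_2\tilde{n}_3\tilde{n}_4)$ for four zero-mean, jointly normal random variables $\tilde{n}_1$, $\tilde{n}_2$, $\tilde{n}_3$, $\tilde{n}_4$, we construct a new random variable,

$$\tilde{w} = \lambda_1\tilde{n}_1 + \lambda_2\tilde{n}_2 + \lambda_3\tilde{n}_3 + \lambda_4\tilde{n}_4 = \sum_{j=1}^{4}\lambda_j\tilde{n}_j. \tag{3.37a}$$

There is no requirement that $\tilde{n}_1$, $\tilde{n}_2$, $\tilde{n}_3$, and $\tilde{n}_4$ be distinct random variables, but we do assume that the real parameters $\lambda_1$, $\lambda_2$, $\lambda_3$, and $\lambda_4$ can independently take on any value between $-\infty$ and $+\infty$. Since $\tilde{n}_1$, $\tilde{n}_2$, $\tilde{n}_3$, and $\tilde{n}_4$ are jointly normal, $\tilde{w}$ is also a normal variable.$^{39}$ Using the linearity of the expectation operator with respect to random variables (see Sec. 3.10 above) and remembering that $\tilde{n}_1$, $\tilde{n}_2$, $\tilde{n}_3$, and $\tilde{n}_4$ are zero mean, we have

$$\mathrm{E}(\tilde{w}) = \mathrm{E}\Big(\sum_{j=1}^{4}\lambda_j\tilde{n}_j\Big) = \sum_{j=1}^{4}\lambda_j\mathrm{E}(\tilde{n}_j) = 0, \tag{3.37b}$$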

showing that $\tilde{w}$ is also zero-mean. For future use we note, applying (3.37b) to Eq. (3.8e), that the variance of $\tilde{w}$ is

$$v_w = \mathrm{E}(\tilde{w}^2) = \mathrm{E}\Big(\sum_{j=1}^{4}\lambda_j\tilde{n}_j\sum_{k=1}^{4}\lambda_k\tilde{n}_k\Big) = \sum_{j=1}^{4}\sum_{k=1}^{4}\lambda_j\lambda_k\mathrm{E}(\tilde{n}_j\tilde{n}_k),$$

which can also be written as, recognizing that [according to Eq. (3.5c)] the variance $v_w$ is the square of the standard deviation $\sigma_w$ of $\tilde{w}$,

$$\sigma_w^2 = \sum_{j=1}^{4}\sum_{k=1}^{4}\lambda_j\lambda_k\mathrm{E}(\tilde{n}_j\tilde{n}_k). \tag{3.37c}$$

39 This analysis is an expanded version of a treatment given in Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, pp. 197–198.
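The characteristic function of $\tilde{w}$ is [see Eqs. (3.9a) and (3.9b) above]

$$\mathrm{E}(e^{2\pi i\nu\tilde{w}}) = \int_{-\infty}^{\infty}p_w(w)\,e^{2\pi i\nu w}\,dw,$$

where $p_w(w)$ is the probability density distribution of random variable $\tilde{w}$. Since $\tilde{w}$ obeys a zero-mean normal distribution [defined in Eq. (3.6a)], this becomes

$$\mathrm{E}(e^{2\pi i\nu\tilde{w}}) = \frac{1}{\sigma_w\sqrt{2\pi}}\int_{-\infty}^{\infty}e^{-w^2/(2\sigma_w^2)}\,e^{2\pi i\nu w}\,dw. \tag{3.38a}$$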

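Substituting the identity $e^{i\theta} = \cos\theta + i\sin\theta$ into (3.38a) gives

$$\mathrm{E}(e^{2\pi i\nu\tilde{w}}) = \frac{1}{\sigma_w\sqrt{2\pi}}\int_{-\infty}^{\infty}\cos(2\pi\nu w)\,e^{-w^2/(2\sigma_w^2)}\,dw + \frac{i}{\sigma_w\sqrt{2\pi}}\int_{-\infty}^{\infty}\sin(2\pi\nu w)\,e^{-w^2/(2\sigma_w^2)}\,dw.$$

When we replace $w$ by $-w$ in

$$Y(w) = \sin(2\pi\nu w)\,e^{-w^2/(2\sigma_w^2)},$$

we see that

$$Y(-w) = \sin(-2\pi\nu w)\,e^{-(-w)^2/(2\sigma_w^2)} = -\sin(2\pi\nu w)\,e^{-w^2/(2\sigma_w^2)} = -Y(w),$$

showing that $Y$ is an odd function. Hence, according to Eq. (2.17) in Chapter 2, its integral between $-\infty$ and $+\infty$ is zero. The formula for $\mathrm{E}(e^{2\pi i\nu\tilde{w}})$ must then reduce to

$$\mathrm{E}(e^{2\pi i\nu\tilde{w}}) = \frac{1}{\sigma_w\sqrt{2\pi}}\int_{-\infty}^{\infty}\cos(2\pi\nu w)\,e^{-w^2/(2\sigma_w^2)}\,dw. \tag{3.38b}$$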

A table of integrals$^{40}$ shows that, for any two real parameters $a$ and $b$ with $a > 0$,

$$\int_0^{\infty}e^{-a^2x^2}\cos(bx)\,dx = \frac{\sqrt{\pi}}{2a}\,e^{-b^2/(4a^2)}.$$

Setting

$$Z(x) = \cos(bx)\,e^{-a^2x^2},$$

we note that $Z$ is an even function because

$$Z(-x) = \cos(-bx)\,e^{-a^2(-x)^2} = \cos(bx)\,e^{-a^2x^2} = Z(x).$$

Hence, according to Eq. (2.19) in Chapter 2, we can write

$$\int_{-\infty}^{\infty}e^{-a^2x^2}\cos(bx)\,dx = \frac{\sqrt{\pi}}{a}\,e^{-b^2/(4a^2)}. \tag{3.38c}$$

40 Formula 679 of the Handbook of Chemistry and Physics, edited by Robert C. Weast, 51st ed. (The Chemical Rubber Company, Cleveland, OH, 1970–1971), p. A-215.

Applying formula (3.38c) to Eq. (3.38b) by specifying that $a = 1/(\sigma_w\sqrt{2})$ and $b = 2\pi\nu$, we get

$$\mathrm{E}(e^{2\pi i\nu\tilde{w}}) = e^{-2\pi^2\nu^2\sigma_w^2}. \tag{3.38d}$$

Equation (3.38d) holds true for any value of $\nu$; in particular, when $\nu = (2\pi)^{-1}$, it must still be true:

$$\mathrm{E}(e^{i\tilde{w}}) = e^{-\sigma_w^2/2}. \tag{3.38e}$$

Formula (3.38e) applies to any zero-mean, normal random variable, which means it applies to $\tilde{w}$ for any set of $\lambda_1$, $\lambda_2$, $\lambda_3$, $\lambda_4$ values in Eq. (3.37a) above.
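We can expand the left-hand side of (3.38e) in powers of $\tilde{w}$ to get, using the linearity of the expectation operator with respect to random variables (see Sec. 3.10 above),

$$\mathrm{E}(e^{i\tilde{w}}) = \mathrm{E}\Big(1 + i\tilde{w} - \frac{\tilde{w}^2}{2} - i\frac{\tilde{w}^3}{6} + \frac{\tilde{w}^4}{24} + \cdots\Big) = 1 + i\,\mathrm{E}(\tilde{w}) - \frac{\mathrm{E}(\tilde{w}^2)}{2} - i\frac{\mathrm{E}(\tilde{w}^3)}{6} + \frac{\mathrm{E}(\tilde{w}^4)}{24} + \cdots.$$

According to Eqs. (3.37b) and (3.36b), both $\mathrm{E}(\tilde{w})$ and $\mathrm{E}(\tilde{w}^3)$ are zero [the discussion following Eq. (3.37a) shows that $\tilde{w}$, like $\tilde{n}_1$, is a zero-mean, normally distributed random variable, which means that it must satisfy both Eqs. (3.37b) and (3.36b)]. Hence, remembering that $\mathrm{E}(\tilde{w}^2) = \sigma_w^2$ because $\sigma_w$ is the standard deviation of $\tilde{w}$ and $\tilde{w}$ is zero mean, we can write

$$\mathrm{E}(e^{i\tilde{w}}) = 1 - \frac{\sigma_w^2}{2} + \frac{\mathrm{E}(\tilde{w}^4)}{24} + \cdots. \tag{3.39a}$$

The right-hand side of (3.38e) can be expanded in powers of $\sigma_w$ to get

$$e^{-\sigma_w^2/2} = 1 - \frac{\sigma_w^2}{2} + \frac{\sigma_w^4}{8} + \cdots. \tag{3.39b}$$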

Substitution of (3.39a) and (3.39b) into (3.38e) now gives

$$1 - \frac{\sigma_w^2}{2} + \frac{\mathrm{E}(\tilde{w}^4)}{24} + \cdots = 1 - \frac{\sigma_w^2}{2} + \frac{\sigma_w^4}{8} + \cdots$$

or

$$\frac{\mathrm{E}(\tilde{w}^4)}{24} + \cdots = \frac{\sigma_w^4}{8} + \cdots. \tag{3.39c}$$

Equation (3.37c) reminds us that $\sigma_w^2$ is the weighted sum of $\lambda_j\lambda_k$ products, so for small $\lambda$ it follows that $\sigma_w^2$ is of order $\lambda^2$. This means that $\sigma_w^4$ on the right-hand side of (3.39c) is of order $\lambda^4$. Similarly, Eq. (3.37a) reminds us that $\mathrm{E}(\tilde{w}^4)$ on the left-hand side of (3.39c) is order $\lambda^4$ when the $\lambda$ values are small. Formula (3.39c) must hold true for all values of $\lambda_1$, $\lambda_2$, $\lambda_3$, and $\lambda_4$. If we choose $\lambda_1$ through $\lambda_4$ to be small, we must have

$$\mathrm{E}(\tilde{w}^4) = 3\sigma_w^4. \tag{3.39d}$$

If (3.39d) is false, then the higher powers of $\tilde{w}$ and $\sigma_w$ in (3.39c), which are represented by $+\cdots$ on both sides of the formula, cannot make (3.39c) hold true because these $+\cdots$ terms contain only order $\lambda^6$ and higher powers of $\lambda_1$ through $\lambda_4$, making them too small to rescue the equality.
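Equations (3.38e) and (3.39d) lend themselves to a quick numerical check. The sketch below assumes a standard deviation `sigma_w = 1.7` (an arbitrary illustrative value) and approximates the integrals over the zero-mean normal density with a fine Riemann sum; the grid limits and step are likewise assumptions.

```python
import numpy as np

sigma_w = 1.7  # assumed standard deviation of the zero-mean normal variable w
# Riemann sum over a range wide enough that the Gaussian tails are negligible.
w, dw = np.linspace(-12.0 * sigma_w, 12.0 * sigma_w, 200001, retstep=True)
p_w = np.exp(-w**2 / (2.0 * sigma_w**2)) / (sigma_w * np.sqrt(2.0 * np.pi))

char_fn = np.sum(p_w * np.exp(1j * w)) * dw   # approximates E(e^{iw})
fourth = np.sum(p_w * w**4) * dw              # approximates E(w^4)
print(char_fn, np.exp(-sigma_w**2 / 2.0), fourth, 3.0 * sigma_w**4)
```

The imaginary part of `char_fn` vanishes to rounding error, in line with the odd-function argument used to pass from Eq. (3.38a) to Eq. (3.38b).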
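The next step is to expand $\mathrm{E}(\tilde{w}^4)$. Raising $\tilde{w}$ to the fourth power in (3.37a) gives

$$\tilde{w}^4 = (\lambda_1\tilde{n}_1 + \lambda_2\tilde{n}_2 + \lambda_3\tilde{n}_3 + \lambda_4\tilde{n}_4)^2(\lambda_1\tilde{n}_1 + \lambda_2\tilde{n}_2 + \lambda_3\tilde{n}_3 + \lambda_4\tilde{n}_4)^2$$

or

$$\begin{aligned}
\tilde{w}^4 = \big(&\lambda_1^2\tilde{n}_1^2 + \lambda_2^2\tilde{n}_2^2 + \lambda_3^2\tilde{n}_3^2 + \lambda_4^2\tilde{n}_4^2 \\
&+ 2\lambda_1\lambda_2\tilde{n}_1\tilde{n}_2 + 2\lambda_1\lambda_3\tilde{n}_1\tilde{n}_3 + 2\lambda_1\lambda_4\tilde{n}_1\tilde{n}_4 \\
&+ 2\lambda_2\lambda_3\tilde{n}_2\tilde{n}_3 + 2\lambda_2\lambda_4\tilde{n}_2\tilde{n}_4 + 2\lambda_3\lambda_4\tilde{n}_3\tilde{n}_4\big)^2.
\end{aligned}$$

Paying attention only to those terms whose coefficients are proportional to $\lambda_1\lambda_2\lambda_3\lambda_4$, we have

$$\tilde{w}^4 = \cdots + 24\lambda_1\lambda_2\lambda_3\lambda_4\,\tilde{n}_1\tilde{n}_2\tilde{n}_3\tilde{n}_4 + \cdots. \tag{3.40a}$$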

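Formula (3.37c) gives, again concentrating only on terms whose coefficients are proportional to $\lambda_1\lambda_2\lambda_3\lambda_4$,

$$\begin{aligned}
\sigma_w^4 = \big[&\lambda_1^2\mathrm{E}(\tilde{n}_1^2) + \lambda_1\lambda_2\mathrm{E}(\tilde{n}_1\tilde{n}_2) + \lambda_1\lambda_3\mathrm{E}(\tilde{n}_1\tilde{n}_3) + \lambda_1\lambda_4\mathrm{E}(\tilde{n}_1\tilde{n}_4) \\
&+ \lambda_2\lambda_1\mathrm{E}(\tilde{n}_2\tilde{n}_1) + \lambda_2^2\mathrm{E}(\tilde{n}_2^2) + \lambda_2\lambda_3\mathrm{E}(\tilde{n}_2\tilde{n}_3) + \lambda_2\lambda_4\mathrm{E}(\tilde{n}_2\tilde{n}_4) \\
&+ \lambda_3\lambda_1\mathrm{E}(\tilde{n}_3\tilde{n}_1) + \lambda_3\lambda_2\mathrm{E}(\tilde{n}_3\tilde{n}_2) + \lambda_3^2\mathrm{E}(\tilde{n}_3^2) + \lambda_3\lambda_4\mathrm{E}(\tilde{n}_3\tilde{n}_4) \\
&+ \lambda_1\lambda_4\mathrm{E}(\tilde{n}_1\tilde{n}_4) + \lambda_2\lambda_4\mathrm{E}(\tilde{n}_2\tilde{n}_4) + \lambda_3\lambda_4\mathrm{E}(\tilde{n}_3\tilde{n}_4) + \lambda_4^2\mathrm{E}(\tilde{n}_4^2)\big]^2,
\end{aligned}$$

which becomes

$$\begin{aligned}
\sigma_w^4 = \cdots &+ 8\lambda_1\lambda_2\lambda_3\lambda_4\,\mathrm{E}(\tilde{n}_1\tilde{n}_2)\mathrm{E}(\tilde{n}_3\tilde{n}_4) + 8\lambda_1\lambda_2\lambda_3\lambda_4\,\mathrm{E}(\tilde{n}_1\tilde{n}_3)\mathrm{E}(\tilde{n}_2\tilde{n}_4) + \cdots \\
&+ 8\lambda_1\lambda_2\lambda_3\lambda_4\,\mathrm{E}(\tilde{n}_2\tilde{n}_3)\mathrm{E}(\tilde{n}_1\tilde{n}_4) + \cdots.
\end{aligned} \tag{3.40b}$$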

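Equations (3.40a) and (3.40b) can be substituted into (3.39d) to get

$$\begin{aligned}
\mathrm{E}(\cdots &+ 24\lambda_1\lambda_2\lambda_3\lambda_4\,\tilde{n}_1\tilde{n}_2\tilde{n}_3\tilde{n}_4 + \cdots) \\
&= 3\big[\cdots + 8\lambda_1\lambda_2\lambda_3\lambda_4\,\mathrm{E}(\tilde{n}_1\tilde{n}_2)\mathrm{E}(\tilde{n}_3\tilde{n}_4) + 8\lambda_1\lambda_2\lambda_3\lambda_4\,\mathrm{E}(\tilde{n}_1\tilde{n}_3)\mathrm{E}(\tilde{n}_2\tilde{n}_4) + \cdots \\
&\qquad + 8\lambda_1\lambda_2\lambda_3\lambda_4\,\mathrm{E}(\tilde{n}_2\tilde{n}_3)\mathrm{E}(\tilde{n}_1\tilde{n}_4) + \cdots\big],
\end{aligned}$$

which simplifies to, using the linearity of the expectation operator (see Sec. 3.10),

$$\begin{aligned}
\cdots &+ 24\lambda_1\lambda_2\lambda_3\lambda_4\,\mathrm{E}(\tilde{n}_1\tilde{n}_2\tilde{n}_3\tilde{n}_4) + \cdots \\
&= \cdots + 24\lambda_1\lambda_2\lambda_3\lambda_4\big[\mathrm{E}(\tilde{n}_1\tilde{n}_2)\mathrm{E}(\tilde{n}_3\tilde{n}_4) + \mathrm{E}(\tilde{n}_1\tilde{n}_3)\mathrm{E}(\tilde{n}_2\tilde{n}_4) + \mathrm{E}(\tilde{n}_2\tilde{n}_3)\mathrm{E}(\tilde{n}_1\tilde{n}_4)\big] + \cdots.
\end{aligned}$$

This must hold true for any combination of $\lambda_1$, $\lambda_2$, $\lambda_3$, and $\lambda_4$ values, large or small, so the coefficients of all the $\lambda_1\lambda_2\lambda_3\lambda_4$ terms must be the same on both sides of this equation. Therefore,

$$\mathrm{E}(\tilde{n}_1\tilde{n}_2\tilde{n}_3\tilde{n}_4) = \mathrm{E}(\tilde{n}_1\tilde{n}_2)\mathrm{E}(\tilde{n}_3\tilde{n}_4) + \mathrm{E}(\tilde{n}_1\tilde{n}_3)\mathrm{E}(\tilde{n}_2\tilde{n}_4) + \mathrm{E}(\tilde{n}_2\tilde{n}_3)\mathrm{E}(\tilde{n}_1\tilde{n}_4) \tag{3.40c}$$

for any collection of zero-mean, jointly normal random variables $\tilde{n}_1$, $\tilde{n}_2$, $\tilde{n}_3$, and $\tilde{n}_4$.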
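Equation (3.40c) requires $\lambda_1$ through $\lambda_4$ to be distinct real parameters, but it does not require the $\tilde{n}_1$, $\tilde{n}_2$, $\tilde{n}_3$, and $\tilde{n}_4$ random variables to be distinct. Consequently, if $\tilde{n}_1$ and $\tilde{n}_2$ are the same, we can relabel the jointly normal random variables using

$$\tilde{n}_1 = \tilde{n}_2 = \tilde{n}_a,\qquad \tilde{n}_3 = \tilde{n}_b,\qquad \tilde{n}_4 = \tilde{n}_c$$

to get

$$\mathrm{E}(\tilde{n}_a^2\tilde{n}_b\tilde{n}_c) = \mathrm{E}(\tilde{n}_a^2)\,\mathrm{E}(\tilde{n}_b\tilde{n}_c) + 2\,\mathrm{E}(\tilde{n}_a\tilde{n}_b)\,\mathrm{E}(\tilde{n}_a\tilde{n}_c). \tag{3.41a}$$

Similarly, if $\tilde{n}_3$ and $\tilde{n}_4$ are also identical, we can relabel $\tilde{n}_1$ through $\tilde{n}_4$ as

$$\tilde{n}_1 = \tilde{n}_2 = \tilde{n}_a \quad\text{and}\quad \tilde{n}_3 = \tilde{n}_4 = \tilde{n}_b,$$

so that

$$\mathrm{E}(\tilde{n}_a^2\tilde{n}_b^2) = \mathrm{E}(\tilde{n}_a^2)\,\mathrm{E}(\tilde{n}_b^2) + 2\,[\mathrm{E}(\tilde{n}_a\tilde{n}_b)]^2. \tag{3.41b}$$

When all four random variables are the same, Eq. (3.40c) collapses to

$$\mathrm{E}(\tilde{n}^4) = 3\,[\mathrm{E}(\tilde{n}^2)]^2, \tag{3.41c}$$

which holds true for any zero-mean random variable obeying a normal distribution.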
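A Monte Carlo experiment offers an informal check of Eqs. (3.34d) and (3.40c). The sketch below is an illustration only: the mixing matrix `L`, the seed, and the sample count are assumptions. Zero-mean, jointly normal samples are built by mixing independent standard normal draws, so their covariance matrix is `C = L @ L.T` and `C[i, j]` equals $\mathrm{E}(\tilde{n}_{i+1}\tilde{n}_{j+1})$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed lower-triangular mixing matrix; C = L L^T is then a valid covariance matrix.
L = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.5, 1.0, 0.0, 0.0],
              [0.2, 0.3, 1.0, 0.0],
              [0.1, 0.4, 0.6, 1.0]])
C = L @ L.T

# Each row holds one draw of the zero-mean, jointly normal variables (n1, n2, n3, n4).
samples = rng.standard_normal((1_000_000, 4)) @ L.T

# Sample estimate of E(n1 n2 n3 n4) and the right-hand side of Eq. (3.40c).
lhs = np.mean(np.prod(samples, axis=1))
rhs = C[0, 1] * C[2, 3] + C[0, 2] * C[1, 3] + C[1, 2] * C[0, 3]

# Sample estimate of E(n1 n2 n3), which Eq. (3.34d) predicts to be zero.
triple = np.mean(samples[:, 0] * samples[:, 1] * samples[:, 2])
print(lhs, rhs, triple)
```

The agreement is statistical rather than exact, so the two sides match only to within sampling error, which shrinks as the number of draws grows.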
3.18 Ergodic Random Functions
Ergodic random functions are random functions where time averages can be used to calculate ensemble averages. Just as stationary random functions can be stationary in many different ways, so can ergodic random functions be ergodic in many different ways.
We start with a simple example, discussing what is meant by saying that a random function $\tilde{n}(t)$ is ergodic in the mean.$^{41}$ Equation (3.23c) defines the mean of $\tilde{n}(t)$ to be the ensemble average created by the expectation operator,

$$\mu_{n(t)} = \mathrm{E}\big(\tilde{n}(t)\big).$$

To find the mean using a time average, we must calculate

$$\frac{1}{2T}\int_{-T}^{T}\tilde{n}(t)\,dt$$

and take the limit as $T \to \infty$. Since ergodic refers to using time averages to calculate ensemble averages, we might expect that a random function that is ergodic in the mean would satisfy the equation

$$\mu_{n(t)} = \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T}\tilde{n}(t)\,dt. \tag{3.42a}$$

There are two problems with Eq. (3.42a). The first is that $\mu_{n(t)}$ is allowed to be a function of time $t$, whereas

$$\lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T}\tilde{n}(t)\,dt$$

is not. This means Eq. (3.42a) can only be true when $\mu_{n(t)}$ does not depend on time. Consequently, for $\tilde{n}$ to be ergodic in the mean, we must also require $\tilde{n}$ to be stationary in the mean with [see Eq. (3.30a) above]

$$\mathrm{E}\big(\tilde{n}(t)\big) = \mu_n = \text{constant with respect to time}.$$

Now Eq. (3.42a) can be written as

$$\mu_n = \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T}\tilde{n}(t)\,dt. \tag{3.42b}$$

The second problem is more difficult to deal with. We note that the value of

$$\frac{1}{2T}\int_{-T}^{T}\tilde{n}(t)\,dt$$

must be a random value because it is proportional to the integral of a random function. Hence, we expect

$$\lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T}\tilde{n}(t)\,dt$$

also to be a random value. This means Eq. (3.42b) sets a random value equal to $\mu_n$, a nonrandom value, which is in general not allowed. The way out of this impasse is to put a restriction on the limiting process used to get the right-hand side of (3.42b). Clearly,

$$\tilde{\Phi}(T) = \frac{1}{2T}\int_{-T}^{T}\tilde{n}(t)\,dt \tag{3.42c}$$

is a random function of $T$. This means there must be a probability density distribution $p_{\Phi(T)}(\phi)$ such that $p_{\Phi(T)}(\phi)\,d\phi$ is the probability that $\tilde{\Phi}(T)$ takes on a value between $\phi$ and $\phi + d\phi$. We now require the limiting random variable

$$\tilde{\Phi} = \lim_{T\to\infty}\tilde{\Phi}(T) = \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T}\tilde{n}(t)\,dt \tag{3.42d}$$

to obey the limiting probability density distribution

$$p_{\Phi}(\phi) = \delta(\phi - \mu_n). \tag{3.42e}$$

According to the discussion following Eqs. (3.7e) and (3.7f) above, this turns $\tilde{\Phi}$ into a random variable that behaves like a constant, since

$$\mathrm{E}(\tilde{\Phi}) = \int_{-\infty}^{\infty}\phi\,\delta(\phi - \mu_n)\,d\phi = \mu_n$$
and
$$\mathrm{E}\big((\tilde{\Phi} - \mu_n)^2\big) = \int_{-\infty}^{\infty}(\phi - \mu_n)^2\,\delta(\phi - \mu_n)\,d\phi = 0.$$

Now we can note that, yes, strictly speaking, Eq. (3.42b) does equate a random variable to a nonrandom variable, but this does not matter because Eq. (3.42e) makes the random variable

$$\lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T}\tilde{n}(t)\,dt$$

equivalent to a nonrandom quantity.
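A small numerical experiment can make the requirement in Eq. (3.42e) more tangible. The sketch below is an illustration under assumed values, not a general proof: it takes one realization of a sinusoid with a random phase offset by a mean `mu_n`, and one realization of a random function that is constant in time, and computes their time averages over growing windows. All names and numbers here are hypothetical.

```python
import numpy as np

def time_average(realization, T, num=200001):
    """Approximate (1 / 2T) * integral_{-T}^{T} n(t) dt by a mean over a uniform grid."""
    t = np.linspace(-T, T, num)
    return np.mean(realization(t))

omega, mu_n = 2.0, 0.3   # assumed angular frequency and ensemble mean
theta = 1.1              # one realization of the random phase (assumed)
c = 0.8                  # one realization of a random constant (assumed)

def sinusoid(t):
    """Realization of mu_n + sin(omega t + theta); its ensemble mean is mu_n."""
    return mu_n + np.sin(omega * t + theta)

def constant(t):
    """Realization of a random function that does not change with time."""
    return np.full_like(t, c)

for T in (10.0, 100.0, 1000.0):
    print(T, time_average(sinusoid, T), time_average(constant, T))
```

For the sinusoid the time average settles toward `mu_n` whatever phase was drawn, so the limiting random variable behaves like a constant. For the constant realization the time average stays at the drawn value `c`, which differs from one realization to the next, so the limiting variable keeps a nonzero spread unless the constant itself is not random.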

A random function $\tilde{n}(t)$ is ergodic in the autocorrelation function$^{42}$ if the autocorrelation function defined as an ensemble average in Eq. (3.23b) can also be calculated with a time average. Glancing back at (3.23b), we define $\tau = t_2 - t_1$ and set the ensemble average equal to the time average by writing

$$\mathrm{E}\big(\tilde{n}(t_1)\,\tilde{n}(t_1+\tau)\big) = \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T}\tilde{n}(t)\,\tilde{n}(t+\tau)\,dt. \tag{3.43a}$$

Once again we face the same two problems: the left-hand side of this equation is allowed to be a function of $t_1$ whereas the right-hand side is not, and the left-hand side of this equation is nonrandom whereas the right-hand side is random.
Dealing with the $t_1$ problem first, we again require that

$$\mathrm{E}\big(\tilde{n}(t_1)\,\tilde{n}(t_1+\tau)\big)$$

does not depend on $t_1$, making $\tilde{n}(t)$ stationary with respect to its autocorrelation function. Now Eq. (3.43a) can be written as

$$\mathrm{E}\big(\tilde{n}(t_1)\,\tilde{n}(t_1+\tau)\big) = R_{nn}(\tau) = \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T}\tilde{n}(t)\,\tilde{n}(t+\tau)\,dt. \tag{3.43b}$$

Both in Eqs. (3.42a) and (3.42b) describing what it means to be ergodic in the mean, and in Eqs. (3.43a) and (3.43b) describing what it means to be ergodic in the autocorrelation function, the time dependence that ensemble averaging preserves is lost in the time average. This is clearly going to happen whenever some sort of ensemble average is set equal to the corresponding time average. We conclude that when a random function is ergodic in some way, it must also be stationary in that same way. In this sense, ergodic random functions are always stationary.$^{43}$
Moving on to the second problem with Eq. (3.43a), that of equating random and nonrandom quantities, we follow the same procedure as before. This time the random function $\tilde{\Phi}$ is defined to be

$$\tilde{\Phi}(T,\tau) = \frac{1}{2T}\int_{-T}^{T}\tilde{n}(t)\,\tilde{n}(t+\tau)\,dt \tag{3.44a}$$

and the random function $\tilde{\Phi}(\tau)$ is defined to be

$$\tilde{\Phi}(\tau) = \lim_{T\to\infty}\tilde{\Phi}(T,\tau). \tag{3.44b}$$

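Associated with $\tilde{\Phi}(\tau)$ is the probability density distribution $p_{\Phi(\tau)}$ such that $p_{\Phi(\tau)}(\phi)\,d\phi$ is the probability that $\tilde{\Phi}(\tau)$ has a value between $\phi$ and $\phi + d\phi$. We again require

$$p_{\Phi(\tau)}(\phi) = \delta\big(\phi - R_{nn}(\tau)\big) \tag{3.44c}$$

so that

$$\mathrm{E}\big(\tilde{\Phi}(\tau)\big) = \int_{-\infty}^{\infty}p_{\Phi(\tau)}(\phi)\,\phi\,d\phi = \int_{-\infty}^{\infty}\delta\big(\phi - R_{nn}(\tau)\big)\,\phi\,d\phi = R_{nn}(\tau) \tag{3.44d}$$

and

$$\begin{aligned}
\mathrm{E}\big([\tilde{\Phi}(\tau) - R_{nn}(\tau)]^2\big) &= \int_{-\infty}^{\infty}p_{\Phi(\tau)}(\phi)\,[\phi - R_{nn}(\tau)]^2\,d\phi \\
&= \int_{-\infty}^{\infty}\delta\big(\phi - R_{nn}(\tau)\big)\,[\phi - R_{nn}(\tau)]^2\,d\phi = 0.
\end{aligned} \tag{3.44e}$$

This shows, according to the discussion following Eqs. (3.7e) and (3.7f), that the random variable $\tilde{\Phi}(\tau)$ behaves like a nonrandom quantity. We have now solved the second problem with Eq. (3.43a) and therefore can make sense of the idea that a random function can be ergodic in the autocorrelation function.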
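The pattern used in analyzing the ergodic qualities of a random function $\tilde{n}(t)$ has by now been set. There is some mathematically useful and reasonable function $f$ that has $N$ arguments. We pick $N$ time values $t_1$, $t_2$, …, $t_N$ and calculate an ensemble expectation value or average

$$\mathrm{E}\Big(f\big(\tilde{n}(t_1), \tilde{n}(t_2), \ldots, \tilde{n}(t_N)\big)\Big),$$

which is then set equal to the time average

$$\lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T}f\big(\tilde{n}(t), \tilde{n}(t+\tau_2), \tilde{n}(t+\tau_3), \ldots, \tilde{n}(t+\tau_N)\big)\,dt.$$

We define

$$\tau_2 = t_2 - t_1,\quad \tau_3 = t_3 - t_1,\quad \ldots,\quad \tau_N = t_N - t_1$$

and set the expectation value equal to the time average by writing

$$\begin{aligned}
\mathrm{E}\Big(f\big(\tilde{n}(t_1), \tilde{n}(t_2), \ldots, \tilde{n}(t_N)\big)\Big) &= \mathrm{E}\Big(f\big(\tilde{n}(t_1), \tilde{n}(t_1+\tau_2), \tilde{n}(t_1+\tau_3), \ldots, \tilde{n}(t_1+\tau_N)\big)\Big) \\
&= \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T}f\big(\tilde{n}(t), \tilde{n}(t+\tau_2), \tilde{n}(t+\tau_3), \ldots, \tilde{n}(t+\tau_N)\big)\,dt.
\end{aligned} \tag{3.45a}$$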

In order for Eq. (3.45a) to make sense, the expectation value

$$\mathrm{E}\Big(f\big(\tilde{n}(t_1), \tilde{n}(t_2), \ldots, \tilde{n}(t_N)\big)\Big) = \mathrm{E}\Big(f\big(\tilde{n}(t_1), \tilde{n}(t_1+\tau_2), \tilde{n}(t_1+\tau_3), \ldots, \tilde{n}(t_1+\tau_N)\big)\Big)$$

cannot be a function of $t_1$. This means the right-hand side of this relationship still has the same value when $t_1$ is increased by any time value $\tau$; hence we can write, increasing $t_1$ by $\tau$ only on the right-hand side,

$$\mathrm{E}\Big(f\big(\tilde{n}(t_1), \tilde{n}(t_2), \ldots, \tilde{n}(t_N)\big)\Big) = \mathrm{E}\Big(f\big(\tilde{n}(t_1+\tau), \tilde{n}(t_1+\tau+\tau_2), \tilde{n}(t_1+\tau+\tau_3), \ldots, \tilde{n}(t_1+\tau+\tau_N)\big)\Big).$$

Remembering that

$$\tau_2 = t_2 - t_1,\quad \tau_3 = t_3 - t_1,\quad \ldots,\quad \tau_N = t_N - t_1,$$

we eliminate $\tau_2$, $\tau_3$, …, $\tau_N$ from the equation to get

$$\mathrm{E}\Big(f\big(\tilde{n}(t_1), \tilde{n}(t_2), \ldots, \tilde{n}(t_N)\big)\Big) = \mathrm{E}\Big(f\big(\tilde{n}(t_1+\tau), \tilde{n}(t_2+\tau), \ldots, \tilde{n}(t_N+\tau)\big)\Big). \tag{3.45b}$$

This is the same as Eq. (3.24c) above. We conclude that Eq. (3.24c) must be true whenever Eq. (3.45a) is true. According to the discussion following Eq. (3.24c), whenever Eq. (3.45a) is true, the expectation value

$$\mathrm{E}\Big(f\big(\tilde{n}(t_1), \tilde{n}(t_2), \ldots, \tilde{n}(t_N)\big)\Big)$$

must be a function of only the $N - 1$ independent time values

$$\tau_2 = t_2 - t_1,\quad \tau_3 = t_3 - t_1,\quad \ldots,\quad \tau_N = t_N - t_1.$$

Consequently, the expectation values and the time integral in Eq. (3.45a) have the same number of independent time parameters, which we can show by writing

$$S(\tau_2, \tau_3, \ldots, \tau_N) = \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T}f\big(\tilde{n}(t), \tilde{n}(t+\tau_2), \tilde{n}(t+\tau_3), \ldots, \tilde{n}(t+\tau_N)\big)\,dt, \tag{3.45c}$$

where

$$S(\tau_2, \tau_3, \ldots, \tau_N) = \mathrm{E}\Big(f\big(\tilde{n}(t_1), \tilde{n}(t_2), \ldots, \tilde{n}(t_N)\big)\Big). \tag{3.45d}$$

Equation (3.45a) needs to have one more requirement imposed on it: the random quantity on the right-hand side must be equivalent to the nonrandom quantity on the left. This means the random quantity

$$\tilde{\Phi}(\tau_2, \tau_3, \ldots, \tau_N) = \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T}f\big(\tilde{n}(t), \tilde{n}(t+\tau_2), \tilde{n}(t+\tau_3), \ldots, \tilde{n}(t+\tau_N)\big)\,dt \tag{3.45e}$$

must become equivalent to the nonrandom quantity $S$ by having

$$\mathrm{E}\big(\tilde{\Phi}(\tau_2, \tau_3, \ldots, \tau_N)\big) = S(\tau_2, \tau_3, \ldots, \tau_N) \tag{3.45f}$$

and

$$\mathrm{E}\Big(\big[\tilde{\Phi}(\tau_2, \tau_3, \ldots, \tau_N) - S(\tau_2, \tau_3, \ldots, \tau_N)\big]^2\Big) = 0. \tag{3.45g}$$

Now, by requiring Eqs. (3.45b)–(3.45g) to hold true, we can be sure that Eq. (3.45a) is mathematically self-consistent.
It is not difficult to relate this mathematical machinery to the analysis of what it means to say that $\tilde{n}(t)$ is ergodic in the mean or ergodic in the autocorrelation function. When specifying what it means to say that $\tilde{n}(t)$ is ergodic in the mean, we take $N = 1$ and define function $f$ to be $f(x) = x$; and when specifying what it means to say that $\tilde{n}(t)$ is ergodic in the autocorrelation function, we take $N = 2$ and define function $f$ to be $f(x,y) = xy$. To give another example of how to use Eqs. (3.45a)–(3.45g), we examine an often encountered type of ergodicity called ergodic in the variance.$^{44}$ We define ergodic in the variance for a random function $\tilde{n}(t)$ by setting $N = 1$ and $f(x) = (x - \mu_n)^2$, with $\mu_n$ in function $f$ being the stationary mean of $\tilde{n}$,

$$\mu_n = \mathrm{E}\big(\tilde{n}(t)\big),$$

specified by Eq. (3.30a) above. When a random function $\tilde{n}(t)$ is ergodic in the variance, Eq. (3.45a) becomes

$$\mathrm{E}\big([\tilde{n}(t) - \mu_n]^2\big) = \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T}[\tilde{n}(t) - \mu_n]^2\,dt. \tag{3.46a}$$

The requirements imposed by Eq. (3.45b) can be written as

$$\mathrm{E}\big([\tilde{n}(t) - \mu_n]^2\big) = \mathrm{E}\big([\tilde{n}(t+\tau) - \mu_n]^2\big) \tag{3.46b}$$

for all values of $\tau$, which means that

$$\mathrm{E}\big([\tilde{n}(t) - \mu_n]^2\big) = v_n = \text{nonrandom variable independent of time}. \tag{3.46c}$$

Here, we write $v_n$ instead of $v_{n(t)}$ for the variance of $\tilde{n}(t)$ to emphasize that $v_n$ does not depend on time. Equation (3.46c) can be interpreted as saying that $\tilde{n}$ is stationary with respect to its variance $v_n$. We note that variance $v_n$ is equivalent to $S$ in Eq. (3.45d), so Eqs. (3.45e), (3.45f), and (3.45g) now reduce to

$$\tilde{\Phi} = \lim_{T\to\infty}\frac{1}{2T}\int_{-T}^{T}[\tilde{n}(t) - \mu_n]^2\,dt, \tag{3.46d}$$

$$\mathrm{E}(\tilde{\Phi}) = v_n, \tag{3.46e}$$
and
$$\mathrm{E}\big([\tilde{\Phi} - v_n]^2\big) = 0. \tag{3.46f}$$

A random function $\tilde{n}(t)$ is called weakly ergodic if it is ergodic in the mean, ergodic in the
variance, and ergodic in the autocorrelation function. It is called strongly ergodic if Eqs.
(3.45a)-(3.45g) are satisfied for all $N = 1, 2, \ldots, \infty$ and for any reasonable choice of function f.
This is equivalent to requiring that all reasonable ensemble averages of the random function $\tilde{n}(t)$
be equal to their corresponding time averages.
The distinction made between weakly ergodic and strongly ergodic is reminiscent of the
distinction made between wide-sense stationary and strict-sense stationary. Just as all strict-sense
stationary random functions are also wide-sense stationary, but not all wide-sense stationary
random functions are strict-sense stationary, so too are all strongly ergodic random functions also
weakly ergodic, but not all weakly ergodic random functions are strongly ergodic. The Gaussian
random processes discussed in Sec. 3.16 above are an important special case. We have already

said that when Gaussian random processes are wide-sense stationary they must also be strict-
sense stationary; it can also be shown that whenever Gaussian random processes are weakly
ergodic they must also be strongly ergodic.

Although we have seen that all ergodic random functions are also stationary, it is easy to show
that not all stationary random functions are ergodic. The random function

$$\tilde{n}(t) = c , \tag{3.47a}$$

where c is a random constant chosen from a probability density distribution $p_c(c)$, is clearly
strict-sense stationary. To see why this is so, we just observe that Eq. (3.24c) is automatically
satisfied, since

$$\mathrm{E}\big(f(\tilde{n}(t_1), \tilde{n}(t_2), \ldots, \tilde{n}(t_N))\big) = \int_{-\infty}^{\infty} p_c(c)\, f(c, c, \ldots, c)\, dc = \mathrm{E}\big(f(c, c, \ldots, c)\big) = \mathrm{E}\big(f(\tilde{n}(t_1+\tau), \tilde{n}(t_2+\tau), \ldots, \tilde{n}(t_N+\tau))\big) \tag{3.47b}$$

for any value of $\tau$ and any integrable function f with $N = 1, 2, \ldots, \infty$ arguments. On the other
hand, $\tilde{n}(t) = c$ cannot be ergodic because once a value for c is chosen from the ensemble, it must
stay the same for all time values. Looking at even the simplest type of ergodicity, ergodicity in
the mean, we get from Eq. (3.42d)

$$\tilde{\mu}_{\tilde{n}} = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} \tilde{n}(t)\, dt = \lim_{T \to \infty} \frac{1}{2T} (2Tc) = c . \tag{3.47c}$$

Hence, the probability density distribution of $\tilde{\mu}_{\tilde{n}}$ is the same as the probability density
distribution $p_c$, which, unless $p_c$ is a delta function, violates requirement (3.42e) for ergodic in
the mean.
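
The counterexample can be made concrete with a small numerical sketch. It assumes, purely for illustration, that c is drawn from a zero-mean, unit-variance Gaussian density and that each member function is sampled on a finite time grid; the names `members` and `time_averages` are hypothetical and do not come from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

num_members = 2000   # member functions drawn from the ensemble
num_samples = 200    # time samples per member function

# Each member function is n(t) = c, with c fixed for all t (assumed c ~ N(0, 1)).
c = rng.normal(loc=0.0, scale=1.0, size=num_members)
members = np.repeat(c[:, None], num_samples, axis=1)

# Time average of each member function: a discrete stand-in for Eq. (3.47c).
time_averages = members.mean(axis=1)

# Ensemble average at one fixed time value.
ensemble_mean = members[:, 0].mean()

print("ensemble mean at a fixed time:", ensemble_mean)
print("variance of the time averages:", time_averages.var())
```

Every time average simply reproduces the c of its own member function, so the time averages keep the full spread of the density of c instead of collapsing onto the ensemble mean, which is what ergodicity in the mean would require.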

3.19 Experimental Noise
We almost always analyze noise in experimental signals as a random function of time $\tilde{n}(t)$. The
signal noise in any given experiment is then a member function chosen at random from the
ensemble of allowed functions because it corresponds to pulling the levers of all the slot
machines simultaneously in Sec. 3.14 above. This suggests that the straightforward way to
calculate an expectation value or ensemble average is to acquire many different member

functions by running the experiment many different times. This is, of course, unlikely to happen;
there is usually not much incentive to do the same experiment over and over in exactly the same
way, because the point of most experiments is to measure a signal, not the noise associated with
it. Sometimes repeating an experiment is literally impossible. If, for example, stock-market prices
are treated as random functions of time, there is no way to repeat last year to see what happens
this time around. Consequently, when examining random functions of time, there is usually only
one, or at best a few, member functions of the ensemble to examine. In practice, then, most
experimental statisticians are forced to assume that their random functions are ergodic as well as
stationary; otherwise, they cannot calculate the ensemble averages needed for their analysis.
Another point worth making about stationarity and ergodicity is that, strictly speaking, no
experimental data can be truly stationary or truly ergodic in even the weakest sense, because
before an experiment begins or after an experiment ends the random function representing the
noise must be strictly zero. One way of handling this is to regard the noise data as a finite-length
sample of some random function stretching between $t = -\infty$ and $t = +\infty$, but we should also
acknowledge that stationarity and ergodicity are ideals that experimental noise can only realize to
some degree of approximation. Just as, in Sec. 3.5 above, many pairs of independent random
variables turn out after all to depend slightly on each other, so too do many recordings of
experimental noise turn out, after close analysis, to be stationary and ergodic only to some degree
of approximation.
3.20 The Power Spectrum
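A random function $\tilde{n}(t)$ that is wide-sense stationary has an autocorrelation function $R_{\tilde{n}\tilde{n}}$, which
according to Eq. (3.30b) can be written as

$$R_{\tilde{n}\tilde{n}}(t_2 - t_1) = \mathrm{E}\big(\tilde{n}(t_1)\, \tilde{n}(t_2)\big) \tag{3.48a}$$

for any two time values $t_2$ and $t_1$. We note that

$$\mathrm{E}\big(\tilde{n}(t_1)\, \tilde{n}(t_2)\big) = \mathrm{E}\big(\tilde{n}(t_2)\, \tilde{n}(t_1)\big)$$

automatically. This means that

$$R_{\tilde{n}\tilde{n}}(t_2 - t_1) = R_{\tilde{n}\tilde{n}}(t_1 - t_2)$$

or, setting $\tau = t_2 - t_1$,

$$R_{\tilde{n}\tilde{n}}(\tau) = R_{\tilde{n}\tilde{n}}(-\tau) , \tag{3.48b}$$

making $R_{\tilde{n}\tilde{n}}$ an even function when $\tilde{n}$ is wide-sense stationary. Since $R_{\tilde{n}\tilde{n}}$ is a function of only
the single real parameter $\tau$, we can set up the one-dimensional Fourier transform of $R_{\tilde{n}\tilde{n}}$, getting

$$S_{\tilde{n}\tilde{n}}(f) = \int_{-\infty}^{\infty} R_{\tilde{n}\tilde{n}}(\tau)\, e^{-2\pi i f \tau}\, d\tau . \tag{3.48c}$$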

This Fourier transform $S_{\tilde{n}\tilde{n}}(f)$ of $R_{\tilde{n}\tilde{n}}$ almost always exists, and we define it to be the power
spectrum of the random function $\tilde{n}(t)$. Over the next few sections of this chapter, we examine
the properties of $S_{\tilde{n}\tilde{n}}$, showing as we go along why it makes sense to call it the power spectrum.
Functions that have power spectra must be wide-sense stationary because we are assuming
that the autocorrelation $R_{\tilde{n}\tilde{n}}$ is a function with only a single real argument. Given that $S_{\tilde{n}\tilde{n}}$ exists,
we can always reverse the transform in Eq. (3.48c) and write the autocorrelation function of $\tilde{n}$ as
the inverse Fourier transform of the power spectrum,

$$R_{\tilde{n}\tilde{n}}(\tau) = \int_{-\infty}^{\infty} S_{\tilde{n}\tilde{n}}(f)\, e^{2\pi i f \tau}\, df . \tag{3.48d}$$

When two random functions $\tilde{n}(t)$ and $\tilde{n}_o(t)$ are jointly wide-sense stationary, as defined in the
discussion following Eq. (3.30c), we can define their cross-power spectrum to be

$$S_{\tilde{n}\tilde{n}_o}(f) = \int_{-\infty}^{\infty} R_{\tilde{n}\tilde{n}_o}(\tau)\, e^{-2\pi i f \tau}\, d\tau , \tag{3.48e}$$

where

$$R_{\tilde{n}\tilde{n}_o}(t_2 - t_1) = \mathrm{E}\big(\tilde{n}(t_1)\, \tilde{n}_o(t_2)\big)$$

is their cross-correlation function introduced in Eq. (3.30d).
We know that $R_{\tilde{n}\tilde{n}}$ in Eq. (3.48a) is always real because $\mathrm{E}\big(\tilde{n}(t_1)\, \tilde{n}(t_2)\big)$ is always real.
According to Eq. (3.48b), $R_{\tilde{n}\tilde{n}}$ is an even function of its argument. Therefore its Fourier
transform, the power spectrum $S_{\tilde{n}\tilde{n}}$, is the Fourier transform of a real and even function. Because
the Fourier transform of a real and even function is always another real and even function,[49] it
follows that $S_{\tilde{n}\tilde{n}}$ is also real and even:

$$\mathrm{Im}\big(S_{\tilde{n}\tilde{n}}(f)\big) = 0 \tag{3.49a}$$

and

$$S_{\tilde{n}\tilde{n}}(-f) = S_{\tilde{n}\tilde{n}}(f) . \tag{3.49b}$$
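
As a numerical illustration of Eqs. (3.48c), (3.49a), and (3.49b), the sketch below evaluates the Fourier transform of one even autocorrelation function by a Riemann sum. The choice $R(\tau) = e^{-|\tau|}$ is an assumption made only for this example (its transform, $2/[1 + (2\pi f)^2]$, is a standard result), and the function name `power_spectrum` is hypothetical.

```python
import numpy as np

def power_spectrum(R, tau, f):
    """Riemann-sum approximation of S(f) = integral of R(tau) exp(-2 pi i f tau) d tau."""
    dtau = tau[1] - tau[0]
    kernel = np.exp(-2j * np.pi * np.outer(f, tau))
    return kernel @ R * dtau

# Symmetric lag grid; R(tau) = exp(-|tau|) is an assumed example autocorrelation.
tau = np.linspace(-40.0, 40.0, 16001)
R = np.exp(-np.abs(tau))

f = np.linspace(-2.0, 2.0, 81)
S = power_spectrum(R, tau, f)

# Known transform of exp(-|tau|), used only as a reference for this example.
S_exact = 2.0 / (1.0 + (2.0 * np.pi * f) ** 2)

print("largest imaginary part:", np.max(np.abs(S.imag)))
print("largest deviation from the reference:", np.max(np.abs(S.real - S_exact)))
```

The imaginary part vanishes to rounding error and the real part is symmetric about f = 0, as expected for the transform of a real and even function.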

[49] See entry 1 of Table 2.1 in Chapter 2.
We note in passing that the cross-power spectrum $S_{\tilde{n}\tilde{n}_o}$ in (3.48e) is not necessarily a real-valued
function. It is, however, the Fourier transform of a real-valued function $R_{\tilde{n}\tilde{n}_o}$ so it must be
Hermitian,

$$S_{\tilde{n}\tilde{n}_o}(-f) = S_{\tilde{n}\tilde{n}_o}(f)^* . \tag{3.49c}$$

Equation (3.49a) shows that $S_{\tilde{n}\tilde{n}}$ behaves like a power spectrum by being strictly real; Eq.
(3.49b) shows that $S_{\tilde{n}\tilde{n}}$ is double-sided, having the same value at $+f$ and $-f$. The next step is to
show that $S_{\tilde{n}\tilde{n}}$ behaves like a power spectrum by being non-negative for all values of f, but that
has to wait until we examine what happens to $S_{\tilde{n}\tilde{n}}$ when a wide-sense stationary random function
$\tilde{n}(t)$ is put through an arbitrary linear system.

3.21 Random Inputs and Outputs of Linear Systems
Section 2.9 in Chapter 2 describes what a convolution is and the role it plays in Fourier-transform
theory. A linear system can be represented by a convolution, with the u(t) input being convolved
with the linear system's impulse-response function h(t) to get the v(t) output,

$$v(t) = u(t) * h(t) .$$

According to the definition of convolution in Chapter 2 [see Eq. (2.38a)], this can be written as

$$v(t) = \int_{-\infty}^{\infty} h(t')\, u(t - t')\, dt' .$$

When a random function $\tilde{n}(t)$ is the input to a linear system characterized by an impulse-
response function h(t), the output is another random function $\tilde{m}(t)$ given by

$$\tilde{m}(t) = \int_{-\infty}^{\infty} h(t')\, \tilde{n}(t - t')\, dt' . \tag{3.50a}$$

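We define the correlation function between $\tilde{m}(t)$ and $\tilde{n}(t)$ to be[51]

$$R_{\tilde{m}\tilde{n}}(t_1, t_2) = \mathrm{E}\big(\tilde{m}(t_1)\, \tilde{n}(t_2)\big) . \tag{3.50b}$$

Function $R_{\tilde{m}\tilde{n}}(t_1, t_2)$ is called the cross-correlation function of $\tilde{m}$ and $\tilde{n}$. Substitution of (3.50a)
gives

$$R_{\tilde{m}\tilde{n}}(t_1, t_2) = \mathrm{E}\left(\tilde{n}(t_2) \int_{-\infty}^{\infty} h(t')\, \tilde{n}(t_1 - t')\, dt'\right) = \mathrm{E}\left(\int_{-\infty}^{\infty} h(t')\, \tilde{n}(t_2)\, \tilde{n}(t_1 - t')\, dt'\right) .$$

Using Eq. (3.17c) to move the expectation operator inside the integral, and using (3.16a) to put h
outside the expectation operator because it is a nonrandom quantity, we get

$$R_{\tilde{m}\tilde{n}}(t_1, t_2) = \int_{-\infty}^{\infty} h(t')\, \mathrm{E}\big(\tilde{n}(t_2)\, \tilde{n}(t_1 - t')\big)\, dt' .$$

Assuming that $\tilde{n}$ is wide-sense stationary, we use Eq. (3.30b) to write

$$\mathrm{E}\big(\tilde{n}(t_2)\, \tilde{n}(t_1 - t')\big) = R_{\tilde{n}\tilde{n}}(t_1 - t_2 - t')$$

so that

$$R_{\tilde{m}\tilde{n}}(t_1, t_2) = \int_{-\infty}^{\infty} h(t')\, R_{\tilde{n}\tilde{n}}(t_1 - t_2 - t')\, dt' . \tag{3.50c}$$

This shows that $R_{\tilde{m}\tilde{n}}$ depends only on the difference between $t_1$ and $t_2$. Nothing then stops us
from regarding $R_{\tilde{m}\tilde{n}}$ as a function of $\tau = t_2 - t_1$, which gives

$$R_{\tilde{m}\tilde{n}}(\tau) = \int_{-\infty}^{\infty} h(t')\, R_{\tilde{n}\tilde{n}}(-\tau - t')\, dt'$$

or, using Eq. (3.48b),

$$R_{\tilde{m}\tilde{n}}(\tau) = \int_{-\infty}^{\infty} h(t')\, R_{\tilde{n}\tilde{n}}(\tau + t')\, dt' .$$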

[51] This derivation comes from Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, pp. 323-324.
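Changing the variable of integration to $t'' = -t'$ changes this into a convolution,

$$R_{\tilde{m}\tilde{n}}(\tau) = \int_{-\infty}^{\infty} h(-t'')\, R_{\tilde{n}\tilde{n}}(\tau - t'')\, dt'' = h(-\tau) * R_{\tilde{n}\tilde{n}}(\tau) . \tag{3.50d}$$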

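Equation (3.50a) can also be used to evaluate the autocorrelation function of the random
output $\tilde{m}(t)$, giving

$$R_{\tilde{m}\tilde{m}}(t_1, t_2) = \mathrm{E}\big(\tilde{m}(t_1)\, \tilde{m}(t_2)\big) = \mathrm{E}\left(\tilde{m}(t_1) \int_{-\infty}^{\infty} h(t')\, \tilde{n}(t_2 - t')\, dt'\right) = \mathrm{E}\left(\int_{-\infty}^{\infty} h(t')\, \tilde{m}(t_1)\, \tilde{n}(t_2 - t')\, dt'\right) .$$

Again moving the expectation operator inside the integral, we use Eq. (3.50b) to write

$$R_{\tilde{m}\tilde{m}}(t_1, t_2) = \int_{-\infty}^{\infty} h(t')\, \mathrm{E}\big(\tilde{m}(t_1)\, \tilde{n}(t_2 - t')\big)\, dt' = \int_{-\infty}^{\infty} h(t')\, R_{\tilde{m}\tilde{n}}(t_1, t_2 - t')\, dt' .$$

From (3.50c) we know that $R_{\tilde{m}\tilde{n}}$ depends only on the difference between times $t_1$ and $t_2$, which
means we can write

$$R_{\tilde{m}\tilde{n}}(t_1, t_2) = R_{\tilde{m}\tilde{n}}(t_2 - t_1) .$$

Hence, the formula for $R_{\tilde{m}\tilde{m}}(t_1, t_2)$ simplifies to

$$R_{\tilde{m}\tilde{m}}(t_1, t_2) = \int_{-\infty}^{\infty} h(t')\, R_{\tilde{m}\tilde{n}}(t_2 - t_1 - t')\, dt' . \tag{3.51a}$$

This is an important result because it shows that the autocorrelation of the output random
function $\tilde{m}$ depends only on $\tau = t_2 - t_1$. Substituting $\tau$ for $(t_2 - t_1)$ gives

$$R_{\tilde{m}\tilde{m}}(\tau) = \int_{-\infty}^{\infty} h(t')\, R_{\tilde{m}\tilde{n}}(\tau - t')\, dt' = h(\tau) * R_{\tilde{m}\tilde{n}}(\tau) . \tag{3.51b}$$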

Glancing back at Eqs. (3.30a) and (3.30b) above, and having shown that the autocorrelation
function $R_{\tilde{m}\tilde{m}}(t_1, t_2)$ depends only on $(t_2 - t_1)$, we realize that $\tilde{m}$ must be wide-sense stationary if
$\mathrm{E}\big(\tilde{m}(t)\big)$ is time-independent and finite. Taking the expectation value of both sides of (3.50a)
gives

$$\mathrm{E}\big(\tilde{m}(t)\big) = \mathrm{E}\left(\int_{-\infty}^{\infty} h(t')\, \tilde{n}(t - t')\, dt'\right) = \int_{-\infty}^{\infty} h(t')\, \mathrm{E}\big(\tilde{n}(t - t')\big)\, dt' = \mu_{\tilde{n}} \int_{-\infty}^{\infty} h(t')\, dt' , \tag{3.51c}$$

where we have again assumed that $\tilde{n}(t)$ is wide-sense stationary so that, according to Eq. (3.30a),

$$\mathrm{E}\big(\tilde{n}(t)\big) = \mu_{\tilde{n}} = \text{same finite constant for all values of } t .$$

Equation (3.51c) makes $\mathrm{E}\big(\tilde{m}(t)\big)$ a time-independent quantity. The Fourier transform of the
impulse-response function h is called the transfer function,

$$H(f) = \int_{-\infty}^{\infty} h(t)\, e^{-2\pi i f t}\, dt , \tag{3.51d}$$

of the linear system. (The idea of a transfer function is discussed in greater detail below in
Appendix 5A of Chapter 5.) Therefore Eq. (3.51c) can also be written as

$$\mathrm{E}\big(\tilde{m}(t)\big) = \mu_{\tilde{n}}\, H(0) . \tag{3.51e}$$

This shows that when H(0), the zero-frequency value of the transfer function, is finite, so is
$\mathrm{E}\big(\tilde{m}(t)\big)$. We conclude that the output $\tilde{m}(t)$ of the linear system is wide-sense stationary when
the input $\tilde{n}(t)$ is wide-sense stationary and the H(0) value of the transfer function is finite.
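
Equation (3.51e) is easy to check in a discrete-time setting. The sketch below is only an illustration under stated assumptions: the input is Gaussian white noise with an assumed mean of 2.0, the impulse response `h` is an arbitrary short sequence chosen for the example, and the sum of its samples plays the role of H(0).

```python
import numpy as np

rng = np.random.default_rng(3)

mu_n = 2.0                              # assumed stationary mean of the input
h = np.array([0.5, 0.3, 0.2, -0.1])     # assumed impulse response (illustrative)

# Input: stationary noise with mean mu_n; output: discrete convolution with h.
n = mu_n + rng.normal(size=200_000)
m = np.convolve(n, h, mode="valid")     # "valid" avoids edge samples with partial overlap

# Zero-frequency value of the transfer function: H(0) is the sum of h.
H0 = h.sum()

print("sample mean of the output:", m.mean())
print("mu_n * H(0):              ", mu_n * H0)
```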
Because the H(f) transfer function is the Fourier transform of h(t), which is a strictly real
function, we can take the complex conjugate of both sides of Eq. (3.51d) to get

$$H(f)^* = \int_{-\infty}^{\infty} h(t)\, e^{2\pi i f t}\, dt = \int_{-\infty}^{\infty} h(-t')\, e^{-2\pi i f t'}\, dt' . \tag{3.52a}$$

In the last step of (3.52a), we change the variable of integration to $t' = -t$. Equation (3.52a) can
also be written as, dropping the prime,

$$H(f)^* = \int_{-\infty}^{\infty} h(-t)\, e^{-2\pi i f t}\, dt . \tag{3.52b}$$

Clearly, $H(f)^*$, the complex conjugate of the transfer function H(f), is the Fourier transform of
$h(-t)$. Since H is the Fourier transform of a real function h, it must, according to entry 7 of Table
2.1 in Chapter 2, be Hermitian,

$$H(-f) = H(f)^* . \tag{3.52c}$$

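We now define $S_{\tilde{m}\tilde{n}}(f)$ to be the Fourier transform of $R_{\tilde{m}\tilde{n}}(\tau)$, giving

$$S_{\tilde{m}\tilde{n}}(f) = \int_{-\infty}^{\infty} R_{\tilde{m}\tilde{n}}(\tau)\, e^{-2\pi i f \tau}\, d\tau . \tag{3.53a}$$

Function $S_{\tilde{m}\tilde{n}}(f)$ is the cross-power spectrum of $\tilde{m}$ and $\tilde{n}$ [see Eq. (3.48e)]. The transform can,
of course, be reversed to get

$$R_{\tilde{m}\tilde{n}}(\tau) = \int_{-\infty}^{\infty} S_{\tilde{m}\tilde{n}}(f)\, e^{2\pi i f \tau}\, df . \tag{3.53b}$$

Applying the Fourier convolution theorem to Eq. (3.50d) above gives, according to Eq. (2.39a) in
Chapter 2,

$$[\text{Fourier transform of } R_{\tilde{m}\tilde{n}}] = [\text{Fourier transform of } h(-t)] \cdot [\text{Fourier transform of } R_{\tilde{n}\tilde{n}}] .$$

This can be written as, using Eqs. (3.53a), (3.52b), and (3.48c),

$$S_{\tilde{m}\tilde{n}}(f) = H(f)^*\, S_{\tilde{n}\tilde{n}}(f) . \tag{3.53c}$$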

Applying Eq. (2.39a) again, this time to Eq. (3.51b), gives

$$S_{\tilde{m}\tilde{m}}(f) = H(f)\, S_{\tilde{m}\tilde{n}}(f) , \tag{3.53d}$$

where

$$S_{\tilde{m}\tilde{m}}(f) = \int_{-\infty}^{\infty} R_{\tilde{m}\tilde{m}}(\tau)\, e^{-2\pi i f \tau}\, d\tau \tag{3.53e}$$

is the Fourier transform of $R_{\tilde{m}\tilde{m}}$. Following the nomenclature introduced in Eq. (3.48c), this must
be the power spectrum of $\tilde{m}(t)$; and the Fourier transforms of h and $R_{\tilde{m}\tilde{n}}$ come from (3.51d) and
(3.53a) respectively. The Fourier transform in (3.53e) can, of course, be reversed to get

$$R_{\tilde{m}\tilde{m}}(\tau) = \int_{-\infty}^{\infty} S_{\tilde{m}\tilde{m}}(f)\, e^{2\pi i f \tau}\, df . \tag{3.53f}$$

Substitution of (3.53c) into (3.53d) gives the result we have been working toward:

$$S_{\tilde{m}\tilde{m}}(f) = |H(f)|^2\, S_{\tilde{n}\tilde{n}}(f) . \tag{3.53g}$$

This result shows that the power spectrum of the random input function $\tilde{n}(t)$ gives, when
multiplied by the squared modulus of the transfer function, the power spectrum of the random
output function $\tilde{m}(t)$ of the linear system.
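
A discrete-time simulation gives a feel for Eq. (3.53g). The sketch below is an illustration rather than part of the derivation: it assumes unit-variance Gaussian white noise as the input, an arbitrary three-sample impulse response `h`, and averaged periodograms (squared DFT magnitudes divided by the segment length) as stand-ins for the power spectra. The helper name `averaged_periodogram` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

seg_len = 256                      # samples per segment
num_segs = 2000                    # number of segments averaged
h = np.array([0.5, 0.3, 0.2])      # assumed impulse response (illustrative)

# Input: zero-mean white noise. Output: convolution of the input with h.
# mode="same" keeps the length and shifts the output by one sample, which
# does not change squared DFT magnitudes.
n = rng.normal(size=seg_len * num_segs)
m = np.convolve(n, h, mode="same")

def averaged_periodogram(x):
    """Average |DFT|^2 / seg_len over consecutive segments of x."""
    segs = x.reshape(num_segs, seg_len)
    return (np.abs(np.fft.fft(segs, axis=1)) ** 2).mean(axis=0) / seg_len

S_nn = averaged_periodogram(n)
S_mm = averaged_periodogram(m)

# Transfer function sampled at the DFT frequencies.
H = np.fft.fft(h, seg_len)
predicted = np.abs(H) ** 2 * S_nn

rel_err = np.abs(S_mm - predicted) / predicted
print("largest relative deviation from |H|^2 * S_nn:", rel_err.max())
```

The small residual deviation comes from segment-edge effects and finite averaging, not from any failure of Eq. (3.53g).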
3.22 The Sign of the Power Spectrum
Equation (3.53g) can be used to show that the power spectrum $S_{\tilde{n}\tilde{n}}$ of any wide-sense stationary
random function cannot be negative. To show how this is done, we set up a linear system that has
the transfer function

$$H_B(f) = \begin{cases} -i & \text{for } f_1 \le f \le f_2 \\ i & \text{for } -f_2 \le f \le -f_1 \\ 0 & \text{for } |f| < f_1 \\ 0 & \text{for } |f| > f_2 \end{cases} , \tag{3.54a}$$

where $f_1$ and $f_2$ are both non-negative frequencies. Function $H_B(f)$ is $(-i)$ when f lies
between $f_1$ and $f_2$ and $i$ when f lies between $(-f_1)$ and $(-f_2)$; otherwise it is zero. The transfer
function $H_B$ satisfies

$$H_B(-f) = H_B(f)^* , \tag{3.54b}$$

which [see Eq. (3.52c)] makes it an acceptable transfer function because it is Hermitian. By
reversing the Fourier transform in (3.51d), we find that the impulse-response function for this
linear system must be the inverse Fourier transform of the transfer function,

$$h_B(t) = \int_{-\infty}^{\infty} H_B(f)\, e^{2\pi i f t}\, df .$$

According to entry 7 in Table 2.1 of Chapter 2, since $H_B(f)$ is Hermitian, its inverse Fourier
transform $h_B(t)$ must be real. We can take any random function $\tilde{n}(t)$ that is wide-sense stationary
and run it through the $H_B$ linear system. Looking at the resulting output $\tilde{m}(t)$, we know from the
discussion following Eq. (3.51e) that $\tilde{m}(t)$ must also be wide-sense stationary because $H_B(0)$ is
finite. This means that $\tilde{m}$ has a well-defined autocorrelation function

$$R_{\tilde{m}\tilde{m}}(t_2 - t_1) = \mathrm{E}\big(\tilde{m}(t_1)\, \tilde{m}(t_2)\big)$$

and a well-defined power spectrum $S_{\tilde{m}\tilde{m}}(f)$. Setting $t_1 = t_2$ in the autocorrelation function gives,
since $\tilde{m}$ is real,

$$R_{\tilde{m}\tilde{m}}(0) = \mathrm{E}\big(\tilde{m}(t_1)^2\big) \ge 0 . \tag{3.54c}$$

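From Eq. (3.53f) we know

$$R_{\tilde{m}\tilde{m}}(0) = \int_{-\infty}^{\infty} S_{\tilde{m}\tilde{m}}(f)\, df . \tag{3.54d}$$

Combining Eqs. (3.53g) and (3.54a) gives

$$S_{\tilde{m}\tilde{m}}(f) = |H_B(f)|^2\, S_{\tilde{n}\tilde{n}}(f) .$$

This can be substituted into (3.54d) to get, noting the definition of $H_B$ in (3.54a), that

$$R_{\tilde{m}\tilde{m}}(0) = \int_{-f_2}^{-f_1} S_{\tilde{n}\tilde{n}}(f)\, df + \int_{f_1}^{f_2} S_{\tilde{n}\tilde{n}}(f)\, df .$$

Equation (3.49b) reminds us that $S_{\tilde{n}\tilde{n}}$ is an even function of f, which means that this formula for
$R_{\tilde{m}\tilde{m}}(0)$ can be written as

$$R_{\tilde{m}\tilde{m}}(0) = 2 \int_{f_1}^{f_2} S_{\tilde{n}\tilde{n}}(f)\, df . \tag{3.54e}$$

Substitution of (3.54e) into inequality (3.54c) gives

$$\int_{f_1}^{f_2} S_{\tilde{n}\tilde{n}}(f)\, df \ge 0 . \tag{3.54f}$$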

No assumptions have been made about the values of $f_1$ and $f_2$ other than

$$0 \le f_1 \le f_2 .$$

Therefore, because inequality (3.54f) must hold true for all allowed values of $f_1$ and $f_2$ no
matter where they are on the positive f axis or how close together they are, we conclude that
$S_{\tilde{n}\tilde{n}}(f) \ge 0$ for all $f \ge 0$. Because $S_{\tilde{n}\tilde{n}}(-f) = S_{\tilde{n}\tilde{n}}(f)$ in Eq. (3.49b), it then follows that

$$S_{\tilde{n}\tilde{n}}(f) \ge 0 \tag{3.54g}$$

for all positive and negative values of f.
We have already demonstrated that $S_{\tilde{n}\tilde{n}}$ is real and even, and now we know that it must also
be a non-negative function of frequency f. These are all attributes that a double-sided power
spectrum ought to have. The final step in justifying the label power spectrum for $S_{\tilde{n}\tilde{n}}$ is to show
that it satisfies a power-spectrum type of formula with regard to the random function $\tilde{n}(t)$.

3.23 The Power Spectrum and Fourier Transforms of Random
Functions
The power spectrum $P_{zz}(f)$ of a nonrandom function z(t) can be written as[52]

$$P_{zz}(f) = \lim_{T \to \infty} \frac{|Z_T(f)|^2}{2T} . \tag{3.55a}$$

Here, $Z_T(f)$ is the Fourier transform between times $t = -T$ and $t = T$ of a real signal z(t):

$$Z_T(f) = \int_{-T}^{T} z(t)\, e^{-2\pi i f t}\, dt . \tag{3.55b}$$

[52] B. P. Lathi, An Introduction to Random Signals and Communication Theory (International Textbook Company,
Scranton, PA, 1968), p. 59.
We now justify the label power spectrum for the function $S_{\tilde{n}\tilde{n}}(f)$ defined in Eq. (3.48c) by
deriving a formula for $S_{\tilde{n}\tilde{n}}$ in terms of the random function $\tilde{n}(t)$ that closely resembles formula
(3.55a) for the power spectrum $P_{zz}(f)$ of the nonrandom function z(t).
We define $\tilde{N}_T(f)$ to be the Fourier transform of the random function $\tilde{n}(t)$ between times
$t = -T$ and $t = T$:

$$\tilde{N}_T(f) = \int_{-T}^{T} \tilde{n}(t)\, e^{-2\pi i f t}\, dt . \tag{3.56a}$$

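In effect, $\tilde{N}$ is a random function of the two nonrandom variables f and T, and it could be written
as $\tilde{N}(f, T)$ to emphasize this fact. When $\tilde{n}(t)$ is a random function that is wide-sense stationary,
we have, since $\tilde{n}$ is real,

$$\mathrm{E}\big(|\tilde{N}_T(f)|^2\big) = \mathrm{E}\big(\tilde{N}_T(f)^*\, \tilde{N}_T(f)\big) = \mathrm{E}\left(\int_{-T}^{T} \tilde{n}(t_1)\, e^{2\pi i f t_1}\, dt_1 \int_{-T}^{T} \tilde{n}(t_2)\, e^{-2\pi i f t_2}\, dt_2\right) = \mathrm{E}\left(\int_{-T}^{T} dt_1 \int_{-T}^{T} dt_2\, \tilde{n}(t_1)\, \tilde{n}(t_2)\, e^{-2\pi i (t_2 - t_1) f}\right) . \tag{3.56b}$$

Applying Eqs. (3.17c) and (3.16a), the expectation operator E is taken inside the double integral
to get

$$\mathrm{E}\big(|\tilde{N}_T(f)|^2\big) = \int_{-T}^{T} dt_1 \int_{-T}^{T} dt_2\, \mathrm{E}\big(\tilde{n}(t_1)\, \tilde{n}(t_2)\big)\, e^{-2\pi i (t_2 - t_1) f} = \int_{-T}^{T} dt_1 \int_{-T}^{T} dt_2\, R_{\tilde{n}\tilde{n}}(t_2 - t_1)\, e^{-2\pi i (t_2 - t_1) f} . \tag{3.56c}$$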

In the last step, Eq. (3.30b) is used to replace

$$\mathrm{E}\big(\tilde{n}(t_1)\, \tilde{n}(t_2)\big)$$

for the wide-sense stationary $\tilde{n}$ by the autocorrelation function $R_{\tilde{n}\tilde{n}}(t_2 - t_1)$.
The rightmost expression in Eq. (3.56c) is a double integral of a function

$$\Phi(t_2 - t_1) = R_{\tilde{n}\tilde{n}}(t_2 - t_1)\, e^{-2\pi i (t_2 - t_1) f}$$

over the square region of the $t_1$, $t_2$ plane specified by

$$-T \le t_1 \le T$$

and

$$-T \le t_2 \le T .$$

Figure 3.2 shows that the value of $\Phi$ must be constant along any line given by

$$t_2 - t_1 = \tau = \text{constant}$$

in the $t_1$, $t_2$ plane. To lowest order in $d\tau$ in Fig. 3.2, the shaded area is, when $t_2 \ge t_1$ so that
$\tau \ge 0$,

$$(2T - \tau)\sqrt{2} \cdot \frac{d\tau}{\sqrt{2}} = (2T - \tau)\, d\tau .$$

When $t_2 < t_1$, as shown in Fig. 3.3, the value of $\tau$ is negative, so the formula for the shaded area
in Fig. 3.3 is

$$(2T - |\tau|)\sqrt{2} \cdot \frac{d\tau}{\sqrt{2}} = (2T - |\tau|)\, d\tau .$$

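Consequently, the rightmost double integral in Eq. (3.56c) can be written as

$$\int_{-T}^{T} dt_1 \int_{-T}^{T} dt_2\, R_{\tilde{n}\tilde{n}}(t_2 - t_1)\, e^{-2\pi i (t_2 - t_1) f} = \int_{0}^{2T} R_{\tilde{n}\tilde{n}}(\tau)\, e^{-2\pi i f \tau}\, (2T - \tau)\, d\tau + \int_{-2T}^{0} R_{\tilde{n}\tilde{n}}(\tau)\, e^{-2\pi i f \tau}\, (2T - |\tau|)\, d\tau = \int_{-2T}^{2T} R_{\tilde{n}\tilde{n}}(\tau)\, e^{-2\pi i f \tau}\, (2T - |\tau|)\, d\tau .$$

Taking the factor of 2T outside the integral and substituting the result back into Eq. (3.56c) gives

$$\mathrm{E}\big(|\tilde{N}_T(f)|^2\big) = 2T \int_{-2T}^{2T} \left(1 - \frac{|\tau|}{2T}\right) R_{\tilde{n}\tilde{n}}(\tau)\, e^{-2\pi i f \tau}\, d\tau . \tag{3.57a}$$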


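or

$$\frac{1}{2T}\, \mathrm{E}\big(|\tilde{N}_T(f)|^2\big) = \int_{-\infty}^{\infty} \Lambda(\tau, 2T)\, R_{\tilde{n}\tilde{n}}(\tau)\, e^{-2\pi i f \tau}\, d\tau , \tag{3.57b}$$

where

$$\Lambda(t_a, t_b) = \begin{cases} 1 - \dfrac{|t_a|}{t_b} & \text{for } |t_a| \le t_b \\ 0 & \text{for } |t_a| > t_b \end{cases} . \tag{3.57c}$$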
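
The reduction of the double integral to a single integral with a triangular weight has an exact discrete counterpart that is easy to verify numerically. In the sketch below, sums over L samples replace the integrals over $-T \le t \le T$, the weight $(L - |k|)$ replaces $(2T - |\tau|)$, and the autocorrelation sequence $0.8^{|k|}$ is an assumption made only for this example; the names `R_seq`, `double_sum`, and `single_sum` are hypothetical.

```python
import numpy as np

L = 64          # number of time samples (discrete stand-in for the interval length 2T)
f = 0.137       # arbitrary frequency in cycles per sample

def R_seq(k):
    """Assumed even autocorrelation sequence (illustrative)."""
    return 0.8 ** np.abs(k)

# Direct double sum over the L-by-L square of (t1, t2) index pairs.
t = np.arange(L)
lag = t[None, :] - t[:, None]            # lag[t1, t2] = t2 - t1
double_sum = np.sum(R_seq(lag) * np.exp(-2j * np.pi * f * lag))

# Single sum over lags, weighted by the number of index pairs sharing each lag.
k = np.arange(-(L - 1), L)
single_sum = np.sum((L - np.abs(k)) * R_seq(k) * np.exp(-2j * np.pi * f * k))

print("double sum:", double_sum)
print("single sum:", single_sum)
```

The two sums agree to rounding error, and the result is real and non-negative, as an expected squared magnitude must be.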

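Function $\Lambda$ is graphed in Fig. 3.4. The Fourier transform of $\Lambda(t, 2T)$ is

$$\int_{-\infty}^{\infty} \Lambda(t, 2T)\, e^{-2\pi i f t}\, dt = 2T \left(\frac{\sin(2\pi f T)}{2\pi f T}\right)^2 . \tag{3.57d}$$

The right-hand side of Eq. (3.57b) is the Fourier transform of the product of functions $\Lambda$ and $R_{\tilde{n}\tilde{n}}$.
According to the Fourier convolution theorem [see Eq. (2.39k) in Chapter 2], this must equal the
convolution of the Fourier transforms of $\Lambda$ and $R_{\tilde{n}\tilde{n}}$. Therefore, Eq. (3.57b) can be written as,
according to (3.57d) and (3.48c),

$$\frac{\mathrm{E}\big(|\tilde{N}_T(f)|^2\big)}{2T} = \left[2T \left(\frac{\sin(2\pi f T)}{2\pi f T}\right)^2\right] * S_{\tilde{n}\tilde{n}}(f) . \tag{3.57e}$$

In the limit as $T \to \infty$, it can be shown that[53]

$$2T \left(\frac{\sin(2\pi f T)}{2\pi f T}\right)^2 \to \delta(f) . \tag{3.57f}$$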
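
The limit in (3.57f) can be made plausible numerically: for every T the kernel has unit area, and as T grows that area concentrates near f = 0. The sketch below is only an illustration; the integration range, the grid spacing, and the window $|f| \le 0.25$ are arbitrary choices, and `kernel` is a hypothetical helper name.

```python
import numpy as np

def kernel(f, T):
    """2T * (sin(2 pi f T) / (2 pi f T))**2, using np.sinc(x) = sin(pi x) / (pi x)."""
    return 2.0 * T * np.sinc(2.0 * f * T) ** 2

f = np.linspace(-50.0, 50.0, 400_001)
df = f[1] - f[0]

totals = []
centrals = []
for T in (1.0, 4.0, 16.0):
    K = kernel(f, T)
    totals.append(K.sum() * df)                          # area over the whole grid
    centrals.append(K[np.abs(f) <= 0.25].sum() * df)     # area near f = 0
    print(f"T = {T:5.1f}  total area = {totals[-1]:.4f}  "
          f"area in |f| <= 0.25 = {centrals[-1]:.4f}")
```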

[53] John B. Thomas, An Introduction to Applied Probability and Random Processes (John Wiley & Sons, Inc., New
York, 1971), p. 231. Formula (3.57f) is also a slightly disguised version of Eq. (2.67b) in Chapter 2.
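FIGURE 3.2. [Figure not reproduced: the square region $-T \le t_1 \le T$, $-T \le t_2 \le T$ of the $t_1$, $t_2$ plane, with a shaded strip of width $d\tau$ along the line $t_2 - t_1 = \tau$ for $\tau \ge 0$.]

FIGURE 3.3. [Figure not reproduced: the same square region of the $t_1$, $t_2$ plane, with the shaded strip lying along a line $t_2 - t_1 = \tau$ for $\tau < 0$.]

FIGURE 3.4. [Figure not reproduced: graph of the triangle function of Eq. (3.57c) versus $t_a$, peaking at 1.0 for $t_a = 0$ and falling to zero at $t_a = \pm t_b$.]
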
Consequently, we can take the limit of both sides of (3.57e) as $T \to \infty$ to get [using Eq. (2.55a)
in Chapter 2]

$$\lim_{T \to \infty} \frac{\mathrm{E}\big(|\tilde{N}_T(f)|^2\big)}{2T} = \delta(f) * S_{\tilde{n}\tilde{n}}(f) = \int_{-\infty}^{\infty} \delta(f - f')\, S_{\tilde{n}\tilde{n}}(f')\, df'$$

or

$$S_{\tilde{n}\tilde{n}}(f) = \lim_{T \to \infty} \frac{\mathrm{E}\big(|\tilde{N}_T(f)|^2\big)}{2T} . \tag{3.57g}$$

Comparing this result to the similar formula in Eq. (3.55a) for the power spectrum of a
nonrandom function z, we see that the formulas are similar enough to justify the definition of $S_{\tilde{n}\tilde{n}}$
as the power spectrum of the random function $\tilde{n}$.
The $S_{\tilde{n}\tilde{n}}(f)$ power spectrum specified in Eq. (3.48c) and used later in (3.57g), (3.49b),
(3.54g), and so on, is often called the double-sided power spectrum because it is defined for both
positive and negative values of its argument f. It is typically found as a weighting function in
integrals of the form

$$\int_{-\infty}^{\infty} S_{\tilde{n}\tilde{n}}(f)\, \phi_e(f)\, df ,$$

where $\phi_e(f)$, like $S_{\tilde{n}\tilde{n}}(f)$, is an even function of f. Because the $S_{\tilde{n}\tilde{n}}(f)\, \phi_e(f)$ product must also
be even, this integral can also be written as [see Eq. (2.19) in Chapter 2]

$$\int_{-\infty}^{\infty} S_{\tilde{n}\tilde{n}}(f)\, \phi_e(f)\, df = 2 \int_{0}^{\infty} S_{\tilde{n}\tilde{n}}(f)\, \phi_e(f)\, df . \tag{3.58a}$$

Many analysts define a single-sided power spectrum $S^{(1)}_{\tilde{n}\tilde{n}}$ to be

$$S^{(1)}_{\tilde{n}\tilde{n}}(f) = 2\, S_{\tilde{n}\tilde{n}}(f) \quad \text{for } f \ge 0 \tag{3.58b}$$

and use it to write equations like (3.58a) as

$$\int_{-\infty}^{\infty} S_{\tilde{n}\tilde{n}}(f)\, \phi_e(f)\, df = \int_{0}^{\infty} S^{(1)}_{\tilde{n}\tilde{n}}(f)\, \phi_e(f)\, df . \tag{3.58c}$$

The motivation for this procedure is often the feeling that only positive frequencies f are
meaningful, so we ought to restrict ourselves to using power spectra with positive arguments.[54]
Many times articles and textbooks refer to the power spectrum without making it clear whether
they are referring to the double-sided or single-sided power spectrum. Casual references to power
spectra should be treated with caution until it becomes clear which type of power spectrum the
author has in mind.
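
The factor of 2 that separates the two conventions is exactly what goes wrong when they are confused, as a short numerical sketch shows. The double-sided spectrum and the even weighting function used here are assumptions chosen only for illustration, and the helper names (`S_double`, `S_single`, `phi_e`, `trapezoid`) are hypothetical.

```python
import numpy as np

def trapezoid(y, x):
    """Trapezoid-rule integral of samples y over grid x."""
    return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))

def S_double(f):
    """Assumed double-sided power spectrum (illustrative): even in f."""
    return 2.0 / (1.0 + (2.0 * np.pi * f) ** 2)

def S_single(f):
    """Single-sided spectrum in the sense of Eq. (3.58b), meant for f >= 0 only."""
    return 2.0 * S_double(f)

def phi_e(f):
    """Assumed even weighting function (illustrative)."""
    return np.exp(-f ** 2)

f_all = np.linspace(-10.0, 10.0, 400_001)
f_pos = np.linspace(0.0, 10.0, 200_001)

two_sided = trapezoid(S_double(f_all) * phi_e(f_all), f_all)   # left side of (3.58c)
one_sided = trapezoid(S_single(f_pos) * phi_e(f_pos), f_pos)   # right side of (3.58c)
mixed_up = trapezoid(S_double(f_pos) * phi_e(f_pos), f_pos)    # wrong spectrum on f >= 0

print("double-sided integral:", two_sided)
print("single-sided integral:", one_sided)
print("double-sided spectrum integrated over f >= 0 only:", mixed_up)
```

Integrating the double-sided spectrum over positive frequencies alone recovers only half of the intended value.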

[54] There is, of course, no more problem in using negative f values when f represents a frequency than there is in
using negative x values when x represents a length along the axis of a coordinate system. Lengths can never be
negative, so when we allow x to be negative we are implicitly talking about a length coordinate rather than a length.
Similarly, when we allow f to be negative we are implicitly talking about a frequency coordinate rather than a
frequency.
3.24 The Multidimensional Wiener-Khinchin Theorem
Equation (3.57g) derived in Sec. 3.23 is often referred to as the Wiener-Khinchin theorem. This
theorem can easily be extended to multiple dimensions.
A random function with more than one nonrandom argument is often called a random scalar
field. We can write a random scalar field as $\tilde{n}(t_1, t_2, \ldots, t_K)$ when it is a function of K
nonrandom arguments $t_1, t_2, \ldots, t_K$. The property for a random field that is analogous to
stationarity for a one-dimensional random function is called homogeneity. A random function
$\tilde{n}(t_1, t_2, \ldots, t_K)$ is called a (wide-sense) homogeneous random field when there is a correlation
function $R_{\tilde{n}\tilde{n}}$ such that

$$R_{\tilde{n}\tilde{n}}(t_1' - t_1, t_2' - t_2, \ldots, t_K' - t_K) = \mathrm{E}\big(\tilde{n}(t_1, t_2, \ldots, t_K)\, \tilde{n}(t_1', t_2', \ldots, t_K')\big) . \tag{3.59a}$$

The multidimensional Fourier transform of $R_{\tilde{n}\tilde{n}}$ is the multidimensional power spectrum of the
random field,

$$S_{\tilde{n}\tilde{n}}(f_1, f_2, \ldots, f_K) = \int_{-\infty}^{\infty} d\tau_1 \int_{-\infty}^{\infty} d\tau_2 \cdots \int_{-\infty}^{\infty} d\tau_K\, R_{\tilde{n}\tilde{n}}(\tau_1, \tau_2, \ldots, \tau_K)\, e^{-2\pi i (f_1 \tau_1 + f_2 \tau_2 + \cdots + f_K \tau_K)} . \tag{3.59b}$$

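This transform can, of course, be reversed to get

$$R_{\tilde{n}\tilde{n}}(\tau_1, \tau_2, \ldots, \tau_K) = \int_{-\infty}^{\infty} df_1 \int_{-\infty}^{\infty} df_2 \cdots \int_{-\infty}^{\infty} df_K\, S_{\tilde{n}\tilde{n}}(f_1, f_2, \ldots, f_K)\, e^{2\pi i (f_1 \tau_1 + f_2 \tau_2 + \cdots + f_K \tau_K)} . \tag{3.59c}$$

The multidimensional Wiener-Khinchin theorem states that

$$S_{\tilde{n}\tilde{n}}(f_1, f_2, \ldots, f_K) = \lim_{\substack{T_1 \to \infty \\ T_2 \to \infty \\ \vdots \\ T_K \to \infty}} \frac{1}{(2T_1)(2T_2) \cdots (2T_K)}\, \mathrm{E}\big(|\tilde{N}_{T_1 T_2 \cdots T_K}(f_1, f_2, \ldots, f_K)|^2\big) , \tag{3.59d}$$

where

$$\tilde{N}_{T_1 T_2 \cdots T_K}(f_1, f_2, \ldots, f_K) = \int_{-T_1}^{T_1} dt_1 \int_{-T_2}^{T_2} dt_2 \cdots \int_{-T_K}^{T_K} dt_K\, \tilde{n}(t_1, t_2, \ldots, t_K)\, e^{-2\pi i (f_1 t_1 + f_2 t_2 + \cdots + f_K t_K)} . \tag{3.59e}$$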

The next chapter uses the three-dimensional Wiener-Khinchin theorem with one time
coordinate t and two space coordinates x and y. Using the vector notation introduced in Chapter 2
(see Sec. 2.25), we write the random field as

$$\tilde{n}(x, y, t) = \tilde{n}(\vec{\rho}, t) , \tag{3.60a}$$

with

$$\vec{\rho} = x\hat{x} + y\hat{y} \tag{3.60b}$$

being the position vector defined in terms of the $\hat{x}$ and $\hat{y}$ unit vectors corresponding to the x and
y coordinates. We also define a vector $\vec{u}$ with $u_x$ and $u_y$ components such that

$$\vec{u} = \hat{x} u_x + \hat{y} u_y . \tag{3.60c}$$

Here, $u_x$ and $u_y$ are the spatial frequencies corresponding to the x and y coordinates respectively.
The frequency corresponding to time t is called w. The truncated time and space Fourier
transform of $\tilde{n}(\vec{\rho}, t)$ can now be written as

$$\tilde{N}_{T,A}(u_x, u_y, w) = \int_{-T}^{T} dt \iint_{\text{area } A} dx\, dy\, \tilde{n}(x, y, t)\, e^{-2\pi i (x u_x + y u_y + w t)}$$

or

$$\tilde{N}_{T,A}(\vec{u}, w) = \int_{-T}^{T} dt \iint_{\text{area } A} d^2\rho\, \tilde{n}(\vec{\rho}, t)\, e^{-2\pi i (\vec{\rho} \cdot \vec{u} + w t)} . \tag{3.60d}$$

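Random field $\tilde{n}(\vec{\rho}, t)$ has an autocorrelation function

$$R_{\tilde{n}\tilde{n}}(x' - x, y' - y, t' - t) = \mathrm{E}\big(\tilde{n}(x, y, t)\, \tilde{n}(x', y', t')\big) , \tag{3.61a}$$

which can be written as

$$R_{\tilde{n}\tilde{n}}(\vec{\rho}\,' - \vec{\rho}, t' - t) = \mathrm{E}\big(\tilde{n}(\vec{\rho}, t)\, \tilde{n}(\vec{\rho}\,', t')\big) . \tag{3.61b}$$

Because $R_{\tilde{n}\tilde{n}}$ depends only on the difference between the unprimed and primed coordinates, we
say that field $\tilde{n}$ is (wide-sense) stationary and homogeneous. The corresponding power spectrum
is

$$S_{\tilde{n}\tilde{n}}(\vec{u}, w) = \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\, R_{\tilde{n}\tilde{n}}(\vec{\rho}, t)\, e^{-2\pi i (\vec{\rho} \cdot \vec{u} + w t)} . \tag{3.61c}$$

The transform can be reversed to get

$$R_{\tilde{n}\tilde{n}}(\vec{\rho}, t) = \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u\, S_{\tilde{n}\tilde{n}}(\vec{u}, w)\, e^{2\pi i (\vec{\rho} \cdot \vec{u} + w t)} . \tag{3.61d}$$

Glancing back at the notation for the truncated Fourier transform of $\tilde{n}$ in Eq. (3.60d), we see that
the three-dimensional Wiener-Khinchin theorem for this case can be stated as

$$S_{\tilde{n}\tilde{n}}(\vec{u}, w) = \lim_{\substack{T \to \infty \\ A \to \infty}} \frac{1}{2TA}\, \mathrm{E}\big(|\tilde{N}_{T,A}(\vec{u}, w)|^2\big) . \tag{3.61e}$$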

3.25 Band-Limited White Noise
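A random function $\tilde{n}(t)$ is band-limited white noise when it is wide-sense stationary and has a
power spectrum

$$S_{\tilde{n}\tilde{n}}(f) = W_{\tilde{n}\tilde{n}}(f) = \begin{cases} W_0 & \text{for } |f| \le F \\ 0 & \text{for } |f| > F \end{cases} \tag{3.62a}$$

with

$$\mathrm{E}\big(\tilde{n}(t)\big) = 0 . \tag{3.62b}$$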

FIGURE 3.5. [Figure not reproduced: graph of $W_{\tilde{n}\tilde{n}}(f)$ versus f, equal to the constant level $W_0$ between $f = -F$ and $f = F$ and zero elsewhere.]

The bandwidth of this white noise is said to be F (see Fig. 3.5). Equation (3.48d) shows that the
autocorrelation function of this band-limited white noise must be

$$R_{\tilde{n}\tilde{n}}(\tau) = W_0 \int_{-F}^{F} e^{2\pi i f \tau}\, df = W_0\, \frac{\sin(2\pi F \tau)}{\pi \tau} . \tag{3.62c}$$

Glancing back at Eq. (3.48a), we see that

$$\mathrm{E}\big(\tilde{n}(t)^2\big) = \mathrm{E}\big(\tilde{n}(t)\, \tilde{n}(t)\big) = R_{\tilde{n}\tilde{n}}(0) ,$$

so that, according to Eq. (3.62c),

$$\mathrm{E}\big(\tilde{n}(t)^2\big) = W_0 \int_{-F}^{F} df = 2 F W_0 . \tag{3.62d}$$

According to (3.62b) $\tilde{n}$ is a zero-mean random function, so Eq. (3.62d) shows that product $2FW_0$
must be the variance of $\tilde{n}(t)$ when $\tilde{n}$ is band-limited white noise.
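
Equations (3.62c) and (3.62d) can be checked by direct numerical integration. In the sketch below the values chosen for $W_0$ and F are arbitrary assumptions, and the helper names (`trapezoid`, `R_numeric`, `R_closed`) are hypothetical.

```python
import numpy as np

W0 = 0.5     # assumed spectral level (illustrative)
F = 3.0      # assumed bandwidth (illustrative)

f = np.linspace(-F, F, 60_001)

def trapezoid(y, x):
    """Trapezoid-rule integral of samples y over grid x."""
    return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))

def R_numeric(tau):
    """W0 times the integral of exp(2 pi i f tau) over -F <= f <= F."""
    return W0 * trapezoid(np.exp(2j * np.pi * f * tau), f)

def R_closed(tau):
    """W0 sin(2 pi F tau) / (pi tau), written with np.sinc so that tau = 0 is handled."""
    return 2.0 * F * W0 * np.sinc(2.0 * F * tau)

for tau in (0.0, 0.1, 0.37, 1.0):
    print(f"tau = {tau:4.2f}  numeric = {R_numeric(tau).real:+.6f}  closed form = {R_closed(tau):+.6f}")
```

At $\tau = 0$ both expressions return $2FW_0$, the variance given by Eq. (3.62d).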
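Sometimes we take the limit as $F \to \infty$ in Eqs. (3.62a)-(3.62d) to get white noise that has no
band limits. Now the power spectrum of $\tilde{n}(t)$ is

$$W_{\tilde{n}\tilde{n}}(f) = W_0 \tag{3.63a}$$

for all values of f. According to formula (3.62c) and Eq. (2.71f) in Chapter 2, this makes the
autocorrelation function $R_{\tilde{n}\tilde{n}}$ proportional to a delta function,

$$R_{\tilde{n}\tilde{n}}(\tau) = W_0 \int_{-\infty}^{\infty} e^{2\pi i f \tau}\, df = W_0\, \delta(\tau) , \tag{3.63b}$$

with of course

$$\lim_{F \to \infty} \big[\mathrm{E}\big(\tilde{n}(t)^2\big)\big] = \infty \tag{3.63c}$$

and

$$\mathrm{E}\big(\tilde{n}(t)\big) = 0 . \tag{3.63d}$$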

Just like the concepts of stationarity and ergodicity, the concept of white noise (even of
band-limited white noise) is an idealization that is often useful for approximating random processes
seen in nature. When a poor-quality recording is played on an audio system, the noise
contaminating it is often white in nature, showing up as unwanted hissing, crackling, and an
overall shushing sound. This white noise is band limited, with the band specified by the finite
range of frequencies produced by the audio system and heard by the audience. Setting a TV set to
a channel or station that does not exist, or that cannot be picked up, often produces hissing in the
speakers and a rapidly changing speckle (sometimes called snow) on the screen; both the snow
and the hissing come from quasi white-noise processes that the TV is treating like a nonrandom
signal.
3.26 Even and Odd Components of Random Functions
A useful approach often applied to random functions $\tilde{N}(t)$ that are wide-sense stationary is to
divide them up into even and odd components, as shown in Eqs. (2.11a)-(2.11e) in Chapter 2.
Instead of using e and o subscripts as is done in Chapter 2, this time the even component has a (+)
superscript and the odd component has a (−) superscript:

$$\tilde{N}(t) = \tilde{N}^{(+)}(t) + \tilde{N}^{(-)}(t) , \tag{3.64a}$$

where

$$\tilde{N}^{(+)}(t) = \frac{1}{2}\big(\tilde{N}(t) + \tilde{N}(-t)\big) \tag{3.64b}$$

and

$$\tilde{N}^{(-)}(t) = \frac{1}{2}\big(\tilde{N}(t) - \tilde{N}(-t)\big) . \tag{3.64c}$$

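We now apply to $\tilde{N}(t)$ the time-limited Fourier transform shown in Eq. (3.56a),

$$\tilde{\mathcal{N}}_T(f) = \int_{-T}^{T} \tilde{N}(t)\, e^{-2\pi i f t}\, dt = \int_{-\infty}^{\infty} \Pi(t, T)\, \tilde{N}(t)\, e^{-2\pi i f t}\, dt . \tag{3.65a}$$

Here, the $\Pi(t, T)$ function [defined in Eq. (2.56c) of Chapter 2] is used to convert the integral
between $+T$ and $-T$ into a true Fourier transform. Substituting (3.64a) into (3.65a) gives

$$\tilde{\mathcal{N}}_T(f) = \int_{-\infty}^{\infty} \Pi(t, T)\, \tilde{N}^{(+)}(t)\, e^{-2\pi i f t}\, dt + \int_{-\infty}^{\infty} \Pi(t, T)\, \tilde{N}^{(-)}(t)\, e^{-2\pi i f t}\, dt ,$$

or

$$\tilde{\mathcal{N}}_T(f) = \tilde{\mathcal{N}}_T^{(+)}(f) + \tilde{\mathcal{N}}_T^{(-)}(f) , \tag{3.65b}$$

where

$$\tilde{\mathcal{N}}_T^{(+)}(f) = \int_{-\infty}^{\infty} \Pi(t, T)\, \tilde{N}^{(+)}(t)\, e^{-2\pi i f t}\, dt \tag{3.65c}$$

and

$$\tilde{\mathcal{N}}_T^{(-)}(f) = \int_{-\infty}^{\infty} \Pi(t, T)\, \tilde{N}^{(-)}(t)\, e^{-2\pi i f t}\, dt . \tag{3.65d}$$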

According to entries 1 and 4 of Table 2.1 in Chapter 2, random function $\tilde{\mathcal{N}}_T^{(+)}$ must be a real and
even function of f because it is the forward Fourier transform of a real and even function of t;
and random function $\tilde{\mathcal{N}}_T^{(-)}$ must be an imaginary and odd function of f because it is the forward
Fourier transform of a real and odd function of t. This means that every function in the ensemble
of functions associated with random function $\tilde{\mathcal{N}}_T^{(+)}$ is real and even, and every function in the
ensemble of functions associated with random function $\tilde{\mathcal{N}}_T^{(-)}$ is imaginary and odd. It also reveals
that in Eq. (3.65b) function $\tilde{\mathcal{N}}_T^{(+)}$ is the real part of $\tilde{\mathcal{N}}_T(f)$ and $\tilde{\mathcal{N}}_T^{(-)} / i$ is the imaginary part of
$\tilde{\mathcal{N}}_T(f)$. This can be written mathematically as

$$\tilde{\mathcal{N}}_T^{(+)}(f) = \mathrm{Re}\big(\tilde{\mathcal{N}}_T(f)\big) \tag{3.65e}$$

and

$$\tilde{\mathcal{N}}_T^{(-)}(f) = i\, \mathrm{Im}\big(\tilde{\mathcal{N}}_T(f)\big) . \tag{3.65f}$$
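There is a simple connection between the expectation values of the squared magnitudes of
$\tilde{\mathcal{N}}_T^{(-)}$ and $\tilde{\mathcal{N}}_T$, that is between

$$\mathrm{E}\big(|\tilde{\mathcal{N}}_T^{(-)}(f)|^2\big) \quad \text{and} \quad \mathrm{E}\big(|\tilde{\mathcal{N}}_T(f)|^2\big) ,$$

which is worth taking the time to analyze in detail.
We start by applying formulas (3.65c) and (3.65d) to $\mathrm{E}\big(|\tilde{\mathcal{N}}_T^{(-)}(f)|^2\big)$ to get

$$\mathrm{E}\big(|\tilde{\mathcal{N}}_T^{(-)}(f)|^2\big) = \mathrm{E}\big(\tilde{\mathcal{N}}_T^{(-)}(f)^*\, \tilde{\mathcal{N}}_T^{(-)}(f)\big) = \mathrm{E}\left(\left[\int_{-\infty}^{\infty} \Pi(t, T)\, \tilde{N}^{(-)}(t)\, e^{-2\pi i f t}\, dt\right]^* \int_{-\infty}^{\infty} \Pi(t', T)\, \tilde{N}^{(-)}(t')\, e^{-2\pi i f t'}\, dt'\right) .$$

Everything inside the integral over dt is real except for $e^{-2\pi i f t}$, so we can write this as

$$\mathrm{E}\big(|\tilde{\mathcal{N}}_T^{(-)}(f)|^2\big) = \mathrm{E}\left(\int_{-\infty}^{\infty} \Pi(t, T)\, \tilde{N}^{(-)}(t)\, e^{2\pi i f t}\, dt \int_{-\infty}^{\infty} \Pi(t', T)\, \tilde{N}^{(-)}(t')\, e^{-2\pi i f t'}\, dt'\right) .$$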

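Substituting from (3.64b) and (3.64c) gives

$$\mathrm{E}\big(|\tilde{\mathcal{N}}_T^{(-)}(f)|^2\big) = \frac{1}{4}\, \mathrm{E}\left(\int_{-\infty}^{\infty} \Pi(t, T)\, [\tilde{N}(t) - \tilde{N}(-t)]\, e^{2\pi i f t}\, dt \int_{-\infty}^{\infty} \Pi(t', T)\, [\tilde{N}(t') - \tilde{N}(-t')]\, e^{-2\pi i f t'}\, dt'\right) ,$$

which becomes, applying the linearity of operator E discussed in Sec. 3.10 above,

$$\mathrm{E}\big(|\tilde{\mathcal{N}}_T^{(-)}(f)|^2\big) = \frac{1}{4} \int_{-\infty}^{\infty} dt\, \Pi(t, T)\, e^{2\pi i f t} \int_{-\infty}^{\infty} dt'\, \Pi(t', T)\, e^{-2\pi i f t'}\, \mathrm{E}\big([\tilde{N}(t) - \tilde{N}(-t)][\tilde{N}(t') - \tilde{N}(-t')]\big) . \tag{3.66a}$$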

The linearity of E can also be used to write

$$\begin{aligned}\mathrm{E}\bigl([N(t')-N(-t')][N(t'')-N(-t'')]\bigr) &= \mathrm{E}\bigl(N(t')N(t'') - N(t')N(-t'') - N(-t')N(t'') + N(-t')N(-t'')\bigr)\\ &= \mathrm{E}\bigl(N(t')N(t'')\bigr) - \mathrm{E}\bigl(N(t')N(-t'')\bigr) - \mathrm{E}\bigl(N(-t')N(t'')\bigr) + \mathrm{E}\bigl(N(-t')N(-t'')\bigr) .\end{aligned}$$

Equation (3.30b), which specifies the autocorrelation function of wide-sense stationary random functions like $N(t)$, can now be applied to get

$$\mathrm{E}\bigl([N(t')-N(-t')][N(t'')-N(-t'')]\bigr) = R_{NN}(t'-t'') - R_{NN}(t'+t'') - R_{NN}(-t'-t'') + R_{NN}(-t'+t'') .$$

According to Eq. (3.48b) the autocorrelation function $R_{NN}$ is even, so the right-hand side can be simplified to

$$\mathrm{E}\bigl([N(t')-N(-t')][N(t'')-N(-t'')]\bigr) = 2R_{NN}(t'-t'') - 2R_{NN}(t'+t'') .$$

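Putting this result back into Eq. (3.66a) gives

$$\begin{aligned}\mathrm{E}\bigl(|\mathcal{N}_T^{(-)}(f)|^2\bigr) = {}&\frac{1}{2}\int_{-\infty}^{\infty}dt'\,\Pi(t',T)e^{-2\pi ift'}\int_{-\infty}^{\infty}dt''\,\Pi(t'',T)R_{NN}(t'-t'')e^{2\pi ift''}\\ &-\frac{1}{2}\int_{-\infty}^{\infty}dt'\,\Pi(t',T)e^{-2\pi ift'}\int_{-\infty}^{\infty}dt''\,\Pi(t'',T)R_{NN}(t'+t'')e^{2\pi ift''} .\end{aligned} \tag{3.66b}$$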

Equation (3.48d) states that there exists a power spectrum $S_{NN}(f)$ such that

$$R_{NN}(t'-t'') = \int_{-\infty}^{\infty}S_{NN}(f')\,e^{2\pi if'(t'-t'')}\,df' .$$

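Substituting this expression into the first term on the right-hand side of the formula for $\mathrm{E}\bigl(|\mathcal{N}_T^{(-)}(f)|^2\bigr)$ and moving the integral over $S_{NN}$ to the front, we get

$$\begin{aligned}\mathrm{E}\bigl(|\mathcal{N}_T^{(-)}(f)|^2\bigr) = {}&\frac{1}{2}\int_{-\infty}^{\infty}df'\,S_{NN}(f')\int_{-\infty}^{\infty}dt'\,\Pi(t',T)e^{-2\pi it'(f-f')}\int_{-\infty}^{\infty}dt''\,\Pi(t'',T)e^{2\pi it''(f-f')}\\ &-\frac{1}{2}\int_{-\infty}^{\infty}dt'\,\Pi(t',T)e^{-2\pi ift'}\int_{-\infty}^{\infty}dt''\,\Pi(t'',T)R_{NN}(t'+t'')e^{2\pi ift''} .\end{aligned} \tag{3.66c}$$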

Interchanging the roles of $f$, $t$ and then replacing $F$ by $T$ in Eq. (2.108b) of Chapter 2 gives

$$\int_{-\infty}^{\infty}\Pi(t',T)e^{-2\pi it'(f-f')}dt' = \int_{-\infty}^{\infty}\Pi(t'',T)e^{2\pi it''(f-f')}dt'' = 2T\,\mathrm{sinc}\bigl(2\pi(f-f')T\bigr) , \tag{3.66d}$$

with Eq. (2.106d) showing that the definition of the sinc function is

$$\mathrm{sinc}(x) = \frac{\sin(x)}{x} . \tag{3.66e}$$

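Substitution of this formula into Eq. (3.66c) leads to

$$\begin{aligned}\mathrm{E}\bigl(|\mathcal{N}_T^{(-)}(f)|^2\bigr) = {}&\frac{1}{2}\int_{-\infty}^{\infty}S_{NN}(f')\bigl[2T\,\mathrm{sinc}(2\pi(f-f')T)\bigr]^2df'\\ &-\frac{1}{2}\int_{-\infty}^{\infty}dt'\,\Pi(t',T)e^{-2\pi ift'}\int_{-\infty}^{\infty}dt''\,\Pi(t'',T)R_{NN}(t'+t'')e^{2\pi ift''} .\end{aligned} \tag{3.66f}$$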

To evaluate the integral over $df'$ in (3.66f), we assume that $T$ is chosen large enough that

$$[\mathrm{sinc}(2\pi fT)]^2 = \left[\frac{\sin(2\pi fT)}{2\pi fT}\right]^2$$

varies rapidly as a function of $f$ compared to $S_{NN}(f)$. Hence, if $\Delta f_S$ is the change in $f$ required to cause a significant change in $S_{NN}(f)$, we must have

$$\Delta f_S\,T \gg 1 \quad\text{or}\quad T \gg \frac{1}{\Delta f_S} . \tag{3.67a}$$

Then we can follow the lead of (3.57f) and approximate

$$2T\left[\frac{\sin(2\pi fT)}{2\pi fT}\right]^2 = 2T\,\mathrm{sinc}^2(2\pi fT) \cong \delta(f) . \tag{3.67b}$$
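A short numerical sketch can make the delta-function approximation in (3.67b) concrete. It assumes a hypothetical smooth spectrum `S` (a Lorentzian chosen only for illustration) and specific values of `T`, the integration range `F`, and the grid spacing `df`; none of these come from the text. The kernel $2T\,\mathrm{sinc}^2(2\pi fT)$ should have unit area, and convolving it with a slowly varying spectrum should nearly reproduce the spectrum.

```python
import math

def sinc(x):
    """sinc(x) = sin(x)/x as in Eq. (3.66e), with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(x) / x

def kernel(f, T):
    """The function 2T sinc^2(2 pi f T) on the left-hand side of (3.67b)."""
    return 2.0 * T * sinc(2.0 * math.pi * f * T) ** 2

def S(f):
    """Hypothetical smooth power spectrum (a Lorentzian), for illustration only."""
    return 1.0 / (1.0 + f * f)

T, F, df = 20.0, 10.0, 5e-4      # assumed values with T much larger than 1/(spectral width)
grid = [-F + k * df for k in range(int(round(2 * F / df)) + 1)]

area = sum(kernel(f, T) for f in grid) * df                  # Riemann sum of the kernel
f0 = 0.5
sifted = sum(S(fp) * kernel(f0 - fp, T) for fp in grid) * df # kernel acting on S at f0

print(area)             # close to 1
print(sifted, S(f0))    # close to each other
```

The small residual differences shrink as `T` grows, which is the sense in which condition (3.67a) justifies the replacement.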

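Applying this approximation to the integral over $df'$ on the right-hand side of (3.66f), we replace

$$2T\left[\frac{\sin\bigl(2\pi(f-f')T\bigr)}{2\pi(f-f')T}\right]^2 = 2T\,\mathrm{sinc}^2\bigl(2\pi(f-f')T\bigr)$$

by $\delta(f-f')$ to get

$$\int_{-\infty}^{\infty}S_{NN}(f')\bigl[2T\,\mathrm{sinc}(2\pi(f-f')T)\bigr]^2df' \cong 2T\int_{-\infty}^{\infty}S_{NN}(f')\,\delta(f-f')\,df' = 2T\,S_{NN}(f) .$$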

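This result can now be substituted back into Eq. (3.66f) to get

$$\mathrm{E}\bigl(|\mathcal{N}_T^{(-)}(f)|^2\bigr) \cong T\,S_{NN}(f) - \frac{1}{2}\Lambda_T , \tag{3.67c}$$

where we define $\Lambda_T$ to be the value of the remaining double integral,

$$\Lambda_T = \int_{-\infty}^{\infty}dt'\,\Pi(t',T)e^{-2\pi ift'}\int_{-\infty}^{\infty}dt''\,\Pi(t'',T)R_{NN}(t'+t'')e^{2\pi ift''} . \tag{3.67d}$$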

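To evaluate $\Lambda_T$, we change the variable of integration in the inner integral from $t''$ to $t = -(t'+t'')$ to get

$$\begin{aligned}\Lambda_T &= \int_{-\infty}^{\infty}dt'\,\Pi(t',T)e^{-2\pi ift'}\int_{\infty}^{-\infty}(-1)\,dt\,\Pi\bigl(-(t+t'),T\bigr)R_{NN}(-t)\,e^{-2\pi if(t+t')}\\ &= \int_{-\infty}^{\infty}dt'\,\Pi(t',T)e^{-2\pi ift'}\int_{-\infty}^{\infty}dt\,\Pi\bigl(-(t+t'),T\bigr)R_{NN}(-t)\,e^{-2\pi if(t+t')} .\end{aligned}$$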

According to Eq. (2.56c) in Chapter 2, function $\Pi(t,T)$ is an even function of $t$, so

$$\Pi\bigl(-(t+t'),T\bigr) = \Pi(t+t',T) .$$

Similarly, according to Eq. (3.48b) above,

$$R_{NN}(-t) = R_{NN}(t) .$$

Applying these two formulas to the $\Lambda_T$ double integral gives, after interchanging the order of the integrals over $dt'$ and $dt$,

$$\Lambda_T = \int_{-\infty}^{\infty}dt\,R_{NN}(t)e^{-2\pi ift}\int_{-\infty}^{\infty}dt'\,\Pi(t',T)\,\Pi(t+t',T)\,e^{-4\pi ift'} . \tag{3.67e}$$

To simplify the inner integral on the right-hand side of (3.67e), we note that only when both $\Pi(t',T)$ and $\Pi(t+t',T)$ are one is their product one; in other words, when either $\Pi(t',T)$ or $\Pi(t+t',T)$ is zero, then their product is zero and no contribution is made to the integral. Figure 3.6(a) shows what happens for positive values of $t$, and Fig. 3.6(b) shows what happens for negative values of $t$.

FIGURE 3.6(a). Plot versus $t'$ of $\Pi(t',T)$ and of $\Pi(t+t',T)$ for $t > 0$, with axis marks at $-T$ and $T$.
FIGURE 3.6(b). Plot versus $t'$ of $\Pi(t',T)$ and of $\Pi(t+t',T)$ for $t < 0$, with axis marks at $-T$ and $T$.
In both Figs. 3.6(a) and 3.6(b), the dark solid line is a plot of $\Pi(t',T)$ and the dashed line is a plot of $\Pi(t+t',T)$. When $t > 0$, the dashed block shifts to the left; when $t < 0$, the dashed block shifts to the right. Only in the region of overlap of the solid and dashed lines in Figs. 3.6(a) and 3.6(b) does the product function

$$\Pi(t',T)\,\Pi(t+t',T)$$

allow a contribution to be made to the inner integral. Hence, we can write

$$\Pi(t',T)\,\Pi(t+t',T) = \begin{cases}1 & \text{when } 0 < t < 2T \text{ and } -T < t' < T-t\\ 1 & \text{when } 0 > t > -2T \text{ and } -T-t < t' < T\\ 0 & \text{outside these regions}\end{cases} , \tag{3.67f}$$

disregarding the edge points of the $\Pi$ functions because these single-point values do not contribute to the integral. Equation (3.67e) thus reduces to

$$\Lambda_T = \int_{-2T}^{0}dt\,R_{NN}(t)e^{-2\pi ift}\int_{-T-t}^{T}dt'\,e^{-4\pi ift'} + \int_{0}^{2T}dt\,R_{NN}(t)e^{-2\pi ift}\int_{-T}^{T-t}dt'\,e^{-4\pi ift'} . \tag{3.67g}$$

We note that

$$\int_a^b e^{-4\pi ift'}dt' = \frac{1}{4\pi if}\left(e^{-4\pi ifa} - e^{-4\pi ifb}\right) . \tag{3.67h}$$

Applying (3.67h) to (3.67g) gives

$$\Lambda_T = \frac{1}{4\pi if}\int_{-2T}^{0}R_{NN}(t)e^{-2\pi ift}\left(e^{4\pi if(T+t)} - e^{-4\pi ifT}\right)dt + \frac{1}{4\pi if}\int_{0}^{2T}R_{NN}(t)e^{-2\pi ift}\left(e^{4\pi ifT} - e^{4\pi if(t-T)}\right)dt .$$

Changing the variable of integration in the first integral to $t' = -t$ leads to [remember to apply Eq. (3.48b)]

$$\begin{aligned}\Lambda_T &= \frac{1}{4\pi if}\int_{0}^{2T}R_{NN}(t')\left(e^{4\pi ifT}e^{-2\pi ift'} - e^{-4\pi ifT}e^{2\pi ift'}\right)dt' + \frac{1}{4\pi if}\int_{0}^{2T}R_{NN}(t)\left(e^{4\pi ifT}e^{-2\pi ift} - e^{-4\pi ifT}e^{2\pi ift}\right)dt\\ &= \frac{e^{4\pi ifT}}{2\pi if}\int_{0}^{2T}R_{NN}(t)e^{-2\pi ift}dt - \frac{e^{-4\pi ifT}}{2\pi if}\int_{0}^{2T}R_{NN}(t)e^{2\pi ift}dt ,\end{aligned}$$

where in the last step we have dropped the primes from the variables of integration. The second term is the complex conjugate of the first, so this formula can be written as

$$\Lambda_T = \mathrm{Re}\left(\frac{e^{4\pi ifT}}{\pi if}\int_{0}^{2T}R_{NN}(t)e^{-2\pi ift}dt\right) \tag{3.67i}$$

because $\mathrm{Re}(c) = (c/2) + (c^*/2)$ for any complex number $c$.

The Heaviside step function is defined to be

$$H(t) = \begin{cases}1 & \text{for } t > 0\\ 1/2 & \text{for } t = 0\\ 0 & \text{for } t < 0\end{cases} \tag{3.67j}$$

in Eq. (2.70a) of Chapter 2. The integral on the right-hand side of (3.67i) can now be written as

$$\int_{0}^{2T}R_{NN}(t)e^{-2\pi ift}dt = \int_{-\infty}^{\infty}H(t)\,\Pi(t,2T)\,R_{NN}(t)\,e^{-2\pi ift}dt . \tag{3.67k}$$

The right-hand side is the Fourier transform of

$$H(t)\,\Pi(t,2T)\,R_{NN}(t)$$

and the Fourier-transform operator F defined in Eq. (2.29a) of Chapter 2 can be used to write it as

$$\mathcal{F}^{(-ift)}\bigl(H(t)\,\Pi(t,2T)\,R_{NN}(t)\bigr) .$$

The Fourier convolution theorem [see Eq. (2.39j) in Chapter 2] can be applied to get

$$\mathcal{F}^{(-ift)}\bigl(H(t)\,\Pi(t,2T)\,R_{NN}(t)\bigr) = \mathcal{F}^{(-ift)}\bigl(H(t)\,\Pi(t,2T)\bigr) * \mathcal{F}^{(-ift)}\bigl(R_{NN}(t)\bigr) . \tag{3.68a}$$

According to Eq. (3.48c) there exists a power spectrum $S_{NN}(f)$ such that

$$S_{NN}(f) = \int_{-\infty}^{\infty}R_{NN}(t)e^{-2\pi ift}dt = \mathcal{F}^{(-ift)}\bigl(R_{NN}(t)\bigr) . \tag{3.68b}$$

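Evaluating $\mathcal{F}^{(-ift)}\bigl(H(t)\,\Pi(t,2T)\bigr)$ is not much more difficult. Writing the Fourier transform as an integral gives [remember that $e^{i\theta} = \cos(\theta) + i\sin(\theta)$]

$$\begin{aligned}\mathcal{F}^{(-ift)}\bigl(H(t)\,\Pi(t,2T)\bigr) &= \int_{0}^{2T}e^{-2\pi ift}dt = \frac{1}{2\pi if}\left(1 - e^{-4\pi ifT}\right)\\ &= \frac{e^{-2\pi ifT}}{2\pi if}\left(e^{2\pi ifT} - e^{-2\pi ifT}\right)\\ &= \frac{1}{\pi f}\bigl[\cos(2\pi fT) - i\sin(2\pi fT)\bigr]\sin(2\pi fT)\\ &= \frac{1}{2\pi f}\sin(4\pi fT) - \frac{i}{\pi f}\sin^2(2\pi fT) ,\end{aligned}$$

where in the last step we use that

$$\sin\theta\cos\theta = \frac{1}{2}\sin(2\theta) .$$

Applying the formula for the sinc function from Eq. (3.66e), we end up with

$$\mathcal{F}^{(-ift)}\bigl(H(t)\,\Pi(t,2T)\bigr) = 2T\,\mathrm{sinc}(4\pi fT) - i\,(2\pi fT)\,2T\,\mathrm{sinc}^2(2\pi fT) . \tag{3.68c}$$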
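The closed form in (3.68c) is easy to confirm numerically. The sketch below compares the exact value of $\int_0^{2T} e^{-2\pi ift}\,dt$ with the sinc expression at a few arbitrarily chosen frequencies; the helper names `direct` and `sinc_form` and the test values are illustrative assumptions.

```python
import cmath
import math

def sinc(x):
    """sinc(x) = sin(x)/x as in Eq. (3.66e), with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(x) / x

def direct(f, T):
    """Integral of exp(-2 pi i f t) from t = 0 to 2T, evaluated in closed form (f nonzero)."""
    return (1.0 - cmath.exp(-4j * math.pi * f * T)) / (2j * math.pi * f)

def sinc_form(f, T):
    """Right-hand side of (3.68c)."""
    real_part = 2.0 * T * sinc(4.0 * math.pi * f * T)
    imag_part = -(2.0 * math.pi * f * T) * 2.0 * T * sinc(2.0 * math.pi * f * T) ** 2
    return complex(real_part, imag_part)

T = 1.5
for f in (0.1, 0.37, 2.0, -0.8):
    print(f, abs(direct(f, T) - sinc_form(f, T)))   # differences at rounding level
```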

Equations (3.68b) and (3.68c) are substituted into (3.68a) to get

$$\mathcal{F}^{(-ift)}\bigl(H(t)\,\Pi(t,2T)\,R_{NN}(t)\bigr) = \bigl\{2T\,\mathrm{sinc}(4\pi fT) - i\,(2\pi fT)\,2T\,\mathrm{sinc}^2(2\pi fT)\bigr\} * S_{NN}(f) ,$$

which can then be substituted into (3.67k), giving

$$\int_{0}^{2T}R_{NN}(t)e^{-2\pi ift}dt = \bigl\{2T\,\mathrm{sinc}(4\pi fT) - i\,(2\pi fT)\,2T\,\mathrm{sinc}^2(2\pi fT)\bigr\} * S_{NN}(f) . \tag{3.68d}$$

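Equation (2.67c) in Chapter 2 and the discussion following it show that

$$\frac{\sin(2\pi nf)}{\pi f} \to \delta(f) \quad\text{as } n \to \infty , \tag{3.68e}$$

where $t$ in (2.67c) is here replaced by $f$. We note that, working with Eq. (3.66e),

$$2T\,\mathrm{sinc}(4\pi fT) = \frac{\sin(4\pi fT)}{2\pi f} = \frac{1}{2}\,\frac{\sin\bigl(2\pi f(2T)\bigr)}{\pi f} .$$

Hence, applying (3.68e), we have

$$2T\,\mathrm{sinc}(4\pi fT) \to \frac{1}{2}\,\delta(f) \quad\text{as } (2T) \to \infty . \tag{3.68f}$$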

As $n$ gets large in (3.68e), the sine oscillates ever more rapidly with $f$. Similarly, as $2T$ gets large in (3.68f) (which is, of course, the same as $T$ getting large) the sinc oscillates ever more rapidly with $f$. In order to approximate the sinc in (3.68f) by a delta function, then, we need to have the other functions of $f$ that are also present varying slowly compared to the original oscillation. Again assuming, as in the discussion following Eq. (3.66f), that $T$ is large enough for the first sinc function on the right-hand side of Eq. (3.68d) to oscillate rapidly compared to the noise-power spectrum $S_{NN}$, we expand the convolution in (3.68d), writing it as [apply Eq. (2.38e) in Chapter 2]

$$\int_{0}^{2T}R_{NN}(t)e^{-2\pi ift}dt = \bigl\{[2T\,\mathrm{sinc}(4\pi fT)] * S_{NN}(f)\bigr\} - i\bigl\{(2\pi fT\,[2T\,\mathrm{sinc}^2(2\pi fT)]) * S_{NN}(f)\bigr\} ,$$

and then apply (3.68f) to get, since $\delta(f) * S_{NN}(f) = S_{NN}(f)$, that

$$\int_{0}^{2T}R_{NN}(t)e^{-2\pi ift}dt \cong \frac{1}{2}S_{NN}(f) - i\bigl\{(2\pi fT\,[2T\,\mathrm{sinc}^2(2\pi fT)]) * S_{NN}(f)\bigr\} . \tag{3.68g}$$
The remaining convolution on the right-hand side can be written as [see Eqs. (2.38a) and (2.38b) in Chapter 2]

$$\begin{aligned}(2\pi fT\,[2T\,\mathrm{sinc}^2(2\pi fT)]) * S_{NN}(f) &= S_{NN}(f) * (2\pi fT\,[2T\,\mathrm{sinc}^2(2\pi fT)])\\ &= \int_{-\infty}^{\infty}S_{NN}(f')\,2\pi(f-f')T\,\bigl[2T\,\mathrm{sinc}^2\bigl(2\pi(f-f')T\bigr)\bigr]df' .\end{aligned}$$

Both functions $(f-f')$ and $S_{NN}(f')$ vary slowly with $f'$ compared to

$$\bigl[2T\,\mathrm{sinc}^2\bigl(2\pi(f-f')T\bigr)\bigr]$$

for large values of $T$, so (3.67b) can be applied to the integral to get

$$(2\pi fT\,[2T\,\mathrm{sinc}^2(2\pi fT)]) * S_{NN}(f) \cong \int_{-\infty}^{\infty}S_{NN}(f')\bigl\{2\pi T(f-f')\,\delta(f-f')\bigr\}df' = 0 . \tag{3.68h}$$

Substituting this into (3.68g) gives

$$\int_{0}^{2T}R_{NN}(t)e^{-2\pi ift}dt \cong \frac{1}{2}S_{NN}(f) , \tag{3.68i}$$

which can then be put back into (3.67i) to get that [using $e^{i\theta} = \cos(\theta) + i\sin(\theta)$]

$$\Lambda_T \cong \mathrm{Re}\left(\frac{\cos(4\pi fT) + i\sin(4\pi fT)}{\pi if}\cdot\frac{1}{2}S_{NN}(f)\right) .$$

Equation (3.66e) simplifies this to

$$\Lambda_T \cong [2T\,\mathrm{sinc}(4\pi fT)]\,S_{NN}(f) . \tag{3.68j}$$

Substituting this approximation into (3.67c) lets us write, at last, that

$$\mathrm{E}\bigl(|\mathcal{N}_T^{(-)}(f)|^2\bigr) \cong T\,S_{NN}(f) - T\,\mathrm{sinc}(4\pi fT)\,S_{NN}(f) = T\,S_{NN}(f)\,[1 - \mathrm{sinc}(4\pi fT)] . \tag{3.68k}$$
The approximation in (3.68k) makes sense whenever $T$ is large enough for $\mathrm{sinc}(2\pi fT)$ and $\mathrm{sinc}(4\pi fT)$ to oscillate rapidly with frequency $f$ compared to $S_{NN}(f)$, which is usually true for white-noise-like power spectra. When $fT \gg 1$, the sinc function's value in formula (3.68k) is small compared to one [see, for example, Figs. 3.7(a) and 3.7(b)] and we can write

$$\mathrm{E}\bigl(|\mathcal{N}_T^{(-)}(f)|^2\bigr) \cong T\,S_{NN}(f) . \tag{3.69a}$$

When $f = 0$, it is of course no longer true that $fT \gg 1$. For this special case, the sinc function is one; and, according to (3.68k), no matter how large $T$ is we have

$$\mathrm{E}\bigl(|\mathcal{N}_T^{(-)}(0)|^2\bigr) \cong 0 \tag{3.69b}$$

and

$$\mathrm{E}\bigl(|\mathcal{N}_T^{(+)}(0)|^2\bigr) \cong 2T\,S_{NN}(0) . \tag{3.69c}$$
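Equations (3.68k) and (3.69b) can be illustrated with a small Monte Carlo experiment. This is only a sketch under stated assumptions: the noise is modeled as independent Gaussian samples of standard deviation `sigma` on a grid of spacing `dt`, so that its spectrum is taken to be flat at the level `sigma**2 * dt`; the values of `T`, `dt`, `f`, and the number of trials are arbitrary choices, and `mean_square_odd` is a hypothetical helper name.

```python
import math
import random

def sinc(x):
    """sinc(x) = sin(x)/x as in Eq. (3.66e), with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(x) / x

random.seed(1)
T, dt, sigma = 2.0, 0.02, 1.0
M = int(round(T / dt))
times = [k * dt for k in range(-M, M + 1)]    # symmetric sample grid on |t| <= T
S_white = sigma ** 2 * dt                      # assumed flat spectrum level of the sampled noise

def mean_square_odd(f, trials=4000):
    """Monte Carlo estimate of E(|N_T^(-)(f)|^2).

    Uses the fact that N_T^(-)(f) equals -i times the integral of N(t) sin(2 pi f t) dt.
    """
    sines = [math.sin(2.0 * math.pi * f * t) for t in times]
    total = 0.0
    for _ in range(trials):
        acc = sum(random.gauss(0.0, sigma) * s for s in sines) * dt
        total += acc * acc
    return total / trials

f = 0.3
estimate = mean_square_odd(f)
predicted = T * S_white * (1.0 - sinc(4.0 * math.pi * f * T))   # Eq. (3.68k)
print(estimate, predicted)                 # agree to within Monte Carlo scatter
print(mean_square_odd(0.0, trials=10))     # zero at f = 0, as in Eq. (3.69b)
```

With a few thousand trials the estimate scatters around the prediction by a few percent; the zero at $f = 0$ is exact because every sine weight vanishes there.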

Equation (3.69b) is easy to understand after reviewing the discussion following Eq. (3.65d) above. Since $\mathcal{N}_T^{(-)}$ is always an odd function of $f$, it must be zero at $f = 0$ according to Eq. (2.12a) of Chapter 2. To understand Eq. (3.69c), we consult Eqs. (3.65e) and (3.65f) and note that

$$|\mathcal{N}_T^{(+)}(f)|^2 + |\mathcal{N}_T^{(-)}(f)|^2 = [\mathrm{Re}(\mathcal{N}_T(f))]^2 + [\mathrm{Im}(\mathcal{N}_T(f))]^2 = |\mathcal{N}_T(f)|^2 .$$

Applying the expectation operator E to both sides and using its linearity with respect to random quantities (see Sec. 3.10 above), we get

$$\mathrm{E}\bigl(|\mathcal{N}_T^{(+)}(f)|^2\bigr) + \mathrm{E}\bigl(|\mathcal{N}_T^{(-)}(f)|^2\bigr) = \mathrm{E}\bigl([\mathrm{Re}\,\mathcal{N}_T(f)]^2\bigr) + \mathrm{E}\bigl([\mathrm{Im}\,\mathcal{N}_T(f)]^2\bigr) = \mathrm{E}\bigl(|\mathcal{N}_T(f)|^2\bigr)$$

or

$$\mathrm{E}\bigl(|\mathcal{N}_T(f)|^2\bigr) = \mathrm{E}\bigl(|\mathcal{N}_T^{(+)}(f)|^2\bigr) + \mathrm{E}\bigl(|\mathcal{N}_T^{(-)}(f)|^2\bigr) . \tag{3.69d}$$

FIGURE 3.7(a) and FIGURE 3.7(b). Plots of $\mathrm{sinc}(4\pi fT)$ and $\mathrm{sinc}(2\pi fT)$ versus $f$; each has a peak value of 1.0, with $f$-axis marks at $\pm 1/(4T)$ and $\pm 1/(2T)$.

Glancing back at formula (3.57g), we realize, because $T$ is assumed to be large in our analysis here, that

$$\frac{\mathrm{E}\bigl(|\mathcal{N}_T(f)|^2\bigr)}{2T}$$

is close to its limiting value as $T \to \infty$. Hence, (3.57g) lets us write

$$\mathrm{E}\bigl(|\mathcal{N}_T(f)|^2\bigr) \cong 2T\,S_{NN}(f) \tag{3.69e}$$

for large values of $T$. This approximation works well no matter what the value of $f$ is. Therefore, at $f = 0$ we can substitute (3.69e) into (3.69d) to get

$$2T\,S_{NN}(0) \cong \mathrm{E}\bigl(|\mathcal{N}_T^{(+)}(0)|^2\bigr) + \mathrm{E}\bigl(|\mathcal{N}_T^{(-)}(0)|^2\bigr) . \tag{3.69f}$$

Having already justified (3.69b), we can now apply it to (3.69f) to get

$$2T\,S_{NN}(0) \cong \mathrm{E}\bigl(|\mathcal{N}_T^{(+)}(0)|^2\bigr) .$$

This result then justifies formula (3.69c) above.
Equation (3.69d) can also be used to justify the assumption behind formula (3.69e) that, when $f \neq 0$, the ratio

$$\frac{\mathrm{E}\bigl(|\mathcal{N}_T(f)|^2\bigr)}{2T}$$

is, for large values of $T$, close to its limiting value of $S_{NN}(f)$. When $f \neq 0$ and $T$ is large so that $fT \gg 1$, we can substitute (3.69a) into (3.69d) to rederive (3.69e),

$$\mathrm{E}\bigl(|\mathcal{N}_T(f)|^2\bigr) \cong 2T\,S_{NN}(f) .$$

According to (3.69a), then, it follows that when $fT \gg 1$ and $T$ is large, both

$$\mathrm{E}\bigl(|\mathcal{N}_T^{(+)}(f)|^2\bigr) = \mathrm{E}\bigl([\mathrm{Re}\,\mathcal{N}_T(f)]^2\bigr) \cong T\,S_{NN}(f)$$

and

$$\mathrm{E}\bigl(|\mathcal{N}_T^{(-)}(f)|^2\bigr) = \mathrm{E}\bigl([\mathrm{Im}\,\mathcal{N}_T(f)]^2\bigr) \cong T\,S_{NN}(f)$$

contribute equally to $\mathrm{E}\bigl(|\mathcal{N}_T(f)|^2\bigr)$. Having arrived at the formula

$$\mathrm{E}\bigl(|\mathcal{N}_T(f)|^2\bigr) \cong 2T\,S_{NN}(f)$$

without using Eq. (3.57g) (that is, without thinking about what the limiting value of the ratio

$$\frac{\mathrm{E}\bigl(|\mathcal{N}_T(f)|^2\bigr)}{2T}$$

might be as $T$ gets large) we can now work in reverse to get that

$$\frac{\mathrm{E}\bigl(|\mathcal{N}_T(f)|^2\bigr)}{2T} \cong S_{NN}(f) .$$

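Not only does this result demonstrate that the ratio

$$\frac{\mathrm{E}\bigl(|\mathcal{N}_T(f)|^2\bigr)}{2T}$$

is indeed about equal to $S_{NN}(f)$ when $fT \gg 1$ and $T$ is large, but we have also seen, when $fT \gg 1$ and $T$ is large, that the expected value of the squared real component of $\mathcal{N}_T$ and the expected value of the squared imaginary component of $\mathcal{N}_T$ contribute equally to the expected value of the squared magnitude of $\mathcal{N}_T$. In other words, both

$$\mathrm{E}\bigl(|\mathcal{N}_T^{(+)}(f)|^2\bigr) = \mathrm{E}\bigl([\mathrm{Re}\,\mathcal{N}_T(f)]^2\bigr)$$

and

$$\mathrm{E}\bigl(|\mathcal{N}_T^{(-)}(f)|^2\bigr) = \mathrm{E}\bigl([\mathrm{Im}\,\mathcal{N}_T(f)]^2\bigr)$$

have turned out to be about half the expected value of the squared magnitude of $\mathcal{N}_T$, which lets us write

$$\mathrm{E}\bigl(|\mathcal{N}_T(f)|^2\bigr) \cong 2\,\mathrm{E}\bigl(|\mathcal{N}_T^{(+)}(f)|^2\bigr) = 2\,\mathrm{E}\bigl([\mathrm{Re}\,\mathcal{N}_T(f)]^2\bigr) \tag{3.69g}$$

and

$$\mathrm{E}\bigl(|\mathcal{N}_T(f)|^2\bigr) \cong 2\,\mathrm{E}\bigl(|\mathcal{N}_T^{(-)}(f)|^2\bigr) = 2\,\mathrm{E}\bigl([\mathrm{Im}\,\mathcal{N}_T(f)]^2\bigr) . \tag{3.69h}$$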

A not-very-rigorous argument often used to derive Eqs. (3.69a), (3.69g), and (3.69h) starts out by breaking $\mathcal{N}_T(f)$ into real and imaginary parts. (This step is sound; we did the same thing in our analysis above.) Writing

$$|\mathcal{N}_T(f)|^2 = [\mathrm{Re}(\mathcal{N}_T(f))]^2 + [\mathrm{Im}(\mathcal{N}_T(f))]^2 , \tag{3.70a}$$

we next assume that $\mathcal{N}_T$ is equally likely to be real or imaginary, which means that

$$\mathrm{E}\bigl([\mathrm{Re}(\mathcal{N}_T(f))]^2\bigr) = \mathrm{E}\bigl([\mathrm{Im}(\mathcal{N}_T(f))]^2\bigr) . \tag{3.70b}$$

This is the result, of course, that we have gone to some trouble to justify analytically rather than just assuming it applies; it is sometimes true and sometimes very wrong, for example, when $f = 0$ or when $S_{NN}$ varies rapidly with $f$. Applying the E expectation operator to both sides of (3.70a) gives, using the linearity of E explained in Sec. 3.10,

$$\mathrm{E}\bigl(|\mathcal{N}_T(f)|^2\bigr) = \mathrm{E}\bigl([\mathrm{Re}(\mathcal{N}_T(f))]^2\bigr) + \mathrm{E}\bigl([\mathrm{Im}(\mathcal{N}_T(f))]^2\bigr) . \tag{3.70c}$$

Substitution of (3.70b) into (3.70c) then leads to

$$\mathrm{E}\bigl(|\mathcal{N}_T(f)|^2\bigr) = 2\,\mathrm{E}\bigl([\mathrm{Re}(\mathcal{N}_T(f))]^2\bigr) \tag{3.70d}$$

and

$$\mathrm{E}\bigl(|\mathcal{N}_T(f)|^2\bigr) = 2\,\mathrm{E}\bigl([\mathrm{Im}(\mathcal{N}_T(f))]^2\bigr) . \tag{3.70e}$$

Consulting Eqs. (3.65e) and (3.65f), we see that formulas (3.70d) and (3.70e) are identical to (3.69g) and (3.69h). Fortunately, since a more rigorous line of reasoning has already been used to derive Eqs. (3.69g) and (3.69h), there is no need to rely on the assumption that (3.70b) is true to establish the truth of (3.70d) and (3.70e). Having derived these results more rigorously, we also now know that formulas (3.69g) and (3.69h) and formulas (3.70d) and (3.70e) are approximations that should be used only when $T$ is large, when $fT \gg 1$, and when $S_{NN}$ varies slowly with frequency $f$.
3.27 Analyzing the Noise in Artificially Created Even Signals
Many times in interferometer measurements we take all the data recorded for times $t > 0$ and, assuming the signal is an even function of time, use the positive-time data to specify what the data ought to be at $t < 0$. This means that the noise in the data for $-\infty < t < \infty$ ends up being an even function of time; that is, the real-valued random function $n_E(t)$ that characterizes the noise at $t > 0$ in the original recording also characterizes the noise for all negative time values because of the way we construct the data set. Mathematically we say that

$$n_E(-t) = n_E(t) \quad\text{for all } -\infty < t < \infty . \tag{3.71a}$$

Although random function $n_E(t)$ is neither ergodic nor stationary, we can assume that a real-valued and stationary random function $n(t)$ exists such that

$$n_E(t) = n(t) \quad\text{for } t \geq 0 . \tag{3.71b}$$

Just like any other stationary random function, $n(t)$ has an autocorrelation function [see Eq. (3.30b)]

$$R_{nn}(t'-t'') = \mathrm{E}\bigl(n(t')\,n(t'')\bigr) . \tag{3.71c}$$

Following the conventions of Sec. 3.20 above [see Eqs. (3.48a)-(3.48c)], we note that $R_{nn}$ is an even function,

$$R_{nn}(-\tau) = R_{nn}(\tau) , \tag{3.71d}$$

and that autocorrelation $R_{nn}$ and the power spectrum $S_{nn}$ make up a Fourier-transform pair,



$$S_{nn}(f) = \int_{-\infty}^{\infty}R_{nn}(\tau)\,e^{-2\pi if\tau}\,d\tau \tag{3.71e}$$

and

$$R_{nn}(\tau) = \int_{-\infty}^{\infty}S_{nn}(f)\,e^{2\pi if\tau}\,df . \tag{3.71f}$$

Following the same pattern as in Eq. (3.65a), we define

$$N_T(f) = \int_{-T}^{T}n(t)\,e^{-2\pi ift}\,dt = \int_{-\infty}^{\infty}\Pi(t,T)\,n(t)\,e^{-2\pi ift}\,dt \tag{3.72a}$$

and

$$N_{TE}(f) = \int_{-T}^{T}n_E(t)\,e^{-2\pi ift}\,dt = \int_{-\infty}^{\infty}\Pi(t,T)\,n_E(t)\,e^{-2\pi ift}\,dt . \tag{3.72b}$$

For large values of $T$, we can derive a simple approximation for

$$\mathrm{E}\bigl(|N_{TE}(f)|^2\bigr) ,$$

the expectation value of the squared magnitude of $N_{TE}$, in terms of

$$\mathrm{E}\bigl(|N_T(f)|^2\bigr)$$

and the power spectrum $S_{nn}(f)$.
We start by specifying the Heaviside step function to be the same as in Eq. (3.67j):

$$H(t) = \begin{cases}1 & \text{for } t > 0\\ 1/2 & \text{for } t = 0\\ 0 & \text{for } t < 0\end{cases} . \tag{3.73a}$$

This is the same step function defined in Eq. (2.70a) in Chapter 2. It follows that $n_E(t)$ can be written as [see Eqs. (3.71a) and (3.71b)]

$$n_E(t) = n(t)\,H(t) + n(-t)\,H(-t) . \tag{3.73b}$$

We note that for $t > 0$, the first term has $H(t) = 1$ and the second term has $H(-t) = 0$, so

$$n_E(t) = n(t) .$$

For $t < 0$, the first term has $H(t) = 0$ and the second term has $H(-t) = 1$, so

$$n_E(t) = n(-t) ,$$

and when $t = 0$ both $H(t)$ and $H(-t)$ are 1/2, so

$$n_E(0) = n(0) .$$

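We can now write, using Eq. (3.72b) and remembering that $n_E$ is real, that

$$\mathrm{E}\bigl(|N_{TE}(f)|^2\bigr) = \mathrm{E}\bigl(N_{TE}(f)\,N_{TE}(f)^*\bigr) = \mathrm{E}\left(\int_{-\infty}^{\infty}\Pi(t',T)\,n_E(t')\,e^{-2\pi ift'}dt'\int_{-\infty}^{\infty}\Pi(t'',T)\,n_E(t'')\,e^{2\pi ift''}dt''\right) .$$

Using the linearity of E described in Sec. 3.10 above, we bring the expectation operator inside the double integral over $dt'$ and $dt''$ to get

$$\mathrm{E}\bigl(|N_{TE}(f)|^2\bigr) = \int_{-\infty}^{\infty}dt'\,\Pi(t',T)e^{-2\pi ift'}\int_{-\infty}^{\infty}dt''\,\Pi(t'',T)e^{2\pi ift''}\,\mathrm{E}\bigl(n_E(t')\,n_E(t'')\bigr) . \tag{3.73c}$$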

Equation (3.73b) shows that, again using the linearity of the expectation operator,

$$\begin{aligned}\mathrm{E}\bigl(n_E(t')\,n_E(t'')\bigr) &= \mathrm{E}\bigl([n(t')H(t') + n(-t')H(-t')]\,[n(t'')H(t'') + n(-t'')H(-t'')]\bigr)\\ &= H(t')H(t'')\,\mathrm{E}\bigl(n(t')n(t'')\bigr) + H(t')H(-t'')\,\mathrm{E}\bigl(n(t')n(-t'')\bigr)\\ &\quad + H(-t')H(t'')\,\mathrm{E}\bigl(n(-t')n(t'')\bigr) + H(-t')H(-t'')\,\mathrm{E}\bigl(n(-t')n(-t'')\bigr) .\end{aligned}$$

Substituting from Eq. (3.71c) gives

$$\begin{aligned}\mathrm{E}\bigl(n_E(t')\,n_E(t'')\bigr) = {}&H(t')H(t'')\,R_{nn}(t'-t'') + H(t')H(-t'')\,R_{nn}(t'+t'')\\ &+ H(-t')H(t'')\,R_{nn}(-t'-t'') + H(-t')H(-t'')\,R_{nn}(-t'+t'') .\end{aligned}$$

Because the autocorrelation is even [see Eq. (3.71d)], this simplifies to

$$\begin{aligned}\mathrm{E}\bigl(n_E(t')\,n_E(t'')\bigr) = {}&R_{nn}(t'-t'')\,[H(t')H(t'') + H(-t')H(-t'')]\\ &+ R_{nn}(t'+t'')\,[H(-t')H(t'') + H(t')H(-t'')] .\end{aligned} \tag{3.73d}$$

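Substituting the right-hand side of (3.73d) into the double integral in (3.73c) gives

$$\begin{aligned}\mathrm{E}\bigl(|N_{TE}(f)|^2\bigr) = {}&\int_{-\infty}^{\infty}dt'\,\Pi(t',T)e^{-2\pi ift'}\int_{-\infty}^{\infty}dt''\,\Pi(t'',T)e^{2\pi ift''}R_{nn}(t'-t'')\,[H(t')H(t'') + H(-t')H(-t'')]\\ &+ \int_{-\infty}^{\infty}dt'\,\Pi(t',T)e^{-2\pi ift'}\int_{-\infty}^{\infty}dt''\,\Pi(t'',T)e^{2\pi ift''}R_{nn}(t'+t'')\,[H(-t')H(t'') + H(t')H(-t'')]\\ = {}&\Lambda_1 + \Lambda_2 ,\end{aligned} \tag{3.73e}$$

where

$$\Lambda_1 = \int_{-\infty}^{\infty}dt'\,\Pi(t',T)e^{-2\pi ift'}\int_{-\infty}^{\infty}dt''\,\Pi(t'',T)e^{2\pi ift''}R_{nn}(t'-t'')\,[H(t')H(t'') + H(-t')H(-t'')] \tag{3.73f}$$

and

$$\Lambda_2 = \int_{-\infty}^{\infty}dt'\,\Pi(t',T)e^{-2\pi ift'}\int_{-\infty}^{\infty}dt''\,\Pi(t'',T)e^{2\pi ift''}R_{nn}(t'+t'')\,[H(-t')H(t'') + H(t')H(-t'')] . \tag{3.73g}$$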

The dark solid line in Fig. 3.8(a) is a plot of the Heaviside step function $H(t)$ and the dashed line is a plot of $\Pi(t,T)$. Disregarding the edge points whose values do not contribute to the integrals in (3.73f) and (3.73g), the product $[H(t)\,\Pi(t,T)]$ is zero unless both $H$ and $\Pi$ are one; that is, the product is zero unless $t$ lies inside the region where both the solid and dashed plots are one in Fig. 3.8(a). Comparing this region to the plot of

$$\Pi\left(t - \frac{T}{2}, \frac{T}{2}\right)$$

in Fig. 3.8(b), we see that

$$H(t)\,\Pi(t,T) = \Pi\left(t - \frac{T}{2}, \frac{T}{2}\right) . \tag{3.74a}$$

In Fig. 3.8(c), the dashed line is again a plot of $\Pi(t,T)$, but now the dark solid line is a plot of $H(-t)$. Comparing the region where both $H(-t)$ and $\Pi(t,T)$ are one in Fig. 3.8(c) to the plot of

$$\Pi\left(t + \frac{T}{2}, \frac{T}{2}\right)$$

in Fig. 3.8(d), we see that

$$H(-t)\,\Pi(t,T) = \Pi\left(t + \frac{T}{2}, \frac{T}{2}\right) . \tag{3.74b}$$
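The two identities (3.74a) and (3.74b) can be spot-checked away from the edge points. The sketch below assumes the rectangle function is one for $|t| < T$ and zero for $|t| > T$, consistent with its use as a truncation window in (3.72a); the function names `step` and `rect` and the sample points are illustrative.

```python
def step(t):
    """Heaviside step of (3.73a): 1 for t > 0, 1/2 at t = 0, 0 for t < 0."""
    return 1.0 if t > 0 else (0.5 if t == 0 else 0.0)

def rect(t, T):
    """Rectangle function: 1 for |t| < T and 0 for |t| > T (edge values are not used here)."""
    return 1.0 if abs(t) < T else 0.0

T = 2.0
# Sample points chosen to avoid the edges t = -T, 0, T, which do not affect the integrals.
samples = [-2.5 + 0.1 * k + 0.05 for k in range(50)]

ok_a = all(step(t) * rect(t, T) == rect(t - T / 2, T / 2) for t in samples)     # (3.74a)
ok_b = all(step(-t) * rect(t, T) == rect(t + T / 2, T / 2) for t in samples)    # (3.74b)
print(ok_a, ok_b)
```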

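FIGURE 3.8(a). Plot versus $t$ of $H(t)$ (solid) and $\Pi(t,T)$ (dashed), with axis marks at $-T$ and $T$.
FIGURE 3.8(b). Plot versus $t$ of $\Pi(t - T/2, T/2)$.
FIGURE 3.8(c). Plot versus $t$ of $H(-t)$ (solid) and $\Pi(t,T)$ (dashed), with axis marks at $-T$ and $T$.
FIGURE 3.8(d). Plot versus $t$ of $\Pi(t + T/2, T/2)$.
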
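Splitting the formula in Eq. (3.73f) into two double integrals, we get that

$$\begin{aligned}\Lambda_1 = {}&\int_{-\infty}^{\infty}dt'\,H(t')\Pi(t',T)e^{-2\pi ift'}\int_{-\infty}^{\infty}dt''\,H(t'')\Pi(t'',T)e^{2\pi ift''}R_{nn}(t'-t'')\\ &+ \int_{-\infty}^{\infty}dt'\,H(-t')\Pi(t',T)e^{-2\pi ift'}\int_{-\infty}^{\infty}dt''\,H(-t'')\Pi(t'',T)e^{2\pi ift''}R_{nn}(t'-t'') ,\end{aligned}$$

which becomes, applying (3.74a) and (3.74b),

$$\begin{aligned}\Lambda_1 = {}&\int_{-\infty}^{\infty}dt'\,\Pi\left(t'-\tfrac{T}{2},\tfrac{T}{2}\right)e^{-2\pi ift'}\int_{-\infty}^{\infty}dt''\,\Pi\left(t''-\tfrac{T}{2},\tfrac{T}{2}\right)e^{2\pi ift''}R_{nn}(t'-t'')\\ &+ \int_{-\infty}^{\infty}dt'\,\Pi\left(t'+\tfrac{T}{2},\tfrac{T}{2}\right)e^{-2\pi ift'}\int_{-\infty}^{\infty}dt''\,\Pi\left(t''+\tfrac{T}{2},\tfrac{T}{2}\right)e^{2\pi ift''}R_{nn}(t'-t'') .\end{aligned}$$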

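After changing the variables of integration in the first double integral from $t'$, $t''$ to $\tau' = t' - (T/2)$ and $\tau'' = t'' - (T/2)$, and changing the variables of integration in the second double integral from $t'$, $t''$ to $\tau' = t' + (T/2)$ and $\tau'' = t'' + (T/2)$, we see that

$$\begin{aligned}\Lambda_1 = {}&\int_{-\infty}^{\infty}d\tau'\,\Pi\left(\tau',\tfrac{T}{2}\right)e^{-2\pi if\left(\tau'+\frac{T}{2}\right)}\int_{-\infty}^{\infty}d\tau''\,\Pi\left(\tau'',\tfrac{T}{2}\right)e^{2\pi if\left(\tau''+\frac{T}{2}\right)}R_{nn}(\tau'-\tau'')\\ &+ \int_{-\infty}^{\infty}d\tau'\,\Pi\left(\tau',\tfrac{T}{2}\right)e^{-2\pi if\left(\tau'-\frac{T}{2}\right)}\int_{-\infty}^{\infty}d\tau''\,\Pi\left(\tau'',\tfrac{T}{2}\right)e^{2\pi if\left(\tau''-\frac{T}{2}\right)}R_{nn}(\tau'-\tau'') .\end{aligned}$$

Since

$$e^{-2\pi if(T/2)}\,e^{2\pi if(T/2)} = 1 ,$$

the two double integrals have the same value, which means that

$$\Lambda_1 = 2\int_{-\infty}^{\infty}d\tau'\,\Pi\left(\tau',\tfrac{T}{2}\right)e^{-2\pi if\tau'}\int_{-\infty}^{\infty}d\tau''\,\Pi\left(\tau'',\tfrac{T}{2}\right)e^{2\pi if\tau''}R_{nn}(\tau'-\tau'') .$$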

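This type of double integral has already been evaluated in Sec. 3.26 while simplifying Eq. (3.66b), but there is no harm in quickly repeating the procedure. Applying Eq. (3.71f), we get

$$\begin{aligned}\Lambda_1 &= 2\int_{-\infty}^{\infty}d\tau'\,\Pi\left(\tau',\tfrac{T}{2}\right)e^{-2\pi if\tau'}\int_{-\infty}^{\infty}d\tau''\,\Pi\left(\tau'',\tfrac{T}{2}\right)e^{2\pi if\tau''}\int_{-\infty}^{\infty}df'\,S_{nn}(f')\,e^{2\pi if'(\tau'-\tau'')}\\ &= 2\int_{-\infty}^{\infty}df'\,S_{nn}(f')\int_{-\infty}^{\infty}d\tau'\,\Pi\left(\tau',\tfrac{T}{2}\right)e^{-2\pi i(f-f')\tau'}\int_{-\infty}^{\infty}d\tau''\,\Pi\left(\tau'',\tfrac{T}{2}\right)e^{2\pi i(f-f')\tau''} .\end{aligned}$$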

This expression can be simplified further using Eq. (3.66d). Equation (3.66d) still holds true if $T$ is replaced by $T/2$ because the original $T$ is a dummy parameter. So, replacing $T$ by $T/2$ and substituting the result in the formula for $\Lambda_1$,

$$\Lambda_1 = 2T\int_{-\infty}^{\infty}S_{nn}(f')\,T\,\mathrm{sinc}^2\bigl(\pi T(f-f')\bigr)\,df' . \tag{3.75a}$$

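According to Eq. (3.66e),

$$T\,\mathrm{sinc}^2(\pi Tf) = T\left[\frac{\sin(\pi Tf)}{\pi Tf}\right]^2 = 2T'\left[\frac{\sin(2\pi fT')}{2\pi fT'}\right]^2 ,$$

where $T = 2T'$. In the limit $T \to \infty$ we also have, of course, that $T' \to \infty$, so according to Eq. (3.57f) it follows that

$$T\,\mathrm{sinc}^2(\pi Tf) \to \delta(f) \quad\text{as } T \to \infty . \tag{3.75b}$$

Again, we assume that $T$ is large enough to make

$$T\,\mathrm{sinc}^2\bigl(\pi T(f-f')\bigr) \cong \delta(f-f')$$

in Eq. (3.75a). Consequently,

$$\Lambda_1 \cong 2T\int_{-\infty}^{\infty}S_{nn}(f')\,\delta(f-f')\,df'$$

or

$$\Lambda_1 \cong 2T\,S_{nn}(f) . \tag{3.75c}$$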


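To evaluate $\Lambda_2$, we apply Eqs. (3.74a) and (3.74b) to the right-hand side of Eq. (3.73g) to get

$$\begin{aligned}\Lambda_2 = {}&\int_{-\infty}^{\infty}dt'\,H(-t')\Pi(t',T)e^{-2\pi ift'}\int_{-\infty}^{\infty}dt''\,H(t'')\Pi(t'',T)e^{2\pi ift''}R_{nn}(t'+t'')\\ &+ \int_{-\infty}^{\infty}dt'\,H(t')\Pi(t',T)e^{-2\pi ift'}\int_{-\infty}^{\infty}dt''\,H(-t'')\Pi(t'',T)e^{2\pi ift''}R_{nn}(t'+t'')\\ = {}&\int_{-\infty}^{\infty}dt'\,\Pi\left(t'+\tfrac{T}{2},\tfrac{T}{2}\right)e^{-2\pi ift'}\int_{-\infty}^{\infty}dt''\,\Pi\left(t''-\tfrac{T}{2},\tfrac{T}{2}\right)e^{2\pi ift''}R_{nn}(t'+t'')\\ &+ \int_{-\infty}^{\infty}dt'\,\Pi\left(t'-\tfrac{T}{2},\tfrac{T}{2}\right)e^{-2\pi ift'}\int_{-\infty}^{\infty}dt''\,\Pi\left(t''+\tfrac{T}{2},\tfrac{T}{2}\right)e^{2\pi ift''}R_{nn}(t'+t'') .\end{aligned}$$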

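In the first double integral, the $t'$, $t''$ variables of integration are replaced by $\tau' = t' + (T/2)$ and $\tau'' = t'' - (T/2)$ respectively; and in the second double integral, the $t'$, $t''$ variables of integration are replaced by $\tau' = t' - (T/2)$ and $\tau'' = t'' + (T/2)$ respectively. This leads to

$$\begin{aligned}\Lambda_2 = {}&\int_{-\infty}^{\infty}d\tau'\,\Pi\left(\tau',\tfrac{T}{2}\right)e^{-2\pi if\left(\tau'-\frac{T}{2}\right)}\int_{-\infty}^{\infty}d\tau''\,\Pi\left(\tau'',\tfrac{T}{2}\right)e^{2\pi if\left(\tau''+\frac{T}{2}\right)}R_{nn}(\tau'+\tau'')\\ &+ \int_{-\infty}^{\infty}d\tau'\,\Pi\left(\tau',\tfrac{T}{2}\right)e^{-2\pi if\left(\tau'+\frac{T}{2}\right)}\int_{-\infty}^{\infty}d\tau''\,\Pi\left(\tau'',\tfrac{T}{2}\right)e^{2\pi if\left(\tau''-\frac{T}{2}\right)}R_{nn}(\tau'+\tau'')\end{aligned}$$

or

$$\begin{aligned}\Lambda_2 = {}&e^{2\pi ifT}\int_{-\infty}^{\infty}d\tau'\,\Pi\left(\tau',\tfrac{T}{2}\right)e^{-2\pi if\tau'}\int_{-\infty}^{\infty}d\tau''\,\Pi\left(\tau'',\tfrac{T}{2}\right)e^{2\pi if\tau''}R_{nn}(\tau'+\tau'')\\ &+ e^{-2\pi ifT}\int_{-\infty}^{\infty}d\tau'\,\Pi\left(\tau',\tfrac{T}{2}\right)e^{-2\pi if\tau'}\int_{-\infty}^{\infty}d\tau''\,\Pi\left(\tau'',\tfrac{T}{2}\right)e^{2\pi if\tau''}R_{nn}(\tau'+\tau'') .\end{aligned} \tag{3.75d}$$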

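Everything on the right-hand side of (3.75d) is real except the complex exponentials, so the second term is the complex conjugate of the first term. It is easy to show that this is true. Starting with the first term we have

$$\begin{aligned}&\left[e^{2\pi ifT}\int_{-\infty}^{\infty}d\tau'\,\Pi\left(\tau',\tfrac{T}{2}\right)e^{-2\pi if\tau'}\int_{-\infty}^{\infty}d\tau''\,\Pi\left(\tau'',\tfrac{T}{2}\right)e^{2\pi if\tau''}R_{nn}(\tau'+\tau'')\right]^*\\ &\quad= e^{-2\pi ifT}\int_{-\infty}^{\infty}d\tau'\,\Pi\left(\tau',\tfrac{T}{2}\right)e^{2\pi if\tau'}\int_{-\infty}^{\infty}d\tau''\,\Pi\left(\tau'',\tfrac{T}{2}\right)e^{-2\pi if\tau''}R_{nn}(\tau'+\tau'')\\ &\quad= e^{-2\pi ifT}\int_{-\infty}^{\infty}d\tau'\,\Pi\left(\tau',\tfrac{T}{2}\right)e^{-2\pi if\tau'}\int_{-\infty}^{\infty}d\tau''\,\Pi\left(\tau'',\tfrac{T}{2}\right)e^{2\pi if\tau''}R_{nn}(\tau'+\tau'') ,\end{aligned}$$

where in the last step we interchange the order of the double integral and replace the dummy variables of integration $\tau'$, $\tau''$ by $\tau''$, $\tau'$ respectively. Clearly, the second term in (3.75d) is the complex conjugate of the first. Since $2\,\mathrm{Re}(c) = c + c^*$ for any complex number $c$, it follows that Eq. (3.75d) can be written as

$$\Lambda_2 = 2\,\mathrm{Re}\left(e^{2\pi ifT}\int_{-\infty}^{\infty}d\tau'\,\Pi\left(\tau',\tfrac{T}{2}\right)e^{-2\pi if\tau'}\int_{-\infty}^{\infty}d\tau''\,\Pi\left(\tau'',\tfrac{T}{2}\right)e^{2\pi if\tau''}R_{nn}(\tau'+\tau'')\right) . \tag{3.75e}$$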

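After the variable of integration of the inner integral is changed to $t = -(\tau'+\tau'')$, it can be written as

$$\int_{-\infty}^{\infty}d\tau''\,\Pi\left(\tau'',\tfrac{T}{2}\right)e^{2\pi if\tau''}R_{nn}(\tau'+\tau'') = \int_{-\infty}^{\infty}dt\,\Pi\left(-(t+\tau'),\tfrac{T}{2}\right)e^{-2\pi if(t+\tau')}R_{nn}(-t) . \tag{3.75f}$$

According to Eq. (3.48b) above and Eq. (2.56c) in Chapter 2, both $\Pi$ and $R_{nn}$ are even functions, which means that

$$\Pi\left(-(t+\tau'),\tfrac{T}{2}\right) = \Pi\left(t+\tau',\tfrac{T}{2}\right)$$

and

$$R_{nn}(-t) = R_{nn}(t) .$$

Substituting these two formulas into the right-hand side of (3.75f) gives

$$\int_{-\infty}^{\infty}d\tau''\,\Pi\left(\tau'',\tfrac{T}{2}\right)e^{2\pi if\tau''}R_{nn}(\tau'+\tau'') = \int_{-\infty}^{\infty}dt\,\Pi\left(t+\tau',\tfrac{T}{2}\right)e^{-2\pi if(t+\tau')}R_{nn}(t) ,$$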


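which can in turn be substituted into (3.75e) to get

$$\Lambda_2 = 2\,\mathrm{Re}\left(e^{2\pi ifT}\int_{-\infty}^{\infty}d\tau'\,\Pi\left(\tau',\tfrac{T}{2}\right)e^{-2\pi if\tau'}\int_{-\infty}^{\infty}dt\,\Pi\left(t+\tau',\tfrac{T}{2}\right)e^{-2\pi if(t+\tau')}R_{nn}(t)\right) .$$

Interchanging the order of integration and replacing the variable $\tau'$ by $t'$, we end up with

$$\Lambda_2 = 2\,\mathrm{Re}\left(e^{2\pi ifT}\int_{-\infty}^{\infty}dt\,R_{nn}(t)e^{-2\pi ift}\int_{-\infty}^{\infty}dt'\,\Pi\left(t',\tfrac{T}{2}\right)\Pi\left(t+t',\tfrac{T}{2}\right)e^{-4\pi ift'}\right) . \tag{3.75g}$$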

Comparing (3.75g) with (3.67e), we note that the double integral in the formula for $\Lambda_2$ can be written as

$$\int_{-\infty}^{\infty}dt\,R_{nn}(t)e^{-2\pi ift}\int_{-\infty}^{\infty}dt'\,\Pi\left(t',\tfrac{T}{2}\right)\Pi\left(t+t',\tfrac{T}{2}\right)e^{-4\pi ift'} = \Lambda_{T/2}$$

with the understanding that the random function is now $n(t)$ instead of $N(t)$ as in Eq. (3.67e). This leads to a simpler (well, shorter) formula for $\Lambda_2$,

$$\Lambda_2 = 2\,\mathrm{Re}\left(e^{2\pi ifT}\,\Lambda_{T/2}\right) . \tag{3.75h}$$

We have already found the appropriate approximation for $\Lambda_T$ and $\Lambda_{T/2}$ when $T$ and $T/2$ are large enough to make the sinc functions oscillate rapidly with $f$ compared to the noise-power spectrum. Hence, we now apply formula (3.68j) to (3.75h), which gives, after remembering to replace $N$ by $n$ and $T$ by $T/2$,

$$\Lambda_2 \cong 2\,\mathrm{Re}\left(e^{2\pi ifT}\,[T\,\mathrm{sinc}(2\pi fT)]\,S_{nn}(f)\right) .$$

Since

$$e^{2\pi ifT} = \cos(2\pi fT) + i\sin(2\pi fT) ,$$

the formula for $\Lambda_2$ can be written as

$$\Lambda_2 \cong 2T\,[\cos(2\pi fT)\,\mathrm{sinc}(2\pi fT)]\,S_{nn}(f) . \tag{3.75i}$$

Having found good approximations for $\Lambda_1$ and $\Lambda_2$, we can substitute (3.75c) and (3.75i) into (3.73e) to get

$$\mathrm{E}\bigl(|N_{TE}(f)|^2\bigr) \cong 2T\,S_{nn}(f) + 2T\,[\cos(2\pi fT)\,\mathrm{sinc}(2\pi fT)]\,S_{nn}(f)$$

or

$$\mathrm{E}\bigl(|N_{TE}(f)|^2\bigr) \cong 2T\,S_{nn}(f)\,[1 + \cos(2\pi fT)\,\mathrm{sinc}(2\pi fT)] . \tag{3.76a}$$

For large values of $T$, so that

$$fT \gg 1 , \tag{3.76b}$$

we know that [apply Eq. (3.66e)]

$$|\cos(2\pi fT)\,\mathrm{sinc}(2\pi fT)| = \frac{|\cos(2\pi fT)\sin(2\pi fT)|}{2\pi fT} \leq \frac{1}{2\pi fT} \ll 1$$

because (i) the absolute value of the product of the sine and cosine must always be less than or equal to one and (ii) the value of $1/(2\pi fT)$ must be small when $fT$ is large. The formula in (3.76a) now simplifies to

$$\mathrm{E}\bigl(|N_{TE}(f)|^2\bigr) \cong 2T\,S_{nn}(f) . \tag{3.76c}$$
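The result for artificially mirrored noise can also be illustrated with a Monte Carlo sketch. As an assumption made only for this example, the noise is modeled as independent Gaussian samples with standard deviation `sigma` at spacing `dt` (taken to have a flat spectrum at level `sigma**2 * dt`), and the positive-time samples are mirrored onto negative times; `mean_square_mirrored` is a hypothetical helper name and all numerical values are arbitrary.

```python
import math
import random

def sinc(x):
    """sinc(x) = sin(x)/x as in Eq. (3.66e), with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(x) / x

random.seed(2)
T, dt, sigma = 2.0, 0.02, 1.0
M = int(round(T / dt))
S_white = sigma ** 2 * dt     # assumed flat spectrum level of the sampled noise

def mean_square_mirrored(f, trials=4000):
    """Monte Carlo estimate of E(|N_TE(f)|^2) when n_E(t) = n(|t|) on |t| <= T."""
    # For an even n_E, the transform reduces to n(0) dt + 2 * sum_k n(t_k) cos(2 pi f t_k) dt.
    weights = [1.0] + [2.0 * math.cos(2.0 * math.pi * f * k * dt) for k in range(1, M + 1)]
    total = 0.0
    for _ in range(trials):
        acc = sum(random.gauss(0.0, sigma) * w for w in weights) * dt
        total += acc * acc
    return total / trials

f = 0.3
estimate = mean_square_mirrored(f)
x = 2.0 * math.pi * f * T
predicted = 2.0 * T * S_white * (1.0 + math.cos(x) * sinc(x))   # Eq. (3.76a)
print(estimate, predicted, 2.0 * T * S_white)   # last value is the large-fT limit
```

At this modest value of $fT$ the correction term in the square brackets is still visible (roughly a 13 percent increase over $2TS_{nn}$); it fades as $fT$ grows.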

This will be a useful approximation to know when analyzing detector noise in Chapter 6.

__________

The basic concepts introduced in this chapter, such as random variables and functions, the autocorrelation function, the noise-power spectrum, stationarity and ergodicity, may not be as important as the Fourier theory covered in Chapter 2, but they turn up over and over again in the following pages. The Wiener-Khinchin theorem is used to transform electromagnetic wavefields into the spectral radiances that Michelson interferometers are built to measure. Stationary random functions are added to interference signals to represent what happens when the interference signals become contaminated by noise. The expectation operator E is applied to the products of random quantities to turn them into autocorrelation functions, and the autocorrelation functions are then transformed into noise-power spectra in formulas for the random-measurement error. This chapter has explained the statistical ideas behind these procedures, and the context in which the ideas arise, to show what the formulas mean and why they make sense.

4
FROM MAXWELL'S EQUATIONS TO
THE MICHELSON INTERFEROMETER
The interference formulas for a highly idealized version of the standard Michelson interferometer can be derived in a page or two, and that is what is done in most textbooks. Section 1.5 of Chapter 1 lays out the basic approach of this derivation, pointing out that all we really need is the 19th-century ether-wave theory of light because a full knowledge of Maxwell's equations is not required. Afterwards, these ideal interference formulas can, with some difficulty and an appeal to ad hoc arguments, be modified to handle the measurement errors and distortions present in nonideal instruments, but this is difficult to do in a straightforward and convincing way. Consequently, in this chapter we prefer to start with first principles, carefully tracing the plane-wave solutions to Maxwell's equations through the standard Michelson interferometer and then applying the Fourier methodology and random-signal theory explained in the previous two chapters to describe the electromagnetic wavefields leaving the instrument. Although longer than the standard textbook procedure, this approach leads naturally to detailed formulas describing what happens when the optical setup is slightly misaligned, what happens when the input radiation is polarized, and what happens when the interferometer measures an input spectrum that is nonuniform over its field of view. We do this both for the interferometer's balanced interference signal and its unbalanced background signal, explaining first the reasoning behind the formulas for the balanced input signal and then showing how the same sort of analysis produces similar formulas for the unbalanced background signal. At the end of this process, the reader has a detailed understanding of how the formulas describing ideal Michelson interferometers should be modified and expanded to describe nonideal instruments in an imperfect world.
4.1 Deriving the Electromagnetic Wave Equations
In SI units, Maxwell's equations for empty space are

$$\vec{\nabla}\times\vec{B} = \mu_o\varepsilon_o\frac{\partial\vec{E}}{\partial t} , \tag{4.1a}$$

$$\vec{\nabla}\times\vec{E} = -\frac{\partial\vec{B}}{\partial t} , \tag{4.1b}$$

$$\vec{\nabla}\cdot\vec{E} = 0 , \tag{4.1c}$$

and

$$\vec{\nabla}\cdot\vec{B} = 0 \tag{4.1d}$$

where

$$\mu_o = 4\pi\times 10^{-7}\ \text{henry/meter}$$

and

$$\varepsilon_o = \frac{1}{\mu_o c^2} . \tag{4.1e}$$
In these equations, $\vec{E}$ is the electric field, which is a function of position and time; $\vec{B}$ is the magnetic-induction field, which is also a function of position and time; $t$ is the time coordinate; $\mu_o$ is the magnetic permeability of free space; $\varepsilon_o$ is the permittivity of free space; $c$ is the velocity of light; and $\vec{\nabla}$ is the standard vector-derivative del operator [see Eq. (4A.7a) in Appendix 4A for a definition]. We take the curl of both sides in Eqs. (4.1a) and (4.1b) to get

$$\vec{\nabla}\times(\vec{\nabla}\times\vec{B}) = \mu_o\varepsilon_o\frac{\partial}{\partial t}[\vec{\nabla}\times\vec{E}] \tag{4.2a}$$

and

$$\vec{\nabla}\times(\vec{\nabla}\times\vec{E}) = -\frac{\partial}{\partial t}[\vec{\nabla}\times\vec{B}] . \tag{4.2b}$$

But for any vector field $\vec{v}$, we have the identity

$$\vec{\nabla}\times[\vec{\nabla}\times\vec{v}] = \vec{\nabla}(\vec{\nabla}\cdot\vec{v}) - \nabla^2\vec{v} . \tag{4.2c}$$

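Substitution of (4.2c) into (4.2a) and (4.2b) gives

$$\vec{\nabla}(\vec{\nabla}\cdot\vec{B}) - \nabla^2\vec{B} = \mu_o\varepsilon_o\frac{\partial}{\partial t}(\vec{\nabla}\times\vec{E}) ,$$

$$\vec{\nabla}(\vec{\nabla}\cdot\vec{E}) - \nabla^2\vec{E} = -\frac{\partial}{\partial t}(\vec{\nabla}\times\vec{B}) ,$$

or

$$\nabla^2\vec{B} - \mu_o\varepsilon_o\frac{\partial^2\vec{B}}{\partial t^2} = 0 ,$$

$$\nabla^2\vec{E} - \mu_o\varepsilon_o\frac{\partial^2\vec{E}}{\partial t^2} = 0 ,$$

where we have used $\vec{\nabla}\cdot\vec{B} = \vec{\nabla}\cdot\vec{E} = 0$ from (4.1c) and (4.1d) and

$$\vec{\nabla}\times\vec{E} = -\frac{\partial\vec{B}}{\partial t} , \qquad \vec{\nabla}\times\vec{B} = \mu_o\varepsilon_o\frac{\partial\vec{E}}{\partial t}$$

from (4.1a) and (4.1b) to simplify our results. The substitution $\mu_o\varepsilon_o = c^{-2}$ from (4.1e) now gives

$$\nabla^2\vec{B} - \frac{1}{c^2}\frac{\partial^2\vec{B}}{\partial t^2} = 0 \tag{4.3a}$$

and

$$\nabla^2\vec{E} - \frac{1}{c^2}\frac{\partial^2\vec{E}}{\partial t^2} = 0 . \tag{4.3b}$$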
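A plane wave gives a simple test of the wave equations (4.3a) and (4.3b). The sketch below checks one Cartesian field component numerically with central differences; the units ($c = 1$), the wave vector `k`, the sample point, and the step size `h` are all illustrative assumptions, and `wave_residual` is a hypothetical helper name.

```python
import math

c = 1.0                                  # units with c = 1 (illustrative assumption)
k = (1.0, 2.0, 2.0)                      # hypothetical wave vector
omega = c * math.sqrt(sum(ki * ki for ki in k))   # plane-wave condition omega = c |k|

def E_x(x, y, z, t, w=omega):
    """One Cartesian field component of a plane wave, cos(k.r - w t)."""
    return math.cos(k[0] * x + k[1] * y + k[2] * z - w * t)

def wave_residual(x, y, z, t, w=omega, h=1e-3):
    """Central-difference estimate of laplacian(E_x) - (1/c^2) d^2 E_x / dt^2."""
    p = [x, y, z, t]
    f0 = E_x(*p, w=w)
    second = []
    for i in range(4):
        up, dn = list(p), list(p)
        up[i] += h
        dn[i] -= h
        second.append((E_x(*up, w=w) - 2.0 * f0 + E_x(*dn, w=w)) / h ** 2)
    return second[0] + second[1] + second[2] - second[3] / c ** 2

print(wave_residual(0.3, -0.2, 0.5, 0.1))                 # near zero
print(wave_residual(0.3, -0.2, 0.5, 0.1, w=2.0 * omega))  # clearly nonzero
```

The residual vanishes (up to finite-difference error) only when the angular frequency satisfies $\omega = c|\vec{k}|$; doubling the frequency without changing the wave vector leaves a residual of order $|\vec{k}|^2$.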

Equation (4.3a) is the wave equation for $\vec{B}$, the magnetic-induction field as a function of position and
time; and (4.3b) is the wave equation for $\vec{E}$, the electric field as a function of position and time.
Because $\vec{E}$ and $\vec{B}$ are vectors and the wave equation is usually applied to scalar fields, we
now rewrite Eqs. (4.3a) and (4.3b) as a collection of six scalar wave equations to show the
meaning of the two vector wave equations. The first step is to identify the $\vec{E}$ and $\vec{B}$ Cartesian
field components. Figure 4.1 specifies a three-dimensional Cartesian coordinate system for the $\vec{E}$
and $\vec{B}$ field vectors located at a single point P. We use the $\hat{x}$, $\hat{y}$, $\hat{z}$ unit vectors of the coordinate
system to write

\[ \vec{E} = \hat{x}E_x + \hat{y}E_y + \hat{z}E_z \tag{4.4a} \]
and
\[ \vec{B} = \hat{x}B_x + \hat{y}B_y + \hat{z}B_z , \tag{4.4b} \]

where, as shown in Fig. 4.1, $E_x$, $E_y$, $E_z$ are the real x, y, z components of the electric field and
$B_x$, $B_y$, $B_z$ are the real x, y, z components of the magnetic-induction field. Both $E_{x,y,z}$ and $B_{x,y,z}$
are, of course, functions of position and time. We define a position vector

\[ \vec{r} = x\hat{x} + y\hat{y} + z\hat{z} \tag{4.4c} \]

and show the dependence of the $\vec{E}$ and $\vec{B}$ fields on position and time by rewriting (4.4a) and
(4.4b) as

\[ \vec{E}(\vec{r},t) = \hat{x}E_x(\vec{r},t) + \hat{y}E_y(\vec{r},t) + \hat{z}E_z(\vec{r},t) \]
and
\[ \vec{B}(\vec{r},t) = \hat{x}B_x(\vec{r},t) + \hat{y}B_y(\vec{r},t) + \hat{z}B_z(\vec{r},t) . \]

[FIGURE 4.1. Two drawings of the x, y, z axes at point P (the same x, y, z coordinates in both): one shows only the $\vec{E}$ field and its x, y, z components, with $E_x < 0$, $E_y > 0$, $E_z > 0$; the other shows only the $\vec{B}$ field and its x, y, z components, with $B_x > 0$, $B_y > 0$, $B_z < 0$.]

This notation is best regarded as a shorthand for [see the discussion after Eq. (2.109d) in Sec.
2.25 of Chapter 2]

\[ \vec{E}(x,y,z,t) = \hat{x}E_x(x,y,z,t) + \hat{y}E_y(x,y,z,t) + \hat{z}E_z(x,y,z,t) \]
and
\[ \vec{B}(x,y,z,t) = \hat{x}B_x(x,y,z,t) + \hat{y}B_y(x,y,z,t) + \hat{z}B_z(x,y,z,t) . \]

For any vector $\vec{v}$ we have, according to Eq. (4A.11c) in Appendix 4A,

\[ \nabla^2\vec{v} = \hat{x}\nabla^2 v_x + \hat{y}\nabla^2 v_y + \hat{z}\nabla^2 v_z \]

where $v_x$, $v_y$, $v_z$ are the real x, y, z components of real vector $\vec{v}$. It follows that substitution of
Eqs. (4.4a) and (4.4b) into (4.3a) and (4.3b) gives six scalar wave equations, one for each
Cartesian component of the two vector equations (4.3a) and (4.3b):

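\[ \nabla^2 E_x - \frac{1}{c^2}\frac{\partial^2 E_x}{\partial t^2} = \frac{\partial^2 E_x}{\partial x^2} + \frac{\partial^2 E_x}{\partial y^2} + \frac{\partial^2 E_x}{\partial z^2} - \frac{1}{c^2}\frac{\partial^2 E_x}{\partial t^2} = 0 , \tag{4.5a} \]

\[ \nabla^2 E_y - \frac{1}{c^2}\frac{\partial^2 E_y}{\partial t^2} = \frac{\partial^2 E_y}{\partial x^2} + \frac{\partial^2 E_y}{\partial y^2} + \frac{\partial^2 E_y}{\partial z^2} - \frac{1}{c^2}\frac{\partial^2 E_y}{\partial t^2} = 0 , \tag{4.5b} \]

\[ \nabla^2 E_z - \frac{1}{c^2}\frac{\partial^2 E_z}{\partial t^2} = \frac{\partial^2 E_z}{\partial x^2} + \frac{\partial^2 E_z}{\partial y^2} + \frac{\partial^2 E_z}{\partial z^2} - \frac{1}{c^2}\frac{\partial^2 E_z}{\partial t^2} = 0 , \tag{4.5c} \]

\[ \nabla^2 B_x - \frac{1}{c^2}\frac{\partial^2 B_x}{\partial t^2} = \frac{\partial^2 B_x}{\partial x^2} + \frac{\partial^2 B_x}{\partial y^2} + \frac{\partial^2 B_x}{\partial z^2} - \frac{1}{c^2}\frac{\partial^2 B_x}{\partial t^2} = 0 , \tag{4.5d} \]

\[ \nabla^2 B_y - \frac{1}{c^2}\frac{\partial^2 B_y}{\partial t^2} = \frac{\partial^2 B_y}{\partial x^2} + \frac{\partial^2 B_y}{\partial y^2} + \frac{\partial^2 B_y}{\partial z^2} - \frac{1}{c^2}\frac{\partial^2 B_y}{\partial t^2} = 0 , \tag{4.5e} \]

\[ \nabla^2 B_z - \frac{1}{c^2}\frac{\partial^2 B_z}{\partial t^2} = \frac{\partial^2 B_z}{\partial x^2} + \frac{\partial^2 B_z}{\partial y^2} + \frac{\partial^2 B_z}{\partial z^2} - \frac{1}{c^2}\frac{\partial^2 B_z}{\partial t^2} = 0 . \tag{4.5f} \]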

Here, $\nabla^2 = \partial^2/\partial x^2 + \partial^2/\partial y^2 + \partial^2/\partial z^2$ is used to write these equations using explicit partial
derivatives with respect to x, y, and z. These six equations are just the scalar wave equation for $E_x$, $E_y$, $E_z$
and $B_x$, $B_y$, $B_z$. They are not really that difficult to solve when they have simple boundary
conditions. In fact, if at some time t the $\vec{E}$ and $\vec{B}$ electromagnetic fields are zero everywhere,
then the solution to these equations is the trivial one that the $\vec{E}$ and $\vec{B}$ fields remain identically
zero everywhere. If, however, at some time t there is a region of space where the fields are not
identically zero, then we expect nontrivial solutions having nonzero values of the $\vec{E}$ and $\vec{B}$
fields.
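As a rough numerical illustration of what the scalar wave equations (4.5a)-(4.5f) demand, the sketch below tests a trial field component of the form cos[2π·sigma·(z − speed·t) + phase] with central finite differences. The trial function, the step sizes, and the names `component` and `wave_residual` are assumptions made for this example only; the point is simply that the residual is small when the trial wave travels at c and is not small otherwise.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s

def component(z, t, speed, sigma=1.0e6, phase=0.3):
    """Trial field component cos(2*pi*sigma*(z - speed*t) + phase); sigma in 1/m."""
    return math.cos(2.0 * math.pi * sigma * (z - speed * t) + phase)

def wave_residual(z, t, speed, sigma=1.0e6):
    """Central-difference estimate of (d2u/dz2 - c^-2 d2u/dt2) / (2*pi*sigma)^2."""
    hz = 1.0 / (200.0 * sigma)        # spatial step: 1/200 of a wavelength
    ht = 1.0 / (300.0 * sigma * C)    # time step: 1/300 of a period at speed c
    u0 = component(z, t, speed, sigma)
    d2z = (component(z + hz, t, speed, sigma) - 2.0 * u0
           + component(z - hz, t, speed, sigma)) / hz ** 2
    d2t = (component(z, t + ht, speed, sigma) - 2.0 * u0
           + component(z, t - ht, speed, sigma)) / ht ** 2
    return (d2z - d2t / C ** 2) / (2.0 * math.pi * sigma) ** 2

good = wave_residual(0.0, 0.0, speed=C)        # trial wave traveling at c
bad = wave_residual(0.0, 0.0, speed=0.5 * C)   # trial wave traveling at c/2
print(f"normalized residual at speed c: {good:.2e}, at speed c/2: {bad:.2e}")
```

The residual is normalized by (2π·sigma)² so that a value of order one means the trial function fails the wave equation outright, while the small value obtained at speed c reflects only finite-difference truncation error.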
4.2 Electromagnetic Plane Waves
Equations (4.1a)-(4.1d), (4.3a), and (4.3b) contain five different differential operators: the
divergence ($\vec{\nabla}\cdot$), the curl ($\vec{\nabla}\times$), the Laplacian ($\nabla^2$), and the first and second partial derivatives
with respect to time ($\partial/\partial t$, $\partial^2/\partial t^2$). All five are real linear operators as defined in Appendix
4A. According to the discussion following Eqs. (4A.19a) and (4A.19b), we can therefore find real
solutions for $\vec{E}$ and $\vec{B}$ by first solving for them as complex vector fields and then, at the end,
taking their real parts to get the desired real solutions. Following this procedure, we begin looking
for complex solutions to (4.3a) and (4.3b) that have the form

\[ \vec{E}(\vec{r},t) = \sum_{\ell}\vec{E}_{\ell}(\vec{r})\,e^{-2\pi i f_{\ell} t} \tag{4.6a} \]
and
\[ \vec{B}(\vec{r},t) = \sum_{\ell}\vec{B}_{\ell}(\vec{r})\,e^{-2\pi i f_{\ell} t} , \tag{4.6b} \]

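where all the $f_{\ell}$ values are real and $\vec{E}_{\ell}$, $\vec{B}_{\ell}$ may be complex vector functions of position.
Substituting (4.6a) and (4.6b) into (4.3a) and (4.3b) shows that we end up with

\[ \sum_{\ell}\left[(\nabla^2\vec{E}_{\ell} + 4\pi^2\sigma_{\ell}^2\vec{E}_{\ell})\,e^{-2\pi i f_{\ell} t}\right] = 0 \]
and
\[ \sum_{\ell}\left[(\nabla^2\vec{B}_{\ell} + 4\pi^2\sigma_{\ell}^2\vec{B}_{\ell})\,e^{-2\pi i f_{\ell} t}\right] = 0 \]
if we define
\[ \sigma_{\ell} = \frac{f_{\ell}}{c} . \tag{4.7a} \]

The only way these sums can be identically zero for all times t is to set

\[ \nabla^2\vec{E}_{\ell} + 4\pi^2\sigma_{\ell}^2\vec{E}_{\ell} = 0 \tag{4.7b} \]
and
\[ \nabla^2\vec{B}_{\ell} + 4\pi^2\sigma_{\ell}^2\vec{B}_{\ell} = 0 \tag{4.7c} \]

for each value of $\ell$ in the sums. We next look for solutions

\[ \vec{E}_{\ell}(\vec{r}) = \sum_{j}\vec{E}_{\ell j}\,e^{2\pi i(\vec{k}_{\ell j}\cdot\vec{r})} \tag{4.8a} \]
and
\[ \vec{B}_{\ell}(\vec{r}) = \sum_{j}\vec{B}_{\ell j}\,e^{2\pi i(\vec{k}_{\ell j}\cdot\vec{r})} , \tag{4.8b} \]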

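where all the $\vec{k}_{\ell j}$ are constant, real, three-dimensional vectors and $\vec{E}_{\ell j}$, $\vec{B}_{\ell j}$ are complex, constant,
three-dimensional vectors. In terms of the $\hat{x}$, $\hat{y}$, $\hat{z}$ unit vectors of Fig. 4.1,

\[ \vec{k}_{\ell j} = \hat{x}k_{\ell jx} + \hat{y}k_{\ell jy} + \hat{z}k_{\ell jz} , \]

so that, substituting from Eq. (4.4c),

\[ \vec{k}_{\ell j}\cdot\vec{r} = \vec{r}\cdot\vec{k}_{\ell j} = xk_{\ell jx} + yk_{\ell jy} + zk_{\ell jz} . \]

From Eq. (4A.12a) of Appendix 4A,

\[
\begin{aligned}
\nabla^2\vec{E}_{\ell}(\vec{r}) &= \nabla^2\sum_{j}\vec{E}_{\ell j}\,e^{2\pi i(\vec{k}_{\ell j}\cdot\vec{r})}
 = \sum_{j}\vec{E}_{\ell j}\,\nabla^2 e^{2\pi i(xk_{\ell jx}+yk_{\ell jy}+zk_{\ell jz})} \\
&= \sum_{j}\vec{E}_{\ell j}\left[\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right]e^{2\pi i(xk_{\ell jx}+yk_{\ell jy}+zk_{\ell jz})} \\
&= -4\pi^2\sum_{j}\left(k_{\ell jx}^2 + k_{\ell jy}^2 + k_{\ell jz}^2\right)\vec{E}_{\ell j}\,e^{2\pi i(\vec{k}_{\ell j}\cdot\vec{r})}
 = -4\pi^2\sum_{j}|\vec{k}_{\ell j}|^2\,\vec{E}_{\ell j}\,e^{2\pi i(\vec{k}_{\ell j}\cdot\vec{r})}
\end{aligned}
\]

and similarly,

\[ \nabla^2\vec{B}_{\ell}(\vec{r}) = -4\pi^2\sum_{j}|\vec{k}_{\ell j}|^2\,\vec{B}_{\ell j}\,e^{2\pi i(\vec{k}_{\ell j}\cdot\vec{r})} . \]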

Substitution of these two results and Eqs. (4.8a) and (4.8b) into (4.7b) and (4.7c) gives

\[ \sum_{j}\vec{E}_{\ell j}\,e^{2\pi i(\vec{k}_{\ell j}\cdot\vec{r})}\left(\sigma_{\ell}^2 - |\vec{k}_{\ell j}|^2\right) = 0 \tag{4.9a} \]
and
\[ \sum_{j}\vec{B}_{\ell j}\,e^{2\pi i(\vec{k}_{\ell j}\cdot\vec{r})}\left(\sigma_{\ell}^2 - |\vec{k}_{\ell j}|^2\right) = 0 . \tag{4.9b} \]

This can be true over all values of $\vec{r}$ with nonzero values of $\vec{E}_{\ell j}$ and $\vec{B}_{\ell j}$ only when

\[ |\vec{k}_{\ell j}|^2 = \sigma_{\ell}^2 \tag{4.9c} \]

for all values of $\ell$ and j. Equation (4.9c) requires the real vector $\vec{k}_{\ell j}$ to have a magnitude
$|\vec{k}_{\ell j}| = |\sigma_{\ell}|$ that depends only on index $\ell$. This suggests that the j index specifies the different
directions taken on by the $\vec{k}_{\ell j}$ vectors, giving

\[ \vec{k}_{\ell j} = |\sigma_{\ell}|\,\hat{\kappa}_{\ell j} . \]

Here $\hat{\kappa}_{\ell j}$ is a dimensionless unit vector, called the propagation vector, which for a specified
value of $\ell$ points in different directions for different values of j. In fact, nothing stops us from
assuming that the $\hat{\kappa}_{\ell j}$ propagation vectors range over the same (indefinitely large) set of j
directions for each $\ell$ value; if we want to leave out some j direction for a given $\ell$, we can always
remove those directions by making both $\vec{E}_{\ell j}$ and $\vec{B}_{\ell j}$ zero for the unwanted values of $\ell$ and j. We
can thus write

\[ \vec{k}_{\ell j} = |\sigma_{\ell}|\,\hat{\kappa}_{j} . \tag{4.9d} \]

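Substitution of (4.8a), (4.8b), (4.7a), and (4.9d) into (4.6a) and (4.6b) gives

\[ \vec{E}(\vec{r},t) = \sum_{\ell}\sum_{j}\vec{E}_{\ell j}\,e^{2\pi i(|\sigma_{\ell}|\hat{\kappa}_{j}\cdot\vec{r} - \sigma_{\ell}ct)} \tag{4.10a} \]
and
\[ \vec{B}(\vec{r},t) = \sum_{\ell}\sum_{j}\vec{B}_{\ell j}\,e^{2\pi i(|\sigma_{\ell}|\hat{\kappa}_{j}\cdot\vec{r} - \sigma_{\ell}ct)} . \tag{4.10b} \]

The phase term in Eqs. (4.10a) and (4.10b) is

\[ 2\pi|\sigma_{\ell}|(\hat{\kappa}_{j}\cdot\vec{r} - ct) \quad\text{if}\quad \sigma_{\ell}/|\sigma_{\ell}| = 1 \]
and
\[ 2\pi|\sigma_{\ell}|(\hat{\kappa}_{j}\cdot\vec{r} + ct) \quad\text{if}\quad \sigma_{\ell}/|\sigma_{\ell}| = -1 . \]

When $\sigma_{\ell}/|\sigma_{\ell}| = 1$, Eq. (4.9c) has been solved with $|\vec{k}_{\ell j}| = \sigma_{\ell} > 0$;
and when $\sigma_{\ell}/|\sigma_{\ell}| = -1$, Eq. (4.9c) has been solved with $|\vec{k}_{\ell j}| = -\sigma_{\ell} > 0$.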

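Figure 4.2 shows that the choice made here is to have the planes of constant phase move in the direction of $\hat{\kappa}_{j}$
as time increases, hence the solution to (4.9c) is chosen to be

\[ |\vec{k}_{\ell j}| = \sigma_{\ell} > 0 \tag{4.10c} \]
and Eqs. (4.10a) and (4.10b) become

\[ \vec{E}(\vec{r},t) = \sum_{\ell}\sum_{j}\vec{E}_{\ell j}\,e^{2\pi i\sigma_{\ell}(\hat{\kappa}_{j}\cdot\vec{r} - ct)} \tag{4.11a} \]
and
\[ \vec{B}(\vec{r},t) = \sum_{\ell}\sum_{j}\vec{B}_{\ell j}\,e^{2\pi i\sigma_{\ell}(\hat{\kappa}_{j}\cdot\vec{r} - ct)} . \tag{4.11b} \]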

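The next section explains why these double sums are called electromagnetic plane waves.
We define

\[ \hat{\kappa}_{j} = \hat{x}\kappa_{jx} + \hat{y}\kappa_{jy} + \hat{z}\kappa_{jz} \tag{4.12a} \]

so that $\kappa_{jx}$, $\kappa_{jy}$, $\kappa_{jz}$ are the direction cosines of $\hat{\kappa}_{j}$ with respect to the x, y, z axes shown in Fig.
4.3,

\[ \kappa_{jx} = \hat{\kappa}_{j}\cdot\hat{x} = \cos(\theta_{jx}) , \quad \kappa_{jy} = \hat{\kappa}_{j}\cdot\hat{y} = \cos(\theta_{jy}) , \quad \kappa_{jz} = \hat{\kappa}_{j}\cdot\hat{z} = \cos(\theta_{jz}) . \tag{4.12b} \]

The standard relationship between direction cosines (that the sum of their squares is one) is the
same as the requirement that $\hat{\kappa}_{j}$ have unit length,

\[ \cos^2\theta_{jx} + \cos^2\theta_{jy} + \cos^2\theta_{jz} = \kappa_{jx}^2 + \kappa_{jy}^2 + \kappa_{jz}^2 = 1 . \tag{4.12c} \]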
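A short numerical sketch of Eqs. (4.12a)-(4.12c) follows. It builds a unit propagation vector from a polar angle and an azimuth (this spherical-angle parameterization and the name `propagation_vector` are assumptions of the example, not notation from the text), recovers the three axis angles from the direction cosines, and checks that the squares sum to one.

```python
import math

def propagation_vector(polar_deg, azimuth_deg):
    """Unit vector from a polar angle (measured from +z) and an azimuth (from +x toward +y)."""
    polar = math.radians(polar_deg)
    azimuth = math.radians(azimuth_deg)
    return (math.sin(polar) * math.cos(azimuth),
            math.sin(polar) * math.sin(azimuth),
            math.cos(polar))

kappa = propagation_vector(35.0, 110.0)

# Each Cartesian component is the cosine of the angle between kappa and that axis.
axis_angles_deg = [math.degrees(math.acos(component)) for component in kappa]

# Sum of the squared direction cosines, which should equal one for a unit vector.
sum_of_squares = sum(component * component for component in kappa)
print(axis_angles_deg, sum_of_squares)
```

Because the polar angle is measured from the z axis, the third entry of `axis_angles_deg` simply returns the polar angle that was supplied.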

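Although we have chosen $\vec{E}$ and $\vec{B}$ to satisfy the vector wave equations (4.3a) and (4.3b),
they must also satisfy the full set of Maxwell conditions, Eqs. (4.1a)-(4.1d). Substituting (4.11a)
into (4.1c) gives, using Eq. (4A.12b) from Appendix 4A,

\[ \vec{\nabla}\cdot\Big[\sum_{\ell}\sum_{j}\vec{E}_{\ell j}\,e^{2\pi i\sigma_{\ell}(\hat{\kappa}_{j}\cdot\vec{r} - ct)}\Big] = \sum_{\ell}\sum_{j}\vec{E}_{\ell j}\cdot\vec{\nabla}\big[e^{2\pi i\sigma_{\ell}(\hat{\kappa}_{j}\cdot\vec{r} - ct)}\big] = 0 . \tag{4.13a} \]

Simplifying the gradient gives

\[
\begin{aligned}
\vec{\nabla}\big[e^{2\pi i\sigma_{\ell}(\hat{\kappa}_{j}\cdot\vec{r} - ct)}\big]
&= \left(\hat{x}\frac{\partial}{\partial x} + \hat{y}\frac{\partial}{\partial y} + \hat{z}\frac{\partial}{\partial z}\right)e^{2\pi i\sigma_{\ell}(x\kappa_{jx} + y\kappa_{jy} + z\kappa_{jz} - ct)} \\
&= \left[\hat{x}(2\pi i\sigma_{\ell}\kappa_{jx}) + \hat{y}(2\pi i\sigma_{\ell}\kappa_{jy}) + \hat{z}(2\pi i\sigma_{\ell}\kappa_{jz})\right]e^{2\pi i\sigma_{\ell}(\hat{\kappa}_{j}\cdot\vec{r} - ct)} \\
&= 2\pi i\sigma_{\ell}\,\hat{\kappa}_{j}\,e^{2\pi i\sigma_{\ell}(\hat{\kappa}_{j}\cdot\vec{r} - ct)} .
\end{aligned}
\tag{4.13b}
\]

[FIGURE 4.2. The x, y, z axes and the unit vector $\hat{\kappa}_{j}$. The planes of constant phase are specified by $\hat{\kappa}_{j}\cdot\vec{r} = ct = \text{constant}$, with each value of ct specifying a different plane perpendicular to $\hat{\kappa}_{j}$.]

[FIGURE 4.3. The x, y, z axes and the unit vector $\hat{\kappa}_{j}$, showing the angles $\theta_{jx}$, $\theta_{jy}$, $\theta_{jz}$ between $\hat{\kappa}_{j}$ and the three axes.]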

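Hence, Eq. (4.13a) becomes

\[ \sum_{\ell}\sum_{j}2\pi i\sigma_{\ell}\,(\hat{\kappa}_{j}\cdot\vec{E}_{\ell j})\,e^{2\pi i\sigma_{\ell}(\hat{\kappa}_{j}\cdot\vec{r} - ct)} = 0 . \tag{4.14a} \]

Similarly, substituting (4.11b) into (4.1d) and simplifying gives

\[ \sum_{\ell}\sum_{j}2\pi i\sigma_{\ell}\,(\hat{\kappa}_{j}\cdot\vec{B}_{\ell j})\,e^{2\pi i\sigma_{\ell}(\hat{\kappa}_{j}\cdot\vec{r} - ct)} = 0 . \tag{4.14b} \]

The only way (4.14a) and (4.14b) can hold true for all values of $\vec{r}$ and t with nonzero $\sigma_{\ell}$ is to
require

\[ \hat{\kappa}_{j}\cdot\vec{E}_{\ell j} = 0 \tag{4.14c} \]
and
\[ \hat{\kappa}_{j}\cdot\vec{B}_{\ell j} = 0 \tag{4.14d} \]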

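for all values of $\ell$ and j. Working next with Eq. (4.1a), we substitute (4.11a) and (4.11b) to get

\[ \vec{\nabla}\times\Big[\sum_{\ell}\sum_{j}\vec{B}_{\ell j}\,e^{2\pi i\sigma_{\ell}(\hat{\kappa}_{j}\cdot\vec{r} - ct)}\Big] = \mu_o\varepsilon_o\frac{\partial}{\partial t}\Big[\sum_{\ell}\sum_{j}\vec{E}_{\ell j}\,e^{2\pi i\sigma_{\ell}(\hat{\kappa}_{j}\cdot\vec{r} - ct)}\Big] \]

which becomes, using Eq. (4A.12c) in Appendix 4A,

\[ \sum_{\ell}\sum_{j}\vec{\nabla}\big[e^{2\pi i\sigma_{\ell}(\hat{\kappa}_{j}\cdot\vec{r} - ct)}\big]\times\vec{B}_{\ell j} = \mu_o\varepsilon_o\sum_{\ell}\sum_{j}(-2\pi i\sigma_{\ell}c)\,\vec{E}_{\ell j}\big[e^{2\pi i\sigma_{\ell}(\hat{\kappa}_{j}\cdot\vec{r} - ct)}\big] . \]

Substituting from Eq. (4.13b) and using $\mu_o\varepsilon_o = c^{-2}$ [see Eq. (4.1e)] gives

\[ \sum_{\ell}\sum_{j}2\pi i\sigma_{\ell}\,e^{2\pi i\sigma_{\ell}(\hat{\kappa}_{j}\cdot\vec{r} - ct)}\left[\hat{\kappa}_{j}\times\vec{B}_{\ell j} + \frac{1}{c}\vec{E}_{\ell j}\right] = 0 . \tag{4.15a} \]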
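The only way this can be true for all $\vec{r}$ and t with nonzero $\sigma_{\ell}$ is if

\[ c\,(\hat{\kappa}_{j}\times\vec{B}_{\ell j}) = -\vec{E}_{\ell j} \tag{4.15b} \]

for all values of $\ell$ and j. Similarly, substitution of (4.11a) and (4.11b) into (4.1b) gives

\[ \sum_{\ell}\sum_{j}2\pi i\sigma_{\ell}\,e^{2\pi i\sigma_{\ell}(\hat{\kappa}_{j}\cdot\vec{r} - ct)}\left[-\hat{\kappa}_{j}\times\vec{E}_{\ell j} + c\vec{B}_{\ell j}\right] = 0 . \tag{4.15c} \]

The only way (4.15c) can hold true for all $\vec{r}$ and t with nonzero $\sigma_{\ell}$ is if

\[ \hat{\kappa}_{j}\times\vec{E}_{\ell j} = c\vec{B}_{\ell j} \tag{4.15d} \]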

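for all values of $\ell$ and j. It is not difficult to show that (4.15b) and (4.15d) are just different forms
of the same equation. Taking the cross product of the left-hand side of (4.15d) with $\hat{\kappa}_{j}$ gives,
using Eq. (4A.14) in Appendix 4A,

\[ \hat{\kappa}_{j}\times(\hat{\kappa}_{j}\times\vec{E}_{\ell j}) = \hat{\kappa}_{j}(\hat{\kappa}_{j}\cdot\vec{E}_{\ell j}) - \vec{E}_{\ell j}(\hat{\kappa}_{j}\cdot\hat{\kappa}_{j}) = -\vec{E}_{\ell j} , \]

where we use $\hat{\kappa}_{j}\cdot\vec{E}_{\ell j} = 0$ from Eq. (4.14c) and that $\hat{\kappa}_{j}\cdot\hat{\kappa}_{j} = 1$ because $\hat{\kappa}_{j}$ has unit length.
Therefore taking the cross product of both sides of (4.15d) with $\hat{\kappa}_{j}$ gives

\[ -\vec{E}_{\ell j} = \hat{\kappa}_{j}\times(c\vec{B}_{\ell j}) = c\,(\hat{\kappa}_{j}\times\vec{B}_{\ell j}) , \]

which is the same as Eq. (4.15b). We can also take the cross product of the left-hand side of
(4.15b) with $\hat{\kappa}_{j}$ and use $\hat{\kappa}_{j}\cdot\vec{B}_{\ell j} = 0$ from (4.14d) and Eq. (4A.14) in Appendix 4A to get

\[ \hat{\kappa}_{j}\times[c\,(\hat{\kappa}_{j}\times\vec{B}_{\ell j})] = -c\vec{B}_{\ell j} . \]

Taking the cross product of both the right-hand and left-hand sides of (4.15b) with $\hat{\kappa}_{j}$ now must
give

\[ -c\vec{B}_{\ell j} = \hat{\kappa}_{j}\times(-\vec{E}_{\ell j}) = -\hat{\kappa}_{j}\times\vec{E}_{\ell j} . \]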

This is the same formula as Eq. (4.15d). Hence, as stated above, the restrictions placed on $\hat{\kappa}_{j}$ and
the complex vectors $\vec{E}_{\ell j}$, $\vec{B}_{\ell j}$ in Eqs. (4.14c) and (4.14d) make (4.15b) and (4.15d) the same
equality. We see that the double sums shown in (4.11a) and (4.11b) lead to acceptable complex
solutions to the vector wave equations for $\vec{E}$ and $\vec{B}$ in (4.3a) and (4.3b); and when the
restrictions (4.14c), (4.14d), and either (4.15b) or (4.15d) are placed on $\hat{\kappa}_{j}$, $\vec{E}_{\ell j}$, and $\vec{B}_{\ell j}$, the
double sums also satisfy (4.1a)-(4.1d), Maxwell's equations for empty space. No limits are
placed on the size of these double sums. This means we can create two different double sums,
both matching the criteria of this section and so solving Maxwell's equations, and add them
together to get one big double sum matching the criteria of this section and solving Maxwell's
equations. In general we can add together any number of plane-wave solutions to Maxwell's
equations to create a new and larger collection of plane waves solving Maxwell's equations.
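The equivalence of (4.15b) and (4.15d) under the restrictions (4.14c) and (4.14d) can be checked numerically. The sketch below is only an illustration: the propagation vector (2/7, 3/7, 6/7), the complex starting vector, and the helper names `dot`, `cross`, and `scale` are arbitrary choices for the example. It projects the starting vector onto the plane perpendicular to the propagation vector to enforce (4.14c), defines the B coefficient from (4.15d), and then tests (4.14d) and (4.15b).

```python
C = 299_792_458.0  # speed of light in vacuum, m/s

def dot(a, b):
    """Dot product without complex conjugation, as used in the text."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product of two 3-component vectors (entries may be complex)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def scale(s, a):
    """Multiply every component of vector a by the scalar s."""
    return tuple(s * x for x in a)

kappa = (2.0 / 7.0, 3.0 / 7.0, 6.0 / 7.0)   # unit propagation vector (4 + 9 + 36 = 49)
raw = (1.0 + 2.0j, -0.5j, 0.25 + 0.0j)      # arbitrary complex starting vector

# Remove the component along kappa so that kappa . E = 0, Eq. (4.14c).
e_amp = tuple(r - dot(kappa, raw) * k for r, k in zip(raw, kappa))

# Define B from Eq. (4.15d): kappa x E = c B.
b_amp = scale(1.0 / C, cross(kappa, e_amp))

kappa_dot_e = dot(kappa, e_amp)               # should vanish, Eq. (4.14c)
kappa_dot_cb = dot(kappa, scale(C, b_amp))    # should vanish, Eq. (4.14d)

# Eq. (4.15b): c (kappa x B) = -E, so c (kappa x B) + E should vanish.
lhs = scale(C, cross(kappa, b_amp))
max_error = max(abs(l + e) for l, e in zip(lhs, e_amp))
print(abs(kappa_dot_e), abs(kappa_dot_cb), max_error)
```

All three printed numbers should be at the level of floating-point roundoff, which is the numerical counterpart of the statement that (4.15b) follows from (4.15d) once the coefficient vectors are perpendicular to the propagation vector.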
4.3 Monochromatic Wave Trains
To show why Eqs. (4.11a) and (4.11b) are called plane-wave sums, we focus attention on a single
component of the sums in Eqs. (4.11a) and (4.11b) by assuming there to be only one nonzero pair
of $\vec{E}_{\ell j}$, $\vec{B}_{\ell j}$ terms. Then the formulas for $\vec{E}(\vec{r},t)$ and $\vec{B}(\vec{r},t)$ in (4.11a) and (4.11b) become

\[ \vec{E}(\vec{r},t) = \vec{E}_{\ell j}\,e^{2\pi i\sigma_{\ell}(\hat{\kappa}_{j}\cdot\vec{r} - ct)} \tag{4.16a} \]
and
\[ \vec{B}(\vec{r},t) = \vec{B}_{\ell j}\,e^{2\pi i\sigma_{\ell}(\hat{\kappa}_{j}\cdot\vec{r} - ct)} \tag{4.16b} \]
with
\[ \hat{\kappa}_{j}\cdot\vec{E}_{\ell j} = \hat{\kappa}_{j}\cdot\vec{B}_{\ell j} = 0 \quad\text{and}\quad \vec{B}_{\ell j} = c^{-1}(\hat{\kappa}_{j}\times\vec{E}_{\ell j}) \tag{4.16c} \]

from (4.14c), (4.14d), and (4.15d). Although it is customary to leave wave formulas in complex
form, strictly speaking only the real parts (or imaginary parts, see discussion at end of Appendix
4A) of the right-hand sides of (4.16a) and (4.16b) provide acceptable physical solutions to wave
Eqs. (4.3a) and (4.3b). Since an x, y, z coordinate system has not yet been specified, nothing stops
us from choosing the z axis to be parallel to $\hat{\kappa}_{j}$; and because both $\hat{z}$ and $\hat{\kappa}_{j}$ are dimensionless,
real, unit-length vectors, we then have $\hat{\kappa}_{j} = \hat{z}$. Equations (4.14c) and (4.14d) now show that the
complex vectors $\vec{E}_{\ell j}$ and $\vec{B}_{\ell j}$ have zero z components, allowing us to write

\[ \vec{E}_{\ell j} = \hat{x}E_{\ell jx} + \hat{y}E_{\ell jy} \tag{4.17a} \]
and
\[ \vec{B}_{\ell j} = \hat{x}B_{\ell jx} + \hat{y}B_{\ell jy} \tag{4.17b} \]
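where $E_{\ell jx}$, $E_{\ell jy}$, $B_{\ell jx}$, $B_{\ell jy}$ are all complex numbers. Substituting into (4.15b) gives, using
$\hat{\kappa}_{j}\times\hat{x} = \hat{z}\times\hat{x} = \hat{y}$ and $\hat{\kappa}_{j}\times\hat{y} = \hat{z}\times\hat{y} = -\hat{x}$,

\[
\begin{aligned}
\hat{x}E_{\ell jx} + \hat{y}E_{\ell jy} &= -c\left[(\hat{\kappa}_{j}\times\hat{x})B_{\ell jx} + (\hat{\kappa}_{j}\times\hat{y})B_{\ell jy}\right] \\
&= -\hat{y}(cB_{\ell jx}) + \hat{x}(cB_{\ell jy}) ,
\end{aligned}
\tag{4.17c}
\]
which means that
\[ E_{\ell jx} = cB_{\ell jy} \tag{4.17d} \]
and
\[ E_{\ell jy} = -cB_{\ell jx} . \tag{4.17e} \]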

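If we write

\[ E_{\ell jx} = |E_{\ell jx}|\,e^{i\phi_{\ell jx}} \tag{4.18a} \]
and
\[ E_{\ell jy} = |E_{\ell jy}|\,e^{i\phi_{\ell jy}} \tag{4.18b} \]

using real phase terms $\phi_{\ell jx}$ and $\phi_{\ell jy}$ to describe the $E_{\ell jx}$, $E_{\ell jy}$ complex constants, it then follows
from (4.17d) and (4.17e), because c is real, that

\[ B_{\ell jy} = \frac{1}{c}|E_{\ell jx}|\,e^{i\phi_{\ell jx}} \tag{4.18c} \]
and
\[ B_{\ell jx} = -\frac{1}{c}|E_{\ell jy}|\,e^{i\phi_{\ell jy}} . \tag{4.18d} \]

Hence, (4.17a) and (4.17b) become

\[ \vec{E}_{\ell j} = \hat{x}|E_{\ell jx}|\,e^{i\phi_{\ell jx}} + \hat{y}|E_{\ell jy}|\,e^{i\phi_{\ell jy}} \tag{4.18e} \]
and
\[ \vec{B}_{\ell j} = -\hat{x}\frac{1}{c}|E_{\ell jy}|\,e^{i\phi_{\ell jy}} + \hat{y}\frac{1}{c}|E_{\ell jx}|\,e^{i\phi_{\ell jx}} , \tag{4.18f} \]

so that taking the real part of the right-hand sides of Eqs. (4.16a) and (4.16b) gives, using $\hat{\kappa}_{j} = \hat{z}$
and $e^{i\theta} = \cos\theta + i\sin\theta$,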
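\[
\begin{aligned}
\mathrm{Re}\big[\vec{E}_{\ell j}\,e^{2\pi i\sigma_{\ell}(\hat{z}\cdot\vec{r} - ct)}\big]
&= \mathrm{Re}\Big[\big(\hat{x}|E_{\ell jx}|e^{i\phi_{\ell jx}} + \hat{y}|E_{\ell jy}|e^{i\phi_{\ell jy}}\big)e^{2\pi i\sigma_{\ell}(z - ct)}\Big] \\
&= \hat{x}|E_{\ell jx}|\cos\big(2\pi\sigma_{\ell}(z - ct) + \phi_{\ell jx}\big) + \hat{y}|E_{\ell jy}|\cos\big(2\pi\sigma_{\ell}(z - ct) + \phi_{\ell jy}\big)
\end{aligned}
\tag{4.19a}
\]
and
\[
\begin{aligned}
\mathrm{Re}\big[\vec{B}_{\ell j}\,e^{2\pi i\sigma_{\ell}(\hat{z}\cdot\vec{r} - ct)}\big]
&= \mathrm{Re}\Big[\Big(-\hat{x}\frac{1}{c}|E_{\ell jy}|e^{i\phi_{\ell jy}} + \hat{y}\frac{1}{c}|E_{\ell jx}|e^{i\phi_{\ell jx}}\Big)e^{2\pi i\sigma_{\ell}(z - ct)}\Big] \\
&= -\hat{x}\frac{1}{c}|E_{\ell jy}|\cos\big(2\pi\sigma_{\ell}(z - ct) + \phi_{\ell jy}\big) + \hat{y}\frac{1}{c}|E_{\ell jx}|\cos\big(2\pi\sigma_{\ell}(z - ct) + \phi_{\ell jx}\big) .
\end{aligned}
\tag{4.19b}
\]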

When z is held constant, all the x and y components of the $\vec{E}$ and $\vec{B}$ fields in (4.19a) and (4.19b)
oscillate at the same frequency $f_{\ell} = c\sigma_{\ell}$. We can recognize what is going on by keeping z
constant and noting that if t increases (or decreases) by $1/(c\sigma_{\ell})$, then the phases of all the cosines
in Eqs. (4.19a) and (4.19b) decrease (or increase) by $2\pi$, leaving the fields unchanged. The wavefield specified in
(4.19a) and (4.19b) is a plane wavefield because the fields depend on position only through z, so every
point on a plane specified by z = constant has the same real $\vec{E}$ field and $\vec{B}$ field at all times t.
Figure 4.4 shows that when t is held constant in
Eqs. (4.19a) and (4.19b) and z increases (or decreases) in value by $1/\sigma_{\ell}$, the phases of all the
cosines increase (or decrease) by $2\pi$. Consequently, planes in Fig. 4.4 that are separated by
$1/\sigma_{\ell}$ have the same phase and thus the same real $\vec{E}$ and $\vec{B}$ fields. This distance is called the
wavelength $\lambda$ of the plane wavefield. Parameter $\sigma_{\ell}$ is called the wavenumber, already defined in
Eq. (1.7b) of Chapter 1 to be $1/\lambda$. The plane wave is called monochromatic because it is specified
by a single frequency $f_{\ell} = c\sigma_{\ell}$ and wavelength $\lambda$. Its wavenumber $\sigma_{\ell}$ is $1/\lambda$, so the equality

\[ f_{\ell} = c\sigma_{\ell} \tag{4.19c} \]

can now be interpreted as
\[ f\lambda = c , \]

the classic relationship between wavelength, frequency, and velocity for any wavefield. We
conclude that Eqs. (4.19a) and (4.19b) describe a wavefield traveling in the $\hat{\kappa}_{j} = \hat{z}$ direction at
velocity c, the speed of light.
This analysis obviously applies to any

\[ \vec{E}_{\ell j}\,e^{2\pi i\sigma_{\ell}(\hat{\kappa}_{j}\cdot\vec{r} - ct)} \quad\text{and}\quad \vec{B}_{\ell j}\,e^{2\pi i\sigma_{\ell}(\hat{\kappa}_{j}\cdot\vec{r} - ct)} \]

pair of terms from formulas (4.11a) and (4.11b). Since the pair of sums in (4.11a) and (4.11b) is a
general solution to the vector wave equations, this sort of general solution can now be interpreted
as a sum over an arbitrary collection of monochromatic plane waves characterized by different
wavenumbers and directions of propagation, where for each wavenumber $\sigma_{\ell}$, there is a unique
frequency $c\sigma_{\ell}$.

[FIGURE 4.4. The x, y, z axes with the unit vector $\hat{\kappa}_{j}$ along z, showing the $\vec{E}$ and $\vec{B}$ field vectors on planes perpendicular to the z axis; planes separated by $\Delta z = 1/\sigma_{\ell}$ carry the same fields.]
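The relations $f_{\ell} = c\sigma_{\ell}$ and $\lambda = 1/\sigma_{\ell}$ are easy to exercise numerically. The short sketch below uses an example wavenumber of 1000 cm⁻¹, chosen here purely for illustration, converts it to SI units, and checks that frequency times wavelength returns c; the function names are hypothetical.

```python
C = 299_792_458.0  # speed of light in vacuum, m/s

def frequency_from_wavenumber(sigma_per_m):
    """Frequency in Hz from wavenumber in 1/m, following Eq. (4.19c)."""
    return C * sigma_per_m

def wavelength_from_wavenumber(sigma_per_m):
    """Wavelength in m as the reciprocal of the wavenumber in 1/m."""
    return 1.0 / sigma_per_m

sigma = 1000.0 * 100.0                       # example value: 1000 cm^-1 expressed in m^-1
frequency = frequency_from_wavenumber(sigma)
wavelength = wavelength_from_wavenumber(sigma)
print(f"f = {frequency:.6e} Hz, wavelength = {wavelength:.3e} m, "
      f"f * wavelength = {frequency * wavelength:.1f} m/s")
```

Working in m⁻¹ rather than cm⁻¹ keeps the product of frequency and wavelength directly in m/s, which is why the conversion factor of 100 is applied once at the start.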
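From Eqs. (4.19a) and (4.19b), we get

\[
\begin{aligned}
\{\mathrm{Re}[\vec{E}_{\ell j}\,e^{2\pi i\sigma_{\ell}(\hat{z}\cdot\vec{r} - ct)}]\}\cdot\{\mathrm{Re}[\vec{B}_{\ell j}\,e^{2\pi i\sigma_{\ell}(\hat{z}\cdot\vec{r} - ct)}]\}
&= -\frac{1}{c}|E_{\ell jx}||E_{\ell jy}|\cos\big(2\pi\sigma_{\ell}(z - ct) + \phi_{\ell jx}\big)\cos\big(2\pi\sigma_{\ell}(z - ct) + \phi_{\ell jy}\big) \\
&\quad + \frac{1}{c}|E_{\ell jx}||E_{\ell jy}|\cos\big(2\pi\sigma_{\ell}(z - ct) + \phi_{\ell jx}\big)\cos\big(2\pi\sigma_{\ell}(z - ct) + \phi_{\ell jy}\big) = 0 ,
\end{aligned}
\tag{4.20}
\]

showing that the real $\vec{E}$ and $\vec{B}$ fields of a monochromatic plane wave are always perpendicular
to each other while they oscillate. From (4.17a), (4.17b), (4.17d), and (4.17e), we get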

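\[ \vec{E}_{\ell j}\cdot\vec{B}_{\ell j} = E_{\ell jx}B_{\ell jx} + E_{\ell jy}B_{\ell jy} = -\frac{1}{c}E_{\ell jx}E_{\ell jy} + \frac{1}{c}E_{\ell jy}E_{\ell jx} = 0 . \tag{4.21a} \]

It follows that in Eqs. (4.16a) and (4.16b)

\[ \vec{E}(\vec{r},t)\cdot\vec{B}(\vec{r},t) = (\vec{E}_{\ell j}\cdot\vec{B}_{\ell j})\,e^{4\pi i\sigma_{\ell}(z - ct)} = 0 . \tag{4.21b} \]

In this sense, we can say that the complex monochromatic plane wave $\vec{E}$ and $\vec{B}$ fields are also
perpendicular to each other. Another result worth deriving, again using Eqs. (4.17a), (4.17b),
(4.17d), and (4.17e), is that

\[
\begin{aligned}
\vec{E}_{\ell j}\times\vec{B}_{\ell j} &= (\hat{x}E_{\ell jx} + \hat{y}E_{\ell jy})\times(\hat{x}B_{\ell jx} + \hat{y}B_{\ell jy}) = \hat{z}[E_{\ell jx}B_{\ell jy}] - \hat{z}[E_{\ell jy}B_{\ell jx}] \\
&= [E_{\ell jx}(c^{-1}E_{\ell jx}) + E_{\ell jy}(c^{-1}E_{\ell jy})]\,\hat{z} \\
&= \frac{1}{c}(\vec{E}_{\ell j}\cdot\vec{E}_{\ell j})\,\hat{z} = \frac{1}{c}(\vec{E}_{\ell j}\cdot\vec{E}_{\ell j})\,\hat{\kappa}_{j} ,
\end{aligned}
\tag{4.21c}
\]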

where we use $\hat{x}\times\hat{x} = \hat{y}\times\hat{y} = 0$ and $\hat{x}\times\hat{y} = \hat{z} = \hat{\kappa}_{j}$. Vector identities that, like Eqs. (4.21a) and
(4.21c), can be written using only dot products and cross products, hold true in all (proper)
coordinate systems if they hold true in any one (proper) coordinate system.$^{55}$ Choosing a new
coordinate system where the $\hat{z}$ unit vector is not the same as the $\hat{\kappa}_{j}$ propagation vector is
geometrically equivalent to specifying a new direction for the propagation vector that is not
parallel to the original $\hat{z}$ unit vector. Since (4.21a) and (4.21c) use only dot and cross products,
they must also hold true in those coordinate systems where $\hat{\kappa}_{j}$ is not parallel to $\hat{z}$. Hence we can
conclude that Eqs. (4.21a) and (4.21c) must be obeyed when the $\ell$, j monochromatic plane wave
propagates in any direction, not just when it propagates parallel to the z axis. Therefore the
double sums over $\ell$ and j in Eqs. (4.11a) and (4.11b) must all have coefficients $\vec{E}_{\ell j}$ and $\vec{B}_{\ell j}$
satisfying Eqs. (4.21a) and (4.21c), with

\[ \vec{E}_{\ell j}\cdot\vec{B}_{\ell j} = 0 \tag{4.22a} \]
and
\[ \vec{E}_{\ell j}\times\vec{B}_{\ell j} = \frac{1}{c}(\vec{E}_{\ell j}\cdot\vec{E}_{\ell j})\,\hat{\kappa}_{j} . \tag{4.22b} \]

Similarly, the perpendicularity of the real, physical $\vec{E}$ and $\vec{B}$ fields as they oscillate in Eq. (4.20)
cannot be affected by the choice of coordinate system, which means the oscillating $\vec{E}$ and $\vec{B}$
fields stay perpendicular when $\hat{z}$ is not chosen parallel to $\hat{\kappa}_{j}$. Since, once again, this is
geometrically equivalent to specifying a new direction of propagation, we conclude that the real
oscillating $\vec{E}$ and $\vec{B}$ fields are perpendicular for all $\hat{\kappa}_{j}$ vectors; that is, they are perpendicular
no matter in what direction the wavefield propagates.
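The perpendicularity stated in Eq. (4.20) can be spot-checked by sampling the real fields of Eqs. (4.19a) and (4.19b) over one oscillation period. In the sketch below the amplitudes, phases, wavenumber, and the function name `real_fields` are arbitrary choices for the example. It also tracks the z component of the real cross product, which from (4.19a) and (4.19b) equals the squared magnitude of the real E field divided by c and therefore never points backward along z.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s

def real_fields(z, t, sigma, ex, ey, phx, phy):
    """Real E and B vectors of Eqs. (4.19a) and (4.19b) for propagation along +z."""
    cx = math.cos(2.0 * math.pi * sigma * (z - C * t) + phx)
    cy = math.cos(2.0 * math.pi * sigma * (z - C * t) + phy)
    e_field = (ex * cx, ey * cy, 0.0)
    b_field = (-ey * cy / C, ex * cx / C, 0.0)
    return e_field, b_field

sigma = 1.0e5                      # example wavenumber in 1/m
period = 1.0 / (C * sigma)         # one oscillation period, 1/(c*sigma)

worst_dot = 0.0                    # largest |c * (E . B)| seen over the period
smallest_cross_z = float("inf")    # smallest z component of c * (E x B)
for n in range(50):
    t = n * period / 50.0
    e, b = real_fields(0.0, t, sigma, ex=2.0, ey=0.7, phx=0.4, phy=1.9)
    e_dot_b = sum(p * q for p, q in zip(e, b))
    cross_z = e[0] * b[1] - e[1] * b[0]
    worst_dot = max(worst_dot, abs(C * e_dot_b))
    smallest_cross_z = min(smallest_cross_z, C * cross_z)
print(worst_dot, smallest_cross_z)
```

Both quantities are multiplied by c before being recorded so that they are compared on the scale of the electric-field amplitudes rather than the much smaller magnetic-induction values.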
4.4 Linear Polarization of Monochromatic Plane Waves
Equations (4.19a) and (4.19b) specify an acceptable monochromatic plane wave (that is, they
specify an acceptable term in the double-sum solutions in Eqs. (4.11a) and (4.11b)) no matter
what values are given to the real constants $|E_{\ell jx}|$, $|E_{\ell jy}|$, $\phi_{\ell jx}$, and $\phi_{\ell jy}$. If we again use a Cartesian
coordinate system with $\hat{\kappa}_{j} = \hat{z}$ and choose $|E_{\ell jy}| = 0$, then from Eqs. (4.18e) and (4.18a) we get

\[ \vec{E}_{\ell j} = \hat{x}|E_{\ell jx}|\,e^{i\phi_{\ell jx}} = \hat{x}E_{\ell jx} . \tag{4.23a} \]

$^{55}$ The cross product is invariant only if the coordinate systems are always chosen to be left-handed or always chosen to be right-handed.
This book uses right-handed coordinate systems, sometimes referred to as proper coordinate systems, where the $\hat{x}$,
$\hat{y}$, $\hat{z}$ vectors are always chosen so that $\hat{z} = \hat{x}\times\hat{y}$.
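Since $|E_{\ell jy}| = 0$, Eqs. (4.18f) and (4.18a) give

\[ \vec{B}_{\ell j} = \hat{y}\frac{1}{c}|E_{\ell jx}|\,e^{i\phi_{\ell jx}} = \hat{y}\frac{1}{c}E_{\ell jx} . \tag{4.23b} \]
Setting $|E_{\ell jy}| = 0$ in Eqs. (4.19a) and (4.19b) now leads to

\[ \mathrm{Re}\big[\vec{E}_{\ell j}\,e^{2\pi i\sigma_{\ell}(\hat{z}\cdot\vec{r} - ct)}\big] = \hat{x}|E_{\ell jx}|\cos\big(2\pi\sigma_{\ell}(z - ct) + \phi_{\ell jx}\big) \tag{4.23c} \]
and
\[ \mathrm{Re}\big[\vec{B}_{\ell j}\,e^{2\pi i\sigma_{\ell}(\hat{z}\cdot\vec{r} - ct)}\big] = \hat{y}\frac{1}{c}|E_{\ell jx}|\cos\big(2\pi\sigma_{\ell}(z - ct) + \phi_{\ell jx}\big) . \tag{4.23d} \]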

Equations (4.23a)-(4.23d) describe a plane wave whose real electric-field vector always points
strictly along the x axis and whose real magnetic-induction vector always points strictly along the
y axis. Characterizing this wave by the direction of the electric-field vector, we call it linearly
polarized along the x axis, or x-polarized for short (see Fig. 4.5). Equation (4.23a) shows that in
an x-polarized plane wave the complex vector $\vec{E}_{\ell j}$ is the $\hat{x}$ unit vector multiplied by a complex
constant $E_{\ell jx}$ which, of course, means that in (4.23b) the complex vector $\vec{B}_{\ell j}$ must be the $\hat{y}$
unit vector multiplied by the complex constant $E_{\ell jx}/c$.
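To get a monochromatic plane wave that is linearly polarized in the y direction, we choose
$|E_{\ell jx}| = 0$. Then, repeating the analysis used to find Eqs. (4.23a)-(4.23d), we have

\[ \vec{E}_{\ell j} = \hat{y}|E_{\ell jy}|\,e^{i\phi_{\ell jy}} = \hat{y}E_{\ell jy} , \tag{4.24a} \]

\[ \vec{B}_{\ell j} = -\hat{x}\frac{1}{c}|E_{\ell jy}|\,e^{i\phi_{\ell jy}} = -\hat{x}\frac{1}{c}E_{\ell jy} , \tag{4.24b} \]

\[ \mathrm{Re}\big[\vec{E}_{\ell j}\,e^{2\pi i\sigma_{\ell}(\hat{z}\cdot\vec{r} - ct)}\big] = \hat{y}|E_{\ell jy}|\cos\big(2\pi\sigma_{\ell}(z - ct) + \phi_{\ell jy}\big) , \tag{4.24c} \]
and
\[ \mathrm{Re}\big[\vec{B}_{\ell j}\,e^{2\pi i\sigma_{\ell}(\hat{z}\cdot\vec{r} - ct)}\big] = -\hat{x}\frac{1}{c}|E_{\ell jy}|\cos\big(2\pi\sigma_{\ell}(z - ct) + \phi_{\ell jy}\big) . \tag{4.24d} \]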

The monochromatic plane wave described by Eqs. (4.24a)-(4.24d) has an electric-field vector
that always points along the y axis and a magnetic induction vector that always points along the
x axis (see Fig. 4.6). Equation (4.24a) shows that y polarization can be recognized by noting that
the complex vector $\vec{E}_{\ell j}$ is the $\hat{y}$ unit vector multiplied by a complex constant $E_{\ell jy}$ [with,
according to (4.24b), complex vector $\vec{B}_{\ell j}$ being the $\hat{x}$ unit vector multiplied by the complex
constant $(-E_{\ell jy}/c)$].

[FIGURE 4.5. One wavelength of a monochromatic plane wave linearly polarized in the x direction and propagating in the z direction, showing the E field vectors and B field vectors on x, y, z axes.]

Writing down Eqs. (4.19a) and (4.19b) again while switching the order of addition in the
second equation gives

\[ \mathrm{Re}\big[\vec{E}_{\ell j}\,e^{2\pi i\sigma_{\ell}(\hat{z}\cdot\vec{r} - ct)}\big] = \hat{x}|E_{\ell jx}|\cos\big(2\pi\sigma_{\ell}(z - ct) + \phi_{\ell jx}\big) + \hat{y}|E_{\ell jy}|\cos\big(2\pi\sigma_{\ell}(z - ct) + \phi_{\ell jy}\big) \]
and
\[ \mathrm{Re}\big[\vec{B}_{\ell j}\,e^{2\pi i\sigma_{\ell}(\hat{z}\cdot\vec{r} - ct)}\big] = \hat{y}\frac{1}{c}|E_{\ell jx}|\cos\big(2\pi\sigma_{\ell}(z - ct) + \phi_{\ell jx}\big) - \hat{x}\frac{1}{c}|E_{\ell jy}|\cos\big(2\pi\sigma_{\ell}(z - ct) + \phi_{\ell jy}\big) . \]

Clearly, the first term in the general formula for the E field and the first term in the general
formula for the B field can be grouped together and called an x-polarized wave, and similarly the
second terms in the general formulas can be grouped together and called a y-polarized wave. This
shows that the E field of an arbitrary monochromatic plane wave (that is, a plane wave where
neither $E_{\ell jx}$ nor $E_{\ell jy}$ is automatically zero) can be represented as the sum of the E field of a
monochromatic plane wave linearly polarized in the x direction and the E field of a
monochromatic plane wave linearly polarized in the y direction. Similarly, the B field of that
same monochromatic plane wave can be represented as the sum of the B field of the
corresponding x-polarized plane wave and the B field of the corresponding y-polarized plane
wave. This point is often made by stating that any monochromatic plane wave can be written as
the sum of an x-polarized plane wave and a y-polarized plane wave.

[FIGURE 4.6. One wavelength of a monochromatic plane wave linearly polarized in the y direction and propagating in the z direction, showing the E field vectors and B field vectors on x, y, z axes.]
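The decomposition into x-polarized and y-polarized waves can be written out directly from Eqs. (4.19a), (4.19b), (4.23c), (4.23d), (4.24c), and (4.24d). The sketch below is an illustration only: the function names, sample point, amplitudes, and phases are all chosen for the example. It evaluates the two linearly polarized waves separately, adds them, and compares the result with the general wave.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s

def phase(z, t, sigma, phi):
    """Cosine argument 2*pi*sigma*(z - c*t) + phi shared by all field components."""
    return 2.0 * math.pi * sigma * (z - C * t) + phi

def x_polarized(z, t, sigma, ex, phx):
    """Real (E, B) of Eqs. (4.23c) and (4.23d): E along x, B along y."""
    c = math.cos(phase(z, t, sigma, phx))
    return (ex * c, 0.0, 0.0), (0.0, ex * c / C, 0.0)

def y_polarized(z, t, sigma, ey, phy):
    """Real (E, B) of Eqs. (4.24c) and (4.24d): E along y, B along -x."""
    c = math.cos(phase(z, t, sigma, phy))
    return (0.0, ey * c, 0.0), (-ey * c / C, 0.0, 0.0)

def general_wave(z, t, sigma, ex, ey, phx, phy):
    """Real (E, B) of Eqs. (4.19a) and (4.19b)."""
    cx = math.cos(phase(z, t, sigma, phx))
    cy = math.cos(phase(z, t, sigma, phy))
    return (ex * cx, ey * cy, 0.0), (-ey * cy / C, ex * cx / C, 0.0)

point = dict(z=2.5e-6, t=1.0e-14, sigma=1.0e5)   # example sample point and wavenumber
e_x, b_x = x_polarized(ex=2.0, phx=0.4, **point)
e_y, b_y = y_polarized(ey=0.7, phy=1.9, **point)
e_all, b_all = general_wave(ex=2.0, ey=0.7, phx=0.4, phy=1.9, **point)

e_sum = tuple(p + q for p, q in zip(e_x, e_y))
b_sum = tuple(p + q for p, q in zip(b_x, b_y))
max_e_diff = max(abs(p - q) for p, q in zip(e_sum, e_all))
max_b_diff = max(abs(p - q) for p, q in zip(b_sum, b_all))
print(max_e_diff, C * max_b_diff)
```

The comparison of the B fields is scaled by c so that both differences are judged against amplitudes of order one.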
4.5 Transmitted Plane Waves
Figure 4.7 shows a monochromatic plane wave incident on a thin film of optical material placed
at an angle to the axis of propagation. Note that we have again chosen the $\hat{z}$ unit vector equal to
$\hat{\kappa}_{j}$, the propagation vector of the incident plane wave. This means, according to Eqs. (4.19a) and
(4.19b), that the incident plane wave can be represented by the real part of

\[ \vec{E}_{\ell j}\,e^{2\pi i\sigma_{\ell}(\hat{z}\cdot\vec{r} - ct)} = \big(\hat{x}|E_{\ell jx}|e^{i\phi_{\ell jx}} + \hat{y}|E_{\ell jy}|e^{i\phi_{\ell jy}}\big)e^{2\pi i\sigma_{\ell}(z - ct)} \tag{4.25a} \]

and the real part of

\[ \vec{B}_{\ell j}\,e^{2\pi i\sigma_{\ell}(\hat{z}\cdot\vec{r} - ct)} = \Big(-\hat{x}\frac{1}{c}|E_{\ell jy}|e^{i\phi_{\ell jy}} + \hat{y}\frac{1}{c}|E_{\ell jx}|e^{i\phi_{\ell jx}}\Big)e^{2\pi i\sigma_{\ell}(z - ct)} . \tag{4.25b} \]

The thin film divides the space in Fig. 4.7 into two regions labeled A and B. Equations (4.25a)
and (4.25b) only apply to points in region A, the region occupied by the incident wavefield. The
unit normal vector $\hat{n}$ of the surface on which the plane wave is incident lies in the y, z plane of
the coordinate system, making an angle $\theta_{j}$ with respect to the z axis. Angle $\theta_{j}$ is called the
angle of incidence, and we give it an index j because it specifies the direction of the $\hat{\kappa}_{j}$
propagation vector with respect to $\hat{n}$. The interaction of the plane wave with the film creates a
transmitted radiation field in region B that also propagates in the $\hat{\kappa}_{j} = \hat{z}$ direction, and a
reflected radiation field in region A that propagates in the direction

\[ \hat{\kappa}_{j}^{(r)} = \hat{\kappa}_{j} - 2(\hat{\kappa}_{j}\cdot\hat{n})\,\hat{n} \tag{4.26a} \]
or
\[ \hat{\kappa}_{j}^{(r)} = \hat{z} + 2\cos(\theta_{j})\,\hat{n} . \tag{4.26b} \]

Both the transmitted and reflected wavefields have the same $\sigma_{\ell}$ wavenumber as the incident
wave. For any wavefield incident on a flat surface, the plane of incidence is defined to be that
plane containing both the surface normal $\hat{n}$ and the incident propagation vector $\hat{\kappa}_{j}$. Equation
(4.26a) shows that the $\hat{\kappa}_{j}^{(r)}$ propagation vector of the reflected wave automatically lies in the
same plane as $\hat{n}$ and $\hat{\kappa}_{j}$. In Fig. 4.7, the plane of incidence is the y, z plane of the coordinate
system.

[FIGURE 4.7. A thin film dividing space into regions A and B, drawn with the x, y, z axes, the incident propagation vector $\hat{\kappa}_{j} = \hat{z}$, the reflected propagation vector $\hat{\kappa}_{j}^{(r)}$, the surface normal $\hat{n}$, and the angle $\theta_{j}$.]
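Equation (4.26a) is the usual mirror-reflection rule and can be tried numerically. The sketch below makes one assumption that should be checked against Fig. 4.7: the unit normal is taken to point back into region A, so that its dot product with the incident propagation vector is minus the cosine of the angle of incidence, which is what makes (4.26a) reduce to the form with a plus sign in (4.26b). The 30 degree angle and the function names are arbitrary.

```python
import math

def dot(a, b):
    """Dot product of two real 3-component vectors."""
    return sum(p * q for p, q in zip(a, b))

def reflect(kappa, normal):
    """Reflected propagation vector kappa - 2 (kappa . n) n, as in Eq. (4.26a)."""
    projection = dot(kappa, normal)
    return tuple(k - 2.0 * projection * n for k, n in zip(kappa, normal))

theta = math.radians(30.0)          # example angle of incidence
kappa = (0.0, 0.0, 1.0)             # incident propagation vector along +z

# Assumed orientation: normal lies in the y, z plane and points back into region A.
normal = (0.0, math.sin(theta), -math.cos(theta))

kappa_r = reflect(kappa, normal)

# Same result from the second form, z + 2 cos(theta) n, under the assumed orientation.
kappa_r_alt = tuple(k + 2.0 * math.cos(theta) * n for k, n in zip(kappa, normal))
length = math.sqrt(dot(kappa_r, kappa_r))
print(kappa_r, kappa_r_alt, length)
```

With this orientation the reflected vector has no x component, so it stays in the y, z plane of incidence, and it makes twice the angle of incidence with the reversed z axis.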
Since the transmitted radiation field is also a monochromatic plane wave traveling down the z
axis, the E and B fields of the wave can still be found from the real parts of complex plane wave
solutions such as the ones given in Eqs. (4.16a) and (4.16b),

\[ \vec{E}_{\ell j}^{(t)}\,e^{2\pi i\sigma_{\ell}(\hat{\kappa}_{j}\cdot\vec{r} - ct)} = \vec{E}_{\ell j}^{(t)}\,e^{2\pi i\sigma_{\ell}(z - ct)} \tag{4.27a} \]
and
\[ \vec{B}_{\ell j}^{(t)}\,e^{2\pi i\sigma_{\ell}(\hat{\kappa}_{j}\cdot\vec{r} - ct)} = \vec{B}_{\ell j}^{(t)}\,e^{2\pi i\sigma_{\ell}(z - ct)} , \tag{4.27b} \]

where the (t) superscript specifies the transmitted wavefield and Eqs. (4.27a) and (4.27b) are
assumed to apply only to region B in Fig. 4.7. The complex vector $\vec{E}_{\ell j}^{(t)}$ can be written as

\[ \vec{E}_{\ell j}^{(t)} = \hat{x}E_{\ell jx}^{(t)} + \hat{y}E_{\ell jy}^{(t)} \]

with the two complex numbers $E_{\ell jx}^{(t)}$ and $E_{\ell jy}^{(t)}$ representing its x and y components. Equations
(4.18e) and (4.18f) show that the complex vectors $\vec{E}_{\ell j}^{(t)}$, $\vec{B}_{\ell j}^{(t)}$ can be written as

\[ \vec{E}_{\ell j}^{(t)} = \hat{x}|E_{\ell jx}^{(t)}|\,e^{i\phi_{\ell jx}^{(t)}} + \hat{y}|E_{\ell jy}^{(t)}|\,e^{i\phi_{\ell jy}^{(t)}} \tag{4.27c} \]
and
\[ \vec{B}_{\ell j}^{(t)} = -\hat{x}\frac{1}{c}|E_{\ell jy}^{(t)}|\,e^{i\phi_{\ell jy}^{(t)}} + \hat{y}\frac{1}{c}|E_{\ell jx}^{(t)}|\,e^{i\phi_{\ell jx}^{(t)}} , \tag{4.27d} \]

where we have used the two real constants $\phi_{\ell jx}^{(t)}$ and $\phi_{\ell jy}^{(t)}$ to represent the phases of $E_{\ell jx}^{(t)}$ and $E_{\ell jy}^{(t)}$
respectively. We require the film to be nonbirefringent, nonoptically active, and to have an index
of refraction that is constant in layers parallel to its surface; that is, the index of refraction can
only depend on the distance from the film's surface. If the film absorbs radiant energy, we
account for it in the usual way by making its index of refraction complex.$^{56}$ This sort of film turns
out to be an adequate model for the partially transmitting, partially reflecting layer of a Michelson
interferometer's beam splitter.
When the plane wave incident on the film has $|E_{\ell jy}| = 0$ or $|E_{\ell jx}| = 0$, making the wave in Eqs.
(4.25a) and (4.25b) linearly x-polarized or linearly y-polarized respectively, the transmitted wave
must have the same type of linear polarization.$^{57}$ Hence, when $|E_{\ell jy}| = 0$ in (4.25a) and (4.25b),
the transmitted plane wave must also be linearly polarized along the x axis, making $|E_{\ell jy}^{(t)}| = 0$ in
Eqs. (4.27c) and (4.27d); and when $|E_{\ell jx}| = 0$, the transmitted plane wave, which must be linearly
polarized along the y axis, has $|E_{\ell jx}^{(t)}| = 0$ in (4.27c) and (4.27d).

$^{56}$ Leonard Eyges, The Classical Electromagnetic Field (Dover Publications, Inc., New York, 1972), p. 340.
Consulting Eqs. (4.25a) and (4.25b), we see that for linear polarization along the x axis with
$|E_{\ell jy}| = 0$, the incident plane wave is given by the real part of

\[ \hat{x}|E_{\ell jx}|\,e^{i\phi_{\ell jx}}\,e^{2\pi i\sigma_{\ell}(z - ct)} \tag{4.28a} \]

for the electric field and the real part of

\[ \hat{y}\frac{1}{c}|E_{\ell jx}|\,e^{i\phi_{\ell jx}}\,e^{2\pi i\sigma_{\ell}(z - ct)} \tag{4.28b} \]

for the magnetic induction. The corresponding transmitted plane wave is given by the real part of

\[ \hat{x}|E_{\ell jx}^{(t)}|\,e^{i\phi_{\ell jx}^{(t)}}\,e^{2\pi i\sigma_{\ell}(z - ct)} \tag{4.29a} \]

for the electric field and the real part of

\[ \hat{y}\frac{1}{c}|E_{\ell jx}^{(t)}|\,e^{i\phi_{\ell jx}^{(t)}}\,e^{2\pi i\sigma_{\ell}(z - ct)} \tag{4.29b} \]

for the magnetic induction [see Eqs. (4.27c) and (4.27d) with $|E_{\ell jy}^{(t)}| = 0$]. The ratio of the complex
transmitted electric field's x component in (4.29a) to the complex incident electric field's x
component in (4.28a) is the complex coefficient

\[ t_s = \frac{|E_{\ell jx}^{(t)}|}{|E_{\ell jx}|}\,e^{i(\phi_{\ell jx}^{(t)} - \phi_{\ell jx})} . \tag{4.30a} \]

We see by inspection that this is the same as the ratio of the two complex magnetic inductions in
(4.29b) and (4.28b). Consequently, no matter what happens inside the film to produce the
transmitted x-polarized wave, the process can be described by a complex parameter $t_s$, which in
general is a function of the wavenumber $\sigma_{\ell}$ and $\theta_{j}$, the angle of incidence in Fig. 4.7,

\[ t_s = t_s(\sigma_{\ell},\theta_{j}) . \tag{4.30b} \]

The subscript s in Eqs. (4.30a) and (4.30b) is traditionally applied to incident plane waves whose
electric field is linearly polarized perpendicular to the plane of incidence, and parameter $t_s$ is
called the s-wave amplitude-transmission coefficient.$^{58}$
It is important to note that $t_s$ does not depend on either $|E_{\ell jx}|$ or $\phi_{\ell jx}$, giving it the same value
for all monochromatic plane waves having equal wavenumbers and angles of incidence.$^{59}$

$^{57}$ Max Born and Emil Wolf, Principles of Optics: Electromagnetic Theory of Propagation, Interference, and
Diffraction of Light, 7th (expanded) ed. (Cambridge University Press, New York, 1999), p. 55.

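Equations (4.28a), (4.28b), (4.29a), and (4.29b) and the definition of parameter $t_s(\sigma_{\ell},\theta_{j})$ in
(4.30a) let us write

\[ \vec{E}_{\ell js}^{(t)}\,e^{2\pi i\sigma_{\ell}(z - ct)} = t_s(\sigma_{\ell},\theta_{j})\,\vec{E}_{\ell js}\,e^{2\pi i\sigma_{\ell}(z - ct)} \tag{4.31a} \]
and
\[ \vec{B}_{\ell js}^{(t)}\,e^{2\pi i\sigma_{\ell}(z - ct)} = t_s(\sigma_{\ell},\theta_{j})\,\vec{B}_{\ell js}\,e^{2\pi i\sigma_{\ell}(z - ct)} , \tag{4.31b} \]
where
\[ \vec{E}_{\ell js} = \hat{x}|E_{\ell jx}|\,e^{i\phi_{\ell jx}} , \qquad \vec{B}_{\ell js} = \hat{y}\frac{1}{c}|E_{\ell jx}|\,e^{i\phi_{\ell jx}} , \tag{4.31c} \]
and
\[ \vec{E}_{\ell js}^{(t)} = \hat{x}|E_{\ell jx}^{(t)}|\,e^{i\phi_{\ell jx}^{(t)}} , \qquad \vec{B}_{\ell js}^{(t)} = \hat{y}\frac{1}{c}|E_{\ell jx}^{(t)}|\,e^{i\phi_{\ell jx}^{(t)}} . \tag{4.31d} \]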

This shows that to get the complex formula for the transmitted plane wave linearly polarized
perpendicular to the plane of incidence, we need only multiply the complex formula for the
incident plane wave by $t_s(\sigma_{\ell},\theta_{j})$. If the plane wavefield incident on the optical film at an angle
$\theta_{j}$ contains more than one wavenumber (but is still polarized perpendicular to the plane of
incidence), then its electric field is given by the real part of

\[ \sum_{\ell}\vec{E}_{\ell js}\,e^{2\pi i\sigma_{\ell}(z - ct)} \]

and its magnetic induction is given by the real part of

\[ \sum_{\ell}\vec{B}_{\ell js}\,e^{2\pi i\sigma_{\ell}(z - ct)} , \]

where an s subscript has been added to show that all the waves are linearly polarized
perpendicular to the plane of incidence. The s-wave amplitude-transmission coefficient can now
be used to write the complex formulas for the transmitted radiation fields as

\[ \sum_{\ell}\vec{E}_{\ell js}^{(t)}\,e^{2\pi i\sigma_{\ell}(z - ct)} = \sum_{\ell}t_s(\sigma_{\ell},\theta_{j})\,\vec{E}_{\ell js}\,e^{2\pi i\sigma_{\ell}(z - ct)} \tag{4.31e} \]
and
\[ \sum_{\ell}\vec{B}_{\ell js}^{(t)}\,e^{2\pi i\sigma_{\ell}(z - ct)} = \sum_{\ell}t_s(\sigma_{\ell},\theta_{j})\,\vec{B}_{\ell js}\,e^{2\pi i\sigma_{\ell}(z - ct)} \tag{4.31f} \]
because
\[ \vec{E}_{\ell js}^{(t)} = t_s(\sigma_{\ell},\theta_{j})\,\vec{E}_{\ell js} \quad\text{and}\quad \vec{B}_{\ell js}^{(t)} = t_s(\sigma_{\ell},\theta_{j})\,\vec{B}_{\ell js} \tag{4.31g} \]

for all values of $\ell$.

$^{58}$ This notation can be traced back to the German word for perpendicular, senkrecht.
$^{59}$ O. S. Heavens, Optical Properties of Thin Solid Films (London, Butterworths Scientific Publications, 1955), pp.
46-95.
For linear polarization along the y axis with $|E_{jx\ell}| = 0$, Eqs. (4.25a) and (4.25b) show that the electric field of the incident plane wave is given by the real part of

\[ \hat{y}\, |E_{jy\ell}|\, e^{i\delta_{jy\ell}}\, e^{2\pi i \sigma_\ell (z - ct)} \tag{4.32a} \]

and the magnetic induction of the incident plane wave is given by the real part of

\[ -\hat{x}\, \frac{1}{c}\, |E_{jy\ell}|\, e^{i\delta_{jy\ell}}\, e^{2\pi i \sigma_\ell (z - ct)} . \tag{4.32b} \]

Recalling that the corresponding transmitted plane wave must have the same type of linear polarization as the incident wave, we set $|E^{(t)}_{jx\ell}| = 0$ in Eqs. (4.27c) and (4.27d) to get that the electric field of the transmitted plane wave is the real part of

\[ \hat{y}\, |E^{(t)}_{jy\ell}|\, e^{i\delta^{(t)}_{jy\ell}}\, e^{2\pi i \sigma_\ell (z - ct)} \tag{4.33a} \]

and the magnetic induction of the transmitted plane wave is the real part of

\[ -\hat{x}\, \frac{1}{c}\, |E^{(t)}_{jy\ell}|\, e^{i\delta^{(t)}_{jy\ell}}\, e^{2\pi i \sigma_\ell (z - ct)} . \tag{4.33b} \]

The ratio of the complex transmitted electric field in (4.33a) to the complex incident electric field in (4.32a) is

\[ t_p = \frac{|E^{(t)}_{jy\ell}|}{|E_{jy\ell}|}\, e^{i\left(\delta^{(t)}_{jy\ell} - \delta_{jy\ell}\right)} . \tag{4.34a} \]

Again, this is the same as the ratio of the two complex magnetic inductions in (4.33b) and (4.32b), so again the process of transmission is described by a single complex parameter that is a function of $\sigma_\ell$ and $\theta_j$ but not of $|E_{jy\ell}|$ or $\delta_{jy\ell}$,

\[ t_p = t_p(\sigma_\ell, \theta_j) . \tag{4.34b} \]

The p subscript is traditionally applied to incident plane waves whose electric field is linearly polarized parallel to the plane of incidence, and parameter $t_p$ is called the p-wave amplitude-transmission coefficient.⁶⁰ When the incident wavefield contains more than one wavenumber and every monochromatic component is a p-type plane wave, its electric field is given by the real part of

\[ \sum_\ell \vec{E}_{jp\ell}\, e^{2\pi i \sigma_\ell (z - ct)} \tag{4.35a} \]

and its magnetic induction is given by the real part of

\[ \sum_\ell \vec{B}_{jp\ell}\, e^{2\pi i \sigma_\ell (z - ct)} , \tag{4.35b} \]

where

\[ \vec{E}_{jp\ell} = \hat{y}\, |E_{jy\ell}|\, e^{i\delta_{jy\ell}} \quad \text{and} \quad \vec{B}_{jp\ell} = -\hat{x}\, \frac{1}{c}\, |E_{jy\ell}|\, e^{i\delta_{jy\ell}} \tag{4.35c} \]

with the p subscript showing that the waves are linearly polarized parallel to the plane of incidence. To get the complex formula for the transmitted plane wave linearly polarized parallel to the plane of incidence, we need only multiply the complex term for each incident plane wave by $t_p(\sigma_\ell, \theta_j)$ to get

\[ \sum_\ell \vec{E}^{(t)}_{jp\ell}\, e^{2\pi i \sigma_\ell (z - ct)} = \sum_\ell t_p(\sigma_\ell, \theta_j)\, \vec{E}_{jp\ell}\, e^{2\pi i \sigma_\ell (z - ct)} \tag{4.35d} \]

and

\[ \sum_\ell \vec{B}^{(t)}_{jp\ell}\, e^{2\pi i \sigma_\ell (z - ct)} = \sum_\ell t_p(\sigma_\ell, \theta_j)\, \vec{B}_{jp\ell}\, e^{2\pi i \sigma_\ell (z - ct)} . \tag{4.35e} \]

⁶⁰ This notation can also be traced back to German scientists, with the German word for parallel spelled the same as in English, parallel.

The details of the mathematics used here to represent the incident and transmitted wavefields have an unfortunate tendency to conceal the basic ideas behind what is being done. No matter what the orientation of the E field in the incident monochromatic plane wave (parallel or perpendicular to the plane of incidence), terms having the form

\[ A\, e^{2\pi i \sigma_\ell (z - ct)} \]

are used to describe the electromagnetic wavefields on the incident side of the thin film, and terms such as

\[ \tau A\, e^{2\pi i \sigma_\ell (z - ct)} \]

are used to describe the electromagnetic wavefields on the transmitted side of the thin film. Here, $\tau$ is a complex number standing for either $t_s$ or $t_p$ in the above formulas; and $A$ is a complex number standing for either the x or y components of the E and B fields' complex amplitudes, for example, $E_{jx\ell}$, $B_{jy\ell}$, etc. If we write the complex $A$ value as

\[ A = |A|\, e^{i\delta_A} , \]

then

\[ A\, e^{2\pi i \sigma_\ell (z - ct)} = |A|\, e^{2\pi i \sigma_\ell (z - ct) + i\delta_A} \]

and

\[ \tau A\, e^{2\pi i \sigma_\ell (z - ct)} = \tau |A|\, e^{2\pi i \sigma_\ell (z - ct) + i\delta_A} . \]

If the incident monochromatic wavefield is shifted forward or back along the z axis (that is, along its direction of propagation) by a distance $z_0$, then $z \to z + z_0$ so that

\[ |A|\, e^{2\pi i \sigma_\ell (z - ct) + i\delta_A} \;\to\; |A|\, e^{2\pi i \sigma_\ell (z - ct + z_0) + i\delta_A} = |A|\, e^{2\pi i \sigma_\ell z_0}\, e^{2\pi i \sigma_\ell (z - ct) + i\delta_A} . \]

To change the amplitude of the incident wavefield to some fraction of its original value, we multiply $|A|$ by a real number $\alpha$ between zero and one to get

\[ |A|\, e^{2\pi i \sigma_\ell z_0}\, e^{2\pi i \sigma_\ell (z - ct) + i\delta_A} \;\to\; \alpha |A|\, e^{2\pi i \sigma_\ell z_0}\, e^{2\pi i \sigma_\ell (z - ct) + i\delta_A} . \]

The complex $\tau$ parameter can be written as

\[ \tau = |\tau|\, e^{i\delta_\tau} , \]

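which means the transmitted wavefield that is specified above to be

\[ \tau A\, e^{2\pi i \sigma_\ell (z - ct)} = \tau |A|\, e^{2\pi i \sigma_\ell (z - ct) + i\delta_A} \]

becomes

\[ \tau A\, e^{2\pi i \sigma_\ell (z - ct)} = |\tau|\, |A|\, e^{i\delta_\tau}\, e^{2\pi i \sigma_\ell (z - ct) + i\delta_A} . \]

Comparing the right-hand side of this equation to the expression

\[ \alpha |A|\, e^{2\pi i \sigma_\ell z_0}\, e^{2\pi i \sigma_\ell (z - ct) + i\delta_A} \]

for the incident wavefield shifted by $z_0$ and diminished by a real factor $\alpha$, we note that

\[ \alpha = |\tau| \]

and

\[ z_0 = \frac{\delta_\tau}{2\pi \sigma_\ell} . \]

Hence, all that happens when we multiply a wavefield specified to be

\[ A\, e^{2\pi i \sigma_\ell (z - ct)} \]

by a complex parameter $\tau$ to get

\[ \tau A\, e^{2\pi i \sigma_\ell (z - ct)} \]

is that the amplitude $|A|$ of the original wavefield changes to $|A|\,|\tau|$ and the oscillations of the wavefield are moved forward or back by a distance

\[ \frac{\delta_\tau}{2\pi \sigma_\ell} = \frac{\arg(\tau)}{2\pi \sigma_\ell} \]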

along the direction of propagation. This mathematical fact (knowing what happens when the complex expression for a monochromatic wavefield is multiplied by a complex parameter) gives meaning to the formulas derived in the first part of this section. Monochromatic wavefields transmitted through the thin film in Fig. 4.7 have their amplitudes diminished by $|t_s|$ if the E field is perpendicular to the plane of incidence and by $|t_p|$ if the E field is parallel to the plane of incidence. The oscillations of the transmitted wavefields are also moved forward or back with respect to the incident wavefield as specified by the complex phases or arguments of $t_s$ and $t_p$. How much the wavefields shift and change in amplitude depends on the angle of incidence and wavenumber; that is why $t_s$ and $t_p$ are written as functions of $\theta_j$ and $\sigma_\ell$.
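The claim that multiplying by a complex parameter only rescales the amplitude and slides the oscillations along z can be checked numerically. The sketch below is only an illustration under stated assumptions: the function names, the numerical values, and the sign convention z → z + z0 for the shift are choices made here, not taken from the text.

```python
import cmath
import math

C = 299_792_458.0  # speed of light in m/s


def incident_wave(amplitude, sigma, z, t):
    """Complex monochromatic wave A * exp(2*pi*i*sigma*(z - c*t))."""
    return amplitude * cmath.exp(2j * math.pi * sigma * (z - C * t))


def shifted_scaled_wave(amplitude, sigma, z, t, alpha, z0):
    """Incident wave evaluated at z + z0 and scaled by the real factor alpha."""
    return alpha * incident_wave(amplitude, sigma, z + z0, t)


# Hypothetical values, chosen only for illustration.
A = 0.8 * cmath.exp(0.3j)      # complex amplitude of the incident wave
tau = 0.6 * cmath.exp(1.1j)    # complex multiplier (stands in for t_s or t_p)
sigma = 1.0e5                  # wavenumber in 1/m
z, t = 2.5e-6, 1.0e-15         # observation point and time

alpha = abs(tau)                                # amplitude factor |tau|
z0 = cmath.phase(tau) / (2 * math.pi * sigma)   # shift arg(tau) / (2*pi*sigma)

multiplied = tau * incident_wave(A, sigma, z, t)
shifted = shifted_scaled_wave(A, sigma, z, t, alpha, z0)
```

Under these assumptions `multiplied` and `shifted` agree to rounding error, which is the content of the comparison made above.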
From the work done in Sec. 4.4, we know that any monochromatic plane wave having a
propagation vector parallel to the z axis can be analyzed as the sum of a monochromatic plane
wave linearly polarized along the x axis and a monochromatic plane wave linearly polarized
along the y axis. This means that any monochromatic plane wave incident on the optical film in
Fig. 4.7 can be treated as the sum of an s-type monochromatic plane wave and a p-type
monochromatic plane wave. Consequently, we expect an arbitrary plane wavefield incident along
the z axis in region A of Fig. 4.7 to have both s-type and p-type components, with its electric field
given by the real part of

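\[ \sum_\ell \vec{E}_{js\ell}\, e^{2\pi i \sigma_\ell (z - ct)} + \sum_\ell \vec{E}_{jp\ell}\, e^{2\pi i \sigma_\ell (z - ct)} \tag{4.36a} \]

and its magnetic induction given by the real part of

\[ \sum_\ell \vec{B}_{js\ell}\, e^{2\pi i \sigma_\ell (z - ct)} + \sum_\ell \vec{B}_{jp\ell}\, e^{2\pi i \sigma_\ell (z - ct)} . \tag{4.36b} \]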

The recipe for taking this combined wavefield through the optical film into region B of Fig. 4.7 is
to multiply each s-wave component and p-wave component by the appropriate s-wave and p-
wave amplitude-transmission coefficients. Hence, the electric field for the transmitted wave in
region B is the real part of

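\[ \sum_\ell t_s(\sigma_\ell, \theta_j)\, \vec{E}_{js\ell}\, e^{2\pi i \sigma_\ell (z - ct)} + \sum_\ell t_p(\sigma_\ell, \theta_j)\, \vec{E}_{jp\ell}\, e^{2\pi i \sigma_\ell (z - ct)} \tag{4.36c} \]

and the magnetic induction is the real part of

\[ \sum_\ell t_s(\sigma_\ell, \theta_j)\, \vec{B}_{js\ell}\, e^{2\pi i \sigma_\ell (z - ct)} + \sum_\ell t_p(\sigma_\ell, \theta_j)\, \vec{B}_{jp\ell}\, e^{2\pi i \sigma_\ell (z - ct)} . \tag{4.36d} \]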

Thus the transmission of any plane wavefield containing many different wavenumbers (that is, the transmission of any polychromatic plane wave) can be handled by writing each incident monochromatic wave as the sum of an s-wave and a p-wave, as shown in (4.36a) and (4.36b), and then multiplying each s-wave and p-wave in that sum by the correct s-wave and p-wave amplitude-transmission coefficient, as shown in (4.36c) and (4.36d).
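The recipe can be sketched in a few lines of code. This is a minimal illustration, not a model of any real optical film: the coefficient functions `toy_t_s` and `toy_t_p`, the data layout, and all numerical values are hypothetical, and the s and p components are taken to lie along x and y as in the text.

```python
import cmath
import math

C = 299_792_458.0  # speed of light in m/s


def transmit(components, t_s, t_p, theta):
    """Multiply every s-wave (x) term by t_s and every p-wave (y) term by t_p.

    components: list of (sigma, E_x, E_y) with complex amplitudes E_x, E_y.
    """
    return [(sigma, t_s(sigma, theta) * e_x, t_p(sigma, theta) * e_y)
            for sigma, e_x, e_y in components]


def electric_field(components, z, t):
    """Real (E_x, E_y): real part of the sum of amplitude * exp(2*pi*i*sigma*(z - c*t))."""
    e_x_total = 0j
    e_y_total = 0j
    for sigma, e_x, e_y in components:
        carrier = cmath.exp(2j * math.pi * sigma * (z - C * t))
        e_x_total += e_x * carrier
        e_y_total += e_y * carrier
    return e_x_total.real, e_y_total.real


# Hypothetical coefficient models, NOT derived from any physical film.
def toy_t_s(sigma, theta):
    return 0.9 * math.cos(theta) * cmath.exp(1j * 2.0e-6 * sigma)


def toy_t_p(sigma, theta):
    return 0.8 * cmath.exp(1j * 1.0e-6 * sigma)


theta_j = math.radians(45.0)
incident = [(1.0e5, 1.0 + 0.0j, 0.5j), (2.0e5, 0.3 - 0.2j, 0.7 + 0.0j)]
transmitted = transmit(incident, toy_t_s, toy_t_p, theta_j)
```

Calling `electric_field(transmitted, z, t)` then gives the real transmitted field at any point and time, one wavenumber at a time, exactly as the sums in (4.36c) prescribe.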
4.6 Reflected Plane Waves
If the incident wavefield in region A of Figs. 4.7 and 4.8 is a monochromatic plane wave with propagation vector $\hat{\kappa}_j = \hat{z}$ and wavenumber $\sigma_\ell$, then the reflected wavefield in region A is a monochromatic plane wave with wavenumber $\sigma_\ell$ and propagation vector $\hat{\kappa}^{(r)}_j$. In Fig. 4.8, we construct a special $x^{(r)}$, $y^{(r)}$, $z^{(r)}$ coordinate system to analyze the reflected plane wave. The $z^{(r)}$ axis is set parallel to the $\hat{\kappa}^{(r)}_j$ propagation vector, so that $\hat{z}^{(r)} = \hat{\kappa}^{(r)}_j$. Note that, according to the discussion at the end of Sec. 4.2, the sum of the incident and reflected plane waves is still a solution to Maxwell's equations in region A. We see that the $x^{(r)}$, $y^{(r)}$, $z^{(r)}$ coordinate system is just the x, y, z coordinate system rotated about the x axis to make the $z^{(r)}$ axis parallel to $\hat{\kappa}^{(r)}_j$, so the two coordinate systems have the same origin. Both coordinate systems have the same x axis, so $\hat{x}^{(r)} = \hat{x}$, and to get the y axis of the new coordinate system, we specify $\hat{y}^{(r)} = \hat{z}^{(r)} \times \hat{x}^{(r)} = \hat{\kappa}^{(r)}_j \times \hat{x}$.
When an x, y, z coordinate system is rotated by an angle $\phi$ about its x axis to create a new $x^{(r)}$, $y^{(r)}$, $z^{(r)}$ coordinate system (see Fig. 4.9), the relationship between the $\hat{x}$, $\hat{y}$, $\hat{z}$ unit vectors and the $\hat{x}^{(r)}$, $\hat{y}^{(r)}$, $\hat{z}^{(r)}$ unit vectors is

\[ \hat{x}^{(r)} = \hat{x} , \tag{4.37a} \]

\[ \hat{y}^{(r)} = \hat{y} \cos\phi + \hat{z} \sin\phi , \tag{4.37b} \]

and

\[ \hat{z}^{(r)} = \hat{z} \cos\phi - \hat{y} \sin\phi . \tag{4.37c} \]

Equations (4.37a)-(4.37c) provide another way of specifying the $\hat{x}^{(r)}$, $\hat{y}^{(r)}$, $\hat{z}^{(r)}$ unit vectors in terms of the $\hat{x}$, $\hat{y}$, $\hat{z}$ unit vectors. Comparing Figs. 4.8 and 4.9, we see that to create the desired $x^{(r)}$, $y^{(r)}$, $z^{(r)}$ coordinate system in Fig. 4.8, the original x, y, z coordinate system should be rotated
around the x axis by an angle in radians of 2
j
= or 2
j
= + .
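The rotation formulas (4.37a)-(4.37c) are easy to verify numerically. The sketch below assumes nothing beyond those three formulas; the function names and the test angle are illustrative choices, and the angle symbol is simply called `phi` here.

```python
import math


def rotate_about_x(phi):
    """Unit vectors of the frame obtained by rotating x, y, z by phi about x."""
    x_r = (1.0, 0.0, 0.0)                        # Eq. (4.37a)
    y_r = (0.0, math.cos(phi), math.sin(phi))    # Eq. (4.37b)
    z_r = (0.0, -math.sin(phi), math.cos(phi))   # Eq. (4.37c)
    return x_r, y_r, z_r


def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(p * q for p, q in zip(a, b))


def cross(a, b):
    """Cross product a x b of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


x_r, y_r, z_r = rotate_about_x(0.7)  # arbitrary test angle in radians
y_from_cross = cross(z_r, x_r)       # should reproduce y_r
```

The rotated vectors stay orthonormal for any angle, and the cross product of the new z and x unit vectors returns the new y unit vector, which is how the y axis of the reflected-wave frame was specified above.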
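Because the reflected plane wave is traveling down the $z^{(r)}$ axis rather than the z axis, when the E and B fields of the wave are specified by the real parts of complex expressions, such as the ones shown in Eqs. (4.16a) and (4.16b), we must replace $\hat{\kappa}_j$ and z by $\hat{\kappa}^{(r)}_j$ and $z^{(r)}$ respectively,

\[ \vec{E}^{(r)}_{j\ell}\, e^{2\pi i \sigma_\ell \left(\hat{\kappa}^{(r)}_j \cdot \vec{r} - ct\right)} = \vec{E}^{(r)}_{j\ell}\, e^{2\pi i \sigma_\ell \left(z^{(r)} - ct\right)} \tag{4.38a} \]

and

\[ \vec{B}^{(r)}_{j\ell}\, e^{2\pi i \sigma_\ell \left(\hat{\kappa}^{(r)}_j \cdot \vec{r} - ct\right)} = \vec{B}^{(r)}_{j\ell}\, e^{2\pi i \sigma_\ell \left(z^{(r)} - ct\right)} . \tag{4.38b} \]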
FIGURE 4.8. [Diagram: regions A and B on either side of the film, the surface normal $\hat{n}$, the incident propagation vector $\hat{\kappa}_j$ and the reflected propagation vector $\hat{\kappa}^{(r)}_j$, each at angle $\theta_j$ to the normal, with the $x = x^{(r)}$, $y$, $z$, $y^{(r)}$, and $z^{(r)}$ axes.]

FIGURE 4.9. [Diagram: the $x = x^{(r)}$, $y$, $z$ axes and the rotated $y^{(r)}$, $z^{(r)}$ axes.]

The r superscript on the complex $\vec{E}^{(r)}_{j\ell}$ and $\vec{B}^{(r)}_{j\ell}$ vectors shows that they belong to the reflected wave. Vector $\vec{E}^{(r)}_{j\ell}$ in (4.38a) can be written as

\[ \vec{E}^{(r)}_{j\ell} = \hat{x}\, E^{(r)}_{jx\ell} + \hat{y}^{(r)}\, E^{(r)}_{jy^{(r)}\ell} \tag{4.38c} \]

using two complex numbers $E^{(r)}_{jx\ell}$ and $E^{(r)}_{jy^{(r)}\ell}$ to represent its x and $y^{(r)}$ components. Although the y subscripts and unit vectors have an r superscript to show that they belong to the $x^{(r)}$, $y^{(r)}$, $z^{(r)}$ coordinate system, the x subscripts and unit vectors do not need one because $\hat{x}$ and $\hat{x}^{(r)}$ are
identical in the two coordinate systems. Following the pattern of Eqs. (4.27c) and (4.27d), we write the complex vectors $\vec{E}^{(r)}_{j\ell}$ and $\vec{B}^{(r)}_{j\ell}$ as

\[ \vec{E}^{(r)}_{j\ell} = \hat{x}\, |E^{(r)}_{jx\ell}|\, e^{i\delta^{(r)}_{jx\ell}} + \hat{y}^{(r)}\, |E^{(r)}_{jy^{(r)}\ell}|\, e^{i\delta^{(r)}_{jy^{(r)}\ell}} \tag{4.38d} \]

and

\[ \vec{B}^{(r)}_{j\ell} = -\hat{x}\, \frac{1}{c}\, |E^{(r)}_{jy^{(r)}\ell}|\, e^{i\delta^{(r)}_{jy^{(r)}\ell}} + \hat{y}^{(r)}\, \frac{1}{c}\, |E^{(r)}_{jx\ell}|\, e^{i\delta^{(r)}_{jx\ell}} \tag{4.38e} \]

using the real constants $\delta^{(r)}_{jx\ell}$ and $\delta^{(r)}_{jy^{(r)}\ell}$ to represent the phases of the complex values of $E^{(r)}_{jx\ell}$ and $E^{(r)}_{jy^{(r)}\ell}$ respectively.
When the plane wave incident on the optical film is linearly polarized along the x axis or y axis, the reflected wave is linearly polarized along the $\hat{x}^{(r)} = \hat{x}$ axis or the $\hat{y}^{(r)}$ axis respectively.⁶¹ Equations (4.28a) and (4.28b), which give the complex formulas for an incident plane wave that is linearly x-polarized, force the reflected plane wave to be linearly polarized along the $\hat{x}^{(r)} = \hat{x}$ axis. According to Eq. (4.38d), this reflected wave must have

\[ |E^{(r)}_{jy^{(r)}\ell}| = 0 \]

for it to be linearly polarized along the $\hat{x}^{(r)} = \hat{x}$ axis. Equations (4.38a)-(4.38e) then show that the E field of the reflected wave is given by the real part of

\[ \hat{x}\, |E^{(r)}_{jx\ell}|\, e^{i\delta^{(r)}_{jx\ell}}\, e^{2\pi i \sigma_\ell \left(z^{(r)} - ct\right)} \tag{4.39a} \]

and the B field of the reflected wave is given by the real part of

\[ \hat{y}^{(r)}\, \frac{1}{c}\, |E^{(r)}_{jx\ell}|\, e^{i\delta^{(r)}_{jx\ell}}\, e^{2\pi i \sigma_\ell \left(z^{(r)} - ct\right)} . \tag{4.39b} \]

Comparing these two complex formulas to the complex formulas (4.28a) and (4.28b) for the incident wave, we note that if we consider only the scalar factors that do not depend on position or time, then the $\hat{x}^{(r)} = \hat{x}$ components of the complex E fields together with the $\hat{y}$, $\hat{y}^{(r)}$ components of the complex B fields have the same complex ratio

\[ r_s = \frac{|E^{(r)}_{jx\ell}|}{|E_{jx\ell}|}\, e^{i\left(\delta^{(r)}_{jx\ell} - \delta_{jx\ell}\right)} . \tag{4.40a} \]

⁶¹ Max Born and Emil Wolf, Principles of Optics, p. 55.

Parameter $r_s$ is called the s-wave amplitude-reflection coefficient, with s again referring to the incident plane waves being polarized perpendicular to the plane of incidence. In general,

\[ r_s = r_s(\sigma_\ell, \theta_j) , \tag{4.40b} \]

where $r_s$, like the amplitude-transmission coefficients $t_s$ and $t_p$, does not depend on either $|E_{jx\ell}|$ or $\delta_{jx\ell}$; it is the same for all incident plane waves having the same $\sigma_\ell$ and $\theta_j$. Comparing the x-polarized reflected wave in (4.39a) and (4.39b) to the x-polarized incident wave in (4.28a) and (4.28b), we see that multiplying the complex formulas in (4.28a) and (4.28b) by $r_s$ converts them to the complex formulas in (4.39a) and (4.39b) if $\hat{y}$ is replaced by $\hat{y}^{(r)}$ and z is replaced by $z^{(r)}$.
Turning to the case of the y-polarized incident wave specified by the complex formulas (4.32a) and (4.32b), we remember that now the reflected wave must be polarized along the $\hat{y}^{(r)}$ axis. This forces $|E^{(r)}_{jx\ell}| = 0$ in Eqs. (4.38a)-(4.38e), showing the reflected E field is given by the real part of

\[ \hat{y}^{(r)}\, |E^{(r)}_{jy^{(r)}\ell}|\, e^{i\delta^{(r)}_{jy^{(r)}\ell}}\, e^{2\pi i \sigma_\ell \left(z^{(r)} - ct\right)} \tag{4.41a} \]

and the reflected B field is given by the real part of

\[ -\hat{x}\, \frac{1}{c}\, |E^{(r)}_{jy^{(r)}\ell}|\, e^{i\delta^{(r)}_{jy^{(r)}\ell}}\, e^{2\pi i \sigma_\ell \left(z^{(r)} - ct\right)} . \tag{4.41b} \]

Comparing these two formulas to (4.32a) and (4.32b) for the incident wave, we again see that if we consider only the scalar factors that do not depend on position or time then the $\hat{y}$, $\hat{y}^{(r)}$ components of the complex E fields together with the $\hat{x}^{(r)} = \hat{x}$ components of the complex B fields have the same complex ratio

\[ r_p = \frac{|E^{(r)}_{jy^{(r)}\ell}|}{|E_{jy\ell}|}\, e^{i\left(\delta^{(r)}_{jy^{(r)}\ell} - \delta_{jy\ell}\right)} . \tag{4.42a} \]

Parameter $r_p$ is called the p-wave amplitude-reflection coefficient, where again p refers to the incident wave being polarized parallel to the plane of incidence. This coefficient, like $r_s$, $t_s$, and $t_p$, in general depends only on the wavenumber and incidence angle,

\[ r_p = r_p(\sigma_\ell, \theta_j) . \tag{4.42b} \]

Multiplying the complex formulas in (4.32a) and (4.32b) by $r_p$ converts them to (4.41a) and (4.41b) if $\hat{y}$ is replaced by $\hat{y}^{(r)}$ and z is replaced by $z^{(r)}$.
Having analyzed how to create the reflected wavefield when the incident wavefield is a monochromatic s-wave or monochromatic p-wave, we are now prepared to handle the reflection of an arbitrary polychromatic plane wavefield incident along the z axis. Splitting each monochromatic term into an s-wave component and a p-wave component as in formulas (4.36a) and (4.36b), we can write the incident wave's E field as the real part of

\[ \sum_\ell \vec{E}_{js\ell}\, e^{2\pi i \sigma_\ell (z - ct)} + \sum_\ell \vec{E}_{jp\ell}\, e^{2\pi i \sigma_\ell (z - ct)} \]

or, using Eqs. (4.31c) and (4.35c), as the real part of

\[ \hat{x} \sum_\ell |E_{jx\ell}|\, e^{i\delta_{jx\ell}}\, e^{2\pi i \sigma_\ell (z - ct)} + \hat{y} \sum_\ell |E_{jy\ell}|\, e^{i\delta_{jy\ell}}\, e^{2\pi i \sigma_\ell (z - ct)} . \tag{4.43a} \]

Similarly, the incident wave's B field is, using Eqs. (4.31c) and (4.35c), the real part of

\[ \hat{y}\, \frac{1}{c} \sum_\ell |E_{jx\ell}|\, e^{i\delta_{jx\ell}}\, e^{2\pi i \sigma_\ell (z - ct)} - \hat{x}\, \frac{1}{c} \sum_\ell |E_{jy\ell}|\, e^{i\delta_{jy\ell}}\, e^{2\pi i \sigma_\ell (z - ct)} . \tag{4.43b} \]

In these latest formulas, (4.43a) and (4.43b), the first term is the sum over the s-wave components of the incident wavefield and the second term is the sum over the p-wave components of the incident wavefield. To get the corresponding polychromatic reflected wavefield, we follow the just-described recipes for finding the reflected monochromatic plane waves generated by each incident monochromatic plane wave. The electric field of the reflected wavefield is then found to be the real part of

\[ \hat{x} \sum_\ell r_s(\sigma_\ell, \theta_j)\, |E_{jx\ell}|\, e^{i\delta_{jx\ell}}\, e^{2\pi i \sigma_\ell \left(z^{(r)} - ct\right)} + \hat{y}^{(r)} \sum_\ell r_p(\sigma_\ell, \theta_j)\, |E_{jy\ell}|\, e^{i\delta_{jy\ell}}\, e^{2\pi i \sigma_\ell \left(z^{(r)} - ct\right)} \tag{4.43c} \]

and the magnetic-induction field of the reflected wavefield is found to be the real part of

\[ \hat{y}^{(r)}\, \frac{1}{c} \sum_\ell r_s(\sigma_\ell, \theta_j)\, |E_{jx\ell}|\, e^{i\delta_{jx\ell}}\, e^{2\pi i \sigma_\ell \left(z^{(r)} - ct\right)} - \hat{x}\, \frac{1}{c} \sum_\ell r_p(\sigma_\ell, \theta_j)\, |E_{jy\ell}|\, e^{i\delta_{jy\ell}}\, e^{2\pi i \sigma_\ell \left(z^{(r)} - ct\right)} . \tag{4.43d} \]

These reflected-wave formulas are, of course, the counterpart equations to (4.36c) and (4.36d) for the transmitted wavefields.
4.7 Polychromatic Wave Fields
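Having found and at least to some extent analyzed the complex E-field and B-field plane-wave solutions in Eqs. (4.11a) and (4.11b), we can write their associated real-valued radiation fields as

\[ \vec{E}^{(\mathrm{rad})}(\vec{r}, t) = \mathrm{Re}\left[ \sum_j \sum_\ell \vec{E}_{j\ell}\, e^{2\pi i \sigma_\ell (\hat{\kappa}_j \cdot \vec{r} - ct)} \right] = \sum_j \left[ \sum_\ell \frac{1}{2} \vec{E}_{j\ell}\, e^{2\pi i \sigma_\ell (\hat{\kappa}_j \cdot \vec{r} - ct)} + \sum_\ell \frac{1}{2} \vec{E}^{*}_{j\ell}\, e^{-2\pi i \sigma_\ell (\hat{\kappa}_j \cdot \vec{r} - ct)} \right] \tag{4.44a} \]

and

\[ \vec{B}^{(\mathrm{rad})}(\vec{r}, t) = \mathrm{Re}\left[ \sum_j \sum_\ell \vec{B}_{j\ell}\, e^{2\pi i \sigma_\ell (\hat{\kappa}_j \cdot \vec{r} - ct)} \right] = \sum_j \left[ \sum_\ell \frac{1}{2} \vec{B}_{j\ell}\, e^{2\pi i \sigma_\ell (\hat{\kappa}_j \cdot \vec{r} - ct)} + \sum_\ell \frac{1}{2} \vec{B}^{*}_{j\ell}\, e^{-2\pi i \sigma_\ell (\hat{\kappa}_j \cdot \vec{r} - ct)} \right] . \tag{4.44b} \]

In Eq. (4.44a), to convert the first inside sum over $\vec{E}_{j\ell}$ into an integral, we replace $\sigma_\ell > 0$ with the continuous variable $\sigma > 0$. To convert the sum over $\vec{E}^{*}_{j\ell}$ into an integral, we use negative values of the same continuous variable $\sigma$; that is, we replace $-\sigma_\ell$ with $\sigma < 0$. To set up these conversions, we define

\[ \vec{E}_j(\sigma) = \frac{1}{2\,\Delta\sigma_\ell}\, \vec{E}_{j\ell} \quad \text{for } \sigma = \sigma_\ell > 0 , \tag{4.45a} \]

and

\[ \vec{E}_j(\sigma) = \frac{1}{2\,\Delta\sigma_\ell}\, \vec{E}^{*}_{j\ell} \quad \text{for } \sigma = -\sigma_\ell < 0 \tag{4.45b} \]

with

\[ \Delta\sigma_\ell = \sigma_{\ell+1} - \sigma_\ell . \]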

A similar conversion of sums into integrals can be applied to Eq. (4.44b) if we define

\[ \vec{B}_j(\sigma) = \frac{1}{2\,\Delta\sigma_\ell}\, \vec{B}_{j\ell} \quad \text{for } \sigma = \sigma_\ell > 0 , \tag{4.45c} \]

and

\[ \vec{B}_j(\sigma) = \frac{1}{2\,\Delta\sigma_\ell}\, \vec{B}^{*}_{j\ell} \quad \text{for } \sigma = -\sigma_\ell < 0 . \tag{4.45d} \]

Equations (4.45a) and (4.45c) associate positive $\sigma$ arguments in $\vec{E}_j(\sigma)$ and $\vec{B}_j(\sigma)$ with the original $\vec{E}_{j\ell}$ and $\vec{B}_{j\ell}$ vectors, and Eqs. (4.45b) and (4.45d) associate negative $\sigma$ arguments in $\vec{E}_j(\sigma)$ and $\vec{B}_j(\sigma)$ with the complex conjugate $\vec{E}^{*}_{j\ell}$ and $\vec{B}^{*}_{j\ell}$ vectors. In the limit of decreasing $\Delta\sigma_\ell$ and increasing numbers of $\sigma_\ell$ values per unit wavenumber interval, Eqs. (4.44a) and (4.44b) become

\[ \vec{E}^{(\mathrm{rad})}(\vec{r}, t) = \sum_j \int_{-\infty}^{\infty} \vec{E}_j(\sigma)\, e^{2\pi i \sigma (\hat{\kappa}_j \cdot \vec{r} - ct)}\, d\sigma \tag{4.46a} \]

and

\[ \vec{B}^{(\mathrm{rad})}(\vec{r}, t) = \sum_j \int_{-\infty}^{\infty} \vec{B}_j(\sigma)\, e^{2\pi i \sigma (\hat{\kappa}_j \cdot \vec{r} - ct)}\, d\sigma . \tag{4.46b} \]

For this limit to make sense, we have to set $\vec{E}_j(\sigma) = 0$ and $\vec{B}_j(\sigma) = 0$ in (4.45a)-(4.45d) at those wavenumbers for which there are no specified $\ell$ index values in (4.44a) and (4.44b); in effect, the indices left out of the sums are now included but assigned zero for their complex vector coefficients $\vec{E}_{j\ell}$ and $\vec{B}_{j\ell}$. Although Eqs. (4.44a) and (4.44b) force vectors $\vec{E}^{(\mathrm{rad})}$ and $\vec{B}^{(\mathrm{rad})}$ to be real, vectors $\vec{E}_j(\sigma)$ and $\vec{B}_j(\sigma)$ are allowed to be complex.
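Equations (4.46a) and (4.46b) are a vector shorthand for the six scalar equations

\[ E^{(\mathrm{rad})}_x(\vec{r}, t) = \sum_j \int_{-\infty}^{\infty} E_{jx}(\sigma)\, e^{2\pi i \sigma (\hat{\kappa}_j \cdot \vec{r} - ct)}\, d\sigma , \]

\[ E^{(\mathrm{rad})}_y(\vec{r}, t) = \sum_j \int_{-\infty}^{\infty} E_{jy}(\sigma)\, e^{2\pi i \sigma (\hat{\kappa}_j \cdot \vec{r} - ct)}\, d\sigma , \]

\[ E^{(\mathrm{rad})}_z(\vec{r}, t) = \sum_j \int_{-\infty}^{\infty} E_{jz}(\sigma)\, e^{2\pi i \sigma (\hat{\kappa}_j \cdot \vec{r} - ct)}\, d\sigma , \]

and

\[ B^{(\mathrm{rad})}_x(\vec{r}, t) = \sum_j \int_{-\infty}^{\infty} B_{jx}(\sigma)\, e^{2\pi i \sigma (\hat{\kappa}_j \cdot \vec{r} - ct)}\, d\sigma , \]

\[ B^{(\mathrm{rad})}_y(\vec{r}, t) = \sum_j \int_{-\infty}^{\infty} B_{jy}(\sigma)\, e^{2\pi i \sigma (\hat{\kappa}_j \cdot \vec{r} - ct)}\, d\sigma , \]

\[ B^{(\mathrm{rad})}_z(\vec{r}, t) = \sum_j \int_{-\infty}^{\infty} B_{jz}(\sigma)\, e^{2\pi i \sigma (\hat{\kappa}_j \cdot \vec{r} - ct)}\, d\sigma , \]

where

\[ \vec{E}^{(\mathrm{rad})}(\vec{r}, t) = \hat{x} E^{(\mathrm{rad})}_x(\vec{r}, t) + \hat{y} E^{(\mathrm{rad})}_y(\vec{r}, t) + \hat{z} E^{(\mathrm{rad})}_z(\vec{r}, t) \]

with

\[ \vec{E}_j(\sigma) = \hat{x} E_{jx}(\sigma) + \hat{y} E_{jy}(\sigma) + \hat{z} E_{jz}(\sigma) \]

and

\[ \vec{B}^{(\mathrm{rad})}(\vec{r}, t) = \hat{x} B^{(\mathrm{rad})}_x(\vec{r}, t) + \hat{y} B^{(\mathrm{rad})}_y(\vec{r}, t) + \hat{z} B^{(\mathrm{rad})}_z(\vec{r}, t) \]

with

\[ \vec{B}_j(\sigma) = \hat{x} B_{jx}(\sigma) + \hat{y} B_{jy}(\sigma) + \hat{z} B_{jz}(\sigma) \]

for any $\hat{x}$, $\hat{y}$, $\hat{z}$ triplet of mutually perpendicular Cartesian unit vectors. The integrals in (4.46a) and (4.46b) are inverse Fourier transforms, so we can define, using $\xi = \hat{\kappa}_j \cdot \vec{r} - ct$,

\[ \mathcal{E}_{jx}(\xi) = \int_{-\infty}^{\infty} E_{jx}(\sigma)\, e^{2\pi i \sigma \xi}\, d\sigma , \quad \mathcal{E}_{jy}(\xi) = \int_{-\infty}^{\infty} E_{jy}(\sigma)\, e^{2\pi i \sigma \xi}\, d\sigma , \quad \mathcal{E}_{jz}(\xi) = \int_{-\infty}^{\infty} E_{jz}(\sigma)\, e^{2\pi i \sigma \xi}\, d\sigma \]

and

\[ \mathcal{B}_{jx}(\xi) = \int_{-\infty}^{\infty} B_{jx}(\sigma)\, e^{2\pi i \sigma \xi}\, d\sigma , \quad \mathcal{B}_{jy}(\xi) = \int_{-\infty}^{\infty} B_{jy}(\sigma)\, e^{2\pi i \sigma \xi}\, d\sigma , \quad \mathcal{B}_{jz}(\xi) = \int_{-\infty}^{\infty} B_{jz}(\sigma)\, e^{2\pi i \sigma \xi}\, d\sigma . \]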

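In our shorthand vector notation, this becomes

\[ \vec{\mathcal{E}}_j(\xi) = \int_{-\infty}^{\infty} \vec{E}_j(\sigma)\, e^{2\pi i \sigma \xi}\, d\sigma \tag{4.46c} \]

and

\[ \vec{\mathcal{B}}_j(\xi) = \int_{-\infty}^{\infty} \vec{B}_j(\sigma)\, e^{2\pi i \sigma \xi}\, d\sigma \tag{4.46d} \]

where

\[ \vec{\mathcal{E}}_j(\xi) = \hat{x}\, \mathcal{E}_{jx}(\xi) + \hat{y}\, \mathcal{E}_{jy}(\xi) + \hat{z}\, \mathcal{E}_{jz}(\xi) \tag{4.46e} \]

and

\[ \vec{\mathcal{B}}_j(\xi) = \hat{x}\, \mathcal{B}_{jx}(\xi) + \hat{y}\, \mathcal{B}_{jy}(\xi) + \hat{z}\, \mathcal{B}_{jz}(\xi) . \tag{4.46f} \]

Now Eqs. (4.46a) and (4.46b) can be written as (remember that $\xi = \hat{\kappa}_j \cdot \vec{r} - ct$)

\[ \vec{E}^{(\mathrm{rad})}(\vec{r}, t) = \sum_j \vec{\mathcal{E}}_j(\hat{\kappa}_j \cdot \vec{r} - ct) \tag{4.46g} \]

and

\[ \vec{B}^{(\mathrm{rad})}(\vec{r}, t) = \sum_j \vec{\mathcal{B}}_j(\hat{\kappa}_j \cdot \vec{r} - ct) . \tag{4.46h} \]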

Returning to the definitions of $\vec{E}_j$ and $\vec{B}_j$ in Eqs. (4.45a)-(4.45d), we see that

\[ \vec{E}_j(-\sigma) = \vec{E}_j(\sigma)^{*} \tag{4.47a} \]

and

\[ \vec{B}_j(-\sigma) = \vec{B}_j(\sigma)^{*} . \tag{4.47b} \]

This shows that $\vec{E}_j$ and $\vec{B}_j$ are Hermitian, and entry 7 in Table 2.1 of Chapter 2 requires the inverse Fourier transforms of Hermitian functions to be real. Consequently, because they are inverse Fourier transforms of Hermitian functions, each $\vec{\mathcal{E}}_j(\hat{\kappa}_j \cdot \vec{r} - ct)$ and $\vec{\mathcal{B}}_j(\hat{\kappa}_j \cdot \vec{r} - ct)$ vector function in (4.46g) and (4.46h) is real. Every $\vec{\mathcal{E}}_j(\hat{\kappa}_j \cdot \vec{r} - ct)$ and $\vec{\mathcal{B}}_j(\hat{\kappa}_j \cdot \vec{r} - ct)$ pair of vector functions can be thought of as the real electric and magnetic-induction fields of a single polychromatic plane wave traveling in direction $\hat{\kappa}_j$ at velocity c. Hence these two equations demonstrate that electromagnetic radiation fields in empty space can be represented as the sum of polychromatic plane waves traveling in a specified collection of different directions.
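The link between Hermitian spectra and real fields can be seen already in the discrete sums. The sketch below, for a single scalar field component, builds the two-sided sum (half the amplitude at each positive wavenumber, half its complex conjugate at the mirrored negative wavenumber) and compares it with the real part of the one-sided sum. The function names, the scalar simplification, and the amplitudes are hypothetical choices made here for illustration.

```python
import cmath
import math


def one_sided_real_part(amplitudes, xi):
    """Re[ sum over sigma of E * exp(2*pi*i*sigma*xi) ], positive sigma only."""
    total = sum(e * cmath.exp(2j * math.pi * sigma * xi) for sigma, e in amplitudes)
    return total.real


def two_sided_sum(amplitudes, xi):
    """Sum with E/2 at +sigma and conj(E)/2 at -sigma (a Hermitian spectrum)."""
    total = 0j
    for sigma, e in amplitudes:
        total += 0.5 * e * cmath.exp(2j * math.pi * sigma * xi)
        total += 0.5 * e.conjugate() * cmath.exp(-2j * math.pi * sigma * xi)
    return total


# Hypothetical (wavenumber in 1/m, complex amplitude) pairs.
amplitudes = [(1.0e5, 0.7 + 0.2j), (1.5e5, -0.1 + 0.4j), (2.2e5, 0.3 - 0.6j)]
xi = 3.7e-6  # value of (propagation direction . r) - c*t, in meters

hermitian_total = two_sided_sum(amplitudes, xi)
direct_real = one_sided_real_part(amplitudes, xi)
```

The imaginary part of `hermitian_total` vanishes to rounding error and its real part matches `direct_real`, which is the discrete counterpart of the statement that a Hermitian spectrum has a real inverse transform.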
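From Eqs. (4.14c) and (4.14d), we know that $\hat{\kappa}_j \cdot \vec{B}_{j\ell} = 0$ and $\hat{\kappa}_j \cdot \vec{E}_{j\ell} = 0$. Taking the complex conjugate of these two relationships gives $\hat{\kappa}_j \cdot \vec{B}^{*}_{j\ell} = 0$ and $\hat{\kappa}_j \cdot \vec{E}^{*}_{j\ell} = 0$. We can now take the dot product of both sides of Eqs. (4.45a) and (4.45b) with $\hat{\kappa}_j$ to get

\[ \hat{\kappa}_j \cdot \vec{E}_j(\sigma) = 0 \tag{4.48a} \]

and the dot product of both sides of Eqs. (4.45c) and (4.45d) with $\hat{\kappa}_j$ to get

\[ \hat{\kappa}_j \cdot \vec{B}_j(\sigma) = 0 \tag{4.48b} \]

for all positive and negative values of $\sigma$. Taking the dot product with $\hat{\kappa}_j$ of both sides of Eqs. (4.46c) and (4.46d) gives

\[ \hat{\kappa}_j \cdot \vec{\mathcal{E}}_j(\xi) = \int_{-\infty}^{\infty} \hat{\kappa}_j \cdot \vec{E}_j(\sigma)\, e^{2\pi i \sigma \xi}\, d\sigma \]

and

\[ \hat{\kappa}_j \cdot \vec{\mathcal{B}}_j(\xi) = \int_{-\infty}^{\infty} \hat{\kappa}_j \cdot \vec{B}_j(\sigma)\, e^{2\pi i \sigma \xi}\, d\sigma \]

because $\hat{\kappa}_j$ is a constant unit vector. Substituting from Eqs. (4.48a) and (4.48b) and remembering that $\xi = \hat{\kappa}_j \cdot \vec{r} - ct$ now leads to

\[ \hat{\kappa}_j \cdot \vec{\mathcal{E}}_j(\hat{\kappa}_j \cdot \vec{r} - ct) = 0 \tag{4.49a} \]

and

\[ \hat{\kappa}_j \cdot \vec{\mathcal{B}}_j(\hat{\kappa}_j \cdot \vec{r} - ct) = 0 \tag{4.49b} \]

for any polychromatic plane wave $\vec{\mathcal{E}}_j(\hat{\kappa}_j \cdot \vec{r} - ct)$ and $\vec{\mathcal{B}}_j(\hat{\kappa}_j \cdot \vec{r} - ct)$. Consequently, the E and B fields of a polychromatic plane wave, just like the E and B fields of a monochromatic plane wave, are transverse to the wave's direction of propagation. From Eq. (4.22a) we note that, taking the complex conjugates of the original equality,

\[ \vec{E}_{j\ell} \cdot \vec{B}_{j\ell} = \vec{E}^{*}_{j\ell} \cdot \vec{B}^{*}_{j\ell} = 0 . \]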

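Hence from Eqs. (4.45a) and (4.45c) it follows that

\[ \vec{E}_j(\sigma) \cdot \vec{B}_j(\sigma) = \frac{1}{4(\Delta\sigma_\ell)^2}\, \vec{E}_{j\ell} \cdot \vec{B}_{j\ell} = 0 \]

for $\sigma > 0$ and

\[ \vec{E}_j(\sigma) \cdot \vec{B}_j(\sigma) = \frac{1}{4(\Delta\sigma_\ell)^2}\, \vec{E}^{*}_{j\ell} \cdot \vec{B}^{*}_{j\ell} = 0 \]

for $\sigma < 0$. We conclude, in the limit of decreasing $\Delta\sigma_\ell$ and increasing numbers of $\sigma_\ell$ values, that

\[ \vec{E}_j(\sigma) \cdot \vec{B}_j(\sigma) = 0 \tag{4.49c} \]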

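for all positive and negative values of $\sigma$. We divide both sides of Eq. (4.22b) by $4(\Delta\sigma_\ell)^2$ to get

\[ \frac{1}{4(\Delta\sigma_\ell)^2} \left( \vec{E}_{j\ell} \times \vec{B}^{*}_{j\ell} \right) = \frac{1}{c}\, \hat{\kappa}_j\, \frac{1}{4(\Delta\sigma_\ell)^2} \left( \vec{E}_{j\ell} \cdot \vec{E}^{*}_{j\ell} \right) . \tag{4.49d} \]

Consulting Eq. (4.45a), the complex conjugate of Eq. (4.45a), and the complex conjugate of Eq. (4.45c), we note that in the limit of decreasing $\Delta\sigma_\ell$ it follows that

\[ \vec{E}_j(\sigma) \times \vec{B}_j(\sigma)^{*} = \frac{1}{c}\, \hat{\kappa}_j \left( \vec{E}_j(\sigma) \cdot \vec{E}_j(\sigma)^{*} \right) \tag{4.49e} \]

for $\sigma > 0$. For $\sigma < 0$ we have, using (4.45b) and the complex conjugate of (4.45d), that

\[ \vec{E}_j(\sigma) \times \vec{B}_j(\sigma)^{*} = \frac{1}{4(\Delta\sigma_\ell)^2} \left( \vec{E}^{*}_{j\ell} \times \vec{B}_{j\ell} \right) . \]

Substituting this into the complex conjugate of (4.49d) gives

\[ \vec{E}_j(\sigma) \times \vec{B}_j(\sigma)^{*} = \frac{1}{c}\, \hat{\kappa}_j\, \frac{1}{4(\Delta\sigma_\ell)^2} \left( \vec{E}^{*}_{j\ell} \cdot \vec{E}_{j\ell} \right) . \]

Remembering that $\sigma < 0$, we now use (4.45b) and the complex conjugate of (4.45b) to write, in the limit of decreasing $\Delta\sigma_\ell$, that

\[ \vec{E}_j(\sigma) \times \vec{B}_j(\sigma)^{*} = \frac{1}{c}\, \hat{\kappa}_j \left( \vec{E}_j(\sigma) \cdot \vec{E}_j(\sigma)^{*} \right) = \frac{1}{c}\, \hat{\kappa}_j \left( \vec{E}_j(\sigma)^{*} \cdot \vec{E}_j(\sigma) \right) . \]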

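Comparing the results for $\sigma > 0$ and $\sigma < 0$, we conclude that

\[ \vec{E}_j(\sigma) \times \vec{B}_j(\sigma)^{*} = \frac{1}{c}\, \hat{\kappa}_j \left( \vec{E}_j(\sigma) \cdot \vec{E}_j(\sigma)^{*} \right) \tag{4.49f} \]

holds true for all positive and negative values of $\sigma$. Glancing back at Eqs. (4.47a) and (4.47b), we see that this can also be written as

\[ \vec{E}_j(\sigma) \times \vec{B}_j(-\sigma) = \frac{1}{c}\, \hat{\kappa}_j \left( \vec{E}_j(\sigma) \cdot \vec{E}_j(-\sigma) \right) \tag{4.49g} \]

for all positive and negative values of $\sigma$.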
4.8 Angle-Wavenumber Transforms
The next step is to convert the sums over j in Eqs. (4.46a) and (4.46b) into integrals. Remembering that the $\hat{\kappa}_j$ are defined in Eq. (4.12a) to be $\hat{\kappa}_j = \hat{x}\kappa_{jx} + \hat{y}\kappa_{jy} + \hat{z}\kappa_{jz}$, we require that $\kappa_{jz} > 0$. Now all the plane waves in Eqs. (4.46a) and (4.46b) are traveling more or less along the positive z axis of the Cartesian coordinate system; that is, the angle between $\hat{\kappa}_j$ and $\hat{z}$ is always less than $\pi/2$. We use

\[ 1 = |\hat{\kappa}_j|^2 = \kappa_{jx}^2 + \kappa_{jy}^2 + \kappa_{jz}^2 \]

[see Eq. (4.12c)] to write

\[ \hat{\kappa}_j = \hat{x}\kappa_{jx} + \hat{y}\kappa_{jy} + \hat{z}\sqrt{1 - \kappa_{jx}^2 - \kappa_{jy}^2} . \tag{4.50a} \]

This makes it clear that the two real parameters $\kappa_{jx}$ and $\kappa_{jy}$ specify the propagation direction $\hat{\kappa}_j$ of the jth plane wave. Consequently, each plane wave in the sums over j in Eqs. (4.46a) and (4.46b) can be specified by a single point in the $\kappa_x$, $\kappa_y$ plane. Figure 4.10 shows how this works for the sum of the five plane waves specified by the points $(\kappa_{1x}, \kappa_{1y})$, $(\kappa_{2x}, \kappa_{2y})$, $(\kappa_{3x}, \kappa_{3y})$, $(\kappa_{4x}, \kappa_{4y})$, and $(\kappa_{5x}, \kappa_{5y})$. We can construct a grid of $\kappa_x$, $\kappa_y$ values such that each plane wave is located at a node in the grid, where if necessary the grid lines are unevenly spaced as in Fig. 4.10. After numbering the grid lines, we can replace the single index j by a pair of indices m and n. The five plane waves in Fig. 4.10, for example, become

\[ (\kappa_{1x}, \kappa_{1y}) \to (\kappa_{2x}, \kappa_{4y}) , \quad (\kappa_{2x}, \kappa_{2y}) \to (\kappa_{5x}, \kappa_{1y}) , \]

\[ (\kappa_{3x}, \kappa_{3y}) \to (\kappa_{3x}, \kappa_{2y}) , \quad (\kappa_{4x}, \kappa_{4y}) \to (\kappa_{4x}, \kappa_{5y}) , \]

and

\[ (\kappa_{5x}, \kappa_{5y}) \to (\kappa_{1x}, \kappa_{3y}) . \]
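The re-indexing from a single index j to a pair of grid-line indices can be sketched as follows. The coordinates are hypothetical values picked here only so that the resulting index pairs match the five pairs listed above; the function name and the convention of numbering grid lines from 1 in increasing order are likewise assumptions.

```python
def grid_indices(points):
    """Map each (kappa_x, kappa_y) point to 1-based (n, m) grid-line indices.

    The grid lines are the distinct kappa_x and kappa_y values, sorted in
    increasing order, so the grid may be unevenly spaced.
    """
    kx_lines = sorted({kx for kx, _ in points})
    ky_lines = sorted({ky for _, ky in points})
    pairs = [(kx_lines.index(kx) + 1, ky_lines.index(ky) + 1)
             for kx, ky in points]
    return pairs, kx_lines, ky_lines


# Hypothetical direction components for plane waves j = 1, ..., 5.
waves = [(-0.1, 0.25), (0.4, -0.2), (0.0, -0.05), (0.2, 0.3), (-0.3, 0.1)]

pairs, kx_lines, ky_lines = grid_indices(waves)
# pairs[j - 1] is the (n, m) pair that replaces the single index j.
```

Any grid node not hit by one of the original waves simply carries a zero field, which is what allows the double sum over n and m to replace the single sum over j.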

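Replacing index j by a pair of indices m and n lets us write the sums in Eqs. (4.46a) and (4.46b) as

\[ \vec{E}^{(\mathrm{rad})}(\vec{r}, t) = \sum_n \sum_m \int_{-\infty}^{\infty} \vec{E}_{nm}(\sigma)\, e^{2\pi i \sigma (\hat{\kappa}_{nm} \cdot \vec{r} - ct)}\, d\sigma \tag{4.51a} \]

and

\[ \vec{B}^{(\mathrm{rad})}(\vec{r}, t) = \sum_n \sum_m \int_{-\infty}^{\infty} \vec{B}_{nm}(\sigma)\, e^{2\pi i \sigma (\hat{\kappa}_{nm} \cdot \vec{r} - ct)}\, d\sigma , \tag{4.51b} \]

where we define $\vec{E}_{nm}(\sigma) = \vec{B}_{nm}(\sigma) = 0$ for those grid points that do not correspond to propagation directions specified in the original sums over j. The new set of $\hat{\kappa}_{nm}$ propagation vectors can be written as

\[ \hat{\kappa}_{nm} = \hat{x}\kappa_{nx} + \hat{y}\kappa_{my} + \hat{z}\sqrt{1 - \kappa_{nx}^2 - \kappa_{my}^2} . \tag{4.51c} \]

For each m and n propagation direction in Eqs. (4.51a) and (4.51b), we now define that

\[ \vec{e}(\kappa_{nx}, \kappa_{my}, \sigma)\, \Delta\kappa_{nx}\, \Delta\kappa_{my} = \vec{E}_{nm}(\sigma) \tag{4.52a} \]

and

\[ \vec{b}(\kappa_{nx}, \kappa_{my}, \sigma)\, \Delta\kappa_{nx}\, \Delta\kappa_{my} = \vec{B}_{nm}(\sigma) \tag{4.52b} \]

with

\[ \Delta\kappa_{nx} = \kappa_{n+1,x} - \kappa_{n,x} \tag{4.52c} \]

and

\[ \Delta\kappa_{my} = \kappa_{m+1,y} - \kappa_{m,y} . \tag{4.52d} \]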

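FIGURE 4.10. [Plot of the $\kappa_x$, $\kappa_y$ plane showing an unevenly spaced grid with the five plane waves $(\kappa_{1x}, \kappa_{1y})$ through $(\kappa_{5x}, \kappa_{5y})$ located at grid nodes.]

In the limit of decreasing $\Delta\kappa_{nx}$, $\Delta\kappa_{my}$ and increasing numbers of specified propagation directions per unit interval in $\kappa_x$ and $\kappa_y$, Eqs. (4.51a) and (4.51b) can be written as

\[ \vec{E}^{(\mathrm{rad})}(\vec{r}, t) = \iint_{[\kappa_x^2 + \kappa_y^2 < 1]} d\kappa_x\, d\kappa_y \int_{-\infty}^{\infty} d\sigma\; \vec{e}(\kappa_x, \kappa_y, \sigma)\, e^{2\pi i \sigma (\hat{\kappa} \cdot \vec{r} - ct)} \tag{4.53a} \]

and

\[ \vec{B}^{(\mathrm{rad})}(\vec{r}, t) = \iint_{[\kappa_x^2 + \kappa_y^2 < 1]} d\kappa_x\, d\kappa_y \int_{-\infty}^{\infty} d\sigma\; \vec{b}(\kappa_x, \kappa_y, \sigma)\, e^{2\pi i \sigma (\hat{\kappa} \cdot \vec{r} - ct)} \tag{4.53b} \]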
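with $\hat{\kappa} = \hat{x}\kappa_x + \hat{y}\kappa_y + \hat{z}\sqrt{1 - \kappa_x^2 - \kappa_y^2}$. We single out the x and y components of $\hat{\kappa}$ and $\vec{r}$ by writing

\[ \hat{\kappa} = \vec{\kappa} + \hat{z}\sqrt{1 - \kappa^2} \tag{4.54a} \]

and

\[ \vec{r} = \vec{\rho} + z\hat{z} , \tag{4.54b} \]

where

\[ \vec{\kappa} = \hat{x}\kappa_x + \hat{y}\kappa_y , \tag{4.54c} \]

\[ \kappa^2 = |\vec{\kappa}|^2 = \kappa_x^2 + \kappa_y^2 , \tag{4.54d} \]

and

\[ \vec{\rho} = x\hat{x} + y\hat{y} . \tag{4.54e} \]

As a shorthand, we write the complex vector functions $\vec{e}$ and $\vec{b}$ as

\[ \vec{e}(\kappa_x, \kappa_y, \sigma) = \vec{e}(\vec{\kappa}, \sigma) \quad \text{and} \quad \vec{b}(\kappa_x, \kappa_y, \sigma) = \vec{b}(\vec{\kappa}, \sigma) . \]

Equations (4.52a) and (4.52b) show that both $\vec{e}(\vec{\kappa}, \sigma)$ and $\vec{b}(\vec{\kappa}, \sigma)$ must be negligible or zero for values of $\vec{\kappa} = \hat{x}\kappa_x + \hat{y}\kappa_y$ that do not correspond to grid points contained in the original sums over j. We also require $\vec{e}(\vec{\kappa}, \sigma)$ and $\vec{b}(\vec{\kappa}, \sigma)$ to be zero for values of $\vec{\kappa}$ for which $|\vec{\kappa}| \geq 1$. Now Eqs. (4.53a) and (4.53b) become

\[ \vec{E}^{(\mathrm{rad})}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty} d\sigma \iint d^2\kappa \left[ \vec{e}(\vec{\kappa}, \sigma)\, e^{2\pi i \sigma z \sqrt{1 - \kappa^2}} \right] e^{2\pi i \sigma (\vec{\kappa} \cdot \vec{\rho} - ct)} \tag{4.55a} \]

and

\[ \vec{B}^{(\mathrm{rad})}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty} d\sigma \iint d^2\kappa \left[ \vec{b}(\vec{\kappa}, \sigma)\, e^{2\pi i \sigma z \sqrt{1 - \kappa^2}} \right] e^{2\pi i \sigma (\vec{\kappa} \cdot \vec{\rho} - ct)} , \tag{4.55b} \]

where we have singled out the z dependence of $\vec{E}^{(\mathrm{rad})}$ and $\vec{B}^{(\mathrm{rad})}$, writing that

\[ \vec{E}^{(\mathrm{rad})}(\vec{r}, t) = \vec{E}^{(\mathrm{rad})}(x, y, z, t) = \vec{E}^{(\mathrm{rad})}(\vec{\rho}, z, t) \]

and

\[ \vec{B}^{(\mathrm{rad})}(\vec{r}, t) = \vec{B}^{(\mathrm{rad})}(x, y, z, t) = \vec{B}^{(\mathrm{rad})}(\vec{\rho}, z, t) . \]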
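From Eqs. (4.49f), (4.52a), and (4.52b), we have, replacing each j index by the appropriate m and n pair of indices,

\[ \vec{E}_{nm}(\sigma) \times \vec{B}_{nm}(\sigma)^{*} = (\Delta\kappa_{nx}\, \Delta\kappa_{my})^2\; \vec{e}(\kappa_{nx}, \kappa_{my}, \sigma) \times \vec{b}(\kappa_{nx}, \kappa_{my}, \sigma)^{*} = \frac{1}{c}\, (\Delta\kappa_{nx}\, \Delta\kappa_{my})^2\; \hat{\kappa}_{nm} \left( \vec{e}(\kappa_{nx}, \kappa_{my}, \sigma) \cdot \vec{e}(\kappa_{nx}, \kappa_{my}, \sigma)^{*} \right) . \]

Dropping the m and n indices, making the notation change $\kappa_x, \kappa_y \to \vec{\kappa}$, and dividing through by $(\Delta\kappa_{nx}\, \Delta\kappa_{my})^2$, we get, in the limit of decreasing $\Delta\kappa_{nx}$, $\Delta\kappa_{my}$ and increasing numbers of specified propagation directions, that

\[ \vec{e}(\vec{\kappa}, \sigma) \times \vec{b}(\vec{\kappa}, \sigma)^{*} = \frac{1}{c}\, \hat{\kappa} \left( \vec{e}(\vec{\kappa}, \sigma) \cdot \vec{e}(\vec{\kappa}, \sigma)^{*} \right) . \tag{4.56a} \]

Following the same procedure, we substitute Eqs. (4.52a) and (4.52b) into (4.49g) to get

\[ \vec{e}(\vec{\kappa}, \sigma) \times \vec{b}(\vec{\kappa}, -\sigma) = \frac{1}{c}\, \hat{\kappa} \left( \vec{e}(\vec{\kappa}, \sigma) \cdot \vec{e}(\vec{\kappa}, -\sigma) \right) . \tag{4.56b} \]

We can also substitute (4.52a) into (4.48a) to get, replacing each j by appropriate m and n indices,

\[ \Delta\kappa_{nx}\, \Delta\kappa_{my}\; \hat{\kappa}_{nm} \cdot \vec{e}(\kappa_{nx}, \kappa_{my}, \sigma) = 0 , \tag{4.56c} \]

which becomes, making the same notation changes as before and taking the same limit as before,

\[ \hat{\kappa} \cdot \vec{e}(\vec{\kappa}, \sigma) = 0 . \tag{4.56d} \]

A similar substitution of (4.52b) into (4.48b) gives

\[ \hat{\kappa} \cdot \vec{b}(\vec{\kappa}, \sigma) = 0 . \tag{4.56e} \]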

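Equations (4.55a) and (4.55b) can be simplified by defining

\[ \vec{\mathcal{E}}(\vec{\kappa}, \sigma, z) = \vec{e}(\vec{\kappa}, \sigma)\, e^{2\pi i \sigma z \sqrt{1 - \kappa^2}} \tag{4.57a} \]

and

\[ \vec{\mathcal{B}}(\vec{\kappa}, \sigma, z) = \vec{b}(\vec{\kappa}, \sigma)\, e^{2\pi i \sigma z \sqrt{1 - \kappa^2}} \tag{4.57b} \]

to get

\[ \vec{E}^{(\mathrm{rad})}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty} d\sigma \iint d^2\kappa\; \vec{\mathcal{E}}(\vec{\kappa}, \sigma, z)\, e^{2\pi i \sigma (\vec{\kappa} \cdot \vec{\rho} - ct)} \tag{4.58a} \]

and

\[ \vec{B}^{(\mathrm{rad})}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty} d\sigma \iint d^2\kappa\; \vec{\mathcal{B}}(\vec{\kappa}, \sigma, z)\, e^{2\pi i \sigma (\vec{\kappa} \cdot \vec{\rho} - ct)} . \tag{4.58b} \]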

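The complex vectors $\vec{\mathcal{E}}(\vec{\kappa}, \sigma, z)$ and $\vec{\mathcal{B}}(\vec{\kappa}, \sigma, z)$ are called the angle-wavenumber transforms of $\vec{E}^{(\mathrm{rad})}$ and $\vec{B}^{(\mathrm{rad})}$ respectively. By definition [see Eqs. (4.57a) and (4.57b)], the angle-wavenumber transforms at $z_0 + z$ are given by

\[ \vec{\mathcal{E}}(\vec{\kappa}, \sigma, z_0 + z) = \vec{\mathcal{E}}(\vec{\kappa}, \sigma, z_0)\, e^{2\pi i \sigma z \sqrt{1 - \kappa^2}} \tag{4.59a} \]

and

\[ \vec{\mathcal{B}}(\vec{\kappa}, \sigma, z_0 + z) = \vec{\mathcal{B}}(\vec{\kappa}, \sigma, z_0)\, e^{2\pi i \sigma z \sqrt{1 - \kappa^2}} . \tag{4.59b} \]

These equalities show that to get $\vec{\mathcal{E}}$ and $\vec{\mathcal{B}}$ at $z_0 + z$ we need only multiply $\vec{\mathcal{E}}$ and $\vec{\mathcal{B}}$ at $z_0$ by $e^{2\pi i \sigma z \sqrt{1 - \kappa^2}}$. Multiplication of Eqs. (4.56d) and (4.56e) by $e^{2\pi i \sigma z \sqrt{1 - \kappa^2}}$ gives

\[ \hat{\kappa} \cdot \vec{\mathcal{E}}(\vec{\kappa}, \sigma, z) = 0 \tag{4.59c} \]

and

\[ \hat{\kappa} \cdot \vec{\mathcal{B}}(\vec{\kappa}, \sigma, z) = 0 . \tag{4.59d} \]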
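The propagation rule in (4.59a) and (4.59b) amounts to multiplying by a single unimodular phase factor, which can be sketched as below. The function name, the tuple representation of the complex vector, and the sample values are assumptions made for this illustration; the code refuses transverse components with magnitude of one or more, where the transforms are required to vanish.

```python
import cmath
import math


def propagate(transform_at_z0, kappa, sigma, z):
    """Return the angle-wavenumber transform at z0 + z from its value at z0.

    transform_at_z0: tuple of three complex components.
    kappa: (kappa_x, kappa_y), the transverse part of the propagation direction.
    """
    kappa_sq = kappa[0] ** 2 + kappa[1] ** 2
    if kappa_sq >= 1.0:
        raise ValueError("transform is taken to be zero for |kappa| >= 1")
    factor = cmath.exp(2j * math.pi * sigma * z * math.sqrt(1.0 - kappa_sq))
    return tuple(component * factor for component in transform_at_z0)


# Hypothetical sample values.
start = (0.4 + 0.1j, -0.2 + 0.3j, 0.05 - 0.02j)
kappa = (0.3, -0.2)
sigma = 1.0e5  # wavenumber in 1/m

one_step = propagate(start, kappa, sigma, 3.0e-3)
two_steps = propagate(propagate(start, kappa, sigma, 1.0e-3), kappa, sigma, 2.0e-3)
```

Propagating in one step or in two successive steps gives the same result, and the magnitude of each component is unchanged, since the factor has unit modulus for real wavenumber and distance.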

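Multiplying both sides of (4.56a) by

\[ e^{2\pi i \sigma z \sqrt{1 - \kappa^2}}\, e^{-2\pi i \sigma z \sqrt{1 - \kappa^2}} = 1 \]

gives

\[ \left[ \vec{e}(\vec{\kappa}, \sigma)\, e^{2\pi i \sigma z \sqrt{1 - \kappa^2}} \right] \times \left[ \vec{b}(\vec{\kappa}, \sigma)^{*}\, e^{-2\pi i \sigma z \sqrt{1 - \kappa^2}} \right] = \frac{1}{c}\, \hat{\kappa} \left( \left[ \vec{e}(\vec{\kappa}, \sigma)\, e^{2\pi i \sigma z \sqrt{1 - \kappa^2}} \right] \cdot \left[ \vec{e}(\vec{\kappa}, \sigma)^{*}\, e^{-2\pi i \sigma z \sqrt{1 - \kappa^2}} \right] \right) \]

or

\[ \vec{\mathcal{E}}(\vec{\kappa}, \sigma, z) \times \vec{\mathcal{B}}(\vec{\kappa}, \sigma, z)^{*} = \frac{1}{c}\, \hat{\kappa} \left( \vec{\mathcal{E}}(\vec{\kappa}, \sigma, z) \cdot \vec{\mathcal{E}}(\vec{\kappa}, \sigma, z)^{*} \right) . \tag{4.59e} \]

Similar treatment of (4.56b) gives

\[ \vec{\mathcal{E}}(\vec{\kappa}, \sigma, z) \times \vec{\mathcal{B}}(\vec{\kappa}, -\sigma, z) = \frac{1}{c}\, \hat{\kappa} \left( \vec{\mathcal{E}}(\vec{\kappa}, \sigma, z) \cdot \vec{\mathcal{E}}(\vec{\kappa}, -\sigma, z) \right) . \tag{4.59f} \]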

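Equations (4.58a) and (4.58b) are a disguised form of the inverse Fourier transform. Writing (4.58a) using x and y for $\vec{\rho}$, $\kappa_x$ and $\kappa_y$ for $\vec{\kappa}$, and then making the substitutions

\[ w = -c\sigma , \tag{4.60a} \]

\[ u_x = \sigma\kappa_x , \tag{4.60b} \]

\[ u_y = \sigma\kappa_y \tag{4.60c} \]

gives

\[ \begin{aligned} \vec{E}^{(\mathrm{rad})}(x, y, z, t) &= \int_{-\infty}^{\infty} d\sigma \iint du_x\, du_y\; \sigma^{-2}\, \vec{\mathcal{E}}\!\left( \sigma^{-1} u_x,\, \sigma^{-1} u_y,\, \sigma,\, z \right) e^{2\pi i (x u_x + y u_y - \sigma c t)} \\ &= \int_{-\infty}^{\infty} \frac{dw}{c} \iint du_x\, du_y\; \frac{c^2}{w^2}\, \vec{\mathcal{E}}\!\left( -\frac{c u_x}{w},\, -\frac{c u_y}{w},\, -\frac{w}{c},\, z \right) e^{2\pi i (x u_x + y u_y + w t)} \end{aligned} \]

or

\[ \vec{E}^{(\mathrm{rad})}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty} dw \iint d^2u \left[ c\, w^{-2}\, \vec{\mathcal{E}}\!\left( -\frac{c\vec{u}}{w},\, -\frac{w}{c},\, z \right) \right] e^{2\pi i (\vec{\rho} \cdot \vec{u} + w t)} , \tag{4.61a} \]

where in the last step we create a vector

\[ \vec{u} = \hat{x} u_x + \hat{y} u_y \tag{4.61b} \]

such that

\[ \vec{\kappa} = \frac{1}{\sigma}\, \vec{u} = -\frac{c}{w}\, \vec{u} . \tag{4.61c} \]

The same transformation of variables applied to the triple integral over $\vec{\mathcal{B}}$ in (4.58b) gives

\[ \vec{B}^{(\mathrm{rad})}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty} dw \iint d^2u \left[ c\, w^{-2}\, \vec{\mathcal{B}}\!\left( -\frac{c\vec{u}}{w},\, -\frac{w}{c},\, z \right) \right] e^{2\pi i (\vec{\rho} \cdot \vec{u} + w t)} . \tag{4.61d} \]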

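According to Eq. (2.110f) in Chapter 2, we have now demonstrated that functions $\vec{E}^{(\mathrm{rad})}$ and $\vec{B}^{(\mathrm{rad})}$ at a specified value of z are the vector inverse Fourier transforms of

$$c\,w^{-2}\,\vec{\mathcal{E}}\!\left(-\frac{c\vec{u}}{w},\,-\frac{w}{c},\,z\right) \quad\text{and}\quad c\,w^{-2}\,\vec{\mathcal{B}}\!\left(-\frac{c\vec{u}}{w},\,-\frac{w}{c},\,z\right).$$

Hence, the vector forward Fourier transforms of $\vec{E}^{(\mathrm{rad})}$ and $\vec{B}^{(\mathrm{rad})}$ must be [see Eq. (2.110e) in Chapter 2]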

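$$c\,w^{-2}\,\vec{\mathcal{E}}\!\left(-\frac{c\vec{u}}{w},\,-\frac{w}{c},\,z\right) = \int_{-\infty}^{\infty} dt \iint d^2\rho\; \vec{E}^{(\mathrm{rad})}(\vec{\rho},z,t)\, e^{-2\pi i(\vec{u}\cdot\vec{\rho}+wt)} \tag{4.62a}$$
and
$$c\,w^{-2}\,\vec{\mathcal{B}}\!\left(-\frac{c\vec{u}}{w},\,-\frac{w}{c},\,z\right) = \int_{-\infty}^{\infty} dt \iint d^2\rho\; \vec{B}^{(\mathrm{rad})}(\vec{\rho},z,t)\, e^{-2\pi i(\vec{u}\cdot\vec{\rho}+wt)} \tag{4.62b}$$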

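or, returning to the $\vec{\kappa}$ and $\sigma$ arguments,

$$\vec{\mathcal{E}}(\vec{\kappa},\sigma,z) = c\,\sigma^2 \int_{-\infty}^{\infty} dt \iint d^2\rho\; \vec{E}^{(\mathrm{rad})}(\vec{\rho},z,t)\, e^{-2\pi i\sigma(\vec{\kappa}\cdot\vec{\rho}-ct)} \tag{4.62c}$$
and
$$\vec{\mathcal{B}}(\vec{\kappa},\sigma,z) = c\,\sigma^2 \int_{-\infty}^{\infty} dt \iint d^2\rho\; \vec{B}^{(\mathrm{rad})}(\vec{\rho},z,t)\, e^{-2\pi i\sigma(\vec{\kappa}\cdot\vec{\rho}-ct)}. \tag{4.62d}$$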

Equations (4.58a), (4.58b), (4.62c), and (4.62d) are a formal transformation from the angle-wavenumber transforms to the real E and B radiation fields and back again, subject only to the constraint that $\kappa_z > 0$ in the propagation vector $\hat{\kappa} = \vec{\kappa} + \hat{z}\kappa_z$ and that

$$\vec{\mathcal{E}}(\vec{\kappa},\sigma,z) = \vec{\mathcal{B}}(\vec{\kappa},\sigma,z) = 0 \quad\text{when}\quad \kappa^2 = \kappa_x^2 + \kappa_y^2 \ge 1.$$

To go from Eqs. (4.58a) and (4.58b) to (4.62c) and (4.62d), we show the original angle-wavenumber transforms to be a form of three-dimensional vector Fourier transform. This lets us use Fourier transform theory to write down the integrals for the inverse transforms. Unfortunately, the change in Eqs. (4.60a) through (4.60c) from the $\vec{\kappa}$, $\sigma$ variables to the $\vec{u}$, w variables that reveals the transforms' Fourier nature is a somewhat awkward one. There are two reasons for this: In the physical sciences, waves conventionally travel from left to right, forcing $\sigma$ and w to have opposite signs in (4.60a), and in spectroscopy, wavenumbers rather than frequencies are conventionally used to characterize monochromatic radiation. Nevertheless, the rewards of converting to the Fourier transform (immediate access to the well-known results of Fourier theory) significantly outweigh the inconvenience, and the reader can expect to see transformations between the $\vec{\kappa}$, $\sigma$ variables and the $\vec{u}$, w variables more than once in the balance of this chapter.
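Because the change of variables in Eqs. (4.60a) through (4.60c) recurs later, a small sketch of it may help. It assumes the substitutions read $w = -c\sigma$, $u_x = \sigma\kappa_x$, $u_y = \sigma\kappa_y$; the function names, the CGS value of c, and the sample numbers are illustrative choices, not taken from the text.

```python
import math

C_LIGHT = 2.99792458e10  # cm/s, assuming sigma is given in cm^-1


def to_fourier_variables(kappa_x, kappa_y, sigma, c=C_LIGHT):
    """Map (kappa_x, kappa_y, sigma) to (u_x, u_y, w)."""
    return sigma * kappa_x, sigma * kappa_y, -c * sigma


def to_angle_wavenumber(u_x, u_y, w, c=C_LIGHT):
    """Invert the substitution; not defined at w = 0."""
    sigma = -w / c
    return u_x / sigma, u_y / sigma, sigma


def phase_angle_wavenumber(kappa_x, kappa_y, sigma, x, y, t, c=C_LIGHT):
    """Phase 2*pi*sigma*(kappa . rho - c t) in the original variables."""
    return 2.0 * math.pi * sigma * (kappa_x * x + kappa_y * y - c * t)


def phase_fourier(u_x, u_y, w, x, y, t):
    """Phase 2*pi*(u . rho + w t) in the Fourier variables."""
    return 2.0 * math.pi * (u_x * x + u_y * y + w * t)


# Hypothetical sample point.
kx, ky, sigma = 0.01, -0.02, 1000.0
x, y, t = 0.3, 0.1, 1.0e-12
ux, uy, w = to_fourier_variables(kx, ky, sigma)
phase_a = phase_angle_wavenumber(kx, ky, sigma, x, y, t)
phase_b = phase_fourier(ux, uy, w, x, y, t)
round_trip = to_angle_wavenumber(ux, uy, w)
```

The two phases agree, which is why the same exponential can be read either as a plane wave in $\vec{\kappa}$, $\sigma$ or as a Fourier kernel in $\vec{u}$, w.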
4.9 Beam-Chopped and Direction-Chopped Radiation
In geometric optics, a plane wave of any sort, polychromatic or monochromatic, is represented by a collection of equally spaced parallel rays (see Fig. 4.11). Returning briefly to the notation of Eqs. (4.46g) and (4.46h), we again label each plane wave's direction of propagation with a propagation vector $\hat{\kappa}_j$. Each ray belonging to the collection of rays representing the plane wave points in the direction of $\hat{\kappa}_j$, and the plane surfaces specified by $\hat{\kappa}_j\cdot\vec{r} = \text{constant}$ are surfaces perpendicular to all the parallel rays. If the plane wave is monochromatic, then these surfaces where $\hat{\kappa}_j\cdot\vec{r} = \text{constant}$ are also surfaces of constant phase at fixed time t, since the monochromatic phase term is

$$2\pi\sigma_\ell(\hat{\kappa}_j\cdot\vec{r} - ct)$$

[see the discussion following Eq. (4.19b)].

This means, of course, that the monochromatic E field as well as the monochromatic B field is constant over any of these plane surfaces at fixed time t. If the plane wave is polychromatic, we review the discussion following Eq. (4.47b) and note that a single polychromatic plane wave has E and B fields specified by the vector functions

$$\vec{E}_j(\hat{\kappa}_j\cdot\vec{r} - ct) \quad\text{and}\quad \vec{B}_j(\hat{\kappa}_j\cdot\vec{r} - ct)$$

respectively. Consequently, at any fixed time t, the polychromatic E field as well as the polychromatic B field is constant over any plane surface where $\hat{\kappa}_j\cdot\vec{r} = \text{constant}$; that is, they are constant over any plane surface perpendicular to the rays. For both monochromatic and polychromatic plane waves, the E and B fields themselves lie in these plane surfaces because they must be perpendicular to the propagation vector $\hat{\kappa}_j$ [as shown by Eqs. (4.14c), (4.14d), (4.49a), and (4.49b)].
Figure 4.11 shows a plane wave encountering an aperture. The rays entering the aperture pass
on through, creating a beam; we say that the aperture creates a beam-chopped radiation field.

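FIGURE 4.11. Apertures can be used to create beam-chopped radiation fields.

From our current point of view, the most important characteristic of beam-chopped fields is that they obviously can be Fourier transformed in planes perpendicular to the beam's direction of travel. Using the x, y, z coordinate system shown in Fig. 4.11, with its origin in the center of the beam and its z axis pointing down the beam, we drop the (rad) superscript from Eqs. (4.62a) and (4.62b) and write

$$\begin{aligned}
c\,w^{-2}\,\vec{\mathcal{E}}\!\left(-\frac{c\vec{u}}{w},\,-\frac{w}{c},\,z\right)
&= \int_{-\infty}^{\infty} dt \iint d^2\rho\; \vec{E}(\vec{\rho},z,t)\, e^{-2\pi i(\vec{u}\cdot\vec{\rho}+wt)} \\
&= \int_{-\infty}^{\infty} dt \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\; \vec{E}(x,y,z,t)\, e^{-2\pi i(xu_x+yu_y+wt)}
\end{aligned} \tag{4.63a}$$
and
$$\begin{aligned}
c\,w^{-2}\,\vec{\mathcal{B}}\!\left(-\frac{c\vec{u}}{w},\,-\frac{w}{c},\,z\right)
&= \int_{-\infty}^{\infty} dt \iint d^2\rho\; \vec{B}(\vec{\rho},z,t)\, e^{-2\pi i(\vec{u}\cdot\vec{\rho}+wt)} \\
&= \int_{-\infty}^{\infty} dt \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\; \vec{B}(x,y,z,t)\, e^{-2\pi i(xu_x+yu_y+wt)},
\end{aligned} \tag{4.63b}$$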
where $\vec{E}(x,y,z,t)$ and $\vec{B}(x,y,z,t)$ represent the E and B fields after the aperture in Fig. 4.11. In these formulas the integrals over x and y can be assumed to converge because the beam-chopped E and B fields are negligibly small for large values of x and y.
Figure 4.11 suggests that a beam-chopped radiation field can travel indefinitely far to the right with a cross-section that is always the same shape as the aperture. We know, however, that diffraction eventually causes all beam-chopped radiation fields to spread; the smaller the characteristic (or average) wavelength of the radiation compared to the characteristic (or average) distance across the aperture, the farther the beam travels before significant spreading occurs.[62] Michelson interferometers use apertures that are very large compared to the wavelengths of interest, ensuring that only an insignificant amount of spreading occurs in the beam-chopped field as it travels through the instrument.
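To put rough numbers on the diffraction argument above, the sketch below uses the familiar order-of-magnitude estimate $D^2/\lambda$ (the Fresnel distance) for how far a beam of width D travels before spreading becomes significant. That estimate is a standard rule of thumb, not a formula from this section, and the aperture and wavelength values are assumed for illustration.

```python
def spreading_distance(aperture_width, wavelength):
    """Rough distance D**2 / wavelength before diffraction spreading matters.

    Both arguments must use the same length unit; the result has that unit.
    This is an order-of-magnitude estimate only.
    """
    return aperture_width ** 2 / wavelength


# Hypothetical instrument: 2.5 cm aperture, 10 micrometer (1e-3 cm) radiation.
distance_cm = spreading_distance(2.5, 1.0e-3)

# Shorter wavelength through the same aperture: the beam holds together longer.
distance_short_cm = spreading_distance(2.5, 0.5e-3)
```

With these numbers the estimate is about 62 m, far longer than the optical path inside a typical bench instrument, which is consistent with the claim that spreading inside the interferometer is insignificant.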
In geometric optics, when a lens is placed perpendicular to the z axis (that is, the optical axis) of a beam, the plane waves with propagation vectors parallel to the optical axis are focused onto the point where the optical axis intersects a perpendicular surface called the focal plane (see Fig. 4.12). Plane waves with propagation vectors at an angle with respect to the optical axis are focused onto points in the focal plane that are off to the side. Figure 4.12 shows four rays representing a plane wave propagating at a small angle to the optical axis being focused by lens A slightly to the side of where the axis intersects the focal plane. Every propagation direction that is at a small angle to the optical axis is focused onto a unique point in the focal plane close to the optical axis, and each point in the focal plane close to the optical axis corresponds to a unique propagation direction at a small angle to the optical axis. Directions that differ only slightly with respect to each other are focused at closely adjacent points. The plane wave in Fig. 4.12 has a propagation vector at a small enough angle with respect to the optical axis that it is focused by lens A only slightly to the side of where the axis intersects the focal plane. Consequently, it passes through the small aperture placed in the focal plane and out to lens B, which defocuses it back into a plane wave. Figure 4.13 gives a side view of this phenomenon. Here there are three plane waves a, b, and c propagating in different directions with respect to the beam's optical axis. All the rays belonging to plane wave a are focused at point a in the focal plane; all the rays belonging to plane wave b are focused at point b in the focal plane; and all the rays belonging to plane wave c are focused at point c in the focal plane. Only those plane waves with propagation vectors at just a slight angle to the optical axis, such as plane wave b, pass through the central aperture, allowing lens B to create a beam of plane waves propagating nearly parallel to the optical axis. We say that the radiation leaving lens B has been direction-chopped, meaning that it contains only a small range of propagation directions. The distance between the focal plane and lenses A and B depends on the lens index of refraction, which may in turn depend on the radiation frequency $f = c\sigma$. If the frequency dependence is strong, then the two lenses in Fig. 4.13 may not do a good job of creating a polychromatic direction-chopped beam. When this is a concern, the all-reflective setup shown in Fig. 4.14 (composed of two Cassegrain telescopes having focal-plane locations independent of frequency) is a better way to remove unwanted propagation directions.

62. R. W. Ditchburn, Light, Vol. I, 2nd ed. (Interscience Publishers, a division of John Wiley & Sons, Inc., New York, 1963), pp. 162-166, 195.
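The angle-to-position mapping that makes direction chopping work can be sketched with the thin-lens relation: a plane wave tilted by an angle $\theta$ from the optical axis focuses a distance $f\tan\theta$ from the axis. The thin-lens idealization, the function names, and the numerical values below are assumptions of this sketch rather than anything specified in the text.

```python
import math


def focal_plane_offset(angle_rad, focal_length):
    """Distance from the optical axis at which a tilted plane wave focuses."""
    return focal_length * math.tan(angle_rad)


def passes_aperture(angle_rad, focal_length, aperture_radius):
    """True when the focused spot lands inside the focal-plane aperture."""
    return abs(focal_plane_offset(angle_rad, focal_length)) <= aperture_radius


def max_passed_angle(focal_length, aperture_radius):
    """Largest tilt angle (radians) that still gets through the aperture."""
    return math.atan(aperture_radius / focal_length)


# Hypothetical setup: 100 mm focal length, 0.5 mm aperture radius.
FOCAL_LENGTH_MM = 100.0
APERTURE_RADIUS_MM = 0.5

# Three plane waves playing the roles of a, b, and c in the side view.
angles = {"a": -0.05, "b": 0.002, "c": 0.05}
passed = {name: passes_aperture(angle, FOCAL_LENGTH_MM, APERTURE_RADIUS_MM)
          for name, angle in angles.items()}
cutoff = max_passed_angle(FOCAL_LENGTH_MM, APERTURE_RADIUS_MM)
```

Only wave b survives here; shrinking the aperture or lengthening the focal length narrows the range of directions left in the beam.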
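Using the notation of Eq. (4.51c), we note that direction-chopped radiation can contain only propagation vectors

$$\hat{\kappa}_{nm} = \hat{x}\kappa_{nx} + \hat{y}\kappa_{my} + \hat{z}\sqrt{1-\kappa_{nx}^2-\kappa_{my}^2}$$

that are nearly parallel to the optical axis. This means that both

$$|\kappa_{nx}| \ll 1 \quad\text{and}\quad |\kappa_{my}| \ll 1$$

for all values of n, m in the sum over plane waves in Eqs. (4.51a) and (4.51b).
When these sums are transformed into double integrals in (4.53a) and (4.53b), the propagation vectors

$$\hat{\kappa}_{nm} = \hat{x}\kappa_{nx} + \hat{y}\kappa_{my} + \hat{z}\sqrt{1-\kappa_{nx}^2-\kappa_{my}^2} \quad\text{with}\quad |\kappa_{nx}| \ll 1 \ \text{and}\ |\kappa_{my}| \ll 1$$

become, according to Eqs. (4.54a) and (4.54c),

$$\hat{\kappa} = \hat{x}\kappa_x + \hat{y}\kappa_y + \hat{z}\sqrt{1-\kappa^2}.$$

Functions

$$\vec{e}(\vec{\kappa},\sigma) = \vec{e}(\kappa_x,\kappa_y,\sigma) \quad\text{and}\quad \vec{b}(\vec{\kappa},\sigma) = \vec{b}(\kappa_x,\kappa_y,\sigma)$$

are negligible or zero in direction-chopped beams unless both

$$|\kappa_x| \ll 1 \quad\text{and}\quad |\kappa_y| \ll 1.$$

Since the angle-wavenumber transforms $\vec{\mathcal{E}}(\vec{\kappa},\sigma,z)$ and $\vec{\mathcal{B}}(\vec{\kappa},\sigma,z)$ in (4.57a) and (4.57b) are proportional to $\vec{e}(\vec{\kappa},\sigma)$ and $\vec{b}(\vec{\kappa},\sigma)$, they should also be negligible or zero in direction-chopped beams when $\kappa_x$ and $\kappa_y$ are not both very small. Consequently, in Eqs. (4.58a) and (4.58b) we see, dropping the (rad) superscript, that the formulas for the E and B fields of the direction-chopped beam,

$$\vec{E}(\vec{\rho},z,t) = \int_{-\infty}^{\infty} d\sigma \iint d^2\kappa\; \vec{\mathcal{E}}(\vec{\kappa},\sigma,z)\, e^{2\pi i\sigma(\vec{\kappa}\cdot\vec{\rho}-ct)} \tag{4.64a}$$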

FIGURE 4.12. Two matched lenses can be used to create direction-chopped radiation. Only plane waves propagating at small angles to the optical axis can make it through the aperture in the focal plane of the lenses (see also Figs. 4.13 and 4.14). [Labels: optical axis, lens A, focal plane with aperture, lens B.]

FIGURE 4.13. Fig. 4.12 gives a three-dimensional view of how matched lenses can be used to create direction-chopped radiation, and this diagram is the side view. Plane waves propagating at large angles to the optical axis, like the a and c plane waves in the diagram, are removed from the beam because they focus outside the aperture in the focal plane (see also Fig. 4.14). [Labels: optical axis, lens A, focal plane with aperture, lens B, plane waves a, b, c.]

FIGURE 4.14. Just like the lenses in Fig. 4.13, two matched Cassegrain telescopes can be used to create direction-chopped radiation. Again plane waves propagating at large angles to the optical axis are removed from the beam because they focus outside the focal-plane aperture. [Labels: optical axis, telescope A, focal plane with aperture, telescope B, plane waves a, b, c, axes x, y, z.]

FIGURE 4.15. [Labels: unit vectors $\hat{x}$, $\hat{y}$, $\hat{z}$; propagation vector $\hat{\kappa}$; infinitesimal area element $d^2\kappa$; components $\kappa_x$ and $\kappa_y$.]

and

$$\vec{B}(\vec{\rho},z,t) = \int_{-\infty}^{\infty} d\sigma \iint d^2\kappa\; \vec{\mathcal{B}}(\vec{\kappa},\sigma,z)\, e^{2\pi i\sigma(\vec{\kappa}\cdot\vec{\rho}-ct)}, \tag{4.64b}$$

have double integrals over $d^2\kappa = d\kappa_x\,d\kappa_y$ that must converge. For each $\kappa_x$, $\kappa_y$ pair of values inside the double integral, the tip of the $\hat{\kappa}$ vector can be thought of as lying somewhere inside the infinitesimal area $d^2\kappa = d\kappa_x\,d\kappa_y$ (see Fig. 4.15). As long as only direction-chopped beams where both $\kappa_x$ and $\kappa_y$ are small are being analyzed, this $d^2\kappa$ infinitesimal area must be approximately perpendicular to the direction in which $\hat{\kappa}$ points. Because $\hat{\kappa}$ is of unit length and $d^2\kappa$ is an infinitesimal area, the formula for the solid angle subtended by $d^2\kappa$ becomes $d^2\kappa/|\hat{\kappa}|^2 = d^2\kappa$. Hence, $d^2\kappa$ can also be regarded as an infinitesimal solid angle, and the double integrals over $d^2\kappa$ can be interpreted as integrals over all the solid angles that specify allowed propagation directions inside the direction-chopped beam.
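How good is the identification of $d^2\kappa$ with a solid angle? A quick check uses the exact projection relation for a unit sphere, $d\Omega = d\kappa_x\,d\kappa_y/\sqrt{1-\kappa_x^2-\kappa_y^2}$. That relation is a standard geometric result brought in for this sketch, not a formula from the text, and the function name and sample directions are illustrative.

```python
import math


def solid_angle_per_unit_area(kappa_x, kappa_y):
    """Exact ratio d(Omega) / (d kappa_x d kappa_y) on the unit sphere.

    Equals 1 / kappa_z, where kappa_z = sqrt(1 - kappa_x^2 - kappa_y^2).
    """
    kappa_sq = kappa_x ** 2 + kappa_y ** 2
    if kappa_sq >= 1.0:
        raise ValueError("need kappa_x^2 + kappa_y^2 < 1")
    return 1.0 / math.sqrt(1.0 - kappa_sq)


on_axis = solid_angle_per_unit_area(0.0, 0.0)       # exactly aligned
near_axis = solid_angle_per_unit_area(0.01, 0.0)    # 10 mrad off axis
far_off_axis = solid_angle_per_unit_area(0.3, 0.4)  # about 30 degrees off axis
```

For a direction 10 mrad off axis the ratio differs from 1 by roughly 5 parts in $10^5$, so treating $d^2\kappa$ as a solid angle is an excellent approximation for direction-chopped beams; at 30 degrees off axis the ratio is already about 1.15.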
4.10 Time-Chopped and Band-Limited Radiation
When Michelson interferometers are used to measure spectra, the act of measurement must cover a finite interval of time. If the radiation fields drop abruptly to zero outside this time interval, the result of the measurement is not affected as long as the field values inside the time interval do not change. Mathematically speaking, it is often convenient to analyze the situation as if the radiation fields do indeed drop abruptly to zero before and after the measurement interval. We say these radiation fields are time-chopped. The formulas for the angle-wavenumber transforms of time-chopped radiation fields are [dropping the (rad) superscript from Eqs. (4.62c) and (4.62d)]

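$$\vec{\mathcal{E}}(\vec{\kappa},\sigma,z) = c\,\sigma^2 \int_{-\infty}^{\infty} dt \iint d^2\rho\; \vec{E}(\vec{\rho},z,t)\, e^{-2\pi i\sigma(\vec{\kappa}\cdot\vec{\rho}-ct)} \tag{4.65a}$$
and
$$\vec{\mathcal{B}}(\vec{\kappa},\sigma,z) = c\,\sigma^2 \int_{-\infty}^{\infty} dt \iint d^2\rho\; \vec{B}(\vec{\rho},z,t)\, e^{-2\pi i\sigma(\vec{\kappa}\cdot\vec{\rho}-ct)}. \tag{4.65b}$$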

Since the $\vec{E}(\vec{\rho},z,t)$ and $\vec{B}(\vec{\rho},z,t)$ radiation fields are assumed to be time-chopped, the integrals between $-\infty$ and $+\infty$ over time must be well defined and so converge. When the E and B fields are beam-chopped, the infinite double integrals over $d^2\rho$ are also well defined and converge [see the discussion after (4.63b)], so the multiple integrals defining $\vec{\mathcal{E}}(\vec{\kappa},\sigma,z)$ and $\vec{\mathcal{B}}(\vec{\kappa},\sigma,z)$ are well-defined quantities when $\vec{E}(\vec{\rho},z,t)$ and $\vec{B}(\vec{\rho},z,t)$ represent beam-chopped and time-chopped radiation fields. Similar reasoning shows that when the angle-wavenumber transforms are calculated using the three-dimensional Fourier transforms in Eqs. (4.63a) and (4.63b),

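$$c\,w^{-2}\,\vec{\mathcal{E}}\!\left(-\frac{c\vec{u}}{w},\,-\frac{w}{c},\,z\right) = \int_{-\infty}^{\infty} dt \iint d^2\rho\; \vec{E}(\vec{\rho},z,t)\, e^{-2\pi i(\vec{u}\cdot\vec{\rho}+wt)} \tag{4.65c}$$
and
$$c\,w^{-2}\,\vec{\mathcal{B}}\!\left(-\frac{c\vec{u}}{w},\,-\frac{w}{c},\,z\right) = \int_{-\infty}^{\infty} dt \iint d^2\rho\; \vec{B}(\vec{\rho},z,t)\, e^{-2\pi i(\vec{u}\cdot\vec{\rho}+wt)}, \tag{4.65d}$$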

the infinite integrals over dt and $d^2\rho$ are for the same reasons well-defined and convergent when $\vec{E}(\vec{\rho},z,t)$ and $\vec{B}(\vec{\rho},z,t)$ represent beam-chopped and time-chopped radiation fields.
The inverse transforms to Eqs. (4.65a) and (4.65b) are given in (4.64a) and (4.64b),

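$$\vec{E}(\vec{\rho},z,t) = \int_{-\infty}^{\infty} d\sigma \iint d^2\kappa\; \vec{\mathcal{E}}(\vec{\kappa},\sigma,z)\, e^{2\pi i\sigma(\vec{\kappa}\cdot\vec{\rho}-ct)} \tag{4.66a}$$
and
$$\vec{B}(\vec{\rho},z,t) = \int_{-\infty}^{\infty} d\sigma \iint d^2\kappa\; \vec{\mathcal{B}}(\vec{\kappa},\sigma,z)\, e^{2\pi i\sigma(\vec{\kappa}\cdot\vec{\rho}-ct)}. \tag{4.66b}$$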

Michelson interferometers (indeed, most types of optical instruments) usually shield their detectors with filters that pass only the radiation wavelengths that the detectors are designed to measure. Hence, in (4.66a) and (4.66b), we expect $\vec{\mathcal{E}}(\vec{\kappa},\sigma,z)$ and $\vec{\mathcal{B}}(\vec{\kappa},\sigma,z)$ to be negligible for wavenumbers $\sigma$ corresponding to radiation wavelengths blocked by the filters. The filters are said to define the spectral band (or bands) to which the instrument is sensitive. Even when these filters are built into the detectors themselves, which means the actual radiation fields traversing the instrument may contain out-of-band radiation, it is mathematically convenient to assume that only negligible amounts of out-of-band radiation are present inside the instrument (while, of course, retaining the correct amounts of in-band radiation measured by the detectors). The situation is very similar to that encountered in the discussion of time-chopped radiation fields; just as we assume the absence of radiation outside the time interval during which the measurement occurs, so now we assume the absence of out-of-band radiation to which the detectors are insensitive.
We must be careful to note which wavenumbers correspond to the radiation band passed by the filters. Remembering that the wavenumber is one over the wavelength, and reviewing how the original sums over $\sigma_\ell$ in Eqs. (4.44a) and (4.44b) become integrals over $\sigma$ in Eqs. (4.46a) and (4.46b), we see that if only wavelengths between $\lambda_a$ and $\lambda_b$ are measured by the detectors,

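$$0 < \lambda_b \le \lambda_a, \tag{4.67a}$$

then $\vec{\mathcal{E}}(\vec{\kappa},\sigma,z)$ and $\vec{\mathcal{B}}(\vec{\kappa},\sigma,z)$ can be non-negligible only for $\sigma$ values inside the two intervals

$$-\frac{1}{\lambda_b} \le \sigma \le -\frac{1}{\lambda_a} < 0$$
and
$$0 < \frac{1}{\lambda_a} \le \sigma \le \frac{1}{\lambda_b}.$$
These intervals can also be written as

$$-\sigma_b \le \sigma \le -\sigma_a < 0 \tag{4.67b}$$
and
$$0 < \sigma_a \le \sigma \le \sigma_b, \tag{4.67c}$$
where
$$\sigma_a = \frac{1}{\lambda_a} \tag{4.67d}$$
and
$$\sigma_b = \frac{1}{\lambda_b}. \tag{4.67e}$$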
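The bookkeeping between a wavelength band and its two wavenumber intervals is easy to get backwards, so a small sketch may help. It assumes $\lambda_b \le \lambda_a$ and returns the negative and positive intervals bounded by $\sigma_a = 1/\lambda_a$ and $\sigma_b = 1/\lambda_b$; the function names and the 8 to 12 micrometer example band are illustrative assumptions.

```python
def wavenumber_intervals(lambda_a, lambda_b):
    """Return ((-sigma_b, -sigma_a), (sigma_a, sigma_b)) for a wavelength band.

    lambda_a is the longest and lambda_b the shortest wavelength passed,
    both in the same length unit; wavenumbers come out in inverse units.
    """
    if not 0.0 < lambda_b <= lambda_a:
        raise ValueError("expected 0 < lambda_b <= lambda_a")
    sigma_a = 1.0 / lambda_a
    sigma_b = 1.0 / lambda_b
    return (-sigma_b, -sigma_a), (sigma_a, sigma_b)


def in_band(sigma, lambda_a, lambda_b):
    """True when sigma lies in either of the two allowed intervals."""
    negative, positive = wavenumber_intervals(lambda_a, lambda_b)
    return (negative[0] <= sigma <= negative[1]
            or positive[0] <= sigma <= positive[1])


# Hypothetical band: 8 to 12 micrometers, expressed in centimeters.
negative_band, positive_band = wavenumber_intervals(12.0e-4, 8.0e-4)
```

For this band the positive interval runs from about 833 to 1250 cm^-1, and the negative interval is its mirror image; the longest wavelength sets the interval edge closest to zero.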

A band-limited function g(t) is a function whose Fourier transform

$$G(f) = \int_{-\infty}^{\infty} e^{-2\pi ift}\,g(t)\,dt$$

becomes strictly zero when $|f| > F$ for some positive value of F. There is a well-known theorem that states that when a function g(t) is time-chopped, meaning that there exists some positive value of T such that g(t) is strictly zero whenever $|t| > T$, then there is no value of F such that the Fourier transform

$$G(f) = \int_{-\infty}^{\infty} e^{-2\pi ift}\,g(t)\,dt$$

becomes strictly zero whenever $|f| > F$. In short, a function cannot be both time-chopped and band-limited.[63]
If the angle-wavenumber transforms $\vec{\mathcal{E}}(\vec{\kappa},\sigma,z)$ and $\vec{\mathcal{B}}(\vec{\kappa},\sigma,z)$ are taken to be strictly zero for wavenumbers $\sigma$ outside the intervals specified in (4.67b) and (4.67c), then, because Eqs. (4.65c) and (4.65d) show $\vec{\mathcal{E}}(\vec{\kappa},\sigma,z)$ and $\vec{\mathcal{B}}(\vec{\kappa},\sigma,z)$ to be proportional to the Fourier transforms of $\vec{E}(\vec{\rho},z,t)$ and $\vec{B}(\vec{\rho},z,t)$, functions $\vec{E}(\vec{\rho},z,t)$ and $\vec{B}(\vec{\rho},z,t)$ must be band-limited functions. Therefore, according to the just-mentioned theorem, we cannot say that functions $\vec{E}(\vec{\rho},z,t)$ and $\vec{B}(\vec{\rho},z,t)$ are both band-limited and time-chopped. Unfortunately, we have just said in the previous two paragraphs that we expect $\vec{E}(\vec{\rho},z,t)$ and $\vec{B}(\vec{\rho},z,t)$ to be just that: both band-limited and time-chopped. The loophole in this situation is that Fourier transforms

$$G(f) = \int_{-\infty}^{\infty} e^{-2\pi ift}\,g(t)\,dt$$

can be negligibly small without becoming strictly zero, allowing us to create time-chopped functions g(t) whose Fourier transforms G(f) are only approximately zero when $|f| > F$ for some positive value of F. Hence it is possible for g(t) to be exactly time-chopped and approximately band-limited.[64]
Similarly, we are free to regard the angle-wavenumber transforms $\vec{\mathcal{E}}$ and $\vec{\mathcal{B}}$ as being negligibly small rather than strictly zero for values of $\sigma$ representing out-of-band radiation when $\vec{E}$ and $\vec{B}$ represent strictly time-chopped radiation fields. Hence it does make sense to treat the radiation fields as both time-chopped and approximately band-limited, taking the $\vec{E}(\vec{\rho},z,t)$ and $\vec{B}(\vec{\rho},z,t)$ fields to be strictly zero for all times t outside the measurement interval and assuming the angle-wavenumber transforms $\vec{\mathcal{E}}(\vec{\kappa},\sigma,z)$ and $\vec{\mathcal{B}}(\vec{\kappa},\sigma,z)$ to be negligible or zero for all wavenumbers $\sigma$ lying outside the intervals specified in (4.67b) and (4.67c).

63. Athanasios Papoulis, Signal Analysis, p. 188.
64. We can also create functions g(t) that are exactly band-limited and approximately time-chopped.
The same mathematical point, by the way, comes up when analyzing the relationship of beam-chopped and direction-chopped $\vec{E}$ and $\vec{B}$ radiation fields. For this reason we have been careful in the previous section to say that beam-chopped radiation fields are negligible or zero, instead of strictly zero, for positions outside the beam and that direction-chopped radiation fields have angle-wavenumber transforms that are negligible or zero, instead of strictly zero, for those propagation vectors removed from the beam. This allows the beam passing through the interferometer to be both direction-chopped and beam-chopped without getting into mathematical difficulties.
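The "exactly time-chopped, approximately band-limited" loophole can be seen in a closed-form example. For $g(t) = \cos(2\pi f_0 t)$ chopped to $|t| \le T$, the Fourier transform is $G(f) = T\,[\mathrm{sinc}(2T(f-f_0)) + \mathrm{sinc}(2T(f+f_0))]$ with $\mathrm{sinc}(x) = \sin(\pi x)/(\pi x)$. The test function and the numerical values below are chosen purely for illustration and do not come from the text.

```python
import math


def sinc(x):
    """Normalized sinc, sin(pi x) / (pi x), with sinc(0) = 1."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)


def chopped_cosine_transform(f, f0, T):
    """Fourier transform of cos(2 pi f0 t) restricted to |t| <= T."""
    return T * (sinc(2.0 * T * (f - f0)) + sinc(2.0 * T * (f + f0)))


T = 1.0     # half-length of the measurement interval (assumed value)
F0 = 10.0   # frequency of the chopped cosine (assumed value)

peak = chopped_cosine_transform(F0, F0, T)     # in-band value
far = chopped_cosine_transform(50.25, F0, T)   # far out of band
leak_ratio = abs(far / peak)
```

Far from $f_0$ the transform is under one percent of its peak, yet it is not zero, and no cutoff F exists beyond which it vanishes identically. Lengthening T pushes the out-of-band level down further without ever reaching zero.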
4.11 Top-Level Description of a Standard Michelson Interferometer
Although the operation of a standard Michelson interferometer was described in some detail in
Chapter 1, it does no harm to review the basic setup before beginning a more rigorous analysis of
how electromagnetic radiation passes through the instrument. Figure 4.16 is a top view of a
standard Michelson interferometer and Fig. 4.17 is a perspective drawing of the same
interferometer configuration. In Fig. 4.16 the radiation whose spectrum is to be measured enters
the system traveling along the z axis, with the beam splitter partially reflecting and partially
transmitting the incident beam. Rays entering the system split at the beam splitter into reflected
rays shown with dashed lines and transmitted rays shown with solid lines. The reflected rays
travel down the moving-mirror arm to the moving mirror that reflects them back to the beam
splitter. The transmitted rays travel out the fixed-mirror arm through the compensator plate to the
fixed mirror that reflects them back to the beam splitter. When both sets of rays return to the
beam splitter, they are again partially reflected and partially transmitted. The rays from the
moving-mirror arm (partially transmitted by the beam splitter) and the rays from the fixed-mirror
arm (partially reflected by the beam splitter) then travel up the z axis, combining to produce the
balanced radiation field recorded by the detector. Figure 4.16 shows neither the rays from the
moving-mirror arm that are partially reflected nor the rays from the fixed-mirror arm that are
partially transmitted, because these rays end up going back out the way they came in. Following
the convention introduced in Chapter 1, the field produced by the rays going to the interferometer's detector is called the balanced radiation field, and the field produced by the rays going back out the way they came in is called the unbalanced radiation field.
The beam splitter's partial transmissions and reflections typically occur in a thin layer of material shown as a dark line in Fig. 4.16. The thin layer lies on the right side of a transparent block called the beam-splitter substrate. Even though the substrate material is transparent, it does absorb a small fraction of the electromagnetic radiation traveling through it; and, as discussed in Appendix 4E, plane waves passing through the substrate material undergo a phase shift. The fraction absorbed usually depends on the wavenumber $\sigma$ and can also depend to some extent on the radiation's angle of incidence and whether it consists of s-type or p-type plane waves. The phase shift is strongly dependent on the angle of incidence and $\sigma$. It can also depend on whether s-type or p-type plane waves are passing through the substrate material. Appendix 4E introduces six complex parameters, one for each pairing of the superscripts (a), (b), (c) with the subscripts s and p, to describe the passage of radiation through the two optical elements (the beam-splitter substrate and the compensator plate) that are made from the beam-splitter substrate material.
When the moving mirror in Fig. 4.16 is farther from the beam splitter than the fixed mirror, the rays in the moving-mirror arm travel a longer distance down and back than the rays in the fixed-mirror arm. Just as in Eq. (1.15b) of Chapter 1, we call this extra distance the optical-path difference (OPD) and represent it by the variable $\chi$. There is, of course, a position of the moving mirror for which the OPD is zero, shown by the dash-dot line in Fig. 4.16. When the moving mirror is closer to the beam splitter than this dash-dot line, the OPD value $\chi$ is taken to be negative. Just as in Sec. 1.4 of Chapter 1, the position of the dash-dot line is called the zero-path difference (ZPD) position. When the OPD is $\chi$ for the interferometer setup shown in Fig. 4.16, the moving mirror is a distance $\chi/2$ from its ZPD position.
Section 1.7 of Chapter 1 shows that there are many different ways to build a Michelson interferometer, and for some setups the moving mirror is not $\chi/2$ from its ZPD position when the OPD is $\chi$ [see, for example, Fig. 1.19(d)]. The interferometer signal in the ideal case does not, however, depend directly on the interferometer setup but rather on the OPD value generated by the setup. For this reason, it makes sense to unfold the interferometer as shown in Fig. 4.18. Now we see only what is common to Michelson interferometers of all configurations: the distance traveled along one path through the interferometer differs by $\chi$ from the distance traveled along the other path through the interferometer.
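For the setup of Fig. 4.16, the relation between mirror position and OPD is just a factor of two, because the rays cover the extra mirror displacement twice (down and back). The sketch below encodes that relation together with the standard on-axis phase difference $2\pi\sigma\chi$ between the two arms; the phase formula is the usual ideal-interferometer result rather than something derived in this section, and all names and numbers are illustrative.

```python
import math


def opd_from_mirror_offset(offset_from_zpd):
    """OPD for the Fig. 4.16 layout: twice the mirror's distance from ZPD.

    A negative offset (mirror closer to the beam splitter than the ZPD
    position) gives a negative OPD.
    """
    return 2.0 * offset_from_zpd


def mirror_offset_from_opd(opd):
    """Mirror distance from the ZPD position that produces a given OPD."""
    return opd / 2.0


def on_axis_phase_difference(sigma, opd):
    """Phase difference (radians) between the arms for an on-axis wave."""
    return 2.0 * math.pi * sigma * opd


# Hypothetical numbers: mirror 0.05 cm past ZPD, 1000 cm^-1 radiation.
opd_cm = opd_from_mirror_offset(0.05)
phase = on_axis_phase_difference(1000.0, opd_cm)
cycles = phase / (2.0 * math.pi)
```

A 0.05 cm mirror displacement gives an OPD of 0.1 cm, which at 1000 cm^-1 corresponds to about 100 fringe cycles. Other layouts (such as those in Sec. 1.7) change the factor of two but not the role of the OPD.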
4.12 Monochromatic Plane Waves and Michelson Interferometers
Consider a single monochromatic plane wave characterized by the incident propagation vector $\hat{\kappa}^{[i]}$ shown in Fig. 4.19. Vector $\hat{\kappa}^{[i]}$ is drawn at a greatly exaggerated angle $\theta_b$ with respect to the $\hat{z}^{[i]}$ axis, which is here the same as the optical axis; almost all interferometers are designed to make angle $\theta_b$ small, requiring the propagation vector of any monochromatic plane wave reaching the detector to be nearly parallel to the optical axis. In Fig. 4.19, the part of the $\hat{\kappa}^{[i]}$ plane wave that first transmits through the beam splitter and then, coming back from the fixed mirror, reflects off the beam splitter, ends up going toward the detector with propagation vector $\hat{\kappa}$. The part of the $\hat{\kappa}^{[i]}$ plane wave that first reflects off the beam splitter and then, coming back from the moving mirror, transmits through the beam splitter, ends up going toward the detector with propagation vector $\hat{\kappa}_d$. Vectors $\hat{\kappa}$ and $\hat{\kappa}_d$ are not the same because we allow the moving mirror to be tilted slightly out of alignment, producing a very small angle $\theta_d$ between the two
propagation vectors. Angle $\theta_d$ between $\hat{\kappa}$ and $\hat{\kappa}_d$ is greatly exaggerated in Fig. 4.19; the $\hat{\kappa}_d$ unit vector is drawn much shorter than the $\hat{\kappa}$ unit vector, using perspective to show there is no reason to expect $\theta_d$ and $\theta_b$ to be co-planar angles.
Michelson interferometers are, of course, designed to keep $\theta_d$ small, and as a general rule they do not work well unless $\theta_d$ is much less than the typical angle $\theta_b$ between the plane wave's propagation vector and the optical axis,

$$\theta_d \ll \theta_b. \tag{4.68}$$

FIGURE 4.16. [Top view. Labels: input radiance; input axes $\hat{x}^{[i]}$, $\hat{y}^{[i]}$, $\hat{z}^{[i]}$; output axes x, y, z; beam splitter; compensator plate; fixed mirror; moving mirror; ZPD position; $\chi/2$.]

FIGURE 4.17. [Perspective view. Labels: input radiance; to detector; input axes $\hat{x}^{[i]}$, $\hat{y}^{[i]}$, $\hat{z}^{[i]}$; output axes x, y, z; beam splitter; compensator plate; fixed mirror; moving mirror.]

FIGURE 4.18. [Unfolded interferometer. Labels: first pass through the beam splitter; second pass through the beam splitter; fixed-mirror arm; moving-mirror arm; $\chi$ = extra distance (OPD) traveled in the moving-mirror arm.]

FIGURE 4.19. Angles $\theta_b$ and $\theta_d$ are drawn much larger than they actually are. Note that propagation vector $\hat{\kappa}$, propagation vector $\hat{\kappa}_d$, and the optical axis do not necessarily all lie in the same plane. [Labels: the input plane wave propagates in direction $\hat{\kappa}^{[i]}$ at an angle $\theta_b$ with respect to the optical axis; the slightly tilted moving mirror causes the reflected plane wave to propagate in the $\hat{\kappa}_d$ direction (dashed arrow) instead of the $\hat{\kappa}$ direction (solid arrow); fixed mirror; compensator plate; beam splitter.]

As is pointed out at the end of Appendix 4E, angle $\theta_d$ is so small that we expect neither the amplitude nor the phase shifts of monochromatic plane waves propagating through the beam-splitter substrate to be affected by it.
We note that when $\theta_b = \theta_d = 0$, the plane of incidence[65] is the same for all reflections and transmissions through the beam splitter in Figs. 4.16, 4.17, and 4.19. Both the $\hat{x}^{[i]}$ and $\hat{x}$ unit vectors are normal to this plane of incidence; indeed, they are the same unit vector. If we unfold the interferometer as shown in Fig. 4.18, the

$$(\hat{x}^{[i]}, \hat{y}^{[i]}, \hat{z}^{[i]}) \quad\text{and}\quad (\hat{x}, \hat{y}, \hat{z})$$

coordinate systems are brought into alignment, with $(\hat{y}^{[i]}, \hat{y})$ and $(\hat{z}^{[i]}, \hat{z})$ also becoming the same unit vectors. Now the only difference between the two coordinate systems is the location of their origins, with the $(\hat{x}^{[i]}, \hat{y}^{[i]}, \hat{z}^{[i]})$ system having its origin on the optical axis of the input beam approaching the beam splitter and the $(\hat{x}, \hat{y}, \hat{z})$ system having its origin on the optical axis of the output beam traveling from the beam splitter to the detector. This means the two coordinate systems are essentially equivalent, allowing us to discard one and keep the other. For the rest of this chapter, we work with the unfolded interferometer and use only the $(\hat{x}, \hat{y}, \hat{z})$ coordinate system to represent the plane waves in the input beam, the fixed-mirror arm, the moving-mirror arm, and the output beam traveling from the beam splitter to the detector.
When $\theta_b$ is not zero, the tunnel-diagram analysis performed in Figs. 4E.4(a) and 4E.4(b) of Appendix 4E shows that vector $\hat{\kappa}^{[i]}$ must have the same angles with respect to $(\hat{x}^{[i]}, \hat{y}^{[i]}, \hat{z}^{[i]})$ that vector $\hat{\kappa}$ has with respect to $(\hat{x}, \hat{y}, \hat{z})$; in particular, angle $\theta_b$ is the same in both the input and output coordinate systems. Vector $\hat{\kappa}_d$ and its associated angle $\theta_d$, on the other hand, are defined in the output $(\hat{x}, \hat{y}, \hat{z})$ coordinate system after reflection off the slightly misaligned moving mirror but not, of course, in the input $(\hat{x}^{[i]}, \hat{y}^{[i]}, \hat{z}^{[i]})$ coordinate system.

65. The plane of incidence of a reflected or transmitted monochromatic plane wave is defined in Sec. 4.5 above.
From the work done in Secs. 4.3 and 4.4, we know that the input plane wave can be written using the real part of

$$\vec{E}_0\,e^{2\pi i\sigma(\hat{\kappa}\cdot\vec{r}-ct)} \tag{4.69a}$$

to represent the wave's E field and the real part of

$$\frac{1}{c}\left(\hat{\kappa}\times\vec{E}_0\right)e^{2\pi i\sigma(\hat{\kappa}\cdot\vec{r}-ct)} \tag{4.69b}$$

to represent the wave's B field when angle $\theta_b$ is small. These formulas come from dropping the j, $\ell$ subscripts from Eqs. (4.16a) and (4.16b) and using (4.16c) to substitute for the B vector. In (4.69a) and (4.69b), parameter $\vec{E}_0$ is a constant complex vector; and the convention of the unfolded interferometer is used to replace the propagation vector $\hat{\kappa}^{[i]}$ by $\hat{\kappa}$ when describing the input plane wave. The wavenumber $\sigma$ is taken to be positive. According to Eq. (4.16c), the complex $\vec{E}_0$ vector satisfies

$$\hat{\kappa}\cdot\vec{E}_0 = 0. \tag{4.69c}$$

The work done in Sec. 4.3 shows that this means the plane wave's real E field is always perpendicular to the direction of propagation $\hat{\kappa}$.
Since $\theta_b$ is small, we know that plane waves entering the interferometer must be propagating parallel to, or nearly parallel to, the z axis. When, as in Fig. 4.19, $\hat{\kappa}$ is tilted at a nonzero small angle $\theta_b$ to the z axis, it follows that the real E field must have a small component along the z axis. According to Fig. 4.20, the real E-field component along the z axis must be on the order of $\sin\theta_b$. Since $\theta_b$ is a small angle, we have

$$O(\sin\theta_b) = O(\theta_b). \tag{4.70}$$

Writing the complex constant vector $\vec{E}_0$ in terms of its complex (x, y, z) components,

$$\vec{E}_0 = \hat{x}E_{0x} + \hat{y}E_{0y} + \hat{z}E_{0z}, \tag{4.71a}$$
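we note that the real E field of the monochromatic plane wave must be

$$\begin{aligned}
\mathrm{Re}\!\left[\vec{E}_0\,e^{2\pi i\sigma(\hat{\kappa}\cdot\vec{r}-ct)}\right]
={}& \hat{x}\,\mathrm{Re}\!\left[E_{0x}\,e^{2\pi i\sigma(\hat{\kappa}\cdot\vec{r}-ct)}\right] \\
&+ \hat{y}\,\mathrm{Re}\!\left[E_{0y}\,e^{2\pi i\sigma(\hat{\kappa}\cdot\vec{r}-ct)}\right] \\
&+ \hat{z}\,\mathrm{Re}\!\left[E_{0z}\,e^{2\pi i\sigma(\hat{\kappa}\cdot\vec{r}-ct)}\right].
\end{aligned} \tag{4.71b}$$

FIGURE 4.20. [Labels: vector $\vec{E}$; unit vector $\hat{\kappa}$; unit vector $\hat{z}$; angle $\theta_b$.]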

Looking at the special point in space $\vec{r} = 0$ at time $t = 0$, we see that, according to Fig. 4.20,

$$\mathrm{Re}[E_{0z}] = O(\theta_b). \tag{4.71c}$$

The imaginary part of $E_{0z}\,e^{2\pi i\sigma(\hat{\kappa}\cdot\vec{r}-ct)}$ has no physical relevance, so it can also be specified as $O(\theta_b)$ at point $\vec{r} = 0$ when $t = 0$. This means the formula for $\vec{E}_0$ can be written as

$$\vec{E}_0 = \hat{x}E_{0x} + \hat{y}E_{0y} + \hat{z}\,[O(\theta_b) + iO(\theta_b)]. \tag{4.71d}$$

We now introduce the symbol

$$\underline{O}(\theta_b) = O(\theta_b) + iO(\theta_b) \tag{4.71e}$$

as a notational convenience to describe a complex scalar whose real and imaginary parts are both $O(\theta_b)$. Then Eq. (4.71d) can be written as

$$\vec{E}_0 = \hat{x}E_{0x} + \hat{y}E_{0y} + \hat{z}\,\underline{O}(\theta_b). \tag{4.71f}$$

The $\underline{O}(\theta_b)$ symbol, like the $O(\theta_b)$ symbol, is an algebraic black hole absorbing other finite algebraic quantities. Some of the formal rules for manipulating $\underline{O}(\theta_b)$ are that

$$a\,\underline{O}(\theta_b) + b\,\underline{O}(\theta_b) = \underline{O}(\theta_b) \tag{4.72a}$$

for any two finite complex scalars a and b; that

$$\underline{O}(\theta_b)\,e^{i\phi} = \underline{O}(\theta_b) \tag{4.72b}$$

for any real parameter $\phi$; and, of course, that

$$\underline{O}(\theta_b)^{*} = \underline{O}(\theta_b) . \tag{4.72c}$$
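From Eqs. (4.54a), (4.54c), and (4.54d) we have

$$\hat{\Omega} = \vec{\kappa} + \hat{z}\sqrt{1-\kappa^2} = \hat{x}\kappa_x + \hat{y}\kappa_y + \hat{z}\sqrt{1-\kappa_x^2-\kappa_y^2} .$$

Clearly, both $\kappa_x$ and $\kappa_y$ are $O(\sin\theta_b) = O(\theta_b)$ when $\hat{\Omega}$ is nearly parallel to the optical axis (see Fig. 4.20), so

$$\hat{\Omega} = \hat{z} + \hat{x}\,O(\theta_b) + \hat{y}\,O(\theta_b) , \tag{4.73a}$$

where

$$\hat{z}\sqrt{1-\kappa_x^2-\kappa_y^2} \cong \hat{z}\left(1 - \tfrac{1}{2}\kappa_x^2 - \tfrac{1}{2}\kappa_y^2\right) \cong \hat{z} ,$$

neglecting terms of $O(\theta_b^2)$. From Eqs. (4.71f) and (4.73a) we have, again neglecting terms of $O(\theta_b^2)$, that

$$\hat{\Omega}\times\vec{E}_0 = [\hat{z} + \hat{x}\,O(\theta_b) + \hat{y}\,O(\theta_b)]\times[\hat{x}E_{0x} + \hat{y}E_{0y} + \hat{z}\,\underline{O}(\theta_b)] = \hat{y}E_{0x} - \hat{x}E_{0y} + \hat{z}\,\underline{O}(\theta_b) . \tag{4.73b}$$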
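The small-angle cross product in Eq. (4.73b) can be checked numerically. The sketch below is only an illustration under stated assumptions: the helper names (`cross`, `tilted_wave`, `residual`), the azimuth, and the sample field amplitudes are hypothetical, and the tilt angle is called `theta_b` here. It builds a unit propagation vector tilted from the z axis, picks a complex amplitude perpendicular to it as Eq. (4.69c) requires, and compares the exact cross product with the leading-order form y E0x - x E0y.

```python
import math

def cross(a, b):
    """Cross product of two 3-component tuples (entries may be complex)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def tilted_wave(theta_b, azimuth, e0x, e0y):
    """Unit propagation vector tilted by theta_b from z, plus a transverse E0."""
    kx = math.sin(theta_b) * math.cos(azimuth)
    ky = math.sin(theta_b) * math.sin(azimuth)
    kz = math.cos(theta_b)
    # Choose E0z so that omega . E0 = 0, the transversality condition.
    e0z = -(kx * e0x + ky * e0y) / kz
    return (kx, ky, kz), (e0x, e0y, e0z)

def residual(theta_b, azimuth=0.7, e0x=1.0 + 0.5j, e0y=-0.3 + 2.0j):
    """Largest component of (omega x E0) - (y*E0x - x*E0y)."""
    omega, e0 = tilted_wave(theta_b, azimuth, e0x, e0y)
    exact = cross(omega, e0)
    leading = (-e0y, e0x, 0.0)
    return max(abs(u - v) for u, v in zip(exact, leading))

for theta_b in (1e-2, 1e-3, 1e-4):
    # The ratio stays roughly constant, so the residual is first order in theta_b.
    print(theta_b, residual(theta_b) / theta_b)
```

The residual shrinks in proportion to the tilt angle, which is what the order-of-magnitude symbol in (4.73b) asserts; the sketch does not say anything about the size of the hidden constant for a real instrument.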

We next introduce the symbol $\underline{\vec{O}}(\theta_b)$ to represent a small complex vector, each of whose $(x, y, z)$ components is $\underline{O}(\theta_b)$. The symbol $\underline{\vec{O}}(\theta_b)$ is another algebraic black hole. We note that

$$a\,\underline{\vec{O}}(\theta_b) + b\,\underline{\vec{O}}(\theta_b) = \underline{\vec{O}}(\theta_b) \tag{4.74a}$$

for any two finite complex scalars a and b, that

$$\underline{\vec{O}}(\theta_b)\,e^{i\alpha} = \underline{\vec{O}}(\theta_b) \tag{4.74b}$$

for any real parameter $\alpha$, and that

$$\vec{c}\cdot\underline{\vec{O}}(\theta_b) = \underline{O}(\theta_b) \tag{4.74c}$$

and

$$\vec{c}\times\underline{\vec{O}}(\theta_b) = \underline{\vec{O}}(\theta_b) \tag{4.74d}$$

for the vector dot and cross products with any finite complex vector $\vec{c}$. The underscore in the symbol $\underline{\vec{O}}(\theta_b)$ can be dropped to give $\vec{O}(\theta_b)$, with this new symbol indicating a strictly real vector, each of whose real $(x, y, z)$ components is $O(\theta_b)$. Then for any two real scalars a and b, we have

$$a\,\vec{O}(\theta_b) + b\,\vec{O}(\theta_b) = \vec{O}(\theta_b) ; \tag{4.75a}$$

for any real parameter $\alpha$, we have

$$\vec{O}(\theta_b)\,e^{i\alpha} = \underline{\vec{O}}(\theta_b) ; \tag{4.75b}$$

and we have

$$\vec{c}\cdot\vec{O}(\theta_b) = \underline{O}(\theta_b) \tag{4.75c}$$

and

$$\vec{c}\times\vec{O}(\theta_b) = \underline{\vec{O}}(\theta_b) \tag{4.75d}$$

for the vector dot product and vector cross product with any finite complex vector $\vec{c}$. If $\vec{c}$ is a finite real vector, we can, of course, drop the underscore on the right-hand sides of (4.75c) and (4.75d) to show that the resulting small quantities must also be strictly real. The $\vec{O}(\theta_b)$ symbol can be used to write $\hat{\Omega}$ in (4.73a) as

$$\hat{\Omega} = \hat{z} + \vec{O}(\theta_b) \tag{4.76a}$$

and, of course, the $\underline{\vec{O}}(\theta_b)$ symbol can be used to write the complex vectors in (4.71f) and (4.73b) as

$$\vec{E}_0 = \hat{x}E_{0x} + \hat{y}E_{0y} + \underline{\vec{O}}(\theta_b) \tag{4.76b}$$

and

$$\hat{\Omega}\times\vec{E}_0 = \hat{y}E_{0x} - \hat{x}E_{0y} + \underline{\vec{O}}(\theta_b) . \tag{4.76c}$$

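Substituting Eqs. (4.76b) and (4.76c) into the expressions for the complex E and B fields in (4.69a) and (4.69b) gives, when angle $\theta_b$ is small,

$$\text{Complex } \vec{E} \text{ field} = (\hat{x}E_{0x} + \hat{y}E_{0y})\,e^{2\pi i\sigma(\hat{\Omega}\cdot\vec{r}-ct)} + \underline{\vec{O}}(\theta_b) \tag{4.77a}$$

and

$$\text{Complex } \vec{B} \text{ field} = \frac{1}{c}(\hat{y}E_{0x} - \hat{x}E_{0y})\,e^{2\pi i\sigma(\hat{\Omega}\cdot\vec{r}-ct)} + \underline{\vec{O}}(\theta_b) , \tag{4.77b}$$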

where (4.74b) is used to simplify the final results and $\hat{\Omega}$ is given by Eq. (4.76a). If the $\underline{\vec{O}}(\theta_b)$ terms in (4.77a) and (4.77b) and the $\vec{O}(\theta_b)$ terms in (4.76a) are all exactly zero, then the plane wave's propagation vector is strictly parallel to the z optical axis; when they are not, the plane wave is propagating in a slightly off-axis direction. Looking at how the interferometer is unfolded going from Fig. 4.17 to Fig. 4.18, we see that if all the $\underline{\vec{O}}(\theta_b)$ and $\vec{O}(\theta_b)$ terms are exactly zero, then the x component of E is strictly perpendicular to the plane of incidence on the beam splitter and the y component of E is strictly parallel to the plane of incidence on the beam splitter. For now, we assume that all the $\underline{\vec{O}}(\theta_b)$ and $\vec{O}(\theta_b)$ terms are exactly zero and analyze just plane waves that propagate parallel to the optical axis, that is, just the on-axis plane waves. From the work done in Secs. 4.5 and 4.6, we can then predict that the on-axis monochromatic plane wavefield transmitted through the beam splitter is

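$$\text{Complex } \vec{E} \text{ field} = \big[\hat{x}E_{0x}\,t_s\gamma_s^{(a)} + \hat{y}E_{0y}\,t_p\gamma_p^{(a)}\big]\,e^{2\pi i\sigma(\hat{\Omega}\cdot\vec{r}-ct)} \tag{4.77c}$$

and

$$\text{Complex } \vec{B} \text{ field} = \frac{1}{c}\big[\hat{y}E_{0x}\,t_s\gamma_s^{(a)} - \hat{x}E_{0y}\,t_p\gamma_p^{(a)}\big]\,e^{2\pi i\sigma(\hat{\Omega}\cdot\vec{r}-ct)} , \tag{4.77d}$$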

where $\gamma_s^{(a)}$ is the complex parameter introduced in Appendix 4E that describes the passage of s-type monochromatic plane waves on their first pass through the beam-splitter substrate, and $\gamma_p^{(a)}$ is the complex parameter from Appendix 4E describing the passage of p-type monochromatic plane waves on their first pass through the beam-splitter substrate. Both $\gamma_s^{(a)}$ and $\gamma_p^{(a)}$ are functions of $\sigma$ and the plane wave's angle of incidence on the substrate. The plane wave reflected off the beam splitter after passing into and out of the substrate is

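$$\text{Complex } \vec{E} \text{ field} = \big[\hat{x}E_{0x}\,r_s\gamma_s^{(ab)} + \hat{y}E_{0y}\,r_p\gamma_p^{(ab)}\big]\,e^{2\pi i\sigma(\hat{\Omega}\cdot\vec{r}-ct)} \tag{4.77e}$$

and

$$\text{Complex } \vec{B} \text{ field} = \frac{1}{c}\big[\hat{y}E_{0x}\,r_s\gamma_s^{(ab)} - \hat{x}E_{0y}\,r_p\gamma_p^{(ab)}\big]\,e^{2\pi i\sigma(\hat{\Omega}\cdot\vec{r}-ct)} . \tag{4.77f}$$

Here, we define

$$\gamma_s^{(ab)} = \gamma_s^{(a)}\gamma_s^{(b)} \tag{4.77g}$$

and

$$\gamma_p^{(ab)} = \gamma_p^{(a)}\gamma_p^{(b)} , \tag{4.77h}$$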

where $\gamma_s^{(b)}$ is the complex parameter introduced in Appendix 4E that describes the second pass of s-type monochromatic plane waves through the beam-splitter substrate and $\gamma_p^{(b)}$ is the complex parameter from Appendix 4E that describes the second pass of p-type monochromatic plane waves through the substrate. Like $\gamma_s^{(a)}$ and $\gamma_p^{(a)}$, the $\gamma_{s,p}^{(b)}$ complex parameters are functions of $\sigma$ and the plane wave's angle of incidence. The complex parameters $r_s$, $r_p$, $t_s$, $t_p$ describe what happens in the thin beam-splitter layer in Fig. 4.16, where the partial transmission and partial reflection of the radiation fields occur. Parameters $r_s$ and $t_s$ are the s-wave amplitude-reflection and amplitude-transmission coefficients, and parameters $r_p$ and $t_p$ are the p-wave amplitude-reflection and amplitude-transmission coefficients. Recognizing that the amount of reflection and transmission can depend on both wavenumber and angle of incidence, we realize that these coefficients must also be functions of $\sigma$ and the angle of incidence on the substrate. For the on-axis plane waves characterized by Eqs. (4.77c)–(4.77f), the angle of incidence on the beam-splitter substrate must be the same as the angle of incidence made by the optical axis on the beam splitter. The unfolded model of the interferometer in Fig. 4.18 lets us use the same symbol $\hat{y}$ for both the original and reflected y unit vectors and also allows us to represent both the transmitted and reflected propagation vectors by the same symbol $\hat{\Omega}$.
Now we consider what happens to the slightly off-axis plane waves where $\theta_b$ is no longer exactly zero, which means that $\hat{\Omega}$ is at a slight angle to the optical axis. In this situation Fig. 4.21 shows that the angle of incidence on the beam splitter changes by an $O(\theta_b)$ amount from the optical axis's angle of incidence $\varphi$. We want to show that the transmitted and reflected wavefields can now be written as

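$$\text{Complex } \vec{E} \text{ field} = \big[\hat{x}E_{0x}\,t_s\gamma_s^{(a)} + \hat{y}E_{0y}\,t_p\gamma_p^{(a)}\big]\,e^{2\pi i\sigma(\hat{\Omega}\cdot\vec{r}-ct)} + \underline{\vec{O}}(\theta_b) \tag{4.78a}$$

and

$$\text{Complex } \vec{B} \text{ field} = \frac{1}{c}\big[\hat{y}E_{0x}\,t_s\gamma_s^{(a)} - \hat{x}E_{0y}\,t_p\gamma_p^{(a)}\big]\,e^{2\pi i\sigma(\hat{\Omega}\cdot\vec{r}-ct)} + \underline{\vec{O}}(\theta_b) \tag{4.78b}$$

for the wavefield transmitted through the beam splitter and as

$$\text{Complex } \vec{E} \text{ field} = \big[\hat{x}E_{0x}\,r_s\gamma_s^{(ab)} + \hat{y}E_{0y}\,r_p\gamma_p^{(ab)}\big]\,e^{2\pi i\sigma(\hat{\Omega}\cdot\vec{r}-ct)} + \underline{\vec{O}}(\theta_b) \tag{4.79a}$$

and

$$\text{Complex } \vec{B} \text{ field} = \frac{1}{c}\big[\hat{y}E_{0x}\,r_s\gamma_s^{(ab)} - \hat{x}E_{0y}\,r_p\gamma_p^{(ab)}\big]\,e^{2\pi i\sigma(\hat{\Omega}\cdot\vec{r}-ct)} + \underline{\vec{O}}(\theta_b) \tag{4.79b}$$

for the wavefield reflected from the beam splitter.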
Figure 4.22 shows that when $\theta_b$ is not exactly zero, the $\hat{\Omega}$ vector defines a new, slightly tilted plane of incidence of the plane wave on the beam splitter. We choose $\hat{s}$ to be the unit vector perpendicular to this new plane of incidence and note that

$$\hat{s} = \hat{x} + \vec{O}(\theta_b) . \tag{4.80a}$$
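FIGURE 4.21. [Diagram showing the unit vector $\hat{\Omega}$, the unit vector $\hat{z}$, and the beam-splitter surface-normal vector $\hat{n}$, with the angle $\theta_b$ and the angle $\varphi$ marked. The size of the angle between $\hat{\Omega}$ and $\hat{n}$ is $\varphi + O(\theta_b)$. Vectors $\hat{n}$, $\hat{\Omega}$, and $\hat{z}$ do not necessarily all lie in the same plane.]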
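The unit vector perpendicular to both $\hat{\Omega}$ and $\hat{s}$ is given by [see Fig. 4.22 and Eq. (4.76a)]

$$\hat{p} = \hat{\Omega}\times\hat{s} = [\hat{z} + \vec{O}(\theta_b)]\times[\hat{x} + \vec{O}(\theta_b)] .$$

This becomes, gathering together the $\vec{O}(\theta_b)$ terms and neglecting the $[\vec{O}(\theta_b)]^2$ terms,

$$\hat{p} = \hat{y} + \vec{O}(\theta_b) . \tag{4.80b}$$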
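The way the tilted $(\hat{s}, \hat{p}, \hat{\Omega})$ frame stays close to $(\hat{x}, \hat{y}, \hat{z})$ can be illustrated with a short numerical sketch. Several things here are assumptions made only for the illustration: the beam-splitter normal is placed in the y-z plane at a 45-degree angle of incidence, the sign of the s vector is chosen so that it reduces to the x unit vector on axis, and the function names (`tilted_frame`, `deviation`) are hypothetical.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def unit(a):
    n = norm(a)
    return tuple(u / n for u in a)

def tilted_frame(theta_b, azimuth=0.7, incidence=math.pi / 4):
    """Return (s, p, omega) for a propagation vector tilted by theta_b from z."""
    omega = (math.sin(theta_b) * math.cos(azimuth),
             math.sin(theta_b) * math.sin(azimuth),
             math.cos(theta_b))
    # Assumed geometry: surface normal in the y-z plane, so x is
    # perpendicular to the on-axis plane of incidence.
    n_hat = (0.0, math.sin(incidence), math.cos(incidence))
    s_hat = unit(cross(n_hat, omega))   # perpendicular to the plane of n and omega
    p_hat = cross(omega, s_hat)         # p = omega x s
    return s_hat, p_hat, omega

def deviation(theta_b):
    """Larger of |s - x| and |p - y|."""
    s_hat, p_hat, _ = tilted_frame(theta_b)
    ds = norm(tuple(u - v for u, v in zip(s_hat, (1.0, 0.0, 0.0))))
    dp = norm(tuple(u - v for u, v in zip(p_hat, (0.0, 1.0, 0.0))))
    return max(ds, dp)

for theta_b in (1e-2, 1e-3, 1e-4):
    print(theta_b, deviation(theta_b) / theta_b)
```

The printed ratios stay of order one, so both frame vectors differ from their on-axis counterparts by first-order amounts in the tilt angle, in line with (4.80a) and (4.80b).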

We take components along $\hat{s}$ and $\hat{p}$ of the complex vector $\vec{E}_0$ used to describe the incident plane wave in Eqs. (4.69a) and (4.69b) to get [since $\hat{s}$, $\hat{p}$, and $\hat{\Omega}$ are mutually perpendicular unit vectors and the complex $\vec{E}_0$ vector is, according to (4.69c), strictly perpendicular to $\hat{\Omega}$]

$$\vec{E}_0 = \hat{s}E_{0s} + \hat{p}E_{0p} . \tag{4.80c}$$

Here, $E_{0s}$ and $E_{0p}$ are two complex scalars representing the components of $\vec{E}_0$ along $\hat{s}$ and $\hat{p}$. Substitution of (4.80a) and (4.80b) into (4.80c) gives

$$\vec{E}_0 = \hat{x}E_{0s} + \hat{y}E_{0p} + \underline{\vec{O}}(\theta_b) .$$

Comparing this expression to Eq. (4.76b) shows that

$$E_{0s} = E_{0x} + \underline{O}(\theta_b) \tag{4.80d}$$

and

$$E_{0p} = E_{0y} + \underline{O}(\theta_b) \tag{4.80e}$$

if the two formulas for $\vec{E}_0$ are to be consistent.
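Using the relationships $\hat{\Omega}\times\hat{s} = \hat{p}$ and $\hat{\Omega}\times\hat{p} = -\hat{s}$ from Fig. 4.22, we substitute (4.80c) into (4.69a) and (4.69b) to write the incident wave as

$$\text{Complex } \vec{E} \text{ field} = (\hat{s}E_{0s} + \hat{p}E_{0p})\,e^{2\pi i\sigma(\hat{\Omega}\cdot\vec{r}-ct)}$$

and

$$\text{Complex } \vec{B} \text{ field} = \frac{1}{c}(\hat{p}E_{0s} - \hat{s}E_{0p})\,e^{2\pi i\sigma(\hat{\Omega}\cdot\vec{r}-ct)} .$$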
FIGURE 4.22. [Diagram showing the unit vectors $\hat{x}$, $\hat{y}$, $\hat{z}$, $\hat{s}$, $\hat{\Omega}$, and $\hat{p}$ and the beam-splitter surface-normal vector $\hat{n}$, with the angle $\theta_b$ marked and two further angles each labeled as $O(\theta_b)$. The new, slightly tilted plane of incidence contains the $\hat{\Omega}$ and $\hat{n}$ vectors.]
In effect, the original $(\hat{x}, \hat{y}, \hat{z})$ coordinate system is replaced by the slightly tilted $(\hat{s}, \hat{p}, \hat{\Omega})$ coordinate system with $E_{0s}$ and $E_{0p}$ playing the role of $E_{0x}$ and $E_{0y}$. Thus it has now been shown that we can make the $\underline{\vec{O}}(\theta_b)$ terms in (4.77a) and (4.77b) equal to zero by replacing [$(\hat{x}, \hat{y})$, $E_{0x}$, $E_{0y}$] with [$(\hat{s}, \hat{p})$, $E_{0s}$, $E_{0p}$] respectively. Previously $\hat{x}$ and $\hat{y}$ represented unit vectors perpendicular and parallel to the plane of incidence, and now $\hat{s}$ and $\hat{p}$ represent unit vectors perpendicular and parallel to the plane of incidence. Following the pattern established in going from Eqs. (4.77a) and (4.77b) to Eqs. (4.77c)–(4.77f), we see that the wave transmitted through the beam splitter must be

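$$\text{Complex } \vec{E} \text{ field} = \big[\hat{s}E_{0s}\,t_s\gamma_s^{(a)} + \hat{p}E_{0p}\,t_p\gamma_p^{(a)}\big]\,e^{2\pi i\sigma(\hat{\Omega}\cdot\vec{r}-ct)} \tag{4.80f}$$

and

$$\text{Complex } \vec{B} \text{ field} = \frac{1}{c}\big[\hat{p}E_{0s}\,t_s\gamma_s^{(a)} - \hat{s}E_{0p}\,t_p\gamma_p^{(a)}\big]\,e^{2\pi i\sigma(\hat{\Omega}\cdot\vec{r}-ct)} , \tag{4.80g}$$

and the wave reflected off the beam splitter after passing into and out of the substrate must be

$$\text{Complex } \vec{E} \text{ field} = \big[\hat{s}E_{0s}\,r_s\gamma_s^{(ab)} + \hat{p}E_{0p}\,r_p\gamma_p^{(ab)}\big]\,e^{2\pi i\sigma(\hat{\Omega}\cdot\vec{r}-ct)} \tag{4.80h}$$

and

$$\text{Complex } \vec{B} \text{ field} = \frac{1}{c}\big[\hat{p}E_{0s}\,r_s\gamma_s^{(ab)} - \hat{s}E_{0p}\,r_p\gamma_p^{(ab)}\big]\,e^{2\pi i\sigma(\hat{\Omega}\cdot\vec{r}-ct)} . \tag{4.80i}$$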

The $\gamma_{s,p}^{(a)}$, $\gamma_{s,p}^{(b)}$, $\gamma_{s,p}^{(ab)}$ parameters are the same functions of $\sigma$ and the angle of incidence as in Eqs. (4.77c)–(4.77f); and the $r_{s,p}$ and $t_{s,p}$ parameters are also the same functions as they were in (4.77c)–(4.77f). We note that even if the wavenumber $\sigma$ has the same value as in Eqs. (4.77c)–(4.77f), the work done in Appendix 4E shows that the values of $\gamma_{s,p}^{(a)}$, $\gamma_{s,p}^{(b)}$, and $\gamma_{s,p}^{(ab)}$ are different because these complex-valued functions are very sensitive to the slight changes in the angle of incidence produced by nonzero values of $\theta_b$. The values of $r_{s,p}$ and $t_{s,p}$ do not, however, usually depend as sensitively on the angle of incidence. As long as $\theta_b$ is small, we can treat $r_{s,p}$ and $t_{s,p}$ as complex functions that depend only on the wavenumber $\sigma$.
Substituting Eqs. (4.80a), (4.80b), (4.80d), and (4.80e) into Eqs. (4.80f)–(4.80i) and gathering together the $\underline{\vec{O}}(\theta_b)$ terms while neglecting the $O(\theta_b^2)$ terms gives us, as expected, Eqs. (4.78a), (4.78b), (4.79a), and (4.79b) for the beam splitter's transmitted and reflected waves. This establishes that (4.78a), (4.78b), (4.79a), and (4.79b) can be used to represent monochromatic plane waves propagating through the interferometer in a slightly off-axis direction. From now on, we use (4.78a), (4.78b), (4.79a), and (4.79b) to represent both the on-axis and off-axis monochromatic plane waves with the understanding, of course, that both $\theta_b$ and all the order-$\theta_b$ terms are strictly zero for on-axis propagation.
The plane wave transmitted through the beam splitter into the fixed-mirror arm of the interferometer reflects off the fixed mirror and returns to the beam splitter. There is no way to distinguish between s-wave and p-wave reflections when $\hat{\Omega}$ is exactly parallel to the z axis, so we use the single amplitude-reflection coefficient $r_{FM}$ to describe normal reflection off the fixed mirror. When $\hat{\Omega}$ is not exactly parallel to the z axis, which means the reflection off the fixed mirror is only nearly normal and not strictly normal, we can distinguish between s-wave and p-wave reflections; but there is no real point to it because both the s-wave and p-wave amplitude-reflection coefficients are approximately equal to $r_{FM}$. When $\hat{\Omega}$ is allowed to be approximately parallel to $\hat{z}$, the radiation fields of the plane wave after reflection off the fixed mirror are

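$$\text{Complex } \vec{E} \text{ field} = r_{FM}\big[\hat{x}E_{0x}\,t_s\gamma_s^{(abc)} + \hat{y}E_{0y}\,t_p\gamma_p^{(abc)}\big]\,e^{2\pi i\sigma(\hat{\Omega}\cdot\vec{r}-ct)} + \underline{\vec{O}}(\theta_b) \tag{4.81a}$$

and

$$\text{Complex } \vec{B} \text{ field} = \frac{r_{FM}}{c}\big[\hat{y}E_{0x}\,t_s\gamma_s^{(abc)} - \hat{x}E_{0y}\,t_p\gamma_p^{(abc)}\big]\,e^{2\pi i\sigma(\hat{\Omega}\cdot\vec{r}-ct)} + \underline{\vec{O}}(\theta_b) . \tag{4.81b}$$

Here,

$$\gamma_s^{(abc)} = \gamma_s^{(ab)}\gamma_s^{(c)} = \gamma_s^{(a)}\gamma_s^{(b)}\gamma_s^{(c)} \tag{4.81c}$$

and

$$\gamma_p^{(abc)} = \gamma_p^{(ab)}\gamma_p^{(c)} = \gamma_p^{(a)}\gamma_p^{(b)}\gamma_p^{(c)} , \tag{4.81d}$$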

where $\gamma_{s,p}^{(c)}$ are the complex parameters introduced in Appendix 4E to describe the third pass through the beam-splitter substrate and the second pass through the compensator plate of the s-type and p-type waves respectively. We note that $\gamma_{s,p}^{(b)}$ can, according to Eq. (4E.7b) in Appendix 4E, describe the first passage of a plane wave through the compensator plate as well as the second passage through the beam-splitter substrate. Like $\gamma_{s,p}^{(a)}$ and $\gamma_{s,p}^{(b)}$, the $\gamma_{s,p}^{(c)}$ are functions of wavenumber and the angle of incidence. In Eqs. (4.81a) and (4.81b), the factors of $\gamma_{s,p}^{(abc)}$ show that the plane wave passes once through the beam-splitter substrate and twice through the compensator plate, and the $\underline{\vec{O}}(\theta_b)$ symbol again represents complex vector components that are too small to be worth keeping track of explicitly. Just like before, these equations reduce to the case where $\hat{\Omega}$ is exactly parallel to $\hat{z}$ when all the $\underline{\vec{O}}(\theta_b)$ terms are taken to be exactly equal to zero.
The plane wave reflected off the beam splitter and into the interferometer's moving-mirror arm reflects off the moving mirror and returns to the beam splitter. Because it reflects normally or near normally, we can write, following the pattern of Eqs. (4.81a) and (4.81b) and assuming the moving mirror is at its ZPD position,

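$$\text{Complex } \vec{E} \text{ field} = r_{MM}\big[\hat{x}E_{0x}\,r_s\gamma_s^{(abc)} + \hat{y}E_{0y}\,r_p\gamma_p^{(abc)}\big]\,e^{2\pi i\sigma(\hat{\Omega}_d\cdot\vec{r}-ct)} + \underline{\vec{O}}(\theta_b) \tag{4.82a}$$

and

$$\text{Complex } \vec{B} \text{ field} = \frac{r_{MM}}{c}\big[\hat{y}E_{0x}\,r_s\gamma_s^{(abc)} - \hat{x}E_{0y}\,r_p\gamma_p^{(abc)}\big]\,e^{2\pi i\sigma(\hat{\Omega}_d\cdot\vec{r}-ct)} + \underline{\vec{O}}(\theta_b) , \tag{4.82b}$$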

where $r_{MM}$ is the complex amplitude-reflection coefficient for plane waves normally incident on the moving mirror, and $\hat{\Omega}$ is replaced by $\hat{\Omega}_d$ because of the slightly tilted moving mirror. The factors of $\gamma_{s,p}^{(abc)}$ now represent three passages through the beam-splitter substrate. At the end of Appendix 4E, there is a discussion about why it makes sense to neglect the very slight change in the angle of incidence due to the tilted moving mirror. Most interferometers use identical reflective surfaces for the fixed and moving mirrors, so from now on we assume that

$$r_{FM} = r_{MM} = r_M \tag{4.83}$$

with the lack of an s or p subscript on $r_M$ reminding us that it represents the amplitude-reflection coefficient of the fixed and moving mirrors, which have identical s-wave and p-wave amplitude-reflection coefficients, instead of the beam splitter, which does not.
We have to be careful when reflecting the plane wave coming from the fixed-mirror arm off the beam splitter because the reflection takes place outside rather than inside the substrate (see Figs. 4.16, 4.17, and 4.19). In this type of interferometer, the beam splitter is usually designed so that the s-wave and p-wave amplitude-reflection coefficients are $(-1)$ times the amplitude-reflection coefficients $r_s$ and $r_p$ for reflection inside the substrate. In some types of beam splitter, however, these s-wave and p-wave amplitude-reflection coefficients are equal to $r_s$ and $r_p$ rather than $[-r_s]$ and $[-r_p]$ (see discussion in Sec. 1.1 of Chapter 1). To allow for both types of beam-splitter arrangements, we use the same parameter W introduced in the discussion following Eq. (1.15c) of Chapter 1. Just like before, W can only equal $+1$ or $-1$. Now the expressions $Wr_s$ and $Wr_p$ can be used to represent both types of s-wave and p-wave amplitude-reflection coefficients off the beam splitter's backside. Reflecting the plane wavefield in (4.81a) and (4.81b) off the backside of the beam splitter now gives

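$$\text{Complex } \vec{E} \text{ field} = Wr_M\big[\hat{x}E_{0x}\,t_s r_s\gamma_s^{(abc)} + \hat{y}E_{0y}\,t_p r_p\gamma_p^{(abc)}\big]\,e^{2\pi i\sigma(\hat{\Omega}\cdot\vec{r}-ct)} + \underline{\vec{O}}(\theta_b) \tag{4.84a}$$

and

$$\text{Complex } \vec{B} \text{ field} = \frac{Wr_M}{c}\big[\hat{y}E_{0x}\,t_s r_s\gamma_s^{(abc)} - \hat{x}E_{0y}\,t_p r_p\gamma_p^{(abc)}\big]\,e^{2\pi i\sigma(\hat{\Omega}\cdot\vec{r}-ct)} + \underline{\vec{O}}(\theta_b) , \tag{4.84b}$$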

where Eq. (4.83) is used to replace $r_{FM}$ by $r_M$ in the expressions for the complex E and B fields. There is no difficulty passing the plane wave coming from the moving-mirror arm through the beam-splitter film because it transmits the same way the original plane wave transmitted through to the fixed-mirror arm. This means we can use the same $t_{s,p}$ complex parameters to describe the change in the plane wave in Eqs. (4.82a) and (4.82b) [just put $t_{s,p}$ into (4.82a,b) and use (4.83)]. Now, however, we also want to allow for the possibility that the moving mirror is no longer at ZPD. In Eqs. (4.84a) and (4.84b) the complex exponential

$$e^{2\pi i\sigma(\hat{\Omega}\cdot\vec{r}-ct)}$$

is always the correct phase term for the plane wave traveling toward the detector after passing out and back the fixed-mirror arm. When the moving mirror is no longer at ZPD, the correct phase term for the plane wave passing out and back the moving-mirror arm is

$$e^{2\pi i\sigma[\hat{\Omega}_d\cdot(\vec{r}+\chi\hat{z})-ct]}$$

with $\vec{r}\rightarrow\vec{r}+\chi\hat{z}$ to account for the moving-mirror arm's OPD $\chi$ (that is, to account for the extra distance traveled when the moving mirror is not at its ZPD position).[66] Therefore, we now write the E and B fields of the plane wave traveling toward the detector after transmitting through the beam splitter from the moving-mirror arm as

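$$\text{Complex } \vec{E} \text{ field} = r_M\big[\hat{x}E_{0x}\,t_s r_s\gamma_s^{(abc)} + \hat{y}E_{0y}\,t_p r_p\gamma_p^{(abc)}\big]\,e^{2\pi i\sigma[\hat{\Omega}_d\cdot(\vec{r}+\chi\hat{z})-ct]} + \underline{\vec{O}}(\theta_b) \tag{4.85a}$$

and

$$\text{Complex } \vec{B} \text{ field} = \frac{r_M}{c}\big[\hat{y}E_{0x}\,t_s r_s\gamma_s^{(abc)} - \hat{x}E_{0y}\,t_p r_p\gamma_p^{(abc)}\big]\,e^{2\pi i\sigma[\hat{\Omega}_d\cdot(\vec{r}+\chi\hat{z})-ct]} + \underline{\vec{O}}(\theta_b) . \tag{4.85b}$$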

[66] The OPD is defined in Sec. 4.11 and first used in Eq. (1.15b) of Chapter 1. The ZPD is defined at the beginning of Sec. 1.4 of Chapter 1.
In Sec. 4.11 we decided that the recombined radiation on the far side of the beam splitter
would be called the balanced radiation field. Having traced a monochromatic plane wave through
the interferometer, we can now represent its balanced E and B fields by adding together the
formulas in (4.84a), (4.84b), (4.85a), and (4.85b),

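$$\begin{aligned}
\text{Complex balanced } \vec{E} \text{ field} ={}& \hat{x}\,r_M E_{0x}\,t_s r_s\gamma_s^{(abc)}\,e^{2\pi i\sigma[\hat{\Omega}\cdot\vec{r}-ct]}\Big(W + e^{2\pi i\sigma\chi(\hat{\Omega}_d\cdot\hat{z})}\,e^{2\pi i\sigma(\hat{\Omega}_d-\hat{\Omega})\cdot\vec{r}}\Big) \\
&+ \hat{y}\,r_M E_{0y}\,t_p r_p\gamma_p^{(abc)}\,e^{2\pi i\sigma[\hat{\Omega}\cdot\vec{r}-ct]}\Big(W + e^{2\pi i\sigma\chi(\hat{\Omega}_d\cdot\hat{z})}\,e^{2\pi i\sigma(\hat{\Omega}_d-\hat{\Omega})\cdot\vec{r}}\Big) \\
&+ \underline{\vec{O}}(\theta_b)
\end{aligned} \tag{4.86a}$$

and

$$\begin{aligned}
\text{Complex balanced } \vec{B} \text{ field} ={}& \hat{y}\,\frac{r_M}{c} E_{0x}\,t_s r_s\gamma_s^{(abc)}\,e^{2\pi i\sigma[\hat{\Omega}\cdot\vec{r}-ct]}\Big(W + e^{2\pi i\sigma\chi(\hat{\Omega}_d\cdot\hat{z})}\,e^{2\pi i\sigma(\hat{\Omega}_d-\hat{\Omega})\cdot\vec{r}}\Big) \\
&- \hat{x}\,\frac{r_M}{c} E_{0y}\,t_p r_p\gamma_p^{(abc)}\,e^{2\pi i\sigma[\hat{\Omega}\cdot\vec{r}-ct]}\Big(W + e^{2\pi i\sigma\chi(\hat{\Omega}_d\cdot\hat{z})}\,e^{2\pi i\sigma(\hat{\Omega}_d-\hat{\Omega})\cdot\vec{r}}\Big) \\
&+ \underline{\vec{O}}(\theta_b) .
\end{aligned} \tag{4.86b}$$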

According to inequality (4.68), angle $\theta_b$ is much greater than $\theta_d$; and because the input beam is direction-chopped, we know that $\theta_b$ is itself a small quantity. When the typical values of $\theta_b$ and $\theta_d$ for standard Michelson interferometers are plugged into the phase terms of (4.86a) and (4.86b), it can be shown that [see Eqs. (4B.5d) and (4B.10d) from Appendix 4B]

$$e^{2\pi i\sigma\chi(\hat{\Omega}_d\cdot\hat{z})} \cong e^{2\pi i\sigma\chi(\hat{\Omega}\cdot\hat{z})} \tag{4.87a}$$

and

$$e^{2\pi i\sigma(\hat{\Omega}_d-\hat{\Omega})\cdot\vec{r}} \cong e^{4\pi i\sigma(\hat{n}_M-\hat{z})\cdot\vec{r}} . \tag{4.87b}$$

Here $\hat{n}_M$ is the dimensionless unit normal vector to the moving mirror's surface and, following the convention of the unfolded interferometer, $\hat{z}$ points from the moving mirror to the beam splitter. When the moving mirror is perfectly aligned, $\hat{n}_M = \hat{z}$. Substitution of these two approximations into the formulas for the complex balanced E and B fields gives

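$$\begin{aligned}
\text{Complex balanced } \vec{E} \text{ field} ={}& \hat{x}\,r_M E_{0x}\,t_s r_s\gamma_s^{(abc)}\,e^{2\pi i\sigma[\hat{\Omega}\cdot\vec{r}-ct]}\Big(W + e^{2\pi i\sigma\chi(\hat{\Omega}\cdot\hat{z})}\,e^{4\pi i\sigma(\hat{n}_M-\hat{z})\cdot\vec{r}}\Big) \\
&+ \hat{y}\,r_M E_{0y}\,t_p r_p\gamma_p^{(abc)}\,e^{2\pi i\sigma[\hat{\Omega}\cdot\vec{r}-ct]}\Big(W + e^{2\pi i\sigma\chi(\hat{\Omega}\cdot\hat{z})}\,e^{4\pi i\sigma(\hat{n}_M-\hat{z})\cdot\vec{r}}\Big) \\
&+ \underline{\vec{O}}(\theta_b)
\end{aligned} \tag{4.88a}$$

and

$$\begin{aligned}
\text{Complex balanced } \vec{B} \text{ field} ={}& \hat{y}\,\frac{r_M}{c} E_{0x}\,t_s r_s\gamma_s^{(abc)}\,e^{2\pi i\sigma[\hat{\Omega}\cdot\vec{r}-ct]}\Big(W + e^{2\pi i\sigma\chi(\hat{\Omega}\cdot\hat{z})}\,e^{4\pi i\sigma(\hat{n}_M-\hat{z})\cdot\vec{r}}\Big) \\
&- \hat{x}\,\frac{r_M}{c} E_{0y}\,t_p r_p\gamma_p^{(abc)}\,e^{2\pi i\sigma[\hat{\Omega}\cdot\vec{r}-ct]}\Big(W + e^{2\pi i\sigma\chi(\hat{\Omega}\cdot\hat{z})}\,e^{4\pi i\sigma(\hat{n}_M-\hat{z})\cdot\vec{r}}\Big) \\
&+ \underline{\vec{O}}(\theta_b) .
\end{aligned} \tag{4.88b}$$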
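The two-beam factor that multiplies each polarization component of the balanced field can be explored numerically. The following is a minimal sketch, not the book's procedure: the function names (`balanced_factor`, `modulation`) and the sample numbers are hypothetical, the wavenumber is called `sigma` and the OPD `chi`, and the mirror-misalignment phase is passed in as a single real number `mirror_phase` standing for the argument of the misalignment exponential. For an on-axis wave and a perfectly aligned mirror the squared magnitude of the factor reduces to the familiar two-beam form 2 + 2 W cos(2 pi sigma chi).

```python
import cmath
import math

def balanced_factor(sigma, chi, w, cos_tilt=1.0, mirror_phase=0.0):
    """W + exp(2*pi*i*sigma*chi*cos_tilt) * exp(i*mirror_phase).

    cos_tilt plays the role of the dot product of the propagation unit
    vector with z; mirror_phase is the (assumed real) misalignment phase.
    """
    opd_term = cmath.exp(2j * math.pi * sigma * chi * cos_tilt)
    return w + opd_term * cmath.exp(1j * mirror_phase)

def modulation(sigma, chi, w, cos_tilt=1.0, mirror_phase=0.0):
    """Squared magnitude of the two-beam factor."""
    return abs(balanced_factor(sigma, chi, w, cos_tilt, mirror_phase)) ** 2

if __name__ == "__main__":
    sigma = 2000.0                      # wavenumber in cm^-1 (sample value)
    for chi in (0.0, 1.25e-4, 2.5e-4):  # OPD in cm (sample values)
        print(chi, modulation(sigma, chi, +1), modulation(sigma, chi, -1))
```

With W = +1 the factor is largest at zero OPD and with W = -1 it vanishes there, which is one way to see why the sign of W only shifts the fringe pattern by half a fringe.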
4.13 Multiple Plane Waves and Michelson Interferometers
Having found the complex E and B fields for one monochromatic plane wave passing through the interferometer, we put an index $\ell$ on the wavenumber, an index j on the propagation vector, replace $\vec{E}_0$ by $\vec{E}_{\ell j}$, and take the sum over $\ell$ and j to create a polychromatic input radiance field. In place of formulas (4.69a) and (4.69b) for a monochromatic plane wave entering the interferometer, we have

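$$\text{Complex input } \vec{E} \text{ field} = \sum_{\ell}\sum_{j}\vec{E}_{\ell j}\,e^{2\pi i\sigma_\ell(\hat{\Omega}_j\cdot\vec{r}-ct)} = \sum_{\ell}\sum_{j}(\hat{x}E_{\ell jx} + \hat{y}E_{\ell jy})\,e^{2\pi i\sigma_\ell(\hat{\Omega}_j\cdot\vec{r}-ct)} + \underline{\vec{O}}(\theta_b) \tag{4.89a}$$

and

$$\text{Complex input } \vec{B} \text{ field} = \frac{1}{c}\sum_{\ell}\sum_{j}(\hat{\Omega}_j\times\vec{E}_{\ell j})\,e^{2\pi i\sigma_\ell(\hat{\Omega}_j\cdot\vec{r}-ct)} = \frac{1}{c}\sum_{\ell}\sum_{j}(\hat{y}E_{\ell jx} - \hat{x}E_{\ell jy})\,e^{2\pi i\sigma_\ell(\hat{\Omega}_j\cdot\vec{r}-ct)} + \underline{\vec{O}}(\theta_b) , \tag{4.89b}$$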

where Eqs. (4.77a) and (4.77b) are used to write the sums over $\ell$ and j in terms of the x and y components of $\vec{E}_{\ell j}$. Equations (4.89a) and (4.89b) apply to a collection of plane waves with $\hat{\Omega}_j$ propagation vectors parallel to, or nearly parallel to, the optical axis. Having passed through the interferometer, each plane wave takes on the form given in Eqs. (4.88a) and (4.88b), so that the total balanced radiation field traveling to the detector becomes

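$$\begin{aligned}
\text{Complex balanced } \vec{E} \text{ field} = \sum_{\ell}\sum_{j}\Big\{ & \hat{x}\,r_{\ell M}\gamma_{\ell sj}^{(abc)} E_{\ell jx}\,t_{\ell s} r_{\ell s}\,e^{2\pi i\sigma_\ell[\hat{\Omega}_j\cdot\vec{r}-ct]}\Big(W + e^{2\pi i\sigma_\ell\chi(\hat{\Omega}_j\cdot\hat{z})}\,e^{4\pi i\sigma_\ell(\hat{n}_M-\hat{z})\cdot\vec{r}}\Big) \\
&+ \hat{y}\,r_{\ell M}\gamma_{\ell pj}^{(abc)} E_{\ell jy}\,t_{\ell p} r_{\ell p}\,e^{2\pi i\sigma_\ell[\hat{\Omega}_j\cdot\vec{r}-ct]}\Big(W + e^{2\pi i\sigma_\ell\chi(\hat{\Omega}_j\cdot\hat{z})}\,e^{4\pi i\sigma_\ell(\hat{n}_M-\hat{z})\cdot\vec{r}}\Big)\Big\} + \underline{\vec{O}}(\theta_b)
\end{aligned} \tag{4.89c}$$

and

$$\begin{aligned}
\text{Complex balanced } \vec{B} \text{ field} = \sum_{\ell}\sum_{j}\Big\{ & \hat{y}\,\frac{r_{\ell M}\gamma_{\ell sj}^{(abc)}}{c} E_{\ell jx}\,t_{\ell s} r_{\ell s}\,e^{2\pi i\sigma_\ell[\hat{\Omega}_j\cdot\vec{r}-ct]}\Big(W + e^{2\pi i\sigma_\ell\chi(\hat{\Omega}_j\cdot\hat{z})}\,e^{4\pi i\sigma_\ell(\hat{n}_M-\hat{z})\cdot\vec{r}}\Big) \\
&- \hat{x}\,\frac{r_{\ell M}\gamma_{\ell pj}^{(abc)}}{c} E_{\ell jy}\,t_{\ell p} r_{\ell p}\,e^{2\pi i\sigma_\ell[\hat{\Omega}_j\cdot\vec{r}-ct]}\Big(W + e^{2\pi i\sigma_\ell\chi(\hat{\Omega}_j\cdot\hat{z})}\,e^{4\pi i\sigma_\ell(\hat{n}_M-\hat{z})\cdot\vec{r}}\Big)\Big\} + \underline{\vec{O}}(\theta_b) .
\end{aligned} \tag{4.89d}$$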

Note that all the parameters depending on $\sigma$ acquire $\ell$ subscripts; all the parameters depending on the angle of incidence acquire j subscripts; and all the parameters with $\ell$ and j subscripts depend on both. Specifically, we define

$$\gamma_{\ell sj}^{(abc)} = \gamma_s^{(abc)} \tag{4.89e}$$

at the $\sigma_\ell$ wavenumber and at the angles of incidence corresponding to a monochromatic plane wave with an $\hat{\Omega}_j$ propagation vector, and

$$\gamma_{\ell pj}^{(abc)} = \gamma_p^{(abc)} \tag{4.89f}$$

at the $\sigma_\ell$ wavenumber and at the angles of incidence corresponding to a monochromatic plane wave with an $\hat{\Omega}_j$ propagation vector. Similarly, we define

$$r_{\ell s} = r_s \ \text{ at } \ \sigma = \sigma_\ell , \tag{4.89g}$$

$$r_{\ell p} = r_p \ \text{ at } \ \sigma = \sigma_\ell , \tag{4.89h}$$

$$t_{\ell s} = t_s \ \text{ at } \ \sigma = \sigma_\ell , \tag{4.89i}$$

$$t_{\ell p} = t_p \ \text{ at } \ \sigma = \sigma_\ell , \tag{4.89j}$$

and

$$r_{\ell M} = r_M \ \text{ at } \ \sigma = \sigma_\ell . \tag{4.89k}$$

Following the procedure shown in Eqs. (4.44a) and (4.44b) above, the true radiation fields can
be written as the real part of the above formulas, giving

2 [ ] 2 ( ) 4 ( ) ( )

2 [ ] 2 ( ) 4 ( ( )
Real balanced field
1
( )
2
(
j j
M
j j
M
i r ct i z
i n z r abc
M sj jx s s
j
i r ct i z
i n abc
M pj jy p p
E
x r E t r e W e e
y r E t r e W e e

+ +

A A
A
A A
A
G
G
A A A A A
A
G
A A A A A

}
)

2 [ ] 2 ( ) 4 ( ) ( )

2 [ ] 2 ( ) 4 ( ) ( )
)
1
( )
2
( )
( )
j j
M
j j
M
z r
i r ct i z
i n z r abc
M sj jx s s
i r ct i z
i n z r abc
M pj jy p p
b
x r E t r e W e e
y r E t r e W e e
O

+ +
A A
A
A A
A
G
G
G
A A A A A
A
G
G
A A A A A
G

(4.90a)
and

( )

2 [ ] 2 ( ) 4 ( )
( )

2 [ ] 2 ( )
4 (
Real balanced field
1
( )
2
(
j j
M
j j
abc
jx s s
j
abc
i r ct i z M pj i
jy p p
B
r
y E t r e W e e
c
r
x E t r e W e e
c

+

A A
A
A A
A
G
G
A A
A A A
A
G
A A
A A A

)
( )

2 [ ] 2 ( ) 4 ( )
( )

2 [ ] 2 ( ) 4 ( )
)
1
( )
2
(
M
j j
M
j j
M
n z r
abc
jx s s
abc
i r ct i z M pj i n z r
jy p p
r
y E t r e W e e
c
r
x E t r e W e e
c

A A
A
A A
A
G
G
G
A A
A A A
A
G
G
A A
A A A

)
( ) ,
b
O
+
G
(4.90b)

where the underscore has been removed from the $\underline{\vec{O}}(\theta_b)$ symbol to show that only the real part of this small uncertainty is retained. We define

1
( ) for 0
2
jx jx
E E = = >
A A A
, (4.91a)

1
( ) for 0
2
jy jy
E E = = >
A A A
, (4.91b)

1
( ) for 0
2
jx jx
E E
= = <
A A A
, (4.91c)
and

1
( ) for 0
2
jy jy
E E
= = <
A A A
, (4.91d)
with

1

+
=
A A A
.

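We set up new versions of the $\gamma$, $r$, and $t$ parameters by defining the complex functions $r$, $r_s$, $r_p$, $t_s$, $t_p$, $\gamma_{sj}^{(abc)}$, $\gamma_{pj}^{(abc)}$ to be

$$r(\sigma_\ell) = r_{\ell M} \ \text{ with } \ r(-\sigma) = r(\sigma)^{*} , \tag{4.92a}$$

$$r_s(\sigma_\ell) = r_{\ell s} \ \text{ with } \ r_s(-\sigma) = r_s(\sigma)^{*} , \tag{4.92b}$$

$$r_p(\sigma_\ell) = r_{\ell p} \ \text{ with } \ r_p(-\sigma) = r_p(\sigma)^{*} , \tag{4.92c}$$

$$t_s(\sigma_\ell) = t_{\ell s} \ \text{ with } \ t_s(-\sigma) = t_s(\sigma)^{*} , \tag{4.92d}$$

and

$$t_p(\sigma_\ell) = t_{\ell p} \ \text{ with } \ t_p(-\sigma) = t_p(\sigma)^{*} . \tag{4.92e}$$

We also say that

$$\gamma_{sj}^{(abc)}(\sigma_\ell) = \gamma_{\ell sj}^{(abc)} \ \text{ and } \ \gamma_{pj}^{(abc)}(\sigma_\ell) = \gamma_{\ell pj}^{(abc)} , \tag{4.92f}$$

where

$$\gamma_{sj}^{(abc)}(-\sigma) = \gamma_{sj}^{(abc)}(\sigma)^{*} \ \text{ and } \ \gamma_{pj}^{(abc)}(-\sigma) = \gamma_{pj}^{(abc)}(\sigma)^{*} .$$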

Now $r$, $r_s$, $r_p$, $t_s$, $t_p$, $\gamma_{sj}^{(abc)}$, $\gamma_{pj}^{(abc)}$ are Hermitian functions of $\sigma$ [see remark following Eq. (2.34a) in Chapter 2]. The definitions of $E_{jx}(\sigma)$ and $E_{jy}(\sigma)$ in Eqs. (4.91a)–(4.91d) require that

$$E_{jx}(-\sigma) = E_{jx}(\sigma)^{*} \tag{4.92g}$$

and

$$E_{jy}(-\sigma) = E_{jy}(\sigma)^{*} , \tag{4.92h}$$

showing that they are also Hermitian functions. Just as in Eqs. (4.46a) and (4.46b), the sums over $\ell$ in (4.90a) and (4.90b) can be converted to integrals over $\sigma$ to get

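$$\begin{aligned}
\text{Real balanced } \vec{E} \text{ field} = \int_{-\infty}^{\infty}\sum_{j}\Big\{ & \hat{x}\,r(\sigma)\gamma_{sj}^{(abc)}(\sigma)E_{jx}(\sigma)\,t_s(\sigma)r_s(\sigma)\,e^{2\pi i\sigma[\hat{\Omega}_j\cdot\vec{r}-ct]}\Big(W + e^{2\pi i\sigma\chi(\hat{\Omega}_j\cdot\hat{z})}\,e^{4\pi i\sigma(\hat{n}_M-\hat{z})\cdot\vec{r}}\Big) \\
&+ \hat{y}\,r(\sigma)\gamma_{pj}^{(abc)}(\sigma)E_{jy}(\sigma)\,t_p(\sigma)r_p(\sigma)\,e^{2\pi i\sigma[\hat{\Omega}_j\cdot\vec{r}-ct]}\Big(W + e^{2\pi i\sigma\chi(\hat{\Omega}_j\cdot\hat{z})}\,e^{4\pi i\sigma(\hat{n}_M-\hat{z})\cdot\vec{r}}\Big)\Big\}\,d\sigma + \vec{O}(\theta_b)
\end{aligned} \tag{4.93a}$$

and

$$\begin{aligned}
\text{Real balanced } \vec{B} \text{ field} = \int_{-\infty}^{\infty}\sum_{j}\Big\{ & \hat{y}\,\frac{r(\sigma)\gamma_{sj}^{(abc)}(\sigma)}{c}E_{jx}(\sigma)\,t_s(\sigma)r_s(\sigma)\,e^{2\pi i\sigma[\hat{\Omega}_j\cdot\vec{r}-ct]}\Big(W + e^{2\pi i\sigma\chi(\hat{\Omega}_j\cdot\hat{z})}\,e^{4\pi i\sigma(\hat{n}_M-\hat{z})\cdot\vec{r}}\Big) \\
&- \hat{x}\,\frac{r(\sigma)\gamma_{pj}^{(abc)}(\sigma)}{c}E_{jy}(\sigma)\,t_p(\sigma)r_p(\sigma)\,e^{2\pi i\sigma[\hat{\Omega}_j\cdot\vec{r}-ct]}\Big(W + e^{2\pi i\sigma\chi(\hat{\Omega}_j\cdot\hat{z})}\,e^{4\pi i\sigma(\hat{n}_M-\hat{z})\cdot\vec{r}}\Big)\Big\}\,d\sigma + \vec{O}(\theta_b) .
\end{aligned} \tag{4.93b}$$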

The limits of integration are put at $-\infty$ and $+\infty$ by defining $E_{jx}(\sigma)$ and $E_{jy}(\sigma)$ to be zero for $\sigma$ values that do not correspond to allowed index values in the sums over $\ell$. In particular, we expect the integrals to converge because $E_{jx,y}(\sigma)$ are negligible or zero for values of $\sigma$ corresponding to radiation wavelengths not measured by the interferometer's detector [see discussion after Eq. (4.66b) above].
Following the procedure already explained in Sec. 4.8, we replace the sum over j with a double integral over $d^2\kappa$. The first step is to convert the sum over j into a double sum over indices m, n as in Eqs. (4.51a) and (4.51b) above,

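$$\begin{aligned}
\text{Real balanced } \vec{E} \text{ field} = \int_{-\infty}^{\infty}\sum_{n,m}\Big\{ & \hat{x}\,r(\sigma)\gamma_{snm}^{(abc)}(\sigma)E_{nmx}(\sigma)\,t_s(\sigma)r_s(\sigma)\,e^{2\pi i\sigma[\hat{\Omega}_{nm}\cdot\vec{r}-ct]}\Big(W + e^{2\pi i\sigma\chi(\hat{\Omega}_{nm}\cdot\hat{z})}\,e^{4\pi i\sigma(\hat{n}_M-\hat{z})\cdot\vec{r}}\Big) \\
&+ \hat{y}\,r(\sigma)\gamma_{pnm}^{(abc)}(\sigma)E_{nmy}(\sigma)\,t_p(\sigma)r_p(\sigma)\,e^{2\pi i\sigma[\hat{\Omega}_{nm}\cdot\vec{r}-ct]}\Big(W + e^{2\pi i\sigma\chi(\hat{\Omega}_{nm}\cdot\hat{z})}\,e^{4\pi i\sigma(\hat{n}_M-\hat{z})\cdot\vec{r}}\Big)\Big\}\,d\sigma + \vec{O}(\theta_b)
\end{aligned} \tag{4.94a}$$

and

$$\begin{aligned}
\text{Real balanced } \vec{B} \text{ field} = \frac{1}{c}\int_{-\infty}^{\infty}\sum_{n,m}\Big\{ & \hat{y}\,r(\sigma)\gamma_{snm}^{(abc)}(\sigma)E_{nmx}(\sigma)\,t_s(\sigma)r_s(\sigma)\,e^{2\pi i\sigma[\hat{\Omega}_{nm}\cdot\vec{r}-ct]}\Big(W + e^{2\pi i\sigma\chi(\hat{\Omega}_{nm}\cdot\hat{z})}\,e^{4\pi i\sigma(\hat{n}_M-\hat{z})\cdot\vec{r}}\Big) \\
&- \hat{x}\,r(\sigma)\gamma_{pnm}^{(abc)}(\sigma)E_{nmy}(\sigma)\,t_p(\sigma)r_p(\sigma)\,e^{2\pi i\sigma[\hat{\Omega}_{nm}\cdot\vec{r}-ct]}\Big(W + e^{2\pi i\sigma\chi(\hat{\Omega}_{nm}\cdot\hat{z})}\,e^{4\pi i\sigma(\hat{n}_M-\hat{z})\cdot\vec{r}}\Big)\Big\}\,d\sigma + \vec{O}(\theta_b) ,
\end{aligned} \tag{4.94b}$$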

where we define $E_{nmx}(\sigma) = E_{nmy}(\sigma) = 0$ for those m and n values that do not correspond to j values in the original sums, the sums over propagation directions in Eqs. (4.93a) and (4.93b). As in Sec. 4.8, the $\hat{\Omega}_{nm}$ propagation vectors can be written as [see Eq. (4.51c)]

$$\hat{\Omega}_{nm} = \hat{x}\kappa_{nx} + \hat{y}\kappa_{my} + \hat{z}\sqrt{1-\kappa_{nx}^2-\kappa_{my}^2} .$$

Unlike the situation at the beginning of Sec. 4.8, parameters $\kappa_{nx}$ and $\kappa_{my}$ are always very small compared to one because all the j values in the original sum correspond to propagation vectors that are parallel to, or nearly parallel to, $\hat{z}$; that is

$$|\kappa_{nx}| \ll 1 \tag{4.95a}$$

and

$$|\kappa_{my}| \ll 1 . \tag{4.95b}$$

For each m, n propagation direction, we define

$$\varepsilon_x(\kappa_{nx},\kappa_{my},\sigma)\,\Delta\kappa_{nx}\,\Delta\kappa_{my} = E_{nmx}(\sigma) \tag{4.96a}$$

and

$$\varepsilon_y(\kappa_{nx},\kappa_{my},\sigma)\,\Delta\kappa_{nx}\,\Delta\kappa_{my} = E_{nmy}(\sigma) \tag{4.96b}$$
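with

$$\Delta\kappa_{nx} = \kappa_{n+1,x} - \kappa_{n,x} \tag{4.96c}$$

and

$$\Delta\kappa_{my} = \kappa_{m+1,y} - \kappa_{m,y} . \tag{4.96d}$$

We also specify that

$$\gamma_s^{(abc)}(\kappa_{nx},\kappa_{my},\sigma) = \gamma_{snm}^{(abc)}(\sigma) \ \text{ and } \ \gamma_p^{(abc)}(\kappa_{nx},\kappa_{my},\sigma) = \gamma_{pnm}^{(abc)}(\sigma) . \tag{4.96e}$$

Since

$$\gamma_{sj}^{(abc)}(-\sigma) = \gamma_{sj}^{(abc)}(\sigma)^{*} \ \text{ and } \ \gamma_{pj}^{(abc)}(-\sigma) = \gamma_{pj}^{(abc)}(\sigma)^{*}$$

in Eqs. (4.92f), it follows that when

$$\text{index } j \rightarrow \text{indices } m, n$$

we must have

$$\gamma_{snm}^{(abc)}(-\sigma) = \gamma_{snm}^{(abc)}(\sigma)^{*} \ \text{ and } \ \gamma_{pnm}^{(abc)}(-\sigma) = \gamma_{pnm}^{(abc)}(\sigma)^{*}$$

so that

$$\gamma_s^{(abc)}(\kappa_{nx},\kappa_{my},-\sigma) = \gamma_s^{(abc)}(\kappa_{nx},\kappa_{my},\sigma)^{*} \ \text{ and } \ \gamma_p^{(abc)}(\kappa_{nx},\kappa_{my},-\sigma) = \gamma_p^{(abc)}(\kappa_{nx},\kappa_{my},\sigma)^{*} . \tag{4.96f}$$

Just like in Eqs. (4.53a) and (4.53b), we pass to the limit of decreasing $\Delta\kappa_{nx}$, $\Delta\kappa_{my}$ in (4.94a) and (4.94b) to get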

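$$\begin{aligned}
\text{Real balanced } \vec{E} \text{ field} = \int_{-\infty}^{\infty}d\sigma\iint d^2\kappa\,\Big\{ & \hat{x}\,r(\sigma)\gamma_s^{(abc)}(\vec{\kappa},\sigma)\,\varepsilon_x(\vec{\kappa},\sigma)\,t_s(\sigma)r_s(\sigma)\,e^{2\pi i\sigma[\hat{\Omega}\cdot\vec{r}-ct]}\Big(W + e^{2\pi i\sigma\chi(\hat{\Omega}\cdot\hat{z})}\,e^{4\pi i\sigma(\hat{n}_M-\hat{z})\cdot\vec{r}}\Big) \\
&+ \hat{y}\,r(\sigma)\gamma_p^{(abc)}(\vec{\kappa},\sigma)\,\varepsilon_y(\vec{\kappa},\sigma)\,t_p(\sigma)r_p(\sigma)\,e^{2\pi i\sigma[\hat{\Omega}\cdot\vec{r}-ct]}\Big(W + e^{2\pi i\sigma\chi(\hat{\Omega}\cdot\hat{z})}\,e^{4\pi i\sigma(\hat{n}_M-\hat{z})\cdot\vec{r}}\Big)\Big\} + \vec{O}(\theta_b)
\end{aligned} \tag{4.97a}$$

and

$$\begin{aligned}
\text{Real balanced } \vec{B} \text{ field} = \frac{1}{c}\int_{-\infty}^{\infty}d\sigma\iint d^2\kappa\,\Big\{ & \hat{y}\,r(\sigma)\gamma_s^{(abc)}(\vec{\kappa},\sigma)\,\varepsilon_x(\vec{\kappa},\sigma)\,t_s(\sigma)r_s(\sigma)\,e^{2\pi i\sigma[\hat{\Omega}\cdot\vec{r}-ct]}\Big(W + e^{2\pi i\sigma\chi(\hat{\Omega}\cdot\hat{z})}\,e^{4\pi i\sigma(\hat{n}_M-\hat{z})\cdot\vec{r}}\Big) \\
&- \hat{x}\,r(\sigma)\gamma_p^{(abc)}(\vec{\kappa},\sigma)\,\varepsilon_y(\vec{\kappa},\sigma)\,t_p(\sigma)r_p(\sigma)\,e^{2\pi i\sigma[\hat{\Omega}\cdot\vec{r}-ct]}\Big(W + e^{2\pi i\sigma\chi(\hat{\Omega}\cdot\hat{z})}\,e^{4\pi i\sigma(\hat{n}_M-\hat{z})\cdot\vec{r}}\Big)\Big\} + \vec{O}(\theta_b) .
\end{aligned} \tag{4.97b}$$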

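As in Sec. 4.8, the vector argument $\vec{\kappa} = \hat{x}\kappa_x + \hat{y}\kappa_y$ is used as a shorthand for the two arguments $\kappa_x$ and $\kappa_y$, so that $\varepsilon_x(\kappa_x,\kappa_y,\sigma) = \varepsilon_x(\vec{\kappa},\sigma)$, $\varepsilon_y(\kappa_x,\kappa_y,\sigma) = \varepsilon_y(\vec{\kappa},\sigma)$, and

$$\gamma_{s,p}^{(abc)}(\kappa_x,\kappa_y,\sigma) = \gamma_{s,p}^{(abc)}(\vec{\kappa},\sigma) .$$

This last formula lets us write rule (4.96f) as

$$\gamma_s^{(abc)}(\vec{\kappa},-\sigma) = \gamma_s^{(abc)}(\vec{\kappa},\sigma)^{*} \ \text{ and } \ \gamma_p^{(abc)}(\vec{\kappa},-\sigma) = \gamma_p^{(abc)}(\vec{\kappa},\sigma)^{*} . \tag{4.97c}$$

Following the notation used in Eqs. (4.54a) and (4.54d), we write

$$\hat{\Omega} = \hat{x}\kappa_x + \hat{y}\kappa_y + \hat{z}\sqrt{1-\kappa_x^2-\kappa_y^2} = \vec{\kappa} + \hat{z}\sqrt{1-\kappa^2} \tag{4.97d}$$

with

$$\kappa^2 = |\vec{\kappa}|^2 = \kappa_x^2 + \kappa_y^2 .$$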

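Vector $\vec{\rho} = x\hat{x} + y\hat{y}$ lets us write $\vec{r} = \vec{\rho} + z\hat{z}$ [see Eqs. (4.54b) and (4.54e)] so that the expressions $e^{2\pi i\sigma[\hat{\Omega}\cdot\vec{r}-ct]}\,\varepsilon_x(\vec{\kappa},\sigma)$ and $e^{2\pi i\sigma[\hat{\Omega}\cdot\vec{r}-ct]}\,\varepsilon_y(\vec{\kappa},\sigma)$ become

$$e^{2\pi i\sigma[\hat{\Omega}\cdot\vec{r}-ct]}\,\varepsilon_x(\vec{\kappa},\sigma) = e^{2\pi i\sigma z\sqrt{1-\kappa^2}}\,e^{2\pi i\sigma[\vec{\kappa}\cdot\vec{\rho}-ct]}\,\varepsilon_x(\vec{\kappa},\sigma) = \mathcal{E}_x(\vec{\kappa},z,\sigma)\,e^{2\pi i\sigma[\vec{\kappa}\cdot\vec{\rho}-ct]}$$

and

$$e^{2\pi i\sigma[\hat{\Omega}\cdot\vec{r}-ct]}\,\varepsilon_y(\vec{\kappa},\sigma) = e^{2\pi i\sigma z\sqrt{1-\kappa^2}}\,e^{2\pi i\sigma[\vec{\kappa}\cdot\vec{\rho}-ct]}\,\varepsilon_y(\vec{\kappa},\sigma) = \mathcal{E}_y(\vec{\kappa},z,\sigma)\,e^{2\pi i\sigma[\vec{\kappa}\cdot\vec{\rho}-ct]} ,$$

where we define

$$\mathcal{E}_x(\vec{\kappa},z,\sigma) = \varepsilon_x(\vec{\kappa},\sigma)\,e^{2\pi i\sigma z\sqrt{1-\kappa^2}} \tag{4.98a}$$

and

$$\mathcal{E}_y(\vec{\kappa},z,\sigma) = \varepsilon_y(\vec{\kappa},\sigma)\,e^{2\pi i\sigma z\sqrt{1-\kappa^2}} . \tag{4.98b}$$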
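The factoring of the plane-wave phase into an axial part and a transverse part can be confirmed with a few lines of arithmetic. This sketch is illustrative only: the function names (`phase_direct`, `phase_split`) and the sample values are hypothetical, `kappa` stands for the pair of transverse direction components, and `ct` is passed as a single number. It evaluates the phase once from the full dot product of the propagation unit vector with the position vector and once as the product of the axial exponential used in (4.98a) and (4.98b) with the remaining transverse exponential.

```python
import cmath
import math

def phase_direct(sigma, kappa, r, ct):
    """exp(2*pi*i*sigma*(Omega . r - ct)) using the full unit vector Omega."""
    kx, ky = kappa
    kz = math.sqrt(1.0 - kx * kx - ky * ky)   # z component of the unit vector
    x, y, z = r
    return cmath.exp(2j * math.pi * sigma * (kx * x + ky * y + kz * z - ct))

def phase_split(sigma, kappa, r, ct):
    """Same phase written as (axial factor) * (transverse factor)."""
    kx, ky = kappa
    x, y, z = r
    axial = cmath.exp(2j * math.pi * sigma * z * math.sqrt(1.0 - kx * kx - ky * ky))
    transverse = cmath.exp(2j * math.pi * sigma * (kx * x + ky * y - ct))
    return axial * transverse

if __name__ == "__main__":
    sample = dict(sigma=1500.0, kappa=(0.01, -0.02), r=(0.3, -0.1, 2.0), ct=0.75)
    print(abs(phase_direct(**sample) - phase_split(**sample)))
```

The two evaluations agree to rounding error, which is all the identity claims; the z dependence can therefore be absorbed into the field amplitude without changing the transverse phase.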

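Substitution of these results into Eqs. (4.97a) and (4.97b) gives

$$\begin{aligned}
\text{Real balanced } \vec{E} \text{ field} = \vec{E}^{(\mathrm{bal})}(\vec{\rho},z,t) = \int_{-\infty}^{\infty}d\sigma\iint d^2\kappa\,\Big\{ & \hat{x}\,r(\sigma)\gamma_s^{(abc)}(\vec{\kappa},\sigma)\,\mathcal{E}_x(\vec{\kappa},z,\sigma)\,t_s(\sigma)r_s(\sigma)\,e^{2\pi i\sigma[\vec{\kappa}\cdot\vec{\rho}-ct]}\Big(W + e^{2\pi i\sigma\chi\sqrt{1-\kappa^2}}\,e^{4\pi i\sigma(\hat{n}_M-\hat{z})\cdot\vec{r}}\Big) \\
&+ \hat{y}\,r(\sigma)\gamma_p^{(abc)}(\vec{\kappa},\sigma)\,\mathcal{E}_y(\vec{\kappa},z,\sigma)\,t_p(\sigma)r_p(\sigma)\,e^{2\pi i\sigma[\vec{\kappa}\cdot\vec{\rho}-ct]}\Big(W + e^{2\pi i\sigma\chi\sqrt{1-\kappa^2}}\,e^{4\pi i\sigma(\hat{n}_M-\hat{z})\cdot\vec{r}}\Big)\Big\} + \vec{O}(\theta_b)
\end{aligned}$$

and

$$\begin{aligned}
\text{Real balanced } \vec{B} \text{ field} = \vec{B}^{(\mathrm{bal})}(\vec{\rho},z,t) = \frac{1}{c}\int_{-\infty}^{\infty}d\sigma\iint d^2\kappa\,\Big\{ & \hat{y}\,r(\sigma)\gamma_s^{(abc)}(\vec{\kappa},\sigma)\,\mathcal{E}_x(\vec{\kappa},z,\sigma)\,t_s(\sigma)r_s(\sigma)\,e^{2\pi i\sigma[\vec{\kappa}\cdot\vec{\rho}-ct]}\Big(W + e^{2\pi i\sigma\chi\sqrt{1-\kappa^2}}\,e^{4\pi i\sigma(\hat{n}_M-\hat{z})\cdot\vec{r}}\Big) \\
&- \hat{x}\,r(\sigma)\gamma_p^{(abc)}(\vec{\kappa},\sigma)\,\mathcal{E}_y(\vec{\kappa},z,\sigma)\,t_p(\sigma)r_p(\sigma)\,e^{2\pi i\sigma[\vec{\kappa}\cdot\vec{\rho}-ct]}\Big(W + e^{2\pi i\sigma\chi\sqrt{1-\kappa^2}}\,e^{4\pi i\sigma(\hat{n}_M-\hat{z})\cdot\vec{r}}\Big)\Big\} + \vec{O}(\theta_b)
\end{aligned}$$

with $\vec{E}^{(\mathrm{bal})}(\vec{\rho},z,t)$ and $\vec{B}^{(\mathrm{bal})}(\vec{\rho},z,t)$ used as a shorthand for $\vec{E}^{(\mathrm{bal})}(x,y,z,t)$ and $\vec{B}^{(\mathrm{bal})}(x,y,z,t)$.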
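These formulas can be simplified by gathering together like terms to get

$$\begin{aligned}
\vec{E}^{(\mathrm{bal})}(\vec{\rho},z,t) = \int_{-\infty}^{\infty}d\sigma\iint d^2\kappa\; & e^{2\pi i\sigma[\vec{\kappa}\cdot\vec{\rho}-ct]}\,r(\sigma)\Big(W + e^{2\pi i\sigma\chi\sqrt{1-\kappa^2}}\,e^{4\pi i\sigma(\hat{n}_M-\hat{z})\cdot\vec{r}}\Big) \\
&\times\Big\{\hat{x}\,\mathcal{E}_x(\vec{\kappa},z,\sigma)\gamma_s^{(abc)}(\vec{\kappa},\sigma)\,t_s(\sigma)r_s(\sigma) + \hat{y}\,\mathcal{E}_y(\vec{\kappa},z,\sigma)\gamma_p^{(abc)}(\vec{\kappa},\sigma)\,t_p(\sigma)r_p(\sigma)\Big\} + \vec{O}(\theta_b)
\end{aligned} \tag{4.99a}$$

and

$$\begin{aligned}
\vec{B}^{(\mathrm{bal})}(\vec{\rho},z,t) = \frac{1}{c}\int_{-\infty}^{\infty}d\sigma\iint d^2\kappa\; & e^{2\pi i\sigma[\vec{\kappa}\cdot\vec{\rho}-ct]}\,r(\sigma)\Big(W + e^{2\pi i\sigma\chi\sqrt{1-\kappa^2}}\,e^{4\pi i\sigma(\hat{n}_M-\hat{z})\cdot\vec{r}}\Big) \\
&\times\Big\{\hat{y}\,\mathcal{E}_x(\vec{\kappa},z,\sigma)\gamma_s^{(abc)}(\vec{\kappa},\sigma)\,t_s(\sigma)r_s(\sigma) - \hat{x}\,\mathcal{E}_y(\vec{\kappa},z,\sigma)\gamma_p^{(abc)}(\vec{\kappa},\sigma)\,t_p(\sigma)r_p(\sigma)\Big\} + \vec{O}(\theta_b) .
\end{aligned} \tag{4.99b}$$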

In Eqs. (4.92g) and (4.92h) we see that ( )
jx
E and ( )
jy
E are Hermitian functions, so when
the index j is replaced by the pair of indices m, n, it follows that ( )
nmx
E and ( )
nmy
E must also
be Hermitian. This forces ( , )
x
e
G
and ( , )
y
e
G
in Eqs. (4.96a) and (4.96b) to be Hermitian
functions of . Changing the sign of in
2
2 1 i z
e

is equivalent to taking its complex conjugate,
so Eqs. (4.98a) and (4.98b) show that ( , , )
x
z E
G
and ( , , )
y
z E
G
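are also Hermitian functions of $\sigma$, giving

$$\mathcal{E}_x(\vec{\beta}, -\sigma, z) = \mathcal{E}_x(\vec{\beta}, \sigma, z)^{*}$$ (4.100a)
and
$$\mathcal{E}_y(\vec{\beta}, -\sigma, z) = \mathcal{E}_y(\vec{\beta}, \sigma, z)^{*} .$$ (4.100b)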
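The Hermitian property used here is the familiar statement that the Fourier transform of a real function takes complex-conjugate values at opposite frequencies. The following minimal sketch checks this numerically with a discrete transform; the test signal and the function names are hypothetical and chosen only for illustration, not taken from the text.

```python
import cmath
import math

def dft(samples):
    """Direct O(N^2) discrete Fourier transform, F[k] = sum_n f[n] exp(-2*pi*i*k*n/N)."""
    n_pts = len(samples)
    return [
        sum(samples[n] * cmath.exp(-2j * math.pi * k * n / n_pts) for n in range(n_pts))
        for k in range(n_pts)
    ]

def max_hermitian_error(spectrum):
    """Largest |F[-k] - conj(F[k])| over all bins; index -k wraps around to N - k."""
    n_pts = len(spectrum)
    return max(abs(spectrum[(-k) % n_pts] - spectrum[k].conjugate()) for k in range(n_pts))

# Hypothetical real-valued test signal: one cosine plus a phase-shifted sine.
N = 32
real_signal = [
    math.cos(2 * math.pi * 3 * n / N) + 0.5 * math.sin(2 * math.pi * 5 * n / N + 0.3)
    for n in range(N)
]
hermitian_error = max_hermitian_error(dft(real_signal))
print(f"max |F(-k) - F(k)*| = {hermitian_error:.2e}")
```

For a real input the printed error sits at the level of floating-point roundoff, which is the discrete counterpart of the Hermitian condition stated above.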

Returning briefly to the discussion leading up to inequalities (4.95a) and (4.95b), we see that
because only plane waves traveling parallel to, or nearly parallel to, the optical axis can pass
through the interferometer (that is, because the radiation passing through the interferometer is
direction-chopped), both $e_x$ and $e_y$ must be zero or negligible unless $\beta_x$ and $\beta_y$ are small.
Consequently, consulting the definitions of $\mathcal{E}_x$ and $\mathcal{E}_y$ in (4.98a) and (4.98b), it follows that for
the direction-chopped radiation passing through the interferometer both $\mathcal{E}_x$ and $\mathcal{E}_y$ must be zero or
negligible unless $|\vec{\beta}| \ll 1$.
The connection between the output radiation fields
(bal)
( , , ) E z t
G
G
,
(bal)
( , , ) B z t
G
G
and the input
radiance is easy to understand because we have just created a carefully elaborated connection
between ( , , )
x
z E
G
, ( , , )
y
z E
G
and the complex
jx
E
A
,
jy
E
A
values in Eqs. (4.89a) and (4.89b)
characterizing the input radiation fields. To develop a consistent notation and make the
connection explicit, we apply the same process used to go from (4.89c) and (4.89d) to (4.99a) and
(4.99b) to Eqs. (4.89a) and (4.89b) representing the input radiation fields. The interferometer's
input fields then become


(in) 2 2 [ ]
Real input field
( , , ) ( , , ) ( , , ) ( )
i ct
x y b
E
E z t d d x z y z e O

=
= + +

E E
G G G G
G G G
(4.101a)
and

(in) 2 2 [ ]
Real input field
1
( , , ) ( , , ) ( , , ) ( )
i ct
x y b
B
B z t d d y z x z e O
c

=
= +

E E
G G G G
G G G
.
(4.101b)

For future use, we note that
(in)
E
G
,
(in)
B
G
,
(bal)
E
G
, and
(bal)
B
G
can be written as three-dimensional
inverse Fourier transforms. We make the same variable substitutions used above in equations
(4.60a)-(4.60c), (4.61b), and (4.61c), specifying that

$$w = c\sigma ,$$ (4.102a)

$$u_x = \sigma\beta_x ,$$ (4.102b)
and
$$u_y = \sigma\beta_y ,$$ (4.102c)
with
$$\vec{u} = \hat{x}u_x + \hat{y}u_y = \sigma\vec{\beta} = \frac{w}{c}\,\vec{\beta}$$ (4.102d)
and
$$u^2 = |\vec{u}|^2 = u_x^2 + u_y^2 .$$ (4.102e)
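The substitutions (4.102a) through (4.102d) amount to a change of variables between a direction-and-wavenumber description and a spatial-frequency-and-temporal-frequency description. The sketch below is only an illustration of that mapping and its inverse; the function names, the symbols sigma (wavenumber) and beta (transverse direction components), and the sample values are assumptions made for the example.

```python
C = 299792458.0  # speed of light in m/s

def to_frequency_variables(beta_x, beta_y, sigma):
    """Map (beta, sigma) to (u_x, u_y, w) using w = c*sigma and u = sigma*beta."""
    return sigma * beta_x, sigma * beta_y, C * sigma

def to_direction_variables(u_x, u_y, w):
    """Inverse map: sigma = w/c and beta = c*u/w (w must be nonzero)."""
    sigma = w / C
    return C * u_x / w, C * u_y / w, sigma

# Hypothetical sample point: a slightly off-axis direction at 1.0e5 m^-1 (1000 cm^-1).
beta_in = (0.01, -0.02)
sigma_in = 1.0e5
u_x, u_y, w = to_frequency_variables(beta_in[0], beta_in[1], sigma_in)
beta_x_out, beta_y_out, sigma_out = to_direction_variables(u_x, u_y, w)
print(u_x, u_y, w)
print(beta_x_out, beta_y_out, sigma_out)
```

The round trip returns the starting direction and wavenumber, which is the property relied on whenever integrals are converted between the two sets of variables.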

Equations (4.101a) and (4.101b) now become

(in)
2 2 [ ]
2
( , , )
( , , ) ( , , )
( )
i u wt
x y
b
E z t
c cu w cu w
dw d u x z y z e
w w c w c
O

+

= +

+

E E
G G
G
G
G G
G

(4.103a)
and

(in)
2 2 [ ]
2
( , , )
1
( , , ) ( , , )
( )
i u wt
x y
b
B z t
cu w cu w
dw d u y z x z e
w w c w c
O

+

=

+

E E
G G
G
G
G G
G

.
(4.103b)
A similar transformation converts Eqs. (4.99a) and (4.99b) to
2
(bal)
2
4
1
( )
2 2 [ ]
2
( )
( , , )
( )
( , , ) (
( )
{
M
iw cu
iw
n z r
c w i u wt
c
abc
x s
E z t
c w
dw d u e r W e e
w c
cu w cu
x z
w c

+

=
+

E
G
G G
G
G
G G

[
( )
, ) ( ) ( )
( , , ) ( , ) ( ) ( )
( )
]
}
s s
abc
y p p p
b
w w w
t r
w c c c
cu w cu w w w
y z t r
w c w c c c
O

+
+
E
G G
G

(4.104a)

and

2
(bal)
2
4
1
( )
2 2 [ ]
2
( )
( , , )
1
( )
( , , ) ( , )
( )
{
M
iw cu
iw
n z r
c w i u wt
c
abc
x s
B z t
w
dw d u e r W e e
w c
cu w cu w
y z t
w c w c

+

=
+

E
G
G G
G
G
G G

[
( )
( ) ( )
( , , ) ( , ) ( ) ( )
( ) .
]
}
s s
abc
y p p p
b
w w
r
c c
cu w cu w w w
x z t r
w c w c c c
O

+
E
G G
G

(4.104b)
4.14 Energy Flux of the Time-Chopped and Beam-Chopped Radiation Fields
Because
(in)
E
G
and
(in)
B
G
in Eqs. (4.101a) and (4.101b) represent the electric and magnetic fields
for the input radiation entering a Michelson interferometer, we know from Secs. 4.9 and 4.10 that
these fields are both time-chopped and beam-chopped.
67
Up to now, there has been no need to
indicate this explicitly, but from this point on we introduce T, A subscripts to show that the
radiant fields are significantly different from zero only over a time interval $-T \le t \le T$ and only
over a beam cross-sectional area A in the x,y plane. We also know from Sec. 4.10 that the
radiation inside the interferometer can be thought of as being approximately band-limited.

67
The radiation is also, of course, direction-chopped. The direction-chopped property is used in the discussion after
Eq. (4.119d) below.
Consequently, there exists a positive wavenumber $\sigma_{av}$, which can be thought of as the typical or
average wavenumber of the approximately band-limited radiation, that characterizes the
polychromatic wavefield passing through the interferometer. We require T to be extremely long
compared to the period $f_{av}^{-1} = \lambda_{av}/c$ of a typical electromagnetic wave inside the interferometer.
We also require any characteristic distance across area A to be extremely large compared to the
wavelength $\lambda_{av} = \sigma_{av}^{-1}$ of a typical electromagnetic wave inside the interferometer,

$$T \gg \lambda_{av}/c$$ (4.105a)
and
$$\sqrt{A} \gg \sigma_{av}^{-1} .$$ (4.105b)
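To get a feel for how easily these two requirements are met in practice, the following sketch evaluates the ratios for one illustrative case. The wavenumber, collection time, and beam width are hypothetical values picked for the example, not figures from the text.

```python
C_CM_PER_S = 2.99792458e10  # speed of light in cm/s

def optical_period_s(sigma_av_per_cm):
    """Period 1/(c*sigma_av) of a wave whose wavenumber is sigma_av (in cm^-1)."""
    return 1.0 / (C_CM_PER_S * sigma_av_per_cm)

def wavelength_cm(sigma_av_per_cm):
    """Wavelength 1/sigma_av of a wave whose wavenumber is sigma_av (in cm^-1)."""
    return 1.0 / sigma_av_per_cm

sigma_av = 1000.0    # assumed mid-infrared wavenumber, cm^-1
half_duration = 0.5  # assumed T in seconds
beam_width = 2.5     # assumed characteristic distance across area A, cm

period = optical_period_s(sigma_av)
time_ratio = half_duration / period
length_ratio = beam_width / wavelength_cm(sigma_av)
print(f"period = {period:.3e} s, T/period = {time_ratio:.2e}, width/wavelength = {length_ratio:.0f}")
```

With these assumed numbers the time ratio is of order 10^13 and the length ratio is in the thousands, so both inequalities hold by a wide margin.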

To show how the T, A subscripts are used, we rewrite Eqs. (4.99a) through (4.104b) using T, A
subscripts and neglecting all terms of ( )
b
O ,

(in) 2 2 [ ]
TA
( , , ) ( , , ) ( , , )
i ct
xTA yTA
E z t d d x z y z e

= +

E E
G G G
G G G
, (4.106a)

(in) 2 2 [ ]
TA
1
( , , ) ( , , ) ( , , )
i ct
xTA yTA
B z t d d y z x z e
c

=

E E
G G G
G G G
, (4.106b)

( )
2
(bal)
4 ( ) 2 2 [ ] 2 1
( ) ( )
( , , )
( )
( , , ) ( , ) ( ) ( ) ( , , ) ( , ) ( ) ( ) ,
{
}
M
TA
i n z r i ct i
abc abc
xTA s s s yTA p p p
E z t
d d e r W e e
x z t r y z t r

=
+
+

E E
G G G
G
G
G G G G

(4.107a)

( )
2
(bal)
4 ( ) 2 2 [ ] 2 1
( ) ( )
( , , )
1
( )
( , , ) ( , ) ( ) ( ) ( , , ) ( , ) ( ) ( ) ,
{
}
M
TA
i n z r i ct i
abc abc
xTA s s s yTA p p p
B z t
d d e r W e e
c
y z t r x z t r

=
+

E E
G G G
G
G
G G G G

(4.107b)

(in)
2 2 [ ]
2
( , , )
( , , ) ( , , ) ,
TA
i u wt
xTA yTA
E z t
c cu w cu w
dw d u x z y z e
w w c w c

+

= +

E E
G G
G
G
G G

(4.108a)

(in)
2 2 [ ]
2
( , , )
1
( , , ) ( , , ) ,
TA
i u wt
xTA yTA
B z t
cu w cu w
dw d u y z x z e
w w c w c

+

=

E E
G G
G
G
G G

(4.108b)

2
(bal)
TA
2
4
1
( )
2 2 [ ]
2
( )
( , , )
( )
( , , ) (
( )
{
M
iw cu
iw
n z r
c w i u wt
c
abc
xTA s
E z t
c w
dw d u e r W e e
w c
cu w cu
x z
w c w

+

=
+

E
G
G G
G
G
G G

[
( )
, ) ( ) ( )
( , , ) ( , ) ( ) ( ) , ]
}
s s
abc
yTA p p p
w w w
t r
c c c
cu w cu w w w
y z t r
w c w c c c

+ E
G G

(4.109a)

2
(bal)
TA
1
( )
2
2
( )
2
4
2 [ ]
( , , )
1
( )
( , , ) (
( )
{
M
cu
n z r
w
abc
xTA s
iw
iw
c i u wt
c
B z t
w
dw d u e r W e e
w c
cu w c
y z
w c

+
=
+

E
G
G G
G
G
G

[
( )
, ) ( ) ( )
( , , ) ( , ) ( ) ( ) . ]
}
s s
abc
yTA p p p
u w w w
t r
w c c c
cu w cu w w w
x z t r
w c w c c c

E
G
G G

(4.109b)

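Equations (4.100a) and (4.100b) require the $\mathcal{E}_{xTA}$ and $\mathcal{E}_{yTA}$ functions inside these integrals to
satisfy the Hermitian condition for their wavenumber arguments,

$$\mathcal{E}_{xTA}(\vec{\beta}, -\sigma, z) = \mathcal{E}_{xTA}(\vec{\beta}, \sigma, z)^{*}$$ (4.110a)
and
$$\mathcal{E}_{yTA}(\vec{\beta}, -\sigma, z) = \mathcal{E}_{yTA}(\vec{\beta}, \sigma, z)^{*} .$$ (4.110b)

For future use, we note that this is the same thing as saying

$$\mathcal{E}_{xTA}\!\left(\frac{c\vec{u}}{w}, -\frac{w}{c}, z\right) = \mathcal{E}_{xTA}\!\left(\frac{c\vec{u}}{w}, \frac{w}{c}, z\right)^{*}$$ (4.110c)
and
$$\mathcal{E}_{yTA}\!\left(\frac{c\vec{u}}{w}, -\frac{w}{c}, z\right) = \mathcal{E}_{yTA}\!\left(\frac{c\vec{u}}{w}, \frac{w}{c}, z\right)^{*} .$$ (4.110d)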

The power flux (energy per unit area per unit time) carried by the input radiation field at any
point in space is given by the Poynting vector,
68

$$\vec{S}^{(in)}_{TA} = \mu_o^{-1}\left(\vec{E}^{(in)}_{TA} \times \vec{B}^{(in)}_{TA}\right) .$$ (4.111a)
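As a concrete check of the Poynting-vector formula, the sketch below evaluates it for a single on-axis plane wave whose magnetic field is the unit vector along z crossed with the electric field, divided by c. The field amplitudes are hypothetical, and the helper names are assumptions for the example; the only physics used is the cross product and the relation between the vacuum constants.

```python
import math

MU_0 = 4.0e-7 * math.pi      # vacuum permeability in H/m (conventional value)
C = 299792458.0              # speed of light in m/s
EPS_0 = 1.0 / (MU_0 * C**2)  # vacuum permittivity from mu_0 * eps_0 * c^2 = 1

def cross(a, b):
    """Cross product of two 3-component vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )

def poynting(e_field, b_field):
    """Poynting vector S = (E x B) / mu_0."""
    return tuple(component / MU_0 for component in cross(e_field, b_field))

# Hypothetical on-axis plane wave: E is transverse and B = (z-hat x E) / c.
e_x, e_y = 3.0, 4.0
e_vec = (e_x, e_y, 0.0)
b_vec = (-e_y / C, e_x / C, 0.0)
s_vec = poynting(e_vec, b_vec)
print(s_vec)
```

The result points along z, and its magnitude equals the vacuum permittivity times c times the squared field amplitude, which is why only the z component of the Poynting vector matters for radiation traveling nearly parallel to the optical axis.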

The Poynting vector $\vec{S}$ is zero where $\vec{E}^{(in)}_{TA}$ and $\vec{B}^{(in)}_{TA}$ are zero, so it is given TA subscripts to show
that it is time-chopped and beam-chopped in the same way that $\vec{E}^{(in)}_{TA}$ and $\vec{B}^{(in)}_{TA}$ are time-chopped
and beam-chopped. Equations (4.108a) and (4.108b) show that the total radiant energy entering
the interferometer during a time interval $-T \le t \le T$ is

( )
[ ]
2 (in)
2 ( ) ( ) 2 2 2 2 2
( , , ) ( , , ) ( , , ) ( ,
TA
i u u w w t
o
xTA xTA yTA yTA
dt d S z
c
dt d dw dw d u d u w w e
cu w cu w cu w cu
z z z z
w c w c w c w

+ + +

=

+

E E E E
G G G
G
G G G G

, ) .
w
c

(4.111b)

68
John David Jackson, Classical Electrodynamics, 3rd ed. (John Wiley & Sons, Inc., New York, 1999), p. 259.

The integrals over $d^2u$ and $dw$ in (4.108a) and (4.108b) are changed to integrals over $d^2u$, $d^2u'$
and $dw$, $dw'$ before they are substituted into (4.111a). This maneuver is often used to show that
formulas such as the one in (4.111b) deal with integrals over independent variables of integration.
We have also used the unit-vector identities $\hat{x}\times\hat{y} = -\hat{y}\times\hat{x} = \hat{z}$ and $\hat{x}\times\hat{x} = \hat{y}\times\hat{y} = 0$ to simplify
the expression inside the square brackets [ ]. We note that the integrals over $dt$ and $d^2\rho$ can be
extended to $-\infty$ and $+\infty$ exactly because the time-chopped and beam-chopped nature of
$\vec{E}^{(in)}_{TA}$, $\vec{B}^{(in)}_{TA}$, and $\vec{S}^{(in)}_{TA}$ ensures that their integrals drop to zero thus correctly excluding the
electromagnetic energy at large values of x, y, and t that are not part of the interferometer
measurement. Moving the integrals over $dt$ and $d^2\rho$ to the inside to get

( )
2 (in)
2 2 2 2 2 ( ) 2 2 ( )
TA
i w w t i u u
o
xTA
dt d S z
c
dw dw d u d u w w e dt d e
r r p
p
p

+ - +

-

E
G G G

( , , ) ( , , ) ( , , ) ( , , )
xTA yTA yTA
cu w cu w cu w cu w
z z z z
w c w c w c w c

+

E E E
G G G G
,

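we recognize these integrals to be forms of the delta function [see Eqs. (2.71f) and (2.122a) in
Chapter 2],

$$\int_{-\infty}^{\infty} dt\, e^{-2\pi i (w + w') t} = \delta(w + w')$$

and

$$\iint_{-\infty}^{\infty} d^2\rho\, e^{2\pi i (\vec{u} + \vec{u}')\cdot\vec{\rho}} = \int_{-\infty}^{\infty} dx\, e^{2\pi i x (u_x + u'_x)} \int_{-\infty}^{\infty} dy\, e^{2\pi i y (u_y + u'_y)} = \delta(u_x + u'_x)\,\delta(u_y + u'_y) = \delta(\vec{u} + \vec{u}') .$$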

Substituting these delta functions back into the multiple integral gives

( )
2 (in)
4 2 2
( )
( , , ) ( , , ) ( , ,
TA
o
xTA xTA yTA
dt d S z
c
dw w d u d u u u
cu w cu w cu
z z z
w c w c w
p
o

-
+
+

E E E
G
G G
G G G

4 2
) ( , , )
( , , ) ( , , ) ( , , ) ( , , )
yTA
o
xTA xTA yTA yTA
w cu w
z
c w c
c
dw w d u
cu w cu w cu w cu w
z z z z
w c w c w c w c

+

E
E E E E
G
G G G G

.

From Eqs. (4.110c) and (4.110d), we get

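$$\mathcal{E}_{xTA}\!\left(\frac{c\vec{u}}{w}, \frac{w}{c}, z\right) \mathcal{E}_{xTA}\!\left(\frac{c\vec{u}}{w}, -\frac{w}{c}, z\right) = \left|\mathcal{E}_{xTA}\!\left(\frac{c\vec{u}}{w}, \frac{w}{c}, z\right)\right|^2$$ (4.112a)
and
$$\mathcal{E}_{yTA}\!\left(\frac{c\vec{u}}{w}, \frac{w}{c}, z\right) \mathcal{E}_{yTA}\!\left(\frac{c\vec{u}}{w}, -\frac{w}{c}, z\right) = \left|\mathcal{E}_{yTA}\!\left(\frac{c\vec{u}}{w}, \frac{w}{c}, z\right)\right|^2 ,$$ (4.112b)

which shows the total radiant energy entering the interferometer during a time interval $-T \le t \le T$
to be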

( )
2 (in)
2 2
2 2
2
4 4
1
( , , ) ( , , )
TA
xTA yTA
o
dt d S z
c cu w c cu w
dw d u z z
c w w c w w c

= +

E E
G
G G

.
(4.113a)
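The passage from an integral of the energy flux over time and area to an integral of squared transform magnitudes is an instance of Parseval's theorem. The one-dimensional discrete sketch below illustrates the same bookkeeping for a time-chopped pulse; the pulse shape and the function names are hypothetical, and the discrete normalization (a factor 1/N on the frequency side) is specific to this example rather than to the continuous formulas in the text.

```python
import cmath
import math

def dft(samples):
    """Direct O(N^2) discrete Fourier transform, F[k] = sum_n f[n] exp(-2*pi*i*k*n/N)."""
    n_pts = len(samples)
    return [
        sum(samples[n] * cmath.exp(-2j * math.pi * k * n / n_pts) for n in range(n_pts))
        for k in range(n_pts)
    ]

# Hypothetical real pulse that is exactly zero outside a finite window (time-chopped).
N = 64
pulse = [
    math.exp(-((n - 20) / 4.0) ** 2) * math.cos(2 * math.pi * 9 * n / N) if 8 <= n <= 32 else 0.0
    for n in range(N)
]

energy_time = sum(value * value for value in pulse)
energy_freq = sum(abs(coeff) ** 2 for coeff in dft(pulse)) / N
print(energy_time, energy_freq)
```

The two sums agree to roundoff, mirroring how the delta functions above collapse the double frequency integrals into a single integral of squared magnitudes.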

The radiation fields entering the interferometer (unlike, say, the electromagnetic signal put
out by television or radio stations) can be modeled as random variables because they are not
under our direct control. Following the notation used in Chapter 3, we now write

$\tilde{\vec{S}}^{(in)}_{TA}$, $\tilde{\mathcal{E}}_{xTA}$ and $\tilde{\mathcal{E}}_{yTA}$

to show that these are random functions (see Sec. 3.2). No tilde is added to their arguments
because the arguments are nonrandom variables. To find the average or expected radiant energy
entering the interferometer during a time interval 2T, which is long compared to the period

$f_{av}^{-1} = \lambda_{av}/c$

of a typical electromagnetic wave inside the interferometer, we apply the expectation operator E
defined in Sec. 3.4 of Chapter 3 to both sides of Eq. (4.113a) to get

( )
2 (in)
2 2
2
2 2
Average input energy
1
( , , ) ( , , )
TA
xTA yTA
o
dt d S z
c cu w c cu w
dw d u z z
c w w c w w c

=

= +

E E
G
G G

.
E
E E
(4.113b)

Equations (3.17c) and (3.16a) of Chapter 3 are used when taking E inside the integrals over $dw$
and $d^2u$.
Although the random radiation fields are not under our direct control, the amount of radiant
energy that is linearly polarized in the x or y direction is. We can, for example, imagine passing
the radiation in (4.113b) through a polarizing filter, setting $\tilde{\mathcal{E}}_{yTA}$ to zero without affecting $\tilde{\mathcal{E}}_{xTA}$ or
setting $\tilde{\mathcal{E}}_{xTA}$ to zero without affecting $\tilde{\mathcal{E}}_{yTA}$. Therefore (4.113b) can be interpreted as saying that

during a time interval 2T in duration,

2
2
2
Average input energy polarized in
1
( , , )
xTA
o
x
c cu w
dw d u z
c w w c

=

E
G
E
(4.114a)
and

2
2
2
1
( , , ) .
yTA
o
y
c cu w
dw d u z
c w w c

=

E
G
E
(4.114b)

Because $\left|c\,w^{-2}\,\tilde{\mathcal{E}}_{xTA}(c\vec{u}/w,\, w/c,\, z)\right|^2$ and $\left|c\,w^{-2}\,\tilde{\mathcal{E}}_{yTA}(c\vec{u}/w,\, w/c,\, z)\right|^2$ are non-negative random
quantities, we can then interpret

2
4
2
( , , )
o
xTA
c
d u dw
w
cu w
z
w c

E
G
E (4.114c)
and

2
4
2
( , , )
o
yTA
c
d u dw
w
cu w
z
w c

E
G
E (4.114d)

as the average or expected energy characterized by $\vec{u} = \sigma\vec{\beta}$ and $w = c\sigma$ that is carried by,
respectively, the x-polarized and y-polarized radiation fields entering the interferometer during a
time interval 2T in length. By converting the integrals over $dw$ and $d^2u$ to integrals over $d\sigma$ and
$d^2\beta$ using the variable transformations [see Eqs. (4.102a)-(4.102d)]

$w = c\sigma$, $c\vec{u}/w = \vec{\beta}$, $dw = c\,d\sigma$, and $d^2u = (w^2/c^2)\,d^2\beta$,

we get that during a time interval 2T in length

( )
( )
2 2
2
2
2 4
2
2
2
1
( ) ( , , )
1
( , , )
xTA
o
o xTA
x
w c
cd d z
c c w
d d z

=

=

E
E
G

E
E
(4.115a)
and

( )
2
2
2
1
( , , ) .
o yTA
y
d d z

=

E
G
E
(4.115b)
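The step from (4.114a) to (4.115a) is only a change of integration variables, and it may help to see the factors cancel one at a time. The following is a sketch of the intermediate algebra, writing $\sigma$ for the wavenumber and $\vec{\beta}$ for the transverse direction vector (these symbol names are assumed here), with the integrand of (4.114a) written out as $c^2/w^4$ times the expected squared magnitude, and using $w = c\sigma$, $dw = c\,d\sigma$, $d^2u = \sigma^2 d^2\beta$, and $\mu_o^{-1} = \varepsilon_o c^2$:

```latex
\begin{aligned}
\frac{1}{\mu_o c}\int dw \int d^2u \,\frac{c^2}{w^4}\,
  \mathrm{E}\!\left(\left|\tilde{\mathcal{E}}_{xTA}\!\left(\tfrac{c\vec{u}}{w},\tfrac{w}{c},z\right)\right|^2\right)
&= \frac{1}{\mu_o c}\int c\,d\sigma \int \sigma^2\, d^2\beta \,\frac{c^2}{(c\sigma)^4}\,
  \mathrm{E}\!\left(\left|\tilde{\mathcal{E}}_{xTA}(\vec{\beta},\sigma,z)\right|^2\right) \\
&= \frac{1}{\mu_o c^2}\int d\sigma \int d^2\beta \,\sigma^{-2}\,
  \mathrm{E}\!\left(\left|\tilde{\mathcal{E}}_{xTA}(\vec{\beta},\sigma,z)\right|^2\right) \\
&= \varepsilon_o \int d\sigma \int d^2\beta \,\sigma^{-2}\,
  \mathrm{E}\!\left(\left|\tilde{\mathcal{E}}_{xTA}(\vec{\beta},\sigma,z)\right|^2\right).
\end{aligned}
```

The y-polarized case follows in the same way with the x subscript replaced by y.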

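In (4.115a) and (4.115b), we use that $\mu_o^{-1} = \varepsilon_o c^2$ from Eq. (4.1e) above. Remembering that the
direction of the propagation vector $\hat{\kappa} = \vec{\beta} + \hat{z}\sqrt{1 - \beta^2}$ is specified by vector $\vec{\beta}$, we note that

$$\varepsilon_o\,\sigma^{-2}\,\mathrm{E}\!\left(\left|\tilde{\mathcal{E}}_{xTA}(\vec{\beta}, \sigma, z)\right|^2\right) d^2\beta\, d\sigma$$ (4.116a)
and
$$\varepsilon_o\,\sigma^{-2}\,\mathrm{E}\!\left(\left|\tilde{\mathcal{E}}_{yTA}(\vec{\beta}, \sigma, z)\right|^2\right) d^2\beta\, d\sigma$$ (4.116b)
can be interpreted as the average or expected energy entering the interferometer during a time
interval 2T in length carried by, respectively, the x-polarized or y-polarized radiation fields
traveling in the $\hat{\kappa}$ direction at wavenumber $\sigma$.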
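From Appendix 4C, we see that, according to the three-dimensional Wiener-Khinchin theorem
discussed in Sec. 3.24 of Chapter 3, there exist power spectra $\mathrm{S}_x(\vec{u}, w)$ and $\mathrm{S}_y(\vec{u}, w)$ such that
[see Eqs. (4C.10a) and (4C.10b)]

$$\mathrm{S}_x(\vec{u}, w) = \lim_{\substack{T\to\infty \\ A\to\infty}} \left\{ \frac{1}{2T}\,\frac{1}{A}\,\frac{c^2}{w^4}\, \mathrm{E}\!\left( \left| \tilde{\mathcal{E}}_{xTA}\!\left(\frac{c\vec{u}}{w}, \frac{w}{c}, z\right) \right|^2 \right) \right\}$$ (4.117a)
and
$$\mathrm{S}_y(\vec{u}, w) = \lim_{\substack{T\to\infty \\ A\to\infty}} \left\{ \frac{1}{2T}\,\frac{1}{A}\,\frac{c^2}{w^4}\, \mathrm{E}\!\left( \left| \tilde{\mathcal{E}}_{yTA}\!\left(\frac{c\vec{u}}{w}, \frac{w}{c}, z\right) \right|^2 \right) \right\} .$$ (4.117b)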

Here, the limit as $A \to \infty$ is interpreted to be the limit as the beam cross-sectional area A extends
to cover the entire x, y plane; and, of course, the limit as $T \to \infty$ means that the measurement
time becomes infinitely long. We have dropped z from the argument list of $\mathrm{S}_{x,y}$ on the left-hand
side of these two equations because, as is pointed out at the end of Appendix 4C, the values of
$|\tilde{\mathcal{E}}_{xTA}|^2$ and $|\tilde{\mathcal{E}}_{yTA}|^2$ are no longer functions of z. According to inequalities (4.105a) and (4.105b),
area A has already been assumed to be much wider than the typical wavelength of the radiation
fields and time interval 2T has already been assumed to be much longer than the typical period of
the radiation fields. It is therefore plausible that in (4.117a) and (4.117b) the values of A and T are
already large enough for the expressions inside the braces { } to be approximately equal to their
limits. Assuming this to be true and multiplying both sides by $(\mu_o c)^{-1}\, d^2u\, dw$, we then get

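$$\frac{1}{\mu_o c}\,\mathrm{S}_x(\vec{u}, w)\, d^2u\, dw \cong \frac{1}{\mu_o c}\,\frac{1}{2TA}\,\frac{c^2}{w^4}\, \mathrm{E}\!\left( \left| \tilde{\mathcal{E}}_{xTA}\!\left(\frac{c\vec{u}}{w}, \frac{w}{c}, z\right) \right|^2 \right) d^2u\, dw$$ (4.118a)
and
$$\frac{1}{\mu_o c}\,\mathrm{S}_y(\vec{u}, w)\, d^2u\, dw \cong \frac{1}{\mu_o c}\,\frac{1}{2TA}\,\frac{c^2}{w^4}\, \mathrm{E}\!\left( \left| \tilde{\mathcal{E}}_{yTA}\!\left(\frac{c\vec{u}}{w}, \frac{w}{c}, z\right) \right|^2 \right) d^2u\, dw .$$ (4.118b)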
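The idea behind (4.117a) through (4.118b), that the expected squared magnitude of a finite-duration transform divided by the duration estimates a power spectrum, can be illustrated in one dimension. The sketch below averages periodograms of simulated white noise; the noise model, sample spacing, trial count, and function names are all assumptions of the example, and the one-dimensional discrete setting is a stand-in for the three-dimensional continuous case in the text.

```python
import cmath
import math
import random

def periodogram(samples, dt):
    """|F_T(f_k)|^2 / (2T), with 2T = N*dt and F_T approximated by dt times the DFT."""
    n_pts = len(samples)
    duration = n_pts * dt
    values = []
    for k in range(n_pts):
        transform = dt * sum(
            samples[n] * cmath.exp(-2j * math.pi * k * n / n_pts) for n in range(n_pts)
        )
        values.append(abs(transform) ** 2 / duration)
    return values

def averaged_periodogram(n_trials, n_pts, dt, noise_std, seed=0):
    """Average the periodogram over independent realizations of Gaussian white noise."""
    rng = random.Random(seed)
    accumulator = [0.0] * n_pts
    for _ in range(n_trials):
        realization = [rng.gauss(0.0, noise_std) for _ in range(n_pts)]
        for k, value in enumerate(periodogram(realization, dt)):
            accumulator[k] += value / n_trials
    return accumulator

# Assumed parameters: 200 realizations of 16 samples spaced 0.01 apart, noise std 2.0.
estimate = averaged_periodogram(n_trials=200, n_pts=16, dt=0.01, noise_std=2.0)
mean_level = sum(estimate) / len(estimate)
print(f"mean spectral level = {mean_level:.4f} (flat spectrum expected near 0.04)")
```

For white noise of variance s^2 sampled at spacing dt, the expected level of this estimate is s^2 times dt in every frequency bin, so the averaged values cluster around 0.04 for the assumed parameters.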
Comparing (4.114c) to (4.118a), we see that $(\mu_o c)^{-1}\,\mathrm{S}_x(\vec{u}, w)\, d^2u\, dw$ is the average x-polarized
input energy at $(\vec{u}, w)$, divided by both the time interval 2T during which it entered the
interferometer and the area A through which it entered the interferometer. This means
$(\mu_o c)^{-1}\,\mathrm{S}_x(\vec{u}, w)\, d^2u\, dw$ can be interpreted as the average x-polarized input power per unit area at
values $(\vec{u}, w)$; and a similar comparison of (4.114d) to (4.118b) shows that
$(\mu_o c)^{-1}\,\mathrm{S}_y(\vec{u}, w)\, d^2u\, dw$
is the average y-polarized input power per unit area at values $(\vec{u}, w)$.
Integrating both sides of (4.118a) and (4.118b) over $dw$ and $d^2u$ gives expressions for the
average x-polarized and y-polarized input power per unit area from all the $\vec{u}$ and $w$ values,

2
o
2
2
4
Average - polarized input power per unit area
1
S ( , )
1 1
( , , )
2
x
xTA
o
x
dw d u u w
c
c cu w
dw d u z
TA w w c

=

E
G
G

E

and

2
o
2
2
4
1
S ( , )
1 1
( , , ) .
2
y
yTA
o
y
dw d u u w
c
c cu w
dw d u z
TA w w c

=

E
G
G

E

Making the same variable transformations as before,

$w = c\sigma$, $c\vec{u}/w = \vec{\beta}$, $dw = c\,d\sigma$, and $d^2u = (w^2/c^2)\,d^2\beta$,
gives

( )
2
2
o
2
2
2
S ( , )
( , , )
2
x
o
xTA
x
d d c
d d z
TA

E
G
G

E
(4.119a)
and

( )
2
2
o
2
2
2
S ( , )
( , , ) ,
2
y
o
yTA
y
d d c
d d z
TA

E
G
G

E
(4.119b)

where we again use
1 2
o o
c

= from Eq. (4.1e). These last two equations suggest that

( )
2
2
2 2
2
S ( , ) ( , , )
2
o
x xTA
o
c d d z d d
TA

E
G G
E (4.119c)

can be interpreted as the average x-polarized input power per unit area traveling in direction
2
1 z = +
G
at wavenumber , and

( )
2
2
2 2
2
S ( , ) ( , , )
2
o
y yTA
o
c d d z d d
TA

E
G G
E (4.119d)

can be interpreted as the average y-polarized input power per unit area traveling in direction
2
1 z = +
G
at wavenumber . From the discussion following Eq. (4.100b) in Sec. 4.13, we
know that
xTA yTA
E E

and represent direction-chopped radiation even though nothing like the T or
A subscripts has been used to make this explicit. This means, of course, that
xTA yTA
E E

and must
be negligible or zero for
G
values that do not represent propagation directions that are parallel to,
or nearly parallel to, the z axis. Consequently, Eqs. (4.119c) and (4.119d) show that S
x
and S
y

must also be negligible or zero for
G
values not representing propagation directions parallel to or
nearly parallel to the z axis. From the observations made at the end of Sec. 4.9, we know that
2
d
can be interpreted as an infinitesimal solid angle. Hence, we can always regard

( )
2
2
, , 2
S
2
o
x y xTA yTA
o
d d
TA

E
E

as the input power per unit area and per unit solid angle of x-polarized or y-polarized radiation
respectively. The next obvious step is to drop d and recognize

( )
2
2
, , 2
S
2
o
x y xTA yTA
o
TA

E
E

as the input power per unit area per unit solid angle and per unit wavenumber interval of the x-
polarized or y-polarized radiation respectively. It is customary in interferometric spectroscopy to
define two functions ( , )
x
L
G
and ( , )
y
L
G
to represent the x-polarized and y-polarized radiant
power per unit area per unit solid angle and per unit wavenumber interval traveling in the
direction
2
1 z = +
G
at wavenumber . Hence it now makes sense to define that

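$$L_x(\vec{\beta}, \sigma) = \frac{\sigma^2}{\mu_o}\,\mathrm{S}_x(\sigma\vec{\beta}, c\sigma) \cong \frac{\varepsilon_o}{2TA}\,\sigma^{-2}\,\mathrm{E}\!\left(\left|\tilde{\mathcal{E}}_{xTA}(\vec{\beta}, \sigma, z)\right|^2\right)$$ (4.120a)

and

$$L_y(\vec{\beta}, \sigma) = \frac{\sigma^2}{\mu_o}\,\mathrm{S}_y(\sigma\vec{\beta}, c\sigma) \cong \frac{\varepsilon_o}{2TA}\,\sigma^{-2}\,\mathrm{E}\!\left(\left|\tilde{\mathcal{E}}_{yTA}(\vec{\beta}, \sigma, z)\right|^2\right) .$$ (4.120b)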

Again, because the beam is direction-chopped, the newly defined functions L
x,y
must be
negligible or zero for
G
values not representing directions parallel to, or nearly parallel to, the z
axis. As noted in Sec. 4.10 in the discussion after Eq. (4.66b), we are never interested in the
values of or
xTA yTA
E E

at 0 = . Consequently, we can always take the expected values of
, xTA yTA
E
2
to be zero at 0 = , preventing the factors of
2
in the last steps of (4.120a) and

(4.120b) from specifying a singularity when the wavenumber is zero. The x-polarized and y-
polarized power spectra specified by L
x
and L
y
are double-sided in because functions L
x,y
equal
1 2
,
S
o x y

, and the S
x,y
functions are double-sided. [We know that the S
x,y
are double-sided
because, according to Eqs. (4.119a) and (4.119b), the S
x,y
must be integrated over all
wavenumbers between and + to get the average power per unit area.] Equations (4.110a)
and (4.110b) show that
2
( , , )
xTA
z E
and
2
( , , )
yTA
z E
must have the same values at that

they have at + , requiring L
x
and L
y
to be even functions of the wavenumber argument:

$$L_x(\vec{\beta}, -\sigma) = L_x(\vec{\beta}, \sigma)$$ (4.121a)
and
$$L_y(\vec{\beta}, -\sigma) = L_y(\vec{\beta}, \sigma) .$$ (4.121b)
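One practical consequence of a double-sided spectrum that is even in the wavenumber is that its integral over all wavenumbers is twice its integral over positive wavenumbers. The sketch below checks this numerically for a made-up band shape; the Gaussian profile, its center and width, and the function names are hypothetical and serve only to illustrate the symmetry.

```python
import math

def radiance_shape(sigma):
    """Hypothetical even band shape centered at |sigma| = 1000 cm^-1 with width 100 cm^-1."""
    return math.exp(-((abs(sigma) - 1000.0) / 100.0) ** 2)

def midpoint_integral(func, lower, upper, n_steps):
    """Midpoint-rule approximation of the integral of func from lower to upper."""
    step = (upper - lower) / n_steps
    return step * sum(func(lower + (i + 0.5) * step) for i in range(n_steps))

two_sided = midpoint_integral(radiance_shape, -2000.0, 2000.0, 4000)
one_sided = midpoint_integral(radiance_shape, 0.0, 2000.0, 2000)
print(two_sided, 2.0 * one_sided)
```

Both numbers agree, so a single-sided description carrying twice the density contains the same total power as the double-sided one used here.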

4.15 Energy Flux of the Balanced Radiation Fields
To find the energy carried by the balanced radiation fields reaching the interferometer's detector,
we repeat the procedure used in the previous section to find the energy flux of the input radiation
fields. In the previous section, we decided to make $\tilde{\mathcal{E}}_{xTA}$ and $\tilde{\mathcal{E}}_{yTA}$ random functions, so now
$\vec{E}^{(bal)}_{TA}$ and $\vec{B}^{(bal)}_{TA}$ in Eqs. (4.109a) and (4.109b) must also be random. We write these vector
functions as

$\tilde{\vec{E}}^{(bal)}_{TA}(\vec{\rho}, z, t)$ and $\tilde{\vec{B}}^{(bal)}_{TA}(\vec{\rho}, z, t)$,

where again the tilde is used to indicate that these are random functions of nonrandom variables
(see Sec. 3.2 in Chapter 3). The balanced energy flux at any point in the balanced output beam is
now simply the expected or average value of

$$\tilde{\vec{S}}^{(bal)}_{TA} \cdot \hat{z} = \mu_o^{-1}\left(\tilde{\vec{E}}^{(bal)}_{TA} \times \tilde{\vec{B}}^{(bal)}_{TA}\right) \cdot \hat{z} ,$$ (4.122a)

the z component of the Poynting vector. To get the radiant energy reaching the detector, we just
integrate the expected value of the Poynting vector's z component over the beam's cross-sectional
area and the time interval $-T \le t \le T$ used to collect the signal. Therefore,

( ) ( )
2 (bal) 2 (bal) (bal)
Average energy in balanced signal over time interval 2 and beam cross - section
1
.
TA TA TA
o
T A
dt d S z dt d E B z p p

- -

G G G

E E
(4.122b)

In this section, we use Eqs. (4.109a) and (4.109b) to evaluate the right-hand side of (4.122b) from
the inside out. Most of the massive algebraic manipulations we encounter turn out to be
conceptually simple exercises in listing, and then eliminating through integration, a large
number of superfluous variables.
We introduce simplifying notation before substituting (4.109a) and (4.109b) into (4.122b).
According to Eq. (4B.12b) in Appendix 4B, the typical angle between
m
n and z is small enough
for us to neglect the z component of the ( )
m
n z vector in Eqs. (4.109a) and (4.109b).
This means there must exist two real constants a and b such that

2( )
m
n z ax by e + . (4.123a)
To shorten the way functions

( )
,
, , , , ,
abc
s s p p s p
r r t r t y and

are written with primed and unprimed arguments, we define that

( ), ( ) r r w c r r w c , (4.123b)

( ), ( )
s s s s
r r w c r r w c , (4.123c)

( ), ( )
s s s s
t t w c t t w c , (4.123d)

( ), ( )
p p p p
r r w c r r w c , (4.123e)

( ), ( )
p p p p
t t w c t t w c , (4.123f)
and

( ) ( )
, , , ,
( / , ), ( / , )
abc abc
s p s p s p s p
cu w w c cu w w c y y y y
G G
. (4.123g)

We also define that

2 2 2
x y
u u u +
and

2 2 2
x y
u u u + .
Now at last Eqs. (4.109a) and (4.109b) can be substituted into (4.122b). Postponing for a
while the application of the expectation operator E, we write

( )
( )
2 2 2
2 (bal) (bal)
2 2
2
2 ( ) ( ) ( )
2 ( ) 1 ( ) 2 ( ) ( )
1
{
x x y y
TA TA
o
o
i x u u y u u t w w
i w c c u w i w c xa yb
dt d E B z
c rr
dt dxdy dw dw d u d u
ww
e
W e e
W e
r
r r
p

+ + + + +

+
-

+

+

G G

2 2 2
2 ( ) 1 ( ) 2 ( ) ( )
( , , ) ( , , )
( , , ) ( , , ) .
}
s s s s s s xTA xTA
p p p p p p yTA yTA
e
cu w cu w
r r t t z z
w c w c
cu w cu w
r r t t z z
w c w c
r r
y y
y y
+

E E
E E
G G

G G

(4.123h)

The three double integrals over $d^2\rho$, $d^2u$, and $d^2u'$ are, of course, a shorthand for $dx\,dy$,
$du_x\,du_y$, and $du'_x\,du'_y$ respectively. Moving the integral over $dt$ to the inside gives

$$\int_{-\infty}^{\infty} dt\, e^{-2\pi i t (w + w')} = \delta(w + w') .$$ (4.124a)

We define

( )
, ,
( / , / )
abc
s p s p
cu w w c y y
G
(4.124b)

and substitute (4.124a) [see Eq. (2.71f) in Chapter 2] into (4.123h) to get

( )
2 2 2
2 2 2
2 (bal) (bal)
2
2 ( ) ( )
2 2
4
2 ( ) 1 ( ) 2 2 ( ) ( )
2 ( ) 1 ( )
1
{
x x y y
TA TA
o
i x u u y u u
o
i w c c u w
dt d E B z
r
c
dw d u d u dxdye
w
W We e
We

+ + +

+

=
+

G G

2 2 2 2 2 2
2 ( ) ( )
2 ( ) [ 1 ( ) 1 ( ) ]
2 2
2 2
( , , ) ( , , )
( , , ) (
i w c xa yb
i w c c u w c u w
s s s s xTA xTA
p p p p yTA yTA
e
e
cu w cu w
r t z z
w c w c
cu w c
r t z
w c

+

+
+
E E
E E
G G

G G

, , ) .
}
u w
z
w c

(4.125a)

In Eq. (4.125a), the integral over $\delta(w + w')\,dw'$ has been used to replace $w'$ by $-w$ everywhere, so
that [see Eqs. (4.92a-e), (4.123g), and (4.124b)]

2
( ) ( ) rr r w c r w c r = ,

2
( ) ( )
s s s s s
r r r w c r w c r = ,

2
( ) ( )
s s s s s
t t t w c t w c t = ,

2
( ) ( )
p p p p p
r r r w c r w c r = ,

2
( ) ( )
p p p p p
t t t w c t w c t = ,
and

, , , , s p s p s p s p
.

Equations (4.123g), (4.124b), and (4.97c) show that when u u
G G
in the argument lists of
s s
, we get


( ) ( )
2
( ) ( ) ( ) ( ) ( )
( / , / ) ( / , / )
( / , / ) ( / , / )
abc abc
s s s s
abc abc abc abc abc
s s s s s
cu w w c cu w w c
cu w w c cu w w c
y y y y
y y y y y

G G
G G

(4.125b)
and similarly

2
( ) abc
p p p
y y y (4.125c)

when u u
G G
in the argument lists of
p p
y y . Equation (4.97c) also shows that, when

/ cu w r
G G
and / w c o ,

we can write

2
( ) ( ) ( ) ( ) ( )
, , , , ,
( , ) ( , ) ( , ) ( , ) ( , )
abc abc abc abc abc
s p s p s p s p s p
y r o y r o y r o y r o y r o

G G G G G

and

2
( ) ( ) ( ) ( ) ( )
, , , , ,
( , ) ( , ) ( , ) ( , ) ( , )
abc abc abc abc abc
s p s p s p s p s p
y r o y r o y r o y r o y r o

G G G G G
.

At the extreme left-hand side of these last two formulas, we find
( ) ( )
, ,
( , ) ( , )
abc abc
s p s p
y r o y r o
G G
and
( ) ( )
, ,
( , ) ( , )
abc abc
s p s p
y r o y r o
G G
, and since it must always be true that

( ) ( ) ( ) ( )
, , , ,
( , ) ( , ) ( , ) ( , )
abc abc abc abc
s p s p s p s p
y r o y r o y r o y r o
G G G G
,

it follows that, examining the extreme right-hand sides of these two formulas,

2 2
( ) ( )
, ,
( , ) ( , )
abc abc
s p s p
y r o y r o
G G
. (4.125d)

From the discussion following Eq. (4.83) in Sec. 4.12 above, we see that $W^2 = 1$ because W must
be either 1 or $-1$. We also note, according to Eq. (2.122a) in Chapter 2, that

$$\iint_{-\infty}^{\infty} dx\,dy\, e^{2\pi i [x(u_x + u'_x) + y(u_y + u'_y)]} = \int_{-\infty}^{\infty} dx\, e^{2\pi i x (u_x + u'_x)} \int_{-\infty}^{\infty} dy\, e^{2\pi i y (u_y + u'_y)} = \delta(u_x + u'_x)\,\delta(u_y + u'_y) .$$

Now Eq. (4.125a) can be written as

( )
2 (bal) (bal)
2
2
2
2 2
2 ( )
4
2
2 2 2
( )
2 2
1
2
( , , )
( , , )
{
}
TA TA
o
abc
s s s xTA
o
abc
p p p yTA
o
dt d E B z
r
c cu w
dw d u r t z
w w c
cu w
r t z
w c
Wc
dw d u d u
+

E
E
G G

G

2 2 2
2
2 ( ) ( )
4
2 2
2 ( ) 1 ( )
2 2
( , , ) ( , , )
( , , )
{
x x y y
wa wb
i x u u y u u
c c
i w c c u w
s s s s xTA xTA
p p p p yTA y
r
dxdy e
w
cu w cu w
e r t z z
w c w c
cu w
r t z
w c

+ + +

+

E E
E E
G G

G

2 2 2
2
2 ( ) ( )
2 2
4
2 2
2 ( ) 1 ( )
( , , )
( , , ) ( , , )
}
{
x x y y
TA
wa wb
i x u u y u u
c c
o
i w c c u w
s s s s xTA xTA
cu w
z
w c
r
Wc
dw d u d u dxdy e
w
cu w cu w
e r t z z
w c w c

+ + + + +

E E
G
G G

2 2
( , , ) ( , , ) ,
} p p p p yTA yTA
cu w cu w
r t z z
w c w c

+
E E
G G

(4.125e)

where Eqs. (4.112a), (4.112b), (4.125b), and (4.125c) are used to simplify the first set of integrals
on the right-hand side. Even though Eqs. (4.112a) and (4.112b) state an equality between
nonrandom quantities $\mathcal{E}_{xTA}$ and $\mathcal{E}_{yTA}$, we know this equality is also true for the random quantities
$\tilde{\mathcal{E}}_{xTA}$ and $\tilde{\mathcal{E}}_{yTA}$ because (4.112a) and (4.112b) must hold true for any radiation fields. The
remaining double integrals over $dx\,dy$ can be written as

2
( ) ( )
x x y y
wa wb
i x u u y u u
c c
x x y y
wa wb
dxdy e u u u u
c c
+ + +

= + +

. (4.125f)

From Eq. (4B.13e) in Appendix 4B, we get the approximation

2 2 2
2 2
2 1 2 1
w c w w w u c
i u u i
c c c c w w
e e

+ +

G G
G G
, (4.126a)
where we define, in harmony with Eqs. (4.123a) and (4B.13e),

2( )
M
ax by n z = + =
G
. (4.126b)

This new vector will make it easier to write down what happens to Eq. (4.125e) when we
substitute from (4.126a). Equations (4.100a) and (4.100b) hold true for all physically possible
radiation fields, so they must still be true when $\mathcal{E}_{xTA}$ and $\mathcal{E}_{yTA}$ are taken to be the random
quantities $\tilde{\mathcal{E}}_{xTA}$ and $\tilde{\mathcal{E}}_{yTA}$. Therefore we can take the complex conjugate of both sides of Eqs.
(4.100a) and (4.100b) to get

$$\tilde{\mathcal{E}}_{xTA}(\vec{\beta}, -\sigma, z)^{*} = \tilde{\mathcal{E}}_{xTA}(\vec{\beta}, \sigma, z)$$ (4.126c)
and
$$\tilde{\mathcal{E}}_{yTA}(\vec{\beta}, -\sigma, z)^{*} = \tilde{\mathcal{E}}_{yTA}(\vec{\beta}, \sigma, z) ,$$ (4.126d)

where the T, A subscripts are added because now we are explicitly acknowledging their time-
chopped and beam-chopped nature. Equation (4.124b) shows that when the argument u
G
of
, s p

is replaced by ( / ) u w c
G
G
, we get

( )
, ,
,
abc
s p s p
cu w
w c

G
G
B .

Examining the definition of
G
in Eq. (4.126b), we note that the angle between
M
n and z is
( )
d
O , which means, according to inequality (4.68) above, that the angle between
M
n and z
must be much smaller than the typical size of the off-axis propagation angle
b
. Although we
know from the discussion at the beginning of Appendix 4E that changing the propagation
direction by
b
can significantly affect the value of the complex
( )
,
abc
s p
parameters, the discussion

at the end of Appendix 4E demonstrates that changing the direction of propagation by only an
( )
d
O amount does not significantly affect
( )
,
abc
s p
y . Hence, Eqs. (4.125b) and (4.125c) still specify
what happens to
s s
y y and
p p
y y when ( / ) u u w c A
G
G G
. Taking all this into account while
changing the double integrals over dxdy in Eq. (4.125e) into the delta functions specified by
(4.125f) then leads to

( )
2 (bal) (bal)
2
2
2
2 2
2 ( )
4
2
2 2 2
( )
1
2
( , , )
( , , )
{
TA TA
o
abc
s s s xTA
o
abc
p p p yTA
dt d E B z
r
c cu w
dw d u r t z
w w c
cu w
r t z
w c
p
y

-

E
E
G G

G

2
2
2
2
2 2
2 ( )
4
( )
2 1
2 2 2
( )
2
2 2
2
4
( , , ) ( , , )
( , , ) ( , , )
}
{
}
{
abc
s s s xTA xTA
o
w cu
i
abc c w
p p p yTA yTA
s s
o
r
Wc cu w cu w
dw d u r t z z
w w c w c
cu w cu w
r t z z e
w c w c
r
Wc
dw d u r t
w
r
y
+ A
+ A
+

E E
E E
G G
G

G G
G

2
2
2
( )
( )
2 1
2 2 2
( )
( , , ) ( , , )
( , , ) ( , , )
}
abc
s xTA xTA
w cu
i
abc c w
p p p yTA yTA
cu w cu w
z z
w c w c
cu w cu w
r t z z e
w c w c
r
y
y
+ A
+ + A
E E
E E
G G
G

G G
G

.
(4.127a)

Equation (4.127a) is obtained by applying (4.126a), (4.126c), and (4.126d). There is no point
postponing any longer the application of the expectation operator E to both
sides of this formula. Because the expectation operator is linear with respect to nonrandom
quantities [see Eqs. (3.16a) and (3.17c) in Chapter 3], it can be taken inside all the integrals on
the right-hand side, which means Eq. (4.122b) can now be written as

( )
2 (bal) (bal)
2
2
2
2 2
2 ( )
s 4
Average energy in balanced signal over time interval 2 and beam cross - section
1
2
( , , )
{
TA TA
o
abc
s s xTA
o
T A
dt d E B z
r
c cu w
dw d u r t z
w w c
p

E
G G

G
E
E
2
2 2 2
( )
2
2
2 2
2 ( )
s 4
2
( , , )
( , , ) ( , , )
}
{
abc
p p p yTA
abc
s s xTA xTA
o
p
cu w
r t z
w c
r
Wc cu w cu w
dw d u r t z z
w w c w c
r
y
y

+ A

+

E
E E
G
G G
G

E
E
2
2
( )
2 1
2 2
( )
2
2
2 2
2 ( )
s 4
2 2 2
( )
( , , ) ( , , )
( , , ) ( , , )
(
}
{
w cu
i
abc c w
p p yTA yTA
abc
s s xTA xTA
o
abc
p p p yTA
cu w cu w
t z z e
w c w c
r
Wc cu w cu w
dw d u r t z z
w w c w c
cu
r t
w
r
y
y

+ + A

+

E E
E E
E
G G
G

G G
G

G

E
E
E
2
2
( )
2 1
, , ) ( , , )
}
w cu
i
c w
yTA
w cu w
z z e
c w c
r

+ A

E
G
G
.

(4.127b)

The key terms in Eq. (4.127b) are the expectation values of the random variables

2
( )
xTA
E
E ,
2
( )
yTA
E
E ,

( , , ) ( , , )
xTA xTA
cu w cu w
z z
w c w c

A

E E
G G
G

E ,
and
( , , ) ( , , )
yTA yTA
cu w cu w
z z
w c w c

A

E E
G G
G

E .

We learned how to handle terms such as $\mathrm{E}(|\tilde{\mathcal{E}}_{xTA}|^2)$, $\mathrm{E}(|\tilde{\mathcal{E}}_{yTA}|^2)$ in Sec. 4.14 [see Eqs. (4.120a)
and (4.120b)], but what can be done with terms such as
( , , ) ( , , )
xTA xTA
cu w cu w
z z
w c w c

E E
G G
G

E
and
( , , ) ( , , )
yTA yTA
cu w cu w
z z
w c w c

E E
G G
G

E ?

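To evaluate this new type of term, we return to Eq. (4.108a) above, making the radiation field
random and taking x and y components to get

$$\tilde{E}^{(in)}_{xTA}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u\; c\,w^{-2}\,\tilde{\mathcal{E}}_{xTA}\!\left(\frac{c\vec{u}}{w}, \frac{w}{c}, z\right) e^{2\pi i [\vec{u}\cdot\vec{\rho} - wt]}$$ (4.128a)
and
$$\tilde{E}^{(in)}_{yTA}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u\; c\,w^{-2}\,\tilde{\mathcal{E}}_{yTA}\!\left(\frac{c\vec{u}}{w}, \frac{w}{c}, z\right) e^{2\pi i [\vec{u}\cdot\vec{\rho} - wt]} .$$ (4.128b)

This shows that $\tilde{E}^{(in)}_{xTA}$ and $\tilde{E}^{(in)}_{yTA}$ are the inverse three-dimensional Fourier transforms of
$c\,w^{-2}\,\tilde{\mathcal{E}}_{xTA}$ and $c\,w^{-2}\,\tilde{\mathcal{E}}_{yTA}$,
which means that $c\,w^{-2}\,\tilde{\mathcal{E}}_{xTA}$ and $c\,w^{-2}\,\tilde{\mathcal{E}}_{yTA}$
must be the three-dimensional forward Fourier transforms of $\tilde{E}^{(in)}_{xTA}$ and $\tilde{E}^{(in)}_{yTA}$,

$$c\,w^{-2}\,\tilde{\mathcal{E}}_{xTA}\!\left(\frac{c\vec{u}}{w}, \frac{w}{c}, z\right) = \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\; \tilde{E}^{(in)}_{xTA}(\vec{\rho}, z, t)\, e^{-2\pi i [\vec{u}\cdot\vec{\rho} - wt]}$$ (4.129a)
and
$$c\,w^{-2}\,\tilde{\mathcal{E}}_{yTA}\!\left(\frac{c\vec{u}}{w}, \frac{w}{c}, z\right) = \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\; \tilde{E}^{(in)}_{yTA}(\vec{\rho}, z, t)\, e^{-2\pi i [\vec{u}\cdot\vec{\rho} - wt]} .$$ (4.129b)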

We now let
, x yTA
E
stand for either

xTA
E
or
yTA
E
, and
(in)
, x yTA
E
stand for either

(in)
xTA
E
or
(in)
yTA
E
. Since
the algebra is the same for the x and y components of
TA
E
G
and
(in)
TA
E
G
, we combine Eqs. (4.129a)

and (4.129b) to write, using Eq. (3.17c) of Chapter 3,

( )
, ,
2
2 (in) 2 [ ]
,
2
2 [ ( ) ]
2 (in)
,
( , , ) ( , , )
( , , )
( , , )
x yTA x yTA
i u wt
x yTA
i u w c wt
x yTA
cu w cu w
z z
w c w c
w
dt d E z t e
c
w
dt d E z t e
c
r p
r p
p p
p p

- +

A - +

A

E E
G G
G
G G
B
G G
G

G

E
E
( )
4
2 2 2 [ ( ) ( ) ( ) ] (in) (in)
, , 2
( , , ) ( , , ) .
i u w t t w c
x yTA x yTA
w
dt dt d d e E z t E z t
c
r p p p
p p p p

- + A-

G
G G G G
G G

E
(4.130)

Equations (4C.4c) and (4C.4d) from Appendix 4C show that

( )
(in) (in)
, ,
,
( , , ) ( , , )
( , ) ( , ) ( ; ) ( ; ) R ( , , ) .
x yTA x yTA
x y
E z t E z t
t T t T A A t t z
p p
p p p p

e H H H H
G G

G G G G

E
(4.131a)

It is important to remember, when using this approximation, that $R_{x,y}$ are the three-dimensional autocorrelation functions of the x and y radiation field components before they enter the interferometer [see Eqs. (4C.3a) and (4C.3b) in Appendix 4C]. The $\Pi(t, T)$ and $\Pi(\vec{\rho}; A)$ functions are defined in Appendix 4C to be⁶⁹

\[
\Pi(t, T) =
\begin{cases}
1 & \text{for } |t| \le T \\
0 & \text{for } |t| > T
\end{cases}
\tag{4.131b}
\]

\[
\Pi(\vec{\rho}; A) = \Pi(x, y; A) =
\begin{cases}
1 & \text{when point } (x, y) \text{ lies inside or on the edge of the beam of cross-sectional area } A \\
0 & \text{when point } (x, y) \text{ lies outside the beam of cross-sectional area } A .
\end{cases}
\tag{4.131c}
\]
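The two window functions in Eqs. (4.131b) and (4.131c) can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the pupil is assumed here to be a circle centered on the optical axis, a shape that Eq. (4.131c) itself does not require.

```python
import math


def time_window(t: float, T: float) -> int:
    """Time-chopping window: 1 on the closed interval |t| <= T, 0 outside it."""
    return 1 if abs(t) <= T else 0


def circular_pupil(x: float, y: float, area: float) -> int:
    """Beam-chopping window for an ASSUMED centered circular beam.

    Returns 1 when (x, y) lies inside or on the edge of the circle whose
    area is `area`, and 0 when the point lies outside it.
    """
    radius_squared = area / math.pi
    return 1 if x * x + y * y <= radius_squared else 0


# A point exactly on the boundary counts as inside in both windows.
print(time_window(2.0, 2.0), circular_pupil(1.0, 0.0, math.pi))
```

Both functions return 1 on the boundary itself, which is the detail the accompanying footnote singles out.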

These $\Pi$ functions approximate what happens to the original autocorrelation function $R_{x,y}$ when radiation enters the interferometer; they make explicit the time-chopped and beam-chopped nature of the interferometer signal (see discussion in Secs. 4.9 and 4.10 above). Substitution of (4.131a) into (4.130) gives

69 This formula for $\Pi(t, T)$ in Eq. (4.131b) is similar to the formula for $\Pi(t, T)$ given in Eq. (2.56c) of Chapter 2, differing only in the value specified for $\Pi$ at $|t| = T$.
( )
, ,
4
2 ( )
2
2
2 [ (
( , , ) ( , , )
( , ) ( , ) ( ; ) ( ; ) [
x yTA x yTA
i w c
i u
cu w cu w
z z
w c w c
w
t T dt t T dt A e d A
c
e

E E
G
G
B
G G
G G
G

G G

E
) ( )] 2
,
R ( , , )]
w t t
x y
t t z d

+

G
G G
(4.132a)

Transforming the variables of integration to =
G G G
and t t t = so that dt dt = and
2 2
d d = changes the formula to

( )
, ,
4
2 ( )
2
2
( , , ) ( , , )
( , ) ( ; ) ( , ) ( ; ) [
x yTA x yTA
i w c
cu w cu w
z z
w c w c
w
t T dt A e d t t T dt A
c

+ +

E E
G
G
B
G G
G

G G G

E
2 [ ] 2
,
R ( , , )]
i u wt
x y
e t z d

+

G G
G
.
(4.132b)

We note that, in the limit as T and A , the inner integrals over dt and
2
d become
the three-dimensional Fourier transform of
,
R ( , , )
x y
t z
G
:

2 [ ] 2
,
2 [ ]
,
( , ) ( ; ) R ( , , )
R ( , , )
[ ]
i u wt
x y
i u wt
x y
t t T dt A e t z d
T A
dt t z e

+ +

=

G G
G G
G G G
G

2
d

.
(4.133a)

According to Eqs. (4C.5a) and (4C.5b) in Appendix 4C, the three-dimensional Fourier transform
of
,
R ( , , )
x y
t z
G
is

2 2 ( )
, ,
R ( , , ) S ( , )
i u wt
x y x y
dt d t z e u w

+

=

G G
G G
, (4.133b)

where, as was discussed at the end of Appendix 4C, functions $S_{x,y}$ do not need to have z as part of their argument list because they do not depend on that variable. Equations (4.133a) and (4.133b) can be combined to give

2 [ ] 2
,
,
( , ) ( ; ) R ( , , )
S ( , )
[ ]
i u wt
x y
x y
t t T dt A e t z d
T A
u w
r p
p p p p

- +

H + H +

G G
G G G
G

.
(4.133c)

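Following the same reasoning used in the discussion following Eq. (4.117b) above, we assume that in a well-designed interferometer A and T are large enough (in fact, that a relatively small patch of A, say A/100 or A/1000, is large enough, and a similarly small fraction of T is large enough) for the left-hand side of (4.133c) to be approximately equal to its limit: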

2 [ ] 2
,
,
( , ) ( ; ) R ( , , )
S ( , )
[ ]
i u wt
x y
x y
t t T dt A e t z d
u w
r p
p p p p

- +

H + H +
e

G G
G G G
G

.
(4.133d)

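Another way of looking at this is to say that only the values of $R_{x,y}$ at reasonably small spatial and temporal offsets contribute significantly to the $S_{x,y}$ Fourier transform. When using this approximation, the inner integrals in (4.132b) no longer depend on variables $t$ and $\vec{\rho}$, except for a relatively small border region around the edge of A and relatively small time durations at the beginning and end of T. Neglecting the contribution of these small border regions and time durations, we make the approximation that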

( )
, ,
4
2 ( )
2
,
2
( , , ) ( , , )
S ( , ) ( , ) ( ; )
x yTA x yTA
i w c
x y
cu w cu w
z z
w c w c
w
u w t T dt A e d
c
r p
p p

-A

A

e H H

E E
G
G
B
G G
G

G G

.
E
(4.133e)

At this point, we have everything needed to put together the interferometer's balanced-signal equations. Using Eqs. (4.102a)–(4.102d) to transform the variables of integration in Eq. (4.127b) to $\sigma = (w/c)$ and $\vec{\theta} = (c\vec{u}/w)$ gives

2
2
2
Average energy in balanced signal over time interval 2 and beam cross section
2 ( )
{
o
T A
d
d r

=

( )
( )
2 2
2 2
( )
2 2 2 2
( )
2
2
2
2
2 2
( )
( ) ( ) ( ) ( , , )
( ) ( ) ( ) ( , , )
( )
( ) ( ) ( ) ( , , )
[
]
[
}
{
abc
s s s xTA
abc
p p p yTA
o
abc
s s s xTA xTA
r t z
r t z
d
W d r
r t z

+
+

E
E
E E
G
G

E
E
E
( )
( )
( )
2 2 2 2
( ) 2 1
2
2
2
2
2 2
( )
2 2
(
( , , )
( ) ( ) ( ) ( , , ) ( , , )
( )
( ) ( ) ( ) ( , , ) ( , , )
( ) ( )
]
[
}
{
abc i
p p p yTA yTA
o
abc
s s s xTA xTA
ab
p p p
z
r t z z e
d
W d r
r t z z
r t

+
+
+
+

E E
E E
G
G
G
G G

G
G G

E
E
( )
2 2
) 2 1
( ) ( , , ) ( , , ) ]
}
c i
yTA yTA
z z e

+ E E
G
G G

. E
(4.134a)

Here rule (4E.6a) in Appendix 4E is used to acknowledge that
( ) abc
s
and
( ) abc
p
are functions
only of the wavenumber for on-axis and slightly off-axis plane waves, and once again Eq.
(4.1e) is used to simplify the constant outside the integrals. Converting the arguments in (4.133e)
to ( ) w c = and ( ) c u w =
G G
gives

( )
( )
, ,
2 ( )
2 4 2
,
2 4
,
( , , ) ( , , )
S ( , ) ( , ) ( ; )
S ( , ) 2 ( ) ,
x yTA x yTA
i
x y
x y A
z z
c c t T dt A e d
c c T

=

E E
G
G
G
G G

G G
G
G
B

E
(4.134b)

where
A
is defined to be the two-dimensional forward Fourier transform


2 2
( ) ( ; )
i u
A
u d A e
r p
p p
G G
G G
. (4.134c)

of the beam's pupil function $\Pi(\vec{\rho}; A)$ defined in Eq. (4.131c). Because $\Pi(\vec{\rho}; A)$ is strictly real, we see that

2 2 2 2
( ) ( ; ) ( ; )
i u i u
A
u d A e d A e
r p r p
p p p p

- -

H H

G G G G
G G G

or
( ) ( )
A A
u u

G G
. (4.134d)
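The conjugate symmetry stated in Eq. (4.134d) relies only on the pupil function being real; it does not require a symmetric beam. The Python sketch below checks this numerically for a deliberately off-center rectangular pupil. The pupil shape, grid size, test frequencies, and function names are illustrative assumptions, and the forward transform is assumed to use the kernel exp(-2*pi*i*u.rho).

```python
import cmath
import math


def sample_pupil(n=40, half_width=1.0):
    """Return cell centers lying inside an off-center rectangular pupil."""
    step = 2.0 * half_width / n
    points = []
    for i in range(n):
        for j in range(n):
            x = -half_width + (i + 0.5) * step
            y = -half_width + (j + 0.5) * step
            # Real-valued pupil: 1 inside the rectangle, 0 elsewhere.
            if -0.2 <= x <= 0.9 and -0.5 <= y <= 0.3:
                points.append((x, y))
    return points, step * step


def pupil_transform(u, v, points, cell_area):
    """Riemann-sum estimate of the 2-D forward Fourier transform at (u, v)."""
    total = 0j
    for x, y in points:
        total += cmath.exp(-2j * math.pi * (u * x + v * y))
    return total * cell_area


points, cell_area = sample_pupil()
forward = pupil_transform(0.7, -0.3, points, cell_area)
mirrored = pupil_transform(-0.7, 0.3, points, cell_area)

# The transform at -u equals the complex conjugate of the transform at +u.
print(abs(mirrored - forward.conjugate()))
```

Because the rectangle is not centered on the axis, the transform has a nonzero imaginary part, so the check is not trivially satisfied by a purely real transform.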

Equations (4.120a) and (4.120b) let us substitute for S
x,y
in (4.134b) to get

( )
2 2
( , , ) ( , , ) 2 ( , ) ( )
xTA xTA o x A
z z T c r o r o o r o o
A A E E L
G G
G G G

B E (4.135a)
and

( )
2 2
( , , ) ( , , ) 2 ( , ) ( )
yTA yTA o y A
z z T c r o r o o r o o
A A E E L
G G
G G G

B E . (4.135b)
These last two results, together with
2
o o
c r

from Eq. (4.1e), can be substituted into (4.134a)
to give

2
2
2
2 ( )
{
o
T A
d
d r
r
r o o
o

( )
( )
2 2
2 2
( )
2 2 2 2
( )
2
2 2 2
2 ( )
( ) ( ) ( ) ( , , )
( ) ( ) ( ) ( , , )
2 ( ) ( ) ( ) ( ) ( ) ( , )
[
]
[
}
{
abc
s s s xTA
abc
p p p yTA
abc
A s s s x
r t z
r t z
TW d d r r t
o o o r o
o o o r o
o r o o o o o r o

+
+ A

E
E
L
G
G
G

E
E
2 2 2 2
( ) 2 1
2
2 2 2
2 ( )
2
( ) ( ) ( ) ( , )
2 ( ) ( ) ( ) ( ) ( ) ( , )
( ) (
]
[
}
{
abc i
p p p y
abc
A s s s x
p p
r t e
TW d d r r t
r t
r o r
o o o r o
o r o o o o o r o
o o

+
+ A
+

L
L
G
G
G

2 2 2
( ) 2 1
) ( ) ( , )]
}
abc i
p y
e
r o r
o r o

L
G
.
(4.135c)

Returning again to Eqs. (4.120a) and (4.120b), this time to substitute for
2
,
( )
x yTA
E
E , gives

2
2
2 2
(
4 ( )
( ) ( )
{
abc
s s s
T A
TA d d r
r t
o r o
o o

2 2
2 2 2 2
) ( )
2
2 2 2
2 ( )
2 2 2
( ) 2 1 2 1
( ) ( , ) ( ) ( ) ( ) ( , )
2 ( ) ( ) ( ) ( ) ( , )
( ) ( ) ( ) ( , ) ( ) ( )
[
]
}
{
}
abc
x p p p y
abc
s s s x
abc i i
p p p y A A
r t
TW d d r r t
r t e e
r o r r o r
o r o o o o r o
o r o o o o r o
o o o r o o o

+

+

+ A + A

L L
L
L
G G
G
G G
G

(4.135d)

where Eq. (4.134d) is used to replace ( )
A
o A
G
by ( )
A
o

A
G
. From the definitions of L
x,y
in the
discussion preceding Eqs. (4.120a), we know that
2
( , )
x
d r o r L
G
is the x-polarized optical power
per unit area of the beam and per unit wavenumber interval at wavenumber that is inside the
2
d r solid angle and traveling in the direction of the propagation vector
2
1 z r r O +
G
G
. A
similar statement can be made about
2
( , )
y
d r o r L
G
that it is the y-polarized optical power per
unit area of the beam and per unit wavenumber interval at wavenumber that is inside the
2
d r
solid angle and traveling in the direction of the propagation vector O
G
. The discussion following
Eq. (4.120b) points out that L
x
and L
y
must represent direction-chopped radiation, with both L
x

and L
y
negligible for r
G
values specifying propagation directions that are not parallel to, or nearly
parallel to, the optical axis z . Hence L
x
and L
y
are negligible for those propagation directions that cannot enter the interferometer because they lie outside the interferometer's field of view, and we can regard the integrals over $d^2\theta$ as occurring only over the interferometer's field of view. We define $P_{bal}(\chi)$ to be the time-averaged power in the balanced signal from the beam of cross-sectional area A at an OPD value of $\chi$. Dividing both sides of (4.135d) by 2T then gives, after using $\mathrm{Re}(c) = (c + c^*)/2$ for any complex number c,

2
2 2 2
2 ( )
field of view
2 2 2
( )
P ( ) 2 ( ) ( ) ( ) ( ) ( , )
( ) ( ) ( ) ( , )
1 Re (
[
]
[
{
abc
bal s s s x
abc
p p s y
A
A d d r r t
r t
W
A
o r o o o o r o
o o o r o
+
+

L
L
G
G

( )
2 cos
) ]
}
i
e
r
r o o
o

A
G

(4.135e)
where

2
cos 1
cosine of the angle between the propagation vector and the axis z
r
o r
K
.
(4.135f)

We note that by definition
r
o is the same as angle
b
used in Sec. 4.12 and Appendix 4B.
Writing the integral over $d^2\theta$ like this lets us think of $\mathcal{L}_x$ and $\mathcal{L}_y$ as representing the radiation field before it becomes direction-chopped, always assuming, of course, that direction-chopping the incident radiation does not significantly change $\mathcal{L}_x$ and $\mathcal{L}_y$. Equation (4.135e) makes it clear that the triple integral over $d\sigma$ and $d^2\theta$ must be real because the quantity being integrated is always real.

4.16 Simplified Formulas for the Optical Power in the Balanced Signal
Equation (4.135e) specifies the optical power in the balanced signal when $\mathcal{L}_x \neq \mathcal{L}_y$ so that the incident radiation is polarized, when $\mathcal{L}_x$ and $\mathcal{L}_y$ depend on the propagation direction as well as the wavenumber so that there is both spectral and intensity variation across the interferometer's field of view, and when the misalignment vector of Eq. (4.126b) is nonzero so that there are small misalignments in the moving mirror. This section strips away these effects step by step, eventually arriving at the same formula for the optical power in the balanced signal for an ideal interferometer that was presented in Eq. (1.19f) of Chapter 1.
The first step is to specify unpolarized incident radiation, which we do by setting

\[
\mathcal{L}_x(\vec{\theta}, \sigma) = \mathcal{L}_y(\vec{\theta}, \sigma) = \frac{1}{2}\,\mathcal{L}(\vec{\theta}, \sigma) . \tag{4.136a}
\]

Here, the incident radiation is made unpolarized by splitting the total power equally between the two possibilities: x polarization and y polarization. From Eqs. (4.121a) and (4.121b), we see that both $\mathcal{L}_x$ and $\mathcal{L}_y$ are even functions of $\sigma$, requiring $\mathcal{L}$ to be another even function of $\sigma$:

\[
\mathcal{L}(\vec{\theta}, -\sigma) = \mathcal{L}(\vec{\theta}, \sigma) . \tag{4.136b}
\]

Equation (4.135e) can now be written as

( )
2
2
field of view
2 2 2 2
2 2
( ) ( )
2 cos
P ( ) ( ) ( , )
( ) ( ) ( ) ( ) ( ) ( )
1 Re ( )
[ ]
[ ]
{
}
bal
abc abc
s s s p p p
i
A
A d d r
r t r t
W
e
A
r
r o o
o r o r o
o o o o o o
o
+
+ A
G
G
.

L
(4.136c)
Glancing back to the definitions of $\mathcal{L}_x$ and $\mathcal{L}_y$ [see discussion preceding Eq. (4.120a) above], we recognize $\mathcal{L}(\vec{\theta}, \sigma) = \mathcal{L}_x(\vec{\theta}, \sigma) + \mathcal{L}_y(\vec{\theta}, \sigma)$ to be the total optical power per unit cross-sectional area of this unpolarized beam per unit solid angle per unit wavenumber interval at wavenumber $\sigma$. As an argument of function $\mathcal{L}$, the wavenumber takes on negative as well as positive values: $-\infty < \sigma < \infty$. This makes $\mathcal{L}$ analogous to a double-sided power spectrum [in Chapter 3, see Sec. 3.20 and the discussion following Eq. (3.57g)]. In radiometry the spectral radiance of an optical field is the transmitted optical power per unit area transverse to the direction of propagation per unit solid angle in the direction of propagation per unit wavenumber interval. This is the same meaning we have attached to $\mathcal{L}$; however, in radiometry, the wavenumber is always positive.⁷⁰ This makes the radiometric spectral radiance of the optical field analogous to a single-sided power spectrum. Because the radiation passing through the interferometer is direction-chopped, ensuring that all the propagation vectors are parallel to, or nearly parallel to, the z axis, a unit cross-sectional area of the beam is approximately the same as a unit area transverse to the radiation's direction of propagation; and a solid angle $d^2\theta$ in Eq. (4.135e) is approximately the same as a solid angle in the direction of propagation. Hence we could interpret $\mathcal{L}(\vec{\theta}, \sigma)$ as the radiometric spectral radiance of the optical field if $\mathcal{L}$ were not in fact defined for both positive and negative wavenumbers, making it analogous to a double-sided rather than a single-sided power spectrum. Therefore we use the standard conversion for going from double-sided to single-sided power spectra [see Eq. (3.58b) in Chapter 3] to define the spectral radiance $L$ of the optical field as

\[
L(\vec{\theta}, \sigma) \equiv 2\,\mathcal{L}(\vec{\theta}, \sigma) \quad \text{for } \sigma > 0 . \tag{4.136d}
\]

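The next step is to assume no spectral or intensity variation across the interferometer's field of view, which means we suppress the dependence of $\mathcal{L}$ and $L$ on $\vec{\theta}$ and write Eqs. (4.136a), (4.136b), and (4.136d) as

\[
\mathcal{L}_x(\vec{\theta}, \sigma) = \mathcal{L}_y(\vec{\theta}, \sigma) = \frac{1}{2}\,\mathcal{L}(\vec{\theta}, \sigma) = \frac{1}{2}\,\mathcal{L}(\sigma) , \tag{4.136e}
\]
with
\[
\mathcal{L}(-\sigma) = \mathcal{L}(\sigma) \tag{4.136f}
\]
and
\[
L(\sigma) \equiv 2\,\mathcal{L}(\sigma) \quad \text{for } \sigma > 0 . \tag{4.136g}
\]

Substituting Eq. (4.136e) into (4.136c) now gives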

2
field of view
2 cos
P ( ) ( )
2
1 Re ( ) [ ]
bal
i
A
A
d d
W
e
A
r
r o o
o r q o o o

+ A

G
( )L (4.136h)

70
See Table 1-2 on page 1-4 in The Infrared Handbook, edited by William L. Wolfe and George J. Zissis, rev. ed.
(Infrared Information Analysis Center of the Environmental Research Institute of Michigan, 1985).

with the beam-splitter efficiency defined to be

2 2 2 2
2 2 2
( ) ( )
( ) 2 ( ) ( ) ( ) ( ) ( ) ( ) ( )
abc abc
s s s p p p
r r t r t

= +

. (4.136i)

An ideal interferometer has

\[
|r_s(\sigma)|^2 = |r_p(\sigma)|^2 = |t_s(\sigma)|^2 = |t_p(\sigma)|^2 = \frac{1}{2}
\]
and

2
2
( )
,
( ) ( ) 1
abc
s p
r = =

so that $\eta = 1$. For realistic interferometers, we expect $0 < \eta < 1$; and the closer $\eta$ is to one, the more nearly ideal is the performance of the interferometer's optical components (i.e., the beam splitter, compensator, and return mirrors in Figs. 4.16, 4.17, and 4.19).
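The ideal-interferometer check described above can be reproduced numerically. The Python sketch below assumes that Eq. (4.136i) has the form "two times the squared mirror reflectance, times the sum of the s-type and p-type products of squared magnitudes"; that reading is inferred from the surrounding equations, and the argument names (in particular `abc_s` and `abc_p` for the (abc) substrate and compensator parameters) and the lossy sample values are hypothetical.

```python
def beam_splitter_efficiency(r_mirror, r_s, t_s, abc_s, r_p, t_p, abc_p):
    """Efficiency from complex amplitude coefficients (assumed form of 4.136i)."""
    s_term = abs(r_s) ** 2 * abs(t_s) ** 2 * abs(abc_s) ** 2
    p_term = abs(r_p) ** 2 * abs(t_p) ** 2 * abs(abc_p) ** 2
    return 2.0 * abs(r_mirror) ** 2 * (s_term + p_term)


# Ideal case: 50/50 beam splitter in both polarizations, perfect mirrors,
# and unit-magnitude (abc) parameters.
half_amplitude = 2.0 ** -0.5
ideal = beam_splitter_efficiency(1.0, half_amplitude, half_amplitude, 1.0,
                                 half_amplitude, half_amplitude, 1.0)

# A made-up lossy case, for contrast only.
lossy = beam_splitter_efficiency(0.98, 0.6, 0.75, 0.99, 0.5, 0.8, 0.99)

print(ideal, lossy)
```

Only magnitudes enter the efficiency, so complex phases on any of the coefficients leave the result unchanged.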
Traditional interferometers have beams with a circular cross section. Equation (4D.6b) in
Appendix 4D gives the formula for the two-dimensional forward Fourier transform of a circular
pupil function when R is the pupil radius,

1
circle of
radius
(2 )
( )
A
R
J R u
u R
u
=
G
G
G
. (4.137a)

Here $J_1$ is the first-order Bessel function of the first kind. We note that this two-dimensional Fourier transform depends only on the magnitude of vector $\vec{u}$; and, since $J_1$ is always real-valued for real arguments (see Fig. 4.23), the transform is always real. Substitution of
G
for u
G
in
(4.137a) gives

1
1
circle of
radius
(2 )
(4 )
( )
2
M
A
M
R
J R
J R n z
R R
n z

= =

G
G
G , (4.137b)

where in the last step we replace
G
with its definition from Eq. (4.126b): 2( )
M
n z =
G
.
FIGURE 4.23. Graph of $J_1(x)$ for $-30 \le x \le 30$.

FIGURE 4.24. Graph of $2J_1(x)/x$ for $-30 \le x \le 30$.

Both $\hat{n}_M$, the normal vector to the reflecting surface of the moving mirror, and $\hat{z}$, the vector pointing down the interferometer beam along the optical axis, have unit length. Because the angle between them is always small (the moving mirror is assumed to be only slightly misaligned), it follows that $|\hat{n}_M - \hat{z}|$ is the misalignment angle of the moving mirror with respect to the optical
axis and [see discussion following Eq. (4B.4h) in Appendix 4B], the angle between rays
reflecting from a perfectly aligned moving mirror and rays reflecting from a slightly misaligned
moving mirror is always 2
d M
n z = . We define

ma

M
n z = (4.137c)

to be the misalignment angle of the interferometers moving mirror for a beam with a circular
cross section and write (4.137b) as

1 ma
circle of
ma
radius
(4 )
( )
2
A
R
J R
R

G
. (4.137d)
It follows that

1 ma
ma
circle of
radius
(4 )
1
( )
2
A
R
J R
A R

G
(4.137e)

because the area of a circular beam is, of course, $A = \pi R^2$. To see how this function behaves, we note that the right-hand side of (4.137e) can be written as

\[
\frac{2 J_1(x)}{x} \quad \text{for } x = 4\pi |\sigma| R\, \theta_{ma}
\]

and graph this function of x in Fig. 4.24. Because

\[
J_1(-x) = -J_1(x) , \tag{4.137f}
\]

function $J_1$ is an odd function (see Sec. 2.3 of Chapter 2 for a description of what an odd function is).⁷¹ This means that

\[
\frac{J_1\bigl(4\pi(-\sigma) R\,\theta_{ma}\bigr)}{2\pi(-\sigma) R\,\theta_{ma}} = \frac{J_1(4\pi\sigma R\,\theta_{ma})}{2\pi\sigma R\,\theta_{ma}} , \tag{4.137g}
\]
which shows that

circle of
radius
1
( )
A
R
A

G

71 The standard series formula for $J_1$ shows at once that it is odd. See Eq. (9.1.10) in Handbook of Mathematical Functions, edited by Milton Abramowitz and Irene A. Stegun (National Bureau of Standards, Applied Mathematics Series 55, November 1964), p. 360.
is an even function of $\sigma$. Consequently the absolute value signs can be dropped from $|\sigma|$ so that Eq. (4.137e) becomes

1 ma
ma
circle of
radius
(4 ) 1
( )
2
A
R
J R
A R
r o
o
r o
A
G
. (4.137h)
Hence for an interferometer beam with a circular cross section, Eq. (4.136h) can be written as

2 1 ma
ma field of view
P ( )
(4 )
( ) cos(2 cos )
2 2
1
bal
J R A
d d
R
W
r
r o
o r q o o ro o
r o

+

. ( )L
(4.137i)

Returning to the original definition of
1
( )
A
A o
A
G
in Eq. (4.134c), we have, as A
G
goes to zero,

2 ) 2
0
0
(
1 1
( ) ( ; ) 1
i
A
d A e A
A A
r p o
o p p
- A
A
A
A H

G
G
G
G
G
G
(4.137j)

for the pupil functions ( ; ) A p H
G
of beams with any shape cross section. According to Eq. (4.137c)
and the discussion preceding it,

ma
2 2
M
n z A
G
.

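This means the limit as the misalignment vector of Eq. (4.126b) goes to zero is the same as the limit as $\theta_{ma} \to 0$. For a beam with a circular cross section, it must then be true that [see Eqs. (4.137h) and (4.137j)]

\[
\lim_{\theta_{ma} \to 0} \frac{J_1(4\pi\sigma R\,\theta_{ma})}{2\pi\sigma R\,\theta_{ma}} = 1 . \tag{4.137k}
\]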
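The behavior of the Bessel-function ratio, including the limit in Eq. (4.137k) and the evenness in Eq. (4.137g), can be checked with the standard library alone. The sketch below evaluates $J_1$ from its integral representation $J_1(x) = \frac{1}{\pi}\int_0^{\pi}\cos(\tau - x\sin\tau)\,d\tau$ using a midpoint rule; the function names, the step count, and the sample instrument values (wavenumber, beam radius, tilt) are illustrative assumptions, not values taken from the text.

```python
import math


def bessel_j1(x, steps=2000):
    """First-order Bessel function via its integral representation (midpoint rule)."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        tau = (k + 0.5) * h
        total += math.cos(tau - x * math.sin(tau))
    return total * h / math.pi


def alignment_factor(x):
    """The ratio 2*J1(x)/x, with its x -> 0 limit of 1 filled in explicitly."""
    if abs(x) < 1e-8:
        return 1.0
    return 2.0 * bessel_j1(x) / x


def tilt_argument(sigma, radius, theta_ma):
    """Argument x = 4*pi*sigma*R*theta_ma of the ratio for a circular beam."""
    return 4.0 * math.pi * sigma * radius * theta_ma


# Hypothetical example: 1000 cm^-1, 2.5 cm beam radius, 10 microradian tilt.
x_example = tilt_argument(1000.0, 2.5, 1.0e-5)
print(x_example, alignment_factor(x_example))
```

For small tilts the factor stays close to one, and it first reaches zero near x = 3.83, the first positive zero of $J_1$, where the modulation term multiplied by this factor vanishes.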

Equation (4.137i) for a perfectly aligned system then becomes

2
field of view
P ( ) ( ) 1 cos(2 cos )
2
bal
A
d d W
r
o r q o o ro o

+

. ( )L (4.137A )

If we assume that the interferometer's field of view is narrow enough that the cosine of the angle between every admitted propagation vector and the z axis [see Eq. (4.135f)] is approximately one, then Eq. (4.137i), for an imperfectly aligned moving mirror, becomes

\[
P_{bal}(\chi) = \frac{A\Omega}{2} \int_{-\infty}^{\infty} \eta(\sigma)\, \mathcal{L}(\sigma) \left[ 1 + W\, \frac{J_1(4\pi\sigma R\,\theta_{ma})}{2\pi\sigma R\,\theta_{ma}} \cos(2\pi\sigma\chi) \right] d\sigma , \tag{4.138a}
\]
where
\[
\Omega = \text{solid angle of the interferometer's field of view} . \tag{4.138b}
\]

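Equations (4.92a)–(4.92d) above let us write that

\[
|r(-\sigma)|^2 = r(-\sigma)\, r(-\sigma)^* = r(\sigma)^*\, [r(\sigma)^*]^* = r(\sigma)^*\, r(\sigma) = |r(\sigma)|^2 , \tag{4.139a}
\]
\[
|r_s(-\sigma)|^2 = r_s(-\sigma)\, r_s(-\sigma)^* = r_s(\sigma)^*\, [r_s(\sigma)^*]^* = r_s(\sigma)^*\, r_s(\sigma) = |r_s(\sigma)|^2 , \tag{4.139b}
\]
\[
|t_s(-\sigma)|^2 = t_s(-\sigma)\, t_s(-\sigma)^* = t_s(\sigma)^*\, [t_s(\sigma)^*]^* = t_s(\sigma)^*\, t_s(\sigma) = |t_s(\sigma)|^2 , \tag{4.139c}
\]
\[
|r_p(-\sigma)|^2 = r_p(-\sigma)\, r_p(-\sigma)^* = r_p(\sigma)^*\, [r_p(\sigma)^*]^* = r_p(\sigma)^*\, r_p(\sigma) = |r_p(\sigma)|^2 , \tag{4.139d}
\]
and
\[
|t_p(-\sigma)|^2 = t_p(-\sigma)\, t_p(-\sigma)^* = t_p(\sigma)^*\, [t_p(\sigma)^*]^* = t_p(\sigma)^*\, t_p(\sigma) = |t_p(\sigma)|^2 . \tag{4.139e}
\]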

Equation (4.125d) shows that, using rule (4E.6a) in Appendix 4E to drop the superfluous
argument r
G
,

2 2
( ) ( )
( ) ( )
abc abc
s s
y o y o and
2 2
( ) ( )
( ) ( )
abc abc
p p
y o y o . (4.139f)

Consequently, () defined in Eq. (4.136i) is an even function of :

2 2 2 2
2 2 2
( ) ( )
2 2 2 2
2 2 2
( ) ( )
( )
2 ( ) ( ) ( ) ( ) ( ) ( ) ( )
2 ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( )
abc abc
s s s p p p
abc abc
s s s p p p
r r t r t
r r t r t
q o
o o o y o o o y o
o o o y o o o y o q o

+

+

.
(4.139g)

We also know that $\mathcal{L}$ is an even function of $\sigma$ [see Eq. (4.136f)], that $\cos(2\pi\sigma\chi)$ is an even function of $\sigma$, and that

\[
\frac{J_1(4\pi\sigma R\,\theta_{ma})}{2\pi\sigma R\,\theta_{ma}}
\]

is an even function of $\sigma$ [see Eq. (4.137g)]. Therefore, the entire product being integrated in (4.138a),

\[
\eta(\sigma)\, \mathcal{L}(\sigma) \left[ 1 + W\, \frac{J_1(4\pi\sigma R\,\theta_{ma})}{2\pi\sigma R\,\theta_{ma}} \cos(2\pi\sigma\chi) \right] ,
\]

is an even function of $\sigma$. Hence we can write, using the rule from Eq. (2.19) in Chapter 2 and also that $L(\sigma) = 2\mathcal{L}(\sigma)$ from Eq. (4.136g), that the balanced-signal power specified in (4.138a) is

\[
P_{bal}(\chi) = \frac{A\Omega}{2} \int_{0}^{\infty} \eta(\sigma)\, L(\sigma) \left[ 1 + W\, \frac{J_1(4\pi\sigma R\,\theta_{ma})}{2\pi\sigma R\,\theta_{ma}} \cos(2\pi\sigma\chi) \right] d\sigma . \tag{4.140a}
\]

Making the interferometer perfectly aligned by taking $\theta_{ma} \to 0$, we see from the limit

\[
\lim_{\theta_{ma} \to 0} \frac{J_1(4\pi\sigma R\,\theta_{ma})}{2\pi\sigma R\,\theta_{ma}} = 1
\]

in (4.137k) that the Bessel function ratio disappears in (4.140a). Now we can write

\[
P_{bal}(\chi) = \frac{1}{2} \int_{0}^{\infty} \eta(\sigma)\, S(\sigma)\, [1 + W \cos(2\pi\sigma\chi)]\, d\sigma , \tag{4.140b}
\]
where we define
\[
S(\sigma) \equiv A\Omega\, L(\sigma) \tag{4.140c}
\]

to be the total optical power per unit wavenumber interval entering the interferometer. All that needs to be done now is to make the final idealization, $\eta \to 1$, and we get the same formula for the ideal interferometer signal given in Eq. (1.19f) in Chapter 1,

\[
P_{bal}(\chi) = \frac{1}{2} \int_{0}^{\infty} S(\sigma)\, [1 + W \cos(2\pi\sigma\chi)]\, d\sigma . \tag{4.140d}
\]
The only difference is that in Chapter 1 the balanced optical power is called
( ) cb
I instead of
P ( )
bal
, and that now we can get progressively less idealized formulas for the balanced optical
power by reversing the simplifications leading to (4.140d).
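As a numerical illustration of the ideal-interferometer formula, the Python sketch below evaluates the balanced power as one half of the integral over positive wavenumbers of S(sigma) times [1 + W cos(2 pi sigma chi)], using a trapezoid rule. The Gaussian test spectrum, its center and width, the integration limit, and the function names are all assumptions made for this example.

```python
import math


def gaussian_line(sigma, center=5.0, width=0.5):
    """Hypothetical test spectrum: a Gaussian line in arbitrary wavenumber units."""
    return math.exp(-0.5 * ((sigma - center) / width) ** 2)


def balanced_power(opd, spectrum, sigma_max, W=1, steps=4000):
    """Trapezoid-rule estimate of 0.5 * integral of S * [1 + W*cos(2*pi*sigma*opd)]."""
    h = sigma_max / steps
    total = 0.0
    for k in range(steps + 1):
        sigma = k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * spectrum(sigma) * (1.0 + W * math.cos(2.0 * math.pi * sigma * opd))
    return 0.5 * total * h


total_power = 0.5 * math.sqrt(2.0 * math.pi)  # analytic integral of the test line

print(balanced_power(0.0, gaussian_line, 10.0, W=1))   # all the power at zero OPD
print(balanced_power(0.0, gaussian_line, 10.0, W=-1))  # none of it when W = -1
print(balanced_power(3.0, gaussian_line, 10.0, W=1))   # half of it at large OPD
```

At zero OPD the W = 1 signal carries the full input power and the W = -1 signal carries none, while at OPD values large compared with the inverse line width the cosine term averages away and only the constant half remains.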
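In most optical textbooks it is customary to assume that $W = 1$, allowing Eq. (4.140d) to be written as

\[
P_{bal}(\chi) = \frac{1}{2} \int_{0}^{\infty} S(\sigma)\, d\sigma + \frac{1}{2} \int_{0}^{\infty} S(\sigma) \cos(2\pi\sigma\chi)\, d\sigma = \text{constant} + \frac{1}{2} \int_{0}^{\infty} S(\sigma) \cos(2\pi\sigma\chi)\, d\sigma .
\]

Separating out the nonconstant signal component that changes with $\chi$, we give it the name

\[
I_{bal}(\chi) = \frac{1}{2} \int_{0}^{\infty} S(\sigma) \cos(2\pi\sigma\chi)\, d\sigma . \tag{4.141a}
\]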

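In this book, we call $I_{bal}(\chi)$ the interferogram. Comparing this last result to Eq. (2.8b) in Chapter 2, we see that the interferogram is 1/4 of the Fourier cosine transform of $S(\sigma)$. Since

\[
S(\sigma) = A\Omega\, L(\sigma) = 2 A\Omega\, \mathcal{L}(\sigma) \quad \text{for } \sigma > 0 \tag{4.141b}
\]

and $\mathcal{L}$ is an even function [see Eqs. (4.136f) and (4.136g)], the definition of $S(\sigma)$ can be extended to negative values of $\sigma$ by making S another even function:

\[
S(-\sigma) = S(\sigma) . \tag{4.141c}
\]

The cosine is also even, so we can then write the interferogram as [see Eq. (2.19) in Chapter 2]

\[
I_{bal}(\chi) = \frac{1}{4} \int_{-\infty}^{\infty} S(\sigma) \cos(2\pi\sigma\chi)\, d\sigma .
\]

The sine is an odd function and $S(\sigma)$ is even, so the product $S(\sigma) \sin(2\pi\sigma\chi)$ is an odd function of $\sigma$. According to Eq. (2.17) of Chapter 2, the integral between $-\infty$ and $+\infty$ of any odd function is zero, so

\[
\int_{-\infty}^{\infty} S(\sigma) \sin(2\pi\sigma\chi)\, d\sigma = 0 .
\]
Hence we can write

\[
I_{bal}(\chi) = \frac{1}{4} \int_{-\infty}^{\infty} S(\sigma) \cos(2\pi\sigma\chi)\, d\sigma \pm \frac{i}{4} \int_{-\infty}^{\infty} S(\sigma) \sin(2\pi\sigma\chi)\, d\sigma = \frac{1}{4} \int_{-\infty}^{\infty} S(\sigma)\, [\cos(2\pi\sigma\chi) \pm i \sin(2\pi\sigma\chi)]\, d\sigma
\]
or
\[
I_{bal}(\chi) = \frac{1}{4} \int_{-\infty}^{\infty} S(\sigma)\, e^{\pm 2\pi i \sigma\chi}\, d\sigma \tag{4.141d}
\]
using $\cos(\phi) \pm i \sin(\phi) = e^{\pm i\phi}$. This shows that the interferogram of an ideal interferometer is 1/4 of either the forward or inverse Fourier transform of $S(\sigma)$.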
Equations (4.141a) and (4.141d) are important because, as pointed out at the beginning of Sec. 1.7 of Chapter 1, they show why people build Michelson interferometers. Reversing the complex Fourier transform gives

\[
S(\sigma) = 4 \int_{-\infty}^{\infty} I_{bal}(\chi)\, e^{\mp 2\pi i \sigma\chi}\, d\chi \tag{4.141e}
\]

or, using Eq. (2.8f) in Chapter 2 to reverse the cosine transform in (4.141a),

\[
S(\sigma) = 8 \int_{0}^{\infty} I_{bal}(\chi) \cos(2\pi\sigma\chi)\, d\chi . \tag{4.141f}
\]

To find $S(\sigma)$, the radiation spectrum as a function of wavenumber, we need only measure $I_{bal}(\chi)$ and take its transform.
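The round trip from spectrum to interferogram and back can be sketched numerically. The Python example below builds an interferogram as one half of the cosine integral of a test spectrum over positive wavenumbers, then recovers the spectrum as eight times the cosine integral of the interferogram over positive OPD, the relation given in Eq. (4.141f). The Gaussian test spectrum, the grid spacings and ranges, and the function names are assumptions chosen so that a plain trapezoid rule is accurate; a real instrument would instead sample a finite, noisy interferogram.

```python
import math


def trapezoid(values, step):
    """Trapezoid rule for equally spaced samples."""
    return step * (sum(values) - 0.5 * (values[0] + values[-1]))


def spectrum(sigma):
    """Hypothetical test spectrum: Gaussian line centered at 5, width 0.5."""
    return math.exp(-0.5 * ((sigma - 5.0) / 0.5) ** 2)


D_SIGMA, N_SIGMA = 0.01, 1001   # wavenumber grid covering 0 to 10
D_OPD, N_OPD = 0.005, 601       # OPD grid covering 0 to 3

sigmas = [k * D_SIGMA for k in range(N_SIGMA)]
opds = [k * D_OPD for k in range(N_OPD)]
spectrum_samples = [spectrum(s) for s in sigmas]


def interferogram(opd):
    """Half the cosine integral of the spectrum over positive wavenumbers."""
    integrand = [S * math.cos(2.0 * math.pi * s * opd)
                 for S, s in zip(spectrum_samples, sigmas)]
    return 0.5 * trapezoid(integrand, D_SIGMA)


interferogram_samples = [interferogram(x) for x in opds]


def recovered_spectrum(sigma):
    """Eight times the cosine integral of the interferogram over positive OPD."""
    integrand = [I * math.cos(2.0 * math.pi * sigma * x)
                 for I, x in zip(interferogram_samples, opds)]
    return 8.0 * trapezoid(integrand, D_OPD)


for sigma in (4.5, 5.0, 5.5):
    print(sigma, spectrum(sigma), recovered_spectrum(sigma))
```

The OPD range is chosen long enough for the interferogram of this test line to decay to a negligible level before the scan ends, which is why the recovered values track the input so closely here.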
4.17 Energy Flux in the Unbalanced Radiation Fields
As was explained in Chapter 1, one very common application of Michelson interferometers is as
infrared spectrometers. When interferometers are used to analyze infrared spectra, the relatively
warm optical elements used to shape and direct the signal beam, like any other type of warm
surface, can generate large amounts of infrared radiation. As we remarked when discussing the
operation of a standard interferometer in Sec. 4.11 above, the unbalanced radiation field from the
input beam is irrelevant because it goes back out the entrance aperture, never reaching the
detector; the same, however, cannot be said of the unbalanced signal from the infrared
background. Figure 4.25 shows background radiation from the detector side of the beam splitter
entering the interferometer to generate both a balanced and an unbalanced interference signal.
The balanced signal is a combination of two sets of rays, each set having been reflected once and
transmitted once at the beam splitter, while the unbalanced signal is a combination of two sets of
rays where one set has been transmitted twice at the beam splitter and one set has been reflected
twice at the beam splitter. Figure 4.25 traces this process through the interferometer, with the
balanced background signal represented by a combination of dash-dot and dashed rays and the
unbalanced signal represented by a combination of solid and dashed rays. The balanced
background signal travels out the input port, leaving the system; but the unbalanced background
signal is sent back to the detector, creating an unwanted optical signal. This unbalanced
background term is often a relatively large fraction of the total interference signal reaching the
detector. In well-designed interferometers, it can usually be eliminated by the same basic calibration procedures used in other infrared spectrometers (with a few special twists due to the Fourier nature of the spectral measurement; see Sec. 5.19 of Chapter 5), but when designing an interferometer we need to calculate the background spectrum and unbalanced interferogram because, as will be explained in Chapters 6, 7, and 8, it contributes to the noise contaminating the interferometer measurements.
The derivation of the unbalanced signal equations is very similar to the derivation of balanced
signal equations, with every major step in the unbalanced derivation having its counterpart in the
balanced derivation. Because we have just completed a detailed, step-by-step derivation of the
balanced signal equations, there is no need for an equally detailed derivation of the unbalanced
signal equations. What we do instead is to list, with a minimum of explanation, the important
equations of the unbalanced derivation, always specifying the equations in the balanced
derivation to which they correspond. This approach avoids repeating at length points already
covered during the balanced derivation while giving the interested reader enough information to
fill in the details if so inclined.
Just like before, we start with a monochromatic plane wave,

(back) (back) 2 ( )
0 0
Complex field ( )
i r ct
x y
E xE yE e

= +
G
(4.142a)
and

1 (back) (back) 2 ( )
0 0
Complex field ( )
i r ct
x y
B c yE xE e

=
G
. (4.142b)

We assume that intelligent efforts are made to control the background radiation from the warm
optical surfaces, so that, unless
is parallel to or nearly parallel to the optical axis z , the

background radiation cannot reach the detector. This means that, as was the case for the balanced
signal, only direction-chopped radiation at relatively small angles
b
can reach the detector.
Equations (4.142a) and (4.142b) correspond to (4.77a) and (4.77b) in the balanced derivation, and
the complex plane wave they specify represents radiation entering the interferometer from the
detector side of the system (neglecting terms of order $\theta_b$). Again we imagine an unfolded system of coordinates such as the one shown in Fig. 4.18 above, only now the coordinates are unfolded in such a way as to trace the unbalanced background signal, rather than the balanced input signal, into and out of the interferometer. Both ways of unfolding the interferometer end up specifying the same exit beam traveling to the detector. Therefore the x, y, z coordinate system
used for vectors
and r in Eqs. (4.142a) and (4.142b) is the same coordinate system as the one
located in the exit beam of the unfolded interferometer in Fig. 4.18. In this sense, the , , x y z
coordinate system used to specify
and r in (4.142a) and (4.142b) is the same as the , , x y z

coordinate system used to specify
and r in Eqs. (4.77a) and (4.77b).


FIGURE 4.25. Background radiance from the detector side of the interferometer. Labeled elements: fixed mirror, compensator plate, beam splitter, moving mirror, input side of the interferometer, detector side of the interferometer.

The plane wave specified in Eqs. (4.142a) and (4.142b) can be decomposed into two linearly polarized plane waves: one plane wave that has $E^{(back)}_{0x}$ for its complex amplitude and is linearly polarized perpendicular to the plane of incidence on the beam splitter, and one plane wave that has $E^{(back)}_{0y}$ for its complex amplitude and is linearly polarized parallel to the plane of incidence on the beam splitter. Tracing the background rays through Fig. 4.25, we find that the unbalanced radiation fields for the rays traveling out and back the moving-mirror arm are (again neglecting terms of order $\theta_b$)
2 [ ( ) ] (back) 2 ( ) (back) 2 ( )
0 0
Complex field
[ ]
d
i r z ct uv uv
M x s s y p p
E
r xE t yE t e

+
= +
G

(4.143a)
and

2 [ ( ) ] (back) 2 ( ) (back) 2 ( )
0 0
Complex field
[ ]
d
i r z ct uv uv M
x s s y p p
B
r
yE t xE t e
c

+
=
G
.
(4.143b)

Appendix 4F presents the tunnel diagrams used to construct the
( ) uv
parameters for the s-type
and p-type plane waves passing through the beam-splitter substrate and compensator plate; the
s p
t t , ,
M
r ,
d
, and variables all have the same meaning as before. Equations (4.143a) and
(4.143b) correspond to Eqs. (4.85a) and (4.85b) in the balanced derivation. Corresponding to Eqs.
(4.84a) and (4.84b), we have (neglecting terms of order
b
)

(back) 2 ( ) (back) 2 ( ) 2 ( )
0 0
Complex field
[ ]
uv uv i r ct
M x s s y p p
E
r xE r yE r e

= +
G

(4.144a)
and

(back) 2 ( ) (back) 2 ( ) 2 ( )
0 0
Complex field
[ ]
uv uv i r ct M
x s s y p p
B
r
yE r xE r e
c

=
G
.
(4.144b)
From the discussion following Eq. (4.83) above, we know that the amplitude reflection
coefficients for plane waves reflecting off the back side of the beam splitter are $(W r_s)$ and
$(W r_p)$, with $W = 1$ or $W = -1$ depending on the type of beam splitter being used. The W
parameter occurred in Eqs. (4.84a) and (4.84b) of the balanced derivation because the $r_s$ and
$r_p$ parameters appeared to the first power in the formulas. In Eqs. (4.144a) and (4.144b), on the
other hand, only the squares of $r_s$ and $r_p$ appear, which means, since $W^2 = 1$, that the W
parameter disappears. The formulas for the recombined, unbalanced fields corresponding to Eqs.
(4.88a) and (4.88b) in the balanced derivation are (neglecting terms of order
b
)


4 ( ) (back) 2 [ ] 2 ( ) 2 ( ) 2 ( )
0

4 ( ) (back) 2 [ ] 2 ( ) 2 ( ) 2 ( )
0
Complex unbalanced field
[ ]
[ ]
M
M
i n z r i r ct uv uv i z
M x s s s s
i n z r i r ct uv uv i z
M y p p p p
E
x r E e r t e e
y r E e r t e e

= +
+ +
G G
G G

(4.145a)
and

4 ( ) (back) 2 [ ] 2 ( ) 2 ( ) 2 ( )
0

4 ( ) (back) 2 [ ] 2 ( ) 2 ( ) 2 ( )
0
Complex unbalanced field
[ ]
[ ]
M
M
i n z r i r ct uv uv i z M
x s s s s
i n z r i r ct uv uv i z M
y p p p p
B
r
y E e r t e e
c
r
x E e r t e e
c

= +
+
G G
G G
.
(4.145b)

For future use, we note that Eqs. (4.89g)–(4.89k), (4.92a)–(4.92e), and (4.139a)–(4.139e) already
specify how $r_s$, $r_p$, $t_s$, $t_p$, and $r_M = r$ behave as functions of wavenumber $\sigma$;
and the $\gamma_{s,p}^{(uv)}$ can be set up to behave the same way the $\gamma_{s,p}^{(abc)}$ do in
Eqs. (4.97c) and (4.139f),

( ) ( )
( , ) ( , )
uv uv
s s

=
G G
and
( ) ( )
( , ) ( , )
uv uv
p p

=
G G
(4.145c)

with

$$ |\gamma_s^{(uv)}(-\sigma)|^2 = |\gamma_s^{(uv)}(\sigma)|^2 \quad\text{and}\quad |\gamma_p^{(uv)}(-\sigma)|^2 = |\gamma_p^{(uv)}(\sigma)|^2 . \tag{4.145d} $$

Equation (4F.2a) in Appendix 4F points out that, like the magnitudes of the
$\gamma_{s,p}^{(abc)}$ parameters, the magnitudes of the $\gamma_{s,p}^{(uv)}$ parameters are
functions only of wavenumber $\sigma$.
The next major step in the balanced derivation was to represent the radiation entering the
system by integrals over $dw$ and $d^2u$, as in Eqs. (4.103a) and (4.103b). We now do the same for
the background radiation entering the interferometer from the detector side of the system
(neglecting terms of order
b
),

(back)
2 (back) (back) 2 [ ]
2
( , , )
( , , ) ( , , )
i u wt
x y
E z t
c cu w cu w
dw d u x z y z e
w w c w c

+

= +

E E
G G
G
G
G G

(4.146a)

and


(back)
2 (back) (back) 2 [ ]
2
( , , )
1
( , , ) ( , , ) .
i u wt
x y
B z t
cu w cu w
dw d u y z x z e
w w c w c

+

=

E E
G G
G
G
G G

(4.146b)

We note that $\vec{E}^{(back)}$ and $\vec{B}^{(back)}$ must be real whereas
$\mathcal{E}_x^{(back)}$ and $\mathcal{E}_y^{(back)}$ are allowed to be complex. In addition,
$\mathcal{E}_x^{(back)}$ and $\mathcal{E}_y^{(back)}$ must satisfy all the symmetry relations that
$\mathcal{E}_x$ and $\mathcal{E}_y$ satisfied for the incident signal radiation entering the
interferometer [see, for example, Eqs. (4.100a) and (4.100b)],

(back) (back)
( , , ) ( , , )
x x
z z

= E E
G G
(4.146c)
and

(back) (back)
( , , ) ( , , )
y y
z z

= E E
G G
. (4.146d)

The total E and B fields for the unbalanced radiation traveling back to the detector are also
written as integrals over $dw$ and $d^2u$,

2
(unb)
2 2 [ ] (back) ( )
2
2
4
1
( )
2 2 (back)
( , , )
( ) ( , , ) ( , )
( ) ( ) ( , , ) [ ]
{
M
i u wt uv
x s
iw cu
iw
n z r
c w
c
s s y
p
E z t
c w cu w cu w
dw d u e r x z
w c w c w c
w w cu w
r t e e y z
c c w c

+

=

+ +

E
E
G G
G
G
G
G G
G

2
2
4
1
( )
( ) 2 2
( , ) ( ) ( ) [ ]}
M
iw cu
iw
n z r
c w uv
c
p p
cu w w w
r t e e
w c c c

+
G
G
(4.147a)
and

2
(unb)
2 2 [ ] (back) ( )
2
2
4
1
( )
2 2 (back)
( , , )
1
( ) ( , , ) ( , )
( ) ( ) ( , , ) [ ]
{
M
i u wt uv
x s
iw cu
iw
n z r
c w
c
s s y
p
B z t
w cu w cu w
dw d u e r y z
w c w c w c
w w cu w
r t e e x z
c c w c

+

=

+

E
E
G G
G
G
G
G G
G

2
2
4
1
( )
( ) 2 2
( , ) ( ) ( ) [ ]}
M
iw cu
iw
n z r
c w uv
c
p p
cu w w w
r t e e
w c c c

+
G
G
(4.147b)

These two equations correspond to Eqs. (4.104a) and (4.104b) in the balanced derivation
(neglecting terms of order
b
).
The unbalanced background signal from the warm optical surfaces can be thought of as
traveling to the detector from the beam splitter along the same ray paths as the balanced optical
signal; consequently, it ends up being processed by the system much the same way as the
balanced optical signal. For this reason, we now give T, A subscripts to
$\vec{E}^{(unb)}$, $\vec{B}^{(unb)}$, $\mathcal{E}_x^{(back)}$, and $\mathcal{E}_y^{(back)}$
to show that they also represent time-chopped and beam-chopped radiation fields. The
unbalanced E and B fields are time-chopped to the same 2T time interval as the balanced fields,
because the detector records both signal and background for the same length of time. Although
the effective cross-sectional area of the background beam is probably somewhat larger than that
of the input beam, in a well-designed system they are roughly the same size and can be
represented by the same symbol A. Again we treat
$\vec{E}^{(unb)}$, $\vec{B}^{(unb)}$, $\mathcal{E}_x^{(back)}$, and $\mathcal{E}_y^{(back)}$
as random quantities. Hence, the z component of the Poynting vector for the unbalanced
radiation fields is another random quantity given by

$$ \hat{z}\cdot\vec{S}_{TA}^{(unb)} = \frac{1}{\mu_o}\left(\vec{E}_{TA}^{(unb)} \times \vec{B}_{TA}^{(unb)}\right)\cdot\hat{z} . \tag{4.148a} $$

This corresponds to Eq. (4.122a) in the balanced derivation. The average radiant energy from the
unbalanced background reaching the interferometer's detector during a time interval 2T over a
beam cross-sectional area A is now

( )
2 (unb) (unb)
1
TA TA
o
dt d E B z

G G

E , (4.148b)

which corresponds to the right-hand side of (4.122b) in the balanced derivation. Adding T, A
subscripts to $\mathcal{E}_x^{(back)}$ and $\mathcal{E}_y^{(back)}$ in Eqs. (4.147a) and (4.147b),
and representing them as random quantities, we substitute the right-hand sides of these two
equations into the expression in (4.148b) to get, after a great deal of algebra, that


( )
2
2 2
2 4 4
( ) (back)
2
4 4
(
Average energy in unbalanced background signal over time 2 and beam cross section
( ) ( ) ( ) ( ) ( , , )
( ) ( )
[ ]
[ ]
{
uv
o s s s xTA
uv
p p p
T A
d
d r r t z
r t

= +
+ +

E
G

E
( )
( )
2 2
) (back)
2
2
2
2 2 ( ) (back) (back)
2
2
2 2 ( ) (back) (back)
( ) ( , , )
( ) ( ) ( ) ( ) ( , , ) ( , , )
( ) ( ) ( ) ( , , ) (
[
}
{
yTA
uv
o s s s xTA xTA
uv
p p p yTA yTA
z
d
d r r t z z
r t z

+
+

E
E E
E E
G
G
G G

G G

E
E
E
( )
( )
( )
2
2
2 1
2
2
2
2 2 ( ) (back) (back)
2
2
2 2 ( ) 2 1
, , )
( ) ( ) ( ) ( ) ( , , ) ( , , )
( ) ( ) ( ) ( , , ) ( , , )
]
[
]
}
{
}
i
uv
o s s s xTA xTA
uv i
p p p yTA yTA
z e
d
d r r t z z
r t z z e

+ +
+ +

E E
E E
G
G
G G

G
G G

E
E
(4.148c)

Here we have used Eqs. (4.102a)(4.102d) to transform the integrals over dw and
2
d u into
integrals over d and
2
d , and once again we have used 2( )
M
n z =
G
. Equation (4.148c)
corresponds to Eq. (4.134a) in the balanced derivation.
Following the pattern of Eqs. (4.120a), (4.120b), (4.135a), and (4.135b), we now write

( )
2
(back) (back)
2
( , , ) ( , )
2
o
xTA x
z
TA
E L
G
E , (4.149a)

( )
2
(back) (back)
2
( , , ) ( , )
2
o
yTA y
z
TA
E L
G
E , (4.149b)

( )
(back) (back) 2 2 (back)
( , , ) ( , , ) 2 ( , ) ( )
xTA xTA o x A
z z T c
E E L
G G
G G G

B E , (4.149c)
and

( )
(back) (back) 2 2 (back)
( , , ) ( , , ) 2 ( , ) ( )
yTA yTA o y A
z z T c
E E L
G G
G G G

B E . (4.149d)

As was the case for the balanced derivation [see Eqs. (4.121a) and (4.121b)],
$\mathcal{L}_x^{(back)}$ and $\mathcal{L}_y^{(back)}$ are even functions of $\sigma$,

(back) (back)
( , ) ( , )
x x
r o r o L L
G G
(4.149e)
and

(back) (back)
( , ) ( , )
y y
r o r o L L
G G
. (4.149f)

Glancing back at where these two functions came from, we see that they represent,
respectively, the x-polarized and y-polarized background optical power per unit area per unit solid
angle per unit $\sigma$ interval entering the interferometer from the detector side of the system.
Substitution of Eqs. (4.149a)–(4.149d) into (4.148c) gives

2
2 4 4
2 ( ) (back)
field of view
4
2 ( ) ( ) ( ) ( ) ( , )
( )
[ ]
[
{
uv
s s s x
p
T A
TA d d r r t
r t
o r o o o y o r o
o
+
+ +

L
G

2
2 4
( ) (back)
2
2
2 2 2 ( ) (back)
field of view
2
2 2 ( ) (back) 2 1
( ) ( ) ( , )
2 ( ) ( ) ( ) ( ) ( , )
1
( ) ( ) ( ) ( , ) ( )
2
]
[
] [ ]
}
{
}
uv
p p y
uv
s s s x
uv i
p p p y A
TA d d r r t
r t e
A
TA d
r o r
o y o r o
o r o o o y o r o
o o y o r o o
o
+
+ A
+

L
L
L
G
G
G
G

2
2
2
2 2 2 ( ) (back)
field of view
2
2 2 ( ) (back) 2 1
( ) ( ) ( ) ( ) ( , )
1
( ) ( ) ( ) ( , ) ( ) ,
[
] [ ]
{
}
uv
s s s x
uv i
p p p y A
d r r t
r t e
A
r o r
r o o o y o r o
o o y o r o o

+ A

L
L
G
G
G

(4.150)

where we use Eq. (4.1e) to replace
2
o o
c r by one and Eq. (4.134d) to replace ( )
A
u
G
by
( )
A
u

G
. This result corresponds to Eq. (4.135d) in the balanced derivation, except we have
anticipated the reasoning used to go from (4.135d) to (4.135e) by using the interferometers field
of view to set the limits on the double integral over
2
d r . Strictly speaking, this should be the
field of view for the unbalanced background radiation coming from the warm optical surfaces
between the beam splitter and the detector, but in a well-designed system the two fields of view
are roughly the same. In this formula, the second triple integral over d and
2
d r is the complex
conjugate of the third triple integral over d and
2
d r , ensuring that their sum is real. Since the
first triple integral is the integral of a real expression, evaluation of the right-hand side of (4.150)
produces a real numberwhich makes sense considering that this is the formula for the energy in
the unbalanced background signal.
To make further simplifications in this energy formula, we break from the pattern of the
balanced derivation and use Eq. (4.150) to represent a somewhat idealized interferometer with a
nonideal beam-splitter film. From this point on to the end of this section, we are not so much
analyzing a likely type of Michelson setup as we are constructing a thought experiment to
discover hidden properties of the beam-splitter amplitude-transmission and amplitude-reflection
coefficients $t_s$, $t_p$, $r_s$, and $r_p$. The first step is to set up the interferometer so that no
electromagnetic energy enters the system through the input port, for example, by having the
interferometer entrance aperture look at a chilled nonreflective surface. This means only
detector-side background radiation enters the system. To keep things simple, we first assume that
all single-pass s-type and p-type transmissions through the beam-splitter substrate and
compensator plate are equivalent, with every single-pass transmission characterized by complex
constants having the same magnitude $\gamma$. Now the $\gamma_{s,p}^{(abc)}$ terms correspond to
$\gamma^3$ and the $\gamma_{s,p}^{(uv)}$ terms correspond to $\gamma^2$ so that

$$ |\gamma_{s,p}^{(abc)}|^2 \cong \gamma^6 \quad\text{and}\quad |\gamma_{s,p}^{(uv)}|^2 \cong \gamma^4 . \tag{4.151a} $$

This lets us assume that only negligible amounts of optical power are lost passing through the
substrate and compensator plate by saying that $\gamma$ is approximately equal to one; similarly, we
say that only negligible amounts of optical power are lost by reflection off the fixed and moving
mirrors by saying that $r$ is approximately equal to one. These assumptions can be written as

$$ \gamma(\sigma) \cong 1 \quad\text{and}\quad r(\sigma) \cong 1 . \tag{4.151b} $$

In addition, the moving mirror is taken to be in perfect alignment with 0 =
G
so that [see Eq.
(4.137j)]

1 1
(0) (0) 1
A A
A A

= = . (4.151c)

To keep our thought experiment simple, we force the background radiation to be x-polarized and
confined to a very narrow solid angle $\Delta\Omega_{back}$, so that

(back)
( , ) 0
y
L
G
(4.151d)
and

(back) (xback) (xback)
back back
( , ) ( ) ( ) ( ) ( ) ( )
x x y
= L
G G
L L . (4.151e)

The double-sided power function
(xback)
( ) o L for o < < has units of optical power per unit
cross-sectional area per unit solid angle per unit interval. We note that the delta function ( ) o r
G

is explained in Sec. 2.25 of Chapter 2. It has units of inverse steradians, as can be seen from the
identity

2
( ) 1 d r o r

G
,
showing that the delta function when integrated over
2
d r (that is, integrated over a solid angle
containing 0 r
G
) always produces the dimensionless number one. In Eq. (4.151e), we drop the
dependence of the background optical power
(xback)
L on ( , )
x y
r r r
G
, using ( ) and ( )
x y
o r o r to
show that only the contribution from the on-axis direction is significant. Just like
$\mathcal{L}(\sigma)$ in Eq. (4.136f), the function $\mathcal{L}^{(xback)}(\sigma)$ must be even,

$$ \mathcal{L}^{(xback)}(-\sigma) = \mathcal{L}^{(xback)}(\sigma) . \tag{4.151f} $$

Although it is highly unlikely that an actual interferometer would have this sort of idealized
x-polarized background radiance, we can always arrange for an existing system to have this sort of
contaminating background without changing the properties of the interferometer's beam splitter.
Substitution of Eqs. (4.151a)–(4.151e) into (4.150) gives

4 4
(xback)
back
2 ( ) ( ) ( )
{ s s
T A
TA r t o o o

AO +

L
2 2 2 2 2 2
( ) ( ) ( ) ( )
}
i i
s s s s
e r t e r t d
r o r o
o o o o o

+ + .
(4.152)

We next consider what happens to the balanced, instead of the unbalanced, detector-side
background signal. Equation (4.135d), which specifies the energy in the balanced input signal,
can be adapted to describe the balanced background signal, but to do this we must analyze how
the balanced background signal differs from the balanced input signal. We note that $r_s$ and
$r_p$ in (4.135d) refer to an initial reflection of the beam coming from the input port that is off
the front side of the beam splitter, as shown in Fig. 4.16, whereas the balanced background signal
must, as shown in Fig. 4.25, have its initial reflection off the back side of the beam splitter.
Tracing the balanced background rays through the interferometer, we see that, compared to the
balanced input rays, front-side beam-splitter reflections are replaced by back-side beam-splitter
reflections and back-side beam-splitter reflections are replaced by front-side beam-splitter
reflections. We also note that rays pass through the compensator plate and beam-splitter substrate
a different number of times, but this does not matter because we take $\gamma(\sigma) \cong 1$ in our
idealized interferometer. From the discussion following Eq. (4.83) above, we know that if the
front-side reflection coefficients are $r_s$ and $r_p$, then the back-side reflection coefficients
are $W r_s$ and $W r_p$. This means that to convert the balanced input-signal derivation to the
balanced background-signal derivation, we need to convert all the $r_s$ and $r_p$ variables to
$W r_s$ and $W r_p$ whenever $r_s$ and $r_p$ refer to front-side reflection coefficients. What
about those times when $r_s$ and $r_p$ are already part of $W r_s$ and $W r_p$ products referring
to back-side reflection coefficients? To handle this situation, we note that when $W = -1$, making
the original back-side reflection coefficients $(-r_s)$ and $(-r_p)$, then $W^2 r_s$ and
$W^2 r_p$ return to us the front-side coefficients $r_s$ and $r_p$; and when $W = 1$, the back-side
and front-side coefficients are always equal and can be multiplied by as many powers of W as we
please. Therefore, if $W r_s$ and $W r_p$ refer to back-side reflection coefficients in the balanced
input-signal derivation, then $W^2 r_s$ and $W^2 r_p$ automatically convert the terms to the
desired front-side reflection coefficients. This shows that replacing the $r_s$ and $r_p$ variables
everywhere by $W r_s$ and $W r_p$ converts all front-side reflection terms to back-side reflection
terms and all back-side reflection terms to front-side reflection terms. Hence, Eq. (4.135d) can be
used to calculate the energy in the balanced background signal if $r_s$ and $r_p$ are replaced
everywhere by $W r_s$ and $W r_p$ (and, of course, $\mathcal{L}_x$ and $\mathcal{L}_y$ are
replaced by $\mathcal{L}_x^{(back)}$ and $\mathcal{L}_y^{(back)}$). The only values W can have are
$+1$ or $-1$, so as always $W^2 = 1$. Looking at Eq. (4.135d), we see that $r_s$ and $r_p$ only
enter the formula as $|r_s|^2$ and $|r_p|^2$, so replacing $r_s$ and $r_p$ by $W r_s$ and $W r_p$
does not change the equation. Therefore, all that needs to be done to adapt (4.135d) to the
balanced background signal using the approximations in (4.151d) is to set
$\gamma_{s,p}^{(abc)}(\sigma) = \gamma(\sigma) = r(\sigma) = 1$ and to replace $\mathcal{L}_x$ and
$\mathcal{L}_y$ by $\mathcal{L}_x^{(back)}$ and $\mathcal{L}_y^{(back)}$, which gives us

2
2
Average energy in balanced background over time 2 and beam cross section
4
( ) [
{
s
T A
TA d d
r t

2 2
2 2
2
(back) (back)
2 2
2 (back)
2 2
(back) 2 1 2 1
( ) ( , ) ( ) ( ) ( , )
2 ( ) ( ) ( , )
( ) ( ) ( , ) ( ) ( )
]
[
] [ ]
}
{
}
s x p p y
s s x
i i
p p y A A
r t
TW d d r t
r t e e

+
+
+ +

L L
L
L
G G
G
G G
G

(4.153)

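Substitution of the remaining idealizations and approximations in (4.151c)–(4.151e) into Eq.
(4.153) gives

$$ \begin{aligned} &\text{Average energy in balanced background over time } 2T \text{ and beam cross section } A \\ &\quad = 2TA\,\Delta\Omega_{back} \int_{-\infty}^{\infty} \mathcal{L}^{(xback)}(\sigma)\, |r_s(\sigma)|^2\, |t_s(\sigma)|^2 \left[ 2 + W e^{2\pi i \sigma \chi} + W e^{-2\pi i \sigma \chi} \right] d\sigma . \end{aligned} \tag{4.154} $$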

We now consider formulas (4.152) and (4.154) for the unbalanced and balanced background
energy. Although the background radiance while passing through the interferometer may have
some of its energy absorbed, by conservation of energy there is no way for its energy to
increase; consequently, the sum of (4.152) and (4.154) must be less than or equal to the total
x-polarized energy produced by the radiant background in time 2T for a beam of cross-sectional
area A and solid angle $\Delta\Omega_{back}$,

$$ 2TA\,\Delta\Omega_{back} \int_{-\infty}^{\infty} \mathcal{L}^{(xback)}(\sigma)\, d\sigma . \tag{4.155a} $$

Since $\mathcal{L}^{(xback)}$ is even [see Eq. (4.151f)], the total background energy entering the
interferometer is

$$ 4TA\,\Delta\Omega_{back} \int_{0}^{\infty} \mathcal{L}^{(xback)}(\sigma)\, d\sigma \tag{4.155b} $$

following the rule given in Eq. (2.19) of Chapter 2.
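The step from a double-sided integral to a single-sided one can be checked numerically. The following is a minimal sketch, assuming a hypothetical double-sided spectrum made of two mirrored Gaussian lines (the names `trapezoid` and `even_spectrum`, the line center, and the line width are illustrative choices, not taken from the text). It shows only that integrating an even function over a symmetric interval gives twice its integral over the non-negative half.

```python
import math

def trapezoid(f, a, b, n=20000):
    """Composite trapezoid rule for the integral of f over [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for k in range(1, n):
        total += f(a + k * h)
    return total * h

def even_spectrum(sigma, center=1000.0, width=50.0):
    """Hypothetical double-sided spectrum: Gaussian lines at +center and -center."""
    def line(s):
        return math.exp(-0.5 * ((s - center) / width) ** 2)
    return line(sigma) + line(-sigma)  # even in sigma by construction

# Integral over the symmetric interval versus twice the non-negative half.
two_sided = trapezoid(even_spectrum, -2000.0, 2000.0)
one_sided = 2.0 * trapezoid(even_spectrum, 0.0, 2000.0)
print(two_sided, one_sided)
```

The two printed numbers should agree to within the quadrature error, which is the content of the even-function rule used to pass from (4.155a) to (4.155b).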
We now add together the balanced and unbalanced energy, that is, the total energy leaving
the interferometer. The sum of the right-hand sides of (4.152) and (4.154) gives

4 4
(xback) 2 2 2
back
2 2 2 2
2 2 2 2
Total background energy over time 2 and beam cross section
2 ( ) ( ) ( ) ( ) ( )
( ) ( ) 2 ( ) ( ) ( ) ( )
{
i
s s s s
i i
s s s s s s
T A
TA r t r t e
r t e r t W r t e

= + +

+ + +

L
2 2
2
( ) ( )
}
i
s s
W r t e d

+ .
(4.156a)

We represent the complex scalars $r_s$ and $t_s$ by

$$ r_s(\sigma) = |r_s(\sigma)|\, e^{i\phi_{rs}(\sigma)} \tag{4.156b} $$
and
$$ t_s(\sigma) = |t_s(\sigma)|\, e^{i\phi_{ts}(\sigma)} \tag{4.156c} $$

for $\phi_{rs}(\sigma)$ and $\phi_{ts}(\sigma)$ defined to be real wavenumber-dependent angles
representing the phases of $r_s$ and $t_s$. Since $r_s(-\sigma) = r_s(\sigma)^*$ and
$t_s(-\sigma) = t_s(\sigma)^*$ from Eqs. (4.92b) and (4.92d), we must have

$$ \phi_{rs}(-\sigma) = -\phi_{rs}(\sigma) \tag{4.156d} $$
and
$$ \phi_{ts}(-\sigma) = -\phi_{ts}(\sigma) \tag{4.156e} $$

in (4.156b) and (4.156c), the defining equations for $\phi_{rs}(\sigma)$ and $\phi_{ts}(\sigma)$.
Substitution of (4.156b) and (4.156c) into (4.156a) gives

[ ]
2
2 2
(xback)
back
2 2
2 ( ) ( ) 2 2
2 ( ) ( ) ( )
( ) ( )
{
ts rs
s s
i i i
s s
T A
TA r t
r t e e e e
o o r o r o
o o o
o o

AO +

+ +

L
2 ( ( ) ( )) 2 2
}
ts rs
i i i
We We d
o o r o r o
o

+ +

.
(4.157a)

Equations (4.139b) and (4.139c) show that $|r_s(\sigma)|^2$ and $|t_s(\sigma)|^2$ are even functions
of $\sigma$, as is $\mathcal{L}^{(xback)}$ according to (4.151f). The term

( ) ( ) 2 ( ) ( ) 2 ( ) ( ) 2 2 2 2
ts rs ts rs
i i i i i i
e e e e We We
o o o o r o r o r o r o
+ + +

inside the integral is also even with respect to , because by (4.156d) and (4.156e)

( ) ( )
( ) ( )
2 ( ) ( ) 2 ( ) ( ) 2 ( ) 2 ( ) 2 ( ) 2 ( )
2 ( ) ( ) 2 ( ) ( ) 2 2 2 2
ts rs ts rs
ts rs ts rs
i i i i i i
i i i i i i
e e e e We We
e e e e We We

+ + +

+ + +

.

This means (4.157a) is an integral of an even expression between $-\infty$ and $+\infty$, so by
rule (2.19) in Chapter 2 it can also be written as

[ ]
2
2 2
(back)
back
0
2 2
2 ( ) ( ) 2 2 2
4 ( ) ( ) ( )
( ) ( )
{
ts rs
s s
i i i i
s s
T A
TA r t
r t e e e e
o o r o r o
o o o
o o

AO +

+ +

L
( ( ) ( )) 2 2
}
ts rs
i i
We We d
o o r o r o
o

+ +

.
(4.157b)
We know from formula (4.116a) in Sec. 4.14 above that

( )
2
2
(0, , )
o
xTA
z
E

is the average input energy, per unit wavenumber interval and per unit solid angle, that is entering
the interferometer during a time 2T and is carried by the x-polarized radiation field traveling in
the z direction at wavenumber . We note that the z in the argument list of
xTA
E
can be
disregarded because, as is mentioned at the end of Appendix 4C, the value of
2
xTA
E
does not
depend on z. Using the approximations introduced for
( )
,
( )
abc
s p
in (4.151a) above, we know from
the analysis in Appendix 4E that the effect of one transmission through the beam splitter is to
replace the monochromatic plane wavefield specified by (0, , )
xTA
z E
with the monochromatic

plane wavefield specified by (0, , )
s xTA
t z E
. Hence, the average energy, per unit wavenumber

interval per unit solid angle, that passes through the beam splitter during a time 2T and is carried
by the x-polarized radiation field traveling in the z direction at wavenumber is

( ) ( )
( )
2 2
2 2
2 2
2
2
2
( ) ( ) (0, , ) ( ) ( ) (0, , )
( ) (0, , ) ,
o o
s xTA s xTA
o
s xTA
t z t z
t z

E E
E

E E
E

where in the last step (4.151b) is used to drop $\gamma$ from the formula. This result shows why
$|t_s(\sigma)|^2$ is called the power transmission coefficient for x-polarized radiation. Using
similar reasoning, the
effect on the plane waves of one reflection from the beam splitter is to replace (0, , )
xTA
z E
by
2
(0, , )
s xTA
r z E
. Hence, the formula for the average energy, per unit wavenumber interval and
per unit solid angle, that is carried by the x-polarized radiation reflected off the beam splitter in
time 2T is

( ) ( )
( )
2 2
4 2
2
2 2
2
2
2
( ) ( ) (0, , ) ( ) ( ) (0, , )
( ) (0, , )
o o
s xTA s xTA
o
s xTA
r z r z
r z

E E
E

.
E E
E

This shows why $|r_s(\sigma)|^2$ is called the power reflection coefficient for x-polarized
radiation. Although the beam-splitter substrate can absorb energy (a process now being neglected
by taking $\gamma$ about equal to one), a well-designed beam splitter has only negligible absorption
in the thin film where the partial transmission and reflection of the interferometer beam occurs.
This means, by conservation of energy, that

( )
( ) ( )
( )
2
2
2 2
2 2
2 2
2
2 2
2
(0, , )
( ) (0, , ) ( ) (0, , )
( ( ) ( ) ) (0, , )
o
xTA
o o
s xTA s xTA
o
s s xTA
z
t z r z
t r z
= +
= +
E
E E
E

E
E E
E

or

$$ |t_s(\sigma)|^2 + |r_s(\sigma)|^2 = 1 . \tag{4.157c} $$

Substitution of this conclusion back into (4.157b) gives

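$$ \begin{aligned} &\text{Total background energy over time } 2T \text{ and beam cross section } A \\ &\quad = 4TA\,\Delta\Omega_{back} \int_{0}^{\infty} \mathcal{L}^{(xback)}(\sigma) \Big[ 1 + 2\,|r_s(\sigma)|^2 |t_s(\sigma)|^2 \big\{ \cos[2\pi\sigma\chi + 2(\phi_{ts}(\sigma) - \phi_{rs}(\sigma))] + W \cos(2\pi\sigma\chi) \big\} \Big]\, d\sigma , \end{aligned} \tag{4.157d} $$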

where $e^{i\theta} = \cos\theta + i\sin\theta$ is used to reduce the complex exponentials to a sum
of cosines. For an ideal beam splitter

$$ |r_s(\sigma)|^2 = |t_s(\sigma)|^2 = 1/2 , $$

so $2\,|r_s(\sigma)|^2 |t_s(\sigma)|^2$ must also be about equal to 1/2 for a well-designed, nonideal
beam splitter; it obviously cannot be a small term. We now compare (4.157d) to formula (4.155b)
for the total energy produced by the radiant background. Unless the term inside the braces { } in
(4.157d) is identically zero for all values of $\sigma$, we can always construct an x-polarized
background spectrum $\mathcal{L}^{(xback)}$ that, for certain values of $\chi$, specifies more
energy leaving the interferometer in the balanced and unbalanced background signal than entered
the interferometer in (4.155b). Therefore, the term inside the braces { } must be identically zero
for all non-negative values of $\sigma$, which means that

$$ \cos[2\pi\sigma\chi + 2(\phi_{ts}(\sigma) - \phi_{rs}(\sigma))] + W \cos(2\pi\sigma\chi) = 0 . \tag{4.158a} $$

To make this happen, we require $\phi_{ts}(\sigma) - \phi_{rs}(\sigma) = \pm\pi/2$ for $W = 1$ and
$\phi_{ts}(\sigma) - \phi_{rs}(\sigma) = 0$ for $W = -1$. Of course, multiples of $\pi$ can be added
to these values because that does not change the value of the cosine in (4.157d). We can specify
both these conditions by the constraint

$$ e^{2i[\phi_{ts}(\sigma) - \phi_{rs}(\sigma)]} = -W . \tag{4.158b} $$

By Eqs. (4.156d) and (4.156e), this constraint holds true for all negative values of $\sigma$ if it
holds true for all non-negative values of $\sigma$, since W is real and

$$ e^{2i[\phi_{ts}(-\sigma) - \phi_{rs}(-\sigma)]} = \left( e^{2i[\phi_{ts}(\sigma) - \phi_{rs}(\sigma)]} \right)^{*} . $$

When this constraint is substituted back into Eq. (4.157b), the right-hand side reduces to

$$ 4TA\,\Delta\Omega_{back} \int_{0}^{\infty} \mathcal{L}^{(xback)}(\sigma) \left[ |r_s(\sigma)|^2 + |t_s(\sigma)|^2 \right]^2 d\sigma , $$

which, by substituting in (4.157c), is shown to be the same as the expression for the background
radiant energy given in (4.155b).
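The energy-conservation argument can be illustrated numerically. The sketch below is not the author's calculation; it assumes a lossless film (power transmission plus power reflection equal to one), models the unbalanced output as one beam reflected twice plus one beam transmitted twice, and models the balanced output as two beams that each see one reflection and one transmission, with the relative sign W between them. The function name `total_integrand` and the phase names `phi_rs`, `phi_ts` are hypothetical labels for the phases of the s-type reflection and transmission coefficients.

```python
import cmath
import math

def total_integrand(abs_r, phi_rs, phi_ts, W, sigma, chi):
    """Balanced plus unbalanced output per unit input spectrum (lossless film assumed)."""
    abs_t = math.sqrt(1.0 - abs_r ** 2)        # enforce |t|^2 + |r|^2 = 1
    r = abs_r * cmath.exp(1j * phi_rs)
    t = abs_t * cmath.exp(1j * phi_ts)
    opd_phase = cmath.exp(2j * math.pi * sigma * chi)
    # Unbalanced pair: one beam reflects twice (r^2), the other transmits twice (t^2).
    unbalanced = abs(r ** 2 + t ** 2 * opd_phase) ** 2
    # Balanced pair: each beam has one reflection and one transmission.
    balanced = (abs_r * abs_t) ** 2 * abs(1.0 + W * opd_phase) ** 2
    return unbalanced + balanced

# Phase differences satisfying exp(2i(phi_ts - phi_rs)) = -W conserve energy.
ok_w_plus = total_integrand(0.6, 0.3, 0.3 + math.pi / 2, +1, 1000.0, 0.00037)
ok_w_minus = total_integrand(0.6, 0.3, 0.3, -1, 1000.0, 0.00037)

# A phase difference violating the constraint can exceed the input energy.
violation = total_integrand(math.sqrt(0.5), 0.3, 0.3, +1, 1000.0, 0.0)
print(ok_w_plus, ok_w_minus, violation)
```

Under these assumptions the first two values stay at one for any wavenumber and path difference, while the third exceeds one, which is the contradiction the text uses to force the phase constraint.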
We have just seen that the background radiant energy is conserved for x-polarized
background radiation. Clearly, nothing stops us from now making the background energy
y-polarized and repeating the analysis. If we return to Eq. (4.150), now specifying that

(back)
( , ) 0
x
L
G
(4.159a)
and

(back) (yback) (yback)
back back
( , ) ( ) ( ) ( ) ( ) ( )
y x y
= L
G G
L L , (4.159b)

everything will proceed as before because all the properties used to get to (4.158b) for $r_s$ and
$t_s$ also hold true for $r_p$ and $t_p$. Having switched from x polarization to y polarization, we
define

$$ r_p(\sigma) = |r_p(\sigma)|\, e^{i\phi_{rp}(\sigma)} \tag{4.160a} $$
and
$$ t_p(\sigma) = |t_p(\sigma)|\, e^{i\phi_{tp}(\sigma)} \tag{4.160b} $$

for $\phi_{rp}(\sigma)$ and $\phi_{tp}(\sigma)$ real parameters representing the phase of $r_p$ and
$t_p$ as functions of $\sigma$. Again, these functions must be odd:

$$ \phi_{rp}(-\sigma) = -\phi_{rp}(\sigma) \tag{4.160c} $$
and
$$ \phi_{tp}(-\sigma) = -\phi_{tp}(\sigma) . \tag{4.160d} $$

We can show that $|t_p(\sigma)|^2$ is the power transmission coefficient of the beam splitter for
y-polarized waves and that $|r_p(\sigma)|^2$ is the power reflection coefficient of the beam splitter
for y-polarized waves so that

$$ |t_p(\sigma)|^2 + |r_p(\sigma)|^2 = 1 . \tag{4.160e} $$

As before, this leads to the final conclusion that

$$ e^{2i[\phi_{tp}(\sigma) - \phi_{rp}(\sigma)]} = -W \tag{4.160f} $$

for all positive and negative values of $\sigma$, allowing us to conserve energy for the y-polarized
background radiation passing through the interferometer.
These results certainly hold for the beam-splitter transmission and reflection coefficients
$t_s$, $t_p$, $r_s$, and $r_p$ in our thought experiment on an ideal interferometer, but what about
the $t_s$, $t_p$, $r_s$, and $r_p$ coefficients of a nonideal interferometer? The idealizations made
at the start of this analysis in Eqs. (4.151a)–(4.151c) are standard ways of improving the
performance of Michelson interferometers (decreasing substrate absorption, improving mirror
reflectivity, and correctly aligning the moving mirror) and in that sense are physically possible
modifications that can be made to the interferometer without changing the $t_s$, $t_p$, $r_s$, and
$r_p$ of the partially transmitting, partially reflecting beam-splitter film. Similarly, we can
imagine using polarizing filters and beam collimators to create an x-polarized or y-polarized
radiance field that is severely direction-chopped, and then switching the interferometer's entrance
and exit ports to create background radiation of the type specified in Eqs. (4.151d), (4.151e),
(4.159a), and (4.159b). This is also a procedure that does not affect the $t_s$, $t_p$, $r_s$, and
$r_p$ beam-splitter coefficients. Hence, our analysis strongly suggests that the constraints on
$t_s$, $t_p$, $r_s$, and $r_p$ in Eqs. (4.158b) and (4.160f), which are derived from these
idealizations, can be confidently applied to the nonideal system of Eq. (4.150). Concluding that
this is in fact the case, we substitute (4.158b) and (4.160f) into (4.150) to get


2
2 4 4
2 ( ) (back)
field of view
2 ( ) ( ) ( ) ( ) ( , ) [ ]
[
{
uv
s s s x
p
T A
TA d d r r t
r

= +
+

L
G

2
2 4 4
( ) (back)
2
2 2 2
2 ( ) (back)
field of view
2 2 2
( ) (back) 2 1
2
( ) ( ) ( ) ( , )
2 ( ) ( ) ( ) ( ) ( , )
1
( ) ( ) ( ) ( , ) ( )
2
]
[
] [ ]
}
{
}
uv
p p y
uv
s s s x
uv i
p p p y A
t
WTA d d r r t
r t e
A
WTA d d

L
L
L
G
G
G
G

2
2
2 2 2
( ) (back)
field of view
2 2 2
( ) (back) 2 1
( ) ( ) ( ) ( ) ( , )
1
( ) ( ) ( ) ( , ) ( )
[
] [ ]
{
}
uv
s s s x
uv i
p p p y A
r r t
r t e
A

+

L
L
G
G
G
.
(4.161a)

Dividing both sides by 2T to get an expression for $P_{unb}^{(back)}(\chi)$, the average power in
the unbalanced background signal for a beam of cross-sectional area A at an OPD value of $\chi$,
gives

(back)
2 4 4 2
2 ( ) (back)
field of view
4 4 2
( ) (back)
2 2 2
( ) (back)
P ( )
( ) ( ) ( ) ( ) ( , )
( ) ( ) ( ) ( , )
2 ( ) ( ) ( ) ( , ) [
{
unb
uv
s s s x
uv
p p p y
uv
s s s x
A d d r r t
r t
W
r t
A

= +

+ +

L
L
L
G
G
G

2 2 2
2 cos ( ) (back)
( ) ( ) ( ) ( , ) Re ( ) ]
}
i uv
p p p y A
r t e

+

L
G
G

(4.161b)

where
2
cos 1
= has the same meaning as in Eqs. (4.135f) above (it is the cosine of the
angle the propagation vector
2
1 z = +
G
makes with the z axis of the unfolded
interferometer). This integral clearly gives a real value for $P_{unb}^{(back)}$, as it should
because $P_{unb}^{(back)}$ is a real quantity.
4.18 Simplified Formulas Describing Unbalanced Background Radiation
There is usually no reason to treat the background radiation as anything other than unpolarized or
as having anything other than the same background spectrum everywhere inside the field of view.
Once again, we follow the pattern of the balanced derivation [see Eq. (4.136e)] and define

(back) (back) (back)
1
( , ) ( , ) ( )
2
x y
r o r o o L L
G G
L . (4.162a)

Here, $\mathcal{L}^{(back)}(\sigma)$ is the total background optical power per unit cross-sectional
area of the beam per unit solid angle per unit $\sigma$ interval. Just like $\mathcal{L}(\sigma)$ in
(4.136f), $\mathcal{L}^{(back)}$ is a double-sided power spectrum, making it an even function of
$\sigma$:

$$ \mathcal{L}^{(back)}(-\sigma) = \mathcal{L}^{(back)}(\sigma) . \tag{4.162b} $$

When there is negligible absorption in the partially transmitting and partially reflecting
beam-splitter film, Eqs. (4.157c) and (4.160e) require that

$$ \left[ |t_s(\sigma)|^2 + |r_s(\sigma)|^2 \right]^2 = 1 $$
and
$$ \left[ |t_p(\sigma)|^2 + |r_p(\sigma)|^2 \right]^2 = 1 . $$

We can use these equations to write

$$ |t_s(\sigma)|^4 + |r_s(\sigma)|^4 = 1 - 2\,|t_s(\sigma)|^2 |r_s(\sigma)|^2 $$
and
$$ |t_p(\sigma)|^4 + |r_p(\sigma)|^4 = 1 - 2\,|t_p(\sigma)|^2 |r_p(\sigma)|^2 , $$

which can be rearranged to get

$$ |t_s(\sigma)|^4 + |r_s(\sigma)|^4 + |t_p(\sigma)|^4 + |r_p(\sigma)|^4 = 2 - 2\left[ |t_s(\sigma)|^2 |r_s(\sigma)|^2 + |t_p(\sigma)|^2 |r_p(\sigma)|^2 \right] . \tag{4.162c} $$
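For reference, the algebra behind this step can be written out in full. The block below is a sketch with the wavenumber argument suppressed; it assumes only the two lossless-film relations for the s-type and p-type coefficients and uses no new symbols.

```latex
\begin{align*}
\left(|t_s|^2 + |r_s|^2\right)^2 = 1
  &\;\Longrightarrow\; |t_s|^4 + 2\,|t_s|^2|r_s|^2 + |r_s|^4 = 1
  \;\Longrightarrow\; |t_s|^4 + |r_s|^4 = 1 - 2\,|t_s|^2|r_s|^2 , \\
\left(|t_p|^2 + |r_p|^2\right)^2 = 1
  &\;\Longrightarrow\; |t_p|^4 + |r_p|^4 = 1 - 2\,|t_p|^2|r_p|^2 , \\
\text{sum of the two lines:}\quad
  |t_s|^4 + |r_s|^4 + |t_p|^4 + |r_p|^4
  &= 2 - 2\left(|t_s|^2|r_s|^2 + |t_p|^2|r_p|^2\right).
\end{align*}
```

For an ideal 50/50 film each product of power coefficients is 1/4, so both sides equal 1, which is a quick sanity check on the signs.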

The idealizations introduced in Eq. (4.151a) let Eq. (4.136i) be approximated as

$$ \eta(\sigma) \cong 2\,\gamma^6(\sigma)\, r^2(\sigma) \left[ |r_s(\sigma)|^2 |t_s(\sigma)|^2 + |r_p(\sigma)|^2 |t_p(\sigma)|^2 \right] , \tag{4.162d} $$

which can be substituted into (4.162c) to get

$$ |t_s(\sigma)|^4 + |r_s(\sigma)|^4 + |t_p(\sigma)|^4 + |r_p(\sigma)|^4 \cong 2 - \frac{\eta(\sigma)}{\gamma^6(\sigma)\, r^2(\sigma)} . \tag{4.162e} $$

Applying the idealizations in (4.151a) to (4.161b), and then substituting from Eqs. (4.162a),
(4.162d), and (4.162e) gives

(back)
2 4
2 (back)
2
field of view
2 cos
2
P ( )
( )
( ) 2 ( ) ( )
2
( )
( )
Re ( ) .
( )
{
}
unb
i
A
A
d d r
W
e
A

r
r o o
q o
o r o o o
o
q o
o
o
G

L (4.162f)

The next idealization is to give the background-radiance beam a circular cross section. From
the work done in the balanced derivation [see Eq. (4.137e)], the formula for
1
A
A
is then

1 ma
ma
circle of
radius
(4 )
1
( )
2
A
R
J R
A R
r o
o
r o

A

G
,

where $\theta_{ma}$ is the angle (in radians) between the surface normal vectors of the correctly
aligned and misaligned moving-mirror positions. From Eq. (4.137g), we know that

$$ \frac{J_1(4\pi\sigma R\,\theta_{ma})}{2\pi\sigma R\,\theta_{ma}} $$

has the same value at $-\sigma$ as it has at $+\sigma$, so we can discard the absolute value signs
and write

1 ma
ma
circle of
radius
(4 ) 1
( )
2
A
R
J R
A R
r o
o
r o
A
G
.
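The behavior of the misalignment factor for a circular beam can be checked numerically. This is a minimal sketch that assumes the factor has the form J1(x)/(x/2) with x equal to 4π times wavenumber times beam radius times misalignment angle; the names `bessel_j1`, `misalignment_factor`, and `theta_ma`, and the sample values, are illustrative only. The Bessel function is evaluated from its power series so the block needs only the standard library.

```python
import math

def bessel_j1(x, terms=40):
    """Power series for the Bessel function J1(x); adequate for moderate |x|."""
    total = 0.0
    for k in range(terms):
        total += ((-1) ** k) * (x / 2.0) ** (2 * k + 1) / (
            math.factorial(k) * math.factorial(k + 1))
    return total

def misalignment_factor(sigma, radius, theta_ma):
    """J1(4*pi*sigma*R*theta) / (2*pi*sigma*R*theta), with the limit 1 at zero argument."""
    x = 4.0 * math.pi * sigma * radius * theta_ma
    if x == 0.0:
        return 1.0
    return bessel_j1(x) / (x / 2.0)

# Even in wavenumber: the same value at -sigma and +sigma (sigma in 1/cm, radius in cm).
plus = misalignment_factor(1000.0, 2.0, 1.0e-4)
minus = misalignment_factor(-1000.0, 2.0, 1.0e-4)

# Approaches one as the misalignment angle shrinks toward zero.
nearly_aligned = misalignment_factor(1000.0, 2.0, 1.0e-10)
print(plus, minus, nearly_aligned)
```

Because J1 is an odd function of its argument, dividing by the odd argument gives an even, real ratio, which is why the absolute value signs can be dropped and why the factor tends to one for a well-aligned mirror.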
The J
1
Bessel function is always real when it has a real argument, so
1
A
A
must be real for a

circular cross section. This means that when this last expression is substituted into (4.162f), we
get

(back)
2 4
2 (back)
2
field of view
1 ma
2
ma
P ( )
( )
( ) 2 ( ) ( )
2
( )
(4 ) ( )
cos(2 cos )
2
( )
[ ] [ ]
{
}
unb
A
d d r
J R
W
R
q o
o r o o o
o
r o q o
ro o
r o
o

.
L (4.163a)

Equation (4.163a) corresponds to (4.137i) in the balanced derivation. Assuming the effective field
of view for the background radiance is sufficiently narrow that cos 1
r
o e , we can write (4.163a)
as

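$$ P_{unb}^{(back)}(\chi) = \frac{A\,\Delta\Omega}{2} \int_{-\infty}^{\infty} \mathcal{L}^{(back)}(\sigma) \left\{ 2\, r^2(\sigma)\, \gamma^4(\sigma) - \frac{\eta(\sigma)}{\gamma^2(\sigma)} - W\, \frac{J_1(4\pi\sigma R\,\theta_{ma})}{2\pi\sigma R\,\theta_{ma}}\, \frac{\eta(\sigma)}{\gamma^2(\sigma)} \cos(2\pi\sigma\chi) \right\} d\sigma . \tag{4.163b} $$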

This result corresponds to Eq. (4.138a). Again, represents the value of the integral over
2
d r .
This makes $\Delta\Omega$ the solid angle of the interferometer's effective field of view for the
unbalanced background signal, which should be, as pointed out in the discussion after Eq. (4.150),
about the same size as the interferometer's input field of view. We note that the entire product

$$ \mathcal{L}^{(back)}(\sigma) \left\{ 2\, r^2(\sigma)\, \gamma^4(\sigma) - \frac{\eta(\sigma)}{\gamma^2(\sigma)} - W\, \frac{J_1(4\pi\sigma R\,\theta_{ma})}{2\pi\sigma R\,\theta_{ma}}\, \frac{\eta(\sigma)}{\gamma^2(\sigma)} \cos(2\pi\sigma\chi) \right\} $$

is an even function of $\sigma$ if $\mathcal{L}^{(back)}$, $r^2$, $\gamma^2$, $\eta$,
$\cos(2\pi\sigma\chi)$, and

$$ \frac{J_1(4\pi\sigma R\,\theta_{ma})}{2\pi\sigma R\,\theta_{ma}} $$

are all even functions of $\sigma$. The cosine is always an even function, and Eq. (4.162b) shows
that $\mathcal{L}^{(back)}$ is even. The analysis following Eq. (4.138b) shows that $r^2$ and $\eta$
are also even functions of $\sigma$, and Eq. (4.137g) shows that
$(2\pi\sigma R\,\theta_{ma})^{-1} J_1(4\pi\sigma R\,\theta_{ma})$ is even. As for $\gamma^2$, the only
uncertainty left, we know that it must be an even function of $\sigma$ because, according to
(4.151a), it comes from idealized approximations for $|\gamma_{s,p}^{(abc)}|^2$ and
$|\gamma_{s,p}^{(uv)}|^2$ that are themselves, as shown in Eqs. (4.139f) and (4.145d), even
functions of $\sigma$. Hence Eq. (4.163b) can be written as [by applying formula (2.19) in
Chapter 2]

2 4
(back)
(back)
2
0
1 ma
2
ma
( )
P ( ) ( ) 2 ( ) ( )
2
( )
(4 ) ( )
cos(2 ) ,
2
( )
{
}
unb
A
r
J R
W d
R
q o
o o o
o
r o q o
ro o
r o
o
AO

L

(4.163c)
where we define

(back) (back)
( ) 2 ( ) for 0 o o o > L L . (4.163d)

We recognize from the discussion preceding Eq. (4.136d) that
(back)
L can be thought of as the
spectral radiance of the background radiation causing the unbalanced background signal.
Equation (4.163c), then, corresponds to Eq. (4.140a) in the balanced derivation.
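Our next idealization is to assume the interferometer is well aligned so that $\theta_{\mathrm{ma}} \to 0$.
Substitution of Eq. (4.137k), which states that

$$\lim_{\theta_{\mathrm{ma}} \to 0} \frac{J_1(4\pi\sigma R\,\theta_{\mathrm{ma}})}{2\pi\sigma R\,\theta_{\mathrm{ma}}} = 1,$$

into (4.163c) gives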

2 4
(back)
(back)
2
0
2
1 ( )
P ( ) S ( ) 2 ( ) ( )
2
( )
( )
cos(2 ) ,
( )
{
}
unb
r
W d
q o
o o o
o
q o
ro o
o

(4.164a)
where

(back) (back)
S ( ) ( ) A o o AOL (4.164b)

is the total, single-sided optical power per unit wavenumber interval entering the detector-side of
the interferometer as background radiation. This corresponds to Eq. (4.140b) in the balanced
derivation.
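
The well-aligned limit quoted from Eq. (4.137k) can be checked numerically. The sketch below is illustrative only: it evaluates $J_1$ from its standard power series using the Python standard library, and the function names and the sample wavenumber and beam radius are assumptions chosen for the example, not values taken from the text.

```python
import math

def bessel_j1(x, terms=30):
    """First-kind Bessel function J1(x) from its power series (adequate for small |x|)."""
    half = x / 2.0
    return sum(
        (-1) ** k * half ** (2 * k + 1) / (math.factorial(k) * math.factorial(k + 1))
        for k in range(terms)
    )

def misalignment_factor(sigma, radius, theta_ma):
    """J1(4*pi*sigma*R*theta_ma) / (2*pi*sigma*R*theta_ma), set to 1 at zero argument."""
    half_arg = 2.0 * math.pi * sigma * radius * theta_ma
    if half_arg == 0.0:
        return 1.0  # limiting value as theta_ma -> 0
    return bessel_j1(2.0 * half_arg) / half_arg

sigma = 1000.0  # wavenumber in cm^-1 (hypothetical sample value)
radius = 2.0    # beam radius in cm (hypothetical sample value)
for theta in (1e-4, 1e-5, 1e-6, 1e-7):
    # The factor approaches 1 as the misalignment angle shrinks.
    print(theta, misalignment_factor(sigma, radius, theta))
```

The same function also shows that the factor is unchanged when the sign of the misalignment angle (or of the wavenumber) is flipped, since both the numerator and the denominator are odd in that argument.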
The final idealization is to assume that

( ) ( ) ( ) 1 r y o o q o e e e
so that

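$$P^{(\mathrm{back})}_{\mathrm{unb}}(\chi) = \frac{1}{2}\,W \int_0^{\infty} S^{(\mathrm{back})}(\sigma)\,\bigl[1 - \cos(2\pi\sigma\chi)\bigr]\,d\sigma, \tag{4.164c}$$

which matches Eq. (4.140d) in the balanced derivation. We can then adopt the same convention
as most optical textbooks by setting $W = 1$ to get

$$P^{(\mathrm{back})}_{\mathrm{unb}}(\chi) = \frac{1}{2} \int_0^{\infty} S^{(\mathrm{back})}(\sigma)\,\bigl[1 - \cos(2\pi\sigma\chi)\bigr]\,d\sigma. \tag{4.164d}$$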

Separating out the signal component $I^{(\mathrm{back})}_{\mathrm{unb}}(\chi)$, which changes with $\chi$, gives

$$I^{(\mathrm{back})}_{\mathrm{unb}}(\chi) = -\frac{1}{2} \int_0^{\infty} S^{(\mathrm{back})}(\sigma)\cos(2\pi\sigma\chi)\,d\sigma, \tag{4.165a}$$

corresponding to Eq. (4.141a) in the balanced derivation. Function $I^{(\mathrm{back})}_{\mathrm{unb}}(\chi)$ is often called the
unbalanced background interferogram. It is difficult to imagine a procedure for recording the
balanced interferogram for the input optical signal that does not at the same time record the
unbalanced background interferogram; fortunately, there are several well-known calibration
methods discussed in Secs. 5.14 and 5.19 of Chapter 5 that can be used to measure and eliminate
the unbalanced background interferogram from interferometer data.
From (4.163d) and (4.164b), we have

S ( ) ( ) 2 ( ) for 0 A A o o o o AO AO > L L . (4.165b)

Because
(back)
L is an even function [see Eq. (4.162b)], we can easily extend the definition of
$S^{(\mathrm{back})}$ to negative values of $\sigma$ by saying that

$$S^{(\mathrm{back})}(-\sigma) = S^{(\mathrm{back})}(\sigma), \tag{4.165c}$$

so that it becomes another even function of $\sigma$. Now the product

$$S^{(\mathrm{back})}(\sigma)\cos(2\pi\sigma\chi)$$

is an even function of $\sigma$ and Eq. (2.19) of Chapter 2 can be used to write (4.165a) as

$$I^{(\mathrm{back})}_{\mathrm{unb}}(\chi) = -\frac{1}{4} \int_{-\infty}^{\infty} S^{(\mathrm{back})}(\sigma)\cos(2\pi\sigma\chi)\,d\sigma. \tag{4.165d}$$

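We also note that the product

$$S^{(\mathrm{back})}(\sigma)\sin(2\pi\sigma\chi)$$

is an odd function of $\sigma$. This means that

$$\int_{-\infty}^{\infty} S^{(\mathrm{back})}(\sigma)\,e^{-2\pi i\sigma\chi}\,d\sigma
= \int_{-\infty}^{\infty} S^{(\mathrm{back})}(\sigma)\cos(2\pi\sigma\chi)\,d\sigma - i\int_{-\infty}^{\infty} S^{(\mathrm{back})}(\sigma)\sin(2\pi\sigma\chi)\,d\sigma
= \int_{-\infty}^{\infty} S^{(\mathrm{back})}(\sigma)\cos(2\pi\sigma\chi)\,d\sigma$$

because the integral between $-\infty$ and $+\infty$ of any odd function such as

$$S^{(\mathrm{back})}(\sigma)\sin(2\pi\sigma\chi)$$

must be zero [see Eq. (2.17) in Chapter 2]. This last result can be combined with Eq. (4.165d) to
get

$$I^{(\mathrm{back})}_{\mathrm{unb}}(\chi) = -\frac{1}{4} \int_{-\infty}^{\infty} S^{(\mathrm{back})}(\sigma)\,e^{-2\pi i\sigma\chi}\,d\sigma, \tag{4.165e}$$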

corresponding to Eq. (4.141d) in the balanced derivation. Equation (4.165e), just like (4.141d) for
the balanced interferogram, shows that we can get the unbalanced background spectrum by taking
the appropriate Fourier transform of $I^{(\mathrm{back})}_{\mathrm{unb}}$. There are calibration procedures that can be used to
isolate the unbalanced background interferogram, giving us access to the unbalanced background
spectrum, but these measurements are usually of interest only to scientists and engineers trying to
improve the performance of poorly working interferometers.
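
The even/odd argument used to pass from the one-sided cosine integral to the two-sided complex exponential can be illustrated numerically. The following is a minimal sketch, assuming a made-up even spectrum (a Gaussian band mirrored about zero wavenumber) and a simple trapezoid rule; the function names, band parameters, and OPD value are hypothetical and serve only to exercise the identities.

```python
import cmath
import math

def background_spectrum(sigma, center=1000.0, width=50.0):
    """Hypothetical even spectrum S(sigma): a Gaussian band mirrored about sigma = 0."""
    return math.exp(-((abs(sigma) - center) / width) ** 2)

def trapezoid(func, lower, upper, steps):
    """Composite trapezoid rule; works for real- or complex-valued func."""
    h = (upper - lower) / steps
    total = 0.5 * (func(lower) + func(upper))
    for k in range(1, steps):
        total += func(lower + k * h)
    return total * h

chi = 0.003     # optical path difference in cm (illustrative)
limit = 1500.0  # integration cutoff in cm^-1; the band is negligible beyond it
steps = 60000

two_sided_exp = trapezoid(
    lambda s: background_spectrum(s) * cmath.exp(-2j * math.pi * s * chi),
    -limit, limit, steps)
two_sided_cos = trapezoid(
    lambda s: background_spectrum(s) * math.cos(2 * math.pi * s * chi),
    -limit, limit, steps)
one_sided_cos = trapezoid(
    lambda s: background_spectrum(s) * math.cos(2 * math.pi * s * chi),
    0.0, limit, steps // 2)

# The sine (imaginary) part integrates to zero because its integrand is odd,
# and the two-sided cosine integral is twice the one-sided one.
print(two_sided_exp.imag, two_sided_exp.real - two_sided_cos,
      two_sided_cos - 2.0 * one_sided_cos)
```

All three printed differences should sit at the level of floating-point round-off, which is the numerical counterpart of replacing the one-sided integral by half of the two-sided one and the cosine by the complex exponential.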

__________

This chapter starts with Maxwell's equations and ends up with detailed formulas for the
balanced and unbalanced optical power leaving the exit port of a standard Michelson
interferometer. The formulas account for imperfect reflection off the interferometer's end mirrors
as well as the reflection, transmission, and absorption characterizing nonideal beam splitters and
compensator plates. Along the way, we have learned how to characterize the optical beams
passing through interferometers as well as how to handle polarized input radiation, slightly
misaligned instruments, and an input spectrum that is nonuniform over the field of view. We have
also, and in the end perhaps most importantly, introduced the concept of spectral radiance to
describe the behavior of electromagnetic wavefields inside Michelson interferometers.


Appendix 4A
We define a complex vector $\vec{a}$ to be, for any three-dimensional Cartesian coordinate system
having $\hat{x}, \hat{y}, \hat{z}$ unit vectors along the $x, y, z$ Cartesian axes,

$$\vec{a} = \hat{x}\,a_x + \hat{y}\,a_y + \hat{z}\,a_z, \tag{4A.1}$$

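where $a_x, a_y, a_z$ are three complex scalars. Using the subscript $r$ to denote a complex scalar's real
part and the subscript $i$ to denote the complex scalar's imaginary part, we have

$$a_x = a_{rx} + i\,a_{ix}, \tag{4A.2a}$$

$$a_y = a_{ry} + i\,a_{iy}, \tag{4A.2b}$$

$$a_z = a_{rz} + i\,a_{iz}, \tag{4A.2c}$$

for $i = \sqrt{-1}$. The $\hat{x}, \hat{y}, \hat{z}$ unit vectors themselves are taken to be real and can be written in
column-vector notation as

$$\hat{x} = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \qquad
\hat{y} = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \qquad
\hat{z} = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix},$$

which means the complex vector $\vec{a}$ can be written in column-vector notation as

$$\vec{a} = \begin{pmatrix} a_{rx} + i\,a_{ix} \\ a_{ry} + i\,a_{iy} \\ a_{rz} + i\,a_{iz} \end{pmatrix}.$$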

Many of the standard three-dimensional formulas for real vectors can be extended to complex
vectors without any difficulty. For example, we define the vector dot product of two complex
vectors $\vec{a}$ and $\vec{b}$ to be

$$\vec{a}\cdot\vec{b} = a_x b_x + a_y b_y + a_z b_z, \tag{4A.3a}$$

where $a_x b_x,\ a_y b_y,\ a_z b_z$ are the complex products of two complex scalars. Applying (4A.3a) to the
formulas for $\hat{x}, \hat{y}, \hat{z}$, and $\vec{a}$, we get

$$\vec{a}\cdot\hat{x} = \hat{x}\cdot\vec{a} = a_x, \qquad
\vec{a}\cdot\hat{y} = \hat{y}\cdot\vec{a} = a_y, \qquad
\vec{a}\cdot\hat{z} = \hat{z}\cdot\vec{a} = a_z \tag{4A.3b}$$

just like when $\vec{a}$ is a real vector. To make the length of a complex vector $\vec{a}$ a non-negative real
number, we define

$$|\vec{a}| = \sqrt{\vec{a}\cdot\vec{a}^{\,*}} \tag{4A.4}$$

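where $\vec{a}^{\,*}$, the complex conjugate of $\vec{a}$, is

$$\vec{a}^{\,*} = \hat{x}\,a_x^{*} + \hat{y}\,a_y^{*} + \hat{z}\,a_z^{*} \tag{4A.5}$$

or in column-vector notation

$$\vec{a}^{\,*} = \begin{pmatrix} a_{rx} - i\,a_{ix} \\ a_{ry} - i\,a_{iy} \\ a_{rz} - i\,a_{iz} \end{pmatrix}.$$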

The formula for the vector cross product of two complex three-dimensional vectors $\vec{a}$ and $\vec{b}$ is
also identical to the formula for the cross product of two real three-dimensional vectors,

$$\vec{a}\times\vec{b} = -\,\vec{b}\times\vec{a}
= \hat{x}\,(a_y b_z - a_z b_y) + \hat{y}\,(a_z b_x - a_x b_z) + \hat{z}\,(a_x b_y - a_y b_x). \tag{4A.6}$$
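
The definitions in Eqs. (4A.3a) through (4A.6) translate directly into a few lines of code. The sketch below uses plain Python tuples of complex numbers; the helper names (`dot`, `conjugate`, `length`, `cross`) and the sample vectors are illustrative assumptions. Note in particular that the dot product of (4A.3a) applies no complex conjugation, so a nonzero complex vector can have a zero dot product with itself, which is why the length in (4A.4) is defined with the conjugate.

```python
import math

def dot(a, b):
    """Dot product of Eq. (4A.3a): no complex conjugation is applied."""
    return sum(x * y for x, y in zip(a, b))

def conjugate(a):
    """Component-wise complex conjugate, Eq. (4A.5)."""
    return tuple(complex(x).conjugate() for x in a)

def length(a):
    """Length of Eq. (4A.4): sqrt(a . a*), always a non-negative real number."""
    return math.sqrt(dot(a, conjugate(a)).real)

def cross(a, b):
    """Cross product of Eq. (4A.6), same formula as for real vectors."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)

a = (1 + 0j, 0 + 1j, 0j)        # sample complex vector
b = (2 - 1j, 0.5 + 0j, 3j)      # second sample complex vector

print(dot(a, a))                 # zero, although a is not the zero vector
print(length(a))                 # sqrt(2)
print(cross(a, b), cross(b, a))  # equal and opposite
```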

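The well-known operations of vector calculus on real three-dimensional vector fields can also
be extended to fields of complex three-dimensional vectors. We define the $\vec{\nabla}$ operator in the
usual way,

$$\vec{\nabla} = \hat{x}\,\frac{\partial}{\partial x} + \hat{y}\,\frac{\partial}{\partial y} + \hat{z}\,\frac{\partial}{\partial z}, \tag{4A.7a}$$

so that for any complex scalar field $\psi$ we have

$$\vec{\nabla}\psi = \hat{x}\,\frac{\partial\psi}{\partial x} + \hat{y}\,\frac{\partial\psi}{\partial y} + \hat{z}\,\frac{\partial\psi}{\partial z}
= \hat{x}\left(\frac{\partial\psi_r}{\partial x} + i\,\frac{\partial\psi_i}{\partial x}\right)
+ \hat{y}\left(\frac{\partial\psi_r}{\partial y} + i\,\frac{\partial\psi_i}{\partial y}\right)
+ \hat{z}\left(\frac{\partial\psi_r}{\partial z} + i\,\frac{\partial\psi_i}{\partial z}\right), \tag{4A.7b}$$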

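where $\psi = \psi_r + i\,\psi_i$ for $\psi_r$ the real part of $\psi$ and $\psi_i$ the imaginary part of $\psi$. We know for any
real three-dimensional vector field $\vec{\nu} = \hat{x}\,\nu_x + \hat{y}\,\nu_y + \hat{z}\,\nu_z$ that

$$\vec{\nabla}\cdot\vec{\nu} = \frac{\partial\nu_x}{\partial x} + \frac{\partial\nu_y}{\partial y} + \frac{\partial\nu_z}{\partial z}. \tag{4A.8a}$$

For any complex vector field $\vec{a} = \hat{x}\,a_x + \hat{y}\,a_y + \hat{z}\,a_z$ we now define

$$\vec{\nabla}\cdot\vec{a} = \frac{\partial a_x}{\partial x} + \frac{\partial a_y}{\partial y} + \frac{\partial a_z}{\partial z}
= \left(\frac{\partial a_{rx}}{\partial x} + \frac{\partial a_{ry}}{\partial y} + \frac{\partial a_{rz}}{\partial z}\right)
+ i\left(\frac{\partial a_{ix}}{\partial x} + \frac{\partial a_{iy}}{\partial y} + \frac{\partial a_{iz}}{\partial z}\right). \tag{4A.8b}$$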

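Indeed, we can regard any complex vector field $\vec{a}$ as the complex sum of two real vector fields

$$\vec{a} = \vec{a}_r + i\,\vec{a}_i, \tag{4A.9a}$$

where the vector field's real component $\vec{a}_r$ is the real vector

$$\vec{a}_r = \hat{x}\,a_{rx} + \hat{y}\,a_{ry} + \hat{z}\,a_{rz} = \begin{pmatrix} a_{rx} \\ a_{ry} \\ a_{rz} \end{pmatrix}, \tag{4A.9b}$$

and the vector field's imaginary component $\vec{a}_i$ is the real vector

$$\vec{a}_i = \hat{x}\,a_{ix} + \hat{y}\,a_{iy} + \hat{z}\,a_{iz} = \begin{pmatrix} a_{ix} \\ a_{iy} \\ a_{iz} \end{pmatrix}. \tag{4A.9c}$$

Now we can treat $i$ like any other constant scalar to write

$$\vec{\nabla}\cdot\vec{a} = \vec{\nabla}\cdot(\vec{a}_r + i\,\vec{a}_i) = \vec{\nabla}\cdot\vec{a}_r + i\,\vec{\nabla}\cdot\vec{a}_i. \tag{4A.10a}$$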

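Equation (4A.10a) is the same as (4A.8b) and can be used instead of (4A.8b) to define $\vec{\nabla}\cdot\vec{a}$ for a
complex vector field $\vec{a}$ in terms of the already-understood $\vec{\nabla}\cdot$ operation applied to the real
vector fields $\vec{a}_r$ and $\vec{a}_i$. We know that the curl $\vec{\nabla}\times\vec{\nu}$ of any real, three-dimensional vector field
$\vec{\nu} = \hat{x}\,\nu_x + \hat{y}\,\nu_y + \hat{z}\,\nu_z$ is

$$\vec{\nabla}\times\vec{\nu} = \hat{x}\left(\frac{\partial\nu_z}{\partial y} - \frac{\partial\nu_y}{\partial z}\right)
+ \hat{y}\left(\frac{\partial\nu_x}{\partial z} - \frac{\partial\nu_z}{\partial x}\right)
+ \hat{z}\left(\frac{\partial\nu_y}{\partial x} - \frac{\partial\nu_x}{\partial y}\right).$$

Now for the curl of any complex vector field $\vec{a}$ we can write

$$\vec{\nabla}\times\vec{a} = \vec{\nabla}\times(\vec{a}_r + i\,\vec{a}_i) = \vec{\nabla}\times\vec{a}_r + i\,\vec{\nabla}\times\vec{a}_i, \tag{4A.10b}$$

which defines $\vec{\nabla}\times\vec{a}$ in terms of the curls $\vec{\nabla}\times\vec{a}_r$ and $\vec{\nabla}\times\vec{a}_i$ of two real vector fields $\vec{a}_r$ and $\vec{a}_i$.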
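We know that $\nabla^2\psi_r$ for any real scalar field $\psi_r$ is

$$\nabla^2\psi_r = \frac{\partial^2\psi_r}{\partial x^2} + \frac{\partial^2\psi_r}{\partial y^2} + \frac{\partial^2\psi_r}{\partial z^2}, \tag{4A.11a}$$

so that $\nabla^2\psi$ for any complex scalar field $\psi = \psi_r + i\,\psi_i$ becomes

$$\nabla^2\psi = \nabla^2\psi_r + i\,\nabla^2\psi_i
= \left(\frac{\partial^2\psi_r}{\partial x^2} + \frac{\partial^2\psi_r}{\partial y^2} + \frac{\partial^2\psi_r}{\partial z^2}\right)
+ i\left(\frac{\partial^2\psi_i}{\partial x^2} + \frac{\partial^2\psi_i}{\partial y^2} + \frac{\partial^2\psi_i}{\partial z^2}\right). \tag{4A.11b}$$

The standard definition of $\nabla^2\vec{\nu}$ for any real vector $\vec{\nu} = \hat{x}\,\nu_x + \hat{y}\,\nu_y + \hat{z}\,\nu_z$ is

$$\nabla^2\vec{\nu} = \hat{x}\,\nabla^2\nu_x + \hat{y}\,\nabla^2\nu_y + \hat{z}\,\nabla^2\nu_z. \tag{4A.11c}$$

For any complex vector field $\vec{a}$ we say that

$$\nabla^2\vec{a} = \hat{x}\,\nabla^2 a_x + \hat{y}\,\nabla^2 a_y + \hat{z}\,\nabla^2 a_z \tag{4A.11d}$$

for $a_x, a_y, a_z$, the three complex scalar fields that are the $x, y, z$ components of the complex
vector field $\vec{a}$. Equations (4A.11a), (4A.11b) and (4A.11d) when taken together define $\nabla^2\vec{a}$ for
any complex vector field $\vec{a}$. Note that we can also use

$$\nabla^2\vec{a} = \nabla^2\vec{a}_r + i\,\nabla^2\vec{a}_i \tag{4A.11e}$$

to define $\nabla^2\vec{a}$ in terms of $\nabla^2$ applied to the real vector fields $\vec{a}_r$ and $\vec{a}_i$.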
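If we have a constant complex vector $\vec{u}$ multiplied by a complex scalar field $\psi$, then

$$\vec{u} = \hat{x}\,u_x + \hat{y}\,u_y + \hat{z}\,u_z \quad\text{and}\quad
\vec{u}\,\psi = \hat{x}\,(u_x\psi) + \hat{y}\,(u_y\psi) + \hat{z}\,(u_z\psi),$$

where $u_x, u_y, u_z$ are constant complex scalars and $\psi$ is a complex scalar function of position. From
(4A.11d) we have

$$\nabla^2(\vec{u}\,\psi) = \hat{x}\,\nabla^2(u_x\psi) + \hat{y}\,\nabla^2(u_y\psi) + \hat{z}\,\nabla^2(u_z\psi)
= \hat{x}\,u_x\nabla^2\psi + \hat{y}\,u_y\nabla^2\psi + \hat{z}\,u_z\nabla^2\psi = \vec{u}\,\nabla^2\psi. \tag{4A.12a}$$

Another useful identity involving a constant complex vector $\vec{u}$ multiplied by a complex scalar
field $\psi$ comes from using Eq. (4A.8b) to simplify $\vec{\nabla}\cdot(\vec{u}\,\psi)$,

$$\vec{\nabla}\cdot(\vec{u}\,\psi) = \frac{\partial(u_x\psi)}{\partial x} + \frac{\partial(u_y\psi)}{\partial y} + \frac{\partial(u_z\psi)}{\partial z}
= u_x\frac{\partial\psi}{\partial x} + u_y\frac{\partial\psi}{\partial y} + u_z\frac{\partial\psi}{\partial z} = \vec{u}\cdot(\vec{\nabla}\psi). \tag{4A.12b}$$

Here we have used Eqs. (4A.3a) and (4A.7b) in the last step of (4A.12b). We also note that

$$\begin{aligned}
\vec{\nabla}\times(\vec{u}\,\psi)
&= \hat{x}\left(\frac{\partial(u_z\psi)}{\partial y} - \frac{\partial(u_y\psi)}{\partial z}\right)
+ \hat{y}\left(\frac{\partial(u_x\psi)}{\partial z} - \frac{\partial(u_z\psi)}{\partial x}\right)
+ \hat{z}\left(\frac{\partial(u_y\psi)}{\partial x} - \frac{\partial(u_x\psi)}{\partial y}\right) \\
&= \hat{x}\left(u_z\frac{\partial\psi}{\partial y} - u_y\frac{\partial\psi}{\partial z}\right)
+ \hat{y}\left(u_x\frac{\partial\psi}{\partial z} - u_z\frac{\partial\psi}{\partial x}\right)
+ \hat{z}\left(u_y\frac{\partial\psi}{\partial x} - u_x\frac{\partial\psi}{\partial y}\right) \\
&= (\vec{\nabla}\psi)\times\vec{u}.
\end{aligned} \tag{4A.12c}$$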

We define a complex vector $\vec{a} = \vec{a}_r + i\,\vec{a}_i$ to be orthogonal to a real vector $\vec{\nu}$ when

$$\vec{a}\cdot\vec{\nu} = \vec{\nu}\cdot\vec{a} = \vec{\nu}\cdot\vec{a}_r + i\,\vec{\nu}\cdot\vec{a}_i = 0. \tag{4A.13}$$

In (4A.13), both the real and imaginary components of the dot product, $\vec{\nu}\cdot\vec{a}_r$ and $\vec{\nu}\cdot\vec{a}_i$
respectively, must be zero. Equation (4A.13) requires that both $\vec{a}_r$ and $\vec{a}_i$ be perpendicular to $\vec{\nu}$
in the standard sense of real three-dimensional vectors. Another vector identity that holds true for
two real vectors $\vec{\nu}_a$, $\vec{\nu}_b$, and a complex vector $\vec{a} = \vec{a}_r + i\,\vec{a}_i$ is

$$\vec{\nu}_a\times(\vec{\nu}_b\times\vec{a}) = (\vec{\nu}_a\cdot\vec{a})\,\vec{\nu}_b - (\vec{\nu}_a\cdot\vec{\nu}_b)\,\vec{a}. \tag{4A.14}$$

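To justify (4A.14), we note that because

$$\vec{\nu}_1\times(\vec{\nu}_2\times\vec{\nu}_3) = (\vec{\nu}_1\cdot\vec{\nu}_3)\,\vec{\nu}_2 - (\vec{\nu}_1\cdot\vec{\nu}_2)\,\vec{\nu}_3$$

holds true for any real vectors $\vec{\nu}_1$, $\vec{\nu}_2$, $\vec{\nu}_3$, it follows that

$$\begin{aligned}
\vec{\nu}_a\times(\vec{\nu}_b\times\vec{a})
&= \vec{\nu}_a\times(\vec{\nu}_b\times\vec{a}_r) + i\,[\vec{\nu}_a\times(\vec{\nu}_b\times\vec{a}_i)] \\
&= (\vec{\nu}_a\cdot\vec{a}_r)\,\vec{\nu}_b - (\vec{\nu}_a\cdot\vec{\nu}_b)\,\vec{a}_r + i\,[(\vec{\nu}_a\cdot\vec{a}_i)\,\vec{\nu}_b - (\vec{\nu}_a\cdot\vec{\nu}_b)\,\vec{a}_i] \\
&= [\vec{\nu}_a\cdot(\vec{a}_r + i\,\vec{a}_i)]\,\vec{\nu}_b - (\vec{\nu}_a\cdot\vec{\nu}_b)(\vec{a}_r + i\,\vec{a}_i)
= (\vec{\nu}_a\cdot\vec{a})\,\vec{\nu}_b - (\vec{\nu}_a\cdot\vec{\nu}_b)\,\vec{a}.
\end{aligned}$$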

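Another useful formula comes from simplifying

$$(\vec{\nu}\times\vec{a})\cdot(\vec{\nu}\times\vec{a}^{\,*})$$

when $\vec{\nu}$ is a real three-dimensional vector, $\vec{a} = \vec{a}_r + i\,\vec{a}_i$ is a complex three-dimensional vector,
and $\vec{\nu}\cdot\vec{a} = 0$. Because $\vec{\nu}\cdot\vec{a} = 0$, we have [see Eq. (4A.13)]

$$\vec{\nu}\cdot\vec{a}_r + i\,\vec{\nu}\cdot\vec{a}_i = 0 \quad\text{or}\quad \vec{\nu}\cdot\vec{a}_r = \vec{\nu}\cdot\vec{a}_i = 0.$$

It follows that

$$\begin{aligned}
(\vec{\nu}\times\vec{a})\cdot(\vec{\nu}\times\vec{a}^{\,*})
&= [\vec{\nu}\times(\vec{a}_r + i\,\vec{a}_i)]\cdot[\vec{\nu}\times(\vec{a}_r - i\,\vec{a}_i)] \\
&= [(\vec{\nu}\times\vec{a}_r) + i\,(\vec{\nu}\times\vec{a}_i)]\cdot[(\vec{\nu}\times\vec{a}_r) - i\,(\vec{\nu}\times\vec{a}_i)] \\
&= (\vec{\nu}\times\vec{a}_r)\cdot(\vec{\nu}\times\vec{a}_r) + (\vec{\nu}\times\vec{a}_i)\cdot(\vec{\nu}\times\vec{a}_i).
\end{aligned} \tag{4A.15}$$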

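Because $\vec{\nu}$ and $\vec{a}_r$ are real, and because $\vec{\nu}$ and $\vec{a}_r$ are orthogonal (remember that $\vec{\nu}\cdot\vec{a}_r = 0$),
we know the length of $\vec{\nu}\times\vec{a}_r$ must be

$$|\vec{\nu}\times\vec{a}_r| = |\vec{\nu}|\,|\vec{a}_r|\sin(\pi/2) = |\vec{\nu}|\,|\vec{a}_r|,$$

where $|\vec{\nu}|$ is the length of $\vec{\nu}$, $|\vec{a}_r|$ is the length of $\vec{a}_r$, and the angle between $\vec{\nu}$ and $\vec{a}_r$ must be
$\pi/2$. Because the dot product of a real vector with itself gives the square of its length, we
conclude that

$$(\vec{\nu}\times\vec{a}_r)\cdot(\vec{\nu}\times\vec{a}_r) = |\vec{\nu}|^2\,|\vec{a}_r|^2 = (\vec{\nu}\cdot\vec{\nu})(\vec{a}_r\cdot\vec{a}_r).$$

Using similar reasoning, we find that

$$(\vec{\nu}\times\vec{a}_i)\cdot(\vec{\nu}\times\vec{a}_i) = |\vec{\nu}|^2\,|\vec{a}_i|^2 = (\vec{\nu}\cdot\vec{\nu})(\vec{a}_i\cdot\vec{a}_i).$$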
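Hence, Eq. (4A.15) can be written as

$$\begin{aligned}
(\vec{\nu}\times\vec{a})\cdot(\vec{\nu}\times\vec{a}^{\,*})
&= (\vec{\nu}\cdot\vec{\nu})\,[(\vec{a}_r\cdot\vec{a}_r) + (\vec{a}_i\cdot\vec{a}_i)] \\
&= (\vec{\nu}\cdot\vec{\nu})\,[(\vec{a}_r + i\,\vec{a}_i)\cdot(\vec{a}_r - i\,\vec{a}_i)] = (\vec{\nu}\cdot\vec{\nu})(\vec{a}\cdot\vec{a}^{\,*}),
\end{aligned} \tag{4A.16}$$

which is the formula we are looking for.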
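
Identities (4A.14) and (4A.16) are easy to spot-check with a few sample vectors. The following sketch is only a numerical illustration under assumed inputs: the helper names and the particular vectors are hypothetical, and the complex vector used for (4A.16) is deliberately chosen to be orthogonal to the real vector, as that identity requires.

```python
def dot(a, b):
    """Dot product without conjugation, as in Eq. (4A.3a)."""
    return sum(x * y for x, y in zip(a, b))

def conjugate(a):
    return tuple(complex(x).conjugate() for x in a)

def cross(a, b):
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)

# Check of (4A.16): nu is real, a is complex, and nu . a = 0.
nu = (0.0, 0.0, 2.0)
a = (1 + 1j, 2 - 1j, 0j)
lhs_16 = dot(cross(nu, a), cross(nu, conjugate(a)))
rhs_16 = dot(nu, nu) * dot(a, conjugate(a))

# Check of (4A.14): two real vectors and an arbitrary complex vector.
nu_a = (1.0, -2.0, 0.5)
nu_b = (0.3, 0.0, -1.0)
c = (1 + 2j, -1j, 3 + 0j)
lhs_14 = cross(nu_a, cross(nu_b, c))
rhs_14 = tuple(dot(nu_a, c) * vb - dot(nu_a, nu_b) * vc for vb, vc in zip(nu_b, c))

print(lhs_16, rhs_16)
print(lhs_14)
print(rhs_14)
```

For these inputs both sides of (4A.16) evaluate to 28, the product of the squared length of the real vector (4) and the squared length of the complex vector (7).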
For any complex scalar $\psi = \psi_r + i\,\psi_i$, we can use the notation

$$\mathrm{Re}(\psi) = \psi_r \tag{4A.17a}$$

and

$$\mathrm{Im}(\psi) = \psi_i \tag{4A.17b}$$

to specify the real and imaginary parts of $\psi$. Similarly, for any complex vector $\vec{a} = \vec{a}_r + i\,\vec{a}_i$, we
can use the notation

$$\mathrm{Re}(\vec{a}) = \vec{a}_r \tag{4A.17c}$$

and

$$\mathrm{Im}(\vec{a}) = \vec{a}_i \tag{4A.17d}$$

to specify the real and imaginary parts of $\vec{a}$. We define $\mathcal{L}_R$ to be a linear operator that, when
operating on a real three-dimensional scalar or vector field, creates another scalar or vector field
that is also real. We call $\mathcal{L}_R$ a real linear operator. When operating on a complex quantity, a real
linear operator $\mathcal{L}_R$ can return either a real or complex quantity; but when operating on a real
quantity, a real linear operator must return another real quantity. Because $\mathcal{L}_R$ is linear, we know
that

$$\mathcal{L}_R(\alpha\,\vec{a} + \beta\,\vec{b}) = \alpha\,\mathcal{L}_R(\vec{a}) + \beta\,\mathcal{L}_R(\vec{b}) \tag{4A.18}$$

for any two real or complex constant scalars $\alpha$, $\beta$ and any two real or complex vector fields $\vec{a}$, $\vec{b}$.
When dealing with scalar fields we need only remove all the vector signs from the linear-operator
formula in Eq. (4A.18). We note that the $\vec{\nabla}\cdot$, $\vec{\nabla}\times$, and $\partial/\partial t$ operators in Maxwell's equations
are all real linear operators, as are the $\partial^2/\partial t^2$ and $\nabla^2$ operators created by manipulation of
Maxwell's equations.
Many times we have to find real vector fields $\vec{a}_R$ and $\vec{b}_R$ that satisfy equations of the form

$$\mathcal{L}_1(\vec{a}_R) + \mathcal{L}_2(\vec{b}_R) = 0 \tag{4A.19a}$$

$$\mathcal{L}_3(\vec{a}_R) + \mathcal{L}_4(\vec{b}_R) = 0 \tag{4A.19b}$$

$$\vdots$$

for real linear operators $\mathcal{L}_1, \mathcal{L}_2, \mathcal{L}_3, \mathcal{L}_4, \ldots$. It is often easier to find two complex vector-field
solutions $\vec{a}$ and $\vec{b}$ such that

$$\mathcal{L}_1(\vec{a}) + \mathcal{L}_2(\vec{b}) = 0 \tag{4A.19c}$$

$$\mathcal{L}_3(\vec{a}) + \mathcal{L}_4(\vec{b}) = 0 \tag{4A.19d}$$

$$\vdots$$

than it is to find real vector fields $\vec{a}_R$ and $\vec{b}_R$ satisfying (4A.19a) and (4A.19b). For any real linear
operator $\mathcal{L}_R$ acting on a complex vector field $\vec{c} = \vec{c}_r + i\,\vec{c}_i$, with $\vec{c}_r$ and $\vec{c}_i$ the real and imaginary
parts of $\vec{c}$, we have

$$\mathcal{L}_R(\vec{c}) = \mathcal{L}_R(\vec{c}_r + i\,\vec{c}_i) = \mathcal{L}_R(\vec{c}_r) + i\,\mathcal{L}_R(\vec{c}_i).$$

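Both $\mathcal{L}_R(\vec{c}_r)$ and $\mathcal{L}_R(\vec{c}_i)$ must be real because they represent real linear operators acting on real
vector fields $\vec{c}_r$ and $\vec{c}_i$. Hence,

$$\mathrm{Re}\bigl(\mathcal{L}_R(\vec{c})\bigr) = \mathcal{L}_R(\vec{c}_r) = \mathcal{L}_R\bigl(\mathrm{Re}(\vec{c})\bigr) \tag{4A.20a}$$

and

$$\mathrm{Im}\bigl(\mathcal{L}_R(\vec{c})\bigr) = \mathcal{L}_R(\vec{c}_i) = \mathcal{L}_R\bigl(\mathrm{Im}(\vec{c})\bigr). \tag{4A.20b}$$

Although Re and Im are not themselves true linear operators, we do know that for any two
complex vector fields $\vec{u}$ and $\vec{v}$

$$\mathrm{Re}(\vec{u} + \vec{v}) = \mathrm{Re}(\vec{u}) + \mathrm{Re}(\vec{v}) \tag{4A.21a}$$

and

$$\mathrm{Im}(\vec{u} + \vec{v}) = \mathrm{Im}(\vec{u}) + \mathrm{Im}(\vec{v}). \tag{4A.21b}$$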

We can thus take the real and imaginary parts of (4A.19c) and (4A.19d), using (4A.21a) and
(4A.21b) to get

$$\mathrm{Re}[\mathcal{L}_1(\vec{a})] + \mathrm{Re}[\mathcal{L}_2(\vec{b})] = 0$$

$$\mathrm{Re}[\mathcal{L}_3(\vec{a})] + \mathrm{Re}[\mathcal{L}_4(\vec{b})] = 0$$

$$\vdots$$

and

$$\mathrm{Im}[\mathcal{L}_1(\vec{a})] + \mathrm{Im}[\mathcal{L}_2(\vec{b})] = 0$$

$$\mathrm{Im}[\mathcal{L}_3(\vec{a})] + \mathrm{Im}[\mathcal{L}_4(\vec{b})] = 0$$

$$\vdots$$

Equations (4A.20a) and (4A.20b) now give

$$\mathcal{L}_1\bigl(\mathrm{Re}(\vec{a})\bigr) + \mathcal{L}_2\bigl(\mathrm{Re}(\vec{b})\bigr) = 0 \tag{4A.22a}$$

$$\mathcal{L}_3\bigl(\mathrm{Re}(\vec{a})\bigr) + \mathcal{L}_4\bigl(\mathrm{Re}(\vec{b})\bigr) = 0 \tag{4A.22b}$$

$$\vdots$$

and

$$\mathcal{L}_1\bigl(\mathrm{Im}(\vec{a})\bigr) + \mathcal{L}_2\bigl(\mathrm{Im}(\vec{b})\bigr) = 0 \tag{4A.22c}$$

$$\mathcal{L}_3\bigl(\mathrm{Im}(\vec{a})\bigr) + \mathcal{L}_4\bigl(\mathrm{Im}(\vec{b})\bigr) = 0 \tag{4A.22d}$$

$$\vdots$$

Equations (4A.22a)–(4A.22d) show that both $\mathrm{Re}(\vec{a}), \mathrm{Re}(\vec{b})$ and $\mathrm{Im}(\vec{a}), \mathrm{Im}(\vec{b})$ are pairs of real
$\vec{a}_R, \vec{b}_R$ fields that satisfy Eqs. (4A.19a) and (4A.19b). We can thus solve sets of equations based
on real linear operators by allowing the proposed solutions to be complex vector fields, finding
formulas for these complex vector fields, and then, at the very end of the process, taking either
the real or imaginary part of the complex solutions to get the desired real solutions. When
following this procedure, it is customary to take the real rather than the imaginary parts of the
complex solutions to get the desired real solutions.
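
The procedure described above can be illustrated with the simplest possible real linear operator. The sketch below is an assumption-laden toy example, not taken from the text: it uses the scalar operator $d^2/dt^2 + \omega^2$ (approximated by a central finite difference), feeds it the complex solution $e^{i\omega t}$, and confirms that the real part of the complex solution also satisfies the equation, in the spirit of Eqs. (4A.20a) and (4A.22a).

```python
import cmath

OMEGA = 2.0  # angular frequency of the toy oscillator (illustrative value)

def oscillator_operator(field, t, h=1e-4):
    """Real linear operator L_R = d^2/dt^2 + OMEGA^2, using a central difference."""
    second_derivative = (field(t + h) - 2.0 * field(t) + field(t - h)) / h**2
    return second_derivative + OMEGA**2 * field(t)

def complex_solution(t):
    """Complex solution exp(i*OMEGA*t) of L_R(f) = 0."""
    return cmath.exp(1j * OMEGA * t)

def real_solution(t):
    """Real part of the complex solution, i.e. cos(OMEGA*t)."""
    return complex_solution(t).real

t0 = 0.7
residual_complex = oscillator_operator(complex_solution, t0)
residual_real = oscillator_operator(real_solution, t0)

# Both residuals vanish to within finite-difference error, and
# Re[L_R(c)] agrees with L_R(Re(c)).
print(abs(residual_complex), abs(residual_real),
      abs(residual_complex.real - residual_real))
```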
Appendix 4B

We must be careful when approximating the phase terms of interferometer equations because
phase changes can be significant while still being very small compared to the largest term in the
phase expression. Consider, for example, the expressions

$$S_{\mathrm{cmplx}} = e^{iA(1+\varepsilon)} \tag{4B.1a}$$

and

$$S_{\mathrm{real}} = \cos\bigl(A(1+\varepsilon)\bigr). \tag{4B.1b}$$

What are the constraints on the size of $\varepsilon$ such that both

$$S_{\mathrm{cmplx}} \cong e^{iA} \quad\text{and}\quad S_{\mathrm{real}} \cong \cos(A)$$

are good approximations of Eqs. (4B.1a) and (4B.1b)? At first glance, we might say that if
$\varepsilon \ll 1$, then the contribution of $\varepsilon$ to the phase expression $A(1+\varepsilon)$ can be neglected because no
matter what the size of $A$ the fractional error in the phase from neglecting the presence of $\varepsilon$ is

$$\frac{A(1+\varepsilon) - A}{A} = \varepsilon \ll 1.$$

Note, however, that when $A$ is very large we can write

$$A = 2\pi N + a$$

for some positive (or negative) integer $N$ and a non-negative real variable $a < 2\pi$. Because
$A(1+\varepsilon)$ is a phase, Eqs. (4B.1a) and (4B.1b) can be written as

$$S_{\mathrm{cmplx}} = e^{i(A + \varepsilon A)} = e^{i(2\pi N + a + \varepsilon A)} = e^{i(a + \varepsilon A)}$$

and

$$S_{\mathrm{real}} = \cos(A + \varepsilon A) = \cos(2\pi N + a + \varepsilon A) = \cos(a + \varepsilon A).$$

Now it looks like what matters is that $\varepsilon A$ be small compared to $a$. But all we are interested in is
the approximate value of $e^{i(a + \varepsilon A)}$ or $\cos(a + \varepsilon A)$. If $A$ is about equal to $2\pi N$ so that $a = A - 2\pi N$
is very small, making it about the same size or even smaller than the small value of $\varepsilon A$, then
$\varepsilon A$ can still be neglected as long as we can say

$$S_{\mathrm{cmplx}} = e^{iA(1+\varepsilon)} \cong e^{2\pi i N}$$

and

$$S_{\mathrm{real}} = \cos\bigl(A(1+\varepsilon)\bigr) \cong \cos(2\pi N).$$

For this reason, we adopt as our rule for neglecting $\varepsilon$ that the change in phase $\varepsilon A$ must be small
compared to the change in phase producing an $O(1)$ change in $\exp(iA(1+\varepsilon))$ or $\cos(A(1+\varepsilon))$.
This means $\varepsilon$ must satisfy

$$4\,|\varepsilon A| \ll 1 \tag{4B.1c}$$

before we can say that

$$e^{iA(1+\varepsilon)} \cong e^{iA}$$

or

$$\cos\bigl(A(1+\varepsilon)\bigr) \cong \cos(A).$$

Our rule of thumb, then, is to give both $A$ and $\varepsilon$ their extreme allowed values, maximizing $\varepsilon A$,
and after that to check to see whether the resulting maximum $\varepsilon A$ value satisfies (4B.1c). If it
does, we can be sure that (4B.1c) is also satisfied for all the nonextreme $\varepsilon A$ products, allowing us
to neglect $\varepsilon$ in Eqs. (4B.1a) and (4B.1b).
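
A short numerical experiment makes the point of this rule concrete: a tiny fractional change in a very large phase can still change the complex exponential by an amount of order one. The sketch below is illustrative only. The function names are hypothetical, and the `tolerance` argument is an assumed stand-in for the "much less than 1" in (4B.1c), not a threshold given in the text.

```python
import cmath
import math

def approximation_error(A, eps):
    """|exp(iA(1+eps)) - exp(iA)|: the error made by dropping eps from the phase."""
    return abs(cmath.exp(1j * A * (1.0 + eps)) - cmath.exp(1j * A))

def can_neglect(A, eps, tolerance=0.1):
    """Rule (4B.1c), with '<< 1' read as 'below an assumed tolerance'."""
    return 4.0 * abs(eps * A) < tolerance

A = 1.0e5  # a large phase, in radians
for eps in (1e-5, 1e-8):
    # eps is a tiny fraction of the phase in both cases, but only the
    # second case keeps eps*A small enough to leave the exponential unchanged.
    print(eps, eps * A, approximation_error(A, eps), can_neglect(A, eps))
```

With $\varepsilon = 10^{-5}$ the fractional phase error is minute, yet $\varepsilon A = 1$ radian and the exponential changes by nearly 1; with $\varepsilon = 10^{-8}$ the product $\varepsilon A$ is $10^{-3}$ and the change is of that same small size.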
We start our analysis with terms such as

$$e^{2\pi i\sigma\chi\,\hat{z}\cdot\hat{\Omega}},$$

where $\hat{\Omega}$ and $\sigma$ are respectively the propagation vector and wavenumber of a monochromatic
plane wave and $\chi$ is the OPD value of an interferometer. The beam passing through an
interferometer is direction-chopped, which means that all the plane waves have propagation
vectors that are parallel to, or nearly parallel to, $\hat{z}$, so $\hat{z}\cdot\hat{\Omega} \cong 1$. Does this mean that

$$e^{2\pi i\sigma\chi\,\hat{z}\cdot\hat{\Omega}} \cong e^{2\pi i\sigma\chi}\,?$$

We now show why this approximation does not work. Following the notation developed in
Sec. 4.12 above, we take $\theta_b$ to be the angle between $\hat{\Omega}$ and the $z$ axis. The types of
interferometers we are interested in have angles $\theta_b$ that are relatively small,

$$0 \le \theta_b \le \theta_{b\max} \le 10^{-2}\ \text{radians}, \tag{4B.2a}$$

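and measure infrared spectra over a range of wavenumbers

$$0 < \sigma_{\min} \le \sigma \le \sigma_{\max}. \tag{4B.2b}$$

In a well-designed interferometer $\chi_{\max}$, the largest possible absolute value of the OPD, or optical
path difference, must satisfy the inequality

$$\sigma_{\max}\,\chi_{\max}\,\theta_{b\max}^2 \le 1 \tag{4B.2c}$$

for accurate spectral measurements to occur.$^{72}$ As a general rule, interferometer designs attempt
to maximize the optical signal, which usually means making $\theta_{b\max}$ as large as possible.
Consequently, it makes sense to assume that

$$\sigma_{\max}\,\chi_{\max}\,\theta_{b\max}^2 \cong 1. \tag{4B.2d}$$

We know that

$$\hat{z}\cdot\hat{\Omega} = \cos\theta_b = 1 - \frac{\theta_b^2}{2} + \frac{\theta_b^4}{24} - \cdots \tag{4B.3a}$$

because angle $\theta_b$ is small. Substituting this into the phase $2\pi\sigma\chi\,\hat{z}\cdot\hat{\Omega}$ gives

$$2\pi\sigma\chi\,\hat{z}\cdot\hat{\Omega} = 2\pi\sigma\chi\cos\theta_b = 2\pi\sigma\chi\left(1 - \frac{\theta_b^2}{2} + \frac{\theta_b^4}{24} - \cdots\right).$$

Here, $2\pi\sigma\chi$ plays the role of $A$ [see discussion following Eq. (4B.1b)], and the terms $\theta_b^2/2$ and
$\theta_b^4/24$ play the role of $\varepsilon$. We first take $\varepsilon = \theta_b^4/24$ and note that the maximum expected value of
$\varepsilon A$ is

$$2\pi\sigma_{\max}\chi_{\max}\,(\theta_{b\max}^4/24) \cong \frac{1}{4}\,\theta_{b\max}^2,$$

where we have taken $\pi \cong 3$ and used (4B.2d). Inequality (4B.2a) then shows that

$$\varepsilon A \le \frac{1}{4}\times 10^{-4},$$

which, according to (4B.1c), is small enough to neglect. When, however, $\varepsilon = \theta_b^2/2$, it follows
that $\varepsilon A$ can be as large as

$$2\pi\sigma_{\max}\chi_{\max}\,(\theta_{b\max}^2/2) = \pi\,\sigma_{\max}\chi_{\max}\,\theta_{b\max}^2 \cong \pi,$$

which is obviously too large to discard. Hence, we must approximate $\cos\theta_b$ by

$$\cos\theta_b \cong 1 - \frac{\theta_b^2}{2} \tag{4B.3b}$$

when multiplied by $2\pi\sigma_{\max}\chi_{\max}$. We cannot take $\hat{z}\cdot\hat{\Omega} = \cos\theta_b \cong 1$ even though, according to
(4B.2a), the $O(\theta_b^2)$ term can be no larger than

$$5\times 10^{-5} \ll 1.$$

We conclude that $e^{2\pi i\sigma\chi\,\hat{z}\cdot\hat{\Omega}}$ cannot be approximated as $e^{2\pi i\sigma\chi}$ unless we are prepared to put stricter
limits on $\sigma$, $\chi$, and $\theta_b$.

$^{72}$ John Chamberlain, The Principles of Interferometric Spectroscopy (John Wiley and Sons, New York, 1979), pp. 220–222.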
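
The sizes of the two correction terms can be reproduced with a few lines of arithmetic. The sketch below is only a worked example under the extreme values quoted above: it assumes $\theta_{b\max} = 10^{-2}$ radians from (4B.2a) and sets $\sigma_{\max}\chi_{\max} = 1/\theta_{b\max}^2$ as in (4B.2d); the variable names are illustrative.

```python
import math

theta_b_max = 1.0e-2                    # radians, extreme value from (4B.2a)
sigma_chi_max = 1.0 / theta_b_max**2    # sigma_max * chi_max, from (4B.2d)

# Phase contributions of the second- and fourth-order terms of cos(theta_b).
quadratic_phase = 2.0 * math.pi * sigma_chi_max * theta_b_max**2 / 2.0
quartic_phase = 2.0 * math.pi * sigma_chi_max * theta_b_max**4 / 24.0

# Exact phase shift caused by replacing cos(theta_b) with 1.
exact_shift = 2.0 * math.pi * sigma_chi_max * (1.0 - math.cos(theta_b_max))

print(quadratic_phase)  # about pi radians: far too large to discard
print(quartic_phase)    # a few times 1e-5 radians: negligible
print(exact_shift)      # close to quadratic_phase - quartic_phase
```

The quadratic term shifts the phase by about half a cycle, while the quartic term contributes only a few hundred-thousandths of a radian, which is why (4B.3b) keeps the first and drops the second.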
We now consider a plane wave with a unit-length propagation vector $\hat{u}$ that is incident on the
flat moving mirror of a Michelson interferometer. When the moving mirror is correctly aligned,
its unit-length surface normal is $\hat{z}$, pointing approximately antiparallel to $\hat{u}$ as shown in Fig.
4B.1; and when the moving mirror is misaligned by a very small angle, its unit-length surface
normal is $\hat{n}_M$. The unit-length propagation vector of the plane wave reflected from the aligned
moving mirror is $\hat{\Omega}$, and the unit-length propagation vector of the plane wave reflected from the
misaligned moving mirror is $\hat{\Omega}_d$. We know that the angle between $\hat{\Omega}$ and $\hat{\Omega}_d$ is $\theta_d$, with $\theta_d$
much smaller than $\theta_b$ as shown in inequality (4.68) of Sec. 4.12 above. Since we are only
interested in finding the interferometer's measurement noise for small misalignment angles, we
say that $\theta_{d\max}$, the maximum expected value of $\theta_d$, satisfies

$$\frac{\theta_{d\max}}{\theta_{b\max}} \le 10^{-2}. \tag{4B.4a}$$

According to inequality (4B.2a) this means the largest we expect $\theta_d$ to become is

$$\theta_{d\max} \le 10^{-4}\ \text{radians}. \tag{4B.4b}$$

FIGURE 4B.1. [Diagram: the incident and reflected unit propagation vectors, each making angle $\theta_b$ with unit vector $\hat{z}$, at the reflective surface of the moving mirror.]

There is also a close connection between $\theta_{d\max}$ and the cross-sectional size of the interferometer's
beam. In a well-designed interferometer,$^{73}$

$$\sigma_{\max}\,D\,\theta_{d\max} \le 0.14, \tag{4B.4c}$$

where $D$ is the typical distance across the beam's cross-sectional area. If, for example, the beam
has a circular cross section, then $D$ is the circle's diameter.
Although, as shown in Fig. 4B.1, vectors $\hat{u}$, $\hat{z}$, and $\hat{\Omega}$ always lie in the same plane, there is
no reason to expect the surface normal $\hat{n}_M$ of the misaligned moving mirror, or the propagation
vector $\hat{\Omega}_d$ of the plane wave reflected off the misaligned moving mirror, also to lie in that plane.
We do, however, know that $\hat{u}$, $\hat{z}$, $\hat{\Omega}$, $\hat{n}_M$, and $\hat{\Omega}_d$ are all unit-length vectors. When we put the
bases or tails of vectors $\hat{z}$, $\hat{\Omega}$, $\hat{n}_M$, and $\hat{\Omega}_d$ at the same location, their tips always lie on the
surface of a sphere of unit radius; and if we put the tip of $\hat{u}$ together with the other four vectors'
bases, then the base of $\hat{u}$ lies on the surface of that same sphere. Because $\theta_b \le 10^{-2}$ radians
and $\theta_d \le 10^{-4}$ radians [see inequalities (4B.2a) and (4B.4b)] are very small angles, we can approximate
the sphere's curving surface near the tip of $\hat{z}$ as a plane, drawing the construction shown in Fig.
4B.2. Then, according to the law of specular reflection, the base of $\hat{u}$ lies on a straight line with
the tips of $\hat{z}$ and $\hat{\Omega}$, with the tip of $\hat{z}$ lying a distance $\theta_b$ from the base of $\hat{u}$ and the tip of $\hat{\Omega}$
lying a distance $\theta_b$ from the tip of $\hat{z}$. Similarly, the base of $\hat{u}$ lies on a straight line with the tips
of $\hat{n}_M$ and $\hat{\Omega}_d$, with the tip of $\hat{n}_M$ lying halfway between the base of $\hat{u}$ and the tip of $\hat{\Omega}_d$. Having defined, using
this flat-plane approximation, that the distance between the tips of $\hat{\Omega}$ and $\hat{\Omega}_d$ on the unit sphere
is angle $\theta_d$, we then know that the distance between the tips of $\hat{z}$ and $\hat{n}_M$ must be $\theta_d/2$. This
result comes from the similar triangle theorem: the triangle formed by the base of $\hat{u}$ and the tips
of $\hat{\Omega}$ and $\hat{\Omega}_d$ is twice the size of, and similar in shape to, the triangle formed by the base of $\hat{u}$
and the tips of $\hat{z}$ and $\hat{n}_M$.
We can define displacement vectors

$$\vec{\delta} = \hat{n}_M - \hat{z} \tag{4B.4d}$$

and

$$\vec{\Delta} = \hat{\Omega}_d - \hat{\Omega}, \tag{4B.4e}$$

with $\vec{\delta}$ the displacement vector from the tip of $\hat{z}$ to the tip of $\hat{n}_M$ and $\vec{\Delta}$ the displacement vector
from the tip of $\hat{\Omega}$ to the tip of $\hat{\Omega}_d$. According to these definitions, we have

$$|\vec{\Delta}| \cong \theta_d \tag{4B.4f}$$

and

$$|\vec{\delta}| \cong \frac{\theta_d}{2}. \tag{4B.4g}$$

Because the two displacement vectors point in the same direction, we can write, according to the
flat-plane approximation, $\vec{\Delta} = 2\vec{\delta}$. Since the flat-plane approximation is only approximately true,
we settle for

$$\vec{\Delta} \cong 2\,\vec{\delta}. \tag{4B.4h}$$

Angle $\phi$ gives the orientation of $\vec{\delta}$ with respect to the line joining the base of $\hat{u}$ to the tip of
$\hat{z}$; by changing the value of $\phi$, we change the shape of the two similar triangles, but we cannot
change the fact that they are similar.

$^{73}$ D. Cohen, "Performance Degradation of a Michelson Interferometer When Its Misalignment Angle Is a Rapidly Varying Time Series," Applied Optics 36, no. 18 (20 June 1997), pp. 4034–4042.

FIGURE 4B.2. [Diagram: flat-plane construction showing the unit vectors $\hat{u}$, $\hat{z}$, $\hat{n}_M$, $\hat{\Omega}$, $\hat{\Omega}_d$, the displacement vectors $\vec{\delta}$ and $\vec{\Delta}$, and the angles $\theta_b$, $\theta_d/2$, and $\theta_d$.] This diagram and Fig. 4B.3 go with the discussion following Eq. (4B.4c) in Appendix 4B. No matter where $\hat{u}$ is
put in this geometric construction, the angle between $\hat{\Omega}$ and $\hat{\Omega}_d$ is always twice the angle between the tips of
vectors $\hat{z}$ and $\hat{n}_M$. Note in Fig. 4B.3 how the angle between the tips of $\hat{\Omega}_1$, $\hat{\Omega}_{1d}$ and $\hat{\Omega}_2$, $\hat{\Omega}_{2d}$ is twice as
large as the angle between the tips of $\hat{z}$, $\hat{n}_M$ even though $\hat{u}_1$ and $\hat{u}_2$ are not the same vector.

FIGURE 4B.3. [Diagram: the same flat-plane construction drawn for two incident unit vectors $\hat{u}_1$ and $\hat{u}_2$ at angles $\theta_{b1}$ and $\theta_{b2}$, showing $\hat{z}$, $\hat{n}_M$, the reflected unit vectors $\hat{\Omega}_1$, $\hat{\Omega}_{1d}$, $\hat{\Omega}_2$, $\hat{\Omega}_{2d}$, and the associated displacement vectors.]

Figure 4B.3 shows another geometric fact worth noting. Holding vectors $\hat{z}$ and $\hat{n}_M$ fixed, we
consider two different propagation vectors $\hat{u}_1$ and $\hat{u}_2$ making two different angles $\theta_{b1}$ and $\theta_{b2}$
with respect to $\hat{z}$. The $\hat{u}_1$ plane wave reflects off the aligned and misaligned moving mirror with
propagation vectors $\hat{\Omega}_1$ and $\hat{\Omega}_{1d}$ respectively, and the $\hat{u}_2$ plane wave reflects off the aligned and
misaligned moving mirror with propagation vectors $\hat{\Omega}_2$ and $\hat{\Omega}_{2d}$ respectively. Using the flat-plane
approximation, we see that, because the displacement vector $\vec{\delta} = \hat{n}_M - \hat{z}$ does not
change, the similar triangle theorem forces the two displacement vectors $\vec{\Delta}_1 = \hat{\Omega}_{1d} - \hat{\Omega}_1$ and
$\vec{\Delta}_2 = \hat{\Omega}_{2d} - \hat{\Omega}_2$ to be equal. Realizing again that the flat-plane approximation is only
approximately true, we end up with

$$\hat{\Omega}_d - \hat{\Omega} \cong 2\,(\hat{n}_M - \hat{z}) \tag{4B.4i}$$

for all possible incident propagation vectors $\hat{u}$ (as long as the incident wave is part of the
field-chopped beam, propagating parallel to or nearly parallel to the $z$ axis).
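
The approximate relation (4B.4i) can be tested directly from the law of specular reflection. The sketch below is a hedged illustration: it assumes the standard reflection formula (reflected vector equals incident vector minus twice its projection on the mirror normal), and the helper names, the tilt of the mirror normal, and the azimuth `phi` are arbitrary choices made for the example.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def scale(a, factor):
    return tuple(factor * x for x in a)

def norm(a):
    return math.sqrt(dot(a, a))

def reflect(u, normal):
    """Specular reflection of unit propagation vector u off a mirror with unit normal."""
    return sub(u, scale(normal, 2.0 * dot(normal, u)))

theta_b = 1.0e-2   # angle of the incident wave from the z axis (radians)
tilt = 5.0e-5      # tilt of the mirror normal, i.e. theta_d / 2 (radians)
phi = 0.7          # azimuth of the tilt direction (arbitrary illustrative value)

z_hat = (0.0, 0.0, 1.0)
n_M = (math.sin(tilt) * math.cos(phi), math.sin(tilt) * math.sin(phi), math.cos(tilt))
u = (math.sin(theta_b), 0.0, -math.cos(theta_b))  # nearly antiparallel to z_hat

omega = reflect(u, z_hat)   # reflected from the aligned mirror
omega_d = reflect(u, n_M)   # reflected from the misaligned mirror

big_delta = sub(omega_d, omega)
small_delta = sub(n_M, z_hat)
mismatch = norm(sub(big_delta, scale(small_delta, 2.0)))

print(norm(big_delta), 2.0 * tilt, mismatch)
```

For these values the difference between the two sides of (4B.4i) is of order $\theta_b\theta_d$, roughly one percent of the size of the displacement itself, which is consistent with the relation being approximately rather than exactly true.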
We next consider whether the approximation $\vec{\Delta} \cong 2\vec{\delta}$, which is strictly true when the surface
of the unit sphere is treated as a plane, is accurate enough to use in the phase terms of Chapter 4.
Figure 4B.4 shows the orientation of vectors $\hat{u}$, $\hat{z}$, $\hat{n}_M$, $\hat{\Omega}$ and $\hat{\Omega}_d$ on the curved surface of a
unit sphere. We acknowledge the curvature of the sphere by drawing two straight lines $s$ and $s'$
perpendicularly from the shaft of $\hat{z}$ to the tips of $\hat{\Omega}$ and $\hat{\Omega}_d$ respectively. We also draw two arc
lengths $a$ and $a'$ running along the surface of the sphere from the tip of $\hat{z}$ to the tips of $\hat{\Omega}$ and
$\hat{\Omega}_d$ respectively. If we decrease $\phi$ while holding $\theta_b$ and $\theta_d$ constant, which shortens $a'$
and draws the tip of $\hat{\Omega}_d$ closer to the tip of $\hat{z}$, then the straight line $s'$ hits the shaft of $\hat{z}$ at a
point that gets closer to the tip of $\hat{z}$, increasing $\hat{z}\cdot\hat{\Omega}_d$, the distance from where $s'$ hits $\hat{z}$ to the
base of $\hat{z}$. Changing angle $\phi$ does not change $a$, $s$, or the value of $\hat{z}\cdot\hat{\Omega}$, the distance from
where $s$ hits $\hat{z}$ to the base of $\hat{z}$. Clearly, the smaller we can make $a'$ compared to $a$, the
greater is the difference between the values of $\hat{z}\cdot\hat{\Omega}_d$ and $\hat{z}\cdot\hat{\Omega}$. If instead of decreasing $\phi$ we
increase it past $\pi/2$, the point where $s'$ hits the shaft of $\hat{z}$ starts dropping, eventually going
below the point where $s$ hits the shaft of $\hat{z}$. Thus it is also true that as we increase $\phi$, the
difference between $\hat{z}\cdot\hat{\Omega}_d$ and $\hat{z}\cdot\hat{\Omega}$ eventually begins to increase. We conclude, then, that the
difference between $\hat{z}\cdot\hat{\Omega}_d$ and $\hat{z}\cdot\hat{\Omega}$ is maximized when $\phi$ is $0$ or $\pi$, making $|a - a'|$ a
maximum.
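The term $e^{2\pi i\sigma\chi(\hat{z}\cdot\hat{\Omega}_d)}$ first appears in Eqs. (4.86a) and (4.86b) in Sec. 4.12. We want to
maximize the difference between the phase term $2\pi\sigma\chi\,\hat{z}\cdot\hat{\Omega}_d$ and $2\pi\sigma\chi\,\hat{z}\cdot\hat{\Omega}$ to see if, even
when this difference is at a maximum, the latter can be used to approximate the former.
Therefore we take $\sigma = \sigma_{\max}$, $\chi = \chi_{\max}$ in the phase term to get

$$\begin{aligned}
\bigl|2\pi\sigma\chi\,\hat{z}\cdot\hat{\Omega}_d - 2\pi\sigma\chi\,\hat{z}\cdot\hat{\Omega}\bigr|
&\le 2\pi\sigma_{\max}\chi_{\max}\,\bigl|\hat{z}\cdot\hat{\Omega}_d - \hat{z}\cdot\hat{\Omega}\bigr| \\
&= 2\pi\sigma_{\max}\chi_{\max}\,\bigl|\cos(a') - \cos(\theta_b)\bigr| \\
&\le 2\pi\sigma_{\max}\chi_{\max}\,\bigl|\cos(\theta_b \mp \theta_d) - \cos(\theta_b)\bigr|,
\end{aligned}$$

where in the last step we say $\phi$ is $0$ or $\pi$ in Fig. 4B.4 to make $|\hat{z}\cdot\hat{\Omega}_d - \hat{z}\cdot\hat{\Omega}|$ a maximum. Of
course if $\phi$ is $0$ or $\pi$, then $\cos(a') \cong \cos(\theta_b \mp \theta_d)$.
Working now with the term inside the absolute value signs, we have, remembering that both
$\theta_b$ and $\theta_d$ are small with $\theta_d \ll \theta_b$ [see inequality (4.68)],

$$\begin{aligned}
\cos(\theta_b \mp \theta_d) - \cos(\theta_b)
&= \Bigl[1 - \tfrac{1}{2}(\theta_b \mp \theta_d)^2 + \tfrac{1}{24}(\theta_b \mp \theta_d)^4 - \cdots\Bigr]
 - \Bigl[1 - \frac{\theta_b^2}{2} + \frac{\theta_b^4}{24} - \cdots\Bigr] \\
&= \pm\,\theta_b\theta_d - \frac{\theta_d^2}{2} + O(\theta_b^3\theta_d),
\end{aligned}$$

so that

$$\bigl|\cos(\theta_b \mp \theta_d) - \cos(\theta_b)\bigr| \le \theta_{b\max}\theta_{d\max} + \frac{\theta_{d\max}^2}{2} + O(\theta_{b\max}^3\theta_{d\max}).$$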
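
FIGURE 4B.4. [Diagram: unit vectors $\hat{u}$, $\hat{z}$, $\hat{n}_M$ and the two reflected unit vectors on the curved surface of the unit sphere, with the two displacement vectors, the two arc lengths, the two perpendicular straight lines, and the angles $\theta_b$, $\theta_b$, and $\theta_d$.]
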
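Substituting this latest result into the previous inequality to set an upper bound on the size of the
phase difference, we get

$$\bigl|2\pi\sigma\chi\,\hat{z}\cdot\hat{\Omega}_d - 2\pi\sigma\chi\,\hat{z}\cdot\hat{\Omega}\bigr|
\le 2\pi\sigma_{\max}\chi_{\max}\Bigl[\theta_{b\max}\theta_{d\max} + \frac{\theta_{d\max}^2}{2}\Bigr]
+ O(\sigma_{\max}\chi_{\max}\theta_{b\max}^3\theta_{d\max}). \tag{4B.5a}$$

Substitution from (4B.2a), (4B.2d), and (4B.4b) gives

$$O(\sigma_{\max}\chi_{\max}\theta_{b\max}^3\theta_{d\max}) \cong O(\theta_{b\max}\theta_{d\max}) \le O(10^{-6}), \tag{4B.5b}$$

which according to (4B.1c) is small enough to neglect. Hence we can write (4B.5a) as

$$\bigl|2\pi\sigma\chi\,\hat{z}\cdot\hat{\Omega}_d - 2\pi\sigma\chi\,\hat{z}\cdot\hat{\Omega}\bigr|
\le 2\pi\sigma_{\max}\chi_{\max}\theta_{b\max}\theta_{d\max} + \pi\sigma_{\max}\chi_{\max}\theta_{d\max}^2. \tag{4B.5c}$$

We substitute from (4B.2c) to get

$$\bigl|2\pi\sigma\chi\,\hat{z}\cdot\hat{\Omega}_d - 2\pi\sigma\chi\,\hat{z}\cdot\hat{\Omega}\bigr|
\le 2\pi\,\frac{\theta_{d\max}}{\theta_{b\max}} + \pi\,\frac{\theta_{d\max}^2}{\theta_{b\max}^2}.$$

From inequality (4B.4a), the first term on the right-hand side is less than or equal to $2\pi\times 10^{-2}$, and
the second term on the right-hand side is less than or equal to $\pi\times 10^{-4}$. Both of these are,
according to (4B.1c), small enough to neglect. We conclude that the phase difference must itself be small enough
to neglect, letting us write

$$2\pi\sigma\chi\,\hat{z}\cdot\hat{\Omega}_d \cong 2\pi\sigma\chi\,\hat{z}\cdot\hat{\Omega}$$

or

$$e^{2\pi i\sigma\chi\,\hat{z}\cdot\hat{\Omega}_d} \cong e^{2\pi i\sigma\chi\,\hat{z}\cdot\hat{\Omega}} \tag{4B.5d}$$

for the phase terms of our interferometer equations.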
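
The bound on the phase difference can be evaluated for the extreme angles quoted in this appendix. The sketch below is a worked example under stated assumptions: it takes $\theta_{b\max} = 10^{-2}$ and $\theta_{d\max} = 10^{-4}$ radians, sets $\sigma_{\max}\chi_{\max} = 1/\theta_{b\max}^2$, and evaluates the cosine difference with the product formula $\cos(x - y) - \cos x = 2\sin(x - y/2)\sin(y/2)$ to avoid round-off; the variable names are illustrative.

```python
import math

theta_b_max = 1.0e-2                   # radians, from (4B.2a)
theta_d_max = 1.0e-4                   # radians, from (4B.4b)
sigma_chi_max = 1.0 / theta_b_max**2   # sigma_max * chi_max at the limit of (4B.2c)

# cos(theta_b - theta_d) - cos(theta_b), written as a product of sines.
cos_difference = 2.0 * math.sin(theta_b_max - theta_d_max / 2.0) * math.sin(theta_d_max / 2.0)
cos_bound = theta_b_max * theta_d_max + theta_d_max**2 / 2.0

# Largest phase difference and the two-term bound obtained from (4B.5c).
phase_difference = 2.0 * math.pi * sigma_chi_max * cos_difference
ratio = theta_d_max / theta_b_max
phase_bound = 2.0 * math.pi * ratio + math.pi * ratio**2

print(cos_difference, cos_bound)
print(phase_difference, phase_bound)
```

For these values the phase difference is a little over 0.06 radian, dominated by the $2\pi\,\theta_{d\max}/\theta_{b\max}$ term, with the second term smaller by two further orders of magnitude.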
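The phase term $e^{2\pi i\sigma\,\vec{r}\cdot(\hat{\Omega}_d - \hat{\Omega})}$ first appears in Eqs. (4.86a) and (4.86b) in Sec. 4.12. From the
definition $\vec{\Delta} = \hat{\Omega}_d - \hat{\Omega}$ in Eq. (4B.4e), we can write this phase term as $e^{2\pi i\sigma\,\vec{r}\cdot\vec{\Delta}}$. From the laws of
specular optics, or careful study of Figs. 4B.1 and 4B.4, we know that

$$\hat{\Omega} = \hat{u} - 2\,\hat{z}\,(\hat{z}\cdot\hat{u}) \tag{4B.6a}$$

for plane waves reflecting off the correctly aligned moving mirror, and

$$\hat{\Omega}_d = \hat{u} - 2\,\hat{n}_M\,(\hat{n}_M\cdot\hat{u}) \tag{4B.6b}$$

for plane waves reflecting off the misaligned moving mirror. The orientation of $\hat{u}$ and $\hat{\Omega}$ with
respect to $\hat{z}$ shows that

$$\hat{z}\cdot\hat{\Omega} = -\,\hat{z}\cdot\hat{u}, \tag{4B.6c}$$

and similarly the orientation of $\hat{u}$, $\hat{\Omega}_d$ with respect to $\hat{n}_M$ shows that

$$\hat{n}_M\cdot\hat{\Omega}_d = -\,\hat{n}_M\cdot\hat{u}. \tag{4B.6d}$$

Hence Eqs. (4B.6a-d) can be used to write the phase in $e^{2\pi i\sigma\,\vec{r}\cdot\vec{\Delta}}$ as

$$2\pi\sigma\,\vec{r}\cdot\vec{\Delta} = 2\pi\sigma\,\vec{r}\cdot(\hat{\Omega}_d - \hat{\Omega})
= 4\pi\sigma\,\vec{r}\cdot\bigl[\hat{n}_M(\hat{n}_M\cdot\hat{\Omega}_d) - \hat{z}\,(\hat{z}\cdot\hat{\Omega})\bigr]. \tag{4B.7a}$$

Remembering the definition $\vec{\delta} = \hat{n}_M - \hat{z}$ in Eq. (4B.4d), we will now demonstrate that the
rightmost expression in (4B.7a) can be approximated as $4\pi\sigma\,\vec{r}\cdot\vec{\delta}$, which turns (4B.7a) into

$$2\pi\sigma\,\vec{r}\cdot\vec{\Delta} \cong 4\pi\sigma\,\vec{r}\cdot\vec{\delta} = 4\pi\sigma\,\vec{r}\cdot[\hat{n}_M - \hat{z}]. \tag{4B.7b}$$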

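We start the demonstration by noting that

$$\begin{aligned}
4\pi\sigma\,\vec{r}\cdot\bigl[\hat{n}_M(\hat{n}_M\cdot\hat{\Omega}_d) - \hat{z}\,(\hat{z}\cdot\hat{\Omega})\bigr]
&= 4\pi\sigma\,\vec{r}\cdot\bigl[\hat{n}_M(\hat{z}\cdot\hat{\Omega}) - \hat{z}\,(\hat{z}\cdot\hat{\Omega}) + \hat{n}_M(\hat{n}_M\cdot\hat{\Omega}_d - \hat{z}\cdot\hat{\Omega})\bigr] \\
&= 4\pi\sigma\,\vec{r}\cdot\bigl[\hat{n}_M(\hat{z}\cdot\hat{\Omega}) - \hat{z}\,(\hat{z}\cdot\hat{\Omega})\bigr]
 + 4\pi\sigma\,(\vec{r}\cdot\hat{n}_M)(\hat{n}_M\cdot\hat{\Omega}_d - \hat{z}\cdot\hat{\Omega}).
\end{aligned} \tag{4B.8a}$$

The $4\pi\sigma\,(\vec{r}\cdot\hat{n}_M)(\hat{n}_M\cdot\hat{\Omega}_d - \hat{z}\cdot\hat{\Omega})$ term can be shown to be negligible by evaluating the upper
limit of its absolute value,

$$\bigl|4\pi\sigma\,(\vec{r}\cdot\hat{n}_M)(\hat{n}_M\cdot\hat{\Omega}_d - \hat{z}\cdot\hat{\Omega})\bigr|
= 4\pi\sigma\,|\vec{r}\cdot\hat{n}_M|\,\bigl|\hat{n}_M\cdot\hat{\Omega}_d - \hat{z}\cdot\hat{\Omega}\bigr|
\le 4\pi\sigma_{\max}\,|z|_{\max}\,\bigl|\hat{n}_M\cdot\hat{\Omega}_d - \hat{z}\cdot\hat{\Omega}\bigr|, \tag{4B.8b}$$

where we use that

$$\sigma \le \sigma_{\max} \quad\text{and}\quad |\vec{r}\cdot\hat{n}_M| \cong |\vec{r}\cdot\hat{z}| = |z|$$
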
because the $\hat{n}_M$ unit vector is tilted away from $\hat{z}$ by only a very small angle.$^{74}$ Although at first
we might suppose that $|z|$ can be indefinitely large, this is not the case. The maximum value of
$\theta_b$, which we called $\theta_{b\max}$ above, governs how much the interferometer beam's cross section
spreads as radiation travels through the interferometer. We are only interested in approximating
the phase terms for field points inside the interferometer, and if $|z|$ gets too large it represents
points outside the interferometer where the validity of our phase approximations is irrelevant. We
assume that in a well-designed interferometer the beam does not spread more than 5%, which
means the product $|z|_{\max}\theta_{b\max}$ satisfies the inequality

$$0 \le |z|_{\max}\,\theta_{b\max} \le \frac{D}{20} \quad\text{or}\quad |z|_{\max} \le \frac{D}{20\,\theta_{b\max}} \tag{4B.8c}$$

for $D$ having the same meaning as in inequality (4B.4c) above.
Figure 4B.5 is the same as Fig. 4B.4 only now we have left out a and s to avoid clutter, and
added a and s to represent the arc-length and straight-line separation of the tips of vectors
d
O
and
M
n . Following the same sort of reasoning used above to analyze the behavior of
d
z - O and
z - O [see discussion after Eq. (4B.4i)], we note that as o decreases to zero in Fig. 4B.5, the arc
length a eventually decreases, because as the tips of
M
n , z ,

and
d
O O , fall onto the same arc,
the tip of
M
n only goes about half as far toward the base of u as the tip of
d
O [see Eq. (4B.4h)].
This means that the point where s perpendicularly joins the shaft of
M
n approaches the tip of
M
n , increasing the value of
d M
n - O . While this happens, there is no change in the position where
s perpendicularly joins the shaft of z , so
z - O stays the same. Thus, for 0 o there is a

maximum in the value of
d M
n - O that, because
z - O stays the same, maximizes the expression

d M
n z - - O O .

When o increases to in Fig. 4B.5, arc length a eventually increases, because as the tips of
M
n , z ,

and
d
O O , fall onto the same arc, the tip of
d
O moves away from the base of u by
double the distance that
M
n does. This makes the point where s perpendicularly joins the shaft
of
M
n drop further from the tip of
M
n , decreasing the value of
d M
n - O . Consequently, o r
marks the other maximum in

74
This angle is approximately 2
d
, which is less than or equal to
5
5 10 radians
; see inequality (4B.4b).
FIGURE 4B.5. [Figure not reproduced. Its labels mark the arc lengths and straight-line separations denoted a and s, together with the unit vectors discussed in the text, including $\hat{n}_M$ and $\hat{z}$.]

d M
n z .

Hence the upper limit of the absolute value of the

4 ( )( )
M d M
r n n z
G
term in Eq.
(4B.8a) is given by, using Eq. (4B.8b),

max
max
max

4 ( )( ) 4
4 cos( ) cos
M d M d M
b
r n n z z n z
z a

G

To maximize cos( ) cos
b
a , we either maximize cos( ) a when cos( ) cos( )
b
a > at 0 = or
minimize cos( ) a when cos( ) cos( )
b
a > at = . When 0 = we have / 2
b d
a = so
( ) ( ) cos cos / 2
b d
a = . Similarly, when = , we have / 2
b d
a = + so
( ) ( ) cos cos / 2
b d
a = + . Hence the two possible maximums of cos( ) cos
b
a at 0 = and
= must each be less than or equal to cos( / 2) cos
b d b
. This latest expression can only
get larger when we stop dividing
d
by 2. Therefore we can write

max
max

4 ( )( ) 4 cos( ) cos
M d M b d b
r n n z z
G
. (4B.9a)

Inequality (4B.8c) can now be used to show that (approximating the cosine by its power series
because both
b
and
d
are small)

2
3
max
max
2
3 max
max
D
D

4 ( )( ) ( )
5 2
( )
5 2
[ ]
d
M d M b d b d
b
d
b d b d
b
r n n z O
O
+
+ +
G
B
.

Replacing
b
and
d
by
bmax
and
dmax
gives

2
2 max max max max
max max max
max
D D
D

4 ( )( )
( )
5 10
M d M
d d
b d
b
r n n z
O

+ +
G
.
(4B.9b)

Inequality (4B.4c) shows that


2 2 4
max max max max
D ( ) (0.14) 0.14 10
b d b
O

, (4B.9c)

where in the last step we used (4B.2a) to establish an upper bound on

2
max max max
D ( )
b d
O .

According to (4B.1c), this upper bound is small enough to neglect, so we can rewrite inequality
(4B.9b) as

2
max max max max
max
D D

4 ( )( )
neglectable terms
5 10
M d M
d d
b
r n n z

+ +
G
.
(4B.9d)

Again using inequality (4B.4c) and also (4B.4a), we write

2
3 max max max
max max
D 0.14
0.14 10
10 10
d d
b b

,

which is also, according to (4B.1c), small enough to neglect.
Applying inequality (4B.4c) yet again, now to the first term on the right-hand side of (4B.9d),
gives

max max
D 0.14
5 5
d

,

which is again small enough to neglect. Clearly, the left-hand side of inequality (4B.9d) must
always be small enough to neglect, allowing us to approximate Eq. (4B.8a) as

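\[
4\pi\sigma\,\vec{r}\cdot[\hat{n}_M(\hat{\Omega}_d\cdot\hat{n}_M)-\hat{z}(\hat{\Omega}\cdot\hat{z})]
\cong 4\pi\sigma\,\vec{r}\cdot[\hat{n}_M(\hat{\Omega}\cdot\hat{z})-\hat{z}(\hat{\Omega}\cdot\hat{z})]
= 4\pi\sigma\,\vec{r}\cdot(\hat{n}_M-\hat{z})\,(\hat{\Omega}\cdot\hat{z}) .
\tag{4B.10a}
\]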
We now write

2
cos 1
2
b
b
z

=

and substitute this into the rightmost expression in (4B.10a) to get

2

4 [ ( ) ( )] 4 ( ) 2 ( )
M d M M M b
r n n z z r n z r n z
G G G
. (4B.10b)
The second term on the right-hand side of (4B.10b) has an absolute value with an upper limit

2 2
max max
2
max max
2
max max
max max
2 ( ) 2 ( )
2
2
M b b M
b
b
r n z r n z
r
r
ro ro
ro y
ro y
- -
-
s
s
G G
G G
G G

.

We note that the last step here is really a gross overestimate of r y -
G G
because the two unit-length
vectors
M
n and z are almost parallel, making vectors r
G
and
M
n z y
G
almost perpendicular for
large values of r
G
. We estimate
max
r
G
by
max
z , writing that

2 2
max max
max max
2 max
max max
max
D
2 ( ) 2
2 ,
20 2
M b b
d
b
b
r n z z ro ro y
ro
- s
s
G G

where in the second step Eq. (4B.4g) is used to replace
max
y
G
by
max
2
d
, and inequality (4B.8c)
is used to replace
max
z by
max
( 20 )
b
D . Now we can use inequalities (4B.4c) and (4B.2a) to
write

2 2
max
0.14 0.14
2 ( ) 10
20 20
M b b
r n z
r r
ro

- s s
G
,

which is, according to (4B.1c), small enough to neglect. We conclude that the second term on the
right-hand side of (4B.10b) is small enough to ignore, giving

\[
4\pi\sigma\,\vec{r}\cdot[\hat{n}_M(\hat{\Omega}_d\cdot\hat{n}_M)-\hat{z}(\hat{\Omega}\cdot\hat{z})]
\cong 4\pi\sigma\,\vec{r}\cdot(\hat{n}_M-\hat{z}) .
\]

For the final step, we substitute this back into Eq. (4B.7a) to get

2 4 ( )
M
r r n z ro ro - - I e
G
G G

or
2 4 r r ro ro y - - I e
G
G G G
, (4B.10c)

where in the last step we use that
M
n z y
G
from Eq. (4B.4d). This shows that the
approximation in Eq. (4B.7b) holds true, which is what we set out to demonstrate. Since

d
I O O
G
[see Eq. (4B.4e) above], this result can also be written as

2 ( ) 4 ( )
d M
i r i r n z
e e

G G
. (4B.10d)

Before moving on, it is worth checking whether the phase term
4 ( )
M
i r n z
e

G
can be simplified
any further. Figure 4B.6 (see caption) shows that when the angle between the z and
M
n unit-
normal vectors is approximately 2
d
, as specified in Fig. 4B.2, then the deviation of vector

M
n z =
G
from being exactly perpendicular to z is approximately the angle 4
d
. If we
decompose
G
into a vector
G
that is exactly perpendicular to z and a vector
||
G
that is
antiparallel to z , we have

||

= +
G G G
(4B.11a)
with

2
d

G G
(4B.11b)
and

2
||
4 8
d d

G G
. (4B.11c)

Substitution of
||

M
n z
= = +
G G G
into (4B.10c) gives, remembering that

d
=
G
,

|| ||

2 ( ) 4 ( ) 4 4
d
r r r r

+ = +
G G G G G G G G
. (4B.12a)

The absolute value of
||
4 r
G G
has an upper limit

2
max
|| max || max
max max
max max
max max
D
D
4 4 4
8
0.14
,
2 20
d
d
b
r z z

G G G

where we have used (4B.11c), (4B.8c), and (4B.4c) to simplify the expression for the upper limit.
Clearing away common factors and using inequality (4B.4a) gives

2
||
4 (0.14 10 )
40
r

G G
,

which is, according to (4B.1c), small enough to neglect. Hence, (4B.12a) can be written as

FIGURE 4B.6. [Figure not reproduced.]

Angle is part of the right triangle whose hypotenuse is unit vector z , showing that
4 2
d
r
+ . The sum of angles o and is also
2
r
because y
1
is perpendicular z . Hence,
angle o must be equal to
4
d
.


2 ( ) 4
d
r r

G G G
(4B.12b)

with
G
being the component of
M
n z that is perpendicular to z . We note that the right-hand
side of (4B.12b) is too large to neglect. This expression can be as large as

max
D 2

G
,

where we remember that, when an interferometer beam has a circular cross section, D is its
diameter and the component of r
G
parallel to
G
can then be as large as D/2. From (4B.11b) and
(4B.4c), we see that
max
D 2

G
can be as large as

max
max max
D D 2 2 0.14 0.44
2
d
=
G
,

which, according to (4B.1c), is really too large to neglect. This is why we retain the term

2 ( )
d
r
G
on the left-hand side of (4B.12b) in the interferometer equations.
The final phase approximation that we need to justify is

2 1 ( cos 2 ) ( cos 2 ) 2 1 ( cos ) ( cos )
b b b b
i z z i z z
e e

G G
. (4B.13a)

We write the phase term on the left-hand side as

2 2

2 1 ( cos 2 ) 2 1 ( cos )
b b
z z K = +
G
. (4B.13b)

The absolute value of the difference term K has an upper bound

2 2
max max

2 1 ( cos 2 ) 1 ( cos )
b b
K z z
G
. (4B.13c)

Figure 4B.4 shows that
cos
b b
z and that

( )
cos 2
b
z
G

has its maximum and minimum values when is zero and respectively. These minimum and
maximum values will maximize the right-hand side of (4B.13c) because the term
2
1 ( cos )
b
z does not vary when angle changes. Since
cos
b b
z , 2
d

G
, and
d b
<< , the maximum and minimum values of
cos 2
b
z
G
are
b d
, so that we can
write

2 2 2 2
2 2
4
2
4

1 ( cos 2 ) 1 ( cos ) 1 ( ) 1
( )
1 (1 ) ( )
2 2
( )
2
b b b d b
b d b
b
d
b d b
z z
O
O

= +
= +
G

2
4 max
max max
( )
2
d
b d b
O
+ + .

This can be used in (4B.13c) to get

2
4 max
2 [ ] ( )
2
d
b d b
K O
+ + . (4B.13d)

We have already seen from the discussion following Eq. (4B.5c) that

2
max
max max max max
2
2
2
2
d
b d
b d d

+

= +

can be neglected. We note that substitution from (4B.2a) and (4B.2d) gives

4 2 4
max max max max
( ) ( ) 10
b b
O O

= ,

which, according to (4B.1c), is small enough to neglect. Hence everything on the right-hand side
of (4B.13d) is small enough to neglect, which means K can be dropped from (4B.13b), making
Eq. (4B.13a) a good approximation. From the definition of
G
in Eq. (4.54c), we can rewrite
(4.54a) to get, after applying Eq. (4.135f), that

2

1 cos
b
z z = =
G
.

From Eq. (4.126b) and (4B.4d), we get
2( ) 2
M
n z = =
G
G
.

These two formulas together with Eqs. (4.102a) and (4.102d) in Sec. 4.13 lead to

2 2
2
2
2
2
2 1 ( cos 2 ) 2 1 ( )
2 1 ( )
2 1 ( )
b
z
w c
u
c w
w c w
u
c w c

=
=
= +
G
G G
G
G
G
G

and

2
2 2
2
2 1 ( cos ) 2 1 ( )
b
w c
z u
c w
=
G
.

The phase approximation used in (4B.13a) now becomes, written in terms of w, c , u
G
, and
G
,

2 2 2
2
2 2
2 1 ( ) 2 1
w c w w c u
i u i
c c c w w
e e
+
G
G
. (4B.13e)
Appendix 4C
In this appendix, we apply the three-dimensional Wiener-Khinchin theorem explained in Sec.
3.24 of Chapter 3 to the random functions describing the radiation fields entering the
interferometer.
We specify function $\Pi(t, T)$ to be

\[
\Pi(t, T) =
\begin{cases}
1 & \text{for } |t| \le T \\
0 & \text{for } |t| > T
\end{cases} ,
\tag{4C.1a}
\]

and also define a two-dimensional version of this function to be $\Pi(\vec{\rho}; A) = \Pi(x, y; A)$ such that

\[
\Pi(\vec{\rho}; A) = \Pi(x, y; A) =
\begin{cases}
1 & \text{when point } (x, y) \text{ lies inside or on the edge of the beam of cross-sectional area } A \\
0 & \text{when point } (x, y) \text{ lies outside the beam of cross-sectional area } A
\end{cases} .
\tag{4C.1b}
\]

Function $\Pi(\vec{\rho}; A)$ can be thought of as a pupil function for the beam.$^{75}$ Function $\Pi(t, T)$
specifies the one-dimensional measurement time for the beam, and function $\Pi(\vec{\rho}; A)$ specifies
the two-dimensional cross section of the beam as it passes through the interferometer.
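The two chopping functions are simple indicator functions, which a short sketch makes concrete. The function names `chop_1d`, `chop_rect`, and `chop_circ` are hypothetical; the rectangular and circular beam shapes are the two special cases treated later in Appendix 4D, not a general definition of the beam cross section.

```python
import math

def chop_1d(t, T):
    """One-dimensional chopping function: 1 for |t| <= T, otherwise 0."""
    return 1.0 if abs(t) <= T else 0.0

def chop_rect(x, y, X, Y):
    """Pupil function of a (2X) by (2Y) rectangular beam cross section."""
    return chop_1d(x, X) * chop_1d(y, Y)

def chop_circ(x, y, R):
    """Pupil function of a circular beam cross section of radius R."""
    return chop_1d(math.hypot(x, y), R)

# A field sample is kept only inside the measurement interval and the beam.
field_value = 0.37                     # arbitrary sample value (assumed)
chopped = chop_1d(0.2, 1.0) * chop_circ(0.5, 0.5, 1.0) * field_value
print(chopped)
```

Multiplying a field sample by both functions mirrors the idea of a time-chopped and beam-chopped field: the sample survives unchanged inside the measurement interval and beam, and is set to zero elsewhere.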
We set up two random functions

\[
E^{(\mathrm{in})}_{xTA}(\vec{\rho}, z, t) \quad\text{and}\quad E^{(\mathrm{in})}_{yTA}(\vec{\rho}, z, t)
\]

to represent, respectively, the x and y electric-field components at coordinate z of the radiation
beam entering the interferometer. The T, A subscripts show that the radiation fields are time-chopped
and beam-chopped (see Secs. 4.9, 4.10, and 4.14 for an explanation of what this means).
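The three-dimensional autocorrelation functions in $\vec{\rho}$ and t used in the three-dimensional
Wiener-Khinchin theorem are

\[
R_{xTA}(\vec{\rho}, \vec{\rho}\,', t, t', z) = \mathrm{E}\!\left( E^{(\mathrm{in})}_{xTA}(\vec{\rho}, z, t)\, E^{(\mathrm{in})}_{xTA}(\vec{\rho}\,', z, t') \right)
\tag{4C.2a}
\]
and
\[
R_{yTA}(\vec{\rho}, \vec{\rho}\,', t, t', z) = \mathrm{E}\!\left( E^{(\mathrm{in})}_{yTA}(\vec{\rho}, z, t)\, E^{(\mathrm{in})}_{yTA}(\vec{\rho}\,', z, t') \right) .
\tag{4C.2b}
\]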

The T, A subscripts in $R_{xTA}$, $R_{yTA}$ show that these are the autocorrelations of time-chopped and
beam-chopped radiation fields. The argument z is always unprimed because we want to compare

75
Joseph W. Goodman, Introduction to Fourier Optics (McGraw-Hill Inc., New York, 1988), p. 83; reissue of 1968
book.
the $E^{(\mathrm{in})}_{xTA}$ and $E^{(\mathrm{in})}_{yTA}$ variables at the same z coordinate along the beam. Because
$E^{(\mathrm{in})}_{xTA}$ and $E^{(\mathrm{in})}_{yTA}$ represent time-chopped and beam-chopped radiation fields, we know they cannot be
homogeneous and stationary in $\vec{\rho} = (x, y)$ and t (see Secs. 3.15 and 3.24 in Chapter 3). When
either t or $t'$ lies outside the time interval between +T and −T, so that

$\Pi(t, T) = 0$ or $\Pi(t', T) = 0$,
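we expect the product

\[
E^{(\mathrm{in})}_{xTA}(\vec{\rho}, z, t)\, E^{(\mathrm{in})}_{xTA}(\vec{\rho}\,', z, t')
\]

to be zero; and the same of course is true for the product

\[
E^{(\mathrm{in})}_{yTA}(\vec{\rho}, z, t)\, E^{(\mathrm{in})}_{yTA}(\vec{\rho}\,', z, t') .
\]

Consequently,

\[
\mathrm{E}\!\left( E^{(\mathrm{in})}_{xTA}(\vec{\rho}, z, t)\, E^{(\mathrm{in})}_{xTA}(\vec{\rho}\,', z, t') \right)
\quad\text{and}\quad
\mathrm{E}\!\left( E^{(\mathrm{in})}_{yTA}(\vec{\rho}, z, t)\, E^{(\mathrm{in})}_{yTA}(\vec{\rho}\,', z, t') \right)
\]

should be zero when $\Pi(t, T) = 0$ or $\Pi(t', T) = 0$. Similarly, when either $\vec{\rho}$ or $\vec{\rho}\,'$ represents a point
outside the beam cross-section, so that

$\Pi(\vec{\rho}; A) = 0$ or $\Pi(\vec{\rho}\,'; A) = 0$,

we know that

\[
\mathrm{E}\!\left( E^{(\mathrm{in})}_{xTA}(\vec{\rho}, z, t)\, E^{(\mathrm{in})}_{xTA}(\vec{\rho}\,', z, t') \right)
\quad\text{and}\quad
\mathrm{E}\!\left( E^{(\mathrm{in})}_{yTA}(\vec{\rho}, z, t)\, E^{(\mathrm{in})}_{yTA}(\vec{\rho}\,', z, t') \right)
\]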

should be zero. Therefore, R
xTA
and R
yTA
cannot be written as

R ( , , )
xTA
t t z
G G
or R ( , , )
yTA
t t z
G G

as we would for the three-dimensional autocorrelations in
G
and t of homogeneous and
stationary random functions. On the other hand, if the radiation field had not been time-chopped
and beam-chopped, we would expect
(in)
x
E
and
(in)
y
E
to follow the pattern of other radiation fields

in nature. When described by random variables, these fields are usually taken to be stationary in
time,$^{76}$ and, since we have given the non-beam-chopped fields no preferred structure, it makes
sense to have them homogeneous in $\vec{\rho}$ also. We can therefore assume that the random functions

76
Handbook of Optics, edited by Michael Bass, Vol. I (McGraw-Hill Inc., New York, 1995), Chapter 4, page 4.2,
sponsored by the Optical Society of America.
$E^{(\mathrm{rad})}_{x}$ and $E^{(\mathrm{rad})}_{y}$ representing the radiation before it enters the interferometer are homogeneous
and stationary in $\vec{\rho}$ and t, with autocorrelation functions $R_x$ and $R_y$, which can be written as

( )
(rad) (rad)
R ( , , ) ( , , ) ( , , )
x x x
t t z E z t E z t p p p p
G G G G

E (4C.3a)
and

( )
(rad) (rad)
R ( , , ) ( , , ) ( , , )
y y y
t t z E z t E z t p p p p
G G G G

E . (4C.3b)

We also suppose the interferometer to be well designed, only minimally perturbing $E^{(\mathrm{rad})}_{x}$ and
$E^{(\mathrm{rad})}_{y}$ when turning them into $E^{(\mathrm{in})}_{xTA}$ and $E^{(\mathrm{in})}_{yTA}$ to create the time-chopped and beam-chopped
radiation fields entering the interferometer. This means we can assume that $E^{(\mathrm{in})}_{xTA}$ and $E^{(\mathrm{in})}_{yTA}$ are the
same as $E^{(\mathrm{rad})}_{x}$ and $E^{(\mathrm{rad})}_{y}$ well away from the boundaries of the beam in $\vec{\rho}$ and t. Hence, we can
make the approximations that

\[
E^{(\mathrm{in})}_{xTA}(\vec{\rho}, z, t) \cong \Pi(t, T)\, \Pi(\vec{\rho}; A)\, E^{(\mathrm{rad})}_{x}(\vec{\rho}, z, t)
\tag{4C.4a}
\]
and
\[
E^{(\mathrm{in})}_{yTA}(\vec{\rho}, z, t) \cong \Pi(t, T)\, \Pi(\vec{\rho}; A)\, E^{(\mathrm{rad})}_{y}(\vec{\rho}, z, t) .
\tag{4C.4b}
\]

These approximations respect both our knowledge that $E^{(\mathrm{in})}_{xTA}$ and $E^{(\mathrm{in})}_{yTA}$ are negligible or zero
outside the time and cross-section boundaries of the beam and also our assumption that inside the
beam and during the time interval of the measurement $E^{(\mathrm{in})}_{xTA}$ and $E^{(\mathrm{in})}_{yTA}$ are little changed from the
$E^{(\mathrm{rad})}_{x}$ and $E^{(\mathrm{rad})}_{y}$ values they would have if they did not enter the interferometer. Substituting these
approximations for $E^{(\mathrm{in})}_{xTA}$ and $E^{(\mathrm{in})}_{yTA}$ into the right-hand sides of Eqs. (4C.2a) and (4C.2b) gives

( )
( )
(in) (in)
(rad) (rad)
( , , ) ( , , )
( , ) ( , ) ( ; ) ( ; ) ( , , ) ( , , )
( , ) ( , ) ( ; ) ( ; ) R ( , , )
xTA xTA
x x
x
E z t E z t
t T t T A A E z t E z t
t T t T A A t t z
p p
p p p p
p p p p

e H H H H
e H H H H
G G

G G G G

G G G G

E
E (4C.4c)
and

( )
( )
(in) (in)
(rad) (rad)
( , , ) ( , , )
( , ) ( , ) ( ; ) ( ; ) ( , , ) ( , , )
( , ) ( , ) ( ; ) ( ; ) R ( , , ) ,
yTA yTA
y y
y
E z t E z t
t T t T A A E z t E z t
t T t T A A t t z
p p
p p p p
p p p p

e H H H H
e H H H H
G G

G G G G

G G G G

E
E (4C.4d)

where we have used (4C.3a) and (4C.3b) in the final steps of these two equations. From the
three-dimensional Wiener-Khinchin theorem, we know that the Fourier transforms of $R_x$ and $R_y$ are the
two power spectra

\[
S_x(\vec{u}, z, w) = \int_{-\infty}^{\infty} dt \int_{-\infty}^{\infty} d^2\rho \; R_x(\vec{\rho}, t, z)\, e^{-2\pi i(\vec{\rho}\cdot\vec{u} + wt)}
\tag{4C.5a}
\]
and
\[
S_y(\vec{u}, z, w) = \int_{-\infty}^{\infty} dt \int_{-\infty}^{\infty} d^2\rho \; R_y(\vec{\rho}, t, z)\, e^{-2\pi i(\vec{\rho}\cdot\vec{u} + wt)} .
\tag{4C.5b}
\]

The Wiener-Khinchin theorem also states that the power spectra S
x
and S
y
are given by the limits

( )
2 1 1
S ( , , ) lim ( , , )
2
x xTA
T
A
u z w u z w
T A
=
G G
E E (4C.5c)
and

( )
2
1 1
S ( , , ) lim ( , , )
2
y yTA
T
A
u z w u z w
T A
=
G G
E E , (4C.5d)
where the random functions ( , , )
xTA
u z w
G
E and ( , , )
yTA
u z w
G
E are defined to be the three-dimensional

forward Fourier transforms of

(rad)
( , ) ( ; ) ( , , )
x
t T A E z t
G G
and
(rad)
( , ) ( ; ) ( , , )
y
t T A E z t
G G
,
given by

2 (rad) 2 ( )
-
( , , ) ( , ) ( ; ) ( , , )
i u wt
xTA x
u z w dt d t T A E z t e

+

=

G G
G G G
E (4C.6a)
and

2 (rad) 2 ( )
-
( , , ) ( , ) ( ; ) ( , , )
i u wt
yTA y
u z w dt d t T A E z t e

+

=

G G
G G G
E . (4C.6b)

In Eqs. (4C.5c) and (4C.5d), lim
T
is interpreted to be the limit as the time interval specified by
( , ) t T extends to cover all time, and lim
A
is interpreted to be the limit as the cross-sectional area
specified by ( ; ) A
G
extends to cover the entire x, y plane. From the approximations in (4C.4a)
and (4C.4b), we have

2 (in) 2 ( )
-
( , , ) ( , , )
i u wt
xTA xTA
u z w dt d E z t e

+

G G
G G
E (4C.7a)
and

2 (in) 2 ( )
-
( , , ) ( , , )
i u wt
yTA yTA
u z w dt d E z t e

+

G G
G G
E . (4C.7b)

We compare these results to Eqs. (4.129a) and (4.129b) in this chapter to get

2
( , , ) , ,
xTA xTA
cu w
u z w cw z
w c

E
G
G
E (4C.8a)
and

2
( , , ) , ,
yTA yTA
cu w
u z w cw z
w c

E
G
G
E . (4C.8b)

Substituting these last two approximations in (4C.5c) and (4C.5d) gives

2
2
4
1 1
S ( , , ) lim ( , , )
2
x xTA
T
A
c cu w
u z w z
T A w w c

E
G
G
E (4C.9a)
and

2
2
4
1 1
S ( , , ) lim ( , , )
2
y yTA
T
A
c cu w
u z w z
T A w w c

E
G
G
E . (4C.9b)

As A, T get ever larger, the time-chopped and beam-chopped
(in)
xTA
E
and
(in)
yTA
E
inside the
interferometer more and more resemble the
(rad)
x
E
and
(rad)
y
E
random functions that would have

been present if the original radiation fields had not been modified by entering the interferometer.
This means that as A and T get larger, the approximations made in Eqs. (4C.4a) and (4C.4b)
become ever more exact, and so do the approximations made in Eqs. (4C.7a), (4C.7b), (4C.8a),
and (4C.8b). Concentrating on (4C.8a) and (4C.8b) in particular, we expect
xTA
E and
yTA
E to
resemble
2
xTA
cw
and
2
yTA
cw
ever more closely as A, T get large, turning the approximate

equalities in (4C.9a) and (4C.9b) into the exact equalities,

2
2
4
1 1
S ( , ) lim ( , , )
2
x xTA
T
A
c cu w
u w z
T A w w c

=

E
G
G
E (4C.10a)
and

2
2
4
1 1
S ( , ) lim ( , , )
2
y yTA
T
A
c cu w
u w z
T A w w c

=

E
G
G
E . (4C.10b)

Tracing
xTA
E
and
yTA
E
back to their original definitions of

x
E and
y
E in Eqs. (4.98a) and
(4.98b), before they acquired their T, A subscripts, we recognize that
2 2
xTA yTA
E E

and are no
longer functions of z, allowing us to drop z from the argument lists of S
x
and S
y
.
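The two routes to a power spectrum used in this appendix, the Fourier transform of an autocorrelation and the squared magnitude of the transform of a chopped signal, can be compared in a small discrete setting. The sketch below is only a one-dimensional, finite-length analogue under stated assumptions: it uses a single fixed sample sequence instead of an ensemble average, a circular (periodic) autocorrelation, and no normalization by the measurement time or beam area. The names `dft` and `circular_autocorrelation` are hypothetical helpers.

```python
import cmath
import random

def dft(seq):
    """Forward discrete Fourier transform (minus sign in the exponent)."""
    n = len(seq)
    return [sum(seq[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
            for k in range(n)]

def circular_autocorrelation(x):
    """Sum over j of x[j] * x[j + m], with indices taken modulo len(x)."""
    n = len(x)
    return [sum(x[j] * x[(j + m) % n] for j in range(n)) for m in range(n)]

rng = random.Random(0)                        # fixed seed; the sample is arbitrary
x = [rng.gauss(0.0, 1.0) for _ in range(64)]  # real-valued "chopped" signal

spectrum_from_autocorr = dft(circular_autocorrelation(x))
spectrum_from_transform = [abs(value) ** 2 for value in dft(x)]

max_mismatch = max(abs(a - b)
                   for a, b in zip(spectrum_from_autocorr, spectrum_from_transform))
print(max_mismatch)
```

For a real sequence the two spectra agree to rounding error. The continuous theorem in the text additionally takes an expectation and the limits of large T and A, which this finite example does not attempt to model.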
Appendix 4D
We calculate here the two-dimensional Fourier transform of the pupil function

\[
\Pi(\vec{\rho}; A) = \Pi(x, y; A) = \Pi(x, X)\, \Pi(y, Y)
\tag{4D.1a}
\]

for an interferometer beam with a (2X) × (2Y) rectangular cross section as well as the
two-dimensional Fourier transform of the pupil function

\[
\Pi(\vec{\rho}; A) = \Pi(x, y; A) = \Pi\!\left(\sqrt{x^2 + y^2},\, R\right)
\tag{4D.1b}
\]

for an interferometer beam with a circular cross section of radius R. Function $\Pi(u, v)$ can be
thought of as a one-dimensional pupil function and is defined to be [see Eq. (4C.1a) in Appendix
4C]

\[
\Pi(u, v) =
\begin{cases}
1 & \text{for } |u| \le v \\
0 & \text{for } |u| > v
\end{cases} .
\tag{4D.1c}
\]

It can be distinguished from the two-dimensional $\Pi(\vec{\rho}; A)$ functions by the absence of a
semicolon in its argument list.
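To evaluate the two-dimensional Fourier transform of the pupil function in (4D.1a), we write

\[
\int_{-\infty}^{\infty} d^2\rho \; \Pi(\vec{\rho}; A)\, e^{\pm 2\pi i(\vec{\rho}\cdot\vec{u})}
= \int_{-\infty}^{\infty} dx \; \Pi(x, X)\, e^{\pm 2\pi i x u_x} \int_{-\infty}^{\infty} dy \; \Pi(y, Y)\, e^{\pm 2\pi i y u_y} .
\tag{4D.2}
\]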

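The one-dimensional integrals are straightforward. In x we have

\[
\int_{-\infty}^{\infty} \Pi(x, X)\, e^{\pm 2\pi i x u_x}\, dx
= \int_{-X}^{X} \cos(2\pi x u_x)\, dx \pm i \int_{-X}^{X} \sin(2\pi x u_x)\, dx
= \frac{\sin(2\pi u_x X)}{\pi u_x}
= 2X\, \mathrm{sinc}(2\pi u_x X) ,
\tag{4D.3a}
\]

where $\mathrm{sinc}(x) = \sin(x)/x$. The integral over y is identical, of course, so the final result is

\[
\int_{-\infty}^{\infty} d^2\rho \; \Pi(\vec{\rho}; A)\, e^{\pm 2\pi i(\vec{\rho}\cdot\vec{u})} = 4XY\, \mathrm{sinc}(2\pi u_x X)\, \mathrm{sinc}(2\pi u_y Y)
\tag{4D.3b}
\]
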
or, choosing the minus sign in the exponent of e to match the definition of
A
in Eq. (4.134c) of
this chapter,
( ) 4XYsinc(2 X) sinc(2 Y)
A x y
u u u =
G
. (4D.3c)

This is the two-dimensional forward Fourier transform of the pupil function of an interferometer
beam with a (2X) (2Y) rectangular cross section.
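The one-dimensional result in Eq. (4D.3a) is easy to check by brute-force integration. The sketch below is an illustration only: the helper names and the sample values of X and the spatial frequency are assumptions, the forward (minus-sign) exponent is used, and the integral is approximated with a simple midpoint rule.

```python
import cmath
import math

def sinc(x):
    """Unnormalized sinc, sin(x)/x, with the limiting value 1 at x = 0."""
    return 1.0 if x == 0.0 else math.sin(x) / x

def rect_ft_numeric(u, X, n=20000):
    """Midpoint-rule estimate of the integral of exp(-2*pi*i*x*u) over [-X, X]."""
    dx = 2.0 * X / n
    total = 0j
    for k in range(n):
        x = -X + (k + 0.5) * dx
        total += cmath.exp(-2j * math.pi * x * u) * dx
    return total

def rect_ft_closed(u, X):
    """Closed form 2X sinc(2*pi*u*X) from Eq. (4D.3a)."""
    return 2.0 * X * sinc(2.0 * math.pi * u * X)

X = 1.5  # half-width of the rectangular pupil (assumed value)
for u in (0.0, 0.3, 0.7, 2.0):
    print(u, rect_ft_numeric(u, X), rect_ft_closed(u, X))
```

The imaginary part of the numerical result is zero to rounding error, which reflects the cancellation of the odd sine integral in (4D.3a). The two-dimensional transform in (4D.3c) is then just the product of two such factors.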
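To evaluate the two-dimensional Fourier transform of the pupil function in (4D.1b), we write

\[
\begin{aligned}
\int_{-\infty}^{\infty} d^2\rho \; \Pi(\vec{\rho}; A)\, e^{\pm 2\pi i(\vec{\rho}\cdot\vec{u})}
&= \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy \; e^{\pm 2\pi i(x u_x + y u_y)}\, \Pi\!\left(\sqrt{x^2 + y^2},\, R\right) \\
&= \int_{0}^{R} \rho\, d\rho \int_{0}^{2\pi} d\phi \; e^{\pm 2\pi i \rho u (\cos\phi \cos\phi_u + \sin\phi \sin\phi_u)} ,
\end{aligned}
\tag{4D.4a}
\]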

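where in the last step the variables of integration have been transformed using

\[
\begin{aligned}
\rho &= \sqrt{x^2 + y^2} , & u &= |\vec{u}| = \sqrt{u_x^2 + u_y^2} , \\
x &= \rho \cos\phi , & u_x &= u \cos\phi_u , \\
y &= \rho \sin\phi , & u_y &= u \sin\phi_u .
\end{aligned}
\tag{4D.4b}
\]

We note that

\[
\begin{aligned}
e^{\pm 2\pi i \rho u (\cos\phi \cos\phi_u + \sin\phi \sin\phi_u)}
&= e^{\pm 2\pi i \rho u \cos(\phi - \phi_u)} \\
&= \cos\!\big(2\pi \rho u \cos(\phi - \phi_u)\big) \pm i \sin\!\big(2\pi \rho u \cos(\phi - \phi_u)\big) .
\end{aligned}
\tag{4D.5a}
\]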

From the Handbook of Mathematical Functions, we know that$^{77}$

\[
\cos(z \cos\theta) = J_0(z) + 2 \sum_{k=1}^{\infty} (-1)^k J_{2k}(z) \cos(2k\theta)
\tag{4D.5b}
\]
and
\[
\sin(z \cos\theta) = 2 \sum_{k=0}^{\infty} (-1)^k J_{2k+1}(z) \cos\!\big((2k+1)\theta\big) ,
\tag{4D.5c}
\]

77
See Eqs. (9.1.44) and (9.1.45) in Handbook of Mathematical Functions, edited by Milton Abramowitz and Irene
Stegun (National Bureau of Standards, Applied Mathematics Series 55, November 1964), p. 361.
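where $J_n(z)$ is a Bessel function of the first kind of order n, with n = 0, 1, 2, …, and z a
non-negative real number. Using Eqs. (4D.5a) through (4D.5c), we find that

\[
\begin{aligned}
\int_{0}^{2\pi} e^{\pm 2\pi i \rho u \cos(\phi - \phi_u)}\, d\phi
&= \int_{0}^{2\pi} \cos\!\big(2\pi \rho u \cos(\phi - \phi_u)\big)\, d\phi \pm i \int_{0}^{2\pi} \sin\!\big(2\pi \rho u \cos(\phi - \phi_u)\big)\, d\phi \\
&= 2\pi J_0(2\pi \rho u) + 2 \sum_{k=1}^{\infty} (-1)^k J_{2k}(2\pi \rho u) \int_{0}^{2\pi} \cos\!\big(2k(\phi - \phi_u)\big)\, d\phi \\
&\quad \pm 2i \sum_{k=0}^{\infty} (-1)^k J_{2k+1}(2\pi \rho u) \int_{0}^{2\pi} \cos\!\big((2k+1)(\phi - \phi_u)\big)\, d\phi .
\end{aligned}
\]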

The integrals over the cosine are clearly zero, because in each one the cosine is being integrated
over an integral number of periods. Hence,

\[
\int_{0}^{2\pi} d\phi \; e^{\pm 2\pi i \rho u \cos(\phi - \phi_u)} = 2\pi J_0(2\pi \rho u) .
\tag{4D.5d}
\]
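The angular integral in Eq. (4D.5d) can be confirmed numerically. This is a rough sketch under stated assumptions: the Bessel function is evaluated from its standard power series (adequate for the moderate arguments used here), the helper names are hypothetical, and the parameter `a` stands in for the product 2πρu.

```python
import cmath
import math

def bessel_j0(x, terms=60):
    """J_0(x) from the power series sum of (-1)^k (x/2)^(2k) / (k!)^2."""
    term = 1.0
    total = 1.0
    q = -(x / 2.0) ** 2
    for k in range(1, terms):
        term *= q / (k * k)
        total += term
    return total

def angular_integral(a, phi_u, sign=1, n=512):
    """Integral over phi from 0 to 2*pi of exp(sign * i * a * cos(phi - phi_u))."""
    dphi = 2.0 * math.pi / n
    total = sum(cmath.exp(sign * 1j * a * math.cos(k * dphi - phi_u))
                for k in range(n))
    return total * dphi

for a in (0.5, 3.0, 10.0):
    print(a, angular_integral(a, 0.7), 2.0 * math.pi * bessel_j0(a))
```

The result is real and independent of both the sign in the exponent and the offset angle, as the vanishing cosine integrals in the derivation imply.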

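Substituting this result back into (4D.4a) gives, changing the variable of integration to
$\zeta = 2\pi u \rho$,

\[
\int_{-\infty}^{\infty} d^2\rho \; \Pi(\vec{\rho}; A)\, e^{\pm 2\pi i(\vec{\rho}\cdot\vec{u})}
= 2\pi \int_{0}^{R} \rho\, J_0(2\pi \rho u)\, d\rho
= \frac{1}{2\pi u^2} \int_{0}^{2\pi u R} \zeta\, J_0(\zeta)\, d\zeta .
\]

The Bessel identity$^{78}$

\[
\int_{0}^{x} z\, J_0(z)\, dz = x\, J_1(x)
\]

now lets us write

\[
\int_{-\infty}^{\infty} d^2\rho \; \Pi(\vec{\rho}; A)\, e^{\pm 2\pi i(\vec{\rho}\cdot\vec{u})}
= \frac{2\pi u R\, J_1(2\pi u R)}{2\pi u^2}
= \frac{R}{u}\, J_1(2\pi u R)
\tag{4D.6a}
\]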

or, choosing the minus sign in the exponent of e to match the definition of
A
in Eq. (4.134c) of

78
Joseph W. Goodman, Introduction to Fourier Optics, p. 16.
this chapter,

1
( ) (2 )
A
R
u J u R
u
=
G G
G
, (4D.6b)

where now
A
is the two-dimensional forward Fourier transform of the pupil function of an
interferometer beam with a circular cross section of radius R.
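The circular-pupil result can also be checked by integrating the radial form directly. The sketch below assumes power-series evaluations of the Bessel functions, a midpoint rule for the radial integral, and sample values of R and u chosen only for illustration; none of the helper names come from the text.

```python
import math

def bessel_j0(x, terms=80):
    """J_0(x) from its power series."""
    term, total, q = 1.0, 1.0, -(x / 2.0) ** 2
    for k in range(1, terms):
        term *= q / (k * k)
        total += term
    return total

def bessel_j1(x, terms=80):
    """J_1(x) from its power series."""
    term = x / 2.0
    total, q = term, -(x / 2.0) ** 2
    for k in range(1, terms):
        term *= q / (k * (k + 1))
        total += term
    return total

def circ_ft_numeric(u, R, n=4000):
    """Midpoint-rule estimate of 2*pi times the integral of rho*J0(2*pi*u*rho) over [0, R]."""
    d_rho = R / n
    total = 0.0
    for k in range(n):
        rho = (k + 0.5) * d_rho
        total += rho * bessel_j0(2.0 * math.pi * u * rho) * d_rho
    return 2.0 * math.pi * total

def circ_ft_closed(u, R):
    """Closed form (R/u) J_1(2*pi*u*R); the u = 0 limit is the pupil area pi*R^2."""
    if u == 0.0:
        return math.pi * R * R
    return (R / u) * bessel_j1(2.0 * math.pi * u * R)

for u in (0.3, 0.8, 2.0):
    print(u, circ_ft_numeric(u, 1.0), circ_ft_closed(u, 1.0))
```

At zero spatial frequency the transform reduces to the area of the pupil, which is a quick sanity check on the overall normalization.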
Appendix 4E
Snell's law requires monochromatic plane waves entering a thick transparent slab or window to
change their angle of propagation. Figure 4E.1 uses a triplet of parallel rays to show this change,
and the angle variables specified there can be used to write Snell's law as

sin sin
A B
n n o , (4E.1a)

where $n_A$ is the index of refraction outside the slab and $n_B$ is the index of refraction inside the
slab.

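FIGURE 4E.1. [Figure not reproduced. Side view of rays passing through the slab, with the planes of constant phase drawn perpendicular to the rays; the regions are labeled $n_A$, $n_B$, and $n_C$.]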
The index of refraction of any transparent medium is here taken to be a real dimensionless ratio
of the monochromatic wave's velocity in empty space to the same monochromatic wave's
velocity inside the medium. The index of refraction of empty space is thus always one. The index of
refraction of air is extremely close to one, being just a little bit larger than one by an amount that
can usually be neglected when analyzing optical instruments. The indexes of refraction of the
transparent substances used to make interferometer windows, beam-splitter substrates, and
compensator plates are significantly larger than one and usually less than six or seven. If c is the
monochromatic wave's velocity in empty space, $v_A$ is its velocity outside the slab in Fig. 4E.1,
and $v_B$ is its velocity inside the slab, then

\[
n_A = \frac{c}{v_A} \quad\text{and}\quad n_B = \frac{c}{v_B} .
\tag{4E.1b}
\]

The wavelength of a monochromatic plane wave also changes inside the transparent slab [see,
for example, Fig. 1.6(b) in Chapter 1]. This effect is shown in Fig. 4E.1 by drawing the planes of
constant phase perpendicular to the triplet of rays as being more closely spaced inside the slab
than outside the slab. If $\lambda_A$ is the wavelength outside the slab and $\lambda_B$ is the wavelength inside the
slab, then

\[
n_A \lambda_A = n_B \lambda_B .
\tag{4E.2a}
\]

Because the wavenumber is one over the wavelength, this can also be written as

\[
n_A \sigma_B = n_B \sigma_A ,
\tag{4E.2b}
\]

where $\sigma_A = 1/\lambda_A$ and $\sigma_B = 1/\lambda_B$. Substituting Eq. (4E.1b) into (4E.2a) gives

\[
\frac{c}{v_A}\, \lambda_A = \frac{c}{v_B}\, \lambda_B
\]

or

\[
\frac{\lambda_A}{v_A} = \frac{\lambda_B}{v_B} .
\tag{4E.2c}
\]

Remembering that the frequency of a monochromatic plane wave, according to Eq. (1.5) in
Chapter 1, satisfies the formula

wavelength × frequency = velocity ,

we note that the wavelength divided by the velocity must be one over the frequency. Therefore, Eq.
(4E.2c) requires the frequency of a monochromatic plane wave to be equal inside and outside the
slab in Fig. 4E.1. Another point worth mentioning here is that the index of refraction can be, and
usually is, a function of frequency when the monochromatic plane wave is propagating through
a transparent substance. Rearranging Eq. (4E.2a), we see that the ratio of the monochromatic
wavelengths inside and outside the slab shown in Fig. 4E.1 must be

\[
\frac{\lambda_A}{\lambda_B} = \frac{n_B}{n_A} ,
\tag{4E.3}
\]

and so can also depend on the plane wave's frequency. This point is discussed in a general sort of
way in Sec. 1.1 of Chapter 1 when explaining why Michelson's interferometer needed a
compensator plate to work correctly.
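The relations among refractive index, velocity, wavelength, and frequency can be collected in a few lines of code. The sketch below is illustrative only: the index values, the vacuum wavelength, and the incidence angle are assumed sample numbers, the function names are hypothetical, and Snell's law is written in its usual form with both angles measured from the surface normal.

```python
import math

C_VACUUM = 299_792_458.0  # speed of light in empty space, m/s

def refracted_angle(n_a, n_b, incidence_rad):
    """Angle inside the slab from n_a*sin(incidence) = n_b*sin(refracted).

    Assumes n_b >= n_a so that the arcsine argument never exceeds one.
    """
    return math.asin(n_a * math.sin(incidence_rad) / n_b)

def wavelength_inside(n_a, n_b, wavelength_a):
    """Wavelength inside the slab from n_a*lambda_a = n_b*lambda_b, Eq. (4E.2a)."""
    return wavelength_a * n_a / n_b

n_a, n_b = 1.0, 2.4          # assumed indices outside and inside the slab
wavelength_a = 5.0e-6        # assumed wavelength outside the slab, m
incidence = math.radians(30.0)

v_a = C_VACUUM / n_a         # Eq. (4E.1b)
v_b = C_VACUUM / n_b
wavelength_b = wavelength_inside(n_a, n_b, wavelength_a)
refracted = refracted_angle(n_a, n_b, incidence)

frequency_a = v_a / wavelength_a   # velocity divided by wavelength
frequency_b = v_b / wavelength_b

print(math.degrees(refracted), wavelength_b, frequency_a, frequency_b)
```

The two frequencies agree, which is the content of Eq. (4E.2c), while the wavelength ratio equals the inverse ratio of the indices as in Eq. (4E.3).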
Figure 4E.2(a) shows two monochromatic plane waves propagating through a Michelson
interferometer's compensator plate. The interferometer's optical axis passes horizontally through
the compensator plate, parallel to the ray showing the direction of propagation of the on-axis
wave incident on the plate. The direction of propagation of the off-axis wave incident on the plate
has a slight downward component. The solid ray paths show the change in direction of the on-axis
and off-axis plane waves inside the plate as well as the way Snell's law requires both types
of wave to revert to their incident propagation directions when leaving the plate. The short, solid
lines perpendicular to and crossing through the rays show the planes of constant phase of the
monochromatic waves, with the distance between equivalent planes being much less inside the
plate than outside the plate. This distance can be regarded as a proxy for the wavelength if we are
careful to remember that the diagram would not then be drawn to scale: the typical wavelength
of these infrared plane wavefields is 1000 to 10,000 times smaller than the thickness of a typical
interferometer's compensator plate. The dashed rays with the dashed lines of constant phase show
how the incident monochromatic waves would have propagated had the compensator plate not
been present.
In Fig. 4E.2(a), the on-axis plane wave travels a distance $p_1$ through the compensator plate
and the off-axis plane wave travels a distance $p_2$ through the compensator plate. The substances
used to make compensator plates and beam-splitter substrates can absorb significant amounts of
power from propagating wavefields, with the amount of absorbed power depending linearly on
the distance traveled inside the substance. In a well-designed interferometer, the plane waves
propagating in an off-axis direction are traveling at nearly the same angle of incidence with
respect to the compensator plate as are the plane waves propagating on axis, making $p_1$ and $p_2$

nearly equal. Hence, both types of plane wave lose about the same amount of power passing
through the compensator plate and so, to a good approximation, the amplitudes of both the on-
axis and off-axis monochromatic plane waves decrease by the same fractional amount, say .
The absolute value or magnitude signs here force to be non-negative, but this is no problem
because we can take the plane-wave amplitudes before and after passage through the plate also to
be inherently non-negative.
FIGURE 4E.2(a). [Figure not reproduced; it marks the path lengths $p_1$ and $p_2$.]

The behavior of the wavefield for on-axis and oblique rays passing through the compensator plate is
shown schematically by short lines drawn perpendicularly to the rays' direction of propagation. The
solid rays and lines show how the rays and wavefields actually behave while passing through the
compensator plate, and the dashed rays and lines show how the rays and wavefields would have
behaved had the compensator plate not been present. Although not drawn to scale (the wavelengths
are typically several orders of magnitude shorter than the width of the compensator plate), radiation
wavelengths do, as shown, get shorter inside the compensator plate, which means the solid rays'
wavefields are very unlikely to match up exactly to the dashed rays' wavefields. Hence, there is almost
always a phase change of the wavefields compared to what they would have been had they not
passed through the compensator plate.
FIGURE 4E.2(b). [Figure not reproduced; it marks the one-way distance through the substrate for the on-axis ray.]

The behavior of the wavefield for an on-axis ray interacting with the beam splitter and its substrate can
be analyzed the same way the compensator plate was analyzed in Fig. 4E.2(a). Again, the solid rays
and lines show how the rays and wavefields actually behave, and the dashed rays and lines show how
the rays and wavefields would have behaved had the substrate not been present. Like the
compensator plate, the substrate is not drawn to scale: the wavelengths are made much too large
compared to the substrate's width. Radiation wavelengths shorten inside the substrate just as they do
inside the compensator plate, so the solid wavefields do not match up exactly to the dashed
wavefields. This again produces a phase change in the wavefields compared to what they would have
been had the substrate not been present.
FIGURE 4E.2(c). [Figure not reproduced; it marks the one-way distance through the substrate for the off-axis ray.]

The behavior of the wavefield for an oblique ray interacting with the beam splitter and its substrate is
similar to that of an on-axis ray and wavefield [see Fig. 4E.2(b)]. Again, the solid rays and lines show
how the rays and wavefields actually behave while passing through the substrate, and the dashed rays
and lines show how the rays and wavefields would have behaved had the substrate not been present.
As before, radiation wavelengths shorten inside the substrate, so the solid wavefields do not match up
to the dashed wavefields. Just like for the on-axis ray, there is a phase change of the wavefields
compared to what they would have been had the substrate not been present.
Using the same notation as in the discussion following Eq. (4.35e) in Sec. 4.5 of this chapter, we
can write that

    $|A| = |\gamma| \, |A_i|$ ,        (4E.4a)

where $A_i$ is a complex parameter standing for the complex amplitudes of any of the components
of the E or B field of the monochromatic plane wave and $A$ is another complex parameter
standing for the complex amplitudes of the corresponding E or B field components of the
monochromatic plane wave after it has passed through the slab. The value of $|\gamma|$ can change
significantly for different values of frequency; we can allow for this by writing

    $|\gamma| = |\gamma(\sigma)|$ ,        (4E.4b)

where $\sigma$ is the wavenumber of the plane wave incident on the compensator plate. Having just
noted that in a well-designed interferometer $|\gamma|$ does not change significantly when comparing
on-axis and off-axis plane waves, there is no need to include a dependence on the angle of
incidence in Eq. (4E.4b).
When comparing a monochromatic plane wave entering the slab in Fig. 4E.2(a) to the same
monochromatic plane wave leaving the slab, we are analyzing a situation very similar to the
situation examined in Sec. 4.5 above. The only real difference is that in Sec. 4.5 we discuss what
happens to monochromatic plane waves passing through a thin slab or film and now we are
analyzing what happens when they pass through a thick slab or window. Passage through a thin
slab or film produces a change in phase as well as a change in amplitude, and the same thing
happens in the passage through the thick slab in Fig. 4E.2(a). In Fig. 4E.2(a) there are short
dashed lines representing what the planes of constant phase in the monochromatic wave would be
if the slab were absent. Comparing these to the short solid lines showing the actual position of the
planes of constant phase after the wave leaves the slab, we note that in both the on-axis and
off-axis cases they fail to match up. This comes from the shortening of the wavefield's wavelength
inside the slab. Even though there is only a slight difference in the $p_1$ and $p_2$ distances covered by
the on-axis and off-axis waves, the on-axis phase change is much different from the off-axis
phase change because the wavefield's wavelengths are so much shorter than the width of the slab.
This means that even though $p_1$ and $p_2$ are almost equal, their difference is still large compared to
a wavelength.
Figure 4E.2(b) shows an on-axis monochromatic wavefield reflecting off and transmitting
through the beam splitter and beam-splitter substrate. The one-way distance through the substrate
is called $o_1$. Figure 4E.2(c) shows an off-axis monochromatic wave reflecting off and
transmitting through the beam splitter and its substrate; here, the one-way distance through the
substrate is called $o_2$. Just as in Fig. 4E.2(a), the off-axis ray is only slightly off-axis because in
well-designed interferometers only slightly off-axis plane waves are allowed to pass through the
instrument. Hence, $o_1$ and $o_2$, just like $p_1$ and $p_2$, are almost equal. The compensator plate is
made from the same material, and is designed with the same thickness and orientation, as the
beam-splitter substrate, so the same value of $|\gamma(\sigma)|$ that is used for the compensator plate can be
used to describe the one-way passage through the beam-splitter substrate of the on-axis and
slightly off-axis monochromatic plane waves. Just as in Eq. (4E.4b), we expect $|\gamma|$ to be a
function only of $\sigma$ because the loss of power is about the same for both the on-axis and slightly
off-axis waves. Figures 4E.2(b) and 4E.2(c) also show that, just as in Fig. 4E.2(a), the on-axis
and off-axis monochromatic waves can undergo significantly different phase shifts after passing
through the beam-splitter substrate. Again, this is due to the wavelength being so short compared
to the thickness of the slab, which is in this case the beam-splitter substrate.
Section 4.5 of this chapter explains how to show that a monochromatic plane wavefield

    $A e^{2\pi i \sigma (z - ct)}$

traveling along the z axis has undergone both a phase shift and a change in amplitude: just
multiply by a complex constant. The magnitude of the constant changes the wavefield's
amplitude and the complex phase angle of the constant changes the wavefield's phase, shifting
the position of the planes of constant phase from where they would be if the multiplication did
not occur. This happens no matter what direction in space is taken to be the z axis, that is, no
matter what the direction of propagation of the plane wavefield. We have already chosen $|\gamma(\sigma)|$
to be the magnitude specifying how much the amplitude of the plane wave changes when it goes
through the compensator plate, and now nothing stops us from taking $\gamma$ to be a complex number

    $\gamma = |\gamma(\sigma)| \, e^{i \arg(\gamma)}$ ,        (4E.5)

where the complex phase angle, that is, the argument of the complex $\gamma$ value, is chosen so that
multiplying by the complex $\gamma$ correctly modifies both the phase and the amplitude of the plane
wave. Now taking the z axis to lie along either the on-axis or the off-axis ray in Fig. 4E.2(a), we
know that if

    $A e^{2\pi i \sigma (z - ct)}$

represents any E field or B field component of the monochromatic plane wave incident on the
compensator plate, then

    $\gamma A e^{2\pi i \sigma (z - ct)}$

must represent the corresponding E or B component after the monochromatic plane wave has
passed once through the compensator plate.
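The effect of multiplying a plane wave by a complex constant can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function name `plane_wave`, the sample wavenumber, and the value chosen for the complex transmission parameter (90% amplitude, 0.3 rad phase shift) are all illustrative assumptions.

```python
import cmath
import math

def plane_wave(A, sigma, z, t, c=2.99792458e10):
    """Complex plane-wave value A*exp(2*pi*i*sigma*(z - c*t)).

    sigma is the wavenumber in cm^-1, z is in cm, t is in seconds,
    and c is the speed of light in cm/s.
    """
    return A * cmath.exp(2j * math.pi * sigma * (z - c * t))

# Hypothetical complex transmission parameter (not from the text):
# magnitude 0.9 scales the amplitude, argument 0.3 rad shifts the phase.
gamma = cmath.rect(0.9, 0.3)

sigma = 1000.0  # illustrative wavenumber in cm^-1
before = plane_wave(1.0 + 0.0j, sigma, z=0.00012, t=0.0)
after = gamma * before

amplitude_ratio = abs(after) / abs(before)   # equals |gamma|
phase_shift = cmath.phase(after / before)    # equals arg(gamma)
```

The magnitude of the constant shows up only in the amplitude ratio and its argument shows up only in the phase shift, whatever point z and time t are used to evaluate the wave.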
When analyzing transmission through a thin film in Sec. 4.5 of this chapter, we distinguish
between s-type wavefields where the E field is perpendicular to the plane of incidence and p-type
wavefields where the E field is parallel to the plane of incidence. Following the same pattern
here, we say that there is a $\gamma_s$ complex parameter specifying how s-type waves transmit through
the compensator plate and a $\gamma_p$ complex parameter specifying how p-type waves transmit
through the compensator plate.
We have already noted that the phase shift undergone by a monochromatic plane wave
passing through the compensator plate depends sensitively on the path taken through the plate;
even small differences in $p_1$ and $p_2$ in Fig. 4E.2(a) can lead to significantly different phase shifts.
Hence, even though $|\gamma_{s,p}|$ does not depend sensitively on the monochromatic plane wave's angle
of incidence on the compensator plate, so that for both on-axis and slightly off-axis plane waves
$|\gamma_{s,p}|$ can be taken to depend only on the wavenumber $\sigma$ as shown in Eq. (4E.4b), the same cannot
be said about the complex phase angle or argument of $\gamma_{s,p}$. It follows that for a well-designed
standard interferometer,

    $|\gamma_{s,p}| \cong$ function only of the incident wavenumber $\sigma$        (4E.6a)

but

    $\arg(\gamma_{s,p}) =$ function of both the angle of incidence and
                           the incident wavenumber $\sigma$        (4E.6b)

so that

    $\gamma_{s,p} =$ function of both the angle of incidence and
                     the incident wavenumber $\sigma$ .        (4E.6c)

Multiplying a plane wavefield by a complex parameter is also a good way to show what
happens to the wavefield when it passes once through the beam-splitter substrate before reflecting
off or transmitting through the beam-splitter film, and similarly a complex parameter can be used
to show what happens to the wavefield when it passes back through the beam-splitter substrate
after reflecting from the beam-splitter film. The above discussion of Figs. 4E.2(b) and 4E.2(c)
shows that Eqs. (4E.6a)-(4E.6c) still hold true when $\gamma_{s,p}$ is taken to be a complex parameter
describing one passage, before or after reflection, through the beam-splitter substrate. The only
caveat, of course, is that the angle of incidence (see Fig. 4E.1) is now taken to be that of the
monochromatic plane wave on the combined substrate-and-film beam-splitter optical element
shown in Figs. 4E.2(b) and 4E.2(c). A little thought shows that rules (4E.6a)-(4E.6c) must in fact
have a still wider application: if they hold true for any two complex parameters $\gamma_A$ and $\gamma_B$, then
they must also hold true for their complex product

    $\gamma = \gamma_A \gamma_B$ .

It is easy to see why; we just note that

    $|\gamma| = |\gamma_A| \, |\gamma_B|$

and

    $\arg(\gamma) = \arg(\gamma_A) + \arg(\gamma_B)$ .

Hence $|\gamma|$ must be a function only of $\sigma$, not depending significantly on the angle of incidence,
because it is the product of functions for which this is true; and similarly $\arg(\gamma)$ must depend on
both $\sigma$ and the angle of incidence because it is the sum of functions for which this is true. This
same reasoning can in fact be extended to conclude that rules (4E.6a)-(4E.6c) must hold true for
all complex products $\gamma_{s,p}$ representing any number of passages through any combination of the
compensator plate and beam-splitter substrate.
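The product rule for magnitudes and arguments is easy to confirm numerically. The sketch below is only an illustration: the helper name `compose` and the two sample parameter values are assumptions made for the example, not quantities taken from the text.

```python
import cmath

def compose(*gammas):
    """Multiply together the complex parameters for successive slab passages."""
    total = 1.0 + 0.0j
    for g in gammas:
        total *= g
    return total

# Two hypothetical passage parameters, given as (magnitude, argument in radians).
gamma_a = cmath.rect(0.95, 0.40)
gamma_b = cmath.rect(0.90, -1.10)

gamma = compose(gamma_a, gamma_b)

magnitude_product = abs(gamma_a) * abs(gamma_b)
argument_sum = cmath.phase(gamma_a) + cmath.phase(gamma_b)
```

Here `abs(gamma)` agrees with `magnitude_product` and `cmath.phase(gamma)` agrees with `argument_sum`. In general the arguments add only modulo 2π, because `cmath.phase` returns a value in the interval (-π, π]; the sample values were picked so that no wrap-around occurs.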
Since the complex phase angles of the parameters describing the compensator plate and
beam-splitter substrate depend sensitively on the angle of incidence, we should examine how the
angle of incidence of a plane wave changes as it passes through the interferometer.
Most textbooks on elementary optics describe a simple procedure for analyzing the geometry
of rays reflecting off mirrors and other types of specular surfaces: they recommend the
construction of a mirror-image virtual world on the other side of the mirror or specularly
reflecting surface. Figure 4E.3(a) shows how this works for the elementary case of rays leaving a
chair and then specularly reflecting into an observer's eye. For each ray entering the observer's
eye, there is a corresponding direction at which the ray originally left the chair, as shown by the
solid arrows in Fig. 4E.3(a). To find the direction at which a ray must leave the chair, we
construct a virtual world (in this case, a virtual chair) on the other side of the reflecting surface,
as shown by the dashed lines in Fig. 4E.3(a). The virtual chair is drawn point by point exactly the
same distance behind the mirror as the real chair is in front of the mirror. To find, for example,
the direction of ray $A_r$ in coordinate system S such that it reflects off the mirror and enters the
observer's eye as ray A, we just draw a straight dashed extension of ray A back to the dashed
virtual chair on the other side of the mirror and look at the direction of this dashed extension in
the virtual S coordinate system.
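The virtual-image construction can be sketched in a few lines for a flat mirror in two dimensions. Everything in this example is an illustrative assumption: the function names, the mirror lying along the line x = 0, and the coordinates chosen for the chair point and the eye.

```python
import math

def reflect_point(p, m0, n):
    """Mirror image of point p in the line through m0 with unit normal n."""
    d = (p[0] - m0[0]) * n[0] + (p[1] - m0[1]) * n[1]
    return (p[0] - 2.0 * d * n[0], p[1] - 2.0 * d * n[1])

def hit_point(virtual, eye, m0, n):
    """Point where the straight line from the virtual image to the eye crosses the mirror."""
    dv = (virtual[0] - m0[0]) * n[0] + (virtual[1] - m0[1]) * n[1]
    de = (eye[0] - m0[0]) * n[0] + (eye[1] - m0[1]) * n[1]
    s = dv / (dv - de)  # fraction of the way from the virtual image to the eye
    return (virtual[0] + s * (eye[0] - virtual[0]),
            virtual[1] + s * (eye[1] - virtual[1]))

# Hypothetical layout: mirror along x = 0, chair point and eye both at x > 0.
mirror_point = (0.0, 0.0)
mirror_normal = (1.0, 0.0)
chair = (2.0, 1.0)
eye = (3.0, 4.0)

virtual_chair = reflect_point(chair, mirror_point, mirror_normal)
hit = hit_point(virtual_chair, eye, mirror_point, mirror_normal)

# The folded real path (chair -> mirror -> eye) has the same length as the
# straight virtual path (virtual chair -> eye).
folded_length = math.dist(chair, hit) + math.dist(hit, eye)
straight_length = math.dist(virtual_chair, eye)
```

The direction of the ray reaching the eye is simply the direction of the straight line from the virtual chair point to the eye, which is the property the tunnel diagrams exploit.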
Figure 4E.3(b) shows how to analyze optical configurations by constructing virtual objects on
the virtual side of all the specularly reflecting surfaces. The plane wave represented by the Z
triplet of rays drawn with solid arrows in Fig. 4E.3(b) reflects first off mirror $M_1$, then reflects off
mirror $M_2$ and into the transparent slab T. Using the same procedure as in Fig. 4E.3(a), we
construct $T_1$ and the virtual $M_2$, the dashed virtual images of T and $M_2$ on the far side of $M_1$. Just as
before, the direction at which the rays approach $M_2$ can be found by extending the Z triplet of rays
as dashed straight arrows onto the virtual $M_2$ surface. To analyze and reflect the virtual rays
off the virtual $M_2$, we construct $T_{12}$, a dash-dot virtual image of $T_1$ on the far side of the virtual
mirror $M_2$.

FIGURE 4E.3(a).

FIGURE 4E.3(b).
The direction of the Z rays at the true transparent slab T can now be found by extending the
dashed arrows as dash-dot arrows onto the virtual $T_{12}$ transparent slab. (The symmetry of the
situation, by the way, shows that $T_{12}$ can also be constructed as the virtual image in $M_1$ of the
virtual image $T_2$ on the other side of $M_2$.) The collection of surfaces $M_1$, the virtual $M_2$, and $T_{12}$,
together with the solid, dashed, and dash-dot Z rays, is sometimes called a tunnel diagram. Tunnel
diagrams can be a convenient way to keep track of the angle of incidence of plane waves
propagating through a collection of specularly reflecting flat surfaces.
Figures 4E.4(a) and 4E.4(b) are tunnel diagrams for the A triplet of rays propagating through a
standard Michelson interferometer. The A rays represent a monochromatic plane wave A
propagating through the instrument in a slightly off-axis direction. Figure 4E.4(a) is the tunnel
diagram for the interferometer arm without the compensator plate. Here the dashed slab $S_b$ and
the dashed virtual mirror come from constructing virtual images of $S_a$ and M on the other side of
the beam splitter's thin, partially reflecting film; and the dash-dot $S_c$ slab is then the virtual
image of $S_b$ in the virtual mirror. The dashed and dash-dot virtual extensions of the A rays show
what the angle of incidence of plane wave A must be for its three passes through $S_a$ while
following the path of the solid rays in Fig. 4E.4(a). In this tunnel diagram, we characterize the
passage of slightly off-axis plane waves through $S_a$, $S_b$, and $S_c$ by complex parameters
$\gamma^{(a)}_{s,p}$, $\gamma^{(b)}_{s,p}$, and $\gamma^{(c)}_{s,p}$ respectively. For s-type plane waves, the s subscript is chosen and for
p-type plane waves the p subscript is chosen. This is, of course, the same thing as saying that the
$\gamma^{(a)}_{s,p}$ characterize the first passage of s-type and p-type plane waves through the beam-splitter
substrate before reflection off the beam-splitter film, that the $\gamma^{(b)}_{s,p}$ characterize the second
passage of s-type and p-type plane waves through the beam-splitter substrate after reflection off
the beam-splitter film, and that the $\gamma^{(c)}_{s,p}$ characterize the third s-type or p-type passage through
the beam-splitter substrate after reflection off mirror M. We note that it is important to
distinguish between these three passages because, as shown by the tunnel diagram, the angle of
incidence corresponding to $\gamma^{(c)}_{s,p}$ must be slightly different from the angle of incidence
corresponding to $\gamma^{(a)}_{s,p}$ for slightly off-axis plane waves; and, of course, the $\gamma^{(b)}_{s,p}$ is allowed to be
different from $\gamma^{(a)}_{s,p}$ and $\gamma^{(c)}_{s,p}$ because it characterizes the reverse passage through the
beam-splitter substrate.
______________________________________________________________________________

FIGURE 4E.4(a). This is a tunnel diagram for the interferometer arm without the compensator plate.

FIGURE 4E.4(b). This is the tunnel diagram for the interferometer arm with the compensator plate.

Figure 4E.4(b) is the tunnel diagram for the interferometer arm with the compensator plate; it
is simpler than the tunnel diagram in Fig. 4E.4(a) because it uses the virtual images
corresponding to one rather than two specularly reflecting surfaces. Slab $S'_a$ represents the
beam-splitter substrate in Fig. 4E.4(b); it must have the same shape and orientation as $S_a$ in
Fig. 4E.4(a) because it represents the same block of substrate material. Hence, for any
monochromatic plane wave, on-axis or off-axis, we must have

    $\gamma'^{(a)}_{s,p} = \gamma^{(a)}_{s,p}$ ,        (4E.7a)

where $\gamma'^{(a)}_{s,p}$ are the complex parameters specifying an s-type or p-type plane wave's passage
through slab $S'_a$ in Fig. 4E.4(b) and $\gamma^{(a)}_{s,p}$ are the complex parameters in Fig. 4E.4(a) that have
been defined in the previous paragraph to specify an s-type or p-type plane wave's first passage
through slab $S_a$. Clearly, since passage through slab $S'_a$ is just another name for the same event as
passage through slab $S_a$, Eq. (4E.7a) is trivially true; in fact, for this reason it makes sense to drop
the primes from the $\gamma'^{(a)}_{s,p}$ parameters, assuming them to be always the same as the $\gamma^{(a)}_{s,p}$
parameters. We next note that if the compensator plate in Fig. 4E.4(b) has the same shape as the
substrate slab and it is given the appropriate orientation parallel to the substrate slab, then the
angle of passage of any on-axis or slightly off-axis plane wave through slab $S_b$ in Fig. 4E.4(a) is
the same as the angle of passage of that same plane wave through slab $S'_b$ in Fig. 4E.4(b).
Consequently, we can regard the angle of incidence at which monochromatic plane waves
approach $S_b$ to be the same as the angle of incidence at which the monochromatic plane waves
approach $S'_b$, which means that if the plane waves have the same wavenumber then

    $\gamma'^{(b)}_{s,p} \cong \gamma^{(b)}_{s,p}$ ,        (4E.7b)

where $\gamma'^{(b)}_{s,p}$ are the complex parameters in Fig. 4E.4(b) specifying an s-type or p-type plane
wave's passage through slab $S'_b$, and $\gamma^{(b)}_{s,p}$ are the already-defined complex parameters in
Fig. 4E.4(a) specifying an s-type or p-type plane wave's passage through slab $S_b$. Even though we are
here saying that $\gamma'^{(b)}_{s,p}$ and $\gamma^{(b)}_{s,p}$ are only approximately equal because the compensator plate may
not be exactly matched in thickness and orientation to the moving-mirror arm's second pass
through the beam-splitter substrate, it still makes sense to idealize the situation and drop the
primes, assuming that in a well-designed interferometer all the $\gamma^{(b)}_{s,p}$ complex parameters are
effectively the same. Finally, we examine the angles of incidence of the plane wave on slab $S_c$ in
Fig. 4E.4(a) and on slab $S'_c$ in Fig. 4E.4(b). Even though the ray triplet hits the two slabs at
different places, the angles of incidence must always be the same for any on-axis or slightly
off-axis monochromatic plane wave. The plane wave incident on the virtual compensator plate $S'_c$ in
Fig. 4E.4(b) passes through the slab in reverse compared to $S'_b$; and, of course, the same
observation applies to $S_c$ compared to $S_b$ in Fig. 4E.4(a). So again, in a well-designed
interferometer, we know that the same compensator plate satisfying Eq. (4E.7b) can also satisfy

    $\gamma'^{(c)}_{s,p} \cong \gamma^{(c)}_{s,p}$ ,        (4E.7c)

where $\gamma'^{(c)}_{s,p}$ are the complex parameters specifying a plane wave's passage through slab $S'_c$ and
$\gamma^{(c)}_{s,p}$ are the previously defined complex parameters specifying a plane wave's passage through
slab $S_c$. In this situation, the primed and unprimed complex parameters may be only
approximately equal not only due to a slightly mismatched compensator plate but also because
the moving mirror may be slightly out of alignment, changing the angle of incidence from what it
ought to be.
When the moving mirror is slightly out of alignment, we know that angle $\theta_d$ defined at the
beginning of Sec. 4.12 of this chapter is nonzero and can give rise to a slight change in the angle
of propagation for on-axis and off-axis plane waves propagating back down the moving-mirror
arm of the interferometer and into the beam-splitter substrate for the third time. Angle $\theta_d$ is much
smaller than angle $\theta_b$, the typical angle at which a slightly off-axis plane wave propagates with
respect to the optical axis. According to the discussion associated with Eqs. (4E.6a)-(4E.6c), the
only reason to worry about the effect of slightly different angles of incidence on the complex
parameters associated with the beam-splitter substrate and compensator plate is that the phase,
but not the amplitude, of plane waves passing through these transparent slabs can depend
sensitively on the angle of passage. This change in the plane wave's phase shows up as a change
in the complex phase angle or argument of the $\gamma$ parameters and does not significantly affect the
value of their complex magnitudes $|\gamma|$. We now show that the typical $\theta_d$ angle is in fact small
enough to disregard its effect on the phase of the monochromatic plane waves, allowing us to
disregard its effect on the complex phase angle, and thus on the value, of the $\gamma$ parameters. This
means in particular that even for typical nonzero $\theta_d$ values we can drop the primes from the
$\gamma'^{(c)}_{s,p}$ complex parameters and assume that $\gamma^{(c)}_{s,p}$ and $\gamma'^{(c)}_{s,p}$ are always effectively the same
parameters in a well-designed instrument.
Figure 4E.5(a) shows the solid ray corresponding to a properly aligned plane wave
propagating down the z axis toward the origin of an x, y, z Cartesian coordinate system. The
hollow arrow going through the origin and lying in the x, z plane is the unit-normal vector of the
surface of the transparent slab corresponding to the beam-splitter substrate. The plane of
incidence of the solid ray is then of course the x, z plane of the coordinate system. The solid ray
makes an angle of incidence $\theta_A$ when it intersects the slab's surface at the origin. This ray is
labeled as ray 1. It refracts into the slab as ray I, still lying in the x, z plane of incidence and
having an angle of refraction

    $\theta_B = \sin^{-1}\!\left( \frac{n_A \sin\theta_A}{n_B} \right)$

from Eq. (4E.1a) above. The dashed ray, labeled ray 2 in Fig. 4E.5(a), corresponds to the
direction of propagation of the solid ray's plane wave when the moving mirror is slightly
misaligned, changing its direction of propagation by an angle $\theta_d$. There is no reason to expect $\theta_d$
to lie in the x, z plane, so the plane of incidence of ray 2 is depicted as being different from that of
ray 1. When ray 2 refracts at the origin, turning into ray II, we see that the angle between rays I
and II must be the same order of magnitude as $\theta_d$, so we write this angle as $O(\theta_d)$.
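The claim that a small change in the incident direction produces a change of the same order in the refracted direction can be checked with Snell's law. This is a minimal sketch under stated assumptions: the refractive indices (1.0 and 1.5), the 45-degree angle of incidence, and the restriction to a perturbation lying in the plane of incidence are all illustrative choices, not values from the text.

```python
import math

def refraction_angle(theta_a, n_a, n_b):
    """Angle of refraction from Snell's law: n_a*sin(theta_a) = n_b*sin(theta_b)."""
    return math.asin(n_a * math.sin(theta_a) / n_b)

# Hypothetical values: vacuum on the incident side, index 1.5 inside the slab.
n_a, n_b = 1.0, 1.5
theta_a = math.radians(45.0)
delta = 1.0e-4  # small in-plane change in the angle of incidence, in radians

theta_b = refraction_angle(theta_a, n_a, n_b)
theta_b_shifted = refraction_angle(theta_a + delta, n_a, n_b)

# Ratio of the change in the refracted angle to the change in the incident angle.
ratio = (theta_b_shifted - theta_b) / delta
```

For these sample values the ratio comes out close to one half, so the angle between the two refracted rays is indeed of the same order of magnitude as the angle between the two incident rays.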
Figure 4E.5(b) shows the plane containing refracted rays I and II. The intersection of this
plane with the flat surfaces of the beam-splitter substrate produces lines 1 and 2 in Fig. 4E.5(b),
showing where rays I and II enter and exit the slab. The $O(\theta_d)$ angle between them is also clearly
shown, now lying in the plane of the diagram. The distance between lines 1 and 2 must be O(w)
where w is the thickness of the slab. To estimate the monochromatic plane wave's change in
phase due to the $O(\theta_d)$ change in the angle of propagation through the slab, we evaluate

    $\frac{2\pi}{\lambda}\,\Delta s = \frac{2\pi}{\lambda}\, O(w)\,[O(\theta_d)]^2 = O\!\left( \frac{2\pi w \theta_d^2}{\lambda} \right)$ ,        (4E.8a)

where $\lambda$ is the smallest typical wavelength of the monochromatic plane wave.

FIGURE 4E.5(a).

We note that

    $\lambda \geq \lambda_{\min} = 10^{-4}$ cm

for infrared systems and that, according to inequality (4B.4b) in Appendix 4B of this chapter, the
maximum expected value of $\theta_d$ is

    $\theta_d = 10^{-4}$ radians .

The thickness w is typically on the order of 1 cm. We now see that [see requirement (4B.1c) in
Appendix 4B]

    $\frac{2\pi}{\lambda}\,\Delta s \leq O\!\left( \frac{2\pi \cdot 10^{-8}}{10^{-4}} \right) \cong 2\pi \times 10^{-4}$ .        (4E.8b)

This is clearly small enough to ignore, justifying our decision in the discussion following Eq.
(4E.7c) to disregard the difference between the complex $\gamma^{(c)}_{s,p}$ and $\gamma'^{(c)}_{s,p}$ parameters. Inequality
(4B.2a) in Appendix 4B reveals that $\theta_b$, the typical off-axis angle of propagation of the off-axis
plane waves, can be as large as $10^{-2}$ radians. Putting this into formula (4E.8a) gives

    $\frac{2\pi}{\lambda}\,\Delta s \cong O\!\left( \frac{2\pi \cdot 10^{-4}}{10^{-4}} \right) = O(2\pi)$ .        (4E.8c)

This is clearly too large to neglect, showing why we have been so careful to do the bookkeeping
on the phase changes undergone by the on-axis and off-axis monochromatic plane waves as they
propagate through the beam-splitter substrate and compensator plate.
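The two order-of-magnitude estimates can be reproduced directly from the numbers quoted in the text (a slab thickness of about 1 cm, a minimum wavelength of 10^-4 cm, a misalignment angle of 10^-4 radians, and an off-axis angle of 10^-2 radians). The sketch below is only a restatement of that arithmetic; the function name `phase_change_estimate` is an assumption made for the example, and all order-one factors are ignored, as they are in the estimate itself.

```python
import math

def phase_change_estimate(w_cm, angle_rad, wavelength_cm):
    """Order-of-magnitude phase change 2*pi*w*angle**2/wavelength, in radians."""
    return 2.0 * math.pi * w_cm * angle_rad ** 2 / wavelength_cm

w = 1.0                  # slab thickness in cm
wavelength_min = 1.0e-4  # smallest typical infrared wavelength in cm

# Moving-mirror misalignment angle of 1e-4 radians.
misalignment_phase = phase_change_estimate(w, 1.0e-4, wavelength_min)

# Typical off-axis propagation angle of 1e-2 radians.
off_axis_phase = phase_change_estimate(w, 1.0e-2, wavelength_min)
```

The first value is about 2π × 10^-4 radians, which is negligible, while the second is about 2π radians, a full cycle of phase. The factor of 10^4 between them comes entirely from the square of the ratio of the two angles.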

FIGURE 4E.5(b).

Appendix 4F
Figures 4F.1 and 4F.2 are tunnel diagrams like the ones used to explain the meaning of the
$\gamma^{(a)}_{s,p}$, $\gamma^{(b)}_{s,p}$, and $\gamma^{(c)}_{s,p}$ complex parameters introduced in Appendix 4E. The only difference is
that these tunnel diagrams apply to the monochromatic plane waves of the unbalanced background
optical signal coming from the detector side of a standard Michelson interferometer, while the
tunnel diagrams in Appendix 4E analyze the monochromatic plane waves for the balanced optical
signal entering the interferometer's front aperture.
Figures 4F.1 and 4F.2 show the path of a slightly off-axis plane wave represented by three
rays coming from the detector side of the interferometer. The tunnel diagram in Fig. 4F.1
corresponds to the rays transmitting through the beam-splitter film and substrate, reflecting off
the moving mirror, and then transmitting a second time through the beam-splitter substrate and
film on their way back to the detector. The tunnel diagram in Fig. 4F.2 corresponds to the rays
coming from the same off-axis direction, but this time they reflect off the back side of the
beam-splitter film, transmit twice through the compensator plate while going out and back along
the fixed-mirror arm, and then reflect a second time off the back side of the beam-splitter film to
return to the detector.
In Fig. 4F.1, the angle of incidence of the off-axis rays on the back side of the beam-splitter
substrate is slightly different from the angle of incidence on the virtual beam-splitter substrate
shown on the other side of the moving mirror in the tunnel diagram. We have found that the
change in a plane wave passing through a transparent slab can be described by a complex
parameter whose argument or complex phase angle depends sensitively on the angle of incidence
and whose magnitude does not [see, for example, Eqs. (4E.6a)-(4E.6c) in Appendix 4E]. The
tunnel diagram in Fig. 4F.1 shows that the angle of incidence for the second pass through the
beam-splitter substrate differs slightly from that of the first pass, so we call the complex
parameter for the second pass $\gamma^{(v)}_s$ for s-type monochromatic plane waves and $\gamma^{(v)}_p$ for the p-type
monochromatic plane waves, while the complex parameter for the first pass is called $\gamma^{(u)}_s$ for
s-type monochromatic plane waves and $\gamma^{(u)}_p$ for the p-type monochromatic plane waves.
In Fig. 4F.2, the angle of incidence of the slightly off-axis rays on their second pass through
the compensator plate must also be slightly different from the angle of incidence of their first
pass; we also note, however, that according to the tunnel diagrams in Figs. 4F.1 and 4F.2, the
angle of incidence of the first pass through the beam-splitter substrate must be the same as the
angle of incidence of the first pass through the compensator plate when the compensator plate is
correctly aligned parallel to the beam-splitter substrate. Similarly, the angle of incidence of the
second pass through the compensator plate must be the same as the angle of incidence of the
second pass through the beam-splitter substrate.
Hence, the same complex parameters $\gamma^{(u)}_{s,p}$ and $\gamma^{(v)}_{s,p}$ used for the first and second passes through
the beam-splitter substrate should also be used to describe the first and second passes of the
s-type and p-type monochromatic plane waves through the compensator plate.

FIGURE 4F.1.

FIGURE 4F.2.

For future use, we define

    $\gamma^{(uv)}_s = \gamma^{(u)}_s \gamma^{(v)}_s$        (4F.1a)

and

    $\gamma^{(uv)}_p = \gamma^{(u)}_p \gamma^{(v)}_p$ .        (4F.1b)

Just as in Eqs. (4E.6a)-(4E.6c), we know that

    $|\gamma^{(uv)}_{s,p}| \cong$ function only of the incident wavenumber $\sigma$        (4F.2a)

but

    $\arg(\gamma^{(uv)}_{s,p}) =$ function of both the angle of incidence and
                                  the incident wavenumber $\sigma$        (4F.2b)

so that

    $\gamma^{(uv)}_{s,p} =$ function of both the angle of incidence and
                            the incident wavenumber $\sigma$ .        (4F.2c)

The reason for this is the same as before: the change in phase of the slightly off-axis plane waves
passing through the beam-splitter substrate or compensator plate depends sensitively on their
angle of incidence while their loss of power does not. We also note that, according to the analysis
at the end of Appendix 4E of this chapter, this dependence of the phase change on the angle of
incidence is not so sensitive as to be affected by the very small misalignments of the moving
mirror that may occur in well-designed interferometers.
5
DESCRIPTION OF PRACTICAL
INTERFEROMETER MEASUREMENTS
The concept of spectral radiance was introduced in Chapter 4 to simplify the interference
equations, and it turns out to have a much wider range of usefulness than might at first be
expected. We start off this chapter with a quick description of how the spectral radiance can be
used to analyze the large-scale power flow and spectral content of electromagnetic radiation
fields, matching this to our intuitive understanding of what is meant by the brightness and
darkness of both near and distant objects. This is followed by a description of what is seen with
the naked eye when looking out through a standard Michelson interferometer, showing how it fits
in with the previous chapter's interference formulas. The somewhat abstract equations derived in
Chapter 4 are converted into more practical formulas, and we explain the consequences of the
nonrandom errors and signal distortions found in realistic instruments. We describe the balanced,
unbalanced, and off-axis interferogram signals as well as how calibration removes contaminating
background radiances from the measured spectra. The characteristic strengths and weaknesses of
double-sided and single-sided interferogram systems are discussed, and we analyze the
degradation introduced by nonflat optical surfaces. The signals produced in the detector are
traced through the anti-aliasing filter to the analog-to-digital converter, where they are
transformed into digital input for the discrete Fourier transform. The chapter ends with an
explanation of why it sometimes makes sense to oversample or undersample the interferogram
signals.
5.1 Radiometric Description of Electromagnetic Fields
Radiometry analyzes the power flow and spectral content of radiation fields. The analysis is
almost always done using length scales much larger than the typical wavelength of the radiation
and time scales much longer than the typical period of the radiation, allowing us to treat the
radiation as collections of beam-chopped and direction-chopped radiant beams (see Sec. 4.9 of
Chapter 4). It should be emphasized that this division into separate beams is entirely conceptual;
no apertures or lenses are required. In Chapter 4, we introduced the spectral radiance function
$L(\sigma)$ to describe the propagation of electromagnetic energy inside an interferometer beam.⁷⁹
There, the spectral radiance of the beam is defined in such a way that the amount of radiant
energy $dE_\sigma$ passing through a cross-sectional area A of a beam in time 2T into a solid angle $d\Omega$
and having a wavenumber between $\sigma$ and $\sigma + d\sigma$ is

    $dE_\sigma = L(\sigma) \, 2T \, A \, d\Omega \, d\sigma$ .        (5.1)

⁷⁹ See, for example, the discussion at the start of Sec. 4.16 of Chapter 4.

When any radiation field is analyzed as a collection of radiant beams, as we are doing here, the
idea of a spectral radiance can be applied to any large-scale description of electromagnetic
radiation. In Sec. 4.9 of the last chapter, parallel groups of rays are used to represent plane waves
inside a beam. In radiometry, these ray groups are bundled together into what are often called
pencils of rays,⁸⁰ or pencil rays for short, such that each pencil ray becomes an idealized
representation of a single conceptual beam of the radiation field. The pencil rays can be thought
of as channels along which electromagnetic energy flows. Just as the interferometer beam has a
spectral radiance, so too can a spectral radiance be assigned to every pencil ray of a large-scale
radiation field.
In radiometry we take the spectral radiance of each pencil ray to be a function $L(\sigma)$ such that the
amount of radiant energy $dE_\sigma$ passing through a cross-sectional area dA of a pencil ray in time dt
into a solid angle $d\Omega$ and having a wavenumber between $\sigma$ and $\sigma + d\sigma$ is

    $dE_\sigma = L(\sigma) \, dt \, dA \, d\Omega \, d\sigma$ .        (5.2a)

In this formula area dA has its normal vector parallel to the axis of the ray as shown in Fig. 5.1(a).
Equations (5.1) and (5.2a) can be matched to each other exactly if we make the associations

    $A \leftrightarrow dA$        (5.2b)

and

    $2T \leftrightarrow dt$ .        (5.2c)

This shows that to keep the radiometric $L(\sigma)$ function consistent with Maxwell's equations, the
physical quantities dA and dt, although mathematical infinitesimals, must always be thought of as
much larger than the wavelengths and periods of the propagating electromagnetic fields. If the
normal vector of area dA makes an angle $\theta$ with respect to the axis of the pencil ray, we expect
the effective area transverse to the beam to be $(\cos\theta)\,dA$ as shown in Fig. 5.1(b). Now the
energy propagating along the pencil ray is

    $dE_\sigma = L(\sigma) \, (dA \cos\theta) \, dt \, d\Omega \, d\sigma$ .        (5.2d)

80
Max Born and Emil Wolf, Principles of Optics, 7th exp. ed. (Macmillan Company, New York, 1964).
Radiometric Description of Electromagnetic Fields 5.1
FIGURE 5.1(a).

FIGURE 5.1(b).

[Labels recovered from Figs. 5.1(a) and 5.1(b): edge-on view of dA; edge-on view of (cos θ) dA;
pencil ray passing through dA; unit vector normal to dA.]

There is no particular reason to use wavenumbers to characterize the spectral distribution of
the energy flowing along the pencil rays. As a matter of fact, in radiometry the spectral radiance
of pencil rays is more likely to be given in terms of L_λ(λ), the spectral radiance with respect to
wavelength λ, or L_f(f), the spectral radiance with respect to frequency f. It is straightforward to
convert from L(σ), the spectral radiance with respect to wavenumber, to either L_λ(λ) or L_f(f).
To get L_λ(λ), we simply note that dE_λ, the radiant energy flowing in time dt through an area dA
making an angle θ with respect to the pencil ray into a solid angle dΩ and having a wavelength
between λ and λ + dλ should be

$$ dE_\lambda = L_\lambda(\lambda)\,dt\,(\cos\theta\,dA)\,d\Omega\,d\lambda . \tag{5.3a} $$

Similarly dE_f, the radiant energy flowing in time dt through an area dA making an angle θ with
respect to the pencil ray into a solid angle dΩ and having a frequency between f and f + df
should be

$$ dE_f = L_f(f)\,dt\,(\cos\theta\,dA)\,d\Omega\,df . \tag{5.3b} $$

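The total amount of radiant energy flowing along the ray should be the same no matter how the
spectrum is represented, so

$$ (\cos\theta\,dA)\,dt\,d\Omega \int_0^\infty L(\sigma)\,d\sigma
   = (\cos\theta\,dA)\,dt\,d\Omega \int_0^\infty L_f(f)\,df
   = (\cos\theta\,dA)\,dt\,d\Omega \int_0^\infty L_\lambda(\lambda)\,d\lambda $$

or

$$ \int_0^\infty L(\sigma)\,d\sigma = \int_0^\infty L_f(f)\,df \tag{5.3c} $$

and

$$ \int_0^\infty L(\sigma)\,d\sigma = \int_0^\infty L_\lambda(\lambda)\,d\lambda . \tag{5.3d} $$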

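But σ = 1/λ = f/c [see discussion immediately preceding Eq. (4.19c) in Chapter 4], which we
can use to change the variable of integration in these last two equations to get

$$ \int_0^\infty \frac{1}{c}\,L\!\left(\frac{f}{c}\right) df = \int_0^\infty L_f(f)\,df \tag{5.3e} $$

and

$$ \int_0^\infty \frac{1}{\lambda^2}\,L\!\left(\frac{1}{\lambda}\right) d\lambda
   = \int_0^\infty L_\lambda(\lambda)\,d\lambda . \tag{5.3f} $$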

These equations must hold true for any physically conceivable spectral radiance L(σ), which
means that

$$ L_f(f) = \frac{1}{c}\,L\!\left(\frac{f}{c}\right) \tag{5.3g} $$

and

$$ L_\lambda(\lambda) = \frac{1}{\lambda^2}\,L\!\left(\frac{1}{\lambda}\right) . \tag{5.3h} $$

This can be used to define L_f and L_λ in terms of L. The phrase "physically conceivable" lets us
assume that L(1/λ) → 0 as λ → 0 and that it does this fast enough to avoid any concern that the
right-hand side of (5.3h) becomes singular as λ → 0.
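
The conversions in Eqs. (5.3g) and (5.3h) can be checked numerically. The sketch below is only an
illustration: the Gaussian test spectrum, the integration limits, and the helper names (L_sigma,
L_f, L_lambda, integrate) are assumptions made for this example and do not come from the text. It
confirms that the three representations integrate to the same radiance, as Eqs. (5.3c) and (5.3d)
require.

```python
import math

C = 2.998e10  # speed of light in cm/sec, the value quoted in this chapter


def L_sigma(sigma):
    """Hypothetical test spectrum L(sigma): a Gaussian band near 1000 cm^-1."""
    return math.exp(-((sigma - 1000.0) / 100.0) ** 2)


def L_f(f):
    """Eq. (5.3g): L_f(f) = (1/c) L(f/c)."""
    return L_sigma(f / C) / C


def L_lambda(lam):
    """Eq. (5.3h): L_lambda(lambda) = (1/lambda^2) L(1/lambda)."""
    return L_sigma(1.0 / lam) / lam ** 2


def integrate(func, a, b, n=20000):
    """Composite midpoint rule on [a, b] with n equal panels."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))


# The test spectrum is negligible outside 200..1800 cm^-1, so finite limits suffice.
SIGMA_LO, SIGMA_HI = 200.0, 1800.0
radiance_from_sigma = integrate(L_sigma, SIGMA_LO, SIGMA_HI)
radiance_from_f = integrate(L_f, SIGMA_LO * C, SIGMA_HI * C)
# Wavelength runs the opposite way: large wavenumber maps to small wavelength.
radiance_from_lambda = integrate(L_lambda, 1.0 / SIGMA_HI, 1.0 / SIGMA_LO)

print(radiance_from_sigma, radiance_from_f, radiance_from_lambda)
```

Note that the wavelength integral runs from 1/SIGMA_HI to 1/SIGMA_LO, which is the reversal of
spectral ordering that the change of variable σ = 1/λ produces.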
Radiation escaping from relatively small holes in cavities whose walls are all at the same
temperature T is called black-body or Planck radiation. One of the first triumphs of quantum
mechanics at the beginning of the 20th century was to explain why the spectral radiance of this
sort of radiation is always given by the formula

$$ L^{\text{Planck}}(\sigma) = \frac{2hc^2\sigma^3}{e^{hc\sigma/(kT)} - 1} , \tag{5.3i} $$

where T is the temperature of the walls in degrees Kelvin (abbreviated as K),
h = 6.625 × 10^-27 erg sec is Planck's constant, k = 1.381 × 10^-16 erg/K is Boltzmann's constant,
and c = 2.998 × 10^10 cm/sec is the speed of light in empty space. Equivalent forms of this
equation come from applying formulas (5.3g) and (5.3h) to get

$$ L_f^{\text{Planck}}(f) = \frac{2hf^3/c^2}{e^{hf/(kT)} - 1} \tag{5.3j} $$

and

$$ L_\lambda^{\text{Planck}}(\lambda) = \frac{2hc^2/\lambda^5}{e^{hc/(\lambda kT)} - 1} . \tag{5.3k} $$
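
As a hedged illustration of Eqs. (5.3i) through (5.3k), the following sketch evaluates the three
Planck forms in the cgs units used above and checks that they are related by Eqs. (5.3g) and
(5.3h). The function names and the sample wavenumber are choices made for this example only.

```python
import math

H = 6.625e-27  # Planck's constant in erg sec (value quoted in the text)
K = 1.381e-16  # Boltzmann's constant in erg/K (value quoted in the text)
C = 2.998e10   # speed of light in cm/sec (value quoted in the text)


def planck_sigma(sigma, T):
    """Eq. (5.3i): spectral radiance per unit wavenumber, (erg/sec)/cm/sr."""
    return 2.0 * H * C ** 2 * sigma ** 3 / math.expm1(H * C * sigma / (K * T))


def planck_f(f, T):
    """Eq. (5.3j): spectral radiance per unit frequency, (erg/sec)/cm^2/sr/Hz."""
    return (2.0 * H * f ** 3 / C ** 2) / math.expm1(H * f / (K * T))


def planck_lambda(lam, T):
    """Eq. (5.3k): spectral radiance per unit wavelength, (erg/sec)/cm^3/sr."""
    return (2.0 * H * C ** 2 / lam ** 5) / math.expm1(H * C / (lam * K * T))


sigma = 1000.0  # sample wavenumber in cm^-1 (a wavelength of 10 microns)
for T in (300.0, 400.0, 500.0):
    via_5_3g = planck_sigma(sigma, T) / C          # (1/c) L(f/c) at f = c*sigma
    via_5_3h = planck_sigma(sigma, T) * sigma ** 2  # (1/lambda^2) L(1/lambda)
    print(T, planck_sigma(sigma, T), via_5_3g, planck_f(C * sigma, T),
          via_5_3h, planck_lambda(1.0 / sigma, T))
```

Using math.expm1 rather than math.exp(x) - 1 keeps the denominator accurate at small wavenumbers,
where the exponent is close to zero.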

We often use a gray-body approximation to get the spectral radiance for heat or infrared
radiation that a surface of temperature T spontaneously emits. To use the gray-body
approximation, we just multiply L^Planck at the surface's temperature T by a dimensionless fraction
between zero and one, which is called the surface's emissivity, with different surfaces having
different emissivity values. Sometimes, to get more accuracy, the emissivity is taken to be a
function of wavenumber and temperature; when this is done the L^Planck function is being used to
give the correct overall size and shape to the surface's spectral radiance while the spectral
dependence of the emissivity is used to reproduce the rapid fluctuations with respect to σ
characteristic of the surface.
Figures 5.2(a) through 5.2(c) contain plots of L^Planck(σ), L_f^Planck(f), and L_λ^Planck(λ) at
temperatures of 300 K, 400 K and 500 K. The spectral radiance increases with temperature at every
wavenumber, matching our intuition about what ought to occur. We note that at 300 K
(approximate room temperature) only negligible radiation is emitted in the visible region of the
electromagnetic spectrum between approximately 15,000 cm^-1 and 22,000 cm^-1 (which is, of
course, what we should expect), and the same is also true of the 400 K and 500 K curves.
(Surfaces in fact start to become visibly hot only at 700 K and higher.) Unfortunately, the Planck
curve is rather featureless, tending to conceal what is going on when we switch from L to L_f to
L_λ to represent the same radiance spectrum. Figures 5.2(d) through 5.2(f) show a more interesting
electromagnetic spectrum represented using the L(σ), L_f(f), and L_λ(λ) spectral radiance
functions. These plots reveal that the transformation from L to L_λ not only distorts the spectrum's
overall shape but also reverses the ordering of the spectral features, putting large wavenumber
features at small wavelengths and small wavenumber features at large wavelengths. The
transformation from L to L_f, on the other hand, just involves a rescaling of the x and y axes of the
spectrum. This latter transformation, then, acts like a simple change in our choice of units; and
for this reason the word "frequency" is sometimes used to refer to wavenumber. The idea behind
this terminology is that wavenumbers are just frequencies that happen to be measured in "units"
of cm^-1.

FIGURE 5.2(a). [Plot of L^Planck(σ), in (erg/sec)/cm/sr, versus σ (in cm^-1) from 0 to 4000 cm^-1,
with curves for 300 K, 400 K, and 500 K.]
FIGURE 5.2(b). [Plot of L_f^Planck(f), in (erg/sec)/cm^2/sr/Hz, versus f (in Hz) from 0 to
1.2x10^14 Hz, with curves for 300 K, 400 K, and 500 K.]
FIGURE 5.2(c). [Plot of L_λ^Planck(λ), in (erg/sec)/cm^3/sr, versus λ (in cm) from 0 to 0.005 cm,
with curves for 300 K, 400 K, and 500 K.]
FIGURE 5.2(d). [Plot of L(σ), in Watts/cm^2/sr/cm^-1, versus σ (in cm^-1) from 0 to 3000 cm^-1.]
FIGURE 5.2(e). [Plot of L_f(f), in Watts/cm^2/sr/Hz, versus f (in TeraHz) from 0 to 100 TeraHz.]
FIGURE 5.2(f). [Plot of L_λ(λ), in Watts/cm^3/sr, versus λ (in microns) from 0 to 40 microns.]
Different authors use different notations for L, L_f, and L_λ. The easiest way to find out what
exactly is meant by the term "spectral radiance" is to check the units. Consulting Eqs. (5.2d),
(5.3a), and (5.3b), we see that L must have units of energy per unit time per unit area per unit
solid angle per unit wavenumber interval, whereas L_f has units of energy per unit time per unit
area per unit solid angle per unit frequency interval and L_λ has units of energy per unit time per
unit area per unit solid angle per unit wavelength interval. Although solid angles measured in
steradians, like angles measured in radians, are strictly speaking dimensionless, it is customary in
radiometry to write out the steradian unit explicitly, treating it as if it had a dimension. This
convention makes it easy to distinguish physical quantities such as the spectral radiance, which
are both "per unit surface area" and "per unit steradian," from physical quantities such as the
energy flux that are just "per unit surface area."
To go from the spectral radiance to the radiance, we need only integrate L(σ) over all positive
wavenumbers, integrate L_f(f) over all positive frequencies, or integrate L_λ(λ) over all
wavelengths. Using ℓ to represent the radiance, we say that

$$ \ell = \int_0^\infty L(\sigma)\,d\sigma = \int_0^\infty L_f(f)\,df
        = \int_0^\infty L_\lambda(\lambda)\,d\lambda . \tag{5.4a} $$

The integrals are between 0 and ∞ because L and L_f are defined in such a way as to spread the
radiant energy over positive wavenumbers and frequencies respectively, and wavelength, of
course, must be a positive quantity. In this sense, they are all analogous to the single-sided power
spectra discussed at the end of Sec. 3.23 in Chapter 3. We integrate Eq. (5.2d) over positive σ and
use (5.4a) to get that the total energy dE flowing in time dt through an area dA making an angle θ
with respect to the pencil ray into a solid angle dΩ is

$$ dE = \ell\,(\cos\theta\,dA)\,dt\,d\Omega . \tag{5.4b} $$

The same formula comes from integrating Eqs. (5.3a) or (5.3b) over positive wavelengths or
frequencies respectively. Different authors may use different notations for the radiance, and
again the surest way to find out what is going on is to check the units. The units of the radiance ℓ
are, of course, energy per unit time per unit area per unit solid angle.
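
One way to exercise Eq. (5.4a) is to integrate the Planck spectral radiance of Eq. (5.3i) over
wavenumber and compare the result with the standard closed form for black-body radiance,
2π^4 k^4 T^4 / (15 h^3 c^2), which is the Stefan-Boltzmann law divided by π. The sketch below is an
illustrative check and not part of the original development; the integration cutoff sigma_max and
the function names are assumptions.

```python
import math

H = 6.625e-27  # Planck's constant in erg sec (value quoted in the text)
K = 1.381e-16  # Boltzmann's constant in erg/K (value quoted in the text)
C = 2.998e10   # speed of light in cm/sec (value quoted in the text)


def planck_sigma(sigma, T):
    """Eq. (5.3i): Planck spectral radiance per unit wavenumber."""
    return 2.0 * H * C ** 2 * sigma ** 3 / math.expm1(H * C * sigma / (K * T))


def radiance_numeric(T, sigma_max=20000.0, n=20000):
    """Eq. (5.4a) by the midpoint rule; sigma_max is an assumed cutoff in cm^-1."""
    step = sigma_max / n
    return step * sum(planck_sigma((i + 0.5) * step, T) for i in range(n))


def radiance_closed_form(T):
    """Black-body radiance in (erg/sec)/cm^2/sr from the standard closed form."""
    return 2.0 * math.pi ** 4 * K ** 4 * T ** 4 / (15.0 * H ** 3 * C ** 2)


for T in (300.0, 400.0, 500.0):
    print(T, radiance_numeric(T), radiance_closed_form(T))
```

The cutoff is adequate for the temperatures plotted in Fig. 5.2 because the Planck curve is
negligible well before 20,000 cm^-1 at 500 K and below.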
5.2 Radiance Fields in Space
The solid angle dΩ referred to in the definitions of L and ℓ [see Eqs. (5.2a), (5.4a), and (5.4b)]
can be taken to extend either forward or backward along the pencil ray, as shown in Fig. 5.3(a).
We can place two areas dA_1 and dA_2 at positions 1 and 2 along the same pencil ray, with the
normals of dA_1 and dA_2 making angles θ_1 and θ_2 with respect to the ray as shown in Fig. 5.3(b).
The amount of radiant energy passing through dA_1 in time dt into a solid angle dΩ_1 is

$$ \ell_1\,(\cos\theta_1\,dA_1)\,dt\,d\Omega_1 , \tag{5.5a} $$

where ℓ_1 is the radiance at position 1 along the pencil ray. Similarly, the amount of radiant energy
passing through dA_2 in time dt into a solid angle dΩ_2 is

$$ \ell_2\,(\cos\theta_2\,dA_2)\,dt\,d\Omega_2 , \tag{5.5b} $$

where ℓ_2 is the radiance at position 2 along the pencil ray. The values of ℓ_1 and ℓ_2 cannot depend
on the size of the infinitesimal quantities dA_1, dA_2, dΩ_1, dΩ_2, or dt, so nothing stops us from
choosing dΩ_1 and dΩ_2 to be the solid angles subtended by dA_2 and dA_1 at positions 1 and 2
respectively:

$$ d\Omega_1 = \frac{\cos\theta_2\,dA_2}{r^2} \tag{5.5c} $$

and

$$ d\Omega_2 = \frac{\cos\theta_1\,dA_1}{r^2} , \tag{5.5d} $$

where, as shown in Fig. 5.3(b), r is the distance between positions 1 and 2.

______________________________________________________________________________

FIGURE 5.3(a). [Labels: unit vector normal to area dA; area dA; solid angle dΩ surrounding
pencil ray.]

FIGURE 5.3(b).

______________________________________________________________________________

If we make the reasonable assumption that energy travels in straight lines inside a homogeneous
medium, as shown by the dotted lines in Fig. 5.3(c), and also specify that the values of ℓ_1 and ℓ_2
do not change with time, then the radiant energy passing through dA_1 into dΩ_1 in time dt must be
the same as the radiant energy passing through dA_2 into dΩ_2 in time dt. From Eqs. (5.5a) through
(5.5d) we then have

$$ \ell_1\,(\cos\theta_1\,dA_1)\,dt\,\frac{\cos\theta_2\,dA_2}{r^2}
   = \ell_2\,(\cos\theta_2\,dA_2)\,dt\,\frac{\cos\theta_1\,dA_1}{r^2} , \tag{5.5e} $$

which reduces to

$$ \ell_1 = \ell_2 . \tag{5.5f} $$

Hence when the radiance is not changing with time it must also be constant along any pencil
of rays.
We have now established a self-consistent model for radiation fields in empty space and
transparent media. To find the radiance at any point, such as point A in Fig. 5.4, we need only
take note of all the criss-crossing pencil rays, establishing their radiances by tracing them back to
the surfaces where they originated. It does not matter whether the surface has reflected them like
surface 1 or, being self-luminous, has created them like surfaces 2 and 3; all that is relevant is the
radiance value they have when leaving the surface. There is nothing special about point A in Fig.
5.4; its location is obviously arbitrary.
[Labels for Fig. 5.3(b): Position 1; Position 2; dA_1; dA_2; unit vectors normal to dA_1 and dA_2;
dΩ_1; dΩ_2; r; θ_1; θ_2.]

FIGURE 5.3(c). [Labels: Position 1; Position 2; dA_1; dA_2.]

FIGURE 5.4. [Labels: Point A; Surface 1; Surface 2; Surface 3.]
By moving point A around and specifying the radiances of the different pencil rays passing
through point A, we construct a radiance field $\ell(\vec{r}, \hat{\Omega})$ that is a function both
of position $\vec{r}$ and direction $\hat{\Omega}$. Having picked a position $\vec{r}$ at which to
evaluate ℓ, we need $\hat{\Omega}$ as well to specify one particular pencil ray passing through
position $\vec{r}$. It is even possible, once the radiance ℓ is thought of as a function of
$\vec{r}$ and $\hat{\Omega}$, to derive a simple differential equation describing the
gradual change in radiance undergone by these pencil beams when they travel through
semiopaque and self-luminous media, such as clouds of radiating gas. This last idea is the starting
point for modeling radiance fields inside stars or planetary atmospheres, but is not really needed
for the material in this book.^81
Along with the radiance field $\ell(\vec{r}, \hat{\Omega})$, which is a function of position and
direction of propagation, we can associate a spectral radiance field
$L(\sigma, \vec{r}, \hat{\Omega})$ that is a function of wavenumber σ, position $\vec{r}$, and
direction of propagation $\hat{\Omega}$ such that

$$ \ell(\vec{r}, \hat{\Omega}) = \int_0^\infty L(\sigma, \vec{r}, \hat{\Omega})\,d\sigma . \tag{5.6} $$

Suppressing the r
G
dependence on position to represent a radiance field that is constant over some
region of space, and choosing a direction in space to be the z axis of a coordinate system so that

2
1 z r r O +
G
,

as in Eq. (4.97d) of Chapter 4, we can write L as a function of r
G
, as in L( , ) r o
G
, to show its
dependence on the radiations direction of propagation. This function L is the same quantity as
the spectral radiance ( , ) r o L
G
specified by Eq. (4.136d) in Chapter 4. Many times in the rest of
this chapter we will talk about a single pencil ray from a distant source passing through an
interferometer. The pencil ray has, of course, a unique spectral radiance ( , ) r o L
G
; and the pencil
ray while passing through the interferometer can be decomposed into a group of parallel rays
because it emanates from a distant source. This parallel group of rays, according to Sec. 4.9 of
Chapter 4, specifies a plane wave passing through the interferometer. To get the optical energy
per unit area per unit time per unit wavenumber interval carried by the plane wave, we just
multiply the spectral radiance of the pencil ray by the extremely small solid angle d subtended

81
The interested reader is referred to S. Chandrasekhar, Radiative Transfer (Dover Publications, New York, 1960)
for a classic textbook, or Curtis D. Mobley, Light and Water: Radiative Transfer in Natural Waters (Academic
Press, New York, 1994), based in part on collaborations with Rudolf W. Preisendorfer, for a more modern work in
this field. What we call radiance and spectral radiance, Chandrasekhar calls, respectively, intensity and specific
intensity.
by the distant source at the position of the interferometer. This procedure amounts to nothing
more than mentally associating d with ( , ) L
G
in Eq. (5.2a) to get

[ ]
( , ) dE d dt dA d
= L
G
.
Writing this equality as

[ ]
( , )
dE
d
dt dA d
=

L
G

makes it easy to see why multiplying the spectral radiance of the pencil ray by d gives the
optical energy of the plane wave per unit time per unit area per unit wavenumber interval.
5.3 Radiance, Brightness, and the Inverse-Square Law
One interesting consequence of the radiance l being constant along any pencil ray is that we can
immediately identify the radiance with our subjective sensation of the brightness of an
illuminated or luminous surface. No matter what the distance between the observer and the
surface patch in Fig. 5.5, we know that, as long as the surface patch is close enough for its shape
to be discerned, the surface brightness remains the same. This is, of course, easily explained by
noting that the radiance along any pencil ray between the surface patch and the eye of the
observer does not change. We hasten to add that the radiance turns out not to be exactly the same
as the subjective notion of brightness because, when measuring radiance, all the energy in the ray
must be recorded no matter what its wavelength, and the human eye is more sensitive to some
wavelengths of visible light than to others.
Figure 5.5 also shows how to recover the well-known inverse-square law for the amount of
radiant energy perceived by an observer. Although the observer sees any point on the surface
patch as equally bright at positions A and B, the surface patch itself (or, to be more precise, its
image inside the observer's eye) shrinks as the distance increases. If the surface patch has an
area a_surf, the eye's pupil has an area a_pupil, and we assign all the pencil rays coming from the
surface patch the same radiance ℓ_surf, then Eq. (5.4b) above states that the radiant energy entering
the eye at position A in time dt is

$$ dE_A = \ell_{\text{surf}}\,a_{\text{pupil}}\,d\Omega_A\,dt , \tag{5.7a} $$

where

$$ d\Omega_A = \frac{a_{\text{surf}}}{r_A^2} \tag{5.7b} $$

is the solid angle subtended by the surface patch at position A.

FIGURE 5.5. [Labels: observer positions A and B at distances r_A and r_B from the surface patch;
what observer at point A sees; what observer at point B sees.]

Similarly, the radiant energy entering the eye at position B in time dt is

$$ dE_B = \ell_{\text{surf}}\,a_{\text{pupil}}\,d\Omega_B\,dt , \tag{5.8a} $$

where

$$ d\Omega_B = \frac{a_{\text{surf}}}{r_B^2} \tag{5.8b} $$

is the solid angle subtended by the surface patch at position B. Substitution of (5.7b) into (5.7a)
and (5.8b) into (5.8a) gives

$$ P_A = \frac{dE_A}{dt} = \frac{\ell_{\text{surf}}\,a_{\text{pupil}}\,a_{\text{surf}}}{r_A^2} \tag{5.9a} $$

and

$$ P_B = \frac{dE_B}{dt} = \frac{\ell_{\text{surf}}\,a_{\text{pupil}}\,a_{\text{surf}}}{r_B^2} , \tag{5.9b} $$

where P_A and P_B are, respectively, the radiant power entering the observer's eye at positions A
and B. This result can be written as

$$ \frac{P_B}{P_A} = \frac{r_A^2}{r_B^2} , \tag{5.9c} $$

showing how the familiar inverse-square law for radiant power hides inside the rule that the
radiance along any pencil ray is constant.
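
A short numerical sketch of Eqs. (5.9a) through (5.9c) follows. All of the numbers (radiance, patch
area, pupil area, and the two distances) are hypothetical values picked only to illustrate the
scaling; the function name is likewise an assumption.

```python
def power_into_eye(radiance_surf, a_surf, a_pupil, r):
    """Eqs. (5.9a)/(5.9b): P = l_surf * a_pupil * a_surf / r^2 (cgs units)."""
    return radiance_surf * a_pupil * a_surf / r ** 2


# Hypothetical values: radiance in (erg/sec)/cm^2/sr, areas in cm^2, distances in cm.
radiance_surf = 1.0e3
a_surf = 100.0
a_pupil = 0.2
r_A = 1.0e3
r_B = 3.0e3

P_A = power_into_eye(radiance_surf, a_surf, a_pupil, r_A)
P_B = power_into_eye(radiance_surf, a_surf, a_pupil, r_B)

# Eq. (5.9c): tripling the distance cuts the received power by a factor of nine,
# even though the radiance of every pencil ray is unchanged.
print(P_B / P_A, (r_A / r_B) ** 2)
```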
The idea that the interior points of a surface patch can have a "brightness" only makes sense
when the observer is near enough to resolve (or see) the shape of the surface patch. When the
observer is so distant that the surface patch is just a point of light, we say that the image of the
surface patch is unresolved. Now the brightness of that point of light follows the inverse-square
law directly by growing ever dimmer as the distance between the observer and surface patch
increases. The "brightness" of an unresolved point source, then, depends not on the radiance of
the pencil ray emanating from that source but rather on the total radiant power entering the
observer's eye.
5.4 The Balanced Signal of a Michelson Interferometer
Suppose a pencil ray from a distant object passes through an idealized Michelson interferometer
with a beam having a circular cross section. The object is so far away that it acts like an
unresolved point source and to the naked eye it looks like a bright star. This means, according to
the work done in Sec. 5.3, that the total power entering the naked eye determines the perceived
brightness of the source. To keep things simple, we assume the radiation in the pencil ray is
unpolarized.
We unbundle the pencil ray inside the interferometer into a collection of parallel rays as
shown in Fig. 5.6, turning it into a single plane wave of the type discussed in Chapter 4. The
plane wave's propagation vector is parallel to the interferometer's optical axis. We have, using
the notation of Eq. (4.135f) of Chapter 4, that cos θ = 1 because θ, the propagation angle with
respect to the optical axis, is zero. The only source of radiation present in the system is the pencil
ray, so we can take the interferometer's field of view Ω to be the extremely small solid angle
subtended by the distant source at the position of the interferometer. Now the appropriate formula
to pull from Chapter 4 to describe the radiant power passing through the interferometer is
(4.140a). This equation can be written as

$$ P_{\text{bal}}(\chi) = \frac{1}{2}\int_0^\infty S(\sigma)\,
   \bigl[\,1 + W\,M(\sigma R\theta_{\text{ma}})\cos(2\pi\sigma\chi)\,\bigr]\,d\sigma , \tag{5.10a} $$

where

$$ S(\sigma) = \eta(\sigma)\,A\,\Omega\,L(\sigma) \tag{5.10b} $$

and

$$ M(\sigma R\theta_{\text{ma}}) = \frac{J_1(4\pi\sigma R\theta_{\text{ma}})}{2\pi\sigma R\theta_{\text{ma}}} . \tag{5.10c} $$

FIGURE 5.6. [Labels: parallel rays coming from distant point source; ideal beam splitter; fixed
mirror; moving mirror; χ = 2a.]

For an ideal interferometer the beam-splitter efficiency η is always one and so S(σ) specified by
(5.10b) is the same S(σ) specified by Eq. (1.19d) in Chapter 1 and (4.140c) in Chapter 4. Function
P_bal(χ) gives the optical power in the balanced interference signal coming from the point source,
and we often call P_bal the balanced interference signal when context makes it clear what is meant.
Figure 5.6 shows the source observed through the interferometer by the unaided eye, so in Eq.
(5.10b) the effective cross-sectional area A of the interferometer beam is the area of the eye's
circular entrance pupil and L(σ), of course, is the spectral radiance of the pencil ray from the
distant source. The beam-splitter efficiency η(σ), which is a function of wavenumber, reminds us
that radiation of wavenumber σ only contributes to the interference signal to the extent that it
penetrates the beam splitter; wavenumbers for which the beam splitter is opaque so that η = 0,
for example, cannot contribute to P_bal(χ). As is pointed out in the discussion following Eq.
(4.136i) of Chapter 4, we expect that

$$ 0 < \eta < 1 $$

for realistic interferometers. In Eq. (5.10c) the radius R of the eye's circular entrance pupil is
related to the pupil area by the standard formula

$$ R = \sqrt{\frac{A}{\pi}} , \tag{5.10d} $$

and θ_ma is the misalignment angle of the moving mirror. For an ideal interferometer that is in
perfect alignment, θ_ma = 0; and, according to Eq. (4.137k) of Chapter 4, when θ_ma is zero

$$ M(0) = \left. \frac{J_1(4\pi\sigma R\theta_{\text{ma}})}{2\pi\sigma R\theta_{\text{ma}}}
   \right|_{\theta_{\text{ma}} = 0} = 1 . \tag{5.10e} $$

This means M = 1 is a shorthand for the assumption that the interferometer is in perfect
alignment. For future use we also note that, according to Eq. (4.137g),

$$ M(-\sigma R\theta_{\text{ma}}) = M(\sigma R\theta_{\text{ma}}) , \tag{5.10f} $$

making M an even function of wavenumber σ. Figure 4.24 of Chapter 4 shows that

$$ M \le 1 \tag{5.10g} $$

always. Unless otherwise stated, we assume in this chapter that M is constant, postponing until
the next chapter any discussion of what happens when M changes while the interferometer is
measuring spectra.
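
The behavior of the M function in Eqs. (5.10c) and (5.10e) through (5.10g) can be sketched with
nothing more than the standard library. The code below is an illustration under stated
assumptions: J_1 is evaluated from the standard integral representation
J_1(x) = (1/π) ∫ cos(τ - x sin τ) dτ over 0 to π instead of a library routine, the argument u
stands for the product σ R θ_ma, and the sample values of σ, R, and θ_ma are hypothetical.

```python
import math


def bessel_j1(x, n=2000):
    """J_1(x) from (1/pi) * integral_0^pi cos(tau - x sin(tau)) dtau, midpoint rule."""
    h = math.pi / n
    total = 0.0
    for i in range(n):
        tau = (i + 0.5) * h
        total += math.cos(tau - x * math.sin(tau))
    return total * h / math.pi


def M(u):
    """Eq. (5.10c) with u = sigma * R * theta_ma; the u = 0 case uses Eq. (5.10e)."""
    if u == 0.0:
        return 1.0
    x = 4.0 * math.pi * u
    return 2.0 * bessel_j1(x) / x  # same as J_1(4 pi u) / (2 pi u)


# Hypothetical example: sigma = 2000 cm^-1, R = 0.3 cm, theta_ma = 1e-5 rad.
u_example = 2000.0 * 0.3 * 1.0e-5
print(M(0.0), M(u_example), M(-u_example))
```

Because u is the only argument, the same curve describes a change in wavenumber, pupil radius, or
misalignment angle, which is why a small θ_ma keeps M close to one across the whole spectrum.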
To show how formula (5.10a) works, we choose a specific shape for the spectral radiance
L(σ), making the idealization that M = η = 1 at all σ for which L(σ) ≠ 0. Figure 5.7 specifies the
shape of L(σ), and according to Eq. (5.10b) the single-sided power spectrum

$$ S(\sigma) = A\,\Omega\,L(\sigma) $$

must have the same shape as L because A and Ω are constant.

FIGURE 5.7. [Plot of L(σ) versus σ (in cm^-1), with the horizontal axis running from -2500 to
2500 cm^-1.]
When a = 0 in Fig. 5.6, the optical path difference χ, which is

$$ \chi = 2a \tag{5.11a} $$

for this interferometer, is also zero. This means that when a = 0 the moving mirror is at the
zero-path difference (ZPD) position shown by the dashed line in Fig. 5.6. From (5.10a) we see that
at ZPD when M = η = 1

$$ P_{\text{bal}}(0) = \frac{1}{2}(1 + W)\int_0^\infty S(\sigma)\,d\sigma
   = \begin{cases} P_{\text{inp}} & \text{for } W = 1 \\ 0 & \text{for } W = -1 \end{cases} , \tag{5.11b} $$

where

$$ P_{\text{inp}} = \int_0^\infty S(\sigma)\,d\sigma = \text{input radiant power in pencil ray} . \tag{5.11c} $$

Evaluating (5.10a) when M = η = 1 for all nonzero values of χ = 2a, we get the two different
P_bal curves shown in Figs. 5.8(a) and 5.8(b). When χ = 2a = 0 and W = 1, we see that P_bal(0) in
Eq. (5.11b) specifies the maximum possible value for the interference signal; and when W = -1,
we see that P_bal(0) specifies the minimum possible value for the interference signal. The observer
in Fig. 5.6 sees the starlike source disappear when the pencil ray passes through an ideal
interferometer that has its moving mirror at ZPD and a beam splitter with W = -1. When the
pencil ray passes through an ideal interferometer whose beam splitter has W = 1, the observer
sees the full brightness of the starlike source when the moving mirror is at ZPD. If χ is changed
by shifting the moving mirror, both Figs. 5.8(a) and 5.8(b) show how, for this ideal
interferometer, the source brightness seen by the observer oscillates around P_inp/2, half the full
brightness of the starlike source. We note that when a and χ are positive (that is, when
χ = 2a > 0), the moving mirror is more distant from the beam splitter than it is at ZPD; and
when a and χ are negative (that is, when χ = 2a < 0), the moving mirror is closer to the beam
splitter than it is at ZPD. Because

$$ \cos(-2\pi\sigma\chi) = \cos(2\pi\sigma\chi) , $$

Eq. (5.10a) also shows that

$$ P_{\text{bal}}(-\chi) = P_{\text{bal}}(\chi) , \tag{5.12} $$

which means that P_bal is an even function of the optical-path difference. Consequently the
observer sees the same source brightness when the moving mirror moves off ZPD and away from
the beam splitter by a distance |a| = |χ|/2 as he does when the moving mirror moves off ZPD and
closer to the beam splitter by the same distance |a| = |χ|/2.
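
The properties just described can be reproduced by evaluating Eq. (5.10a) numerically for the
ideal case M = η = 1. In the sketch below the Gaussian power spectrum S, the integration limits,
and the helper names are assumptions chosen for illustration; they are not the spectrum of Fig. 5.7.

```python
import math


def S(sigma):
    """Hypothetical single-sided power spectrum: a Gaussian band near 1000 cm^-1."""
    return math.exp(-((sigma - 1000.0) / 200.0) ** 2)


def integrate(func, lo=0.0, hi=2500.0, n=5000):
    """Composite midpoint rule over the positive wavenumbers where S is non-negligible."""
    h = (hi - lo) / n
    return h * sum(func(lo + (i + 0.5) * h) for i in range(n))


def p_bal(chi, W, M=1.0):
    """Eq. (5.10a) with constant M; chi is the optical path difference in cm."""
    return 0.5 * integrate(
        lambda s: S(s) * (1.0 + W * M * math.cos(2.0 * math.pi * s * chi)))


p_inp = integrate(S)  # Eq. (5.11c)

print("ZPD, W = +1:", p_bal(0.0, +1), "vs P_inp =", p_inp)
print("ZPD, W = -1:", p_bal(0.0, -1))
print("even in chi:", p_bal(0.002, +1), p_bal(-0.002, +1))
print("large chi:  ", p_bal(0.01, +1), "vs P_inp/2 =", 0.5 * p_inp)
```

The last line shows the signal settling at P_inp/2 once χ is large compared with the reciprocal
of the spectral width, which is the behavior seen at the edges of Figs. 5.8(a) and 5.8(b).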
FIGURE 5.8(a). [for W = 1] [Plot of P_bal(χ) versus χ (in cm) from -0.01 to 0.01 cm; marked
levels: 0.0, P_inp/2, and P_inp.]

FIGURE 5.8(b). [for W = -1] [Plot of P_bal(χ) versus χ (in cm) from -0.01 to 0.01 cm; marked
levels: 0.0, P_inp/2, and P_inp.]
Following the notation introduced in Eq. (4.141a) of Chapter 4, we say the ideal interferogram
of the balanced power spectrum is

$$ I_{\text{bal}}^{(\text{ideal})}(\chi) = \frac{1}{2}\int_0^\infty S(\sigma)\cos(2\pi\sigma\chi)\,d\sigma . \tag{5.13a} $$

When M = η = 1, we can write the interference signal in Eq. (5.10a) as

$$ P_{\text{bal}}(\chi) = \frac{1}{2}\int_0^\infty S(\sigma)\,d\sigma
   + W\,I_{\text{bal}}^{(\text{ideal})}(\chi) \tag{5.13b} $$

or, using Eq. (5.11c),

$$ P_{\text{bal}}(\chi) = \frac{1}{2}P_{\text{inp}} + W\,I_{\text{bal}}^{(\text{ideal})}(\chi) . \tag{5.13c} $$

Figure 5.8(c) shows the I_bal^(ideal)(χ) interferogram that corresponds to both the W = 1 and the
W = -1 interference signals.

FIGURE 5.8(c). [Plot of I_bal^(ideal)(χ) versus χ (in cm) from -0.01 to 0.01 cm; marked levels:
-P_inp/2, 0.0, and P_inp/2.]
Since there are values of χ for which

$$ P_{\text{bal}}(\chi) < (1/2)\,P_{\text{inp}} , $$

the interferogram I_bal^(ideal) takes on negative as well as positive values. A negative interferogram
value does not mean the total optical power reaching the observer has gone negative (this cannot
ever happen, of course) but just that the interference signal has dropped below P_inp/2. One easy
way to keep track of the distinction between the interferogram signal and the interference signal
is to remember that the interferogram signal has negative values whereas the interference signal
is never negative. Because, according to Eq. (5.12),

$$ P_{\text{bal}}(-\chi) = P_{\text{bal}}(\chi) , $$

we can conclude from Eq. (5.13c) that

$$ W\,I_{\text{bal}}^{(\text{ideal})}(-\chi) = P_{\text{bal}}(-\chi) - \frac{1}{2}P_{\text{inp}}
   = P_{\text{bal}}(\chi) - \frac{1}{2}P_{\text{inp}} = W\,I_{\text{bal}}^{(\text{ideal})}(\chi) $$

or

$$ I_{\text{bal}}^{(\text{ideal})}(-\chi) = I_{\text{bal}}^{(\text{ideal})}(\chi) . \tag{5.14} $$

Hence both P_bal and I_bal^(ideal) are even functions of χ. Since the interference signal P_bal
approaches P_inp/2 as χ gets large in Figs. 5.8(a) and 5.8(b), the balanced interferogram

$$ I_{\text{bal}}^{(\text{ideal})}(\chi) = \frac{P_{\text{bal}}(\chi) - (1/2)\,P_{\text{inp}}}{W} \tag{5.15} $$

approaches zero for large values of χ in Fig. 5.8(c). This behavior is typical of all
interferograms; the only way to avoid it is to make the power spectrum a delta function,

$$ S(\sigma) = S_0\,\delta(\sigma - \sigma_0) , \tag{5.16a} $$

of the type discussed in Sec. 2.14 of Chapter 2 [see also Fig. 5.9(a)]. This delta function
represents monochromatic light of wavenumber σ_0 coming from the distant source. Equation
(5.11c) now requires P_inp = S_0, so according to Eqs. (5.13a) and (5.13c) the balanced interference
signal P_bal becomes

$$ P_{\text{bal}}(\chi) = \frac{S_0}{2} + \frac{W S_0}{2}\cos(2\pi\sigma_0\chi)
   = \frac{S_0}{2}\bigl[\,1 + W\cos(2\pi\sigma_0\chi)\,\bigr] , \tag{5.16b} $$

which is plotted in Figs. 5.9(b) and 5.9(c) for W = 1 and W = -1. Equation (5.15) gives the
associated interferogram

$$ I_{\text{bal}}^{(\text{ideal})}(\chi) = \frac{S_0}{2}\cos(2\pi\sigma_0\chi) , \tag{5.16c} $$

which we plot in Fig. 5.9(d). Formula (5.16b) is clearly identical to Eq. (1.17d) in Chapter 1 after
we set up the correspondences

( )
P
i
cb
f bal
I ,
(0)
0
i
f
I S , and
0
i
f
.

This ideal delta-function spectrum can be approximated by passing a laser through the
interferometer, producing interferograms resembling the one shown in Fig. 5.9(d). Even lasers,
however, have a small but finite spread in their power spectra, causing their interferograms to
approach zero at extremely large values of χ.
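
The effect of a finite line width can be sketched numerically from Eq. (5.13a). The example
below assumes a Gaussian line shape exp(-((σ - σ_0)/w)^2), which is an illustrative choice and
not a claim about real laser spectra. For this shape the standard Gaussian Fourier-transform
pair gives fringes of the form in Eq. (5.16c) multiplied by the envelope exp(-(π w χ)^2), so the
narrower the line, the larger χ must be before the interferogram dies away.

```python
import math

SIGMA_0 = 1000.0  # hypothetical line center in cm^-1


def interferogram(chi, width, n=20000):
    """Eq. (5.13a) by the midpoint rule for S = exp(-((sigma - SIGMA_0)/width)^2)."""
    lo, hi = SIGMA_0 - 20.0 * width, SIGMA_0 + 20.0 * width
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        s = lo + (i + 0.5) * h
        total += (math.exp(-((s - SIGMA_0) / width) ** 2)
                  * math.cos(2.0 * math.pi * s * chi))
    return 0.5 * h * total


def closed_form(chi, width):
    """Gaussian Fourier pair: (S_0/2) exp(-(pi w chi)^2) cos(2 pi sigma_0 chi)."""
    s0 = width * math.sqrt(math.pi)  # total power in the line, i.e. P_inp
    return (0.5 * s0 * math.exp(-(math.pi * width * chi) ** 2)
            * math.cos(2.0 * math.pi * SIGMA_0 * chi))


chi = 0.05  # optical path difference in cm
for width in (5.0, 0.5):
    zpd = interferogram(0.0, width)
    print(width, interferogram(chi, width) / zpd, closed_form(chi, width) / zpd)
```

As the width shrinks toward zero the envelope approaches one for any fixed χ, recovering the
undamped cosine of Eq. (5.16c).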

FIGURE 5.9(a). [Plot of S(σ) versus σ, showing a delta-function spike at σ = σ_0.]

FIGURE 5.9(b). [for W = 1] [Plot of P_bal(χ) versus χ; marked levels: 0.0, S_0/2, and S_0;
horizontal markers at χ = 0 and 1/σ_0.]

FIGURE 5.9(c). [for W = -1] [Plot of P_bal(χ) versus χ; marked levels: 0.0, S_0/2, and S_0;
horizontal markers at χ = 0 and 1/σ_0.]

FIGURE 5.9(d). [Plot of I_bal^(ideal)(χ) versus χ; horizontal markers at χ = 0 and 1/σ_0.]

Having separated the balanced interference signal P_bal(χ) for the ideal interferometer into a
constant term P_inp/2 and an ideal interferogram I_bal^(ideal)(χ), we note that a similar procedure
can be followed with respect to the nonideal interference signal where 0 < η < 1 and M < 1.
Equation (5.10a) can be written as

$$ P_{\text{bal}}(\chi) = \frac{1}{2}\int_0^\infty S(\sigma)\,d\sigma
   + \frac{W}{2}\int_0^\infty S(\sigma)\,M(\sigma R\theta_{\text{ma}})\cos(2\pi\sigma\chi)\,d\sigma
   = \frac{P_0}{2} + W\,I_{\text{bal}}(\chi) \tag{5.17a} $$

or

$$ I_{\text{bal}}(\chi) = \frac{P_{\text{bal}}(\chi) - (1/2)\,P_0}{W} , \tag{5.17b} $$

where, applying Eq. (5.10b),

$$ P_0 = \int_0^\infty S(\sigma)\,d\sigma = A\,\Omega\int_0^\infty \eta(\sigma)\,L(\sigma)\,d\sigma \tag{5.17c} $$

and

$$ I_{\text{bal}}(\chi) = \frac{1}{2}\int_0^\infty S(\sigma)\,M(\sigma R\theta_{\text{ma}})\cos(2\pi\sigma\chi)\,d\sigma
   = \frac{1}{2}A\,\Omega\int_0^\infty \eta(\sigma)\,L(\sigma)\,M(\sigma R\theta_{\text{ma}})\cos(2\pi\sigma\chi)\,d\sigma . \tag{5.17d} $$

Although P_0 in Eq. (5.17c) looks superficially like P_inp in Eq. (5.11c), since it too can be written
as

$$ \int_0^\infty S(\sigma)\,d\sigma , $$

the constant power level P_0 is not the same as P_inp because now η(σ) < 1 in Eq. (5.10b), making
P_0 less than the radiant power P_inp of the pencil ray entering the interferometer. Similarly
I_bal(χ) in formula (5.17d) becomes, for χ = 0,

$$ I_{\text{bal}}(0) = \frac{1}{2}A\,\Omega\int_0^\infty \eta(\sigma)\,L(\sigma)\,M(\sigma R\theta_{\text{ma}})\,d\sigma
   = \frac{1}{2}\int_0^\infty S(\sigma)\,M(\sigma R\theta_{\text{ma}})\,d\sigma , $$

which means (since M < 1 in this nonideal case) that we cannot expect to have P_bal(0) be
either P_0 or zero for W = 1 or W = -1 respectively. Nevertheless, in a well-designed
interferometer, both η and M are reasonably close to one for the wavenumbers of interest, and the
balanced signal of a nonideal interferometer usually behaves much the same as the balanced
signal of the ideal interferometer. In fact the symmetry properties with respect to χ of the ideal
balanced signal (that the balanced interference signal and balanced interferogram are even
functions of the optical path difference) apply as well to the nonideal case where 0 < η < 1 and
M < 1 because neither η nor M depends on χ. Hence the same reasoning already used to derive
Eqs. (5.12) and (5.14) can also be applied to this nonideal case to get

$$ P_{\text{bal}}(-\chi) = P_{\text{bal}}(\chi) \quad \text{for } 0 < \eta < 1 \text{ and } M < 1 \tag{5.18a} $$

and

$$ I_{\text{bal}}(-\chi) = I_{\text{bal}}(\chi) \quad \text{for } 0 < \eta < 1 \text{ and } M < 1 . \tag{5.18b} $$
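
To make the nonideal case concrete, the sketch below repeats the numerical evaluation of
Eq. (5.17a) with a constant efficiency η and a constant M. The particular values η = 0.9 and
M = 0.95, the Gaussian spectrum, and the helper names are hypothetical and serve only to show
that P_bal(0) no longer reaches P_0 or zero while the even symmetry of Eq. (5.18a) survives.

```python
import math

ETA = 0.9       # hypothetical constant beam-splitter efficiency
M_CONST = 0.95  # hypothetical constant value of the M function


def S_ideal(sigma):
    """Hypothetical power spectrum that an ideal (eta = 1) instrument would see."""
    return math.exp(-((sigma - 1000.0) / 200.0) ** 2)


def S(sigma):
    """Power spectrum reduced by the constant efficiency, in the spirit of Eq. (5.10b)."""
    return ETA * S_ideal(sigma)


def integrate(func, lo=0.0, hi=2500.0, n=5000):
    """Composite midpoint rule over the positive wavenumbers of interest."""
    h = (hi - lo) / n
    return h * sum(func(lo + (i + 0.5) * h) for i in range(n))


def p_bal(chi, W):
    """Eq. (5.17a) with constant M; chi is the optical path difference in cm."""
    return 0.5 * integrate(
        lambda s: S(s) * (1.0 + W * M_CONST * math.cos(2.0 * math.pi * s * chi)))


p_inp = integrate(S_ideal)  # power entering the interferometer
p_0 = integrate(S)          # Eq. (5.17c); smaller than p_inp because ETA < 1

print("P_0 / P_inp        =", p_0 / p_inp)
print("P_bal(0)/P_0, W=+1 =", p_bal(0.0, +1) / p_0)
print("P_bal(0)/P_0, W=-1 =", p_bal(0.0, -1) / p_0)
print("even in chi        :", p_bal(0.003, +1), p_bal(-0.003, +1))
```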

5.5 The Unbalanced Signal of a Michelson Interferometer
At large values of the optical-path difference $\chi$, all balanced interferograms, ideal and nonideal (even those generated by lasers), approach zero. Returning to the ideal case where $\eta = M = 1$, we see that if the ideal interferogram is zero in Eq. (5.13c), then the optical power $P_{bal}$ in the balanced interference signal becomes $P_{inp}/2$, half the original $P_{inp}$ power entering the interferometer. Hence, for large $\chi$ values, the observer in Fig. 5.6 sees the distant source at half its true brightness when looking through the interferometer. This raises the question of where the optical power unseen by the observer goes.
When analyzing the background signal in Sec. 4.17 of Chapter 4, we saw that the balanced and unbalanced signals had to contain all the unabsorbed background power entering the interferometer [see discussion following Eq. (4.154) of Chapter 4]. The same must be true for the optical power of the distant point source in Fig. 5.6; and with this clue we realize, following the same reasoning used in Sec. 4.17 of Chapter 4 and in the discussion following Eq. (1.18a) in Chapter 1, that the missing optical power goes back out the interferometer's entrance aperture as an unbalanced and unseen optical signal. Figure 5.10 shows that the unbalanced signal comes from the interference of those rays that reflect twice off the beam splitter (at the beginning and end of their trip up and back the moving-mirror arm) with those rays that transmit twice through the beam splitter (at the beginning and end of their trip up and back the fixed-mirror arm). Using conservation of energy, we note that the unbalanced signal power $P_{unb}(\chi)$ and the balanced signal power $P_{bal}(\chi)$, for the ideal interferometer with $\eta = M = 1$, must add up to $P_{inp}$, the input radiant power entering the system:

\[
P_{unb}(\chi) + P_{bal}(\chi) = P_{inp} .
\tag{5.19a}
\]

Substitution of (5.10a) with $\eta = M = 1$ and (5.11c) into (5.19a) then gives

\[
P_{unb}(\chi) = \int_0^\infty S(\sigma)\,d\sigma - \frac{1}{2}\int_0^\infty S(\sigma)\,[1 + W\cos(2\pi\chi\sigma)]\,d\sigma
\]

or

\[
P_{unb}(\chi) = \frac{1}{2}\int_0^\infty S(\sigma)\,[1 - W\cos(2\pi\chi\sigma)]\,d\sigma .
\tag{5.19b}
\]

FIGURE 5.10. [Diagram: parallel rays coming from a distant point source enter an interferometer made up of an ideal beam splitter, a fixed mirror, and a moving mirror.] The dashed lines show the rays going back out the front aperture as an unbalanced interference signal. This unbalanced interference signal cannot be seen by the observer.

Comparing this result to Eq. (5.10a) with $\eta = M = 1$, we see that, at this level of idealization, going from the balanced to the unbalanced interference signal is the same as changing the sign of $W$. Consulting Figs. 5.8(a) and 5.8(b), we see that when 5.8(a) is the balanced interference signal, then 5.8(b) is the unbalanced interference signal; and when 5.8(b) is the balanced interference signal, then 5.8(a) is the unbalanced interference signal. Following the pattern of Eq. (5.13c), we can define an ideal unbalanced interferogram $I^{(ideal)}_{unb}(\chi)$ for the unbalanced optical signal by saying that

\[
P_{unb}(\chi) = \frac{1}{2}P_{inp} + W\,I^{(ideal)}_{unb}(\chi)
\tag{5.19c}
\]
so that

\[
I^{(ideal)}_{unb}(\chi) = \frac{P_{unb}(\chi) - (1/2)\,P_{inp}}{W}
= -\frac{1}{2}\int_0^\infty S(\sigma)\cos(2\pi\chi\sigma)\,d\sigma .
\tag{5.19d}
\]

The sign convention chosen for the balanced and unbalanced interferograms in Eqs. (5.13a) and (5.19d) specifies a positive ZPD peak for the balanced interferogram,

\[
I^{(ideal)}_{bal}(0) = \frac{1}{2}\int_0^\infty S(\sigma)\,d\sigma > 0 ,
\tag{5.20a}
\]

and a negative ZPD peak for the unbalanced interferogram,

\[
I^{(ideal)}_{unb}(0) = -\frac{1}{2}\int_0^\infty S(\sigma)\,d\sigma < 0 .
\tag{5.20b}
\]
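As a numerical illustration of the energy balance in Eqs. (5.19a) and (5.19b), the following Python sketch evaluates the ideal balanced and unbalanced signals for a made-up spectrum and checks that they add up to the input power at every path difference. The Gaussian line shape, its center and width, the integration limit `SIGMA_MAX`, and all function names are assumptions introduced only for this illustration; they are not taken from the text.

```python
import math

SIGMA_MAX = 20.0  # assumed upper wavenumber limit; the test spectrum is negligible beyond it


def spectrum(sigma, center=10.0, width=1.0):
    """Hypothetical Gaussian power spectrum S(sigma), for illustration only."""
    return math.exp(-0.5 * ((sigma - center) / width) ** 2)


def trapezoid(f, lo, hi, n=2000):
    """Composite trapezoid rule for the integral of f from lo to hi."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for k in range(1, n):
        total += f(lo + k * h)
    return total * h


def p_inp():
    """Input power: the integral of S(sigma) over positive wavenumbers."""
    return trapezoid(spectrum, 0.0, SIGMA_MAX)


def p_bal(chi, w=1):
    """Ideal balanced signal, Eq. (5.10a) with eta = M = 1."""
    return trapezoid(
        lambda s: 0.5 * spectrum(s) * (1.0 + w * math.cos(2.0 * math.pi * chi * s)),
        0.0, SIGMA_MAX)


def p_unb(chi, w=1):
    """Ideal unbalanced signal, Eq. (5.19b): the balanced formula with W replaced by -W."""
    return p_bal(chi, -w)


if __name__ == "__main__":
    total = p_inp()
    for chi in (0.0, 0.05, 0.3, 5.0):
        # The last column is the residual of Eq. (5.19a) and should be ~0.
        print(chi, p_bal(chi), p_unb(chi), p_bal(chi) + p_unb(chi) - total)
```

With $W = 1$ the balanced signal carries all of the power at ZPD and the unbalanced signal none, while at large $\chi$ each signal settles at half the input power, in line with the discussion above.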

The qualitative behavior of the nonideal unbalanced interference signal and nonideal unbalanced
interferogram is, in a well-designed interferometer, very similar to the behavior of the ideal
unbalanced interference signal and ideal unbalanced interferogram. Note that although the shapes
of the balanced and unbalanced interference signals depend on the sign of W, the shapes of the
balanced and unbalanced interferograms do not.
82

82
Section 4.17 of Chapter 4 derives the formulas for the nonideal unbalanced interference signal of the
interferometer's background radiance because they show the total radiant power reaching the interferometer detector.
The same procedures can be used to derive formulas for the nonideal unbalanced interference signal of the
interferometer's input radiance. For the interferometer designs analyzed here, this signal is of much less interest
because it goes back out the interferometer's entrance aperture and has no effect on the total radiant power reaching
the interferometer's detector. There do exist interferometers, like the one shown in Fig. 1.19c of Chapter 1, for which
both types of formula are relevant.
5.6 The Off-Axis Signal of a Michelson Interferometer
Suppose there is another distant source in addition to the one present in Fig. 5.6, with the pencil ray coming from the second source making an angle $\theta_s$ with respect to the pencil ray coming from the first. An observer looking directly at these distant sources sees two "stars" in the sky separated by an angular distance $\theta_s$. When an observer looks at these two distant sources through an interferometer, as shown in Fig. 5.11, they still look like two stars in the sky separated by an angular distance $\theta_s$, but now their brightness depends on the position of the interferometer's moving mirror. We unbundle the pencil rays from these two sources as they enter the interferometer to form the two sets of parallel rays shown in Fig. 5.11. The unbundled rays from the first source are parallel to the interferometer's optical axis and the unbundled rays from the second source are at an angle $\theta_s$ to the optical axis.
We have already discussed how the brightness of the on-axis source varies with the optical path difference $\chi$ for an ideal interferometer according to the formula [see Eq. (5.10a) with $\eta = M = 1$]

\[
P^{(0)}_{bal}(\chi) = \frac{1}{2}\int_0^\infty S^{(0)}(\sigma)\,[1 + W\cos(2\pi\chi\sigma)]\,d\sigma .
\tag{5.21a}
\]

Here, the superscript (0) has been added to show that the balanced interference signal $P_{bal}$ and the spectrum $S$ refer to the point source whose rays are parallel to (that is, at a zero angle to) the optical axis. To get the corresponding formula for the off-axis point source, we use Eq. (4.137i) from Chapter 4 to write

\[
P^{(\theta_s)}_{bal}(\chi) = \frac{A}{2}\int_{-\infty}^{\infty} d\sigma \int_{\substack{\text{field of view for}\\ \text{point source}}} d^2\Omega\; \mathcal{L}^{(\theta_s)}(\sigma)\,\eta(\sigma)\,[1 + W\,M(\sigma R\theta_{ma})\cos(2\pi\chi\sigma\cos\theta)] ,
\tag{5.21b}
\]

where again, using Eq. (5.10c), we say that

\[
M(\sigma R\theta_{ma}) = \frac{J_1(4\pi\sigma R\theta_{ma})}{2\pi\sigma R\theta_{ma}} .
\]

FIGURE 5.11. [Diagram: an interferometer with a moving mirror, a fixed mirror, and an ideal beam splitter.] The parallel rays coming from a distant, on-axis point source are shown with solid arrows. The parallel rays coming from a distant, off-axis point source, at an angle $\theta_s$ to the optical axis, are shown with dashed arrows.

The superscript $(\theta_s)$ is added to show that $P_{bal}$ and $\mathcal{L}$ refer only to the off-axis source, the one whose rays are at an angle $\theta_s$ to the optical axis. The effective cross-sectional area of the interferometer beam is still $A$, the area of the eye's entrance pupil; and $R$ in the formula for $M$ is still the radius of the eye's entrance pupil so that $R = \sqrt{A/\pi}$. The relevant field of view, however, is now $\Delta\Omega^{(\theta_s)}$, the extremely small solid angle subtended by the second distant source at the position of the interferometer. Recognizing that $\theta \cong \theta_s$ for all the rays coming from this distant, off-axis source, we perform the integral over $d^2\Omega$ in (5.21b) to get

\[
P^{(\theta_s)}_{bal}(\chi) = \frac{A\,\Delta\Omega^{(\theta_s)}}{2}\int_{-\infty}^{\infty} \mathcal{L}^{(\theta_s)}(\sigma)\,\eta(\sigma)\,[1 + W\,M(\sigma R\theta_{ma})\cos(2\pi\chi\sigma\cos\theta_s)]\,d\sigma .
\tag{5.21c}
\]

Equations (4.136f) and (4.139g) in Chapter 4 require $\mathcal{L}$ and $\eta$ to be even functions of $\sigma$; Eq. (5.10f) shows that $M$ is another even function of $\sigma$; and the cosine is also even. Therefore the product

\[
\mathcal{L}^{(\theta_s)}(\sigma)\,\eta(\sigma)\,[1 + W\,M(\sigma R\theta_{ma})\cos(2\pi\chi\sigma\cos\theta_s)]
\]

must be an even function of $\sigma$, which means that, according to Eq. (2.19) in Chapter 2,

\[
\int_{-\infty}^{\infty} \mathcal{L}^{(\theta_s)}(\sigma)\,\eta(\sigma)\,[1 + W\,M(\sigma R\theta_{ma})\cos(2\pi\chi\sigma\cos\theta_s)]\,d\sigma
= 2\int_0^\infty \mathcal{L}^{(\theta_s)}(\sigma)\,\eta(\sigma)\,[1 + W\,M(\sigma R\theta_{ma})\cos(2\pi\chi\sigma\cos\theta_s)]\,d\sigma .
\]

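From Eq. (4.136g) in Chapter 4, we know that the off-axis spectral radiance is

\[
L^{(\theta_s)}(\sigma) = 2\,\mathcal{L}^{(\theta_s)}(\sigma) ,
\]

where the superscript $(\theta_s)$ is added to show that we are only interested in the pencil ray entering the interferometer at an angle $\theta_s$ to the optical axis. This lets us write

\[
\int_{-\infty}^{\infty} \mathcal{L}^{(\theta_s)}(\sigma)\,\eta(\sigma)\,[1 + W\,M(\sigma R\theta_{ma})\cos(2\pi\chi\sigma\cos\theta_s)]\,d\sigma
= \int_0^\infty L^{(\theta_s)}(\sigma)\,\eta(\sigma)\,[1 + W\,M(\sigma R\theta_{ma})\cos(2\pi\chi\sigma\cos\theta_s)]\,d\sigma .
\]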
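Substitution of this last result into (5.21c) gives

\[
P^{(\theta_s)}_{bal}(\chi) = \frac{1}{2}\int_0^\infty S^{(\theta_s)}(\sigma)\,[1 + W\,M(\sigma R\theta_{ma})\cos(2\pi\chi\sigma\cos\theta_s)]\,d\sigma ,
\tag{5.21d}
\]
where

\[
S^{(\theta_s)}(\sigma) = A\,\Delta\Omega^{(\theta_s)}\,\eta(\sigma)\,L^{(\theta_s)}(\sigma) .
\tag{5.21e}
\]

For the ideal interferometer with $\eta = M = 1$, this becomes

\[
P^{(\theta_s)}_{bal}(\chi) = \frac{1}{2}\int_0^\infty S^{(\theta_s)}(\sigma)\,[1 + W\cos(2\pi\chi\sigma\cos\theta_s)]\,d\sigma
\tag{5.21f}
\]
where

\[
S^{(\theta_s)}(\sigma) = A\,\Delta\Omega^{(\theta_s)}\,L^{(\theta_s)}(\sigma) .
\tag{5.21g}
\]

Comparing Eq. (5.21f) for the ideal off-axis case to Eq. (5.21a) for the ideal on-axis case shows that the only effect of the off-axis passage through the interferometer is to multiply $\chi$ by $\cos\theta_s$ and to replace $S^{(0)}$ by $S^{(\theta_s)}$.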
Equations (5.21f) and (5.21g) for the off-axis source can be compared to Eq. (5.21a) for the on-axis source under the assumption that both sources are the same size, have the same spectral radiance $L(\sigma)$, and are at the same distance from the interferometer. Both sources then pass the same power spectrum $S(\sigma)$ through the interferometer so that

\[
P^{(0)}_{bal}(\chi) = \frac{1}{2}\int_0^\infty S(\sigma)\,[1 + W\cos(2\pi\chi\sigma)]\,d\sigma
\tag{5.22a}
\]
and

\[
P^{(\theta_s)}_{bal}(\chi) = \frac{1}{2}\int_0^\infty S(\sigma)\,[1 + W\cos(2\pi\chi\sigma\cos\theta_s)]\,d\sigma .
\tag{5.22b}
\]

Comparing these two formulas, we see that

\[
P^{(\theta_s)}_{bal}(\chi) = P^{(0)}_{bal}(\chi\cos\theta_s) .
\tag{5.23a}
\]

The displacement $a$ of the moving mirror from its ZPD position is given by [see Eq. (5.11a)]

\[
\chi = 2a .
\tag{5.23b}
\]

Consequently Eq. (5.23a) can also be written as

\[
P^{(\theta_s)}_{bal}(2a) = P^{(0)}_{bal}(2a\cos\theta_s) .
\tag{5.23c}
\]

This shows that the balanced interference signal of a distant, on-axis source has the same power when the moving mirror is displaced from ZPD by a distance $(a\cos\theta_s)$ that an identical distant, off-axis source has when the moving mirror is displaced from ZPD by a distance $a$. Another way of saying this is to note that the on-axis source looks as bright when the moving mirror is displaced from ZPD by a distance $a$ as the off-axis source does when the moving mirror is displaced from ZPD by a distance $(a/\cos\theta_s)$. Since $(a/\cos\theta_s) > a$, as the moving mirror is shifted steadily away from ZPD the brightness of the on-axis source predicts the brightness of the off-axis source: if the on-axis source brightens or dims, we know that soon the same thing will happen to the off-axis source.
We next consider a ring of distant sources surrounding the on-axis source, with all the sources passing the same power spectrum $S(\sigma)$ through the interferometer. As shown in Fig. 5.12(a), an observer looking at the ring sees these sources as a circle of stars, a circle with angular radius $\theta_s$ centered on the distant on-axis source. Each source in the ring sends its own group of parallel rays through the interferometer as shown in Fig. 5.12(b).
Every parallel group of rays passes through the interferometer at the same $\theta_s$ angle with respect to the optical axis, so everything previously said about the single off-axis source also applies to the ring of off-axis sources. As the moving mirror shifts away from ZPD, we know, using the same reasoning as before, that if the central source brightens or dims then soon the same thing will happen simultaneously to every source on the off-axis ring. We can imagine filling the entire sky with identical distant sources, as shown in Fig. 5.13.

FIGURE 5.12(a). [Diagram: a ring of distant sources of angular radius $\theta_s$ centered on the distant on-axis source.]

FIGURE 5.12(b). [Diagram: the interferometer with its moving mirror and fixed mirror.] Every parallel group of rays passes through the interferometer at the same angle $\theta_s$ to the optical axis.

FIGURE 5.13. [Diagram: the interferometer, with moving mirror, fixed mirror, and ideal beam splitter, viewing a sky filled with identical distant sources.]
Now when the sky is observed directly, not looking through the interferometer, it exhibits a uniform, featureless glow; but when it is observed indirectly through the interferometer with the eye focused at infinity (which may require a little practice) the sky becomes a concentric series of rings at different levels of brightness. These are sometimes called Haidinger rings. The rings have different levels of brightness because they are at different angular distances from the on-axis source. The only way to escape this effect is to put the moving mirror at its ZPD position, with $a = \chi = 0$. According to Eq. (5.22b), the rays at every angle $\theta_s$ with respect to the optical axis then all have the same $P_{bal}$ value; and the observer looking through the interferometer either sees the same uniform featureless glow seen when looking directly at the source-filled sky (if $W = 1$) or nothing at all (if $W = -1$). As the moving mirror shifts steadily away from ZPD, the region at the center of the scene changes its brightness first and then, obeying Eq. (5.23c), this change in brightness forms a ring that expands and travels out to the edge of the scene. This is, of course, just a consequence of the on-axis brightness predicting the off-axis brightness, with regions at larger $\theta_s$ copying the central brightness after a longer delay as the interference rings form and expand.
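The ring behavior described above can be sketched numerically for the simplest case of a single spectral line. The Python sketch below assumes a monochromatic source at wavenumber `sigma0`, so that Eq. (5.22b) reduces to a brightness proportional to $1 + W\cos(2\pi\chi\sigma_0\cos\theta_s)$ with $\chi = 2a$ from Eq. (5.23b). The monochromatic assumption, the numerical values (lengths in cm, wavenumbers in cm$^{-1}$), and the function names are all hypothetical choices made for illustration.

```python
import math


def brightness(theta, a, sigma0, w=1):
    """Relative brightness (0 to 1) of the plane wave at angle theta to the optical axis.

    Eq. (5.22b) for an assumed monochromatic line at sigma0, with chi = 2a.
    """
    chi = 2.0 * a
    return 0.5 * (1.0 + w * math.cos(2.0 * math.pi * chi * sigma0 * math.cos(theta)))


def ring_angle(m, a, sigma0):
    """Angle at which chi * sigma0 * cos(theta) equals the integer order m."""
    return math.acos(m / (2.0 * a * sigma0))


def bright_ring_angles(a, sigma0, theta_max):
    """Angles (radians, inner ring first) of the W = +1 bright rings with theta <= theta_max."""
    order_on_axis = 2.0 * a * sigma0  # chi * sigma0 at theta = 0
    if order_on_axis <= 0.0:
        return []  # at ZPD the brightness is uniform and there are no rings
    m_max = math.floor(order_on_axis)
    m_min = math.ceil(order_on_axis * math.cos(theta_max))
    return [ring_angle(m, a, sigma0) for m in range(m_max, m_min - 1, -1)]


if __name__ == "__main__":
    sigma0 = 1.0e4  # assumed line at 10,000 cm^-1 (1 micrometer wavelength)
    for a in (0.05003, 0.05004):  # mirror displacement from ZPD, in cm
        print(a, [round(t, 4) for t in bright_ring_angles(a, sigma0, 0.1)])
```

For a fixed order $m$ the ring angle grows as the displacement $a$ grows, which is the outward expansion of the rings described in the text.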
To record these rings in the laboratory, we need only replace the observer's eye with a camera focused at infinity. In Fig. 5.14 this camera is shown schematically as a lens and a light-sensitive surface in the lens's focal plane. As has already been discussed in Sec. 4.9 of Chapter 4, each group of parallel rays can be regarded as a single plane wave, and each plane wave reaching the lens focuses to its own separate and distinct point of light on the light-sensitive surface. In fact what the light-sensitive surface records is an image of the scene at infinity, with each distant source showing up as a separate point of brightness on the lens's focal plane. The position of each bright point on the focal plane corresponds to the angular separations seen by an observer; for example, the ring of distant sources depicted in Fig. 5.12(a) shows up as a ring of bright points equidistant from the central bright point representing the on-axis source. In practice the creation of bright distant sources all having the same spectrum is an awkward and tedious business; what is done instead is to create a nearby extended source with a uniformly bright surface having the same spectral radiance everywhere. From the discussion at the end of Sec. 4.2 as well as the discussion following Eq. (4.47b) in Chapter 4, we know that every radiation field can be thought of as a collection of plane waves propagating in different directions. When the extended source is placed close to the interferometer, its plane waves fill the interferometer's field of view; that is, every point on the light-sensitive surface of the lens's focal plane represents a different plane wave generated by the extended source (see Fig. 5.15). To get a sequence of brightness rings such as the ones shown in Fig. 5.16, we make sure the camera is focused at infinity and then just take a series of snapshots while steadily shifting the moving mirror away from ZPD.
The discussion so far has assumed that all the plane waves entering the interferometer, whether coming from distant sources or an extended nearby source, pass the same power spectrum $S(\sigma)$ through the interferometer. There is, of course, no reason why this has to be the case. Returning to Eq. (5.21f), we rewrite it using slightly different notation. Instead of talking about parallel rays passing through the interferometer at an angle $\theta_s$ to the optical axis, we give each group of parallel rays an index $i$ and refer to the $i$th group of parallel rays as the $i$th plane wave passing through the interferometer. The balanced signal power associated with this $i$th plane wave is then

\[
P^{(i)}_{bal}(\chi) = \frac{1}{2}\int_0^\infty S^{(i)}(\sigma)\,[1 + W\cos(2\pi\chi\sigma\cos\theta_i)]\,d\sigma ,
\tag{5.24a}
\]

where $\theta_i$ refers to the $i$th plane wave's $\theta_s$ angle with respect to the interferometer's optical axis and $S^{(i)}(\sigma)$ is the power spectrum of the $i$th plane wave as it passes through the interferometer. According to Eq. (5.21g), if the plane wave is generated by a distant point source then we should say that

\[
S^{(i)}(\sigma) = A\,\Delta\Omega^{(i)}\,L^{(i)}(\sigma) .
\tag{5.24b}
\]

FIGURE 5.14. [Diagram: an interferometer with a moving mirror, a fixed mirror, and an ideal beam splitter, followed by a lens and a light-sensitive surface in the focal plane of the lens.] The parallel rays coming from a distant, on-axis point source are shown with solid arrows. The parallel rays coming from a distant, off-axis point source are shown with dashed arrows.

FIGURE 5.15. [Diagram: the same interferometer, lens, and light-sensitive surface in the focal plane of the lens, now illuminated by plane waves from an extended source.]
Here $L^{(i)}(\sigma)$ is the spectral radiance of the pencil ray entering the interferometer from the distant point source and $A$ is the cross-sectional area of the beam gathered in by the lens, that is, the area of the lens itself. We can think of the $i$th plane wave as just one of a group of $i = 1, 2, \ldots, N$ plane waves all emanating from distant sources, which makes $\Delta\Omega^{(i)}$ the extremely small solid angle subtended by the $i$th distant source at the position of the interferometer.
After these plane waves pass through the interferometer, the lens in Figs. 5.14 and 5.15 forms an image (that is, $N$ points of brightness) from these $N$ distant sources. If, as shown in Fig. 5.17, we put an array of small detectors in the focal plane then, as the moving mirror shifts away from ZPD, each detector records the $P^{(i)}_{bal}$ signal given by Eq. (5.24a) that is generated by the $i$th plane wave coming from the $i$th distant source. The central region of the focal plane no longer automatically predicts the brightness of the off-center regions, and there need not exist any well-formed, outwardly moving rings because now the different plane waves have different $S^{(i)}$ spectra.
83
This setup is sometimes referred to as an imaging Fourier-transform spectrometer, and when it is put on board a spacecraft it can be used to investigate distant astronomical scenes, such as a planet's surface viewed from orbit, where we expect the power spectra to vary with position in the scene.

FIGURE 5.16. [Eight panels, numbered 1 through 8, each showing a pattern of brightness rings.] This sequence of eight brightness rings is modeled on the brightness rings occurring at the interferometer's focal plane. The radii of the rings increase going from one to eight due to the increasing displacement $a = \chi/2$ of the moving mirror from ZPD. Features like this are sometimes called Haidinger rings.

FIGURE 5.17. [Diagram: plane waves coming from a distant scene pass through an interferometer with a moving mirror, a fixed mirror, and an ideal beam splitter, and are focused by a lens onto a detector array.]

83
Of course if all these $S^{(i)}$ spectra have common features producing similar interference signals, there will still be a
tendency for ringlike features to form and expand out from the center as the moving mirror shifts away from ZPD.
5.7 The Standard Michelson Interferometer with Central Detector
In laboratory Michelson interferometers, we usually place a single circular detector in the central region of the focal plane as shown in Fig. 5.18. The points on the detector near the center of the focal plane represent plane waves propagating parallel to, or nearly parallel to, the optical axis, so $\cos\theta_i$ is always close to one and the central detector records the sum of all the $P^{(i)}_{bal}$ at approximately the same optical path difference $\chi$. As justification for saying that $\cos\theta_i \cong 1$ for all the plane waves hitting the detector, we note that when all the plane waves have the same $S^{(i)}(\sigma)$, producing a ring pattern of the sort shown in Fig. 5.16, there is usually (but not always) a large circular patch in the center having about the same brightness. For the time being, though, we retain the factor of $\cos\theta_i$ in order to derive equations showing how to analyze interferometers having large detectors that extend into the ring pattern of the focal plane.
Using Eqs. (5.24a) and (5.24b) and assuming that the plane waves are numbered so that $i = 1, 2, \ldots, N_{det}$ are all the plane waves focused onto the detector, we write the balanced signal power reaching the detector as

\[
P^{(det)}_{bal}(\chi) = \sum_{i=1}^{N_{det}} P^{(i)}_{bal}(\chi)
= \frac{A}{2}\sum_{i=1}^{N_{det}} \Delta\Omega^{(i)} \int_0^\infty L(\sigma)\,[1 + W\cos(2\pi\chi\sigma\cos\theta_i)]\,d\sigma ,
\tag{5.25a}
\]

where in the last step we have assumed that all the plane waves entering the interferometer have the same spectrum $S(\sigma)$ and thus the same spectral radiance $L(\sigma)$. We convert the sum over $i$ into an integral over solid angle by writing

\[
P^{(det)}_{bal}(\chi) = \frac{A}{2}\int_0^\infty d\sigma \int_{\substack{\text{field of view}\\ \text{of detector}}} d^2\Omega\; L(\sigma)\,[1 + W\cos(2\pi\chi\sigma\cos\theta)] .
\tag{5.25b}
\]
FIGURE 5.18. [Diagram: plane waves coming from outside the interferometer pass through an interferometer with a moving mirror, a fixed mirror, and an ideal beam splitter, and are focused by a lens onto a circular detector in the focal plane of the lens.]

On the right-hand side of this equation, $A$ is the area of the lens focusing the interferometer signal onto the detector, $d^2\Omega$ is an infinitesimal solid angle replacing $\Delta\Omega^{(i)}$, and angle $\theta$ replaces angle $\theta_i$ as the angle of propagation through the interferometer. We note that this angle $\theta$ is the same as the $\theta$ defined in Eq. (4.135f) of Chapter 4 and used in Eq. (4.137i) of Chapter 4. This is not very surprising, considering that the line of reasoning used to derive Eq. (5.25b) begins with Eq. (5.21b), which is a special case of Eq. (4.137i).
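To see what retaining $\cos\theta$ in Eq. (5.25b) does for a detector of finite size, the solid-angle integral can be evaluated for one wavenumber at a time. The Python sketch below assumes a circular detector centered on the optical axis, so that $d^2\Omega = 2\pi\sin\theta\,d\theta$ with $0 \le \theta \le \theta_{max}$; it compares a trapezoid-rule evaluation with the closed form obtained from the substitution $u = \cos\theta$. The centered circular geometry, the parameter `theta_max`, and the function names are assumptions made for this illustration rather than formulas quoted from the text.

```python
import math


def fov_factor_numeric(chi_sigma, theta_max, w=1, n=4000):
    """Trapezoid-rule value of the solid-angle integral in Eq. (5.25b) at one wavenumber.

    Integrates 2*pi*sin(theta) * [1 + W*cos(2*pi*chi*sigma*cos(theta))]
    over 0 <= theta <= theta_max; chi_sigma is the product chi * sigma.
    """
    def integrand(theta):
        phase = 2.0 * math.pi * chi_sigma * math.cos(theta)
        return 2.0 * math.pi * math.sin(theta) * (1.0 + w * math.cos(phase))

    h = theta_max / n
    total = 0.5 * (integrand(0.0) + integrand(theta_max))
    for k in range(1, n):
        total += integrand(k * h)
    return total * h


def fov_factor_closed_form(chi_sigma, theta_max, w=1):
    """The same integral done analytically with the substitution u = cos(theta)."""
    c = math.cos(theta_max)
    solid_angle = 2.0 * math.pi * (1.0 - c)  # detector field of view
    if chi_sigma == 0.0:
        return solid_angle * (1.0 + w)
    x = 2.0 * math.pi * chi_sigma
    return solid_angle + w * 2.0 * math.pi * (math.sin(x) - math.sin(x * c)) / x


if __name__ == "__main__":
    for chi_sigma in (0.0, 10.0, 1000.0):
        print(chi_sigma,
              fov_factor_numeric(chi_sigma, 0.02),
              fov_factor_closed_form(chi_sigma, 0.02))
```

For a very small `theta_max` the result approaches the solid angle times $[1 + W\cos(2\pi\chi\sigma)]$, which is the small-detector limit in which $\cos\theta$ is set to one; for larger detectors the oscillating term is progressively washed out as the detector extends into the ring pattern.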
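We can, in fact, easily show that Eq. (5.25b) is the same as Eq. (4.137i) in Chapter 4 with $\eta = M = 1$. Formula (5.10c) lets us write the integral on the right-hand side of (4.137i) as

\[
\frac{A}{2}\int_{-\infty}^{\infty} d\sigma \int_{\text{field of view}} d^2\Omega\; \mathcal{L}(\sigma)\,\eta(\sigma)\left[1 + W\,\frac{J_1(4\pi\sigma R\theta_{ma})}{2\pi\sigma R\theta_{ma}}\cos(2\pi\chi\sigma\cos\theta)\right]
= \frac{A}{2}\int_{-\infty}^{\infty} d\sigma \int_{\text{field of view}} d^2\Omega\; \mathcal{L}(\sigma)\,\eta(\sigma)\,[1 + W\,M(\sigma R\theta_{ma})\cos(2\pi\chi\sigma\cos\theta)] .
\]

Here $A$ is the cross-sectional area of the interferometer beam; $R = \sqrt{A/\pi}$ is the radius of the interferometer beam; the field of view limiting the integral over $d^2\Omega$ is the interferometer's field of view; and of course $\theta$ is the angle of propagation through the interferometer. For the lens and detector in Fig. 5.18, the area of the lens focusing the beam onto the detector defines the cross-sectional area of the interferometer beam, so variable $A$ has the same meaning as in Eq. (5.25b). The field of view specified by the size of the detector (that is, the detector's field of view) is the same as the field of view of the interferometer, so the integral over $d^2\Omega$ is also the same integral as in Eq. (5.25b). Following the procedure used in the discussion after Eq. (5.21c), we recognize that "field of view" in the integral over $d^2\Omega$ now refers to the detector's field of view and note that $\mathcal{L}$, $\eta$, $M$, and the cosine are even functions of $\sigma$. This gives us, after applying Eq. (2.19) in Chapter 2,

\[
\begin{aligned}
&\frac{A}{2}\int_{-\infty}^{\infty} d\sigma \int_{\text{field of view}} d^2\Omega\; \mathcal{L}(\sigma)\,\eta(\sigma)\left[1 + W\,\frac{J_1(4\pi\sigma R\theta_{ma})}{2\pi\sigma R\theta_{ma}}\cos(2\pi\chi\sigma\cos\theta)\right] \\
&\quad = \frac{A}{2}\int_{-\infty}^{\infty} d\sigma \int_{\substack{\text{field of view}\\ \text{of detector}}} d^2\Omega\; \mathcal{L}(\sigma)\,\eta(\sigma)\,[1 + W\,M(\sigma R\theta_{ma})\cos(2\pi\chi\sigma\cos\theta)] \\
&\quad = \frac{A}{2}\int_0^\infty d\sigma \int_{\substack{\text{field of view}\\ \text{of detector}}} d^2\Omega\; L(\sigma)\,\eta(\sigma)\,[1 + W\,M(\sigma R\theta_{ma})\cos(2\pi\chi\sigma\cos\theta)] ,
\end{aligned}
\tag{5.25c}
\]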

where $L(\sigma)$ is, according to Eq. (4.136g) of Chapter 4, the spectral radiance of the beam entering the interferometer. Equation (5.25c) is a new formula for the right-hand side of Eq. (4.137i) in Chapter 4. Thus it can be substituted back into (4.137i) to get

\[
P_{bal}(\chi) = \frac{A}{2}\int_0^\infty d\sigma \int_{\substack{\text{field of view}\\ \text{of detector}}} d^2\Omega\; L(\sigma)\,\eta(\sigma)\,[1 + W\,M(\sigma R\theta_{ma})\cos(2\pi\chi\sigma\cos\theta)] .
\]

From Chapter 4 we know that $P_{bal}$ in this formula is the optical power leaving the interferometer in the balanced signal, and since the ideal lens in Fig. 5.18 focuses all of the beam onto the detector, $P_{bal}$ is the same quantity as $P^{(det)}_{bal}$ in Eq. (5.25b). Hence this last result can be written as

\[
P^{(det)}_{bal}(\chi) = \frac{A}{2}\int_0^\infty d\sigma \int_{\substack{\text{field of view}\\ \text{of detector}}} d^2\Omega\; L(\sigma)\,\eta(\sigma)\,[1 + W\,M(\sigma R\theta_{ma})\cos(2\pi\chi\sigma\cos\theta)] .
\tag{5.25d}
\]

When $\eta = M = 1$ in Eq. (5.25d), it becomes the same as Eq. (5.25b). Consequently we have now, as promised, shown that (5.25b) is the same as Eq. (4.137i) of Chapter 4 applied to an ideal interferometer. Equation (5.25d) with $0 < \eta < 1$ and $M < 1$ is then clearly the extension of Eq. (5.25b) to the nonideal case of an interferometer with an imperfect beam splitter and an imperfectly aligned moving mirror. Interchanging the integrals in (5.25d) gives

\[
P^{(det)}_{bal}(\chi) = \frac{A}{2}\int_{\substack{\text{field of view}\\ \text{of detector}}} d^2\Omega \int_0^\infty d\sigma\; L(\sigma)\,\eta(\sigma)\,[1 + W\,M(\sigma R\theta_{ma})\cos(2\pi\chi\sigma\cos\theta)] .
\tag{5.25e}
\]

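Now at last we make the idealization that the detector is small enough to assume that all the plane waves focused on it provide an approximately uniform illumination across its surface, allowing us to set $\cos\theta \cong 1$ in Eq. (5.25e) to get, after dropping the (det) superscript,

\[
P_{bal}(\chi) = \frac{A\,\Delta\Omega}{2}\int_0^\infty \eta(\sigma)\,L(\sigma)\,[1 + W\,M(\sigma R\theta_{ma})\cos(2\pi\chi\sigma)]\,d\sigma
= \frac{1}{2}\int_0^\infty S(\sigma)\,[1 + W\,M(\sigma R\theta_{ma})\cos(2\pi\chi\sigma)]\,d\sigma ,
\tag{5.26a}
\]
where

\[
\Delta\Omega = \int_{\substack{\text{field of view}\\ \text{of detector}}} d^2\Omega
\tag{5.26b}
\]
and

\[
S(\sigma) = A\,\eta(\sigma)\,\Delta\Omega\,L(\sigma) .
\tag{5.26c}
\]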

The (det) superscript has been dropped to emphasize the close resemblance of Eq. (5.26a) to Eq. (5.10a) for the balanced interference signal of the distant, on-axis source. Indeed the only real difference is that $\Delta\Omega$ in Eqs. (5.10a), (5.10b), and (5.10c) refers to the solid angle subtended by the distant source and in (5.26a), (5.26b), and (5.26c) refers to the detector's field of view. Because the mathematical formalism is the same, it makes sense to call $P_{bal}$ in (5.26a) the optical power of the balanced interference signal hitting the detector and, following the pattern of Eqs. (5.17a) through (5.17d), once again define

\[
I_{bal}(\chi) = \frac{1}{2}\int_0^\infty S(\sigma)\,M(\sigma R\theta_{ma})\cos(2\pi\chi\sigma)\,d\sigma
\tag{5.27a}
\]

to be the balanced interferogram. The only difference between (5.17d) and (5.27a) is the meaning we attach to the solid angle $\Delta\Omega$ in the definition of $S$. Now Eq. (5.26a) can be written as

\[
P_{bal}(\chi) = \frac{1}{2}P_0 + W\,I_{bal}(\chi) ,
\tag{5.27b}
\]
where, just like in (5.17c),

\[
P_0 = \int_0^\infty S(\sigma)\,d\sigma .
\tag{5.27c}
\]

Since the cosine in Eq. (5.26a) is an even function of $\chi$, the interference signal $P_{bal}$ must be, as it is in Eq. (5.18a), an even function of $\chi$,

\[
P_{bal}(-\chi) = P_{bal}(\chi) ,
\tag{5.28a}
\]

which means that, according to Eq. (5.27b), the interferogram

\[
I_{bal}(\chi) = \frac{P_{bal}(\chi) - (1/2)\,P_0}{W}
\tag{5.28b}
\]
is once again an even function of $\chi$:

\[
I_{bal}(-\chi) = I_{bal}(\chi) .
\tag{5.28c}
\]
As in Eq. (4.141c) of Chapter 4, we can make $S(\sigma)$ into an even function by requiring

\[
S(-\sigma) = S(\sigma)
\tag{5.29a}
\]

to end up with, after extending Eq. (5.26c) to negative values of $\sigma$,

\[
S(\sigma) = A\,\eta(\sigma)\,\Delta\Omega\,L(|\sigma|) .
\tag{5.29b}
\]

Unlike Eqs. (4.140c) and (4.141c) of Chapter 4, the beam-splitter efficiency $\eta$ is now included in the definition of $S$. The argument of $\eta$ does not have to be put inside absolute value signs because, according to Eq. (4.139g) of Chapter 4, it is already an even function of $\sigma$. Function $M(\sigma R\theta_{ma})$ is also an even function of $\sigma$ [see Eq. (5.10f)], as is $\cos(2\pi\chi\sigma)$, so both

\[
[S(\sigma)\,M(\sigma R\theta_{ma})] \quad \text{and} \quad S(\sigma)\,M(\sigma R\theta_{ma})\cos(2\pi\chi\sigma)
\]

are even functions of $\sigma$. The sine of $(2\pi\chi\sigma)$ is an odd function of $\sigma$ because

\[
\sin(-2\pi\chi\sigma) = -\sin(2\pi\chi\sigma) ,
\]

so multiplying the even function $[S(\sigma)\,M(\sigma R\theta_{ma})]$ by $\sin(2\pi\chi\sigma)$ produces an odd function:

\[
[S(\sigma)\,M(\sigma R\theta_{ma})]\sin(2\pi\chi\sigma) .
\]
This means we can write, using $e^{i\phi} = \cos(\phi) + i\sin(\phi)$,

\[
\begin{aligned}
\int_{-\infty}^{\infty} M(\sigma R\theta_{ma})\,S(\sigma)\,e^{2\pi i\chi\sigma}\,d\sigma
&= \int_{-\infty}^{\infty} M(\sigma R\theta_{ma})\,S(\sigma)\cos(2\pi\chi\sigma)\,d\sigma + i\int_{-\infty}^{\infty} M(\sigma R\theta_{ma})\,S(\sigma)\sin(2\pi\chi\sigma)\,d\sigma \\
&= 2\int_0^\infty M(\sigma R\theta_{ma})\,S(\sigma)\cos(2\pi\chi\sigma)\,d\sigma .
\end{aligned}
\]

Here we use that the integral of $[S(\sigma)\,M(\sigma R\theta_{ma})]\cos(2\pi\chi\sigma)$ between $-\infty$ and $+\infty$ is, according to Eq. (2.19) in Chapter 2, twice the value of its integral over just positive $\sigma$, because $[S(\sigma)\,M(\sigma R\theta_{ma})]\cos(2\pi\chi\sigma)$ is an even function of $\sigma$; and we also use that the integral of $[S(\sigma)\,M(\sigma R\theta_{ma})]\sin(2\pi\chi\sigma)$ over $\sigma$ is the integral of an odd function between $-\infty$ and $+\infty$, which, according to Eq. (2.17) in Chapter 2, must be zero. Comparison of this result to Eq. (5.27a) shows that the interferogram can be written as

\[
I_{bal}(\chi) = \frac{1}{4}\int_{-\infty}^{\infty} S(\sigma)\,M(\sigma R\theta_{ma})\,e^{2\pi i\chi\sigma}\,d\sigma ,
\tag{5.29c}
\]

where the plus sign is chosen for the complex exponent of $e$. Note that, having now chosen $I_{bal}(\chi)$ to be the inverse Fourier transform of $(1/4)\,[S(\sigma)\,M(\sigma R\theta_{ma})]$, we can reverse the Fourier transform in (5.29c) to get

\[
S(\sigma)\,M(\sigma R\theta_{ma}) = 4\int_{-\infty}^{\infty} I_{bal}(\chi)\,e^{-2\pi i\chi\sigma}\,d\chi .
\tag{5.29d}
\]

Our choice of sign for the complex exponent thus makes $[S(\sigma)\,M(\sigma R\theta_{ma})]$ the forward Fourier transform of $4\,I_{bal}(\chi)$. This sign choice is, of course, purely a matter of convention, but it is the one followed by most optical engineers today and it is the one used for the rest of this book.
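The transform pair (5.29c) and (5.29d) can be checked numerically on a made-up example. The Python sketch below assumes $M = 1$ and a hypothetical Gaussian line for $S(\sigma)$, computes the interferogram from Eq. (5.27a) by the trapezoid rule, and then recovers the spectrum from Eq. (5.29d), using the evenness of $I_{bal}$ to reduce the complex exponential to a cosine. The spectrum, the grid limits and sizes, and the function names are assumptions chosen only for this illustration.

```python
import math

SIGMA_MAX, N_SIGMA = 20.0, 2000  # assumed wavenumber grid for Eq. (5.27a)
CHI_MAX, N_CHI = 2.0, 400        # assumed path-difference grid for Eq. (5.29d)


def spectrum(sigma, center=10.0, width=1.0):
    """Hypothetical even spectrum S(|sigma|): a Gaussian line, with M taken to be 1."""
    return math.exp(-0.5 * ((abs(sigma) - center) / width) ** 2)


def trapezoid_samples(values, step):
    """Trapezoid rule for equally spaced samples."""
    return step * (sum(values) - 0.5 * (values[0] + values[-1]))


def interferogram(chi):
    """Eq. (5.27a) with M = 1: half the cosine transform of S over positive sigma."""
    step = SIGMA_MAX / N_SIGMA
    samples = [spectrum(k * step) * math.cos(2.0 * math.pi * chi * k * step)
               for k in range(N_SIGMA + 1)]
    return 0.5 * trapezoid_samples(samples, step)


# Sample the interferogram once on 0 <= chi <= CHI_MAX; it is even in chi.
CHI_STEP = CHI_MAX / N_CHI
I_BAL = [interferogram(j * CHI_STEP) for j in range(N_CHI + 1)]


def recovered_spectrum(sigma):
    """Eq. (5.29d): 4 times the forward transform of I_bal.

    Because I_bal is even, the sine part cancels and the integral over all chi
    equals twice the integral over chi >= 0, giving the factor of 8 below.
    """
    samples = [I_BAL[j] * math.cos(2.0 * math.pi * sigma * j * CHI_STEP)
               for j in range(N_CHI + 1)]
    return 8.0 * trapezoid_samples(samples, CHI_STEP)


if __name__ == "__main__":
    for sigma in (8.0, 10.0, 11.0):
        print(sigma, spectrum(sigma), recovered_spectrum(sigma))
```

The recovered values agree with the starting spectrum to within the accuracy of the quadrature, provided the interferogram has decayed to a negligible level by `CHI_MAX`.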
5.8 The Fore and Aft Optics
We now derive an expression for the optical power of the balanced interference signal when the polychromatic plane waves propagating at different angles $\theta$ are characterized by different spectral radiances.
We can rewrite Eq. (4.136c) of Chapter 4 as, after using Eqs. (4.136a) and (4.136i) to simplify the integral over $d\sigma$,

( )
( )
2 cos 2
field of view
P ( , ) ( ) 1 Re ( )
2
i
bal A
A W
d e d
A

= +

G
G
L .

Applying the same reasoning as in the discussion after Eq. (5.25b), we note that here the interferometer beam's cross-sectional area $A$ must be the same as the area $A$ of the lens, and that the "field of view" in the integral over $d^2\Omega$ must refer to the field of view of the detector. For the standard interferometer beam with a circular cross section, we have from Eq. (4.137h) of Chapter 4 and Eq. (5.10c) that

1 ma
ma
ma
circle of
radius
(4 ) 1
( ) M( )
2
A
R
J R
R
A R

= =
G
.

Substituting this into the expression for $P_{bal}(\chi)$ gives, since $M$ is real and

\[
e^{i\phi} = \cos(\phi) + i\sin(\phi) ,
\]
that

\[
P_{bal}(\chi) = \frac{A}{2}\int_{\text{field of view}} d^2\Omega \int_{-\infty}^{\infty} \mathcal{L}(\vec{\Omega},\sigma)\,\eta(\sigma)\,[1 + W\,M(\sigma R\theta_{ma})\cos(2\pi\chi\sigma\cos\theta_\Omega)]\,d\sigma .
\]

Equations (4.136b) and (4.139g) of Chapter 4 require $\mathcal{L}(\vec{\Omega},\sigma)$ and $\eta(\sigma)$ to be even functions of $\sigma$, and we already know that $M$ is an even function of $\sigma$ [see Eq. (5.10f)]. Consequently, because the cosine is also an even function, it follows that

\[
\mathcal{L}(\vec{\Omega},\sigma)\,\eta(\sigma)\,[1 + W\,M(\sigma R\theta_{ma})\cos(2\pi\chi\sigma\cos\theta_\Omega)]
\]

is itself an even function of $\sigma$. Equation (2.19) in Chapter 2 can now be used to modify the upper and lower bounds of the integral over $d\sigma$, so that the integration takes place between 0 and $\infty$. Having made these changes, Eq. (4.136d) of Chapter 4 can then be used to write the formula for $P_{bal}(\chi)$ as

\[
P_{bal}(\chi) = \int_{\text{field of view}} \left\{ \frac{A}{2}\,d^2\Omega \int_0^\infty L(\vec{\Omega},\sigma)\,\eta(\sigma)\,[1 + W\,M(\sigma R\theta_{ma})\cos(2\pi\chi\sigma\cos\theta_\Omega)]\,d\sigma \right\} .
\tag{5.30}
\]

Function ( , ) r o L
G
was defined in Eq. (4.136d) of Chapter 4 to be the spectral radiance as a
function of wavenumber and direction r
G
for the beam entering the interferometer; and from the
analysis in Sec. 5.2 above, and in particular the discussion following Eq. (5.6), we know that
( , ) r o L
G
can also be interpreted as the spectral radiance of the pencil ray traveling in a direction
specified by r
G
. When this pencil ray becomes part the interferometers beam, it can be
decomposed into a parallel group of rays traveling in the direction specified by r
G
, which is of
course the same thing as recognizing the existence of a plane wave traveling in the direction
specified by r
G
. This means that the integral over
2
d r can be interpreted as a sum over all the
plane waves passing through the interferometer. Consequently, in Eq. (5.30), the term inside the
braces { },

$$\left\{ \frac{A}{2}\, d^2\Omega \int_0^{\infty} L(\vec{\Omega},\sigma)\,\eta(\sigma)\left[1 + W\,M(\sigma R_{ma})\cos(2\pi\sigma\chi\cos\theta_\Omega)\right] d\sigma \right\} ,$$

should be interpreted as the small amount of power, an order-of-magnitude $d^2\Omega$ amount of power, that a polychromatic plane wave, when traveling at an angle $\theta_\Omega$ to the optical axis in the direction specified by $\vec{\Omega}$, contributes to the optical power of the balanced interference signal.
Having interpreted the integral over $d^2\Omega$ in Eq. (5.30) as a sum over the power contributed by each polychromatic plane wave passing through the interferometer, we next analyze the integral over $d\sigma$ as a sum over all the monochromatic wavenumber components present in any one polychromatic plane wave.[84] In Eq. (5.30) we regard the $\sigma$'th wavenumber component of the plane wave specified by $\vec{\Omega}$ as contributing an amount of power

$$\frac{A}{2}\, d^2\Omega\, d\sigma\, L(\vec{\Omega},\sigma)\,\eta(\sigma)\left[1 + W\,M(\sigma R_{ma})\cos(2\pi\sigma\chi\cos\theta_\Omega)\right]$$

to the optical power reaching the detector. Analyzing the system this way shows us how to include the effects of nonideal optical components in the formulas for $P_{bal}(\chi)$. If, for example, the lens in Fig. 5.18 transmits some optical wavelengths more efficiently than others, a behavior typical of real optical materials, we can introduce a transmission $\tau_{lens}$ that is always a real number between zero and one and make it a function of wavenumber $\sigma = \lambda^{-1}$. Now in Fig. 5.18 each $\sigma$'th wavenumber component of the plane wave specified by $\vec{\Omega}$ contributes an order-of-magnitude $d^2\Omega\, d\sigma$ amount of power

$$\frac{A}{2}\, d^2\Omega\, d\sigma\, \tau_{lens}(\sigma)\, L(\vec{\Omega},\sigma)\,\eta(\sigma)\left[1 + W\,M(\sigma R_{ma})\cos(2\pi\sigma\chi\cos\theta_\Omega)\right]$$

to the detector. Those wavenumbers for which $\tau_{lens} \cong 0$, showing that for them the lens is opaque, are blocked from contributing any power to $P_{bal}(\chi)$; and those wavenumbers for which $\tau_{lens} \cong 1$, meaning that they pass through the lens without losing any power, contribute to $P_{bal}(\chi)$ as if they were being focused by an ideal lens.
In general, an interferometer such as the one shown in Fig. 5.18 will have both fore optics
to gather in and prepare outside radiation for passage through the interferometer and aft optics
to focus the optical beam onto the detector after passage through the interferometer (see Fig.
5.19). In an astronomical Fourier-transform system, for example, the fore optics could be a
telescope designed to gather in large quantities of photons and send them through the
interferometer while the aft optics, like the lens in Fig. 5.18, is designed to focus the beam onto
the detector. We can lump the transmissions of the individual optical elements of both the fore optics and the aft optics into two combined transmission functions $\tau_f(\sigma)$ and $\tau_a(\sigma)$ respectively. This means the $\sigma$'th component of the $\vec{\Omega}$'th plane wave can only contribute a

84
In effect, we are reverting to the analysis at the beginning of Chapter 4, representing the optical field propagating
through the interferometer as a sum of monochromatic plane waves over different directions and wavenumbers.

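$$\frac{A}{2}\, d^2\Omega\, d\sigma\, \tau_f(\sigma)\,\tau_a(\sigma)\, L(\vec{\Omega},\sigma)\,\eta(\sigma)\left[1 + W\,M(\sigma R_{ma})\cos(2\pi\sigma\chi\cos\theta_\Omega)\right]$$

amount to the optical power reaching the detector. Consequently, we adjust the formula for $P_{bal}(\chi)$ in Eq. (5.30) to get

$$P_{bal}(\chi) = \iint_{\text{field of view}} \left\{ \frac{A}{2}\, d^2\Omega \int_0^{\infty} L(\vec{\Omega},\sigma)\,\eta(\sigma)\,\tau_f(\sigma)\,\tau_a(\sigma)\left[1 + W\,M(\sigma R_{ma})\cos(2\pi\sigma\chi\cos\theta_\Omega)\right] d\sigma \right\} \tag{5.31a}$$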

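for the total power from the balanced optical signal reaching the detector in Fig. 5.19. If all the plane waves of interest are characterized by the same spectral radiance, the dependence of L on $\vec{\Omega}$ can be suppressed to get

$$P_{bal}(\chi) = \int_0^{\infty} \left\{ \frac{A}{2}\, d\sigma\, L(\sigma)\,\eta(\sigma)\,\tau_f(\sigma)\,\tau_a(\sigma) \iint_{\text{field of view}} d^2\Omega \left[1 + W\,M(\sigma R_{ma})\cos(2\pi\sigma\chi\cos\theta_\Omega)\right] \right\} . \tag{5.31b}$$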

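If, in addition, the field of view is sufficiently small to make $\cos\theta_\Omega \cong 1$ a good approximation, then we can write

$$P_{bal}(\chi) = \frac{A\,\Delta\Omega}{2} \int_0^{\infty} L(\sigma)\,\eta(\sigma)\,\tau_f(\sigma)\,\tau_a(\sigma)\left[1 + W\,M(\sigma R_{ma})\cos(2\pi\sigma\chi)\right] d\sigma , \tag{5.31c}$$

where

$$\Delta\Omega = \iint_{\text{field of view}} d^2\Omega . \tag{5.31d}$$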

Equations (5.31a)–(5.31d) are a useful set of formulas for describing $P_{bal}(\chi)$. If an interferometer is built with no fore optics, then we can set $\tau_f(\sigma) = 1$; and to represent negligible loss in the aft optics, we set $\tau_a(\sigma) = 1$. As was discussed in the previous sections, we know that for an ideal beam splitter $\eta(\sigma) = 1$, and for a perfectly aligned interferometer M = 1.

FIGURE 5.19. [Schematic with labeled fore optics, ideal beam splitter, fixed mirror, moving mirror, aft optics, and circular detector.]

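We can put Eqs. (5.31c) and (5.31d) into the same form as Eqs. (5.26a)–(5.26c) by writing

$$P_{bal}(\chi) = \frac{1}{2} \int_0^{\infty} S(\sigma)\left[1 + W\,M(\sigma R_{ma})\cos(2\pi\sigma\chi)\right] d\sigma , \tag{5.32a}$$

where

$$S(\sigma) = A\,\Delta\Omega\,\eta(\sigma)\,L(\sigma)\,\tau_f(\sigma)\,\tau_a(\sigma) . \tag{5.32b}$$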

All that is different from Eqs. (5.26a)–(5.26c) is the definition of $S(\sigma)$, which now includes factors of $\tau_f(\sigma)$ and $\tau_a(\sigma)$, so we can set up the same pattern of mathematical definitions as before by calling the balanced interferogram

$$I_{bal}(\chi) = \frac{1}{2} \int_0^{\infty} S(\sigma)\,M(\sigma R_{ma})\cos(2\pi\sigma\chi)\, d\sigma , \tag{5.32c}$$
with

$$P_{bal}(\chi) = \frac{1}{2} P_0 + W\,I_{bal}(\chi) \tag{5.32d}$$

and

$$P_0 = \int_0^{\infty} S(\sigma)\, d\sigma . \tag{5.32e}$$

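Again we can see that $I_{bal}$ and $P_{bal}$ are even functions of $\chi$ because the cosine is an even function of $\chi$:

$$I_{bal}(-\chi) = I_{bal}(\chi) \tag{5.33a}$$

and

$$P_{bal}(-\chi) = P_{bal}(\chi) . \tag{5.33b}$$

As before, we can make S an even function of $\sigma$,

$$S(-\sigma) = S(\sigma) , \tag{5.33c}$$

by writing

$$S(\sigma) = A\,\Delta\Omega\,\eta(\sigma)\,L(|\sigma|)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|) \tag{5.33d}$$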

for negative values of $\sigma$. Using the same argument as in the discussion following Eq. (5.29b), the interferogram can now be written as the inverse Fourier transform of $\tfrac{1}{4} S(\sigma)\,M(\sigma R_{ma})$,

$$I_{bal}(\chi) = \int_{-\infty}^{\infty} \frac{1}{4}\, S(\sigma)\,M(\sigma R_{ma})\, e^{2\pi i \sigma\chi}\, d\sigma , \tag{5.34a}$$

which makes $S(\sigma)\,M(\sigma R_{ma})$ the Fourier transform of $4 I_{bal}(\chi)$,

$$S(\sigma)\,M(\sigma R_{ma}) = 4 \int_{-\infty}^{\infty} I_{bal}(\chi)\, e^{-2\pi i \sigma\chi}\, d\chi . \tag{5.34b}$$

There is nothing new here; all that has changed from the previous Fourier-transform relations in Eqs. (5.29c) and (5.29d) is that we have extended the definition of $S(\sigma)$ from

$$S(\sigma) = A\,\Delta\Omega\,\eta(\sigma)\,L(|\sigma|)$$

in Eq. (5.29b) to

$$S(\sigma) = A\,\Delta\Omega\,\eta(\sigma)\,L(|\sigma|)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|)$$

in Eq. (5.33d). In fact, all of Eqs. (5.26a) through (5.29d) can now be regarded as a special case of Eqs. (5.32a) through (5.34b), what we get when making the idealization that $\tau_f = \tau_a = 1$.
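As a rough numerical illustration of Eqs. (5.32c) through (5.32e), the sketch below evaluates the balanced interferogram by a simple Riemann sum. The Gaussian spectrum, its center and width, the wavenumber grid, and the choice M = 1 are all assumptions made only for this example; they do not come from the text.

```python
import math

def spectrum(sigma, center=1000.0, width=50.0):
    """Hypothetical spectrum S(sigma) in arbitrary units, sigma in cm^-1.
    Using |sigma| makes it even, as in Eq. (5.33c)."""
    return math.exp(-((abs(sigma) - center) / width) ** 2)

D_SIGMA = 1.0                                        # grid spacing, cm^-1 (assumed)
SIGMAS = [k * D_SIGMA for k in range(500, 1501)]     # covers the assumed line

def balanced_interferogram(chi, M=lambda sigma: 1.0):
    """Eq. (5.32c): I_bal = 1/2 * integral_0^inf S M cos(2 pi sigma chi) d sigma.
    M is passed as a function of sigma; M = 1 means perfect alignment."""
    total = sum(spectrum(s) * M(s) * math.cos(2.0 * math.pi * s * chi)
                for s in SIGMAS)
    return 0.5 * D_SIGMA * total

# Eq. (5.32e): P_0 is the integral of S over positive wavenumbers.
P0 = D_SIGMA * sum(spectrum(s) for s in SIGMAS)

def balanced_power(chi, W=1):
    """Eq. (5.32d): P_bal = P_0 / 2 + W * I_bal."""
    return 0.5 * P0 + W * balanced_interferogram(chi)
```

With M = 1, the sum reproduces the expected behavior at zero path difference: `balanced_power(0.0, W=1)` equals `P0`, `balanced_power(0.0, W=-1)` vanishes, and the interferogram has the same value at `chi` and `-chi`, as Eq. (5.33a) requires.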
5.9 The Detector Signal
Up to this point, we have been talking about how to calculate the power in the optical signal reaching the detector, but of course what a detector produces is an electrical signal, usually measured in volts or amps, that is proportional to the optical power it absorbs. Unfortunately detectors have different sensitivities to different wavelengths of electromagnetic radiation, which means that the proportionality constant between the optical power absorbed by the detector and the electrical signal produced by the detector is a function of wavelength $\lambda$. This proportionality constant is called the detector responsivity R. Since the interferometer equations are based on integrals over wavenumber, we write the responsivity as a function of wavenumber $\sigma = \lambda^{-1}$ rather than wavelength: $\mathrm{R} = \mathrm{R}(\sigma)$. Depending on the type of detector being analyzed, the responsivity $\mathrm{R}(\sigma)$ has units of volts per unit optical power or amps per unit optical power; that is, its units are always detector-output signal per unit optical power reaching the detector.
In the previous section, the integrals in Eq. (5.31a) were interpreted as a sum over all the balanced power contributions of all the wavenumber components of all the plane waves reaching the detector. This means the $\sigma$'th component of the $\vec{\Omega}$'th plane wave contributes an amount of power

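$$\frac{A}{2}\, d^2\Omega\, d\sigma\, L(\vec{\Omega},\sigma)\,\eta(\sigma)\,\tau_f(\sigma)\,\tau_a(\sigma)\left[1 + W\,M(\sigma R_{ma})\cos(2\pi\sigma\chi\cos\theta_\Omega)\right]$$

to the balanced component of the optical power reaching the detector in Eq. (5.31a). To find the corresponding contribution to the electrical signal leaving the detector, we multiply this by $\mathrm{R}(\sigma)$ to get

$$\frac{A}{2}\, d^2\Omega\, d\sigma\, \mathrm{R}(\sigma)\, L(\vec{\Omega},\sigma)\,\eta(\sigma)\,\tau_f(\sigma)\,\tau_a(\sigma)\left[1 + W\,M(\sigma R_{ma})\cos(2\pi\sigma\chi\cos\theta_\Omega)\right] .$$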

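Consequently, the balanced component of the electrical signal leaving the detector at an optical path difference $\chi$ is

$$K_{bal}(\chi) = \iint_{\text{field of view}} \left\{ \frac{A}{2}\, d^2\Omega \int_0^{\infty} \mathrm{R}(\sigma)\, L(\vec{\Omega},\sigma)\,\eta(\sigma)\,\tau_f(\sigma)\,\tau_a(\sigma)\left[1 + W\,M(\sigma R_{ma})\cos(2\pi\sigma\chi\cos\theta_\Omega)\right] d\sigma \right\} . \tag{5.35a}$$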

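When all the plane waves of interest have the same spectral radiance L, this becomes

$$K_{bal}(\chi) = \int_0^{\infty} \left\{ \frac{A}{2}\, d\sigma\, \mathrm{R}(\sigma)\, L(\sigma)\,\eta(\sigma)\,\tau_f(\sigma)\,\tau_a(\sigma) \iint_{\text{field of view}} d^2\Omega \left[1 + W\,M(\sigma R_{ma})\cos(2\pi\sigma\chi\cos\theta_\Omega)\right] \right\} ; \tag{5.35b}$$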

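and if we can assume $\cos\theta_\Omega \cong 1$ because the interferometer's field of view is small, then

$$K_{bal}(\chi) = \frac{A\,\Delta\Omega}{2} \int_0^{\infty} \mathrm{R}(\sigma)\, L(\sigma)\,\eta(\sigma)\,\tau_f(\sigma)\,\tau_a(\sigma)\left[1 + W\,M(\sigma R_{ma})\cos(2\pi\sigma\chi)\right] d\sigma \tag{5.35c}$$

with

$$\Delta\Omega = \iint_{\text{field of view}} d^2\Omega . \tag{5.35d}$$

From the way this result is derived, we see that it is always easy to go from the formulas for the signal leaving the detector to the formulas for the optical signal hitting the detector: just set $\mathrm{R}(\sigma) = 1$.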
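We work now with the assumption that all the plane waves of interest have the same spectral radiance L. Just like in Eq. (5.32b), we define a function

$$S(\sigma) = A\,\Delta\Omega\,\mathrm{R}(\sigma)\,\eta(\sigma)\,L(\sigma)\,\tau_f(\sigma)\,\tau_a(\sigma) . \tag{5.36a}$$

This definition of $S(\sigma)$, unlike the one in (5.32b), contains the detector responsivity $\mathrm{R}(\sigma)$. Equation (5.35b) becomes, when $\cos\theta_\Omega \cong 1$ is not a good approximation,

$$K_{bal}(\chi) = \frac{1}{2} \int_0^{\infty} S(\sigma)\, d\sigma\, \frac{1}{\Delta\Omega} \iint_{\text{field of view}} \left[1 + W\,M(\sigma R_{ma})\cos(2\pi\sigma\chi\cos\theta_\Omega)\right] d^2\Omega ; \tag{5.36b}$$

and Eq. (5.35c) becomes, when $\cos\theta_\Omega \cong 1$ is a good approximation,

$$K_{bal}(\chi) = \frac{1}{2} \int_0^{\infty} S(\sigma)\left[1 + W\,M(\sigma R_{ma})\cos(2\pi\sigma\chi)\right] d\sigma . \tag{5.36c}$$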

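Following the same pattern as in the discussions after Eqs. (5.26c) and (5.32b), we can write either of these two expressions as the sum of a constant term and a term depending on $\chi$,

$$K_{bal}(\chi) = \frac{1}{2} K_0 + W\,K_{Ibal}(\chi) . \tag{5.37a}$$

No matter what we do with $\cos\theta_\Omega$,

$$K_0 = \int_0^{\infty} S(\sigma)\, d\sigma . \tag{5.37b}$$

When $\cos\theta_\Omega$ cannot be approximated as one,

$$K_{Ibal}(\chi) = \frac{1}{2} \int_0^{\infty} S(\sigma)\,M(\sigma R_{ma})\, d\sigma\, \frac{1}{\Delta\Omega} \iint_{\text{field of view}} \cos(2\pi\sigma\chi\cos\theta_\Omega)\, d^2\Omega ; \tag{5.37c}$$

and when $\cos\theta_\Omega$ can be approximated as one,

$$K_{Ibal}(\chi) = \frac{1}{2} \int_0^{\infty} S(\sigma)\,M(\sigma R_{ma})\cos(2\pi\sigma\chi)\, d\sigma . \tag{5.37d}$$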
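The difference between Eqs. (5.37c) and (5.37d) lies entirely in the field-of-view average of the cosine. The sketch below evaluates that average for one assumed geometry, a circular cone of half-angle `theta_max` centered on the optical axis, which is not specified in the text. For such a cone $d^2\Omega = 2\pi\sin\theta\, d\theta$, and the average has the closed form $[\sin a - \sin(a\cos\theta_{max})]/[a(1-\cos\theta_{max})]$ with $a = 2\pi\sigma\chi$. The function names and sample values are hypothetical.

```python
import math

def fov_average_numeric(sigma, chi, theta_max, n=20000):
    """(1/dOmega) * double integral of cos(2 pi sigma chi cos(theta)) over a
    circular cone of half-angle theta_max (assumed geometry), midpoint rule."""
    d_theta = theta_max / n
    numerator = 0.0
    solid_angle = 0.0
    for k in range(n):
        theta = (k + 0.5) * d_theta
        weight = 2.0 * math.pi * math.sin(theta) * d_theta   # annulus of solid angle
        numerator += math.cos(2.0 * math.pi * sigma * chi * math.cos(theta)) * weight
        solid_angle += weight
    return numerator / solid_angle

def fov_average_closed(sigma, chi, theta_max):
    """Closed form of the same average for the assumed circular cone."""
    a = 2.0 * math.pi * sigma * chi
    mu_min = math.cos(theta_max)
    if a == 0.0:
        return 1.0
    return (math.sin(a) - math.sin(a * mu_min)) / (a * (1.0 - mu_min))

# Sample values (assumed): 1000 cm^-1, 0.5 cm path difference, 20 mrad half-angle.
numeric = fov_average_numeric(1000.0, 0.5, 0.02)
closed = fov_average_closed(1000.0, 0.5, 0.02)
on_axis = math.cos(2.0 * math.pi * 1000.0 * 0.5)   # the cos(theta) = 1 approximation
```

Comparing `closed` with `on_axis` for a given path difference and half-angle gives a quick sense of whether the small-field approximation of Eq. (5.37d) is adequate for that configuration.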

Whether or not $\cos\theta_\Omega \cong 1$ is a good approximation, Eqs. (5.37c) and (5.37d) show that $K_{Ibal}$ is an even function of the optical path difference $\chi$,

$$K_{Ibal}(-\chi) = K_{Ibal}(\chi) , \tag{5.38a}$$

because

$$\cos(-2\pi\sigma\chi\cos\theta_\Omega) = \cos(2\pi\sigma\chi\cos\theta_\Omega)$$

for all values of $\cos\theta_\Omega$. Since $K_{Ibal}$ is even, it follows from (5.37a) that $K_{bal}$ must also be an even function of $\chi$,

$$K_{bal}(-\chi) = K_{bal}(\chi) . \tag{5.38b}$$
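As before, the nonconstant component $K_{Ibal}$ of the total signal can be made proportional to a Fourier transform. Equation (5.10f) shows $M(\sigma R_{ma})$ to be an even function of $\sigma$, and we can always force S to be even by defining $S(\sigma) = S(|\sigma|)$ so that

$$S(-\sigma) = S(\sigma) . \tag{5.39a}$$

Now both

$$S(\sigma)\,M(\sigma R_{ma})\cos(2\pi\sigma\chi)$$

and

$$S(\sigma)\,M(\sigma R_{ma})\, \frac{1}{\Delta\Omega} \iint_{\text{field of view}} \cos(2\pi\sigma\chi\cos\theta_\Omega)\, d^2\Omega$$

are even functions of $\sigma$ because they are the products of even functions of $\sigma$. We can write Eq. (5.37b) as

$$K_0 = \frac{1}{2} \int_{-\infty}^{\infty} S(\sigma)\, d\sigma \tag{5.39b}$$

because S is even [see Eq. (2.19) in Chapter 2]. Equation (5.37c) becomes

$$K_{Ibal}(\chi) = \frac{1}{4} \int_{-\infty}^{\infty} S(\sigma)\,M(\sigma R_{ma})\, d\sigma\, \frac{1}{\Delta\Omega} \iint_{\text{field of view}} \cos(2\pi\sigma\chi\cos\theta_\Omega)\, d^2\Omega \tag{5.39c}$$

when $\cos\theta_\Omega$ cannot be approximated as one and Eq. (5.37d) becomes

$$K_{Ibal}(\chi) = \frac{1}{4} \int_{-\infty}^{\infty} S(\sigma)\,M(\sigma R_{ma})\cos(2\pi\sigma\chi)\, d\sigma \tag{5.39d}$$

when $\cos\theta_\Omega$ can be approximated as one. Using the same reasoning as in the discussion following Eq. (5.29b), we see that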

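$$\begin{aligned}
\frac{1}{4} \int_{-\infty}^{\infty} S(\sigma)\,M(\sigma R_{ma})\, e^{2\pi i \sigma\chi}\, d\sigma
&= \frac{1}{4} \int_{-\infty}^{\infty} S(\sigma)\,M(\sigma R_{ma})\cos(2\pi\sigma\chi)\, d\sigma + \frac{i}{4} \int_{-\infty}^{\infty} S(\sigma)\,M(\sigma R_{ma})\sin(2\pi\sigma\chi)\, d\sigma \\
&= \frac{1}{4} \int_{-\infty}^{\infty} S(\sigma)\,M(\sigma R_{ma})\cos(2\pi\sigma\chi)\, d\sigma
\end{aligned}$$

because the integral over $d\sigma$ of the odd function of $\sigma$,

$$S(\sigma)\,M(\sigma R_{ma})\sin(2\pi\sigma\chi) ,$$

must, according to Eq. (2.17) of Chapter 2, equal zero. Hence, when $\cos\theta_\Omega$ can be approximated as one, Eq. (5.39d) can be written as

$$K_{Ibal}(\chi) = \frac{1}{4} \int_{-\infty}^{\infty} S(\sigma)\,M(\sigma R_{ma})\, e^{2\pi i \sigma\chi}\, d\sigma . \tag{5.40a}$$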
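The step from the one-sided cosine integral of Eq. (5.37d) to the two-sided exponential integral of Eq. (5.40a) can be checked numerically. The sketch below does this with Riemann sums for a hypothetical even spectrum and M = 1; the Gaussian line shape and the grid are assumptions chosen only for the illustration.

```python
import cmath
import math

def S(sigma, center=1000.0, width=50.0):
    """Hypothetical even spectrum, S(-sigma) = S(sigma)."""
    return math.exp(-((abs(sigma) - center) / width) ** 2)

D_SIGMA = 1.0
POSITIVE = [k * D_SIGMA for k in range(1, 2001)]
TWO_SIDED = [-s for s in reversed(POSITIVE)] + [0.0] + POSITIVE

def k_ibal_cosine(chi):
    """Eq. (5.37d) with M = 1: half the one-sided cosine integral."""
    return 0.5 * D_SIGMA * sum(S(s) * math.cos(2.0 * math.pi * s * chi)
                               for s in POSITIVE)

def k_ibal_exponential(chi):
    """Eq. (5.40a) with M = 1: a quarter of the two-sided exponential integral."""
    return 0.25 * D_SIGMA * sum(S(s) * cmath.exp(2j * math.pi * s * chi)
                                for s in TWO_SIDED)

chi = 0.0037                       # optical-path difference in cm (assumed)
one_sided = k_ibal_cosine(chi)
two_sided = k_ibal_exponential(chi)
```

The imaginary part of `two_sided` cancels to rounding error because the sine terms at `sigma` and `-sigma` are equal and opposite, and its real part matches `one_sided`.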

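A similar manipulation is possible when $\cos\theta_\Omega$ cannot be approximated as one. Interchanging the order of the integrals in Eq. (5.39c) gives

$$K_{Ibal}(\chi) = \frac{1}{4\,\Delta\Omega} \iint_{\text{field of view}} d^2\Omega \int_{-\infty}^{\infty} d\sigma\, S(\sigma)\,M(\sigma R_{ma})\cos(2\pi\sigma\chi\cos\theta_\Omega) .$$

Now we note that

$$S(\sigma)\,M(\sigma R_{ma})\cos(2\pi\sigma\chi\cos\theta_\Omega)$$

is an even function of $\sigma$ and

$$S(\sigma)\,M(\sigma R_{ma})\sin(2\pi\sigma\chi\cos\theta_\Omega)$$

is an odd function of $\sigma$, so according to Eq. (2.17) in Chapter 2,

$$\begin{aligned}
\frac{1}{4} \int_{-\infty}^{\infty} S(\sigma)\,M(\sigma R_{ma})\, e^{2\pi i \sigma\chi\cos\theta_\Omega}\, d\sigma
&= \frac{1}{4} \int_{-\infty}^{\infty} S(\sigma)\,M(\sigma R_{ma})\cos(2\pi\sigma\chi\cos\theta_\Omega)\, d\sigma + \frac{i}{4} \int_{-\infty}^{\infty} S(\sigma)\,M(\sigma R_{ma})\sin(2\pi\sigma\chi\cos\theta_\Omega)\, d\sigma \\
&= \frac{1}{4} \int_{-\infty}^{\infty} S(\sigma)\,M(\sigma R_{ma})\cos(2\pi\sigma\chi\cos\theta_\Omega)\, d\sigma .
\end{aligned}$$

This means that

$$K_{Ibal}(\chi) = \frac{1}{4\,\Delta\Omega} \iint_{\text{field of view}} d^2\Omega \int_{-\infty}^{\infty} d\sigma\, S(\sigma)\,M(\sigma R_{ma})\cos(2\pi\sigma\chi\cos\theta_\Omega)$$

can be written as

$$\begin{aligned}
K_{Ibal}(\chi) &= \frac{1}{4\,\Delta\Omega} \iint_{\text{field of view}} d^2\Omega \int_{-\infty}^{\infty} d\sigma\, S(\sigma)\,M(\sigma R_{ma})\, e^{2\pi i \sigma\chi\cos\theta_\Omega} \\
&= \frac{1}{4} \int_{-\infty}^{\infty} S(\sigma)\,M(\sigma R_{ma}) \left[ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\Omega\, e^{2\pi i \sigma\chi\cos\theta_\Omega} \right] d\sigma .
\end{aligned} \tag{5.40b}$$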

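Therefore we have shown that, according to Eqs. (5.40a) and (5.40b), $K_{Ibal}$ can be written as

$$K_{Ibal}(\chi) = \frac{1}{4} \int_{-\infty}^{\infty} S(\sigma)\,M(\sigma R_{ma}) \left[ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\Omega\, e^{2\pi i \sigma\chi\cos\theta_\Omega} \right] d\sigma \tag{5.40c}$$

when $\cos\theta_\Omega$ cannot be approximated as one and as

$$K_{Ibal}(\chi) = \frac{1}{4} \int_{-\infty}^{\infty} S(\sigma)\,M(\sigma R_{ma})\, e^{2\pi i \sigma\chi}\, d\sigma \tag{5.40d}$$

when $\cos\theta_\Omega$ can be approximated as one. Glancing back at Eqs. (5.37a) and (5.39b), we note that the balanced component of the electrical signal leaving the detector due to the input spectral power is [see also Eqs. (5.40c) and (5.40d)]

$$K_{bal}(\chi) = \frac{1}{4} \int_{-\infty}^{\infty} S(\sigma)\, d\sigma + \frac{W}{4} \int_{-\infty}^{\infty} S(\sigma)\,M(\sigma R_{ma}) \left[ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\Omega\, e^{2\pi i \sigma\chi\cos\theta_\Omega} \right] d\sigma \tag{5.40e}$$

when $\cos\theta_\Omega$ cannot be approximated as one, and

$$K_{bal}(\chi) = \frac{1}{4} \int_{-\infty}^{\infty} S(\sigma)\, d\sigma + \frac{W}{4} \int_{-\infty}^{\infty} S(\sigma)\,M(\sigma R_{ma})\, e^{2\pi i \sigma\chi}\, d\sigma \tag{5.40f}$$

when it can. The formula for $S(\sigma)$ comes from Eqs. (5.39a) and (5.36a), which can be combined to give

$$\begin{aligned}
S(\sigma) &= A\,\Delta\Omega\,\mathrm{R}(|\sigma|)\,\eta(|\sigma|)\,L(|\sigma|)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|) \\
&= A\,\Delta\Omega\,\mathrm{R}(|\sigma|)\,\eta(\sigma)\,L(|\sigma|)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|) .
\end{aligned} \tag{5.40g}$$

The absolute value signs are dropped from the argument of $\eta$ because it is already an even function [see Eq. (4.139g) of Chapter 4].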
5.10 The Detector Circuit
A realistic Fourier-transform spectrometer sends the signal leaving the detector into an electronic circuit designed to record and stabilize signal $K_{bal}$. One nice thing about electronic signals is that they can be negative as well as positive; that is, the detector's electronic circuit can have both negative and positive potentials (in volts) and currents (in amps). The $K_0$ term that keeps $K_{bal} > 0$ in Eq. (5.37a) has no information about the power spectrum S and the detector circuit need not respond to its presence. Typically what is done is to give the moving mirror in Fig. 5.20 a constant velocity while at the same time building the detector circuit in such a way as to record only time-varying signals. For the interferometer in Fig. 5.20, the optical-path difference $\chi$ is two times the displacement a of the moving mirror from ZPD,

$$\chi = 2a .$$

Taking the time t to be zero when the moving mirror is at ZPD with $a = 0$, we have

$$a = vt$$

for v the velocity of the moving mirror. Substituting the second formula into the first gives

$$\chi = 2vt = ut , \tag{5.41a}$$

where

$$u = 2v \tag{5.41b}$$

is a quantity called the optical-path-difference velocity, or OPD velocity for short. Just as the optical-path difference $\chi$ has the same length units as the mirror displacement a, so does u have the same velocity units as v.

FIGURE 5.20. [Schematic with labeled fore optics, ideal beam splitter, fixed mirror, moving mirror, aft optics, circular detector, electrical signal from detector, and detector circuit to process $K_{bal}$.]
Substitution of (5.41a) into (5.37a) gives

$$K_{bal}(ut) = \frac{1}{2} K_0 + W\,K_{Ibal}(ut) , \tag{5.42a}$$

where, according to Eq. (5.40c), $K_{Ibal}(ut)$ can be written as

$$K_{Ibal}(ut) = \frac{1}{4} \int_{-\infty}^{\infty} S(\sigma)\,M(\sigma R_{ma}) \left[ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\Omega\, e^{2\pi i \sigma ut\cos\theta_\Omega} \right] d\sigma , \tag{5.42b}$$

when $\cos\theta_\Omega$ cannot be approximated as one and, according to Eq. (5.40d), $K_{Ibal}(ut)$ can be written as

$$K_{Ibal}(ut) = \frac{1}{4} \int_{-\infty}^{\infty} S(\sigma)\,M(\sigma R_{ma})\, e^{2\pi i \sigma ut}\, d\sigma \tag{5.42c}$$

when $\cos\theta_\Omega$ can be approximated as one. If the detector circuit is built to record only time-varying signals, a process sometimes called AC coupling of the detector,[85] then the $K_{bal}(ut)$ signal leaving the detector only contributes its time-varying part $K_{Ibal}(ut)$ to the rest of the system.
Suppose we define

$$g_{in}(t) = K_{bal}(ut) = \frac{1}{2} K_0 + W\,K_{Ibal}(ut) \tag{5.43}$$

to be the time-varying signal leaving the detector and entering the detector circuit. Assuming the circuit to be linear (and the interferometer cannot produce accurate spectral measurements if it is not), we know from the discussion in Appendix 5A of this chapter that the product of the circuit transfer function and Fourier transform of the input signal equals the Fourier transform of the

85
AC stands for alternating current.
output signal [see Eq. (5A.3a) in Appendix 5A]. Consequently, to get the output of a linear circuit, we just take the Fourier transform of the input, multiply by the transfer function, and then take the inverse Fourier transform of the product. Applying this recipe to $g_{in}(t)$, we see from Eq. (5.43) that $G_{in}(f)$, the Fourier transform of $g_{in}(t)$, is

$$G_{in}(f) = \int_{-\infty}^{\infty} \left[ \frac{1}{2} K_0 + W\,K_{Ibal}(ut) \right] e^{-2\pi i f t}\, dt = \frac{1}{2} K_0 \int_{-\infty}^{\infty} e^{-2\pi i f t}\, dt + W \int_{-\infty}^{\infty} K_{Ibal}(ut)\, e^{-2\pi i f t}\, dt .$$

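According to Eq. (2.71f) of Chapter 2, the constant term turns into a delta function. This means that when $\cos\theta_\Omega$ cannot be approximated by one, Eq. (5.42b) can be used to write

$$G_{in}(f) = \frac{K_0}{2}\,\delta(f) + \frac{W}{4} \int_{-\infty}^{\infty} dt\, e^{-2\pi i f t} \int_{-\infty}^{\infty} d\sigma\, S(\sigma)\,M(\sigma R_{ma})\, \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\Omega\, e^{2\pi i \sigma ut\cos\theta_\Omega} , \tag{5.44a}$$

and when $\cos\theta_\Omega$ can be approximated as one, Eq. (5.42c) can be used to write

$$G_{in}(f) = \frac{K_0}{2}\,\delta(f) + \frac{W}{4} \int_{-\infty}^{\infty} dt\, e^{-2\pi i f t} \int_{-\infty}^{\infty} d\sigma\, S(\sigma)\,M(\sigma R_{ma})\, e^{2\pi i \sigma ut} . \tag{5.44b}$$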

In either case, we can move the integral over dt to the inside to get, using Eq. (2.71f) from Chapter 2, that

$$\int_{-\infty}^{\infty} e^{2\pi i t (u\sigma\cos\theta_\Omega - f)}\, dt = \delta(u\sigma\cos\theta_\Omega - f) = \frac{1}{u\cos\theta_\Omega}\, \delta\!\left(\sigma - \frac{f}{u\cos\theta_\Omega}\right)$$

when $\cos\theta_\Omega$ cannot be approximated as one and

$$\int_{-\infty}^{\infty} e^{2\pi i t (u\sigma - f)}\, dt = \delta(u\sigma - f) = \frac{1}{u}\, \delta\!\left(\sigma - \frac{f}{u}\right)$$

when it can. In both these expressions, Eq. (2.68d) of Chapter 2 is used to factor the arguments of the delta functions. Here u is positive, and so is the cosine because its argument is always a relatively small angle. Substitution of these two results back into Eqs. (5.44a) and (5.44b) gives

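$$G_{in}(f) = \frac{K_0}{2}\,\delta(f) + \frac{W}{4\,\Delta\Omega} \iint_{\text{field of view}} \frac{d^2\Omega}{u\cos\theta_\Omega}\, S\!\left(\frac{f}{u\cos\theta_\Omega}\right) M\!\left(\frac{R_{ma} f}{u\cos\theta_\Omega}\right) \tag{5.45a}$$

when $\cos\theta_\Omega$ cannot be approximated as one and

$$G_{in}(f) = \frac{K_0}{2}\,\delta(f) + \frac{W}{4u}\, S\!\left(\frac{f}{u}\right) M\!\left(\frac{R_{ma} f}{u}\right) \tag{5.45b}$$

when it can. Still following the recipe for the detector circuit's output signal, we define $H(f)$ to be the detector circuit's transfer function and take the inverse Fourier transform of the product

$$H(f)\,G_{in}(f)$$

to get the formula for the signal leaving the detector circuit:

$$g_{out}(t) = \int_{-\infty}^{\infty} e^{2\pi i f t}\, H(f)\, G_{in}(f)\, df . \tag{5.46a}$$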

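When $\cos\theta_\Omega$ cannot be approximated as one, this becomes, according to (5.45a),

$$\begin{aligned}
g_{out}(t) &= \frac{K_0}{2} \int_{-\infty}^{\infty} e^{2\pi i f t}\, H(f)\,\delta(f)\, df + \frac{W}{4\,\Delta\Omega} \int_{-\infty}^{\infty} df\, e^{2\pi i f t}\, H(f) \iint_{\text{field of view}} \frac{d^2\Omega}{u\cos\theta_\Omega}\, S\!\left(\frac{f}{u\cos\theta_\Omega}\right) M\!\left(\frac{R_{ma} f}{u\cos\theta_\Omega}\right) \\
&= \frac{K_0}{2}\, H(0) + \frac{W}{4} \int_{-\infty}^{\infty} S(\sigma')\,M(\sigma' R_{ma}) \left[ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\Omega\, H(u\sigma'\cos\theta_\Omega)\, e^{2\pi i \sigma' ut\cos\theta_\Omega} \right] d\sigma' ,
\end{aligned} \tag{5.46b}$$

where in the last step the variable of integration is changed from f to

$$\sigma' = \frac{f}{u\cos\theta_\Omega} .$$

Glancing back at Eq. (5.45b), we see that the formula for the case where $\cos\theta_\Omega$ can be approximated by one must be [just take $\cos\theta_\Omega \cong 1$ in Eq. (5.46b)]

$$g_{out}(t) = \frac{K_0}{2}\, H(0) + \frac{W}{4} \int_{-\infty}^{\infty} H(u\sigma')\, S(\sigma')\,M(\sigma' R_{ma})\, e^{2\pi i \sigma' ut}\, d\sigma' . \tag{5.46c}$$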

In either case, we can AC couple the detector to the detector circuit by designing the circuit so that its transfer function has

$$H(0) = 0 . \tag{5.46d}$$

This eliminates the constant term from formulas (5.46b) and (5.46c). At this level of idealization, there is no particular reason to think of the signal leaving the detector circuit as a function of time rather than the optical-path difference, since they are linearly related to each other by formula (5.41a) above. Dropping the prime from $\sigma'$, we use (5.41a) to write the output of the detector circuit as

$$z(\chi) = g_{out}(\chi/u) \tag{5.47a}$$

with

$$z(\chi) = \frac{W}{4} \int_{-\infty}^{\infty} S(\sigma)\,M(\sigma R_{ma}) \left[ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\Omega\, H(u\sigma\cos\theta_\Omega)\, e^{2\pi i \sigma\chi\cos\theta_\Omega} \right] d\sigma \tag{5.47b}$$

when $\cos\theta_\Omega$ cannot be approximated as one and

$$z(\chi) = \frac{W}{4} \int_{-\infty}^{\infty} H(u\sigma)\, S(\sigma)\,M(\sigma R_{ma})\, e^{2\pi i \sigma\chi}\, d\sigma \tag{5.47c}$$

when it can. Because these last two formulas refer to the time-based signal leaving the detector circuit, it may seem unnatural to write them in terms of $\chi$ and $\sigma$, but we will find it useful to have them written in terms of the optical-path difference and wavenumber just like the previous equations discussed in this chapter. To neglect the effect of the detector circuit, for example, we need only take H = 1 inside the integrals of (5.47b) and (5.47c) to return at once to the integrals in (5.40c) and (5.40d) respectively, which, when multiplied by W, become $W K_{Ibal}$, the $\chi$-dependent part of the signal leaving the detector.
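The delta functions above tie each wavenumber to an electrical frequency, $f = u\sigma\cos\theta_\Omega$, which is why the transfer function appears as $H(u\sigma\cos\theta_\Omega)$ in Eq. (5.47b). The short sketch below works through that mapping for assumed numbers and shows one possible AC-coupling response with H(0) = 0. The single-pole high-pass form, the corner frequency, and the mirror velocity are hypothetical choices, not values taken from the text.

```python
def opd_velocity(mirror_velocity):
    """Eq. (5.41b): u = 2 v."""
    return 2.0 * mirror_velocity

def electrical_frequency(sigma, mirror_velocity, cos_theta=1.0):
    """Frequency f = u * sigma * cos(theta) at which wavenumber sigma is modulated."""
    return opd_velocity(mirror_velocity) * sigma * cos_theta

def high_pass(f, corner_frequency):
    """Hypothetical single-pole high-pass response; it satisfies H(0) = 0."""
    x = 1j * f / corner_frequency
    return x / (1.0 + x)

# Assumed example: mirror moving at 0.5 cm/s, spectral feature at 1000 cm^-1.
f_signal = electrical_frequency(1000.0, 0.5)         # Hz, since u = 1 cm/s
gain_at_signal = abs(high_pass(f_signal, 10.0))      # assumed 10 Hz corner
gain_at_dc = abs(high_pass(0.0, 10.0))
```

For these numbers the feature at 1000 cm^-1 is modulated at 1000 Hz, far above the assumed 10 Hz corner, so it passes almost unattenuated while the constant term is rejected.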
5.11 The Effective Spectrum
Equation (5.47c) is an example of a formalism we will see many times in the rest of this chapter. We can write (5.47c) as

$$z(\chi) = \int_{-\infty}^{\infty} Z_{eff}(\sigma)\, e^{2\pi i \sigma\chi}\, d\sigma ,$$

where

$$Z_{eff}(\sigma) = \frac{W}{4}\, H(u\sigma)\, S(\sigma)\,M(\sigma R_{ma}) . \tag{5.48a}$$

This shows that in (5.47c) the interferogram signal $z(\chi)$ can be written as the inverse Fourier transform of an effective spectrum $Z_{eff}(\sigma)$. It is easy to show that the interferogram signal can always be written as the inverse Fourier transform of an effective spectrum. As long as the interferogram signal $z(\chi)$ is a transformable function, we can take its Fourier transform,

$$\int_{-\infty}^{\infty} z(\chi)\, e^{-2\pi i \sigma\chi}\, d\chi ,$$

and call it the effective spectrum,

$$Z_{eff}(\sigma) = \int_{-\infty}^{\infty} z(\chi)\, e^{-2\pi i \sigma\chi}\, d\chi . \tag{5.48b}$$

The reciprocity of the Fourier transform then leads to

$$z(\chi) = \int_{-\infty}^{\infty} Z_{eff}(\sigma)\, e^{2\pi i \sigma\chi}\, d\sigma . \tag{5.48c}$$

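When, for example, $\cos\theta_\Omega$ cannot be approximated as one, as in Eq. (5.47b), we can write for the effective spectrum

$$\begin{aligned}
Z_{eff}(\sigma) &= \int_{-\infty}^{\infty} z(\chi)\, e^{-2\pi i \sigma\chi}\, d\chi \\
&= \frac{W}{4} \int_{-\infty}^{\infty} d\chi\, e^{-2\pi i \sigma\chi} \int_{-\infty}^{\infty} d\sigma'\, S(\sigma')\,M(\sigma' R_{ma})\, \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\Omega\, H(u\sigma'\cos\theta_\Omega)\, e^{2\pi i \sigma'\chi\cos\theta_\Omega} ,
\end{aligned} \tag{5.48d}$$

which, reversing the Fourier transform, leads again to the formula

$$z(\chi) = \int_{-\infty}^{\infty} Z_{eff}(\sigma)\, e^{2\pi i \sigma\chi}\, d\sigma .$$

Although there is nothing very profound about this procedure, it can be a useful way of analyzing the distortions undergone by the interferogram signal as it passes through the Fourier-transform spectrometer.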
5.12 Symmetries of the Interferogram Signal and Effective Spectrum
As long as the effective spectrum $Z_{eff}$ is a real and even function of $\sigma$, we know from the first entry of Table 2.1 in Chapter 2 that its inverse Fourier transform

$$z(\chi) = \int_{-\infty}^{\infty} Z_{eff}(\sigma)\, e^{2\pi i \sigma\chi}\, d\sigma$$

must also be a real and even function of the optical-path difference $\chi$. After the interferogram signal passes through the detector circuit, it is still, of course, real, but there is no reason to suppose that it is still even.
Suppose we look first at the simpler case where $\cos\theta_\Omega$ can be approximated as one. Then, according to Eq. (5.48a), we have

$$Z_{eff}(\sigma) = \frac{W}{4}\, H(u\sigma)\, S(\sigma)\,M(\sigma R_{ma}) .$$

From Eq. (5A.6b) in Appendix 5A, we know that the transfer function H is Hermitian,

$$H(-u\sigma) = H(u\sigma)^{*} , \tag{5.49a}$$

and the discussion following (5A.6b) points out that H must have a nonzero imaginary part. We know that W = +1 or −1 and that $S(\sigma)$ and $M(\sigma R_{ma})$ are both real. From Eqs. (5.39a) and (5.10f), we know that

$$S(-\sigma) = S(\sigma)$$

and

$$M(-\sigma R_{ma}) = M(\sigma R_{ma})$$

are even. Hence the transfer function H in Eq. (5.48a) must give a nonzero imaginary part to $Z_{eff}$, and consequently all that can be said about $Z_{eff}$ is that it is Hermitian:

$$\begin{aligned}
Z_{eff}(-\sigma) &= \frac{W}{4}\, H(-u\sigma)\, S(-\sigma)\,M(-\sigma R_{ma}) = \frac{W}{4}\, H(u\sigma)^{*}\, S(\sigma)\,M(\sigma R_{ma}) \\
&= \left[ \frac{W}{4}\, H(u\sigma)\, S(\sigma)\,M(\sigma R_{ma}) \right]^{*} = Z_{eff}(\sigma)^{*} .
\end{aligned} \tag{5.49b}$$
This makes

$$z(\chi) = \int_{-\infty}^{\infty} Z_{eff}(\sigma)\, e^{2\pi i \sigma\chi}\, d\sigma$$

the inverse Fourier transform of a Hermitian function. Therefore, according to entry 7 in Table 2.1 of Chapter 2, $z(\chi)$ must be real but need not be even. In fact, if $z(\chi)$ were both even and real, then entry 1 of Table 2.1 states that $Z_{eff}$ must be both real and even; that is, entry 1 requires $Z_{eff}$ to have a zero imaginary part when $z(\chi)$ is even. Since we already know that $Z_{eff}$ must have a nonzero imaginary part, we conclude that $z(\chi)$ cannot be an even function of $\chi$. So already in the simpler case where $\cos\theta_\Omega$ is approximated as one, the interferogram signal cannot be even after passing through the detector circuit.
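The conclusion that a Hermitian effective spectrum gives a real but uneven interferogram signal can be seen in a small numerical sketch of Eqs. (5.47c) and (5.48a). Everything specific here is an assumption made for illustration: the Gaussian spectrum, M = 1, W = +1, an OPD velocity of 1 cm/s, and a single-pole low-pass response standing in for the detector circuit.

```python
import cmath
import math

U = 1.0            # assumed OPD velocity u, cm/s
F_CORNER = 2000.0  # assumed corner frequency of the hypothetical circuit, Hz
W = 1
D_SIGMA = 1.0
SIGMAS = [k * D_SIGMA for k in range(-2000, 2001)]

def S(sigma):
    """Hypothetical even spectrum centered at 1000 cm^-1."""
    return math.exp(-((abs(sigma) - 1000.0) / 50.0) ** 2)

def h_lowpass(f):
    """Hypothetical Hermitian transfer function with a nonzero imaginary part."""
    return 1.0 / (1.0 + 1j * f / F_CORNER)

def h_ideal(f):
    """H = 1: the detector circuit is neglected."""
    return 1.0

def z_eff(sigma, H):
    """Eq. (5.48a) with M = 1."""
    return 0.25 * W * H(U * sigma) * S(sigma)

def z(chi, H=h_lowpass):
    """Eq. (5.47c) as a Riemann sum over the two-sided wavenumber grid."""
    return D_SIGMA * sum(z_eff(s, H) * cmath.exp(2j * math.pi * s * chi)
                         for s in SIGMAS)

chi = 3.0e-4                                   # cm, assumed sample point
with_circuit = (z(chi), z(-chi))
without_circuit = (z(chi, h_ideal), z(-chi, h_ideal))
```

In both cases the imaginary part of the sum is at rounding level, so the signal is real. With `h_ideal` the values at `chi` and `-chi` agree, while with `h_lowpass` they differ noticeably because of the phase that the circuit adds to the effective spectrum.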
The interferogram signal, in fact, always becomes uneven after passing through the detector circuit. To see why this is so, we return to Eq. (5.46a), which holds true both when $\cos\theta_\Omega$ can be approximated as one and when it cannot. According to the Fourier convolution theorem, the right-hand side, which is now the inverse Fourier transform of the product of two functions, can be replaced by a convolution to get [see Eq. (2.39c) in Chapter 2]

$$g_{out}(t) = h(t) * g_{in}(t) , \tag{5.50a}$$

where

$$h(t) = \int_{-\infty}^{\infty} e^{2\pi i f t}\, H(f)\, df \tag{5.50b}$$

is the impulse-response function of the detector circuit, as described at the beginning of Appendix 5A, and

$$g_{in}(t) = \int_{-\infty}^{\infty} e^{2\pi i f t}\, G_{in}(f)\, df . \tag{5.50c}$$

In Eq. (5.43) we defined $g_{in}$ to be the signal as it leaves the detector and enters the detector circuit, and in the discussion following (5.43) $G_{in}$ was defined to be the Fourier transform of $g_{in}$. Hence $g_{in}$ must be the inverse Fourier transform of $G_{in}$ as shown in Eq. (5.50c). We know from Eqs. (5.43) and (5.38b) that $g_{in}$ is an even function of time when t = 0 is chosen to coincide with $\chi = 0$ as in Eq. (5.41a). Relationship (5A.5) in Appendix 5A states that the impulse-response function h(t) must be zero for $t < 0$ and, of course, it cannot be a delta function at t = 0 for any physically realistic detector circuit. Consequently, h(t) must have nonzero values at $t > 0$ that are not matched by nonzero values at $t < 0$. This means the convolution in formula (5.50a) makes $g_{out}$ a blurred version of $g_{in}$ that has also been shifted to the right, in the direction of positive t. All this can be regarded as just a complicated way of saying that the detector signal cannot pass through the detector circuit with infinite swiftness; there is always some sort of delay. Therefore, $g_{out}$ can never be an even function of t, which means, according to Eq. (5.47a),

$$z(\chi) = g_{out}(\chi/u)$$

can never be an even function of $\chi$. No assumptions have been made about the value of $\cos\theta_\Omega$, so this result clearly holds true whether or not we approximate $\cos\theta_\Omega$ by one in the double integral over the interferometer's field of view.
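The blur-and-delay picture of Eq. (5.50a) can be reproduced with a discrete convolution. In the sketch below the even input pulse, the exponential impulse response, its time constant, and the time grid are all hypothetical choices; the only property taken from the text is that h(t) vanishes for negative t.

```python
import math

DT = 0.01                                          # time step (arbitrary units)
TIMES = [k * DT for k in range(-200, 201)]         # t = 0 sits at index 200
G_IN = [math.exp(-t * t) for t in TIMES]           # even input, g_in(-t) = g_in(t)

TAU = 0.2                                          # assumed circuit time constant
H_CAUSAL = [math.exp(-k * DT / TAU) / TAU for k in range(0, 201)]   # h(t), t >= 0 only

def convolve_causal(g, h, dt):
    """Discrete form of g_out(t) = integral of h(t') g_in(t - t') dt' with h = 0 for t' < 0."""
    out = []
    for n in range(len(g)):
        acc = 0.0
        for m in range(min(n, len(h) - 1) + 1):
            acc += h[m] * g[n - m]
        out.append(acc * dt)
    return out

G_OUT = convolve_causal(G_IN, H_CAUSAL, DT)
peak_index = max(range(len(G_OUT)), key=G_OUT.__getitem__)
peak_time = TIMES[peak_index]
```

The input peaks at t = 0, but `peak_time` for the output is positive, and the output at t = +0.5 exceeds the output at t = −0.5, so the filtered signal is no longer even.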
One last point worth making is that, although we now know that $z(\chi)$ cannot be strictly even, detector circuits are often designed to preserve the major features of the signals passing through them, making the delays with which signals pass through the circuit small compared to the signal fluctuation rate. Consequently in (5.50a) we then have

$$g_{out}(t) \cong g_{in}(t)$$

so that

$$z(\chi) \cong g_{in}(\chi/u) .$$

Now, since $g_{in}$ is an even function, $z(\chi)$ is an approximately even function so that

$$z(-\chi) \cong z(\chi) .$$

In some systems, the output signal of the detector circuit may have to be examined quite closely to confirm that it is not a strictly even function of its argument.
5.13 Background Radiation Inside a Standard Michelson Interferometer
In Fig. 5.20, the optical signal passes through the interferometer's fore optics and aft optics on its way to the detector; and, as described in Sec. 5.8, we can represent the effects of this passage by the two transmission functions $\tau_f(\sigma)$ and $\tau_a(\sigma)$. When measuring infrared spectra with uncooled interferometers, the fore optics and aft optics not only affect the optical signal passing through them but can also act as unwanted sources of infrared background radiation. Unless they have been cooled far below room temperature, optical elements spontaneously glow in the infrared, so if the object being observed by the interferometer is at or near room temperature, the optical elements may be as strong a source of infrared radiance as the object itself.
Figure 5.21 shows that the fore optics background masquerades as an additional type of radiance entering the interferometer. To include the fore optics background in our formulas, we add a background term to the input spectrum $S(\sigma)$ defined in Eq. (5.36a),

$$S(\sigma) \rightarrow S(\sigma) + S^{(fore)}(\sigma) .$$

The $S^{(fore)}(\sigma)$ term is just like $S(\sigma)$ in (5.36a) except that, since the radiance $L^{(fore)}(\sigma)$ coming from the fore optics does not have to pass through the fore optics before reaching the interferometer, we set $\tau_f(\sigma) = 1$ to get

$$S^{(fore)}(\sigma) = A\,\Delta\Omega\,\mathrm{R}(\sigma)\,\eta(\sigma)\,L^{(fore)}(\sigma)\,\tau_a(\sigma) .$$

Remembering that the formula for $S(\sigma)$ in Eq. (5.36a) is made into an even function of $\sigma$ in (5.39a), we do the same thing to $S^{(fore)}(\sigma)$ by writing

$$S^{(fore)}(\sigma) = A\,\Delta\Omega\,\mathrm{R}(|\sigma|)\,\eta(\sigma)\,L^{(fore)}(|\sigma|)\,\tau_a(|\sigma|) . \tag{5.51a}$$

As before, there is no need to add absolute value signs to the wavenumber argument of $\eta(\sigma)$ because, according to Eq. (4.139g) in Chapter 4, it is already an even function of wavenumber. Here we implicitly assume that the detector's field of view for the fore optics is the same as its field of view for the external source, which is usually a good approximation for well-designed systems.
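Now when we consider Eq. (5.37a) for the signal leaving the detector,

$$K_{bal}(\chi) = \frac{1}{2} K_0 + W\,K_{Ibal}(\chi) , \tag{5.51b}$$

the constant $K_0$ in Eq. (5.37b) becomes

$$K_0 = \int_0^{\infty} \left[ S(\sigma) + S^{(fore)}(\sigma) \right] d\sigma = \int_0^{\infty} S(\sigma)\, d\sigma + \int_0^{\infty} S^{(fore)}(\sigma)\, d\sigma \tag{5.51c}$$

and $K_{Ibal}$ in Eqs. (5.40a) and (5.40b) becomes

$$\begin{aligned}
K_{Ibal}(\chi) &= \frac{1}{4} \int_{-\infty}^{\infty} \left[ S(\sigma) + S^{(fore)}(\sigma) \right] M(\sigma R_{ma})\, e^{2\pi i \sigma\chi}\, d\sigma \\
&= \frac{1}{4} \int_{-\infty}^{\infty} S(\sigma)\,M(\sigma R_{ma})\, e^{2\pi i \sigma\chi}\, d\sigma + \frac{1}{4} \int_{-\infty}^{\infty} S^{(fore)}(\sigma)\,M(\sigma R_{ma})\, e^{2\pi i \sigma\chi}\, d\sigma
\end{aligned} \tag{5.51d}$$

when $\cos\theta_\Omega$ is approximated as one in (5.40a) and

$$\begin{aligned}
K_{Ibal}(\chi) &= \frac{1}{4} \int_{-\infty}^{\infty} \left[ S(\sigma) + S^{(fore)}(\sigma) \right] M(\sigma R_{ma}) \left[ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\Omega\, e^{2\pi i \sigma\chi\cos\theta_\Omega} \right] d\sigma \\
&= \frac{1}{4} \int_{-\infty}^{\infty} S(\sigma)\,M(\sigma R_{ma}) \left[ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\Omega\, e^{2\pi i \sigma\chi\cos\theta_\Omega} \right] d\sigma \\
&\quad + \frac{1}{4} \int_{-\infty}^{\infty} S^{(fore)}(\sigma)\,M(\sigma R_{ma}) \left[ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\Omega\, e^{2\pi i \sigma\chi\cos\theta_\Omega} \right] d\sigma
\end{aligned} \tag{5.51e}$$

when $\cos\theta_\Omega$ is not approximated by one in (5.40b). Unfortunately the background radiance generated by the aft optics cannot be handled this simply.

FIGURE 5.21. The warm surfaces of the fore and aft optics emit infrared background radiation in both directions along the interferometer's optical axis. [Schematic with labeled fore optics, ideal beam splitter, fixed mirror, moving mirror, aft optics, and circular detector.]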
Figure 5.21 shows that the background radiance generated by the aft optics travels in two different directions: directly to the detector and backwards into the interferometer. The detector sees the aft optics radiation that shines directly on it as a constant level of infrared illumination, introducing a new constant term into the detector signal. This term can be written as

( ) ( ) ( )
det
0
R( ) ( )
dir dir dir
S d
L , (5.51f)
where we note that

( ) ( ) ( )
det
0
( )
dir dir dir
P d
L (5.51g)

is the background optical power contributed by warm surfaces emitting a spectral radiance
( )
( )
dir
L uniformly over a solid angle
( ) dir
as seen from the detector of area
det
. Just like the
constant term in the interference signal coming from the source, this additional constant signal is removed by the detector's AC coupling to the detector circuit and for that reason can be disregarded (it should, however, be taken into account when calculating the noise terms in the next chapter). The aft optics radiance going backward into the interferometer, on the other hand, interferes with itself as it passes backwards through the interferometer, generating an interference signal that depends on $\chi$, the optical-path difference. Some of this $\chi$-dependent optical signal ends up returning to the detector. As the moving mirror changes its position, this interference signal also changes, generating a time-dependent signal capable of passing through the AC coupling to the rest of the system. In Sec. 4.17 of Chapter 4, we call this the unbalanced background signal and derive a formula for $P^{(back)}_{unb}(\chi)$, the power in the unbalanced background signal at an optical-path difference $\chi$.
Working at the same level of idealization as in the analysis of the balanced interference signal
reaching the detector, we set 1 to neglect substrate absorption in formula (4.163a) for
( )
P ( )
back
unb
from Chapter 4 to get

2
(back) 2 (back)
field of
view
ma
P ( ) ( ) 2 ( ) ( )
2
( ) M ) cos(2 cos ) (
{
}
unb
A
d d r
W R

=

.
L
(5.52)

Here Eq. (5.10c) is used to substitute M for the original Bessel-function ratio, and A again refers to the area of the aperture in the aft optics that specifies the cross-sectional area of the beam passing through the interferometer. The double integral over $d^2\Omega$ can be taken over the detector's field of view of the exterior source, since in well-designed systems this is usually a good approximation for the detector's background field of view. The $\mathcal{L}^{(back)}(\sigma)$ function refers to all the radiance entering the back end of the interferometer, not only the background radiance coming directly from the aft optics but also radiance emitted from the detector itself that passes backwards through the aft optics before entering the back end of the interferometer. This is why the unbalanced background signal is sometimes called the "Narcissus" interference signal, because it can come in part from the detector looking at itself in the interferometer.
From Eq. (5.10f) in this chapter and Eqs. (4.139a), (4.139g), and (4.162b) of Chapter 4, we
know that M,
2
r , , and
(back)
L are all even functions of , as is, of course, cos(2 cos )
.
Hence, the double integral

( )
2
2 (back)
ma
field of
view
2 ( ) ( ) ( ) M ) cos(2 cos ) (
} {
d r W R

L

has the same value at $\sigma$ and $-\sigma$, making it another even function of $\sigma$. Equation (5.52)
can thus be written as

( )
2
back (back) 2
0 field of
view
ma
P ( ) ( ) 2 ( ) ( )
2
( ) M ) cos(2 cos ) , (
{
}
unb
A
d d r
W R
r
o r o o q o
q o o ro o

L
(5.53a)

where we have used Eq. (4.163d) of Chapter 4 to recognize

(back) (back)
( ) 2 ( ) for 0 o o o > L L

as the spectral radiance of the infrared background entering the back end of the interferometer.
When $\cos\theta$ can be approximated as one, this equation reduces to

(back)
2
(back)
ma
0
P ( )
( )[ 2 ( ) ( ) ( ) M ) cos(2 )] ,
2
(
unb
A
r W R d
o o q o q o o ro o
AO

L
(5.53b)
with

2
field of
view
d r AO
. (5.53c)

Keeping in mind the definition of M given in Eq. (5.10c) and our approximation that 1 y e , we
see that (5.53b) is the same as Eq. (4.163c) in Chapter 4.
Just as we did for the power in the balanced signal, we can interpret the integrals over $d\sigma$ in
Eqs. (5.53a) and (5.53b) to be sums over all the power contributions of all the monochromatic
wavenumber components of the background radiation. Hence, when $\cos\theta$ is approximated by
one in Eq. (5.53b), we say that

2
(back)
ma
A
( ) 2 ( ) ( ) ( ) M ) cos(2 )
2
( d r W R o o o q o q o o ro
AO

L

is the power carried by the $\sigma$th wavenumber component leaving the interferometer and traveling
toward the detector; and when $\cos\theta$ cannot be approximated by one in Eq. (5.53a), we make the
same claim for

2
2 (back)
ma
field of
view
( ) 2 ( ) ( ) ( ) M ) cos(2 cos )
2
(
A
d d r W R
r
o r o o q o q o o ro o

L .

Following the same reasoning used in Secs. 5.8 and 5.9 above to analyze the power in the
balanced optical signal, we multiply these expressions first by the aft optics transmission
$\tau_a(\sigma)$ to get the fraction of each power component passing from the interferometer to the
detector and then by the detector responsivity $\mathcal{R}(\sigma)$ to get the signal component
produced by the interferometer's detector. This makes

2
2 (back)
0 field of
view
ma
R K ( ) ( ) ( ) ( ) 2 ( ) ( )
2
( ) M ) cos(2 cos ) , (
{
}
unb a
A
d d r
W R
r
o r t o o o o q o
q o o ro o

L

(5.54a)

the total unbalanced interference signal leaving the detector when $\cos\theta$ cannot be approximated
by one, and

2
(back)
ma
0
R
K ( )
( ) ( ) ( ) 2 ( ) ( ) ( ) M ) cos(2 ) ,
2
(
unb
a
A
r W R d
t o o o o q o q o o ro o
AO

L

(5.54b)

the total unbalanced interference signal leaving the detector when $\cos\theta$ can be approximated as
one.
Following the pattern of Eq. (5.37a), we can write Eqs. (5.54a) and (5.54b) as

( )
0
1
K ( ) K K ( )
2
unb
unb Iunb
W + , (5.55a)
where
2 (back)
ma
0 field of
view
R
K ( )
( ) ( ) ( ) ( ) M ) (2 cos )
2
( cos
Iunb
a
A
d d R
r
o r t o o o q o o ro o

L
(5.55b)

when $\cos\theta$ cannot be approximated as one, and

(back)
ma
0
R
K ( )
( ) ( ) ( ) ( ) M ) cos(2 )
2
(
Iunb
a
A
R d
L

(5.55c)

when $\cos\theta$ can be approximated as one. No matter how $\cos\theta$ is approximated, we have

2
( ) (back)
0
0
R K ( ) ( ) ( ) ( ) [ 2 ( ) ( )]
unb
a
A r d
L . (5.55d)

We simplify the formulas in these expressions by defining

(back) (back)
R ( ) ( ) ( ) ( ) ( )
a
S A = L (5.56a)
to get

(back) 2
ma
0 field of
view
K ( )
1 1
( ) M ) cos(2 cos )
2
(
Iunb
S R d d

(5.56b)

when $\cos\theta$ cannot be approximated by one and

(back)
ma
0
1
K ( ) ( ) M ) cos(2 ) ,
2
(
Iunb
S R d
(5.56c)

when it can. We force $S^{(back)}$ to be an even function of its argument by writing

(back) (back)
R ( ) ( ) ( ) ( ) ( )
a
S A = L . (5.57a)

There is, of course, no need to put absolute value signs on the argument of because we already
know from Eq. (4.139g) of Chapter 4 that it is even. Since the cosine is an even function of its
argument and, according to Eq. (5.10f), so is M, we recognize that now both

(back)
ma
( ) M ) cos(2 ) ( S R
and

(back)
ma
( ) M ) cos(2 cos ) ( S R
r
o o ro o

are even functions of $\sigma$. Repeating the same argument that has already been used before to
convert cosine integrals over even functions into Fourier transforms, we note that

(back)
ma
0
(back)
ma
(back)
ma
2 cos
( ) M ) cos(2 cos )
1
( ) M ) cos(2 cos ) sin(2 cos )
2
1
( )M )
2
(
(
(
i
S R d
S R i d
S R d e
r
r r
r
r o o
o o ro o o
o o ro o ro o o
o o o

(5.57b)
because

(back)
ma
( ) M ) sin(2 cos ) ( S R
r
o o ro o

is an odd function of $\sigma$, making its integral over $\sigma$ between $-\infty$ and $+\infty$ equal to
zero for all values of $\cos\theta$ [see Eq. (2.17) in Chapter 2]. Hence, Eq. (5.57b) can be used to
write (5.56b) as

2 (back)
ma
field of 0
view
2 cos 2 (back)
ma
field of
view
1 1
K ( ) ( )M( ) cos(2 cos )
2
1 1
( )M( )
4
Iunb
i
d d S R
d d S R e
r
r
r o o
r o o o ro o
r o o o

AO

AO

or

2 cos (back) 2
ma
field of
view
1 1
K ( ) ( )M( )
4
i
Iunb
S R d e d
r
r o o
o o r o

AO

(5.58a)
when $\cos\theta$ cannot be approximated as one. When $\cos\theta$ can be approximated as one, (5.57b)
with $\cos\theta = 1$ can be used to write (5.56c) as

(back) 2
ma
1
K ( ) ( ) M )
4
(
i
Iunb
S R e d
r o
o o o
. (5.58b)

To get all of the interference signal reaching the detector from the source, the fore optics
background, and the aft optics background, we add together the expressions for the signal
components from the source, the fore optics background, and the aft optics background.
Equation (5.51b) specifies the combined signal and fore optics background, and Eqs. (5.51f) and
(5.55a) give the signal coming from the aft optics background. Adding all these formulas
together gives

$$K_{tot}(\chi) = S^{(dir)} + K_{bal}(\chi) + K_{unb}(\chi) . \tag{5.59a}$$

If $\cos\theta$ cannot be approximated as one, Eq. (5.59a) expands to, after applying Eqs. (5.51c)
(5.51f), (5.55d), (5.58a), and (5.58b),

( ) ( )
det
0 0
2
( ) (back)
0 0
2 cos 2
ma
field of
view
R
R
1
K ( ) ( ) ( ) ( )
2
1
( ) ( ) ( ) ( ) [ 2 ( ) ( )]
2 2
1
( )M( )
4
dir dir
tot
fore
a
i
d S d
A
S d r d
W
S R d e

= +
+ +

+

L
L

2 cos ( ) 2
ma
field of
view
2 cos (back) 2
ma
field of
view
1
( )M( )
4
1
( )M( )
4
i fore
i
d
W
S R d e d
W
S R d e d

(5.59b)

and if $\cos\theta$ can be approximated as one, Eq. (5.59a) can be written as

( ) ( )
det
0 0
2
( ) (back)
0 0
2
ma
( )
R
R
1
K ( ) ( ) ( ) ( )
2
1
( ) ( ) ( ) ( )[ 2 ( ) ( )]
2 2
( )M( )
4
( )M(
4
dir dir
tot
fore
a
i
fore
d S d
A
S d r d
W
S R e d
W
S

= +
+ +
+
+

L
L

2
ma
(back) 2
ma
)
( )M( ) .
4
i
i
R e d
W
S R e d

(5.59c)

When the moving mirror moves at a constant OPD velocity u so that

$$\chi = ut ,$$

then the constant terms (that is, the terms that do not depend on $\chi$) do not make it past the
detector circuit that AC couples the detector to the rest of the system. According to the discussion
following Eq. (5A.2a) in Appendix 5A, if we know what the output of the linear detector circuit
is for each individual component of a sum of input signals, then we know that the output of the
linear detector circuit for the sum of the input signals is the sum of the outputs of the individual
components. Using $\chi = ut$ to represent the nonconstant terms, we already know from the
procedure used to transform Eq. (5.42a) to (5.47b) that the term

2 cos 2
ma
field of
view
1
K ( ) ( )M( )
4
i
Ibal
W
W S R d e d

in Eq. (5.42a) entering the detector circuit comes out as

( ) ( )
2 cos 2
ma
field of
view
1
M H( cos )
4
i
W
S R d u e d

in Eq. (5.47b) when $\cos\theta$ cannot be approximated as one. Consequently, when the same term

2 cos 2
ma
field of
view
1
( )M( )
4
i
W
S R d e d

occurs in Eq. (5.59b), we know that it comes out of the detector circuit as

( ) ( )
2 cos 2
ma
field of
view
1
M H( cos )
4
i
W
S R d u e d

.

Passage through the detector circuit just introduces a factor of $H(u\sigma\cos\theta)$ into the integral
over the field of view when $\cos\theta$ cannot be approximated as one. Examining the other two
nonconstant terms in Eq. (5.59b), we note that the only difference between them and the term just
analyzed is the way the $S(\sigma)$ function is labeled: for one of the input terms we have

$$S(\sigma) \rightarrow S^{(fore)}(\sigma)$$

and for the other we have

$$S(\sigma) \rightarrow S^{(back)}(\sigma) .$$

Therefore, we can write down at once that

2 cos ( ) 2
ma
field of
view
1
( )M( )
4
i fore
W
S R d e d

becomes

2 cos ( ) 2
ma
field of
view
1
( )M( ) H( cos )
4
i fore
W
S R d u e d

,
and that

2 cos (back) 2
ma
field of
view
1
( )M( )
4
i
W
S R d e d

becomes

2 cos (back) 2
ma
field of
view
1
( )M( ) H( cos )
4
i
W
S R d u e d

.

We now know what the output of the detector circuit is for each nonconstant component of the
sum in Eq. (5.59b), and we have already noted that the $\chi$-independent, constant terms in (5.59b)
have zero output. Knowing what the output is for each individual component of the sum in
(5.59b), we can write down the total output of (5.59b) as the sum of the outputs of each
individual component to get, when $\cos\theta$ cannot be approximated as one, that the total signal
leaving the detector circuit is

( ) ( )
2 cos 2
ma
field of
view
2 cos ( ) 2
ma
field of
view
( )
1
M H( cos )
4
1
( )M( ) H( cos )
4
tot
i
i fore
z
W
S R d u e d
W
S R d u e

( )
( )
2 cos (back) 2
ma
field of
view
( ) (back)
2
ma
1
( )M( ) H( cos )
4
( ) ( )
4
1
M H(
i
fore
d
W
S R d u e d
W
S S S
R d

= +

2 cos
field of
view
cos )
i
u e d
.
(5.60a)

To get the total signal leaving the detector circuit when $\cos\theta$ can be approximated as one, we
need only replace $\cos\theta$ by one to get

( ) ( )
( ) (back) 2
ma
( ) ( ) ( ) H( ) M
4
fore i
tot
W
z S S S u R e d

= +

. (5.60b)

Once again, we use the formula

2
field of view
d =

to dispose of the integral over the field of view. Equations (5.60a) and (5.60b) show that to
include the effect of the background radiance in the standard formulas for the signal leaving the
detector circuit, we need only replace the original source spectrum $S(\sigma)$ in Eqs. (5.47b) and
(5.47c) with

$$S(\sigma) \rightarrow S(\sigma) + S^{(fore)}(\sigma) - S^{(back)}(\sigma) . \tag{5.60c}$$

Equations (5.40g), (5.51a), and (5.57a) are now substituted into (5.60c) to get

( )
R R
R
( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( )
( ) ( ) ( ) ( )
f a f a
fore
a
A A
A

+
L L
L

( )
R( ) ( ) ( ) ( ),
back
a
A L

which can be reduced to

$$L(\sigma) \rightarrow L(\sigma) + \frac{L^{(fore)}(\sigma)}{\tau_f(\sigma)} - \frac{L^{(back)}(\sigma)}{\tau_f(\sigma)} . \tag{5.60d}$$

When the background radiance $L^{(back)}$ is very large, the signal $z_{tot}$ leaving the detector
circuit can quite literally be the transform of a negative spectrum. The replacement rules given in
(5.60c) and (5.60d) are one reason we only need to keep track of the input radiance L when
analyzing the noise-free signal leaving the detector circuit: (5.60c) or (5.60d) can be used at any
point to reintroduce the background radiances into the Fourier transforms. The next section gives
another reason the background radiances can be disregarded: they are easy to eliminate from the
signal leaving the detector circuit before any attempt is made to measure the input radiance
spectrum.
5.14 Removing the Background Spectra
The source spectrum, which is what we build Fourier-transform spectrometers to measure, is derived
from the Fourier transform of $z(\chi)$, the signal leaving the detector circuit. When $\cos\theta$ can be
approximated as one and the background radiance can be neglected, as in Eq. (5.47c), we start
with the formula
( ) ( )
2
ma
z( ) H( ) M
4
i
W
u S R e d

and reverse the Fourier transform to get

( ) ( )
2
ma
z( ) H( ) M
4
i
W
e d u S R

-
. (5.61a)

Substituting for the source spectrum S from Eq. (5.40g) gives

( )
2
ma
R z( ) ( )H( ) ( ) ( ) ( ) ( )M
4
i
f a
A W
e d u R

L
-
, (5.61b)

which can be solved for the source radiance to get

( )
1
2
ma
R
( )
H( ) ( ) ( ) ( ) ( )M z( )
4
i
f a
A W
u R e d

=

L
-

.
(5.61c)

Before we started analyzing the interferometer's background radiance, this sort of equation had
been enough to explain how to find the source radiance, since in a well-aligned interferometer
$M \cong 1$ and all the other quantities,

R H and
a f
A W u , , , , , , , , ,

are known or can, in principle anyway, be measured. This lets us write

1
2
R ( ) H( ) ( ) ( ) ( ) ( ) z( )
4
i
f a
A W
u e d

=

L
-
(5.61d)

to get a formula for what we want to measure in terms of the Fourier transform of $z(\chi)$ and other
known quantities. Now, however, we know from the work done in the previous section that when
measuring infrared spectra there may be significant amounts of background radiance
contaminating the source spectrum. Equations (5.60a) and (5.60b) show that if the background
radiance cannot be neglected, then the signal leaving the detector circuit is not $z(\chi)$ but rather
$z_{tot}(\chi)$, which is not the correct signal to substitute into equations such as (5.61d).
To recover $z(\chi)$ from $z_{tot}(\chi)$ there must be two measurements made: one looking at the source
and one looking at nothing at all. No matter how $\cos\theta$ is approximated, when the
interferometer observes an extremely cold source, it produces a signal in Eqs. (5.60a) and (5.60b)
in which $S(\sigma)$, the infrared source spectrum, is very small compared to the background spectra
$S^{(fore)}(\sigma)$ and $S^{(back)}(\sigma)$. To match the notation used in Chapter 6, where the background
radiances play a more important role than they do here, we call this signal $z_C^{(cold)}(\chi)$.
According to Eqs. (5.60a) and (5.60b), $z_C^{(cold)}(\chi)$ can be written as

( )
( ) (back)
2 cos 2
ma
field of
view
( ) ( ) ( )
4
1
M H( cos )
(cold) fore
C
i
W
z S S
R d u e d

(5.62a)

when $\cos\theta$ cannot be approximated as one, and as

( )
(cold) ( ) (back) 2
C ma
( ) ( ) ( ) H( ) M
4
fore i
W
z S S u R e d

=

(5.62b)

when $\cos\theta$ can be approximated as one. Assuming the interferometer is stable, meaning that the
background radiances of the instrument do not change, we can then measure $z_{tot}(\chi)$ as given in
formulas (5.60a) and (5.60b) and subtract from it $z_C^{(cold)}(\chi)$ as defined in Eqs. (5.62a) and
(5.62b). This gives

( )
( )
(cold)
C
( ) (back)
2 cos 2
ma
field of
view
( ) (back)
( ) ( ) ( )
( ) ( )
4
1
M H( cos )
( ) (
4
tot
fore
i
fore
z z z
W
S S S
R d u e d
W
S S
r
r o o
r

o o o
o r o o o
o

+

AO

( )
( ) ( )
2 cos 2
ma
field of
view
2 cos 2
ma
field of
view
)
1
M H( cos )
1
M H( cos )
4
i
i
R d u e d
W
S R d u e d
r
r
r o o
r
r o o
r
o
o r o o o
o o r o o o

AO

AO

(5.62c)

when $\cos\theta$ cannot be approximated as one, and

( ) ( )
( )
( ) ( )
(cold)
C
( ) (back) 2
ma
( ) (back) 2
ma
2
ma
( ) ( ) ( )
( ) ( ) M H( )
4
( ) ( ) M H( )
4
M H( )
4
tot
fore i
fore i
i
z z z
W
S S S R u e d
W
S S R u e d
W
S R u e d
r o
r o
r o

o o o o o o
o o o o o
o o o o

(5.62d)

when $\cos\theta$ can be approximated as one. This is one of the ways infrared spectroscopists using
interferometers with uncooled optics can eliminate unwanted background spectra and retrieve the
desired $z(\chi)$ interferogram signal associated with the source spectrum. Designers of satellite
interferometers almost always schedule some form of "space look" where the instrument
observes nothing but empty space, containing only distant sources of radiation too dim for the
instrument to detect. This sort of space look allows it to acquire the information needed to find
the $z_C^{(cold)}(\chi)$ signal generated by its own internal warmth. A quick way of achieving the same
effect on the ground is to point the interferometer at a surface cooled by liquid nitrogen or, for
greater accuracy, liquid helium.
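The two-measurement subtraction can be checked numerically with a toy model. The sketch below is an illustration only, not the instrument equations: it assumes $\cos\theta \cong 1$, sets H, M, and the constant prefactor to one, and uses made-up Gaussian spectra (`S_src`, `S_fore`, `S_back`) on hypothetical wavenumber and OPD grids. Because the interferogram is linear in the spectrum, subtracting the cold-look signal from the total should recover the source interferogram.

```python
import numpy as np

def interferogram(spectrum, sigma, chi):
    """Toy AC-coupled interferogram: cosine transform of a one-sided spectrum.

    Illustrative stand-in for the sigma-integral in Eq. (5.60b), with
    H = M = 1 and the constant prefactor dropped (assumptions).
    """
    dsigma = sigma[1] - sigma[0]
    phase = 2.0 * np.pi * np.outer(chi, sigma)
    return (np.cos(phase) * spectrum).sum(axis=1) * dsigma

def gaussian(sigma, center, width, amplitude):
    return amplitude * np.exp(-0.5 * ((sigma - center) / width) ** 2)

sigma = np.linspace(0.0, 2000.0, 2001)   # wavenumbers in cm^-1 (hypothetical grid)
chi = np.linspace(-0.05, 0.05, 401)      # optical-path differences in cm (hypothetical)

S_src = gaussian(sigma, 1000.0, 30.0, 1.0)    # made-up source spectrum
S_fore = gaussian(sigma, 800.0, 200.0, 0.5)   # made-up fore-optics background
S_back = gaussian(sigma, 900.0, 250.0, 2.0)   # made-up aft-optics background

# Replacement rule (5.60c): S -> S + S_fore - S_back.
z_tot = interferogram(S_src + S_fore - S_back, sigma, chi)
# Cold look: the source term is negligible, so only the backgrounds remain.
z_cold = interferogram(S_fore - S_back, sigma, chi)

z_recovered = z_tot - z_cold
z_source_only = interferogram(S_src, sigma, chi)
print("largest difference:", np.max(np.abs(z_recovered - z_source_only)))
print("z_tot at zero OPD:", z_tot[len(chi) // 2])
```

With these made-up numbers the aft-optics background dominates, so `z_tot` at zero OPD is negative, which is the "transform of a negative spectrum" situation described in Sec. 5.13; the subtraction still returns the source interferogram to within rounding error.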
Now that we know how to extract $z(\chi)$ from the unwanted background, the presence of the
background signal can be disregarded when analyzing nonrandom spectral distortions introduced
by nonideal interferometer measurements. This is what we do for the rest of this chapter (except
for Sec. 5.19, where we discuss one common method of extracting a radiance measurement from
the raw signal spectrum). These formulas do, however, return in the next chapter because the
background signal can have a significant effect on the amount of random noise present in the
measurement.
5.15 Double-Sided Interferograms
Equation (5.61d) gives the spectral radiance (what Fourier-transform spectrometers measure) in
terms of the Fourier transform

$$\int_{-\infty}^{\infty} z(\chi)\, e^{-2\pi i \sigma \chi}\, d\chi$$

of the interferogram signal $z(\chi)$ leaving the detector circuit. It is, of course, impossible to
measure z for all optical-path differences $\chi$ between $-\infty$ and $+\infty$, so there is no hope of
calculating the direct, unadulterated Fourier transform of z. We must therefore settle for an
approximation of the Fourier transform, and there are two different ways to do this: one using
finite-length, double-sided measurements of the interferogram signal and one using finite-length,
single-sided measurements of the interferogram signal. Because it is conceptually simpler, we
start with the double-sided interferogram measurement, postponing discussion of the single-sided
interferogram until Sec. 5.18 below.
As was remarked at the end of Sec. 5.12, the interferogram signal leaving the detector circuit
is usually approximately (although not exactly) even, so that it tends to look as shown in Fig.
5.22 when plotted as a function of $\chi$. In a double-sided interferogram measurement, there is a
positive length D such that the signal $z(\chi)$ is measured for all

$$-D \le \chi \le D ,$$
or
$$|\chi| \le D .$$

When z is only measured for $|\chi| \le D$, there is no way to know what z is in the regions marked
with question marks "?" in Fig. 5.22, and in a double-sided interferogram measurement, the value
of $z(\chi)$ in these regions is assumed to be, if not negligible, at any rate unimportant. The Fourier
transform of z then becomes
FIGURE 5.22.

______________________________________________________________________________

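$$\int_{-\infty}^{\infty} z(\chi)\, e^{-2\pi i \sigma \chi}\, d\chi \cong \int_{-D}^{D} z(\chi)\, e^{-2\pi i \sigma \chi}\, d\chi = \int_{-\infty}^{\infty} \Pi(\chi, D)\, z(\chi)\, e^{-2\pi i \sigma \chi}\, d\chi , \tag{5.63a}$$
where

$$\Pi(\chi, D) = \begin{cases} 1 & \text{for } |\chi| \le D \\ 0 & \text{for } |\chi| > D \end{cases} \tag{5.63b}$$

has already been defined by Eq. (4C.1a) in Appendix 4C of Chapter 4.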
To see the effect of neglecting the signal values at $|\chi| > D$, we must understand the
information carried by the Fourier transform of $z(\chi)$. We know from the discussion in Sec. 5.11
that there is always an effective spectral function $\mathcal{Z}_{eff}(\sigma)$ that is the Fourier transform
of z [see Eq. (5.48b)],

$$\mathcal{Z}_{eff}(\sigma) = \int_{-\infty}^{\infty} z(\chi)\, e^{-2\pi i \sigma \chi}\, d\chi . \tag{5.64a}$$

Rewriting Eq. (5.61d) by substituting (5.64a) for the Fourier transform of z gives

1
R ( ) H( ) ( ) ( ) ( ) ( ) ( )
4
f a eff
A W
u o o o q o t o t o o
AO

L Z . (5.64b)

The terms inside the square brackets are usually designed to be slowly varying functions over the
range of wavenumbers for which $L(\sigma)$ is being measured. This means $\mathcal{Z}_{eff}(\sigma)$ contains
the fine details of spectrum $L(\sigma)$. Since $L(\sigma)$ is real and, according to the discussion
following Eq. (5A.6b) in Appendix 5A, the transfer function $H(u\sigma)$ is complex, the effective
spectrum $\mathcal{Z}_{eff}(\sigma)$ in (5.64b) must also be complex. Taking the complex magnitude of both
sides of formula (5.64b), we indicate that $\mathcal{Z}_{eff}(\sigma)$ carries the fine details of $L(\sigma)$ by
writing

$$L(|\sigma|) \sim |\mathcal{Z}_{eff}(\sigma)| . \tag{5.64c}$$

Although Eq. (5.64b) comes from formulas that apply only when the interferometer's field of
view is sufficiently narrow that $\cos\theta$ can be approximated as one, the idea expressed by
(5.64c), that $\mathcal{Z}_{eff}(\sigma)$ carries the fine details of the $L(\sigma)$ spectrum, holds true even when
$\cos\theta$ cannot be approximated as one.
We now consider what happens to these fine details when what we have is not
$\mathcal{Z}_{eff}(\sigma)$, the true Fourier transform of $z(\chi)$, but rather the double-sided approximation
specified in Eq. (5.63a). The integral in (5.63a) is the Fourier transform of the product
$\Pi(\chi, D)\, z(\chi)$, and by the Fourier convolution theorem [see Eq. (2.39k) of Chapter 2], this can
be written as the convolution of the Fourier transforms of $\Pi$ and z. We already know that
$\mathcal{Z}_{eff}(\sigma)$ is the Fourier transform of z, and the Fourier transform of $\Pi$ can be evaluated
directly as

$$\int_{-\infty}^{\infty} \Pi(\chi, D)\, e^{-2\pi i \sigma \chi}\, d\chi = \int_{-D}^{D} e^{-2\pi i \sigma \chi}\, d\chi = \left. \frac{1}{-2\pi i \sigma}\, e^{-2\pi i \sigma \chi} \right|_{-D}^{D} = 2D\, \mathrm{sinc}(2\pi D \sigma) , \tag{5.65a}$$

where in the last step we use

$$e^{i\phi} = \cos\phi + i \sin\phi$$

and the function

$$\mathrm{sinc}(x) = \frac{\sin x}{x} \tag{5.65b}$$

previously defined in Eq. (2.106d) of Chapter 2. Hence, by the Fourier convolution theorem

$$\int_{-\infty}^{\infty} \Pi(\chi, D)\, z(\chi)\, e^{-2\pi i \sigma \chi}\, d\chi = [2D\, \mathrm{sinc}(2\pi D \sigma)] * \mathcal{Z}_{eff}(\sigma) . \tag{5.65c}$$

This shows that what we settle for in a double-sided interferogram measurement is the
convolution of $\mathcal{Z}_{eff}(\sigma)$ with $2D\, \mathrm{sinc}(2\pi D \sigma)$ instead of the true Fourier transform
$\mathcal{Z}_{eff}(\sigma)$.
In the discussion following Eq. (2.39A ) of Chapter 2, we pointed out that when two functions
are convolved and one of them is much narrower than the other, the narrower function can be
thought of as blurring and distorting the shape of the other. Since what we are interested in is the
fine detail encoded in

$$\mathcal{Z}_{eff}(\sigma) = \int_{-\infty}^{\infty} z(\chi)\, e^{-2\pi i \sigma \chi}\, d\chi ,$$

we cannot hope to get even an approximate measurement of this fine detail unless
$2D\, \mathrm{sinc}(2\pi D \sigma)$ is narrower than $\mathcal{Z}_{eff}(\sigma)$, the Fourier transform of z. We substitute
the right-hand side of (5.65c), which is our approximation for $\mathcal{Z}_{eff}(\sigma)$, the Fourier transform
of z, into (5.64c) to get

$$L_{blur}(|\sigma|) \sim \left| [2D\, \mathrm{sinc}(2\pi D \sigma)] * \mathcal{Z}_{eff}(\sigma) \right| . \tag{5.66a}$$

The original spectral radiance $L(\sigma)$ encodes its own fine details at least as well as
$\mathcal{Z}_{eff}(\sigma)$, which lets us write (5.66a) as

$$L_{blur}(|\sigma|) \sim \left| [2D\, \mathrm{sinc}(2\pi D \sigma)] * L(|\sigma|) \right|$$
or
$$L_{blur}(|\sigma|) \sim [2D\, \mathrm{sinc}(2\pi D \sigma)] * L(|\sigma|) . \tag{5.66b}$$

In the last step, we restrict the magnitude signs to the arguments of L and $L_{blur}$ because
$2D\, \mathrm{sinc}(2\pi D \sigma)$ and $L(\sigma)$ are always real (making their convolution real) and because
negative values of the convolution indicate an unphysically distorted measurement of $L(\sigma)$,
since L cannot be negative. Since, according to Eq. (2.38b) in Chapter 2, it does not matter in what
order two functions are convolved, this can also be written as

$$L_{blur}(|\sigma|) \sim L(|\sigma|) * [2D\, \mathrm{sinc}(2\pi D \sigma)] .$$

Comparing this result to Eq. (2.40a) of Chapter 2, we realize that $2D\, \mathrm{sinc}(2\pi D \sigma)$ is playing
the role of an instrument response function. Figure 5.23 reveals the width of function
$2D\, \mathrm{sinc}(2\pi D \sigma)$ between the two zeros bracketing the central peak to be 1/D. This shows us
how to control the narrowness of the spectrometer's instrument-response function. When designing
Fourier-transform spectrometers we try to pick D sufficiently large that the blurring sinc
function in (5.66b) does not significantly distort the spectral features of the radiance $L(\sigma)$ that
we want to measure.
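The closed form in Eq. (5.65a) and the 1/D zero-to-zero width can be checked numerically. The following is a minimal sketch, assuming a hypothetical value of D and a simple trapezoidal rule for the integral over $|\chi| \le D$. Note that NumPy's `np.sinc(x)` is $\sin(\pi x)/(\pi x)$, whereas the text's sinc is $\sin(x)/x$, so $2D\,\mathrm{sinc}(2\pi D\sigma)$ corresponds to `2 * D * np.sinc(2 * D * sigma)`.

```python
import numpy as np

D = 0.5                                   # maximum OPD in cm (hypothetical value)
chi = np.linspace(-D, D, 4001)            # sample points across the measured range
dchi = chi[1] - chi[0]
sigma = np.linspace(-5.0 / D, 5.0 / D, 401)

# Trapezoidal estimate of the integral of exp(-2*pi*i*sigma*chi) over [-D, D].
kernel = np.exp(-2j * np.pi * np.outer(sigma, chi))
numeric = (kernel.sum(axis=1) - 0.5 * (kernel[:, 0] + kernel[:, -1])) * dchi

# Text's sinc(x) = sin(x)/x; np.sinc(x) = sin(pi x)/(pi x).
analytic = 2.0 * D * np.sinc(2.0 * D * sigma)

print("largest deviation from 2D sinc:", np.max(np.abs(numeric - analytic)))

# The zeros bracketing the central peak sit at sigma = +/- 1/(2D),
# so the zero-to-zero width is 1/D.
first_zero = 1.0 / (2.0 * D)
print("value at sigma = 1/(2D):", 2.0 * D * np.sinc(2.0 * D * first_zero))
```

Doubling D in this sketch halves `first_zero`, which is the numerical counterpart of the statement that a longer scan gives a narrower instrument-response function.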
Figures 5.24(a) through 5.24(f) give examples of how this works when the
$2D\, \mathrm{sinc}(2\pi D \sigma)$ instrument-response function acts to blur together a pair of ever-closer
spectral peaks. We see that when the peaks are separated by a wavenumber interval

$$\Delta\sigma \cong \frac{1}{2D} \tag{5.67}$$

all sure knowledge of their separate existence is lost. In Fourier-transform spectrometry, the
quantity $(2D)^{-1}$ is often called the unapodized spectral resolution of the interferometer
measurement. This terminology can be confusing, because a smaller spectral resolution now
corresponds to a higher resolving power for the interferometer. The important thing to remember
is that the interferometer's resolving power (that is, its ability to measure spectral detail) is
directly proportional to D. Figures 5.24(a) through 5.24(f) also show that when the true spectra are
convolved with sinc-like instrument-response functions, the oscillations in the
instrument-response functions create secondary oscillations in regions where $L(\sigma)$ is changing
rapidly. This is sometimes referred to as "ringing" in the measured spectrum $L_{blur}(\sigma)$. This
ringing can lead to unphysically negative values in $L_{blur}(\sigma)$, as shown in Figs. 5.24(b),
5.24(d), and 5.24(f).
In Fourier-transform spectroscopy, the instrument-response function is often called the
instrument line shape, or ILS for short. The instrument line shape can be measured by passing a
laser beam through the interferometer. Although all lasers in practice have some spectral width,
they do produce a spectral radiance $L(\sigma)$ that is, as shown in Fig. 5.25(a), very close to a delta
function.[86]

[86] Equation (5.16c) gives the ideal interferogram created by a strictly monochromatic source
represented by a delta function.

FIGURE 5.23. This is a graph of $2D\, \mathrm{sinc}(2\pi D \sigma)$ versus $\sigma$.

FIGURES 5.24(a) through 5.24(f). [Plots comparing $L(\sigma)$ with $L_{blur}(\sigma)$.]

Figure 5.25(b) plots the curve $L_{blur}(\sigma)$ produced by a Fourier-transform spectrometer
when it measures the laser spectrum at wavenumber $\sigma = \sigma_0$. We can normalize $L_{blur}(\sigma)$ so
that the total area under the curve is one, creating a new curve

$$L_{blur}^{(norm)}(\sigma) = L_{blur}(\sigma) \left[ \int_0^{\infty} L_{blur}(\sigma')\, d\sigma' \right]^{-1} . \tag{5.68a}$$

The origin of the wavenumber axis is then shifted so that the center of the normalized curve is at
the origin, giving a measurement of the instrument-response function or instrument line shape at
$\sigma = \sigma_0$,

$$I_{LS}(\sigma) = L_{blur}^{(norm)}(\sigma + \sigma_0) , \tag{5.68b}$$

as shown in Fig. 5.25(c). To a first approximation (and as a general rule of thumb), we expect to
get about the same shape for $I_{LS}(\sigma)$ no matter what the wavenumber $\sigma_0$ of the laser used to
make the measurement.
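The normalization and shift in Eqs. (5.68a) and (5.68b) amount to two array operations on sampled data. The following is a minimal sketch on simulated data, assuming an ideal sinc-shaped response, a hypothetical laser wavenumber `sigma0`, an arbitrary line strength, and a trapezoidal area restricted to the simulated window rather than the full range from zero to infinity.

```python
import numpy as np

D = 1.0          # maximum OPD in cm (hypothetical value)
sigma0 = 1000.0  # laser wavenumber in cm^-1 (hypothetical value)
sigma = np.linspace(sigma0 - 20.0, sigma0 + 20.0, 8001)
dsigma = sigma[1] - sigma[0]

# Simulated measurement of a near-delta-function laser line of arbitrary strength.
L_blur = 3.7 * 2.0 * D * np.sinc(2.0 * D * (sigma - sigma0))

# Eq. (5.68a): divide by the area under the measured curve (trapezoidal rule).
area = (L_blur.sum() - 0.5 * (L_blur[0] + L_blur[-1])) * dsigma
L_norm = L_blur / area

# Eq. (5.68b): shift the wavenumber axis so the line center sits at the origin.
sigma_shifted = sigma - sigma0
I_LS = L_norm    # I_LS sampled at the points in sigma_shifted

check_area = (I_LS.sum() - 0.5 * (I_LS[0] + I_LS[-1])) * dsigma
print("area under I_LS:", check_area)
print("peak position:", sigma_shifted[np.argmax(I_LS)])
```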
One last point worth making is that after the effective spectrum $\mathcal{Z}_{eff}(\sigma)$ has been
blurred by a convolution with the sinc function, what we end up with is a new effective spectrum

$$\mathcal{Z}_{eff,new}(\sigma) = [2D\, \mathrm{sinc}(2\pi D \sigma)] * [\mathcal{Z}_{eff,old}(\sigma)] . \tag{5.69a}$$

Now the Fourier-transform relationship in Eq. (5.65c) can be written as

$$\mathcal{Z}_{eff,new}(\sigma) = \int_{-\infty}^{\infty} \Pi(\chi, D)\, z(\chi)\, e^{-2\pi i \sigma \chi}\, d\chi \tag{5.69b}$$
and

$$\Pi(\chi, D)\, z(\chi) = \int_{-\infty}^{\infty} \mathcal{Z}_{eff,new}(\sigma)\, e^{2\pi i \sigma \chi}\, d\sigma . \tag{5.69c}$$

So even this aspect of the interferogram signal (that we cannot measure it for all optical-path
differences $\chi$ between $-\infty$ and $+\infty$) can be expressed by representing the truncated signal

$$\Pi(\chi, D)\, z(\chi)$$

as the Fourier transform of an effective spectral function $\mathcal{Z}(\sigma)$.
5.16 Apodization of Spectra
In Sec. 5.15, the basic philosophy of the double-sided interferogram is to give equal weight to all
parts of the signal measured between $+D$ and $-D$, as shown in Eq. (5.63a). When, however, the
approximation in (5.63a) is written as

$$\int_{-\infty}^{\infty} z(\chi)\, e^{-2\pi i \sigma \chi}\, d\chi \cong \int_{-\infty}^{\infty} \Pi(\chi, D)\, z(\chi)\, e^{-2\pi i \sigma \chi}\, d\chi ,$$

it is perhaps not so obvious that putting function $\Pi(\chi, D)$ inside the integral on the right-hand
side leads to the best possible approximation of the true Fourier transform of z. Suppose we
replace $\Pi$ with an arbitrary function of $\chi$ called $a_D(\chi)$, making the approximation that

$$\int_{-\infty}^{\infty} z(\chi)\, e^{-2\pi i \sigma \chi}\, d\chi \cong \int_{-\infty}^{\infty} a_D(\chi)\, z(\chi)\, e^{-2\pi i \sigma \chi}\, d\chi . \tag{5.70a}$$

FIGURE 5.25(a). [Plot of $L(\sigma)$ for a laser line at $\sigma = \sigma_0$.]
FIGURE 5.25(b). [Plot of $L_{blur}(\sigma)$ around $\sigma_0$.]
FIGURE 5.25(c). [Plot of $I_{LS}(\sigma)$.]

The subscript D reminds us that

$$a_D(\chi) = 0 \quad \text{for } |\chi| > D \tag{5.70b}$$

since we do not know what values to give z when $|\chi| > D$. Setting up the problem of
approximating the true Fourier transform in this way (that is, the way it is stated in Eq. (5.70a))
suggests that what we need to do is find that function $a_D(\chi)$ for which the integral

$$\int_{-\infty}^{\infty} a_D(\chi)\, z(\chi)\, e^{-2\pi i \sigma \chi}\, d\chi$$

best approximates the true Fourier transform

$$\int_{-\infty}^{\infty} z(\chi)\, e^{-2\pi i \sigma \chi}\, d\chi .$$

Trying to approximate the Fourier transform of a function z, which is known from only a finite
stretch of data, is not a problem unique to Fourier-transform spectroscopy; in fact, it occurs over
and over again in many different fields of electrical engineering and signal processing. In these
fields, $a_D$ is called the window function and multiplying $z(\chi)$ by $a_D(\chi)$ is referred to as
windowing $z(\chi)$. In Fourier-transform spectroscopy $a_D$ is called the apodization function, and
multiplying $z(\chi)$ by $a_D(\chi)$ is called apodizing the interferogram signal z.
There are several different types of restrictions put on the apodization function $a_D(\chi)$. If

$$\mathcal{Z}_{eff}(\sigma) = \int_{-\infty}^{\infty} z(\chi)\, e^{-2\pi i \sigma \chi}\, d\chi \tag{5.71a}$$

is the true Fourier transform of z, then according to Eq. (2.35b) of Chapter 2,

$$z(0) = \int_{-\infty}^{\infty} \mathcal{Z}_{eff}(\sigma)\, d\sigma . \tag{5.71b}$$

When we replace $z(\chi)$ by $a_D(\chi)\, z(\chi)$ in Eq. (5.70a), distorting the shape of the Fourier
transform $\mathcal{Z}_{eff}(\sigma)$, we want the integral over the distorted spectrum to have the same value as
the integral over the undistorted spectrum in (5.71b). Because the distorted spectrum is by
definition the Fourier transform of $a_D(\chi)\, z(\chi)$, it follows (again using (2.35b) of Chapter 2)
that the integral over the distorted spectrum is $a_D(0)\, z(0)$. Forcing the integrals over the
distorted and undistorted spectra to have the same values now leads to

$$a_D(0)\, z(0) = z(0)$$
or
$$a_D(0) = 1 . \tag{5.71c}$$
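The area argument behind Eq. (5.71c) has a simple discrete analog that can be checked with an FFT: the sum of all DFT coefficients of a sequence equals N times its zeroth sample. The sketch below uses a made-up, even interferogram stored in FFT order (index 0 is zero OPD) and a triangular window with unit value at zero OPD; both are illustrative assumptions rather than anything specified in the text.

```python
import numpy as np

N = 256
n = np.arange(N)
# Signed sample offset from zero OPD, stored in FFT order (index 0 is chi = 0).
k = np.where(n <= N // 2, n, n - N)

# Made-up even interferogram with its centerburst at index 0.
z = np.exp(-0.5 * (k / 10.0) ** 2) * np.cos(2.0 * np.pi * 0.1 * k)

# Triangular apodization with a_D(0) = 1, falling to zero at the largest offset.
a = 1.0 - np.abs(k) / (N / 2)

Z = np.fft.fft(z)
Z_apodized = np.fft.fft(a * z)

# Discrete analog of Eq. (5.71b): the sum over the spectrum equals N * z[0].
print(Z.sum().real, Z_apodized.sum().real, N * z[0])
```

Rescaling the window so that `a[0]` differs from one changes the second sum by the same factor, which is why the constraint is placed on the value of the apodization function at zero OPD.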

It is hard to justify giving the apodization or window function a nonzero imaginary part, so
almost always

$$\mathrm{Im}\left( a_D(\chi) \right) = 0 . \tag{5.71d}$$

According to the discussion at the end of Sec. 5.12, $z(\chi)$ is often an approximately symmetric
function of the optical-path difference $\chi$, which means there is no obvious reason to weight
$z(\chi)$ differently from $z(-\chi)$ in the integral on the right-hand side of (5.70a). This suggests that
the apodization should be an even function of the optical-path difference:

$$a_D(-\chi) = a_D(\chi) . \tag{5.71e}$$

Different choices of ( )
D
a preserve different aspects of ( )
eff
Z when it is approximated by
the apodization integral in (5.70a), and it is impossible to pick one particular apodization or
window function as being ideal under all circumstances. Applying the Fourier convolution
theorem [in the form of Eq. (2.39k) of Chapter 2] to (5.70a) and remembering that ( )
eff
Z is the
exact Fourier transform of z(), we get

( ) A ( ) ( )
eff D eff
Z Z , (5.72a)
where

2
A ( ) ( )
i
D D
a e d

-
. (5.72b)

From Eqs. (5.71d) and (5.71e), we know that $a_D(\chi)$ is real and even, which means, according to entry 1 in Table 2.1 of Chapter 2, that the Fourier transform $A_D(\sigma)$ is also real and even. Figures 5.26(a) and 5.26(b) give some of the more popular apodization or window functions and their corresponding Fourier transforms. Compared to the rectangular truncation function $\Pi(\chi, D)$, they all do a better job of preventing ringing; in fact, the Bartlett and Parzen window functions, because their Fourier transforms do not go negative, can never produce unphysical negative values when convolved with the non-negative true spectrum $L(\sigma)$ [which is the basic shape-determining factor of $\mathcal{Z}_{eff}(\sigma)$ on the right-hand side of Eq. (5.72a)]. Apodization functions in fact get their name from the way they can diminish or remove unsightly ringing at the base of sharp spectral peaks in Fourier measurements. The "pod" root comes from the Greek word for foot, a metaphorical reference to the small spurious bumps often present at the base of these peaks; and the "a" prefix before the "pod" shows that apodization is intended to remove (or diminish) the feet. As a rule of thumb, apodizing the interferogram signal is more a matter of aesthetics (making the measured spectrum look better) than it is a way to reveal previously hidden spectral detail. If there are doubts about the true shape of a measured spectrum, it is better to increase the value of $D$ than to introduce a more sophisticated apodization function.
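The contrast between truncation and apodization can be checked numerically. The following is a minimal sketch, assuming the standard triangular form $a_D(\chi) = 1 - |\chi|/D$ for the Bartlett window (the text above does not spell out its formula) and comparing it with plain rectangular truncation; the function names and grid sizes are illustrative choices only.

```python
import numpy as np

def window_transform(window, D, sigmas, n=20001):
    """Numerically evaluate A_D(sigma) = integral of a_D(chi) exp(-2 pi i sigma chi) dchi
    over -D <= chi <= D. Because a_D is real and even, only the cosine part survives."""
    chi = np.linspace(-D, D, n)
    vals = window(chi, D)
    h = chi[1] - chi[0]
    out = []
    for s in sigmas:
        f = vals * np.cos(2.0 * np.pi * s * chi)
        out.append(h * (f.sum() - 0.5 * (f[0] + f[-1])))  # trapezoid rule
    return np.array(out)

def rectangular(chi, D):
    # No apodization: the interferogram is simply truncated at |chi| = D.
    return np.ones_like(chi)

def bartlett(chi, D):
    # Assumed standard triangular (Bartlett) window.
    return 1.0 - np.abs(chi) / D

D = 1.0
sigmas = np.linspace(-3.0 / D, 3.0 / D, 241)
A_rect = window_transform(rectangular, D, sigmas)
A_bart = window_transform(bartlett, D, sigmas)
print("most negative lobe, rectangular:", A_rect.min())
print("most negative lobe, Bartlett:   ", A_bart.min())
```

Under these assumptions the rectangular transform shows pronounced negative side lobes, while the Bartlett transform stays non-negative to within the integration error.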
5.17 The Effect of a Finite Field of View
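Equation (5.62c) gives the formula for the detector-circuit signal $z(\chi)$ generated by the source when $\cos\theta_\rho$ cannot be approximated as one:

$$z(\chi) = \frac{W}{4}\int_{-\infty}^{\infty} d\sigma\, S(\sigma)\, M(\sigma R_{ma})\, \frac{1}{\Delta\Omega}\iint_{\text{field of view}} d^2\rho\; H(u\sigma\cos\theta_\rho)\, e^{2\pi i \sigma\chi\cos\theta_\rho}. \tag{5.73a}$$

To investigate what happens to this signal when the field of view is sufficiently large that $\cos\theta_\rho$ is approximately but not exactly equal to one, we write

$$\cos\theta_\rho \cong 1 - \frac{\theta_\rho^2}{2}. \tag{5.73b}$$

Substitution of this back into the formula for $z(\chi)$ gives

$$z(\chi) \cong \frac{W}{4}\int_{-\infty}^{\infty} d\sigma\, S(\sigma)\, M(\sigma R_{ma})\, e^{2\pi i \sigma\chi}\, \frac{1}{\Delta\Omega}\iint_{\text{field of view}} d^2\rho\; e^{-i\pi\sigma\chi\theta_\rho^2}\, H\!\left(u\sigma - \frac{u\sigma\theta_\rho^2}{2}\right). \tag{5.73c}$$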

FIGURE 5.26(a). This graph shows four popular apodization or window functions $a_D(\chi)$ (Bartlett, Tukey, Hamming, and Parzen), plotted against the optical-path difference with the horizontal axis marked at $-D$ and $D$. They get their names from the analysts who first publicized them.

FIGURE 5.26(b). This graph plots the Fourier transforms $A_D(\sigma)$ of the four apodization or window functions (Bartlett, Tukey, Hamming, and Parzen) shown in Fig. 5.26(a), with the horizontal axis marked in units of $1/D$.

The outer integral over $d\sigma$ goes between $-\infty$ and $+\infty$, so as long as $\theta_\rho^2$ is not zero there is eventually a value of $\sigma$ large enough to make $\cos\theta_\rho$ in the expression $\sigma\cos\theta_\rho$ too large to be approximated by (5.73b) in the phase formulas in Eqs. (5.73a) and (5.73c). (The first part of Appendix 4B of Chapter 4 explains why we must be careful when deriving approximations for the phase.) The first step, then, in treating $\theta_\rho^2$ as a small quantity in Eq. (5.73c) is to require that $S(\sigma)$ be zero or negligible for those values of $\sigma$ large enough to invalidate (5.73b). Because $S$ is even so that [see Eq. (5.39a) above]

$$S(-\sigma) = S(\sigma),$$

it follows that $S$ must also be zero or negligible for large negative values of $\sigma$. Glancing back at the definition of $S$ for positive $\sigma$ in Eq. (5.36a),

$$S(\sigma) = A\,\Delta\Omega\,\mathcal{R}(\sigma)\,\eta(\sigma)\,\tau_f(\sigma)\,\tau_a(\sigma)\,L(\sigma),$$

we see that the behavior of $S$ at large $\sigma$ is under the control of the interferometer designer; for example, the fore and aft optics can be constructed so that the product $\tau_f(\sigma)\,\tau_a(\sigma)$ is zero or negligible for large values of $\sigma$. When $\theta_\rho^2$ is also multiplied by the optical-path difference $\chi$, as it is in the phase of

$$e^{-i\pi\sigma\chi\theta_\rho^2}$$

in Eq. (5.73c), the interferometer designer must also choose an appropriate upper limit on $|\chi|$. This upper limit was called $D$ in Secs. 5.15 and 5.16 above, so in (5.73a) through (5.73c) we want $D$ to be chosen small enough that for all

$$|\chi| \le D \tag{5.73d}$$

the product $\chi\theta_\rho^2$ can be treated as a small quantity.

To connect $\theta_\rho$ to the variable of integration in the double integral over $d^2\rho$, we return to the original definition of $\cos\theta_\rho$ in Eq. (4.135f) of Chapter 4,

$$\cos\theta_\rho = \sqrt{1 - \rho^2}. \tag{5.74a}$$

When the detector's field of view is small, which means $\rho$ is always close to zero, we can approximate the square root as

$$\sqrt{1 - \rho^2} \cong 1 - \frac{\rho^2}{2}.$$

Consequently, Eq. (5.74a) can be written as

$$\cos\theta_\rho \cong 1 - \frac{\rho^2}{2} \tag{5.74b}$$

inside the double integral over $d^2\rho$ in Eq. (5.73c). Comparing Eq. (5.74b) to (5.73b), we see that if $\rho$ is small (which is, of course, the same as saying the detector's field of view is small) it follows that

$$\theta_\rho^2 \cong \rho^2. \tag{5.74c}$$

In the discussion at the beginning of Sec. 5.7 above, we interpreted the double integral over $d^2\rho$ in Eq. (5.25b) as a sum over all the plane waves propagating through the interferometer, with $\theta_\rho$ being the angle of the $\vec{\rho}$th plane wave's propagation vector with respect to the optical axis. The double integral over $d^2\rho$ in Eqs. (5.73a) and (5.73c) can be interpreted in the same way. The angle $\theta_\rho$ is always taken to be greater than or equal to zero, so Eq. (5.74c) can be written as

$$\theta_\rho \cong \rho. \tag{5.74d}$$

Hence, when the propagation angle is small, $\rho$ can be thought of as the angle in radians with respect to the optical axis at which the $\vec{\rho}$th plane wave is propagating through the interferometer. The discussion in Sec. 5.7 shows that interferometer setups that have a circular detector centered on the optical axis, such as the standard Michelson interferometer shown in Fig. 5.18, have propagation angles $\theta_\rho = \rho$ that are at a maximum $\rho_{max}$ when the plane waves are focused onto the detector's edge. The interior points of the detector absorb the focused energy of plane waves passing through the interferometer at propagation angles $\rho < \rho_{max}$; in fact, all plane waves with the same propagation angle $\rho$ end up focused onto a circle surrounding the detector's center, with the radius of the circle proportional to $\rho$ as shown in Fig. 5.27.
The double integral over the field of view has this same sort of circular symmetry. Substituting (5.74c) into (5.73c) gives

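$$z(\chi) \cong \frac{W}{4}\int_{-\infty}^{\infty} d\sigma\, S(\sigma)\, M(\sigma R_{ma})\, e^{2\pi i \sigma\chi}\, \frac{1}{\Delta\Omega}\iint_{\text{field of view}} d^2\rho\; e^{-i\pi\sigma\chi\rho^2}\, H\!\left(u\sigma - \frac{u\sigma\rho^2}{2}\right). \tag{5.75a}$$

We see that the quantity inside the double integral over $d^2\rho$,

$$e^{-i\pi\sigma\chi\rho^2}\, H\!\left(u\sigma - \frac{u\sigma\rho^2}{2}\right),$$

depends only on $\rho^2$. This means the double integral can be thought of as an integral over all the infinitesimal area patches $d^2\rho = d\rho_x\, d\rho_y$ of a quantity that only depends on $\rho^2 = |\vec{\rho}|^2 = \rho_x^2 + \rho_y^2$, the square of the distance of any point in this area integral from the origin where $\rho = 0$. Consequently the area integral over $d^2\rho$ has circular symmetry and can be treated as a one-dimensional integral over a collection of rings with radii between 0 and $\rho_{max}$,

$$\iint_{\text{field of view}} d^2\rho \;\rightarrow\; \int_0^{\rho_{max}} 2\pi\rho\, d\rho.$$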

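Equation (5.75a) can now be written as

$$\begin{aligned}
z(\chi) &\cong \frac{W}{4}\int_{-\infty}^{\infty} d\sigma\, S(\sigma)\, M(\sigma R_{ma})\, e^{2\pi i \sigma\chi} \left\{ \frac{1}{\Delta\Omega}\int_0^{\rho_{max}} 2\pi\rho\; e^{-i\pi\sigma\chi\rho^2} \left[ H\!\left(u\sigma - \frac{u\sigma\rho^2}{2}\right) \right] d\rho \right\} \\
&\cong \frac{W}{4}\int_{-\infty}^{\infty} d\sigma\, S(\sigma)\, M(\sigma R_{ma})\, H(u\sigma)\, e^{2\pi i \sigma\chi} \left\{ \frac{1}{\Delta\Omega}\int_0^{\rho_{max}} 2\pi\rho\; e^{-i\pi\sigma\chi\rho^2}\, d\rho \right\},
\end{aligned} \tag{5.75b}$$

where in the last step we have assumed that the transfer function $H$ is such a slowly varying function of its argument that we can disregard the effect of adding the small quantity $-(u\sigma)\rho^2/2$ to the argument $u\sigma$. For future use, we note that the circular symmetry of the detector's field of view lets us write Eq. (5.35d) as

$$\Delta\Omega = \iint_{\text{field of view}} d^2\rho = \int_0^{\rho_{max}} 2\pi\rho\, d\rho = \pi\rho_{max}^2. \tag{5.75c}$$

FIGURE 5.27. All plane waves with the same off-axis propagation angle $\rho$ end up focused onto the same circle surrounding the center of the detector.
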

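Hence $\Delta\Omega$ is given by the formula for the area of a circle of radius $\rho_{max}$. In Eq. (5.75b) the term inside the braces { } can be simplified to

$$\begin{aligned}
\frac{1}{\Delta\Omega}\int_0^{\rho_{max}} 2\pi\rho\; e^{-i\pi\sigma\chi\rho^2}\, d\rho
&= \frac{1}{\Delta\Omega}\cdot\frac{2\pi}{(-2\pi i \sigma\chi)} \left[ e^{-i\pi\sigma\chi\rho^2} \right]_0^{\rho_{max}} \\
&= \frac{1}{i\sigma\chi\,\Delta\Omega}\; e^{-i\pi\sigma\chi\rho_{max}^2/2} \left( e^{i\pi\sigma\chi\rho_{max}^2/2} - e^{-i\pi\sigma\chi\rho_{max}^2/2} \right).
\end{aligned}$$

Equation (5.75c) can be written as $\pi\rho_{max}^2 = \Delta\Omega$, and with this substitution the integral becomes

$$\frac{1}{\Delta\Omega}\int_0^{\rho_{max}} 2\pi\rho\; e^{-i\pi\sigma\chi\rho^2}\, d\rho = e^{-i\sigma\chi\Delta\Omega/2}\; \frac{2}{\sigma\chi\,\Delta\Omega}\, \sin\!\left(\frac{\sigma\chi\,\Delta\Omega}{2}\right).$$

Following the definition of

$$\mathrm{sinc}(x) = \frac{\sin(x)}{x}$$

given in Eq. (2.106d) of Chapter 2, we write

$$\frac{1}{\Delta\Omega}\int_0^{\rho_{max}} 2\pi\rho\; e^{-i\pi\sigma\chi\rho^2}\, d\rho = e^{-i\sigma\chi\Delta\Omega/2}\; \mathrm{sinc}\!\left(\frac{\sigma\chi\,\Delta\Omega}{2}\right). \tag{5.75d}$$

This can be substituted back into (5.75b) to get

$$z(\chi) \cong \frac{W}{4}\int_{-\infty}^{\infty} S(\sigma)\, M(\sigma R_{ma})\, H(u\sigma)\; \mathrm{sinc}\!\left(\frac{\Delta\Omega\,\sigma\chi}{2}\right) e^{2\pi i \sigma\chi\left(1 - \frac{\Delta\Omega}{4\pi}\right)}\, d\sigma. \tag{5.75e}$$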
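The ring integral behind Eq. (5.75d) is easy to verify numerically. The sketch below, under the assumption that $\pi\rho_{max}^2 = \Delta\Omega$ as in Eq. (5.75c), compares a trapezoid-rule evaluation of the normalized integral with the closed form $e^{-i\sigma\chi\Delta\Omega/2}\,\mathrm{sinc}(\sigma\chi\Delta\Omega/2)$. The function names and sample values are hypothetical; note that `numpy.sinc(t)` is $\sin(\pi t)/(\pi t)$, so its argument is divided by $\pi$ to match the $\sin(x)/x$ convention used here.

```python
import numpy as np

def ring_integral_numeric(sigma, chi, d_omega, n=200001):
    """(1/d_omega) * integral_0^rho_max of 2 pi rho exp(-i pi sigma chi rho^2) d rho."""
    rho_max = np.sqrt(d_omega / np.pi)          # from pi * rho_max**2 = d_omega
    rho = np.linspace(0.0, rho_max, n)
    f = 2.0 * np.pi * rho * np.exp(-1j * np.pi * sigma * chi * rho**2)
    h = rho[1] - rho[0]
    return h * (f.sum() - 0.5 * (f[0] + f[-1])) / d_omega   # trapezoid rule

def ring_integral_closed(sigma, chi, d_omega):
    """Closed form exp(-i x) * sin(x)/x with x = sigma * chi * d_omega / 2."""
    x = sigma * chi * d_omega / 2.0
    return np.exp(-1j * x) * np.sinc(x / np.pi)

# Illustrative values only: 1000 cm^-1, 1 cm path difference, 0.004 sr field of view.
numeric = ring_integral_numeric(1000.0, 1.0, 0.004)
closed = ring_integral_closed(1000.0, 1.0, 0.004)
print(numeric, closed)
```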

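According to the discussion in Sec. 5.11, we can associate an effective spectrum with the formula in Eq. (5.75e),

$$\begin{aligned}
\mathcal{Z}_{eff}(\sigma) &= \int_{-\infty}^{\infty} z(\chi)\, e^{-2\pi i \sigma\chi}\, d\chi \\
&= \frac{W}{4}\int_{-\infty}^{\infty} d\chi\; e^{-2\pi i \sigma\chi} \int_{-\infty}^{\infty} d\sigma'\; S(\sigma')\, M(\sigma' R_{ma})\, H(u\sigma')\; \mathrm{sinc}\!\left(\frac{\Delta\Omega\,\sigma'\chi}{2}\right) e^{2\pi i \sigma'\chi\left(1 - \frac{\Delta\Omega}{4\pi}\right)}.
\end{aligned} \tag{5.76a}$$

From Eqs. (5B.8a) and (5B.8b) in Appendix 5B at the end of this chapter, it follows that

$$\mathcal{Z}_{eff}(\sigma) \cong \frac{1}{\Delta\sigma}\int_{\sigma\left(1 + \frac{\Delta\Omega}{4\pi}\right) - \frac{\Delta\sigma}{2}}^{\sigma\left(1 + \frac{\Delta\Omega}{4\pi}\right) + \frac{\Delta\sigma}{2}} \left[ \frac{W}{4}\, S(\sigma')\, M(\sigma' R_{ma})\, H(u\sigma') \right] d\sigma', \tag{5.76b}$$

where

$$\Delta\sigma = \frac{\sigma\,\Delta\Omega}{2\pi}. \tag{5.76c}$$

Formula (5.76b) is good for fields of view small enough that $\cos\theta_\rho$ can be approximated quadratically as

$$\cos\theta_\rho \cong 1 - \frac{\theta_\rho^2}{2},$$

but not so small that $\cos\theta_\rho$ can be approximated as one. In (5.76b) the term inside the square brackets [ ] is, as pointed out in Appendix 5B, averaged over a wavenumber interval that is centered on

$$\sigma\left(1 + \frac{\Delta\Omega}{4\pi}\right)$$

and that has width

$$\Delta\sigma = \frac{\sigma\,\Delta\Omega}{2\pi}.$$

Equations (5.47c) and (5.48a) show that when $\cos\theta_\rho$ can be approximated by one and there is no background radiance, the effective signal spectrum can be written as

$$\mathcal{Z}_{eff}(\sigma) = \frac{W}{4}\, S(\sigma)\, M(\sigma R_{ma})\, H(u\sigma). \tag{5.76d}$$
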
This expression is the same as the term inside the square brackets in (5.76b). We conclude that in Eq. (5.76b) the term inside the integral is just the effective signal spectrum of the narrow field-of-view case where $\cos\theta_\rho$ can be approximated by one. Consequently, the effect of increasing the field of view beyond the point where $\cos\theta_\rho$ can be approximated by one is to blur the effective signal spectrum by averaging it over a wavenumber region of width

$$\Delta\sigma = \frac{\sigma\,\Delta\Omega}{2\pi}$$

centered on wavenumber $\sigma\left(1 + \frac{\Delta\Omega}{4\pi}\right)$ instead of $\sigma$. Therefore, another effect of the increased field of view is to scale the wavenumber axis of the effective signal spectrum by a factor of $1 + \frac{\Delta\Omega}{4\pi}$. In other words, the spectral details at $\sigma \cong \sigma_0$ are blurred over a region $\Delta\sigma$ in width around $\sigma_0$ and then, in the spectral measurements, show up at wavenumber $\sigma_0\left(1 + \frac{\Delta\Omega}{4\pi}\right)^{-1}$ instead of at wavenumber $\sigma_0$. When the field of view is known, we can always rescale the wavenumber axis to put the spectral details in their correct locations, but the blurring degrades the spectral resolution in a way that cannot be fixed.
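As a small illustration of the rescaling step, the sketch below assumes the shift factor $1 + \Delta\Omega/(4\pi)$ and blur width $\sigma\,\Delta\Omega/(2\pi)$ quoted above; the function names and the sample field of view are hypothetical and not taken from the text.

```python
import math

def fov_scale_factor(d_omega):
    """Factor 1 + d_omega/(4 pi) relating measured and true wavenumbers."""
    return 1.0 + d_omega / (4.0 * math.pi)

def true_wavenumber(sigma_measured, d_omega):
    """Undo the field-of-view shift: a feature measured at sigma_measured
    actually sits at sigma_measured * (1 + d_omega/(4 pi))."""
    return sigma_measured * fov_scale_factor(d_omega)

def blur_width(sigma, d_omega):
    """Width sigma * d_omega / (2 pi) of the averaging interval; this blurring is not reversible."""
    return sigma * d_omega / (2.0 * math.pi)

d_omega = 0.01  # steradians, illustrative only
print(true_wavenumber(1009.2, d_omega), blur_width(1009.2, d_omega))
```

The rescaling restores the position of a spectral feature, but nothing in this calculation recovers detail finer than the blur width.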
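We specify a new variable of integration

$$\sigma_\gamma = \gamma_{\Delta\Omega}\,\sigma \tag{5.77a}$$

with

$$\gamma_{\Delta\Omega} = 1 - \frac{\Delta\Omega}{4\pi} \tag{5.77b}$$

and use it to write Eq. (5.75e) as

$$z(\chi) \cong \int_{-\infty}^{\infty} \left[ \frac{W}{4}\,\gamma_{\Delta\Omega}^{-1}\, S\!\left(\gamma_{\Delta\Omega}^{-1}\sigma_\gamma\right) M\!\left(\gamma_{\Delta\Omega}^{-1}\sigma_\gamma R_{ma}\right) H\!\left(u\,\gamma_{\Delta\Omega}^{-1}\sigma_\gamma\right) \right] \mathrm{sinc}\!\left(\frac{\Delta\Omega\,\chi\,\gamma_{\Delta\Omega}^{-1}\sigma_\gamma}{2}\right) e^{2\pi i \sigma_\gamma\chi}\, d\sigma_\gamma. \tag{5.77c}$$

The term inside the square brackets is just another version of the effective spectrum in (5.76d), but now it is multiplied by the factor

$$\mathrm{sinc}\!\left(\frac{\Delta\Omega\,\chi\,\gamma_{\Delta\Omega}^{-1}\sigma_\gamma}{2}\right) = \mathrm{sinc}\!\left[\left(1 - \frac{\Delta\Omega}{4\pi}\right)^{-1}\frac{\Delta\Omega\,\chi\,\sigma_\gamma}{2}\right].$$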

This sinc factor artificially decreases the size of the effective spectrum, forcing it to contribute too little to the integration over $d\sigma_\gamma$ so that the signal $z(\chi)$ is smaller than it would otherwise be at large values of the optical-path difference $\chi$. This effect is sometimes called the self-apodization of the interferogram signal. To avoid having significant amounts of self-apodization in the measured spectrum, we should keep the optical-path difference from becoming so large that the sinc factor becomes small or even negative. Following the notation of Sec. 5.15, there must be a length $D$ with $|\chi| \le D$ such that

$$\mathrm{sinc}\!\left[\left(1 - \frac{\Delta\Omega}{4\pi}\right)^{-1}\frac{\Delta\Omega\, D\,\sigma_\gamma}{2}\right]$$

stays reasonably close to one. In any well-designed interferometer, the wavenumbers to which the detector is sensitive lie within a specified wavenumber range,

$$\sigma_{min} \le \sigma \le \sigma_{max}, \tag{5.78}$$

as is discussed following Eq. (4.66b) in Chapter 4. Consequently, the traditional rule of thumb is to require the sinc factor to be greater than 2/3 for the maximum possible value of its argument,

$$\mathrm{sinc}\!\left[\left(1 - \frac{\Delta\Omega}{4\pi}\right)^{-1}\frac{\Delta\Omega\, D\,\sigma_{max}}{2}\right] > \frac{2}{3}, \tag{5.79}$$

to avoid having significant amounts of self-apodization occur. This implies that

$$\left(1 - \frac{\Delta\Omega}{4\pi}\right)^{-1}\frac{\Delta\Omega\, D\,\sigma_{max}}{2} < 1.488$$

or

$$D < \frac{(2.976)\left(1 - \frac{\Delta\Omega}{4\pi}\right)}{\sigma_{max}\,\Delta\Omega} \cong \frac{2.976}{\sigma_{max}\,\Delta\Omega}, \tag{5.80}$$

where in the last step we assume [see Eqs. (5B.1c) and (5B.1d) in Appendix 5B]

$$\frac{\Delta\Omega}{4\pi} \ll 1,$$

something that is almost always the case. As was discussed in Sec. 5.15, the size of $D$ controls the overall resolution of the spectral measurement, with small values of $D$ producing low-resolution spectral measurements and large values of $D$ producing high-resolution spectral measurements [see Eq. (5.67)]. What we have here, then, is the interferometric version of the classic inverse relationship between spectral resolution and field of view that affects all spectrometers, not just the Fourier-transform type. Inequality (5.80) states that to avoid self-apodization, large fields of view should have small values of $D$, producing low-resolution spectral measurements, and small fields of view can have large values of $D$, producing high-resolution spectral measurements. If inequality (5.80) is ignored, then self-apodization occurs and resolution is lost from the blurring effect of the integral in Eq. (5.76b) above.
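The rule of thumb in (5.79) and (5.80) can be turned into a quick design check. This is only a sketch under the stated assumptions: it takes the constant 2.976 directly from inequality (5.80), uses the $\sin(x)/x$ convention for sinc, and the function names and sample numbers are hypothetical.

```python
import math

def sinc_factor(sigma_gamma, chi, d_omega):
    """Self-apodization factor sinc[(1 - d_omega/(4 pi))^-1 * d_omega * chi * sigma_gamma / 2]."""
    x = d_omega * chi * sigma_gamma / (2.0 * (1.0 - d_omega / (4.0 * math.pi)))
    return 1.0 if x == 0.0 else math.sin(x) / x

def max_path_difference(sigma_max, d_omega):
    """Upper bound on D from inequality (5.80), keeping the (1 - d_omega/(4 pi)) factor."""
    return 2.976 * (1.0 - d_omega / (4.0 * math.pi)) / (sigma_max * d_omega)

# Illustrative values only: sigma_max = 2000 cm^-1, field of view 0.001 sr.
D_limit = max_path_difference(2000.0, 0.001)
factor_at_limit = sinc_factor(2000.0, D_limit, 0.001)
print(D_limit, factor_at_limit)
```

At the limiting value of $D$ the sinc argument equals 1.488, so the computed factor sits just above 2/3, consistent with the rule of thumb.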
5.18 Single-Sided Interferograms
It is easy to show that when the interferogram signal $z$ is even, we can double the spectral resolving power of a standard Michelson interferometer by shifting the fixed mirror so that the moving mirror's ZPD position occurs at the beginning (rather than the center) of the moving mirror's range of motion (see Fig. 5.28). Before the fixed mirror is shifted, we have the standard setup for a double-sided interferogram, with $\chi$ varying between $+D$ and $-D$ as the moving mirror moves from the beginning to the end of its path. After the fixed mirror is shifted, running the moving mirror over the same physical positions as before gives $z$ at all the optical-path differences between zero and $2D$ instead of between $-D$ and $D$. By assumption, $z$ is even, so we can then use

$$z(-\chi) = z(\chi)$$

to get $z$ between $-2D$ and zero. Consequently, we end up with the same knowledge of the interferogram signal that we would get from measuring a double-sided interferogram between $-2D$ and $2D$. Putting the moving mirror's ZPD location at the beginning of its range of motion therefore doubles the effective length of the interferogram signal. According to Eq. (5.67), the resolving power of a double-sided interferogram is directly proportional to the interferogram signal's length, so (when the interferogram signal is even) putting the moving mirror's ZPD location at the beginning of its range of motion doubles the spectral resolving power.
Shifting the position of the fixed mirror is, as a general rule, much easier than extending the moving mirror's range of motion, so it is unfortunate that, because $z$ is not exactly even after passing through the detector circuit,^87 we cannot so simply double the resolving power of already-built Michelson interferometers. If, however, the fixed mirror is shifted as shown in Fig. 5.29 so that the ZPD position is put close to, rather than exactly at, the beginning of the moving mirror's range of motion, we can usually symmetrize the interferogram signal, turning it into an exactly even function of $\chi$. This returns us to the ideal case discussed above, letting us increase the interferometer's spectral resolving power by increasing the effective length of the interferogram signal. Because we do not put the ZPD exactly at the beginning of the moving mirror's range of motion, we cannot double the resolving power; but in almost all cases there is a large increase (almost a doubling) in the amount of spectral detail which the interferometer can measure.

^87 See discussion at the end of Sec. 5.12.
From the work done in Sec. 5.11, we know that after passing through the detector circuit the interferogram signal can be written as the inverse Fourier transform of an effective spectrum,

$$z(\chi) = \int_{-\infty}^{\infty} \mathcal{Z}_{eff}(\sigma)\, e^{2\pi i \sigma\chi}\, d\sigma. \tag{5.81a}$$

From entry 7 in Table 2.1 of Chapter 2, we know that, since $z(\chi)$ is real, $\mathcal{Z}_{eff}(\sigma)$ must be Hermitian,

$$\mathcal{Z}_{eff}(-\sigma) = \mathcal{Z}_{eff}(\sigma)^{*}. \tag{5.81b}$$

We also know from the discussion following Eq. (5A.6b) in Appendix 5A that the transfer function $H(u\sigma)$ of the detector circuit must have a nonzero imaginary component. For small fields of view where $\cos\theta_\rho$ can be approximated by one, Eq. (5.48a) gives

$$\mathcal{Z}_{eff}(\sigma) = \frac{W}{4}\, H(u\sigma)\, S(\sigma)\, M(\sigma R_{ma}), \tag{5.82a}$$

showing that, since $W = +1$ or $-1$ and functions $S(\sigma)$ and $M(\sigma R_{ma})$ are real, the effective spectrum $\mathcal{Z}_{eff}(\sigma)$ has a nonzero imaginary component only because $H$ has a nonzero imaginary component.
For larger fields of view when $\cos\theta_\rho$ cannot be approximated by one, we can again show that $\mathcal{Z}_{eff}(\sigma)$ has a nonzero imaginary component because $H$ has a nonzero imaginary component. Equations (5.76b) and (5.76c) give

$$\mathcal{Z}_{eff}(\sigma) = \frac{1}{\Delta\sigma}\int_{\sigma\left(1 + \frac{\Delta\Omega}{4\pi}\right) - \frac{\Delta\sigma}{2}}^{\sigma\left(1 + \frac{\Delta\Omega}{4\pi}\right) + \frac{\Delta\sigma}{2}} \left[ \frac{W}{4}\, S(\sigma')\, M(\sigma' R_{ma})\, H(u\sigma') \right] d\sigma' \tag{5.82b}$$

FIGURE 5.28. [Diagram of the Michelson interferometer showing the radiance entering the interferometer, the ideal beam splitter, the moving mirror and its range of motion, the old and new positions of the fixed mirror, the old and new ZPD positions, and the radiance heading to the detector.]

FIGURE 5.29. [Diagram of the same interferometer with the fixed mirror shifted so that the new ZPD position lies a distance $d$ inside the beginning of the moving mirror's range of motion; lengths $d$ and $D$ are marked along the range of motion, together with the old and new ZPD positions and the old and new positions of the fixed mirror.]

with

$$\Delta\sigma = \frac{\sigma\,\Delta\Omega}{2\pi}. \tag{5.82c}$$

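In a well-designed interferometer system we want $M$ (if it is not equal to one) and $H$ to vary slowly as functions of $\sigma$, letting $S(\sigma)$ carry the high-resolution spectral detail. In fact, we know from Eq. (5.40g) that

$$S(\sigma) = A\,\Delta\Omega\,\mathcal{R}(\sigma)\,\eta(\sigma)\,\tau_f(\sigma)\,\tau_a(\sigma)\,L(\sigma). \tag{5.82d}$$

This shows that the interferometer can be designed and built so that $\mathcal{R}$, $\eta$, $\tau_a$, and $\tau_f$ also vary slowly with $\sigma$ over the range of wavenumbers being measured, allowing the rapid variation with wavenumber to come entirely from the spectral radiance $L(\sigma)$. This is, in fact, how we expect $\mathcal{R}$, $\eta$, $\tau_a$, and $\tau_f$ to behave in well-designed interferometers. Consequently, all the slowly varying functions of $\sigma$ can be brought outside the integral in Eq. (5.82b), which means we can substitute (5.82d) into (5.82b) to get

$$\begin{aligned}
\mathcal{Z}_{eff}(\sigma) &\cong \frac{W A\,\Delta\Omega}{4}\, H\!\left(u\sigma\left[1 + \tfrac{\Delta\Omega}{4\pi}\right]\right) M\!\left(\sigma\left[1 + \tfrac{\Delta\Omega}{4\pi}\right] R_{ma}\right) \mathcal{R}\!\left(\sigma\left[1 + \tfrac{\Delta\Omega}{4\pi}\right]\right) \eta\!\left(\sigma\left[1 + \tfrac{\Delta\Omega}{4\pi}\right]\right) \\
&\qquad \times\; \tau_a\!\left(\sigma\left[1 + \tfrac{\Delta\Omega}{4\pi}\right]\right) \tau_f\!\left(\sigma\left[1 + \tfrac{\Delta\Omega}{4\pi}\right]\right) \frac{1}{\Delta\sigma}\int_{\sigma\left(1 + \frac{\Delta\Omega}{4\pi}\right) - \frac{\Delta\sigma}{2}}^{\sigma\left(1 + \frac{\Delta\Omega}{4\pi}\right) + \frac{\Delta\sigma}{2}} L(\sigma')\, d\sigma' \\
&\cong \frac{W A\,\Delta\Omega}{4}\, H(u\sigma)\, M(\sigma R_{ma})\, \mathcal{R}(\sigma)\, \eta(\sigma)\, \tau_a(\sigma)\, \tau_f(\sigma)\; \frac{1}{\Delta\sigma}\int_{\sigma\left(1 + \frac{\Delta\Omega}{4\pi}\right) - \frac{\Delta\sigma}{2}}^{\sigma\left(1 + \frac{\Delta\Omega}{4\pi}\right) + \frac{\Delta\sigma}{2}} L(\sigma')\, d\sigma'.
\end{aligned} \tag{5.83a}$$

In the last step, we assume both that

$$\frac{\Delta\Omega}{4\pi} \ll 1$$

and that $H$, $M$, $\mathcal{R}$, $\eta$, $\tau_a$, and $\tau_f$ vary slowly enough for us to write

$$\begin{aligned}
H\!\left(u\sigma\left[1 + \tfrac{\Delta\Omega}{4\pi}\right]\right) &\cong H(u\sigma), &
M\!\left(\sigma\left[1 + \tfrac{\Delta\Omega}{4\pi}\right] R_{ma}\right) &\cong M(\sigma R_{ma}), \\
\mathcal{R}\!\left(\sigma\left[1 + \tfrac{\Delta\Omega}{4\pi}\right]\right) &\cong \mathcal{R}(\sigma), &
\eta\!\left(\sigma\left[1 + \tfrac{\Delta\Omega}{4\pi}\right]\right) &\cong \eta(\sigma), \\
\tau_a\!\left(\sigma\left[1 + \tfrac{\Delta\Omega}{4\pi}\right]\right) &\cong \tau_a(\sigma), &
\tau_f\!\left(\sigma\left[1 + \tfrac{\Delta\Omega}{4\pi}\right]\right) &\cong \tau_f(\sigma).
\end{aligned} \tag{5.83b}$$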
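It is also worth noting that the integral

$$\frac{1}{\Delta\sigma}\int_{\sigma\left(1 + \frac{\Delta\Omega}{4\pi}\right) - \frac{\Delta\sigma}{2}}^{\sigma\left(1 + \frac{\Delta\Omega}{4\pi}\right) + \frac{\Delta\sigma}{2}} L(\sigma')\, d\sigma'$$

must be an even function of $\sigma$ because $L(\sigma')$ is an even function of $\sigma'$. In Eqs. (5.82b) and (5.83a) everything but $H$ is real. Therefore, both in Eq. (5.82a), which applies to very small fields of view when $\cos\theta_\rho$ can be approximated as one, and in Eq. (5.83a), which applies to slightly larger fields of view when $\cos\theta_\rho$ cannot be approximated as one, we see that $\mathcal{Z}_{eff}(\sigma)$ has a nonzero imaginary component only because $H(u\sigma)$ has a nonzero imaginary component.
We can underline the fundamental similarity of Eqs. (5.82a) and (5.83a) by combining them into a single formula. Substitution of (5.82d) into (5.82a) gives

$$\mathcal{Z}_{eff}(\sigma) = \frac{W A\,\Delta\Omega}{4}\, H(u\sigma)\, M(\sigma R_{ma})\, \mathcal{R}(\sigma)\, \eta(\sigma)\, \tau_f(\sigma)\, \tau_a(\sigma)\, L(\sigma), \tag{5.83c}$$

and this last result can be combined with (5.83a) by writing

$$\mathcal{Z}_{eff}(\sigma) = \frac{W A\,\Delta\Omega}{4}\, H(u\sigma)\, M(\sigma R_{ma})\, \mathcal{R}(\sigma)\, \eta(\sigma)\, \tau_f(\sigma)\, \tau_a(\sigma)\, L_{FOV}(|\sigma|), \tag{5.83d}$$

where we define that

$$L_{FOV}(|\sigma|) = \begin{cases}
L(|\sigma|) & \text{for small } \Delta\Omega \text{ where } \cos\theta_\rho \text{ can be approximated as one,} \\[2ex]
\dfrac{1}{\Delta\sigma}\displaystyle\int_{\sigma\left(1 + \frac{\Delta\Omega}{4\pi}\right) - \frac{\Delta\sigma}{2}}^{\sigma\left(1 + \frac{\Delta\Omega}{4\pi}\right) + \frac{\Delta\sigma}{2}} L(\sigma')\, d\sigma' & \text{for slightly larger } \Delta\Omega \text{ where } \cos\theta_\rho \text{ cannot be approximated as one.}
\end{cases} \tag{5.83e}$$

Absolute value signs are put around the argument of $L_{FOV}$ in (5.83d) and (5.83e) in part to remind us, as pointed out in the discussion following (5.83b), that the integral

$$\frac{1}{\Delta\sigma}\int_{\sigma\left(1 + \frac{\Delta\Omega}{4\pi}\right) - \frac{\Delta\sigma}{2}}^{\sigma\left(1 + \frac{\Delta\Omega}{4\pi}\right) + \frac{\Delta\sigma}{2}} L(\sigma')\, d\sigma'$$

must be an even function of $\sigma$. Figures 5.30(a) and 5.30(b) show how the original spectrum $L(\sigma)$ is shifted and blurred by an interferometer's finite field of view. In Fig. 5.30(b) the compression of the wavenumber axis can be removed by stretching the axis so that spectral edge E is returned to its proper position, but nothing can recover the detail lost in the spectral blurring.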
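The next step in setting up a single-sided interferogram measurement is to write Eq. (5.83d) as

$$\mathcal{Z}_{eff}(\sigma) = Z_{eff}(\sigma)\, e^{i\phi(\sigma)} \tag{5.84a}$$

for real functions $Z_{eff}(\sigma)$ and $\phi(\sigma)$. Here $Z_{eff}$ is the magnitude of $\mathcal{Z}_{eff}$,

$$Z_{eff}(\sigma) = |\mathcal{Z}_{eff}(\sigma)|, \tag{5.84b}$$

and $\phi$ is the argument of $\mathcal{Z}_{eff}$,

$$\phi(\sigma) = \arg[\mathcal{Z}_{eff}(\sigma)]. \tag{5.84c}$$

Applying these definitions to the right-hand side of (5.83d) gives

$$Z_{eff}(\sigma) = \frac{W A\,\Delta\Omega}{4}\, |H(u\sigma)|\, M(\sigma R_{ma})\, \mathcal{R}(\sigma)\, \eta(\sigma)\, \tau_f(\sigma)\, \tau_a(\sigma)\, L_{FOV}(|\sigma|) \tag{5.85a}$$

and

$$\phi(\sigma) = \arg[H(u\sigma)]. \tag{5.85b}$$
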
Equation (5.85b) shows that

$$\phi(\sigma) = \text{slowly varying function of } \sigma \tag{5.85c}$$

because $H(u\sigma)$ is a slowly varying function of the wavenumber. Substituting (5.84a) into (5.81b) gives, since both $Z_{eff}$ and $\phi$ are real, that

$$Z_{eff}(-\sigma)\, e^{i\phi(-\sigma)} = Z_{eff}(\sigma)\, e^{-i\phi(\sigma)}.$$

Taking the magnitude of both sides, we get

$$Z_{eff}(-\sigma) = Z_{eff}(\sigma), \tag{5.86a}$$

making $Z_{eff}$ an even function of $\sigma$. Now we can write

$$Z_{eff}(\sigma)\, e^{i\phi(-\sigma)} = Z_{eff}(\sigma)\, e^{-i\phi(\sigma)} \quad\text{or}\quad e^{i\phi(-\sigma)} = e^{-i\phi(\sigma)}.$$

Taking the complex logarithm of both sides shows that $\phi$ must be an odd function of $\sigma$,

$$\phi(-\sigma) = -\phi(\sigma). \tag{5.86b}$$

Equation (5.85b) suggests that we automatically know $\phi(\sigma)$ because, having designed and built the detector circuit, we know its transfer function $H$. In practice, however, it is often difficult to know $H$ with sufficient accuracy to get good measurements of $L_{FOV}$. It turns out that all we need to make single-sided interferograms practical is to know that $\phi$ is a slowly varying function of the wavenumber, because then it is easy to measure $\phi$ as a function of $\sigma$. The key point to take away from Eq. (5.85b), then, is that if the transfer function is designed to be a slowly varying function of wavenumber, then we have good reason to expect $\phi$ to be a slowly varying function of wavenumber.^88

The customary procedure used to measure $\phi(\sigma)$ directly is to run the moving mirror in Fig. 5.29 between $\chi = -d$ and $\chi = 2D - d$, at first confining our attention to the $z(\chi)$ signal values between $\chi = -d$ and $\chi = +d$. These signal values give a perfectly good double-sided interferogram of the type described in Sec. 5.15 above, leading to a low-resolution estimate of the effective spectrum

$$\mathcal{Z}^{(low\ res)}_{eff}(\sigma) \cong \mathcal{Z}_{eff}(\sigma). \tag{5.87a}$$

According to Eq. (5.67), the spectral resolution of this $\mathcal{Z}^{(low\ res)}_{eff}(\sigma)$ measurement is

$$\Delta\sigma_{low\ res} = \frac{1}{2d}. \tag{5.87b}$$

^88 This point is more important than it looks. There are interferometer defects not discussed here that, like the transfer function, contribute slowly varying complex modulations to the effective spectrum. All we need for a good single-sided interferogram measurement is to know that the total complex modulation is slowly varying, and then we can use the procedure discussed in this section to remove all these complex modulations from the effective spectrum at the same time.

FIGURE 5.30(a). This small section of the radiance spectrum entering the interferometer is plotted here the way it actually is, undistorted by any measurement errors: $L(\sigma)$ versus $\sigma$, with the wavenumber axis marked at 1009 cm$^{-1}$ and 1010 cm$^{-1}$. The true position of Spectral Edge E is at wavenumber 1010 cm$^{-1}$.

FIGURE 5.30(b). The same small section of the radiance spectrum plotted in Fig. 5.30(a) is shown here as $L_{FOV}(\sigma)$ versus $\sigma$, with the rescaled wavenumber axis (marked at 1009 cm$^{-1}$ and 1010 cm$^{-1}$) and blurring due to the interferometer's finite field of view. Spectral Edge E is measured at a wavenumber slightly smaller than its true position.

This spectral resolution is not sufficient to measure $L(\sigma)$, the spectral radiance of the source, but we can easily make it good enough to capture all the spectral detail in the slowly varying function $\phi(\sigma)$. We choose $d$ twice as large as we would for a minimally accurate representation, making $\Delta\sigma_{low\ res}$ half the size of the spectral interval $\Delta\sigma_{detail}$ used to examine the detail in $\phi(\sigma)$. This makes

$$\Delta\sigma_{detail} = 2\,\Delta\sigma_{low\ res} = \frac{1}{d}. \tag{5.87c}$$

Because $\phi$ is a low-resolution function of wavenumber $\sigma$, Eqs. (5.84c) and (5.87a) show that

$$\phi(\sigma) \cong \arg\!\left[\mathcal{Z}^{(low\ res)}_{eff}(\sigma)\right]. \tag{5.88a}$$

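Now that $\phi$ is known, we can define a new function $\psi(\chi)$ such that $e^{-i\phi(\sigma)}$ is the Fourier transform of $\psi(\chi)$. According to Eq. (5.78), we are only interested in $\sigma$ values that are between $\sigma_{min}$, $\sigma_{max}$ and $(-\sigma_{min})$, $(-\sigma_{max})$. This means $\phi(\sigma)$ can be given any values we please outside these two ranges, and for that matter so can any function of $\phi$ such as $e^{-i\phi(\sigma)}$. Keeping in mind, then, that the only $\sigma$ values that matter satisfy

$$\sigma_{min} \le |\sigma| \le \sigma_{max},$$

we set up the Fourier transform pair

$$V(\sigma)\, e^{-i\phi(\sigma)} = \int_{-\infty}^{\infty} \psi(\chi)\, e^{-2\pi i \sigma\chi}\, d\chi \tag{5.88b}$$

and

$$\psi(\chi) = \int_{-\infty}^{\infty} \left[ V(\sigma)\, e^{-i\phi(\sigma)} \right] e^{2\pi i \sigma\chi}\, d\sigma. \tag{5.88c}$$
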
In these two formulas, $V(\sigma)$ is a real-valued tapering function chosen so that $V(\sigma) \rightarrow 0$ slowly as $|\sigma| \rightarrow \infty$ with

$$V(\sigma) = 1 \quad\text{for}\quad \sigma_{min} \le |\sigma| \le \sigma_{max}. \tag{5.88d}$$

For future use (and to keep things neat), we require $V(\sigma)$ to be non-negative and even,

$$V(\sigma) \ge 0 \quad\text{and}\quad V(-\sigma) = V(\sigma). \tag{5.88e}$$

Since $\phi(-\sigma) = -\phi(\sigma)$ in Eq. (5.86b) and $V(\sigma)$ is real and even in (5.88e), we note that $V(\sigma)\, e^{-i\phi(\sigma)}$ is Hermitian,

$$V(-\sigma)\, e^{-i\phi(-\sigma)} = V(\sigma)\, e^{i\phi(\sigma)} = \left[ V(\sigma)\, e^{-i\phi(\sigma)} \right]^{*}. \tag{5.88f}$$

Consequently, according to entry 7 of Table 2.1 in Chapter 2, its Fourier transform must be real:

$$\mathrm{Im}\left(\psi(\chi)\right) = 0. \tag{5.88g}$$

Because $\phi$ and $V$ are slowly varying functions of $\sigma$, their product $V(\sigma)\, e^{-i\phi(\sigma)}$ is also a slowly varying function of $\sigma$. According to the discussion following Eq. (2.37e) in Chapter 2, it follows that $\psi(\chi)$, the inverse Fourier transform of $V(\sigma)\, e^{-i\phi(\sigma)}$ in (5.88c), must be a relatively narrow function of $\chi$. By the end of that discussion, we realize that if $\Delta\sigma_{detail}$ is the change in $\sigma$ required to produce a significant change in $V(\sigma)\, e^{-i\phi(\sigma)}$, then the inverse Fourier transform $\psi(\chi)$ must be negligible at all values of $\chi$ with $|\chi| > \Delta\sigma_{detail}^{-1}$. From Eq. (5.87c), we know

$$\Delta\sigma_{detail} = \frac{1}{d},$$

which means that

$$\psi(\chi) \text{ is negligible when } |\chi| > d. \tag{5.88h}$$

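Having analyzed $\psi(\chi)$, we now turn our attention to the entire interferogram signal recorded between $\chi = -d$ and $\chi = 2D - d$. When the interferogram signal in Eq. (5.81a) is convolved with $\psi(\chi)$, the result is

$$z_{conv}(\chi) = \psi(\chi) * z(\chi). \tag{5.89a}$$

From the definition of convolution in Chapter 2 [see Eq. (2.38a)], we understand that both $\psi(\chi)$ and $z(\chi)$ must be known for all $\chi$ between $-\infty$ and $+\infty$ to calculate their convolution,

$$z_{conv}(\chi) = \int_{-\infty}^{\infty} \psi(\chi')\, z(\chi - \chi')\, d\chi'. \tag{5.89b}$$

We have just seen, however, that $\psi(\chi)$ is a narrow function of $\chi$, so from (5.88h) we get

$$z_{conv}(\chi) \cong \int_{-d}^{d} \psi(\chi')\, z(\chi - \chi')\, d\chi'. \tag{5.89c}$$

Consequently, there is now no real difficulty in calculating $z_{conv}(\chi)$ between $\chi = 0$ and $\chi = 2D - 2d$ from our limited knowledge of $\psi(\chi)$ and $z(\chi)$. [Note that the formula for the integral in (5.89c) does not let us calculate $z_{conv}$ all the way out to $\chi = 2D - d$.] Suppose there were some way to know $z_{conv}(\chi)$ for negative as well as positive values of its argument. The Fourier transform of $z_{conv}$ would then be, using the Fourier convolution theorem [see Eq. (2.39b) of Chapter 2] and formula (5.88b),

$$\begin{aligned}
\int_{-\infty}^{\infty} z_{conv}(\chi)\, e^{-2\pi i \sigma\chi}\, d\chi &= \int_{-\infty}^{\infty} \left[ \psi(\chi) * z(\chi) \right] e^{-2\pi i \sigma\chi}\, d\chi \\
&= V(\sigma)\, e^{-i\phi(\sigma)} \int_{-\infty}^{\infty} z(\chi)\, e^{-2\pi i \sigma\chi}\, d\chi.
\end{aligned} \tag{5.90a}$$

Reversing the Fourier transform in (5.81a) gives

$$\mathcal{Z}_{eff}(\sigma) = \int_{-\infty}^{\infty} z(\chi)\, e^{-2\pi i \sigma\chi}\, d\chi, \tag{5.90b}$$

which, when substituted into (5.90a), would lead to

$$\int_{-\infty}^{\infty} z_{conv}(\chi)\, e^{-2\pi i \sigma\chi}\, d\chi = V(\sigma)\, e^{-i\phi(\sigma)}\, \mathcal{Z}_{eff}(\sigma).$$

In well-designed interferometers, the $\mathcal{R}(\sigma)\,[\tau_f(\sigma)\,\tau_a(\sigma)]$ product in Eq. (5.83d) goes to zero when $|\sigma| > \sigma_{max}$ or $|\sigma| < \sigma_{min}$. Consequently, $\mathcal{Z}_{eff}$ is zero for those $\sigma$ values where $V(\sigma)$ is, according to (5.88d), not necessarily equal to one. Hence, this latest result can be written as

$$\int_{-\infty}^{\infty} z_{conv}(\chi)\, e^{-2\pi i \sigma\chi}\, d\chi \cong e^{-i\phi(\sigma)}\, \mathcal{Z}_{eff}(\sigma). \tag{5.90c}$$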

Consulting Eqs. (5.84a) and (5.84b), we see that (5.90c) could also be written as

$$\int_{-\infty}^{\infty} z_{conv}(\chi)\, e^{-2\pi i \sigma\chi}\, d\chi \cong Z_{eff}(\sigma). \tag{5.90d}$$

We already know that $Z_{eff}(\sigma)$ is real, and from (5.86a) we see that $Z_{eff}(\sigma)$ is even. Reversing the Fourier transform in (5.90d) now gives

$$z_{conv}(\chi) \cong \int_{-\infty}^{\infty} Z_{eff}(\sigma)\, e^{2\pi i \sigma\chi}\, d\sigma. \tag{5.91a}$$

According to entry 1 of Table 2.1 in Chapter 2, the inverse Fourier transform of a real and even function is another real and even function, which means that $z_{conv}$ is an even function of $\chi$,

$$z_{conv}(-\chi) = z_{conv}(\chi). \tag{5.91b}$$

This is the result we need. In the discussion following Eq. (5.89c), we supposed that $z_{conv}$ was known for negative as well as positive values of its argument because we wanted to take its Fourier transform. It now turns out, however, that when $z_{conv}$ is known between $\chi = 0$ and $\chi = 2D - 2d$, it is also known between $\chi = 0$ and $\chi = -(2D - 2d)$ because it must be an even function. This means that measuring $z(\chi)$ between $\chi = -d$ and $\chi = 2D - d$, as shown in Fig. 5.29, gives enough information to calculate $z_{conv}(\chi)$ for

$$-2(D - d) \le \chi \le 2(D - d). \tag{5.91c}$$

Applying the double-sided approximation for the Fourier transform discussed in Sec. 5.15 to formula (5.90d), we can now treat $z_{conv}(\chi)$ as a double-sided interferogram signal to get

$$Z_{eff}(\sigma) \cong \int_{-2(D - d)}^{2(D - d)} z_{conv}(\chi)\, e^{-2\pi i \sigma\chi}\, d\chi. \tag{5.91d}$$
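The symmetrizing effect of the convolution can be illustrated with a discrete toy model. The sketch below is not the author's procedure: it assumes a synthetic even magnitude spectrum, a hypothetical odd phase, sets the tapering function to one everywhere, and uses circular FFT convolution on a full two-sided grid in place of the truncated integral (5.89c). It only demonstrates that convolving with the inverse transform of $e^{-i\phi}$ turns an asymmetric real interferogram into an even one, and that the correction kernel is narrow when the phase varies slowly.

```python
import numpy as np

n = 1024
sigma = np.fft.fftfreq(n)                      # discrete wavenumber grid (cycles per sample)

# Hypothetical ingredients: an even, real magnitude spectrum and an odd, slowly varying phase.
magnitude = np.exp(-((np.abs(sigma) - 0.2) / 0.03) ** 2)
phi = 0.8 * np.sin(2.0 * np.pi * sigma)

Z_eff = magnitude * np.exp(1j * phi)           # Hermitian spectrum, as in (5.84a)
z = np.fft.ifft(Z_eff)                         # interferogram: real but not even

psi = np.fft.ifft(np.exp(-1j * phi))           # correction kernel (tapering function set to 1)
z_conv = np.fft.ifft(np.fft.fft(z) * np.fft.fft(psi))   # circular convolution of z with psi

def is_even(signal):
    # On a periodic grid, evenness means signal[j] == signal[-j].
    return np.allclose(signal[1:], signal[1:][::-1])

print("z imaginary part:", np.abs(z.imag).max())
print("z even?", is_even(z.real), " z_conv even?", is_even(z_conv.real))
print("kernel size beyond 10 samples:", np.abs(psi[10:-10]).max())
```

In this toy model the kernel is negligible beyond a few samples, which mirrors statement (5.88h): a slowly varying phase needs only a short stretch of interferogram on the negative side of ZPD.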

Equation (5.91d) justifies the use of single-sided interferograms. In a conventional double-sided interferogram, we measure the signal $z(\chi)$ leaving the detector circuit between $\chi = -D$ and $\chi = +D$ and use Eq. (5.63a) to write the approximation

$$\mathcal{Z}_{eff}(\sigma) \cong \int_{-D}^{D} z(\chi)\, e^{-2\pi i \sigma\chi}\, d\chi \tag{5.92a}$$

for the effective spectrum $\mathcal{Z}_{eff}(\sigma)$. According to Eq. (5.83d), the correct formula for the effective spectrum is

$$\mathcal{Z}_{eff}(\sigma) = \frac{W A\,\Delta\Omega}{4}\, H(u\sigma)\, M(\sigma R_{ma})\, \mathcal{R}(\sigma)\, \eta(\sigma)\, \tau_f(\sigma)\, \tau_a(\sigma)\, L_{FOV}(|\sigma|). \tag{5.92b}$$

Now compare this to the single-sided situation. Following the procedure outlined above, we measure signal $z(\chi)$ between $\chi = -d$ and $\chi = 2D - d$. This data lets us calculate $z_{conv}(\chi)$ between $\chi = 0$ and $\chi = 2(D - d)$. Because $z_{conv}(\chi)$ is even, we end up knowing its values between $\chi = -2(D - d)$ and $\chi = +2(D - d)$, allowing us to make the new approximation

$$Z_{eff}(\sigma) \cong \int_{-2(D - d)}^{2(D - d)} z_{conv}(\chi)\, e^{-2\pi i \sigma\chi}\, d\chi, \tag{5.92c}$$

where, according to Eq. (5.85a),

$$Z_{eff}(\sigma) = \frac{W A\,\Delta\Omega}{4}\, |H(u\sigma)|\, M(\sigma R_{ma})\, \mathcal{R}(\sigma)\, \eta(\sigma)\, \tau_f(\sigma)\, \tau_a(\sigma)\, L_{FOV}(|\sigma|). \tag{5.92d}$$

The only difference between the two spectral formulas in Eqs. (5.92b) and (5.92d) is that in (5.92b) spectrum $\mathcal{Z}_{eff}$ is proportional to the full complex transfer function $H$ while in (5.92d) spectrum $Z_{eff}$ is proportional to the magnitude of $H$. Although the detector circuit's transfer function $H$ must have a nonzero imaginary part [see discussion following Eq. (5A.6b) in Appendix 5A], we shall see in the following section that the calibration formula used for complex $H$ also works when the original transfer function $H$ is replaced by $|H|$. The alternate method of removing an interferometer's background radiance discussed in Sec. 5.14 also works as desired when $H$ is replaced by $|H|$. Consequently, we can think of the magnitude of $H$ in (5.92d) as just another type of transfer function and treat $Z_{eff}$ like any other effective spectrum when it becomes time to calibrate the interferometer and eliminate unwanted background radiation from our measurements. Since the integral in Eq. (5.92c) goes between $-2(D - d)$ and $+2(D - d)$ rather than between $-D$ and $+D$, the discussion in Sec. 5.15 shows that we must end up with a more highly resolved spectrum. According to Eq. (5.67), a double-sided interferogram system can measure spectral details separated by a wavenumber interval as small as

$$\Delta\sigma_{double\ sided} = \frac{1}{2D} \tag{5.93a}$$

using formulas (5.92a) and (5.92b). Therefore, when integrating between $-2(D - d)$ and $+2(D - d)$ in Eq. (5.92c), we know that a single-sided interferogram system can measure spectral details separated by a wavenumber interval as small as

$$\Delta\sigma_{single\ sided} = \frac{1}{2(2D - 2d)} < \Delta\sigma_{double\ sided}, \tag{5.93b}$$

which gives us much more resolving power than the equivalent double-sided system,

<< . (5.93c)
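For a concrete feel for the numbers, the following sketch evaluates the resolution formulas (5.87b), (5.93a), and (5.93b) for a hypothetical mirror travel; the function names and sample values of $D$ and $d$ are illustrative assumptions only.

```python
def double_sided_resolution(D):
    """Eq. (5.93a): smallest resolvable wavenumber interval, 1/(2D)."""
    return 1.0 / (2.0 * D)

def single_sided_resolution(D, d):
    """Eq. (5.93b): 1/(2(2D - 2d)) after symmetrizing the single-sided interferogram."""
    return 1.0 / (2.0 * (2.0 * D - 2.0 * d))

def phase_estimate_resolution(d):
    """Eq. (5.87b): resolution 1/(2d) of the short double-sided stretch used to measure the phase."""
    return 1.0 / (2.0 * d)

D, d = 1.0, 0.1   # centimeters, illustrative only
ds = double_sided_resolution(D)
ss = single_sided_resolution(D, d)
print(ds, ss, ss / ds, phase_estimate_resolution(d))
```

With these sample values the single-sided interval is a little more than half the double-sided one, which is the "almost a doubling" of resolving power described earlier in this section.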

From what has been said so far, it seems that all spectral measurements ought to be made using single-sided rather than double-sided interferograms. In practice, however, we often want to compare one side of a double-sided interferogram signal to the other to check that no blunders have been made in taking the measurement, and we clearly give up this possibility when using single-sided interferograms. In addition, the expected noise amplitude of single-sided measurements is, as a general rule, larger by $\sqrt{2}$ than the expected noise amplitude of equivalent, equal-resolution double-sided measurements [see the discussion following Eq. (6.76e) in Chapter 6 below]. Finally, to justify our single-sided procedure, we are forced to assume that the phase term $e^{i\phi(\sigma)}$ is a slowly varying function of wavenumber and then choose parameter $d$ large enough to capture all the relevant spectral detail in $e^{i\phi(\sigma)}$. The only way to confirm that this is true is to make a high-resolution, double-sided spectral measurement, verify that $e^{i\phi(\sigma)}$ behaves as expected, and adjust the value of $d$ accordingly. In this sense, then, a good single-sided measurement depends on our having at some point performed a high-resolution, double-sided measurement with the same instrument. Nevertheless, having the flexibility to perform single-sided measurements can be a very attractive way to increase an interferometer's resolving power when a standard double-sided measurement turns up unexpected but poorly resolved spectral detail, and for this reason many interferometer designs include it as one of their options.
5.19 Calibration
The uncalibrated spectrum of a standard Michelson interferometer can be treated the same way as
the output spectrum of any other type of uncalibrated spectrometer. Consider, for example,
Eq. (5.60b) for the total interferogram signal $z_{tot}$ when the interferometer's field of view is
small enough that the cosine of the off-axis angle can be approximated as one,

( ) ( )
( ) (back) 2
ma
( ) ( ) ( ) H( ) M
4
fore i
tot
W
z S S S u R e d

= +

. (5.94a)

In this section, we can regard function M as a constant and steady misalignment, unchanging
during calibration and subsequent spectral measurements, or we can think of the instrument as
being so well-aligned that $M \cong 1$. Assuming that $z_{tot}$ in (5.94a) is analyzed using a
double-sided interferogram with D large enough that there is no significant ringing or loss of
spectral detail from the sinc convolution in Eq. (5.66b), we can treat the Fourier transform of
$z_{tot}$, which we call $\mathcal{Z}_{eff,tot}(\sigma)$, as the uncalibrated spectrum of the
Michelson interferometer. Reversing the Fourier transform in (5.94a) then gives

( ) ( )
2
,
( ) (back)
ma
( ) z ( )
( ) ( ) H( ) M
4
i
eff tot tot
fore
e d
W
S S S u R

=
= +

Z
-
.
(5.94b)

Equation (5.40g), which specifies that

R ( ) ( ) ( ) ( ) ( ) ( )
f a
S A = L , (5.94c)

can now be used to write (5.94b) as

( )
,
( ) (back)
ma
R
( )
( ) ( ) ( ) ( ) ( ) ( ) ( ) H( ) M
4
eff tot
fore
f a
W
A S S u R
= +

Z
L

(5.94d)

for the ideal case where the cosine of the off-axis angle can be approximated by one and D is large
enough that there is no significant loss of detail from the sinc convolution described in Sec. 5.15 above.
What can be done with the more realistic case where there is significant loss of detail from the
sinc convolution and the cosine of the off-axis angle can no longer be approximated as one because
the field of view is relatively large? Glancing back at the analysis used in Sec. 5.18 to go from
Eq. (5.82b) to (5.83d), and in particular paying close attention to the approximations listed in
(5.83b), we note that in a well-designed interferometer $\mathcal{R}$, $\eta$, $\tau_a$, $\tau_f$,
$H$, and $M$ all vary slowly with $\sigma$ compared to $\mathcal{L}(\sigma)$. In fact, compared to
$\mathcal{L}(\sigma)$ they can be regarded as quasi-constants, especially over the range of wavenumbers

\[ \sigma_{min} \le |\sigma| \le \sigma_{max} \]

over which $\mathcal{L}$ is being measured. In Eq. (5.83d) we account for the effect of a small but
finite field of view blurring and distorting the measurement of $\mathcal{L}$ by replacing
$\mathcal{L}(\sigma)$ with $\mathcal{L}_{FOV}(\sigma)$. This is very similar to the situation examined
in Sec. 5.15 above, where we represented the distorting effect of the sinc convolution on the
measured spectrum by replacing $\mathcal{L}(\sigma)$ with $\mathcal{L}_{blur}(\sigma)$. To combine the
blurring and distorting effects of both the sinc convolution and the finite field of view, we
replace $\mathcal{L}(\sigma)$ by $\mathcal{L}_{eff}(|\sigma|)$ in Eq. (5.94c) to get

R ( ) ( ) ( ) ( ) ( ) ( )
f a eff
S A = L , (5.94e)

where we have added absolute value signs to the argument of $\mathcal{L}_{eff}$ to keep $S(\sigma)$
well-defined for both positive and negative $\sigma$ values and to show that it is still an even
function, having the same value at $\sigma$ and $-\sigma$. Applying this to Eq. (5.94d), we say that

( )
,
( ) ( )
ma
R
( )
( ) ( ) ( ) ( ) ( ) ( ) ( ) H( ) M
4
eff tot
fore fore
eff f a eff eff
W
A S S u R
= +

Z
L

(5.94f)

with $\mathcal{L}$, $S^{(fore)}$, and $S^{(back)}$ replaced by $\mathcal{L}_{eff}$, $S^{(fore)}_{eff}$,
and $S^{(back)}_{eff}$ respectively to show that the finite field of view and sinc convolution have
somewhat blurred and distorted the original functions. We can, in fact, regard
$\mathcal{L}_{eff}(|\sigma|)$ as the best measurement of $\mathcal{L}(|\sigma|)$ that the
interferometer system can be expected to produce. Hence, for relatively small fields of view in
situations where the sinc convolution introduces only a negligible distortion,

\[ \mathcal{L}_{eff}(|\sigma|) \cong \mathcal{L}(|\sigma|) , \]

and for situations where the finite field of view and sinc convolution must be taken into account,
$\mathcal{L}_{eff}(|\sigma|)$ is what $\mathcal{L}(|\sigma|)$ is measured as when subjected to these
two unavoidable effects.
To calibrate any type of spectrometer having a linear response to the input spectrum, we need
to observe at least two known spectral radiances $\mathcal{L}^{(1)}(|\sigma|)$ and
$\mathcal{L}^{(2)}(|\sigma|)$, where again we use absolute value signs to make the radiances
well-defined for negative as well as positive $\sigma$ values. For an interferometer the
$\mathcal{L}^{(1)}$ and $\mathcal{L}^{(2)}$ radiances should be distinct and slowly varying functions
of wavenumber so that they undergo only negligible distortion from the sinc convolution and
finite field of view; a black-body target at two widely separated temperatures does nicely. We
suppose that $\mathcal{Z}^{(1)}_{eff,tot}(\sigma)$ and $\mathcal{Z}^{(2)}_{eff,tot}(\sigma)$ are the
uncalibrated spectra measured when the interferometer is observing the known spectral radiances
$\mathcal{L}^{(1)}(|\sigma|)$ and $\mathcal{L}^{(2)}(|\sigma|)$ respectively. We then observe a source
of unknown spectral radiance $\mathcal{L}(|\sigma|)$ and calculate
$\mathcal{Z}^{(meas)}_{eff,tot}(\sigma)$, the uncalibrated spectrum associated with the $z_{tot}$
signal generated by $\mathcal{L}(|\sigma|)$. For a standard Michelson interferometer, we note that
the traditional linear calibration algorithm gives, consulting Eq. (5.94f) to get the appropriate
formulas for $\mathcal{Z}^{(1)}_{eff,tot}$, $\mathcal{Z}^{(2)}_{eff,tot}$, and
$\mathcal{Z}^{(meas)}_{eff,tot}$,

( )
( )
( ) (1)
, , (2) (1) (1)
(2) (1)
, ,
(2) (1)
(1)
ma
ma
R
( ) ( )
( ) ( ) ( )
( ) ( )
( ) ( )
M H( ) ( ) ( ) ( ) ( ) ( ) ( )
4
M H( )
4
meas
eff tot eff tot
eff tot eff tot
eff f a
W
R u A
W
R u A
o o
o o o
o o
o o
o o o o o q o t o t o
o o
+

AO

A
Z Z
L L L
Z Z
L L
L L

(2) (1)
(1)
R ( ) ( ) ( ) ( ) ( ) ( )
( ) ( )
f a
eff
o o o q o t o t o
o o
O

+
L L
L L .
(5.95a)

This is the best estimate of the unknown spectral radiance that the interferometer can be expected
to produce, which shows that the standard linear calibration algorithm can work well when we
treat the effective total spectrum $\mathcal{Z}_{eff,tot}(\sigma)$ of the signal leaving the detector
circuit just like we would any other uncalibrated spectrometer signal that depended linearly on the
spectral radiance entering the instrument. Once the system has been calibrated, we can measure any
number of other spectra simply by pointing the instrument at the other radiances, recording new
(meas) quantities, and plugging these (meas) quantities into Equation (5.95a) while leaving all
other formula values the same.
Equation (5.94e) can be generalized as

\[ S(\sigma) = \mathcal{L}_{eff}(|\sigma|)\,[\text{Function of } \sigma] . \tag{5.95b} \]

Now in Eq. (5.94f) the effective signal spectrum can be written as, for both positive and negative
$\sigma$ values,

{ } ( )
( )
{ } { }
, ma
( ) (back)
ma
( ) [ ( ) Function of ]H( ) M
4
( ) ( ) H( ) M
4
( ) Complex Function of Background Complex Function of
eff tot eff
fore
eff eff
eff
W
u R
W
S S u R

=
+

= +
Z L
L

.
(5.95c)

As long as the effective spectrum of the total signal can be written as a product of the spectral
radiance and a complex function of wavenumber that, due to the background radiance, must be
added to another complex function of the wavenumber, the standard linear calibration algorithm
given in (5.95a) successfully extracts the desired spectral measurement
$\mathcal{L}_{eff}(|\sigma|)$. This procedure is sometimes called the Revercomb calibration
algorithm.$^{89}$

$^{89}$ H. E. Revercomb et al., "Radiometric Calibration of IR Fourier Transform Spectrometers:
Solution to a Problem with the High-Resolution Interferometer Sounder," Applied Optics, 27,
no. 15 (1 August 1988), pp. 3210-3218.
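The two-point linear calibration described in this section can be sketched numerically. The Python example below is only an illustration under stated assumptions: the instrument model (uncalibrated spectrum = complex gain times radiance plus complex background), the function name `linear_calibration`, and every numerical value are hypothetical and do not come from the text. Taking the real part of the ratio is also a choice made here to discard any residual imaginary component.

```python
import numpy as np

def linear_calibration(z_meas, z1, z2, l1, l2):
    """Two-point linear calibration of an uncalibrated complex spectrum.

    z_meas, z1, z2 : complex arrays, uncalibrated spectra of the unknown
                     source and of the two reference sources.
    l1, l2         : real arrays, known reference radiances.
    """
    ratio = (z_meas - z1) / (z2 - z1)
    # For noise-free data the ratio is real; keeping only the real part is
    # an assumption of this sketch.
    return l1 + ratio.real * (l2 - l1)

# Hypothetical instrument: Z = gain * L + background (both complex).
sigma = np.linspace(600.0, 1200.0, 7)                 # wavenumbers, cm^-1
gain = (2.0 + 0.01 * sigma) * np.exp(1j * 0.3)        # complex responsivity
background = 5.0 * np.exp(-1j * 1.1) * np.ones_like(sigma)

l1 = 1.0 + 0.001 * sigma         # first known radiance
l2 = 4.0 + 0.002 * sigma         # second known radiance
l_true = 2.5 + 0.0015 * sigma    # radiance treated as unknown

z1 = gain * l1 + background
z2 = gain * l2 + background
z_meas = gain * l_true + background

l_est = linear_calibration(z_meas, z1, z2, l1, l2)
```

Because both the complex gain and the complex background cancel in the ratio, the sketch recovers `l_true` without ever knowing either quantity, which is the property the text attributes to the linear algorithm.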
5.20 Nonflat Optical Surfaces
The easiest way to handle nonflat optical surfaces is to treat the interferometer as a collection of
secondary interferometers operating side by side as shown in Fig. 5.31. The main interferometer
beam is split up into a grid of parallel secondary beams with, as shown in Fig. 5.32, each
secondary beam hitting only a small area of the lens that focuses it onto the detector. Each point
on the detector corresponds to a beam direction hitting the lens. All rays of the secondary beams
that, like the solid lines in Fig. 5.32, are parallel to the optical axis, end up focused at the
detector's center; and all the rays of the secondary beams that, like the dashed lines in Fig. 5.32,
are traveling in the same off-axis direction, end up focused at the same off-center detector point.
This means that all the small secondary interferometers have the same field of view as the
original large-scale interferometer because each point on the detector corresponds to a different
angle in the field of view.
We label each secondary interferometer with the x, y coordinates of its secondary beam inside
the cross section of the main beam, as shown in Fig. 5.33, and define the distance $\delta(x,y)$ to
be the offset of the x, y secondary interferometer's optical-path difference from the average
optical-path difference $\chi$ of the entire collection of secondary beams. When $\delta = 0$ for
all the secondary interferometers, we return to the ideal case of a standard interferometer built
using perfectly flat optical surfaces, with the total signal $z(\chi)$ leaving the detector circuit
becoming

\[ z(\chi) = \int_{-\infty}^{\infty} \mathcal{Z}_{eff}(\sigma)\, e^{2\pi i \sigma \chi}\, d\sigma , \tag{5.96a} \]
FIGURE 5.31. [Figure not reproduced. Labels: Moving Mirror Surface, Ideal Beam Splitter, Fixed Mirror Surface, Detector.]
FIGURE 5.32. [Figure not reproduced. Labels: Moving Mirror Surface, Fixed Mirror Surface, Ideal Beam Splitter, Lens, Circular Detector in the Focal Plane of the Lens.]
FIGURE 5.33. [Figure not reproduced. Labels: entering the interferometer, heading to the detector, Ideal Beam Splitter, Moving Mirror, Grid of Secondary Interferometers on the Fixed Mirror, x axis, y axis.]
using the effective spectrum $\mathcal{Z}_{eff}(\sigma)$ explained in Sec. 5.11 above.$^{90}$ If the
total cross-sectional area of the interferometer beam is A, then for $\delta(x,y) \neq 0$ the beam
coming from the x, y secondary interferometer can be thought of as producing a signal

\[ z^{secondary}_{x,y}(\chi) = \frac{dx\,dy}{A} \int_{-\infty}^{\infty} \mathcal{Z}_{eff}(\sigma)\, e^{2\pi i \sigma[\chi + \delta(x,y)]}\, d\sigma . \tag{5.96b} \]

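The total signal coming from the interferometer can now be written as

\[
\begin{aligned}
z(\chi) &= \frac{1}{A} \iint_{\substack{\text{cross section}\\ \text{of main beam}}} dx\,dy \int_{-\infty}^{\infty} d\sigma\, \mathcal{Z}_{eff}(\sigma)\, e^{2\pi i \sigma[\chi + \delta(x,y)]} \\
&= \int_{-\infty}^{\infty} \mathcal{Z}_{eff}(\sigma)\, e^{2\pi i \sigma\chi} \left[ \frac{1}{A} \iint_{\substack{\text{cross section}\\ \text{of main beam}}} dx\,dy\; e^{2\pi i \sigma\delta(x,y)} \right] d\sigma .
\end{aligned}
\tag{5.96c}
\]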

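Because $\chi$ is the average optical-path difference of all the secondary interferometers, and
$\delta(x,y)$ is the difference from this average at any x, y point, we can write

\[ \chi = \left[ \text{Average OPD over beam cross section} \right] = \frac{1}{A} \iint_{\substack{\text{cross section}\\ \text{of main beam}}} dx\,dy\, [\chi + \delta(x,y)] , \]

which simplifies to

\[ \iint_{\substack{\text{cross section}\\ \text{of main beam}}} dx\,dy\; \delta(x,y) = 0 . \tag{5.96d} \]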

The interferometer has no hope of working unless $\delta$ is always small. We use that

\[ e^{x} \cong 1 + x + \frac{1}{2} x^2 \]

for small x and write that

\[ e^{2\pi i \sigma\delta} \cong 1 + 2\pi i \sigma\delta - 2\pi^2 \sigma^2 \delta^2 . \tag{5.97a} \]

$^{90}$ Any of the previously discussed formulas for the effective spectrum can be substituted into
the formulas used in this section as long as the mirror-tilt term M is taken to be identically equal
to one. We explain the reason for this rule at the beginning of Sec. 5.21 below.
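Substitution of (5.97a) into (5.96c) gives

\[
\begin{aligned}
z(\chi) &\cong \int_{-\infty}^{\infty} \mathcal{Z}_{eff}(\sigma)\, e^{2\pi i \sigma\chi} \left\{ \frac{1}{A} \iint_{\substack{\text{cross section}\\ \text{of main beam}}} dx\,dy\, \left[ 1 + 2\pi i \sigma\delta(x,y) - 2\pi^2\sigma^2 \delta(x,y)^2 \right] \right\} d\sigma \\
&= \int_{-\infty}^{\infty} \mathcal{Z}_{eff}(\sigma)\, e^{2\pi i \sigma\chi} \left[ 1 + \frac{2\pi i \sigma}{A} \iint_{\substack{\text{cross section}\\ \text{of main beam}}} dx\,dy\; \delta(x,y) - \frac{2\pi^2\sigma^2}{A} \iint_{\substack{\text{cross section}\\ \text{of main beam}}} dx\,dy\; \delta(x,y)^2 \right] d\sigma .
\end{aligned}
\]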

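Equation (5.96d) shows that the imaginary term inside the square brackets [ ] disappears, leading
to

\[ z(\chi) \cong \int_{-\infty}^{\infty} \mathcal{Z}_{eff}(\sigma)\, e^{2\pi i \sigma\chi} \left[ 1 - 2\pi^2\sigma^2\, \overline{\delta^2} \right] d\sigma , \tag{5.97b} \]

where $\overline{\delta^2}$, the average value of $\delta^2$, is defined to be

\[ \overline{\delta^2} = \frac{1}{A} \iint_{\substack{\text{cross section}\\ \text{of main beam}}} dx\,dy\, [\delta(x,y)]^2 . \tag{5.97c} \]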

We want $[1 - 2\pi^2\sigma^2\,\overline{\delta^2}]$ to be approximately one for all the wavenumbers
measured by the interferometer, so if we plan to measure spectra over the wavenumber range defined by

\[ 0 < \sigma_{min} \le \sigma \le \sigma_{max} , \tag{5.98a} \]

we must use surfaces whose average squared deviation from flatness $\overline{\delta^2}$ satisfies

\[ \overline{\delta^2} \ll \frac{1}{2\pi^2\sigma^2} \tag{5.98b} \]

for all the wavenumbers between $\sigma_{min}$ and $\sigma_{max}$. If (5.98b) is satisfied at
$\sigma = \sigma_{max}$, it is satisfied for all the wavenumbers in (5.98a). Hence, after defining
the root-mean-square deviation from flatness to be $\delta_{RMS} = \sqrt{\overline{\delta^2}}$, the
inequality in (5.98b) reduces to

\[ \delta_{RMS} \ll \frac{\lambda_{min}}{\sqrt{2}\,\pi} , \tag{5.98c} \]

where the formula $\lambda = 1/\sigma$ is used to write the inequality in terms of the minimum
measured wavelength instead of the maximum measured wavenumber.
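The flatness criterion can be checked numerically. The following Python sketch is an illustration only: the random zero-mean surface-error map, the function names `flatness_factor` and `rms_limit`, and all numerical values are assumptions made here, not quantities from the text. It compares the exact beam-averaged phase factor with the second-order approximation and evaluates the right-hand side of the RMS inequality.

```python
import numpy as np

def flatness_factor(sigma, delta_rms):
    """Second-order factor 1 - 2 pi^2 sigma^2 <delta^2> multiplying the spectrum."""
    return 1.0 - 2.0 * np.pi**2 * sigma**2 * delta_rms**2

def rms_limit(lambda_min):
    """Right-hand side of the flatness inequality: lambda_min / (sqrt(2) pi)."""
    return lambda_min / (np.sqrt(2.0) * np.pi)

# Hypothetical surface-error map: one offset per secondary interferometer.
rng = np.random.default_rng(0)
delta = rng.normal(0.0, 2.0e-6, size=10_000)   # cm (about 20 nm RMS)
delta -= delta.mean()                          # enforce a zero average offset
delta_rms = np.sqrt(np.mean(delta**2))

sigma_max = 2000.0                             # cm^-1, so lambda_min = 5e-4 cm
exact = np.mean(np.exp(2j * np.pi * sigma_max * delta))   # beam-averaged phase factor
approx = flatness_factor(sigma_max, delta_rms)             # quadratic approximation
limit = rms_limit(1.0 / sigma_max)                         # cm
```

For offsets this small the exact average and the quadratic approximation agree closely, and `delta_rms` sits far below `limit`, which is the regime the inequality is meant to describe.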
5.21 An Example of How to Analyze Nonflat Optical Surfaces
Most of our previous formulas for the effective spectrum $\mathcal{Z}_{eff}(\sigma)$, such as Eqs.
(5.82a) or (5.83d), contain a factor $M(\sigma R\theta_{ma})$ representing the effect of a slightly
misaligned moving mirror. This term is defined in Eq. (5.10c) above to be

\[ M(\sigma R\theta_{ma}) = \frac{J_1(4\pi\sigma R\theta_{ma})}{2\pi\sigma R\theta_{ma}} , \]

and we see from Eq. (5.10e) that M = 1 when the misalignment angle $\theta_{ma}$ is zero. A
misaligned moving mirror is, of course, misaligned with respect to the fixed mirror, so we can
always model this imperfection as a misalignment of the fixed mirror rather than the moving mirror
(see Fig. 5.34). The size of the fixed mirror's misalignment angle is also $\theta_{ma}$, the same
as the size of the moving mirror's misalignment angle. This means that when $\theta_{ma} > 0$, as
in Fig. 5.34, we have a special case of the nonflat optical surface discussed in Sec. 5.20. Hence,
when using the analysis for a nonflat optical surface in Sec. 5.20, we must also set M = 1 in all
the formulas for $\mathcal{Z}_{eff}(\sigma)$, because otherwise we double count the effect of a
tilted moving mirror. By the same reasoning, however, the accuracy of the procedure used to analyze
nonflat surfaces can be checked by comparing it to what we get when $\theta_{ma}$ is small but not
zero.
Equation (5.97b) states that when the moving or fixed mirror is not flat for any reason,
including, for example, being slightly misaligned and so having a nonzero $\theta_{ma}$ value, the
original formula for the effective spectrum $\mathcal{Z}_{eff}(\sigma)$ should be multiplied by a
factor of

\[ 1 - 2\pi^2\sigma^2\,\overline{\delta^2} . \]

Equations (5.82a) and (5.83d), on the other hand, require the formulas for
$\mathcal{Z}_{eff}(\sigma)$ to be multiplied by

\[ M(\sigma R\theta_{ma}) = \frac{J_1(4\pi\sigma R\theta_{ma})}{2\pi\sigma R\theta_{ma}} \]

when the misalignment angle $\theta_{ma}$ is small but nonzero. (As before, R is the radius of the
circular cross section of the beam passing through the interferometer.) Comparing these two
expressions, we see that for them to be consistent

FIGURE 5.34. [Figure not reproduced. Labels: Moving Mirror, Fixed Mirror with Tilt $\theta_{ma}$, Radiance entering the Interferometer, Radiance heading to the detector.]

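\[ \frac{J_1(4\pi\sigma R\theta_{ma})}{2\pi\sigma R\theta_{ma}} \overset{?}{\cong} 1 - 2\pi^2\sigma^2\,\overline{\delta^2} \tag{5.99} \]

must hold true when the misalignment angle $\theta_{ma}$ is small.
To see whether (5.99) is in fact true, we expand its left-hand side in a power series. When x is
small, we know that$^{91}$

\[ J_1(x) \cong \frac{x}{2} - \frac{x^3}{16} , \tag{5.100a} \]

so for small $\theta_{ma}$ we can write

\[
\begin{aligned}
\frac{J_1(4\pi\sigma R\theta_{ma})}{2\pi\sigma R\theta_{ma}} &\cong (2\pi\sigma R\theta_{ma})^{-1} \left[ 2\pi\sigma R\theta_{ma} - \frac{1}{2}(2\pi\sigma R\theta_{ma})^{3} \right] \\
&= 1 - \frac{1}{2}(2\pi\sigma R\theta_{ma})^{2} = 1 - 2\pi^2\sigma^2 R^2\theta_{ma}^2 .
\end{aligned}
\tag{5.100b}
\]

To evaluate the right-hand side of (5.99), we consult Fig. 5.35 to get

\[ \delta(x,y) = 2\theta_{ma}\, y . \tag{5.101a} \]

Circular symmetry allows us to choose the orientation of the x, y axes any way we please, and
they have been chosen so that the moving mirror is tilted by a rotation $\theta_{ma}$ about the x
axis. We convert to polar coordinates using

\[ x = r\cos\phi , \qquad y = r\sin\phi \]

so that formula (5.101a) becomes

\[ \delta(r,\phi) = 2\theta_{ma}\, r\sin\phi . \tag{5.101b} \]

Since the main beam has a circular cross section of radius R, Eq. (5.97c) can be written as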

$^{91}$ See Eq. (9.1.10) on page 360 of Handbook of Mathematical Functions, edited by Milton
Abramowitz and Irene Stegun.
FIGURE 5.35. [Figure not reproduced. Labels: x, y, $\delta = 2\theta_{ma}\,y$, Radiance Impinging on the Tilted Mirror.]

\[
\begin{aligned}
\overline{\delta^2} &= \frac{1}{\pi R^2} \int_0^{2\pi} d\phi \int_0^{R} dr\; r\, [2\theta_{ma}\, r\sin\phi]^2
= \frac{4\theta_{ma}^2}{\pi R^2} \int_0^{2\pi} d\phi\, \sin^2\phi \int_0^{R} dr\; r^3 \\
&= \frac{4\theta_{ma}^2}{\pi R^2} \left[ \frac{\phi}{2} - \frac{\sin(2\phi)}{4} \right]_0^{2\pi} \frac{R^4}{4}
= \theta_{ma}^2 R^2 .
\end{aligned}
\tag{5.101c}
\]

Substitution of Eq. (5.100b) into the left-hand side, and Eq. (5.101c) into the right-hand side,
of the proposed equality in (5.99) gives

\[ 1 - 2\pi^2\sigma^2 R^2\theta_{ma}^2 \overset{?}{\cong} \left[ 1 - 2\pi^2\sigma^2 R^2\theta_{ma}^2 \right] , \]

which is clearly true. This result not only shows why we should be careful to regard a misaligned
moving or fixed mirror as a special type of nonflat optical surface but also justifies the procedure
used in Sec. 5.20 to analyze more general types of nonflat optical surfaces.
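The consistency check of this section can also be run numerically. The Python sketch below is a hedged illustration: the power-series helper `bessel_j1`, the midpoint-rule integration, and the values of $\sigma$, R, and $\theta_{ma}$ are all assumptions chosen here for convenience. It evaluates the tilt factor from the Bessel-function formula and compares it with the nonflat-surface factor built from the beam-averaged squared offset of a tilted mirror.

```python
import math
import numpy as np

def bessel_j1(x, terms=20):
    """Power series J1(x) = sum_k (-1)^k (x/2)^(2k+1) / (k! (k+1)!)."""
    return sum((-1)**k * (x / 2.0)**(2 * k + 1)
               / (math.factorial(k) * math.factorial(k + 1))
               for k in range(terms))

def tilt_factor(sigma, R, theta):
    """Mirror-tilt term J1(4 pi sigma R theta) / (2 pi sigma R theta)."""
    a = 2.0 * np.pi * sigma * R * theta
    return bessel_j1(2.0 * a) / a

def mean_square_offset(R, theta, n_r=400, n_phi=400):
    """Midpoint-rule average of [2 theta r sin(phi)]^2 over a disk of radius R."""
    r = (np.arange(n_r) + 0.5) * R / n_r
    phi = (np.arange(n_phi) + 0.5) * 2.0 * np.pi / n_phi
    rr, pp = np.meshgrid(r, phi, indexing="ij")
    integrand = (2.0 * theta * rr * np.sin(pp))**2 * rr    # extra rr is the polar Jacobian
    cell = (R / n_r) * (2.0 * np.pi / n_phi)
    return integrand.sum() * cell / (np.pi * R**2)

sigma, R, theta = 1000.0, 2.0, 1.0e-5    # cm^-1, cm, rad (hypothetical values)
lhs = tilt_factor(sigma, R, theta)                           # Bessel-function form
msq = mean_square_offset(R, theta)                           # should approach (R*theta)^2
rhs = 1.0 - 2.0 * np.pi**2 * sigma**2 * msq                  # nonflat-surface form
```

The two factors differ only by terms of fourth order in $2\pi\sigma R\theta_{ma}$, so for a small tilt `lhs` and `rhs` agree to several decimal places.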
5.22 Sampling the Interferogram Signal
After the interferogram signal leaves the detector circuit, it should be sampled at equally spaced
intervals of the optical-path difference. In principle, all we need to do is keep the moving mirror
traveling at a constant velocity while using an analog-to-digital converter (A/D converter) to
sample the signal at equally spaced instants in time. In practice, much better results are achieved
when a laser beam is used to trigger the A/D converter. In Sec. 1.8 of Chapter 1, we discussed in
a general way how laser control systems can be used to maintain alignment and produce steady
motion of the moving mirror. In a well-aligned system, we only need a single laser beam to
sample the interferogram signal at equally spaced intervals of the optical-path difference. In Fig.
5.36, for example, two small angle mirrors insert and remove a laser beam parallel to the optical
axis of the signal beam. The laser beam passes through the interferometer in exactly the same
way as the main signal beam, and it experiences the same optical-path difference as the main
signal beam. The laser detector registers a monochromatic interference signal, and from Eqs.
(5.16b) and (5.16c) and Figs. 5.9(b) and 5.9(c), we know that this signal generates a cosine wave
in $\chi$. The laser trigger circuit analyzes this cosine wave, sending out a trigger signal telling
the A/D converter to sample the main-beam signal every time the cosine wave crosses a
predetermined trigger level (see Fig. 5.37). Now when the location of the moving mirror varies
slightly from its predetermined value, the error in the sampling position no longer comes from
sampling at the wrong position of the moving mirror but instead is caused by inaccuracies in the
laser-trigger and main-beam detector circuits.$^{92}$ As a general rule, this makes the error a
great deal smaller. Similarly, slight changes in the overall size of the interferometer setup due to
mechanical flexing need no longer concern us; the laser beam establishes an invariant ruler that
does not care whether the overall distance between, say, the beam splitter and the fixed mirror
has changed by several microns since the last time the instrument was calibrated.

$^{92}$ In Chapter 8 we analyze this sort of sampling error as a random source of noise.

FIGURE 5.36. [Figure not reproduced. Labels: A/D Converter, Moving Mirror, Ideal Beam Splitter, Fixed Mirror, Laser, Laser Detector, Trigger Circuit processing the Signal from the Laser Detector, Lens, Interferometer Detector, Detector Circuit, Digitized Detector Signal, Outside Radiance entering the Interferometer.]
FIGURE 5.37. [Figure not reproduced. Labels: Total Power in the Laser Interference Signal, Laser Trigger Level.]

Section 5.14 above points out that to remove the background radiance from the main-beam
detector signal, we just subtract the interferogram signal produced by a very cold source from the
interferogram signal produced by the source whose spectrum we want to measure. Equation
(5.62c) describes this process as

\[ z(\chi) = z_{tot}(\chi) - z^{(cold)}_{C}(\chi) , \tag{5.102} \]

where $z_{tot}(\chi)$ is the interferogram signal produced by the combination of the desired source
spectrum with the instrument background, $z^{(cold)}_{C}(\chi)$ is the interferogram signal produced
by just the instrument background when observing a very cold source, and $z(\chi)$ is the
interferogram signal produced by just the source spectrum we want to measure. When we sample the
signal leaving the detector circuit at equal optical-path-difference intervals $\Delta\chi$, what we
get is either

\[ z_{tot}(m\,\Delta\chi) \quad \text{for } m = 0, \pm 1, \pm 2, \ldots \]

when observing the source radiance combined with the instrument background or

\[ z^{(cold)}_{C}(m\,\Delta\chi) \quad \text{for } m = 0, \pm 1, \pm 2, \ldots \]

when observing just the cold source.$^{93}$ Working with a double-sided interferogram system of the
type described in Sec. 5.15, we acquire a total of N samples. Formula (5.102) then gives the
sampled interferogram signal produced by just the source spectrum,

\[ z(m\,\Delta\chi) = z_{tot}(m\,\Delta\chi) - z^{(cold)}_{C}(m\,\Delta\chi) . \tag{5.103a} \]

The fast-Fourier transform algorithms that are applied to these samples work best when N is a
multiple of 2, as is mentioned at the beginning of Sec. 2.22 of Chapter 2, so in (5.103a) the index
values of m can be chosen so that

\[ m = -\frac{N}{2} + 1,\; -\frac{N}{2} + 2,\; \ldots,\; -1,\; 0,\; 1,\; \ldots,\; \frac{N}{2} - 1,\; \frac{N}{2} . \tag{5.103b} \]

Note that (5.103b) specifies one extra sample to occur on the positive $\chi$ axis.

$^{93}$ To keep things simple, we assume for now that the sample with index $m = 0$ occurs at
$\chi = 0$. Section 5.26 below shows what happens when we stop assuming that one of the samples
occurs at exactly $\chi = 0$.
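The index layout of (5.103b) and the sample-by-sample subtraction of (5.103a) can be sketched in a few lines of Python. Everything specific in this example is hypothetical: the helper name `sample_indices`, the sampling interval, and the two cosine interferograms standing in for the source and the cold-background signals are assumptions made here for illustration.

```python
import numpy as np

def sample_indices(n_samples):
    """Indices m = -N/2 + 1, ..., 0, ..., N/2 for an even number of samples N."""
    if n_samples % 2 != 0:
        raise ValueError("N must be even")
    return np.arange(-n_samples // 2 + 1, n_samples // 2 + 1)

N = 8
delta_chi = 0.5e-4                      # cm, hypothetical sampling interval
m = sample_indices(N)
chi = m * delta_chi                     # optical-path differences of the samples

# Hypothetical signals: a source line at 1000 cm^-1 plus an instrument background.
z_cold = 0.2 * np.cos(2.0 * np.pi * 700.0 * chi)
z_tot = np.cos(2.0 * np.pi * 1000.0 * chi) + z_cold

z = z_tot - z_cold                      # background removed sample by sample
```

With N = 8 the indices run from -3 to 4, which shows the one extra sample on the positive side of the axis.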
5.23 Setting Up the Discrete Fourier Transform of the Sampled Signal
The key step in modern Fourier-transform spectroscopy is to apply a fast-Fourier transform (FFT)
algorithm to the sampled signal $z(m\,\Delta\chi)$ in order to calculate the discrete Fourier
transform (DFT) that best approximates the integral Fourier transform of $z(\chi)$. The unsampled
interferogram signal leaving the detector circuit can be written as

\[ z(\chi) = \int_{-\infty}^{\infty} \mathcal{Z}_{eff}(\sigma)\, e^{2\pi i \sigma\chi}\, d\sigma \tag{5.104a} \]

where, according to Eq. (5.83d),

ma
R ( ) H( ) M( ) ( ) ( ) ( ) ( ) ( )
4
eff f a FOV
WA
u R
= Z L . (5.104b)

Usually the aft optics transmission function $\tau_a(\sigma)$ is nonzero only for those wavenumbers
$\sigma$ that satisfy

\[ \sigma_{min} \le |\sigma| \le \sigma_{max} , \tag{5.105} \]

making the effective spectrum $\mathcal{Z}_{eff}$ equal to zero for $|\sigma| > \sigma_{max}$ or
$|\sigma| < \sigma_{min}$ as shown in Fig. 5.38.

FIGURE 5.38. [Figure not reproduced. Plot of $\mathcal{Z}_{eff}(\sigma)$ with axis marks $-\sigma_{max}$, $-\sigma_{min}$, $\sigma_{min}$, $\sigma_{max}$.]
FIGURE 5.39. [Figure not reproduced. Plot of $z_{trunc}(\chi)$ with axis marks $-D$ and $D$.]

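The work done in Sec. 5.15 shows that the interferogram signal for a double-sided
interferogram, which we call the truncated interferogram signal, can be written as

\[ z_{trunc}(\chi) = \Pi(\chi, D)\, z(\chi) \tag{5.106a} \]

so that

\[ z_{trunc}(\chi) = \begin{cases} z(\chi) & \text{for } |\chi| \le D \\ 0 & \text{for } |\chi| > D \end{cases} \tag{5.106b} \]

as shown in Fig. 5.39.
Looking ahead to when the signal is sampled, we note that for N equally spaced samples

\[ \Delta\chi = \frac{2D}{N} . \tag{5.107a} \]

Since

\[ z_{trunc}(\chi) = z(\chi) \quad \text{for } |\chi| \le D , \tag{5.107b} \]

it then follows that

\[ z_{trunc}(m\,\Delta\chi) = z(m\,\Delta\chi) \tag{5.107c} \]

for

\[ m = -\frac{N}{2} + 1,\; -\frac{N}{2} + 2,\; \ldots,\; -1,\; 0,\; 1,\; \ldots,\; \frac{N}{2} - 1,\; \frac{N}{2} . \tag{5.107d} \]

Equation (5.65c) shows, after we substitute from (5.106a), that the effective spectrum
associated with the unsampled signal is

\[ z_{trunc}(\chi) = \int_{-\infty}^{\infty} \mathcal{Z}^{trunc}_{eff}(\sigma)\, e^{2\pi i \sigma\chi}\, d\sigma \tag{5.108a} \]

with

\[ \mathcal{Z}^{trunc}_{eff}(\sigma) = [2D\,\mathrm{sinc}(2\pi\sigma D)] * \mathcal{Z}_{eff}(\sigma) . \tag{5.108b} \]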

Figure 5.40 shows that D will be chosen large enough to make $\mathcal{Z}^{trunc}_{eff}$ just a
slightly blurred version of $\mathcal{Z}_{eff}$ with a tendency to oscillate at abrupt changes in
value. According to the discussion following Eq. (5.82c) above, the quantities $H$, $M$,
$\mathcal{R}$, $\eta$, $\tau_a$, and $\tau_f$ are all slowly varying functions of their
arguments.$^{94}$ This means that when the formula for $\mathcal{Z}_{eff}$ in (5.104b) is
substituted into Eq. (5.108b), the sinc function is narrow enough for these quantities to be treated
as quasi-constant with respect to the convolution [see Eq. (5C.1) in Appendix 5C]. Hence, we can
approximate (5.108b) as

ma
R ( ) H( ) M( ) ( ) ( ) ( ) ( ) ( )
4
eff f a mnf
trunc
WA
u R
Z L , (5.108c)
where

\[ \mathcal{L}_{mnf}(|\sigma|) = [2D\,\mathrm{sinc}(2\pi\sigma D)] * \mathcal{L}_{FOV}(|\sigma|) . \tag{5.108d} \]

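$^{94}$ So if the aft-optics transmission $\tau_a$ and the detector responsivity $\mathcal{R}$ drop
to zero for $|\sigma| > \sigma_{max}$ and $|\sigma| < \sigma_{min}$, this must occur slowly compared
to the rate at which $\mathcal{L}_{FOV}$ varies with $\sigma$.
FIGURE 5.40. [Figure not reproduced. Plot of $\mathcal{Z}^{trunc}_{eff}(\sigma)$ with axis marks $-\sigma_{max}$, $-\sigma_{min}$, $\sigma_{min}$, $\sigma_{max}$.]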
We note that because both $2D\,\mathrm{sinc}(2\pi\sigma D)$ and $\mathcal{L}_{FOV}(|\sigma|)$ are
even functions of $\sigma$, their convolution $\mathcal{L}_{mnf}(|\sigma|)$ is also even [see Eq.
(2.38f) in Chapter 2],

\[ \mathcal{L}_{mnf}(-\sigma) = \mathcal{L}_{mnf}(\sigma) \tag{5.108e} \]

and

\[ \mathcal{L}_{mnf}(|\sigma|) = \mathcal{L}_{mnf}(\sigma) . \tag{5.108f} \]

Even though the argument of $\mathcal{L}_{mnf}$ does not need absolute value signs because
$\mathcal{L}_{mnf}$ is by definition in (5.108d) already an even function, they are put there anyway
to keep the notation parallel with the previous $\mathcal{L}$-type radiance symbols. The mnf
subscript indicates that $\mathcal{L}_{mnf}$ is the measured, noise-free spectral radiance produced
by the interferometer; it is $\mathcal{L}(\sigma)$ blurred both by the finite field-of-view effect
discussed in Sec. 5.17 and the finite-interferogram effect discussed in Sec. 5.15. Figures
5.41(a)-5.41(c) show the progression from the original $\mathcal{L}(\sigma)$ radiance spectrum to
$\mathcal{L}_{FOV}(\sigma)$ defined in Eq. (5.83e) above to $\mathcal{L}_{mnf}(\sigma)$ defined in
Eq. (5.108d). The unsampled, noise-free signal can now be written as the Fourier transform pair,

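\[ \mathcal{Z}^{trunc}_{eff}(\sigma) = \int_{-\infty}^{\infty} z_{trunc}(\chi)\, e^{-2\pi i \sigma\chi}\, d\chi \tag{5.109a} \]

and

\[ z_{trunc}(\chi) = \int_{-\infty}^{\infty} \mathcal{Z}^{trunc}_{eff}(\sigma)\, e^{2\pi i \sigma\chi}\, d\sigma . \tag{5.109b} \]

Function $\mathcal{L}_{mnf}(|\sigma|)$ is closely related to function $\mathcal{L}_{eff}(|\sigma|)$
in Eqs. (5.94e) and (5.94f). Hence, it makes sense to assume that

\[ \mathcal{L}_{mnf}(|\sigma|) \cong \mathcal{L}_{eff}(|\sigma|) \tag{5.110} \]

and to replace $\mathcal{L}_{eff}(|\sigma|)$ by $\mathcal{L}_{mnf}(|\sigma|)$ in the calibration
formulas of Sec. 5.19.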
5.24 Oversampling the Interferogram
Section 2.21 of Chapter 2 shows how to go from the integral Fourier transform to the discrete
Fourier transform. Comparing the integral transform pair in Eqs. (5.109a) and (5.109b) to Eq.
(2.91a) of Chapter 2, we note that variables $\chi$ and $\sigma$ in this chapter play the roles of
variables t and f respectively in Chapter 2,

\[ \chi \leftrightarrow t \tag{5.111a} \]

and

\[ \sigma \leftrightarrow f . \tag{5.111b} \]

Perhaps the most important decision involved in going from the integral to the discrete Fourier
transform is the choice of step size $\Delta\chi$ between the equally spaced samples of
$z_{trunc}(\chi)$. Converting Eq. (2.99a) of Chapter 2 from variables t and f to variables $\chi$
and $\sigma$, we see that the Nyquist wavenumber $\sigma_{Nyq}$ corresponding to the Nyquist
frequency $f_{Nyq}$ is given by

\[ \sigma_{Nyq} = \frac{1}{2\,\Delta\chi} . \tag{5.112} \]
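As a quick numerical illustration of (5.112), the Python sketch below computes the Nyquist wavenumber for a given sampling interval and tests whether a chosen band edge is covered. The function names and the sample values (one sample per 632.8 nm of optical-path difference, a band edge of 3000 cm$^{-1}$) are assumptions picked here for illustration and are not taken from the text.

```python
def nyquist_wavenumber(delta_chi):
    """Nyquist wavenumber (cm^-1) for a sampling interval delta_chi (cm)."""
    return 1.0 / (2.0 * delta_chi)

def is_oversampled(delta_chi, sigma_max):
    """True when the Nyquist wavenumber exceeds the largest wavenumber in the band."""
    return nyquist_wavenumber(delta_chi) > sigma_max

delta_chi = 632.8e-7            # cm, hypothetical: one sample per 632.8 nm of OPD
sigma_max = 3000.0              # cm^-1, hypothetical upper band edge

sigma_nyq = nyquist_wavenumber(delta_chi)
covered = is_oversampled(delta_chi, sigma_max)
undersampled = is_oversampled(4.0 * delta_chi, sigma_max)   # four times coarser sampling
```

Halving the sampling interval doubles the Nyquist wavenumber, so the coarser grid in the last line no longer covers the same band.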

The discussion at the beginning of Sec. 2.22 of Chapter 2 shows that oversampling the
interferogram signal $z_{trunc}(\chi)$ means choosing the sampling interval $\Delta\chi$ in such a
way that the Nyquist wavenumber $\sigma_{Nyq}$ satisfies

\[ \sigma_{Nyq} > \sigma_{max} \]

FIGURE 5.41(a). [Figure not reproduced. Plot of $\mathcal{L}(\sigma)$ against $\sigma$ with axis marks at 1009 cm$^{-1}$ and 1010 cm$^{-1}$ and Spectral Edge E marked.]
This small piece of the radiance spectrum, the same piece plotted in Fig. 5.30(a) above, is
here graphed in all its detail as it enters the interferometer. This is why the y axis is labeled
$\mathcal{L}(\sigma)$. Note that Spectral Edge E lies at wavenumber 1010 cm$^{-1}$.

FIGURE 5.41(b). [Figure not reproduced. Plot of $\mathcal{L}_{FOV}(\sigma)$ against $\sigma$ with axis marks at 1009 cm$^{-1}$ and 1010 cm$^{-1}$ and Spectral Edge E marked.]
Here the same small piece of the radiance spectrum plotted in Figs. 5.41(a) and 5.30(b)
is shown with the rescaled wavenumber axis and blurring due to the interferometer's
finite field of view. Hence, the y axis is labeled $\mathcal{L}_{FOV}(\sigma)$. Note that Spectral
Edge E now occurs at a slightly smaller wavenumber than before.

FIGURE 5.41(c). [Figure not reproduced. Plot of $\mathcal{L}_{mnf}(\sigma)$ against $\sigma$ with axis marks at 1009 cm$^{-1}$ and 1010 cm$^{-1}$ and Spectral Edge E marked.]
This is the same small piece of the radiance spectrum plotted in Figs. 5.41(a) and
5.41(b). The y axis is labeled $\mathcal{L}_{mnf}(\sigma)$ to show that here the radiance is blurred
by both the interferometer's finite field of view and the interferometer's finite interferogram
length. Note that Spectral Edge E has the same wavenumber shift as in Fig. 5.41(b),
and that the spectral detail has been further blurred by the finite interferogram length.
Figures 5.41(a)-5.41(c) qualitatively show the effect of the two spectral distortions
inherent in standard Fourier-transform spectrometers.
with $\sigma_{max}$ defined by inequality (5.105) and Fig. 5.38. The larger $\sigma_{Nyq}$ is
compared to $\sigma_{max}$, the more accurate the transformation from the integral Fourier transform
to the discrete Fourier transform. The reason for this, of course, is that the larger
$\sigma_{Nyq}$ is compared to $\sigma_{max}$, the less likely it is that significant amounts of
aliasing will occur when going from the integral to the discrete Fourier transform. Although both
aliasing and the transformation from the integral to the discrete Fourier transform have already
been covered in Secs. 2.21-2.23 of Chapter 2, it does no harm to review these ideas here in the
context of the truncated interferogram signal $z_{trunc}(\chi)$ and its effective spectrum
$\mathcal{Z}^{trunc}_{eff}(\sigma)$.
The first step in setting up the discrete Fourier transform is to construct function
$z^{[\ ]}_{trunc}(\chi, 2D)$ from $z_{trunc}(\chi)$ following the procedure used in Eq. (2.91b) of
Chapter 2,

\[ z^{[\ ]}_{trunc}(\chi, 2D) = \sum_{k=-\infty}^{\infty} z_{trunc}(\chi - 2kD) . \tag{5.113a} \]

From Eq. (5.106b) and Fig. 5.39, we know that $z_{trunc}$ is zero for $|\chi| > D$. Consequently
$z^{[\ ]}_{trunc}(\chi, 2D)$ has the form shown in Fig. 5.42. This matches the situation shown in
Fig. 2.12(a) of Chapter 2, with the original signal $z_{trunc}$ turned into a nonoverlapping,
periodic signal $z^{[\ ]}_{trunc}$ of period 2D. In particular, we note that

\[ z^{[\ ]}_{trunc}(\chi, 2D) = z_{trunc}(\chi) \quad \text{for } |\chi| \le D . \tag{5.113b} \]

Next we construct $\mathcal{Z}^{[\ ]\,trunc}_{eff}(\sigma, 2\sigma_{Nyq})$ using

\[ \mathcal{Z}^{[\ ]\,trunc}_{eff}(\sigma, 2\sigma_{Nyq}) = \sum_{k=-\infty}^{\infty} \mathcal{Z}^{trunc}_{eff}(\sigma - 2k\sigma_{Nyq}) . \tag{5.113c} \]

Glancing at the plot of $\mathcal{Z}^{trunc}_{eff}$ in Fig. 5.43, we see that the plot of
$\mathcal{Z}^{[\ ]\,trunc}_{eff}$ has the form shown in Fig. 5.44. The original signal
$\mathcal{Z}^{trunc}_{eff}$ is turned into a periodic signal $\mathcal{Z}^{[\ ]\,trunc}_{eff}$ of
period $2\sigma_{Nyq}$, matching the situation shown in Fig. 2.12(b) of Chapter 2. Consequently, we
have that

\[ \mathcal{Z}^{[\ ]\,trunc}_{eff}(\sigma, 2\sigma_{Nyq}) \cong \mathcal{Z}^{trunc}_{eff}(\sigma) \quad \text{for } |\sigma| \le \sigma_{Nyq} . \tag{5.113d} \]


The edge ripples of $\mathcal{Z}^{trunc}_{eff}$ are small and become smaller as we get further from
the edge, but they can in principle extend indefinitely far along the wavenumber axis, which means
that overlapping can occur, making $\mathcal{Z}^{[\ ]\,trunc}_{eff}$ not exactly equal to
$\mathcal{Z}^{trunc}_{eff}$ for $|\sigma| \le \sigma_{Nyq}$. Reviewing the discussion following Eqs.
(2.93a) and (2.93b) of Chapter 2, we see that approximating $z_{trunc}$ and
$\mathcal{Z}^{trunc}_{eff}$ by periodic functions (with periods 2D and $2\sigma_{Nyq}$ respectively)
is exactly what we need to do when approximating the integral Fourier transform by the discrete
Fourier transform. Now we understand why the correct choice of $\sigma_{Nyq}$ is so important; if
$\sigma_{Nyq}$ is set too close to $\sigma_{max}$, the ringing at the edges of
$\mathcal{Z}^{trunc}_{eff}$ could create significant amounts of overlap in its periodic extension
to function $\mathcal{Z}^{[\ ]\,trunc}_{eff}$.
As is pointed out in Sec. 2.22 of Chapter 2, this sort of overlap is called aliasing of the signal
spectrum. When $\sigma_{Nyq}$ is chosen to be decidedly greater than $\sigma_{max}$, the
interferogram signal is said to be oversampled. The choice of D made when going from $z_{trunc}$ to
$z^{[\ ]}_{trunc}$, although in principle equally important in characterizing the discrete Fourier
transform, is in practice specified at an earlier stage of the interferometer design when deciding
on the spectral resolution of the measured spectrum [see Eq. (5.67) above].
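Aliasing is easy to demonstrate with a discrete transform. The Python sketch below is an illustration under assumptions made here, not a procedure from the text: it samples a pure cosine interferogram on a one-sided grid starting at zero (rather than the double-sided grid of this chapter), uses NumPy's real FFT, and picks convenient numerical values. A spectral line below the Nyquist wavenumber appears where it should, while a line above it is folded back into the band.

```python
import numpy as np

def apparent_wavenumber(sigma_line, delta_chi, n_samples=256):
    """Wavenumber at which a sampled cosine interferogram peaks in the DFT."""
    m = np.arange(n_samples)
    z = np.cos(2.0 * np.pi * sigma_line * m * delta_chi)
    spectrum = np.abs(np.fft.rfft(z))
    sigma_grid = np.fft.rfftfreq(n_samples, d=delta_chi)   # 0 ... Nyquist wavenumber
    return sigma_grid[np.argmax(spectrum)]

delta_chi = 1.0 / 4096.0        # cm, so the Nyquist wavenumber is 2048 cm^-1
inside = apparent_wavenumber(1024.0, delta_chi)    # line below Nyquist
outside = apparent_wavenumber(3072.0, delta_chi)   # line above Nyquist
```

Both calls report the same apparent wavenumber because the samples of the two cosines are identical on this grid, which is why the band edge must stay safely below the Nyquist wavenumber.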
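Because $z_{trunc}(\chi)$ is zero for $|\chi| > D$, and both the real and imaginary components of
$\mathcal{Z}^{trunc}_{eff}(\sigma)$ are negligible for $|\sigma| > \sigma_{Nyq}$, the pair of
integral Fourier transforms in Eqs. (5.109a) and (5.109b) can be approximated by

\[ \mathcal{Z}^{trunc}_{eff}(\sigma) \cong \int_{-D}^{D} z_{trunc}(\chi)\, e^{-2\pi i \sigma\chi}\, d\chi \tag{5.114a} \]

and

\[ z_{trunc}(\chi) \cong \int_{-\sigma_{Nyq}}^{\sigma_{Nyq}} \mathcal{Z}^{trunc}_{eff}(\sigma)\, e^{2\pi i \sigma\chi}\, d\sigma . \tag{5.114b} \]

FIGURE 5.42. [Figure not reproduced. Plot of $z^{[\ ]}_{trunc}(\chi, 2D)$ with axis marks $-5D$, $-3D$, $-D$, $D$, $3D$, $5D$.]
FIGURE 5.43. [Figure not reproduced. Plot of $\mathcal{Z}^{trunc}_{eff}(\sigma)$ with axis marks $-\sigma_{Nyq}$, $-\sigma_{max}$, $-\sigma_{min}$, $\sigma_{min}$, $\sigma_{max}$, $\sigma_{Nyq}$.]
FIGURE 5.44. [Figure not reproduced. Plot of $\mathcal{Z}^{[\ ]\,trunc}_{eff}(\sigma, 2\sigma_{Nyq})$ with axis marks at $\pm\sigma_{min}$, $\pm\sigma_{max}$, and multiples of $\sigma_{Nyq}$ up to $\pm 3\sigma_{Nyq}$.]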

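With the understanding that only the signal values at $|\chi| \le D$ and the spectral values at
$|\sigma| \le \sigma_{Nyq}$ are of interest on the left-hand sides of the formulas, we use Eqs.
(5.113b) and (5.113d) to replace $z_{trunc}$ by $z^{[\ ]}_{trunc}$ and $\mathcal{Z}^{trunc}_{eff}$
by $\mathcal{Z}^{[\ ]\,trunc}_{eff}$. Equations (5.114a) and (5.114b) now become

\[ \mathcal{Z}^{[\ ]\,trunc}_{eff}(\sigma, 2\sigma_{Nyq}) \cong \int_{-D}^{D} z^{[\ ]}_{trunc}(\chi, 2D)\, e^{-2\pi i \sigma\chi}\, d\chi \tag{5.115a} \]

and

\[ z^{[\ ]}_{trunc}(\chi, 2D) \cong \int_{-\sigma_{Nyq}}^{\sigma_{Nyq}} \mathcal{Z}^{[\ ]\,trunc}_{eff}(\sigma, 2\sigma_{Nyq})\, e^{2\pi i \sigma\chi}\, d\sigma . \tag{5.115b} \]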

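Working first with the right-hand side of Eq. (5.115b), we note that

\[
\begin{aligned}
\int_{-\sigma_{Nyq}}^{\sigma_{Nyq}} \mathcal{Z}^{[\ ]\,trunc}_{eff}(\sigma, 2\sigma_{Nyq})\, e^{2\pi i \sigma\chi}\, d\sigma
&= \int_{-\sigma_{Nyq}}^{0} \mathcal{Z}^{[\ ]\,trunc}_{eff}(\sigma, 2\sigma_{Nyq})\, e^{2\pi i \sigma\chi}\, d\sigma
 + \int_{0}^{\sigma_{Nyq}} \mathcal{Z}^{[\ ]\,trunc}_{eff}(\sigma, 2\sigma_{Nyq})\, e^{2\pi i \sigma\chi}\, d\sigma \\
&= \int_{\sigma_{Nyq}}^{2\sigma_{Nyq}} \mathcal{Z}^{[\ ]\,trunc}_{eff}(\sigma' - 2\sigma_{Nyq}, 2\sigma_{Nyq})\, e^{2\pi i \sigma'\chi}\, e^{-2\pi i (2\sigma_{Nyq})\chi}\, d\sigma' \\
&\quad + \int_{0}^{\sigma_{Nyq}} \mathcal{Z}^{[\ ]\,trunc}_{eff}(\sigma, 2\sigma_{Nyq})\, e^{2\pi i \sigma\chi}\, d\sigma ,
\end{aligned}
\tag{5.116}
\]

where in the last step the variable of integration in the first integral has been changed to
$\sigma' = \sigma + 2\sigma_{Nyq}$. From Eq. (5.112) we get

\[ e^{-2\pi i (2\sigma_{Nyq})\chi} = e^{-2\pi i (\chi/\Delta\chi)} . \]

If $\chi/\Delta\chi = m = \text{integer}$, then

\[ e^{-2\pi i (\chi/\Delta\chi)} = e^{-2\pi i m} = 1 . \]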

Substituting (5.116) back into (5.115b) and deciding to evaluate $z^{[\,]}_{trunc}$ only at those optical-path
differences for which $\chi/\Delta\chi = m$, we get

$$z^{[\,]}_{trunc}(m\Delta\chi, 2D) \cong \int_{\sigma_{Nyq}}^{2\sigma_{Nyq}} Z^{[\,]eff}_{trunc}(\sigma' - 2\sigma_{Nyq}, 2\sigma_{Nyq})\, e^{2\pi i m\sigma'\Delta\chi}\, d\sigma'
 + \int_{0}^{\sigma_{Nyq}} Z^{[\,]eff}_{trunc}(\sigma, 2\sigma_{Nyq})\, e^{2\pi i m\sigma\Delta\chi}\, d\sigma .$$

This becomes, dropping the prime and recognizing that $Z^{[\,]eff}_{trunc}$ is periodic with period $2\sigma_{Nyq}$,

$$z^{[\,]}_{trunc}(m\Delta\chi, 2D) \cong \int_{0}^{2\sigma_{Nyq}} Z^{[\,]eff}_{trunc}(\sigma, 2\sigma_{Nyq})\, e^{2\pi i m\sigma\Delta\chi}\, d\sigma . \tag{5.117}$$

Now we switch our attention to Eq. (5.115a). Following the same procedure as before, this time
changing the variable of integration to $\chi' = \chi + 2D$, we write its right-hand side as

$$\int_{-D}^{D} z^{[\,]}_{trunc}(\chi, 2D)\, e^{-2\pi i\sigma\chi}\, d\chi
 = \int_{D}^{2D} z^{[\,]}_{trunc}(\chi' - 2D, 2D)\, e^{-2\pi i\sigma\chi'}\, e^{2\pi i\sigma(2D)}\, d\chi'
 + \int_{0}^{D} z^{[\,]}_{trunc}(\chi, 2D)\, e^{-2\pi i\sigma\chi}\, d\chi .$$

Substituting this into (5.115a) gives, since $z^{[\,]}_{trunc}$ is periodic with period 2D,

$$Z^{[\,]eff}_{trunc}(\sigma, 2\sigma_{Nyq}) \cong \int_{0}^{D} z^{[\,]}_{trunc}(\chi, 2D)\, e^{-2\pi i\sigma\chi}\, d\chi
 + \int_{D}^{2D} z^{[\,]}_{trunc}(\chi, 2D)\, e^{-2\pi i\sigma\chi}\, e^{2\pi i\sigma(2D)}\, d\chi , \tag{5.118}$$

where the prime has been dropped from the integral between D and 2D. From Eq. (2.93d) of
Chapter 2, we note (remembering that variables $\chi$ and $\sigma$ here correspond, respectively, to
variables t and f there) that the interval between samples of $Z^{[\,]eff}_{trunc}$ in the discrete Fourier
transform is

$$\Delta\sigma = \frac{1}{2D} \tag{5.119}$$

when $z^{[\,]}_{trunc}$ has period 2D. This means that

$$e^{2\pi i\sigma(2D)} = e^{2\pi i(\sigma/\Delta\sigma)}$$

in the second integral of (5.118). If $\sigma/\Delta\sigma = n = $ integer, then

$$e^{2\pi i(\sigma/\Delta\sigma)} = e^{2\pi i n} = 1 .$$

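Now, deciding to evaluate $Z^{[\,]eff}_{trunc}$ only at wavenumbers for which $\sigma/\Delta\sigma = n$, we can write
(5.118) as

$$Z^{[\,]eff}_{trunc}(n\Delta\sigma, 2\sigma_{Nyq}) = \int_{0}^{2D} z^{[\,]}_{trunc}(\chi, 2D)\, e^{-2\pi i(n\Delta\sigma)\chi}\, d\chi . \tag{5.120}$$

Equations (5.117) and (5.120) are gathered together to get

$$Z^{[\,]eff}_{trunc}(n\Delta\sigma, 2\sigma_{Nyq}) = \int_{0}^{2D} z^{[\,]}_{trunc}(\chi, 2D)\, e^{-2\pi i(n\Delta\sigma)\chi}\, d\chi \tag{5.121a}$$
and
$$z^{[\,]}_{trunc}(m\Delta\chi, 2D) \cong \int_{0}^{2\sigma_{Nyq}} Z^{[\,]eff}_{trunc}(\sigma, 2\sigma_{Nyq})\, e^{2\pi i m\sigma\Delta\chi}\, d\sigma . \tag{5.121b}$$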

The discussion following Eq. (5.106b) defined N to be the number of equally spaced samples of
$z^{[\,]}_{trunc}$ between $-D$ and $D$, with $\Delta\chi = (2D)/N$, so N equally spaced samples spaced $\Delta\chi$ apart must
also cover the optical-path difference between zero and 2D. We now show that N equally spaced
samples of $Z^{[\,]eff}_{trunc}$ spaced $\Delta\sigma$ apart in wavenumber cover the wavenumber interval between zero
and $2\sigma_{Nyq}$. Remembering that variables $\chi$ and $\sigma$ here correspond to variables t and f respectively
in Chapter 2, we rewrite Eq. (2.93e) of Chapter 2 as

$$\Delta\chi\,\Delta\sigma = \frac{1}{N} . \tag{5.122a}$$

Consequently,

$$N\Delta\sigma = \frac{1}{\Delta\chi}$$

or, using $2\sigma_{Nyq} = (\Delta\chi)^{-1}$ from Eq. (5.112),

$$N\Delta\sigma = 2\sigma_{Nyq} . \tag{5.122b}$$

Therefore N equally spaced samples $\Delta\sigma$ apart must cover the wavenumber interval between zero
and $2\sigma_{Nyq}$.
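The bookkeeping in Eqs. (5.112), (5.119), (5.122a), and (5.122b) can be checked numerically. The following is only a minimal sketch: the function name `sampling_grid` and the numerical values of D and the Nyquist wavenumber are hypothetical choices made for illustration, not values taken from the text.

```python
import math

def sampling_grid(D, sigma_nyq):
    """Return (delta_chi, N, delta_sigma) for an interferogram spanning -D..D.

    D         : maximum optical-path difference (hypothetical units: cm)
    sigma_nyq : chosen Nyquist wavenumber (hypothetical units: cm^-1)
    """
    delta_chi = 1.0 / (2.0 * sigma_nyq)   # sampling interval, Eq. (5.112)
    N = round(2.0 * D / delta_chi)        # number of samples covering 2D
    delta_sigma = 1.0 / (2.0 * D)         # spectral sample spacing, Eq. (5.119)
    return delta_chi, N, delta_sigma

# Hypothetical example values (chosen as powers of two so the arithmetic is exact).
delta_chi, N, delta_sigma = sampling_grid(D=0.5, sigma_nyq=4096.0)

# Eq. (5.122a): delta_chi * delta_sigma = 1/N
check_a = math.isclose(delta_chi * delta_sigma, 1.0 / N)
# Eq. (5.122b): N * delta_sigma = 2 * sigma_nyq
check_b = math.isclose(N * delta_sigma, 2.0 * 4096.0)
```

Both checks hold for any D and Nyquist wavenumber for which 2D is a whole number of sampling intervals; otherwise the rounding in `N` makes them only approximate.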
Having established that N equally spaced samples cover the regions of integration in Eqs.
(5.121a) and (5.121b), we approximate both integrals as sums over N equally spaced samples in
wavenumber and optical-path difference. This gives

$$Z^{[\,]eff}_{trunc}(n\Delta\sigma, 2\sigma_{Nyq}) \cong \sum_{m=0}^{N-1} z^{[\,]}_{trunc}(m\Delta\chi, 2D)\, e^{-2\pi i(n\Delta\sigma)(m\Delta\chi)}\, \Delta\chi \tag{5.123a}$$
and
$$z^{[\,]}_{trunc}(m\Delta\chi, 2D) \cong \sum_{n=0}^{N-1} Z^{[\,]eff}_{trunc}(n\Delta\sigma, 2\sigma_{Nyq})\, e^{2\pi i(n\Delta\sigma)(m\Delta\chi)}\, \Delta\sigma . \tag{5.123b}$$

To put this into the traditional form of the discrete Fourier transforms shown in Eqs. (2.96a) and
(2.96b) in Chapter 2, just multiply both sides of (5.123a) by $\Delta\sigma$ and use $\Delta\sigma\,\Delta\chi = 1/N$ from
(5.122a) to get

$$z_m = \sum_{n=0}^{N-1} Z_n\, e^{2\pi i\, nm/N} \tag{5.124a}$$
and
$$Z_n = \frac{1}{N} \sum_{m=0}^{N-1} z_m\, e^{-2\pi i\, nm/N} , \tag{5.124b}$$
where
$$z_m = z^{[\,]}_{trunc}(m\Delta\chi, 2D) \tag{5.124c}$$
and
$$Z_n = \Delta\sigma\, Z^{[\,]eff}_{trunc}(n\Delta\sigma, 2\sigma_{Nyq}) . \tag{5.124d}$$

It is important to remember, when using the discrete Fourier transforms defined in (5.124a)
through (5.124d) to approximate the integral Fourier transforms in (5.114a) and (5.114b), that
functions $z^{[\,]}_{trunc}$ and $Z^{[\,]eff}_{trunc}$ are qualitatively different from the truncated interferogram signal
$z_{trunc}$ and its associated spectrum $Z^{eff}_{trunc}$ with which we began, because functions $z^{[\,]}_{trunc}$ and
$Z^{[\,]eff}_{trunc}$, unlike $z_{trunc}$ and $Z^{eff}_{trunc}$, are periodic with periods of 2D and $2\sigma_{Nyq}$ respectively. We
also note that the unapodized spectral resolution given in Eq. (5.67) above is, when using the
discrete Fourier transform, the same as the distance between spectral samples given by Eq. (5.119),

$$\Delta\sigma = \frac{1}{2D} .$$

Consequently, the unapodized spectral resolution can be defined very simply and exactly as the
distance between adjacent spectral samples after the discrete Fourier transform is applied to the
sampled interferogram signal. This is one reason why the unapodized spectral resolution has
become such a widespread figure of merit for resolution in Fourier-transform spectroscopy:
when discrete Fourier transforms are used to approximate integral Fourier transforms, it sets an
easily understood limit on how much spectral detail we can hope to resolve.
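As an illustration of the discrete transform pair, the sketch below builds both sums directly and compares them with NumPy's FFT. It assumes the sign convention in which the sum over interferogram samples carries the negative exponent (the convention `numpy.fft.fft` uses); the function names and the random test data are hypothetical.

```python
import numpy as np

def spectrum_from_samples(z):
    """Z_n = (1/N) * sum_m z_m * exp(-2*pi*i*n*m/N)."""
    N = len(z)
    idx = np.arange(N)
    W = np.exp(-2j * np.pi * np.outer(idx, idx) / N)  # N x N matrix of phase factors
    return W @ z / N

def samples_from_spectrum(Z):
    """z_m = sum_n Z_n * exp(+2*pi*i*n*m/N)."""
    N = len(Z)
    idx = np.arange(N)
    W = np.exp(2j * np.pi * np.outer(idx, idx) / N)
    return W @ Z

# Hypothetical sampled interferogram: 16 random real values.
rng = np.random.default_rng(0)
z = rng.standard_normal(16)

Z = spectrum_from_samples(z)
z_back = samples_from_spectrum(Z)

# With this convention, Z equals NumPy's unnormalized FFT divided by N.
matches_numpy = np.allclose(Z, np.fft.fft(z) / len(z))
round_trip_ok = np.allclose(z_back, z)
```

The explicit matrix form costs O(N^2) operations and is shown only to make the two sums visible; in practice the FFT computes the same numbers.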
5.25 Undersampling the Interferogram
When oversampling the interferogram signal in the previous section, we take advantage of the
way $Z^{eff}_{trunc}(\sigma)$ becomes negligibly small for $|\sigma| > \sigma_{max}$ in order to avoid overlapping, or
aliasing, of the spectrum. We do this by requiring that $\sigma_{Nyq}$ end up well to the right of
$\sigma_{max}$ in Fig. 5.43 when creating the periodic function

$$Z^{[\,]eff}_{trunc}(\sigma, 2\sigma_{Nyq}) = \sum_{k=-\infty}^{\infty} Z^{eff}_{trunc}(\sigma - 2k\sigma_{Nyq}) \tag{5.125a}$$

in Eq. (5.113c) above. Consequently $\Delta\chi$, the optical-path difference between adjacent samples of
the interferogram signal $z_{trunc}(\chi)$, must be chosen small enough that, according to formula (5.112),

$$\sigma_{Nyq} = \frac{1}{2\Delta\chi} \tag{5.125b}$$

is decidedly larger than $\sigma_{max}$. Since $Z^{eff}_{trunc}(\sigma)$ also becomes negligibly small for
$|\sigma| < \sigma_{min}$, we may also be able to follow the strategy outlined in Sec. 2.23 of Chapter 2 and
undersample the interferogram signal instead. We now review how undersampling works, explaining
in more detail how to set up the appropriate discrete Fourier transform for an undersampled
interferogram signal.
The first step in undersampling an interferogram signal is to compare the wavenumber
interval $(\sigma_{max} - \sigma_{min})$ to $\sigma_{min}$ to see how many aliases of the original spectrum can be fit between
$\sigma = 0$ and $\sigma = \sigma_{min}$. For the spectrum in Fig. 5.45, we could choose the undersampled Nyquist
wavenumber $\sigma^{(u)}_{Nyq}$ small enough to fit in as many as two aliases, as shown by the dashed curves;
but we decide to be conservative and only fit in one, as shown in Fig. 5.46. This conservative
strategy is called undersampling by a factor of 2.
When undersampling by a factor of 2, the old Nyquist frequency $\sigma_{Nyq}$ and the new Nyquist
frequency $\sigma^{(u)}_{Nyq}$ are related by

$$\sigma_{Nyq} = 2\sigma^{(u)}_{Nyq} . \tag{5.126}$$

Just like

$$\sigma_{Nyq} = \frac{1}{2\Delta\chi} \tag{5.127a}$$

for the old Nyquist frequency and the old sampling interval in Eq. (5.112), we associate with
$\sigma^{(u)}_{Nyq}$ a new sampling interval $\Delta\chi^{(u)}$ such that

$$\sigma^{(u)}_{Nyq} = \frac{1}{2\Delta\chi^{(u)}} . \tag{5.127b}$$

For Eqs. (5.126), (5.127a), and (5.127b) to be true, we must have

$$\Delta\chi^{(u)} = 2\Delta\chi . \tag{5.127c}$$

This is, of course, why what we are doing is called undersampling by a factor of 2; according to
(5.127c), the interferogram signal is to be sampled half as often as before.
In the previous section, we found the sampled interferogram signal $z^{[\,]}_{trunc}$ could be written as
[see Eq. (5.123b)]

$$z^{[\,]}_{trunc}(m\Delta\chi, 2D) \cong \sum_{n=0}^{N-1} Z^{[\,]eff}_{trunc}(n\Delta\sigma, 2\sigma_{Nyq})\, e^{2\pi i(n\Delta\sigma)(m\Delta\chi)}\, \Delta\sigma . \tag{5.128}$$

Note that here $\Delta\chi$, $\Delta\sigma$, $\sigma_{Nyq}$, and N all retain the old oversampled values specified in the previous
section. Assuming that the number of samples N is large, we see that as the index n goes from
zero to $N - 1$, the wavenumber argument $n\Delta\sigma$ of $Z^{[\,]eff}_{trunc}$ goes from zero to

$$(N-1)\Delta\sigma \cong N\Delta\sigma = N(N\Delta\chi)^{-1} = (\Delta\chi)^{-1} = 2\sigma_{Nyq} .$$

Here both $\Delta\sigma = (N\Delta\chi)^{-1}$ from formula (5.122a) and $2\sigma_{Nyq} = (\Delta\chi)^{-1}$ from formula (5.127a) are
used to get the final result. Since

$$(N-1)\Delta\sigma \cong 2\sigma_{Nyq} ,$$

we see that the sum over $Z^{[\,]eff}_{trunc}$ in Eq. (5.128) is over the original oversampled spectrum between
$\sigma = 0$ and $\sigma = \sigma_{Nyq}$ and one of its aliases between $\sigma = \sigma_{Nyq}$ and $\sigma = 2\sigma_{Nyq}$. Suppose the old
Nyquist wavenumber in Eq. (5.128) is replaced by $\sigma^{(u)}_{Nyq}$, half the old Nyquist value, to get

$$z^{[\,]}_{trunc}(m\Delta\chi, 2D) \overset{?}{\cong} \sum_{n=0}^{N-1} Z^{[\,]eff}_{trunc}(n\Delta\sigma, 2\sigma^{(u)}_{Nyq})\, e^{2\pi i(n\Delta\sigma)(m\Delta\chi)}\, \Delta\sigma . \tag{5.129}$$
FIGURE 5.45. [Wavenumber-axis tick labels: $\sigma^{(u)}_{Nyq}$, $2\sigma^{(u)}_{Nyq}$, and $3\sigma^{(u)}_{Nyq} = \sigma_{Nyq}$.]
FIGURE 5.46.

Figure 5.46 shows that the new spectrum $Z^{[\,]eff}_{trunc}(n\Delta\sigma, 2\sigma^{(u)}_{Nyq})$ has twice as many aliases as the
original spectrum $Z^{[\,]eff}_{trunc}(n\Delta\sigma, 2\sigma_{Nyq})$. Comparing the new spectrum in Fig. 5.46 to the original
spectrum in Fig. 5.44, we see that the sum in (5.129) covers two extra aliases in Fig. 5.46 that it
did not cover in Fig. 5.44. The wavenumbers where $Z^{[\,]eff}_{trunc}$ is zero do not, of course, contribute
anything to the sum. Let's see what happens when we eliminate two of the aliases by taking the
sum over the new spectrum only up to the new, rather than the old, Nyquist wavenumber.
According to the discussion following Eq. (5.103a), N is even, which means that formula (5.129)
becomes

$$z^{[\,]}_{trunc}(m\Delta\chi, 2D) \overset{?}{\cong} \sum_{n=0}^{(N/2)-1} Z^{[\,]eff}_{trunc}(n\Delta\sigma, 2\sigma^{(u)}_{Nyq})\, e^{2\pi i(n\Delta\sigma)(m\Delta\chi)}\, \Delta\sigma . \tag{5.130}$$

[Fig. 5.46 wavenumber-axis tick labels: $\sigma^{(u)}_{Nyq}$, $2\sigma^{(u)}_{Nyq} = \sigma_{Nyq}$, $3\sigma^{(u)}_{Nyq}$, and $4\sigma^{(u)}_{Nyq} = 2\sigma_{Nyq}$.
Caption: Solid lines show the position of the original spectrum on the wavenumber axis, and the
unshaded dashed lines show the aliases associated with the original Nyquist wavenumber $\sigma_{Nyq}$.
The shaded dashed lines show the aliases produced by undersampling. They are associated with the
undersampled Nyquist wavenumber $\sigma^{(u)}_{Nyq}$.]

This eliminates altogether the alias between $3\sigma^{(u)}_{Nyq}$ and $4\sigma^{(u)}_{Nyq}$ in Fig. 5.46, which was not part of
the original sum, and replaces the alias between $2\sigma^{(u)}_{Nyq}$ and $3\sigma^{(u)}_{Nyq}$, which was part of the original
sum, with the alias between zero and $\sigma^{(u)}_{Nyq}$. The alias between zero and $\sigma^{(u)}_{Nyq}$ is an exact copy of
the alias between $2\sigma^{(u)}_{Nyq}$ and $3\sigma^{(u)}_{Nyq}$, and these two aliases are separated by a wavenumber interval
[see Eqs. (5.126), (5.127a), and (5.122a)]

$$2\sigma^{(u)}_{Nyq} = \sigma_{Nyq} = \frac{1}{2\Delta\chi} = \frac{1}{2\,\Delta\chi\,\Delta\sigma}\,\Delta\sigma = \frac{N}{2}\,\Delta\sigma .$$
Consequently, we can write that

$$Z^{[\,]eff}_{trunc}\!\left(\left(n + \frac{N}{2}\right)\Delta\sigma,\; 2\sigma^{(u)}_{Nyq}\right) = Z^{[\,]eff}_{trunc}(n\Delta\sigma, 2\sigma^{(u)}_{Nyq}) \tag{5.131a}$$

when comparing spectral values in the alias between zero and $\sigma^{(u)}_{Nyq}$ to spectral values in the alias
between $2\sigma^{(u)}_{Nyq}$ and $3\sigma^{(u)}_{Nyq}$. As far as the complex exponent multiplying $Z^{[\,]eff}_{trunc}$ is concerned, we
have, according to (5.122a), that

$$\begin{aligned}
e^{2\pi i (n + N/2)\Delta\sigma\,(m\Delta\chi)}
&= e^{2\pi i(n\Delta\sigma)(m\Delta\chi)}\, e^{2\pi i m (N/2)(\Delta\sigma\,\Delta\chi)} \\
&= e^{2\pi i(n\Delta\sigma)(m\Delta\chi)}\, e^{2\pi i m (N/2)(1/N)} \\
&= e^{2\pi i(n\Delta\sigma)(m\Delta\chi)}\, e^{2\pi i (m/2)} .
\end{aligned}$$

Hence, whenever m is even, we have

$$e^{2\pi i (m/2)} = 1$$

so that

$$e^{2\pi i (n + N/2)\Delta\sigma\,(m\Delta\chi)} = e^{2\pi i(n\Delta\sigma)(m\Delta\chi)} .$$

Suppose we add a subscript 2 to m to show that it must be a non-negative and even integer,

$$m_2 = 0, 2, 4, \ldots . \tag{5.131b}$$

This lets us write the latest result as

$$e^{2\pi i (n + N/2)\Delta\sigma\,(m_2\Delta\chi)} = e^{2\pi i(n\Delta\sigma)(m_2\Delta\chi)} . \tag{5.131c}$$

Consequently, we can combine (5.131a) and (5.131c) to get

$$Z^{[\,]eff}_{trunc}\!\left(\left(n + \frac{N}{2}\right)\Delta\sigma,\; 2\sigma^{(u)}_{Nyq}\right) e^{2\pi i (n + N/2)\Delta\sigma\,(m_2\Delta\chi)}
 = Z^{[\,]eff}_{trunc}(n\Delta\sigma, 2\sigma^{(u)}_{Nyq})\, e^{2\pi i(n\Delta\sigma)(m_2\Delta\chi)} . \tag{5.131d}$$

This shows that, whenever $m = m_2 = $ a non-negative even integer, each term in the original sum
over the alias in Fig. 5.46 between $2\sigma^{(u)}_{Nyq}$ and $3\sigma^{(u)}_{Nyq}$ is the same as the corresponding term in the
new sum over the alias in Fig. 5.46 between zero and $\sigma^{(u)}_{Nyq}$. Therefore, whenever the index m is a
non-negative even integer, we can remove the question mark from formula (5.130) and write

$$z^{[\,]}_{trunc}(m_2\Delta\chi, 2D) \cong \sum_{n=0}^{(N/2)-1} Z^{[\,]eff}_{trunc}(n\Delta\sigma, 2\sigma^{(u)}_{Nyq})\, e^{2\pi i(n\Delta\sigma)(m_2\Delta\chi)}\, \Delta\sigma ,$$

where we have replaced m by $m_2$ on the left-hand side to honor the restriction placed on the
permitted values of the index. If we define an undersampled value of N,

$$N^{(u)} = N/2 , \tag{5.132a}$$

then the formula becomes

$$z^{[\,]}_{trunc}(m_2\Delta\chi, 2D) \cong \sum_{n=0}^{N^{(u)}-1} Z^{[\,]eff}_{trunc}(n\Delta\sigma, 2\sigma^{(u)}_{Nyq})\, e^{2\pi i(n\Delta\sigma)(m_2\Delta\chi)}\, \Delta\sigma . \tag{5.132b}$$

We note that the $m_2$ sequence of non-negative even integers can be written as

$$m_2 = 2m \quad \text{for } m = 0, 1, 2, \ldots .$$

Hence, using that $\Delta\chi^{(u)} = 2\Delta\chi$ from Eq. (5.127c), we see that

$$m_2\Delta\chi = 2m\Delta\chi = m\Delta\chi^{(u)} .$$

Equation (5.132b) can now be written as

$$z^{[\,]}_{trunc}(m\Delta\chi^{(u)}, 2D) \cong \sum_{n=0}^{N^{(u)}-1} Z^{[\,]eff}_{trunc}(n\Delta\sigma, 2\sigma^{(u)}_{Nyq})\, e^{2\pi i(n\Delta\sigma)(m\Delta\chi^{(u)})}\, \Delta\sigma . \tag{5.132c}$$

This gives one of the two formulas for the discrete Fourier transform of the undersampled
interferogram signal.
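To get the other formula, we multiply both sides of (5.132c) by

$$e^{-2\pi i(n'\Delta\sigma)(m\Delta\chi^{(u)})}$$

and sum over m. This gives

$$\sum_{m=0}^{N^{(u)}-1} e^{-2\pi i(n'\Delta\sigma)(m\Delta\chi^{(u)})}\, z^{[\,]}_{trunc}(m\Delta\chi^{(u)}, 2D)
 \cong \sum_{n=0}^{N^{(u)}-1} Z^{[\,]eff}_{trunc}(n\Delta\sigma, 2\sigma^{(u)}_{Nyq})\, \Delta\sigma \sum_{m=0}^{N^{(u)}-1} e^{2\pi i (n - n')\Delta\sigma\,(m\Delta\chi^{(u)})} . \tag{5.133}$$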

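We note, using Eqs. (5.127c) and (5.122a), that

$$\Delta\sigma\,\Delta\chi^{(u)} = 2\,\Delta\sigma\,\Delta\chi = \frac{2}{N} = \frac{1}{N^{(u)}} \tag{5.134a}$$

with the last step using definition (5.132a). Therefore,

$$\sum_{m=0}^{N^{(u)}-1} e^{2\pi i (m\Delta\chi^{(u)})(n\Delta\sigma - n'\Delta\sigma)}
 = \sum_{m=0}^{N^{(u)}-1} e^{2\pi i m (n - n')/N^{(u)}}
 = \sum_{m=0}^{N^{(u)}-1} \left(w_{N^{(u)}}\right)^{m(n - n')}$$

with $w_{N^{(u)}}$ given by Eq. (2.94a) of Chapter 2 as

$$w_{N^{(u)}} = e^{2\pi i / N^{(u)}} .$$

According to Eqs. (2.94d) and (2.94g) of Chapter 2,

$$\sum_{m=0}^{N^{(u)}-1} \left(w_{N^{(u)}}\right)^{m(n - n')} = N^{(u)}\, \delta_{n,n'} ,$$

and so

$$\sum_{m=0}^{N^{(u)}-1} e^{2\pi i (m\Delta\chi^{(u)})(n\Delta\sigma - n'\Delta\sigma)} = N^{(u)}\, \delta_{n,n'} . \tag{5.134b}$$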

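Substitution of (5.134b) into (5.133) gives

$$\sum_{m=0}^{N^{(u)}-1} e^{-2\pi i(n'\Delta\sigma)(m\Delta\chi^{(u)})}\, z^{[\,]}_{trunc}(m\Delta\chi^{(u)}, 2D)
 \cong N^{(u)}\, \Delta\sigma \sum_{n=0}^{N^{(u)}-1} Z^{[\,]eff}_{trunc}(n\Delta\sigma, 2\sigma^{(u)}_{Nyq})\, \delta_{n,n'}$$

or

$$Z^{[\,]eff}_{trunc}(n'\Delta\sigma, 2\sigma^{(u)}_{Nyq}) \cong \frac{1}{N^{(u)}\Delta\sigma} \sum_{m=0}^{N^{(u)}-1} z^{[\,]}_{trunc}(m\Delta\chi^{(u)}, 2D)\, e^{-2\pi i(n'\Delta\sigma)(m\Delta\chi^{(u)})} . \tag{5.135a}$$

Dropping the primes from n and using formula (5.134a) to write

$$\Delta\chi^{(u)} = \frac{1}{N^{(u)}\Delta\sigma} , \tag{5.135b}$$

Eq. (5.135a) becomes

$$Z^{[\,]eff}_{trunc}(n\Delta\sigma, 2\sigma^{(u)}_{Nyq}) \cong \sum_{m=0}^{N^{(u)}-1} z^{[\,]}_{trunc}(m\Delta\chi^{(u)}, 2D)\, e^{-2\pi i(n\Delta\sigma)(m\Delta\chi^{(u)})}\, \Delta\chi^{(u)} . \tag{5.135c}$$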

Having now found the second formula for the discrete Fourier transform of the undersampled
signal, we gather together Eqs. (5.132c) and (5.135c) to write

$$Z^{[\,]eff}_{trunc}(n\Delta\sigma, 2\sigma^{(u)}_{Nyq}) \cong \sum_{m=0}^{N^{(u)}-1} z^{[\,]}_{trunc}(m\Delta\chi^{(u)}, 2D)\, e^{-2\pi i(n\Delta\sigma)(m\Delta\chi^{(u)})}\, \Delta\chi^{(u)} \tag{5.136a}$$
and
$$z^{[\,]}_{trunc}(m\Delta\chi^{(u)}, 2D) \cong \sum_{n=0}^{N^{(u)}-1} Z^{[\,]eff}_{trunc}(n\Delta\sigma, 2\sigma^{(u)}_{Nyq})\, e^{2\pi i(n\Delta\sigma)(m\Delta\chi^{(u)})}\, \Delta\sigma . \tag{5.136b}$$

This pair of equations has the exact same form as the pair of equations specifying the discrete
Fourier transform for the oversampled signal in Eqs. (5.123a) and (5.123b), with

$$\Delta\chi \rightarrow \Delta\chi^{(u)} , \qquad N \rightarrow N^{(u)} , \qquad \text{and} \qquad \sigma_{Nyq} \rightarrow \sigma^{(u)}_{Nyq} .$$

This shows we can sample the interferogram signal $z_{trunc}$ with double the sampling interval used
in the previous section (that is, undersample by a factor of 2) and plug the resulting $z_{trunc}$ values
into formula (5.136a) to get

$$Z^{[\,]eff}_{trunc}(n\Delta\sigma, 2\sigma^{(u)}_{Nyq}) .$$

Knowing that the wavenumber interval $\Delta\sigma$ has not changed from what it was before, and that
the aliases of $Z^{eff}_{trunc}$ do not overlap when undersampled by a factor of 2, we can now use the
correspondences shown in Fig. 5.46 to extract the true spectral values between $-2\sigma^{(u)}_{Nyq}$ and
$-\sigma^{(u)}_{Nyq}$, and between $\sigma^{(u)}_{Nyq}$ and $2\sigma^{(u)}_{Nyq}$.

When oversampling the interferogram signal $z_{trunc}$ in the previous section, N interferogram
samples are used to find the spectrum $Z^{eff}_{trunc}$; and in this section, when undersampling the
interferogram signal by a factor of 2, only $N^{(u)} = N/2$ interferogram samples are needed to get
the same information. When the spectrum to be measured is narrow enough for this sort of
undersampling to make sense, it can lead to significant savings in data storage and calculation
time for the discrete Fourier transform. The drawback is, as shown by the discussion at the end of
Sec. 6.22 of Chapter 6, that we may end up with an increased level of low-frequency noise in the
measured spectrum.
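Undersampling by a factor of 2 can be illustrated numerically. The sketch below is not the procedure of this section carried out on real data; it uses a hypothetical, randomly generated spectrum confined to the upper half of the oversampled band, keeps every second interferogram sample, and checks that the half-length transform returns the in-band spectral values unchanged. NumPy's sign convention (negative exponent in the forward transform) is assumed.

```python
import numpy as np

N = 64            # oversampled number of samples (even), hypothetical
Nu = N // 2       # undersampled number of samples, N^(u) = N/2

# Hypothetical spectrum samples Z_n, n = 0..N-1, nonzero only for
# N/4 < n < N/2 (between the undersampled Nyquist wavenumber and twice
# that value), plus the mirror-image alias that makes the signal real.
Z = np.zeros(N, dtype=complex)
band = np.arange(N // 4 + 2, N // 2 - 2)
rng = np.random.default_rng(1)
Z[band] = rng.standard_normal(band.size) + 1j * rng.standard_normal(band.size)
Z[N - band] = np.conj(Z[band])

# Oversampled interferogram: z_m = sum_n Z_n exp(+2*pi*i*n*m/N)
z = np.fft.ifft(Z) * N

# Undersample: keep only even m, i.e. double the sampling interval.
z_u = z[::2]

# Half-length transform of the undersampled signal.
Z_u = np.fft.fft(z_u) / Nu

# Each undersampled bin is the sum of the two oversampled bins N/2 apart.
folding_ok = np.allclose(Z_u, Z[:Nu] + Z[Nu:])
# Inside the occupied band nothing overlaps, so the values are recovered.
band_ok = np.allclose(Z_u[band], Z[band])
```

If the occupied band were widened so that it and its mirror image landed on the same half-length bins, `band_ok` would fail, which is the overlap (aliasing) that the choice of undersampling factor has to avoid.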
5.26 Off-Center Sampling of the Interferogram Signal
When analyzing the sampled interferogram signal in Secs. 5.22 through 5.25, we said the
interferogram samples occurred at optical-path differences $m\Delta\chi$ for

$$m = -\frac{N}{2}+1,\; -\frac{N}{2}+2,\; \ldots,\; -1,\; 0,\; 1,\; \ldots,\; \frac{N}{2}-1,\; \frac{N}{2}$$

(see footnote 93 above). The problem with this is that it assumes the interferogram signal is
sampled at exactly $\chi = 0$ when $m = 0$, as shown in Fig. 5.47(a). In practice, however, it is very
hard to sample the interferogram at exactly $\chi = 0$; often the sample nearest $\chi = 0$ is located a
large fraction of a sampling interval $\Delta\chi$ away from $\chi = 0$. We call this fraction of a sampling
interval $\varepsilon$, with

$$-\tfrac{1}{2} \le \varepsilon \le \tfrac{1}{2} ,$$

which means that the peak of $z_{trunc}$ is located at $\chi = \varepsilon\Delta\chi$, as shown in Fig. 5.47(b).

FIGURE 5.47(a).
FIGURE 5.47(b).

Mathematically, this can be regarded as a displacement of $z_{trunc}(\chi)$, the interferogram signal as
defined above in Eq. (5.106a), along the $\chi$ axis by a distance $\varepsilon\Delta\chi$. The displaced interferogram
signal can be written as

$$z^{(\varepsilon)}_{trunc}(\chi) = z_{trunc}(\chi - \varepsilon\Delta\chi) . \tag{5.137a}$$

Glancing back at Eq. (5.108a), we see that the new effective signal spectrum is

$$Z^{(\varepsilon)eff}_{trunc}(\sigma) = \int_{-\infty}^{\infty} z^{(\varepsilon)}_{trunc}(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi
 = \int_{-\infty}^{\infty} z_{trunc}(\chi - \varepsilon\Delta\chi)\, e^{-2\pi i\sigma\chi}\, d\chi . \tag{5.137b}$$

Transforming the variable of integration to $\chi' = \chi - \varepsilon\Delta\chi$ gives

$$Z^{(\varepsilon)eff}_{trunc}(\sigma) = \int_{-\infty}^{\infty} z_{trunc}(\chi')\, e^{-2\pi i\sigma\chi'}\, e^{-2\pi i\sigma\varepsilon\Delta\chi}\, d\chi'
 = e^{-2\pi i\sigma\varepsilon\Delta\chi} \int_{-\infty}^{\infty} z_{trunc}(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi ,$$

where in the last step the prime is dropped from the variable of integration. Substituting from Eq.
(5.108a), we see that

$$Z^{(\varepsilon)eff}_{trunc}(\sigma) = e^{-2\pi i\sigma\varepsilon\Delta\chi}\, Z^{eff}_{trunc}(\sigma) . \tag{5.137c}$$

Since $\varepsilon\Delta\chi$ is a small quantity, the effect of shifting $z_{trunc}$ by a distance $\varepsilon\Delta\chi$ along the $\chi$ axis is
to multiply the original signal spectrum $Z^{eff}_{trunc}$ by a slowly varying, complex function of $\sigma$. There
is nothing profound about this result; it is just an example of the Fourier shift theorem given in
Eq. (2.36a) of Chapter 2. From the discussion following Eq. (5.95c) above, we know that
multiplying the original effective signal spectrum by another complex function of $\sigma$ does not
change the way the calibration procedure extracts the desired spectral radiance $L(\sigma)$, as long as
the complex function does not change after the instrument is calibrated. Hence, as long as $\varepsilon$ is a
true constant, having the same value each time the moving mirror scans through its range of
motion, the extra factor of $e^{-2\pi i\sigma\varepsilon\Delta\chi}$ in Eq. (5.137c) can be removed by calibration. Since
$e^{-2\pi i\sigma\varepsilon\Delta\chi}$ is a slowly varying function of $\sigma$, we can even, as described in footnote 88 above,
use the type of single-sided system discussed in Sec. 5.18 to measure the spectral radiance $L(\sigma)$.
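The shift-theorem result can be checked with a short numerical sketch. Everything here is hypothetical: the test signal, the offset fraction `eps`, and the unit sampling interval are illustrative choices, and NumPy's sign convention (negative exponent in the forward transform) is assumed, so the phase factor appears with a negative exponent.

```python
import numpy as np

N = 64
delta_chi = 1.0                      # hypothetical sampling interval
eps = 0.3                            # hypothetical fraction of a sampling interval
wavenumbers = np.fft.fftfreq(N, d=delta_chi)   # signed sample wavenumbers

def z(chi):
    """Hypothetical band-limited test signal, periodic over N samples."""
    return (1.0
            + 0.8 * np.cos(2 * np.pi * 3 * chi / (N * delta_chi))
            + 0.5 * np.cos(2 * np.pi * 7 * chi / (N * delta_chi) + 0.4))

m = np.arange(N)
Z_centered = np.fft.fft(z(m * delta_chi))                    # sampled on the grid
Z_shifted = np.fft.fft(z(m * delta_chi - eps * delta_chi))   # signal displaced by eps*delta_chi

# Displacing the signal multiplies its spectrum by a linear phase factor.
phase = np.exp(-2j * np.pi * wavenumbers * eps * delta_chi)
shift_theorem_ok = np.allclose(Z_shifted, phase * Z_centered)

# The magnitude of the spectrum is unaffected by the displacement.
magnitude_ok = np.allclose(np.abs(Z_shifted), np.abs(Z_centered))
```

Because the factor has unit magnitude and a phase that is linear in wavenumber, a fixed `eps` changes only the phase of each spectral sample, which is the property that lets calibration remove it.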

__________

Interferometer systems nonrandomly distort their spectral measurements in characteristic
ways. Background radiances and complex modulations can be removed by calibration, and
careful assembly can minimize the effects of nonflat optical surfaces and static misalignments.
We can enjoy without any reservations the high-resolution advantages of single-sided
interferogram systems when double-sided measurements confirm that the e
i
phase term
[introduced in Eq. (5.84a)] is a slowly varying function of wavenumber $\sigma$. There is no way,
however, to avoid the nonrandom distortions and errors introduced by the finite interferogram
length and finite field of view of practical interferometer systems. The effect of the finite
field of view is to blur the true spectral radiance L while at the same time shrinking the
wavenumber axis by a factor of $(1 + \Delta\Omega/(4\pi))$. This blurred and shifted spectral radiance is
called $L_{FOV}$. The effect of the finite interferogram length is to blur by convolution with a sinc
function, as shown in Eq. (5.108d) above. When the spectral radiance is distorted both by the
interferometer's finite field of view and by its finite interferogram length, we call it $L_{mnf}$. We plan
to keep track of these distinctions between true radiances, FOV radiances, and mnf radiances
when the random spectral errors produced by detector noise, misalignment noise, and sampling
noise are discussed in the next three chapters.

Appendix 5A
The detector circuit of a Fourier-transform spectrometer is a time-invariant linear system. If g(t)
is the input signal as a function of time going into the linear system, then the output signal k(t)
can always be written as

$$k(t) = \int_{-\infty}^{\infty} g(t')\, h(t - t')\, dt' , \tag{5A.1a}$$

where h(t) is a continuous function of time specifying how the input signal is modified by passing
through the circuit. The explicit limits on the integral expression for output k(t) are $+\infty$ and $-\infty$,
but in practice we always assume that the input signal g(t) is time limited, with the true limits on
the integral being set by the finite range of t over which g(t) is not zero. Function h(t) is often
called the impulse-response function of the linear circuit, because when the input is a delta
function impulse (see Sec. 2.14 of Chapter 2),

$$g(t) = \delta(t) ,$$

then the output is h(t):

$$k(t) = \int_{-\infty}^{\infty} \delta(t')\, h(t - t')\, dt' = h(t) . \tag{5A.1b}$$

As a general rule, we expect h(t) to be a much narrower function of time than g(t), which means
that output k(t) can be regarded as just a slightly blurred and distorted version of the input g(t).
According to Eq. (2.38a) of Chapter 2, Eq. (5A.1a) states that output k is the convolution of h
and g,

$$k(t) = g(t) * h(t) . \tag{5A.2a}$$

The convolution is a linear operation, so when the input signal is the linear combination of two
functions $g_1$ and $g_2$, with

$$g(t) = \alpha g_1(t) + \beta g_2(t)$$

for two real constants $\alpha$ and $\beta$, then the resulting output is $\alpha$ multiplied by the output that would
occur if only $g_1$ were present plus $\beta$ multiplied by the output that would occur if only $g_2$ were
present [see Eq. (2.38e) in Chapter 2],

$$g(t) * h(t) = [\alpha g_1(t) + \beta g_2(t)] * h(t) = \alpha\,[g_1(t) * h(t)] + \beta\,[g_2(t) * h(t)] . \tag{5A.2b}$$

Therefore, if we know the output of the circuit for input $g_1$ and the output of the circuit for input
$g_2$, we know at once the output of the circuit for an input $[\alpha g_1(t) + \beta g_2(t)]$. In particular, taking
$\alpha = \beta = 1$ in (5A.2b), we see that if the output of the circuit for an input $g_1(t)$ is

$$g_1^{(out)}(t) = g_1(t) * h(t) ,$$

and the output of the circuit for an input $g_2(t)$ is

$$g_2^{(out)}(t) = g_2(t) * h(t) ,$$

then the output of the circuit when the input is

$$g_1(t) + g_2(t)$$

must be the sum of the individual signals' outputs,

$$g_1^{(out)}(t) + g_2^{(out)}(t) .$$

Glancing back to Eq. (2.40a) in Chapter 2 and the discussion following it, we note that the input
signal g(t) in Eq. (5A.2a) plays the role of u(t) in (2.40a), that the output signal k(t) plays the role
of $u_{e,blur}(t)$ in (2.40a), and that the impulse-response function h(t) plays the role of the
instrument-response function $v_e(t)$ in (2.40a). In fact we already know from the discussion
following Eq. (2.40a) that the correct way to handle Eq. (5A.2a) is to take the Fourier transform
of both sides and then apply the Fourier convolution theorem [see Eq. (2.39b) of Chapter 2] to get

$$K(f) = G(f)\, H(f) , \tag{5A.3a}$$
where
$$K(f) = \int_{-\infty}^{\infty} k(t)\, e^{-2\pi i f t}\, dt , \tag{5A.3b}$$
$$G(f) = \int_{-\infty}^{\infty} g(t)\, e^{-2\pi i f t}\, dt , \tag{5A.3c}$$
and
$$H(f) = \int_{-\infty}^{\infty} h(t)\, e^{-2\pi i f t}\, dt . \tag{5A.3d}$$

The Fourier transform H(f) of the impulse-response function is often called the transfer
function of the linear circuit. The formula shown in Eq. (5A.3a) is often the easiest way to find
the output k(t) corresponding to a given input g(t). We first calculate G(f), the Fourier transform
of input g(t), then multiply G(f) by the transfer function H(f) to get K(f), the Fourier transform
of the output. Having found K(f), we then take the inverse Fourier transform to get output k(t),

$$k(t) = \int_{-\infty}^{\infty} K(f)\, e^{2\pi i f t}\, df . \tag{5A.4}$$
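The two routes to the circuit output, direct convolution and multiplication of Fourier transforms, can be compared numerically. The sketch below is illustrative only: the exponential impulse response, its time constant `tau`, and the two input signals are hypothetical stand-ins, not a model of any particular detector circuit.

```python
import numpy as np

dt = 0.01
t = np.arange(0, 4, dt)                       # time grid starting at t = 0
tau = 0.05                                    # hypothetical circuit time constant
h = np.exp(-t / tau) / tau                    # hypothetical causal impulse response
g1 = np.exp(-((t - 1.0) / 0.2) ** 2)          # hypothetical input 1
g2 = np.sin(2 * np.pi * 1.5 * t) * (t < 2)    # hypothetical input 2

def output_by_convolution(g):
    """k(t) = integral of g(t') h(t - t') dt', approximated by a Riemann sum."""
    return np.convolve(g, h)[: len(t)] * dt

def output_by_transfer_function(g):
    """Same output from K = G * H, zero-padded so the FFT's circular
    convolution agrees with the linear convolution above."""
    n = 2 * len(t)
    K = np.fft.fft(g, n) * np.fft.fft(h, n)
    return np.fft.ifft(K).real[: len(t)] * dt

k_direct = output_by_convolution(g1)
k_fourier = output_by_transfer_function(g1)
routes_agree = np.allclose(k_direct, k_fourier)

# Linearity: the output for 2*g1 + 3*g2 is 2*(output for g1) + 3*(output for g2).
combined = output_by_convolution(2 * g1 + 3 * g2)
linearity_ok = np.allclose(
    combined, 2 * output_by_convolution(g1) + 3 * output_by_convolution(g2)
)
```

The zero-padding matters: without it the FFT product corresponds to a circular convolution, and the tail of the response would wrap around to the start of the record.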

Although the impulse-response function h(t) in Eq. (5A.2a) plays the same role as the
instrument-response function $v_e(t)$ in Eq. (2.40a) of Chapter 2, there is one important difference. A linear
circuit is a causal system, which means that its output signal k(t) cannot start happening before
the input signal g(t) occurs. Consequently, the circuit's impulse-response function h(t) must
satisfy the restriction

$$h(t) = 0 \quad \text{for } t < 0 . \tag{5A.5}$$

Suppose, for example, we supply a delta function at $t = 0$, the impulse signal $g(t) = \delta(t)$, for the
circuit's input. Then, according to Eq. (5A.1b), the circuit's output is

$$k(t) = h(t) ;$$

and if, for some $t < 0$, we have $h(t) \ne 0$, then there will be some part of the circuit's output
signal being produced at $t < 0$ before its cause, the input delta function at $t = 0$, has occurred.
This is why the impulse-response function of a causal linear system, unlike $v_e(t)$ in Fig. 2.5(f) of
Chapter 2, must satisfy (5A.5).
Because h(t) is a nonzero function that must nevertheless be zero for negative values of t, it
cannot be an even function,$^{95}$

$$h(-t) \ne h(t) . \tag{5A.6a}$$

$^{95}$ See Eq. (2.11a) of Chapter 2 for a definition of what it means to say a function is even.

The transfer function H(f) is, according to Eq. (5A.3d), the Fourier transform of the real impulse-
response function h(t), which means, according to entry 7 of Table 2.1 of Chapter 2, that H is a
Hermitian function of f,

$$H(-f) = H(f)^{*} . \tag{5A.6b}$$

If H were a real function of f, then it would also need to be even in order to satisfy (5A.6b).
According to entry 1 of Table 2.1, however, function H(f) can be both real and even only when
h(t) is both real and even. Since, according to (5A.6a), we know that h(t) is not even, we conclude
that H, although Hermitian, cannot be real. We can directly verify this conclusion by using
$e^{i\phi} = \cos\phi + i\sin\phi$ to break the Fourier transform of the real impulse-response function h(t) into
real and imaginary parts,

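$$\begin{aligned}
H(f) = \int_{-\infty}^{\infty} h(t)\, e^{-2\pi i f t}\, dt
&= \int_{-\infty}^{\infty} h(t) \cos(2\pi f t)\, dt - i \int_{-\infty}^{\infty} h(t) \sin(2\pi f t)\, dt \\
&= \int_{0}^{\infty} h(t) \cos(2\pi f t)\, dt - i \int_{0}^{\infty} h(t) \sin(2\pi f t)\, dt .
\end{aligned} \tag{5A.7a}$$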

The last step here uses the restriction in (5A.5) to limit the sine and cosine integrals to
non-negative values of t. Because the sine integral in particular is limited to non-negative values of t,
we note that the imaginary part of the transfer function,

$$\mathrm{Im}[H(f)] = -\int_{0}^{\infty} h(t) \sin(2\pi f t)\, dt , \tag{5A.7b}$$

can be zero for all values of f only if h(t) is zero for all non-negative values of t. Since we
already know that h is zero for all negative values of t, it follows that h would be zero everywhere.
This is an unacceptable impulse-response function, confirming our previous assertion that the
transfer function H(f) of the detector circuit cannot be a real-valued function.
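A simple numerical sketch makes the conclusion concrete. It assumes a hypothetical exponential impulse response with time constant `tau` (an illustrative choice, not a circuit from the text) and the convention in which the transform carries a negative exponent; the integrals over non-negative t are approximated by Riemann sums.

```python
import numpy as np

dt = 1e-3
t = np.arange(0, 1, dt)            # non-negative times only: h(t) = 0 for t < 0
tau = 0.05                         # hypothetical time constant
h = np.exp(-t / tau) / tau         # hypothetical causal impulse response

def H(f):
    """Transfer function from cosine and sine integrals over t >= 0."""
    re = np.sum(h * np.cos(2 * np.pi * f * t)) * dt
    im = -np.sum(h * np.sin(2 * np.pi * f * t)) * dt
    return complex(re, im)

f0 = 1.0 / (2 * np.pi * tau)       # frequency at which 2*pi*f*tau = 1
H_plus = H(f0)
H_minus = H(-f0)

# For this h(t) the continuous transform is 1/(1 + 2*pi*i*f*tau), so the
# imaginary part at f0 should be close to -0.5, clearly nonzero.
imag_is_nonzero = abs(H_plus.imag) > 0.4
# Hermitian symmetry: H(-f) equals the complex conjugate of H(f).
hermitian_ok = abs(H_minus - H_plus.conjugate()) < 1e-12
```

The imaginary part vanishes at f = 0, as Hermitian symmetry requires, but not at other frequencies, so this causal response has a complex transfer function.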
Appendix 5B
This appendix shows how to simplify Eq. (5.76a) in Sec. 5.17 of Chapter 5. We start off with

( ) ( )
2 1
2 4
ma
( )
M H( ) sinc
4 2
eff
i
i
W
d e d S R u e

=

Z

(5B.1a)

and note that the integral over d can be moved inside to get

( ) ( ) ( )
( ) ( ) ( )
0
2 [ (1 )]
ma
2 [ (1 )]
ma
0
( )
M H( ) sinc 2
4
M H( ) sinc 2 ,
4
eff
i
i
W
d S R u d e
W
d S R u d e

=

+

Z

(5B.1b)
where

4
= (5B.1c)

and the integral over d is, for future convenience, divided into two integralsone from to
zero and one from zero to . For any reasonable interferometer design, the field of view (in
steradians, of course) is small compared to 4 , so

0 1 < << . (5B.1d)

From Eq. (5A.6b) in Appendix 5A, we know that the transfer function H(f) is Hermitian,

$$H(-f) = H(f)^{*} . \tag{5B.2a}$$

From Eq. (5.46d) in Chapter 5, we know that

$$H(0) = 0 . \tag{5B.2b}$$

We can write the complex transfer function H() as

( )
( ) ( )
i f
H f f e
= , (5B.2c)

where both and are real functions of . The same sort of reasoning used to derive Eqs.
(5.86a) and (5.86b) in Chapter 5 can be used here to analyze ( ) f and ( ) f . Substituting
(5B.2c) into (5B.2a) gives, since both and are real,

( ) ( )
( ) ( )
i f i f
f e f e

= .

Taking the complex magnitude of both sides shows to be an even function of ,

( ) ( ) f f = . (5B.2d)
To match Eq. (5B.2b), we require
(0) 0 = . (5B.2e)
Now we can write

( ) ( )
( ) ( )
i f i f
f e f e

= or
( ) ( ) i f i f
e e

=

and take the complex logarithm of both sides to get

( ) ( ) f f = , (5B.2f)

showing to be an odd function of . Because both Eqs. (5B.2d) and (5B.2e) must be true, we
conclude that not only is ( ) f equal to zero at 0 f = but also that the derivative of ( ) f with
respect to is zero at 0 f = . The point of this analysis is revealed when we substitute formula
(5B.2c) into the two integrals on the right-hand side of (5B.1b) to get

( ) ( ) ( )
0
( ) 2 [ (1 )]
ma
( ) M sinc 2
4
i u i
W
d u S R e d e

and
( ) ( ) ( )
( ) 2 [ (1 )]
ma
0
( ) M sinc 2
4
i u i
W
d u S R e d e

.

Changing the variable of integration of the inner integral to = with d d = , we can
write these two integrals as

( ) ( ) ( )
0
( ) 2 [( / ) (1 )]
ma
( ) M sinc 2
4
i u i
W d
u S R e d e

and
( ) ( ) ( )
( ) 2 [( / ) (1 )]
ma
0
( ) M sinc 2
4
i u i
W d
u S R e d e
u o r o o o
o
o o o r o
o
+

,

where, in the first integral over d , we note that is negative when is positive because
0 o < . Ordinarily we might worry about the singularity at 0 o in the outside integrals over
do , but since both and its derivative are zero at 0 o , it follows that ( ) / uo o A must be
zero at 0 o and very small near 0 o . Consequently, both of these integrals are well-defined,
and, replacing the
i
e
u
A product by the transfer function H, we can write Eq. (5B.1b) as

( ) ( ) ( )
( ) ( ) ( )
0
2 [( / ) (1 )]
ma
2 [( / ) (1 )]
ma
0
( )
M H( ) sinc 2
4
M H( ) sinc 2
4
eff
i
i
W d
S R u d e
W d
S R u d e
r o o o
r o o o
o
o
o o o r o
o
o
o o o r o
o

Z

.
(5B.3)

Equation (2.108a) in Chapter 2 can be written as, after replacing F by ,

2
1
sinc(2 ) ( , )
2
itf
t e dt f
r
r o o
o
, (5B.4a)
where

1 for
( , ) 1/ 2 for 0
0 for
f
f f
f
o
o
o
<
>

. (5B.4b)

This definition of function H is the same as that in Eq. (2.56c) of Chapter 2, and the definition of
the sinc function is given in Eq. (2.106d) of Chapter 2.
Applying Eq. (5B.4a) to (5B.3) gives (here plays the role of t)

( ) ( )
( ) ( )
0
ma
ma
0
1
( ) M H( ) (1 ),
8
1
M H( ) (1 ),
8
eff
W
S R u d
W
S R u d
o
o o o o o o o
o o o
o
o o o o o o
o o o

H

+ H

Z
.
(5B.5a)

From Eq. (5B.4b), it follows that H is zero unless
f o <=
(1 )

or
1 2 1
.

Since, according to (5B.1d), 0 1 < << , we realize that 1 2 is positive. Therefore, if is
positive then must also be positive, and if is negative then must also be negative. Hence
this double inequality can be written as

1
1
1 2

or

for 0
1 2
for 0
1 2

>

<

. (5B.5b)

According to the discussion immediately preceding Eq. (5B.3), the quantity

( ) ( )
ma
1
M H( ) S R u

is very small or zero when is near or at zero, which means that the region around 0 = cannot
contribute significantly to either integral in Eq. (5B.5a). Therefore, in the first integral between
and zero, we can think of as always negative, and in the second integral between zero and
we can think of as always positive. According to (5B.5b), then, the first integral can be
nonzero only when 0 < and the second integral can be nonzero only when 0 > . This means
that Eq. (5B.5a) can be written as

( ) ( )
( ) ( )
ma
/(1 2 )
/(1 2 )
ma
1
M H( ) for 0
8
( )
1
M H( ) for 0
8
eff
W
S R u d
W
S R u d
<

>

Z

.

Changing the variable of integration in the top integral to = gives


( ) ( )[ ]
( ) ( )
/(1 2 )
ma
/(1 2 )
ma
1
M H( ) for 0
8
1
( ) M H( ) for 0
8
eff
W
S R u d
W
S R u d
o o
o
o o
o
o o o o o
o o
o o o o o o
o o
<
>
Z

,
where
( ) ( ) S S o o , (5B.6a)

( ) ( )
ma ma
M M R R o o , (5B.6b)
and
H( ) [H( )] u u o o

(5B.6c)

from Eqs. (5.39a) in Chapter 5, (5.10f) in Chapter 5, and (5A.6b) in Appendix 5A respectively.
Since it makes no difference whether we label the variable of integration or , we can now
write, remembering that /(4 ) o r AO from Eq. (5B.1c),

( ) ( )[ ]
( ) ( )
1
1
/[1 (2 ) ]
ma
/[1 (2 ) ]
ma
1
M H( ) for 0
2
( )
1
M H( ) for 0
2
eff
W
S R u d
W
S R u d
o r
o
o r
o
r
o o o o o
o
o
r
o o o o o
o
AO
AO
<
AO
>
AO
Z

. (5B.7a)

Since 2 /(2 ) o r AO is small compared to one, we have

1 1 1
[1 (2 ) ] [1 (2 ) ] o r o r

AO e + AO ,

so that Eq. (5B.7a) becomes

( ) ( )[ ]
( ) ( )
1
1
(2 )
ma
(2 )
ma
2 1
M H( ) for 0
4
2 1
( ) M H( ) for 0
4
eff
W
S R u d
W
S R u d
o o r
o
o o r
o
r
o o o o o
o
r
o o o o o o
o
+AO
+AO

<

AO

e >

AO

Z

.
(5B.7b)

Because only changes from o to
( )
eff
o e Z

1 1
(2 ) (1 (2 ) )

+ = +

inside the top and bottom integrals, we can, remembering from Eqs. (5B.1c) and (5B.1d) that

1
4
<< ,

use the average value of to approximate the 1/ term as

1
1 1 1
1
4 1
1 2
2 2 2 2

= = +

+ + +

.

Now the 1/ term can be brought outside the integrals to get

( ) ( )[ ]
( ) ( )
(1 )
4 4
ma
(1 )
4 4
(1 )
4
ma
(1 )
4 4
( )
2 1
M H( ) for 0
4
1
4
2 1
M H( )
4
1
4
eff
W
S R u d
W
S R u d

+ +
+ +

+

<

Z

4
for 0
>

.

(5B.7c)

Making the variable substitution = in the upper integral of (5B.7c), we can write, using
Eqs. (5B.6a)(5B.6c) and remembering that 0 < so that = ,

( ) ( )[ ]
( ) ( )
(1 )
4 4
ma
(1 )
4 4
(1 )
4 4
ma
(1 )
4 4
M H( )
4
H( ) M
4
W
S R u d
W
u S R d

+ +

+

+ +

+

=
.
(5B.7d)
In the bottom integral of (5B.7c), we can replace 1
4

+

by 1
4

+

because 0 > ,
making it look the same as Eq. (5B.7d). Consequently, both the top and bottom parts of Eq.
(5B.7c) can be combined into a single formula,

( ) ( )
(1 )
4 4
ma
(1 )
4 4
( )
2 1
H( ) M
4
1
4
eff
W
u S R d

+ +

+

+

Z
.

Making the approximation from Eqs. (5B.1c) and (5B.1d) that

1 1
4
+ ,
we can write this latest formula as

( ) ( )
(1 )
4 2
ma
(1 )
4 2
1
( ) H( ) M
4
eff
W
u S R d

+ +

+

Z , (5B.8a)
where

2
= . (5B.8b)

We conclude that ( )
eff
Z is, to a very good approximation, given by the average value of

( ) ( )
ma
H( ) M
4
W
u S R

over a wavenumber range centered on
1
4

+

,
which has a width of /(2 ) .
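
The averaging that closes this appendix can be sketched numerically. The Python fragment below is only an illustration under stated assumptions: the function name `fov_average`, the midpoint-rule sampling, and the linear test spectrum are hypothetical and not taken from the text, and the window is taken to be centered on $\sigma(1 + \Delta\Omega/4\pi)$ with width $\sigma\Delta\Omega/(2\pi)$ for a positive wavenumber $\sigma$.

```python
import numpy as np

def fov_average(spectrum, sigma, solid_angle, n_samples=200):
    """Average spectrum(s) over a window centered on sigma*(1 + solid_angle/(4*pi))
    with width sigma*solid_angle/(2*pi). Assumes sigma > 0 (hypothetical helper)."""
    center = sigma * (1.0 + solid_angle / (4.0 * np.pi))
    width = sigma * solid_angle / (2.0 * np.pi)
    if width == 0.0:
        # Zero field of view: no averaging takes place.
        return float(spectrum(np.array([center]))[0])
    # Midpoints of n_samples equal subintervals spanning the window.
    offsets = (np.arange(n_samples) + 0.5) / n_samples - 0.5
    samples = center + offsets * width
    return float(np.mean(spectrum(samples)))

def linear_spectrum(s):
    """Assumed test spectrum that varies linearly with wavenumber."""
    return 2.0 * s + 1.0

sigma = 1000.0        # wavenumber in cm^-1 (assumed value)
solid_angle = 0.01    # field-of-view solid angle in steradians (assumed value)
averaged = fov_average(linear_spectrum, sigma, solid_angle)
center = sigma * (1.0 + solid_angle / (4.0 * np.pi))
```

For a spectrum that varies linearly across the window, the average equals the spectrum evaluated at the window center, which lies slightly above $\sigma$.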
Appendix 5C
When a relatively narrow and rapidly varying function h(z) centered on zero is convolved with the product of another rapidly varying function g(z) and a broad, slowly varying function G(z), we can often approximate the result as

$$ h(z) * [G(z)\,g(z)] \cong G(z)\,[h(z) * g(z)] . \tag{5C.1} $$

It is easy to see why this works. We start out by making h(z) a narrow function centered on $z_0$, as shown in Fig. 5C.1. Starting with the definition of a convolution in Eq. (2.38a) of Chapter 2, we have

$$ h(z) * [G(z)\,g(z)] = \int_{-\infty}^{\infty} h(z')\,G(z - z')\,g(z - z')\,dz' . \tag{5C.2} $$

Since h is a narrow function compared to G, the range of $z'$ values between $z_0 - L_h$ and $z_0 + L_h$ in Fig. 5C.1 for which h is significantly different from zero is for function G in Fig. 5C.2 a range of z values over which very little change occurs. Hence, the right-hand side of Eq. (5C.2) can be approximated as

$$ \int_{-\infty}^{\infty} h(z')\,G(z - z')\,g(z - z')\,dz' \cong G(z - z_0) \int_{-\infty}^{\infty} h(z')\,g(z - z')\,dz' \tag{5C.3a} $$

or

$$ \int_{-\infty}^{\infty} h(z')\,G(z - z')\,g(z - z')\,dz' \cong G(z - z_0)\,[h(z) * g(z)] , \tag{5C.3b} $$

using the definition of the convolution in Eq. (2.38a). Substituting this back into (5C.2) now gives the desired result,

$$ h(z) * [G(z)\,g(z)] \cong G(z - z_0)\,[h(z) * g(z)] \tag{5C.4a} $$

or

$$ h(z) * [G(z)\,g(z)] \cong G(z)\,[h(z) * g(z)] \tag{5C.4b} $$

when $z_0 = 0$ because h is centered on zero. This justifies Eq. (5C.1).
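
The approximation in Eq. (5C.1) can be checked numerically. The sketch below is not part of the original derivation: the Gaussian and cosine test functions, their widths, and the helper name `convolve_on_grid` are all assumptions chosen only so that h is narrow, g varies rapidly, and G is broad.

```python
import numpy as np

def convolve_on_grid(a, b, dz):
    """Approximate the continuous convolution (a * b)(z) on a uniform grid
    that is symmetric about z = 0 and has an odd number of points."""
    return np.convolve(a, b, mode="same") * dz

# Symmetric grid so that mode="same" keeps the output aligned with z.
z = np.linspace(-50.0, 50.0, 4001)
dz = z[1] - z[0]

h = np.exp(-(z / 0.5) ** 2)          # narrow function centered on zero (assumed shape)
g = np.cos(2.0 * np.pi * z / 1.5)    # rapidly varying function (assumed shape)
G = np.exp(-(z / 20.0) ** 2)         # broad, slowly varying function (assumed shape)

lhs = convolve_on_grid(h, G * g, dz)   # h * [G g]
rhs = G * convolve_on_grid(h, g, dz)   # G [h * g]

# Compare away from the grid edges, where truncating g distorts the convolution.
interior = np.abs(z) < 30.0
rel_err = np.max(np.abs(lhs - rhs)[interior]) / np.max(np.abs(lhs)[interior])
print(f"maximum relative error: {rel_err:.3e}")
```

Making G broader relative to h, for example by increasing the assumed width of 20.0, should shrink the discrepancy, which is the behavior the argument above predicts.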
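
[FIGURES 5C.1 and 5C.2. Plots against z of the narrow function h(z), centered on $z_0$, and of the broad function G(z), with the intervals $L_h$ and $2L_h$ marked; figures not reproduced.]

[FIGURES 5C.3 and 5C.4. Plots against z of $h_{sum}(z)$ and of the narrow functions $h_1(z)$, $h_2(z)$, and $h_3(z)$; figures not reproduced.]
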
When a function h is the sum of several narrow functions, it can be written as

$$ h_{sum}(z) = \sum_{k=1}^{N} h_k(z - z_k) \tag{5C.5} $$

with the N narrow $h_k(z)$ functions centered at the origin. Figure 5C.3 shows what a plot of $h_{sum}(z)$ might look like for $N = 3$ when $h_1$, $h_2$, $h_3$ are as shown in Fig. 5C.4. The linearity of the convolution shown in Eq. (2.38e) of Chapter 2 can now be used to write

$$ h_{sum}(z) * [G(z)\,g(z)] = \sum_{k=1}^{N} \big\{ h_k(z - z_k) * [G(z)\,g(z)] \big\} \cong \sum_{k=1}^{N} \big\{ G(z - z_k)\,[h_k(z - z_k) * g(z)] \big\} , \tag{5C.6} $$

where the last step uses Eq. (5C.4a) to move G outside the convolutions.
6
NEdN AND DETECTOR NOISE
Laboratory measurements contaminated by random errors are usually characterized by their signal-to-noise ratio (SNR). In measurements of spectral radiance, however, signal-to-noise ratios can be confusing because the SNR can change by orders of magnitude as the signal itself (in spectra having strong emission or absorption lines) changes by orders of magnitude. Hence the noise performance of Fourier-transform spectrometers is often characterized by the noise-equivalent change in radiance (NEdN) instead of the signal-to-noise ratio.⁹⁶ By far the largest part of the random error or NEdN in the spectral measurements of most Fourier-transform spectrometers comes from random errors in the way detectors respond to the optical signal. These random errors in the detector response are called detector noise. Because as few assumptions as possible are made in this chapter about the shape of the detector-noise power spectrum, our approach to detector noise is more elaborate than most discussions of the subject. In this chapter, we derive formulas for the detector-noise NEdN of Michelson spectrometers using double-sided and single-sided interferogram signals. While deriving our NEdN formulas, we are careful to trace through what happens to the spectral signal during calibration, making it easy to understand the different ways detector noise is processed in double-sided and single-sided systems. Although the formulas in this chapter apply directly only to the detector noise in standard two-port Michelson systems, the approach used here can be easily adapted to any type of Fourier-transform spectrometer by changing the details of the analysis to accommodate the interferogram signals generated by more elaborate interferometers.

⁹⁶ The NEdN is described in Sec. 6.1 below.
6.1 Definition of NEdN
The letter N is often used to represent radiance in radiometric equations, and in radiometric measurements the letters NEdN usually stand for noise-equivalent change in radiance. The NEdN is the expected amount of uncertainty that random errors give to the radiance value. The name itself does a good job of explaining the basic concept: random errors (that is, noise) produce an error in the measurement that corresponds to a modification (that is, an equivalent change) of the radiance N.
When a nonideal instrument whose measurement errors are predominantly random is used to measure radiance, there are error bars attached to the data points (see Fig. 6.1). The error bar (usually) indicates that the true value of the radiance probably lies within one error-bar length $B_E$ of the data point. When the error is predominantly random, the randomness shows up as a change in the measured radiance if the same measurement is repeated; so for predominantly random errors, the error bars also show that a repeated measurement of the same spectrum value with the same instrument is likely to lie within one error-bar length of the original data point. For predominantly random errors, then, the length $B_E$ of the error bar attached to the data point also gives the NEdN value (the noise-equivalent change in radiance) associated with the data point.
When the measurement errors are not predominantly random, the error bar specifies the total measurement error, both random and nonrandom. Hence, when a data point is also contaminated by significant amounts of nonrandom error, $B_E$ is larger than the probable change in value if the measurement is repeated. Consequently, when both random and nonrandom errors are present, the NEdN can be thought of as that portion of the error-bar length caused by random measurement error; that is, the NEdN is the increase in the length of $B_E$ due to the presence of random measurement errors. Because the NEdN must be either the error-bar length itself or an increase in the error-bar length, the NEdN always has the same type of units as the radiance measurements it describes. In this chapter, the NEdN describes the expected amount of random error in the spectral radiance L as a function of wavenumber $\sigma$, so here the NEdN always has units of optical power per unit area per unit solid angle per unit wavenumber interval (for example, watts/m$^2$/sr/cm$^{-1}$ or erg/sec/cm$^2$/sr/cm$^{-1}$ = erg/sec/sr/cm). In a well-designed interferometer, we expect the NEdN (indeed, we expect the total measurement error) to be small compared to the average or typical size of the radiance.
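
The two example unit systems quoted above differ by a fixed factor, since 1 watt is $10^7$ erg/sec and 1 m$^2$ is $10^4$ cm$^2$. The following minimal Python sketch illustrates the conversion; the function name `si_to_cgs_spectral_radiance` is hypothetical and introduced only for this example.

```python
ERG_PER_JOULE = 1.0e7   # 1 watt = 1e7 erg/sec
CM2_PER_M2 = 1.0e4      # 1 m^2 = 1e4 cm^2

def si_to_cgs_spectral_radiance(value_w_m2_sr_percm):
    """Convert a spectral radiance (or NEdN) from watts/m^2/sr/cm^-1
    to erg/sec/cm^2/sr/cm^-1."""
    return value_w_m2_sr_percm * ERG_PER_JOULE / CM2_PER_M2

# Example: an NEdN of 2e-4 watts/m^2/sr/cm^-1 (assumed value) in cgs units.
nedn_cgs = si_to_cgs_spectral_radiance(2.0e-4)
```

Because the wavenumber interval is expressed in cm$^{-1}$ in both systems, only the power and area units contribute to the factor of $10^3$.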

[FIGURE 6.1. Plot of radiance $N(\sigma)$ versus wavenumber $\sigma$ with an error bar attached to a data point; figure not reproduced.]

In Chapter 5, we found that an interferometer's spectral radiance measurements always suffer to some degree from two types of nonrandom error: a measurement distortion due to the interferometer's finite field of view and a measurement distortion due to the finite length of the interferometer's interferogram. According to the discussion following Eq. (5.108f) in Chapter 5, this nonrandomly distorted spectral measurement can be represented by function $L_{mnf}(\sigma)$. To get a complete representation of the measured spectral radiance produced by the interferometer, we must add to the distorted yet noise-free measurement specified by $L_{mnf}(\sigma)$ a random term representing the measurement noise. We write the measured, noise-contaminated spectral radiance produced by an interferometer measurement as

$$ \tilde{L}_{mN}(\sigma) = L_{mnf}(\sigma) + \delta\tilde{L}(\sigma) \tag{6.1a} $$

with $\delta\tilde{L}$ being the random error contaminating the $\tilde{L}_{mN}$ spectral measurement. The wavy line over $\tilde{L}_{mN}$ and $\delta\tilde{L}$ shows that these are both random functions of $\sigma$ (see Secs. 3.1 and 3.2 of Chapter 3 for an explanation of the wavy-line notation and random functions). We need $\delta\tilde{L}$ to be a random function of $\sigma$ because very often the size and nature of the random error in the spectral measurement depends strongly on the value of the wavenumber $\sigma$. The $\delta$ part of $\delta\tilde{L}$ reminds us that the random error takes on values that are small compared to the typical size of $L_{mnf}$. Representing this typical size by the spectral average of $L_{mnf}$, we note that

$$ |\delta\tilde{L}(\sigma)| \ll \frac{1}{\sigma_{max} - \sigma_{min}} \int_{\sigma_{min}}^{\sigma_{max}} L_{mnf}(\sigma')\,d\sigma' . \tag{6.1b} $$

In this inequality, the interferometer is assumed to measure spectral radiances between $\sigma_{min}$ and $\sigma_{max}$. Using the same notation as in inequality (5.78) of Chapter 5, we say that

$$ 0 < \sigma_{min} \le \sigma \le \sigma_{max} \tag{6.1c} $$

for all wavenumbers $\sigma$ in (6.1a) and (6.1b).
The relationship between the NEdN and $\delta\tilde{L}$ is not as straightforward as it might at first look. From Sec. 3.4 of Chapter 3, we know that the average or expected value of $\tilde{L}_{mN}$ in Eq. (6.1a) is

$$ E\big(\tilde{L}_{mN}(\sigma)\big) = E\big(L_{mnf}(\sigma) + \delta\tilde{L}(\sigma)\big) = L_{mnf}(\sigma) + E\big(\delta\tilde{L}(\sigma)\big) , \tag{6.2a} $$

where Eqs. (3.16a) and (3.9f) in Chapter 3 are used to simplify the right-hand side of the equation. Looking at (6.2a), we might be tempted to define $E(\delta\tilde{L}(\sigma))$, which is the average or expected value of $\delta\tilde{L}$, to be the NEdN associated with the $\tilde{L}_{mN}$ measurement; but there are problems with this approach. Suppose, for example, that only random errors are present in our measurements and that the error bar attached to the data point shows that the random error is just as likely to make a measured radiance too large as it is to make it too small. This suggests that the radiance value that would be produced by a noise-free interferometer measurement can be estimated by averaging together a large number of independent measurements, with the presence of the randomly occurring too-large measurements compensating for the presence of the randomly occurring too-small measurements. According to Sec. 3.4 of Chapter 3, to get the average value of a randomly varying quantity, we should apply the expectation operator E. Hence, the assumption that averaging together many randomly occurring too-large and too-small measurements produces a good estimate of the noise-free interferometer measurement can be written as

$$ E\big(\tilde{L}_{mN}(\sigma)\big) = L_{mnf}(\sigma) . \tag{6.2b} $$

Substitution of (6.2b) into (6.2a) then gives

$$ E\big(\delta\tilde{L}(\sigma)\big) = 0 . \tag{6.2c} $$

So now if $E(\delta\tilde{L}(\sigma))$ is defined to be the NEdN, we end up saying that our measurements have zero NEdN even though every individual measurement is contaminated by a substantial amount of random error. This is obviously not acceptable.
To define the NEdN correctly, we must remember that the NEdN is not the average random error itself, $E(\delta\tilde{L}(\sigma))$, but rather the average size of the random error. Glancing back at Eqs. (3.5c) and (3.8e) in Chapter 3, we see that the standard deviation of $\delta\tilde{L}$, which can be written as

$$ \Big\{ E\Big( \big[\delta\tilde{L}(\sigma) - E\big(\delta\tilde{L}(\sigma)\big)\big]^2 \Big) \Big\}^{1/2} , $$

gives us what we want. Even if $E(\delta\tilde{L}(\sigma))$ is zero, the standard deviation

$$ \sqrt{ E\Big( \big[\delta\tilde{L}(\sigma) - E\big(\delta\tilde{L}(\sigma)\big)\big]^2 \Big) } = \sqrt{ E\big( \delta\tilde{L}(\sigma)^2 \big) } $$

will be greater than zero as long as $\delta\tilde{L}$ itself is not identically zero. Hence, the standard deviation behaves the way we want it to when $E(\delta\tilde{L}(\sigma))$ is zero while $\delta\tilde{L}(\sigma)$ is not zero. The next step is to check how well this definition of the NEdN works when $E(\delta\tilde{L}(\sigma))$ is not equal to zero.
Suppose Eq. (6.2c) is no longer satisfied; that is, suppose that

$$ E\big(\delta\tilde{L}(\sigma)\big) = \delta L_{nr}(\sigma) = \text{small nonzero error which depends on } \sigma . \tag{6.3a} $$

We decide to write $\delta\tilde{L}(\sigma)$ as the sum of a nonrandom function $\delta L_{nr}(\sigma)$ and a random function $\delta\tilde{L}_{r}(\sigma)$,

$$ \delta\tilde{L}(\sigma) = \delta L_{nr}(\sigma) + \delta\tilde{L}_{r}(\sigma) . \tag{6.3b} $$

Taking the expectation value of both sides and using Eqs. (3.16a) and (3.9f) in Chapter 3, we get

$$ E\big(\delta\tilde{L}(\sigma)\big) = \delta L_{nr}(\sigma) + E\big(\delta\tilde{L}_{r}(\sigma)\big) . $$

We can reconcile this result with (6.3a) only by requiring that

$$ E\big(\delta\tilde{L}_{r}(\sigma)\big) = 0 . \tag{6.3c} $$

Equations (6.3a)–(6.3c) show that if $E(\delta\tilde{L}(\sigma))$ is not zero, then $\delta\tilde{L}(\sigma)$ can be written as the sum of both a random function $\delta\tilde{L}_{r}(\sigma)$ and a nonrandom function $\delta L_{nr}(\sigma)$, with the nonrandom function $\delta L_{nr}(\sigma)$ equal to the nonzero expectation value of $\delta\tilde{L}(\sigma)$ and the random function $\delta\tilde{L}_{r}(\sigma)$ having a zero expectation value.
It is easy to show that $\delta L_{nr}(\sigma)$ acts like an extra, nonrandom error added to $L_{mnf}$. Substituting (6.3a) into (6.2a) gives

$$ E\big(\tilde{L}_{mN}(\sigma)\big) = L_{mnf}(\sigma) + \delta L_{nr}(\sigma) , \tag{6.3d} $$

and substituting (6.3b) into (6.1a) gives

$$ \tilde{L}_{mN}(\sigma) = [L_{mnf}(\sigma) + \delta L_{nr}(\sigma)] + \delta\tilde{L}_{r}(\sigma) . \tag{6.3e} $$

Equation (6.3e) shows that the sum inside the square brackets [ ] plays the same role as $L_{mnf}$ does in (6.1a) because it is a nonrandom function of $\sigma$ added to a random function of $\sigma$; and Eq. (6.3d) shows that $\delta L_{nr}(\sigma)$ cannot be removed by averaging together many different measurements of $\tilde{L}_{mN}(\sigma)$.
When repeated measurements are made of the same data point and then averaged together, Eqs. (6.3d) and (6.3e) show that the random change from one measurement to the next comes entirely from $\delta\tilde{L}_{r}(\sigma)$, with $\delta L_{nr}(\sigma)$ just shifting the data point away from the $L_{mnf}$ value by the same amount each time the measurement is made. This shows that the increase in $B_E$, the error-bar length, due to random error comes entirely from the random component $\delta\tilde{L}_{r}(\sigma)$ of $\delta\tilde{L}(\sigma)$.
Fortunately, defining the NEdN to be the standard deviation of $\delta\tilde{L}(\sigma)$ still gives us a well-behaved value for the NEdN when $\delta\tilde{L}(\sigma)$ has a significant nonrandom component $\delta L_{nr}(\sigma)$. We have already seen that [see the formula following Eq. (6.2c) above]

$$ \text{standard deviation of } \delta\tilde{L} = \Big\{ E\Big( \big[\delta\tilde{L}(\sigma) - E\big(\delta\tilde{L}(\sigma)\big)\big]^2 \Big) \Big\}^{1/2} . $$

Substituting first (6.3b) and then (6.3a) into the right-hand side gives

$$ \Big\{ E\Big( \big[\delta L_{nr}(\sigma) + \delta\tilde{L}_{r}(\sigma) - E\big(\delta\tilde{L}(\sigma)\big)\big]^2 \Big) \Big\}^{1/2} = \Big\{ E\big( \delta\tilde{L}_{r}(\sigma)^2 \big) \Big\}^{1/2} . $$

Again the standard deviation gives us what we want: a nonzero and positive value of the NEdN that does not depend in any way on the nonrandom error component $\delta L_{nr}(\sigma)$ of $\delta\tilde{L}(\sigma)$. We conclude that it makes sense to define the NEdN of any radiance measurement described by Eq. (6.1a) to be the standard deviation of the random function $\delta\tilde{L}(\sigma)$ even when $E(\delta\tilde{L}(\sigma))$ does not equal zero:

$$ NEdN(\sigma) = \Big\{ E\Big( \big[\delta\tilde{L}(\sigma) - E\big(\delta\tilde{L}(\sigma)\big)\big]^2 \Big) \Big\}^{1/2} . \tag{6.3f} $$

We note that this definition automatically gives the NEdN units of spectral radiance, as it should. To emphasize that the standard deviation formula only applies to non-negative wavenumbers $\sigma$, we often write

$$ NEdN(\sigma) = \Big\{ E\Big( \big[\delta\tilde{L}(|\sigma|) - E\big(\delta\tilde{L}(|\sigma|)\big)\big]^2 \Big) \Big\}^{1/2} . \tag{6.3g} $$

Equation (6.3g) can also be thought of as giving the NEdN the same behavior with respect to negative wavenumber values as the spectral radiance; the absolute value signs make the NEdN an even function of $\sigma$ in the same way that absolute value signs make $L$, $L^{(fore)}$, and $L^{(back)}$ even functions of $\sigma$ in Eqs. (5.40g), (5.51a), and (5.57a) of Chapter 5.
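
The distinction drawn in this section between the mean error and the standard deviation can be illustrated with simulated repeated measurements at a single wavenumber. The sketch below is only an illustration: the Gaussian noise model, the numerical values, and all variable names are assumptions, not part of the analysis in this chapter.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

n_repeats = 20000   # number of repeated measurements (assumed)
L_mnf = 5.0         # noise-free measured radiance at one wavenumber (arbitrary units)
dL_nr = 0.3         # assumed nonrandom error component
sigma_r = 0.1       # assumed standard deviation of the zero-mean random component

# Each measurement is L_mnf + dL_nr + (zero-mean random error), as in Eq. (6.3e).
dL_r = rng.normal(loc=0.0, scale=sigma_r, size=n_repeats)
L_mN = L_mnf + dL_nr + dL_r

dL = L_mN - L_mnf                  # total error in each measurement
mean_error = dL.mean()             # sample estimate of E(dL); close to dL_nr
nedn = np.sqrt(np.mean((dL - mean_error) ** 2))   # sample version of Eq. (6.3f)

print(f"mean error = {mean_error:.4f}, NEdN estimate = {nedn:.4f}")
```

In this simulation the mean error tracks the assumed nonrandom offset, while the NEdN estimate tracks only the spread of the random component, so changing `dL_nr` leaves the NEdN estimate essentially unchanged.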
6.2 Signal from the Spectral Radiance
Shifting the position of the moving mirror in Fig. 6.2 changes the interferogram signal by changing the value of $\chi$, the interferometer's optical path difference or OPD. During a spectral measurement, the moving mirror moves uniformly and steadily through its allowed range of positions, which means that $\chi$ satisfies [see Eq. (5.41a) of Chapter 5]

$$ \chi = ut . \tag{6.4} $$

Here u is the constant OPD velocity and t is a time coordinate chosen so that $t = 0$ when $\chi = 0$. We usually find it more convenient to represent the interferometer signal and signal errors as functions of $\chi$ while remembering that, according to Eq. (6.4), the OPD value $\chi$ and the time coordinate t are directly proportional to each other.
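
The proportionality in Eq. (6.4) also fixes how wavenumbers map onto electrical frequencies: a spectral component varying as $\cos(2\pi\sigma\chi)$ in OPD varies as $\cos(2\pi u\sigma t)$ in time, so it appears at frequency $u\sigma$. The short Python sketch below illustrates both conversions; the function names and numerical values are hypothetical.

```python
def opd_from_time(t_seconds, u_cm_per_s):
    """OPD value (cm) reached at time t for a constant OPD velocity u, per Eq. (6.4)."""
    return u_cm_per_s * t_seconds

def electrical_frequency_hz(wavenumber_per_cm, u_cm_per_s):
    """Frequency (Hz) at which a spectral component of the given wavenumber
    appears in the time-domain interferogram signal."""
    return u_cm_per_s * wavenumber_per_cm

u = 0.5                                       # assumed OPD velocity in cm/sec
opd = opd_from_time(2.0, u)                   # OPD after 2 seconds
freq = electrical_frequency_hz(1000.0, u)     # 1000 cm^-1 component
```

With these assumed numbers the OPD after 2 seconds is 1 cm, and a 1000 cm$^{-1}$ component of the spectrum appears at 500 Hz in the detector signal.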
The interferometer signal can be evaluated at any position along the signal chain shown in Fig. 6.2. If we think of the signal as being the electrical impulses leaving the detector circuit due to the input radiance $L(\sigma)$, then we can analyze it at point C in Fig. 6.2 and represent it by $z_C(\chi)$. Function $z_C(\chi)$ can be either the voltage or current as a function of OPD, depending on how we want to record the signal; and Eq. (6.4) can always be used to write the signal as $z_C(ut)$ if we want it as a function of time. To get the corresponding electrical impulses leaving the detector, we can analyze the signal at point B in Fig. 6.2 and represent it by $z_B(\chi)$; and if we think of the interferometer signal as being the corresponding optical power reaching the detector, then we analyze it at point A in Fig. 6.2 and represent it by $z_A(\chi)$. Again we have the choice of using either volts or amps to represent the electrical signal $z_B(\chi)$, and signal $z_A(\chi)$ is usually thought of as having units of optical power. Just like the $z_C$ signal, the $z_B$ and $z_A$ signals can be specified as functions of time by writing $z_B(ut)$ and $z_A(ut)$.
At point C in Fig. 6.2, we know from Sec. 5.18 of Chapter 5 [see Eqs. (5.81a) and (5.83d)] that the electrical signal due to the spectral radiance $L(\sigma)$ entering the interferometer's aperture is

2
2
ma
R
( )
H( ) M( ) ( ) ( ) ( ) ( ) ( ) ,
4
i
eff
i
f a FOV
e d
WA
u R e d

Z
L
(6.5a)


[FIGURE 6.2. Signal chain of the interferometer (diagram not reproduced): input scene radiance, fore optics, interferometer beam splitter with fixed and moving mirrors, aft optics, detector, detector circuit with antialiasing filter, and analog-to-digital converter sampling the signal at equally spaced OPD values. The chain is divided into regions of optical, electrical, and digital signal. Point A marks the optical signal reaching the detector, point B the electrical signal leaving the detector, and point C the electrical signal leaving the detector circuit. An OPD value of $\chi$ corresponds to a physical shift of $\chi/2$ of the moving mirror from its ZPD position.]

where, according to Eq. (5.83e) of Chapter 5, ( )
FOV
o L is defined to be

1
4 2

1
4 2
( )
1
( )
FOV
d
o
o
o
r
o
r
r
o
r
o o
o
o
o o
A
AO
+ +

A
AO
+

A

L
L
L

one

.
(6.5b)

According to Eq. (5.76c) of Chapter 5, $\Delta\sigma$ in Eq. (6.5b) is given by

$$ \Delta\sigma = \frac{\sigma\,\Delta\Omega}{2\pi} . \tag{6.5c} $$

Since $z_C(\chi)$ is defined to be the electrical signal due to $L(\sigma)$ at point C of Fig. 6.2, we can now write Eq. (6.5a) as

2
ma
R
( )
H( ) M( ) ( ) ( ) ( ) ( ) ( )
4
C
i
f a FOV
z
WA
u R e d
r o
o o o q o t o t o o o
AO
L .
(6.5d)

Examining carefully Eqs. (5.104a) and (5.104b) in Chapter 5, we see that $z_C(\chi)$ is the same signal as $z(\chi)$ in (5.104a) because the signal spectrum $Z^{(eff)}(\sigma)$ in (5.104b) is the same as the expression put through the inverse Fourier transform on the right-hand side of (6.5d).
The easiest way to get the formulas for $z_B(\chi)$ and $z_A(\chi)$ is to go backwards through the signal chain in Fig. 6.2.
Going back to $z_B(\chi)$, we note that it is the component of the electrical signal leaving the detector due to the input radiance $L(\sigma)$. To find this component, we just set H equal to one in (6.5d) to remove the influence of the detector circuit. Since the AC coupling of the detector circuit also removes constant terms from the signal, we should also add back any constant signal terms leaving the detector.⁹⁷

⁹⁷ See the discussion following Eq. (5.46c) in Chapter 5 for an explanation of how the constant terms are eliminated as the signal passes from point B to C in Fig. 6.2.

Equation (6.4), which requires time and OPD to be proportional, reminds us that the constant signal terms must be independent of both time t and the OPD value $\chi$. Examining Eqs. (5.40e)–(5.40g) in Sec. 5.9 of Chapter 5, we note that the formulas for $K_{bal}(\chi)$ are formulas for what we are now calling $z_B(\chi)$, the electrical signal leaving the detector due to the spectral radiance $L(\sigma)$ entering the interferometer's aperture. Both of the formulas for $K_{bal} = z_B$ in Eqs. (5.40e) and (5.40f) have the same constant term (that is, the same $\chi$-independent term)
no matter what approximation is used for cos

. This term can be written as, substituting from
Eq. (5.40g),
R
1
( ) ( ) ( ) ( ) ( ) ( )
4 4
f a
A
S d d

=

L . (6.6a)

Because this constant term is the same no matter what approximation is used for cos

, all that
we need to do to get the formula for signal z
B
() is to add this constant term to the formula for z
C

in (6.5d) with H set equal to one. This gives

2
ma
R
R
( )
( ) ( ) ( ) ( ) ( )
4
M( ) ( ) ( ) ( ) ( ) ( )
4
B
f a
i
f a FOV
z
A
d
WA
R e d

L
L

.
(6.6b)

To get $z_A(\chi)$, the optical power reaching the detector due to the spectral radiance $L(\sigma)$ entering the interferometer's aperture, we go back one more step in Fig. 6.2. According to the remark following Eq. (5.35d) at the beginning of Sec. 5.9 of Chapter 5, replacing the detector responsivity $R(\sigma)$ by one takes us from the electrical signal produced by the detector to the optical power hitting the detector. Therefore, to get $z_A(\chi)$, the optical power reaching the detector at point A due to the spectral radiance $L(\sigma)$, we just set $R(\sigma) = 1$ in Eq. (6.6b) to get

2
ma
( )
( ) ( ) ( ) ( )
4
M( ) ( ) ( ) ( ) ( )
4
A
f a
i
f a FOV
z
A
d
WA
R e d

L
L

.
(6.6c)

6.3 Signal from the Background Radiance
As is discussed in Sec. 5.13 of Chapter 5, the total interference signal passing through the signal chain of Fig. 6.2 often contains significant background components as well as the $z_A(\chi)$, $z_B(\chi)$, and $z_C(\chi)$ signal components due to the spectral radiance $L(\sigma)$ entering the interferometer's aperture. The background components at point C have already been discussed to some extent in Chapter 5. When we compare the background components at C to the background components of points A and B, we note that the background signals at points A and B contain additional constant terms (that is, terms that are $\chi$ independent) that do not pass through the AC coupling of the detector to point C. Conceptually, we can write for the total signal at points A, B, and C of Fig. 6.2 that

$$ z_A^{(tot)}(\chi) = z_A(\chi) + \text{background terms at point A} , \tag{6.7a} $$

$$ z_B^{(tot)}(\chi) = z_B(\chi) + \text{background terms at point B} , \tag{6.7b} $$

and

$$ z_C^{(tot)}(\chi) = z_C(\chi) + \text{background terms at point C} . \tag{6.7c} $$

The formulas for $z_A$, $z_B$, and $z_C$ in Eqs. (6.6a)–(6.6c) show that if $L(\sigma)$, the spectral radiance entering the front aperture, is zero, then $z_A$, $z_B$, and $z_C$ are also zero. The standard way of making the infrared spectral radiance $L(\sigma)$ negligible (that is, effectively zero compared to the background radiances) is to point the interferometer at an extremely cold surface. When this is done, Eqs. (6.7a)–(6.7c) reduce to

$$ z_A^{(tot)}(\chi) = \text{background terms at point A} = z_A^{(cold)}(\chi) , $$

$$ z_B^{(tot)}(\chi) = \text{background terms at point B} = z_B^{(cold)}(\chi) , $$

and

$$ z_C^{(tot)}(\chi) = \text{background terms at point C} = z_C^{(cold)}(\chi) . $$

The superscript (cold) reminds us that the background terms at points A, B, and C are the same thing as the total signal at points A, B, and C when the interferometer is looking at a cold surface. Because $z_A^{(cold)}$, $z_B^{(cold)}$, and $z_C^{(cold)}$ represent the background terms, and these terms are the same no matter what negligible or non-negligible spectral radiance $L(\sigma)$ is entering the front aperture of the interferometer, Eqs. (6.7a)–(6.7c) can also be written as

$$ z_A^{(tot)}(\chi) = z_A(\chi) + z_A^{(cold)}(\chi) , \tag{6.8a} $$

$$ z_B^{(tot)}(\chi) = z_B(\chi) + z_B^{(cold)}(\chi) , \tag{6.8b} $$

and

$$ z_C^{(tot)}(\chi) = z_C(\chi) + z_C^{(cold)}(\chi) . \tag{6.8c} $$

As a general rule, the calibration of any well-designed interferometer system provides us with the data needed to find $z_A^{(cold)}(\chi)$, $z_B^{(cold)}(\chi)$, and $z_C^{(cold)}(\chi)$. Consequently, in principle all that need be done to recover signals $z_A(\chi)$, $z_B(\chi)$, and $z_C(\chi)$ at points A, B, and C is to subtract $z_A^{(cold)}$, $z_B^{(cold)}$, and $z_C^{(cold)}$ from $z_A^{(tot)}$, $z_B^{(tot)}$, and $z_C^{(tot)}$ to get

$$ z_A(\chi) = z_A^{(tot)}(\chi) - z_A^{(cold)}(\chi) , \tag{6.8d} $$

$$ z_B(\chi) = z_B^{(tot)}(\chi) - z_B^{(cold)}(\chi) , \tag{6.8e} $$

and

$$ z_C(\chi) = z_C^{(tot)}(\chi) - z_C^{(cold)}(\chi) . \tag{6.8f} $$
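
The subtraction in Eqs. (6.8d)–(6.8f) is straightforward once the total and cold-view signals are sampled on the same OPD grid. The Python sketch below illustrates it with synthetic, noise-free interferograms; the function name `remove_background`, the OPD grid, and the signal shapes are all assumptions made for this example.

```python
import numpy as np

def remove_background(z_tot, z_cold):
    """Return z = z_tot - z_cold for signals sampled on the same OPD grid
    (hypothetical helper illustrating Eqs. (6.8d)-(6.8f))."""
    z_tot = np.asarray(z_tot, dtype=float)
    z_cold = np.asarray(z_cold, dtype=float)
    if z_tot.shape != z_cold.shape:
        raise ValueError("total and cold-view signals must share one OPD grid")
    return z_tot - z_cold

# Synthetic example on an assumed OPD grid (cm).
chi = np.linspace(-0.5, 0.5, 1001)
z_scene = np.cos(2.0 * np.pi * 800.0 * chi) * np.exp(-(chi / 0.1) ** 2)
z_cold = 0.3 * np.cos(2.0 * np.pi * 600.0 * chi) * np.exp(-(chi / 0.2) ** 2)
z_tot = z_scene + z_cold            # total signal, in the form of Eq. (6.8c)

z_recovered = remove_background(z_tot, z_cold)
```

In this idealized case the scene signal is recovered exactly; with real data, each measured signal carries its own noise, which is the subject of the rest of this chapter.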

6.4 Inverse Fourier Transform of the Background Radiance
In Chapter 5, we were interested in the nonrandom errors and distortions of the measured spectrum, so we concentrated on those signal components carrying information about the input spectral radiance $L(\sigma)$. Signal noise, however, can arise from the interferometer's background radiance as well as from its input radiance, so we now have to expand somewhat on the analysis of the background radiance in Chapter 5. In this section, we show what the background signal terms look like at points A, B, and C in Fig. 6.2, specifying them as integrals and inverse Fourier transforms of the background spectral radiance.
Function $z_C^{(cold)}(\chi)$ in Eq. (6.8c) represents the component of the total signal at point C created by the background radiance. Returning to Chapter 5 and examining how Eqs. (5.62a) and (5.62b) are derived, we confirm, as was mentioned in Chapter 5, that $z_C^{(cold)}(\chi)$ in Eq. (6.8c) is the same signal as the $z_C^{(cold)}(\chi)$ given in Eqs. (5.62a) and (5.62b). We work first with the ideal
case where the interferometers field of view AO is small enough that cos
r
o can be
approximated as one. Substituting Eqs. (5.51a) and (5.57a) from Chapter 5 into (5.62b) gives

( )
( ) (back) 2
ma
R
( )
H( ) M( ) ( ) ( ) ( )[ ( ) ( )]
4
cold
C
fore i
a
z
WA
u R e d
r o
o o o q o t o o o o
AO
L L

.
(6.9)
The nonideal case where cos

can no longer be approximated by one requires somewhat more
work. Equation (5.62a) in Chapter 5 gives the nonideal formula for
( ) cold
C
z . When we compare
(5.62a) to Eq. (5.73a) in Chapter 5, we notice that Eq. (5.73a) becomes identical to (5.62a) when
functions z() and S() are taken to be the same as
( )
( )
cold
C
z and
( ) (back)
[ ( ) ( )]
fore
S S
respectively:

( )
( ) ( )
cold
C
z z (6.10a)

( ) (back)
( ) [ ( ) ( )]
fore
S S S . (6.10b)

In the mathematical analysis following Eq. (5.73a), functions z() and S() are just placeholder
functionsthat is, our mathematical analysis down to Eq. (5.75e) holds true for any appropriate
pair of z and S functions because it makes no assumptions about them other than that they are
related by a formula like Eq. (5.73a). This means we can find out what would happen to Eq.
(5.62a) when the same sort of analysis is applied to it as is applied to the z and S functions in
(5.73a) simply by replacing z and S in (5.75e) by
( ) cold
C
z and
( ) (back)
[ ]
fore
S S respectively. Making
this replacement shows that the mathematical relationship specified in Eq. (5.62a) transforms into

( )
2 1
( ) (back) 4
ma
( )
( ) ( ) M H( ) sinc
4 2
(cold)
C
i
fore
z
W
S S R u e d

.
(6.10c)

For this new formula to be true, we must assume, just as in the analysis following Eq. (5.73a),
that
2
can be treated as a small quantity for all D D over which the

( )
( )
cold
C
z signal
is recorded and that the field of view , although relatively large, is not so large that

2
cos 1
2

is a bad approximation [see Eq. (5.73b) in Chapter 5].
Using the notation introduced in Eq. (2.29a) of Chapter 2, we continue the analysis by taking
the forward Fourier transform of both sides of (6.10c) to get


( )
( )
( ) ( ) 2
2 1
( ) (back) 4
ma
( )
( ) ( ) M H( ) sinc
4 2
i cold i
C
i
fore
z d e d
W
S S R u e

=

.
F
(6.10d)

Comparing this to Eq. (5.76a) in Chapter 5, we note that the right-hand sides of (5.76a) and
(6.10d) become identical if we once again use (6.10b), matching ( ) S to
( ) (back)
[ ( ) ( )]
fore
S S . Checking out how Appendix 5B is used to transform the right-hand side
of (5.76a) into the right-hand side of (5.76b), we note that again S is just a placeholder function.
This means the mathematical analysis still holds true when ( ) S is replaced by
( ) (back)
[ ( ) ( )]
fore
S S . Consequently, we can apply the same transformation used on (5.76a) to
Eq. (6.10d) to get

( )
( )
( ) ( )
1
4 2
( ) (back)
ma
1
4 2
( )
1
( ) ( ) M H( ) ,
4
i cold
C
fore
z
W
S S R u d

+ +

F
(6.10e)
where

2
= .

Substituting from Eqs. (5.51a) and (5.57a) of Chapter 5 gives

( )
( )
{ }
( ) ( )
1
4 2
( ) (back)
ma
1
4 2
R
1
( )
4
M H( ) ( ) ( ) ( ) ( ) ( )
i cold
C
fore
a
WA
z
R u d

+ +

L L

.
F
(6.10g)

According to the discussion following Eq. (5.82c) of Chapter 5, the functions M, H, R, , and
a

all vary slowly with wavenumber , allowing them to be brought outside the integral in (6.10g).
In well-designed interferometers, it is often true that the background radiances
( ) fore
L and
(back)
L
are also slowly varying functions of , being more or less proportional to a combination of Planck
black-body curves, but for now we can leave open the possibility that this is not the case.
Equation (6.10g) can now be written as, using the approximations specified in (5.83b) of Chapter
5,

( ) ( )
( ) ( )
ma
1 1
4 2 4 2
( ) (back)
1 1
4 2 4 2
R ( ) M H( ) ( ) ( ) ( )
4
1 1
( ) ( )
i cold
C a
fore
WA
z R u
d d

+ + + +

+ +

L L

F

.
(6.10h)

Equation (6.10h) applies, of course, to the nonideal case where is small but not so small that
cos

can be approximated by one. Returning to Eq. (6.9), which gives the formula for
( )
( )
cold
C
z when the field of view is small enough to approximate cos

by one, we take the
forward Fourier transform of both sides of (6.9) to get

( )
( ) ( )
( ) (back)
ma
R
( )
H( ) M( ) ( ) ( ) ( )[ ( ) ( )]
4
i cold
C
fore
a
z
WA
u R
=

L L
F
.
(6.10i)

Comparing Eqs. (6.10h) and (6.10i), we see they can be combined into a single result by writing

( )
( ) ( )
( ) (back)
ma
R
( )
H( ) M( ) ( ) ( ) ( )[ ( ) ( )] ,
4
i cold
C
fore
a FOV FOV
z
WA
u R

L L
F

(6.11a)

where we define, following the pattern of Eq. (5.83e) in Chapter 5, that

( )
( )
1
4 2
( )
1
4 2
( )
1
( )
cann
fore
fore
FOV
fore
d

+ +

+

=

L
L
L

ot be approximated as one

(6.11b)
and

(back)
(back)
1
4 2
(back)
1
4 2
( )
1
( )
cann
FOV
d

+ +

+

=

L
L
L

ot be approximated as one

.
(6.11c)

The Fourier transform in Eq. (6.11a) can always be reversed to get

( )
( ) (back) 2
ma
R
( )
H( ) M( ) ( ) ( ) ( )[ ( ) ( )]
4
cold
C
fore i
a FOV FOV
z
WA
u R e d

L L

.
(6.12a)

This is the formula for $z_C^{(cold)}$ that belongs in Eq. (6.8c).
Having found the background terms at point C in Fig. 6.2, we now get the background terms at point B by going back up the signal chain the same way we did for $z_A$, $z_B$, and $z_C$ in Sec. 6.2 above. To evaluate the right-hand side of (6.12a) at point B, we set $H(u\sigma) = 1$ to get

( ) (back) 2
ma
R M( ) ( ) ( ) ( )[ ( ) ( )]
4
fore i
a FOV FOV
WA
R e d

L L .

Unfortunately, to get the complete $z_B^{(cold)}(\chi)$ background signal at point B, we have to add to this the constant terms removed by the AC coupling of the detector to the rest of the system.[98] Since, according to Eq. (6.4), time and the OPD value $\chi$ are proportional to each other, the time-independent constant terms are also $\chi$-independent. We note that according to Eq. (6.8b) above, the total interference signal at point B is

$$z_B^{(tot)}(\chi) = z_B(\chi) + z_B^{(cold)}(\chi) .$$

[98] See discussion following Eqs. (5.42c) and (5.46c) in Sec. 5.10 of Chapter 5 for more information on AC coupling.

Returning to Chapter 5, we compare Eq. (5.59b), which gives the total interference signal at point B when the interferometer's field of view is too large for the cosine of the field-of-view angle to be approximated as one, and Eq. (5.59c), which gives the total interference signal at point B when the field of view is small enough for that cosine to be approximated by one, and see that they both have the same $\chi$-independent constant terms:

( ) ( )
det
0
( )
0 0
2
(back)
0
R
R
independent terms ( ) ( )
1 1
( ) ( )
2 2
( ) ( ) ( )[ 2 ( ) ( )]
2
dir dir
fore
a
d
S d S d
A
r d

=
+ +
L
L
-

.

Because we are only trying to find the constant terms in the $z_B^{(cold)}(\chi)$ background radiance (that is, constant terms that are still present when the input radiance $L(\sigma) \approx 0$ because the instrument observes a cold scene), we must be careful to drop everything that is zero when $L(\sigma)$ is zero. Formula (5.40g) in Chapter 5 shows that the integral over $S(\sigma)$ becomes zero when $L(\sigma)$ is zero, so it should be removed to give

( ) ( ) ( )
det
0 0
R
R
cold input radiance source produced the -independent background terms
1
( ) ( ) ( )
2
( ) ( )
2
dir dir fore
a
d S d
A

= +
+

L
L

2
(back)
0
( ) [ 2 ( ) ( )] r d
.

Because these $\chi$-independent background terms are the same no matter how the cosine of the field-of-view angle is approximated, they correctly represent the constant background terms at point B for all reasonable sizes of the interferometer's field of view. Adding them to the $\chi$-dependent terms from Eq. (6.12a) with $H = 1$ thus gives the total background interference signal at point B,

( )
( ) (back) 2
ma
( ) ( ) (
det
0
R
R
( )
M( ) ( ) ( ) ( )[ ( ) ( )]
4
1
( ) ( )
2
cold
B
fore i
a FOV FOV
dir dir f
z
WA
R e d
d S

=

+ +
L L
L

)
0
2
(back)
0
R
( )
( ) ( ) ( )[ 2 ( ) ( )]
2
ore
a
d
A
r d

L .
(6.12b)
We substitute for $S^{(fore)}(\sigma)$ from Eq. (5.51a), with absolute value signs dropped from the arguments because the integral does not cover negative values, to get

( )
( ) (back) 2
ma
( ) ( ) (
det
0
R
R R
( )
M( ) ( ) ( ) ( )[ ( ) ( )]
4
( ) ( ) ( ) ( )
2
cold
B
fore i
a FOV FOV
dir dir f
z
WA
R e d
A
d

+ +
L L
L L

)
0
2
(back)
0
R
( ) ( )
( ) ( ) ( ) [ 2 ( ) ( )]
2
ore
a
a
d
A
r d

L .
(6.12c)

Now that the constant terms have been correctly incorporated into (6.12c), it is easy to get the formula for $z_A^{(cold)}(\chi)$, the total background signal at point A in Fig. 6.2: just set the detector responsivity to $R = 1$. Hence, we see that the total background optical power reaching the detector is

( )
( ) (back) 2
ma
( ) ( ) ( )
det
0
( )
M( ) ( ) ( )[ ( ) ( )]
4
( ) ( ) ( )
2
cold
A
fore i
a FOV FOV
dir dir fore
z
WA
R e d
A
d

+ +
L L
L L

0
2
(back)
0
( )
( ) ( ) [ 2 ( ) ( )]
2
a
a
d
A
r d

L .
(6.12d)

Equations (6.5d), (6.6b), (6.6c), and (6.12a)–(6.12d) give all the information needed to make sense of the formulas (6.8a)–(6.8c) for the $z^{(tot)}(\chi)$ signals at points A, B, and C in Fig. 6.2.
6.5 Background Radiance, Total Error, and Signal Noise
Equations (6.8a)–(6.8c) are not, of course, the complete story because we have not yet considered random errors in the measurements. In reality, we can never measure $z_A^{(tot)}(\chi)$, $z_B^{(tot)}(\chi)$, and $z_C^{(tot)}(\chi)$ directly; instead, what we get from any measurement at points A, B, or C are the noise-contaminated signals $\tilde z_{AN}^{(tot)}(\chi)$, $\tilde z_{BN}^{(tot)}(\chi)$, and $\tilde z_{CN}^{(tot)}(\chi)$ given by the formulas

$$\tilde z_{AN}^{(tot)}(\chi) = z_A(\chi) + z_A^{(cold)}(\chi) + \delta\tilde z_A(\chi), \tag{6.13a}$$

$$\tilde z_{BN}^{(tot)}(\chi) = z_B(\chi) + z_B^{(cold)}(\chi) + \delta\tilde z_B(\chi), \tag{6.13b}$$
and
$$\tilde z_{CN}^{(tot)}(\chi) = z_C(\chi) + z_C^{(cold)}(\chi) + \delta\tilde z_C(\chi). \tag{6.13c}$$

Here $\delta\tilde z_A(\chi)$ represents the noise associated with any signal at point A in Fig. 6.2, $\delta\tilde z_B(\chi)$ represents the noise associated with any signal at point B in Fig. 6.2, and $\delta\tilde z_C(\chi)$ represents the noise associated with any signal at point C in Fig. 6.2. Just like in Eq. (6.1a) above, the noise terms have a $\delta$ to show that they are expected to be small, and they have wavy lines or tildes to show that they are random functions of $\chi$. Tildes are added to $\tilde z_{AN}^{(tot)}(\chi)$, $\tilde z_{BN}^{(tot)}(\chi)$, and $\tilde z_{CN}^{(tot)}(\chi)$ to show that these signals are also random quantities (because they are contaminated by the random noise).
As pointed out in Sec. 6.3, the $z^{(cold)}(\chi)$ signals are special cases of the $z^{(tot)}(\chi)$ interferometer signals; they are just the total signals at points A, B, or C when $z_A(\chi)$, $z_B(\chi)$, and $z_C(\chi)$ are negligible or zero because the interferometer is observing a cold scene having negligible or zero spectral radiance $L(\sigma)$. Hence, when $L$ is negligible or zero, Eqs. (6.13a)–(6.13c) can be specialized by writing

$$\tilde z_{AN}^{(cold)}(\chi) = z_A^{(cold)}(\chi) + \delta\tilde z_A^{(cold)}(\chi), \tag{6.13d}$$

$$\tilde z_{BN}^{(cold)}(\chi) = z_B^{(cold)}(\chi) + \delta\tilde z_B^{(cold)}(\chi), \tag{6.13e}$$
and
$$\tilde z_{CN}^{(cold)}(\chi) = z_C^{(cold)}(\chi) + \delta\tilde z_C^{(cold)}(\chi). \tag{6.13f}$$

Here $\tilde z_{AN}^{(cold)}$, $\tilde z_{BN}^{(cold)}$, and $\tilde z_{CN}^{(cold)}$ represent the noise-contaminated signals at points A, B, and C in Fig. 6.2 for cold-surface observations with negligible or zero $L(\sigma)$, and $\delta\tilde z_A^{(cold)}$, $\delta\tilde z_B^{(cold)}$, and $\delta\tilde z_C^{(cold)}$ are, of course, their noise components.
In a well-designed interferometer, we can assume that many different measurements of $\tilde z_{AN}^{(tot)}$, $\tilde z_{BN}^{(tot)}$, and $\tilde z_{CN}^{(tot)}$ can be averaged together to produce signals contaminated by only negligible amounts of noise. This imposes the requirements

$$E\big(\tilde z_{AN}^{(tot)}(\chi)\big) = z_A^{(tot)}(\chi), \tag{6.14a}$$

$$E\big(\tilde z_{BN}^{(tot)}(\chi)\big) = z_B^{(tot)}(\chi), \tag{6.14b}$$
and
$$E\big(\tilde z_{CN}^{(tot)}(\chi)\big) = z_C^{(tot)}(\chi) \tag{6.14c}$$

on the average or expected values of random functions $\tilde z_{AN}^{(tot)}$, $\tilde z_{BN}^{(tot)}$, and $\tilde z_{CN}^{(tot)}$. Since the $z^{(cold)}$ signals are just special cases of the $z^{(tot)}$ signals, we can also write

$$E\big(\tilde z_{AN}^{(cold)}(\chi)\big) = z_A^{(cold)}(\chi), \tag{6.14d}$$

$$E\big(\tilde z_{BN}^{(cold)}(\chi)\big) = z_B^{(cold)}(\chi), \tag{6.14e}$$
and
$$E\big(\tilde z_{CN}^{(cold)}(\chi)\big) = z_C^{(cold)}(\chi). \tag{6.14f}$$

We substitute (6.13a)–(6.13c) into (6.14a)–(6.14c) and use the linearity of the expectation operator $E$ as explained in Sec. 3.10 of Chapter 3 to get

$$E\big(z_{A,B,C}(\chi) + z_{A,B,C}^{(cold)}(\chi)\big) + E\big(\delta\tilde z_{A,B,C}(\chi)\big) = z_{A,B,C}^{(tot)}(\chi) .$$

Substitution of (6.8a)–(6.8c) gives

$$E\big(z_{A,B,C}^{(tot)}(\chi)\big) + E\big(\delta\tilde z_{A,B,C}(\chi)\big) = z_{A,B,C}^{(tot)}(\chi) ,$$

which becomes, using Eq. (3.9f) of Chapter 3,

$$E\big(\delta\tilde z_{A,B,C}(\chi)\big) = 0 . \tag{6.14g}$$

Similarly, we can substitute (6.13d)–(6.13f) into (6.14d)–(6.14f) and use the linearity of the expectation operator to get

$$E\big(z_{A,B,C}^{(cold)}(\chi)\big) + E\big(\delta\tilde z_{A,B,C}^{(cold)}(\chi)\big) = z_{A,B,C}^{(cold)}(\chi) .$$

According to Eq. (3.9f) of Chapter 3, this also reduces to

$$E\big(\delta\tilde z_{A,B,C}^{(cold)}(\chi)\big) = 0 . \tag{6.14h}$$
Equations (6.14g) and (6.14h) require the expectation or average values of the random functions representing the noise to be equal to zero at every OPD value $\chi$. From this point on, we can think of the $\delta\tilde z$ signal noise as a random signal error whose expectation value is always zero.
Function $L_{mnf}(\sigma)$ is, according to the discussion in Sec. 6.1, the best and most accurate spectral-radiance measurement that an interferometer can produce. It can be recovered from the noise-free signal at points A, B, or C in Fig. 6.2; however, all that we get from a single measurement is the noise-contaminated signal $\tilde z_{AN}^{(tot)}$, $\tilde z_{BN}^{(tot)}$, $\tilde z_{CN}^{(tot)}$ or (when looking at a cold surface) $\tilde z_{AN}^{(cold)}$, $\tilde z_{BN}^{(cold)}$, $\tilde z_{CN}^{(cold)}$. In principle, we could average together large numbers of measurements to get, according to Eqs. (6.14a)–(6.14f), both of the noise-free signals $z_{A,B,C}^{(tot)}$ and $z_{A,B,C}^{(cold)}$. Then, following the recipe in Eqs. (6.8d)–(6.8f), the noise-free $z^{(cold)}$ signal could be subtracted from the noise-free $z^{(tot)}$ signal to get $z_A(\chi)$, $z_B(\chi)$, or $z_C(\chi)$ at points A, B, or C in Fig. 6.2. This is exactly what is needed to gain access to $L_{mnf}(\sigma)$; unfortunately, it is also impractical.
Typically, enough work is invested in calibrating an interferometer to produce very high-quality estimates of the $z^{(cold)}$ signals if we want them. Even when we calibrate in the spectral domain, as discussed in Sec. 5.19 of Chapter 5, the calibration algorithm requires substantially noise-free signal spectra from which we could extract substantially noise-free $z^{(cold)}$ signals. When making everyday measurements, on the other hand, we end up relying on lower-quality information; that is, we use the noise-contaminated $\tilde z_{AN}^{(tot)}$, $\tilde z_{BN}^{(tot)}$, $\tilde z_{CN}^{(tot)}$ signals or their equivalents. Everyday measurements are less accurate than the information used to calibrate the interferometer because that is what it means to calibrate an instrument: however accurate the everyday measurement, and however many noise-suppression averages go into its making, we expect the calibration to be done with even greater care. Hence, when analyzing the $z_{A,B,C}$ signals generated by the input $L(\sigma)$ radiance, we can assume there is always enough noise-free data to subtract off, if only as a thought experiment, the nonrandom functions $z_{A,B,C}^{(cold)}$ from the random functions $\tilde z_{AN,BN,CN}^{(tot)}$ to get

$$\tilde z_{AN}(\chi) = \tilde z_{AN}^{(tot)}(\chi) - z_A^{(cold)}(\chi), \tag{6.15a}$$

$$\tilde z_{BN}(\chi) = \tilde z_{BN}^{(tot)}(\chi) - z_B^{(cold)}(\chi), \tag{6.15b}$$
and
$$\tilde z_{CN}(\chi) = \tilde z_{CN}^{(tot)}(\chi) - z_C^{(cold)}(\chi). \tag{6.15c}$$

Substitution of (6.13a)–(6.13c) now gives

$$\tilde z_{AN}(\chi) = z_A(\chi) + \delta\tilde z_A(\chi), \tag{6.16a}$$

$$\tilde z_{BN}(\chi) = z_B(\chi) + \delta\tilde z_B(\chi), \tag{6.16b}$$
and
$$\tilde z_{CN}(\chi) = z_C(\chi) + \delta\tilde z_C(\chi). \tag{6.16c}$$

Equations (6.16a)–(6.16c) show that any noise in the signals at points A, B, or C in Fig. 6.2 automatically ends up attached to $z_{A,B,C}$; that is, it ends up attached to the signal component used to recover the $L_{mnf}(\sigma)$ spectral radiance measured by the interferometer.
6.6 Detector Noise
Detectors are usually the largest and most noticeable source of noise in interferometer
measurements. Detector noise enters the signal chain at point B in Fig. 6.2; this is where it first
shows up as a random error contaminating the signal. As a general rule, detector noise has many
high-frequency components, changing very rapidly with time as the detector is being used.
During a spectral measurement the moving mirror moves at a steady rate, making the OPD value $\chi$ directly proportional to time [as required by Eq. (6.4)]. Consequently the detector noise can also be written as a rapidly changing random function of $\chi$ at point B,

$$\delta\tilde z_B(\chi) = \tilde n^{(det)}(\chi), \tag{6.17a}$$

and we expect it to obey Eq. (6.14g),

$$E\big(\tilde n^{(det)}(\chi)\big) = 0 . \tag{6.17b}$$

Equation (6.13b) above can now be written as

$$\tilde z_{BN}^{(tot)}(\chi) = z_B(\chi) + z_B^{(cold)}(\chi) + \tilde n^{(det)}(\chi). \tag{6.17c}$$

Since only detector noise is being analyzed in this chapter, we specify here that only negligible amounts of noise occur upstream of point B in Fig. 6.2 by setting

$$\delta\tilde z_A(\chi) = 0 \tag{6.17d}$$

in Eqs. (6.13a) and (6.16a). We also assume that only negligible amounts of extra noise enter the signal chain downstream of point C, which means that $\delta\tilde z_C$ in (6.13c) and (6.16c) comes entirely from the transmission of $\delta\tilde z_B = \tilde n^{(det)}$ between points B and C. Our job is to find what $\delta\tilde z_C$ looks like in terms of $\tilde n^{(det)}$ and then to use that information to find a formula for the NEdN due to detector noise.
Many Fourier-transform systems go to great lengths to minimize detector noise. Some tactics are obvious: for example, careful choice and treatment of detectors so that they perform well and do not generate large amounts of random error. Other tactics are perhaps less obvious: for example, averaging together a large number of interferogram signals to reduce the detector noise present. Section 3.12 of Chapter 3 has a discussion of how averaging of identical, noise-contaminated signals works to reduce random error; and of course Fourier-transform signals are put through computers to extract spectra, making it easy to store and average them. This sort of averaging often involves the combination of many different independent measurements at the same OPD value $\chi$ and is often referred to as co-adding the interferograms. (It should not be confused with the averaging discussed in Sec. 6.8 below, where we talk about averaging together the signal values at $\chi$ and $-\chi$.) There are two points that should be kept in mind when reading the balance of this chapter:
(1) However much effort is put into co-adding interferograms to reduce noise, almost always (as discussed at the end of the previous section) even more effort is put into processing the calibration data to reduce noise; and
(2) The $\tilde n^{(det)}$ random function in Eq. (6.17a) above can be taken to represent the amount of noise that still contaminates the signal after co-adding has occurred.
We can, in effect, pretend that co-adding is something that happens to the signal immediately after it leaves the detector, acting to reduce the noise at point B and all points further downstream in the signal processing chain of Fig. 6.2.
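
As a rough numerical illustration of co-adding, the sketch below averages many simulated scans taken on the same OPD grid. Everything in it is an assumption made for the example: the toy noise-free signal, the noise level `sigma_n`, the number of scans, and the helper name `coadd` are all hypothetical, and the noise is taken to be independent, zero-mean Gaussian noise, which real detector noise need not be.

```python
import numpy as np

def coadd(interferograms):
    """Average noise-contaminated interferograms sampled at the same OPD values."""
    return np.mean(np.asarray(interferograms, dtype=float), axis=0)

rng = np.random.default_rng(0)
chi = np.linspace(-1.0, 1.0, 2001)              # OPD grid (arbitrary units)
# Toy noise-free signal at point B (an assumption, not the book's z_B)
z_B = np.cos(2 * np.pi * 5.0 * chi) * np.exp(-(chi / 0.3) ** 2)

sigma_n = 0.5                                    # per-scan noise standard deviation
n_scans = 400                                    # number of co-added scans
scans = z_B + sigma_n * rng.standard_normal((n_scans, chi.size))

single_rms = np.std(scans[0] - z_B)              # residual noise in one scan
coadded_rms = np.std(coadd(scans) - z_B)         # residual noise after co-adding
print(single_rms, coadded_rms, sigma_n / np.sqrt(n_scans))
```

For independent zero-mean noise the residual after co-adding should come out close to `sigma_n / sqrt(n_scans)`; that residual is what the $\tilde n^{(det)}$ of point (2) above can be taken to stand for.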
6.7 1/f Noise in Detectors
Calibrations tend to go stale; that is, the more time there is between when an instrument is calibrated and when it is used, the less accurate the measurements are. In particular this is true of the detectors in Fourier-transform spectrometers, or indeed the detectors of any optical instrument: the longer the time between calibration and use, the less accurately do we know how the detectors respond to incoming photons. In optical detectors this phenomenon is often referred to as 1/f noise for reasons that will be explained below.
Suppose, as a thought experiment if nothing else, that a collection of $K$ identical detectors are all calibrated at the same time and we then keep track of their random errors as the calibrations go stale. (Note that the error due to the calibrations going stale must be random or else we could study how the detector response changes with time and correct for it.) We do this over a very long time interval $\Delta t$. From this set of data we then select, for each detector, a subset of data covering a time interval $2T$ with $2T \ll \Delta t$, and the $2T$ time interval is then used to construct $K$ error functions for the $k = 1, 2, \ldots, K$ identical detectors. We call these functions

$$n_k^{(det)}(t) = \text{measured detector noise for the } k\text{th detector as a function of time } t, \quad \text{with } -T \le t \le T .$$

Note that although these are error functions, they are not random since they represent the actual measured error for each detector. Because the detectors are all identical, each $n_k^{(det)}(t)$ can be thought of as a specific instance of the same random function $\tilde n^{(det)}(t)$; that is, each $n_k^{(det)}(t)$ can be treated as a typical member of the ensemble of functions associated with the $\tilde n^{(det)}(t)$ random function.[99] Returning briefly to Sec. 3.23 of Chapter 3, we use Eq. (3.56a) to calculate another set of functions,

$$N_T^{(k)}(f) = \int_{-T}^{T} n_k^{(det)}(t)\, e^{-2\pi i f t}\, dt .$$

Each $N_T^{(k)}(f)$ can be regarded as a member of the ensemble of functions associated with random function

$$\tilde N_T(f) = \int_{-T}^{T} \tilde n^{(det)}(t)\, e^{-2\pi i f t}\, dt .$$

Formula (3.57g) in Chapter 3 then states that the noise-power spectrum of $\tilde n^{(det)}(t)$ is

$$S_{nn}(f) = \lim_{T \to \infty} \frac{E\big(|\tilde N_T(f)|^2\big)}{2T} .$$
[99] See Sec. 3.14 of Chapter 3 for an explanation of what is meant by an ensemble of functions.

Because the expected value of a random quantity can be estimated by taking its average, we can write that

$$S_{nn}(f) \approx \lim_{T \to \infty} \frac{1}{2T} \, \frac{1}{K} \sum_{k=1}^{K} \big|N_T^{(k)}(f)\big|^2$$

and the formula reduces to

$$S_{nn}(f) \approx \frac{1}{K} \sum_{k=1}^{K} \frac{\big|N_T^{(k)}(f)\big|^2}{2T}$$

when we assume that $T$ is large enough for the value of

$$\frac{\big|N_T^{(k)}(f)\big|^2}{2T}$$

to be close to its limit as $T \to \infty$. This result shows how to calculate the noise-power spectrum of the $k = 1, 2, \ldots, K$ identical detectors. When discussing 1/f noise, it is customary to introduce one final step: using Eq. (3.58b) to go from the double-sided power spectrum $S_{nn}$ to the single-sided power spectrum $S_{nn}^{(1)}$,

$$S_{nn}^{(1)}(f) = 2\, S_{nn}(f) \quad \text{for } f > 0 .$$

This is really just a change of scale (doubling the size of the noise-power spectrum) along with an agreement to ignore the negative f values because they are always the same as the positive ones [see Eq. (3.49b) in Chapter 3].
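
The ensemble-average estimate of the noise-power spectrum described above can be sketched numerically as follows. This is only an illustration under stated assumptions: the records are sampled at a uniform step `dt`, the finite-time Fourier integral is approximated by a discrete Fourier transform scaled by `dt`, and the test input is simulated white noise rather than measured detector data. The function name `single_sided_psd` and all parameter values are hypothetical. Shifting the time origin from $[-T, T]$ to $[0, 2T]$ changes only the phase of each transform, not its squared magnitude.

```python
import numpy as np

def single_sided_psd(records, dt):
    """Average |N_T^(k)(f)|^2 / (2T) over the K records, then double for f > 0."""
    records = np.asarray(records, dtype=float)   # shape (K, n): one row per detector
    n = records.shape[1]
    duration = n * dt                            # plays the role of 2T
    # Discrete approximation of the finite-time Fourier integral
    N_T = np.fft.rfft(records, axis=1) * dt
    S_two_sided = np.mean(np.abs(N_T) ** 2, axis=0) / duration
    freqs = np.fft.rfftfreq(n, d=dt)
    # Drop f = 0 and the Nyquist bin (n is assumed even), keep 0 < f < 1/(2 dt)
    return freqs[1:-1], 2.0 * S_two_sided[1:-1]

rng = np.random.default_rng(1)
dt, n, K = 1e-3, 4096, 200                       # sample step, samples per record, records
sigma = 2.0                                      # white-noise standard deviation
records = sigma * rng.standard_normal((K, n))
f, S1 = single_sided_psd(records, dt)

expected = 2.0 * sigma**2 * dt                   # flat single-sided level for sampled white noise
print(np.mean(S1), expected)
```

For this white-noise input the estimate is flat; a record containing 1/f noise would instead show the estimate rising toward low frequencies.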
Figure 6.3(a) shows a typical plot, for detector noise, of $S_{nn}^{(1)}$ versus f on a log-log scale. For most detectors, there is a corner frequency $f_c$ such that when $f > f_c$ the value of $S_{nn}^{(1)}$ is essentially constant over a wide range of frequencies (before rolling off at very high f). When $f < f_c$, on the other hand, the value of $S_{nn}^{(1)}$ is typically proportional to $1/f^{\alpha}$, with $\alpha$ approximately equal to one. Low frequencies correspond to long time intervals, so the growth in the value of $S_{nn}^{(1)}$ as f gets small reflects the way detector calibrations go stale as time goes by. It has become convenient to refer to this phenomenon as detector 1/f noise because in many detectors the corner frequency $f_c$ is relatively large, meaning that their calibrations start to go stale in a very small fraction of a second. We like to set up Fourier-transform systems so that the low-frequency noise at $f < f_c$ cannot significantly contaminate our measurements. The basic strategy for doing this is to use high-quality detectors (meaning that $f_c$ is small) and calibrate often enough that 1/f noise does not become important. This is only the first line of defense; there are other ways of minimizing the effect of 1/f noise and they will be pointed out in the remainder of the chapter when appropriate.
A mathematical point often ignored in elementary discussions of 1/f noise is that if noise-power spectra are 1/f all the way down to zero frequency, then integrals over frequency that include the zero must diverge; that is, they become infinite. Standard treatments of random function theory require the use of these integrals. Equation (3.48d) in Chapter 3, for example, shows that $R_{nn}(0)$ is equal to the integral of the power spectrum over all frequency values including, of course, $f = 0$. Hence, the integral formula for $R_{nn}(0)$ diverges when the power spectrum is 1/f all the way down to zero. According to Eq. (3.48a) in Chapter 3, $R_{nn}(0)$ is just the squared standard deviation of the random function at any time t. This squared standard deviation must have a well-defined value to describe the detector noise accurately. Consequently, the integral for $R_{nn}(0)$ cannot be allowed to diverge. Perhaps the quickest way out of this problem is to note that zero frequency corresponds to the most recent calibration occurring an infinite time in the past; so, as long as the detectors have been calibrated more recently than that, we do not expect the 1/f region of the noise-power spectrum to extend all the way down to zero. In general, when the 1/f form of the noise-power spectrum leads to problems near $f = 0$, it means that an important aspect of the random error (an aspect which prevents the 1/f noise from producing infinite integrals) has been left out of the noise model.
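
The divergence is only logarithmic, which a few lines of arithmetic make concrete. The sketch below assumes an idealized single-sided spectrum equal to `level * f_c / f` between a lower cutoff `f_low` and the corner frequency `f_c`; the function name and all numerical values are hypothetical. The integral has the closed form `level * f_c * ln(f_c / f_low)`, which grows without bound as `f_low` approaches zero but does so very slowly.

```python
import numpy as np

def one_over_f_variance(f_low, f_c, level=1.0):
    """Integral of level * (f_c / f) over f_low <= f <= f_c (closed form)."""
    return level * f_c * np.log(f_c / f_low)

f_c = 10.0                                       # hypothetical corner frequency
for f_low in (1e-1, 1e-3, 1e-6, 1e-12):
    # Each factor-of-1000 reduction in f_low adds the same fixed increment
    print(f_low, one_over_f_variance(f_low, f_c))
```

Any finite lower cutoff, such as one set by the time elapsed since the most recent calibration, keeps the 1/f contribution to the variance finite.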
Since 1/f noise can usually be neglected when analyzing the effects of detector noise on the spectral measurements of well-designed Fourier-transform spectrometers, many models of detector noise assume that it can be approximated as band-limited white noise of the type discussed in Sec. 3.25 of Chapter 3. The white-noise level used for this approximation is typically given by the level part of the power spectrum for frequencies $f > f_c$ in Fig. 6.3(a) (before the roll off at very high frequencies). In this chapter, except for Secs. 6.13 and 6.15, we do not make this sort of approximation, which is why the treatment of detector noise given here may seem overly elaborate to those familiar with other presentations of the topic. Modeling detector noise as band-limited white noise may capture most of the basic features of detector noise in Fourier-transform spectrometers, but it can be misleading when analyzing the effects of 1/f noise and other types of nonstandard detector errors.
6.8 Avoidable and Unavoidable Noise in Double-Sided Signals
Equation (6.15b) shows that when specialized calibration procedures are used to measure the $z_B^{(cold)}(\chi)$ signal, we can then subtract it from $\tilde z_{BN}^{(tot)}(\chi)$, giving us access to the noise-contaminated interferogram signal $\tilde z_{BN}(\chi)$ specified in Eq. (6.16b). According to (6.17a), when analyzing detector noise the signal in (6.16b) should be written as

$$\tilde z_{BN}(\chi) = z_B(\chi) + \tilde n^{(det)}(\chi). \tag{6.18a}$$

From Eq. (6.6b), we know that the noise-free signal $z_B(\chi)$ in (6.18a) is

2
ma
R
R
( )
( ) ( ) ( ) ( ) ( )
4
M( ) ( ) ( ) ( ) ( ) ( )
4
B
f a
i
f a FOV
z
A
d
WA
R e d
r o
q o o t o t o o o
o o q o t o t o o o
AO
AO
+
L
L

.

Consulting Eqs. (4.139g) of Chapter 4 and (5.10f) of Chapter 5, we see that and M are even
functions of $\sigma$. This turns the second integral on the right-hand side into the inverse Fourier transform of a real and even function of $\sigma$. Therefore, according to entry 1 of Table 2.1 in Chapter 2, the integral itself is a real and even function of $\chi$. Because the first integral on the right-hand side is a constant, independent of $\chi$, we conclude that the noise-free signal $z_B(\chi)$ must also be a real and even function of $\chi$,

$$z_B(-\chi) = z_B(\chi). \tag{6.18b}$$

Glancing back at the formula for $\tilde z_{BN}(\chi)$ in Eq. (6.18a), we see that the detector noise $\tilde n^{(det)}(\chi)$ is, however, another story: it would be strange indeed if the random error coming from the detector were an even function of $\chi$. The detector cannot possibly care what the position of the moving mirror is; the only reason $\tilde n^{(det)}$ depends on $\chi$ is that we acknowledge $\tilde n^{(det)}$ to be a function of time and then use Eq. (6.4) to make it a function of $\chi$. Consequently $\tilde z_{BN}$, the sum of $z_B$ and $\tilde n^{(det)}$ in (6.18a), is an uneven function of $\chi$ only because it is a noise-contaminated signal. This distinction between $z_B(\chi)$ and $\tilde n^{(det)}(\chi)$, that one is an even function and the other is not, can, in principle, be used to reduce the NEdN of the interferometer's spectral measurements. (In practice we always have to worry about the distorting effect of any circuit used to measure the detector signal; see for example the discussion of the detector circuit in Sec. 5.12 of Chapter 5.) For this reason, we say that some of the noise contributed to $z_B(\chi)$ by $\tilde n^{(det)}(\chi)$ is avoidable noise, that is, noise that can be eliminated by an intelligent analysis of the $\tilde z_{BN}$ signal.
Perhaps the quickest way to distinguish the avoidable and unavoidable noise in $\tilde z_{BN}(\chi)$ is to recall the discussion following Eq. (2.11b) in Chapter 2, where it is pointed out that any function can be written as the sum of even and odd components. Hence, we can always write

$$\tilde n^{(det)}(\chi) = \tilde n_e^{(det)}(\chi) + \tilde n_o^{(det)}(\chi), \tag{6.19a}$$
with
$$\tilde n_e^{(det)}(\chi) = \frac{1}{2}\left[\tilde n^{(det)}(\chi) + \tilde n^{(det)}(-\chi)\right] \tag{6.19b}$$
and
$$\tilde n_o^{(det)}(\chi) = \frac{1}{2}\left[\tilde n^{(det)}(\chi) - \tilde n^{(det)}(-\chi)\right]. \tag{6.19c}$$

Here $\tilde n_e^{(det)}$ is the even component of $\tilde n^{(det)}$ and $\tilde n_o^{(det)}$ is the odd component of $\tilde n^{(det)}$,

$$\tilde n_e^{(det)}(-\chi) = \tilde n_e^{(det)}(\chi) \tag{6.19d}$$
and
$$\tilde n_o^{(det)}(-\chi) = -\tilde n_o^{(det)}(\chi). \tag{6.19e}$$

Equations (6.19d) and (6.19e) are just the definition of what it means for a function to be even or odd [see Eqs. (2.11a) and (2.11b) in Chapter 2], and it is easy to see that (6.19d) and (6.19e) are true by checking what happens when the sign of the argument is changed in formulas (6.19b) and (6.19c). Substitution of (6.19a) into (6.18a) gives

$$\tilde z_{BN}(\chi) = \left[z_B(\chi) + \tilde n_e^{(det)}(\chi)\right] + \tilde n_o^{(det)}(\chi). \tag{6.19f}$$
A little thought shows that $\tilde n_e^{(det)}(\chi)$ must be the unavoidable component of the noise, because there is no way to distinguish the noise-contaminated sum inside the square brackets [ ] from a noise-free measurement of a $z_B(\chi)$ interference signal. The $\tilde n_o^{(det)}(\chi)$ noise, on the other hand, is an avoidable source of error. We could, for example, eliminate it by averaging together $\tilde z_{BN}(\chi)$ and $\tilde z_{BN}(-\chi)$,

$$\begin{aligned}
\frac{1}{2}\left[\tilde z_{BN}(\chi) + \tilde z_{BN}(-\chi)\right]
&= \frac{1}{2}\left[z_B(\chi) + \tilde n_e^{(det)}(\chi) + \tilde n_o^{(det)}(\chi)\right] + \frac{1}{2}\left[z_B(-\chi) + \tilde n_e^{(det)}(-\chi) + \tilde n_o^{(det)}(-\chi)\right] \\
&= \frac{1}{2}\left[z_B(\chi) + z_B(-\chi)\right] + \frac{1}{2}\left[\tilde n_e^{(det)}(\chi) + \tilde n_e^{(det)}(-\chi)\right] + \frac{1}{2}\left[\tilde n_o^{(det)}(\chi) + \tilde n_o^{(det)}(-\chi)\right] \\
&= z_B(\chi) + \tilde n_e^{(det)}(\chi) ,
\end{aligned}$$

where in the last step Eqs. (6.18b), (6.19d), and (6.19e) are used to show that the average produces signal $z_B(\chi)$ contaminated only by $\tilde n_e^{(det)}(\chi)$, the unavoidable even-noise component. Although in practice the avoidable noise $\tilde n_o^{(det)}(\chi)$ is usually not averaged away at this point in the signal processing chain, it could in principle be eliminated this way. To show that the $\tilde n_o^{(det)}(\chi)$ avoidable noise has not yet been eliminated from the noise-contaminated signal, we substitute (6.19a) into (6.17c) to get

$$\tilde z_{BN}^{(tot)}(\chi) = z_B(\chi) + z_B^{(cold)}(\chi) + \tilde n_e^{(det)}(\chi) + \tilde n_o^{(det)}(\chi). \tag{6.19g}$$

For now, this is still the signal we trace through the signal chain, always remembering that only the $\tilde n_e^{(det)}$ noise component is an unavoidable source of signal contamination.
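
The even/odd split of the noise and the effect of averaging the signal values at $\chi$ and $-\chi$ can be checked numerically. The sketch below is a toy example: the even noise-free signal, the noise level, and the use of uncorrelated Gaussian noise are assumptions, and the variable names are hypothetical. On a symmetric OPD grid, reversing an array plays the role of replacing $\chi$ by $-\chi$.

```python
import numpy as np

rng = np.random.default_rng(2)
m = 1000
chi = np.arange(-m, m + 1) * 1e-3                # symmetric grid: reversing it maps chi to -chi
z_B = np.cos(2 * np.pi * 3.0 * chi)              # toy even, noise-free signal
n_det = 0.3 * rng.standard_normal(chi.size)      # toy detector noise, neither even nor odd

n_even = 0.5 * (n_det + n_det[::-1])             # even component, as in Eq. (6.19b)
n_odd = 0.5 * (n_det - n_det[::-1])              # odd component, as in Eq. (6.19c)

z_BN = z_B + n_det                               # noise-contaminated signal, Eq. (6.18a)
z_sym = 0.5 * (z_BN + z_BN[::-1])                # average of the values at chi and -chi

# The odd component cancels, leaving the signal plus the even component only
print(np.max(np.abs(z_sym - (z_B + n_even))))
print(np.var(z_sym - z_B) / np.var(n_det))       # near 0.5 for this uncorrelated toy noise
```

The variance ratio near one half is a property of the uncorrelated noise assumed here; correlated noise, such as 1/f noise, could divide differently between the even and odd components.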
6.9 Passing the Detector Noise Through the Detector Circuit
The discussion following Eq. (5.43) of Chapter 5 points out that the detector circuit must be linear, which means it obeys the rules outlined in Appendix 5A of Chapter 5. The analysis following Eq. (5A.2a) in Appendix 5A of Chapter 5 shows that if the input to the detector circuit is the sum of two signals, for example,

$$\left[z_B(\chi) + z_B^{(cold)}(\chi)\right] \quad \text{and} \quad \left[\tilde n_e^{(det)}(\chi) + \tilde n_o^{(det)}(\chi)\right],$$

as in Eq. (6.19g), then the output signal must be the sum of the outputs generated by each signal going through the circuit separately. We already know, according to Eqs. (6.8b) and (6.8c) above, that the output corresponding to input

$$\left[z_B(\chi) + z_B^{(cold)}(\chi)\right] \quad \text{is} \quad \left[z_C(\chi) + z_C^{(cold)}(\chi)\right];$$

and we also know that the total signal plus noise leaving the detector circuit at point C is, according to Eq. (6.13c),

$$\tilde z_{CN}^{(tot)}(\chi) = z_C(\chi) + z_C^{(cold)}(\chi) + \delta\tilde z_C(\chi). \tag{6.20a}$$

Hence $\delta\tilde z_C(\chi)$, the noise contaminating the signal at point C, is the signal we would get when passing the sum

$$\tilde n^{(det)}(\chi) = \tilde n_e^{(det)}(\chi) + \tilde n_o^{(det)}(\chi)$$

of both the avoidable and unavoidable noise through the detector circuit as a separate signal.
The first step in sending the total detector noise $\tilde n^{(det)}(\chi)$ through the detector circuit is to use Eq. (6.4) above to convert

$$\tilde n^{(det)}(\chi) = \tilde n^{(det)}(ut) \tag{6.20b}$$

into a function of time. Then, using formula (5A.1a) in Appendix 5A of Chapter 5, we know the corresponding output is

$$\int_{-\infty}^{\infty} \tilde n^{(det)}(ut')\, h(t - t')\, dt' ,$$

where h(t) is the impulse-response function of the detector circuit. Following the suggestion in Eq. (6.4), we change the variable of integration to $\chi' = ut'$. The detector circuit's output corresponding to the $\tilde n^{(det)}$ input is then

$$\frac{1}{u} \int_{-\infty}^{\infty} \tilde n^{(det)}(\chi')\, h\!\left(t - \frac{\chi'}{u}\right) d\chi' .$$

Now we substitute $t = \chi/u$ from Eq. (6.4) to get the noise output corresponding to input $\tilde n^{(det)}$ as a function of $\chi$,

$$\frac{1}{u} \int_{-\infty}^{\infty} \tilde n^{(det)}(\chi')\, h\!\left(\frac{\chi - \chi'}{u}\right) d\chi' .$$

The discussion following Eq. (6.20a) above shows that $\delta\tilde z_C(\chi)$ must be exactly this integral, that is, the output of the detector circuit corresponding to input $\tilde n^{(det)}$. Therefore, we can write

$$\delta\tilde z_C(\chi) = \frac{1}{u} \int_{-\infty}^{\infty} \tilde n^{(det)}(\chi')\, h\!\left(\frac{\chi - \chi'}{u}\right) d\chi' . \tag{6.20c}$$

Glancing back at the definition of the convolution in Eq. (2.38a) of Chapter 2, we note that this can be written as

$$\delta\tilde z_C(\chi) = \frac{1}{u}\, \tilde n^{(det)}(\chi) * h\!\left(\frac{\chi}{u}\right). \tag{6.20d}$$

Equations (6.20c) and (6.20d) are exact formulas for $\delta\tilde z_C(\chi)$, but there is also an approximation for it that is often useful. According to the analysis at the beginning of Appendix 5A to Chapter 5, when h(t) is a narrow function of time the output of the detector circuit is just a slightly blurred and distorted version of the input; and, according to the discussion at the end of Sec. 5.12 of Chapter 5, detector circuits are typically designed to produce this sort of output. We can almost always assume that h(t) is relatively narrow, that is, that there exists a time $T$ such that h(t) is negligible when t lies outside the time interval between $+T$ and $-T$,

$$h(t) \approx 0 \quad \text{for } |t| > T . \tag{6.21a}$$

In fact, if h is causal, we can also assume that $h(t) = 0$ for $t < 0$ [see Eq. (5A.5) in Appendix 5A of Chapter 5]. Therefore the time-based output of the detector circuit can be approximated as

$$\int_{-\infty}^{\infty} \tilde n^{(det)}(ut')\, h(t - t')\, dt' \approx \int_{t - T}^{t + T} \tilde n^{(det)}(ut')\, h(t - t')\, dt' . \tag{6.21b}$$

Again we change the $t'$ dummy variable of integration to $\chi' = ut'$ and replace the time parameter t by $t = \chi/u$ to get

$$\frac{1}{u} \int_{-\infty}^{\infty} \tilde n^{(det)}(\chi')\, h\!\left(\frac{\chi - \chi'}{u}\right) d\chi' \approx \frac{1}{u} \int_{\chi - uT}^{\chi + uT} \tilde n^{(det)}(\chi')\, h\!\left(\frac{\chi - \chi'}{u}\right) d\chi' . \tag{6.21c}$$

According to the definition of convolution in Eq. (2.38a) of Chapter 2, this can also be written as

$$\tilde n^{(det)}(\chi) * h\!\left(\frac{\chi}{u}\right) \approx \int_{\chi - uT}^{\chi + uT} \tilde n^{(det)}(\chi')\, h\!\left(\frac{\chi - \chi'}{u}\right) d\chi' . \tag{6.21d}$$

Hence Eq. (6.20d) can be approximated as

$$\delta\tilde z_C(\chi) \approx \frac{1}{u} \int_{\chi - uT}^{\chi + uT} \tilde n^{(det)}(\chi')\, h\!\left(\frac{\chi - \chi'}{u}\right) d\chi' . \tag{6.21e}$$
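
The narrow-impulse-response approximation can be illustrated with a discrete convolution in the time domain. The sketch below is a toy model under stated assumptions: the detector circuit is represented by a hypothetical causal single-pole response with time constant `tau`, the noise is simulated white noise, and the cutoff `T_cut` (standing in for the time beyond which h(t) is negligible) is chosen arbitrarily as ten time constants. Because this h is causal, only the part of the window with $t' \le t$ contributes.

```python
import numpy as np

rng = np.random.default_rng(3)
dt = 1e-4                                        # time step in seconds
t_h = np.arange(0, 2000) * dt                    # causal support: h(t) = 0 for t < 0
tau = 5e-3                                       # hypothetical circuit time constant
h = np.exp(-t_h / tau) / tau                     # toy single-pole impulse response

n_det = rng.standard_normal(20000)               # toy detector-noise samples n(u t)

# Discrete version of the full convolution integral of n(u t') h(t - t') over t'
out_full = np.convolve(n_det, h)[: n_det.size] * dt

# Same integral with h discarded beyond the cutoff where it is negligible
T_cut = 10 * tau
h_cut = h[t_h <= T_cut]
out_cut = np.convolve(n_det, h_cut)[: n_det.size] * dt

rel_err = np.max(np.abs(out_full - out_cut)) / np.max(np.abs(out_full))
print(rel_err)
```

The relative error shrinks rapidly as the cutoff grows relative to `tau`, which is why a finite integration window is usually an acceptable substitute for the exact convolution.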

6.10 Total Detector Noise in Double-Sided Signals
Having found the formula for $\delta\tilde z_C(\chi)$, we substitute (6.20d) into (6.20a) to get the total noise-contaminated signal at point C in Fig. 6.2,

$$\tilde z_{CN}^{(tot)}(\chi) = z_C(\chi) + z_C^{(cold)}(\chi) + \frac{1}{u}\, \tilde n^{(det)}(\chi) * h\!\left(\frac{\chi}{u}\right). \tag{6.22a}$$

We multiply $\tilde z_{CN}^{(tot)}(\chi)$ in (6.22a) by

$$\Pi(\chi, D) = \begin{cases} 1 & \text{for } |\chi| \le D \\ 0 & \text{for } |\chi| > D \end{cases} \tag{6.22b}$$

to make it a double-sided signal, following the same tactic used before in Eq. (5.106a) of Chapter 5. Function $\Pi(\chi, D)$ is given the same definition as in Appendix 4C of Chapter 4 [see Eq. (4C.1a)]. The formula for the double-sided and noise-contaminated signal used to measure the spectral radiance thus becomes

$$\Pi(\chi, D)\, \tilde z_{CN}^{(tot)}(\chi) = \Pi(\chi, D)\, z_C(\chi) + \Pi(\chi, D)\, z_C^{(cold)}(\chi) + \frac{1}{u}\, \Pi(\chi, D) \left[\tilde n^{(det)}(\chi) * h\!\left(\frac{\chi}{u}\right)\right]. \tag{6.22c}$$

Section 5.11 of Chapter 5 explains why there is always an effective spectrum corresponding to an interferometer
signal. We now develop a formula for the detector noise-contaminated effective spectrum corresponding to the
signal in Eq. (6.22c). Applying the Fourier transform to both sides of the equation gives, because the Fourier
transform is linear (see Sec. 2.6 of Chapter 2),
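
$$\begin{aligned}
\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi, D)\,z_{CN}^{(\mathrm{tot})}(\chi)\right) ={}& \mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi, D)\,z_C(\chi)\right) + \mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi, D)\,z_C^{(\mathrm{cold})}(\chi)\right) \\
&+ \mathcal{F}^{(-i\sigma\chi)}\!\left(\frac{1}{u}\,\Pi(\chi, D)\left[n^{(\mathrm{det})}(\chi) * h\!\left(\frac{\chi}{u}\right)\right]\right).
\end{aligned} \tag{6.22d}$$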

Evaluating the first Fourier transform on the right-hand side of (6.22d) is not very difficult.
The remark following Eq. (6.5d) above points out that $z_C(\chi)$ is the same signal as $z(\chi)$ in Eq.
(5.104a) of Chapter 5. The discussion following (5.104a) shows that $\Pi(\chi, D)\,z_C(\chi)$ must then be
the same signal function that we called $z_{\mathrm{trunc}}(\chi)$ in (5.106a). This means the Fourier transform

$$\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi, D)\,z_C(\chi)\right)$$

is the same quantity as $Z^{\mathrm{eff}}_{\mathrm{trunc}}(\sigma)$ specified in Eqs. (5.108a) and (5.108b). According to Eq.
(5.108c), function $Z^{\mathrm{eff}}_{\mathrm{trunc}}(\sigma)$ can be approximated as

$$\mathrm{R}_{\mathrm{ma}}\,\frac{WA\,\Delta\Omega}{4}\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma)\,\eta(\sigma)\,R(|\sigma|)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|)\,L_{\mathrm{mnf}}(|\sigma|),$$

where

$$L_{\mathrm{mnf}}(|\sigma|) = [2D\,\mathrm{sinc}(2\pi\sigma D)] * L_{\mathrm{FOV}}(|\sigma|). \tag{6.23a}$$

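We conclude that the same expression can be used to approximate $\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi, D)\,z_C(\chi)\right)$; that
is, we can write that

$$\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi, D)\,z_C(\chi)\right) \cong \mathrm{R}_{\mathrm{ma}}\,\frac{WA\,\Delta\Omega}{4}\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma)\,\eta(\sigma)\,R(|\sigma|)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|)\,L_{\mathrm{mnf}}(|\sigma|). \tag{6.23b}$$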

The second Fourier transform on the right-hand side of (6.22d) is not much more difficult.
According to the Fourier convolution theorem [see Eq. (2.39j) in Chapter 2],

$$\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi, D)\,z_C^{(\mathrm{cold})}(\chi)\right) = \mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi, D)\right) * \mathcal{F}^{(-i\sigma\chi)}\!\left(z_C^{(\mathrm{cold})}(\chi)\right). \tag{6.24a}$$

Equation (5.65a) in Chapter 5 and the definition of $\mathcal{F}^{(-i\sigma\chi)}$ from Eq. (2.29a) in Chapter 2 give

$$\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi, D)\right) = \int_{-\infty}^{\infty} \Pi(\chi, D)\,e^{-2\pi i\sigma\chi}\,d\chi = 2D\,\mathrm{sinc}(2\pi\sigma D), \tag{6.24b}$$

where

$$\mathrm{sinc}(x) = \frac{\sin(x)}{x}$$

is defined in Eq. (2.106d). Hence, Eq. (6.24a) can be written as

$$\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi, D)\,z_C^{(\mathrm{cold})}(\chi)\right) = [2D\,\mathrm{sinc}(2\pi\sigma D)] * \mathcal{F}^{(-i\sigma\chi)}\!\left(z_C^{(\mathrm{cold})}(\chi)\right). \tag{6.24c}$$
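Equation (6.24b) is easy to confirm numerically. The following sketch is an illustration only: the helper names `sinc_unnormalized` and `boxcar_transform`, the value of `D`, and the sample wavenumbers are all chosen for this example. It integrates the complex exponential over the interval from −D to +D with a trapezoid rule and compares the result with 2D sinc(2πσD). Note that the sinc used in the text is sin(x)/x, whereas NumPy's `np.sinc(y)` is sin(πy)/(πy), so the argument has to be rescaled.

```python
import numpy as np

def sinc_unnormalized(x):
    """sin(x)/x, including the x = 0 limit; np.sinc(y) is sin(pi*y)/(pi*y)."""
    return np.sinc(np.asarray(x) / np.pi)

def boxcar_transform(sigma, D, n=20001):
    """Trapezoid-rule integral of exp(-2*pi*i*sigma*chi) over -D <= chi <= D."""
    chi = np.linspace(-D, D, n)
    integrand = np.exp(-2j * np.pi * sigma * chi)
    dchi = chi[1] - chi[0]
    return dchi * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))

D = 1.5                                             # hypothetical maximum OPD
sigmas = np.array([0.0, 0.1, 0.37, 1.0, 2.5])       # hypothetical wavenumbers
numeric = np.array([boxcar_transform(s, D) for s in sigmas])
analytic = 2 * D * sinc_unnormalized(2 * np.pi * sigmas * D)
max_abs_diff = np.max(np.abs(numeric - analytic))
print(max_abs_diff)
```

The imaginary part of the numerical integral is zero to rounding error, as expected for the transform of a real and even function.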

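Consulting Eq. (6.12a) above, we note that $z_C^{(\mathrm{cold})}(\chi)$ is the inverse Fourier transform of

$$\mathrm{R}_{\mathrm{ma}}\,\frac{WA\,\Delta\Omega}{4}\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma)\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\left[L^{(\mathrm{fore})}_{\mathrm{FOV}}(|\sigma|) - L^{(\mathrm{back})}_{\mathrm{FOV}}(|\sigma|)\right],$$

which means that $\mathcal{F}^{(-i\sigma\chi)}\!\left(z_C^{(\mathrm{cold})}(\chi)\right)$, the forward Fourier transform of $z_C^{(\mathrm{cold})}(\chi)$, is

$$\mathcal{F}^{(-i\sigma\chi)}\!\left(z_C^{(\mathrm{cold})}(\chi)\right) = \mathrm{R}_{\mathrm{ma}}\,\frac{WA\,\Delta\Omega}{4}\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma)\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\left[L^{(\mathrm{fore})}_{\mathrm{FOV}}(|\sigma|) - L^{(\mathrm{back})}_{\mathrm{FOV}}(|\sigma|)\right]. \tag{6.24d}$$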

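According to the discussion following Eq. (5.82c), the quantities M, H, R, $\eta$, and $\tau_a$ are all
slowly varying functions of their arguments. Hence they can, following the reasoning explained
in Appendix 5C of Chapter 5, be treated as quasi-constants with respect to the narrow sinc
convolution when (6.24d) is substituted into (6.24c). This leads to the approximation

$$\begin{aligned}
\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi, D)\,z_C^{(\mathrm{cold})}(\chi)\right) \cong{}& \mathrm{R}_{\mathrm{ma}}\,\frac{WA\,\Delta\Omega}{4}\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma)\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|) \\
&\times \left\{[2D\,\mathrm{sinc}(2\pi\sigma D)] * \left[L^{(\mathrm{fore})}_{\mathrm{FOV}}(|\sigma|) - L^{(\mathrm{back})}_{\mathrm{FOV}}(|\sigma|)\right]\right\}.
\end{aligned}$$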

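The linearity of the convolution [see, for example, Eq. (2.38d) of Chapter 2] now lets us write

$$\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi, D)\,z_C^{(\mathrm{cold})}(\chi)\right) \cong \mathrm{R}_{\mathrm{ma}}\,\frac{WA\,\Delta\Omega}{4}\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma)\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\left[L^{(\mathrm{fore})}_{\mathrm{mnf}}(|\sigma|) - L^{(\mathrm{back})}_{\mathrm{mnf}}(|\sigma|)\right], \tag{6.25a}$$

where

$$L^{(\mathrm{fore})}_{\mathrm{mnf}}(\sigma) = [2D\,\mathrm{sinc}(2\pi\sigma D)] * L^{(\mathrm{fore})}_{\mathrm{FOV}}(|\sigma|) \tag{6.25b}$$

and

$$L^{(\mathrm{back})}_{\mathrm{mnf}}(\sigma) = [2D\,\mathrm{sinc}(2\pi\sigma D)] * L^{(\mathrm{back})}_{\mathrm{FOV}}(|\sigma|). \tag{6.25c}$$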

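Since $\mathrm{sinc}(2\pi\sigma D)$, $L^{(\mathrm{fore})}_{\mathrm{FOV}}(|\sigma|)$, and $L^{(\mathrm{back})}_{\mathrm{FOV}}(|\sigma|)$ are all even functions of $\sigma$, it follows that the
convolutions of $L^{(\mathrm{fore})}_{\mathrm{FOV}}$ and $L^{(\mathrm{back})}_{\mathrm{FOV}}$ with the sinc function are also even [see Eq. (2.38f) in Chapter
2],

$$L^{(\mathrm{fore})}_{\mathrm{mnf}}(-\sigma) = L^{(\mathrm{fore})}_{\mathrm{mnf}}(\sigma), \tag{6.25d}$$

and

$$L^{(\mathrm{back})}_{\mathrm{mnf}}(-\sigma) = L^{(\mathrm{back})}_{\mathrm{mnf}}(\sigma). \tag{6.25e}$$

The absolute-value signs around the arguments of $L^{(\mathrm{fore})}_{\mathrm{mnf}}$ and $L^{(\mathrm{back})}_{\mathrm{mnf}}$ in Eq. (6.25a) are not needed
because the functions are already even, but they are put there anyway to keep our notation
parallel with that of the previous L-type radiance symbols:

$$L^{(\mathrm{fore})}_{\mathrm{mnf}}(|\sigma|) = L^{(\mathrm{fore})}_{\mathrm{mnf}}(\sigma), \tag{6.25f}$$

and

$$L^{(\mathrm{back})}_{\mathrm{mnf}}(|\sigma|) = L^{(\mathrm{back})}_{\mathrm{mnf}}(\sigma). \tag{6.25g}$$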

Functions $L^{(\mathrm{fore})}_{\mathrm{mnf}}$ and $L^{(\mathrm{back})}_{\mathrm{mnf}}$ are the background spectral radiances distorted by the effects of the
interferometer's finite field of view and finite length of interferogram signal. They are given the
subscript mnf to show their similarity to $L_{\mathrm{mnf}}$, the input spectral radiance distorted by the
interferometer's finite field of view and finite interferogram length.
Unfortunately, the third Fourier transform on the right-hand side of Eq. (6.22d) is not as easy
to evaluate as the first two. We start the analysis by multiplying both sides of Eq. (6.21d) by
$[\Pi(\chi, D)/u]$ to get

$$u^{-1}\,\Pi(\chi, D)\left[n^{(\mathrm{det})}(\chi) * h\!\left(\frac{\chi}{u}\right)\right] \cong u^{-1}\,\Pi(\chi, D)\int_{\chi - u\mathcal{T}}^{\chi + u\mathcal{T}} n^{(\mathrm{det})}(\chi')\,h\!\left(\frac{\chi - \chi'}{u}\right)d\chi' \tag{6.26a}$$

where, according to Eq. (6.21a), $h(t) \cong 0$ for $|t| > \mathcal{T}$. The $\Pi(\chi, D)$ function specified in Eq.
(6.22b) automatically makes both sides of (6.26a) equal to zero when $|\chi| > D$, so we only need a
good approximation for the integral on the right-hand side when $|\chi| \le D$. In particular, we note
that in (6.26a) the integral goes between $\chi' = \chi - u\mathcal{T}$ and $\chi' = \chi + u\mathcal{T}$, which means that

$$\chi - u\mathcal{T} \le \chi' \le \chi + u\mathcal{T}.$$

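Since $|\chi| \le D$ we also know, putting less strict bounds on $\chi'$, that

$$-(D + u\mathcal{T}) \le \chi' \le D + u\mathcal{T}.$$

Hence $n^{(\mathrm{det})}(\chi')$ can be multiplied by $\Pi(\chi', D + u\mathcal{T})$ without changing the value of the integral
for any values of $\chi$ that matter. Consequently Eq. (6.26a) can be written as

$$u^{-1}\,\Pi(\chi, D)\left[n^{(\mathrm{det})}(\chi) * h\!\left(\frac{\chi}{u}\right)\right] \cong u^{-1}\,\Pi(\chi, D)\int_{\chi - u\mathcal{T}}^{\chi + u\mathcal{T}} \Pi(\chi', \mathcal{D})\,n^{(\mathrm{det})}(\chi')\,h\!\left(\frac{\chi - \chi'}{u}\right)d\chi', \tag{6.26b}$$

where

$$\mathcal{D} = D + u\mathcal{T}. \tag{6.26c}$$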

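The integral's limits between $\chi - u\mathcal{T}$ and $\chi + u\mathcal{T}$ in (6.26b) came from the observation in (6.21a)
that function h is very small outside these limits, making the product

$$\Pi(\chi', \mathcal{D})\,n^{(\mathrm{det})}(\chi')\,h\!\left(\frac{\chi - \chi'}{u}\right)$$

negligible when $\chi'$ is less than $\chi - u\mathcal{T}$ or exceeds $\chi + u\mathcal{T}$. Therefore, we expect the integral to
have the same value when its limits are extended to $-\infty$ and $+\infty$, giving

$$u^{-1}\,\Pi(\chi, D)\left[n^{(\mathrm{det})}(\chi) * h\!\left(\frac{\chi}{u}\right)\right] \cong u^{-1}\,\Pi(\chi, D)\int_{-\infty}^{\infty} \Pi(\chi', \mathcal{D})\,n^{(\mathrm{det})}(\chi')\,h\!\left(\frac{\chi - \chi'}{u}\right)d\chi'.$$

Equation (2.38a) of Chapter 2 shows this integral to be the convolution of $\Pi\,n^{(\mathrm{det})}$ and h,

$$u^{-1}\,\Pi(\chi, D)\left[n^{(\mathrm{det})}(\chi) * h\!\left(\frac{\chi}{u}\right)\right] \cong u^{-1}\,\Pi(\chi, D)\left\{\left[\Pi(\chi, \mathcal{D})\,n^{(\mathrm{det})}(\chi)\right] * h\!\left(\frac{\chi}{u}\right)\right\}. \tag{6.26d}$$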

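Taking the Fourier transform of both sides, and then applying the Fourier convolution theorem,
gives [see Eqs. (2.39a) and (2.39j) in Chapter 2]

$$\begin{aligned}
\mathcal{F}^{(-i\sigma\chi)}&\!\left(u^{-1}\,\Pi(\chi, D)\left[n^{(\mathrm{det})}(\chi) * h\!\left(\frac{\chi}{u}\right)\right]\right) \\
&\cong \mathcal{F}^{(-i\sigma\chi)}\!\left(u^{-1}\,\Pi(\chi, D)\right) * \left[\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi, \mathcal{D})\,n^{(\mathrm{det})}(\chi)\right)\cdot\mathcal{F}^{(-i\sigma\chi)}\!\left(h(\chi/u)\right)\right].
\end{aligned} \tag{6.27a}$$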

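To evaluate the Fourier transform of h, we replace the dummy variable of integration $\chi$ by
$t = \chi/u$ to get

$$\mathcal{F}^{(-i\sigma\chi)}\!\left(h(\chi/u)\right) = \int_{-\infty}^{\infty} h(\chi/u)\,e^{-2\pi i\sigma\chi}\,d\chi = u\int_{-\infty}^{\infty} h(t)\,e^{-2\pi i\sigma ut}\,dt.$$

Equation (5A.3d) in Appendix 5A of Chapter 5 shows that this can be written as

$$\mathcal{F}^{(-i\sigma\chi)}\!\left(h(\chi/u)\right) = u\,\mathrm{H}(u\sigma) \tag{6.27b}$$

or

$$\int_{-\infty}^{\infty} h(\chi/u)\,e^{-2\pi i\sigma\chi}\,d\chi = u\,\mathrm{H}(u\sigma), \tag{6.27c}$$

where H, the Fourier transform of h, is the transfer function of the detector circuit in Fig. 6.2.
Substituting this into Eq. (6.27a) gives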

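
$$\begin{aligned}
\mathcal{F}^{(-i\sigma\chi)}&\!\left(u^{-1}\,\Pi(\chi, D)\left[n^{(\mathrm{det})}(\chi) * h\!\left(\frac{\chi}{u}\right)\right]\right) \\
&\cong \mathcal{F}^{(-i\sigma\chi)}\!\left(u^{-1}\,\Pi(\chi, D)\right) * \left[u\,\mathrm{H}(u\sigma)\,\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi, \mathcal{D})\,n^{(\mathrm{det})}(\chi)\right)\right].
\end{aligned} \tag{6.27d}$$

Equation (6.24b) [see also Eq. (5.65a) of Chapter 5] shows that

$$\mathcal{F}^{(-i\sigma\chi)}\!\left(u^{-1}\,\Pi(\chi, D)\right) = u^{-1}\int_{-\infty}^{\infty} \Pi(\chi, D)\,e^{-2\pi i\sigma\chi}\,d\chi = 2u^{-1}D\,\mathrm{sinc}(2\pi\sigma D).$$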

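Hence Eq. (6.27d) can be written as, using the linearity of the convolution to cancel out $u^{-1}$ and
$u$,

$$\mathcal{F}^{(-i\sigma\chi)}\!\left(u^{-1}\,\Pi(\chi, D)\left[n^{(\mathrm{det})}(\chi) * h\!\left(\frac{\chi}{u}\right)\right]\right) \cong [2D\,\mathrm{sinc}(2\pi\sigma D)] * \left[\mathrm{H}(u\sigma)\,\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi, \mathcal{D})\,n^{(\mathrm{det})}(\chi)\right)\right]. \tag{6.27e}$$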

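According to the discussion following Eq. (5.82c) of Chapter 5, the transfer function $\mathrm{H}(u\sigma)$
varies slowly compared to the spectral radiance $L(\sigma)$ that the interferometer is measuring, and
Sec. 5.15 of Chapter 5 explains why $[2D\,\mathrm{sinc}(2\pi\sigma D)]$ should be a narrow function compared to
$L(\sigma)$. Consequently, there is every reason to expect $\mathrm{H}(u\sigma)$ to vary slowly with respect to
$[2D\,\mathrm{sinc}(2\pi\sigma D)]$. Therefore, according to Eq. (5C.1) in Appendix 5C of Chapter 5, Eq. (6.27e)
can be approximated as

$$\begin{aligned}
\mathcal{F}^{(-i\sigma\chi)}\!\left(u^{-1}\,\Pi(\chi, D)\left[n^{(\mathrm{det})}(\chi) * h\!\left(\frac{\chi}{u}\right)\right]\right)
&\cong \mathrm{H}(u\sigma)\left\{[2D\,\mathrm{sinc}(2\pi\sigma D)] * \mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi, \mathcal{D})\,n^{(\mathrm{det})}(\chi)\right)\right\} \\
&= \mathrm{H}(u\sigma)\left\{\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi, D)\right) * \mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi, \mathcal{D})\,n^{(\mathrm{det})}(\chi)\right)\right\},
\end{aligned}$$

where in the last step Eq. (6.24b) is again used, this time to replace

$$2D\,\mathrm{sinc}(2\pi\sigma D)$$

by the Fourier transform of $\Pi(\chi, D)$.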
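According to the Fourier convolution theorem [see Eq. (2.39j) of Chapter 2], this can be
written as

$$\mathcal{F}^{(-i\sigma\chi)}\!\left(u^{-1}\,\Pi(\chi, D)\left[n^{(\mathrm{det})}(\chi) * h\!\left(\frac{\chi}{u}\right)\right]\right) \cong \mathrm{H}(u\sigma)\,\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi, D)\,\Pi(\chi, \mathcal{D})\,n^{(\mathrm{det})}(\chi)\right).$$

Glancing back at Eq. (6.22b) above, we note that

$$\Pi(\chi, D)\,\Pi(\chi, \mathcal{D}) = \Pi(\chi, D) \tag{6.28a}$$

because, according to (6.26c), $\mathcal{D} \ge D$. Hence the latest approximation becomes

$$\mathcal{F}^{(-i\sigma\chi)}\!\left(u^{-1}\,\Pi(\chi, D)\left[n^{(\mathrm{det})}(\chi) * h\!\left(\frac{\chi}{u}\right)\right]\right) \cong \mathrm{H}(u\sigma)\,\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi, D)\,n^{(\mathrm{det})}(\chi)\right). \tag{6.28b}$$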

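We define the D-limited Fourier transform to be

$$\mathbf{n}_D^{(\mathrm{det})}(\sigma) = \int_{-\infty}^{\infty} \Pi(\chi, D)\,n^{(\mathrm{det})}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi, \tag{6.29a}$$

$$\mathbf{n}_D^{(\mathrm{det})}(\sigma) = \mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi, D)\,n^{(\mathrm{det})}(\chi)\right) \tag{6.29b}$$

or

$$\mathbf{n}_D^{(\mathrm{det})}(\sigma) = \int_{-D}^{D} n^{(\mathrm{det})}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi. \tag{6.29c}$$

Equation (6.28b) now becomes, using the linearity of the Fourier transform to take the factor of
$u^{-1}$ outside the $\mathcal{F}$ operator,

$$\frac{1}{u}\,\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi, D)\left[n^{(\mathrm{det})}(\chi) * h\!\left(\frac{\chi}{u}\right)\right]\right) \cong \mathrm{H}(u\sigma)\,\mathbf{n}_D^{(\mathrm{det})}(\sigma). \tag{6.29d}$$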

It makes sense to work with $\mathbf{n}_D^{(\mathrm{det})}(\sigma)$, the Fourier transform of the product $\Pi\,n^{(\mathrm{det})}$, instead of
working directly with the simple Fourier transform of $n^{(\mathrm{det})}$. To see why this is so, we write down
the simple Fourier transform of $n^{(\mathrm{det})}$,

$$\int_{-\infty}^{\infty} n^{(\mathrm{det})}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi,$$

and note that there is no reason to think that $n^{(\mathrm{det})}$ always satisfies requirement (V) in Sec. 2.4 of
Chapter 2 for the existence of Fourier transforms.[100] Function $\mathbf{n}_D^{(\mathrm{det})}(\sigma)$, on the other hand,
because it is the Fourier transform of $n^{(\mathrm{det})}(\chi)$ after it is multiplied by $\Pi(\chi, D)$, is a well-defined,
random function of $\sigma$ because the $n^{(\mathrm{det})}\,\Pi$ product must be zero for $|\chi| > D$.
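Now that formulas (6.23b), (6.25a), and (6.29d) are known for all three Fourier transforms on
the right-hand side of Eq. (6.22d), they can be substituted into (6.22d) to get

$$\begin{aligned}
\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi, D)\,z_{CN}^{(\mathrm{tot})}(\chi)\right) \cong{}& \mathrm{R}_{\mathrm{ma}}\,\frac{WA\,\Delta\Omega}{4}\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma)\,\eta(\sigma)\,R(|\sigma|)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|)\,L_{\mathrm{mnf}}(|\sigma|) \\
&+ \mathrm{R}_{\mathrm{ma}}\,\frac{WA\,\Delta\Omega}{4}\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma)\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\left[L^{(\mathrm{fore})}_{\mathrm{mnf}}(|\sigma|) - L^{(\mathrm{back})}_{\mathrm{mnf}}(|\sigma|)\right] \\
&+ \mathrm{H}(u\sigma)\,\mathbf{n}_D^{(\mathrm{det})}(\sigma)
\end{aligned}$$

or, combining terms,

$$\begin{aligned}
\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi, D)\,z_{CN}^{(\mathrm{tot})}(\chi)\right) \cong{}& \mathrm{R}_{\mathrm{ma}}\,\frac{WA\,\Delta\Omega}{4}\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma)\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|) \\
&\times\left[\tau_f(|\sigma|)\,L_{\mathrm{mnf}}(|\sigma|) + L^{(\mathrm{fore})}_{\mathrm{mnf}}(|\sigma|) - L^{(\mathrm{back})}_{\mathrm{mnf}}(|\sigma|)\right] \\
&+ \mathrm{H}(u\sigma)\,\mathbf{n}_D^{(\mathrm{det})}(\sigma).
\end{aligned} \tag{6.30a}$$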

The quantity $\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi, D)\,z_{CN}^{(\mathrm{tot})}(\chi)\right)$ on the left-hand side of (6.30a) represents the
noise-contaminated measurement of the total signal spectrum at point C in Fig. 6.2. It can be thought of
as the uncalibrated, noise-contaminated output spectrum of the interferometer.

[100] Remember that the extended sine and cosine transforms to which requirement (V) applies will be used to define
the standard Fourier transform in Eq. (2.28a) of Chapter 2, so requirement (V) also applies to the standard Fourier
transform.

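In principle, there is no problem removing all the noise from (6.30a). Glancing back at the
formula for $\mathbf{n}_D^{(\mathrm{det})}$ in Eq. (6.29c), we apply the expectation operator $\mathrm{E}$ to both sides to get

$$\mathrm{E}\!\left(\mathbf{n}_D^{(\mathrm{det})}(\sigma)\right) = \mathrm{E}\!\left(\int_{-D}^{D} n^{(\mathrm{det})}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi\right) = \int_{-D}^{D} \mathrm{E}\!\left(n^{(\mathrm{det})}(\chi)\right)e^{-2\pi i\sigma\chi}\,d\chi = 0, \tag{6.30b}$$

where Eq. (3.17c) of Chapter 3 and Eq. (6.17b) are used to show that $\mathrm{E}\!\left(\mathbf{n}_D^{(\mathrm{det})}(\sigma)\right)$, the average or
expected value of $\mathbf{n}_D^{(\mathrm{det})}(\sigma)$, is zero. Applying $\mathrm{E}$ to both sides of (6.30a) now gives, using Eqs.
(3.16a) and (3.9f) from Chapter 3,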

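$$\begin{aligned}
\mathrm{E}\!\left(\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi, D)\,z_{CN}^{(\mathrm{tot})}(\chi)\right)\right)
\cong{}& \mathrm{R}_{\mathrm{ma}}\,\frac{WA\,\Delta\Omega}{4}\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma)\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|) \\
&\times\left[\tau_f(|\sigma|)\,L_{\mathrm{mnf}}(|\sigma|) + L^{(\mathrm{fore})}_{\mathrm{mnf}}(|\sigma|) - L^{(\mathrm{back})}_{\mathrm{mnf}}(|\sigma|)\right] \\
={}& L_{\mathrm{mnf}}(|\sigma|)\,\mathrm{R}_{\mathrm{ma}}\,\frac{WA\,\Delta\Omega}{4}\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma)\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|) \\
&+ \left[L^{(\mathrm{fore})}_{\mathrm{mnf}}(|\sigma|) - L^{(\mathrm{back})}_{\mathrm{mnf}}(|\sigma|)\right]\mathrm{R}_{\mathrm{ma}}\,\frac{WA\,\Delta\Omega}{4}\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma)\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|),
\end{aligned} \tag{6.30c}$$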

which shows that the noise term $[\mathrm{H}(u\sigma)\,\mathbf{n}_D^{(\mathrm{det})}(\sigma)]$ disappears when many noise-contaminated
measurements of

$$\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi, D)\,z_{CN}^{(\mathrm{tot})}(\chi)\right)$$

are averaged together. Hence the right-hand side of (6.30c) can be thought of as the uncalibrated,
noise-free output spectrum of the interferometer. According to Eq. (5.110) in Chapter 5, the $L_{\mathrm{mnf}}$
radiance spectrum on the right-hand side of (6.30c) is the same as spectrum $L_{\mathrm{eff}}$ in Eq. (5.95c) of
Chapter 5. Consequently the entire right-hand side of (6.30c) has the same form as $Z_{\mathrm{eff,tot}}$ in
(5.95c), since it looks like

$$L_{\mathrm{eff}}(\sigma)\cdot\{\text{Complex Function of } \sigma\} + \{\text{Background Complex Function of } \sigma\}. \tag{6.30d}$$
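The claim that the zero-mean noise term averages away, as in Eqs. (6.30b) and (6.30c), can be illustrated with a small simulation. This is a sketch under stated assumptions, not the author's procedure: the noise-free interferogram is a made-up Gaussian burst, the noise is white and Gaussian, and a discrete FFT stands in for the D-limited forward transform. The names `spectrum` and `mean_spectrum` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
D, n_pts = 1.0, 512
chi = np.linspace(-D, D, n_pts, endpoint=False)   # double-sided OPD grid, chi = 0 at index n_pts // 2
dchi = chi[1] - chi[0]
signal = np.exp(-(chi / 0.1) ** 2)                # hypothetical noise-free interferogram

def spectrum(z):
    """Discrete stand-in for the D-limited forward Fourier transform."""
    return np.fft.fft(np.fft.ifftshift(z)) * dchi

clean = spectrum(signal)

def mean_spectrum(n_scans):
    """Average the spectra of n_scans independently noise-contaminated interferograms."""
    noisy = signal + 0.5 * rng.standard_normal((n_scans, n_pts))
    return np.mean([spectrum(z) for z in noisy], axis=0)

err_1 = np.max(np.abs(mean_spectrum(1) - clean))      # single noisy measurement
err_400 = np.max(np.abs(mean_spectrum(400) - clean))  # average of 400 measurements
print(err_1, err_400)
```

With independent noise in each scan, the residual in the averaged spectrum shrinks roughly as one over the square root of the number of scans, so the 400-scan average is much closer to the noise-free spectrum than any single scan.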

This is no surprise, because $Z_{\mathrm{eff,tot}}$ in Sec. 5.19 of Chapter 5 is defined to be the uncalibrated
output spectrum of a Michelson interferometer, which is exactly what the total noise-free signal
spectrum at point C of Fig. 6.2 ought to be. Therefore it now makes sense to write Eqs. (6.30a)
and (6.30c) as

$$\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi, D)\,z_{CN}^{(\mathrm{tot})}(\chi)\right) \cong Z_{\mathrm{eff,tot}}(\sigma) + \mathrm{H}(u\sigma)\,\mathbf{n}_D^{(\mathrm{det})}(\sigma) \tag{6.31a}$$

and

$$\mathrm{E}\!\left(\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi, D)\,z_{CN}^{(\mathrm{tot})}(\chi)\right)\right) \cong Z_{\mathrm{eff,tot}}(\sigma), \tag{6.31b}$$

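where, reversing the factoring in (6.30c), we have

$$\begin{aligned}
Z_{\mathrm{eff,tot}}(\sigma) ={}& L_{\mathrm{mnf}}(|\sigma|)\,\mathrm{R}_{\mathrm{ma}}\,\frac{WA\,\Delta\Omega}{4}\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma)\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|) \\
&+ \left[L^{(\mathrm{fore})}_{\mathrm{mnf}}(|\sigma|) - L^{(\mathrm{back})}_{\mathrm{mnf}}(|\sigma|)\right]\mathrm{R}_{\mathrm{ma}}\,\frac{WA\,\Delta\Omega}{4}\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma)\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|) \\
={}& \mathrm{R}_{\mathrm{ma}}\,\frac{WA\,\Delta\Omega}{4}\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma)\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\left[\tau_f(|\sigma|)\,L_{\mathrm{mnf}}(|\sigma|) + L^{(\mathrm{fore})}_{\mathrm{mnf}}(|\sigma|) - L^{(\mathrm{back})}_{\mathrm{mnf}}(|\sigma|)\right].
\end{aligned} \tag{6.31c}$$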

6.11 Measuring the Noise-Contaminated Spectrum
We have discussed two basic strategies for eliminating the interferometer's unwanted background
radiance: measuring and subtracting the background radiance's interferogram from the total
interferogram signal [see Eqs. (6.15a)–(6.15c)], or measuring and removing the background
radiance's signal spectrum from the total signal spectrum (see Sec. 5.19 of Chapter 5). Here in
Chapter 6 we have concentrated up to now on the first strategy, assuming that during calibration
$z_C^{(\mathrm{cold})}(\chi)$ is measured and subtracted from $z_{CN}^{(\mathrm{tot})}(\chi)$ to get the $z_{CN}(\chi)$ signal specified in Eq.
(6.15c). Although this is in principle an acceptable way to calibrate interferometers, in practice
the spectral calibration strategy described in Sec. 5.19 of Chapter 5 is more popular. Equations
(6.31a)–(6.31c) let us investigate this more popular spectral calibration strategy, because these
equations describe the total signal and noise in an uncalibrated spectral measurement. One
obvious way to investigate the spectral calibration strategy is to apply the spectral calibration
algorithm directly to Eqs. (6.31a)–(6.31c). The spectral calibration algorithm in Sec. 5.19 of
Chapter 5 [specified by Eq. (5.95a)] can be used whenever the interferometer's uncalibrated
output spectrum has the form

$$L_{\mathrm{eff}}(\sigma)\cdot\{\text{Complex Function of } \sigma\} + \{\text{Background Complex Function of } \sigma\};$$

and, according to the discussion following Eq. (6.30c), that is exactly the form taken by the
noise-free uncalibrated spectrum in the previous section. To use the calibration algorithm in
(5.95a), we not only need the uncalibrated spectral output $Z_{\mathrm{eff,tot}}(\sigma)$ associated with the spectral
radiance L but also must have the uncalibrated output signals associated with two known,
calibrating spectral radiances. Following the notation of Sec. 5.19 of Chapter 5, we call the two
calibrating radiances $L^{(1)}$ and $L^{(2)}$ and the two output signals associated with them
$Z^{(1)}_{\mathrm{eff,tot}}(\sigma)$ and $Z^{(2)}_{\mathrm{eff,tot}}(\sigma)$ respectively. Equation (6.31b) reminds us that to extract the noise-free signals
$Z^{(1)}_{\mathrm{eff,tot}}$ and $Z^{(2)}_{\mathrm{eff,tot}}$ in the presence of noise, we need only point the interferometer at radiances $L^{(1)}$ and
$L^{(2)}$ and average together a large number of uncalibrated output spectra to get each spectrum's
noise-free expectation value. Examining Eqs. (6.31b) and (6.31c) closely, we realize that
$Z^{(1,2)}_{\mathrm{eff,tot}}(\sigma)$ cannot depend directly on $L^{(1)}$ and $L^{(2)}$ but instead must depend directly on
$L^{(1)}_{\mathrm{mnf}}$ and $L^{(2)}_{\mathrm{mnf}}$, where again the mnf subscripts indicate that the $L^{(1)}$ and $L^{(2)}$ radiances entering the front
end of the interferometer are blurred and distorted by the interferometer's finite field of view and
finite interferogram length. Fortunately, because $L^{(1)}$ and $L^{(2)}$ are under our control, we can
choose them to be slowly varying functions of wavenumber. This means, according to Eq. (6A.6)
in Appendix 6A, that

$$L^{(1)}_{\mathrm{mnf}}(|\sigma|) \cong L^{(1)}(|\sigma|) \tag{6.32a}$$

and

$$L^{(2)}_{\mathrm{mnf}}(|\sigma|) \cong L^{(2)}(|\sigma|) \tag{6.32b}$$

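should be acceptable approximations for $L^{(1,2)}_{\mathrm{mnf}}$. We can construct formulas for $Z^{(1)}_{\mathrm{eff,tot}}$ and $Z^{(2)}_{\mathrm{eff,tot}}$,
using Eqs. (6.31c), (6.32a), and (6.32b) to write

$$Z^{(1)}_{\mathrm{eff,tot}}(\sigma) \cong \mathrm{R}_{\mathrm{ma}}\,\frac{WA\,\Delta\Omega}{4}\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma)\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\left[\tau_f(|\sigma|)\,L^{(1)}(|\sigma|) + L^{(\mathrm{fore})}_{\mathrm{mnf}}(|\sigma|) - L^{(\mathrm{back})}_{\mathrm{mnf}}(|\sigma|)\right] \tag{6.33a}$$

and

$$Z^{(2)}_{\mathrm{eff,tot}}(\sigma) \cong \mathrm{R}_{\mathrm{ma}}\,\frac{WA\,\Delta\Omega}{4}\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma)\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\left[\tau_f(|\sigma|)\,L^{(2)}(|\sigma|) + L^{(\mathrm{fore})}_{\mathrm{mnf}}(|\sigma|) - L^{(\mathrm{back})}_{\mathrm{mnf}}(|\sigma|)\right]. \tag{6.33b}$$

This, together with the uncalibrated output spectrum $Z_{\mathrm{eff,tot}}(\sigma)$ produced by the unknown radiance
L that the interferometer is being used to measure, is all we need to apply the spectral calibration
algorithm.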
Although we could in principle collect a large number of measurements of $Z_{\mathrm{eff,tot}}(\sigma)$, averaging
them together to remove the noise just like we did when calculating $Z^{(1,2)}_{\mathrm{eff,tot}}(\sigma)$, in practice more
effort is put into removing noise from the calibration data than is put into removing noise from
everyday measurements. [This same point is made at the end of Sec. 6.5 when discussing noise in
the measurements of signals $z_A(\chi)$, $z_B(\chi)$, and $z_C(\chi)$.] Consequently, even though noise-free values
of $Z^{(1,2)}_{\mathrm{eff,tot}}$ are available for use in the calibration algorithm in Eq. (5.95a) of Chapter 5, we should
replace the noise-free $Z^{(\mathrm{meas})}_{\mathrm{eff,tot}}$ in (5.95a) by the noise-contaminated, uncalibrated output specified
in Eq. (6.31a):

$$\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi, D)\,z_{CN}^{(\mathrm{tot})}(\chi)\right) \cong Z_{\mathrm{eff,tot}}(\sigma) + \mathrm{H}(u\sigma)\,\mathbf{n}_D^{(\mathrm{det})}(\sigma).$$

We call this noise-contaminated, uncalibrated spectral signal $\tilde{Z}^{(\mathrm{meas})}_{\mathrm{eff,totN}}$ and specify it using the
formula

$$\tilde{Z}^{(\mathrm{meas})}_{\mathrm{eff,totN}}(\sigma) = Z_{\mathrm{eff,tot}}(\sigma) + \mathrm{H}(u\sigma)\,\mathbf{n}_D^{(\mathrm{det})}(\sigma). \tag{6.34a}$$

Changing tot to totN in the subscript of $\tilde{Z}^{(\mathrm{meas})}_{\mathrm{eff,totN}}$ shows, of course, that it is $Z^{(\mathrm{meas})}_{\mathrm{eff,tot}}$ contaminated
by noise; and the tilde shows that the spectral signal now has a random component. Function
$\tilde{Z}^{(\mathrm{meas})}_{\mathrm{eff,totN}}$ is just, of course, a different name for

$$\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi, D)\,z_{CN}^{(\mathrm{tot})}(\chi)\right).$$

The discussion at the beginning of Sec. 6.1 points out that $L_{\mathrm{mnf}}$ is the best measurement of the
unknown spectral radiance L that can be extracted from the interferometer, and substituting Eq.
(6.31c) into (6.34a) gives $\tilde{Z}^{(\mathrm{meas})}_{\mathrm{eff,totN}}$ in terms of $L_{\mathrm{mnf}}$,

$$\begin{aligned}
\tilde{Z}^{(\mathrm{meas})}_{\mathrm{eff,totN}}(\sigma) ={}& \mathrm{R}_{\mathrm{ma}}\,\frac{WA\,\Delta\Omega}{4}\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma)\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|) \\
&\times\left[\tau_f(|\sigma|)\,L_{\mathrm{mnf}}(|\sigma|) + L^{(\mathrm{fore})}_{\mathrm{mnf}}(|\sigma|) - L^{(\mathrm{back})}_{\mathrm{mnf}}(|\sigma|)\right] \\
&+ \mathrm{H}(u\sigma)\,\mathbf{n}_D^{(\mathrm{det})}(\sigma).
\end{aligned} \tag{6.34b}$$

Now we apply the calibration algorithm in Sec. 5.19 of Chapter 5. Equations (6.34b) and (6.33a)
give

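$$\begin{aligned}
\tilde{Z}^{(\mathrm{meas})}_{\mathrm{eff,totN}}(\sigma) - Z^{(1)}_{\mathrm{eff,tot}}(\sigma) ={}& \mathrm{R}_{\mathrm{ma}}\,\frac{WA\,\Delta\Omega}{4}\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma)\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)\left[L_{\mathrm{mnf}}(|\sigma|) - L^{(1)}(|\sigma|)\right] \\
&+ \mathrm{H}(u\sigma)\,\mathbf{n}_D^{(\mathrm{det})}(\sigma).
\end{aligned} \tag{6.35a}$$

Because we have decided to include noise in our measurement of the uncalibrated output
spectrum, this corresponds to the difference

$$Z^{(\mathrm{meas})}_{\mathrm{eff,tot}}(\sigma) - Z^{(1)}_{\mathrm{eff,tot}}(\sigma)$$

in Eq. (5.95a) of Chapter 5. Equations (6.33a) and (6.33b) give

$$Z^{(2)}_{\mathrm{eff,tot}}(\sigma) - Z^{(1)}_{\mathrm{eff,tot}}(\sigma) = \mathrm{R}_{\mathrm{ma}}\,\frac{WA\,\Delta\Omega}{4}\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma)\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)\left[L^{(2)}(|\sigma|) - L^{(1)}(|\sigma|)\right]. \tag{6.35b}$$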

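The ratio of these two differences is

$$\frac{\tilde{Z}^{(\mathrm{meas})}_{\mathrm{eff,totN}}(\sigma) - Z^{(1)}_{\mathrm{eff,tot}}(\sigma)}{Z^{(2)}_{\mathrm{eff,tot}}(\sigma) - Z^{(1)}_{\mathrm{eff,tot}}(\sigma)} = \frac{L_{\mathrm{mnf}}(|\sigma|) - L^{(1)}(|\sigma|)}{L^{(2)}(|\sigma|) - L^{(1)}(|\sigma|)} + \frac{4\,\mathbf{n}_D^{(\mathrm{det})}(\sigma)}{\mathrm{R}_{\mathrm{ma}}\,(WA\,\Delta\Omega)\,\mathrm{M}(\sigma)\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)\left[L^{(2)}(|\sigma|) - L^{(1)}(|\sigma|)\right]}. \tag{6.35c}$$

The left-hand side of this formula is, of course, the noise-contaminated version of the ratio

$$\left[Z^{(\mathrm{meas})}_{\mathrm{eff,tot}}(\sigma) - Z^{(1)}_{\mathrm{eff,tot}}(\sigma)\right] \Big/ \left[Z^{(2)}_{\mathrm{eff,tot}}(\sigma) - Z^{(1)}_{\mathrm{eff,tot}}(\sigma)\right]$$

in Eq. (5.95a). We now complete the calibration algorithm by substituting (6.35c) into (5.95a) to
get

$$\frac{\tilde{Z}^{(\mathrm{meas})}_{\mathrm{eff,totN}}(\sigma) - Z^{(1)}_{\mathrm{eff,tot}}(\sigma)}{Z^{(2)}_{\mathrm{eff,tot}}(\sigma) - Z^{(1)}_{\mathrm{eff,tot}}(\sigma)}\left[L^{(2)}(|\sigma|) - L^{(1)}(|\sigma|)\right] + L^{(1)}(|\sigma|) = L_{\mathrm{mnf}}(|\sigma|) + \frac{4\,\mathbf{n}_D^{(\mathrm{det})}(\sigma)}{\mathrm{R}_{\mathrm{ma}}\,(WA\,\Delta\Omega)\,\mathrm{M}(\sigma)\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}. \tag{6.35d}$$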
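The structure of Eqs. (6.35a)–(6.35d) can be seen in a toy two-point calibration. The sketch below is an assumption-laden illustration: the complex `gain` and `offset` arrays are invented stand-ins for the instrument response and the background term, the radiances are arbitrary, and the calibration formula used is the one implied by the left-hand side of (6.35d). It shows that the calibrated result equals the true radiance plus the noise divided by the complex gain, so the residual is in general complex.

```python
import numpy as np

rng = np.random.default_rng(2)
sigma = np.linspace(500.0, 2000.0, 64)            # wavenumbers, hypothetical units
# Hypothetical complex instrument response and complex background term.
gain = (1.0 + 0.2 * np.sin(sigma / 300.0)) * np.exp(1j * sigma / 900.0)
offset = 0.3 * np.exp(-1j * sigma / 700.0)

def uncalibrated(radiance, noise=0.0):
    """Uncalibrated spectrum of the form gain * L + offset (+ noise)."""
    return gain * radiance + offset + noise

L1 = np.full_like(sigma, 1.0)                     # known calibration radiances
L2 = np.full_like(sigma, 3.0)
L_true = 2.0 + 0.5 * np.cos(sigma / 150.0)        # "unknown" scene radiance

Z1, Z2 = uncalibrated(L1), uncalibrated(L2)       # treated as noise-free averages
noise = 0.01 * (rng.standard_normal(64) + 1j * rng.standard_normal(64))
Z_meas = uncalibrated(L_true, noise)              # single noisy scene measurement

L_cal = (Z_meas - Z1) / (Z2 - Z1) * (L2 - L1) + L1
residual = L_cal - L_true                         # complex: equals noise / gain
print(np.max(np.abs(residual)))
```

Because the offset cancels in both differences and the gain cancels in the ratio, only the scene radiance and the scaled noise survive, which mirrors the right-hand side of (6.35d).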

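At first we might think, examining Eq. (6.35d) and comparing it to (6.1a) above, that the
right-hand side is just a disguised version of

$$L_{\mathrm{mnf}}(\sigma) + \delta L(\sigma),$$

which would mean that for detector noise

$$\delta L(\sigma) \overset{?}{=} \frac{4\,\mathbf{n}_D^{(\mathrm{det})}(\sigma)}{\mathrm{R}_{\mathrm{ma}}\,(WA\,\Delta\Omega)\,\mathrm{M}(\sigma)\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}.$$

A little thought, however, shows that this cannot be correct. The quantities W, A, $\Delta\Omega$, M, $\eta$, R, $\tau_a$,
and $\tau_f$ are all real, as is $\delta L$, but there is no reason for $\mathbf{n}_D^{(\mathrm{det})}$ to be real. From Eq. (6.29a), we
have

$$\mathbf{n}_D^{(\mathrm{det})}(\sigma) = \int_{-\infty}^{\infty} \Pi(\chi, D)\,n^{(\mathrm{det})}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi.$$

This means $\mathbf{n}_D^{(\mathrm{det})}$ is the forward Fourier transform of the real product

$$\Pi(\chi, D)\,n^{(\mathrm{det})}(\chi),$$

which, according to entry 7 of Table 2.1 in Chapter 2, makes $\mathbf{n}_D^{(\mathrm{det})}$ Hermitian:

$$\mathbf{n}_D^{(\mathrm{det})}(-\sigma) = \mathbf{n}_D^{(\mathrm{det})}(\sigma)^{*}. \tag{6.36}$$

Unless $\Pi(\chi, D)\,n^{(\mathrm{det})}(\chi)$ is also an even function (and there is absolutely no reason for this to be
true) we expect $\mathbf{n}_D^{(\mathrm{det})}$ to have both real and imaginary components.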
The observation that $\Pi(\chi, D)\,n^{(\mathrm{det})}(\chi)$ must be even for $\mathbf{n}_D^{(\mathrm{det})}$ to be strictly real brings to mind
the distinction previously made between avoidable and unavoidable detector noise. In Sec. 6.8
above, we note that only the even component

$$n_e^{(\mathrm{det})}(\chi) = \frac{1}{2}\left[n^{(\mathrm{det})}(\chi) + n^{(\mathrm{det})}(-\chi)\right]$$

of the total detector noise in double-sided signals is unavoidable, because in principle the odd
component

$$n_o^{(\mathrm{det})}(\chi) = \frac{1}{2}\left[n^{(\mathrm{det})}(\chi) - n^{(\mathrm{det})}(-\chi)\right]$$

can be removed from the signal at point B by averaging together the signal values at $+\chi$ and $-\chi$.
We also point out in Sec. 6.8 that the avoidable noise is usually not eliminated this way, but
instead passed along the signal chain to be eliminated later. We have now reached the point
where it is easy to eliminate the avoidable noise in double-sided signals.
Suppose, just like in Eq. (6.19a) of Sec. 6.8, we write $n^{(\mathrm{det})}(\chi)$ as the sum of an unavoidable,
even component and an avoidable, odd component,

$$n^{(\mathrm{det})}(\chi) = n_e^{(\mathrm{det})}(\chi) + n_o^{(\mathrm{det})}(\chi).$$

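Since $\mathbf{n}_D^{(\mathrm{det})}$ is the forward Fourier transform of $\Pi(\chi, D)\,n^{(\mathrm{det})}(\chi)$, we have

$$\begin{aligned}
\mathbf{n}_D^{(\mathrm{det})}(\sigma) &= \mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi, D)\,n^{(\mathrm{det})}(\chi)\right) \\
&= \mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi, D)\,n_e^{(\mathrm{det})}(\chi) + \Pi(\chi, D)\,n_o^{(\mathrm{det})}(\chi)\right) \\
&= \mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi, D)\,n_e^{(\mathrm{det})}(\chi)\right) + \mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi, D)\,n_o^{(\mathrm{det})}(\chi)\right),
\end{aligned} \tag{6.37a}$$

where in the last step the linearity of the Fourier transform is used to write the transform of the
sum as the sum of the transforms (see Sec. 2.6 of Chapter 2). To get a spectrum for the
unavoidable detector noise, we now define

$$\mathbf{n}_{De}^{(\mathrm{det})}(\sigma) = \mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi, D)\,n_e^{(\mathrm{det})}(\chi)\right). \tag{6.37b}$$

This makes $\mathbf{n}_{De}^{(\mathrm{det})}$ the forward Fourier transform of a real and even function of $\chi$, which means,
according to entry 1 of Table 2.1 in Chapter 2, that $\mathbf{n}_{De}^{(\mathrm{det})}$ must be a real and even function of $\sigma$,

$$\mathbf{n}_{De}^{(\mathrm{det})}(\sigma) = \mathrm{Re}\!\left(\mathbf{n}_{De}^{(\mathrm{det})}(\sigma)\right) \tag{6.37c}$$

and

$$\mathbf{n}_{De}^{(\mathrm{det})}(-\sigma) = \mathbf{n}_{De}^{(\mathrm{det})}(\sigma). \tag{6.37d}$$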
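To get a spectrum for the avoidable detector noise, we define

$$\mathbf{n}_{Do}^{(\mathrm{det})}(\sigma) = \mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi, D)\,n_o^{(\mathrm{det})}(\chi)\right). \tag{6.37e}$$

This makes $\mathbf{n}_{Do}^{(\mathrm{det})}$ the forward Fourier transform of a real and odd function of $\chi$, which means,
according to entry 4 of Table 2.1 in Chapter 2, that $\mathbf{n}_{Do}^{(\mathrm{det})}$ must be an imaginary and odd function
of $\sigma$,

$$\mathbf{n}_{Do}^{(\mathrm{det})}(\sigma) = i\,\mathrm{Im}\!\left(\mathbf{n}_{Do}^{(\mathrm{det})}(\sigma)\right) \tag{6.37f}$$

and

$$\mathbf{n}_{Do}^{(\mathrm{det})}(-\sigma) = -\mathbf{n}_{Do}^{(\mathrm{det})}(\sigma). \tag{6.37g}$$

Substitution of (6.37b) and (6.37e) into (6.37a) gives

$$\mathbf{n}_D^{(\mathrm{det})}(\sigma) = \mathbf{n}_{De}^{(\mathrm{det})}(\sigma) + \mathbf{n}_{Do}^{(\mathrm{det})}(\sigma), \tag{6.37h}$$

which we can interpret as requiring the total noise spectrum $\mathbf{n}_D^{(\mathrm{det})}$ to be the sum of the
unavoidable noise spectrum $\mathbf{n}_{De}^{(\mathrm{det})}$ and the avoidable noise spectrum $\mathbf{n}_{Do}^{(\mathrm{det})}$. Remembering that Eqs.
(6.37c) and (6.37f) show that $\mathbf{n}_{De}^{(\mathrm{det})}$ is strictly real and $\mathbf{n}_{Do}^{(\mathrm{det})}$ is strictly imaginary, we also note that
$\mathbf{n}_{De}^{(\mathrm{det})}$ must be the real part of the detector noise spectrum and $i^{-1}\,\mathbf{n}_{Do}^{(\mathrm{det})}$ must be the imaginary part
of the detector noise spectrum,

$$\mathbf{n}_{De}^{(\mathrm{det})}(\sigma) = \mathrm{Re}\!\left(\mathbf{n}_D^{(\mathrm{det})}(\sigma)\right) \tag{6.37i}$$

and

$$\mathbf{n}_{Do}^{(\mathrm{det})}(\sigma) = i\,\mathrm{Im}\!\left(\mathbf{n}_D^{(\mathrm{det})}(\sigma)\right). \tag{6.37j}$$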
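The even/odd bookkeeping in Eqs. (6.37a)–(6.37j) carries over directly to the discrete Fourier transform, which makes it easy to check. The sketch below is illustrative only: it uses random real samples on a periodic grid (index 0 playing the role of zero OPD) in place of the truncated detector noise, and the variable names are made up for this example. It splits the samples into even and odd parts and confirms that their transforms are the real part and i times the imaginary part of the full transform.

```python
import numpy as np

rng = np.random.default_rng(3)
n_pts = 256
x = rng.standard_normal(n_pts)        # real samples; index 0 plays the role of chi = 0

x_rev = np.roll(x[::-1], 1)           # x(-chi) on the same periodic grid
x_even = 0.5 * (x + x_rev)            # even component
x_odd = 0.5 * (x - x_rev)             # odd component

X = np.fft.fft(x)                     # transform of the full real signal
X_even = np.fft.fft(x_even)           # transform of the even part: purely real
X_odd = np.fft.fft(x_odd)             # transform of the odd part: purely imaginary

print(np.max(np.abs(X_even - X.real)), np.max(np.abs(X_odd - 1j * X.imag)))
```

Taking the real part of the spectrum therefore discards exactly the contribution of the odd component, which is the operation the text applies to the noise-contaminated measurement.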

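Therefore we can remove all of the avoidable detector noise from the $\mathbf{n}_D^{(\mathrm{det})}$ detector noise
spectrum by taking its real part, as shown in (6.37i); moreover, since the noise-free spectral
measurement of $L_{\mathrm{mnf}}$ must be real, we can remove all of the avoidable detector noise from our
noise-contaminated spectral measurement by taking its real part. The right-hand side of Eq.
(6.35d) gives the formula for the noise-contaminated spectral measurement, and taking its real
part gives

$$\begin{aligned}
\mathrm{Re}&\!\left(L_{\mathrm{mnf}}(|\sigma|) + \frac{4\,\mathbf{n}_D^{(\mathrm{det})}(\sigma)}{\mathrm{R}_{\mathrm{ma}}\,(WA\,\Delta\Omega)\,\mathrm{M}(\sigma)\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}\right) \\
&= L_{\mathrm{mnf}}(|\sigma|) + \frac{4\,\mathrm{Re}\!\left(\mathbf{n}_D^{(\mathrm{det})}(\sigma)\right)}{\mathrm{R}_{\mathrm{ma}}\,(WA\,\Delta\Omega)\,\mathrm{M}(\sigma)\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}.
\end{aligned} \tag{6.38a}$$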

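The imaginary part of the right-hand side is, of course, pure noise:

$$\begin{aligned}
\mathrm{Im}&\!\left(L_{\mathrm{mnf}}(|\sigma|) + \frac{4\,\mathbf{n}_D^{(\mathrm{det})}(\sigma)}{\mathrm{R}_{\mathrm{ma}}\,(WA\,\Delta\Omega)\,\mathrm{M}(\sigma)\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}\right) \\
&= \frac{4\,\mathrm{Im}\!\left(\mathbf{n}_D^{(\mathrm{det})}(\sigma)\right)}{\mathrm{R}_{\mathrm{ma}}\,(WA\,\Delta\Omega)\,\mathrm{M}(\sigma)\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}.
\end{aligned} \tag{6.38b}$$

Comparing (6.38a) to the right-hand side of Eq. (6.1a) above,

$$L_{\mathrm{mnf}}(\sigma) + \delta L(\sigma),$$

now suggests that the appropriate formula for the unavoidable random error in a double-sided
signal contaminated by detector noise must be

$$\delta L(\sigma) = \frac{4\,\mathrm{Re}\!\left(\mathbf{n}_D^{(\mathrm{det})}(\sigma)\right)}{\mathrm{R}_{\mathrm{ma}}\,(WA\,\Delta\Omega)\,\mathrm{M}(\sigma)\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}. \tag{6.38c}$$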

The right-hand side of (6.38c) comes from (6.38a), which was derived while permitting
wavenumber $\sigma$ to be negative as well as positive; the left-hand side however comes from Eq.
(6.1a) where, according to (6.1c),

$$0 < \sigma_{\min} \le \sigma \le \sigma_{\max}.$$

Wavenumbers $\sigma_{\max}$ and $\sigma_{\min}$ are the maximum and minimum wavenumber values over which
radiance spectra are measured, and in a well-built interferometer unwanted spectral energy is
usually prevented from entering the optical signal chain by designing the product

$$R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)$$

to be zero when $|\sigma|$ does not lie between $\sigma_{\min}$ and $\sigma_{\max}$. Because the denominator on the right-hand
side of (6.38c) contains the product

$$R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|),$$

we end up dividing by zero unless we require that

$$0 < \sigma_{\min} \le |\sigma| \le \sigma_{\max}. \tag{6.38d}$$

Therefore the restrictions on the left-hand and right-hand sides of (6.38c) look very similar; the
only real difference is the way $\sigma$ is allowed to be negative on the right-hand side but not on the
left. According to Eq. (5.10f) in Chapter 5 and (4.139g) in Chapter 4, functions M and $\eta$ in the
denominator of (6.38c) are even with respect to $\sigma$, and of course the absolute value signs in R, $\tau_a$,
and $\tau_f$ force them to be even functions of their arguments. Equations (6.37d) and (6.37i) show
that the real part of $\mathbf{n}_D^{(\mathrm{det})}$ is also an even function:

$$\mathrm{Re}\!\left(\mathbf{n}_D^{(\mathrm{det})}(-\sigma)\right) = \mathrm{Re}\!\left(\mathbf{n}_D^{(\mathrm{det})}(\sigma)\right). \tag{6.38e}$$

Consequently, the entire right-hand side of Eq. (6.38c) is an even function of $\sigma$ and there is no
extra information to be lost if we require $\sigma$ to be positive on both sides of (6.38c). To show that
both sides should be evaluated for positive wavenumbers $|\sigma|$, we follow the convention used in
Sec. 6.1 when going from Eq. (6.3f) to (6.3g) and write (6.38c) as

$$\delta L(|\sigma|) = \frac{4\,\mathrm{Re}\!\left(\mathbf{n}_D^{(\mathrm{det})}(|\sigma|)\right)}{\mathrm{R}_{\mathrm{ma}}\,(WA\,\Delta\Omega)\,\mathrm{M}(|\sigma|)\,\eta(|\sigma|)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}. \tag{6.38f}$$

This can also, of course, be interpreted as a decision to make $\delta L$ an even function of
wavenumber, giving it a well-defined meaning for all $\sigma$ such that $0 < \sigma_{\min} \le |\sigma| \le \sigma_{\max}$. No matter
what the interpretation, however, the mathematical meaning is clear. For future use, we note that
(6.38f) can also be written as, substituting from Eq. (6.37i),

$$\delta L(|\sigma|) = \frac{4\,\mathbf{n}_{De}^{(\mathrm{det})}(|\sigma|)}{\mathrm{R}_{\mathrm{ma}}\,(WA\,\Delta\Omega)\,\mathrm{M}(|\sigma|)\,\eta(|\sigma|)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}, \tag{6.38g}$$

where applying the definition of the forward Fourier transform to (6.37b) gives [see Eq. (2.29a)
in Chapter 2]

$$\mathbf{n}_{De}^{(\mathrm{det})}(\sigma) = \int_{-\infty}^{\infty} \Pi(\chi, D)\,n_e^{(\mathrm{det})}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi. \tag{6.38h}$$
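Substitution of (6.37f) into the right-hand side of (6.38b) gives

$$\frac{4\,\mathrm{Im}\!\left(\mathbf{n}_D^{(\mathrm{det})}(\sigma)\right)}{\mathrm{R}_{\mathrm{ma}}\,(WA\,\Delta\Omega)\,\mathrm{M}(\sigma)\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} = \frac{4\,i^{-1}\,\mathbf{n}_{Do}^{(\mathrm{det})}(\sigma)}{\mathrm{R}_{\mathrm{ma}}\,(WA\,\Delta\Omega)\,\mathrm{M}(\sigma)\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}, \tag{6.38i}$$

where, consulting Eq. (6.37e), we note that

$$\mathbf{n}_{Do}^{(\mathrm{det})}(\sigma) = \int_{-\infty}^{\infty} \Pi(\chi, D)\,n_o^{(\mathrm{det})}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi. \tag{6.38j}$$

Equations (6.38h)–(6.38j) can be used for both negative and positive values of $\sigma$.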

6.12 Characterizing the Detector Noise
When the noise coming from a detector is examined, it almost always looks ergodic. According
to the discussion at the end of Sec. 3.18 of Chapter 3, all ergodic functions are also stationary,
which means that $n^{(\mathrm{det})}$ representing the detector noise is a stationary random function. There is
nothing unusual about characterizing the detector noise this way; most mathematical treatments
of random processes assume at least wide-sense stationarity in order to assign power spectra to
the random behavior under investigation. Like all statistical assumptions, saying the detector
noise is wide-sense stationary is at best an approximation. As a general rule, however, the
assumption that $n^{(\mathrm{det})}$ is wide-sense stationary and so has a well-defined power spectrum turns
out to be a good description of reality.
Appendix 6B explains how to use the direct proportionality between time t and the OPD value $\chi$
given by

$$\chi = ut$$

in Eq. (6.4) to analyze the detector noise $n^{(\mathrm{det})}$ as a random process or function that is wide-sense
stationary in $\chi$ instead of t. We can say [see Eq. (6B.4e) in Appendix 6B] that the
autocorrelation function
(det)
nn
o of
(det)
( ) n is given by

( )
(det) (det) (det)
2 1 1 2
( ) ( ) ( )
nn
n n =

E o . (6.39a)

The corresponding $\chi$-based power spectrum is [see Eq. (6B.6a) in Appendix 6B]

(det) (det) 2
( ) ( )
i
nn nn
e d
r o
o

p o , (6.39b)

from which it follows, reversing the Fourier transform, that

(det) (det) 2
( ) ( )
i
nn nn
e d
r o
o o

o p . (6.39c)

Glancing back at (6.39a), we note that
(det)
nn
o is real because
(det)
n is real. We can easily show that
(det)
nn
o must be even. Starting with (6.39a), we have

( ) ( )
( )
(det) (det) (det) (det) (det)
2 1 1 2 2 1
(det) (det)
1 2 2 1
( ) ( ) ( ) ( ) ( )
( ) ( )
nn
nn nn
n n n n

E E
.
o
o o

Hence, replacing
2 1
by , we get

(det) (det)
( ) ( )
nn nn

o o (6.39d)

It follows, since
(det)
nn
p is the forward Fourier transform of a real and even function, that
(det)
nn
p is
also real and even (see entry 1 of Table 2.1 in Chapter 2):

(det) (det)
( ) ( )
nn nn
o o

p p (6.39e)
and

( )
(det)
Im ( ) 0
nn
o

p . (6.39f)

The detector noise can also, of course, be analyzed in a more conventional way, treating it as a
random function of time $N^{(\mathrm{det})}(t)$ that is wide-sense stationary. The transformation between
$n^{(\mathrm{det})}$ and $N^{(\mathrm{det})}$ is given in Eqs. (6B.2a) and (6B.2b) in Appendix 6B as

$$n^{(\mathrm{det})}(\chi) = N^{(\mathrm{det})}(\chi/u) \tag{6.40a}$$

and

$$n^{(\mathrm{det})}(ut) = N^{(\mathrm{det})}(t), \tag{6.40b}$$

where u is the OPD velocity. The D-limited transform of
(det)
n defined in Eq. (6.29a) can be
t [see Eq. (2.11a) in Chapter 2 dening even functions]
Characterizing the Detector Noise 6.12
- 793 -
written as [see Eq. (6.29c)]

(det) (det) 2
( ) ( )
D
i
D
D
n e d

n .

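Changing the variable of integration to $t = \chi/u$ gives

$$\tilde{\mathbf{n}}_D^{(\mathrm{det})}(\sigma) = u \int_{-D/u}^{D/u} \tilde{n}^{(\mathrm{det})}(ut)\, e^{-2\pi i \sigma u t}\, dt.$$

If we define

$$T = D/u \tag{6.40c}$$

and set

$$f = u\sigma, \tag{6.40d}$$

then Eq. (6.40b) can be used to write this latest formula as

$$\tilde{\mathbf{n}}_D^{(\mathrm{det})}(\sigma) = u \int_{-T}^{T} \tilde{N}^{(\mathrm{det})}(t)\, e^{-2\pi i f t}\, dt. \tag{6.40e}$$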

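Working in the time domain, it makes sense to define the $T$-limited Fourier transform of $\tilde{N}^{(\mathrm{det})}(t)$ to be

$$\tilde{\mathbf{N}}_T^{(\mathrm{det})}(f) = \int_{-T}^{T} \tilde{N}^{(\mathrm{det})}(t)\, e^{-2\pi i f t}\, dt, \tag{6.40f}$$

which means that (6.40e) can now be written as, remembering that $f = u\sigma$,

$$\tilde{\mathbf{n}}_D^{(\mathrm{det})}(\sigma) = u\, \tilde{\mathbf{N}}_T^{(\mathrm{det})}(u\sigma) \tag{6.40g}$$

or

$$u^{-1}\, \tilde{\mathbf{n}}_D^{(\mathrm{det})}(f/u) = \tilde{\mathbf{N}}_T^{(\mathrm{det})}(f). \tag{6.40h}$$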

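[These are the detector-noise versions of Eqs. (6B.7g) and (6B.7h) in Appendix 6B.] Equation (6B.7i) in Appendix 6B now gives

$$p_{nn}^{(\mathrm{det})}(\sigma) = \lim_{D \to \infty} \frac{1}{2D}\, \mathrm{E}\!\left(\left|\tilde{\mathbf{n}}_D^{(\mathrm{det})}(\sigma)\right|^2\right), \tag{6.40i}$$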

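which we can approximate as, assuming that $D$ is large enough for

$$\frac{1}{2D}\, \mathrm{E}\!\left(\left|\tilde{\mathbf{n}}_D^{(\mathrm{det})}(\sigma)\right|^2\right)$$

to be close to its limit as $D \to \infty$,

$$p_{nn}^{(\mathrm{det})}(\sigma) \cong \frac{1}{2D}\, \mathrm{E}\!\left(\left|\tilde{\mathbf{n}}_D^{(\mathrm{det})}(\sigma)\right|^2\right). \tag{6.40j}$$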
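As a rough numerical illustration of the estimate in Eq. (6.40j), the sketch below averages $|\tilde{\mathbf{n}}_D|^2/(2D)$ over many simulated noise records. It is not taken from the text: the sample spacing, record length, noise level, and all variable names are hypothetical, the integral over $\chi$ is approximated by a Riemann sum (the sample spacing times an FFT), and the noise is assumed white, so that a sequence with variance $v$ sampled at spacing $\Delta\chi$ has the flat level $v\,\Delta\chi$.

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical simulation parameters (not from the text).
sigma_nyq = 5000.0                      # Nyquist wavenumber, cm^-1
D = 0.05                                # maximum OPD, cm
p0 = 1.0e-16                            # assumed flat sigma-based power level
dchi = 1.0 / (2.0 * sigma_nyq)          # sample spacing along the OPD axis
n_samples = int(round(2.0 * D / dchi))  # samples covering -D <= chi < D
variance = p0 / dchi                    # white noise: level = variance * dchi

n_meas = 2000                           # number of independent noise records
noise = rng.normal(0.0, np.sqrt(variance), size=(n_meas, n_samples))

# Riemann-sum approximation to the D-limited transform of each record.
# Shifting the origin to chi = -D only changes the phase, not the modulus.
n_D = dchi * np.fft.fft(noise, axis=1)

# Ensemble average of |n_D|^2 / (2D), one value per wavenumber sample.
p_est = (np.abs(n_D) ** 2).mean(axis=0) / (2.0 * D)

print("assumed level  :", p0)
print("estimated level:", p_est.mean())
```

Increasing `n_meas` tightens the scatter of `p_est` about `p0` at each wavenumber; a single record gives only a very noisy estimate.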

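The time-based autocorrelation function of the detector noise is [see Eq. (6B.3a) in Appendix 6B]

$$R_{NN}^{(\mathrm{det})}(t_2 - t_1) = \mathrm{E}\!\left(\tilde{N}^{(\mathrm{det})}(t_1)\,\tilde{N}^{(\mathrm{det})}(t_2)\right) \tag{6.41a}$$

with an associated time-based power spectrum that is the forward Fourier transform of $R_{NN}^{(\mathrm{det})}$,

$$S_{NN}^{(\mathrm{det})}(f) = \int_{-\infty}^{\infty} R_{NN}^{(\mathrm{det})}(t)\, e^{-2\pi i f t}\, dt. \tag{6.41b}$$

The transform can be reversed to get

$$R_{NN}^{(\mathrm{det})}(t) = \int_{-\infty}^{\infty} S_{NN}^{(\mathrm{det})}(f)\, e^{2\pi i f t}\, df. \tag{6.41c}$$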

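Equations (6B.4g), (6B.4h), (6B.6d), and (6B.6f) of Appendix 6B give the transformation formulas connecting $\rho_{nn}^{(\mathrm{det})}$ to $R_{NN}^{(\mathrm{det})}$ and $p_{nn}^{(\mathrm{det})}$ to $S_{NN}^{(\mathrm{det})}$:

$$\rho_{nn}^{(\mathrm{det})}(\chi) = R_{NN}^{(\mathrm{det})}(\chi/u), \tag{6.41d}$$

$$R_{NN}^{(\mathrm{det})}(t) = \rho_{nn}^{(\mathrm{det})}(ut), \tag{6.41e}$$

$$p_{nn}^{(\mathrm{det})}(\sigma) = u\, S_{NN}^{(\mathrm{det})}(u\sigma), \tag{6.41f}$$

and

$$S_{NN}^{(\mathrm{det})}(f) = u^{-1}\, p_{nn}^{(\mathrm{det})}(f/u). \tag{6.41g}$$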

Working with power spectra and autocorrelation functions that are both time-based and $\chi$-based can sometimes be confusing, but the custom of using variables $\chi$ and $\sigma$ to analyze interferometer signals makes it hard to avoid.

FIGURE 6.3(a).

6.13 Detector Noise with a Band-Limited, White-Noise Power Spectrum
Many times detector noise can be modeled as band-limited white noise (see the discussion at the end of Sec. 6.7 above). Following the rules outlined in Sec. 3.25 of Chapter 3, Fig. 6.3(b) plots the double-sided power spectrum of white noise with bandwidth $f_{band}$. The constant power level of this spectrum is $S_{const}^{(\mathrm{det})}$. The corresponding $\sigma$-based power spectrum is plotted in Fig. 6.3(c); it has the same shape as the power spectrum in Fig. 6.3(b) but obeys Eqs. (6.41f) and (6.41g) by having the constant power level

$$p_0^{(\mathrm{det})} = u\, S_{const}^{(\mathrm{det})} \tag{6.42a}$$

and a bandwidth of

$$\sigma_{band} = \frac{f_{band}}{u}. \tag{6.42b}$$

In these two equations, $u$ is still the constant OPD velocity used in Eq. (6.4) above.
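The conversion in Eqs. (6.42a) and (6.42b) can be sketched as a small helper. The function name and the numerical values are hypothetical, chosen only for illustration. The final check relies on the fact that the area under a rectangular double-sided spectrum is twice the bandwidth times the level, so the area should come out the same whether it is computed on the frequency axis or on the wavenumber axis.

```python
def to_sigma_based(s_const, f_band, u):
    """Convert a band-limited white-noise spectrum from time-based to
    wavenumber-based form, following Eqs. (6.42a) and (6.42b).

    s_const : constant level of the double-sided time-based spectrum
    f_band  : bandwidth in Hz
    u       : OPD velocity in cm/sec
    Returns (p0, sigma_band).
    """
    p0 = u * s_const          # Eq. (6.42a)
    sigma_band = f_band / u   # Eq. (6.42b)
    return p0, sigma_band


# Hypothetical example values.
s_const = 1.0e-24   # signal units squared per Hz
f_band = 5.0e4      # Hz
u = 5.0             # cm/sec

p0, sigma_band = to_sigma_based(s_const, f_band, u)

# Area under each rectangular spectrum (the noise variance).
area_time_based = 2.0 * f_band * s_const
area_sigma_based = 2.0 * sigma_band * p0

print(p0, sigma_band)
print(area_time_based, area_sigma_based)
```

The factor of $u$ gained by the power level is exactly cancelled by the factor of $1/u$ in the bandwidth, which is why the two areas agree.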
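[Plot for Fig. 6.3(a): $\log\!\big(S_{nn}^{(1)}(f)\big)$ versus $\log(f)$ for positive $f$, with $\log(f_c)$ marked on the frequency axis.]

FIGURE 6.3(b). [Plot: double-sided time-based power spectrum $S_{NN}^{(\mathrm{det})}(f)$ versus $f$, with constant level $S_{const}^{(\mathrm{det})}$ between $-f_{band}$ and $f_{band}$.]

FIGURE 6.3(c). [Plot: double-sided $\sigma$-based power spectrum $p_{nn}^{(\mathrm{det})}(\sigma)$ versus $\sigma$, with constant level $p_0^{(\mathrm{det})} = u S_{const}^{(\mathrm{det})}$ between $-\sigma_{band}$ and $\sigma_{band}$, where $\sigma_{band} = f_{band}/u$.]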
FIGURE 6.4(a). [Plot: simulated detector noise $\tilde{n}^{(\mathrm{det})}(\chi)$ versus OPD $\chi$ out to $D$, with dashed lines at $\pm I_N^{(\mathrm{det})}$.]

FIGURE 6.4(b). [Plot: real part of $\tilde{\mathbf{n}}_D^{(\mathrm{det})}(\sigma)$ versus wavenumber $\sigma$ out to $\sigma_{Nyq}$, with dashed lines at $\pm Z_{DN}^{(\mathrm{det})}$.]

Figure 6.4(a) plots one possible member of the ensemble of functions associated with the random function $\tilde{n}^{(\mathrm{det})}(\chi)$ obeying a band-limited, white-noise power spectrum like the one in Fig. 6.3(b); that is, Fig. 6.4(a) contains a specific instance of $\tilde{n}^{(\mathrm{det})}(\chi)$. In Fig. 6.4(a), the interval between samples is

$$\Delta\chi = \frac{1}{2\sigma_{Nyq}}, \tag{6.43a}$$

where $\sigma_{Nyq}$ is the Nyquist wavenumber of the sampled interferogram signal that we plan to contaminate with this noise. We make the simulated $\tilde{n}^{(\mathrm{det})}(\chi)$ relatively large so that its effects are easily visible, giving it a scale size $I_N^{(\mathrm{det})}$, shown with dashed lines, equal to 1/50th of the maximum value of the simulated interferogram signal. A power spectrum such as the one shown in Fig. 6.3(c) does not uniquely determine all the statistical rules needed to generate the random noise sequence in Fig. 6.4(a); we also need to pick a probability density distribution for $\tilde{n}^{(\mathrm{det})}$ at each value of $\chi$. This probability density distribution must be zero-mean because, according to Eq. (6.17b),

$$\mathrm{E}\!\left(\tilde{n}^{(\mathrm{det})}(\chi)\right) = 0.$$

To match the probability density distribution to the power spectrum, we also need to give it the correct variance. Remembering that the noise is zero-mean, and consulting Eq. (6.39a) with $\chi_2 = \chi_1$, we see that

$$v_n^{(\mathrm{det})} = \text{variance of detector noise} = \mathrm{E}\!\left(\left[\tilde{n}^{(\mathrm{det})}(\chi)\right]^2\right) = \rho_{nn}^{(\mathrm{det})}(0). \tag{6.43b}$$

Equation (6.39c) with $\chi = 0$ then requires that

$$v_n^{(\mathrm{det})} = \rho_{nn}^{(\mathrm{det})}(0) = \int_{-\infty}^{\infty} p_{nn}^{(\mathrm{det})}(\sigma)\, d\sigma. \tag{6.43c}$$

Equations (6.42a) and (6.42b), together with the band-limited nature of the white-noise power spectrum in (6.43c), now give (remember that $p_{nn}^{(\mathrm{det})}$ is zero for $|\sigma| > \sigma_{band}$)

$$v_n^{(\mathrm{det})} = 2\,\sigma_{band}\, p_0^{(\mathrm{det})}. \tag{6.43d}$$

FIGURE 6.4(c). [Plot: imaginary part of $\tilde{\mathbf{n}}_D^{(\mathrm{det})}(\sigma)$ versus wavenumber $\sigma$ out to $\sigma_{Nyq}$, with dashed lines at $\pm Z_{DN}^{(\mathrm{det})}$.]

FIGURE 6.4(d). [Plot: real and imaginary parts of $\tilde{\mathbf{n}}_D^{(\mathrm{det})}(\sigma)$ over a short stretch of 200 samples on either side of $\sigma = 0$, with dashed lines at $\pm Z_{DN}^{(\mathrm{det})}$.]

Having made the probability density distribution zero-mean, and matched its variance to the power level, we are left free to arrange everything else about the probability density distribution any way we please. Sometimes knowing the variance is enough to pick a specific probability distribution from a family of similar zero-mean density distributions; it is certainly all that is needed to specify the Gaussian probability distribution used to generate the random noise in Fig. 6.4(a).
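The recipe just described (zero mean, variance matched to the power level, Gaussian distribution) can be sketched as follows. This is not the simulation used for Fig. 6.4(a): the function name and parameter values are hypothetical, and the sketch assumes the noise bandwidth $\sigma_{band}$ equals the Nyquist wavenumber $\sigma_{Nyq}$, in which case samples spaced by $\Delta\chi = 1/(2\sigma_{Nyq})$ are uncorrelated and can be drawn independently.

```python
import numpy as np


def simulate_detector_noise(p0, sigma_nyq, D, rng):
    """Return (chi, noise): one instance of zero-mean Gaussian noise sampled
    every dchi = 1/(2*sigma_nyq) over -D <= chi < D.

    Assumes sigma_band == sigma_nyq, so that samples are uncorrelated and the
    variance follows Eq. (6.43d): v = 2 * sigma_band * p0.
    """
    dchi = 1.0 / (2.0 * sigma_nyq)           # Eq. (6.43a)
    n_samples = int(round(2.0 * D / dchi))
    variance = 2.0 * sigma_nyq * p0          # Eq. (6.43d) with sigma_band = sigma_nyq
    chi = -D + dchi * np.arange(n_samples)
    noise = rng.normal(0.0, np.sqrt(variance), size=n_samples)
    return chi, noise


# Hypothetical parameter values.
rng = np.random.default_rng(0)
chi, noise = simulate_detector_noise(p0=1.0e-16, sigma_nyq=5000.0, D=1.0, rng=rng)

print("samples        :", noise.size)
print("sample mean    :", noise.mean())
print("sample variance:", noise.var())
```

Each call with a fresh random state produces a different member of the ensemble; the power spectrum and variance constrain only the statistics, not any individual record.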
We have already noted that the simulated noise plotted in Fig. 6.4(a) can be regarded as a member function picked at random from the ensemble of functions associated with $\tilde{n}^{(\mathrm{det})}(\chi)$; that is, it is a single instance of the detector noise. Even though it is not, in the strictest sense, possible to graph a random function as such (because it stands for a whole collection or ensemble of functions), as a convenient shorthand we often call graphs such as the one in Fig. 6.4(a) a simulation of $\tilde{n}^{(\mathrm{det})}(\chi)$. Figure 6.4(b) contains the real part of the $D$-limited forward Fourier transform of the detector noise shown in Fig. 6.4(a). This means, following the notation specified in Eqs. (6.29a), (6.37i), and (6.37j), that Fig. 6.4(b) simulates the random function

$$\tilde{\mathbf{n}}_{De}^{(\mathrm{det})}(\sigma) = \mathrm{Re}\!\left(\tilde{\mathbf{n}}_D^{(\mathrm{det})}(\sigma)\right).$$

Figure 6.4(c) plots the imaginary part of this same transform, which means Fig. 6.4(c) simulates the random function

$$\mathrm{Im}\!\left(\tilde{\mathbf{n}}_D^{(\mathrm{det})}(\sigma)\right) = i^{-1}\, \tilde{\mathbf{n}}_{Do}^{(\mathrm{det})}(\sigma)$$

corresponding to the detector noise specified in Fig. 6.4(a). The scale size $Z_{DN}^{(\mathrm{det})}$, shown with dashed lines in Figs. 6.4(b) and 6.4(c), is 1/50th of the maximum uncalibrated spectral signal produced by the simulated interferometer. Figure 6.4(d) plots a very short stretch of the two curves in Figs. 6.4(b) and 6.4(c) around $\sigma = 0$. Here $\Delta\sigma$ is the distance between adjacent samples on the wavenumber axis for the radiance spectrum measured by the interferometer. We see that the imaginary part obeys Eqs. (6.37g) and (6.37j) by being an odd function of $\sigma$, and the real part obeys Eqs. (6.37d) and (6.37i) by being an even function of $\sigma$.
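The even/odd symmetry seen in Fig. 6.4(d) is a general property of the Fourier transform of any real sequence, and it can be checked numerically. The sketch below uses a discrete Fourier transform as a stand-in for the $D$-limited transform; the record length and random seed are arbitrary choices, not values from the text.

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples = 1024
noise = rng.normal(size=n_samples)   # one real-valued noise record

spectrum = np.fft.fft(noise)

# In the DFT, index n_samples - k holds the sample at "negative" frequency -k.
k = np.arange(1, n_samples)
mirror = spectrum[n_samples - k]

real_is_even = np.allclose(spectrum[k].real, mirror.real)
imag_is_odd = np.allclose(spectrum[k].imag, -mirror.imag)

print("real part even     :", real_is_even)
print("imaginary part odd :", imag_is_odd)
```

The symmetry holds for every individual noise record, not just on average, because it follows from the noise being real rather than from its statistics.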

6.14 An Example of Simulated Detector Noise in a Double-Sided Signal
In this section, we consider the effect of detector noise in an ideal interferometer system where, in Eqs. (6.38d) and (6.1c), we take $\sigma_{min} = 0$ and $\sigma_{max} = \infty$. Figures 6.5(a)-6.5(c) show what happens to a radiance spectrum $L(|\sigma|)$ measured without any noise, and Figs. 6.6(a) and 6.6(b) show what happens when these same measurements are contaminated by significant amounts of detector noise. Figure 6.5(a) plots the radiance spectrum entering the interferometer's input aperture, with the radiance spectrum graphed for both positive and negative wavenumbers to honor the absolute value sign in its argument. Figure 6.5(b) gives the interferogram signal $z_C(\chi)$ generated by $L(|\sigma|)$, as specified in Eq. (6.5d) with $W = 1$. This is the signal seen at point C in Fig. 6.2 when only negligible amounts of noise and background radiance are present. Figure 6.5(c) gives the $L_{mnf}(\sigma)$ radiance measurement extracted from the interferogram signal in Fig. 6.5(b). The most dramatic change is perhaps the spurious oscillation or ringing produced throughout the measured spectrum by the finite signal length or truncation of the interferogram signal (only signal values between $\chi = +D$ and $\chi = -D$ are recorded in this double-sided system). Careful examination also reveals the blurring effects of this truncation: note that three absorption lines in the center of Fig. 6.5(c) are not quite as deep and are more closely matched in intensity than are the absorption lines in Fig. 6.5(a). The characteristic scale of the radiance axis in Figs. 6.5(a) and 6.5(c) is taken to be $L_{max}$, the maximum value of the input radiance spectrum (in units of optical power per unit area per unit solid angle per unit wavenumber interval). Next, detector noise is added to the radiance measurement. Figure 6.6(a) plots the interference signal in Fig. 6.5(b) contaminated by the band-limited detector noise plotted in Fig. 6.4(a), and Fig. 6.6(b) gives the spectral measurement produced by this noise-contaminated signal.

FIGURE 6.5(a). [Two panels: input radiance spectrum $L(|\sigma|)$ versus wavenumber $\sigma$ from $-\sigma_{Nyq}/2$ to $\sigma_{Nyq}/2$, and an expanded view from 1000 to 1500 on the wavenumber axis; radiance scale marked at $L_{max}$.]

FIGURE 6.5(b). [Two panels: noise-free interferogram signal versus OPD $\chi$ from $-D/2$ to $D/2$, and an expanded view of part of the positive-OPD side.]

FIGURE 6.5(c). [Two panels: noise-free radiance measurement $L_{mnf}(\sigma)$ versus $\sigma$ from $-\sigma_{Nyq}/2$ to $\sigma_{Nyq}/2$, and an expanded view from 1000 to 1500 on the wavenumber axis; radiance scale marked at $L_{max}$.]

FIGURE 6.6(a). [Two panels: noise-contaminated interferogram signal versus OPD $\chi$ from $-D/2$ to $D/2$, and an expanded view of part of the positive-OPD side.]

FIGURE 6.6(b). [Two panels: noise-contaminated radiance measurement versus $\sigma$ from $-\sigma_{Nyq}$ to $\sigma_{Nyq}$, and an expanded view from 1000 to 1500 on the wavenumber axis; radiance scale marked at $L_{max}$.]
The discussion following Eq. (6.35d) above reveals that the detector noise $\tilde{n}^{(\mathrm{det})}(\chi)$ in the $z_C(\chi)$ signal adds a complex spectral noise $\tilde{\mathbf{n}}_D^{(\mathrm{det})}(\sigma)$ to the spectral data coming out of the calibration algorithm; and, as shown in Eq. (6.38a), only the real component of the complex spectral noise unavoidably contaminates the spectral measurement. Figure 6.6(b) shows that this real component typically introduces a fuzziness into the measured spectrum, which is most easily seen where the noise-free $L_{mnf}$ spectrum is negligible or zero. Figures 6.7(a) and 6.7(b) show the real and imaginary parts of the complex spectral noise in this simulated interferometer measurement. Because the last step in producing a double-sided interferometer measurement is, according to Eq. (6.38a) above, to take the real part of the calculated spectrum, only the real part plotted in Fig. 6.7(a) ends up contaminating the spectral measurement. The plots in Figs. 6.7(a) and 6.7(b) look qualitatively similar and have the same characteristic size, which is typical of detector noise (see the discussion at the beginning of Sec. 6.17 below).
It is important to remember that the random noise in Figs. 6.7(a) and 6.7(b) comes from one specific spectral measurement. The very next measurement might have negative errors where there are now positive, or positive errors where there are now negative, or something in between; there is quite literally no necessary connection to the random spectral errors in the previous measurement. If we keep track of the detector-noise error in a very large collection of measurements, and then at each wavenumber average together the detector-noise error from all the different measurements, we would discover that the average detector-noise error approaches zero at every wavenumber as we increase the number of independent measurements. This is, of course, just what should happen according to Eq. (6.30b) above. If we calculate the standard deviation at every wavenumber, we get the NEdN levels shown in Figs. 6.7(a,b).
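The ensemble picture described above can be imitated with a small Monte Carlo sketch. Everything here is hypothetical: the noise is unit-variance white Gaussian noise in arbitrary units, the real part of a discrete Fourier transform stands in for the real spectral noise, and no calibration factors are applied. The per-wavenumber mean plays the role of the average error, and the per-wavenumber standard deviation plays the role of the NEdN level in these arbitrary units.

```python
import numpy as np

rng = np.random.default_rng(2)
n_meas, n_samples = 4000, 256

# Each row is the detector noise from one independent measurement.
noise = rng.normal(size=(n_meas, n_samples))

# Real part of the transform: the part that contaminates the spectrum.
spectral_error = np.fft.fft(noise, axis=1).real

# Statistics across the ensemble at each wavenumber sample.
mean_error = spectral_error.mean(axis=0)
std_error = spectral_error.std(axis=0)

print("largest |mean error|         :", np.abs(mean_error).max())
print("typical std (interior bins)  :", std_error[1:n_samples // 2].mean())
print("expected std, sqrt(N/2)      :", np.sqrt(n_samples / 2.0))
```

The mean shrinks toward zero as `n_meas` grows, while the standard deviation settles at a fixed level that does not depend on how many measurements are averaged.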
6.15 Photon Noise in Detectors
Most detectors approach an ideal state when chilled to very low temperatures (typically tens of degrees Kelvin) at reasonable levels of illumination. For an ideal detector, the only source of detector noise is the quantum fluctuations in the number of photons it absorbs. When the detector experiences a constant level of illumination, these quantum fluctuations show up as band-limited white noise. The photon noise in many types of photovoltaic (PV) detectors often approaches the ideal of band-limited white noise. Many times this occurs when the detector observes the signal in the presence of large amounts of background radiation, because then most of the photons reaching the detector come from the constant background, keeping the total number of absorbed photons approximately constant as the optical signal varies. A detector operating in this mode is said to have reached its background-limited infrared photon, or BLIP, limit. Figures 5.8(a) and 5.8(b) in Chapter 5 show that when detectors measure interferograms, the total signal variation about its average level is usually small except very close to ZPD in a region symmetrically located about $\chi = 0$. In this sense, even when background radiation is disregarded, PV detectors measuring interferograms are analogous to PV detectors operating in the BLIP limit: photons are absorbed at a more or less constant rate during most of the measurement. Experience has shown that for this reason the photon noise contaminating interferograms can usually be approximated as band-limited white noise, with the photon noise level specified by the detector's average illumination from both the background and signal radiances.

FIGURE 6.7(a). [Plot: real part of the complex spectral noise in the radiance measurement versus wavenumber from $-\sigma_{Nyq}$ to $\sigma_{Nyq}$, with dashed lines at $\pm L_{max}/50$. The thick, solid line is the NEdN level for this noise.]
To derive a power level for the photon noise generated in a detector, we treat the detector as an element of an electric circuit (it does, after all, put out an electric signal when illuminated), which means it must have a typical bandwidth that we call $f_{band}^{(\mathrm{det})}$. Associated with this bandwidth is a response time

$$\tau_{band}^{(\mathrm{det})} = \frac{1}{2 f_{band}^{(\mathrm{det})}}. \tag{6.44a}$$

If the illumination hitting the detector varies significantly on a timescale shorter than $\tau_{band}^{(\mathrm{det})}$, the detector does not record the change in illumination directly but instead generates a signal based on the average level of illumination reaching the detector over the $\tau_{band}^{(\mathrm{det})}$ time interval. In this sense, $\tau_{band}^{(\mathrm{det})}$ is the effective length of time during which the detector collects photons to produce its signal. We also assume that the detector responsivity $R(\sigma)$ (which is defined at the beginning of Sec. 5.9 in Chapter 5) can be written as the product of two functions $\eta_d(\sigma)$ and $e_d(\sigma)$ for wavenumbers $\sigma$ greater than zero,

$$R(\sigma) = \eta_d(\sigma)\, e_d(\sigma). \tag{6.44b}$$

Function $\eta_d$ is often called the detector's quantum efficiency; it specifies the fraction of photons of frequency $f = c/\lambda = c\sigma$ that are absorbed after hitting the detector's surface. The value of $\eta_d$ for any $\sigma$ must be a dimensionless number between zero and one:

$$0 \le \eta_d(\sigma) \le 1. \tag{6.44c}$$

Every photon is associated with a monochromatic wavefield of frequency $f$ (in cycles per second) and carries an amount of energy $hf = hc\sigma$, where $h \approx 6.626 \times 10^{-27}$ erg sec is Planck's constant and $c \approx 2.998 \times 10^{10}$ cm/sec is the speed of light in a vacuum. We define $\tilde{P}_1$ to be the random number of photons absorbed by the detector in time $\tau_{band}^{(\mathrm{det})}$ that have frequency $f_1 = c\sigma_1$, $\tilde{P}_2$ to be the random number of photons absorbed in time $\tau_{band}^{(\mathrm{det})}$ that have frequency $f_2 = c\sigma_2$, $\tilde{P}_3$ to be the random number of photons absorbed in time $\tau_{band}^{(\mathrm{det})}$ that have frequency $f_3 = c\sigma_3$, and so on. The statistical rules obeyed by photons require $\tilde{P}_1$, $\tilde{P}_2$, $\tilde{P}_3$, $\ldots$ to be independent random numbers.
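FIGURE 6.7(b). [Plot: imaginary part of the complex spectral noise in the radiance measurement versus wavenumber from $-\sigma_{Nyq}$ to $\sigma_{Nyq}$, with dashed lines at $\pm L_{max}/50$. The thick, solid line is the NEdN level for this noise.]

The total number of photons absorbed by the detector in time $\tau_{band}^{(\mathrm{det})}$ is

$$\tilde{P}_{tot} = \tilde{P}_1 + \tilde{P}_2 + \tilde{P}_3 + \cdots. \tag{6.45a}$$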

The detector has an area $A_d$, a field of view specified by the solid angle $\Omega_d$, and is illuminated by a constant radiance $L_d(\sigma)$ that is defined only for $\sigma \ge 0$. As has already been pointed out at the beginning of this section, for interferometers we can take $L_d(\sigma)$ to be the average radiance level, both from the optical background and the optical signal, reaching the detector. Using the linearity of the expectation operator $\mathrm{E}$ with respect to random variables (see Sec. 3.10 of Chapter 3), the average number of photons absorbed by the detector in time $\tau_{band}^{(\mathrm{det})}$ is

$$\mathrm{E}(\tilde{P}_{tot}) = \mathrm{E}(\tilde{P}_1) + \mathrm{E}(\tilde{P}_2) + \mathrm{E}(\tilde{P}_3) + \cdots, \tag{6.45b}$$

where $\mathrm{E}(\tilde{P}_1)$ is the average number of photons absorbed in time $\tau_{band}^{(\mathrm{det})}$ that have frequency $f_1$, $\mathrm{E}(\tilde{P}_2)$ is the average number of photons absorbed in time $\tau_{band}^{(\mathrm{det})}$ that have frequency $f_2$, and so on. Given $L_d(\sigma)$, we know that

$$\begin{aligned}
\mathrm{E}(\tilde{P}_1) &= \eta_d(\sigma_1)\, \tau_{band}^{(\mathrm{det})}\, \frac{A_d \Omega_d L_d(\sigma_1)}{hc\sigma_1}\, d\sigma, \\
\mathrm{E}(\tilde{P}_2) &= \eta_d(\sigma_2)\, \tau_{band}^{(\mathrm{det})}\, \frac{A_d \Omega_d L_d(\sigma_2)}{hc\sigma_2}\, d\sigma, \\
&\;\;\vdots \\
\mathrm{E}(\tilde{P}_j) &= \eta_d(\sigma_j)\, \tau_{band}^{(\mathrm{det})}\, \frac{A_d \Omega_d L_d(\sigma_j)}{hc\sigma_j}\, d\sigma, \\
&\;\;\vdots
\end{aligned} \tag{6.45c}$$

where

$$A_d \Omega_d L_d(\sigma_j)\, d\sigma$$

is the radiant power carried by electromagnetic radiation having a frequency $f$ between $f_j = c\sigma_j$ and $f = c(\sigma_j + d\sigma)$, and

$$\frac{A_d \Omega_d L_d(\sigma_j)}{hc\sigma_j}\, d\sigma$$

is, of course, the average number of photons per unit time carried by that radiation.
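Returning to Eq. (6.45a), we see that the actual random optical power $\tilde{W}_d$ absorbed by the detector over a time interval $\tau_{band}^{(\mathrm{det})}$ is

$$\tilde{W}_d = \frac{hc\sigma_1}{\tau_{band}^{(\mathrm{det})}}\, \tilde{P}_1 + \frac{hc\sigma_2}{\tau_{band}^{(\mathrm{det})}}\, \tilde{P}_2 + \frac{hc\sigma_3}{\tau_{band}^{(\mathrm{det})}}\, \tilde{P}_3 + \cdots. \tag{6.46a}$$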

This should not be confused with the average or expected optical power absorbed over the time interval $\tau_{band}^{(\mathrm{det})}$. Since the photons have already been absorbed, all that is needed to get the actual random signal $\tilde{I}_d$ is to multiply the first term by $e_d(\sigma_1)$, the second term by $e_d(\sigma_2)$, etc., which gives

$$\tilde{I}_d = e_d(\sigma_1)\, \frac{hc\sigma_1}{\tau_{band}^{(\mathrm{det})}}\, \tilde{P}_1 + e_d(\sigma_2)\, \frac{hc\sigma_2}{\tau_{band}^{(\mathrm{det})}}\, \tilde{P}_2 + e_d(\sigma_3)\, \frac{hc\sigma_3}{\tau_{band}^{(\mathrm{det})}}\, \tilde{P}_3 + \cdots. \tag{6.46b}$$

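The right-hand side of this equation is a sum of independent random variables. Equation (3.19e) in Chapter 3 states that the variance of the sum of independent random variables is the sum of the variances, so we can use the notation introduced in Eq. (3.8f) of Chapter 3 to write

$$\mathrm{Var}(\tilde{I}_d) = \mathrm{Var}\!\left(e_d(\sigma_1)\, \frac{hc\sigma_1}{\tau_{band}^{(\mathrm{det})}}\, \tilde{P}_1\right) + \mathrm{Var}\!\left(e_d(\sigma_2)\, \frac{hc\sigma_2}{\tau_{band}^{(\mathrm{det})}}\, \tilde{P}_2\right) + \cdots + \mathrm{Var}\!\left(e_d(\sigma_j)\, \frac{hc\sigma_j}{\tau_{band}^{(\mathrm{det})}}\, \tilde{P}_j\right) + \cdots.$$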

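Equation (3.16g) in Chapter 3 points out that multiplying a random variable by a nonrandom parameter means that its variance must be multiplied by the square of that parameter, so the variance in signal $\tilde{I}_d$ is

$$\mathrm{Var}(\tilde{I}_d) = \left[e_d(\sigma_1)\, \frac{hc\sigma_1}{\tau_{band}^{(\mathrm{det})}}\right]^2 \mathrm{Var}(\tilde{P}_1) + \left[e_d(\sigma_2)\, \frac{hc\sigma_2}{\tau_{band}^{(\mathrm{det})}}\right]^2 \mathrm{Var}(\tilde{P}_2) + \cdots + \left[e_d(\sigma_j)\, \frac{hc\sigma_j}{\tau_{band}^{(\mathrm{det})}}\right]^2 \mathrm{Var}(\tilde{P}_j) + \cdots. \tag{6.46c}$$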

The number of photons absorbed at any frequency $f_j = c\sigma_j$ obeys Poisson statistics, which means that the variance in the random number of photons equals the mean or average number of photons:

$$\mathrm{Var}(\tilde{P}_j) = \mathrm{E}(\tilde{P}_j). \tag{6.46d}$$
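The combination of Eqs. (6.46c) and (6.46d), a weighted sum of independent Poisson counts whose variance is the sum of squared weights times mean counts, can be checked with a short simulation. The three wavenumber bins, their mean photon counts, and the weights standing in for $e_d(\sigma_j)\,hc\sigma_j/\tau_{band}^{(\mathrm{det})}$ are all made-up numbers in arbitrary units.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical mean photon counts E(P_j) in three wavenumber bins.
mean_counts = np.array([400.0, 250.0, 120.0])

# Hypothetical nonrandom weights standing in for e_d(sigma_j)*h*c*sigma_j/tau.
weights = np.array([1.0, 1.6, 2.3])

# Many independent realizations of the photon counts in each bin.
counts = rng.poisson(mean_counts, size=(200_000, 3))

# Random signal: weighted sum of the counts, as in Eq. (6.46b).
signal = counts @ weights

# Eq. (6.46c) with Var(P_j) = E(P_j) from Eq. (6.46d).
predicted_var = np.sum(weights ** 2 * mean_counts)
sample_var = signal.var()

print("predicted variance:", predicted_var)
print("sample variance   :", sample_var)
```

Note that the weights enter squared while the mean counts enter linearly, which is what later turns $[e_d]^2 \eta_d$ into $R^2/\eta_d$ in the integral form.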

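Substituting Eqs. (6.46d) and (6.45c) into (6.46c) gives

$$\begin{aligned}
\mathrm{Var}(\tilde{I}_d) &= \left[e_d(\sigma_1)\, \frac{hc\sigma_1}{\tau_{band}^{(\mathrm{det})}}\right]^2 \eta_d(\sigma_1)\, \tau_{band}^{(\mathrm{det})}\, \frac{A_d \Omega_d L_d(\sigma_1)}{hc\sigma_1}\, d\sigma \\
&\quad + \left[e_d(\sigma_2)\, \frac{hc\sigma_2}{\tau_{band}^{(\mathrm{det})}}\right]^2 \eta_d(\sigma_2)\, \tau_{band}^{(\mathrm{det})}\, \frac{A_d \Omega_d L_d(\sigma_2)}{hc\sigma_2}\, d\sigma + \cdots \\
&\quad + \left[e_d(\sigma_j)\, \frac{hc\sigma_j}{\tau_{band}^{(\mathrm{det})}}\right]^2 \eta_d(\sigma_j)\, \tau_{band}^{(\mathrm{det})}\, \frac{A_d \Omega_d L_d(\sigma_j)}{hc\sigma_j}\, d\sigma + \cdots \\
&= \frac{hc\sigma_1}{\tau_{band}^{(\mathrm{det})}}\, A_d \Omega_d\, \eta_d(\sigma_1)\, [e_d(\sigma_1)]^2\, L_d(\sigma_1)\, d\sigma \\
&\quad + \frac{hc\sigma_2}{\tau_{band}^{(\mathrm{det})}}\, A_d \Omega_d\, \eta_d(\sigma_2)\, [e_d(\sigma_2)]^2\, L_d(\sigma_2)\, d\sigma \\
&\quad + \frac{hc\sigma_3}{\tau_{band}^{(\mathrm{det})}}\, A_d \Omega_d\, \eta_d(\sigma_3)\, [e_d(\sigma_3)]^2\, L_d(\sigma_3)\, d\sigma + \cdots.
\end{aligned}$$

Converting this sum into an integral, we get

$$\mathrm{Var}(\tilde{I}_d) = \frac{hc}{\tau_{band}^{(\mathrm{det})}}\, A_d \Omega_d \int_0^{\infty} \sigma\, \eta_d(\sigma)\, [e_d(\sigma)]^2\, L_d(\sigma)\, d\sigma.$$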

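Equations (6.44a) and (6.44b) can be used to write this as

$$\mathrm{Var}(\tilde{I}_d) = 2 f_{band}^{(\mathrm{det})}\, hc\, A_d \Omega_d \int_0^{\infty} \sigma\, \frac{[R(\sigma)]^2}{\eta_d(\sigma)}\, L_d(\sigma)\, d\sigma. \tag{6.46e}$$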

The photon noise is band-limited white noise like that shown in Figs. 6.3(b) and 6.3(c) above. Hence, Eq. (3.62d) in Chapter 3, which connects the variance of band-limited white noise to the constant level of its noise-power spectrum, here allows us to write that

$$\mathrm{Var}(\tilde{I}_d) = 2 f_{band}^{(\mathrm{det})}\, S_{2p}^{(\mathrm{det})}, \tag{6.46f}$$

where $S_{2p}^{(\mathrm{det})}$ is the constant power level of the double-sided, time-based power spectrum due to the random quantum fluctuations in the number of photons absorbed by the detector. Comparing Eq. (6.46e) to (6.46f), we see that

$$S_{2p}^{(\mathrm{det})} = hc\, A_d \Omega_d \int_0^{\infty} \sigma\, \frac{[R(\sigma)]^2}{\eta_d(\sigma)}\, L_d(\sigma)\, d\sigma. \tag{6.46g}$$

A single-sided power spectrum must, according to Eq. (3.58b) of Chapter 3, have a constant power level $S_{1p}^{(\mathrm{det})}$ that is twice the size of the double-sided power level, hence

$$S_{1p}^{(\mathrm{det})} = 2hc\, A_d \Omega_d \int_0^{\infty} \sigma\, \frac{[R(\sigma)]^2}{\eta_d(\sigma)}\, L_d(\sigma)\, d\sigma. \tag{6.46h}$$

When the detector experiences an approximately constant level of monochromatic radiation at wavenumber $\sigma = \sigma_0$, we can write the radiance $L_d$ as

$$L_d(\sigma) = \frac{Q\, hc\sigma_0}{\Omega_d}\, \delta(\sigma - \sigma_0), \tag{6.47a}$$

where $Q$, which is often called the photon incidence, is defined to be the number of photons per unit time and per unit area hitting the detector. The delta function in (6.47a) has units of inverse wavenumbers (that is, length) and is explained in Sec. 2.14 of Chapter 2. Substitution of (6.47a) into (6.46h) gives

$$S_{1p}^{(\mathrm{det})} = 2hc\, A_d \Omega_d \int_0^{\infty} \sigma\, \frac{[R(\sigma)]^2}{\eta_d(\sigma)}\, \frac{Q\, hc\sigma_0}{\Omega_d}\, \delta(\sigma - \sigma_0)\, d\sigma$$

or

$$S_{1p}^{(\mathrm{det})} = \frac{2 A_d Q}{\eta_d(\sigma_0)} \left[hc\sigma_0\, R(\sigma_0)\right]^2. \tag{6.47b}$$

Detectors are often characterized by a figure of merit called the specific detectivity $D^*$, or D-star. The specific detectivity of a detector at a positive wavenumber $|\sigma|$ is defined to be

$$D^*(|\sigma|) = \frac{R(|\sigma|)\, \sqrt{A_d}}{\sqrt{S_1^{(\mathrm{det})}(u|\sigma|)}}, \tag{6.48a}$$

where $u$ is again the constant OPD velocity used in Eq. (6.4) above, $R(\sigma)$ is the detector's responsivity, $A_d$ is the detector area, and $S_1^{(\mathrm{det})}(f)$ is the single-sided noise-power density at the signal frequency $f$ (in Hz). The absolute value signs applied to $\sigma$ both remind us that its value must be positive and allow us to extend the definition of $D^*$ to negative wavenumbers. The units of $D^*$ are cm·Hz$^{1/2}$/watt (which is often called a Jones). The $D^*$ tends to be constant for all infrared detectors made from the same detector material and operating at the same temperature, no matter what the detector area $A_d$; consequently, it can be used to predict the amount of noise contamination present in any size detector, all other things being equal. High-performance detectors produce low-noise signals and have large $D^*$ values (for example, $10^{14}$ cm·Hz$^{1/2}$/watt), and low-performance detectors have small $D^*$ values (for example, $10^{7}$ cm·Hz$^{1/2}$/watt). The $D^*$ of an ideal detector that is photon-noise limited and experiencing an approximately constant level of monochromatic illumination at wavenumber $\sigma_0$ is, substituting Eq. (6.47b) into (6.48a),

$$D^*(\sigma_0) = \frac{R(\sigma_0)\, \sqrt{A_d}}{\sqrt{S_{1p}^{(\mathrm{det})}}} = \frac{1}{hc\sigma_0} \sqrt{\frac{\eta_d}{2Q}} \tag{6.48b}$$

or, remembering that the radiation wavelength $\lambda_0$ equals $\sigma_0^{-1}$,

$$D^*(\sigma_0) = \frac{\lambda_0}{hc} \sqrt{\frac{\eta_d}{2Q}}. \tag{6.48c}$$

This equation is the standard $D^*$ formula for a PV detector in the BLIP limit.$^{101}$

$^{101}$ See, for example, Eq. (2.48a) in John David Vincent, Fundamentals of Infrared Detector Operation and Testing (John Wiley and Sons, New York, 1990), p. 65.
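As a worked check on Eqs. (6.47b) through (6.48c), the sketch below evaluates the photon-noise power level for one set of made-up detector parameters, forms $D^*$ from the definition in Eq. (6.48a), and compares it with the closed-form BLIP expression. The function names and every numerical input are hypothetical. The calculation stays in the CGS units used in the text (erg, cm, sec) and converts to cm·Hz$^{1/2}$/watt only at the end using 1 watt = $10^7$ erg/sec.

```python
import math

H_ERG_SEC = 6.626e-27   # Planck's constant, erg sec
C_CM_SEC = 2.998e10     # speed of light, cm/sec


def photon_noise_level(area, photon_incidence, eta, responsivity, sigma0):
    """Single-sided photon-noise power level, Eq. (6.47b)."""
    photon_energy = H_ERG_SEC * C_CM_SEC * sigma0
    return 2.0 * area * photon_incidence / eta * (photon_energy * responsivity) ** 2


def dstar_blip(photon_incidence, eta, sigma0):
    """BLIP-limit D* from Eq. (6.48c), in cm*sqrt(Hz) per (erg/sec)."""
    wavelength = 1.0 / sigma0
    return wavelength / (H_ERG_SEC * C_CM_SEC) * math.sqrt(eta / (2.0 * photon_incidence))


# Hypothetical detector and illumination parameters.
area = 1.0e-2               # detector area, cm^2
photon_incidence = 1.0e16   # Q, photons per sec per cm^2
eta = 0.6                   # quantum efficiency (dimensionless)
responsivity = 1.0          # signal units per (erg/sec); cancels out of D*
sigma0 = 1000.0             # wavenumber, cm^-1 (10 micron wavelength)

s1p = photon_noise_level(area, photon_incidence, eta, responsivity, sigma0)

# D* from the definition, Eq. (6.48a), versus the closed form, Eq. (6.48c).
dstar_from_definition = responsivity * math.sqrt(area) / math.sqrt(s1p)
dstar_closed_form = dstar_blip(photon_incidence, eta, sigma0)

# Convert from per (erg/sec) to per watt.
dstar_jones = dstar_closed_form * 1.0e7

print("D* from definition :", dstar_from_definition)
print("D* closed form     :", dstar_closed_form)
print("D* in Jones        :", dstar_jones)
```

Both the detector area and the responsivity cancel out of the result, which is consistent with the remark that $D^*$ does not depend on detector size.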
6.16 Detector-Noise NEdN in Double-Sided Signals
It is easy to show that, as expected from Eqs. (6.2c) and (6.3c), the expectation value of $\delta\tilde{L}(\sigma)$ is zero in a double-sided signal contaminated by detector noise. Returning to Eq. (6.30b), we get that

$$\mathrm{E}\!\left(\tilde{\mathbf{n}}_D^{(\mathrm{det})}(\sigma)\right) = 0. \tag{6.49a}$$

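Substituting from Eq. (6.37h) now gives, using the linearity of the expectation operator with respect to random variables [see Eq. (3.16a) in Chapter 3],

$$\mathrm{E}\!\left(\tilde{\mathbf{n}}_{De}^{(\mathrm{det})}(\sigma) + \tilde{\mathbf{n}}_{Do}^{(\mathrm{det})}(\sigma)\right) = \mathrm{E}\!\left(\tilde{\mathbf{n}}_{De}^{(\mathrm{det})}(\sigma)\right) + \mathrm{E}\!\left(\tilde{\mathbf{n}}_{Do}^{(\mathrm{det})}(\sigma)\right) = 0. \tag{6.49b}$$

According to Eqs. (6.37i) and (6.37j), $\tilde{\mathbf{n}}_{De}^{(\mathrm{det})}(\sigma)$ is purely real and $\tilde{\mathbf{n}}_{Do}^{(\mathrm{det})}(\sigma)$ is purely imaginary, which means the expectation value $\mathrm{E}\big(\tilde{\mathbf{n}}_{De}^{(\mathrm{det})}(\sigma)\big)$ must be purely real and the expectation value $\mathrm{E}\big(\tilde{\mathbf{n}}_{Do}^{(\mathrm{det})}(\sigma)\big)$ must be purely imaginary. Consequently we can take real and imaginary parts of (6.49b) to get

$$\mathrm{E}\!\left(\tilde{\mathbf{n}}_{De}^{(\mathrm{det})}(\sigma)\right) = 0 \tag{6.49c}$$

and

$$\mathrm{E}\!\left(\tilde{\mathbf{n}}_{Do}^{(\mathrm{det})}(\sigma)\right) = 0. \tag{6.49d}$$

Taking the expectation value of both sides of the formula for $\delta\tilde{L}(\sigma)$ in Eq. (6.38g) now gives the desired result:

$$\mathrm{E}\!\left(\delta\tilde{L}(\sigma)\right) = 0 \tag{6.49e}$$

for the double-sided detector noise.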
To get the detector-noise NEdN in a double-sided signal, we first substitute Eq. (6.49e) into (6.3g) to get

$$NEdN(\sigma) = \sqrt{\mathrm{E}\!\left(\left[\delta\tilde{L}(\sigma)\right]^2\right)} \tag{6.50a}$$

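and then substitute (6.38f),

$$NEdN_2^{(\mathrm{det})}(\sigma) = \frac{4\sqrt{\mathrm{E}\!\left(\left[\mathrm{Re}\,\tilde{\mathbf{n}}_D^{(\mathrm{det})}(\sigma)\right]^2\right)}}{A\,(\Delta\Omega)\, M(|\sigma|)\, \eta_{ma}(|\sigma|)\, R(|\sigma|)\, \tau_a(|\sigma|)\, \tau_f(|\sigma|)}. \tag{6.50b}$$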

The subscript 2 and the superscript (det) are added to the NEdN parameter to show that this is the NEdN of a double-sided signal contaminated by detector noise. According to the discussion immediately preceding Eqs. (4.84a) and (4.84b) in Chapter 4, parameter $W = +1$ or $-1$, which means that it drops out of the formula when $\delta\tilde{L}(\sigma)$ is squared. We can remove the absolute value signs from the arguments of $M$ and $\eta_{ma}$ because they are already even functions [see Eqs. (4.139g) and (5.10f) in Chapters 4 and 5 respectively] to get

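$$NEdN_2^{(\mathrm{det})}(\sigma) = \frac{4\sqrt{\mathrm{E}\!\left(\left[\mathrm{Re}\,\tilde{\mathbf{n}}_D^{(\mathrm{det})}(\sigma)\right]^2\right)}}{A\,(\Delta\Omega)\, M(\sigma)\, \eta_{ma}(\sigma)\, R(|\sigma|)\, \tau_a(|\sigma|)\, \tau_f(|\sigma|)}. \tag{6.50c}$$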

According to the discussion at the beginning of Sec. 6.12, we can assume the detector noise to be wide-sense stationary; and Appendix 6B shows that it has this property both when treated as a random function of time and when treated as a random function of the OPD value $\chi$. Using the transformation specified in Eqs. (6.40a) and (6.40b) to treat the detector noise as the random function of time $\tilde{N}^{(\mathrm{det})}(t)$, we use Eq. (6.40f) to construct its $T$-limited Fourier transform [Eq. (6.22b) above defines $\Pi$]

$$\tilde{\mathbf{N}}_T^{(\mathrm{det})}(f) = \int_{-T}^{T} \tilde{N}^{(\mathrm{det})}(t)\, e^{-2\pi i f t}\, dt = \int_{-\infty}^{\infty} \Pi(t, T)\, \tilde{N}^{(\mathrm{det})}(t)\, e^{-2\pi i f t}\, dt. \tag{6.51a}$$

The analysis given in Sec. 3.26 of Chapter 3 shows, according to Eqs. (3.69g) and (3.69h), that
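$$\mathrm{E}\!\left(\left[\mathrm{Re}\,\tilde{\mathbf{N}}_T^{(\mathrm{det})}(f)\right]^2\right) \cong \frac{1}{2}\, \mathrm{E}\!\left(\left|\tilde{\mathbf{N}}_T^{(\mathrm{det})}(f)\right|^2\right) \tag{6.51b}$$

and

$$\mathrm{E}\!\left(\left[\mathrm{Im}\,\tilde{\mathbf{N}}_T^{(\mathrm{det})}(f)\right]^2\right) \cong \frac{1}{2}\, \mathrm{E}\!\left(\left|\tilde{\mathbf{N}}_T^{(\mathrm{det})}(f)\right|^2\right) \tag{6.51c}$$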

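as long as $f$ is substantially greater than $O(T^{-1})$. This is a very easy requirement to satisfy since at this point all it really does is show how large $T$ must be chosen for us to have Eqs. (6.51b) and (6.51c) hold true at the frequencies $f$ we are interested in. Remembering that $\mathrm{E}$ is a linear operator with respect to random variables and that $f/u = \sigma$, we use Eq. (6.40h) to write

$$\mathrm{E}\!\left(\left[\mathrm{Re}\,\tilde{\mathbf{n}}_D^{(\mathrm{det})}(\sigma)\right]^2\right) \cong \frac{1}{2}\, \mathrm{E}\!\left(\left|\tilde{\mathbf{n}}_D^{(\mathrm{det})}(\sigma)\right|^2\right) \tag{6.51d}$$

and

$$\mathrm{E}\!\left(\left[\mathrm{Im}\,\tilde{\mathbf{n}}_D^{(\mathrm{det})}(\sigma)\right]^2\right) \cong \frac{1}{2}\, \mathrm{E}\!\left(\left|\tilde{\mathbf{n}}_D^{(\mathrm{det})}(\sigma)\right|^2\right). \tag{6.51e}$$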

These two formulas only hold true as long as $\sigma$ is substantially greater than $O(D^{-1})$, as can be seen by applying Eqs. (6.40c) and (6.40d) to the requirement that $f$ is substantially greater than $O(T^{-1})$. The intersample distance $\Delta\sigma$ between spectral samples along the wavenumber axis of the radiance measurement is, according to the discussion following Eq. (5.124d) in Chapter 5,

$$\Delta\sigma = \frac{1}{2D}.$$

Consequently, as long as the wavenumbers between $\sigma_{min}$ and $\sigma_{max}$ at which the spectral radiance is being measured lie a reasonable number of $\Delta\sigma$ lengths away from the $\sigma = 0$ origin of the wavenumber axis (as would be the case in a well-designed interferometer system), we can rely on $\sigma$ being substantially greater than $O(D^{-1})$ for the wavenumbers of interest. Hence formulas (6.51d) and (6.51e) can be assumed to hold true. Now we can substitute Eq. (6.51d) into (6.50c) to get

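$$NEdN_2^{(\mathrm{det})}(\sigma) = \frac{2\sqrt{2}\, \sqrt{\mathrm{E}\!\left(\left|\tilde{\mathbf{n}}_D^{(\mathrm{det})}(\sigma)\right|^2\right)}}{A\,(\Delta\Omega)\, M(\sigma)\, \eta_{ma}(\sigma)\, R(|\sigma|)\, \tau_a(|\sigma|)\, \tau_f(|\sigma|)}. \tag{6.51f}$$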

This basic equation for the detector-noise NEdN of a double-sided signal can be put into a variety of forms.
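The equal split of noise power between real and imaginary parts in Eqs. (6.51b) through (6.51e), and its failure near the origin of the frequency axis, can be seen in a short Monte Carlo sketch. The setup is hypothetical: unit-variance white Gaussian noise, a discrete Fourier transform in place of the limited transform, and an arbitrary margin of eight samples to stay away from the two ends of the positive-frequency range.

```python
import numpy as np

rng = np.random.default_rng(4)
n_meas, n_samples = 2000, 512

# Each row is one independent real noise record.
noise = rng.normal(size=(n_meas, n_samples))
spectra = np.fft.fft(noise, axis=1)

# Ensemble averages at each frequency sample.
mean_sq_real = (spectra.real ** 2).mean(axis=0)
mean_sq_imag = (spectra.imag ** 2).mean(axis=0)
mean_sq_abs = (np.abs(spectra) ** 2).mean(axis=0)

# Frequencies well away from zero and from the Nyquist sample.
interior = slice(8, n_samples // 2 - 8)
ratio_real = mean_sq_real[interior] / mean_sq_abs[interior]
ratio_imag = mean_sq_imag[interior] / mean_sq_abs[interior]

# At zero frequency the transform of a real record is purely real.
ratio_real_at_zero = mean_sq_real[0] / mean_sq_abs[0]

print("interior real fraction:", ratio_real.mean())
print("interior imag fraction:", ratio_imag.mean())
print("real fraction at f = 0:", ratio_real_at_zero)
```

Away from the origin each part carries about half the power, while at zero frequency the real part carries all of it, which is why the formulas are restricted to frequencies well above $O(T^{-1})$.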
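If the power spectrum of the detector noise is known, we can evaluate

$$\mathrm{E}\!\left(\left|\tilde{\mathbf{n}}_D^{(\mathrm{det})}(\sigma)\right|^2\right)$$

directly no matter what shape it has. In particular, we do not need to assume that the detector produces band-limited white noise. Starting with Eq. (6.29a), we have

$$\left|\tilde{\mathbf{n}}_D^{(\mathrm{det})}(\sigma)\right|^2 = \tilde{\mathbf{n}}_D^{(\mathrm{det})}(\sigma)\, \tilde{\mathbf{n}}_D^{(\mathrm{det})}(\sigma)^* = \int_{-\infty}^{\infty} \Pi(\chi, D)\, \tilde{n}^{(\mathrm{det})}(\chi)\, e^{-2\pi i \sigma \chi}\, d\chi \left[\int_{-\infty}^{\infty} \Pi(\chi', D)\, \tilde{n}^{(\mathrm{det})}(\chi')\, e^{-2\pi i \sigma \chi'}\, d\chi'\right]^*,$$

which becomes, since $\tilde{n}^{(\mathrm{det})}(\chi)$ is real,

$$\left|\tilde{\mathbf{n}}_D^{(\mathrm{det})}(\sigma)\right|^2 = \int_{-\infty}^{\infty} \Pi(\chi, D)\, \tilde{n}^{(\mathrm{det})}(\chi)\, e^{-2\pi i \sigma \chi}\, d\chi \int_{-\infty}^{\infty} \Pi(\chi', D)\, \tilde{n}^{(\mathrm{det})}(\chi')\, e^{2\pi i \sigma \chi'}\, d\chi'.$$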

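Equation (3.17c) in Chapter 3 allows the expectation operator $\mathrm{E}$ to be taken inside the double integral formula, so applying $\mathrm{E}$ to both sides leads to

$$\mathrm{E}\!\left(\left|\tilde{\mathbf{n}}_D^{(\mathrm{det})}(\sigma)\right|^2\right) = \int_{-\infty}^{\infty} d\chi\, \Pi(\chi, D)\, e^{-2\pi i \sigma \chi} \int_{-\infty}^{\infty} d\chi'\, \Pi(\chi', D)\, e^{2\pi i \sigma \chi'}\, \mathrm{E}\!\left(\tilde{n}^{(\mathrm{det})}(\chi')\, \tilde{n}^{(\mathrm{det})}(\chi)\right).$$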

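Substituting from Eq. (6.39a) and then applying (6.39c) gives

$$\begin{aligned}
\mathrm{E}\!\left(\left|\tilde{\mathbf{n}}_D^{(\mathrm{det})}(\sigma)\right|^2\right) &= \int_{-\infty}^{\infty} d\chi\, \Pi(\chi, D)\, e^{-2\pi i \sigma \chi} \int_{-\infty}^{\infty} d\chi'\, \Pi(\chi', D)\, e^{2\pi i \sigma \chi'}\, \rho_{nn}^{(\mathrm{det})}(\chi - \chi') \\
&= \int_{-\infty}^{\infty} d\chi\, \Pi(\chi, D)\, e^{-2\pi i \sigma \chi} \int_{-\infty}^{\infty} d\chi'\, \Pi(\chi', D)\, e^{2\pi i \sigma \chi'} \int_{-\infty}^{\infty} d\sigma'\, p_{nn}^{(\mathrm{det})}(\sigma')\, e^{2\pi i \sigma' (\chi - \chi')} \\
&= \int_{-\infty}^{\infty} d\sigma'\, p_{nn}^{(\mathrm{det})}(\sigma') \int_{-\infty}^{\infty} d\chi\, \Pi(\chi, D)\, e^{-2\pi i (\sigma - \sigma') \chi} \int_{-\infty}^{\infty} d\chi'\, \Pi(\chi', D)\, e^{2\pi i (\sigma - \sigma') \chi'}.
\end{aligned} \tag{6.52a}$$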

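Consulting Eq. (2.108b) of Chapter 2, we set up the variable correspondences

$$\chi \leftrightarrow f, \quad D \leftrightarrow F, \quad (\sigma - \sigma') \leftrightarrow t$$

for the integral

$$\int_{-\infty}^{\infty} \Pi(\chi, D)\, e^{-2\pi i (\sigma - \sigma') \chi}\, d\chi$$

and the variable correspondences

$$\chi' \leftrightarrow f, \quad D \leftrightarrow F, \quad (\sigma' - \sigma) \leftrightarrow t$$

for the integral

$$\int_{-\infty}^{\infty} \Pi(\chi', D)\, e^{2\pi i (\sigma - \sigma') \chi'}\, d\chi'.$$

This gives

$$\int_{-\infty}^{\infty} \Pi(\chi, D)\, e^{-2\pi i (\sigma - \sigma') \chi}\, d\chi = 2D\, \mathrm{sinc}\!\left(2\pi(\sigma - \sigma')D\right) \tag{6.52b}$$

and

$$\int_{-\infty}^{\infty} \Pi(\chi', D)\, e^{2\pi i (\sigma - \sigma') \chi'}\, d\chi' = 2D\, \mathrm{sinc}\!\left(2\pi(\sigma' - \sigma)D\right), \tag{6.52c}$$

where, following the definition in Eq. (2.106d) of Chapter 2, we say that

$$\mathrm{sinc}(x) = \frac{\sin(x)}{x}.$$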

Substitution of (6.52b) and (6.52c) into (6.52a) leads to

( )
( ) ( )
2
(det)
(det)
( )
( ) 2 sinc 2 ( ) 2 sinc 2 ( )
D
nn
D D D D d
o
o r o o r o o o

n

E
. p
(6.52d)

Clearly, the sinc is an even function of its argument,

\[ \mathrm{sinc}(-x) = \frac{\sin(-x)}{-x} = \frac{\sin(x)}{x} = \mathrm{sinc}(x) . \]

Consequently, Eq. (6.52d) can be written as

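\[ \mathrm{E}\!\left( \left| \mathfrak{n}_D^{(\mathrm{det})}(\sigma) \right|^2 \right) = 2D \int_{-\infty}^{\infty} p_{nn}^{(\mathrm{det})}(\sigma') \left\{ 2D\,\mathrm{sinc}^2\!\left[ 2\pi(\sigma - \sigma')D \right] \right\} d\sigma' . \tag{6.52e} \]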

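We assume that the detector noise has a power spectrum $p_{nn}^{(\mathrm{det})}$ that varies slowly with wavenumber compared
to $\mathrm{sinc}(2\pi(\sigma - \sigma')D)$. This means we can, just as in Eq. (3.67b) of Chapter 3, approximate the action of

\[ 2D \left[ \mathrm{sinc}(2\pi(\sigma - \sigma')D) \right]^2 \]

inside the integral by replacing it with a delta function $\delta(\sigma - \sigma')$. Equation (6.52e) then
simplifies to

\[ \mathrm{E}\!\left( \left| \mathfrak{n}_D^{(\mathrm{det})}(\sigma) \right|^2 \right) = 2D\, p_{nn}^{(\mathrm{det})}(\sigma) , \tag{6.52f} \]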

which can be substituted into (6.51f) to get

(det)
(det)
2
ma
R
4 ( )
( )
( ) M( ) ( ) ( ) ( ) ( )
nn
a f
D
NEdN
A R

p
. (6.52g)
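The result in Eq. (6.52f), that the mean squared magnitude of the D-limited noise transform equals 2D times the double-sided power spectrum, can be checked with a small Monte Carlo experiment. The sketch below is only an illustration under stated assumptions: the detector noise is modeled as independent Gaussian samples on a uniform χ grid (so its double-sided spectrum is flat and equal to the sample variance times the grid spacing), the transform is approximated by a Riemann sum, and every numerical value is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

D = 1.0            # half-width of the window |chi| <= D (arbitrary units)
d_chi = 2.0e-3     # sample spacing in chi (assumed)
variance = 4.0     # variance of each noise sample (assumed)
sigma = 37.3       # wavenumber at which the transform is evaluated (assumed)
n_trials = 2000

chi = np.arange(-D, D, d_chi)                  # about 2D/d_chi samples inside the window
phase = np.exp(-2j * np.pi * sigma * chi)
noise = rng.normal(0.0, np.sqrt(variance), size=(n_trials, chi.size))

# Riemann-sum version of the D-limited forward transform, one value per trial
n_D = (noise * phase).sum(axis=1) * d_chi

estimate = np.mean(np.abs(n_D) ** 2)           # sample estimate of E(|n_D|^2)
p_nn = variance * d_chi                        # flat double-sided spectrum of sampled white noise
prediction = 2.0 * D * p_nn                    # right-hand side of Eq. (6.52f)
ratio = estimate / prediction
print(ratio)
```

The ratio fluctuates around one from run to run; white noise is the simplest case, and the text's argument only requires that the spectrum vary slowly compared to the sinc function.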

We note that $p_{nn}^{(\mathrm{det})}$ is a double-sided power spectrum, which means [see Eqs. (6.39e) and (6.39f)
above] it is real and even, making the absolute value signs applied to its argument superfluous.
Many times the detector noise is characterized by its power spectrum written as a function of
the frequency f (in Hz). This is called $S_{NN}^{(\mathrm{det})}(f)$ in Sec. 6.12 above, and Eq. (6.41f) can be used to
write (6.52g) as

(det)
(det)
2
ma
R
4 ( )
( )
( ) M( ) ( ) ( ) ( ) ( )
NN
a f
uDS u
NEdN
A R

(6.53a)

Again, the absolute value signs do not need to be added to the argument of the power spectrum
because it is a real and even function. This formula is often written in terms of the single-sided
power spectrum described by Eq. (3.58b) of Chapter 3, which is defined only for non-negative
values of frequency f u = . Calling this single-sided power spectrum
$S_1^{(\mathrm{det})}(f)$, we know from
Eq. (3.58b) that

\[ S_1^{(\mathrm{det})}(|f|) = 2\, S_{NN}^{(\mathrm{det})}(f) . \tag{6.53b} \]

Here, the absolute value signs are needed to show that the frequency argument must be non-
negative. Substituting this into (6.53a) gives

(det)
1
(det)
2
ma
R
2 2 ( )
( )
( ) M( ) ( ) ( ) ( ) ( )
a f
uDS u
NEdN
A R

. (6.53c)

One last form into which this formula can be put uses the D* figure of merit introduced in Eq.
(6.48a),

(det)
1
R( )
( )
( )
d
A
D
S u
= .

Substituting this into (6.53c) gives

(det)
2
ma
2 2
( )
( ) M( ) ( ) ( ) ( ) ( )
d
a f
uDA
NEdN
A R D
, (6.53d)

where $A_d$ is the optically sensitive area of the detector.

6.17 Real and Imaginary Parts of the Detector Noise
One way, perhaps the easiest way, to estimate the detector noise contaminating a spectral
measurement is to graph the imaginary component of the spectral data coming out of the
interferometer's calibration algorithm. Figures 6.5(a)–6.5(c), 6.6(a), and 6.6(b) above show the
simulated spectral measurement of an interferometer both with and without detector noise. To
show the behavior of the imaginary component of the complex data, we graph it in Fig. 6.7(b) for
the spectral measurement in Fig. 6.6(b), stretching the scale of the y axis to make it easier to see.
According to Eq. (6.38b), this is pure noise. Squaring the right-hand side of (6.38b) and taking its
expectation value gives, after substituting from Eq. (6.51e) (and using that W = +1 or −1 from the
discussion immediately preceding Eq. (4.84a) in Chapter 4),

( )
( )
2
(det)
2
ma
2
(det)
2
ma
R
R
16 Im ( )
( ) M( ) ( ) ( ) ( ) ( )
8 ( )
( ) M( ) ( ) ( ) ( ) ( )
D
a f
D
a f
WA R
A R

=

n
n
.
E
E

This is the variance of the noise in Fig. 6.7(b). Taking the square root gives, for the imaginary
component,

( )
2
(det)
ma
R
2 2 ( )
standard deviation
( ) M( ) ( ) ( ) ( ) ( )
D
a f
A R
n E
. (6.54)

The thick solid line labeled NEdN in Fig. 6.7(b) shows the size of this standard deviation. Figure
6.7(a) plots the actual spectral noise in the measured spectrum. Not only does this spectral noise
qualitatively resemble the imaginary component of the complex data in Fig. 6.7(b), but also, as
shown by the thick solid line in Fig. 6.7(a), the NEdN or standard deviation of the spectral noise
has the same value as the standard deviation of the imaginary component of the complex data.
This is no surprise; glancing back at the right-hand side of Eq. (6.51f), we note that the right-hand
side of (6.51f) has the same formula for the NEdN (or standard deviation of the spectral noise) as
appears on the right-hand side of Eq. (6.54) above.
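The claim that the imaginary component carries noise of the same size as the noise in the measured spectrum can be illustrated with a toy simulation. This is a minimal sketch and not the simulation behind Figs. 6.5 through 6.7: it assumes a real, circularly even test signal (so the noise-free spectrum is purely real), additive white Gaussian noise, and an uncalibrated discrete Fourier transform; the signal shape, noise level, and array length are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

n = 4096
k = np.arange(n)
distance = np.minimum(k, n - k)                 # circular distance from the zero-path-difference sample
clean = np.exp(-(distance / 20.0) ** 2)         # real, even test signal with a real spectrum
noisy = clean + rng.normal(0.0, 0.05, size=n)   # add real white noise

clean_spectrum = np.fft.fft(clean)
noisy_spectrum = np.fft.fft(noisy)

# Skip the two bins (0 and n/2) where the transform of real data is forced to be real.
bins = np.r_[1:n // 2, n // 2 + 1:n]
imag_std = np.std(noisy_spectrum.imag[bins])
real_error_std = np.std((noisy_spectrum - clean_spectrum).real[bins])
print(imag_std, real_error_std)
```

In this toy case the two standard deviations agree to within statistical scatter, which is why the imaginary component, where no signal is expected, can serve as a direct estimate of the noise in the real spectrum.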

6.18 Detector Noise in a Single-Sided Signal
Section 5.18 of Chapter 5 describes how to produce a single-sided measurement of the spectral
radiance L. When the interferometer signal at point C in Fig. 6.2 is free of noise, the most
important difference between single-sided and double-sided measurements from the viewpoint of
the interferometer user is the gain in spectral resolution that can be achieved without a major
redesign of the moving mirror (see the discussion at the beginning of Sec. 5.18 in Chapter 5).
When, however, the signal is contaminated by significant amounts of detector noise, the NEdN in
a single-sided measurement at a specified spectral resolution is larger than the NEdN in a double-
sided measurement at that same spectral resolution. To show why this is so, we add detector
noise to the signal and process it as a single-sided measurement while keeping track of what the
detector noise does to the spectral measurement.
Equations (5.84c) and (5.88c) in Chapter 5 introduce two functions, ( ) and ( ) , used to
process single-sided measurements. Function ( ) is, according to Eq. (5.85b), the phase angle
of the detector circuit's transfer function,
( ) arg[H( )] u = . (6.55a)

Footnote 88 of Chapter 5 explains that there are other nonideal aspects to interferometer
signals, such as the off-center sampling mentioned in Sec. 5.26 of Chapter 5, that can modify
the nonzero phase angle (although it always remains a slowly varying function of $\sigma$). From this
point on, we can include all these aspects in our analysis by regarding H as the effective
transfer function that includes not only the effects of the detector circuit but also all the other
significant causes of a nonzero phase angle. This makes h, the inverse Fourier transform of H
(see Appendix 5A of Chapter 5), an effective impulse-response function for the signal leaving
the detector. Because H is still the forward Fourier transform of a real-valued function h when H
and h are taken to be the effective transfer function and effective impulse-response function, H is
still a Hermitian function satisfying Eq. (5A.6b) in Appendix 5A. Equation (5A.5), however, may
not be satisfied by an effective impulse-response function because the effective h may not be
causal.
Equation (5.88c) defines function ( ) to be the inverse Fourier transform of
( ) i
e

multiplied by the tapering function ( ) V specified in Eq. (5.88d),

( ) 2
( ) [ ( ) ]
i i
V e e d

. (6.55b)

As pointed out in the discussion following Eq. (5.88a) of Chapter 5, we only need to know
exactly for

min max
;

outside this range, function V can be adjusted to make
( )
[ ( ) ]
i
V e

taper to zero, ensuring that
the Fourier transform in (6.55b) exists.
Both of these functions can usually be recovered from the calibration procedure applied to
the interferometer. One method, as described at the beginning of Sec. 6.11 above, is to subtract
off the background signal $z_C^{(\mathrm{cold})}$ described in Sec. 6.3 and then, being sure to repeat the signal
measurements often enough to average away the noise, to calculate both functions from the recipe
given in Sec. 5.18 of Chapter 5. Another possibility is to note that every detector signal must pass
through the same signal chain, ending up multiplied by the same effective transfer function H.
Hence both $Z_{\mathrm{eff,tot}}^{(1)}(\sigma)$ and $Z_{\mathrm{eff,tot}}^{(2)}(\sigma)$ in Eqs. (6.33a) and (6.33b) above are complex because all
their real functions of $\sigma$ are multiplied by the same complex transfer function $H(u\sigma)$, giving both
spectra the same nonzero phase angle. In this sense $Z_{\mathrm{eff,tot}}^{(1)}(\sigma)$ and $Z_{\mathrm{eff,tot}}^{(2)}(\sigma)$ are
mathematically equivalent to $Z_{\mathrm{eff}}(\sigma)$ in Eq. (5.83d) of Chapter 5, which means that we can get
the required phase data by putting either $Z_{\mathrm{eff,tot}}^{(1)}(\sigma)$ or $Z_{\mathrm{eff,tot}}^{(2)}(\sigma)$ through the same
numerical recipe that $Z_{\mathrm{eff}}(\sigma)$ is put through in Sec. 5.18.
The single-sided signal $z_{\mathrm{conv}}(\chi)$ defined in Eq. (5.89a) in Chapter 5 is calculated between
$\chi = 0$ and $\chi = 2D - 2d$ because in Sec. 5.18 of Chapter 5 we want to examine how the same
range of moving-mirror motion can be manipulated to improve spectral resolution. In this section,
however, we want to keep the spectral resolution unchanged while comparing the detector noise
in single-sided and double-sided spectral measurements. Equation (5.67) of Chapter 5 specifies
the spectral resolution of a double-sided measurement between $\chi = -D$ and $\chi = D$ to be

\[ \Delta\sigma_{\text{double sided}} = \frac{1}{2D} , \]

and Eq. (5.93b) specifies the corresponding spectral resolution of a single-sided measurement
with $z_{\mathrm{conv}}(\chi)$ known between $\chi = 0$ and $\chi = 2D - 2d$ to be

\[ \Delta\sigma_{\text{single sided}} = \frac{1}{2(2D - 2d)} . \]

For the single-sided interferometer discussed in Sec. 5.18 of Chapter 5, we expect to have

\[ d \ll D , \tag{6.56} \]

which means that $\Delta\sigma_{\text{single sided}} \cong 1/(4D) = \Delta\sigma_{\text{double sided}}/2$. Hence, to create a single-sided
measurement with the same spectral detail as a double-sided measurement, we should record
$z_{\mathrm{conv}}(\chi)$ only between $\chi = 0$ and $\chi = D$ rather than between $\chi = 0$ and $\chi = 2D - 2d \cong 2D$. This
ensures that both the single-sided and double-sided cases have the same spectral resolution.
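The bookkeeping in this paragraph can be reproduced with a few lines of arithmetic. The sketch below assumes the resolution formulas read as 1/(2D) for a double-sided record over |χ| ≤ D and 1/(2χ_max) for a single-sided record over 0 ≤ χ ≤ χ_max (so that χ_max = 2D − 2d gives the form quoted from Eq. (5.93b)); the function names and the numerical values of D and d are hypothetical.

```python
def double_sided_resolution(D):
    """Spectral resolution for a double-sided record spanning -D <= chi <= D."""
    return 1.0 / (2.0 * D)

def single_sided_resolution(chi_max):
    """Spectral resolution for a single-sided record spanning 0 <= chi <= chi_max."""
    return 1.0 / (2.0 * chi_max)

D, d = 1.0, 0.01   # hypothetical values in cm, chosen so that d << D
full_range = single_sided_resolution(2.0 * D - 2.0 * d)   # record out to 2D - 2d
matched = single_sided_resolution(D)                      # record truncated at chi = D
print(double_sided_resolution(D), full_range, matched)
```

With these numbers the full single-sided range resolves about 0.25 cm⁻¹, roughly half the double-sided value of 0.5 cm⁻¹, while truncating the record at χ = D brings the single-sided resolution back to the double-sided value.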
To construct the z
conv
signal between 0 and D, we convolve ( ) with the signal component
created by the L() input radiance at point C in Fig. 6.2, as shown by Eq. (5.89a) in Chapter 5.
Nothing stops us, however, from convolving the total signal at point C with while planning to
discard the unwanted background components later on. Because we want to keep track of the
noise, should be convolved with the total noise-contaminated signal
( )
( )
tot
CN
z specified in Eq.
(6.22a) above. We get, remembering that the convolution is a linear operation [see Eqs. (2.38b)
and (2.38d) in Chapter 2],

( ) ( ) (det)
1
( ) ( ) ( ) ( ) ( ) ( ) ( ) ( )
tot cold
CN C C
z z z n h
u u

= + +

.

The associative property of the convolution [see Eq. (2.38c) in Chapter 2] gives

(det) (det) (det)
( ) ( ) ( ) ( ) ( ) n h n h n h
u u u

/ = =

,

where we define
( ) h h
u u

/ =

. (6.57a)

Now the noise-contaminated signal can be written as

( ) ( ) (det)
1
( ) ( ) ( ) ( ) ( ) ( ) ( )
tot cold
CN C C
z z z n h
u u

/ = + +

, (6.57b)

and to get the total noise-free signal, we just set
(det)
( ) n to zero:

( ) ( )
( ) ( ) ( ) ( ) ( ) ( )
tot cold
C C C
z z z = + . (6.57c)

To analyze ( ) ( )
C
z , the first term in Eq. (6.57c), we apply the Fourier convolution
theorem to its forward Fourier transform [see Eq. (2.39a) in Chapter 2],

( ) ( ) ( )
( ) ( ) ( )
( ) ( ) ( ) ( )
i i i
C C
z z

= F F F . (6.58a)

The Fourier transforms in Eqs. (6.55b) and (6.5d) can be reversed to get

( )
( ) 2 ( )
( ) ( ) ( )
i i i
V e e d

= =
F (6.58b)
and

( )
ma
2 ( )
R H( ) M( ) ( ) ( ) ( ) ( ) ( )
4
( ) ( )
f a FOV
i i
C C
WA
u R
z e d z

= =
L
F .
(6.58c)

Substituting (6.58b) and (6.58c) into (6.58a) gives

( )
( )
( )
ma
R
( ) ( )
( ) H( ) M( ) ( ) ( ) ( ) ( ) ( )
4
i
C
i
f a FOV
z
WA
V e u R

=

L
F
.
(6.58d)

According to Eq. (6.55a) and the discussion following it, () is the argument or complex phase
angle of the effective transfer function H( ) u , so

( )
H( ) H( )
i
u e u

=
or

( )
H( ) H( )
i
e u u

= . (6.58e)

We also note that [see Eq. (5.88d) in Chapter 5] the tapering function V() equals one for those
values where
min max
0 < . These are also, according to the discussion following Eq.
(6.38c) above, the values where the product

R( ) ( ) ( )
a f

in Eq. (6.58d) is not zero. So either V() is multiplied by zero on the right-hand side of (6.58d),
which means that its value does not matter, or else has a value for which V() is one. Hence Eq.
(6.58d) can be written as

( )
( )
( )
ma
R
( ) ( )
H( ) M( ) ( ) ( ) ( ) ( ) ( ) ,
4
i
C
i
f a FOV
z
WA
e u R

=

L
F

which becomes, substituting from (6.58e),

( )
( )
ma
R
( ) ( )
H( ) M( ) ( ) ( ) ( ) ( ) ( )
4
i
C
f a FOV
z
WA
u R

=

L
F
.
(6.58f)

According to Eq. (5A.6b) in Appendix 5A of Chapter 5, H is a Hermitian function, which makes
its magnitude even:
H( ) H( ) H( ) u u u
= = . (6.58g)

Equation (5.10f) in Chapter 5 and (4.139g) in Chapter 4 show that M and are also even
functions, and clearly the product

R( ) ( ) ( ) ( )
f a FOV
L

is even because all the functions depend on . Hence the entire right-hand side of Eq. (6.58f) is
a real and even function of . Reversing the Fourier transform in (6.58f) to get

( )
ma
R
( ) ( )
H( ) M( ) ( ) ( ) ( ) ( ) ( )
4
C
i
f a FOV
z
WA
u R

=

L F ,
(6.58h)
we conclude that the convolution ( ) ( )
C
z is another real and even function because it is the
inverse Fourier transform of a real and even function (see entry 1 in Table 2.1 of Chapter 2):

( ) ( ) ( ) ( )
C C
z z = . (6.58i)

To analyze the second term on the right-hand side of Eq. (6.57c),

( )
( ) ( )
cold
C
z ,

we take its forward Fourier transform to get, again using Eq. (2.39a) in Chapter 2,

( ) ( ) ( )
( ) ( ) ( ) ( ) ( )
( ) ( ) ( ) ( )
i cold i cold i
C C
z z

= F F F . (6.59a)

This can be written as, substituting from Eqs. (6.58b) and (6.11a),

( )
( ) ( )
( ) ( ) (back)
ma
R
( ) ( )
( ) H( ) M( ) ( ) ( ) ( )[ ( ) ( )]
4
i cold
C
i fore
a FOV FOV
z
WA
V e u R
=

L L
F
.
(6.59b)

Comparing this to Eq. (6.58d), we note that if
f
is replaced by one, and if L
FOV
is replaced by
( ) (back)
[ ]
fore
FOV FOV
L L , then the right-hand side of (6.58d) becomes the same as the right-hand side of
(6.59b)that is,

( )
( ) ( )
( ) ( )
i cold
C
z
F

becomes mathematically equivalent to

( )
( )
( ) ( )
i
C
z
o
c
F .

No special assumption was made about the nature of L
FOV
when analyzing the formula for

( )
( )
( ) ( )
i
C
z
o
c
F ,

and only one assumption was made about
f
t : that the tapering function V() equals one for those
values where the product
R( ) ( ) ( )
a f
o t o t o

is not zero [see discussion following Eq. (6.58e) above]. Nothing stops us from tightening this
assumption slightly by requiring that the tapering function equals one when the product
R( ) ( )
a
o t o is not equal to zero; this prevents
f
t from having any effect on our previous
analysis of
( )
( ( ) ( ))
i
C
z
o
c
F . Hence both
f
t and L
FOV
turn into placeholder functions when
deriving Eqs. (6.58h) and (6.58i) from (6.58d), which means that (6.58h) and (6.58i) still hold
true when
f
t is set equal to one and L
FOV
is replaced by
( ) (back)
[ ]
fore
FOV FOV
L L . Consequently, we can
now apply Eqs. (6.58h) and (6.58i) to Eq. (6.59b) to get, setting
f
t equal to one and replacing
L
FOV
by
( ) (back)
[ ]
fore
FOV FOV
L L ,

( )
( ) ( )
( ) (back)
ma
R
( ) ( )
H( ) M( ) ( ) ( ) ( )[ ( ) ( )]
4
i cold
C
fore
a FOV FOV
z
WA
u R
o
c
o o q o o t o o o
AO

L L
F

(6.59c)
and

( ) ( )
( ) ( ) ( ) ( )
cold cold
C C
z z c c . (6.59d)

Equations (6.58i) and (6.59d) show that both terms on the right-hand side of Eq. (6.57c) are
even functions of , which means that their sum

( ) ( )
( ) ( ) ( ) ( ) ( ) ( )
tot cold
C C C
z z z c c c +

must also be an even function of . This means we can take the
( )
( ) ( )
tot
C
z c data collected
from 0 to D and use it to create an artificial signal between 0 and D . We call
the artificially doubled, noise-free signal

( )
( )
Even[ ( ) ( )]
( , )[ ( ) ( )] ( , )[ ( ) ( )]
tot
C
cold
C C
z
D z D z
c
c c
H + H .
(6.60a)

We define the Even operator by stating that

( )
Even[ ( )] ( , ) z D z H (6.60b)

for any function ( ) z . This forces
( )
Even[ ( ) ( )]
tot
C
z c to have the same values at
0
, for
0
0 D s s , as
( )
( ) ( )
tot
C
z c has at
0
. The ( , ) D H function has the same meaning as in
Eq. (6.22c) above, reminding us, since it equals one for D s and equals zero otherwise, that
no data exists for D > . We note that, although absolute value signs are applied to on the
right-hand side of (6.60b), they are not needed in (6.60a) because the right-hand side is already
an even function of . These formulas seem straightforward enough, but we should note that the
Even operator has an interesting effect on a noise-contaminated signal such as the one in Eq.
(6.57b): the noise contaminating the signal at positive automatically becomes the same as the
noise contaminating the signal at negative . Another way of putting this is that, for any
0
0 D s s , the signal at
0
is always in error from the presence of random detector noise
by exactly the same amount as the signal at
0
. To show what the Even operator does to the
noise-contaminated signal in (6.57b), we need the Heaviside step function [which has already
been defined in Eq. (2.70a) of Chapter 2],

1 for 0
( ) 1 2 for 0
0 for 0
>
<

. (6.60c)

Applying the Even operator to both sides of (6.57b) now gives

[ ]
( )
( )
1 (det) (det)
Even[ ( ) ( )]
( , ) ( ) ( ) ( , ) ( ) ( )
( , ) ( ) ( ) ( ) ( )
tot
CN
cold
C C
z
D z D z
u D n h n h
u u

= +

/ / + +

.
(6.60d)

To show that the noise term is handled correctly in (6.60d), we note that when is positive, the
first term inside the braces { } specifies the noise because the second term is zero; and when is
negative, the second term specifies the noise to be the same as it is for 0 because then the
first term is zero. This ensures that the random noise inside the braces automatically has the same
value at + and .
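The effect of the Even operator on noisy data can be seen in a short numerical sketch. It assumes the single-sided data are available as samples on a uniform grid from χ = 0 to χ = D; the helper name `even_extension`, the test signal, and the noise level are all hypothetical choices made for illustration.

```python
import numpy as np

def even_extension(samples_nonnegative):
    """Mirror samples given at chi = 0, d_chi, ..., D onto the grid chi = -D, ..., D."""
    samples = np.asarray(samples_nonnegative, dtype=float)
    return np.concatenate((samples[:0:-1], samples))

rng = np.random.default_rng(2)
chi = np.linspace(0.0, 1.0, 101)                    # hypothetical grid from 0 to D = 1
clean = np.cos(2.0 * np.pi * 3.0 * chi) * np.exp(-chi)
noisy = clean + rng.normal(0.0, 0.1, size=chi.size)

doubled = even_extension(noisy)                     # 201 samples from -D to +D
noise_doubled = doubled - even_extension(clean)     # noise carried by the doubled signal
center = chi.size - 1                               # index of chi = 0 in the doubled array
print(noise_doubled[center + 10], noise_doubled[center - 10])
```

The two printed values are identical: the mirrored half of the record is not an independent measurement, so the noise at a negative path difference is an exact copy of the noise at the matching positive path difference.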

6.19 Uncalibrated Spectra of Single-Sided Signals with Detector Noise
To get the uncalibrated signal spectrum of the artificially even, noise-contaminated signal in Eq.
(6.60d), we apply the forward Fourier transform to both sides of the equation, using the linearity
of the transform as described in Sec. 2.6 of Chapter 2 to write

( )
[ ] ( ) ( )
( ) ( )
( ) ( ) ( )
1 ( ) (det)
1 ( )
Even[ ( ) ( )]
( , ) ( ) ( ) ( , ) ( ) ( )
( , ) ( ) ( )
( , ) ( )
i tot
CN
i i cold
C C
i
i
z
D z D z
u D n h
u
u D
= +

/ +

+

F
F F
F
F
(det)
( ) n h
u

/

.
(6.61)

The first two terms on the right-hand side are easier to evaluate than the last two, so we start with
the first two and leave the more difficult work for later.
Using the Fourier convolution theorem once on the first term [see Eq. (2.39j) of Chapter 2]
gives


[ ] ( )
( )
[ ]
( )
( ) ( )
ma
R
( , ) ( ) ( )
( ( , )) ( ) ( )
2 sinc(2 ) H( ) M( ) ( ) ( ) ( ) ( ) ( ) ,
4
i
C
i i
C
f a FOV
D z
D z
WA
D D u R

=

=

L
F
F F

(6.62a)

where in the last step we substitute from Eqs. (6.24b) and (6.58f) to evaluate the convolved
Fourier transforms. According to the discussion following Eq. (5.82c) in Chapter 5, everything
inside the braces { } is a slowly varying function of compared to L
FOV
; and Sec. 5.15 of Chapter
5 explains why sinc(2 ) D must, in a well-designed interferometer, be a narrow function
varying no less rapidly than the major features of L
FOV
. Hence everything inside the braces
(except L
FOV
) must be slowly varying with compared to the narrow function sinc(2 ) D .
Therefore, according to Eq. (5C.1) in Appendix 5C of Chapter 5, the convolution in (6.62a)
primarily affects L
FOV
, giving us

[ ] ( )
( )
ma
R
( , ) ( ) ( )
H( ) M( ) ( ) ( ) ( ) ( ) ( ) ,
4
i
C
f a mnf
D z
WA
u R
L
F

(6.62b)

where [see Eqs. (6.23a) above and (5.108f) in Chapter 5]

( ) [2 sinc(2 )] ( )
mnf FOV
D D = L L . (6.62c)

The second term on the right-hand side of (6.61) is handled the same way as the first. Again
using Eq. (2.39j) in Chapter 2, we write

( )
( ) ( )
[ ]
( ) ( )
( ) ( ) ( )
( ) (back)
ma
R
( , )[ ( ) ( )]
( , ) ( ) ( )
2 sinc(2 )
H( ) M( ) ( ) ( ) ( )[ ( ) ( )]
4
i cold
C
i i cold
C
fore
a FOV FOV
D z
D z
D D
WA
u R

=
=

L L
F
F F

(6.63a)

with Eqs. (6.24b) and (6.59c) used to evaluate the convolved Fourier transforms. Only

( ) (back)
[ ( ) ( )]
fore
FOV FOV
L L

inside the braces { } is not a slowly varying function of compared to sinc(2 ) D ro , so again Eq.
(5C.1) in Appendix 5C can be used to write

( )
( ) ( )
( ) (back)
ma
R
( , )[ ( ) ( )]
H( ) M( ) ( ) ( ) ( )[ ( ) ( )] ,
4
i cold
C
fore
a mnf mnf
D z
WA
u R
o
c
o o q o o t o o o
H
AO
e L L
F

(6.63b)

where [see Eqs. (6.25b), (6.25c) and (6.25f), (6.25g) above]

( ) ( )
( ) [2 sinc(2 )] ( )
fore fore
mnf FOV
D D o ro o L L (6.63c)
and

(back) (back)
( ) [2 sinc(2 )] ( )
mnf FOV
D D o ro o L L . (6.63d)

Now we are ready to analyze the last two terms in Eq. (6.61). Evaluation of the forward
Fourier transforms of ( / ) h u / , [ ( , ) ( )] D H
E
, and [ ( , ) ( )] D H
E
comes first.
Taking the forward Fourier transform of ( / ) h u / defined in Eq. (6.57a) gives, applying the
Fourier convolution theorem [Eq. (2.39a) in Chapter 2],

( )
( ) ( ) ( )
( )
i i i
h h
u u
o o o

c

/

F F F .

This can be written as, substituting from Eqs. (6.27b) and (5.88b) in Chapter 5,

( ) ( )
H( ) ( ) ( ) H( )
i i
h u u V e uV u
u
o o
o o o o

/

F , (6.64a)

where in the last step we use

( )
H( ) H( )
i
e u u
o
o o

from (6.58e) to simplify the formula. According to Eq. (6.58g), the magnitude of the effective
transfer function H( ) uo is even with respect to , and of course it must also be real. Function
V() is real and, according to Eq. (5.88e) in Chapter 5, it is also even. Hence, (6.64a) reveals that
the forward Fourier transform of ( / ) h u / is real and even. Entry 1 of Table 2.1 in Chapter 2 now
shows that h/ itself must be real and even:

h h
u u

/ /

(6.64b)
and
Im 0 h
u

/ =

. (6.64c)

For future use, we note that ( / ) h u / , just like h(t), is a relatively narrow function of its argument.
To see why this is so, we consult Eq. (6.21a) and note that there exists a time T such that

\[ h(t) \cong 0 \quad \text{for } |t| > T , \]

which means that

\[ h\!\left( \frac{\chi}{u} \right) \cong 0 \quad \text{for } |\chi| > uT . \tag{6.65a} \]

Function ( ) is also a relatively narrow function of with [see Eq. (5.88h) in Chapter 5]

( ) 0 for d > . (6.65b)

Function ( / ) h u / is, according to Eq. (6.57a), the convolution of ( / ) h u and ( ) and so can
be written as [see the definition of the convolution in Eq. (2.38a) of Chapter 2]

( ) h h d
u u

/ =

.

The approximation in (6.65a) gives

( )
u
u
h h d
u u

T
T
. (6.65c)

The approximation in (6.65b) reveals that

( ) h
u

can only make a significant contribution to the integral in (6.65c) when

d < ,
because, when this is not true, (6.65b) forces to be small. But the limits on the integral confine
to values between +uT and uT, so when

d u > + T ,
it is impossible for
( ) h
u

to make a significant contribution to the integral for any of the allowed values of .
Consequently,
( )
u
u
h h d
u u

T
T

must be negligible when d u > + T . Therefore ( / ) h u / is a relatively narrow function, because
we can write
0 for h d u
u

/ > +

T . (6.65d)

This demonstration that h/ is a narrow function relies only on its being the convolution of two
other narrow functions. In general, the convolution of two narrow functions produces another
narrow function whose width can be no wider than (approximately) the sum of the widths of the
functions being convolved.
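The width rule stated above is easy to confirm numerically. The following sketch convolves two box functions on a common grid and measures the extent of the result; the grid spacing and box widths are hypothetical, and the discrete sum is only a stand-in for the convolution integral.

```python
import numpy as np

d_chi = 0.01
narrow_a = np.ones(30)    # box of width 30 * d_chi
narrow_b = np.ones(50)    # box of width 50 * d_chi

# Discrete approximation to the convolution integral of the two boxes
convolved = np.convolve(narrow_a, narrow_b) * d_chi
support = np.count_nonzero(convolved > 1e-12)
print(support * d_chi, (narrow_a.size + narrow_b.size) * d_chi)
```

The nonzero extent of the convolution (79 samples) is just under the sum of the two box widths (80 samples), as the general statement predicts.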
The forward Fourier transform of [ ( , ) ( )] D
is, according to Eq. (6.22b) and (6.60c),

( )
( ) 2 2
0
( )
( , ) ( ) ,
2 2
,
2 2
D
i i i
i
D D
D e d e d
D D

= =

=

.
F
F

This becomes, using Eqs. (2.36b) and (2.108b) in Chapter 2,

( ) [ ]
( )
( , ) ( ) sinc( )
i i D
D e D D

= . F (6.66a)

The same analysis of the forward Fourier transform of [ ( , ) ( )] D
gives

( )
0
( ) 2 2
( , ) ( ) ,
2 2
i i i
D
D D
D e d e d

= = +

F
or
( ) [ ]
( )
( , ) ( ) sinc( )
i i D
D e D D

= . F (6.66b)

Having evaluated the formulas for the Fourier transforms of ( / ) h u / , [ ( , ) ( )] D
, and
[ ( , ) ( )] D
, we are ready to evaluate the third and fourth terms on the right-hand side of
Eq. (6.61).
We begin evaluation of the third term by using the definition of the convolution [see Eq.
(2.38a) in Chapter 2] to write

(det) (det)
( )
(det)
( )
( ) ( )
( )
d u
d u
n h n h d
u u
n h d
u
+ +
+

/ / =

/

.
T
T
(6.67a)

The approximation in the last step comes from noting that the product

(det)
( ) n h
u

/

is negligible when ( ) lies outside the range of values between (d+uT) and (d+uT) for
which h/ is significantly different from zero [see (6.65d)]. Multiplying both sides of (6.67a) by
( , ) D gives

( )
(det) (det)
( )
( , ) ( ) ( , ) ( )
d u
d u
D n h D n h d
u u

+ +
+

/ /

T
T
.

The new ( , ) D factor reduces this equation to 0 0 = when D > . Remembering that h/ is
negligible whenever ( ) lies outside the range of values between (d+uT) and (d+uT), we
extend the limits of the integral on the right-hand side to get the new approximation

( )
(det) (det)
( )
( , ) ( ) ( , ) ( )
D d u
D d u
D n h D n h d
u u

+ +
+

/ /

T
T
.

Here we rely on the extra regions of integration going from ( ) D d u = + T to
( ) d u = + T and from ( ) d u = + + T to ( ) D d u = + + T to contribute only a
negligible amount to the integral. This approximation can also be written as

(det) (det)
( , ) ( ) ( , ) ( , ) ( ) D n h D n h d
u u

/ /

D ,
where
D d u = + + D T . (6.67b)

Equation (2.38a) in Chapter 2 can be used to recognize the integral as a convolution,

(det) (det)
( , ) ( ) ( , ) ( , ) ( ) D n h D n h
u u

/ /

D . (6.67c)

This becomes, multiplying through by the Heaviside step function ( )
,

(det) (det)
( ) ( , ) ( ) ( ) ( , ) ( , ) ( ) D n h D n h
u u

/ /

D .

Taking the forward Fourier transform of both sides gives, using Eqs. (2.39j) and (2.39a) in
Chapter 2,

( ) ( )
( ) (det)
( ) ( ) (det) ( )
( ) ( , ) ( )
( ) ( , ) ( , ) ( )
i
i i i
D n h
u
D n h
u

/

/

.
F
F F D F

This can be written as, substituting from Eqs. (6.64a) and (6.66a),

( )
( ) (det)
( ) (det)
( ) ( , ) ( )
sinc( ) ( ) ( ) ( , ) ( )
i
i D i
D n h
u
De D uV H u n
.
F
F D

We note, due to the size of the D product, that sinc( )
i D
e D

is about as narrow and rapidly

varying a function of as sinc( ) D . Hence, glancing back at the discussion following Eq.
(6.62a) above, we see that V() and ( ) H u vary slowly with compared to sinc( )
i D
e D

,
which means, according to Eq. (5C.1) in Appendix 5C of Chapter 5, that V() and ( ) H u can
be moved outside the convolution:

( )
{ }
( ) (det)
( ) (det)
( ) ( , ) ( )
( ) ( ) sinc( ) ( , ) ( )
i
i D i
D n h
u
uV H u De D n
.
F
F D

Remembering that
( )
( )
sinc( ) ( , ) ( )
i D i
De D D

= F

from Eq. (6.66a), we apply Eq. (2.39j) in Chapter 2 to get

( )
( ) (det)
( ) (det)
( ) ( , ) ( )
( ) ( ) ( ) ( , ) ( , ) ( )
i
i
D n h
u
uV H u D n
.
F
F D

From Eq. (6.67b), we know D > D , which means that [see Eq. (6.22b)]

( , ) ( , ) ( , ) D D = D .
Therefore

( )
( ) (det)
( ) (det)
( ) ( , ) ( )
( ) ( ) ( ) ( , ) ( )
i
i
D n h
u
uV H u D n
.
F
F
(6.67d)

This takes care of the third term on the right-hand side of Eq. (6.61). At no point during this
derivation did we make any assumptions about the behavior of $n^{(\mathrm{det})}(\chi)$; it acts as a placeholder
and could be replaced by other functions, both random and nonrandom, without making any
part of the analysis untrue.
It is now time to simplify the fourth and last term in Eq. (6.61). We have just remarked that, in
the analysis of the third term in (6.61), $n^{(\mathrm{det})}(\chi)$ acts as a placeholder and can be replaced by any
other reasonable choice. It turns out that we are not so much interested in modifying the final
result in Eq. (6.67d) as we are in modifying the approximation in (6.67c) that appears partway
through the derivation of (6.67d). Replacing the $n^{(\mathrm{det})}(\chi)$ placeholder in (6.67c) by $n^{(\mathrm{det})}(-\chi)$
gives

(det) (det)
( , ) ( ) ( , ) ( , ) ( ) D n h D n h
u u

/ / H e H H

D . (6.68a)

This can be written as, using (6.64b) to modify the left-hand side,

(det) (det)
( , ) ( ) ( , ) ( , ) ( ) D n h D n h
u u

/ / H e H H

D . (6.68b)

Multiplying through by ( ) E and taking the forward Fourier transform of both sides leads to

( ) (det)
( ) (det)
( ) ( , ) ( )
( ) ( , ) ( , ) ( )
i
i
D n h
u
D n h
u
o
o

/ H E

/ e H H E

.
F
F D
(6.68c)

The left-hand side of this formula is (after dividing by u) the Fourier transform of the fourth term
in (6.61) that we need to evaluate. We apply Eqs. (2.39a) and (2.39j) in Chapter 2 to the right-
hand side to get

( ) ( )
( ) (det)
( ) ( ) ( ) (det)
( ) ( , ) ( )
( ) ( , ) ( , ) ( )
i
i i i
D n h
u
D h n
u
o
o o o

/ H E

/ e H H E

.
F
F F F D

Substituting from Eqs. (6.66b) and (6.64a) gives

( )
( ) (det)
( ) (det)
( ) ( , ) ( )
sinc( ) ( ) H( ) ( , ) ( )
i
i D i
D n h
u
De D uV u n
o
r o o

ro o o

/ H E

e H

.
F
F D
(6.68d)

Again, just like in the analysis of the third term of (6.61), Eq. (5C.1) in Appendix 5C of Chapter
5 is used to move V and H outside the convolution because they vary slowly compared to
[ sinc( )]
i D
e D

. Equation (6.68d) then becomes

( ) { }
( ) (det)
( ) (det)
( ) ( , ) ( )
( ) H( ) sinc( ) ( , ) ( )
i
i D i
D n h
u
uV u De D n
.
F
F D

Glancing back at (6.66b) to get

[ ] ( )
( )
sinc( ) ( , ) ( )
i D i
e D D D

= F ,

we use (2.39j) in Chapter 2 to write

( ) { }
( ) (det)
( ) (det)
( ) ( , ) ( )
( ) H( ) ( ) ( , ) ( , ) ( )
i
i
D n h
u
uV u D n
.
F
F D

Again we note [see Eq. (6.67b)] that D > D , making ( , ) ( , ) ( , ) D D = D . Hence,

( ) { }
( ) (det)
( ) (det)
( ) ( , ) ( )
( ) H( ) ( ) ( , ) ( )
i
i
D n h
u
uV u D n
.
F
F
(6.68e)

This takes care of the fourth term on the right-hand side of Eq. (6.61).
Before substituting our results back into Eq. (6.61), it makes sense to use the linearity of the
Fourier transform (see Sec. 2.6 in Chapter 2) to combine the equation's third and fourth terms.
Multiplying by $u^{-1}$ and adding together (6.67d) and (6.68e) gives

( ) {
1 ( ) (det)
1 ( ) (det)
( ) (det)
( ) ( , ) ( )
( ) ( , ) ( )
( ) ( ) ( ) ( , ) ( )
- i
- i
i
u D n h
u
u D n h
u
V H u D n

/

/ +

F
F
F
( )}
( )
( ) (det)
( ) (det) (det)
( ) ( , ) ( )
( ) ( ) ( , ) ( ) ( ) ( ) ( )
i
i
D n
V H u D n n
+
= +

.
F
F
(6.69a)

Now we can substitute into Eq. (6.61) the approximations shown in Eqs. (6.62b), (6.63b), and
(6.69a) to get

( ) ( )
ma
( ) (back)
ma
( ) (det)
R
R
(Even[ ( ) ( )])
H( ) M( ) ( ) ( ) ( ) ( ) ( )
4
H( ) M( ) ( ) ( ) ( )[ ( ) ( )]
4
( ) ( ) ( , ) ( ) ( )
i tot
CN
f a mnf
fore
a mnf mnf
i
z
WA
u R
WA
u R
V H u D n
+
+
L
L L
F
F

( )
(det)
( ) ( ) n +

or

( ) ( )
ma
( ) (back)
R
(Even[ ( ) ( )])
H( ) M( ) ( ) ( ) ( ) ( ) ( )
4
( ) ( )
( ) (
i tot
CN
a f mnf
fore
mnf mnf
z
WA
u R
V H u
+
L
L L
F

( )
( ) (det) (det)
) ( , ) ( ) ( ) ( ) ( )
i
D n n
+

F .
(6.69b)

The next section explains how to analyze the noise term in this formula.
6.20 Calibrated Spectra of Single-Sided Signals with Detector Noise
To analyze the detector noise contaminating a single-sided signal, we define a new random
function,

(det) (det) (det)
( ) ( ) ( ) ( ) ( )
E
n n n = +
. (6.70a)

The Heaviside step function ( )
from Eq. (6.60c) ensures that
(det)
E
n always has the same value
at as it does at + : when = is positive, the first term specifies the value of
(det)
E
n
because the second term is zero; and when = is negative, the second term specifies the
value of
(det)
E
n to be the same as it is for = because the first term is zero. This means random
function
(det)
E
n is always even,

\[ n_E^{(\mathrm{det})}(-\chi) = n_E^{(\mathrm{det})}(\chi) , \tag{6.70b} \]

and, because it represents noise contaminating a real signal, it must also be real:

\[ \mathrm{Im}\!\left( n_E^{(\mathrm{det})}(\chi) \right) = 0 . \tag{6.70c} \]

Following the same pattern as in the previous $\chi$-based noise terms [see Eq. (6.29a)], we define the
D-limited forward Fourier transform of $n_E^{(\mathrm{det})}$ to be

( )
(det) ( ) (det) (det) 2
( ) ( , ) ( ) ( , ) ( )
i i
DE E E
D n D n e d

= =
n F (6.70d)
or

\[ \mathfrak{n}_{DE}^{(\mathrm{det})}(\sigma) = \int_{-D}^{D} n_E^{(\mathrm{det})}(\chi)\, e^{-2\pi i \sigma \chi}\, d\chi . \tag{6.70e} \]

This can also be written as, substituting from (6.70a),

( )
(det) ( ) (det) (det)
( ) ( , )[ ( ) ( ) ( ) ( )]
i
DE
D n n
= + n F . (6.70f)

We have just seen that function $n_E^{(\mathrm{det})}$ is real and even. Function $\Pi(\chi, D)$ is also real and even
[see Eq. (6.22b) above], so $\mathfrak{n}_{DE}^{(\mathrm{det})}(\sigma)$ in (6.70d) and (6.70e) is the forward Fourier transform of a
real and even function. This makes $\mathfrak{n}_{DE}^{(\mathrm{det})}(\sigma)$ another real and even function (see entry 1 of Table
2.1 in Chapter 2):
Calibrated Spectra of Single Sided Signals with Detector Noise 6.20
- 841 -

(det) (det)
( ) ( )
DE DE
= n n (6.70g)
and

( )
(det)
Im ( ) 0
DE
= n . (6.70h)
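The statement that a real and even function has a real and even forward transform can be checked numerically with a discrete Fourier transform. This is a hedged sketch, not the book's procedure: the sample count, the random seed, and the use of `numpy.fft` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
M = 8
n_pos = rng.standard_normal(M + 1)             # noise samples at chi = 0 .. M*d
n_E = np.concatenate([n_pos[:0:-1], n_pos])    # even extension over chi = -M*d .. M*d

# ifftshift moves the chi = 0 sample to index 0, the origin the FFT assumes.
spectrum = np.fft.fft(np.fft.ifftshift(n_E))

# Largest imaginary part: zero up to rounding error, as in Eq. (6.70h).
max_imag = np.max(np.abs(spectrum.imag))

# Evenness in wavenumber, as in Eq. (6.70g): bin j equals bin N - j.
max_asym = np.max(np.abs(spectrum[1:] - spectrum[1:][::-1]))
```

Both `max_imag` and `max_asym` come out at the level of floating-point rounding, so the discrete spectrum of the even-extended noise is real and even.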

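The expectation value of $n_E^{(\mathrm{det})}(\chi)$ is, applying the expectation operator E to both sides of (6.70a),

$$\mathrm{E}\big( n_E^{(\mathrm{det})}(\chi) \big) = \Theta(\chi)\, \mathrm{E}\big( n^{(\mathrm{det})}(\chi) \big) + \Theta(-\chi)\, \mathrm{E}\big( n^{(\mathrm{det})}(-\chi) \big) ,$$

using the linearity of the expectation operator with respect to random quantities discussed in Sec. 3.10 of Chapter 3. Since

$$\mathrm{E}\big( n^{(\mathrm{det})}(\chi) \big) = 0$$

for any value of $\chi$ [see Eq. (6.17b)], we can now see that

$$\mathrm{E}\big( n_E^{(\mathrm{det})}(\chi) \big) = 0 . \qquad (6.70i)$$

Applying the expectation operator E to both sides of Eq. (6.70e) gives, using Eq. (3.17c) in Chapter 3,

$$\mathrm{E}\big( \mathbf{n}_{DE}^{(\mathrm{det})}(\sigma) \big) = \int_{-D}^{D} \mathrm{E}\big( n_E^{(\mathrm{det})}(\chi) \big)\, e^{-2\pi i \sigma \chi}\, d\chi .$$

Since we now know that $\mathrm{E}\big( n_E^{(\mathrm{det})}(\chi) \big)$ is zero, this shows that

$$\mathrm{E}\big( \mathbf{n}_{DE}^{(\mathrm{det})}(\sigma) \big) = 0 . \qquad (6.70j)$$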

The detector-noise term in Eq. (6.69b) can be simplified by substituting from Eq. (6.70f):

( ) ( )
ma
( ) (back)
R
(Even[ ( ) ( )])
H( ) M( ) ( ) ( ) ( ) ( ) ( )
4
( ) ( )
H( ) ( )
i tot
CN
a f mnf
fore
mnf mnf
D
z
WA
u R
u V
+
L
L L
n
F

(det)
( )
E

.
(6.71a)

In a single-sided measurement, we can think of

( )
( ) ( )
Even[ ( ) ( )]
i tot
CN
z
F

as the uncalibrated, noise-contaminated signal spectrum at point C in Fig. 6.2, because it plays
the same role that

( )
( ) ( )
( , ) ( )
i tot
CN
D z
F

does in the double-sided signal spectrum specified in Eq. (6.30a) above. Comparing the formulas
for

( )
( ) ( )
Even[ ( ) ( )]
i tot
CN
z
F and
( )
( ) ( )
( , ) ( )
i tot
CN
D z
F

in Eqs. (6.71a) and (6.30a), we see that there is an exact correspondence if H(u) in (6.30a) is
matched with H( ) u in (6.71a) and if
(det)
( )
D
n in (6.30a) is matched with
(det)
[ ( ) ( )]
DE
V n in
(6.71a):
H( ) H( ) u u (6.71b)
and

(det) (det)
( ) [ ( ) ( )]
D DE
V n n .

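We also note that the expectation value of the spectral noise $[V(\sigma)\, \mathbf{n}_{DE}^{(\mathrm{det})}(\sigma)]$ in (6.71a) is zero, just as the expectation value of the spectral noise $\mathbf{n}_{D}^{(\mathrm{det})}(\sigma)$ in (6.30a) is zero. To see why this is so, we just apply the expectation operator E to $[V(\sigma)\, \mathbf{n}_{DE}^{(\mathrm{det})}(\sigma)]$ and consult Eq. (6.70j) to get

$$\mathrm{E}\big( V(\sigma)\, \mathbf{n}_{DE}^{(\mathrm{det})}(\sigma) \big) = V(\sigma)\, \mathrm{E}\big( \mathbf{n}_{DE}^{(\mathrm{det})}(\sigma) \big) = 0 . \qquad (6.71c)$$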

Knowing that the spectral noise in (6.70a) has a zero expectation value, we can repeat the
mathematical analysis used in Sec. 6.11 to extract the L
mnf
data from the uncalibrated spectrum
in (6.30a), only this time replacing H(u) by H( ) u and
(det)
( )
D
n by
(det)
[ ( ) ( )]
DE
V n as
specified in (6.71b). The formulas for
(1)
,
( )
eff tot
Z and
(2)
,
( )
eff tot
Z in Eqs. (6.33a) and (6.33b) now
become

(1)
, ma
(1) ( ) (back)
R ( ) H( ) M( ) ( ) ( ) ( )
4
( ) ( ) ( ) ( )
eff tot a
fore
f mnf mnf
WA
u R

+

Z
L L L
(6.71d)
and

(2)
, ma
(2) ( ) (back)
R ( ) H( ) M( ) ( ) ( ) ( )
4
( ) ( ) ( ) ( )
eff tot a
fore
f mnf mnf
WA
u R

+

Z
L L L .
(6.71e)

The formula for
( )
,
( )
meas
eff totN
Z
in Eq. (6.34b) changes to

( )
, ma
( ) (back)
(det)
R ( ) H( ) M( ) ( ) ( ) ( )
4
( ) ( ) ( ) ( )
H( ) ( ) ( )
meas
eff totN a
fore
f mnf mnf mnf
DE
WA
u R
u V

=
+

+

Z
L L L
n

.
(6.71f)

Substituting these expressions into the calibration formula in (6.35d) now gives

( ) (1)
, , (2) (1) (1)
(2) (1)
, ,
(det)
ma
R
( ) ( )
( ) ( ) ( )
( ) ( )
4 ( ) ( )
( )
( ) M( ) ( ) ( ) ( ) ( )
meas
eff totN eff tot
eff tot eff tot
DE
mnf
a f
V
WA R

+

= +
Z Z
L L L
Z Z
n
L
.
(6.71g)

The most important difference between the single-sided formula in (6.71g) and the double-sided formula in (6.35d) is that, according to Eq. (6.70h), $\mathbf{n}_{DE}^{(\mathrm{det})}(\sigma)$ is strictly real whereas, as was pointed out following Eq. (6.35d), $\mathbf{n}_{D}^{(\mathrm{det})}(\sigma)$ has both real and imaginary components. According to the discussion of the double-sided case following Eq. (6.36) above, the imaginary component of $\mathbf{n}_{D}^{(\mathrm{det})}(\sigma)$ is called the avoidable spectral noise because it can be eliminated by taking the real part of the interferometer measurement; and the real component of $\mathbf{n}_{D}^{(\mathrm{det})}(\sigma)$ is called the unavoidable spectral noise because it cannot be eliminated from the interferometer measurement. The avoidable spectral noise comes from the odd part of the $n^{(\mathrm{det})}(\chi)$ signal noise contaminating the interferometer data, and the unavoidable spectral noise comes from the even part of the $n^{(\mathrm{det})}(\chi)$ signal noise contaminating the interferometer data. The $n^{(\mathrm{det})}(\chi)$ noise contaminating the double-sided signal has both even and odd components because the interferometer data is recorded for both positive and negative values of the OPD value $\chi$. In the single-sided case, on the other hand, interferometer data is recorded only for non-negative values of $\chi$ and then artificially extended to negative $\chi$ values, automatically turning the noise contaminating the signal into an even function of $\chi$ [see Eq. (6.70b)]. Consequently, the single-sided spectral noise $\mathbf{n}_{DE}^{(\mathrm{det})}(\sigma)$ is always real and even [see Eqs. (6.70g) and (6.70h)], and there is no avoidable noise that can be eliminated by taking the real part of the measured spectrum. Hence, when comparing the right-hand side of (6.71g) to a spectral radiance measurement contaminated by random error, such as

$$L_{mnf}(\sigma) + \delta L(\sigma)$$

in Eq. (6.1a) above, we see that for single-sided spectral measurements contaminated by detector noise all of $\mathbf{n}_{DE}^{(\mathrm{det})}$ contributes to $\delta L$, giving

(det)
ma
R
4 ( ) ( )
( )
( ) M( ) ( ) ( ) ( ) ( )
DE
a f
V
WA R
o o
o o
o q o o t o t o
e
AO
n
L

. (6.72a)

for positive $\sigma$ values. We know that $\mathbf{n}_{DE}^{(\mathrm{det})}$ on the right-hand side of (6.72a) is a real and even function of $\sigma$. Functions $\eta$, M, and V are real and, according to Eq. (4.139g) in Chapter 4 and Eqs. (5.10f) and (5.88e) in Chapter 5, even functions of $\sigma$. Functions R, $\tau_a$, and $\tau_f$ are also real and have $|\sigma|$ for their argument, forcing them to be even functions of $\sigma$. It follows that Eq. (6.72a) presents a single-sided formula for $\delta L(\sigma)$ that is, as it should be, a real and even random function of $\sigma$, just like equation (6.38c) above. Following the convention adopted there, we write $\delta L$ in (6.72a) as a function of $|\sigma|$ to get

(det)
ma
R
4 ( ) ( )
( )
( ) M( ) ( ) ( ) ( ) ( )
DE
a f
V
WA R
o o
o o
o q o o t o t o
e
AO
n
L

. (6.72b)

6.21 Detector-Noise NEdN in a Single-Sided Signal
Taking the expectation value of both sides of Eq. (6.72b) gives, after consulting Eq. (6.70j),

( )
( )
(det)
ma
R
4 ( ) ( )
( ) 0
( ) M( ) ( ) ( ) ( ) ( )
DE
a f
V
WA R
o o
o o
o q o o t o t o

AO
n
L

E
E . (6.73a)

To find the NEdN for the detector noise in the single-sided signal, we apply Eqs. (6.72b) and (6.73a) to the formula in (6.3g) above to get, remembering that $W^2 = 1$ according to the discussion following Eq. (4.83) in Chapter 4,

( )
(det) 2
(det)
1
ma
R
4 ( ) ( )
( )
( ) M( ) ( ) ( ) ( ) ( )
DE
a f
V
NEdN
A R

n E

or, removing the absolute value signs from the arguments of $\mathbf{n}_{DE}^{(\mathrm{det})}$, $\eta$, M, and V,

( )
(det) 2
(det)
1
ma
R
4 ( ) ( )
( )
( ) M( ) ( ) ( ) ( ) ( )
DE
a f
V
NEdN
A R

n E
. (6.73b)

The absolute value signs are removed because these functions are even [see Eq. (6.70g), Eq.
(4.139g) in Chapter 4, and Eqs. (5.10f) and (5.88e) in Chapter 5]. The subscript 1 and superscript
(det) show that this is the formula for the NEdN of a single-sided signal contaminated by detector
noise.
The quickest way to connect $\mathrm{NEdN}_1^{(\mathrm{det})}$ to the formula for the double-sided signal is to analyze the detector noise as a time-based rather than a $\chi$-based random function. Returning to the definition of $n_E^{(\mathrm{det})}(\chi)$ in Eq. (6.70a) above, we use $\chi = ut$ from Eq. (6.4) to convert both sides of (6.70a) to time-based, rather than $\chi$-based, functions,

$$n_E^{(\mathrm{det})}(ut) = \Theta(ut)\, n^{(\mathrm{det})}(ut) + \Theta(-ut)\, n^{(\mathrm{det})}(-ut) . \qquad (6.74a)$$

Equation (6.60c) shows that

$$\Theta(t) = \Theta(ut) \qquad (6.74b)$$

for the Heaviside step function, so (6.74a) can be written as

$$N_E^{(\mathrm{det})}(t) = \Theta(t)\, N^{(\mathrm{det})}(t) + \Theta(-t)\, N^{(\mathrm{det})}(-t) , \qquad (6.74c)$$

where Eq. (6.40b) is used to replace $n^{(\mathrm{det})}$ by $N^{(\mathrm{det})}$ on the right-hand side of the formula, and on the left-hand side we define

$$N_E^{(\mathrm{det})}(t) = n_E^{(\mathrm{det})}(ut) \qquad (6.74d)$$
so that

$$n_E^{(\mathrm{det})}(\chi) = N_E^{(\mathrm{det})}(\chi / u) . \qquad (6.74e)$$

Equation (6.74c) is exactly the same as Eq. (3.73b) in Sec. 3.27 of Chapter 3 when $N^{(\mathrm{det})}(t)$ is matched to $n(t)$ and $N_E^{(\mathrm{det})}(t)$ is matched to $n_E(t)$,

$$N^{(\mathrm{det})}(t) \leftrightarrow n(t) \qquad (6.75a)$$
and

$$N_E^{(\mathrm{det})}(t) \leftrightarrow n_E(t) .$$

Remember that in this section all terms with the superscript (det) refer to the detector noise being analyzed in this chapter and the terms without the superscript (det) come from Chapter 3.
Section 3.27 of Chapter 3 defines the T-limited forward Fourier transform of $n_E(t)$ to be, according to Eq. (3.72b),

$$N_{TE}(f) = \int_{-T}^{T} n_E(t)\, e^{-2\pi i f t}\, dt . \qquad (6.75b)$$

Following this lead, we define the T-limited forward Fourier transform of $N_E^{(\mathrm{det})}(t)$ to be

$$\mathbf{N}_{TE}^{(\mathrm{det})}(f) = \int_{-T}^{T} N_E^{(\mathrm{det})}(t)\, e^{-2\pi i f t}\, dt , \qquad (6.75c)$$

where, just as in Eq. (6.40c) above, $T = D/u$. Since $N_E^{(\mathrm{det})}(t)$ matches up to $n_E(t)$ in (6.75a), it follows that Eqs. (6.75b) and (6.75c) are now the same equation with $\mathbf{N}_{TE}^{(\mathrm{det})}(f)$ matching up to $N_{TE}(f)$,

$$\mathbf{N}_{TE}^{(\mathrm{det})}(f) \leftrightarrow N_{TE}(f) . \qquad (6.75d)$$

The analysis presented in Sec. 3.27 [see Eq. (3.76c) in Chapter 3] shows that

$$\mathrm{E}\big( |N_{TE}(f)|^2 \big) \cong 2T\, S_{nn}(f) , \qquad (6.75e)$$

where $S_{nn}(f)$ is the double-sided power spectrum of random function $n(t)$ in (6.75a). We know that $N^{(\mathrm{det})}(t)$, which corresponds to $n(t)$, has, according to Eq. (6.41b), its own power spectrum $S_{NN}^{(\mathrm{det})}(f)$. Since $N^{(\mathrm{det})}(t)$ corresponds to $n(t)$, the power spectrum $S_{nn}(f)$ in (6.75e) corresponds to the power spectrum $S_{NN}^{(\mathrm{det})}(f)$,

$$S_{nn}(f) \leftrightarrow S_{NN}^{(\mathrm{det})}(f) . \qquad (6.75f)$$

Hence the formula corresponding to Eq. (6.75e), which has been directly copied from (3.76c) in Chapter 3, must be, according to (6.75d) and (6.75f),

$$\mathrm{E}\big( |\mathbf{N}_{TE}^{(\mathrm{det})}(f)|^2 \big) \cong 2T\, S_{NN}^{(\mathrm{det})}(f) . \qquad (6.75g)$$

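To find the counterpart of this result for $\chi$-based random functions, we follow Eq. (6.4) and change the dummy variable of integration in (6.75c) to $\chi = ut$ to get

$$\mathbf{N}_{TE}^{(\mathrm{det})}(f) = u^{-1} \int_{-uT}^{uT} N_E^{(\mathrm{det})}(\chi / u)\, e^{-2\pi i (f/u) \chi}\, d\chi .$$

According to Eqs. (6.40c), (6.40d), and (6.74e), this can be transformed into

$$\mathbf{N}_{TE}^{(\mathrm{det})}(u\sigma) = u^{-1} \int_{-D}^{D} n_E^{(\mathrm{det})}(\chi)\, e^{-2\pi i \sigma \chi}\, d\chi ,$$

which can also be written as [see Eq. (6.70e)]

$$\mathbf{N}_{TE}^{(\mathrm{det})}(u\sigma) = u^{-1}\, \mathbf{n}_{DE}^{(\mathrm{det})}(\sigma) . \qquad (6.76a)$$

We again consult Eqs. (6.40c) and (6.40d), this time using them to write (6.75g) as

$$\mathrm{E}\big( |\mathbf{N}_{TE}^{(\mathrm{det})}(u\sigma)|^2 \big) \cong 2 u^{-1} D\, S_{NN}^{(\mathrm{det})}(u\sigma) .$$

According to Eqs. (6.41f) and (6.76a), this can also be written as

$$\mathrm{E}\big( |\mathbf{n}_{DE}^{(\mathrm{det})}(\sigma)|^2 \big) \cong 2D\, p_{nn}^{(\mathrm{det})}(\sigma) \qquad (6.76b)$$

or, since (6.70h) shows that $\mathbf{n}_{DE}^{(\mathrm{det})}(\sigma)$ is strictly real,

$$\mathrm{E}\big( [\mathbf{n}_{DE}^{(\mathrm{det})}(\sigma)]^2 \big) \cong 2D\, p_{nn}^{(\mathrm{det})}(\sigma) . \qquad (6.76c)$$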

Substituting this into (6.73b) gives

(det)
(det)
1
ma
R
4 ( ) 2 ( )
( )
( ) M( ) ( ) ( ) ( ) ( )
nn
a f
V D
NEdN
A R

p
. (6.76d)

We are usually interested in the NEdN only for $\sigma$ values corresponding to the wavenumber range over which $L(\sigma)$ is to be measured; that is, formula (6.76d) is almost always used for wavenumbers such that

$$\sigma_{\min} \le \sigma \le \sigma_{\max}$$

with $\sigma_{\min}$ and $\sigma_{\max}$ the same as in Eq. (5.78) in Chapter 5. According to Eq. (5.88d) in Chapter 5, $V(\sigma)$ is always one for these $\sigma$ values, which means it can be eliminated from (6.76d),

(det)
(det)
1
ma
R
4 2 ( )
( )
( ) M( ) ( ) ( ) ( ) ( )
nn
a f
D
NEdN
A R

p
, (6.76e)

without (usually) making the formula any less useful.
Comparing the formula in (6.76e) to the corresponding formula in (6.52g) for $\mathrm{NEdN}_2^{(\mathrm{det})}$, we see that the single-sided NEdN is larger than the double-sided NEdN by a factor of $\sqrt{2}$. This result can be blamed entirely on the way the single-sided measurement is constructed. Double-sided signals are measured for both positive and negative $\chi$ values, which means, as discussed following Eqs. (6.19f) and (6.35d) above, that part of the signal noise is in principle avoidable: we can either average together the noise-contaminated signal values at $\chi$ and $-\chi$ to reduce the detector noise at once or remove the avoidable noise later on by taking the real part of the measured spectrum. Single-sided signals, on the other hand, are in effect (once they have been turned into even functions) measured only for positive $\chi$ and then artificially extended to negative $\chi$ values. There is thus no way to lessen the single-sided signal noise because there is no way to compare independent signal measurements at $\chi$ and $-\chi$, so it is no surprise to find that single-sided NEdNs are larger than the corresponding double-sided NEdNs. Now that $\mathrm{NEdN}_1^{(\mathrm{det})}$ is known to be larger than $\mathrm{NEdN}_2^{(\mathrm{det})}$ by $\sqrt{2}$, Eqs. (6.53a), (6.53c), and (6.53d) can be used to put the formula for the single-sided NEdN into several different forms:

(det)
(det)
1
ma
R
4 2 ( )
( )
( ) M( ) ( ) ( ) ( ) ( )
NN
a f
uDS u
NEdN
A R

, (6.77a)

(det)
1
(det)
1
ma
R
4 ( )
( )
( ) M( ) ( ) ( ) ( ) ( )
a f
uDS u
NEdN
A R

, (6.77b)
and

(det)
1
ma
4
( )
( ) M( ) ( ) ( ) ( ) ( )
d
a f
uDA
NEdN
A R D
. (6.77c)
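The factor of $\sqrt{2}$ between the single-sided and double-sided NEdN can be checked with a small Monte Carlo experiment. The sketch below is an illustration under stated assumptions, not the book's derivation: the noise is taken to be white and Gaussian, the spectrum is evaluated at one arbitrarily chosen DFT bin, and the function name `nedn_ratio` and all parameter values are hypothetical.

```python
import numpy as np

def nedn_ratio(M=64, trials=4000, j=5, seed=0):
    """Ratio of single-sided to double-sided spectral-noise standard deviations."""
    rng = np.random.default_rng(seed)
    k = np.arange(-M, M + 1)                      # sample indices for chi = -D .. D
    cos_jk = np.cos(2 * np.pi * j * k / k.size)   # real part of the DFT kernel at bin j

    # Double-sided: independent noise at every chi; keep the real part of the spectrum.
    n = rng.standard_normal((trials, k.size))
    double_sided = n @ cos_jk

    # Single-sided: noise recorded for chi >= 0 only, then mirrored onto chi < 0.
    n_pos = n[:, M:]
    n_even = np.concatenate([n_pos[:, :0:-1], n_pos], axis=1)
    single_sided = n_even @ cos_jk                # imaginary part vanishes by symmetry

    return single_sided.std() / double_sided.std()

ratio = nedn_ratio()   # close to sqrt(2) for large M and many trials
```

In this toy model the mirrored noise doubles the variance of the real spectral noise, so the ratio of standard deviations approaches $\sqrt{2}$, in line with the comparison of (6.76e) and (6.52g).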
6.22 Detector Circuit as an Anti-Aliasing Filter
Detector noise is usually the dominant type of noise in Michelson interferometers. Up to now the detector circuit between points B and C in Fig. 6.2 has been treated as just another part of the signal chain; now that we know how to describe detector noise, we can discuss the detector circuit's role as an anti-aliasing filter.
To get the uncalibrated, noise-contaminated signal spectrum at point C in Fig. 6.2, we consult Eqs. (6.31a) and (6.31c) to get

$$\mathcal{F}^{(-i\sigma\chi)}\big( \Pi(\chi, D)\, z_{CN}^{(tot)}(\chi) \big) = \mathbf{Z}_{eff,tot}(\sigma) + H(u\sigma)\, \mathbf{n}_{D}^{(\mathrm{det})}(\sigma) , \qquad (6.78a)$$
where

, ma
( ) (back)
R ( ) H( ) M( ) ( ) ( ) ( )
4
( ) ( ) ( ) ( )
eff tot a
fore
f mnf mnf mnf
WA
u R

=
+

Z
L L L
(6.78b)

and $\mathbf{n}_{D}^{(\mathrm{det})}(\sigma)$ is defined in Eq. (6.29a) to be the (D-limited) complex spectrum of the detector noise. At point C, the analog-to-digital converter samples the signal at equally spaced intervals in $\chi$ so that a discrete Fourier transform (DFT) can be applied. To analyze the effect of this procedure, we start with the obvious point that

$$[\Pi(\chi, D)\, z_{CN}^{(tot)}(\chi)] \quad \text{and} \quad [\mathbf{Z}_{eff,tot}(\sigma) + H(u\sigma)\, \mathbf{n}_{D}^{(\mathrm{det})}(\sigma)]$$

in (6.78a) are a Fourier transform pair and then analyze what must happen to them when $[\Pi(\chi, D)\, z_{CN}^{(tot)}(\chi)]$ is sampled and put through a DFT.
Section 2.21 of Chapter 2 explains the effect of sampling and the DFT on any two functions, such as U(f) and u(t) in Eq. (2.91a) of Chapter 2, which form a Fourier-transform pair. To match the interferometer signal at point C to functions u(t) and U(f), we write Eq. (6.78a) as

$$[\mathbf{Z}_{eff,tot}(\sigma) + H(u\sigma)\, \mathbf{n}_{D}^{(\mathrm{det})}(\sigma)] = \int_{-\infty}^{\infty} [\Pi(\chi, D)\, z_{CN}^{(tot)}(\chi)]\, e^{-2\pi i \sigma \chi}\, d\chi . \qquad (6.79)$$

Comparing this to Eq. (2.91a) in Chapter 2, we note that wavenumber $\sigma$ corresponds to f, the OPD value $\chi$ corresponds to t, function U(f) corresponds to $[\mathbf{Z}_{eff,tot}(\sigma) + H(u\sigma)\, \mathbf{n}_{D}^{(\mathrm{det})}(\sigma)]$, and function u(t) corresponds to $[\Pi(\chi, D)\, z_{CN}^{(tot)}(\chi)]$. These correspondences can be written symbolically as

$$t \leftrightarrow \chi , \qquad (6.80a)$$

$$f \leftrightarrow \sigma , \qquad (6.80b)$$

$$u(t) \leftrightarrow [\Pi(\chi, D)\, z_{CN}^{(tot)}(\chi)] , \qquad (6.80c)$$
and

$$U(f) \leftrightarrow [\mathbf{Z}_{eff,tot}(\sigma) + H(u\sigma)\, \mathbf{n}_{D}^{(\mathrm{det})}(\sigma)] . \qquad (6.80d)$$

In double-sided interferometer measurements, the analog-to-digital converter samples the signal at N equally spaced OPD values between $\chi = -D$ and $\chi = D$, with $\Delta\chi$ being the OPD interval between neighboring samples. In single-sided interferometer measurements, even though the analog-to-digital converter samples only half (approximately) the length in $\chi$, we subsequently double the signal to get, again, N equally spaced data points between $\chi = -D$ and $\chi = D$ with $\Delta\chi$ again being the OPD interval between neighboring samples [see the discussion following Eq. (6.59d) in Sec. 6.18 above]. Hence, for both double-sided and single-sided interferogram systems, we have

$$N \Delta\chi = 2D . \qquad (6.81a)$$

This corresponds to Eq. (2.92b) in Chapter 2, which states that the N equally spaced samples used to represent u(t) are separated by intervals $\Delta t$ such that

$$N \Delta t = T .$$

Therefore, interval $\Delta t$ corresponds to $\Delta\chi$ and T corresponds to 2D, which can be written symbolically as

$$\Delta t \leftrightarrow \Delta\chi \qquad (6.81b)$$
and

$$T \leftrightarrow 2D . \qquad (6.81c)$$

Furthermore, since $\Delta\chi$ corresponds to $\Delta t$, parameter

$$F = \frac{1}{\Delta t}$$

specified in Eq. (2.93c) of Chapter 2 must correspond to $1/\Delta\chi$,

$$F \leftrightarrow \frac{1}{\Delta\chi} . \qquad (6.81d)$$

The Nyquist wavenumber, defined in Eq. (5.112) of Chapter 5 to be

$$\sigma_{Nyq} = \frac{1}{2 \Delta\chi} , \qquad (6.81e)$$

can be used to write the correspondence in (6.81d) as

$$F \leftrightarrow 2 \sigma_{Nyq} . \qquad (6.81f)$$
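As a quick numerical illustration of Eqs. (6.81a) and (6.81e), the sampling interval and Nyquist wavenumber follow directly from the number of samples and the maximum OPD. The function name, the choice of centimetres as the OPD unit, and the example values are assumptions made here for illustration only.

```python
def sampling_parameters(num_samples, max_opd_cm):
    """Return (OPD step in cm, Nyquist wavenumber in cm^-1).

    Uses N * delta_chi = 2D [Eq. (6.81a)] and sigma_Nyq = 1 / (2 * delta_chi)
    [Eq. (6.81e)]; the centimetre units are an illustrative assumption.
    """
    delta_chi = 2.0 * max_opd_cm / num_samples
    sigma_nyq = 1.0 / (2.0 * delta_chi)
    return delta_chi, sigma_nyq

# Example: 8192 samples spanning chi = -1 cm to chi = +1 cm.
delta_chi, sigma_nyq = sampling_parameters(num_samples=8192, max_opd_cm=1.0)
```

Combining the two relations gives $\sigma_{Nyq} = N/(4D)$, so for a fixed maximum OPD the Nyquist wavenumber grows in proportion to the number of samples.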

Formulas (2.95c) and (2.95d) show what happens to the sampled interferometer signal when
the DFT is applied: the original Fourier-transform pair u(t) and U(f), which describes the signal
and its spectrum, changes into
[ ]
( , ) u t T
and
[ ]
( , ) U f F
. The transformation of spectrum U(f)

into
[ ]
( , ) U f F
is discussed at some length in Secs. 2.22 and 2.23 of Chapter 2, which show why
it is referred to as aliasing the signal spectrum. Equation (2.93b) defines
[ ]
( , ) U f F
to be

[ ]
( , ) ( )
r
U f F U f rF
=
=
.

Therefore, applying correspondences (6.80b), (6.80d), and (6.81f) to (2.93b), we see that the original noise-contaminated spectrum

$$[\mathbf{Z}_{eff,tot}(\sigma) + H(u\sigma)\, \mathbf{n}_{D}^{(\mathrm{det})}(\sigma)]$$

in (6.78a) and (6.79) must transform, after sampling and the DFT, into

$$\text{noise-contaminated and aliased spectrum} = \sum_{r=-\infty}^{\infty} \big[ \mathbf{Z}_{eff,tot}(\sigma - 2r\sigma_{Nyq}) + H\big(u(\sigma - 2r\sigma_{Nyq})\big)\, \mathbf{n}_{D}^{(\mathrm{det})}(\sigma - 2r\sigma_{Nyq}) \big] . \qquad (6.82)$$
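The sum over r in Eq. (6.82) means that spectral content at any wavenumber reappears at that wavenumber shifted by whole multiples of $2\sigma_{Nyq}$. A minimal sketch of this folding is given below; the helper name `aliased_wavenumber` and the numerical values are hypothetical and chosen only for illustration.

```python
def aliased_wavenumber(sigma, sigma_nyq):
    """Fold sigma into the baseband [-sigma_nyq, sigma_nyq) by shifts of 2*sigma_nyq."""
    span = 2.0 * sigma_nyq
    return (sigma + sigma_nyq) % span - sigma_nyq

# Noise at 2300 cm^-1, sampled with sigma_Nyq = 1000 cm^-1, shows up at 300 cm^-1.
folded = aliased_wavenumber(2300.0, 1000.0)
```

Any noise component whose folded position lies inside the measured band contaminates the spectrum unless the detector circuit has already attenuated it.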

FIGURE 6.8(a). This is a schematic plot of the magnitude of the signal spectrum $\mathbf{Z}_{eff,tot}$ against wavenumber $\sigma$. Spectrum $\mathbf{Z}_{eff,tot}$ is zero unless $\sigma_{\min} \le |\sigma| \le \sigma_{\max}$.

When $\mathbf{n}_{D}^{(\mathrm{det})} = 0$ in (6.82), that is, in the absence of noise, Eq. (6.82) becomes the same as Eq. (5.113c) in Chapter 5 if all the background radiances are negligible compared to the radiance spectrum entering the interferometer. The practical consequences of Eq. (5.113c) are discussed at length in Secs. 5.24 and 5.25 of Chapter 5. Following the same sort of reasoning used there, we note that $\mathbf{Z}_{eff,tot}$ is expected to be negligible or zero unless

$$\sigma_{\min} \le |\sigma| \le \sigma_{\max} ,$$

as shown in Fig. 6.8(a).
We choose $\Delta\chi$ small enough that

$$\sigma_{Nyq} = \frac{1}{2\Delta\chi} > \sigma_{\max}$$

so that the spectrum is oversampled and its original shape preserved, as shown in Fig. 6.8(b). If there is a large gap between $\sigma = 0$ and $\sigma = \sigma_{\min}$, we can instead choose $\Delta\chi$ large enough to undersample the spectrum while preserving its original shape, as shown in Fig. 6.8(c). When $\mathbf{n}_{D}^{(\mathrm{det})}$ is not zero, however, both oversampling and undersampling may introduce extra noise into the measured spectrum if

$$H(u\sigma)\, \mathbf{n}_{D}^{(\mathrm{det})}(\sigma)$$

in Eq. (6.82) is not negligible or zero when $|\sigma| < \sigma_{\min}$ or $|\sigma| > \sigma_{\max}$. This is shown pictorially by the dotted lines of the aliased noise spikes in Fig. 6.8(b) and the overlap of the aliased spectrum over the solid lines representing the low-frequency noise in Fig. 6.8(c). We cannot easily control the spectrum $\mathbf{n}_{D}^{(\mathrm{det})}(\sigma)$ of the detector noise, which tends to be significantly different from zero at all frequencies, both high and low; but we can design the detector circuit so that $H(u\sigma)$ is very small for those wavenumbers that can contaminate the spectral measurement due to aliasing. Detector circuits of this sort are often referred to as anti-aliasing filters or as containing an anti-aliasing filter. Although it may not be mandatory to design the anti-aliasing transfer function H so that $H(u\sigma)$ is negligible or zero unless

$$\sigma_{\min} \le |\sigma| \le \sigma_{\max} ,$$

we note that if H obeys this rule, then $H(u\sigma)\, \mathbf{n}_{D}^{(\mathrm{det})}(\sigma)$ is small whenever $\mathbf{Z}_{eff,tot}$ is small, and aliasing can never introduce extra noise into the measured spectrum. Figure 6.8(d) graphs this ideal band-pass transfer function, suitable for all types of oversampled or undersampled spectral measurements.
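The benefit of an ideal band-pass transfer function can be illustrated by counting how many aliases of a flat noise spectrum land on a wavenumber inside the measured band. This is a rough sketch under simplifying assumptions: the detector noise is white, the transfer function is written directly as a function of wavenumber rather than of the frequency $u\sigma$, the infinite sum over r is truncated, and every name and number below is hypothetical.

```python
def ideal_bandpass(sigma, sigma_min, sigma_max):
    """Ideal band-pass response: one inside the measured band, zero outside."""
    return 1.0 if sigma_min <= abs(sigma) <= sigma_max else 0.0

def aliased_noise_gain(sigma, sigma_nyq, transfer, r_max=50):
    """Sum |H|^2 over the aliases sigma - 2*r*sigma_nyq that fold onto sigma."""
    return sum(abs(transfer(sigma - 2 * r * sigma_nyq)) ** 2
               for r in range(-r_max, r_max + 1))

# Oversampled case: band 500-900 cm^-1, Nyquist wavenumber 1000 cm^-1.
no_filter = aliased_noise_gain(700.0, 1000.0, lambda s: 1.0)
with_filter = aliased_noise_gain(
    700.0, 1000.0, lambda s: ideal_bandpass(s, 500.0, 900.0))
```

Without filtering, every alias contributes noise power at 700 cm^-1 (101 terms in this truncated sum); with the ideal band-pass only the unaliased term survives, so aliasing adds nothing to the in-band noise.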

__________

Detectors are the major source of random error in almost all Michelson interferometers. The NEdN of an interferometer measurement is defined at the beginning of this chapter to be the standard deviation of the random measurement error, which suggests that some effort might be required to observe detector noise. It turns out, however, that the distinctively fuzzy appearance of detector noise [see Fig. 6.6(b)] usually means that a single spectral measurement is enough to show its presence and importance. We have traced detector noise through the block diagram of a standard Michelson interferometer (shown in Fig. 6.2), taking care to include the effect of the calibration process on the spectral signal. In double-sided interferogram systems, some of the signal noise can be eliminated rather easily by taking the real part of the noise-contaminated measurement after the calibration algorithm has been applied. This lets us divide the signal noise of double-sided systems into avoidable and unavoidable components. Signal noise is somewhat more prominent in systems using single-sided interferograms, being larger by a factor of the square root of 2, because there is no way to eliminate the avoidable component of the signal noise. This is the inevitable price paid for the gain in spectral resolution discussed in Sec. 5.18 of Chapter 5.

FIGURE 6.8(b). This is a schematic plot of the magnitude of the noise-contaminated spectral signal $\mathbf{Z}_{eff,tot}$ against wavenumber $\sigma$ when the data has been oversampled. The solid lines represent the noise-free $\mathbf{Z}_{eff,tot}$ and the dashed lines represent its aliases. The solid bars represent the high-frequency and low-frequency noise at their correct positions on the wavenumber axis, and the dotted bars represent the high-frequency and low-frequency noise at their aliased positions on the wavenumber axis. Only aliased high-frequency noise ends up in the measured spectrum. [Figure annotation: "Use this region of oversampled data to measure the spectrum."]

FIGURE 6.8(c). This is a schematic plot of the magnitude of the noise-contaminated spectral signal $\mathbf{Z}_{eff,tot}$ against wavenumber $\sigma$ when the data has been undersampled. The solid lines represent the noise-free $\mathbf{Z}_{eff,tot}$ and the dashed lines represent its aliases. The solid bars represent the high-frequency and low-frequency noise at their correct positions on the wavenumber axis, and the dotted bars represent the high-frequency and low-frequency noise at their aliased positions on the wavenumber axis. Both low-frequency noise and aliased high-frequency noise end up in the measured spectrum. [Figure annotation: "Use this region of undersampled data to measure the spectrum."]


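FIGURE 6.8(d). [Plot of the ideal band-pass transfer function $H(u\sigma)$ against wavenumber $\sigma$, with the value 1.0 marked on the vertical axis and $\sigma_{\min}$ and $\sigma_{\max}$ marked on both halves of the wavenumber axis.]
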
Appendix 6A
When a spectral radiance $L(\sigma)$ is a slowly varying function of wavenumber, then the distortion given by an interferometer's field of view can be disregarded. To see why this is so, we use the formula given in Eq. (6.5b) [and also in Eq. (5.83e) of Chapter 5] for $L_{FOV}(\sigma)$, the spectral radiance distorted by an interferometer's finite field of view when $\Delta\Omega$ is small but also large
enough that cos
r
o cannot be approximated as one:

1
4 2
1
4 2
1
( ) ( )
FOV
d
o
o
o
o
r
o
r
o o o
A
AO
+ +

A
AO
+

A

L L (6A.1)
In this formula, 0 o > and

2
o
o
r
AO
A . (6A.2)

When $L(\sigma)$ is a slowly varying function of wavenumber, we can assume that it is quasi-constant when $\sigma$ changes by an amount $\Delta\sigma$, so the integral in (6A.1) can be approximated as
1 1
4 2 4 2
1 1
4 2 4 2
( ) ( )
2
d d
o o
o
o o
o o
r r
o o
r r
o o o o o o
A A
AO AO
+ + + +

A A
AO AO
+ +

A
e + e A

L L L .

Equation (6A.1) now simplifies to

$$L_{FOV}(\sigma) \cong L(\sigma) , \qquad (6A.3)$$

showing that an interferometer with a small but finite field of view does not significantly distort the measured spectral radiance when the radiance is a slowly varying function of wavenumber.
The effect of the interferometer's finite interferogram length can also be shown to disappear when L is a slowly varying function of wavenumber. Following the notation introduced in Sec. 5.15 of Chapter 5, we say that 2D is the finite length of the interferogram signal. According to Eq. (5.108d) in Chapter 5,

( ) [2 sinc(2 )] ( )
mnf FOV
D D o ro o L L , (6A.4a)

is then the spectral radiance distorted by both the interferometer's finite interferogram length and its finite field of view. Using (6A.3), this reduces to
( ) [2 sinc(2 )] ( )
mnf
D D = L L . (6A.4b)

The sinc function has a tall central lobe centered on 0 = and then oscillates to zero as we move
away from the origin (see Fig. 5.23 in Chapter 5). Since L is a slowly varying function of
wavenumber, the sinc can be thought of as an extremely narrow function compared to L.
Appendix 5C of Chapter 5 discusses what happens when a narrow function such as
[2 sinc(2 )] D D in (6A.4b) is convolved with a broad, slowly varying function such as L. To
make use of the work done in Appendix 5C, we consult Eq. (5C.4b) to get

( ) [ ( ) ( )] ( ) [ ( ) ( )] h z G z g z G z h z g z .

Here, h(z) represents the narrow function and G(z) represents the broad function. We apply the
definition of the convolution in Eq. (2.38a) of Chapter 2 to just the right-hand side of this formula
to get
( ) [ ( ) ( )] ( ) ( ) ( ) h z G z g z G z h z g z z dz
.

For the special case ( ) 1 g z = , this reduces to
( ) ( ) ( ) ( ) h z G z G z h z dz
. (6A.4c)

Remembering that h(z) represents the narrow sinc function and G(z) represents the broad, slowly
varying spectral radiance L, we set up the correspondences

z

( ) ( ) G z L

( ) [2 sinc(2 )] h z D D

and then apply (6A.4c) to the right-hand side of (6A.4b) to get

( ) ( ) 2 sinc(2 )
mnf
D D d
L L . (6A.4d)
Glancing back at Eq. (2.108a) in Chapter 2, we mentally replace F by D and t by , noting that
when 0 f = formula (2.108a) becomes

[2 sinc(2 )] (0, ) D D d D
.

Equation (2.56c) in Chapter 2 shows that (0, ) D is one for all 0 D > , so
[2 sinc(2 )] 1 D D d
(6A.5)

and Eq. (6A.4d) can be written as
( ) ( )
mnf
L L . (6A.6)

Hence, when the spectral radiance L is a slowly varying function of wavenumber with respect to
[2 sinc(2 )] D D and with respect to a change in wavenumber

2
= ,

then it undergoes only negligible distortion from the interferometers finite field of view and
2D finite interferogram length.
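The two facts used in this appendix, that the sinc kernel has unit area [Eq. (6A.5)] and that convolving it with a slowly varying radiance returns nearly the same radiance [Eq. (6A.6)], can be checked numerically. The sketch below makes several assumptions that are not from the text: it uses NumPy's normalized sinc, $\sin(\pi x)/(\pi x)$, so that the kernel is the Fourier transform of a boxcar of half-width D (whether this matches the book's sinc convention is not confirmed here); the integral is truncated to a finite range; and the linear test radiance is purely illustrative.

```python
import numpy as np

D = 1.0                                    # maximum OPD (arbitrary units)
step = 1e-3
sigma = np.arange(-200.0, 200.0 + step / 2, step)
kernel = 2 * D * np.sinc(2 * D * sigma)    # np.sinc(x) = sin(pi x) / (pi x)

# Riemann-sum estimate of the kernel area over the truncated range.
kernel_area = np.sum(kernel) * step

def radiance(s):
    """Hypothetical slowly varying spectral radiance (arbitrary units)."""
    return 1.0 + 0.001 * s

# Convolution of the kernel with the radiance, evaluated at sigma = 50.
smoothed_at_50 = np.sum(kernel * radiance(50.0 - sigma)) * step
```

The kernel area comes out very close to one, and the smoothed radiance at the test point stays within a small fraction of a percent of `radiance(50.0)`, which is the behavior Eq. (6A.6) describes.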
Appendix 6B
The noise contaminating a time-based signal can be represented by a random function of time t, which we write as $N(t)$ using the notation of Chapter 3 [see Sec. 3.2 of Chapter 3]. According to Eq. (6.4), for time-based interferometer signals the time t is linearly proportional to the OPD value $\chi$,

$$t = \chi / u , \qquad (6B.1)$$

where u is the OPD velocity. Hence, when $N(t)$ represents noise contaminating a time-based interferometer signal, we can also decide to represent the same noise as a random function $n(\chi)$, with

$$n(\chi) = N(\chi / u) \qquad (6B.2a)$$
or

$$n(ut) = N(t) . \qquad (6B.2b)$$

From Sec. 3.15 of Chapter 3 [see Eq. (3.30b)], we know that when $N(t)$ is wide-sense stationary it has an autocorrelation function $R_{NN}$ given by

$$R_{NN}(t_2 - t_1) = \mathrm{E}\big( N(t_1)\, N(t_2) \big) . \qquad (6B.3a)$$

The power spectrum $S_{NN}$ of $N(t)$ is the forward Fourier transform of $R_{NN}$ given by

$$S_{NN}(f) = \int_{-\infty}^{\infty} R_{NN}(\tau)\, e^{-2\pi i f \tau}\, d\tau . \qquad (6B.3b)$$

The Fourier transform can, of course, be reversed to give

$$R_{NN}(\tau) = \int_{-\infty}^{\infty} S_{NN}(f)\, e^{2\pi i f \tau}\, df . \qquad (6B.3c)$$

Equation (6B.2b) can be used to replace N by n in Eq. (6B.3a) to get

$$R_{NN}(t_2 - t_1) = \mathrm{E}\big( n(ut_1)\, n(ut_2) \big) . \qquad (6B.4a)$$

Using Eq. (6B.1), we define

$$\chi_1 = ut_1 \quad \text{and} \quad \chi_2 = ut_2 ,$$

which lets us write (6B.4a) as

$$R_{NN}\!\left( \frac{\chi_2 - \chi_1}{u} \right) = \mathrm{E}\big( n(\chi_1)\, n(\chi_2) \big) . \qquad (6B.4b)$$

We can now, using the most basic definition of the autocorrelation function in Eq. (3.23b) of
Chapter 3, define the autocorrelation function of () to be

( )
1 2 1 2
( , ) ( ) ( )
nn
n n =

E o , (6B.4c)

which means, according to Eq. (6B.4b), that

2 1
1 2
( , )
nn
NN
R
u

=

o . (6B.4d)

This shows that whenever the autocorrelation
NN
R

of (t) is a function only of
2 1
( ) t t , as shown
in Eq. (6B.3a), the autocorrelation of () must also be a function only of
2 1
( ) .
Consequently we can write Eq. (6B.4c) as

( )
2 1 1 2
( ) ( ) ( )
nn
n n =

E o . (6B.4e)

Equation (6B.4d) can now be written as

2 1
2 1
( )
nn
NN
R
u

=

o (6B.4f)
or, setting
2 1
= ,
( )
nn
NN
R
u

=

o . (6B.4g)

This formula can also be written as, setting / u = ,

( ) ( )
nn
NN
u R =

o . (6B.4h)

Equations (6B.4g) and (6B.4h) specify the connection between the autocorrelation functions of
and .
We examine the definition of a wide-sense stationary random function in Sec. 3.15 of Chapter
3 [in Eq. (3.30b)] and note that (6B.4e) is the major requirement for showing that () is wide-
sense stationary. All that remains is to discover whether or not

( ) ( ) n E

is finite and independent of . If (t) is wide-sense stationary, we know from Eq. (3.30a) of
Chapter 3 that

( )
( ) finite constant
N
N t = =
E . (6B.5a)

Substituting (6B.2b) into (6B.5a) gives

( ) ( ) finite constant
N
n ut = =
E ,

which, since ut = from (6B.1), is clearly the same thing as saying that

( ) ( ) finite constant
N
n = =
E . (6B.5b)

Therefore, putting together (6B.4e) and (6B.5b), we find that () satisfies all the requirements
for being a wide-sense stationary random function of whenever (t) is a wide-sense stationary
random function of t.
The power spectrum
nn
p of ( ) n is the forward Fourier transform of its autocorrelation
function
nn
o ,

2
( ) ( )
i
nn nn
e d

p o . (6B.6a)
Reversing this transform gives

2
( ) ( )
i
nn nn
e d

o p . (6B.6b)
Substituting (6B.4g) into (6B.6a) gives

2
( ) ( / )
i
nn
NN
R u e d

p .

We can, following the suggestion contained in Eq. (6B.1), change the dummy variable of
integration to / u = (with d u d = ) to get

2
( ) ( )
i u
nn
NN
u R e d

p . (6B.6c)

Comparing (6B.6c) to (6B.3b), we see that

$$p_{nn}(\sigma) = u\, S_{NN}(u\sigma) , \qquad (6B.6d)$$
which, setting

$$f = u\sigma , \qquad (6B.6e)$$

can also be written as

$$u^{-1}\, p_{nn}(f/u) = S_{NN}(f) . \qquad (6B.6f)$$

Equation (3.57g) in Chapter 3 can be written as, using the notation of this appendix,

( )
2
1
( ) lim ( )
2
T
NN
T
S f f
T

=

N

E , (6B.7a)
where

2 2
( ) ( ) ( , ) ( )
T
ift ift
T
T

= =

N

(6B.7b)

with ( , ) t T defined the same way it was in Eq. (4C.1a) in Appendix 4C of Chapter 4:

1 for
( , )
0 for
t T
t T
t T

=

>

. (6B.7c)

Transforming Eq. (6B.7b) from f and t variables to and variables gives [see Eqs. (6B.1) and
(6B.6e)]

2 ( ) ( / )
1
( ) ( / )
uT
i u u
T
uT
u N u e d
u

N

or, substituting from (6B.2a),

2
1
( ) ( )
D
i
T
D
u n e d
u

, (6B.7d)

where we define, as in Eq. (6.40c) above,

D uT = . (6B.7e)
If we also define

2 2
( ) ( ) ( , ) ( )
D
i i
D
D
n e d D n e d

= =

n , (6B.7f)

then Eq. (6B.7d) can be written as
( ) ( )
D T
u u = n N
. (6B.7g)

Replacing by / f u gives

1
( ) ( / )
T D
f u f u
= N n
. (6B.7h)

Now Eqs. (6B.6f), (6B.7h), and (6B.7e) can be combined with Eq. (6B.7a) to get

2
1
2
1 1
( / ) lim ( / )
2( / )
nn D
D
u f u f u
D u u

=

n

E p

or, replacing f by u ,

( )
2 1
( ) lim ( )
2
nn D
D
D

=

n

E p . (6B.7i)

Equations (6B.6d), (6B.6f), and (6B.7a)(6B.7i) specify the connections between the -based and
the t-based power spectra of and .
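The scaling between the OPD-based and time-based power spectra in Eq. (6B.6d) can be verified numerically for a concrete autocorrelation function. The sketch below is illustrative only: the exponential autocorrelation $R_{NN}(\tau) = e^{-|\tau|/\tau_0}$, the parameter values, and the names `S_NN`, `p_nn_numeric`, and `rho_nn` (used here for the OPD-based autocorrelation) are assumptions introduced for the example.

```python
import numpy as np

u, tau0 = 2.0, 0.5          # OPD velocity and correlation time (illustrative values)

def S_NN(f):
    """Analytic power spectrum of R_NN(tau) = exp(-|tau| / tau0)."""
    return 2 * tau0 / (1 + (2 * np.pi * f * tau0) ** 2)

def p_nn_numeric(sigma, half_width=40.0, step=1e-3):
    """Fourier transform of rho_nn(chi) = R_NN(chi / u) by the trapezoidal rule.

    rho_nn is even, so only the cosine part of the transform is needed.
    """
    chi = np.arange(-half_width, half_width + step / 2, step)
    integrand = np.exp(-np.abs(chi / u) / tau0) * np.cos(2 * np.pi * sigma * chi)
    return step * (np.sum(integrand) - 0.5 * (integrand[0] + integrand[-1]))

sigma = 0.3
lhs = p_nn_numeric(sigma)       # p_nn(sigma), computed from the OPD-based autocorrelation
rhs = u * S_NN(u * sigma)       # u * S_NN(u * sigma), the right-hand side of Eq. (6B.6d)
```

The two values agree to within the accuracy of the numerical integration, which is what the change of variables leading to Eq. (6B.6d) predicts.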

7
MIRROR-MISALIGMENT NEdN IN
DOUBLE-SIDED INTERFEROGRAMS
Unlike the detector noise described in the previous chapter, the misalignment noise in a well-
designed interferometer should be a small source of random error. To design these instruments
properly, ensuring that misalignment noise is likely to be small, we need some way to analyze it.
The formulas derived in Chapters 4 and 5 can handle static interferometer misalignments (that
is, they can handle situations where the alignment does not significantly change during a spectral
measurement), but a more sophisticated approach is needed when the alignment changes rapidly
and randomly. In this chapter we use wide-sense stationary random functions of the type
described in Sec. 3.15 of Chapter 3 to describe how the interferometer's randomly changing
misalignment can contaminate the interference signal. By tracing the contaminated interferogram
through the entire signal-processing chain, including the calibration algorithm, we discover what
the spectral NEdN looks like when the interferometer is dominated by misalignment noise. This
not only produces the formulas needed to design interferometers with insignificant amounts of
random misalignment but it also, when interferometers break down, gives us the information
needed to decide whether unexpectedly large and randomly changing alignment errors are
contributing to the problem.

7.1 Setting Up the Signal Equations
Equation (6.8a) in Chapter 6 specifies the total optical signal presented to the detector at point A
in Fig. 6.2 by the formula

$$z_A^{(tot)}(\chi) = z_A(\chi) + z_A^{(cold)}(\chi).$$

Consulting Eqs. (6.6c) and (6.12d) in Chapter 6, we expand this expression to


( )
2
ma
( ) (back) 2
ma
( ) ( ) ( ) ( ) ( )
4
M( ) ( ) ( ) ( ) ( )
4
M( ) ( ) ( )[ ( ) ( )]
4
tot
A f a
i
f a FOV
fore i
a FOV FOV
A
z d
WA
R e d
WA
R e d
A

L
L
L L

0
2
(back)
0
( ) ( )
det
0
( ) ( ) ( )
2
[2 ( ) ( )] ( ) ( )
2
A ( )
(fore)
a
a
dir dir
d
A
r d
d

+
+
L
L
L

.
(7.1a)

We note that because is even [see Eq. (4.139g) in Chapter 4] that the product

( ) ( ) ( ) ( )
f a
L

is even. According to formula (2.19) in Chapter 2, we then can write that

0
( ) ( ) ( ) ( ) 2 ( ) ( ) ( ) ( )
f a f a
d d

=

L L . (7.1b)

This allows the first and fourth terms on the right-hand side of (7.1a) to be combined into a single
integral,

0
0
( ) ( ) ( ) ( ) ( ) ( ) ( )
4 2
( ) ( )[ ( ) L( ) ( )]
2
(fore)
f a a
(fore)
a f
A A
d d
A
d

= +

L L
L

.
(7.1c)

In a similar way, we can combine the two Fourier transforms in (7.1a) to get

2
ma
( ) (back) 2
ma
( ) (back) 2
ma
M( ) ( ) ( ) ( ) ( )
M( ) ( ) ( )[ ( ) ( )]
M( ) ( ) ( )[ ( ) ( ) ( ) ( )]
i
f a FOV
fore i
a FOV FOV
fore i
a f FOV FOV FOV
R e d
R e d
R e d

+
= +
L
L L
L L L

.
(7.1d)

Equations (7.1c) and (7.1d) can be substituted into (7.1a) to get

( )
( ) (back) 2
ma
0
2
( )
M( ) ( ) ( )[ ( ) ( ) ( ) ( )]
4
( ) ( )[ ( ) ( ) ( )]
2
[2 ( ) (
2
tot
A
fore i
a f FOV FOV FOV
(fore)
a f
z
WA
R e d
A
d
A
r

+ +
L L L
L L

(back) ( ) ( )
det
0 0
)] ( ) ( ) A ( )
dir dir
a
d d

+

L L .
(7.1e)

This is the formula for $z_A^{(tot)}(\chi)$ that we will trace through the signal chain of Fig. 6.2 in Chapter 6.
7.2 Specifying the Random Misalignment Angle of the Moving Mirror
Figure 7.1 specifies the random-angle variables $\tilde{\theta}_x$ and $\tilde{\theta}_y$ used to describe the misalignment of the moving mirror. The total misalignment angle $\theta_{ma}$ in Eq. (7.1e) is now called $\tilde{\theta}$ to show that it too is random. The dashed arrow in Fig. 7.1 shows the orientation of the surface normal when it is misaligned by the random angle

$$\tilde{\theta} = \sqrt{\tilde{\theta}_x^{\,2} + \tilde{\theta}_y^{\,2}},$$

and the bold arrow pointing along the interferometer's optical axis shows the orientation of the moving mirror's surface normal when it is correctly aligned. The formula for the modulation function M used in Eq. (7.1e) and defined in Eq. (5.10c) of Chapter 5 assumes that the beam passing through the interferometer has a circular cross section. In a well-designed interferometer $\tilde{\theta}$ is always small, so in Eq. (5.10c) we can make the approximation$^{102}$

$$\mathrm{M}(\sigma R\tilde{\theta}) = \frac{J_1(4\pi\sigma R\tilde{\theta})}{2\pi\sigma R\tilde{\theta}} \cong 1 - \alpha\,\sigma^2\tilde{\theta}^{\,2} \tag{7.2a}$$

with

$$\alpha = 2\pi^2 R^2. \tag{7.2b}$$
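The small-angle step behind Eq. (7.2a) can be checked numerically. The sketch below is only an illustration: it assumes the circular-beam modulation has the form $2J_1(x)/x$ with $x = 4\pi\sigma R\theta$, evaluates $J_1$ from its power series, and compares the result with the quadratic form $1 - \alpha\sigma^2\theta^2$. The function names and the sample values of wavenumber, beam radius, and tilt are hypothetical choices made for this example.

```python
import math

def bessel_j1(x, terms=25):
    """J1(x) from its ascending power series (adequate for small |x|)."""
    return sum(
        (-1) ** k * (x / 2) ** (2 * k + 1) / (math.factorial(k) * math.factorial(k + 1))
        for k in range(terms)
    )

def modulation_exact(sigma, beam_radius, theta):
    """Assumed circular-beam form 2*J1(x)/x with x = 4*pi*sigma*R*theta."""
    x = 4 * math.pi * sigma * beam_radius * theta
    return 1.0 if x == 0 else 2 * bessel_j1(x) / x

def modulation_small_angle(sigma, beam_radius, theta):
    """Quadratic approximation 1 - alpha*sigma^2*theta^2 with alpha = 2*pi^2*R^2."""
    alpha = 2 * math.pi ** 2 * beam_radius ** 2
    return 1 - alpha * sigma ** 2 * theta ** 2

sigma = 1000.0      # wavenumber in cm^-1 (illustrative value)
beam_radius = 1.0   # beam radius in cm (illustrative value)
theta = 1.0e-5      # tilt angle in radians (illustrative value)

exact = modulation_exact(sigma, beam_radius, theta)
approx = modulation_small_angle(sigma, beam_radius, theta)
print(exact, approx, exact - approx)
```

For these sample values the two expressions differ only in the next term of the series, which is of order $x^4/192$.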

FIGURE 7.1. The z axis is the correctly aligned normal vector of the mirror surface and the dashed arrow is the misaligned normal vector. The x and y axes show the orientation of the $\tilde{\theta}_x(\chi)$ and $\tilde{\theta}_y(\chi)$ components of the total $\tilde{\theta}(\chi)$ misalignment angle.

102 Handbook of Mathematical Functions, edited by Abramowitz and Stegun, see formula (9.1.10), p. 360.
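The random angles $\tilde{\theta}_x$ and $\tilde{\theta}_y$ can take on both positive and negative values, but random angle $\tilde{\theta}$ can never be negative. All three angles $\tilde{\theta}_x$, $\tilde{\theta}_y$, and $\tilde{\theta}$ can be treated as random functions of the OPD value $\chi$, giving us

$$\tilde{\theta}(\chi) = \sqrt{\tilde{\theta}_x^{\,2}(\chi) + \tilde{\theta}_y^{\,2}(\chi)}. \tag{7.2c}$$

By making these angles stationary random functions of $\chi$, we can analyze what happens to the interferometer signal when $\tilde{\theta}_x$ and $\tilde{\theta}_y$ change randomly with OPD while the moving mirror is in motion.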
Using stationary random functions to represent angles $\tilde{\theta}_x$, $\tilde{\theta}_y$, and $\tilde{\theta}$ is an obvious approach when the misalignment is driven by outside disturbances (when, for example, interferometers are operated in high-vibration environments). In this sort of situation, we expect $\tilde{\theta}_x(\chi)$, $\tilde{\theta}_y(\chi)$, and $\tilde{\theta}(\chi)$ to be at least wide-sense stationary and weakly ergodic (these types of random quantities are discussed in Secs. 3.15 and 3.18 of Chapter 3). In low-vibration environments, however, there may well be a tendency for the interferometer's own motion (it does, after all, have a moving mirror) to excite internal resonances that disturb the alignment. When this happens, the misalignment may well be preferentially large at certain $\chi$ values. Although at first glance it may seem that $\tilde{\theta}_x$, $\tilde{\theta}_y$, and $\tilde{\theta}$ must now be nonstationary random functions, we can instead, remembering the discussion following Eq. (3.47a) in Chapter 3, say that $\tilde{\theta}_x$ and $\tilde{\theta}_y$ are still stationary but nonergodic. Before the instrument is built, it is very difficult to know at what $\chi$ values the random quantities $\tilde{\theta}_x$, $\tilde{\theta}_y$, and $\tilde{\theta}$ have a greater chance of taking on large values. Hence, in our ignorance, while designing the instrument, we can treat these angles as equally likely to be large or small at any $\chi$ value; that is, we say that $\tilde{\theta}_x$, $\tilde{\theta}_y$, and $\tilde{\theta}$ are stationary. Building the interferometer then corresponds to choosing specific angle functions from the ensemble of allowed functions, as discussed in Sec. 3.14 in Chapter 3. If the angle function turns out to be preferentially large at some $\chi$ values, this just means that a nonergodic member function of the ensemble has been chosen. So even in a low-vibration environment we can still, while designing the interferometer, regard $\tilde{\theta}_x(\chi)$, $\tilde{\theta}_y(\chi)$, and $\tilde{\theta}(\chi)$ as wide-sense stationary random functions.
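Now that we have decided to treat $\tilde{\theta}_x(\chi)$ and $\tilde{\theta}_y(\chi)$ as wide-sense stationary random functions, we note that $\tilde{\theta}_x$ and $\tilde{\theta}_y$ are usually zero-mean random variables, which means that their expectation values are zero:

$$\mathrm{E}\big(\tilde{\theta}_x(\chi)\big) = 0 \tag{7.2d}$$

and

$$\mathrm{E}\big(\tilde{\theta}_y(\chi)\big) = 0. \tag{7.2e}$$

Some interferometers, however, have a bias tilt angle $\mu$, which is the same thing as saying that $\mathrm{E}(\tilde{\theta}_x(\chi))$ and $\mathrm{E}(\tilde{\theta}_y(\chi))$ are not both equal to zero. When this happens, the expectation values of $\tilde{\theta}_x(\chi)$ and $\tilde{\theta}_y(\chi)$ are assumed to be independent of $\chi$, and we can orient the x and y axes in Fig. 7.1 so that

$$\mathrm{E}\big(\tilde{\theta}_x(\chi)\big) = \mu \tag{7.2f}$$

and

$$\mathrm{E}\big(\tilde{\theta}_y(\chi)\big) = 0. \tag{7.2g}$$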

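Note that when $\mu = 0$, these equations reduce to the previous formulas in (7.2d) and (7.2e). To analyze mirror-misalignment noise both with and without bias tilt, we say that the probability density distribution characterizing the behavior of $\tilde{\theta}_x$ at all values of $\chi$ has a mean of $\mu$ and that the probability density distribution characterizing the behavior of $\tilde{\theta}_y$ at all values of $\chi$ has a mean of zero. We assume that the probability density distributions for $\tilde{\theta}_x$ and $\tilde{\theta}_y$ are normal and have standard deviations $\psi_x$ and $\psi_y$ respectively. These two normal distributions can then be written as

$$p_x(\theta_x) = \frac{1}{\psi_x\sqrt{2\pi}}\; e^{-(\theta_x - \mu)^2/(2\psi_x^2)} \tag{7.2h}$$

and

$$p_y(\theta_y) = \frac{1}{\psi_y\sqrt{2\pi}}\; e^{-\theta_y^2/(2\psi_y^2)}. \tag{7.2i}$$

Here $p_x(\theta_x)\,d\theta_x$ is the probability that $\tilde{\theta}_x$ lies between $\theta_x$ and $\theta_x + d\theta_x$, and $p_y(\theta_y)\,d\theta_y$ is the probability that $\tilde{\theta}_y$ lies between $\theta_y$ and $\theta_y + d\theta_y$. Having used Eqs. (7.2h) and (7.2i) to set up the $\tilde{\theta}_x$ and $\tilde{\theta}_y$ distributions, it can be shown that if

$$\psi_x = \psi_y = \psi$$

and $\tilde{\theta}_x$, $\tilde{\theta}_y$ are independent, then $\tilde{\theta}$ in Eq. (7.2c) must obey the probability density distribution$^{103}$

$$p_\theta(\theta) = \frac{\theta}{\psi^2}\; I_0\!\left(\frac{\mu\,\theta}{\psi^2}\right) e^{-(\theta^2 + \mu^2)/(2\psi^2)}, \tag{7.2j}$$

where

$$I_0(x) = \frac{1}{2\pi}\int_0^{2\pi} e^{x\cos\phi}\, d\phi \tag{7.2k}$$

is a modified Bessel function of order zero.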
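As a rough sanity check on a density of this kind, the sketch below integrates a Rice-type distribution numerically and confirms that it is normalized and that its mean square equals the bias squared plus twice the variance. It assumes the density has the form $(\theta/\psi^2)\,I_0(\mu\theta/\psi^2)\,e^{-(\theta^2+\mu^2)/(2\psi^2)}$; the names `mu` and `psi` and the numerical values (in arbitrary angle units) are choices made for this example only.

```python
import numpy as np

def rice_pdf(theta, mu, psi):
    """Assumed density of the total tilt angle for equal standard deviations psi
    and bias mu; np.i0 is the modified Bessel function of order zero."""
    return (theta / psi**2) * np.i0(mu * theta / psi**2) * np.exp(
        -(theta**2 + mu**2) / (2 * psi**2)
    )

mu = 20.0    # bias tilt, arbitrary angle units (illustrative)
psi = 10.0   # common standard deviation, same units (illustrative)

# Integrate from 0 out to ten standard deviations beyond the bias.
theta = np.linspace(0.0, mu + 10 * psi, 20001)
dtheta = theta[1] - theta[0]
p = rice_pdf(theta, mu, psi)

# The density vanishes at both ends of the grid, so a plain sum
# is equivalent to the trapezoid rule here.
norm = np.sum(p) * dtheta
mean_square = np.sum(theta**2 * p) * dtheta

print(norm, mean_square, mu**2 + 2 * psi**2)
```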
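Since the statistics of $\tilde{\theta}_x$, $\tilde{\theta}_y$, and $\tilde{\theta}$ do not change with $\chi$, the random functions $\tilde{\theta}_x(\chi)$, $\tilde{\theta}_y(\chi)$, and $\tilde{\theta}(\chi)$ are (speaking somewhat loosely) equally likely to take on the same values at any position of the interferometer's moving mirror. This means that the average or mean squared values of $\tilde{\theta}_x$, $\tilde{\theta}_y$, and $\tilde{\theta}$ are $\chi$-independent constants. Equations (7A.5a) and (7A.5c) in Appendix 7A then show that, with $\mu$ the bias tilt angle and $\psi_x$, $\psi_y$ the standard deviations of $\tilde{\theta}_x$ and $\tilde{\theta}_y$,

$$\mathrm{E}\big(\tilde{\theta}_x^{\,2}(\chi)\big) = \psi_x^2 + \mu^2 \tag{7.3a}$$

and

$$\mathrm{E}\big(\tilde{\theta}_y^{\,2}(\chi)\big) = \psi_y^2. \tag{7.3b}$$

We define $\theta_{rms}^2$ to be the $\chi$-independent constant equal to $\mathrm{E}(\tilde{\theta}^{\,2}(\chi))$,

$$\theta_{rms}^2 = \mathrm{E}\big(\tilde{\theta}^{\,2}(\chi)\big). \tag{7.3c}$$

Squaring (7.2c) and taking the expectation value of both sides gives, after applying Eq. (3.16a) in Chapter 3,

$$\theta_{rms}^2 = \mathrm{E}\big(\tilde{\theta}^{\,2}(\chi)\big) = \mathrm{E}\big(\tilde{\theta}_x^{\,2}(\chi) + \tilde{\theta}_y^{\,2}(\chi)\big) = \mathrm{E}\big(\tilde{\theta}_x^{\,2}(\chi)\big) + \mathrm{E}\big(\tilde{\theta}_y^{\,2}(\chi)\big).$$

Substituting from (7.3a) and (7.3b) gives

$$\theta_{rms}^2 = \mathrm{E}\big(\tilde{\theta}^{\,2}(\chi)\big) = \psi_x^2 + \psi_y^2 + \mu^2. \tag{7.3d}$$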

103 A. Papoulis, Probability, Random Variables, and Stochastic Processes, p. 140.
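When both $\tilde{\theta}_x$ and $\tilde{\theta}_y$ have the same standard deviation, with

$$\psi_x = \psi_y = \psi,$$

then Eq. (7.3d) becomes

$$\theta_{rms}^2 = \mathrm{E}\big(\tilde{\theta}^{\,2}(\chi)\big) = 2\psi^2 + \mu^2,$$

which can be solved for $\psi^2$ to get

$$\psi^2 = \frac{\theta_{rms}^2 - \mu^2}{2}. \tag{7.3e}$$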

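For future use, we derive the value of $\mathrm{E}(\tilde{\theta}^{\,4}(\chi))$. Taking the fourth power of both sides of Eq. (7.2c) and taking the expectation value gives [again using Eq. (3.16a) in Chapter 3],

$$\begin{aligned}
\mathrm{E}\big(\tilde{\theta}^{\,4}(\chi)\big) &= \mathrm{E}\Big(\big[\tilde{\theta}_x^{\,2}(\chi) + \tilde{\theta}_y^{\,2}(\chi)\big]^2\Big) = \mathrm{E}\big(\tilde{\theta}_x^{\,4}(\chi) + \tilde{\theta}_y^{\,4}(\chi) + 2\,\tilde{\theta}_x^{\,2}(\chi)\,\tilde{\theta}_y^{\,2}(\chi)\big) \\
&= \mathrm{E}\big(\tilde{\theta}_x^{\,4}(\chi)\big) + \mathrm{E}\big(\tilde{\theta}_y^{\,4}(\chi)\big) + 2\,\mathrm{E}\big(\tilde{\theta}_x^{\,2}(\chi)\,\tilde{\theta}_y^{\,2}(\chi)\big).
\end{aligned}$$

Assuming that $\tilde{\theta}_x(\chi)$ and $\tilde{\theta}_y(\chi)$ are independent random variables (which, of course, means that $\tilde{\theta}_x^{\,2}(\chi)$ and $\tilde{\theta}_y^{\,2}(\chi)$ are also independent), we can write that [see formula (3.12c) in Chapter 3]

$$\mathrm{E}\big(\tilde{\theta}^{\,4}(\chi)\big) = \mathrm{E}\big(\tilde{\theta}_x^{\,4}(\chi)\big) + \mathrm{E}\big(\tilde{\theta}_y^{\,4}(\chi)\big) + 2\,\mathrm{E}\big(\tilde{\theta}_x^{\,2}(\chi)\big)\,\mathrm{E}\big(\tilde{\theta}_y^{\,2}(\chi)\big). \tag{7.4a}$$

From Eqs. (7A.5b) and (7A.5d) in Appendix 7A, we have, with $\mu$ the bias tilt angle and $\psi_x$, $\psi_y$ the standard deviations of $\tilde{\theta}_x$ and $\tilde{\theta}_y$,

$$\mathrm{E}\big(\tilde{\theta}_x^{\,4}(\chi)\big) = 3\psi_x^4 + 6\mu^2\psi_x^2 + \mu^4 \tag{7.4b}$$

and

$$\mathrm{E}\big(\tilde{\theta}_y^{\,4}(\chi)\big) = 3\psi_y^4. \tag{7.4c}$$

Substitution of Eqs. (7.3a), (7.3b), (7.4b), and (7.4c) into (7.4a) gives

$$\begin{aligned}
\mathrm{E}\big(\tilde{\theta}^{\,4}(\chi)\big) &= 3\psi_x^4 + 6\mu^2\psi_x^2 + \mu^4 + 3\psi_y^4 + 2(\psi_x^2 + \mu^2)\,\psi_y^2 \\
&= 3(\psi_x^4 + \psi_y^4) + 2\mu^2(3\psi_x^2 + \psi_y^2) + 2\psi_x^2\psi_y^2 + \mu^4.
\end{aligned} \tag{7.4d}$$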
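When $\tilde{\theta}_x$ and $\tilde{\theta}_y$ have the same standard deviation $\psi_x = \psi_y = \psi$, this reduces to

$$\mathrm{E}\big(\tilde{\theta}^{\,4}(\chi)\big) = 8\psi^4 + 8\mu^2\psi^2 + \mu^4. \tag{7.4e}$$

Substituting Eq. (7.3e) into (7.4e) gives

$$\begin{aligned}
\mathrm{E}\big(\tilde{\theta}^{\,4}(\chi)\big) &= 8\left(\frac{\theta_{rms}^2 - \mu^2}{2}\right)^{2} + 8\mu^2\left(\frac{\theta_{rms}^2 - \mu^2}{2}\right) + \mu^4 \\
&= 2(\theta_{rms}^4 - 2\mu^2\theta_{rms}^2 + \mu^4) + 4\mu^2\theta_{rms}^2 - 4\mu^4 + \mu^4.
\end{aligned}$$

Thus we have, simplifying the right-hand side, that

$$\mathrm{E}\big(\tilde{\theta}^{\,4}(\chi)\big) = 2\theta_{rms}^4 - \mu^4 \tag{7.4f}$$

when $\tilde{\theta}_x$ and $\tilde{\theta}_y$ are independent and obey normal distributions having the same standard deviation.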
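The second- and fourth-moment results can be spot-checked by simulation. The sketch below is a Monte Carlo illustration, not part of the derivation: it draws independent normal tilt components with a bias on the x component, forms the squared total angle, and compares the sample moments with $\theta_{rms}^2 = 2\psi^2 + \mu^2$ and $2\theta_{rms}^4 - \mu^4$. The variable names, the seed, the sample size, and the numerical values (arbitrary angle units) are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

mu = 2.0       # bias tilt of the x component (illustrative)
psi = 1.0      # common standard deviation of both components (illustrative)
n_samples = 400_000

theta_x = rng.normal(mu, psi, n_samples)
theta_y = rng.normal(0.0, psi, n_samples)
theta_sq = theta_x**2 + theta_y**2          # squared total tilt angle

theta_rms_sq = 2 * psi**2 + mu**2           # predicted E(theta^2)
fourth_pred = 2 * theta_rms_sq**2 - mu**4   # predicted E(theta^4)

second_mc = theta_sq.mean()
fourth_mc = (theta_sq**2).mean()

print(second_mc, theta_rms_sq)
print(fourth_mc, fourth_pred)
```

The same prediction follows from the expanded form $8\psi^4 + 8\mu^2\psi^2 + \mu^4$, which gives an independent check on the algebra.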
7.3 $\chi$-Based Signal Contaminated by Mirror-Misalignment Noise
When random misalignment of the moving mirror is the primary source of noise, Eq. (7.1e) above with $\theta_{ma}$ replaced by $\tilde{\theta}(\chi)$ is (as was pointed out at the beginning of the previous section) the formula for the noise-contaminated signal,

( )
( )
( ) (back) 2
0
2
( )
M ( ) ( ) ( )[ ( ) ( ) ( ) ( )]
4
( ) ( )[ ( ) ( ) ( )]
2
[2 ( ) (
2
tot
AN
fore i
a f FOV FOV FOV
(fore)
a f
z
WA
R e d
A
d
A
r

+ +
L L L
L L

(back) ( ) ( )
det
0 0
)] ( ) ( ) A ( )
dir dir
a
d d

+

L L .
(7.5a)

In this chapter, the random function $\tilde{z}_{AN}^{(tot)}$ represents the total signal contaminated by mirror-misalignment noise at point A in Fig. 6.2 of Chapter 6. The AN subscript and (tot) superscript remind us that $\tilde{z}_{AN}^{(tot)}$ is the noise-contaminated total signal at point A, and the tilde shows that $\tilde{\theta}(\chi)$ turns $\tilde{z}_{AN}^{(tot)}$ into a random function of $\chi$. To get the detector signal generated by all the optical power hitting the detector, we insert the detector responsivity $\mathcal{R}$ into the integrals on the right-hand side of (7.5a):

( )
( )
( ) (back) 2
0
R
R
( )
( )M ( ) ( ) ( )[ ( ) ( ) ( ) ( )]
4
( ) ( ) ( )[ ( ) ( ) ( )]
2
tot
BN
fore i
a f FOV FOV FOV
(fore)
a f
z
WA
R e d
A
d
A

+ +
+
L L L
L L

2
(back) ( ) ( )
det
0 0
R R [2 ( ) ( )] ( ) ( ) ( ) A ( ) ( )
2
dir dir
a
r d d

+

L L .
(7.5b)

Here $\tilde{z}_{BN}^{(tot)}$ represents the total signal contaminated by mirror-misalignment noise at point B in Fig. 6.2. Traditionally the responsivity $\mathcal{R}(\sigma)$ is defined only for positive wavenumber arguments, so inside the first integral on the right-hand side the argument of $\mathcal{R}$ has absolute value signs to make $\mathcal{R}$ well-defined for negative $\sigma$ values.
Equation (7.2a) can be substituted into (7.5b) to get

( )
( ) (back) 2
0
2
R
R
( )
( ) ( ) ( )[ ( ) ( ) ( ) ( )]
4
( ) ( ) ( )[ ( ) ( ) ( )]
2
[2 ( ) (
2
tot
BN
fore i
a f FOV FOV FOV
(fore)
a f
z
WA
e d
A
d
A
r

+ +
L L L
L L

(back) ( ) ( )
det
0 0
2 2 ( ) (back) 2
R R
R
)] ( ) ( ) ( ) A ( ) ( )
( ) ( ) ( ) ( )[ ( ) ( ) ( ) ( )] .
4
dir dir
a
fore i
a f FOV FOV FOV
d d
WA
e d

L L
L L L

a
(7.6a)

Adding and subtracting

2 2 ( ) (back) 2
R( ) ( ) ( )[ ( ) ( ) ( ) ( )]
4
fore i
rms a f FOV FOV FOV
WA
e d

L L L a

on the right-hand side of (7.6a) gives
( )
2 2 ( ) (back) 2
0
2
R
R
R
( )
4
(1 ) ( ) ( ) ( )[ ( ) ( ) ( ) ( )]
( ) ( ) ( )[ ( ) ( ) ( )]
2
[2 ( ) ( )] (
2
tot
BN
fore i
rms a f FOV FOV FOV
(fore)
a f
z
WA
e d
A
d
A
r

+

+ +

+

L L L
L L

a
(back) ( ) ( )
det
0 0
2 2
2 ( ) (back) 2
R
R
) ( ) ( ) A ( ) ( )
[ ( ) ]
4
( ) ( ) ( )[ ( ) ( ) ( ) ( )] .
dir dir
a
rms
fore i
a f FOV FOV FOV
d d
WA
e d

+

+

L L
L L L

a
(7.6b)
Again we apply Eq. (7.2a) above to get, since $\theta_{rms}$ is a small angle,

$$\mathrm{M}(\sigma R\theta_{rms}) = 1 - \alpha\,\sigma^2\theta_{rms}^2.$$

This lets us write (7.6b) as
( )
( ) (back) 2
rms
0
2
R
R
R
( )
4
M( ) ( ) ( ) ( )[ ( ) ( ) ( ) ( )]
( ) ( ) ( )[ ( ) ( ) ( )]
2
[2 ( ) ( )] ( ) (
2
tot
BN
fore i
a f FOV FOV FOV
(fore)
a f
a
z
WA
R e d
A
d
A
r

+

+ +

+

L L L
L L

(back) ( ) ( )
det
0 0
2 2
2 ( ) (back) 2
R
R
) ( ) A ( ) ( )
[ ( ) ]
4
( ) ( ) ( )[ ( ) ( ) ( ) ( )] .
dir dir
rms
fore i
a f FOV FOV FOV
d d
WA
e d

+

+

L L
L L L

a
(7.7a)
Now by defining

( ) (back)
R ( ) ( ) ( ) ( )[ ( ) ( ) ( ) ( )]
4
fore
FOV a f FOV FOV FOV
WA

= +

Z L L L , (7.7b)

we can write Eq. (7.7a) as

( ) 2
0
2
(back)
0
( )
det
R
R
R
( ) M( ) ( )
( ) ( ) ( )[ ( ) ( ) ( )]
2
[2 ( ) ( )] ( ) ( ) ( )
2
A
tot i
BN rms FOV
(fore)
a f
a
dir
z R e d
A
d
A
r d

=

+ +

+

+
Z
L L
L

( )
0
2 2 2 2
( ) ( )
[ ( ) ] ( )
dir
i
rms FOV
d
e d

L
Z
. a
(7.7c)

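The formula for $\tilde{z}_{BN}^{(tot)}$ can be cleaned up some more by defining function $W(\chi)$ to be

$$W(\chi) = \int_{-\infty}^{\infty} \sigma^2\, Z_{FOV}(\sigma)\, e^{2\pi i\sigma\chi}\, d\sigma \tag{7.8a}$$

and also defining a new random function $\tilde{n}^{(\theta 2)}(\chi)$,

$$\tilde{n}^{(\theta 2)}(\chi) = \theta_{rms}^2 - \tilde{\theta}^{\,2}(\chi). \tag{7.8b}$$

According to the definition of $\theta_{rms}^2$ in Eq. (7.3c), $\tilde{n}^{(\theta 2)}(\chi)$ can also be written as

$$\tilde{n}^{(\theta 2)}(\chi) = \mathrm{E}\big(\tilde{\theta}^{\,2}(\chi)\big) - \tilde{\theta}^{\,2}(\chi). \tag{7.8c}$$

We note that, using the linearity of operator E described in Sec. 3.10 of Chapter 3,

$$\mathrm{E}\big(\tilde{n}^{(\theta 2)}(\chi)\big) = \mathrm{E}\Big(\mathrm{E}\big(\tilde{\theta}^{\,2}(\chi)\big) - \tilde{\theta}^{\,2}(\chi)\Big) = \mathrm{E}\Big(\mathrm{E}\big(\tilde{\theta}^{\,2}(\chi)\big)\Big) - \mathrm{E}\big(\tilde{\theta}^{\,2}(\chi)\big).$$

Since $\mathrm{E}(\tilde{\theta}^{\,2}(\chi))$ is a nonrandom quantity, Eq. (3.9f) of Chapter 3 requires that

$$\mathrm{E}\Big(\mathrm{E}\big(\tilde{\theta}^{\,2}(\chi)\big)\Big) = \mathrm{E}\big(\tilde{\theta}^{\,2}(\chi)\big),$$

from which it follows that

$$\mathrm{E}\big(\tilde{n}^{(\theta 2)}(\chi)\big) = \mathrm{E}\Big(\mathrm{E}\big(\tilde{\theta}^{\,2}(\chi)\big)\Big) - \mathrm{E}\big(\tilde{\theta}^{\,2}(\chi)\big) = \mathrm{E}\big(\tilde{\theta}^{\,2}(\chi)\big) - \mathrm{E}\big(\tilde{\theta}^{\,2}(\chi)\big) = 0. \tag{7.8d}$$

Hence, $\tilde{n}^{(\theta 2)}(\chi)$ is a zero-mean random function. Since $\tilde{n}^{(\theta 2)}(\chi)$ is just the square of $\tilde{\theta}(\chi)$ subtracted from a constant, and the statistics of $\tilde{\theta}(\chi)$ do not depend on $\chi$, we expect that the statistics of $\tilde{n}^{(\theta 2)}(\chi)$ also do not depend on $\chi$. Consequently, we now assume that $\tilde{n}^{(\theta 2)}(\chi)$ is at least wide-sense stationary with respect to $\chi$ [see Eqs. (3.30a) and (3.30b) and the discussion following them for a description of what this means]. Substituting (7.8a) and (7.8b) into (7.7c) gives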

( ) 2
0
2
(back)
0
( )
det
R
R
R
( ) M( ) ( )
( ) ( ) ( )[ ( ) ( ) ( )]
2
[2 ( ) ( )] ( ) ( ) ( )
2
A
tot i
BN rms FOV
(fore)
a f
a
dir
z R e d
A
d
A
r d

=

+ +

+

+
Z
L L
L

( )
0
( 2)
( ) ( )
( ) ( )
dir
d
n W
L
. a
(7.8e)

The first four terms on the right-hand side are all nonrandom, so it makes sense to write (7.8e) as

$$\tilde{z}_{BN}^{(tot)}(\chi) = z_B^{(tot)}(\chi) + \alpha\,\tilde{n}^{(\theta 2)}(\chi)\,W(\chi), \tag{7.8f}$$

where

( ) 2
0
2
(back)
0
( )
det
R
R
R
( ) M( ) ( )
( ) ( ) ( )[ ( ) ( ) ( )]
2
[2 ( ) ( )] ( ) ( ) ( )
2
A (
tot i
B rms FOV
(fore)
a f
a
dir
z R e d
A
d
A
r d

=

+ +

+

+
Z
L L
L

( )
0
) ( )
dir
d
L .
(7.8g)

Substituting for $Z_{FOV}(\sigma)$ from (7.7b) lets the formula for $z_B^{(tot)}$ be written as
( )
( ) (back) 2
0
R
R
( )
M( ) ( ) ( ) ( )[ ( ) ( ) ( ) ( )]
4
( ) ( ) ( )[ ( ) ( ) ( )]
2
[2
2
tot
B
fore i
rms a f FOV FOV FOV
(fore)
a f
z
WA
R e d
A
d
A

+ +
L L L
L L

2
(back) ( ) ( )
det
0 0
R R ( ) ( )] ( ) ( ) ( ) A ( ) ( )
dir dir
a
r d d

+

L L .
(7.8h)

Comparing this latest expression to the formula for $z_A^{(tot)}(\chi)$ in Eq. (7.1e), we note that $z_A^{(tot)}(\chi)$ turns into $z_B^{(tot)}(\chi)$ if we insert the responsivity $\mathcal{R}$ into all the integrals of (7.1e) and also set

$$\theta_{ma} = \theta_{rms} \tag{7.8i}$$

in the modulation term M. This correspondence justifies the $z_B^{(tot)}$ label given to the sum of the four nonrandom terms in (7.8e) above, because this term looks like what the noise-free signal $z_A^{(tot)}$ at point A in Fig. 6.2 would become as it leaves the detector at point B, provided we say that $\theta_{rms}$ is the effective value of the moving mirror's constant misalignment angle. After using both the linearity of E described in Sec. 3.10 of Chapter 3 and Eq. (3.9f) from that same chapter, we apply the expectation operator E to both sides of Eq. (7.8f) to get

$$\mathrm{E}\big(\tilde{z}_{BN}^{(tot)}(\chi)\big) = \mathrm{E}\big(z_B^{(tot)}(\chi)\big) + \alpha\,\mathrm{E}\big(\tilde{n}^{(\theta 2)}(\chi)\,W(\chi)\big) = z_B^{(tot)}(\chi) + \alpha\,W(\chi)\,\mathrm{E}\big(\tilde{n}^{(\theta 2)}(\chi)\big).$$

This becomes, using Eq. (7.8d),

$$\mathrm{E}\big(\tilde{z}_{BN}^{(tot)}(\chi)\big) = z_B^{(tot)}(\chi). \tag{7.8j}$$

Hence $z_B^{(tot)}(\chi)$ is the expectation value, or average value, of the noise-contaminated signal leaving the detector. It is the $\chi$-based signal we get after averaging together many independent measurements of the same spectral radiance to reduce the mirror-misalignment noise to negligible levels.
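The effect of averaging can be illustrated with a toy simulation. The sketch below assumes a signal of the form "noise-free part plus $\alpha$ times a zero-mean noise term times a deterministic weight", with the noise term built as $\theta_{rms}^2 - \tilde{\theta}^{\,2}$ from zero-bias normal tilt components that are drawn independently at every OPD sample (a simplification; real misalignment would be correlated along the scan). All arrays, names, and numerical values are hypothetical and carry no physical units.

```python
import numpy as np

rng = np.random.default_rng(1)

chi = np.linspace(-1.0, 1.0, 512)                  # OPD grid (arbitrary units)
z_clean = np.cos(2 * np.pi * 5 * chi)              # stand-in for the noise-free signal
weight = 1.0 + 0.5 * np.cos(2 * np.pi * 3 * chi)   # stand-in for W(chi)
alpha = 0.3                                        # coupling constant (illustrative)
psi = 1.0                                          # tilt standard deviation (illustrative)
theta_rms_sq = 2 * psi**2                          # zero bias tilt assumed

def measure():
    """One noisy scan: clean signal plus alpha * (theta_rms^2 - theta^2) * weight."""
    theta_x = rng.normal(0.0, psi, chi.size)
    theta_y = rng.normal(0.0, psi, chi.size)
    noise = theta_rms_sq - (theta_x**2 + theta_y**2)   # zero-mean by construction
    return z_clean + alpha * noise * weight

single = measure()
average = np.mean([measure() for _ in range(400)], axis=0)

err_single = np.sqrt(np.mean((single - z_clean) ** 2))
err_average = np.sqrt(np.mean((average - z_clean) ** 2))
print(err_single, err_average)
```

With 400 independent scans the residual error should fall by roughly a factor of 20, the usual square-root-of-N behavior for zero-mean noise.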
7.4 Misalignment Noise and the Detector Circuit (or Anti-Aliasing Filter)
To get the noise-contaminated signal through the detector circuit (which contains the anti-aliasing filter) to point C in Fig. 6.2 in Chapter 6, we convert the noise-contaminated signal into a function of time. Using the notation of Eq. (5.41a) in Chapter 5 and Eq. (6.4) of Chapter 6, we write

$$t = \frac{\chi}{u}, \tag{7.9a}$$

where u is the constant, positive OPD velocity (that is, the constant time rate of change of the OPD value $\chi$). For the interferometer in Fig. 6.2, if v is the constant physical velocity of the moving mirror, then $u = 2v$. The $t = 0$ origin of the time coordinate is chosen to coincide with the OPD value $\chi = 0$. The time-based signal at point B can be written as, using (7.9a) to replace $\chi$ by $ut$ in Eq. (7.8f),

$$\tilde{z}_{BN}^{(tot)}(ut) = z_B^{(tot)}(ut) + \alpha\,\tilde{n}^{(\theta 2)}(ut)\,W(ut). \tag{7.9b}$$

To find the time-based output signal $s_o(t)$ leaving the detector circuit, we apply the standard linear-circuit formula

$$s_o(t) = h(t) * s_i(t), \tag{7.10a}$$

where $*$ is the convolution operator defined in Eq. (2.38a) of Chapter 2, $s_i(t)$ is the input signal entering the detector circuit, and h(t) is the real-valued impulse-response function of the detector circuit including the anti-aliasing filter.$^{104}$ Equation (7.10a) can be written as

$$h(t) * s_i(t) = \int_{-\infty}^{\infty} h(t')\, s_i(t - t')\, dt'. \tag{7.10b}$$

104 See Appendix 5A of Chapter 5 for more discussion of the impulse-response function and the implications of Eq. (7.10a) relating the input and output signals of the detector circuit.
We know that the detector circuit (and anti-aliasing filter) has a transfer function H(f) such that h and H are a Fourier-transform pair,

$$\mathrm{H}(f) = \int_{-\infty}^{\infty} h(t)\, e^{-2\pi i f t}\, dt \tag{7.10c}$$

and

$$h(t) = \int_{-\infty}^{\infty} \mathrm{H}(f)\, e^{2\pi i f t}\, df. \tag{7.10d}$$

According to Eq. (5A.6b) in Appendix 5A of Chapter 5, the transfer function H is Hermitian,

$$\mathrm{H}(-f) = \mathrm{H}(f)^{*}. \tag{7.10e}$$

The superscript $*$ indicates that $\mathrm{H}(f)^{*}$ is the complex conjugate of H(f). As explained in Appendix 5A, formula (7.10e) holds true for any Fourier transform of a real function h. The detector circuit is AC coupled to the detector, which means, according to Eq. (5.46d) in Chapter 5, that

$$\mathrm{H}(0) = 0.$$

Substituting from (7.10c) with $f = 0$ then leads to

$$\mathrm{H}(0) = \int_{-\infty}^{\infty} h(t)\, dt = 0. \tag{7.10f}$$

An immediate consequence of Eq. (7.10f) and the definition of the convolution in Eq. (2.38a) in Chapter 2 is that, for any time-independent constant K,

$$h(t) * K = K \int_{-\infty}^{\infty} h(t')\, dt' = 0. \tag{7.10g}$$

From Eq. (6.21a) in Chapter 6, we know that, for a relatively small time value $\mathcal{T}$,

$$h(t) \cong 0 \quad \text{for all } |t| > \mathcal{T}. \tag{7.10h}$$

Equations (7.9b) and (7.10a) can now be combined to get the time-based output signal of the detector circuit (and anti-aliasing filter), which we decide to call $\tilde{s}_{CN}^{(tot)}(t)$,

$$\tilde{s}_{CN}^{(tot)}(t) = h(t) * \big[z_B^{(tot)}(ut) + \alpha\,\tilde{n}^{(\theta 2)}(ut)\,W(ut)\big]. \tag{7.11a}$$
Substitution from (7.8g) shows that $z_B^{(tot)}(ut)$ has many constant (that is, time-independent) terms. Gathering together all the constant terms inside a pair of braces { }, we use the linearity of the convolution [see Eq. (2.38d) in Chapter 2] to write

( ) 2
0
2
(back)
R
R
( ) ( ) M( ) ( )
( ) ( ) ( ) ( )[ ( ) ( ) ( )]
2
[2 ( ) ( )] ( ) ( ) (
2
tot i ut
CN rms FOV
(fore)
a f
a
s t h t R e d
A
h t d
A
r

+ +
Z
L L
L

0
( ) ( )
det
0
( 2)
R
)
A ( ) ( )
( ) [ ( ) ( )]
dir dir
d
d
h t n ut W ut

. a

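According to Eq. (7.10g), the convolution with the constant terms is zero, leaving us with

$$\tilde{s}_{CN}^{(tot)}(t) = h(t) * \int_{-\infty}^{\infty} \mathrm{M}(\sigma R\theta_{rms})\, Z_{FOV}(\sigma)\, e^{2\pi i\sigma u t}\, d\sigma \;+\; \alpha\, h(t) * \big[\tilde{n}^{(\theta 2)}(ut)\,W(ut)\big]. \tag{7.11b}$$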
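The way an AC-coupled circuit removes constant terms can be seen in a small discrete example. The sketch below is an illustration under stated assumptions: the impulse response is a made-up odd function (so its samples sum to zero, the discrete analogue of a transfer function that vanishes at zero frequency), and the input is a constant offset plus a cosine. Neither the filter shape nor the numbers come from the text.

```python
import numpy as np

dt = 1.0e-3                                  # sample spacing in seconds (illustrative)
t_h = np.arange(-50, 51) * dt                # symmetric time grid for the impulse response
h = -t_h * np.exp(-(t_h / 0.01) ** 2)        # hypothetical narrow, odd impulse response
h -= h.mean()                                # enforce a zero sum, i.e. H(0) = 0

t = np.arange(2000) * dt
K = 5.0                                      # constant (time-independent) term
signal = np.cos(2 * np.pi * 20.0 * t)        # time-varying part of the input

# Discrete convolutions approximating the integral form of h * s.
out_const = np.convolve(np.full(t.size, K), h, mode="valid") * dt
out_signal = np.convolve(signal, h, mode="valid") * dt
out_total = np.convolve(K + signal, h, mode="valid") * dt

print(np.max(np.abs(out_const)))                     # essentially zero
print(np.max(np.abs(out_total - out_signal)))        # constant contributes nothing
```

Because convolution is linear, the output for the full input equals the output for the time-varying part alone, which is the step that removes the braces of constant terms.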

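For any time-based convolution such as the one in (7.10a) where

$$s_o(t) = h(t) * s_i(t),$$

Eq. (7.9a) and the formula for the convolution in (7.10b) can be used to convert back to functions of $\chi$,

$$s_o\!\left(\frac{\chi}{u}\right) = \big[h(t) * s_i(t)\big]_{t = \chi/u}$$

or, using that $t' = \chi'/u$,

$$s_o\!\left(\frac{\chi}{u}\right) = \int_{-\infty}^{\infty} h(t')\, s_i\!\left(\frac{\chi}{u} - t'\right) dt' = \frac{1}{u}\int_{-\infty}^{\infty} h\!\left(\frac{\chi'}{u}\right) s_i\!\left(\frac{\chi - \chi'}{u}\right) d\chi' = \frac{1}{u}\, h\!\left(\frac{\chi}{u}\right) * s_i\!\left(\frac{\chi}{u}\right). \tag{7.11c}$$

Applying this rule to Eq. (7.11b) gives

$$\tilde{s}_{CN}^{(tot)}\!\left(\frac{\chi}{u}\right) = \frac{1}{u}\, h\!\left(\frac{\chi}{u}\right) * \int_{-\infty}^{\infty} \mathrm{M}(\sigma R\theta_{rms})\, Z_{FOV}(\sigma)\, e^{2\pi i\sigma\chi}\, d\sigma \;+\; \frac{\alpha}{u}\, h\!\left(\frac{\chi}{u}\right) * \big[\tilde{n}^{(\theta 2)}(\chi)\,W(\chi)\big]$$

so that, deciding to call the $\chi$-based signal $\tilde{z}_{CN}^{(tot)}(\chi)$ instead of $\tilde{s}_{CN}^{(tot)}(\chi/u)$, we can write

$$\tilde{z}_{CN}^{(tot)}(\chi) = u^{-1}\, h\!\left(\frac{\chi}{u}\right) * \int_{-\infty}^{\infty} \mathrm{M}(\sigma R\theta_{rms})\, Z_{FOV}(\sigma)\, e^{2\pi i\sigma\chi}\, d\sigma \;+\; \alpha\, u^{-1}\, h\!\left(\frac{\chi}{u}\right) * \big[\tilde{n}^{(\theta 2)}(\chi)\,W(\chi)\big]. \tag{7.11d}$$

In this chapter, random function $\tilde{z}_{CN}^{(tot)}$ represents the total signal contaminated by mirror-misalignment noise at point C in Fig. 6.2 of Chapter 6.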
7.5 Misalignment Noise in Uncalibrated Spectra of Double-Sided
Signals
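To construct the double-sided signal, we repeat the definition of function $\Pi(\chi, D)$ given in Eq. (4C.1a) in Appendix 4C of Chapter 4 to get

$$\Pi(\chi, D) = \begin{cases} 1 & \text{for } |\chi| \le D \\ 0 & \text{for } |\chi| > D \end{cases}. \tag{7.12a}$$

Any $\chi$-based signal multiplied by $\Pi(\chi, D)$ is left unchanged for OPD values between $+D$ and $-D$ and set to zero for OPD values greater than $D$ or less than $-D$. We now multiply both sides of Eq. (7.11d) by $\Pi(\chi, D)$ to get the double-sided signal at point C in Fig. 6.2 of Chapter 6,

$$\begin{aligned}
\Pi(\chi, D)\,\tilde{z}_{CN}^{(tot)}(\chi) ={}& u^{-1}\,\Pi(\chi, D)\left\{ h\!\left(\frac{\chi}{u}\right) * \int_{-\infty}^{\infty} \mathrm{M}(\sigma R\theta_{rms})\, Z_{FOV}(\sigma)\, e^{2\pi i\sigma\chi}\, d\sigma \right\} \\
&+ \alpha\, u^{-1}\,\Pi(\chi, D)\left\{ h\!\left(\frac{\chi}{u}\right) * \big[\tilde{n}^{(\theta 2)}(\chi)\,W(\chi)\big] \right\}.
\end{aligned} \tag{7.12b}$$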

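The approximation specified in Eq. (7.10h) can be used to simplify the second term on the right-hand side of (7.12b). Because h is a narrow function, the definition of a convolution can be approximated as [see Eqs. (2.38a) and (2.38b) in Chapter 2]

$$\begin{aligned}
\Pi(\chi, D)\left\{ h\!\left(\frac{\chi}{u}\right) * \big[\tilde{n}^{(\theta 2)}(\chi)\,W(\chi)\big] \right\}
&= \Pi(\chi, D)\left\{ \big[\tilde{n}^{(\theta 2)}(\chi)\,W(\chi)\big] * h\!\left(\frac{\chi}{u}\right) \right\} \\
&= \Pi(\chi, D)\int_{-\infty}^{\infty} \tilde{n}^{(\theta 2)}(\chi')\,W(\chi')\, h\!\left(\frac{\chi - \chi'}{u}\right) d\chi' \\
&\cong \Pi(\chi, D)\int_{\chi - u\mathcal{T}}^{\chi + u\mathcal{T}} \tilde{n}^{(\theta 2)}(\chi')\,W(\chi')\, h\!\left(\frac{\chi - \chi'}{u}\right) d\chi'.
\end{aligned} \tag{7.13a}$$

Using the same reasoning as in the discussion following Eq. (6.26a) in Chapter 6, we note that this equation reduces to $0 \cong 0$ when $\chi$ does not lie between $-D$ and $D$. Consequently, the limits on the integral over $d\chi'$ can be replaced by $-(D + u\mathcal{T})$ and $(D + u\mathcal{T})$. When the integral's limits are extended like this, the extra range of integration going from $-(D + u\mathcal{T})$ to $(\chi - u\mathcal{T})$ and from $(\chi + u\mathcal{T})$ to $(D + u\mathcal{T})$ makes only a negligible contribution to the integral due to the smallness of h at these OPD values. Hence we can write

$$\begin{aligned}
\Pi(\chi, D)\left\{ h\!\left(\frac{\chi}{u}\right) * \big[\tilde{n}^{(\theta 2)}(\chi)\,W(\chi)\big] \right\}
&\cong \Pi(\chi, D)\int_{-(D + u\mathcal{T})}^{D + u\mathcal{T}} \tilde{n}^{(\theta 2)}(\chi')\,W(\chi')\, h\!\left(\frac{\chi - \chi'}{u}\right) d\chi' \\
&= \Pi(\chi, D)\int_{-\infty}^{\infty} \Pi(\chi', \mathcal{D})\,\tilde{n}^{(\theta 2)}(\chi')\,W(\chi')\, h\!\left(\frac{\chi - \chi'}{u}\right) d\chi',
\end{aligned} \tag{7.13b}$$

where

$$\mathcal{D} = D + u\mathcal{T}. \tag{7.13c}$$

Referring back to the formula for the convolution of two functions in Eq. (2.38a) of Chapter 2, we see that (7.13b) can be written as, using (2.38b) to reverse the order of the convolution,

$$\Pi(\chi, D)\left\{ h\!\left(\frac{\chi}{u}\right) * \big[\tilde{n}^{(\theta 2)}(\chi)\,W(\chi)\big] \right\} \cong \Pi(\chi, D)\left\{ h\!\left(\frac{\chi}{u}\right) * \big[\Pi(\chi, \mathcal{D})\,\tilde{n}^{(\theta 2)}(\chi)\,W(\chi)\big] \right\}. \tag{7.13d}$$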
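To make our Fourier notation more concise, we start using $\mathcal{F}$, the Fourier-transform operator defined by Eqs. (2.29a) and (2.29c) in Chapter 2. When using this notation

$$\mathcal{F}^{(-i\sigma\chi)}\big(u(\chi)\big) = \int_{-\infty}^{\infty} u(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi \tag{7.14a}$$

is the forward Fourier transform of any transformable function u and

$$\mathcal{F}^{(i\sigma\chi)}\big(v(\sigma)\big) = \int_{-\infty}^{\infty} v(\sigma)\, e^{2\pi i\sigma\chi}\, d\sigma \tag{7.14b}$$

is the reverse Fourier transform of any transformable function v.
To get the uncalibrated spectrum of a double-sided signal contaminated by mirror-misalignment noise, we take the forward Fourier transform of both sides of Eq. (7.12b) to get, using the linearity of the Fourier transform described in Sec. 2.6 of Chapter 2,

$$\begin{aligned}
\mathcal{F}^{(-i\sigma\chi)}\big(\Pi(\chi, D)\,\tilde{z}_{CN}^{(tot)}(\chi)\big) ={}& u^{-1}\,\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi, D)\left\{ h\!\left(\frac{\chi}{u}\right) * \int_{-\infty}^{\infty} \mathrm{M}(\sigma' R\theta_{rms})\, Z_{FOV}(\sigma')\, e^{2\pi i\sigma'\chi}\, d\sigma' \right\}\right) \\
&+ \alpha\, u^{-1}\,\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi, D)\left\{ h\!\left(\frac{\chi}{u}\right) * \big[\tilde{n}^{(\theta 2)}(\chi)\,W(\chi)\big] \right\}\right).
\end{aligned}$$

The Fourier transform of $[\Pi(\chi, D)\,\tilde{z}_{CN}^{(tot)}(\chi)]$ is the uncalibrated signal spectrum contaminated by mirror-misalignment noise, and we recognize this by writing that

$$\tilde{Z}_{eff,totN}(\sigma) = \mathcal{F}^{(-i\sigma\chi)}\big(\Pi(\chi, D)\,\tilde{z}_{CN}^{(tot)}(\chi)\big), \tag{7.14c}$$

so that the previous formula becomes

$$\begin{aligned}
\tilde{Z}_{eff,totN}(\sigma) ={}& u^{-1}\,\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi, D)\left\{ h\!\left(\frac{\chi}{u}\right) * \int_{-\infty}^{\infty} \mathrm{M}(\sigma' R\theta_{rms})\, Z_{FOV}(\sigma')\, e^{2\pi i\sigma'\chi}\, d\sigma' \right\}\right) \\
&+ \alpha\, u^{-1}\,\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi, D)\left\{ h\!\left(\frac{\chi}{u}\right) * \big[\tilde{n}^{(\theta 2)}(\chi)\,W(\chi)\big] \right\}\right).
\end{aligned}$$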

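We apply the Fourier convolution theorem shown in Eq. (2.39j) in Chapter 2 to the first term on the right-hand side, and to the second term we apply the approximation shown in Eq. (7.13d). This gives

$$\begin{aligned}
\tilde{Z}_{eff,totN}(\sigma) \cong{}& u^{-1}\,\mathcal{F}^{(-i\sigma\chi)}\big(\Pi(\chi, D)\big) * \mathcal{F}^{(-i\sigma\chi)}\!\left( h\!\left(\frac{\chi}{u}\right) * \int_{-\infty}^{\infty} \mathrm{M}(\sigma' R\theta_{rms})\, Z_{FOV}(\sigma')\, e^{2\pi i\sigma'\chi}\, d\sigma' \right) \\
&+ \alpha\, u^{-1}\,\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi, D)\left\{ h\!\left(\frac{\chi}{u}\right) * \big[\Pi(\chi, \mathcal{D})\,\tilde{n}^{(\theta 2)}(\chi)\,W(\chi)\big] \right\}\right).
\end{aligned}$$

We again apply the Fourier convolution theorem, this time using the forms shown in Eqs. (2.39j) and (2.39a), to write

$$\begin{aligned}
\tilde{Z}_{eff,totN}(\sigma) \cong{}& u^{-1}\,\mathcal{F}^{(-i\sigma\chi)}\big(\Pi(\chi, D)\big) * \left[ \mathcal{F}^{(-i\sigma\chi)}\!\left( h\!\left(\frac{\chi}{u}\right) \right) \cdot \mathcal{F}^{(-i\sigma\chi)}\!\left( \int_{-\infty}^{\infty} \mathrm{M}(\sigma' R\theta_{rms})\, Z_{FOV}(\sigma')\, e^{2\pi i\sigma'\chi}\, d\sigma' \right) \right] \\
&+ \alpha\, u^{-1}\,\mathcal{F}^{(-i\sigma\chi)}\big(\Pi(\chi, D)\big) * \mathcal{F}^{(-i\sigma\chi)}\!\left( h\!\left(\frac{\chi}{u}\right) * \big[\Pi(\chi, \mathcal{D})\,\tilde{n}^{(\theta 2)}(\chi)\,W(\chi)\big] \right)
\end{aligned}$$

or, again applying (2.39a),

$$\begin{aligned}
\tilde{Z}_{eff,totN}(\sigma) \cong{}& u^{-1}\,\mathcal{F}^{(-i\sigma\chi)}\big(\Pi(\chi, D)\big) * \left[ \mathcal{F}^{(-i\sigma\chi)}\!\left( h\!\left(\frac{\chi}{u}\right) \right) \cdot \mathcal{F}^{(-i\sigma\chi)}\!\left( \int_{-\infty}^{\infty} \mathrm{M}(\sigma' R\theta_{rms})\, Z_{FOV}(\sigma')\, e^{2\pi i\sigma'\chi}\, d\sigma' \right) \right] \\
&+ \alpha\, u^{-1}\,\mathcal{F}^{(-i\sigma\chi)}\big(\Pi(\chi, D)\big) * \left[ \mathcal{F}^{(-i\sigma\chi)}\!\left( h\!\left(\frac{\chi}{u}\right) \right) \cdot \mathcal{F}^{(-i\sigma\chi)}\big( \Pi(\chi, \mathcal{D})\,\tilde{n}^{(\theta 2)}(\chi)\,W(\chi) \big) \right].
\end{aligned} \tag{7.15a}$$

The Fourier transform of $\Pi(\chi, D)$ is, according to Eq. (2.108b) in Chapter 2,

$$\mathcal{F}^{(-i\sigma\chi)}\big(\Pi(\chi, D)\big) = 2D\,\mathrm{sinc}(2\pi\sigma D), \tag{7.15b}$$

where the sinc function is, following the definition in Eq. (2.106d),

$$\mathrm{sinc}(x) = \frac{\sin(x)}{x}. \tag{7.15c}$$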
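The transform of the rectangular window can be verified by direct numerical integration. The sketch below is an illustration only: it integrates $e^{-2\pi i\sigma\chi}$ over $-D \le \chi \le D$ with the trapezoid rule and compares the result with $2D\,\mathrm{sinc}(2\pi\sigma D)$. Note that the unnormalized definition $\sin(x)/x$ used here differs from NumPy's `np.sinc`, which computes $\sin(\pi x)/(\pi x)$; the helper `sinc_unnormalized` makes the conversion explicit. The value of D and the test wavenumbers are arbitrary.

```python
import numpy as np

D = 2.0                                   # maximum OPD (illustrative units)
chi = np.linspace(-D, D, 40001)
dchi = chi[1] - chi[0]

def sinc_unnormalized(x):
    """sin(x)/x, expressed through NumPy's normalized sinc."""
    return np.sinc(x / np.pi)

def rect_transform(sigma):
    """Trapezoid-rule estimate of the forward transform of the window on [-D, D]."""
    integrand = np.exp(-2j * np.pi * sigma * chi)
    return (integrand.sum() - 0.5 * (integrand[0] + integrand[-1])) * dchi

sigma_values = [0.0, 0.1, 0.37, 1.25]
errors = [
    abs(rect_transform(s) - 2 * D * sinc_unnormalized(2 * np.pi * s * D))
    for s in sigma_values
]
print(errors)
```

The imaginary part of the numerical transform vanishes to rounding error because the window is an even function of OPD.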

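Glancing back at Eqs. (7.14a) and (7.10c), we note that (when $t = \chi/u$),

$$\mathcal{F}^{(-i\sigma\chi)}\!\left( h\!\left(\frac{\chi}{u}\right) \right) = \int_{-\infty}^{\infty} h\!\left(\frac{\chi}{u}\right) e^{-2\pi i\sigma\chi}\, d\chi = u\int_{-\infty}^{\infty} h(t)\, e^{-2\pi i\sigma u t}\, dt = u\,\mathrm{H}(u\sigma). \tag{7.15d}$$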

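We can now substitute Eqs. (7.15b) and (7.15d) into (7.15a) to get

$$\begin{aligned}
\tilde{Z}_{eff,totN}(\sigma) \cong{}& [2D\,\mathrm{sinc}(2\pi\sigma D)] * \left\{ \mathrm{H}(u\sigma)\,\mathcal{F}^{(-i\sigma\chi)}\!\left( \int_{-\infty}^{\infty} \mathrm{M}(\sigma' R\theta_{rms})\, Z_{FOV}(\sigma')\, e^{2\pi i\sigma'\chi}\, d\sigma' \right) \right\} \\
&+ \alpha\,[2D\,\mathrm{sinc}(2\pi\sigma D)] * \left\{ \mathrm{H}(u\sigma)\,\mathcal{F}^{(-i\sigma\chi)}\big( \Pi(\chi, \mathcal{D})\,\tilde{n}^{(\theta 2)}(\chi)\,W(\chi) \big) \right\}
\end{aligned}$$

or

$$\begin{aligned}
\tilde{Z}_{eff,totN}(\sigma) \cong{}& [2D\,\mathrm{sinc}(2\pi\sigma D)] * \left\{ \mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{rms})\, Z_{FOV}(\sigma) \right\} \\
&+ \alpha\,[2D\,\mathrm{sinc}(2\pi\sigma D)] * \left\{ \mathrm{H}(u\sigma)\,\mathcal{F}^{(-i\sigma\chi)}\big( \Pi(\chi, \mathcal{D})\,\tilde{n}^{(\theta 2)}(\chi)\,W(\chi) \big) \right\},
\end{aligned} \tag{7.15e}$$

where in the last step the forward Fourier transform of the reverse Fourier transform returns the original function:

$$\mathcal{F}^{(-i\sigma\chi)}\!\left( \int_{-\infty}^{\infty} \mathrm{M}(\sigma' R\theta_{rms})\, Z_{FOV}(\sigma')\, e^{2\pi i\sigma'\chi}\, d\sigma' \right) = \mathcal{F}^{(-i\sigma\chi)}\Big( \mathcal{F}^{(i\sigma\chi)}\big( \mathrm{M}(\sigma R\theta_{rms})\, Z_{FOV}(\sigma) \big) \Big) = \mathrm{M}(\sigma R\theta_{rms})\, Z_{FOV}(\sigma).$$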

Working with the first term on the right-hand side of (7.15e), we note that [see Eq. (7.7b)
above]

( ) (back)
R
H( ) M( ) ( )
H( ) M( ) ( ) ( ) ( )[ ( ) ( ) ( ) ( )]
4
rms FOV
fore
rms a f FOV FOV FOV
u R
WA
u R
o o o
o o o q o t o t o o o o
AO
+
Z
L L L

.
(7.16a)

[see Eqs. (2.28 ) and (2.29a,b) in Chapter 2]: A
In a well-designed interferometer all the functions on the right-hand side of (7.16a), except for
the radiances
FOV
L ,
( ) fore
FOV
L , and
(back)
FOV
L , vary slowly with compared to sinc(2 ) D ro .
Furthermore, this sinc function is very narrow, dropping rapidly to zero compared to all the
nonradiance functions in (7.16a). Consequently, we can, according to Eq. (5C.1) in Appendix 5C
of Chapter 5, treat the nonradiance functions as quasi-constants in the convolution

[2 sinc(2 )] H( ) M( ) ( )
rms FOV
D D u R ro o o o Z .

This lets us write

( ) (back)
R
[2 sinc(2 )] H( ) M( ) ( )
[2 sinc(2 )] H( ) M( ) ( ) ( ) ( )
4
[ ( ) ( ) ( ) ( )]
H(
4
rms FOV
rms a
fore
f FOV FOV FOV
D D u R
WA
D D u R
WA
u
ro o o o
ro o o o q o t o
t o o o o
o
AO

+
AO
e
Z
L L L

( ) (back)
R ) M( ) ( ) ( ) ( )[ ( ) ( ) ( ) ( )] ,
fore
rms a f mnf mnf mnf
Ro o q o t o t o o o o + L L L

(7.16b)

where, following the notation of Eqs. (6.62c), (6.63c), and (6.63d) in Chapter 6, we say that

$$L_{mnf}(\sigma) = [2D\,\mathrm{sinc}(2\pi\sigma D)] * L_{FOV}(\sigma), \tag{7.16c}$$

$$L_{mnf}^{(fore)}(\sigma) = [2D\,\mathrm{sinc}(2\pi\sigma D)] * L_{FOV}^{(fore)}(\sigma), \tag{7.16d}$$

and

$$L_{mnf}^{(back)}(\sigma) = [2D\,\mathrm{sinc}(2\pi\sigma D)] * L_{FOV}^{(back)}(\sigma). \tag{7.16e}$$

Functions $L_{mnf}$, $L_{mnf}^{(fore)}$, $L_{mnf}^{(back)}$ have the same units as $L_{FOV}$, $L_{FOV}^{(fore)}$, $L_{FOV}^{(back)}$ and represent spectral radiances distorted both by the effect of the interferometer's finite field of view and by its finite interferogram length. Defining function $Z_{mnf}$ to be

( ) (back)
R ( ) ( ) ( ) ( )[ ( ) ( ) ( ) ( )]
4
fore
mnf a f mnf mnf mnf
WA
o o q o t o t o o o o
AO
+ Z L L L (7.16f)

lets us write (7.16b) as

must vary slowly with compared to
, after using Eqs. (5C.1) and (2.38d) in Chapter 2,

{ } [2 sinc(2 )] H( ) M( ) ( )
H( ) M( ) ( )
rms FOV
rms mnf
D D u R
u R

Z
Z .
(7.16g)

Remembering that all nonradiance functionsincluding H(u) and M(R
rms
)can be treated as
quasi-constants in a convolution with sinc(2 ) D , we again apply Eq. (5C.1) from Appendix 5C
in Chapter 5 to get

{ } H( ) M( ) [2 sinc(2 )] ( )
H( ) M( ) ( )
rms FOV
rms mnf
u R D D
u R

Z
Z ,

which reduces to
[2 sinc(2 )] ( ) ( )
FOV mnf
D D Z Z . (7.16h)

The second term on the right-hand side of (7.15e) can also be simplified. The nonradiance
H(u) transfer function can be treated like a quasi-constant in the convolution over to get

( ) { }
{ }
( ) ( 2)
( ) ( 2)
[2 sinc(2 )] H( ) ( , ) ( ) ( )
H( ) [2 sinc(2 )] ( ( , ) ( ) ( ))
i
i
D D u n W
u D D n W

F D
F D .

Equation (7.15b) and Eq. (2.39j) in Chapter 2 can be used to turn the sinc function into another
factor inside the Fourier transform,

( ) { }
( )
( ) ( 2)
( ) ( 2)
[2 sinc(2 )] H( ) ( , ) ( ) ( )
H( ) ( , ) ( , ) ( ) ( )
i
i
D D u n W
u D n W

.
F D
F D
(7.17a)

Equation (7.13c) shows that D D , which means that [see the specification of in Eq. (7.12a)]

( , ) ( , ) ( , ) D D = D .

Hence, Eq. (7.17a) reduces to

( ) { }
( )
( ) ( )
( ) ( 2)
( ) ( 2)
( ) ( 2) ( )
[2 sinc(2 )] H( ) ( , ) ( ) ( )
H( ) ( , ) ( ) ( )
H( ) ( , ) ( ) ( ) ,
i
i
i i
D D u n W
u D n W
u D n W

F D
F
F F
(7.17b)
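where again the Fourier convolution theorem [see Eq. (2.39j) in Chapter 2] is applied in the last step. We define the D-limited Fourier transform of the noise $\tilde{n}^{(\theta 2)}$ to be

$$\begin{aligned}
\tilde{\mathbf{n}}_D^{(\theta 2)}(\sigma) &= \mathcal{F}^{(-i\sigma\chi)}\big( \Pi(\chi, D)\,\tilde{n}^{(\theta 2)}(\chi) \big) = \int_{-\infty}^{\infty} \Pi(\chi, D)\,\tilde{n}^{(\theta 2)}(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi \\
&= \int_{-D}^{D} \tilde{n}^{(\theta 2)}(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi
\end{aligned} \tag{7.17c}$$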

so that

{ }
( ) ( 2)
( 2) ( )
[2 sinc(2 )] H( ) ( ( , ) ( ) ( ))
H( ) ( ) ( ( ))
i
i
D
D D u n W
u W

.
F D
F
(7.17d)

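Reversing the Fourier transform in Eq. (7.8a) gives

$$\sigma^2\, Z_{FOV}(\sigma) = \int_{-\infty}^{\infty} W(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi = \mathcal{F}^{(-i\sigma\chi)}\big( W(\chi) \big), \tag{7.17e}$$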

which can now be substituted into (7.17d) to get

( ) { }
{ }
( ) ( 2)
( 2) 2
[2 sinc(2 )] H( ) ( , ) ( ) ( )
H( ) ( ) ( )
i
D FOV
D D u n W
u

n Z
.
F D
(7.17f)

Having found approximations for the first and second terms on the right-hand side of the
formula in (7.15e), we can write down a simplified expression for the uncalibrated signal
spectrum of the double-sided signal contaminated by mirror-misalignment noise. Substituting
(7.16g) and (7.17f) into (7.15e) gives

\[
\tilde{Z}_{eff,totN}(\sigma) \cong H(u\sigma)\,M(\sigma R\theta_{rms})\,Z_{mnf}(\sigma)
+ a\,H(u\sigma)\,\{\tilde{n}_D^{(\theta 2)}(\sigma) * Z_{FOV}^2(\sigma)\}. \tag{7.18a}
\]

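For future use, we note that the expectation value of the noise term in (7.18a) is, using the
definition of convolution in Eq. (2.38a) in Chapter 2 and the linearity of the expectation operator
E explained in Sec. 3.10 of Chapter 3,

\[
\begin{aligned}
\mathrm{E}\big(a\,H(u\sigma)\,\{\tilde{n}_D^{(\theta 2)}(\sigma) * Z_{FOV}^2(\sigma)\}\big)
&= \mathrm{E}\Big(a\,H(u\sigma)\int_{-\infty}^{\infty} \tilde{n}_D^{(\theta 2)}(\sigma')\,Z_{FOV}^2(\sigma-\sigma')\,d\sigma'\Big) \\
&= a\,H(u\sigma)\int_{-\infty}^{\infty} \mathrm{E}\big(\tilde{n}_D^{(\theta 2)}(\sigma')\big)\,Z_{FOV}^2(\sigma-\sigma')\,d\sigma' \\
&= a\,H(u\sigma)\,\{\mathrm{E}\big(\tilde{n}_D^{(\theta 2)}(\sigma)\big) * Z_{FOV}^2(\sigma)\}.
\end{aligned} \tag{7.18b}
\]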

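Glancing back at the definition of $\tilde{n}_D^{(\theta 2)}$ in Eq. (7.17c), we note that

\[
\mathrm{E}\big(\tilde{n}_D^{(\theta 2)}(\sigma)\big) = \int_{-D}^{D} \mathrm{E}\big(\tilde{n}^{(\theta 2)}(\chi)\big)\,e^{-2\pi i\sigma\chi}\,d\chi = 0 \tag{7.18c}
\]

because, according to Eq. (7.8d),

\[
\mathrm{E}\big(\tilde{n}^{(\theta 2)}(\chi)\big) = 0 .
\]

Substituting (7.18c) into (7.18b), we see that

\[
\mathrm{E}\big(a\,H(u\sigma)\,\{\tilde{n}_D^{(\theta 2)}(\sigma) * Z_{FOV}^2(\sigma)\}\big) = 0 . \tag{7.18d}
\]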

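Applying the expectation operator to both sides of (7.18a) now gives, using Eqs. (3.9f) and
(3.16a) in Chapter 3,

\[
\begin{aligned}
\mathrm{E}\big(\tilde{Z}_{eff,totN}(\sigma)\big)
&= \mathrm{E}\big(H(u\sigma)\,M(\sigma R\theta_{rms})\,Z_{mnf}(\sigma) + a\,H(u\sigma)\,\{\tilde{n}_D^{(\theta 2)}(\sigma) * Z_{FOV}^2(\sigma)\}\big) \\
&= H(u\sigma)\,M(\sigma R\theta_{rms})\,Z_{mnf}(\sigma).
\end{aligned} \tag{7.18e}
\]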

This shows that, in principle, we can always reduce the mirror-misalignment noise to negligible
levels in the uncalibrated spectrum of the double-sided signal by averaging together many
independent measurements of the same spectral radiance.
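
The averaging argument can be illustrated numerically. The following is a minimal sketch, not the author's procedure: it assumes white Gaussian noise and a made-up smooth spectrum (the names `residual_noise_rms`, `clean`, and the bump shape are all hypothetical), and it only shows that the residual rms of zero-mean noise falls roughly as one over the square root of the number of independent measurements averaged.

```python
import numpy as np

def residual_noise_rms(n_avg, n_points=256, noise_rms=1.0, seed=0):
    """RMS of the noise left after averaging n_avg independent noisy spectra."""
    rng = np.random.default_rng(seed)
    # Hypothetical noise-free spectrum: a smooth bump (any fixed shape works).
    sigma = np.linspace(-1.0, 1.0, n_points)
    clean = np.exp(-sigma**2 / 0.1)
    # Each row is one measurement: clean spectrum plus zero-mean noise.
    noisy = clean + noise_rms * rng.standard_normal((n_avg, n_points))
    averaged = noisy.mean(axis=0)
    return float(np.sqrt(np.mean((averaged - clean) ** 2)))

rms_single = residual_noise_rms(1)     # one measurement, no averaging
rms_many = residual_noise_rms(400)     # 400 measurements averaged together
```

With 400 averages the residual rms should come out near one twentieth of the single-measurement value, which is the sense in which the noise can be made negligible.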
7.6 Calibrated Spectra Contaminated by Misalignment Noise
The easiest way to find the noise-contaminated spectral radiance is to apply the spectral
calibration algorithm, discussed in Sec. 5.19 of Chapter 5, to the uncalibrated spectral signal in
Eq. (7.18a). We choose $L^{(1)}(\sigma)$ and $L^{(2)}(\sigma)$ to be the known spectral radiances used to calibrate
the instrument, with both $L^{(1)}$ and $L^{(2)}$ being slowly varying functions of wavenumber so that the
distorting effects of the interferometer's finite field of view and finite interferogram length can be
neglected. Applying Eqs. (6A.3) and (6A.6) in Appendix 6A of Chapter 6 to $L^{(1)}$ and $L^{(2)}$, we
write that

\[
L^{(1)}(|\sigma|) \cong L^{(1)}_{FOV}(\sigma) \cong L^{(1)}_{mnf}(\sigma) \tag{7.19a}
\]
and
\[
L^{(2)}(|\sigma|) \cong L^{(2)}_{FOV}(\sigma) \cong L^{(2)}_{mnf}(\sigma), \tag{7.19b}
\]

with absolute value signs used to make $L^{(1)}$ and $L^{(2)}$ even functions of wavenumber. We say that
$\tilde{Z}^{(1)}_{eff,totN}(\sigma)$ is the uncalibrated, noise-contaminated signal spectrum at point C in Fig. 6.2 of
Chapter 6 when the interferometer is observing the $L^{(1)}$ spectral radiance. To get the formula for
$\tilde{Z}^{(1)}_{eff,totN}(\sigma)$, we need to replace radiance $L$ by radiance $L^{(1)}$ in formula (7.18a), which we do by
writing

\[
\tilde{Z}^{(1)}_{eff,totN}(\sigma) \cong H(u\sigma)\,M(\sigma R\theta_{rms})\,Z^{(1)}_{mnf}(\sigma)
+ a\,H(u\sigma)\,\{\tilde{n}_D^{(\theta 2)}(\sigma) * [Z^{(1)}_{FOV}(\sigma)]^2\}, \tag{7.20a}
\]

where, following the pattern of Eqs. (7.7b) and (7.16f), we define

(1) (1) ( ) (back)
R ( ) ( ) ( ) ( )[ ( ) ( ) ( ) ( )]
4
fore
FOV a f FOV FOV
WA

= + Z L L L (7.20b)
and

(1) (1) ( ) (back)
R ( ) ( ) ( ) ( )[ ( ) ( ) ( ) ( )]
4
fore
mnf a f mnf mnf
WA

= + Z L L L . (7.20c)

The approximation shown in (7.19a) is our justification for dropping the FOV and mnf subscripts
from $L^{(1)}$ in Eqs. (7.20a)-(7.20c). Similarly, we define $\tilde{Z}^{(2)}_{eff,totN}(\sigma)$ to be the uncalibrated,
noise-contaminated spectrum at point C when the interferometer is observing the $L^{(2)}$ spectral radiance.
This gives, using (7.19b) to drop the FOV and mnf subscripts from $L^{(2)}$,

\[
\tilde{Z}^{(2)}_{eff,totN}(\sigma) \cong H(u\sigma)\,M(\sigma R\theta_{rms})\,Z^{(2)}_{mnf}(\sigma)
+ a\,H(u\sigma)\,\{\tilde{n}_D^{(\theta 2)}(\sigma) * [Z^{(2)}_{FOV}(\sigma)]^2\}, \tag{7.20d}
\]

where

(2) (2) ( ) (back)
R ( ) ( ) ( ) ( )[ ( ) ( ) ( ) ( )]
4
fore
FOV a f FOV FOV
WA

= + Z L L L (7.20e)
and

(2) (2) ( ) (back)
R ( ) ( ) ( ) ( )[ ( ) ( ) ( ) ( )]
4
fore
mnf a f mnf mnf
WA

= + Z L L L . (7.20f)

Because the uncalibrated signal spectra from $L^{(1)}$ and $L^{(2)}$ used in our calibration algorithm
should be noise-free, we average together a large number of measurements to get, following the
pattern of Eq. (7.18e) and the statement after it,

\[
\mathrm{E}\big(\tilde{Z}^{(1)}_{eff,totN}(\sigma)\big) = H(u\sigma)\,M(\sigma R\theta_{rms})\,Z^{(1)}_{mnf}(\sigma) \tag{7.20g}
\]
and
\[
\mathrm{E}\big(\tilde{Z}^{(2)}_{eff,totN}(\sigma)\big) = H(u\sigma)\,M(\sigma R\theta_{rms})\,Z^{(2)}_{mnf}(\sigma). \tag{7.20h}
\]

Since $\mathrm{E}\big(\tilde{Z}^{(1,2)}_{eff,totN}(\sigma)\big)$ are the noise-free spectral signals corresponding to $L^{(1,2)}$, we can write

\[
Z^{(1)}_{eff,tot}(\sigma) = H(u\sigma)\,M(\sigma R\theta_{rms})\,Z^{(1)}_{mnf}(\sigma) \tag{7.20i}
\]
and
\[
Z^{(2)}_{eff,tot}(\sigma) = H(u\sigma)\,M(\sigma R\theta_{rms})\,Z^{(2)}_{mnf}(\sigma), \tag{7.20j}
\]

where, to show that these are no longer random functions of $\sigma$, the tilde has been removed and
subscript totN has been changed to tot.
Now we can apply the calibration algorithm in Sec. 5.19 of Chapter 5 to get [see Eq. (5.95a)]

\[
\text{Measured Radiance}
= \frac{\tilde{Z}^{(meas)}_{eff,totN}(\sigma) - Z^{(1)}_{eff,tot}(\sigma)}{Z^{(2)}_{eff,tot}(\sigma) - Z^{(1)}_{eff,tot}(\sigma)}
\,\big[L^{(2)}(|\sigma|) - L^{(1)}(|\sigma|)\big] + L^{(1)}(|\sigma|), \tag{7.21a}
\]

where $\tilde{Z}^{(meas)}_{eff,totN}(\sigma)$ is the uncalibrated, noise-contaminated spectrum of the signal at point C in
Fig. 6.2 associated with the unknown optical radiance $L$ that we want to measure. Note that,
although the expectation operator E is used to remove the noise from the $L^{(1,2)}$ signals, the noise
is left in the uncalibrated $\tilde{Z}^{(meas)}_{eff,totN}(\sigma)$ signal. This is our way of showing that, while a great deal of
effort can be invested in obtaining noise-free calibration data, the unknown spectrum $L$ may be
changing slowly with time (and is often only one of a number of measurements to be performed
in a limited amount of time), which prevents us from averaging away its noise.$^{105}$ The
uncalibrated (meas) signal spectrum, contaminated by mirror misalignment noise, is called
$\tilde{Z}_{eff,totN}(\sigma)$ in Eq. (7.18a), so we can now write that

\[
\tilde{Z}^{(meas)}_{eff,totN}(\sigma) = \tilde{Z}_{eff,totN}(\sigma)
\cong H(u\sigma)\,M(\sigma R\theta_{rms})\,Z_{mnf}(\sigma)
+ a\,H(u\sigma)\,\{\tilde{n}_D^{(\theta 2)}(\sigma) * Z_{FOV}^2(\sigma)\} \tag{7.21b}
\]

with $Z_{mnf}(\sigma)$ given by Eq. (7.16f) and $Z_{FOV}(\sigma)$ given by Eq. (7.7b). Working with the first
term on the right-hand side of (7.21a), we note that, substituting from Eqs. (7.20c), (7.20f),
(7.20i), and (7.20j),

(2) (1)
(2) (1)
, ,
(2) (1)
(2) (1)
1
R
R
( ) ( )
( ) ( )
( ) ( )
H( ) M( ) ( ) ( ) ( ) ( )[ ( ) ( )]
4
H( ) M( ) ( ) ( ) ( ) ( ) .
4
eff tot eff tot
rms a f
rms a f
WA
u R
WA
u R

=

L L
Z Z
L L
L L

(7.21c)

Consulting Eqs. (7.21b) and (7.16f), as well as (7.20c) and (7.20i), we get

{ }
( ) (1)
, ,
(1)
( 2) 2
R
( ) ( )
H( ) M( ) ( ) ( ) ( ) ( )[ ( ) ( )]
4
H( ) ( ) ( ) .
meas
eff totN eff tot
rms a f mnf
D FOV
WA
u R
u

=
+

Z Z
L L
n Z

a
(7.21d)

Substituting (7.21c) and (7.21d) into (7.21a) gives

105
In Chapter 6, see the discussion at the end of Sec. 6.5 as well as the discussion following Eq. (6.33b).

( 2) 2
R
Measured Radiance
4 ( ) ( )
( )
M( ) ( ) ( ) ( ) ( )
D FOV
mnf
rms a f
WA R
o o o
o
o o q o t o t o

+
AO
n Z
L

.
a (7.21e)

The right-hand side of (7.21e) is the sum of $L_{mnf}$, which is the spectral radiance distorted by
the effect of the interferometer's finite field of view and finite interferogram length, and a random
noise term

( 2) 2
R
4 ( ) [ ( )]
M( ) ( ) ( ) ( ) ( )
D FOV
rms a f
WA R
o o o
o o q o t o t o
AO
n Z a
.

Function $L_{mnf}$ is strictly real, but there is no reason to expect this noise term to be strictly real. In
fact only the real component of the noise term unavoidably contaminates the $L_{mnf}$ data. We
conclude, then, that the $\delta\tilde{L}$ measurement noise in the radiance spectrum is

( )

( 2) 2
( 2) 2
R
R
4 ( ) ( )
Re
M( ) ( ) ( ) ( ) ( )
4 Re ( ) ( )
.
M( ) ( ) ( ) ( ) ( )
D FOV
rms a f
D FOV
rms a f
WA R
WA R
o o o
o
o o q o t o t o
o o o
o o q o t o t o

AO

AO
n Z
L
n Z

a
a
(7.22a)

The second step in (7.22a) relies on $\tilde{n}_D^{(\theta 2)}$ being the only complex quantity in the expression for
the $\delta\tilde{L}$ spectral noise. For future use, we note that the imaginary component of the noise term in
(7.21e) can be written as

( )

( 2) 2
( 2) 2
R
R
4 ( ) [ ( )]
Im
M( ) ( ) ( ) ( ) ( )
4 Im ( ) [ ( )]
M( ) ( ) ( ) ( ) ( )
D FOV
rms a f
D FOV
rms a f
WA R
WA R
o o o
o o q o t o t o
o o o
o o q o t o t o

AO

AO
n Z
n Z
.
a
a
(7.22b)

Taking the real part of the measured spectrum eliminates this noise component from the data, just
like it did in our analysis of the avoidable and unavoidable detector noise [see the discussion
following Eq. (6.35d) in Chapter 6].
7.7 Avoidable and Unavoidable Misalignment Noise in $\chi$-Based Signals
Examining Eq. (7.7b) for $Z_{FOV}(\sigma)$, we note that since Eq. (4.139g) in Chapter 4 shows $\eta(\sigma)$ to
be even, $Z_{FOV}(\sigma)$ must also be even:

( ) (back)
( ) (back)
R
R
( )
( ) ( ) ( )[ ( ) ( ) ( ) ( )]
4
( ) ( ) ( )[ ( ) ( ) ( ) ( )]
4
( ) .
FOV
fore
a f FOV FOV FOV
fore
a f FOV FOV FOV
FOV
WA
WA
o
o q o t o t o o o o
o q o t o t o o o o
o
AO
+
AO
+
Z
L L L
L L L
Z

(7.23a)

Equation (5.10f) in Chapter 5 shows that
ma ma
M( ) M( ) R R o o , which means that

rms rms
M( ) ( ) M( ) ( )
FOV FOV
R R o o o o Z Z (7.23b)

is also even with respect to $\sigma$. Consequently the reverse Fourier transform

( )
( ) 2
ma ma
M( ) ( ) M( ) ( )
i i
FOV FOV
R R e d
o r o
o o o o o
Z Z F (7.23c)

must be a real and even function of $\chi$ because it is the reverse Fourier transform of a real and
even function of $\sigma$ (see entry 1 in Table 2.1 of Chapter 2). This forces $z_B^{(tot)}(\chi)$ in Eq. (7.8g) to be
a real and even function of $\chi$. To show why this is so, we note that the formula for $z_B^{(tot)}(\chi)$ is the
sum of the reverse Fourier transform specified in (7.23c) and several $\chi$-independent constant
terms. We have just seen that the Fourier transform is a real and even function of $\chi$, and the real
constant terms cannot change with $\chi$; hence, $z_B^{(tot)}(\chi)$ must be real and even:

\[
z_B^{(tot)}(-\chi) = z_B^{(tot)}(\chi) \tag{7.23d}
\]
and
\[
\mathrm{Im}\big(z_B^{(tot)}(\chi)\big) = 0 . \tag{7.23e}
\]

Consulting the definition of $W(\chi)$ in Eq. (7.8a),

\[
W(\chi) = \int_{-\infty}^{\infty} Z_{FOV}^2(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma = \mathcal{F}^{(i\sigma)}\big(Z_{FOV}^2(\sigma)\big), \tag{7.23f}
\]

we note that since [see Eq. (7.23a)]

\[
[Z_{FOV}(-\sigma)]^2 = [Z_{FOV}(\sigma)]^2, \tag{7.23g}
\]

function $W(\chi)$ is also the reverse Fourier transform of an even function of $\sigma$. All the factors in
the definition of $Z_{FOV}(\sigma)$ in Eq. (7.7b) are real, which means that the $[Z_{FOV}(\sigma)]^2$ product in
(7.23g) is also real. Hence, $W(\chi)$ is the reverse Fourier transform of a real and even function,
making it also real and even:
\[
W(-\chi) = W(\chi) \tag{7.23h}
\]
and
\[
\mathrm{Im}\big(W(\chi)\big) = 0 . \tag{7.23i}
\]

Following the same pattern as in Eq. (7.23a), we see that $Z_{mnf}(\sigma)$ defined in Eq. (7.16f) is even
because

( ) (back)
( ) (back)
R
R
( )
( ) ( ) ( )[ ( ) ( ) ( ) ( )]
4
( ) ( ) ( )[ ( ) ( ) ( ) ( )]
4
( ) .
mnf
fore
a f mnf mnf mnf
fore
a f mnf mnf mnf
mnf
WA
WA
= +
= +
=
Z
L L L
L L L
Z

(7.23j)

Every factor in the definition of $Z_{mnf}$ is real, so

\[
\mathrm{Im}\big(Z_{mnf}(\sigma)\big) = 0 . \tag{7.23k}
\]

Clearly, $Z_{mnf}$ is another real and even function.
Equation (7.8f) gives the signal contaminated by mirror-misalignment noise as it leaves the
detector:

\[
\tilde{z}_{BN}^{(tot)}(\chi) = z_B^{(tot)}(\chi) + a\,\tilde{n}^{(\theta 2)}(\chi)\,W(\chi). \tag{7.24a}
\]
From Eq. (7.23d) we know that the noise-free signal $z_B^{(tot)}(\chi)$ is an even function of $\chi$, so in
principle we could reduce the noise in (7.24a) by comparing the noise-contaminated signal at $\chi$
and $-\chi$. (In practice, of course, we would have to worry about distortions introduced by any
circuit used to measure the signal at $\chi$ and $-\chi$. See the discussion of the distortions produced by
the detector circuit in Sec. 5.12 of Chapter 5.) To show how this works, we follow the pattern of
Eqs. (2.11d), (2.11e) in Chapter 2 and divide the mirror-misalignment noise $\tilde{n}^{(\theta 2)}(\chi)$ into even
and odd components, which we call $\tilde{n}_e^{(\theta 2)}(\chi)$ and $\tilde{n}_o^{(\theta 2)}(\chi)$ respectively, by defining

\[
\tilde{n}_e^{(\theta 2)}(\chi) = \tfrac{1}{2}\big(\tilde{n}^{(\theta 2)}(\chi) + \tilde{n}^{(\theta 2)}(-\chi)\big) \tag{7.24b}
\]
and
\[
\tilde{n}_o^{(\theta 2)}(\chi) = \tfrac{1}{2}\big(\tilde{n}^{(\theta 2)}(\chi) - \tilde{n}^{(\theta 2)}(-\chi)\big). \tag{7.24c}
\]
According to these definitions

\[
\tilde{n}_e^{(\theta 2)}(-\chi) = \tilde{n}_e^{(\theta 2)}(\chi) \tag{7.24d}
\]
and
\[
\tilde{n}_o^{(\theta 2)}(-\chi) = -\tilde{n}_o^{(\theta 2)}(\chi). \tag{7.24e}
\]

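The sum of $\tilde{n}_e^{(\theta 2)}$ and $\tilde{n}_o^{(\theta 2)}$ returns the original noise term,

\[
\tilde{n}_e^{(\theta 2)}(\chi) + \tilde{n}_o^{(\theta 2)}(\chi)
= \tfrac{1}{2}\big(\tilde{n}^{(\theta 2)}(\chi) + \tilde{n}^{(\theta 2)}(-\chi)\big) + \tfrac{1}{2}\big(\tilde{n}^{(\theta 2)}(\chi) - \tilde{n}^{(\theta 2)}(-\chi)\big)
= \tilde{n}^{(\theta 2)}(\chi).
\]

Since

\[
\tilde{n}_e^{(\theta 2)}(\chi) + \tilde{n}_o^{(\theta 2)}(\chi) = \tilde{n}^{(\theta 2)}(\chi), \tag{7.24f}
\]

we can replace $\tilde{n}^{(\theta 2)}$ in Eq. (7.24a) by the sum of $\tilde{n}_e^{(\theta 2)}$ and $\tilde{n}_o^{(\theta 2)}$ to get

\[
\tilde{z}_{BN}^{(tot)}(\chi) = \big[z_B^{(tot)}(\chi) + a\,\tilde{n}_e^{(\theta 2)}(\chi)\,W(\chi)\big] + a\,\tilde{n}_o^{(\theta 2)}(\chi)\,W(\chi). \tag{7.24g}
\]

The sum inside the square brackets [ ] is even with respect to $\chi$ because, according to Eqs.
(7.23d), (7.23h), and (7.24d),

\[
\big[z_B^{(tot)}(-\chi) + a\,\tilde{n}_e^{(\theta 2)}(-\chi)\,W(-\chi)\big] = \big[z_B^{(tot)}(\chi) + a\,\tilde{n}_e^{(\theta 2)}(\chi)\,W(\chi)\big]. \tag{7.24h}
\]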

This sum, just like the noise-free signal $z_B^{(tot)}$, is even, which means that the even noise
component

\[
a\,\tilde{n}_e^{(\theta 2)}(\chi)\,W(\chi)
\]

cannot be distinguished from the noise-free $z_B^{(tot)}$ signal. The odd noise component,

\[
a\,\tilde{n}_o^{(\theta 2)}(\chi)\,W(\chi),
\]

on the other hand, can in principle be eliminated, for example by averaging together the
noise-contaminated signal at $\chi$ and $-\chi$. To see how this works, we consult Eq. (7.24g) and write

\[
\begin{aligned}
\tfrac{1}{2}\big[\tilde{z}_{BN}^{(tot)}(\chi) + \tilde{z}_{BN}^{(tot)}(-\chi)\big]
= \tfrac{1}{2}\Big\{ &\big[z_B^{(tot)}(\chi) + a\,\tilde{n}_e^{(\theta 2)}(\chi)W(\chi)\big] + a\,\tilde{n}_o^{(\theta 2)}(\chi)W(\chi) \\
&+ \big[z_B^{(tot)}(-\chi) + a\,\tilde{n}_e^{(\theta 2)}(-\chi)W(-\chi)\big] + a\,\tilde{n}_o^{(\theta 2)}(-\chi)W(-\chi) \Big\}.
\end{aligned}
\]

This becomes, applying Eqs. (7.23h), (7.24e), and (7.24h),

\[
\begin{aligned}
\tfrac{1}{2}\big[\tilde{z}_{BN}^{(tot)}(\chi) + \tilde{z}_{BN}^{(tot)}(-\chi)\big]
&= \tfrac{1}{2}\Big\{ \big[z_B^{(tot)}(\chi) + a\,\tilde{n}_e^{(\theta 2)}(\chi)W(\chi)\big] + \big[z_B^{(tot)}(\chi) + a\,\tilde{n}_e^{(\theta 2)}(\chi)W(\chi)\big] \Big\} \\
&= \big[z_B^{(tot)}(\chi) + a\,\tilde{n}_e^{(\theta 2)}(\chi)W(\chi)\big].
\end{aligned}
\]

Averaging the noise-contaminated signal $\tilde{z}_{BN}^{(tot)}$ at $\chi$ and $-\chi$ eliminates the odd noise component,
reducing the amount of mirror-misalignment noise contaminating the signal. For this reason, it
makes sense to call $\tilde{n}_e^{(\theta 2)}$ the unavoidable mirror-tilt noise (because it is even and so cannot be
distinguished from the even, noise-free signal) and to call $\tilde{n}_o^{(\theta 2)}$ the avoidable mirror-tilt noise
because it can be removed by averaging the noise-contaminated signal at $\chi$ and $-\chi$.
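
The even/odd split and the effect of averaging the signal at plus and minus the path difference can be checked on sampled data. This is a minimal sketch under stated assumptions: the cosine "signal", the noise level, and the helper name `even_odd_parts` are hypothetical, and the even weighting function W and the constant multiplying the noise are left out because they do not change the parity argument.

```python
import numpy as np

def even_odd_parts(samples):
    """Split samples on a grid symmetric about chi = 0 into even and odd parts."""
    reversed_samples = samples[::-1]          # value at -chi for each chi
    even = 0.5 * (samples + reversed_samples)
    odd = 0.5 * (samples - reversed_samples)
    return even, odd

rng = np.random.default_rng(1)
chi = np.linspace(-1.0, 1.0, 201)             # symmetric grid: chi[k] = -chi[-1 - k]
signal = np.cos(2 * np.pi * 3 * chi)          # hypothetical even, noise-free signal
noise = 0.2 * rng.standard_normal(chi.size)   # one zero-mean noise realization
noise_even, noise_odd = even_odd_parts(noise)

noisy = signal + noise
symmetrized = 0.5 * (noisy + noisy[::-1])     # average the values at chi and -chi
```

After symmetrizing, only the even part of the noise remains on top of the signal, which is the sampled analogue of calling the even part unavoidable and the odd part avoidable.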
7.8 Avoidable and Unavoidable Mirror-Misalignment Noise in the Signal
Spectrum
It is easy to connect the unavoidable $\tilde{n}_e^{(\theta 2)}$ and avoidable $\tilde{n}_o^{(\theta 2)}$ noise components to the D-limited
Fourier transform $\tilde{n}_D^{(\theta 2)}$. Substitution of Eq. (7.24f) into (7.17c) gives


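\[
\begin{aligned}
\tilde{n}_D^{(\theta 2)}(\sigma)
&= \int_{-D}^{D} \tilde{n}_e^{(\theta 2)}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi + \int_{-D}^{D} \tilde{n}_o^{(\theta 2)}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi \\
&= \tilde{n}_{De}^{(\theta 2)}(\sigma) + \tilde{n}_{Do}^{(\theta 2)}(\sigma),
\end{aligned} \tag{7.25a}
\]
where we define

\[
\tilde{n}_{De}^{(\theta 2)}(\sigma)
= \int_{-D}^{D} \tilde{n}_e^{(\theta 2)}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi
= \int_{-\infty}^{\infty} \Pi(\chi,D)\,\tilde{n}_e^{(\theta 2)}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi
= \mathcal{F}^{(-i\chi)}\big(\Pi(\chi,D)\,\tilde{n}_e^{(\theta 2)}(\chi)\big) \tag{7.25b}
\]
and

\[
\tilde{n}_{Do}^{(\theta 2)}(\sigma)
= \int_{-D}^{D} \tilde{n}_o^{(\theta 2)}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi
= \int_{-\infty}^{\infty} \Pi(\chi,D)\,\tilde{n}_o^{(\theta 2)}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi
= \mathcal{F}^{(-i\chi)}\big(\Pi(\chi,D)\,\tilde{n}_o^{(\theta 2)}(\chi)\big). \tag{7.25c}
\]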

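Equation (7.25b) states that $\tilde{n}_{De}^{(\theta 2)}$ is the forward Fourier transform of $[\Pi(\chi,D)\,\tilde{n}_e^{(\theta 2)}(\chi)]$.
Glancing back at Eqs. (7.12a) and (7.24d), we note that

\[
\Pi(-\chi,D)\,\tilde{n}_e^{(\theta 2)}(-\chi) = \Pi(\chi,D)\,\tilde{n}_e^{(\theta 2)}(\chi), \tag{7.26a}
\]

making $[\Pi(\chi,D)\,\tilde{n}_e^{(\theta 2)}(\chi)]$ even with respect to $\chi$. Hence $\tilde{n}_{De}^{(\theta 2)}$ is the forward Fourier transform
of a real and even function, which means (according to entry 1 of Table 2.1 in Chapter 2) that
$\tilde{n}_{De}^{(\theta 2)}$ must also be real and even:

\[
\tilde{n}_{De}^{(\theta 2)}(-\sigma) = \tilde{n}_{De}^{(\theta 2)}(\sigma) \tag{7.26b}
\]
and
\[
\mathrm{Re}\big(\tilde{n}_{De}^{(\theta 2)}(\sigma)\big) = \tilde{n}_{De}^{(\theta 2)}(\sigma). \tag{7.26c}
\]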

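Equation (7.25c) states that $\tilde{n}_{Do}^{(\theta 2)}$ is the forward Fourier transform of $[\Pi(\chi,D)\,\tilde{n}_o^{(\theta 2)}(\chi)]$.
According to Eqs. (7.12a) and (7.24e),

\[
\Pi(-\chi,D)\,\tilde{n}_o^{(\theta 2)}(-\chi) = -\Pi(\chi,D)\,\tilde{n}_o^{(\theta 2)}(\chi), \tag{7.27a}
\]

which makes $\tilde{n}_{Do}^{(\theta 2)}$ the forward Fourier transform of a real and odd function. Consequently,
according to entry 4 of Table 2.1 in Chapter 2, $\tilde{n}_{Do}^{(\theta 2)}$ must be imaginary and odd:

\[
\tilde{n}_{Do}^{(\theta 2)}(-\sigma) = -\tilde{n}_{Do}^{(\theta 2)}(\sigma) \tag{7.27b}
\]
and
\[
\mathrm{Im}\big(\tilde{n}_{Do}^{(\theta 2)}(\sigma)\big) = i^{-1}\,\tilde{n}_{Do}^{(\theta 2)}(\sigma). \tag{7.27c}
\]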

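Equations (7.26c) and (7.27c) show that taking the real part of both sides of (7.25a) now gives

\[
\mathrm{Re}\big(\tilde{n}_D^{(\theta 2)}(\sigma)\big) = \tilde{n}_{De}^{(\theta 2)}(\sigma), \tag{7.28a}
\]

and taking the imaginary part of both sides gives

\[
\mathrm{Im}\big(\tilde{n}_D^{(\theta 2)}(\sigma)\big) = i^{-1}\,\tilde{n}_{Do}^{(\theta 2)}(\sigma). \tag{7.28b}
\]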
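
The link between parity in the path-difference domain and the real and imaginary parts of the D-limited transform can be confirmed with a direct Riemann sum. The sketch below is illustrative only: it assumes the forward transform uses the kernel exp(-2*pi*i*sigma*chi), and the grid sizes, the white-noise sample, and the function name `d_limited_transform` are hypothetical choices.

```python
import numpy as np

def d_limited_transform(samples, chi, sigma):
    """Riemann sum for the integral of samples * exp(-2*pi*i*sigma*chi) over [-D, D]."""
    d_chi = chi[1] - chi[0]
    kernel = np.exp(-2j * np.pi * np.outer(sigma, chi))
    return kernel @ samples * d_chi

rng = np.random.default_rng(2)
chi = np.linspace(-0.5, 0.5, 401)             # symmetric grid, so D = 0.5 here
sigma = np.linspace(-20.0, 20.0, 81)
noise = rng.standard_normal(chi.size)         # one real noise realization
noise_even = 0.5 * (noise + noise[::-1])
noise_odd = 0.5 * (noise - noise[::-1])

n_d = d_limited_transform(noise, chi, sigma)        # transform of the full noise
n_de = d_limited_transform(noise_even, chi, sigma)  # transform of the even part
n_do = d_limited_transform(noise_odd, chi, sigma)   # transform of the odd part
```

The transform of the even part comes out purely real and equal to the real part of the full transform, while the transform of the odd part is purely imaginary and, after division by i, equals the imaginary part.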

Equation (7.28a) shows that the real part of $\tilde{n}_D^{(\theta 2)}$, the D-limited Fourier transform of $\tilde{n}^{(\theta 2)}$, is
$\tilde{n}_{De}^{(\theta 2)}$, which is, according to (7.25b), the D-limited Fourier transform of the unavoidable signal
noise $\tilde{n}_e^{(\theta 2)}$. Because the real part of $\tilde{n}_D^{(\theta 2)}$ comes from $\tilde{n}_e^{(\theta 2)}$, the unavoidable signal noise, it
makes sense to regard the real part of $\tilde{n}_D^{(\theta 2)}$ as the unavoidable component of $\tilde{n}^{(\theta 2)}$ in the spectral
domain. This matches what we see in Eq. (7.22a), where the formula for the noise $\delta\tilde{L}$ in the
measured spectrum uses only the real part of $\tilde{n}_D^{(\theta 2)}$ (that is, it uses only the unavoidable
component of $\tilde{n}^{(\theta 2)}$ in the spectral domain). Equation (7.28a) can be substituted into (7.22a) to
make the dependence on $\tilde{n}_{De}^{(\theta 2)}$ explicit:

{ }
( 2) 2
R
4 ( ) [ ( )]
M( ) ( ) ( ) ( ) ( )
De FOV
rms a f
WA R
n Z
L

a
. (7.28c)

Equations (4.139g) in Chapter 4 and (5.10f) in Chapter 5 show that $\eta$ and $M$ are even functions of
$\sigma$, and absolute value signs turn everything else in the denominator of the right-hand side of
(7.28c) into an even function of $\sigma$. Equations (7.26b) and (7.23g) show that $\tilde{n}_{De}^{(\theta 2)}$ and
$[Z_{FOV}(\sigma)]^2$ are even functions of $\sigma$, and Eq. (2.38f) in Chapter 2 requires the convolution of
two even functions to be another even function. Hence, the numerator of (7.28c) is also an even
function of $\sigma$. This makes the measurement noise $\delta\tilde{L}$ an even function of $\sigma$, which can be shown
by writing it as a function of $|\sigma|$,
\[
\delta\tilde{L} = \delta\tilde{L}(|\sigma|).
\]
Therefore, Eqs. (7.22a) and (7.28c) can be written as

( ) { }
( 2) 2
R
4 [Re ( ) ] [ ( )]
( )
M( ) ( ) ( ) ( ) ( )
D FOV
rms a f
WA R
n Z
L

a
(7.28d)
and

{ }
( 2) 2
R
4 ( ) [ ( )]
( )
M( ) ( ) ( ) ( ) ( )
De FOV
rms a f
WA R
n Z
L

a
. (7.28e)

Equation (7.28b) shows that the imaginary part of $\tilde{n}_D^{(\theta 2)}$ is the same as $i^{-1}\,\tilde{n}_{Do}^{(\theta 2)}(\sigma)$, the D-limited
Fourier transform of the avoidable signal noise divided by $i$. Equation (7.28b) can be substituted
into (7.22b) to make this explicit:

{ }
{ }
( 2) 2
1 ( 2) 2
R
R
4 ( ) [ ( )]
Im
M( ) ( ) ( ) ( ) ( )
4 ( ) [ ( )]
M( ) ( ) ( ) ( ) ( )
D FOV
rms a f
Do FOV
rms a f
WA R
i
WA R
n Z
n Z
.
a
a
(7.28f)

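Since $\mathrm{E}\big(\tilde{n}_D^{(\theta 2)}(\sigma)\big) = 0$ in Eq. (7.18c), we know that, using the linearity of E explained in Sec.
3.10 of Chapter 3,

\[
\begin{aligned}
\mathrm{E}\big(\tilde{n}_D^{(\theta 2)}(\sigma)\big)
&= \mathrm{E}\big(\mathrm{Re}(\tilde{n}_D^{(\theta 2)}(\sigma)) + i\,\mathrm{Im}(\tilde{n}_D^{(\theta 2)}(\sigma))\big) \\
&= \mathrm{E}\big(\mathrm{Re}(\tilde{n}_D^{(\theta 2)}(\sigma))\big) + i\,\mathrm{E}\big(\mathrm{Im}(\tilde{n}_D^{(\theta 2)}(\sigma))\big) = 0 .
\end{aligned}
\]

Consequently both the real and imaginary components of $\mathrm{E}\big(\tilde{n}_D^{(\theta 2)}(\sigma)\big)$ must be separately equal
to zero, which means

\[
\mathrm{E}\big(\mathrm{Re}(\tilde{n}_D^{(\theta 2)}(\sigma))\big) = 0 \tag{7.29a}
\]
and
\[
\mathrm{E}\big(\mathrm{Im}(\tilde{n}_D^{(\theta 2)}(\sigma))\big) = 0 . \tag{7.29b}
\]

According to Eqs. (7.28a) and (7.28b), this can be written as

\[
\mathrm{E}\big(\tilde{n}_{De}^{(\theta 2)}(\sigma)\big) = 0 \tag{7.29c}
\]
and
\[
\mathrm{E}\big(\tilde{n}_{Do}^{(\theta 2)}(\sigma)\big) = 0 . \tag{7.29d}
\]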

Applying the expectation operator to both sides of Eq. (7.28e) leads to, using Eqs. (2.38b) and
(2.38a) in Chapter 2 and the linearity of the expectation operator in Sec. 3.10 of Chapter 3,

( )
{ }
2 ( 2)
( 2) 2
( 2)
R
R
R
( )
4
[ ( )] ( )
M( ) ( ) ( ) ( ) ( )
4
( ) ( )
M( ) ( ) ( ) ( ) ( )
4
(
M( ) ( ) ( ) ( ) ( )
FOV De
rms a f
De FOV
rms a f
De
rms a f
WA R
d
WA R
WA R
L
Z n
n Z
n

a
a
a
E
E
E
E
( )
2
) ( ) ,
FOV
d

Z

which becomes, using (7.29c),

( )
( ) 0 = L
E . (7.29e)

This shows that the measurement noise ( ) L
is a zero-mean random variable. Similarly Eq.

(7.28f) gives us, after applying the expectation operator to both sides,

{ }
( )
( 2) 2
1
( 2) 2
R
R
4 ( ) [ ( )]
Im
M( ) ( ) ( ) ( ) ( )
4
( ) ( ) ,
M( ) ( ) ( ) ( ) ( )
D FOV
rms a f
Do FOV
rms a f
WA R
i
d
WA R

n Z
n Z

a
a
E
E

which becomes, using (7.29d),

{ }
( 2) 2
R
4 ( ) [ ( )]
Im 0
M( ) ( ) ( ) ( ) ( )
D FOV
rms a f
WA R

=

n Z a
E . (7.29f)

Hence both the real and imaginary contamination of the measurement due to the signal's
mirror-tilt noise can be reduced to negligible levels by averaging together many independent
measurements of the same spectrum.
7.9 Power Spectrum of $\tilde{n}^{(\theta 2)}$
In the discussion following Eq. (7.8d) above, random function $\tilde{n}^{(\theta 2)}(\chi)$ is assumed to be
wide-sense stationary. Hence, based on the analysis in Secs. 3.20 and 3.23 of Chapter 3, we expect its
power spectrum and autocorrelation function to be a Fourier-transform pair. The $\chi$-based
autocorrelation function is defined to be

( 2) ( 2) ( 2)
( , ) ( ( ) ( ))
nn
n n

o E .

Since
( 2)
n

is wide-sense stationary, we know that
( 2)
nn

o depends only on the difference between
and :

( )
( 2) ( 2) ( 2)
( ) ( ) ( )
nn
n n

o E . (7.30a)

The -based, double-sided power spectrum of
( 2)
n

is given by

( )
( 2) ( 2) 2 ( ) ( 2)
( ) ( ) ( )
i i
nn nn nn
e d
r o o
o

p o o F (7.30b)

and of course this transform can be reversed to get

( )
( 2) ( 2) 2 ( ) ( 2)
( ) ( ) ( )
i i
nn nn nn
e d
r o o
o o o

o p p F . (7.30c)

Equations (7.30b) and (7.30c) show how we set up the -based autocorrelation of
( 2)
n

and the -
based power spectrum of
( 2)
n

as a Fourier-transform pair. Equation (7.8b) shows that
( 2)
nn

o is
real because
( 2)
n

is real, and we also note that
( 2)
nn

o must be even because for any two values of
and ,

( ) ( )
( 2) ( 2) ( 2) ( 2) ( 2) ( 2)
( ) ( ) ( ) ( ) ( ) ( )
nn nn
n n n n

o o E E .

Therefore, after defining = , we get

( 2) ( 2)
( ) ( )
nn nn

o o (7.30d)
and

( )
( 2)
Im ( ) 0
nn

o . (7.30e)

Equation (7.30b) then shows that, according to (7.30d) and (7.30e),
( 2)
nn

p is the forward Fourier
transform of a real and even function, which means that it must also be real and even:
106

( )
( 2)
Im ( ) 0
nn
=

p (7.30f)

and

( 2) ( 2)
( ) ( )
nn nn

=

p p . (7.30g)
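
The statement that a real, even autocorrelation has a real, even power spectrum can be checked with a sampled transform. This is only a sketch: the two-sided exponential autocorrelation, its 0.5 decay length, the grids, and the helper name `transform` are assumptions chosen because the exact transform of that shape, 2c / (1 + (2*pi*sigma*c)^2), is known for comparison.

```python
import numpy as np

def transform(values, x, y):
    """Riemann sum for the integral of values(x) * exp(-2*pi*i*y*x) dx."""
    dx = x[1] - x[0]
    return np.exp(-2j * np.pi * np.outer(y, x)) @ values * dx

chi = np.linspace(-10.0, 10.0, 2001)          # lag axis, symmetric about zero
sigma = np.linspace(-2.0, 2.0, 101)           # wavenumber axis, symmetric about zero
decay = 0.5
# Hypothetical real, even autocorrelation: a two-sided exponential.
autocorr = np.exp(-np.abs(chi) / decay)

power = transform(autocorr, chi, sigma)
# Exact transform of exp(-|chi|/c), used only as a cross-check.
analytic = 2.0 * decay / (1.0 + (2.0 * np.pi * sigma * decay) ** 2)
```

The imaginary part vanishes to rounding error and the real part is symmetric in the wavenumber, as the parity argument requires.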

The 0 = value of the autocorrelation function can be used to connect the
( 2)
nn

p power
spectrum to the statistics of the misalignment angle. Setting 0 = in Eq. (7.30c) gives

( 2) ( 2)
(0) ( )
nn nn
d

o p

which means, according to (7.30a) with = ,

( )
( 2) 2 ( 2)
[ ( )] ( )
nn
n d

p E . (7.31a)

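Substituting from Eq. (7.8b) and using the linearity of operator E with respect to random
quantities (see Sec. 3.10 of Chapter 3) as well as Eq. (3.9f) of Chapter 3, we get

\[
\begin{aligned}
\mathrm{E}\big([\tilde{n}^{(\theta 2)}(\chi)]^2\big)
&= \mathrm{E}\big([\tilde{\theta}(\chi)^2 - \theta_{rms}^2]^2\big)
 = \mathrm{E}\big(\tilde{\theta}(\chi)^4\big) - 2\theta_{rms}^2\,\mathrm{E}\big(\tilde{\theta}(\chi)^2\big) + \mathrm{E}\big(\theta_{rms}^4\big) \\
&= \mathrm{E}\big(\tilde{\theta}(\chi)^4\big) - 2\theta_{rms}^2\,\mathrm{E}\big(\tilde{\theta}(\chi)^2\big) + \theta_{rms}^4 \\
&= \mathrm{E}\big(\tilde{\theta}(\chi)^4\big) - \theta_{rms}^4 ,
\end{aligned}
\]

where in the last step $\mathrm{E}\big(\tilde{\theta}(\chi)^2\big) = \theta_{rms}^2$ from Eq. (7.3c) is used to simplify the result. Substitution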

( )
4 4 ( 2)
( ) ( )
rms nn
d
= +
E p . (7.31b)

Because the statistics of $\tilde{\theta}$ do not depend on $\chi$, we are not surprised to see $\mathrm{E}\big(\tilde{\theta}(\chi)^4\big)$ set equal to
a $\chi$-independent sum. This formula connects the integrated value of $p_{nn}^{(\theta 2)}$ to the (presumably
already known) statistical quantities $\theta_{rms}$ and $\mathrm{E}\big(\tilde{\theta}(\chi)^4\big)$. Once a shape has been chosen for $p_{nn}^{(\theta 2)}$,
Eq. (7.31b) can be used to find the normalizing constant, which should be applied to the shape
function to get the exact formula for the $p_{nn}^{(\theta 2)}$ noise-power spectrum (see, for example, Sec. 7.14
below).
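
The normalization step can be sketched numerically. Everything specific here is an assumption made for illustration: the Gaussian shape function, the rms tilt of 2e-5, the fourth moment taken as three times the fourth power of the rms value (the relation that holds for a zero-mean Gaussian variable), and the function name `normalized_power_spectrum`. The only thing taken from the text is the constraint that the integral of the power spectrum equals the fourth moment of the tilt minus the fourth power of its rms value.

```python
import numpy as np

def normalized_power_spectrum(shape, sigma, theta_rms, theta_fourth_moment):
    """Scale an unnormalized shape so that its integral over sigma equals
    E(theta^4) - theta_rms^4, the constraint stated in Eq. (7.31b)."""
    d_sigma = sigma[1] - sigma[0]
    target_area = theta_fourth_moment - theta_rms**4
    norm = target_area / (np.sum(shape) * d_sigma)
    return norm * shape

sigma = np.linspace(-50.0, 50.0, 10001)
shape = np.exp(-(sigma / 5.0) ** 2)           # hypothetical Gaussian shape function

# Hypothetical tilt statistics (illustrative values only).
theta_rms = 2.0e-5
theta_fourth_moment = 3.0 * theta_rms**4

power = normalized_power_spectrum(shape, sigma, theta_rms, theta_fourth_moment)
area = np.sum(power) * (sigma[1] - sigma[0])  # should reproduce the target area
```

Any non-negative, even shape could be substituted for the Gaussian; the normalizing constant changes but the constraint on the area does not.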
7.10 Calculating the Variance of $\delta\tilde{L}$
Equations (7.28e) and (7.28f) specify the measurement noise ( ) o o L
in the radiance spectrum.

To find the variance of ( ) o o L
, we must evaluate
( )
2
[ ( )] o o L
E , which can be written as,

substituting from Eq. (7.28e),

( )

2
2
( 2) 2
2
2
( 2) 2
R
R
[ ( )]
4 ( ) ( )
M( ) ( ) ( ) ( ) ( )
4
( ) ( )
M( ) ( ) ( ) ( ) ( )
De FOV
rms a f
De FOV
rms a f
WA R
WA R
o o
o o o
o o q o t o t o
o o o
o o q o t o t o

AO

AO

L
n Z
n Z

.
a
a
E
E
E
(7.32)

The only difficult term in this formula is

2
( 2) 2
( ) ( )
De FOV
o o o

n Z E ,

which is what we now set out to calculate.
Reversing the transform in Eq. (7.8a) gives

( )
2 2 ( )
( ) ( ) ( )
i i
FOV
W e d W
r o o
o o
Z F . (7.33a)

Equations (7.25b) and (7.33a) can be combined to get

( ) ( )
( 2) 2 ( ) ( 2) ( )
( ) [ ( )] ( , ) ( ) ( )
i i
De FOV e
D n W
o o
o o o

H n Z F F ,
which becomes, using the Fourier convolution theorem [see Eq. (2.39j) in Chapter 2],

( )
( 2) 2 ( ) ( 2)
( 2) 2
( ) [ ( )] ( , ) ( ) ( )
( , ) ( ) ( )
i
De FOV e
i
e
D n W
D n W e

=
=
n Z
.
F
(7.33b)

Now we can write (using the linearity of operator E discussed in Sec. 3.10 of Chapter 3)

{ }
( )
2
( 2) 2
( 2) 2 ( 2) 2
2 ( 2) ( 2) 2
( ) ( )
( , ) ( ) ( ) ( , ) ( ) ( )
( , ) ( ) ( , ) ( ) ( ) ( )
De FOV
i i
e e
i i
e e
d D n W e d D n W e
d D W e d D W n n e

=

=

n Z

.
E
E
E
(7.33c)

From Eq. (7.24b) we get

( ) ( )
( ) ( )
( 2) ( 2) ( 2) ( 2) ( 2) ( 2)
( 2) ( 2) ( 2) ( 2)
( 2)
1
( ) ( ) [ ( ) ( )][ ( ) ( )]
4
1
( ) ( ) ( ) ( )
4
( )
e e
n n n n n n
n n n n
n

= + +
= +
+

E E
E E
E
( ) ( )
( 2) ( 2) ( 2)
( ) ( ) ( ) n n n

+
. E

This becomes, applying Eq. (7.30a),

( )
( 2) ( 2) ( 2) ( 2)
( 2) ( 2)
1
( ) ( ) ( ) ( )
4
( ) ( )
e e nn nn
nn nn
n n

= +
+ + + +

,
o o
o o
E

which, according to Eq. (7.30d), simplifies to

( )
( 2) ( 2) ( 2) ( 2)
1
( ) ( ) ( ) ( )
2
e e nn nn
n n

= + +

o o E . (7.33d)
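
The result in (7.33d), that the correlation of the even noise component is half the sum of the autocorrelation evaluated at the difference and at the sum of the two arguments, can be verified with linear algebra instead of a Monte Carlo run. The sketch below makes assumptions that are not in the text: it works on a periodic grid of 64 samples so that negating an index is well defined, and it uses a hypothetical exponential autocorrelation.

```python
import numpy as np

n = 64
lags = np.arange(n)
# Hypothetical even autocorrelation on a periodic grid: depends on circular distance.
circ_dist = np.minimum(lags, n - lags)
autocorr = np.exp(-circ_dist / 4.0)

idx = np.arange(n)
# Covariance of a wide-sense stationary sequence: C[i, j] = autocorr[i - j].
cov = autocorr[(idx[:, None] - idx[None, :]) % n]

# Linear map that takes a sequence to its even part: n_e[i] = (n[i] + n[-i]) / 2.
flip = np.zeros((n, n))
flip[idx, (-idx) % n] = 1.0
even_projector = 0.5 * (np.eye(n) + flip)

# Covariance of the even part, obtained by propagating C through the linear map.
cov_even = even_projector @ cov @ even_projector.T
# Pattern of Eq. (7.33d): half the autocorrelation at the difference plus half at the sum.
expected = 0.5 * (autocorr[(idx[:, None] - idx[None, :]) % n]
                  + autocorr[(idx[:, None] + idx[None, :]) % n])
```

Because the even part is a linear function of the noise, its covariance follows exactly from the covariance of the noise, so the two matrices agree to rounding error.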

Substitution of (7.33d) into (7.33c) gives, using Eq. (7.30c),

{ }
2
( 2) 2
2 ( 2) 2
2 ( 2) 2
2
( ) ( )
1
( , ) ( ) ( , ) ( ) ( )
2
1
( , ) ( ) ( , ) ( ) ( )
2
1
( , ) ( )
2
De FOV
i i
nn
i i
nn
i
d D W e d D W e
d D W e d D W e
d D W e d

=
+ +
=

n Z

o
o
E
2 ( 2) 2 ( )
2 2 ( 2) 2 ( )
( , ) ( ) ( )
1
( , ) ( ) ( , ) ( ) ( )
2
i i
nn
i i i
nn
D W e e d
d D W e d D W e e d

+

+

.
p
p

This can be written as, interchanging the order of the multiple integrals,

{ }
2
( 2) 2
( 2) 2 ( ) 2 ( )
( 2) 2 ( )
( ) ( )
1
( ) ( , ) ( ) ( , ) ( )
2
1
( ) ( , ) ( ) ( , ) ( )
2
De FOV
i i
nn
i
nn
d d D W e d D W e
d d D W e d D W e

+

=
+

n Z

p
p
E
2 ( ) i
.
(7.33e)

From Eq. (7.15b) and the Fourier convolution theorem [Eq. (2.39j) in Chapter 2], we get

( )
( )
2 ( )
( )
( , ) ( ) ( , ) ( )
[2 sinc(2 )] ( )
i i
i
D W e d D W
D D W

=
=
.
F
F

Substitution from Eq. (7.17e) gives

2 2
( , ) ( ) [2 sinc(2 )] [ ( )]
i
FOV
D W e d D D

Z .

The
2
term is broad and slowly varying compared to the narrow and rapidly varying sinc
function, so it acts like a quasi-constant and can be brought outside the convolution [see Eq.
(5C.1) in Appendix 5C of Chapter 5]. This means we can write, using the approximation in
(7.16h) above,

( )
2 2 2
( , ) ( ) [2 sinc(2 )] ( ) ( )
i
FOV mnf
D W e d D D

Z Z . (7.33f)
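
The quasi-constant approximation used here can be illustrated by convolving a narrow kernel of the form 2D sinc(2 pi sigma D) with a broad function and confirming that the broad function comes back almost unchanged. This is a sketch with made-up numbers: D = 5, the Gaussian stand-in for the slowly varying spectrum, and the grid are all hypothetical. Note that `np.sinc(t)` is defined as sin(pi t)/(pi t), so the kernel is written as `2 * D * np.sinc(2 * D * sigma)`.

```python
import numpy as np

D = 5.0                                        # hypothetical maximum path difference
sigma = np.linspace(-40.0, 40.0, 8001)         # symmetric wavenumber grid
d_sigma = sigma[1] - sigma[0]

# 2D sinc(2*pi*sigma*D) with sinc(x) = sin(x)/x; np.sinc(t) is sin(pi*t)/(pi*t).
kernel = 2.0 * D * np.sinc(2.0 * D * sigma)
broad = np.exp(-sigma**2 / 2.0)                # slowly varying stand-in function

# Discrete approximation of the convolution integral on the same grid.
convolved = np.convolve(broad, kernel, mode="same") * d_sigma
center = np.abs(sigma) < 5.0                   # compare away from the grid edges
max_error = float(np.max(np.abs(convolved[center] - broad[center])))
```

The approximation holds because the broad function has essentially no content at path differences beyond D; a function with structure as narrow as the kernel would be visibly smoothed instead.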

Equation (7.33f) is now used to simplify (7.33e):

{ }
2
( 2) 2
( 2) 2 2
2
( 2) 2
( ) ( )
1
( ) ( ) ( ) ( ) ( )
2
1
( ) ( ) ( )
2
De FOV
nn mnf mnf
nn mnf
d
d

= + +

+

n Z
Z Z
Z

.
p
p
E
(7.33g)

This expression is too complicated to substitute comfortably back into Eq. (7.32), the formula for
the variance of ( ) L
, so we define a new function

( 2) ( 2) 2 2
2
( 2) 2
1
( ) ( ) ( ) ( ) ( ) ( )
2
1
( ) ( ) ( )
2
nn mnf mnf
nn mnf
J d
d

= + +

+

Z Z
Z

,
p
p
(7.33h)

which means that (7.33g) reduces to

{ }
2
( 2) 2
( ) ( ) ( )
De FOV
J

=

n Z E . (7.33i)

Equation (7.32) can now be written as

( )
2
2 ( 2)
R
4
[ ( )] ( )
M( ) ( ) ( ) ( ) ( )
rms a f
J
WA R
a
E . (7.33j)

Using Eq. (7.2b) and
2
A R = for a circle of radius R, we have

2 2
2
2
2
R
A R
= =
a
.

Here, variables $A$ and $R$ have the same meaning as in the discussion following Eq. (4.137e) in
Chapter 4. The discussion following Eq. (4.83) in Chapter 4 reveals that, because $W$ must be 1 or
$-1$,

\[
W^2 = 1 . \tag{7.33k}
\]

These results can be substituted into (7.33j) to get

( )
2
2 ( 2)
R
8
[ ( )] ( )
M( ) ( ) ( ) ( ) ( )
rms a f
J
R
E . (7.33A )

7.11 Formula for the Misalignment NEdN of Double-Sided Signals
By definition, the mirror-misalignment NEdN of the double-sided signal analyzed here is the
square root of the variance in the noise. According to Eq. (7.29e), the mirror-misalignment
noise $\delta\tilde{L}$ is a zero-mean random variable, so the formula for the variance in Eq. (3.8f) in Chapter
3 shows that

\[
\mathrm{Var}\big(\delta\tilde{L}(\sigma)\big) = \mathrm{E}\big([\delta\tilde{L}(\sigma)]^2\big)
\]

because the mean of random variable $\delta\tilde{L}$ is zero. Consequently the formula for the
misalignment, or tilt-error, NEdN (which is defined in Sec. 6.1 of Chapter 6 to be the standard
deviation, or the square root of the variance, of the $\delta\tilde{L}$ noise) can be written as

\[
NEdN_{tilt} = \sqrt{\mathrm{E}\big([\delta\tilde{L}(\sigma)]^2\big)} . \tag{7.34a}
\]

Taking the square root of both sides of Eq. (7.33A ) gives

( 2)
R
8 ( )
M( ) ( ) ( ) ( ) ( )
tilt
rms a f
J
NEdN
R

. (7.34b)

There are a number of ways to write the $J^{(\theta 2)}$ function defined in Eq. (7.33h) above. The
second term on the right-hand side of (7.33h) can, for example, be written as a convolution [see
Eq. (2.38a) in Chapter 2 for the definition of a convolution]. This gives

2
( 2) ( 2) 2
( 2) 2 2
1
( ) ( ) ( )
2
1
( ) ( ) ( ) ( ) ( )
2
nn mnf
nn mnf mnf
J
d

=

+ + +

Z
Z Z

.
p
p
(7.35a)

Perhaps the most revealing form in which to write the
( 2)
J

function is

2
( 2) ( 2) 2 2
1
( ) ( ) ( ) ( ) ( ) ( )
4
nn mnf mnf
J d

= + + +

Z Z

. p (7.35b)

To justify this latest formula, we consult Eq. (7.30g) and define a new dummy variable of
integration = in order to show that

2
( 2) 2
2
( 2) 2
2
( 2) 2
( ) ( ) ( )
( ) ( ) ( )
( ) ( ) ( )
nn mnf
nn mnf
nn mnf
d
d
d
+ +

=

=

Z
Z
Z

.
p
p
p
(7.35c)

Now the square brackets [ ] in Eq. (7.35b) can be expanded to get

2
( 2) ( 2) 2
2
( 2) 2
( 2) 2 2
1
( ) ( ) ( ) ( )
4
1
( ) ( ) ( )
4
1
( ) ( ) ( ) ( ) ( )
2
nn mnf
nn mnf
nn mnf mnf
J d
d
d

= + +

+

+ + +

Z
Z
Z Z

p
p
p
2
( 2) 2
( 2) 2 2
1
( ) ( ) ( )
2
1
( ) ( ) ( ) ( ) ( )
2
nn mnf
nn mnf mnf
d
d
=

+ + +

Z
Z Z

.
p
p

This is the same as Eq. (7.33h) above, showing that the right-hand side of (7.35b) is correct. We
note, since the power spectrum $p_{nn}^{(\theta 2)}$ can never be negative and the terms inside the square
brackets [ ] are all real, that the integral on the right-hand side of (7.35b) is never negative.
Consequently, $J^{(\theta 2)}$ must be a non-negative quantity, which means there is never any problem
taking its square root in the formula for the mirror-misalignment NEdN in Eq. (7.34b). The $Z_{mnf}$
function in Eq. (7.35b) is specified by formula (7.16f) above; we see that $Z_{mnf}$ depends on the
background radiances $L_{mnf}^{(fore)}$ and $L_{mnf}^{(back)}$ as well as on $L_{mnf}$, the radiance being measured. Hence
both the internal background radiances and the radiance being measured end up contributing to
the mirror-tilt NEdN.
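
The equivalence of the two-term form of J and the single squared-sum form can be checked numerically. This sketch rests on one reading of those formulas, stated here as an assumption: the two-term form is taken to be half the integral of p(s') Z2(s - s') Z2(s + s') plus half the integral of p(s') [Z2(s - s')]^2, and the squared-sum form is taken to be one quarter of the integral of p(s') [Z2(s - s') + Z2(s + s')]^2, where Z2 stands for the squared, even mnf spectrum. The shapes chosen for p and Z2 are hypothetical.

```python
import numpy as np

def power_spectrum(s):
    """Hypothetical even, non-negative noise power spectrum."""
    return np.exp(-(s / 2.0) ** 2)

def z_mnf_squared(s):
    """Hypothetical stand-in for the squared, even spectrum Z_mnf(sigma)^2."""
    return 1.0 / (1.0 + (s / 10.0) ** 4)

s_prime = np.linspace(-30.0, 30.0, 6001)       # symmetric integration grid
d_s = s_prime[1] - s_prime[0]

def j_two_term(sigma):
    """Cross term plus squared term, each weighted by 1/2 (assumed reading of Eq. 7.33h)."""
    p = power_spectrum(s_prime)
    minus = z_mnf_squared(sigma - s_prime)
    plus = z_mnf_squared(sigma + s_prime)
    return 0.5 * np.sum(p * minus * plus) * d_s + 0.5 * np.sum(p * minus**2) * d_s

def j_squared_sum(sigma):
    """One integral of p times a squared sum, weighted by 1/4 (assumed reading of Eq. 7.35b)."""
    p = power_spectrum(s_prime)
    total = z_mnf_squared(sigma - s_prime) + z_mnf_squared(sigma + s_prime)
    return 0.25 * np.sum(p * total**2) * d_s

sigmas = [0.0, 3.0, 12.5]
pairs = [(j_two_term(s), j_squared_sum(s)) for s in sigmas]
```

The agreement depends on the power spectrum being even, which is what lets the integration variable be reflected; the squared-sum form also makes it obvious that the result cannot be negative.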
7.12 Connection Between the $p_{nn}^{(\theta 2)}$ Power Spectrum and the Power
Spectra of $\tilde{\theta}_x$, $\tilde{\theta}_y$
To understand the implications of the $NEdN_{tilt}$ formulas derived in the previous sections, we
need some information about the typical shape of the $p_{nn}^{(\theta 2)}$ power spectrum. It turns out that if we
assign power spectra to the $\tilde{\theta}_x(\chi)$ and $\tilde{\theta}_y(\chi)$ random functions introduced in Sec. 7.2 above, we
can use them to get information about the probable shape of $p_{nn}^{(\theta 2)}$ by deriving a formula for $p_{nn}^{(\theta 2)}$
in terms of the power spectra of $\tilde{\theta}_x(\chi)$ and $\tilde{\theta}_y(\chi)$.
Simplifying the notation in preparation for the algebra coming up, we define four new random
variables X
, X
, Y
, Y
by specifying that

( )
x
X = +

, (7.36a)

( )
x
X = +

, (7.36b)

( )
y
Y =

, (7.36c)
and
( )
y
Y =

(7.36d)

The point of this new notation is to emphasize the important information (namely, whether or
not we are dealing with the x or the y component of the angle) and to suppress all the irrelevant
aspects of argument $\chi$, keeping only the relevant information as to whether or not it is primed.
According to Eq. (7.2f), the average value of ( )
x

which is the same thing as the systems

bias tiltis the constant angle at any value of . Writing Eqs. (7.36a) and (7.36b) as

( )
x
X =

and ( )
x
X =

makes it easy to see that $\tilde{X}$ and $\tilde{X}'$ are zero-mean random functions of $\chi$ and $\chi'$ respectively. The
statistics of $\tilde{\theta}_x(\chi)$ and $\tilde{\theta}_y(\chi)$ do not depend on $\chi$, so we expect the same to hold true for the
statistics of $\tilde{X}$, $\tilde{X}'$, $\tilde{Y}$, and $\tilde{Y}'$. Hence, we can assume that $\tilde{X}$, $\tilde{X}'$ and $\tilde{Y}$, $\tilde{Y}'$ are at least
wide-sense stationary functions of $\chi$ (which is, according to Sec. 3.20 of Chapter 3, all that is
necessary to provide them with power spectra). Because they are wide-sense stationary, we can
set up the two autocorrelation functions

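$$ r^{(xx)}(\chi - \chi') = \mathrm{E}(XX') \tag{7.36e} $$

and

$$ r^{(yy)}(\chi - \chi') = \mathrm{E}(YY') \tag{7.36f} $$

to be functions only of the difference between $\chi$ and $\chi'$. The associated power spectra are, using $\chi'' = \chi - \chi'$,

$$ p^{(xx)}(\sigma) = \mathcal{F}^{(-i\sigma\chi'')}\!\left( r^{(xx)}(\chi'') \right) = \int_{-\infty}^{\infty} r^{(xx)}(\chi'')\, e^{-2\pi i \sigma \chi''}\, d\chi'' \tag{7.36g} $$

and

$$ p^{(yy)}(\sigma) = \mathcal{F}^{(-i\sigma\chi'')}\!\left( r^{(yy)}(\chi'') \right) = \int_{-\infty}^{\infty} r^{(yy)}(\chi'')\, e^{-2\pi i \sigma \chi''}\, d\chi'' ; \tag{7.36h} $$

and, of course, the Fourier transforms can be reversed to get

$$ r^{(xx)}(\chi'') = \mathcal{F}^{(i\chi''\sigma)}\!\left( p^{(xx)}(\sigma) \right) = \int_{-\infty}^{\infty} p^{(xx)}(\sigma)\, e^{2\pi i \sigma \chi''}\, d\sigma \tag{7.36i} $$

and

$$ r^{(yy)}(\chi'') = \mathcal{F}^{(i\chi''\sigma)}\!\left( p^{(yy)}(\sigma) \right) = \int_{-\infty}^{\infty} p^{(yy)}(\sigma)\, e^{2\pi i \sigma \chi''}\, d\sigma . \tag{7.36j} $$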

If we no longer assume that $\theta_x(\chi)$ and $\theta_y(\chi')$ are uncorrelated (which means that $X$ and $Y'$ might be correlated random variables), we must use the cross-correlation function

$$ r^{(xy)}(\chi - \chi') = \mathrm{E}(XY') , \tag{7.37a} $$

like the one defined in Eq. (3.30d) in Chapter 3, to describe the statistical relationship between $X$ and $Y'$. Again we assume that, just like $r^{(xx)}$ and $r^{(yy)}$, it is a real function of the difference between $\chi$ and $\chi'$, which means that $X$ and $Y'$ are jointly wide-sense stationary. Hence we can define a new variable $\chi'' = \chi - \chi'$ and construct an associated cross-power spectrum [see Eq. (3.48e) in Chapter 3],

$$ p^{(xy)}(\sigma) = \mathcal{F}^{(-i\sigma\chi'')}\!\left( r^{(xy)}(\chi'') \right) = \int_{-\infty}^{\infty} r^{(xy)}(\chi'')\, e^{-2\pi i \sigma \chi''}\, d\chi'' . \tag{7.37b} $$

Reversing the Fourier transform gives

$$ r^{(xy)}(\chi'') = \mathcal{F}^{(i\chi''\sigma)}\!\left( p^{(xy)}(\sigma) \right) = \int_{-\infty}^{\infty} p^{(xy)}(\sigma)\, e^{2\pi i \sigma \chi''}\, d\sigma . \tag{7.37c} $$

The same sort of reasoning used above in Sec. 7.9 [see Eqs. (7.30d)-(7.30g)] can be used here to show that $r^{(xx)}$, $r^{(yy)}$, $p^{(xx)}$, and $p^{(yy)}$ in Eqs. (7.36e)-(7.36h) are real and even functions. We note that

$$ r^{(xx)}(\chi - \chi') = \mathrm{E}(XX') = \mathrm{E}(X'X) = r^{(xx)}(\chi' - \chi) , $$

which becomes, substituting $\chi'' = \chi - \chi'$,

$$ r^{(xx)}(\chi'') = r^{(xx)}(-\chi'') . \tag{7.38a} $$

The same argument can be applied to $r^{(yy)}$ to get

$$ r^{(yy)}(\chi'') = r^{(yy)}(-\chi'') ; \tag{7.38b} $$

and, of course, both $r^{(xx)}$ and $r^{(yy)}$ must be real because they are, according to (7.36e) and (7.36f), the expectation values of real products,

$$ \mathrm{Im}[r^{(xx)}(\chi'')] = \mathrm{Im}[r^{(yy)}(\chi'')] = 0 . \tag{7.38c} $$

Since $r^{(xx)}$ and $r^{(yy)}$ are real and even, their Fourier transforms $p^{(xx)}$ and $p^{(yy)}$ in Eqs. (7.36g) and (7.36h) must also, according to entry 1 in Table 2.1 of Chapter 2, be real and even:

$$ p^{(xx)}(\sigma) = p^{(xx)}(-\sigma) , \tag{7.38d} $$

$$ p^{(yy)}(\sigma) = p^{(yy)}(-\sigma) , \tag{7.38e} $$

and

$$ \mathrm{Im}[p^{(xx)}(\sigma)] = \mathrm{Im}[p^{(yy)}(\sigma)] = 0 . \tag{7.38f} $$

We note in passing that this line of argument most definitely cannot be applied to $r^{(xy)}$ and $p^{(xy)}$, because, as shown in Appendix 7B, the cross-power spectrum $p^{(xy)}$ can have both real and imaginary parts.
The probability density distributions in Eqs. (7.2h) and (7.2i) require $\theta_x$ and $\theta_y$ to be normally distributed. Consequently, the definitions of $X$, $X'$, $Y$, $Y'$ in Eqs. (7.36a)-(7.36d) show that $X$, $X'$, $Y$, $Y'$ are also normally distributed. Variables $Y$ and $Y'$ obey zero-mean normal distributions because $\theta_y$ is a zero-mean random function; and $X$ and $X'$ also obey zero-mean normal distributions because, according to the discussion following Eq. (7.36d), the effect of subtracting $\alpha$ from $\theta_x$ is to make $X$ and $X'$ zero-mean random quantities. Hence, $X$, $X'$, $Y$, $Y'$ have the same properties as the jointly normal random variables $\tilde n_1$, $\tilde n_2$, $\tilde n_3$, $\tilde n_4$ described in Sec. 3.17 of Chapter 3,

$$ X,\ X',\ Y,\ Y' \;\leftrightarrow\; \tilde n_{1,2,3,4} . \tag{7.39a} $$

Note that jointly normal random variables may or may not be correlated and thus may or may not be independent random quantities. Considered in pairs, the random quantities $X$, $X'$ and $Y$, $Y'$ obey the formulas describing pairs of jointly normal random variables, for example, Eqs. (3.35c) and (3.41b) in Chapter 3. When they are examined in isolation, they obey formulas describing single normal variables, for example, Eq. (3.41c) in Chapter 3. Equations (7.36a)-(7.36d) also require the spread in the probable values of $X$, $X'$ about zero to be the same as the spread in the probable values of $\theta_x$ at $\chi$ or $\chi'$ about $\alpha$; and of course $Y$, $Y'$ have the same spread about zero as $\theta_y$ at $\chi$ or $\chi'$ because they are the same random variables. Consequently the standard deviations of $X$, $X'$ are the same as the $\varsigma_x$ standard deviation of $\theta_x$ at $\chi$ or $\chi'$, and the standard deviations of $Y$, $Y'$ are the same as the $\varsigma_y$ standard deviation of $\theta_y$ at $\chi$ or $\chi'$ [$\varsigma_x$ and $\varsigma_y$ are introduced in the discussion following Eq. (7.2g) above]. We see that

$$ \mathrm{E}(X^2) = \mathrm{E}(X'^2) = \varsigma_x^2 \tag{7.39b} $$

and

$$ \mathrm{E}(Y^2) = \mathrm{E}(Y'^2) = \varsigma_y^2 . \tag{7.39c} $$

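Having laid the required mathematical foundation, we begin the derivation of the desired formula for the $p^{(\theta 2)}_{\tilde n \tilde n}$ power spectrum in terms of the power spectra of $\theta_x(\chi)$ and $\theta_y(\chi)$. The first step is to evaluate [see Eqs. (7.36a)-(7.36d) above]

$$ \mathrm{E}\!\left( \theta_x^2(\chi)\,\theta_x^2(\chi') \right) = \mathrm{E}\!\left( (X+\alpha)^2 (X'+\alpha)^2 \right) , \tag{7.40a} $$

$$ \mathrm{E}\!\left( \theta_x^2(\chi)\,\theta_y^2(\chi') \right) = \mathrm{E}\!\left( (X+\alpha)^2\, Y'^2 \right) , \tag{7.40b} $$

and

$$ \mathrm{E}\!\left( \theta_y^2(\chi)\,\theta_y^2(\chi') \right) = \mathrm{E}\!\left( Y^2 Y'^2 \right) \tag{7.40c} $$

in terms of the correlation functions $r^{(xx)}$, $r^{(yy)}$, and $r^{(xy)}$.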
Starting with (7.40a), we use the linearity of the expectation operator with regard to random variables (see Sec. 3.10 of Chapter 3) to write

$$
\begin{aligned}
\mathrm{E}\!\left( \theta_x^2(\chi)\,\theta_x^2(\chi') \right)
&= \mathrm{E}\!\left( (X+\alpha)^2 (X'+\alpha)^2 \right) \\
&= \mathrm{E}\big( X^2X'^2 + 2\alpha X^2X' + \alpha^2 X^2 \\
&\qquad + 2\alpha XX'^2 + 4\alpha^2 XX' + 2\alpha^3 X \\
&\qquad + \alpha^2 X'^2 + 2\alpha^3 X' + \alpha^4 \big) \\
&= \mathrm{E}(X^2X'^2) + 2\alpha\,\mathrm{E}(X^2X') + \alpha^2\,\mathrm{E}(X^2) \\
&\quad + 2\alpha\,\mathrm{E}(XX'^2) + 4\alpha^2\,\mathrm{E}(XX') + 2\alpha^3\,\mathrm{E}(X) \\
&\quad + \alpha^2\,\mathrm{E}(X'^2) + 2\alpha^3\,\mathrm{E}(X') + \alpha^4 ,
\end{aligned}
\tag{7.41a}
$$

where, in the last step, Eq. (3.9f) in Chapter 3 is used to get $\mathrm{E}(\alpha^4) = \alpha^4$. We examine the discussion following Eq. (3.34d) in Chapter 3 and apply Eq. (3.35c) to get

$$ \mathrm{E}(XX'^2) = \mathrm{E}(X^2X') = 0 . \tag{7.41b} $$

We of course also know that

$$ \mathrm{E}(X) = \mathrm{E}(X') = 0 \tag{7.41c} $$

because $X$ and $X'$ are zero-mean random variables. Equation (7.41a) can now be written as

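$$
\begin{aligned}
\mathrm{E}\!\left( \theta_x^2(\chi)\,\theta_x^2(\chi') \right)
&= \mathrm{E}(X^2X'^2) + \alpha^2\,\mathrm{E}(X^2) + 4\alpha^2\,\mathrm{E}(XX') + \alpha^2\,\mathrm{E}(X'^2) + \alpha^4 \\
&= \mathrm{E}(X^2X'^2) + 4\alpha^2\,\mathrm{E}(XX') + 2\alpha^2\varsigma_x^2 + \alpha^4 ,
\end{aligned}
\tag{7.41d}
$$

where in the last step Eq. (7.39b) is used to replace $\mathrm{E}(X^2)$ and $\mathrm{E}(X'^2)$ by $\varsigma_x^2$. Examining the discussion following Eq. (3.40c) in Chapter 3, we note that Eq. (3.41b) shows us that

$$ \mathrm{E}(X^2X'^2) = \mathrm{E}(X^2)\,\mathrm{E}(X'^2) + 2\left[ \mathrm{E}(XX') \right]^2 $$

or, again using (7.39b),

$$ \mathrm{E}(X^2X'^2) = \varsigma_x^4 + 2\left[ \mathrm{E}(XX') \right]^2 . \tag{7.41e} $$

Substituting (7.41e) into (7.41d) and then applying (7.36e) gives

$$ \mathrm{E}\!\left( \theta_x^2(\chi)\,\theta_x^2(\chi') \right) = \varsigma_x^4 + 2\left[ r^{(xx)}(\chi-\chi') \right]^2 + 4\alpha^2\, r^{(xx)}(\chi-\chi') + 2\alpha^2\varsigma_x^2 + \alpha^4 $$

or

$$ \mathrm{E}\!\left( \theta_x^2(\chi)\,\theta_x^2(\chi') \right) = 2\left[ r^{(xx)}(\chi-\chi') \right]^2 + 4\alpha^2\, r^{(xx)}(\chi-\chi') + (\varsigma_x^2 + \alpha^2)^2 . \tag{7.41f} $$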
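The moment identities behind Eq. (7.41f) can be checked numerically. The following is a minimal Monte Carlo sketch and is not part of the original derivation; the function name and the parameter values (`alpha` for the bias angle, `std_x` for the standard deviation of X and X', `corr` for their correlation coefficient) are illustrative assumptions, and `r_xx` stands for the autocorrelation value E(XX').

```python
import numpy as np

def check_eq_7_41f(alpha=0.5, std_x=1.0, corr=0.6, n_samples=1_000_000, seed=0):
    """Compare a Monte Carlo estimate of E[(X+alpha)^2 (X'+alpha)^2]
    with the closed form 2 r^2 + 4 alpha^2 r + (std_x^2 + alpha^2)^2."""
    rng = np.random.default_rng(seed)
    # Zero-mean, jointly normal pair (X, X') with common variance std_x^2
    # and covariance r_xx = corr * std_x^2 (illustrative values only).
    r_xx = corr * std_x**2
    cov = np.array([[std_x**2, r_xx], [r_xx, std_x**2]])
    x, x_prime = rng.multivariate_normal([0.0, 0.0], cov, size=n_samples).T
    estimate = np.mean((x + alpha) ** 2 * (x_prime + alpha) ** 2)
    closed_form = 2 * r_xx**2 + 4 * alpha**2 * r_xx + (std_x**2 + alpha**2) ** 2
    return estimate, closed_form

estimate, closed_form = check_eq_7_41f()
print(f"Monte Carlo: {estimate:.4f}   closed form: {closed_form:.4f}")
```

The two numbers should agree to within the sampling error of the estimate, which shrinks as `n_samples` grows.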

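Having finished with (7.40a), we turn our attention to (7.40b). Again using Eqs. (7.36a) and (7.36d) and the linearity of the expectation operator (see Sec. 3.10 in Chapter 3), we have

$$
\begin{aligned}
\mathrm{E}\!\left( \theta_x^2(\chi)\,\theta_y^2(\chi') \right)
&= \mathrm{E}\!\left( (X+\alpha)^2\, Y'^2 \right)
 = \mathrm{E}\!\left( X^2Y'^2 + 2\alpha XY'^2 + \alpha^2 Y'^2 \right) \\
&= \mathrm{E}(X^2Y'^2) + 2\alpha\,\mathrm{E}(XY'^2) + \alpha^2\,\mathrm{E}(Y'^2) .
\end{aligned}
\tag{7.42a}
$$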

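Again Eqs. (3.35c) and (3.41b) in Chapter 3 can be applied to the jointly normal random quantities $X$, $Y'$ to get

$$ \mathrm{E}(XY'^2) = 0 \tag{7.42b} $$

and

$$ \mathrm{E}(X^2Y'^2) = \mathrm{E}(X^2)\,\mathrm{E}(Y'^2) + 2\left[ \mathrm{E}(XY') \right]^2 . \tag{7.42c} $$

Equations (7.37a), (7.39b), and (7.39c) let us write (7.42c) as

$$ \mathrm{E}(X^2Y'^2) = \varsigma_x^2\varsigma_y^2 + 2\left[ r^{(xy)}(\chi-\chi') \right]^2 . \tag{7.42d} $$

Substituting (7.39c), (7.42b), and (7.42d) into (7.42a) gives

$$ \mathrm{E}\!\left( \theta_x^2(\chi)\,\theta_y^2(\chi') \right) = \varsigma_x^2\varsigma_y^2 + 2\left[ r^{(xy)}(\chi-\chi') \right]^2 + \alpha^2\varsigma_y^2 $$

or

$$ \mathrm{E}\!\left( \theta_x^2(\chi)\,\theta_y^2(\chi') \right) = \varsigma_y^2(\varsigma_x^2 + \alpha^2) + 2\left[ r^{(xy)}(\chi-\chi') \right]^2 . \tag{7.42e} $$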

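Equation (7.40c) is the easiest to evaluate. This time applying Eq. (3.41b) in Chapter 3 to the jointly normal random quantities $Y$ and $Y'$, we can write

$$ \mathrm{E}(Y^2Y'^2) = \mathrm{E}(Y^2)\,\mathrm{E}(Y'^2) + 2\left[ \mathrm{E}(YY') \right]^2 . \tag{7.43a} $$

Substituting this into (7.40c), we get

$$ \mathrm{E}\!\left( \theta_y^2(\chi)\,\theta_y^2(\chi') \right) = \mathrm{E}(Y^2)\,\mathrm{E}(Y'^2) + 2\left[ \mathrm{E}(YY') \right]^2 , $$

which becomes, using (7.39c) and (7.36f),

$$ \mathrm{E}\!\left( \theta_y^2(\chi)\,\theta_y^2(\chi') \right) = \varsigma_y^4 + 2\left[ r^{(yy)}(\chi-\chi') \right]^2 . \tag{7.43b} $$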

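Now that (7.40a)-(7.40c) have been evaluated, the next step is to use them to find a formula for $r^{(\theta 2)}_{\tilde n \tilde n}$ in terms of $r^{(xx)}$, $r^{(yy)}$, and $r^{(xy)}$. Substituting Eq. (7.8b) into (7.30a) gives

$$ r^{(\theta 2)}_{\tilde n \tilde n}(\chi-\chi') = \mathrm{E}\!\left( \left[ \theta^2(\chi) - \theta_{rms}^2 \right]\left[ \theta^2(\chi') - \theta_{rms}^2 \right] \right) . \tag{7.44a} $$

The product on the right-hand side can be expanded to get

$$ r^{(\theta 2)}_{\tilde n \tilde n}(\chi-\chi') = \mathrm{E}\!\left( \theta^2(\chi)\,\theta^2(\chi') - \theta_{rms}^2\,\theta^2(\chi) - \theta_{rms}^2\,\theta^2(\chi') + \theta_{rms}^4 \right) . $$

The linearity of the expectation operator [see Sec. 3.10 in Chapter 3 and also Eq. (3.9f)] lets this be written as

$$ r^{(\theta 2)}_{\tilde n \tilde n}(\chi-\chi') = \mathrm{E}\!\left( \theta^2(\chi)\,\theta^2(\chi') \right) - \theta_{rms}^2\,\mathrm{E}\!\left( \theta^2(\chi) \right) - \theta_{rms}^2\,\mathrm{E}\!\left( \theta^2(\chi') \right) + \theta_{rms}^4 , $$

which becomes, using Eq. (7.3c),

$$ r^{(\theta 2)}_{\tilde n \tilde n}(\chi-\chi') = \mathrm{E}\!\left( \theta^2(\chi)\,\theta^2(\chi') \right) - \theta_{rms}^4 . \tag{7.44b} $$

Substituting from Eq. (7.2c) now gives

$$ r^{(\theta 2)}_{\tilde n \tilde n}(\chi-\chi') = \mathrm{E}\!\left( \left[ \theta_x^2(\chi) + \theta_y^2(\chi) \right]\left[ \theta_x^2(\chi') + \theta_y^2(\chi') \right] \right) - \theta_{rms}^4 , $$

which we can, following the same procedure as before, expand to get

$$
\begin{aligned}
r^{(\theta 2)}_{\tilde n \tilde n}(\chi-\chi')
&= \mathrm{E}\!\left( \theta_x^2(\chi)\,\theta_x^2(\chi') \right) + \mathrm{E}\!\left( \theta_x^2(\chi)\,\theta_y^2(\chi') \right) + \mathrm{E}\!\left( \theta_x^2(\chi')\,\theta_y^2(\chi) \right) \\
&\quad + \mathrm{E}\!\left( \theta_y^2(\chi)\,\theta_y^2(\chi') \right) - \theta_{rms}^4 .
\end{aligned}
\tag{7.44c}
$$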

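Applying Eqs. (7.41f), (7.43b), and (7.42e), the last both as is and after interchanging $\chi$ and $\chi'$, to the right-hand side of (7.44c) gives

$$
\begin{aligned}
r^{(\theta 2)}_{\tilde n \tilde n}(\chi-\chi')
&= 2\left[ r^{(xx)}(\chi-\chi') \right]^2 + 4\alpha^2\, r^{(xx)}(\chi-\chi') + (\varsigma_x^2 + \alpha^2)^2 \\
&\quad + \varsigma_y^2(\varsigma_x^2 + \alpha^2) + 2\left[ r^{(xy)}(\chi-\chi') \right]^2 \\
&\quad + \varsigma_y^2(\varsigma_x^2 + \alpha^2) + 2\left[ r^{(xy)}(\chi'-\chi) \right]^2 \\
&\quad + \varsigma_y^4 + 2\left[ r^{(yy)}(\chi-\chi') \right]^2 - \theta_{rms}^4 \\
&= 2\left[ r^{(xx)}(\chi-\chi') \right]^2 + 4\alpha^2\, r^{(xx)}(\chi-\chi') + (\varsigma_x^2 + \alpha^2 + \varsigma_y^2)^2 \\
&\quad + 2\left[ r^{(xy)}(\chi-\chi') \right]^2 + 2\left[ r^{(xy)}(\chi'-\chi) \right]^2 + 2\left[ r^{(yy)}(\chi-\chi') \right]^2 - \theta_{rms}^4 .
\end{aligned}
$$

Glancing back to Eq. (7.3d), we see that this simplifies to

$$
\begin{aligned}
r^{(\theta 2)}_{\tilde n \tilde n}(\chi-\chi')
&= 2\left[ r^{(xx)}(\chi-\chi') \right]^2 + 2\left[ r^{(yy)}(\chi-\chi') \right]^2 + 4\alpha^2\, r^{(xx)}(\chi-\chi') \\
&\quad + 2\left[ r^{(xy)}(\chi-\chi') \right]^2 + 2\left[ r^{(xy)}(\chi'-\chi) \right]^2 .
\end{aligned}
\tag{7.44d}
$$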

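This is what we want, a formula for $r^{(\theta 2)}_{\tilde n \tilde n}$ in terms of $r^{(xx)}$, $r^{(yy)}$, and $r^{(xy)}$.
The final step is to apply the Fourier transform to Eq. (7.44d). We define $\chi'' = \chi - \chi'$ and write

$$
\begin{aligned}
r^{(\theta 2)}_{\tilde n \tilde n}(\chi'')
&= 2\left[ r^{(xx)}(\chi'') \right]^2 + 2\left[ r^{(yy)}(\chi'') \right]^2 + 4\alpha^2\, r^{(xx)}(\chi'') \\
&\quad + 2\left[ r^{(xy)}(\chi'') \right]^2 + 2\left[ r^{(xy)}(-\chi'') \right]^2 .
\end{aligned}
$$

Dropping the primes and taking the Fourier transform of both sides gives, using the linearity of the Fourier transform described in Sec. 2.6 of Chapter 2,

$$
\begin{aligned}
\mathcal{F}^{(-i\sigma\chi)}\!\left( r^{(\theta 2)}_{\tilde n \tilde n}(\chi) \right)
&= 2\,\mathcal{F}^{(-i\sigma\chi)}\!\left( [r^{(xx)}(\chi)]^2 \right) + 2\,\mathcal{F}^{(-i\sigma\chi)}\!\left( [r^{(yy)}(\chi)]^2 \right) \\
&\quad + 2\,\mathcal{F}^{(-i\sigma\chi)}\!\left( [r^{(xy)}(\chi)]^2 \right) + 2\,\mathcal{F}^{(-i\sigma\chi)}\!\left( [r^{(xy)}(-\chi)]^2 \right) \\
&\quad + 4\alpha^2\,\mathcal{F}^{(-i\sigma\chi)}\!\left( r^{(xx)}(\chi) \right) .
\end{aligned}
\tag{7.45a}
$$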

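The Fourier convolution theorem [see Eq. (2.39j) in Chapter 2] lets us write

$$ \mathcal{F}^{(-i\sigma\chi)}\!\left( [r^{(xx)}(\chi)]^2 \right) = \mathcal{F}^{(-i\sigma\chi)}\!\left( r^{(xx)}(\chi) \right) \ast \mathcal{F}^{(-i\sigma\chi)}\!\left( r^{(xx)}(\chi) \right) $$

and

$$ \mathcal{F}^{(-i\sigma\chi)}\!\left( [r^{(yy)}(\chi)]^2 \right) = \mathcal{F}^{(-i\sigma\chi)}\!\left( r^{(yy)}(\chi) \right) \ast \mathcal{F}^{(-i\sigma\chi)}\!\left( r^{(yy)}(\chi) \right) . $$

Equations (7.36g) and (7.36h) then give

$$ \mathcal{F}^{(-i\sigma\chi)}\!\left( [r^{(xx)}(\chi)]^2 \right) = p^{(xx)}(\sigma) \ast p^{(xx)}(\sigma) , \tag{7.45b} $$

and

$$ \mathcal{F}^{(-i\sigma\chi)}\!\left( [r^{(yy)}(\chi)]^2 \right) = p^{(yy)}(\sigma) \ast p^{(yy)}(\sigma) . \tag{7.45c} $$

Equation (7.36g) needs to be substituted directly into our formula, so we drop the primes and rewrite it as

$$ \mathcal{F}^{(-i\sigma\chi)}\!\left( r^{(xx)}(\chi) \right) = p^{(xx)}(\sigma) . \tag{7.45d} $$

Now Eqs. (7.45b)-(7.45d) can be substituted into (7.45a) to get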

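$$
\begin{aligned}
\mathcal{F}^{(-i\sigma\chi)}\!\left( r^{(\theta 2)}_{\tilde n \tilde n}(\chi) \right)
&= 2\left[ p^{(xx)}(\sigma) \ast p^{(xx)}(\sigma) \right] + 2\left[ p^{(yy)}(\sigma) \ast p^{(yy)}(\sigma) \right] \\
&\quad + 2\,\mathcal{F}^{(-i\sigma\chi)}\!\left( [r^{(xy)}(\chi)]^2 \right) + 2\,\mathcal{F}^{(-i\sigma\chi)}\!\left( [r^{(xy)}(-\chi)]^2 \right) \\
&\quad + 4\alpha^2\, p^{(xx)}(\sigma) ,
\end{aligned}
$$

which becomes, applying (7.30b) to the left-hand side,

$$
\begin{aligned}
p^{(\theta 2)}_{\tilde n \tilde n}(\sigma)
&= 2\left[ p^{(xx)}(\sigma) \ast p^{(xx)}(\sigma) \right] + 2\left[ p^{(yy)}(\sigma) \ast p^{(yy)}(\sigma) \right] + 4\alpha^2\, p^{(xx)}(\sigma) \\
&\quad + 2\left\{ \mathcal{F}^{(-i\sigma\chi)}\!\left( [r^{(xy)}(\chi)]^2 \right) + \mathcal{F}^{(-i\sigma\chi)}\!\left( [r^{(xy)}(-\chi)]^2 \right) \right\} .
\end{aligned}
\tag{7.45e}
$$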

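The term inside the braces { }, which is the last term on the right-hand side of (7.45e), can be simplified if we write the Fourier transforms as integrals. Defining $\chi' = -\chi$ lets us write

$$
\begin{aligned}
&\mathcal{F}^{(-i\sigma\chi)}\!\left( [r^{(xy)}(\chi)]^2 \right) + \mathcal{F}^{(-i\sigma\chi)}\!\left( [r^{(xy)}(-\chi)]^2 \right) \\
&\qquad = \int_{-\infty}^{\infty} [r^{(xy)}(\chi)]^2\, e^{-2\pi i \sigma\chi}\, d\chi + \int_{-\infty}^{\infty} [r^{(xy)}(-\chi)]^2\, e^{-2\pi i \sigma\chi}\, d\chi \\
&\qquad = \int_{-\infty}^{\infty} [r^{(xy)}(\chi)]^2\, e^{-2\pi i \sigma\chi}\, d\chi + \int_{-\infty}^{\infty} [r^{(xy)}(\chi')]^2\, e^{2\pi i \sigma\chi'}\, d\chi' .
\end{aligned}
$$

Glancing back at the definition of $r^{(xy)}$ in Eq. (7.37a), we note that $r^{(xy)}$ is real, which makes the second integral,

$$ \int_{-\infty}^{\infty} [r^{(xy)}(\chi')]^2\, e^{2\pi i \sigma\chi'}\, d\chi' , $$

the complex conjugate of the first,

$$ \int_{-\infty}^{\infty} [r^{(xy)}(\chi)]^2\, e^{-2\pi i \sigma\chi}\, d\chi . $$

Hence

$$
\begin{aligned}
&\mathcal{F}^{(-i\sigma\chi)}\!\left( [r^{(xy)}(\chi)]^2 \right) + \mathcal{F}^{(-i\sigma\chi)}\!\left( [r^{(xy)}(-\chi)]^2 \right) \\
&\qquad = 2\,\mathrm{Re}\!\left( \int_{-\infty}^{\infty} [r^{(xy)}(\chi)]^2\, e^{-2\pi i \sigma\chi}\, d\chi \right) = 2\,\mathrm{Re}\!\left( \mathcal{F}^{(-i\sigma\chi)}\!\left( [r^{(xy)}(\chi)]^2 \right) \right) .
\end{aligned}
$$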

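Applying Eq. (2.39j) in Chapter 2 now gives

$$
\begin{aligned}
&\mathcal{F}^{(-i\sigma\chi)}\!\left( [r^{(xy)}(\chi)]^2 \right) + \mathcal{F}^{(-i\sigma\chi)}\!\left( [r^{(xy)}(-\chi)]^2 \right) \\
&\qquad = 2\,\mathrm{Re}\!\left( \mathcal{F}^{(-i\sigma\chi)}\!\left( r^{(xy)}(\chi) \right) \ast \mathcal{F}^{(-i\sigma\chi)}\!\left( r^{(xy)}(\chi) \right) \right) ,
\end{aligned}
$$

which becomes, applying Eq. (7.37b),

$$ \mathcal{F}^{(-i\sigma\chi)}\!\left( [r^{(xy)}(\chi)]^2 \right) + \mathcal{F}^{(-i\sigma\chi)}\!\left( [r^{(xy)}(-\chi)]^2 \right) = 2\,\mathrm{Re}\!\left[ p^{(xy)}(\sigma) \ast p^{(xy)}(\sigma) \right] . \tag{7.45f} $$

Equation (7.45f) can now be substituted into (7.45e) to get

$$
\begin{aligned}
p^{(\theta 2)}_{\tilde n \tilde n}(\sigma)
&= 2\left[ p^{(xx)}(\sigma) \ast p^{(xx)}(\sigma) \right] + 2\left[ p^{(yy)}(\sigma) \ast p^{(yy)}(\sigma) \right] \\
&\quad + 4\alpha^2\, p^{(xx)}(\sigma) + 4\,\mathrm{Re}\!\left[ p^{(xy)}(\sigma) \ast p^{(xy)}(\sigma) \right] .
\end{aligned}
\tag{7.45g}
$$

This, surprisingly enough, is the result we need in order to learn something about the likely shape of the $p^{(\theta 2)}_{\tilde n \tilde n}$ noise-power spectrum.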

7.13 The Shape of the $p^{(\theta 2)}_{\tilde n \tilde n}$ Power Spectrum
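If we return to the ideal case where $\theta_x$ and $\theta_y$ are taken to be independent random variables, then Eq. (3.11b) in Chapter 3 can be used to write

$$ \mathrm{E}\!\left( \theta_x(\chi)\,\theta_y(\chi') \right) = \mathrm{E}\!\left( \theta_x(\chi) \right)\mathrm{E}\!\left( \theta_y(\chi') \right) = 0 $$

because $\theta_y(\chi')$ is, according to Eq. (7.2g), a zero-mean random variable:

$$ \mathrm{E}\!\left( \theta_y(\chi') \right) = 0 . $$

Similarly, according to Eqs. (7.36a) and (7.36d), we have, using the linearity of the expectation operator E described in Sec. 3.10 of Chapter 3,

$$ \mathrm{E}(XY') = \mathrm{E}\!\left( [\theta_x(\chi) - \alpha]\,\theta_y(\chi') \right) = \mathrm{E}\!\left( \theta_x(\chi) \right)\mathrm{E}\!\left( \theta_y(\chi') \right) - \alpha\,\mathrm{E}\!\left( \theta_y(\chi') \right) = 0 $$

again because $\theta_y(\chi')$ is a zero-mean random variable. Consequently $r^{(xy)}(\chi - \chi')$ in Eq. (7.37a) above is zero and so is its Fourier transform $p^{(xy)}(\sigma)$ defined in Eq. (7.37b). The formula for $p^{(\theta 2)}_{\tilde n \tilde n}(\sigma)$ in Eq. (7.45g) now becomes

$$ p^{(\theta 2)}_{\tilde n \tilde n}(\sigma) = 2\left[ p^{(xx)}(\sigma) \ast p^{(xx)}(\sigma) \right] + 2\left[ p^{(yy)}(\sigma) \ast p^{(yy)}(\sigma) \right] + 4\alpha^2\, p^{(xx)}(\sigma) . \tag{7.46} $$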
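Equation (7.46) is easy to evaluate on a grid once shapes are chosen for the two input spectra. The sketch below is only an illustration under stated assumptions: both input spectra are taken to be Gaussians (the text has not fixed their shape at this point), the bias angle, standard deviations, and spectral width are arbitrary illustrative numbers, and the convolutions are approximated by discrete sums. As a consistency check it compares the area under the result with the closed form that appears later as Eq. (7.50b).

```python
import numpy as np

def gaussian_spectrum(sigma, area, width):
    """Gaussian power spectrum whose integral over sigma equals `area`."""
    return area / (width * np.sqrt(2 * np.pi)) * np.exp(-sigma**2 / (2 * width**2))

def tilt_noise_spectrum(p_xx, p_yy, alpha, d_sigma):
    """Eq. (7.46): p_nn = 2 p_xx*p_xx + 2 p_yy*p_yy + 4 alpha^2 p_xx,
    with each convolution approximated by a discrete sum on a uniform grid."""
    conv_xx = np.convolve(p_xx, p_xx, mode="same") * d_sigma
    conv_yy = np.convolve(p_yy, p_yy, mode="same") * d_sigma
    return 2 * conv_xx + 2 * conv_yy + 4 * alpha**2 * p_xx

# Illustrative values (assumptions): angles in rad, wavenumbers in cm^-1.
alpha, std_x, std_y = 2e-6, 1e-5, 0.5e-5
sigma = np.linspace(-2000.0, 2000.0, 4001)   # odd length, centered on sigma = 0
d_sigma = sigma[1] - sigma[0]
p_xx = gaussian_spectrum(sigma, std_x**2, 100.0)
p_yy = gaussian_spectrum(sigma, std_y**2, 100.0)
p_nn = tilt_noise_spectrum(p_xx, p_yy, alpha, d_sigma)

area = p_nn.sum() * d_sigma
expected = 2 * (std_x**4 + std_y**4 + 2 * alpha**2 * std_x**2)
print(area, expected)
```

An odd-length grid centered on zero is used so that `mode="same"` returns the convolution on the same wavenumber axis as the inputs.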

We can recognize two extreme cases for the right-hand side of formula (7.46): one where $\alpha^2$ is relatively large compared to $p^{(xx)} \ast p^{(xx)}$ and $p^{(yy)} \ast p^{(yy)}$, and one where $\alpha^2$ is relatively small compared to $p^{(xx)} \ast p^{(xx)}$ and $p^{(yy)} \ast p^{(yy)}$. When the squared bias angle $\alpha^2$ is relatively large,

$$ p^{(\theta 2)}_{\tilde n \tilde n}(\sigma) \cong 4\alpha^2\, p^{(xx)}(\sigma) , \tag{7.47a} $$

and when $\alpha^2$ is relatively small,

$$ p^{(\theta 2)}_{\tilde n \tilde n}(\sigma) \cong 2\left[ p^{(xx)}(\sigma) \ast p^{(xx)}(\sigma) \right] + 2\left[ p^{(yy)}(\sigma) \ast p^{(yy)}(\sigma) \right] . \tag{7.47b} $$

Equation (7.47a) shows that, when $\alpha^2$ is large, $p^{(\theta 2)}_{\tilde n \tilde n}$ is proportional to $p^{(xx)}$ so it has the same basic shape as $p^{(xx)}$; and, since $p^{(xx)}$ is taken to be the standard power spectrum of an ordinary wide-sense stationary random variable, we expect $p^{(\theta 2)}_{\tilde n \tilde n}$ also to have a shape appropriate to an ordinary wide-sense stationary random variable. When $\alpha^2$ is small, however, Eq. (7.47b) shows that $p^{(\theta 2)}_{\tilde n \tilde n}$ is twice the sum of the convolutions of the $p^{(xx)}$ and $p^{(yy)}$ power spectra with themselves.
Before going any further, we pause to examine more carefully what it means to say that $\alpha^2$ is large or small compared to $p^{(xx)} \ast p^{(xx)}$ or $p^{(yy)} \ast p^{(yy)}$. Suppose that $\sigma^{(xx)}_{spread}$ is the length of the $\sigma$ axis over which $p^{(xx)}(\sigma)$ is significantly different from zero and that $\sigma^{(yy)}_{spread}$ is the length of the $\sigma$ axis over which $p^{(yy)}(\sigma)$ is significantly different from zero. Similarly, we say that $p^{(xx)}_{typ}$ is the typical scale size of $p^{(xx)}$ and $p^{(yy)}_{typ}$ is the typical scale size of $p^{(yy)}$. Then the scale size of the convolutions $p^{(xx)} \ast p^{(xx)}$ and $p^{(yy)} \ast p^{(yy)}$ can be approximated as [see the definition of the convolution in Eq. (2.38a) of Chapter 2]

$$ p^{(xx)}(\sigma) \ast p^{(xx)}(\sigma) = \int_{-\infty}^{\infty} p^{(xx)}(\sigma')\, p^{(xx)}(\sigma - \sigma')\, d\sigma' \approx \left[ p^{(xx)}_{typ} \right]^2 \sigma^{(xx)}_{spread} \tag{7.47c} $$

and

$$ p^{(yy)}(\sigma) \ast p^{(yy)}(\sigma) = \int_{-\infty}^{\infty} p^{(yy)}(\sigma')\, p^{(yy)}(\sigma - \sigma')\, d\sigma' \approx \left[ p^{(yy)}_{typ} \right]^2 \sigma^{(yy)}_{spread} . \tag{7.47d} $$

From Eqs. (7.36i) and (7.36j), it follows that

$$ r^{(xx)}(0) = \int_{-\infty}^{\infty} p^{(xx)}(\sigma)\, d\sigma \quad\text{and}\quad r^{(yy)}(0) = \int_{-\infty}^{\infty} p^{(yy)}(\sigma)\, d\sigma , $$

which can be approximated as

$$ r^{(xx)}(0) \approx \sigma^{(xx)}_{spread}\, p^{(xx)}_{typ} \tag{7.47e} $$

and

$$ r^{(yy)}(0) \approx \sigma^{(yy)}_{spread}\, p^{(yy)}_{typ} . \tag{7.47f} $$

From the definitions of $r^{(xx)}(0)$ and $r^{(yy)}(0)$ in Eqs. (7.36e) and (7.36f), we know that

$$ r^{(xx)}(0) = \mathrm{E}(X^2) \quad\text{and}\quad r^{(yy)}(0) = \mathrm{E}(Y^2) $$

because, according to (7.36a)-(7.36d), $X' = X$ and $Y' = Y$ when $\chi' = \chi$. Hence, (7.47e) and (7.47f) can also be written as

$$ \mathrm{E}(X^2) \approx \sigma^{(xx)}_{spread}\, p^{(xx)}_{typ} \quad\text{and}\quad \mathrm{E}(Y^2) \approx \sigma^{(yy)}_{spread}\, p^{(yy)}_{typ} , $$

which, after substituting from (7.39b) and (7.39c), simplifies to

$$ \varsigma_x^2 \approx \sigma^{(xx)}_{spread}\, p^{(xx)}_{typ} \tag{7.47g} $$

and

$$ \varsigma_y^2 \approx \sigma^{(yy)}_{spread}\, p^{(yy)}_{typ} . \tag{7.47h} $$

This means that the approximations in (7.47c) and (7.47d) can be written as

$$ p^{(xx)}(\sigma) \ast p^{(xx)}(\sigma) \approx p^{(xx)}_{typ}\, \varsigma_x^2 \tag{7.47i} $$

and

$$ p^{(yy)}(\sigma) \ast p^{(yy)}(\sigma) \approx p^{(yy)}_{typ}\, \varsigma_y^2 . \tag{7.47j} $$

Hence, when in Eq. (7.46) we say that the product $\alpha^2 p^{(xx)}$ is large or small compared to $p^{(xx)} \ast p^{(xx)}$ or $p^{(yy)} \ast p^{(yy)}$, it is the same as saying that $\alpha^2 p^{(xx)}_{typ}$ is large or small compared to $p^{(xx)}_{typ}\,\varsigma_x^2$ and $p^{(yy)}_{typ}\,\varsigma_y^2$. Assuming that $p^{(xx)}_{typ} \approx p^{(yy)}_{typ}$, it follows that the relevant comparison is between $\alpha^2$ and $\varsigma_x^2$, $\varsigma_y^2$ or, taking the square root, between $\alpha$ and $\varsigma_x$, $\varsigma_y$. If $\alpha$ is much larger than $\varsigma_x$ and $\varsigma_y$, then Eq. (7.46) reduces to approximation (7.47a); and if $\alpha$ is much smaller than $\varsigma_x$ and $\varsigma_y$, then Eq. (7.46) reduces to approximation (7.47b).
Working with approximation (7.47b) first, we note that, according to inequality (3.54g) in Chapter 3, a power spectrum is never negative, and Eq. (3.49b) in Chapter 3 reminds us that power spectra are even functions of their arguments. Equation (7.47b) requires $p^{(\theta 2)}_{\tilde n \tilde n}$ to be the sum of two terms, with each term being twice the convolution of a power spectrum with itself. This means that $p^{(\theta 2)}_{\tilde n \tilde n}$ ought to share the same general shape as the convolution of any non-negative and even power spectrum with itself. Experience shows that noise-power spectra tend to have one of the two types of shape shown by the dashed lines in Figs. 7.2(a) or 7.2(b). Section 3.25 in Chapter 3 describes what is meant by band-limited white noise, and the dashed line in Fig. 7.2(a) depicts a power spectrum closely resembling this ideal case. The other type of shape often seen is what could be called quasi-harmonic noise, which has a power spectrum containing multiple narrow peaks. This is the type of spectrum shown by the dashed lines in Fig. 7.2(b). In a way, band-limited white noise and quasi-harmonic noise represent the two possible extreme cases, or opposite types of noise, which could describe the behavior of the $X$ and $Y$ random quantities. The solid curve in Fig. 7.2(a) shows what the convolution of the dashed plot in Fig. 7.2(a) with itself looks like, and the solid peaks in Fig. 7.2(b) show what the convolution of the dashed plot in Fig. 7.2(b) with itself looks like. These solid lines show that the convolution of either extreme case with itself has a shape with a large central hump. Consequently, we expect $p^{(\theta 2)}_{\tilde n \tilde n}$ also to have a large central hump when it obeys (7.47b), so it makes sense, when picking a generic shape for a $p^{(\theta 2)}_{\tilde n \tilde n}(\sigma)$ power spectrum described by formula (7.47b), to choose a Gaussian function:

$$ p^{(\theta 2)}_{\tilde n \tilde n}(\sigma) = \Theta\, e^{-\sigma^2/(2s^2)} . \tag{7.48} $$

In this formula, both $\Theta$ and $s$ are positive real numbers.

FIGURE 7.2(a). [Plot of $f(\sigma)$ (dashed) and $f(\sigma) \ast f(\sigma)$ (solid) for a smooth $f(\sigma)$ localized near the origin.]

FIGURE 7.2(b). [Plot of $f(\sigma)$ (dashed) and $f(\sigma) \ast f(\sigma)$ (solid) for an $f(\sigma)$ consisting of multiple, isolated peaks.]

The dashed lines in Figs. 7.2(a) and 7.2(b) represent function $f(\sigma)$ and the solid lines represent the convolution of function $f(\sigma)$ with itself; that is, they represent $f(\sigma) \ast f(\sigma)$. Figure 7.2(a) shows what happens to a smooth $f(\sigma)$ localized near the origin and Fig. 7.2(b) shows what happens to $f(\sigma)$ when it consists of multiple, isolated peaks.

Formula (7.47a), the other approximation for $p^{(\theta 2)}_{\tilde n \tilde n}$, forces it to have the same basic shape as the $p^{(xx)}$ power spectrum. In this situation, it makes sense to assume that $p^{(\theta 2)}_{\tilde n \tilde n}$ and $p^{(xx)}$ have the simple quasi-harmonic shape depicted in Fig. 7.2(c), just to see what the misalignment NEdN looks like when, unlike Eq. (7.48), the largest $p^{(\theta 2)}_{\tilde n \tilde n}$ values are far away from the $\sigma = 0$ origin.
The quasi-harmonic power spectral shape in Fig. 7.2(c) can be specified by
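$$ p^{(xx)}(\sigma) = p^{(xx)}_0 \left[ \mathcal{H}\!\left( \sigma + \sigma_C + \frac{\sigma_M}{2},\ \frac{\sigma_M}{2} \right) + \mathcal{H}\!\left( -\sigma + \sigma_C + \frac{\sigma_M}{2},\ \frac{\sigma_M}{2} \right) \right] \tag{7.49a} $$

when referring to the noise-power spectrum of $X = \theta_x(\chi) - \alpha$ in Eq. (7.36a) and by

$$ p^{(\theta 2)}_{\tilde n \tilde n}(\sigma) = p^{(\theta 2)}_0 \left[ \mathcal{H}\!\left( \sigma + \sigma_C + \frac{\sigma_M}{2},\ \frac{\sigma_M}{2} \right) + \mathcal{H}\!\left( -\sigma + \sigma_C + \frac{\sigma_M}{2},\ \frac{\sigma_M}{2} \right) \right] \tag{7.49b} $$

when referring to the noise-power spectrum of $\tilde n^{(\theta 2)}(\chi)$ from Eq. (7.8b). In both formulas $p^{(xx)}_0$, $p^{(\theta 2)}_0$, $\sigma_C$, and $\sigma_M$ are positive real parameters; and both power spectra have the same shape (only their maximum values $p^{(xx)}_0$ and $p^{(\theta 2)}_0$ are different). The $\mathcal{H}$ function has the same formula as in Eq. (7.12a) above:

$$ \mathcal{H}(\sigma_a, \sigma_b) = \begin{cases} 1 & \text{for } |\sigma_a| \le \sigma_b \\ 0 & \text{for } |\sigma_a| > \sigma_b \end{cases} . $$

FIGURE 7.2(c). This plot shows the shape with respect to $\sigma$ (in cm$^{-1}$) of functions $p^{(\theta 2)}_{\tilde n \tilde n}(\sigma)$ and $p^{(xx)}(\sigma)$ for the quasi-harmonic formulas in Eqs. (7.49a) and (7.49b). [Axis labels: peak value $p^{(\theta 2)}_0$ or $p^{(xx)}_0$; band edges at $\pm\sigma_C$ and $\pm(\sigma_C + \sigma_M)$.]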
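The claim that both extreme spectral shapes produce a large central hump when convolved with themselves can be tried out on a grid. The following is a minimal sketch with made-up band edges (a flat band out to 100 cm^-1 for the band-limited white case, and two bands between 100 and 150 cm^-1 for the quasi-harmonic case); these numbers are illustrative assumptions, not the spectra plotted in Figs. 7.2(a) and 7.2(b).

```python
import numpy as np

sigma = np.arange(-600, 601)          # wavenumber grid in cm^-1, centered on 0
center = len(sigma) // 2

# Band-limited white noise: flat inside |sigma| <= 100 (illustrative cutoff).
white = (np.abs(sigma) <= 100).astype(float)

# Quasi-harmonic noise: two narrow bands at 100 <= |sigma| <= 150 (illustrative).
quasi = ((np.abs(sigma) >= 100) & (np.abs(sigma) <= 150)).astype(float)

def self_convolution(spectrum):
    """Discrete approximation of spectrum * spectrum on the same grid (unit spacing)."""
    return np.convolve(spectrum, spectrum, mode="same")

white_conv = self_convolution(white)
quasi_conv = self_convolution(quasi)

for name, conv in (("band-limited white", white_conv), ("quasi-harmonic", quasi_conv)):
    print(name, "-> largest value at sigma =", sigma[np.argmax(conv)])
```

In the quasi-harmonic case the self-convolution also has side peaks near plus and minus twice the band center, but the central peak is twice as high because both bands overlap their mirror images at zero shift.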
7.14 The Size of the $p^{(\theta 2)}_{\tilde n \tilde n}$ Power Spectrum
Having chosen either (7.48) or (7.49b) to specify the shape of the $p^{(\theta 2)}_{\tilde n \tilde n}$ power spectrum, we turn to Eq. (7.31b) to connect the amplitude of $p^{(\theta 2)}_{\tilde n \tilde n}$ to its spread in wavenumbers. Substitution of Eq. (7.3d) into (7.31b) gives

$$ \mathrm{E}\!\left( \theta^4(\chi) \right) = (\alpha^2 + \varsigma_x^2 + \varsigma_y^2)^2 + \int_{-\infty}^{\infty} p^{(\theta 2)}_{\tilde n \tilde n}(\sigma)\, d\sigma , \tag{7.50a} $$

where $\alpha$, $\varsigma_x$, $\varsigma_y$ are respectively the bias angle, the standard deviation of the $\theta_x$ component of the random misalignment angle, and the standard deviation of the $\theta_y$ component of the random misalignment angle. All three quantities are, of course, measured in radians. At the beginning of the previous section, we specified $\theta_x$ and $\theta_y$ to be independent random quantities; and the derivation of Eq. (7.45g) in Sec. 7.12 assumes that both $\theta_x$ and $\theta_y$ are normally distributed random variables [obeying probability density distributions of the type shown in Eqs. (7.2h) and (7.2i) above]. Equation (7.4d) thus requires that

$$ \mathrm{E}\!\left( \theta^4(\chi) \right) = 3\varsigma_x^4 + 6\alpha^2\varsigma_x^2 + \alpha^4 + 3\varsigma_y^4 + 2(\alpha^2 + \varsigma_x^2)\,\varsigma_y^2 , $$

which can be put into Eq. (7.50a) to get

$$ \int_{-\infty}^{\infty} p^{(\theta 2)}_{\tilde n \tilde n}(\sigma)\, d\sigma + (\alpha^2 + \varsigma_x^2 + \varsigma_y^2)^2 = 3(\varsigma_x^4 + \varsigma_y^4) + 6\alpha^2\varsigma_x^2 + \alpha^4 + 2\varsigma_y^2(\alpha^2 + \varsigma_x^2) $$

or

$$ \int_{-\infty}^{\infty} p^{(\theta 2)}_{\tilde n \tilde n}(\sigma)\, d\sigma = 2(\varsigma_x^4 + \varsigma_y^4 + 2\alpha^2\varsigma_x^2) . \tag{7.50b} $$
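The algebra leading to Eq. (7.50b) can be confirmed with a short numerical check. The sketch below evaluates the fourth moment of the misalignment angle by Gauss-Hermite quadrature for independent normal components and compares the result, minus the squared mean-square angle, with the right-hand side of (7.50b). The numerical values are arbitrary illustrative choices, and the helper name is hypothetical; the mean-square angle is taken to be the sum of the squared bias angle and the two variances, as the substitution of Eq. (7.3d) into (7.50a) implies.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def fourth_moment_of_tilt(alpha, std_x, std_y, order=8):
    """E[theta^4] for theta^2 = theta_x^2 + theta_y^2, with independent normal
    theta_x ~ N(alpha, std_x^2) and theta_y ~ N(0, std_y^2), evaluated by
    Gauss-Hermite quadrature (exact here, since the integrand is a polynomial)."""
    nodes, weights = hermegauss(order)          # weight function exp(-z^2 / 2)
    weights = weights / np.sqrt(2 * np.pi)      # normalize to a standard normal
    theta_x = alpha + std_x * nodes[:, None]
    theta_y = std_y * nodes[None, :]
    theta_sq = theta_x**2 + theta_y**2
    return np.sum(weights[:, None] * weights[None, :] * theta_sq**2)

alpha, std_x, std_y = 0.5, 1.0, 0.8             # illustrative values, arbitrary units
theta_rms_sq = alpha**2 + std_x**2 + std_y**2   # mean-square angle
lhs = fourth_moment_of_tilt(alpha, std_x, std_y) - theta_rms_sq**2
rhs = 2 * (std_x**4 + std_y**4 + 2 * alpha**2 * std_x**2)   # Eq. (7.50b)
print(lhs, rhs)
```

Quadrature is used instead of random sampling so that the two sides agree to rounding error rather than to statistical error.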
Equation (7.50b) is the formula we need to connect the size of the proposed $p^{(\theta 2)}_{\tilde n \tilde n}$ spectral shape to its spread in wavenumbers.
When the Gaussian shape in formula (7.48) is chosen, we note that parameter $\Theta$ specifies the size of the spectrum and parameter $s$ determines the spectral spread in wavenumbers. Substituting (7.48) into (7.50b) gives

$$ \Theta \int_{-\infty}^{\infty} e^{-\sigma^2/(2s^2)}\, d\sigma = 2(\varsigma_x^4 + \varsigma_y^4 + 2\alpha^2\varsigma_x^2) . \tag{7.51a} $$

Equation (7A.3d) in Appendix 7A can be written as, replacing $t$ by $\sigma$ and $\varsigma$ by $s$,

$$ \int_{-\infty}^{\infty} e^{-\sigma^2/(2s^2)}\, d\sigma = s\sqrt{2\pi} . \tag{7.51b} $$

Applying this to (7.51a), we find that

$$ \Theta\, s\sqrt{2\pi} = 2(\varsigma_x^4 + \varsigma_y^4 + 2\alpha^2\varsigma_x^2) , $$

which can be solved for $\Theta$ to get

$$ \Theta = \frac{1}{s}\sqrt{\frac{2}{\pi}}\,(\varsigma_x^4 + \varsigma_y^4 + 2\alpha^2\varsigma_x^2) . \tag{7.51c} $$

This is the expected connection between the size of the noise-power spectrum and its spread in wavenumbers. Glancing back at the discussion following Eq. (7.47j), we recall that the Gaussian spectral shape in (7.48) stems from an assumption that the bias angle $\alpha$ is small compared to $\varsigma_x$ and $\varsigma_y$. Hence (7.51c) can be approximated as

$$ \Theta \cong \frac{1}{s}\sqrt{\frac{2}{\pi}}\,(\varsigma_x^4 + \varsigma_y^4) . \tag{7.51d} $$

Formula (7.51c) can be substituted back into (7.48) to get

$$ p^{(\theta 2)}_{\tilde n \tilde n}(\sigma) = \frac{1}{s}\sqrt{\frac{2}{\pi}}\,(\varsigma_x^4 + \varsigma_y^4 + 2\alpha^2\varsigma_x^2)\, e^{-\sigma^2/(2s^2)} . \tag{7.51e} $$

Using that $\alpha$ is small compared to $\varsigma_x$ and $\varsigma_y$, we substitute (7.51d) into (7.48), or just neglect $\alpha$ in (7.51e), to get

$$ p^{(\theta 2)}_{\tilde n \tilde n}(\sigma) \cong \frac{1}{s}\sqrt{\frac{2}{\pi}}\,(\varsigma_x^4 + \varsigma_y^4)\, e^{-\sigma^2/(2s^2)} . \tag{7.51f} $$
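As a quick numerical illustration of Eq. (7.51c), the sketch below computes the Gaussian amplitude for a spectral spread of 200 cm^-1 and a single-axis standard deviation of 10^-5 rad with zero bias angle (the values used for the simulation in Sec. 7.15), and then cross-checks the area under the resulting Gaussian against Eq. (7.50b). The function and variable names are illustrative assumptions.

```python
import numpy as np

def gaussian_amplitude(s, std_x, std_y, alpha=0.0):
    """Amplitude of the Gaussian spectrum from Eq. (7.51c), in rad^4 cm
    when s is in cm^-1 and the angles are in radians."""
    return np.sqrt(2 / np.pi) / s * (std_x**4 + std_y**4 + 2 * alpha**2 * std_x**2)

s = 200.0                 # spectral spread in cm^-1
std_x, std_y = 1e-5, 0.0  # standard deviations in rad, with zero bias angle
amplitude = gaussian_amplitude(s, std_x, std_y)

# Cross-check Eq. (7.50b): the area under the Gaussian should equal
# 2 * (std_x^4 + std_y^4) when the bias angle is zero.
sigma = np.linspace(-10 * s, 10 * s, 20001)
d_sigma = sigma[1] - sigma[0]
area = np.sum(amplitude * np.exp(-sigma**2 / (2 * s**2))) * d_sigma
print(f"amplitude = {amplitude:.4e} rad^4 cm, area = {area:.4e} rad^4")
```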

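The quasi-harmonic shape for $p^{(\theta 2)}_{\tilde n \tilde n}$ specified in Eq. (7.49b) stems from the assumption that the bias angle $\alpha$ is large compared to $\varsigma_x$ and $\varsigma_y$. Here the spread in wavenumbers is specified by parameter $\sigma_M$ and the size of the power spectrum is determined by $p^{(\theta 2)}_0$. Substituting (7.49b) into (7.50b) gives

$$ 2\sigma_M\, p^{(\theta 2)}_0 = 2(\varsigma_x^4 + \varsigma_y^4 + 2\alpha^2\varsigma_x^2) $$

or

$$ p^{(\theta 2)}_0 = \frac{1}{\sigma_M}\,(\varsigma_x^4 + \varsigma_y^4 + 2\alpha^2\varsigma_x^2) . \tag{7.52a} $$

This is the connection between size $p^{(\theta 2)}_0$ and wavenumber spread $\sigma_M$ for the spectral shape specified in (7.49b). Because now we are assuming that $\alpha$ is large compared to $\varsigma_x$ and $\varsigma_y$, the formula can be approximated as

$$ p^{(\theta 2)}_0 \cong \frac{2\alpha^2\varsigma_x^2}{\sigma_M} . \tag{7.52b} $$

Formula (7.52a) can be applied to Eq. (7.49b) to get

$$ p^{(\theta 2)}_{\tilde n \tilde n}(\sigma) = \frac{\varsigma_x^4 + \varsigma_y^4 + 2\alpha^2\varsigma_x^2}{\sigma_M} \left[ \mathcal{H}\!\left( \sigma + \sigma_C + \frac{\sigma_M}{2},\ \frac{\sigma_M}{2} \right) + \mathcal{H}\!\left( -\sigma + \sigma_C + \frac{\sigma_M}{2},\ \frac{\sigma_M}{2} \right) \right] \tag{7.52c} $$

or, again using that $\alpha$ is large compared to $\varsigma_x$ and $\varsigma_y$,

$$ p^{(\theta 2)}_{\tilde n \tilde n}(\sigma) \cong \frac{2\alpha^2\varsigma_x^2}{\sigma_M} \left[ \mathcal{H}\!\left( \sigma + \sigma_C + \frac{\sigma_M}{2},\ \frac{\sigma_M}{2} \right) + \mathcal{H}\!\left( -\sigma + \sigma_C + \frac{\sigma_M}{2},\ \frac{\sigma_M}{2} \right) \right] \tag{7.52d} $$

for the quasi-harmonic $p^{(\theta 2)}_{\tilde n \tilde n}$ noise-power spectrum.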
7.15 Simulated Misalignment Noise
To show how misalignment noise can disturb the measurements of Michelson interferometers, we simulate misalignment-contaminated measurements of both a black-body spectrum and an isolated Lorentz emission line. For the misalignment-contaminated black-body measurements, we use a Gaussian noise-power spectrum such as the one specified in Eq. (7.48), and for the Lorentz emission line we use the quasi-harmonic noise-power spectrum specified in Eq. (7.49b) and graphed in Fig. 7.2(c). In both cases the $\theta_x$ and $\theta_y$ components of the misalignment angle are taken to be independent random variables.
When generating the black-body measurements, the simulated interferometer samples the interferogram at $N = 8192$ evenly spaced positions between the optical-path differences of $-D$ and $D$, with $D = 1.28$ cm. Glancing back at Eq. (5.67) in Chapter 5, we see that the unapodized spectral resolution is now

$$ \frac{1}{2D} \cong 0.391\ \mathrm{cm}^{-1} \tag{7.53a} $$

and the optical-path difference between the evenly spaced samples of the interferogram signal is

$$ \Delta\chi = \frac{2D}{N} = 3.125 \times 10^{-4}\ \mathrm{cm} . \tag{7.53b} $$
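The two numbers in Eqs. (7.53a) and (7.53b) follow directly from the maximum optical-path difference and the number of samples. A minimal sketch of the arithmetic (the function name is an illustrative assumption):

```python
def sampling_parameters(max_opd_cm, n_samples):
    """Unapodized resolution 1/(2D) and sample spacing 2D/N for an
    interferogram sampled at N evenly spaced points between -D and +D."""
    resolution = 1.0 / (2.0 * max_opd_cm)        # cm^-1
    spacing = 2.0 * max_opd_cm / n_samples       # cm
    return resolution, spacing

resolution, spacing = sampling_parameters(max_opd_cm=1.28, n_samples=8192)
print(f"resolution = {resolution:.3f} cm^-1, spacing = {spacing:.4e} cm")
```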

The background radiance is assumed to be negligible, so

$$ L^{(dir)}(\sigma) = L^{(fore)}_{FOV}(\sigma) = L^{(back)}_{FOV}(\sigma) = L^{(fore)}_{mnf}(\sigma) = L^{(back)}_{mnf}(\sigma) = 0 . \tag{7.53c} $$

We might as well give the responsivity $\mathcal{R}$ and the optical parameters $\tau_a$, $\tau_f$, $\eta$ their ideal values

$$ \mathcal{R}(\sigma) = 1\ \frac{\mathrm{amp \cdot sec}}{\mathrm{erg}} \tag{7.53d} $$

and

$$ \tau_a(\sigma) = \tau_f(\sigma) = \eta(\sigma) = 1 , \tag{7.53e} $$

because in formula (7.34b) they just end up rescaling the spectral noise to turn it into $\mathrm{NEdN}_{tilt}$. The beam passing through the interferometer has a circular cross section of radius $R = 3$ cm, so according to Eq. (7.2b)

$$ a^2 = 2\pi^2 R^2 \cong 177.65\ \mathrm{cm}^2 \tag{7.53f} $$

and, of course, the beam cross-sectional area is

$$ A = \pi R^2 \cong 28.27\ \mathrm{cm}^2 . \tag{7.53g} $$
The interferometer's field of view is

$$ \Delta\Omega = 1.086 \times 10^{-4}\ \mathrm{ster} , \tag{7.53h} $$

and parameter $W$, explained in the discussion following Eq. (4.83) in Chapter 4, is

$$ W = 1 . \tag{7.53i} $$

The detector electronics in Fig. 6.2 of Chapter 6 are given a three-pole, low-pass Butterworth filter. Figure 7.3 plots $\mathrm{Re}(H(u\sigma))$, $\mathrm{Im}(H(u\sigma))$, and $|H(u\sigma)|$ of this filter against wavenumber $\sigma$. The OPD velocity $u$ is taken to be 5 cm/sec and the filter cutoff frequency is 8000 Hz. This means the magnitude $|H(u\sigma)|$ of the transfer function does not fall off by much inside the 650 cm$^{-1}$ to 1150 cm$^{-1}$ band of wavenumbers measured by the interferometer. The simulated instrument is calibrated using Planck black-body radiances of 77 K (the temperature of liquid nitrogen) and 350 K.
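To see how little the filter attenuates the measured band, one can evaluate the textbook magnitude response of an ideal n-pole low-pass Butterworth filter, 1/sqrt(1 + (f/f_c)^(2n)), at the electrical frequencies f = u times wavenumber. This is a sketch under that assumption only; it does not reproduce the phase response or the exact transfer function plotted in Fig. 7.3, and the function name is illustrative.

```python
import numpy as np

def butterworth_magnitude(wavenumber, opd_velocity=5.0, cutoff_hz=8000.0, poles=3):
    """|H| of an ideal low-pass Butterworth filter at electrical frequency
    f = u * sigma (u in cm/sec, sigma in cm^-1, f in Hz)."""
    frequency = opd_velocity * np.asarray(wavenumber, dtype=float)
    return 1.0 / np.sqrt(1.0 + (frequency / cutoff_hz) ** (2 * poles))

band_edges = np.array([650.0, 1150.0])   # edges of the measured band in cm^-1
magnitudes = butterworth_magnitude(band_edges)
print(magnitudes)
```

With these settings the magnitude stays above 0.93 across the band, which is consistent with the statement that it does not fall off by much.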
To characterize the noise in these simulated black-body measurements, we have already decided to use the Gaussian noise-power spectrum in Eq. (7.48), which means [see discussion following Eq. (7.46) and continuing on to Eq. (7.48)] that the bias angle $\alpha$ must be negligible. To keep things simple, we make the bias angle zero,

$$ \alpha = 0 . \tag{7.54a} $$

The Gaussian power spectrum in (7.48) has

$$ s = 200\ \mathrm{cm}^{-1} \tag{7.54b} $$

and

$$ \Theta = 3.989 \times 10^{-23}\ \mathrm{rad}^4\,\mathrm{cm} , \tag{7.54c} $$

which gives us, by combining Eqs. (7.48) and (7.34b), all the information needed to calculate the $\mathrm{NEdN}_{tilt}$ contaminating the black-body spectrum. Figure 7.4(a) plots the Gaussian $p^{(\theta 2)}_{\tilde n \tilde n}$ noise-power spectrum in (7.48) for the $s$ and $\Theta$ values in (7.54b) and (7.54c).
Now that $s$ and $\Theta$ are specified, and the bias angle is set to zero in (7.54a), Eqs. (7.51c) or (7.51d) show that

$$ \varsigma_x^4 + \varsigma_y^4 \cong 10^{-20}\ \mathrm{rad}^4 . \tag{7.55a} $$
This does not specify uniquely the amount of misalignment error contributed by the $\theta_x$ and $\theta_y$ components of the misalignment angle. We could, for example, treat $\theta_x$ and $\theta_y$ on an equal footing by saying that

$$ \varsigma_x^4 = \varsigma_y^4 = 5 \times 10^{-21}\ \mathrm{rad}^4 , \tag{7.55b} $$

which would then give, according to Eq. (7.3d),

$$ \theta_{rms} \cong 1.19 \times 10^{-5}\ \mathrm{rad} . \tag{7.55c} $$

To keep the arithmetic simple, we choose another approach, assuming that $\theta_y$ is always zero so that

$$ \varsigma_y = 0 . \tag{7.55d} $$

The remaining $\theta_x$ component obeys a zero-mean normal probability-density distribution specified by the value of $\varsigma_x$. Now Eqs. (7.55a) and (7.3d) reduce to

$$ \varsigma_x = 10^{-5}\ \mathrm{rad} \tag{7.55e} $$

so that

$$ \theta_{rms} = 10^{-5}\ \mathrm{rad} . \tag{7.55f} $$
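Both choices can be reproduced with a few lines of arithmetic. The sketch below assumes, as the use of Eq. (7.3d) above implies, that the squared rms angle is the sum of the squared bias angle and the two variances; the function and variable names are illustrative.

```python
import math

def theta_rms(alpha, std_x, std_y):
    """RMS misalignment angle, assuming theta_rms^2 = alpha^2 + std_x^2 + std_y^2."""
    return math.sqrt(alpha**2 + std_x**2 + std_y**2)

total_fourth = 1e-20                      # std_x^4 + std_y^4 in rad^4, Eq. (7.55a)

# Equal split, Eq. (7.55b): std_x^4 = std_y^4 = 5e-21 rad^4.
std_equal = (total_fourth / 2) ** 0.25
rms_equal = theta_rms(0.0, std_equal, std_equal)
print(f"equal split: theta_rms = {rms_equal:.3e} rad")

# Single-axis choice, Eqs. (7.55d)-(7.55e): std_y = 0.
std_x_only = total_fourth ** 0.25
rms_x_only = theta_rms(0.0, std_x_only, 0.0)
print(f"x only:      theta_rms = {rms_x_only:.3e} rad")
```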

Figures 7.4(b) and 7.4(c) plot a simulation of $\tilde n^{(\theta 2)}$ misalignment noise [defined in Eq. (7.8b) above] for an interferometer disturbed by a Gaussian noise-power spectrum governed by the parameter choices shown in (7.54a)-(7.54c), (7.55d), and (7.55e). Figure 7.4(b) covers a small range of OPD values to show what this sort of misalignment noise looks like in detail, and Fig. 7.4(c) covers the entire range of OPD values between $+1.28$ cm and $-1.28$ cm.
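One common way to generate noise of this kind is spectral synthesis: filter white Gaussian noise so that the tilt angle has a prescribed power spectrum, then square it and subtract its variance. The sketch below is an illustration under stated assumptions and is not necessarily the procedure used to produce Figs. 7.4(b) and 7.4(c). It assumes zero bias angle, no y component, and a Gaussian tilt spectrum of width s/sqrt(2), chosen because the self-convolution of such a Gaussian is a Gaussian of width s, in line with approximation (7.47b). All names are hypothetical.

```python
import numpy as np

def simulate_tilt_noise(n_samples=8192, max_opd=1.28, s=200.0, std_x=1e-5, seed=0):
    """Spectral-synthesis sketch: draw theta_x(chi) as a stationary Gaussian
    sequence whose power spectrum is a Gaussian of width s / sqrt(2), then
    form the noise theta_x^2 - std_x^2 (zero bias angle, theta_y = 0)."""
    rng = np.random.default_rng(seed)
    d_chi = 2.0 * max_opd / n_samples                  # OPD spacing in cm
    sigma = np.fft.fftfreq(n_samples, d=d_chi)         # wavenumbers in cm^-1
    width = s / np.sqrt(2.0)
    p_xx = std_x**2 / (width * np.sqrt(2 * np.pi)) * np.exp(-sigma**2 / (2 * width**2))
    # Shape white Gaussian noise so that its variance matches the area under p_xx.
    white = rng.standard_normal(n_samples)
    theta_x = np.fft.ifft(np.fft.fft(white) * np.sqrt(p_xx / d_chi)).real
    chi = -max_opd + d_chi * np.arange(n_samples)
    return chi, theta_x, theta_x**2 - std_x**2

chi, theta_x, noise = simulate_tilt_noise()
print(np.var(theta_x), noise.mean())
```

The sample variance of `theta_x` should come out close to the requested variance, and the mean of `noise` close to zero, with a scatter of a few percent because neighboring samples are correlated.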
Figures 7.5(a) and 7.5(b) show what happens when the Gaussian misalignment noise just
described above contaminates measurements of a 320-K Planck black-body spectrum performed
by the interferometer system specified at the beginning of this section [see Eqs. (7.53a)(7.53i)
and the paragraph immediately following Eq. (7.53i)]. The solid line in Fig. 7.5(a) is the true
spectral radiance entering the instrument. This black-body curve is smooth enough that, when
calculating NEdN
tilt
, we do not have to worry about the different shapes of the radiance functions
L, L
FOV
, and L
mnf
specified
107
in Secs. 5.18 and 5.23 of Chapter 5. [A similar point was made
earlier in Sec. 7.6 about the L
(1)
and L
(2)
calibration radiancessee Eqs. (7.19a) and (7.19b)].
Figure 7.5(a) also contains ten independent, noise-contaminated measurements shown by dotted

107
The modified radiances L
FOV
and L
mnf
are defined in Eqs. (5.83e) and (5.108d) respectively.
.54
.55d
.]
- 933 -
FIGURE 7.3.

curves, several of which are too close to the solid curve to be easily seen. This gives some idea of
how the misalignment noise causes the 320-K radiance curve generated by the interferometer to
jump around from measurement to measurement while retaining the general shape of a true
black-body spectrum. The solid curve in Fig. 7.5(b) is the NEdN
tilt
calculated from formula
(7.34b) above. It is clearly consistent with the spread of the dotted curves in Fig. 7.5(a). We have
analyzed 3600 independent, noise-contaminated spectral measurements of this 320-K radiance
curve, calculating the standard deviation of the error as a function of wavenumber between
-1
650 cm and
-1
1150 cm . The crosses in Fig. 7.5(b) plot these standard deviations; there is a close
1.5
1.5
Re Htot u
.
( ) ( )
Im Htot u
.
( ) ( )
Htot u
.
( )
2000 0
0 500 1000 1500 2000
1.5
1
0.5
0
0.5
1
1.5
1000 1500 2000 500 0
-1.0
-0.5
0.0
0.5
1.0
(in cm
-1
)
The solid curve is the magnitude of the transfer function H(u) plotted against . The dashed
and dotted curves are its real and imaginary parts respectively.
- 934 -
FIGURE 7.4(a).

______________________________________________________________________________

match between them and the predicted $\mathrm{NEdN}_{tilt}$ curve, showing that the simulated interferometer measurements obey the expected spectral statistics.
Figure 7.6 shows the Lorentz emission line measured in the second simulated interferometer measurement. We use the same interferometer system as in the black-body measurement, with two connected changes: the fore optics transmission is taken to be

$$\tau_f(\sigma) = 0.9 \tag{7.56a}$$

instead of one, and the fore optics background radiance $L^{(fore)}_{mnf}$ is no longer assumed to be zero. These changes are connected because, when $\tau_f$ is less than one, it contributes a nonzero
[Plot: $p^{(2)}_{\tilde n\tilde n}$ (in rad$^4$ cm), from 0 to $5\times10^{-23}$, against $\sigma$ (in cm$^{-1}$), from -800 to 800.]
This is a plot of the Gaussian noise-power spectrum in Eq. (7.48) with $p^{(2)}_0 = 3.989\times10^{-23}$ rad$^4$ cm and $s = 200$ cm$^{-1}$.

FIGURE 7.4(b).

background radiance to the optical signal. (The changes are made to show the effect of background radiance on a Lorentz-line measurement contaminated by misalignment noise.) The $L^{(fore)}_{mnf}$ background radiance is taken to be a gray-body Planck curve [described in the discussion following Eq. (5.3k) in Chapter 5] with a constant emissivity of 0.1; and, since all the other interferometer optics are taken to be ideal, we can still set the other background radiances to zero. Because we are now dealing with a Lorentz emission line instead of a smooth Planck curve, it is no longer safe to assume automatically that the input spectrum is so smooth that the interferometer's finite field of view and finite interferogram length have no significant effect on the measured spectrum.
Equation (5.83e) in Chapter 5 reminds us that the finite field of view rescales the wavenumber axis by a factor of

$$1 + \frac{\Delta\Omega}{4\pi}.$$

This becomes, using the $\Delta\Omega = 1.086\times10^{-4}$ ster value from Eq. (7.53h),
[Plot: $n^{(2)}(\chi)$ (in rad$^2$), from $-6\times10^{-10}$ to $6\times10^{-10}$, against $\chi$ (in cm), from -0.1 to 0.1.]

FIGURE 7.4(c).

_____________________________________________________________________________________________

$$1 + \frac{\Delta\Omega}{4\pi} \cong 1 + 8.642\times10^{-6}. \tag{7.56b}$$

Consequently, the effective wavenumber of the input radiance spectrum is in error by approximately 0.00086%. For the Lorentz emission line in Fig. 7.6, this amounts to a shift of about 0.0086 cm$^{-1}$, which is far too small to see on the scale of the graph. The $\Delta\Omega$ finite field of view also, according to Eqs. (5.83e) and (5.82c) in Chapter 5, blurs the input radiance over a wavenumber interval of

$$\frac{\sigma\,\Delta\Omega}{2\pi} = \frac{10^{-4}\cdot10^{3}\ \mathrm{cm}^{-1}}{6} \cong 0.0167\ \mathrm{cm}^{-1}. \tag{7.56c}$$
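
The arithmetic behind Eqs. (7.56b) and (7.56c) can be checked with a few lines of Python. This is only a sketch: the variable names are ours, and the representative line position of 1000 cm^-1 is an assumption chosen to match the rounded value used in Eq. (7.56c), not a number quoted for the Lorentz line itself.

```python
import math

# Field-of-view solid angle quoted from Eq. (7.53h), in steradians.
delta_omega = 1.086e-4
# Assumed representative wavenumber for the emission line, in cm^-1.
sigma_line = 1000.0

# Fractional rescaling of the wavenumber axis, Eq. (7.56b).
rescale = delta_omega / (4.0 * math.pi)
# Resulting shift of a line near sigma_line, in cm^-1.
shift = sigma_line * rescale

# Blur interval without rounding, and with the rounded values of Eq. (7.56c).
blur_unrounded = sigma_line * delta_omega / (2.0 * math.pi)
blur_rounded = 1.0e-4 * 1.0e3 / 6.0

print(f"rescale factor - 1 : {rescale:.4e}")
print(f"percent error      : {100.0 * rescale:.5f} %")
print(f"line shift         : {shift:.4f} cm^-1")
print(f"blur (unrounded)   : {blur_unrounded:.4f} cm^-1")
print(f"blur (rounded)     : {blur_rounded:.4f} cm^-1")
```

The unrounded blur comes out slightly larger than the rounded estimate, but both are far below anything visible on the scale of Fig. 7.6.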

[Plot: $n^{(2)}(\chi)$ (in rad$^2$), from $-6\times10^{-10}$ to $6\times10^{-10}$, against $\chi$ (in cm), axis ticks from -1.0 to 1.0.]
FIGURE 7.5(a).
[Plot: radiance (in mW/m$^2$/sr/cm$^{-1}$), from 90 to 200, against wavenumber (in cm$^{-1}$), from 600 to 1200.]
FIGURE 7.5(b).

This is also much too small to matter on the scale of Fig. 7.6. All that is now left to check is the effect of the finite interferogram length. The value of the unapodized spectral resolution is 0.391 cm$^{-1}$ in Eq. (7.53a) above. Glancing back at the discussion following Eq. (5.67) in Chapter 5, we note that the unapodized spectral resolution determines the scale of the spectral blurring caused by the interferometer's finite interferogram length. The Lorentz line in Fig. 7.6 looks wide enough not to have its width significantly affected by the blurring effects of an unapodized spectral resolution of 0.391 cm$^{-1}$. So there is still no need to worry about the slightly different shapes of the radiance functions $L$, $L_{FOV}$, and $L_{mnf}$ when discussing the radiance spectrum entering, or measured by, the interferometer.
[Plot: radiance error (in mW/m$^2$/sr/cm$^{-1}$), from 0 to 5, against wavenumber (in cm$^{-1}$), from 600 to 1200.]
The analysis in Sec. 7.13 shows that the bias angle $\mu$ must be large compared to $\psi_x$ and $\psi_y$ for the noise-power spectrum $p^{(2)}_{\tilde n\tilde n}$ to have the quasi-harmonic shape specified in Eq. (7.52d) above. To satisfy this requirement for the noise contaminating the Lorentz emission line, we set $\mu \approx 10^{-5}$ rad and $\psi_x \approx 10^{-6}$ rad. Taking $\psi_y$ to be approximately the same size as $\psi_x$, we again use Eq. (7.3d) to get

$$\theta_{rms} \cong 10^{-5}\ \mathrm{rad} \tag{7.57a}$$

just like in Eq. (7.55f) above for the black-body measurements. Choosing

$$\sigma_C \approx 100\ \mathrm{cm}^{-1} \tag{7.57b}$$

and

$$\sigma_M \approx 20\ \mathrm{cm}^{-1}, \tag{7.57c}$$

we consult Eq. (7.52b) to get

$$p^{(2)}_0 \cong 10^{-23}\ \mathrm{cm\ rad}^4 \tag{7.57d}$$

in Eq. (7.49b). To get the desired quasi-harmonic $p^{(2)}_{\tilde n\tilde n}$ spectrum, we just apply these $\sigma_C$, $\sigma_M$, and $p^{(2)}_0$ parameters to the graph in Fig. 7.2(c). Figures 7.7(a) and 7.7(b) contain an example of $n^{(2)}$ misalignment noise [as defined in Eq. (7.8b)] obeying this quasi-harmonic spectrum. The x and y components are independent, zero-mean, and normally distributed random quantities. Figure 7.7(a) plots $n^{(2)}$ over a small set of OPD values to show what this quasi-harmonic misalignment noise looks like in detail, and Fig. 7.7(b) plots $n^{(2)}$ over the entire range of OPD values between +1.28 cm and -1.28 cm.
Figures 7.8(a) and 7.8(b) show what happens when the quasi-harmonic noise described above contaminates the measurement of the Lorentz emission line in Fig. 7.6. The split solid curves in Fig. 7.8(a) depict the rising and trailing edges of the Lorentz emission line using a stretched y axis, which puts the top of the emission line off the top of the graph. The continuous solid line is the $\mathrm{NEdN}_{tilt}$ curve predicted by formulas (7.34b) and (7.35a), and the dotted lines are ten measurements of the Lorentz emission line contaminated by the quasi-harmonic misalignment noise. The $\mathrm{NEdN}_{tilt}$ curve correctly predicts the presence and location of the ghost-line noise peaks in the dotted curves, and it also confirms the way the overall level of the noise-contaminated measurements rises and falls with respect to the true spectral level far away from the ghost lines. The ghost-line noise is predicted by the first term on the right-hand side of Eq. (7.35a). This term is basically a convolution of the quasi-harmonic $p^{(2)}_{\tilde n\tilde n}$ power spectrum with the Lorentz line shape contained in the square of the $Z_{mnf}$ function.
FIGURE 7.6.
[Plot: radiance (in mW/m$^2$/sr/cm$^{-1}$), from 0 to 100, against wavenumber (in cm$^{-1}$), from 800 to 1100.]
We note that the ghost-line regions lie on either side of the Lorentz emission line, offset from the line center by

$$\sigma_C + \frac{\sigma_M}{2} = 110\ \mathrm{cm}^{-1}, \tag{7.58}$$

as we would expect from the convolution. The overall rise and fall of the noise-contaminated measurements with respect to the true spectral level comes from both the first and second terms on the right-hand side of (7.35a) and can be traced to the interferometer's nonzero background radiance. This is what happens when misalignment noise interacts with a smooth Planck-like spectrum, just like in Figs. 7.5(a) and 7.5(b). It is important to realize that large background radiances can produce large amounts of background noise even at those wavenumbers where the spectrum being measured is relatively small. We also see that misalignment noise, unlike the detector noise discussed in Chapter 6, need not look very fuzzy and noiselike; it can easily be mistaken for part of the spectral signal. Figure 7.8(b) has the same basic format as Fig. 7.5(b). Again, we generate 3600 noise-contaminated measurements and calculate the standard deviations of the spectral error as a function of wavenumber $\sigma$. Just as before, the crosses marking the values of these standard deviations are a good match to the solid line giving the predicted $\mathrm{NEdN}_{tilt}$ values.
______________________________________________________________________________

FIGURE 7.7(a).
[Plot: $n^{(2)}(\chi)$ (in rad$^2$), from $-8\times10^{-11}$ to $8\times10^{-11}$, against $\chi$ (in cm), from -0.1 to 0.1.]

FIGURE 7.7(b).
[Plot: $n^{(2)}(\chi)$ (in rad$^2$), from $-8\times10^{-11}$ to $8\times10^{-11}$, against $\chi$ (in cm), axis ticks from -1.0 to 1.0.]
FIGURE 7.8(a).
[Plot: radiance error (in mW/m$^2$/sr/cm$^{-1}$), from -0.2 to 0.4, against $\sigma$ (in cm$^{-1}$), from 800 to 1100. Curve labels: noise-free spectrum; $\mathrm{NEdN}_{tilt}$.]
FIGURE 7.8(b).
[Plot: radiance (in mW/m$^2$/ster/cm$^{-1}$), from 0 to 0.10, against wavenumber (in cm$^{-1}$), from 800 to 1100.]
Appendix 7A
We want to calculate the second and fourth moments of the normal probability density distribution

$$p(\theta) = \frac{1}{\psi\sqrt{2\pi}}\,e^{-(\theta-\mu)^2/(2\psi^2)}. \tag{7A.1}$$

Here, $p(\theta)\,d\theta$ is the probability that the continuous random variable $\theta$ takes on a value between $\theta$ and $\theta + d\theta$. The mean value of $\theta$ is $\mu$, and its standard deviation is $\psi$.
We know, for $a > 0$, that [108]

$$\int_0^{\infty} x^2 e^{-ax^2}\,dx = \frac{1}{4a}\sqrt{\frac{\pi}{a}}.$$

Since $x^2 e^{-ax^2}$ is an even function of $x$, this can be written as [according to Eq. (2.19) in Chapter 2]

$$\int_{-\infty}^{\infty} x^2 e^{-ax^2}\,dx = \frac{1}{2a}\sqrt{\frac{\pi}{a}}. \tag{7A.2a}$$

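Taking the partial derivative with respect to $a$ of both sides gives

$$\int_{-\infty}^{\infty} x^4 e^{-ax^2}\,dx = \frac{3}{4}\sqrt{\pi}\,a^{-5/2}. \tag{7A.2b}$$

To get the second moment of $\theta$ when it obeys the $p(\theta)$ probability density distribution in (7A.1), we must calculate

$$\int_{-\infty}^{\infty}\theta^2 p(\theta)\,d\theta = \frac{1}{\psi\sqrt{2\pi}}\int_{-\infty}^{\infty}\theta^2 e^{-(\theta-\mu)^2/(2\psi^2)}\,d\theta.$$

This becomes, changing the variable of integration to $t = \theta - \mu$,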

108
Lennart Rade and Bertil Westergren, Beta Mathematics Handbook, 2nd ed. (CRC Press, Inc., Boca Raton, FL,
1990), formula (42), p. 164.
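
$$\int_{-\infty}^{\infty}\theta^2 p(\theta)\,d\theta = \frac{1}{\psi\sqrt{2\pi}}\int_{-\infty}^{\infty}(t+\mu)^2 e^{-t^2/(2\psi^2)}\,dt$$

or

$$\int_{-\infty}^{\infty}\theta^2 p(\theta)\,d\theta = \frac{1}{\psi\sqrt{2\pi}}\int_{-\infty}^{\infty}t^2 e^{-t^2/(2\psi^2)}\,dt + \frac{2\mu}{\psi\sqrt{2\pi}}\int_{-\infty}^{\infty}t\,e^{-t^2/(2\psi^2)}\,dt + \frac{\mu^2}{\psi\sqrt{2\pi}}\int_{-\infty}^{\infty}e^{-t^2/(2\psi^2)}\,dt. \tag{7A.3a}$$

Applying (7A.2a) to the first term on the right-hand side gives

$$\frac{1}{\psi\sqrt{2\pi}}\int_{-\infty}^{\infty}t^2 e^{-t^2/(2\psi^2)}\,dt = \psi^2. \tag{7A.3b}$$

According to Eq. (2.17) in Chapter 2, the second term on the right-hand side must be zero (because $[\,t\exp(-t^2/(2\psi^2))\,]$ is an odd function of $t$), and we see that the third term must be

$$\frac{\mu^2}{\psi\sqrt{2\pi}}\int_{-\infty}^{\infty}e^{-t^2/(2\psi^2)}\,dt = \mu^2 \tag{7A.3c}$$

because

$$\frac{1}{\psi\sqrt{2\pi}}\int_{-\infty}^{\infty}e^{-t^2/(2\psi^2)}\,dt = 1 \tag{7A.3d}$$

is just the integral of the zero-mean normal probability density over all its allowed values [see Eq. (7A.1)]. Substituting (7A.3b) and (7A.3c) into (7A.3a) gives

$$\int_{-\infty}^{\infty}\theta^2 p(\theta)\,d\theta = \psi^2 + \mu^2. \tag{7A.3e}$$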

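To get the fourth moment of $\theta$ when it obeys the $p(\theta)$ probability density distribution in (7A.1), we evaluate

$$\int_{-\infty}^{\infty}\theta^4 p(\theta)\,d\theta,$$

by again changing the variable of integration to $t = \theta - \mu$ to get

$$\int_{-\infty}^{\infty}\theta^4 p(\theta)\,d\theta = \frac{1}{\psi\sqrt{2\pi}}\int_{-\infty}^{\infty}\theta^4 e^{-(\theta-\mu)^2/(2\psi^2)}\,d\theta = \frac{1}{\psi\sqrt{2\pi}}\int_{-\infty}^{\infty}(t+\mu)^4 e^{-t^2/(2\psi^2)}\,dt$$

or

$$\begin{aligned}\int_{-\infty}^{\infty}\theta^4 p(\theta)\,d\theta ={}& \frac{1}{\psi\sqrt{2\pi}}\int_{-\infty}^{\infty}t^4 e^{-t^2/(2\psi^2)}\,dt + \frac{4\mu}{\psi\sqrt{2\pi}}\int_{-\infty}^{\infty}t^3 e^{-t^2/(2\psi^2)}\,dt + \frac{6\mu^2}{\psi\sqrt{2\pi}}\int_{-\infty}^{\infty}t^2 e^{-t^2/(2\psi^2)}\,dt\\ &+ \frac{4\mu^3}{\psi\sqrt{2\pi}}\int_{-\infty}^{\infty}t\,e^{-t^2/(2\psi^2)}\,dt + \frac{\mu^4}{\psi\sqrt{2\pi}}\int_{-\infty}^{\infty}e^{-t^2/(2\psi^2)}\,dt.\end{aligned} \tag{7A.4a}$$

The second and fourth terms on the right-hand side of (7A.4a) are zero because $[\,t^3\exp(-t^2/(2\psi^2))\,]$ and $[\,t\exp(-t^2/(2\psi^2))\,]$ are odd functions of $t$. Applying Eqs. (7A.2b), (7A.3b), and (7A.3d),

$$\begin{aligned}\int_{-\infty}^{\infty}\theta^4 p(\theta)\,d\theta &= \frac{1}{\psi\sqrt{2\pi}}\cdot\frac{3}{4}\sqrt{\pi}\,(2\psi^2)^{5/2} + 6\mu^2\psi^2 + \mu^4\\ &= 3\psi^4 + 6\mu^2\psi^2 + \mu^4.\end{aligned} \tag{7A.4b}$$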

In Sec. 7.2 above, random variable $\theta_x$ obeys a normal probability density distribution that has a mean of $\mu$ and a standard deviation of $\psi_x$. According to Eqs. (7A.3e) and (7A.4b), we can therefore write that

$$\mathrm{E}(\theta_x^2) = \psi_x^2 + \mu^2 \tag{7A.5a}$$

and

$$\mathrm{E}(\theta_x^4) = 3\psi_x^4 + 6\mu^2\psi_x^2 + \mu^4. \tag{7A.5b}$$

Random variable $\theta_y$ obeys a probability density distribution with a mean of zero and a standard deviation of $\psi_y$. This means that, setting $\mu = 0$ in Eqs. (7A.5a) and (7A.5b), we know

$$\mathrm{E}(\theta_y^2) = \psi_y^2 \tag{7A.5c}$$

and

$$\mathrm{E}(\theta_y^4) = 3\psi_y^4. \tag{7A.5d}$$

Equations (7A.5a)-(7A.5d) are the results we need for the derivation of the mirror-tilt NEdN.
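
The second- and fourth-moment formulas of this appendix are easy to confirm numerically. The sketch below integrates the normal density with a composite Simpson rule written out by hand; the function names, the test values of the mean and standard deviation, and the integration limits are all our own choices for illustration.

```python
import math

def normal_pdf(x, mean, std):
    """Normal probability density with the given mean and standard deviation."""
    return math.exp(-((x - mean) ** 2) / (2.0 * std ** 2)) / (std * math.sqrt(2.0 * math.pi))

def moment(order, mean, std, half_width=12.0, steps=20000):
    """Composite Simpson estimate of the integral of x**order * pdf(x).

    The integration range is mean +/- half_width standard deviations,
    and steps must be an even number.
    """
    lower = mean - half_width * std
    upper = mean + half_width * std
    h = (upper - lower) / steps
    total = 0.0
    for k in range(steps + 1):
        x = lower + k * h
        if k == 0 or k == steps:
            weight = 1.0
        elif k % 2 == 1:
            weight = 4.0
        else:
            weight = 2.0
        total += weight * (x ** order) * normal_pdf(x, mean, std)
    return total * h / 3.0

mean, std = 1.5, 0.7          # arbitrary test values
second_numeric = moment(2, mean, std)
fourth_numeric = moment(4, mean, std)
second_formula = std ** 2 + mean ** 2
fourth_formula = 3.0 * std ** 4 + 6.0 * mean ** 2 * std ** 2 + mean ** 4

print(second_numeric, second_formula)
print(fourth_numeric, fourth_formula)
```

Setting `mean = 0.0` in the same script reproduces the zero-mean results used for the y component.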
Appendix 7B
Although the $\rho^{(xx)}$, $\rho^{(yy)}$ autocorrelation functions and the $p^{(xx)}$, $p^{(yy)}$ noise-power spectra introduced in Sec. 7.12 above follow the expected pattern, being both real and even like every autocorrelation function and power spectrum of a wide-sense stationary random function, [109] the cross-correlation function $\rho^{(xy)}$ and cross-power spectrum $p^{(xy)}$ introduced in Eqs. (7.37a) and (7.37b) exhibit a more complicated symmetry. In particular, we should be careful to note that the $p^{(xy)}$ cross-power spectrum can have a nonzero imaginary component.
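Equation (7.37a) defines the cross-correlation function of $X_{\chi'}$ and $Y_{\chi''}$ to be, using the notation of Sec. 7.12,

$$\rho^{(xy)}(\chi''-\chi') = \mathrm{E}\big(X_{\chi'}Y_{\chi''}\big). \tag{7B.1a}$$

This can also be written as [see Eqs. (7.36a) and (7.36d)]

$$\rho^{(xy)}(\chi''-\chi') = \mathrm{E}\big([\theta_x(\chi')-\mu]\,\theta_y(\chi'')\big). \tag{7B.1b}$$

Using the linearity of E with respect to random variables (see Sec. 3.10 of Chapter 3), we note that

$$\mathrm{E}\big([\theta_x(\chi')-\mu]\,\theta_y(\chi'')\big) = \mathrm{E}\big(\theta_x(\chi')\,\theta_y(\chi'')\big) - \mu\,\mathrm{E}\big(\theta_y(\chi'')\big).$$

Since $\mathrm{E}\big(\theta_y(\chi'')\big) = 0$, this reduces to

$$\mathrm{E}\big([\theta_x(\chi')-\mu]\,\theta_y(\chi'')\big) = \mathrm{E}\big(\theta_x(\chi')\,\theta_y(\chi'')\big),$$

which means that Eq. (7B.1b) can be written as

$$\rho^{(xy)}(\chi''-\chi') = \mathrm{E}\big(\theta_x(\chi')\,\theta_y(\chi'')\big). \tag{7B.1c}$$

This shows that $\rho^{(xy)}$ does not depend on the bias tilt angle $\mu$. Interchanging the positions of $\chi'$ and $\chi''$ in Eqs. (7B.1a) and (7B.1c) gives

$$\rho^{(xy)}(\chi'-\chi'') = \mathrm{E}\big(X_{\chi''}Y_{\chi'}\big) \tag{7B.1d}$$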

109
See Sec. 3.20 of Chapter 3 as well as, in Sec. 3.15, the discussion following Eq. (3.30b).
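and

$$\rho^{(xy)}(\chi'-\chi'') = \mathrm{E}\big(\theta_x(\chi'')\,\theta_y(\chi')\big). \tag{7B.1e}$$

We note that since

$$\mathrm{E}\big(X_{\chi'}Y_{\chi''}\big) = \mathrm{E}\big(Y_{\chi''}X_{\chi'}\big)$$

automatically holds true, it follows, interchanging the roles of the x, y labels and the $\chi'$, $\chi''$ variables in Eq. (7B.1a), that

$$\rho^{(xy)}(\chi''-\chi') = \rho^{(yx)}(\chi'-\chi''),$$

which can also be written as, using $\chi = \chi''-\chi'$,

$$\rho^{(xy)}(\chi) = \rho^{(yx)}(-\chi)$$

or, changing the sign of the argument,

$$\rho^{(xy)}(-\chi) = \rho^{(yx)}(\chi). \tag{7B.1f}$$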

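The cross-power spectrum defined in Eq. (7.37b) is

$$p^{(xy)}(\sigma) = \int_{-\infty}^{\infty}\rho^{(xy)}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi. \tag{7B.2a}$$

We note that, substituting $\chi' = -\chi$,

$$p^{(xy)}(-\sigma) = \int_{-\infty}^{\infty}\rho^{(xy)}(\chi)\,e^{2\pi i\sigma\chi}\,d\chi = -\int_{\infty}^{-\infty}\rho^{(xy)}(-\chi')\,e^{-2\pi i\sigma\chi'}\,d\chi' = \int_{-\infty}^{\infty}\rho^{(xy)}(-\chi')\,e^{-2\pi i\sigma\chi'}\,d\chi'.$$

Substituting from (7B.1f) gives

$$p^{(xy)}(-\sigma) = \int_{-\infty}^{\infty}\rho^{(yx)}(\chi')\,e^{-2\pi i\sigma\chi'}\,d\chi'. \tag{7B.2b}$$

We can now interchange the roles of the x and y labels in (7B.2a) to get

$$p^{(yx)}(\sigma) = \int_{-\infty}^{\infty}\rho^{(yx)}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi \tag{7B.2c}$$

and use this definition to write (7B.2b) as

$$p^{(xy)}(-\sigma) = p^{(yx)}(\sigma). \tag{7B.2d}$$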

Equation (7B.2d) matches the relationship between the cross-correlation functions in Eq. (7B.1f).
According to Eq. (7B.1a), the cross-correlation $\rho^{(xy)}$ is the expectation value of the product of two real numbers, so it must be real. We can then write, substituting

$$e^{-2\pi i\sigma\chi} = \cos(2\pi\sigma\chi) - i\sin(2\pi\sigma\chi)$$

into Eq. (7B.2a), that

$$p^{(xy)}(\sigma) = \int_{-\infty}^{\infty}\rho^{(xy)}(\chi)\cos(2\pi\sigma\chi)\,d\chi - i\int_{-\infty}^{\infty}\rho^{(xy)}(\chi)\sin(2\pi\sigma\chi)\,d\chi$$

so that

$$\mathrm{Re}[\,p^{(xy)}(\sigma)\,] = \int_{-\infty}^{\infty}\rho^{(xy)}(\chi)\cos(2\pi\sigma\chi)\,d\chi \tag{7B.3a}$$

and

$$\mathrm{Im}[\,p^{(xy)}(\sigma)\,] = -\int_{-\infty}^{\infty}\rho^{(xy)}(\chi)\sin(2\pi\sigma\chi)\,d\chi. \tag{7B.3b}$$

The remark following Eq. (2.15b) in Chapter 2 points out that the product of any even function with the sine is an odd function, which means, according to Eq. (2.17) in Chapter 2, that its integral from $-\infty$ to $\infty$ must be zero. Thus, if $\rho^{(xy)}$ is an even function, Eq. (7B.3b) is the integral of an odd function between $-\infty$ and $\infty$ and must be zero, showing that the cross-power spectrum $p^{(xy)}$ must be real because $\mathrm{Im}[\,p^{(xy)}\,] = 0$. The obvious next step is to investigate whether $\rho^{(xy)}(\chi)$ must be an even function of $\chi$.
Again we say, just as in the discussion following Eq. (7B.1e), that $\chi = \chi''-\chi'$ so that [see Eqs. (7B.1e) and (7B.1c)]

$$\rho^{(xy)}(-\chi) = \mathrm{E}\big(\theta_x(\chi'')\,\theta_y(\chi')\big) \tag{7B.4a}$$

and

$$\rho^{(xy)}(\chi) = \mathrm{E}\big(\theta_x(\chi')\,\theta_y(\chi'')\big). \tag{7B.4b}$$

Equation (7.9a) shows that $t = \chi/u$ for $u > 0$, so when $\chi'' > \chi'$ the $\theta_y(\chi'')$ random value in Eq. (7B.4b) occurs at a later time than the $\theta_x(\chi')$ random value. Suppose we assume that the $\theta_y$ random quantity always resembles the $\theta_x$ after a time delay T has elapsed because any disturbance in the x component of the misalignment angle is followed by a similar disturbance in the y component of the misalignment angle. In fact, suppose we set up the idealized, but entirely possible, situation that the bias tilt angle $\mu$ is zero and the random y component is exactly equal to the random x component after a time delay of T. This means we can write

$$\theta_x(\chi) = \phi(\chi) \tag{7B.4c}$$

and

$$\theta_y(\chi) = \phi(\chi - uT) \tag{7B.4d}$$

for some random function $\phi$. The value of the cross-correlation function $\rho^{(xy)}$ at

$$\chi = \chi''-\chi' = uT \quad\text{so that}\quad \chi'' = \chi' + uT$$

is then, according to Eq. (7B.4b),

$$\rho^{(xy)}(uT) = \mathrm{E}\big(\theta_x(\chi')\,\theta_y(\chi'+uT)\big).$$

Substituting from (7B.4c) and (7B.4d) then gives

$$\rho^{(xy)}(uT) = \mathrm{E}\big(\phi(\chi')\,\phi(\chi'+uT-uT)\big) = \mathrm{E}\big([\phi(\chi')]^2\big). \tag{7B.4e}$$

This is the variance of $\phi$, which could easily be a rather large quantity if there are large disturbances in the x and y components of the misalignment angle. According to Eq. (7B.4a), on the other hand,

$$\rho^{(xy)}(-uT) = \mathrm{E}\big(\theta_x(\chi'')\,\theta_y(\chi')\big) = \mathrm{E}\big(\theta_x(\chi'+uT)\,\theta_y(\chi')\big),$$

which becomes, substituting from (7B.4c) and (7B.4d),

$$\rho^{(xy)}(-uT) = \mathrm{E}\big(\phi(\chi'+uT)\,\phi(\chi'-uT)\big). \tag{7B.4f}$$

This shows that $\rho^{(xy)}(-uT)$ could easily be quite small when random function $\phi$ is only poorly correlated with itself at different values of its argument. The x and y components of the misalignment angle could, for example, be subject to large random disturbances that first perturb the $\theta_x$ value and then, after a time delay T, perturb the $\theta_y$ value. This would make the value of $\rho^{(xy)}(uT)$ in Eq. (7B.4e) rather large. The disturbances could also, however, be rather short in duration, so that the perturbation of an angle component at one time has little resemblance to the perturbation of that same component at another time. This would make the value of $\rho^{(xy)}(-uT)$ in Eq. (7B.4f) rather small. We can conclude, then, that there is no reason for $\rho^{(xy)}$ to be an even function of its argument. Hence there is no reason to expect the sine integral in (7B.3b), or for that matter the cosine integral in (7B.3a), to be zero, which means the cross-power spectrum $p^{(xy)}$ in Eqs. (7B.2a), (7B.3a), and (7B.3b) can easily have nonzero real and imaginary components.
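
The asymmetry argued for above can be illustrated with a small simulation. The sketch below is our own construction, not part of the derivation: it replaces the random function by a discrete white-noise sequence (the extreme case of disturbances that are short in duration), makes the y sequence a delayed copy of the x sequence, and estimates the cross-correlation at lags of plus and minus the delay. The sequence length, delay, and seed are arbitrary.

```python
import random

def cross_correlation(x, y, lag):
    """Sample estimate of E[x(k) * y(k + lag)] over the valid index range."""
    n = len(x)
    products = [x[k] * y[k + lag] for k in range(n) if 0 <= k + lag < n]
    return sum(products) / len(products)

random.seed(0)
n_samples = 20000
delay = 5   # delay in samples, playing the role of uT

# White, zero-mean, unit-variance stand-in for the random function.
base = [random.gauss(0.0, 1.0) for _ in range(n_samples + delay)]

# x(k) = base(k + delay) and y(k) = base(k), so that y(k) = x(k - delay):
# the y component repeats the x component after the delay.
x = base[delay:]
y = base[:n_samples]

at_plus_delay = cross_correlation(x, y, delay)    # close to the variance
at_minus_delay = cross_correlation(x, y, -delay)  # close to zero

print(at_plus_delay, at_minus_delay)
```

The estimate at the positive lag comes out near the variance of the sequence while the estimate at the negative lag is near zero, so the cross-correlation is plainly not an even function of the lag in this idealized case.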
8
SAMPLING-ERROR NEdN IN DOUBLE-SIDED INTERFEROGRAMS
Random errors in the sampling position produce random errors in the sampled signal. As was
done in Chapter 7 when analyzing misalignment noise, we use wide-sense stationary random
functions to describe the sampling noise, tracing the effect through the calibration process to find
out what the NEdN of the measured spectrum looks like when it is dominated by this sort of
error. In a well-designed interferometer, the sampling-noise NEdN, just like the misalignment-
noise NEdN, should be a small source of error compared to the detector noise. The formulas
derived here can nevertheless be very useful when designing interferometers because they show
how accurately the interferometer signal needs to be sampled. Moreover, when interferometers
produce unusual types of random errors, the size and shape of the errors can be compared to the
predictions of these formulas, making it easier to determine whether an unexpectedly large
sampling noise could be contributing to the problem.

8.1 Noise-Free Signal at the A/D Converter
Sampling noise occurs at point C in Fig. 6.2 of Chapter 6 where the signal is being sampled at the analog-to-digital (A/D) converter. Equation (6.8c) in Chapter 6 specifies the total noise-free signal at point C as a function of the optical-path difference (OPD) value $\chi$,

$$z_C^{(tot)}(\chi) = z_C(\chi) + z_C^{(cold)}(\chi).$$

Equations (6.5d) and (6.12a) in Chapter 6 contain formulas for $z_C$ and $z_C^{(cold)}$ respectively. Substituting these into the formula for $z_C^{(tot)}$ gives

( )
2
ma
( ) (back) 2
ma
R
R
( )
H( ) M( ) ( ) ( ) ( ) ( ) ( )
4
H( ) M( ) ( ) ( ) ( )[ ( ) ( )]
4
tot
C
i
f a FOV
fore i
a FOV FOV
z
WA
u R e d
WA
u R e d
r o
r o
o o q o o t o o o o
AO
AO
+
L
L L

.


( )
ma
( ) (back) 2
R ( ) H( ) M( ) ( ) ( ) ( )
4
[ ( ) ( ) ( ) ( )]
tot
C a
fore i
f FOV FOV FOV
WA
z u R
e d

=
+
L L L .
(8.1a)

The definition of $Z_{FOV}$ in Eq. (7.7b) of Chapter 7,

( ) (back)
R ( ) ( ) ( ) ( )[ ( ) ( ) ( ) ( )]
4
fore
FOV a f FOV FOV FOV
WA

= +

Z L L L , (8.1b)

can now be substituted into (8.1a) to get

$$z_C^{(tot)}(\chi) = \int_{-\infty}^{\infty} H(u\sigma)\,M(\theta_{ma}\sigma R)\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma \tag{8.1c}$$

for the noise-free signal at point C in Fig. 6.2. The Fourier $\mathcal{F}$ operator defined in Sec. 2.5 of Chapter 2 [see Eqs. (2.29a) and (2.29c)] lets this be written as

$$z_C^{(tot)}(\chi) = \mathcal{F}^{(i\sigma\chi)}\big(H(u\sigma)\,M(\theta_{ma}\sigma R)\,Z_{FOV}(\sigma)\big), \tag{8.1d}$$

and, of course, the transform can be reversed to get

$$H(u\sigma)\,M(\theta_{ma}\sigma R)\,Z_{FOV}(\sigma) = \mathcal{F}^{(-i\sigma\chi)}\big(z_C^{(tot)}(\chi)\big). \tag{8.1e}$$

Unlike the interferometer model analyzed in Chapter 7, in this chapter we assume that the misalignment angle $\theta_{ma}$, when it is significantly different from zero, has the same constant value during spectral measurements and their associated calibration procedures; that is, we assume that the misalignment angle $\theta_{ma}$ does not change with time.
8.2 Sampling Noise at the A/D Converter
The sampling-position noise $n^{(s)}$ is defined as a function of the OPD value $\chi$ by saying that if $z_C^{(tot)}$ is supposed to be sampled at $\chi_{correct}$ and is instead sampled at the randomly incorrect OPD value $\chi_{incorrect}$, then the sampling-position noise $n^{(s)}$ at $\chi_{correct}$ is defined to be

$$n^{(s)}(\chi_{correct}) = \chi_{incorrect} - \chi_{correct}. \tag{8.2a}$$

Clearly the units of $n^{(s)}$ are the same as the OPD; that is, units of length (cm). Suppose the plan is to sample $z_C^{(tot)}$ at N equally spaced OPD values in order to generate a double-sided interferogram signal with $\chi = 0$ occurring at or near the middle of the sample set. In the absence of error, we expect the samples to occur at $\chi = \chi_j$ with

$$\chi_j = j\,\Delta\chi, \tag{8.2b}$$

where $\Delta\chi$ is the OPD separation between adjacent samples and, just like in Eq. (5.103b) in Chapter 5,

$$j = -\frac{N}{2}+1,\ -\frac{N}{2}+2,\ \ldots,\ -1,\ 0,\ 1,\ \ldots,\ \frac{N}{2}-1,\ \frac{N}{2}. \tag{8.2c}$$

In the absence of sampling-position noise, there is one sample taken at $\chi = 0$ when $j = 0$, and there is one more sample taken for $\chi > 0$ than for $\chi < 0$. When the sampling-position noise $n^{(s)}(\chi)$ is present, we know that the actual sample positions occur at

$$\chi_j + n^{(s)}(\chi_j)$$

instead of $\chi_j$, and the corresponding sample values are

$$z_C^{(tot)}\big(\chi_j + n^{(s)}(\chi_j)\big)$$

instead of $z_C^{(tot)}(\chi_j)$. We define the sampling noise to be the random errors in the sample values $z_C$ due to the sampling-position noise $n^{(s)}$. We assume that

$$\big|n^{(s)}(\chi_j)\big| \ll \Delta\chi \quad\text{for all } j. \tag{8.2d}$$

This lets us write, for the jth sample value contaminated by sampling noise,

$$z_C^{(tot)}\big(\chi_j + n^{(s)}(\chi_j)\big) \approx z_C^{(tot)}(\chi_j) + n^{(s)}(\chi_j)\left.\frac{dz_C^{(tot)}}{d\chi}\right|_{\chi=\chi_j}. \tag{8.2e}$$

Hence, reverting to continuous OPD notation, we have

$$z_C^{(tot)}\big(\chi + n^{(s)}(\chi)\big) \approx z_C^{(tot)}(\chi) + n^{(s)}(\chi)\,\frac{dz_C^{(tot)}(\chi)}{d\chi}. \tag{8.2f}$$

Since $z_C^{(tot)}(\chi)$ is the noise-free signal and $z_C^{(tot)}\big(\chi + n^{(s)}(\chi)\big)$ is the noise-contaminated signal, we see that the formula for the noise-contaminated signal can be approximated by

$$z_{CN}^{(tot)}(\chi) = z_C^{(tot)}(\chi) + n^{(s)}(\chi)\,\frac{dz_C^{(tot)}(\chi)}{d\chi}. \tag{8.2g}$$

In this chapter the random function $z_{CN}^{(tot)}$ represents the signal contaminated by sampling noise, with the sampling noise caused by the sampling-position noise $n^{(s)}$ at point C in Fig. 6.2 of Chapter 6.
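
To get a feel for how good the first-order approximation is, the following sketch compares the exact change in a sampled signal with the first-order estimate (position error times slope). Everything specific here is an assumption made for illustration: the signal is a single cosine at 1000 cm^-1 rather than a real interferogram, and the sample spacing, sample count, and bound on the position error are arbitrary values chosen so that the position error is much smaller than the sample spacing.

```python
import math
import random

SIGMA_LINE = 1000.0           # assumed wavenumber of a single line, cm^-1
DELTA_CHI = 2.0e-4            # assumed OPD sample spacing, cm
N = 256                       # assumed even number of samples
MAX_POSITION_ERROR = 1.0e-7   # assumed bound on the position error, cm

def signal(chi):
    """Hypothetical noise-free signal: one cosine fringe pattern."""
    return math.cos(2.0 * math.pi * SIGMA_LINE * chi)

def signal_slope(chi):
    """Derivative of signal() with respect to the OPD."""
    return -2.0 * math.pi * SIGMA_LINE * math.sin(2.0 * math.pi * SIGMA_LINE * chi)

random.seed(1)
# Sample indices j = -N/2 + 1, ..., N/2, giving one more sample for positive OPD.
indices = list(range(-N // 2 + 1, N // 2 + 1))

worst_gap = 0.0
largest_first_order = 0.0
for j in indices:
    chi_j = j * DELTA_CHI
    n_j = random.uniform(-MAX_POSITION_ERROR, MAX_POSITION_ERROR)
    exact_error = signal(chi_j + n_j) - signal(chi_j)
    first_order = n_j * signal_slope(chi_j)
    worst_gap = max(worst_gap, abs(exact_error - first_order))
    largest_first_order = max(largest_first_order, abs(first_order))

print("largest first-order sampling noise:", largest_first_order)
print("largest error of the approximation:", worst_gap)
```

For these assumed values the neglected terms are roughly three orders of magnitude smaller than the first-order sampling noise itself.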
8.3 Power Spectrum and Autocorrelation Function of the Sampling Noise
The $n^{(s)}$ sampling-position noise is zero-mean,

$$\mathrm{E}\big(n^{(s)}(\chi)\big) = 0. \tag{8.3a}$$

The expectation operator E is linear with respect to random quantities (see Sec. 3.10 in Chapter 3). Substituting (8.2a) into (8.3a) and applying the expectation operator E, we get

$$\mathrm{E}(\chi_{incorrect} - \chi_{correct}) = \mathrm{E}(\chi_{incorrect}) - \mathrm{E}(\chi_{correct}) = 0.$$

Parameter $\chi_{correct}$ is nonrandom, which means that [see Eq. (3.9f) in Chapter 3]

$$\mathrm{E}(\chi_{correct}) = \chi_{correct}.$$

Consequently, we end up with the formula

$$\mathrm{E}(\chi_{incorrect}) = \chi_{correct}.$$

Hence, (8.3a) is just another way of saying that there is no bias in the attempt to sample the signal; although any given attempt is randomly incorrect, on the average we get the correct OPD value. Following the assumptions stated in the previous section, we take $n^{(s)}$ to be at least wide-sense stationary. This means, according to Eq. (3.30b) in Chapter 3, that its autocorrelation function $\rho^{(s)}_{nn}$ with respect to the OPD can be written as

$$\rho^{(s)}_{nn}(\chi''-\chi') = \mathrm{E}\big(n^{(s)}(\chi')\,n^{(s)}(\chi'')\big). \tag{8.3b}$$

Clearly,

$$\mathrm{E}\big(n^{(s)}(\chi')\,n^{(s)}(\chi'')\big) = \mathrm{E}\big(n^{(s)}(\chi'')\,n^{(s)}(\chi')\big),$$

which means that

$$\rho^{(s)}_{nn}(\chi''-\chi') = \rho^{(s)}_{nn}(\chi'-\chi'').$$

This becomes, defining $\chi = \chi''-\chi'$,

$$\rho^{(s)}_{nn}(\chi) = \rho^{(s)}_{nn}(-\chi), \tag{8.3c}$$

showing that the autocorrelation function for the sampling-position noise is an even function of its argument. It is, of course, also real because $n^{(s)}$ is real:

$$\mathrm{Im}\big(\rho^{(s)}_{nn}(\chi)\big) = 0. \tag{8.3d}$$

The Fourier transform of $\rho^{(s)}_{nn}$ is called the power spectrum of the $n^{(s)}$ sampling-position noise (see Sec. 3.20 of Chapter 3),

$$p^{(s)}_{nn}(\sigma) = \int_{-\infty}^{\infty}\rho^{(s)}_{nn}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi = \mathcal{F}^{(-i\sigma\chi)}\big(\rho^{(s)}_{nn}(\chi)\big). \tag{8.4a}$$

The transform can, of course, be reversed to get

$$\rho^{(s)}_{nn}(\chi) = \int_{-\infty}^{\infty}p^{(s)}_{nn}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma = \mathcal{F}^{(i\sigma\chi)}\big(p^{(s)}_{nn}(\sigma)\big). \tag{8.4b}$$

Equations (8.3c) and (8.3d) show that the autocorrelation function is real and even, which means that, according to entry 1 of Table 2.1 in Chapter 2, the power spectrum $p^{(s)}_{nn}$ must also be real and even:

$$p^{(s)}_{nn}(-\sigma) = p^{(s)}_{nn}(\sigma) \tag{8.4c}$$

and

$$\mathrm{Im}\big(p^{(s)}_{nn}(\sigma)\big) = 0. \tag{8.4d}$$

The value of $\rho^{(s)}_{nn}(0)$ can be used to scale the power spectrum $p^{(s)}_{nn}$. Consulting Eq. (8.4b), we have

$$\rho^{(s)}_{nn}(0) = \int_{-\infty}^{\infty}p^{(s)}_{nn}(\sigma)\,d\sigma \tag{8.5a}$$

which can be written as, substituting from Eq. (8.3b) with $\chi'' = \chi' = \chi$,

$$\mathrm{E}\big([n^{(s)}(\chi)]^2\big) = \int_{-\infty}^{\infty}p^{(s)}_{nn}(\sigma)\,d\sigma. \tag{8.5b}$$

Formula (8.5b) shows, since its right-hand side depends only on the shape and size of the power spectrum, that the wide-sense stationary nature of the sampling-position noise requires $\mathrm{E}\big([n^{(s)}(\chi)]^2\big)$ to be independent of the OPD value $\chi$. If we know that function $S_h(\sigma)$ specifies the shape of the power spectrum, but we do not know the size of the power spectrum, then there exists a real constant $\kappa$ such that

$$p^{(s)}_{nn}(\sigma) = \kappa\,S_h(\sigma). \tag{8.5c}$$

Substituting (8.5c) into (8.5b) then gives

$$\mathrm{E}\big([n^{(s)}(\chi)]^2\big) = \kappa\int_{-\infty}^{\infty}S_h(\sigma)\,d\sigma$$

or

$$\kappa = \left[\int_{-\infty}^{\infty}S_h(\sigma)\,d\sigma\right]^{-1}\mathrm{E}\big([n^{(s)}(\chi)]^2\big) = \left[\int_{-\infty}^{\infty}S_h(\sigma)\,d\sigma\right]^{-1}\mathrm{E}\big([n^{(s)}]^2\big), \tag{8.5d}$$

where the last step drops the $\chi$ argument because wide-sense stationary random functions have the same mean-square value $\mathrm{E}\big([n^{(s)}]^2\big)$ at all values of $\chi$. Hence we can find the value of $\kappa$ from the shape function $S_h$ and the mean-squared error $\mathrm{E}\big([n^{(s)}]^2\big)$. Knowing both $\kappa$ and $S_h(\sigma)$ determines the size and shape of function $p^{(s)}_{nn}$ in Eq. (8.5c), completely specifying the power spectrum of the sampling-position noise in terms of the shape function and the mean-squared error in the sampling position.
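
The scaling procedure just described can be sketched in a few lines of Python. The Gaussian shape function, its 200 cm^-1 width, the assumed rms position error of 1e-7 cm, and the name `kappa` for the scaling constant are all illustrative assumptions of ours; only the procedure itself (divide the mean-square error by the area under the shape function) comes from the text.

```python
import math

def shape(sigma, width=200.0):
    """Hypothetical Gaussian shape function for the power spectrum."""
    return math.exp(-(sigma ** 2) / (2.0 * width ** 2))

def trapezoid(func, lower, upper, steps):
    """Plain trapezoid-rule integral of func over [lower, upper]."""
    h = (upper - lower) / steps
    total = 0.5 * (func(lower) + func(upper))
    for k in range(1, steps):
        total += func(lower + k * h)
    return total * h

# Assumed mean-square sampling-position error, in cm^2.
mean_square_error = (1.0e-7) ** 2

# Area under the shape function, then the scaling constant.
shape_area = trapezoid(shape, -3000.0, 3000.0, 6000)
kappa = mean_square_error / shape_area

def power_spectrum(sigma):
    """Scaled power spectrum: the shape function times the scaling constant."""
    return kappa * shape(sigma)

# Integrating the scaled spectrum should give back the mean-square error.
recovered = trapezoid(power_spectrum, -3000.0, 3000.0, 6000)
print(kappa, recovered, mean_square_error)
```

The integration limits only need to be wide enough that the shape function has decayed to a negligible value at both ends.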
8.4 Uncalibrated Spectral Signals
To create the noise-contaminated, double-sided interferogram, we multiply $z_{CN}^{(tot)}$ in Eq. (8.2g) by the function

$$\Pi(\chi, D) = \begin{cases} 1 & \text{for } |\chi| \le D \\ 0 & \text{for } |\chi| > D \end{cases}, \tag{8.6a}$$

which has already been defined in Eq. (4C.1a) in Appendix 4C of Chapter 4. (This is also the same as the $\Pi$ function in Eq. (2.56c) of Chapter 2, except for its value at $|\chi| = D$; in particular, we know from the discussion following Eq. (2.9e) that both versions of $\Pi$ must have the same Fourier transform.) We now have, from (8.2g), that

$$\Pi(\chi, D)\,z_{CN}^{(tot)}(\chi) = \Pi(\chi, D)\,z_C^{(tot)}(\chi) + \Pi(\chi, D)\,n^{(s)}(\chi)\,\frac{dz_C^{(tot)}(\chi)}{d\chi} \tag{8.6b}$$

for the total double-sided interferogram signal contaminated by sampling noise at point C in Fig. 6.2. Multiplying by $\Pi(\chi, D)$ in this way explicitly reminds us that the double-sided interferogram is truncated; that is, data is only recorded for OPD values lying between $-D$ and $D$. The forward Fourier transform of

$$\Pi(\chi, D)\,z_{CN}^{(tot)}(\chi)$$

is the uncalibrated spectral signal contaminated by sampling noise, and we show this by writing, just like in Eq. (7.14c) of Chapter 7, that

$$Z_{eff,totN}(\sigma) = \mathcal{F}^{(-i\sigma\chi)}\big(\Pi(\chi, D)\,z_{CN}^{(tot)}(\chi)\big). \tag{8.6c}$$

Section 2.6 in Chapter 2, where the linear nature of the Fourier operator $\mathcal{F}$ is explained, shows that when the forward Fourier transform is applied to (8.6b) we get a sum of two Fourier transforms on the right-hand side:

$$\mathcal{F}^{(-i\sigma\chi)}\big(\Pi(\chi, D)\,z_{CN}^{(tot)}(\chi)\big) = \mathcal{F}^{(-i\sigma\chi)}\big(\Pi(\chi, D)\,z_C^{(tot)}(\chi)\big) + \mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi, D)\,n^{(s)}(\chi)\,\frac{dz_C^{(tot)}(\chi)}{d\chi}\right).$$

This can also be written as, substituting from (8.6c),

$$Z_{eff,totN}(\sigma) = \mathcal{F}^{(-i\sigma\chi)}\big(\Pi(\chi, D)\,z_C^{(tot)}(\chi)\big) + \mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi, D)\,n^{(s)}(\chi)\,\frac{dz_C^{(tot)}(\chi)}{d\chi}\right). \tag{8.6d}$$

Expanding the first term on the right-hand side of (8.6d) is a straightforward process. The Fourier convolution theorem [see Eq. (2.39j) in Chapter 2] gives

$$\mathcal{F}^{(-i\sigma\chi)}\big(\Pi(\chi, D)\,z_C^{(tot)}(\chi)\big) = \mathcal{F}^{(-i\sigma\chi)}\big(\Pi(\chi, D)\big) * \mathcal{F}^{(-i\sigma\chi)}\big(z_C^{(tot)}(\chi)\big). \tag{8.7a}$$

According to Eq. (2.108b) in Chapter 2 [with the transform variable in (2.108b) replaced by $\sigma$, $t$ replaced by $\chi$, and $F$ replaced by $D$]

$$\mathcal{F}^{(-i\sigma\chi)}\big(\Pi(\chi, D)\big) = 2D\,\mathrm{sinc}(2\pi\sigma D), \tag{8.7b}$$

where, following the definition in Eq. (2.106d),

$$\mathrm{sinc}(x) = \frac{\sin(x)}{x}. \tag{8.7c}$$
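
The transform of the truncation window can be confirmed by direct numerical integration. The sketch below is our own check, assuming the forward transform uses the kernel exp(-2 pi i sigma chi) and taking D = 1.28 cm, the value used for the simulated measurements in Chapter 7; the function names and the test wavenumbers are arbitrary.

```python
import cmath
import math

D = 1.28   # maximum OPD in cm (value used for the Chapter 7 simulations)

def sinc(x):
    """sin(x)/x, with the limiting value 1 at x = 0."""
    return 1.0 if x == 0.0 else math.sin(x) / x

def window_transform(sigma, steps=4000):
    """Trapezoid-rule integral of exp(-2*pi*i*sigma*chi) over -D <= chi <= D."""
    h = 2.0 * D / steps
    total = 0.0 + 0.0j
    for k in range(steps + 1):
        chi = -D + k * h
        weight = 0.5 if k == 0 or k == steps else 1.0
        total += weight * cmath.exp(-2j * math.pi * sigma * chi)
    return total * h

first_null = 1.0 / (2.0 * D)   # sinc argument equals pi here
test_wavenumbers = [0.0, 0.2, first_null, 1.0]
largest_gap = max(
    abs(window_transform(s) - 2.0 * D * sinc(2.0 * math.pi * s * D))
    for s in test_wavenumbers
)
print("largest difference from 2 D sinc(2 pi sigma D):", largest_gap)
print("first null at", first_null, "cm^-1")
```

The first null of the sinc function falls at 1/(2D), about 0.391 cm^-1, which is the unapodized spectral resolution quoted for this value of D.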

Equations (8.7b) and (8.1e) can now be substituted into (8.7a) to get

$$\mathcal{F}^{(-i\sigma\chi)}\big(\Pi(\chi, D)\,z_C^{(tot)}(\chi)\big) = \big[2D\,\mathrm{sinc}(2\pi\sigma D)\big] * \big[H(u\sigma)\,M(\theta_{ma}\sigma R)\,Z_{FOV}(\sigma)\big]. \tag{8.7d}$$

Functions H and M vary slowly with $\sigma$ compared to $\mathrm{sinc}(2\pi\sigma D)$, and the sinc function is very narrow about $\sigma = 0$ compared to H and M. This means, according to Eq. (5C.1) in Appendix 5C of Chapter 5, that (8.7d) can be approximated as

$$\mathcal{F}^{(-i\sigma\chi)}\big(\Pi(\chi, D)\,z_C^{(tot)}(\chi)\big) \approx H(u\sigma)\,M(\theta_{ma}\sigma R)\,\big[2D\,\mathrm{sinc}(2\pi\sigma D) * Z_{FOV}(\sigma)\big],$$

which becomes, substituting from Eq. (7.16h) in Chapter 7,

$$\mathcal{F}^{(-i\sigma\chi)}\big(\Pi(\chi, D)\,z_C^{(tot)}(\chi)\big) \approx H(u\sigma)\,M(\theta_{ma}\sigma R)\,Z_{mnf}(\sigma), \tag{8.7e}$$

where [see Eq. (7.16f)]

( ) (back)
R ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( )
4
fore
mnf a f mnf mnf mnf
WA

= +

Z L L L . (8.7f)

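Expanding the second term on the right-hand side of Eq. (8.6d) starts out the same way as expanding the first. Again, we use Eq. (2.39j) in Chapter 2 to get

$$\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi, D)\,n^{(s)}(\chi)\,\frac{dz_C^{(tot)}(\chi)}{d\chi}\right) = \mathcal{F}^{(-i\sigma\chi)}\big(\Pi(\chi, D)\,n^{(s)}(\chi)\big) * \mathcal{F}^{(-i\sigma\chi)}\!\left(\frac{dz_C^{(tot)}(\chi)}{d\chi}\right). \tag{8.8a}$$

Formula (2.35e) in Chapter 2 shows that

$$\mathcal{F}^{(-i\sigma\chi)}\!\left(\frac{dz_C^{(tot)}(\chi)}{d\chi}\right) = 2\pi i\sigma\,\mathcal{F}^{(-i\sigma\chi)}\big(z_C^{(tot)}(\chi)\big),$$

which becomes, substituting from Eq. (8.1e),

$$\mathcal{F}^{(-i\sigma\chi)}\!\left(\frac{dz_C^{(tot)}(\chi)}{d\chi}\right) = 2\pi i\sigma\,H(u\sigma)\,M(\theta_{ma}\sigma R)\,Z_{FOV}(\sigma). \tag{8.8b}$$

For future use, we define, reversing the Fourier transform in (8.8b), that

$$W_s(\chi) = \frac{dz_C^{(tot)}(\chi)}{d\chi} = \mathcal{F}^{(i\sigma\chi)}\big(2\pi i\sigma\,H(u\sigma)\,M(\theta_{ma}\sigma R)\,Z_{FOV}(\sigma)\big), \tag{8.8c}$$

which, of course, can also be written as [see Eq. (2.29c) in Chapter 2]

$$W_s(\chi) = \int_{-\infty}^{\infty} 2\pi i\sigma\,H(u\sigma)\,M(\theta_{ma}\sigma R)\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma. \tag{8.8d}$$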

The D-limited Fourier transform of $n^{(s)}(\chi)$ is defined to be

$$\mathbf{n}^{(s)}_D(\sigma) = \mathcal{F}^{(-i\sigma\chi)}\big(\Pi(\chi, D)\,n^{(s)}(\chi)\big) \tag{8.8e}$$

or

$$\mathbf{n}^{(s)}_D(\sigma) = \int_{-\infty}^{\infty}\Pi(\chi, D)\,n^{(s)}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi. \tag{8.8f}$$

Because $\mathbf{n}^{(s)}_D(\sigma)$ is the Fourier transform of the real-valued random function

$$\big[\Pi(\chi, D)\,n^{(s)}(\chi)\big],$$

it must, according to entry 7 in Table 2.1 of Chapter 2, be Hermitian:

$$\mathbf{n}^{(s)}_D(-\sigma) = \mathbf{n}^{(s)}_D(\sigma)^{*}. \tag{8.8g}$$

The formula for $\mathbf{n}^{(s)}_D(\sigma)$ can also be written as, consulting the prescription for $\Pi(\chi, D)$ in (8.6a),

$$\mathbf{n}^{(s)}_D(\sigma) = \int_{-D}^{D} n^{(s)}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi. \tag{8.8h}$$

To finish up the analysis of the second term, we substitute Eqs. (8.8b) and (8.8e) into (8.8a) to get

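\[ \mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi,D)\,n^{(s)}(\chi)\,\frac{dz_C^{(tot)}(\chi)}{d\chi}\right) = \mathbf{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma)\right] . \tag{8.8i} \]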

Now that the two terms on the right-hand side of Eq. (8.6d) have been expanded and analyzed,
we use their formulas in Eqs. (8.7e) and (8.8i) to write the formula for the uncalibrated spectral
signal contaminated by sampling noise:

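\[ \tilde{\mathcal{Z}}_{eff,totN}(\sigma) \cong \mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{mnf}(\sigma) + \mathbf{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma)\right] . \tag{8.9a} \]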

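Applying the expectation operator E to both sides of Eq. (8.8h) gives, according to Eqs. (3.16a) and (3.17c) in Sec. 3.10 of Chapter 3,

\[ \mathcal{E}\!\left(\mathbf{n}_D^{(s)}(\sigma)\right) = \mathcal{E}\!\left(\int_{-D}^{D} n^{(s)}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi\right) = \int_{-D}^{D} \mathcal{E}\!\left(n^{(s)}(\chi)\right) e^{-2\pi i\sigma\chi}\,d\chi , \]

which reduces to, substituting from Eq. (8.3a) above,

\[ \mathcal{E}\!\left(\mathbf{n}_D^{(s)}(\sigma)\right) = 0 . \tag{8.9b} \]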

Again using the linearity of E with respect to random quantities as explained in Sec. 3.10 of
Chapter 3, we apply the expectation operator to both sides of (8.9a) to get
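
\[
\begin{aligned}
\mathcal{E}\!\left(\tilde{\mathcal{Z}}_{eff,totN}(\sigma)\right)
&\cong \mathcal{E}\!\left(\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{mnf}(\sigma)\right)
 + \mathcal{E}\!\left(\mathbf{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma)\right]\right) \\
&= \mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{mnf}(\sigma)
 + \mathcal{E}\!\left(\mathbf{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma)\right]\right) ,
\end{aligned}
\tag{8.9c}
\]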

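where in the last step we apply Eq. (3.9f) of Chapter 3, noting that $\mathcal{E}(c) = c$ for nonrandom quantities c. The convolution in the second term on the right-hand side can be written as the integral [see Eqs. (2.38b) and (2.38a) in Chapter 2]

\[
\begin{aligned}
\mathbf{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma)\right]
&= \left[2\pi i\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma)\right] * \mathbf{n}_D^{(s)}(\sigma) \\
&= \int_{-\infty}^{\infty} \left[2\pi i\sigma'\,\mathrm{H}(u\sigma')\,\mathrm{M}(\sigma' R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma')\right] \mathbf{n}_D^{(s)}(\sigma-\sigma')\,d\sigma' .
\end{aligned}
\]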

Applying E to both sides gives

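\[ \mathcal{E}\!\left(\mathbf{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma)\right]\right) = \mathcal{E}\!\left(\int_{-\infty}^{\infty} \left[2\pi i\sigma'\,\mathrm{H}(u\sigma')\,\mathrm{M}(\sigma' R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma')\right] \mathbf{n}_D^{(s)}(\sigma-\sigma')\,d\sigma'\right) . \]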

We use the linearity of E as explained in Sec. 3.10 of Chapter 3 to move E inside the integral
and then substitute from (8.9b) to get

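\[ \mathcal{E}\!\left(\mathbf{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma)\right]\right) = \int_{-\infty}^{\infty} \left[2\pi i\sigma'\,\mathrm{H}(u\sigma')\,\mathrm{M}(\sigma' R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma')\right] \mathcal{E}\!\left(\mathbf{n}_D^{(s)}(\sigma-\sigma')\right) d\sigma' = 0 . \tag{8.9d} \]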

Hence, Eq. (8.9c) reduces to

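\[ \mathcal{E}\!\left(\tilde{\mathcal{Z}}_{eff,totN}(\sigma)\right) \cong \mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{mnf}(\sigma) . \tag{8.9e} \]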

This shows that the sampling noise can always be reduced to negligible levels in the uncalibrated
spectral signal by averaging together many independent measurements of the same spectral
radiance. In this respect, the sampling noise behaves the same way as the detector noise and
mirror-misalignment noise examined in the two previous chapters [see Eq. (7.18e) in Chapter 7
and the discussion following Eq. (6.30c) in Chapter 6].
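A minimal Monte Carlo sketch of why averaging helps is given below. It assumes unit-variance Gaussian noise and treats each measurement as an independent draw; both are illustrative assumptions rather than the noise model of this chapter. The RMS of the averaged noise should fall roughly as one over the square root of the number of measurements averaged.

```python
import math
import random

def averaged_noise_rms(n_avg: int, trials: int, rng: random.Random) -> float:
    """RMS of the mean of n_avg independent zero-mean, unit-variance noise draws."""
    acc = 0.0
    for _ in range(trials):
        mean = sum(rng.gauss(0.0, 1.0) for _ in range(n_avg)) / n_avg
        acc += mean * mean
    return math.sqrt(acc / trials)

rng = random.Random(1)
for n_avg in (1, 4, 16, 64):
    # Second column: simulated RMS; third column: the 1/sqrt(n_avg) prediction.
    print(n_avg, averaged_noise_rms(n_avg, 2000, rng), 1.0 / math.sqrt(n_avg))
```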
8.5 Calibrating the Spectral Signal Contaminated by Sampling Noise
To find the calibrated spectral radiance contaminated by sampling noise, we again apply the
spectral calibration algorithm described in Sec. 5.19 of Chapter 5. The analysis here closely
follows the pattern of Sec. 7.6 in Chapter 7, where the same algorithm is used to find the spectral
radiance contaminated by mirror-misalignment noise. Just like before, the spectral radiances $\mathcal{L}^{(1)}$ and $\mathcal{L}^{(2)}$ chosen to calibrate the instrument are set up to be slowly varying functions of wavenumber so that [see Eqs. (7.19a) and (7.19b) in Chapter 7]

\[ \mathcal{L}^{(1)}(\sigma) \cong \mathcal{L}^{(1)}_{FOV}(\sigma) \cong \mathcal{L}^{(1)}_{mnf}(\sigma) \tag{8.10a} \]
and

\[ \mathcal{L}^{(2)}(\sigma) \cong \mathcal{L}^{(2)}_{FOV}(\sigma) \cong \mathcal{L}^{(2)}_{mnf}(\sigma) . \tag{8.10b} \]

To describe the uncalibrated spectral signal generated from observation of $\mathcal{L}^{(1)}$, we again use functions $\mathcal{Z}^{(1)}_{FOV}$ and $\mathcal{Z}^{(1)}_{mnf}$ defined in Eqs. (7.20b) and (7.20c) of Chapter 7:

(1) (1) ( ) (back)
R ( ) ( ) ( ) ( )[ ( ) ( ) ( ) ( )]
4
fore
FOV a f FOV FOV
WA
AO
+ Z L L L (8.10c)
and

(1) (1) ( ) (back)
R ( ) ( ) ( ) ( )[ ( ) ( ) ( ) ( )]
4
fore
mnf a f mnf mnf
WA
AO
+ Z L L L . (8.10d)

When we write down these formulas, functions $\mathcal{L}^{(1)}$, $\mathcal{L}^{(1)}_{FOV}$, and $\mathcal{L}^{(1)}_{mnf}$ can be used interchangeably as shown in Eq. (8.10a). Similarly, describing the uncalibrated spectral signal generated from observation of $\mathcal{L}^{(2)}$, we reuse functions $\mathcal{Z}^{(2)}_{FOV}$ and $\mathcal{Z}^{(2)}_{mnf}$ defined in Eqs. (7.20e) and (7.20f) of Chapter 7,

(2) (2) ( ) (back)
R ( ) ( ) ( ) ( )[ ( ) ( ) ( ) ( )]
4
fore
FOV a f FOV FOV
WA
AO
+ Z L L L (8.10e)
and

(2) (2) ( ) (back)
R ( ) ( ) ( ) ( )[ ( ) ( ) ( ) ( )]
4
fore
mnf a f mnf mnf
WA
AO
+ Z L L L . (8.10f)
In these formulas, as shown by Eq. (8.10b), functions $\mathcal{L}^{(2)}$, $\mathcal{L}^{(2)}_{FOV}$, and $\mathcal{L}^{(2)}_{mnf}$ can be used interchangeably.
Still using the same notation as in Sec. 7.6 of Chapter 7, we call $\tilde{\mathcal{Z}}^{(1)}_{eff,totN}(\sigma)$ the uncalibrated, noise-contaminated spectral signal produced when the interferometer observes $\mathcal{L}^{(1)}(\sigma)$ and $\tilde{\mathcal{Z}}^{(2)}_{eff,totN}(\sigma)$ the uncalibrated, noise-contaminated spectral signal produced when the interferometer observes $\mathcal{L}^{(2)}(\sigma)$. Remembering that $\tilde{\mathcal{Z}}_{eff,totN}(\sigma)$ on the left-hand side of Eq. (8.9a) is the uncalibrated spectral signal for any interferometer measurement contaminated by sampling noise, we can get the formulas for $\tilde{\mathcal{Z}}^{(1)}_{eff,totN}(\sigma)$ and $\tilde{\mathcal{Z}}^{(2)}_{eff,totN}(\sigma)$ by applying Eqs. (8.10c) through (8.10f) to the right side of Eq. (8.9a),

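\[ \tilde{\mathcal{Z}}^{(1)}_{eff,totN}(\sigma) \cong \mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}^{(1)}_{mnf}(\sigma) + \mathbf{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}^{(1)}_{FOV}(\sigma)\right] \tag{8.11a} \]
and

\[ \tilde{\mathcal{Z}}^{(2)}_{eff,totN}(\sigma) \cong \mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}^{(2)}_{mnf}(\sigma) + \mathbf{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}^{(2)}_{FOV}(\sigma)\right] . \tag{8.11b} \]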
We assume that the experimental procedure associated with this calibration algorithm includes
some form of data analysis equivalent to averaging together many independent measurements to
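eliminate the sampling noise from $\tilde{\mathcal{Z}}^{(1,2)}_{eff,totN}(\sigma)$. We apply the expectation operator E to both sides of (8.11a) and (8.11b) to get the result of this averaging. Following the same reasoning used to go from Eq. (8.9a) to (8.9e) above, we see that

\[ \mathcal{E}\!\left(\tilde{\mathcal{Z}}^{(1)}_{eff,totN}(\sigma)\right) \cong \mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}^{(1)}_{mnf}(\sigma) \]
and

\[ \mathcal{E}\!\left(\tilde{\mathcal{Z}}^{(2)}_{eff,totN}(\sigma)\right) \cong \mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}^{(2)}_{mnf}(\sigma) . \]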

Following the same procedure as in Eqs. (7.20i) and (7.20j) in Chapter 7, we again remove the
tilde and change the totN subscript to tot to define

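\[ \mathcal{Z}^{(1)}_{eff,tot}(\sigma) = \mathcal{E}\!\left(\tilde{\mathcal{Z}}^{(1)}_{eff,totN}(\sigma)\right) \cong \mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}^{(1)}_{mnf}(\sigma) \tag{8.11c} \]
and

\[ \mathcal{Z}^{(2)}_{eff,tot}(\sigma) = \mathcal{E}\!\left(\tilde{\mathcal{Z}}^{(2)}_{eff,totN}(\sigma)\right) \cong \mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}^{(2)}_{mnf}(\sigma) . \tag{8.11d} \]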

The noise cannot, of course, be averaged away from the spectral measurement itself because [as is discussed following Eq. (7.21a) in Chapter 7] in practice we cannot take the same amount of care when collecting the spectral measurements as we do when collecting the known calibration data. Just as is done in Sec. 7.6 of Chapter 7, we use $\tilde{\mathcal{Z}}^{(meas)}_{eff,totN}(\sigma)$ to represent the signal for the uncalibrated spectral measurement contaminated by noise (in this case, sampling noise). When analyzing sampling noise, $\tilde{\mathcal{Z}}^{(meas)}_{eff,totN}(\sigma)$ is the same quantity as $\tilde{\mathcal{Z}}_{eff,totN}(\sigma)$ in Eq. (8.9a), which means Eq. (8.9a) can now be written as

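\[ \tilde{\mathcal{Z}}^{(meas)}_{eff,totN}(\sigma) \cong \mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{mnf}(\sigma) + \mathbf{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma)\right] . \tag{8.11e} \]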

Here formula (8.7f) specifies $\mathcal{Z}_{mnf}(\sigma)$ and (8.1b) specifies $\mathcal{Z}_{FOV}(\sigma)$.
Now we can apply the calibration algorithm. Following the same procedure as in Sec. 7.6 of
Chapter 7, we have [repeating Eq. (7.21a)]

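\[ \text{Measured Radiance} = \frac{\tilde{\mathcal{Z}}^{(meas)}_{eff,totN}(\sigma) - \mathcal{Z}^{(1)}_{eff,tot}(\sigma)}{\mathcal{Z}^{(2)}_{eff,tot}(\sigma) - \mathcal{Z}^{(1)}_{eff,tot}(\sigma)} \left[\mathcal{L}^{(2)}(\sigma) - \mathcal{L}^{(1)}(\sigma)\right] + \mathcal{L}^{(1)}(\sigma) . \tag{8.12a} \]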
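The two-point calibration of Eq. (8.12a) can be sketched as a one-line function applied at a single wavenumber. In the example below, the linear instrument model `raw`, its gain and offset, and the radiance values are hypothetical and serve only to show that a linear response is inverted exactly by this formula.

```python
def calibrate(z_meas: complex, z1: complex, z2: complex, l1: float, l2: float) -> complex:
    """Two-point calibration in the form of Eq. (8.12a), at a single wavenumber."""
    return (z_meas - z1) / (z2 - z1) * (l2 - l1) + l1

# Hypothetical linear instrument response: Z = gain * (L + offset).
gain, offset = 0.8 - 0.3j, 2.5

def raw(radiance: float) -> complex:
    """Uncalibrated signal produced by the hypothetical instrument for a given radiance."""
    return gain * (radiance + offset)

l1, l2, l_true = 10.0, 50.0, 37.0        # two calibration radiances and one test radiance
print(calibrate(raw(l_true), raw(l1), raw(l2), l1, l2))
```

Any complex noise added to `raw(l_true)` passes through the same linear map, which is why the measured radiance in the following equations acquires a complex noise term.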

The formulas in (8.11c) and (8.11d) show that

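\[ \frac{\mathcal{L}^{(2)}(\sigma) - \mathcal{L}^{(1)}(\sigma)}{\mathcal{Z}^{(2)}_{eff,tot}(\sigma) - \mathcal{Z}^{(1)}_{eff,tot}(\sigma)} \cong \frac{\mathcal{L}^{(2)}(\sigma) - \mathcal{L}^{(1)}(\sigma)}{\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\left[\mathcal{Z}^{(2)}_{mnf}(\sigma) - \mathcal{Z}^{(1)}_{mnf}(\sigma)\right]} , \tag{8.12b} \]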

which becomes, substituting from (8.10d) and (8.10f),

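
\[ \frac{\mathcal{L}^{(2)}(\sigma) - \mathcal{L}^{(1)}(\sigma)}{\mathcal{Z}^{(2)}_{eff,tot}(\sigma) - \mathcal{Z}^{(1)}_{eff,tot}(\sigma)} \cong \frac{\mathcal{L}^{(2)}(\sigma) - \mathcal{L}^{(1)}(\sigma)}{\dfrac{W A\,\Delta\Omega}{4}\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{R}(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)\left[\mathcal{L}^{(2)}(\sigma) - \mathcal{L}^{(1)}(\sigma)\right]} \]

or

\[ \frac{\mathcal{L}^{(2)}(\sigma) - \mathcal{L}^{(1)}(\sigma)}{\mathcal{Z}^{(2)}_{eff,tot}(\sigma) - \mathcal{Z}^{(1)}_{eff,tot}(\sigma)} \cong \left[\frac{W A\,\Delta\Omega}{4}\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{R}(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)\right]^{-1} . \tag{8.12c} \]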

This result is identical to (7.21c) in Chapter 7 because the sampling noise, like the mirror-misalignment noise, can be reduced to negligible levels by averaging together many independent measurements of the same radiance when gathering data for the calibration algorithms. In fact, (8.12c) holds true whenever the noise in the data can be removed this way. To find the value of

\[ \tilde{\mathcal{Z}}^{(meas)}_{eff,totN}(\sigma) - \mathcal{Z}^{(1)}_{eff,tot}(\sigma) \]

in Eq. (8.12a), we substitute from Eqs. (8.11e) and (8.11c) to get

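\[ \tilde{\mathcal{Z}}^{(meas)}_{eff,totN}(\sigma) - \mathcal{Z}^{(1)}_{eff,tot}(\sigma) \cong \mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\left[\mathcal{Z}_{mnf}(\sigma) - \mathcal{Z}^{(1)}_{mnf}(\sigma)\right] + \mathbf{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma)\right] , \]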

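which becomes, consulting Eqs. (8.7f) and (8.10d) for the formulas of $\mathcal{Z}_{mnf}$ and $\mathcal{Z}^{(1)}_{mnf}$,

\[
\begin{aligned}
\tilde{\mathcal{Z}}^{(meas)}_{eff,totN}(\sigma) - \mathcal{Z}^{(1)}_{eff,tot}(\sigma)
\cong{}& \frac{W A\,\Delta\Omega}{4}\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{R}(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)\left[\mathcal{L}_{mnf}(\sigma) - \mathcal{L}^{(1)}(\sigma)\right] \\
&+ \mathbf{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma)\right] .
\end{aligned}
\tag{8.12d}
\]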

Equations (8.12c) and (8.12d) can now be put into (8.12a) to get

\[ \text{Measured Radiance} \cong \mathcal{L}_{mnf}(\sigma) + \frac{4\left\{\mathbf{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma)\right]\right\}}{W A\,\Delta\Omega\,\mathcal{R}(|\sigma|)\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} . \tag{8.12e} \]

Equation (6.55a) in Chapter 6 shows that the complex-valued transfer function H can be written as

\[ \mathrm{H}(u\sigma) = |\mathrm{H}(u\sigma)|\,e^{i\phi(\sigma)} \tag{8.12f} \]

with $\phi(\sigma)$ being the phase of the complex-valued function $\mathrm{H}(u\sigma)$. According to the discussion following Eq. (6.55a), Eq. (5A.6b) in Appendix 5A of Chapter 5 applies to the transfer function H in (8.12f); that is, H is Hermitian:

\[ \mathrm{H}(-u\sigma) = \mathrm{H}(u\sigma)^{*} . \tag{8.12g} \]

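Substitution of (8.12f) into (8.12g) gives

\[ |\mathrm{H}(-u\sigma)|\,e^{i\phi(-\sigma)} = |\mathrm{H}(u\sigma)|\,e^{-i\phi(\sigma)} , \]

which can only be true if $|\mathrm{H}(u\sigma)|$ is an even function of its argument $u\sigma$,

\[ |\mathrm{H}(-u\sigma)| = |\mathrm{H}(u\sigma)| , \tag{8.12h} \]

and $\phi(\sigma)$ is an odd function of its argument $\sigma$,

\[ \phi(-\sigma) = -\phi(\sigma) . \tag{8.12i} \]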
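These symmetry properties can be seen in a concrete filter. The sketch below evaluates the textbook normalized three-pole low-pass Butterworth response, 1/((s+1)(s^2+s+1)) with s = i f/f_cut, and assumes that the electrical frequency is f = uσ; the filter of Sec. 8.8 is of this type, but treating it as exactly this normalized form is an assumption made here for illustration.

```python
import cmath

def butterworth3(f_hz: float, f_cut_hz: float) -> complex:
    """Normalized three-pole low-pass Butterworth response at s = i * f / f_cut."""
    s = 1j * f_hz / f_cut_hz
    return 1.0 / ((s + 1.0) * (s * s + s + 1.0))

u, f_cut = 5.0, 8000.0     # OPD velocity in cm/sec and cutoff frequency in Hz
for sigma in (200.0, 1600.0, 3000.0):
    h_pos = butterworth3(u * sigma, f_cut)
    h_neg = butterworth3(-u * sigma, f_cut)
    # Magnitudes agree (even); phases are equal and opposite (odd).
    print(sigma, abs(h_pos), abs(h_neg), cmath.phase(h_pos), cmath.phase(h_neg))
```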

Equation (8.12f) can be used to write the formula in (8.12e) as

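\[ \text{Measured Radiance} \cong \mathcal{L}_{mnf}(\sigma) + \frac{4\,e^{-i\phi(\sigma)}\left\{\mathbf{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma)\right]\right\}}{W A\,\Delta\Omega\,\mathcal{R}(|\sigma|)\,|\mathrm{H}(u\sigma)|\,\mathrm{M}(\sigma R\theta_{ma})\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} . \tag{8.12j} \]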

The denominator of the second term on the right-hand side is real, but the numerator of this term almost certainly has both a real and imaginary component. In this regard, the result resembles Eq. (7.21e) in Chapter 7, which also shows the measured radiance spectrum to be the sum of $\mathcal{L}_{mnf}(\sigma)$ and a complex random term. Just like in the discussion following (7.21e), we note that only the real part of the second term acts as a source of unavoidable noise, since we can always discard any imaginary components of the noise-contaminated measured radiance. Once again we can conclude that only the real part of the second term of the formula is the random spectral noise $\delta\mathcal{L}$ for the measured radiance,

\[ \delta\mathcal{L} = \frac{4\,\mathrm{Re}\!\left(e^{-i\phi(\sigma)}\left\{\mathbf{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma)\right]\right\}\right)}{W A\,\Delta\Omega\,\mathcal{R}(|\sigma|)\,|\mathrm{H}(u\sigma)|\,\mathrm{M}(\sigma R\theta_{ma})\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} . \tag{8.12k} \]

8.6 Random Sampling Error in the Measured Spectrum
The $\delta\mathcal{L}$ sampling error is an even function of wavenumber $\sigma$. To see why this is so, we need only analyze the different functions inside formula (8.12k) for $\delta\mathcal{L}$. The $\mathcal{R}$, $\tau_a$, and $\tau_f$ factors are written as even functions of wavenumber (that is, as functions of the absolute value of $\sigma$), and Eq. (8.12h) shows that $|\mathrm{H}|$ is also an even function of $\sigma$. Equation (4.139g) in Chapter 4 states that $\eta(\sigma)$ is even, and Eq. (5.10f) in Chapter 5 reveals M to be an even function of $\sigma$. Hence the whole denominator of (8.12k) must be an even function of wavenumber $\sigma$. To analyze the numerator, we note that everything in the formula for $\mathcal{Z}_{FOV}$ in Eq. (8.1b) is real, so $\mathcal{Z}_{FOV}$ is real and, since $\eta(\sigma)$ and the other functions in the formula are even, function $\mathcal{Z}_{FOV}$ is also even:

\[ \mathcal{Z}_{FOV}(-\sigma) = \mathcal{Z}_{FOV}(\sigma) . \tag{8.13a} \]

The convolution in the numerator of Eq. (8.12k) can be written as [see Eqs. (2.38a) and (2.38b) in
Chapter 2]

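\[
\begin{aligned}
\mathbf{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma)\right]
&= \left[2\pi i\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma)\right] * \mathbf{n}_D^{(s)}(\sigma) \\
&= \int_{-\infty}^{\infty} \left[2\pi i\sigma'\,\mathrm{H}(u\sigma')\,\mathrm{M}(\sigma' R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma')\right] \mathbf{n}_D^{(s)}(\sigma-\sigma')\,d\sigma' ,
\end{aligned}
\]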
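which means that, since only i, H, and $\mathbf{n}_D^{(s)}$ are complex,

\[
\begin{aligned}
\left\{\mathbf{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma)\right]\right\}^{*}
&= \left\{\int_{-\infty}^{\infty} \left[2\pi i\sigma'\,\mathrm{H}(u\sigma')\,\mathrm{M}(\sigma' R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma')\right] \mathbf{n}_D^{(s)}(\sigma-\sigma')\,d\sigma'\right\}^{*} \\
&= \left[-2\pi i\sigma\,\mathrm{H}(u\sigma)^{*}\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma)\right] * \mathbf{n}_D^{(s)}(\sigma)^{*} \\
&= \mathbf{n}_D^{(s)}(\sigma)^{*} * \left[-2\pi i\sigma\,\mathrm{H}(u\sigma)^{*}\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma)\right] .
\end{aligned}
\]

This is the formula for the complex conjugate of the convolution. The numerator in (8.12k) is proportional to

\[ \mathrm{Re}\!\left(e^{-i\phi(\sigma)}\left\{\mathbf{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma)\right]\right\}\right) ; \]

and since the real part of any complex number c can be written as $0.5(c + c^{*})$, we see that the real part of the convolution is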
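\[
\begin{aligned}
&\mathrm{Re}\!\left(e^{-i\phi(\sigma)}\left\{\mathbf{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma)\right]\right\}\right) \\
&\quad= \frac{1}{2}\Big( e^{-i\phi(\sigma)}\left\{\mathbf{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma)\right]\right\}
 + \left[e^{-i\phi(\sigma)}\left\{\mathbf{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma)\right]\right\}\right]^{*} \Big) \\
&\quad= \frac{1}{2}\Big( e^{-i\phi(\sigma)}\left\{\mathbf{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma)\right]\right\}
 + e^{i\phi(\sigma)}\left\{\mathbf{n}_D^{(s)}(\sigma)^{*} * \left[-2\pi i\sigma\,\mathrm{H}(u\sigma)^{*}\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma)\right]\right\} \Big) .
\end{aligned}
\tag{8.13b}
\]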

Equations (8.8g), (8.12g), and (8.12i) show that this can also be written as

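\[
\begin{aligned}
&\mathrm{Re}\!\left(e^{-i\phi(\sigma)}\left\{\mathbf{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma)\right]\right\}\right) \\
&\quad= \frac{1}{2}\Big( e^{-i\phi(\sigma)}\left\{\mathbf{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma)\right]\right\} \\
&\qquad\quad + e^{-i\phi(-\sigma)}\left\{\mathbf{n}_D^{(s)}(-\sigma) * \left[2\pi i(-\sigma)\,\mathrm{H}(-u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma)\right]\right\} \Big) .
\end{aligned}
\]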

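Since, as has already been noted, M and $\mathcal{Z}_{FOV}$ are even, it follows that

\[
\begin{aligned}
&\mathrm{Re}\!\left(e^{-i\phi(\sigma)}\left\{\mathbf{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma)\right]\right\}\right) \\
&\quad= \frac{1}{2}\Big( e^{-i\phi(\sigma)}\left\{\mathbf{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma)\right]\right\} \\
&\qquad\quad + e^{-i\phi(-\sigma)}\left\{\mathbf{n}_D^{(s)}(-\sigma) * \left[2\pi i(-\sigma)\,\mathrm{H}(-u\sigma)\,\mathrm{M}(-\sigma R\theta_{ma})\,\mathcal{Z}_{FOV}(-\sigma)\right]\right\} \Big) .
\end{aligned}
\tag{8.13c}
\]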

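The right-hand side of (8.13c) is clearly an even function of wavenumber; when $\sigma$ is replaced by $-\sigma$, only the order of the sum changes. Consequently the left-hand side of (8.13c) must also be an even function of $\sigma$; hence, the numerator of (8.12k), just like the denominator of (8.12k), must be an even function of wavenumber. Consequently, it makes sense to write the formula for the random sampling error in (8.12k) as

\[ \delta\mathcal{L}(|\sigma|) = \frac{4\,\mathrm{Re}\!\left(e^{-i\phi(\sigma)}\left\{\mathbf{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma)\right]\right\}\right)}{W A\,\Delta\Omega\,\mathcal{R}(|\sigma|)\,|\mathrm{H}(u\sigma)|\,\mathrm{M}(\sigma R\theta_{ma})\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} . \tag{8.13d} \]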

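The absolute value signs in the argument for $\delta\mathcal{L}$ remind us that both sides of this formula are even functions of $\sigma$.
The linearity of the expectation operator E with respect to random quantities (see Sec. 3.10 in Chapter 3) lets us apply E to both sides of (8.13d) and take the nonrandom quantities outside the expectation value to get

\[ \mathcal{E}\!\left(\delta\mathcal{L}(|\sigma|)\right) = \frac{4\,\mathcal{E}\!\left(\mathrm{Re}\!\left(e^{-i\phi(\sigma)}\left\{\mathbf{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma)\right]\right\}\right)\right)}{W A\,\Delta\Omega\,\mathcal{R}(|\sigma|)\,|\mathrm{H}(u\sigma)|\,\mathrm{M}(\sigma R\theta_{ma})\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} . \tag{8.14a} \]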

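To evaluate the numerator of the right-hand side, we again note that the real part of any complex number c can be written as $0.5(c + c^{*})$ and then use the linearity of E to get

\[
\begin{aligned}
&\mathcal{E}\!\left(\mathrm{Re}\!\left(e^{-i\phi(\sigma)}\left\{\mathbf{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma)\right]\right\}\right)\right) \\
&\quad= \frac{1}{2}\Big[ e^{-i\phi(\sigma)}\,\mathcal{E}\!\left(\mathbf{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma)\right]\right) \\
&\qquad\quad + e^{i\phi(\sigma)}\,\mathcal{E}\!\left(\left\{\mathbf{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma)\right]\right\}^{*}\right) \Big] .
\end{aligned}
\tag{8.14b}
\]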
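According to Eq. (8.9d),

\[ \mathcal{E}\!\left(\mathbf{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma)\right]\right) = 0 ; \tag{8.14c} \]

and if the mean of a random complex number is zero so is the mean of its complex conjugate:

\[ \mathcal{E}\!\left(\left\{\mathbf{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma)\right]\right\}^{*}\right) = 0 . \tag{8.14d} \]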

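We conclude, referring back to Eq. (8.14b), that

\[ \mathcal{E}\!\left(\mathrm{Re}\!\left(e^{-i\phi(\sigma)}\left\{\mathbf{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma)\right]\right\}\right)\right) = 0 . \tag{8.14e} \]

Substituting this latest result back into (8.14a) now gives

\[ \mathcal{E}\!\left(\delta\mathcal{L}(|\sigma|)\right) = 0 . \tag{8.14f} \]

Hence the random sampling error $\delta\mathcal{L}$ is a zero-mean random variable.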

8.7 Calculating the NEdN from the Random Sampling Error
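Since $\delta\mathcal{L}$ is a zero-mean random variable, its variance is [after applying Eqs. (3.8f), (3.8c) in Chapter 3 and Eq. (8.14f) above]

\[ \mathcal{E}\!\left(\left[\delta\mathcal{L}(|\sigma|) - \mathcal{E}\!\left(\delta\mathcal{L}(|\sigma|)\right)\right]^{2}\right) = \mathcal{E}\!\left(\delta\mathcal{L}(|\sigma|)^{2}\right) . \tag{8.15a} \]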

Consulting Eq. (8.13d), we see that

\[ \mathcal{E}\!\left(\delta\mathcal{L}(|\sigma|)^{2}\right) = \frac{\mathcal{E}\!\left(\left[\mathrm{Re}\!\left(e^{-i\phi(\sigma)}\left\{\mathbf{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma)\right]\right\}\right)\right]^{2}\right)}{\left[\dfrac{A\,\Delta\Omega}{4}\,\mathcal{R}(|\sigma|)\,|\mathrm{H}(u\sigma)|\,\mathrm{M}(\sigma R\theta_{ma})\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)\right]^{2}} , \tag{8.15b} \]

where we have used that $W^{2} = 1$ because $W = 1$ or $-1$ [see discussion immediately preceding Eq. (4.84a) in Chapter 4]. We define function $J^{(s)}(\sigma)$ to be

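\[ J^{(s)}(\sigma) = \mathcal{E}\!\left(\left[\mathrm{Re}\!\left(e^{-i\phi(\sigma)}\left\{\mathbf{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma)\right]\right\}\right)\right]^{2}\right) , \tag{8.15c} \]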

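which means the variance in Eq. (8.15b) can be written as

\[ \mathcal{E}\!\left(\delta\mathcal{L}(|\sigma|)^{2}\right) = \frac{16\,J^{(s)}(\sigma)}{\left[A\,\Delta\Omega\,\mathcal{R}(|\sigma|)\,|\mathrm{H}(u\sigma)|\,\mathrm{M}(\sigma R\theta_{ma})\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)\right]^{2}} . \tag{8.15d} \]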

Equation (6.3f) in Chapter 6 states that the NEdN associated with any random error $\delta\mathcal{L}$ is its standard deviation, that is, the square root of the variance of $\delta\mathcal{L}$. Hence, $NEdN_{samp}$, the NEdN caused by the sampling error, is

\[ NEdN_{samp}(\sigma) = \frac{4\sqrt{J^{(s)}(\sigma)}}{A\,\Delta\Omega\,\mathcal{R}(|\sigma|)\,|\mathrm{H}(u\sigma)|\,\mathrm{M}(\sigma R\theta_{ma})\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} . \tag{8.15e} \]

The $J^{(s)}$ function specifies how the sampling noise interacts with the radiance spectrum, and the denominator rescales the result so it has the right size with respect to the measured spectrum.
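The structure of Eq. (8.15e) can be sketched as a small helper that takes a value of the J function and the throughput factors in the denominator. The function name, argument names, and the sample value of J below are hypothetical; the beam radius and field of view are the ones quoted in Sec. 8.8, and the remaining factors are set to their ideal value of one.

```python
import math

def nedn_samp(j_s, area, solid_angle, resp, h_mag, m, eta, tau_a, tau_f):
    """Sampling-noise NEdN in the form of Eq. (8.15e): 4*sqrt(J) over the throughput product."""
    if j_s < 0.0:
        raise ValueError("J is a mean-square quantity and cannot be negative")
    return 4.0 * math.sqrt(j_s) / (area * solid_angle * resp * h_mag * m * eta * tau_a * tau_f)

area = math.pi * 3.0 ** 2          # beam cross-section for a 3 cm beam radius, cm^2
j_example = 1.0e-12                # hypothetical value of J at one wavenumber
print(nedn_samp(j_example, area, 1.086e-4, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0))
```

Because the square root is taken of J, doubling the RMS sampling-position error (which quadruples J) doubles the NEdN.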
To evaluate $J^{(s)}$, we set up three new functions of wavenumber called $T_1(\sigma)$, $T_2(\sigma)$, and $T_3(\sigma)$. Using function $W_s(\chi)$ from Eqs. (8.8c) or (8.8d) above, we define

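\[ T_1(\sigma) = \mathcal{E}\!\left(\left[\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi,D)\,n^{(s)}(\chi)\,W_s(\chi)\right)\right]^{2}\right) , \tag{8.16a} \]

\[ T_2(\sigma) = \mathcal{E}\!\left(\left[\mathcal{F}^{(i\sigma\chi)}\!\left(\Pi(\chi,D)\,n^{(s)}(\chi)\,W_s(\chi)\right)\right]^{2}\right) , \tag{8.16b} \]
and

\[ T_3(\sigma) = \mathcal{E}\!\left(\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi,D)\,n^{(s)}(\chi)\,W_s(\chi)\right)\,\mathcal{F}^{(i\sigma\chi)}\!\left(\Pi(\chi,D)\,n^{(s)}(\chi)\,W_s(\chi)\right)\right) . \tag{8.16c} \]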


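The function

\[ W_s(\chi) = \frac{dz_C^{(tot)}(\chi)}{d\chi} \tag{8.16d} \]

is real, as are functions $n^{(s)}(\chi)$ and $\Pi(\chi,D)$ introduced in (8.2a) and (8.6a), so when taking the complex conjugate of the Fourier transform in Eq. (8.16a) we get, applying Eqs. (2.29a) and (2.29c) in Chapter 2,

\[
\begin{aligned}
\left[\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi,D)\,n^{(s)}(\chi)\,W_s(\chi)\right)\right]^{*}
&= \left[\int_{-\infty}^{\infty} \Pi(\chi,D)\,n^{(s)}(\chi)\,W_s(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi\right]^{*} \\
&= \int_{-\infty}^{\infty} \Pi(\chi,D)\,n^{(s)}(\chi)\,W_s(\chi)\,e^{2\pi i\sigma\chi}\,d\chi \\
&= \mathcal{F}^{(i\sigma\chi)}\!\left(\Pi(\chi,D)\,n^{(s)}(\chi)\,W_s(\chi)\right) .
\end{aligned}
\tag{8.16e}
\]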

Hence the Fourier transforms in (8.16a) and (8.16b) are complex conjugates, which means their
squares must also be complex conjugates, as are the expectation values of the squares. We thus
end up with the relationship

\[ T_1(\sigma) = T_2(\sigma)^{*} . \tag{8.16f} \]

Consequently any formula derived for $T_1(\sigma)$ can be turned into a formula for $T_2(\sigma)$ just by taking the complex conjugate of both sides of the equation.
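Working first with the $T_1$ term in Eq. (8.16a), we consult the definition of the Fourier-transform operator F [see Eqs. (2.29a) and (2.29c) in Chapter 2] and Eq. (3.17c) in Chapter 3 to get

\[
\begin{aligned}
T_1(\sigma) &= \mathcal{E}\!\left(\int_{-\infty}^{\infty} \Pi(\chi',D)\,n^{(s)}(\chi')\,W_s(\chi')\,e^{-2\pi i\sigma\chi'}\,d\chi' \int_{-\infty}^{\infty} \Pi(\chi'',D)\,n^{(s)}(\chi'')\,W_s(\chi'')\,e^{-2\pi i\sigma\chi''}\,d\chi''\right) \\
&= \int_{-\infty}^{\infty} d\chi'\,\Pi(\chi',D)\,W_s(\chi')\,e^{-2\pi i\sigma\chi'} \int_{-\infty}^{\infty} d\chi''\,\Pi(\chi'',D)\,W_s(\chi'')\,e^{-2\pi i\sigma\chi''}\,\mathcal{E}\!\left(n^{(s)}(\chi')\,n^{(s)}(\chi'')\right) .
\end{aligned}
\]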

This can also be written as, first substituting from Eq. (8.3b) and then applying (8.3c),

2 2 ( )
1
( ) ( , ) ( ) ( , ) ( ) ( )
i i s
s s nn
T d D W e d D W e

=

. o (8.17)

The Fourier transform of $\Pi(\chi,D)$ is [see Eq. (8.7b)]

\[ 2D\,\mathrm{sinc}(2\pi\sigma D) \]

and reversing the transform in (8.8d) shows that the forward Fourier transform of $W_s(\chi)$ is

\[ 2\pi i\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma) . \]

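Applying the Fourier convolution theorem to the Fourier transform of the product function

\[ \Pi(\chi,D)\,W_s(\chi) \]

then gives, using formula (2.39k) in Chapter 2,

\[ \int_{-\infty}^{\infty} \Pi(\chi,D)\,W_s(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi = \left[2D\,\mathrm{sinc}(2\pi\sigma D)\right] * \left[2\pi i\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma)\right] . \tag{8.18a} \]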

In a well-designed interferometer, the D parameter limiting the extent of the double-sided interferogram is large enough to make the $2\pi i\sigma$ factor, H, and M all slowly varying functions of wavenumber compared to

\[ 2D\,\mathrm{sinc}(2\pi\sigma D) . \]

This means, according to Eq. (5C.1) in Appendix 5C of Chapter 5, that (8.18a) can be approximated as

\[ \int_{-\infty}^{\infty} \Pi(\chi,D)\,W_s(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi \cong 2\pi i\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\left\{\left[2D\,\mathrm{sinc}(2\pi\sigma D)\right] * \mathcal{Z}_{FOV}(\sigma)\right\} , \]
which becomes, after consulting Eq. (7.16h) in Chapter 7,

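\[ \int_{-\infty}^{\infty} \Pi(\chi,D)\,W_s(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi \cong 2\pi i\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{mnf}(\sigma) . \tag{8.18b} \]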

Taking the complex conjugate of both sides gives, since $\Pi(\chi,D)$, $W_s(\chi)$, M, and $\mathcal{Z}_{mnf}$ are real quantities,

\[ \int_{-\infty}^{\infty} \Pi(\chi,D)\,W_s(\chi)\,e^{2\pi i\sigma\chi}\,d\chi \cong -2\pi i\sigma\,\mathrm{H}(u\sigma)^{*}\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{mnf}(\sigma) . \tag{8.18c} \]

Substituting (8.4b) into (8.17) leads to

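\[
\begin{aligned}
T_1(\sigma) &= \int_{-\infty}^{\infty} d\chi'\,\Pi(\chi',D)\,W_s(\chi')\,e^{-2\pi i\sigma\chi'} \int_{-\infty}^{\infty} d\chi''\,\Pi(\chi'',D)\,W_s(\chi'')\,e^{-2\pi i\sigma\chi''} \int_{-\infty}^{\infty} d\sigma'\,p_{nn}^{(s)}(\sigma')\,e^{2\pi i\sigma'(\chi'-\chi'')} \\
&= \int_{-\infty}^{\infty} d\sigma'\,p_{nn}^{(s)}(\sigma') \int_{-\infty}^{\infty} d\chi'\,\Pi(\chi',D)\,W_s(\chi')\,e^{2\pi i(\sigma'-\sigma)\chi'} \int_{-\infty}^{\infty} d\chi''\,\Pi(\chi'',D)\,W_s(\chi'')\,e^{-2\pi i(\sigma'+\sigma)\chi''} ,
\end{aligned}
\]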

which becomes, applying Eqs. (8.18b) and (8.18c),

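\[
\begin{aligned}
T_1(\sigma) = \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\,&\left[-2\pi i(\sigma'-\sigma)\,\mathrm{H}\big(u(\sigma'-\sigma)\big)^{*}\,\mathrm{M}\big((\sigma'-\sigma)R\theta_{ma}\big)\,\mathcal{Z}_{mnf}(\sigma'-\sigma)\right] \\
\times\,&\left[2\pi i(\sigma'+\sigma)\,\mathrm{H}\big(u(\sigma'+\sigma)\big)\,\mathrm{M}\big((\sigma'+\sigma)R\theta_{ma}\big)\,\mathcal{Z}_{mnf}(\sigma'+\sigma)\right] d\sigma' .
\end{aligned}
\tag{8.18d}
\]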

Glancing back at the formula for $\mathcal{Z}_{mnf}$ in Eq. (8.7f) above, we see that every function in the formula depends on $|\sigma|$ except for $\eta$, and according to Eq. (4.139g) in Chapter 4, $\eta$ is also an even function of $\sigma$. Hence $\mathcal{Z}_{mnf}$ is even:

\[ \mathcal{Z}_{mnf}(-\sigma) = \mathcal{Z}_{mnf}(\sigma) . \tag{8.18e} \]

Equation (5.10f) in Chapter 5 shows that $\mathrm{M}(\sigma R\theta_{ma})$ is an even function of $\sigma$, and Eq. (5A.6b) in Appendix 5A of Chapter 5 shows that H is Hermitian. Consequently the formula for $T_1(\sigma)$ in (8.18d) can be written as

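\[
\begin{aligned}
T_1(\sigma) = -4\pi^{2} \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\,&\left[(\sigma-\sigma')\,\mathrm{H}\big(u(\sigma-\sigma')\big)\,\mathrm{M}\big((\sigma-\sigma')R\theta_{ma}\big)\,\mathcal{Z}_{mnf}(\sigma-\sigma')\right] \\
\times\,&\left[(\sigma+\sigma')\,\mathrm{H}\big(u(\sigma+\sigma')\big)\,\mathrm{M}\big((\sigma+\sigma')R\theta_{ma}\big)\,\mathcal{Z}_{mnf}(\sigma+\sigma')\right] d\sigma' .
\end{aligned}
\tag{8.18f}
\]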

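Equation (8.4d) shows that the power spectrum of the sampling-position noise is real, and we already know that M and $\mathcal{Z}_{mnf}$ are real. Hence only the transfer function H in (8.18f) can have a nonzero imaginary component, so when Eq. (8.16f) is applied to (8.18f) to get the formula for $T_2(\sigma)$, the result is

\[
\begin{aligned}
T_2(\sigma) = -4\pi^{2} \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\,&\left[(\sigma-\sigma')\,\mathrm{H}\big(u(\sigma-\sigma')\big)^{*}\,\mathrm{M}\big((\sigma-\sigma')R\theta_{ma}\big)\,\mathcal{Z}_{mnf}(\sigma-\sigma')\right] \\
\times\,&\left[(\sigma+\sigma')\,\mathrm{H}\big(u(\sigma+\sigma')\big)^{*}\,\mathrm{M}\big((\sigma+\sigma')R\theta_{ma}\big)\,\mathcal{Z}_{mnf}(\sigma+\sigma')\right] d\sigma' .
\end{aligned}
\tag{8.18g}
\]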

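Function $T_3(\sigma)$ in Eq. (8.16c) can be written as, applying the Fourier transform operator as shown in Eqs. (2.29a,c) in Chapter 2,

\[ T_3(\sigma) = \mathcal{E}\!\left(\int_{-\infty}^{\infty} \Pi(\chi',D)\,n^{(s)}(\chi')\,W_s(\chi')\,e^{-2\pi i\sigma\chi'}\,d\chi' \int_{-\infty}^{\infty} \Pi(\chi'',D)\,n^{(s)}(\chi'')\,W_s(\chi'')\,e^{2\pi i\sigma\chi''}\,d\chi''\right) . \]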

Equation (3.17c) in Chapter 3 shows that the expectation operator E can be taken inside the
integrals to get, after applying Eq. (8.3b),

( )
3
2 2 ( ) ( )
2 2
( )
( , ) ( ) ( , ) ( ) ( ) ( )
( , ) ( ) ( , ) ( )
i i s s
s s
i i
s s
T
d D W e d D W e n n
d D W e d D W e

=
=

E
( )
( )
s
nn

. o
(8.19a)

According to Eq. (8.3c), the autocorrelation function
( ) s
nn
o is even,

( ) ( )
( ) ( )
s s
nn nn
=

o o ,

which means that
( )
( )
s
nn

o in the formula for
3
T can be written as

( ) ( ) ( )
1 1
( ) ( ) ( )
2 2
s s s
nn nn nn
= +

o o o . (8.19b)

Substituting this into (8.19a) gives


2 2 ( )
3
2 2 ( )
1
( ) ( , ) ( ) ( , ) ( ) ( )
2
1
( , ) ( ) ( , ) ( ) ( )
2
i i s
s s nn
i i s
s s nn
T d D W e d D W e
d D W e d D W e
r o r o
r o r o
o

H H
+ H H

.
o
o

Again, just like in the analysis of
1
T above, Eq. (8.4b) is applied to get

3
2 2 ( ) 2 ( )
2 2 ( ) 2 ( )
( )
1
( , ) ( ) ( , ) ( ) ( )
2
1
( , ) ( ) ( , ) ( ) ( )
2
i i s i
s s nn
i i s i
s s nn
T
d D W e d D W e d e
d D W e d D W e d e
r o r o r o
r o r o r o
o
o o
o o

H H
+ H H

p
p

or

3
( ) 2 ( ) 2 ( )
( ) 2 ( )
( )
1
( ) ( , ) ( ) ( , ) ( )
2
1
( ) ( , ) ( )
2
s i i
nn s s
s i
nn s
T
d d D W e d D W e
d d D W e
r o o r o o o
r o o
o
o o
o o

+

H H
+ H

p
p
2 ( )
( , ) ( )
i
s
d D W e
r o o o

+
H

.
(8.19c)

Equations (8.18b) and (8.18c) show that

( ) ( )
( ) ( )
( )
3
2 ( )
ma
ma
( )
m
( )
2 ( )[( ) H ( ) M ( ) ( )]
[( ) H ( ) M ( ) ( )]
( ) [( ) H ( ) M ( )
{
s
nn mnf
mnf
s
nn
T
u R
u R d
u R
o
r o o o o o o o o o
o o o o o o o o o
o o o o o o o

+ + + +
Z
Z

p
p ( )
( ) ( )
a
ma
( )]
[( ) H ( ) M ( ) ( )] }
mnf
mnf
u R d
o o
o o o o o o o o o
+
+ + + +
Z
Z

or

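\[
\begin{aligned}
T_3(\sigma) = 2\pi^{2}\Big\{ &\int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\,\Big|(\sigma-\sigma')\,\mathrm{H}\big(u(\sigma-\sigma')\big)\,\mathrm{M}\big((\sigma-\sigma')R\theta_{ma}\big)\,\mathcal{Z}_{mnf}(\sigma-\sigma')\Big|^{2}\,d\sigma' \\
+ &\int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\,\Big|(\sigma+\sigma')\,\mathrm{H}\big(u(\sigma+\sigma')\big)\,\mathrm{M}\big((\sigma+\sigma')R\theta_{ma}\big)\,\mathcal{Z}_{mnf}(\sigma+\sigma')\Big|^{2}\,d\sigma' \Big\}
\end{aligned}
\tag{8.19d}
\]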

because everything inside the integrals except H is real. According to Eq. (3.54g) in Chapter 3, noise-power spectra such as $p_{nn}^{(s)}$ can never be negative. The $p_{nn}^{(s)}(\sigma')$ power spectra in both integrals of (8.19d) are multiplied by the squared magnitudes of complex numbers before being integrated over $d\sigma'$. Consequently neither of the integrals in (8.19d) can be negative, showing that

\[ T_3(\sigma) \ge 0 \tag{8.19e} \]
for all values of wavenumber $\sigma$.
Having found formulas for $T_1$, $T_2$, and $T_3$, we are now prepared to expand the $J^{(s)}$ function defined in Eq. (8.15c) above. Reversing the transform in Eq. (8.8c) gives

\[ 2\pi i\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma) = \mathcal{F}^{(-i\sigma\chi)}\!\left(W_s(\chi)\right) , \tag{8.20a} \]

and Eq. (8.8e) above shows that $\mathbf{n}_D^{(s)}$ is the Fourier transform

\[ \mathbf{n}_D^{(s)}(\sigma) = \mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi,D)\,n^{(s)}(\chi)\right) . \]

According to Eq. (2.39j) in Chapter 2, the Fourier convolution theorem shows that

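\[ \mathbf{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{FOV}(\sigma)\right] = \mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi,D)\,n^{(s)}(\chi)\,W_s(\chi)\right) . \tag{8.20b} \]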

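Hence, the formula for $J^{(s)}$ in Eq. (8.15c) can also be written as

\[ J^{(s)}(\sigma) = \mathcal{E}\!\left(\left[\mathrm{Re}\!\left(e^{-i\phi(\sigma)}\,\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi,D)\,n^{(s)}(\chi)\,W_s(\chi)\right)\right)\right]^{2}\right) . \tag{8.20c} \]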

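We now begin the analysis of the right-hand side of Eq. (8.20c). Once again noting that the real part of any complex number c is $0.5(c + c^{*})$, we write

\[
\begin{aligned}
&\mathrm{Re}\!\left(e^{-i\phi(\sigma)}\,\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi,D)\,n^{(s)}(\chi)\,W_s(\chi)\right)\right) \\
&\quad= \frac{1}{2}\Big\{ e^{-i\phi(\sigma)}\,\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi,D)\,n^{(s)}(\chi)\,W_s(\chi)\right) + e^{i\phi(\sigma)}\left[\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi,D)\,n^{(s)}(\chi)\,W_s(\chi)\right)\right]^{*} \Big\} .
\end{aligned}
\]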
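The product

\[ \Pi(\chi,D)\,n^{(s)}(\chi)\,W_s(\chi) \]

inside the Fourier transforms is real, according to Eqs. (8.6a), (8.2a), and (8.8c), so [applying formulas (2.29a) and (2.29c) in Chapter 2]

\[
\begin{aligned}
\left[\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi,D)\,n^{(s)}(\chi)\,W_s(\chi)\right)\right]^{*}
&= \left[\int_{-\infty}^{\infty} \Pi(\chi,D)\,n^{(s)}(\chi)\,W_s(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi\right]^{*} \\
&= \int_{-\infty}^{\infty} \Pi(\chi,D)\,n^{(s)}(\chi)\,W_s(\chi)\,e^{2\pi i\sigma\chi}\,d\chi \\
&= \mathcal{F}^{(i\sigma\chi)}\!\left(\Pi(\chi,D)\,n^{(s)}(\chi)\,W_s(\chi)\right) .
\end{aligned}
\]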
This shows that

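\[
\begin{aligned}
&\mathrm{Re}\!\left(e^{-i\phi(\sigma)}\,\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi,D)\,n^{(s)}(\chi)\,W_s(\chi)\right)\right) \\
&\quad= \frac{1}{2}\Big\{ e^{-i\phi(\sigma)}\,\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi,D)\,n^{(s)}(\chi)\,W_s(\chi)\right) + e^{i\phi(\sigma)}\,\mathcal{F}^{(i\sigma\chi)}\!\left(\Pi(\chi,D)\,n^{(s)}(\chi)\,W_s(\chi)\right) \Big\} .
\end{aligned}
\tag{8.21a}
\]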
Squaring this formula leads to

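\[
\begin{aligned}
&\left[\mathrm{Re}\!\left(e^{-i\phi(\sigma)}\,\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi,D)\,n^{(s)}(\chi)\,W_s(\chi)\right)\right)\right]^{2} \\
&\quad= \frac{1}{4}\Big\{ e^{-2i\phi(\sigma)}\left[\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi,D)\,n^{(s)}(\chi)\,W_s(\chi)\right)\right]^{2} + e^{2i\phi(\sigma)}\left[\mathcal{F}^{(i\sigma\chi)}\!\left(\Pi(\chi,D)\,n^{(s)}(\chi)\,W_s(\chi)\right)\right]^{2} \\
&\qquad\quad + 2\,\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi,D)\,n^{(s)}(\chi)\,W_s(\chi)\right)\,\mathcal{F}^{(i\sigma\chi)}\!\left(\Pi(\chi,D)\,n^{(s)}(\chi)\,W_s(\chi)\right) \Big\} .
\end{aligned}
\tag{8.21b}
\]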

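The expectation operator E is linear with respect to random quantities (see Sec. 3.10 of Chapter 3), so E can be applied to both sides of (8.21b) to get

\[
\begin{aligned}
&\mathcal{E}\!\left(\left[\mathrm{Re}\!\left(e^{-i\phi(\sigma)}\,\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi,D)\,n^{(s)}(\chi)\,W_s(\chi)\right)\right)\right]^{2}\right) \\
&\quad= \frac{1}{4}\,e^{-2i\phi(\sigma)}\,\mathcal{E}\!\left(\left[\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi,D)\,n^{(s)}(\chi)\,W_s(\chi)\right)\right]^{2}\right)
 + \frac{1}{4}\,e^{2i\phi(\sigma)}\,\mathcal{E}\!\left(\left[\mathcal{F}^{(i\sigma\chi)}\!\left(\Pi(\chi,D)\,n^{(s)}(\chi)\,W_s(\chi)\right)\right]^{2}\right) \\
&\qquad+ \frac{1}{2}\,\mathcal{E}\!\left(\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi,D)\,n^{(s)}(\chi)\,W_s(\chi)\right)\,\mathcal{F}^{(i\sigma\chi)}\!\left(\Pi(\chi,D)\,n^{(s)}(\chi)\,W_s(\chi)\right)\right) .
\end{aligned}
\]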

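Substituting from Eqs. (8.16a) through (8.16c) gives

\[ \mathcal{E}\!\left(\left[\mathrm{Re}\!\left(e^{-i\phi(\sigma)}\,\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi,D)\,n^{(s)}(\chi)\,W_s(\chi)\right)\right)\right]^{2}\right) = \frac{1}{4}\,e^{-2i\phi(\sigma)}\,T_1(\sigma) + \frac{1}{4}\,e^{2i\phi(\sigma)}\,T_2(\sigma) + \frac{1}{2}\,T_3(\sigma) , \]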

and Eq. (8.20c) shows that this result can be written as

\[ J^{(s)}(\sigma) = \frac{1}{4}\,e^{-2i\phi(\sigma)}\,T_1(\sigma) + \frac{1}{4}\,e^{2i\phi(\sigma)}\,T_2(\sigma) + \frac{1}{2}\,T_3(\sigma) . \tag{8.21c} \]
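A short numerical sketch shows why this combination is real even though the first two terms are individually complex. It relies on the conjugate relation between the first two T functions from Eq. (8.16f); the numerical values of the phase, the first T function, and the third T function below are hypothetical.

```python
import cmath

def j_from_t(phi: float, t1: complex, t3: float) -> complex:
    """Eq. (8.21c) with T2 taken as the complex conjugate of T1, per Eq. (8.16f)."""
    t2 = t1.conjugate()
    return (0.25 * cmath.exp(-2j * phi) * t1
            + 0.25 * cmath.exp(2j * phi) * t2
            + 0.5 * t3)

phi, t1, t3 = 0.6, -3.0 + 1.5j, 4.0      # hypothetical values with t3 >= |t1|
j = j_from_t(phi, t1, t3)
compact = 0.5 * (cmath.exp(-2j * phi) * t1).real + 0.5 * t3
print(j, compact)
```

The imaginary part of `j` cancels, and its real part matches the compact expression built from the real part of the first term plus half of the third.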

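Equations (8.18f), (8.18g), and (8.19d) have formulas for $T_1$, $T_2$, and $T_3$ that can be substituted into (8.21c) to get

\[
\begin{aligned}
J^{(s)}(\sigma) ={}& -\pi^{2} e^{-2i\phi(\sigma)} \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\left[(\sigma-\sigma')\,\mathrm{H}\big(u(\sigma-\sigma')\big)\,\mathrm{M}\big((\sigma-\sigma')R\theta_{ma}\big)\,\mathcal{Z}_{mnf}(\sigma-\sigma')\right] \\
&\qquad\times\left[(\sigma+\sigma')\,\mathrm{H}\big(u(\sigma+\sigma')\big)\,\mathrm{M}\big((\sigma+\sigma')R\theta_{ma}\big)\,\mathcal{Z}_{mnf}(\sigma+\sigma')\right] d\sigma' \\
& -\pi^{2} e^{2i\phi(\sigma)} \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\left[(\sigma-\sigma')\,\mathrm{H}\big(u(\sigma-\sigma')\big)^{*}\,\mathrm{M}\big((\sigma-\sigma')R\theta_{ma}\big)\,\mathcal{Z}_{mnf}(\sigma-\sigma')\right] \\
&\qquad\times\left[(\sigma+\sigma')\,\mathrm{H}\big(u(\sigma+\sigma')\big)^{*}\,\mathrm{M}\big((\sigma+\sigma')R\theta_{ma}\big)\,\mathcal{Z}_{mnf}(\sigma+\sigma')\right] d\sigma' \\
& +\pi^{2} \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\,\Big|e^{-i\phi(\sigma)}(\sigma-\sigma')\,\mathrm{H}\big(u(\sigma-\sigma')\big)\,\mathrm{M}\big((\sigma-\sigma')R\theta_{ma}\big)\,\mathcal{Z}_{mnf}(\sigma-\sigma')\Big|^{2}\,d\sigma' \\
& +\pi^{2} \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\,\Big|e^{-i\phi(\sigma)}(\sigma+\sigma')\,\mathrm{H}\big(u(\sigma+\sigma')\big)\,\mathrm{M}\big((\sigma+\sigma')R\theta_{ma}\big)\,\mathcal{Z}_{mnf}(\sigma+\sigma')\Big|^{2}\,d\sigma' ,
\end{aligned}
\tag{8.22a}
\]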

where $e^{-i\phi(\sigma)}$ terms are inserted into the squared magnitudes of the last two integrals. We can do this because, for any complex number c,

\[ \left|c\,e^{-i\phi}\right|^{2} = |c|^{2} . \]

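We next define the complex-valued function $a(\sigma,\sigma')$ to be

\[ a(\sigma,\sigma') = e^{-i\phi(\sigma)}\,(\sigma+\sigma')\,\mathrm{H}\big(u(\sigma+\sigma')\big)\,\mathrm{M}\big((\sigma+\sigma')R\theta_{ma}\big)\,\mathcal{Z}_{mnf}(\sigma+\sigma') . \tag{8.22b} \]

It follows that

\[ a(\sigma,-\sigma') = e^{-i\phi(\sigma)}\,(\sigma-\sigma')\,\mathrm{H}\big(u(\sigma-\sigma')\big)\,\mathrm{M}\big((\sigma-\sigma')R\theta_{ma}\big)\,\mathcal{Z}_{mnf}(\sigma-\sigma') . \tag{8.22c} \]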

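Equation (8.22a) can now be written as, remembering that M and $\mathcal{Z}_{mnf}$ are real-valued functions,

\[
\begin{aligned}
J^{(s)}(\sigma) = \pi^{2}\Big\{ &-\int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\,a(\sigma,\sigma')\,a(\sigma,-\sigma')\,d\sigma' - \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\,a^{*}(\sigma,\sigma')\,a^{*}(\sigma,-\sigma')\,d\sigma' \\
&+ \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\,|a(\sigma,-\sigma')|^{2}\,d\sigma' + \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\,|a(\sigma,\sigma')|^{2}\,d\sigma' \Big\}
\end{aligned}
\]
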
or, combining the four integrals into one,

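\[ J^{(s)}(\sigma) = \pi^{2} \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\Big\{ |a(\sigma,\sigma')|^{2} + |a(\sigma,-\sigma')|^{2} - a(\sigma,\sigma')\,a(\sigma,-\sigma') - a^{*}(\sigma,\sigma')\,a^{*}(\sigma,-\sigma') \Big\}\,d\sigma' . \tag{8.22d} \]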
We note that

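\[
\begin{aligned}
\left|a(\sigma,\sigma') - a^{*}(\sigma,-\sigma')\right|^{2}
&= \left[a(\sigma,\sigma') - a^{*}(\sigma,-\sigma')\right]\left[a^{*}(\sigma,\sigma') - a(\sigma,-\sigma')\right] \\
&= a(\sigma,\sigma')\,a^{*}(\sigma,\sigma') - a(\sigma,\sigma')\,a(\sigma,-\sigma') - a^{*}(\sigma,-\sigma')\,a^{*}(\sigma,\sigma') + a^{*}(\sigma,-\sigma')\,a(\sigma,-\sigma') \\
&= |a(\sigma,\sigma')|^{2} + |a(\sigma,-\sigma')|^{2} - a(\sigma,\sigma')\,a(\sigma,-\sigma') - a^{*}(\sigma,\sigma')\,a^{*}(\sigma,-\sigma') .
\end{aligned}
\]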
This shows that the formula for $J^{(s)}$ can be written as

\[ J^{(s)}(\sigma) = \pi^{2} \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\,\left|a(\sigma,\sigma') - a^{*}(\sigma,-\sigma')\right|^{2}\,d\sigma' , \]

which becomes, substituting from Eqs. (8.22b) and (8.22c),

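\[
\begin{aligned}
J^{(s)}(\sigma) = \pi^{2} \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\,\Big| &e^{-i\phi(\sigma)}\,(\sigma+\sigma')\,\mathrm{H}\big(u(\sigma+\sigma')\big)\,\mathrm{M}\big((\sigma+\sigma')R\theta_{ma}\big)\,\mathcal{Z}_{mnf}(\sigma+\sigma') \\
- &e^{i\phi(\sigma)}\,(\sigma-\sigma')\,\mathrm{H}\big(u(\sigma-\sigma')\big)^{*}\,\mathrm{M}\big((\sigma-\sigma')R\theta_{ma}\big)\,\mathcal{Z}_{mnf}(\sigma-\sigma') \Big|^{2}\,d\sigma' .
\end{aligned}
\tag{8.22e}
\]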

The $p_{nn}^{(s)}$ noise-power spectrum can never be negative [see inequality (3.54g) in Chapter 3], and inside the integral in (8.22e) the noise-power spectrum is multiplied by the squared magnitude of a complex number. Hence the integral in (8.22e) is over the product of two non-negative quantities and itself can never be negative:

\[ J^{(s)}(\sigma) \ge 0 . \tag{8.22f} \]

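This shows there can never be any problem taking the square root of $J^{(s)}$ in formula (8.15e) when calculating the sampling-error NEdN. Combining Eqs. (8.15e) and (8.22e) in a single place gives

\[ NEdN_{samp}(\sigma) = \frac{4\sqrt{J^{(s)}(\sigma)}}{A\,\Delta\Omega\,\mathcal{R}(|\sigma|)\,|\mathrm{H}(u\sigma)|\,\mathrm{M}(\sigma R\theta_{ma})\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} , \tag{8.22g} \]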
where

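\[
\begin{aligned}
J^{(s)}(\sigma) = \pi^{2} \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\,\Big| &e^{-i\phi(\sigma)}\,(\sigma+\sigma')\,\mathrm{H}\big(u(\sigma+\sigma')\big)\,\mathrm{M}\big((\sigma+\sigma')R\theta_{ma}\big)\,\mathcal{Z}_{mnf}(\sigma+\sigma') \\
- &e^{i\phi(\sigma)}\,(\sigma-\sigma')\,\mathrm{H}\big(u(\sigma-\sigma')\big)^{*}\,\mathrm{M}\big((\sigma-\sigma')R\theta_{ma}\big)\,\mathcal{Z}_{mnf}(\sigma-\sigma') \Big|^{2}\,d\sigma' .
\end{aligned}
\tag{8.22h}
\]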

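Part of the formula for $J^{(s)}$ can also be written as a convolution. Equation (8.4c), which shows that $p_{nn}^{(s)}$ is even, can be applied to the second integral in Eq. (8.19d) to get

\[
\begin{aligned}
T_3(\sigma) = 2\pi^{2}\Big\{ &\int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\,\Big|(\sigma-\sigma')\,\mathrm{H}\big(u(\sigma-\sigma')\big)\,\mathrm{M}\big((\sigma-\sigma')R\theta_{ma}\big)\,\mathcal{Z}_{mnf}(\sigma-\sigma')\Big|^{2}\,d\sigma' \\
+ &\int_{-\infty}^{\infty} p_{nn}^{(s)}(-\sigma')\,\Big|(\sigma+\sigma')\,\mathrm{H}\big(u(\sigma+\sigma')\big)\,\mathrm{M}\big((\sigma+\sigma')R\theta_{ma}\big)\,\mathcal{Z}_{mnf}(\sigma+\sigma')\Big|^{2}\,d\sigma' \Big\} .
\end{aligned}
\]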

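Changing the variable of integration in the second integral to $\sigma'' = -\sigma'$ then gives

\[
\begin{aligned}
T_3(\sigma) = 2\pi^{2}\Big\{ &\int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\,\Big|(\sigma-\sigma')\,\mathrm{H}\big(u(\sigma-\sigma')\big)\,\mathrm{M}\big((\sigma-\sigma')R\theta_{ma}\big)\,\mathcal{Z}_{mnf}(\sigma-\sigma')\Big|^{2}\,d\sigma' \\
+ &\int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma'')\,\Big|(\sigma-\sigma'')\,\mathrm{H}\big(u(\sigma-\sigma'')\big)\,\mathrm{M}\big((\sigma-\sigma'')R\theta_{ma}\big)\,\mathcal{Z}_{mnf}(\sigma-\sigma'')\Big|^{2}\,d\sigma'' \Big\} ,
\end{aligned}
\]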

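which becomes, glancing back at the definition of a convolution in Eq. (2.38a) in Chapter 2,

\[ T_3(\sigma) = 4\pi^{2}\,p_{nn}^{(s)}(\sigma) * \left\{\left|\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{mnf}(\sigma)\right|^{2}\right\} . \tag{8.23a} \]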
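This can be substituted back into Eq. (8.21c) to get

\[ J^{(s)}(\sigma) = \frac{1}{4}\,e^{-2i\phi(\sigma)}\,T_1(\sigma) + \frac{1}{4}\,e^{2i\phi(\sigma)}\,T_2(\sigma) + 2\pi^{2}\,p_{nn}^{(s)}(\sigma) * \left\{\left|\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{mnf}(\sigma)\right|^{2}\right\} . \]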

Equation (8.16f) shows that this can be written as

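\[
\begin{aligned}
J^{(s)}(\sigma) &= \frac{1}{4}\,e^{-2i\phi(\sigma)}\,T_1(\sigma) + \frac{1}{4}\,e^{2i\phi(\sigma)}\,T_1(\sigma)^{*} + 2\pi^{2}\,p_{nn}^{(s)}(\sigma) * \left\{\left|\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{mnf}(\sigma)\right|^{2}\right\} \\
&= \frac{1}{4}\,e^{-2i\phi(\sigma)}\,T_1(\sigma) + \frac{1}{4}\left[e^{-2i\phi(\sigma)}\,T_1(\sigma)\right]^{*} + 2\pi^{2}\,p_{nn}^{(s)}(\sigma) * \left\{\left|\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{mnf}(\sigma)\right|^{2}\right\} .
\end{aligned}
\]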

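Again noting that

\[ \mathrm{Re}(c) = \frac{1}{2}\,(c + c^{*}) \]
for any complex number c, we see that

\[ J^{(s)}(\sigma) = \frac{1}{2}\,\mathrm{Re}\!\left(e^{-2i\phi(\sigma)}\,T_1(\sigma)\right) + 2\pi^{2}\,p_{nn}^{(s)}(\sigma) * \left\{\left|\sigma\,\mathrm{H}(u\sigma)\,\mathrm{M}(\sigma R\theta_{ma})\,\mathcal{Z}_{mnf}(\sigma)\right|^{2}\right\} , \tag{8.23b} \]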

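where Eq. (8.18f) shows the formula for $T_1(\sigma)$ to be

\[
\begin{aligned}
T_1(\sigma) = -4\pi^{2} \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\,&\left[(\sigma-\sigma')\,\mathrm{H}\big(u(\sigma-\sigma')\big)\,\mathrm{M}\big((\sigma-\sigma')R\theta_{ma}\big)\,\mathcal{Z}_{mnf}(\sigma-\sigma')\right] \\
\times\,&\left[(\sigma+\sigma')\,\mathrm{H}\big(u(\sigma+\sigma')\big)\,\mathrm{M}\big((\sigma+\sigma')R\theta_{ma}\big)\,\mathcal{Z}_{mnf}(\sigma+\sigma')\right] d\sigma' .
\end{aligned}
\tag{8.23c}
\]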

This alternative formula for $J^{(s)}$ is useful later on when analyzing the behavior of the sampling-noise NEdN associated with the measurement of an isolated emission line (see Sec. 8.9).
8.8 Black-Body Spectrum Contaminated by Sampling Noise
To show what sampling noise looks like, we simulate an interferometer system contaminated by large amounts of sampling noise while observing a 400-K Planck black-body spectrum. [There is a brief discussion of black-body radiance spectra following Eq. (5.3h) in Chapter 5.] The simulated interferometer is similar to the one set up in Sec. 7.15 of Chapter 7. Again the interferogram is supposed to be evenly sampled N = 8192 times between the OPD values of $-D$ and $D$, with D = 1.28 cm. According to Eq. (5.67) in Chapter 5, this means the unapodized spectral resolution is

\[ \frac{1}{2D} \cong 0.391\ \mathrm{cm}^{-1}, \tag{8.24a} \]

and, of course, the change in OPD between interferogram samples is still

\[ \Delta\chi = \frac{2D}{N} = 3.125 \times 10^{-4}\ \mathrm{cm}. \tag{8.24b} \]

The background radiance from the interferometer's interior surfaces is assumed to be small compared to the 400-K Planck radiance being measured, so we say that

\[ L^{(dir)}(\sigma) \cong L_{FOV}^{(fore)}(\sigma) \cong L_{FOV}^{(back)}(\sigma) \cong L_{mnf}^{(fore)}(\sigma) \cong L_{mnf}^{(back)}(\sigma) \cong 0. \tag{8.24c} \]

Again, the beam radius is taken to be R = 3 cm, which makes the beam cross-sectional area

\[ A = \pi R^{2} \cong 28.27\ \mathrm{cm}^{2}. \tag{8.24d} \]

The interferometer's field of view is

\[ \Delta\Omega = 1.086 \times 10^{-4}\ \mathrm{ster} \tag{8.24e} \]

and the responsivity $\mathcal{R}$ is still given its ideal value

\[ \mathcal{R}(\sigma) = 1\ \frac{\mathrm{amp\ sec}}{\mathrm{erg}}. \tag{8.24f} \]

The beam-splitter efficiency and the transmissions of the fore and aft optics are also ideal,

\[ \tau_a(\sigma) = \tau_f(\sigma) = \eta(\sigma) = 1. \tag{8.24g} \]

The interferometer is perfectly aligned, with
ma
0 = so that

ma
M( ) 1.0 R = , (8.24h)

and parameter W [see discussion following Eq. (4.83) in Chapter 4] is

\[ W = 1. \tag{8.24i} \]

The detector electronics again have a three-pole, low-pass Butterworth filter with a cutoff frequency of 8000 Hz (see Fig. 7.3 in Chapter 7). The OPD velocity u is still 5 cm/sec, so the wavenumber corresponding to the cutoff frequency is

\[ \frac{8000\ \mathrm{Hz}}{5\ \mathrm{cm/sec}} = 1600\ \mathrm{cm}^{-1}. \tag{8.24j} \]

One difference from the interferometer system simulated in the previous chapter is the band of wavenumbers over which the spectrum is measured: this time it is 650 to 1250 cm$^{-1}$. Another difference is the radiances used to calibrate the instrument. Since we are simulating the measurement of a 400-K black-body spectrum, the high-temperature calibration is now a 500-K instead of a 350-K black-body radiance. The low-temperature calibration is still that of liquid nitrogen (77 K).
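
The numerical values in Eqs. (8.24a), (8.24b), (8.24d), and (8.24j) follow from simple arithmetic on the quoted parameters. The short Python sketch below recomputes them as a check; the variable names are illustrative only and do not come from the text.

```python
import math

D = 1.28            # maximum OPD, cm
N = 8192            # number of interferogram samples
R_BEAM = 3.0        # beam radius, cm
U = 5.0             # OPD velocity, cm/sec
F_CUTOFF = 8000.0   # filter cutoff frequency, Hz

resolution = 1.0 / (2.0 * D)         # Eq. (8.24a), cm^-1
delta_opd = 2.0 * D / N              # Eq. (8.24b), cm
beam_area = math.pi * R_BEAM ** 2    # Eq. (8.24d), cm^2
sigma_cutoff = F_CUTOFF / U          # Eq. (8.24j), cm^-1

print(f"resolution   = {resolution:.3f} cm^-1")
print(f"OPD step     = {delta_opd:.4e} cm")
print(f"beam area    = {beam_area:.2f} cm^2")
print(f"cutoff sigma = {sigma_cutoff:.0f} cm^-1")
```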
Figure 8.1(a) shows that the sampling-position noise contaminating the 400-K black-body measurement has a quasi-harmonic $p_{nn}^{(s)}$ noise-power spectrum. This has the same shape as the spectrum in Fig. 7.2(c) in Chapter 7, with the power spectrum in Fig. 8.1(a) having $\sigma_C = 30$ cm$^{-1}$ and $\sigma_M = 10$ cm$^{-1}$. The upper level in the sampling-position power spectrum is

\[ p_0 = 1.25 \times 10^{-13}\ \mathrm{cm}^{3}. \tag{8.25a} \]

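Imitating Eq. (7.49b) in Chapter 7, we write the formula for the quasi-harmonic spectrum as

\[ p_{nn}^{(s)}(\sigma') = [1.25 \times 10^{-13}\ \mathrm{cm}^{3}] \left\{ \Pi(\sigma' + 35\ \mathrm{cm}^{-1},\ 5\ \mathrm{cm}^{-1}) + \Pi(\sigma' - 35\ \mathrm{cm}^{-1},\ 5\ \mathrm{cm}^{-1}) \right\}. \tag{8.25b} \]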

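Consulting Eq. (8.5b) above, we see that the variance in the sampling-position error due to this noise-power spectrum is

\[ \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\, d\sigma' = (20\ \mathrm{cm}^{-1})(1.25 \times 10^{-13}\ \mathrm{cm}^{3}) = 2.5 \times 10^{-12}\ \mathrm{cm}^{2}, \tag{8.25c} \]

which means that the root-mean-square average of the error in the sampling position is

\[ s_{rms} = \sqrt{2.5 \times 10^{-12}\ \mathrm{cm}^{2}} \cong 1.581 \times 10^{-6}\ \mathrm{cm}. \tag{8.25d} \]

Comparing this to (8.24b), we see that this is

\[ \frac{1.581 \times 10^{-6}\ \mathrm{cm}}{3.125 \times 10^{-4}\ \mathrm{cm}}, \]
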
or approximately 0.5% of the OPD separation between adjacent samples. This may be somewhat larger than the typical size of the sampling error in well-designed interferometers, but the bad sampling does make it easier to see how sampling error affects the measured spectra. Figures 8.1(b) and 8.1(c) give an example of sampling-position noise obeying the quasi-harmonic noise-power spectrum in Fig. 8.1(a). Both figures plot the same simulation of the $n^{(s)}(\chi)$ random function, with the $\chi$ axis expanded in Fig. 8.1(b) to provide a detailed example of the $n^{(s)}(\chi)$ oscillations. This sampling-position error is a zero-mean and normally distributed random quantity. Comparing this example of quasi-harmonic noise to the one shown in Figs. 7.7(a) and 7.7(b) in Chapter 7, we see that here the random oscillations occur at a somewhat lower frequency. This is due to our choice of a much smaller value of $\sigma_C$, which is 30 cm$^{-1}$ for the noise in Figs. 8.1(b) and 8.1(c) compared to 100 cm$^{-1}$ for the noise in Figs. 7.7(a) and 7.7(b).
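
As a rough illustration of Eqs. (8.25c) and (8.25d), the Python sketch below recomputes the variance and rms of the sampling-position error and then synthesizes one zero-mean Gaussian realization of quasi-harmonic noise. The synthesis method is an assumption made for this example, not necessarily how Figs. 8.1(b) and 8.1(c) were produced: the continuous spectrum is replaced by discrete lines spaced 1/(2D) apart between 30 and 40 cm$^{-1}$, each given independent Gaussian cosine and sine amplitudes. All names are hypothetical.

```python
import math
import random

P0 = 1.25e-13                   # upper level of the power spectrum, cm^3 (Eq. 8.25a)
BAND_LO, BAND_HI = 30.0, 40.0   # nonzero band, sigma_C to sigma_C + sigma_M, cm^-1
D = 1.28                        # maximum OPD, cm
N = 8192                        # samples per interferogram
DELTA_OPD = 2.0 * D / N         # sample spacing, cm (Eq. 8.24b)

# Variance and rms of the sampling-position error, Eqs. (8.25c) and (8.25d).
# The factor of 2 counts the bands on both sides of zero wavenumber.
variance = 2.0 * (BAND_HI - BAND_LO) * P0
s_rms = math.sqrt(variance)
ratio = s_rms / DELTA_OPD       # fraction of the intersample spacing


def simulate_noise(seed=0):
    """Return one realization of the sampling-position error at N OPD samples.

    Assumption: the two-sided spectrum is approximated by discrete lines
    spaced d_sigma = 1/(2D) apart, each pair of lines at +/- sigma_k
    carrying a mean power of 2 * P0 * d_sigma.
    """
    rng = random.Random(seed)
    d_sigma = 1.0 / (N * DELTA_OPD)
    k_lo = math.ceil(BAND_LO / d_sigma)
    k_hi = math.floor(BAND_HI / d_sigma)
    amp = math.sqrt(2.0 * P0 * d_sigma)
    lines = [(k, amp * rng.gauss(0.0, 1.0), amp * rng.gauss(0.0, 1.0))
             for k in range(k_lo, k_hi + 1)]
    noise = []
    for j in range(N):
        total = 0.0
        for k, a, b in lines:
            theta = 2.0 * math.pi * k * j / N
            total += a * math.cos(theta) - b * math.sin(theta)
        noise.append(total)
    return noise


noise = simulate_noise(seed=1)
sample_rms = math.sqrt(sum(x * x for x in noise) / len(noise))
print(f"predicted rms = {s_rms:.3e} cm ({100.0 * ratio:.2f}% of the OPD step)")
print(f"simulated rms = {sample_rms:.3e} cm")
```

Because only a few dozen spectral lines fall inside the band, the rms of a single realization scatters noticeably around the predicted value; averaging over many seeds brings it closer.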
Figure 8.2(a) plots ten simulated measurements of the 400-K black-body radiance spectrum for the interferometer system specified by the discussion accompanying Eqs. (8.24a) through (8.24j) above. In Fig. 8.2(a), and only in Fig. 8.2(a), the actual sampling noise is multiplied by a factor of 20 before being added back to the true radiance; this can be regarded as increasing the $s_{rms}$ root-mean-square sampling-position error to 10% of the intersample spacing. This increase makes it easy to see how the sampling noise reshapes the spectral measurements, because now the width of the black solid line representing the true 400-K spectrum does not cover over the dashed lines representing the noise-contaminated measurements. We note that there is a region near $\sigma = 1031$ cm$^{-1}$ where the error is always small. The solid curve in Fig. 8.2(b) is the NEdN versus wavenumber curve predicted by formulas (8.22g) and (8.22h) for this sampling-position noise, and it also shows the NEdN dipping down to zero near $\sigma = 1031$ cm$^{-1}$. The NEdN is, of course, just the standard deviation of the error in the noise-contaminated spectral measurements (see Sec. 6.1 in Chapter 6). We can take a large number of noise-contaminated measurements and calculate directly the standard deviation of their error at any wavenumber $\sigma$. We have done this for 300 measurements contaminated by statistically independent examples of sampling-position noise obeying the power spectrum in Fig. 8.1(a) and plotted the results with crosses in Fig. 8.2(b). As expected, there is a good match to the predicted NEdN values (that is, the solid curve), and we see that the crosses marking the standard deviation also dip down to zero near $\sigma = 1031$ cm$^{-1}$. [The reason they do not go as far down as the solid curve is explained in the discussion following Eq. (8.34c) in Sec. 8.10 below.]

FIGURE 8.1(a). [Plot of the noise-power spectrum $p_{nn}^{(s)}(\sigma')$ versus $\sigma'$ (in cm$^{-1}$) from -50 to 50 cm$^{-1}$, with upper level 1.25 x 10$^{-13}$ cm$^{3}$.]

FIGURE 8.1(b). [Plot of $n^{(s)}(\chi)$ (in cm), between -5 x 10$^{-6}$ and 5 x 10$^{-6}$, versus $\chi$ (in cm) from -0.5 to 0.5.]

FIGURE 8.1(c). [Plot of $n^{(s)}(\chi)$ (in cm), between -5 x 10$^{-6}$ and 5 x 10$^{-6}$, versus $\chi$ (in cm) over the full range from $-D$ to $D$, with D = 1.28 cm.]
The formula for $\mathrm{NEdN}_{samp}$ in Eqs. (8.22g) and (8.22h) predicts this dip. The phase angle $\phi$ of the three-pole, low-pass filter used in the interferometer simulation [this phase angle is introduced in Eq. (8.12f) above] is to a very good approximation linear in wavenumber,

\[ \phi(\sigma) \cong K\sigma + \phi_0, \tag{8.26a} \]

for a real $\phi_0$ and a real, positive constant K. Many types of low-pass filter have this sort of approximately linear dependence of the transfer function's phase. Equations (8.24f) through (8.24h) can be applied to the formula for $\mathrm{NEdN}_{samp}$ in Eq. (8.22g) to get

\[ \mathrm{NEdN}_{samp}(\sigma) = \frac{4\sqrt{J^{(s)}(\sigma)}}{A\,\Delta\Omega\,\sigma u\,|H(\sigma)| \left(1\ \frac{\mathrm{amp\ sec}}{\mathrm{erg}}\right)}. \tag{8.26b} \]

Equations (8.24c) and (8.24i) together with the previously used (8.24f) through (8.24h) can be substituted into the formula for $Z_{mnf}$ in Eq. (8.7f) to get

\[ Z_{mnf}(\sigma) = \frac{A\,\Delta\Omega}{4} \left(1\ \frac{\mathrm{amp\ sec}}{\mathrm{erg}}\right) L_{mnf}(\sigma) \tag{8.26c} \]

in the formula used for $J^{(s)}$ [for example, Eq. (8.22h)]. Again we note, just as in the discussion following Eq. (7.55f) in Chapter 7, that for this interferometer system the black-body spectrum is smooth enough to neglect the nonrandom measurement errors due to the interferometer's finite field of view and finite interferogram length; that is, we do not need to worry about the potentially different shapes of the $L$, $L_{FOV}$, and $L_{mnf}$ radiance functions. Hence the formula for $Z_{mnf}$ can be written as

\[ Z_{mnf}(\sigma) \cong \frac{A\,\Delta\Omega}{4} \left(1\ \frac{\mathrm{amp\ sec}}{\mathrm{erg}}\right) L(\sigma), \tag{8.26d} \]

where in Eq. (8.26d) $L$ is the spectral radiance curve for Planck radiation coming from a 400-K black body.
The formula for
( ) s
J can be simplified in the same way that the NEdN
samp
and
mnf
Z formulas
were. Equations (8.24h) and (8.26d) can be substituted into (8.22h) to get

( )
( )
2
( ) ( ) ( )
2
( )
amp sec
( ) 1 ( ) ( ) H ( ) ( )
4 erg
( ) H ( ) ( )
s s i
nn
i
A
J e u
e u d

=

+ + +
L
L

.
p

FIGURE 8.2(a). This graph contains 10 simulated measurements of a 400-K black-body spectrum contaminated by the sampling noise. The noise is increased by a factor of 20 over the size specified by the noise-power spectrum in Fig. 8.1(a) to make it easier to see. [Plot of radiance (in mW/m$^{2}$/sr/cm$^{-1}$) versus $\sigma$ (in cm$^{-1}$) from 600 to 1300 cm$^{-1}$, with the small-error region near 1031 cm$^{-1}$ marked.]

FIGURE 8.2(b). [Plot of radiance error (in mW/m$^{2}$/sr/cm$^{-1}$) versus $\sigma$ (in cm$^{-1}$) from 600 to 1300 cm$^{-1}$, with the dip near 1031 cm$^{-1}$ marked.]

Applying (8.12f) gives

( )
( )
( )
2
( )
2
[ ( ) ( )]
[ ( ) ( )]
( )
amp sec
1 ( ) ( ) H ( ) ( )
4 erg
( ) H ( ) ( ) ,
s
s
nn
i
i
J
A
e u
e u d

+
+

=

+ + +
L
L

p

which becomes, after substituting from Eq. (8.26a),

( )
( )
0 0
0 0
( )
2
[ ( ) ] ( )
[ ( ) ]
amp sec
erg
( )
1
4
( ) ( ) H ( ) ( )
( ) H ( ) (
s
i K K s
nn
i K K
J
A
e u
e u

+ + +

+ +
L
L

p
( )
( )
2
2
( )
2
amp sec
erg
)
1 ( ) ( ) H ( ) ( )
4
( ) H ( ) ( )
s iK
nn
iK
d
A
e u
e u d
+

=

+ + +
L
L

.
p

Since

1 2 1 2 1 2 1 2
( )
iK iK iK iK
C e C e e C C e C C C C

+ = + = + = +

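for any two complex numbers $C_1$ and $C_2$, this formula for $J^{(s)}$ reduces to

\[ J^{(s)}(\sigma) \cong \left( \frac{\pi A\,\Delta\Omega}{4} \cdot 1\ \frac{\mathrm{amp\ sec}}{\mathrm{erg}} \right)^{2} \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma') \left| (\sigma - \sigma')\, u\, |H(\sigma - \sigma')|\, L(\sigma - \sigma') - (\sigma + \sigma')\, u\, |H(\sigma + \sigma')|\, L(\sigma + \sigma') \right|^{2} d\sigma'. \tag{8.27} \]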

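The black-body radiance $L$ varies slowly with wavenumber $\sigma$, as does the magnitude $|H|$ of the filter transfer function. We can define a new function

\[ g(\sigma) = u\, |H(\sigma)|\, L(\sigma), \tag{8.28a} \]

which is also a slowly varying function of wavenumber $\sigma$. Now the integral in Eq. (8.27) can be written as

\[ \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma') \left| (\sigma - \sigma')\, u\, |H(\sigma - \sigma')|\, L(\sigma - \sigma') - (\sigma + \sigma')\, u\, |H(\sigma + \sigma')|\, L(\sigma + \sigma') \right|^{2} d\sigma' = \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma') \left| (\sigma - \sigma')\, g(\sigma - \sigma') - (\sigma + \sigma')\, g(\sigma + \sigma') \right|^{2} d\sigma'. \]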

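According to Fig. 8.1(a), the power spectrum $p_{nn}^{(s)}(\sigma')$ is nonzero over only a relatively small range of $\sigma'$ centered on $\sigma' = 0$, so in effect the integral is only over a small region of the $\sigma'$ axis near $\sigma' = 0$, and we only need to know the value of $g(\sigma \pm \sigma')$ inside the integral for small values of $\sigma'$. In this situation it makes sense to expand $g(\sigma \pm \sigma')$ as a Taylor series in $\sigma'$. Keeping terms through first order in $\sigma'$, with the derivative evaluated at $\sigma$, gives

\[ (\sigma \mp \sigma')\, g(\sigma \mp \sigma') \cong (\sigma \mp \sigma') \left[ g(\sigma) \mp \sigma' \frac{dg}{d\sigma} \right] \cong \sigma g(\sigma) \mp \sigma' \left[ g(\sigma) + \sigma \frac{dg}{d\sigma} \right]. \]

The two $\sigma g(\sigma)$ terms cancel when the difference is taken, so the integral simplifies to

\[ \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma') \left| (\sigma - \sigma')\, u\, |H(\sigma - \sigma')|\, L(\sigma - \sigma') - (\sigma + \sigma')\, u\, |H(\sigma + \sigma')|\, L(\sigma + \sigma') \right|^{2} d\sigma' \cong 4 \left[ g(\sigma) + \sigma \frac{dg}{d\sigma} \right]^{2} \int_{-\infty}^{\infty} \sigma'^{2}\, p_{nn}^{(s)}(\sigma')\, d\sigma'. \tag{8.28b} \]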

Equation (8.28b) can now be substituted into (8.27) to get

\[ J^{(s)}(\sigma) \cong \left( \frac{\pi A\,\Delta\Omega}{2} \cdot 1\ \frac{\mathrm{amp\ sec}}{\mathrm{erg}} \right)^{2} \left[ g(\sigma) + \sigma \frac{dg}{d\sigma} \right]^{2} \int_{-\infty}^{\infty} \sigma'^{2}\, p_{nn}^{(s)}(\sigma')\, d\sigma', \]

which can in turn be put into (8.26b), giving

\[ \mathrm{NEdN}_{samp}(\sigma) \cong \frac{2\pi}{\sigma u\, |H(\sigma)|} \left| g(\sigma) + \sigma \frac{dg}{d\sigma} \right| \left[ \int_{-\infty}^{\infty} \sigma'^{2}\, p_{nn}^{(s)}(\sigma')\, d\sigma' \right]^{1/2}. \tag{8.28c} \]

Clearly, $\mathrm{NEdN}_{samp}$ is going to be very small when the absolute value of

\[ T(\sigma) = g(\sigma) + \sigma \frac{dg}{d\sigma} \tag{8.28d} \]

is small; in fact for the approximation shown in (8.28c), the $\mathrm{NEdN}_{samp}$ value is zero when

\[ g(\sigma) = -\sigma \frac{dg}{d\sigma}, \tag{8.28e} \]

because then $T(\sigma) = g(\sigma) + \sigma\,(dg/d\sigma)$ is zero in (8.28c). Formula (8.28a) defines the function $g$ used to define $T(\sigma)$ in (8.28d). Figure 8.3 is a graph of $T(\sigma)$ versus $\sigma$ for the $T(\sigma)$ function specified by a 400-K black-body spectrum and the magnitude $|H|$ of the filter transfer function. We see that $T(\sigma)$ is zero for

\[ \sigma \cong 1030.5\ \mathrm{cm}^{-1}, \tag{8.28f} \]

which explains the dip at $\sigma \cong 1031$ cm$^{-1}$ in Fig. 8.2(b) and the negligible sampling error near $\sigma \cong 1030.5$ cm$^{-1}$ of all ten noise-contaminated measurements in Fig. 8.2(a). We can expect this sort of behavior whenever we examine the $\mathrm{NEdN}_{samp}$ curve for a noise-contaminated black-body spectrum.
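
The location of the zero crossing in Eq. (8.28f) can be checked numerically. The Python sketch below is only an illustration under stated assumptions: the filter magnitude is taken to be that of an ideal three-pole Butterworth filter, $|H(\sigma)| = [1 + (\sigma/1600\ \mathrm{cm}^{-1})^{6}]^{-1/2}$, the Planck radiance is used up to a constant factor (which does not move the zero), and the function names are hypothetical.

```python
import math

C2 = 1.438777          # second radiation constant hc/k, cm K
T_BB = 400.0           # black-body temperature, K
SIGMA_CUTOFF = 1600.0  # filter cutoff expressed as a wavenumber, cm^-1
U = 5.0                # OPD velocity, cm/sec


def planck_shape(sigma):
    """Planck spectral radiance versus wavenumber, up to a constant factor."""
    return sigma ** 3 / math.expm1(C2 * sigma / T_BB)


def butterworth_mag(sigma):
    """Assumed magnitude of an ideal three-pole low-pass Butterworth filter."""
    return 1.0 / math.sqrt(1.0 + (sigma / SIGMA_CUTOFF) ** 6)


def g(sigma):
    """Function g of Eq. (8.28a)."""
    return U * butterworth_mag(sigma) * planck_shape(sigma)


def t_curve(sigma, h=1.0e-3):
    """T(sigma) = g + sigma * dg/dsigma, Eq. (8.28d), by central difference."""
    dg = (g(sigma + h) - g(sigma - h)) / (2.0 * h)
    return g(sigma) + sigma * dg


def find_zero(lo, hi, tol=1.0e-6):
    """Bisection search for the sign change of t_curve between lo and hi."""
    f_lo = t_curve(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = t_curve(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


sigma_dip = find_zero(650.0, 1250.0)
print(f"T(sigma) crosses zero near {sigma_dip:.1f} cm^-1")
```

Lowering `T_BB` in this sketch moves the zero crossing to smaller wavenumbers, which is the trend described later for the 350-K background radiance.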
8.9 Sampling Noise and an Isolated Lorentz Emission Line
FIGURE 8.3. This is a plot of the $T(\sigma)$ curve showing where it crosses zero on the wavenumber axis. [Plot of $T(\sigma)$ (in mW/m$^{2}$/sr/cm$^{-1}$) versus $\sigma$ (in cm$^{-1}$) from 600 to 1300 cm$^{-1}$, with the zero crossing at $\sigma \cong 1030.5$ cm$^{-1}$ marked.]

Sampling-position noise, just like misalignment noise, can generate spurious ghost lines when contaminating measurements of strong emission lines (this misalignment-noise effect is discussed in Sec. 7.15 of Chapter 7). To see how it works, we take the same system discussed in Sec. 8.8, contaminated by sampling noise obeying the same power spectrum shown in Fig. 8.1(a), and change the spectral radiance entering the system into the single Lorentz emission line shown in Fig. 7.6 in Chapter 7. Again the expression for $\mathrm{NEdN}_{samp}$ in Eq. (8.22g) reduces to the formula shown in (8.26b),

\[ \mathrm{NEdN}_{samp}(\sigma) = \frac{4\sqrt{J^{(s)}(\sigma)}}{A\,\Delta\Omega\,\sigma u\,|H(\sigma)| \left(1\ \frac{\mathrm{amp\ sec}}{\mathrm{erg}}\right)}. \tag{8.29a} \]

The formula for $Z_{mnf}$ associated with Eq. (8.22h) is the same as it was before in (8.26c),

\[ Z_{mnf}(\sigma) = \frac{A\,\Delta\Omega}{4} \left(1\ \frac{\mathrm{amp\ sec}}{\mathrm{erg}}\right) L_{mnf}(\sigma), \]

where $L_{mnf}$ is the Lorentz emission line as measured by the interferometer. The effects of the interferometer's finite interferogram length and finite field of view are the same as when they are analyzed in Sec. 7.15 of Chapter 7, so we can again ignore the slight differences in shape of the $L$, $L_{FOV}$, and $L_{mnf}$ radiance functions [see discussion following Eq. (7.56c) in Chapter 7] to get

\[ Z_{mnf}(\sigma) \cong \frac{A\,\Delta\Omega}{4} \left(1\ \frac{\mathrm{amp\ sec}}{\mathrm{erg}}\right) L(\sigma), \tag{8.29b} \]

where $L$ is the spectral radiance of the Lorentz emission line entering the system, that is, the spectral radiance in Fig. 7.6 in Chapter 7.
If we use formula (8.23b) instead of (8.22h) for the
( ) s
J function in Eq. (8.29a), it will be
easier to understand how the sampling-position noise can generate ghost lines when the
interferometer measures the emission line. According to Eq. (8.24h), the M function is one, so
Eq. (8.23b) can be written as

( ) ( )
{ }
2
( ) 2 ( ) 2 ( )
1
1
( ) Re ( ) 2 ( ) H ( )
2
s i s
nn mnf
J e T u

= +

Z

p .

Substitution of (8.29b) gives

( ) ( )
( )
2
2 2 2
2 ( ) ( )
1
( )
1 amp sec
Re ( ) ( ) H 1 ( )
2 8 erg
s
i s
nn
J
A
e T u

= +

L

. p
(8.30a)

The formula for
1
( ) T o comes from substituting (8.24h) and (8.29b) into Eq. (8.23c) to get

( )
( )
2
2 2 2
( )
1
amp sec
( ) 1 ( )[( ) H ( ) ( )]
4 erg
[( )H ( ) ( )]
s
nn
A
T u
u d
r
o o o o o o o o
o o o o o o o
AO

+ + +
L
L

.
p
(8.30b)

The L radiance function is narrow enough (see Fig. 7.6 in Chapter 7), and the [ H( )] u o o varies
slowly enough, that we can make the approximation that

( ) ( ) H ( ) H ( )
e e
u u o o o o o o e L L , (8.30c)

where
e
o is the wavenumber of the emission lines peak value (for the Lorentz emission line in
this simulation,
-1
950 cm
e
o ). When is far from
e
o , function L in Eq. (8.30c) is essentially
zero, making the value assigned to [ H( )] u o o irrelevantand, of course, when is near to
e
o ,
we can approximate [ H( )] u o o by its value [ H( )]
e e
u o o at
e
o . In effect, L is treated as a sort of
delta function to which we have applied formula (2.68e) in Chapter 2. Equations (8.30a) and
(8.30b) can now be written as

( ) ( )

( )
2
2 2 2
2
2 ( ) ( )
1
( )
1 amp sec
Re ( ) 1 H ( ) ( ) ,
2 8 erg
s
i s
e e nn
J
A
e T u
o
o
r
o o o o o
AO

+

L

p
(8.30d)

with

( )
1
2
2 2 2
( )
( )
amp sec
1 H ( ) ( ) ( )
4 erg
s
e e nn
T
A
u d
o
r
o o o o o o o o

AO
+

L L

. p
(8.30e)

The solid curve in Fig. 8.4(a) is the Lorentz emission line $L$ centered over the graph of the $p_{nn}^{(s)}(\sigma')$ function in Fig. 8.4(b), with $p_{nn}^{(s)}(\sigma')$ having the same basic quasi-harmonic shape shown in Fig. 8.1(a). The effective half-width of the Lorentz line is taken to be $\sigma_w$. The two dashed curves in Fig. 8.4(a) show the $L$ function displaced to either side of the original emission line, with new peak values at $\sigma_e \pm \sigma_w$.

FIGURE 8.4(a) [top] and FIGURE 8.4(b) [bottom]. [Top panel: the emission line $L(\sigma)$ peaking at $\sigma_e = 950$ cm$^{-1}$, with the displaced curves $L(\sigma - \sigma_w)$ and $L(\sigma + \sigma_w)$ peaking at $\sigma_e \pm \sigma_w$; the wavenumbers $\sigma_e - \sigma_C = 920$ cm$^{-1}$ and $\sigma_e + \sigma_C = 980$ cm$^{-1}$ are marked. Bottom panel: the power spectrum $p_{nn}^{(s)}(\sigma')$, nonzero between $\sigma_C$ and $\sigma_C + \sigma_M$ on either side of $\sigma' = 0$.]
When these two dashed curves are closer together, having peaks at $\sigma_e \pm \sigma'$ with $\sigma' < \sigma_w$, then those wavenumbers where the dashed curves have significant overlap show where the product

\[ L(\sigma - \sigma')\, L(\sigma + \sigma') \]

is significantly different from zero. When these dashed curves are further apart, having peaks at $\sigma_e \pm \sigma'$ with $\sigma' > \sigma_w$, then there is no significant overlap and the product

\[ L(\sigma - \sigma')\, L(\sigma + \sigma') \]

is not significantly different from zero. Hence, the position of the dashed curves in Fig. 8.4(a) shows where this product drops to zero; any further apart and the

\[ L(\sigma - \sigma')\, L(\sigma + \sigma') \]

product cannot make any significant contribution to the integral in (8.30e). Notice, however, that when

\[ -\sigma_w \le \sigma' \le \sigma_w \]

so that the double $L$ product can contribute, then the plot of $p_{nn}^{(s)}(\sigma')$ in Fig. 8.4(b) shows that the value of $p_{nn}^{(s)}(\sigma')$ is itself zero. Hence the

\[ p_{nn}^{(s)}(\sigma')\, L(\sigma - \sigma')\, L(\sigma + \sigma') \]

product is zero for all $\sigma'$ values for the configuration shown in Figs. 8.4(a) and 8.4(b), because when the double $L$ product is non-negligible then $p_{nn}^{(s)}$ is zero, and when $p_{nn}^{(s)}$ is nonzero then the double $L$ product is negligible. We conclude that the integral in (8.30e) is very small or zero, which means that $T_1$ can be neglected in Eq. (8.30d). Hence, (8.30d) simplifies to

\[ J^{(s)}(\sigma) \cong \frac{\pi^{2} (A\,\Delta\Omega)^{2}}{8} \left(1\ \frac{\mathrm{amp\ sec}}{\mathrm{erg}}\right)^{2} \sigma_e^{2} u^{2} |H(\sigma_e)|^{2} \left\{ p_{nn}^{(s)} * L^{2} \right\}(\sigma). \tag{8.30f} \]

Consequently, the $\mathrm{NEdN}_{samp}$ formula for this sort of measurement can be written as, substituting (8.30f) into (8.29a),

\[ \mathrm{NEdN}_{samp}(\sigma) \cong \sqrt{2}\,\pi\, \frac{\sigma_e u\, |H(\sigma_e)|}{\sigma u\, |H(\sigma)|} \left[ \left\{ p_{nn}^{(s)} * L^{2} \right\}(\sigma) \right]^{1/2}. \tag{8.30g} \]

This approximate formula for the sampling-noise NEdN can be used whenever the $p_{nn}^{(s)}$ power spectrum straddles a strong emission line the way it does in Figs. 8.4(a) and 8.4(b).
The ten dotted lines in Fig. 8.5(a) plot ten spectral measurements of the Lorentz emission line using the simulated interferometer contaminated by this quasi-harmonic sampling-position noise, and the two split solid lines show the true spectral values. The continuous solid line in Fig. 8.5(a) is the $\mathrm{NEdN}_{samp}$ curve specified by the formulas in (8.29a), (8.30d), and (8.30e). The formula in (8.30g) shows that $\mathrm{NEdN}_{samp}$ is approximately proportional to the square root of the convolution of the squared emission-line radiance $L$ with the quasi-harmonic power spectrum $p_{nn}^{(s)}$ in Eq. (8.25b). According to the discussion at the end of Sec. 7.15 of Chapter 7, a similar convolution in the NEdN formula for the misalignment noise is also associated with ghost lines on either side of the Lorentz emission line, as can be seen by comparing Fig. 8.5(a) to Fig. 7.8(a) in Chapter 7. The resemblance is also present in Fig. 8.5(b), which gives an expanded view of the ghost-line region on the right-hand side of the emission line. Just as in Fig. 7.8(a) for the misalignment noise, the convolution predicts the presence of ghost lines on either side of the emission line, with the center of the ghost-line region offset by wavenumber intervals of

\[ \sigma_C + \frac{\sigma_M}{2} \]

from the wavenumber marking the peak of the emission line. Unlike the quasi-harmonic noise-power spectrum in Chapter 7, the noise-power spectrum used here has $\sigma_C = 30$ cm$^{-1}$ and $\sigma_M = 10$ cm$^{-1}$ so that

\[ \sigma_C + \frac{\sigma_M}{2} = 35\ \mathrm{cm}^{-1}. \tag{8.31} \]

This agrees with the ghost-line offsets seen in Figs. 8.5(a) and 8.5(b).
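
The 35 cm$^{-1}$ offset can be illustrated by carrying out the convolution of a squared Lorentz line with the quasi-harmonic power spectrum numerically. The Python sketch below is a rough check, not the author's simulation: the Lorentz half-width of 1 cm$^{-1}$ is a hypothetical value chosen for the example (the actual line shape is the one in Fig. 7.6 of Chapter 7), the line is normalized to unit peak, and the convolution is a simple Riemann sum on a 0.2 cm$^{-1}$ grid.

```python
SIGMA_E = 950.0                 # emission-line peak, cm^-1
HALF_WIDTH = 1.0                # hypothetical Lorentz half-width, cm^-1
P0 = 1.25e-13                   # upper level of the power spectrum, cm^3
SIGMA_C, SIGMA_M = 30.0, 10.0   # band edge and band width, cm^-1
STEPS_PER_UNIT = 5              # grid spacing of 0.2 cm^-1
STEP = 1.0 / STEPS_PER_UNIT


def lorentz_squared(sigma):
    """Square of a unit-peak Lorentz line centered on SIGMA_E."""
    x = (sigma - SIGMA_E) / HALF_WIDTH
    return 1.0 / (1.0 + x * x) ** 2


def power_spectrum(offset):
    """Quasi-harmonic spectrum: P0 inside the two bands, zero elsewhere."""
    return P0 if SIGMA_C <= abs(offset) <= SIGMA_C + SIGMA_M else 0.0


# Keep only the offsets where the power spectrum is nonzero.
kernel = []
for i in range(-50 * STEPS_PER_UNIT, 50 * STEPS_PER_UNIT + 1):
    offset = i / STEPS_PER_UNIT
    if power_spectrum(offset) > 0.0:
        kernel.append(offset)


def convolution(sigma):
    """Riemann-sum estimate of the convolution of p_nn with L^2 at sigma."""
    return P0 * STEP * sum(lorentz_squared(sigma - s) for s in kernel)


grid = [850.0 + i / STEPS_PER_UNIT for i in range(200 * STEPS_PER_UNIT + 1)]
conv = [convolution(s) for s in grid]

# Centroid of the convolution on the right-hand side of the emission line.
right = [(s, c) for s, c in zip(grid, conv) if s > SIGMA_E]
ghost_center = sum(s * c for s, c in right) / sum(c for _, c in right)
print(f"right-hand ghost region is centered near {ghost_center:.1f} cm^-1")
```

The convolution is negligible at the line center itself and concentrates in two regions on either side of it, which is the pattern of ghost lines described above.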
Figure 8.6 compares the standard deviations of the errors in the measured radiances to the $\mathrm{NEdN}_{samp}$ values predicted by the formulas in (8.29a), (8.30d), and (8.30e). It follows the same format as Fig. 8.2(b), and once again we see a good match between the calculated standard deviations represented by the crosses and the $\mathrm{NEdN}_{samp}$ predictions represented by the solid line. The only difference between the procedure used to generate Fig. 8.2(b) and the procedure used to generate Fig. 8.6 is that the standard deviations in Fig. 8.6 are calculated from 900, instead of from 300, noise-contaminated interferometer measurements.
FIGURE 8.5(a). [Plot of radiance (in mW/m$^{2}$/sr/cm$^{-1}$) versus $\sigma$ (in cm$^{-1}$) from 850 to 1050 cm$^{-1}$, showing the noise-free spectrum, the ten noise-contaminated measurements, and the $\mathrm{NEdN}_{samp}$ curve.]

FIGURE 8.5(b). [Expanded plot of radiance (in mW/m$^{2}$/sr/cm$^{-1}$) versus $\sigma$ (in cm$^{-1}$) from 950 to 1050 cm$^{-1}$, showing the noise-free spectrum, the ten noise-contaminated measurements, and the $\mathrm{NEdN}_{samp}$ curve.]

FIGURE 8.6. [Plot of radiance error (in mW/m$^{2}$/sr/cm$^{-1}$) versus $\sigma$ (in cm$^{-1}$) from 800 to 1100 cm$^{-1}$.]

Section 7.15 of Chapter 7 included background radiance in the simulated measurements of a Lorentz emission line contaminated by misalignment noise, and nothing stops us from doing the same thing here with the sampling noise. Following the same strategy as before [see discussion following Eq. (7.56a) in Chapter 7], we now change the fore-optics transmission to

\[ \tau_f(\sigma) = 0.5 \tag{8.32a} \]

rather than one [the value it has had up to now, see Eq. (8.24g)] so that the fore-optics background radiance $L_{FOV}^{(fore)}$ is no longer insignificant as it was in Eq. (8.24c). For this sort of setup, a first-order estimate for the effective fore-optics emissivity is

\[ 1 - \tau_f(\sigma) = 0.5, \]

and the effective temperature of the background radiance is taken to be 350 K. The measured emission line is the same one used before, having the spectral radiance shown in Fig. 7.6 of Chapter 7, and the sampling-position noise is the same as in Figs. 8.5(a) and 8.5(b); that is, it is the noise specified by the power spectrum in Fig. 8.1(a). Because significant amounts of background radiance are present, Eq. (8.30g) is no longer a good approximation for $\mathrm{NEdN}_{samp}$; we must instead return to Eqs. (8.22g) and (8.22h), remembering to allow for $L_{FOV}^{(fore)}$ no longer being zero and $\tau_f$ being 0.5. Since we still have

\[ \mathcal{R}(\sigma) = 1\ \frac{\mathrm{amp\ sec}}{\mathrm{erg}}, \quad \tau_a(\sigma) = \eta(\sigma) = 1, \quad M = 1.0, \quad W = 1, \]

and

\[ L^{(dir)}(\sigma) \cong L_{FOV}^{(back)}(\sigma) \cong L_{mnf}^{(back)}(\sigma) \cong 0 \]

from Eqs. (8.24f) through (8.24i) and (8.24c), Eq. (8.22g) now simplifies to

\[ \mathrm{NEdN}_{samp}(\sigma) = \frac{8\sqrt{J^{(s)}(\sigma)}}{A\,\Delta\Omega\,\sigma u\,|H(\sigma)| \left(1\ \frac{\mathrm{amp\ sec}}{\mathrm{erg}}\right)}; \tag{8.32b} \]

and we have, from Eq. (8.7f), that

\[ Z_{mnf}(\sigma) = \frac{A\,\Delta\Omega}{4} \left(1\ \frac{\mathrm{amp\ sec}}{\mathrm{erg}}\right) \left[ \frac{1}{2} L_{mnf}(\sigma) + L_{mnf}^{(fore)}(\sigma) \right] \tag{8.32c} \]

in the simplified formula from (8.22h) used for the $J^{(s)}$ calculation:

( )
( )
( ) 2 ( ) ( )
2
( )
( ) ( ) ( ) H ( ) ( )
( ) H ( ) ( )
s s i
nn mnf
i
mnf
J e u
e u d

=
+ + +
Z
Z

.
p
(8.32d)

Again, according to the discussion after Eq. (7.56b) in Chapter 7, we can neglect the difference between the $L$, $L_{FOV}$, and $L_{mnf}$ spectral radiance functions; for the same reasons, we can also neglect the difference between the $L^{(fore)}$, $L_{FOV}^{(fore)}$, and $L_{mnf}^{(fore)}$ background radiance spectra.
The dotted lines in Figs. 8.7(a) and 8.7(b) show ten spectral measurements of the Lorentz emission line contaminated by sampling noise when the 350-K background radiance is present, and Fig. 8.7(c) is a close-up of the right-hand side of the same set of curves. The continuous solid lines in Figs. 8.7(a) through 8.7(c) show the $\mathrm{NEdN}_{samp}$ values predicted by Eqs. (8.32b) through (8.32d), and the split solid lines give the true $L(\sigma)$ spectral radiance. Comparing Figs. 8.7(a) and 8.7(c) to the plots in Figs. 8.5(a) and 8.5(b) without the background radiance, we see that the background radiance prevents the measurement error from dropping to zero outside the regions where the ghost lines occur. Figure 8.7(b), which is a somewhat expanded version of Fig. 8.7(a), makes it easy to see that when the presence of the ghost lines is disregarded, the NEdN from the Planck black-body radiance drops to zero near $\sigma = 940$ cm$^{-1}$. This is the same sort of behavior seen before in Figs. 8.2(a) and 8.2(b), with the dip now occurring at a smaller wavenumber (940 cm$^{-1}$ instead of 1030 cm$^{-1}$) because the background radiance curve is at a lower temperature, 350 K, instead of the 400 K of Figs. 8.2(a) and 8.2(b). This dip can be seen even more plainly in Fig. 8.8. Just as in Fig. 8.6, the crosses plot standard deviations of the radiance errors, calculated from 900 measurements contaminated by sampling-position noise obeying the power spectrum in Fig. 8.1(a). There is again a good match between the standard deviations and the solid curve showing the $\mathrm{NEdN}_{samp}$ values predicted by Eqs. (8.32b) through (8.32d), and again the crosses do not go down as far as the NEdN curve in the region of the dip.
8.10 Error from Quasi-Static Sampling Noise
When the power spectrum
( )
( )
s
nn

p is proportional to a delta function, so that

( )
0
( ) ( )
s
nn
=

p o (8.33a)

for some positive and constant
0
o value, the formula for
( ) s
J in Eq. (8.22h) reduces to

FIGURE 8.7(a). [Plot of radiance (in mW/m$^{2}$/sr/cm$^{-1}$) versus $\sigma$ (in cm$^{-1}$) from 850 to 1050 cm$^{-1}$, showing the noise-free spectrum, the ten noise-contaminated measurements, and the $\mathrm{NEdN}_{samp}$ curve.]

FIGURE 8.7(b). [Same curves as Fig. 8.7(a) with the radiance axis expanded, from -0.4 to 1.0 mW/m$^{2}$/sr/cm$^{-1}$.]

FIGURE 8.7(c). [Close-up of the right-hand side of the same curves, with $\sigma$ from 950 to 1050 cm$^{-1}$.]

FIGURE 8.8. [Plot of radiance error (in mW/m$^{2}$/sr/cm$^{-1}$) versus $\sigma$ (in cm$^{-1}$) from 800 to 1100 cm$^{-1}$.]

( ) ( ) ( ) ( )
( )
2
2 ( ) ( )
0 ma ma
( )
H M ( ) H M ( )
s
i i
mnf mnf
J
e u R e u R

=
Z Z

. o
(8.33b)

Substituting from (8.12f), we find that

( ) ( ) ( ) ( )
2
( ) 2
0 ma ma
( ) H M ( ) H M ( ) 0
s
mnf mnf
J u R u R = = Z Z o , (8.33c)

which shows that, according to Eq. (8.22g),

\[ \mathrm{NEdN}_{samp} \cong 0 \tag{8.33d} \]

when the nonzero noise-power spectrum for the sampling-position noise is a delta function.
This odd result is an artifact of the approximations made in deriving the NEdN
samp
formulas,
and we can show this by getting the same result using another line of reasoning. Equation (8.33a)
specifies a noise-power spectrum concentrated at 0 o . Substituting Eq. (8.33a) into (8.4b)
gives

( )
0
( )
s
nn

o o .

From the definition of the autocorrelation function in Eq. (8.3b), we see that an autocorrelation
function can have the same nonzero o
0
value at all OPD values only when the random sampling
error $n^{(s)}$ is the same at all OPD values $\chi$. We interpret this to mean that all the samples of the interferogram signal are shifted by the same random value from their expected positions during a single sweep of the interferometer's moving mirror. Later, after many new spectral measurements and many more sweeps of the moving mirror, we find that the shift in the sample positions has changed to another random value. We can think of this as quasi-static sampling noise; although effectively constant during each sweep, the sampling shift can gradually change over many sweeps to a new random value. Suppose r is the random shift in the OPD value for every sample of a spectral measurement's interferogram, which means the random function $n^{(s)}(\chi)$ defined in Sec. 8.2 above is now

\[ n^{(s)}(\chi) = r. \tag{8.34a} \]

If we take a very large number of spectral measurements, there is no way to tell ahead of time what r will be for any particular sweep of the moving mirror, but whatever r happens to be at the beginning of the sweep, it has the same value at the end of the sweep. This is why it makes sense to use (8.34a) to specify the sampling noise $n^{(s)}$ as a stationary but nonergodic function [function $n^{(s)}$ is identical to the stationary but nonergodic random function discussed following Eq. (3.47a) in Chapter 3]. Substituting (8.34a) into (8.8e) gives, using the linearity of the Fourier transform from Sec. 2.6 of Chapter 2,

( ) ( )
( ) ( ) ( )
( ) ( , ) ( , )
s i i
D
r D r D
o o
o

H H n F F ,

which becomes, using formula (8.7b),

\[ n_D^{(s)}(\sigma) = 2rD\, \mathrm{sinc}(2\pi\sigma D). \tag{8.34b} \]
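
The closed form in Eq. (8.34b) is the finite Fourier transform of a constant. As a quick numerical check, the Python sketch below integrates a constant shift r over the OPD range from $-D$ to $D$ and compares the result with $2rD\,\mathrm{sinc}(2\pi\sigma D)$. It assumes that sinc(x) means sin(x)/x and that the transform kernel is $e^{-2\pi i \sigma\chi}$; for a constant on a symmetric interval the sign of the exponent does not affect the answer. The function names and the test values of r and $\sigma$ are hypothetical.

```python
import cmath
import math


def finite_transform_of_constant(r, d, sigma, n=20000):
    """Midpoint-rule estimate of the integral of r*exp(-2*pi*i*sigma*chi)
    over chi from -d to d."""
    step = 2.0 * d / n
    total = 0j
    for j in range(n):
        chi = -d + (j + 0.5) * step
        total += r * cmath.exp(-2j * math.pi * sigma * chi)
    return total * step


def closed_form(r, d, sigma):
    """2 r D sinc(2 pi sigma D), with sinc(x) = sin(x)/x and sinc(0) = 1."""
    x = 2.0 * math.pi * sigma * d
    return 2.0 * r * d * (math.sin(x) / x if x != 0.0 else 1.0)


r_shift = 1.0e-6    # example sampling shift, cm
d_max = 1.28        # maximum OPD, cm
sigma_test = 0.3    # example wavenumber, cm^-1

numeric = finite_transform_of_constant(r_shift, d_max, sigma_test)
exact = closed_form(r_shift, d_max, sigma_test)
print(f"numeric = {numeric.real:.6e}, closed form = {exact:.6e}")
```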

The error oL
in Eq. (8.12k) is now

[ ] [ ] ( )
( )
ma
ma
R
4 Re 2 sinc(2 ) 2 H( ) M( ) ( )
( ) H( ) M( ) ( ) ( ) ( ) ( )
i
FOV
a f
r e D D i u R
WA u R
o
ro r o o o o
o
o o o q o t o t o
AO
Z
L

.

Just like before [see discussion following Eq. (8.18a)], we note that the product

[ ]
ma
2 H( ) M( ) i u R r o o o

varies slowly with wavenumber compared to the sinc function, setting up the approximation

[ ] ( )
( )
ma
ma
R
4 Re 2 H( ) M( ){ 2 sinc(2 ) ( )}
( ) H( ) M( ) ( ) ( ) ( ) ( )
i
FOV
a f
r i e u R D D
WA u R
o
r o o o ro o
o
o o o q o t o t o
e
AO
Z
L

based on Eq. (5C.1) in Appendix 5C of Chapter 5. The formula for H in Eq. (8.12f) shows that

[ ] ( )
ma
ma
R
4 Re 2 H( ) M( ) 2 sinc(2 ) ( )
( ) H( ) M( ) ( ) ( ) ( ) ( )
FOV
a f
r i u R D D
WA u R
r o o o ro o
o
o o o q o t o t o
e
AO
Z
L

. (8.34c)

Functions $M$, $|H|$, and $Z_{FOV}$ inside Re( ) on the right-hand side are strictly real. Formula (8.34c) is based on the standard approximations used in this chapter; nothing extra has been added. Consequently, when we rely on these approximations, the error $\delta L$ ends up proportional to the real part of a strictly imaginary quantity; that is, it ends up being zero. Hence, we have confirmed that the standard approximations used so far in this chapter end up predicting zero sampling noise when the sampling-position noise is quasi-static with a delta function for its power spectrum.
The best way to interpret the results in (8.33d) and (8.34c) is to regard them as predicting that for this case the sampling error in the radiance measurement is going to be small instead of completely nonexistent. There is already a strong hint in Sec. 8.8 that there are times when these approximations break down: we remember that in Fig. 8.2(b) the exact sampling error marked by the crosses does not follow the solid curve all the way down to zero at $\sigma \cong 1030.5$ cm$^{-1}$. The approximation used when taking the slowly varying $H$ and $M$ functions outside the convolution with $[2D\,\mathrm{sinc}(2\pi\sigma D)]$ is actually rather good. These functions are also under our control when designing the instrument; they can be made effectively constant over the band of wavenumbers being measured, turning the approximation used to remove them from the convolution into an exact equality. Consequently, if a more accurate formula for $\mathrm{NEdN}_{samp}$ is desired, it is better to rethink the approximation specified in Eq. (8.2e) above. When the error from the linear approximation

( )
( )
( ) ( ) ( ) ( )
( ) ( ) ( )
j
tot
tot s tot s C
C j j C j j
dz
z n z n
d

+ e +

disappears, it is clearly time to consider what happens when the quadratic approximation is used:

( )
( ) ( )
( ) 2 ( )
( ) ( ) ( ) 2
2
( )
1
( ) ( ) ( )
2
j j
tot s
C j j
tot tot
tot s s C C
C j j j
z n
dz d z
z n n
d d

+ e
+ +

.
(8.35)

Including the effect of the third term on the right-hand side of (8.35), the quadratic term in the
Taylor series for $z_C^{(tot)}$ evaluated at the perturbed sample position, would stop the solid curve in
Fig. 8.2(b) from dipping down so close to zero, and would also prevent the noise formulas from
producing a strictly zero value for $NEdN_{samp}$ when the sampling-position noise is quasi-static and
obeys the delta-function power spectrum in (8.33a).
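
The role of the quadratic term can be checked numerically. The following is a minimal sketch, not part of the derivation: it assumes a toy monochromatic interferogram cos(2*pi*sigma*x), and the function names and numerical values (a wavenumber of 1000 cm^-1 and a 10 nm sampling offset) are illustrative choices rather than quantities taken from the text. It compares the exact change in the sampled signal with its linear and quadratic Taylor estimates at a point where the derivative vanishes.

```python
import math

def signal(x, sigma):
    """Toy monochromatic interferogram: cos(2*pi*sigma*x), x in cm, sigma in cm^-1."""
    return math.cos(2.0 * math.pi * sigma * x)

def first_derivative(x, sigma):
    return -2.0 * math.pi * sigma * math.sin(2.0 * math.pi * sigma * x)

def second_derivative(x, sigma):
    return -((2.0 * math.pi * sigma) ** 2) * math.cos(2.0 * math.pi * sigma * x)

def sampling_errors(x, r, sigma):
    """Return (exact, linear, quadratic) estimates of signal(x + r) - signal(x)."""
    exact = signal(x + r, sigma) - signal(x, sigma)
    linear = first_derivative(x, sigma) * r
    quadratic = linear + 0.5 * second_derivative(x, sigma) * r * r
    return exact, linear, quadratic

# At x = 0 the first derivative is zero, so the linear estimate predicts no error at all,
# while the quadratic estimate tracks the small but nonzero exact error.
exact, linear, quadratic = sampling_errors(x=0.0, r=1.0e-6, sigma=1000.0)
print(exact, linear, quadratic)
```

In this toy case the linear estimate is identically zero while the exact error is not, which is the same behavior the text attributes to the solid curve in Fig. 8.2(b).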
Retaining both the linear and quadratic terms in (8.35) is, according to Eq. (8.34a), the same
as retaining $O(r)$ and $O(r^2)$ terms everywhere they occur in the noise equations. Postponing for
a while the expansion of the signal error in powers of r, we use the exact formula for the
noise-contaminated signal $z_{CN}^{(tot)}$, writing that

( ) ( )
( ) ( )
tot tot
CN C
z z r + (8.36a)

rather than using the approximation in Eq. (8.2g) above. Our strategy is to repeat the same
procedure used before to derive $NEdN_{samp}$, taking advantage of the way the sampling error is now
a random constant r instead of a random function $n^{(s)}$. Having already set up Eq. (8.36a) to
replace (8.2g) at the end of Sec. 8.2, we skip past the next section (because there is no reason to
repeat the explanation of the sampling-noise autocorrelation function and power spectrum) and
move on to Sec. 8.4. The formula corresponding to Eq. (8.6b) is

( ) ( )
( , ) ( ) ( , ) ( )
tot tot
CN C
D z D z r H H + , (8.36b)

which means that instead of Eq. (8.6c) we have

( )
( ) ( )
,
( ) ( , ) ( )
i tot
eff totN C
D z r
o
o
H + Z
F

representing the uncalibrated spectral signal contaminated by sampling noise. Applying Eq.
(2.39j) of Chapter 2 (the Fourier convolution theorem) gives

( ) ( )
( ) ( ) ( )
,
( ) ( , ) ( )
i i tot
eff totN C
D z r
o o
o

H + Z
F F ,

which becomes, substituting from Eq. (8.7b),

( )
( ) ( )
,
( ) [2 sinc(2 )] ( )
i tot
eff totN C
D D z r
o
o ro

+ Z
F .

The Fourier shift theorem [see Eq. (2.36i) in Chapter 2] gives

( )
2 ( ) ( )
,
( ) [2 sinc(2 )] ( )
i r i tot
eff totN C
D D e z
r o o
o ro

Z

F ,

which can be written as, since the small value of r makes $e^{2\pi i\sigma r}$ a slowly varying function of $\sigma$
compared to the sinc function [see Eq. (5C.1) in Appendix 5C of Chapter 5],

( )
2 ( ) ( )
,
( ) [2 sinc(2 )] ( )
i r i tot
eff totN C
e D D z
r o o
o ro

e Z

F .

Using Eq. (8.7b) to replace [2 sinc(2 )] D D ro by ( )
( )
( , )
i
D
o
H F , we get

( ) ( )
2 ( ) ( ) ( )
,
( ) ( , ) ( )
i r i i tot
eff totN C
e D z
r o o o
o

e H Z

F F ,

which can be written as, according to Eq. (2.39j) in Chapter 2,

( )
2 ( ) ( )
,
( ) ( , ) ( )
i r i tot
eff totN C
e D z
r o o
o
e H Z

F .

This becomes, applying the formula in (8.7e) above,

2
, ma
( ) H( ) M( ) ( )
i r
eff totN mnf
e u R
r o
o o o o e Z Z
. (8.36c)

The alert reader will notice that the error in Eq. (8.36c) can now be entirely eliminated by taking
the magnitude of the complex spectral signal contaminated by this particular type of sampling
noise:


2
, ma ma
ma
( ) H( ) M( ) ( ) H( ) M( ) ( )
H( ) M( ) ( )
i r
eff totN mnf mnf
mnf
e u R u R
u R
r o
o o o o o o o
o o o

Z Z Z
Z
.
(8.36d)

Here, the last step acknowledges that only $H(u\sigma)$ is a complex-valued function of $\sigma$.
Unfortunately, leaving aside this special case, in general taking the magnitude of the complex
spectral signal increases the amount of noise present. When, for example, the signal is
contaminated by detector noise, taking the magnitude of the complex spectral signal puts both the
avoidable and unavoidable detector-noise components into the spectral measurement.
110

Consequently the signal-processing algorithms of Fourier-transform spectrometers usually avoid
taking the magnitude of the complex, noise-contaminated spectral signal and instead use
calibration algorithms like the one described in Sec. 5.19 of Chapter 5 (we have, in fact, already
applied this algorithm to standard sampling noise in Sec. 8.5 above). Although we know that our
analysis here is for the special case of sampling-position noise characterized by a delta-function
power spectrum, a real spectroscopist cannot know this ahead of time and so would process his
Fourier-transform data as though other types of noise (for example, detector noise) dominate
his noise budget. Hence we should now investigate what happens to sampling-position noise
characterized by a delta-function power spectrum when it is processed this way, that is,
processed as though it is detector noise. The first step, then, is to approximate $e^{2\pi i\sigma r}$ in such a
way as to convert it into an additive noise.
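
The special case described above, where taking the magnitude removes the error entirely, is easy to illustrate. The sketch below is only an illustration under stated assumptions: the complex value `clean`, the wavenumber, and the offset r are hypothetical numbers, and `contaminate` is a helper name introduced here, not a function from the text. It multiplies a complex spectral value by the phase factor exp(2*pi*i*sigma*r) and compares the error that survives in the magnitude with the error in the complex value itself.

```python
import cmath
import math

def contaminate(clean_value, sigma, r):
    """Multiply a complex spectral value by the phase factor exp(2*pi*i*sigma*r)."""
    return cmath.exp(2j * math.pi * sigma * r) * clean_value

clean = 3.0 * cmath.exp(0.4j)          # hypothetical complex spectral value
noisy = contaminate(clean, sigma=1000.0, r=2.0e-6)

magnitude_error = abs(noisy) - abs(clean)   # error left after taking the magnitude
complex_error = abs(noisy - clean)          # error if the complex value is used directly
print(magnitude_error, complex_error)
```

The magnitude error vanishes to rounding precision because the phase factor has unit modulus, while the complex value itself is visibly perturbed; this is why the additive-noise treatment that follows is needed when the magnitude is not taken.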
We decide to take advantage of the smallness of r, expanding $e^{2\pi i\sigma r}$ into a power series
while remembering to retain, as promised in the discussion immediately preceding Eq. (8.36a)
above, both the $O(r)$ and $O(r^2)$ terms,

$$e^{2\pi i\sigma r} = \cos(2\pi\sigma r) + i\sin(2\pi\sigma r) \cong 1 - \tfrac{1}{2}(2\pi\sigma r)^2 + i(2\pi\sigma r) = 1 + 2\pi i\sigma r - 2\pi^2\sigma^2 r^2 . \tag{8.37a}$$
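
As a quick check on the size of what is discarded in this expansion, the following sketch compares the exact phase factor with the second-order form 1 + 2*pi*i*sigma*r - 2*pi^2*sigma^2*r^2. The function names and the sample values of sigma and r are assumptions made for illustration only.

```python
import cmath
import math

def phase_factor_exact(sigma, r):
    """exp(2*pi*i*sigma*r) evaluated without approximation."""
    return cmath.exp(2j * math.pi * sigma * r)

def phase_factor_second_order(sigma, r):
    """1 + 2*pi*i*sigma*r - 2*pi^2*sigma^2*r^2, keeping the O(r) and O(r^2) terms."""
    real_part = 1.0 - 2.0 * math.pi**2 * sigma**2 * r**2
    imag_part = 2.0 * math.pi * sigma * r
    return complex(real_part, imag_part)

def truncation_error(sigma, r):
    """Magnitude of the difference between the exact and second-order forms."""
    return abs(phase_factor_exact(sigma, r) - phase_factor_second_order(sigma, r))

# The leading neglected term is cubic in r, so doubling r raises the error about eightfold.
print(truncation_error(1000.0, 1.0e-6), truncation_error(1000.0, 2.0e-6))
```

The cubic scaling of the residual is what justifies stopping at second order when r is small compared with the wavelength.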
When put back into (8.36c), this gives

,
ma ma
2 2 2
ma
( )
H( ) M( ) ( ) (2 )H( ) M( ) ( )
(2 )H( ) M( ) ( )
eff totN
mnf mnf
mnf
u R i r u R
r u R
o
o o o r o o o o
r o o o o
e
+
Z
Z Z
Z

(8.37b)

for the uncalibrated spectral signal contaminated by delta-function sampling-position noise. We
expect that the noise in any spectral signal can be removed by averaging together many

110
The discussion following Eq. (6.35d) in Chapter 6 explains the difference between the avoidable and unavoidable
detector-noise in a spectral measurement.
independent measurements of the same spectrum (that is, by taking its expectation value), so we
apply the expectation operator E to both sides of (8.37b) to get

( )
,
ma ma
2 2 2
ma
( )
H( ) M( ) ( ) 2 H( ) M( ) ( ) ( )
2 H( ) M( ) ( ) ( )
eff totN
mnf mnf
mnf
u R i u R r
u R r
+

Z
Z Z
Z

.
E
E
E

Here, we have once again applied Eqs. (3.9f) and (3.16a) from Chapter 3 to simplify the formula
by distributing operator E over the expression for the uncalibrated spectral signal. Substitution of
(8.34a) into (8.3a) shows that the random parameter r is zero-mean,

$$\mathrm{E}(r) = 0 . \tag{8.37c}$$

This also makes good intuitive sense because we expect the sampling offset r to be equally
likely to take on a positive or a negative value for any given sweep of the interferometer's
moving mirror. Hence, the expectation value of the uncalibrated spectral signal can be written as

( )
,
2 2 2
ma ma
( )
H( ) M( ) ( ) 2 H( ) M( ) ( )
eff totN
mnf rms mnf
u R r u R
Z
Z Z

E

or

( )
2 2 2
, ma
( ) (1 2 ) H( ) M( ) ( )
eff totN rms mnf
r u R Z Z
E , (8.37d)

where we define

$$r_{rms} = \sqrt{\mathrm{E}(r^2)} . \tag{8.37e}$$

Since r is taken to be a small random error in the sampling position, the factor
$(1 - 2\pi^2\sigma^2 r_{rms}^2)$ in Eq. (8.37d) is always positive with $2\pi^2\sigma^2 r_{rms}^2 \ll 1$.
Since $\mathrm{E}(r) = 0$, we see that

$$r_{rms}^2 = \mathrm{E}(r^2) = \mathrm{E}\big([r - \mathrm{E}(r)]^2\big) . \tag{8.37f}$$

Hence $r_{rms}$, being the square root of the variance $\mathrm{E}([r - \mathrm{E}(r)]^2)$, is the standard deviation of r.
[See Eqs. (3.5c) and (3.8e) of Chapter 3 for definitions of the standard deviation and variance.]
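
The factor $(1 - 2\pi^2\sigma^2 r_{rms}^2)$ that appears in the averaged signal can be illustrated with a small Monte Carlo experiment. This is a sketch under assumptions that go beyond the text at this point: r is drawn from a zero-mean Gaussian (the text so far only requires zero mean), and the seed, number of sweeps, and numerical values are arbitrary illustrative choices. It averages the phase factor exp(2*pi*i*sigma*r) over many simulated sweeps and compares the result with the second-order prediction.

```python
import cmath
import math
import random

def mean_phase_factor(sigma, r_rms, n_sweeps, seed):
    """Average exp(2*pi*i*sigma*r) over n_sweeps draws of a zero-mean Gaussian offset r."""
    rng = random.Random(seed)
    total = 0j
    for _ in range(n_sweeps):
        r = rng.gauss(0.0, r_rms)
        total += cmath.exp(2j * math.pi * sigma * r)
    return total / n_sweeps

def predicted_mean(sigma, r_rms):
    """Second-order prediction 1 - 2*pi^2*sigma^2*r_rms^2 for the averaged factor."""
    return 1.0 - 2.0 * math.pi**2 * sigma**2 * r_rms**2

average = mean_phase_factor(sigma=1000.0, r_rms=1.0e-5, n_sweeps=20000, seed=7)
print(average, predicted_mean(1000.0, 1.0e-5))
```

The imaginary part averages toward zero because r is zero-mean, and the real part settles slightly below one, in line with the remark that the factor is positive and close to unity.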
The uncalibrated, noise-contaminated spectral signal in (8.37b) can be written as, after both
adding and subtracting

2 2 2
ma
2 H( ) M( ) ( )
rms mnf
r u R Z
from the formula,

,
2 2 2
ma
2 2 2 2
ma
ma
( )
H( ) M( ) ( ) (1 2 )
2 ( ) H( ) M( ) ( )
(2 )H( ) M( ) ( )
eff totN
mnf rms
rms mnf
mnf
u R r
r r u R
i r u R

+
+
Z
Z
Z
Z

.
(8.38a)

We now define a new random variable

(2) 2 2
rms
r r = . (8.38b)

Taking the expectation value of both sides of (8.38b) gives, again applying Eqs. (3.9f) and
(3.16a) from Chapter 3,

(2) 2 2
( ) ( )
rms
r r = E E

which becomes, substituting from Eq. (8.37f),

(2)
( ) 0 = E . (8.38c)

We can also define a new function

2 2
M( ) 1 2 x x = . (8.38d)

Substituting (8.38b) and (8.38d) into (8.38a) gives

, ma
2 2 (2)
ma
ma
( ) H( ) M( ) M( ) ( )
2 H( ) M( ) ( )
(2 )H( ) M( ) (
eff totN rms mnf
mnf
mnf
u r R
u R
i r u R

+
+
Z Z
Z
Z

)

or

, ma
(2)
ma
( ) H( ) M( ) M( ) ( )
( ) 2 H( ) M( ) ( )
eff totN rms mnf
mnf
u r R
r i i u R

+

Z Z
Z
.
(8.38e)
According to the discussion following Eq. (8.37e), we can count on $M(r_{rms}\sigma)$ defined in (8.38d)
always being a positive quantity slightly less than one. Substituting Eq. (8.38d) into (8.37d) gives

( )
, ma
( ) H( ) M( )M( ) ( )
eff totN rms mnf
u r R Z Z
E . (8.38f)

It is now time to apply the calibration algorithm using the same procedure as in Sec. 8.5
above. We note that Eq. (8.38e) corresponds to Eq. (8.9a) in Sec. 8.4, only now the leading
nonrandom term is

ma
H( ) M( ) M( ) ( )
rms mnf
u r R

Z

instead of

ma
H( ) M( ) ( )
mnf
u R

Z ,

and the small random error is

(2)
ma
( ) [2 H( ) M( ) ( )]
mnf
r i i u R Z

instead of

[ ]
( )
ma
( ) 2 H( ) M( ) ( )
s
D FOV
i u R n Z .

These observations can be written symbolically as

ma ma
H( ) M( ) ( ) H( ) M( ) M( ) ( )
mnf rms mnf
u R u r R

Z Z (8.39a)

for the large nonrandom term and

[ ]
( )
ma
(2)
ma
( ) 2 H( ) M( ) ( )
( ) 2 H( ) M( ) ( )
s
D FOV
mnf
i u R
r i i u R

n Z
Z

(8.39b)

for the small random term. Hence, the formula for
( )
,
meas
eff totN
Z
, the uncalibrated and noise-

contaminated signal spectrum, is now, applying (8.39a) and (8.39b) to Eq. (8.11e),


( )
, ma
(2)
ma
( ) H( ) M( ) M( ) ( )
( ) 2 H( ) M( ) ( )
meas
eff totN rms mnf
mnf
u r R
r i i u R

+

Z Z
Z
.
(8.39c)

When an interferometer contaminated by delta-function sampling-position noise (that is,
quasi-static sampling-position noise obeying a power spectrum like the one in (8.33a)) observes the
calibration radiance $\mathcal{L}^{(1)}$, we note that the rule in (8.39a) becomes

(1) (1)
ma ma
H( ) M( ) ( ) H( ) M( ) M( ) ( )
mnf rms mnf
u R u r R

Z Z (8.39d)

where the superscript (1) is added to show that Eq. (8.10d) specifying $\mathcal{Z}_{mnf}^{(1)}$ is now the proper
formula for $\mathcal{Z}_{mnf}$ (because $\mathcal{L}^{(1)}$ is now the input radiance). Similarly, when the interferometer
observes the $\mathcal{L}^{(2)}$ calibration radiance, we have

(2) (2)
ma ma
H( ) M( ) ( ) H( ) M( ) M( ) ( )
mnf rms mnf
u R u r R

Z Z , (8.39e)

where now Eq. (8.10f) specifying $\mathcal{Z}_{mnf}^{(2)}$ is the proper formula for the $\mathcal{Z}_{mnf}$ function. Applying
(8.39d) and (8.39e) to Eqs. (8.11c) and (8.11d) respectively gives

(1) (1)
, ma
( ) H( ) M( ) M( ) ( )
eff tot rms mnf
u r R Z Z (8.39f)
and

(2) (2)
, ma
( ) H( ) M( ) M( ) ( )
eff tot rms mnf
u r R Z Z . (8.39g)

The formula corresponding to Eq. (8.12b) above is, again applying (8.39d) and (8.39e),

(2) (1) (2) (1)
(2) (1) (2) (1)
, , ma
( ) ( ) ( ) ( )
( ) ( ) H( ) M( ) M( )[ ( ) ( )]
eff tot eff tot rms mnf mnf
u r R

L L L L
Z Z Z Z
,

which becomes, after substituting from (8.10d) and (8.10f),

(2) (1)
(2) (1)
, ,
1
R
( ) ( )
( ) ( )
H( ) M( ) M( ) ( ) ( ) ( ) ( )
4
eff tot eff tot
rms rms a f
WA
u r R

=

L L
Z Z
.
(8.39h)

This is the formula corresponding to Eq. (8.12c) in Sec. 8.5 above. To construct the formula
corresponding to Eq. (8.12d), we subtract (8.39f) from (8.39c) to get

( ) (1) (1)
, , ma
(2)
ma
( ) ( ) H( ) M( ) M( )[ ( ) ( )]
( ) [2 H( ) M( ) ( )] ,
meas
eff totN eff tot rms mnf mnf
mnf
u r R
r i i u R

+
Z Z Z Z
Z

which becomes, after substituting from (8.7f) and (8.10d),

( ) (1)
, ,
(1)
ma
(2)
ma
R
( ) ( )
H( ) M( ) M( ) ( ) ( ) ( ) ( )[ ( ) ( )]
4
( ) [2 H( ) M( ) ( )]
meas
eff totN eff tot
rms a f mnf
mnf
WA
u r R
r i i u R

+
Z Z
L L
Z

.
(8.39i)

Equations (8.39h) and (8.39i) can now be substituted into the fundamental calibration formula
(8.12a) to get

(2)
R
Measured Radiance
8 [ ] ( )
( )
( ) M( ) ( ) ( ) ( ) ( )
mnf
mnf
rms a f
ir
WA r

+
= +
Z
L

.

Substitution from Eq. (8.7f) gives

( ) (back)
(2)
Measured Radiance
( ) ( )
2 [ ]
( ) ( )
M( ) ( )
fore
mnf mnf
mnf mnf
rms f
ir
r

+
= + +

L L
L L

.
(8.39j)

As always, the true error in the measured radiance is the real part of the complex error terms that
are present [see, for example, the discussion following Eq. (7.21e) in Chapter 7 or Eq. (6.35d) in
Chapter 6]. Hence the error $\delta\mathcal{L}$ is the real part of the second term on the right-hand side:

( ) (back)
2 2 (2)
( ) ( )
2
( )
M( ) ( )
fore
mnf mnf
mnf
rms f
r

= +

L L
L L
. (8.39k)

When the interferometer's background radiances $\mathcal{L}_{mnf}^{(fore)}$ and $\mathcal{L}_{mnf}^{(back)}$ are negligible, the $\delta\mathcal{L}$ error
has (except for the slowly varying $\sigma^2 / M(r_{rms}\sigma)$ factor) the same shape as the $\mathcal{L}_{mnf}$ radiance
being measured. This is good news if all that is needed is the shape of the input $\mathcal{L}_{mnf}$ radiance
(maybe we just want the position of absorption or emission lines in an unknown spectrum)
because the change from measurement to measurement in the zero-mean random variable defined
in (8.38b) acts like a small random change in the zero level of the radiance curve. It is, however,
disturbing news if the $\mathcal{L}_{mnf}$ spectrum must be radiometrically accurate, because the shape of the
measured spectrum offers little evidence of the sampling error in the measurement.
When the interferometer is contaminated by quasi-static or delta-function sampling-position
noise, the expected value for $\delta\mathcal{L}$ is, applying the expectation operator to both sides of Eq. (8.39k),

( ) (back)
2 2
(2)
( ) ( )
2
( ) ( ) ( )
M( ) ( )
fore
mnf mnf
mnf
rms f
r

= +

L L
L L
E E .

According to (8.38c) we can now conclude that

$$\mathrm{E}(\delta\mathcal{L}) = 0 ,$$

confirming that $\delta\mathcal{L}$ is still, just like in Eq. (8.14f) above, a zero-mean random quantity. Hence,
its variance is, applying formula (8.15a) to (8.39k),

2
( ) (back)
2 2
2 (2) 2
( ) ( )
2
( ) ( ) [ ]
M( ) ( )
fore
mnf mnf
mnf
rms f
r

= +

L L
L L
E E .

The linearity of operator E with respect to random quantities (see Sec. 3.10 of Chapter 3) lets us
write

2
( ) (back)
2 2
2 (2) 2
( ) ( )
2
( ) ( ) ([ ] )
M( ) ( )
fore
mnf mnf
mnf
rms f
r

= +

L L
L L
E E .

This becomes, after substituting from (8.38b) and (8.38d),

2
( ) (back)
2 2
2 2 2 2
2 2 2
( ) ( )
2
( ) ( ) ([ ] )
1 2 ( )
fore
mnf mnf
mnf rms
rms f
r r
r

= +

L L
L L
E E .

Again it is important to remember that, according to the discussion following Eq. (8.37e), the
factor $(1 - 2\pi^2 r_{rms}^2\sigma^2)$ is always a positive number slightly less than one. The standard deviation
of $\delta\mathcal{L}$ is the square root of its variance [see Eq. (3.5c) in Chapter 3]; and, as explained in Sec.
6.1 of Chapter 6, the NEdN of a spectral measurement is the standard deviation of its random
error. Hence, the formula for the NEdN of delta-function sampling-position noise is

$$NEdN_{samp}^{delta} \cong \frac{2\pi^2\sigma^2\sqrt{\mathrm{E}([r^2 - r_{rms}^2]^2)}}{1 - 2\pi^2\sigma^2 r_{rms}^2}\left|\mathcal{L}_{mnf}(\sigma) + \frac{\mathcal{L}_{mnf}^{(fore)}(\sigma) - \mathcal{L}_{mnf}^{(back)}(\sigma)}{\tau_f(\sigma)}\right| . \tag{8.40a}$$

Using the linearity of the expectation operator E with respect to random quantities [see Eq. (3.9f)
and Sec. 3.10 of Chapter 3] and then substituting from Eq. (8.37f), we note that

$$\mathrm{E}([r^2 - r_{rms}^2]^2) = \mathrm{E}(r^4 - 2r^2 r_{rms}^2 + r_{rms}^4) = \mathrm{E}(r^4) - 2r_{rms}^2\,\mathrm{E}(r^2) + r_{rms}^4 = \mathrm{E}(r^4) - r_{rms}^4 .$$

The $NEdN_{samp}^{delta}$ formula can now be written as

$$NEdN_{samp}^{delta} \cong \frac{2\pi^2\sigma^2\sqrt{\mathrm{E}(r^4) - r_{rms}^4}}{1 - 2\pi^2\sigma^2 r_{rms}^2}\left|\mathcal{L}_{mnf}(\sigma) + \frac{\mathcal{L}_{mnf}^{(fore)}(\sigma) - \mathcal{L}_{mnf}^{(back)}(\sigma)}{\tau_f(\sigma)}\right| . \tag{8.40b}$$

We already know, according to Eq. (8.37c), that r has a zero-mean probability density
distribution. If we also assume that this is a zero-mean normal distribution, then Eq. (7A.5d) in
Appendix 7A of Chapter 7 shows that

$$\mathrm{E}(r^4) = 3r_{rms}^4 ,$$

where, of course, we know from the discussion following Eq. (8.37f) that $r_{rms}$ is the standard
deviation of r. Now we have

$$NEdN_{samp}^{delta} \cong \frac{2\sqrt{2}\,\pi^2\sigma^2 r_{rms}^2}{1 - 2\pi^2\sigma^2 r_{rms}^2}\left|\mathcal{L}_{mnf}(\sigma) + \frac{\mathcal{L}_{mnf}^{(fore)}(\sigma) - \mathcal{L}_{mnf}^{(back)}(\sigma)}{\tau_f(\sigma)}\right| \tag{8.40c}$$

as the formula for the NEdN of our measurement. By keeping both the $O(r)$ and $O(r^2)$ terms
everywhere they occur in the noise equations, we have ended up with a reasonable formula for
the quasi-static sampling noise. We see that neglecting the quadratic term in Eq. (8.2e) is the reason
our previous NEdN formula gave zero for the quasi-static sampling noise.
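
The final formula is simple enough to evaluate directly. The sketch below assumes that (8.40c) reads as the factor 2*sqrt(2)*pi^2*sigma^2*r_rms^2 / (1 - 2*pi^2*sigma^2*r_rms^2) multiplying the magnitude of L_mnf + (L_fore - L_back)/tau_f; the absolute value, the function and argument names, and all numerical inputs are assumptions made for illustration. A second helper checks by Monte Carlo the Gaussian result E([r^2 - r_rms^2]^2) = 2 r_rms^4 that turns (8.40b) into (8.40c).

```python
import math
import random

def nedn_delta_samp(sigma, r_rms, tau_f, l_mnf, l_fore, l_back):
    """Quasi-static sampling NEdN, assuming the form of Eq. (8.40c) described above.

    sigma is in cm^-1 and r_rms in cm; the radiances share one arbitrary unit.
    """
    x = 2.0 * math.pi**2 * sigma**2 * r_rms**2
    radiance_term = abs(l_mnf + (l_fore - l_back) / tau_f)
    return math.sqrt(2.0) * x / (1.0 - x) * radiance_term

def normalized_spread(n_draws, seed):
    """Monte Carlo estimate of E([r^2 - r_rms^2]^2) / (2 r_rms^4) for Gaussian r."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(n_draws):
        r = rng.gauss(0.0, 1.0)       # normalized units in which r_rms = 1
        acc += (r * r - 1.0) ** 2
    return acc / n_draws / 2.0

print(nedn_delta_samp(1000.0, 1.0e-6, 0.8, 1.0, 0.1, 0.1))
print(normalized_spread(200000, 3))   # should come out close to 1
```

With these illustrative inputs the NEdN is a few parts in 10^5 of the input radiance, which is small but, unlike the earlier linear result, not zero.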
8.11 Comparing the Sampling-Error, Misalignment, and Detector NEdNs
Equations (7.34b) and (7.35b) in Chapter 7 specify the NEdN due to random misalignment error
to be

( 2)
R
8 ( )
M( ) ( ) ( ) ( ) ( )
tilt
rms a f
J
NEdN
R

, (8.41a)
where

2
( 2) ( 2) 2 2
1
( ) ( ) ( ) ( ) ( ) ( )
4
nn mnf mnf
J d

= + + +

Z Z

. p (8.41b)

The corresponding pair of formulas for the sampling-error NEdN is, in Eqs. (8.22g) and (8.22h)
above,

( )
ma
R
4 ( )
( )
H( ) M( ) ( ) ( ) ( ) ( )
s
samp
a f
J
NEdN
A u R

(8.41c)
and

( ) ( )
( ) ( )
( )
2 ( ) ( )
ma
( )
ma
( )
( ) ( ) H ( ) M ( ) ( )
( ) H ( ) M ( ) ( )
s
s i
nn mnf
i
mnf
J
e u R
e u R

=

+ + + +
Z
Z

p
2
d .
(8.41d)

The formula for $\mathcal{Z}_{mnf}$ is, of course, the same in both sets of equations; Eq. (8.7f) in this chapter
just repeats the definition in (7.16f) in the previous chapter. For the two types of NEdN,

( ) (back)
R ( ) ( ) ( ) ( )[ ( ) ( ) ( ) ( )]
4
fore
mnf a f mnf mnf mnf
WA

= + Z L L L . (8.41e)

Speaking very approximately, and noting what happens when the formulas for
( 2)
J

and
( ) s
J
are substituted into Eqs. (8.41a) and (8.41c), we see that both NEdN
tilt
and NEdN
samp
diminish as
the J and thus (disregarding for now the effect of the integrals over d) decrease in an
approximately linear way with $\mathcal{Z}_{mnf}$. This point is not just academic because, at least in
principle, both $\mathcal{L}_{mnf}^{(fore)}$ and $\mathcal{L}_{mnf}^{(back)}$ are under the control of the interferometer's designer. Hence, by
arranging for $\mathcal{L}_{mnf}^{(back)}$, the background radiance from the aft optics, as nearly as possible to equal
(under typical operating conditions when measuring a typical input spectrum) the sum

$$\tau_f(\sigma)\,\mathcal{L}_{mnf}(\sigma) + \mathcal{L}_{mnf}^{(fore)}(\sigma) ,$$

we can minimize both $NEdN_{tilt}$ and $NEdN_{samp}$. This same relationship shows up in the formula for
random quasi-static sampling error. Equation (8.40b) shows that if

$$\mathcal{L}_{mnf}(\sigma) + \frac{\mathcal{L}_{mnf}^{(fore)}(\sigma) - \mathcal{L}_{mnf}^{(back)}(\sigma)}{\tau_f(\sigma)} \cong 0$$

or

$$\mathcal{L}_{mnf}^{(back)}(\sigma) \cong \tau_f(\sigma)\,\mathcal{L}_{mnf}(\sigma) + \mathcal{L}_{mnf}^{(fore)}(\sigma) , \tag{8.41f}$$

then $NEdN_{samp}^{delta} \cong 0$ also.
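
The balancing condition can be made concrete with a few lines of arithmetic. In this sketch the two helper functions and the numerical radiance values are hypothetical; the only thing taken from the text is the relationship itself, namely that the radiance combination L_mnf + (L_fore - L_back)/tau_f vanishes when L_back equals tau_f * L_mnf + L_fore.

```python
def radiance_bracket(tau_f, l_mnf, l_fore, l_back):
    """Radiance combination that multiplies the sampling and tilt noise terms."""
    return l_mnf + (l_fore - l_back) / tau_f

def balancing_back_radiance(tau_f, l_mnf, l_fore):
    """Aft-optics background radiance that nulls the bracket for a typical input."""
    return tau_f * l_mnf + l_fore

tau_f, l_mnf, l_fore = 0.8, 2.0, 0.3             # illustrative values, arbitrary units
l_back = balancing_back_radiance(tau_f, l_mnf, l_fore)

print(radiance_bracket(tau_f, l_mnf, l_fore, l_back))   # essentially zero
print(radiance_bracket(tau_f, l_mnf, l_fore, 0.0))      # unbalanced case
```

The balance holds only for the one input radiance used to choose L_back, so a spectrum that departs from the typical input leaves a residual bracket proportional to the departure.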
It is not difficult to understand why (8.41f) minimizes random misalignment and sampling
errors: for both types of noise this minimizing relationship is present from the start of our
analysis. It is also not very difficult to show how this works. Working first with the
sampling-position noise, we get from Eq. (8.1c) that

( ) 2
ma
( ) H( ) M( ) ( )
tot i
C FOV
z u R e d

Z , (8.42a)

which means that its derivative can be written as

( )
2
ma
2 H( ) M( ) ( )
tot
i C
FOV
dz
i u R e d
d

Z . (8.42b)

Equation (8.42b) can be substituted into (8.2g) to get

( ) ( ) ( ) 2
ma
( ) ( ) 2 ( ) H( ) M( ) ( )
tot tot s i
CN C FOV
z z i n u R e d

= +
Z . (8.42c)

Examining the derivation of Eq. (8.2g), we see that it comes from removing the j subscripts in
Eq. (8.2e). If we want to include the $O([n^{(s)}]^2)$ error term from the analysis of the quasi-static
sampling error in Sec. 8.10, we can similarly remove the j subscripts from Eq. (8.35) and
substitute from (8.42a) to get

( ) ( ) ( ) 2
ma
2 ( ) 2 2 2
ma
( ) ( ) 2 ( ) H( ) M( ) ( )
2 [ ( )] H( ) M( ) ( )
tot tot s i
CN C FOV
s i
FOV
z z i n u R e d
n u R e d
r o
r o
r o o o o o
r o o o o o
Z
Z

.
(8.42d)

Clearly the size of the sampling noise is governed by the size of $\mathcal{Z}_{FOV}$. As for the mirror-tilt error
in Chapter 7, Eq. (7.8a) can be substituted into (7.11d) to get

( ) 1 2
1 ( 2) 2 2
( ) M( ) ( )
( ) ( )
tot i
CN rms FOV
i
FOV
z u h R e d
u
u h e d
u
n
r o
r o
o o o
o o o
Z
Z

. a
(8.42e)

Again the size of the signal noise is governed by the size of $\mathcal{Z}_{FOV}$. The formula for $\mathcal{Z}_{FOV}$ is [see
Eq. (8.1b) above or (7.7b) in Chapter 7]

( ) (back)
R ( ) ( ) ( ) ( )[ ( ) ( ) ( ) ( )]
4
fore
FOV a f FOV FOV FOV
WA
AO
+

Z L L L . (8.42f)

All the random-error terms in the formulas for the interferogram signal contaminated by sampling
noise and random misalignment errors (that is, all the random-error terms on the right-hand
sides of Eqs. (8.42d) and (8.42e)) can be minimized by minimizing $\mathcal{Z}_{FOV}$. Equation (8.42f)
shows that this occurs when

$$\tau_f\,\mathcal{L}_{FOV} + \mathcal{L}_{FOV}^{(fore)} - \mathcal{L}_{FOV}^{(back)} \cong 0$$

or

$$\mathcal{L}_{FOV}^{(back)}(\sigma) \cong \tau_f(\sigma)\,\mathcal{L}_{FOV}(\sigma) + \mathcal{L}_{FOV}^{(fore)}(\sigma) . \tag{8.42g}$$

Assuming that the interferometer does a reasonable job of resolving the $\mathcal{L}$, $\mathcal{L}^{(back)}$, and $\mathcal{L}^{(fore)}$
radiance spectra (that is, assuming that the distorting effects of the finite interferogram and finite
field of view are negligible) we know, just like in Eqs. (7.19a) and (7.19b) in Chapter 7, that

$$\mathcal{L}_{FOV}(\sigma) \cong \mathcal{L}_{mnf}(\sigma) \cong \mathcal{L}(\sigma) , \tag{8.42h}$$

$$\mathcal{L}_{FOV}^{(fore)}(\sigma) \cong \mathcal{L}_{mnf}^{(fore)}(\sigma) \cong \mathcal{L}^{(fore)}(\sigma) , \tag{8.42i}$$

and

$$\mathcal{L}_{FOV}^{(back)}(\sigma) \cong \mathcal{L}_{mnf}^{(back)}(\sigma) \cong \mathcal{L}^{(back)}(\sigma) . \tag{8.42j}$$

Under these conditions, Eqs. (8.42g) and (8.41f) are effectively identical. Since (8.41f) comes
from minimizing the final formulas for $NEdN_{tilt}$ and $NEdN_{samp}$, and (8.42g) comes from
minimizing the raw noise contaminating the initial interferogram signals, we have now confirmed
that this noise-minimizing relationship is present from the beginning of the analysis and
continues through to the end.
The noise associated with the randomly changing misalignment and sampling errors is
sometimes called multiplicative noise.
111
The name comes from the way these random errors
enter the equations only after being multiplied by integrals proportional to $\mathcal{Z}_{FOV}$, which is itself
proportional to

$$\tau_f\,\mathcal{L}_{FOV} + \mathcal{L}_{FOV}^{(fore)} - \mathcal{L}_{FOV}^{(back)} .$$

In Eq. (8.42e), for example,
( 2)
n
is multiplied by

2 2
( )
i
FOV
e d
r o
o o o
Z

before contributing to the uncontaminated interference signal. In Eq. (8.42d),
( ) s
n is multiplied
by

2
ma
H( ) M( ) ( )
i
FOV
u R e d
r o
o o o o o
Z
and
( ) 2
[ ]
s
n is multiplied by

2 2
ma
H( ) M( ) ( )
i
FOV
u R e d
r o
o o o o o
Z

before contributing to the uncontaminated signal.
The equation for detector noise corresponding to Eqs. (8.42d,e) is [see Eq. (6.22a) in Chapter
6]

( ) ( ) (det)
1
( ) ( ) ( ) ( )
tot cold
CN C C
z z z n h
u u

+ +

. (8.43)

111
John Chamberlain, The Principles of Interferometric Spectroscopy, pp. 303-309.
Here the random signal error

(det)
1
( ) n h
u u

is directly added to the uncontaminated interference signal

( )
( ) ( )
cold
C C
z z + .

Even though there is a convolution with ( / ) h u before the addition (to show what happens to
the noise when it passes through the signal processing chain after leaving the detector), no terms
proportional to the input or background radiances are included in the random error before it is
added to the uncontaminated signal. This is why the random error coming from the detector is
sometimes called additive noise.
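
The practical difference between the two kinds of noise is how the error scales with the signal. The following sketch is a toy model only: the Gaussian fluctuation, its size, the seed, and the function names are assumptions introduced here, and a single scalar `signal_level` stands in for the radiance-dependent integrals of the text. Multiplicative noise is modeled as a random fractional change in the signal and additive noise as a random offset that ignores the signal.

```python
import random
import statistics

def noisy_samples(signal_level, n, seed, kind):
    """Draw n noisy readings of a constant signal using one of two toy noise models."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        e = rng.gauss(0.0, 0.01)                        # small zero-mean random fluctuation
        if kind == "multiplicative":
            samples.append(signal_level * (1.0 + e))    # error scales with the signal
        else:
            samples.append(signal_level + e)            # error is independent of the signal
    return samples

def noise_std(signal_level, kind, n=5000, seed=1):
    """Standard deviation of the noisy readings at the given signal level."""
    return statistics.pstdev(noisy_samples(signal_level, n, seed, kind))

for level in (0.0, 1.0, 10.0):
    print(level, noise_std(level, "multiplicative"), noise_std(level, "additive"))
```

In this model the multiplicative error disappears when the signal level is driven to zero, which mirrors the way the tilt and sampling errors shrink with the radiance combination discussed above, while the additive error stays the same at every level.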
The noise-free components of the interference signals in Eqs. (8.42d) and (8.43) are the same.
It is easy to show this is true. Setting
(det)
n to zero in (8.43) reduces the right-hand side to

( )
( ) ( )
cold
C C
z z + ,

which becomes, after substituting from Eqs. (6.5d) and (6.12a) in Chapter 6,

( )
2
ma
( ) (back) 2
ma
R
R
( ) ( )
H( ) M( ) ( ) ( ) ( ) ( ) ( )
4
H( ) M( ) ( ) ( ) ( )[ ( ) ( )]
4
cold
C C
i
f a FOV
fore i
a FOV FOV
z z
WA
u R e d
WA
u R e d

L
L L

.

Combining the two integrals into one, we get

( )
ma
( ) (back) 2
R
( ) ( )
H( ) M( ) ( ) ( ) ( )
4
[ ( ) ( ) ( ) ( )]
cold
C C
a
fore i
f FOV FOV FOV
z z
WA
u R
e d

=
+
L L L

.

Substituting from (8.42f) gives


( ) 2
ma
( ) ( ) H( ) M( ) ( )
cold i
C C FOV
z z u R e d

+ =
Z ,

which can also be written as, according to Eq. (8.42a),

( ) ( )
( ) ( ) ( )
cold tot
C C C
z z z + = . (8.44)

This is the same $z_C^{(tot)}$ that the right-hand side of (8.42d) reduces to when the sampling-position
noise $n^{(s)}$ is zero; in both cases, not surprisingly, the same function can be used to represent the
noise-free signal.
The right-hand side of Eq. (8.42e) for the random mirror-misalignment error also reduces to
$z_C^{(tot)}$ as the misalignment noise goes to zero, but unfortunately it takes some analysis to show
this. When
( 2)
n
is zero, the Fourier F operator defined in Eqs. (2.29a) and (2.29c) in Chapter 2
can be used to write the right-hand side of (8.42e) as

( )
1 2
1 ( )
M( ) ( )
M( ) ( )
i
rms FOV
i
rms FOV
u h R e d
u
u h R
u

=

Z
Z

. F
(8.45a)

We note that the transform in Eq. (6.27b) in Chapter 6 can be reversed to get (replacing the
dummy variables , by , respectively)

( ) ( )
( ) ( )
H( ) H( )
i i
h u u u u
u

= =

F F ,

where in the last step we have used the linearity of F to move u outside the Fourier transform (see
Sec. 2.6 in Chapter 2). This can also be written as

( )
( )
1
H( )
i
h u
u u

=

F . (8.45b)

Substituting (8.45b) into (8.45a) gives


( ) ( )
1 2
( ) ( )
M( ) ( )
H( ) M( ) ( )
i
rms FOV
i i
rms FOV
u h R e d
u
u R
r o
o o
o o o
o o o

Z
Z

. F F

Writing the right-hand side as a Fourier integral [after applying Eq. (2.39j) in Chapter 2], we get

( )
1 2
( )
M( ) ( )
H( )M( ) ( )
i
rms FOV
i
rms FOV
u h R e d
u
u R
r o
o
o o o
o o o
Z
Z

. F

We again apply (2.29c) in Chapter 2 to the right-hand side to write

1 2
2
M( ) ( )
H( )M( ) ( )
i
rms FOV
i
rms FOV
u h R e d
u
u R e d
r o
r o
o o o
o o o o
Z
Z

.
(8.45c)
Formula (7.3d) in Chapter 7 states that

2 2 2 2
rms x y
o y y + + ;

and when the misalignment noise drops to zero, we expect the
, x y
y standard deviations of the
random misalignment angles
x
and
y
also to go to zero, giving us

rms
o .

The discussion following Eq. (7.2e) defines o to be the bias-tilt angle of the randomly varying
misalignment, so when the randomly changing misalignment error goes to zero, it makes sense to
regard o as the static misalignment angle
ma
,

ma rms
. .

Replacing
rms
by
ma
and then comparing the right-hand side of (8.45c) to (8.42a), we see that as
the misalignment noise drops to zero, the noise-free signal once again simplifies to the Fourier
transform

( ) 2
ma
( ) H( ) M( ) ( )
tot i
C FOV
z u R e d

Z . (8.45d)

This is, of course, the same noise-free signal we get in Eq. (8.44) above. Hence we have now
demonstrated that the noise-free signals from our analysis of the detector noise, the
sampling-position noise, and the mirror-misalignment noise indeed all reduce to the same expression, as
they should.
According to the discussion following Eq. (8.42g), the approximate radiance equalities
specified by (8.41f) and (8.42g) are essentially equivalent in well-designed interferometers, and
from this it follows that $\mathcal{Z}_{FOV}$ [whose formula is given by (8.42f)] is minimized by (8.42g) at the
same time that $NEdN_{tilt}$ and $NEdN_{samp}$ are minimized by (8.41f). At this point, however, we notice
that the $z_C^{(tot)}$ noise-free signal component in (8.45d) is also minimized when $\mathcal{Z}_{FOV}$ is minimized.
This seems to cause a problem, because the spectral measurement depends on this noise-free
component; it clearly does not make sense to design the interferometer for minimal tilt and
sampling noise if the signal itself then goes away.
To solve this puzzle, we need to be more explicit about what exactly is being measured.
According to the mathematics of information theory, the more unexpected an occurrence is, the
more information it provides.
112
Turning this statement around, the more expected an occurrence
is, the less information it provides. With this idea as a guide, we can divide the $\mathcal{L}(\sigma)$ radiance
spectrum being measured into an expected component and an unexpected (or unknown)
component,

$$\mathcal{L}(\sigma) = \mathcal{L}^{(exp)}(\sigma) + \mathcal{L}^{(unk)}(\sigma) . \tag{8.46a}$$

The $\mathcal{L}^{(exp)}$ spectral radiance is what we expect to measure; it could, for example, be the average
spectrum measured in the past under circumstances similar to the present. Assuming that there
are N past measurements, we can label each measurement with an index $j = 1, 2, \ldots, N$ and call
the radiance seen in the jth past measurement $\mathcal{L}^{(j)}(\sigma)$ so that

$$\mathcal{L}^{(exp)}(\sigma) = \frac{1}{N}\sum_{j=1}^{N}\mathcal{L}^{(j)}(\sigma) . \tag{8.46b}$$

According to (8.46a), the unknown component $\mathcal{L}^{(unk)}$ for the spectrum $\mathcal{L}$ now being measured
must be the difference between that spectrum and $\mathcal{L}^{(exp)}$, so

112
A. Papoulis, Probability, Random Variables, and Stochastic Processes, p. 534.

$$\mathcal{L}^{(unk)}(\sigma) = \mathcal{L}(\sigma) - \mathcal{L}^{(exp)}(\sigma) . \tag{8.46c}$$

Function $\mathcal{L}^{(unk)}$ is the real information in the signal because we cannot know anything about it
ahead of time; in fact, because it is defined to be the difference between $\mathcal{L}$ and $\mathcal{L}^{(exp)}$, it is equally
likely to be positive or negative. Not knowing anything about it ahead of time, we cannot design
the instrument around it; we can, however, just as for any other truly unpredictable quantity,
estimate its expected size by calculating the associated standard deviation:

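$$\left|\mathcal{L}^{(unk)}(\sigma)\right| \cong \sqrt{\frac{1}{N}\sum_{j=1}^{N}\left[\mathcal{L}^{(j)}(\sigma) - \mathcal{L}^{(exp)}(\sigma)\right]^2} . \tag{8.46d}$$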
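
The bookkeeping in (8.46a) through (8.46d) amounts to a per-wavenumber mean and standard deviation over past spectra. The sketch below assumes each spectrum is stored as a list of radiance values on a common wavenumber grid and that the standard deviation uses the 1/N normalization; the function names and the tiny two-point spectra are hypothetical and serve only to show the arithmetic.

```python
import math

def expected_spectrum(past_spectra):
    """Average of N past spectra at each wavenumber, in the spirit of Eq. (8.46b)."""
    n = len(past_spectra)
    return [sum(column) / n for column in zip(*past_spectra)]

def unknown_component(current, expected):
    """Difference between the current spectrum and the expected one, as in Eq. (8.46c)."""
    return [c - e for c, e in zip(current, expected)]

def unknown_size(past_spectra, expected):
    """Per-wavenumber standard deviation of the past spectra about their mean (1/N form)."""
    n = len(past_spectra)
    return [
        math.sqrt(sum((spectrum[k] - expected[k]) ** 2 for spectrum in past_spectra) / n)
        for k in range(len(expected))
    ]

past = [[1.0, 2.0], [3.0, 2.0]]         # two past spectra sampled at two wavenumbers
expected = expected_spectrum(past)
print(expected)                                  # [2.0, 2.0]
print(unknown_size(past, expected))              # [1.0, 0.0]
print(unknown_component([2.5, 1.0], expected))   # [0.5, -1.0]
```

The standard deviation gives a design-time estimate of how large the unknown component is likely to be, even though its sign and shape in any one measurement cannot be predicted.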

To show the effect of the interferometer's finite field of view, we follow the pattern of Eqs.
(5.83e), (6.11b), and (6.11c) in Chapters 5 and 6, setting

2
=

and then defining that

(exp)
(exp)
1
4 2
(exp)
1
4 2
( )
1
( )
cannot
FOV
d

+ +

+

=

L
L
L

be approximated as one

(8.47a)
and

(unk)
(unk)
1
4 2
(unk)
1
4 2
( )
1
( )
cannot
FOV
d

+ +

+

=

L
L
L

be approximated as one

(8.47b)

Similarly, following the pattern of Eqs. (5.108d), (6.25b), and (6.25c) in Chapters 5 and 6, the
distorting effect of the finite interferogram length can be introduced by defining

(exp) (exp)
( ) [2 sinc(2 )] ( )
mnf FOV
D D o ro o L L (8.47c)

and

(unk) (unk)
( ) [2 sinc(2 )] ( )
mnf FOV
D D o ro o L L . (8.47d)

The analysis following Eq. (5.108d) applies equally well to $\mathcal{L}_{mnf}^{(exp)}$ and $\mathcal{L}_{mnf}^{(unk)}$, letting us write

(exp) (exp)
( ) ( )
mnf mnf
o o L L (8.47e)
and

(unk) (unk)
( ) ( )
mnf mnf
o o L L (8.47f)

to show that $\mathcal{L}_{mnf}^{(exp)}$ and $\mathcal{L}_{mnf}^{(unk)}$ are even functions of $\sigma$. Formulas (8.47e) and (8.47f) can be
substituted into (8.47c) and (8.47d) to get

(exp) (exp)
( ) [2 sinc(2 )] ( )
mnf FOV
D D o ro o L L (8.47g)
and

(unk) (unk)
( ) [2 sinc(2 )] ( )
mnf FOV
D D o ro o L L . (8.47h)

We now combine results. Substituting (8.46a) into the right-hand side of Eq. (5.83e) in Chapter 5
gives

(exp) (unk)
1 1
4 2 4 2
(exp) (unk)
1 1
4 2 4 2
( ) ( )
or small where cos can be approximated
( )
1 1
( ) ( )
for slight
FOV
d d
o o
o o
r
o o
r r
o o
o o
r r
o o
o
o
o o o o
A A AO AO
+ + + +

A A AO AO
+ +

+
+
A A

L L
L
L L
f as one
ly larger where cos cannot be approximated
r
o
as one

Equations (8.47a) and (8.47b) show that this formula is the same thing as saying that

(exp) (unk)
( ) ( ) ( )
FOV FOV FOV
o o o + L L L . (8.47i)

Equation (8.47i) can now be substituted into Eq. (5.108d) in Chapter 5 to get

(exp) (unk)
( ) [2 sinc(2 )] [ ( ) ( )]
mnf FOV FOV
D D = + L L L ,

which becomes, using the linearity of the convolution [see Eq. (2.38d) in Chapter 2],

{ } { }
(exp) (unk)
( ) [2 sinc(2 )] ( ) [2 sinc(2 )] ( )
mnf FOV FOV
D D D D = + L L L .

Substitution from Eqs. (8.47g,h) gives

(exp) (unk)
( ) ( ) ( )
mnf mnf mnf
= + L L L .

If the right-hand side of this formula is an even function of $\sigma$ (and it is), then the left-hand side
must also be an even function of $\sigma$, allowing us to write

(exp) (unk)
( ) ( ) ( )
mnf mnf mnf
= + L L L . (8.47j)

Equations (8.47i) and (8.47j) match the form of (8.46a), showing that the distinction between the
expected and unknown radiances extends naturally to the distorted radiance functions produced
by the finite field of view and finite interferogram length.
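
The linearity step in this argument is easy to confirm numerically. The sketch below is illustrative only: it assumes the kernel 2D sinc(2πDσ) with sinc(x) = sin(x)/x, and the two component spectra, the grid, and the helper name `convolve_with_kernel` are hypothetical choices, not quantities from the text.

```python
import numpy as np

sigma = np.linspace(-500.0, 500.0, 1001)     # symmetric wavenumber grid (assumed units)
d_sigma = sigma[1] - sigma[0]
D = 0.02                                     # assumed maximum path difference

# np.sinc(x) = sin(pi*x)/(pi*x), so this is 2D*sin(2*pi*D*sigma)/(2*pi*D*sigma).
kernel = 2.0 * D * np.sinc(2.0 * D * sigma)

# Hypothetical components: a smooth "expected" part and a small "unknown"
# part that takes both signs.
l_exp = np.exp(-(sigma / 200.0) ** 2)
l_unk = 0.05 * np.sin(np.abs(sigma) / 30.0)

def convolve_with_kernel(f):
    # Discrete approximation to the convolution of the kernel with f.
    return np.convolve(kernel, f, mode="same") * d_sigma

# Convolving the sum gives the same result as summing the convolutions.
lhs = convolve_with_kernel(l_exp + l_unk)
rhs = convolve_with_kernel(l_exp) + convolve_with_kernel(l_unk)
is_linear = np.allclose(lhs, rhs)
print("convolution is linear:", is_linear)
```

The same check works for any pair of component spectra, which is why the expected and unknown parts can be carried separately through the finite-interferogram-length distortion.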
The expected component $\mathcal{L}^{(exp)}$ of the measured radiance often acts like a type of background
radiance generated outside the instrument. Suppose, for example, a spectroscopist is trying to
measure the infrared spectrum of a small burning candle with an interferometer having a
relatively large field of view. The optical signal coming from the candle could easily turn out to
be rather small compared to the infrared background signal coming from the laboratory walls. In
this sort of situation, we can say that

$$\mathcal{L}^{(exp)}(\sigma) \approx \mathcal{L}^{(wall)}(\sigma) .$$  (8.48)

Of course $\mathcal{L}^{(exp)}$, if defined by (8.46b), cannot be exactly the same as $\mathcal{L}^{(wall)}$
because the candle would contribute some small average radiance to $\mathcal{L}^{(exp)}$, but this could
easily turn out to be negligible, justifying the approximation in (8.48).
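
A toy calculation can make the candle-and-wall picture concrete. This is only a sketch under stated assumptions: it treats the expected radiance as an average over many repeated measurements (the text says only that the candle contributes "some small average radiance"; the exact definition in (8.46b) is not reproduced here), and the wall spectrum, candle line shape, candle strengths, and variable names are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
sigma = np.linspace(500.0, 2000.0, 301)          # wavenumber grid (assumed units)

# Hypothetical smooth wall background, arbitrary radiance units.
l_wall = 1.0 + 0.2 * np.cos(sigma / 400.0)

# Hypothetical weak candle line whose strength varies from run to run.
candle_shape = np.exp(-0.5 * ((sigma - 1200.0) / 50.0) ** 2)
candle_strength = 0.02 * rng.random(200)         # 200 simulated measurements

# Each row is one measured radiance spectrum.
measurements = l_wall + candle_strength[:, None] * candle_shape

# Assumed stand-in for the expected radiance: the ensemble average.
l_exp = measurements.mean(axis=0)

# The unknown part is whatever is left in each individual measurement.
l_unk = measurements - l_exp

max_rel_diff = np.max(np.abs(l_exp - l_wall) / l_wall)
print("largest relative difference between L_exp and L_wall:", max_rel_diff)
print("L_unk takes both signs:", bool((l_unk > 0).any() and (l_unk < 0).any()))
```

In this toy setup the expected radiance differs from the wall radiance by about a percent, and the leftover part is positive in some runs and negative in others.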
Having divided $\mathcal{L}$ into $\mathcal{L}^{(exp)}$ and $\mathcal{L}^{(unk)}$, we can revisit the
minimization conditions for the sampling and mirror-misalignment noise in Eqs. (8.41f) and (8.42g).
Substituting Eqs. (8.47i) and (8.47j) into (8.41f) and (8.42g) gives

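$$\mathcal{L}^{(back)}_{FOV}(\sigma) = \tau_f(\sigma)\,\mathcal{L}^{(exp)}_{FOV}(\sigma) + \tau_f(\sigma)\,\mathcal{L}^{(unk)}_{FOV}(\sigma) + \mathcal{L}^{(fore)}_{FOV}(\sigma)$$  (8.49a)

and

$$\mathcal{L}^{(back)}_{mnf}(\sigma) = \tau_f(\sigma)\,\mathcal{L}^{(exp)}_{mnf}(\sigma) + \tau_f(\sigma)\,\mathcal{L}^{(unk)}_{mnf}(\sigma) + \mathcal{L}^{(fore)}_{mnf}(\sigma) .$$  (8.49b)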

In well-designed interferometers where (8.42h)–(8.42j) are reasonable approximations, we might
think to build the instrument so that $\mathcal{L}^{(back)}$ satisfies the approximate equalities in
(8.49a) and (8.49b), minimizing the sampling and mirror-misalignment noise. The problem with this
strategy is not so much a lack of fine control over the spectral shape of $\mathcal{L}^{(back)}$,
although that is definitely an important consideration, as it is our complete lack of knowledge about
what $\mathcal{L}^{(unk)}$ will be in any particular measurement. Having defined $\mathcal{L}^{(unk)}$ to
be that part of the measured radiance about which nothing can be known ahead of time, we do not even
know whether $\mathcal{L}^{(unk)}$ will be positive or negative [see discussion following Eq. (8.46c)].
Hence, the best that can be done to satisfy (8.49a) and (8.49b) is to set up the instrument so that

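$$\mathcal{L}^{(back)}_{FOV}(\sigma) = \tau_f(\sigma)\,\mathcal{L}^{(exp)}_{FOV}(\sigma) + \mathcal{L}^{(fore)}_{FOV}(\sigma)$$  (8.49c)

and

$$\mathcal{L}^{(back)}_{mnf}(\sigma) = \tau_f(\sigma)\,\mathcal{L}^{(exp)}_{mnf}(\sigma) + \mathcal{L}^{(fore)}_{mnf}(\sigma) .$$  (8.49d)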

Now we can make sense of the situation that occurs when instruments are designed to
minimize multiplicative noise like $\mathrm{NEdN}_{tilt}$ and $\mathrm{NEdN}_{samp}$. Substituting (8.47i)
into formula (8.42f) gives

(exp) (unk) ( ) (back)
R ( ) ( ) ( ) ( )
4
( )[ ( ) ( )] ( ) ( )
FOV a
fore
f FOV FOV FOV FOV
WA
o o q o t o
t o o o o o
AO

+ +
Z
L L L L

or

(unk) (exp) ( ) (back)
R ( ) ( ) ( ) ( )
4
( ) ( ) [ ( ) ( )] ( ) ( )]
FOV a
fore
f FOV f FOV FOV FOV
WA
o o q o t o
t o o t o o o o
AO

+ +
Z
L L L L .
(8.50a)

This can be put into formula (8.45d) for the noise-free signal to get


( )
2
ma
R
( )
H( ) M( ) ( ) ( ) ( )
4
( ) ( ) [ ( ) ( )] ( ) ( )]
tot
C
i
a
fore
f FOV f FOV FOV FOV
z
WA
u R e
d
r o
o o o q o t o
t o o t o o o o o
AO
+ +
L L L L

Separating this into an integral containing $\mathcal{L}^{(unk)}_{FOV}$ and an integral containing
everything else, we write

( ) (unk) (exp)
( ) ( ) ( )
tot
C C C
z z z + , (8.50b)
where

(unk)
(unk) 2
ma
R
( )
H( ) M( ) ( ) ( ) ( ) ( ) ( )
4
C
i
a f FOV
z
WA
u R e d
r o
AO
L
(8.50c)
and

(exp)
2
ma
(exp) ( ) (back)
R
( )
H( ) M( ) ( ) ( ) ( )
4
[ ( ) ( )] ( ) ( )]
C
i
a
fore
f FOV FOV FOV
z
WA
u R e
d
r o
o o o q o t o
t o o o o o
AO
L L L

.
(8.50d)

Now, if the interferometer is built so that (8.49c) holds true, then all that can happen is that
$z_C^{(exp)}$ disappears, reducing $z_C^{(tot)}$ in (8.50b) to

( ) (unk)
( ) ( )
tot
C C
z z . (8.50e)
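
The cancellation described here can be illustrated with a toy calculation. The sketch assumes that the radiance-dependent factor in the integrand has the form τ_f (L_exp + L_unk) + L_fore − L_back, which is the form implied by condition (8.49c) making the expected part vanish; the spectra, the transmission curve, and the names `bracket` and `l_back_nulling` are all hypothetical.

```python
import numpy as np

sigma = np.linspace(600.0, 1400.0, 401)          # wavenumber grid (assumed units)

# Hypothetical instrument and scene quantities, arbitrary units.
tau_f = 0.8 + 0.1 * np.cos(sigma / 300.0)        # fore-optics transmission
l_exp = 1.0 + 0.3 * np.sin(sigma / 250.0)        # expected radiance
l_unk = 0.05 * np.sin(sigma / 40.0)              # unknown radiance, either sign
l_fore = 0.2 * np.ones_like(sigma)               # fore-optics emission

def bracket(l_back):
    # Assumed form of the radiance factor multiplying the rest of the integrand.
    return tau_f * (l_exp + l_unk) + l_fore - l_back

# Background radiance chosen to satisfy a condition of the form (8.49c).
l_back_nulling = tau_f * l_exp + l_fore

residual = bracket(l_back_nulling)

# Only the unknown part survives; the expected part has cancelled.
only_unknown_left = np.allclose(residual, tau_f * l_unk)
print("only the unknown component remains:", only_unknown_left)
```

With any other choice of background, for example zero, the expected radiance stays in the factor and contributes to the signal and to the multiplicative noise.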

The $z_C^{(unk)}$ component of the noise-free signal is, however, all we really cared about in the first
place. The $z_C^{(exp)}$ expected signal component is already known; it provides no new information
because that part of the signal is expected to be there every time the experiment is done. Hence an
interferometer can be designed so that approximations (8.49c) and (8.49d) hold true without
affecting the relevant part of the signal passing through the instrument. Now that there is no
concern about decreasing the quality of the measurement, Eq. (8.47j) can be substituted into
formula (8.41e) to get


R ( ) ( ) ( ) ( )
4
[ ( ) ( ) ( ) ( ) ( ) ( )]
mnf a
fore
f mnf f mnf mnf mnf
WA

=
+ +
Z
L L L L .

Condition (8.49d) can then be applied to minimize multiplicative noise such as $\mathrm{NEdN}_{tilt}$ and
$\mathrm{NEdN}_{samp}$, leading to

(min) (unk)
R ( ) ( ) ( ) ( ) ( ) ( )
4
mnf a f mnf
WA

= Z L . (8.50f)

This is now substituted into formulas (8.41a) and (8.41b) for $\mathrm{NEdN}_{tilt}$ to get

(min, 2)
(min)
R
8 ( )
M( ) ( ) ( ) ( ) ( )
tilt
rms a f
J
NEdN
R

, (8.50g)
where

(min, 2)
2
( 2) 2 (min) 2 (min)
( )
1
( ) ( ) ( ) ( ) ( )
4
nn mnf mnf
J
d
=
+ + +

Z Z

. p
(8.50h)

Equation (8.50f) can also be substituted into formulas (8.41c) and (8.41d) to get

(min, )
(min)
ma
R
4 ( )
( )
H( ) M( ) ( ) ( ) ( ) ( )
s
samp
a f
J
NEdN
A u R

, (8.50i)
with

( ) ( )
( ) ( )
(min, )
2 ( ) ( ) (min)
ma
2
( ) (min)
ma
( )
( ) ( ) H ( ) M ( ) ( )
( ) H ( ) M ( ) ( )
s
s i
nn mnf
i
mnf
J
e u R
e u R d

=

+ + + +
Z
Z

.
p (8.50j)

There is no guarantee, of course, that the interferometer will always be used under the
conditions for which it is designed, or even that it is possible to design the interferometer so that
the minimizing conditions in (8.49c) and (8.49d) are satisfied. We know, for example, that
detector noise dominates the random-error budgets of most well-designed interferometers.
According to the discussion at the beginning of Sec. 6.15 of Chapter 6, many detectors operate
under close to ideal conditions, so any increase in background radiance $\mathcal{L}^{(back)}$ needed to
satisfy (8.49c) and (8.49d) can easily end up increasing the $\mathrm{NEdN}_{(det)}$ detector noise more
than it decreases the $\mathrm{NEdN}_{tilt}$ and $\mathrm{NEdN}_{samp}$ multiplicative noise. Perhaps,
then, it is best just to note that for any Fourier-transform spectrometer

$$\mathrm{NEdN}_{tilt} \ge \mathrm{NEdN}^{(min)}_{tilt}$$  (8.51a)

and

$$\mathrm{NEdN}_{samp} \ge \mathrm{NEdN}^{(min)}_{samp} ,$$  (8.51b)

with $\mathrm{NEdN}^{(min)}_{tilt}$ and $\mathrm{NEdN}^{(min)}_{samp}$ specified by Eqs. (8.50g) and (8.50i) above.
Inequalities such as the ones in (8.51a) and (8.51b) can be very useful. If a proposed
interferometer design, with a good guess of $\mathcal{L}^{(unk)}_{FOV}$ based on Eq. (8.46d), produces
unacceptably large values for $\mathrm{NEdN}^{(min)}_{tilt}$ and $\mathrm{NEdN}^{(min)}_{samp}$, then the
design fails, because there is no way the true NEdNs of the actual instrument can be smaller. The
multiplicative noise in the system must be reduced before further progress can be made.
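
The screening logic in this paragraph can be written as a short check. The sketch below is a hypothetical helper, not a procedure given in the text: it assumes the minimum NEdN values have already been computed on some wavenumber grid (for instance from formulas of the kind in (8.50g) and (8.50i)) and that a single noise budget per spectral point is available. The function name, the arguments, and the numbers are invented for illustration.

```python
import numpy as np

def design_can_meet_budget(nedn_tilt_min, nedn_samp_min, budget):
    """Return False if either lower bound already exceeds the budget.

    Because the true NEdN values can never be smaller than these minima,
    a False result rules the design out. A True result is only a
    necessary condition, not a guarantee that the design will work.
    """
    tilt = np.asarray(nedn_tilt_min, dtype=float)
    samp = np.asarray(nedn_samp_min, dtype=float)
    limit = np.asarray(budget, dtype=float)
    return bool(np.all(tilt <= limit) and np.all(samp <= limit))

# Hypothetical values at three spectral points, arbitrary radiance units.
budget = np.array([1.0, 1.0, 1.0])
passes = design_can_meet_budget([0.2, 0.3, 0.4], [0.1, 0.5, 0.9], budget)
fails = design_can_meet_budget([0.2, 1.3, 0.4], [0.1, 0.5, 0.9], budget)
print("first design survives screening:", passes)
print("second design survives screening:", fails)
```

A design that fails this check needs lower multiplicative noise before any other part of the error budget is worth refining.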

BIBLIOGRAPHY

Articles
Bell, E. E., and Sanderson, R. B. "Spectral Errors Resulting From Random Sampling-Position
Errors in Fourier Transform Spectroscopy." Applied Optics, 11, no. 3 (March 1972), pp.
688–689.
Cohen, D. "Characterization of a Space-Class Fourier Transform Spectrometer Against a
Detailed Performance Model." IEEE Aerospace Conference at Snowmass, CO (March
1999).
Cohen, D. "Noise-Equivalent Change in Radiance for Misalignment Noise in a Double-Sided
Interferogram." Applied Optics, 42, no. 31 (1 November 2003), pp. 6292–6304.
Cohen, D. "Noise-Equivalent Change in Radiance for Sampling Noise in a Double-Sided
Interferogram." Applied Optics, 42, no. 13 (1 May 2003), pp. 2289–2300.
Cohen, D. "Performance Degradation of a Michelson Interferometer When Its Misalignment
Angle Is a Rapidly Varying Random Time Series." Applied Optics, 36, no. 18 (20 June
1997), pp. 4034–4041.
Cohen, D. "Performance Degradation of a Michelson Interferometer Due to Random Sampling
Errors." Applied Optics, 38, no. 1 (1 January 1999), pp. 139–151.
Forman, Michael L., W. Howard Steel, and George A. Vanasse. "Correction of Asymmetric
Interferograms Obtained in Fourier Spectroscopy." Journal of the Optical Society of
America, 56, no. 1 (January 1966), pp. 59–63.
Haschberger, Peter. "Impact of the Sinusoidal Drive on the Instrumental Line Shape Function of
a Michelson Interferometer with Rotating Retroreflector." Applied Spectroscopy, 48, no. 3
(1994), pp. 307–315.
Hirschfeld, Tomas. "Multiple Order Spectra in Fourier Transform Infrared Spectroscopy."
Applied Optics, 16, no. 7 (July 1977), pp. 1905–1907.
Kauppinen, Jyrki, and Pekka Saarinen. "Line-Shape Distortions in Misaligned Cube Corner
Interferometers." Applied Optics, 31, no. 1 (January 1992), pp. 69–73.
Lambert, D. K., and P. L. Richards. "New Results in the Theory of a Plane-Mirror
Interferometer." Journal of the Optical Society of America, 68, no. 8 (August 1978), pp.
1124–1130.
Learner, R. C. M., A. P. Thorne, and J. W. Brault. "Ghosts and Artifacts in Fourier-Transform
Spectrometry." Applied Optics, 35, no. 16 (June 1996), pp. 2947–2953.
Loewenstein, Ernest V. "Fourier Spectroscopy: An Introduction." Proceedings of the Aspen
International Conference on Fourier Spectroscopy, Aspen, CO (March 16–20, 1970), pp.
3–17.
Mattson, David R. "Sensitivity of a Fourier Transform Infrared Spectrometer." Applied
Spectroscopy, 32, no. 4 (1978), pp. 335–338.
Michelson, Albert A. "The Relative Motion of the Earth and the Luminiferous Ether." The
American Journal of Science, 22 (Second Series, 1881), pp. 120–129.
Michelson, Albert A., and Edward W. Morley. "On the Relative Motion of the Earth and the
Luminiferous Ether." The American Journal of Science, 34, no. 203 (Third Series, 1887),
pp. 333–345.
Miller, Dayton C. "The Ether Drift Experiment and the Determination of the Absolute Motion of
the Earth." Nature, 133 (3 February 1934), pp. 162–164.
Miller, Dayton C. "The Ether-Drift Experiment and the Determination of the Absolute Motion of
the Earth." Reviews of Modern Physics, 5 (July 1933), pp. 203–242.
Murty, M. V. R. K. "Modification of Michelson Interferometer Using Only One Cube-Corner
Prism." Journal of the Optical Society of America (Letters to the Editor), 50, no. 1, pp.
83–84.
Murty, M. V. R. K. "Some More Aspects of the Michelson Interferometer with Cube Corners."
Journal of the Optical Society of America, 50, no. 1 (January 1960), pp. 7–10.
Nishiyama, Taichiro, Takashi Yamauchi, Masanao Ohno, Masao Morii, Nobuo Ura, and Koji
Masutani. "New Sampling Method in Fourier Spectroscopy." Japanese Journal of Applied
Physics, 14, Suppl. 14-1, (1975), pp. 67–69.
Park, Jae H. "Analysis and Application of Fourier Transform Spectroscopy in Atmospheric
Remote Sensing." Applied Optics, 23, no. 15 (1 August 1984), pp. 2604–2607.
Park, Jae H. "Analysis Method for Fourier Transform Spectroscopy." Applied Optics, 22, no. 6
(15 March 1983), pp. 835–849.
Park, Jae H. "Effect of Interferogram Smearing on Atmospheric Limb Sounding by Fourier
Transform Spectroscopy." Applied Optics, 21, no. 8, (15 April 1982), pp. 1356–1366.
Raspollini, Piera, Peter Ade, Bruno Carli, and Marco Ridolfi. "Correction of Instrument
Line-Shape Distortions in Fourier Transform Spectroscopy." Applied Optics, 37, no. 17 (10 June
1998), pp. 3697–3704.
Revercomb, Henry E., H. Buijs, Hugh B. Howell, D. D. Laporte, William L. Smith, and L. A.
Sromovsky. "Radiometric Calibration of IR Fourier Transform Spectrometers: Solution to
a Problem with the High-Resolution Interferometer Sounder." Applied Optics, 27, no. 15 (1
August 1988), pp. 3210–3218.
Saarinen, Pekka, and Jyrki Kauppinen. "Spectral Line-Shape Distortions in Michelson
Interferometers due to Off-Focus Radiation Source." Applied Optics, 31, no. 13, (1 May
1992), pp. 2353–2359.
Sakai, H. "Consideration of the Signal-to-Noise Ratio in Fourier Spectroscopy." Proceedings of
the Aspen International Conference on Fourier Spectroscopy, Aspen, CO (March 16–20,
1970), pp. 19–40.
Sakai, H., and G. A. Vanasse. "Spectral Recovery in Fourier Spectroscopy." Journal of the
Optical Society of America, 58, no. 1 (January 1968), pp. 84–90.
Schumann, L. W., T. S. Lomheim, and J. F. Johnson. "Design Constraints on Advanced
Two-Dimensional LWIR Focal Planes for Imaging Fourier Transform Spectrometer Sensors."
Paper 3063-13 presented at Aerosense: the SPIE International Symposium on Aerospace
and Defense Sensing, Simulation, and Controls at Orlando, FL (April 1997).
Shankland, R. S., S. W. McCuskey, F. C. Leone, and G. Kuerti. "New Analysis of the
Interferometer Observations of Dayton C. Miller." Reviews of Modern Physics, 27, no. 2
(April 1955), pp. 167–178.
Shaw, J. E. "Spectroradiometry over Broad Spectral Regions by Fourier Spectroscopy." Journal
of the Optical Society of America, 57, no. 9 (September 1967), pp. 1136–1140.
Stroke, George W. "Photoelectric Fringe Signal Information and Range in Interferometers with
Moving Mirrors." Journal of the Optical Society of America, 47, no. 12 (December 1957),
pp. 1097–1103.
Tanner, D. B., and R. P. McCall. "Source of a Problem with Fourier Transform Spectroscopy."
Applied Optics, 23, no. 14 (15 July 1984), pp. 2363–2368.
Williams, Charles S. "Mirror Misalignment in Fourier Spectroscopy Using a Michelson
Interferometer with Circular Aperture." Applied Optics, 5, no. 6 (June 1966), pp.
1084–1085.
Yap, B. K., W. A. M. Blumberg, and R. E. Murphy. "Off-Axis Effects in a Mosaic Michelson
Interferometer." Applied Optics, 21, no. 22 (15 November 1982), pp. 4176–4182.
Zachor, Alexander S. "Drive Nonlinearities: Their Effects in Fourier Spectroscopy." Applied
Optics, 16, no. 5 (May 1977), pp. 1412–1424.
Zachor, Alexander S., and Steve M. Aaronson. "Delay Compensation: Its Effect in Reducing
Sampling Errors in Fourier Spectroscopy." Applied Optics, 18, no. 1 (1 January 1979), pp.
68–75.

Books

Abramowitz, Milton, and Irene A. Stegun (eds.). Handbook of Mathematical Functions (National
Bureau of Standards, Applied Mathematics Series 55, Washington, DC, 1964).
Bass, Michael (ed.). Handbook of Optics, Vols. I and II, 2nd ed. (Optical Society of America,
McGraw-Hill, Inc., New York, 1995).
Batygin, V. V., and I. N. Toptygin. Problems in Electrodynamics (Academic Press, New York,
1964).
Beer, Reinhard. Remote Sensing by Fourier Transform Spectrometry (John Wiley & Sons, Inc.,
New York, 1992).
Beers, Yardley. Introduction to the Theory of Error, 2nd ed. (Addison-Wesley Publishing
Company, Inc., Reading, MA, 1957).
Bennett, Jean M., and Lars Mattson. Introduction to Surface Roughness and Scattering (Optical
Society of America, Washington, DC, 1989).
Blake, Ian F. An Introduction to Applied Probability (John Wiley & Sons, Inc., New York, 1979).
Bois, G. Petit. Tables of Indefinite Integrals (Dover Publications, Inc., New York, 1961),
unabridged translation of a book first published by B. G. Teubner in 1906.
Born, Max, and Emil Wolf. Principles of Optics: Electromagnetic Theory of Propagation,
Interference, and Diffraction of Light, 7th exp. ed. (Cambridge University Press, New
York, 1999).
Bracewell, Ron. The Fourier Transform and Its Applications (McGraw-Hill Book Company,
New York, 1965).
Chamberlain, John. The Principles of Interferometric Spectroscopy (John Wiley & Sons, New
York, 1979).
Champeney, D. C. A Handbook of Fourier Theorems (Cambridge University Press, New York,
1987).
Chandrasekhar, S. Radiative Transfer (Dover Publications, Inc., New York, 1960), slightly
revised from 1950 book.
Cohen, D. Demystifying Electromagnetic Equations: A Complete Explanation of EM Unit
Systems and Equation Transformations (SPIE Press, Bellingham, WA, 2001).
Davenport, Wilbur B., Jr., and William L. Root. An Introduction to the Theory of Random Signals
and Noise (McGraw-Hill Book Company, Inc., New York, 1958).
Davis, Sumner P., Mark C. Abrams, and James W. Brault. Fourier Transform Spectrometry
(Academic Press, New York, 2001).
Defense Supply Agency, Standardization Division. Military Standardization Handbook Optical
Design, MIL-HDBK-141, 5 October 1962.
Dereniak, Eustace L., and Devon G. Crowe. Optical Radiation Detectors (John Wiley & Sons,
Inc., New York, 1984).
Ditchburn, R. W. Light, Vols. 1 and 2, 2nd ed. (Interscience Publishers, a division of John Wiley
& Sons, Inc., New York, 1963).
Evans, Merran, Nicholas Hastings, and Brian Peacock. Statistical Distributions, 2nd ed. (John
Wiley & Sons, Inc., New York, 1993).
Eyges, Leonard. The Classical Electromagnetic Field (Dover Publications, Inc., New York,
1980), an unabridged and corrected edition of 1972 book published by Addison Wesley.
Francon, M. Optical Interferometry (Academic Press, New York, 1966).
Freeman, J. J. Principles of Noise (John Wiley & Sons, Inc., New York, 1958).
Gabel, Robert A., and Richard A. Roberts. Signals and Linear Systems, 2nd ed. (John Wiley and
Sons, Inc., New York, 1980).
Gaskill, Jack D. Linear Systems, Fourier Transforms, and Optics (John Wiley & Sons, Inc., New
York, 1978).
Goldstein, D. Polarized Light, 2nd ed. (Marcel Dekker, Inc., New York, 2003).
Goodman, Joseph W. Introduction to Fourier Optics (McGraw-Hill, Inc., New York, 1988),
reissue of 1968 book.
Goodman, Joseph W. Statistical Optics (John Wiley & Sons, New York, 1985).
Goody, R. M., and Y. L. Yung. Atmospheric Radiation: Theoretical Basis, 2nd ed. (Oxford
University Press, New York, 1989).
Gradshteyn, I. S., and I. M. Ryzhik. Table of Integrals, Series, and Products, 5th ed., edited by
Alan Jeffrey (Academic Press, New York, 1994).
Griffiths, David J. Introduction to Electrodynamics, 2nd ed. (Prentice-Hall, Englewood Cliffs,
NJ, 1989).
Griffiths, Peter R., and James A. de Haseth. Fourier Transform Infrared Spectrometry (John
Wiley and Sons, Inc., New York, 1986).
Heavens, O. S. Optical Properties of Thin Solid Films (Butterworths Scientific Publications,
London, 1955).
Hecht, Eugene. Optics, 2nd ed., with contributions by Alfred Zajac (Addison-Wesley Publishing
Company, Reading, MA, 1987).
Helstrom, Carl W. Statistical Theory of Signal Detection, 2nd ed. (Pergamon Press, New York,
1968).
Jackson, John David. Classical Electrodynamics, 3rd ed. (John Wiley & Sons, Inc., New York,
1999).
Jaffe, Bernard. Michelson and the Speed of Light (Anchor Books, Doubleday and Company, Inc.,
New York, 1960).
Jeffrey, Alan. Handbook of Mathematical Formulas and Integrals (Academic Press, Inc., New
York, 1995).
Jenkins, F., and H. White. Fundamentals of Optics, 3rd ed. (McGraw-Hill Book Company, New
York, 1957).
Kay, Steven M. Fundamentals of Statistical Signal Processing: Estimation Theory (PTR Prentice
Hall, Inc., Englewood Cliffs, NJ, 1993).
Keigo, Iizuka. Engineering Optics, rev. translation of the 2nd original Japanese ed. (Springer-
Verlag, New York, 1983).
Klambauer, Gabriel. Aspects of Calculus (Springer-Verlag, New York, 1986).
Klein, Miles V. Optics (John Wiley & Sons, Inc., New York, 1970).
Kusse, Bruce, and Eric Westwig. Mathematical Physics: Applied Mathematics for Scientists and
Engineers (John Wiley and Sons, New York, 1998).
Lamb, H. Hydrodynamics (Dover Publications, Inc., New York, 1945), copy of the 6th ed. first
published in 1879.
Landau, L. D., and E. M. Lifshitz. Electrodynamics of Continuous Media, translated from the
Russian by J. B. Sykes and J. S. Bell (Pergamon Press, New York, 1960).
Landau, L. D., and E. M. Lifshitz. The Classical Theory of Fields, 3rd rev. English ed., translated
from the Russian by Morton Hamermesh (Pergamon Press, New York, 1971).
Lathi, B. P. An Introduction to Random Signals and Communication Theory (International
Textbook Company, Scranton, PA 1968).
Lighthill, M. J. Introduction to Fourier Analysis and Generalized Functions (Cambridge
Livingston, Dorothy Michelson. The Master of Light: A Biography of Albert A. Michelson
(Charles Scribner's Sons, New York, 1973).
Mandel, Leonard, and Emil Wolf. Optical Coherence and Quantum Optics (Cambridge
Michelson, A. A. Light Waves and Their Uses (The University of Chicago Press, Chicago, 1903).
Michelson, A. A. Studies in Optics (The University of Chicago Press, Chicago, 1927).
Mobley, Curtis D. Light and Water: Radiative Transfer in Natural Waters, based in part on
collaborations with Rudolf W. Preisendorfer (Academic Press, New York, 1994).
Morse, P., and K. Ingard. Theoretical Acoustics (McGraw-Hill, Inc., New York, 1968).
Morse, Philip M., and Herman Feshbach. Methods of Theoretical Physics, Parts I and II
(McGraw-Hill Book Company, Inc., New York, 1953).
O'Neill, Edward L. Introduction to Statistical Optics (Dover Publications, Inc., New York,
copyright 1963, 1991 by Edward O'Neill).
Papoulis, Athanasios. Systems and Transforms with Applications in Optics (McGraw-Hill
Publishing Company, Inc., New York, 1968).
Papoulis, Athanasios. The Fourier Integral and Its Applications (McGraw-Hill, Inc., New York,
copyright 1962 and 1987).
Papoulis, Athanasios. Probability, Random Variables, and Stochastic Processes, 3rd ed.
(McGraw-Hill, Inc., New York, 1991).
Papoulis, Athanasios. Signal Analysis (McGraw-Hill Book Company, New York, 1977).
Porat, Boaz. A Course in Digital Signal Processing (John Wiley & Sons, Inc., New York, 1997).
Press, William H., Saul A. Teukolsky, William T. Vetterling, and Brian P. Flannery. Numerical
Recipes in C: The Art of Scientific Computing, 2nd ed. (Cambridge University Press, New
York, 1992).
Rade, Lennart, and Bertil Westergren. Beta Mathematics Handbook, 2nd ed. (CRC Press, Boca
Raton, FL, 1990).
Rybicki, George B., and Alan P. Lightman. Radiative Processes in Astrophysics (John Wiley &
Sears, Francis Weston. Optics, 3rd ed. (Addison-Wesley Publishing Company, Reading, MA,
1949).
Slater, John C., and Nathaniel H. Frank. Electromagnetism (Dover Publications, Inc., 1947).
Sneddon, Ian N. Fourier Transforms (Dover Publications, Inc., New York, 1995) an unabridged
and unaltered version of the book published by McGraw-Hill in 1951.
Sommerfeld, Arnold, Optics, Lectures on Theoretical Physics, Vol. IV, (Academic Press, New
York, 1964).
Soong, T. T. Random Differential Equations in Science and Engineering (Academic Press, New
York, 1973).
Sparrow, E. M., and R. D. Cess. Radiation Heat Transfer, augmented ed. (Hemisphere Publishing
Corporation, New York, 1978).
Staff of the Bateman Manuscript Project. Tables of Integral Transforms, Vol. I and II, (McGraw-
Hill Book Company, Inc., New York, 1954).
Steel, W. H. Interferometry (Cambridge University Press, New York, 1967).
Stokes, G. Mathematical and Physical Papers, Vol. III, (Cambridge University Press, New York,
1901).
Stone, John M. Radiation and Optics: An Introduction to the Classical Theory (McGraw-Hill
Book Company, Inc., New York, 1963).
Thomas, John B. An Introduction to Applied Probability and Random Processes (John Wiley &
Thomson, J. H., and F. G. Smith. Optics (John Wiley & Sons, Ltd., New York, 1971).
Thorne, Anne P. Spectrophysics, 2nd ed. (Chapman and Hall, New York, 1988).
Valasek, Joseph. Elements of Optics (McGraw-Hill Book Company, Inc., New York, 1932).
Vincent, John David. Fundamentals of Infrared Detector Operation and Testing (John Wiley and
Wax, Nelson (ed.). Selected Papers on Noise and Stochastic Processes (Dover Publications, Inc.,
New York, 1954).
Weast, Robert C. (ed.). Handbook of Chemistry and Physics, 51st ed. (The Chemical Rubber
Company, Cleveland, OH, 1970–1971).
Whittaker, Edmund. A History of the Theories of Aether and Electricity, Vols. I and II (Tomash
Publishers and American Institute of Physics, 1951), published by the Philosophical
Library, copyright 1987 by American Institute of Physics).
Williams, W. Ewart. Applications of Interferometry, 4th ed. (John Wiley & Sons, Inc., New
York, 1950).
Wirsching, Paul H., Thomas L. Paez, and Keith Ortiz. Random Vibrations: Theory and Practice
(John Wiley and Sons, Inc., New York, 1995).
Wolfe, William L., and Zissus, George J. (eds.). The Infrared Handbook, rev. ed., (Infrared
Information Analysis [IRIA] Center for the Office of Naval Research, first ed. 1978,
revised ed. 1985).
Wyatt, Clair L. Radiometric Calibration: Theory and Methods (Academic Press, Inc., New York,
1978).
Yariv, Amnon, and Pochi Yeh. Optical Waves in Crystals: Propagation and Control of Laser
Radiation (Wiley Interscience, Hoboken, NJ, 2003).
Zemanian, A. H. Distribution Theory and Transform Analysis: An Introduction to Generalized
Functions with Applications (Dover Publications, Inc., New York, 1965), copyright by
Zemanian, an unabridged, slightly corrected version of 1965 book published in 1987 by
McGraw-Hill.

Index
1/f noise, 6.7, 764-767

A
A/D converter. See analog-to-digital converter
absolutely integrable, 69, 74, 75, 108
absorption line, 742, 806, 1022. See also emission line
AC coupling, 619, 630, 750, 752, 757, 880
additive noise, 1016, 1028
aft optics, 5.8, 605, 607-609, 618, 626, 628-630, 632, 635,
659, 700, 986, 1025
alias, 195, 200, 716-720, 723
aliasing, 188, 195-198, 200, 218, 708, 709, 715, 716, 851,
853, 854
amplitude-reflection coefficient, 368, 407, 412, 413, 467,
473, 475
amplitude-transmission coefficient, 357-359, 362, 367, 407,
473
analog-to-digital converter, 8.1, 8.2, 555, 696, 697, 749, 849,
850, 953, 954
angle-wavenumber transform, 4.8, 380, 382, 386, 391, 393,
394
anti-aliasing filter, 6.22, 7.4, 555, 849, 853, 879, 880
anti-Hermitian function, 101, 220, 222
apodization, 5.16, 650, 654-656
apodizing, 654, 656
approximation, gray-body. See gray-body approximation
artificially created even signals, 3.27, 319
autocorrelation, 3.13, 249, 319, 322, 523, 903
autocorrelation function, 8.3, 223, 250, 251, 258, 274, 275,
277, 278, 280, 281, 284, 285, 288, 290, 299, 301, 304, 319,
329, 448, 522, 524, 791, 794, 860-862, 903, 904, 912, 948,
956, 957, 977, 1012, 1014
autocovariance, 3.13, 249-251, 258
avoidable misalignment noise, 7.7, 895
avoidable noise, 6.8, 767-770, 787, 788, 843, 848, 901, 1016

B
background-limited infrared proton, 807. See also BLIP
background radiance, 4.18, 5.13, 6.3, 6.4, 6.5, 465, 466, 468,
473, 474, 476, 479-481, 483-486, 555, 587, 626, 628-631,
639-641, 664, 681, 686, 698, 752, 753, 755, 758, 782, 806,
852, 911, 930, 934, 935, 941, 986, 1006, 1007, 1021, 1025,
1028, 1034, 1038
balanced background signal, 330, 464, 474, 475, 585
balanced output, 46, 47, 49, 54, 438
balanced radiation field, 4.15, 394, 415, 417, 438, 552
balanced signal, 4.16, 5.4, 56, 453, 454, 464, 465, 551, 573,
575, 582-585, 587, 588, 591, 594, 599, 602, 603, 605, 606,
608, 610, 611, 616, 630-632
band-limited function, 200, 393
band-limited radiation, 4.10, 390, 428
band-limited white noise, 3.25, 6.13, 299, 301, 767, 795,
798, 807, 808, 812, 817, 924
bandwidth, 301, 795, 808
beam-chopped radiation, 4.9, 4.14, 383-385, 391, 394, 427,
430, 444, 448, 470, 522-526, 555
beam splitter, 2, 3, 7, 12, 14, 18, 19, 22, 24, 26, 28, 31, 42,
44, 46, 47, 54, 58, 59, 355, 394-400, 406, 407, 411-415,
456, 464, 466, 467, 470, 472, 474, 478, 479, 481, 489, 536-
538, 543, 574, 575, 577, 585, 586, 602, 608, 698, 749
Bessel function, 456, 462, 485, 530, 630, 871
bias angle, 922, 927-929, 931, 939
bias-tilt angle, 870, 912, 948, 951, 1030
black-body, 559, 605, 756, 930-932, 934, 939, 987, 991
black-body spectrum, 8.8, 929, 931-933, 986-988, 991, 992,
994, 996, 1007
BLIP, 807, 814
Boltzmann's constant, 559
bounded function, 69, 71

C
cadmium red line, 24, 28
calibration, 5.19, 465, 487, 488, 555, 681, 683, 685, 704,
725, 726, 742, 753, 762, 764, 766, 767, 782, 784, 822, 843,
853, 893, 932, 953, 954, 966, 987, 1020, 1021
calibration algorithm, 685, 686, 762, 782-785, 806, 820,
853, 865, 891, 892, 964-967, 1016, 1019
Cassegrain telescope, 385, 386, 389
cat's-eye, 54
Cauchy, Augustin, 1
Cauchy principal value, 82, 118-122, 140, 142-144, 158
causal system, 729
central dark fringe, 12, 14, 22, 26
central fringe, 12, 14, 31
central limit theorem, 3.11, 227, 243, 246, 248
characteristic function, 231, 267
co-adding, 764
coefficient
amplitude-reflection, 368, 407, 412, 413, 467, 473, 475
amplitude-transmission, 357-359, 362, 367, 407, 473
power reflection, 478, 481
power transmission, 478, 481
compensator plate, 7, 14, 394-397, 399, 412, 456, 466, 467,
473, 474, 489, 533-536, 538-541, 543-546, 549, 551, 553,
554
complex scalar field, 403-409, 476
complex vector field, 335, 490-498
constructive interference, 46
continuous function, 49, 64, 65, 161, 188, 195, 200, 202,
204, 217, 727
convolution, 110-115, 130, 132, 161, 162, 175, 202, 204,
218, 282, 284, 312, 313, 625, 626, 646, 650, 679, 683, 684,
702, 703, 727, 738, 741, 771, 772, 774, 775, 777, 778, 823,
826, 830, 832-837, 858, 879-883, 887-889, 900, 908, 910,
922, 924, 925, 939, 941, 963, 969, 970, 984, 1002, 1013,
1028, 1034
three-dimensional, 215
two-dimensional, 211, 212, 214-216
corner cube, 54, 55, 56
corner frequency, 766
correlated random variable, 239, 265, 913, 914
cosine curve, 66, 67, 218
cosine transform, 2.2, 2.4, 67, 68, 70, 73-75, 80, 81, 83-86,
89-91, 93, 95, 96, 98, 103, 218, 463, 464, 780
coupling, AC. See AC coupling
covariance, 237, 262
covariance stationary, 258
cross-correlation function, 259, 281, 283, 913, 948, 950, 951
cross-power spectrum, 281, 282, 286, 913, 914, 948-952
curl, 331, 335, 492, 493

D
D*, 813, 820
D-limited Fourier transform, 779, 792, 800, 840, 889, 898,
900, 901, 961
D-star, 813
delta function, 144-146, 148-154, 157, 158, 161, 162, 169,
172, 191-193, 216-218, 229, 279, 301, 312, 431, 445, 474,
580, 581, 620, 626, 647, 727, 729, 813, 819, 999, 1007,
1012-1014, 1016
nth derivative of, 154
dependent random variable, 3.5, 3.9, 223, 263
derivative
of a generalized function, 130
of the delta function, 153, 154
destructive interference, 46
detector circuit, 5.10, 6.9, 6.22, 7.4, 617-619, 621, 622, 624-
626, 630, 636-641, 643, 656, 667, 668, 674, 681, 685, 686,
696, 698, 699, 727, 730, 748, 750, 768-771, 777, 821, 822,
849, 853, 879, 880, 897
detector NEdN, 8.11, 1024
detector noise, 6.6, 6.9, 6.10, 6.12, 6.13, 6.14, 6.17, 6.18,
6.19, 6.20, 726, 742, 763, 764, 766-770, 772, 786-789,
791, 792, 794, 795, 800, 806, 807, 814, 815, 817, 819-821,
823, 828, 829, 840, 844-846, 848, 849, 853, 894, 941, 953,
964, 1016, 1027, 1031, 1038
detector responsivity, 611, 612, 632, 751, 759, 808, 874
detector signal, 5.9, 611, 618, 626, 629, 698, 768, 822, 874
DFT, 182, 183, 185, 187, 188, 190, 192, 195, 197, 699, 849,
851. See also discrete Fourier transform
Dirac delta function, 144
direction-chopped radiation, 500, 555
direction cosines, 339
discrete Fourier transform, 5.23, 62, 173, 181, 182, 218, 555,
699, 704, 708, 709, 713-716, 720, 722, 723, 849. See also
DFT
distribution theory, 121
distributions, 121, 246, 870, 873, 914
divergence, 335
divergent integral, 117
dot product, 349, 373, 405, 490, 494, 495
double-sided interferogram, 5.15, 555, 643, 646, 650, 667,
677, 680-683, 701, 742, 850, 853, 865, 953, 959, 975
double-sided NEdN, 848
double-sided power spectrum, 289, 296, 297, 455, 474, 483,
766, 795, 812, 819, 842, 846, 903
double-sided signal, 6.8, 6.10, 6.14, 6.16, 7.5, 7.11, 682,
767, 772, 787, 789, 800, 814, 815, 817, 842, 843, 845, 848,
882, 884, 889, 890, 909, 955

E
Earth's orbital velocity, 14, 19, 23
effective spectrum, 5.11, 5.12, 622-624, 644, 645, 650, 663,
665, 666, 668, 674, 677, 681, 686, 690, 692, 700, 702, 708
Einstein, Albert, 1, 23
elastic vibrations, 2, 43
electric field, 522
electromagnetic radiation, 5.1, 55, 372, 394, 555, 556, 611,
810
electromagnetic wave, 4.1, 4.2, 329, 330, 335, 339, 360,
428, 432, 489
emission line, 8.9, 742, 930, 934-936, 939, 941, 985, 996,
998, 999, 1002, 1006, 1007, 1022
ensemble, 3.14, 251-253, 279, 280, 303, 765, 798, 800, 869
ensemble average, 271, 272, 274, 278, 280
ergodic, 272, 274, 277, 279, 791
in the autocorrelation function, 274, 275, 277, 278
in the mean, 271, 272, 274, 277-279
in the variance, 277, 278
random function, 3.18, 271, 274, 275, 279, 280
ergodicity, 223, 279, 280, 301, 329
ether, 1, 2, 14, 23, 31, 330
drift, 23
luminiferous, 1, 2, 14
stationary, 14, 23
wind, 1, 1.2, 14, 23, 24, 26, 54
even function, 2.3, 51, 76, 77, 79-84, 86, 88, 111, 114, 119,
121, 128, 135, 139, 140, 206, 268, 281, 282, 288, 296, 303,
307, 319, 327, 438, 454, 460-463, 471, 477, 483, 485-488,
575, 577, 580, 584, 589, 590, 601, 603, 604, 606, 610, 613-
615, 617, 624-627, 630, 631, 633-635, 655, 668, 672-674,
680, 684, 703, 729, 732, 746, 767, 768, 775, 786, 787, 790,
792, 800, 815, 818, 819, 826-828, 840, 843, 844, 848, 891,
895-897, 899, 900, 904, 913, 924, 945, 950, 952, 957, 968,
969, 971, 976, 1033, 1034
expectation operator, 3.4, 3.10, 230, 232, 239-243, 245, 247,
248, 250, 258-260, 266, 268, 270, 271, 283, 284, 290, 314,
318, 321, 329, 432, 440, 445, 745, 761, 781, 809, 814, 817,
841, 842, 878, 889, 890, 892, 902, 915, 916, 918, 921, 956,
962, 963, 965, 971, 977, 981, 1017, 1022, 1023

F
fast-Fourier transform, 55, 96, 699
Fellgett advantage, 55
FFT, 55, 188, 699. See also fast-Fourier transform
field of view, 22, 26, 28, 31, 330, 453-455, 460, 461, 472,
483, 485, 573, 588, 594, 601, 603, 605, 608, 612, 626, 627,
630, 637, 639, 645, 656, 659-661, 665, 667, 683, 686, 731,
753, 754, 756-758, 809, 857, 931, 986, 1034
filter theory, 117
finite field of view, 5.17, 656, 673, 676, 684, 685, 726, 744,
775, 783, 857, 859, 887, 891, 894, 935, 936, 991, 998,
1026, 1032, 1034
finite variation, 69, 73, 141
fixed mirror, 24, 26, 27, 33, 35, 44, 58, 394-397, 399, 412,
466, 553, 574, 667, 692, 696, 698, 749
focal plane, 385, 387, 594-597, 599, 600, 688
fore optics, 607-609, 618, 626-628, 634, 635, 749, 934, 986,
1006
Fourier convolution theorem, 2.9, 2.17, 110, 112, 114, 115,
159, 160, 162, 176, 202, 204, 212-216, 218, 286, 292, 311,
625, 645, 646, 655, 679, 728, 773, 777, 778, 824, 829, 831,
885, 889, 906, 907, 919, 960, 975, 979, 1015
Fourier identities, 2.8, 103, 209
Fourier scaling theorem, 107, 210, 211
Fourier series, 2.20, 62, 173, 177-179, 181
Fourier shift theorem, 106, 209, 725, 1015
Fourier transform, 2.1, 2.5, 2.6, 2.7, 2.10, 2.13, 2.25, 3.23,
31, 50-52, 54, 57-59, 62, 70, 76, 89, 93-107, 109, 112, 114,
115, 117-122, 124, 136-142, 144, 157, 167, 168, 171, 173,
176, 178, 181, 182, 188, 194, 197, 200, 202, 204, 207-210,
213-215, 218, 231, 281, 282, 285-290, 292, 297-299, 302,
303, 310, 311, 371, 372, 381-384, 391, 393, 426, 447, 449,
451, 456, 464, 488, 525, 605, 610, 614, 620, 623, 625, 626,
639-641, 643-646, 650, 654-656, 677-680, 683, 699, 704,
708, 709, 715, 728-730, 754, 756, 757, 772-775, 777-780,
786-788, 790, 792, 794, 822, 824, 826, 829-831, 833-835,
837, 838, 840, 849, 860, 862, 866, 880, 884-886, 888, 889,
895, 896, 899, 904, 913, 914, 919, 920, 922, 957, 959, 961,
962, 974, 975, 979, 980, 1029, 1030
Fourier transform of generalized functions, 144, 159, 167,
168
Fourier transform of the delta function, 2.16, 157
Fourier transform of the shah function, 2.19, 165, 171
Fourier transform pairs, 143, 159, 160, 206, 319, 677, 703
frequency, 24, 31, 34, 37, 39-41, 43, 45, 47, 49, 51, 67, 94-
96, 107, 108, 115, 121, 188, 190, 192, 195, 196, 198, 200,
201, 203, 224, 249, 289, 297, 298, 314, 319, 533, 534, 538,
557, 558, 560, 723, 766, 808-811, 813, 819, 820, 853-855,
931. See also Nyquist frequency
Fresnel, Augustin, 1
fringe, 1.6, 12, 14, 22-24, 26, 28-30, 47, 50, 52, 58
    central. See central fringe
    central dark. See central dark fringe
fringe shift, 23, 34, 54
function
    anti-Hermitian, 101, 220, 222
    autocorrelation, 8.3, 223, 250, 251, 258, 274, 275, 277,
    278, 280, 281, 284, 285, 288, 290, 299, 301, 304, 319,
    329, 448, 522, 524, 791, 794, 860-862, 903, 904, 912,
    948, 956, 957, 977, 1012, 1014
    band-limited, 200
    bounded, 69, 71
    characteristic, 231, 267
    continuous, 49, 64, 65, 161, 188, 195, 200, 202, 204, 217
    cross-correlation, 259, 281, 283, 913, 948, 950, 951
    delta, 144-146, 148-154, 157, 158, 161, 162, 169, 172,
    191-193, 216-218, 229, 279, 301, 312, 431, 445, 474,
    580, 581, 620, 626, 647, 727, 729, 813, 819, 999, 1007,
    1012-1014, 1016
    Dirac delta, 144
    even, 2.3, 51, 76, 77, 79-84, 86, 88, 111, 114, 119, 121,
    128, 135, 139, 140, 206, 268, 281, 282, 288, 296, 303,
    307, 319, 327, 438, 454, 460-463, 471, 477, 483, 485-
    488, 575, 577, 580, 584, 589, 590, 601, 603, 604, 606,
    610, 613-615, 617, 624-627, 630, 631, 633-635, 668,
    672-674, 680, 703, 729, 732, 746, 767, 768, 775, 778-
    780, 786, 790, 792, 800, 815, 819, 826-828, 840, 843,
    844, 848, 891, 895-897, 899, 900, 904, 913, 924, 945,
    950, 952, 957, 968, 969, 971, 976, 1033, 1034
    generalized, 2.11, 2.13, 2.17, 62, 121-130, 132, 136-139,
    141-145, 148, 152, 155, 156, 159-162, 167-170, 172,
    175, 218
    Hermitian, 101, 102, 219, 221, 282, 286-288, 372, 419,
    420, 425, 429, 624, 625, 668, 678, 729, 731, 786, 822,
    825, 880, 962, 968, 976
    impulse-response, 282, 285, 287, 625, 626, 727-730, 770,
    822, 879
    instrument line-shape, 114, 115, 648
    instrument-response, 114, 115, 647, 728, 729
    mixed, 2.3, 76, 77
    odd, 2.3, 76, 77, 79-85, 88, 90, 111, 119-121, 128, 129,
    139, 140, 142, 148, 158, 218, 228, 266, 267, 303, 314,
    459, 463, 488, 604, 615, 634, 674, 732, 788, 800, 899,
    946, 947, 950, 968
    random, 3.2, 3.13, 3.15, 3.23, 3.26, 223-225, 242, 249,
    250, 252, 253, 257-261, 271-275, 277-282, 284, 287-
    290, 296, 297, 299, 301-303, 319, 328, 432, 438, 522,
    523, 526, 535, 744, 746, 747, 760-766, 780, 792, 798,
    800, 815, 840, 844-847, 860, 869, 871, 873, 874, 876,
    877, 882, 892, 903, 911, 912, 914, 951, 953, 956, 962,
    988, 1012, 1013
    stationary random, 3.15, 252, 260, 261, 271, 279, 282,
    287, 303, 319, 791, 861, 862
    tapering, 678, 822, 825, 827
    test, 121-133, 135, 136, 138, 141, 142, 144-148, 151-154,
    161-164, 168-171, 173-175
    transfer, 285-287, 620-622, 624, 645, 661, 668, 674, 681,
    728-731, 733, 777, 778, 821, 822, 825, 831, 853, 880,
    888, 923, 931, 968, 976, 991, 994, 996
functional, 121-123, 126, 130, 144, 145

G
Gaussian
    multivariate, 261-263
    probability distribution, 227, 243, 246, 800
    random processes, 3.16, 261, 262, 279
generalized function, 2.11, 2.13, 2.17, 62, 121-132, 136,
137-139, 141-145, 148, 152, 155, 156, 159-162, 167-170,
172, 175, 218
generalized function, derivative of a, 130
generalized function theory, 62, 143, 144
generalized limit, 2.12, 132, 133, 135-138, 141, 142, 145-
147, 157, 160-162, 164, 167-171, 175, 177, 218
geometric optics, 383, 385
geometric series, 167, 168, 184
ghost line, 939, 941, 996, 998, 1002, 1007
gray-body approximation, 559, 935
Green, George, 1

H
Hartley transform, 87-89, 93, 99
Heaviside step function, 155, 156, 310, 320, 322, 828, 835,
840, 845
Haidinger rings, 593, 597
Hermitian function, 101, 102, 219, 221, 282, 286-288, 372,
419, 420, 425, 429, 624, 625, 668, 678, 729, 731, 786, 822,
825, 880, 962, 968, 976
Hertz, Heinrich, 1
homogeneous, 299, 524
homogeneous random field, 297, 523, 524

I
ILS, 647. See also instrument line shape
impulse-response function, 282, 285, 287, 625, 626, 727-
730, 770, 822, 879
independent random variable, 3.6, 233, 234, 236, 239, 243,
245, 260, 261, 280, 810, 811, 872, 914, 921, 927, 930
index of refraction, 355, 385, 532-534
information theory, 1031
infrared spectra, 55, 58, 464, 501, 626, 641, 752, 1034
instrument line shape, 647, 648
instrument line-shape function, 114, 115, 648
instrument-response function, 114, 115, 647, 728, 729
interference
    constructive, 46
    destructive, 46
interferogram, 5.24, 5.25, 197, 200, 463-465, 487, 488, 579-
581, 583-585, 603, 604, 610, 683, 704, 715, 721, 723, 726,
744, 764, 775, 782, 783, 807, 857, 859, 865, 887, 891, 894,
930, 935, 938, 986, 991, 998, 1012, 1026, 1031, 1034
interferogram signal, 5.22, 5.26, 197, 555, 580, 623-625,
642, 643, 650, 654, 656, 666-668, 678, 683, 696, 698, 699,
701, 704, 709, 715, 716, 723, 725, 742, 748, 764, 767, 775,
782, 798, 802, 804, 806, 857, 930, 1012, 1026, 1027
inverse Fourier transform, 2.5, 6.4, 137, 139, 168, 171, 194,
204, 208-211, 213-215, 280, 281, 287, 371, 372, 381, 382,
426, 464, 605, 610, 620, 621, 623-625, 668, 678, 680, 729,
750, 753, 767, 774, 822, 826
inverse-square law, 5.3, 571, 573

J
Jacquinot advantage, 55
jointly normal random variable, 3.17, 263, 266, 271, 914,
917
jointly wide-sense stationary, 259, 281, 913
Jones, 813
jump discontinuity, 69, 70, 80, 81, 119, 124

K
Kronecker delta, 185

L
laser-based servo controls, 1.8, 57, 59
light
    monochromatic, 1.3, 3, 24, 28, 31, 580
    speed of, 1, 19, 23, 31, 346, 559, 808
    white, 6, 7, 12, 14, 22, 26, 28, 40, 41, 44, 47, 50
linear combination, 125-127, 161, 727
linear operation, 2.6, 97, 99, 110, 727, 823
linear operator, 97, 240, 335, 496, 816
linear polarization, 4.4, 349-351, 355, 356, 362, 366
linear system, 3.21, 282, 285, 287, 288, 727, 729
Lorentz, Hendrik, 1
luminiferous ether. See ether, luminiferous

M
magnetic-induction field, 331, 332, 350, 368, 372
magnetic permeability, 331
Maxwell, James Clerk, 1
Maxwell's equations, 2, 50, 330, 344, 363, 496, 556
mean, 3.3, 3.8, 3.13, 62, 64, 226-230, 235, 240, 241, 243,
246-250, 260, 262-269, 271, 272, 278, 301, 798, 800, 811,
870, 871, 877, 902, 909, 912, 914, 921, 922, 932, 939, 945-
947, 956, 958, 972, 988, 1017, 1022, 1023
mercury green line, 24
Michelson, Albert, 1, 2, 12, 14, 22-24, 28, 31, 42, 49, 50, 52,
54
Michelson-based spectroscopy, 29
Michelson interferometer, 1.1, 1.4, 1.5, 2, 3, 4.11, 4.12, 4.13,
5.4, 5.5, 5.6, 5.7, 5.13, 14, 23, 24, 31, 41, 44, 46, 47, 50,
52, 54, 55, 62, 115, 117, 197, 200, 330, 355, 385, 390, 391,
394, 395, 400, 415, 427, 464, 481, 502, 534, 543, 551, 555,
573, 585, 588, 599, 626, 660, 667, 682, 683, 685, 781, 849,
853, 929
Michelson-Morley experiment, 1
Michelson's mistake, 22
mirror, fixed. See fixed mirror
mirror-misalignment NEdN, 865, 911
mirror-misalignment noise, 7.3, 7.8, 870, 873, 874, 879, 882,
884, 889, 890, 896-898, 909, 964, 967, 1031, 1034, 1035
mirror, moving. See moving mirror
misalignment angle, 7.2, 456, 459, 502, 504, 575, 692, 694,
867, 868, 878, 904, 927, 930, 932, 951, 952, 954, 1030
misalignment NEdN, 7.11, 8.11, 909, 926, 953, 1024
misalignment noise, 7.4, 7.5, 7.6, 7.7, 7.15, 726, 865, 879,
882, 891, 895, 929, 932, 933, 935, 939, 941, 953, 996,
1002, 1006, 1029, 1030
mixed function, 2.3, 76, 77
monochromatic beam, 7, 22, 26, 28, 37, 46
monochromatic light, 1.3, 3, 24, 28, 31, 580
monochromatic plane wave, 4.4, 4.12, 348-353, 357, 360,
362, 363, 368, 373, 395, 400, 403, 406, 411, 412, 415, 416,
465, 478, 500, 532-534, 538-540, 543-547, 549, 551, 554
monochromatic wavetrain, 4.3, 7, 14, 22, 37, 39-42, 44-47,
58, 344, 434, 522, 525
Morley, Edward, 1
moving mirror, 7.2, 24, 26, 28-31, 33, 44-47, 51, 52, 55, 57,
58, 394, 395, 401, 412, 414, 415, 454, 456, 459, 460, 473,
481, 502-504, 507, 510, 546, 547, 551, 552, 554, 574, 575,
577, 588, 591-594, 597, 602, 617, 630, 636, 667, 668, 675,
692, 694, 696, 726, 748, 749, 763, 768, 821, 867, 869, 871,
873, 878, 879, 1012, 1017
multidimensional Wiener-Khinchin theorem, 3.24, 297, 298,
434, 522, 525
multiplicative noise, 1027, 1035, 1037, 1038
multivariate Gaussian, 261-263

N
NEdN, 6.1, 6.16, 8.7, 742-745, 747, 763, 768, 807, 814, 815,
821, 844, 845, 848, 853, 865, 911, 930, 932, 933, 947, 953,
972, 973, 988, 990, 1002, 1007, 1023, 1038
    detector, 6.21, 8.11, 844, 1024
    double-sided, 848
    mirror-misalignment, 865, 911
    misalignment, 7.11, 8.11, 909, 926, 953, 1024
    sampling error, 8.11, 885, 953, 984, 1002, 1024
    sampling-noise, 953, 985, 1002
    single-sided, 848
noise
    additive, 1016, 1028
    avoidable, 6.8, 767-770, 787, 788, 843, 848, 901, 1016
    avoidable misalignment, 7.7, 895
    band-limited, 3.25, 299, 301
    band-limited white, 6.13, 87, 795, 798, 808, 812, 817, 924
    detector, 953, 964, 1016, 1027, 1031, 1038
    mirror-misalignment, 964, 967, 1031, 1034, 1035
    misalignment, 726, 953, 996, 1002, 1006, 1029, 1030
    multiplicative, 1027, 1035, 1037, 1038
    photon, 6.15, 806-808, 812, 814
    quasi-harmonic, 924, 926, 929, 930, 939, 987, 988, 999
    quasi-static sampling, 8.10, 988, 1002, 1007, 1012, 1020,
    1022, 1023, 1025
    sampling-position, 8.2, 8.3, 954-958, 976, 987, 988, 990,
    996, 998, 1002, 1006, 1012-1014, 1020, 1022
    signal, 6.5, 225, 280, 753, 759, 843, 848, 853, 900, 1026
    unavoidable, 6.8, 767-770, 786-788, 843, 900, 969, 1016
    unavoidable misalignment, 7.7, 895
    white, 223, 301
noise-equivalent change in radiance, 742, 743. See also
NEdN
noise-power spectrum, 223, 312, 328, 765, 766, 812, 813,
905, 921, 924-932, 934, 939, 948, 979, 983, 987, 988, 992,
1002, 1012
normal probability distribution, 265, 870, 873, 914, 932,
945-947
nth derivative of the delta function, 154
Nyquist frequency, 190, 192, 196, 197, 201, 203, 704, 716
Nyquist wavenumber, 704, 716-718, 798, 851

O
odd function, 2.3, 76, 77, 79-85, 88, 90, 111, 119-121, 128,
129, 139, 140, 142, 148, 158, 218, 228, 266, 267, 303, 314,
459, 463, 488, 604, 615, 634, 674, 732, 788, 800, 899, 946,
947, 950, 968
off-axis signal, 5.6, 555, 588, 589, 591-593
off-center sampling, 5.26, 723, 822
OPD, 395, 398, 414, 453, 482, 500, 501, 573, 582, 597, 748-
750, 757, 762-764, 791, 815, 843, 849, 850, 860, 869, 879,
882, 883, 932, 939, 953-959, 986, 988, 1012. See also
optical-path difference
OPD velocity, 619, 636, 748, 792, 795, 813, 879, 931, 987
optical axis, 385-389, 395, 400, 404-407, 416, 425, 453,
456, 465, 534, 544-546, 552, 553, 573, 588, 590, 592-594,
599, 606, 628, 660, 686, 696
optical-path difference, 41, 395, 501, 577, 585, 617, 619,
622, 624, 630, 643, 650, 655, 659, 666, 667, 686, 690, 696,
699, 712-715, 723, 930, 953. See also OPD
oversampling, 5.24, 704, 715, 723, 852-854

P
p-wave, 359, 362, 368, 407, 412, 413
pencil rays, 556-558, 566-568, 570, 571, 573, 575, 577, 584,
588, 590, 597, 606
pencils of rays. See pencil rays
permittivity, 331
photon noise, 6.15, 806-808, 812, 814
photovoltaic, 807, 814
Planck radiation, 559, 931, 932, 986, 991, 1007
Planck's constant, 559, 808
plane of incidence, 353, 355, 357-361, 367-369, 400, 406,
407, 410, 411, 467, 540, 547
plane wave, 4.5, 4.6, 4.13, 344, 346, 350, 352, 353, 355-360,
362, 366, 367, 375, 383, 385, 386, 394, 395, 400, 401, 405-
407, 409, 412-417, 425, 451, 465, 467, 478, 556, 570, 571,
573, 594, 596, 597, 599, 602, 606-608, 611, 612, 660
polarization, 54, 350, 454, 480
polarization, linear, 4.4, 349, 356, 358
polychromatic plane wave, 362, 372, 373, 383, 605-607
polychromatic wavefield, 4.7, 368, 369, 428
power reflection coefficient, 478, 481
power spectrum, 3.20, 3.22, 3.23, 8.3, 280-282, 287-290,
296, 297, 299, 301, 305, 311, 319, 320, 525, 579, 580, 591,
592, 594, 617, 766, 767, 791, 794, 795, 798, 819, 846, 860,
862, 864, 956-958, 976, 979, 987, 990, 995, 1002, 1006,
1007, 1013, 1014, 1016, 1020
power transmission coefficient, 478, 481
Poynting vector, 430, 438, 470
principle of independent superposition, 41, 47
prism-based spectrometer, 55
probability density distribution, 798, 800, 870, 871, 914,
927, 932, 945-947, 1023
propagation vector, 338, 349, 353, 354, 362-364, 376, 382,
383, 385, 386, 390, 394, 395, 399-401, 405, 407, 416, 421,
434, 453, 455, 482, 500, 502, 504, 507, 573, 660
pupil function, 452, 456, 460, 522, 528, 529, 531
PV. See photovoltaic

Q
quantum efficiency, 808
quasi-harmonic noise, 924, 926, 929, 930, 939, 987, 988,
999
quasi-static sampling noise, 8.10, 988, 1002, 1007, 1012,
1020, 1022, 1023, 1025

R
radiance, 5.2, 5.3, 55, 248, 416, 425, 474, 476, 481, 484,
485, 566-568, 570, 571, 573, 627, 629, 630, 639-641, 643,
647, 664, 681, 685, 686, 698, 699, 703, 726, 742, 743, 745,
747, 748, 750, 753, 758, 762, 775, 781, 782, 806, 808, 809,
813, 816, 823, 844, 857, 887, 891, 892, 911, 932, 933, 936-
938, 967, 969, 987, 988, 991, 992, 994, 998, 999, 1002,
1007, 1013, 1020-1022, 1031, 1034, 1035
radiant energy, 355, 430, 432, 433, 438, 470, 480, 555-558,
566-568, 571, 572
radiometric spectral radiance, 6.2, 329, 455, 486, 742-744,
748, 751-753, 760, 762, 763, 772, 775, 778, 783, 784, 789,
800, 816, 821, 852, 857-859, 879, 880, 890, 891, 905, 932,
936, 938, 964, 968, 973, 986, 991, 998, 1006, 1007, 1026,
1031
radiometry, 455, 555-557, 566
random error, 50, 223, 247, 742-745, 747, 759, 763, 764,
766, 768, 789, 844, 853, 865, 953, 955, 973, 1017, 1019,
1023, 1027, 1028
random function, 3.2, 3.13, 3.15, 3.23, 3.26, 223-225, 242,
249, 250, 252, 253, 257-261, 271-275, 277-282, 284, 287-
290, 296, 297, 299, 301-303, 319, 328, 432, 438, 522, 523,
525, 526, 744, 746, 747, 760-766, 780, 792, 798, 800, 815,
840, 844-847, 860, 869, 871, 873, 874, 876, 877, 882, 892,
903, 911, 912, 914, 951, 953, 956, 962, 988, 1012, 1013
random process, 249, 301, 791. See also Gaussian random
processes
random signal, 223, 762, 810, 1028
random variable, 3.1, 3.5, 3.6, 3.7, 3.9, 3.17, 223-227, 230-
243, 246, 249-251, 254, 255, 257, 259-269, 271, 273, 275,
432, 438, 446, 523, 525, 526, 809-811, 814, 816, 869, 902,
909, 911, 915, 916, 921, 922, 947, 948, 972, 1018
rays, 383, 385, 394, 395, 459, 464, 467, 474, 532-534, 541-
545, 551, 556, 570, 573, 585, 588, 589, 592, 594, 606. See
also pencil rays
real linear operator, 335, 496-498
real scalar field, 493
relativity theory, 23
resolving power, 647, 667, 668, 682
response time, 36, 808
retroreflector, 54, 55, 59
Revercomb calibration algorithm, 686
ringing, 647, 656, 683, 709, 806

S
s-wave, 357, 358, 362, 367, 368, 407, 412, 413
sampling error, 8.6, 8.7, 696, 969, 971-973, 988, 996, 1012-
1014, 1022, 1025, 1027
sampling-error NEdN, 8.11, 953, 984, 985, 1002, 1024
sampling-position error, 696, 987, 988, 1017
sampling-position noise, 8.2, 8.3, 954, 955-958, 976, 987,
988, 990, 996, 998, 1002, 1006, 1012-1014, 1020, 1022
sampling theorem, 2.24, 200
self-apodization, 666, 667
shah function, 2.18, 2.19, 162, 165, 171, 175
signal noise, 6.5, 225, 280, 753, 759, 843, 848, 853, 900,
1026
signal-to-noise ratio, 55, 742
sine curve, 66, 67, 218
sine transform, 2.2, 2.4, 67, 68, 70, 75, 80-85, 87-89, 91, 93,
95, 96, 98, 99, 119, 121, 218
single-sided interferogram, 5.18, 555, 643, 667, 673, 674,
681, 682, 726, 742, 850, 853
single-sided NEdN, 848
single-sided power spectrum, 296, 297, 455, 576, 766, 812,
819, 820
single-sided signal, 6.18, 6.19, 6.20, 6.21, 821, 823, 829,
840, 844, 845, 848
Snell's law, 532, 534
SNR. See signal-to-noise ratio
solid angle, 390, 437, 453, 455, 461, 472-474, 476, 478, 483,
485, 555-558, 566, 567, 570-573, 589, 597, 599, 601, 603,
629, 743, 806, 809
source fluctuations, 55
space look, 642
specific detectivity, 813
spectral doublet, 28
spectral intensity function, 49, 51
spectral line, 1.4, 1.5, 24, 26-30, 32, 47, 50-52, 55
spectral multiplet, 32
spectral radiance, 6.2, 329, 455, 486, 555-560, 566, 570,
571, 575, 576, 590, 591, 594, 597, 599, 601, 605, 606, 608,
612, 629, 631, 643, 646, 647, 671, 677, 685, 686, 703, 725,
726, 742-744, 748, 751-753, 760, 762, 763, 772, 775, 778,
783, 784, 789, 800, 816, 821, 852, 857-859, 879, 880, 887,
890, 891, 894, 905, 932, 936, 938, 964, 968, 973, 986, 991,
998, 1006, 1007, 1026, 1031
spectral resolution, 647, 665, 667, 677, 709, 715, 821, 823,
853, 930, 938
spectrometer
    Fourier-transform, 1.7, 31, 50, 52, 54, 55, 57-59, 599,
    617, 623, 640, 643, 647, 667, 699, 707, 719, 727, 742,
    764, 767, 1016, 1038
    grating-based, 55
    prism-based, 55
spectroscope, 24
spectroscopy, Michelson-based, 383, 437
speed of light, 1, 19, 23, 31, 346, 559, 808
standard deviation, 3.3, 226, 228, 229, 243, 246-248, 267,
269, 745, 747, 766, 821, 853, 870-873, 909, 915, 927, 933,
941, 945, 947, 973, 990, 1002, 1007, 1018, 1023, 1030,
1032
stationarity, 223, 280, 297, 301, 791
stationary, 18, 20, 252-254, 258, 259, 262, 263, 271, 272,
274, 278-280, 319, 523, 524, 791, 1012
stationary ether. See ether, stationary
stationary random function, 3.15, 252, 260, 261, 271, 279,
282, 287, 304, 319, 523, 791, 861, 862, 869
step function, 320. See also Heaviside step function
stochastic process, 225, 249
strongly ergodic, 278, 279

T
T-limited Fourier transform, 793, 814, 846
tapering function, 678, 822, 825, 827
Taylor series, 995, 1014
test function, 121-133, 135, 136, 138, 141, 142, 144-148,
151-154, 161-164, 168-171, 173-175
theory of relativity, 23
thin film, 353, 360, 361, 479, 539
three-dimensional convolution, 215
three-dimensional delta function, 216
three-dimensional Fourier transform, 382, 391, 426, 447-
449, 525
time average, 34, 36, 38, 43, 253, 271, 272, 274-276, 278,
453
time-chopped radiation, 4.10, 4.14, 390-393, 427, 430, 444,
448, 470, 522-524, 526
time-invariant linear system, 727
time-limited Fourier transform, 302
transfer function, 285-287, 620-622, 624, 645, 661, 668,
674, 681, 728-731, 733, 777, 778, 821, 822, 825, 831, 853,
880, 888, 923, 931, 968, 976, 991, 994, 996
transform
    angle-wavenumber, 4.8, 380, 382, 386, 391, 393, 394
    cosine, 2.2, 2.4, 67, 68, 70, 73-75, 80, 81, 83-86, 89-91,
    93, 95, 96, 98, 103, 218, 463, 464, 780
    D-limited Fourier, 779, 792, 800, 840, 889, 898, 900, 901,
    961
    fast-Fourier, 55, 96, 699
    Fourier, 2.1, 2.5, 2.6, 2.7, 2.10, 2.13, 2.25, 3.23, 14, 30,
    31, 50-52, 54, 57-59, 62, 70, 76, 89, 93-107, 109, 112,
    114, 115, 117-122, 124, 136-142, 157, 167, 168, 171,
    176, 178, 181, 182, 188, 194, 197, 200, 202, 204, 207-
    210, 213-215, 218, 231, 281, 282, 285-290, 292, 297-
    299, 300, 303, 310, 311, 371, 372, 381-384, 391, 393,
    426, 447, 449, 451, 456, 464, 488, 525, 605, 610, 614,
    620, 623, 625, 626, 639-641, 643-646, 650, 654-656,
    677-680, 683, 699, 704, 708, 709, 715, 728-730, 754,
    756, 757, 772-775, 777-780, 786-788, 790, 792, 794,
    822, 824, 826, 829-831, 833-835, 837, 838, 840, 860,
    866, 880, 882, 884-886, 888, 889, 895, 896, 899, 904,
    913, 914, 919, 920, 922, 957, 959, 961, 962, 974, 975,
    979, 980, 1029, 1030
    Hartley, 87-89, 93, 99
    inverse Fourier, 2.5, 6.4, 137, 139, 168, 171, 194, 204,
    208-211, 213-215, 280, 281, 287, 371, 372, 381, 382,
    426, 464, 605, 610, 620, 621, 623-625, 668, 678, 680,
    729, 750, 753, 767, 774, 822, 826
    sine, 2.2, 2.4, 67, 68, 70, 75, 80, 81, 83-85, 87-89, 91, 93,
    95, 96, 98, 99, 119, 121, 218
    three-dimensional Fourier, 382, 391, 426, 447-449, 525
    time-limited Fourier, 302
    two-dimensional Fourier, 210, 213, 215, 451, 456, 528,
    529, 531
    vector Fourier, 209, 382
    vector inverse Fourier, 209, 382
transverse vibrations, 1, 2
truncated interferogram signal, 701, 708, 715, 806, 959
tunnel diagram, 400, 467, 543-545, 551
two-dimensional convolution, 211, 212, 214-216
two-dimensional delta function, 216
two-dimensional Fourier transform, 210, 213, 215, 451, 456,
528, 529, 531

U
unapodized spectral resolution, 647, 715, 930, 938, 986
unavoidable misalignment noise, 7.7. See also mirror-
misalignment noise
unavoidable noise, 6.8, 767-770, 786-788, 843, 900, 969,
1016
unbalanced background signal, 330, 464, 465, 470, 472, 474,
479, 482, 485, 486, 551, 585, 630
unbalanced output, 46, 47, 50, 54-56
unbalanced radiation field, 4.17, 394, 464, 467, 470, 472
unbalanced signal, 5.5, 55, 464, 465, 585-587, 632
uncalibrated spectrum, 6.19, 7.5, 8.4, 682, 683, 685, 781-
785, 800, 829, 842, 849, 882, 884, 889-893, 959, 962, 964-
966, 1015-1019
uncorrelated random variable, 239
undersampling, 5.25, 200, 715, 716, 718, 723, 852, 853, 855
unfolded interferometer, 400, 401, 406, 407, 415, 465, 482

V
variance, 3.3, 7.10, 226, 228, 229, 231, 240, 241, 244, 246,
248, 266, 277, 278, 301, 798, 800, 811, 812, 821, 905, 908,
909, 951, 972, 973, 987, 1018, 1022, 1023
vector calculus, 491
vector Fourier transform, 209, 382
vector inverse Fourier transform, 209, 382
vector notation, 208, 211, 215, 217, 218, 298, 372, 490, 491
velocity at Earth's equator, 14, 23
vibrations
    elastic, 2, 43
    transverse, 1, 2

W
wavefield, 4.7, 7, 14, 23, 31, 34, 36, 37, 39-41, 54, 55, 346,
349, 353, 355, 357, 359-363, 368, 369, 406, 407, 413, 428,
478, 534-540, 808
wavelength, 2, 3, 7, 10, 12, 14, 22, 24, 26, 28-31, 34, 47, 55,
57, 58, 249, 346, 351, 352, 385, 391, 392, 420, 428, 435,
533-539, 547, 555-557, 560, 566, 571, 607, 611, 692, 814
wavenumber, 34, 49, 51, 53, 346, 348, 353, 357, 359, 362,
363, 368, 370, 383, 392-394, 401, 407, 411, 412, 416, 428,
429, 434, 436-438, 451, 453, 455, 462, 464, 468, 477, 478,
486, 556, 557, 559, 560, 566, 570, 571, 575, 580, 584, 606,
607, 611, 622, 627, 631, 645, 647, 648, 650, 664-666, 671,
673-677, 682, 684-686, 691, 692, 700, 705-707, 709, 713,
714, 716-719, 723, 726, 727, 743, 744, 747, 755, 783, 789,
790, 798, 800, 806, 808, 813, 814, 816, 848, 849, 853, 857-
859, 874, 891, 927-929, 931, 933, 935, 936, 941, 964, 969,
971, 973, 975, 979, 987, 988, 990, 994, 995, 997, 999,
1001, 1002, 1007, 1013
wavetrain, 7, 14, 22, 37, 39-42, 44-47, 58
weakly ergodic, 278, 279, 869
weakly stationary, 258
white light. See light, white
white noise. See noise, white
wide-sense stationary, 258-261, 263, 279-283, 285, 287,
288, 290, 291, 299, 302, 304, 791, 792, 815, 860-862, 865,
869, 877, 903, 912, 922, 948, 953, 957, 958
Wiener-Khinchin theorem, 3.24, 223, 297-299, 434, 522,
525
window function, 654-658
windowing, 654

Y
Yerkes Observatory, 28
Young, Thomas, 28

Z
Zeeman, Pieter, 1
zero-path difference, 26, 395, 577. See also ZPD
ZPD, 26, 28, 30, 31, 55, 395, 414, 577, 587, 591-594, 597,
599, 617, 667, 668, 807
ZPD position, 28-31, 46, 52, 395, 396, 413, 414, 577, 591,
593, 667-670, 749
