WaveOptics
Nick Vamivakas
Institute of Optics
University of Rochester
Introduction
Interference and diffraction (OPT 261) is the first physical optics class you encounter in the Institute of Optics undergraduate program. Its physical underpinning is the notion that light is a wave, and these waves are a medium for energy transmission that can be used to communicate information between distinct spatial locations. We will unpack the previous sentence as we progress this semester. Two comments are in order. First, we will use the phrase light rather generically to refer to the phenomena we discuss. In a strict sense light refers to waves with temporal frequencies¹ (to be defined soon) that lie in the visible part of the electromagnetic spectrum. Visible just means we can see light waves with our built-in photodetectors - our eyes! - and we will learn that the spectrum of a wave refers to the variety of temporal frequencies that contribute to the wave disturbance. But what we will be discussing applies to visible and invisible waves as well as many types of non-light waves.
Lastly, before describing the different models of light, let me elaborate on how the topics we encounter this semester are connected.

¹ We use the term temporal frequency to make explicit that this is the frequency related to the temporal variation of a light wave; this foreshadows the idea of spatial frequency.

Specifically, our journey
this semester will focus on light waves, but will be generally applicable to all wave phenomena. Since wave optics does not provide a full description of how light behaves [see below 1.1] it is necessary to provide some postulates that we take as true (and that can be shown to be true when you have a bit more machinery at your disposal, i.e. OPT 262). With these postulates we will start by understanding what a wave "is" and then consider two simple waves: a plane wave and a spherical wave. These elementary waves are the primitives that the rest of the course is built on. You will learn how to visualize these waves and understand the relationship between their strength, amplitude and phase. With these single waves in hand, we will discuss situations where you add - the grown up word is superpose - 2 waves. It is here that the idea of interference is engaged. With an understanding of generic interference effects, we will consider instruments to manipulate interference, so-called interferometers. This will provide us with an opportunity to consider what coherence means in the context of light waves. Although still an active area of research, loosely speaking coherence is related to the ability of a light wave to interfere with a suitably temporally delayed or spatially shifted replica of itself. Continuing on our journey we will encounter scenarios where instead of 2 we add up (superpose) N waves. This provides the physical underpinnings for thin film filters, optical cavities and diffraction gratings. Finally, we let N, the number of waves we consider, become infinite in a controlled way and we arrive at wave diffraction. From this perspective diffraction is nothing but the interference of many waves. The reward for all this heavy lifting is that we learn that diffraction sets the fundamental limits to optical system performance. I will stress many times throughout this class that wave optics is nothing but examining the consequences of adding up a possibly infinite number of waves. So at the outset I alert you to the importance of fully understanding the primitive objects - the plane and spherical wave - and their manipulation, since this provides the basis the rest of the course is built on. A second cautionary note: many students struggle in moving from N to infinity since this introduces multi-dimensional calculus into the discussion. You will find it useful to review this material now and when appropriate.
Sometimes light seems like a particle and sometimes like a wave. With this short detour regarding the nature of light, we now set out to describe briefly the different ways we think about light. I encourage you, as you learn about each model, to consider how the model deals with generation, propagation and detection.
There are four main models used to understand light. To organize these I like
to think of an archery target (see Fig. 1.1). Such a target is a collection of four
[Figure 1.1: four nested circles labeled, from the innermost to the outermost, RAY OPTICS, WAVE OPTICS, ELECTROMAGNETIC OPTICS and QUANTUM OPTICS.]
Figure 1.1 The nested structure of light models. The least descriptive model is ray optics and the most comprehensive model is quantum optics.
nested circles. Each model corresponds to a different circle and the larger, outer circles not only capture some new feature(s) about light, but can also explain all the effects described by the inner ones. In the bullseye of the target is ray or geometrical optics. You encountered this material in OPT241. Some would describe this as the "simplest" model of light. Simple just means the mathematical model is based primarily on geometry and real analysis. Trust me, simple does not mean easy by any means! Most of the results are derived via application of Snell's law for refraction, n1 sin θ1 = n2 sin θ2, and the rule that the angle of reflection equals the angle of incidence for light reflection. The previous rules can be derived by an application of Fermat's principle, also called the principle of least time. Quantitatively, the path traversed between two points P1 and P2 is determined by minimizing the optical path length (which is analogous to finding the path that takes the least time) between the two points.
min_path(OPL) = ∫_{P1}^{P2} n(s) ds    (1.1)
where OPL means optical path length. Principles of least time occupy an interesting place in the physical sciences (not just optics), but we will not have much more to say about them in this course. Presumably, OPT241 provided a thorough grounding in the consequences of Eq. (1.1). Notice that by finding the path that minimizes the optical path length we are making an explicit statement about light propagation. For example, if n is constant throughout space, the minimum path will be a straight line connecting P1 and P2. This line then represents a ray of light. The resultant OPL in this case is n |P2 − P1|.
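Eq. (1.1) can be explored numerically. The sketch below is a brute-force minimization; the indices, endpoints and grid spacing are illustrative choices, not from the text. It finds the interface crossing point that minimizes the OPL for a ray refracting at a flat boundary, then checks that the minimizer reproduces Snell's law.

```python
import math

# Hedged sketch: minimize the optical path length (OPL) for a ray crossing a
# flat interface between two media and verify Snell's law at the minimizer.
# All geometry and index values below are illustrative.
n1, n2 = 1.0, 1.5          # refractive indices above/below the interface
P1 = (0.0, 1.0)            # start point in medium 1 (x, y); interface at y = 0
P2 = (2.0, -1.0)           # end point in medium 2

def opl(x):
    """OPL of the two-segment path that crosses the interface at (x, 0)."""
    d1 = math.hypot(x - P1[0], P1[1])      # geometric length in medium 1
    d2 = math.hypot(P2[0] - x, P2[1])      # geometric length in medium 2
    return n1 * d1 + n2 * d2

# Brute-force search for the crossing point that minimizes the OPL.
xs = [i * 1e-4 for i in range(0, 20001)]   # candidate crossings in [0, 2]
x_min = min(xs, key=opl)

# Angles measured from the interface normal at the minimizing crossing point.
theta1 = math.atan2(x_min - P1[0], P1[1])
theta2 = math.atan2(P2[0] - x_min, -P2[1])
print(n1 * math.sin(theta1), n2 * math.sin(theta2))  # nearly equal: Snell's law
```

The grid search stands in for a proper optimizer; the point is only that the OPL-minimizing path and the Snell's-law path coincide.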
Moving out from the center in Fig. 1.1, the next model of light is wave optics. This is the purview of OPT261 and you are here to learn about the new phenomena we encounter when light is upgraded from being a ray to being a wave. Important in this discussion of waves is the notion of phase. Phase, and in particular phase differences, manifest themselves in both interferometry and diffraction. The mathematical machinery is also expanded to include complex analysis, and so-called phasors, as well as partial differential equations (algebraic relationships that the waves and their suitably arranged derivatives with respect to space and time must satisfy) and multivariable calculus. The wider range of math tools required for wave optics is what makes it more difficult than geometrical optics. Since wave optics encompasses geometrical optics, it is able in principle to describe and reproduce all the predictions of geometrical optics. That said, many problems are more easily understood and solved approximately, but satisfactorily, from a geometrical optics perspective. In fact, ray/geometrical optics is the limit of wave optics when the wavelength of a wave (λ) is much smaller than the physical size of objects encountered by the wave.
Last but not least is quantum optics. Quantum optics is the most complete theory of light. It can explain from first principles how light interacts with matter. It is remarkable in its predictions, but also challenging in how it changes our physical conception of light. Recall that we mentioned that for many centuries a debate has raged regarding what light "is". In quantum optics we find the following answer: light is a quantum object that can behave both as a particle and as a wave! And whether it seems to be a particle or a wave depends on how you look at the light. This tension can be made evident in simple interference experiments such as a Young's two-pinhole interferometer. OPT223 also serves as your first real encounter with photons. Photons can loosely be conceived as particles of light. In OPT223 you will learn when particle descriptions of light are useful and when wave descriptions are useful. As a rule of thumb, photons make analyzing generation and detection problems easy whereas waves are more
So in summary, we can catalog the different models of light and how they describe light:

This concludes our brief tour of how we model and understand light. Your job moving forward, as you master these different models, is to learn when to apply a particular model(s) to a given problem. Deciding which model to use is by no means easy since it involves intuition as to what limitations may arise in modeling the system, and making the right decisions comes only from experience that you acquire through practice.

1.2 Systems approach to Optics
[Figure 1.2: block diagram showing the flow of optical energy from a light source, through an optical system, to a detector.]
Figure 1.2 Block diagram of the elements that go into a system designed to transfer optical energy from a source to a detector. The system includes a light source, an optical system and the detector.
optical energy is what carries the information that we are interested in. In this course (this may or may not be true in your other optics courses so pay attention!) I will refer to the totality of the light source, the machinery that delivers the light from the source to the detector², and the detector as the optical system. The part of the system that delivers the light from the source to the detector we will call the optics. Each of the models that we will discuss makes a statement about the process of light generation, how light propagates and interacts with optical components and how light is detected. Later in 261 we will find explicit mathematical functions for the waves that represent the light source properties, functions representing the optical system and a formula to quantify the amount of energy in an optical wave (this is important for light detection). We will emphasize situations where interference effects are critical to understanding the spatial and temporal distribution of optical energy.
Some examples of optical systems are microscopes, telescopes, Michelson interferometers - the list goes on and on. In each of these systems it is possible to identify each of the component sub-systems in Fig. 1.2. A simple example that allows us to compare what we know from OPT241 with what we will learn in OPT261 is a single-lens imaging system. The general system consists of a self-luminous person that serves as the light source. This energy propagates through the optical system, which is composed of a free space segment, a thin lens and then a second free space segment. If the system is arranged such that 1/l1 + 1/l2 = 1/f, then an image is formed on the plane located at l2. This image could be seen on a sheet of paper placed at this plane or recorded by an electronic imaging detector.
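The imaging condition 1/l1 + 1/l2 = 1/f is easy to evaluate. A minimal sketch, with illustrative numbers:

```python
# Solve the thin lens imaging condition 1/l1 + 1/l2 = 1/f for the image
# distance l2. The focal length and object distance are illustrative values.
def image_distance(l1, f):
    """Image distance l2 satisfying 1/l1 + 1/l2 = 1/f."""
    return 1.0 / (1.0 / f - 1.0 / l1)

f = 100.0          # focal length [mm]
l1 = 300.0         # object distance [mm]
l2 = image_distance(l1, f)
print(l2)          # 150.0 mm: the plane where the image forms
```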
In Fig. 1.3(a), the OPT241 approach is the descriptive tool and the light energy that emanates from each object point travels as straight lines through the optical system. We can trace the rays from input to output to determine where the rays arrive on the detector plane. Three convenient rays are illustrated. In OPT241, if this is an ideal thin lens imaging system then each point in the object is mapped to a point in the image. There is no limit to how small a feature of the object you can resolve in the image. For example, if the person shrank and the hand (end of arm) moved closer and closer to the foot (end of leg) in the object, with an ideal 241 thin lens imaging system you could always resolve (i.e., clearly distinguish) the hand and foot in the image of the object.
In contrast, in Fig. 1.3(b), each point in the self-luminous object generates a wave. The curved lines illustrate the wave's wavefront. We will define wavefronts in the next chapter. What is important in this illustration is that energy from each point in the object is spread out into a blur spot (a collection of points) in the image. This blur is fundamental and present even with an ideal thin lens imaging system. It is known as the diffraction limit and is a manifestation of the wave nature of light. The diffraction limit constrains the smallest possible features in an object we can resolve in the image to the wavelength of the light generated by the self-luminous object. What this means is that as the hand and foot move closer
² This machinery could include lenses, mirrors, free space propagation, etc.
[Figure 1.3: two panels of a single thin lens imaging system with object distance l1, image distance l2 and focal length f; a) rays from each object point focus to a point in the image; b) waves from each object point focus to a blur spot.]
Figure 1.3 Ray and wave models of image formation. a) OPT241, based on rays. All rays focus to a point in an ideal thin lens system. b) OPT261, based on waves. In an ideal thin lens imaging system all waves focus to a blur spot that is a manifestation of interference and diffraction. The blur sets a fundamental limit to the resolving power of an imaging system (the smallest features of an object discernible in an image).
and closer in the object, there is some minimum separation between the two that allows the hand and foot to be clearly resolved in the image. If the hand and foot become closer together in the object than this minimum distance then it is not possible to distinguish the hand and foot in the image - they blur together, or cannot be resolved. The previous is what we mean when we say diffraction limited imaging resolution. In this course we will unravel how to determine the resolution limits of imaging and general optical system architectures that use light.
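As a forward-looking numeric sketch of the scale of this effect: the quantitative resolution formula is derived later in the course, so here we borrow the commonly quoted Rayleigh criterion for a circular aperture, and all numbers below are illustrative.

```python
# Rough numeric illustration of diffraction-limited resolution. The Rayleigh
# criterion (not yet derived in the text) estimates the minimum resolvable
# angular separation for a circular aperture as theta_min ~ 1.22*lambda/D.
# The wavelength, aperture and distance are illustrative, not from the text.
wavelength = 550e-9   # green light [m]
D = 25e-3             # lens aperture diameter [m]
L = 2.0               # distance from lens to object [m]

theta_min = 1.22 * wavelength / D          # smallest resolvable angle [rad]
dx_min = L * theta_min                     # smallest resolvable feature [m]
print(theta_min, dx_min)                   # tens of microradians, tens of microns
```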
This is an important point and is worth emphasizing:
Chapter 2 Fundamentals of wave optics

In this chapter we begin our discussion of light waves. After some qualitative discussion of wave phenomena, a set of postulates that govern the behavior of light waves is given. The necessity of these postulates is a result of not having the full machinery of Maxwell's equations at our disposal (this waits for OPT 262). With the postulates in hand, the two simplest and most useful waves for understanding wave phenomena are presented - plane waves and spherical waves. Various approaches to representing these waves are discussed. As phase is a conceptually new feature of waves we will spend time exploring it. Finally, the propagation of waves through simple media will be presented. Additionally, since the most economical way of representing waves leans on phasors, a short review of the necessary complex algebra rules is given in the last section of the chapter.

2.1 What is a wave?

There are many examples of waves in nature:
• water wave
• sound wave
• seismic wave
• electromagnetic waves
• gravitational waves
• quantum waves
Can you think of more wave examples? A natural next question is: what exactly is "waving" in each of these? Returning to our previous list: figuring out what waves in each case is not as trivial a question as it seems and in fact initiated a revolution in physics at the end of the 19th century. Until the late 1800s, most waves were mechanical in origin and propagated through a recognizable medium; think water, sound and seismic waves. It was natural to assume the same held true for the newly discovered electromagnetic waves and that these waves must propagate through a medium, the so-called ether. This would take us too far off track, but in short, trying to detect the ether not only led to the invention of the Michelson interferometer (we will see this device in the next chapter), but also ushered in the special theory of relativity.
From considering these previous examples it is clear that there are similarities between them (at least the first 4-5!). First, something is moving through space and continues to persist as time elapses. There is a shape or waveform that is preserved as the wave moves. Maybe not as obvious is that each of these examples also presents a way to transfer energy (the possibility to do work) between separated locations in space. Leaning on the previous we make the following qualitative definition for a wave:
As we progress through the material, we can always check back to make sure our qualitative definition of a wave captures the physical wave properties we uncover.
u(r, t)    (2.1)
where r is a vector that denotes the point in space where the strength u is measured and t is the time at which it is measured. It turns out this function, u(r, t), is too general to describe wave motion. By appealing to some physical intuition regarding wave motion and to consistency of measurements of the wave strength in different reference frames (coordinate systems moving at constant relative velocity with respect to one another), it is possible to constrain the set of functions u(r, t) to a subset of functions that represents waves, so-called wavefunctions. To do this we will consider the wave function illustrated in Fig. 2.1. Note we will consider only one dimension, but this reasoning applies in 3D too.
[Figure 2.1: a disturbance u(z′) of strength a at z′_a shown at two times; top row in the wave's frame (z′ axis), where it is stationary, and bottom row in the lab frame (z axis), where it moves with velocity v past the points z_a and z_b.]
Figure 2.1 General wave function illustrated in two frames. The wave’s frame is the top
row of panels. In this frame the disturbance is stationary and fixed for all times. In the
bottom row, the wave is illustrated in the lab frame. The lab frame is at rest and the
wave moves with a velocity v through this frame.
times in the wave's frame of reference. This is illustrated in the top row of Fig. 2.1. The wave's frame of reference is plotted for two different times t = 0 and t = t_o. In the wave's frame (top row) the wave is fixed with respect to the z′ axis. In the lab frame, bottom row, the wave moves along the coordinate axis z (parallel, but not equal, to z′). The strength of the wave at location z′_a in its own frame is a. If this is in fact a traveling wave in the lab frame (you are sitting at your desk watching the wave move by), the wave appears to be moving with velocity v. See the second row for snapshots of the wave strength at two different times in the lab frame, labeled by z and z′. At time t = t_o, the wave has moved in the lab frame and at this later time, the wave strength at z_a no longer equals the strength at z′_a. From the snapshot it is discovered that the wave strength at z_b now equals the wave strength at z′_a. How are z′_a, z_b and t related? From geometry we find z_b = v t_o + z′_a in the lab frame. There is nothing special about these two points and at a given time the coordinates of the wave frame are related to the coordinates of the lab frame via the relationship z = z′ + v t. The previous implies z′ = z − v t so that

u(z, t) = u(z − v t).    (2.2)
From the previous physical considerations we have constrained the set of functions u(z, t) to the subset describing wave motion, u(z − v t). In fact z − v t is not the only allowable combination of space and time for a function to be a wave function. There are four possible ways to express this constraint, each with a slightly different meaning. They are

u(z − v t),  u(z + v t),  u(t − z/v),  u(t + z/v).
These particular combinations of space and time will hold true for all waves. So, what did we just learn? First, we saw an example of how physical reasoning - the strength of the wave should be independent of the given frame of reference; the wave exists independently of the coordinate systems we use to describe it - constrains the relationship between space and time in a wavefunction. Second, given a general function of a single independent variable, we now know how to turn it into a wave: simply replace the single independent variable with z − v t (z + v t) to arrive at a mathematical representation of a wave moving in the positive (negative) z direction in the lab frame. When we have outlined the postulates of wave optics we will also discover that the permissible wave functions need to be differentiable in a certain way. With this we are ready to outline the postulates of wave optics.
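The substitution rule above can be checked numerically: take any function of one variable, replace its argument with z − v t, and verify that the disturbance simply translates at speed v. A minimal sketch, using an illustrative Gaussian pulse:

```python
import math

# Sketch: any function of the single variable (z - v*t) is a wave moving in
# the +z direction. Sample an (illustrative) Gaussian pulse and check that
# the strength at (z, t) equals the strength at (z - v*t, 0): the waveform
# just translates without changing shape.
def pulse(s):
    """Arbitrary waveform of one variable; here a Gaussian bump."""
    return math.exp(-s**2)

v = 2.0                        # wave speed (illustrative)

def u(z, t):
    """Wavefunction built by the substitution s -> z - v*t."""
    return pulse(z - v * t)

t0 = 1.5
# The disturbance that sat at z = 0.3 at t = 0 now sits at z = 0.3 + v*t0.
print(u(0.3 + v * t0, t0), u(0.3, 0.0))   # equal values
```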
2.2.1 The optical disturbance function a.k.a. the wave optics wave function
The optical disturbance function is the basic object that will occupy our attention for the rest of the course/text and is the wave function of wave optics. This function is the tool we use to quantify the strength of our optical disturbance in space and time. Since a single function (not a set of functions, i.e. a vector function) quantifies the wave behavior, the optical disturbance in wave optics is a scalar wave. We will use the following notation for the optical disturbance function:

u(r, t)    (2.3)
(∇² − (1/c²) ∂²/∂t²) u(r, t) = 0.    (2.4)
In the previous, c = c_o/n is the wave phase velocity and is equal to the speed of light c_o (3×10⁸ m s⁻¹) divided by the medium's refractive index n. The differential operator represented by the nabla symbol in Cartesian coordinates is ∇ = (∂/∂x, ∂/∂y, ∂/∂z) and is also expressible in other coordinate systems (for example spherical coordinates). For the wave optics problems we consider we will deal with linear media that are piecewise homogeneous (essentially the material can be broken into slabs that are large with respect to the wavelength and have constant refractive index).
The most important property of the wave equation for wave optics is that it is linear and allows for superposition. What this means is that if you have two optical disturbance functions, u_1(r, t) and u_2(r, t), that are solutions to Eq. (2.4), then their sum u_S(r, t) = u_1(r, t) + u_2(r, t) is also a solution to Eq. (2.4). It is the superposition principle that gives rise to the range of interference and diffraction effects encountered in wave optics.
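Superposition can also be checked numerically. The sketch below samples two traveling sinusoidal solutions of the 1-D wave equation (the wavenumbers and evaluation point are illustrative) and verifies with central finite differences that each wave, and their sum, leaves the wave-equation residual u_zz − u_tt/c² near zero.

```python
import math

# Sketch: if u1 and u2 solve the 1-D wave equation, so does their sum.
# We estimate the residual u_zz - (1/c^2) u_tt with central differences.
c = 1.0
k1, k2 = 2.0, 5.0            # two wavenumbers (illustrative)

def u1(z, t): return math.cos(k1 * (z - c * t))   # wave moving in +z
def u2(z, t): return math.sin(k2 * (z + c * t))   # wave moving in -z
def us(z, t): return u1(z, t) + u2(z, t)          # their superposition

def residual(u, z, t, h=1e-3):
    """Finite-difference estimate of u_zz - u_tt/c^2 at (z, t)."""
    u_zz = (u(z + h, t) - 2 * u(z, t) + u(z - h, t)) / h**2
    u_tt = (u(z, t + h) - 2 * u(z, t) + u(z, t - h)) / h**2
    return u_zz - u_tt / c**2

for f in (u1, u2, us):
    print(abs(residual(f, z=0.7, t=0.4)))   # all near zero (truncation error only)
```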
An obvious question at this point is why must we time average the optical disturbance function to find its energy flux? It turns out that for optical waves (from the UV to the IR) the frequencies are in the tens to hundreds of THz, which results in periods shorter than a picosecond. Detectors for measuring the optical wave's flux are too slow to respond to these rapid oscillations and hence average the instantaneous field irradiance. We will see this complicates determining the phase of a wave.
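A quick numeric sketch of why the averaging is unavoidable (the 600 THz value is an illustrative visible-light frequency): the optical period is a couple of femtoseconds, and the time average of the squared disturbance over many cycles settles to 1/2.

```python
import math

# Sketch: the period of a visible light wave is far shorter than any
# electronic detector's response time, so the detector reports the time
# average of the squared disturbance. Frequency value is illustrative.
f = 600e12                 # 600 THz, green light
T = 1.0 / f
print(T)                   # ~1.7e-15 s, well under a picosecond

# Average cos^2(omega*t) over an integer number of periods: tends to 1/2.
omega = 2 * math.pi * f
N = 100000
dt = 50 * T / N            # sample 50 full periods with N points
avg = sum(math.cos(omega * i * dt) ** 2 for i in range(N)) / N
print(avg)                 # ~0.5
```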
oscillator from intro physics. The equation of motion for simple harmonic motion (this is analogous to our wave equation; it describes the oscillating system's dynamics) is
d²y(t)/dt² = −ω² y(t)    (2.6)
where y is the displacement of the simple harmonic oscillator and ω is the angular
temporal frequency of the oscillator. It is easy to check that the following function
is a solution of Eq. (2.6)
y (t ) = a cos (ωt + δ) . (2.7)
A point that is worth emphasizing is that the displacement at time t has the same units as the amplitude of the oscillation a, both [m]. The argument of the cosine in Eq. (2.7) is called the phase of the simple harmonic oscillator. Specifically,
θ (t ) = ωt + δ. (2.8)
where δ is called the initial phase angle. For each instant of time, the phase of
the simple harmonic oscillator quantifies where the system is located in a given
oscillation cycle. Figure 2.2 illustrates simple harmonic motion for δ = 0 in Fig. 2.2(a) and for δ < 0 in Fig. 2.2(b).

[Figure 2.2: top panel, y(t) versus t with amplitude ±a and maxima at ωt_m = m2π; bottom panel, the same oscillation shifted by a negative initial phase angle δ.]
Figure 2.2 Simple harmonic motion. Simple harmonic motion [see Eq. (2.7)] is illustrated with zero initial phase angle, top panel, and a negative initial phase angle, bottom panel. The blue curve in the bottom panel leads the black curve (the black curve lags the oscillator described by the blue curve).

Referring to Fig. 2.2(a), we can ask what is the change in temporal phase between two displacement maxima. This requires
ωt_m = m2π  →  t_m = m2π/ω = mT    (2.9)
A final point is to consider the relative phase of two oscillations and identify when one oscillation is ahead of (leads) or is behind (lags) the second oscillation. Phase differences are fundamental to wave optics. This point can be understood by examining Fig. 2.2(b) and assuming the two oscillations refer to two different harmonic oscillators. By consulting Eq. (2.8), we see a negative initial phase angle requires we wait an additional time t_o = |δ|/ω to measure the same oscillator displacement value. Said a different way, the oscillator represented by the blue curve (maximum displacement at t = 0) leads the oscillator described by the black curve (maximum displacement shifted to the time t_o = |δ|/ω).
where i is the imaginary unit and Y is the complex function associated with y. We will use a capital letter to denote the complex function associated with a sinusoidal function that we represent with a lower case letter (in this case y and Y). Geometrically the phasor is a vector in the complex plane with its tail affixed to the origin and its length equal to a. The orientation of this vector with respect to the real axis at a given time is determined by the phase of the phasor.
Figure 2.3 plots the phasor in Eq. (2.12). It is important to specify that positive angles correspond to counter-clockwise phasor rotation. For a simple harmonic oscillator, with the phase defined in Eq. (2.8), as time evolves the phasor continuously rotates in the counter-clockwise direction around a circle of radius a with a temporal (angular) frequency of ω. The temporal phase accrued after one cycle of the oscillator is 2π. From this perspective, the phasor Y_2 is a time evolved version of the phasor Y_1. Using the language of lag and lead, phasor Y_2 leads phasor Y_1.

[Figure 2.3: two phasors Y_1 = a e^{iθ_1} and Y_2 = a e^{iθ_2} drawn in the complex plane, each of length a, at angles θ_1 and θ_2 from the real axis, with projections a sin θ_1 and a sin θ_2 indicated.]
Figure 2.3 Phasor Representation. Two phasors Y_1 and Y_2 illustrated in the complex plane.
In comparing Fig. 2.2 and Fig. 2.3 we see the variation in the harmonic oscillator's displacement is not a result of its amplitude changing (a is fixed), but is due to the variation of its phase. The phasor representation makes this clear. Again, at any instant of time, the displacement of the oscillator is found from the projection of the phasor along the complex plane's real axis (see Eq. 2.11). So, if the initial phase angle (δ) of the harmonic oscillator is 0, at time t = 0 the displacement is equal to a and the simple harmonic oscillator phasor is parallel to the real axis. At a later time t = π/(2ω) the displacement is zero (at this instant the harmonic oscillator's location is equal to its rest position) and this is consistent with the phasor being parallel to the imaginary axis in the complex plane.
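These statements are compact in code: represent the phasor as a complex exponential and take its real part for the displacement. A minimal sketch with illustrative parameter values:

```python
import cmath
import math

# Sketch: a phasor Y = a*exp(i*(omega*t + delta)) encodes the oscillator
# state; the physical displacement is its projection on the real axis.
# Amplitude, frequency and initial phase are illustrative values.
a = 2.0
omega = 3.0
delta = 0.0

def phasor(t):
    return a * cmath.exp(1j * (omega * t + delta))

def displacement(t):
    return phasor(t).real          # equals a*cos(omega*t + delta)

print(displacement(0.0))           # a: phasor parallel to the real axis
t_quarter = math.pi / (2 * omega)  # a quarter period later
print(displacement(t_quarter))     # ~0: phasor parallel to the imaginary axis
```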
The reason for the modulus of u_o is that the pair of numbers |u_o| and δ will specify the complex amplitude of the wave, denoted u_o. The phase of this wave is

θ(z, t) = ω(z/c − t) + δ    (2.15)

and has both a temporal phase −ωt and a spatial phase ωz/c in addition to the initial phase angle δ. We will see in the next section that the previous wave is an example of a monochromatic plane wave. The reason for this nomenclature will be described in detail then, but we observe in passing that the equation ωz/c = b, where b is a constant (some number of radians), describes a plane parallel to the x − y plane located at z = cb/ω. Figure 2.4 plots Eq. (2.14) for the case z = 0 (Fig. 2.4 top panel) and for the case t = 0 (Fig. 2.4 bottom panel); δ = 0 for each plot.

[Figure 2.4: top panel, u(z = 0, t) versus t, with amplitude ±u_o and maxima at ωt_m = m2π; bottom panel, u(z, t = 0) versus z, with maxima at ωz_m/c = m2π.]
Figure 2.4 Time and space harmonic wave. Top panel: u(z = 0, t). Bottom panel: u(z, t = 0).

Considering the discussion regarding simple harmonic motion, the angular frequency of the wave still relates to its linear frequency and period as ω = 2πf → f = 1/T.
The new bit in Eq. (2.14) is the spatial phase.
We focus on the spatial phase of Eq. (2.14) to understand the factor that multiplies the position variable z. From the bottom panel of Fig. 2.4 it is apparent that ωz_m/c = m2π determines the distance between maxima (crests) of the optical disturbance. Manipulating the previous we find

ωz_m/c = m2π  →  z_m = m(2πc/ω) = m(c/f) = mλ    (2.16)

where, as expected, the spatial phase advances by multiples of 2π when a spatial distance denoted by λ [m] is traversed. Since the constant λ quantifies the
2.3 Simple Waves
θ (z, t ) = kz − ωt + δ (2.20)
In the next section when we generalize our definition of plane waves we will
also consider the phasor representation of these waves.
Recall that the spatial phase of a time and space harmonic wave (when combined with the initial phase angle) can be expressed as φ(z) = ωz/c + δ = kz + δ, where k is the wave's wavenumber. As was hinted in the last section, kz = b, where b is a constant (δ is absorbed into b), geometrically describes a plane parallel to the x − y plane located at z = b/k. This wave is a monochromatic plane wave traveling in the +z-direction. We can make one more observation regarding the spatial phase. We can ask: what is the gradient of this scalar function? Remember that the gradient returns the local normal vector to a level surface. The gradient of the plane wave's spatial phase is

∇φ(z) = (∂/∂x, ∂/∂y, ∂/∂z) φ(z) = (0, 0, k)    (2.21)
which is a vector along the z-direction with a magnitude equal to the wavenumber. This vector is called the wave vector and it is normal to the planar wavefronts (in this case parallel to the z-direction). Two facts have been discovered about monochromatic plane waves:
• The spatial phase of a plane wave describes a plane, hence the name plane wave
• The wave vector is normal to the planes of constant spatial phase
See Fig. 2.5a for one way to visualize monochromatic plane waves traveling in the +z direction assuming the initial phase angle δ = 0. The wave is visualized as a set of parallel planes orthogonal to the z-axis. In the visualization the planes are separated by multiples of the wavelength (mλ where m can be a positive or negative integer) and are labeled with the constant value of spatial phase taken across the entire plane. The wave vector is locally normal to each of these surfaces. Figure 2.5b displays a common way to illustrate monochromatic plane waves as a set of parallel lines. The snapshot is taken at two different times. The top panel is at time t = 0 and the bottom panel is at a time ωt = π. Notice in the bottom panel the wave has advanced a distance equal to half a wavelength along the z-axis. Also decorating the planes are the constant phases assumed across the x − y plane for the given z-locations. From the phases it is clear these planes are separated by a spatial distance equal to the wavelength. We are now ready to define the wavefront of a wave.
So, a plane wave is a plane wave since its wavefronts are planar! This is captured in the visualization of Fig. 2.5(a) and (b).
By construction, the time and space wave that was arrived at in Sec. 2.3.3 was propagating along the z-direction. The optical disturbances we are discussing have no preference about the direction they propagate along, and we can easily generalize our equation representing a plane wave propagating along the z-direction to one propagating along an arbitrary direction.
[Figure 2.5: a) a monochromatic plane wave traveling along +z visualized as parallel planes of constant phase (−2π, 0, 2π, 4π) separated by λ, with the wave vector k normal to the planes; b) the same wave drawn as sets of parallel lines at t = 0 and at ωt = π, by which time the pattern has advanced by half a wavelength; c) a plane wave with wave vector k in a general direction.]
Before writing down the formula for a general monochromatic plane wave
we will make sense of the wave vector and its components. To do this it is useful
to introduce a vector space commonly referred to as k-space in wave optics.
This new mode of thinking will prove particularly powerful when we consider
[Figure 2.6: a) position space, showing a point r_p = (x_p, y_p, z_p) and the vector from the origin to it; b) k-space, showing a wave vector k_p with Cartesian components (k_xp, k_yp, k_zp) and direction-cosine angles α, β, γ measured from the x, y and z axes.]
Figure 2.6 Position space and k-space vector spaces. a) An illustration of the position space vector space. Points are located by three pieces of information. In this illustration a Cartesian coordinate system is adopted and each point r_p is identified by its three Cartesian components (x_p, y_p, z_p). Notice there are points in space and a vector associated with each point that has its tail at the origin and head at the point. b) An illustration of k-space. We will see each point in k-space corresponds to a permissible monochromatic plane wave. The wavevector can be expressed with respect to a Cartesian coordinate system, via spherical coordinates (not illustrated) or using direction cosines with angles (α, β, γ).
From Eq. (2.23) we see |\mathbf k| = \sqrt{k_x^2 + k_y^2 + k_z^2} = k.
There are a few more comments we can make regarding k-space. There
are different ways to parameterize the vectors that inhabit k-space. We have
seen one representation in Eq. (2.24) with the wave vector parameterized by
¡ ¢
its Cartesian components k = k x , k y , k z . It is also possible to use other or-
thonormal coordinate systems such as spherical coordinates. Again referring
to Fig. (2.6)b, in spherical coordinates the wavevector is expressible as k =
|k| sin θ cos φ, sin θ sin φ, cos θ . In this representation the angles determine the
¡ ¢
propagation direction (this is not explicitly illustrated in Fig. (2.6)b). This means
if I give you θ and φ you will know the waves propagation direction. A closely
related representation of the wavevector also relies on angles, but in this third
representation the direction cosines of the wavevector are used. The direction
cosines are found by measuring the angle between the three Cartesian axes and
the wavevector itself. Consulting Fig. (2.6)b it can be seen that the angle between
n̂x − k is α, between n̂y − k is β and between n̂z − k is γ. These angles satisfy the
property cos2 α + cos2 β + cos2 γ = 1. With these angles defined the wavevector
can also be written as k = |k| cos α, cos β, cos γ .
¡ ¢
After this exhaustive discussion of the spatial phase and wavevector we can
write down the real wavefunction for a monochromatic plane wave as
u(\mathbf r, t) = u_o \cos(\mathbf k \cdot \mathbf r - \omega t + \delta)   (2.25)
where all the parameters have been previously identified. To specify a monochromatic plane wave you need 4 pieces of information: the amplitude u_o, the wavevector \mathbf k, the angular frequency \omega and the initial phase angle \delta.
With these quantities we can define the set of all monochromatic plane waves.
An illustration of the wavefronts for a general monochromatic plane wave is
presented in Fig. (2.5)c. Notice that to make a plot of the scalar optical disturbance
function in Eq. (2.25) requires 5 dimensions! The optical strength u is one dimension, the position vector occupies three dimensions and time provides the last
dimension.
It is instructive to check that the plane wave in Eq. (2.25) satisfies the wave
equation given in Eq. (2.4). This is done by substituting Eq. (2.25) into Eq. (2.4).
We find
\left(\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\right) u_o \cos(\mathbf k \cdot \mathbf r - \omega t + \delta) = 0   (2.26)
\rightarrow\; k_x^2 + k_y^2 + k_z^2 = \frac{\omega^2}{c^2} \;\rightarrow\; \omega = kc.   (2.27)
The relationship ω = kc is an example of a dispersion relation. We see the wave
equation provides a constraint that the wavenumber and angular frequency must
satisfy. The two cannot be fixed arbitrarily but are related by the speed of light.
Thinking of this constraint in k-space, we see that the wavenumber is equal to
the length of the wavevector and it defines a sphere of radius k = ω/c. One such
sphere is illustrated in Fig. (2.6)b. Each point on the sphere in k-space defines a
monochromatic plane wave with the same angular frequency, but propagating in
a different direction.
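The k-space picture can be checked numerically. The following Python sketch (all parameter values are assumed for illustration) builds a wavevector from spherical angles and verifies that its direction cosines satisfy cos²α + cos²β + cos²γ = 1 and that |k| = ω/c sits on the k-sphere:

```python
import numpy as np

# Numerical sketch of the k-space picture; wavelength and angles are assumed values.
c = 3e8                  # speed of light [m/s]
lam = 633e-9             # wavelength [m] (assumed)
k = 2 * np.pi / lam      # wavenumber
omega = k * c            # dispersion relation omega = k c

theta, phi = 0.3, 1.1    # assumed spherical angles of the propagation direction
k_vec = k * np.array([np.sin(theta) * np.cos(phi),
                      np.sin(theta) * np.sin(phi),
                      np.cos(theta)])

# Direction cosines (cos alpha, cos beta, cos gamma) from the Cartesian components:
cosines = k_vec / np.linalg.norm(k_vec)
assert np.isclose(np.sum(cosines**2), 1.0)           # cos^2 terms sum to one
assert np.isclose(np.linalg.norm(k_vec), omega / c)  # |k| sits on the k-sphere
```

Varying θ and φ moves the point around the sphere of radius ω/c, i.e. through the family of plane waves sharing one temporal frequency.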
Just like the simple harmonic oscillator, the monochromatic plane wave can
be expressed as a phasor and the prescription for doing this is identical to the
simple harmonic oscillator; recall Eq. (2.11). The first phasor we define is also
called the complex wavefunction. The complex wavefunction is defined as
U(\mathbf r, t) = u_o e^{i(\mathbf k \cdot \mathbf r - \omega t + \delta)}
where the complex wavefunction and real wavefunction are related via
u(\mathbf r, t) = \mathrm{Re}\{U(\mathbf r, t)\}.
The complex wavefunction is a dynamic phasor since its phase changes as time
evolves. It is also useful to define a static phasor known as the complex amplitude.
The complex amplitude is defined as
U(\mathbf r) = u_o e^{i(\mathbf k \cdot \mathbf r + \delta)}, \qquad U(\mathbf r, t) = U(\mathbf r)\, e^{-i\omega t},
where there is a slight abuse of notation since U is used for both the complex
wavefunction and complex amplitude. It will be clear from the context which
phasor is being used. Generally we will be primarily focused on the complex
amplitude.
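The relation between the static and dynamic phasors can be verified numerically. A minimal Python sketch, with assumed values for the amplitude, phase, wavelength and sample point, checks that Re{U(r)e^{−iωt}} reproduces the real wavefunction u_o cos(k·r − ωt + δ):

```python
import numpy as np

# Check u(r, t) = Re[U(r) e^{-i omega t}]; all parameter values are assumed.
u_o, delta = 1.3, 0.4              # amplitude and initial phase angle
lam = 500e-9                       # wavelength [m]
k = 2 * np.pi / lam
omega = k * 3e8                    # omega = k c
z, t = 1.0e-6, 2.0e-15             # sample position (on the z-axis) and time

U_static = u_o * np.exp(1j * (k * z + delta))   # complex amplitude (static phasor)
U_dynamic = U_static * np.exp(-1j * omega * t)  # complex wavefunction (dynamic phasor)
u_real = u_o * np.cos(k * z - omega * t + delta)

assert np.isclose(U_dynamic.real, u_real)
```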
Figure 2.7 Plane wave visualization. Left: Top panel and bottom panel plot the optical
strength in the x − z plane of a monochromatic plane wave propagating in the +z-direction. The bottom panel has advanced in time by t = T/4 resulting in a phase shift
of π/2. Middle: Top and bottom are linecuts of the optical strength along x = 0. Right:
The bottom panel is the irradiance along the same linecut (x = 0). The irradiance is
constant throughout space and time for a monochromatic plane wave. Top panel:
Phasor associated with the complex wavefunction.
the spherical wave's spatial phase returns the wavevector k = k\hat{\mathbf n}_r that is everywhere
normal to the spherical surface. Notice the wavevector is parallel to the unit
radial vector in spherical coordinates. The second major change when compared
to the monochromatic plane wave is that the complex envelope now depends on
position. Notice
|u_o| e^{i\angle u_o} \;\rightarrow\; \frac{|u_o|}{r}\, e^{i\angle u_o}   (2.34)
and the complex amplitude of the wave decays as 1/r.
Figure 2.8 visualizes the wavefronts, the optical disturbance strength and
irradiance of a monochromatic spherical wave.
\lim_{T\to\infty}\frac{2}{T}\int_0^T u^2(\mathbf r, t')\,dt' = \lim_{T\to\infty}\frac{2u_o^2}{2T}\int_0^T \left(1 + \cos\left(2\omega t' + 2\varphi(\mathbf r)\right)\right) dt' = u_o^2   (2.35)
where T is the response time of the detector. The detector response time T determines the fastest signal a detector can faithfully reproduce. The fastest optical
detectors are on the order of 100 GHz (1 GHz = 10^9 Hz) and the frequency of
optical waves is a few hundred THz (1 THz = 10^12 Hz). This means that there are
more than 10^3 cycles of the optical wave during the detector response time T. As a
result of this, optical wave detectors integrate the instantaneous intensity over the
detector response time and only report an average value. The limiting procedure
in Eq. (2.35) ensures mathematically the integral converges to a well-defined
value.
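These orders of magnitude are easy to check numerically. The Python sketch below (the optical frequency and detector bandwidth are assumed round numbers) counts the optical cycles within the response time and confirms the time average of 2u² converges to u_o², as in Eq. (2.35):

```python
import numpy as np

# Assumed round numbers: a 500 THz optical wave and a 100 GHz detector.
nu = 500e12                 # optical frequency [Hz]
T_det = 1 / 100e9           # detector response time [s]
cycles = nu * T_det         # optical cycles during the response time (about 5000)
assert cycles > 1e3

u_o, omega, phi = 1.0, 2 * np.pi * nu, 0.7
t = np.linspace(0.0, T_det, 200_001)
u2_avg = np.mean(2 * (u_o * np.cos(omega * t - phi))**2)  # discrete time average
assert abs(u2_avg - u_o**2) < 1e-3   # average of 2 u^2 converges to u_o^2
```

The detector therefore reports only the averaged irradiance; the 2ω oscillation in the instantaneous intensity is invisible.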
for the irradiance. Notice in Eq. (2.38) the monochromatic plane wave's phase
does not influence the irradiance. So, although we found the phase is responsible
for the variation in optical strength of a plane wave, it does not impact the
irradiance. This points to a generic difficulty in measuring optical waves and
fields: accessing the phase information. Recovering the phase of an optical wave
is an ongoing challenge in optical measurements, and many techniques for
phase retrieval leverage interference effects.
Figure 2.9 Plane wave propagation through a planar slab. Left: The panel represents
the wavefronts of a monochromatic plane wave propagating through a distance L of
free space and through a medium of refractive index n. The spacing of the wavefronts
is determined by the wavelength (λ). In the medium, the distance between wavefronts
is contracted to λ/n. Right: Phasor diagram for the two complex amplitudes associated
with each path at z = L. Phasor 2 lags phasor 1 by an amount equal to (n − 1)kL. Also,
although wavefronts stay connected through space, walking along the x-direction
there is a pronounced jump in the spatial phase.
a distance L outside the medium I encounter fewer crests than if I walk through
the medium. With this perspective, there is more spatial phase acquired by the
wave propagating through a medium of length L as compared to the same wave
outside the medium. Phasors can be used to quantify this intuition.
First, in propagating a distance L a monochromatic plane wave’s complex
amplitude acquires a phase
U_1(z = L) = u_o e^{ikL}   (2.39)
and by the same reasoning the spatial phase acquired in propagating through a
medium of the same thickness with refractive index n is
U_2(z = L) = u_o e^{inkL}.   (2.40)
Notice in both Eq. (2.39) and (2.40) the waves propagate a physical distance L
(a geometric length), although in the second case the acquired phase is nkL.
Recall, the physical distance scaled by the refractive index is called the optical
path length. These are distinct “lengths". We can also ask, what is the relative
phase of the waves in Eq. (2.39) and (2.40) at the location z = L. This is
\frac{U_2(z = L)}{U_1(z = L)} = \frac{u_o e^{inkL}}{u_o e^{ikL}} = e^{i(n-1)kL}.   (2.41)
The discussion of this paragraph is important. Notice the second wave has
acquired a relative phase of (n − 1)kL as compared to the wave that has propagated
through free space. It is useful to consider the previous result in the context of phasors.
The right panel of Fig. 2.9 presents a phasor diagram that contains the complex
amplitudes associated with each of the waves. If we assume the phasors are
aligned with the positive real axis when t = z = 0, then the orientation of each
phasor catalogs the phase acquired by each wave and the difference is the relative
phase difference between the two waves at z = L (we are ignoring the fact that the
phasors rotate counter-clockwise as time evolves and are only bookkeeping the
phasor rotation due to spatial phase accumulation). Since positive angles
correspond to clockwise rotation, and nkL > kL, it is clear phasor 2 acquires
more phase than phasor 1. From our earlier discussions this means phasor 2 lags
phasor 1, or phasor 1 leads phasor 2. This also makes sense with our intuition.
If we labeled the wavefront at z = 0 at time t = 0 we would have to wait longer
for the wave that propagates through the material to arrive at z = L as compared
to the wave that propagates the same distance through free space. Why do we
wait longer? The phase velocity in the medium, v = c/n, is slower than the phase
velocity in free space, v = c, and it takes longer for the wave to travel through
the medium. Also, notice in the lower right panel of Fig. 2.9 that if you walk along a
direction orthogonal to the propagation direction (in the illustration, x), there is a
pronounced jump in the spatial phase in going from the high-index medium into
free space.
So we have found that in propagating through a medium of refractive index n, an
optical wave's phase is modified by the medium, and we are led to make the
following observations:
The physical path length, PL, traveled by a wave equals the physical distance
traveled; what you would measure with a ruler.
The optical path length, OPL, traveled by a wave equals the physical distance traveled multiplied by the refractive index.
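The phasor bookkeeping of Eqs. (2.39)-(2.41) can be sketched in a few lines of Python; the wavelength, refractive index and slab thickness below are assumed values:

```python
import cmath
import math

# Assumed values for wavelength, refractive index and slab thickness.
lam = 500e-9
k = 2 * math.pi / lam
n, L = 1.5, 2e-6

U1 = cmath.exp(1j * k * L)        # free-space phasor, Eq. (2.39) with u_o = 1
U2 = cmath.exp(1j * n * k * L)    # in-medium phasor, Eq. (2.40)
relative = U2 / U1                # relative phase, Eq. (2.41)
assert cmath.isclose(relative, cmath.exp(1j * (n - 1) * k * L))

PL, OPL = L, n * L                # physical vs optical path length
assert OPL > PL                   # the medium adds (n - 1) L of optical path
```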
is collected by a lens or perhaps a telescope gazing out at a star in space. For the
ensuing discussion it is assumed the optical system has a circular entrance pupil
with radius ∆a (the entrance pupil may also simply be called the aperture in what
follows). Our goal is to use some mathematical tools to quantify how planar
the plane wave delivered to the entrance pupil of the optical system by the point
source is. Said another way, we will be able to quantify when it is reasonable to
approximate the incoming optical energy as being delivered by a plane wave.
What this all means is we need to figure out how to turn the complex amplitude
of the spherical wave into the complex amplitude of a plane wave.
[Margin note: In a later chapter we will discover the finite size of the star (it is not really a point) influences the regularity of the wavefront the telescope receives.]
To get started, we of course assume the point source generates a spherical
wave - the wavefronts are spherical as discussed in Sec. 2.3. In moving from left
Figure 2.10 Relating spherical waves and plane waves. a) A spherical point source
delivers a planar wavefront to the entrance pupil of an optical system provided the
distance z = L_z is far from the point source. This is the origin of the phrase a
plane wave is generated by a spherical wave at infinity. In between, the wavefront is
paraboloidal. b) Focusing a plane wave (left to right) or collimating a spherical wave
(right to left). c) Geometry for finding the transmission coefficient of a thin lens.
to right the distance from the point source, z = L z , increases. When L z becomes
large and we consider points x and y close to the z-axis the spatial phase of a
spherical wave can be expressed as
kr = k\left(L_z^2 + x^2 + y^2\right)^{1/2} = kL_z\left(1 + \frac{x^2 + y^2}{L_z^2}\right)^{1/2}.   (2.42)
Why the previous rearrangement of the spatial phase? We are building a line of
reasoning that involves the physical intuition provided by the illustration. From
the illustration, it appears that points in the transverse plane close to the optical
axis will provide the domain over which the wavefront can be approximated as
planar. Notice in Eq. (2.42) the second term in the square root is a ratio of
the transverse distance to the axial distance. For near-axis points this ratio is small
and a binomial expansion of Eq. (2.42) yields
kr \approx kL_z + k\frac{x^2 + y^2}{2L_z}.   (2.44)
[Margin note: Recall the wavevectors in wave optics are co-linear with the rays of ray optics. The nearly on-axis points we are considering correspond to nearly on-axis wavevectors or rays.]
The first term,
kL_z,   (2.45)
is referred to as the linear phase. Why? The spatial phase is linearly dependent on
L_z. The second term is
k\frac{x^2 + y^2}{2L_z}   (2.46)
and is called the quadratic phase. Quadratic phases are extremely important and
will be encountered many times in wave optics. Notice, at the level of approxi-
mation in Eq. (2.44), in the region between the source origin and the entrance
pupil plane, the wavefronts are parabolas and this wave is called a paraboloidal
wave. The second term on the right hand side in Eq. (2.44) describes a parabolic
or quadratic phase variation and when added to the constant linear phase the
resultant wavefront is still a parabola.
With Eq. (2.44) it is possible to determine both under what conditions the
optical system’s entrance pupil receives a plane wave and what is the deviation
from planarity. For the spatial phase in Eq. (2.44) to be linear the second term
must be small. Quantitatively,
k\,\frac{(x^2 + y^2)_{max}}{2L_z} \ll \pi \;\rightarrow\; N_F = \frac{\Delta_a^2}{\lambda L_z} \ll 1   (2.47)
the quadratic phase must be much smaller than π for it to be negligible in the
entrance pupil. In Eq. (2.47), the maximum transverse distance in the entrance
pupil (orthogonal to the optical axis) has been denoted by ∆a (for a circular entrance pupil this is the pupil radius). After rearranging the algebraic constraint in
Eq. (2.47) the Fresnel number N_F is defined. In this new statement, small quadratic
phase is equivalent to having a Fresnel number much less than 1. Therefore, given
the entrance pupil extent, wavelength and source-pupil distance we can determine when the quadratic phase variation of the wavefront is negligible so that the
[Margin note: π is chosen by convention since this is the phase most different from 0.]
2.5 Connecting monochromatic plane waves and spherical waves 33
wavefront received by the optical system (see Fig. 2.10a) is approximately planar
(only the first term on the right hand side in Eq. (2.44) contributes). What is the
maximum phase variation across the entrance pupil given N_F? Just substitute
N_F into Eq. (2.46). The result is πN_F!
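As a numerical illustration, the sketch below (with an assumed pupil radius, wavelength and source distance) evaluates the Fresnel number of Eq. (2.47) and confirms the maximum quadratic phase across the pupil equals πN_F:

```python
import math

# Assumed values: green light, a 5 mm pupil radius, and a source 1 km away.
lam = 550e-9
Da = 5e-3
Lz = 1e3

NF = Da**2 / (lam * Lz)                              # Fresnel number, Eq. (2.47)
max_phase = math.pi * NF                             # maximum phase error: pi * NF
quad_phase = (2 * math.pi / lam) * Da**2 / (2 * Lz)  # k (x^2 + y^2)_max / (2 Lz)

assert math.isclose(max_phase, quad_phase)
assert NF < 1    # wavefront approximately planar across this pupil
```

For these assumed numbers N_F is about 0.045, so the quadratic phase variation stays well below π and the plane-wave approximation is reasonable.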
A second connection between spherical and planar wave fronts is provided
by a thin lens. This is illustrated in Fig. 2.10b. In this illustration, in going from
left to right a planar wavefront is focused by the lens. Focusing converts the
planar wavefront to a spherical wavefront. Similarly, in going from right to left,
a spherical wave emitted by a point source is converted into a collimated plane
wave by a thin lens. For this reason it is often said a lens is able to make the
plane at infinity accessible in an optical system. To use a lens in wave optics it
is important to understand the phase shift introduced by a thin lens. The phase
shift can be determined by considering Fig. 2.10c. Assuming the lens is centered
on the z-axis, has a refractive index n and a focal length f we can determine the
phase shift for each ray, propagating parallel to the z-axis, that encounters the
spherical surface at location x, y. The path length is
d_o - d(x, y) = d_o - (f - z') = d_o - \left(f - \sqrt{f^2 - x^2 - y^2}\right) \approx d_o - \frac{x^2 + y^2}{2f}   (2.48)
where we have assumed f >> x, y and made the paraxial approximation. The
phase is found to be
\approx kd_o - k\frac{x^2 + y^2}{2f}   (2.49)
which consists of a constant phase we will ignore and a second quadratic phase.
Assuming the lens has unity power transmission (no loss or reflection for the
wave propagating through it) we can write the transmission function for the lens
as
t_l(x, y) = e^{-ik\frac{x^2 + y^2}{2f}},   (2.50)
which is by assumption a pure phase transformation on the wave that passes
through the lens. The sign of the lens phase indicates that the more off-axis
the wavefront location, the more advanced it is with respect to the lens center.
This is physically sensible since the wave takes longer to pass through the thicker
center of the lens when compared to the thinner portions. The nonuniform phase
delay across the lens is what allows it to convert planar wavefronts to spherical
wavefronts and vice versa (see Fig. 2.11).
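The lens transmission function of Eq. (2.50) can be sketched numerically; the wavelength, focal length and aperture slice below are assumed values:

```python
import numpy as np

# Assumed values: HeNe wavelength, f = 10 cm lens, a 2 mm slice at y = 0.
lam, f = 633e-9, 0.1
k = 2 * np.pi / lam

x = np.linspace(-1e-3, 1e-3, 201)
y = 0.0
t_l = np.exp(-1j * k * (x**2 + y**2) / (2 * f))  # thin-lens phase, Eq. (2.50)
U_in = np.ones_like(x, dtype=complex)            # unit-amplitude plane wave
U_out = t_l * U_in                               # Eq. (2.51)

assert np.allclose(np.abs(U_out), 1.0)           # pure phase: no amplitude change
# Off-axis points carry a different phase than the lens center (x = 0):
assert not np.isclose(np.angle(U_out[0]), np.angle(U_out[100]))
```

The quadratic phase profile across the aperture is exactly the nonuniform delay that reshapes a planar wavefront into a converging spherical one.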
In summary, as illustrated in Fig. 2.11a, a thin spherical lens results in a plane
wave being focused to a focal spot. In passing through the lens (assuming the
lens center is coincident with the coordinate origin) the incoming wave's complex
amplitude gets modified according to
U_{out}(x, y) = t_l(x, y)\, U_{in}(x, y) = U_{in}(x, y)\, e^{-ik\frac{x^2 + y^2}{2f}}   (2.51)
which for a plane monochromatic wave equals U_o e^{-ik\frac{x^2 + y^2}{2f}}. Although the converging wavefronts are spherical, we can also investigate the phase of the converging
Figure 2.11 Focusing a plane wave. a) A plane monochromatic wave is focused by a
thin spherical lens. The planar wavefronts are converted to spherical wavefronts. b)
Spatial phase across the plane indicated by the dashed line in a).
wave across the plane indicated by the vertical dashed line in Fig. 2.11a; we find
the result plotted in Fig. 2.11b. These are two different quantities related to a
wave's phase. We see the converging spherical wavefronts result in a spatial phase
distribution, across a plane perpendicular to the optical axis, that is a set of concentric
circles. The particular slice plotted has the location x = y = 0 coincident with an
anti-node of the converging wave for the instant of time selected.
Chapter 3
The interference of two waves
The stage has now been set to explore what happens when we start to add up
waves. As we said in Ch. 2, since the wave equation is linear it obeys the
principle of superposition. This is just a fancy way of saying that if we know multiple
solutions to the wave equation then we know their sum is a solution too! It is this fact that
makes interference possible in wave optics. A second point to recall is that the
optical disturbance function’s, i.e. the wavefunction’s, strength was found to vary
as a result of its phase changing both in position and time. Interference occurs
when we superpose two waves with a relative phase difference. In this chapter
we will concern ourselves with the interference of two waves. Later in the book we
will relax this constraint. By focusing on the interference of two waves we can gain
intuition and understanding without the complication of added mathematical
clutter such as summations and integrals. In this chapter, after some general
discussion of two-wave interference, we will consider the interference of the
specific elementary waves we introduced in Ch. 2. This will provide a foundation
to investigate interferometers in the next chapter and many wave interference
effects as well as diffraction. Remember, wave optics is fundamentally about
understanding how to add up elementary waves!
38 Chapter 3 The interference of two waves
where the subscript refers to wave 1 and wave 2 and the complex envelope’s initial
phase angle (if there is one) is absorbed into the wave’s total phase. Although we
will not pursue this point, if both phasors rotate with the same temporal frequency,
we can find a third phasor that rotates with the same temporal frequency that has
a new complex envelope and spatial phase.
To find the irradiance associated with Eq. (3.2) we need to find the magnitude
squared of the total complex wavefunction. Explicitly we calculate
where the space and time dependence of Ui has been suppressed for simplicity.
Expanding each one of the four terms we find
Notice the third and fourth lines are of the form a e^{i\psi} + a e^{-i\psi}, which equals
2a\cos\psi.
Next, we define the irradiance associated with wave i as I_i(\mathbf r) = I_i (note |u_i| = \sqrt{I_i}, and of course there is no time dependence in the constituent wave irradiance)
so that we can recollect all the terms in Eq. (3.7) as
I_T(\mathbf r, t) = I_1 + I_2 + 2\sqrt{I_1 I_2}\,\cos\left(\Delta\phi(\mathbf r, t)\right)   (3.8)
where
\Delta\phi(\mathbf r, t) = \theta_2(\mathbf r, t) - \theta_1(\mathbf r, t)   (3.9)
is the phase difference between the two waves. The irradiance pattern resulting
from this superposition of two waves is called an interferogram. In Eq. (3.8) I 1
and I 2 are called the direct terms resulting solely from each constituent wave and
3.1 Interference: general considerations 39
the “interference" term depending on the relative phase arises from the cross-
terms (in Eq. (3.7) lines 3 and 4). If the complex envelopes of each wave are equal,
i.e. |u_1| = |u_2|, such that I_1 = I_2 = I_o, then Eq. (3.8) simplifies to
I_T(\mathbf r, t) = 2I_o\left(1 + \cos\left(\Delta\phi(\mathbf r, t)\right)\right).   (3.10)
Again, a fringe also has the same value of irradiance and it is common to refer to
the irradiance maxima as fringes. An interferogram is composed of a collection
of fringes and much information can often be extracted from the geometric
structure of a fringe pattern. In the following we will discuss many different
fringe patterns. Finally, the definition of a fringe should be contrasted with the
definition of a wavefront. A wavefront is the locus of points of constant phase for
an optical disturbance whereas a fringe is the locus of points in an interferogram
with constant phase difference. Make sure you appreciate this difference since it
is often a source of confusion.
Second, a natural question is where is the interference irradiance a maximum?
Focusing on the argument of the interference term it is clear I T is a maximum
when ∆φ = 2πm where m = 0, ±1, ±2, ... (an integer); this makes cos(∆φ) = 1.
What does m, the fringe order, tell us about the interfering waves?
The magnitude of the fringe order m is the number of waves of optical path
difference between the interfering waves.
For example, m = 2 means there are two waves of optical path difference, or a
relative phase difference of 4π, between the two interfering waves. The value of
irradiance at the maximum is I_1 + I_2 + 2\sqrt{I_1 I_2}. Similarly, the minimum occurs for
∆φ = π(2p + 1) where p = 0, ±1, ±2, ... (an integer); this makes cos(∆φ) = −1.
The value of irradiance at the minimum is I_1 + I_2 - 2\sqrt{I_1 I_2}. A convenient definition
that aims to capture “how much” or “how deep” the interference is, is the visibility.
Visibility is defined as
V = \frac{I_{max} - I_{min}}{I_{max} + I_{min}}   (3.12)
where I_{max} (I_{min}) are the maximum (minimum) values in the irradiance pattern.
From the previous considerations the visibility can be expressed as
V = \frac{2\sqrt{I_1 I_2}}{I_1 + I_2}.   (3.13)
If the complex envelopes of the two waves are equal then I_1 = I_2 = I_o and the
visibility equals one, V = 1. Also, if the waves are incoherent, so that I_T = I_1 + I_2,
then the visibility equals zero, V = 0. These properties make the visibility a good
measure of an interferogram's strength. Notice V < 1 if the waves are of unequal
amplitude and/or if the waves are partially coherent. We will return later to how
an interferometer's output reflects properties, such as coherence, of a light source.
In Fig. 3.1 two point sources radiate spherical waves that are made planar by
a thin lens (see Section 2.5) and then overlapped in space. The role of the lens
is to bring the plane at infinity to the lens's focal plane. In this way the planar
wavefronts are easily presented to the region of space on the opposite side of the
lens. Critical in this example is that the two point sources are coherent. In general,
if these were two atoms or two stars, the sources would be incoherent and we
would not observe any interference. By assuming coherence interference effects
become pronounced. The dashed box identifies the region in space where the
total optical disturbance irradiance is measured. To quantify the total irradiance it
is necessary to find expressions for the two wavevectors kA and kB . This depends
on the coordinate system.
Figure 3.1 Point sources in the focal plane of a lens. Two point sources that radiate
spherical waves are each situated at the focal point of a lens. After the lens collimates
the waves radiated by each source, the two plane waves are overlapped. The irradiance
is measured within the dashed box. By considering only the dashed box, as presented
in the inset, it is clear considering the interference of two monochromatic plane waves
is appropriate.
The second scenario is presented in Fig. 3.2: two point sources, located at infinity,
radiate spherical waves. The coordinate origin is coincident with the geometric
center of the sphere. The irradiance in this region can be understood as resulting
from the interference of two monochromatic plane waves. For this example the
simplifications just introduced will be employed to justify the plane wave
interference. This will require approximations to the complex envelope and
spatial phase of each point source. In Fig. 3.2, point source i is located at a
distance r_i = r from the coordinate origin since both point sources are situated
the same distance from the center of the sphere. The calculation for sources A
and B is the same, so only the calculation for point source A will be presented;
the derivation is identical for B. Before starting, note the goal of this derivation is to
demonstrate that in the vicinity of the sphere’s center, each point source delivers a
plane wave that propagates along a direction determined by the source’s location.
Consulting the illustration, our intuition tells us the ray emitted by the point
source (that connects the point source to the origin) should be collinear with the
approximate plane wave.
Figure 3.2 Point sources at infinity. Two point sources located at infinity radiate spher-
ical waves. The coordinate origin is located at the geometric center of the sphere. In
this region the total irradiance can be understood from considering the interference of
two monochromatic plane waves. The dashed box identifies the region to be consid-
ered and illustrates the planar wavefronts delivered by the two sources. The grey circle
identifies a specific detector point.
For point source A, the geometry in Fig. 3.2 reveals rA + rAd = rd where rd is
the point of detection (d). The spatial phase at this location for source A is
k_A r_{Ad} = k_A |\mathbf r_d - \mathbf r_A| = k_A\left((x_d - x_A)^2 + (z_d + z_A)^2\right)^{1/2}   (3.14)
where the definitions r_d = (x_d, z_d) and r_A = (x_A, z_A) have been used. The next
step is to rearrange the spatial phase to leverage the paraxial approximation.
Specifically
k_A\left((x_d - x_A)^2 + (z_d + z_A)^2\right)^{1/2} = k_A r\left(1 + \frac{-2x_d x_A + 2z_d z_A}{r^2} + \frac{x_d^2 + z_d^2}{r^2}\right)^{1/2}   (3.15)
where r_A^2 = x_A^2 + z_A^2 = r^2, with r the sphere radius. In carrying through the paraxial
approximation, the third term under the square root is assumed to be small
3.2 Interference of two monochromatic plane waves 43
(the Fresnel number is much less than 1 for this situation) and the spatial phase
becomes
k_A\left((x_d - x_A)^2 + (z_d + z_A)^2\right)^{1/2} \approx k_A r - \frac{k_A x_d x_A - k_A z_d z_A}{r}.   (3.16)
In Eq. (3.16) we notice there is a global phase in r and a linear phase in the detector
coordinates x_d and z_d. From the geometry, z_A/r = cos θ_A and x_A/r = sin θ_A.
Substituting this into Eq. (3.16), the spatial phase can be written in terms of direction
cosines, where α = π + θ_A and γ = θ_A are the appropriate direction cosine angles. The final
result for the spatial phase of the wave generated by point source A in the vicinity
of the sphere's geometric center, where the irradiance will be measured, is
k_A r + \mathbf k_A \cdot \mathbf r_d   (3.19)
which is the spatial phase of a monochromatic plane wave. A similar story holds
for source B. With this conceptual background in hand, the next section considers
in detail the interference of two monochromatic plane waves.
\Delta\phi(\mathbf r) = (\mathbf k_2 - \mathbf k_1)\cdot\mathbf r + (\delta_2 - \delta_1)   (3.20)
where the wavevector and initial phase angle differences determine the relative
phase. Two monochromatic plane waves with the same temporal frequency do
not necessarily have the same wavevector. The temporal frequency also specifies
the wave's wavenumber and wavelength, but not the propagation direction. We
will see the observed fringe pattern in a monochromatic plane wave interferogram
will reflect the relative angle of propagation between the two waves.
Since \mathbf k_1 = (k_{1x}, k_{1y}, k_{1z}) and \mathbf k_2 = (k_{2x}, k_{2y}, k_{2z}) we can evaluate the phase
difference in Eq. (3.20) as
\Delta\phi(\mathbf r) = (k_{2x} - k_{1x})x + (k_{2y} - k_{1y})y + (k_{2z} - k_{1z})z + (\delta_2 - \delta_1)   (3.21)
where the position vector, r = (x, y, z), locates the observation point. If
we remember the wavevector definition in terms of the direction cosines then the
phase difference is equivalent to
\Delta\phi(\mathbf r) = k\left[(\cos\alpha_2 - \cos\alpha_1)x + (\cos\beta_2 - \cos\beta_1)y + (\cos\gamma_2 - \cos\gamma_1)z\right] + (\delta_2 - \delta_1).   (3.22)
By substituting Eq. (3.22) into Eq. (3.25) we find the following expressions for the
fringe spacing
\Lambda_x = \frac{\lambda}{|\cos\alpha_2 - \cos\alpha_1|}   (3.26)
\Lambda_y = \frac{\lambda}{|\cos\beta_2 - \cos\beta_1|}   (3.27)
\Lambda_z = \frac{\lambda}{|\cos\gamma_2 - \cos\gamma_1|}.   (3.28)
Let’s consider the previous results for two plane waves propagating in the x −
z plane (in which case k_{2y} and k_{1y} equal zero) with the wavevector for wave
2 making an angle of 60° = π/3 with the z-axis. Figure 3.3 illustrates the real
wavefunctions for both of these waves, see Fig. 3.3a and Fig. 3.3b, and the real
wavefunction that results from their superposition (Fig. 3.3c). In Fig. 3.3c energy
flows along the direction of the white arrow that is parallel to the wavefunction
variation (the red and blue peaks and valleys). The second column of Fig. 3.3
illustrates the wavefronts of the two waves (Fig. 3.3d and Fig. 3.3e) as well as how
the fringes result (solid line) in their superposition (the wavefronts are the two
dashed lines) in Fig. 3.3f.
For this geometry the wavevectors, in terms of their direction cosines (note
cos β_1 = cos β_2 = 0), are
\mathbf k_1 = k(0, 0, 1)   (3.29)
\mathbf k_2 = k(\cos 5\pi/6,\, 0,\, \cos\pi/3) = k\left(-\sqrt{3}/2,\, 0,\, 0.5\right)   (3.30)
where k is the magnitude of k_1 and k_2 since they have the same temporal frequencies. We can use Eq. (3.22) to find the phase difference
\Delta\phi(\mathbf r) = -\left(k\sqrt{3}/2\right)x - (k/2)z + (\delta_2 - \delta_1).   (3.31)
Notice the slope of the fringes can be found from the previous result. Viewing
Eq. (3.31), at constant phase, as the equation of a line x = mz + b, we see that the
slope is m = -1/\sqrt{3}. The fringe spacings are
\Lambda_x = \frac{\lambda}{|-\sqrt{3}/2|} = \frac{2\lambda}{\sqrt{3}}   (3.32)
\Lambda_z = \frac{\lambda}{|0.5 - 1|} = 2\lambda   (3.33)
\Lambda = \frac{\Lambda_x \Lambda_z}{\sqrt{\Lambda_x^2 + \Lambda_z^2}} = \lambda.   (3.34)
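The worked example can be reproduced numerically. The sketch below works in units of the wavelength and recomputes Eqs. (3.26), (3.28) and (3.34) from the direction cosines:

```python
import math

lam = 1.0                                    # work in units of the wavelength
# Direction cosine angles: wave 1 along z, wave 2 at pi/3 from z in the x-z plane.
a1, g1 = math.cos(math.pi / 2), math.cos(0.0)              # wave 1: (0, 1)
a2, g2 = math.cos(5 * math.pi / 6), math.cos(math.pi / 3)  # wave 2: (-sqrt(3)/2, 1/2)

Lx = lam / abs(a2 - a1)                 # Eq. (3.26)
Lz = lam / abs(g2 - g1)                 # Eq. (3.28)
L = Lx * Lz / math.sqrt(Lx**2 + Lz**2)  # Eq. (3.34)

assert math.isclose(Lx, 2 * lam / math.sqrt(3))
assert math.isclose(Lz, 2 * lam)
assert math.isclose(L, lam)             # fringe period equals the wavelength
```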
Figure 3.3 Monochromatic plane wave interference in the x − z plane. a) Real wave-
function for plane wave 1. b) Real wavefunction for plane wave 2 propagating at π/3
from the z-axis. c) The resultant superposition. d) and e) are the wavefronts of wave 1
and 2. f) shows how the fringes (solid line) arise from the wavefronts, the two dashed
lines.
Figure 3.5 Interference of two spherical waves. This is the geometry we consider for
the interference of two spherical waves.
Consider two point sources located at a distance ±d/2 along the x-axis. Each of these point sources generates a diverging spherical wave; see Eq. (2.34). The observation point is located by the observation point vector r_P = (x_p, y_p, z_p). The displacement along the
x-axis causes the r in Eq. (2.34) to be replaced by r_AP and r_BP, the distances from point sources A and B to the observation point. These distances are found from the magnitudes of the vectors r_AP = (x_p − d/2, y_p, z_p) and r_BP = (x_p + d/2, y_p, z_p).
If we assume the waves have the same temporal frequency and initial phase
angles then from the given geometry the phase difference in Eq. (3.9) is expressible
as
\[ \Delta\theta(k, r_{AP}, r_{BP}) = (r_{BP} - r_{AP})\,k \tag{3.35} \]
where the difference in path length r BP − r AP determines the phase difference at
the observation point. It is clear from Eq. (3.10) and Eq. (3.35) that maxima in the interference pattern occur when r_BP − r_AP = mλ, i.e. when the path length
difference to the specified point is an integer multiple of wavelengths. The integer
m is again the number of full waves of path difference to the irradiance maximum.
To figure out the irradiance, we either directly use Eq. (3.35) or we consider
certain regions of space where the path length difference simplifies. In general, the
fringes that result from the interference of two spherical waves are hyperboloids.
These fringes are illustrated in Fig. (3.6). The two panels evaluate the irradiance
associated with the phase difference given in Eq. (3.35) for two different source
locations. In panel a, it is assumed the separation between the sources is d = 4λ. Notice along the x-axis for locations greater than d/2 the irradiance is a maximum.
In panel b, the distance is some fraction of a full wavelength so that the phase
difference related to the source geometry does not correspond to a multiple of
2π. For this configuration, along the x-axis for locations greater than d /2, the
irradiance takes on an intermediate value between its max and min. It should
also be clear from the images that there are some maximum number of fringes
that fit between the two spherical sources. Referring to Fig. 3.5 there are four regions
Figure 3.6 Fringe patterns for two interfering monochromatic spherical waves. a) The source separation is an integer multiple of the wavelength. b) The source separation is not an integer multiple of the wavelength.
in which things simplify. The easiest region to consider is the set of points that are
equidistant from the two point sources. These points coincide with the x − y
plane at z = 0. For all points in this plane r BP = r AP so this determines the m = 0
fringe. The x − y plane at z = 0 is coincident with the m = 0 irradiance maxima.
Next consider points along the x-axis between the sources (y_p = z_p = 0, |x_p| < d/2). There r_AP = √((x_p − d/2)²) = d/2 − x_p and r_BP = √((x_p + d/2)²) = d/2 + x_p, where we have ensured that the number being squared underneath each square root is positive, so that r_BP − r_AP = 2x_p. Constructive interference occurs for locations
\[ 2x_p^{m} k = 2\pi m \;\rightarrow\; x_p^{m} = \frac{m\lambda}{2} \tag{3.38} \]
where the m superscript indicates that there is a discrete collection of x_p values for which there are irradiance maxima. Since x_p must be less than d/2, from Eq. (3.38) it follows that the largest fringe order m_max occurs for
\[ x_p^{m} = \frac{d}{2} = \frac{m_{max}\lambda}{2} \;\rightarrow\; m_{max} = \frac{d}{\lambda}. \tag{3.39} \]
If d/λ is not an integer then you should take the integer part of d/λ. In total there are 2m_max + 1 fringes between the two point sources along the x-axis.
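A quick sketch of this fringe count, using the d = 4λ example of Fig. 3.7a (the specific values are assumptions for illustration):

```python
# Sketch of Eqs. (3.38)-(3.39): fringe maxima along the x-axis between two
# point sources separated by d (values chosen to match Fig. 3.7a).
lam = 1.0          # wavelength in um (assumed)
d = 4.0            # source separation in um (assumed)

m_max = int(d / lam)                    # largest fringe order, Eq. (3.39)
n_fringes = 2 * m_max + 1               # total maxima between the sources
x_maxima = [m * lam / 2 for m in range(-m_max, m_max + 1)]  # Eq. (3.38)

print(m_max, n_fringes)   # 4, 9
print(x_maxima)           # -2.0 to 2.0 in steps of lam/2
```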
We can plot the irradiance along the x-axis and this is illustrated in Fig. 3.7. It
combines what we have observed in the previous two cases. In the region between
the sources, the sources generate counter-propagating waves that interfere as
standing waves with a spatial period of λ/2. These are the fringes we identified in
Fig. 3.6. For locations beyond the sources, the geometry and source wavelength determine the phase difference for all x_p and the irradiance is a constant value.
This was also evidenced in Fig. 3.6. In Fig. 3.7a) it is assumed the source sepa-
ration is a whole number of wavelengths and in panel b) the distance is not an
integer multiple of the wavelength.
Next we will consider the irradiance I(x_p, 0, L_z) along the x-axis at the location z_p = L_z. This region is illustrated in Fig. 3.5. For points such that x_p ≪ L_z we
can make use of the paraxial approximation to the path length difference. This
is a simplification that arises due to the mismatch in length between the two
coordinates of our observation point. Recalling the binomial expansion in Eq.
(2.43) we can simplify both r AP and r BP . First, considering r AP we find
\[ r_{AP} = \sqrt{(x_p - d/2)^2 + L_z^2} = L_z\sqrt{1 + \frac{x_p^2 + (d/2)^2 - x_p d}{L_z^2}} \tag{3.40} \]
\[ \approx L_z\left(1 + \frac{x_p^2 + (d/2)^2 - x_p d}{2L_z^2}\right) = L_z + \frac{x_p^2 + (d/2)^2 - x_p d}{2L_z} \tag{3.41} \]
Figure 3.7 Irradiance along the x-axis. a) The irradiance pattern I (x p , y p = 0, z p = 0).
For this example d = 4 µm and λ = 1 µm. b) Same as a) except λ = 0.750 µm
where we have made the paraxial approximation in the second line of the above
equation. We find a similar expression for r BP
\[ r_{BP} = \sqrt{(x_p + d/2)^2 + L_z^2} = L_z\sqrt{1 + \frac{x_p^2 + (d/2)^2 + x_p d}{L_z^2}} \tag{3.42} \]
\[ \approx L_z\left(1 + \frac{x_p^2 + (d/2)^2 + x_p d}{2L_z^2}\right) = L_z + \frac{x_p^2 + (d/2)^2 + x_p d}{2L_z} \tag{3.43} \]
Bringing together Eq. (3.41) and Eq. (3.43) the optical path length difference
(assuming the medium’s refractive index is 1) is
\[ r_{BP} - r_{AP} \approx \frac{x_p d}{L_z}, \tag{3.44} \]
Figure 3.8 Tilt fringes for L_z ≫ x_p, y_p. a) The irradiance pattern I(x_p, 0, L_z). For this example d = 100 µm, λ = 1 µm and L = 1 m. Across the top of the plot is the fringe order. b) The irradiance pattern I(x_p, y_p, L_z). The observed fringes are called tilt fringes. Note this is only valid for a finite size x_p − y_p plane about z_p = L_z.
Finally, we determine the irradiance I(L_x, y_p, z_p) in the y − z plane for the location x_p = L_x (see Fig. 3.5). We further assume L_x ≫ y_p, z_p so that we can consider a paraxial region about the x-axis. Again we need to evaluate r_AP and r_BP. For this region r_AP = (−(L_x + d/2), y_p, z_p) and r_BP = (−(L_x − d/2), y_p, z_p). If we define L_A = L_x + d/2 and ρ_p² = y_p² + z_p² we find
\[ r_{AP} = \sqrt{L_A^2 + \rho_p^2} = L_A\sqrt{1 + \frac{\rho_p^2}{L_A^2}} \tag{3.48} \]
\[ \approx L_A + \frac{\rho_p^2}{2L_A} \tag{3.49} \]
where we have made the paraxial approximation in the second line of the above
equation. With L B = L x − d /2 a similar expression for r BP results
\[ r_{BP} = \sqrt{L_B^2 + \rho_p^2} = L_B\sqrt{1 + \frac{\rho_p^2}{L_B^2}} \tag{3.50} \]
\[ \approx L_B + \frac{\rho_p^2}{2L_B} \tag{3.51} \]
The two previous equations result in the following optical path length difference
\[ r_{BP} - r_{AP} = (L_B - L_A) + \frac{\rho_p^2}{2L_B} - \frac{\rho_p^2}{2L_A} = L_- + \frac{\rho_p^2}{2L_{eff}} \tag{3.52} \]
\[ k\,(r_{BP} - r_{AP}) = kL_- + \frac{k\rho_p^2}{2L_{eff}} \tag{3.53} \]
\[ I(L_x, y_p, z_p) = 2I_o\left[1 + \cos\!\left(kL_- + \frac{k\rho_p^2}{2L_{eff}}\right)\right], \tag{3.54} \]
where L_− = L_B − L_A and 1/L_eff = 1/L_B − 1/L_A.
There are two terms in Eq. (3.53) and each has a different meaning. First notice that if y_p = z_p = 0 then the phase difference is completely determined by kL_−. So, we see that along the x_p axis, for distances |x_p| > d/2, the ratio of the source
separation d and wavelength determines the strength of the interference along
this axis (see Fig. 3.6). The second piece is a quadratic phase that is a function of
the radial distance ρ p . The circular symmetry in this plane generates so-called
defocus fringes, circular fringes with an increasing spatial frequency away from
the x-axis. The reason for using the word defocus is that the two spherical sources are not coincident: if one of the sources were at a focal point of a lens, the other would be displaced axially from the focal point/plane. After the
lens, the in focus point would generate a plane wave and the out of focus point
would generate a converging or diverging spherical wave. The overlap of these
two waves, one with a planar and the other with a spherical wave front would
result in circular fringes.
Figure 3.9 examines the fringe patterns observed in the y − z plane at distance
L x far from the two sources. The observed fringe pattern exhibits so-called Fresnel
zones. A Fresnel zone is any one of the circularly symmetric rings. When we learn
a bit more about diffraction we will see such a zone plate can leverage diffraction
to focus light much like a spherical mirror or normal refracting lens. And, such a
plate can be manufactured via holography that records the interference pattern
of 2 displaced spherical sources. From considering Eq. (3.53) we can find the
locations of both minima and maxima in this fringe pattern. Let's consider the
location of the minima. If we assume the geometry is such that kL − is a multiple
of 2π then it ensures that along the x-axis the irradiance is a maximum. So, in the
y − z plane we can find the minima for
\[ \frac{k\rho_p^{2\,(j)}}{2L_{eff}} = (2j+1)\pi \;\rightarrow\; \rho_p^{(j)} = \sqrt{(2j+1)\lambda L_{eff}} \tag{3.55} \]
Figure 3.9 Defocus fringes for L x >> y p , z p . a) The irradiance pattern I (L x , 0, z p ). For
this example d = 100 µm, λ = 1 µm and L = 1 m. Notice the spatial frequency of these
fringes change as a function of position. b) The irradiance pattern I (L x , y p , z p ). The
observed fringes are called defocus fringes. Note this is only valid for a finite size y p − z p
plane about x p = L x . Circular defocus fringes are often referred to as Fresnel zones.
where j = 0, 1, 2, . . . indexes the minima. What is the area of the first bright fringe - the 0th zone? Using the previous radius it is πρ_p^{2(0)} = πλL_eff, where the superscript in parentheses denotes the index j = 0. We can also ask for the area of the j-th annular zone, πρ_p^{2(j+1)} − πρ_p^{2(j)}, which equals 2πλL_eff. Remarkably, this is true for all the zones - each bright fringe has an area of 2πλL_eff!! Notice each bright fringe (Fresnel zone) has an area that is twice the area of the 0th zone.
How about the maxima? These occur at
\[ \frac{k\rho_p^{2\,(m)}}{2L_{eff}} = 2m\pi \;\rightarrow\; \rho_p^{(m)} = \sqrt{2m\lambda L_{eff}}. \tag{3.56} \]
In summary: the phase difference for displaced spherical point sources results in circular defocus fringes that are commonly called Fresnel zones. All Fresnel zones, excluding the central disk, have an area equal to 2πλL_eff. The area of the central disk is half that of any other zone.
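These zone properties are easy to confirm numerically. In the sketch below, the wavelength and the effective length L_eff are arbitrary assumed values:

```python
import math

# Sketch of Eqs. (3.55)-(3.56): dark-ring radii and zone areas of the
# defocus (Fresnel zone) pattern.
lam = 1.0e-6       # wavelength, 1 um (assumed)
L_eff = 1.0e-2     # effective length, 1 cm (assumed)

# radii of the first few minima, rho_p^(j) = sqrt((2j+1) lam L_eff)
rho_min = [math.sqrt((2 * j + 1) * lam * L_eff) for j in range(4)]

# area of the central disk and of each annular zone between adjacent minima
disk = math.pi * rho_min[0] ** 2                      # pi * lam * L_eff
zones = [math.pi * (rho_min[j + 1] ** 2 - rho_min[j] ** 2) for j in range(3)]

print(disk)    # pi * lam * L_eff
print(zones)   # every annular zone has area 2*pi*lam*L_eff
```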
When we begin discussions about diffraction we will find (and revisit) the previous
observations provide enormous intuition into understanding diffraction.
Chapter 4
Interferometry of 2 simple
sources
In the previous list, items 1-3 primarily belong to optical engineering and form
the backbone of optical shop testing and metrology. The last three items, 4, 5
and 6, are typically identified with optical physics and provide a window into the
understanding of both light sources and materials.
Remarkably, interferometers can be classified as one of two general types.
The two types of interferometers are called wavefront splitting and amplitude
splitting interferometers. Wavefront splitting interferometers sample the wavefront at distinct locations and recombine the samples on a detector. We will discuss this class of interferometers first. Amplitude splitting interferometers divide the
wavefront into two or more replicas. These replicas are then combined on a
detector. Adopting our system’s viewpoint to studying interferometers, Fig. 4.1a)
presents a block diagram of an interferometer. It is illuminated by a light source,
the light is manipulated by the optical system - it is divided and recombined
dependent on the optical system hardware, and the total irradiance that arrives
at the detector plane is recorded. The irradiance is the result of adding up the
waves delivered to the detector via two distinct paths that are the result of either
amplitude or wavefront splitting the light source that enters the optical system.
For this chapter we will assume one of our two previously studied monochromatic
waves illuminates the interferometer - a spherical wave or a plane wave, see Fig
4.1b). The plane wave illumination, the lower panel of 4.1b), can result from
either collimating the spherical waves generated by a point source or via laser
illumination (not illustrated). Finally, the detector in Fig. 4.1a) may be either a bucket detector or a spatially resolving detector like a charge-coupled device (CCD).
Figure 4.1 Block diagram of an interferometer. a) A source illuminates the interferometer (optical system) and the output falls on a detector. b) The source is either a point source emitting a spherical wave or, after a collimating lens, a plane wave.
In Ch. 3 we have developed all the tools to predict the irradiance output of
both interferometer classes. Before beginning the discussion, a word of caution.
In the previous chapter, we assumed two independent sources were superposed
to create the optical disturbance superpositions and interferograms we analyzed
(two plane waves, two spherical waves). In the real world of optics, unless great
care is taken, two independent light sources - atoms, light bulbs, lasers, etc -
will never interfere and are incoherent; their total irradiance is the sum of their
constituent irradiances. You will notice as we describe various interferometric
geometries, that each optical system makes a replica of the source and these
replicas are superposed. By deriving interfering waves from the same source, coherence is ensured. One caveat is that as the source's spatial extent or spectral width increases (we have only considered point sources of a single frequency) the source's ability to interfere with replicas of itself is reduced. Said another way,
monochromatic point sources are always completely coherent - in our studies
these sources generate spherical and plane waves.
We could also consider other illuminations. For example, extended sources comprised of incoherent monochromatic spherical sources, or a single point source that is polychromatic (composed of many colors) with colors that are incoherent.
Figure 4.3 Young’s Interferometer.
With the previous setup we can lean directly on the results of Ch. 3 to write
down the total irradiance received by the detector plane. Restating the results of
Section 3.3 the irradiance is
\[ I(x_d) = 2I_o\left[1 + \cos\!\left(\frac{k x_d d}{L_z}\right)\right] \tag{4.1} \]
where I_o = u_o²/L_z². In the detector plane, there are irradiance maxima at locations x_d^m determined by
\[ \frac{k x_d d}{L_z} = 2\pi m \;\rightarrow\; x_d^{m} = \frac{m\lambda L_z}{d} \tag{4.2} \]
where again m is the fringe order, determining the number of wavelengths of path difference between the two sources in delivering optical energy to the observation point. The fringes generated in this region are called tilt fringes. In this
location far removed from the source plane, each spherical source can locally be approximated as having a planar wavefront, each with a wavevector propagating
in a different direction. From this perspective it is as if we have the interference
of two monochromatic plane waves propagating with a relative angle between
their wave vectors. This was discussed in the previous section. The fringe pattern
we see in the x − y plane as well as a linecut along the y = 0 line is shown in Fig.
4.4a and 4.4b. The horizontal line in Fig. 4.4a is the resultant irradiance if the two
sources are incoherent.
Figure 4.4 Tilt fringes for L z >> x d , y d . a) The irradiance pattern I (x d , 0, L z ). For this
example d = 100 µm, λ = 1 µm and L = 1 m. Across the top of the plot is the fringe
order. b) The irradiance pattern I (x d , y d , L z ). The observed fringes are called tilt fringes.
Note this is only valid for a finite size x d − y d plane about z d = L z .
and t = 1/√2. Notice the magnitude squared of r and t is 1/2, so that this type of beam splitter reflects half the impinging power and transmits half of the power. For this reason such a beam splitter is called a 50/50 beam splitter.
(Michelson interferometer schematic: a fixed mirror and a moving mirror define paths A and B with lengths L_A and L_B, an input path L_in, a common output path L_{A+B}, a mirror displacement Δl, and a bucket detector such as a pin photodiode.)
The factor of 2 arises since the wave propagates along path A and B two times.
Notice that if the moving mirror is situated at the location of the dashed line then
the length of path A and path B are equal; L A =L B . We will call this length L o . If
the mirror displaces from the equal path configuration then L B can be expressed
as L B = L o + ∆l .
In the following discussion, the interferometer is assumed to be illuminated
with a monochromatic plane wave. The input plane (the dash-dot line in Fig. 4.7) coincides with the plane of the wave that has a spatial phase of 0, so that the complex amplitude of the wave that propagates through the interferometer along path A and arrives at the detector is
\[ U_A = r t\, u_o e^{i k_o n (L_{in} + 2L_o + L_{A+B})} \tag{4.5} \]
where we have ignored phase shifts from mirror reflections, as each path sees the same number of mirrors, and r and t correspond to the beam splitter reflection and transmission coefficients. The wave that travels along path B and arrives at the detector is
\[ U_B = r t\, u_o e^{i k_o n (L_{in} + 2L_o + 2\Delta l + L_{A+B})}. \tag{4.6} \]
With the previous complex amplitudes the total wave that arrives at the detector is
\[ U_A + U_B = r t\, u_o e^{i k_o n (L_{in} + 2L_o + L_{A+B})}\left(1 + e^{i k_o n 2\Delta l}\right) \tag{4.7} \]
where the dependence on the displacing mirror is explicit. Notice the engineered phase difference in the Michelson interferometer is
\[ k_o n\, 2\Delta l. \tag{4.9} \]
For the phase difference to equal a multiple of 2π (k_o n 2Δl = m2π) the mirror needs to move a distance Δl = mλ/(2n). This feature of a Michelson interferometer allows it to measure extremely small displacements as well as determine the refractive index of a material placed in one of the paths. If n = 1 in Eq. (4.9) then each mirror displacement that corresponds to a physical length of λ/2 will result in the detector observing a different fringe maximum. Starting from a fringe maximum and counting the number of fringes that pass through the detector is one way to determine how far the reference mirror has been displaced. (If the detector is observing a null in the irradiance, where is the optical energy?) Also notice that Eq. (4.9) depends on the refractive index n. If the refractive index changes by an amount Δn then the optical path length will change and this change will modify the observed interference signal.
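The fringe-counting idea can be sketched in a few lines; the He-Ne wavelength used below is an assumed example value, not from the text:

```python
# Sketch of the fringe-counting idea from Eq. (4.9): in air (n = 1) each
# mirror displacement of lam/2 advances the interferogram by one fringe.
lam = 633e-9       # He-Ne wavelength (assumed example value)
n = 1.0            # refractive index of the arm

def fringes_passed(delta_l):
    """Fringes counted for mirror displacement delta_l: phase/(2*pi)."""
    return (2 * n * delta_l) / lam   # phase = k_o * n * 2 * delta_l

# moving the mirror 10 fringes' worth of distance
delta_l = 10 * lam / (2 * n)
print(delta_l)                  # 3.165e-06 m
print(fringes_passed(delta_l))  # 10.0
```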
Thin film interference is the reason you see a rainbow of colors in an oil slick or in a soap bubble. Figure 4.7 is an
illustration of a thin film, with refractive index n, imagined to be embedded in
air (n = 1). For simplicity, a monochromatic plane wave illuminates the thin slab
through a beam splitter (assume the beam splitter has r = t = 1). The strategy
to understand the total field that arrives on the detector plane is to trace all the
multiple paths that reflect from the slab and contribute to the total field UT that
arrives at the detector. In this section we will only consider 2 reflections/paths (in a later chapter we will account for all reflections).
Figure 4.7 Thin film interference. Illustration of a thin film, with refractive index n, illuminated through a beam splitter by a monochromatic plane wave. Two paths, labeled A and B, contribute to the total complex amplitude U_T that arrives at the detector.
The two paths that contribute to U_T are labeled A and B in Fig. 4.7. The arrows are used as a guide to illustrate the paths. Some of the light will pass through the slab. This is indicated by the sequence of downward facing arrows. Path A, the solitary upward pointing arrow, corresponds to light that has reflected off the front interface of the slab with reflection coefficient r. Path B, illustrated by the double headed arrow within the slab and the upward pointing arrow above the slab, corresponds to light that has transmitted through the top interface with a transmission coefficient t, propagated to the bottom interface of the slab acquiring phase kh, reflected off the bottom interface with a reflection coefficient r, propagated back to the top interface acquiring phase kh and lastly transmitted through the top interface with transmission coefficient t. Using the previous, an expression for the light propagating along the z-axis toward the detector (after reflecting off the beamsplitter), assuming a complex envelope of u_o, is
\[ U_T = u_o\left(r e^{i k z} + t r t\, e^{i(k z + 2k h)}\right) \tag{4.12} \]
where k = nk_o and k_o is taken as the free space wavenumber. Notice, the only
uncommon part of the two paths is for Path B that propagates through the thin
slab. Another feature in the phase difference that we will insert by hand is an extra
π phase shift. We do not have the tools yet to derive this fact, but if two amplitudes
are interfered and one has reflected off a low-high index boundary and the other has reflected off a high-low index boundary, with the same indices, there will be an extra phase shift of π between the two amplitudes. With the previous in mind Eq.
(4.12) becomes
\[ U_T = u_o\left(r e^{i k z} + t r t\, e^{i(k z + 2k h + \pi)}\right). \tag{4.13} \]
\[ U_T U_T^* = I_T = r^2 u_o^2\left(1 + t^4 + 2t^2\cos(2kh + \pi)\right) \;\rightarrow\; \Delta\theta = 2kh + \pi = \frac{4\pi n h}{\lambda_o} + \pi \tag{4.14} \]
where the phase difference has been made explicit as well as its dependence on
wavelength (λ_o is the free space wavelength). Fixing λ_o and n it is possible to determine the set of slab thicknesses that would result in constructive interference.
The thicknesses are
\[ \Delta\theta = m2\pi = \frac{4\pi n h}{\lambda_o} + \pi \;\rightarrow\; h = \frac{(2m-1)\lambda_o}{4n} \tag{4.15} \]
where m is a positive integer greater than zero. m = 1 determines the thinnest slab that will support constructive interference and it corresponds to λ_o/(4n). Notice that for m = 1 the slab is a quarter-wavelength thick (in the material, since it is divided by n). For other orders m the thickness is an odd multiple of this quarter-wave thickness. The quarter-wave condition (although it corresponds to a π/2 phase shift per pass) is a manifestation of the extra π phase shift experienced by the two interfering paths.
Conversely, if the slab thickness is fixed, the spectral irradiance (the interferogram
as a function of wavelength) will exhibit maximum at
\[ m2\pi = \frac{4\pi n h}{\lambda_o} + \pi \;\rightarrow\; \lambda_o = \frac{4 n h}{(2m-1)} \tag{4.16} \]
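The thickness and wavelength conditions, Eqs. (4.15)-(4.16), can be sketched as follows; the 600 nm illumination wavelength is an assumed value, while n = 1.5 and h = 0.75 µm follow Fig. 4.8a:

```python
# Sketch of Eqs. (4.15)-(4.16): constructive-interference thicknesses for a
# fixed wavelength, and reflection-maxima wavelengths for a fixed thickness.
n = 1.5            # film index (glass, as in Fig. 4.8a)
lam_o = 600e-9     # free-space wavelength (assumed)

# thinnest slabs giving constructive reflection, h = (2m-1) lam_o / (4n)
h = [(2 * m - 1) * lam_o / (4 * n) for m in (1, 2, 3)]   # Eq. (4.15)

# for a fixed slab, wavelengths of the reflection maxima, Eq. (4.16)
h_fixed = 0.75e-6  # slab thickness from Fig. 4.8a
lam_max = [4 * n * h_fixed / (2 * m - 1) for m in (1, 2, 3)]

print(h)        # 100 nm, 300 nm, 500 nm
print(lam_max)  # 4.5 um, 1.5 um, 0.9 um
```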
Figure 4.8 Thin film interference interferograms. a) Spectral interferogram for light reflected from a glass slab (n = 1.5) of thickness 0.75 µm. b) The same slab as in a), but the refractive index is increased to 3.5.
Chapter 5
Interlude on coherence
In the previous chapters we discussed simple waves and the interference that
can result when two of these waves are superposed. We learned when the waves
are coherent they can interfere and, if their amplitudes are equal, the observed irradiance is found to be I_T = 2I_o(1 + cos Δφ) where Δφ is the phase difference between the two waves at the observation point. If the waves are
incoherent then their joint irradiance is simply I T = I 1 + I 2 which equals 2I o if the
amplitudes of the waves are equal. A measure called the visibility, see Eq. (3.12),
was introduced to quantify the strength of interference. For the coherent case
the visibility is 1 and for the incoherent scenario the visibility is 0. Said another
way, the ability of a light source to produce interference fringes in an amplitude
or wave-front splitting interferometer is determined by the source’s coherence
properties. We will find that a visibility equal to 0 or 1 represents the two extreme
cases of coherence and sources of light can exist that are partially coherent. Partially coherent sources generate interference fringes of reduced visibility (< 1). In this
chapter we will show how we can use the visibility as an indicator of a light source’s
coherence properties and we will derive an expression for the irradiance seen by
our detector plane for the two cases illustrated in Fig. 5.1.
We will build on this operational definition of coherence (i.e. the ability to
generate fringes of finite visibility) to try and make sense of coherence effects
in wave optics. The aspects presented in this chapter are designed to give some
physical optics interpretation to coherence effects. In optics, coherence can come in many different flavors; at the level of wave optics we will be concerned with spatial and temporal coherence (for example, we ignore polarization coherence).
Connecting with our mathematical model of wave optics, to quantify spatial
coherence we will study the similarity of an optical disturbance at two points in
space at the same instance of time and to quantify temporal coherence we will
study the similarity of an optical disturbance at two instances of time at the same
location in space. Remember, to this point we have only considered two of the
simplest optical disturbances, monochromatic spherical and plane waves. Both
of these optical disturbances are spatially and temporally coherent. More general
optical disturbances need not be fully coherent.
Figure 5.1 Probing spatial and temporal coherence. In this chapter we will examine
the visibility of the inteferogram in a Young’s interferometer for two different source
configurations. Configuration one is an extended monochromatic source. Configura-
tion two is a point source with an extended spectrum - the source is polychromatic.
than what we have discovered previously, except the visibility will be found to depend on the extended source properties.
Consider the scenario illustrated in Fig. 5.2. In this example, we will imagine in
addition to the point source on axis (labeled S1 ) there is a second source located at
S2 a distance ∆ from S1 . Each one of these sources generates its own interferogram
also labeled accordingly. The question we want to ask is how far, a distance Δs, S2 needs to
Figure 5.2 Spatial coherence intuition. Fringes from two spherical sources. Source S2
is displaced a distance ∆s from the on-axis source S1 . The inset is the assumed power
spectrum of the source - it is monochromatic. From this setup we can determine what displacement of S2 spoils the interference fringes of S1.
be shifted so that the 1st minimum of S1 is located at the same detector location as the m = 0 maximum fringe of S2. The minimum of S1 is located at x_p^{min} = λL_z/(2d). From similar triangles (see Fig. 5.2) we can locate the maximum of S2 at x_p^{max} = L_z Δs/L_s. If
\[ \frac{\lambda L_z}{2d} = \frac{L_z \Delta_s}{L_s} \;\rightarrow\; \Delta_s = \frac{\lambda L_s}{2d} \tag{5.2} \]
which is the desired displacement. So, if the second source is displaced by ∆s
the recorded visibility is zero. This result can be reorganized in the following way.
First notice that the angle subtended by the source at the slit plane is θs = 2∆s /L s .
Notice if we divide the wavelength λ by θs we find a quantity with units of length.
This observation leads to the following definition
\[ \rho_c = \frac{\lambda}{\theta_s} \tag{5.3} \]
which we call the transverse coherence length. Why do we call this the transverse
coherence length? The transverse coherence length captures the ability of the
source to produce interference fringes in the Young’s interferometer. If ρ c < d
then ∆s is bigger than λL s /2d and the interference fringes disappear whereas
if ρ c > d then there will be observable fringes on the detector and ∆s is less
than λL_s/(2d). We can say this another way. If the source presents a transverse coherence length that is larger than the slit separation then the waves leaving the slits will interfere; otherwise the interference is suppressed.
A final point. It is clear from the previous that if our source S2 displaces by more than Δs we will recover fringes with only two sources. Since in the end we will be concerned with sources that fill in between S1 and S2, our previous reasoning is almost correct. In fact, we will find that for separations larger than Δs there will be a recovery of visibility, but it will not generate fringes that exhibit visibilities on the order of unity. Using the transverse coherence length, we can determine a coherence area A_c presented by the source to the optical system (pinhole plane in this example) as A_c = π(ρ_c/2)². This is the area throughout which one could sample the impinging wave with two pinholes and still see interference.
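A numerical sketch of the transverse coherence length and coherence area; the source parameters below are assumed values for illustration:

```python
import math

# Sketch of Eqs. (5.2)-(5.3): transverse coherence length rho_c = lam/theta_s
# for an extended source, and the resulting coherence area.
lam = 500e-9       # wavelength, 500 nm (assumed)
L_s = 1.0          # source-to-slit distance, 1 m (assumed)
delta_s = 50e-6    # source half-extent, 50 um (assumed)

theta_s = 2 * delta_s / L_s          # angle subtended by the source
rho_c = lam / theta_s                # transverse coherence length, Eq. (5.3)
A_c = math.pi * (rho_c / 2) ** 2     # coherence area

print(rho_c)   # 5 mm: slits closer than this still show fringes
print(A_c)     # coherence area in m^2
```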
A similar story can be constructed in the context of temporal coherence, but
now we keep our source fixed on axis and allow its spectrum to broaden; see Fig.
5.3. Important is that each spectral component in the broadened source does not
interfere with light of a different color/wavelength. To get started, let's express the phase difference at the observation point as
\[ \Delta\theta = \frac{2\pi d\, x_p}{\lambda L_z} = 2\pi f\, \frac{d\, x_p}{c L_z} = \omega\tau \tag{5.4} \]
where we have made explicit the difference in time, τ = d x_p/(c L_z), it takes for optical energy to propagate from pinhole A to the observation point and from pinhole B to the observation point, and have used the relationship fλ = c. Next we imagine
to the observation point and have used the relationship f λ = c. Next we imagine
the source contains two spectral sources S1 and S2 with frequencies f 1 and f 2 . We
assume source S1 exhibits its first minimum at a location where S2 has its second maximum.
xs xa xp
-1.5
-0.5
0.5
1.5
-2
-1
2
-15
A
-10
S1 d
-5
z I(xp)
0
S(f) S2
5
10
B
15
Ls Lz
f1 f2 f
Figure 5.3 Temporal coherence intuition. A bi-chromatic point source delivers two
different spherical waves to the Young's pinhole plane. The blue color (solid curve) is such that its second maximum overlaps with the first null of the red source (dashed line).
From this construction we can determine how the spectral width of the source results
in loss of interference visibility on the detector.
Using the speed of light c, the coherence time can be related to a longitudinal
coherence length l c as
l c = cτc (5.7)
where l c is a measure of the path length difference above which interference no
longer occurs.
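The coherence time τ_c is set by the source's spectral width; the rule of thumb τ_c ≈ 1/Δf used in the sketch below is an assumption (the exact prefactor depends on the line shape), as are the parameter values:

```python
# Sketch of Eq. (5.7), l_c = c * tau_c, using the common rule of thumb
# tau_c ~ 1/delta_f for a source of spectral width delta_f (an assumption;
# the precise prefactor depends on the line shape).
c = 3.0e8          # speed of light, m/s (rounded)

delta_f = 1.0e12   # spectral width, 1 THz (assumed)
tau_c = 1.0 / delta_f        # coherence time estimate
l_c = c * tau_c              # longitudinal coherence length, Eq. (5.7)

print(tau_c)  # 1e-12 s
print(l_c)    # 3e-4 m = 0.3 mm
```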
Figure 5.4 illustrates how lack of complete coherence, so-called partial coherence, influences the structure of a Young's two pinhole interferogram. The
upper left panel illustrates the loss of interference as a function of position on the
detector due to a spectrally broad source that is spatially coherent. The lower left
panel zooms in to the region that contains the detector location of equal optical
path length difference (synonymous with equal time delay). In contrast, the up-
per right panel illustrates what happens as the source being interrogated goes
from exhibiting full spatial coherence (unit visibility, dashed line) to successively
(Figure 5.4: interferograms I_T/I_o versus x_p [a.u.] in four panels labeled temporal, spatial, temporal, and spatial & temporal, showing the loss of fringe visibility away from the location of equal optical path length and with reduced spatial coherence.)
Figure 5.5 Spatial coherence. Setup to examine the spatial coherence properties of an
extended monochromatic source.
where we have made explicit the location x_a in the plane of the slits at which the complex amplitude is generated. Notice the irradiance from each of the slits propagates to the detector as a spherical wave. The distance from slit A is labeled r_AP and that from B is labeled r_BP. Previously U_A and U_B were taken as complex numbers. In the current discussion, to determine U_A and U_B we look back from the slit plane to the source plane to determine the complex numbers that feed the detector plane.
(j)
The sources are assumed to be located at discrete locations labeled x s in the
(j)
source plane. Each source point generates a monochromatic wave at location x s
(j) (j)
of complex amplitude U s (x s ) with initial phase angle e i δ(x s )
that propagates
(j)
−i kr S A
to slit A acquiring a spatial phase e and to slit B acquiring a spatial phase
(j)
−i kr S A
e .
With the previous, we can expand Eq. (5.8) to reflect the source point that
delivers the optical disturbance arriving at x p . We find
$$\left| \sum_j U_s(x_s^{(j)})\, \frac{e^{-ik\left(r_{SA}^{(j)} + r_{AP} - \delta(x_s^{(j)})\right)}}{r_{AP}} + U_s(x_s^{(j)})\, \frac{e^{-ik\left(r_{SB}^{(j)} + r_{BP} - \delta(x_s^{(j)})\right)}}{r_{BP}} \right|^2 \qquad (5.9)$$
where the sum over $j$ indicates each point in the source plane contributes to the detected signal at $x_p$. If the sum does not make things bad enough, we also need to multiply the sum by its complex conjugate! We can simplify the appearance of Eq. (5.9) if we call the entire first term $U_A^{(j)}$ and the second term $U_B^{(j)}$. With these
where we notice in addition to the propagation phase from source to slit the
initial phase angle difference is also relevant. Recall that the irradiance is the time
average of total optical disturbance function squared and that we have used this
synonymously with the magnitude squared of the complex wavefunction. But,
when we considered the case of superposing two monochromatic plane waves of
different angular frequency we needed to be careful about the time dependence.
That result exhibited modulation of the interferogram at the difference frequency
of the two waves. If that difference frequency was too fast then the interference
term would average to zero and the two waves are said to be incoherent. The
same reasoning applies here except the time dependence is implicit in the initial
phase angles and this is what spoils the interference. We imagine each source
has an initial phase angle that can change as a function of time and these phase
variations are uncorrelated between different locations in the source. Focusing only on the time-varying pieces, we quantify this intuition by writing
$$\left\langle U_A^{(j)} U_A^{(m)*} \right\rangle_t = \left\langle e^{i\left(\delta(x_s^{(j)}) - \delta(x_s^{(m)})\right)} \right\rangle_t = \delta_{j,m} \qquad (5.13)$$
where we have again used the notation $\langle f(t)\rangle_t$ for the time average of a function, and we have also introduced the Kronecker delta $\delta_{j,m}$ that equals 1 if $j = m$ and equals 0 if $j \neq m$.
The delta function collapses the double sum in Eq. (5.18) back into a single sum. This happens since only the terms where $j = m$ get multiplied by 1. The other $j \neq m$ terms are multiplied by 0. The delta function captures the fact that the source points located at $x_s^{(j)}$ and $x_s^{(m)}$ do not interfere! Utilizing the delta
5.2 Spatial coherence 73
It is now possible to evaluate each term by considering the given geometry. Term
by term the result is
$$U_A^{(j)} U_A^{(j)*} = \frac{I_s(x_s^{(j)})}{r_{AP}^2} \qquad (5.15)$$
$$U_B^{(j)} U_B^{(j)*} = \frac{I_s(x_s^{(j)})}{r_{BP}^2} \qquad (5.16)$$
$$U_A^{(j)} U_B^{(j)*} = \frac{I_s(x_s^{(j)})\, e^{ik\left(r_{SB}^{(j)} - r_{SA}^{(j)}\right)}\, e^{ik\left(r_{BP} - r_{AP}\right)}}{r_{AP}\, r_{BP}} \qquad (5.17)$$
$$U_B^{(j)} U_A^{(j)*} = \frac{I_s(x_s^{(j)})\, e^{-ik\left(r_{SB}^{(j)} - r_{SA}^{(j)}\right)}\, e^{-ik\left(r_{BP} - r_{AP}\right)}}{r_{AP}\, r_{BP}} \qquad (5.18)$$
where $I_s(x_s^{(j)}) = U_s(x_s^{(j)})\, U_s^*(x_s^{(j)})$. We can now apply the paraxial approximation to the previous expressions on both the source side and the detector side. The paraxial approximation amounts to constraining $L_s \gg |\Delta_s|$ and $L_z \gg x_p$. In the phase, on the detector side this leads to (see Sec. ***) $r_{BP} - r_{AP} = d x_p / L_z$ and on the source side $r_{SB} - r_{SA} = d x_s / L_s$. In the complex envelope we can write $r_{AP} \approx r_{BP} \approx L_z$.
Bringing this all together we have
$$\sum_j \frac{2 I_s(x_s^{(j)})}{L_z^2} + \sum_j \frac{I_s(x_s^{(j)})}{L_z^2}\, e^{ikd x_p/L_z}\, e^{ikd x_s^{(j)}/L_s} + \sum_j \frac{I_s(x_s^{(j)})}{L_z^2}\, e^{-ikd x_p/L_z}\, e^{-ikd x_s^{(j)}/L_s} \qquad (5.19)$$
$$g_{AB} = \left| g_{AB} \right| e^{i \angle g_{AB}} \qquad (5.22)$$
which results in
$$I_{tot}\left[ 1 + \frac{|g_{AB}|}{2}\, e^{i\left(kd x_p/L_z + \angle g_{AB}\right)} + \frac{|g_{AB}|}{2}\, e^{-i\left(kd x_p/L_z + \angle g_{AB}\right)} \right] \qquad (5.23)$$
Finally, when all the dust settles, we have an expression for the irradiance that looks very much like our earlier results! From the definition of visibility in Eq. (3.12) we find with the previous expression that the visibility $V$ is
$$V = |g_{AB}|. \qquad (5.25)$$
Notice, from how $g_{AB}$ is defined, its magnitude varies between 0 and 1. It should also be clear the phase of $g_{AB}$ determines the location of the interferogram's $m = 0$ fringe.
So, what is $g_{AB}$? First, in optical coherence theory it is called the mutual intensity of the optical disturbance. From how it appears in the interference expression, its magnitude determines the strength of interference and it is therefore related to the source's spatial coherence properties. Notice that it is a two-point relation. What I mean by this is the visibility of the interferogram is determined by how correlated the complex amplitude at slit $A$ is with the complex amplitude at slit $B$. For this reason $g$ carries the subscripts $A$ and $B$. For our geometry, the two points are located at $d/2$ and $-d/2$ in the plane of the slits. We can think of the slits as sampling the source distribution at 2 points and delivering them to a detector that registers an interferogram with a visibility that quantifies the coherence between the two sampled points.
With the previous intuition, we can unpack $g_{AB}$ to understand the coherence length $\rho_c$ and coherence area $A_c$ the source presents to the aperture plane. For this we will imagine that we fill the source plane region of width $2\Delta$ with a continuum of point sources, not a collection of discrete point sources. Ignoring any of the formalities in doing this, our expression for $\sum_j$ will be replaced by an integral over the source plane, $\int dx_s$. With this substitution we have
$$g_{AB} = \frac{\int I_s(x_s)\, e^{ikd x_s/L_s}\, dx_s}{I_{tot}}. \qquad (5.26)$$
In the exponent we can identify a spatial frequency,
$$\frac{kd x_s}{L_s} = \frac{2\pi d x_s}{\lambda L_s} = 2\pi \nu_x x_s \qquad (5.27)$$
so that
$$g_{AB} = \frac{\int I_s(x_s)\, e^{i 2\pi \nu_x x_s}\, dx_s}{I_{tot}} \qquad (5.29)$$
and the numerator is called a Fourier transform. We will adopt the following notation for Fourier transforms in this text
$$I(\nu_x) = \int I_s(x_s)\, e^{i 2\pi \nu_x x_s}\, dx_s. \qquad (5.30)$$
In the coming chapters we will also find the Fourier transform emerges naturally in the analysis of far-field diffraction patterns. Equation (5.29) is a special case of a more general result from optical coherence theory known as the van Cittert-Zernike theorem. In words, the van Cittert-Zernike theorem relates the spatial coherence properties of a source in a plane to the Fourier transform of the source irradiance!
Time for a deep breath and to ask, what does this all mean? We started
out wanting to understand the visibility of an interferogram since our intuition
was that the strength of interference was an indicator of source spatial coherence properties. We have found that the Fourier transform of our source irradiance determines our interferogram's visibility.
But, how do we use this? The first step is to find $I(\nu_x)$. Operationally you take the Fourier transform of the source distribution, or more plainly, evaluate the integral in Eq. (5.30). The integral returns a function $I(\nu_x)$ of the spatial frequency $\nu_x$. The question now becomes how to relate this new distribution to a specific slit separation $d$. To do this we remember to substitute $d/\lambda L_s$ for $\nu_x$. So, given a particular slit separation, I can figure out the visibility by first finding $\nu_x = d/\lambda L_s$, the spatial frequency associated with the given geometry, and then determining the magnitude of the Fourier transform at this $\nu_x$,
$$\left| I(\nu_x) \right|. \qquad (5.31)$$
With the previous we can figure out the depth of modulation of the interferogram produced by any spatially extended source that is composed of uncorrelated point radiators!
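This recipe can be sketched numerically. The geometry below is an illustrative assumption (uniform 1D source of half-width $\Delta_s$), for which the analytic answer is $|g_{AB}| = |\mathrm{sinc}(2\Delta_s \nu_x)|$:

```python
import numpy as np

# Visibility from the Fourier transform of the source irradiance, Eq. (5.29).
# A uniform source of half-width ds is assumed purely for illustration;
# analytically |g_AB| = |sinc(2*ds*nu_x)| for this source.
lam, Ls, d = 0.5e-6, 1.0, 1.0e-3   # wavelength, source distance, slit separation [m]
ds = 50e-6                          # source half-width [m]

xs = np.linspace(-ds, ds, 20001)    # grid across the source
Is = np.ones_like(xs)               # uniform irradiance

nu_x = d / (lam * Ls)               # spatial frequency set by the slit geometry, Eq. (5.27)
g = np.sum(Is * np.exp(1j * 2 * np.pi * nu_x * xs)) / np.sum(Is)
V = abs(g)                          # visibility, Eq. (5.25)

V_analytic = abs(np.sinc(2 * ds * nu_x))   # np.sinc(x) = sin(pi x)/(pi x)
print(V, V_analytic)
```

The numerically integrated visibility agrees with the analytic sinc, and widening the source (larger `ds`) drives the visibility toward zero for a fixed slit separation.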
Let's use this machinery for a simple example. First, we will assume our source is a monochromatic point source that radiates an irradiance $I_o = I_{tot}$. To mathematically represent this point source we introduce the Dirac delta function. The Dirac delta function is the continuous-variable analog of the Kronecker delta function. Technically the Dirac delta is a distribution that only exists to be integrated and has unit area. For wave optics we will define the Dirac delta function as a sharply peaked spike of unit area and will not be concerned with the mathematical subtleties regarding its definition. With the previous definition we have a mathematical function to represent point sources of irradiance and complex amplitude. Notice, the Dirac delta is nonzero only when its argument is 0 and is zero otherwise. For $\delta(x - x_o)$ this happens when $x = x_o$. We will also encounter $\delta(x + x_o)$. This function is nonzero when $x = -x_o$ since this makes the argument of $\delta(x + x_o)$ equal zero.
76 Chapter 5 Interlude on coherence
The Dirac delta function has a useful property called the Sifting Property and this details how to use the Dirac delta function in an integral. Specifically,
$$\int f(x)\, \delta(x - x_o)\, dx = f(x_o) \qquad (5.34)$$
so that a function integrated against the delta function has its value $f(x_o)$ sifted out of all its possible values. Recall $f(x)$ is a catalog (or table) of values relating numbers to each independent variable value $x$, and when integrating over the Dirac delta function the result is one number, the value of $f$ at $x_o$.
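The sifting property can be checked numerically by standing in a narrow unit-area Gaussian for the delta; the test function and widths below are illustrative choices:

```python
import numpy as np

# Numerical illustration of the sifting property, Eq. (5.34), with the
# Dirac delta approximated by a narrow unit-area Gaussian.
x = np.linspace(-10.0, 10.0, 200001)
dx = x[1] - x[0]

def delta_approx(x, x0, eps=1e-2):
    """Unit-area Gaussian of width eps centered at x0."""
    return np.exp(-0.5 * ((x - x0) / eps) ** 2) / (eps * np.sqrt(2.0 * np.pi))

x0 = 1.0
sifted = np.sum(np.cos(x) * delta_approx(x, x0) * dx)
print(sifted)   # approaches cos(x0) as eps -> 0
```

As `eps` shrinks, the integral picks out exactly the single value $\cos(x_o)$ from the catalog of values of $f$.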
With the previous definitions it is possible to determine g AB for a monochro-
matic point source of irradiance I o located at the origin. Our mathematical
representation of this point source is
$$I_s(x_s) = I_o\, \delta(x_s) \qquad (5.35)$$
so that
$$g_{AB} = \frac{\int I_o\, \delta(x_s)\, e^{ikd x_s/L_s}\, dx_s}{I_{tot}} = \frac{I_o\, e^{ikd\, 0 / L_s}}{I_{tot}} = \frac{I_o}{I_{tot}} \qquad (5.36)$$
where the source has been extended along the $y$-direction and we now need to take a 2D Fourier transform. To determine the visibility of an interferogram in a Young's interferometer it is necessary to orient a linecut through the 2D Fourier transform so that it is coordinated with the line connecting the two pinholes in the Young's aperture.
Imagine our source is a disk of radius ∆s .
Figure 5.6 Temporal coherence. Setup to examine the temporal coherence properties
of a polychromatic point source.
Although the spectrum is continuous, again we will begin by assuming the spectrum is discrete and then take the continuum limit with our final expression. In the below derivation we will find it convenient to measure the frequency from the center of the spectrum. With the previous, the frequency will be written as $f^{(j)} = f_o + f_c^{(j)}$ where $f_c^{(j)}$ is the frequency measured from the spectrum's center frequency $f_o$. These frequencies are all illustrated in Fig. 5.6.
If the spectral complex amplitude is expressed as $U_S(f^{(j)})$ and we further assume that each spectral component radiates with an initial phase that is uncorrelated with the other spectral components, we find the irradiance measured on our detector can be expressed as
$$\sum_j \left[ U_A(f^{(j)})\, U_A^*(f^{(j)}) + U_B(f^{(j)})\, U_B^*(f^{(j)}) \right] \qquad (5.38)$$
where $U_i(f^{(j)})$ denotes the complex amplitude from slit $i$ with frequency $f^{(j)}$ that arrives at the detector location. We had a similar expression in the previous section where $j$ denoted the source location instead of the spectral component.
It is now possible to evaluate each term by considering the given spectrum. Term
$$U_A(f^{(j)})\, U_A^*(f^{(j)}) = \frac{I_s(f^{(j)})}{r_{AP}^2} \qquad (5.40)$$
$$U_B(f^{(j)})\, U_B^*(f^{(j)}) = \frac{I_s(f^{(j)})}{r_{BP}^2} \qquad (5.41)$$
$$U_A(f^{(j)})\, U_B^*(f^{(j)}) = \frac{I_s(f^{(j)})\, e^{i k^{(j)} (r_{BP} - r_{AP})}}{r_{AP}\, r_{BP}} \qquad (5.42)$$
$$U_B(f^{(j)})\, U_A^*(f^{(j)}) = \frac{I_s(f^{(j)})\, e^{-i k^{(j)} (r_{BP} - r_{AP})}}{r_{AP}\, r_{BP}} \qquad (5.43)$$
where $I_s(f^{(j)}) = U_s(f^{(j)})\, U_s^*(f^{(j)})$ and $k^{(j)} = \frac{2\pi f^{(j)}}{c}$. Referring to Fig. 5.6 we will decompose the linear frequency in terms of the center frequency $f_o$ and a frequency measured from the center frequency, $f_c^{(j)}$. With this substitution, and the paraxial approximation, the spatial phase can be expressed as
$$e^{i k^{(j)} (r_{BP} - r_{AP})} \rightarrow e^{i 2\pi f_o \frac{x_p d}{c L_z}}\, e^{i 2\pi f_c^{(j)} \frac{x_p d}{c L_z}} \qquad (5.44)$$
where it has factorized into a piece dependent on the spectral component and a constant phase dependent only on the central frequency chosen. Notice, $\frac{x_p d}{c L_z}$ carries units of seconds and is the difference in the time it takes for light to travel to the detector from slit A and from slit B. Let's define $\tau = \frac{x_p d}{c L_z}$ and $I_{tot} = \sum_j I_s(f^{(j)})$ to write
where
$$g_{AB}(\tau) = \frac{\sum_j I_s(f_c^{(j)})\, e^{i 2\pi f_c^{(j)} \tau}}{I_{tot}}. \qquad (5.47)$$
In the previous, $g_{AB}(\tau)$ is the time-domain analog of the mutual intensity and is often called the complex degree of coherence.
The next steps are identical to the spatially extended source. First, recognizing $g_{AB}(\tau)$ is a complex number, the expression for the irradiance can be cast in the following form
$$I_{tot}\left( 1 + |g_{AB}(\tau)| \cos\left[ k_o d x_p / L_z + \angle g_{AB}(\tau) \right] \right) \qquad (5.48)$$
which again looks very much like our earlier results! Notice $2\pi \tau f_o$ has been replaced with $k_o d x_p / L_z$ where $k_o$ is the wavenumber of the central linear temporal
5.3 Temporal coherence 79
frequency. From the definition of visibility in Eq. (3.12) we find with the previous expression that the visibility $V$ is
$$V = |g_{AB}(\tau)|. \qquad (5.49)$$
Notice, from how $g_{AB}(\tau)$ is defined, its magnitude varies between 0 and 1. It should also be clear the phase of $g_{AB}(\tau)$ determines the location of the interferogram's $m = 0$ fringe.
Finally, we will imagine our source spectrum is continuous and not discrete, so that $f_c^{(j)} \rightarrow f_c$. Ignoring any of the formalities in doing this, our expression for $\sum_j$ will be replaced by an integral over the source spectrum, $\int df_c$. With this substitution we have
$$g_{AB}(\tau) = \frac{\int I_s(f_c)\, e^{i 2\pi f_c \tau}\, df_c}{I_{tot}} \qquad (5.50)$$
where the numerator is called an inverse Fourier transform. Notice that although this is an inverse Fourier transform, it has the same sign as Eq. (5.30). The sign convention is a result of how we have chosen to express the phase of our waves. Equation (5.50) is the time-domain analog of the van Cittert-Zernike theorem. In words, this version of the theorem relates the temporal coherence properties of the source to the inverse Fourier transform of the source spectrum!
In discussing spatial coherence, we identified the first zero of |g AB | with the
lateral coherence length of the spatially extended source. We will do the same
here and use the first zero crossing of the inverse Fourier transform of the source
spectrum to define the coherence time $\tau_c$. The coherence time tells you the relative time delay two waves can experience and still produce interference fringes. Using
the speed of light, it is also possible to define the longitudinal coherence length l c ,
often simply called the coherence length. The coherence length is l c = cτc . The
coherence length is a measure of the largest optical path length difference two
waves can sustain before they can no longer interfere.
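As a numerical sketch (assuming a flat, top-hat spectrum of full width $\Delta f$, an illustrative choice), Eq. (5.50) can be evaluated directly; for this spectrum $|g_{AB}(\tau)| = |\mathrm{sinc}(\Delta f\, \tau)|$, so the first zero gives $\tau_c = 1/\Delta f$ and $l_c = c/\Delta f$:

```python
import numpy as np

# Complex degree of coherence for an assumed rectangular spectrum of
# full width df, evaluated from the discrete form of Eq. (5.47)/(5.50).
c = 3.0e8
df = 1.0e12                              # spectral width [Hz], illustrative
fc = np.linspace(-df / 2, df / 2, 40001) # frequencies measured from center
Is = np.ones_like(fc)                    # flat spectrum across the band

def g_ab(tau):
    """Complex degree of coherence at time delay tau."""
    return np.sum(Is * np.exp(1j * 2 * np.pi * fc * tau)) / np.sum(Is)

tau_c = 1.0 / df                 # first zero of sinc(df * tau)
print(abs(g_ab(0.0)))            # unity: full coherence at zero delay
print(abs(g_ab(tau_c)))          # ~0: fringes wash out at tau = tau_c
print(c * tau_c)                 # longitudinal coherence length l_c [m]
```

Broadening the spectrum (larger `df`) shortens both the coherence time and the coherence length, exactly the trade-off described above.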
Chapter 6
Interferometry of N simple
sources
Figure 6.1 From Young’s Two-Pinhole to the Diffraction Grating. In moving from the
illustration on the left to the right, progressibley more pinholes are poked into the
opaque screen,. Each pinhole samples a single secondary source on the illuminating
wavefront and this source propagates to the right toward the detector plane. Provided
each source is coherent with the others an interference pattern will form.
Defining $\delta = d_s \sin\theta$ ($\theta$ is the angle between the $z$-axis and $\mathbf{r}_p$) the phasor sum in the previous equation becomes
$$1 + e^{ik\delta} + \cdots + e^{ik(N-1)\delta} = \sum_{n=0}^{N-1} e^{ink\delta} = \sum_{n=0}^{N-1} x^n \qquad (6.4)$$
since each pair of adjacent pinholes has a path length difference of $d_s \sin\theta$ and $x = e^{ik\delta}$ has been used. The previous can be expressed as
$$\sum_{n=0}^{N-1} x^n = \frac{1 - x^N}{1 - x} \qquad (6.5)$$
where the $N$-term sum has been evaluated. Substituting back into the formula we find
$$\frac{1 - e^{ik\delta N}}{1 - e^{ik\delta}} = \frac{e^{ik\delta N/2}}{e^{ik\delta/2}}\; \frac{e^{ik\delta N/2} - e^{-ik\delta N/2}}{e^{ik\delta/2} - e^{-ik\delta/2}}. \qquad (6.6)$$
Equation (6.6) simplifies by introducing the sin function and recognizing that
$$r_{1p} + \frac{N-1}{2}\,\delta = r_p. \qquad (6.7)$$
Bringing together Eqs. (6.3), (6.6) and (6.7) we find
$$U_T(\mathbf{r}_p) = \frac{u_o\, e^{ik r_p}}{L_z}\; \frac{\sin\left( \frac{k d_s x_p N}{2 L_z} \right)}{\sin\left( \frac{k d_s x_p}{2 L_z} \right)}. \qquad (6.9)$$
6.1 Wavefront splitting interferometers; n=N → the diffraction grating 83
We can understand the previous expression as consisting of two parts. The first
part, that contributes the far-field phase, is a spherical wave. Notice that in this
limit L z u r p and so the factor multiplying the sin ratio is identical to a spherical
wave situated at the coordinate origin. The second factor, the sin ratio, captures
the effect of the pinhole array and is referred to as a radiation pattern. It captures
how the far-field complex amplitude is modulated by the array of pinholes. If
one thinks of the pinhole array as a collection of phased antennas, the influence
of adding more antennas is to induce more directivity in the delivered optical
energy. Can you think of a way to scan the spatial location of the main irradiance
lobe in time?
Notice already there is a potential problem at $x_p = 0$ since the expression becomes indeterminate ($0/0$). In fact this is true for all numerator and denominator arguments that are a multiple of $\pi$. To find the field amplitude at $x_p = 0$, L'Hopital's rule can be used where (ignoring the phase) the ratio of the derivatives of the numerator and denominator evaluated as $x_p \rightarrow 0$ is
$$\frac{\frac{d}{dx_p} \sin\left( \frac{k d_s x_p N}{2 L_z} \right)}{\frac{d}{dx_p} \sin\left( \frac{k d_s x_p}{2 L_z} \right)} = \frac{N \cos(0)}{\cos(0)} = N. \qquad (6.10)$$
The same line of reasoning holds for all scenarios that result in $0/0$. These locations correspond to the maxima of the measured/observed irradiance distribution which is given as
$$I_T(\mathbf{r}_p) = \frac{I_o}{L_z^2}\; \frac{\sin^2\left( \frac{k d_s x_p N}{2 L_z} \right)}{\sin^2\left( \frac{k d_s x_p}{2 L_z} \right)}. \qquad (6.11)$$
In Fig. 6.2 we plot the irradiance of Eq. (6.11) for $N = 2, 10, 100$ assuming $d_s = 10\ \mu$m, $\lambda = 1\ \mu$m and $L_z = 1$ m. As $N$ increases there are three changes in the observed irradiance. First, each peak height increases. Notice, as expected, the peak irradiance increases as $N^2 I_o$ where $N$ is the number of pinholes in the screen. Second, the lobe width narrows. And, lastly, the number of secondary maxima between the main lobes increases with $N$: there are $N - 1$ minima, and hence $N - 2$ secondary maxima, between adjacent main lobes.
We can quantify the previous observations. First, notice the numerator zeros determine the main lobe width. Consulting Eq. (6.11) the numerator has zeroes at
$$\frac{k d_s x_p N}{2 L_z} = m\pi \;\rightarrow\; x_p = \frac{m \lambda L_z}{N d_s} \qquad (6.12)$$
where $m$ is an integer. $m = 0$ gives the main lobe maximum location and $m = 1$ determines the main lobe width, which is
$$\frac{k d_s x_p N}{2 L_z} = \pi \;\rightarrow\; x_p = \frac{\lambda L_z}{N d_s} \qquad (6.13)$$
[Figure 6.2 plots: $I_T/I_o$ versus $x_p$ [mm] for $N = 2$, $N = 10$, and $N = 100$; see the caption below.]
Figure 6.2 Diffraction grating irradiance. Upper left panel: The irradiance for $N = 2$. This is identical to the Young's two-pinhole irradiance. Bottom left panel: $N = 10$ irradiance. Inset: zoom of the diffraction pattern to see the number of minima and secondary maxima. Upper right panel: $N = 100$ diffraction grating. Notice lobe narrowing and peak height increase. Bottom right panel: Zoom in to the main lobe of the $N = 100$ grating to show the main lobe width.
88 Chapter 7 Scalar Diffraction Theory
formula to describe wave propagation and diffraction. With our current under-
standing regarding the wave nature of light, Fresnel’s formulation seems natural
and obvious, but at the time he devised his construction, the wave nature of light
was hardly recognized or appreciated.
Fresnel’s reasoning can be understood by considering Fig. 7.1. The left panel
of the figure is an illustration of the Young’s double slit experiment. In describing
Figure 7.1 From Young’s To Fresnel Diffraction. Adding up spherical wavelets provides
a good intuitive physical optics picture to understanding both interference and diffrac-
tion. The left panel is a Young’s double slit aperture. The middle two panels are the
result of adding more pinholes in the aperture screen. Finally, if the slit is imagined
to be filled with a continuum of secondary spherical sources we can make sense of
diffraction.
$$U_d(\mathbf{r}_p) \approx \int_{source} U_s(\mathbf{r}_s)\, \frac{e^{ik r_{sp}}}{r_{sp}}\, dx_s\, dy_s \qquad (7.2)$$
where the distance between the source point $\mathbf{r}_s$ and detector point $\mathbf{r}_p$ is $r_{sp}$ and the source plane complex amplitude distribution $U_s(\mathbf{r}_s)$ has been introduced. Equation (7.2) is nearly correct. From some further reasoning (see section 6.**)
7.1 Huygen-Fresnel Integral 89
$$U_d(\mathbf{r}_p) = \frac{-i}{\lambda} \int_{source} U_s(\mathbf{r}_s)\, \frac{e^{ik r_{sp}}}{r_{sp}}\, \cos\theta_{sd}\, dx_s\, dy_s \qquad (7.3)$$
now called the Huygen-Fresnel diffraction integral. In Eq. (7.3) the changes are
in the prefactor of the integral, which is equal to −i /λ and in the integral kernel
there is a cos θsd . The cos-factor is called the obliquity factor and is a function
of the angle between the normal to the source plane and the vector connecting
the source point and observation point. Although given for completeness, in the
paraxial scenarios we will consider the obliquity factor is approximately 1. Notice
the obliquity factor reduces the contribution of the spherical wavelets to points
that are displaced away from the axis connecting the source and detector plane
through the source point.
What is remarkable about Fresnel’s findings are they predated Maxwell’s equa-
tions by nearly 50 years! Following the discovery of Maxwell’s equations many
approaches have been devised to treat the problems initially considered by Fres-
nel. These later approaches can handle a wider variety of problems – for example
the polarization properties of waves and observation planes that are a few wave-
lengths from the source plane – but, Fresnel’s result is found to still be correct
within its restricted domain and fortunately for us it can explain everything we
are interested in.
Figure 7.2 presents the general diffraction problem we will consider. Although
the illustration is only given in the x − z plane it is a simple matter to include
the y direction in the diffraction equations. We imagine a monochromatic and
spatially coherent source illuminates the diffraction aperture. For example the
aperture could be illuminated by a monochromatic plane wave with complex en-
velope Ui n = u o . If the aperture presents this wave with an amplitude and phase
transmission coefficient t s (x s , y s ) the input complex amplitude, in the source
plane becomes U s (x s , y s ) = Uout (x s , y s ) = u o t s (x s , y s ). The lower 3 panels in Fig.
7.2 are illustrations of the source plane as viewed from the detector plane. The
most generic case is the left panel of the source plane illustrations and here it is
assumed U s (x s , y s ) = u o t s (x s , y s ) is completely arbitrary. Also, notice the source
aperture has some longest opening across it. In this illustration this distance is
parallel to the x s axis and has a length 2∆s . The entire aperture can be circum-
scribed by a circle of radius ∆s . Moving to the right, we can also imagine the
aperture has unit transmission associated with it so that it is only the geometry
of the aperture that determines the diffraction pattern. Finally, the 3rd example
aperture has a phase shift of π/3 for x s < 0 and has a magnitude of u o across the
entire aperture. Equation (7.3) can handle all these cases.
Notice, from Eq. (7.3) each detector point, at a distance L z from the source
plane, receives complex amplitude from all source points. Each complex ampli-
tude in the source plane U s (x s , y s ), located at rs , is propagated as a spherical wave
to the detector point rp . To determine the complex amplitude on the detector
plane, it is in principle necessary to evaluate the integral in Eq. (7.3) for each
90 Chapter 7 Scalar Diffraction Theory
Figure 7.2 Diffraction from an aperture. The upper left panel is a general illustra-
tion of diffraction from an aperture. It is assumed the illumination is monochromatic
and coherent across the aperture. The lower three panels illustrate the source plane
as viewed from the detector plane. The left source plane panel illustrates a generic
2D diffraction aperture with complex amplitude U s (x s , y s ), the middle plane has
U s (x s , y s ) = u o a constant real amplitude and the right aperture is illuminated with
a complex amplitude that has a π/3 phase shift for x s < 0 and a magnitude of u o across
the entire aperture. The detector plane, upper right panel, is assumed to contain a
circular detector of radius $\Delta_d$. If the detector were rectangular, $2\Delta_d$ would be the length
of the rectangle's diagonal that the circle circumscribes. $\Delta_s$ and $\Delta_d$ will be important in
what follows.
detector point. In general cases this is done numerically. The upper right panel of
Fig. 7.2 is an illustration of the detector plane that contains a detector of radius
∆d . If the detector was rectangular 2∆d would be the length of the rectangle’s
diagonal and the circle of radius ∆d would circumscribe the rectangular detector.
In the coming sections we will make approximations dependent on the system
geometry L z , ∆d , and ∆s as well as the wavelength λo so that the evaluation of
Eq. (7.3) simplifies and even allows for analytic solutions of the Huygen-Fresnel
diffraction integral.
where only the first three terms in the binomial expansion are retained. The
Fresnel Approximation is enforced by substituting the previous into Eq. (7.3).
This approximation will influence both the amplitude and phase of the diffraction
integral. Dealing with the amplitude, as usual, is easier. For these distances, there
is little difference between $L_z$ and $r_{sp}$ so $L_z$ can replace $r_{sp}$ in Eq. (7.3). The previous substitution modifies the complex envelope of each spherical wave that connects source and detector points and causes the obliquity factor, equal to $\cos\theta_{sp} = L_z / r_{sp}$, to approximately equal 1.
We need to be more careful with the phase of the diffraction integral kernel
since the phase is defined modulo 2π. Concentrating on the complex exponential
in the integral kernel of Eq. (7.3) we find
where the single exponential has been split into three distinct pieces by keeping the first three terms of the binomial expansion. Before focusing on the first two exponential factors, let's make an argument to ignore the third exponential. The essence of the argument is to find a relationship between optical system parameters that results in the phase in the third exponential being close to zero. Physically this means that terms of third order and higher have negligible effect on the phase received by the detector point from the source point. If only terms of third order and higher were important in determining the propagation phase between the source and detector point, the approximation we will make would result in points across the source arriving in phase at the detector point and constructively interfering.
To simplify the discussion without loss of generality we will assume the source point is situated at the coordinate origin and we allow the detector point to explore the detector plane within a radius $\Delta_d$ (see Fig. 7.2, upper right panel). By assuming the source sits at the origin we can express $(\delta x_{sp})^2 + (\delta y_{sp})^2$ as $x_d^2 + y_d^2$. For the approximation, it is necessary that
$$k\, \frac{[x_d^2 + y_d^2]_{max}^2}{8 L_z^3} \ll \pi \qquad (7.7)$$
where $[x_d^2 + y_d^2]_{max}$ is the maximum radius in the detector plane. With the source at the origin, and recalling that we assume the detector sits within a radius $\Delta_d$ in the detector plane (see Fig. 7.2), the previous turns Eq. (7.7) into
$$k\, \frac{\Delta_d^4}{8 L_z^3} \ll \pi \qquad (7.8)$$
(In reality this distance should be the maximum distance across the source and detector plane; in our notation this would be $\Delta_d + \Delta_s$.)
where $\Delta_d$ defines the largest separation of source and detector points. Rearranging terms, Eq. (7.8) can be written as
$$\frac{N_f^d\, \theta_d^2}{4} \ll 1 \qquad (7.9)$$
where $\theta_d = \Delta_d / L_z$ defines the angle between the source point and the observation point and the detector Fresnel number has been defined as
$$N_f^d = \frac{\Delta_d^2}{\lambda L_z}. \qquad (7.10)$$
The meaning of the previous is the following. When the optical system parameters satisfy Eq. (7.9) only the two lowest order terms in the binomial expansion of the source-detector distance, see Eq. (7.4) and Eq. (7.5), make a contribution to the propagation phase between the two points. With the previous approximations to the amplitude and phase, the diffraction integral Eq. (7.3) simplifies to the Fresnel Diffraction Integral
$$U_p(x_p, y_p, L_z) = \frac{-i\, e^{ik L_z}}{\lambda L_z} \int_S U_s(x_s, y_s)\, e^{ik \frac{(x_p - x_s)^2 + (y_p - y_s)^2}{2 L_z}}\, dx_s\, dy_s. \qquad (7.11)$$
In the previous the integral is taken over the source plane. We can express Eq. (7.11) to make apparent the different phase contributions. Opening up the exponent in the integral's kernel we find the second form of the Fresnel diffraction integral
$$U_p(x_p, y_p, L_z) = \frac{-i\, e^{ik L_z}\, e^{ik \frac{x_p^2 + y_p^2}{2 L_z}}}{\lambda L_z} \int_S U_s(x_s, y_s)\, e^{ik \frac{x_s^2 + y_s^2}{2 L_z}}\, e^{-i 2\pi \frac{x_s x_p + y_s y_p}{\lambda L_z}}\, dx_s\, dy_s \qquad (7.12)$$
where in the integral prefactor there is a linear plane-wave-like phase due to the propagation between the source and detector plane and there is a quadratic phase in the detector coordinates $x_p$ and $y_p$; in the kernel of the integral there is a quadratic phase in the source coordinates $x_s$ and $y_s$ as well as a linear phase that depends on both the source and detector coordinates. The next set of approximations are arguments designed to ignore the quadratic phases in the source and detector coordinates.
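When no analytic route exists, Eq. (7.11) can be evaluated by brute-force quadrature. The sketch below does this for a 1D slit; the geometry and the 1D normalization $1/\sqrt{i\lambda L_z}$ are illustrative assumptions, not an example worked in the text:

```python
import numpy as np

# Direct numerical quadrature of a 1D analog of the Fresnel integral,
# Eq. (7.11), for a uniformly illuminated slit. Geometry is illustrative.
lam, Lz, a = 0.5e-6, 0.1, 0.25e-3        # wavelength, distance, slit half-width [m]
k = 2.0 * np.pi / lam

xs = np.linspace(-a, a, 4001)            # source points across the slit
dxs = xs[1] - xs[0]
Us = np.ones_like(xs)                    # u_o = 1 across the aperture

def fresnel_1d(xp):
    """Complex amplitude at detector point xp (1D kernel and prefactor)."""
    kernel = np.exp(1j * k * (xp - xs) ** 2 / (2.0 * Lz))
    return np.exp(1j * k * Lz) / np.sqrt(1j * lam * Lz) * np.sum(Us * kernel) * dxs

I0 = abs(fresnel_1d(0.0)) ** 2           # on-axis irradiance (geometrical value ~1)
print(I0)
```

The sampling must resolve the quadratic phase across the slit; here the phase changes by only a few milliradians per sample, so the direct sum is an accurate stand-in for the integral.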
Figure 7.3 Quadratic phases in the Fresnel integral. This illustration demonstrates the
quadratic phase variation of Eq. (7.12) found in the source and detector plane. The
next set of approximations deals with these quadratic phases.
of these phases are illustrated by two curves, one solid and one dashed. The next simplification to the diffraction integral is to constrain the quadratic phase of the source coordinates so that it does not vary across the source aperture as illustrated by the solid curve in Fig. 7.3. The approximation amounts to ensuring that
$$\frac{k (x_s^2 + y_s^2)_{max}}{2 L_z} \ll \pi \;\rightarrow\; \frac{(x_s^2 + y_s^2)_{max}}{\lambda L_z} = N_f^s \ll 1 \qquad (7.13)$$
where we have equated $(x_s^2 + y_s^2)_{max}$, the maximum squared distance in the source plane, with $\Delta_s^2$, the square of the largest distance $\Delta_s$ from the optical axis to the perimeter of the source aperture, and defined the source Fresnel number
$$N_f^s = \frac{\Delta_s^2}{\lambda L_z}. \qquad (7.14)$$
We see that if the source Fresnel number, which is a property of the source size, the wavelength and the distance to the detector plane, is much less than 1, then the Fresnel diffraction integral simplifies to the Fraunhofer diffraction integral
$$U_p(x_p, y_p, L_z) = \frac{-i\, e^{ik L_z}\, e^{ik \frac{x_p^2 + y_p^2}{2 L_z}}}{\lambda L_z} \int_{source} U_s(x_s, y_s)\, e^{-i 2\pi \frac{x_s x_p + y_s y_p}{\lambda L_z}}\, dx_s\, dy_s \qquad (7.15)$$
One further simplification is possible and this occurs if the quadratic phase in the Fraunhofer integral prefactor can be ignored. The reasoning is identical to the source-coordinate argument:
$$\frac{k (x_d^2 + y_d^2)_{max}}{2 L_z} \ll \pi \;\rightarrow\; N_f^d \ll 1 \qquad (7.16)$$
where the detector Fresnel number needs to be much less than 1. If this inequality is also satisfied then the Fraunhofer integral becomes
$$U_p(x_p, y_p, L_z) = \frac{-i\, e^{ik L_z}}{\lambda L_z} \int_{source} U_s(x_s, y_s)\, e^{-i 2\pi \frac{x_s x_p + y_s y_p}{\lambda L_z}}\, dx_s\, dy_s \qquad (7.17)$$
To summarize the three conditions:
$$\frac{N_f^d\, \theta_d^2}{4} \ll 1 \;\rightarrow\; \text{Fresnel Approximation, use Eq. (7.12)} \qquad (7.18)$$
$$N_f^s \ll 1 \;\rightarrow\; \text{Fraunhofer Approximation, use Eq. (7.15)} \qquad (7.19)$$
$$N_f^d \ll 1 \;\rightarrow\; \text{Fraunhofer Approximation, use Eq. (7.17)} \qquad (7.20)$$
These criteria are also summarized in the illustration presented in Fig. 7.4. The
expressions we have developed are not valid in the near-field region of the source
plane, within a distance of a few wavelengths. Once the detector plane is multiple
wavelengths away from the source plane, the detector aperture, the source-detector
distance (L_z) and the source wavelength determine the diffraction region in which
the observation is made. For the example values considered previously, the criteria evaluate to
\frac{N_{fd}\,\theta_d^2}{4} = 0.8 \times 4\times 10^{-8}/4 = 8\times 10^{-9} \qquad (7.21)

N_{fs} = 5\times 10^{-4} \qquad (7.22)

N_{fd} = 0.8 \qquad (7.23)
so in this case it is appropriate to use Eq. (7.15): although the Fresnel approximation is amply justified, the detector Fresnel number is not much less than 1, so the simpler Eq. (7.17) cannot be used.
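The decision logic embodied in Eqs. (7.18)-(7.20) can be sketched as a small helper. The function below is not from the text; it is a minimal illustration, and the cutoff 0.1 is an arbitrary stand-in for the "\ll 1" conditions:

```python
def diffraction_regime(delta_s, delta_d, lam, L_z):
    """Pick the applicable diffraction integral from the Fresnel numbers.

    delta_s, delta_d: source/detector aperture half-sizes [m]
    lam: wavelength [m]; L_z: source-detector distance [m].
    The cutoff 0.1 is an arbitrary stand-in for "much less than 1".
    """
    N_fs = delta_s**2 / (lam * L_z)   # source Fresnel number, Eq. (7.14)
    N_fd = delta_d**2 / (lam * L_z)   # detector Fresnel number
    theta_d = delta_d / L_z           # detector half-angle seen from the source
    if N_fs < 0.1 and N_fd < 0.1:     # both quadratic phases negligible
        return "Fraunhofer, Eq. (7.17)"
    if N_fs < 0.1:                    # only the source quadratic phase negligible
        return "Fraunhofer, Eq. (7.15)"
    if N_fd * theta_d**2 / 4 < 0.1:   # Eq. (7.18)
        return "Fresnel, Eq. (7.12)"
    return "no paraxial simplification applies"
```

For instance, a 1 mm source aperture and a 5 mm detector aperture at 500 nm and L_z = 100 m give N_fs = 0.02 and N_fd = 0.5, selecting Eq. (7.15), mirroring the situation in the worked example above.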
7.3 Fraunhofer Diffraction: The far-field
Figure 7.4 Different diffraction regions. This illustration demonstrates the different
regions of diffraction: moving the detector plane x_p away from the source plane x_s, beyond the first few wavelengths one passes through the Fresnel region and then, at larger L_z, into the Fraunhofer region.
As a first example, consider a one-dimensional slit described by the transmission function

t_s(x_s) = \mathrm{rect}(x_s|d) \qquad (7.24)

where d is the full width (i.e. the diameter) of the aperture, so that the half-width is \Delta_s = d/2. The input source
complex amplitude distribution in this case is completely real and equal to
U_s(x_s) = u_o\, t_s(x_s). The diffraction pattern we will calculate is what would be
expected from a 1D slit. Using C to represent the complex prefactor of Eq. (7.15),
we find the integral is
U_p(x_p, L_z) = C u_o \int_{s} \mathrm{rect}(x_s|\Delta_s)\, e^{-i2\pi\frac{x_s x_p}{\lambda L_z}}\, dx_s = C u_o \int_{-\Delta_s}^{\Delta_s} e^{-i2\pi\frac{x_s x_p}{\lambda L_z}}\, dx_s \qquad (7.25)
where the rectangular function limits the integration across the source plane.
Note the source complex amplitude distribution is constant and coherent across
the aperture. Evaluating the integral we find
Figure 7.5 Fraunhofer diffraction from a slit. The left column shows the source plane: a constant, real amplitude U_s(x_s) across an aperture of width 2\Delta_s, extending from -\Delta_s to \Delta_s. The right column shows, a distance L_z away, the detector-plane complex amplitude U_p(x_p); the second plot is the
magnitude of the complex amplitude |U_p(x_p)|, the third plot is the associated phase (taking only the values 0 and \pi), and the bottom plot is the irradiance I_p(x_p).
Recall again that in optics a minus 1 is in reality a phase shift of π! The sinc is a
special diffraction pattern in that its phase is only 0 or π depending on detector
location. Other complex functions can have phases different from 0 and π, and it
is only possible to visualize such functions by separately plotting the magnitude
and phase, as is done in the second and third plots of the right column. Finally, an optical
detector measures irradiance, and the bottom plot illustrates the irradiance of the sinc pattern.
From properties of the sinc function we can identify interesting locations in
the diffraction pattern. First, and most important, is x_p = 0. Notice Eq. (7.28) is
divided by x_p, so it would seem that at this location the diffraction pattern should be
infinite. But, in fact, the entire expression is indeterminate at x_p = 0, so to find the
value at this location it is necessary to use l'Hôpital's rule (remember calculus!).
From this we find the value of sinc(x) at x = 0 is 1. Second, the numerator of Eq.
(7.27) locates the zeros of the diffraction pattern. Specifically,
the zeros occur at x_p^{(m)} = m\lambda L_z/(2\Delta_s) for integer m \neq 0. Confining
the beam along one axis causes it to spread out along that limiting direction in
observation planes beyond the confining aperture. This is quantitatively clear since
x_p^{(m)} \propto \Delta_s^{-1}. With the numbers given previously, the first zero is located at 50 mm
in the detector plane and the main lobe width is 100 mm.
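This spreading is easy to reproduce numerically. The sketch below assumes illustrative values (λ = 500 nm, half-width Δ_s = 5 µm, L_z = 1 m; these are not stated here but are chosen to land the first zero at the quoted 50 mm) and evaluates the slit pattern with numpy's normalized sinc:

```python
import numpy as np

# Assumed illustrative parameters (chosen to reproduce the 50 mm first zero)
lam = 500e-9   # wavelength [m]
D_s = 5e-6     # slit half-width Delta_s [m]
L_z = 1.0      # source-detector distance [m]

x_p = np.linspace(-0.2, 0.2, 4001)            # detector coordinate [m]
# np.sinc(u) = sin(pi*u)/(pi*u), so the slit field is proportional to
# sinc(2*Delta_s*x_p/(lam*L_z))
U_p = 2 * D_s * np.sinc(2 * D_s * x_p / (lam * L_z))
I_p = np.abs(U_p)**2                          # measured irradiance

first_zero = lam * L_z / (2 * D_s)            # m = 1 zero location
print(first_zero)                             # 0.05 m, i.e. 50 mm
```

The main lobe, from the m = -1 to the m = +1 zero, is then 100 mm wide, as quoted above.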
A second important diffraction aperture is the circle. In contrast to the slit,
the circle is a two-dimensional (2D) aperture in the source plane. We can think of
the circle as a 1D slit rotated about the optical axis. The transmission function
t_s(x_s, y_s) of the circle is defined in the same way as the rect function was:
\mathrm{circ}\left(\sqrt{x_s^2 + y_s^2}\,\big|\,\Delta_s\right) = 1, \quad \sqrt{x_s^2 + y_s^2} < \Delta_s \qquad (7.30)

\mathrm{circ}\left(\sqrt{x_s^2 + y_s^2}\,\big|\,\Delta_s\right) = 0, \quad \sqrt{x_s^2 + y_s^2} > \Delta_s \qquad (7.31)
where ∆s is the circle radius. With the previous definition the following diffraction
integral needs to be solved
U_p(\rho_p, \phi_p, L_z) = C u_o \int_{s} \mathrm{circ}\left(\sqrt{x_s^2 + y_s^2}\,\big|\,\Delta_s\right) e^{-i2\pi\frac{x_s x_p + y_s y_p}{\lambda L_z}}\, dx_s\, dy_s \qquad (7.32)
\mathrm{Jinc}(\rho_p, \phi_p, L_z) = \frac{J_1(2\pi\rho_p\Delta_s/\lambda L_z)}{2\pi\rho_p\Delta_s/\lambda L_z} \qquad (7.34)
and \pi\Delta_s^2 is the area of the source's circular aperture, with \rho_p = \sqrt{x_p^2 + y_p^2}. The Jinc
function is the circular equivalent of the sinc function. The irradiance of the
diffraction pattern formed by the circular aperture is the well-known Airy disk.
Figure ** is a plot of the Airy disk. Looking forward, we will discover that this pattern
determines the diffraction-limited resolution of an imaging system. The first zero
of Jinc(x) is important and is located at x = 3.83. From the argument of the Jinc function,
2\pi\rho_p^{(0)}\Delta_s/\lambda L_z = 3.83 \quad\rightarrow\quad \rho_p^{(0)} = \frac{1.22\,\lambda L_z}{2\Delta_s} = \frac{1.22\,\lambda L_z}{D} \qquad (7.35)

where D = 2\Delta_s is the aperture diameter.
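The location of the first dark ring can be checked numerically. The sketch below uses scipy's Bessel function J_1 to locate the first zero of Jinc near 3.83 and confirms that it reproduces the 1.22 factor; the wavelength, aperture and distance values are illustrative assumptions, not from the text:

```python
import numpy as np
from scipy.special import j1
from scipy.optimize import brentq

def jinc(x):
    """Jinc(x) = J1(x)/x, with the x -> 0 limit equal to 1/2 (J1(x) ~ x/2)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    out = np.full(x.shape, 0.5)
    nz = x != 0
    out[nz] = j1(x[nz]) / x[nz]
    return out

# The first zero of J1 (and hence of Jinc) lies between 3 and 4.5
x0 = brentq(j1, 3.0, 4.5)
print(round(x0, 4))                   # 3.8317

# First Airy dark ring: 2*pi*rho*Delta_s/(lam*L_z) = x0
lam, D_s, L_z = 500e-9, 1e-3, 1.0     # assumed illustrative values
rho0 = x0 * lam * L_z / (2 * np.pi * D_s)
# agrees with 1.22*lam*L_z/(2*Delta_s) to the rounding of 3.8317 -> 1.22
```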
Figure 7.6 Using a lens to observe diffraction. Placing a lens in the source plane aperture
introduces a quadratic phase across the source plane wavefront that directly
compensates the diffraction quadratic phase in the lens focal plane. The result is that the
diffraction pattern of the lens pupil is observed in the lens focal plane. On either side of
the focal plane a Fresnel diffraction pattern is observed.
When a lens is incorporated into the aperture, the aperture can be modeled
as the following complex transmission function
t_s(x_s, y_s) = P(x_s, y_s)\, e^{-i\frac{k(x_s^2 + y_s^2)}{2f}} \qquad (7.36)
where we have introduced the pupil P(x_s, y_s) of the lens and the spatial phase transformation
associated with a thin lens of focal length f. If the source aperture is only one dimensional
then the pupil is a rect function. Specifically,

P(x_s) = \mathrm{rect}(x_s|\Delta_s) \qquad (7.37)
where ∆s is the radius of the pupil and is identical to the radius of the lens ∆l .
When discussing the lens we can use ∆s and ∆l interchangeably. If our source
plane is two dimensional then the pupil becomes
P(x_s, y_s) = \mathrm{circ}\left(\sqrt{x_s^2 + y_s^2}\,\big|\,\Delta_s\right). \qquad (7.38)
The previous equations model the optical properties of a cylindrical lens, Eq.
(7.37), and a spherical lens, Eq. (7.38). Important in each is the quadratic phase of
the lens.
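As a concrete sketch of Eqs. (7.36) and (7.38), the lens aperture can be built on a grid; the key point, that the lens modifies only the phase inside the pupil, shows up as |t_s| equal to 1 inside the pupil and 0 outside. All numerical values below are illustrative assumptions:

```python
import numpy as np

# Assumed illustrative values
lam = 500e-9             # wavelength [m]
k = 2 * np.pi / lam      # wavenumber
f = 0.1                  # focal length [m]
D_s = 1e-3               # pupil (= lens) radius Delta_s [m]

x = np.linspace(-2e-3, 2e-3, 401)
xs, ys = np.meshgrid(x, x)
r2 = xs**2 + ys**2

P = (np.sqrt(r2) < D_s).astype(float)         # circ pupil, Eq. (7.38)
t_s = P * np.exp(-1j * k * r2 / (2 * f))      # lens transmission, Eq. (7.36)
# |t_s| = P: the thin lens is a pure phase element inside its pupil
```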
To see how the lens directly influences the diffraction problem, let's concentrate
on one dimension and substitute Eq. (7.36) into Eq. (7.12). We find

U_p(x_p, L_z) \propto u_o \int_{S} \mathrm{rect}(x_s|\Delta_s)\, e^{i\frac{k x_s^2}{2}\left(\frac{1}{L_z} - \frac{1}{f}\right)}\, e^{-i2\pi\frac{x_s x_p}{\lambda L_z}}\, dx_s \qquad (7.39)
where the complex prefactor has been ignored and, initially, the detector plane is
not coincident with the lens focal plane. From Eq. (7.39) notice that for the quadratic
phase in the integral kernel to vanish we require L_z = f. When this is satisfied,
U_p(x_p, f) \propto u_o \int_{S} \mathrm{rect}(x_s|\Delta_s)\, e^{-i2\pi\frac{x_s x_p}{\lambda f}}\, dx_s \qquad (7.40)
and in the focal plane of the lens we see the Fraunhofer diffraction pattern of the
lens pupil P. Notice that the geometry of the pupil determines the observed diffraction
pattern, while the lens quadratic phase enables the observation of this pattern in
the lens focal plane. Figure 7.6 also illustrates that before and after the lens focal
plane we observe Fresnel diffraction, whereas exactly in the focal plane we observe
Fraunhofer diffraction.
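The claim that the quadratic phase cancels exactly at L_z = f can be checked by brute force: integrate Eq. (7.39) numerically for a rect pupil and compare against the analytic sinc that Eq. (7.40) predicts in the focal plane. All parameter values here are illustrative assumptions:

```python
import numpy as np

# Assumed illustrative parameters
lam = 500e-9                 # wavelength [m]
k = 2 * np.pi / lam
f = 0.1                      # focal length [m]
D_s = 5e-4                   # pupil half-width Delta_s [m]

x_s = np.linspace(-D_s, D_s, 20001)   # integration grid across the pupil
dx = x_s[1] - x_s[0]

def U_p(x_p, L_z):
    """Direct numerical evaluation of Eq. (7.39) (complex prefactor ignored)."""
    quad = np.exp(1j * k * x_s**2 / 2 * (1 / L_z - 1 / f))   # lens + Fresnel phase
    kern = np.exp(-2j * np.pi * x_s * x_p / (lam * L_z))     # Fourier kernel
    return np.sum(quad * kern) * dx

x_p = 2e-5                                           # detector point [m]
numeric = U_p(x_p, f)                                # detector in the focal plane
analytic = 2 * D_s * np.sinc(2 * D_s * x_p / (lam * f))   # Eq. (7.40) result
print(abs(numeric - analytic) < 1e-3 * abs(analytic))      # True
```

Away from L_z = f the residual quadratic phase survives inside the integral and the pattern reverts to Fresnel diffraction, as Fig. 7.6 indicates.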